Feed aggregator

Web Performance of Ecommerce Applications

LoadStorm - 6 hours 36 min ago

You probably have been wondering why I’ve posted so infrequently over the past year. We have been bombarded with emails and phone calls demanding more blogging that includes my extremely dry, obtuse humor. So, in the interest of global stability and national security, I must acquiesce to the will of the masses.

Right. That’s a joke. I’m only slightly more popular than Justin Bieber. If you are a serious tech geek, you’ll need to look up that Bieber reference.

Web performance is why you read our blog, and web performance is my life. I want our blog to contain the most interesting information about load testing, page speed, and application scalability. In order to deliver on that goal, we came up with the concept of using LoadStorm and other tools to gather actual performance data regarding as many web applications as possible.

Thus, the Web Performance Lab was born.

Why Create a Web Performance Lab?

Amazon EC2 General Purpose Types of Server Instances

The Web Performance Lab is designed to be a virtual sandbox where we install software and run experiments to find out how the software performs. Why? Because I sit around and wonder if an AWS EC2 m3.2xlarge instance will run a web app four times faster than an m3.large. It should, since it has 8 virtual CPUs compared to 2, and 30 GB of memory compared to 7.5. That’s simple math, but scaling is rarely linear in web performance. Having 4x the horsepower on the hardware does NOT give you 4x the speed or scalability.
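
To see why, here is a rough back-of-the-envelope sketch using Amdahl’s law. The 30% serial fraction is purely an assumption for illustration, not a measurement of any particular app:

// Hypothetical illustration: why 4x the hardware rarely means 4x the speed.
// Amdahl's law: speedup = 1 / (serial + parallel / N), where "serial" is the
// fraction of the work that cannot be spread across CPUs.
function amdahlSpeedup(serialFraction, cpus) {
  return 1 / (serialFraction + (1 - serialFraction) / cpus);
}

// Assume (purely for illustration) that 30% of each request is serial work:
// lock contention, single-threaded rendering, and so on.
var onLarge   = amdahlSpeedup(0.3, 2); // m3.large: 2 vCPUs
var on2xlarge = amdahlSpeedup(0.3, 8); // m3.2xlarge: 8 vCPUs

console.log((on2xlarge / onLarge).toFixed(2)); // ~1.68 -- nowhere near 4x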

WPL (I’ll abbreviate because we geeks love acronyms) gives us a playground to satisfy our curiosity. I want to know:

  • What is the most scalable ecommerce application?
  • How many concurrent users can WordPress handle on a small server?
  • Which cloud platform gives me the biggest bang for the buck?
  • Does Linux or Windows provide a better stack for scalability?
  • Can I fairly compare open source software performance to commercial solutions?
  • Who will help me create statistically accurate load testing scenarios for experiments?
  • Are other performance engineers as curious as me about these comparison test results?
  • How can I involve other application experts to achieve more useful performance results?
  • What other tools (not LoadStorm) should I put in the WPL toolkit for testing?
  • Can we identify the top 10 best performance optimizations that apply to each web application?
  • Who are the top experts on each of the most important web applications?
  • Do they know anything about tuning their apps for scalability?
  • Will they want to participate?
  • Are they able to share their knowledge in an objective set of experiments?
  • What are the decision criteria in picking the best server monitoring tool?
  • Nginx vs. Apache – what are the real numbers to compare?
  • How many types of caching exist?
  • What type of caching gives the highest level of increase to scalability?
  • Can we empirically conclude anything from running a Webpagetest or Pagespeed or Yslow analysis?
  • Are the advertised “free” server monitoring tools actually free?
  • Will most web applications get memory bound before being CPU-constrained?
  • NewRelic vs. AppDynamics – which is best for what type of architecture?
  • How do open source load testing tools such as JMeter compare to cloud solutions such as LoadStorm?
  • Drupal vs. WordPress vs. Joomla vs. Redaxscript vs. concrete5 vs. pimcore?
  • Magento vs. osCommerce vs. OpenCart vs. WooCommerce vs. Drupal Commerce vs. VirtueMart?

There are hundreds of similar questions that I ponder with Elijah Craig. He and I are very productive in the evenings, and he helps me plan experiments to solve the riddles of web performance that appear to me in visions after a long day of work in the load testing salt mines.

Ecommerce Application Performance is High Priority

U.S. Online Retail Sales 2012-2017

That last question in the list deserves the highest priority in the Web Performance Lab. Online retail is a great place to start because it has so much at stake, and it offers so many good test scenarios to build into experiments. Ecommerce provides excellent examples of business processes that are important to measure.

With over $250 billion of online sales in the U.S. alone during 2013, and with over 15% annual growth, how could we ignore ecommerce? It’s the biggest market possible for our Web Performance Lab to address. The stakes are enormous. My hope is that other people will be as interested as I am.

Cyber Monday 2013 generated $1.7 billion in sales for a single day! What ecommerce applications are generating the most money? I doubt we will ever know, nor will the WPL answer that question. However, some of you reading this blog will want to get your share of that $300 billion this year, and the $340 billion in 2015, so I’m certain that you need to understand which online retail platform is going to perform best. You need to know, right?

Cyber Monday sales data

We will run experiments to gather objective performance measurements. How many combinations and permutations can we try?

  • Each app with out-of-the-box configuration (no tuning) on an EC2 large instance, c3.8xlarge, r3.8xlarge, hs1.8xlarge, m1.small
  • Each app (no tuning) running on a similar server in Google Compute Engine, Rackspace, Azure, Joyent, Bluelock, Savvis
  • Each app (no tuning) running on same hardware using Ubuntu, Fedora, CentOS, Debian, Red Hat, Gentoo, Windows
  • Each app tuned with web application caching such as memcached, Varnish, JCS
  • Each app tuned with a CDN such as Cloudfront, Akamai, Edgecast, Cloudflare, MaxCDN
  • Each app tuned with different database configurations
  • Each app tuned with different web server configurations

We have been running some of these experiments during the past few months. Esteban shared some of his results in blog posts earlier. My problem with his work is that some of the conclusions aren’t as solid as I would prefer. I spent some time reviewing his results with him in a conference room recently, and I poked some holes in his logic.

Now don’t get me wrong, I am an Esteban fan. He is a friend and a high-character guy. That said, we all learn from experiments. We try, we ponder, we learn. That’s how humans gain understanding. As a child you figure out the world by putting your hand on a hot stove. You register that learning experience in your brain, and you don’t do that again. You find out the best ways to accomplish objectives by failing. Just ask Edison. He figured out 1,000 ways NOT to create a functional lightbulb before he found the correct way. So it is with WPL. We are learning from trying.

Therefore, we are beginning a new series of experiments on ecommerce platforms. We will be publishing the results more quickly and with less filtering. We hope you find it useful and interesting. Please feel free to comment and make suggestions. Also, if you disagree with our statistical approach or calculations, please let us know. Recommendations are also welcome for ways to improve our scientific method employed during the experiments.

The post Web Performance of Ecommerce Applications appeared first on LoadStorm.

Very Short Blog Posts (17): Regression Obsession

DevelopSense - Michael Bolton - 6 hours 59 min ago
Regression testing is focused on the risk that something that used to work in some way no longer works that way. A lot of organizations (Agile ones in particular) seem fascinated by regression testing (or checking) above all other testing activities. It’s a good idea to check for the risk of regression, but it’s also […]
Categories: Software Testing

A Tale of Four Projects

DevelopSense - Michael Bolton - Wed, 04/23/2014 - 16:53
Once upon time, in a high-tech business park far, far away, there were four companies, each working on a development project. In Project Blue, the testers created a suite of 250 test cases, based on 50 use cases, before development started. These cases remained static throughout the project. Each week saw incremental improvement in the […]
Categories: Software Testing

Software Quality Metrics for your Continuous Delivery Pipeline – Part II: Database

Perf Planet - Wed, 04/23/2014 - 05:35
No matter how often you deploy your application or how sophisticated your delivery pipeline is, you always need to know the quality status of the software you are building. That can only be done if you measure it; but measure what exactly? In Part I we introduced the Concept of Quality Metrics in CD (Continuous […]

New findings: Retail sites that use a CDN are slower than sites that do not

Web Performance Today - Tue, 04/22/2014 - 22:43

This week at Radware, we’ve released our latest research into the performance of the top 500 retail sites. (A little background: Every quarter, we test the load time and page composition of the same set of websites. The goal is to benchmark retail web performance and identify patterns.)

In the four years that we’ve been gathering this data, there have been two persistent trends:

  1. Pages are getting slower. The median top 500 ecommerce home page takes 10 seconds to load. In spring 2012, the median page loaded in 6.8 seconds. This represents a 47% slowdown in just two years.
  2. Pages are getting bigger. The median page contains 99 resources and is 1510 KB in size. In other words, a typical page is 20% bigger than it was just six months ago.

While these findings are compelling, I’m most interested in what we discovered when we analyzed how long it takes for pages to render primary content, and whether or not using a content delivery network (CDN) helps pages render their most important content more quickly. Here’s what we found…

Pages are taking longer to serve primary content.

Part of our analysis involved looking at the Time to Interact (TTI) for the top 100 sites. From a user experience perspective, TTI is arguably the most critical metric. It refers to the amount of time it takes for a page’s primary content — usually a feature image with a call-to-action button — to render and become usable. We identified the TTI for each page by looking at a filmstrip that depicts page load frame by frame. (See the example below.)

We found that the median TTI was 5.4 seconds — well short of the ideal render time of 3 seconds or less. In our Summer 2013 State of the Union, we found that the median TTI was 4.9 seconds. The difference between last summer and now may not seem dramatic, but this represents a 10% slowdown in just nine months. This is significant, especially given that this trend seems likely to continue.

While this finding is extremely compelling in and of itself, things get even more interesting when we compare the TTI for sites that use a content delivery network to the TTI for sites that do not…

Using a content delivery network (CDN) correlates to slower Time to Interact, not faster.

Content delivery networks are a well-known performance solution, so our initial expectation was that sites that use a CDN would be at least as fast, if not faster, than sites that don’t. Yet we found the opposite: the TTI for pages that use a CDN was 5.7 seconds, compared to a TTI of 4.7 seconds for pages that do not use a CDN — a 1000-millisecond difference.

One second may not sound like much, but it can have a big impact on business metrics. In the past, I’ve shared case studies demonstrating that a one-second delay can result in:

  • 8.3% increase in bounce rate
  • 9.3% fewer page views
  • 3.5% decrease in conversions
  • 2.1% decrease in cart size
This finding should not be construed as a criticism of CDNs.

CDNs are a proven web performance solution that addresses the problem of latency: the amount of time it takes for a host server to receive, process, and deliver on a request for a page resource (images, CSS files, etc.). Latency depends largely on how far away the user is from the server, and it’s compounded by the number of resources a web page contains.

A CDN addresses the latency problem by caching static page resources in distributed servers (AKA edge caches, points of presence, or PoPs) across a region or worldwide — thereby bringing resources closer to users and reducing round trip time.
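
As a rough illustration (not part of the original study), a CDN can only cache what the origin marks as cacheable. Here is a minimal sketch of an origin server setting cache headers; it assumes a Node.js/Express stack, which is just an example choice:

// Minimal sketch (Node.js + Express assumed): the origin marks static assets
// as publicly cacheable so CDN edge servers (PoPs) can hold copies close to
// users and answer without a round trip back to the origin.
var express = require('express');
var app = express();

// Static assets: long-lived, safe for any shared/edge cache to keep.
app.use('/assets', express.static('public', { maxAge: '365d' }));

// Dynamic HTML: keep it out of shared caches so users never see stale pages.
app.get('/', function (req, res) {
  res.set('Cache-Control', 'private, no-cache');
  res.send('<html><!-- page markup --></html>');
});

app.listen(8080);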

CDNs cure some, but not all, performance pains.

To understand why using a CDN may not always correlate to faster pages, we need to look at the pages themselves.

The first thing to consider is page size. We found that pages that use a CDN and those that do not are roughly comparable in terms of page size and total number of resources: pages that use a CDN tend to use slightly fewer, but slightly fatter, resources, which results in somewhat larger pages overall.

But this small difference in size arguably cannot account for the fact that pages that use a CDN become interactive a full second behind those sites that do not use a CDN. This suggests that the issue is not solely about getting resources to the user faster (i.e. by using a CDN) — it’s about how the pages themselves are built.

Some things to consider about leading sites:

  • Leading sites are more likely to use a CDN.
  • Leading sites are more likely to incorporate large high-resolution images and other rich content, thereby increasing the size of page resources.
  • Leading sites are more likely to implement third-party marketing scripts, such as trackers and analytic tools. Third-party scripts can have a significant impact on performance. Poorly implemented scripts can delay page render, and non-functional scripts can prevent a page from loading.

In other words, sites that use a CDN are likely to contain more rich content and more third-party scripts, two of the greatest performance leeches. For these sites, using a CDN no doubt mitigates some degree of the impact of increased page size and complexity, but a CDN can’t be expected to do all the heavy lifting.

Front-end performance optimization picks up where CDNs leave off.

A CDN isn’t a magic bullet. It can’t fix the full spectrum of web performance problems, which include:

  • Server-side processing.
  • Third-party scripts that block the rest of the page from rendering.
  • Badly optimized pages, in which non-essential content renders before primary content.
  • Unoptimized images (e.g. images that are uncompressed, unconsolidated, non-progressive, and/or in the wrong format).
  • Unminified code.
  • And many, many more.

This is where front-end performance optimization (FEO) comes in. (I’m aware that I’m probably preaching to the choir here, in which case feel free to stop reading now and forward this post to ten people you know who need to read it.)

CDNs address middle mile performance by bringing resources closer to users. FEO addresses performance at the very front end — where 80-85% of response time happens — so that pages render more efficiently in the browser.
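
To make that concrete, here is one small FEO technique as a hedged sketch: loading a third-party script asynchronously after the primary content, so it can’t block rendering. The tracker URL is hypothetical.

// Minimal front-end optimization sketch: load a third-party tracker without
// blocking the render of the page's primary content. The URL is hypothetical.
function loadScriptAsync(src) {
  var s = document.createElement('script');
  s.src = src;
  s.async = true; // don't block HTML parsing or rendering
  document.head.appendChild(s);
}

// Defer the non-essential script until the primary content has loaded.
window.addEventListener('load', function () {
  loadScriptAsync('https://example.com/analytics/tracker.js');
});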

To get the best acceleration results, most of our customers use a combination of front-end optimization (FEO), content delivery network (CDN), application delivery controller (ADC), and in-house engineering. As the table below (which is based on this case study) demonstrates, a multi-pronged performance solution can make pages up to four times faster.

Takeaway

While 75% of the top 100 retail websites use a content delivery network, CDN usage doesn’t correlate to faster load times. Sites that use a CDN took a full second longer to render primary content than their non-CDN-using counterparts. The problem lies not with CDNs, which are an effective weapon in the fight against latency. Instead, the problem lies within the web pages themselves: pages are larger and more complex than ever, and leading retailers are more likely to incorporate performance-leaching content, such as rich media and third-party scripts. The solution lies in adopting an aggressive, multi-pronged solution set that includes a CDN, automated front-end optimization, and in-house engineering.

DOWNLOAD: State of the Union: Ecommerce Page Speed & Web Performance [Spring 2014]

The post New findings: Retail sites that use a CDN are slower than sites that do not appeared first on Web Performance Today.

What He Said

QA Hates You - Tue, 04/22/2014 - 09:29

Wayne Ariola in SD Times:

Remember: The cost of quality isn’t the price of creating quality software; it’s the penalty or risk incurred by failing to deliver quality software.

Word to your mother, who doesn’t understand why the computer thing doesn’t work any more and is afraid to touch computers because some online provider used her as a guinea pig in some new-feature experiment with bugs built right in.

Categories: Software Testing

Test Driven Development and CI using JavaScript [Part I]

LoadImpact - Tue, 04/22/2014 - 02:39

In this tutorial, we will learn how to apply TDD (Test-Driven Development) using JavaScript code. This is the first part of a set of tutorials that includes TDD and CI (Continuous Integration) using JavaScript as the main language.

Some types of testing

There are several approaches to testing code, and each comes with its own set of challenges. Emily Bache, author of The Coding Dojo Handbook, writes about them in more detail on her blog, “Coding is like cooking”.

1. Test Last: in this approach, you code a solution and subsequently create the test cases.

  • Problem 1: It’s difficult to create test cases after the code is completed.
  • Problem 2: If test cases find an issue, it’s difficult to refactor the completed code.

2. Test First: you design test cases and then write the code.

  • Problem 1: You need a good design up front, and formulating the test cases lengthens the design stage, which takes too much time.
  • Problem 2: Design issues are caught too late in the coding process, which makes refactoring the code more difficult due to specification changes in the design. This issue also leads to scope creep.

3. Test-Driven: You write test cases in parallel with new code modules. In other words, you add a task for unit tests as your developers are assigned different coding tasks during the project development stage.

 

TDD approach

TDD focuses on writing code at the same time as you write the tests. You write small modules of code, and then write your tests shortly after.

Patterns to apply to the code:

  • Avoid direct calls over the network or to the database. Use interfaces or abstract classes instead.
  • Implement a real class that implements the network or database call and a class which simulates the calls and returns quick values (Fakes and Mocks).
  • Create a constructor that uses Fakes or Mocks as a parameter in its interface or abstract class.
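
Since this tutorial uses JavaScript, here is a minimal sketch of that constructor-injection pattern with a fake standing in for a real network or database call. All of the names (UserStore, gateway, findUser) are made up for illustration:

// Illustrative sketch: the store depends on "a thing that can fetch users",
// injected through the constructor, so tests can pass a fake instead of
// hitting the real network or database.
function UserStore(gateway) {
  this.gateway = gateway;               // real or fake -- same duck-typed API
}
UserStore.prototype.displayName = function (id) {
  var user = this.gateway.findUser(id);
  return user ? user.name : 'unknown';
};

// Production wiring: a real gateway that talks to the database (not shown).
// Test wiring: a fake that returns canned values instantly.
var fakeGateway = {
  findUser: function (id) {
    return id === 42 ? { id: 42, name: 'Ada' } : null;
  }
};

var store = new UserStore(fakeGateway);
console.log(store.displayName(42));     // "Ada" -- no network, no database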

Patterns to apply to unit tests: 

  • Use the setup function to initialize the testing, which initializes common behavior for the rest of the unit test cases.
  • Use the TearDown function to release resources after a unit test case has finalized.
  • Use “assert()” to verify the correct behavior and results of the code during the unit test cases.
  • Avoid dependency between unit test cases.
  • Test small pieces of code.

 

Behavior-Driven Development

Behavior-Driven Development (BDD) is a specialized version of TDD focused on behavioral specifications. Since TDD does not specify how the test cases should be done and what needs to be tested, BDD was created in response to these issues.

Test cases are written based on user stories or scenarios. Stories are established during the design phase. Business analysts, managers and project/product managers gather the design specifications, and then users explain the logical functionality for each control. Specifications also include a design flow so test cases can validate proper flow.

This is an example of the language used to create a BDD test story:

Story: Returns go to stock

In order to keep track of stock

As a store owner

I want to add items back to stock when they’re returned

Scenario 1: Refunded items should be returned to stock

Given a customer previously bought a black sweater from me

And I currently have three black sweaters left in stock

When he returns the sweater for a refund

Then I should have four black sweaters in stock

Scenario 2: Replaced items should be returned to stock

Given that a customer buys a blue garment

And I have two blue garments in stock

And three black garments in stock.

When he returns the garment for a replacement in black,

Then I should have three blue garments in stock

And two black garments in stock
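
To show how such a story might become executable, here is one possible Jasmine-style spec for Scenario 1. The Store object and its methods are hypothetical; only describe, it, and expect come from Jasmine:

// One way Scenario 1 might look as an executable spec (Store is hypothetical):
describe('Returns go to stock', function () {
  it('returns refunded items to stock', function () {
    // Given a customer previously bought a black sweater from me
    // And I currently have three black sweaters left in stock
    var store = new Store({ 'black sweater': 3 });

    // When he returns the sweater for a refund
    store.refund('black sweater');

    // Then I should have four black sweaters in stock
    expect(store.stockOf('black sweater')).toBe(4);
  });
});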

 

Frameworks to Install

1. Jasmine

Jasmine is a set of standalone libraries that allow you to test JavaScript based on BDD. These libraries do not require the DOM, which makes them perfect for testing both client-side and server-side code. You can download it from http://github.com/pivotal/jasmine

It is divided into suites, specs, and expectations:

  • Suites define the unit’s story.
  • Specs define the scenarios.
  • Expectations define desired behaviors and results.

Jasmine also has a set of helper libraries that let you organize tests.
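
Here is a minimal sketch of how suites, specs, and expectations fit together, including beforeEach/afterEach for the setup and teardown patterns mentioned earlier. The Calculator object is hypothetical:

// Minimal Jasmine sketch: a suite (describe), two specs (it), and
// expectations (expect). The Calculator is a made-up example class.
describe('Calculator', function () {
  var calc;

  beforeEach(function () {
    calc = new Calculator();            // common setup for every spec
  });

  afterEach(function () {
    calc = null;                        // release resources after each spec
  });

  it('adds two numbers', function () {
    expect(calc.add(2, 3)).toBe(5);
  });

  it('starts from zero', function () {
    expect(calc.total()).toBe(0);
  });
});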

2. RequireJS

RequireJS is a JavaScript library that allows you to organize code into modules, which are loaded dynamically on demand.

By dividing code into modules, you can speed up the load time of application components and keep your code better organized.

You can download RequireJS from http://www.requirejs.org
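
As a quick illustration of the module pattern, here is a minimal RequireJS sketch with made-up module names: define() declares a module and its dependencies, and require() loads the entry point on demand.

// js/pricing.js -- a leaf module with no dependencies
define(function () {
  return {
    withTax: function (amount) { return amount * 1.25; }
  };
});

// js/cart.js -- depends on 'pricing'; RequireJS loads it on demand
define(['pricing'], function (pricing) {
  return {
    total: function (items) {
      return items.reduce(function (sum, i) { return sum + pricing.withTax(i); }, 0);
    }
  };
});

// main.js -- entry point: nothing above is fetched until required here
require(['cart'], function (cart) {
  console.log(cart.total([10, 20])); // 37.5
});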

Part II of this two-part tutorial will discuss Behavioral Driven Testing and Software Testing – how to use BDD to test your JavaScript code. Don’t miss out: subscribe to our blog below.

————-

This post was written by Miguel Dominguez. Miguel is currently Senior Software Developer at digitallabs AB but also works as a freelance developer. His focus is on mobile application (Android) development, web front-end development (JavaScript, CSS, HTML5) and back-end development (MVC, .NET, Java). Follow Miguel’s blog.


Categories: Load & Perf Testing

“In The Real World”

DevelopSense - Michael Bolton - Mon, 04/21/2014 - 04:17
In Rapid Software Testing, James Bach, our colleagues, and I advocate an approach that puts the skill set and the mindset of the individual tester—rather than some document or tool or test case or process model—at the centre of testing. We advocate an exploratory approach to testing so that we find not only the problems […]
Categories: Software Testing

Uplink Latency of WiFi and 4G Networks

Ilya Grigorik - Mon, 04/21/2014 - 01:00

The user opens your application on their device and triggers an action requiring that we fetch a remote resource: the application invokes the appropriate platform API (e.g. XMLHttpRequest), the runtime serializes the request (e.g. translates it into a well-formed HTTP request) and passes the resulting byte buffer to the OS, which then fragments it into one or more TCP packets and finally passes the buffer to the link layer.

So far, so good, but what happens next? As you can guess, the answer depends on the properties of the current link layer in use on the device. Let's dig a bit deeper...

Transmitting over WiFi

If the user is on WiFi, then the link layer breaks up the data into multiple frames and (optimistically) begins transmitting data one frame at a time: it waits until the radio channel is "silent," transmits the WiFi frame, and then waits for an acknowledgement from the receiver before proceeding with transmission of the next frame. Yes, you've read that right, each frame requires a full roundtrip between the sender and receiver! 802.11n is the first standard to introduce "frame aggregation," which allows multiple frames to be sent in a single transmission.

Of course, not all transmissions will succeed on their first attempt. If two peers transmit at the same time and on the same channel then a collision will happen and both peers will have to retransmit data: both peers sleep for a random interval and then repeat the process. The WiFi access model is simple to understand and implement, but as you can guess, also doesn't provide any guarantees about the latency costs of the transmission. If the network is mostly idle, then transmission times are nice and low, but if the network is congested, then all bets are off.

In fact, don't be surprised to see 100ms+ delays just for the first hop between the WiFi sender and the access point - e.g. see the histogram above, showing 180ms+ first-hop latency tails on my own (home) WiFi network. That said, note that there is no "typical" or "average" uplink WiFi latency: the latency will depend on the conditions and load of the particular WiFi network. In short, expect high variability and long latency tails, with an occasional chance of network collapse if too many peers are competing for access.

If your WiFi access point is also your gateway, you can run a simple ping command to measure your first-hop latency.

Uplink scheduling on 4G networks

In order to make better use of the limited capacity of the shared radio channel and optimize energy use on the device, 4G/LTE standards take a much more hands-on approach to scheduling and resource assignment: the radio tower (eNodeB) notifies the device when it should listen for inbound data, and also tells the device when it is allowed to transmit data. As you can imagine, this can incur a lot of coordination overhead (read, latency), but such is the cost of achieving higher channel and energy efficiency.

  1. The radio network has a dedicated Physical Uplink Control Channel (PUCCH) which is used by the device to notify the radio network that it wants to transmit data: each device has a periodic timeslot (typically on a 5, 10, or 20 ms interval) where it is allowed to send a Scheduling Request (SR) that consists of a single bit indicating that it needs uplink access.

  2. The SR request bit is received by the radio tower (eNodeB) but the SR request on its own is not sufficient to assign uplink resources as it doesn't tell the scheduler the amount of data that the device intends to transfer. So, the eNodeB responds with a small "uplink grant" that is just large enough to communicate the size of the pending buffer.

  3. Once the device receives its first uplink grant, it waits for its turn to transmit (up to ~5 ms), and sends a Buffer Status Report (BSR) indicating the amount of application data pending in its upload buffers. Finally, the eNodeB receives the BSR message, allocates the necessary uplink resources and sends back another uplink grant that will allow the device to drain its buffer.

What happens if additional data is added to the device buffer while the above process is underway? Simple: the device sends another BSR message and waits for a new uplink grant! If timed correctly, the BSR requests and uplink grants can be pipelined with existing data transfers from the device, allowing us to minimize first-hop delays. On the other hand, once the device buffer is drained and new data arrives, the entire process is repeated all over again: SR, uplink grant, BSR, uplink grant, data transmission.

So, what does this all mean in practice? Let's do the math:

  • If the network is configured to use a 10 ms periodic interval for communicating SR messages then we would expect a ~5 ms average delay before the SR request is sent.
  • There are two full roundtrips between the device and the eNodeB to negotiate the uplink resource assignment to transmit pending application data. The latency incurred by these roundtrips will vary for each network, but as a rule of thumb each exchange is ~5 ms.

Add it all up, and we're looking at 20+ ms of delay between application data arriving at the (empty buffer) of the link layer on the device and the same data being available at the link layer of the eNodeB. From there the packet needs to traverse the carrier network, exit onto the public network, and get routed to your server.
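
As a sanity check, here is the same back-of-the-envelope sum in code, using only the rule-of-thumb numbers above (they are approximations from the text, not measurements):

// Back-of-the-envelope sum of the rule-of-thumb numbers above.
var srWait   = 5; // avg wait for the periodic SR slot (10 ms interval / 2)
var srGrant  = 5; // SR -> small uplink grant exchange
var bsrWait  = 5; // wait for the turn to transmit the BSR (up to ~5 ms)
var bsrGrant = 5; // BSR -> full uplink grant exchange

console.log(srWait + srGrant + bsrWait + bsrGrant); // ~20 ms before data moves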

The uplink latency overhead described above is one reason why low-latency applications, such as delivering voice, can be a big challenge over 4G networks. In fact, for voice specifically, there is ongoing work on Voice over LTE (VoLTE) which aims to address this problem. How? Well, one way is to provide a persistent uplink grant: transmit up to X bytes on a Y periodic interval. Believe it or not, today most 4G networks still fall back to old 3G infrastructure to transmit voice!

Optimizing for WiFi and 4G networks

As you can tell, both WiFi and 4G have their challenges. WiFi can deliver low latency first hop if the network is mostly idle: no coordination is required and the device can transmit whenever it senses that the radio channel is idle. On the other hand, WiFi is subject to high variability and long latency tails if the network has many peers competing for access - and most networks do.

By contrast, 4G networks require coordination between the device and the radio tower for each uplink transfer, which translates to higher minimum latency, but the upside is that 4G can rein in the latency tails, provide more predictable performance, and reduce congestion.

So, how does all this impact application developers? First off, latency aside and regardless of wireless technology, consider the energy costs of your network transfers! Periodic transfers incur high energy overhead due to the need to wake up the radio on each transmission. Second, those same periodic transfers also incur high uplink coordination overhead - 4G in particular. In short, don’t trickle data. Aggregate your network requests and fire them in one batch: you will reduce energy costs and reduce latency by amortizing the scheduling overhead.
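
As a rough sketch of that batching advice (the endpoint and event shape are hypothetical), you can buffer events in memory and flush them in a single request on a coarse interval:

// Minimal sketch of "don't trickle data": buffer events and flush them in a
// single request so the radio wakes up (and uplink grants are negotiated)
// once per batch instead of once per event. The endpoint is hypothetical.
var pending = [];

function track(event) {
  pending.push(event);                  // cheap, no network activity
}

function flush() {
  if (pending.length === 0) return;
  var xhr = new XMLHttpRequest();
  xhr.open('POST', 'https://example.com/analytics/batch');
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.send(JSON.stringify(pending));    // one uplink transfer for the batch
  pending = [];
}

// Flush on a coarse interval (and on page unload) rather than per event.
setInterval(flush, 30000);
window.addEventListener('beforeunload', flush);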

Webcast: Debugging Mobile Web Apps: Tips, Tricks, Tools, and Techniques - Jul 17 2014

O'Reilly Media - Fri, 04/18/2014 - 04:21

Debugging web apps across multiple platforms and devices can be extremely difficult. Fortunately there are a few cutting edge tools that can ease the pain. Follow along as Jonathan demonstrates how - and when - to use the latest and greatest tools and techniques to debug the mobile web.

About Jonathan Stark

Jonathan Stark is a mobile strategy consultant who helps CEOs transition their business to mobile.

Jonathan is the author of three books on mobile and web development, most notably O'Reilly's Building iPhone Apps with HTML, CSS, and JavaScript which is available in seven languages.

His Jonathan's Card experiment made international headlines by combining mobile payments with social giving to create a "pay it forward" coffee movement at Starbucks locations all over the U.S.

Hear Jonathan speak, watch his talk show, listen to his podcast (co-hosted with the incomparable @kellishaver), join the mailing list, or connect online:

Stress Testing Drupal Commerce

LoadStorm - Thu, 04/17/2014 - 07:53

I’ve had the pleasure of working with Andy Kucharski for several years on various performance testing projects. He’s recognized as one of the top Drupal performance experts in the world. He is the Founder of Promet Source and is a frequent speaker at conferences, as well as a great client of LoadStorm. As an example of his speaking prowess, he gave the following presentation at Drupal Mid Camp in Chicago 2014.

Promet Source is a Drupal web application and website development company that offers expert services and support. They specialize in building and performance tuning complex Drupal web applications. Andy’s team worked with our Web Performance Lab to conduct stress testing on Drupal Commerce in a controlled environment. He is skilled at using LoadStorm and New Relic to push Drupal implementations to the point of failure. His team tells me he is good at breaking things.

In this presentation at Drupal Mid Camp, Andy explained how his team ran several experiments in which they load tested a Drupal Commerce Kickstart site on an AWS instance and then compared how the site performed after several well-known performance tuning enhancements were applied. They compared performance improvements after enabling Drupal cache, aggregation, Varnish, and an Nginx reverse proxy.

View the slideshare below for a summary of why web performance matters and to see how they used LoadStorm to prove that they were able to scale Drupal Commerce from a point of failure (POF) of 100 users to 450 users. That’s a tremendous 4.5x improvement in scalability!

Drupal commerce performance profiling and tunning using loadstorm experiments drupal mid camp chicago 2014 from Andrew Kucharski

The post Stress Testing Drupal Commerce appeared first on LoadStorm.

Very Short Blog Posts (16): Usability Problems Are Probably Testability Problems Too

DevelopSense - Michael Bolton - Wed, 04/16/2014 - 13:04
Want to add ooomph to your reports of usability problems in your product? Consider that usability problems also tend to be testability problems. The design of the product may make it frustrating, inconsistent, slow, or difficult to learn. Poor affordances may conceal useful features and shortcuts. Missing help files could fail to address confusion; self-contradictory […]
Categories: Software Testing

Top 3 PHP Performance Tips for Continuous Delivery

Perf Planet - Wed, 04/16/2014 - 05:35
Are you developing or hosting PHP applications? Are you doing performance sanity checks along your delivery pipeline? No? Not Yet? Then start with a quick check. It only takes 15 minutes and it really pays off. As developer you can improve your code, and as somebody responsible for your build pipeline you can automate these […]

Seven Reasons Why our iOS App Costs $99.99

Perf Planet - Tue, 04/15/2014 - 09:26

That’s not a typo. The paid version of the HttpWatch app really costs $99.99. We’ve had some great feedback about the app but there’s been a fair amount of surprise and disbelief that we would attempt to sell an app for 100x more than Angry Birds:

“I like this app and what it offers; providing waterfalls charts and webpage testing tools on iOS but £70 for the professional version is way out most people’s budget. Even my company won’t pay for it at that price even though we’re Sales Engineers all about CDN & website acceleration”

“Its a perfect app, BUTTTT the profissional version is outrageously expensive! I always buy the apps that I like, but this value is prohibitively expensive! Unfeasible!”

“$100+ for pro is stupidly expensive though. I would have dropped maybe $10 on it but not $100…”

So why don’t we just drop the price to a few dollars and make everyone happy? Here are some of the reasons why.

1. We Need To Make a Profit

We stay in business by developing software and selling it at a profit. We’re not looking to sell advertising, get venture capital, build market share or sell the company. Revenue from the software we sell has to pay for salaries, equipment, software, web site hosting and all the other expenses that are incurred when developing and selling software.

Ultimately, everything we do has to pay for itself one way or another. Our app is priced to allow us to recoup our original app development costs and cover future upgrades and bug fixes.

2. After Apple’s Cut It’s Actually a $70 app

Apple takes a hefty 30% margin on every app store transaction. It doesn’t matter who you are - every app developer has the 30% deducted from their revenue.

Therefore each sale of our app yields about $70 or £40 depending on slight pricing variations in each country’s app store.

3. This Isn’t a Mass Market App

Unfortunately, the market for paid apps that are aimed at a technical audience like HttpWatch is very small compared to gaming apps like Angry Birds or apps with general appeal like Paper.

It’s unreasonable to expect narrow market apps to receive the development attention they deserve if they sell for just a few dollars. The app store is littered with great apps that have never been updated for the iPhone 5 screen size or iOS 7 because they don’t generate enough revenue to make it worthwhile.

4. Dropping the Price Doesn’t Always Increase App Revenue

In the past few years there’s been a race to the bottom, with even low-price paid apps pushed out by free apps that have in-app purchases. There’s a general expectation that apps should always be just a few dollars or free. We often hear that if we dropped the price of the app to under $10 there would be a massive increase in sales, leading to an increase in revenue.

To test this out we ran some pricing experiments, first dropping the price to $9.99 and then to $19.99, and comparing the sales volume and revenue against the $99.99 pricing. There was a significant increase in sales volume of nearly 500% at the $9.99 price:

Interestingly, dropping the price to $19.99 seemed to make no difference to the number of sales.

However, the 90% price drop led to revenue falling by more than 50% at $9.99 compared to the $99.99 pricing.
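
For what it’s worth, the arithmetic behind that result is straightforward. Here is a rough sketch using the article’s own numbers (list prices only, since Apple’s 30% cut cancels out of the comparison):

// Rough illustration: a 90% price cut needs roughly a 10x jump in unit sales
// just to hold revenue level. Volume multipliers below are approximations.
function relativeRevenue(priceFactor, volumeFactor) {
  return priceFactor * volumeFactor;    // revenue scales with price x volume
}

// $99.99 -> $9.99 with "nearly 500%" more sales (roughly 5-6x the volume):
console.log(relativeRevenue(0.1, 5));   // 0.5 -> revenue down ~50%
console.log(relativeRevenue(0.1, 6));   // 0.6 -> still down ~40%

// Break-even volume for a 90% price drop:
console.log(1 / 0.1);                   // 10x the unit sales required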

5. There’s no Upgrade or Maintenance Pricing on the App Store

For software developers the major downside of the app store is that there is no mechanism for offering upgrades or maintenance to existing app users at a reduced rate compared to the price of a new app.

For paid apps you really only get one chance to get revenue from a customer unless you create a whole new app (e.g. like Rovio did with Angry Birds Star Wars and Angry Birds Seasons). Creating a whole new technical app at regular intervals doesn’t make sense as so much of the functionality would be similar and it would alienate existing users who would have to pay the full app price to ‘upgrade’.

6. We Plan to Keep Updating the App

Charging a relatively high initial price for the app means that we can justify its continued maintenance in the face of OS and device changes as well as adding new features.

7. We Want to Interact with Customers

Talking to customers to get feedback, provide support and discuss ideas for new features has a cost.

A good programmer in the UK costs about $40 an hour. If an app sells for $10 that only pays for about ten minutes of a programmer’s time. Therefore, spending more than ten minutes interacting with a customer who bought a $10 app effectively results in a loss on that app sale.

Uncover Hidden Performance Issues Through Continuous Testing

LoadImpact - Tue, 04/15/2014 - 03:24

On-premise test tools, APMs, CEMs and server/network based monitoring solutions may not be giving you a holistic picture of your system’s performance; cloud-based continuous testing can.  

When it comes to application performance a wide array of potential causes of performance issues and end user dissatisfaction exist.  It is helpful to view the entire environment, from end user browser or mobile device all the way through to the web and application servers, as the complex system that it is.

Everything between the user’s browser or mobile and your code can affect performance

The state of the art in application performance monitoring has evolved to include on-premise test tools, Application Performance Management (APM) solutions, customer experience monitoring (CEM) solutions, and server and network based monitoring. All of these technologies seek to determine the root causes of performance problems, real or perceived by end users. Each of these technologies has its own merits and costs and seeks to tackle the problem from a different angle. Often a multifaceted approach is required when high value, mission critical applications are being developed and deployed.

On-premise solutions can blast the environment with 10+Gbit/sec of traffic in order to stress routers, switches and servers. These solutions can be quite complex and costly, and are typically used to validate new technology before it can be deployed in the enterprise.

APM solutions can be very effective in determining if network issues are causing performance problems or if the root cause is elsewhere. They will typically take packet data from a switch SPAN port or TAP (test access point), or possibly a tap-aggregation solution. APM solutions are typically “always-on” and can be an early warning system, detecting application problems before the help desk knows about an issue. These systems can also be very complex and will require training and professional services to get the maximum value.

What all of these solutions lack is a holistic view of the system which has to take into account edge devices (Firewalls, Anti-Malware, IPS, etc), network connectivity and even endpoint challenges such as packet loss and latency of mobile connections. Cloud-based testing platforms such as Load Impact allow both developers and application owners to implement a continuous testing methodology that can shed light on issues that can impact application performance that might be missed by other solutions.

A simple way to accomplish this is to perform a long-term (1 to 24+ hr) application response test to look for anomalies that can crop up at certain times of day. In this example I compressed the timescale and introduced my own anomalies to illustrate the effects of common infrastructure changes.

The test environment is built on an ESXi platform and includes a 10 Gbit virtual network, a 1 Gbit physical LAN, an Untangle NG Firewall, and a 50/5 Mbit/sec internet link. For the purposes of this test the production configuration of the Untangle NG Firewall was left intact, including firewall rules and IPS protections; however, QoS was disabled. TurnKey Linux was used for the Ubuntu-based Apache webserver, with 8 CPU cores and 2 GB of RAM.

It was surprising to me what did impact response times and what had no effect whatsoever.  Here are a few examples:

First up is the impact of bandwidth consumption on the link serving the webserver farm.  This was accomplished by saturating the download link with traffic, and as expected it had a dramatic impact on application response time:

At approx 14:13 link saturation occurred (50Mbit) and application response times nearly tripled as a result

Snapshot of the Untangle Firewall throughput during link saturation testing

Next up is executing a VMware snapshot of the webserver. I fully expected this to impact response times significantly, but the impact is brief. If this were a larger VM, the impact could have lasted longer:

This almost 4x spike in response time only lasts a few seconds and is the result of a VM snapshot

Lastly, I ran a test to simulate network congestion on the LAN segment where the webserver is running.

This test was accomplished using Iperf to generate 6+ Gbit/sec of network traffic to the webserver VM.  While I fully expected this to impact server response times, the fact that it did not is a testament to how good the 10gig vmxnet3 network driver is:

Using Iperf to generate a link-saturating 15+Gbit/sec of traffic to Apache (Ubuntu on VM)

 

In this test approx 5.5 Gbit/sec was generated to the webserver, with no impact whatsoever on response times

Taking a continuous monitoring approach to application performance has benefits not only for application developers and owners, but also for those responsible for network, security and server infrastructure. The ability to pinpoint the moment when performance degrades and correlate that with server resources (using the Load Impact Server Metrics Agent) and other external events is very powerful.

Oftentimes application owners do not have control of or visibility into the entire infrastructure, and having concrete “when and where” evidence makes conversations with other teams in the organization more productive.

———-

This post was written by Peter Cannell. Peter has been a sales and engineering professional in the IT industry for over 15 years. His experience spans multiple disciplines including Networking, Security, Virtualization and Applications. He enjoys writing about technology and offering a practical perspective to new technologies and how they can be deployed. Follow Peter on his blog or connect with him on Linkedin.


Categories: Load & Perf Testing

Testing on the Toilet: Test Behaviors, Not Methods

Google Testing Blog - Mon, 04/14/2014 - 15:25
by Erik Kuefler

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

After writing a method, it's easy to write just one test that verifies everything the method does. But it can be harmful to think that tests and public methods should have a 1:1 relationship. What we really want to test are behaviors, where a single method can exhibit many behaviors, and a single behavior sometimes spans across multiple methods.

Let's take a look at a bad test that verifies an entire method:

@Test public void testProcessTransaction() {
  User user = newUserWithBalance(LOW_BALANCE_THRESHOLD.plus(dollars(2)));
  transactionProcessor.processTransaction(
      user,
      new Transaction("Pile of Beanie Babies", dollars(3)));
  assertContains("You bought a Pile of Beanie Babies", ui.getText());
  assertEquals(1, user.getEmails().size());
  assertEquals("Your balance is low", user.getEmails().get(0).getSubject());
}

Displaying the name of the purchased item and sending an email about the balance being low are two separate behaviors, but this test looks at both of those behaviors together just because they happen to be triggered by the same method. Tests like this very often become massive and difficult to maintain over time as additional behaviors keep getting added in—eventually it will be very hard to tell which parts of the input are responsible for which assertions. The fact that the test's name is a direct mirror of the method's name is a bad sign.

It's a much better idea to use separate tests to verify separate behaviors:

@Test public void testProcessTransaction_displaysNotification() {
  transactionProcessor.processTransaction(
      new User(), new Transaction("Pile of Beanie Babies"));
  assertContains("You bought a Pile of Beanie Babies", ui.getText());
}

@Test public void testProcessTransaction_sendsEmailWhenBalanceIsLow() {
  User user = newUserWithBalance(LOW_BALANCE_THRESHOLD.plus(dollars(2)));
  transactionProcessor.processTransaction(
      user,
      new Transaction(dollars(3)));
  assertEquals(1, user.getEmails().size());
  assertEquals("Your balance is low", user.getEmails().get(0).getSubject());
}

Now, when someone adds a new behavior, they will write a new test for that behavior. Each test will remain focused and easy to understand, no matter how many behaviors are added. This will make your tests more resilient since adding new behaviors is unlikely to break the existing tests, and clearer since each test contains code to exercise only one behavior.

Categories: Software Testing

AB Testing – Episode 3

Alan Page - Mon, 04/14/2014 - 11:02

Yes – it’s more of Brent and Alan yelling at each other. But this time, Brent says that he, “hates the Agile Manifesto”. I also talk about my trip to Florida, and we discuss estimation and planning and a bunch of other stuff.

Subscribe to the ABTesting Podcast!

Subscribe via RSS
Subscribe via iTunes

(potentially) related posts:
  1. Alan and Brent talk testing…
  2. More Test Talk with Brent
  3. Thoughts on Swiss Testing Day
Categories: Software Testing

Performance Testing Insights: Part I

LoadStorm - Sat, 04/12/2014 - 18:35

Performance Testing can be viewed as the systematic process of collecting and monitoring the results of system usage, then analyzing them to aid system improvement towards desired results. As part of the performance testing process, the tester needs to gather statistical information, examine server logs and system state histories, determine the system’s performance under natural and artificial conditions and alter system modes of operation.

Performance testing complements functional testing. Functional testing can validate proper functionality under correct usage and proper error handling under incorrect usage. It cannot, however, tell how much load an application can handle before it breaks or performs improperly. Finding the breaking points and performance bottlenecks, as well as identifying functional errors that only occur under stress, requires performance testing.

The purpose of Performance testing is to demonstrate that:

1. The application processes required business process and transaction volumes within specified response times in a real-time production database (Speed).

2. The application can handle various user load scenarios (stresses), ranging from a sudden load “spike” to a persistent load “soak” (Scalability).

3. The application is consistent in availability and functional integrity (Stability).

4. The minimum configuration that will allow the system to meet the formally stated performance expectations of stakeholders can be determined.

When should I start testing and when should I stop?

When to Start Performance Testing:

A common practice is to start performance testing only after functional, integration, and system testing are complete; that way, it is understood that the target application is “sufficiently sound and stable” to ensure valid performance test results. However, the problem with the above approach is that it delays performance testing until the latter part of the development lifecycle. Then, if the tests uncover performance-related problems, one has to resolve problems with potentially serious design implications at a time when the corrections made might invalidate earlier test results. In addition, the changes might destabilize the code just when one wants to freeze it, prior to beta testing or the final release.
A better approach is to begin performance testing as early as possible, just as soon as any of the application components can support the tests. This will enable users to establish some early benchmarks against which performance measurement can be conducted as the components are developed.

When to Stop Performance Testing:

The conventional approach is to stop testing once all planned tests are executed and there is a consistent and reliable pattern of performance improvement. This approach gives users accurate performance information at that point in time. However, one can quickly fall behind by just standing still. The environment in which clients will run the application will always be changing, so it’s a good idea to run ongoing performance tests. Another alternative is to set up a continual performance test and periodically examine the results. One can “overload” these tests by making use of real-world conditions. Regardless of how well it is designed, one will never be able to reproduce all the conditions that the application will have to contend with in the real-world environment.

The post Performance Testing Insights: Part I appeared first on LoadStorm.
