Software Testing

Lessons from An Engineer

QA Hates You - Tue, 04/29/2014 - 04:49

A slideshow of What I Learned in Engineering School presents a number of lessons for software development if you look at them the right way.

For example:

A skyscraper is a vertically cantilevered beam. The primary structural design consideration is not resistance to vertical (gravity) loads, but resistance to lateral loads from wind and earthquakes. For this reason, tall structures function and are designed conceptually as large beams cantilevered from the ground.

Just like how we have to test software. A building is supposed to stand up, so simplistically speaking, you’d think it would have to be strong and rigid against gravity. But there are other forces at work to account for.

So with a piece of software: simplistically, it’s designed to perform a task, and simplistic testing makes sure it does that task adequately. However, when looking at it from the tester’s perspective, you have to account for other forces besides the drive to get to the software’s goal. You’ve got to account for the real world, people making mistakes, and interactions that are sometimes hard to predict in a requirements guide or on a napkin.


Work with the natural order. The locks of the Panama Canal are operated without pumps. Gravity moves millions of gallons of water from lakes to the lock chambers, where ships are raised and lowered 85 feet in passing between the Atlantic and Pacific oceans. As long as precipitation refills the lakes, the locks continue to function.

Your software will work better for your users if it conforms to their knowledge, their expectations, and their habits. Also, your processes will work better if you take into account your corporate (or organizational) environment, people, habits, and whatnot. You can’t come in and make everybody change to the better way you know just because you know it’s better. You have to take stock of what’s already going on and craft the better processes without trying to push water uphill.

At any rate, it’s worth a read with an eye to the lessons you can apply to software development.

Categories: Software Testing

QA Music – An Integration

QA Hates You - Mon, 04/28/2014 - 06:31

We’ve featured the band Within Temptation before, and we’ve featured the song “Radioactive”.

An expensive integration later, and we have Within Temptation doing “Radioactive”:

That will be $450,000 for something delivered three years late. I am a consultant, you know.

Categories: Software Testing

Web Performance of Ecommerce Applications

LoadStorm - Thu, 04/24/2014 - 13:26

You probably have been wondering why I’ve posted so infrequently over the past year. We have been bombarded with emails and phone calls demanding more blogging that includes my extremely dry, obtuse humor. So, in the interest of global stability and national security, I must acquiesce to the will of the masses.

Right. That’s a joke. I’m only slightly more popular than Justin Bieber. If you are a serious tech geek, you’ll need to look up that Bieber reference.

Web performance is why you read our blog, and web performance is my life. I want our blog to contain the most interesting information about load testing, page speed, and application scalability. In order to deliver on that goal, we came up with the concept of using LoadStorm and other tools to gather actual performance data regarding as many web applications as possible.

Thus, the Web Performance Lab was born.

Why Create a Web Performance Lab?

Amazon EC2 General Purpose Types of Server Instances

The Web Performance Lab is designed to be a virtual sandbox where we install software and run experiments to find out how the software performs. Why? Because I sit around and wonder if an AWS EC2 m3.2xlarge instance will run a web app four times faster than an m3.large. It should, since it has 8 virtual CPUs compared to 2, and 30 GB of memory compared to 7.5. That’s simple math, but rarely does the linear math work out in web performance. Having 4x the horsepower on the hardware does NOT guarantee 4x the speed or scalability.
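To see why 4x the hardware rarely buys 4x the speed, here is a rough sketch using Amdahl's law. This is an assumed textbook model, not something measured in the lab, and the "parallel fraction" values are made up purely for illustration; real web apps are also limited by memory, I/O, and the database.

```java
// Illustrative only: Amdahl's law is an assumed model, not a lab measurement.
final class ScalingSketch {

    /** Speedup predicted by Amdahl's law when only part of the work benefits from extra resources. */
    static double amdahlSpeedup(double parallelFraction, double resourceMultiplier) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || resourceMultiplier <= 0.0) {
            throw new IllegalArgumentException("fraction must be in [0,1] and multiplier positive");
        }
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / resourceMultiplier);
    }

    public static void main(String[] args) {
        double multiplier = 4.0; // m3.large (2 vCPUs) -> m3.2xlarge (8 vCPUs)
        // Hypothetical fractions of the request path that can use the extra CPUs.
        for (double fraction : new double[] {0.5, 0.8, 0.95, 1.0}) {
            System.out.printf("parallel fraction %.2f -> predicted speedup %.2fx%n",
                    fraction, amdahlSpeedup(fraction, multiplier));
        }
    }
}
```

Only when every bit of the work scales with the hardware does the model predict the full 4x; if half of it is serialized, the prediction drops to 1.6x. That gap is exactly what the experiments are meant to measure rather than assume.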

WPL (I’ll abbreviate because we geeks love acronyms) gives us a playground to satisfy our curiosity. I want to know:

  • What is the most scalable ecommerce application?
  • How many concurrent users can WordPress handle on a small server?
  • Which cloud platform gives me the biggest bang for the buck?
  • Does Linux or Windows provide a better stack for scalability?
  • Can I fairly compare open source software performance to commercial solutions?
  • Who will help me create statistically accurate load testing scenarios for experiments?
  • Are other performance engineers as curious as me about these comparison test results?
  • How can I involve other application experts to achieve more useful performance results?
  • What other tools (not LoadStorm) should I put in the WPL toolkit for testing?
  • Can we identify the top 10 best performance optimizations that apply to each web application?
  • Who are the top experts on each of the most important web applications?
  • Do they know anything about tuning their apps for scalability?
  • Will they want to participate?
  • Are they able to share their knowledge in an objective set of experiments?
  • What are the decision criteria in picking the best server monitoring tool?
  • Nginx vs. Apache – what are the real numbers to compare?
  • How many types of caching exist?
  • What type of caching gives the highest level of increase to scalability?
  • Can we empirically conclude anything from running a Webpagetest or Pagespeed or Yslow analysis?
  • Are the advertised “free” server monitoring tools actually free?
  • Will most web applications get memory bound before being CPU-constrained?
  • NewRelic vs. AppDynamics – which is best for what type of architecture?
  • How do open source load testing tools such as JMeter compare to cloud solutions such as LoadStorm?
  • Drupal vs. WordPress vs. Joomla vs. Redaxscript vs. concrete5 vs. pimcore?
  • Magento vs. osCommerce vs. OpenCart vs. WooCommerce vs. Drupal Commerce vs. VirtueMart?

There are hundreds of similar questions that I ponder with Elijah Craig. He and I are very productive in the evenings, and he helps me plan experiments to solve the riddles of web performance that appear to me in visions after a long day of work in the load testing salt mines.

Ecommerce Application Performance is High Priority

U.S. Online Retail Sales 2012-2017

That last question in the list is worthy of assigning highest priority in the web performance lab. Online retail is a great place to start because it has so much at stake. There are so many good test scenarios to build into experiments. Ecommerce provides excellent examples of business processes that are important to measure.

With over $250 billion of online sales in the U.S. alone during 2013, and with over 15% annual growth, how could we ignore ecommerce? It’s the biggest market possible for our Web Performance Lab to address. The stakes are enormous. My hope is that other people will be as interested as I am.

Cyber Monday 2013 generated $1.7 billion in sales for a single day! What ecommerce applications are generating the most money? I doubt we will ever know, nor will the WPL answer that question. However, some of you reading this blog will want to get your share of that $300 billion this year, and the $340 billion in 2015, so I’m certain that you need to understand which online retail platform is going to perform best. You need to know, right?

Cyber Monday sales data

We will run experiments to gather objective performance measurements. How many combinations and permutations can we try?

  • Each app with out-of-the-box configuration (no tuning) on an EC2 large instance, c3.8xlarge, r3.8xlarge, hs1.8xlarge, m1.small
  • Each app (no tuning) running on a similar server in Google Compute Engine, Rackspace, Azure, Joyent, Bluelock, Savvis
  • Each app (no tuning) running on same hardware using Ubuntu, Fedora, CentOS, Debian, Red Hat, Gentoo, Windows
  • Each app tuned with web application caching such as memcached, Varnish, JCS
  • Each app tuned with a CDN such as Cloudfront, Akamai, Edgecast, Cloudflare, MaxCDN
  • Each app tuned with different web server configurations
  • Each app tuned with different database configurations
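
As a back-of-the-envelope answer to "how many combinations," the sketch below multiplies the number of options in each dimension. The dimension names and the class itself are hypothetical; the option lists are copied from the lists in this post, and it assumes Java 9+ for `List.of`. It only counts one option per dimension and ignores the tuning variations, so treat the number as a lower bound.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for sizing the experiment matrix; not part of any LoadStorm tooling.
final class ExperimentMatrix {

    /** Number of distinct experiments when exactly one option is chosen per dimension. */
    static long countCombinations(Map<String, List<String>> dimensions) {
        long total = 1;
        for (List<String> options : dimensions.values()) {
            total = Math.multiplyExact(total, (long) options.size());
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dims = new LinkedHashMap<>();
        dims.put("app", List.of("Magento", "osCommerce", "OpenCart",
                "WooCommerce", "Drupal Commerce", "VirtueMart"));
        dims.put("instance", List.of("large", "c3.8xlarge", "r3.8xlarge", "hs1.8xlarge", "m1.small"));
        dims.put("os", List.of("Ubuntu", "Fedora", "CentOS", "Debian", "Red Hat", "Gentoo", "Windows"));
        dims.put("cache", List.of("memcached", "Varnish", "JCS"));
        System.out.println(countCombinations(dims) + " experiments");
    }
}
```

Six apps, five instance types, seven operating systems, and three caches already come to 630 runs before any CDN, database, or web server tuning is layered on, which is why we have to prioritize.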

We have been running some of these experiments during the past few months. Esteban shared some of his results in blog posts earlier. My problem with his work is that some of the conclusions aren’t as solid as I would prefer. I spent some time reviewing his results with him in a conference room recently, and I poked some holes in his logic.

Now don’t get me wrong, I am an Esteban fan. He is a friend and a high-character guy. That said, we all learn from experiments. We try, we ponder, we learn. That’s how humans gain understanding. As a child you figure out the world by putting your hand on a hot stove. You register that learning experience in your brain, and you don’t do that again. You find out the best ways to accomplish objectives by failing. Just ask Edison. He figured out 1,000 ways NOT to create a functional lightbulb before he found the correct way. So it is with WPL. We are learning from trying.

Therefore, we are beginning a new series of experiments on ecommerce platforms. We will be publishing the results more quickly and with less filtering. We hope you find it useful and interesting. Please feel free to comment and make suggestions. Also, if you disagree with our statistical approach or calculations, please let us know. Recommendations are also welcome for ways to improve our scientific method employed during the experiments.

The post Web Performance of Ecommerce Applications appeared first on LoadStorm.

Very Short Blog Posts (17): Regression Obsession

DevelopSense - Michael Bolton - Thu, 04/24/2014 - 13:03
Regression testing is focused on the risk that something that used to work in some way no longer works that way. A lot of organizations (Agile ones in particular) seem fascinated by regression testing (or checking) above all other testing activities. It’s a good idea to check for the risk of regression, but it’s also […]
Categories: Software Testing

The Power To Declare Something Is NOT A Bug

Eric Jacobson's Software Testing Blog - Thu, 04/24/2014 - 11:20

Many think testers have the power to declare something as a bug.  This normally goes without saying.  How about the inverse? 

Should testers be given the power to declare something is NOT a bug? 

Well…no, IMO.  That sounds dangerous because what if the tester is wrong?  I think many will agree with me.  Michael Bolton asked the above question in response to a commenter on this post.  It really gave me pause. 

For me, it means maybe testers should not be given the power to run around declaring things as bugs either.  They should instead raise the possibility that something may be a problem.  Then I suppose they could raise the possibility something may not be a problem.

Categories: Software Testing

A Tale of Four Projects

DevelopSense - Michael Bolton - Wed, 04/23/2014 - 16:53
Once upon a time, in a high-tech business park far, far away, there were four companies, each working on a development project. In Project Blue, the testers created a suite of 250 test cases, based on 50 use cases, before development started. These cases remained static throughout the project. Each week saw incremental improvement in the […]
Categories: Software Testing

What He Said

QA Hates You - Tue, 04/22/2014 - 09:29

Wayne Ariola in SD Times:

Remember: The cost of quality isn’t the price of creating quality software; it’s the penalty or risk incurred by failing to deliver quality software.

Word to your mother, who doesn’t understand why the computer thing doesn’t work any more and is afraid to touch computers because some online provider used her as a guinea pig in some new-feature experiment with bugs built right in.

Categories: Software Testing

“In The Real World”

DevelopSense - Michael Bolton - Mon, 04/21/2014 - 04:17
In Rapid Software Testing, James Bach, our colleagues, and I advocate an approach that puts the skill set and the mindset of the individual tester (rather than some document or tool or test case or process model) at the centre of testing. We advocate an exploratory approach to testing so that we find not only the problems […]
Categories: Software Testing

Stress Testing Drupal Commerce

LoadStorm - Thu, 04/17/2014 - 07:53

I’ve had the pleasure of working with Andy Kucharski for several years on various performance testing projects. He’s recognized as one of the top Drupal performance experts in the world. He is the Founder of Promet Source and is a frequent speaker at conferences, as well as a great client of LoadStorm. As an example of his speaking prowess, he gave the following presentation at Drupal Mid Camp in Chicago 2014.

Promet Source is a Drupal web application and website development company that offers expert services and support. They specialize in building and performance tuning complex Drupal web applications. Andy’s team worked with our Web Performance Lab to conduct stress testing on Drupal Commerce in a controlled environment. He is skilled at using LoadStorm and New Relic to push Drupal implementations to the point of failure. His team tells me he is good at breaking things.

In this presentation at Drupal Mid Camp, Andy explained how his team ran several experiments in which they load tested a kickstarter Drupal Commerce site on an AWS instance and then compared how the site performed after several well-known performance tuning enhancements were applied. They compared the performance improvements after enabling Drupal cache, aggregation, Varnish, and an Nginx reverse proxy.

View the SlideShare below for a summary of why web performance matters and to see how they used LoadStorm to show that they were able to scale Drupal Commerce from a point of failure (POF) of 100 users to 450 users! That’s 4.5 times the original capacity, a tremendous 350% improvement in scalability.

Drupal commerce performance profiling and tunning using loadstorm experiments drupal mid camp chicago 2014 from Andrew Kucharski

The post Stress Testing Drupal Commerce appeared first on LoadStorm.

Very Short Blog Posts (16): Usability Problems Are Probably Testability Problems Too

DevelopSense - Michael Bolton - Wed, 04/16/2014 - 13:04
Want to add ooomph to your reports of usability problems in your product? Consider that usability problems also tend to be testability problems. The design of the product may make it frustrating, inconsistent, slow, or difficult to learn. Poor affordances may conceal useful features and shortcuts. Missing help files could fail to address confusion; self-contradictory […]
Categories: Software Testing

Testing on the Toilet: Test Behaviors, Not Methods

Google Testing Blog - Mon, 04/14/2014 - 15:25
by Erik Kuefler

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

After writing a method, it's easy to write just one test that verifies everything the method does. But it can be harmful to think that tests and public methods should have a 1:1 relationship. What we really want to test are behaviors, where a single method can exhibit many behaviors, and a single behavior sometimes spans across multiple methods.

Let's take a look at a bad test that verifies an entire method:

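@Test public void testProcessTransaction() {
  // Assumed fixture: the balance starts $2 above the low-balance threshold.
  User user = newUserWithBalance(LOW_BALANCE_THRESHOLD.plus(dollars(2)));
  transactionProcessor.processTransaction(
      user,
      new Transaction("Pile of Beanie Babies", dollars(3)));
  // Two unrelated behaviors are asserted in the same test.
  assertContains("You bought a Pile of Beanie Babies", ui.getText());
  assertEquals(1, user.getEmails().size());
  assertEquals("Your balance is low", user.getEmails().get(0).getSubject());
}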

Displaying the name of the purchased item and sending an email about the balance being low are two separate behaviors, but this test looks at both of those behaviors together just because they happen to be triggered by the same method. Tests like this very often become massive and difficult to maintain over time as additional behaviors keep getting added in—eventually it will be very hard to tell which parts of the input are responsible for which assertions. The fact that the test's name is a direct mirror of the method's name is a bad sign.

It's a much better idea to use separate tests to verify separate behaviors:

@Test public void testProcessTransaction_displaysNotification() {
  transactionProcessor.processTransaction(
      new User(), new Transaction("Pile of Beanie Babies"));
  assertContains("You bought a Pile of Beanie Babies", ui.getText());
}

@Test public void testProcessTransaction_sendsEmailWhenBalanceIsLow() {
  // Assumed fixture: the balance starts $2 above the low-balance threshold.
  User user = newUserWithBalance(LOW_BALANCE_THRESHOLD.plus(dollars(2)));
  transactionProcessor.processTransaction(
      user,
      new Transaction(dollars(3)));
  assertEquals(1, user.getEmails().size());
  assertEquals("Your balance is low", user.getEmails().get(0).getSubject());
}

Now, when someone adds a new behavior, they will write a new test for that behavior. Each test will remain focused and easy to understand, no matter how many behaviors are added. This will make your tests more resilient since adding new behaviors is unlikely to break the existing tests, and clearer since each test contains code to exercise only one behavior.
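
The snippets above lean on fixtures that aren't shown (`User`, `Transaction`, `ui`, `transactionProcessor`). As a minimal sketch of the same idea that runs with a plain JDK and no test framework, the classes below are hypothetical stand-ins invented for illustration, including the threshold value; they are not the article's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the fixtures the article's snippets assume.
final class BehaviorTestSketch {
    static final int LOW_BALANCE_THRESHOLD = 10; // assumed value, in dollars

    static final class User {
        int balance;
        final List<String> emailSubjects = new ArrayList<>();
        User(int balance) { this.balance = balance; }
    }

    static final class Ui {
        private final StringBuilder text = new StringBuilder();
        void show(String message) { text.append(message).append('\n'); }
        String getText() { return text.toString(); }
    }

    static final class TransactionProcessor {
        private final Ui ui;
        TransactionProcessor(Ui ui) { this.ui = ui; }

        void processTransaction(User user, String item, int price) {
            user.balance -= price;
            ui.show("You bought a " + item);              // behavior 1: notification
            if (user.balance < LOW_BALANCE_THRESHOLD) {   // behavior 2: low-balance email
                user.emailSubjects.add("Your balance is low");
            }
        }
    }

    // One check per behavior, each with only the setup that behavior needs.
    static boolean displaysNotification() {
        Ui ui = new Ui();
        new TransactionProcessor(ui).processTransaction(new User(100), "Pile of Beanie Babies", 3);
        return ui.getText().contains("You bought a Pile of Beanie Babies");
    }

    static boolean sendsEmailWhenBalanceIsLow() {
        User user = new User(LOW_BALANCE_THRESHOLD + 2);
        new TransactionProcessor(new Ui()).processTransaction(user, "anything", 3);
        return user.emailSubjects.size() == 1
                && user.emailSubjects.get(0).equals("Your balance is low");
    }

    // A later behavior gets its own check instead of growing an existing one.
    static boolean sendsNoEmailWhenBalanceStaysHigh() {
        User user = new User(100);
        new TransactionProcessor(new Ui()).processTransaction(user, "anything", 3);
        return user.emailSubjects.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("displaysNotification: " + displaysNotification());
        System.out.println("sendsEmailWhenBalanceIsLow: " + sendsEmailWhenBalanceIsLow());
        System.out.println("sendsNoEmailWhenBalanceStaysHigh: " + sendsNoEmailWhenBalanceStaysHigh());
    }
}
```

Note that the notification check never mentions balances and the email checks never mention item names, so a failure points at one behavior rather than at the whole method.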

Categories: Software Testing

AB Testing – Episode 3

Alan Page - Mon, 04/14/2014 - 11:02

Yes – it’s more of Brent and Alan yelling at each other. But this time, Brent says that he “hates the Agile Manifesto”. I also talk about my trip to Florida, and we discuss estimation and planning and a bunch of other stuff.

Categories: Software Testing

Performance Testing Insights: Part I

LoadStorm - Sat, 04/12/2014 - 18:35

Performance Testing can be viewed as the systematic process of collecting and monitoring the results of system usage, then analyzing them to aid system improvement towards desired results. As part of the performance testing process, the tester needs to gather statistical information, examine server logs and system state histories, determine the system’s performance under natural and artificial conditions and alter system modes of operation.

Performance testing complements functional testing. Functional testing can validate proper functionality under correct usage and proper error handling under incorrect usage. It cannot, however, tell how much load an application can handle before it breaks or performs improperly. Finding the breaking points and performance bottlenecks, as well as identifying functional errors that only occur under stress, requires performance testing.
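
A rough sketch of how a breaking point might be located: raise the load in steps and stop at the first step where the error rate crosses an agreed limit. Everything here is hypothetical, including the class, the 5% limit, and especially `simulatedErrorRate`, which is a toy stand-in for running a real load test at that user count.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical step-load search; the "system" is simulated, not measured.
final class BreakingPointSketch {

    /**
     * Steps the user count upward and returns the first load whose error rate
     * exceeds maxErrorRate, or -1 if no step up to maxUsers fails.
     */
    static int findPointOfFailure(IntToDoubleFunction errorRateAtLoad,
                                  int startUsers, int stepUsers, int maxUsers,
                                  double maxErrorRate) {
        if (startUsers <= 0 || stepUsers <= 0) {
            throw new IllegalArgumentException("start and step must be positive");
        }
        for (int users = startUsers; users <= maxUsers; users += stepUsers) {
            if (errorRateAtLoad.applyAsDouble(users) > maxErrorRate) {
                return users;
            }
        }
        return -1;
    }

    /** Toy model: no errors up to capacity, then errors grow with the overload. */
    static double simulatedErrorRate(int users, int capacity) {
        if (users <= capacity) {
            return 0.0;
        }
        return Math.min(1.0, (double) (users - capacity) / capacity);
    }

    public static void main(String[] args) {
        int pof = findPointOfFailure(u -> simulatedErrorRate(u, 300), 50, 50, 1000, 0.05);
        System.out.println("Point of failure at about " + pof + " users");
    }
}
```

In a real run the lambda would kick off a test at that load and report the observed error rate; smaller steps narrow down the breaking point at the cost of more test runs.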

The purpose of performance testing is to demonstrate that:

1. The application processes required business process and transaction volumes within specified response times in a real-time production database (Speed).

2. The application can handle various user load scenarios (stresses), ranging from a sudden load “spike” to a persistent load “soak” (Scalability).

3. The application is consistent in availability and functional integrity (Stability).

4. The minimum configuration that will allow the system to meet the formally stated performance expectations of stakeholders has been determined.
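
To make the "spike" and "soak" scenarios from item 2 concrete, here is a minimal sketch that builds each as a per-minute schedule of concurrent users. The class, method names, and numbers are all illustrative assumptions, not a LoadStorm API.

```java
import java.util.Arrays;

// Hypothetical load schedules: one entry per minute, value = concurrent users.
final class LoadProfiles {

    /** Soak: hold a constant user count for the whole test. */
    static int[] soak(int users, int minutes) {
        int[] profile = new int[minutes];
        Arrays.fill(profile, users);
        return profile;
    }

    /** Spike: run at a baseline, jump to a peak for a short window, then drop back. */
    static int[] spike(int baseUsers, int peakUsers, int minutes, int spikeStart, int spikeLength) {
        int[] profile = new int[minutes];
        for (int m = 0; m < minutes; m++) {
            boolean inSpike = m >= spikeStart && m < spikeStart + spikeLength;
            profile[m] = inSpike ? peakUsers : baseUsers;
        }
        return profile;
    }

    public static void main(String[] args) {
        System.out.println("spike: " + Arrays.toString(spike(50, 500, 10, 4, 2)));
        System.out.println("soak:  " + Arrays.toString(soak(200, 10)));
    }
}
```

A real soak usually runs for hours rather than minutes; the short arrays here just make the two shapes easy to compare.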

When should I start testing and when should I stop?

When to Start Performance Testing:

A common practice is to start performance testing only after functional, integration, and system testing are complete; that way, it is understood that the target application is “sufficiently sound and stable” to ensure valid performance test results. However, the problem with the above approach is that it delays performance testing until the latter part of the development lifecycle. Then, if the tests uncover performance-related problems, one has to resolve problems with potentially serious design implications at a time when the corrections made might invalidate earlier test results. In addition, the changes might destabilize the code just when one wants to freeze it, prior to beta testing or the final release.
A better approach is to begin performance testing as early as possible, just as soon as any of the application components can support the tests. This will enable users to establish some early benchmarks against which performance measurement can be conducted as the components are developed.
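
One possible way to use such an early benchmark: record a response-time percentile for a component, then flag later builds that exceed it by more than an agreed tolerance. The sketch below uses the nearest-rank percentile; the class, the sample values, and the 10% tolerance are assumptions for illustration only.

```java
import java.util.Arrays;

// Hypothetical baseline comparison for response times measured in milliseconds.
final class BaselineCheck {

    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] samplesMs, double pct) {
        if (samplesMs.length == 0 || pct <= 0.0 || pct > 100.0) {
            throw new IllegalArgumentException("need samples and 0 < pct <= 100");
        }
        long[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    /** True when the current value exceeds the baseline by more than the tolerance (0.10 = 10%). */
    static boolean regressed(long baselineMs, long currentMs, double tolerance) {
        return currentMs > baselineMs * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        long[] earlyBuild = {120, 80, 100, 90, 400, 110, 95, 105, 85, 130};
        long[] laterBuild = {150, 95, 130, 110, 480, 140, 120, 125, 100, 160};
        long baseline = percentile(earlyBuild, 95);
        long current = percentile(laterBuild, 95);
        System.out.println("baseline p95 = " + baseline + " ms, current p95 = " + current + " ms");
        System.out.println("regressed beyond 10%: " + regressed(baseline, current, 0.10));
    }
}
```

Comparing a high percentile rather than the average keeps a handful of slow requests from hiding behind many fast ones.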

When to Stop Performance Testing:

The conventional approach is to stop testing once all planned tests have been executed and there is a consistent and reliable pattern of performance improvement. This approach gives users accurate performance information at that instant. However, one can quickly fall behind by just standing still. The environment in which clients run the application will always be changing, so it’s a good idea to run ongoing performance tests. Another alternative is to set up a continual performance test and periodically examine the results. One can “overload” these tests by making use of real-world conditions. Regardless of how well a test is designed, one will never be able to reproduce all the conditions that the application will have to contend with in the real-world environment.

The post Performance Testing Insights: Part I appeared first on LoadStorm.