Software Testing

Another Branding Failure

QA Hates You - Tue, 06/21/2016 - 04:53

A couple weeks ago, I pointed out some flaws with inconsistent application of the trademark symbol. Today, we’re going to look at a failure of branding in a news story.

Can you spot the branding failure in this story?

After the refi boom, can Quicken keep rocketing higher?:

Quicken Loans Inc, once an obscure online mortgage player, seized on the refinancing boom to become the nation’s third largest mortgage lender, behind only Wells Fargo & Co and JPMorgan Chase & Co.

Now, with the refi market saturated, Quicken faces a pivotal challenge — convincing home buyers to trust that emotional transaction to a website instead of the banker next door.

Okay, can anyone not named Hilary spot the problem?

Quicken Loans and Quicken are two different things and have been owned by two different companies since 2002. For fourteen years.

Me, I know the difference because earlier this year I did some testing on a Quicken Loans promotion, and the developers put simply “Quicken” into some of the legalesque opt-in and Terms of Service check boxes. So I researched it. And then made them use “Quicken Loans” in the labels instead.

After reading the story, I reached out to someone at Quicken Loans to see if they use “Quicken” internally informally, and she said $&#&^$! yes (I’m paraphrasing here to maintain her reputation). So maybe the journalist had some communication with internal people who used “Quicken” instead of the company name, or perhaps that’s what everybody but me does.

However, informal nomenclature aside, Quicken Loans != Quicken, and to refer to it as such could have consequences. If this story hit the wires and Intuit’s stock dropped a bunch, ay! Or something more sinister, which in this case means unintended and unforeseen consequences.

My point is to take a little time to research the approved use of trademarks, brand names, and company names before you start testing or writing about them. Don’t trust the developers (or journalists, apparently) to have done this for you.

Categories: Software Testing

QA Music: Where The Wild Things Are Running

QA Hates You - Mon, 06/20/2016 - 04:21

Against the Current, “Running With The Wild Things”:

I like the sound of them; I’m going to pick up their CD.

When the last CD is sold in this country, you know who’ll buy it. Me.

(Link via.)

Categories: Software Testing

Good Reasons NOT to Log Bugs

Eric Jacobson's Software Testing Blog - Thu, 06/16/2016 - 13:14

I noticed one of our development teams was creating new Jira Issues for each bug found during the development cycle.  IMO, this is an antipattern. 

These are the problems it can create, that I can think of:

  • New Jira Issues (bug reports) create unnecessary admin work for the whole team. 
    • We see these bug reports cluttering an Agile board.
    • They may have to get prioritized.
    • We have to track them, they have to get assigned, change statuses, get linked, maybe even estimated.
    • They take time to create. 
    • They may cause us to communicate via text rather than conversation.
  • Bug reports mislead lazy people into tracking progress, quality, or team performance by counting bugs.
  • It leads to confusion about how to manage the User Story.  If the User Story is done except for the open bug reports, can we mark the User Story “Done”?  Or do we need to keep the User Story open until the logged bugs get fixed…”Why is this User Story still in progress?  Oh yeah, it’s because of those linked logged bugs”.
  • It’s an indication our acceptance criteria are inadequate.  That is to say, if the acceptance criteria in the User Story are not met, we wouldn’t have to log a bug report.  We would merely NOT mark the Story “Done”.
  • Bug reports may give us an excuse not to fix all bugs…”let’s fix it next Sprint”, “let’s put it on the Product Backlog and fix it some other day”…which means never.
  • It’s probably a sign the team is breaking development into a coding phase and a testing phase.  Instead, we really want the testing and programming to take place in one phase...development. 
  • It probably means the programmer is considering their code “done”, throwing it over the wall to a tester, and moving on to a different Story.  This misleads us on progress.  Untested is as good as nothing.

If the bug is an escape, i.e., it occurs in production, it’s probably a good idea to log it.

Categories: Software Testing

Shorten the Feedback Loop, Unless…

Eric Jacobson's Software Testing Blog - Wed, 06/15/2016 - 07:59

On a production support kanban development team, a process dilemma came up.  In the case where something needs to be tested by a tester:

  1. Should the tester perform the testing first in a development environment, then in a production-like environment after the thing-under-test has been packaged and deployed?  Note: in this case, the package/deploy process is handled semi-manually by two separate teams, so there is a delay.
  2. Or, should the tester perform all the testing in a production-like environment after the thing-under-test has been packaged and deployed?

Advantage of scenario 1 above:

  • Dev environment testing shortens the feedback loop.  This would be deep testing.  If problems surface they would be quicker and less risky to fix.  The post-package testing would be shallow testing, answering questions like: did the stuff I deep tested get deployed properly?

Advantage of scenario 2 above:

  • Knock out the testing in one environment.  The deep testing will indirectly cover the package/deployment testing.

On the surface, scenario 2 looks better because it only requires one testing chunk, NOT two chunks separated by a lengthy gap.  But what happens if a problem surfaces in scenario 2?  Now we must go through two lengthy gaps.  How about a second problem?  Three gaps.  And so on.

My conclusion: Scenario 1 is better unless this type of thing-under-test is easy and has a history of zero problems.

Categories: Software Testing

An Oldie, But An Oldie

QA Hates You - Tue, 06/14/2016 - 03:53

Round round work around
I work around
Yeah
work around round round I work around
I work around
work around round round I work around
From job to job
work around round round I work around
It’s a real cool app
work around round round I work around
Please don’t make it snap

I’ve got little bugs runnin’ in and out of the code
Don’t type an int or it will implode

My buttons don’t click, the users all moan
Yeah, the GUIs are buggy but the issues are known

I work around
work around round round I work around
From town to town
work around round round I work around
It’s a real cool app
work around round round I work around
Please don’t make it snap
work around round round I work around
I work around
Round
work around round round oooo
Wah wa ooo
Wah wa ooo
Wah wa ooo

We always make a patch cause the clients get mad
And we’ve never missed a deadline, so it isn’t so bad

None of the data gets checked cause it doesn’t work right
We can run a batch job in the middle of the night

I work around
work around round round I work around
From job to job
work around round round I work around
It’s a real cool app
work around round round I work around
Please don’t make it snap
work around round round I work around
I work around
Round
Ah ah ah ah ah ah ah ah

Round round work around
I work around
Yeah
work around round round I work around
work around round round I work around
Wah wa ooo
work around round round I work around
Oooo ooo ooo
work around round round I work around
Ahh ooo ooo
work around round round I work around
Ahh ooo ooo
work around round round I work around
Ahh ooo ooo

I don’t want to make you feel old, old man, but most of your co-workers don’t remember “Kokomo”, much less “I Get Around”, and probably think the Beach Boys were the guys on Jersey Shore.

Categories: Software Testing

QA Music: Fire, Fire

QA Hates You - Mon, 06/13/2016 - 04:26

Puscifer, “The Arsonist”

Few songs use the word “deleterious.” Too few, if you ask me.

Categories: Software Testing

A Few Anniversaries (and one announcement)

Alan Page - Thu, 06/09/2016 - 08:54

There’s a light at the end of the oh-my-work-is-so-crazy train, and I look forward to ranting more often both here and on twitter.

But first, a few minor anniversaries to acknowledge. Monday was my 21-year anniversary at Microsoft. It’s not a nice even number like 20, but it’s weird to think that people born on the day I started (full-time) at Microsoft can now drink in the U.S. While I doubt I’ll make it to 25, I doubted that I’d make it to 20…or 10, so this is definitely an area where I’m bad at estimating.

Meanwhile, the ABTesting Podcast just hit episode #40. That’s another milestone I never thought I’d hit, but Brent and I keep finding things to talk about (or new ways to talk about the same things). We should hit the 50-episode milestone (by my already-established-as-poor estimates) before the end of the calendar year. I’m thinking of inviting Satya to be a guest, but I don’t think he’ll show up.

On the announcement front, I’m speaking at Test Bash Philadelphia in November. I’ll be talking about “Testing without Testers and other stupid ideas that sometimes work”. This is an evolution of a talk I’ve been giving recently, but I’m preparing something extra special for Test Bash that should inspire as well as spark some great conversations.

Categories: Software Testing

Inconsistency Ain’t Just A River In Egypt

QA Hates You - Wed, 06/08/2016 - 10:28

So I went to the Hostess Cakes site today while researching a tweet (what, you don’t research your tweets?). I wanted to see if the Twinkie brand name had a registered trademark or trademark symbol.

The site was not helpful:

The site has both, but this is incorrect. Also, note how other products that bear a service mark have it in the headline but not in the copy. It’s okay not to have it in the copy, since it is in the headline, and it’s common to use the service mark only the first time it appears on a page, but this page has it for some products and not for others.

It’s definitely the sort of inconsistency I notice on a Web site, and then I wonder what else is lurking beneath the unreviewed copy.

Categories: Software Testing

Third Party Dependencies And Your Site’s Security, A Dramatic Recreation

QA Hates You - Mon, 06/06/2016 - 10:30

The chain is not as strong as its weakest link; it’s as strong as the link you assumed someone else affixed.

Categories: Software Testing

The Inquiry Method for Test Planning

Google Testing Blog - Mon, 06/06/2016 - 07:07
by Anthony Vallone

Creating a test plan is often a complex undertaking. An ideal test plan is accomplished by applying basic principles of cost-benefit analysis and risk analysis, optimally balancing these software development factors:
  • Implementation cost: The time and complexity of implementing testable features and automated tests for specific scenarios will vary, and this affects short-term development cost.
  • Maintenance cost: Some tests or test plans may vary from easy to difficult to maintain, and this affects long-term development cost. When manual testing is chosen, this also adds to long-term cost.
  • Monetary cost: Some test approaches may require billed resources.
  • Benefit: Tests are capable of preventing issues and aiding productivity by varying degrees. Also, the earlier they can catch problems in the development life-cycle, the greater the benefit.
  • Risk: The probability of failure scenarios may vary from rare to likely, and their consequences may vary from minor nuisance to catastrophic.
Effectively balancing these factors in a plan depends heavily on project criticality, implementation details, resources available, and team opinions. Many projects can achieve outstanding coverage with high-benefit, low-cost unit tests, but they may need to weigh options for larger tests and complex corner cases. Mission critical projects must minimize risk as much as possible, so they will accept higher costs and invest heavily in rigorous testing at all levels.
This guide puts the onus on the reader to find the right balance for their project. Also, it does not provide a test plan template, because templates are often too generic or too specific and quickly become outdated. Instead, it focuses on selecting the best content when writing a test plan.
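
As a rough, hypothetical illustration of the balancing act described above (the scenarios, 1-5 weights, and scoring formula below are invented for this sketch, not taken from the post), one could rank candidate test scenarios by expected benefit and risk against their cost:

```python
# Toy prioritization of test scenarios by (benefit * risk) / cost.
# All scenario names and numbers are invented for illustration; a real plan
# would derive them from project criticality, failure history, and team input.

scenarios = [
    # (name, implementation + maintenance cost, benefit, risk), each on a 1-5 scale
    ("login happy path",     1, 5, 4),
    ("payment edge cases",   4, 5, 5),
    ("settings page layout", 2, 2, 1),
    ("bulk data import",     5, 4, 3),
]

def score(cost: int, benefit: int, risk: int) -> float:
    """Higher score = stronger candidate for early, thorough test coverage."""
    return (benefit * risk) / cost

for name, cost, benefit, risk in sorted(scenarios, key=lambda s: score(*s[1:]), reverse=True):
    print(f"{name:22s} score={score(cost, benefit, risk):5.1f}")
```

The exact formula matters less than making the trade-offs explicit and revisiting them as the project changes.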

Test plan vs. strategy
Before proceeding, two common methods for defining test plans need to be clarified:
  • Single test plan: Some projects have a single "test plan" that describes all implemented and planned testing for the project.
  • Single test strategy and many plans: Some projects have a "test strategy" document as well as many smaller "test plan" documents. Strategies typically cover the overall test approach and goals, while plans cover specific features or project updates.
Either of these may be embedded in and integrated with project design documents. Both of these methods work well, so choose whichever makes sense for your project. Generally speaking, stable projects benefit from a single plan, whereas rapidly changing projects are best served by infrequently changed strategies and frequently added plans.
For the purpose of this guide, I will refer to both test document types simply as “test plans”. If you have multiple documents, just apply the advice below to your document aggregation.

Content selection
A good approach to creating content for your test plan is to start by listing all questions that need answers. The lists below provide a comprehensive collection of important questions that may or may not apply to your project. Go through the lists and select all that apply. By answering these questions, you will form the contents for your test plan, and you should structure your plan around the chosen content in any format your team prefers. Be sure to balance the factors as mentioned above when making decisions.

Prerequisites
  • Do you need a test plan? If there is no project design document or a clear vision for the product, it may be too early to write a test plan.
  • Has testability been considered in the project design? Before a project gets too far into implementation, all scenarios must be designed as testable, preferably via automation. Both project design documents and test plans should comment on testability as needed.
  • Will you keep the plan up-to-date? If so, be careful about adding too much detail, otherwise it may be difficult to maintain the plan.
  • Does this quality effort overlap with other teams? If so, how have you deduplicated the work?

Risk
  • Are there any significant project risks, and how will you mitigate them? Consider:
    • Injury to people or animals
    • Security and integrity of user data
    • User privacy
    • Security of company systems
    • Hardware or property damage
    • Legal and compliance issues
    • Exposure of confidential or sensitive data
    • Data loss or corruption
    • Revenue loss
    • Unrecoverable scenarios
    • SLAs
    • Performance requirements
    • Misinforming users
    • Impact to other projects
    • Impact from other projects
    • Impact to company’s public image
    • Loss of productivity
  • What are the project’s technical vulnerabilities? Consider:
    • Features or components known to be hacky, fragile, or in great need of refactoring
    • Dependencies or platforms that frequently cause issues
    • Possibility for users to cause harm to the system
    • Trends seen in past issues

Coverage
  • What does the test surface look like? Is it a simple library with one method, or a multi-platform client-server stateful system with a combinatorial explosion of use cases? Describe the design and architecture of the system in a way that highlights possible points of failure.
  • What are the features? Consider making a summary list of all features and describe how certain categories of features will be tested.
  • What will not be tested? No test suite covers every possibility. It’s best to be up-front about this and provide rationale for not testing certain cases. Examples: low risk areas that are a low priority, complex cases that are a low priority, areas covered by other teams, features not ready for testing, etc. 
  • What is covered by unit (small), integration (medium), and system (large) tests? Always test as much as possible in smaller tests, leaving fewer cases for larger tests. Describe how certain categories of test cases are best tested by each test size and provide rationale (see the sketch after this list for one way to tag tests by size).
  • What will be tested manually vs. automated? When feasible and cost-effective, automation is usually best. Many projects can automate all testing. However, there may be good reasons to choose manual testing. Describe the types of cases that will be tested manually and provide rationale.
  • How are you covering each test category? Consider:
  • Will you use static and/or dynamic analysis tools? Both static analysis tools and dynamic analysis tools can find problems that are hard to catch in reviews and testing, so consider using them.
  • How will system components and dependencies be stubbed, mocked, faked, staged, or used normally during testing? There are good reasons to do each of these, and they each have a unique impact on coverage.
  • What builds are your tests running against? Are tests running against a build from HEAD (aka tip), a staged build, and/or a release candidate? If only from HEAD, how will you test release build cherry picks (selection of individual changelists for a release) and system configuration changes not normally seen by builds from HEAD?
  • What kind of testing will be done outside of your team? Examples:
    • Dogfooding
    • External crowdsource testing
    • Public alpha/beta versions (how will they be tested before releasing?)
    • External trusted testers
  • How are data migrations tested? You may need special testing to compare before and after migration results.
  • Do you need to be concerned with backward compatibility? You may own previously distributed clients or there may be other systems that depend on your system’s protocol, configuration, features, and behavior.
  • Do you need to test upgrade scenarios for server/client/device software or dependencies/platforms/APIs that the software utilizes?
  • Do you have line coverage goals?
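
Following up on the coverage question above about test sizes, here is a minimal sketch of tagging tests by size so that small tests can run on every change and larger ones on a slower cadence. It assumes pytest; the markers, function, and tests are invented for illustration, not taken from the post.

```python
# test_sizes_sketch.py - illustrative only; the markers, function, and tests are invented.
# Small (unit) tests exercise pure logic in isolation; medium (integration) tests touch
# real collaborators (here, the filesystem). Tagging them lets CI run
# `pytest -m small` on every change and the slower sizes on a nightly schedule.
# (Register the custom markers in pytest.ini to silence unknown-marker warnings.)
import pytest

def parse_amount(text: str) -> int:
    """Toy function under test: parse a dollar string like '$12.50' into cents."""
    dollars, _, cents = text.lstrip("$").partition(".")
    return int(dollars) * 100 + int(cents or 0)

@pytest.mark.small
def test_parse_amount_unit():
    # Small: no I/O, runs in milliseconds.
    assert parse_amount("$12.50") == 1250

@pytest.mark.medium
def test_parse_amounts_from_file(tmp_path):
    # Medium: same logic exercised through a real (temporary) file.
    sample = tmp_path / "amounts.txt"
    sample.write_text("$12.50\n$0.99\n")
    assert sum(parse_amount(line) for line in sample.read_text().splitlines()) == 1349
```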

Tooling and Infrastructure
  • Do you need new test frameworks? If so, describe these or add design links in the plan.
  • Do you need a new test lab setup? If so, describe these or add design links in the plan.
  • If your project offers a service to other projects, are you providing test tools to those users? Consider providing mocks, fakes, and/or reliable staged servers for users trying to test their integration with your system (a small sketch of this follows this list).
  • For end-to-end testing, how will test infrastructure, systems under test, and other dependencies be managed? How will they be deployed? How will persistence be set-up/torn-down? How will you handle required migrations from one datacenter to another?
  • Do you need tools to help debug system or test failures? You may be able to use existing tools, or you may need to develop new ones.
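
As a small sketch of the “provide mocks or fakes to your users” idea, a service team might ship a deterministic in-memory fake alongside the real client. The RateService interface and fake below are hypothetical names invented for the example, not an API from the post.

```python
# Hypothetical example of shipping a fake to consuming teams.
# RateService and InMemoryRateService are invented names, not a real API.
from typing import Protocol

class RateService(Protocol):
    def rate_for(self, currency: str) -> float: ...

class InMemoryRateService:
    """Deterministic in-memory fake that consumers can use in their own tests."""
    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def rate_for(self, currency: str) -> float:
        return self._rates[currency]

def convert(amount: float, currency: str, rates: RateService) -> float:
    """Consumer-side code that depends only on the RateService interface."""
    return amount * rates.rate_for(currency)

def test_convert_with_fake():
    # No network, no staged server: the fake keeps the consumer's test fast and stable.
    fake = InMemoryRateService({"EUR": 1.5})
    assert convert(100.0, "EUR", fake) == 150.0
```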

Process
  • Are there test schedule requirements? What time commitments have been made? Which tests will be in place (or test feedback provided) by what dates? Are some tests important to deliver before others?
  • How are builds and tests run continuously? Most small tests will be run by continuous integration tools, but large tests may need a different approach. Alternatively, you may opt for running large tests as-needed. 
  • How will build and test results be reported and monitored?
    • Do you have a team rotation to monitor continuous integration?
    • Large tests might require monitoring by someone with expertise.
    • Do you need a dashboard for test results and other project health indicators?
    • Who will get email alerts and how?
    • Will the person monitoring tests simply use verbal communication to the team?
  • How are tests used when releasing?
    • Are they run explicitly against the release candidate, or does the release process depend only on continuous test results? 
    • If system components and dependencies are released independently, are tests run for each type of release? 
    • Will a "release blocker" bug stop the release manager(s) from actually releasing? Is there an agreement on what are the release blocking criteria?
    • When performing canary releases (aka % rollouts), how will progress be monitored and tested?
  • How will external users report bugs? Consider feedback links or other similar tools to collect and cluster reports.
  • How does bug triage work? Consider labels or categories for bugs in order for them to land in a triage bucket. Also make sure the teams responsible for filing and/or creating the bug report template are aware of this. Are you using one bug tracker or do you need to set up some automatic or manual import routine?
  • Do you have a policy for submitting new tests before closing bugs that could have been caught?
  • How are tests used for unsubmitted changes? If anyone can run all tests against any experimental build (a good thing), consider providing a howto.
  • How can team members create and/or debug tests? Consider providing a howto.

Utility
  • Who are the test plan readers? Some test plans are only read by a few people, while others are read by many. At a minimum, you should consider getting a review from all stakeholders (project managers, tech leads, feature owners). When writing the plan, be sure to understand the expected readers, provide them with enough background to understand the plan, and answer all questions you think they will have - even if your answer is that you don’t have an answer yet. Also consider adding contacts for the test plan, so any reader can get more information.
  • How can readers review the actual test cases? Manual cases might be in a test case management tool, in a separate document, or included in the test plan. Consider providing links to directories containing automated test cases.
  • Do you need traceability between requirements, features, and tests?
  • Do you have any general product health or quality goals and how will you measure success? Consider:
    • Release cadence
    • Number of bugs caught by users in production
    • Number of bugs caught in release testing
    • Number of open bugs over time
    • Code coverage
    • Cost of manual testing
    • Difficulty of creating new tests


Categories: Software Testing

GTAC Diversity Scholarship

Google Testing Blog - Fri, 06/03/2016 - 07:35
by Lesley Katzen on behalf of the GTAC Diversity Committee

We are committed to increasing diversity at GTAC, and we believe the best way to do that is by making sure we have a diverse set of applicants to speak and attend. As part of that commitment, we are excited to announce that we will be offering travel scholarships this year.
Travel scholarships will be available for selected applicants from traditionally underrepresented groups in technology.

To be eligible for a grant to attend GTAC, applicants must:

  • Be 18 years of age or older.
  • Be from a traditionally underrepresented group in technology.
  • Work or study in Computer Science, Computer Engineering, Information Technology, or a technical field related to software testing.
  • Be able to attend core dates of GTAC, November 15th - 16th 2016 in Sunnyvale, CA.


To apply:
Please fill out the following form to be considered for a travel scholarship.
The deadline for submission is June 15th.  Scholarship recipients will be announced on July 15th. If you are selected, we will contact you with information on how to proceed with booking travel.


What the scholarship covers:
Google will pay for standard coach class airfare for selected scholarship recipients to San Francisco or San Jose, and 3 nights of accommodations in a hotel near the Sunnyvale campus. Breakfast and lunch will be provided for GTAC attendees and speakers on both days of the conference. We will also provide a $50.00 gift card for other incidentals such as airport transportation or meals. You will need to provide your own credit card to cover any hotel incidentals.


Google is dedicated to providing a harassment-free and inclusive conference experience for everyone. Our anti-harassment policy can be found at:
https://www.google.com/events/policy/anti-harassmentpolicy.html
Categories: Software Testing

GTAC 2016 - Save the Date

Google Testing Blog - Thu, 06/02/2016 - 22:41
by Sonal Shah on behalf of the GTAC Committee


We are pleased to announce that the tenth GTAC (Google Test Automation Conference) will be held on Google’s campus in Sunnyvale (California, USA) on Tuesday and Wednesday, November 15th and 16th, 2016.  

Based on feedback from the last GTAC (2015) and the increasing demand every year, we have decided to keep GTAC on a fall schedule. This schedule is a change from what we previously announced.

The schedule for the next few months is:
May 1, 2016 - Registration opens for speakers and attendees.
June 15, 2016 - Registration closes for speaker and attendee submissions.
July 15, 2016 - Selected attendees will be notified.
August 29, 2016 - Selected speakers will be notified.
November 14, 2016 - Rehearsal day for speakers.
November 15-16, 2016 - GTAC 2016!

As part of our efforts to increase diversity of speakers and attendees at GTAC,  we will be offering travel scholarships for selected applicants from traditionally underrepresented groups in technology.

Stay tuned to this blog and the GTAC website for information about attending or presenting at GTAC. Please do not hesitate to contact gtac2016@google.com if you have any questions. We look forward to seeing you there!
Categories: Software Testing

GTAC 2016 Registration is now open!

Google Testing Blog - Thu, 06/02/2016 - 22:36
by Sonal Shah on behalf of the GTAC Committee

The GTAC (Google Test Automation Conference) 2016 application process is now open for presentation proposals and attendance. GTAC will be held at the Google Sunnyvale office on November 15th - 16th, 2016.

GTAC will be streamed live on YouTube again this year, so even if you cannot attend in person, you will be able to watch the conference remotely. We will post the livestream information as we get closer to the event, and recordings will be posted afterwards.

Speakers
Presentations are targeted at students, academics, and experienced engineers working on test automation. Full presentations are 30 minutes and lightning talks are 10 minutes. Speakers should be prepared for a question and answer session following their presentation.

Application
For presentation proposals and/or attendance, complete this form. We will be selecting about 25 talks and 300 attendees for the event. The selection process is not first come first serve (no need to rush your application), and we select a diverse group of engineers from various locations, company sizes, and technical backgrounds.

Deadline
The due date for both presentation and attendance applications is June 15, 2016 (extended from June 1, 2016).

Cost
There are no registration fees, but speakers and attendees must arrange and pay for their own travel and accommodations.

More information
Please read our FAQ for the most common questions:
https://developers.google.com/google-test-automation-conference/2016/faq.
Categories: Software Testing

GTAC 2016 Registration Deadline Extended

Google Testing Blog - Thu, 06/02/2016 - 21:58
by Sonal Shah on behalf of the GTAC Committee

Our goal in organizing GTAC each year is to make it a first-class conference, dedicated to presenting leading edge industry practices. The quality of submissions we've received for GTAC 2016 so far has been overwhelming. In order to include the best talks possible, we are extending the deadline for speaker and attendee submissions by 15 days. The new timelines are as follows:

June 15, 2016 (was June 1, 2016) - Last day for speaker, attendee and diversity scholarship submissions.
July 15, 2016 (was June 15, 2016) - Attendees and scholarship awardees will be notified of selection/rejection/waitlist status. Those on the waitlist will be notified as space becomes available.
August 29, 2016 (was August 15, 2016) - Selected speakers will be notified.

To register, please fill out this form.
To apply for a diversity scholarship, please fill out this form.

The GTAC website has a list of frequently asked questions. Please do not hesitate to contact gtac2016@google.com if you still have any questions.

Categories: Software Testing

s/automation/programming/

DevelopSense - Michael Bolton - Thu, 06/02/2016 - 11:03
Several years ago in one of his early insightful blog posts, Pradeep Soundarajan said this: “The test doesn’t find the bug. A human finds the bug, and the test plays a role in helping the human find it.” More recently, Pradeep said this: Instead of saying, “It is programmed”, we say, “It is automated”. A […]
Categories: Software Testing

I Am A FogBugz Overachiever

QA Hates You - Tue, 05/31/2016 - 09:57

It looks as though FogBugz has decided to offer a little advice in the defect report’s description field:

Its placeholder says:

Every good bug report needs exactly three things: steps to reproduce, what you expected to see, and what you saw instead.

Exactly three things? Well, I must be an overachiever then when I add some analysis or relationships to other bugs, logs, and so on.

But that’s my way.

Categories: Software Testing

The Honest Manual Writer Heuristic

DevelopSense - Michael Bolton - Mon, 05/30/2016 - 16:09
Want a quick idea for a burst of activity that will reveal both bugs and opportunities for further exploration? Play “Honest Manual Writer”. Here’s how it works: imagine you’re the world’s most organized, most thorough, and—above all—most honest documentation writer. Your client has assigned you to write a user manual, including both reference and […]
Categories: Software Testing

Flaky Tests at Google and How We Mitigate Them

Google Testing Blog - Fri, 05/27/2016 - 18:34
by John Micco

At Google, we run a very large corpus of tests continuously to validate our code submissions. Everyone from developers to project managers relies on the results of these tests to make decisions about whether the system is ready for deployment or whether code changes are OK to submit. Productivity for developers at Google relies on the ability of the tests to find real problems with the code being changed or developed in a timely and reliable fashion.

Tests are run before submission (pre-submit testing) which gates submission and verifies that changes are acceptable, and again after submission (post-submit testing) to decide whether the project is ready to be released. In both cases, all of the tests for a particular project must report a passing result before submitting code or releasing a project.

Unfortunately, across our entire corpus of tests, we see a continual rate of about 1.5% of all test runs reporting a "flaky" result. We define a "flaky" test result as a test that exhibits both a passing and a failing result with the same code. There are many root causes why tests return flaky results, including concurrency, relying on non-deterministic or undefined behaviors, flaky third party code, infrastructure problems, etc. We have invested a lot of effort in removing flakiness from tests, but overall the insertion rate is about the same as the fix rate, meaning we are stuck with a certain rate of tests that provide value, but occasionally produce a flaky result. Almost 16% of our tests have some level of flakiness associated with them! This is a staggering number; it means that more than 1 in 7 of the tests written by our world-class engineers occasionally fail in a way not caused by changes to the code or tests.

When doing post-submit testing, our Continuous Integration (CI) system identifies when a passing test transitions to failing, so that we can investigate the code submission that caused the failure. What we find in practice is that about 84% of the transitions we observe from pass to fail involve a flaky test! This causes extra repetitive work to determine whether a new failure is a flaky result or a legitimate failure. It is quite common to ignore legitimate failures in flaky tests due to the high number of false-positives. At the very least, build monitors typically wait for additional CI cycles to run the test again to determine whether or not it has been broken by a submission, adding to the delay in identifying real problems and increasing the pool of changes that could have contributed.

In addition to the cost of build monitoring, consider that the average project contains 1000 or so individual tests. To release a project, we require that all these tests pass with the latest code changes. If 1.5% of test results are flaky, 15 tests will likely fail, requiring expensive investigation by a build cop or developer. In some cases, developers dismiss a failing result as flaky only to later realize that it was a legitimate failure caused by the code. It is human nature to ignore alarms when there is a history of false signals coming from a system. For example, see this article about airline pilots ignoring an alarm on 737s. The same phenomenon occurs with pre-submit testing. The same 15 or so failing tests block submission and introduce costly delays into the core development process. Ignoring legitimate failures at this stage results in the submission of broken code.

We have several mitigation strategies for flaky tests during presubmit testing, including the ability to re-run only failing tests, and an option to re-run tests automatically when they fail. We even have a way to denote a test as flaky - causing it to report a failure only if it fails 3 times in a row. This reduces false positives, but encourages developers to ignore flakiness in their own tests unless their tests start failing 3 times in a row, which is hardly a perfect solution.
Imagine a 15 minute integration test marked as flaky that is broken by my code submission. The breakage will not be discovered until 3 executions of the test complete, or 45 minutes, after which it will need to be determined if the test is broken (and needs to be fixed) or if the test just flaked three times in a row.
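
A minimal sketch of that “report a failure only if it fails three times in a row” idea (the decorator below is an illustration of the general technique, not Google’s internal tooling):

```python
# Illustrative sketch of the "only fail after N consecutive failures" idea.
# This is not Google's internal tooling; it just shows the trade-off the post
# describes: false positives drop, but a real breakage now costs N full runs.
import functools

def tolerate_flakiness(attempts: int = 3):
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(attempts):
                try:
                    return test_fn(*args, **kwargs)  # any passing run counts as a pass
                except AssertionError as err:
                    last_error = err                 # keep trying up to `attempts` times
            raise last_error                         # failed every attempt: report it
        return wrapper
    return decorator

@tolerate_flakiness(attempts=3)
def test_sometimes_flaky():
    ...  # a test with a non-deterministic dependency would go here
```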

Other mitigation strategies include:
  • A tool that monitors the flakiness of tests and, if the flakiness is too high, automatically quarantines the test (a small sketch of this idea follows this list). Quarantining removes the test from the critical path and files a bug for developers to reduce the flakiness. This prevents it from becoming a problem for developers, but could easily mask a real race condition or some other bug in the code being tested.
  • Another tool detects changes in a test’s flakiness level and works to identify the code change that caused the shift.
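
A hypothetical sketch of the quarantine idea described in the first bullet above (the flakiness metric, threshold, and quarantine action are all assumptions for illustration, not a description of Google’s actual tool):

```python
# Hypothetical flakiness monitor: the metric, threshold, and quarantine action
# are assumptions for illustration, not a description of Google's actual tool.
FLAKINESS_THRESHOLD = 0.05  # quarantine when more than 5% of recent runs flip outcome

def flakiness(results: list[bool]) -> float:
    """Fraction of consecutive runs whose outcome flipped (pass <-> fail)."""
    flips = sum(1 for a, b in zip(results, results[1:]) if a != b)
    return flips / max(len(results) - 1, 1)

def maybe_quarantine(test_name: str, results: list[bool], quarantined: set[str]) -> None:
    if flakiness(results) > FLAKINESS_THRESHOLD:
        quarantined.add(test_name)  # pull the test off the critical path
        print(f"Filed bug: {test_name} is flaky ({flakiness(results):.0%} of runs flipped)")

# Example: a test whose recent history alternates between pass and fail.
quarantined: set[str] = set()
maybe_quarantine("test_payment_flow", [True, False, True, True, False, True], quarantined)
```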

In summary, test flakiness is an important problem, and Google is continuing to invest in detecting, mitigating, tracking, and fixing test flakiness throughout our code base. For example:
  • We have a new team dedicated to providing accurate and timely information about test flakiness to help developers and build monitors so that they know whether they are being harmed by test flakiness.
  • As we analyze the data from flaky test executions, we are seeing promising correlations with features that should enable us to identify a flaky result accurately without re-running the test.


By continually advancing the state of the art for teams at Google, we aim to remove the friction caused by test flakiness from the core developer workflows.

Categories: Software Testing

Test automation as an orchard

Dorothy Graham - Thu, 05/26/2016 - 05:46
At StarEast in May 2016, I was kindly invited to give a lightning keynote, which I did on this analogy. Hope you find it interesting and useful!

-----------------------------------------------------------------------

Automation is SO easy.
Let me rephrase that - automation often seems to be very easy. When you see your first demo, or run your first automated test, it’s like magic - wow, that’s good, wish I could type that fast.
But good automation is very different to that first test.
If you go into the garden and see a lovely juicy fruit hanging on a low branch, and you reach out and pick it, you think, "Wow, that was easy - isn’t it good, lovely and tasty".
But good test automation is more like building an orchard to grow enough fruit to feed a small town.
Where do you start?
First you need to know what kind of fruit you want to grow - apples? oranges? (oranges would not be a good choice for the UK). You need to consider what kind of soil you have, what kind of climate, and also what the market will be - you don’t want to grow fruit that no one wants to buy or eat.
In automation, first you need to know what kind of tests you want to automate, and why. You need to consider the company culture, other tools, what the context is, and what will bring lasting value to your business.
Growing pains?
Then you need to grow your trees. Fortunately automation can grow a lot quicker than trees, but it still takes time - it’s not instant.
While the trees are growing, you need to prune them and prune them hard especially in the first few years. Maybe you don’t allow them to fruit at all for the first 3 years - this way you are building a strong infrastructure for the trees so that they will be stronger and healthier and will produce much more fruit later on. You may also want to train them to grow into the structure that you want from the trees when they are mature.
In automation, you need to prune your tests - don’t just let them grow and grow and get all straggly. You need to make sure that each test has earned its place in your test suite, otherwise get rid of it. This way you will build a strong infrastructure of worthwhile tests that will make your automation stronger and healthier over the years, and it will bring good benefits to your organisation. You need to structure your automation (a good testware architecture) so that it will give lasting benefits.
Feeding, pests and diseases
Over time, you need to fertilise the ground, so that the trees have the nourishment they need to grow to be strong and healthy.
In automation, you need to nourish the people who are working on the automation, so that they will continue to improve and build stronger and healthier automation. They need to keep learning, experimenting, and be encouraged to make mistakes - in order to learn from them.
You need to deal with pests - bugs - that might attack your trees and damage your fruit.
Is this anything to do with automation? Are there bugs in automated scripts? In testing tools? Of course there are, and you need to deal with them - be prepared to look for them and eradicate them.
What about diseases? What if one of your trees gets infected with some kind of blight, or suddenly stops producing good fruit? You may need to chop down that infected tree and burn it, because if you don’t, this blight might spread to your whole orchard.
Does automation get sick? Actually, a lot of automation efforts seem to decay over time - they take more and more effort to maintain. Technical debt builds up, and often the automation dies. If you want your automation to live and produce good results, you might need to take drastic action and re-factor the architecture if it is causing problems. Because if you don’t, your whole automation may die.
Picking and packing
What about picking the fruit? I have seen machines that shake the trees so they can be scooped up - that might be ok if you are making cider or applesauce, but I wouldn’t want fruit picked in that way to be in my fruit bowl on the table. Manual effort is still needed. The machines can help but not do everything (and someone is driving the machines).
Test execution tools don’t do testing, they just run stuff. The tools can help and can very usefully do some things, but there are tests that should not be automated and should be run manually. The tools don’t replace testers, they support them.
We need to pack the fruit so it will survive the journey to market, perhaps building a structure to hold the fruit so it can be transported without damage.
Automation needs to survive too - it needs to survive more than one release of the application, more than one version of the tool, and may need to run on new platforms. The structure of the automation, the testware architecture, is what determines whether or not the automated tests survive these changes well.
Marketing, selling, roles and expectations
It is important to do marketing and selling for our fruit - if no one buys it, we will have a glut of rotting fruit on our hands.
Automation needs to be marketed and sold as well - we need to make sure that our managers and stakeholders are aware of the value that automation brings, so that they want to keep buying it and supporting it over time.
By the way, the people who are good at marketing and selling are probably not the same people who are good at picking or packing or pruning - different roles are needed. Of course the same is true for automation - different roles are needed: tester, automator, automation architect, champion (who sells the benefits to stakeholders and managers).
Finally, it is important to set realistic expectations. If your local supermarket buyers have heard that eating your fruit will enable them to leap tall buildings at a single bound, you will have a very easy sell for the first shipment of fruit, but when they find out that it doesn’t meet those expectations, even if the fruit is very good, it may be seen as worthless.
Setting realistic expectations for automation is critical for long-term success and for gaining long-term support; otherwise if the expectations aren’t met, the automation may be seen as worthless, even if it is actually providing useful benefits.
Summary
So if you are growing your own automation, remember these things:
  • it takes time to do it well
  • prepare the ground
  • choose the right tests to grow
  • be prepared to prune / re-factor
  • deal with pests and diseases (see previous point)
  • make sure you have a good structure so the automation will survive change
  • different roles are needed
  • sell and market the automation and set realistic expectations
  • you can achieve great results


I hope that all of your automation efforts are very fruitful!


Categories: Software Testing

Five ways to reduce the cost of large test suites

The Quest for Software++ - Mon, 05/23/2016 - 16:00
Lambeth council in south London has a historic reputation for controversial policies. At one point, it banned the police from using the council facilities. In 1985, it refused to set the budget, as a protest against government policies. After an audit, the leader and 30 other councillors had to repay the losses personally, and were banned from holding political office for five years. For conservative media in the 1980s, Lambeth was the prime example of the ‘Loony left’, and it looks as if it is on a good track to regain that reputation. Lambeth recently closed two libraries to save money, but...
