If it ain't broke, you're not trying hard enough.
URL: http://googletesting.blogspot.com/

A Virtual Battlefield... with Bugs: Ship Wars @ Google Kirkland

Thu, 11/08/2012 - 16:15
By Anthony F. Voellm (aka Tony the @p3rfguy / G+) and Emily Bedont


On Wednesday, October 24th, while sitting under the Solar System, 30 software engineers from the Greater Seattle area came together at Google Kirkland to partake in the first ever Test Edition of Ship Wars. Ship Wars was created by two Google Waterloo engineers, Garret Kelly and Aaron Kemp, as a 20% project. Yes, 20% time does exist at Google!  The object of the game is to code a spaceship that will outperform all others in a virtual universe - algorithm vs algorithm.

The Kirkland event marked the 7th iteration of the program, which was also recently run in NYC. Kirkland, however, was the first time the game had been customized to encourage exploratory testing. For "Ship Wars the Test Edition," we planted four bugs and awarded prizes to the engineering participants who found them. Well, we ran out of prizes and were quickly reminded that when you put a lot of testing-minded people in a room, many bugs will be unveiled! One of the best bugs unveiled was not among the four planted in the simulator: when you told your ship to turn 90 degrees, it actually turned -90 degrees. Oops!

Participants were encouraged to build and test their spaceships on their own machines or on a Google Chromebook. While the coding was done in the browser, the simulator and web server ran on Google Compute Engine. Throughout the 90 minutes, people challenged other participants to duels. Head-to-head battles took place on Chromebooks at the front of the room. There were many accolades called out, but in the end there could only be one champion who would walk away with a brand spankin’ new Nexus 7. Check out our video of the evening’s activities.

Sounds fun, huh? We sure hope our participants, including our first place winner shown receiving the Nexus 7 from Garret, enjoyed the evening! Beyond the battles, our guests were introduced to the revived Google Testing Blog, heard firsthand that GTAC will be back in 2013, learned about testing at Google, and interacted with Googlers in a "Googley" environment. Achievement unlocked.

Special thanks to all the Googlers that supported the event!

Categories: Software Testing

GTAC Coming to New York in the Spring

Tue, 10/30/2012 - 11:18

By The GTAC Committee

The next and seventh GTAC (Google Test Automation Conference) will be held on April 23-24, 2013 at the beautiful Google New York office! We had a late start preparing for GTAC, but things are now falling into place. This will be the second time it is hosted at our second largest engineering office. We are also sticking with our tradition of changing the region each year, as the last GTAC was held in California.

The GTAC event brings together engineers from many organizations to discuss test automation. It is a great opportunity to present, learn, and challenge modern testing technologies and strategies. We will soon be recruiting speakers to discuss their innovations.

This year’s theme will be “Testing Media and Mobile”. In the past few years, substantial changes have taken place in both the media and mobile areas. Television is no longer the king of media. Over 27 billion videos are streamed in the U.S. per month. Over 1 billion people now own smartphones. HTML5 includes support for audio, video, and scalable vector graphics, which will liberate many web developers from their dependence on third-party media software. These are incredibly complex technologies to test. We are thrilled to be hosting this event in which many in the industry will share their innovations.

Registration information for speakers and attendees will soon be posted here and on the GTAC site (http://www.gtac.biz). Even though we will be focusing on “Testing Media and Mobile”, we will be accepting proposals for talks on other topics.


Categories: Software Testing

Why Are There So Many C++ Testing Frameworks?

Fri, 10/26/2012 - 17:22
By Zhanyong Wan - Software Engineer

These days, it seems that everyone is rolling their own C++ testing framework, if they haven't done so already. Wikipedia has a partial list of such frameworks. This is interesting because many OOP languages have only one or two major frameworks. For example, most Java people seem happy with either JUnit or TestNG. Are C++ programmers the do-it-yourself kind?

When we started working on Google Test (Google’s C++ testing framework), and especially after we open-sourced it, people began asking us why we were doing it. The short answer is that we couldn’t find an existing C++ testing framework that satisfied all our needs. This doesn't mean that these frameworks were all poorly designed or implemented. Rather, many of them had great ideas and tricks that we learned from. However, Google had a huge number of C++ projects that got compiled on various operating systems (Linux, Windows, Mac OS X, and later Android, among others) with different compilers and all kinds of compiler flags, and we needed a framework that worked well in all these environments and could handle many different types and sizes of projects.

Unlike Java, which has the famous slogan "Write once, run anywhere," C++ code is being written in a much more diverse environment. Due to the complexity of the language and the need to do low-level tasks, compatibility between different C++ compilers and even different versions of the same compiler is poor. There is a C++ standard, but it's not well supported by compiler vendors. For many tasks you have to rely on unportable extensions or platform-specific functionality. This makes it hard to write a reasonably complex system that can be built using many different compilers and works on many platforms.

To make things more complicated, most C++ compilers allow you to turn off some standard language features in return for better performance. Don't like using exceptions? You can turn them off. Think dynamic_cast is bad? You can disable Run-Time Type Identification, the feature behind dynamic_cast and run-time access to type information. If you do any of these, however, code using these features will fail to compile. Many testing frameworks rely on exceptions, so they are automatically out of the question for us since we turn off exceptions in many projects. (In case you are curious, Google Test doesn’t require exceptions or run-time type identification; when these language features are turned on, Google Test will try to take advantage of them and provide you with more utilities, like the exception assertions.)
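
To make the exception-assertion point concrete, here is a minimal sketch of what becomes available when exceptions are on. The Divide() function is a made-up example; EXPECT_THROW and EXPECT_NO_THROW are the real Google Test macros:

#include <stdexcept>

#include "gtest/gtest.h"

// Divide() is a hypothetical function used only for illustration.
int Divide(int numerator, int denominator) {
  if (denominator == 0)
    throw std::invalid_argument("division by zero");
  return numerator / denominator;
}

TEST(DivideTest, ChecksExceptionBehavior) {
  // These assertions are only available when exceptions are enabled.
  EXPECT_THROW(Divide(10, 0), std::invalid_argument);
  EXPECT_NO_THROW(Divide(10, 2));
}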

Why not just write a portable framework, then? Indeed, that's a top design goal for Google Test, and authors of some other frameworks have tried this too. However, it comes with a cost. Cross-platform C++ development requires much more effort: you need to test your code with different operating systems, different compilers, different versions of them, and different compiler flags (combine these factors and the task soon gets daunting); some platforms may not let you do certain things, so you have to find a workaround and guard the code with conditional compilation; different versions of compilers have different bugs, and you may have to revise your code to bypass them all; etc. In the end, it's hard unless you are happy with a bare-bones system.

So, I think a major reason that we have many C++ testing frameworks is that C++ is different in different environments, making it hard to write portable C++ code. John's framework may not suit Bill's environment, even if it solves John's problems perfectly.

Another reason is that some limitations of C++ make it impossible to implement certain features really well, and different people chose different ways to work around these limitations. One notable example is that C++ is a statically-typed language and doesn't support reflection. Most Java testing frameworks use reflection to automatically discover the tests you've written, so you don't have to register them one by one. This is a good thing, as manually registering tests is tedious and it's easy to write a test and forget to register it. Since C++ has no reflection, we have to do it differently. Unfortunately, there is no single best option. Some frameworks require you to register tests by hand, some use scripts to parse your source code to discover tests, and some use macros to automate the registration. We prefer the last approach and think it works for most people, but some disagree. Also, there are different ways to devise the macros, and they involve different trade-offs, so the result is not clear cut.

Let’s see some actual code to understand how Google Test solves the test registration problem. The simplest way to add a test is to use the TEST macro (what else would we name it?):

TEST(Subject, HasCertainProperty) {
  … testing code goes here …
}

This defines a test method whose purpose is to verify that the given subject has the given property. The macro automatically registers the test with Google Test such that it will be run when the test program (which may contain many such TEST definitions) is executed.

Here’s a more concrete example that verifies a Factorial() function works as expected for positive arguments:

TEST(FactorialTest, HandlesPositiveInput) {
  EXPECT_EQ(1, Factorial(1));
  EXPECT_EQ(2, Factorial(2));
  EXPECT_EQ(6, Factorial(3));
  EXPECT_EQ(40320, Factorial(8));
}

Finally, many C++ testing framework authors neglected extensibility and were happy just providing canned solutions, so we ended up with many solutions, each satisfying a different niche but none general enough. A versatile framework must have a good extension story. Let's face it: you cannot be all things to all people, no matter what. Instead of bloating the framework with rarely used features, we should provide good out-of-the-box solutions for maybe 95% of the use cases, and leave the rest to extensions. If I can easily extend a framework to solve a particular problem of mine, I will feel less motivated to write my own thing. Unfortunately, many framework authors don't seem to see the importance of extensibility. I think that mindset contributed to the plethora of frameworks we see today. In Google Test, we try to make it easy to expand your testing vocabulary by defining custom assertions that generate informative error messages. For instance, here’s a naive way to verify that an int value is in a given range:

bool IsInRange(int value, int low, int high) {
  return low <= value && value <= high;
}
  ...
  EXPECT_TRUE(IsInRange(SomeFunction(), low, high));

The problem is that when the assertion fails, you only know that the value returned by SomeFunction() is not in range [low, high], but you have no idea what that return value and the range actually are -- this makes debugging the test failure harder.

You could provide a custom message to make the failure more descriptive:

  EXPECT_TRUE(IsInRange(SomeFunction(), low, high))
      << "SomeFunction() = " << SomeFunction() 
      << ", not in range ["
      << low << ", " << high << "]";

Except that this is incorrect as SomeFunction() may return a different answer each time.  You can fix that by introducing an intermediate variable to hold the function’s result:

  int result = SomeFunction();
  EXPECT_TRUE(IsInRange(result, low, high))
      << "result (return value of SomeFunction()) = " << result
      << ", not in range [" << low << ", " << high << "]";

However this is tedious and obscures what you are really trying to do.  It’s not a good pattern when you need to do the “is in range” check repeatedly. What we need here is a way to abstract this pattern into a reusable construct.

Google Test lets you define a test predicate like this:

// These helpers come from <gtest/gtest.h>:
using ::testing::AssertionFailure;
using ::testing::AssertionResult;
using ::testing::AssertionSuccess;

AssertionResult IsInRange(int value, int low, int high) {
  if (value < low)
    return AssertionFailure()
        << value << " < lower bound " << low;
  else if (value > high)
    return AssertionFailure()
        << value << " > upper bound " << high;
  else
    return AssertionSuccess()
        << value << " is in range ["
        << low << ", " << high << "]";
}

Then the statement EXPECT_TRUE(IsInRange(SomeFunction(), low, high)) may print (assuming that SomeFunction() returns 13):

   Value of: IsInRange(SomeFunction(), low, high)
     Actual: false (13 < lower bound 20)
   Expected: true

The same IsInRange() definition also lets you use it in an EXPECT_FALSE context, e.g. EXPECT_FALSE(IsInRange(AnotherFunction(), low, high)) could print:

   Value of: IsInRange(AnotherFunction(), low, high)
     Actual: true (25 is in range [20, 60])
   Expected: false

This way, you can build a library of test predicates for your problem domain, and benefit from clear, declarative test code and descriptive failure messages.

In the same vein, Google Mock (our C++ mocking framework) allows you to easily define matchers that can be used exactly the same way as built-in matchers. We have also included an event listener API in Google Test for people to write plug-ins. We hope that people will use these features to extend Google Test/Mock for their own needs and contribute back extensions that might be generally useful.
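
As a rough illustration of that extension point, here is a sketch of a custom matcher defined with Google Mock's MATCHER_P2 macro; it plugs into EXPECT_THAT exactly like a built-in matcher. SomeFunction() is a hypothetical stand-in for code under test:

#include <string>

#include "gmock/gmock.h"
#include "gtest/gtest.h"

// Hypothetical function under test.
int SomeFunction() { return 42; }

// Matches any value in the closed range [low, high]. The description string
// is used in failure messages; `negation` and `arg` are provided by the macro.
MATCHER_P2(IsBetween, low, high,
           std::string(negation ? "isn't" : "is") + " between the given bounds") {
  return low <= arg && arg <= high;
}

TEST(SomeFunctionTest, ResultStaysInRange) {
  EXPECT_THAT(SomeFunction(), IsBetween(20, 60));
}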

Perhaps one day we will solve the C++ testing framework fragmentation problem, after all. :-)

Categories: Software Testing

Hermetic Servers

Wed, 10/03/2012 - 12:08
By Chaitali Narla and Diego Salas
Consider a complex and rich web app. Under the hood, it is probably a maze of servers, each performing a different task and most talking to each other. Any user action navigates this server maze on its round trip from the user to the datastores and back. A lot of Google’s web apps are like this, including Gmail and Google+. So how do we write end-to-end tests for them?
The “End-To-End” Test
An end-to-end test in the Google testing world is a test that exercises the entire server stack from a user request to response. Here is a simplified view of the System Under Test (SUT) that an end-to-end test would assert. Note that the frontend server in the SUT connects to a third backend which this particular user request does not need.

One of the challenges to writing a fast and reliable end-to-end test for such a system is avoiding network access. Tests involving network access are slower than their counterparts that only access local resources, and accessing external servers might lead to flakiness due to lack of determinism or unavailability of the external servers. 
Hermetic Servers
One of the tricks we use at Google to design end-to-end tests is Hermetic Servers.
What is a Hermetic Server? The short definition would be a “server in a box”. If you can start up the entire server on a single machine that has no network connection AND the server works as expected, you have a hermetic server! This is a special case of the more general “hermetic” concept which applies to an isolated system not necessarily on a single machine. 
Why is it useful to have a hermetic server? Because if your entire SUT is composed of hermetic servers, it could all be started on a single machine for testing; no network connection necessary! The single machine could be a physical or virtual machine.
Designing Hermetic Servers
The process for building a hermetic server starts early in the design phase of any new server. Some things we watch out for:
  • All connections to other servers are injected into the server at runtime using a suitable form of dependency injection, such as commandline flags or Guice (a minimal sketch follows this list).
  • All required static files are bundled in the server binary.
  • If the server talks to a datastore, make sure the datastore can be faked with data files or in-memory implementations.
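
As a rough C++ sketch of the first point (all names here are hypothetical, and a real server would more likely use commandline flags or Guice than bare constructor injection), the frontend only talks to whatever backend it is handed, so a test can substitute an in-process fake:

#include <memory>
#include <string>
#include <utility>

// Hypothetical interface for the connection to a backend server.
class BackendClient {
 public:
  virtual ~BackendClient() {}
  virtual std::string Lookup(const std::string& key) = 0;
};

// The frontend never opens a network connection itself; it uses whatever
// client it was given at construction time.
class FrontendServer {
 public:
  explicit FrontendServer(std::unique_ptr<BackendClient> backend)
      : backend_(std::move(backend)) {}
  std::string HandleRequest(const std::string& key) {
    return "response: " + backend_->Lookup(key);
  }
 private:
  std::unique_ptr<BackendClient> backend_;
};

// An in-memory fake used in tests, so the whole SUT can run hermetically
// on a single machine.
class FakeBackendClient : public BackendClient {
 public:
  std::string Lookup(const std::string& key) override { return "fake-" + key; }
};
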
Meeting the above requirements ensures we have a highly configurable server that has the potential to become a hermetic server. But it is not yet ready to be used in tests. We do a few more things to complete the package:
  • Make sure those connection points which our test won’t exercise have appropriate fakes or mocks to verify this non-interaction.
  • Provide modules to easily populate datastores with test data.
  • Provide logging modules that can help trace the request/response path as it passes through the SUT.

Using Hermetic Servers in tests
Let’s take the SUT shown earlier and assume all the servers in it are hermetic servers. Here is how an end-to-end test for the same user request would look:

The end-to-end test does the following steps (a rough code sketch follows the list):
  • starts the entire SUT as shown in the diagram on a single machine
  • makes requests to the server via the test client
  • validates responses from the server
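
Continuing the hypothetical sketch from the design section above (FrontendServer and FakeBackendClient are the made-up classes from that sketch, not real Google code), such a test might look roughly like this:

#include <memory>
#include <string>

#include "gtest/gtest.h"

TEST(FrontendEndToEndTest, HandlesLookupRequest) {
  // Start the (hermetic) SUT in-process on a single machine.
  FrontendServer server(std::unique_ptr<BackendClient>(new FakeBackendClient));

  // Make a request via the test client -- here, a direct call.
  const std::string response = server.HandleRequest("user-123");

  // Validate the response.
  EXPECT_EQ("response: fake-user-123", response);
}
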
One thing to note here is that the mock server connection for the backend is not needed in this test. If we wish to test a request that needs this backend, we would have to provide a hermetic server at that connection point as well.
This end-to-end test is more reliable because it uses no network connection. It is faster because everything it needs is available in memory or on the local hard disk. We run such tests in our continuous builds, so they run on each changelist affecting any of the servers in the SUT. If the test fails, the logging module helps track where the failure occurred in the SUT.
We use hermetic servers in a lot of end-to-end tests. Some common cases include:
  • Startup tests for servers using Guice to verify that there are no Guice errors on startup.
  • API tests for backend servers.
  • Micro-benchmark performance tests.
  • UI and API tests for frontend servers.

Conclusion
Hermetic servers do have some limitations. They will increase your test’s runtime since you have to start the entire SUT each time you run the end-to-end test. If your test runs with limited resources such as memory and CPU, hermetic servers might push your test over those limits as the server interactions grow in complexity. The dataset size you can use in the in-memory datastores will be much smaller than production datastores.
Hermetic servers are a great testing tool. Like all other tools, they need to be used thoughtfully where appropriate.
Categories: Software Testing

Conversation with a Test Engineer

Wed, 09/19/2012 - 09:31
By Alan Myrvold

Alan Faulkner is a Google Test Engineer working on DoubleClick Bid Manager, which enables advertising agencies and advertisers to bid on multiple ad exchanges. Bid Manager is the next generation of the Invite Media product, acquired by Google in 2010. Alan Faulkner has been focused on the migration component of Bid Manager, which transitions advertiser information from Invite Media to Bid Manager. He joined Google in August 2011, and works in the Kirkland, WA office.
Are you a Test Engineer, or a Software Engineer in Test, and what’s the difference?
Right now, I’m a Test Engineer, but the two roles can be very similar. As a Test Engineer, you’re more focused on the overall quality of the product and speed of releases, while a Software Engineer in Test might focus more on test frameworks, automation, and refactoring code for testability. I think of the difference as more of a shift in focus and not capabilities, since both roles at Google need to be able to write production-quality code. Examples of test engineering tasks I’ve worked on include introducing an automated release process, identifying areas where the team can improve code coverage, and reducing the manual steps needed to validate data correctness.

What is a typical day for you?
When I get in, I look at any code reviews I need to respond to, look for any production bugs from technical account managers that are high priority, and then start writing code. In my current role, I focus my development effort on improving the efficiency and coverage of our large scale integration tests and frameworks. I also work on adding additional features to our product that improve our testability. I typically spend anywhere from 50% to 75% of my time either writing code or participating in code reviews.

Do you write only test code?
No, I write a lot of code that is included in the product as well. One of the great things about being an SET or TE at Google is that you can write product code as easily as test code. I write both. My test code focuses on improving test frameworks and enabling developers to write integration tests. The production code that I write focuses on increasing the verification of external inputs. I also focus on adding features that improve testability. This pushes more quality features into the product itself rather than relying on test code for correctness.

What programming languages do you use?
Both the test and product code are mostly Java. Occasionally I use Python or C++ too.

How much time do you spend doing manual testing?
Right now, with the role I am in, I spend less than 5% of my time doing manual testing. Although some exploratory testing helps develop product knowledge and find risky areas, it doesn’t scale as a repeated process. There are a small number of manual steps left, and I focus on ways to reduce them so our team does not spend its time repeating manual steps as part of our data migration.

Do you write unit tests for code that isn’t yours?
At Google, the responsibility of testing is shared across all product engineers, not just Test Engineers. Everyone is responsible for writing unit tests as well as integration tests for their components. That being said, I have written unit tests for components outside of what I developed, but that has been to illustrate how to write a unit test for the component in question. Those components usually involved an abnormally complex set of code, or I was illustrating the use of a new mocking framework, such as Mockito.

What do you like about working on the Google advertising products?
I like the challenges of the scalability problems we need to solve, from handling massive amounts of data to dealing with lots of real time ad requests that need to be responded to in milliseconds. I also like the impact, since the products affect a lot of people. It’s rewarding to work on stuff like that.

How is testing at Google different from your experience at other companies?
I feel the role is more flexible at Google. There are fewer SETs and TEs per developer in my group at Google, and you have the flexibility to pick what is most important. For example, I get to write a lot of production code to fix bugs, make the code more testable, and increase the visibility into errors encountered during our data migrations. Plus, developers at Google spend a lot of time writing tests, so testing isn’t just my responsibility.

How does the Google Kirkland office differ from the headquarters in Mountain View?
What I really like about the offices at Google is that each has its own local feel and personality. Google encourages this! For instance, the office here in Kirkland has a climbing wall and boats, and all the conference rooms in our building are named after local bands. The office in Seattle has kayaks, and the New York office has an actual food truck in its cafeteria.

What’s the future of testing at Google?
I think the future is really bright. We have a lot of flexibility to make a big impact on quality, testability and improving our release velocity. We need to release new features faster and with good quality. The problems that we face are complex and at an extreme scale. We need engineering teams focused on ensuring that we have efficient ways to simulate and test. There will always be a need for testers and developers that focus on these areas here at Google.

Interested in test jobs at Google?
Categories: Software Testing

Testing 2.0

Fri, 08/31/2012 - 11:14

By Anthony F. Voellm (aka Tony the perfguy)

It’s amazing what has happened in the field of test in the last 20 years... a lot of “art” has turned into “science”. Computer scientists, engineers, and many other disciplines have worked on provable systems and calculus, pioneered model based testing, invented security fuzz testing, and even settled on a common pattern for unit tests called xunit. The xunit pattern shows up in open source software like JUnit as well as in Microsoft development test tools.

With all this innovation in test, it’s no wonder test is dead. The situation is no different from the late 1800s, when patents were declared dead. Everything had been invented. So now that everything in test has been invented, it’s dead.

Well... if you believe everything in test has been invented then please stop reading now :)

As an aside:  “Test is dead” was a keynote at the Google Test Automation Conference (GTAC) in 2011.  You can watch that talk and many other GTAC test talks on YouTube, and I definitely recommend you check them out here.  Talks span a wide range of topics ranging from GUI Automation to Cloud.

What really excites me these days is that we have closed a chapter on test. A lot of the foundation of writing and testing great software has been laid (examples at the beginning of the post, tools like Webdriver for UI, FIO for storage, and much more), which I think of as Testing 1.0. We all use Testing 1.0 day in and day out. In fact at Google, most of the developers (called Software Engineers or SWEs) do the basic Testing 1.0 work and we have a high bar on quality.  Knuth once said "Be careful about using the following code -- I've only proven that it works, I haven't tested it." 

This brings us to the current chapter in test, which I call Testing 1.5. This chapter is being written by computer scientists, applied scientists, engineers, developers, statisticians, and many other disciplines. These people come together in the Software Engineer in Test (SET) and Test Engineer (TE) roles at Google. SETs and TEs focus on developing software faster, building it better the first time, testing it in depth, releasing it quicker, and making sure it works in all environments. We often put deep test focus on security, reliability, and performance. I sometimes think of SETs and TEs as risk assessors whose role is to figure out the probability of finding a bug, and then work to reduce that probability. These are super interesting computer science problems where we take a solid engineering approach, rather than a process-oriented, manual, people-intensive approach. We always look to scale with machines wherever possible.

While Testing 1.0 is done and 1.5 is alive and well, it’s Testing 2.0 that gets me up early in the morning to start my day. Imagine if we could reinvent how we use and think about tests.  What if we could automate the complex decisions on good and bad quality that humans are still so good at today? What would it look like if we had a system collecting all the “quality signals” (think: tests, production information, developer behavior, …) and could predict how good the code is today, and what it most likely will be tomorrow? That would be so awesome...

Google is working on Testing 2.0 and we’ll continue to contribute to Testing 1.0 and 1.5. Nothing is static... keep up or miss an amazing ride.

Peace.... Tony

Special thanks to Chris, Simon, Anthony, Matt, Asim, Ari, Baran, Jim, Chaitali, Rob, Emily, Kristen, Annie, and many others for providing input and suggestions for this post.
Categories: Software Testing

Testing Google's New API Infrastructure

Fri, 08/24/2012 - 16:15

By Anthony Vallone

If you haven’t noticed, Google has been launching many public APIs recently. These APIs are empowering mobile app, desktop app, web service, and website developers by providing easy access to Google tools, services, and data. In the past couple of years, we have invested heavily in building a new API infrastructure for our APIs. Before this infrastructure, our teams had numerous technical challenges to solve when releasing an API: scalability, authorization, quota, caching, billing, client libraries, translation from external REST requests to internal RPCs, etc. The new infrastructure generically solves these problems and allows our teams to focus on their service. Automating testing of these new APIs turned out to be quite a large problem. Our solution to this problem is somewhat unique within Google, and we hope you find it interesting.


System Under Test (SUT)

Let’s start with a simplified view of the SUT design:



A developer’s application uses a Google-supplied API Client Library to call Google API methods. The library connects to the API Infrastructure service and sends the request. Part of the request defines the particular API and version being used by the client. This service is knowledgeable of all Google APIs, because they are defined by API Configuration files. These files are created by each API providing team. Configuration files declare API versions, methods, method parameters, and other API settings. Given an API request and information about the API, the API Infrastructure Service can translate the request to Google’s internal RPC format and pass it to the correct API Provider Service. This service then satisfies the request and passes the response back to the developer’s app via the API Infrastructure Service and API Client Library.

Now, the Fun Part

As of this writing, we have released 10 language-specific client libraries and 35 public APIs built on this infrastructure. Also, each of the libraries needs to work on multiple platforms. Our test space has three dimensions: API (35), language (10), and platform (varies by library). How are we going to test all the libraries on all the platforms against all the APIs when we only have two engineers on the team dedicated to test automation?

Step 1: Create a Comprehensive API

Each API uses different features of the infrastructure, and we want to ensure that every feature works. Rather than use the APIs to test our infrastructure, we create a Test API that uses every feature. In some cases where API configuration options are mutually exclusive, we have to create API versions that are feature-specific. Of course, each API team still needs to do basic integration testing with the infrastructure, but they can assume that the infrastructure features that their API depends on are well tested by the infrastructure team.

Step 2: Client Abstraction Layer in the Tests

We want to avoid creating library-specific tests, because this would lead to mass duplication of test logic. The obvious solution is to create a test library to be used by all tests as an abstraction layer hiding the various libraries and platforms. This allows us to define tests that don’t care about library or platform.

Step 3: Adapter Servers

When a test makes an API call through the test library, it should be able to exercise any language and platform. We solve this by setting up servers on each of our target platforms: for each target language, we create a language-specific adapter server. These servers receive requests from test clients, translate them into actual library calls, and return the response to the caller. The code for these servers is quite simple to create and maintain.
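
Here is a rough sketch of how the abstraction layer and adapter servers fit together (every name here is hypothetical; the real test library is internal to Google). Tests only see the abstract client; each concrete client forwards the call to the adapter server for one language/platform, which performs the real library call:

#include <map>
#include <string>

// Hypothetical abstraction over all client libraries and platforms.
class ApiTestClient {
 public:
  virtual ~ApiTestClient() {}
  // Invokes `method` of the given API/version with the supplied parameters
  // and returns the raw response.
  virtual std::string Call(const std::string& api, const std::string& version,
                           const std::string& method,
                           const std::map<std::string, std::string>& params) = 0;
};

// One implementation per target; it ships the request to the adapter server
// for that language/platform and returns whatever the adapter sends back.
class AdapterBackedClient : public ApiTestClient {
 public:
  explicit AdapterBackedClient(const std::string& adapter_address)
      : adapter_address_(adapter_address) {}
  std::string Call(const std::string& api, const std::string& version,
                   const std::string& method,
                   const std::map<std::string, std::string>& params) override {
    // The transport is omitted in this sketch; a real client would serialize
    // the request and send it to adapter_address_ (e.g. over HTTP).
    return SendToAdapter(api, version, method, params);
  }
 private:
  std::string SendToAdapter(const std::string&, const std::string&,
                            const std::string&,
                            const std::map<std::string, std::string>&) {
    return "";  // placeholder for the real transport
  }
  std::string adapter_address_;
};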

Step 4: Iterate

Now, we have all the pieces in place. When we run our tests, they are configured to run over all supported languages and platforms against the test API:



Test Nirvana Achieved

We have a suite of straightforward tests that focus on infrastructure features. When the tests run, they are quick, reliable, and test all of our supported features, platforms, and libraries. When a feature is added to the API infrastructure, we only need to create one new test, update each adapter server to handle a new call type, and add the feature to the Test API.




Categories: Software Testing

Covering all your codebases: A conversation with a Software Engineer in Test

Sat, 08/11/2012 - 19:50
Cross-posted from the Google Student Blog

Today we’re featuring Sabrina Williams, a Software Engineer in Test who joined Google in August 2011. Software Engineers in Test undertake a broad range of challenges on a daily basis, designing and building intelligent systems that can explore various use cases and scenarios for distributed computing infrastructure. Read on to learn more about Sabrina’s path to Google and what she works on now that she’s here!



Tell us about yourself and how you got to Google.
I grew up in rural Prunedale, Calif. and went to Stanford where I double-majored in philosophy and computer science. After college I spent six years as a software engineer at HP, working primarily on printer drivers. I began focusing on testing my last two years there—reading books, looking up information and prototyping test tools in my own time. By the time I left, I’d started a project for an automated test framework that most of our lab used.

I applied for a software engineering role at Google four years ago and didn’t do well in my interviews. Thankfully, a Google recruiter called last year and set me up for software engineer (SWE) interviews again. After a day of talking about testing and mocking for every design question I answered, I was told that there were opportunities for me in SWE and SET. I ended up choosing the SET role after speaking with the SET hiring manager. He said two things that convinced me. First, SETs spend as much time coding as SWEs, and I wanted a role where I could write code. Second, the SET’s job is to creatively solve testing problems, which sounded more interesting to me than writing features for a product. This seemed like a really unique and appealing opportunity, so I took it!

So what exactly do SETs do?
SETs are SWEs who are really into testing. We help SWEs design and refactor their code so that it is more testable. We work with test engineers (TEs) to figure out how to automate difficult test cases. We also write harnesses, frameworks and tools to make test automation possible. SETs tend to have the best understanding of how everything plays together (production code, manual tests, automated tests, tools, etc.) and we have to make that information accessible to everyone on the team.

What project do you work on?
I work on the Google Cloud Print team. Our goal is to make it possible to print anywhere from any device. You can use Google Cloud Print to connect home and work printers to the web so that you (and anyone you share your printers with) can access them from your phone, tablet, Chromebook, PC or any other supported web-connected device.

What advice would you give to aspiring SETs?
First, for computer science majors in general: if there’s any other field about which you are passionate, at least minor in it. CS is wonderfully chameleonic in that it can be applied to anything. So if, for example, you love art history, minor in art and you can write software to help restore images of old paintings.

For aspiring SETs, challenge yourself to write tests for all of the code you write for school. If you can get an internship where you have access to a real-world code base, study how that company approaches testing their code. If it’s well-tested, see how they did it. If it’s not well-tested, think about how you would test it. I don’t (personally) know of a CS program that has even a full course based on testing, so you’ll have to teach yourself. Start by looking up buzzwords like “unit test” and “test-driven development.” Look up the different types of tests (unit, integration, component, system, etc.). Find a code coverage tool (if a free/cheap one is available for your language of choice) and see how well you’re covering your code with your tests. Write a tool that will run all of your tests every time you build your code. If all of this sounds like fun...well...we need more people like you! 

If you’re interested in applying for a Software Engineer in Test position, please apply for our general Software Engineer position, then indicate in your resume objective line that you’re interested in the SET role.

Posted by Jessica Safir, University Programs

Categories: Software Testing

Welcome to the Next Generation of Google Testing

Thu, 08/09/2012 - 23:39
By Anthony Vallone
Wow... it has been a long time since we’ve posted to the blog. This past year has been a whirlwind of change for many test teams as Google has restructured leadership with a focus on products. Now that the dust has settled, our teams are leaner, more focused, and more effective. We have learned quite a bit over the past year about how best to tackle and manage test problems at monumental scale. The next generation of test teams at Google are looking forward to sharing all that we have learned. Stay tuned for a revived Google Testing Blog that will provide deep insight into our latest testing technologies and strategies.
Categories: Software Testing

ScriptCover makes Javascript coverage analysis easy

Tue, 08/07/2012 - 10:31
By Ekaterina Kamenskaya, Software Engineer in Test, YouTube

Today we introduce the Javascript coverage analysis tool, ScriptCover. It is a Chrome extension that provides line-by-line Javascript code coverage statistics for web pages in real time, without requiring any modifications by the user. The results are collected both when the page loads and as users interact with it. The tool reports coverage details both for the total web page and for each external/internal script, along with annotated code sources in which the executed lines are individually highlighted.
Short report in Chrome extension’s popup, detailing both overall scores and per-script coverage.

Main features:
  • Report current and previous total Javascript coverage percentages and total number of instrumented code instructions.
  • Report Javascript coverage per individual instruction for each internal and external script.
  • Display detailed reports with annotated Javascript source code.
  • Recalculate coverage statistics while loading the page and on user actions.
Sample of annotated source code from detailed report. First two columns are line number and number of times each instruction has been executed.
Here are the benefits of ScriptCover over other existing tools:
  • Per-instruction coverage for external and internal scripts: The tool formats original external and internal Javascript code from ‘<script>’ tags to ideally place one instruction per line and then calculates and displays Javascript coverage statistics. It is useful even when the code is compressed to one line.

  • Dynamic: Users can get updated Javascript coverage statistics while the web page is loading and while interacting with the page.

  • Easy to use: Users with different levels of expertise can install and use the tool to analyse coverage. Additionally, there is no need to write tests, modify the web application’s code, save the inspected web page locally, manually change proxy settings, etc. When the extension is activated in a Chrome browser, users just navigate through web pages and get coverage statistics on the fly.

  • It’s free and open source!
         
Want to try it out? Install ScriptCover and let us know what you think.

We envision many potential features and improvements for ScriptCover. If you are passionate about code coverage, read our documentation and participate in the discussion group. Your contributions to the project’s design, code base and feature requests are welcome!
Categories: Software Testing

Introducing DOM Snitch, our passive in-the-browser reconnaissance tool

Fri, 08/03/2012 - 10:46
By Radoslav Vasilev from Google Zurich

Every day modern web applications are becoming increasingly sophisticated, and as their complexity grows so does their attack surface. Previously we introduced open source tools such as Skipfish and Ratproxy to assist developers in understanding and securing these applications.

As existing tools focus mostly on testing server-side code, today we are happy to introduce DOM Snitch — an experimental* Chrome extension that enables developers and testers to identify insecure practices commonly found in client-side code. To do this, we have adopted several approaches to intercepting JavaScript calls to key and potentially dangerous browser infrastructure such as document.write or HTMLElement.innerHTML (among others). Once a JavaScript call has been intercepted, DOM Snitch records the document URL and a complete stack trace that will help assess if the intercepted call can lead to cross-site scripting, mixed content, insecure modifications to the same-origin policy for DOM access, or other client-side issues.



Here are the benefits of DOM Snitch:
  • Real-time: Developers can observe DOM modifications as they happen inside the browser without the need to step through JavaScript code with a debugger or pause the execution of their application.

  • Easy to use: With built-in security heuristics and nested views, both advanced and less experienced developers and testers can quickly spot areas of the application being tested that need more attention.

  • Easier collaboration: Enables developers to easily export and share captured DOM modifications while troubleshooting an issue with their peers.


DOM Snitch is intended for use by developers, testers, and security researchers alike. Click here to download DOM Snitch. To read the documentation, please visit this page.

*Developers and testers should be aware that DOM Snitch is currently experimental. We do not guarantee that it will work flawlessly for all web applications. More details on known issues can be found here or in the project’s issues tracker.
Categories: Software Testing

GTAC 2011 Keynotes

Fri, 08/03/2012 - 10:45
By James Whittaker

I am pleased to confirm 3 of our keynote speakers for GTAC 2011 at the Computer History Museum in Mountain View CA.

Google's own Alberto Savoia, aka Testivus.

Steve McConnell the best selling author of Code Complete and CEO of Construx Software.

Award winning speaker ("the Jon Stewart of Software Security") Hugh Thompson.

This is the start of an incredible lineup. Stay tuned for updates concerning their talks and continue to nominate additional speakers and keynotes. We're not done yet and we're taking nominations through mid July.

In addition to the keynotes, we're going to be giving updates on How Google Tests Software from teams across the company, including Android, Chrome, Gmail, YouTube and many more.
Categories: Software Testing

How We Tested Google Instant Pages

Fri, 08/03/2012 - 10:44
By Jason Arbon and Tejas Shah

Google Instant Pages are a cool new way that Google speeds up your search experience. When Google thinks it knows which result you are likely to click, it preloads that page in the background, so when you click the page it renders instantly, saving the user about 5 seconds. 5 seconds is significant when you think of how many searches are performed each day--and especially when you consider that the rest of the search experience is optimized for sub-second performance.

The testing problem here is interesting. This feature requires client and server coordination, and since we are pre-loading and rendering the pages in an invisible background page, we wanted to make sure that nothing major was broken with the page rendering.

The original idea was for developers to test out a few pages as they went. But this doesn’t scale to a large number of sites and is very expensive to repeat. Also, how do you know what the pages should look like? To write Selenium tests to functionally validate thousands of sites would take forever--the product would ship first. The solution was to perform automated test runs that load these pages from search results with Instant Pages turned on, and another run with Instant Pages turned off. The page renderings from each run were then compared.

How did we compare the two runs? How do you compare pages when content and ads are constantly changing and you don't know what the expected behavior is? We could have used cached versions of these pages, but that wouldn’t be the real-world experience we were testing, it would take time to set up, and the timing would have been different. We opted to leverage some other work that compares pages using the Document Object Model (DOM). We automatically scan each page, pixel by pixel, but look at which element is visible at each point on the page, not the color/RGB values. We then do a simple measure of how closely these pixel measurements match. These so-called "quality bots" generate a score of 0-100%, where 100% means all measurements were identical.
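
A rough sketch of the scoring idea (an illustration of the concept, not the actual bot code): sample the same grid of points on each rendering, record which DOM element is visible at every point, and report the fraction of points where the two runs agree.

#include <string>
#include <vector>

// Each entry identifies the DOM element visible at one sampled point of the
// page (for example, an XPath-like string). Both vectors come from the same
// sampling grid, one per run (Instant Pages on vs. off).
double SimilarityScore(const std::vector<std::string>& run_a,
                       const std::vector<std::string>& run_b) {
  if (run_a.empty() || run_a.size() != run_b.size()) return 0.0;
  size_t matches = 0;
  for (size_t i = 0; i < run_a.size(); ++i) {
    if (run_a[i] == run_b[i]) ++matches;
  }
  // 1.0 (100%) means every sampled point showed the same element.
  return static_cast<double>(matches) / run_a.size();
}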

When we performed the runs, the vast majority (~95%) of all comparisons were almost identical, as we hoped. Where the pages were different, we built a web page that showed the differences between the two runs by rendering both images and highlighting the difference. It was quick and easy for the developers to visually verify that the differences were due only to content or other non-structural differences in the rendering. Any time test automation scales, is repeatable, is quantified, and lets developers validate the results without us, that's a good thing!

How did this testing get organized? As with many things in testing at Google, it came down to people chatting and realizing their work could be helpful for other engineers. This was bottom-up, not top-down. Tejas Shah was working on a general quality bot solution for compatibility (more on that in later posts) between Chrome and other browsers. He chatted with the Instant Pages developers when he was visiting their building, and they agreed his bot might be able to help. He then spent the next couple of weeks pulling it all together and sharing the results with the team.

And now more applications of the quality bot are surfacing. What if we kept the browser version fixed, and only varied the version of the application? Could this help validate web applications independent of a functional spec and without custom validation script development and maintenance? Stay tuned...
Categories: Software Testing

Keynote Lineup for GTAC 2011

Fri, 08/03/2012 - 10:43
By James Whittaker

The call for proposals and participation is now closed. Over the next few weeks we will be announcing the full agenda and notifying accepted participants. In the meantime, the keynote lineup is now locked. It consists of two famous Googlers and two famous external speakers that I am very pleased to have join us.

Opening Keynote: Test is Dead by Alberto Savoia

The way most software is designed, developed and launched has changed dramatically over the last decade – but what about testing? Alberto Savoia believes that software testing as we knew it is dead – or at least moribund – in which case we should stick a fork in it and proactively take it out of its misery for good. In this opening keynote of biblical scope, Alberto will cast stones at the old test-mentality and will try his darnedest to agitate you and convince you that these days most testers should follow a new test-mentality, one which includes shifting their focus and priority from “Are we building it right?” to “Are we building the right it?” The subtitle of this year’s GTAC is “cloudy with a chance of tests,” and if anyone can gather the clouds into a hurricane, it's Alberto – it might be wise to bring your umbrella.

Alberto Savoia is Director of Engineering and Innovation Agitator at Google. In addition to leading several major product development efforts (including the launch of Google AdWords), Alberto has been a lifelong believer, champion, innovator and entrepreneur in the area of developer testing and test automation tools. He is a frequent keynote speaker and the author of many articles on testing, including the classic booklet “The Way of Testivus” and “Beautiful Tests” in O’Reilly’s Beautiful Code. His work in software development tools has won him several awards including the 2005 Wall Street Journal Technical Innovator Award, InfoWorld’s Technology of the Year award, and no less than four Software Development Magazine Jolt Awards.

Day 1 Closer: Redefining Security Vulnerabilities: How Attackers See Bugs by Herbert H. Thompson

Developers see features, testers see bugs, and attackers see “opportunities.” Those opportunities are expanding beyond buffer overflows, cross site scripting, etc. into logical bugs (and features) that allow attackers to use the information they find to exploit trusting users. For example, attackers can leverage a small information disclosure issue in an elaborate phishing attempt. When you add people in the mix, we need to reevaluate which “bugs” are actual security vulnerabilities. This talk is loaded with real world examples of how attackers are using software “features” and information tidbits (many of which come from bugs) to exploit the biggest weakness of all: trusting users.

Dr. Herbert H. Thompson is Chief Security Strategist at People Security and a world-renowned expert in application security. He has co-authored four books on the topic including, How to Break Software Security: Effective Techniques for Security Testing (with Dr. James Whittaker) and The Software Vulnerability Guide (with Scott Chase). In 2006 he was named one of the “Top 5 Most Influential Thinkers in IT Security” by SC Magazine. Thompson continually lends his perspective and expertise on secure software development and has been interviewed by top news organizations including CNN, MSNBC, BusinessWeek, Forbes, Associated Press, and the Washington Post. He is also Program Committee Chair for RSA Conference, the world’s leading information security gathering. He holds a Ph.D. in Applied Mathematics from Florida Institute of Technology, and is an adjunct professor in the Computer Science department at Columbia University in New York.

Day 2 Opener: Engineering Productivity: Accelerating Google Since 2006 by Patrick Copeland

Patrick Copeland is the founder and architect of Google's testing and productivity strategy and in this "mini keynote" he tells the story and relates the pain of taking a company from ad hoc testing practices to the pinnacle of what can be accomplished with a well oiled test engineering discipline.

Conference Closer: Secrets of World-Class Software Organizations by Steve McConnell

Construx consultants work with literally hundreds of software organizations each year. Among these organizations a few stand out as being truly world class. They are exceptional in their ability to meet their software development goals and exceptional in the contribution they make to their companies' overall business success. Do world class software organizations operate differently than average organizations? In Construx's experience, the answer is a resounding "YES." In this talk, award-winning author Steve McConnell reveals the technical, management, business, and cultural secrets that make a software organization world class.

Steve McConnell is CEO and Chief Software Engineer at Construx Software where he consults to a broad range of industries, teaches seminars, and oversees Construx’s software engineering practices. Steve is the author of Software Estimation: Demystifying the Black Art (2006), Code Complete (1993, 2004), Rapid Development (1996), Software Project Survival Guide (1998), and Professional Software Development (2004), as well as numerous technical articles. His books have won numerous awards for "Best Book of the Year," and readers of Software Development magazine named him one of the three most influential people in the software industry along with Bill Gates and Linus Torvalds.
Categories: Software Testing

Pretotyping: A Different Type of Testing

Fri, 08/03/2012 - 10:43
Have you ever poured your heart and soul and blood, sweat and tears to help test and perfect a product that, after launch, flopped miserably? Not because it was not working right (you tested the snot out of it), but because it was not the right product.

Are you currently wasting your time testing a new product or feature that, in the end, nobody will use?

Testing typically revolves around making sure that we have built something right. Testing activities can be roughly described as “verifying that something works as intended, or as specified.” This is critical. However, before we take steps and invest time and effort to make sure that something is built right, we should make sure that the thing we are testing, whether it’s a new feature or a whole new product, is the right thing to build in the first place.

Spending time, money and effort to test something that nobody ends up using is a waste of time.

For the past couple of years, I’ve been thinking about, and working on, a concept called pretotyping.

What is pretotyping? Here’s a somewhat formal definition – the dry and boring kind you’d find in a dictionary:

Pretotyping [pree-tuh-tahy-ping], verb: Testing the initial appeal and actual usage of a potential new product by simulating its core experience with the smallest possible investment of time and money.

Here’s a less formal definition:

Pretotyping is a way to test an idea quickly and inexpensively by creating extremely simplified, mocked or virtual versions of that product to help validate the premise that "If we build it, they will use it."

My favorite definition of pretotyping, however, is this:

Make sure – as quickly and as cheaply as you can – that you are building the right it before you build it right.
My thinking on pretotyping evolved from my positive experiences with Agile and Test Driven Development. Pretotyping takes some of the core ideas from these two models and applies them further upstream in the development cycle.

I’ve just finished writing the first draft of a booklet on pretotyping called “Pretotype It”.


You can download a PDF of the booklet from Google Docs or Scribd.

The "Pretotype It" booklet is itself a pretotype and test. I wrote this first-draft to test my (possibly optimistic) assumption that people would be interested in it, so please let me know what you think of it.

You can follow my pretotyping work on my pretotyping blog.

Posted by Alberto Savoia
Categories: Software Testing

Unleash the QualityBots

Fri, 08/03/2012 - 10:41
By Richard Bustamante

Are you a website developer that wants to know if Chrome updates will break your website before they reach the stable release channel? Have you ever wished there was an easy way to compare how your website appears in all channels of Chrome? Now you can!

QualityBots is a new open source tool for web developers created by the Web Testing team at Google. It’s a comparison tool that examines web pages across different Chrome channels using pixel-based DOM analysis. As new versions of Chrome are pushed, QualityBots serves as an early warning system for breakages. Additionally, it helps developers quickly and easily understand how their pages appear across Chrome channels.


QualityBots is built on top of Google AppEngine for the frontend and Amazon EC2 for the backend workers that crawl the web pages. Using QualityBots requires an Amazon EC2 account to run the virtual machines that will crawl public web pages with different versions of Chrome. The tool provides a web frontend where users can log on and request URLs that they want to crawl, see the results from the latest run on a dashboard, and drill down to get detailed information about what elements on the page are causing the trouble.

Developers and testers can use these results to identify sites that need attention due to a high amount of change and to highlight the pages that can be safely ignored when they render identically across Chrome channels. This saves time and the need for tedious compatibility testing of sites when nothing has changed.



We hope that interested website developers will take a deeper look and even join the project at the QualityBots project page. Feedback is more than welcome at qualitybots-discuss@googlegroups.com.

Posted by Ibrahim El Far, Web Testing Technologies Team (Eriel Thomas, Jason Stredwick, Richard Bustamante, and Tejas Shah are the members of the team that delivered this product)
Categories: Software Testing

Google Test Analytics - Now in Open Source

Fri, 08/03/2012 - 10:41

By Jim Reardon

The test plan is dead!

Well, hopefully.  At a STAR West session this past week, James Whittaker asked a group of test professionals about test plans.  His first question: “How many people here write test plans?”  About 80 hands shot up instantly, a vast majority of the room.  “How many of you get value or refer to them again after a week?”  Exactly three people raised their hands.

That’s a lot of time spent writing documents that are often long-winded and full of paragraphs of detail about a project everyone already knows, only to be abandoned so quickly.

A group of us at Google set about creating a methodology that can replace a test plan -- it needed to be comprehensive, quick, actionable, and have sustained value to a project.  In the past few weeks, James has posted a few blogs about this methodology, which we’ve called ACC.  It's a tool to break down a software product into its constituent parts, and the method by which we created "10 Minute Test Plans" (that only take 30 minutes!)

Comprehensive
The ACC methodology creates a matrix that describes your project completely; several projects that have used it internally at Google have found coverage areas that were missing in their conventional test plans.

Quick
The ACC methodology is fast; we’ve created ACC breakdowns for complex projects in under half an hour -- far faster than writing a conventional test plan.

Actionable
As part of your ACC breakdown, risk is assessed for the Capabilities of your application.  Using these values, you get a heat map of your project, showing the areas with the highest risk -- great places to spend some quality time testing.  (A minimal sketch of such a breakdown appears below.)

Sustained Value
We’ve built in some experimental features that bring your ACC test plan to life by importing data signals like bugs and test coverage that quantify the risk across your project.
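
To make the shape of an ACC breakdown concrete, here is a small, hypothetical example -- the names and weights are invented, and this is not Test Analytics' internal data model.  Attributes are the adjectives that describe your product, Components are its major nouns, and Capabilities describe what a Component does to deliver an Attribute, each with a risk rating:

  // A hypothetical ACC breakdown for an imaginary shopping site.
  var accMatrix = {
    attributes: ['Fast', 'Secure', 'Accurate'],
    components: ['Search', 'Checkout'],
    capabilities: [
      {component: 'Checkout', attribute: 'Secure',
       description: 'Encrypts payment details in transit', risk: 'high'},
      {component: 'Search', attribute: 'Fast',
       description: 'Returns results in under 200 ms', risk: 'medium'},
      {component: 'Search', attribute: 'Accurate',
       description: 'Ranks exact title matches first', risk: 'low'}
    ]
  };

  // Sum risk weights per Attribute/Component cell to get a simple heat map;
  // the highest-valued cells are the best places to spend testing time.
  function heatMap(acc) {
    var weights = {low: 1, medium: 2, high: 3};
    var map = {};
    acc.capabilities.forEach(function(cap) {
      var key = cap.component + ' x ' + cap.attribute;
      map[key] = (map[key] || 0) + weights[cap.risk];
    });
    return map;
  }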

Today, I'm happy to announce we're open sourcing Test Analytics, a tool built at Google to make generating an ACC simple -- and which brings some experimental ideas we had around the field of risk-based testing that work hand-in-hand with the ACC breakdown.

Defining a project’s ACC model.
Test Analytics has two main parts: first and foremost, it's a step-by-step tool to create an ACC matrix that's faster and much simpler than the Google Spreadsheets we used before the tool existed.  It also provides visualizations of the matrix and risks associated with your ACC Capabilities that were difficult or impossible to do in a simple spreadsheet.

A project’s Capabilities grid.
The second part is taking the ACC plan and making it a living, automatically updating risk matrix.  Test Analytics does this by importing quality signals from your project: Bugs, Test Cases, Test Results, and Code Changes.  By importing this data, Test Analytics lets you visualize risk that isn't just estimated or guessed, but based on quantitative values.  If a Component or Capability in your project has seen a lot of code change, or many of its bugs are still open or not verified as fixed, the risk in that area is higher.  Test Results mitigate those risks -- if you run tests and import passing results, the risk in an area gets lower as you test.
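
As a rough illustration of how those signals could combine, a per-Capability score might look like the sketch below.  The weights are invented for the example; Test Analytics' own calculation is experimental and may differ:

  // Hypothetical risk score for one Capability: open or unverified bugs and
  // recent code churn push risk up, passing test results pull it back down.
  function capabilityRisk(signals) {
    var score = signals.inherentRisk        // the team's own estimate
        + 2.0 * signals.openBugs            // unresolved or unverified bugs
        + 0.5 * signals.recentCodeChanges   // changes touching this area
        - 1.0 * signals.passingTestResults; // mitigation from green tests
    return Math.max(0, score);
  }

  // Lots of churn and open bugs with few passing tests => high risk.
  capabilityRisk({inherentRisk: 5, openBugs: 4,
                  recentCodeChanges: 10, passingTestResults: 3});  // 15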

A project’s risk, calculated as a factor of inherent risk as well as imported quality signals.
This part's still experimental; we're playing around with how best to calculate risk from these signals.  However, we wanted to release this functionality early so we can get feedback from the testing community on how well it works for teams, and iterate to make the tool even more useful.  It'd also be great to import even more quality signals: code complexity, static code analysis, code coverage, external user feedback, and more are all ideas we've had that could add an even higher level of dynamic data to your test plan.

An overview of test results, bugs, and code changes attributed to a project’s capability.  The Capability’s total risk is affected by these factors.
You can check out a live hosted version, browse or check out the code along with its documentation, and of course let us know if you have any feedback -- there's a Google Group set up for discussion, where we'll be active in responding to questions and sharing our experiences with Test Analytics so far.

Long live the test plan!
Categories: Software Testing

GTAC Videos Now Available

Fri, 08/03/2012 - 10:40
By James Whittaker

All the GTAC 2011 talks are now available at http://www.gtac.biz/talks and also up on YouTube. A hearty thanks to all the speakers who helped make this the best GTAC ever.

Enjoy!
Categories: Software Testing

Google JS Test, now in Open Source

Fri, 08/03/2012 - 10:26

By Aaron Jacobs
Google JS Test is a JavaScript unit testing framework that runs on the V8 JavaScript Engine, the same open source project that is responsible for Google Chrome’s super-fast JS execution speed. Google JS Test is used internally by several Google projects, and we’re pleased to announce that it has been released as an open source project.

Features of Google JS Test include:
  • Extremely fast startup and execution time, without needing to run a browser.
  • Clean, readable output in the case of both passing and failing tests.
  • An optional browser-based test runner that can simply be refreshed whenever JS is changed.
  • Style and semantics that resemble Google Test for C++.
  • A built-in mocking framework that requires minimal boilerplate code (e.g. no $tearDown or $verifyAll calls), with style and semantics based on the Google C++ Mocking Framework.
  • A system of matchers allowing for expressive tests and easy to read failure output, with many built-in matchers and the ability for the user to add their own.

See the Google JS Test project home page for a quick introduction, and the getting started page for a tutorial that will teach you the basics in just a few minutes.
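
To give a flavor of the style described above, here is a short sketch of what a suite can look like. Treat the names used here (registerTestSuite, expectEq, expectThat, elementsAre) as illustrative; the getting started page has the authoritative syntax.

  // Illustrative sketch of a Google JS Test suite.
  function StackTest() {
    this.stack_ = [];  // per-test state lives on the suite object
  }
  registerTestSuite(StackTest);

  StackTest.prototype.startsEmpty = function() {
    expectEq(0, this.stack_.length);
  };

  StackTest.prototype.pushAppendsToTheEnd = function() {
    this.stack_.push(1);
    this.stack_.push(2);
    expectThat(this.stack_, elementsAre([1, 2]));
  };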
Categories: Software Testing

Take a BITE out of Bugs and Redundant Labor

Fri, 08/03/2012 - 10:25
In a time when more and more of the web is becoming streamlined, the process of filing bugs for websites remains tedious and manual. Find an issue. Switch to your bug system window. Fill out boilerplate descriptions of the problem. Switch back to the browser, take a screenshot, attach it to the issue. Type some more descriptions. The whole process is one of context switching: jumping between the tools used to file the bug, to gather information about it, and to highlight problematic areas pulls most of your focus as the tester away from the very application you’re trying to test.

The Browser Integrated Testing Environment, or BITE, is an open source Chrome Extension which aims to fix the manual web testing experience. To use the extension, it must be linked to a server providing information about bugs and tests in your system. BITE then provides the ability to file bugs from the context of a website, using relevant templates.

When filing a bug, BITE automatically grabs screenshots, links, and problematic UI elements and attaches them to the bug. This gives developers charged with investigating and/or fixing the bug a wealth of information to help them determine root causes and factors in the behavior.
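
As a purely hypothetical sketch (not BITE's actual report schema), the kind of context such a report can carry automatically might look like this:

  // Hypothetical example of an auto-populated bug report payload.
  var bugReport = {
    title: 'Search button overlaps the footer at narrow widths',
    url: window.location.href,               // page where the bug was filed
    userAgent: navigator.userAgent,           // browser and OS details
    screenshot: '(base64-encoded PNG captured from the current tab)',
    highlightedElement: '#search-button',     // the problematic UI element
    template: 'layout-bug'                    // site-specific template name
  };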

When it comes to reproducing a bug, testers will often labor to remember and accurately record the exact steps taken. With BITE, however, every action the tester takes on the page is recorded in JavaScript and can be played back later. This enables engineers to quickly determine whether the steps of a bug reproduce in a specific environment, or whether a code change has resolved the issue.

Also included in BITE is a Record/Playback (RPF) console to automate user actions in a manual test. Like the BITE recording experience, the RPF console automatically authors JavaScript that can be used to replay your actions at a later date. BITE’s record and playback mechanism is also fault tolerant: UI automation tests fail from time to time, and when they do, it tends to be because of test issues rather than product issues. To that end, when a BITE playback fails, the tester can fix the recording in real time, just by repeating the action on the page. There’s no need to touch code or report a failing test; if your script can’t find a button to click on, just click on it again, and the script will be fixed! For those times when you do have to touch the code, we’ve used Ace (http://ace.ajax.org/) as an inline editor, so you can make changes to your JavaScript in real time.
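
The record-and-replay idea itself is simple; the sketch below (hypothetical, not BITE's recording format) replays captured steps and marks the point where a fault-tolerant runner would let you re-record a step instead of failing the run:

  // Hypothetical recording: each step captured as the tester works.
  var recordedSteps = [
    {action: 'type',  selector: 'input[name="q"]', value: 'chromebook'},
    {action: 'click', selector: '#search-button'}
  ];

  function replay(steps) {
    steps.forEach(function(step) {
      var element = document.querySelector(step.selector);
      if (!element) {
        // A fault-tolerant runner would pause here and let the tester
        // repeat the action on the page to repair the recording.
        throw new Error('No element matches ' + step.selector);
      }
      if (step.action === 'click') {
        element.click();
      } else if (step.action === 'type') {
        element.value = step.value;
        element.dispatchEvent(new Event('input', {bubbles: true}));
      }
    });
  }

  replay(recordedSteps);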

Check out the BITE project page at http://code.google.com/p/bite-project. Feedback is welcome at bite-feedback@google.com. Posted by Joe Allan Muharsky from the Web Testing Technologies Team (Jason Stredwick, Julie Ralph, Po Hu and Richard Bustamante are the members of the team that delivered the product).
Categories: Software Testing