Feed aggregator

Defending Network Performance with Packet Level Detail

My favorite war room accusation is: “It’s always the network at fault!” Whether you’re the one taking the blame or the one pointing the finger likely has everything to do with which seat you occupy in that war room. I suppose that comes with the territory, because at the same time there seems to be […]

The post Defending Network Performance with Packet Level Detail appeared first on Compuware APM Blog.

Categories: Load & Perf Testing

QCon Rio – Why Bandwidth Doesn’t Matter

LoadImpact - Fri, 09/19/2014 - 02:29

We’ll be speaking at this year’s QCon conference in Rio on September 24th and 25th and hope to see you there! Our founder, Ragnar Lönn, will be giving a presentation on “Why bandwidth doesn’t matter – why network delay will always limit web performance and what to do about it” that promises to lead to many interesting discussions. ... Read more

The post QCon Rio – Why Bandwidth Doesn’t Matter appeared first on Load Impact Blog.

Categories: Load & Perf Testing

Try Sending It Left

Eric Jacobson's Software Testing Blog - Wed, 09/17/2014 - 14:32

About five years ago, my tester friend, Alex Kell, blew my mind by cockily declaring, “Why would you ever log a bug?  Just send the Story back.”

Okay.

My dev team uses a Kanban board that includes “In Testing” and “In Development” columns. Sometimes bug reports are created against Stories. But other times Stories are just sent left; for example, a Story “In Testing” may have its status changed to “In Development”, like Alex Kell’s maneuver above. This is normally done using the Dead Horse When-To-Stop-A-Test Heuristic. We could also send an “In Development” story left if we decide the business rules need to be firmed up before coding can continue.

So how does one know when to log a bug report vs. send it left?

I proposed the following heuristic to my team today:

If the Acceptance Test Criteria (listed on the Story card) are violated, send it left. It seems to me that logging a bug report for something already stated in the Story (e.g., Feature, Work Item, Spec) is mostly a waste of time.

Thoughts?

Categories: Software Testing

Cyber Monday and the Impact of Web Performance

LoadStorm - Wed, 09/17/2014 - 09:44
Cyber Monday is a pretty big deal for online retailers.

In fact, the previous statement is much too modest. Cyber Monday is the biggest day of the year for ecommerce in the United States and beyond. As the statistics show, Cyber Monday has become a billion dollar juggernaut since 2010 – and it has only continued to grow. Last year alone, Cyber Monday was responsible for over $1.7 billion spent by online consumers in the US, a shocking 18% jump from the year before!

Since its inception in 2005, the Monday after Thanksgiving has become a potential goldmine for those with an online presence, helping savvy businesses that take advantage of the promotion to significantly boost revenue during the Christmas period. The “cannot-be-missed” deals are important to any Cyber Monday campaign, but having the website ready to maintain consistent, fast performance through the traffic rush is absolutely critical.

An unprepared business may expect an increase in business on Cyber Monday but overlook the fact that more visitors means more strain on the performance side of its website. And the more strain on a website, the more likely it is to falter when it matters most.

How web performance can cause your Cyber Monday to crash

During the mad rush of consumers looking to snap up some bargain deals, your website has to be prepared for the sudden visitor increase – otherwise your Cyber Monday will crumble before your eyes.

Last year, Cyber Monday website crashes cost several large companies thousands of dollars in revenue. Motorola was offering a special price on its new Moto X, but the site was not prepared for the rush of traffic the deal would bring. Many customers experienced a very slow website and errors showing prices without the discount, and then the website crashed entirely.

In addition to losing customers who would have otherwise purchased that weekend, Motorola also had to deal with the PR aftermath, as unhappy would-be customers and the tech media took to social media with critical tweets.

In an effort to mitigate the damage, Motorola’s CEO issued a public statement.

Moral of the story? Motorola lost thousands of dollars in sales and thousands of potential new customers forever, all of which could have been avoided if load and performance testing had been performed early. Had they load tested, Motorola would have been aware of the problems, found the causes, and fixed them before real users experienced them.

While many companies didn’t see full website crashes like Motorola, the rush of traffic still led to painfully slow websites and therefore a loss in revenue. A website must not only remain up and available, but also remain fast to navigate. Just think of the number of pages a potential customer might have to go through on your website. Now imagine if there were delays between each page loading. Internet users are an impatient bunch: a one-second delay can cause a 7% decrease in conversions and 11% fewer page views, and 74% of people will leave a mobile site if the delay is longer than five seconds!

Clearly, ensuring your website stays up, stable and fast is imperative to maximizing profits for your business this Cyber Monday. The last thing you want is to miss out on the most important day of the year for ecommerce and hand your competitors an opening to snag that business.

The stakes are high; you’ve got to make sure you are suitably prepared for the rush.

The post Cyber Monday and the Impact of Web Performance appeared first on LoadStorm.

Progressive image rendering: Good or evil?

Web Performance Today - Wed, 09/17/2014 - 09:03

A funny thing about the latest research we’ve just released at Radware is that, depending on whom I talk with about it, reactions range from “Wow, that’s amazing!” to “You studied what? Why?”

In case you’re in the second camp, let me give a bit of back story…

A bit of back story

Today, a typical web page is enormous. According to the HTTP Archive, the average page is 1860 KB in size — 1180 KB of which is images. That’s 63%. That’s huge. There are a lot of ways we can go wrong with images, resulting in pages that take much longer than necessary to render. One of the ways we can go wrong (or at least not go as right as we could) is by using images that are in the wrong format.

For the past fifteen years or so, there’s been a LOT of debate about which image format delivers the better user experience: baseline images or progressive JPEGs. (For the uninitiated, baseline images are images that load from top to bottom, line by line. Progressive images load in layers, starting with a blurry image, then finishing with a high-resolution layer.)
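
As a concrete illustration (my own sketch, not part of the article), the two formats can be produced from the same source image with the sharp library for Node.js; the file names here are placeholders:

// A minimal sketch: encode one source image as both a baseline and a progressive
// JPEG using the sharp library for Node.js. "hero.png" and the output names are
// placeholders.
const sharp = require("sharp");

async function encodeBothVariants() {
  await sharp("hero.png")
    .jpeg({ quality: 80, progressive: false })  // baseline: renders top to bottom
    .toFile("hero-baseline.jpg");
  await sharp("hero.png")
    .jpeg({ quality: 80, progressive: true })   // progressive: renders in passes
    .toFile("hero-progressive.jpg");
}

encodeBothVariants().catch(console.error);

Comparing the two output files is also a quick way to check the size claims below against your own images.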

There’s good evidence (here and here) that progressive images can actually be smaller than baseline images. There’s also the intuitive rationale that serving something to users more quickly — even if it’s a low-res version of the image — is better than making users wait longer.

But despite this combination of evidence and intuition, the adoption rate for progressive JPEGs is quite low. Depending on which report you read, it ranges from just 5 to 7%. The reason for this could be anti-progressive bias, or it could just be oversight — it’s impossible to say for sure.

At Radware, we wanted to end the highly subjective, largely data-free debate over whether progressive images help or hurt the user experience. We had several good reasons:

  • To potentially save tons of developer time in manually optimizing images
  • To guide feature development in automated front-end optimization tools (like ours!)
  • Ultimately, to serve the best — and fastest — experience to users
How we did it

Working with NeuroStrata, a neuroscience research firm we’ve worked with in the past, we developed a threefold study to begin to answer these questions:

  • Do progressive JPEGs actually deliver a better user experience than baseline images?
  • How does PerfectImage (a WebP-based format that we’re developing here at Radware) measure up against Progressive JPEGs and baseline images?
  • Do users perceive images differently depending on whether they’re engaged in a task that’s text-based (i.e. does not rely on images for completion) or visual-based (i.e. relies on images for completion)?

As I mentioned above, our study was threefold. Using a total of 742 study participants, we applied three different neuroscientific approaches:

  • Facial Action Coding – Measures moment-by-moment emotional responses in facial expressions
  • Facial Heart Beat Monitoring – Measures heart rate by detecting micro color changes in the face caused by heartbeat
  • Implicit Response Test – Extracts relative measures of frustration and emotional engagement

We built mockups of pages depicting user flows through five different websites: Gap, Moonpig, YouTube, Vodafone, and Amazon. (Important: These were mockups, in which we selected and formatted the images ourselves. We did not test on the actual sites.) We then created three versions of each site, using three image formats:

  • Original — Standard baseline image (GIF, JPEG, PNG)
  • Progressive JPEG (PJ) — Image is downloaded in lower resolution, displayed, then ‘progressively’ downloaded and redisplayed until the full resolution is shown
  • PerfectImage (PI) — Lossy compressed WebP image that is degraded until the SSIM (Structural SIMilarity) index is 0.985 compared to the original
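
For illustration only (this is a sketch, not Radware’s actual pipeline), the SSIM-targeting idea can be expressed as a simple quality-stepping loop. It assumes the sharp library and a hypothetical computeSsim() helper that returns a similarity score between 0 and 1:

// Rough sketch, not Radware's implementation: keep lowering WebP quality while the
// structural similarity to the original stays at or above the 0.985 target.
// computeSsim() is a hypothetical helper; inputPath/outputPath are placeholders.
const fs = require("fs");
const sharp = require("sharp");

async function compressToSsimTarget(inputPath, outputPath, targetSsim) {
  const original = await sharp(inputPath).raw().toBuffer({ resolveWithObject: true });
  let best = null;
  for (let quality = 95; quality >= 40; quality -= 5) {
    const webpBuffer = await sharp(inputPath).webp({ quality }).toBuffer();
    const decoded = await sharp(webpBuffer).raw().toBuffer({ resolveWithObject: true });
    const score = computeSsim(original, decoded);  // hypothetical SSIM helper
    if (score < targetSsim) break;                 // degraded too far; stop here
    best = webpBuffer;                             // still above target; keep this one
  }
  if (best) fs.writeFileSync(outputPath, best);
}

compressToSsimTarget("product.png", "product.webp", 0.985).catch(console.error);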

Participants were randomly served videos depicting one image format per site, meaning they never saw the same site twice. All tests were done remotely, with subjects participating via their own webcam-enabled computers. We recorded videos of each participant, ran them through the neuro software, and aggregated and analyzed the results. In designing the study and in our analysis, we focused on the ‘Happiness’ metric (one of six universal facial expressions) as the metric most closely aligned with user engagement.

A few of our findings

It’s impossible to detail all our findings in a single blog post, but here are some of the standout points:

1. In the Facial Action Coding test, the highest overall levels of Emotion [Happiness] were evoked with PerfectImage (PI).

2. Also in the Facial Action Coding test, Original either tied or outperformed Progressive JPEG across all sites.

3. Heart rate largely correlated to the facial action coding results across all three image formats.

4. In the Implicit Response test, PerfectImage was the preferred format overall. Progressive JPEG was the least preferred by a significant margin.

5. Also in the Implicit Response test, the Progressive JPEG format was least preferred across all task types — both text-based and visual.

Interpreting these results

The value in tackling this question using three different test methodologies is that we’re able to see that our findings (e.g. PerfectImage is most preferred, Progressive JPEG is least preferred) are consistent and reproducible. We asked Dr. David Lewis (Chair of Mindlab International, a leader in the neuroscience of consumerism and communications) for his interpretation of our test results. He gave us a really great, comprehensive answer, which we’ve included as an appendix in our report, but I’ll include my favourite excerpt here:

When, as with the Progressive JPEG method, image rendition is a two-stage process in which an initially coarse image snaps into sharp focus, cognitive fluency is inhibited and the brain has to work slightly harder to make sense of what is being displayed.

Or, as my colleague Kent Alstad puts it, our brains want a great big “EASY” button for everything they do. As soon as we make things even a little bit harder to process, we increase frustration.

Takeaways

This research answers some questions and, not surprisingly, raises new ones. The most critical takeaway here is that users are extremely sensitive to how images render. We need to do everything in our power to make images load as quickly, clearly, and simply as possible.

To answer a question that I’ve already been asked many times, PerfectImage (the preferred format in our test) is still in the R&D phase here at Radware. We’ll share more as we get further along. But in the meantime, there are still many image optimization techniques — compression, consolidation, deferral, preloading, leveraging the browser cache, to name a few — that will help images render faster, which will help pages load faster. The ultimate payoff is happier users.

Download the report: Progressive Image Rendering: Good or Evil?

The post Progressive image rendering: Good or evil? appeared first on Web Performance Today.

How Bad Performance Impacts Ecommerce Sales (Part I)

LoadImpact - Wed, 09/17/2014 - 07:11

As we approach the critical holiday period – where as much as 18% of shopping carts are abandoned due to slow websites – it’s time to discuss how bad performance can impact e-commerce sales and provide you with real-world examples and practical steps on how to improve performance.

Why is website performance important?

A decade ago, the number of businesses selling online was relatively low. Nowadays, those that don’t sell online are a dwindling minority. Due to the ubiquitous nature of the Internet in our modern day life, the marketplace for online sales is huge, and so is the amount of competition.

Consumers are spoilt for choice and aren’t afraid to shop around. Serve up a sluggish website, and visitors will go elsewhere without hesitation. A slow e-commerce website means you’ll lose individual sales as well as any repeat business that may have come from those initial sales.

Load Impact did a study on this in 2012 and found 53% of e-commerce site owners lost money or visitors due to poor performance or stability on their site. 

That’s the important point here. If you’re looking to grow a business through online sales, a badly performing website will not only hinder short-term sales, but it will seriously hurt your chances of long-term growth.

There are statistics to back this up. The correlation between website speed and conversion rates / revenue has often been documented internally within organizations. I have seen this firsthand for several of my e-commerce clients. The positive impact of a fast website can be dramatic, even for relatively small online retailers.

When it comes to the giants of online retail, you get to appreciate how massive an impact website speed can have. All the way back in 2006, Amazon evidently reported that a 100-millisecond increase in page speed translated to a 1% increase in its revenue. (source)

Former Amazon employee Greg Linden also alluded to this on his blog:

“In A/B tests, we tried delaying the page in increments of 100 milliseconds and found that even very small delays would result in substantial and costly drops in revenue.”

Avoid misdiagnosis by measuring

As we’ve already established, speed is a critical part of website usability, and it differentiates average businesses from great ones. A common problem for small and medium-sized businesses is a lack of awareness of website performance and how important a factor it is.

It is easy to assume that a website is performing acceptably because it is bringing in sales. If a website isn’t bringing in any sales, it can be easy to assume that it needs a re-design or simply more traffic needs to be driven to it. These are dangerous assumptions to make without any evidence to back them up.

So how do you get the evidence you need? Measure, measure, measure! Sound like too much hard work? Consider the risks of not measuring:

  • Bad customer experience = bad reputation.
  • You may spend a significant budget on re-designing your website because “it’s not working” when actually all it needed was a performance audit.
  • You may increase your Pay Per Click (PPC) advertising budget to push more traffic to the website, but this just creates disgruntled visitors instead of happy customers. Indeed, this would equate to pouring money into the proverbial “leaky bucket”. Worse than that though, the more money you spend, the more disgruntled visitors you create!
How to measure website performance

For those new to the concept of performance monitoring, measuring the speed of a website may seem a lot simpler than it is.

The seemingly obvious way to measure how quickly a website is loading is to ping the homepage and… well… see how long it takes to load! If it loads in 3 seconds, great. If it loads in 10 seconds, not so great.

While that incredibly simple test is a good indicator in itself, it doesn’t come near to giving a complete picture of your website performance. There are many factors you need to consider if you want to get a true measurement of a website’s speed. Here are just a few:

  • Site-wide performance – The homepage is just one page. Testing how quickly the homepage loads ignores the rest of the website, which could perform completely differently.
  • Performance under load – If your website performs fine with 2 concurrent users but falls over with 10 concurrent users, you have a problem.
  • Geographical location – The website may perform acceptably from some countries but not from others. A test from a single location doesn’t reveal the website’s performance from different locations around the world.
  • Real user behavior – Real users behave differently. Some will land on a page and then leave immediately (known as a “bounce”); some will visit several pages looking for information; some will submit a contact form or complete a checkout process if the website sells products online. In short, the best measure of a website’s performance is under user load that is representative of real-world scenarios.

So the best load tests consist of multiple pages being tested by multiple concurrent virtual users who behave in different ways and originate from multiple geographical locations around the world.
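
As a very small first step (my own sketch, not a Load Impact feature), even a few lines of Node.js can time several representative pages rather than just the homepage. Note that this only measures server response times from one location for one user; it says nothing about rendering or behavior under concurrent load:

// A minimal sketch: time several representative pages, not just the homepage.
// Assumes Node.js 18+ (built-in fetch); the URLs below are placeholders.
const pages = [
  "https://example.com/",
  "https://example.com/category/shoes",
  "https://example.com/product/1234",
  "https://example.com/checkout",
];

async function timePages() {
  for (const url of pages) {
    const start = Date.now();
    const response = await fetch(url);
    await response.arrayBuffer();  // wait for the full body, not just the headers
    console.log(url + " -> " + response.status + " in " + (Date.now() - start) + " ms");
  }
}

timePages().catch(console.error);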

But how do you decide what type of user scenarios to set up? In other words, what kind of user behavior do you want to mimic? This is where website traffic statistics come in useful. Tools such as Google Analytics will show you how your current visitors behave.

If you have a 10% conversion rate on your contact form page, then it would make sense to create a user scenario that mimics this and have 10% of your generated load use this scenario. Understand your current audience, and build up a set of user scenarios that are representative of their behavior, broadly speaking.
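
For example, a scenario mix derived from your analytics might be expressed as simple weights and used to split the generated load. This is a sketch with made-up names and percentages, not a specific Load Impact configuration:

// Sketch: allocate virtual users to scenarios using weights taken from analytics.
// The scenario names and percentages below are made-up examples.
const scenarios = [
  { name: "bounce on landing page", weight: 0.40 },
  { name: "browse several product pages", weight: 0.45 },
  { name: "submit contact form", weight: 0.10 },
  { name: "complete checkout", weight: 0.05 },
];

function allocateVirtualUsers(totalUsers) {
  return scenarios.map(function (s) {
    return { scenario: s.name, users: Math.round(totalUsers * s.weight) };
  });
}

console.log(allocateVirtualUsers(1000));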

Understanding the technical terminology is also an important first step before trying to set up your own load tests.

Load Impact provides a handy resource that will help you get your head around terms such as “ramping up”, “ramping down”, “virtual users”, “accumulated load time”, “load test execution plan”, “user scenarios”, etc.

In part II of this article, I will delve into some real world website performance metrics as well as practical ways you can improve the performance of your website. Make sure to check back in next week.

 

Categories: Load & Perf Testing

The Weasel Returns

Alan Page - Tue, 09/16/2014 - 12:05

I’m back home and back at work…and slowly getting used to both. It was by far the best (and longest) vacation I’ve had (or will probably ever have). While it’s good to be home, it’s a bit weird getting used to working again after so much time off (just over 8 weeks in total).

But – a lot happened while I was gone that’s probably worth at least a comment or two.

Borg News

Microsoft went a little crazy while I was out, but the moves (layoffs, strategy, etc.) make sense – as long as the execution is right. I feel bad for a lot of my friends who were part of the layoffs, and hope they’re all doing well by now (and if any are reading this, I almost always have job leads to share).

ISO 29119

A lot of testers I know are riled up (and rightfully so) about ISO 29119 – which, in a nutshell, is a “standard” for testing that says software should be tested exactly as described in a set of textbooks from the 1980’s. On one hand, I have the flexibility to ignore 29119 – I would never work for a company that thought it was a good idea. But I know there are testers who find themselves in a situation where they have to follow a bunch of busywork from the “standard” rather than provide actual value to the software project.

As for me…

Honestly, I have to say that I don’t think of myself as a tester these days. I know a lot about testing and quality (and use that to help the team), but the more I work on software, the more I realize that thinking of testing as a separate and distinct activity from software development is a road to ruin. This thought is at least partially what makes me want to dismiss 29119 entirely – from what I’ve seen, 29119 is all about a test team taking a product that someone else developed, and doing a bunch of ass-covering while trying to test quality into the product. That approach to software development doesn’t interest me at all.

I talked with a recruiter recently (I always keep my options open) who was looking for someone to “architect and build their QA infrastructure”. I told them that I’d talk to them about it if they were interested, but that my goal in the interview would be to talk them out of doing that and give them some ideas on how to better spend that money.

I didn’t hear back.

Podcast?!?

It’s also been a long hiatus from AB Testing. Brent and I are planning to record on Friday, and expect to have a new episode published by Monday September 22, and get back on our every-two-week schedule.

Categories: Software Testing

Detecting Bad Deployments on Resource Impact and Not Response Time: Hotspot Garbage Collection

This story came in from Joseph – one of our fellow dynaTrace users and a performance engineer at a large fleet management service company. Their fleet management software runs on .NET,  is developed in-house, is load tested with JMeter and monitored in Production with dynaTrace. A usage and configuration change of their dependency injection library […]

The post Detecting Bad Deployments on Resource Impact and Not Response Time: Hotspot Garbage Collection appeared first on Compuware APM Blog.

Categories: Load & Perf Testing

EXCLUSIVE WEBINAR: Lifting web performance from the tactical to the strategic

LoadImpact - Mon, 09/15/2014 - 09:29

Join us September 23rd @2pm PDT for an exclusive webinar with Load Impact’s Head of Professional Services – Michael Sjölin. 

About the webinar:

It can often be difficult for IT professionals tasked with developing, testing and maintaining technical resources to translate business requirements into reliable test configurations. Moreover, once testing is complete, it’s often even more challenging to map test results – such as web performance data – against general business requirements.

In this 40-minute webinar (including time for Q&A), Load Impact’s Head of Professional Services – Michael Sjölin – who has over 20 years of experience in performance testing and QA, will demonstrate how the best QA, DevOps and IT managers tackle these challenges.

Using case studies and real-world examples, Michael will show how to convert business objectives into realistic test configurations and how to translate test results into terms business professionals will understand.

The goal of this webinar is to help you lift web performance from the tactical to the strategic within your company; to help you effectively champion the importance of web performance optimization among non-technical staff and business executives.

About Michael: 

Michael is an IT consultant with 20+ years of experience in test management, performance testing, growth management, online media services and international management. His particular areas of expertise include test management and performance optimization within the Banking, Defense, Telecom and Automotive industries. Read more about Michael’s experience on LinkedIn.

 

 

Categories: Load & Perf Testing

QA Music: Like a Tester

QA Hates You - Mon, 09/15/2014 - 04:00

Like a Storm, “Love the Way You Hate Me”:

Is that a didgeridoo in a rock song? Yes.

Categories: Software Testing

The Software Tester's Greatest Asset

Randy Rice's Software Testing & Quality - Fri, 09/12/2014 - 09:13
I interact with thousands of testers each year. In some cases it’s in a classroom setting; in others, it may be over a cup of coffee. Sometimes, people dialog with me through this blog, my website or my Facebook page.

The thing I sense most from testers that are "stuck" in their career or just in their ability to solve problems is that they have closed minds to other ways of doing things. Perhaps they have bought into a certain philosophy of testing, or learned testing from someone who really wasn't that good at testing.

In my observation, the current testing field is fragmented into a variety of camps, such as those that like structure, or those that reject any form of structure. There are those that insist their way is the only way to perform testing. That's unfortunate - not the debate, but the ideology.

The reality is there are many ways to perform testing. It's also easy to use the wrong approach on a particular project or task. It's the old Maslow "law of the instrument" that says, "I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail."

Let me digress for a moment...

I enjoy working on cars, even though it can be a very time-consuming, dirty and frustrating experience. I've been working on my own cars for over 40 years now. I've learned little tricks along the way to remove rusted and frozen bolts. I have a lot of tools - wrenches, sockets, hammers...you name it. The most helpful tool I own is a 2-foot piece of pipe. No, I don't hit the car with it! I use it for leverage. (I can also use it for self defense, but that's another story.) It cost me five dollars, but has saved me many hours of time. Yeah, a lowly piece of pipe slipped over the end of a wrench can do wonders.

The funny thing is that I worked on cars for many years without knowing that old mechanic's trick. It makes me wonder how many other things I don't know.

Here's the challenge...

Are you open to other ways of doing things, even if you personally don’t like them?

For example, if you needed to follow a testing standard, would that make you storm out of the room in a huff?

Or, if you had to do exploratory testing, would that cause you to break out in hives?

Or, if your employer mandated that the entire test team (including you) get a certification, would you quit?

I'm not suggesting you abandon your principles or beliefs about testing. I am suggesting that in the things we reject out of hand, there could be just the solution you are looking for.

The best thing a tester does is to look at things objectively, with an open mind. When we jump to conclusions too soon, we may very well find ourselves in a position where we have lost our objectivity.

As a tester, your greatest asset is an open mind. Look at the problem from various angles. Consider the pros and cons of things, realizing that even your list of pros and cons can be skewed. Then, you can work in many contexts and also enjoy the journey.


Categories: Software Testing

Onload in Onload

Steve Souders - Fri, 09/12/2014 - 08:58
or “Why you should use document.readyState”

I asked several web devs what happens if an onload handler adds another onload handler. Does the second onload handler execute?

The onload event has already fired, so it might be too late for the second onload to get triggered. On the other hand, the onload phase isn’t over (we’re between loadEventStart and loadEventEnd in Navigation Timing terms), so there might be a chance the second onload handler could be added to a queue and executed at the end.
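
(An aside of mine, not from the post: you can inspect those two Navigation Timing marks yourself in any browser that supports them.)

// Inspecting the Navigation Timing marks mentioned above. loadEventEnd is still 0
// while onload handlers are running and is only filled in once the onload phase
// completes, so read it a tick later.
window.addEventListener("load", function() {
    setTimeout(function() {
        var t = window.performance.timing;
        console.log("loadEventStart:", t.loadEventStart);
        console.log("loadEventEnd:", t.loadEventEnd);
        console.log("onload phase took", t.loadEventEnd - t.loadEventStart, "ms");
    }, 0);
}, false);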

None of the people I asked knew the answer, but we all had a guess. I’ll explain in a minute why this is important, but until then settle on your answer – do you think the second onload executes?

To answer this question I created the Onload in Onload test page. It sets an initial onload handler. In that first onload handler a second onload handler is added. Here’s the code:

function addOnload(callback) {
    if ( "undefined" != typeof(window.attachEvent) ) {
        return window.attachEvent("onload", callback);
    }
    else if ( window.addEventListener ) {
        return window.addEventListener("load", callback, false);
    }
}

function onload1() {
    document.getElementById('results').innerHTML += "First onload executed.";
    addOnload(onload2);
}

function onload2() {
    document.getElementById('results').innerHTML += "Second onload executed.";
}

addOnload(onload1);

I created a Browserscope user test to record the results and tweeted asking people to run the test. Thanks to crowdsourcing we have results from dozens of browsers. So far no browser executes the second onload handler.

Why is this important?

There’s increasing awareness of the negative impact scripts have on page load times. Many websites are following the performance best practice of loading scripts asynchronously. While this is a fantastic change that makes pages render more quickly, it’s still possible for an asynchronous script to make pages slower because onload doesn’t fire until all asynchronous scripts are done downloading and executing.

To further mitigate the negative performance impact of scripts, some websites have moved to loading scripts in an onload handler. The problem is that the scripts being moved to the onload handler are often third party scripts. Combine this with the fact that many third party scripts, especially metrics scripts, kick off their execution via an onload handler. The end result is we’re loading scripts that include an onload handler in an onload handler. We know from the test results above that this results in the second onload handler not being executed, which means the third party script won’t complete all of its functionality.
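
For readers who haven’t seen the pattern, here is a minimal sketch (my illustration, with a placeholder URL) of the page-owner side: deferring a third party script until onload. Any script loaded this way starts executing after onload, so if it registers its own onload handler internally that handler will never run, which is exactly why the Episodes snippet below checks document.readyState.

// Sketch with a placeholder URL: defer a third party script until the onload
// event so it cannot delay page load.
function loadScriptAfterOnload(src) {
    var inject = function() {
        var script = document.createElement("script");
        script.src = src;
        script.async = true;
        document.getElementsByTagName("head")[0].appendChild(script);
    };
    if ( "complete" == document.readyState ) {
        inject();  // onload already fired - inject immediately
    } else if ( window.addEventListener ) {
        window.addEventListener("load", inject, false);
    } else if ( window.attachEvent ) {
        window.attachEvent("onload", inject);
    }
}

loadScriptAfterOnload("http://thirdparty.example.com/metrics.js");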

Scripts (especially third party scripts) that use onload handlers should therefore check if the onload event has already fired. If it has, then rather than using an onload handler, the script execution should start immediately. A good example of this is my Episodes RUM library. Previously I initiated gathering of the RUM metrics via an onload handler, but now episodes.js also checks document.readyState to ensure the metrics are gathered even if onload has already fired. Here’s the code:

if ( "undefined" != typeof(document.readyState) && "complete" == document.readyState ) {
    // The page is ALREADY loaded - start EPISODES right now.
    if ( EPISODES.autorun ) {
        EPISODES.done();
    }
}
else {
    // Start EPISODES on onload.
    EPISODES.addEventListener("load", EPISODES.onload, false);
}

Summing up:

  • If you own a website and want to make absolutely certain a script doesn’t impact page load times, consider loading the script in an onload handler. If you do this, make sure to test that the delayed script doesn’t rely on an onload handler to complete its functionality. (Another option is to load the script in an iframe, but third party scripts may not perform correctly from within an iframe.)
  • If you own a third party script that adds an onload handler, you might want to augment that by checking document.readyState to make sure onload hasn’t already fired.
Categories: Software Testing

Zone of control vs Sphere of influence

The Quest for Software++ - Fri, 09/12/2014 - 01:22

In The Logical Thinking Process, H. William Dettmer talks about three different areas of systems:

  • The Zone of control (or span of control) includes all those things in a system that we can change on our own.
  • The Sphere of influence includes activities that we can impact to some degree, but can’t exercise full control over.
  • The External environment includes the elements over which we have no influence.

These three system areas, and the boundaries between them, provide a very useful perspective on what a delivery team can hope to achieve with user stories. Evaluating which system area a user story falls into is an excellent way to quickly spot ideas that require significant refinement.

This is an excerpt from my upcoming book 50 Quick Ideas to Improve your User Stories. Grab the book draft from LeanPub and you’ll get all future updates automatically.

A good guideline is that the user need of a story (‘In order to…’) should ideally be in the sphere of influence of the delivery team, and the deliverable (‘I want…’) should ideally be in their zone of control. This is not a 100% rule and there are valid exceptions, but if a story does not fit into this pattern it should be investigated – often it won’t describe a real user need and rephrasing can help us identify root causes of problems and work on them, instead of just dealing with the symptoms.

When the user need of a story is in the zone of control of the delivery group, the story is effectively a task without risk, which should raise alarm bells. There are three common scenarios: the story might be fake, a micro-story, or misleading.

Micro-stories are what you get when a large business story is broken down into very small pieces, so that some small parts no longer carry any risk – they are effectively stepping stones to something larger. Such stories are OK, but it’s important to track the whole hierarchy and measure the success of the micro-stories based on the success of the larger piece. If the combination of all those smaller pieces still fails to achieve the business objective, it might be worth taking the whole hierarchy out or revisiting the larger piece. Good strategies for tracking higher level objectives are user story mapping and impact mapping.

Fake stories are those about the needs of delivery team members. For example, ‘As a QA, in order to test faster, I want the database server restarts to be automated’. This isn’t really about delivering value to users, but a task that someone on the team needs, and such stories are often put into product backlogs because of misguided product owners who want to micromanage. For ideas on how to deal with these stories, see the chapter Don’t push everything into stories in the 50 Quick Ideas book.

Misleading stories describe a solution and not the real user need. One case we came across recently was ‘As a back-office operator, in order to run reports faster, I want the customer reporting database queries to be optimised’. At first glance, this seemed like a nice user story – it even included a potentially measurable change in someone’s behaviour. However, the speed of report execution is pretty much in the zone of control of the delivery team, which prompted us to investigate further.

We discovered that the operator asking for the change was looking for discrepancies in customer information. He ran several different reports just to compare them manually. Because of the volume of data and the systems involved, he had to wait around for 20 to 30 minutes for the reports, and then spend another 10 to 20 minutes loading the different files into Excel and comparing them. We could probably have decreased the time needed for the first part of that job significantly, but the operator would still have had to spend time comparing information. Then we traced the request to something outside our zone of control. Running reports faster helped the operator to compare customer information, which helped him to identify discrepancies (still within our control, potentially), and then to resolve them by calling the customers and cleaning up their data. Cleaning up customer data was outside our zone of control; we could only influence it by providing information quickly. This was a nice place to start discussing the story and its deliverables.

We rephrased the story to ‘In order to resolve customer data discrepancies faster…’ and implemented a web page that quickly compared different data sources and almost instantly displayed only the differences. There was no need to run the lengthy reports; the database software was more than capable of zeroing in on the differences very quickly. The operator could then call the customers and verify the information.

When the deliverable of a story is outside the zone of control of the delivery team, there are two common situations: the expectation is completely unrealistic, or the story is not completely actionable by the delivery group. The first case is easy to deal with – just politely reject it. The second case is more interesting. Such stories might need the involvement of an external specialist, or a different part of the organisation. For example, one of our clients was a team in a large financial organisation where configuration changes to message formats had to be executed by a specialist central team. This, of course, took a lot of time and coordination. By doing the zone of control/sphere of influence triage on stories, we quickly identified those that were at risk of being delayed. The team started on them quickly, so that everything would be ready for the specialists as soon as possible.

How to make it work

The system boundaries vary depending on viewpoint, so consider them from the perspective of the delivery team.

If a story does not fit into the expected pattern, raise the alarm early and consider re-writing it. Throw out or replace fake and misleading stories. Micro-stories aren’t necessarily bad, but going into so much detail is probably overkill for anything apart from short-term plans. If you discover micro-stories in mid-term or long-term plans, it’s probably better to replace a whole group of related stories with one larger item.

If you discover stories that are only partially actionable by your team, consider splitting them into a part that is actionable by the delivery group, and a part that needs management intervention or coordination.

To take this approach even further, consider drawing up a Current reality tree (outside the scope of this post, but well explained in The Logical Thinking Process), which will help you further to identify the root causes of undesirable effects.

How to Optimize the Good and Exclude the Bad/ Bot Traffic that Impacts your Web Analytics and Performance

This blog is about how a new generation of BOTs impacted our application performance, exploited problems in our deployment and skewed our web analytics. I explain how we dealt with it and what you can learn to protect your own systems. Another positive side-effect of identifying these requests is that we can adjust our web […]

The post How to Optimize the Good and Exclude the Bad/ Bot Traffic that Impacts your Web Analytics and Performance appeared first on Compuware APM Blog.

Categories: Load & Perf Testing

Renaming Compuware APM to Dynatrace – What It Means to You

Most of you reading this blog have probably seen our recent announcements: Compuware going private last week, and the Compuware APM business unit being named Dynatrace yesterday. Quite a few of you reached out to me with questions about what it means and what has changed or will change. So I thought I would address the majority of […]

The post Renaming Compuware APM to Dynatrace – What It Means to You appeared first on Compuware APM Blog.

Categories: Load & Perf Testing

An Inadvertent Password Review

QA Hates You - Wed, 09/10/2014 - 04:58

So I’m at a presentation last week, and the presenter’s got his little tablet plugged into the projector. He wanders off and starts talking to someone, and his tablet shuts off automatically after a bit.

So he goes to the tablet and looks down at it. He starts it back up, and he types his password to unlock it, and….

Because it’s a tablet, the keyboard displays onscreen and shows his key taps.

As does the projector.

So we all saw his password. On one hand, it was a strong password. On the other hand, we all saw it.

Don’t do that.

Categories: Software Testing

Chrome - Firefox WebRTC Interop Test - Pt 2

Google Testing Blog - Tue, 09/09/2014 - 14:09
by Patrik Höglund

This is the second in a series of articles about Chrome’s WebRTC Interop Test. See the first.

In the previous blog post we managed to write an automated test which got a WebRTC call between Firefox and Chrome to run. But how do we verify that the call actually worked?

Verifying the Call

Now we can launch the two browsers, but how do we figure out whether the call actually worked? If you try opening two apprtc.appspot.com tabs in the same room, you will notice the video feeds flip over using a CSS transform: your local video is relegated to a small frame and a new big video feed with the remote video shows up. For the first version of the test, I just looked at the page in the Chrome debugger and looked for some reliable signal. As it turns out, the remoteVideo.style.opacity property will go from 0 to 1 when the call goes up and from 1 to 0 when it goes down. Since we can execute arbitrary JavaScript in the Chrome tab from the test, we can simply implement the check like this:

bool WaitForCallToComeUp(content::WebContents* tab_contents) {
  // Apprtc will set remoteVideo.style.opacity to 1 when the call comes up.
  std::string javascript =
      "window.domAutomationController.send(remoteVideo.style.opacity)";
  return test::PollingWaitUntil(javascript, "1", tab_contents);
}

Verifying Video is Playing

So getting a call up is good, but what if there is a bug where Firefox and Chrome cannot send correct video streams to each other? To check that, we needed to step up our game a bit. We decided to use our existing video detector, which looks at a video element and determines if the pixels are changing. This is a very basic check, but it’s better than nothing. To do this, we simply evaluate the .js file’s JavaScript in the context of the Chrome tab, making the functions in the file available to us. The implementation then becomes

bool DetectRemoteVideoPlaying(content::WebContents* tab_contents) {
  if (!EvalInJavascriptFile(tab_contents, GetSourceDir().Append(
          FILE_PATH_LITERAL("chrome/test/data/webrtc/test_functions.js"))))
    return false;
  if (!EvalInJavascriptFile(tab_contents, GetSourceDir().Append(
          FILE_PATH_LITERAL("chrome/test/data/webrtc/video_detector.js"))))
    return false;

  // The remote video tag is called remoteVideo in the AppRTC code.
  StartDetectingVideo(tab_contents, "remoteVideo");
  WaitForVideoToPlay(tab_contents);
  return true;
}
where StartDetectingVideo and WaitForVideoToPlay call the corresponding JavaScript methods in video_detector.js. If the video feed is frozen and unchanging, the test will time out and fail.

What to Send in the Call

Now we can get a call up between the browsers and detect if video is playing. But what video should we send? For Chrome, we have a convenient --use-fake-device-for-media-stream flag that will make Chrome pretend there’s a webcam and present a generated video feed (which is a spinning green ball with a timestamp). This turned out to be useful since Firefox and Chrome cannot acquire the same camera at the same time, so if we didn’t use the fake device we would have two webcams plugged into the bots executing the tests!

Bots running in Chrome’s regular test infrastructure do not have either software or hardware webcams plugged into them, so this test must run on bots with webcams for Firefox to be able to acquire a camera. Fortunately, we have that in the WebRTC waterfalls in order to test that we can actually acquire hardware webcams on all platforms. We also added a check to just succeed the test when there’s no real webcam on the system since we don’t want it to fail when a dev runs it on a machine without a webcam:

if (!HasWebcamOnSystem())
  return;

It would of course be better if Firefox had a similar fake device, but to my knowledge it doesn’t.

Downloading all Code and Components

Now we have all we need to run the test and have it verify something useful. We just have the hard part left: how do we actually download all the resources we need to run this test? Recall that this is actually a three-way integration test between Chrome, Firefox and AppRTC, which requires the following:

  • The AppEngine SDK in order to bring up the local AppRTC instance, 
  • The AppRTC code itself, 
  • Chrome (already present in the checkout), and 
  • Firefox nightly.

While developing the test, I initially just hand-downloaded these and installed and hard-coded the paths. This is a very bad idea in the long run. Recall that the Chromium infrastructure is comprised of thousands and thousands of machines, and while this test will only run on perhaps 5 at a time due to its webcam requirements, we don’t want manual maintenance work whenever we replace a machine. And for that matter, we definitely don’t want to download a new Firefox by hand every night and put it on the right location on the bots! So how do we automate this?

Downloading the AppEngine SDK
First, let’s start with the easy part. We don’t really care if the AppEngine SDK is up-to-date, so a relatively stale version is fine. We could have the test download it from the authoritative source, but that’s a bad idea for a couple reasons. First, it updates outside our control. Second, there could be anti-robot measures on the page. Third, the download will likely be unreliable and fail the test occasionally.

The way we solved this was to upload a copy of the SDK to a Google storage bucket under our control and download it using the depot_tools script download_from_google_storage.py. This is a lot more reliable than an external website and will not download the SDK if we already have the right version on the bot.

Downloading the AppRTC Code
This code is on GitHub. Experience has shown that git clone commands run against GitHub will fail every now and then, and fail the test. We could write some retry mechanism, but we have found it’s better to simply mirror the git repository in Chromium’s internal mirrors, which are closer to our bots and thereby more reliable from our perspective. The pull is done by a Chromium DEPS file (which is Chromium’s dependency provisioning framework).

Downloading Firefox
It turns out that Firefox supplies handy libraries for this task. We’re using mozdownload in this script in order to download the Firefox nightly build. Unfortunately this fails every now and then so we would like to have some retry mechanism, or we could write some mechanism to “mirror” the Firefox nightly build in some location we control.

Putting it Together

With that, we have everything we need to deploy the test. You can see the final code here.

The provisioning code above was put into a separate “.gclient solution” so that regular Chrome devs and bots are not burdened with downloading hundreds of megs of SDKs and code that they will not use. When this test runs, you will first see a Chrome browser pop up, which will ensure the local apprtc instance is up. Then a Firefox browser will pop up. They will each acquire the fake device and real camera, respectively, and after a short delay the AppRTC call will come up, proving that video interop is working.

This is a complicated and expensive test, but we believe it is worth it to keep the main interop case under automation this way, especially as the spec evolves and the browsers are in varying states of implementation.

Future Work

  • Also run on Windows/Mac. 
  • Also test Opera. 
  • Interop between Chrome/Firefox mobile and desktop browsers. 
  • Also ensure audio is playing. 
  • Measure bandwidth stats, video quality, etc.


Categories: Software Testing

Facebook's Summer of Downtime Continues: Why Marketers Should Worry

Perf Planet - Fri, 09/05/2014 - 10:21

On September 3, Facebook users reported the social network went down on web and mobile in the U.S. as well as the United Kingdom, Germany, Thailand, Portugal, and other parts of the world at around 12:40pm PST. Facebook has had an uncharacteristically problematic summer with major outages in May, June, and August.

It Shouldn’t Be Any Different

QA Hates You - Fri, 09/05/2014 - 08:57

The banner on deals.ebay.com:

The same banner on the eBay Gold store (wait, you didn’t know eBay had a gold store? Neither did its testers!):

Now, why would the height of the banner be different on one page?

Because they’re different, no matter how much the same they seem.

One of the tricks of testing is recognizing how things differ in your applications and Web sites. Although the pages and features try to share code and styling whenever possible, they diverge more than it appears. As you test across features, you’ll get a sense of where different code does the same things, so you’ll learn where to test similar workflows whenever something changes.

That includes checking the styling of different pages within your site when a CSS file changes.

Categories: Software Testing

Understanding Application Performance on the Network – Part IX: Conclusion

Perf Planet - Fri, 09/05/2014 - 06:46

One thing I learned – or more accurately, had reinforced – from the many comments on this blog series is that there are often subtle differences in the implementation of various TCP features and specifications; TCP slow-start and Congestion Avoidance are good examples, as is the retransmission of dropped packets (and even the Nagle algorithm). […]

The post Understanding Application Performance on the Network – Part IX: Conclusion appeared first on Compuware APM Blog.
