Last week at Velocity Conference New York I had the opportunity to sit in on the keynote address by Mikey Dickerson on the topic “One Year After healthcare.gov: Where Are We Now?” Mikey Dickerson is the Administrator/Deputy CIO of USDS. In October 2013 he took a leave of absence from Google to join what became known as the “ad […]
The post Playbook for Performance at Velocity New York 2014 appeared first on Compuware APM Blog.
Invigorated by the comments in my last post, I’ll revisit the topic.
I don’t think we can enhance our reputations as testers by sticking to the credo:
“Raise every bug, no matter how trivial”
Notice, I’m using the language “raise” instead of “log”. This is an effort to include teams that have matured to the point of replacing bug reports with conversations. I used the term “share” in my previous post but I like “raise” better. I think Michael Bolton uses it.
Here are a couple of problems with said credo:
- Identifying bugs is so complex that one cannot commit to raising them all. As we test, there are countless evaluations our brains are making: “That screen seems slow today, that control might be better a hair to the right, why isn’t there a flag in the DB to persist that data?”. We are constantly deciding which observations are worth spending time on. The counterargument to my previous post seems to be: just raise everything and let the stakeholders decide. I argue that everything is too much. Instead, the more experience and skill a tester gains, the better she will know what to raise. And yes, she should be raising a lot, documenting bugs/issues as quickly as she can. I still think, with skill, she can skip the trivial ones.
- Raising trivial bugs hurts your reputation as a tester. I facilitate bug triage meetings with product owners. Trivial bugs are often mocked before being rejected: “Ha! Does this need to be fixed because it’s bugging the tester or the user? Reject it! Why would anyone log that?”. Important bugs get the opposite reaction. Sorry. That’s the way it is.
- Time is finite. If I’m testing something where bugs are rare, I’ll be more inclined to raise trivial bugs. If I’m testing something where bugs are common, I’ll be more inclined to spend my time on (what I think) are the most important bugs.
“It’s not the tester’s job to decide what is important.” Yes, in general I agree. But I’m not dogmatic about this. Maybe if I share some examples of trivial bugs (IMO), it will help:
- Your product has an administrative screen that can only be used by a handful of tech support people. They use it once a year. As a tester, you notice the admin screen does not scroll with your scroll wheel. Instead, one must use the scroll bar. Trivial bug.
- Your product includes a screen with two radio buttons. You notice that if you toggle between the radio buttons 10 times and then try to close the screen less than a second later, a system error gets logged behind the scenes. Trivial bug.
- Your product includes 100 different reports users can generate. These have been in production for 5 years without user complaints. You notice some of these reports include a horizontal line above the footer while others do not. Trivial bug.
- The stakeholders have given your development team 1 million dollars to build a new module. They have expressed their expectations that all energy be spent on the new module and they do not want you working on any bugs in the legacy module unless they report the bug themselves and specifically request its fix. You find a bug in the legacy module and can’t help but raise it…
You laugh, but the drive to raise bugs is stronger than you may think. I would like to think there is more to our jobs than “Raise every bug, no matter how trivial”.
We're currently experiencing temporary login issues with the WPM Home website. Monitoring and load testing applications are not affected. We will post a status update once service has been restored. We apologize for any inconvenience during this time.
We have completed selection and confirmation of all speakers and attendees for GTAC 2014. You can find the detailed agenda at:
Thank you to all who submitted proposals! It was very hard to make selections from so many fantastic submissions.
There was a tremendous amount of interest in GTAC this year with over 1,500 applicants (up from 533 last year) and 194 of those for speaking (up from 88 last year). Unfortunately, our venue only seats 250. However, don’t despair if you did not receive an invitation. Just like last year, anyone can join us via YouTube live streaming. We’ll also be setting up Google Moderator, so remote attendees can get involved in Q&A after each talk. Information about live streaming, Moderator, and other details will be posted on the GTAC site soon and announced here.
The Buenos Aires agent is currently offline. We are working to restore it to service as soon as possible. Again, we apologize for the inconvenience and appreciate your patience on this issue. We will provide an update once the Buenos Aires agent is available again.
Seen on Twitter:
Engineers don't let engineers design user interfaces. pic.twitter.com/XKSDUOxKHe
— John Bellomy (@cowbs) September 28, 2014
I’ve tested applications like that, where the Web pages are filled with tabs full of edit boxes crammed into the space, with inscrutable labels whose meanings, I’m assured, the users know.
Sometimes, these interface designs come straight off of some crowded paper form that a worker would fill out with pen, checking boxes and putting tic marks or numbers in boxes. On paper, this is as quick as moving your eye, moving the pen, and pressing down. With a screen, it’s a little different, as it involves tabbing or moving the mouse and clicking and then typing something, scanning the form, moving the mouse, clicking, typing some more, and so on.
Other times, these interface designs pretty directly capture what the worker saw in a mainframe application or in a terminal window connecting to one, just with Windows or Web-safe colors instead of amber text on a black background. A lot of needless tabbing remains because those interfaces could not easily branch. If you check this box, then these blank spaces become relevant? No, paper couldn’t do that and mainframes couldn’t do that, so the new Windows or Web application won’t do that. Because the users are used to it.
- The workers (“users”) today aren’t the users of tomorrow; if you’re not designing the interface right because it’s good enough for the grizzled greybeard who’s been around forever, you’re not appreciating how much easier you could make the process for n00bs. That is, probably most of the users. Especially if you’re writing software for a company that’s okay with this sort of interface. I imagine it has a lot of turnover and a lot of people getting trained to do it the hard way just because it’s always been done.
- Notice that we use the term “users” a lot in relation to people who work with the software we build. That’s defining them in terms of their relationship with our software, but their main jobs are doing something else. If your software design captures workers and traps them into being users too much, it drags on their productivity. Computer software should make their jobs easier and more streamlined, not slower than working with pen and paper. Sure, you can say that the data collection for analysis on the back-end is the driver for the software, but that doesn’t mean you should ignore other efficiencies you can introduce with a good (or at least better than this) design.
I read somewhere recently about hiring testers for testing skills rather than domain knowledge, and sure, that’s the right emphasis, but domain knowledge is what allows you to spot these sorts of problems. You might be hired because you’re a good tester, but you ought to study up on the industry whose software you’re testing. Me, I’ve been known to refresh myself on the basics of chemistry to better test chemical modeling software and to grok at least a little bit of the workflow of a warehouse when testing order fulfillment software.
Because otherwise you’re only logging the defects qua defects, like “The Tare Weight edit box allows alpha characters”, and not raising the higher-level concerns about why you’d expect a worker to enter the total shipping weight before the number of items to ship.
Domain knowledge gives you the insight about the worker’s starting point in your software and what he wants to do to get done with your software. And that will give you the possible paths for his interaction without having to make all the possibilities available on one screen in tiny print.
In the previous post we learned how to set up Drupal 8 for serving content in a RESTful format. We also created a simple application with Backbone.js to retrieve that content and display it. This post will explain how to improve that application. If you followed the steps outlined in the previous post, the application should look like …
It’s a bug, no doubt. Yes, you are a super tester for finding it. Pat yourself on the back.
Now come down off that pedestal and think about this. By any stretch of the imagination, could that bug ever threaten the value of the product-under-test? Could it threaten the value of your testing? No? Then swallow your pride and keep it to yourself.
My thinking used to be: “I’ll just log it as low priority so we at least know it exists”. As a manager, when testers came to me with trivial bugs, I used to give the easy answer, “Sure, they probably won’t fix it but log it anyway”.
Now I see things differently. If a trivial bug gets logged, often…
- a programmer sees the bug report and fixes it
- a programmer sees the bug report and wonders why the tester is not testing more important things
- a team member stumbles upon the bug report and has to spend 4 minutes reading it and understanding it before assigning some other attribute to it (like “deferred” or “rejected”)
- a team member argues that it’s not worth fixing
- a tester has spent 15 minutes documenting a trivial bug.
It seems to me, reporting trivial bugs tends to waste everybody’s time. Time that may be better spent adding value to your product. If you don’t buy that argument, how about this one: Tester credibility is built on finding good bugs, not trivial ones.
The Paris agent is currently offline. We are working to restore it to service as soon as possible. We will provide an update once the Paris agent is available again. We regret any inconvenience this may have caused.
First, yes it is still hosted out of my basement. I did move it out of the utility room and into a storage room so if the water heater leaks it will no longer take out everything.
Yes, Halloween has gotten a bit out of control. This is what it looked like last year (in our garage though the video doesn't quite do it justice).
The WebPagetest "rack" is a gorilla shelf that holds everything except for the Android phones.
Starting at the bottom we have the 4 VM servers that power most of the Dulles desktop testing. Each server is running VMware ESXi (now known as VMware Hypervisor) with ~8 Windows 7 VMs on each. I put the PCs together myself:
- Single socket Supermicro Motherboards with built-in IPMI (remote management)
- Xeon E3 processor (basically a Core i7)
- 32 GB Ram
- Single SSD Drive for VM Storage
- USB Thumb drive (on motherboard) for ESXi hypervisor
The SSDs for VM storage let me run all of a server's VMs off a single drive with no I/O contention because of the insane IOPS you can get from them (I tend to use Samsung 840 Pros but am really looking forward to the 850s).
As far as sizing the servers goes, I load up more VMs than I expect to use, submit a whole lot of tests with all of the options enabled and watch the hypervisor's utilization. I shut down VMs until the CPU utilization stays below 80% (one VM per CPU thread seems to be the sweet spot).
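The sizing rule above can be sketched as a small calculation. This is an illustrative approximation only: the function name, the per-VM utilization figure, and the 80% target are assumptions, not real ESXi API calls or measured numbers.

```python
# Hypothetical sketch of the sizing rule: start with more VMs than you expect
# to keep, then shed VMs until peak CPU stays under the utilization target.

def vms_to_keep(cpu_threads: int, peak_util_per_vm: float, target: float = 0.80) -> int:
    """Return how many concurrent test VMs fit under the utilization target.

    peak_util_per_vm is the fraction of one CPU thread a busy VM consumes
    (measured by loading the host and watching the hypervisor's stats).
    """
    # The total budget is the target fraction of all hardware threads.
    budget = cpu_threads * target
    fit = int(budget // peak_util_per_vm)
    # The observed sweet spot was about one VM per CPU thread, so cap there.
    return min(fit, cpu_threads)

# A Xeon E3 has 4 cores / 8 threads; if a busy VM peaks near 0.75 of a thread:
print(vms_to_keep(8, 0.75))  # 8, which matches the ~8 VMs per host above
```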
Moving up the rack we have the unRAID NAS where the tests are archived for long-term storage (as of this post the array can hold 49TB of data, with 18TB used for test results). I have a bunch of other things on the array so not all of that space is free, but I expect to be able to keep storing results indefinitely.
I haven't lost any data (though drives have come and gone), but the main reason I like unRAID is that if I lose multiple drives it is not completely catastrophic: the data on the remaining drives can still be recovered. It's also great for power because it can automatically spin down the drives that aren't being actively accessed.
Next to the unRAID array is the stack of Thinkpad T430s that power the "Dulles Thinkpad" test location. They are great if you want to test on relatively high-end physical hardware with GPU rendering. I really like them as test machines because they also have built-in remote management (AMT/vPro in Intel-speak), so I can reboot or remotely fix them if anything goes wrong. I have all of the batteries pulled out so recharge cycles don't kill them, but if you want built-in battery backup/UPS they work great for that too.
Buried in the corner next to the stack of Thinkpads is the web server that runs www.webpagetest.org.
The hardware mostly matches the VM servers (same motherboard, CPU and memory) but the drive configuration is different. There are 2 SSD's in a RAID 1 array that run the main OS, Web Server and UI and 2 magnetic disks in a RAID 1 array that is used for short-term test archiving (1-7 days) before they are moved off to the NAS. The switch sitting on top of the web server connects the Thinkpads to the main switch (ran out of ports on the main switch).
The top shelf holds the main networking gear and some of the mobile testing infrastructure.
The iPhones are kept in the basement with the rest of the gear and connect over WiFi to an Apple Airport Express. The Apple access points tend to be the most reliable and I haven't had to touch them in years. The access point is connected to a network bridge so that all of the phone traffic goes through the bridge for traffic shaping. The bridge is running FreeBSD 9.2, which works really well for dummynet, and has a fixed profile set up (for now) so that everything going through it sees a 3G connection (though traffic to the web server is configured to bypass the shaping so that test results are fast to upload). The bridge is a Supermicro 1U Atom server which is super-low power, has remote management and is more than fast enough for routing packets.
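To make the shaping setup concrete, here is a sketch of the kind of ipfw/dummynet rules such a bridge might use, emitted as strings for readability. The interface name, server address, and the 3G numbers (1.6 Mbps down, 768 Kbps up, 300 ms round trip) are assumptions based on common traffic-shaping profiles, not the actual config.

```python
# Illustrative ipfw/dummynet rule generator; addresses and numbers are made up.

WEB_SERVER = "203.0.113.10"   # hypothetical WebPagetest server address
BRIDGE_IF = "em0"             # hypothetical bridge interface

def dummynet_rules() -> list[str]:
    return [
        # Result uploads to the web server bypass shaping entirely.
        f"ipfw add 100 allow ip from any to {WEB_SERVER}",
        f"ipfw add 110 allow ip from {WEB_SERVER} to any",
        # Two pipes: downstream and upstream halves of a 3G profile,
        # each carrying half of the 300 ms round-trip delay.
        "ipfw pipe 1 config bw 1600Kbit/s delay 150ms",
        "ipfw pipe 2 config bw 768Kbit/s delay 150ms",
        # Everything else crossing the bridge goes through the pipes.
        f"ipfw add 200 pipe 1 ip from any to any in via {BRIDGE_IF}",
        f"ipfw add 210 pipe 2 ip from any to any out via {BRIDGE_IF}",
    ]

for rule in dummynet_rules():
    print(rule)
```

The key design point is the bypass rules coming first (lower rule numbers), so uploads to the results server are never throttled while all other test traffic sees the shaped profile.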
There are 2 iPhones running tests for the mobile HTTP Archive and 2 running tests for the Dulles iPhone testing for WebPagetest. The empty bracket is for the third phone that is usually running tests for Dulles as well but I'm using it for dev work to update the agents to move from mobitest to the new nodejs agent code.
The networking infrastructure is right next to the mobile agents.
The main switch has 2 VLANs on it. One connects directly to the public Internet (the right 4 ports) and the other (all of the other ports) to an internal network. Below the switch is the router that bridges the two networks and NATs all of the test agent traffic (and runs as a DHCP and DNS server). The WebPagetest web server and the router are both connected to the public Internet directly which ended up being handy when the router had software issues and I was in Alaska (I could tunnel through the web server to the management interface on the router to bring it back up). The router is actually the bottom unit and a spare server is on top of it, both are the same 1U atom servers as the traffic-shaping bridge though the router runs Linux.
My Internet connection is awesome (at least by US pre-Google Fiber standards). I am lucky enough to live in an area that has Verizon FIOS (fiber). I upgraded to a business account (not much more than a residential one) to get static IPs, and I get much better support, 75Mbps down/35Mbps up and super-low latency. The FIOS connection itself hasn't been down at all in at least the last 3 years.
The Android devices are on the main level of the house right now on a shelf in the study, mostly so I don't have to go downstairs in case the devices need a bit of manual intervention (and while we shake out any reliability issues in the new agent code).
The phones are connected through an Anker USB hub to an Intel NUC running Windows 7, where the nodejs agent code runs to manage the testing. The current-generation NUCs don't support remote management, so I'm really looking forward to the next release (January or so), which is supposed to add it back. For now I'm just using VNC on the system, which gives me enough control to reboot the system or any of the phones if necessary.
The phones are all connected over WiFi to the access point in the basement (which is directly below them). The actual testing is done over the traffic-shaped WiFi connection, but all of the phone management and test processing is done on the tethered NUC system. I tried Linux on it but at the time the USB 3 drivers were just too buggy, so it is running Windows (for now). The old Android agent is not connected to the NUC and is running mobitest, but the other 10 phones are all connected to the same host. I tried connecting an 11th but Windows complained that too many USB device IDs were in use, so it looks like the limit (at least for my config) is 10 phones per host. I have another NUC ready to go for when I add more phones.
One of the Nexus 7's is locked in portrait mode and the other is allowed to rotate (which in the stand means landscape). All of the rest of the phones are locked in portrait. I use these stands to hold the phones and have been really happy with them (and have a few spares off to the left of the picture).
At this point the android agents are very stable. They can run for weeks at a time without supervision and when I do need to do something it's usually a matter of remotely rebooting one of the phones (and then it comes right back up). After we add a little more logic to the nodejs agent to do the rebooting itself they should become completely hands-free.
Unlike the desktop testing, the phone screens are on and visible while tests are running so every now and then I worry that the kids may walk in while someone is testing a NSFW site but they don't really go in there (something to be aware of when you set up mobile testing though).
One question I get asked a lot is why I don't host it all in a data center somewhere (or run a bunch of it in the cloud). Maybe I'm old-school, but I like having the hardware close by in case I need to do something that requires physical access, and the costs are WAY cheaper than if I hosted it somewhere else. The increase in the power bill is slight (tens of dollars a month), I'd have an Internet connection anyway so the incremental cost for the business line is also tens of dollars per month, and the server and storage costs were one-time costs that came to less than even a couple of months of hosting. Yes, I need to replace drives from time to time, but at $150 per 4TB drive, that's still a LOT cheaper than storing 20TB of data in the cloud (not to mention the benefit of having it all on the same network).
100% Coverage I just recently wrote a blog about BOTs causing unwanted traffic on our servers. Right after I wrote this blog I was notified about yet another “interesting” and unusual load behavior on our download page which is used by customers to download latest product versions and updates: If you see such a load […]
The post Bad Deployments: The Performance Impact of Recursive Browser Redirect Loops appeared first on Compuware APM Blog.
Unless you have been living under a rock for the past month, you have at least heard about Apple’s iPhone 6 launch. This event has been hyped for weeks, whipping the blogosphere into a pretty good frenzy over what Apple would announce at this live event. An event of this size and visibility certainly deserves …
The post Another Load Testing Fail: iPhone 6 Launch Day Crash appeared first on Load Impact Blog.
Before I go into that, I want you to know why I'm making this offer.
First, I know that many people have stretched training budgets and have to make every dollar count.
Second, I have over 15 e-learning courses in software testing, but the one that I think covers the
most information in software testing is the ISTQB Foundation Level course.
The reason I conduct and promote the ISTQB program is because it gives a well-rounded framework for building knowledge in software testing. It's great to get your team on the same page in terms
of testing terminology. It also builds credibility for testers in an organization.
ISTQB is the International Software Test Qualifications Board, a not-for-profit organization that has defined the "ISTQB® Certified Tester" program that has become the world-wide leader in the certification of competences in software testing. Over 336,000 people worldwide have been certified in this program. The ISTQB® is an organization based on volunteer work by hundreds of international testing experts, including myself.
You can learn more at www.istqb.org and about the ASTQB (American Software Testing Qualifications Board) at www.astqb.org.
I think e-Learning has the best results in preparing people to take the ISTQB Foundation Level exam because you have time to really absorb the concepts, as opposed to trying to learn everything in 3
or 4 days. Plus, you can review the material at any time. That's hard to do in a live class. I have seen people score very high on the exam after taking this course and it gets great reviews.
OK.... now for the details....
For this week only, Monday, Sept 22 through Midnight (CDT) Friday, Sept 26th, I am running a special offer on ISTQB Foundation Level certification e-learning training. If you purchase the 5-person
team license, you get an extra person at no extra cost - exams included!
So, if you have been thinking about getting your team certified in software testing, this is a great opportunity - an $899 value.
In addition, if you order a 5-person team license or higher, I will conduct a private one-hour web-meeting Q&A session in advance of the exam. Your team also gets access to the course as
long as they need it - no time limits!
All you have to do is use the code "ISTQB9922" at checkout time. You will be contacted for the names of the participants.
Payment must be by credit card or PayPal.
To see the details of the course, go to
To learn more about the e-learning program, go to
To register, go to https://www.mysoftwaretesting.com/ISTQB_Foundation_Level_Course_in_Software_Testing_p/istqb5.htm
Any questions? Just respond to this e-mail.
Act fast, because this deal goes away at Midnight (CDT) Friday, Sept 26th.
It’s hard to argue with facts. That’s probably why AppDynamics’ spin machine has been hard at work lately, trying to find distorted angles and mis-representations about our capabilities. This is an attempt to distract from their own shortcomings and the fact that this year again customers on the market for a new generation APM, favored Dynatrace […]
My favorite war room accusation is: “It’s always the network at fault!” Whether you’re the one taking the blame or the one pointing the finger likely has everything to do with which seat you occupy in that war room. I suppose that comes with the territory, because at the same time there seems to be […]
The post Defending Network Performance with Packet Level Detail appeared first on Compuware APM Blog.
We’ll be speaking at this year’s QCon conference in Rio on September 24th and 25th and hope to see you there! Our founder, Ragnar Lönn, will be giving a presentation, “Why bandwidth doesn’t matter – why network delay will always limit web performance and what to do about it,” that promises to lead to many interesting discussions. …
About five years ago, my tester friend, Alex Kell, blew my mind by cockily declaring, “Why would you ever log a bug? Just send the Story back.”
My dev team uses a Kanban board that includes “In Testing” and “In Development” columns. Sometimes bug reports are created against Stories. But other times Stories are just sent left; for example, a Story “In Testing” may have its status changed to “In Development”, like Alex Kell’s maneuver above. This is normally done using the Dead Horse When-To-Stop-A-Test Heuristic. We could also send an “In Development” Story left if we decide the business rules need to be firmed up before coding can continue.
So how does one know when to log a bug report vs. send it left?
I proposed the following heuristic to my team today:
If the Acceptance Test Criteria (listed on the Story card) are violated, send it left. It seems to me, logging a bug report for something already stated in the Story (e.g., Feature, Work Item, Spec) is mostly a waste of time.
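The heuristic is simple enough to write down as a tiny decision function. This is purely illustrative; the function and return strings are hypothetical stand-ins, not a real tracker or board API.

```python
# An illustrative encoding of the "send left vs. log a bug" heuristic above.

def handle_finding(violates_acceptance_criteria: bool) -> str:
    """Decide what to do with a finding against a Story that is In Testing."""
    if violates_acceptance_criteria:
        # The expectation is already written on the Story card, so a separate
        # bug report would mostly duplicate it: move the Story back instead.
        return "send left (In Testing -> In Development)"
    # The finding is outside what the card promises, so it needs its own
    # record for the team to triage.
    return "log a bug report"

print(handle_finding(True))   # send left (In Testing -> In Development)
print(handle_finding(False))  # log a bug report
```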
In fact, the previous statement is much too modest. Cyber Monday is the biggest day of the year for ecommerce in the United States and beyond. As the statistics show, Cyber Monday has become a billion dollar juggernaut since 2010 – and it has only continued to grow. Last year alone, Cyber Monday was responsible for over $1.7 billion spent by online consumers in the US, a shocking 18% jump from the year before!
Since its inception in 2005, the Monday after Thanksgiving has become a potential goldmine for those with an online presence. It has helped to significantly boost revenue during the Christmas period for savvy businesses that have taken advantage of using the promotion. The “cannot-be-missed” deals are important to any Cyber Monday campaign, but having the website ready to maintain consistent and fast performance with the traffic rush is absolutely critical.
An unprepared business might expect an increase in business on Cyber Monday but overlook the fact that more visitors = more strain on the performance side of their website. And the more strain on a website, the more it will begin to falter when it matters most.

How web performance can cause your Cyber Monday to crash
During the mad rush of consumers looking to snap up some bargain deals, your website has to be prepared for the sudden visitor increase – otherwise your Cyber Monday will crumble before your eyes.
Last year Cyber Monday website crashes cost several large companies thousands of dollars in revenue. Motorola was offering a special price on their new MotoX, but the site was not prepared for the rush of traffic it would bring. Many customers experienced a very slow website, errors showing prices without the discount, and then the website crashed entirely.
In addition to losing customers who would have otherwise purchased that weekend, Motorola also had to deal with the PR aftermath. Unhappy would-be customers and the tech media took to social media, posting tweets such as:
In an effort to mitigate the damage, Motorola’s CEO issued a statement:
Moral of the story? Motorola lost out on thousands of dollars of sales and lost thousands of potential new customers forever, all of which could have been avoided if load and performance testing had been performed early. If they had load tested, Motorola would have been aware of the problems, found the causes, and fixed them before real users experienced them.
While many companies didn’t see full website crashes like Motorola, the rush of traffic still led to painfully slow websites and therefore a loss in revenue. A website must not only remain up and available, but also remain fast to navigate. Just think of the number of pages a potential customer might have to go through on your website, and now imagine delays between each page loading. Internet users are an impatient bunch: a one-second delay can cause a 7% decrease in conversions and 11% fewer page views, and 74% of people will leave a mobile site if the delay is longer than five seconds!
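A back-of-the-envelope calculation shows how quickly those percentages add up. The traffic and order numbers below are made up for illustration; only the 7%-per-second figure comes from the statistic quoted above, and the linear compounding is a deliberate simplification.

```python
# Rough estimate of revenue lost to page-load delay, using the quoted
# ~7%-of-conversions-per-second figure. All traffic numbers are hypothetical.

def lost_revenue(visitors: int, conversion_rate: float,
                 avg_order: float, delay_seconds: float,
                 loss_per_second: float = 0.07) -> float:
    """Estimate revenue lost to delay, assuming the per-second conversion
    hit scales linearly with delay (capped at losing everything)."""
    baseline = visitors * conversion_rate * avg_order
    lost_fraction = min(1.0, delay_seconds * loss_per_second)
    return baseline * lost_fraction

# 100,000 Cyber Monday visitors, 3% conversion, $80 average order,
# pages slowed by 2 seconds under load:
print(round(lost_revenue(100_000, 0.03, 80.0, 2.0)))  # 33600
```

Even under these modest assumptions, two seconds of slowdown costs tens of thousands of dollars in a single day, which is why load testing before the rush pays for itself.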
Clearly, ensuring your website is constantly up and stable is imperative to maximizing profits for your business this Cyber Monday, because the last thing you want to do is miss out on the most important day of the year for ecommerce and present your competitors with an opening to snag that business. The stakes are high; you’ve got to make sure you are suitably prepared for the rush.