Pollin’ the Dice, Comin’ Up Snake Eyes
I hit Dice.com to check out the local job market action, but instead of letting me go about my bidness, immediately the home page asks me to take a poll wherein I could win a Kindle. All the better for reading a $.99 copy of John Donnelly’s Gold, I think, so I click through to it.
And then I get to this particular bit of logical Möbius strip:
To clarify: The control is labeled What did you accomplish on Dice.com today? (Select all that apply)
However, not one of the checkboxes is labeled None of these.
So to continue to the next step, if you want to continue, you must lie. And remember, the entrance to this quiz is on page load of Dice.com. That is, before you have accomplished anything at all.
Me, I didn’t lie: I eventually checked Other and Specified I got a blog post out of it.
What’s the lesson here, lads and lasses? Read the labels of the controls you’re checking, and make sure they make sense and make sure any enforcement rules upon them make sense vis-à-vis that label text.
Slice AND Dice Your Data
So, how do we get our hands around all that data? And once we have a conclusion, how do we present it effectively?
Slice up the data. Then start over from the raw data and dice it an entirely different way. Repeat.
The point is to look at the data from several angles to gain comprehension. Let's take the product comparison as an angle. It was a pretty straightforward project: We need a CMS. Which one should we use? There are literally hundreds of CMSs in this world, and a whole lot of information available about most of them. Let's commence data analysis:
- Slice. Make a feature comparison table, showing which ones have the features we're interested in.
- Dice. Map out the release frequency of modules, showing how quickly these introduce new features (as a proxy for future feature development pace).
- Slice. Figure out the annual cost of each CMS.
- Dice. Describe the community around each CMS, including number of job postings/candidates, presence and activity in forums or other documentation, and breakdown of current users in our industry.
So often we deal with large amounts of data, and it's easy to get overwhelmed and make a choice based on just a few data points and a rough guess. Take the time to cut through the data in a few ways, though, and your choice will not only be more obvious, it'll also be a better decision.
Signing Off
This will be my last post on this blog. Tomorrow is my last day at Google. It was a great ride and a great pleasure to work alongside such brilliant engineers. I will hand over this blog to another test director and then find a new place for whatever blogging I do in the future.
Follow me on Twitter (@docjamesw) if you are interested in where I land and to find my next blog outlet.
Peace.
Do you need to monitor your Mobile App?
About the Performance of Map Reduce Jobs
Internet Explorer 9 and Firefox 8/9 support with dynaTrace Ajax Edition 3.4
Clouds on Cloud Nine: The Challenge of Managing Hybrid-Cloud Environments
Third Party Content Management applied: Four steps to gain control of your Page Load Performance!
How to manage the performance of 1000+ JVMs
The Top Java Memory Problems – Part 2
We are growing!
Pagination with Cassandra and what we can learn from it
MMT 28 – Slides and more
Put Your Back Intuit
So I installed the new full CD version of Intuit QuickBooks, which is adware designed to get you to buy a lot of Intuit additional services disguised as accounting software. Now, if you’re like me, you’re not into the intricacies of actual accounting nor the myriad business rules that the various state and Federal governments change upon a whim, but you rely on software and a good accountant (or, sometimes, an accountant, although I’d like to add my current accountant is a good accountant unlike previous engagements who continue to bill me a small amount every year for simply having my address in their files).
Where was I? Oh, yes. I was talking about trusting your application, particularly one with complicated rules whose violation might result in a prison sentence. You want to trust that application, don’t you? So do I.
But I get the software installed and get into the mandatory registration (that is, give us personal information so we can target more in-application advertising pop-ups to you), and I get confronted with obvious slops on the design.
To wit:
A couple missing lines and slurred text, probably caused by poor compression or sizing.
Next up:
A stray bracket in the corner.
Man, oh man, I can’t wait to find out what strange punctuation marks it leaves in my figures.
Do I trust the application? Not so much. Which is why I don’t use it for much more than a glorified check register. And if it continues with its unrepentant, unrelenting barrage of “Collect credit cards with Intuit!”, “Print checks with Intuit!”, “Let Intuit have access to all your financial accounts!” banners popping up before I can pay my bills, I won’t have to trust it in the future, as I move to Microsoft Excel where it’s nice and quiet.
Estimating maximum users that an application can support
When load testing an application, the first set of tests should focus on measuring the maximum throughput. This is especially true of multi-user, interactive applications like web applications. The maximum throughput is best measured by running a few emulated users with zero think time. This means that each emulated user sends a request, receives a response and immediately loops back to send the next request. Although this is artificial, it is the best way to quickly determine the maximum performance of the server infrastructure.
Little’s Law
Once you have that throughput (say X), you can use Little’s Law to estimate the number of real simultaneous users that the application can support. In simple terms, Little’s Law states that:
N = X / λ
where N is the number of concurrent users, X is the throughput and λ is the average rate at which a single user issues requests. Note that this rate is the inverse of the inter-arrival time, i.e. the time between one user’s successive requests.
To understand this better, let’s take a concrete example from some tests I ran on a basic PHP script deployed on an Apache server. The maximum throughput obtained was 2011.763 requests/sec with an average response time of 6.737 ms and an average think time of 0.003 secs when running 20 users. The arrival rate is the inverse of the inter-arrival time, which is the sum of the response time and think time. In this case, X is 2011.763 and λ is 1/(0.006737 + 0.003). Therefore,
N = X / λ = 2011.763 * 0.009737 = 19.5885
This is pretty close to the actual number of emulated users which is 20.
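The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation above; the function name `concurrent_users` is made up for this example, and the figures are the ones from the test run.

```python
def concurrent_users(throughput, response_time, think_time):
    """Little's Law as used above: N = X / lambda, with lambda = 1 / (R + Z)."""
    inter_arrival = response_time + think_time  # seconds between one user's requests
    arrival_rate = 1.0 / inter_arrival          # lambda, requests per second per user
    return throughput / arrival_rate

# Figures from the PHP/Apache test run: X = 2011.763 req/s, R = 6.737 ms, Z = 3 ms
n = concurrent_users(2011.763, 0.006737, 0.003)
print(round(n, 4))  # close to the 20 emulated users
```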
Estimating Concurrent Users
This is all well and good, but how does this help us in estimating the number of real concurrent users (with non-zero think time) that the system can support? Using the same example as above, let us assume that if this were a real application, the average inter-arrival time is 5 seconds. Using Little’s Law, we can now compute N as:
N = X / λ = 2011.763 * 5 ≈ 10058 users.
In other words, this application running on this same infrastructure can support more than 10,000 concurrent users with an inter-arrival time of 5 seconds.
What does this say for think times? If we assume that the application (and infrastructure) will continue to perform in the same manner as the number of connected users increases (i.e. it maintains the average response time of 0.006737 seconds), then the average think time is 4.993 seconds. If the response time degrades as load goes up (which is usually the case after a certain point), then the number of users supported will also correspondingly decrease.
A well-designed application can scale linearly to support tens or hundreds of thousands of users. In the case of large websites like Facebook, eBay and Flickr, the applications scale to handle millions of users. But obviously, these companies have invested tremendously to ensure that their applications and software infrastructure can scale.
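As a rough sketch, the estimate and the implied think time can be computed together. The helper names `max_users` and `implied_think_time` are hypothetical, and the calculation assumes, as the text does, that throughput and response time stay at their measured values.

```python
def max_users(throughput, inter_arrival):
    """N = X / lambda, where lambda = 1 / inter_arrival."""
    return throughput * inter_arrival

def implied_think_time(inter_arrival, response_time):
    """Inter-arrival time is response time plus think time, so Z = T - R."""
    return inter_arrival - response_time

users = max_users(2011.763, 5.0)            # measured X, assumed 5 s between requests
think = implied_think_time(5.0, 0.006737)   # what is left of the 5 s after the response

print(int(users), "users, think time", round(think, 3), "s")
```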
Little’s Law can be used to estimate the maximum number of concurrent users that your application can support. As such, it is a handy tool to get a quick, rough idea. For example, if Little’s Law indicates that the application can only support 10,000 users but your target is really 20,000 users, you know you have work to do to improve basic performance.
Which is More Important? Knowing What Works or Finding Bugs?
- “Instead of figuring out what works, they are stuck investigating what doesn’t work.”
Ilya asked:
Why did you use "stuck" referring to context of the other testers? Isn't "investigating what doesn’t work" more important than "figuring out what works" (other factors being equal)?
I love that question. It really made me think. Here is my answer:
- If stuff doesn’t work, then investigating why it doesn’t work may be more important than figuring out what works.
- If we’re not aware of anything that is broken, then figuring out what else works (or what else is not broken) is more important than investigating why something doesn’t work…because there is nothing broken to investigate.
When testers spend their time investigating things that don’t work, rather than figuring out what does work, it is less desirable than the opposite. Less desirable because it means we’ve got stuff that doesn’t work! Less desirable to who? It is less desirable for the development team. It means there are problems in the way we are developing software.
An ultimate goal would be bug free software, right? If skilled testers are not finding any bugs, and they are able to tell the team how the software appears to work, that is a good thing for the development team. However, it may be a bad thing for the tester.
- Many testers feel like failures if they don’t have any issues to investigate.
- Many testers are not sure what to do if they don’t have any issues to investigate.
- If everything works, many testers get bored.
- If everything works, there are fewer hero opportunities for many testers.
I don’t believe things need to be that way. I’m interested in exploring ways to have hero moments by delivering good news to the team. It sounds so natural but it isn’t. As a tester, it is soooooo much more interesting to tell the team that stuff just doesn’t work. Now that’s dysfunctional. Or is it?
And that is the initial thought that sparked my Avoid Trivial Bugs, Report What Works post.
Thanks, Ilya, for making me think.
Testability Explained
Web Performance Optimization, Part 10: The Evolution of Client Side Caching
While we've touched upon client side caching in our series on Web performance, we haven't discussed how client caching has grown more rich and useful over the years. In the initial days of the Web and the HTTP/1.0 protocol, caching was mostly limited to a handful of headers, including Expires, If-Modified-Since, and Pragma: no-cache. Since then, client caching has evolved to embrace greater granularity. Some new technologies even permit the deployment of offline-aware, browser-based applications.
Browser Request Caching
The oldest and most common type of client-side caching is browser request caching. Built into the HTTP protocol standard, browser request caching allows the server to control how often the browser requests new copies of files from the server. We discussed the major aspects of browser request caching in part 1 of our series. Over time, Webmasters have taken to using different headers to improve caching on their sites, including:
Pragma: no-cache. This old directive is used mostly by HTTP/1.0 servers, and instructs a client that a specific response's contents should never be cached. It is used for highly dynamic content that is apt to change from request to request.
Expires. Supported since HTTP/1.0, this header specifies an explicit expiration date for cached content. It can be superseded by the value of the Cache-Control header. For example, if Cache-Control: no-cache is sent in a response, this will take precedence over any value of the Expires header.
If-Modified-Since. Since the HTTP/1.0 protocol, clients have been able to use this header to request that the server only send data if the resource has been changed since the specified date. If there have been no changes, the server returns an HTTP 304 Not Modified response.
Last-Modified. This HTTP/1.0 and 1.1 header designates when the resource was most recently changed. Browsers usually supply this value as the value of the If-Modified-Since header.
Cache-Control. This core directive, introduced in the HTTP/1.1 standard, specifies whether a response's contents can be cached, and if so, for how long. The header "Cache-Control: no-cache" obsoletes the "Pragma: no-cache" header of the HTTP/1.0 protocol.
ETag. The ETag ("entity tag") header is an opaque identifier, often a hash value, that is specific to a given version of a resource. It can be used by the client in conjunction with the If-Match, If-None-Match, and If-Range headers to decide whether it should generate a new request for the latest version of a resource. The format of entity tags themselves is defined in section 3.11 of RFC2616.
Note that this header and the Last-Modified header serve the same validation purpose; the protocol permits sending both, but servers generally need to set only one or the other. The ETag header is new with the HTTP/1.1 protocol standard.
For modern applications, the good folks at Google recommend setting one of either Cache-Control or Expires, and one of either Last-Modified or ETag.
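To make that pairing concrete, here is a minimal Python sketch of a server that sends Cache-Control plus an ETag and answers revalidation requests. The function names, the one-hour max-age, and the choice of a truncated SHA-256 digest as the entity tag are all assumptions made for illustration; real servers and frameworks have their own mechanisms.

```python
import hashlib

def response_headers(body: bytes, max_age: int = 3600) -> dict:
    """One freshness header (Cache-Control) and one validator (ETag)."""
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {"Cache-Control": f"max-age={max_age}", "ETag": etag}

def _strip_weak(tag: str) -> str:
    """If-None-Match compares tags weakly, so ignore a leading W/ marker."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag

def revalidate(if_none_match, current_etag: str) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_none_match is None:
        return 200  # unconditional request: send the full response
    tags = [_strip_weak(t) for t in if_none_match.split(",")]
    if "*" in tags or _strip_weak(current_etag) in tags:
        return 304  # Not Modified: the browser reuses its cached copy
    return 200

headers = response_headers(b"<html>price list</html>")
print(revalidate(headers["ETag"], headers["ETag"]))  # cached copy is current
print(revalidate('"stale-tag"', headers["ETag"]))    # cached copy is outdated
```

The browser stores the ETag with the cached response and sends it back in If-None-Match once the max-age has elapsed, so an unchanged resource costs only a small 304 response.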
With the advent of JavaScript and AJAX, more Web applications are downloading data dynamically. JavaScript developers can use the XMLHttpRequest object to fetch data in XML (or other) format, and display it in real time without forcing a refresh of the entire page. This presents opportunities for finer-grained caching based on the nature of the data displayed within the page.
AJAX applications can still use all of the browser request caching mechanisms discussed above. The resource requested by the XMLHttpRequest object will be stored in the browser's file cache just as other HTTP objects are. A given AJAX application can go further and make refresh calls to the XMLHttpRequest object using programmatic rules. In his article "An AJAX Caching Strategy", Bruce Perry shows how he uses a custom CacheDecider object that he wrote in JavaScript to determine when to update an AJAX display of oil, gasoline, and propane prices.
Developers creating HTML5 applications can create fully offline-aware applications using the HTML5 ApplicationCache interface. The Application Cache uses a cache manifest file to specify which files in an HTML5 application can be used offline, and which files require a network connection. The manifest may also specify a list of fallback files for network resources when the user is offline. For example, instead of fetching the file /get-data.php when disconnected, the manifest can instruct the browser to display the file /offline.html instead. This manifest is referenced in the HTML element of an HTML5 app:
<html manifest="manifest.appcache">
...
</html>
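For illustration, a manifest.appcache file implementing the fallback described above might look like the fragment below. The entries under CACHE: are hypothetical file names; only /get-data.php and /offline.html come from the example in the text.

```
CACHE MANIFEST
# v1 (changing this comment prompts browsers to re-download the cached files)

CACHE:
/index.html
/app.js
/style.css

NETWORK:
*

FALLBACK:
/get-data.php /offline.html
```

The CACHE: section lists files available offline, NETWORK: lists resources that always require a connection (the wildcard allows everything not otherwise listed), and each FALLBACK: line pairs a network resource with the file to serve in its place when the user is offline.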
Web performance optimization is very important, and today's Web application development team can boost site performance and improve its site's load testing scores by selecting from a variety of client side caching techniques. An effective client side caching strategy can reduce load times several-fold. The most recent innovations in client side caching, such as the HTML5 Application Cache, enable an application to run (though perhaps in a more limited form) even without a network connection present.
But That’s Not Why QA Hates You
Over at Forbes.com, Susannah Breslin posts This Is Why Your Employees Hate You.
Basically, her three ordered-list points boil down to 1) You’re hired into a new company and don’t get the lay of the land before you start making a mess, 2) You’re unlikeable, and 3) You are not a leader.
As you might know, I think #1 is very important, and I’ve harped on it on occasion here. When you’re hired in as a manager, you have (or have convinced someone that you have) skill and ideas applicable to leading people in doing whatever you’re managing. You might have led a team in some other industry doing something similar, or you might even have been working within the same industry for a competitor or some related organization. Be that as it may, you don’t know how things are done in your new organization, and until you do, you should probably avoid upsetting the apple cart with your new ideas and processes which are really only old ideas and processes that might have worked at your last employer. At your new posting, some things are done that way because they’ve always been done that way, but some things are done that way because they work for your new employer and new employees. Until you can tell them apart, you don’t know whether your new ideas are improvements or impediments.
As to number 2, remember, lads and lasses, there’s a fine line between being a jerk and being confident and right. Regardless of which side of that line you’re on, people who don’t like you or what you’re saying will think and say you’re a jerk. So be professional, but be confident and tell people the hard truths. Clearly. Dare I say, bluntly? I DARE.
And for number 3, we’ve seen QA managers like this, haven’t we? Just glad to be sitting at the big table and afraid to rock the boat. You’re not going to add anything dodging that responsibility, and when it comes time to trim budget, if nobody remembers you saying anything about anything, especially not saying anything that stuck up for anything, they’re going to wonder why you’re on the payroll in the first place.
So do what Ms. Breslin says. Or the opposite of what she says. You’ll be a better manager for it.
But know these are not the reasons QA hates you. QA hates you because QA hates everybody.
Identifying Memory Leaks in .NET Using WinDbg
Load the SOS debugger extension for a CLR 4.0 application
.loadby sos clr
1) Identify the objects in the finalize queue that survived garbage collection
!fq
SyncBlocks to be cleaned up: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 1 finalizable objects (000000008ba63058->000000008ba63060)
generation 1 has 15 finalizable objects (000000008ba62fe0->000000008ba63058)
generation 2 has 14884 finalizable objects (000000008ba45ec0->000000008ba62fe0)
Ready for finalization 0 objects (000000008ba63060->000000008ba63060)
…
000007ff01d58610 2 1024 System.Data.DataTable
000007ff0164e298 4 1120 System.Diagnostics.Process
000007ff01d82738 8 1728 System.Data.DataColumn
000007ff017235a0 16 1920 System.Threading.OverlappedData
000007ff001f1780 28 2464 System.Threading.Thread
000007ff006764b8 52 3744 System.Reflection.Emit.DynamicResolver
000007ff0026fba8 66 4224 System.Threading.ReaderWriterLock
000007ff01e7bb00 314 10048 System.Data.SqlClient.SNIPacket
000007ff01e717e0 546 21840 System.Data.SqlClient.SNIHandle
000007ff01e49f30 314 32656 System.Data.SqlClient.SqlConnection
000007ff02933bf0 302 45904 System.Data.SqlClient.SqlDataAdapter
000007ff0166d240 398 70048 System.Diagnostics.PerformanceCounter
000007ff01e0d0e8 1510 338240 System.Data.SqlClient.SqlCommand
000007ff0057d8f8 11209 358688 System.WeakReference
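When the statistics list is long, it can help to rank it offline. The following Python sketch parses rows like the ones above and sorts them by total size. It assumes the usual SOS statistics column order (method table, count, total size, class name); the function name `parse_stats` and the three-row sample are just for illustration.

```python
def parse_stats(text):
    """Parse 'MT Count TotalSize ClassName' rows and sort by total size."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # Skip anything that is not a statistics row (headers, generation info)
        if len(parts) == 4 and parts[1].isdigit() and parts[2].isdigit():
            _mt, count, total, name = parts
            rows.append((name, int(count), int(total)))
    return sorted(rows, key=lambda row: row[2], reverse=True)

sample = """\
000007ff01e49f30 314 32656 System.Data.SqlClient.SqlConnection
000007ff01e0d0e8 1510 338240 System.Data.SqlClient.SqlCommand
000007ff0057d8f8 11209 358688 System.WeakReference
"""

for name, count, total in parse_stats(sample):
    print(f"{name}: {count} objects, {total} bytes, {total // count} bytes each")
```

Types with a high count of objects waiting in the finalize queue, such as SqlCommand here, are reasonable candidates for the closer look in the steps that follow.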
2) You can also print the details of the finalizable objects for Gen2 using the above details
dd 000000008ba45ec0 000000008ba62fe0
3) Identify the suspected object having leaks
!dumpheap -type System.Data.SqlClient.SqlCommand
Address MT Size
0000000010c43f40 000007ff01e0d0e8 224
0000000010c44020 000007ff01e0d0e8 224
0000000010c44198 000007ff01e0d0e8 224
0000000010c44278 000007ff01e0d0e8 224
0000000010c444e8 000007ff01e0d0e8 224
4) Identify the GC roots for the objects. The output lists the chain of references that keeps the object alive
!gcroot 0000000010c43f40
Scan Thread 15 OSTHread 1bc0
Scan Thread 16 OSTHread 1d74
Scan Thread 19 OSTHread 3200
Scan Thread 17 OSTHread 21f0
Scan Thread 18 OSTHread 3564
Scan Thread 20 OSTHread 322c
Scan Thread 21 OSTHread 2b80
Scan Thread 28 OSTHread 2e5c
Scan Thread 29 OSTHread 35d0
Scan Thread 30 OSTHread 3bc
Scan Thread 31 OSTHread 2770
Scan Thread 33 OSTHread 19b4
Scan Thread 34 OSTHread 534
Scan Thread 35 OSTHread 3280
Scan Thread 36 OSTHread 1908
Scan Thread 37 OSTHread 344c
Scan Thread 38 OSTHread 13d4
Scan Thread 39 OSTHread 21d4
Scan Thread 40 OSTHread 31a4
DOMAIN(0000000001AB88B0):HANDLE(Pinned):1217c0:Root: 000000002070f040(System.Object[])->
00000000108c9ac0(System.Collections.Hashtable+SyncHashtable)->
00000000108c91e8(System.Collections.Hashtable)->
000000001171efc0(System.Collections.Hashtable+bucket[])->
00000000120a3768(System.Collections.Hashtable)->
00000000120a37c0(System.Collections.Hashtable+bucket[])->
00000000120a3820(System.Collections.Hashtable)->
00000000120fa180(System.Collections.Hashtable+bucket[])->
00000000120b7200(System.Collections.Generic.Dictionary`2[[ATOM.AS.CobolBase.CobolProgramName, ATOM.AS.CobolBase],[ATOM.AS.CobolBase.CobolProgram, ATOM.AS.CobolBase]])->
00000000120b7360(System.Collections.Generic.Dictionary`2+Entry[[ATOM.AS.CobolBase.CobolProgramName, ATOM.AS.CobolBase],[ATOM.AS.CobolBase.CobolProgram, ATOM.AS.CobolBase]][])->
00000000120b7258(ATOM.AS.CobolBase.CobolProgram)->
00000000120b7310(System.Collections.Generic.List`1[[ATOM.AS.CobolBase.IProgramEvents, ATOM.AS.CobolBase]])->
0000000019b0f918(System.Object[])->
0000000019b0f590(ATOM.AS.DataAccess.DataAccess)->
0000000019b13c40(ATOM.AS.DataAccess.ProviderSQL)->
0000000019b13d70(System.Data.SqlClient.SqlCommand)->
0000000019b15978(System.Data.SqlClient.SqlCommand+CachedAsyncState)
@2011, copyright Vamsidhar Tokala