- Top 3 Performance Problems in Custom Microsoft CRM Applications
- Top 10 Client-Side Performance Problems in Web 2.0
- How to Automate Google Analytics Analysis
- Ajax Best Practices: Reduce and Aggregate similar XHR calls
- dynaTrace Continuously Monitors ShowSlow URLs
- Performance as Key to Success! How Online News Portals could do better
- Week 9 – How to Measure Application Performance
- Video of Business Transaction Management in Action: In 6 minutes from Slow Search Request to identify Impacted Users and Offending SQL
- IE Compatibility View: How to identify performance problems between IE versions
- dynaTrace at Web Performance Meetups in Boston and New York City
- Too Much Cache is Like a Krispy Kreme Burger
- Debugging SAP scripts using SAPGUI Spy in LoadRunner
- Monitoring Maintenance Windows
- How to Monitor Oracle Database Performance
- Stressing Out Your Access Management System
- Running remote Unix commands from LoadRunner
- Web Performance Tuning Never Ends
- Running command-line programs from LoadRunner
- IIS Connections Affect Web Performance
- Load Testing Quote for August 19, 2010
Software Development
Agile 2010 Session on Learning
At Agile 2010 I will be leading a session on building a learning culture on your agile team (Wednesday at 3:30 in Asia 3). I have done this session before but have reworked it significantly to add more interactive activities. I think it will be a lot of fun and … hopefully you will learn something!
I will also be giving away two books that have shaped my thinking on learning in an agile context:
Let’s change the tune
As a community, we’re very guilty of using technical terms and confusing business users. If we want to get them more involved, we have to use the right names for the right things and stop confusing people. This lesson is obvious in acceptance tests and we know that we need to keep the naming consistent and avoid misleading terms, but we don’t do this when we talk about the process. For example, when we say continuous integration in the context of agile acceptance testing, we don’t really mean running integration tests. So why use that term, and then have to explain how acceptance tests are different from integration tests? Until I started using Specification Workshops as the name for a collaborative meeting about acceptance tests, it was very hard to convince business users to participate. But a simple change in naming made the problem go away.
By using better names, we can avoid a lot of completely meaningless discussions and get people started on the right path straight away. So here is what I propose:
One of the biggest issues teams have with acceptance testing is who should write what and when. So we need to come up with a good name for the start of the process that clearly says that everyone should be involved, and that this needs to happen before developers start developing and testers start testing, because we want to use acceptance tests as a target for development. Test first is a good technical name for it, but business users don’t get it. I propose we talk about Specifying Collaboratively instead of test first or writing acceptance tests.
It sounds quite normal to put every single numerical possibility into an automated functional test. Why wouldn’t you, if it is automated? But such complex tests are unusable as a communication tool. So instead of writing functional tests, let’s talk about Illustrating with Examples (thanks, Dave) and expect the output of that to be Key Examples, to point out that we only want enough to explain the context properly.
Key examples are raw material, but if we just talk about acceptance testing, then why not dump all those complicated 50-column, 100-row tables into an acceptance test without any explanation? It’s going to be tested by a machine anyway. But these tests are for humans as well as for machines, so let’s talk about a whole new step after this: the process of extracting the minimal set of attributes and examples to specify a business rule, and adding a title, description and so on. I propose we call this Distilling the specification.
I just don’t want to spend any more time arguing with people who have already paid for a QTP license that it is completely unusable for acceptance tests. As long as we talk about test automation, there is always going to be a push to use whatever horrible contraption testers already use for automation, because it’s logical to use a single tool. Acceptance testing tools don’t compete with QTP or things like that; they address a completely different problem. Instead of talking about test automation, let’s talk about automating a specification without distorting any information – Literal Automation. Literal automation also avoids the whole scripting horror and the use of technical libraries directly in test specifications. If it’s literal, it should look as it looked on the whiteboard; it should not be translated to Selenium commands.
After Literal Automation, we get a specification that can be checked against the code automatically, an Executable Specification.
We want to run all the acceptance tests frequently to make sure that the system still does what it is supposed to do (and, equally important, to check that the specification still says what the system does). If we call this regression testing, it’s very hard to explain to testers why they should not go and add five million other test cases to a previously nice, small and focused specification. If we talk about continuous integration, then we get into the trouble of explaining why these tests should not always be end-to-end and run the whole system. On top of that, for some legacy systems we need to run acceptance tests against a live, deployed environment. Technical integration tests run before deployment. So let’s not talk about regression testing or continuous integration; let’s talk about Continuous Validation (or even Frequent Validation).
The long term pay-off from agile acceptance testing is having a reference on what the system does that is as relevant as the code itself, but much easier to read. That makes development much more efficient long term, facilitates collaboration with business users, leads to an alignment of software design and business models and just makes everyone’s life much easier. But to do this, the reference really has to be relevant, it has to be maintained, and it has to be consistent with itself and with the code. We should not have silos of tests that use terms we had three years ago, and those we used a year ago, and so on. Going back and updating tests is a very hard thing to sell to teams who are busy, but going back to update documentation after a big change is expected. So let’s not talk about folders filled with hundreds of tests, let’s talk about a Living Documentation system. That makes it much easier to explain why things should be self-explanatory, why business users need access to this as well and why it has to be nicely organised so that things are easy to find.
What do you think about this? Does it make sense? Will it help? Do you have a better name for one of these concepts, that explains it more clearly?
Webinar: Effect Maps – Tomorrow
Donna Reed generously offered to organise a follow-up webinar to the one I did recently on effective specifications for agile teams. I’ll run through an interactive exercise of creating an effect map and then do Q&A on all the topics we did not have time to answer the last time.
The webinar is tomorrow. Register now
The Sine of Death by UI Test Automation
I came up with this yesterday while running my agile acceptance testing workshop for a client.
Sounds familiar? Read How to implement UI Testing without shooting yourself in the foot
Clean Acceptance Tests, August 3rd, central London
The next meeting of the UK agile testing user group is on the 3rd of August in central London. Here are the details of the talk:
Dan Leong on Clean Acceptance Tests
This presentation discusses how our agile team renewed our focus and understanding of our acceptance tests when the team members changed. Our group found some core shared values in the context of acceptance testing which we expressed in the style of the agile manifesto. We then looked at our existing tests to find bad test smells that we could learn from. The whole exercise was a good experience and we encourage you to try something similar in your teams.
Dan Leong is a team lead at Sky Network Services, where they have been using agile/XP techniques for over 4 years to deliver the company’s broadband and voice provisioning system. He has over 10 years of experience working in companies ranging from small .com start-ups to global advertising and media companies. Like the rest of us, he’s trying to figure out how to do things better.
The event is free to attend, but up front registration is required. Register now
Driving CRUD screens with BDD
There is a discussion on the UK agile testing mailing list on driving the development of an administrative application with BDD. It illustrates a problem that many teams have, so I’ll post my response here as well.
Although we have a long way to go things are going OK at the moment and we feel it is bringing some real focus to the development process. However a lot of the early stories are largely admin type CRUD – for example functionality to set up user defined entities and their user defined properties within the system and to provide a mechanism for relating these entities to each other
…
Does anyone have any advice about how to write tests for this sort of stuff or any experiences in starting out with BDD they can share.
CRUD is not a user story, it’s a screen. It’s not a business function, but an implementation detail. Why do the business users need a particular CRUD screen? (I know it sounds like a stupid rhetorical question, but I’m serious.) What does it allow them to do?
Often you don’t need to implement an entire CRUD screen and deliver them one by one. Sometimes there is value in releasing two screens that allow you to set a subset of properties but together they bring value. You can then automate these tests through the UI and use the CRUD screens, but that will be hidden in the automation layer. Say for example that we have a risk report that lists users and their card numbers:
Scenario: Only regular customers with a specific risk category and card type show up in risk reports
Given the following users
|name  |type    |card type  |card number            |risk category|
|Mike  |VIP     |Mastercard |53111 11111 11111 1111 |X            |
|Tom   |VIP     |Visa       |41111 11111 11111 1111 |Y            |
|John  |Regular |Mastercard |51111 11111 11111 1111 |X            |
|Steve |Regular |Mastercard |52111 11111 11111 1111 |Y            |
Then the risk report for Mastercard and category X contains the following data
|John |51111 11111 11111 1111|
This could, for example, invoke the user CRUD to save a user name, type and risk category and a completely different CRUD to save card details for that user. Any other information that would go on that CRUD (addresses, card expiry dates etc.) isn’t part of this story or its criteria because it is not important for this particular report.
Start with the outputs, the reports that the system produces, instead of data entry operations. This ensures that you have the data you need to produce the reports at the end, and that you don’t have superfluous data that nobody really cares about. If you don’t do that, the resulting data schemas are often overcomplicated and contain many things that simply aren’t used at the end at all.
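To make the "start with the outputs" idea concrete, here is a minimal Ruby sketch of the rule behind the risk report scenario above. It is only an illustration: the method names (scenario_users, risk_report) and the hash keys are made up for this example, and a real automation layer would drive the CRUD screens rather than an in-memory list.

```ruby
# Hypothetical in-memory data mirroring the rows of the scenario table.
def scenario_users
  [
    { name: "Mike",  type: "VIP",     card_type: "Mastercard", card_number: "53111 11111 11111 1111", risk_category: "X" },
    { name: "Tom",   type: "VIP",     card_type: "Visa",       card_number: "41111 11111 11111 1111", risk_category: "Y" },
    { name: "John",  type: "Regular", card_type: "Mastercard", card_number: "51111 11111 11111 1111", risk_category: "X" },
    { name: "Steve", type: "Regular", card_type: "Mastercard", card_number: "52111 11111 11111 1111", risk_category: "Y" }
  ]
end

# Builds the report rows: name and card number of regular customers
# that match the requested card type and risk category.
def risk_report(users, card_type:, risk_category:)
  users
    .select { |u| u[:type] == "Regular" && u[:card_type] == card_type && u[:risk_category] == risk_category }
    .map { |u| [u[:name], u[:card_number]] }
end

report = risk_report(scenario_users, card_type: "Mastercard", risk_category: "X")
puts report.inspect # prints [["John", "51111 11111 11111 1111"]]
```

Only the five attributes the report needs appear in the data. Addresses and expiry dates never show up, which is the point of working back from the output.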
Guardian pulling the plug?
The Unite union (of the let’s screw BA travellers for several weeks every few months fame) created a Facebook group to stop jobs in the technology department being outsourced and offshored at Guardian News Media Ltd, publisher of the Guardian, the Observer and guardian.co.uk. According to the group web site, the board of Guardian News Media is meeting tomorrow to make a decision on outsourcing a large part of their IT department.
Without taking a position on who’s right or wrong in this case, I’m very interested in how this whole thing is going to play out. A recent major rewrite of their flagship web site is one of the most publicised apparently successful IT projects in the UK.
Amazon uses them as a case study for cloud computing.
Phil Willis spoke at several conferences, including Qcon London last year, on how applying domain driven design on this project helped domain experts to get involved in software development, and how they maintained a deep, malleable domain model, whilst meeting deadlines. This is one of the key case studies used by the Domain Driven Design community to prove that DDD works.
ThoughtWorks use them as a reference of how they help big companies implement agile processes. On the back of that project, they got to do the same with AutoTrader, run by a Guardian subsidiary, Trader Media. Their web site quotes Tom Turcan, General Manager for Digital: “A multi-million pound project running on time, and absorbing growth in scope – remarkable”.
A common thread here is that DDD, clouds, agile apparently gave better service more efficiently and produced more business value for the same investment. If the Guardian News Media board is now thinking of pulling the plug on all that, then that is seriously casting a shadow on those claims. Cloud computing aside, both Domain Driven Design and Agile development rely heavily on on-site close collaboration of business users and IT development teams. I guess a multi-million pound project running on time, and absorbing growth in scope, is less remarkable than the money they expect to save by sending the jobs abroad. Or maybe this is a confirmation that their new web site runs on its own and doesn’t need that many people to maintain it.
Update (2:30 PM) The Register picked up the story this morning as well
Stop automating manual test scripts!
Creating an Executable Specification from existing manual test scripts might seem like a logical thing to do when starting out with Specification by Example and Agile Acceptance Testing. Such scripts already describe what the system does, and the testers are running them anyway, so automation will surely help. Not really. This is in fact one of the most common failure patterns.
The problem is that manual and automated checks are affected by a completely different set of constraints. With manual testing, the time spent preparing the context is often a key bottleneck. With automated testing, people spend most time on understanding what is wrong when a test fails.
For example, to prepare for a manual test that checks user account management rules, you might have to log on to an administrative application, create a user, log on as that new user to the client application, and change the password after first use. To avoid doing this several times during the test, manual scripts often reuse the context. So you would create the user once, block that account and verify that the user cannot log on, reset the password to verify that it is re-enabled, then set some user preferences and verify they change the home page correctly. This approach helps a tester run through the script quicker.
With automated testing, time spent on setting up the user is no longer a problem. Automated tests generally go through many more cases than manual tests, and when they run correctly nobody is really looking at them. Once a test fails, someone has to go in and understand what went wrong. If a test is described as a sequence of interdependent steps, it will be very hard to understand what exactly caused the problem, because the context changes throughout the script. The fact that a single script is checking ten different things also makes it more probable that the test will fail, because it is affected by lots of different areas of code. In the previous example with user account management, if the password reset function stops working, we won’t be able to set the user preferences correctly. If we had ten different, smaller, focused and independent tests instead of one big script, a bug in the password reset function wouldn’t affect the test results for user preferences. That makes tests more resilient to change and reduces the cost of maintenance. It also helps us pinpoint the problems quicker.
Instead of plainly automating manual test scripts, think about what the script is testing and describe that with a group of independent, focused tests. This will significantly reduce the automation overhead and maintenance costs.
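A minimal Ruby sketch of the difference, using the user account example from above. Everything here is hypothetical: FakeAccountSystem is a made-up in-memory stand-in for the real application, and the broken_reset flag only exists to simulate a bug in the password reset function. Each check builds its own context instead of reusing the state left behind by a previous step.

```ruby
# Hypothetical in-memory stand-in for the system under test.
class FakeAccountSystem
  def initialize(broken_reset: false)
    @broken_reset = broken_reset
    @users = {}
  end

  def create_user(name)
    @users[name] = { blocked: false, home_page: "default" }
  end

  def block(name)
    @users.fetch(name)[:blocked] = true
  end

  def reset_password(name)
    raise "password reset failed" if @broken_reset
    @users.fetch(name)[:blocked] = false
  end

  def can_log_on?(name)
    !@users.fetch(name)[:blocked]
  end

  def set_home_page(name, page)
    @users.fetch(name)[:home_page] = page
  end

  def home_page(name)
    @users.fetch(name)[:home_page]
  end
end

# Three focused checks; each one sets up exactly the context it needs.
def independent_checks
  {
    "blocked user cannot log on" => ->(sys) {
      sys.create_user("tom")
      sys.block("tom")
      !sys.can_log_on?("tom")
    },
    "password reset re-enables the account" => ->(sys) {
      sys.create_user("tom")
      sys.block("tom")
      sys.reset_password("tom")
      sys.can_log_on?("tom")
    },
    "user preferences change the home page" => ->(sys) {
      sys.create_user("tom")
      sys.set_home_page("tom", "news")
      sys.home_page("tom") == "news"
    }
  }
end

# Runs every check against a fresh system, so one failure cannot leak into another.
def run_checks(broken_reset: false)
  independent_checks.transform_values do |check|
    begin
      check.call(FakeAccountSystem.new(broken_reset: broken_reset)) ? :pass : :fail
    rescue StandardError
      :fail
    end
  end
end

run_checks(broken_reset: true).each { |name, result| puts "#{result}: #{name}" }
```

With the reset bug switched on, only the password reset check fails. In a single long script the user preferences step would sit after the failing reset step and would never get a chance to run.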
Effective Specifications for Agile Teams: Slides and Links
It was a pleasure to do the Agilista PM Webinar on Effective Specifications for Agile Teams today. This is the first time I did a webinar, so I’m sorry if it was a bit rough, but I hope you enjoyed it. Here are the links and slides.
- download slides
- Bridging the Communication Gap: Specification by Example and Agile Acceptance Testing
- Effect Managing IT – also see my review of the book
- Exploring Requirements: Quality Before Design
- Sources of Power: How People Make Decisions
- Challenging Requirements (video)
- Other articles and resources mentioned in the webinar
Taking action on JavaScript Popups with Ruby/Watir in Firefox and Internet Explorer
A long time ago I did a post on catching JavaScript popups, but I now have a much better way of catching them.
These JavaScript popups cause trouble because they stop the page from fully loading, causing Watir to wait (as the page is waiting), which means the next command in your script will never be reached. The previous workaround was to use Watir’s built-in click_no_wait, but I have found that to be extremely temperamental: it did not always work, depending on which element the click was being performed on.
The new and improved method is to have a completely separate process that runs in the background and continually checks for JavaScript popups. AutoIt commands are used to first locate the popup and then, depending on what text or title is present in the popup, a different action can be performed on it. Unfortunately the same code cannot be used for both IE and FF, because the AutoIt controls cannot perform the same actions on IE popups as they can on FF popups. I have included the code for both below:
clickPopupsIE.rb
require 'win32ole'
begin
  # AutoItX COM control used to find and click the dialog buttons
  autoit = WIN32OLE.new('AutoItX3.Control')
  loop do
    # Arguments: window title, window text, control to click
    autoit.ControlClick("Windows Internet Explorer",'', 'OK')
    autoit.ControlClick("Security Information",'', '&Yes')
    autoit.ControlClick("Security Alert",'', '&Yes')
    autoit.ControlClick("Security Warning",'', 'Yes')
    autoit.ControlClick("Message from webpage",'', 'OK')
    sleep 1
  end
rescue Exception => e
  puts e
end
clickPopupsFF.rb
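require 'win32ole'
# Firefox puts the site name in the popup title, so set it to the site under test
websiteName = "w3schools.com"
begin
  autoit = WIN32OLE.new('AutoItX3.Control')
  loop do
    autoit.WinActivate("The page at http://#{websiteName} says:")
    # WinWait returns 1 if the popup appeared within the 2 second timeout
    autoit.Send("{ENTER}") if(autoit.WinWait("The page at http://#{websiteName} says:",'',2) == 1)
  end
rescue Exception => e
  puts e
end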
These two scripts can then be called from any of your other Watir scripts using the following functions:
require 'win32/process'
# Starts the Firefox popup clicker in the background and remembers its process id
def callPopupKillerFF
  $pid = Process.create(:app_name => 'ruby clickPopupsFF.rb', :creation_flags => Process::DETACHED_PROCESS).process_id
end
# Starts the IE popup clicker in the background and remembers its process id
def callPopupKillerIE
  $pid = Process.create(:app_name => 'ruby clickPopupsIE.rb', :creation_flags => Process::DETACHED_PROCESS).process_id
end
# Stops whichever popup clicker was started last
def killPopupKiller
  Process.kill(9,$pid)
end
As you can see above, you do need to require one more Ruby gem, win32-process (required as ‘win32/process’). It is used to run the popup clicker as a separate process in the background. Once you have those functions in place you can simply call:
callPopupKillerIE #Starts the IE popup killer
#Some watir code that results in a popup#
killPopupKiller #Kills the popup killer process, so that you do not end up with 5 of them running!
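One thing the usage above does not cover is what happens if the Watir code in the middle raises: killPopupKiller is then never reached and the background process keeps running. A minimal sketch of one way around that is a small wrapper with an ensure block. The name with_popup_killer and its start/stop parameters are made up for this example; they take any callables, so the sketch can be tried without Windows or AutoIt.

```ruby
# Runs the given block with a popup killer in the background and always
# stops the killer afterwards, even if the block raises.
def with_popup_killer(start:, stop:)
  start.call
  begin
    yield
  ensure
    stop.call
  end
end

# Stand-in callables so the sketch runs anywhere. With the functions from
# this post you would pass method(:callPopupKillerIE) and method(:killPopupKiller).
events = []
begin
  with_popup_killer(start: -> { events << :started }, stop: -> { events << :stopped }) do
    events << :running
    raise "simulated Watir failure"
  end
rescue RuntimeError => e
  events << :rescued
end
puts events.inspect
```

The stop callable runs before the error reaches the caller, so a failing script should not leave a stray popup killer behind.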
Well there you have it, a robust and effective popup killer for both IE and FF. If you have any questions let me know!
–Steve
Tagged: AutoIt, Automation, Close, FireFox, IE, JavaScript, Killer, Pop, popup, QA, Ruby, test, Testing, Up, Watir
Moq Sequences
I have recently started to use Moq for mocking and I really like it. The fluent interface and lambda support make it very easy and natural to use.
However, I quickly ran into a situation where I wanted to ensure that methods on a mock object were called in a particular order. I have provided a simple example below where I want to check that BlogPresenter.Show() shows blogs in reverse chronological order:
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public DateTime DateTime { get; set; }
}

public class BlogPresenter
{
    private readonly BlogView view;

    public BlogPresenter(BlogView view)
    {
        this.view = view;
    }

    // Shows the posts with the most recent one first
    public void Show(IEnumerable<Post> posts)
    {
        foreach (var post in posts.OrderByDescending(post => post.DateTime))
            view.ShowPost(post);
    }
}

public interface BlogView
{
    void ShowPost(Post post);
}
To check this I used a callback to increment a counter like this:
[Test]
public void Should_show_each_post_once_with_most_recent_first()
{
    var olderPost = new Post { DateTime = new DateTime(2010, 1, 1) };
    var newerPost = new Post { DateTime = new DateTime(2010, 1, 2) };
    var posts = new List<Post> { newerPost, olderPost };

    var mockView = new Mock<BlogView>();
    var viewOrder = 0;
    // Each callback asserts the position at which its post was shown
    mockView.Setup(v => v.ShowPost(newerPost)).Callback(() => Assert.That(viewOrder++, Is.EqualTo(0)));
    mockView.Setup(v => v.ShowPost(olderPost)).Callback(() => Assert.That(viewOrder++, Is.EqualTo(1)));

    new BlogPresenter(mockView.Object).Show(posts);

    mockView.Verify(v => v.ShowPost(newerPost), Times.Once());
    mockView.Verify(v => v.ShowPost(olderPost), Times.Once());
}
This code works but is not very intentional. I wanted to express the intent that there is a required ordering of method calls. After searching around I found a nice code snippet from Max Guernsey, III that was promising. I thought I would push a little further to see if I could get something like this:
[Test]
public void Should_show_each_post_with_most_recent_first_using_sequences()
{
    var olderPost = new Post { DateTime = new DateTime(2010, 1, 1) };
    var newerPost = new Post { DateTime = new DateTime(2010, 1, 2) };
    var posts = new List<Post> { newerPost, olderPost };

    var mockView = new Mock<BlogView>();

    using (Sequence.Create())
    {
        mockView.Setup(v => v.ShowPost(newerPost)).InSequence();
        mockView.Setup(v => v.ShowPost(olderPost)).InSequence();

        new BlogPresenter(mockView.Object).Show(posts);
    }
}
So, I created Moq.Sequences and you can download Moq.Sequences.dll from github. Simply add Moq.Sequences.dll as a reference in your .NET project and add a using Moq.Sequences; in your test class. Moq.Sequences supports the following:
- checks order of method calls, property gets and property sets
- allows you to specify the number of times a call is made before the next one is expected
- allows intermixing of sequenced and non-sequenced expectations
- thread safe – each thread can have its own sequence
Sequences are added using the Sequence static class and extension methods. You create a sequence by calling:
using (Sequence.Create())
{
    ...
}
Sequences that do not fully complete are detected when the sequence is disposed. So, all the setups and mock calls should be done within the lifetime of the sequence.
Steps
Within a sequence you set the expectations for ordering via an extension method InSequence(). I call these steps. For example,
using (Sequence.Create())
{
    mock.Setup(_ => _.Method1()).InSequence();
    mock.Setup(_ => _.Method2()).InSequence();
    ...
}
This sets an expectation that Method1() will be called once followed by a single call to Method2(). You can set more sophisticated expectations by using a Times parameter:
using (Sequence.Create())
{
    mock.Setup(_ => _.Method1()).InSequence(Times.AtMostOnce());
    mock.Setup(_ => _.Method2()).InSequence(Times.Between(1, 10, Range.Inclusive));
    ...
}
Steps can be created for method calls, property gets and property sets:
using (Sequence.Create())
{
    mock.Setup(_ => _.Method1()).InSequence();                  // method call
    mock.SetupGet(_ => _.Property1).InSequence().Returns(0);    // property get
    mock.SetupSet(_ => _.Property2 = 0).InSequence();           // property set
    ...
}
Also, sequenced steps can be intermingled with non-sequenced expectations.
Loops
Loops are used when you want to check that a group of method calls are called in order several times. An example of this could be some resources which you expect to be opened, operated on and then closed, one after the other.
Loops are created via Sequence.Loop():
using (Sequence.Create())
{
    mock.Setup(_ => _.Method1()).InSequence();
    using (Sequence.Loop())
    {
        mock.Setup(_ => _.Method2()).InSequence();
        mock.Setup(_ => _.Method3()).InSequence();
    }
    ...
}
The above checks that Method1 is called, followed by any number of calls to Method2, each followed immediately by a call to Method3. You can constrain the number of times the loop can be executed by adding a Times parameter, as in the following example where the combination of Method2 and Method3 should be called exactly twice:
using (Sequence.Create())
{
    mock.Setup(_ => _.Method1()).InSequence();
    using (Sequence.Loop(Times.Exactly(2)))
    {
        mock.Setup(_ => _.Method2()).InSequence();
        mock.Setup(_ => _.Method3()).InSequence();
    }
    ...
}
Wrap-Up
Moq.Sequences provides a simple, intentional way of checking that things are done in a specific order. Feel free to check it out at github. I welcome any feedback!
How to do agile when we only have 50 crap developers?
Why do people complaining that they can’t do agile development with 50 crap developers not see that the problem is in the second part of that statement, not the first? I got an e-mail last week that shows the point perfectly:
We discussed whether an agile approach is right, and I concluded that not everyone can work that way.
Quite true. I find it self-evident that not everyone can do software development, agile or any other way. That requires brains, knowledge, experience is a plus, and hopefully some talent as well. And of course, there is no generic approach that works in every context.
We think that an agile approach asks programmers to be much more engaged than when they’re just being served what to do
It’s hard for me to make a comparison to answer this. I’ve always tried to be very engaged in my own work and I expected the same from everyone else working with me, even before I ever did anything resembling agile. I’ve never seen a project where people were asked not to be engaged in what they need to do, but out of general principle I would refuse to participate in one.
If your programmers aren’t engaged and they get everything served to them, your problem is right there. It is not in a process, agile or non agile.
Which means the choice of people is very important
I completely agree. Once again, this isn’t particularly specific to agile software development approaches – or even software development at all. This is important for any craft. My former colleague Relja Jovic, who was the executive editor at PC World Yugoslavia when I worked there, used to say “From shit, you can only make a shit pie” whenever we were asked to get someone unqualified to write an article (“how hard can it be?”). That holds true for programming, testing, analysis, project management and anything else to do with delivering software. With crap people, you get crap output. Tough luck. Maybe hire people who know how to deliver software instead?
Agile acceptance testing – executive summary
I’m thinking about organising a one day or half-day seminar on the current state of agile acceptance testing for managers, team leaders and senior technical people some time in September. If you are interested or know someone who might be (maybe your boss needs to know about these things?) drop me an e-mail or leave a comment below. The seminar would be based on the research I did for my upcoming book, and would cover the following topics:
- key benefits that teams from my research are getting from agile acceptance testing, specification by example and behaviour driven development
- key principles to implement this properly
- key practices to support the principles, and how teams from my research use them in different contexts
- key pitfalls/problems experienced commonly by the teams covered by the research
- some generous time for Q&A
Here are the things I’d like to know from you:
- would half-day or full day be better? (full day would allow us to go more in-depth, but that means that people would have to take a day off work)
- do you have any specific questions you would like to have answered?
- does the schedule make sense? how can I give you more value with this?
Anatomy of a good acceptance test
The long term benefits of agile acceptance testing come from live documentation – a description of the system functionality which is reliable, easily accessible and much easier to read and understand than the code. In order to be effective as live specification, acceptance tests have to be written in a way that enables others to pick them up months or even years later and easily understand what they do, why they are there and what they describe. Here are some simple heuristics that will help you measure and improve your tests to make them better as a live specification.
The five most important things to think about are:
- It needs to be self explanatory
- It needs to be focused
- It needs to be a specification, not a script
- It needs to be in domain language
- It needs to be about business functionality, not about software design
Here is a really good example:
It has all the elements listed above. Whenever I’ve shown it to people in the workshops, I did not have to use a single word to explain it. The title and the introductory paragraph explain the structure of the test data enough that readers don’t need to work back from the data to understand the rule. But the examples are there to make it actually testable and explain the behaviour in edge cases. It is focused on a very particular rule of free delivery availability, does not explain how the books are purchased but just what the available delivery mechanism is, and does not try to talk about any implementation specifics.
Here is a really bad example (taken from http://fitnesse.org/FitNesse.UserGuide.PayrollTests.AddAndPayTest)
This test has so many bad things in it that it is actually a good example to demonstrate what happens when people don’t take care about writing good tests.
First of all, although it has a title and some text around the tables to seemingly explain what’s going on, the effect of that is marginal. Why is this test “simple”, and what exactly in payroll is it checking? It fails straight away on the self-explanatory litmus-test.
Second, it’s not really clear what this test is checking. We need to work backwards. It seems to verify that the cheques are printed with unique numbers, starting from the next available number that’s configurable in the system. It also seems to validate the data printed on each cheque. And that there is one cheque printed per employee (I’ll come back to this later).
There is a lot of seemingly incidental complexity there – names and addresses aren’t really used anywhere in the test apart from setting up the employees. There are some database IDs there which are completely irrelevant for the business rules, but they are used to match up employees and the paycheck inspector.
Paycheck inspector is obviously an invented thing just for testing. No company is going to have Peter Sellers in a Clouseau outfit inspecting cheques as they go out. If you have enough employees to have to print cheques, you don’t want to inspect them manually. That’s what this test is about, anyway.
There is also a very interesting issue of those blank cells in the assertion part of the test, and the two Paycheck inspector tables which seem unrelated. Blank cells in FitNesse are used to print test results for debugging and troubleshooting; they don’t check anything. So this is an automated test that a human has to look over, pretty much defeating the purpose of automation. Blank cells are typically a sign of instability in tests (more on this a bit later) and they are often a signal that you’re missing something – either testing in the wrong place or missing a rule that would help make the system process repeatable and testable.
The language is inconsistent, which makes it hard to make a connection between inputs and outputs at first. What is the 1001 value in the table below? The column header tells us that it is a number – well thanks for that, I thought it was a sausage. There is a “cheque number” above, but what kind of a cheque number is that? What is the relationship between these two things?
So this test is very very bad.
Presuming that the addresses are there because cheques are printed as part of a statement with an address ready for automated envelope packaging, this test fails to check at least one very important thing (and I’ll come back to this later as well): that the right people got paid the right amounts. If the first person got both cheques, this test would happily pass. If they both got each other’s salaries, this test would pass. If a date far in the future was printed on the cheque, our employees might not be able to cash it in but the test would still pass. The reason these cells are blank is that there is another hidden rule there: ordering of cheques coming out. There is nothing specifying that. So a technical workaround for a functional gap is to create a test that gives us loads of false positives.
Is this test checking one thing or many things? Without the context information it’s hard to tell. If the cheque printing system is used for anything else, I’d pull out the fact that cheque numbers are unique and start from a configured number into a separate page. If we only ever print salary cheques, it’s probably part of a single thing (salary cheque printing).
Now, let’s clean it up. Let’s try to work back and drop all the incidental stuff. Let’s use a nice descriptive title, such as “Payroll Cheque Printing”. And let’s add a paragraph that explains the structure of the test.
A cheque has the payee name, amount, payment date. A cheque does not have a name and a salary. If the cheque is printed on a statement, it also has an address that will be used for automatic envelope packaging. A combination of a name and address should be enough for us to match the employee with his cheque — we don’t really need the database IDs. We can make the system more testable by agreeing on an ordering rule, whatever it is. For example, alphabetic by name.
Let’s also pull the context to the start of the test. Our context is the payroll date and the next available cheque number, along with employee salary data. Let’s make it explicit what the number is for, so that the readers don’t have to figure this out for themselves in the future. We can also make this block stick out visually to show that it is about the context.
The action that gets kicked off doesn’t necessarily need to be listed in the test. A payroll run can be executed implicitly by the table that checks payroll results. This is an example of focusing on what’s being tested instead of how it is being checked. There is no need to have a separate step that says “Next we pay them”.
Let’s also rename that paycheck inspector to something that makes more sense. Because we want whoever automates this to ensure that we check for all cheques printed, let’s put that in the header. Otherwise, someone might use subset matching and the system might print every cheque twice and we won’t notice.
And here is the cleaned version.
Much easier to understand, without all that incidental stuff. And now comes the punchline. When we look at the test like this, without database IDs, without all the unnecessary clutter, we can have a nice shot at answering the question “Are we missing something?”. What are the edge cases that might break this? We don’t really need to ensure validity of employee data, hopefully that’s done in another part of the system. But is there any kind of valid employee data that would be an edge case for this test – can we play with the numbers to make this illogical in some way?
An obvious answer to that is – what happens if an employee has a salary of 0? Do we still print the cheque? The rule as we described it says “One cheque per employee” – so any employees that have been fired years ago and no longer receive salaries would still get cheques printed, with zeroes on them. We could then have a discussion with the business on making this rule stronger and ensuring that cheques don’t go out when they don’t need to.
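The cleaned-up rules can be sketched in code. This is only a minimal illustration, not the system behind the example: `Employee`, `Cheque` and `print_cheques` are hypothetical names, the sample data is invented, alphabetical ordering is just the example rule suggested above, and skipping zero salaries is an assumption about what the business might decide.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Employee:
    name: str
    address: str
    salary: Decimal


@dataclass(frozen=True)
class Cheque:
    number: int
    payee: str
    address: str
    amount: Decimal
    payment_date: date


def print_cheques(employees, payroll_date, next_cheque_number):
    """Return the cheques for one payroll run, ordered alphabetically by payee."""
    # Assumption: the business decided that a zero salary means no cheque.
    payable = sorted((e for e in employees if e.salary > 0), key=lambda e: e.name)
    return [
        Cheque(next_cheque_number + offset, e.name, e.address, e.salary, payroll_date)
        for offset, e in enumerate(payable)
    ]


# Context: payroll date, next available cheque number, employee salary data.
payroll_date = date(2010, 7, 30)
employees = [
    Employee("Bob Sample", "2 Side Road", Decimal("2000.00")),
    Employee("Ann Example", "1 Main Street", Decimal("1500.00")),
    Employee("Carl Former", "3 Back Lane", Decimal("0")),
]

cheques = print_cheques(employees, payroll_date, next_cheque_number=1000)

# "All cheques printed": compare the whole list, not a subset, so a duplicate,
# a swapped amount or a wrong date fails the check.
expected = [
    Cheque(1000, "Ann Example", "1 Main Street", Decimal("1500.00"), payroll_date),
    Cheque(1001, "Bob Sample", "2 Side Road", Decimal("2000.00"), payroll_date),
]
assert cheques == expected
```

Because the agreed ordering makes the output deterministic, every cell can be asserted and nothing needs to be left blank for a human to inspect.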
FitNesse gets a bad reputation because of this kind of broken test. Concordion was built as an answer. Some people at my recent workshop on this example pointed out that the Given-When-Then structure of Cucumber would be better at preventing some of these problems. I don’t think so. It’s not about the tools – people can do this kind of bad test design with any tool, and likewise they could do nice and clean tests with FitNesse. I guess the fact that the basic examples coming with FitNesse are so bad doesn’t help, but the problem is not in the tool. It’s about the process and effort put into making the tests easy to understand. The funny thing is it doesn’t take a lot more effort to make the test nice and clean, but it brings a lot more value that way.
If you found this article helpful, you’ll love my three day workshop on effective specification by example and agile acceptance testing
Effective specifications for agile projects
Meet me for an online webinar on Effective Specifications for Agile Projects. We’ll start at 7 pm UK Time / 2pm EDT / 11am PDT on July 15th. Here’s the abstract:
Fast turnaround and short iterations require a very efficient and precise specifications process to provide direction. Two emerging practices of specification by example and agile acceptance testing enable agile teams to specify, deliver and verify better. This webinar will teach you about these practices and how to implement them in your project.
LEARNING OBJECTIVES:
- how to ensure a shared understanding of specifications by all stakeholders
- how to manage requirements to provide details just in time but ensure that all development team members have enough information to work
- how to ensure that the product built is fit for purpose
- how to know when you are really done with a story
- how to facilitate change in software with live documentation
The webinar is free for participants, but you have to sign up for the webinar upfront.
Evolution of DDD: CQRS and Event Sourcing
Speaking at the DDD exchange conference today, Greg Young said that doing domain driven design is impossible with a classic three layer architecture where DTOs are being shared across layers. He then presented CQRS and Event Sourcing, which according to him provide a much better way to design complex systems.
“We need to start capturing the intent of the user. Our domain is focused on behaviours. Users should send back commands, not DTOs. The server needs to be told to do something.”, said Young, advising developers not to use editable data grids in user interface, because they are data oriented. Users should specify the intent, not edit data, said he.
“Actually figuring out what your users do and why they do it will cost money. But once you have done it, it will give you a better architecture and keep you at the same cost – or much much lower. Even at the same cost, we can get a lot of benefit.”, said Young, and suggested CQRS and Event Sourcing as the key patterns to support that.
CQRS
CQRS is based on command-query separation by Bertrand Meyer but evolved. It fundamentally requires splitting apart queries (read operations) and commands (transactions). “Treating them as separate concepts forces us to think that there are different needs for different sides”, said Young.
When we build up DTOs, we have a structural view of our data. When we process a transaction, we have a much more behavioural view. The conceptual structures are actually using completely different boundaries in these situations. So teams often come up with a half-way answer, which leads to a system that is neither behaviourally nor structurally optimal. Young said that CQRS enables us not to make these trade-offs any more.
Instead of having a domain layer, we can have a thin read layer for DTOs for queries. This makes it a lot easier to optimise queries compared to using O/R mapping systems. The only knowledge people need to have to write efficient queries is a data model – not the O/R mapping technology, domain models and so on.
Separating out the reads also cleans up a lot of things in the domain layer. The repositories no longer have lots of different read methods, but only those required for particular domain. We don’t have setters on domain objects – they become purely behavioural. “Domain objects are not property buckets, they expose behaviour. We can specialise our domain layer to process transactions. The code will be clearer and the aggregate boundaries will be a lot stronger”, said Young.
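A rough sketch of that split might look as follows. None of these names come from Young's talk: `CancelOrder`, `Order`, `OrderRepository`, `OrderSummary` and `OrderSummaryQueries` are hypothetical, and the read side is fed from a plain list of rows standing in for whatever query storage is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CancelOrder:
    """A command: it carries the user's intent, not an edited DTO."""
    order_id: str
    reason: str


class Order:
    """Write side: a purely behavioural domain object with no setters."""

    def __init__(self, order_id):
        self.order_id = order_id
        self._cancelled = False

    def cancel(self, reason):
        if self._cancelled:
            raise ValueError("order is already cancelled")
        self._cancelled = True
        self._reason = reason


class OrderRepository:
    """Write side: only the lookups the domain needs, no reporting queries."""

    def __init__(self):
        self._orders = {}

    def add(self, order):
        self._orders[order.order_id] = order

    def get(self, order_id):
        return self._orders[order_id]


def handle_cancel_order(command, repository):
    """Command handler: tells the domain object to do something."""
    repository.get(command.order_id).cancel(command.reason)


@dataclass(frozen=True)
class OrderSummary:
    """Read side: a flat DTO shaped for one screen."""
    order_id: str
    status: str


class OrderSummaryQueries:
    """Thin read layer: goes straight from stored rows to DTOs."""

    def __init__(self, rows):
        self._rows = rows

    def by_status(self, status):
        return [OrderSummary(**row) for row in self._rows if row["status"] == status]


repository = OrderRepository()
repository.add(Order("o-1"))
handle_cancel_order(CancelOrder("o-1", "changed my mind"), repository)

queries = OrderSummaryQueries(
    [{"order_id": "o-1", "status": "cancelled"}, {"order_id": "o-2", "status": "open"}]
)
open_orders = queries.by_status("open")
```

The two sides share no classes in this sketch, which is the point of the separation; keeping the read rows up to date with the write side is a separate concern.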
Event Sourcing
Once commands and queries are separated, we can look at whether we need to use the same data source at all. When the same data source is used for transaction processing and querying, the third relational normal form is often used for both things. This requires teams to create complex queries. Splitting data sources would enable us to specialise the query data source for reading, having a first normal form model. The two data sources can then be synchronised.
To avoid inconsistency caused by this synchronisation, we need to build from one source of truth. A good option is to use domain events – notifications about what happened as the result of commands being processed.
The data model for processing transactions can also be write-only. “This is how the business thinks about changes”, said Young, “You can’t refuse to record something that happened”. We can take this even further and not have the current state of the system in the transactional storage. Understanding this enables us to restructure the transactional processing data model to just accept results of command processing. If we unify the models for transaction processing and storage, this cuts down the cost of development and maintenance.
Recording events that resulted from transactions ensures that we don’t lose any information but we can still recreate the current state when needed. This makes auditing trivial and enables a number of other things, such as replaying the state of the system at any time to troubleshoot or test, and extracting analytics in the future that we haven’t thought about while developing the system.
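As an illustration of recreating state from recorded events, here is a minimal sketch. The account domain, the `Deposited` and `Withdrawn` events and the in-memory `EventStore` are assumptions made for the example, not something presented in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    account_id: str
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    account_id: str
    amount: int


class EventStore:
    """Append-only log: events are recorded, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def events(self, up_to=None):
        """Return all events, or only the first `up_to` of them."""
        return list(self._events[:up_to])


def balance(events):
    """Rebuild the current state by folding over the recorded events."""
    total = 0
    for event in events:
        if isinstance(event, Deposited):
            total += event.amount
        elif isinstance(event, Withdrawn):
            total -= event.amount
    return total


store = EventStore()
store.append(Deposited("acc-1", 100))
store.append(Withdrawn("acc-1", 30))
store.append(Deposited("acc-1", 5))

current = balance(store.events())              # 75
after_two_events = balance(store.events(2))    # 70, the state at an earlier point
```

The same log answers both "what is the balance now?" and "what was it after the second event?", which is what makes auditing and replay cheap.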
Having a system structured like this requires the domain model to know only about events. This simplifies testing – both checking for expected results and unintended consequences. Testing events would check both at the same time – any unintended change would generate more events. Integration with other systems also becomes a lot easier, according to Young, because this naturally flows around events and commands.
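One way to picture this kind of test is a sketch where command processing returns the events it produced and the test compares the complete list. The `cancel_order` function and its events are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderCancelled:
    order_id: str


@dataclass(frozen=True)
class RefundIssued:
    order_id: str
    amount: int


def cancel_order(order_id, paid_amount):
    """Process a cancellation and return every event it caused."""
    events = [OrderCancelled(order_id)]
    if paid_amount > 0:
        events.append(RefundIssued(order_id, paid_amount))
    return events


# Comparing the complete list checks the expected result and, at the same
# time, that nothing unintended happened: an extra event fails the assertion.
assert cancel_order("o-1", paid_amount=0) == [OrderCancelled("o-1")]
assert cancel_order("o-2", paid_amount=40) == [
    OrderCancelled("o-2"),
    RefundIssued("o-2", 40),
]
```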
The event storage can also act as a queue to feed the query storage, and essentially require only a single transactional commit from the system while processing commands. This makes the system even more performant.
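The queue idea can be sketched with a read model that remembers how far into the log it has read. Everything here is an assumption for illustration: the log is a plain list, events are `(kind, order_id)` tuples, and `Projection` is a hypothetical name.

```python
class Projection:
    """Query-side model that catches up by reading the event log like a queue."""

    def __init__(self):
        self.position = 0        # index of the next unread event in the log
        self.order_status = {}   # denormalised view, shaped for reading

    def catch_up(self, log):
        for kind, order_id in log[self.position:]:
            self.order_status[order_id] = kind
        self.position = len(log)


log = []
projection = Projection()

# Command processing only appends to the log (the single commit in this sketch).
log.append(("placed", "o-1"))
log.append(("placed", "o-2"))
projection.catch_up(log)

log.append(("shipped", "o-1"))
projection.catch_up(log)   # only the new event is processed
```

Because the projection tracks its own position, it can lag behind or be rebuilt from position zero without touching the write side.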
See other news from the DDD Exchange 2010
Related post: Two data streams for a happy web site
Udi Dahan: the biggest mistakes teams make when applying DDD
Udi Dahan spoke today at the DDD Exchange about common misunderstandings and problems that teams have with implementing Domain Driven Design. According to Dahan, the domain model pattern seems to be abused more often than not.
Tiers aren’t layers
When applying DDD, many people have in mind a classic three layer architecture, said Dahan, and there is a common assumption that the architecture layers are the same as tiers of deployment. For example, anything that’s in the business domain layer is not going to be deployed to the client tier. According to Dahan, this is the biggest mistake teams make when applying domain driven design.
A layered model is useful to tell us not to mix user interface logic and business logic or business logic and databases, said he, but the model says nothing about deployment tiers. “Starting with an assumption that layers are equal to tiers straight-jackets our implementation and colours our impression of the model thinking that we have to have data transfer objects going across tiers. If they weren’t in the different tiers, we would not need data transfer objects”, said Dahan, continuing that “almost any project I’ve seen that said that they are doing DDD made this assumption. This constrains choices and causes too many problems that aren’t related to DDD.”
Expecting to have a single architectural model for everything, teams don’t understand that they can deploy the same components to multiple tiers. Things like validation are a common cause of confusion. This causes a lot of architectural problems, that teams work around by putting in solutions such as caches. But according to Dahan, caches are often a sign that teams assume that there is a single linear layered architecture, which then causes performance problems. Many of our misunderstandings and problems we get ourselves into are related to this mix-up – layers, tiers, and what is a business rule.
Logical and physical connections are different
Another big issue with implementing DDD today is that there is an overemphasis on reuse in code. “Reuse is drilled into us in the universities and work. Even though this doesn’t reflect what we see in practice this drives us to centralise the code. Preferably in the domain model and then reuse it anywhere. Too much reuse creates an unmaintainable mess, because the system becomes hard to understand and has lots of dependencies.”, said Dahan. “When you think about domain models, business logic, validation – be very careful about how you treat it and not projecting your ideas of reuse”, advised he.
When talking about tiers, understand that you can get the exact same component, put it on another tier, and not create a dependency. The problem arises when a single logical element of reuse becomes equal to a single physical place. This leads to lots of service calls – if it’s logically in one place it has to be physically in one place, so lots of remote calls have to happen. Equating logical and physical locations causes too much complexity. Dahan suggested an alternative: “We want to remove these original blinders. You are allowed to deploy business logic to the client tier – that doesn’t mean you mixed business logic and user interfaces. It’s OK to take the same validation component you use on the server side and deploy on the client side. It doesn’t have to be in the same place to be logically cohesive”.
Teams also need to understand that there are different models, said Dahan, continuing “We need to understand what needs to be immediate, what means stale, and when the information can be stale, then addressing these situations with different models.”
All rules aren’t created equal
Assuming that all rules are equally likely to change is another common mistake that teams make, said Dahan. This leads to all the rules being put in the same place and the same kind of reuse applied to the same rules. Validations, calculations and workflows are not necessarily the same. For example, the requirement “the username must be unique” is common but never changes. Workflow rules (e.g. when a customer cancels an order, the system checks if it has been shipped and cancels if not) change substantially more. The reason for this, according to Dahan, is that these rules are actually related to a particular domain, what makes a business unique. “By virtue of the fact that all systems have the exact same requirements around username uniqueness and field length, that they are not unique for a particular business so they are not unique”, said he. These are just technical constraints, according to Dahan: “The business doesn’t care that username fields are 8 characters. Usernames have to be unique for technical reasons to be able to select customers to check details”. Rules that are not part of genuine domain logic do not have to be implemented in the domain model, suggested he, because they do not model the domain. They may be deployed to a separate tier.
Dahan quoted Martin Fowler, who defined the domain model pattern in Patterns of Enterprise Application Architecture (The Addison-Wesley Signature Series): “Use the domain model pattern if you have complicated and ever-changing rules…. If you have simple not null checks and a couple of sums to calculate, a Transaction Script is a better bet”.
Race conditions hide business rules
Another big problem when implementing DDD, according to Dahan, is assuming that race conditions affect the domain. He gave an example of a possible race condition between a user cancelling an order and an operator asking the system to ship the same order at the same time. “As developers, we think that race conditions are business logic, so we need to write code for that”, said he, but businesses don’t understand race conditions. “We made them up. They didn’t exist in a business before we [software] got involved.” From a business perspective, a millisecond should not substantially change the domain. Businesses existed long before computers, on paper, with concurrency problem windows measured not in milliseconds but in days, and with enormous chances for race conditions.
“But businesses still functioned. We ignore that and think that race conditions matter.”, said Dahan. Instead of solving race conditions with code, he suggested factoring time into design and talking to the business users about how the process was implemented before computers. For example, if the user cancels an order that was already shipped, the system could charge the customer unless they are of a particular status. The command does not fail because of a race condition, it extends to handle edge cases.
“If you think you have a race condition, you don’t understand the domain well enough. These rules didn’t exist in the age of paper, there is no reason for them to exist in the age of computers. When you have race conditions, go back to the business and find out actual rules”, advised Dahan.
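The cancellation example can be sketched as a command that always succeeds and applies a business rule when the order has already shipped. The rule details are invented for illustration: the "preferred" customer status, the `return_fee` amount and the outcome strings are assumptions, not rules Dahan described.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    shipped: bool = False
    cancelled: bool = False


def cancel_order(order, customer_is_preferred, return_fee=10):
    """Cancel an order; shipping having won the 'race' is a business case, not an error."""
    order.cancelled = True
    if not order.shipped:
        return ["order cancelled"]
    # The order already left the warehouse: apply the (assumed) business rule
    # instead of rejecting the command.
    if customer_is_preferred:
        return ["order cancelled", "return arranged at no charge"]
    return ["order cancelled", f"customer charged {return_fee}"]


not_shipped = cancel_order(Order("o-1"), customer_is_preferred=False)
shipped_regular = cancel_order(Order("o-2", shipped=True), customer_is_preferred=False)
shipped_preferred = cancel_order(Order("o-3", shipped=True), customer_is_preferred=True)
```

No locking or failure path is needed here, because the outcome of each interleaving is defined by a rule agreed with the business.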
See other news from the DDD Exchange 2010
Eric Evans: Domain driven design redefined
Today at the DDD Exchange 2010 mini-conference in London, Eric Evans spoke about emerging themes in the domain driven design community. Six years after the DDD book was published, Evans said that he can now define it more precisely than before.
“Anything can be called agile, anything can be called SOA – when anything can be called that it’s no longer a useful term”, said Evans. So he wanted to offer a new, precise definition of DDD, which defines “what it is, what it isn’t, while still leaving enough space for innovation.”
Evans said that one way to define DDD is as a set of driving principles:
- Focus on the core domain – “people get distracted by technology and we want to bring that attention back to the business domain”, said Evans. Even that whole business domain is too much to focus on, according to him, and DDD requires us to focus on the core, the critical, most valuable part.
- Explore models in a creative collaboration with domain practitioners and software practitioners – “we have to collaborate, not just quiz [business experts] for requirements”.
- Speak a Ubiquitous Language in an explicitly Bounded Context
Speaking about the focus on core domains, Evans said that it has to be something very specific. “We want to focus with collaborative cooperation on our core domain – we don’t want to give the same kind of attention to invoicing in most companies”, said Evans. Giving an example of eBay, he said that it is easy to assume that online auctions are the core domain but that would be wrong. He compared Amazon and eBay, and said that you can buy books in both places and they have similar features, but their core domains are very different. For eBay, it’s the rating of the seller – this is what makes eBay effective. “A star rating tells me that lots of people did business with a seller and I trust that. Developing trust between a buyer and a seller is a subdomain of online auctions, and their approach to building trust is part of the core domain of eBay. eBay would not have been today if they didn’t get that right”, said Evans.
Another way to define DDD, Evans said, is that it is a pattern language. It is a set of interrelated problem/solution pairs that have helped teams realise the principles. It also gives us a language, a vocabulary, that allows us to discuss domain modelling and design clearly. According to Evans, this is the key to enable innovation and keep the definition of DDD precise. There is a lot of innovation going on at the moment in the community, especially around architectural practices, and one of the key things for DDD in the future will be embracing those innovations. “I don’t want to say what the best way to do domain driven design is – others will present genuine advancements but none of us will be even able to understand it if we don’t speak the same language”, said Evans, continuing that he wants to hold on to the position of the final arbiter of the terminology used by the community.
The building blocks (entities, value objects…) aren’t really key to DDD, although most people initially thought that this is what DDD is about, said Evans. Having said that, he suggested that there is a lot of innovation in the community especially around aggregates, services and domain events, and that these concepts will be important for applying DDD in the future.
Related post: QCon London 2009: Eric Evans – What I’ve learned about DDD since the book
See other news from DDD Exchange 2010
Effective root cause analysis techniques
At the Agile testing user group meeting on 4th May 2010, Douglas Squirrel presented ideas on running effective root cause analysis that he uses at YouDevise to facilitate continuous improvement. Saying that reflective practices are more important than pure technical practices for software development teams today, Squirrel suggested that an effective root cause analysis workshop can be instrumental to reflect on organisational processes and improve them.
Target a specific event
Instead of generic and abstract problems – such as “why do support people never find …”, Squirrel suggested targeting a specific event, a particular incident. This helps put things into a more concrete perspective and also shifts the focus from a single group of people (in this case support) to the wider organisation, so that the entire organisation can benefit from the analysis.
Everyone affected attends
It is crucial for everyone affected by that particular incident or event to attend the root cause analysis, to avoid the danger of missing information. It also helps to avoid the blame culture – as it is very easy to blame people who are not present. Squirrel especially pointed out that executives should attend root cause analysis meetings. First of all, they are capable of fixing very deep resource problems, and having senior executives at a meeting also sends a message of importance.
Establish a No-Blame culture
To get to the right information, Squirrel suggested that there has to be a “no blame” culture during a meeting. Otherwise people won’t be direct and honest. In one case, by allowing people to speak openly about what went wrong, they discovered that a group of people stayed working after midnight to fix a problem – and started addressing the cause of that instead of the original issue.
Poll to identify problems
Squirrel also suggested asking everyone “What do you think is the problem?”. This provides different contexts and uncovers additional information. Talking about a server configuration issue, he said that the traders at the meeting complained that they weren’t notified about the fix and lost twelve hours of trading. Polling to identify problems exposed a communication issue between the operational staff and the traders.
Write a lot
After the initial set of problems is identified, start popping the why stack and get to the “fifth why”. Every moment you should be either writing down why something happened or asking ‘Why?’. This keeps the discussions short and focused.
Move down, then across
Polling and the discussion will identify many problems, and to manage this effectively Squirrel advised tackling them one at a time, otherwise the group will get distracted with reasons and not identify solutions. “You know you are at the fifth why because it hurts, and there is usually a pause” when the real problem is identified, said Squirrel, adding that “if it doesn’t hurt, you are not doing it right”. One of the key take-aways for me from the practical exercise that followed is that people sometimes use humour to disguise things that hurt or divert the discussion from that, so cynical or humorous remarks are also a sign that we might be going in the right direction.
Set proportionate tasks
When the real root cause of the problem is identified, don’t get carried away and “retrain your development team because of five minutes of downtime”, said Squirrel, “but define tasks proportionate to the problem”. “It’s not necessary to solve problems, but make progress”, said Squirrel. Instead of gold-plating solutions, he suggested acting quickly. “If you do it wrong, it will come back again”. Solutions that take too long will never get done, so Squirrel suggested thinking about what you can do in a week or even in an hour, and building up the solution the next time a problem happens. If a solution that takes less than a week to implement cannot be identified, escalate or bring in consultants, suggested Squirrel, adding that the root cause analysis process helps to build a business case for the solution.
Look for patterns across sessions
Running root cause analysis sessions frequently and acting quickly to build up incremental solutions helps to avoid complacency and ensure that the organisation is continuously improving. Identifying patterns in problems that are analysed might point to deeper organisational issues, or solutions that never get implemented. If tasks don’t get finished, this might mean that there is no buy-in to implement change – in which case Squirrel advised getting the buy-in from senior management upfront. It can also expose cross-cutting cultural problems with the organisation, which can be addressed separately.
If you found this summary interesting, you can watch the video of this presentation including the interactive exercise that Squirrel organised after the presentation.
Introduction to specification by example and agile acceptance testing – July 5th, London
I’ll be running a one day introductory workshop to agile acceptance testing and specification by example on July 5th in London. Click here for more information and to sign up