This past week I dug into the performance of custom elements and found some surprises.
In my previous blog post, Async Ads with HTML Imports, I complained that HTML imports block the page from rendering when a SCRIPT tag is found, and lamented the fact that the spec doesn’t provide a mechanism to make this coupling asynchronous. Custom elements are just the opposite: they’re asynchronous by default, and the spec doesn’t provide a way to make them synchronous.

Cake and eat it too
It’s important to understand why we need both synchronous AND asynchronous mechanisms for loading content.
- Sometimes content is so critical to the page it should be rendered before anything else. In these situations it’s necessary to load the content synchronously in order to achieve the user experience desired by the website owner and designers. Synchronous loading is also necessary in other situations such as when there are dependencies between resources (e.g., interdependent scripts) and to avoid jarring changes in page layout (also known as Flash of Unstyled Content or FOUC).
- Other times, the content coming from sub-resources in the page is secondary to the main page’s content and developers & designers prefer to load it asynchronously. This is a newer pattern in web development. (I like to think I had something to do with it becoming popular.) Loading these less critical resources asynchronously produces a better user experience in terms of faster loading and rendering.
The bottom line is that there are situations that call for both behaviors, and developers need a way to achieve the user experience they deem appropriate. The main role for specs and browsers is to provide both mechanisms and choose a good default. We didn’t do that in previous versions of HTML and are trying to fill that gap now with the Resource Priorities spec, which adds the lazyload attribute to various tags including IMG, SCRIPT, and LINK. We don’t want to repeat this gap-filling-after-the-fact process in the future, so we need to provide sync and async capabilities in the HTML5 features being spec’ed now – and that includes Web Components.

Custom Elements howto
To make custom elements more reusable they’re wrapped inside an HTML import:

<link rel="import" href="import-custom-element.php">
In the HTML document we can use the custom element just like any other HTML tag:

<x-foo></x-foo>
Experienced developers recognize that this creates a race condition: what happens if the x-foo tag gets parsed before import-custom-element.php is done downloading?

(async) Custom Elements = FOUC
The first example, custom-element.php, demonstrates the typical custom element implementation described above. If you load it (in Chrome Canary) you’ll see that there’s a Flash of Unstyled Content (FOUC). This reveals that browsers handle custom elements asynchronously: the HTML import starts downloading but the browser continues to parse the page. When it reaches the x-foo tag it skips over it as an unrecognized element and renders the rest of the page. When the HTML import finishes loading the browser backfills x-foo which causes the page’s content to jump down ~100 pixels – a jarring FOUC experience.
This is great for faster rendering! I love that the default is async. And there are certainly scenarios where this wouldn’t create FOUC (custom elements that aren’t visible or will be used later) or where the FOUC isn’t so jarring (below-the-fold content, changes to style but not layout). But in cases like this one where the FOUC is undesirable, there needs to be a way to avoid this disruptive change in layout. Sadly, the spec doesn’t provide one. Let’s look at two possible workarounds.

Sized Custom Elements
The jarring change in layout can be avoided if the main page reserves space for the custom element. This is done in the custom-element-sized.php example like this:

<div style="height: 120px;">
  <x-foo></x-foo>
</div>
The custom element is inside a fixed size container. As shown by this example, the existing page content is rendered immediately and when the HTML import finally finishes downloading the custom element is backfilled without a change in layout. We’ve achieved the best of both worlds!
The drawback to this approach is it only works for custom elements that have a fixed, predefined size. That condition might hold for some custom elements, but certainly not for all of them.

Sync Custom Elements
The custom-element-sync.php example shows a workaround that avoids FOUC for custom elements of unknown size, although it blocks rendering for everything in the page below the custom element. The trick is to add a SCRIPT tag right above the custom element, for example:

<script> var foo=128; </script>
<x-foo></x-foo>
As shown in my previous post, HTML imports cause the parser to stop at the first SCRIPT tag that is encountered. There is a slight benefit here of making sure the only SCRIPT tag after the <link rel="import"...> is right before the custom element – this allows the content above the custom element to render without being blocked. You can see this in action in the example – only the content below the custom element is blocked from rendering until the HTML import finishes loading.
By blocking everything below the custom element we’ve avoided the FOUC issue, but the cost is high. Blocking this much content can be a bad user experience depending on the main page’s content. Certainly if the custom element occupied the entire above-the-fold area (e.g., on a mobile device) then this would be a viable alternative.
It would be better if the spec for custom elements included a way to make them synchronous. One solution, proposed by Daniel Buchner and me to W3C Public Webapps, is to add an attribute called “elements” to HTML imports:

<link rel="import" href="elements.html" elements="x-carousel, x-button">
The “elements” attribute is a list of the custom elements that should be loaded synchronously. (In other words, it’s NOT the list of all custom elements in the HTML import – only the ones that should cause rendering to be blocked.) As the browser parses the page it would skip over all custom elements just as it does now, unless it encounters a custom element that is listed in the “elements” attribute value (e.g., “x-carousel” and “x-button”). If one of the listed custom elements is reached, the parser would block until either the custom element becomes defined or all outstanding HTML import requests are done loading.

Tired of hacks
I love finding ways to make things work the way I want them to, but it’s wrong to have to resort to hacks to get basic behavior – like avoiding FOUC and loading asynchronously – out of these new HTML5 features. Luckily, the specs and implementations are in early stages. Perhaps there’s still time to get them changed. An important part of that is hearing from the web development community. If you have preferences and use cases for HTML imports and custom elements, please weigh in. A little effort today will result in a better Web tomorrow.
Many thanks to the authors for these fantastic articles on Web Components:
Here’s another failure story, per the post where I complained about people not telling enough test failure stories.
Years ago, after learning about Keyword-Driven Automation, I wrote an automation framework called OKRA (Object Keyword-Driven Repository for Automation). @Wiggly came up with the name. Each automated check was written as a separate Excel worksheet, using dynamic dropdowns to select from the available Action and Object keywords in Excel. The driver was written in VBScript via QTP. It worked, for a little while. However:
- One Automator (me) could not keep up with 16 programmers. The checks quickly became too old to matter. FAIL!
- An Automator with little formal programming training, writing half-assed code in VBScript, could not get help from a team of C#-focused programmers. FAIL!
- The product under test was a .Net Winforms app full of important drag-n-drop functionality, sitting on top of constantly changing, time-sensitive data. Testability was never considered. FAIL!
- OKRA was completely UI-based automation. FAIL!
Later, a product programmer took an interest in developing his own automation framework. It would allow manual testers to write automated checks by building visual workflows. This was a Microsoft technology called MS Workflow or something like that. The programmer worked in his spare time over the course of about a year. It eventually faded into oblivion and was never introduced to testers. FAIL!
Finally, I hired a real automator, with solid programming skills, and attempted to give it another try. This time we picked Microsoft’s recently launched CodedUI framework and wrote the tests in C# so the product programmers could collaborate. I stood in front of my SVP and project team and declared,
“This automation will shave 2 days off our regression test effort each iteration!”
- The automator was often responsible for writing automated checks for a product they barely understood. FAIL!
- Despite the fact that CodedUI was marketed by Microsoft as the best automation framework for .Net Winforms apps, it failed to quickly identify most UI objects, especially 3rd party controls. FAIL!
- Although, at first, I pushed for significant amounts of automation below the presentation layer, the automator focused more energy on UI automation, and I eventually gave in too. The tests were slow at best and human testers could not afford to wait. FAIL! Note: this was not the automator’s failure; it was my poor direction.
At this point, I’ve given up all efforts to automate this beast of an application.
Can you relate?
In North American football, the offense sometimes runs a draw play, which is a delayed handoff that gets the moving parts of the defense to act before the offense turns the ball over to its running back:
The idea behind a draw play is to attack aggressive, pass-rushing defenses by “drawing” them downfield. This creates larger gaps between defenders and thereby allows the offense to effectively run the ball. Draw plays are often run out of the shotgun formation, but can also be run when the quarterback is under center. These types of draw plays are sometimes referred to as “delayed handoffs”. The running back will most often run straight downfield through the “A-Gap” (the space between the center and the offensive guard), although there are more complicated variations.
Now, in our deadline-driven software development world, we often find ourselves typing and clicking as fast as we can to get as much test coverage as we can in our allotted time. In a way, we’re playing like aggressively trained linebackers, rushing toward our software’s operations as fast as possible. Kinda like I started charging into this metaphor too quickly and now find myself off-balance, unable to elegantly segue to my point.
Within applications, you can sometimes find defects by delaying actions. If your software integrates with other software or services asynchronously, the expectations of the code might not match how fast–or slow–a user might interact with the software.
For example, an e-commerce shopping cart selling distinct units might mark a unit as Hold, Wait For Payment when a user adds it to the cart and starts the payment process. The hold might have a timeout associated with it: an expectation that if the user does not complete the payment within five minutes (or two minutes, or twenty minutes), the payment has been abandoned and the system should remove the hold. In the case of paying with PayPal, however, the PayPal window itself can linger for that duration and more, ready to accept your login and confirmation of payment, at which time PayPal will send a payment received message to your system, for an item whose hold might have been relinquished and might not be available any more. How will you know unless you hold onto that ball for a couple of seconds and let all the integrated software make their plays?
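The race in the shopping-cart example can be sketched in a few lines. This is a hypothetical model, not any real e-commerce or PayPal API: an InventoryHold expires after a timeout, and a payment notification that arrives late finds the hold already released.

```python
import time


class InventoryHold:
    """Hypothetical hold on a single distinct unit, released after a timeout."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.placed_at = time.monotonic()

    def is_active(self, now=None):
        now = time.monotonic() if now is None else now
        return (now - self.placed_at) < self.timeout


def handle_payment_received(hold, now=None):
    """What happens when the 'payment received' message arrives late?"""
    if hold.is_active(now):
        return "payment applied, unit shipped"
    # The draw play: the user lingered on the payment window past the timeout.
    return "payment received for a unit no longer held"


# Simulate a 5-minute hold and a payment that arrives 6 minutes later.
hold = InventoryHold(timeout_seconds=300)
print(handle_payment_received(hold, now=hold.placed_at + 360))
```

A draw test here is simply starting the payment, waiting out the hold window, and then completing it, to see which of the two branches the real system actually takes.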
Another example: Financial software that uses two stages of authentication. The first time you log in from a browser, it prompts you for your user name and then directs you to a security question; after you answer that correctly, it prompts you for your password. If you log in and don’t do anything for ten minutes, your session times out. But what happens if it takes you longer than ten minutes to answer your security question? What happens if you answer the security question but don’t enter a password in those ten minutes?
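The questions in the two-stage login example probe a small state machine. A minimal sketch, with hypothetical state names and an assumed ten-minute idle timeout; the interesting inputs are the ones that arrive after the window closes.

```python
SESSION_TIMEOUT = 600  # seconds; assumed ten-minute idle timeout


class LoginSession:
    """Hypothetical two-stage login: username -> security question -> password."""

    def __init__(self, started_at=0.0):
        self.state = "awaiting_security_answer"
        self.last_activity = started_at

    def _check_timeout(self, now):
        if now - self.last_activity >= SESSION_TIMEOUT:
            self.state = "timed_out"

    def answer_security_question(self, now):
        self._check_timeout(now)
        if self.state == "timed_out":
            return "timed_out"  # took longer than ten minutes to answer
        self.state = "awaiting_password"
        self.last_activity = now
        return self.state

    def enter_password(self, now):
        self._check_timeout(now)
        if self.state == "timed_out":
            return "timed_out"  # answered the question, then stalled
        self.state = "logged_in"
        self.last_activity = now
        return self.state


# Draw test 1: take eleven minutes to answer the security question.
slow = LoginSession(started_at=0)
print(slow.answer_security_question(now=660))  # -> timed_out

# Draw test 2: answer promptly, then wait eleven minutes before the password.
stall = LoginSession(started_at=0)
stall.answer_security_question(now=30)
print(stall.enter_password(now=700))  # -> timed_out
```

The point of the sketch is that each prompt is a separate place where the clock can run out, and each deserves its own delayed-action test.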
The draw test is a valid line of testing because your users don’t live to fill out your forms. In their real lives, they’re called to meetings, go to lunch, minimize the browser window when a co-worker comes in, or otherwise temporarily delay their activity in a fashion that might put their next actions beyond your software’s expectations. If you take a little time, you can identify the places where your system might allow this and make sure your software handles delayed action on your users’ part elegantly.
I picked up my Xbox One this morning – a special white console with “I made this!” engraved on it. The reviews started appearing last night, and for the most part, they’re quite positive (and I’m not surprised by the drawbacks listed in the reviews I’ve read so far). Now that this project is officially behind me – and before I figure out what’s next – I have some backlog blog posts to share.

Lessons Learned
I gave a keynote at STAR West in October on “Lessons learned from Xbox One”. Given that it was an unreleased product at the time of the presentation, rather than talk about the details, I spoke mostly about the team, and how strong team principles lead to great software projects (I spoke of my love of the Xbox team in a previous post – http://angryweasel.com/blog/?p=723). I thought I’d share a few of those lessons.

Lesson 1: Build a Great Team
In order to build great software, build a great team first. I’ll say that differently in case it didn’t sink in (or make sense): worry about building a great team first, and then let them build a great product. I realize this isn’t always an option for every organization, but I see too many teams take an existing group of people and then try to make a product that isn’t in the sweet spot of that organization’s strengths. A good team can usually make a few good things, but a great team can make anything great.
That leads to a phrase I heard once that rings too true – “Don’t ship your org chart”. Just over ten years ago, I was working on Windows CE (an embedded operating system loosely based on Windows). During product planning, I noticed what looked like an unbalanced number of networking features (there were a lot – including some relatively obscure functionality). I didn’t know if it was a product strategy decision, customer requests, or what, so I asked. The answer was, “Our networking team has a lot of people, and we need to keep them busy.” To this day, I still don’t know why there was no attempt to shift people around as needed for features customers wanted, but I do know it seemed silly at the time, and it seems silly now.
Which leads to…

Lesson 2: Diversity and strengths
I’m a believer in teams filled with (or at least largely populated by) specializing generalists. Everyone certainly doesn’t have to be able to do everything, but every team member should be able to do many things well, and a few things very well. It’s important, when building a team to have diversity in both generalization and specialization – and match those skills with what’s needed for your project.
In my opinion (and experience), you can’t have enough deep specialization. My official job title at Microsoft is Principal SDET – that means I’m in the Principal band, and that I don’t manage people. I think far too many testers (and organizations) see management as the only career growth path, when there is huge value in developing, growing, and nurturing the experience of experienced individual test contributors (which reminds me, I wrote a paper about the value of highly experienced non-manager testers a few years ago).
There are 100 or so Principal SDETs at Microsoft (plus more in the management ranks that I haven’t counted lately). When I joined the team two years ago, there were five of us (Principal testers) – two on the Xbox Live team, and three on console. For various reasons, we wanted a name for our virtual team on the console software team. The mythical Hydra name was already in use to describe our three-OS model, so after a bit of deliberation, we named ourselves after Hagrid’s three-headed dog in Harry Potter, and Team Fluffy was born. Over time, we expanded (and promoted) our way to eight Principal testers in console, and three in Live, for what I believe is the highest team concentration of Principal testers at the company. Someone asked once if we needed that many Principal ICs – but honestly, I don’t know how we would have made this product without the diversity and deep skills of this group. I think the eight of us all contributed significantly to getting this product shipped on time with quality.

Teams and Products
Yes – products are important, but I’m a huge believer in the team first. I think it’s just as difficult and challenging to build a great team as it is to build a great product, and that we should all put the same care and passion into hiring and developing our teams as we do into building great software.
I’ll share a few other lessons later this week.
Have you ever been to a restaurant with a kitchen window? Well, sometimes it may be best not to show the customers what the chicken looks like until it is served.
A tester on my team has something similar to a kitchen window for his automated checks; the results are available to the project team.
Here’s the rub:
His new batches of automated check scenarios are likely to result in, say, a 10% failure rate (e.g., 17 failed checks). These failures are typically bugs in the automated checks themselves, not in the product under test. Note: this project only has one environment at this point.
When a curious product owner looks through the kitchen window and sees 17 Failures, it can be scary! Are these product bugs? Are these temporary failures?
Here’s how we solved this little problem:
- Most of the time, we close the curtains. The tester writes new automated checks in a sandbox, debugs them, then merges them to a public list.
- When the curtains are open, we are careful to explain, “this chicken is not yet ready to eat”. We added an “Ignore” attribute to the checks so they can be filtered from sight.
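The curtain can be as simple as a flag on each check result that the reporting side filters on. A minimal sketch with made-up check records, not tied to any particular test framework:

```python
# Each check result records whether the check is still in the sandbox.
results = [
    {"name": "login_smoke",       "passed": True,  "ignore": False},
    {"name": "checkout_totals",   "passed": False, "ignore": False},  # real product concern
    {"name": "new_report_export", "passed": False, "ignore": True},   # chicken not ready to eat
]


def public_failures(results):
    """Only the failures the project team should see through the kitchen window."""
    return [r["name"] for r in results if not r["passed"] and not r["ignore"]]


print(public_failures(results))  # -> ['checkout_totals']
```

Most xUnit-style frameworks offer an equivalent built-in (an Ignore or Skip attribute), which is the mechanism the bullet above refers to; the filter just has to be applied consistently wherever results are published.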
Developers, listen to the kind man Frank Hayes as he explains you should never set the cat on fire:
QA: Do just the opposite of what he says except for the first verse and the cat thing.