Sunday, December 4, 2011

YOW! Retrospective

The YOW! 2011 conference in Melbourne was fantastic. My brain is totally buzzing from the great presentations, discussions and energy from the conference. Before the buzz settles, here's my personal retrospective.

YOW will be making the presentations and videos available online in due course.

1 - What I Learnt

Key themes
  1. Continuous Delivery of Business Value
  2. JS everywhere, REST, push technologies, HTML5
  3. Having fun with technology
Continuous Delivery/Business Value sessions

Continuous Design - Mary Poppendieck

Design Thinking:
  1. Assemble Diverse Team
  2. Frame the problem
  3. Ideation - prototype
  4. Experimentation
  5. Iterate back to 3, or reframe the problem/pivot back to 2
Continuous Design puts coding into this loop
The Learning Cycle. Learn -> Model -> Build -> Measure -> Learn ...
The fastest learner wins.
A small, experienced, dedicated, self-empowered team can introduce 4x more product - George Stalk, Competing Against Time.

Need good people, whole team (marketing, design, dev, test, ops, support)
Measure product metrics not project metrics

Amazon.com, working backwards:
  1. Write a press release
  2. Write FAQs
  3. Describe customer experience
  4. (Write User Manual)
Innovation Accounting
  • Start with hypothesis - a business growth model
  • Establish a baseline - the Minimum Viable Product
  • Target every initiative at a Growth Metric
  • Split Test A/B
  • Definition of Done includes feature value validated
Cohort Metrics
  • Measure a cohort of new users every week
  • Track behaviour of each cohort against product events and changes
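
As a rough illustration of weekly cohorts, the sketch below groups users by the week they signed up and reports, per cohort, the fraction that reached some product event. The CohortUser class, its fields and the "activated" flag are my own assumptions for illustration; they are not from Mary's talk.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical user: signup date plus whether the user reached some product event.
final class CohortUser {
    final LocalDate signedUp;
    final boolean activated;

    CohortUser(LocalDate signedUp, boolean activated) {
        this.signedUp = signedUp;
        this.activated = activated;
    }
}

final class CohortMetrics {
    // For each cohort (keyed by the Monday of the signup week), returns the fraction of users activated.
    static Map<LocalDate, Double> activationRateByWeek(List<CohortUser> users) {
        Map<LocalDate, int[]> counts = new TreeMap<>(); // value is [activated, total]
        for (CohortUser u : users) {
            LocalDate weekStart = u.signedUp.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            int[] c = counts.computeIfAbsent(weekStart, k -> new int[2]);
            if (u.activated) c[0]++;
            c[1]++;
        }
        Map<LocalDate, Double> rates = new TreeMap<>();
        counts.forEach((week, c) -> rates.put(week, (double) c[0] / c[1]));
        return rates;
    }
}
```

Comparing these rates week over week, alongside a log of product changes, is one way to see whether a change moved the metric for new users.
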
Mary then covered some continuous delivery principles - which made Jez Humble and Martin Fowler adjust their subsequent presentation somewhat :-)
  • Low Dependency Architecture
  • Feature Toggles
  • Trunk-based development
  • Branch by abstraction
  • Canary releasing
Continuous Delivery - Jez Humble, Martin Fowler
This covered many of the concepts from the excellent Continuous Delivery book. I won't repeat this content - go read the book. Some useful soundbites from the talk:
  • On acquisition, Flickr were initially told by Yahoo to stop continuous deployment. The Flickr team then gathered stats showing that they had less downtime than any of the other Yahoo services. Flickr deploy approx 10x/day to production. Downtime was 4 x 6mins per year.
  • Mingle has 20 hours of acceptance tests. Running in parallel across 70 boxes, these complete in 45 minutes.
  • On any acceptance test failure you should find out how it got through, and beef up your commit tests to catch the failure earlier.
  • The job of the deployment pipeline is to kill code that isn't suitable for release
  • Devops - development is measured on throughput, operations on stability. Need to establish a culture of collaboration, automation, measurement, sharing.
  • Dark launching - deploy back end service first before switching it on
  • Benefits in decoupling deployment and release (eg. using feature toggles). Facebook has every major feature to be released in the next 6 months already deployed to production, and waiting to be released.
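
To make the deploy/release split concrete, here is a minimal feature toggle sketch. The class names and the "new-checkout" feature are hypothetical, and a real system would more likely read toggles from configuration than from an in-memory map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal in-memory toggle registry (an assumption for illustration).
final class FeatureToggles {
    private final Map<String, Boolean> toggles = new ConcurrentHashMap<>();

    void set(String feature, boolean enabled) {
        toggles.put(feature, enabled);
    }

    // Unknown features are treated as off, so newly deployed code stays dark by default.
    boolean isEnabled(String feature) {
        return toggles.getOrDefault(feature, false);
    }
}

final class Checkout {
    private final FeatureToggles toggles;

    Checkout(FeatureToggles toggles) {
        this.toggles = toggles;
    }

    String render() {
        // Both code paths are deployed; the toggle decides which one is released to users.
        return toggles.isEnabled("new-checkout") ? "new checkout" : "old checkout";
    }
}
```

Releasing the feature is then a matter of flipping the toggle, with no new deployment needed.
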
Introducing Continuous Delivery
  • Focus on behaviours, not tools
  • People are the key
  • Get everyone together at the beginning
  • Keep meeting
  • Make it easy for everyone to see what's happening (eg. big visible display)
  • Kaizen. Continuous Delivery should be part of continuous improvement process. It's going to take months. Do it step-by-step.
ITIL and Continuous Delivery

Jez provided an excellent answer to a question about using Continuous Delivery with ITIL. See the last 5 minutes of the video (when it comes out).
  • The deployment pipeline is a superior way to manage risk (over paperwork)
  • Under ITIL v3, continuous delivery can be approved as a standard change with known low risk.
  • 2 other strategies for integrating with ITIL v3
  • Need to work closely with auditor
  • Auditors love deployment pipelines, everything you could possibly audit is available and version controlled.
Adventures in Infrastructure as code - Julian Simpson

Julian demonstrated code using Puppet and Chef, that installs Ubuntu, nginx, jetty and deploys a war file. In short, Puppet is configuration-based and appeals more to sysadmins who like to see all the moving parts. Chef is convention-based and tends to appeal to developers more.

He discussed testing using cucumber-puppet and cucumber-nagios, and mentioned cfengine, vagrant and the veewee plugin.


The Limited Red Society - Joshua Kerievsky

Parallel to the Limited WIP Society, the goal is to minimise the development time spent in the red (failing tests), or in the pink (code that won't compile).

By visualising the development flow, developers get feedback for improvement, much like the visualisations used in sports analysis. Joshua recommended the book "Moneyball: The Art of Winning an Unfair Game", on which the Moneyball movie was based.

Measure and practice the Shortest Longest Red Etude. A good retreat is better than a bad stand.

Refactoring strategies missed in Joshua's book:
  • Parallel Change - write refactored code alongside current code and gradually switch over (like building a 2nd bridge)
  • Narrowed Change - reduce the number of change points (eg. using Extract Method), before making the change. This is preferred over parallel change.
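
A small sketch of how I understand Parallel Change, using a made-up PriceFormatter (not an example from the workshop): the new method is added next to the old one, callers move across one at a time, and the old method is deleted last.

```java
import java.util.Locale;

final class PriceFormatter {
    // Step 1 (expand): the new method is added alongside the old one.
    static String format(long cents, String currency) {
        return String.format(Locale.ROOT, "%s %d.%02d", currency, cents / 100, cents % 100);
    }

    // The old method stays in place, so existing callers keep compiling and tests stay green
    // while they are migrated one by one. The final step (contract) deletes it once unused.
    @Deprecated
    static String format(double dollars) {
        return format(Math.round(dollars * 100), "USD");
    }
}
```

At every intermediate step the code compiles and the tests pass, which is the point: no time spent in the red or the pink.
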
We had covered these in depth on Joshua's excellent Refactoring Workshop on Wednesday. The eLearning package that Industrial Logic have put together is exceptional, with IDE plugins to allow us to visualise our refactoring sessions and provide feedback on our work.

The development visualisation is available in Industrial Logic's Sessions tool.

Technology Sessions

Domain-driven design for RESTful Systems - Jim Webber
HTTP is an Application Protocol. You have to narrow it down to your Domain Application Protocol.

PUT is idempotent. Always use it rather than POST for financial transactions.
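
A sketch of why idempotency matters when a client retries a request. This is an in-memory stand-in with made-up names, not the Restbucks code and not real HTTP.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for a payments resource (hypothetical).
final class PaymentStore {
    private final Map<String, Long> paymentsById = new LinkedHashMap<>();
    private int nextId = 1;

    // PUT /payments/{id}: the client picks the id, so a retry overwrites the same payment.
    void put(String id, long amountCents) {
        paymentsById.put(id, amountCents);
    }

    // POST /payments: the server picks a fresh id, so a retry creates a second payment.
    String post(long amountCents) {
        String id = "p" + nextId++;
        paymentsById.put(id, amountCents);
        return id;
    }

    long totalCents() {
        return paymentsById.values().stream().mapToLong(Long::longValue).sum();
    }
}
```

If a response is lost and the client retries, repeating the PUT leaves the total unchanged, whereas repeating the POST charges twice.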

Jim presented some of the Restbucks examples from his book.

His talk was very engaging. It was somewhat alarming to find that only 1 of the audience had played a PS/1 or later as their first computer game, while over 50% of the audience had first played games that were loaded from cassette tape.

The Live Web - Drag n' drop in the Cloud - Dan Ingalls
A component architecture for HTML5. Builds a structured scene graph in JavaScript. Integrates HTML, SVG and Canvas graphics.

It's like Squeak (and Scratch, BYOB, Snap) but runs behind firewall

Kit approach - components, connections, encapsulation
Parts bin - cloud based repository of active content
Live connect, dataflow binding

The IDE runs using Lively Web, and saves using WebDAV (to node.js, to svn)

Not sure how I'd use this for commercial projects yet, but it's a lot of fun and I'll be trying it out with my son.

Better testing with less work - John Hughes
I've previously looked at (and discounted) QuickCheck in Java, but the Erlang version does so much more that I'll be looking at it again. The implementations differ considerably from language to language.

QuickCheck allows us to write general properties that a system should meet. The properties should hold true across a range of values (similar to JUnit Theories). It then generates random inputs to test those properties. On failure it will retry with simpler test cases until it finds the minimal test case that fails. This simplifies debugging.
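
To show the generate-then-shrink idea, here is a hand-rolled sketch in plain Java. It does not use any QuickCheck library and is far cruder than the real thing; the buggy function, the property and all names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Predicate;

final class MiniCheck {
    // Deliberately buggy code under test: it wrongly assumes at least one non-negative element.
    static int buggyMax(List<Integer> xs) {
        int max = 0;
        for (int x : xs) if (x > max) max = x;
        return max;
    }

    // Property: for a non-empty list, the maximum is one of its elements.
    static boolean maxIsAnElement(List<Integer> xs) {
        return xs.isEmpty() || xs.contains(buggyMax(xs));
    }

    // Generates random lists; returns a shrunk counterexample if the property ever fails.
    static Optional<List<Integer>> check(Predicate<List<Integer>> property, long seed, int trials) {
        Random random = new Random(seed);
        for (int t = 0; t < trials; t++) {
            List<Integer> input = new ArrayList<>();
            int size = random.nextInt(6);
            for (int i = 0; i < size; i++) input.add(random.nextInt(21) - 10);
            if (!property.test(input)) return Optional.of(shrink(property, input));
        }
        return Optional.empty();
    }

    // Greedy shrinking: keep adopting any simpler input that still fails.
    static List<Integer> shrink(Predicate<List<Integer>> property, List<Integer> failing) {
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (List<Integer> candidate : simpler(failing)) {
                if (!property.test(candidate)) {
                    failing = candidate;
                    shrunk = true;
                    break;
                }
            }
        }
        return failing;
    }

    // Simpler inputs: drop one element, or halve one element towards zero.
    static List<List<Integer>> simpler(List<Integer> xs) {
        List<List<Integer>> out = new ArrayList<>();
        for (int i = 0; i < xs.size(); i++) {
            List<Integer> dropped = new ArrayList<>(xs);
            dropped.remove(i); // removes by index
            out.add(dropped);
        }
        for (int i = 0; i < xs.size(); i++) {
            if (xs.get(i) != 0) {
                List<Integer> halved = new ArrayList<>(xs);
                halved.set(i, xs.get(i) / 2);
                out.add(halved);
            }
        }
        return out;
    }
}
```

Calling MiniCheck.check(MiniCheck::maxIsAnElement, 42L, 200) finds a failing list of negative numbers and shrinks it to a single small negative element, which points much more directly at the bug than the original random input.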

A particular sweet spot for QuickCheck is finding race conditions. It's easy to convert sequential to parallel tests. John used QuickCheck to easily find some race conditions in a database implementation that had been eluding the developers for months.

NodeJS and the JavaScript-everywhere strategy - Matthew Eernisse
Matthew discussed how they're using Node at Yammer. Three of their customer-facing services use Node - the Upload service, the collaborative document editing service and an undisclosed new service.

Internally, Yammer use Jake and FooUnit for build and test. It's good to see Yammer developers open sourcing some of their work on Github.

Matthew described the Geddy web framework and showed how they use the yammer.util.promise framework for async code.

Webbit - a lightweight WebSocket web server for Java - Aslak Hellesoy
Webbit is a single-threaded, non-blocking, Java-based, high throughput web server (no servlet API). It supports 3 models:
  • HTTP request-response
  • WebSocket (long-lived 2 way connections)
  • EventSource (long-lived push server->client connections)
EventSource is a W3C standard and is implemented in most decent browsers. It's more lightweight than WebSockets.

At DRW, Aslak's team ported a trading app to Webbit, Chrome, WebSockets.

Aslak coded an HTTP client app in less than 100 lines (including Java, JS, HTML, CSS). See https://gist.github.com/1421652.

Blocking or long operations must be offloaded to another thread.
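
The general pattern looks something like the sketch below. It uses only JDK executors, not Webbit's API; the single-threaded executor simply stands in for the server's event-loop thread, and all names are made up.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class OffloadSketch {
    // Stand-in for the server's single event-loop thread (not Webbit's actual implementation).
    static final ExecutorService eventLoop = Executors.newSingleThreadExecutor();
    // Separate pool for slow or blocking work.
    static final ExecutorService workers = Executors.newFixedThreadPool(4);

    // Simulates a blocking call such as a database query.
    static String slowLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-for-" + key;
    }

    // Runs the blocking call on a worker, then hands the result back to the event-loop thread.
    static CompletableFuture<String> handle(String key) {
        return CompletableFuture
                .supplyAsync(() -> slowLookup(key), workers)
                .thenApplyAsync(value -> "response: " + value, eventLoop);
    }
}
```

The event-loop thread is never blocked by slowLookup, so it stays free to serve other connections while the worker is busy.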

Webbit has no hot-deploy capability out-of-the-box, and does not support HTTPS. (Aslak recommended stud as an HTTPS terminating proxy).

For testing, Webbit provides StubRequest, StubResponse and StubConnection classes. For end-to-end tests, the server can be started and stopped in 1.5ms.

Webbit tends to be embedded in an app (rather than the app being embedded in the web server). It has a small footprint, using about 100kb RAM and almost no CPU when idle. It has REST (JAX-RS) and RPC add-ons.

Above the Clouds - Introducing Akka - Peter Vlugter

It was a surprise to find out that NZ has a developer working for Typesafe as part of the core Akka development team.

Peter introduced the core features - Actors, STM (Software Transactional Memory) and Dataflow (a way of composing futures, inspired by the Oz language). They are looking at supporting Agents and Transactors (2 phase commit STM).
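
For a feel of what composing futures means, here is a tiny sketch using the JDK's CompletableFuture. This is not Akka's Dataflow API, just an analogous idea in plain Java, and the values are made up.

```java
import java.util.concurrent.CompletableFuture;

final class DataflowSketch {
    // Each value is computed asynchronously; the total is defined in terms of the other two
    // and is only evaluated once both are available.
    static CompletableFuture<Integer> total() {
        CompletableFuture<Integer> price = CompletableFuture.supplyAsync(() -> 40);
        CompletableFuture<Integer> shipping = CompletableFuture.supplyAsync(() -> 2);
        return price.thenCombine(shipping, Integer::sum);
    }
}
```

The calling code describes how the values depend on each other and never blocks waiting for an individual result.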

Akka supports Scala and Java APIs.

Peter gave a good, detailed introduction to Scala Actors.

Akka integrations include Camel, Spring, ZeroMQ, AMQP, Play framework, REST. It has a testing toolkit (and Peter is keen to explore a QuickCheck for actors). It is open-source, except for the commercial Atmos tool (trace, monitor, manage, provision).

General
The keynotes were entertaining and insightful. Hearing Cameron Purdy describe potential JVM enhancements gave me some warm and fuzzies about Java's future as a platform.

Mike Lee, Damian Conway, Mary Poppendieck, Simon Peyton-Jones and Jim Webber were outstanding speakers.

Thanks to the great company - especially John Hurst, Martin Paulo, Paul and Luke (the AEMOs) and Balaji, and to everyone else that made this great.

And above all, thanks to Dave Thomas, the YOW committee, the volunteers and speakers who made YOW happen.


WOW - that turned into an epic, which illustrates just how much I got out of YOW!


2 - What I will do next

Books to read
The Lean Startup - Eric Ries
REST in Practice - Jim Webber

Technologies to play with
Lively Web
QuickCheck
Node.js
Webbit

Presentations to watch online (that I missed)
The Post-Java Virtual Machine - Ola Bini
JRuby for the win - Ola Bini
Three 'Tall' Tales - Kevin O'Neill (mobile dev and testing)

3 - Impediments
Above all, the speakers brought out the joy of programming (especially Damian Conway!)

My main impediment is time. I really need to re-instate my "20% time" for research, play, innovation and general hacking.

Must be about time for my new year's resolutions...

Sunday, October 23, 2011

A message from the past from someone that cared

Last year I saw a tweet that a failing unit test was "a message from the past from someone that cared".

I was reminded of this recently when picking up a project that hadn't been touched for over 2 years. One of the acceptance tests failed with the following error:



The error message was sufficiently detailed to lead me straight to the solution. Thankfully the test writer[1] had cared enough to make sure that the failing test provided sufficient detail to be easily fixed.

The original tweeter was no doubt referring to failing unit tests helping to detect unwanted code regressions. When relying on external configuration or data, tests should also check any assumptions they are making about the data. In this case, the Given clause of the "Given-When-Then" statement is clear about its expectations, and the test fixture is checking that these preconditions are met.
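
A minimal sketch of a fixture helper that checks a Given precondition and fails with a detailed message. The helper, the settings map and the wording of the message are hypothetical; this is not the code from the project in question.

```java
import java.util.Map;

final class GivenCheck {
    // Hypothetical fixture helper: verifies a "Given" assumption about external data
    // and fails with enough detail to fix the environment.
    static void givenSettingIs(Map<String, String> settings, String key, String expected) {
        String actual = settings.get(key);
        if (!expected.equals(actual)) {
            throw new IllegalStateException(String.format(
                "Precondition not met: expected setting '%s' to be '%s' but was '%s'. "
                + "Check the test environment's configuration before re-running.",
                key, expected, actual));
        }
    }
}
```

The message names the setting, the expected value and the actual value, so whoever picks the project up years later can fix the environment without reading the fixture code.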

This type of environment issue could even be treated as a different type of test failure, as discussed in Triaging Test Failures.


[1] OK, the test writer was me :-) But since I hadn't worked on the application for over 2 years, it may as well have been someone else!

Saturday, February 26, 2011

Getting Groovy with Jar Versions

Oh how I love the Groovy!

I just needed to check version numbers of jar files in multiple folders and collate a table like:

While most jar files have the version number in either the Bundle-Version or Implementation-Version attributes of the MANIFEST.MF file, some do not follow this convention, and we need to either look for a file ending in _VERSION, or parse a pom.xml within a META-INF subfolder to get the version number.
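
For comparison, the manifest part of that lookup can be sketched in plain Java as below. This is not the Groovy script from this post; the class and method names are made up, and the fallbacks (a file ending in _VERSION, or a pom.xml under META-INF) are left out.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

final class JarVersions {
    // Returns Bundle-Version if present, else Implementation-Version, else null.
    static String versionFrom(Manifest manifest) {
        Attributes main = manifest.getMainAttributes();
        String version = main.getValue("Bundle-Version");
        return version != null ? version : main.getValue("Implementation-Version");
    }

    // Reads the version from a jar's MANIFEST.MF, or returns null if none can be found.
    static String versionOf(File jarFile) throws IOException {
        try (JarFile jar = new JarFile(jarFile)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            return manifest == null ? null : versionFrom(manifest);
        }
    }
}
```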

Groovy makes it so easy to do this stuff. On top of this, creating the HTML table is a breeze. The icing on the cake is to save the file and open in Firefox in a couple of lines.



This would have taken hundreds of lines in Java. Groovy makes it concise, while allowing access to all Java libraries, resulting in code that is readable (if a little alien at first) to a Java developer.




Monday, January 3, 2011

concordion-extensions is now a Concordion sub-project

I'm pleased to announce that concordion-extensions is now a sub-project of Concordion. This project includes extensions to add screenshots and logging information to Concordion output, and to change the format of the timestamp in the footer. See http://concordion.org/Extensions.html for further details.

An updated release, concordion-extensions v1.0.1, is now also available in Maven central. See http://concordion.org/Download.html for details.

The classes have been repackaged under org.concordion.ext. Any users of concordion-extensions v1.0.0 will need to update their package definitions with this new release.

If anyone has ideas for additional extensions, please discuss them on the Concordion mailing list, or add them to http://code.google.com/p/concordion-extensions/wiki/ExtensionIdeas.

Update:
Oct 26, 2014. These extensions are now packaged as individual projects and hosted on Github. See http://concordion.org/Extensions.html for details.

Saturday, October 16, 2010

Displaying screenshots on acceptance test failures

When running GUI tests it can be difficult to determine what was being shown on the GUI at the point of failure, especially if the tests are running minimised or on a CI server.

We developed a screenshot extension for Concordion to include screenshots in the output whenever failures occur. We've been finding this extension incredibly useful for diagnosing intermittent failures in our automated acceptance tests.

This requires the new Concordion 1.4.1, which introduces support for extensions. Amongst other things, extensions can add new listeners to commands and make changes to the Concordion output.

The following example is running an end-to-end test using WebDriver (Selenium 2) to test a Google search through the Firefox browser.

While the results show that the test is failing since Netherlands is displayed in the Google results, it would be helpful to see the actual results page. The screenshot extension allows us to hover over the failure to see an image of the browser page at the time the failure occurred.



Clicking on the image opens it for further inspection:

In this example, we've configured the extension to use WebDriver's TakesScreenshot interface, so we see an image of just the browser page, irrespective of whether it is currently visible on the screen.

The extension can also be used to take screenshots on success, and includes a command to explicitly take screenshots.

The screenshot extension project is on Github. You'll need to set the concordion.extensions system property to use it - see the README for details.

The source code for this example is in the demo project.

Acknowledgements:
This extension was partly inspired by Mark Derricutt's ScreenshotCommand, and by Adam Setch's post to the Concordion list.

UPDATES
Oct 25 2010. Added acknowledgements
Jan 03 2011. Project moved to Google Code and docs to Concordion site.
Oct 26 2014. Updated project and demo location to Github projects

Sunday, September 12, 2010

What's happening in my acceptance tests?

Agile Acceptance Testing allows us to describe desired behaviour using examples that describe the business intent. Good acceptance tests are written as plain language specifications, not scripts. Implementation details are coded in a separate test "fixture" class.

One downside of this approach is a loss of transparency of what the tests are actually doing. The "fixtures" are often written by developers, so testers may need a leap of faith to trust them. On a recent project, this trust was dented when the tests didn't do what they were supposed to be doing.

With this in mind, I set out to provide more insight into what our tests are actually doing, without undermining the principles of acceptance testing.

The result is a logging extension for Concordion. This adds a "tooltip" to the Concordion output HTML that shows log output when hovered over:


This tooltip is proving useful not only to testers, but also for developers to gain insight into what is happening in their tests and to find performance improvements. For example, in the above output we were surprised to see the web page being loaded twice, and a number of element lookups being duplicated.

This approach could also be used for user documentation of the steps required to complete an action, potentially with screen shots, or even as an embedded screen cast.

Implementation Details
We've added a new extension mechanism to Concordion 1.4.1 to make it easy to add features such as this.

This extension is available on Github. The extension captures java.util.logging output and has a number of configuration options. You'll need to set the concordion.extensions system property to use it - see the README for details.
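
To give an idea of how java.util.logging output can be captured at all, here is a minimal in-memory Handler. It is only a sketch of the underlying JDK mechanism with a made-up class name; it is not the extension's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;

// Collects log messages in memory so they can later be rendered alongside test output.
final class CapturingHandler extends Handler {
    private final List<String> messages = new ArrayList<>();

    @Override
    public synchronized void publish(LogRecord record) {
        // isLoggable applies this handler's level and filter, if any are set.
        if (isLoggable(record)) {
            messages.add(record.getLevel() + ": " + record.getMessage());
        }
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }

    // Returns a copy so callers cannot modify the captured list.
    synchronized List<String> messages() {
        return new ArrayList<>(messages);
    }
}
```

A handler like this is attached with Logger.addHandler, after which anything the fixture logs through that logger ends up in messages().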

For an example using this Concordion extension with WebDriver, see the demo project.

UPDATES
Oct 4  2010. Changed github link to new concordion-extensions project
Oct 6  2010. Source code moved to trunk of Concordion
Oct 24 2010. Updated to reference Concordion 1.4.1 and concordion-extension-demo project
Jan 03 2011. Project moved to Google Code and docs to Concordion site.
Oct 26 2014. Updated links to new Github projects.

Sunday, June 27, 2010

Triaging test failures

One of the goals of Open Spaces conferences is to turn "corridor conversations" into the focal point of the conference. This was aptly demonstrated at CITCON ANZ when Richard Vowles introduced a topic we'd been discussing over kebabs the night before.

Richard has subsequently discussed the topic on the Illegal Argument podcast. This post is an extension of the discussion.

The problem
When running integration and acceptance tests, test failures may be caused by factors other than incorrect code. This is most apparent when performing end-to-end testing through to Enterprise Information Systems. A number of factors can cause the test to fail - system unavailable, test data not in required state.

It would be useful to categorise the failures by cause, for notification and reporting purposes. Developers should be notified of code-related issues, testers might be responsible for data issues, and sys ops for server errors. Over time it would also be useful to visualise how often server and data errors are occurring.

Richard provided the example:
Given a customer has been in arrears for over 90 days...
In order to run this test in an end-to-end environment, the test code has to get a customer in this state. Richard's system uses an AS/400 back-end, and it simply is not possible to automate the setup of a customer in this state. The test code may need to be configured with a specific customer id, or it may be smart enough to search for a customer in the required state.

Over time, the customer data may no longer be available. For example, periodic data refreshes may remove or update the customer details.

Richard's not the only one with this problem - I'm also seeing it on a current project.

The problem of finding adequate test data is exacerbated when the test updates the state:
Given a customer has been in arrears for over 90 days,
when her invoices are paid in full,
then her status is changed to black.
In this case, the destructive change to the customer's state means that the data is no longer suitable for running this test. The test needs to find a different customer in arrears the next time it is run. Since debtors are a finite resource, the test may be unrunnable at some stage.

Why run these fragile end-to-end tests?
With Agile Acceptance Testing (ATDD, BDD etc), the focus is on testing business examples that will prove to the customer that a feature is "done". Running the tests end-to-end provides the greatest assurance to the customer that the functional requirements are being met, and reduces the need for manual regression testing.

Depending on the project, it may be possible to implement these tests "under the covers" of the user interface, or using mocks for back-end functionality. We often use these approaches to drive the design of our code, possibly before the user interface or back-end are available. However, these approaches don't provide the full benefits that we get from end-to-end tests.

"Unrunnable" test result
In the past, I have tackled this issue by making assertions on the Given clause of the tests. If the test pre-conditions are not met, the test results in a failure.

The proposal made at CITCON is to introduce a new "Unrunnable" test result state. This state is neither success nor failure. The discussion led to introducing a new colour for this state, to differentiate it from red (failure) and green (success).

Triaging test results
Extending the idea, it would be useful to be able to triage test failures into user-defined categories. Depending on the nature of the failure (and possibly the severity) the failure would be assigned a category.

The CI server would send failure notifications to a category-specific list. For example, system failures would be notified to sys ops, data issues to testers, and code issues to developers.

Each category would be displayed with a different failure colour, allowing the causes of test failures to be tracked over time.

For some categories, for example server errors, it may not be worthwhile continuing with the test run. The test runner could potentially be configured to abort the test run dependent on the category of the failure.

Comparison with Pending state
Many BDD and ATDD tools already model a separate Pending or Unimplemented state - displayed in yellow (Cucumber), or grey (Concordion). The pending state can be viewed as one of these test failure conditions ("code unavailable").

Example
A test could be annotated as follows:
@Triage(nature="SERVER_ERROR", exception=HostUnavailableException.class)
public class ArrearsFixture {
    @Triage(nature="DATA_UNAVAILABLE", exception=NoDataException.class)
    public Customer findCustomerInArrears(Condition condition) {
        ....
    }
}
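
One way such an annotation could be defined and used to classify failures is sketched below. As far as I know no existing tool provides this, so everything here is hypothetical: the @Triage definition, the two exception classes, the simplified fixture method and the "CODE_FAILURE" default.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical annotation: maps an exception type to a failure category.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Triage {
    String nature();
    Class<? extends Throwable> exception();
}

class HostUnavailableException extends RuntimeException { }
class NoDataException extends RuntimeException { }

@Triage(nature = "SERVER_ERROR", exception = HostUnavailableException.class)
class ArrearsFixture {
    @Triage(nature = "DATA_UNAVAILABLE", exception = NoDataException.class)
    public String findCustomerInArrears() {
        throw new NoDataException(); // simulates the customer data having been refreshed away
    }
}

final class TriageClassifier {
    // The method-level annotation is checked first, then the class-level one;
    // anything unmatched is treated as an ordinary code failure.
    static String classify(Method method, Throwable failure) {
        Triage[] candidates = {
            method.getAnnotation(Triage.class),
            method.getDeclaringClass().getAnnotation(Triage.class)
        };
        for (Triage triage : candidates) {
            if (triage != null && triage.exception().isInstance(failure)) {
                return triage.nature();
            }
        }
        return "CODE_FAILURE";
    }
}
```

A test runner could call classify when a fixture method throws, then use the returned category to pick the notification list, the report colour, or whether to abort the run.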

On Hudson, test failures might show up as:

clearly showing that there is an ongoing issue with server errors, impacting on the team's ability to adequately test the system. Intermittent data-related issues are also causing some tests to be unrunnable.

I'm not aware of any test tools/frameworks currently offering this capability. Does anyone know of anything similar?

UPDATE:
1. This topic was discussed on the Illegal Argument list. I liked Mark Derricutt's point:
It's my thought that the finer grained reporting you CAN get the better,
whether you make use of it depends on the project and problem space.

After all - Exception is good enough to be thrown for all errors right?
IllegalArgumentException, IOException, FileNotFoundException are all rather
"controversial, mean many things to many people, and cause inconsistency and
confusion" - but we need that differentiation of exceptions to separate out
a chain of responsibility.

We know this in our code, but I think we also need this for our
builds/tests.