Improving throughput in the deployment pipeline

Over the last 18 months, the IBM Jazz Collaborative Lifecycle Management (CLM) team has been transforming the way it works by adopting continuous delivery best practices and methodologies, some of which you may already be aware of from our discussions on Jazz.net and at this year’s Innovate conference.

Through this blog, I want to share some of those experiences and showcase what we learnt, where we are today and where we want to be.

I got involved with this effort during the summer of 2012, when I was asked to start looking at a nascent collaborative effort brewing between two teams from IBM Rational and IBM Tivoli around continuous deployment. (For folks in the know, this effort eventually became a product offering called IBM SmartCloud Continuous Delivery (SCD), released in October 2012.)

The idea was simple: “Create automation that would allow our development team to request deployment of any CLM build into an environment that resembled a production-like environment.”

And what would our developers do with these environments? They could use them to run manual and automated tests, build proofs of concept, run pre-integration tests, reproduce customer issues, verify patches and hot fixes… the list could go on.

However, that was not our complete goal. Using these automated deployment capabilities, we also wanted to auto-provision similar production-like virtual systems with every new build, run a series of automated tests ranging from the simplest unit and functional tests up to performance, system and reliability tests, and record the results of each execution in both a human-readable and a report-consumable format. The goal was to identify issues much earlier in the development cycle.
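
To make that flow concrete, here is a minimal sketch in Ruby (the language our automation is written in) of the per-build cycle just described: provision an environment, deploy the build, run the test tiers in order, and record each result. Every method name, the tier list and the build label are illustrative placeholders, not our actual automation API.

```ruby
# Illustrative only: method names, tiers and the build label are placeholders.
TEST_TIERS = %w[smoke functional system performance reliability].freeze

# Stub implementations so the sketch runs standalone; the real steps would
# call out to our provisioning, deployment and test frameworks.
def provision_environment(build)
  "env-for-#{build}"                      # would request a production-like virtual system
end

def deploy_build(env, build)
  puts "Deploying #{build} to #{env}"     # would install and configure the CLM build
end

def run_tests(env, tier)
  { passed: true, log: "#{tier}.log" }    # would kick off the automated tests for this tier
end

def record_result(build, tier, result)
  puts "#{build} #{tier}: #{result[:passed] ? 'PASS' : 'FAIL'}"
end

def release_environment(env)
  puts "Releasing #{env}"
end

def run_pipeline_for(build_label)
  env = provision_environment(build_label)
  deploy_build(env, build_label)
  TEST_TIERS.each do |tier|
    result = run_tests(env, tier)
    record_result(build_label, tier, result)
    break unless result[:passed]          # stop as soon as a tier fails
  end
ensure
  release_environment(env) if env         # always give the environment back
end

run_pipeline_for('CLM-I20130601-1200')    # hypothetical integration build label
```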

BACKGROUND

As some of you may already know, we constantly “drink our own champagne” through our self-hosting efforts. Since we are the earliest adopters of our own code, we often find ourselves running into issues that are show-stoppers on our production environments. Some of these problems are extremely critical and need immediate attention. Often the only resolution is to involve the development team to spin up a hot fix, test it locally, bring down the production servers, apply the patch and bring the servers back up. These unscheduled downtimes not only result in lost productivity for the development teams, they are also a source of embarrassment in front of the wider consumer community who use these production servers for tracking our plans, reporting work items and defects, using the forums and so on. From our operations team’s perspective, finding an appropriate maintenance window is a challenging task, given the global nature of our development teams.

When we teased apart this issue, we realized that most of the problems we found in production should ideally have been found during our testing cycles. Talking to the test teams, we realized that they were overburdened by the amount of code change the development teams were putting in to support multiple releases. The test teams had a huge testing matrix covering all supported combinations of platforms, OS versions, product versions, upgrades, migration cases… the whole nine yards. They could barely keep up with the amount of testing and were constantly struggling with integration builds that had broken features, regressions, version differences and the like. This all came back to the development team and its practices for testing and delivering changes to the integration stream.

During development retrospectives, a typical theme that resonated was that the developers were trying to pack a feature-rich release into a relatively short period of time. Most development teams were testing features in non-production-like environments using borrowed, quickly-munged systems, and were often running manual tests to validate their changes.

So we explored the possibility of providing the development team with a mechanism for testing product features and changes in virtual production-like environments. These new environments would be like a developer’s own sandbox: something they could create with the click of a button, own for a short period of time and release once they were done. We also thought about providing them with the various automated testing capabilities that our test teams owned, ranging from the simplest smoke tests to the most complex system and performance tests.

We reasoned that if the development team could quickly deploy systems and run automated tests against them, we would be able to churn out more reliable integration builds with fewer broken features. Such builds would also free up our test teams to focus on advanced, complex testing rather than worrying about regressions creeping in. They would also give the test teams more open cycles to improve the test automation, which would feed back into the same framework the development team used for its tests. Builds that passed through the test team’s gates would be of much better quality and would improve the self-hosting experience for our operations team, who could then focus on overall system health and its organic growth. And eventually, our production systems, our development teams and the wider consumer community would suffer fewer downtimes. Overall, a win for everyone.

“EMPOWER” THE DEVELOPER

To successfully pull this use case off, we kept certain key objectives in mind for the individual developer:
  • It had to be a self-service model and had to rely on concepts that the developers were already familiar with.
  • It had to have a normal, consistent flow using the standardized patterns and system configuration definitions that we recommend to our customers and also use internally. However, at the same time, it had to be flexible enough to be changed easily.
  • It had to provide a rich catalog of functionality, configurations and topologies that one could choose from (a sketch of what such a catalog entry might look like follows this list).
  • The deployment aspect of this functionality had to run like a background task that did not require user intervention.
  • It had to follow a request/response model, where the system would notify the user when the environment was ready and available for them to consume.
  • And finally, it had to provide capabilities for executing automated tests along with setup and configuration of the CLM product itself.
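
As a thought experiment, here is a minimal Ruby sketch of the kind of catalog entry those objectives describe: a standardized topology with sensible defaults that a requester can override where needed. The topology names, attributes and values are illustrative assumptions, not our actual catalog.

```ruby
# Illustrative only: topology names, attributes and defaults are assumptions,
# not the actual CLM deployment catalog.
TOPOLOGY_CATALOG = {
  'single-server-evaluation' => {
    description: 'All CLM applications on one node, Derby database, Tomcat',
    nodes: 1,
    database: 'derby',
    app_server: 'tomcat'
  },
  'distributed-enterprise' => {
    description: 'JTS, CCM, QM and RM on separate nodes, DB2 and WebSphere',
    nodes: 5,
    database: 'db2',
    app_server: 'was'
  }
}.freeze

# Standardized pattern by default, flexible where needed: requesters override
# only the settings they care about.
def build_request(topology_name, overrides = {})
  TOPOLOGY_CATALOG.fetch(topology_name).merge(overrides)
end

# Example: the standard enterprise pattern, but against Oracle instead of DB2.
p build_request('distributed-enterprise', database: 'oracle')
```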

USE JAZZ BUILD DEFINITIONS FOR REQUESTS

Given that we were building Jazz applications using Jazz technologies, the most prudent way of driving any automation in our development environment, whether consumed by an individual or by our product builds, seemed to be Jazz build definitions in Rational Team Concert. This thinking also aligned with two fundamental behavioral patterns that we assumed of our development community:

  • All developers know how to request Jazz-based builds: requesting the deployment of a test environment should feel just like requesting a build. The build definitions should expose all the properties developers need to tweak the target environment configuration.
  • All developers know how to read build results: accessing the deployed test environment and the automated test results shouldn’t require any special knowledge on their part. All relevant references to host systems, provision requests, validation results and so on should be captured by the build result (under Tests, Logs and External Links), as sketched below.
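
The Ruby sketch below illustrates that second pattern under a couple of stated assumptions: that the properties defined in the deployment build definition are made available to the build script as environment variables (the exact mechanism depends on the build definition template), and that a later publishing step attaches the recorded references to the build result. The property names, defaults, URL and file name are all hypothetical.

```ruby
# Hypothetical property names; assumes the build definition's properties have
# been exposed to this script as environment variables by the build engine.
request = {
  build_label: ENV.fetch('BUILD_LABEL',     'unknown'),
  topology:    ENV.fetch('DEPLOY_TOPOLOGY', 'single-server-evaluation'),
  platform:    ENV.fetch('DEPLOY_PLATFORM', 'rhel6-x64'),
  database:    ENV.fetch('DEPLOY_DB',       'derby')
}

puts "Requesting deployment for #{request[:build_label]}: #{request.inspect}"

# Once the environment is up, record where it lives and what was requested, so
# a later step can publish these references back to the build result (under
# Logs and External Links). The URL and file name are placeholders.
deployed_host = "https://clm-#{request[:topology]}.example.test:9443/jts"
File.open('deployment-links.txt', 'w') do |f|
  f.puts "Deployed environment: #{deployed_host}"
  f.puts "Provision request:    #{request.inspect}"
end
```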

THE CLM “DEPLOYMENT PIPELINE”

Where We Are Today

Using a home-grown orchestrated build mechanism, we have a continuously running deployment pipeline for CLM that provisions six different production-like environments for executing the following types of automated tests: build verification/smoke, application/function, system acceptance, and performance/reliability acceptance. We record results from each of these deployments and test execution cycles in a common build tracking record that our development team uses to collaborate on the state of a given build. This item also records the approval/rejection state of each deployment and test execution phase and allows us to visualize trend data around the overall health of the milestone and release. A few sample charts can be viewed on the following team dashboards: Pipeline Metrics and Trend Data. The pipeline uses the IBM SmartCloud Continuous Delivery (SCD) 2.0 offering, integrates with Rational Team Concert for Jazz-based builds, and uses IBM Workload Deployer as its “Cloud” environment of choice. All our orchestration and deployment automation code is written as Opscode Chef recipes in the Ruby programming language and is executed on systems running a Jazz Build Engine process.
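
Since that automation is Chef-based, here is a minimal sketch of what a single deployment step might look like as a Chef recipe. It is illustrative only: the cookbook layout, attribute names (node['clm'][...]), install path and the ERB template are assumptions made for the sake of the example, not our actual recipes.

```ruby
# cookbooks/clm/recipes/deploy_jts.rb -- illustrative sketch; attribute names,
# paths and the template are assumptions, not our actual cookbook.

install_dir = '/opt/IBM/JazzTeamServer'        # a typical default install location

# Fetch the CLM build under test from its download location (set per request)
remote_file '/tmp/clm-build.zip' do
  source node['clm']['build_url']              # hypothetical attribute
  action :create_if_missing
end

execute 'extract-clm-build' do
  command "unzip -o /tmp/clm-build.zip -d #{install_dir}"
  not_if { ::File.exist?("#{install_dir}/server/server.startup") }
end

# Point the Jazz Team Server at the database chosen for this topology
template "#{install_dir}/server/conf/jts/teamserver.properties" do
  source 'teamserver.properties.erb'           # template shipped in the cookbook
  variables(db_host: node['clm']['db_host'],   # hypothetical attributes
            db_name: node['clm']['db_name'])
end

execute 'start-jts' do
  command "#{install_dir}/server/server.startup"
  not_if  'pgrep -f JazzTeamServer'            # crude guard against starting twice
end
```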

Where We Want To Go

Earlier this year, IBM acquired a company called urban{code} that offers a few interesting solutions around continuous build, deployment and release, called uBuild, uDeploy and uRelease. We are currently exploring the possibility of swapping out our current deployment technology (SCD) for uDeploy and are running a few proof-of-concept demos of an end-to-end deployment pipeline for a sample application. In addition, we are exploring the use of Jenkins alongside the Jazz Build Toolkit for the build, package and orchestration activities of our target CLM application on virtual systems. We are also trying to streamline our automated test infrastructure to allow for seamless integration with whichever build, package and deployment technology we use. Above all, we want to bring about a cultural and process change among our development teams, in which pipeline-based deployments and the corresponding automated test execution cycles help drive the overall quality of our products higher and enable our customers to adopt them into their production environments more quickly and easily.

Maneesh Mehra
CLM Automation Architect
IBM Rational Software Group