Purpose
This metric helps a team understand overall solution quality by studying two factors: test completeness and test
success. The test completion rate enables the team to manage what part of the test effort remains, including the risk
associated with potentially undiscovered quality issues. The test success rate helps the team to decide whether the
software or system is working correctly. This metric should be used in conjunction with the Defect Trends metric (see
Defects) to determine release readiness of the solution.
This metric also tracks actual testing progress against the plan, so that the team can act on early indications that
testing activity is falling behind. It is well known that when the schedule is under pressure in software development,
it is normally testing (especially development testing, that is, unit test and component test or functional
verification test) that is cut or reduced. Schedule slippage occurs day by day and week by week. With a formal metric
in place, it is much more difficult for the team to ignore the problem, and the team is more likely to take action.
When the term "test" is used in this metric, it refers to the executable tests that contain the steps needed to run
against the actual solution under development. These tests can be manual or automated. The Rational Unified Process
refers to these tests as "test scripts", but in some test methodologies and test tools, these are referred to as test
cases. To avoid confusion, this metric uses the term "test."
Definition
Test Execution Status is measured by reporting the following items once per iteration and trending throughout the
release cycle:
-
Tests planned.
-
Tests implemented.
-
Tests attempted.
-
Passed tests.
-
Failed tests.
-
Blocked tests.
An iteration is sometimes called a "test cycle," but the two terms are not synonymous. A single iteration may have
multiple test cycles, based on the build cycle. The recommendation is to capture this metric once per iteration,
regardless of the number of test cycles.
Terms
Tests planned = The number of tests scheduled to be executed in the iteration.
Tests implemented = The number of tests built and ready to be executed - both manually and automatically - in the
iteration.
Tests attempted = The number of tests that have been executed. It is the sum of the passed, failed, and blocked tests.
Passed tests = The number of tests that have a most recent result of pass.
Failed tests = The number of tests that have a most recent result of failed.
Blocked tests = The number of tests that cannot be executed completely to the last step of the test. For manual tests,
this means the tester could not execute all the steps of the test. For automated tests, the automated testing tool
reports a passing result, but the human test analyst determines that the test was invalid using information outside the
scope of what the automated testing tool can report.
Build Health is captured in IBM® Rational® Team Concert® and IBM® Rational® Insight®.
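The term definitions above can be sketched in code. The following is a minimal illustration, not the way any Rational
tool computes the metric: the IterationStatus class, the summarize_iteration function, and the "passed" / "failed" /
"blocked" outcome strings are all hypothetical names chosen for this example. It keeps only the most recent result
for each test and derives tests attempted as the sum of passed, failed, and blocked tests.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class IterationStatus:
    """One per-iteration data point (hypothetical structure)."""
    planned: int
    implemented: int
    passed: int
    failed: int
    blocked: int

    @property
    def attempted(self) -> int:
        # Tests attempted = passed + failed + blocked, per the Terms section.
        return self.passed + self.failed + self.blocked


def summarize_iteration(planned, implemented, results):
    """Build an IterationStatus from chronological (test_id, outcome) pairs.

    Outcomes are assumed to be "passed", "failed", or "blocked".
    """
    latest = {}
    for test_id, outcome in results:
        latest[test_id] = outcome  # a later result replaces an earlier one
    counts = Counter(latest.values())
    return IterationStatus(
        planned, implemented,
        counts["passed"], counts["failed"], counts["blocked"],
    )


# Illustrative data: T1 failed in an early test cycle, then passed on rerun.
status = summarize_iteration(
    planned=5,
    implemented=4,
    results=[("T1", "failed"), ("T2", "passed"), ("T1", "passed"), ("T3", "blocked")],
)
print(status.attempted, status.passed, status.failed, status.blocked)  # 3 2 0 1
```

Because only the latest result per test is counted, several test cycles within one iteration collapse into a single
data point, which matches the recommendation to capture the metric once per iteration.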
Analysis
Use either a line or bar graph that shows number of tests on the y-axis and the iterations along the x-axis. Categorize
the tests as indicated above. Ideally, show the results against a planned schedule for implementation and successful
execution. The following patterns might occur:
Rising slope
-
Tests planned, tests implemented, tests attempted, and passed tests: This is the desired trend.
-
Failed tests: Indicates decreasing solution quality and/or decreasing requirement quality.
-
Blocked tests: Indicates the test effort may be falling behind schedule and less is known about the quality of the
solution. It may also be an indicator of technical problems or test data problems in the test
environment.
Falling slope
-
Tests planned: Indicates tests are being removed from the scope of the test effort, possibly
indicating a decrease in overall scope for the release.
-
Tests implemented: Indicates there are not enough test resources to write the planned tests.
-
Tests attempted: Indicates there are not enough test resources to execute the planned and
implemented tests.
-
Passed tests: Depending on the trends of planned, implemented and attempted tests, this pattern
usually indicates decreasing solution quality, and/or previously passing tests are now failing.
-
Failed tests and blocked tests: This is the desired trend.
Flat line
-
Tests planned, tests implemented, and tests attempted: Indicates new tests are not being added to
the overall test effort for the release. Some root causes are: lack of test resources to implement and/or execute
tests, lack of clear requirements, no new requirements being delivered to test.
-
Passed tests: Indicates defects are not being corrected. Could also indicate a coincidental net
zero difference in the number of passing tests.
-
Failed tests and blocked tests: Indicates there is a lack of test resources to execute previously
failed or blocked tests, or defects are not being corrected, or a coincidental net zero difference in the number of
failing or blocked tests, or some combination of these issues. The test schedule may be in jeopardy in this
scenario.
Hockey stick
A hockey stick trend gradually increases or decreases, then takes a sharp turn in the upward direction, typically late
in the release cycle. This means the project is experiencing surprises at a time when things should be routine.
-
Tests planned, tests implemented, tests attempted: Indicates many new tests are being added to the
test effort, possibly due to: new requirements added to the scope of the project, previously ambiguous requirements
have been clarified, additional test resources added.
-
Passed tests: Indicates the following possible scenarios: defects are being corrected and verified
more quickly, requirements are more clear, additional test resources have been added, or some combination of these
items.
-
Failed tests and blocked tests: Indicates the following possible issues: new tests are failing or
blocked, previously passing tests are now failing or blocked, requirements are not clear, previously corrected
defects are recurring, new defects are being discovered.
Reverse hockey stick
A reverse hockey stick trend gradually increases or decreases, then takes a sharp turn in the downward direction,
typically late in the release cycle. This means the project is experiencing surprises at a time when things should be
routine.
-
Tests planned, tests implemented, tests attempted: Indicates tests have been removed from the
scope of the project - perhaps due to removal of requirements.
-
Passed tests: Indicates that a large number of tests that were previously passing are suddenly no
longer passing. Possible sources for this issue are: previously passing tests are now failing and/or newly
delivered solution quality is decreasing.
-
Failed tests and blocked tests: Indicates the following possible scenarios: defects are being
corrected and verified more quickly, requirements are more clear, additional test resources have been added, or
some combination of these items.
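The patterns above can be approximated with a simple heuristic. The sketch below is only one possible way to label a
series of per-iteration counts. The classify_trend and is_desired functions and the sharp_factor threshold are
assumptions made for this example and are not part of the metric definition. A final change that is much larger than
the earlier changes is treated as the "sharp turn" of a hockey stick. Otherwise the first and last values decide
between rising, falling, and flat.

```python
# Desired direction per category, taken from the "Rising slope" and
# "Falling slope" descriptions above.
DESIRED_TREND = {
    "planned": "rising",
    "implemented": "rising",
    "attempted": "rising",
    "passed": "rising",
    "failed": "falling",
    "blocked": "falling",
}


def classify_trend(counts, sharp_factor=3.0):
    """Label a per-iteration series with one of the five patterns.

    sharp_factor is a hypothetical threshold: the last change counts as a
    sharp turn when it is more than sharp_factor times the average size of
    the earlier changes.
    """
    if len(counts) < 3:
        raise ValueError("need at least three iterations to see a trend")
    deltas = [b - a for a, b in zip(counts, counts[1:])]
    earlier, last = deltas[:-1], deltas[-1]
    typical = sum(abs(d) for d in earlier) / len(earlier)
    # max(..., 1.0) avoids flagging a one-test change after a flat run.
    if abs(last) > sharp_factor * max(typical, 1.0):
        return "hockey stick" if last > 0 else "reverse hockey stick"
    net = counts[-1] - counts[0]
    if net > 0:
        return "rising"
    if net < 0:
        return "falling"
    return "flat"


def is_desired(category, counts):
    """True when the series follows the desired trend for its category."""
    return classify_trend(counts) == DESIRED_TREND[category]


print(classify_trend([10, 12, 14, 40]))    # hockey stick
print(is_desired("failed", [12, 9, 6, 4]))  # True
```

A label from a heuristic like this is only a prompt for investigation. The possible root causes listed for each
pattern still have to be checked by the team.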
Alternative Tracking Methods
S-Curve
The x-axis of the S-curve represents time units and the y-axis represents the number of test cases or test
points. By "S-curve" we mean that the data are cumulative over time and take on an "S" shape, because the
period of intense test activity causes a steep ramp-up in the planned curve. For the metric to be useful, it
should contain the following information on one graph:
-
Planned progress over time in terms of number of test cases or number of test points to be completed successfully
by week
-
Number of test cases attempted by week
-
Number of test cases completed successfully by week
As seen from the figure below, the testing plan is expressed in terms of a line curve, which is put in place before the
test begins. The lightly shaded bars are the cumulative number of test cases attempted, and the red bars represent the
number of successful test cases. With the plan curve in place, each week when the test is in progress, two more bars
(one for attempted and one for successful completion) are added to the graph. This example shows that during the rapid
test ramp-up period (the steep slope of the curve), for some weeks the test cases attempted were slightly ahead of plan
(which is possible), and the successes were slightly behind plan.
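The data behind such a graph can be prepared by accumulating the weekly counts. The following sketch assumes weekly
(non-cumulative) inputs. The s_curve_rows function name, the dictionary keys, and all of the numbers are
illustrative and do not come from the figure.

```python
from itertools import accumulate


def s_curve_rows(planned_per_week, attempted_per_week, successful_per_week):
    """Return cumulative plan, attempted, and successful counts per week.

    Only weeks that have actuals are reported, because zip() stops at the
    shortest input.
    """
    plan = accumulate(planned_per_week)
    attempted = accumulate(attempted_per_week)
    successful = accumulate(successful_per_week)
    rows = []
    for week, (p, a, s) in enumerate(zip(plan, attempted, successful), start=1):
        rows.append({
            "week": week,
            "plan": p,
            "attempted": a,
            "successful": s,
            "attempted_vs_plan": a - p,    # positive means ahead of plan
            "successful_vs_plan": s - p,   # negative means behind plan
        })
    return rows


# Illustrative numbers: a five-week plan with three weeks of actuals.
rows = s_curve_rows(
    planned_per_week=[5, 20, 40, 20, 5],
    attempted_per_week=[6, 22, 41],
    successful_per_week=[4, 18, 36],
)
for row in rows:
    print(row)
```

In this made-up data, attempts run slightly ahead of the cumulative plan while successes run behind it, which is the
situation described for the ramp-up period above.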
Test Points
To track test points, the team needs to express the test plan (the amount of testing done every week) and track the
week-by-week progress in terms of test points. The figure below shows test point tracking for a product-level
test of a systems software product that was underway. The weights and the resulting test scores should be determined
in the test planning stage and remain unchanged during the testing process.
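A minimal sketch of test point tracking follows. It assumes that each test is given a numeric weight at planning
time. The test identifiers, the weight values, and the points_for function are hypothetical. The weights are wrapped
in a read-only mapping to reflect the advice that they stay unchanged during testing.

```python
from types import MappingProxyType

# Weights fixed at test planning time (illustrative values).
WEIGHTS = MappingProxyType({"T1": 1, "T2": 3, "T3": 5})


def points_for(test_ids, weights=WEIGHTS):
    """Sum the planning-time weights of the given tests."""
    return sum(weights[test_id] for test_id in test_ids)


planned_points = points_for(WEIGHTS)            # all planned tests
successful_points = points_for(["T1", "T3"])    # tests completed successfully
print(successful_points, "of", planned_points)  # 6 of 9
```

Tracking points instead of raw counts means that two of three tests passing is reported as 6 of 9 points here, so
progress on heavily weighted tests is more visible.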
Tables
Although the figures above give a quick visual status of the progress against the total plan and plan-to-date (the eye
can quickly determine if testing is ahead or behind on planned attempts and successes), it may be difficult to discern
the exact amount of slippage. For that reason, it is useful to also display testing status in tabular form, as in the
table below.
The table also shows underlying data broken out by department and product or component, which helps to identify problem
areas. In some cases, the overall test curve may appear to be on schedule, but because some areas are ahead of
schedule, they may mask areas that are behind when progress is only viewed at the system level.
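The masking effect can be shown with a small tabular report. This is a sketch under assumed inputs: the component
names, the counts, and the percent_of_plan, status_rows, and format_table functions are all made up for
illustration. Each row compares attempts and successes with the plan-to-date, and a final "System" row sums the
components.

```python
def percent_of_plan(actual, planned_to_date):
    """Return actual as a percentage of plan-to-date (0.0 if nothing planned)."""
    return 100.0 * actual / planned_to_date if planned_to_date else 0.0


def status_rows(components):
    """components maps a name to (planned_to_date, attempted, successful)."""
    rows = [
        (name, p, a, s, percent_of_plan(a, p), percent_of_plan(s, p))
        for name, (p, a, s) in components.items()
    ]
    totals = [sum(values[i] for values in components.values()) for i in range(3)]
    rows.append((
        "System", totals[0], totals[1], totals[2],
        percent_of_plan(totals[1], totals[0]),
        percent_of_plan(totals[2], totals[0]),
    ))
    return rows


def format_table(rows):
    """Render the rows as fixed-width plain text."""
    lines = [f"{'Area':<10}{'Planned':>9}{'Attempted':>11}{'Successful':>12}"
             f"{'Att %':>8}{'Succ %':>8}"]
    for name, p, a, s, pct_a, pct_s in rows:
        lines.append(f"{name:<10}{p:>9}{a:>11}{s:>12}{pct_a:>8.1f}{pct_s:>8.1f}")
    return "\n".join(lines)


# Illustrative data: component A is ahead of plan, component B is behind.
print(format_table(status_rows({"A": (100, 120, 110), "B": (100, 70, 60)})))
```

With these numbers the System row shows attempts at 95.0 percent of plan, while component B alone is at 70.0
percent, so the slippage is visible only in the per-component rows.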
Release-to-Release Comparison
For release-to-release comparisons, it is important to use weeks before product general availability (GA) as the
time unit for the x-axis. By referencing the GA dates, the comparison provides a true in-process status of the
current release.
In the figure below, we can see that the release represented by the red, thicker curve is more back-end loaded than the
release represented by the blue, thinner curve. In this context, the metric is both a quality and a schedule statement
for the release, as late testing will affect late cycle defect arrivals and hence the quality of the final product.
With this type of comparison, the project team can plan ahead (even before testing starts) to mitigate the risks.
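Aligning releases on weeks before GA can be sketched as follows. The weeks_before_ga and align_to_ga functions, the
dates, and the counts are assumptions made for this example. Each observation date is converted to a whole number of
weeks before that release's GA date, so that curves from different releases share one x-axis.

```python
from datetime import date


def weeks_before_ga(observation, ga):
    """Whole weeks from the observation date to the GA date."""
    return (ga - observation).days // 7


def align_to_ga(series, ga):
    """Re-key (date, value) observations by weeks before GA."""
    return {weeks_before_ga(day, ga): value for day, value in series}


# Two illustrative releases with different GA dates.
current = align_to_ga(
    [(date(2024, 6, 14), 80), (date(2024, 6, 21), 95)],
    ga=date(2024, 6, 28),
)
previous = align_to_ga(
    [(date(2023, 12, 1), 90), (date(2023, 12, 8), 98)],
    ga=date(2023, 12, 15),
)
for week in sorted(current, reverse=True):
    print(week, "weeks before GA:", current[week], "vs", previous.get(week))
```

In this made-up data the current release has completed fewer tests than the previous one at the same number of weeks
before GA, which is the kind of back-end loading the comparison is meant to expose.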
For more information, see In Process Metrics.
Frequency and reporting
Data is captured each day and monitored at the end of each iteration to help identify trends.
Collection and reporting tools
Test execution data is captured in IBM® Rational® Quality Manager®. IBM® Rational® Insight® provides an
out-of-the-box report focused on implemented tests and their status.
Assumptions and prerequisites
-
Test plans are updated as part of iteration planning and compared to actuals at the end of each iteration.
-
Test execution data is captured in a tool, tracking the status of each attempted test.
Pitfalls, advice, and countermeasures for this metric
The following items are indicators of measurement pitfalls and should be used to corroborate this metric:
-
Release size
-
Requirement quality
-
Test capacity
-
Defect aging