AUTOMATION METRICS
“When you can measure what you are speaking about, and can
express it in numbers, you know something about it; but when you cannot
measure it, when you cannot express it in numbers, your knowledge is of a
meager and unsatisfactory kind.”
--
Lord Kelvin, a physicist.
As part of a successful automated testing program, it is important to define goals and strategies and then implement them. During implementation, progress against the goals and strategies set out at the onset of the program needs to be tracked and measured continuously. This article discusses various automated and general testing metrics that can be used to measure and track that progress.
Based on the outcome of these metrics, the defects remaining to be fixed in a testing cycle can be assessed, schedules can be adjusted accordingly, or goals can be reduced. For example, if a feature still has too many high-priority defects, a decision can be made to move the ship date or to ship (or even go live) without that specific feature.
Success is
measured based on the goal we set out to accomplish relative to the
expectations of our stakeholders and customers.
If you can measure something, then you have something you can quantify. If you can quantify something, then you can explain it in more detail and know something more about it. If you can explain it, then you have a better chance of improving it, and so on.
Metrics can
provide insight into the status of automated testing efforts.
Automation
efforts can provide a larger test coverage area and increase the overall
quality of the product. Automation can also reduce the time of testing and the
cost of
delivery.
This benefit is typically realized over multiple test cycles and project
cycles. Automated testing metrics can aid in making assessments as to whether
progress, productivity and quality goals are being met.
What is a Metric?
The basic definition of a metric is a standard of measurement. It can also be described as a system of related measures that facilitates the quantification of some particular characteristic. For our purposes, a metric can be viewed as a measure that can be used to display past and present performance and/or to predict future performance.
What Are Automated Testing Metrics?
Automated
testing metrics are metrics used to measure the performance (e.g. past,
present, future) of the implemented automated testing process.
What Makes A Good Automated Testing Metric?
As with any metrics, automated testing metrics should be tied to clearly defined goals for the automation effort. It serves no purpose to measure something for the sake of measuring. To be meaningful, a metric should relate directly to the performance of the effort.
Prior to defining the automated testing metrics, there are metric-setting fundamentals you may want to review. Before measuring anything, set goals: what is it you are trying to accomplish? Goals are important; if you do not have goals, what is it that you are measuring? It is also important to track and measure on an ongoing basis. Based on the metrics' outcomes, you can then decide whether deadlines, feature lists, process strategies, etc., need to be adjusted. As a step toward goal setting, there may be questions that need to be asked about the current state of affairs. Decide what questions can be asked to determine whether or not you are tracking toward the defined goals. For example:
How much time does it take to run the test plan?
How is test coverage defined (KLOC, FP, etc.)?
How much time does it take to do data analysis?
How long does it take to build a scenario/driver?
How often do we run the test(s) selected?
How many permutations of the test(s) selected do we run?
How many people do we require to run the test(s) selected?
How much system time/lab time is required to run the test(s) selected?
Etc.
In essence, a good
automated testing metric has the following characteristics:
is Objective
is Measurable
is Meaningful
has data that is easily gathered
can help identify areas of test automation improvement
is Simple
A good metric is clear and not subjective, it is able to be measured, it has meaning to the project, it does not take enormous effort and/or resources to obtain the data for the metric, and it is simple to understand. A few more words about metrics being simple: Albert Einstein once said,
“Make everything as simple as possible, but not simpler.”
When applying this wisdom to software testing, you will see that:
Simple reduces errors
Simple is more effective
Simple is elegant
Simple brings focus
Percent Automatable
At
the beginning of an automated testing effort, the project is either automating
existing manual test procedures, starting a new automation effort from scratch,
or some combination of both. Whichever the case, a percent automatable metric
can be determined.
Percent
automatable can be defined as: of a set of given test cases, how many are
automatable? This could be represented in the following equation:
PA (%) = ATC / TC = (# of test cases automatable / # of total test cases)

PA = Percent Automatable
ATC = # of test cases automatable
TC = # of total test cases
In evaluating test cases to be developed, what is to be considered automatable and what is not? Given enough ingenuity and resources, one can argue that almost anything can be automated. So where do you draw the line? Something that could be considered ‘not automatable,’ for example, is an application area that is still under design, is not very stable, and is largely in flux. In cases such as this, we should “evaluate whether it makes sense to automate.”
We would evaluate, for example, given the set of automatable test cases, which ones would provide the biggest return on investment: “just because a test is automatable doesn’t necessarily mean it should be automated.”
When going through the test case development process, determine which tests both can be automated and make sense to automate. Prioritize your automation effort based on the outcome. This metric can be used, for example, to summarize the percent automatable of various projects or components within a project, and to set the automation goal.
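As an illustration, here is a minimal Python sketch of the calculation; the component names and counts are invented for the example and are not part of any particular project.

```python
def percent_automatable(automatable, total):
    """PA (%) = # of test cases automatable / # of total test cases, as a percentage."""
    return 100.0 * automatable / total if total else 0.0

# Hypothetical per-component counts: (automatable test cases, total test cases)
components = {
    "login": (45, 50),
    "reporting": (120, 200),
    "billing": (30, 90),
}

for name, (atc, tc) in components.items():
    print(f"{name}: PA = {percent_automatable(atc, tc):.1f}%")

overall_atc = sum(a for a, _ in components.values())
overall_tc = sum(t for _, t in components.values())
print(f"overall: PA = {percent_automatable(overall_atc, overall_tc):.1f}%")
```

Summarizing PA per component in this way makes it easier to set a realistic automation goal for each area rather than one blanket target.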
Automation Progress
Automation Progress refers to how many of the automatable test cases have been automated at a given point in time. Basically, how well are you doing against the goal of automated testing? The goal is to automate 100% of the “automatable” test cases. This metric is useful to track during the various stages of automated testing development.
AP (%) = AA / ATC = (# of actual test cases automated / # of test cases automatable)

AP = Automation Progress
AA = # of actual test cases automated
ATC = # of test cases automatable
The Automation Progress metric is typically tracked over time; in the case below, time is measured in weeks.
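A minimal sketch of that kind of weekly tracking follows; the totals and weekly counts are invented for illustration.

```python
def automation_progress(automated, automatable):
    """AP (%) = # of test cases automated / # of test cases automatable, as a percentage."""
    return 100.0 * automated / automatable if automatable else 0.0

automatable_total = 195                          # hypothetical ATC for the project
automated_by_week = [10, 35, 70, 110, 150, 180]  # hypothetical cumulative AA per week

for week, automated in enumerate(automated_by_week, start=1):
    print(f"week {week}: AP = {automation_progress(automated, automatable_total):.1f}%")
```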
A common metric closely associated with the progress of automation, yet not exclusive to automation, is Test Progress. Test Progress can simply be defined as the number of test cases attempted (or completed) over time.
TP = TC / T = (# of test cases (attempted or completed) / time (days/weeks/months, etc.))

TP = Test Progress
TC = # of test cases (either attempted or completed)
T = some unit of time (days / weeks / months, etc.)
The purpose of this metric is to track test progress and compare it to the plan. It can be used to show how testing is tracking against the overall project plan. Test Progress over the period of a project usually follows an “S” shape, which mirrors the testing activity during the project lifecycle: little initial testing, followed by an increased amount of testing through the various development phases, into quality assurance, prior to release or delivery.
This is a metric to show progress over time. A more detailed
analysis is needed to determine pass/fail, which can be represented in other
metrics.
Percent of Automated Testing Test Coverage
Another automated software testing metric to consider is Percent of Automated Testing Test Coverage. That is a long title for a metric that determines what test coverage the automated testing is actually achieving. It indicates the completeness of the testing. This metric does not so much measure how much automation is being executed as how much of the product’s functionality is being covered. For example, 2,000 test cases executing the same or similar data paths may take a lot of time and effort to execute, yet they do not equate to a large percentage of test coverage. Percent of automated testing coverage does not say anything about the effectiveness of the testing taking place; it is a metric to measure its dimension.
PTC (%) = AC / C = (automation coverage / total coverage)

PTC = Percent of Automated Testing Test Coverage
AC = Automation Coverage
C = Total Coverage (KLOC, FP, etc.)
Size of system is usually counted as lines of code (KLOC) or function points (FP). KLOC is a common method of sizing a system; however, FP has also gained acceptance. Some argue that FPs can be used to size software applications more accurately. Function Point Analysis was developed in an attempt to overcome difficulties associated with KLOC (or just LOC) sizing. Function Points measure software size by quantifying the functionality provided to the user, based on logical design and functional specifications. There is a wealth of material available regarding the sizing or coverage of systems. A useful resource is Stephen H. Kan’s book “Metrics and Models in Software Quality Engineering” (Addison-Wesley, 2003).
The
Percent Automated Test Coverage metric can be used in conjunction with the
standard software testing metric called Test Coverage.
TC (%) = TTP / TTR = (total # of Test Procedures developed / total # of defined Test Requirements)

TC = Percent of Testing Coverage
TTP = Total # of Test Procedures developed
TTR = Total # of defined Test Requirements
This measurement of test coverage divides the total number of test procedures developed by the total number of defined test requirements. This metric provides the test team with a barometer to gauge the depth of test coverage. The depth of test coverage is usually based on the defined acceptance criteria. When testing a mission-critical system, such as an operational medical system, the test coverage indicator would need to be high relative to the depth of test coverage for non-mission-critical systems. The depth of test coverage for a commercial software product that will be used by millions of end users may also be high relative to a government information system with a couple of hundred end users.
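As a small illustration, the sketch below computes both coverage ratios from invented numbers; the function-point and test-procedure counts are hypothetical.

```python
def ratio_percent(part, whole):
    """Generic coverage ratio expressed as a percentage."""
    return 100.0 * part / whole if whole else 0.0

# Hypothetical sizing in function points (FP)
automation_coverage_fp = 320     # functionality exercised by automated tests
total_coverage_fp = 400          # total functionality of the system

# Hypothetical test procedure counts
test_procedures_developed = 180
test_requirements_defined = 220

print(f"PTC = {ratio_percent(automation_coverage_fp, total_coverage_fp):.1f}%")            # 80.0%
print(f"TC  = {ratio_percent(test_procedures_developed, test_requirements_defined):.1f}%")  # 81.8%
```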
Defect Density
Measuring defects is a discipline to be implemented regardless of whether the testing effort is automated or not. Josh Bloch, Chief Java Architect at Google, stated:
“Regardless of how talented and meticulous a developer is, bugs and security vulnerabilities will be found in any body of code – open source or commercial. Given this inevitability, it’s critical that all developers take the time and measures to find and fix these errors.”
Defect density is another well-known metric, not specific to automation. It is a measure of the total known defects divided by the size of the software entity being measured. For example, if there is a high defect density in a specific functionality, it is important to conduct a causal analysis. Is this functionality very complex, and therefore a high defect density is to be expected? Is there a problem with the design or implementation of the functionality? Were the wrong (or not enough) resources assigned to the functionality because an inaccurate risk had been assigned to it? It could also be inferred that the developer responsible for this specific functionality needs more training.
DD = D / SS = (# of known defects / total size of system)

DD = Defect Density
D = # of known defects
SS = Total Size of system
One use of defect density is to map it against software component size. A typical defect density curve that we have experienced looks like the following, where small and larger-sized components have a higher defect density ratio, as shown below. Additionally, when evaluating defect density, the priority of the defect should be considered. For example, one application requirement may have as many as 50 low-priority defects and still pass because the acceptance criteria have been satisfied. Still, another requirement might have only one open defect that prevents the acceptance criteria from being satisfied because it is a high priority. Higher-priority requirements are generally weighted more heavily.
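One way to fold priority into the calculation is sketched below; the component sizes, defect counts, and weights are all made up for illustration, and a real project would derive the weights from its own acceptance criteria.

```python
# Hypothetical per-component data: size in KLOC and known defects by priority
components = {
    "parser":    {"kloc": 4.0,  "defects": {"high": 2, "medium": 5, "low": 12}},
    "scheduler": {"kloc": 12.5, "defects": {"high": 1, "medium": 3, "low": 4}},
}

# Illustrative priority weights (an assumption, not from the source)
weights = {"high": 5, "medium": 2, "low": 1}

for name, data in components.items():
    total = sum(data["defects"].values())
    weighted = sum(weights[p] * n for p, n in data["defects"].items())
    dd = total / data["kloc"]        # DD = known defects / size of component
    wdd = weighted / data["kloc"]    # priority-weighted defect density
    print(f"{name}: DD = {dd:.2f} defects/KLOC, weighted DD = {wdd:.2f}")
```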
The
graph below shows one approach to utilizing the defect density metric. Projects
can be tracked over time (for example, stages in the development cycle).
Another metric closely related to Defect Density is Defect Trend Analysis. Defect Trend Analysis is calculated as:
DTA = D / TPE = (# of known defects / # of test procedures executed)

DTA = Defect Trend Analysis
D = # of known Defects
TPE = # of Test Procedures Executed over time
Defect Trend Analysis can help determine the trend of defects found. Is the trend improving as the testing phase is winding down, or is it worsening? Defects that the test automation uncovered, which manual testing didn’t or couldn’t have found, are an additional way to demonstrate ROI. During the testing process, we have found defect trend analysis to be one of the more useful metrics for showing the health of a project. One approach to showing the trend is to plot the total number of defects along with the number of open Software Problem Reports, as shown in the graph below (graph adapted from http://www.teknologika.com/blog/SoftwareDevelopmentMetricsDefectTracking.aspx).
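A minimal plotting sketch of that kind of trend chart is shown below; the weekly counts are invented, and matplotlib is assumed to be available.

```python
import matplotlib.pyplot as plt

weeks = list(range(1, 9))
total_defects = [5, 14, 30, 52, 70, 82, 88, 91]   # hypothetical cumulative defects found
open_sprs = [5, 12, 22, 30, 28, 20, 12, 6]        # hypothetical open Software Problem Reports

plt.plot(weeks, total_defects, marker="o", label="Total defects found")
plt.plot(weeks, open_sprs, marker="s", label="Open SPRs")
plt.xlabel("Week")
plt.ylabel("Count")
plt.title("Defect trend")
plt.legend()
plt.show()
```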
Effective Defect Tracking Analysis can present a clear view of the status of testing throughout the project. A few additional common defect-related metrics are as follows:
Cost to locate defect = cost of testing / number of defects located
Defects detected in testing = defects detected in testing / total system defects
Defects detected in production = defects detected in production / system size
Some of these metrics
can be combined and used to enhance quality measurements as shown in the next
section.
Actual Impact on Quality
One of the more popular metrics for tracking quality through testing (if defect count is used as a measure of quality) is Defect Removal Efficiency (DRE). It is not specific to automation, but it is very useful when used in conjunction with automation efforts. DRE is a metric used to determine the effectiveness of your defect removal efforts. It is also an indirect measurement of the quality of the product. The value of DRE is calculated as a percentage; the higher the percentage, the greater the positive impact on the quality of the product, because it represents the timely identification and removal of defects at any particular phase.
DRE (%) = DT / (DT + DA) = (# of defects found during testing / (# of defects found during testing + # of defects found after delivery))

DRE = Defect Removal Efficiency
DT = # of defects found during testing
DA = # of acceptance defects found after delivery

The highest attainable value of DRE is “1,” which equates to 100%. In practice, we have found that an efficiency rating of 100% is not likely. DRE should be measured during the different development phases. If the DRE is low during analysis and design, it may indicate that more time should be spent improving the way formal technical reviews are conducted, and so on.
This
calculation can be extended for released products as a measure of the number of
defects in the product that were not caught during the product development or
testing phase.
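As a worked example with invented counts: if 95 defects are found during testing and 5 more after delivery, DRE = 95 / (95 + 5) = 95%. A minimal sketch of the calculation:

```python
def defect_removal_efficiency(found_in_testing, found_after_delivery):
    """DRE (%) = DT / (DT + DA), expressed as a percentage."""
    total = found_in_testing + found_after_delivery
    return 100.0 * found_in_testing / total if total else 0.0

# Hypothetical counts for one release
print(defect_removal_efficiency(95, 5))   # -> 95.0
```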
Other Software Testing Metrics
Along
with the metrics mentioned in the previous sections, here are a few more common
test metrics. These metrics do not necessarily just apply to automation, but
could be, and most often are, associated with software testing in general.
These metrics are broken up into three categories:
Coverage: Meaningful parameters for measuring test scope and success.
Progress: Parameters that help identify test progress to be matched against success criteria. Progress metrics are collected iteratively over time. They can be used to graph the process itself (e.g. time to fix defects, time to test, etc.).
Quality: Meaningful measures of excellence, worth, value, etc. of the testing product. It is difficult to measure quality directly; however, measuring the effects of quality is easier and possible.
The following table is adapted from “Automated Software Testing” (Addison-Wesley, 1999, Dustin et al.).
Metric Name | Description | Category
Test Coverage | Total number of test procedures / total number of test requirements. The Test Coverage metric indicates planned test coverage. | Coverage
System Coverage Analysis | The System Coverage Analysis measures the amount of coverage at the system interface level. | Coverage
Test Procedure Execution Status | Executed number of test procedures / total number of test procedures. The Test Procedure Execution metric indicates the extent of the testing effort still outstanding. | Progress
Error Discovery Rate | Number of total defects found / number of test procedures executed. The Error Discovery Rate metric uses the same calculation as the defect density metric. It is used to analyze and support a rational product release decision. | Progress
Defect Aging | Date defect was opened versus date defect was fixed. The Defect Aging metric provides an indication of defect turnaround. | Progress
Defect Fix Retest | Date defect was fixed and released in a new build versus date defect was re-tested. The Defect Fix Retest metric provides an idea of whether the testing team is re-testing fixes fast enough to produce an accurate progress metric. | Progress
Current Quality Ratio | Number of test procedures successfully executed (without defects) versus the number of test procedures. The Current Quality Ratio metric provides indications about the amount of functionality that has successfully been demonstrated. | Quality
Quality of Fixes | Number of total defects reopened / total number of defects fixed. The Quality of Fixes metric provides indications of development issues. | Quality
Quality of Fixes | Ratio of previously working functionality versus new errors introduced. This metric keeps track of how often previously working functionality was adversely affected by software fixes. | Quality
Problem Reports | Number of Software Problem Reports broken down by priority. The Problem Reports Resolved measure counts the number of software problems reported, listed by priority. | Quality
Test Effectiveness | Test effectiveness needs to be assessed statistically to determine how well the test data has exposed defects contained in the product. | Quality
Test Efficiency | Number of tests required / the number of system errors. | Quality