Metrics: a "must-have" when reporting to the Project Manager for some, a dreaded tool of repression for others. For a development team, however, they should be a tool and a source of support.
Intelligently gathered metrics can be very useful in diagnosing problems with product quality and with the functioning of an entire project or team. For this reason, one of the basic metrics that should be used in every project is the trend in software test results over time. It takes little effort to collect, yet it reveals aspects of the process that become visible only across a period of time. Below are several real-life examples of this metric, based on automatic test results, that were used to find and solve various problems. They show how helpful monitoring test result trends is in assessing software quality.
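As a rough illustration of how little effort the metric needs, the sketch below turns raw automated test outcomes into per-version percentages that can then be plotted over time. The status labels "passed" and "failed", the function name result_trend and the sample data are assumptions made for this example; only test_error comes from the examples discussed here.

```python
from collections import Counter


def result_trend(results):
    """Return {version: {status: percent}} in first-seen version order.

    `results` is an iterable of (version, status) pairs, one per executed test.
    """
    counts = {}
    for version, status in results:
        counts.setdefault(version, Counter())[status] += 1

    trend = {}
    for version, counter in counts.items():
        total = sum(counter.values())
        trend[version] = {s: round(100 * n / total, 1) for s, n in counter.items()}
    return trend


# Hypothetical data: four tests run against each of two versions.
sample = [
    ("1.94", "passed"), ("1.94", "passed"), ("1.94", "failed"), ("1.94", "passed"),
    ("1.95", "passed"), ("1.95", "test_error"), ("1.95", "test_error"), ("1.95", "test_error"),
]

for version, shares in result_trend(sample).items():
    print(version, shares)
```

Storing percentages rather than raw counts keeps versions comparable even when the number of automated tests grows between releases.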
Version 1.95 was the last release before the changes from a separate branch, the one developed as version 2.x, were merged. These changes evidently came as a surprise to the testing team, as the test_error results returned by most automated tests suggest. What is more, it is clear that until the 1.3 version, some defects were “hidden” by test_error results and could be neither detected nor repaired at that point.
The reason: faulty cooperation between the developer and tester teams.
The result: a period of non-operational automatic testing and late detection of old defects.
The effect of a constant number of test_error results.
It can be seen that although for versions 2.104 through 2.106 new errors appeared in automatic tests, they were corrected and the number of test_error results came back to its constant value (10-20%).
The reason: the team concentrates on creating new tests and does not assign anyone to maintain the previously developed tests already included in the test cycle.
The result: the team gets used to a certain number of tests failing and thus allows some areas of the software to remain untested.
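A plateau of this kind is easy to overlook on a chart but simple to flag automatically. The following is a minimal sketch, assuming the share of test_error results per version is already available as a list of (version, percent) pairs. The function name, the thresholds min_share and min_run, and all sample values are hypothetical.

```python
def persistent_test_error(shares_by_version, min_share=10.0, min_run=3):
    """Return runs of consecutive versions whose test_error share stays at or above min_share.

    Only runs of at least `min_run` versions are reported, so a single bad
    build is not treated as a chronic problem.
    """
    runs, current = [], []
    for version, share in shares_by_version:
        if share >= min_share:
            current.append(version)
        else:
            if len(current) >= min_run:
                runs.append(current)
            current = []
    if len(current) >= min_run:
        runs.append(current)
    return runs


# Hypothetical percentages: a spike that falls back to a non-zero baseline.
history = [
    ("2.102", 12.0), ("2.103", 15.0), ("2.104", 30.0),
    ("2.105", 35.0), ("2.106", 28.0), ("2.107", 14.0),
]
print(persistent_test_error(history))
```

A non-empty result suggests that broken tests are being tolerated rather than repaired, which is the habit described above.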
For over a month no test_error results were recorded, although developers introduced many changes during that time. The only tests that failed did so because of application defects.
The reason: the automation area is not aligned with the development area of the software – the testing team did not prepare automated tests for the area actively developed at that time.
The result: the false conviction that the application is well covered by tests in all areas and the tests are “self-sufficient.”
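One way to catch this mismatch early is to compare the areas developers are changing with the areas that automated tests exercise. The sketch below assumes both lists can be obtained somehow (for example from commit metadata and test tags); the function name and the area names are purely illustrative.

```python
def untested_active_areas(changed_areas, automated_areas):
    """Return areas with recent changes but no automated tests, sorted for stable output."""
    return sorted(set(changed_areas) - set(automated_areas))


# Hypothetical area names.
changed = ["billing", "reports", "billing"]
automated = ["login", "reports"]
print(untested_active_areas(changed, automated))
```

An unusually quiet test_error trend combined with a non-empty list here is a hint that the tests are stable only because they never touch the code being changed.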
The sporadic test_error results are repaired quickly, and their number always comes back to 0. The number of failed tests due to software defects remains below 10% most of the time, which is usually considered an acceptable pre-release value.
A proper testing team reaction can be seen when the number of test_error results grows rapidly at version 20.352 and quickly reverts to the default value at version 20.354, which means the faulty tests have been corrected. At this stage, no significant problems are identified.
Of course, in some cases, the test result trends will not suffice to draw any rational conclusions. Still, this metric is always a good source of general information on the project and the areas of the process which could use additional analysis.