Automation Metrics That Actually Matter
Most teams track the wrong automation numbers. Test count and coverage percentage tell you almost nothing. Here's what to measure instead.
Your test suite has 2,000 automated tests. What does that tell you?
Nothing useful. It tells you how many tests exist. It doesn't tell you if they catch bugs, how long they take to run, or whether they're worth maintaining.
Here are the metrics that actually measure automation value.
The Vanity Metrics to Stop Tracking
Test count: A meaningless number. 2,000 weak tests are worse than 200 strong ones — they take longer, fail more often, and provide false confidence.
Code coverage percentage: 80% coverage with bad assertions is almost as useless as 0%. Coverage measures lines executed, not logic verified.
Pass rate on green builds: A green build is by definition one where the tests passed, so this number always looks good. It says nothing about whether the tests fail when they should.
These metrics optimize for gaming. Teams add tests to hit numbers. The tests don't add value; they add maintenance burden.
Metric 1: Defect Detection Rate
Defect Detection Rate = Bugs caught by automation / Total bugs found
If your automation finds 30 of the 50 bugs discovered before release, your DDR is 60%.
Track this over time. If DDR drops, your test suite isn't keeping pace with the product. If DDR increases, your investment is working.
How to measure: Tag every bug with how it was found: automated test, manual exploratory, prod report, user complaint. Build a simple spreadsheet. Review monthly.
Target: automation should catch at least 50% of pre-production bugs in a mature codebase.
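As a rough sketch of the spreadsheet calculation, the snippet below computes DDR from a list of "how it was found" tags. The tag strings and the `defect_detection_rate` helper are illustrative assumptions, not part of any particular bug tracker.

```python
from collections import Counter

# Hypothetical tag value; use whatever label your tracker assigns to automation.
AUTOMATED = "automated test"

def defect_detection_rate(found_by: list[str]) -> float:
    """Return the share of bugs (0.0 to 1.0) tagged as found by automation."""
    if not found_by:
        raise ValueError("no bugs recorded for this period")
    counts = Counter(found_by)
    return counts[AUTOMATED] / len(found_by)

# Example data: 30 of 50 bugs were found by automated tests.
bugs = (["automated test"] * 30 + ["manual exploratory"] * 12
        + ["prod report"] * 5 + ["user complaint"] * 3)
print(f"DDR: {defect_detection_rate(bugs):.0%}")  # DDR: 60%
```

Running this monthly on the exported tags gives the trend line described above.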
Metric 2: Mean Time to Failure Detection
MTTFD = Average time from code change to failing test notification
A test suite that takes 4 hours to run doesn't provide fast feedback. By the time it fails, the developer has moved on to the next feature. Context-switching cost is high.
Target by suite type:
| Suite | Target MTTFD |
|---|---|
| Unit tests | < 2 minutes |
| Integration tests | < 15 minutes |
| E2E / UI tests | < 45 minutes |
| Full regression | < 2 hours |
If your E2E suite takes 3 hours, that's not a test quality problem — it's a structural problem. Investigate parallelization, test selection, and whether those tests belong in a lower layer.
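One way to compute MTTFD, assuming you can export a pair of timestamps for each failing run (when the change was pushed and when the failure notification went out). The function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

def mean_time_to_failure_detection(events):
    """events: list of (change_pushed_at, failure_notified_at) datetime pairs."""
    if not events:
        raise ValueError("no failing runs recorded")
    delays = [notified - pushed for pushed, notified in events]
    # sum() needs a timedelta start value; dividing a timedelta by an int is supported.
    return sum(delays, timedelta()) / len(delays)

# Hypothetical failing runs from one suite.
events = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 9, 20)),    # 20 min
    (datetime(2024, 3, 4, 13, 0), datetime(2024, 3, 4, 13, 30)),  # 30 min
    (datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 5, 10, 34)),  # 34 min
]
mttfd = mean_time_to_failure_detection(events)
print(f"MTTFD: {mttfd.total_seconds() / 60:.0f} min")  # MTTFD: 28 min
```

Compute it per suite type so each result can be compared against its own target in the table.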
Metric 3: Flakiness Rate
Flakiness Rate = (Flaky test failures / Total test runs) × 100
A test that fails randomly without code changes is worse than no test. It:
- Trains engineers to ignore red builds
- Wastes time on false investigations
- Erodes trust in the entire suite
Threshold: If more than 2% of your test runs involve flaky failures, address it before adding new tests.
Track flakiness per test file. Tests with > 5% flakiness rate should be quarantined and fixed or deleted.
[!WARNING] Flaky tests are a leading indicator of suite rot. If you ignore them, you'll have a suite that "usually passes" — which is the same as having no suite when it matters.
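A minimal sketch of per-file flakiness tracking. It assumes each run record already carries a flag saying whether the run was a flaky failure (for example, it failed and then passed on retry with no code change); how you derive that flag depends on your CI, and the file names here are invented.

```python
from collections import defaultdict

QUARANTINE_THRESHOLD = 5.0  # percent per test file, from the guideline above

def flakiness_by_file(runs):
    """runs: iterable of (test_file, is_flaky_failure). Returns {file: percent}."""
    totals = defaultdict(int)
    flaky = defaultdict(int)
    for test_file, is_flaky in runs:
        totals[test_file] += 1
        if is_flaky:
            flaky[test_file] += 1
    return {f: 100 * flaky[f] / totals[f] for f in totals}

# Hypothetical history: 100 runs per file.
runs = ([("test_checkout.py", False)] * 92 + [("test_checkout.py", True)] * 8
        + [("test_search.py", False)] * 99 + [("test_search.py", True)] * 1)
rates = flakiness_by_file(runs)
quarantine = sorted(f for f, pct in rates.items() if pct > QUARANTINE_THRESHOLD)
print(quarantine)  # ['test_checkout.py']
```

The overall rate for this sample is 9 flaky failures in 200 runs, or 4.5%, which is above the 2% threshold.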
Metric 4: Build-over-Build Regression Rate
Regression Rate = (Runs with new failures on unchanged, previously passing code / Total runs) × 100
Measure how often your suite reports new failures on code that previously passed. Such failures indicate that the tests themselves are broken, not the product.
This catches:
- Tests that broke due to environment changes
- Tests tightly coupled to test order
- Tests depending on external services that went down
Target: < 1% of runs should have regressions unrelated to code changes.
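The sketch below shows one possible way to count these regressions: compare consecutive builds of the same commit and flag any test that passed before and fails now. The build records and helper names are assumptions for illustration; a real pipeline would pull them from CI history.

```python
def new_failures(previous: dict[str, bool], current: dict[str, bool]) -> set[str]:
    """Tests that passed in the previous build but fail in the current one."""
    return {name for name, passed in current.items()
            if not passed and previous.get(name, False)}

def unrelated_regression_rate(builds) -> float:
    """builds: chronological list of (commit_sha, {test_name: passed}).
    Returns the percent of same-commit rebuilds that introduce a new failure."""
    compared = regressed = 0
    for (prev_sha, prev), (sha, cur) in zip(builds, builds[1:]):
        if sha != prev_sha:
            continue  # the code changed, so a new failure may be legitimate
        compared += 1
        if new_failures(prev, cur):
            regressed += 1
    return 100 * regressed / compared if compared else 0.0

# Hypothetical build history with two rebuilds of unchanged commits.
builds = [
    ("abc", {"t1": True, "t2": True}),
    ("abc", {"t1": True, "t2": False}),   # t2 broke with no code change
    ("def", {"t1": False, "t2": True}),
    ("def", {"t1": False, "t2": True}),   # t1 was already failing, not new
]
print(unrelated_regression_rate(builds))  # 50.0
```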
Metric 5: Mean Time to Repair (Failing Tests)
MTTR = Average time from test failure to test fix
If tests fail and sit unfixed for weeks, they're not providing value — they're noise. Teams learn to ignore them.
Track the backlog of broken tests. If it grows, your team doesn't believe the tests are worth fixing. That's a cultural problem that metrics surface.
Target: Broken tests fixed within the same sprint they broke. Never carry a broken test into a second sprint without a documented reason.
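A small sketch for tracking both MTTR and the broken-test backlog from one list of records. The record shape (test name, date it broke, date it was fixed or `None`) and the sample dates are hypothetical.

```python
from datetime import date

def repair_stats(records, today):
    """records: (test_name, broke_on, fixed_on or None).
    Returns (mean days to repair or None, backlog as (name, days_open) pairs)."""
    repair_days = [(fixed - broke).days for _, broke, fixed in records if fixed]
    backlog = [(name, (today - broke).days)
               for name, broke, fixed in records if fixed is None]
    mttr = sum(repair_days) / len(repair_days) if repair_days else None
    return mttr, backlog

records = [
    ("test_login", date(2024, 3, 4), date(2024, 3, 6)),    # fixed in 2 days
    ("test_cart", date(2024, 3, 5), date(2024, 3, 11)),    # fixed in 6 days
    ("test_export", date(2024, 3, 1), None),               # still broken
]
mttr, backlog = repair_stats(records, today=date(2024, 3, 15))
print(mttr)     # 4.0
print(backlog)  # [('test_export', 14)]
```

Any backlog entry older than your sprint length is a candidate for the "documented reason" conversation.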
Metric 6: Automation ROI per Feature Area
Not all automation is equally valuable. Some feature areas have high regression risk; others rarely break.
Track bug density by feature:
| Feature Area | Bugs/Quarter | Automated Coverage | Bugs Caught by Automation |
|---|---|---|---|
| Payment flow | 12 | 85% | 10 |
| Profile edit | 2 | 90% | 1 |
| Search | 8 | 40% | 2 |
This tells you where to invest automation effort. Search has low coverage and high bug density, so that's where new tests create the most value.
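To make the ranking explicit, here is a short sketch that sorts the feature areas from the table by the number of bugs automation missed. The `rank_by_missed_bugs` helper is illustrative, and "missed bugs" is only one possible proxy for where new tests pay off.

```python
# Rows taken from the table above: (area, bugs per quarter, bugs caught by automation).
areas = [
    ("Payment flow", 12, 10),
    ("Profile edit", 2, 1),
    ("Search", 8, 2),
]

def rank_by_missed_bugs(rows):
    """Sort feature areas by bugs that automation missed, most missed first."""
    ranked = [(area, bugs - caught, caught / bugs)
              for area, bugs, caught in rows if bugs]
    return sorted(ranked, key=lambda r: r[1], reverse=True)

for area, missed, catch_rate in rank_by_missed_bugs(areas):
    print(f"{area}: missed {missed}, catch rate {catch_rate:.0%}")
```

With these numbers Search comes first (6 missed, 25% catch rate), followed by Payment flow and Profile edit.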
Metric 7: Test Execution Cost
Weekly automation cost = (CI minutes × cost/minute) + (engineer hours on maintenance × hourly rate)
Automation isn't free. A 6-hour E2E suite running on CI 5x/day at $0.10/minute uses 1,800 CI minutes and costs $180/day, or about $3,600/month over 20 working days.
If you're not catching bugs worth that, you're burning money.
Include maintenance cost. A suite that requires 2 engineer-hours of maintenance per week is costing 8+ hours/month of senior time.
[!TIP] Calculate your automation's cost-per-bug-caught annually. If it's cheaper to catch that class of bug manually, your automation investment needs restructuring.
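The formula can be sketched in a few lines. The CI figures come from the example above; the 5-day week, the $75 hourly rate, and the count of bugs caught are assumptions added for illustration.

```python
def weekly_automation_cost(ci_minutes, cost_per_minute, maintenance_hours, hourly_rate):
    """CI spend plus engineer maintenance time, in the same currency."""
    return ci_minutes * cost_per_minute + maintenance_hours * hourly_rate

# 6-hour suite, 5 runs/day, assumed 5 working days per week.
ci_minutes = 6 * 60 * 5 * 5
cost = weekly_automation_cost(ci_minutes, 0.10, maintenance_hours=2, hourly_rate=75.0)

bugs_caught = 6  # hypothetical count for the week
print(f"Weekly cost: ${cost:,.2f}")                        # Weekly cost: $1,050.00
print(f"Cost per bug caught: ${cost / bugs_caught:,.2f}")  # Cost per bug caught: $175.00
```

Comparing the cost-per-bug figure against what a manual catch of the same bug class would cost is the check the tip above describes.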
Putting It Together: A Simple Dashboard
You don't need complex tooling. A weekly 15-minute review of these numbers is enough:
Week of [date]:
- DDR: 58% (target: >50%) ✅
- MTTFD: 28 min (target: <45 min) ✅
- Flakiness rate: 3.1% (target: <2%) ❌
- Broken test backlog: 4 tests (target: 0) ❌
- Regression runs this week: 12, caught 6 regressions ✅
Two red items this week: flakiness and broken test backlog. Next sprint, allocate time to address both.
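If you'd rather generate the weekly summary than type it, a minimal sketch might look like this. The `Metric` class and the comparison operators chosen for each target are assumptions; the values mirror the sample week above.

```python
import operator
from dataclasses import dataclass
from typing import Callable

@dataclass
class Metric:
    name: str
    value: float
    target: float
    compare: Callable[[float, float], bool]  # e.g. operator.gt means value > target

    def on_target(self) -> bool:
        return self.compare(self.value, self.target)

metrics = [
    Metric("DDR (%)", 58, 50, operator.gt),
    Metric("MTTFD (min)", 28, 45, operator.lt),
    Metric("Flakiness rate (%)", 3.1, 2, operator.lt),
    Metric("Broken test backlog", 4, 0, operator.le),
]

for m in metrics:
    status = "✅" if m.on_target() else "❌"
    print(f"- {m.name}: {m.value} (target: {m.target}) {status}")

# Names of the metrics that need attention next sprint.
red = [m.name for m in metrics if not m.on_target()]
print(red)  # ['Flakiness rate (%)', 'Broken test backlog']
```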
Takeaways
- Stop measuring test count and coverage percentage — they're vanity metrics that optimize for gaming.
- Defect Detection Rate tells you if your suite is catching real bugs.
- Flakiness Rate tells you if your suite is trustworthy.
- MTTFD tells you if your feedback loop is fast enough to help developers.
- MTTR tells you if your team believes the tests are worth fixing.
- ROI per feature area tells you where to invest next.
- Calculate the actual cost of your automation. Justify it or restructure it.
Metrics exist to drive decisions. If a metric doesn't change what you do next, stop tracking it.
Sudarshan Chaudhari
AI Systems Builder / Product Engineer
Bangkok, Thailand
Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
