February 28, 2026 · 5 min read

Automation Metrics That Actually Matter

Most teams track the wrong automation numbers. Test count and coverage percentage tell you almost nothing. Here's what to measure instead.

Testing, Automation, Monitoring

Your test suite has 2,000 automated tests. What does that tell you?

Nothing useful. It tells you how many tests exist. It doesn't tell you if they catch bugs, how long they take to run, or whether they're worth maintaining.

Here are the metrics that actually measure automation value.

The Vanity Metrics to Stop Tracking

Test count: A meaningless number. 2,000 weak tests are worse than 200 strong ones — they take longer, fail more often, and provide false confidence.

Code coverage percentage: 80% coverage with bad assertions is almost as useless as 0%. Coverage measures lines executed, not logic verified.

Pass rate on green builds: A build is green because its tests passed, so this number sits near 100% by definition. It says nothing about whether the tests would have caught a real defect.

These metrics optimize for gaming. Teams add tests to hit numbers. The tests don't add value; they add maintenance burden.

Metric 1: Defect Detection Rate

Defect Detection Rate = Bugs caught by automation / Total bugs found

If your automation finds 30 of the 50 bugs discovered before release, your DDR is 60%.

Track this over time. If DDR drops, your test suite isn't keeping pace with the product. If DDR increases, your investment is working.

How to measure: Tag every bug with how it was found: automated test, manual exploratory testing, production report, or user complaint. Keep the tags in a simple spreadsheet and review it monthly.

Target: automation should catch at least 50% of pre-production bugs in a mature codebase.
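
A minimal sketch of the calculation, assuming each bug is exported as a single "found by" tag. The tag strings and the `defect_detection_rate` helper are illustrative names, not part of any particular bug tracker.

```python
from collections import Counter

# Hypothetical tag for bugs found by automation; adjust to your tracker.
AUTOMATED = "automated_test"


def defect_detection_rate(found_by_tags):
    """Return the share of bugs caught by automation, as a percentage."""
    if not found_by_tags:
        return 0.0
    counts = Counter(found_by_tags)
    return 100.0 * counts[AUTOMATED] / len(found_by_tags)


# Same numbers as the example above: 30 of 50 bugs found by automation.
tags = [AUTOMATED] * 30 + ["manual_exploratory"] * 20
print(f"DDR: {defect_detection_rate(tags):.0f}%")
```

Keeping the raw tags rather than only the ratio makes it easy to see which discovery channel grows when DDR drops.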

Metric 2: Mean Time to Failure Detection

MTTFD = Average time from code change to failing test notification

A test suite that takes 4 hours to run doesn't provide fast feedback. By the time it fails, the developer has moved on to the next feature. Context-switching cost is high.

Target by suite type:

Suite             | Target MTTFD
Unit tests        | < 2 minutes
Integration tests | < 15 minutes
E2E / UI tests    | < 45 minutes
Full regression   | < 2 hours

If your E2E suite takes 3 hours, that's not a test quality problem — it's a structural problem. Investigate parallelization, test selection, and whether those tests belong in a lower layer.
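
One way to compute MTTFD, sketched under the assumption that your CI can export a push timestamp and a failure-notification timestamp for each failing run. The `(pushed_at, notified_at)` pair layout and the `mttfd_minutes` helper are hypothetical.

```python
from datetime import datetime
from statistics import mean


def mttfd_minutes(events):
    """Average minutes between a code change and its failure notification.

    `events` is a list of (pushed_at, notified_at) datetime pairs, one per
    failing run. This layout is an assumption made for the sketch.
    """
    if not events:
        return 0.0
    delays = [(notified - pushed).total_seconds() / 60 for pushed, notified in events]
    return mean(delays)


# Two made-up failing runs: one reported after 20 minutes, one after 40.
events = [
    (datetime(2026, 2, 2, 9, 0), datetime(2026, 2, 2, 9, 20)),
    (datetime(2026, 2, 2, 13, 0), datetime(2026, 2, 2, 13, 40)),
]
print(f"MTTFD: {mttfd_minutes(events):.0f} min")
```

Computing this per suite type lets you compare each result against its own target instead of one blended average.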

Metric 3: Flakiness Rate

Flakiness Rate = (Flaky test failures / Total test runs) × 100

A test that fails randomly without code changes is worse than no test. It:

Trains engineers to ignore red builds
Wastes time on false investigations
Erodes trust in the entire suite

Threshold: If more than 2% of your test runs involve flaky failures, address it before adding new tests.

Track flakiness per test file. Any test with a flakiness rate above 5% should be quarantined, then fixed or deleted.
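
A rough sketch of per-file tracking. It assumes run records of the form `(test_file, commit, passed)` and treats a failure as flaky when the same file also passed on the same commit. Both the record layout and that definition of "flaky" are assumptions for illustration.

```python
from collections import defaultdict

QUARANTINE_THRESHOLD = 5.0  # percent, from the guidance above


def flakiness_by_file(runs):
    """Return {test_file: flakiness %} from (test_file, commit, passed) records."""
    outcomes = defaultdict(set)  # (file, commit) -> set of observed results
    for test_file, commit, passed in runs:
        outcomes[(test_file, commit)].add(passed)

    total = defaultdict(int)
    flaky = defaultdict(int)
    for test_file, commit, passed in runs:
        total[test_file] += 1
        # A failure on a commit where the same file also passed is flaky.
        if not passed and True in outcomes[(test_file, commit)]:
            flaky[test_file] += 1
    return {f: 100.0 * flaky[f] / total[f] for f in total}


# Made-up data: test_login.py fails once in ten runs on an unchanged commit.
runs = (
    [("test_login.py", "abc123", True)] * 9
    + [("test_login.py", "abc123", False)]
    + [("test_cart.py", "abc123", True)] * 10
)
rates = flakiness_by_file(runs)
quarantine = [f for f, rate in rates.items() if rate > QUARANTINE_THRESHOLD]
print(rates, quarantine)
```

A failure that never passes on the same commit is not counted here, since it more likely points at a real defect than at flakiness.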

[!WARNING] Flaky tests are a leading indicator of suite rot. If you ignore them, you'll have a suite that "usually passes" — which is the same as having no suite when it matters.

Metric 4: Build-over-Build Regression Rate

Regression Rate = (Runs with new failures on unchanged code / Total runs) × 100

Measure how often the suite reports new failures on code that previously passed and has not changed. Those failures indicate that the tests themselves are broken, not the product.

This catches:

Tests that broke due to environment changes
Tests tightly coupled to test order
Tests depending on external services that went down

Target: < 1% of runs should have regressions unrelated to code changes.
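
As a sketch, this can be computed by comparing consecutive builds. It assumes each build is recorded as `(code_changed, failing_tests)`, which is a hypothetical export format, and counts a run only when it shows new failures without a code change.

```python
def suite_regression_rate(builds):
    """Percentage of runs with new failures although the code did not change.

    `builds` is an ordered list of (code_changed, failing_tests) pairs, where
    failing_tests is a set of test names. Both fields are assumptions.
    """
    if len(builds) < 2:
        return 0.0
    regressions = 0
    previous_failures = builds[0][1]
    for code_changed, failures in builds[1:]:
        new_failures = failures - previous_failures
        if new_failures and not code_changed:
            regressions += 1
        previous_failures = failures
    # The first build has no predecessor, so it is not counted as a run.
    return 100.0 * regressions / (len(builds) - 1)


builds = [
    (True, set()),
    (False, {"test_checkout"}),                # same code, new failure: counted
    (True, {"test_checkout", "test_search"}),  # code changed: not counted
    (False, {"test_checkout", "test_search"}), # no new failures
    (False, set()),
]
print(f"{suite_regression_rate(builds):.0f}% of runs regressed without a code change")
```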

Metric 5: Mean Time to Repair (Failing Tests)

MTTR = Average time from test failure to test fix

If tests fail and sit unfixed for weeks, they're not providing value — they're noise. Teams learn to ignore them.

Track the backlog of broken tests. If it grows, your team doesn't believe the tests are worth fixing. That's a cultural problem that metrics surface.

Target: Broken tests fixed within the same sprint they broke. Never carry a broken test into a second sprint without a documented reason.
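
A small sketch that reports both MTTR and the open backlog, assuming each breakage is logged as `(broke_at, fixed_at)` with `fixed_at` left as `None` while the test is still broken. The record layout is an assumption.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(breakages):
    """Return (average repair time or None, number of still-broken tests)."""
    durations = [fixed - broke for broke, fixed in breakages if fixed is not None]
    backlog = sum(1 for _, fixed in breakages if fixed is None)
    if not durations:
        return None, backlog
    return sum(durations, timedelta()) / len(durations), backlog


# Made-up log: two repaired tests (2 and 4 days) and one still broken.
breakages = [
    (datetime(2026, 2, 2), datetime(2026, 2, 4)),
    (datetime(2026, 2, 3), datetime(2026, 2, 7)),
    (datetime(2026, 2, 10), None),
]
mttr, backlog = mean_time_to_repair(breakages)
print(f"MTTR: {mttr.days} days, backlog: {backlog}")
```

Reporting the backlog next to the average matters because tests that are never fixed do not show up in MTTR at all.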

Metric 6: Automation ROI per Feature Area

Not all automation is equally valuable. Some feature areas have high regression risk; others rarely break.

Track bug density by feature:

Feature Area | Bugs/Quarter | Automated Coverage | Bugs Caught by Automation
Payment flow |      12      |        85%         |          10
Profile edit |       2      |        90%         |           1
Search       |       8      |        40%         |           2

This tells you where to invest automation effort. Search produces 8 bugs per quarter but has only 40% coverage, and automation caught just 2 of them. That's where new tests create the most value.
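
The same table can be ranked programmatically. This sketch sorts feature areas by the number of bugs automation missed, which is one possible heuristic rather than a standard formula.

```python
# Rows copied from the table above: (area, bugs/quarter, coverage %, caught).
features = [
    ("Payment flow", 12, 85, 10),
    ("Profile edit", 2, 90, 1),
    ("Search", 8, 40, 2),
]


def missed_bugs(row):
    """Bugs per quarter that automation did not catch."""
    _, bugs, _, caught = row
    return bugs - caught


# Areas with the most escaped bugs come first.
ranked = sorted(features, key=missed_bugs, reverse=True)
for area, bugs, coverage, caught in ranked:
    print(f"{area}: missed {bugs - caught} of {bugs} bugs at {coverage}% coverage")
```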

Metric 7: Test Execution Cost

Weekly automation cost = (CI minutes × cost/minute) + (engineer hours on maintenance × hourly rate)

Automation isn't free. A 6-hour E2E suite running on CI 5x/day at $0.10/minute uses 1,800 CI minutes and costs $180/day, or about $3,600 over a month of 20 working days.

If you're not catching bugs worth that, you're burning money.

Include maintenance cost. A suite that requires 2 engineer-hours of maintenance per week is costing 8+ hours/month of senior time.

[!TIP] Calculate your automation's cost-per-bug-caught annually. If it's cheaper to catch that class of bug manually, your automation investment needs restructuring.
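
The formula above, worked through in a short sketch. The CI figures come from the example. The 5 CI days per week and the $100 hourly rate are assumptions added purely for illustration.

```python
def weekly_automation_cost(ci_minutes, cost_per_minute, maintenance_hours, hourly_rate):
    """Weekly cost per the formula above: CI spend plus maintenance time."""
    return ci_minutes * cost_per_minute + maintenance_hours * hourly_rate


# CI numbers from the example above: a 6-hour suite, 5 runs per day.
daily_ci_minutes = 6 * 60 * 5            # 1,800 minutes
daily_ci_cost = daily_ci_minutes * 0.10  # about $180 per day

# Assumed for illustration: 5 CI days per week, 2 maintenance hours at $100/hour.
weekly = weekly_automation_cost(daily_ci_minutes * 5, 0.10, 2, 100)
print(f"${daily_ci_cost:.0f}/day, ${weekly:.0f}/week")
```

Dividing the yearly total by the number of bugs automation caught in that year gives the cost-per-bug-caught figure mentioned in the tip.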

Putting It Together: A Simple Dashboard

You don't need complex tooling. A weekly 15-minute review of these numbers is enough:

Week of [date]:
- DDR: 58% (target: >50%) ✅
- MTTFD: 28 min (target: <45 min) ✅
- Flakiness rate: 3.1% (target: <2%) ❌
- Broken test backlog: 4 tests (target: 0) ❌
- Regression runs this week: 12, caught 6 regressions ✅

Two red items this week: flakiness and broken test backlog. Next sprint, allocate time to address both.
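
If a spreadsheet feels too manual, the target check itself is only a few lines. This sketch uses the sample week above. The tuple layout and the `failing_metrics` helper are illustrative, not a prescribed tool.

```python
import operator

# (name, value, comparison, target) using the sample week above.
metrics = [
    ("DDR %", 58, operator.gt, 50),
    ("MTTFD min", 28, operator.lt, 45),
    ("Flakiness %", 3.1, operator.lt, 2),
    ("Broken test backlog", 4, operator.le, 0),
]


def failing_metrics(rows):
    """Return the names of metrics that miss their target."""
    return [name for name, value, meets, target in rows if not meets(value, target)]


print(failing_metrics(metrics))
```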

Takeaways

Stop measuring test count and coverage percentage — they're vanity metrics that optimize for gaming.
Defect Detection Rate tells you if your suite is catching real bugs.
Flakiness Rate tells you if your suite is trustworthy.
MTTFD tells you if your feedback loop is fast enough to help developers.
MTTR tells you if your team believes the tests are worth fixing.
ROI per feature area tells you where to invest next.
Calculate the actual cost of your automation. Justify it or restructure it.

Metrics exist to drive decisions. If a metric doesn't change what you do next, stop tracking it.

Sudarshan Chaudhari

AI Systems Builder / Product Engineer

Bangkok, Thailand

Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
