February 21, 2026 · 5 min read

Flaky Tests: The Silent Killer of Your CI Pipeline

A test that fails randomly is worse than no test at all. It trains your team to ignore failures, destroys trust in the suite, and hides real bugs. Here's how to identify, fix, and prevent flakiness.

Tags: Testing, Automation, CI/CD

One flaky test can destroy a test suite.

Not because it's one test. Because the moment engineers start ignoring failures — "oh that's just the flaky Selenium test" — every failure becomes ignorable. Real bugs slip through. The suite becomes theater.

Flakiness is a trust problem. Fix it or remove the test.


What Makes a Test Flaky

A flaky test passes sometimes and fails other times without any code change. The causes fall into a few categories:

1. Timing and Async Issues

The most common cause. The test assumes an element is present before it actually loads, or checks a value before an async operation completes.

kotlin
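// Flaky: assumes the list is populated immediately
val itemsNow = viewModel.items.value
assertEquals(3, itemsNow.size)

// Better: suspend (inside a coroutine test) until the loaded list is emitted.
// Waiting on a predicate is safer than drop(1), which can hang if the state
// was already populated before collection started.
val loadedItems = viewModel.items.first { it.isNotEmpty() }
assertEquals(3, loadedItems.size)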

In UI tests, clicking an element before it's fully rendered causes failures that pass on retry.
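
The same principle applies outside of flows. Here is a minimal, framework-free sketch that blocks on an async result with a timeout instead of reading state right away. `loadItemsAsync` and `awaitItems` are hypothetical names used only for illustration.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.TimeUnit

// Hypothetical async loader: completes on a background thread after a short delay.
fun loadItemsAsync(): CompletableFuture<List<String>> =
    CompletableFuture.supplyAsync {
        Thread.sleep(50) // simulates network or disk latency in the fake
        listOf("a", "b", "c")
    }

// Waits for the result and throws TimeoutException if it takes longer than timeoutMs.
fun awaitItems(timeoutMs: Long = 2_000): List<String> =
    loadItemsAsync().get(timeoutMs, TimeUnit.MILLISECONDS)
```

The test asserts on the value returned by `awaitItems()`, so it only runs its checks after the work has actually finished, and it fails loudly if the work never finishes.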

2. Test Ordering Dependencies

Some tests rely on shared state set up by a previous test. If that earlier test fails, or the runner executes the tests in a different order, the dependent test fails too.

code
Test A: Creates a user (passes)
Test B: Expects that user to exist (passes if A ran first, fails otherwise)

Every test must be independent. Set up what you need, tear down after.
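
A small sketch of what independence looks like in practice. `UserRepository` is a hypothetical in-memory class, and the two test functions return a Boolean instead of using a test framework so the example stays self-contained.

```kotlin
// Hypothetical in-memory repository used only for illustration.
class UserRepository {
    private val users = mutableSetOf<String>()
    fun create(name: String) { users.add(name) }
    fun exists(name: String): Boolean = name in users
}

// Each test builds the state it needs instead of relying on another test.
fun testCreateUser(): Boolean {
    val repo = UserRepository() // fresh state
    repo.create("alice")
    return repo.exists("alice")
}

fun testUserLookup(): Boolean {
    val repo = UserRepository() // fresh state, own setup
    repo.create("alice")        // does not depend on testCreateUser having run
    return repo.exists("alice") && !repo.exists("bob")
}
```

Because neither function touches state created by the other, they give the same result in any order, alone, or in parallel.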

3. Environmental Differences

  • Tests pass locally, fail in CI (different timezone, locale, screen resolution)
  • Tests fail on CI agent 1 but pass on agent 2 (different OS version, dependencies)
  • Tests fail when run in parallel but pass sequentially (shared database, shared files)
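
Timezone and locale differences are often the easiest of these to remove: pass them explicitly instead of relying on machine defaults. A minimal sketch, where `formatTimestamp` is a hypothetical function standing in for your own formatting code:

```kotlin
import java.time.Instant
import java.time.ZoneId
import java.time.format.DateTimeFormatter
import java.util.Locale

// Formats an instant with an explicit zone and locale, so the result
// does not depend on the defaults of the machine running the test.
fun formatTimestamp(instant: Instant, zone: ZoneId, locale: Locale): String =
    DateTimeFormatter.ofPattern("d MMM yyyy HH:mm", locale)
        .withZone(zone)
        .format(instant)
```

For the instant `2026-02-21T23:30:00Z` with `Locale.US`, this gives `21 Feb 2026 23:30` in UTC and `22 Feb 2026 06:30` in Asia/Bangkok. A test that hard-codes either string while the code uses the system zone will pass on one machine and fail on another.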

4. Network and External Dependencies

Tests that call real APIs, real databases, or real external services fail when those services are slow or unavailable. Mock external dependencies in unit tests. Use dedicated test environments for integration tests.
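
One common way to do this without a mocking library is a hand-written fake behind an interface. The sketch below assumes a hypothetical exchange-rate service; `RateProvider`, `PriceConverter`, and `FakeRateProvider` are illustrative names, not part of any real API.

```kotlin
// Hypothetical port for an external exchange-rate service.
interface RateProvider {
    fun rate(from: String, to: String): Double
}

// Code under test depends on the interface, not on an HTTP client.
class PriceConverter(private val rates: RateProvider) {
    fun convert(amount: Double, from: String, to: String): Double =
        amount * rates.rate(from, to)
}

// Test double: returns a fixed rate and never touches the network.
class FakeRateProvider(private val fixedRate: Double) : RateProvider {
    override fun rate(from: String, to: String): Double = fixedRate
}
```

In a unit test, `PriceConverter(FakeRateProvider(2.0)).convert(10.0, "USD", "THB")` always returns `20.0`, regardless of whether the real service is up.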

5. Date and Time Dependencies

kotlin
// Flaky — what happens at 11:59:58 PM?
val today = LocalDate.now()
assertEquals(today, schedule.nextRunDate)

Tests that depend on the current time fail at boundary conditions. Inject time as a dependency and control it in tests.


The Cost of Ignoring Flakiness

Teams often accept flakiness because "it usually passes on retry." This is how the problem compounds:

  1. One flaky test → engineers learn to retry failures
  2. More flaky tests → retry becomes the default response to all failures
  3. Real failure occurs → team assumes it's flaky, retries → bug ships
  4. Trust in suite collapses → team stops caring about test results entirely

[!WARNING] When roughly 10% of runs fail for reasons unrelated to the code under test, retrying until green becomes routine, and a green build stops being evidence that the code works. At that point, your CI pipeline is not a safety net.
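
To see how quickly small per-test flake rates add up, here is a rough calculation. It assumes every flaky test fails independently and with the same probability, which is a simplification; `spuriousRedProbability` is a hypothetical helper name.

```kotlin
import kotlin.math.pow

// Probability that at least one of `flakyTests` independent tests fails
// spuriously in a single run, when each fails with probability `flakeRate`.
fun spuriousRedProbability(flakyTests: Int, flakeRate: Double): Double =
    1.0 - (1.0 - flakeRate).pow(flakyTests)
```

Under those assumptions, 20 tests that each flake just 1% of the time turn about 18% of runs red for no real reason (1 − 0.99^20 ≈ 0.182).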


How to Find Flaky Tests

Run tests multiple times in a row. A test that fails 1 run in 10 has about an 88% chance of failing at least once across 20 runs (1 − 0.9^20), so intermittent failures surface quickly.

bash
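# Run the suite 20 times and count failures per test.
# cleanTest forces Gradle to re-run the tests instead of reporting them up to date.
# By default Gradle prints each failed test as "Class > test FAILED".
rm -f results.txt
for i in {1..20}; do
  ./gradlew cleanTest test --continue 2>&1 | grep -E ' > .+ FAILED' >> results.txt
done
sort results.txt | uniq -c | sort -rn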

Track failure rates in CI. Most modern CI systems let you export test results as JUnit XML. Build a dashboard that shows which tests fail most often.
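
As a starting point for such a dashboard, the sketch below tallies failures per test from a set of JUnit XML files. It assumes you have already collected the report files from several runs, and that a failing `<testcase>` contains a `<failure>` or `<error>` child element, which is the usual convention. `countFailures` is a hypothetical helper name.

```kotlin
import java.io.File
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Counts failures per test across a set of JUnit XML report files.
// A <testcase> counts as failed if it contains a <failure> or <error> element.
fun countFailures(reports: List<File>): Map<String, Int> {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val failures = mutableMapOf<String, Int>()
    for (report in reports) {
        val testCases = builder.parse(report).getElementsByTagName("testcase")
        for (i in 0 until testCases.length) {
            val testCase = testCases.item(i) as Element
            val failed = testCase.getElementsByTagName("failure").length > 0 ||
                testCase.getElementsByTagName("error").length > 0
            if (failed) {
                val name = testCase.getAttribute("classname") + "." + testCase.getAttribute("name")
                failures[name] = (failures[name] ?: 0) + 1
            }
        }
    }
    return failures
}
```

Dividing each count by the number of runs gives a per-test failure rate; tests with a rate strictly between 0 and 1 on unchanged code are the flaky candidates.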

Look at retry patterns. If your suite has test retries enabled and certain tests always retry, those are flaky tests in disguise.


Fixing Flaky Tests

For Timing Issues

Replace arbitrary sleeps with proper waits:

kotlin
// Bad
Thread.sleep(2000)
checkElement.click()

// Better — explicit wait with timeout
waitUntilVisible(checkElement, timeout = 5.seconds)
checkElement.click()
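
The snippet above does not define `waitUntilVisible`. UI frameworks usually ship their own wait mechanism (Selenium's `WebDriverWait` or Espresso's idling resources, for example), and those are normally the better choice. As a rough sketch of what such a helper does underneath, here is a generic polling wait; `waitUntil` and its parameters are hypothetical.

```kotlin
// Polls `condition` until it returns true or the timeout elapses.
// Returns true if the condition was met, false on timeout.
fun waitUntil(
    timeoutMs: Long = 5_000,
    pollIntervalMs: Long = 50,
    condition: () -> Boolean
): Boolean {
    val start = System.nanoTime()
    val timeoutNanos = timeoutMs * 1_000_000
    while (System.nanoTime() - start < timeoutNanos) {
        if (condition()) return true
        Thread.sleep(pollIntervalMs) // short poll interval, not a fixed guess
    }
    return condition() // one last check at the deadline
}
```

Unlike a fixed two-second sleep, this returns as soon as the condition holds, and the timeout only matters when something is actually wrong.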

For State Dependencies

Each test sets up its own data and cleans up after:

kotlin
@Before
fun setup() {
    db.insertTestUser(userId = "test-user-123")
}

@After
fun teardown() {
    db.deleteUser(userId = "test-user-123")
}

For Parallel Execution Issues

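Isolate database state per test using transactions that roll back. With Spring's test support, for example, annotating a test with @Transactional rolls its changes back when the test finishes: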

kotlin
@Transactional
@Test
fun `should update user profile`() {
    // All DB changes are rolled back after this test
}
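
Shared files cause the same kind of collisions as a shared database. A minimal sketch of per-test file isolation, assuming a hypothetical `withTempDir` helper built on the JDK's temporary-directory support:

```kotlin
import java.nio.file.Files
import java.nio.file.Path

// Runs `block` with a fresh temporary directory and deletes it afterwards,
// so tests running in parallel never read or write the same files.
fun <T> withTempDir(block: (Path) -> T): T {
    val dir = Files.createTempDirectory("test-")
    try {
        return block(dir)
    } finally {
        dir.toFile().deleteRecursively()
    }
}
```

Each call gets its own directory, so two tests that both write `out.txt` no longer overwrite each other. Test frameworks often offer an equivalent, such as JUnit 5's `@TempDir`.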

For Time-Dependent Tests

Inject a clock interface:

kotlin
interface Clock {
    fun now(): LocalDateTime
}

// In production
class SystemClock : Clock {
    override fun now() = LocalDateTime.now()
}

// In tests
class FixedClock(private val fixedTime: LocalDateTime) : Clock {
    override fun now() = fixedTime
}
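
To show how the fixed clock removes the midnight boundary problem, here is a sketch with a hypothetical `Schedule` class whose next run is always the day after "now". The `Clock` and `FixedClock` definitions are repeated so the example is self-contained.

```kotlin
import java.time.LocalDate
import java.time.LocalDateTime

// Repeated from the snippet above so this sketch compiles on its own.
interface Clock {
    fun now(): LocalDateTime
}

class FixedClock(private val fixedTime: LocalDateTime) : Clock {
    override fun now() = fixedTime
}

// Hypothetical scheduler: the next run is always the day after "now".
class Schedule(private val clock: Clock) {
    val nextRunDate: LocalDate
        get() = clock.now().toLocalDate().plusDays(1)
}
```

With `FixedClock(LocalDateTime.of(2026, 2, 21, 23, 59, 58))`, `nextRunDate` is always 2026-02-22, no matter when the test runs. If you would rather not define your own interface, the JDK's `java.time.Clock` offers `Clock.fixed(...)`, and `LocalDateTime.now(clock)` accepts it.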

When You Can't Fix It Immediately

If a test is flaky and you don't have bandwidth to fix it now:

  1. Quarantine it — move it to a separate suite that doesn't block CI
  2. Track it — create a bug/ticket with reproduction steps
  3. Set a deadline — flaky tests in quarantine for more than 2 weeks get deleted
  4. Delete it — a quarantined test that never gets fixed is dead weight

Never leave a known flaky test in the main test suite. It will corrupt the team's trust in everything else.
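
The deadline is easier to enforce when something checks it for you. A small sketch of such a check, assuming you keep a list of quarantined tests with the date each one was quarantined; `QuarantinedTest` and `overdue` are hypothetical names.

```kotlin
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical record of a quarantined test and the date it was quarantined.
data class QuarantinedTest(val name: String, val since: LocalDate)

// Returns the tests that have been in quarantine for more than `maxDays` days.
fun overdue(
    tests: List<QuarantinedTest>,
    today: LocalDate,
    maxDays: Long = 14
): List<QuarantinedTest> =
    tests.filter { ChronoUnit.DAYS.between(it.since, today) > maxDays }
```

A scheduled CI job could run this and fail, or open a ticket, whenever the result is non-empty. Note that `today` is passed in rather than read from the system clock, which keeps the check itself deterministic.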


Prevention: Write Less Flaky Tests From the Start

  • Never use Thread.sleep(); use explicit waits instead
  • Never share state between tests — each test is isolated
  • Never call real external services in unit tests — mock them
  • Inject time, random number generators, and file system as dependencies
  • Run tests in parallel from day one — surface parallelism issues early
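
Injecting the random number generator works the same way as injecting the clock. A minimal sketch, where `pickSample` is a hypothetical function under test:

```kotlin
import kotlin.random.Random

// Picks a sample using the injected generator, so tests can pass a seeded one.
fun pickSample(items: List<String>, count: Int, random: Random): List<String> =
    items.shuffled(random).take(count)
```

Production code can pass `Random.Default`, while a test passes `Random(42)` (any fixed seed) and gets the same sample on every run, which makes a failure reproducible instead of intermittent.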

Takeaways

  • Flakiness is a trust problem — one ignored failure teaches the team to ignore all failures
  • Timing issues, state dependencies, and environment differences are the top causes
  • Track failure rates — you can't fix what you can't measure
  • Quarantine flaky tests immediately, then fix or delete
  • Write isolated, deterministic tests from the start — easier than fixing flakiness later

Sudarshan Chaudhari

AI Systems Builder / Product Engineer

Bangkok, Thailand

Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
