How to Reproduce Hard-to-Reproduce Bugs
Some bugs appear once and vanish. Here's a systematic approach to reproducing elusive bugs — the steps, the tools, and the mindset that turns intermittent into consistent.
The hardest bugs to fix are the ones you can't reproduce. Not because the fix is complex — because you can't even confirm the fix worked if you can't make it fail first.
Reproducing hard bugs is a skill. Here's the systematic approach.
Step 1: Capture Everything When It First Appears
The moment a bug occurs is the richest information point. Before anything else — before you try to reproduce, before you start the investigation — capture:
- Screenshot or screen recording of the failure state
- Device logcat (start capturing before you begin testing each session)
- App version and build number
- Device model and OS version
- What you were doing for the last 5 minutes before the failure
- Network conditions (Wi-Fi vs cellular, signal strength)
- Battery level and power saving mode state
- Whether the app had been running for minutes or hours
Most of this takes 60 seconds to capture. Without it, you're reconstructing from memory, which is unreliable.
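Much of that checklist can be collected with one command. The following is a minimal sketch, assuming `adb` is on your PATH and exactly one device is connected. The function name `capture_bug_context` and the output file layout are made up for this example, and the `low_power` settings key reflects stock Android; some OEM skins may report battery saver differently.

```shell
# Sketch: snapshot device and app state right after a failure.
# Assumes adb is on PATH and exactly one device is connected.
# capture_bug_context and its file layout are illustrative, not a standard tool.
capture_bug_context() {
    pkg="$1"                                  # application id, e.g. com.your.package
    out="${2:-bug-$(date +%Y%m%d-%H%M%S)}"    # output directory (timestamped by default)
    mkdir -p "$out" || return 1

    {
        echo "captured_at: $(date)"
        echo "model: $(adb shell getprop ro.product.model)"
        echo "manufacturer: $(adb shell getprop ro.product.manufacturer)"
        echo "android_version: $(adb shell getprop ro.build.version.release)"
        # Assumption: 1 means battery saver is on (stock Android global setting).
        echo "battery_saver: $(adb shell settings get global low_power)"
        echo "app_version:"
        adb shell dumpsys package "$pkg" | grep -E 'versionName|versionCode'
    } > "$out/device.txt"

    adb shell dumpsys battery > "$out/battery.txt"        # battery level and charging state
    adb shell dumpsys connectivity > "$out/network.txt"   # active network details
    adb logcat -d -v threadtime > "$out/logcat.txt"       # dump the current log buffer and exit
    adb exec-out screencap -p > "$out/screen.png"         # screenshot of the failure state

    echo "$out"                                           # print where the capture went
}
```

Running `capture_bug_context com.your.package` immediately after the failure leaves a timestamped folder you can attach to the bug report. What you were doing in the five minutes beforehand still has to be written down by hand.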
Step 2: Enumerate Possible Variables
Hard-to-reproduce bugs usually depend on a specific combination of conditions. List every variable you can think of:
State variables:
- Account type (free vs paid, new vs veteran user)
- Data state (empty account, large data set, specific record types)
- Session state (fresh login vs long session, token age)
Environment variables:
- Device model and manufacturer
- OS version and OEM skin
- App version (has this been happening since a specific build?)
- Available storage and memory
Timing variables:
- Time since app launch
- Time since last action
- Concurrent operations (download in progress, notification received)
Network variables:
- Wi-Fi vs cellular
- Signal quality
- VPN active
- Corporate proxy
Map what you know about the conditions when the bug appeared against each variable. Start eliminating the ones that weren't factors.
Step 3: Try Exact Reproduction
Reconstruct the exact state as precisely as possible. Use a fresh install if you don't know what accumulated state exists. Set up the account data to match what was present. Reproduce the exact sequence of actions.
If it reproduces: you have a reliable case. Document it precisely.
If it doesn't: the missing variable is in your list. Start testing variations.
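Resetting to a known state is easier to repeat when it is scripted. This is a sketch under the same assumptions as before (`adb` on PATH, one connected device); `reset_app_state` is a hypothetical helper name, and account data still has to be set up separately to match the original conditions.

```shell
# Sketch: return the app to a known state before each reproduction attempt.
# Assumes adb is on PATH and one device is connected; the function name is hypothetical.
reset_app_state() {
    pkg="$1"        # application id, e.g. com.your.package
    apk="${2:-}"    # optional: path to the exact build under test

    if [ -n "$apk" ]; then
        # Full reinstall: remove the app and its data, then install the given build.
        adb uninstall "$pkg" >/dev/null 2>&1    # a "not installed" failure is fine here
        adb install "$apk" || return 1
    else
        # Keep the installed build but wipe its data and cache.
        adb shell pm clear "$pkg" || return 1
    fi

    # Launch through the launcher intent so every attempt starts from the same entry point.
    adb shell monkey -p "$pkg" -c android.intent.category.LAUNCHER 1 >/dev/null
}
```

Call `reset_app_state com.your.package` to wipe data only, or pass an APK path as the second argument to reinstall the build in which the bug was first seen.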
Step 4: Bisect the Variables
Systematically change one variable at a time:
- Try a different device
- Try a different OS version
- Try a different account data state
- Try after the app has been running for 2+ hours
- Try with airplane mode activated mid-action
- Try with another app downloading in the background
Each variation either reproduces the bug (you found the variable) or doesn't (you eliminated one possibility).
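One way to keep the variations disciplined is to generate the plan from your variable list, so that each case differs from the baseline in exactly one variable. The sketch below assumes a made-up input format (`name|baseline|alt1,alt2,...`, one variable per line on stdin); the function name is also invented for illustration.

```shell
# Sketch: print a test plan where each case differs from the baseline in one variable.
# Input on stdin, one variable per line:  name|baseline|alt1,alt2,...
# The input format and the function name are made up for this example.
one_at_a_time_plan() {
    awk -F'|' '
        NF >= 3 {
            n = split($3, alts, ",")    # alternatives to try for this variable
            for (i = 1; i <= n; i++)
                printf "%d. change %s: %s -> %s (all else at baseline)\n", ++c, $1, $2, alts[i]
        }
    '
}
```

For example, piping `device|Pixel 7|Samsung A52,Xiaomi 12` and `network|Wi-Fi|cellular` into `one_at_a_time_plan` yields three numbered cases: two device changes and one network change.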
[!TIP] When bisecting, keep notes. "Tried on Pixel 7, no reproduce. Tried on Samsung A52 One UI 4.1, no reproduce. Tried on Xiaomi MIUI 13, reproduced 2/3 times." This log becomes your evidence even if you don't find the exact cause immediately.
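Those notes are easier to summarize if they are kept in a structured file. A minimal sketch, assuming a tab-separated log in the current directory; `REPRO_LOG`, `log_attempt`, and `repro_summary` are names invented for this example.

```shell
# Sketch: a tab-separated attempt log plus a per-condition reproduction summary.
# REPRO_LOG, log_attempt, and repro_summary are names invented for this example.
REPRO_LOG="${REPRO_LOG:-repro-log.tsv}"

# log_attempt "<condition>" <yes|no> ["free-form note"]
log_attempt() {
    printf '%s\t%s\t%s\t%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "${3:-}" >> "$REPRO_LOG"
}

# Print "<condition>: reproduced X/Y" for every condition in the log.
repro_summary() {
    awk -F'\t' '
        { total[$2]++; if ($3 == "yes") hits[$2]++ }
        END { for (c in total) printf "%s: reproduced %d/%d\n", c, hits[c], total[c] }
    ' "$REPRO_LOG" | sort
}
```

After `log_attempt "Pixel 7" no` and three attempts on `"Xiaomi MIUI 13"` with two marked `yes`, `repro_summary` reports a 0/1 rate for the first condition and 2/3 for the second.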
Step 5: Stress Testing for Timing Bugs
Race conditions and timing bugs surface only when operations overlap or interleave in a particular order. To force those overlaps:
Slow the network artificially:
# Android: throttle network in developer options
# Or use Charles Proxy to simulate slow connection
# Throttle to 3G speeds during the operation that triggers the bug
Run repeated rapid actions:
# Tap the target button 10 times as fast as possible
# Submit the form while a network request is already in flight
# Navigate away and back rapidly
Background/foreground cycles:
- Start the operation
- Immediately press home button
- Wait 30 seconds
- Return to app
Timing bugs that appear at "random" often appear consistently under these deliberate stress conditions.
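These stress patterns can also be driven over adb, which makes each run repeatable. The sketch below assumes `adb` is on PATH with one device connected. The function names and tap coordinates are placeholders, and cutting Wi-Fi and mobile data with `svc` is used here as a stand-in for toggling airplane mode by hand.

```shell
# Sketch: drive the stress patterns above over adb instead of by hand.
# Assumes adb is on PATH and one device is connected. Function names and
# tap coordinates are placeholders; look up real coordinates for your screen
# (for example with "Pointer location" in developer options).

# rapid_tap <x> <y> [count]: tap the same point repeatedly, back to back.
rapid_tap() {
    i=0
    while [ "$i" -lt "${3:-10}" ]; do
        adb shell input tap "$1" "$2"
        i=$((i + 1))
    done
}

# background_cycle <package> [seconds]: press Home, wait, then return to the app.
background_cycle() {
    adb shell input keyevent KEYCODE_HOME
    sleep "${2:-30}"
    adb shell monkey -p "$1" -c android.intent.category.LAUNCHER 1 >/dev/null
}

# drop_network_for <seconds>: cut Wi-Fi and mobile data mid-action, then restore both.
drop_network_for() {
    adb shell svc wifi disable
    adb shell svc data disable
    sleep "$1"
    adb shell svc wifi enable
    adb shell svc data enable
}
```

Each `adb shell` call carries some startup overhead, so `rapid_tap` is slower than a determined finger; it is more useful for repeatability than for raw speed. A typical run starts the operation in the app, then calls `background_cycle com.your.package 30` or `drop_network_for 10` while the request is in flight.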
Step 6: Use Device Logs to Trace State
Even if you can't reproduce the visual failure, logs often show what happened before and during a crash or error:
# Filter logcat to your app
adb logcat --pid=$(adb shell pidof com.your.package) -v time
# Look for exceptions, warnings, and state transitions
# around the time the bug occurred
In Firebase Crashlytics, the breadcrumbs leading up to a crash often tell you the state that triggered it even without a reliable reproduction path.
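When the log has been saved to a file, narrowing it to the moments around the failure makes the relevant lines much easier to spot. This sketch assumes a dump taken with `-v time` or `-v threadtime`, where the first two fields are `MM-DD` and `HH:MM:SS.mmm`; `log_window` and `log_problems` are hypothetical helper names, and the search patterns are only a starting point.

```shell
# Sketch: narrow a saved logcat dump to the moments around the failure.
# Assumes the dump was taken with -v time or -v threadtime, so each line
# starts with "MM-DD HH:MM:SS.mmm". Function names are hypothetical.

# log_window <logfile> <MM-DD> <start HH:MM:SS> <end HH:MM:SS>
# Keep only lines from one day whose time falls inside [start, end].
log_window() {
    awk -v day="$2" -v start="$3" -v end="$4" '
        $1 == day {
            t = substr($2, 1, 8)              # drop the .mmm milliseconds
            if (t >= start && t <= end) print
        }
    ' "$1"
}

# log_problems <logfile>: numbered lines that mention a crash, exception, or ANR.
log_problems() {
    grep -nE 'FATAL EXCEPTION|Exception|ANR in' "$1"
}
```

For a bug seen at roughly 14:02 on March 14, `log_window logcat.txt 03-14 14:01:30 14:02:30` keeps the surrounding minute, and `log_problems logcat.txt` lists candidate lines to start reading from.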
When You Genuinely Can't Reproduce
Some bugs are environment-specific in ways you can't replicate without the exact device and account state. In these cases:
- Document as much as possible about the conditions observed
- Add logging specifically around the suspected code path
- Ship the additional logging in the next release
- Wait for the bug to recur with better telemetry
This "instrument and wait" approach is sometimes the only path forward for bugs that only appear on specific production configurations that can't be recreated in a lab.
Hard-to-reproduce doesn't mean impossible to understand. It means you need more information. The systematic approach gives you a path to that information even when the bug seems random.
Sudarshan Chaudhari
AI Systems Builder / Product Engineer
Bangkok, Thailand
Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
