Real Case: How Manual Testing Found a Critical Bug Automation Missed
A real-world example of a critical bug that passed all automated tests and was caught only by manual exploratory testing — and what it reveals about automation's limits.
The automated suite was green. Code review was clean. The feature had been in development for two weeks. Everything pointed to a clean release.
During the manual exploratory session, I found a critical bug that would have caused data loss for 15% of users.
Here's exactly what happened.
The Feature
A digital signage content management system. New feature: bulk content scheduling — schedule multiple pieces of content to display across multiple screens in a single operation.
Automated tests covered:
- Scheduling single items (existing tests, passing)
- API validation for bulk input (new tests, passing)
- Database writes verified (new tests, passing)
- UI flow from selection to confirmation (new tests, passing)
The Exploratory Session
Charter: Explore bulk scheduling under adverse conditions and edge cases.
For the first 20 minutes, testing was routine. Everything worked as expected.
Then: I scheduled 12 items across 6 screens. While the bulk confirmation dialog was open (the "Are you sure?" step), I opened a second browser tab and deleted two of the 6 screens.
Then confirmed the bulk schedule in the first tab.
Expected: Error informing me that 2 screens no longer exist, allowing me to proceed with the remaining 4 or cancel.
Actual: Success message. Schedule created. But only 4 screens received the schedule. The 2 deleted screens' content slots were silently removed from the schedule with no indication. The user had no way to know the schedule was incomplete.
Why Automation Missed It
The automated tests tested operations sequentially. Create screens. Create content. Schedule. Verify. Clean up. Each operation in order.
No automated test created state in one context, modified that state concurrently in another context, and then completed the original operation. This is a stale-state race condition (a check-then-act gap between opening the dialog and confirming it) that requires:
- Two concurrent sessions (not one)
- A state modification in one session that affects the context of another
- A confirmation action that doesn't re-validate the current state
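To make the three conditions concrete, here is a minimal sketch of the failure mode. It is a hypothetical reconstruction for illustration, not the product's actual code: the function name `confirm_without_revalidation`, the screen ids, and the in-memory set standing in for the database are all assumptions.

```python
def confirm_without_revalidation(pending_screens, content_ids, current_screens):
    """Hypothetical reconstruction of the faulty behaviour, not the real code."""
    # Deleted screens are silently filtered out; the caller only sees "success".
    targets = [s for s in pending_screens if s in current_screens]
    schedule = {s: list(content_ids) for s in targets}
    return "success", schedule


def sequential_flow():
    """One session, one step at a time: the shape the automated suite exercised."""
    screens = {"s1", "s2", "s3", "s4", "s5", "s6"}
    pending = sorted(screens)                 # dialog opened
    return confirm_without_revalidation(pending, ["c1"], screens)


def interleaved_flow():
    """Two sessions: a delete lands between opening the dialog and confirming."""
    screens = {"s1", "s2", "s3", "s4", "s5", "s6"}
    pending = sorted(screens)                 # session A: dialog opened
    screens -= {"s5", "s6"}                   # session B: two screens deleted
    return confirm_without_revalidation(pending, ["c1"], screens)
```

Both flows return `"success"`. Only the interleaved one yields a schedule covering four screens instead of six, which is why a suite made entirely of sequential flows stays green.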
The automated test designer (me) didn't think to test this combination. The automated framework executes exactly what you write. It doesn't explore.
Why Manual Exploration Found It
The session charter was "adverse conditions and edge cases." My mental model of scheduling systems includes: what happens when the underlying resources change during a multi-step operation?
This is institutional knowledge from previous bugs in similar systems. It's pattern recognition. When I saw a multi-step confirmation dialog, I immediately thought: "What if the data changes between these steps?"
That's a human thought process. You can't encode it in an automated test unless you've already thought of the scenario.
The Fix and the Prevention
The fix: re-validate the target screens on bulk schedule confirmation. If any screens have been deleted, list them explicitly and let the user decide whether to proceed with the remaining screens or cancel.
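A rough sketch of what re-validation at confirmation time might look like. Everything here is assumed for illustration: the `ConfirmResult` type, the `confirm_bulk_schedule` function, and the `proceed_without_missing` flag are hypothetical names, and `existing_screens` stands in for a fresh read of the screens table at confirm time.

```python
from dataclasses import dataclass, field


@dataclass
class ConfirmResult:
    """Outcome of a bulk-schedule confirmation attempt (hypothetical type)."""
    status: str                                   # "created" or "needs_decision"
    assignments: list = field(default_factory=list)
    missing_screens: list = field(default_factory=list)


def confirm_bulk_schedule(requested_screens, content_ids, existing_screens,
                          proceed_without_missing=False):
    """Re-check target screens when the user confirms, not when the dialog opened.

    existing_screens must be read at confirmation time (assumed to be a set of ids).
    """
    missing = [s for s in requested_screens if s not in existing_screens]
    remaining = [s for s in requested_screens if s in existing_screens]

    if missing and not proceed_without_missing:
        # Write nothing; list the deleted screens so the user can decide.
        return ConfirmResult("needs_decision", [], missing)

    # Nothing is missing, or the user explicitly chose to continue.
    assignments = [(s, c) for s in remaining for c in content_ids]
    return ConfirmResult("created", assignments, missing)
```

The design choice in this sketch is that a partial schedule is only ever written after an explicit second call with `proceed_without_missing=True`, so an incomplete schedule can never be created silently.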
The prevention: the exploratory session became a documented test case. A specific automated test was added for this scenario. Future regressions would be caught automatically.
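As an illustration, a regression test for this scenario could look roughly like the sketch below. It runs against a hypothetical in-memory stand-in (`InMemoryScheduler`) rather than the real system, and every method name is an assumption. The point is the shape of the test: open the dialog in one session, mutate state from another, then confirm.

```python
class InMemoryScheduler:
    """Hypothetical stand-in for the CMS backend, for illustration only."""

    def __init__(self, screen_ids):
        self.screens = set(screen_ids)
        self.schedules = []

    def open_bulk_dialog(self, screen_ids, content_ids):
        # Session A: snapshot of the selection taken when the dialog opens.
        return {"screens": list(screen_ids), "content": list(content_ids)}

    def delete_screen(self, screen_id):
        # Session B: a concurrent change made from another tab.
        self.screens.discard(screen_id)

    def confirm(self, pending):
        # Re-validate against current state instead of trusting the snapshot.
        missing = [s for s in pending["screens"] if s not in self.screens]
        if missing:
            return {"ok": False, "missing_screens": missing}
        self.schedules.append(pending)
        return {"ok": True, "missing_screens": []}


def test_confirm_reports_screens_deleted_in_another_session():
    screens = [f"screen-{i}" for i in range(1, 7)]      # 6 screens
    content = [f"item-{i}" for i in range(1, 13)]       # 12 items
    cms = InMemoryScheduler(screens)

    pending = cms.open_bulk_dialog(screens, content)    # tab 1: dialog open
    cms.delete_screen("screen-5")                       # tab 2: delete two screens
    cms.delete_screen("screen-6")
    result = cms.confirm(pending)                       # tab 1: confirm

    assert result["ok"] is False
    assert result["missing_screens"] == ["screen-5", "screen-6"]
    assert cms.schedules == []                          # nothing written silently
```

The test function follows pytest naming conventions but uses only plain `assert`, so it can also be called directly.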
[!NOTE] This is the right workflow: manual exploration discovers the bug → it becomes a documented regression test case → automation prevents future recurrence. Manual and automated working together, not competing.
What This Case Reveals
The scenario required creativity to discover. No specification said "test concurrent session state modification." It required thinking like an adversarial user and having domain experience with similar system types.
Automation covers the expected. Everything that was anticipated in the test design was covered. The unanticipated scenario wasn't.
The cost of missing it. If this shipped, 15% of scheduled content deployments involving concurrent system use would silently be incomplete. The signage would display wrong content. No error. No indication. Clients would notice weeks later that scheduled campaigns didn't run as expected.
That's the kind of production bug that costs customer relationships. Manual exploratory testing found it in a 60-minute session, so the cost was about one hour of one engineer's time. The avoided incident would have cost significantly more.
Sudarshan Chaudhari
AI Systems Builder / Product Engineer
Bangkok, Thailand
Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
