Designing Honest Player Benchmark Reports With Rust
A practical look at RustPlayerBenchAI, a small Rust CLI for profile-based signage player benchmark reports, threshold tuning, and JSON output.
Benchmark reports are only useful when they are honest about what they measure.
That sounds obvious, but it matters. A report that claims too much becomes misleading. A report that says too little becomes noise. For digital signage players, I want something in the middle: a simple benchmark summary that compares known device profiles, explains CPU and memory expectations, and produces a clear pass, warn, or fail verdict.
That is the reason I built RustPlayerBenchAI. The binary is called `benchrun`, and its current profiles are `scos`, `brightsign`, `pi4`, and `local`. It does not pretend to collect real host metrics yet. It creates deterministic profile-based samples so I can design the reporting shape, JSON contract, threshold behavior, and release checks before connecting it to real devices.
What The CLI Measures Today
RustPlayerBenchAI currently reports:
- Device profile.
- Duration in seconds.
- Average CPU percentage.
- Average memory usage in MB.
- Sample count.
- Verdict.
Run the default local profile:
benchrun run
Run a known signage profile:
benchrun run --device scos --duration 5
Example JSON report:
{
  "device": "scos",
  "duration_secs": 5,
  "avg_cpu_percent": 45.0,
  "avg_memory_mb": 380.0,
  "sample_count": 5,
  "verdict": "Pass"
}
That report is intentionally small. If a benchmark tool is going to be used in smoke tests, QA notes, or support conversations, the first version should be easy to read.
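The report shape above maps naturally onto a small struct. The following is a minimal sketch, not the project's actual code: the `BenchReport` type and its `to_json` method are hypothetical names, the JSON is hand-built with the standard library only, and a real implementation would more likely lean on a serialization crate.

```rust
/// Hypothetical in-memory form of the report; field names mirror the JSON keys.
#[derive(Debug, Clone, PartialEq)]
pub struct BenchReport {
    pub device: String,
    pub duration_secs: u64,
    pub avg_cpu_percent: f64,
    pub avg_memory_mb: f64,
    pub sample_count: usize,
    /// Kept as a plain string here; an enum would be the more idiomatic choice.
    pub verdict: String,
}

impl BenchReport {
    /// Hand-rolled JSON for illustration only.
    /// Assumption: `device` and `verdict` contain no characters that need escaping.
    /// Floats are printed with one decimal place to match the example report.
    pub fn to_json(&self) -> String {
        let fields = [
            format!("\"device\": \"{}\"", self.device),
            format!("\"duration_secs\": {}", self.duration_secs),
            format!("\"avg_cpu_percent\": {:.1}", self.avg_cpu_percent),
            format!("\"avg_memory_mb\": {:.1}", self.avg_memory_mb),
            format!("\"sample_count\": {}", self.sample_count),
            format!("\"verdict\": \"{}\"", self.verdict),
        ];
        format!("{{\n  {}\n}}", fields.join(",\n  "))
    }
}
```

Keeping the field order fixed in one place makes it harder for the terminal output and the JSON output to drift apart.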
Known Profiles
The current profile model is deterministic:
| Device | CPU Profile | Memory Profile |
|---|---|---|
| `scos` | 45% | 380 MB |
| `brightsign` | 30% | 256 MB |
| `pi4` | 55% | 512 MB |
| `local` | 40% | 350 MB |
The implementation reflects that:
/// Fixed CPU percentage for a device profile; unknown names fall back to 40.0.
fn simulate_cpu(device: &str) -> f64 {
    match device {
        "scos" => 45.0,
        "brightsign" => 30.0,
        "pi4" => 55.0,
        _ => 40.0,
    }
}

/// Fixed memory usage in MB for a device profile; unknown names fall back to 350.0.
fn simulate_memory(device: &str) -> f64 {
    match device {
        "scos" => 380.0,
        "brightsign" => 256.0,
        "pi4" => 512.0,
        _ => 350.0,
    }
}
That is not a real performance collector. It is a stable reporting baseline. The value is in getting the output model right before adding host metric collection.
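To show how those constants could turn into the averages and sample count in the report, here is a minimal sketch. It assumes one sample per second, which is consistent with the example report (5 seconds, 5 samples) but is not confirmed by the source. `Sample`, `collect_samples`, and `average` are hypothetical names.

```rust
pub fn simulate_cpu(device: &str) -> f64 {
    match device {
        "scos" => 45.0,
        "brightsign" => 30.0,
        "pi4" => 55.0,
        _ => 40.0,
    }
}

pub fn simulate_memory(device: &str) -> f64 {
    match device {
        "scos" => 380.0,
        "brightsign" => 256.0,
        "pi4" => 512.0,
        _ => 350.0,
    }
}

/// One simulated reading (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Sample {
    pub cpu_percent: f64,
    pub memory_mb: f64,
}

/// Assumption: exactly one sample is taken per second of duration.
pub fn collect_samples(device: &str, duration_secs: u64) -> Vec<Sample> {
    (0..duration_secs)
        .map(|_| Sample {
            cpu_percent: simulate_cpu(device),
            memory_mb: simulate_memory(device),
        })
        .collect()
}

/// Returns (average CPU %, average memory MB), or None when there are no samples.
pub fn average(samples: &[Sample]) -> Option<(f64, f64)> {
    if samples.is_empty() {
        return None;
    }
    let n = samples.len() as f64;
    let cpu = samples.iter().map(|s| s.cpu_percent).sum::<f64>() / n;
    let mem = samples.iter().map(|s| s.memory_mb).sum::<f64>() / n;
    Some((cpu, mem))
}
```

Because the averaging step only sees a slice of samples, it would not need to change when the samples start coming from a real host instead of a profile table.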
Thresholds Should Be Tunable
The v1.1.0 release added configurable CPU and memory thresholds.
Default verdict thresholds are:
| Verdict | CPU | Memory |
|---|---|---|
| `Pass` | 70% or lower | 600 MB or lower |
| `Warn` | Above 70%, up to 90% | Above 600 MB, up to 800 MB |
| `Fail` | Above 90% | Above 800 MB |
Those defaults are reasonable for a first pass, but different fleets need different expectations. A constrained device, an older player, or a busy local test profile may need a different warning line.
The CLI exposes those values:
benchrun run \
  --device scos \
  --cpu-warn 60 \
  --cpu-fail 85 \
  --memory-warn 500 \
  --memory-fail 700
That makes the benchmark report more honest. Instead of hiding assumptions in code, the caller can state the environment’s limits at runtime.
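The threshold model can be sketched in a few lines. This is an illustration under assumptions, not the tool's actual logic: the `Thresholds` struct and the `classify` and `verdict` functions are hypothetical, and the sketch assumes the overall verdict is the worse of the CPU and memory verdicts, which the source does not state.

```rust
/// Variants are declared from best to worst so the derived `Ord` ranks Fail highest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Verdict {
    Pass,
    Warn,
    Fail,
}

/// Hypothetical container for the four CLI threshold flags.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Thresholds {
    pub cpu_warn: f64,
    pub cpu_fail: f64,
    pub memory_warn: f64,
    pub memory_fail: f64,
}

impl Default for Thresholds {
    /// Defaults taken from the threshold table in the article.
    fn default() -> Self {
        Thresholds {
            cpu_warn: 70.0,
            cpu_fail: 90.0,
            memory_warn: 600.0,
            memory_fail: 800.0,
        }
    }
}

/// At or below `warn` passes, above `fail` fails, anything in between warns.
fn classify(value: f64, warn: f64, fail: f64) -> Verdict {
    if value > fail {
        Verdict::Fail
    } else if value > warn {
        Verdict::Warn
    } else {
        Verdict::Pass
    }
}

/// Assumption: the overall verdict is the worse of the CPU and memory verdicts.
pub fn verdict(cpu_percent: f64, memory_mb: f64, t: &Thresholds) -> Verdict {
    classify(cpu_percent, t.cpu_warn, t.cpu_fail)
        .max(classify(memory_mb, t.memory_warn, t.memory_fail))
}
```

With the defaults, the `pi4` profile (55% CPU, 512 MB) passes. With `--memory-warn 500` it would warn under this sketch, because 512 MB is above the lowered warning line.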
JSON Makes Reports Comparable
Terminal output is useful when a person is reading the result. JSON is useful when benchmark reports become artifacts.
benchrun run --device pi4 --duration 60 --json > bench-report.json
Once the result is JSON, a small script can compare reports over time:
benchrun run --device scos --duration 10 --json > reports/scos-before.json
benchrun run --device scos --duration 10 --json > reports/scos-after.json
In the current simulated version, those values should stay stable. In a future host-metrics version, the same report shape can support trend checks, release comparisons, or QA notes.
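A before/after comparison could look something like the sketch below. It assumes the two reports have already been parsed into values (JSON parsing is left out to keep the example dependency-free), and the `ReportSummary` and `Delta` types, the `compare` function, and the tolerance parameters are all hypothetical.

```rust
/// The two numeric fields of a report that a comparison cares about.
#[derive(Debug, Clone, PartialEq)]
pub struct ReportSummary {
    pub avg_cpu_percent: f64,
    pub avg_memory_mb: f64,
}

/// Result of comparing an "after" report against a "before" report.
#[derive(Debug, Clone, PartialEq)]
pub struct Delta {
    pub cpu_delta: f64,
    pub memory_delta_mb: f64,
    pub regressed: bool,
}

/// Flags a regression when either metric grows by more than its tolerance.
/// Improvements (negative deltas) never count as regressions.
pub fn compare(
    before: &ReportSummary,
    after: &ReportSummary,
    cpu_tolerance: f64,
    memory_tolerance_mb: f64,
) -> Delta {
    let cpu_delta = after.avg_cpu_percent - before.avg_cpu_percent;
    let memory_delta_mb = after.avg_memory_mb - before.avg_memory_mb;
    Delta {
        cpu_delta,
        memory_delta_mb,
        regressed: cpu_delta > cpu_tolerance || memory_delta_mb > memory_tolerance_mb,
    }
}
```

For the simulated profiles both deltas come out as zero, which is exactly the stability the paragraph above describes.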
> [!TIP]
> Keep the benchmark output boring. A stable report contract is more useful than a clever terminal UI when another tool needs to consume the result.
Where This Fits In A Signage Workflow
For signage systems, player performance is not an abstract number. It affects whether content plays smoothly, whether dashboards stay responsive, and whether a device has enough headroom for the playlist it is running.
RustPlayerBenchAI can fit into a lightweight workflow like this:
benchrun run --device scos --duration 30
benchrun run --device brightsign --duration 30 --json > brightsign-report.json
benchrun run --device pi4 --cpu-warn 65 --memory-warn 550
As the tool grows, the same structure can move from profile simulation to real collection:
- Read CPU and memory from the host.
- Sample over time instead of using profile constants.
- Store reports for release comparison.
- Flag regressions between builds.
- Generate a small HTML report for QA.
The point is to keep the reporting model stable while improving where the numbers come from.
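As a rough idea of what the first item in that list might involve, here is a sketch of reading memory usage on a Linux host. It is not part of the current tool. It assumes the player runs Linux and exposes `/proc/meminfo` with `MemTotal` and `MemAvailable` lines, and the function names are hypothetical. The parser takes text as input so it can be checked without touching the real file.

```rust
use std::fs;

/// Parses text in `/proc/meminfo` format and returns used memory in MB,
/// computed as MemTotal minus MemAvailable. Values in the file are in kB
/// (1024-byte units), so dividing by 1024 gives mebibytes.
/// Returns None if either line is missing or not a number.
pub fn used_memory_mb(meminfo: &str) -> Option<f64> {
    let mut total_kb: Option<f64> = None;
    let mut available_kb: Option<f64> = None;

    for line in meminfo.lines() {
        let mut parts = line.split_whitespace();
        match parts.next() {
            Some("MemTotal:") => {
                total_kb = parts.next().and_then(|v| v.parse().ok());
            }
            Some("MemAvailable:") => {
                available_kb = parts.next().and_then(|v| v.parse().ok());
            }
            _ => {}
        }
    }

    Some((total_kb? - available_kb?) / 1024.0)
}

/// Linux-only convenience wrapper; returns None on other systems
/// or when the file cannot be read.
pub fn read_used_memory_mb() -> Option<f64> {
    let text = fs::read_to_string("/proc/meminfo").ok()?;
    used_memory_mb(&text)
}
```

A value produced this way could feed the same `avg_memory_mb` field, so the report contract would not need to change.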
What This Does Not Prove
Benchmark tools can easily overclaim, so the boundary matters.
RustPlayerBenchAI does not currently prove:
- Real device CPU usage under production content.
- GPU performance.
- Network stability.
- Browser memory leaks.
- Playback smoothness.
- Thermal throttling behavior.
It does provide a clear report format and threshold model for the next version of that work.
Key Takeaways
- Benchmark reports should be explicit about what they measure and what they do not.
- Profile-based samples are useful for designing report contracts before real host collection.
- Configurable thresholds make pass, warn, and fail verdicts match the environment.
- JSON output turns benchmark results into release or QA artifacts.
- A useful benchmark CLI starts with a stable, boring report shape.
RustPlayerBenchAI is a small step toward more practical player checks. Today it gives deterministic benchmark reports. The next useful step is connecting the same report model to real device metrics.
Sudarshan Chaudhari
AI Systems Builder / Product Engineer
Bangkok, Thailand
Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
