June 20, 2026 · 5 min read

Designing Honest Player Benchmark Reports With Rust

A practical look at RustPlayerBenchAI, a small Rust CLI for profile-based signage player benchmark reports, threshold tuning, and JSON output.

Rust, CLI, Benchmarking, Automation, Digital Signage

Benchmark reports are only useful when they are honest about what they measure.

That sounds obvious, but it matters. A report that claims too much becomes misleading. A report that says too little becomes noise. For digital signage players, I want something in the middle: a simple benchmark summary that compares known device profiles, explains CPU and memory expectations, and produces a clear pass, warn, or fail verdict.

That is the reason I built RustPlayerBenchAI. The binary is called `benchrun`, and the current version is a profile-based benchmark reporter for signage player classes like `scos`, `brightsign`, `pi4`, and `local`.

It does not pretend to collect real host metrics yet. It creates deterministic profile-based samples so I can design the reporting shape, JSON contract, threshold behavior, and release checks before connecting it to real devices.

What The CLI Measures Today

RustPlayerBenchAI currently reports:

  • Device profile.
  • Duration in seconds.
  • Average CPU percentage.
  • Average memory usage in MB.
  • Sample count.
  • Verdict.

Run the default local profile:

```bash
benchrun run
```

Run a known signage profile:

```bash
benchrun run --device scos --duration 5
```

Example JSON report:

```json
{
  "device": "scos",
  "duration_secs": 5,
  "avg_cpu_percent": 45.0,
  "avg_memory_mb": 380.0,
  "sample_count": 5,
  "verdict": "Pass"
}
```

That report is intentionally small. If a benchmark tool is going to be used in smoke tests, QA notes, or support conversations, the first version should be easy to read.
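To make the report contract concrete, here is a minimal sketch of a struct that mirrors the JSON fields above. The names `BenchReport`, `Verdict`, and `to_json` are hypothetical, the `Warn` and `Fail` spellings are assumed from the `"Pass"` value in the example, and the JSON is written by hand with the standard library only. The real project may well use a serialization crate instead.

```rust
use std::fmt;

/// Verdict values. "Pass" matches the example report; "Warn" and "Fail"
/// are assumed spellings.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Pass,
    Warn,
    Fail,
}

impl fmt::Display for Verdict {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Verdict::Pass => "Pass",
            Verdict::Warn => "Warn",
            Verdict::Fail => "Fail",
        };
        f.write_str(name)
    }
}

/// Hypothetical report type; field names are copied from the example JSON.
#[derive(Debug, Clone, PartialEq)]
struct BenchReport {
    device: String,
    duration_secs: u64,
    avg_cpu_percent: f64,
    avg_memory_mb: f64,
    sample_count: u64,
    verdict: Verdict,
}

impl BenchReport {
    /// Hand-written JSON for a flat report. It does not escape special
    /// characters in `device`, so it only suits simple profile names.
    fn to_json(&self) -> String {
        let fields = [
            format!("\"device\": \"{}\"", self.device),
            format!("\"duration_secs\": {}", self.duration_secs),
            format!("\"avg_cpu_percent\": {:.1}", self.avg_cpu_percent),
            format!("\"avg_memory_mb\": {:.1}", self.avg_memory_mb),
            format!("\"sample_count\": {}", self.sample_count),
            format!("\"verdict\": \"{}\"", self.verdict),
        ];
        format!("{{\n  {}\n}}", fields.join(",\n  "))
    }
}

fn main() {
    let report = BenchReport {
        device: "scos".to_string(),
        duration_secs: 5,
        avg_cpu_percent: 45.0,
        avg_memory_mb: 380.0,
        sample_count: 5,
        verdict: Verdict::Pass,
    };
    println!("{}", report.to_json());
}
```

Keeping the struct this flat means every field in the JSON maps to one typed value, which makes it harder for the terminal output and the JSON output to drift apart.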

Known Profiles

The current profile model is deterministic:

| Device | CPU Profile | Memory Profile |
| --- | --- | --- |
| `scos` | 45% | 380 MB |
| `brightsign` | 30% | 256 MB |
| `pi4` | 55% | 512 MB |
| `local` or unknown | 40% | 350 MB |

The implementation reflects that:

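```rust
/// Returns the fixed CPU percentage for a device profile.
/// Unknown devices fall back to the `local` value.
fn simulate_cpu(device: &str) -> f64 {
    match device {
        "scos" => 45.0,
        "brightsign" => 30.0,
        "pi4" => 55.0,
        _ => 40.0,
    }
}

/// Returns the fixed memory usage in MB for a device profile.
/// Unknown devices fall back to the `local` value.
fn simulate_memory(device: &str) -> f64 {
    match device {
        "scos" => 380.0,
        "brightsign" => 256.0,
        "pi4" => 512.0,
        _ => 350.0,
    }
}
```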

That is not a real performance collector. It is a stable reporting baseline. The value is in getting the output model right before adding host metric collection.

Thresholds Should Be Tunable

The v1.1.0 release added configurable CPU and memory thresholds.

Default verdict thresholds are:

| Verdict | CPU | Memory |
| --- | --- | --- |
| `PASS` | 70% or lower | 600 MB or lower |
| `WARN` | Above 70%, up to 90% | Above 600 MB, up to 800 MB |
| `FAIL` | Above 90% | Above 800 MB |

Those defaults are reasonable for a first pass, but different fleets need different expectations. A constrained device, an older player, or a busy local test profile may need a different warning line.

The CLI exposes those values:

```bash
benchrun run \
  --device scos \
  --cpu-warn 60 \
  --cpu-fail 85 \
  --memory-warn 500 \
  --memory-fail 700
```

That makes the benchmark report more honest. Instead of hiding assumptions in code, the caller can state the environment’s limits at runtime.
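One way to picture the threshold model is a small function that grades each metric and keeps the worse result. This is a sketch under assumptions, not the tool's actual code: the names `Thresholds`, `grade`, and `classify` are hypothetical, a value exactly on a threshold is treated as still inside the lower band (matching "70% or lower" in the defaults above), and combining CPU and memory by taking the more severe verdict is my guess at a sensible rule.

```rust
/// Verdicts in order of severity, so `max` picks the worse one.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Verdict {
    Pass,
    Warn,
    Fail,
}

/// Hypothetical threshold set; field names echo the CLI flags.
#[derive(Debug, Clone, Copy)]
struct Thresholds {
    cpu_warn: f64,
    cpu_fail: f64,
    memory_warn: f64,
    memory_fail: f64,
}

impl Default for Thresholds {
    /// Default values from the verdict threshold table.
    fn default() -> Self {
        Thresholds {
            cpu_warn: 70.0,
            cpu_fail: 90.0,
            memory_warn: 600.0,
            memory_fail: 800.0,
        }
    }
}

/// Grades one metric: above `fail` fails, above `warn` warns, else passes.
fn grade(value: f64, warn: f64, fail: f64) -> Verdict {
    if value > fail {
        Verdict::Fail
    } else if value > warn {
        Verdict::Warn
    } else {
        Verdict::Pass
    }
}

/// Assumed combination rule: the report takes the more severe verdict.
fn classify(cpu_percent: f64, memory_mb: f64, t: &Thresholds) -> Verdict {
    let cpu = grade(cpu_percent, t.cpu_warn, t.cpu_fail);
    let memory = grade(memory_mb, t.memory_warn, t.memory_fail);
    cpu.max(memory)
}

fn main() {
    let defaults = Thresholds::default();
    // scos profile against the defaults: prints "Pass".
    println!("{:?}", classify(45.0, 380.0, &defaults));

    // Same numbers with a stricter CPU warning line: prints "Warn".
    let strict = Thresholds { cpu_warn: 40.0, ..defaults };
    println!("{:?}", classify(45.0, 380.0, &strict));
}
```

The second call shows why tunable thresholds matter: the measured values did not change, only the stated expectation did.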

JSON Makes Reports Comparable

Terminal output is useful when a person is reading the result. JSON is useful when benchmark reports become artifacts.

```bash
benchrun run --device pi4 --duration 60 --json > bench-report.json
```

Once the result is JSON, reports can be captured before and after a change so a small script can compare them over time:

```bash
benchrun run --device scos --duration 10 --json > reports/scos-before.json
benchrun run --device scos --duration 10 --json > reports/scos-after.json
```

In the current simulated version, those values should stay stable. In a future host-metrics version, the same report shape can support trend checks, release comparisons, or QA notes.
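As a rough illustration of that comparison step, the sketch below pulls a numeric field out of two report strings and prints the difference. Both `extract_number` and `metric_delta` are hypothetical helpers, and the string scan only handles the flat report shape shown earlier. A real script would more likely load the files with `std::fs::read_to_string` and use a proper JSON parser.

```rust
/// Finds `"key": <number>` in a flat JSON object and parses the number.
/// Assumes no space before the colon and no nested objects.
fn extract_number(json: &str, key: &str) -> Option<f64> {
    let needle = format!("\"{}\":", key);
    let start = json.find(&needle)? + needle.len();
    let rest = &json[start..];
    // The value ends at the next comma or at the closing brace.
    let end = rest
        .find(|c: char| c == ',' || c == '}')
        .unwrap_or(rest.len());
    rest[..end].trim().parse::<f64>().ok()
}

/// Returns `after - before` for one metric, or None if either is missing.
fn metric_delta(before: &str, after: &str, key: &str) -> Option<f64> {
    Some(extract_number(after, key)? - extract_number(before, key)?)
}

fn main() {
    // Inline stand-ins for reports/scos-before.json and reports/scos-after.json.
    // The "after" CPU value is invented to show a non-zero delta.
    let before = r#"{ "device": "scos", "avg_cpu_percent": 45.0, "avg_memory_mb": 380.0 }"#;
    let after = r#"{ "device": "scos", "avg_cpu_percent": 52.5, "avg_memory_mb": 380.0 }"#;

    for key in ["avg_cpu_percent", "avg_memory_mb"] {
        match metric_delta(before, after, key) {
            Some(delta) => println!("{}: {:+.1}", key, delta),
            None => println!("{}: missing or not numeric", key),
        }
    }
}
```

With the simulated profiles every delta should be zero, which is itself a useful check that the report contract has not changed between builds.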

Tip: Keep the benchmark output boring. A stable report contract is more useful than a clever terminal UI when another tool needs to consume the result.

Where This Fits In A Signage Workflow

For signage systems, player performance is not an abstract number. It affects whether content plays smoothly, whether dashboards stay responsive, and whether a device has enough headroom for the playlist it is running.

RustPlayerBenchAI can fit into a lightweight workflow like this:

```bash
benchrun run --device scos --duration 30
benchrun run --device brightsign --duration 30 --json > brightsign-report.json
benchrun run --device pi4 --cpu-warn 65 --memory-warn 550
```

As the tool grows, the same structure can move from profile simulation to real collection:

  • Read CPU and memory from the host.
  • Sample over time instead of using profile constants.
  • Store reports for release comparison.
  • Flag regressions between builds.
  • Generate a small HTML report for QA.

The point is to keep the reporting model stable while improving where the numbers come from.
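One possible way to keep the report stable while swapping the data source is to hide sampling behind a trait. The sketch below is an assumption about how that could look, not a description of the current code: `MetricSource`, `ProfileSource`, `Sample`, and `average` are all hypothetical names, and only the profile constants come from the table earlier in this post.

```rust
/// One CPU and memory reading. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sample {
    cpu_percent: f64,
    memory_mb: f64,
}

/// Anything that can produce a reading: profile constants today,
/// a host metrics collector in a later version.
trait MetricSource {
    fn sample(&mut self) -> Sample;
}

/// Deterministic source backed by the known device profiles.
struct ProfileSource {
    cpu_percent: f64,
    memory_mb: f64,
}

impl ProfileSource {
    fn for_device(device: &str) -> Self {
        let (cpu_percent, memory_mb) = match device {
            "scos" => (45.0, 380.0),
            "brightsign" => (30.0, 256.0),
            "pi4" => (55.0, 512.0),
            _ => (40.0, 350.0), // `local` or unknown
        };
        ProfileSource { cpu_percent, memory_mb }
    }
}

impl MetricSource for ProfileSource {
    fn sample(&mut self) -> Sample {
        Sample {
            cpu_percent: self.cpu_percent,
            memory_mb: self.memory_mb,
        }
    }
}

/// Takes `count` readings and averages them. Returns None for zero
/// samples so the caller never divides by zero.
fn average(source: &mut dyn MetricSource, count: u32) -> Option<Sample> {
    if count == 0 {
        return None;
    }
    let mut cpu_total = 0.0;
    let mut memory_total = 0.0;
    for _ in 0..count {
        let reading = source.sample();
        cpu_total += reading.cpu_percent;
        memory_total += reading.memory_mb;
    }
    let n = f64::from(count);
    Some(Sample {
        cpu_percent: cpu_total / n,
        memory_mb: memory_total / n,
    })
}

fn main() {
    let mut source = ProfileSource::for_device("pi4");
    if let Some(avg) = average(&mut source, 5) {
        println!("{:?}", avg);
    }
}
```

Because `average` only depends on the trait, a future host-backed source could replace `ProfileSource` without touching the averaging, verdict, or JSON code.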

What This Does Not Prove

Benchmark tools can easily overclaim, so the boundary matters.

RustPlayerBenchAI does not currently prove:

  • Real device CPU usage under production content.
  • GPU performance.
  • Network stability.
  • Browser memory leaks.
  • Playback smoothness.
  • Thermal throttling behavior.

It does provide a clear report format and threshold model for the next version of that work.

Key Takeaways

  • Benchmark reports should be explicit about what they measure and what they do not.
  • Profile-based samples are useful for designing report contracts before real host collection.
  • Configurable thresholds make pass, warn, and fail verdicts match the environment.
  • JSON output turns benchmark results into release or QA artifacts.
  • A useful benchmark CLI starts with a stable, boring report shape.

RustPlayerBenchAI is a small step toward more practical player checks. Today it gives deterministic benchmark reports. The next useful step is connecting the same report model to real device metrics.


Sudarshan Chaudhari

AI Systems Builder / Product Engineer

Bangkok, Thailand

Solo Android developer with 13+ years in QA, building Android apps, AI automation systems, and developer tools at SudarshanTechLabs.
