Skip to content

bench-harness: field-collector and enforcement follow-ups #18

@humancto

Description

@humancto

Tracked follow-up nits from rust-expert review of PR #17 (bench oracle harness scaffold).

Five categories; each is a small issue the first Phase 2 bench PR will touch on real hardware. None block shipping 0.10.

1. Field-collector edge cases (benches/runner/hardware-signature.sh)

  • Empty-value silent malformation. _field_scheduler and _field_storage have code paths that can return an empty string (e.g., LVM on Linux, Apple Silicon APFS). An empty collector output produces key= in the signature line, which the shape regex rejects — but the signature is printed anyway. Add [ -z "$v" ] && v=unknown at the exit of every collector.
  • set -o pipefail aborts the signature. _field_cpu_mhz_max runs lscpu | awk … inside $( ). If lscpu fails non-zero under pipefail, the whole script exits. Wrap the producer: v=$( { lscpu 2>/dev/null || true; } | awk … ).
  • macOS storage always unknown. On Apple Silicon, df / yields an APFS volume like /dev/disk3s1s1 and diskutil info on that volume does not include "Device / Media Name" (a whole-disk property). Resolve volume → whole-disk first: diskutil info <vol> | awk '/Part of Whole/ {print $4}', then diskutil info disk3.
  • Cross-platform cores asymmetry. Linux nproc returns cgroup-effective count; macOS hw.physicalcpu returns hardware. Same host in a cgroup-limited container vs. not produces different sha. Use nproc --all or getconf _NPROCESSORS_CONF for physical parity.

2. Turbo detection on AMD

_field_turbo only reads /sys/devices/system/cpu/intel_pstate/no_turbo. On AMD hosts that path doesn't exist and turbo silently reports unknown. Read /sys/devices/system/cpu/cpufreq/boost (byte 0/1) on AMD.

3. Missing test cases (scripts/test-hardware-signature.sh)

  • A test that injects an empty-value line into the canonicalize pipeline and asserts the shape regex rejects it, proving the collectors must never return empty.
  • A test that shadows lscpu with a stub exiting non-zero and asserts the script still emits a well-formed signature (no pipefail abort).

4. TSC enforcement consistency

HARDWARE.md says "invariant TSC required for Tier 1" but nothing enforces it — tsc=variable with BENCH_TIER=1 succeeds. Either:

  • Enforce: fail validate_tier_argv / signature emission when tsc=variable && tier=1.
  • Soften: HARDWARE.md says "REQUIRED but not enforced by tooling; operator discipline."

Pick one and make the doc and tooling agree.

5. Fetch-path coverage (benches/oracles/etcd/fetch.sh)

do_fetch — the network download plus SHA256SUMSVERSIONS cross-check — has zero test coverage on this PR. The plan defers real fetching to the first Phase 2 bench PR. But a -n / dry-run mode that exercises the cross-check logic against a fixture would be cheap and catch regressions.

Context

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions