review: serialize runtime event emission; document RTMR3/TCB verification scope#706

Merged: kvinwang merged 2 commits into master from harden/emit-event-serialize on Jun 4, 2026

@kvinwang kvinwang commented Jun 3, 2026

Summary

Two independent attestation-hardening changes, kept out of the cloud-merge PR (#701):

1. Serialize runtime event emission (concurrency fix)

emit_runtime_event appended to the event log and extended RTMR3 without synchronization. The guest-agent's emit_event RPC is reachable concurrently by multiple clients, so two concurrent calls could interleave as:

A: emit() (append log A)
B: emit() (append log B)
B: extend_rtmr(digest B)
A: extend_rtmr(digest A)

Now the on-disk log order is [A, B] but the RTMR extension order is [B, A]. An RTMR is a chained SHA384(old ‖ digest) register, so replaying the log during verification no longer reproduces the quoted rt_mr3, and the whole quote becomes unverifiable. The two separate write_all calls inside emit() can also be split by another writer, corrupting a log line.
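
The order sensitivity described above can be illustrated with a minimal sketch. It assumes a toy 64-bit register and uses std's non-cryptographic DefaultHasher as a stand-in for SHA-384; the names extend and replay are hypothetical and are not taken from dstack-attest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for the register update `new = SHA384(old ‖ digest)`.
/// DefaultHasher is NOT cryptographic; it only keeps the sketch std-only.
fn extend(old: u64, digest: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    old.hash(&mut hasher);
    digest.hash(&mut hasher);
    hasher.finish()
}

/// Replays a log of event digests, starting from an all-zero register.
fn replay(log: &[u64]) -> u64 {
    log.iter().fold(0u64, |register, &digest| extend(register, digest))
}
```

Because each step feeds the previous register value into the next hash, replay(&[a, b]) and replay(&[b, a]) produce different values for distinct a and b, which is why a log whose order differs from the extension order cannot reproduce the register.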

Fix: guard the whole critical section (log append + register extension) with a process-global Mutex, so log order always matches extension order. The function is fully synchronous (no await), so a std::sync::Mutex is appropriate.
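
A minimal sketch of this locking pattern follows, under stated assumptions: the event log and RTMR3 are replaced by in-memory stand-ins owned by the mutex, the hash is std's non-cryptographic DefaultHasher rather than SHA-384, and every name (State, EMIT_LOCK, emit_event_sketch, replay_matches_register) is hypothetical. The actual change guards a file append and a register extension, not this toy state.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical in-memory stand-ins for the on-disk event log and RTMR3.
struct State {
    log: Vec<u64>,
    rtmr3: u64,
}

/// Process-global lock covering both the log append and the extension.
static EMIT_LOCK: Mutex<State> = Mutex::new(State { log: Vec::new(), rtmr3: 0 });

/// Toy stand-in for `SHA384(old ‖ digest)`; not cryptographic.
fn extend(old: u64, digest: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    old.hash(&mut hasher);
    digest.hash(&mut hasher);
    hasher.finish()
}

/// Appends to the log and extends the register while holding one lock,
/// so no other caller can slip in between the two steps.
fn emit_event_sketch(digest: u64) {
    let mut state = EMIT_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    state.log.push(digest);
    state.rtmr3 = extend(state.rtmr3, digest);
}

/// Replays the log from zero and compares it with the live register.
fn replay_matches_register() -> bool {
    let state = EMIT_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    let replayed = state.log.iter().fold(0u64, |reg, &d| extend(reg, d));
    replayed == state.rtmr3
}
```

Holding a single guard across both steps is what keeps the two orders identical; taking the lock separately for the append and for the extension would reintroduce the interleaving shown earlier.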

2. Document two intentional verification scoping decisions

Added a "Verification Design Notes" section to docs/security/security-model.md:

  • Why only RTMR3 is replayed from an event log: RTMR0-2/MRTD come straight from the signed quote and are checked against offline-reproduced expected values, so their boot event log has no consumer. The embedded log is stripped to imr == 3 at the source (into_stripped).
  • Why TCB status is surfaced, not gated: validate_tcb enforces only hard invariants (debug off, SEAM measurements); the status string is passed through for downstream policy to judge OutOfDate etc. Only Revoked is rejected outright, by dcap-qvl's is_valid(). The section also notes a planned grace-period refactor.
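
The two scoping decisions above can be sketched roughly as follows. This is an illustration under assumptions, not the dstack-attest or dcap-qvl API: the types Event and VerifiedReport, their fields, and the functions strip_to_rtmr3, validate_hard_invariants and downstream_accepts are all hypothetical.

```rust
/// Hypothetical event-log entry; `imr` is the index of the extended register.
struct Event {
    imr: u32,
    digest: u64,
}

/// Keeps only RTMR3 events, mirroring the "stripped to imr == 3" idea.
fn strip_to_rtmr3(events: Vec<Event>) -> Vec<Event> {
    events.into_iter().filter(|e| e.imr == 3).collect()
}

/// Hypothetical summary of a verified quote; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct VerifiedReport {
    debug_enabled: bool,
    seam_measurements_ok: bool,
    tcb_status: String,
}

#[derive(Debug, PartialEq)]
enum HardCheckError {
    DebugEnabled,
    BadSeamMeasurement,
}

/// Enforces hard invariants only and hands the status string back
/// untouched; it never decides whether a given status is acceptable.
fn validate_hard_invariants(report: &VerifiedReport) -> Result<&str, HardCheckError> {
    if report.debug_enabled {
        return Err(HardCheckError::DebugEnabled);
    }
    if !report.seam_measurements_ok {
        return Err(HardCheckError::BadSeamMeasurement);
    }
    Ok(&report.tcb_status)
}

/// Example downstream policy: the caller chooses which statuses it accepts.
fn downstream_accepts(status: &str, allowed: &[&str]) -> bool {
    allowed.iter().any(|a| *a == status)
}
```

In this sketch a report with status OutOfDate passes the hard checks, and whether it is accepted depends entirely on the allow-list the downstream caller supplies.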

Testing

  • cargo check -p dstack-attest
  • cargo clippy -p dstack-attest (clean)

…tion scope

emit_runtime_event appended to the event log and extended RTMR3 without
synchronization. Concurrent emit_event RPCs could interleave their log writes
and extend_rtmr calls, making the on-disk log order diverge from the RTMR
extension order and breaking RTMR3 replay during quote verification. Guard the
whole critical section with a process-global mutex so log order always matches
extension order.

Also document two intentional verification scoping decisions in the security
model: why only RTMR3 is replayed from an event log (boot-time RTMR0-2 events
have no downstream consumer and are stripped at the source), and why TCB status
is surfaced rather than gated (left to downstream policy; only Revoked is
rejected, by dcap-qvl).
Copilot AI review requested due to automatic review settings June 3, 2026 15:19

Copilot AI left a comment

Pull request overview

This PR hardens TDX attestation verification reliability by preventing concurrent runtime-event emission from producing an event log ordering that cannot reproduce the quoted rt_mr3, and it documents intentional design scope choices in quote verification (RTMR3-only replay and surfacing—rather than gating on—TCB status).

Changes:

  • Serialize runtime event emission by guarding the “append event log + extend RTMR3” critical section with a process-global mutex.
  • Add “Verification Design Notes” to the security model documentation explaining RTMR3-only replay and TCB-status handling.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| dstack-attest/src/lib.rs | Adds a process-global mutex to ensure runtime event log order matches RTMR3 extension order under concurrency. |
| docs/security/security-model.md | Documents why only RTMR3 is replay-verified and why TCB status is surfaced rather than used as a hard gate. |

Comment thread docs/security/security-model.md Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@kvinwang kvinwang enabled auto-merge June 4, 2026 00:09
@kvinwang kvinwang merged commit c5046ab into master Jun 4, 2026
15 checks passed