feat: add attestation aggregate coverage metrics #386
Merged: MegaRedHand merged 10 commits into main from add-attestation-aggregate-coverage-metrics on May 27, 2026
Commits:
6622da5 feat: add attestation aggregate coverage metrics (pablodeymo)
4c2e65f feat: emit attestation aggregate coverage metrics (pablodeymo)
453d312 refactor(blockchain): inline coverage emitters into lib.rs (pablodeymo)
a9c4ea5 refactor(blockchain): drop always-zero proposal_payloads/proposal_gos… (pablodeymo)
168036f Merge remote-tracking branch 'origin/main' into add-attestation-aggre… (pablodeymo)
04569ee fix(blockchain): correct attestation aggregate coverage slot keying (pablodeymo)
6d88001 Move attestation coverage snapshot state out of the storage crate (pablodeymo)
c751426 refactor(blockchain): move attestation coverage emission into its own… (pablodeymo)
c82c548 Merge branch 'main' into add-attestation-aggregate-coverage-metrics (MegaRedHand)
6fcefec fix(blockchain): correct attestation coverage emission per review (pablodeymo)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,194 @@ | ||
| //! Attestation aggregate coverage emission. | ||
| //! | ||
| //! Pure observability — nothing here feeds back into fork choice or the state | ||
| //! transition. The emitters build `Vec<bool>` locals (`seen` for validators, | ||
| //! `has_subnet` for subnets, with subnet = `vid % committee_count`, matching | ||
| //! the gossip subnet assignment in `crates/net/p2p/src/lib.rs`) and push the | ||
| //! resulting counts to the coverage gauges registered in | ||
| //! [`crate::metrics`]. | ||
| | ||
| use ethlambda_storage::Store; | ||
| use ethlambda_types::attestation::{AggregatedAttestation, AggregationBits, validator_indices}; | ||
| | ||
| use crate::metrics; | ||
| | ||
| /// Pre-merge snapshot of `new_payloads` participant bits, used by the | ||
| /// attestation aggregate coverage report. | ||
| /// | ||
| /// Each entry is tagged with its attestation `data.slot` (the voting round) so | ||
| /// the consumer can filter to a single round at emit time — `new_payloads` may | ||
| /// hold entries spanning more than one slot. Holds raw participant bits; the | ||
| /// consumer constructs coverage bitsets at emit time using the current | ||
| /// validator and committee counts. | ||
| #[derive(Debug, Clone)] | ||
| pub(crate) struct CoverageSnapshot { | ||
| pub(crate) entries: Vec<(u64, AggregationBits)>, | ||
| } | ||
| | ||
| /// Capture the participant bits of every entry in `new_payloads` for the | ||
| /// attestation aggregate coverage report. Each entry is tagged with its | ||
| /// attestation `data.slot` so the post-block report can filter to a single | ||
| /// voting round (`new_payloads` may span multiple slots). | ||
| /// | ||
| /// Returns `None` when `new_payloads` is empty so callers can keep their last | ||
| /// non-empty snapshot rather than overwriting it with nothing — a node that | ||
| /// missed a round still reports the round it last saw. | ||
| pub(crate) fn snapshot_new_payloads(store: &Store) -> Option<CoverageSnapshot> { | ||
| let entries = store.new_aggregated_payload_participants(); | ||
| if entries.is_empty() { | ||
| return None; | ||
| } | ||
| Some(CoverageSnapshot { entries }) | ||
| } | ||
| | ||
| fn cov_add(seen: &mut [bool], has_subnet: &mut [bool], bits: &AggregationBits) { | ||
| let cc = has_subnet.len(); | ||
| if cc == 0 { | ||
| return; | ||
| } | ||
| for vid in validator_indices(bits) { | ||
| let vid = vid as usize; | ||
| if vid < seen.len() { | ||
| seen[vid] = true; | ||
| has_subnet[vid % cc] = true; | ||
| } | ||
| } | ||
| } | ||
| | ||
| fn cov_record(section: &str, seen: &[bool], has_subnet: &[bool]) { | ||
| metrics::set_attestation_aggregate_coverage_validators( | ||
| section, | ||
| "combined", | ||
| seen.iter().filter(|&&b| b).count() as i64, | ||
| ); | ||
| metrics::set_attestation_aggregate_coverage_subnets( | ||
| section, | ||
| has_subnet.iter().filter(|&&b| b).count() as i64, | ||
| ); | ||
| } | ||
| | ||
| fn or_into(dst: &mut [bool], src: &[bool]) { | ||
| for (d, &s) in dst.iter_mut().zip(src) { | ||
| *d |= s; | ||
| } | ||
| } | ||
| | ||
| /// Post-block coverage report for `reporting_slot`. Emits `timely` / `late` / | ||
| /// `block` / `combined` sections plus the `diff_validators` symmetric | ||
| /// difference between `block` and `timely`. Called at interval 1 of the | ||
| /// next slot. | ||
| pub(crate) fn emit_post_block_coverage( | ||
| store: &Store, | ||
| pre_merge_coverage: Option<&CoverageSnapshot>, | ||
| committee_count: u64, | ||
| reporting_slot: u64, | ||
| ) { | ||
| let validator_count = store.head_state().validators.len(); | ||
| if validator_count == 0 || committee_count == 0 { | ||
| return; | ||
| } | ||
| let cc = committee_count as usize; | ||
| let (mut timely_v, mut timely_s) = (vec![false; validator_count], vec![false; cc]); | ||
| let (mut late_v, mut late_s) = (vec![false; validator_count], vec![false; cc]); | ||
| let (mut block_v, mut block_s) = (vec![false; validator_count], vec![false; cc]); | ||
| | ||
| // Every section is the same cohort: validators whose attestations *for* | ||
| // `reporting_slot` (`data.slot == reporting_slot`) were seen via that | ||
| // channel. | ||
| | ||
| // `timely`: pre-merge snapshot of `new_payloads`, filtered to this round. | ||
| if let Some(snap) = pre_merge_coverage { | ||
| for (data_slot, bits) in &snap.entries { | ||
| if *data_slot == reporting_slot { | ||
| cov_add(&mut timely_v, &mut timely_s, bits); | ||
| } | ||
| } | ||
| } | ||
| // `late`: current `new_payloads` for this round (arrived after the promote). | ||
| for (data_slot, bits) in store.new_aggregated_payload_participants() { | ||
| if data_slot == reporting_slot { | ||
| cov_add(&mut late_v, &mut late_s, &bits); | ||
| } | ||
| } | ||
| // `block`: attestations included in the canonical head block. At interval 1 | ||
| // the head is normally the block proposed at `reporting_slot + 1`, which | ||
| // carries this round's votes; filter by `data.slot` so we count the same | ||
| // cohort even if the head is at a different slot. | ||
| if let Some(block) = store.get_block(&store.head()) { | ||
| for att in block.body.attestations.iter() { | ||
| if att.data.slot == reporting_slot { | ||
| cov_add(&mut block_v, &mut block_s, &att.aggregation_bits); | ||
| } | ||
| } | ||
| } | ||
| | ||
| let mut combined_v = timely_v.clone(); | ||
| let mut combined_s = timely_s.clone(); | ||
| or_into(&mut combined_v, &late_v); | ||
| or_into(&mut combined_s, &late_s); | ||
| or_into(&mut combined_v, &block_v); | ||
| or_into(&mut combined_s, &block_s); | ||
| | ||
| // Only report a round once the canonical head block actually carries its | ||
| // votes (`block_v` non-empty). Gating on `combined` instead would still | ||
| // fire on a missed slot — the `timely` snapshot for the round is populated | ||
| // while `block_v` is all-false — pushing exactly the misleading | ||
| // `block_only=0, timely_only=N` the diff is meant to avoid. When there is | ||
| // no block for the round the gauges retain their previous value. | ||
| if !block_v.iter().any(|&b| b) { | ||
| return; | ||
| } | ||
| | ||
| cov_record("timely", &timely_v, &timely_s); | ||
| cov_record("late", &late_v, &late_s); | ||
| cov_record("block", &block_v, &block_s); | ||
| cov_record("combined", &combined_v, &combined_s); | ||
| | ||
| let (block_only, timely_only) = | ||
| block_v | ||
| .iter() | ||
| .zip(timely_v.iter()) | ||
| .fold((0i64, 0i64), |(b, t), (bv, tv)| match (bv, tv) { | ||
| (true, false) => (b + 1, t), | ||
| (false, true) => (b, t + 1), | ||
| _ => (b, t), | ||
| }); | ||
| metrics::set_attestation_aggregate_coverage_diff_validators("block_only", block_only); | ||
| metrics::set_attestation_aggregate_coverage_diff_validators("timely_only", timely_only); | ||
| } | ||
| | ||
| /// `agg_start_new` coverage from `new_payloads`, called right before fork- | ||
| /// choice aggregation runs at interval 2. | ||
| pub(crate) fn emit_agg_start_new_coverage(store: &Store, committee_count: u64) { | ||
| let validator_count = store.head_state().validators.len(); | ||
| if validator_count == 0 || committee_count == 0 { | ||
| return; | ||
| } | ||
| let cc = committee_count as usize; | ||
| let mut seen = vec![false; validator_count]; | ||
| let mut has_subnet = vec![false; cc]; | ||
| for (_slot, bits) in store.new_aggregated_payload_participants() { | ||
| cov_add(&mut seen, &mut has_subnet, &bits); | ||
| } | ||
| cov_record("agg_start_new", &seen, &has_subnet); | ||
| } | ||
| | ||
| /// `proposal_combined` coverage for a block we are about to publish: the full | ||
| /// set of validators included across the block's aggregated attestations. | ||
| pub(crate) fn emit_proposal_coverage<'a>( | ||
| store: &Store, | ||
| committee_count: u64, | ||
| selected: impl IntoIterator<Item = &'a AggregatedAttestation>, | ||
| ) { | ||
| let validator_count = store.head_state().validators.len(); | ||
| if validator_count == 0 || committee_count == 0 { | ||
| return; | ||
| } | ||
| let cc = committee_count as usize; | ||
| let mut combined_v = vec![false; validator_count]; | ||
| let mut combined_s = vec![false; cc]; | ||
| for att in selected { | ||
| cov_add(&mut combined_v, &mut combined_s, &att.aggregation_bits); | ||
| } | ||
| cov_record("proposal_combined", &combined_v, &combined_s); | ||
| } | ||
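
The coverage bookkeeping in this file can be tried in isolation. The following is a minimal sketch, not the PR's code: it assumes validator indices arrive as a plain `&[u64]` slice instead of the crate's `AggregationBits`, and the names `mark_coverage`, `count_set`, and `diff_counts` are hypothetical stand-ins for `cov_add`, the counting inside `cov_record`, and the `block_only` / `timely_only` fold.

```rust
/// Mark each validator index as seen and flag its subnet.
/// Subnet assignment is `vid % committee_count`, as described in the module docs.
/// Assumption: indices are a plain slice here, not `AggregationBits`.
fn mark_coverage(seen: &mut [bool], has_subnet: &mut [bool], indices: &[u64]) {
    let cc = has_subnet.len();
    if cc == 0 {
        return;
    }
    for &vid in indices {
        let vid = vid as usize;
        // Indices beyond the validator set are ignored, mirroring the bounds check in the diff.
        if vid < seen.len() {
            seen[vid] = true;
            has_subnet[vid % cc] = true;
        }
    }
}

/// Number of `true` entries in a coverage bitset.
fn count_set(bits: &[bool]) -> i64 {
    bits.iter().filter(|&&b| b).count() as i64
}

/// Symmetric difference between two bitsets, returned as (block_only, timely_only).
fn diff_counts(block: &[bool], timely: &[bool]) -> (i64, i64) {
    block
        .iter()
        .zip(timely)
        .fold((0i64, 0i64), |(b, t), (&bv, &tv)| match (bv, tv) {
            (true, false) => (b + 1, t),
            (false, true) => (b, t + 1),
            _ => (b, t),
        })
}
```

With 8 validators and 4 subnets, marking `[0, 1, 5]` as timely and `[1, 2, 9]` as in-block leaves index 9 ignored as out of range, and the diff counts one block-only validator (2) and two timely-only validators (0 and 5).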
Finding 5 (Low, doc/schema): per-subnet `subnet=subnet_N` series are never written. `cov_record` only ever calls `set_attestation_aggregate_coverage_validators(section, "combined", ...)`. The `has_subnet` bitsets are computed but only feed the subnet-count gauge, never the per-subnet validator series. Yet the `lean_attestation_aggregate_coverage_validators` help text advertises "subnet=subnet_N is per-subnet coverage", and the gauge carries a `subnet` label dimension for it.

As written, querying `subnet=subnet_N` returns nothing; the schema overstates what's emitted. Either wire up per-subnet emission or trim the help text/label so dashboards don't expect a series that never appears.
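
For the first option, one possible shape of per-subnet emission is sketched below. This is an illustration under assumptions, not the fix that was merged: `per_subnet_counts` and `per_subnet_series` are hypothetical helpers, and the sketch returns label/count pairs instead of calling the metrics setter, since the diff does not show whether the setter accepts arbitrary subnet label strings.

```rust
/// Count covered validators per subnet, with subnet = vid % committee_count.
/// Hypothetical helper; `seen` is the per-validator coverage bitset.
fn per_subnet_counts(seen: &[bool], committee_count: usize) -> Vec<i64> {
    let mut counts = vec![0i64; committee_count];
    if committee_count == 0 {
        return counts;
    }
    for (vid, &covered) in seen.iter().enumerate() {
        if covered {
            counts[vid % committee_count] += 1;
        }
    }
    counts
}

/// Pair each count with a `subnet_N` label, matching the naming in the help text.
fn per_subnet_series(seen: &[bool], committee_count: usize) -> Vec<(String, i64)> {
    per_subnet_counts(seen, committee_count)
        .into_iter()
        .enumerate()
        .map(|(n, count)| (format!("subnet_{}", n), count))
        .collect()
}
```

Inside `cov_record`, each pair could then be passed to `set_attestation_aggregate_coverage_validators(section, &label, count)`, assuming that setter takes the subnet label as a string slice in the same position where `"combined"` is passed today.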