The pct filter in render.py takes a value and appends % without multiplying by 100:
def filter_pct(value, decimals=1) -> str:
"""Format a pre-computed percentage: 52.2 → '52.2%'"""
return f"{float(value):.{decimals}f}%"
The docstring says "pre-computed percentage", but the filter name pct strongly implies "give me a fraction and format it as a percent" — which is what most analysts (and Claude, when extending a paper) reach for first. There is no validation, range check, or warning.
The failure mode is silent and looks plausible. A fact stored as a fraction 0.923 renders as 0.9% instead of 92.3%. The output keeps a % sign and looks fine on inspection, so the bug only shows up when someone notices the wrong order of magnitude. We hit this in the clinvar-hidden-structures paper, where the abstract reported "0.9% of laboratories reclassified zero" instead of "92.3%". Not a number you want in a journal submission.
Why it's particularly dangerous in this codebase
- The
pct name invites use on fractions; the contract requires already-multiplied percents. The two are inconsistent.
- The same paper can mix conventions — some fact CSVs store percent values (
indep_major_disc_pct = 2.44), others store fractions (frac_active_reclass_zero = 0.923) — and pct works "correctly" on the first set and silently wrong on the second.
StrictUndefined catches missing fields but not misinterpreted ones. A renderer warning that fires when pct receives a value in [0, 1) and the result is < 1% would catch most real bugs.
Suggested resolutions (any one would have prevented our bug)
- Drop
pct as redundant. Document the convention "scripts emit pre-computed percent values; templates use {{ x | dp(1) }}%". dp already exists; the literal % is one character. No filter needed.
- Rename to disambiguate.
pct_already for the current behaviour; add frac_to_pct that multiplies by 100.
- Add a sanity warning. If
pct receives 0 < value < 1, emit a render-time warning to stderr — most legitimate percent values are ≥ 1, and a 0.x value is almost always a fraction passed in by mistake.
Happy to send a PR for (1) or (3) if there's a preferred direction.
🤖 Written by Claude
The
pctfilter inrender.pytakes a value and appends%without multiplying by 100:The docstring says "pre-computed percentage", but the filter name
pctstrongly implies "give me a fraction and format it as a percent" — which is what most analysts (and Claude, when extending a paper) reach for first. There is no validation, range check, or warning.The failure mode is silent and looks plausible. A fact stored as a fraction
0.923renders as0.9%instead of92.3%. The output keeps a%sign and looks fine on inspection, so the bug only shows up when someone notices the wrong order of magnitude. We hit this in the clinvar-hidden-structures paper, where the abstract reported "0.9% of laboratories reclassified zero" instead of "92.3%". Not a number you want in a journal submission.Why it's particularly dangerous in this codebase
pctname invites use on fractions; the contract requires already-multiplied percents. The two are inconsistent.indep_major_disc_pct = 2.44), others store fractions (frac_active_reclass_zero = 0.923) — andpctworks "correctly" on the first set and silently wrong on the second.StrictUndefinedcatches missing fields but not misinterpreted ones. A renderer warning that fires whenpctreceives a value in [0, 1) and the result is < 1% would catch most real bugs.Suggested resolutions (any one would have prevented our bug)
pctas redundant. Document the convention "scripts emit pre-computed percent values; templates use{{ x | dp(1) }}%".dpalready exists; the literal%is one character. No filter needed.pct_alreadyfor the current behaviour; addfrac_to_pctthat multiplies by 100.pctreceives0 < value < 1, emit a render-time warning to stderr — most legitimate percent values are ≥ 1, and a 0.x value is almost always a fraction passed in by mistake.Happy to send a PR for (1) or (3) if there's a preferred direction.
🤖 Written by Claude