Skip to content

pct filter silently corrupts output when given a fraction instead of a pre-multiplied percent #4

@davmlaw

Description

@davmlaw

The pct filter in render.py takes a value and appends % without multiplying by 100:

def filter_pct(value, decimals=1) -> str:
    """Format a pre-computed percentage: 52.2 → '52.2%'"""
    return f"{float(value):.{decimals}f}%"

The docstring says "pre-computed percentage", but the filter name pct strongly implies "give me a fraction and format it as a percent" — which is what most analysts (and Claude, when extending a paper) reach for first. There is no validation, range check, or warning.

The failure mode is silent and looks plausible. A fact stored as a fraction 0.923 renders as 0.9% instead of 92.3%. The output keeps a % sign and looks fine on inspection, so the bug only shows up when someone notices the wrong order of magnitude. We hit this in the clinvar-hidden-structures paper, where the abstract reported "0.9% of laboratories reclassified zero" instead of "92.3%". Not a number you want in a journal submission.

Why it's particularly dangerous in this codebase

  1. The pct name invites use on fractions; the contract requires already-multiplied percents. The two are inconsistent.
  2. The same paper can mix conventions — some fact CSVs store percent values (indep_major_disc_pct = 2.44), others store fractions (frac_active_reclass_zero = 0.923) — and pct works "correctly" on the first set and silently wrong on the second.
  3. StrictUndefined catches missing fields but not misinterpreted ones. A renderer warning that fires when pct receives a value in [0, 1) and the result is < 1% would catch most real bugs.

Suggested resolutions (any one would have prevented our bug)

  1. Drop pct as redundant. Document the convention "scripts emit pre-computed percent values; templates use {{ x | dp(1) }}%". dp already exists; the literal % is one character. No filter needed.
  2. Rename to disambiguate. pct_already for the current behaviour; add frac_to_pct that multiplies by 100.
  3. Add a sanity warning. If pct receives 0 < value < 1, emit a render-time warning to stderr — most legitimate percent values are ≥ 1, and a 0.x value is almost always a fraction passed in by mistake.

Happy to send a PR for (1) or (3) if there's a preferred direction.

🤖 Written by Claude

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions