Ensure gnomAD linkage for variants with ClinGen Canonical Allele ID beginning with CA0 #722

@jstone-dev

It appears that Illumina's Hail tables of gnomAD data drop leading zeros from ClinGen Canonical Allele IDs (CAIDs). For instance, CA025094 is recorded in Illumina's table (or at least in the version of it that has been transformed and stored on S3 to be queried by Athena) as CA25094.

Since MaveDB uses CAIDs to annotate mapped variants with gnomAD minor allele frequencies, it seems possible that some annotations will be missed if we're not stripping the leading zero. Let's check on this before rolling out gnomAD features.

Example variant
URN: urn:mavedb:00001224-a-1#1
ClinGen allele ID: CA025094
gnomAD data: https://gnomad.broadinstitute.org/variant/13-32356440-G-A?dataset=gnomad_r4
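
A minimal sketch of the kind of normalization we might need before looking up a CAID in the Athena table. This is not existing MaveDB code: the function name `normalize_caid` is hypothetical, and it assumes that the table always stores the numeric part without zero padding, which has only been observed for the example above.

```python
import re

# Assumption: a CAID is "CA" followed by one or more ASCII digits.
_CAID_PATTERN = re.compile(r"CA([0-9]+)")


def normalize_caid(caid: str) -> str:
    """Return the CAID with leading zeros removed from its numeric part.

    Hypothetical helper: mirrors how the gnomAD table appears to record
    CAIDs (e.g. CA025094 stored as CA25094).
    """
    match = _CAID_PATTERN.fullmatch(caid.strip())
    if match is None:
        raise ValueError(f"Not a recognized CAID: {caid!r}")
    # int() drops any leading zeros from the numeric part.
    return f"CA{int(match.group(1))}"


print(normalize_caid("CA025094"))  # CA25094
print(normalize_caid("CA25094"))   # CA25094 (already normalized)
```

If the check confirms the problem, applying something like this on the MaveDB side at query time would leave the stored ClinGen allele IDs untouched while still matching the table's keys.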

Labels: app: worker (Task implementation touches the worker)
