feat: Target Gene Mapping Table #719
Merged
bencap merged 4 commits into release-2026.2.0 from May 6, 2026
API support for VariantEffect/dcd_mapping2#97
…arated concerns

Move VRSMap client code, type schemas, metadata utilities, and constants into separate modules within a mapping package. Maintain backward compatibility through re-exports in `__init__.py` so existing imports continue to work without changes.

Co-authored-by: Copilot <copilot@github.com>
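The re-export pattern this commit describes can be sketched as follows. This is a minimal illustration, not the project's actual code: the package name `mapping_demo`, the class `VRSMapClient`, and the constant `DEFAULT_TIMEOUT_SECONDS` are hypothetical stand-ins, and the package is written to a temporary directory only so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout; all names here are assumptions for illustration.
FILES = {
    "client.py": "class VRSMapClient:\n    pass\n",
    "constants.py": "DEFAULT_TIMEOUT_SECONDS = 30\n",
    # __init__.py re-exports the public names so callers that import from
    # the package root do not need to know about the submodule split.
    "__init__.py": (
        "from .client import VRSMapClient\n"
        "from .constants import DEFAULT_TIMEOUT_SECONDS\n"
        "__all__ = ['VRSMapClient', 'DEFAULT_TIMEOUT_SECONDS']\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "mapping_demo"
    package_dir.mkdir()
    for name, source in FILES.items():
        (package_dir / name).write_text(source)
    sys.path.insert(0, tmp)
    try:
        mapping_demo = importlib.import_module("mapping_demo")
    finally:
        sys.path.remove(tmp)

# The pre-refactor import style still resolves through the package root.
from mapping_demo import VRSMapClient, DEFAULT_TIMEOUT_SECONDS
```

Listing the re-exported names in `__all__` also documents which symbols are considered part of the public API.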
…) QC and provenance

Add a new `target_gene_mappings` table that records alignment QC and provenance for each (target gene, annotation layer) pair produced by dcd-mapping. Replaces flat QC fields on `mapped_variants` with a normalized FK relationship.

- Add `TargetGeneMapping` model, view model, and `AnnotationLayer` enum
- Extend `MappedVariant` with `target_gene_mapping_id`, `alignment_level`, `at_mismatched_locus`, and `near_gap` columns
- Update mapping worker to persist `TargetGeneMapping` rows and link variants
- Add Alembic migration (`8c4a2f1d9e6b`) for schema changes
- Add manual backfill script to populate new columns for existing mapped variants
- Drop `variants_failed_pre_layer_selection` and `variants_with_mapping_warnings` QC counts from the schema (not recoverable for existing data)

Co-authored-by: Copilot <copilot@github.com>
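To make the normalized FK relationship concrete, here is a rough sketch using the standard-library `sqlite3` module rather than the project's SQLAlchemy models and Alembic migration. Only the table names and the four new `mapped_variants` columns come from the commit message; `target_gene_id`, `percent_identity`, the uniqueness constraint, and the `"genomic"` value are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE target_gene_mappings (
        id INTEGER PRIMARY KEY,
        target_gene_id INTEGER NOT NULL,   -- assumed column
        alignment_level TEXT NOT NULL,
        percent_identity REAL,             -- assumed QC field
        UNIQUE (target_gene_id, alignment_level)
    );
    CREATE TABLE mapped_variants (
        id INTEGER PRIMARY KEY,
        target_gene_mapping_id INTEGER REFERENCES target_gene_mappings (id),
        alignment_level TEXT,
        at_mismatched_locus INTEGER,       -- stored as 0/1
        near_gap INTEGER                   -- stored as 0/1
    );
    """
)

# One QC/provenance row per (target gene, level) pair ...
mapping_id = conn.execute(
    "INSERT INTO target_gene_mappings (target_gene_id, alignment_level, percent_identity) "
    "VALUES (?, ?, ?)",
    (1, "genomic", 99.2),
).lastrowid

# ... shared by every variant mapped through it.
conn.executemany(
    "INSERT INTO mapped_variants "
    "(target_gene_mapping_id, alignment_level, at_mismatched_locus, near_gap) "
    "VALUES (?, ?, ?, ?)",
    [(mapping_id, "genomic", 0, 0), (mapping_id, "genomic", 0, 1)],
)

rows = conn.execute(
    """
    SELECT mv.id, mv.near_gap, tgm.percent_identity
    FROM mapped_variants AS mv
    JOIN target_gene_mappings AS tgm ON tgm.id = mv.target_gene_mapping_id
    ORDER BY mv.id
    """
).fetchall()


def insert_orphan_variant() -> bool:
    """Return True if the database rejects a variant pointing at a missing mapping."""
    try:
        conn.execute(
            "INSERT INTO mapped_variants (target_gene_mapping_id) VALUES (?)", (999,)
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Storing QC once per mapping and joining from variants avoids repeating the same alignment fields on every `mapped_variants` row.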
…d fallback

Replace `.get()` default parameter with `or` chaining to satisfy type checking and add UUID fallback for cases where correlation_id is unavailable in both pipeline_params and logging context. This improves type safety and ensures all pipelines have a correlation_id for better traceability in logs and external systems.
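A minimal sketch of the `or`-chaining approach described in this commit might look like the following. The function name `resolve_correlation_id` and the plain-dict shape of both arguments are assumptions; the real worker code may obtain the logging context differently.

```python
import uuid
from typing import Any


def resolve_correlation_id(
    pipeline_params: dict[str, Any], logging_context: dict[str, Any]
) -> str:
    """Pick the first usable correlation_id, generating one if none exists.

    Hypothetical helper: unlike `.get(key, default)`, the `or` chain also
    falls through when the key is present but holds None or an empty string.
    """
    return (
        pipeline_params.get("correlation_id")
        or logging_context.get("correlation_id")
        or str(uuid.uuid4())
    )


from_params = resolve_correlation_id({"correlation_id": "abc-123"}, {})
from_context = resolve_correlation_id(
    {"correlation_id": None}, {"correlation_id": "ctx-456"}
)
generated = resolve_correlation_id({}, {})
```

Because the final operand is always a non-empty string, the result is never `None`, which is what lets a type checker accept the declared `str` return type.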
Base automatically changed from feature/bencap/627/job-traceability to release-2026.2.0 May 6, 2026 18:41
This pull request introduces a new per-(target gene, alignment level) mapping QC and provenance model, refactors the mapping library for better modularity, and updates the database schema and ORM models to support richer mapping provenance and annotation. The changes enable more detailed tracking of variant mapping quality and provenance, and lay the groundwork for improved downstream analysis and data integrity.
Database schema and model enhancements:
- Added the `target_gene_mappings` table to store per-(target gene, alignment level) QC and provenance information, and extended the `mapped_variants` table with new columns (`target_gene_mapping_id`, `alignment_level`, `at_mismatched_locus`, `near_gap`) to link variants to their mapping QC and annotation details.
- Added the `TargetGeneMapping` SQLAlchemy model and established relationships from `TargetGene` and `MappedVariant` to `TargetGeneMapping` for ORM-level access to mapping QC records.
- Added the `AnnotationLayer` enum to standardize annotation layer values and provide translation from dcd-mapping wire codes.

Mapping library refactor:
- Split the mapping library into separate modules (`client.py`, `constants.py`, `metadata.py`, `schema.py`), with a new public API in `mapping/__init__.py` for backward compatibility. This modularizes the code for maintainability and clarity.

API and script updates:
- … `MappedVariantWithMappingDetails` model, exposing richer mapping QC and provenance information.

Other improvements:
- Added `target_gene_mapping` to the public model exports for easier access in other modules.

These changes collectively provide a robust foundation for tracking, querying, and analyzing variant mapping provenance and quality throughout the application.
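As an illustration of the wire-code translation mentioned for the `AnnotationLayer` enum, the sketch below shows one way such an enum could be structured. The member values, the single-letter wire codes (`"g"`, `"c"`, `"p"`), and the `from_wire_code` method name are all assumptions; the PR text does not list them.

```python
from enum import Enum


class AnnotationLayer(str, Enum):
    """Standardized annotation layer values (member values are assumed)."""

    GENOMIC = "genomic"
    CDNA = "cdna"
    PROTEIN = "protein"

    @classmethod
    def from_wire_code(cls, code: str) -> "AnnotationLayer":
        """Translate a short wire code into an enum member (hypothetical API)."""
        try:
            return _WIRE_CODE_TO_LAYER[code]
        except KeyError:
            raise ValueError(f"unknown annotation layer wire code: {code!r}") from None


# Assumed wire codes; the real dcd-mapping codes may differ.
_WIRE_CODE_TO_LAYER = {
    "g": AnnotationLayer.GENOMIC,
    "c": AnnotationLayer.CDNA,
    "p": AnnotationLayer.PROTEIN,
}

layer = AnnotationLayer.from_wire_code("g")
```

Raising on unknown codes, rather than returning a default, keeps a malformed upstream payload from being silently stored under the wrong layer.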