Release 2026.2.0 #98
Merged
bencap merged 19 commits into mavedb-main from May 8, 2026
- Add `build_ref_identical_allele` and `translate_ref_identical_to_vrs` to `lookup.py` to represent HGVS `.=` variants as VRS Alleles with a `ReferenceLengthExpression` state
- Handle `.=` variants explicitly in `_construct_vrs_allele` and `_create_post_mapped_hgvs_strings` instead of silently skipping them
- Remove `len(hgvs) == 3` guards from validity predicates and early-exit checks; the reference-identical case is now handled directly
- Fix `is_intronic_variant` to return False for position-less variants
- Skip `vrs_ref_allele_seq` annotation for `ReferenceLengthExpression` alleles to avoid redundant and expensive full-sequence lookups
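To make the first bullet concrete, here is a minimal sketch of what a reference-identical variant could look like as a VRS-style Allele with a `ReferenceLengthExpression` state. It uses plain dictionaries shaped after the VRS 2.x data model rather than the real `lookup.py` code; the helper name `sketch_ref_identical_allele`, the placeholder accession, and the choice to set `length` and `repeatSubunitLength` to the span of the location are assumptions for illustration.

```python
def sketch_ref_identical_allele(refget_accession: str, start: int, end: int) -> dict:
    """Build a VRS-style Allele dict for a reference-identical (`.=`) variant.

    Hypothetical helper, not the project's implementation.
    Coordinates are inter-residue (0-based start, exclusive end).
    """
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    span = end - start
    return {
        "type": "Allele",
        "location": {
            "type": "SequenceLocation",
            "sequenceReference": {
                "type": "SequenceReference",
                "refgetAccession": refget_accession,
            },
            "start": start,
            "end": end,
        },
        # Assumption: the state carries no literal sequence, only lengths
        # equal to the span of the location, so the allele is the reference.
        "state": {
            "type": "ReferenceLengthExpression",
            "length": span,
            "repeatSubunitLength": span,
        },
    }


# A single-residue reference-identical variant at inter-residue position 4-5.
allele = sketch_ref_identical_allele("SQ.placeholder", 4, 5)
```

Because the state stores only lengths, nothing in this representation requires fetching the reference sequence, which fits the last bullet's point about skipping the full-sequence lookup for these alleles.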
…ilure when valid hgvs nt mapping exists
- Introduce `AlignmentQc` schema capturing per-alignment BLAT quality metrics (identity, CIGAR, mismatch positions, gap intervals) with in-memory-only position lists excluded from serialization
- Add `TargetMapping` schema representing a per-(target, layer) mapping row with variant counts, tool parameters, and preferred-layer flag
- Add `VrsMapResult` NamedTuple to pair VRS mappings with their `TargetMapping` rows from `vrs_map`
- Rename `annotation_layer` → `alignment_level` on `MappedScore` / `ScoreAnnotation` to align with new terminology
- Rename `ident_pct` → `percent_identity` on `AlignmentResult`; add `score`, `next_best_score`, `alignment_qc`, `aligner_parameters`, and `reference_assembly` fields
- Implement `build_scoreset_mapping` in `annotate.py` to assemble `ScoresetMapping` with populated `target_mappings` list, per-variant locus-quality flags (`at_mismatched_locus`, `near_gap`), and reference sequence metadata
- Restore canonical BLAT PSL scoring (`matches - misMatches - qNumInsert - tNumInsert`) in `_get_best_hsp`; previous BioPython port used raw identity count, causing noisy alignments to outrank clean ones
- Update JSON schema, API router, CI workflow, and README to reflect new output shape
- Add `test_annotate_target_mapping.py` and expand `test_align.py` / `test_annotate.py` with unit tests for new logic

Co-authored-by: Copilot <copilot@github.com>
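The scoring change in the `_get_best_hsp` bullet can be illustrated with a small standalone sketch. It applies the formula exactly as quoted in the commit message (`matches - misMatches - qNumInsert - tNumInsert`) to two made-up alignments; the `PslCounts` type, the `psl_score` helper, and the numbers are hypothetical and are not taken from `align.py`.

```python
from typing import NamedTuple


class PslCounts(NamedTuple):
    """Hypothetical container for the PSL columns used by the score."""

    matches: int
    mis_matches: int
    q_num_insert: int
    t_num_insert: int


def psl_score(c: PslCounts) -> int:
    # Formula as stated in the commit message.
    return c.matches - c.mis_matches - c.q_num_insert - c.t_num_insert


# Invented example: a shorter clean hit versus a longer but noisy one.
clean = PslCounts(matches=95, mis_matches=0, q_num_insert=0, t_num_insert=0)
noisy = PslCounts(matches=100, mis_matches=10, q_num_insert=2, t_num_insert=1)

# Ranking by raw identity count prefers the noisy alignment...
best_by_identity = max([clean, noisy], key=lambda c: c.matches)
# ...while the PSL-style score penalises mismatches and inserts.
best_by_score = max([clean, noisy], key=psl_score)
```

Here the noisy alignment wins on raw matches (100 versus 95) but scores 87 against 95 once mismatches and inserts are subtracted, which is the ordering problem the commit describes.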
…alignment_qc

- protein-vs-DNA (-q=prot -t=dnax) BLAT runs store target coords in nucleotide space and query coords in amino-acid space (3:1 ratio); minus-strand target blocks have ts > te, making seq[ts:te] return "". Comparison was crashing with ValueError from zip(strict=True); the per-base mismatch loop is now skipped entirely for this mode, setting mismatch_positions_unavailable=True so at_mismatched_locus is correctly left as None (not evaluated) rather than a false False. The preferred layer for protein scoresets is PROTEIN, flagged from the downstream protein-to-protein alignment, so no signal is lost.
- For nucleotide-vs-nucleotide runs, replace the bare zip(strict=True) with an explicit length-mismatch guard that logs a WARNING and falls through to zip(strict=False), preserving all mismatches in the overlapping prefix rather than crashing or discarding the block.
…s for better matching

This is mostly useful in multi-word target names where gene information is available but not in the first word of the target name.
…and adjust related annotations

Co-authored-by: Copilot <copilot@github.com>
…-identical-vrs feat(vrs_map): add VRS mapping support for reference-identical variants
…-generated-unnecessarily Fix protein layer generation and error handling in vrs_map
…or-visibility feat(logging): improve error visibility and logging across application
…vel-mapping-metadata feat: Target level mapping metadata
- Exclude `notebooks/` directory from linting (1,299 noise errors)
- Remove deprecated `ANN101` rule from configuration
- Fix invalid noqa directive format in vrs_map.py (`:` vs `. `)
- Update deprecated rule codes: `ASYNC101` → `ASYNC221` in main.py
- Migrate `str + Enum` to `StrEnum` in schemas.py (Python 3.11+)
- Fix subprocess S603 noqa placement in align.py
- Wrap `NamedTemporaryFile` in context manager (SIM115)
- Fix operator precedence with parentheses in annotate.py (RUF021)
- Sort `__all__` exports in lookup.py and mavedb_data.py (RUF022)
- Add `# noqa: A004` for unavoidable `map` builtin shadowing
- Auto-fix pytest decorator style: `@pytest.fixture()` → `@pytest.fixture`
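For readers unfamiliar with SIM115, the `NamedTemporaryFile` item refers to opening the temporary file inside a `with` block so the handle is always closed. A minimal sketch of that pattern follows; the function name `write_query_fasta` and the FASTA content are hypothetical and do not reflect what the project writes to its temporary files.

```python
import tempfile
from pathlib import Path


def write_query_fasta(name: str, sequence: str) -> Path:
    """Write a one-record FASTA file and return its path.

    Hypothetical helper showing the context-manager form that SIM115 asks for.
    delete=False keeps the file on disk after the handle closes, so a
    subprocess could read it; the caller is responsible for removing it.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as handle:
        handle.write(f">{name}\n{sequence}\n")
        path = Path(handle.name)
    return path


fasta_path = write_query_fasta("example_target", "ACGT")
fasta_text = fasta_path.read_text()
fasta_path.unlink()  # clean up the temporary file
```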
Features
Bug Fixes
Maintenance