load_gff3 miscalculates CDS #60

@mpoelchau

We've been having trouble with load_gff3 - I initially posted this on the Apollo repo (GMOD/Apollo#2662), reposting it here now. Thanks for considering this, I'd appreciate any pointers!

We are trying to use the python-apollo arrow annotations load_gff3 command to load annotations to the user-created annotations track. The load changes the CDS locations of the model, both with and without the --disable_cds_recalculation option.

Here is what a load without --disable_cds_recalculation looks like; the correct frame can be seen in the track below.

Screenshot 2023-11-28 at 2 53 30 PM

The gff3 that was used to load the annotation has 6 CDS lines; the gff3 for the uploaded annotation has 12 CDS lines (even though the view shows only one CDS segment). Apollo also won't calculate a protein or CDS sequence on the uploaded annotation.
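For anyone who wants to reproduce the 6 versus 12 count, here is a minimal sketch of how CDS rows can be counted. It assumes standard nine-column, tab-separated GFF3; the function name `count_cds_lines` and the sample rows are made up for illustration and are not taken from the attached files.

```python
import io

def count_cds_lines(handle):
    """Count feature rows whose type column (column 3) is 'CDS'."""
    count = 0
    for line in handle:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature rows
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == "CDS":
            count += 1
    return count

# Made-up example input (hypothetical, not from before.txt or after.txt)
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\t.\tmRNA\t1\t100\t.\t+\t.\tID=m1\n"
    "chr1\t.\tCDS\t1\t40\t.\t+\t0\tParent=m1\n"
    "chr1\t.\tCDS\t60\t100\t.\t+\t2\tParent=m1\n"
)
print(count_cds_lines(example))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` object.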

Here is what a load with --disable_cds_recalculation looks like (command: arrow annotations load_gff3 --source https://apollo2-stage-node1-cbo.nal.usda.gov/apollo Anoplophora_glabripennis ~/Downloads/NW_019416298.gff3 --disable_cds_recalculation)
Screenshot 2023-11-28 at 2 50 01 PM

Again, the gff3 for the uploaded annotation in Apollo has 12 CDS lines instead of 6. Apollo also won't calculate a protein or CDS sequence on the uploaded annotation.
I'll note that if you run the same command multiple times, the single CDS will display in a different spot each time.

If you load the annotation by dragging it up, it loads correctly:

Screenshot 2023-11-28 at 2 56 20 PM

This is happening for many (but not all) annotations in multiple assemblies/organisms.

Some other observations:

  • We haven't observed the problem for models with a single CDS/exon segment
  • The underlying genomic sequence has lowercase nucleotides
  • I tried using load_legacy_gff3, and that calculated the CDS correctly, but I'm unable to delete features when I load them with that method (Hibernate operation: could not execute statement; SQL [n/a]; ERROR: update or delete on table "feature" violates foreign key constraint "fk_8jm56covt0m7m0m191bc5jseh" on table "feature_relationship" Detail: Key (id)=(4858111) is still referenced from table "feature_relationship".; nested exception is org.postgresql.util.PSQLException: ERROR: update or delete on table "feature" violates foreign key constraint "fk_8jm56covt0m7m0m191bc5jseh" on table "feature_relationship" Detail: Key (id)=(4858111) is still referenced from table "feature_relationship".)

I've attached "before" and "after" gff3s. (I used the .txt extension because GitHub wouldn't let me upload them otherwise.)
before.txt
after-nocdsrecalc.txt
after.txt
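A rough sketch of how the "before" and "after" files could be compared, in case it helps with triage. It assumes nine-column GFF3 and compares CDS rows by coordinates only, since the IDs in an Apollo export may not match the IDs in the input. The helper names `cds_intervals` and `compare_cds` and the sample rows are hypothetical.

```python
from collections import Counter

def cds_intervals(lines):
    """Count CDS rows keyed by (seqid, start, end, strand)."""
    seen = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature rows
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == "CDS":
            seen[(cols[0], int(cols[3]), int(cols[4]), cols[6])] += 1
    return seen

def compare_cds(before_lines, after_lines):
    """Report CDS intervals that were lost, introduced, or repeated."""
    before = cds_intervals(before_lines)
    after = cds_intervals(after_lines)
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "duplicated": sorted(k for k, n in after.items() if n > 1),
    }

# Made-up rows for illustration only
before = [
    "chr1\t.\tCDS\t1\t40\t.\t+\t0\tParent=m1\n",
    "chr1\t.\tCDS\t60\t100\t.\t+\t2\tParent=m1\n",
]
after = [
    "chr1\t.\tCDS\t1\t40\t.\t+\t0\tParent=x\n",
    "chr1\t.\tCDS\t1\t40\t.\t+\t0\tParent=x\n",
    "chr1\t.\tCDS\t70\t100\t.\t+\t2\tParent=x\n",
]
report = compare_cds(before, after)
for key, intervals in report.items():
    print(key, intervals)
```

Both arguments can be any iterable of lines, so open file handles for before.txt and after.txt work as well as the lists used here.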

  • Provide the JavaScript console log output generated from the action.
    None.

  • Provide the server log output generated from the action (typically catalina.out).
    Nothing is added to catalina.out when I add the annotations.
