TAFFISH wrapper for EVidenceModeler (EVM), a eukaryotic gene-structure annotation tool that combines ab initio gene predictions, protein alignments, transcript alignments, and other evidence into weighted consensus gene models.
This repository packages EVidenceModeler 2.1.0 as a TAFFISH tool app. The
published command is taf-evidencemodeler; the default in-container upstream
command is EVidenceModeler.
Install from the public TAFFISH Hub index:
taf update
taf install evidencemodeler
Install the exact release:
taf install evidencemodeler 2.1.0-r1
For local testing before the app is published to the public index:
taf install --from .
Show TAFFISH app help:
taf-evidencemodeler --help
Show upstream EVidenceModeler help:
taf-evidencemodeler EVidenceModeler --help
taf-evidencemodeler -- --help
Run the upstream bundled sample:
mkdir evm-test
cd evm-test
taf-evidencemodeler cp -a /opt/evidencemodeler/testing/. .
taf-evidencemodeler EVidenceModeler \
--sample_id smalltest \
--genome genome.fasta \
--weights weights.txt \
--gene_predictions gene_predictions.gff3 \
--protein_alignments protein_alignments.gff3 \
--transcript_alignments transcript_alignments.gff3 \
--segmentSize 100000 \
--overlapSize 10000 \
--CPU 4
The main outputs from that run are:
smalltest.EVM.gff3
smalltest.EVM.pep
smalltest.EVM.cds
smalltest.EVM.bed
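A quick way to confirm that a run finished is to check that all four output files exist and are non-empty. The following is a minimal sketch, not part of the app; the function name `check_evm_outputs` is hypothetical, and it assumes the outputs follow the `<sample_id>.EVM.<ext>` naming listed above.

```shell
# Sketch: verify that the four main EVM outputs exist and are non-empty.
# check_evm_outputs is a hypothetical helper, not shipped with the app.
# Usage: check_evm_outputs <sample_id> [output_dir]
check_evm_outputs() {
    sample_id=$1
    dir=${2:-.}
    status=0
    for ext in gff3 pep cds bed; do
        f="$dir/$sample_id.EVM.$ext"
        # -s is true only when the file exists and has a size greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

After the bundled sample run, `check_evm_outputs smalltest` should return 0 when all four files were written.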
For a real run, provide your own genome FASTA, weight file, and GFF3 evidence:
taf-evidencemodeler EVidenceModeler \
--sample_id sample1 \
--genome genome.fa \
--weights weights.txt \
--gene_predictions gene_predictions.gff3 \
--protein_alignments protein_alignments.gff3 \
--transcript_alignments transcript_alignments.gff3 \
--segmentSize 100000 \
--overlapSize 10000 \
--CPU 8
The default command is EVidenceModeler, so option-leading calls can also use
the TAFFISH -- separator:
taf-evidencemodeler -- --version
taf-evidencemodeler -- --help
Because this is a command-mode TAFFISH tool, the first non-option argument is treated as an executable inside the container. For normal EVM use, name the upstream command explicitly:
taf-evidencemodeler EVidenceModeler --help
taf-evidencemodeler create_weights_file.pl -h
taf-evidencemodeler augustus_GFF3_to_EVM_GFF3.pl augustus.gff3 > augustus.evm.gff3
taf-evidencemodeler miniprot_GFF_2_EVM_GFF3.py miniprot.gff > miniprot.evm.gff3
EVidenceModeler does not run gene predictors or aligners for you. It combines evidence that has already been generated and converted into EVM-compatible GFF3 formats.
Required inputs:
--sample_id
--genome
--weights
--gene_predictions
--segmentSize
--overlapSize
Optional but commonly used inputs:
--protein_alignments
--transcript_alignments
--repeats
--terminalExonsFile
The weights.txt file gives each evidence source a class and numeric weight.
The bundled helper can infer source names from GFF3 files:
taf-evidencemodeler create_weights_file.pl \
-A gene_predictions.gff3 \
-P protein_alignments.gff3 \
-T transcript_alignments.gff3 > weights.txt
Review and edit the generated weights before real analysis; EVM weights are a biological modeling choice, not just a file-format detail.
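When editing weights.txt by hand, a format check can catch typos before a long run. The sketch below assumes the usual three whitespace-separated columns (class, source, weight) and the class names ABINITIO_PREDICTION, OTHER_PREDICTION, PROTEIN, and TRANSCRIPT; that class list is an assumption here, so confirm it against the upstream documentation. The function name `check_weights_file` is hypothetical.

```shell
# Sketch: sanity-check an EVM weights file (class, source, weight per line).
# check_weights_file is a hypothetical helper; the accepted class names below
# are an assumption and should be confirmed against the upstream EVM docs.
check_weights_file() {
    awk '
        /^[ \t]*(#|$)/ { next }   # skip comment and blank lines
        NF != 3 {
            printf "line %d: expected 3 columns, got %d\n", NR, NF
            bad = 1
            next
        }
        $1 !~ /^(ABINITIO_PREDICTION|OTHER_PREDICTION|PROTEIN|TRANSCRIPT)$/ {
            printf "line %d: unrecognized class %s\n", NR, $1
            bad = 1
        }
        $3 !~ /^[0-9]+(\.[0-9]+)?$/ {
            printf "line %d: non-numeric weight %s\n", NR, $3
            bad = 1
        }
        END { exit bad }          # exit status 0 only if no problem was reported
    ' "$1"
}
```

`check_weights_file weights.txt` prints one line per problem and returns a non-zero status if any were found. It checks only the format; whether the weights themselves are sensible is still a judgment call.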
The image includes:
EVidenceModeler
ParaFly
EvmUtils/*.pl
EvmUtils/misc/*.pl
EvmUtils/misc/*.py
EvmUtils/misc/GFF2_toolkit/*.pl
PerlLib modules
upstream testing data
Common helpers include:
partition_EVM_inputs.pl
write_EVM_commands.pl
execute_EVM_commands.pl
recombine_EVM_partial_outputs.pl
convert_EVM_outputs_to_GFF3.pl
gff3_file_to_proteins.pl
gene_gff3_to_bed.pl
augustus_GFF3_to_EVM_GFF3.pl
braker_GTF_to_EVM_GFF3.pl
genomeThreader_to_evm_gff3.pl
miniprot_GFF_2_EVM_GFF3.py
BPbtab.pl
prepare_Jigsaw_formats.pl
gff3_genes_to_gff2.pl
The converter helpers support common upstream evidence sources such as AUGUSTUS, BRAKER, SNAP, GeneMark, GlimmerHMM, GenomeThreader, Exonerate, miniprot, MAKER, and TACO outputs. Those external tools are not bundled here; produce their outputs with dedicated TAFFISH apps or your local workflow, then feed the converted GFF3 into EVM.
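Converted evidence carries its source name in GFF3 column 2, and the weights file refers to evidence by that name, so it can help to list the sources in each converted file and compare them with weights.txt. This is a rough sketch under those assumptions; `gff3_sources` and `unweighted_sources` are hypothetical names, not bundled helpers.

```shell
# Sketch: list distinct source names (GFF3 column 2) in one or more files.
# Both functions are hypothetical helpers, not part of the app.
gff3_sources() {
    # Skip comment/directive lines and anything without the 9 GFF3 columns.
    awk -F'\t' '!/^#/ && NF >= 9 { print $2 }' "$@" | sort -u
}

# Print every GFF3 source that has no matching entry in column 2 of the
# weights file. Usage: unweighted_sources weights.txt evidence1.gff3 [...]
unweighted_sources() {
    weights=$1
    shift
    gff3_sources "$@" | while IFS= read -r src; do
        awk -v s="$src" '$2 == s { found = 1 } END { exit !found }' "$weights" \
            || printf '%s\n' "$src"
    done
}
```

For example, `unweighted_sources weights.txt augustus.evm.gff3 miniprot.evm.gff3` prints nothing when every source in those files has a weight entry.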
Some legacy helper scripts are included for older EVM preparation paths, such as BTAB and GFF2/Jigsaw conversion utilities:
taf-evidencemodeler BPbtab.pl < blast.output > blast.output.btab
taf-evidencemodeler prepare_Jigsaw_formats.pl -h
taf-evidencemodeler gff3_genes_to_gff2.pl
These helpers are available for upstream compatibility. They do not replace the modern recommendation to prepare high-quality spliced protein/transcript alignments and EVM-compatible GFF3 evidence before running EVM.
The container builds EVidenceModeler from the official EVidenceModeler-v2.1.0
tag and initializes the bundled ParaFly submodule. The upstream ParaFly build
system hard-codes -m64; this app removes that architecture-specific flag at
build time so the package can build natively on both linux/amd64 and
linux/arm64.
The upstream 2.1.0 script still reports EVidenceModeler-v2.0.0 internally.
This image patches only that displayed version string to
EVidenceModeler-v2.1.0 so wrapper version checks match the packaged source
tag. No algorithmic code is changed.
Runtime dependencies include Perl, Python 3, ParaFly, GNU find, GNU sort,
bash, Perl DB_File, Perl URI::Escape, and BioPerl Bio::SearchIO.
EUK_MODULES and PERL5LIB are set to the bundled EVM PerlLib directory so
older helper scripts that still reference EUK_MODULES can resolve the same
modules as the main EVM command. These are covered by the smoke tests because
they are used by the main EVM execution path and bundled helpers.
This release declares native linux/amd64 and linux/arm64 builds.
On Apple Silicon macOS, Docker or Podman should normally pull the native arm64 image once it is published. If a local backend still tries to use an amd64 image for inspection or compatibility testing, use a backend-specific platform override:
TAFFISH_CONTAINER_BACKEND=docker \
TAFFISH_DOCKER_RUN_ARGS="--platform linux/amd64" \
taf-evidencemodeler EVidenceModeler --help
This kind of global TAFFISH_*_RUN_ARGS override is for local platform or site
policy. The app itself does not need GPU, special devices, or network access at
runtime.
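To decide which platform string applies on a given host, the machine architecture reported by `uname -m` can be mapped to the two platforms this release declares. This is a small sketch; `container_platform` is a hypothetical helper name, and the architecture aliases it accepts are the common ones rather than an exhaustive list.

```shell
# Sketch: map a machine architecture to one of the declared image platforms.
# container_platform is a hypothetical helper, not part of TAFFISH.
# Usage: container_platform [arch]   (defaults to the output of uname -m)
container_platform() {
    arch=${1:-$(uname -m)}
    case "$arch" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)
            echo "no declared image platform for architecture: $arch" >&2
            return 1
            ;;
    esac
}
```

For example, `TAFFISH_DOCKER_RUN_ARGS="--platform $(container_platform)"` would pin a Docker run to the host's native platform, which is only useful when the backend would otherwise pick a different one.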
The upstream GitHub project notes that EVidenceModeler is no longer being actively maintained as of 2024. This TAFFISH app packages the existing 2.1.0 release for reproducible use, but new projects should also evaluate current annotation alternatives when appropriate.
name: evidencemodeler
command: taf-evidencemodeler
version: 2.1.0-r1
kind: tool
image: ghcr.io/taffish/evidencemodeler:2.1.0-r1
Haas et al. 2008. Automated eukaryotic gene structure annotation using
EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biology
9, R7. DOI: 10.1186/gb-2008-9-1-r7; PMID: 18190707.
Upstream EVidenceModeler is distributed under the BSD 3-Clause license. This TAFFISH wrapper repository is distributed under the Apache License 2.0.