TAFFISH wrapper for HISAT2, a fast and sensitive aligner for DNA and RNA sequencing reads.
This repository packages upstream HISAT2 2.2.2 as a TAFFISH tool app. It builds
the upstream v2.2.2 source release and exposes the default hisat2 command
through the versioned taf-hisat2 wrapper.
Release 2.2.2-r2 is a help-only TAFFISH update. It keeps the upstream
software, Dockerfile, runtime dependencies, smoke tests, and command behavior
unchanged from 2.2.2-r1, and refreshes the terminal taf-hisat2 --help
text.
Install from the public TAFFISH Hub index:
taf update
taf install hisat2

Install the exact release:
taf install hisat2 2.2.2-r2

For local testing before the app is published to the public index:
taf install --from .

Show TAFFISH app help:
taf-hisat2 --help

Show the TAFFISH package version:
taf-hisat2 --version

Show the upstream HISAT2 version:
taf-hisat2 hisat2 --version
taf-hisat2 -- --version

Show upstream command help:
taf-hisat2 hisat2 --help
taf-hisat2 hisat2-build --help
taf-hisat2 hisat2-inspect --help

Build an index:
taf-hisat2 hisat2-build genome.fa genome

Align single-end FASTQ reads:
taf-hisat2 hisat2 -x genome -U reads.fq.gz -S output.sam

Align paired-end FASTQ reads:
taf-hisat2 hisat2 -x genome -1 sample_R1.fq.gz -2 sample_R2.fq.gz -S output.sam

Align FASTA reads and disable spliced alignment for DNA-seq style mapping:
taf-hisat2 hisat2 -f -x genome -U reads.fa -S output.sam --no-spliced-alignment

Use known splice sites or exons extracted from a GTF:
taf-hisat2 hisat2_extract_splice_sites.py genes.gtf > splicesites.txt
taf-hisat2 hisat2_extract_exons.py genes.gtf > exons.txt
taf-hisat2 hisat2-build --ss splicesites.txt --exon exons.txt genome.fa genome

Inspect an existing index:
taf-hisat2 hisat2-inspect -n genome
taf-hisat2 hisat2-inspect -s genome

Because this is a command-mode TAFFISH tool, the first non-option argument is
the in-container command. For normal HISAT2 usage, naming the upstream command
explicitly is the clearest form:
taf-hisat2 hisat2 -x genome -U reads.fq.gz -S output.sam
taf-hisat2 hisat2-build genome.fa genome
taf-hisat2 hisat2-inspect -n genome

The app also accepts the common shorthand when the first argument is a HISAT2 option:
taf-hisat2 -- --help
taf-hisat2 -x genome -U reads.fq.gz -S output.sam

This README lists common usage patterns, not the full upstream manual. The
TAFFISH wrapper calls the upstream hisat2 script directly, so official HISAT2
options are available as upstream implements them. Use taf-hisat2 hisat2 --help,
taf-hisat2 hisat2-build --help, and the upstream manual for the complete
option list.
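When several paired-end samples share one index, the paired-end invocation shown above can be generated once per sample. The following is a minimal sketch and not part of the app: the function names hisat2_paired_cmd and hisat2_paired_cmds are hypothetical, and the sketch assumes the <sample>_R1.fq.gz / <sample>_R2.fq.gz naming used in the paired-end example. It only prints commands, so they can be reviewed before anything runs.

```shell
# Hypothetical helper: print (do not run) the taf-hisat2 command for one
# paired-end sample. $1 = index prefix, $2 = sample name.
# Assumes inputs are named <sample>_R1.fq.gz and <sample>_R2.fq.gz.
hisat2_paired_cmd() {
    printf 'taf-hisat2 hisat2 -x %s -1 %s_R1.fq.gz -2 %s_R2.fq.gz -S %s.sam\n' \
        "$1" "$2" "$2" "$2"
}

# Print one command per sample. $1 = index prefix, remaining args = sample names.
hisat2_paired_cmds() {
    cmd_index=$1
    shift
    for cmd_sample in "$@"; do
        hisat2_paired_cmd "$cmd_index" "$cmd_sample"
    done
}
```

For example, hisat2_paired_cmds genome sampleA sampleB prints two command lines. The sketch does no quoting of its arguments, so it is only suitable for index prefixes and sample names without spaces or shell metacharacters.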
name: hisat2
command: taf-hisat2
version: 2.2.2-r2
kind: tool
image: ghcr.io/taffish/hisat2:2.2.2-r2
The container image is built from docker/Dockerfile. It starts from
debian:12-slim, downloads the upstream HISAT2 v2.2.2 source tarball,
verifies its SHA-256 checksum, applies the Bioconda SIMDe compatibility patch
for native arm64 builds, compiles HISAT2 from source, and installs the runtime
files under /opt/hisat2.
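The checksum step described above can be sketched in plain shell. This is an illustration only, not the contents of docker/Dockerfile: the function name verify_sha256 is hypothetical, the sketch assumes GNU coreutils sha256sum is available, and the expected digest must come from the packaging metadata (it is not reproduced here).

```shell
# Hypothetical helper: return 0 if the SHA-256 digest of file $1 equals the
# expected lowercase hex digest $2, and non-zero otherwise.
# Assumes GNU coreutils sha256sum is on PATH.
verify_sha256() {
    # sha256sum prints "<digest>  <filename>"; keep only the digest field.
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}
```

A build script could call verify_sha256 on the downloaded tarball and stop with an error when it returns non-zero, so that a corrupted or substituted download never reaches the compile step.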
The image includes these user-facing commands and runtimes:
hisat2
hisat2-build
hisat2-inspect
hisat2-align-s
hisat2-align-l
hisat2-build-s
hisat2-build-l
hisat2-inspect-s
hisat2-inspect-l
hisat2-repeat
hisat2_extract_splice_sites.py
hisat2_extract_exons.py
hisat2_extract_snps_haplotypes_UCSC.py
hisat2_extract_snps_haplotypes_VCF.py
hisat2_simulate_reads.py
hisat2_read_statistics.py
extract_splice_sites.py
extract_exons.py
python3
python
perl
gzip
bzip2
samtools
hisat2, hisat2-build, and hisat2-inspect are upstream wrapper scripts.
They expect the small and large binaries to live in the same installation
directory, so the container keeps the full command set together in
/opt/hisat2 and symlinks public commands into /usr/local/bin.
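The layout described above, one installation directory plus symlinks in a directory on PATH, can be sketched as follows. This is an assumption-laden illustration rather than the actual Dockerfile logic: the function name link_public_commands is hypothetical, and it links every executable file, whereas the real image may link only a chosen subset.

```shell
# Hypothetical helper: symlink every executable regular file found directly
# in $1 (installation directory) into $2 (a directory on PATH).
# Pass an absolute path for $1 so the symlink targets resolve from anywhere.
link_public_commands() {
    install_dir=$1
    bin_dir=$2
    mkdir -p "$bin_dir" || return 1
    for path in "$install_dir"/*; do
        if [ -f "$path" ] && [ -x "$path" ]; then
            # The binaries stay in install_dir; only a link is placed in bin_dir.
            ln -sf "$path" "$bin_dir/$(basename "$path")"
        fi
    done
}
```

Because only links are created, the wrapper scripts and the small and large binaries remain side by side in the installation directory.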
The runtime includes gzip and bzip2 because the upstream hisat2 wrapper
uses those executables for compressed input and selected compressed output
paths. It includes both python and python3 entry points because upstream
wrappers still use /usr/bin/env python, while newer helper scripts use
python3. It includes Perl because the main hisat2 wrapper is a Perl script.
It includes samtools because the VCF SNP/haplotype helper can call
samtools faidx in genotype-reference modes. For larger
downstream SAM/BAM work, using the dedicated taf-samtools and taf-bcftools
apps is still the cleaner TAFFISH composition.
The upstream example data and index-building scripts are kept under
/opt/hisat2/example and /opt/hisat2/scripts. They are useful as references,
but many of the script examples download large genome resources and are not run
by the TAFFISH smoke checks.
The image is built and validated for:
linux/amd64
linux/arm64
This app packages the stable HISAT2 v2.2.2 release from the main upstream
line. HISAT-3N is documented by upstream, but its commands live on the separate
hisat-3n branch and are not present in the v2.2.2 release tarball. This app
therefore does not provide hisat-3n, hisat-3n-build, or
hisat-3n-table. Those should be packaged separately if TAFFISH Hub needs
first-class HISAT-3N support.
HISAT-genotype is also a separate upstream project and is not included here.
This build does not enable HISAT2 SRA access with USE_SRA=1, because that
requires the NCBI NGS/VDB toolkit at build and runtime. For SRA access, download
or convert reads with an SRA-focused tool first, then pass FASTQ/FASTA files to
taf-hisat2.
No prebuilt genome indexes are bundled. Users can build indexes with
hisat2-build, use their own downloaded HISAT2 indexes, or mount shared index
directories into the TAFFISH working environment.
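Since no index is bundled, it can help to confirm that an index is present before starting an alignment. The sketch below is illustrative: the function name hisat2_index_present is hypothetical, and it assumes the upstream file naming in which an index consists of eight files, <prefix>.1.ht2 through <prefix>.8.ht2, or the same names with an .ht2l suffix for a large index.

```shell
# Hypothetical helper: return 0 if a complete HISAT2 index exists for the
# prefix $1, checking the small-index suffix (.ht2) and then the
# large-index suffix (.ht2l). Assumes eight numbered files per index.
hisat2_index_present() {
    prefix=$1
    for suffix in ht2 ht2l; do
        missing=0
        for n in 1 2 3 4 5 6 7 8; do
            [ -f "$prefix.$n.$suffix" ] || missing=1
        done
        if [ "$missing" -eq 0 ]; then
            return 0
        fi
    done
    return 1
}
```

A script could run hisat2_index_present genome and fall back to taf-hisat2 hisat2-build genome.fa genome only when the check fails.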
The TAFFISH metadata declares a Docker smoke check:
exist: hisat2, hisat2-build, hisat2-inspect, small/large binaries,
helper scripts, Python, Perl, gzip, bzip2, samtools
test: upstream version is 2.2.2
test: upstream help is available
test: runtime interpreters and compression tools are present
test: helper script help is available
test: a tiny reference can be indexed, inspected, and aligned against
test: gzip and bzip2 read inputs are accepted
test: a tiny large-index path runs
test: GTF splice-site and exon helper scripts run
During TAFFISH Hub indexing, this smoke metadata verifies that the published image exposes the expected command surface, reports the pinned upstream version, and can run representative local alignment and helper workflows. It does not validate biological alignment quality, large genome performance, prebuilt index downloads, SRA access, HISAT-3N, or HISAT-genotype.
Each smoke command is self-contained because the public index runs every
[smoke].test entry in a fresh temporary container. No smoke entry depends on
files created by an earlier entry.
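One way to keep a check self-contained is to have it create its own inputs in a fresh temporary directory. The sketch below shows that idea only; it is not copied from the app's smoke metadata. The function name make_tiny_inputs and the toy sequences are hypothetical, and a real check may need a longer reference than the one used here.

```shell
# Hypothetical helper: write a toy reference and a single read into a new
# temporary directory and print that directory's path.
# Nothing here depends on files left behind by an earlier command.
make_tiny_inputs() {
    dir=$(mktemp -d) || return 1
    # Arbitrary toy reference sequence, for illustration only.
    printf '>ref\n%s\n' \
        'ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCCGATCGTACGATCGGCTAAGCTTGCAT' \
        > "$dir/ref.fa"
    # One FASTQ record: the read is a prefix of the reference, and the
    # quality string has the same length as the read.
    printf '@read1\n%s\n+\n%s\n' \
        'ACGTTGCAAGCTTAGGCTAACGTAGCTAGG' \
        'IIIIIIIIIIIIIIIIIIIIIIIIIIIIII' \
        > "$dir/reads.fq"
    printf '%s\n' "$dir"
}
```

Inside a container, an entry built on this idea could capture the path with d=$(make_tiny_inputs) and then run hisat2-build "$d/ref.fa" "$d/ref" followed by hisat2 -x "$d/ref" -U "$d/reads.fq" -S "$d/out.sam", all within the same command.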
- Project: HISAT2
- Homepage: https://daehwankimlab.github.io/hisat2/
- Source: https://github.com/DaehwanKimLab/hisat2
- Release: https://github.com/DaehwanKimLab/hisat2/releases/tag/v2.2.2
- Download: https://github.com/DaehwanKimLab/hisat2/archive/refs/tags/v2.2.2.tar.gz
- Upstream license: GPL-3.0-or-later
- Citations:
  - Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods, 2015. DOI: 10.1038/nmeth.3317
  - Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature Biotechnology, 2019. DOI: 10.1038/s41587-019-0201-4
Useful checks before publishing:
taf check
docker build -t ghcr.io/taffish/hisat2:2.2.2-r2 -f docker/Dockerfile .
taf build
TAFFISH_CONTAINER_BACKEND=docker target/taf-hisat2-v2.2.2-r2 -- --version
TAFFISH_CONTAINER_BACKEND=docker target/taf-hisat2-v2.2.2-r2 hisat2 --version
taf publish --release --dry-run