taf-hisat2

TAFFISH wrapper for HISAT2, a fast and sensitive aligner for DNA and RNA sequencing reads.

This repository packages upstream HISAT2 2.2.2 as a TAFFISH tool app. It builds the upstream v2.2.2 source release and exposes the default hisat2 command through the versioned taf-hisat2 wrapper.

Release 2.2.2-r2 is a help-only TAFFISH update: it refreshes the text printed by taf-hisat2 --help and leaves the upstream software, Dockerfile, runtime dependencies, smoke tests, and command behavior unchanged from 2.2.2-r1.

Installation

Install from the public TAFFISH Hub index:

taf update
taf install hisat2

Install the exact release:

taf install hisat2 2.2.2-r2

For local testing before the app is published to the public index:

taf install --from .

Usage

Show TAFFISH app help:

taf-hisat2 --help

Show the TAFFISH package version:

taf-hisat2 --version

Show the upstream HISAT2 version:

taf-hisat2 hisat2 --version
taf-hisat2 -- --version

Show upstream command help:

taf-hisat2 hisat2 --help
taf-hisat2 hisat2-build --help
taf-hisat2 hisat2-inspect --help

Build an index:

taf-hisat2 hisat2-build genome.fa genome

Align single-end FASTQ reads:

taf-hisat2 hisat2 -x genome -U reads.fq.gz -S output.sam

Align paired-end FASTQ reads:

taf-hisat2 hisat2 -x genome -1 sample_R1.fq.gz -2 sample_R2.fq.gz -S output.sam
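
For several paired-end samples, the same command can be generated per sample. The following is a minimal dry-run sketch, not part of the app: the helper name paired_cmd and the <sample>_R1.fq.gz / <sample>_R2.fq.gz naming scheme are assumptions made for illustration. It only prints commands, so the output can be reviewed before anything is run.

```shell
# Print the paired-end alignment command for one R1 file.
# Assumption: mates are named <sample>_R1.fq.gz and <sample>_R2.fq.gz.
# paired_cmd is a hypothetical helper, not something taf-hisat2 provides.
paired_cmd() {
    r1=$1
    index=${2:-genome}          # index basename, defaults to "genome"
    sample=${r1%_R1.fq.gz}      # strip the R1 suffix to get the sample name
    printf 'taf-hisat2 hisat2 -x %s -1 %s -2 %s_R2.fq.gz -S %s.sam\n' \
        "$index" "$r1" "$sample" "$sample"
}

# Dry run: print one command per R1 file in the current directory.
for r1 in *_R1.fq.gz; do
    [ -e "$r1" ] || continue    # skip the literal pattern when nothing matches
    paired_cmd "$r1"
done
```

Redirecting the printed lines to a file and reviewing them before running `sh` on that file keeps the generation step separate from the alignment step. The printed commands are not quoted, so this sketch assumes file names without spaces.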

Align FASTA reads and disable spliced alignment for DNA-seq style mapping:

taf-hisat2 hisat2 -f -x genome -U reads.fa -S output.sam --no-spliced-alignment

Use known splice sites or exons extracted from a GTF:

taf-hisat2 hisat2_extract_splice_sites.py genes.gtf > splicesites.txt
taf-hisat2 hisat2_extract_exons.py genes.gtf > exons.txt
taf-hisat2 hisat2-build --ss splicesites.txt --exon exons.txt genome.fa genome

Inspect an existing index:

taf-hisat2 hisat2-inspect -n genome
taf-hisat2 hisat2-inspect -s genome

Because this is a command-mode TAFFISH tool, the first non-option argument is the in-container command. For normal HISAT2 usage, naming the upstream command explicitly is the clearest form:

taf-hisat2 hisat2 -x genome -U reads.fq.gz -S output.sam
taf-hisat2 hisat2-build genome.fa genome
taf-hisat2 hisat2-inspect -n genome

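The app also accepts a shorthand that omits the command name: when the first argument is a HISAT2 option, or when the arguments follow --, they are passed to the default hisat2 command: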

taf-hisat2 -- --help
taf-hisat2 -x genome -U reads.fq.gz -S output.sam
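
The dispatch rules described above can be pictured with a small shell function. This is only a mental model built from the examples in this README, not the TAFFISH implementation; the function name resolve_command and its output format are invented for illustration.

```shell
# Illustrative model of how taf-hisat2 arguments map to an in-container command.
# Assumption: this mirrors the README examples only; the real wrapper may differ.
resolve_command() {
    [ "$#" -gt 0 ] || return 1
    case "$1" in
        --help|--version)
            # Handled by the TAFFISH wrapper itself (app help, package version).
            echo "wrapper $1" ;;
        --)
            # Everything after "--" goes to the default hisat2 command.
            shift
            echo "hisat2 $*" ;;
        -*)
            # Shorthand: a leading HISAT2 option implies the hisat2 command.
            echo "hisat2 $*" ;;
        *)
            # The first non-option argument names the in-container command.
            echo "$*" ;;
    esac
}

resolve_command -- --version
resolve_command -x genome -U reads.fq.gz -S output.sam
resolve_command hisat2-build genome.fa genome
```

Under this model, taf-hisat2 --version and taf-hisat2 -- --version differ only in who answers: the wrapper in the first case, upstream hisat2 in the second.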

This README lists common usage patterns, not the full upstream manual. The TAFFISH wrapper calls the upstream hisat2 script directly, so official HISAT2 options are available as upstream implements them. Use taf-hisat2 hisat2 --help, taf-hisat2 hisat2-build --help, and the upstream manual for the complete option list.

Package

name: hisat2
command: taf-hisat2
version: 2.2.2-r2
kind: tool
image: ghcr.io/taffish/hisat2:2.2.2-r2
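
The package version and image reference follow a visible pattern. As a rough sketch, the helpers below split a release string into its upstream version and packaging revision and rebuild the image reference. The function names are hypothetical, and reading the -rN suffix as a packaging revision is an inference from the release note above, not a documented TAFFISH rule.

```shell
# Split "2.2.2-r2" into the upstream version and the packaging revision number.
# Assumption: the part after the last "-r" is the TAFFISH packaging revision.
split_release() {
    release=$1
    upstream=${release%-r*}     # drop the shortest trailing "-r..." suffix
    revision=${release##*-r}    # keep what follows the last "-r"
    echo "$upstream $revision"
}

# Rebuild the image reference from the package name and release string.
image_ref() {
    echo "ghcr.io/taffish/$1:$2"
}

split_release 2.2.2-r2
image_ref hisat2 2.2.2-r2
```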

Container

The container image is built from docker/Dockerfile. It starts from debian:12-slim, downloads the upstream HISAT2 v2.2.2 source tarball, verifies its SHA-256 checksum, applies the Bioconda SIMDe compatibility patch for native arm64 builds, compiles HISAT2 from source, and installs the runtime files under /opt/hisat2.

The image includes these user-facing commands and runtimes:

hisat2
hisat2-build
hisat2-inspect
hisat2-align-s
hisat2-align-l
hisat2-build-s
hisat2-build-l
hisat2-inspect-s
hisat2-inspect-l
hisat2-repeat
hisat2_extract_splice_sites.py
hisat2_extract_exons.py
hisat2_extract_snps_haplotypes_UCSC.py
hisat2_extract_snps_haplotypes_VCF.py
hisat2_simulate_reads.py
hisat2_read_statistics.py
extract_splice_sites.py
extract_exons.py
python3
python
perl
gzip
bzip2
samtools

hisat2, hisat2-build, and hisat2-inspect are upstream wrapper scripts. They expect the small and large binaries to live in the same installation directory, so the container keeps the full command set together in /opt/hisat2 and symlinks public commands into /usr/local/bin.
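
The layout described above can be sketched as follows. This is not the step from docker/Dockerfile; the function name link_public_commands is hypothetical, and the sketch only covers executables whose names start with hisat2. Because symlinks are used rather than copies, each wrapper script still resolves to the directory that holds the small and large binaries.

```shell
# Symlink every executable hisat2* file from an install directory into a bin directory.
# Assumption: illustrative only; the real Dockerfile may link a different set of files.
link_public_commands() {
    src=$1      # e.g. /opt/hisat2
    dest=$2     # e.g. /usr/local/bin
    for f in "$src"/hisat2*; do
        # Skip non-files and anything that is not executable.
        [ -f "$f" ] && [ -x "$f" ] || continue
        ln -sf "$f" "$dest/$(basename "$f")"
    done
}
```

A call such as `link_public_commands /opt/hisat2 /usr/local/bin` would mirror the paths named in this section.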

The runtime includes gzip and bzip2 because the upstream hisat2 wrapper uses those executables for compressed input and selected compressed output paths. It includes both python and python3 entry points because upstream wrappers still use /usr/bin/env python, while newer helper scripts use python3. It includes Perl because the main hisat2 wrapper is a Perl script. It includes samtools because the VCF SNP/haplotype helper can call samtools faidx in genotype-reference modes. For larger downstream SAM/BAM work, using the dedicated taf-samtools and taf-bcftools apps is still the cleaner TAFFISH composition.

The upstream example data and index-building scripts are kept under /opt/hisat2/example and /opt/hisat2/scripts. They are useful as references, but many of the script examples download large genome resources and are not run by the TAFFISH smoke checks.

The image is built and validated for:

linux/amd64
linux/arm64

Boundaries

This app packages the stable HISAT2 v2.2.2 release from the main upstream line. HISAT-3N is documented by upstream, but its commands live on the separate hisat-3n branch and are not present in the v2.2.2 release tarball. This app therefore does not provide hisat-3n, hisat-3n-build, or hisat-3n-table. Those should be packaged separately if TAFFISH Hub needs first-class HISAT-3N support.

HISAT-genotype is also a separate upstream project and is not included here.

This build does not enable HISAT2 SRA access with USE_SRA=1, because that requires the NCBI NGS/VDB toolkit at build and runtime. For SRA access, download or convert reads with an SRA-focused tool first, then pass FASTQ/FASTA files to taf-hisat2.

No prebuilt genome indexes are bundled. Users can build indexes with hisat2-build, use their own downloaded HISAT2 indexes, or mount shared index directories into the TAFFISH working environment.
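
When using a downloaded or shared index, it can help to confirm that the index is complete before aligning. A HISAT2 index is a set of eight files named <basename>.1.ht2 through <basename>.8.ht2, or .ht2l for a large index. The check below is a small sketch with a hypothetical function name, index_kind, and is not part of the app.

```shell
# Report whether a HISAT2 index basename is a complete small (ht2) or large (ht2l) index.
# index_kind is a hypothetical helper written for this example.
index_kind() {
    base=$1
    for ext in ht2 ht2l; do
        missing=0
        for i in 1 2 3 4 5 6 7 8; do
            [ -e "$base.$i.$ext" ] || missing=1
        done
        if [ "$missing" -eq 0 ]; then
            echo "$ext"
            return 0
        fi
    done
    echo "incomplete"
    return 1
}
```

For example, `index_kind /shared/indexes/genome` (a made-up path) prints ht2, ht2l, or incomplete, and returns a non-zero status in the last case so it can gate an alignment step.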

The TAFFISH metadata declares a Docker smoke check:

exist: hisat2, hisat2-build, hisat2-inspect, small/large binaries,
       helper scripts, Python, Perl, gzip, bzip2, samtools
test:  upstream version is 2.2.2
test:  upstream help is available
test:  runtime interpreters and compression tools are present
test:  helper script help is available
test:  a tiny reference can be indexed, inspected, and aligned against
test:  gzip and bzip2 read inputs are accepted
test:  a tiny large-index path runs
test:  GTF splice-site and exon helper scripts run

During TAFFISH Hub indexing, this smoke metadata verifies that the published image exposes the expected command surface, reports the pinned upstream version, and can run representative local alignment and helper workflows. It does not validate biological alignment quality, large genome performance, prebuilt index downloads, SRA access, HISAT-3N, or HISAT-genotype.

Each smoke command is self-contained because the public index runs every [smoke].test entry in a fresh temporary container. No smoke entry depends on files created by an earlier entry.
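
A self-contained smoke entry has to create its own inputs before it uses them. The sketch below shows one way such an entry could look when run inside the container; it is not the actual [smoke].test content, and the function names, the tiny sequence, and the file names are all made up for illustration.

```shell
# Write a tiny single-contig FASTA reference and one FASTA read taken from it.
# The sequence is arbitrary example data, not a real genome fragment.
make_tiny_inputs() {
    dir=$1
    seq=ACGTTGCAAGCTTAGGCTAACGTCGATCGGATCCATGCAAGTTCGATCGTAGCTAGGCTTAACGGATCGATTACGCTAGCTAGGATCCGATCGA
    printf '>tiny\n%s\n' "$seq" > "$dir/tiny.fa"
    # The read is the first 40 bases of the reference.
    printf '>read1\n%s\n' "$(printf '%s' "$seq" | cut -c1-40)" > "$dir/reads.fa"
}

# Build an index and align against it in one step, using only files created here.
smoke_entry() {
    work=$(mktemp -d) || return 1
    make_tiny_inputs "$work"
    status=0
    if command -v hisat2-build >/dev/null 2>&1; then
        hisat2-build "$work/tiny.fa" "$work/tiny" >/dev/null &&
            hisat2 -f -x "$work/tiny" -U "$work/reads.fa" \
                -S "$work/out.sam" --no-spliced-alignment ||
            status=1
    else
        # Outside the container the HISAT2 commands may be absent.
        echo "hisat2-build not found; skipping index and alignment" >&2
    fi
    rm -rf "$work"
    return "$status"
}
```

Calling `smoke_entry` runs the whole sequence in a fresh temporary directory and removes it afterwards, so nothing is shared with any other entry.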

Upstream

Maintainer Notes

Useful checks before publishing:

taf check
docker build -t ghcr.io/taffish/hisat2:2.2.2-r2 -f docker/Dockerfile .
taf build
TAFFISH_CONTAINER_BACKEND=docker target/taf-hisat2-v2.2.2-r2 -- --version
TAFFISH_CONTAINER_BACKEND=docker target/taf-hisat2-v2.2.2-r2 hisat2 --version
taf publish --release --dry-run
