taffish/fastp

taf-fastp

TAFFISH wrapper for fastp, the ultrafast all-in-one FASTQ preprocessing and quality-control tool for short-read sequencing data.

This repository packages upstream fastp 1.3.3 as a TAFFISH tool app. It builds the official v1.3.3 source release in a Debian 12 container image and exposes the upstream fastp executable through the versioned taf-fastp command.

Installation

Install from the public TAFFISH Hub index:

taf update
taf install fastp

Install the exact release:

taf install fastp 1.3.3-r2

For local testing before the app is published to the public index:

taf install --from .

Usage

Show TAFFISH app help:

taf-fastp --help

Show the TAFFISH package version:

taf-fastp --version

Show the upstream fastp version:

taf-fastp fastp --version
taf-fastp -- --version

Show upstream command help:

taf-fastp fastp --help
taf-fastp -- --help

Process single-end FASTQ and write explicit HTML/JSON reports:

taf-fastp fastp \
  -i reads.fq \
  -o clean.fq \
  -h fastp.html \
  -j fastp.json \
  -w 4

Process paired-end gzip-compressed FASTQ:

taf-fastp fastp \
  -i sample_R1.fq.gz \
  -I sample_R2.fq.gz \
  -o clean_R1.fq.gz \
  -O clean_R2.fq.gz \
  -h sample.fastp.html \
  -j sample.fastp.json \
  -w 8
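
When several samples follow the same naming scheme, the paired-end call above can be wrapped in a small shell function. The sketch below is only an illustration: the function name `run_fastp_pair`, the `TAF_FASTP` override variable, and the `<sample>_R1.fq.gz` / `<sample>_R2.fq.gz` naming convention are assumptions made for this example, not part of TAFFISH or fastp.

```shell
# Hypothetical helper, not shipped with this app.
# Assumes inputs are named <sample>_R1.fq.gz and <sample>_R2.fq.gz.
TAF_FASTP="${TAF_FASTP:-taf-fastp}"   # set TAF_FASTP=echo for a dry run

run_fastp_pair() {
  r1="$1"
  threads="${2:-8}"
  sample="${r1%_R1.fq.gz}"            # strip the R1 suffix to get the sample prefix
  if [ "$sample" = "$r1" ]; then
    echo "error: $r1 does not end in _R1.fq.gz" >&2
    return 1
  fi
  # Same options as the paired-end example; names are derived from the prefix.
  "$TAF_FASTP" fastp \
    -i "${sample}_R1.fq.gz" \
    -I "${sample}_R2.fq.gz" \
    -o "${sample}.clean_R1.fq.gz" \
    -O "${sample}.clean_R2.fq.gz" \
    -h "${sample}.fastp.html" \
    -j "${sample}.fastp.json" \
    -w "$threads"
}
```

A loop such as `for r1 in raw/*_R1.fq.gz; do run_fastp_pair "$r1" 8; done` would then process each pair in turn, writing outputs next to the inputs.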

Use adapter detection, length filtering, and polyG trimming controls:

taf-fastp fastp \
  -i sample_R1.fq.gz \
  -I sample_R2.fq.gz \
  -o clean_R1.fq.gz \
  -O clean_R2.fq.gz \
  --detect_adapter_for_pe \
  --length_required 30 \
  --trim_poly_g \
  -w 8

Read from stdin and write passing reads to stdout:

cat reads.fq | taf-fastp fastp --stdin --stdout -w 4 > clean.fq
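
The same stream mode can sit between a decompressor and a compressor. The following is a minimal sketch, assuming `gzip` is available on the host; the function name `fastp_stream_gz` and the `TAF_FASTP` override variable are hypothetical names introduced for this example.

```shell
# Hypothetical helper, not shipped with this app.
TAF_FASTP="${TAF_FASTP:-taf-fastp}"

# Decompress on the fly, filter through fastp, and recompress the passing reads.
# Usage: fastp_stream_gz <input.fq.gz> <output.fq.gz> [threads]
fastp_stream_gz() {
  in_gz="$1"
  out_gz="$2"
  gzip -dc "$in_gz" \
    | "$TAF_FASTP" fastp --stdin --stdout -w "${3:-4}" \
    | gzip -c > "$out_gz"
}
```

A plain POSIX pipeline reports only the exit status of its last command, so a fastp failure in the middle stage can go unnoticed here. In bash, `set -o pipefail` is one way to surface it.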

Batch process a folder of FASTQ files with the upstream helper:

taf-fastp parallel.py \
  -i raw_fastq \
  -o clean_fastq \
  -r fastp_reports \
  -p 4 \
  -a '-w 2 --detect_adapter_for_pe'

Because this is a command-mode TAFFISH tool, the first non-option argument is the in-container command. For normal fastp usage, naming the upstream fastp binary explicitly is the clearest form:

taf-fastp fastp -i reads.fq -o clean.fq -h fastp.html -j fastp.json
taf-fastp parallel.py -h

The app also accepts the common shorthand when the first argument is a fastp option:

taf-fastp -i reads.fq -o clean.fq -h fastp.html -j fastp.json
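
The invocation forms shown in this README can be summarized as a small decision table. The function below is purely illustrative: `resolve_in_container_cmd` is a hypothetical name, and the logic only mirrors the examples in this document. It is not the real TAFFISH dispatcher, and it does not cover cases this README does not show, such as running with no arguments.

```shell
# Illustrative only: prints what each documented invocation form would run.
resolve_in_container_cmd() {
  case "${1-}" in
    --help|--version)
      echo "wrapper $1" ;;   # answered by the TAFFISH app itself
    --)
      shift
      echo "fastp $*" ;;     # explicit pass-through to upstream fastp
    -*)
      echo "fastp $*" ;;     # shorthand: first argument is a fastp option
    *)
      echo "$*" ;;           # first non-option argument names the command
  esac
}
```

For example, `resolve_in_container_cmd -- --version` prints `fastp --version`, while `resolve_in_container_cmd parallel.py -h` prints `parallel.py -h`.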

This README lists common usage patterns, not the full upstream manual. Because the TAFFISH wrapper calls the upstream fastp binary directly, official fastp options are available as-is, including merge mode, split output, interleaved input, unpaired/failed-read outputs, base correction, UMI handling, deduplication, overrepresentation analysis, and low-complexity filtering. Use the upstream help for the complete option list:

taf-fastp fastp --help

Package

name: fastp
command: taf-fastp
version: 1.3.3-r2
kind: tool
image: ghcr.io/taffish/fastp:1.3.3-r2

Container

The container image is built from docker/Dockerfile. It starts from debian:12-slim, downloads the upstream v1.3.3 source archive from GitHub, builds the required isa-l, libdeflate, and Google Highway libraries, and installs fastp into /usr/local/bin.

The image includes these user-facing commands:

fastp
parallel.py
python
python3

parallel.py is the upstream batch helper. It scans an input directory for FASTQ files, pairs files by file-name flags such as R1/R2, runs fastp on each sample, and writes per-sample reports plus an aggregate overall.html.

The fastp binary is built with the upstream compression dependencies:

isa-l 2.31.0
libdeflate 1.25
Google Highway 1.3.0

libdeflate and Google Highway are linked into the fastp binary. isa-l is provided as a runtime shared library so the same Dockerfile can build cleanly on both amd64 and arm64.
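
Because isa-l is a runtime dependency rather than linked in, a quick way to confirm that it resolves is to inspect the dynamic linker output for the fastp binary. The sketch below assumes the shared library name contains `libisal` (the name isa-l normally installs under) and that `ldd` is present in the image; `isal_resolved` is a hypothetical helper name.

```shell
# Reads `ldd` output on stdin.
# Succeeds only if a libisal entry is listed and is not marked "not found".
isal_resolved() {
  awk '
    /libisal/ { seen = 1; if ($0 ~ /not found/) missing = 1 }
    END { if (seen && !missing) exit 0; exit 1 }
  '
}
```

One possible use, assuming the image accepts an arbitrary command as in the maintainer checks: `docker run --rm ghcr.io/taffish/fastp:1.3.3-r2 sh -c 'ldd "$(command -v fastp)"' | isal_resolved && echo "isa-l resolved"`.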

This app intentionally does not bundle fastplong, FastQC, MultiQC, aligners, or downstream analysis tools. fastp itself generates HTML and JSON reports, and downstream aggregation or alignment should be handled by dedicated TAFFISH tools or flows.

The aggregate report written by the upstream parallel.py references CDN-hosted JavaScript libraries. The file can therefore be generated without network access, but some of its interactive charts may not render when it is viewed in a browser offline.

The image is built and validated for:

linux/amd64
linux/arm64

The TAFFISH metadata declares the following Docker smoke checks:

exist: fastp, parallel.py, python, python3
test:  fastp reports upstream version 1.3.3
test:  upstream fastp help is available
test:  upstream parallel.py help is available
test:  the fastp binary resolves the bundled isa-l runtime library
test:  a tiny single-end FASTQ can produce clean FASTQ plus HTML/JSON reports
test:  stdin/stdout mode works on a tiny FASTQ stream
test:  a tiny paired-end FASTQ pair can produce gzip-compressed outputs
test:  parallel.py can process a tiny paired-end folder and aggregate reports

During TAFFISH Hub indexing, this smoke metadata verifies that the published image exposes the expected command surface, reports the pinned upstream version, and can run minimal local FASTQ preprocessing/reporting workflows. The smoke checks are intentionally small and deterministic; they verify the container and representative fastp functionality, not every possible combination of upstream options or full scientific correctness for every data type.

Each smoke command is self-contained because the public index runs every [smoke].test entry in a fresh temporary container. No smoke entry depends on files created by an earlier entry.
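
To illustrate what "self-contained" means here, the sketch below creates its own tiny input, runs fastp, checks the outputs, and cleans up, so it does not rely on anything left behind by another check. It is not the actual smoke metadata of this app: the function names, the `FASTP` override variable, and the synthetic reads are all assumptions made for this example.

```shell
FASTP="${FASTP:-fastp}"   # inside the image this would be the upstream binary

# Write a tiny, deterministic single-end FASTQ with three identical-length reads.
make_tiny_fastq() {
  seq="ACGTACGTACGTACGTACGT"
  qual="$(printf '%s' "$seq" | tr 'ACGT' 'IIII')"   # same length as seq, high quality
  i=1
  while [ "$i" -le 3 ]; do
    printf '@read%s\n%s\n+\n%s\n' "$i" "$seq" "$qual"
    i=$((i + 1))
  done > "$1"
}

# One self-contained check: create inputs, run fastp, verify non-empty outputs.
smoke_single_end() {
  work="$(mktemp -d)" || return 1
  make_tiny_fastq "$work/reads.fq"
  if "$FASTP" -i "$work/reads.fq" -o "$work/clean.fq" \
              -h "$work/fastp.html" -j "$work/fastp.json" -w 1 \
     && [ -s "$work/clean.fq" ] \
     && [ -s "$work/fastp.html" ] \
     && [ -s "$work/fastp.json" ]; then
    status=0
  else
    status=1
  fi
  rm -rf "$work"            # leave nothing behind for later checks
  return "$status"
}
```

Keeping input creation, execution, and verification inside one function is what lets each entry run in a fresh container without ordering constraints.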

Upstream

Maintainer Notes

Useful checks before publishing:

taf check
taf compile -- fastp --version
taf publish --release --dry-run
docker build -t ghcr.io/taffish/fastp:1.3.3-r2 -f docker/Dockerfile .
docker build --platform linux/amd64 -t ghcr.io/taffish/fastp:1.3.3-r2-amd64-test -f docker/Dockerfile .
docker build --platform linux/arm64 -t ghcr.io/taffish/fastp:1.3.3-r2-arm64-test -f docker/Dockerfile .
docker run --rm ghcr.io/taffish/fastp:1.3.3-r2 fastp --version
docker run --rm ghcr.io/taffish/fastp:1.3.3-r2 fastp --help

The repository wrapper files are licensed under Apache-2.0. Upstream fastp is distributed under the MIT license, and third-party runtime components are distributed under their own upstream licenses.
