ONT-DMS is a Nextflow pipeline for processing barcoded protein variant libraries and NestLink libraries sequenced by nanopore sequencing. Reads are binned according to their barcodes or flycodes (UMIs), and an accurate consensus sequence is calculated for each bin using Dorado polish. Finally, variants are called from the consensus sequences, linking each barcode or flycode to its protein variant.
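The binning step can be pictured with a small Python sketch. This is an illustration only, not the pipeline's code: the function name `bin_reads_by_barcode`, the toy reads, and the 8-nt barcode pattern are assumptions made for the example. It assumes each read has already been reduced to its extracted barcode region, and it mirrors the roles of the `barcode_regex` and `barcode_min_coverage` parameters.

```python
import re
from collections import defaultdict


def bin_reads_by_barcode(reads, barcode_regex, min_coverage):
    """Group read names by barcode and keep only well-covered barcodes.

    reads: iterable of (read_name, extracted_barcode_sequence) pairs.
    barcode_regex: pattern a valid barcode must match (hypothetical here).
    min_coverage: minimum number of reads required per barcode.
    """
    pattern = re.compile(barcode_regex)
    bins = defaultdict(list)
    for name, sequence in reads:
        match = pattern.search(sequence)
        if match is None:
            continue  # reads without a valid barcode are dropped
        bins[match.group(0)].append(name)
    # Keep only barcodes seen at least min_coverage times.
    return {bc: names for bc, names in bins.items() if len(names) >= min_coverage}


# Toy input (hypothetical): four valid barcodes and one containing an N.
reads = [
    ("read1", "ACGTACGT"),
    ("read2", "ACGTACGT"),
    ("read3", "TTGGCCAA"),
    ("read4", "ACGTNCGT"),
    ("read5", "ACGTACGT"),
]

high_quality = bin_reads_by_barcode(reads, r"^[ACGT]{8}$", min_coverage=2)
for barcode, names in high_quality.items():
    print(barcode, len(names), names)
```

Each surviving bin would then be the input for one consensus calculation. Raising `min_coverage` trades the number of recovered barcodes for more reads per consensus.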
> [!NOTE]
> Tested on macOS 15.7 (Mamba), Windows 11 (WSL with Docker), and Ubuntu 22.04 (Mamba, Apptainer).
Requirements for local execution:

- Nextflow (Installation guide).
- Mamba/Conda (https://conda-forge.org/) or Docker.
- Dorado (Installation guide). It does not have to be installed when using containers.

Requirements for execution on a cluster:

- Slurm workload manager.
- Nextflow, installed in a Mamba/Conda environment called `nextflow`.
- Mamba/Conda or Apptainer.
- Dorado.

> [!TIP]
> Add `dorado` to your `PATH`, e.g. by adding `export PATH="$PATH:/home/fackle/data/dorado-1.2.0-linux-x64/bin"` to your `.bashrc`.

- Clone the repository with `git clone https://github.com/fabianackle/ONT-DMS.git`.
- Check the Nextflow configuration file `nextflow.config`.
- Create a `params.json` file with the parameters listed below, specifying the nanopore reads (BAM) and the reference sequence; see the examples contained in this repo.
- Run the pipeline with either `./run_NL-pipeline.sh` for local execution or, on a cluster, with `sbatch run_NL-pipeline.slurm`.

> [!TIP]
> The pipeline can also be run directly: `nextflow run main.nf -profile conda -params-file params.json`.

| Parameter | Type | Description |
|---|---|---|
| `data` | String | Path to input BAM file(s).† |
| `reference` | String | Path to reference FASTA file. |
| `filter_quality` | Float | Minimum mean read quality threshold. |
| `filter_min_length` | Integer | Minimum read length threshold for read filtering. |
| `filter_max_length` | Integer | Maximum read length threshold for read filtering. |
| `extract_barcode_adapter` | String | Linked cutadapt adapter for barcode extraction.†2 |
| `barcode_regex` | String | Regular expression matching the barcode. |
| `barcode_min_coverage` | Integer | Minimum number of times a barcode must be observed to be considered a high-quality barcode. |
| `polish_bacteria` | Boolean | Use a bacterial model for polishing with Dorado (optional).†3 |
| `barcode_5p` | String | 5' sequence flanking the barcode. |
| `barcode_3p` | String | 3' sequence flanking the barcode. |
| `orf_5p` | String | 5' sequence flanking the ORF. |
| `orf_3p` | String | 3' sequence flanking the ORF. |
| `translate_barcode` | Boolean | Translate the barcode; used with flycodes. |
| `outdir` | String | Output directory for results. |

† For multiple BAM files, use `*`, e.g. `data/barcode*.bam`.

†2 See "Linked adapters (combined 5' and 3' adapter)" in the cutadapt documentation.

†3 Do not use this option if you want to use move tables, as there are currently no models that support both.
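
Before launching a run, it can help to check a `params.json` against the table above. The following Python sketch is not part of the pipeline. The helper `check_params` is hypothetical, and it assumes every parameter except `polish_bacteria` (the only one marked optional) is required, which the table does not state explicitly. The values in the example dictionary are placeholders, not recommended settings.

```python
import json

# Parameter names and types as listed in the parameter table.
PARAM_TYPES = {
    "data": str,
    "reference": str,
    "filter_quality": float,
    "filter_min_length": int,
    "filter_max_length": int,
    "extract_barcode_adapter": str,
    "barcode_regex": str,
    "barcode_min_coverage": int,
    "polish_bacteria": bool,
    "barcode_5p": str,
    "barcode_3p": str,
    "orf_5p": str,
    "orf_3p": str,
    "translate_barcode": bool,
    "outdir": str,
}

# Assumption: only the parameter marked "(optional)" may be left out.
OPTIONAL = {"polish_bacteria"}


def check_params(params):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name, expected in PARAM_TYPES.items():
        if name not in params:
            if name not in OPTIONAL:
                problems.append(f"missing parameter: {name}")
            continue
        value = params[name]
        if expected is not bool and isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            ok = False
        elif expected is float:
            # JSON numbers without a decimal point are parsed as int.
            ok = isinstance(value, (int, float))
        else:
            ok = isinstance(value, expected)
        if not ok:
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# Placeholder input with a deliberately wrong type for filter_quality.
example = json.loads('{"data": "data/barcode*.bam", "filter_quality": "high"}')
for problem in check_params(example):
    print(problem)
```

For a real run, one would load the file with `json.load(open("params.json"))` and pass the resulting dictionary to `check_params`.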