DasLab/rr_util

rr_util

Utilities for generating raw reads datasets for deep neural network training

usage: bam2reads.py [-h] -b BAM -s FASTA [-o OUT_TAG] [-n NUM_SEQ_PER_CHUNK]

Takes sorted bam file and produces reads file and index for ML modeling

options:
  -h, --help            show this help message and exit
  -b BAM, --bam BAM     bam files to process
  -s FASTA, --fasta FASTA
                        FASTA of reference sequences used to align bam
  -o OUT_TAG, --out_tag OUT_TAG
                        tag used for tag.reads.txt and tags.index.csv
  -n NUM_SEQ_PER_CHUNK, --num_seq_per_chunk NUM_SEQ_PER_CHUNK
                        number of sequences
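
As a minimal sketch of how the options above fit together, the helper below assembles a bam2reads.py command line from Python. Only the flags come from the usage text. The helper name `build_bam2reads_cmd`, the file names, the tag, and the chunk size of 1000 are hypothetical placeholders, and the sketch assumes the script is launched with `python` from the directory that contains it.

```python
import subprocess


def build_bam2reads_cmd(bam, fasta, out_tag=None, num_seq_per_chunk=None):
    """Build the argument list for bam2reads.py (hypothetical helper).

    -b and -s are required by the usage text; -o and -n are optional,
    so they are only added when a value is supplied.
    """
    cmd = ["python", "bam2reads.py", "-b", bam, "-s", fasta]
    if out_tag is not None:
        cmd += ["-o", out_tag]
    if num_seq_per_chunk is not None:
        cmd += ["-n", str(num_seq_per_chunk)]
    return cmd


# Placeholder inputs: substitute your own sorted BAM and reference FASTA.
cmd = build_bam2reads_cmd(
    "sample.sorted.bam", "refs.fasta", out_tag="sample", num_seq_per_chunk=1000
)
print(" ".join(cmd))

# To actually run it (requires bam2reads.py and the input files to exist):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.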

The outputs are a .index.csv file, which gives the 1-indexed start and end of the reads for each reference sequence, and a .reads.txt file containing the reads.
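
The sketch below illustrates one way the index could be used to pull out the reads for a single reference sequence. It rests on several assumptions that are not confirmed by this README: that the reads file holds one read per line, that start/end are line numbers into it with an inclusive end, and that the CSV has a header with columns named `ref`, `start`, and `end`. The column names and both function names are hypothetical, so check them against your actual files before relying on this.

```python
import csv
import io


def load_index(index_file):
    """Map each reference name to its (start, end) pair.

    Assumes hypothetical header columns 'ref', 'start', 'end'.
    """
    index = {}
    for row in csv.DictReader(index_file):
        index[row["ref"]] = (int(row["start"]), int(row["end"]))
    return index


def reads_for_ref(read_lines, index, ref_name):
    """Return the reads for one reference sequence.

    start/end are treated as 1-indexed and inclusive (an assumption),
    so they are converted to a 0-indexed, half-open Python slice.
    """
    start, end = index[ref_name]
    return read_lines[start - 1:end]


# Tiny in-memory stand-ins for tag.index.csv and tag.reads.txt.
index_csv = io.StringIO("ref,start,end\nrefA,1,2\nrefB,3,3\n")
read_lines = ["ACGT", "ACGA", "TTGC"]

index = load_index(index_csv)
print(reads_for_ref(read_lines, index, "refA"))  # ['ACGT', 'ACGA']
print(reads_for_ref(read_lines, index, "refB"))  # ['TTGC']
```

For real files, the same functions could be applied to `open(...)` handles, with the reads loaded via `read().splitlines()`. For large datasets, reading only the needed line range would be preferable to loading the whole file.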
