PHALCON

Description

PHALCON is a scalable single-cell variant caller designed for high-throughput sequencing data. It is robust to common single-cell sequencing (SCS) errors and enables accurate mutation detection across large numbers of cells within practical runtimes.

PHALCON Workflow

Usage

PHALCON takes as input a read count matrix $(sites \times cells)$ and, optionally, a genotype quality matrix. If you have a loom file instead, use the loomToReadcount.py script in the supplementary folder of the main directory; its output files can be fed as input to PHALCON.

Installation

Clone the repo and navigate to the main directory:

git clone https://github.com/Zafar-Lab/PHALCON
cd PHALCON

Create and activate the PHALCON conda environment:

conda env create -f environment.yml
conda activate phalcon

To verify that PHALCON and all dependencies have been installed successfully, run:

python src/phalcon.py --help

This should display the list of available command-line arguments.

Arguments

-i: Input read count file

-o: Output prefix

-r: Minimum read depth threshold (Default : 5)

-a: Alternate frequency threshold (Default : 0.2)

-v: Threshold for proportion of cells with insufficient read count information (Default : 0.5)

-m: Threshold for proportion of sites harboring a mutation (Default : 0.004)

-c: Clustering algorithm to use (Default : "spectral", Options: "spectral" or "leiden")

-s: Seed

Optional Arguments

-g: Input genotype quality file (used together with -gq 1)

-gq: Enable genotype quality filter (Default : 0; set to 1 to enable)

-q: Genotype quality threshold (Default : 30)
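To make the depth and frequency thresholds more concrete, here is a minimal sketch of how filters of this kind are commonly applied to per-cell read counts. It is an illustration only and not PHALCON's actual implementation: the function names, the (ref, alt) tuple layout, and the exact comparison operators are all assumptions.

```python
from typing import List, Tuple

# Hypothetical layout: one (ref_reads, alt_reads) pair per cell for a single site.
SiteCounts = List[Tuple[int, int]]


def cell_supports_variant(ref: int, alt: int,
                          min_depth: int = 5,
                          min_alt_freq: float = 0.2) -> bool:
    """Assumed reading of -r and -a: enough reads, and enough of them alternate."""
    depth = ref + alt
    if depth < min_depth:
        return False  # too few reads to say anything about this cell
    return alt / depth >= min_alt_freq


def fraction_low_coverage(site: SiteCounts, min_depth: int = 5) -> float:
    """Fraction of cells whose total depth at this site is below min_depth."""
    if not site:
        return 1.0  # no cells at all: treat the site as uninformative
    low = sum(1 for ref, alt in site if ref + alt < min_depth)
    return low / len(site)


def keep_site(site: SiteCounts,
              min_depth: int = 5,
              max_low_cov: float = 0.5) -> bool:
    """Assumed reading of -v: drop sites where too many cells lack coverage."""
    return fraction_low_coverage(site, min_depth) <= max_low_cov
```

With the defaults listed above, a cell with 8 reference and 2 alternate reads would count as supporting the variant under this reading, while a cell with only 4 reads in total would be treated as uninformative.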

Run PHALCON

The example below runs PHALCON using both a read count matrix (sample_read_count_file.tsv) and a genotype quality matrix (sample_geno_qual_file.tsv) while keeping all other parameters at their default values.

python src/phalcon.py -i ../sample_read_count_file.tsv -g ../sample_geno_qual_file.tsv -gq 1
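If you want to launch the same run from a Python script (for example, to loop over several samples), the command can be assembled and passed to subprocess. This is a sketch under stated assumptions: the helper names build_phalcon_command and run_phalcon are hypothetical, and only flags that appear in this README (-i, -g, -gq) are used.

```python
import subprocess
import sys
from typing import List, Optional, Sequence


def build_phalcon_command(read_counts: str,
                          geno_qual: Optional[str] = None,
                          script: str = "src/phalcon.py",
                          extra_args: Sequence[str] = ()) -> List[str]:
    """Assemble the argument list for one PHALCON run (hypothetical helper)."""
    cmd = [sys.executable, script, "-i", read_counts]
    if geno_qual is not None:
        # Pass the genotype quality matrix and switch the filter on.
        cmd += ["-g", geno_qual, "-gq", "1"]
    cmd += list(extra_args)  # e.g. ["-o", "my_prefix"]
    return cmd


def run_phalcon(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    command = build_phalcon_command("../sample_read_count_file.tsv",
                                    "../sample_geno_qual_file.tsv")
    # Print the command only; call run_phalcon(command) to execute it.
    print(" ".join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.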

PHALCON executable (optional)

You may also create a system-wide executable for PHALCON. To create the executable, download the src folder, unzip it on your system, and follow the steps below:

Step 1- Change directory to the src folder and make the script executable:

chmod +x phalcon.py

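Step 2- Install phalcon system-wide using one of the following two commands. Run only one of them: after the first, phalcon.py no longer exists in the src folder, so the second would fail. To move the script:

sudo mv phalcon.py /usr/local/bin/phalcon

Alternatively, to keep the script in place and create a symbolic link to it:

sudo ln -s "$(pwd)/phalcon.py" /usr/local/bin/phalcon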

Then run phalcon from the command line with the desired input arguments (use --help for the list of arguments).

Below is an example where the files "sample_read_count_file.tsv" and "sample_geno_qual_file.tsv" are provided as input, with all other parameters kept at their default values.

phalcon -i ../sample_read_count_file.tsv -g ../sample_geno_qual_file.tsv -gq 1

Use -gq 0 to disable the genotype quality filter. For a sample run, you can find the input files here:

Output

PHALCON's main outputs are the variant calls for each cell (.vcf format) and the reconstructed phylogeny (.gv format). Auxiliary files, such as the UMAP and cluster labels, are also written.
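The .vcf output can be inspected with any VCF-aware tool. As a minimal sketch, the standard VCF layout (meta lines starting with ##, one #CHROM header line, then tab-separated records) can be read with the Python standard library alone. The function name iter_vcf_records and the toy record below are invented for illustration; the exact columns and FORMAT fields PHALCON writes are not specified here.

```python
import io
from typing import Dict, Iterator, TextIO


def iter_vcf_records(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield each VCF data line as a dict keyed by the #CHROM header columns."""
    columns = None
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#"):
            columns = line[1:].split("\t")  # header: CHROM, POS, ..., sample names
            continue
        if columns is None:
            raise ValueError("VCF data line found before the #CHROM header")
        yield dict(zip(columns, line.split("\t")))


if __name__ == "__main__":
    # Toy VCF text with two made-up cells; not real PHALCON output.
    toy = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tcell1\tcell2\n"
        "chr1\t100\t.\tA\tT\t.\tPASS\t.\tGT\t0/1\t0/0\n"
    )
    for record in iter_vcf_records(io.StringIO(toy)):
        print(record["CHROM"], record["POS"], record["cell1"], record["cell2"])
```

To read a real file, pass an open file handle instead of the io.StringIO object. The .gv file is in Graphviz format and can be rendered with the Graphviz dot program.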

Sample read count and genotype quality files

Tutorials

The Read the Docs documentation for PHALCON is available here - PHALCON-read-the-docs

Tutorial on simulated datasets - PHALCON-on-simulated-data

AML tutorial - AML-67-001

TNBC tutorial - TN4

Help

Run phalcon --help for a description of the parameters, along with their default values.