Analysis of multiple time points of cfDNA from plasma of patients with oncological diagnoses

AntonCoon/OTTER

Code style: black

OTTER 🦦: Oncology Traces TrackER

A Biohack 2023 project - Analysis of multiple time points of cfDNA from plasma of patients with oncological diagnoses

🦦 consists of filtering, clustering and variant calling.

Installation

git clone git@github.com:AntonCoon/OncoTracker.git
cd OncoTracker && make

Requirements

  • jre;
  • python modules:
    • ipython;
    • pandas;
    • scipy;
    • vcfpy;
    • seaborn.

Alternatively, after installation, the pipeline can be executed within a Docker/Podman container:

docker build -t otter:1 .
docker run -it --rm --mount type=bind,src="$(pwd)",target=/pipeline otter:1

# then run from the container
python3 main.py --filter-medium example_data/BH_2/

Usage

python main.py --filter-soft|--filter-medium|--filter-hard <path/to/folder/with/vcf/files>

A folder with results will be created in path/to/folder/with/vcf/files.

Available options are:

  • --filter-soft keeps all variants that occur in at least 2 files;
  • --filter-medium keeps only variants that are present in both replicates in at least 2 timepoints;
  • --filter-hard keeps only variants that are present in all of the subject's files.
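
The three filter levels can be illustrated with a minimal sketch. This is not OTTER's actual implementation: the function name `filter_variants`, the labelling of each file by a `(timepoint, replicate)` pair, and the `(chrom, pos, ref, alt)` variant keys are all assumptions made for illustration. The sketch also assumes that "both replicates" means every replicate of a timepoint that has at least two of them.

```python
from collections import Counter, defaultdict


def filter_variants(files, level):
    """Return the variants kept at the given filter level.

    files: dict mapping a hypothetical (timepoint, replicate) label to the
           set of variant keys found in that VCF file.
    level: "soft", "medium" or "hard".
    """
    if not files:
        return set()
    all_variants = set().union(*files.values())

    if level == "soft":
        # Keep variants that occur in at least 2 files.
        counts = Counter(v for variants in files.values() for v in variants)
        return {v for v in all_variants if counts[v] >= 2}

    if level == "medium":
        # Group the files by timepoint, then count the timepoints in which
        # a variant is present in every replicate (assumed >= 2 replicates).
        by_timepoint = defaultdict(list)
        for (timepoint, _replicate), variants in files.items():
            by_timepoint[timepoint].append(variants)
        kept = set()
        for v in all_variants:
            full_timepoints = sum(
                len(replicate_sets) >= 2
                and all(v in variants for variants in replicate_sets)
                for replicate_sets in by_timepoint.values()
            )
            if full_timepoints >= 2:
                kept.add(v)
        return kept

    if level == "hard":
        # Keep variants that are present in every file.
        return set.intersection(*files.values())

    raise ValueError(f"unknown filter level: {level!r}")


# Toy input: three timepoints with two replicates each (made-up variants).
A = ("chr1", 100, "A", "T")
B = ("chr2", 200, "G", "C")
C = ("chr3", 300, "C", "A")
D = ("chr4", 400, "T", "G")

example_files = {
    ("t1", "a"): {A, B, C},
    ("t1", "b"): {A, B},
    ("t2", "a"): {A, B, D},
    ("t2", "b"): {A, B, C},
    ("t3", "a"): {B},
    ("t3", "b"): {B},
}

soft = filter_variants(example_files, "soft")
medium = filter_variants(example_files, "medium")
hard = filter_variants(example_files, "hard")
```

In this toy input the soft level drops only D (seen in one file), the medium level keeps A and B, and the hard level keeps only B, which shows how each level is stricter than the previous one.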

Results description

Results consist of

  • filtered_LEVEL.vcf - filtered variants;
  • corr_clust.vcf - variants filtered and clustered by correlation of their variant allele frequency dynamics (only variants present in all timepoints are considered here);
  • SnpEff output:
    • {filtered_LEVEL,corr_clust}_snpEff-ann.vcf - annotated variants;
    • {filtered_LEVEL,corr_clust}_snpEff_genes.txt - gene counts summary;
    • {filtered_LEVEL,corr_clust}_snpEff_summary.html - SnpEff output summary;
  • *correlation.pdf - a heatmap of pairwise correlation used for clustering;
  • *dendrogram.pdf - a dendrogram with variants colored by cluster.
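
The idea behind clustering by correlated variant allele frequency (VAF) dynamics can be sketched as follows. This is a hedged illustration, not the code OTTER runs: it uses only the Python standard library, with a naive average-linkage agglomerative clustering on the distance `1 - Pearson correlation`, whereas a real pipeline would more likely rely on SciPy's hierarchical clustering. The function names, the `max_distance` cut-off of 0.5 and the VAF values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long VAF trajectories."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant trajectory counts as uncorrelated
    return cov / sqrt(var_x * var_y)


def cluster_by_vaf_correlation(vaf, max_distance=0.5):
    """Group variants whose VAF changes over time in a correlated way.

    vaf: dict mapping a variant name to its VAF at each timepoint
         (every variant must have a value for every timepoint).
    Returns a list of sets of variant names.
    """
    names = list(vaf)
    # Pairwise distance: 0 for perfectly correlated, 2 for anti-correlated.
    dist = {
        frozenset((a, b)): 1.0 - pearson(vaf[a], vaf[b])
        for a, b in combinations(names, 2)
    }

    def average_distance(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [{name} for name in names]
    while len(clusters) > 1:
        # Find the two closest clusters and merge them, unless they are
        # further apart than the cut-off.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_distance(clusters[p[0]], clusters[p[1]]),
        )
        if average_distance(clusters[i], clusters[j]) > max_distance:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up VAF values at three timepoints: two rising, two falling variants.
example_vaf = {
    "var_rising_1": [0.10, 0.20, 0.40],
    "var_rising_2": [0.05, 0.10, 0.22],
    "var_falling_1": [0.30, 0.15, 0.05],
    "var_falling_2": [0.50, 0.30, 0.12],
}

example_clusters = cluster_by_vaf_correlation(example_vaf)
```

With these values the two rising variants end up in one cluster and the two falling variants in another, because trajectories within each pair are almost perfectly correlated while the pairs are anti-correlated with each other.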
