Description
Hi,
I recently used LoFreq v2.1.5 to call variants on HIV WGS data. Before running lofreq, I aligned the reads with BWA and filtered for properly paired alignments using samtools. I then ran the lofreq commands in the following order:
viterbi --> indelqual --> alnqual --> call, and noticed that it made some frameshift mutation calls. I looked at the alignment in IGV at one of the frameshift regions: the top panel shows the reads before the lofreq preprocessing and the bottom panel after it. As you can see, the viterbi step introduces an insertion and a deletion on the same reads, resulting in one frameshift insertion and one frameshift deletion, each reported in about 29% of the reads, as shown below:
| Sample | HGVS.g | HGVS.c | HGVS.p | lofreq_AF | Variant_Type | lofreq_Var_Count |
|---|---|---|---|---|---|---|
| A | NC_001802.1:g.5212_5213insCC | HIV1gp4:c.108_109insCC | vpr:p.Ile37fs | 0.290914 | frameshift_variant | 3269 |
| A | NC_001802.1:g.5214_5215delTT | HIV1gp4:c.111_112delTT | vpr:p.Ile37fs | 0.295586 | frameshift_variant | 3268 |
Because the insertion and the deletion are present on the same reads, this looks more like an artifact than a real variant. How do I fix this? Should I remove the viterbi step? If so, should I still keep the indelqual and alnqual steps?
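To make the "looks like an artifact" reasoning concrete, here is a minimal Python sketch that flags pairs of nearby indel calls whose lengths cancel out (net change a multiple of 3) and whose frequencies are similar. The function names and the `max_gap` and `max_af_diff` thresholds are hypothetical choices for illustration, not anything lofreq provides, and the check works only on the reported calls: it cannot confirm that both indels sit on the same reads, which would still need to be verified in the BAM.

```python
import re
from itertools import combinations

# The two calls from the table above: (HGVS.g, allele frequency).
calls = [
    ("NC_001802.1:g.5212_5213insCC", 0.290914),
    ("NC_001802.1:g.5214_5215delTT", 0.295586),
]

# Matches simple HGVS.g indels of the form g.START_ENDinsSEQ / g.START_ENDdelSEQ.
HGVS_INDEL = re.compile(r"g\.(\d+)_(\d+)(ins|del)([ACGT]+)$")


def parse_indel(hgvs_g):
    """Return (start, end, signed_length); insertions > 0, deletions < 0."""
    match = HGVS_INDEL.search(hgvs_g)
    if match is None:
        raise ValueError(f"not a simple ins/del HGVS.g string: {hgvs_g}")
    start, end, kind, seq = match.groups()
    length = len(seq) if kind == "ins" else -len(seq)
    return int(start), int(end), length


def compensating_pairs(calls, max_gap=10, max_af_diff=0.02):
    """Flag insertion/deletion pairs that are close together, have similar
    frequencies, and together leave the reading frame unchanged.

    max_gap and max_af_diff are arbitrary illustrative thresholds.
    """
    pairs = []
    for (a, af_a), (b, af_b) in combinations(calls, 2):
        start_a, end_a, len_a = parse_indel(a)
        start_b, end_b, len_b = parse_indel(b)
        gap = max(start_a, start_b) - min(end_a, end_b)
        opposite = len_a * len_b < 0            # one insertion, one deletion
        in_frame = (len_a + len_b) % 3 == 0     # net length change keeps frame
        similar_af = abs(af_a - af_b) <= max_af_diff
        if opposite and in_frame and gap <= max_gap and similar_af:
            pairs.append((a, b))
    return pairs


for ins, dele in compensating_pairs(calls):
    print(f"possible compensating pair: {ins} + {dele}")
```

With the two calls above, the insCC and delTT are one base apart, cancel in length, and differ in frequency by less than 0.005, so they are flagged as a possible compensating pair rather than two independent frameshifts.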
I've also found regions where alignments are completely missing after BWA (second figure attached). If I give lofreq the raw alignment BAM file (including unmapped reads) instead of the BAM filtered for properly paired alignments, could the viterbi step realign the unmapped reads into those gap regions?

