Significant Read Reduction in Specific Samples After Running TranscriptClean #43

@tabatakenshiro

Hi,

I encountered a problem while processing data with TranscriptClean. Specifically, a significant reduction in read count was observed in a subset of samples, while others remained unaffected:

Affected samples: Pt1, Pt2, and Ctrl1 showed a drastic reduction in read count (Pt2 dropped from 58,299,674 to 3,185,766 reads).
Unaffected samples: Pt3 and Ctrl2 kept exactly the same read count before and after.

A detailed summary of the read counts before and after running TranscriptClean is attached below.

Debugging Steps Taken
Command execution: TranscriptClean was run with identical parameters for all samples.
Input files: Reads were mapped using minimap2 -ax splice -uf, then filtered with samtools view -q 40 -F 2304 before being used as input for TranscriptClean.
Splice junction correction: Initially, the --spliceJns option was enabled. Disabling this option and re-running the pipeline resulted in an increase in output reads. However, the selective read loss in specific samples persisted.
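One way to narrow this down is to list exactly which reads are present in the TranscriptClean input but missing from its output, and check whether they cluster on particular reference sequences. The following is a minimal sketch, not part of TranscriptClean itself: the function names are hypothetical, it parses plain-text SAM lines directly, and the file paths in the commented usage are placeholders.

```python
from collections import Counter


def sam_records(lines):
    """Yield (read_name, flag, reference) for each alignment line of a SAM stream."""
    for line in lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # skip malformed or truncated lines
            continue
        yield fields[0], int(fields[1]), fields[2]


def dropped_reads(input_lines, output_lines):
    """Return input records whose read name does not appear in the output."""
    kept = {name for name, _, _ in sam_records(output_lines)}
    return [rec for rec in sam_records(input_lines) if rec[0] not in kept]


def summarize_by_reference(dropped):
    """Count dropped reads per reference sequence (RNAME column)."""
    return Counter(ref for _, _, ref in dropped)


# Hypothetical usage; both paths are placeholders:
# with open("input.sam") as before, open("output_clean.sam") as after:
#     lost = dropped_reads(before, after)
# print(len(lost), summarize_by_reference(lost).most_common(10))
```

If most of the missing reads map to a small set of reference names, that points to the reference files or pipeline rather than to sample quality. This sketch holds all output read names in memory, so for tens of millions of reads it may be easier to run it on a subset first.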

Inquiry
I understand that TranscriptClean performs indel/mismatch correction, but it appears that specific reads are being excluded. Given the extreme discrepancy observed, I suspect the issue might be related to the analysis pipeline rather than sample quality.

Could you provide any insights or recommendations on how to troubleshoot this issue further? Any guidance would be greatly appreciated.

Thank you for your time and consideration.

Kenshiro

Check3 = before TranscriptClean, Check4 = after TranscriptClean

| Sample | Check | raw total reads | total length | bases mapped | bases mapped (cigar) | mismatches | error rate | average length | maximum length | average quality |
|---|---|---|---|---|---|---|---|---|---|---|
| Pt1 | Check3 | 38571700 | 27988121299 | 27988121299 | 26439881767 | 1294552523 | 4.90% | 726 | 14189 | 22.1 |
| Pt1 | Check4 | 14885194 | 9448829077 | 9448829077 | 8902680697 | 20855028 | 0.23% | 635 | 8911 | 255 |
| Pt2 | Check3 | 58299674 | 43342802595 | 43342802595 | 40954916189 | 1871537414 | 4.57% | 743 | 13185 | 22.5 |
| Pt2 | Check4 | 3185766 | 2150401555 | 2150401555 | 2017913761 | 3661245 | 0.18% | 675 | 7481 | 255 |
| Pt3 | Check3 | 17127913 | 9246550886 | 9246550886 | 8633108380 | 463702400 | 5.37% | 540 | 8159 | 21.6 |
| Pt3 | Check4 | 17127913 | 9317313128 | 9317313128 | 8703870622 | 19176366 | 0.22% | 544 | 8224 | 255 |
| Ctrl1 | Check3 | 31658805 | 24249086965 | 24249086965 | 22935673068 | 1206981057 | 5.26% | 766 | 10338 | 21.7 |
| Ctrl1 | Check4 | 15929871 | 11571670680 | 11571670680 | 10929699850 | 23711792 | 0.22% | 726 | 10408 | 255 |
| Ctrl2 | Check3 | 12877599 | 8679179782 | 8679179782 | 8160470026 | 492314770 | 6.03% | 674 | 8230 | 20.9 |
| Ctrl2 | Check4 | 12877599 | 8752414304 | 8752414304 | 8233704548 | 17855502 | 0.22% | 680 | 8287 | 255 |
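To put the discrepancy in numbers, read retention per sample can be computed from the "raw total reads" column. This is a minimal sketch using only the counts reported in this issue; the variable names are illustrative.

```python
# (before, after) raw total reads per sample, copied from the summary above.
read_counts = {
    "Pt1": (38571700, 14885194),
    "Pt2": (58299674, 3185766),
    "Pt3": (17127913, 17127913),
    "Ctrl1": (31658805, 15929871),
    "Ctrl2": (12877599, 12877599),
}

# Percentage of input reads still present after TranscriptClean.
retention = {
    sample: 100.0 * after / before
    for sample, (before, after) in read_counts.items()
}

for sample, pct in retention.items():
    print(f"{sample}: {pct:.1f}% of reads retained")
```

By this calculation Pt1 retains about 38.6% of its reads, Pt2 about 5.5%, and Ctrl1 about 50.3%, while Pt3 and Ctrl2 retain 100%.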
