
list index out of range #27

@NiklasDreyer

Hi,

Thanks for an awesome piece of software. I have used TranscriptClean on large-scale assemblies with high success before.

However, I am now running a metatranscriptomics project in which I am only interested in reads that map to the COI gene of the taxon of interest. I have 6504 reads that map to the full-length reference COI sequence. I have sorted the .sam file with samtools, but when I run TranscriptClean I get the error "list index out of range". When I inspect the mapping in a genome map viewer, it looks good, albeit with some gaps here and there. Still, all my reads are within the reference.

I sort the reads:
samtools sort -O sam -T sample.sort -o sample.sort.sam mapped1.sam

I run this command:
python ....../TranscriptClean.py --sam sample.sort.sam --genome mygenome.fasta --out outfile

The program then returns:
list index out of range
Took 0:00:54 to process transcript batch.
Took 0:00:00 to combine all outputs.
Below is a snippet from the sorted .sam file.

@HD VN:1.0 SO:coordinate @SQ SN:Facetotecta LN:1527 @RG ID:Unpaired_reads_assembled_against_Facetotecta SM: @PG ID:samtools PN:samtools VN:1.14 CL:samtools sort -O sam -T sample.sort -o sample.sort.sam mapped1_sorted_Facetotecta_cut_extraction.sam m54057_190926_040405/25100833/ccs_1 0 Facetotecta 1 255 2M1P2M1P1M5P1M10P1M3P1M2P1M5P1M4P1M1P1M2P1M2P1M1P1M1P1M1P2M3P1M2P1M2P1M1P1M2P1M1P1M2P4M3P1M1P1M1P1M2P2M1P1M2P1M5P2M3P1M1P1M2P2M4P2M1P1M2P1M2P1M8P1M3P1M21P2M4P1M5P1M2P1M2P1M3P1M4P1M5P1M3P1M5P1M1P1M3P2M3P1M1P1M6P1M1P1M3P1M2P1M1P1M10P1M2P1M18P1M1P2M4P1M6P1M1P1M8P2M3P2M11P1M2P1M6P1M2P1M6P1M3P2M2P1M7P1M6P1M1P1M1P1M1P1M1P1M9P1M8P3M5P1M1P1M1P1M1P1M1P2M7P1M3P1M1P1M1P1M2P1M1P1M14P1M1P1M4P1M4P1M1P1M12P3M2P1M6P1M1P1M3P2M2P1M1P1M1P2M3P1M3P2M3P1M3P2M1P1M4P2M23P1M4P1M4P1M8P1M2P1M1P1M1P1M17P1M1P1M5P1M3P1M1P1M16P1M1P1M1P2M3P1M5P1M1P3M1P3M1P1M2P2M4P2M1P1M1P1M5P2M3P1M7P1M5P1M2P1M2P1M1P1M1P1M1P1M1P1M3P2M1P1M30P1M1P1M2P1M1P2M8P1M3P1M8P1M1P1M1P1M8P2M1P2M1P1M1P1M4P2M5P1M1P1M4P1M10P1M5P1M4P1M5P1M2P1M10P1M1P1M1P1M3P1M4P1M4P2M1P2M9P2M4P2M3P2M2P1M1P1M3P2M2P1M2P2M2P3M3P1M17P1M4P1M1P1M3P2M2P1M4P1M8P1M1P1M1P1M2P1M2P3M1P1M3P1M4P1M1P2M2P1M1P1M3P1M3P1M3P1M4P1M3P1M1P1M3P1M1P1M5P2M2P1M1P2M1P1M3P1M1P1M1P1M3P1M8P2M1P2M1P1M2P2M11P1M3P2M8P1M1P2M14P1M14P2M8P1M1P2M3P1M4P1M5P1M1P1M1P1M1P1M2P2M6P1M1P1M1P1M1P1M4P1M2P1M3P1M6P1M2P1M2P2M2P1M1P3M1P1M2P1M6P1M2P1M1P1M2P1M9P1M2P2M4P1M4P1M5P1M6P1M1P1M1P1M3P1M3P1M4P1M7P1M8P1M8P1M9P1M16P1M2P1M18P1M4P1M12P1M6P1M3P1M3P1M2P1M6P1M1P1M12P1M1P1M1P1M12P2M7P1M1P1M3P1M3P1M1P2M1P1M4P1M3P1M3P1M1P1M7P1M3P1M2P2M21P1M6P1M3P1M1P2M29P2M2P1M2P1M1P1M81P2M1P1M4P2M1P1M2P1M2P1M1P2M1P1M1P1M1P2M1P1M4P2M1P1M2P2M10P1M3P1M3P1M1P1M4P1M1P1M1P1M1P1M2P1M1P1M1P2M6P1M9P1M3P1M2P2M3P1M7P1M2P1M3P1M1P3M4P1M6P1M2P1M1P2M1P1M3P1M1P1M9P1M1P1M1P2M3P1M4P2M3P3M1P1M10P1M8P1M4P1M2P1M4P1M2P1M2P1M4P2M5P1M2P1M5P1M1P2M3P1M1P1M1P2M13P1M1P1M1P1M2P1M2P1M12P1M9P1M1P1M1P1M1P1M2P1M3P2M2P1M2P1M8P1M1P2M1P2M1P3M10P2M4P1M2P1M4P1M4P1M1P1M8P1M2P1M1P1M4P2M1P2M2P1M2P1M3P1M9P1M5P2M4P2M17P1M1P1M13P1M2P1M3P1M11P1M2P1M10P1M2P1M22P1M1P1
M19P1M4P1M3P1M14P1M5P1M3P1M2P1M3P1M5P1M12P1M11P1M2P1M2P1M6P1M2P1M10P1M1P1M9P1M3P1M1P1M4P1M2P1M2P1M1P1M12P1M3P1M2P1M1P1M1P1M2P1M2P1M3P1M2P1M4P1M5P2M1P1M2P1M2P1M1P1M2P1M3P1M3P1M6P1M1P1M3P1M2P1M6P1M3P1M6P1M1P1M3P1M1P1M1P1M4P1M4P1M8P1M6P1M1P1M1P1M2P2M * 0 0 ATGAAACGATGATTATTTTCCACTAACCACAAAGACATTGGTACAATGTACTTTATCCTGGGAGCGTGATCAGGTATAATCGGTACTGGTATAAGAATACTTATTCGAAGGGAACTAGGTCAACCCGGTAGACTTATTGGTAATGACCAAATTTACAACGTAATTGTTACAGCTCATGCATTTATCATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGCTTTGGCAATTGGCTTGTTCCTCTTATAATTGGAGCTCCTGATATAGCCTTCCCTCGAATAAACAATATAAGATTTTGACTTCTTCCTCCTTCCCTCTCTCTTCTTTTATCAAGAAGATTAACTGAATCTGGAGTTGGAACAGGATGAACAGTTTACCCTCCTCTTTCAAGTAATATTGCCCACAGTGGTATTTCCGTTGACTTAGCTATCTTCTCACTCCATTTGGCAGGAGCAAGATCAATTTTAGGTGCCATTAATTTCATTACTACTATCATCAATATACGTAATAAAATAATCACAATAGACCGATTACCTCTATTTGTATGATCAGTTTTCATCACAGCGTTTCTCC * RG:Z:Unpaired_reads_assembled_against_Facetotecta m54057_190926_040405/7602703/ccs_2 0 Facetotecta 1 255 2M1P2M1P1M5P1M10P1M3P1M2P1M5P1M4P1M1P1M2P1M2P1M1P1M1P1M1P2M3P1M2P1M2P1M1P1M2P1M1P1M2P4M3P1M1P1M1P1M2P2M1P1M2P1M5P2M3P1M1P1M2P2M4P2M1P1M2P1M2P1M8P1M3P1M21P2M4P1M5P1M2P1M2P1M3P1M4P1M5P1M3P1M5P1M1P1M3P2M3P1M1P1M6P1M1P1M3P1M2P1M1P1M10P1M2P1M18P1M1P2M4P1M6P1M1P1M8P2M3P2M11P1M2P1M6P1M2P1M6P1M3P2M2P1M7P1M6P1M1P1M1P1M1P1M1P1M9P1M8P3M5P1M1P1M1P1M1P1M1P2M7P1M3P1M1P1M1P1M2P1M1P1M14P1M1P1M4P1M4P1M1P1M12P3M2P1M6P1M1P1M3P2M2P1M1P1M1P2M3P1M3P2M3P1M3P2M1P1M4P2M23P1M4P1M4P1M8P1M2P1M1P1M1P1M17P1M1P1M5P1M3P1M1P1M16P1M1P1M1P2M3P1M5P1M1P3M1P3M1P1M2P2M4P2M1P1M1P1M5P2M3P1M7P1M5P1M2P1M2P1M1P1M1P1M1P1M1P1M3P2M1P1M30P1M1P1M2P1M1P2M8P1M3P1M8P1M1P1M1P1M8P2M1P2M1P1M1P1M4P2M5P1M1P1M4P1M10P1M5P1M4P1M5P1M2P1M10P1M1P1M1P1M3P1M4P1M4P2M1P2M9P2M4P2M3P2M2P1M1P1M3P2M2P1M2P2M2P3M3P1M17P1M4P1M1P1M3P2M2P1M4P1M8P1M1P1M1P1M2P1M2P3M1P1M3P1M4P1M1P2M2P1M1P1M3P1M3P1M3P1M4P1M3P1M1P1M3P1M1P1M5P2M2P1M1P2M1P1M3P1M1P1M1P1M3P1M8P2M1P2M1P1M2P2M11P1M3P2M8P1M1P2M14P1M14P2M8P1M1P2M3P1M4P1M5P1M1P1M1P1M1P1M2P2M6P1M1P1M1P1M1P1M4P1M2P1M3P1M6P1M2P1M2P2M2P1D1P3D1P1D2P1M6P1M2P1M1I1M2P1M8P1I1M2P2M4P1M4P1M5P1M
6P1M1P1M1P1M3P1M3P1M4P1M4P3I1M8P1M8P1M9P1M16P1M2P1M18P1M4P1M12P1M6P1M3P1M3P1M2P1M6P1M1P1M12P1M1P1M1P1M12P2M7P1M1P1M3P1M3P1M1P2M1P1M4P1M3P1M3P1M1P1M7P1M3P1M2P2M21P1M6P1M3P1M1P2M29P2M2P1M2P1M1P1M81P2M1P1M4P2M1P1M2P1M2P1M1P2M1P1M1P1M1P2M1P1M4P2M1P1M2P2M10P1M3P1M3P1M1P1M4P1M1P1M1P1M1P1M2P1M1P1M1P2M6P1M9P1M3P1M2P2M3P1M7P1M2P1M3P1M1P3M4P1M6P1M2P1M1P2M1P1M3P1M1P1M9P1M1P1M1P2M3P1M4P2M3P3M1P1M10P1M8P1M4P1M2P1M4P1M2P1M2P1M4P2M5P1M2P1M5P1M1P2M3P1M1P1M1P2M13P1M1P1M1P1M2P1M2P1M12P1M9P1M1P1M1P1M1P1M2P1M3P2M2P1M2P1M8P1M1P2M1P2M1P3M10P2M4P1M2P1M4P1M4P1M1P1M8P1M2P1M1P1M4P2M1P2M2P1M2P1M3P1M9P1M5P2M4P2M17P1M1P1M13P1M2P1M3P1M11P1M2P1M10P1M2P1M22P1M1P1M19P1M4P1M3P1M14P1M5P1M3P1M2P1M3P1M5P1M12P1M11P1M2P1M2P1M6P1M2P1M10P1M1P1M9P1M3P1M1P1M4P1M2P1M2P1M1P1M12P1M3P1M2P1M1P1M1P1M2P1M2P1M3P1M2P1M4P1M5P2M1P1M2P1M2P1M1P1M2P1M3P1M3P1M6P1M1P1M3P1M2P1M6P1M3P1M6P1M1P1M3P1M1P1M1P1M4P1M4P1M8P1M6P1M1P1M1P1M2P2M4P3M2P1M2P1I1M3P1M4P1M20P1D4P1M2P1M1P1M1P2M13P1M6P2M2P1M5P2M2P2M1P1M1P1M2P1M1P1M1P1M1P1M1P1M4P3M1P2M1P1M2P1M1P1M3P6M4P1M1P2M1P2M2P2M2P1M5P1M1P1M13P1M2P1M1P1M1P1M1P1M8P1M8P1M2P1M4P1M3P1M1P2M14P3M1P1M4P1M3P1M2P1M2P1M12P2M1P3M2P2M2P2M11P1M1P1M2P1D6P1D3P1D7P1D2P1D14P1D3P1M2P1M1P1M3P2M1P1M4P2M2P3M1P2M1P1M1P2M1P1M1P1M2P2M2P2M1P2M8P2M1P3M1P1M1P5M2P1M8P2M1P1M1P2M2P1M1P2M1P1M2P1M1P1M1P3M2P2M1P1M1P1M1P2M2P1M8P1M1P1M1P2M3P1M1P3M2P1M4P1M2P2M2P1M1P1M1P1M1P2M3P1M2P2M51P1M1P1M1P1M1P1M2P2M1P2M208P1M1P1M1P1M1P1M1P1M1P1M2P1M1P1M1P2M9P1M1P2M1P2M4P1M1P1M4P1M1P1M1P1M3P1M3P1M1P1M1P1M1P1M1P1M14P1M4P1M * 0 0 
ATGAAACGATGATTATTTTCAACCAATCATAAAGATATTGGAACTATATATATAATATTCGGCGCCTGATCCGGCACTATAGGAGTGGCAATAAGAATAATTATCCGTAGAGAACTAGGGCAACCCGGTTCTCTAATTGGTAACGATCAAATCTATAATGTAATTGTAACTGCCCACGCCTTTATCATAATTTTCTTTATAGTAATACCAATCATAATTGGAGGATTTGGAAACTGACTAATTCCTCTGATATTAGGATCCCCTGATATAGCATTTCCACGGATAAATAACATAAGATTCTGACTACTCCCCCCATCATTAATTCTTTTAATTAGAAGAAGACTAACAGAAAGGGGGGTAGGAACAGGATGAACGGTCTATCCTCCTCTTTCAAGAAATATCTCTCATAGAGGAGTCTCAGTAGACATGGCCATCTTCTCCCTCCACTTAGCTGGAGCAAGATCCATTTTAGGAGCCATTAATTTTATTACTACGATCATTAATATACGCAACAAAAACCTTTCTTTTGACCGTCTACCATTATTAGTATGATCTATCTTTATTACTACTATCCTTTTACTACTTTCTTTACCAGTACTTGCCGGAGCTATTACCATACTATTAACAGATCGAAATATTAATACTTCATTCTTTGATCCAGGTGGGGATCCTGTATTATATCAACATCTATTTTGATTTTTCGGACACCCAGAAGTTTATATTTTAATTCTACCAGGGTTTGGAATAGTTTCCCACATTATTAGACAAGAAAG *
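
The CIGAR strings in the snippet above consist largely of P (padding) operations alternating with short M runs. Whether that is related to the error is unconfirmed; the sketch below is only a diagnostic aid for checking which CIGAR operations a SAM file contains. It assumes a standard tab-delimited SAM file, and the function names are hypothetical helpers, not part of TranscriptClean or samtools.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation letter from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_op_counts(cigar):
    """Return the total length of each operation in one CIGAR string."""
    counts = Counter()
    for length, op in CIGAR_ELEMENT.findall(cigar):
        counts[op] += int(length)
    return counts


def summarize_sam(path):
    """Sum CIGAR operation lengths over all alignment lines of a SAM file."""
    totals = Counter()
    with open(path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line
            fields = line.rstrip("\n").split("\t")
            # Field 6 (index 5) is the CIGAR; "*" means it is unavailable.
            if len(fields) < 11 or fields[5] == "*":
                continue
            totals.update(cigar_op_counts(fields[5]))
    return totals
```

Calling `summarize_sam("sample.sort.sam")` returns a mapping from operation letter to total length. A large P total relative to M would suggest the alignments were exported in a padded representation rather than written directly by a read aligner.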

Any ideas about what is going wrong? It looks like TranscriptClean cannot run without a proper genome map or chromosome list, but I wanted to ask in case others are getting the same error.
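
One thing that can be ruled out independently of TranscriptClean is a mismatch between the reference names in the SAM header and the sequence names in the FASTA file. The sketch below compares the two; it assumes a tab-delimited SAM header and a plain-text FASTA file, and both function names are hypothetical.

```python
def sam_reference_names(path):
    """Collect the SN: values from the @SQ header lines of a SAM file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if not line.startswith("@"):
                break  # header is over once alignment lines start
            if line.startswith("@SQ"):
                for field in line.rstrip("\n").split("\t")[1:]:
                    if field.startswith("SN:"):
                        names.add(field[3:])
    return names


def fasta_names(path):
    """Collect sequence names (first word after '>') from a FASTA file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def missing_references(sam_path, fasta_path):
    """Return SAM reference names that have no matching FASTA record."""
    return sam_reference_names(sam_path) - fasta_names(fasta_path)
```

An empty result from `missing_references("sample.sort.sam", "mygenome.fasta")` would mean every reference named in the SAM header is present in the FASTA file, so the problem would lie elsewhere.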

Appreciate any help and I am open to other solutions.
Niklas
