[mira_talk] Paired DeNovo Assembly with filtered Reads fails: Readgroups not set

  • From: Bert Brutzel <bertbrutzel@xxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 08 Oct 2014 11:17:55 +0200


Dear All,

we had two organisms in our sequencing run, one yeast and one bacterium. We separated the reads of the two organisms using a combination of reference mappings and BLAST annotations, and then extracted the read names from a BAM file of the mappings. I now have a list of read names, which I used to pull the matching reads out of the original FASTQ files of an Illumina paired-end run via grep.
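
For illustration, a name-based extraction that keeps both mates together could look roughly like the Python below. This is a minimal sketch, not the grep commands actually used: it assumes 4-line FASTQ records, that the two original files list mates in the same order, and that the mate information sits in the comment after the first space (as in the headers shown further down). The function names read_fastq, read_name and filter_pairs are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def read_name(header):
    """'@NAME 1:N:0:TAG' -> 'NAME' (drops '@' and the mate comment)."""
    return header[1:].split()[0]


def filter_pairs(r1_in, r2_in, wanted, r1_out, r2_out):
    """Copy a pair to the outputs only if its read name is in `wanted`.

    Both inputs are walked in lockstep, so the outputs stay in the same
    order and every kept R1 record has its R2 partner at the same position.
    Returns the number of pairs written.
    """
    kept = 0
    for rec1, rec2 in zip(read_fastq(r1_in), read_fastq(r2_in)):
        name1, name2 = read_name(rec1[0]), read_name(rec2[0])
        if name1 != name2:
            raise ValueError(f"input files out of sync: {name1} vs {name2}")
        if name1 in wanted:
            r1_out.write("\n".join(rec1) + "\n")
            r2_out.write("\n".join(rec2) + "\n")
            kept += 1
    return kept


if __name__ == "__main__":
    # Tiny made-up records, only to show the call; not data from the run.
    r1 = io.StringIO("@a 1:N:0:X\nAC\n+\nII\n@b 1:N:0:X\nGG\n+\nII\n")
    r2 = io.StringIO("@a 2:N:0:X\nTT\n+\nII\n@b 2:N:0:X\nCC\n+\nII\n")
    out1, out2 = io.StringIO(), io.StringIO()
    print(filter_pairs(r1, r2, {"b"}, out1, out2), "pair(s) kept")
```

Parsing whole records rather than matching single lines also avoids picking up quality lines that happen to begin with '@'.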

MIRA now gives me an error message saying that the readgroups have no paired reads.
Please find below the error, my manifest, and the "head -n 20" output of both read files.

I am sure I screwed up the formatting somewhere in my wild chase, but I am not certain how to fix this to get paired data again.

Thanks for the help,
Bert

error:

********************************************************************************
* MIRA found readgroups where pairs are expected but no read has a partner.    *
* See log above and then check your input please (either manifest file or data *
* files loaded or segment_naming scheme).                                      *
********************************************************************************
->Thrown: void Assembly::basicReadGroupChecks()
->Caught: main


manifest:

# Reference assembly with paired Illumina data
# Organism: Bacterium A
# 06.10.2014
project = Bacterium A
job = genome,denovo,accurate
parameters = -NW:cac=warn -GE:kpmf=15 -NW:cmrnl=warn -NW:cdrn=warn
# The second part defines the sequencing data MIRA should load and assemble
# The data is logically divided into "readgroups"
#here is the paired data
readgroup = DataIlluminaPairedLib
autopairing
data = /home/data/Genomes/BacteriumA/Sequence_Clean_Up/20140724_PAENI_TAGGCATG_L005_R1_001.fastq /home/data/Genomes/BacteriumA/Sequence_Clean_Up/20140724_PAENI_TAGGCAT$
technology = solexa

head R1:

@HWUSI-EAS1580R:46:FC:5:1:19723:1031 1:N:0:TAGGCATG
GNTAGAAATATATGATGGAGCTGTTTATGGTGAAGTAAAGCAAGTAAAGGTCGACTTGAAGCAAGAGNNNNNNNNGGGAACCAAAGNNNNAATTAGATATATTTTAGTGGAA
+
?#?B9DDDDDIIFIIHIGHBIGIIIIHHIIDIIIIGIIDIGEGDFGGGGDFEIIGIHIIIIIIEI5>########51953?E<??4####81'2-EAADAGB>BA;:<@??3
@HWUSI-EAS1580R:46:FC:5:1:19506:1032 1:N:0:TAGGCATG
GNCGTGGGCGTAGGAAATTTGAGAGGAGCTGTCCTTAGTACGAGAGGACCGGGATGGACGTCCCGCGNNNNNNNNAGTTGTTACGCNNNNACAACCCCGCGGCAGCCAGCTC
+
=#;=:;9@@@<EE?8B=B=C4<EEE>GBG<<>EEDDGG3GEE-E2B+4B>EB4E<ED<DE1B##################################################
@HWUSI-EAS1580R:46:FC:5:1:19235:1032 1:N:0:TAGGCATG
GNGCGGGTGTAGTACCCACGCGCCATGACCAGACTTGGGCGCTCTCCATCCAGATACGCAATGGCGGNNNNNNNNCGGGCCACCCCNNNNCCGCACAGACACCCTCCCCCGC
+
B#A@;BCAC?GGG@GGGGGEG=GC8CCCGGA>>CA3?=2B*=8089A=07B+=A##########################################################

head R2:

@HWUSI-EAS1580R:46:FC:5:1:4279:1032 2:N:0:TAGGCATG
CCTGGTGTCAATGAGATGAGTAAGCAGAGCAGAGTAACTTCCCAATCCGAGNACAATTGAGTTTTATACGTAAACATACCGGGCCTTATATCTAACCTGTCTCTTATACACA
+
IGGIIEGDGBGGGAGEGGEGFGEEGHHHFIIGDIEEGGG@GGG<EGGGGED#=?:??@B?IEEIHIIHFI>A?@D999;5EBCEEBEB><BBBBBFEBEBD@C2CCCBEDB@
@HWUSI-EAS1580R:46:FC:5:1:10542:1032 2:N:0:TAGGCATG
ATTACGGACGATGCACGGGATTTCAACAAGGCCTGCGACCGGATCGCAGATNAGGCCCAGTGTATTTTTTAACGCCAAACCAACGCCATGATCAACCCGATCCGGCGGACCT
+
GD?:GGG4GDGBGBB2EFE8DEGGG<GDDDBFE8B-=?88?A>5B>>>B<6#7;?:;?7;?E?8E8B=@@7-A>E??8?7F<?A############################
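
The first records of the two excerpts above carry different read names (...:19723:1031 in R1, ...:4279:1032 in R2), so the two files do not appear to list mates in the same order. A quick way to check this over whole files could look like the sketch below. It assumes 4-line FASTQ records and names that differ only in the comment after the first space; header_names and first_mismatch are hypothetical helper names, and whether such a mismatch is what triggers the MIRA error above is an assumption.

```python
from itertools import zip_longest


def header_names(lines):
    """Yield the read name from every 4th line (the header) of a FASTQ."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            # Drop the leading '@' and the mate comment after the first space.
            yield line.strip()[1:].split()[0]


def first_mismatch(r1_lines, r2_lines):
    """Return (record_index, name1, name2) of the first out-of-sync pair.

    Returns None if both inputs contain the same names in the same order.
    A name of None means one input ended before the other.
    """
    pairs = zip_longest(header_names(r1_lines), header_names(r2_lines))
    for idx, (n1, n2) in enumerate(pairs):
        if n1 != n2:
            return idx, n1, n2
    return None


if __name__ == "__main__":
    # First headers of the R1 and R2 excerpts; sequences shortened to dummies.
    r1 = ["@HWUSI-EAS1580R:46:FC:5:1:19723:1031 1:N:0:TAGGCATG", "GN", "+", "?#"]
    r2 = ["@HWUSI-EAS1580R:46:FC:5:1:4279:1032 2:N:0:TAGGCATG", "CC", "+", "IG"]
    print(first_mismatch(r1, r2))
```

With real files, the two arguments can be open file handles, since iterating a handle yields its lines.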


