Thanks Adrian,

The illumina coverage is 80-100x, so pretty decent. I did an assembly with only illumina reads and yes the numbers were better but not fully recovering the genes and shuffling snps among the 3 homologous genes. Thus, I thought that adding long 454 reads could help in resolving gene copies to their full extent.

Paradoxically, I am also getting better assemblies using 454 reads alone, it is just that I can't get both types of reads assembled together.

Juan

-----Original Message-----
From: mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx] On Behalf Of Adrian Pelin
Sent: Monday, February 24, 2014 11:49 AM
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: tweaking Manifest for polyploid genome

What is your coverage for the illumina PE reads? In my experience, 454 + illumina assemblies with MIRA are worse in stats than illumina assemblies alone. If your illumina coverage is enough, try using it only without 454.

On 2/24/2014 11:30 AM, Gutierrez, Juan wrote:
> Thanks Hanquan and Martin for your answers.
>
> There are 3 types of reads in the assembly:
>
> 869,000 454 reads, about 500bp long on average.
> 32,254,100 illumina 100bp pair ended reads (approx total 62 million reads)
> 3,897,411 illumina 100bp single reads (result of the trimming process where the other pair was of bad quality)
>
> All reads are longer than 90bp (I removed the ones shorter than that after trimming)
>
> I am pretty confident on the trimming I did and all remaining reads are of extreme high quality. I agree with Martin in that reads do not map, but I think it is because of the similarity among the 3 different copies of each gene. They just do not know where to map unequivocally.
>
> I am thinking that I might have been very strict on the parameters. Maybe the right approach for this kind of repetitive assembly is being less strict on the parameters and let Mira resolve the uncertainties?
>
> Juan
>
>
> -----Original Message-----
> From: mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx] On Behalf Of Martin Mokrejs
> Sent: Monday, February 24, 2014 8:16 AM
> To: mira_talk@xxxxxxxxxxxxx
> Subject: [mira_talk] Re: tweaking Manifest for polyploid genome
>
> Hi Juan,
> how many 454 reads do you have on input? I see max coverage 180830 for the 454 technology. Also from "Coverage assessment" I see that only 454 is covered at 0.78x and Solexa at 28.34x. Let me suspect your main issue is proper trimming of raw reads. Looks the reads just do not assemble or you sequenced too few material using 454. Ah, I see in your Manifest file you used RapidLib approach and MIDs ... poor boy, how did you remove them?
>
> Maybe you would appreciate as a paid service my help, please see http://www.bioinformatics.cz . There are plenty of tricks needed to get 454 trimming right and I don't know any other tool (except mine) ;) doing that right, not even of a tool doing the proper queries for all adapters, primers, artifacts. However, a lot of effort had to be put into the wrapper code to manage and interpret the candidate alignments. Having just the right queries is not enough. 3 years of work, 25k lines of code in python. My apologies if this is considered as an Ad, I couldn't resist.
> Martin
>
>
> Gutierrez, Juan wrote:
>> Hi,
>>
>> I am trying to do RNA-seq de novo on a polyploid (hexaploid) genome using a combination of 454 and illumina 100bp paired ended reads. The three copies of each gene are highly similar to one another. I am having trouble in separating each one of the three copies into three different full-length assemblies. Most of the times I just get a fragment of each of the three copies. I am guessing that when Mira finds a difference between highly similar transcripts, it just can’t assess if there is a polymorphism between 2 of the copies or a sequencing error.
>> In any case, Mira seems to end the assembly way before reaching the end of the transcript.
>>
>> I have prepared and run 2 different Manifests. I am getting better results with Manifest.conf (less number of contigs but longer) than with Manifest2.conf (higher number but shorter contigs), so I am supposing that I could eventually separate the 3 copies of each gene by fine-adjusting the parameters.
>>
>> Any suggestion would be greatly appreciated,
>>
>> Thanks so much!
>>
>> Juan
>>
>>
>> This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
> --
> You have received this mail because you are subscribed to the mira_talk mailing list. For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html