[mira_talk] Re: tweaking Manifest for polyploid genome

  • From: "Gutierrez, Juan" <Juan.Gutierrez@xxxxxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Mon, 24 Feb 2014 16:30:20 +0000

Thanks Hanquan and Martin for your answers.

There are 3 types of reads in the assembly:

869,000 454 reads, about 500bp long on average.
32,254,100 Illumina 100bp paired-end reads (approx. 62 million reads in total)
3,897,411 Illumina 100bp single reads (left over from the trimming process, where 
the other read of the pair was of bad quality)
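
As a rough cross-check of these figures, the amount of sequence per read set can be tallied as below. This is only a minimal sketch: it assumes the 32,254,100 figure counts pairs (two reads each; drop the factor of 2 if it already counts individual reads), and it uses the nominal lengths quoted above (500bp and 100bp) rather than measured ones. The names READ_SETS and total_bases are illustrative and not part of MIRA.

```python
# Read sets as listed above. Lengths are nominal, not measured.
READ_SETS = {
    "454": {"reads": 869_000, "mean_len": 500},
    # Assumption: the quoted figure counts pairs, so two reads per pair.
    "illumina_paired": {"reads": 2 * 32_254_100, "mean_len": 100},
    "illumina_single": {"reads": 3_897_411, "mean_len": 100},
}


def total_bases(read_sets):
    """Return the approximate number of bases per read set."""
    return {name: s["reads"] * s["mean_len"] for name, s in read_sets.items()}


bases = total_bases(READ_SETS)
illumina = bases["illumina_paired"] + bases["illumina_single"]

for name, n in bases.items():
    print(f"{name:16s} {n / 1e6:10.1f} Mbp")
print(f"Illumina / 454 base ratio: {illumina / bases['454']:.1f}")
```

Under these assumptions the Illumina data carries well over ten times as many bases as the 454 data, so a large gap between the per-technology coverage values is to be expected.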

All reads are longer than 90bp (I removed the ones shorter than that after 
trimming)
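
For illustration, a length filter of this kind can be sketched in a few lines of Python. This is not the script actually used here; it assumes plain 4-line FASTQ records, and the function name filter_fastq is hypothetical. Only the 90bp threshold comes from the text above.

```python
import io

MIN_LEN = 90  # threshold mentioned above


def filter_fastq(src, dst, min_len=MIN_LEN):
    """Copy 4-line FASTQ records from src to dst, keeping reads of at
    least min_len bases. Returns a (kept, dropped) tuple."""
    kept = dropped = 0
    while True:
        header = src.readline()
        if not header:  # end of input
            break
        seq = src.readline()
        plus = src.readline()
        qual = src.readline()
        if len(seq.rstrip("\r\n")) >= min_len:
            dst.write(header + seq + plus + qual)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# Tiny in-memory demo: one 95bp read (kept) and one 50bp read (dropped).
demo_in = io.StringIO(
    "@r1\n" + "A" * 95 + "\n+\n" + "I" * 95 + "\n"
    "@r2\n" + "C" * 50 + "\n+\n" + "I" * 50 + "\n"
)
demo_out = io.StringIO()
print(filter_fastq(demo_in, demo_out))
```

Note that running such a filter on each file of a pair independently breaks the pairing: when only one mate passes, it has to be moved to the single-read set, which is how single reads like the ones listed above arise.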

I am pretty confident in the trimming I did, and all remaining reads are of 
extremely high quality. I agree with Martin that the reads do not map, but I 
think that is because of the similarity among the 3 different copies of each 
gene. The reads just cannot be placed unequivocally.

I am thinking that I might have been too strict with the parameters. Maybe the 
right approach for this kind of repetitive assembly is to be less strict with 
the parameters and let MIRA resolve the uncertainties?

Juan 
 

-----Original Message-----
From: mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx] On 
Behalf Of Martin Mokrejs
Sent: Monday, February 24, 2014 8:16 AM
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: tweaking Manifest for polyploid genome

Hi Juan,
  how many 454 reads do you have on input? I see a max coverage of 180830 for 
the 454 technology. Also, from "Coverage assessment" I see that 454 is covered 
at only 0.78x and Solexa at 28.34x. I suspect your main issue is proper 
trimming of the raw reads. It looks like the reads just do not assemble, or you 
sequenced too little material using 454. Ah, I see in your Manifest file that 
you used the RapidLib approach and MIDs ... poor boy, how did you remove them?

  Maybe you would appreciate my help as a paid service; please see 
http://www.bioinformatics.cz . There are plenty of tricks needed to get
454 trimming right and I don't know of any other tool (except mine) ;) that 
does it right, nor even of a tool doing the proper queries for all adapters, 
primers, artifacts. However, a lot of effort had to be put into the wrapper 
code to manage and interpret the candidate alignments.
Having just the right queries is not enough. 3 years of work, 25k lines of code 
in Python. My apologies if this is considered an ad, I couldn't resist.
Martin


Gutierrez, Juan wrote:
> Hi,
> 
> I am trying to do RNA-seq de novo assembly on a polyploid (hexaploid) 
> genome using a combination of 454 and Illumina 100bp paired-end reads. 
> The three copies of each gene are highly similar to one another. I am 
> having trouble separating the three copies into three different 
> full-length assemblies. Most of the time I just get a fragment of each 
> of the three copies. I am guessing that when MIRA finds a difference 
> between highly similar transcripts, it just can't assess whether it is 
> a polymorphism between 2 of the copies or a sequencing error. In any 
> case, MIRA seems to end the assembly well before reaching the end of 
> the transcript.
> 
> I have prepared and run 2 different Manifests. I am getting better 
> results with Manifest.conf (fewer but longer contigs) than with 
> Manifest2.conf (more but shorter contigs), so I suppose that I could 
> eventually separate the 3 copies of each gene by fine-tuning the 
> parameters.
> 
> Any suggestion would be greatly appreciated,
> 
> Thanks so much!
> 
> Juan

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
