[mira_talk] EST assembly advice for a beginner

  • From: Craig Marshall <craig.marshall@xxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 29 Sep 2009 20:51:24 +1300

I have some EST data that I would like to assemble, but I'm not having much success in getting a set of contigs that I believe.


The data are from conventional Sanger sequencing on 3730 XL sequencers and average about 1200 bp per read. Each clone was sequenced from both ends of the plasmid, so I have a forward and a reverse read from each cloned sequence. I guess this makes them paired reads. There are about 24,000 reads from about 12,000 sequences.

I have experimented with various parameters but I don't think I have sufficient feel for what is going on to be confident that I have got reasonable results. I have the MIRA and consed/phred/phrap manuals and have used the default scripts in phred/phrap to process the ABI data files. I'm reasonably happy that these are right, but without much experience of this work, it is hard to judge whether they are in good order or not.

Can anyone point me to a suitable primer that might help me work out settings to get MIRA to work on these files? Ideally I would like to be confident that I have contigs that contain just one EST (and not several, which seems to be a problem with what I have done so far). Part of this comes from unmasked vector sequence, I think, although I thought I'd incorporated the vector sequence in the appropriate files. It is possible that there are collections of closely related ESTs, but I'm prepared to accept a certain amount of slop in these, provided they are the 'right' length.

The other thing that I'm finding difficult is assessing the quality of the contigs. I end up with quite a large data set and even with consed I don't find it easy to work my way through the contigs to assess whether I believe the data from the run.
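
As a rough illustration of the kind of first-pass triage that could narrow down which contigs to open in consed, here is a minimal sketch. It assumes the contigs have been exported as a plain FASTA file; the function names and the max_expected threshold are hypothetical placeholders, not anything taken from MIRA or consed. The idea is simply that a contig far longer than a plausible single transcript may contain more than one EST and is worth inspecting first.

```python
from statistics import median


def read_fasta_lengths(lines):
    """Return {contig_name: sequence_length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths, max_expected=3000):
    """Summarize contig lengths and list contigs longer than max_expected.

    max_expected is an arbitrary placeholder (an assumption); choose a value
    that reflects the transcript lengths you consider plausible.
    """
    if not lengths:
        raise ValueError("no contigs found")
    values = sorted(lengths.values())
    return {
        "count": len(values),
        "min": values[0],
        "median": median(values),
        "max": values[-1],
        "suspect": sorted(n for n, size in lengths.items() if size > max_expected),
    }


if __name__ == "__main__":
    # Tiny made-up example; with real data, pass an open file handle instead.
    demo = [">c1", "ACGT" * 100, ">c2 note", "ACGT" * 500, "ACGT" * 500, ">c3", "ACGTAC"]
    print(summarize(read_fasta_lengths(demo)))
```

With a real assembly, read_fasta_lengths(open(path)) would take the place of the demo list, where path points at whatever FASTA file of contigs the assembler wrote out. Length alone says nothing about base quality or coverage, so this only helps prioritize which contigs to look at by eye.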

I should add that these data are from a nematode but not one that is already sequenced.

Many thanks,

Craig

--
Craig Marshall  craig.marshall@xxxxxxxxxxx
Biochemistry http://biochem.otago.ac.nz/staff/marshall/cmarshall.html
University of Otago     Phone +64 3 479 7570 Fax +64 3 479 7866


--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
