[mira_talk] Re: Assembly of NCBI Data

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Sun, 9 Jan 2011 01:05:07 +0100

On Friday 07 January 2011 Thomas, Dallas wrote:
>   I have 13,900 sequences comprising NCBI nt, est and gss data.  This is
> just the FASTA information obtained from NCBI.  I am looking at doing a
> de-novo assembly on the data, but am a bit baffled about the --joblist
> and options I should be using.  Does one just use sanger? Or would one
> assume that a high proportion of est, etc. might come from 454?  Would
> you use "est" in the joblist for all of the sequences, or assemble only
> the est using "est" and use genome for the gss and nt?

If in doubt, try

  --job=est,denovo,draft,sanger

and see how the assembly behaves and what the results look like. If far
too many contigs get built because of differences that look like homopolymer
errors, try 454 instead of sanger.
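
One rough way to judge "far too many contigs" is to count the contigs each
run produced and compare. The sketch below is only an illustration: it
assumes each run wrote its contigs to a FASTA file, and the function names
and any file paths you pass in are hypothetical, not something MIRA
prescribes. It counts FASTA header lines (lines starting with ">") and
nothing more.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting '>' header lines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def compare_contig_counts(runs):
    """Map each run label to the number of contigs in its FASTA file.

    `runs` is a dict of {label: path}; labels and paths are up to the
    caller (for example one entry per sequencing technology tried).
    """
    return {label: count_fasta_records(path) for label, path in runs.items()}
```

Calling compare_contig_counts with one entry for the sanger run and one for
the 454 run gives two numbers to put side by side. A much lower count for
the 454 run would fit the homopolymer explanation, but a look at the
contigs themselves is still needed before drawing conclusions.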

B.
