On Tuesday 08 September 2009 Martin A. Hansen wrote:
> OK, so finally MIRA completed with the combined 454/Solexa data run.
> [...]
> Large contigs:
> --------------
> With Contig size >= 500
>      AND (Total avg. Cov >= 20
>           OR Cov(san) >= 0
>           OR Cov(454) >= 13
>           OR Cov(sxa) >= 6
>           OR Cov(sid) >= 0
>          )

Your average Solexa coverage is ~18 ... so you ran on a subset of the
Solexa reads?

> [...]
> Quality assessment:
> -------------------
> Average consensus quality: 22

An average quality of 22 for a hybrid 454/Solexa assembly is *not* normal.
It should be somewhere in the 80s. It looks like you ran the assembly
without giving any quality values for any data set ... is that possible?

> I find the result is more messy than the ~70 contigs from the 454-data only
> assembly. The longest contigs are a bit longer, but none of the big contigs
> appear to have been joined. And then a fair number of short contigs have
> appeared.

Solexa de-novo assemblies are always like that. Solexa reads seem to have a
higher degree of variation than 454 or Sanger reads, which leads to more
"short and low coverage" contigs.

The fact that none of the larger contigs were "joined" tells me that you
have a sizeable number of repetitive areas in the genome that are >3000
bases. Or that the DNA prep for both 454 and Solexa led to the same problems
at the same genome sites ... but I think this is rather improbable.

The next step would be to build a scaffold (Bambus)?

> Btw. I did check the integrity of the Solexa mate pair data and it does
> look OK.

Good to hear.

Regards,
  Bastien

-- 
You have received this mail because you are subscribed to the mira_talk
mailing list. For information on how to subscribe or unsubscribe, please
visit http://www.chevreux.org/mira_mailinglists.html
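
For readers following the coverage remark above: the estimate of ~18x average Solexa coverage can be reproduced from the "Cov(sxa) >= 6" cutoff if one assumes that each large-contig cutoff is roughly one third of the corresponding average coverage. That ratio is an assumption inferred from the numbers in this mail (6 -> ~18), not something stated in it, and the function name below is purely illustrative. A minimal sketch of the arithmetic:

```python
# Back-of-the-envelope coverage estimate from the "Large contigs" cutoffs.
# ASSUMPTION: each cutoff is about 1/3 of the average coverage. This ratio is
# inferred from the reply (cutoff 6 -> average ~18); it is not taken from MIRA.
ASSUMED_RATIO = 3


def estimate_average_coverage(cutoff, ratio=ASSUMED_RATIO):
    """Return the approximate average coverage implied by a reported cutoff."""
    return cutoff * ratio


# Non-zero cutoffs as quoted in the mail.
cutoffs = {"total": 20, "454": 13, "sxa": 6}

for name, cutoff in cutoffs.items():
    estimate = estimate_average_coverage(cutoff)
    print(f"{name}: cutoff >= {cutoff} -> average coverage ~{estimate}")
```

Because the reported cutoffs are presumably rounded, the results are only approximate. If the full Solexa data set was expected to give much more than ~18x, that would fit the suspicion that only a subset of the reads went into the assembly.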