On Monday 28 March 2011 19:50:24 Stephanie Pearl wrote:
> [...]
> 1. Is Gap4 the recommended viewer for viewing the assembly and its tags?

I like gap4, so if you do not have millions of contigs/reads, it still works OK. That said, James now has a pretty stable beta version of gap5 out. It's definitely the way to go for large projects. And integration with MIRA has never been as easy as it is with gap5:

  To import: "tg_index -C input.caf"
  To export: use the CAF export function of gap5

> 2. On p. 74 of the Definitive Guide to Mira handbook, you mention something
> about needing to invert single contigs by hand. Under what circumstances
> would I need to invert contigs by hand and how would I know that they
> should be inverted?

Hmm, a section number would have helped me find it quickly in the HTML ... the PDF is a bulky beast and, can you believe it, I had to download the version from the net (I didn't want to check out an old document version just for that).

Anyway ... p. 74 reads for me:

------
4.6.2 Reverse GenBank features are in forward direction in a gap4 project

caf2gap has currently (as of version 2.0.2) a bug that turns around all features in reverse direction during the conversion from CAF to a gap4 project. There is a fix available, please contact me for further information (until I find time to describe it here).
------

This is only of interest if your input data contained sequences with annotations in GenBank format (either as a backbone reference sequence or as a sequence of your own to be assembled, be it in GBF/GBK files or in CAF/MAF files which were created with such annotations). Do you have that?

> 3. On p. 96 of the Definitive Guide under the "Where are the SNPs?"
> section, you indicate that you don't recommend assembling sequences of
> more than one strain to ID SNPs.

You are in a section regarding mapping assembly:

  Chapter 6 Solexa sequence assembly with MIRA3
    6.4 Mapping assemblies
      6.4.6 Places of interest in a mapping assembly
        6.4.6.1 Where are SNPs?

There I indeed recommend mapping your strains one by one to reference sequences to get the best results.

> If I'm interpreting this correctly and
> this is the case, then under which circumstances should I use
> MiraSearchESTSNPs? (I have Sanger contigs (plus the individual reads that
> comprise these, but no quality scores), then 2 sets of 454 reads of 2 more
> closely related species (both have quality scores).

miraSearchESTSNPs is not a mapping tool, but a de-novo assembly tool. You have to decide what you want to do:

- map the 454 reads against the Sanger contigs? Use "mira --job=mapping,est,..." (for each strain data set separately, and after that perhaps together to see whether the result suits you). If you give strain information to those mapping assemblies, MIRA will happily point you toward SNPs which are more or less good.

- assemble all data de-novo, separating contigs with no SNPs from contigs with SNPs, and have MIRA point at the differences? Use "miraSearchESTSNPs" with all data sets at once (and use strain information there as well, for all reads!)

Hope that helps,
  B.