Dear all,

I would still be very happy to receive some hints! ;)
Also let me know if any of the questions is unclear.

Cheers,
Thomas

On Aug 3, 2010, at 11:49 AM, Thomas Müller wrote:

> Hello everybody,
>
> Bastien, as you recommended I tried out miraSearchESTSNPs. It works, but I
> have some questions:
>
> - I call miraSearchESTSNPs for the first and second part with the parameters
> I used for normal mira (
> miraSearchESTSNPs --project=DC_NC -job=denovo,454,normal,esps1 --notraceinfo
> -SK:not=12 -AS:urd=no -OUT:rld=yes -AS:sd=no -CO:asir=yes -SB:lsd=yes
> 454_SETTINGS -LR:mxti=no -CL:qc=no -CL:cpat=no -AS:epoq=no -AL:mrs=90
> -CO:rodirs=10 -SK:pr=80 -AS:mrl=50 -AL:mo=20 -FN:fai=DC_NC.fasta
> -FN:fqui=DC_NC.qual
> ) except the --project switch for step2. For the third step I just use
> miraSearchESTSNPs --job=denovo,normal,esps3. I'm not sure about using all the
> parameters in step2 too. Is this correct?
>
> - If I use just one library as input, what is the difference between the
> results of step1 and step2? Since step1 uses all sequences and step2
> assembles the sequences per strain, shouldn't the results be nearly the
> same? But I get e.g. 40000 contigs after step1 and 18000 contigs after step2
> (5000 in the DC_NC_out.caf and 13000 in the remain_out.caf).
>
> - Also when using one input strain: why do I get SRO and SIO tags in the
> step3.caf? Using just one strain I wouldn't expect inter-organism SNPs.
>
> - One more question regarding a single input strain: after step3 there are
> nearly 3000 contigs left (down from 40000 after step1). Would those contigs
> represent something like a possible unigene set?
>
> - If I assemble several strains there are many (for me) unexpected files in
> the result folders after step2.
> Besides the expected .caf, .maf, .tcs, .padded* and .unpadded* files,
> there are also the following files:
>
> .padded_AllStrains.padded.fasta
> .padded_AllStrains.padded.fasta.qual
> .padded_AllStrains.unpadded.fasta
> .padded_AllStrains.unpadded.fasta.qual
> .padded_DC_NC.padded.fasta -> DC_NC is the name of my lib
> .padded_DC_NC.padded.fasta.qual
> .padded_DC_NC.unpadded.fasta + .qual
> .padded_default.padded.fasta + .qual
> .padded_default.unpadded.fasta + .qual
>
> The AllStrains files seem to be identical to the DC_NC files, and the default
> files seem to consist of sequence names and many '@' in the .fasta files
> and '0' in the .qual files. I'm confused about what to do with those files
> and what extra information they provide.
>
> - Why does the number of contigs decrease so much after step2? I don't get
> it...
>
> I think that's all for now ;)
>
> Thanks a lot in advance!
> Sorry if I missed something in the mailing list's archive or in the manuals!
>
> Cheers,
> Thomas
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html