[mira_talk] Re: miraSearchESTSNPs

From: Thomas Müller <thomas.mueller@xxxxxxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Wed, 11 Aug 2010 09:14:31 +0200

Dear all,

I would still be very glad to receive some hints! ;)
Also, let me know if any of the questions are unclear.

Cheers,
Thomas

On Aug 3, 2010, at 11:49 AM, Thomas Müller wrote:

> Hello everybody,
> 
> Bastien, as you recommended, I tried out miraSearchESTSNPs. It works, but I 
> have some questions:
> 
> - I call miraSearchESTSNPs for the first and second part with the parameters 
> I used for normal mira (
> miraSearchESTSNPs --project=DC_NC -job=denovo,454,normal,esps1 --notraceinfo 
> -SK:not=12 -AS:urd=no -OUT:rld=yes -AS:sd=no -CO:asir=yes -SB:lsd=yes 
> 454_SETTINGS -LR:mxti=no -CL:qc=no -CL:cpat=no -AS:epoq=no -AL:mrs=90 
> -CO:rodirs=10 -SK:pr=80 -AS:mrl=50 -AL:mo=20 -FN:fai=DC_NC.fasta 
> -FN:fqui=DC_NC.qual
> ), except for the --project switch in step2. For the third step I just use 
> miraSearchESTSNPs --job=denovo,normal,esps3. I'm not sure about using all the 
> parameters in step2 too. Is this correct?
> 
> 
> - If I use just one library as input, what is the difference between the 
> results of step1 and step2? As in step1 all sequences are used and in step2 
> the sequences get assembled by each strain, the results should be nearly the 
> same? But I get e.g. 40000 contigs after step1 and 18000 contigs after step2 
> (5000 in the DC_NC_out.caf and 13000 in the remain_out.caf). 
> 
> - Also when using one input strain: why do I get SRO and SIO tags in the 
> step3.caf? Using just one strain I wouldn't expect inter-organism SNPs.
> 
> - Just one more regarding one input strain: after step3 there are nearly 3000 
> contigs left (after step1 40000). Would those contigs represent something 
> like the possible unigene set?
> 
> - If I assemble several strains there are many (for me) unexpected files in 
> the result folders after step2. Next to the expected .caf, .maf, .tcs, 
> .padded* and .unpadded* there are also the following files:
> 
> .padded_AllStrains.padded.fasta
> .padded_AllStrains.padded.fasta.qual
> .padded_AllStrains.unpadded.fasta
> .padded_AllStrains.unpadded.fasta.qual
> .padded_DC_NC.padded.fasta            -> DC_NC is the name of my lib
> .padded_DC_NC.padded.fasta.qual
> .padded_DC_NC.unpadded.fasta + .qual
> .padded_default.padded.fasta + .qual
> .padded_default.unpadded.fasta + .qual
> 
> The AllStrains files seem to be identical to the DC_NC files, and the default 
> files seem to consist of sequence names plus many '@' in the .fasta files 
> and '0' in the .qual files. I'm confused about what to do with those files 
> and what extra information they provide.
> 
> - Why is the number of contigs decreasing so much after step2? I don't get 
> it...
> 
> I think that's all for now ;)
> 
> Thanks a lot in advance!
> Sorry if I missed something in the mailing-list's archive or in the manuals!
> 
> Cheers,
> Thomas
> --
> You have received this mail because you are subscribed to the mira_talk 
> mailing list. For information on how to subscribe or unsubscribe, please 
> visit http://www.chevreux.org/mira_mailinglists.html
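
The observation in the quoted message, that the AllStrains files look 
identical to the DC_NC files and that the default files contain mostly '@', 
can be checked directly rather than by eye. Below is a minimal Python sketch 
for that. It is not part of MIRA: the helper names (read_fasta, at_fraction) 
and the in-memory sample records are made up for illustration, and it assumes 
plain FASTA input.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA text lines into a {name: sequence} dict."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(chunks) for n, chunks in records.items()}


def at_fraction(records):
    """Return the fraction of all sequence characters that are '@'."""
    total = sum(len(seq) for seq in records.values())
    if total == 0:
        return 0.0
    return sum(seq.count("@") for seq in records.values()) / total


# Hypothetical sample data standing in for the real result files.
all_strains = read_fasta([">c1 some comment", "ACGT", "AC*T"])
dc_nc = read_fasta([">c1", "ACGTAC*T"])
default_strain = read_fasta([">c1", "@@@@", "@@A@"])

# Equal dicts mean the same record names with the same sequences.
print("AllStrains identical to DC_NC:", all_strains == dc_nc)
print("fraction of '@' in default:", at_fraction(default_strain))
```

To run this on real output, pass open file handles instead of lists, for 
example read_fasta(open(path)) with the path of the file to inspect. 
Comparing parsed records rather than raw bytes ignores differences in line 
wrapping and header comments.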




Follow-Ups:
- [mira_talk] Bug in miraSearchESTSNPs
  - From: Thomas Müller
- [mira_talk] Re: miraSearchESTSNPs
  - From: Bastien Chevreux

References:
- [mira_talk] miraSearchESTSNPs
  - From: Thomas Müller
