[mira_talk] Re: Questions on consensus sequence vs .TCS file

From: yongmei <yongmei@xxxxxx>
To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
Date: Mon, 10 Feb 2014 16:48:59 +0800

Hi, Francisco, 
I sent you a Dropbox link to a .caf file and a .tcs file generated by your 
CAF_2_TCS.py at the end of January. 
I am just wondering whether you have had time to have a look. 

Thanks. 

Yongmei 

________________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] On Behalf 
Of Francisco Pina Martins [f.pinamartins@xxxxxxxxx]
Sent: Friday, January 24, 2014 1:09 AM
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: Questions on consensus sequence vs .TCS file

Ok, so I've checked things, and here's what I've gotten:

CAF_2_TCS writes the "consensus base" directly from the CAF file contig
data.
This means it is not likely a bug in CAF_2_TCS.py in the sums of the
number of bases.
I'm inclined to think it might be a bug in the way the CAF file contig
is generated. However, I would like to confirm this, as it could
still be a bug in the way CAF_2_TCS.py considers the positions of
the bases.
Can I have a (small) example CAF file where this occurs please? Just so
I can see exactly what is happening and where.
If the file is very large (judging by the coverage values it must be,
even if it contains only one contig), just PM me something like a
dropbox link instead of sending an email attachment.

Thanks,

Francisco


On 23/01/14 08:54, yongmei wrote:
> Thanks for your email.
>
>>> I am very confused with the result.
>>> Below is a part of the .tcs file converted from the mira output .caf file 
>>> using CAF_2_TCS.py
>>> […]
>> Ummmm, CAF_2_TCS.py is nothing I wrote. Who’s the author, have you tried to 
>> contact him?
>> What does MIRA tell you in its result files (FASTA) what the seemingly wrong 
>> bases are? Are these correct? If yes, this would really be a strong 
>> indicator for a bug in the py script and not in MIRA.
> The result fasta file in the mira results folder is not correct for these 
> bases either. They are the same as in the TCS.
> I wrote my own R program to parse the .caf file from mira's output, and got 
> the same information as CAF_2_TCS.py.
>
>>> I also tried to convert the fastq to fasta file and set the default_qual = 
>>> 50 and use the fasta file to do the same mira assembly,
>>> and I got the perfect results.
> I mean the .fasta in the mira output folder looks perfect, and so do the TCS 
> file and my own results.
>
> Since we know what our sample is, it should be very similar to the reference 
> (maybe with a couple of mutations per 1 kb).
> When we use the fastq to do the assembly, the result shows lots of mutations. 
> When I checked the .caf file using CAF_2_TCS.py or my own R program, I found 
> that many of the "mutations" are actually not mutations. For example, at one 
> base there are more than 20k "A" and fewer than 100 "C", "T", "G" and "*". 
> I expected the result for this base to be "A"; however, the result file shows 
> a "G" for this base. We had quite a lot of cases like this.
> However, if I use the fasta file to run mira, we do not have this kind of 
> problem at all.
> So I wonder whether there is some problem with our .fastq file or 
> something else.
>
> Thank you very much for your help.
> Best wishes,
> Yongmei
> ________________________________________
> From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] On 
> Behalf Of Bastien Chevreux [bach@xxxxxxxxxxxx]
> Sent: Thursday, January 23, 2014 3:31 PM
> To: mira_talk@xxxxxxxxxxxxx
> Subject: [mira_talk] Re: Questions on consensus sequence vs .TCS file
>
> On 23 Jan 2014, at 3:41 , yongmei <yongmei@xxxxxx> wrote:
>> I am very confused with the result.
>> Below is a part of the .tcs file converted from the mira output .caf file 
>> using CAF_2_TCS.py
>> […]
> Ummmm, CAF_2_TCS.py is nothing I wrote. Who’s the author, have you tried to 
> contact him?
>
> What does MIRA tell you in its result files (FASTA) what the seemingly wrong 
> bases are? Are these correct? If yes, this would really be a strong indicator 
> for a bug in the py script and not in MIRA.
>
>> I also tried to convert the fastq to fasta file and set the default_qual = 
>> 50 and use the fasta file to do the same mira assembly,
>> and I got the perfect results.
> I’m not sure if I understood your last sentence correctly. What result is 
> perfect? The TCS?
>
> B.
>
>
> --
> You have received this mail because you are subscribed to the mira_talk 
> mailing list. For information on how to subscribe or unsubscribe, please 
> visit http://www.chevreux.org/mira_mailinglists.html

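The check described in the quoted message (tally the bases in each column, then compare the most frequent one against the reported consensus base) can be sketched roughly as below. This is only an illustration: the function names, the `(position, reported, counts)` tuple layout, the 0.9 threshold and the example numbers are all made up here, and nothing is read from a real CAF or TCS file. It counts raw bases only and ignores quality values, so it makes no attempt to reproduce how MIRA actually calls the consensus.

```python
from collections import Counter


def majority_base(counts):
    """Return the most frequent symbol in a column, or None if empty or tied."""
    ranked = Counter(counts).most_common(2)
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no single winner
    return ranked[0][0]


def flag_discrepancies(columns, min_fraction=0.9):
    """Yield columns whose reported base disagrees with a clear raw majority.

    `columns` is a hypothetical iterable of (position, reported_base, counts),
    where counts maps each symbol ("A", "C", "G", "T", "*") to its tally.
    """
    for position, reported, counts in columns:
        total = sum(counts.values())
        top = majority_base(counts)
        if top is None or total == 0:
            continue
        fraction = counts[top] / total
        if top != reported.upper() and fraction >= min_fraction:
            yield position, reported, top, fraction


# Made-up numbers in the spirit of the case described above.
example = [
    (101, "G", {"A": 20000, "C": 30, "G": 40, "T": 20, "*": 5}),
    (102, "T", {"A": 10, "C": 12, "G": 8, "T": 19500, "*": 2}),
]

suspicious = list(flag_discrepancies(example))
for position, reported, top, fraction in suspicious:
    print(f"pos {position}: reported {reported}, raw majority {top} ({fraction:.1%})")
```

A list of positions produced this way would be one compact thing to send along with a small example CAF file, since it shows exactly where the reported base and the raw counts disagree.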

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
