[mira_talk] Re: Questions on consensus sequence vs .TCS file

From: Francisco Pina Martins <f.pinamartins@xxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Tue, 11 Feb 2014 11:24:44 +0000

Hi Yongmei,

I have received your files, but my daughter was born the day after you sent them and my life has been in a bit of chaos ever since. =-)

I will, however, take a good look at them once I wrestle back some degree of control (shouldn't take long now).

Sorry about the delay...

Cheers,

Francisco

On 10/02/14 08:48, yongmei wrote:

Hi, Francisco,
At the end of January I sent you a Dropbox link to a .caf file and a .tcs file
generated by your CAF_2_TCS.py.
I am just wondering whether you have had time to take a look.

Thanks.

Yongmei

________________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] On Behalf 
Of Francisco Pina Martins [f.pinamartins@xxxxxxxxx]
Sent: Friday, January 24, 2014 1:09 AM
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: Questions on consensus sequence vs .TCS file

Ok, so I've checked things, and here's what I've found:

CAF_2_TCS writes the "consensus base" directly from the CAF file contig
data.
This means the problem is unlikely to be a bug in the way CAF_2_TCS.py
sums the number of bases.
I'm inclined to think it might be a bug in the way the CAF file contig
is generated. However, I would like to confirm this, as it could still
turn out to be a bug in the way CAF_2_TCS.py handles the positions of
the bases.
Can I have a (small) example CAF file where this occurs, please? Just so
I can see exactly what is happening and where.
If the file is very large (judging by the coverage values it must be,
even if it contains only one contig), just PM me something like a
Dropbox link instead of sending an email attachment.

Thanks,

Francisco


On 23/01/14 08:54, yongmei wrote:

Thanks for your email.

I am very confused with the result.
Below is a part of the .tcs file converted from the mira output .caf file using 
CAF_2_TCS.py
[…]

Ummmm, CAF_2_TCS.py is nothing I wrote. Who’s the author, have you tried to 
contact him?
What does MIRA tell you in its result files (FASTA) what the seemingly wrong 
bases are? Are these correct? If yes, this would really be a strong indicator 
for a bug in the py script and not in MIRA.

The result FASTA file in the mira results folder is not correct for these 
bases either. It shows the same bases as the TCS.
I wrote my own R program to parse the .caf file from mira's output, and got the 
same information as CAF_2_TCS.py.

I also tried to convert the fastq to fasta file and set the default_qual = 50 
and use the fasta file to do the same mira assembly,
and I got the perfect results.

I mean the .fasta in the mira output folder looks perfect, and so do the TCS 
file and my own results.
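
A minimal sketch of the FASTQ-to-FASTA conversion described above, assuming plain four-line FASTQ records. The function name fastq_to_fasta is hypothetical and this is not the tool actually used in this thread. It keeps only read names and sequences, so per-base qualities are dropped and would have to come from a default such as the default_qual = 50 setting mentioned above.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy names and sequences from 4-line FASTQ records; drop qualities.

    Assumption: each record is exactly four lines (header, sequence,
    '+' separator, quality string), with no line-wrapped sequences.
    Returns the number of records written.
    """
    written = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = fastq_handle.readline().rstrip("\n")
        separator = fastq_handle.readline()
        fastq_handle.readline()  # quality line, intentionally discarded
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("input does not look like 4-line FASTQ")
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n")
        fasta_handle.write(sequence + "\n")
        written += 1
    return written


# Tiny in-memory example with made-up reads.
src = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n")
dst = io.StringIO()
count = fastq_to_fasta(src, dst)
fasta_text = dst.getvalue()
```

With real files, the same function could be called with two handles opened via open(), one for reading and one for writing.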

Since we know what our sample is, it should be very similar to the reference 
(maybe with a couple of mutations per 1 kb).
When we use the fastq for the assembly, the result shows lots of mutations. 
When I checked the .caf file using CAF_2_TCS.py or my own R program, I found 
that many of these "mutations" are actually not mutations. For example, for 
one base there are more than 20k "A" and fewer than 100 "C", "T", "G" and "*". 
I expected the result for this base to be "A"; however, the result file shows 
a "G". We have quite a lot of cases like this.
However, if I use the fasta file to run mira, we do not have this kind of 
problem at all.
So I wonder whether there is some problem with our .fastq file or something 
else.
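
The check described above (compare the reported consensus base with the per-position base counts) can be sketched roughly as follows. This is only an illustration: the function names, the 0.9 threshold and the counts are invented, and the input is a plain list of tuples rather than the real .tcs or .caf layout, which would need its own parser.

```python
def majority_base(counts):
    """Return the symbol with the highest count in one alignment column."""
    return max(counts, key=counts.get)


def flag_suspect_columns(columns, min_fraction=0.9):
    """List positions whose reported consensus disagrees with a dominant base.

    columns: iterable of (position, consensus_base, counts) where counts
    maps each symbol ("A", "C", "G", "T", "*") to the number of reads
    showing it. min_fraction is an arbitrary illustrative threshold.
    """
    suspects = []
    for position, consensus, counts in columns:
        total = sum(counts.values())
        if total == 0:
            continue  # no coverage, nothing to compare
        top = majority_base(counts)
        fraction = counts[top] / total
        if top != consensus.upper() and fraction >= min_fraction:
            suspects.append((position, consensus, top, fraction))
    return suspects


# Made-up columns mimicking the situation described in the mail:
# more than 20k "A", fewer than 100 other symbols, consensus reported as "G".
columns = [
    (101, "A", {"A": 20500, "C": 40, "G": 30, "T": 20, "*": 10}),
    (102, "G", {"A": 20500, "C": 40, "G": 30, "T": 20, "*": 10}),
]
suspects = flag_suspect_columns(columns)
```

Here only position 102 is flagged, since its reported "G" contradicts an overwhelming majority of "A". A plain majority vote ignores base qualities, so a flagged column is a hint of where to look, not proof of a bug.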

Thank you very much for your help.
Best wishes,
Yongmei
________________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] On Behalf 
Of Bastien Chevreux [bach@xxxxxxxxxxxx]
Sent: Thursday, January 23, 2014 3:31 PM
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: Questions on consensus sequence vs .TCS file

On 23 Jan 2014, at 3:41 , yongmei <yongmei@xxxxxx> wrote:

I am very confused with the result.
Below is a part of the .tcs file converted from the mira output .caf file using 
CAF_2_TCS.py
[…]

Ummmm, CAF_2_TCS.py is nothing I wrote. Who’s the author, have you tried to 
contact him?

What does MIRA tell you in its result files (FASTA) what the seemingly wrong 
bases are? Are these correct? If yes, this would really be a strong indicator 
for a bug in the py script and not in MIRA.

I also tried to convert the fastq to fasta file and set the default_qual = 50 
and use the fasta file to do the same mira assembly,
and I got the perfect results.

I’m not sure if I understood your last sentence correctly. What result is 
perfect? The TCS?

B.


--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html


