[mira_talk] Re: edited assembly: reassembly

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 17 Dec 2009 18:33:30 +0100

On Friday 11 December 2009 Stefano Ghignone wrote:
> Hi all,
> and sorry to bother the mailing list again with my issues.
> I need to reassemble an assembly (a mapping, precisely), which I edited
> using consed, working on the ACE output.
> [...]
> And I'm still trying to convert the ace file into the caf format. I was
> thinking of reassembling with mira using the caf file as input, after
> filtering out contigs with less than 4x coverage. I used phred2caf and
> roche454ace2caf, but both produce a caf file which convert_project
> doesn't like. With the latter script, by Bernd Senf, I got this message
> when I applied convert_project:
> > Converting from caf to: caf
> > First counting reads:
> >  [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%]
> > ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%]
> > ....|.... [90%] ....|.... [100%]
> > Now loading and processing data:
> >  [0%] Searched for read fos10Contig6.c but did not find it?
> > Unable to find Read in Pool

Hello Stefano,

I had a look at the CAF you sent. It looks like the roche454ace2caf tool
first writes out a contig definition, then the read definitions, into the CAF
file. Having had a look at the CAF format files from Sanger, this appears to
be perfectly valid. Moreover, the CAF tools from Sanger accept that format.

The problem is that MIRA does not. When Thomas and I wrote the parser for MIRA
some 10 years ago, all we had seen by that time (and the examples from the
Sanger Centre) had the format: first define all reads of a contig, then define
the contig. Rinse and repeat for every contig.

I think your best option would be to ask Bernd (the author of the
roche454ace2caf tool) to change the behaviour of his tool so that it first
writes out the reads of a contig, then the contig definition, then the reads
of the next contig, and so on. If I'm right in what I think, this could be
done by changing just one or two lines in his code. For MIRA to cope with the
format as it is now would require heavy modification, which won't happen
anytime soon, sorry.
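
As a stopgap, the same reordering could in principle be done on the CAF file
itself. The following is only a minimal sketch, not part of MIRA or of
roche454ace2caf. It assumes that every CAF object starts with a header line of
the form "Sequence : name", "DNA : name", "BaseQuality : name" or
"BasePosition : name", that contigs carry an "Is_contig" line, and that their
reads are listed in "Assembled_from" lines. The function names are
hypothetical, and reads that belong to no contig are simply appended at the
end.

```python
import re

# Assumed CAF block headers, e.g. "Sequence : readname" or "DNA : readname".
HEADER = re.compile(r"^(Sequence|DNA|BaseQuality|BasePosition)\s*:\s*(\S+)")


def split_blocks(text):
    """Split CAF text into (kind, name, body) tuples, one per block."""
    blocks = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = [match.group(1), match.group(2), [line]]
            blocks.append(current)
        elif current is not None and line.strip():
            current[2].append(line)  # blank lines are dropped here
    return [(kind, name, "\n".join(lines)) for kind, name, lines in blocks]


def reads_before_contigs(text):
    """Return CAF text in which each contig follows the reads it uses."""
    blocks = split_blocks(text)

    # Collect all blocks (Sequence, DNA, ...) belonging to the same object.
    by_name = {}
    for kind, name, body in blocks:
        by_name.setdefault(name, []).append(body)

    # Find contigs, in file order, together with the reads they reference.
    contigs = []
    for kind, name, body in blocks:
        lines = body.splitlines()
        if kind != "Sequence":
            continue
        if not any(line.strip() == "Is_contig" for line in lines):
            continue
        reads = []
        for line in lines:
            parts = line.split()
            if len(parts) > 1 and parts[0] == "Assembled_from":
                if parts[1] not in reads:
                    reads.append(parts[1])
        contigs.append((name, reads))

    out = []
    emitted = set()
    for contig_name, reads in contigs:
        # First the reads of this contig, then the contig itself.
        for name in reads + [contig_name]:
            if name in by_name and name not in emitted:
                out.extend(by_name[name])
                emitted.add(name)

    # Anything not referenced by a contig goes last, in original order.
    for kind, name, body in blocks:
        if name not in emitted:
            out.extend(by_name[name])
            emitted.add(name)

    return "\n\n".join(out) + "\n"
```

To try it, read the CAF file into a string, pass it to
reads_before_contigs(), and write the result to a new file before giving that
file to convert_project. The sketch holds the whole file in memory, so it is
only meant for assemblies of moderate size.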

Regards,
  Bastien
