[mira_talk] howto format paired-end reads for mira
- From: Bastien Chevreux <bach@xxxxxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Thu, 25 Jun 2009 19:51:08 +0200
This question came to me via mail and Filip allowed me to post the question
and answer to the MIRA talk list.
On Wednesday, 24 June 2009, Filip Van Nieuwerburgh wrote:
> First of all, I want to thank you for making MIRA available for the
> community. It is the best free assembler I used until now (but very slow).
Hi Filip,
yeah, MIRA is not the fastest. But it's the one which creates the least
trouble afterwards for finishing (at least for me). I don't mind the computer
needing a few hours more if I can save time :-)
> I am very excited about your last update, which includes more support on
> true hybrid assembly.
Hybrid assembly of 454 & Sanger has been working for quite some time. MIRA
could also do it with Solexa since the early *40 versions, albeit not really
well and not documented at all.
De-novo Solexa (or a denovo hybrid with Solexa) is still slow in the end-game
... this is going to be improved in one of the next updates.
> I have a question on using Illumina mate paired reads in a denovo hybrid
> assembly (./bin/mira -project=bchoc -job=denovo,genome,accurate,454,solexa
> COMMON_SETTINGS -highlyrepetitive SOLEXA_SETTINGS
> -GE:tismin=2000:tismax=3000 >&log_assemblybchocHybridDate090624.txt).
>
> How should the reads be formatted? To illustrate my question, I copy/paste
> the requirements stated in the Velvet manual:
>
> 1. For paired-end reads, the assumption is that each read is next to
> its mate read. In other words, if the reads are indexed from 0, then reads
> 0 and 1 are paired, 2 and 3, 4 and 5, etc. If for some reason you have
> forward and reverse reads in two different FASTA files but in corresponding
> order, the bundled Perl script shuffleSequences.pl will merge the two files
> into one as appropriate. To use it, type:
>   ./shuffleSequences.pl forward_reads.fa reverse_reads.fa output.fa
This is not needed by MIRA. The only requirement: the name of one read of the
pair must end in "/1" and the name of the other in "/2" (or in a
non-Solexa-standard ".f" and ".r", but then you need to adapt the -LR:rns
switch).
As the "/1" "/2" naming is the current standard from the Illumina pipeline,
you probably don't need to do anything.
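To illustrate why the order of the reads in the file does not matter when
pairing is done by name, here is a minimal Python sketch. It is not MIRA code
and says nothing about how MIRA does this internally; the function names
(split_mate_suffix, pair_reads_by_name) are made up for this example, and it
assumes the "/1" and "/2" suffix convention described above. It can also serve
as a quick sanity check on your read names before starting an assembly.

```python
from collections import defaultdict


def split_mate_suffix(name):
    """Return (template, mate) for a name ending in /1 or /2, else (name, None)."""
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2], name[-1]
    return name, None


def pair_reads_by_name(names):
    """Group read names into pairs by template name, regardless of file order.

    Returns (pairs, unpaired): pairs maps a template name to its
    (name_of_mate_1, name_of_mate_2) tuple, unpaired lists every read
    that has no suffix or whose partner is missing.
    """
    templates = defaultdict(dict)
    unpaired = []
    for name in names:
        template, mate = split_mate_suffix(name)
        if mate is None:
            unpaired.append(name)
        else:
            templates[template][mate] = name

    pairs = {}
    for template, mates in templates.items():
        if "1" in mates and "2" in mates:
            pairs[template] = (mates["1"], mates["2"])
        else:
            unpaired.extend(mates.values())
    return pairs, unpaired
```

Called with the read names from a FASTA or FASTQ file in any order, the
function returns the complete pairs plus a list of reads left without a
partner, which should normally be short.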
> 2. Concerning read orientation, Velvet expects paired-end reads to
> come from opposite strands facing each other, as in the traditional Sanger
> format.
Yep, this is perfect and MIRA expects that, too, for all reads (Sanger, 454
and Solexa). As 454 paired-end reads initially have a "forward/forward"
orientation, one read of each pair must be turned around
(reverse-complemented). But the sff_extract script from Jose does that
automatically for you and the output it produces is directly usable by MIRA.
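For readers wondering what "turning around" a read means in practice: it is a
reverse complement of the sequence. The Python sketch below is a generic
illustration only. It is not taken from sff_extract and makes no claim about
which read of a pair that script turns around; the function name
reverse_complement is just a label for this example.

```python
# Complement table for the four bases; N (unknown base) stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the sequence as read from the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]
```

If a FASTQ read is turned around this way, its quality string has to be
reversed as well (without complementing), so that each quality value stays
attached to its base.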
> What are the requirements for MIRA? Maybe both reads have to be on the same
> strand and facing the same direction in MIRA?
Nope :-)
> Thank you very much for answering my question. I realize that you must
> receive many questions from all over the world.
This is why I prefer questions on the mailing list ... people tend to find
answers there pretty quickly with Google once a similar question was asked.
Would you mind if I push this answer mail to the list?
Regards,
Bastien