[mira_talk] Re: Request for Comments: mirabait for paired-end

  • From: Martin MOKREJŠ <mmokrejs@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 25 Jun 2014 22:01:46 +0200

Bastien Chevreux wrote:
> On 25 Jun 2014, at 17:41 , Martin MOKREJŠ <mmokrejs@xxxxxxxxx> wrote:
>> BTW, the mira manual has two places showing how 454-paired-end
>> data are to be specified, no sign of changes during the years.
> 
> Ummm … where? I thought I had eradicated all obsolete info.

http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html

While I am looking for those places I will add more (un)related comments:

  "1---> 2---> or samedir forward or SF or LEFTIES"
   ^^^^^^^^^^^----- shouldn't there be 1<---2<--- instead? Or why is there "LEFTIES"?
                    The schematic looks the same as the very next line saying RIGHTIES.


Then 2 lines further there is:

  "This is the usual placement code for 454 "paired-end" and IonTorrent long-mate protocols."

So here I would definitely expect an explanation of what sff_extract did
and what it does now, along with a comparison to the "original" sff_extract
for people who obtained it directly from Jose Blanca (I don't have the URL
handy now). Do they have different version numbers? Weren't they both 0.28? ;)


"  Understanding how MIRA uses strain information\nbla"

Here I badly miss a note about asir=yes telling the user that it is (I think)
better to add a strain than to omit it. So, for each individual animal it is
better to add strain info.
Further, the explanation that normalization acts only within readgroups and is
affected by the order of the data groups, as you explained to me recently, is
important, but I have no idea how it relates to strain. The section
"3.4.3.  The manifest file: defining the data to load" told me nothing about
that, but maybe here it would be even more appropriate?


There are two examples where it appears (only):

# some 454 data

readgroup = DataFo454Pairs
data = ../../data/454data.fastq
technology = 454
template_size = 8000 1200
segment_placement = 2---> 1--->

# some Ion Torrent data

readgroup = DataFoIonPairs
data = ../../data/iondata.fastq
technology = iontor
template_size = 1000 300
segment_placement = 2---> 1--->

but here there should be a note about the other orientations of pairs (as
verbatim copy/paste examples for 454) and a note about the sff_extract
changes. Ideally add 3kb and 20kb library examples so one has some clue about
the min and max values. One comment line wouldn't hurt and would save the
reader browsing back and forth in the manual.
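
A rough sketch of what such examples could look like, reusing only the
keywords from the two manual examples above. The readgroup names, the file
paths and especially the min/max template sizes are placeholders made up for
illustration, not values taken from the manual or recommended settings:

```
# hypothetical 454 paired-end library, nominal insert size around 3kb
# template_size is given as two values; both numbers below are assumed
# see the segment_placement section of the manual for other orientations

readgroup = Example454Pairs3kb
data = ../../data/454_3kb.fastq
technology = 454
template_size = 2000 4000
segment_placement = 2---> 1--->

# hypothetical 454 paired-end library, nominal insert size around 20kb
# again, both template_size numbers below are assumed

readgroup = Example454Pairs20kb
data = ../../data/454_20kb.fastq
technology = 454
template_size = 15000 25000
segment_placement = 2---> 1--->
```

The comment lines at the top of each block are the kind of one-line pointer
I mean: they cost nothing and tell the reader where to look for the other
placement codes.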




> 
>> So, my recommendation is not to re-use the old mirabait name. The rest is up 
>> to you.
> 
> Concern noted, but I think I’ll still go for it.

Man, don't do it! Who is going to check the version number to realize what
arguments the program expects? Which user is going to report that the *test*
assembly is different once he/she has upgraded? Especially as mira assemblies
are always different unless single-threaded.

It does not hurt you to rename it, but it will hurt a lot of people.

Martin

