Let me explain the reason for my request (to have all the non-aligned sequences in the output as singletons, not as debris).

In a recent transcriptomics article in which the assembly was performed with MIRA 3 (http://www.biomedcentral.com/1471-2164/11/635), the authors say:

"Due to the heuristic nature of the assembly process and previous reports of redundancy (different contigs belonging to the same region of the transcript sequences) in sets of transcriptome contigs assembled with different methods [17], a second run of assembly was conducted using the previously obtained contigs and singlets as input."

Inspired by this statement, I wrote a script that iterates successive MIRA assemblies: at each cycle, the input sequences are the contigs + singlets produced in the previous cycle.

In the first assembly it is fine to have the distinction between debris and contigs + singlets. In the following cycles, however, I expect that assembling the contigs + singlets obtained from the first assembly gives only contigs + singletons and no further debris. This is because in the second round I assemble sequences (contigs, including those made of a single read) that were produced by the program and are composed, in turn, of reads that MIRA did not discard in the previous round, and so are of good overall quality.

For this reason I would like to find a combination of MIRA settings that puts all sequences in the output as contigs or singlets, and none as debris.

I also take this opportunity to ask whether such an approach can really improve the assembly of a transcriptome or whether, in your opinion, it does nothing but introduce errors.

Thank you very much!

2011/2/16 Bastien Chevreux <bach@xxxxxxxxxxxx>:
> On Wednesday 16 February 2011 15:32:47 Michele Vidotto wrote:
>> However now MIRA continues to give both singlets and debris.
>> I was wondering if I can get all the sequences that are not aligned, in
>> output, in the form of singletons (thus abolishing the distinction
>> between debris and singletons).
>
> Hmmm, I was going to reply that there is no distinction between debris and
> singlets as they're all unaligned reads. But rethinking it, there are
> differences.
>
> In MIRA: reads which get thrown out during quality checks or at different
> stages of clipping will never ever appear as singlets, most of the time
> there's normally just too much junk in this population. Sorry, this behaviour
> of MIRA will not be changed.
>
> Everything else which passes the stage can be put into singlets via
> -OUT:sssip:stsip if -AS:mrpc does not interfere.
>
>> In particular I would like that all the non-aligned sequences were
>> found in the file "*_out.unpadded.fasta" as singletons, and to be
>> listed in the file "*_info_contigreadlist.txt" without having any
>> listed in "*_info_debrislist.txt"
>
> All singlets will appear both in the FASTA as well as in the contigreadlist
> file. Debris will only appear in the debris file.
>
>> is it possible with a particular combination of commands?
>
> As I wrote above: reads not passing the initial clipping stages will always
> end up in debris.
>
> Is there a particular application you're after that you need the singlets so
> badly in the result files?
>
> B.
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html
>

--
Michele Vidotto (Ph.D. Student)
Department of Biology
Universita` degli Studi di Padova
Via Ugo Bassi 58/B, 35131, Padova, Italy
Phone: +39 049 827 6204
Fax: +39 049 827 6209
mailto: michele.vidotto@xxxxxxxxxxxxxxxxx
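
The iterative scheme described in the message (feed the contigs + singlets of one cycle into the next) might look roughly like the sketch below. It is only an illustration under stated assumptions: `assemble` is a hypothetical stand-in for whatever wrapper actually runs MIRA and collects its results, not a MIRA command or API; `toy_assemble` is a made-up placeholder so the loop can be run without MIRA; and the stopping rule (stop once a cycle merges nothing) is an assumption, not something taken from the article.

```python
from typing import Callable, List, Tuple

Seq = Tuple[str, str]  # (name, bases)
Assembler = Callable[[List[Seq]], Tuple[List[Seq], List[Seq]]]


def iterate_assembly(reads: List[Seq], assemble: Assembler, max_cycles: int = 5):
    """Re-assemble the contigs + singlets of each cycle until nothing merges.

    `assemble` is hypothetical: it takes the input sequences and returns
    (contigs_and_singlets, debris). Debris is recorded per cycle and is
    never fed back into the next cycle.
    """
    current = list(reads)
    debris_per_cycle = []
    for _ in range(max_cycles):
        assembled, debris = assemble(current)
        debris_per_cycle.append(debris)
        # If output + debris accounts for every input sequence, no two
        # sequences were merged in this cycle, so further cycles are pointless.
        nothing_merged = len(assembled) + len(debris) >= len(current)
        current = assembled
        if nothing_merged:
            break
    return current, debris_per_cycle


def toy_assemble(seqs: List[Seq]) -> Tuple[List[Seq], List[Seq]]:
    """Toy placeholder (NOT MIRA): very short sequences become debris,
    identical sequences are collapsed into one."""
    debris = [s for s in seqs if len(s[1]) < 4]
    kept = {}
    for name, bases in seqs:
        if len(bases) >= 4:
            kept.setdefault(bases, name)  # keep the first name seen
    return [(name, bases) for bases, name in kept.items()], debris
```

Recording the debris of every cycle separately makes it easy to check the expectation stated in the message, namely that cycles after the first should produce little or no debris.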