I am still configuring trans-abyss, and yes, it is not user friendly.
Our Illumina reads are single-end, so many of the steps in trans-abyss
are skipped. We are only using trans-abyss to merge our multi-k-mer
assembly and remove redundant contigs. I realize abyss won't catch
indels well, but we are only using it to help make our 454 assembly
better. Since we have no reference genome, we sequenced a normalized
transcriptome via 454, then did non-normalized sequencing with
Illumina. We are merely assembling the Illumina reads and hoping that
they will close some gaps and join contigs in our 454 assembly.

On Mon, Nov 29, 2010 at 2:00 PM, Robin Kramer <kodream@xxxxxxxxx> wrote:
> Sven,
>
> The problem is that bowtie itself has only limited support for indels
> since it isn't a true SW aligner, and Abyss in its scaffolding stage
> doesn't support indels (even if bowtie generates them) whatsoever.
>
>
> I am curious as to your experience with the trans package. Did it do
> a good job? The last I checked it was something akin to an NxN blast
> search and required quite a bit of external configuration to use, and
> since it was only a perl script I was guessing that it was itself
> quite slow.
>
>
> On 11/16/10, Wachholtz, Michael <mwachholtz@xxxxxxxxxxx> wrote:
>> I recently discovered that Abyss has a trans-abyss package. While this
>> program is geared towards expression analysis with a reference genome,
>> it has a merge.pl script. We have been on the fence about what
>> k-mer value to use. The results are hard to interpret, but this
>> package will do an assembly for every k-mer value from i/2 to i (where
>> i is the read length) and merge all the contigs into a final assembly.
>> Our computer is quad-core with 25GB RAM. It takes Abyss less than
>> 1 hour to assemble ~100,000,000 reads. Very fast!! Since all of our
>> Illumina reads were filtered to contain mostly 30+ quality scores, we
>> just run this assembly through MIRA's fasta2frag program.
>> This will output a quality score file for the fragments, putting in a
>> value of 30 for each bp (saves me the work of writing a script for
>> this). Then just treat the fragments as Sanger reads and do a hybrid
>> assembly with our 454 reads in MIRA. If anyone has done Illumina
>> transcriptome assembly with the velvet/oases package instead of abyss,
>> I would like to hear your thoughts about the advantages or technique
>> you used. While abyss seems to do a fine job of catching SNPs and
>> logging them as "popped bubbles", I'm not sure how it handles indels &
>> transcript variants. Once we have a complete assembly, our goal is to
>> do RNA-Seq analysis with the original Illumina data. While MIRA will
>> catch a large majority of SNPs during assembly, some of the
>> SNP/variation data will have been lost in the abyss assembly. However,
>> this "lost" information can easily be found when we map reads using
>> bowtie, bam/sam tools.
>>
>> On Tue, Nov 16, 2010 at 2:55 PM, Sven Klages
>> <sir.svencelot@xxxxxxxxxxxxxx> wrote:
>>> oh, yes. I see, .. I just wanted to use it for my own data and was quite
>>> astonished ;-)
>>> fasta output, no qualities ... not of any use for me neither ..
>>> cheers,
>>> Sven
>>>
>>> 2010/11/16 Wachholtz, Michael <mwachholtz@xxxxxxxxxxx>
>>>>
>>>> I have, but the output is in fasta format with no quality scores. The
>>>> only advantage this program has is that it will output how many
>>>> identical reads there were. I prefer the fastq program in that it will
>>>> retain the quality score of the best sequence and will output in fastq
>>>> format.
>>>>
>>>> On Mon, Nov 15, 2010 at 5:18 AM, Sven Klages
>>>> <sir.svencelot@xxxxxxxxxxxxxx> wrote:
>>>> > Hi Michael,
>>>> >
>>>> > 2010/11/15 Wachholtz, Michael <mwachholtz@xxxxxxxxxxx>
>>>> >>
>>>> >> [...]
>>>> >>
>>>> >> it is safe to use such strict criteria. After that, for each lane, we
>>>> >> used the fastq program to collapse/remove any identical reads. This
>>>> >
>>>> > [...]
>>>> >
>>>> > just a short question. You have successfully used the FASTX-Toolkit to
>>>> > quality-clip your data;
>>>> > this tool collection also contains a program to remove duplicates from
>>>> > NGS data:
>>>> >
>>>> > FASTQ/A Collapser
>>>> > Collapsing identical sequences in a FASTQ/A file into a single sequence
>>>> > (while maintaining reads counts)
>>>> >
>>>> > Have you tried this for your data?
>>>> >
>>>> > cheers,
>>>> > Sven
>>>> >
>>>> >
>>>>
>>>> --
>>>> You have received this mail because you are subscribed to the mira_talk
>>>> mailing list. For information on how to subscribe or unsubscribe, please
>>>> visit http://www.chevreux.org/mira_mailinglists.html
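
For anyone who would rather script the constant-quality step mentioned in the quoted message (a value of 30 for each bp), here is a minimal Python sketch. It is not fasta2frag's implementation and it does no fragmenting; it only writes one fixed score per base. The function names, the default of 20 scores per line, and the FASTA-style .qual layout are assumptions made for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def write_constant_qual(fasta_handle, qual_handle, score=30, width=20):
    """Write one .qual record per sequence, using the same score for every base.

    score and width are illustrative defaults (assumptions), not values
    taken from any tool. Returns the number of records written.
    """
    count = 0
    for name, seq in read_fasta(fasta_handle):
        qual_handle.write(f">{name}\n")
        scores = [str(score)] * len(seq)
        # Wrap the score list so each output line holds at most `width` values.
        for i in range(0, len(scores), width):
            qual_handle.write(" ".join(scores[i:i + width]) + "\n")
        count += 1
    return count


# Small in-memory demonstration with a made-up contig.
demo_fasta = io.StringIO(">contig1\nACGTAC\n")
demo_qual = io.StringIO()
write_constant_qual(demo_fasta, demo_qual)
print(demo_qual.getvalue(), end="")
```

With real data, pass two open file handles instead of the StringIO objects: the contig FASTA for reading and the .qual file for writing. Whether MIRA expects the quality file under a particular name next to the FASTA should be checked against the MIRA documentation for your version.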