[mira_talk] Re: from + len > size of contig?

  • From: "abenjak ." <abenjak@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 1 Jul 2015 16:40:50 +0200

Could the problem be that you set the Illumina contigs as "technology =
solexa"?
If these are assembled contigs, then the technology should be set to "text".
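
A minimal sketch of that change, assuming the rest of the manifest quoted
below stays as posted. Only the "technology" line differs; whether
"default_qual" and the SOLEXA_SETTINGS parameter line are still needed (or
accepted) with "text" is an assumption that is untested here.

'''
# illumina contigs (assembled sequences, not raw reads)
readgroup = GlyvomyceseIllumina
data = /path/to/file/glycomyces_illumina.fasta.fna
technology = text
default_qual = 30 # fake quality value, kept from the original manifest
'''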

Andrej

On Wed, Jul 1, 2015 at 4:23 PM, Bleker, Carissa R. <blekercr@xxxxxxxx>
wrote:

We've asked, but there is no chance of getting the fastq reads. The PacBio
data was raw, and I assembled it using mira as well. Attempting to map
the Illumina reads to the PacBio mira assembly is what threw the error
below.


------------------------------
*From:* mira_talk-bounce@xxxxxxxxxxxxx <mira_talk-bounce@xxxxxxxxxxxxx>
on behalf of John Nash <john.he.nash@xxxxxxxxx>
*Sent:* Tuesday, June 30, 2015 12:46 PM

*To:* mira_talk@xxxxxxxxxxxxx
*Subject:* [mira_talk] Re: from + len > size of contig?

I agree with Chris. I have found using illumina reads to correct PacBio
assemblies a great tool BUT it is only useful if the reads are fastq reads.

How was your PacBio data assembled? Was it using the pacbio (hgap, etc)
tools?

Is there any way that you can chase up the person/lab who did the
illumina sequencing and hunt down the fastq reads? So what you need to do
is not a hybrid de novo assembly of pacbio and illumina reads but a
reference assembly of illumina reads to a pacbio reference.

0.02
John


On Jun 29, 2015, at 12:40 PM, Chris Hoefler <hoeflerb@xxxxxxxxx> wrote:

That's not going to work very well. What are you trying to achieve with
the hybrid assembly? Is the PacBio assembly not good enough for what you
need? Without Illumina reads, you won't be able to do much to improve it.
If you just want to order the Illumina contigs using the PacBio reference,
you can use Mauve. I'm assuming that since Mira was able to take your
contigs as reads that they aren't very long (< 20 kb)?
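
One way to check that guess is to list the contig lengths in the FASTA file.
The Python sketch below is only illustrative: the function names are made up
for this example, and the 20 kb cutoff is just the rough figure mentioned
above, not a documented MIRA limit.

```python
def fasta_lengths(handle):
    """Return {sequence_name: length} for a FASTA file-like object."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def longer_than(lengths, threshold=20_000):
    """Return the sorted names of sequences longer than `threshold` bases.

    The default of 20,000 is an assumption taken from the rough "< 20 kb"
    figure in this thread.
    """
    return sorted(n for n, length in lengths.items() if length > threshold)


# Example usage (the path is the placeholder from the manifest in this thread):
#   with open("/path/to/file/glycomyces_illumina.fasta.fna") as fh:
#       lengths = fasta_lengths(fh)
#   print(len(lengths), "contigs;", len(longer_than(lengths)), "longer than 20 kb")
```

If many contigs turn out to be longer than the cutoff, that would at least be
consistent with them not behaving like ordinary short reads in a mapping job.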

On Mon, Jun 29, 2015 at 11:06 AM, Bleker, Carissa R. <blekercr@xxxxxxxx>
wrote:

Nope, I only have the fasta file. They are from the same strain; I'm
trying to do a hybrid assembly with the PacBio and Illumina data.
------------------------------
*From:* mira_talk-bounce@xxxxxxxxxxxxx <mira_talk-bounce@xxxxxxxxxxxxx>
on behalf of Chris Hoefler <hoeflerb@xxxxxxxxx>
*Sent:* Monday, June 29, 2015 10:42 AM
*To:* mira_talk@xxxxxxxxxxxxx
*Subject:* [mira_talk] Re: from + len > size of contig?

Do you have the Illumina reads? You can just map those directly to the
reference instead of the contigs. Are you mapping two different
strains...what are you trying to do?

On Mon, Jun 29, 2015 at 8:22 AM, Bleker, Carissa R. <blekercr@xxxxxxxx>
wrote:


Hi,

I was trying to map Illumina contigs to a mira-assembled PacBio
reference.
My config looks like:

'''
project = glycomyces_mapping_try1
job = genome,mapping,accurate

# parameter settings
parameters = COMMON_SETTINGS -GE:not=8, -DI:trt=/tmp/
parameters = -NW:cmrnl=no

# since there is no fasta quality file for illumina
parameters = SOLEXA_SETTINGS --noqualities

# reference sequence
readgroup = GlycomycesPacbio
is_reference
data = /path/to/file/glycomyces_assembly_pacbio_try1_out.caf

# illumina sequences
readgroup = GlyvomyceseIllumina
data = /path/to/file/glycomyces_illumina.fasta.fna
technology = solexa
default_qual = 30 # fake quality value
'''

After running for a few hours I get the error:

'''
Internal logic/programming/debugging error (*sigh* this should not have happened)

********************************************************************************
* from + len > size of contig?                                                 *
********************************************************************************
->Thrown: void Contig::updateCountVectors(const int32 from, const int32 len, vector<char>::const_iterator updateI, const uint32 seqtype, const bool addiftrue, int32 coveragemultiplier)
->Caught: void Contig::stripToBackbone()

Aborting process, probably due to an internal error.
'''

I noticed a previous problem like this in the mailing list, and the
recommendation was to use only one thread. However, this gave exactly the
same error at the same point in the mapping. I also tried both the CAF and
MAF files from the initial PacBio de novo assembly.

This is my first time doing an assembly, so any and all advice is
welcome!





--
Chris Hoefler, PhD
Postdoctoral Research Associate
Straight Lab
Texas A&M University
2128 TAMU
College Station, TX 77843-2128






