[rmaexpress_help] Re: RMAExpress still crashes after 1.0 beta 8

  • From: Ben Bolstad <bmb@xxxxxxxxxxxxx>
  • To: rmaexpress_help@xxxxxxxxxxxxx
  • Date: Wed, 05 Mar 2008 08:08:05 -0800

Well, really this should be handled by the parsing code (i.e. it is my
responsibility). But whether you can detect it manually yourself depends
on the failure mode. In the case of the two files below, the corruption
is in the data stored in the file itself, rather than something fairly
easy to detect just by looking at file sizes etc.
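
To illustrate the kind of data-level check meant here, a minimal sketch in
Python follows. It is not RMAExpress's parsing code: the function name and
the specific checks are assumptions, and it presumes the probe intensities
of a single CEL file have already been read into a list by some reader that
is not shown.

```python
import math
from typing import Optional, Sequence


def corruption_reason(intensities: Sequence[float]) -> Optional[str]:
    """Return a short reason if the intensities look unusable, else None.

    Hypothetical helper: `intensities` is assumed to hold the probe
    intensities of one CEL file, already parsed elsewhere.
    """
    # An empty array leaves nothing to estimate a density from.
    if len(intensities) == 0:
        return "no intensities"
    # NaN or infinite values cannot be real scanner readings.
    if any(not math.isfinite(x) for x in intensities):
        return "non-finite intensity (NaN or inf)"
    # Assumption: raw intensities are never negative.
    if min(intensities) < 0:
        return "negative intensity"
    # A constant array has no spread, so a density plot would be degenerate.
    if min(intensities) == max(intensities):
        return "all intensities identical"
    return None
```

A file size check would pass for every one of these cases, which is why the
values themselves have to be inspected.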

That said, one way to do it would be to use the Raw Data Visualizer
option, and more specifically the "Individual Density Plots" option.
Scrolling through these should show the corrupt files (the plot will
most likely be nonexistent). Doing this at my end shows that these are
all potentially corrupt:

GSM128656.CEL
GSM260885.CEL
GSM133917.CEL
GSM133941.CEL
GSM133946.CEL
GSM133954.CEL
GSM133956.CEL
GSM133972.CEL
GSM133982.CEL
GSM133990.CEL
GSM134356.CEL
GSM134368.CEL
GSM134372.CEL
GSM134393.CEL
GSM134407.CEL
GSM134420.CEL
GSM134442.CEL
GSM134453.CEL
GSM134460.CEL
GSM134461.CEL
GSM142607.CEL
GSM142791.CEL
GSM157302.CEL
GSM157308.CEL
GSM157313.CEL
GSM157320.CEL
GSM183517.CEL

Although that seems like a lot of files, out of approx. 2800 it is not
that many (about 1% of files). I can't guarantee that the list is
comprehensive or that I did not accidentally mistype one of the names
above. It is also possible that the set of corrupt files in my copy of
the data differs from the set in yours.


Ben


On Wed, 2008-03-05 at 06:22 -0800, Alex Feltus wrote:
> Is there any specificity to the corruption?  Can I
> prescreen files in some way? 
> 
> Alex
> 
> --- Ben Bolstad <bmb@xxxxxxxxxxxxx> wrote:
> 
> > I am guessing that there are further corrupted CEL files in that
> > dataset. I did not have the energy to further examine that possibility
> > yesterday.
> > 
> > Ben
> > 
> > 
> > On Wed, 2008-03-05 at 06:03 -0800, Alex Feltus wrote:
> > > Ben:
> > > 
> > > Thanks for your help with this.  I removed those two .CEL files which
> > > allowed me to get to the BA stage (further than before).  However, I
> > > still get a segmentation fault.
> > > 
> > > Alex
> > > 
> > > Here is the gdb output (appears to break as before):
> > > 
> > > Starting program: /usr/local/bin/RMAExpress 
> > > [Thread debugging using libthread_db enabled]
> > > [New Thread 47990231476192 (LWP 8626)]
> > > [New Thread 1082132816 (LWP 24051)]
> > > [New Thread 1090525520 (LWP 24052)]
> > > [Thread 1090525520 (LWP 24052) exited]
> > > [Thread 1082132816 (LWP 24051) exited]
> > > [New Thread 1082132816 (LWP 24053)]
> > > [New Thread 1090525520 (LWP 24054)]
> > > [New Thread 1098918224 (LWP 24055)]
> > > [Thread 1090525520 (LWP 24054) exited]
> > > [Thread 1098918224 (LWP 24055) exited]
> > > [Thread 1082132816 (LWP 24053) exited]
> > > [New Thread 1082132816 (LWP 24056)]
> > > [New Thread 1098918224 (LWP 24057)]
> > > [Thread 1098918224 (LWP 24057) exited]
> > > [Thread 1082132816 (LWP 24056) exited]
> > > 
> > > Program received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 47990231476192 (LWP 8626)]
> > > 0x0000000000427480 in max_density (z=0x378a8440, rows=0, cols=<value optimized out>, column=0) at Preprocess/rma_background3.c:301
> > > 301         if (dens_y[i] == max_y)
> > > 
> > > --- Ben Bolstad <bmb@xxxxxxxxxxxxx> wrote:
> > > 
> > > > I have investigated this issue further by first downloading all
> > > > available ATH1 arrays (GPL198) from GEO. I can duplicate this error, but
> > > > it is not due to a dataset size problem. Instead it is because one of
> > > > the CEL files is corrupted (GSM260954.cel for the record). There is
> > > > actually also another file in that set that is corrupted, at least at my
> > > > end (GSM226522_S17_3.CEL), though the corruption mode is different. Fixes
> > > > for detecting these corruption types during the parsing phase are
> > > > imminent for the next beta release. I'm not sure of the timing of this
> > > > next beta, though it will probably be within the next couple of weeks.
> > > > 
> > > > I don't expect these situations to be particularly common so I don't
> > > > believe that 1.0 beta 8 is critically damaged, and it is still a
> > > > definite improvement over 1.0 beta 7 and earlier versions for (super)
> > > > large datasets.
> > > > 
> > > > Best,
> > > > 
> > > > Ben 
> > > > 
> > > > 
> > > > On Mon, 2008-03-03 at 10:04 -0800, Alex Feltus wrote:
> > > > > Ben:
> > > > > 
> > > > > I can RMA 31 arrays of this type no problem, so the data is good.  I
> > > > > have ~300GB of disk space that can be used for temp files.  I also
> > > > > never even hit swap since I have 16GB RAM.  I am using the default
> > > > > buffer settings: 150 arrays/50,000 probe sets.
> > > > > 
> > > > > Here is GDB output for 460 GPL198 arrays:
> > > > > 
> > > > > GNU gdb 6.6-debian
> > > > > Copyright (C) 2006 Free Software Foundation, Inc.
> > > > > GDB is free software, covered by the GNU General Public License, and you are
> > > > > welcome to change it and/or distribute copies of it under certain conditions.
> > > > > Type "show copying" to see the conditions.
> > > > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > > > This GDB was configured as "x86_64-linux-gnu"...
> > > > > Using host libthread_db library "/lib/libthread_db.so.1".
> > > > > (gdb) R
> > > > > Starting program: /usr/local/bin/RMAExpress 
> > > > > [Thread debugging using libthread_db enabled]
> > > > > [New Thread 47608618407904 (LWP 16114)]
> > > > > [New Thread 1082132816 (LWP 16117)]
> > > > > [New Thread 1090525520 (LWP 16118)]
> > > > > [Thread 1090525520 (LWP 16118) exited]
> > > > > [Thread 1082132816 (LWP 16117) exited]
> > > > > [New Thread 1082132816 (LWP 16119)]
> > > > > [New Thread 1090525520 (LWP 16120)]
> > > > > [Thread 1090525520 (LWP 16120) exited]
> > > > > [Thread 1082132816 (LWP 16119) exited]
> > > > > [New Thread 1082132816 (LWP 16121)]
> > > > > [New Thread 1090525520 (LWP 16122)]
> > > > > [Thread 1090525520 (LWP 16122) exited]
> > > > > [Thread 1082132816 (LWP 16121) exited]
> > > > > 
> > > > > Program received signal SIGSEGV, Segmentation fault.
> > > > > [Switching to Thread 47608618407904 (LWP 16114)]
> > > > > 0x0000000000427480 in max_density (z=0x3767c660, rows=0, cols=<value optimized out>, column=0) at Preprocess/rma_background3.c:301
> > > > > 301         if (dens_y[i] == max_y)
> > > > > 
> > > > > 
> > > > > Alex
> > > > > 
> > > > > 
> > > > > --- Ben Bolstad <bmb@xxxxxxxxxxxxx> wrote:
> > > > > 
> > > > > > At least for the one dataset of this type that I currently have
> > > > > > accessible, consisting of only 24 arrays, everything processes fine
> > > > > > without complaint or crashing.
> > > > > > 
> > > > > > What values are you using in your Buffer settings? Do you have ample
> > > > > > disk space in the location you are trying to use for your temporary
> > > > > > space?
> > > > > > 
> > > > > > Ben
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > On Mon, 2008-03-03 at 07:53 -0800, Ben Bolstad wrote:
> > > > > > > > Ok, that makes more sense since the first one you gave me did not seem
> > > > > > > > to be an Affymetrix array type. I know that things worked with
> > > > > > > > Arabidopsis ATH1 Arrays in the past, though I have not tested with this
> > 
> === message truncated ===
> 
> 
> --
> Alex Feltus, Ph.D.
> Assistant Professor
> Clemson University
> Department of Genetics & Biochemistry
> Biosystems Research Complex Rm 302C
> 51 New Cherry Street
> Clemson, SC 29634
> 864-656-3231 (office) - (864) 656-6879 (fax)
> http://www.clemson.edu/genbiochem/faculty/afeltus.html
> 

