[rmaexpress_help] Re: RMAExpress still crashes after 1.0 beta 8

  • From: Ben Bolstad <bmb@xxxxxxxxxxxxx>
  • To: rmaexpress_help@xxxxxxxxxxxxx
  • Date: Thu, 06 Mar 2008 18:24:08 -0800

I'm not sure what you mean by it missing the ".cel" files and only
seeing the ".CEL" ones. On my local build, at least, it has no problem
showing both (unless I am misunderstanding what you mean by "auto file
listing").
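
For anyone who wants to build their own file list outside the GUI, here
is a minimal sketch of a case-insensitive extension match. It is not
RMAExpress code; the function name list_cel_files is made up for
illustration, and it only assumes the files end in ".cel" in some mix of
upper and lower case.

```python
from pathlib import Path


def list_cel_files(directory):
    """Return files in `directory` whose extension is .cel in any letter case."""
    return sorted(
        (p for p in Path(directory).iterdir()
         if p.is_file() and p.suffix.lower() == ".cel"),
        key=lambda p: p.name,  # stable, name-ordered listing
    )
```

Comparing the lower-cased suffix avoids depending on whether the
underlying file system or glob pattern is case sensitive.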

Ben


On Thu, 2008-03-06 at 14:32 -0800, Alex Feltus wrote:
> Ben:
> 
> I just wanted to do a final follow-up.  I found a few
> more corrupt .cel files and was able to RMA normalize
> 2852 arrays.  Woo-hoo!
> 
> Of course, a .cel check would be a great addition to
> your code as it works wonderfully when the data is
> sound.  One other thing I noticed is that your auto
> file listing in the GUI mode appears to be case
> sensitive (CAPS), so it misses .cel files.
> 
> Thanks for the tech support!
> 
> Alex
> 
> 
> --- Ben Bolstad <bmb@xxxxxxxxxxxxx> wrote:
> 
> > Well, really this should be handled by the parsing code (i.e. it is my
> > responsibility). But whether you can detect it manually yourself depends
> > on the failure mode. In the case of the two files below, the corruption
> > is in the data stored in the file itself rather than something fairly
> > easy to detect by just looking at file sizes etc.
> > 
> > That said, one way to do it would be to use the Raw Data Visualizer
> > option, and more specifically the "Individual Density Plots" option.
> > Scrolling through these should show the corrupt files (the plot will
> > most likely be non-existent). Doing this at my end shows that these are
> > all potentially corrupt:
> > 
> > GSM128656.CEL
> > GSM260885.CEL
> > GSM133917.CEL
> > GSM133941.CEL
> > GSM133946.CEL
> > GSM133954.CEL
> > GSM133956.CEL
> > GSM133972.CEL
> > GSM133982.CEL
> > GSM133990.CEL
> > GSM134356.CEL
> > GSM134368.CEL
> > GSM134372.CEL
> > GSM134393.CEL
> > GSM134407.CEL
> > GSM134420.CEL
> > GSM134442.CEL
> > GSM134453.CEL
> > GSM134460.CEL
> > GSM134461.CEL
> > GSM142607.CEL
> > GSM142791.CEL
> > GSM157302.CEL
> > GSM157308.CEL
> > GSM157313.CEL
> > GSM157320.CEL
> > GSM183517.CEL
> > 
> > Although that seems like a lot of files, out of approx 2800 it is not
> > that many (about 1% of files). I can't guarantee that the list is
> > comprehensive or that I did not accidentally mistype one of the names
> > above. It is also possible that the set of corruptions in the data I
> > have differs from the set in the data you have.
> > 
> > 
> > Ben
> > 
> > 
> > On Wed, 2008-03-05 at 06:22 -0800, Alex Feltus wrote:
> > > Is there any specificity to the corruption?  Can I
> > > prescreen files in some way? 
> > > 
> > > Alex
> > > 
> > > --- Ben Bolstad <bmb@xxxxxxxxxxxxx> wrote:
> > > 
> > > > I am guessing that there are further corrupted CEL files in that
> > > > dataset. I did not have the energy to further examine that
> > > > possibility yesterday.
> > > > 
> > > > Ben
> > > > 
> > > > 
> > > > On Wed, 2008-03-05 at 06:03 -0800, Alex Feltus wrote:
> > > > > Ben:
> > > > > 
> > > > > Thanks for your help with this.  I removed those two .CEL files
> > > > > which allowed me to get to the BA stage (further than before).
> > > > > However, I still get a segmentation fault.
> > > > > 
> > > > > Alex
> > > > > 
> > > > > Here is the gdb output (appears to break as before):
> > > > > 
> > > > > Starting program: /usr/local/bin/RMAExpress 
> > > > > [Thread debugging using libthread_db enabled]
> > > > > [New Thread 47990231476192 (LWP 8626)]
> > > > > [New Thread 1082132816 (LWP 24051)]
> > > > > [New Thread 1090525520 (LWP 24052)]
> > > > > [Thread 1090525520 (LWP 24052) exited]
> > > > > [Thread 1082132816 (LWP 24051) exited]
> > > > > [New Thread 1082132816 (LWP 24053)]
> > > > > [New Thread 1090525520 (LWP 24054)]
> > > > > [New Thread 1098918224 (LWP 24055)]
> > > > > [Thread 1090525520 (LWP 24054) exited]
> > > > > [Thread 1098918224 (LWP 24055) exited]
> > > > > [Thread 1082132816 (LWP 24053) exited]
> > > > > [New Thread 1082132816 (LWP 24056)]
> > > > > [New Thread 1098918224 (LWP 24057)]
> > > > > [Thread 1098918224 (LWP 24057) exited]
> > > > > [Thread 1082132816 (LWP 24056) exited]
> > > > > 
> > > > > Program received signal SIGSEGV, Segmentation fault.
> > > > > [Switching to Thread 47990231476192 (LWP 8626)]
> > > > > 0x0000000000427480 in max_density (z=0x378a8440, rows=0, cols=<value optimized out>, column=0)
> > > > >     at Preprocess/rma_background3.c:301
> > > > > 301         if (dens_y[i] == max_y)
> > > > > 
> > > > > --- Ben Bolstad <bmb@xxxxxxxxxxxxx> wrote:
> > > > > 
> > > > > > I have investigated this issue further by first downloading all
> > > > > > available ATH1 arrays (GPL198) from GEO. I can duplicate this error,
> > > > > > but it is not due to a dataset size problem. Instead it is because
> > > > > > one of the CEL files is corrupted (GSM260954.cel for the record).
> > > > > > There is actually also another file in that set that is corrupted,
> > > > > > at least at my end (GSM226522_S17_3.CEL), though the corruption mode
> > > > > > is different. Fixes for detecting these corruption types during the
> > > > > > parsing phase are imminent for the next beta release. I'm not sure
> > > > > > of the timing of this next beta, though it will probably be within
> > > > > > the next couple of weeks.
> > > > > > 
> > > > > > I don't expect these situations to be particularly common, so I
> > > > > > don't believe that 1.0 beta 8 is critically damaged, and it is still
> > > > > > a definite improvement over 1.0 beta 7 and earlier versions for
> > > > > > (super) large datasets.
> > > > > > 
> > > > > > Best,
> > > > > > 
> > > > > > Ben 
> > > > > > 
> > > > > > 
> > > > > > On Mon, 2008-03-03 at 10:04 -0800, Alex Feltus wrote:
> > > > > > > Ben:
> > > > > > > 
> > > > > > > I can RMA 31 arrays of this type no problem, so the data is good.
> > > > > > > I have ~300GB of disk space that can be used for temp files.  I
> > > > > > > also never even hit swap since I have 16GB RAM.  I am using the
> > > > > > > default buffer settings: 150 arrays/50,000 probe sets.
> > > > > > > 
> > > > > > > Here is GDB output for 460 GPL198 arrays:
> > > > > > > 
> > > > > > > GNU gdb 6.6-debian
> > 
> === message truncated ===
> 
> 
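
On the prescreening question raised earlier in the thread, here is a
rough sketch of an automated version of the "scroll through the density
plots" check. It is not the check RMAExpress performs. It assumes the
probe intensities have already been read from each file by some other
tool (the mapping intensities_by_file is hypothetical), and the three
criteria below are guesses at what "no plottable density" could mean,
loosely motivated by the rows=0 seen in the gdb trace. It will not catch
every kind of corruption.

```python
import math


def flag_suspect_arrays(intensities_by_file):
    """Return names of arrays whose intensities cannot support a density plot.

    `intensities_by_file` maps a file name to a sequence of already-parsed
    probe intensities (hypothetical input; CEL parsing is not shown here).
    """
    suspects = []
    for name, values in intensities_by_file.items():
        values = list(values)
        if not values:
            suspects.append(name)  # nothing was read from the file
        elif any(not math.isfinite(v) for v in values):
            suspects.append(name)  # NaN or infinite intensities
        elif min(values) == max(values):
            suspects.append(name)  # constant values, no spread to plot
    return suspects
```

Files flagged this way would still need a manual look before being
dropped from a run.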

