[rmaexpress_help] Re: Difficulty in reading files

  • From: Olivier Schaad <Olivier.Schaad@xxxxxxxxxxxxxxxx>
  • To: rmaexpress_help@xxxxxxxxxxxxx
  • Date: Fri, 01 Sep 2006 10:57:51 +0200

Dear Ben,
Do you plan to make it possible to use RMAExpress to normalize tiling arrays? The CEL files have the same format, but there is no longer a CDF file; it is replaced by a bpmap file (PRNFGMm1b520182FR(-).bpmap),
i.e.:


PMX     PMY     MMX     MMY     Seq     Pos     Probe
78      187     78      188     r2_Tag  0       TGTGATAATTTCGACGAGGCGTTAC
226     29      226     30      r2_Tag  0       CAATGATAGGCTAGTCTCGCGCAGT
257     251     257     252     r2_Tag  0       GATAAGCGTTCACAGCTCGGCAATA
174     1       174     2       r2_Tag  1       CATCCGATTAAATACCGTGGATTAC
281     107     281     108     r2_Tag  1       TAGTGCATCCTCGTGGCATCATGCG
163     102     163     103     r2_Tag  1       TGTAACGCCTCGTCGAAATTATCAC
76      61      76      62      r2_Tag  2       GCGGTCACTCAGCATATAGTCGTTG
26      41      26      42      r2_Tag  5       GCGTTACGTGAGTCTGATAGCAGTT
41      87      41      88      r2_Tag  5       CTAGCCTGCCGGTCAATAACTGATG
53      1       53      2       r2_Tag  5       ATCGTAACTCGGGTGACCAATGACC
101     29      101     30      r2_Tag  5       TGTCATGATCGTGAGTTGTCGCAGT
124     3       124     4       r2_Tag  5       TCGTTAGCCCGAGCTTAACTATTAG
314     123     314     124     r2_Tag  5       GCGTAGGTATCGACTCTCACTGTGG
164     102     164     103     r2_Tag  9       TCAGAATATGTAACGCCTCGTCGAA
171     114     171     115     r2_Tag  21      CTATTGGCTGAACTACCATGTACTG
229     171     229     172     r2_Tag  26      CATGGTAGTTCAGCCAATAGATGCC
.
.
.
143     289     143     290     chr3    3756721 TCTCTTCCTGTACCTGAACTATGTA
37      229     37      230     chr3    3756750 GCAGAAGCCTGCAACTTCTCAATCA
36      131     36      132     chr3    3756775 CAAGCTGAGTTAAGAGCGAATACGC
245     101     245     102     chr3    3756802 AAAGCTACTCTTGTTCACGCAGCAG
314     265     314     266     chr3    3756827 CGATGCCTTCTCTTCCTGTACCTGA
164     299     164     300     chr3    3756853 CATTACAGTCTACAGAAGCCTTCAA
149     243     149     244     chr3    3756883 CAATCACAAGCTGAGTTAAGAGCGA
1       187     1       188     chr3    3756909 GACGCGCAAAGCTACTCTTGTTCAC
225     81      225     82      chr3    3756934 GCAGCAGCGATACCTTCTCTTCCTG
135     95      135     96      chr3    3756960 ACCTGAACATTACAGTCTACTCAAG
114     119     114     120     chr3    3756988 TCAACTTCTCAATCACAAGCTGAGG
259     189     259     190     chr3    3757016 GAACGTTGACACACAAAGCTACCCA
130     37      130     38      chr3    3757042 GTTCACACAGCGATGCCTTCCCTTT
122     105     122     106     chr3    3757068 TGTAACTGAACATTGCACTTTGCAG
209     323     209     324     chr3    3757092 GAAGCCTGCAACTTCTCAATCGCAA
18      125     18      126     chr3    3757120 GAAACTAGTCGTGTTCAAACTGTGG
44      197     44      198     chr3    3757146 ACCTGAGGCCTGTCCTGATTTCACC
16      273     16      274     chr3    3757171 TCAGTAGACAAACCGGAGATAGTTA
304     307     304     308     chr3    3757197 GAGTCGCAGTCTCACAACCAGGAAA
303     141     303     142     chr3    3757222 GCTCCAGCACCAAAGAGATCACTGC
322     305     322     306     chr3    3757246 CAAGAAAGTAGGAAGAATCAGGATA
116     243     116     244     chr3    3757267 GATAGAAGAGTGGAGATCCACAGGA
250     15      250     16      chr3    3757294 CGTGCGGGTCACAAGCCCATAAGGT
328     43      328     44      chr3    3757320 TCCAAGAGTGATGTGTAGCTCTGTT
12      73      12      74      chr3    3757347 TATGGATAATGAGAAGCTGAGAGTG
245     315     245     316     chr3    3757373 TGAATCTGGAGAAGAAGGTTGAGAA
93      171     93      172     chr3    3757403 GAGGGTTGAACGAATGGGTTGGATC
28      67      28      68      chr3    3757426 TCTATAGGATTAGGGAGAGAGTTTG
278     87      278     88      chr3    3757452 GAGGATGTTGAGAGGACAGAGGATG
261     223     261     224     chr3    3757479 TAAGTTTCAGATGGCGAATTTGACA
25      113     25      114     chr3    3757502 CAATGGAGAAGCTTAAGCAGATGGG
322     325     322     326     chr3    3757528 GCTTTAGCCGTTGGGCCAAGAAGAA
128     161     128     162     chr3    3757554 GCTTATACATGCAATAGGTTGTGTC
229     45      229     46      chr3    3757579 TATCACCCACATTGTCTACGTGCTT
131     79      131     80      chr3    3757600 GCTTCCTTTAACTGAGAACTATCTG
313

92      55      92      56      chr6    90643266 CACTGTGTAACCTCTGGGAACTTAA
28      1       28      2       chr6    90643289 AAATTACTGCAGCTCCGTTTCCCCA
271     35      271     36      chr6    90643315 ACAAGATCTTCCCAGGCTGCCTCCT
197     229     197     230     chr6    90643341 TGCAGACCCCTGACAACCAGCCCCT
30      269     30      270     chr6    90643365 TCACCAGTGGGATTTCAAAGCCACC
9       169     9       170     chr6    90643391 GAGAGCATTCCCTAGAATGCCTGCT
186     247     186     248     chr6    90643417 TTCAAGGCTCCTTTCTTAGCCTCAT
194     255     194     256     chr6    90643446 TCTGCTCCAGACAGCTGCCTCAGAG
218     89      218     90      chr6    90643472 CATTTGTAAAGGGAGCAGCCTTGTA
62      3       62      4       chr6    90643500 CACTAAAAAACCAGGAGTGGTCCAT
42      69      42      70      chr6    90643529 CACATGCAAAGACCATACAGCACCA
126     37      126     38      chr6    90643552 CAATAAACTTTGCAGGTTCAAATCC
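
For anyone who wants to work with a listing like the one above, here is a minimal Python sketch that reads the seven whitespace-separated columns (PMX, PMY, MMX, MMY, Seq, Pos, Probe) into records. It assumes the bpmap content has already been dumped to text in exactly this layout; the names `ProbePair` and `parse_bpmap_text` are made up for illustration and are not part of RMAExpress or any Affymetrix tool.

```python
from typing import Iterable, Iterator, NamedTuple


class ProbePair(NamedTuple):
    """One PM/MM probe pair from a text listing of a bpmap file."""
    pm_x: int
    pm_y: int
    mm_x: int
    mm_y: int
    seq: str    # sequence name, e.g. "chr3" or "r2_Tag"
    pos: int    # position of the probe on that sequence
    probe: str  # probe sequence


HEADER = ["PMX", "PMY", "MMX", "MMY", "Seq", "Pos", "Probe"]


def parse_bpmap_text(lines: Iterable[str]) -> Iterator[ProbePair]:
    """Yield a ProbePair for every complete data row.

    The header row, blank lines, "." filler lines and truncated rows
    (anything that does not have exactly seven fields) are skipped.
    """
    for raw in lines:
        fields = raw.split()  # works for tabs or runs of spaces
        if len(fields) != 7 or fields == HEADER:
            continue
        pm_x, pm_y, mm_x, mm_y, seq, pos, probe = fields
        try:
            yield ProbePair(int(pm_x), int(pm_y), int(mm_x), int(mm_y),
                            seq, int(pos), probe)
        except ValueError:
            continue  # a numeric column did not hold a number


# A few rows copied from the listing above, including filler lines.
sample = """\
PMX     PMY     MMX     MMY     Seq     Pos     Probe
78      187     78      188     r2_Tag  0       TGTGATAATTTCGACGAGGCGTTAC
.
143     289     143     290     chr3    3756721 TCTCTTCCTGTACCTGAACTATGTA
313
92 55 92 56 chr6 90643266 CACTGTGTAACCTCTGGGAACTTAA
"""

pairs = list(parse_bpmap_text(sample.splitlines()))
for p in pairs:
    print(p.seq, p.pos, (p.pm_x, p.pm_y), (p.mm_x, p.mm_y))
```

In the rows shown in this message, each MM probe has the same X as its PM probe and a Y that is one greater, so a quick sanity check on parsed records is `p.mm_x == p.pm_x and p.mm_y == p.pm_y + 1`.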



At 11:12 PM 8/31/2006, you wrote:
Hi Jun,

There are two separate issues going on here. The first is a problem with
the RMAExpress code which I will go about fixing immediately. Basically
what is happening is that it is detecting an error with the CEL file,
but failing to properly close a modal dialog box and reactivate the main
menus, so it locks up.

The second issue here is actually the underlying cause of the problem.
Specifically, some of the arrays are HG_U95A and some are HG_U95Av2.
RMAExpress tries to ensure that all CEL files being read in are from the
same array type, and in this case it is finding a mismatch. These are
the array types for that set of files:


GSM60097.CEL: HG_U95Av2
GSM60098.CEL: HG_U95Av2
GSM60099.CEL: HG_U95A
GSM60100.CEL: HG_U95A
GSM60101.CEL: HG_U95A
GSM60102.CEL: HG_U95A
GSM60103.CEL: HG_U95A
GSM60104.CEL: HG_U95A
GSM60105.CEL: HG_U95A
GSM60106.CEL: HG_U95A
GSM60107.CEL: HG_U95A
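
The same-array-type check described above can be illustrated with a short Python sketch. It is not RMAExpress's actual code; it simply groups the file names by the array types listed in this message, and the function name `group_by_array_type` is hypothetical. How the array type is read out of each CEL file is deliberately left out.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_array_type(array_types: Dict[str, str]) -> Dict[str, List[str]]:
    """Group CEL file names by the array type reported for each file."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for filename, chip in array_types.items():
        groups[chip].append(filename)
    return dict(groups)


# Array types exactly as listed in this message.
array_types = {
    "GSM60097.CEL": "HG_U95Av2",
    "GSM60098.CEL": "HG_U95Av2",
}
# GSM60099 through GSM60107 are all HG_U95A.
array_types.update({f"GSM{n}.CEL": "HG_U95A" for n in range(60099, 60108)})

groups = group_by_array_type(array_types)
if len(groups) > 1:
    print("Array type mismatch:")
    for chip, files in sorted(groups.items()):
        print(f"  {chip}: {len(files)} file(s)")
```

Run on this data set, the grouping gives two HG_U95Av2 files and nine HG_U95A files, which matches the report that only the first two files could be read in together.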

Now this is really a special case because the U95A and U95Av2 differ
in only a relative handful of probesets. There are several ways
to handle this. The best way is to use the RMADataConv program, basically
following the instructions in the user guide in the subsection "Merging
MG U74A and MG U74Av2 datasets" but substituting HG_U95A and
HG_U95Av2 for the mouse array names. The requisite file of
overlap names is available at:

http://bmbolstad.com/misc/mixtureCDF/HGU95Aoverlap.txt

The second way to handle this (and not necessarily recommended) is to
manually edit the part of the CEL files which gives the name of the
array type so that they are all identical, and just pretend
that all the arrays are really of one type only.
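
If the array-type name is edited with a script rather than by hand, one pitfall is that HG_U95A is a prefix of HG_U95Av2, so a plain search-and-replace would turn HG_U95Av2 into HG_U95Av2v2. The Python sketch below shows a whole-name replacement on a piece of header text. It assumes the header is plain text and that you have already located the part holding the array name; the function name `rename_array_type` and the example header strings are hypothetical, not the real CEL layout.

```python
import re


def rename_array_type(header_text: str, old: str, new: str) -> str:
    """Replace `old` with `new` only where `old` stands as a whole name.

    A match is rejected when it is directly preceded or followed by a
    letter, digit or underscore, so "HG_U95A" does not match inside
    "HG_U95Av2".
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(old) + r"(?![A-Za-z0-9_])"
    )
    # A function is used as the replacement so `new` is inserted literally.
    return pattern.sub(lambda match: new, header_text)


# Hypothetical header fragments, for illustration only.
print(rename_array_type("chip=HG_U95A;", "HG_U95A", "HG_U95Av2"))
print(rename_array_type("chip=HG_U95Av2;", "HG_U95A", "HG_U95Av2"))
```

Work on copies of the CEL files, since this approach hides a real difference between the two array types.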

Hope this helps,

Ben




On Thu, 2006-08-31 at 16:36 -0400, Jun Ding wrote:
> Hi RMAExpress Developers and Users,
>
> I have difficulties in reading the .CEL files into RMAExpress. All the
> .CEL (11) files are downloaded (GEO dataset GSE2737) and extracted, but
> I can only read in the first two .CEL files. When I tried to read in
> more (or only read one of the other files), RMAExpress just crashed. I
> have tried RMAExpress release 0.4.1 and 0.4 alpha 7. I also got an
> error message from Visual Studio Just-In-Time Debugger: "An unhandled
> win32 exception occurred in RMAExpress.exe [3352]."
>
> Does that mean all but 2 .CEL files are corrupted? Can anyone give me
> some suggestions? Thank you very much!
>
> Best,
> Jun
>
> ----------------------------
> Jun Ding, Ph.D. student
> Department of Biostatistics
> University of Michigan
> Ann Arbor, MI, 48105
> ----------------------------
>
--
Ben Bolstad <bmb@xxxxxxxxxxxxx>
http://bmbolstad.com



>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Dr. Olivier Schaad telephone: (+41-22) 379 64 78
Chargé d'enseignement / Data Analyst fax: (+41-22) 379 64 76
Genomics Platform NCCR "Frontiers in Genetics"
Université de Genève/ Fac Sciences
30, Quai E Ansermet
CH-1211 Geneva 4, Switzerland
e-mail: Olivier.Schaad@xxxxxxxxxxxxxxxx
http://www.frontiers-in-genetics.org
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<


