[rmaexpress_help] Re: Difficulty in reading files
- From: Olivier Schaad <Olivier.Schaad@xxxxxxxxxxxxxxxx>
- To: rmaexpress_help@xxxxxxxxxxxxx
- Date: Fri, 01 Sep 2006 10:57:51 +0200
Dear Ben,
Do you plan to add support in RMAExpress for
normalizing tiling arrays? The CEL files have
the same format, but the CDF file no longer exists;
it is replaced by the bpmap file (PRNFGMm1b520182FR(-).bpmap),
i.e.:
PMX PMY MMX MMY Seq Pos Probe
78 187 78 188 r2_Tag 0 TGTGATAATTTCGACGAGGCGTTAC
226 29 226 30 r2_Tag 0 CAATGATAGGCTAGTCTCGCGCAGT
257 251 257 252 r2_Tag 0 GATAAGCGTTCACAGCTCGGCAATA
174 1 174 2 r2_Tag 1 CATCCGATTAAATACCGTGGATTAC
281 107 281 108 r2_Tag 1 TAGTGCATCCTCGTGGCATCATGCG
163 102 163 103 r2_Tag 1 TGTAACGCCTCGTCGAAATTATCAC
76 61 76 62 r2_Tag 2 GCGGTCACTCAGCATATAGTCGTTG
26 41 26 42 r2_Tag 5 GCGTTACGTGAGTCTGATAGCAGTT
41 87 41 88 r2_Tag 5 CTAGCCTGCCGGTCAATAACTGATG
53 1 53 2 r2_Tag 5 ATCGTAACTCGGGTGACCAATGACC
101 29 101 30 r2_Tag 5 TGTCATGATCGTGAGTTGTCGCAGT
124 3 124 4 r2_Tag 5 TCGTTAGCCCGAGCTTAACTATTAG
314 123 314 124 r2_Tag 5 GCGTAGGTATCGACTCTCACTGTGG
164 102 164 103 r2_Tag 9 TCAGAATATGTAACGCCTCGTCGAA
171 114 171 115 r2_Tag 21 CTATTGGCTGAACTACCATGTACTG
229 171 229 172 r2_Tag 26 CATGGTAGTTCAGCCAATAGATGCC
.
.
.
143 289 143 290 chr3 3756721 TCTCTTCCTGTACCTGAACTATGTA
37 229 37 230 chr3 3756750 GCAGAAGCCTGCAACTTCTCAATCA
36 131 36 132 chr3 3756775 CAAGCTGAGTTAAGAGCGAATACGC
245 101 245 102 chr3 3756802 AAAGCTACTCTTGTTCACGCAGCAG
314 265 314 266 chr3 3756827 CGATGCCTTCTCTTCCTGTACCTGA
164 299 164 300 chr3 3756853 CATTACAGTCTACAGAAGCCTTCAA
149 243 149 244 chr3 3756883 CAATCACAAGCTGAGTTAAGAGCGA
1 187 1 188 chr3 3756909 GACGCGCAAAGCTACTCTTGTTCAC
225 81 225 82 chr3 3756934 GCAGCAGCGATACCTTCTCTTCCTG
135 95 135 96 chr3 3756960 ACCTGAACATTACAGTCTACTCAAG
114 119 114 120 chr3 3756988 TCAACTTCTCAATCACAAGCTGAGG
259 189 259 190 chr3 3757016 GAACGTTGACACACAAAGCTACCCA
130 37 130 38 chr3 3757042 GTTCACACAGCGATGCCTTCCCTTT
122 105 122 106 chr3 3757068 TGTAACTGAACATTGCACTTTGCAG
209 323 209 324 chr3 3757092 GAAGCCTGCAACTTCTCAATCGCAA
18 125 18 126 chr3 3757120 GAAACTAGTCGTGTTCAAACTGTGG
44 197 44 198 chr3 3757146 ACCTGAGGCCTGTCCTGATTTCACC
16 273 16 274 chr3 3757171 TCAGTAGACAAACCGGAGATAGTTA
304 307 304 308 chr3 3757197 GAGTCGCAGTCTCACAACCAGGAAA
303 141 303 142 chr3 3757222 GCTCCAGCACCAAAGAGATCACTGC
322 305 322 306 chr3 3757246 CAAGAAAGTAGGAAGAATCAGGATA
116 243 116 244 chr3 3757267 GATAGAAGAGTGGAGATCCACAGGA
250 15 250 16 chr3 3757294 CGTGCGGGTCACAAGCCCATAAGGT
328 43 328 44 chr3 3757320 TCCAAGAGTGATGTGTAGCTCTGTT
12 73 12 74 chr3 3757347 TATGGATAATGAGAAGCTGAGAGTG
245 315 245 316 chr3 3757373 TGAATCTGGAGAAGAAGGTTGAGAA
93 171 93 172 chr3 3757403 GAGGGTTGAACGAATGGGTTGGATC
28 67 28 68 chr3 3757426 TCTATAGGATTAGGGAGAGAGTTTG
278 87 278 88 chr3 3757452 GAGGATGTTGAGAGGACAGAGGATG
261 223 261 224 chr3 3757479 TAAGTTTCAGATGGCGAATTTGACA
25 113 25 114 chr3 3757502 CAATGGAGAAGCTTAAGCAGATGGG
322 325 322 326 chr3 3757528 GCTTTAGCCGTTGGGCCAAGAAGAA
128 161 128 162 chr3 3757554 GCTTATACATGCAATAGGTTGTGTC
229 45 229 46 chr3 3757579 TATCACCCACATTGTCTACGTGCTT
131 79 131 80 chr3 3757600 GCTTCCTTTAACTGAGAACTATCTG
313
92 55 92 56 chr6 90643266 CACTGTGTAACCTCTGGGAACTTAA
28 1 28 2 chr6 90643289 AAATTACTGCAGCTCCGTTTCCCCA
271 35 271 36 chr6 90643315 ACAAGATCTTCCCAGGCTGCCTCCT
197 229 197 230 chr6 90643341 TGCAGACCCCTGACAACCAGCCCCT
30 269 30 270 chr6 90643365 TCACCAGTGGGATTTCAAAGCCACC
9 169 9 170 chr6 90643391 GAGAGCATTCCCTAGAATGCCTGCT
186 247 186 248 chr6 90643417 TTCAAGGCTCCTTTCTTAGCCTCAT
194 255 194 256 chr6 90643446 TCTGCTCCAGACAGCTGCCTCAGAG
218 89 218 90 chr6 90643472 CATTTGTAAAGGGAGCAGCCTTGTA
62 3 62 4 chr6 90643500 CACTAAAAAACCAGGAGTGGTCCAT
42 69 42 70 chr6 90643529 CACATGCAAAGACCATACAGCACCA
126 37 126 38 chr6 90643552 CAATAAACTTTGCAGGTTCAAATCC
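For anyone who wants to work with a text listing like the one above, here is a minimal Python sketch that reads such rows into records. It assumes the whitespace-separated seven-column layout shown (PMX PMY MMX MMY Seq Pos Probe) with each row on a single line; it does not read the binary .bpmap file itself, and the names ProbePair and parse_bpmap_text are made up for this illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class ProbePair(NamedTuple):
    """One PM/MM probe pair from a text listing (hypothetical record type)."""
    pm_x: int
    pm_y: int
    mm_x: int
    mm_y: int
    seq: str    # sequence name, e.g. "chr3" or "r2_Tag"
    pos: int    # position on that sequence
    probe: str  # probe sequence


def parse_bpmap_text(lines: Iterable[str]) -> Iterator[ProbePair]:
    """Yield a ProbePair for every complete seven-field row.

    The header row, "." elision lines, and truncated rows are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 7 or fields[0] == "PMX":
            continue
        pm_x, pm_y, mm_x, mm_y, seq, pos, probe = fields
        try:
            yield ProbePair(int(pm_x), int(pm_y), int(mm_x), int(mm_y),
                            seq, int(pos), probe)
        except ValueError:
            # Not a data row (non-numeric coordinates or position).
            continue


sample = """PMX PMY MMX MMY Seq Pos Probe
78 187 78 188 r2_Tag 0 TGTGATAATTTCGACGAGGCGTTAC
143 289 143 290 chr3 3756721 TCTCTTCCTGTACCTGAACTATGTA
.
313
"""
pairs = list(parse_bpmap_text(sample.splitlines()))
for pair in pairs:
    print(pair.seq, pair.pos, pair.probe)
```

In the rows shown, the MM probe sits directly below its PM probe (MMY = PMY + 1), which the records make easy to check.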
At 11:12 PM 8/31/2006, you wrote:
Hi Jun,
There are two separate issues going on here. The first is a problem with
the RMAExpress code which I will go about fixing immediately. Basically
what is happening is that it is detecting an error with the CEL file,
but failing to properly close a modal dialog box and reactivate the main
menus, so it locks up.
The second issue here is actually the underlying cause of the problem.
Specifically, some of the arrays are HG_U95A and some are HG_U95Av2.
RMAExpress tries to ensure that all CEL files being read in are from the
same array type, and in this case it is finding a mismatch. In particular,
these are the array types for that set of files:
GSM60097.CEL: HG_U95Av2
GSM60098.CEL: HG_U95Av2
GSM60099.CEL: HG_U95A
GSM60100.CEL: HG_U95A
GSM60101.CEL: HG_U95A
GSM60102.CEL: HG_U95A
GSM60103.CEL: HG_U95A
GSM60104.CEL: HG_U95A
GSM60105.CEL: HG_U95A
GSM60106.CEL: HG_U95A
GSM60107.CEL: HG_U95A
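The check described above can be sketched in a few lines of Python. This is only an illustration that works on a "file: array type" listing like the one above; it does not read CEL files and is not how RMAExpress performs the check internally. The function name group_by_array_type is an assumption.

```python
from collections import defaultdict


def group_by_array_type(listing: str) -> dict:
    """Group file names by array type from 'name: type' lines."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        if ":" not in line:
            continue
        name, array_type = line.split(":", 1)
        groups[array_type.strip()].append(name.strip())
    return dict(groups)


listing = """GSM60097.CEL: HG_U95Av2
GSM60098.CEL: HG_U95Av2
GSM60099.CEL: HG_U95A
GSM60100.CEL: HG_U95A
GSM60101.CEL: HG_U95A
GSM60102.CEL: HG_U95A
GSM60103.CEL: HG_U95A
GSM60104.CEL: HG_U95A
GSM60105.CEL: HG_U95A
GSM60106.CEL: HG_U95A
GSM60107.CEL: HG_U95A
"""

groups = group_by_array_type(listing)
if len(groups) > 1:
    # More than one array type means the set cannot be read in as-is.
    for array_type, files in groups.items():
        print(f"{array_type}: {len(files)} file(s)")
```

For this dataset it reports two HG_U95Av2 files and nine HG_U95A files, which matches the observation that only the first two files could be read together.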
Now this is really a special case because the U95A and U95Av2 differ
in only a relative handful of probesets. There are several ways
to handle this. The best way is to use the RMADataConv program,
following the instructions in the user guide subsection "Merging
MG U74A and MG U74Av2 datasets" but substituting HG_U95A and
HG_U95Av2 for the mouse array names. The requisite file of
overlap names is available at:
http://bmbolstad.com/misc/mixtureCDF/HGU95Aoverlap.txt
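The idea behind an overlap list, namely keeping only the probesets common to both array versions, can be illustrated with a short Python sketch. This is not what RMADataConv does internally, and it makes no assumption about the layout of the overlap file; the probeset names are placeholders and shared_probesets is a hypothetical helper.

```python
def shared_probesets(names_a, names_b):
    """Return the probeset names present on both arrays, sorted."""
    return sorted(set(names_a) & set(names_b))


# Placeholder names, not real HG_U95A / HG_U95Av2 probeset identifiers.
u95a_names = ["probeset_1", "probeset_2", "probeset_3"]
u95av2_names = ["probeset_2", "probeset_3", "probeset_4"]

overlap = shared_probesets(u95a_names, u95av2_names)
print(overlap)
```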
The second way to handle this (and not necessarily recommended) is to
manually edit the part of each CEL file that gives the name of
the array type so that they are all identical, and just pretend
that all the arrays are really of one type only.
Hope this helps,
Ben
On Thu, 2006-08-31 at 16:36 -0400, Jun Ding wrote:
> Hi RMAExpress Developers and Users,
>
> I have difficulties in reading the .CEL files into RMAExpress. All the
> .CEL (11) files are downloaded (GEO dataset GSE2737) and extracted, but
> I can only read in the first two .CEL files. When I tried to read in
> more (or only read one of the other files), RMAExpress just crashed. I
> have tried RMAExpress release 0.4.1 and 0.4 alpha 7. I also got an
> error message from Visual Studio Just-In-Time Debugger: "An unhandled
> win32 exception occurred in RMAExpress.exe [3352]."
>
> Does that mean all but 2 .CEL files are corrupted? Can anyone give me
> some suggestions? Thank you very much!
>
> Best,
> Jun
>
> ----------------------------
> Jun Ding, Ph.D. student
> Department of Biostatistics
> University of Michigan
> Ann Arbor, MI, 48105
> ----------------------------
>
>
>
>
>
--
Ben Bolstad <bmb@xxxxxxxxxxxxx>
http://bmbolstad.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Dr. Olivier Schaad
Chargé d'enseignement / Data Analyst
telephone: (+41-22) 379 64 78
fax: (+41-22) 379 64 76
Genomics Platform NCCR "Frontiers in Genetics"
Université de Genève/ Fac Sciences
30, Quai E Ansermet
CH-1211 Geneva 4, Switzerland
e-mail: Olivier.Schaad@xxxxxxxxxxxxxxxx
http://www.frontiers-in-genetics.org
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<