[mira_talk] Re: Mira memory problem 64bits

  • From: raw937@xxxxxxxxx
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 18 Jul 2011 15:20:02 +0000

Camila,
PRINSEQ lite is a perfect way to trim your SMART primers together with the 
quality values, etc. It is a quick and easy Perl script. 

-----Original Message-----
From: Lionel Guy <guy.lionel@xxxxxxxxx>
Sender: mira_talk-bounce@xxxxxxxxxxxxx
Date: Mon, 18 Jul 2011 17:09:39 
To: <mira_talk@xxxxxxxxxxxxx>
Reply-To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: Mira memory problem 64bits

Hi Camila,

On 18 Jul 2011, at 16:33 , Mazzoni, Camila wrote:

> I tried to read about others' problems with memory but it doesn't seem to 
> apply to mine. I'm trying to run a cDNA assembly from 454 using a fasta file 
> only because I had to trim the SMART primers and couldn't do it in the 
> quality files.

Quoting Bastien, every time you assemble without quality, God kills a kitten. 
Running without qualities is ALWAYS a bad idea. There are tools out there (e.g. 
in the FASTX-Toolkit, or in Galaxy, as far as I remember) to filter primers in 
both the fasta and qual files. One solution is to convert fasta + qual to fastq, 
trim, and convert back to fasta + qual (or not, since mira takes fastq input).
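
To illustrate the fasta + qual to fastq route, here is a minimal Python sketch. 
It is not one of the tools mentioned above, only an outline of the idea: cut the 
same number of positions from the sequence and from its quality values, then 
write fastq. The function names and the primer "ACGTAC" are hypothetical 
placeholders (not the real SMART primer), only an exact match at the 5' end is 
handled, and Phred+33 encoding is assumed. Reads left shorter than min_len 
after trimming are dropped.

```python
from typing import Dict, Iterator, List, Tuple


def read_records(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, body) pairs from FASTA-style text (.fasta or .qual)."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, " ".join(chunks)
            # Use the first word of the header as the read name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, " ".join(chunks)


def trim_pair(seq: str, quals: List[int], primer: str) -> Tuple[str, List[int]]:
    """Remove an exact 5' primer match from sequence and qualities together."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if primer and seq.upper().startswith(primer.upper()):
        return seq[len(primer):], quals[len(primer):]
    return seq, quals


def fasta_qual_to_fastq(fasta_text: str, qual_text: str,
                        primer: str, min_len: int = 1) -> str:
    """Trim the primer and return the surviving reads as FASTQ text."""
    quals_by_name: Dict[str, List[int]] = {
        name: [int(q) for q in body.split()]
        for name, body in read_records(qual_text)
    }
    out = []
    for name, body in read_records(fasta_text):
        seq = body.replace(" ", "")
        # A read without a matching quality entry raises KeyError here.
        seq, quals = trim_pair(seq, quals_by_name[name], primer)
        if len(seq) < min_len:
            continue  # drop reads emptied (or nearly emptied) by trimming
        # Phred+33 encoding; cap at 93 to stay within printable ASCII.
        qual_string = "".join(chr(min(q, 93) + 33) for q in quals)
        out.append(f"@{name}\n{seq}\n+\n{qual_string}\n")
    return "".join(out)
```

A real primer filter also has to cope with mismatches, primers at the 3' end 
and reverse complements, which is why a dedicated tool is the better choice.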

Otherwise, it seems like a "normal" out-of-memory problem... Could you indicate:
- what system you are running your assembly on (especially memory)
- what is the exact command you used to run mira
- which version of mira you are using
- whether your library was normalized

From the repeat ratio histogram, it seems you have quite a lot of high-number 
repeats (which is expected with a non-normalized cDNA library), but it seems a 
bit strange that 400,000 reads would take up to 14 GB of memory... Did you try 
to run miramem? What does it tell you?
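
To see where the memory goes, the "Memory used in assembly structures" table 
that mira prints on an out-of-memory error can be sorted by size. Below is a 
minimal Python sketch, assuming the row layout seen in the log quoted below 
(name, element count, effective size, free capacity); other mira versions may 
print it differently. The function name rank_structures is hypothetical and 
this is not part of mira or miramem.

```python
import re
from typing import List, Tuple

# Binary unit multipliers as they appear in the table.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}

# Matches rows such as:
# ">    AS_permanent_overlap_bans:  155345426     5.8 GiB         0 B   0 B"
# The leading \W* skips the email quote marker and indentation.
ROW = re.compile(
    r"^\W*(AS_\w+):\s+(\d+)"
    r"\s+([\d.]+)\s(B|KiB|MiB|GiB)"   # effective size
    r"\s+([\d.]+)\s(B|KiB|MiB|GiB)"   # free capacity
)


def rank_structures(log_text: str) -> List[Tuple[str, int, float, float]]:
    """Return (name, elements, size_bytes, free_bytes), largest size first."""
    rows = []
    for line in log_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # not a table row (headers, /proc dumps, blank lines)
        name, count, size, size_unit, free, free_unit = match.groups()
        rows.append((
            name,
            int(count),
            float(size) * UNITS[size_unit],
            float(free) * UNITS[free_unit],
        ))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    sample = (
        ">      AS_writtenskimhitsperid:     486969       2 MiB         0 B         4 B\n"
        ">                AS_skim_edges:          0     7.7 GiB     7.7 GiB         0 B\n"
        ">    AS_permanent_overlap_bans:  155345426     5.8 GiB         0 B         0 B\n"
    )
    for name, count, size, free in rank_structures(sample):
        print(f"{name:28s} {count:>10d} {size / 1024 ** 3:6.2f} GiB "
              f"(free {free / 1024 ** 3:.2f} GiB)")
```

On the table in this thread, the two largest entries are AS_skim_edges (7.7 
GiB, all of it reserved but unused capacity) and AS_permanent_overlap_bans 
(5.8 GiB for 155,345,426 entries, i.e. roughly 40 bytes per entry), which 
together make up most of the 14.6 GiB total.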

Cheers,

Lionel

> After trimming, many reads became empty, I hope it's not a problem. There's a 
> warning, but it runs anyway.
> The assembly ran for a couple of days until I got the problem. Would really 
> appreciate some help.
> 
> G4PJSDO01AVIZX: unable to load or other reason for invalid data.
> G4PJSDO01D1A0V: unable to load or other reason for invalid data.
> G4PJSDO01DQ1WX: unable to load or other reason for invalid data.
> G4PJSDO01CP10J: unable to load or other reason for invalid data.
> G4PJSDO01ENJN9: unable to load or other reason for invalid data.
> G4PJSDO01DY3E3: unable to load or other reason for invalid data.
> G4PJSDO01A5O3K: unable to load or other reason for invalid data.
> G4PJSDO01A0QYE: unable to load or other reason for invalid data.
> G4PJSDO01DM9DJ: unable to load or other reason for invalid data.
> G4PJSDO01ED04F: unable to load or other reason for invalid data.
> G4PJSDO01DOW9Z: unable to load or other reason for invalid data.
> G4PJSDO01CVKTS: unable to load or other reason for invalid data.
> G4PJSDO01DC53K: unable to load or other reason for invalid data.
> 
> ===========================================================================
> Pool statistics:
> Backbones: 0    Backbone rails: 0
> 
>         Sanger    454    PacBio    Solexa    SOLiD
>         ----------------------------------------
> Total reads    0    486969    0    0    0
> Reads wo qual    0    486969    0    0    0
> Used reads    0    126111    0    0    0
> Avg tot rlen    0    136    0    0    0
> Avg rlen used    0    382    0    0    0
> 
> With strain    0    0    0    0    0
> W/o clips    0    366774    0    0    0
> ===========================================================================
> 
> 
> Localtime: Sun Jul 17 15:27:04 2011
> Writing temporary hstat files:
>  [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] 
> ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... 
> [90%] ....|.... [100%] done
> Localtime: Sun Jul 17 15:27:18 2011
> 
> Analysing hstat files:
>  [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] 
> ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... 
> [90%] ....|.... [100%] 
> Localtime: Sun Jul 17 15:28:18 2011
> Hash statistics:
> =========================================================
> Measured avg. frequency coverage: 38
> 
> Deduced thresholds:
> -------------------
> Min normal cov: 15
> Max normal cov: 61
> Repeat cov: 72
> Heavy cov: 304
> Crazy cov: 760
> Mask cov: 3800
> 
> Repeat ratio histogram:
> -----------------------
> 0    1092396
> 1    176514
> 2    67544
> 3    34124
> 4    19688
> 5    13174
> 6    7984
> 7    5068
> 8    3056
> 9    2316
> 10    2190
> 11    1740
> 12    1302
> 13    1258
> 14    1050
> 15    930
> 16    1144
> 17    916
> 18    558
> 19    464
> 20    472
> 21    572
> 22    592
> 23    550
> 24    396
> 25    474
> 26    492
> 27    326
> 28    374
> 29    466
> 30    580
> 31    500
> 32    448
> 33    342
> 34    436
> 35    330
> 36    174
> 37    254
> 38    232
> 39    150
> 40    138
> 41    120
> 42    132
> 43    208
> 44    224
> 45    122
> 46    168
> 47    224
> 48    228
> 49    304
> 50    178
> 51    126
> 52    152
> 53    104
> 54    130
> 55    154
> 56    112
> 57    94
> 58    52
> 59    108
> 60    114
> 61    100
> 62    118
> 63    144
> 64    94
> 65    166
> 66    108
> 67    76
> 68    70
> 69    110
> 70    138
> 71    96
> 72    80
> 73    74
> 74    68
> 75    72
> 76    76
> 77    78
> 78    106
> 79    96
> 80    102
> 81    106
> 82    136
> 83    170
> 84    208
> 85    220
> 86    162
> 87    134
> 88    128
> 89    74
> 90    114
> 91    108
> 92    50
> 93    82
> 94    92
> 95    46
> 96    66
> 97    54
> 98    88
> 99    80
> 100    74
> 101    74
> 102    74
> 103    74
> 104    80
> 105    70
> 106    72
> 107    62
> 108    74
> 109    64
> 110    64
> 111    104
> 112    98
> 113    92
> 114    76
> 115    96
> 116    102
> 117    98
> 118    62
> 119    46
> 120    36
> 121    56
> 122    80
> 123    70
> 124    118
> 125    62
> 126    64
> 127    76
> 128    84
> 129    98
> 130    110
> 131    86
> 132    66
> 133    52
> 134    36
> 135    54
> 136    142
> 137    46
> 138    44
> 139    52
> 140    66
> 141    92
> 142    56
> 143    36
> 144    46
> 145    98
> 146    36
> 147    48
> 148    20
> 149    20
> 150    48
> 151    50
> 152    22
> 153    42
> 154    30
> 155    22
> 156    22
> 157    32
> 158    40
> 159    24
> 160    22
> 161    26
> 162    26
> 163    22
> 164    34
> 165    22
> 166    4
> 167    14
> 168    24
> 169    18
> 170    48
> 171    28
> 172    34
> 173    22
> 174    18
> 175    6
> 176    10
> 177    8
> 178    16
> 179    4
> 180    12
> 181    10
> 182    8
> 183    4
> 184    2
> 185    2
> 186    4
> 188    4
> 189    4
> 190    12
> 191    10
> 192    8
> 193    12
> 194    18
> 195    18
> 196    12
> 197    18
> 198    12
> 199    12
> 200    6
> 201    6
> 202    2
> 203    12
> 204    6
> 205    12
> 206    4
> 207    4
> 208    8
> 209    2
> 211    6
> 212    8
> 213    12
> 214    16
> 215    20
> 216    14
> 217    18
> 218    18
> 219    28
> 220    6
> 221    4
> 222    8
> 223    12
> 224    2
> 225    4
> 226    18
> 227    42
> 228    12
> 229    26
> 230    36
> 231    24
> 232    16
> 233    8
> 234    18
> 235    14
> 236    18
> 237    12
> 238    24
> 239    38
> 240    18
> 241    10
> 317    2
> =========================================================
> 
> Assigning statistics values:
> Localtime: Sun Jul 17 15:28:24 2011
>  [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] 
> ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... 
> [90%] ....|.... [100%] 
> Localtime: Sun Jul 17 15:28:39 2011
> clean up temporary stat files...Localtime: Sun Jul 17 15:28:39 2011
> Writing read repeat info to: 
> G4PJSDO01_20110627_output_gact_assembly/G4PJSDO01_20110627_output_gact_d_info/G4PJSDO01_20110627_output_gact_info_readrepeats.lst
>  ... 70344 sequences with 290480 masked stretches.
> Localtime: Sun Jul 17 15:28:41 2011
> 
> 
> Searching for possible overlaps:
> Localtime: Sun Jul 17 15:28:44 2011
> Now running threaded and partitioned skimmer with 1 partitions in 2 threads:
> Ouch, out of memory detected.
> 
> 
> ========================== Memory self assessment 
> ==============================
> Running in 64 bit mode.
> 
> Dump from /proc/meminfo
> --------------------------------------------------------------------------------
> MemTotal:       16471540 kB
> MemFree:          235044 kB
> Buffers:            1136 kB
> Cached:            12848 kB
> SwapCached:        55960 kB
> Active:         13934608 kB
> Inactive:        2154392 kB
> Active(anon):   13930900 kB
> Inactive(anon):  2145880 kB
> Active(file):       3708 kB
> Inactive(file):     8512 kB
> Unevictable:          16 kB
> Mlocked:              16 kB
> SwapTotal:       1052216 kB
> SwapFree:            236 kB
> Dirty:                12 kB
> Writeback:           312 kB
> AnonPages:      16020528 kB
> Mapped:             5376 kB
> Shmem:               972 kB
> Slab:              29332 kB
> SReclaimable:      11716 kB
> SUnreclaim:        17616 kB
> KernelStack:        2160 kB
> PageTables:        48628 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:     9287984 kB
> Committed_AS:   17754680 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:      305136 kB
> VmallocChunk:   34350540532 kB
> HardwareCorrupted:     0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        9984 kB
> DirectMap2M:     3135488 kB
> DirectMap1G:    13631488 kB
> --------------------------------------------------------------------------------
> 
> Dump from /proc/self/status
> --------------------------------------------------------------------------------
> Name:    mira
> State:    R (running)
> Tgid:    19465
> Pid:    19465
> PPid:    1
> TracerPid:    0
> Uid:    1010    1010    1010    1010
> Gid:    100    100    100    100
> FDSize:    256
> Groups:    100 
> VmPeak:    17188460 kB
> VmSize:    17188460 kB
> VmLck:           0 kB
> VmHWM:    16172456 kB
> VmRSS:    15973200 kB
> VmData:    17183588 kB
> VmStk:         108 kB
> VmExe:        4728 kB
> VmLib:           0 kB
> VmPTE:       33384 kB
> Threads:    1
> SigQ:    0/128618
> SigPnd:    0000000000000000
> ShdPnd:    0000000000000000
> SigBlk:    0000000000000000
> SigIgn:    0000000000000001
> SigCgt:    0000000180000000
> CapInh:    0000000000000000
> CapPrm:    0000000000000000
> CapEff:    0000000000000000
> CapBnd:    ffffffffffffffff
> Cpus_allowed:    fff
> Cpus_allowed_list:    0-11
> Mems_allowed:    00000000,00000003
> Mems_allowed_list:    0-1
> voluntary_ctxt_switches:    44914
> nonvoluntary_ctxt_switches:    232867
> Stack usage:    104 kB
> --------------------------------------------------------------------------------
> 
> Information on current assembly object:
> 
> AS_readpool: 486969 reads.
> AS_contigs: 0 contigs.
> AS_bbcontigs: 0 contigs.
> Mem used for reads: 753421136 (719 MiB)
> 
> Memory used in assembly structures:
>                                            Eff. Size   Free cap. LostByAlign
>      AS_writtenskimhitsperid:     486969       2 MiB         0 B         4 B
>                AS_skim_edges:          0     7.7 GiB     7.7 GiB         0 B
>                  AS_adsfacts:          0     133 MiB     133 MiB         4 B
>           AS_confirmed_edges:          0     267 MiB     267 MiB         4 B
>    AS_permanent_overlap_bans:  155345426     5.8 GiB         0 B         0 B
>               AS_readhitmiss:          0        24 B         0 B         0 B
>             AS_readhmcovered:          0        24 B         0 B         0 B
>                 AS_count_rhm:          0        24 B         0 B         0 B
>                  AS_clipleft:     486969       2 MiB         0 B         4 B
>                 AS_clipright:     486969       2 MiB         0 B         4 B
>                  AS_used_ids:          0     476 KiB     476 KiB         7 B
>               AS_multicopies:     486969     476 KiB         0 B         7 B
>             AS_hasmcoverlaps:     486969     476 KiB         0 B         7 B
>        AS_maxcoveragereached:     486969       2 MiB         0 B         4 B
>        AS_coverageperseqtype:          0        24 B         0 B         0 B
>            AS_istroublemaker:     486969     476 KiB         0 B         7 B
>                  AS_isdebris:     486969     476 KiB         0 B         7 B
>           AS_needalloverlaps:     486969     476 KiB         7 B         0 B
>     AS_readsforrepeatresolve:          0        40 B         0 B         0 B
>                 AS_allrmbsok:          0        24 B         0 B         0 B
>         AS_probablermbsnotok:          0        24 B         0 B         0 B
>             AS_weakrmbsnotok:          0        24 B         0 B         0 B
>           AS_readmaytakeskim:          0        40 B         0 B         0 B
>                AS_skimstaken:          0        40 B         0 B         0 B
>           AS_numskimoverlaps:          0        24 B         0 B         0 B
>        AS_numleftextendskims:          0        24 B         0 B         0 B
>          AS_rightextendskims:          0        24 B         0 B         0 B
>       AS_skimleftextendratio:          0        24 B         0 B         0 B
>      AS_skimrightextendratio:          0        24 B         0 B         0 B
>              AS_usedlogfiles:         32       1 KiB         0 B         0 B
> Total: 15645945424 (14.6 GiB)
> 
> ================================================================================
> 
> 
> ========================== Memory self assessment 
> ==============================
> Running in 64 bit mode.
> 
> Dump from /proc/meminfo
> --------------------------------------------------------------------------------
> MemTotal:       16471540 kB
> MemFree:          121808 kB
> Buffers:            5004 kB
> Cached:            48048 kB
> SwapCached:        64824 kB
> Active:         13986316 kB
> Inactive:        2217332 kB
> Active(anon):   13971012 kB
> Inactive(anon):  2180684 kB
> Active(file):      15304 kB
> Inactive(file):    36648 kB
> Unevictable:          16 kB
> Mlocked:              16 kB
> SwapTotal:       1052216 kB
> SwapFree:          31128 kB
> Dirty:               780 kB
> Writeback:             0 kB
> AnonPages:      16086008 kB
> Mapped:            16436 kB
> Shmem:              1100 kB
> Slab:              29312 kB
> SReclaimable:      11732 kB
> SUnreclaim:        17580 kB
> KernelStack:        2600 kB
> PageTables:        49312 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:     9287984 kB
> Committed_AS:   18208056 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:      305136 kB
> VmallocChunk:   34350540532 kB
> HardwareCorrupted:     0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        9984 kB
> DirectMap2M:     3135488 kB
> DirectMap1G:    13631488 kB
> --------------------------------------------------------------------------------
> 
> Dump from /proc/self/status
> --------------------------------------------------------------------------------
> Name:    mira
> State:    R (running)
> Tgid:    19465
> Pid:    19465
> PPid:    1
> TracerPid:    0
> Uid:    1010    1010    1010    1010
> Gid:    100    100    100    100
> FDSize:    256
> Groups:    100 
> VmPeak:    17188460 kB
> VmSize:    17188460 kB
> VmLck:           0 kB
> VmHWM:    16172456 kB
> VmRSS:    15972808 kB
> VmData:    17183588 kB
> VmStk:         108 kB
> VmExe:        4728 kB
> VmLib:           0 kB
> VmPTE:       33384 kB
> Threads:    1
> SigQ:    0/128618
> SigPnd:    0000000000000000
> ShdPnd:    0000000000000000
> SigBlk:    0000000000000000
> SigIgn:    0000000000000001
> SigCgt:    0000000180000000
> CapInh:    0000000000000000
> CapPrm:    0000000000000000
> CapEff:    0000000000000000
> CapBnd:    ffffffffffffffff
> Cpus_allowed:    fff
> Cpus_allowed_list:    0-11
> Mems_allowed:    00000000,00000003
> Mems_allowed_list:    0-1
> voluntary_ctxt_switches:    44925
> nonvoluntary_ctxt_switches:    232898
> Stack usage:    104 kB
> --------------------------------------------------------------------------------
> 
> Information on current assembly object:
> 
> AS_readpool: 486969 reads.
> AS_contigs: 0 contigs.
> AS_bbcontigs: 0 contigs.
> Mem used for reads: 753421136 (719 MiB)
> 
> Memory used in assembly structures:
>                                            Eff. Size   Free cap. LostByAlign
>      AS_writtenskimhitsperid:     486969       2 MiB         0 B         4 B
>                AS_skim_edges:          0     7.7 GiB     7.7 GiB         0 B
>                  AS_adsfacts:          0     133 MiB     133 MiB         4 B
>           AS_confirmed_edges:          0     267 MiB     267 MiB         4 B
>    AS_permanent_overlap_bans:  155345426     5.8 GiB         0 B         0 B
>               AS_readhitmiss:          0        24 B         0 B         0 B
>             AS_readhmcovered:          0        24 B         0 B         0 B
>                 AS_count_rhm:          0        24 B         0 B         0 B
>                  AS_clipleft:     486969       2 MiB         0 B         4 B
>                 AS_clipright:     486969       2 MiB         0 B         4 B
>                  AS_used_ids:          0     476 KiB     476 KiB         7 B
>               AS_multicopies:     486969     476 KiB         0 B         7 B
>             AS_hasmcoverlaps:     486969     476 KiB         0 B         7 B
>        AS_maxcoveragereached:     486969       2 MiB         0 B         4 B
>        AS_coverageperseqtype:          0        24 B         0 B         0 B
>            AS_istroublemaker:     486969     476 KiB         0 B         7 B
>                  AS_isdebris:     486969     476 KiB         0 B         7 B
>           AS_needalloverlaps:     486969     476 KiB         7 B         0 B
>     AS_readsforrepeatresolve:          0        40 B         0 B         0 B
>                 AS_allrmbsok:          0        24 B         0 B         0 B
>         AS_probablermbsnotok:          0        24 B         0 B         0 B
>             AS_weakrmbsnotok:          0        24 B         0 B         0 B
>           AS_readmaytakeskim:          0        40 B         0 B         0 B
>                AS_skimstaken:          0        40 B         0 B         0 B
>           AS_numskimoverlaps:          0        24 B         0 B         0 B
>        AS_numleftextendskims:          0        24 B         0 B         0 B
>          AS_rightextendskims:          0        24 B         0 B         0 B
>       AS_skimleftextendratio:          0        24 B         0 B         0 B
>      AS_skimrightextendratio:          0        24 B         0 B         0 B
>              AS_usedlogfiles:         32       1 KiB         0 B         0 B
> Total: 15645945424 (14.6 GiB)
> 
> ================================================================================
> Dynamic allocs: 28
> Align allocs: 618
> Out of memory detected, exception message is: std::bad_alloc
> 
> 
> If you have questions on why this happened, please send the last 1000
> lines of the output log (or better: the complete file) to the author
> together with a short summary of your assembly project.
> 
> 
> 
> For general help, you will probably get a quicker response on the
>     MIRA talk mailing list
> than if you mailed the author directly.
> 
> To report bugs or ask for features, please use the new ticketing system at:
>     http://sourceforge.net/apps/trac/mira-assembler/
> This ensures that requests don't get lost.
> 



--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
