anon page allocation on solaris for shared servers went to 300MB out of nothing [long]

  • From: GG <grzegorzof@xxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Wed, 11 Jun 2014 21:16:21 +0200

Hi,
We experienced a strange server hang caused by out-of-memory errors on a 128 GB machine with a 90 GB SGA.
It's Solaris 11 with Oracle EE 11.2.0.3 plus a recent PSU.
Doing some post-mortem vmcore checking, I found this:

CAT(vmcore.0/11U)> mem
                              pages        bytes
physinstalled              16777216 137438953472 (128G)
physmem                    16489356 135080804352 (125G)
total_pages                16489085 135078584320 (125G)

freemem                       64872    531431424 (506M)
avefree                       64872    531431424 (506M)
avefree30                     65057    532946944 (508M)
needfree                      75063    614916096 (586M)
availrmem (nonswapable)     2935072  24044109824 (22.3G)
availrmem_initial          16489085 135078584320 (125G)
swapfs_minfree              2061169  16885096448 (15.7G)
sw_pending_size                             8192 (8K)

lotsfree                     257641   2110595072 (1.96G)
desfree                      128820   1055293440 (1006M)
minfree                       64410    527646720 (503M)
throttlefree                  64410    527646720 (503M)

pp_kernel(calculated)       2039836  16710336512 (15.5G)
pages_locked                   2721     22290432 (21.2M)

shared memory (SM)                       2870632 (2.73M)
intimate SM (ISM)                    96636780544 (90G)
dynamic ISM (DISM)                             0 (0)
locked DISM                       0            0 (0)
total locked SM                      96636780544 (90G) (70.31% of memory)
spt_used (ISM)             11796482  96636780544 (90G)
segspt_minfree               809107   6628204544 (6.17G)

WARNING: soft swapping (avefree < desfree && freemem <= desfree)

k_anoninfo: (physical == disk-backed)
  ani_phys_max - disk swap                               17039359 pages (129G)
  ani_phys_avail - available disk                         8443024 pages (64.4G)
  ani_asleep_mem_resv - reserved asleep memory                  0 pages (0)
  ani_mem_resv - reserved memory                                0 pages (0)
  ani_mem_locked - locked memory                         11796482 pages (90G)
  ani_free - unallocated physical and memory              8541727 pages (65.1G)

initial virtual swap available for reservation 31467275 pages (240G)
  ani_max + MAX(availrmem_initial - swapfs_minfree, 0)
current virtual swap available for reservation 9316927 pages (71G)
  ani_phys_avail + Asleep_availrmem + MAX(availrmem - swapfs_minfree, 0)
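
To double-check the two virtual swap figures, the formulas printed by CAT can be replayed with the page counts from the output above. This is only a sanity-check sketch: the 8K page size is derived from physinstalled (137438953472 bytes / 16777216 pages), and Asleep_availrmem is assumed to be 0 because that is the value that makes the reported total add up.

```python
# Page counts copied from the `mem` output above.
PAGE_SIZE = 8192  # bytes per page, derived from physinstalled bytes / pages

ani_max = 17039359            # ani_phys_max (disk swap)
ani_phys_avail = 8443024      # available disk swap
availrmem_initial = 16489085
availrmem = 2935072
swapfs_minfree = 2061169
asleep_availrmem = 0          # assumption: not shown in the dump, taken as 0

freemem = 64872
avefree = 64872
desfree = 128820


def gib(pages):
    """Convert a page count to GiB."""
    return pages * PAGE_SIZE / 2**30


# Formulas exactly as printed under the two "virtual swap" lines.
initial_vswap = ani_max + max(availrmem_initial - swapfs_minfree, 0)
current_vswap = ani_phys_avail + asleep_availrmem + max(availrmem - swapfs_minfree, 0)

# Condition quoted in the WARNING line.
soft_swapping = avefree < desfree and freemem <= desfree

print(f"initial virtual swap: {initial_vswap} pages ({gib(initial_vswap):.0f}G)")
print(f"current virtual swap: {current_vswap} pages ({gib(current_vswap):.0f}G)")
print(f"soft swapping: {soft_swapping}")
```

Both results match the page counts CAT reports (31467275 and 9316927), so the drop from 240G to 71G comes mostly from availrmem falling from 125G to 22.3G.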


CAT(vmcore.0/11U)> proc -r -s size
addr             PID PPID RUID/UID        size       RSS    swresv lwpcnt command
============== ===== ==== ======== =========== ========= ========= ====== ============
0x6401242c9000   568    1      100 96886915072   4456448   6840320      1 ora_s083_sid
0x640208970ff8  8314    1      100 96886923264   4538368   6840320      1 ora_diag_sid
0x640143146050   657    1      100 96887463936   4431872   7127040      1 ora_s115_sid

The RSS values above look OK, but check these out:

0x6401b0bf9000  8471    1      100 97329143808 297426944 368320512    258 ora_s034_sid
0x64015b6e8050   534    1      100 97439129600 301449216 552239104      1 ora_s070_sid
0x640119acb018   369    1      100 97455792128 301031424 565493760      1 ora_s060_sid
0x6402742f4040 27109    1      100 97455898624 295297024 574578688      1 ora_s039_sid
0x64027e1cc020   659    1      100 97455923200 299892736 568311808      1 ora_s116_sid
0x640297349000   212    1      100 97457045504 298565632 564764672      7 ora_s051_sid
0x640164be8000  8407    1      100 97463615488 300081152 538255360    258 ora_s002_sid
0x6402045a9028   610    1      100 97472503808 299917312 589152256      1 ora_s102_sid
0x640133fb0048   552    1      100 97472544768 299761664 589258752      1 ora_s075_sid
0x6401ef3be020   384    1      100 97472675840 298442752 584015872      1 ora_s066_sid
0x640206514008   226    1      100 97472684032 296189952 588218368      1 ora_s058_sid
0x64028bb3d000   578    1      100 97472684032 301342720 587759616      1 ora_s088_sid
0x6401632e8010   378    1      100 97472692224 299614208 584294400      1 ora_s063_sid
0x64023294cfe0   574    1      100 97472692224 297820160 586129408      1 ora_s086_sid

The RSS is about 280-300M, which looks strange to me for an Oracle server process.

Going further:

CAT(vmcore.0/11U)> mem -l user
  PID  size   RSS swrsv   anon  swap  file command
  665 91.9G  284M 1.65G   280M 1.37G  196M ora_s119_sid
  663 90.7G  308M  579M   303M  273M  196M ora_s118_sid
  659 90.7G  286M  541M   280M  259M  196M ora_s116_sid
  657 90.2G 4.22M 6.79M    24K 3.95M  196M ora_s115_sid
  629 91.8G  282M 1.56G   277M 1.29G  196M ora_s111_sid
  627 90.7G  286M  568M   281M  284M  196M ora_s110_sid
  625 90.2G 6.64M 7.56M   736K 4.09M  196M ora_s109_sid
  623 91.7G  279M 1.51G   275M 1.24G  196M ora_s108_sid
  618 91.9G  286M 1.72G   282M 1.44G  196M ora_s106_sid
  616 90.7G  285M  574M   280M  290M  196M ora_s105_sid
  612 90.7G  291M  574M   285M  286M  196M ora_s103_sid
  610 90.7G  286M  561M   280M  278M  196M ora_s102_sid
  608 91.9G  283M 1.67G   279M 1.39G  196M ora_s101_sid
  606 90.7G  290M  575M   285M  288M  196M ora_s100_sid
  602 90.2G 4.22M 6.90M    24K 3.83M  196M ora_s098_sid
  598 90.7G  285M  567M   280M  285M  196M ora_s096_sid
  596 90.7G  286M  566M   280M  283M  196M ora_s095_sid
  594 90.7G  287M  553M   282M  268M  196M ora_s094_sid
  584 90.2G 6.39M 7.66M   464K 4.50M  196M ora_s091_sid
  582 90.2G 4.23M 7.16M    24K 4.37M  196M ora_s090_sid
  580 90.7G  286M  557M   281M  272M  196M ora_s089_sid
  578 90.7G  287M  560M   281M  276M  196M ora_s088_sid
  574 90.7G  284M  558M   278M  277M  196M ora_s086_sid
  572 91.8G  284M 1.63G   279M 1.36G  196M ora_s085_sid
  570 90.7G  285M  567M   279M  285M  196M ora_s084_sid
  568 90.2G 4.25M 6.52M    24K 3.63M  196M ora_s083_sid
  566 90.7G  286M  569M   280M  286M  196M ora_s082_sid
  562 90.8G  287M  614M   281M  329M  196M ora_s080_sid
  560 91.8G  285M 1.63G   281M 1.35G  196M ora_s079_sid
  554 91.8G  284M 1.63G   280M 1.35G  196M ora_s076_sid
  552 90.7G  285M  561M   280M  279M  196M ora_s075_sid
  542 90.7G  291M  576M   286M  287M  196M ora_s074_sid
  538 90.2G 6.25M 7.91M   496K 4.67M  196M ora_s072_sid
  534 90.7G  287M  526M   281M  242M  196M ora_s070_sid
  530 91.9G  287M 1.72G   282M 1.44G  196M ora_s068_sid
  386 90.7G  285M  567M   279M  285M  196M ora_s067_sid
  384 90.7G  284M  556M   278M  275M  196M ora_s066_sid
  382 90.7G  288M  566M   283M  280M  196M ora_s065_sid
  380 91.8G  284M 1.63G   280M 1.35G  196M ora_s064_sid
  378 90.7G  285M  557M   280M  273M  196M ora_s063_sid
  373 90.2G 4.25M 6.91M    24K 4.02M  196M ora_s062_sid
  369 90.7G  287M  539M   281M  255M  196M ora_s060_sid
  367 90.7G  287M  569M   281M  285M  196M ora_s059_sid
  312 91.9G  288M 1.65G   284M 1.37G  196M ora_s048_sid
  304 91.8G  284M 1.56G   279M 1.29G  196M ora_s047_sid
  302 90.2G 4.54M 7.91M    24K 5.10M  196M ora_s046_sid
  292 91.9G  288M 1.67G   284M 1.39G  196M ora_s045_sid
  226 90.7G  282M  560M   276M  281M  196M ora_s058_sid
  224 90.7G  286M  571M   280M  288M  196M ora_s057_sid
  220 90.8G  286M  640M   281M  356M  196M ora_s055_sid
  216 90.2G 4.64M 7.91M    24K 5.14M  196M ora_s053_sid
  214 90.2G 4.25M 7.34M    24K 4.59M  196M ora_s052_sid
  212 90.7G  284M  538M   278M  257M  196M ora_s051_sid


I did some math: there were about 79 shared servers, each with roughly 280MB of anon memory.
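
For anyone who wants to redo that math, here is a rough sketch that sums the anon column of a `mem -l user` listing. The three sample rows are copied from the output above; the 100M cutoff for a "big" process and the helper names are my own choices, nothing CAT provides.

```python
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}


def parse_size(text):
    """Convert a CAT-style size such as '24K', '280M' or '1.37G' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def anon_by_process(listing):
    """Map command name -> anon bytes for each row of `mem -l user` output."""
    result = {}
    for line in listing.splitlines():
        fields = line.split()
        # Rows look like: PID size RSS swrsv anon swap file command
        if len(fields) != 8 or not fields[0].isdigit():
            continue  # skip the header and blank lines
        result[fields[7]] = parse_size(fields[4])
    return result


# Three rows copied from the listing above; paste the full output to sum it all.
SAMPLE = """\
  665 91.9G  284M 1.65G   280M 1.37G  196M ora_s119_sid
  663 90.7G  308M  579M   303M  273M  196M ora_s118_sid
  657 90.2G 4.22M 6.79M    24K 3.95M  196M ora_s115_sid
"""

anon = anon_by_process(SAMPLE)
# Hypothetical threshold: treat anything above 100M of anon as "big".
big = {name: size for name, size in anon.items() if size > 100 * 2**20}
print(len(big), "big processes,", round(sum(big.values()) / 2**20), "M anon in total")

# The back-of-the-envelope figure: 79 shared servers at about 280M each.
estimate_gib = 79 * 280 / 1024
print(f"79 x 280M is about {estimate_gib:.1f}G")
```

That estimate is close to the 22.3G of availrmem left on the box, which is why these processes look like the interesting ones.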
Questions:
Does anyone have an idea what could cause such anon/private memory utilization in shared servers? Is it normal at all?

Currently the anon size for a shared server process (pmap -x PID) is around 4-7MB. There is only one shared server where pmap -x PID reports 300MB of anon space, and interestingly Oracle's v$sesstat claims that the PGA/UGA memory allocated for that process is only 20MB.
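
To see which mappings hold the anon pages in that one process, the `pmap -x PID` output can be sorted by its Anon column. A minimal sketch, assuming the usual Solaris column order (Address, Kbytes, RSS, Anon, Locked, Mode, Mapped File) with "-" meaning zero; the function name is made up and the column layout should be checked against the real output first.

```python
def top_anon_mappings(pmap_output, limit=10):
    """Return (anon_kb, address, mapping) tuples from `pmap -x`, largest first.

    Assumes the column order Address Kbytes RSS Anon Locked Mode Mapped File.
    """
    rows = []
    for line in pmap_output.splitlines():
        fields = line.split(None, 6)
        if len(fields) < 6:
            continue  # "PID: command" line, separator line, blank lines
        try:
            int(fields[0], 16)  # data rows start with a hex address
        except ValueError:
            continue  # header row and "total Kb" row
        anon = fields[3]
        anon_kb = int(anon) if anon.isdigit() else 0  # "-" means no anon pages
        mapping = fields[6].strip() if len(fields) > 6 else ""
        rows.append((anon_kb, fields[0], mapping))
    rows.sort(reverse=True)
    return rows[:limit]
```

Feeding it the saved output of pmap -x for the 300MB process, for example `top_anon_mappings(open("pmap.out").read())` with a hypothetical file name, would show whether the anon memory sits in the heap or in separate anon segments.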

Any ideas on how I can drill down into the memory allocations of a shared server process?

BTW, Oracle recommended decreasing the SGA :).

Regards
GG




--
//www.freelists.org/webpage/oracle-l

