[procps] Re: [PATCH v2][resend] library: properly handle memory used by tmpfs

  • From: Jakob Unterwurzacher <jakobunt@xxxxxxxxx>
  • To: Jaromir Capik <jcapik@xxxxxxxxxx>
  • Date: Tue, 29 Apr 2014 10:23:22 +0200

Thanks for the detailed response, by the way! The concern about
inconsistencies is certainly valid. Probably changing the "cached" value as
reported by the kernel is not the best approach, after all.

Something like this could work; what do you think?

             total       used       free     shared    buffers     cached
Mem:          1000        500        100        150        100        200
-/+ shm/buf/cache:        350        250

That is, leave the "Mem" line untouched and adjust only the "-/+" line for
memory used by tmpfs. That would restore the semantics of the "-/+" line
without messing up values reported by the kernel.
(Note that I'm not sure if "-/+ buffers/cache" should actually be renamed
to "-/+ shm/buf/cache", I just did it here for clarity.)
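
To make the arithmetic behind the example numbers explicit, here is a minimal
sketch in C. It is not the procps code: the struct and the helper names
(mem_row, reclaimable_cache, adjusted_used, adjusted_free) are made up for
illustration, and it assumes that "shared" is fully contained in "cached",
as the kernel reports it. The "Mem" values stay as read; only the "-/+" line
is derived with tmpfs memory left out of the reclaimable part.

```c
#include <stdio.h>

/* One "Mem:" row, in the same units as the example table above. */
struct mem_row {
    unsigned long total, used, free, shared, buffers, cached;
};

/* Hypothetical helper: the part of "cached" that is not tmpfs memory.
 * Clamped at zero in case shared exceeds cached. */
static unsigned long reclaimable_cache(const struct mem_row *m)
{
    return m->cached > m->shared ? m->cached - m->shared : 0;
}

/* "used" column of the "-/+" line: used minus buffers and non-tmpfs cache. */
static unsigned long adjusted_used(const struct mem_row *m)
{
    unsigned long sub = m->buffers + reclaimable_cache(m);
    return m->used > sub ? m->used - sub : 0;
}

/* "free" column of the "-/+" line: free plus buffers and non-tmpfs cache. */
static unsigned long adjusted_free(const struct mem_row *m)
{
    return m->free + m->buffers + reclaimable_cache(m);
}

/* Prints the adjusted line; the "Mem:" row itself is not modified. */
static void print_adjusted_line(const struct mem_row *m)
{
    printf("-/+ shm/buf/cache: %10lu %10lu\n",
           adjusted_used(m), adjusted_free(m));
}
```

With the values from the table (used 500, free 100, shared 150, buffers 100,
cached 200) this gives 500 - 100 - (200 - 150) = 350 used and
100 + 100 + (200 - 150) = 250 free.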

Best regards,
Jakob


On Tue, Apr 29, 2014 at 9:06 AM, Jakob Unterwurzacher <jakobunt@xxxxxxxxx> wrote:

> Hi Jaromir! I'm sorry I got you in the line of fire!
>
> If this is not the way to go, do your colleagues have a better way to tell
> if the system is (close to) being out of memory?
>
> Regards,
> Jakob
>
>
> On Mon, Apr 28, 2014 at 3:00, Jaromir Capik <jcapik@xxxxxxxxxx> wrote:
>
>> Hello Jakob.
>>
>> I must apologize to you.
>> Your patch triggered a really strong second wave of discussion in the
>> kernel team. A few people were strictly against the subtraction and said
>> the change reinforces the misconception that memory in the page cache
>> can be considered free.
>> Moreover, the manpage was modified to contain the following:
>> "the Cached value is actually the sum of page cache and tmpfs memory,"
>> but according to the info I got, that isn't correct. The kernel guys
>> explained that the Cached value is not a sum of page cache and tmpfs:
>> the tmpfs memory in fact lives in the page cache (it is a part of it)
>> and therefore shouldn't be subtracted. They believe that diverging from
>> the kernel concept causes more confusion, and agreed that the previous
>> decision was a mistake. I tried to read all their responses and now I'm
>> convinced it really isn't systematic, as other reporting tools don't
>> subtract tmpfs from cached. I'm going to remove the subtraction and change
>> the manual accordingly. I hope you won't feel discouraged and sad because
>> of that.
>> You lost a lot of your time and I personally appreciate your effort
>> to contribute and make the tools better. In this case it was simply
>> walking on thin ice and no one could be sure about the right approach.
>>
>> Sorry for that and thanks for the understanding.
>>
>> Regards,
>> Jaromir.
>>
>> --
>> Jaromir Capik
>> Red Hat Czech, s.r.o.
>> Software Engineer / Secondary Arch
>>
>> Email: jcapik@xxxxxxxxxx
>> Web: www.cz.redhat.com
>> Red Hat Czech s.r.o., Purkynova 99/71, 612 45, Brno, Czech Republic
>> IC: 27690016
>>
>> ----- Original Message -----
>>
>>>  From: "Jakob Unterwurzacher" <jakobunt@xxxxxxxxx>
>>>  To: procps@xxxxxxxxxxxxx
>>>  Cc: "Jakob Unterwurzacher" <jakobunt@xxxxxxxxx>
>>>  Sent: Tuesday, February 18, 2014 10:12:21 PM
>>>  Subject: [procps] [PATCH v2][resend] library: properly handle memory used by tmpfs
>>>   tmpfs has become much more widely used since distributions use it for
>>>  /tmp (Fedora 18+). In /proc/meminfo, memory used by tmpfs is accounted
>>>  into "Cached" (aka "NR_FILE_PAGES",
>>>   http://lxr.free-electrons.com/source/mm/shmem.c#L301 ).
>>>   The tools just pass it on, so what top, free and vmstat report as
>>>  "cached" is the sum of page cache and tmpfs.
>>>   free has the extremely useful "-/+ buffers/cache" output. However, now
>>>  that tmpfs is accounted into "cached", those numbers are way off once
>>>  you have big files in /tmp.
>>>   Fortunately, kernel 2.6.32 introduces "Shmem", which makes tmpfs memory
>>>  usage accessible from userspace (
>>>  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4b02108ac1b3354a22b0d83c684797692efdc395
>>>  ).
>>>   This patch subtracts Shmem from Cached to get the actual page cache
>>>  memory. This makes both issues mentioned above disappear. For older
>>>  kernels, Shmem is not available (hence zero) and this patch is no-op.
>>>   Additionally:
>>>  * Update the man pages of free and vmstat to explain what is happening
>>>  * Finally drop "MemShared" from the /proc/meminfo parser, it has been
>>>    dead for 10+ years and is only causing confusion ( removed in kernel
>>>    2.5.54, see
>>>    https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=fe04e9451e5a159247cf9f03c615a4273ac0c571
>>>    )
>>>  ---
>>>   free.1         | 29 ++++++++++++++++++++++++-----
>>>   proc/sysinfo.c |  8 ++++----
>>>   proc/sysinfo.h |  3 +--
>>>   vmstat.8       |  3 ++-
>>>   4 files changed, 31 insertions(+), 12 deletions(-)
>>>   diff --git a/free.1 b/free.1
>>>  index 1e8e7ef..21cce28 100644
>>>  --- a/free.1
>>>  +++ b/free.1
>>>  @@ -11,11 +11,30 @@ free \- Display amount of free and used memory in the system
>>>   .SH DESCRIPTION
>>>   .B free
>>>   displays the total amount of free and used physical and swap memory in the
>>>  -system, as well as the buffers used by the kernel.
>>>  -The shared memory column represents either the MemShared value (2.4 series
>>>  -kernels) or the Shmem value (2.6 series kernels and later) taken from the
>>>  -/proc/meminfo file. The value is zero if none of the entries is exported
>>>  -by the kernel.
>>>  +system, as well as the buffers and caches used by the kernel. The
>>>  +information is gathered by parsing /proc/meminfo. The displayed
>>>  +columns are:
>>>  +.TP
>>>  +\fBtotal\fR
>>>  +Total installed memory (MemTotal and SwapTotal in /proc/meminfo)
>>>  +.TP
>>>  +\fBused\fR
>>>  +Used memory (calculated as total - free)
>>>  +.TP
>>>  +\fBfree\fR
>>>  +Unused memory (MemFree and SwapFree in /proc/meminfo)
>>>  +.TP
>>>  +\fBshared\fR
>>>  +Memory used (mostly) by tmpfs (Shmem in /proc/meminfo, available on
>>>  +kernels 2.6.32, displayed as zero if not available)
>>>  +.TP
>>>  +\fBbuffers\fR
>>>  +Memory used by kernel buffers (Buffers in /proc/meminfo)
>>>  +.TP
>>>  +\fBcached\fR
>>>  +Memory used by the page cache  (calculated as Cached - Shmem in
>>>  +/proc/meminfo - the Cached value is actually the sum of page cache and
>>>  +tmpfs memory)
>>>   .SH OPTIONS
>>>   .TP
>>>   \fB\-b\fR, \fB\-\-bytes\fR
>>>  diff --git a/proc/sysinfo.c b/proc/sysinfo.c
>>>  index f318376..42b8e7c 100644
>>>  --- a/proc/sysinfo.c
>>>  +++ b/proc/sysinfo.c
>>>  @@ -531,7 +531,6 @@ static int compare_mem_table_structs(const void *a, const void *b){
>>>    *
>>>    * MemTotal:        61768 kB    old
>>>    * MemFree:          1436 kB    old
>>>  - * MemShared:           0 kB    old (now always zero; not calculated)
>>>    * Buffers:          1312 kB    old
>>>    * Cached:          20932 kB    old
>>>    * Active:          12464 kB    new
>>>  @@ -560,7 +559,7 @@ static int compare_mem_table_structs(const void *a, const void *b){
>>>    * Hugepagesize:     4096 kB    2.5.??+
>>>    */
>>>    -/* obsolete since 2.6.x, but reused for shmem in 2.6.32+ */
>>>  +/* Shmem in 2.6.32+ */
>>>   unsigned long kb_main_shared;
>>>   /* old but still kicking -- the important stuff */
>>>   unsigned long kb_main_buffers;
>>>  @@ -631,14 +630,13 @@ void meminfo(void){
>>>     {"LowTotal",     &kb_low_total},
>>>     {"Mapped",       &kb_mapped},       // kB version of vmstat nr_mapped
>>>     {"MemFree",      &kb_main_free},    // important
>>>  -  {"MemShared",    &kb_main_shared},  // obsolete since kernel 2.6! (sharing the variable with Shmem replacement)
>>>     {"MemTotal",     &kb_main_total},   // important
>>>     {"NFS_Unstable", &kb_nfs_unstable},
>>>     {"PageTables",   &kb_pagetables},   // kB version of vmstat nr_page_table_pages
>>>     {"ReverseMaps",  &nr_reversemaps},  // same as vmstat nr_page_table_pages
>>>     {"SReclaimable", &kb_swap_reclaimable}, // "swap reclaimable" (dentry and inode structures)
>>>     {"SUnreclaim",   &kb_swap_unreclaimable},
>>>  -  {"Shmem",        &kb_main_shared},  // kernel 2.6 and later (sharing the output variable with obsolete MemShared)
>>>  +  {"Shmem",        &kb_main_shared},  // kernel 2.6.32 and later
>>>     {"Slab",         &kb_slab},         // kB version of vmstat nr_slab
>>>     {"SwapCached",   &kb_swap_cached},
>>>     {"SwapFree",     &kb_swap_free},    // important
>>>  @@ -684,6 +682,8 @@ nextline:
>>>     }
>>>     kb_swap_used = kb_swap_total - kb_swap_free;
>>>     kb_main_used = kb_main_total - kb_main_free;
>>>  +  /* "Cached" includes "Shmem" - we want only the page cache here */
>>>  +  kb_main_cached -= kb_main_shared;
>>>   }
>>>     /*****************************************************************/
>>>  diff --git a/proc/sysinfo.h b/proc/sysinfo.h
>>>  index 1eb3472..2291631 100644
>>>  --- a/proc/sysinfo.h
>>>  +++ b/proc/sysinfo.h
>>>  @@ -20,8 +20,7 @@ extern int        uptime (double *uptime_secs, double *idle_secs);
>>>   extern unsigned long getbtime(void);
>>>   extern void       loadavg(double *av1, double *av5, double *av15);
>>>    -
>>>  -/* obsolete */
>>>  +/* Shmem in 2.6.32+ */
>>>   extern unsigned long kb_main_shared;
>>>   /* old but still kicking -- the important stuff */
>>>   extern unsigned long kb_main_buffers;
>>>  diff --git a/vmstat.8 b/vmstat.8
>>>  index ef6cbe9..d102602 100644
>>>  --- a/vmstat.8
>>>  +++ b/vmstat.8
>>>  @@ -99,7 +99,8 @@ b: The number of processes in uninterruptible sleep.
>>>   swpd: the amount of virtual memory used.
>>>   free: the amount of idle memory.
>>>   buff: the amount of memory used as buffers.
>>>  -cache: the amount of memory used as cache.
>>>  +cache: the amount of memory used as cache (excluding tmpfs memory for
>>>  +kernels 2.6.32+)
>>>   inact: the amount of inactive memory.  (\-a option)
>>>   active: the amount of active memory.  (\-a option)
>>>   .fi
>>>  --
>>>  1.8.5.3
>>>
>>>
>>
>
