proc: report file/anon bit in /proc/pid/pagemap
This is an implementation of Andrew's proposal to extend the pagemap file
bits to report what is missing about tasks' working set.
The problem with the working set detection is multilateral. In the criu
(checkpoint/restore) project we dump the tasks' memory into image files
and to do it properly we need to detect which pages inside mappings are
really in use. The mincore syscall I though could help with this did not.
First, it doesn't report swapped pages, thus we cannot find out which
parts of anonymous mappings to dump. Next, it does report pages from page
cache as present even if they are not mapped, and it doesn't make that has
not been cow-ed.
Note, that issue with swap pages is critical -- we must dump swap pages to
image file. But the issues with file pages are optimization -- we can
take all file pages to image, this would be correct, but if we know that a
page is not mapped or not cow-ed, we can remove them from dump file. The
dump would still be self-consistent, though significantly smaller in size
(up to 10 times smaller on real apps).
Andrew noticed, that the proc pagemap file solved 2 of 3 above issues --
it reports whether a page is present or swapped and it doesn't report not
mapped page cache pages. But, it doesn't distinguish cow-ed file pages
from not cow-ed.
I would like to make the last unused bit in this file to report whether the
page mapped into respective pte is PageAnon or not.
[comment stolen from Pavel Emelyanov's v1 patch]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>