On 8/9/19 11:05 AM, Mina Almasry wrote:
On Fri, Aug 9, 2019 at 4:27 AM Michal Koutný <mkoutny@suse.com> wrote:
Alternatives considered: [...]
(I did not try that but) have you considered: 3) MAP_POPULATE while you're making the reservation,
I have tried this, and the behaviour is not great. If userspace uses MAP_POPULATE to mmap more memory than its cgroup limit allows, the kernel reserves the full amount requested, faults in pages up to the cgroup limit, and then sends SIGBUS to the task when it tries to access the rest of its 'reserved' memory.
So for example:
- if /proc/sys/vm/nr_hugepages == 10, and
- your cgroup limit is 5 pages, and
- you mmap(MAP_POPULATE) 7 pages.
Then the kernel will reserve 7 pages, fault in 5 of those 7 pages, and SIGBUS you when you try to access the remaining 2 pages. So the problem persists: folks would still like to know at mmap time that they are crossing the limit.
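To make the arithmetic in the example above concrete, here is a toy model in Python. It is only a sketch of the accounting as described in this mail, not kernel code: the function name populate_outcome and its parameters are made up for illustration, and the early failure when the request exceeds the pool is an assumption that ignores hugepage overcommit.

```python
def populate_outcome(pool_pages, cgroup_limit_pages, requested_pages):
    """Toy model of the mmap(MAP_POPULATE) behaviour described above.

    All names here are hypothetical; this only mirrors the page counts
    from the example, it does not call into the kernel.
    """
    if requested_pages > pool_pages:
        # Assumption: a reservation larger than the global pool
        # (nr_hugepages, no overcommit) already fails at mmap time.
        return {"mmap_ok": False, "reserved": 0, "faulted": 0, "sigbus_pages": 0}

    # The whole request is reserved, regardless of the cgroup limit.
    reserved = requested_pages
    # Population only succeeds up to the cgroup limit.
    faulted = min(requested_pages, cgroup_limit_pages)
    # Touching any of the remaining reserved pages raises SIGBUS.
    return {
        "mmap_ok": True,
        "reserved": reserved,
        "faulted": faulted,
        "sigbus_pages": reserved - faulted,
    }


if __name__ == "__main__":
    # nr_hugepages == 10, cgroup limit of 5 pages, mmap of 7 pages
    print(populate_outcome(pool_pages=10, cgroup_limit_pages=5, requested_pages=7))
```

With the numbers from the example, the model reports 7 pages reserved, 5 faulted in, and 2 pages that would SIGBUS on access, while mmap itself still reports success. That gap between a successful mmap and a later SIGBUS is the problem being discussed.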
If you got the failure at mmap time in the MAP_POPULATE case, would that be useful?
Just thinking that would be a relatively simple change.