On Tue, Dec 13, 2022 at 06:04:04PM +0500, Muhammad Usama Anjum wrote:
Hi Muhammad! I'm really sorry for diving in so late (unfortunately I've been too busy to step in until now). Anyway, while in general such an interface looks reasonable, here are a few points which really bother me: as far as I understand you don't need vzalloc here; plain vmalloc should work as well, since you copy only the filled results back to userspace.
Thank you for reviewing. Correct, I'll update to use vmalloc.
Next -- there is no restriction on the vec_len parameter; isn't this a door for DoS from userspace? Say I could start a number of ioctls on the same pagemap and request a very large vec_len in each, which in sum would put big pressure on the kernel's memory. Or am I missing something obvious here?
Yes, there is a chance that a large chunk of kernel memory can get allocated here, as vec_len can be very large. We need to think of limiting this buffer in the current implementation. Any reasonable limit should work, but I'm not sure what a reasonable limit would be. Maybe a couple of hundred MB? I'll think about it. Or I should update the implementation so that a smaller intermediate buffer is used, as mincore does. But this can complicate the implementation further, as we are already using page ranges instead of keeping just the flags. I'll see what can be done.
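For illustration only, a cap on vec_len could be checked before any allocation, roughly as below. This is a userspace sketch, not code from the patch: SCAN_MAX_VEC_LEN, struct region_sketch and checked_vec_bytes() are hypothetical names, and the cap value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on the number of output regions per call. */
#define SCAN_MAX_VEC_LEN ((uint64_t)1 << 16)

/* Hypothetical stand-in for one element of the output vector. */
struct region_sketch {
	uint64_t start;
	uint64_t len;
	uint64_t flags;
};

/* The cap must never allow the byte count to overflow size_t. */
static_assert(SCAN_MAX_VEC_LEN <= SIZE_MAX / sizeof(struct region_sketch),
	      "cap too large for size_t arithmetic");

/*
 * Validate a user-supplied vec_len and compute the buffer size in bytes.
 * Returns 0 and sets *bytes on success, -1 if vec_len is zero or above
 * the cap (a kernel version would return -EINVAL or similar instead).
 */
static int checked_vec_bytes(uint64_t vec_len, size_t *bytes)
{
	if (vec_len == 0 || vec_len > SCAN_MAX_VEC_LEN)
		return -1;
	*bytes = (size_t)vec_len * sizeof(struct region_sketch);
	return 0;
}
```

A fixed cap like this only bounds a single call; it does not by itself address many concurrent callers, which is a separate question.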
You know, I'm not yet convinced about the overall design. This is new uapi which should be reviewed very, very carefully: once merged we can't step back and will have to live with it forever. As to the buffer size: look at how pagemap_read is implemented, it allocates a PAGEMAP_WALK_SIZE buffer array to gather results and then copies it back to userspace. If the main idea is to be able to walk over the memory of a process with the mm context locked, it still doesn't bring much benefit, because once the ioctl is complete the state of the mm can change, so precise results are only possible if the target process is not running.
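To make the bounded-buffer pattern concrete, here is a rough userspace sketch of gathering results into a fixed-size scratch array and copying them out chunk by chunk. It is not the pagemap_read code: WALK_CHUNK_ENTRIES, entry_for_page() and walk_in_chunks() are hypothetical, and memcpy stands in for copy_to_user.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scratch size; plays the role of the fixed walk buffer. */
#define WALK_CHUNK_ENTRIES 512

/* Hypothetical per-page producer standing in for the real page walk. */
static uint64_t entry_for_page(uint64_t page_index)
{
	return page_index * 2 + 1;
}

/*
 * Produce `count` entries starting at page `first` into `out`, using a
 * scratch buffer whose size does not depend on `count`. Returns the
 * number of chunks that were copied out.
 */
static size_t walk_in_chunks(uint64_t first, size_t count, uint64_t *out)
{
	uint64_t scratch[WALK_CHUNK_ENTRIES];
	size_t done = 0, chunks = 0;

	while (done < count) {
		size_t n = count - done;

		if (n > WALK_CHUNK_ENTRIES)
			n = WALK_CHUNK_ENTRIES;
		assert(n <= WALK_CHUNK_ENTRIES);

		/* Gather one chunk of results into the bounded buffer. */
		for (size_t i = 0; i < n; i++)
			scratch[i] = entry_for_page(first + done + i);

		/* In the kernel this step would be copy_to_user(). */
		memcpy(out + done, scratch, n * sizeof(scratch[0]));
		done += n;
		chunks++;
	}
	return chunks;
}
```

With this shape the kernel-side memory use stays constant no matter how large a range userspace asks for; the cost is that results are produced piecewise rather than from a single snapshot.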
Maybe all of these aspects have been discussed already; I probably need to read all the previous conversations first :)