On Tue, May 18, 2021 at 01:08:27PM +0200, Michal Hocko wrote:
On Tue 18-05-21 12:35:36, David Hildenbrand wrote:
On 18.05.21 12:31, Michal Hocko wrote:
Although I have to say openly that I am not a great fan of VM_FAULT_OOM in general. It is usually a a wrong way to tell the handle the failure because it happens outside of the allocation context so you lose all the details (e.g. allocation constrains, numa policy etc.). Also whenever there is ENOMEM then the allocation itself has already made sure that all the reclaim attempts have been already depleted. Just consider an allocation with GFP_NOWAIT/NO_RETRY or similar to fail and propagate ENOMEM up the call stack. Turning that into the OOM killer sounds like a bad idea to me. But that is a more general topic. I have tried to bring this up in the past but there was not much of an interest to fix it as it was not a pressing problem...
I'm certainly interested; it would mean that we actually want to try recovering from VM_FAULT_OOM in various cases, and as you state, we might have to supply more information to make that work reliably.
Or maybe we want to get rid of VM_FAULT_OOM altogether... But this is really tangent to this discussion. The only relation is that this would be another place to check when somebody wants to go that direction.
If we are to get rid of VM_FAULT_OOM, vmf_error() would be updated and this place will get the update automagically.
Having that said, I guess what we have here is just the same as when our process fails to allocate a generic page table in __handle_mm_fault(), when we fail p4d_alloc() and friends ...
From a quick look it is really similar in a sense that it effectively never happens and if it does then it certainly does the wrong thing. The point I was trying to make is that there is likely no need to go that way.
As David pointed out, failure to handle direct map in secretmem_fault() is like any allocation failure in page fault handling and most of them result in VM_FAULT_OOM, so I think that having vmf_error() in secretmem_fault() is more consistent with the rest of the code than using VM_FAULT_SIGBUS.
Besides if the direct map manipulation failures would result in errors other than -ENOMEM, having vmf_error() may prove useful.