On Mon, Oct 21, 2024 at 11:35:24PM +0200, Vlastimil Babka wrote:
On 10/21/24 23:20, David Hildenbrand wrote:
I don't think there's really any value in that. There's just no sensible situation in which a user would care about this I don't think.
Making sure nobody touches an area, and wile doing that somebody already touched that area? I guess it could be worked around by mprotect(PROT_NONE),madvise(GUARD),mprotect(PROT_READ|PROT_WRITE) ... which is not particularly nice :)
And if you're saying 'hey do MADV_DONTNEED if this fails and keep trying!' then why not just do that in the kernel?
Heh, no!
If user space doesn't expect there to be something, it should *fail*. That's likely going to be the majority of use cases for guard pages (happy to be told otherwise). No retry.
And if user space expects there to be something it should zap ahead of time (which some allocators maybe already do to free up memory after free()) to then install the guard. No retry.
There is this case where user space might be unsure. There, it might make sense to retry exactly once.
I've thought so too and the RFC was implemented like this, but Jann came up with a scenario where a THP can cause the range including our to-be-installed guard pte to be populated even if the userspace is not trying to access that exact address, see here:
https://lore.kernel.org/all/CAG48ez3vqbqyWb4bLdpqSUnhwqGo2OQetecNhEGPdCGDr94...
So unless we can't *reliably* detect that userspace is really shooting itself in the foot and return a failure to install guard pte *only* in that case (which would be useful), and not horribly complicate everything to ensure that reliability and to avoid false positives due to races with THP's, then it's probably better to just retry as this version does.
It would be complicated, and I'd reallly like to avoid trying to detect this. It feels a bit whack-a-mole because maybe there's other scenarios we've not thought about that could be equally problematic?
Trying to explain to a user 'hey this is for installing guard pages but if there's a facing fault it'll fail and that could keep happening and then you'll have to zap and maybe in a loop' just... seems like a bloody awful interface?
Hope my example above made it clearer. This "retry forever until it works" use case doesn't quite make sense to me, but I might just be missing something important.
But again, I have to do more reading on the history of the current approach ... and it's fairly late here.
Yeah see the RFC thread linked above.
Right, but I don't think this is the only scenario that can happen, and I think, FWIW, yet again the fundamental point comes down to 'is it a problem?'
Because if looping like this isn't, then problem solved we can all high 5 and go home listening to the prodigy and full happiness.
If not then we can revisit.
And how could it be a problem? Surely only security or DoS potential. Hopefully Jann can give some positive feedback on that.
We could also, and I hate to say it, but... try to find some means of checking on reasonable forward progress in the loop if we had to or some other 'reasonable attempt'.
But let's see...