On Wed, Oct 23, 2024 at 01:36:10PM +0200, David Hildenbrand wrote:
On 23.10.24 13:31, Marco Elver wrote:
On Wed, 23 Oct 2024 at 11:29, David Hildenbrand <david@redhat.com> wrote:
On 23.10.24 11:18, Lorenzo Stoakes wrote:
On Wed, Oct 23, 2024 at 11:13:47AM +0200, David Hildenbrand wrote:
On 23.10.24 11:06, Vlastimil Babka wrote:
On 10/23/24 10:56, Dmitry Vyukov wrote:
> > > > Overall while I sympathise with this, it feels dangerous and a pretty major
> > > > change, because there'll be something somewhere that will break because it
> > > > expects faults to be swallowed that we no longer do swallow.
> > > >
> > > > So I'd say it'd be something we should defer, but of course it's a highly
> > > > user-facing change so how easy that would be I don't know.
> > > >
> > > > But I definitely don't think a 'introduce the ability to do cheap PROT_NONE
> > > > guards' series is the place to also fundamentally change how user access
> > > > page faults are handled within the kernel :)
>
> Will delivering signals on kernel access be a backwards compatible
> change? Or will we need a different API? MADV_GUARD_POISON_KERNEL?
> It's just somewhat painful to detect/update all userspace if we add
> this feature in future. Can we say signal delivery on kernel accesses
> is unspecified?
Would adding signal delivery to guard PTEs only help the ASAN etc. use case enough? Wouldn't it instead be possible to add some prctl to opt the whole ASANized process in to having all existing segfaults delivered as signals instead of -EFAULT?
Not sure if it is an "instead", you might have to deliver the signal in addition to letting the syscall fail (not that I would be an expert on signal delivery :D ).
prctl sounds better, or some way to configure the behavior on VMA ranges; otherwise we would need yet another marker, which is not the end of the world but would make it slightly more confusing.
Yeah prctl() sounds sensible, and since we are explicitly adding a marker for guard pages here we can do this as a follow up too without breaking any userland expectations, i.e. 'new feature to make guard pages signal' is not going to contradict the default behaviour.
So all makes sense to me, but I do think best as a follow up! :)
Yeah, fully agreed. And my gut feeling is that it might not be that easy ... :)
In the end, what we want is *some* notification that a guard PTE was accessed. Likely the notification need not necessarily be completely synchronous (although that would be ideal) and it need not be a signal.
Maybe having a different way to obtain that information from user space would work.
For bug detection tools (like GWP-ASan [1]) it's essential to have useful stack traces. As such, having this signal be synchronous would be more useful. I don't see how one could get a useful stack trace (or other information like what's stashed away in ucontext like CPU registers) if this were asynchronous.
Yes, I know. But it would be better than not getting *any* notification except for some syscalls simply failing with -EFAULT, and not having an idea which address was even accessed.
Maybe the signal injection is easier than I think, but I somehow doubt it ...
Yeah I'm afraid I don't think this series is a place where I can fundamentally change how something so sensitive works in the kernel.
It's especially sensitive because this is a uAPI change, and a wrong decision here could result in guard pages being broken out of the gate. I really don't want to risk that.
-- Cheers,
David / dhildenb