On May 4, 2019, at 11:59 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
On Fri, May 3, 2019 at 10:08 PM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> I'll look at it tomorrow, but I think this actually makes unnecessary changes.
>> In particular, I think we could keep the existing entry code almost unchanged with this whole approach.
>
> So here's what I *think* should work. Note that I also removed your test-case code, because it really didn't have a chance in hell of working. Doing that
>
>     int3_emulate_call(regs, (unsigned long)&int3_magic);
>
> inside of int3_exception_notify() could not possibly be valid, since int3_emulate_call() returns the new pt_regs that need to be used, and throwing it away is clearly wrong.
> So you can't use a register_die_notifier() to try to intercept the 'int3' error and then do it manually, it needs to be done by the ftrace_int3_handler() code that actually returns the new regs, and where do_kernel_int3() will then return it to the low-level handler.
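
To make the point about the discarded return value concrete, here is a small userspace C sketch. It is not kernel code and not the patch under discussion: struct toy_regs, toy_emulate_call(), toy_notifier(), toy_handler() and toy_resume_ip() are hypothetical names invented for illustration, and the layout (the new frame sitting one slot below the old one) is an assumption chosen only to show why the caller must use the returned pointer.

```c
/* Hypothetical, heavily simplified stand-in for a saved register frame. */
struct toy_regs {
    unsigned long ip;   /* where execution resumes */
    unsigned long ret;  /* emulated pushed return address (0 if none) */
};

/*
 * Emulate a call by building an updated frame in the slot just below
 * 'regs' and returning a pointer to it. Assumption: the caller guarantees
 * that the slot at regs - 1 exists and may be overwritten.
 * The original frame at 'regs' is left untouched.
 */
struct toy_regs *toy_emulate_call(struct toy_regs *regs, unsigned long target)
{
    struct toy_regs *new_regs = regs - 1;

    new_regs->ret = regs->ip;   /* remember where to return to */
    new_regs->ip = target;      /* resume at the call target */
    return new_regs;
}

/*
 * Notifier-style hook: it can only report a status code, so the new
 * frame pointer returned by toy_emulate_call() is thrown away.
 */
int toy_notifier(struct toy_regs *regs, unsigned long target)
{
    toy_emulate_call(regs, target);   /* return value discarded */
    return 1;                         /* claims "handled" */
}

/* Handler-style hook: passes the new frame back to its caller. */
struct toy_regs *toy_handler(struct toy_regs *regs, unsigned long target)
{
    return toy_emulate_call(regs, target);
}

/* Model of the low-level entry code: resume from the frame it is given. */
unsigned long toy_resume_ip(const struct toy_regs *regs)
{
    return regs->ip;
}
```

In this model, after toy_notifier() the entry code still only knows the old frame, so toy_resume_ip() yields the original ip and the emulated call is lost. Going through toy_handler() gives the entry code the new frame, and it resumes at the target.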
I hate register_die_notifier(), so I consider this a plus. I’ve occasionally considered removing the ability for the notifiers to skip normal processing, because, as it stands, figuring out what actually happens in the trap handlers is almost impossible.
It generally looks sane to me.
As an aside, is it even *possible* to get #BP from v8086 mode? On a quick SDM read, the INT3 instruction causes #GP if VM=1 and IOPL<3. And, if we allow vm86() to have IOPL=3, we should just remove that ability. It’s nuts.
(We should maybe consider a config option for iopl() that defaults off. We’ve supported ioperm() for a long, long time.)