Re: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions

7 May 2019


On Tue, 7 May 2019 14:50:26 +0000
David Laight David.Laight@ACULAB.COM wrote:
...
From: Steven Rostedt
...
Sent: 07 May 2019 14:14
On Tue, 7 May 2019 12:57:15 +0000
David Laight David.Laight@ACULAB.COM wrote:
...
The 'user' (ie the kernel code that needs to emulate the call) doesn't
write the data to the stack, just to some per-cpu location.
(Actually it could be on the stack at the other end of pt-regs.)
So you get to the 'register restore and iret' code with the stack unaltered.
It is then a SMOP to replace the %flags saved by the int3 with the %ip
saved by the int3, the %ip with the address of the function to call,
restore the flags (push and popf) and issue a ret.f to remove the %ip and %cs.
How would you handle NMIs doing the same thing? Yes, the NMI handlers
have breakpoints that will need to emulate calls as well.
...
(Actually you need to add 4 to the caller's %ip address to allow for the
difference between the size of the int3 (hopefully 0xcc, not 0xcd 0x3) and
the 5-byte call it stands in for.)
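
The adjustment above can be checked with a little arithmetic. What follows is a minimal sketch, assuming the patched site is a 5-byte `call rel32` whose first byte has been replaced by a 1-byte int3 (0xcc). The constant and function names are illustrative only, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed instruction sizes: 1-byte int3 (0xcc), 5-byte call rel32. */
enum { INT3_INSN_SIZE = 1, CALL_INSN_SIZE = 5 };

/* The "add 4" in the text is the difference between the two sizes. */
static_assert(CALL_INSN_SIZE - INT3_INSN_SIZE == 4, "adjustment must be 4");

/* %ip saved by the int3 trap: the address just past the 0xcc byte. */
static uintptr_t ip_saved_by_int3(uintptr_t call_site)
{
	return call_site + INT3_INSN_SIZE;
}

/* Return address that the emulated call has to leave on the stack. */
static uintptr_t emulated_return_address(uintptr_t saved_ip)
{
	return saved_ip + (CALL_INSN_SIZE - INT3_INSN_SIZE);
}
```

With a call site at 0x1000, the trap saves 0x1001 and the emulated call must return to 0x1005, the instruction after the original call.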
...
...
...
For 32bit 'the gap' happens naturally when building a 5 entry frame. Yes
it is possible to build a 5 entry frame on top of the old 3 entry one,
but why bother...
Presumably there is 'horrid' code to generate the gap in 64bit mode?
(less horrid than 32bit, but still horrid?)
Or does it copy the entire pt_regs into a local stack frame and use
that for the iret?
On x86_64, the gap is only done for int3 and nothing else, thus it is
much less horrid. That's because x86_64 has a sane pt_regs storage for
all exceptions.
Well, in particular, it always loads %sp as part of the iret.
So you can create a gap and the cpu will remove it for you.
In 64bit mode you could overwrite the %ss with the return address
to the caller, restore %eax and %flags, push the function address,
and use ret.n to jump to the function, removing the right amount
from the stack.
Actually that means you can do the following in both modes:
   if not emulated_call_address then pop %ax; iret else
   # assume kernel<->kernel return
   push emulated_call_address;
   push flags_saved_by_int3
   load %ax, return_address_from_iret
   add %ax,#4
   store %ax, first_stack_location_written_by_int3
   load %ax, value_saved_by_int3_entry
   popf
   ret.n
The ret.n discards everything from the %ax to the required return address.
So 'n' is the size of the int3 frame, so 12 for i386 and 40 for amd64.
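
The sequence above can be traced with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the stack is an array of machine words that grows down, the saved %ax sits directly below the int3 frame, and the return is kernel to kernel (so the i386 frame has 3 words and the amd64 frame has 5). `struct cpu`, `emulate_call_exit()` and `ret_n_bytes()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy CPU model: the stack grows down, sp is an index into mem[]. */
struct cpu {
	uintptr_t ip, ax, flags;
	size_t sp;
	uintptr_t mem[STACK_WORDS];
};

static void push(struct cpu *c, uintptr_t v) { c->mem[--c->sp] = v; }
static uintptr_t pop(struct cpu *c) { return c->mem[c->sp++]; }

/* Byte count for ret.n: 3 * 4 = 12 on i386, 5 * 8 = 40 on amd64. */
static size_t ret_n_bytes(size_t frame_words, size_t word_size)
{
	return frame_words * word_size;
}

/*
 * On entry mem[sp] holds the saved %ax and the int3 frame follows it:
 * ip, cs, flags (plus sp, ss when frame_words == 5).
 */
static void emulate_call_exit(struct cpu *c, size_t frame_words,
			      uintptr_t target)
{
	size_t base = c->sp;

	assert(frame_words >= 3);
	assert(base >= 2 && base + frame_words < STACK_WORDS);

	push(c, target);                    /* push emulated_call_address */
	push(c, c->mem[base + 3]);          /* push flags_saved_by_int3 */
	c->ax = c->mem[base + 1];           /* load %ax, return_address_from_iret */
	c->ax += 4;                         /* add %ax,#4 */
	c->mem[base + frame_words] = c->ax; /* store in first word written by int3 */
	c->ax = c->mem[base];               /* load %ax, value_saved_by_int3_entry */
	c->flags = pop(c);                  /* popf */
	c->ip = pop(c);                     /* ret.n: pop the target into %ip ... */
	c->sp += frame_words;               /* ... then discard n bytes of stack */
}
```

In this model the stack pointer ends up on the slot that was written first by the int3 (flags on i386, ss on amd64), which now holds the adjusted return address, so the called function sees an ordinary call frame.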
If the register restore (done just before this code) finished with
'add %sp, sizeof *pt_regs' then the emulated_call_address can be
loaded in %ax from the other end of pt_regs.
This all reminds me of fixing up the faults that happen in kernel
space when loading the user segment registers during 'return to user'.
This all sounds much more complex and fragile than the proposed
solution. Why would we do this over what is being proposed?
-- Steve
