On Thu, Nov 18, 2021 at 01:11:09PM +0100, Peter Zijlstra wrote:
On Thu, Nov 18, 2021 at 10:39:44AM +0100, Peter Zijlstra wrote:
On Thu, Nov 18, 2021 at 09:18:52AM +0100, Peter Zijlstra wrote:
On Thu, Nov 18, 2021 at 09:06:27AM +0100, Peter Zijlstra wrote:
On Wed, Nov 17, 2021 at 03:50:17PM -0800, Linus Torvalds wrote:
I really don't think the WCHAN code should use unwinders at all. It's too damn fragile, and it's too easily triggered from user space.
On x86, esp. with ORC, it pretty much has to. The thing is, the ORC unwinder has been very stable so far. I'm guessing there's some really stupid thing going on, like for example trying to unwind a freed stack.
I *just* managed to reproduce, so let me go have a poke.
Confirmed, with the below it no longer reproduces. Now, let me go undo that and fix the unwinder to not explode while trying to unwind nothing.
OK, so the bug is firmly with 5d1ceb3969b6 ("x86: Fix __get_wchan() for !STACKTRACE") which lost the try_get_task_stack() that stack_trace_*() does.
We can ofc trivially re-instate that, but I'm now running with the below which I suppose is a better fix, hmm?
(obv I still need to look at the other two unwinders)
I now have the below, the only thing missing is that there's a user_mode() call on stack-based regs. Now on x86_64 we can __get_kernel_nofault() regs->cs and call it a day, but on i386 we have to also fetch regs->flags.
Is this really the way to go?
Please no. Can we just add a check in unwind_start() to ensure the caller did try_get_task_stack()?