On Mon, 21 Mar 2011, Tixy wrote:
Open Issue
32-bit Thumb breakpoints may straddle two memory words, which means that when we set or clear them there is a window of opportunity where another CPU may see only half of the new instruction and execute invalid code. To prevent this I've used stop_machine() to get all CPUs to synchronously modify the instruction and update their I-caches. To my thinking, something like this would also be needed so that other CPUs see the new instruction; otherwise they could go on executing the old one from their local I-cache indefinitely.
The problem with using stop_machine() is that the breakpoint setting code is called from enable_kprobe(), which holds the text_mutex and carries this comment:
since [the breakpoint setting code] doesn't use stop_machine(), this doesn't cause deadlock on text_mutex. So, we don't need get_online_cpus()
Now that I am using stop_machine(), I need to understand what the consequences and alternatives are.
Why not always use a 16-bit Thumb breakpoint instruction, even in place of a 32-bit Thumb instruction? That way you sidestep all the issues with atomically updating an instruction across two words. The instruction to emulate might still be 32-bit, in which case pc would be advanced by 4 rather than 2.
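To make the idea concrete, here is a minimal sketch written as a user-space simulation, not kernel code. The struct and function names are hypothetical, and the breakpoint value is assumed to be the architectural 16-bit BKPT #0 encoding (0xbe00); a real implementation may well prefer a different 16-bit encoding. Only the first halfword of the probed instruction is ever written, so the store is a single aligned halfword whether the original instruction is 16 or 32 bits wide. The saved halfword is then enough to tell how far pc has to move after emulation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16-bit Thumb BKPT #0. A real kprobes implementation may
 * choose a different 16-bit encoding (e.g. an undefined instruction). */
#define THUMB16_BREAKPOINT 0xbe00u

/* A first halfword whose bits [15:11] are 0b11101, 0b11110 or 0b11111
 * starts a 32-bit Thumb instruction; anything else is 16 bits wide. */
static bool thumb_is_wide(uint16_t first_hw)
{
	return (first_hw & 0xf800u) >= 0xe800u;
}

/* Hypothetical bookkeeping for one probe. */
struct thumb_probe {
	size_t   index;     /* halfword index of the probed instruction */
	uint16_t saved_hw;  /* original first halfword */
};

/* Arm: save and replace only the first halfword. */
static void thumb_probe_arm(uint16_t *text, size_t index,
			    struct thumb_probe *p)
{
	p->index = index;
	p->saved_hw = text[index];
	text[index] = THUMB16_BREAKPOINT;  /* one aligned halfword store */
}

/* Disarm: restore the first halfword; the second was never touched. */
static void thumb_probe_disarm(uint16_t *text, const struct thumb_probe *p)
{
	text[p->index] = p->saved_hw;
}

/* Bytes to add to pc once the original instruction has been emulated. */
static unsigned thumb_probe_pc_advance(const struct thumb_probe *p)
{
	return thumb_is_wide(p->saved_hw) ? 4u : 2u;
}
```

For example, arming a probe on a BL (halfwords 0xf000, 0xf800) changes only the 0xf000 halfword and gives a pc advance of 4, while arming one on BX LR (0x4770) gives 2. The sketch does not cover I-cache maintenance on the other CPUs, which is a separate question from the atomicity of the store.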
Nicolas