From: Oleg Nesterov oleg@redhat.com Date: Wed, 16 Apr 2014 17:29:46 +0200
On 04/16, David Miller wrote:
From: Oleg Nesterov oleg@redhat.com Date: Wed, 16 Apr 2014 17:06:46 +0200
Off-topic, I am just curious... can someone explain why flush_pfn_alias() or flush_icache_alias() can't race with itself? I have no idea what they do, but what if another thread calls the same function with the same CACHE_COLOUR() right after set_pte_ext?
PTE modifications are supposed to run with the page table lock held.
OK, but __access_remote_vm() doesn't take ptl?
And on arm copy_to_user_page()->flush_ptrace_access()->flush_pfn_alias() does this.
Well, for one thing, PTEs can't gain permissions except under mmap_sem, which __access_remote_vm() does hold.
But I see what you're saying, flush_pfn_alias() is doing something different. It's not making user mappings, but kernel ones in order to implement the cache flush.
On sparc64 we handle this situation by hand-loading the mappings into the TLB, doing the operation using the mappings, then flushing it out of the TLB, all with interrupts disabled.
Furthermore, in ARM's case, the code explicitly states that these mappings are not used on SMP. See the comment above the FLUSH_ALIAS_START definition in arch/arm/mm/mm.h.
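To make the "same CACHE_COLOUR()" concern concrete, here is a small userspace sketch of how a flush slot could be chosen from a user virtual address. It is not the kernel's code: the helper names (cache_colour, flush_slot, share_slot) are hypothetical, and the constants (4 KiB pages, a four-page aliasing window, a slot base of 0xffff4000) are assumptions chosen for illustration only.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdint.h>

/* Assumed values: 4 KiB pages and a four-page aliasing window. */
#define PAGE_SHIFT        12u
#define ALIAS_WINDOW      (4u << PAGE_SHIFT)   /* hypothetical stand-in for SHMLBA */
#define FLUSH_ALIAS_BASE  0xffff4000u          /* hypothetical stand-in for FLUSH_ALIAS_START */

/* Cache colour: which page inside the aliasing window the address falls in. */
static unsigned int cache_colour(uint32_t vaddr)
{
    return (vaddr & (ALIAS_WINDOW - 1u)) >> PAGE_SHIFT;
}

/* Kernel virtual address of the temporary flush slot for this colour. */
static uint32_t flush_slot(uint32_t vaddr)
{
    return FLUSH_ALIAS_BASE + ((uint32_t)cache_colour(vaddr) << PAGE_SHIFT);
}

/* Two user addresses contend for one slot exactly when their colours match. */
static int share_slot(uint32_t a, uint32_t b)
{
    return flush_slot(a) == flush_slot(b);
}
```

Under these assumptions, assert(share_slot(0x00012000u, 0x40056000u)) holds: unrelated addresses in different tasks can land on the same slot, which is why two callers overlapping in time would be a problem.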
On 04/16, David Miller wrote:
But I see what you're saying, flush_pfn_alias() is doing something different. It's not making user mappings, but kernel ones in order to implement the cache flush.
Yees, this is what I was able to understand, to some degree.
Furthermore, in ARM's case, the code explicitly states that these mappings are not used on SMP. See the comment above the FLUSH_ALIAS_START definition in arch/arm/mm/mm.h.
Ah, and this is what I missed, despite the fact the comment is close to set_top_pte().
Thanks!
Oleg.
On Wed, Apr 16, 2014 at 11:47:40AM -0400, David Miller wrote:
Furthermore, in ARM's case, the code explicitly states that these mappings are not used on SMP. See the comment above the FLUSH_ALIAS_START definition in arch/arm/mm/mm.h.
Yes, thankfully SMP on ARM requires non-aliasing data caches... and now you've got me wondering whether that stuff is safe on preempt UP...
I'm thinking that both flush_icache_alias() and flush_pfn_alias() want at least preemption disabled around each, so that we don't end up with two threads being preempted here.
Thankfully, there aren't many ARM CPUs with VIPT aliasing caches, which is probably why no one has noticed.
From: Russell King - ARM Linux linux@arm.linux.org.uk Date: Wed, 16 Apr 2014 21:22:43 +0100
I'm thinking that both flush_icache_alias() and flush_pfn_alias() want at least preemption disabled around each, so that we don't end up with two threads being preempted here.
Yes, you would need to disable preemption to keep another thread of control from potentially using the same flush slot.
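The race and the proposed fix can be illustrated with a deterministic userspace model. This is only a sketch under stated assumptions: every name below (slot_pfn, model_preempt_disable, flush_with_intruder, and so on) is hypothetical, the "slot" merely records which pfn it maps, and the preemption point is simulated by a branch rather than by a real scheduler.

```c
#include <assert.h>
#include <stdint.h>

#define NR_COLOURS 4u   /* assumed number of flush slots, one per colour */

/* Toy state: each per-colour slot records the pfn it currently maps. */
static uint32_t slot_pfn[NR_COLOURS];
static int preempt_count;   /* models preempt_disable() nesting depth */

static void model_preempt_disable(void) { preempt_count++; }
static void model_preempt_enable(void)  { preempt_count--; }
static int  model_preemptible(void)     { return preempt_count == 0; }

/* Step 1 of a flush: point the slot at our page (set_top_pte() analogue). */
static void map_slot(unsigned int colour, uint32_t pfn)
{
    slot_pfn[colour] = pfn;
}

/* Step 2: flush through the slot; returns the pfn that really got flushed. */
static uint32_t flush_via_slot(unsigned int colour)
{
    return slot_pfn[colour];
}

/*
 * Thread A flushes pfn_a. Between A's two steps, thread B tries to flush
 * pfn_b of the same colour; B only gets to run if A is preemptible there.
 * Returns the pfn that A actually flushed.
 */
static uint32_t flush_with_intruder(unsigned int colour, uint32_t pfn_a,
                                    uint32_t pfn_b, int protect)
{
    uint32_t flushed;

    if (protect)
        model_preempt_disable();
    map_slot(colour, pfn_a);
    if (model_preemptible()) {          /* scheduler may switch to B here */
        map_slot(colour, pfn_b);
        (void)flush_via_slot(colour);   /* B's flush completes correctly */
    }
    flushed = flush_via_slot(colour);   /* A resumes and flushes the slot */
    if (protect)
        model_preempt_enable();
    return flushed;
}
```

Without protection, flush_with_intruder(1, 0x100, 0x200, 0) returns 0x200: thread A flushes B's page and its own page is never flushed. With protect set, the same call returns 0x100, because B cannot run between A's mapping and A's flush.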
On 04/16/14 17:13, David Miller wrote:
Yes, you would need to disable preemption to keep another thread of control from potentially using the same flush slot.
Sorry for the delay in replying.
I guess the above potential problem is largely independent of the uprobes caching issue.
I spent a while reading up on ARM cache operations and MMFR3 register contents. I don't pretend to understand all the details but, based on what I do, it looks to me like Victor's v3 patch addresses all the issues that we think it needs to. I also see now why the dcache_flush_page() is needed rather than a call to the lower-level clean_dcache_line() function.
Victor, maybe you could remove the "#ifdef CONFIG_SMP"s from it and send it out as an official (non-RFC) uprobes patch? It would be really nice to get this into V3.15, if at all possible.
-dl
On 25 April 2014 13:16, David Long dave.long@linaro.org wrote:
Victor, maybe you could remove the "#ifdef CONFIG_SMP"s from it and send it out as an official (non-RFC) uprobes patch? It would be really nice to get this into V3.15, if at all possible.
I'll send it by the end of today (PST).
Thanks, Victor