The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless.
The label .Lpartial_fixup@ is jumped to on page fault. Currently it masks the remaining count of bytes (a2) with STORMASK, meaning that the least significant 2 (32bit) or 3 (64bit) bits of the remaining count are always clear. Secondly, .Lpartial_fixup@ expects t1 to contain the end address of the copy. This is set up by the initial block: PTR_ADDU t1, a0 /* end address */ However, the .Lmemset_partial@ block then reuses register t1 to calculate a jump through a block of word copies. This leaves it no longer containing the end address of the copy operation if a page fault occurs, and the remaining bytes calculation is incorrect.
Fix these issues by removing the and of a2 with STORMASK, and replace t1 with register t2 in the .Lmemset_partial@ block.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Matt Redfearn matt.redfearn@mips.com ---
arch/mips/lib/memset.S | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S index 90bcdf1224ee..3257dca58cad 100644 --- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -161,19 +161,19 @@
.Lmemset_partial@: R10KCBARRIER(0(ra)) - PTR_LA t1, 2f /* where to start */ + PTR_LA t2, 2f /* where to start */ #ifdef CONFIG_CPU_MICROMIPS LONG_SRL t7, t0, 1 #endif #if LONGSIZE == 4 - PTR_SUBU t1, FILLPTRG + PTR_SUBU t2, FILLPTRG #else .set noat LONG_SRL AT, FILLPTRG, 1 - PTR_SUBU t1, AT + PTR_SUBU t2, AT .set at #endif - jr t1 + jr t2 PTR_ADDU a0, t0 /* dest ptr */
.set push @@ -250,7 +250,6 @@
.Lpartial_fixup@: PTR_L t0, TI_TASK($28) - andi a2, STORMASK LONG_L t0, THREAD_BUADDR(t0) LONG_ADDU a2, t1 jr ra
On Thu, Mar 29, 2018 at 10:28:24AM +0100, Matt Redfearn wrote:
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless.
The label .Lpartial_fixup@ is jumped to on page fault. Currently it masks the remaining count of bytes (a2) with STORMASK, meaning that the least significant 2 (32bit) or 3 (64bit) bits of the remaining count are always clear.
Are you sure about that. It seems to do that *to ensure those bits are set correctly*...
Secondly, .Lpartial_fixup@ expects t1 to contain the end address of the copy. This is set up by the initial block: PTR_ADDU t1, a0 /* end address */ However, the .Lmemset_partial@ block then reuses register t1 to calculate a jump through a block of word copies. This leaves it no longer containing the end address of the copy operation if a page fault occurs, and the remaining bytes calculation is incorrect.
Fix these issues by removing the and of a2 with STORMASK, and replace t1 with register t2 in the .Lmemset_partial@ block.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Matt Redfearn matt.redfearn@mips.com
arch/mips/lib/memset.S | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S index 90bcdf1224ee..3257dca58cad 100644 --- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -161,19 +161,19 @@ .Lmemset_partial@: R10KCBARRIER(0(ra))
- PTR_LA t1, 2f /* where to start */
- PTR_LA t2, 2f /* where to start */
#ifdef CONFIG_CPU_MICROMIPS LONG_SRL t7, t0, 1
Hmm, on microMIPS t7 isn't on the clobber list for __bzero, and nor is t8...
#endif #if LONGSIZE == 4
- PTR_SUBU t1, FILLPTRG
- PTR_SUBU t2, FILLPTRG
#else .set noat LONG_SRL AT, FILLPTRG, 1
- PTR_SUBU t1, AT
- PTR_SUBU t2, AT .set at
#endif
- jr t1
- jr t2 PTR_ADDU a0, t0 /* dest ptr */
^^^ note this...
.set push @@ -250,7 +250,6 @@ .Lpartial_fixup@: PTR_L t0, TI_TASK($28)
- andi a2, STORMASK
... this isn't right.
If I read correctly, t1 (after the above change stops clobbering it) is the end of the full 64-byte blocks, i.e. the start address of the final partial block.
The .Lfwd_fixup calculation (for full blocks) appears to be:
a2 = ((len & 0x3f) + start_of_partial) - badvaddr
which is spot on. (len & 0x3f) is the partial block and remaining bytes that haven't been set yet, add start_of_partial to get end of the full range, subtract bad address to find how much didn't copy.
The calculation for .Lpartial_fixup however appears to (currently) do:
a2 = ((len & STORMASK) + start_of_partial) - badvaddr
Which might make sense if start_of_partial (t1) was replaced with end_of_partial, which does seem to be calculated as noted above, and put in a0 ready for the final few bytes to be set.
LONG_L t0, THREAD_BUADDR(t0) LONG_ADDU a2, t1
^^ So I think either it needs to just s/t1/a0/ here and not bother preserving t1 above (smaller change and probably the original intent), or preserve t1 and mask 0x3f instead of STORMASK like .Lfwd_fixup does (which would work but seems needlessly complicated to me).
Does that make any sense or have I misunderstood some subtlety?
Cheers James
jr ra
2.7.4
Hi James,
On 16/04/18 23:13, James Hogan wrote:
On Thu, Mar 29, 2018 at 10:28:24AM +0100, Matt Redfearn wrote:
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless.
The label .Lpartial_fixup@ is jumped to on page fault. Currently it masks the remaining count of bytes (a2) with STORMASK, meaning that the least significant 2 (32bit) or 3 (64bit) bits of the remaining count are always clear.
Are you sure about that. It seems to do that *to ensure those bits are set correctly*...
Secondly, .Lpartial_fixup@ expects t1 to contain the end address of the copy. This is set up by the initial block: PTR_ADDU t1, a0 /* end address */ However, the .Lmemset_partial@ block then reuses register t1 to calculate a jump through a block of word copies. This leaves it no longer containing the end address of the copy operation if a page fault occurs, and the remaining bytes calculation is incorrect.
Fix these issues by removing the and of a2 with STORMASK, and replace t1 with register t2 in the .Lmemset_partial@ block.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Matt Redfearn matt.redfearn@mips.com
arch/mips/lib/memset.S | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S index 90bcdf1224ee..3257dca58cad 100644 --- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -161,19 +161,19 @@ .Lmemset_partial@: R10KCBARRIER(0(ra))
- PTR_LA t1, 2f /* where to start */
- PTR_LA t2, 2f /* where to start */ #ifdef CONFIG_CPU_MICROMIPS LONG_SRL t7, t0, 1
Hmm, on microMIPS t7 isn't on the clobber list for __bzero, and nor is t8...
#endif #if LONGSIZE == 4
- PTR_SUBU t1, FILLPTRG
- PTR_SUBU t2, FILLPTRG #else .set noat LONG_SRL AT, FILLPTRG, 1
- PTR_SUBU t1, AT
- PTR_SUBU t2, AT .set at #endif
- jr t1
- jr t2 PTR_ADDU a0, t0 /* dest ptr */
^^^ note this...
.set push @@ -250,7 +250,6 @@ .Lpartial_fixup@: PTR_L t0, TI_TASK($28)
- andi a2, STORMASK
... this isn't right.
If I read correctly, t1 (after the above change stops clobbering it) is the end of the full 64-byte blocks, i.e. the start address of the final partial block.
The .Lfwd_fixup calculation (for full blocks) appears to be:
a2 = ((len & 0x3f) + start_of_partial) - badvaddr
which is spot on. (len & 0x3f) is the partial block and remaining bytes that haven't been set yet, add start_of_partial to get end of the full range, subtract bad address to find how much didn't copy.
The calculation for .Lpartial_fixup however appears to (currently) do:
a2 = ((len & STORMASK) + start_of_partial) - badvaddr
Which might make sense if start_of_partial (t1) was replaced with end_of_partial, which does seem to be calculated as noted above, and put in a0 ready for the final few bytes to be set.
LONG_L t0, THREAD_BUADDR(t0) LONG_ADDU a2, t1
^^ So I think either it needs to just s/t1/a0/ here and not bother preserving t1 above (smaller change and probably the original intent), or preserve t1 and mask 0x3f instead of STORMASK like .Lfwd_fixup does (which would work but seems needlessly complicated to me).
Does that make any sense or have I misunderstood some subtlety?
Thanks for taking the time to work this through - you're right, changing t1 to a0 in the fault handler does give the right result and is much less invasive. Updated patch incoming :-)
Thanks, Matt
Cheers James
jr ra
2.7.4
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless.
The label .Lpartial_fixup@ is jumped to on page fault. In order to work out how many bytes failed to copy, the exception handler should find how many bytes left in the partial block (andi a2, STORMASK), add that to the partial block end address (a2), and subtract the faulting address to get the remainder. Currently it incorrectly subtracts the partial block start address (t1), which has additionally has been clobbered to generate a jump target in memset_partial. Fix this by adding the block end address instead.
Since this code is non-trivial to read, add comments to describe the fault handling.
This issue was found with the following test code: int j, k; for (j = 0; j < 512; j++) { if ((k = clear_user(NULL, j)) != j) { pr_err("clear_user (NULL %d) returned %d\n", j, k); } } Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64).
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Suggested-by: James Hogan jhogan@kernel.org Signed-off-by: Matt Redfearn matt.redfearn@mips.com
---
Changes in v2: - Use James Hogan's suggestion of replacing t1 with a0 to get the correct remainder count. - Add comments to .Lpartial_fixup to aid those who next try to deciper this code.
arch/mips/lib/memset.S | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S index 90bcdf1224ee..fa3bec269331 100644 --- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -250,11 +250,11 @@
.Lpartial_fixup@: PTR_L t0, TI_TASK($28) - andi a2, STORMASK - LONG_L t0, THREAD_BUADDR(t0) - LONG_ADDU a2, t1 + andi a2, STORMASK /* #Bytes beyond partial block */ + LONG_L t0, THREAD_BUADDR(t0) /* Get faulting address */ + LONG_ADDU a2, a0 /* Add end address of partial block */ jr ra - LONG_SUBU a2, t0 + LONG_SUBU a2, t0 /* a2 = partial_end + #bytes - fault */
.Llast_fixup@: jr ra
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless.
The label .Lpartial_fixup@ is jumped to on page fault. In order to work out how many bytes failed to copy, the exception handler should find how many bytes left in the partial block (andi a2, STORMASK), add that to the partial block end address (a2), and subtract the faulting address to get the remainder. Currently it incorrectly subtracts the partial block start address (t1), which has additionally has been clobbered to generate a jump target in memset_partial. Fix this by adding the block end address instead.
This issue was found with the following test code: int j, k; for (j = 0; j < 512; j++) { if ((k = clear_user(NULL, j)) != j) { pr_err("clear_user (NULL %d) returned %d\n", j, k); } } Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64).
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Suggested-by: James Hogan jhogan@kernel.org Signed-off-by: Matt Redfearn matt.redfearn@mips.com
---
Changes in v3: - Just fix the issue at hand
Changes in v2: - Use James Hogan's suggestion of replacing t1 with a0 to get the correct remainder count.
arch/mips/lib/memset.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S index 90bcdf1224ee..184819c1d5c8 100644 --- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -252,7 +252,7 @@ PTR_L t0, TI_TASK($28) andi a2, STORMASK LONG_L t0, THREAD_BUADDR(t0) - LONG_ADDU a2, t1 + LONG_ADDU a2, a0 jr ra LONG_SUBU a2, t0
On Tue, Apr 17, 2018 at 03:52:21PM +0100, Matt Redfearn wrote:
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless.
The label .Lpartial_fixup@ is jumped to on page fault. In order to work out how many bytes failed to copy, the exception handler should find how many bytes left in the partial block (andi a2, STORMASK), add that to the partial block end address (a2), and subtract the faulting address to get the remainder. Currently it incorrectly subtracts the partial block start address (t1), which has additionally has been clobbered to generate a jump target in memset_partial. Fix this by adding the block end address instead.
This issue was found with the following test code: int j, k; for (j = 0; j < 512; j++) { if ((k = clear_user(NULL, j)) != j) { pr_err("clear_user (NULL %d) returned %d\n", j, k); } } Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64).
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Suggested-by: James Hogan jhogan@kernel.org Signed-off-by: Matt Redfearn matt.redfearn@mips.com
Applied, thanks James
linux-stable-mirror@lists.linaro.org