Hi, Thorsten here, the Linux kernel's regression tracker.
I noticed a report about a linux-6.6.y regression in bugzilla.kernel.org
that appears to be caused by this commit from Dan applied by Greg:
15fffc6a5624b1 ("driver core: Fix uevent_show() vs driver detach race")
[v6.11-rc3, v6.10.5, v6.6.46, v6.1.105, v5.15.165, v5.10.224, v5.4.282,
v4.19.320]
The reporter did not check yet if mainline is affected; decided to
forward the report by mail nevertheless, as the maintainer for the
subsystem is also the maintainer for the stable tree. ;-)
To quote from https://bugzilla.kernel.org/show_bug.cgi?id=219244 :
> The symptoms of this bug are as follows:
>
> - After booting (to the graphical login screen) the mouse pointer
> would frozen and only after unplugging and plugging-in again the usb
> plug of the mouse would the mouse be working as expected.
> - If one would log in without fixing the mouse issue, the mouse
> pointer would still be frozen after login.
> - The usb keyboard usually is not affected even though plugged into
> the same usb-hub - thus logging in is possible.
> - The mouse pointer is also frozen if the usb connector is plugged
> into a different usb-port (different from the usb-hub)
> - The pointer is moveable via the inbuilt synaptics trackpad
>
>
> The kernel log shows almost the same messages (not sure if the
> differences mean anything in regards to this bug) for the initial
> recognizing the mouse (frozen mouse pointer) and the re-plugged-in mouse
> (and subsequently moveable mouse pointer):
>
> [kernel] [ 8.763158] usb 1-2.2.1.2: new low-speed USB device number 10 using xhci_hcd
> [kernel] [ 8.956028] usb 1-2.2.1.2: New USB device found, idVendor=045e, idProduct=00cb, bcdDevice= 1.04
> [kernel] [ 8.956036] usb 1-2.2.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> [kernel] [ 8.956039] usb 1-2.2.1.2: Product: Microsoft Basic Optical Mouse v2.0
> [kernel] [ 8.956041] usb 1-2.2.1.2: Manufacturer: Microsoft
> [kernel] [ 8.963554] input: Microsoft Microsoft Basic Optical Mouse v2.0 as /devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2.2/1-2.2.1/1-2.2.1.2/1-2.2.1.2:1.0/0003:045E:00CB.0002/input/input18
> [kernel] [ 8.964417] hid-generic 0003:045E:00CB.0002: input,hidraw1: USB HID v1.11 Mouse [Microsoft Microsoft Basic Optical Mouse v2.0 ] on usb-0000:00:14.0-2.2.1.2/input0
>
> [kernel] [ 31.258381] usb 1-2.2.1.2: USB disconnect, device number 10
> [kernel] [ 31.595051] usb 1-2.2.1.2: new low-speed USB device number 16 using xhci_hcd
> [kernel] [ 31.804002] usb 1-2.2.1.2: New USB device found, idVendor=045e, idProduct=00cb, bcdDevice= 1.04
> [kernel] [ 31.804010] usb 1-2.2.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> [kernel] [ 31.804013] usb 1-2.2.1.2: Product: Microsoft Basic Optical Mouse v2.0
> [kernel] [ 31.804016] usb 1-2.2.1.2: Manufacturer: Microsoft
> [kernel] [ 31.812933] input: Microsoft Microsoft Basic Optical Mouse v2.0 as /devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2.2/1-2.2.1/1-2.2.1.2/1-2.2.1.2:1.0/0003:045E:00CB.0004/input/input20
> [kernel] [ 31.814028] hid-generic 0003:045E:00CB.0004: input,hidraw1: USB HID v1.11 Mouse [Microsoft Microsoft Basic Optical Mouse v2.0 ] on usb-0000:00:14.0-2.2.1.2/input0
>
> Differences:
>
> ../0003:045E:00CB.0002/input/input18 vs ../0003:045E:00CB.0004/input/input20
>
> and
>
> hid-generic 0003:045E:00CB.0002 vs hid-generic 0003:045E:00CB.0004
>
>
> The connector / usb-port was not changed in this case!
>
>
> The symptoms of this bug have been present at one point in the
> recent
> past, but with kernel v6.6.45 (or maybe even some version before that)
> it was fine. But with 6.6.45 it seems to be definitely fine.
>
> But with v6.6.46 the symptoms returned. That's the reason I
> suspected
> the kernel to be the cause of this issue. So I did some bisecting -
> which wasn't easy because that bug would often times not appear if the
> system was simply rebooted into the test kernel.
> As the bug would definitely appear on the affected kernels (v6.6.46
> ff) after shutting down the system for the night and booting the next
> day, I resorted to simulating the over-night powering-off by shutting
> the system down, unplugging the power and pressing the power button to
> get rid of residual voltage. But even then a few times the bug would
> only appear if I repeated this procedure before booting the system again
> with the respective kernel.
>
> This is on a Thinkpad with Kaby Lake and integrated Intel graphics.
> Even though it is a laptop, it is used as a desktop device, and the
> internal battery is disconnected, the removable battery is removed as
> the system is plugged-in via the power cord at all times (when in use)!
> Also, the system has no power (except for the bios battery, of
> course)
> over night as the power outlet is switched off if the device is not in use.
>
> Not sure if this affects the issue - or how it does. But for
> successful bisecting I had to resort to the above procedure.
>
> Bisecting the issue (between the release commits of v6.6.45 and
> v6.6.46) resulted in this commit [1] being the probable culprit.
>
> I then tested kernel v6.6.49. It still produced the bug for me. So I
> reverted the changes of the assumed "bad commit" and re-compiled kernel
> v6.6.49. With this modified kernel the bug seems to be gone.
>
> Now, I assume the commit has a reason for being introduced, but
> maybe
> needs some adjusting in order to avoid this bug I'm experiencing on my
> system.
> Also, I can't say why the issue appeared in the past even without
> this
> commit being present, as I haven't bisected any kernel version before
> v6.6.45.
>
>
> [1]:
>
> 4d035c743c3e391728a6f81cbf0f7f9ca700cf62 is the first bad commit
> commit 4d035c743c3e391728a6f81cbf0f7f9ca700cf62
> Author: Dan Williams <dan.j.williams(a)intel.com>
> Date: Fri Jul 12 12:42:09 2024 -0700
>
> driver core: Fix uevent_show() vs driver detach race
>
> commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c upstream.
>
> uevent_show() wants to de-reference dev->driver->name. There is no clean
See the ticket for more details. Note, you have to use bugzilla to reach
the reporter, as I sadly[1] can not CCed them in mails like this.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
[1] because bugzilla.kernel.org tells users upon registration their
"email address will never be displayed to logged out users"
P.S.: let me use this mail to also add the report to the list of tracked
regressions to ensure it's doesn't fall through the cracks:
#regzbot introduced: 4d035c743c3e391728a6f81cbf0f7f9ca700cf62
#regzbot from: brmails+k
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=219244
#regzbot title: driver core: frozen usb mouse pointer at boot
#regzbot ignore-activity
The patch titled
Subject: mm: avoid unconditional one-tick sleep when swapcache_prepare fails
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-avoid-unconditional-one-tick-sleep-when-swapcache_prepare-fails.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Barry Song <v-songbaohua(a)oppo.com>
Subject: mm: avoid unconditional one-tick sleep when swapcache_prepare fails
Date: Fri, 27 Sep 2024 09:19:36 +1200
Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
introduced an unconditional one-tick sleep when `swapcache_prepare()`
fails, which has led to reports of UI stuttering on latency-sensitive
Android devices. To address this, we can use a waitqueue to wake up tasks
that fail `swapcache_prepare()` sooner, instead of always sleeping for a
full tick. While tasks may occasionally be woken by an unrelated
`do_swap_page()`, this method is preferable to two scenarios: rapid
re-entry into page faults, which can cause livelocks, and multiple
millisecond sleeps, which visibly degrade user experience.
Oven's testing shows that a single waitqueue resolves the UI stuttering
issue. If a 'thundering herd' problem becomes apparent later, a waitqueue
hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]` for page bit
locks can be introduced.
Link: https://lkml.kernel.org/r/20240926211936.75373-1-21cnbao@gmail.com
Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
Signed-off-by: Barry Song <v-songbaohua(a)oppo.com>
Reported-by: Oven Liyang <liyangouwen1(a)oppo.com>
Tested-by: Oven Liyang <liyangouwen1(a)oppo.com>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
--- a/mm/memory.c~mm-avoid-unconditional-one-tick-sleep-when-swapcache_prepare-fails
+++ a/mm/memory.c
@@ -4192,6 +4192,8 @@ static struct folio *alloc_swap_folio(st
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
+
/*
* We enter with non-exclusive mmap_lock (to exclude vma changes,
* but allow concurrent faults), and pte mapped but not yet locked.
@@ -4204,6 +4206,7 @@ vm_fault_t do_swap_page(struct vm_fault
{
struct vm_area_struct *vma = vmf->vma;
struct folio *swapcache, *folio = NULL;
+ DECLARE_WAITQUEUE(wait, current);
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
@@ -4302,7 +4305,9 @@ vm_fault_t do_swap_page(struct vm_fault
* Relax a bit to prevent rapid
* repeated page faults.
*/
+ add_wait_queue(&swapcache_wq, &wait);
schedule_timeout_uninterruptible(1);
+ remove_wait_queue(&swapcache_wq, &wait);
goto out_page;
}
need_clear_cache = true;
@@ -4609,8 +4614,10 @@ unlock:
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
/* Clear the swap cache pin for direct swapin after PTL unlock */
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry, nr_pages);
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
@@ -4625,8 +4632,10 @@ out_release:
folio_unlock(swapcache);
folio_put(swapcache);
}
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry, nr_pages);
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
_
Patches currently in -mm which might be from v-songbaohua(a)oppo.com are
mm-avoid-unconditional-one-tick-sleep-when-swapcache_prepare-fails.patch
The patch titled
Subject: selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-mm-fixed-incorrect-buffer-mirror-size-in-hmm2-double_map-test.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Donet Tom <donettom(a)linux.ibm.com>
Subject: selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test
Date: Fri, 27 Sep 2024 00:07:52 -0500
The hmm2 double_map test was failing due to an incorrect buffer->mirror
size. The buffer->mirror size was 6, while buffer->ptr size was 6 *
PAGE_SIZE. The test failed because the kernel's copy_to_user function was
attempting to copy a 6 * PAGE_SIZE buffer to buffer->mirror. Since the
size of buffer->mirror was incorrect, copy_to_user failed.
This patch corrects the buffer->mirror size to 6 * PAGE_SIZE.
Test Result without this patch
==============================
# RUN hmm2.hmm2_device_private.double_map ...
# hmm-tests.c:1680:double_map:Expected ret (-14) == 0 (0)
# double_map: Test terminated by assertion
# FAIL hmm2.hmm2_device_private.double_map
not ok 53 hmm2.hmm2_device_private.double_map
Test Result with this patch
===========================
# RUN hmm2.hmm2_device_private.double_map ...
# OK hmm2.hmm2_device_private.double_map
ok 53 hmm2.hmm2_device_private.double_map
Link: https://lkml.kernel.org/r/20240927050752.51066-1-donettom@linux.ibm.com
Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: J��r��me Glisse <jglisse(a)redhat.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Przemek Kitszel <przemyslaw.kitszel(a)intel.com>
Cc: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/hmm-tests.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/hmm-tests.c~selftests-mm-fixed-incorrect-buffer-mirror-size-in-hmm2-double_map-test
+++ a/tools/testing/selftests/mm/hmm-tests.c
@@ -1657,7 +1657,7 @@ TEST_F(hmm2, double_map)
buffer->fd = -1;
buffer->size = size;
- buffer->mirror = malloc(npages);
+ buffer->mirror = malloc(size);
ASSERT_NE(buffer->mirror, NULL);
/* Reserve a range of addresses. */
_
Patches currently in -mm which might be from donettom(a)linux.ibm.com are
selftests-mm-fixed-incorrect-buffer-mirror-size-in-hmm2-double_map-test.patch
The patch titled
Subject: device-dax: correct pgoff align in dax_set_mapping()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
device-dax-correct-pgoff-align-in-dax_set_mapping.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Kun(llfl)" <llfl(a)linux.alibaba.com>
Subject: device-dax: correct pgoff align in dax_set_mapping()
Date: Fri, 27 Sep 2024 15:45:09 +0800
pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise,
vmf->address not aligned to fault_size will be aligned to the next
alignment, that can result in memory failure getting the wrong address.
Link: https://lkml.kernel.org/r/23c02a03e8d666fef11bbe13e85c69c8b4ca0624.17274216…
Fixes: b9b5777f09be ("device-dax: use ALIGN() for determining pgoff")
Signed-off-by: Kun(llfl) <llfl(a)linux.alibaba.com>
Tested-by: JianXiong Zhao <zhaojianxiong.zjx(a)alibaba-inc.com>
Cc: Joao Martins <joao.m.martins(a)oracle.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/dax/device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/dax/device.c~device-dax-correct-pgoff-align-in-dax_set_mapping
+++ a/drivers/dax/device.c
@@ -86,7 +86,7 @@ static void dax_set_mapping(struct vm_fa
nr_pages = 1;
pgoff = linear_page_index(vmf->vma,
- ALIGN(vmf->address, fault_size));
+ ALIGN_DOWN(vmf->address, fault_size));
for (i = 0; i < nr_pages; i++) {
struct page *page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
_
Patches currently in -mm which might be from llfl(a)linux.alibaba.com are
device-dax-correct-pgoff-align-in-dax_set_mapping.patch
arch-s390.h uses types from std.h, but does not include it.
Depending on the inclusion order the compilation can fail.
Include std.h explicitly to avoid these errors.
Fixes: 404fa87c0eaf ("tools/nolibc: s390: provide custom implementation for sys_fork")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
tools/include/nolibc/arch-s390.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/include/nolibc/arch-s390.h b/tools/include/nolibc/arch-s390.h
index 2ec13d8b9a2db80efa8d6cbbbd01bfa3d0059de2..f9ab83a219b8a2d5e53b0b303d8bf0bf78280d5f 100644
--- a/tools/include/nolibc/arch-s390.h
+++ b/tools/include/nolibc/arch-s390.h
@@ -10,6 +10,7 @@
#include "compiler.h"
#include "crt.h"
+#include "std.h"
/* Syscalls for s390:
* - registers are 64-bit
---
base-commit: e477dba5442c0af7acb9e8bbbbde1108a37ed39c
change-id: 20240927-nolibc-s390-std-h-cbb13f70fa73
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Hello,
I am unsure if this is the 'correct' behavior for ptrace.
If you run ptrace_traceme followed by ptrace_attach, then the process
attaches its own parent to itself and cannot be attached by another
thing. The attach call errors out, but GDB does report something
attached to it. I am unsure if Bash does this itself perhaps.
It's a bit hard for me to reason about because my debugging skills are
bad and trying 'strace' with bash -c ./thing, or just on the thing
itself gives -1 on both ptrace calls as strace attaches to it.
similarly with GDB. Unsure how to debug this.
https://gist.github.com/x64-elf-sh42/83393e319ad8280b8704fbe3f499e381
to compile simply:
gcc test.c -o thingy
This code works on my machine which is:
Linux 6.1.0-25-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.106-3
(2024-08-26) x86_64 GNU/Linux
GDB -p on the pid reports that another pid is attached and the
operation is illegal. That other pid is the bash shell that i spawned
this binary from (code in gist).
it's useful for anti-debugging, but it seems odd it will attach it's
parent to the process since that's not actually doing the attach call.
If anything i'd expect the pid attached to itself, rather than the
parent getting attached.
The first call to ptrace (traceme) gets return value 0. The second
call (attach) gets return value -1. That does seem correct, but yet
there is something 'attached' when i try to use GDB.
If I only do the traceme call, it does not get attached by Bash, so it
looks totally like the 'attach' call has a side effect of attaching
the parent, rather than just only failing.
Kind regards,
~sh42
Hello everyone,
I wondered a few times in the last months how we could properly track
regressions for the stable series, as I'm always a bit unsure how proper
use of regzbot would look for these cases.
For issues that affect mainline usually just the commit itself is
specified with the relevant "#regzbot introduced: ..." invocation, but
how would one do this if the stable trees are also affected? Do we need
to specify "#regzbot introduced" for each backported version of the
upstream commit?!
IMO this is especially interesting because sometimes getting a patchset
to fix the older trees can take quite some time since possibly backport
dependencies have to be found or even custom patches need to be written,
as it i.e. was the case [here][0]. This led to fixes for the linux-6.1.y
branch landing a bit later which would be good to track with regzbot so
it does not fall through the cracks.
Maybe there already is a regzbot facility to do this that I have just
missed, but I thought if there is people on the lists will know :)
Cheers,
Chris
P.S.: I hope everybody had a great time at the conferences! I hope to
also make it next year ..
[0]: https://lore.kernel.org/all/66bcb6f68172f_adbf529471@willemb.c.googlers.com…