Hi folks, here is a series with some fixes for dummy_hcd. First of all, the reasoning behind it.
Syzkaller report [0] shows a hung task on uevent_show, and despite it was fixed with a patch on drivers/base (a race between drivers shutdown and uevent_show), another issue remains: a problem with Realtek emulated wifi device [1]. While working the fix ([1]), we noticed that if it is applied to recent kernels, all fine. But in v6.1.y and v6.6.y for example, it didn't solve entirely the issue, and after some debugging, it was narrowed to dummy_hcd transfer rates being waaay slower in such stable versions.
The reason of such slowness is well-described in the first 2 patches of this backport, but the thing is that these patches introduced subtle issues as well, fixed in the other 2 patches. Hence, I decided to backport all of them for the 2 latest LTS kernels.
Maybe this is not a good idea - I don't see a strong con, but who's better to judge the benefits vs the risks than the patch authors, reviewers, and the USB maintainer?! So, I've CCed Alan, Andrey, Greg and Marcello here, and I thank you all in advance for reviews on this. And my apologies for bothering you with the emails, I hope this is a simple "OK, makes sense" or "Nah, doesn't worth it" situation =)
Cheers,
Guilherme
[0] https://syzkaller.appspot.com/bug?extid=edd9fe0d3a65b14588d5 [1] https://lore.kernel.org/r/20241101193412.1390391-1-gpiccoli@igalia.com/
Alan Stern (1): USB: gadget: dummy-hcd: Fix "task hung" problem
Andrey Konovalov (1): usb: gadget: dummy_hcd: execute hrtimer callback in softirq context
Marcello Sylvester Bauer (2): usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler usb: gadget: dummy_hcd: Set transfer interval to 1 microframe
drivers/usb/gadget/udc/dummy_hcd.c | 57 ++++++++++++++++++++---------- 1 file changed, 38 insertions(+), 19 deletions(-)
From: Marcello Sylvester Bauer sylv@sylv.io
commit a7f3813e589fd8e2834720829a47b5eb914a9afe upstream.
The dummy_hcd transfer scheduler assumes that the internal kernel timer frequency is set to 1000Hz to give a polling interval of 1ms. Reducing the timer frequency will result in an anti-proportional reduction in transfer performance. Switch to a hrtimer to decouple this association.
Signed-off-by: Marcello Sylvester Bauer marcello.bauer@9elements.com Signed-off-by: Marcello Sylvester Bauer sylv@sylv.io Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/57a1c2180ff74661600e010c234d1dbaba1d0d46.171284396... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Guilherme G. Piccoli gpiccoli@igalia.com --- drivers/usb/gadget/udc/dummy_hcd.c | 35 +++++++++++++++++------------- 1 file changed, 20 insertions(+), 15 deletions(-)
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c index 0953e1b5c030..dab559d8ee8c 100644 --- a/drivers/usb/gadget/udc/dummy_hcd.c +++ b/drivers/usb/gadget/udc/dummy_hcd.c @@ -30,7 +30,7 @@ #include <linux/slab.h> #include <linux/errno.h> #include <linux/init.h> -#include <linux/timer.h> +#include <linux/hrtimer.h> #include <linux/list.h> #include <linux/interrupt.h> #include <linux/platform_device.h> @@ -240,7 +240,7 @@ enum dummy_rh_state { struct dummy_hcd { struct dummy *dum; enum dummy_rh_state rh_state; - struct timer_list timer; + struct hrtimer timer; u32 port_status; u32 old_status; unsigned long re_timeout; @@ -1301,8 +1301,8 @@ static int dummy_urb_enqueue( urb->error_count = 1; /* mark as a new urb */
/* kick the scheduler, it'll do the rest */ - if (!timer_pending(&dum_hcd->timer)) - mod_timer(&dum_hcd->timer, jiffies + 1); + if (!hrtimer_active(&dum_hcd->timer)) + hrtimer_start(&dum_hcd->timer, ms_to_ktime(1), HRTIMER_MODE_REL);
done: spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); @@ -1323,7 +1323,7 @@ static int dummy_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) rc = usb_hcd_check_unlink_urb(hcd, urb, status); if (!rc && dum_hcd->rh_state != DUMMY_RH_RUNNING && !list_empty(&dum_hcd->urbp_list)) - mod_timer(&dum_hcd->timer, jiffies); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL);
spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); return rc; @@ -1777,7 +1777,7 @@ static int handle_control_request(struct dummy_hcd *dum_hcd, struct urb *urb, * drivers except that the callbacks are invoked from soft interrupt * context. */ -static void dummy_timer(struct timer_list *t) +static enum hrtimer_restart dummy_timer(struct hrtimer *t) { struct dummy_hcd *dum_hcd = from_timer(dum_hcd, t, timer); struct dummy *dum = dum_hcd->dum; @@ -1808,8 +1808,6 @@ static void dummy_timer(struct timer_list *t) break; }
- /* FIXME if HZ != 1000 this will probably misbehave ... */ - /* look at each urb queued by the host side driver */ spin_lock_irqsave(&dum->lock, flags);
@@ -1817,7 +1815,7 @@ static void dummy_timer(struct timer_list *t) dev_err(dummy_dev(dum_hcd), "timer fired with no URBs pending?\n"); spin_unlock_irqrestore(&dum->lock, flags); - return; + return HRTIMER_NORESTART; } dum_hcd->next_frame_urbp = NULL;
@@ -1995,10 +1993,12 @@ static void dummy_timer(struct timer_list *t) dum_hcd->udev = NULL; } else if (dum_hcd->rh_state == DUMMY_RH_RUNNING) { /* want a 1 msec delay here */ - mod_timer(&dum_hcd->timer, jiffies + msecs_to_jiffies(1)); + hrtimer_start(&dum_hcd->timer, ms_to_ktime(1), HRTIMER_MODE_REL); }
spin_unlock_irqrestore(&dum->lock, flags); + + return HRTIMER_NORESTART; }
/*-------------------------------------------------------------------------*/ @@ -2387,7 +2387,7 @@ static int dummy_bus_resume(struct usb_hcd *hcd) dum_hcd->rh_state = DUMMY_RH_RUNNING; set_link_state(dum_hcd); if (!list_empty(&dum_hcd->urbp_list)) - mod_timer(&dum_hcd->timer, jiffies); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL); hcd->state = HC_STATE_RUNNING; } spin_unlock_irq(&dum_hcd->dum->lock); @@ -2465,7 +2465,8 @@ static DEVICE_ATTR_RO(urbs);
static int dummy_start_ss(struct dummy_hcd *dum_hcd) { - timer_setup(&dum_hcd->timer, dummy_timer, 0); + hrtimer_init(&dum_hcd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + dum_hcd->timer.function = dummy_timer; dum_hcd->rh_state = DUMMY_RH_RUNNING; dum_hcd->stream_en_ep = 0; INIT_LIST_HEAD(&dum_hcd->urbp_list); @@ -2494,7 +2495,8 @@ static int dummy_start(struct usb_hcd *hcd) return dummy_start_ss(dum_hcd);
spin_lock_init(&dum_hcd->dum->lock); - timer_setup(&dum_hcd->timer, dummy_timer, 0); + hrtimer_init(&dum_hcd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + dum_hcd->timer.function = dummy_timer; dum_hcd->rh_state = DUMMY_RH_RUNNING;
INIT_LIST_HEAD(&dum_hcd->urbp_list); @@ -2513,8 +2515,11 @@ static int dummy_start(struct usb_hcd *hcd)
static void dummy_stop(struct usb_hcd *hcd) { - device_remove_file(dummy_dev(hcd_to_dummy_hcd(hcd)), &dev_attr_urbs); - dev_info(dummy_dev(hcd_to_dummy_hcd(hcd)), "stopped\n"); + struct dummy_hcd *dum_hcd = hcd_to_dummy_hcd(hcd); + + hrtimer_cancel(&dum_hcd->timer); + device_remove_file(dummy_dev(dum_hcd), &dev_attr_urbs); + dev_info(dummy_dev(dum_hcd), "stopped\n"); }
/*-------------------------------------------------------------------------*/
From: Marcello Sylvester Bauer sylv@sylv.io
commit 0a723ed3baa941ca4f51d87bab00661f41142835 upstream.
Currently, the transfer polling interval is set to 1ms, which is the frame rate of full-speed and low-speed USB. The USB 2.0 specification introduces microframes (125 microseconds) to improve the timing precision of data transfers.
Reducing the transfer interval to 1 microframe increases data throughput for high-speed and super-speed USB communication
Signed-off-by: Marcello Sylvester Bauer marcello.bauer@9elements.com Signed-off-by: Marcello Sylvester Bauer sylv@sylv.io Link: https://lore.kernel.org/r/6295dbb84ca76884551df9eb157cce569377a22c.171284396... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Guilherme G. Piccoli gpiccoli@igalia.com --- drivers/usb/gadget/udc/dummy_hcd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c index dab559d8ee8c..f37b0d8386c1 100644 --- a/drivers/usb/gadget/udc/dummy_hcd.c +++ b/drivers/usb/gadget/udc/dummy_hcd.c @@ -50,6 +50,8 @@ #define POWER_BUDGET 500 /* in mA; use 8 for low-power port testing */ #define POWER_BUDGET_3 900 /* in mA */
+#define DUMMY_TIMER_INT_NSECS 125000 /* 1 microframe */ + static const char driver_name[] = "dummy_hcd"; static const char driver_desc[] = "USB Host+Gadget Emulator";
@@ -1302,7 +1304,7 @@ static int dummy_urb_enqueue(
/* kick the scheduler, it'll do the rest */ if (!hrtimer_active(&dum_hcd->timer)) - hrtimer_start(&dum_hcd->timer, ms_to_ktime(1), HRTIMER_MODE_REL); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL);
done: spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); @@ -1993,7 +1995,7 @@ static enum hrtimer_restart dummy_timer(struct hrtimer *t) dum_hcd->udev = NULL; } else if (dum_hcd->rh_state == DUMMY_RH_RUNNING) { /* want a 1 msec delay here */ - hrtimer_start(&dum_hcd->timer, ms_to_ktime(1), HRTIMER_MODE_REL); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL); }
spin_unlock_irqrestore(&dum->lock, flags);
From: Andrey Konovalov andreyknvl@gmail.com
commit 9313d139aa25e572d860f6f673b73a20f32d7f93 upstream.
Commit a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler") switched dummy_hcd to use hrtimer and made the timer's callback be executed in the hardirq context.
With that change, __usb_hcd_giveback_urb now gets executed in the hardirq context, which causes problems for KCOV and KMSAN.
One problem is that KCOV now is unable to collect coverage from the USB code that gets executed from the dummy_hcd's timer callback, as KCOV cannot collect coverage in the hardirq context.
Another problem is that the dummy_hcd hrtimer might get triggered in the middle of a softirq with KCOV remote coverage collection enabled, and that causes a WARNING in KCOV, as reported by syzbot. (I sent a separate patch to shut down this WARNING, but that doesn't fix the other two issues.)
Finally, KMSAN appears to ignore tracking memory copying operations that happen in the hardirq context, which causes false positive kernel-infoleaks, as reported by syzbot.
Change the hrtimer in dummy_hcd to execute the callback in the softirq context.
Reported-by: syzbot+2388cdaeb6b10f0c13ac@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13ac Reported-by: syzbot+17ca2339e34a1d863aad@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=17ca2339e34a1d863aad Reported-by: syzbot+c793a7eca38803212c61@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c793a7eca38803212c61 Reported-by: syzbot+1e6e0b916b211bee1bd6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1e6e0b916b211bee1bd6 Reported-by: kernel test robot oliver.sang@intel.com Closes: https://lore.kernel.org/oe-lkp/202406141323.413a90d2-lkp@intel.com Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler") Cc: stable@vger.kernel.org Acked-by: Marcello Sylvester Bauer sylv@sylv.io Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=edd9fe0d3a65b14588d5 Link: https://lore.kernel.org/r/20240904013051.4409-1-andrey.konovalov@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Guilherme G. Piccoli gpiccoli@igalia.com --- drivers/usb/gadget/udc/dummy_hcd.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c index f37b0d8386c1..ff7bee78bcc4 100644 --- a/drivers/usb/gadget/udc/dummy_hcd.c +++ b/drivers/usb/gadget/udc/dummy_hcd.c @@ -1304,7 +1304,8 @@ static int dummy_urb_enqueue(
/* kick the scheduler, it'll do the rest */ if (!hrtimer_active(&dum_hcd->timer)) - hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), + HRTIMER_MODE_REL_SOFT);
done: spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); @@ -1325,7 +1326,7 @@ static int dummy_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) rc = usb_hcd_check_unlink_urb(hcd, urb, status); if (!rc && dum_hcd->rh_state != DUMMY_RH_RUNNING && !list_empty(&dum_hcd->urbp_list)) - hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL_SOFT);
spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); return rc; @@ -1995,7 +1996,8 @@ static enum hrtimer_restart dummy_timer(struct hrtimer *t) dum_hcd->udev = NULL; } else if (dum_hcd->rh_state == DUMMY_RH_RUNNING) { /* want a 1 msec delay here */ - hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), + HRTIMER_MODE_REL_SOFT); }
spin_unlock_irqrestore(&dum->lock, flags); @@ -2389,7 +2391,7 @@ static int dummy_bus_resume(struct usb_hcd *hcd) dum_hcd->rh_state = DUMMY_RH_RUNNING; set_link_state(dum_hcd); if (!list_empty(&dum_hcd->urbp_list)) - hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL); + hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL_SOFT); hcd->state = HC_STATE_RUNNING; } spin_unlock_irq(&dum_hcd->dum->lock); @@ -2467,7 +2469,7 @@ static DEVICE_ATTR_RO(urbs);
static int dummy_start_ss(struct dummy_hcd *dum_hcd) { - hrtimer_init(&dum_hcd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer_init(&dum_hcd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT); dum_hcd->timer.function = dummy_timer; dum_hcd->rh_state = DUMMY_RH_RUNNING; dum_hcd->stream_en_ep = 0; @@ -2497,7 +2499,7 @@ static int dummy_start(struct usb_hcd *hcd) return dummy_start_ss(dum_hcd);
spin_lock_init(&dum_hcd->dum->lock); - hrtimer_init(&dum_hcd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer_init(&dum_hcd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT); dum_hcd->timer.function = dummy_timer; dum_hcd->rh_state = DUMMY_RH_RUNNING;
From: Alan Stern stern@rowland.harvard.edu
commit 5189df7b8088268012882c220d6aca4e64981348 upstream.
The syzbot fuzzer has been encountering "task hung" problems ever since the dummy-hcd driver was changed to use hrtimers instead of regular timers. It turns out that the problems are caused by a subtle difference between the timer_pending() and hrtimer_active() APIs.
The changeover blindly replaced the first by the second. However, timer_pending() returns True when the timer is queued but not when its callback is running, whereas hrtimer_active() returns True when the hrtimer is queued _or_ its callback is running. This difference occasionally caused dummy_urb_enqueue() to think that the callback routine had not yet started when in fact it was almost finished. As a result the hrtimer was not restarted, which made it impossible for the driver to dequeue later the URB that was just enqueued. This caused usb_kill_urb() to hang, and things got worse from there.
Since hrtimers have no API for telling when they are queued and the callback isn't running, the driver must keep track of this for itself. That's what this patch does, adding a new "timer_pending" flag and setting or clearing it at the appropriate times.
Reported-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/6709234e.050a0220.3e960.0011.GAE@google.co... Tested-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Signed-off-by: Alan Stern stern@rowland.harvard.edu Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler") Cc: Marcello Sylvester Bauer sylv@sylv.io Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2dab644e-ef87-4de8-ac9a-26f100b2c609@rowland.harva... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Guilherme G. Piccoli gpiccoli@igalia.com --- drivers/usb/gadget/udc/dummy_hcd.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c index ff7bee78bcc4..d5d89fadde43 100644 --- a/drivers/usb/gadget/udc/dummy_hcd.c +++ b/drivers/usb/gadget/udc/dummy_hcd.c @@ -254,6 +254,7 @@ struct dummy_hcd { u32 stream_en_ep; u8 num_stream[30 / 2];
+ unsigned timer_pending:1; unsigned active:1; unsigned old_active:1; unsigned resuming:1; @@ -1303,9 +1304,11 @@ static int dummy_urb_enqueue( urb->error_count = 1; /* mark as a new urb */
/* kick the scheduler, it'll do the rest */ - if (!hrtimer_active(&dum_hcd->timer)) + if (!dum_hcd->timer_pending) { + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL_SOFT); + }
done: spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); @@ -1324,9 +1327,10 @@ static int dummy_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) spin_lock_irqsave(&dum_hcd->dum->lock, flags);
rc = usb_hcd_check_unlink_urb(hcd, urb, status); - if (!rc && dum_hcd->rh_state != DUMMY_RH_RUNNING && - !list_empty(&dum_hcd->urbp_list)) + if (rc == 0 && !dum_hcd->timer_pending) { + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL_SOFT); + }
spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); return rc; @@ -1813,6 +1817,7 @@ static enum hrtimer_restart dummy_timer(struct hrtimer *t)
/* look at each urb queued by the host side driver */ spin_lock_irqsave(&dum->lock, flags); + dum_hcd->timer_pending = 0;
if (!dum_hcd->udev) { dev_err(dummy_dev(dum_hcd), @@ -1994,8 +1999,10 @@ static enum hrtimer_restart dummy_timer(struct hrtimer *t) if (list_empty(&dum_hcd->urbp_list)) { usb_put_dev(dum_hcd->udev); dum_hcd->udev = NULL; - } else if (dum_hcd->rh_state == DUMMY_RH_RUNNING) { + } else if (!dum_hcd->timer_pending && + dum_hcd->rh_state == DUMMY_RH_RUNNING) { /* want a 1 msec delay here */ + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL_SOFT); } @@ -2390,8 +2397,10 @@ static int dummy_bus_resume(struct usb_hcd *hcd) } else { dum_hcd->rh_state = DUMMY_RH_RUNNING; set_link_state(dum_hcd); - if (!list_empty(&dum_hcd->urbp_list)) + if (!list_empty(&dum_hcd->urbp_list)) { + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL_SOFT); + } hcd->state = HC_STATE_RUNNING; } spin_unlock_irq(&dum_hcd->dum->lock); @@ -2522,6 +2531,7 @@ static void dummy_stop(struct usb_hcd *hcd) struct dummy_hcd *dum_hcd = hcd_to_dummy_hcd(hcd);
hrtimer_cancel(&dum_hcd->timer); + dum_hcd->timer_pending = 0; device_remove_file(dummy_dev(dum_hcd), &dev_attr_urbs); dev_info(dummy_dev(dum_hcd), "stopped\n"); }
On Sat, Nov 02, 2024 at 11:13:49PM -0300, Guilherme G. Piccoli wrote:
Hi folks, here is a series with some fixes for dummy_hcd. First of all, the reasoning behind it.
Syzkaller report [0] shows a hung task on uevent_show, and despite it was fixed with a patch on drivers/base (a race between drivers shutdown and uevent_show), another issue remains: a problem with Realtek emulated wifi device [1]. While working the fix ([1]), we noticed that if it is applied to recent kernels, all fine. But in v6.1.y and v6.6.y for example, it didn't solve entirely the issue, and after some debugging, it was narrowed to dummy_hcd transfer rates being waaay slower in such stable versions.
The reason of such slowness is well-described in the first 2 patches of this backport, but the thing is that these patches introduced subtle issues as well, fixed in the other 2 patches. Hence, I decided to backport all of them for the 2 latest LTS kernels.
Maybe this is not a good idea - I don't see a strong con, but who's better to judge the benefits vs the risks than the patch authors, reviewers, and the USB maintainer?! So, I've CCed Alan, Andrey, Greg and Marcello here, and I thank you all in advance for reviews on this. And my apologies for bothering you with the emails, I hope this is a simple "OK, makes sense" or "Nah, doesn't worth it" situation =)
Cheers,
Guilherme
[0] https://syzkaller.appspot.com/bug?extid=edd9fe0d3a65b14588d5 [1] https://lore.kernel.org/r/20241101193412.1390391-1-gpiccoli@igalia.com/
Alan Stern (1): USB: gadget: dummy-hcd: Fix "task hung" problem
Andrey Konovalov (1): usb: gadget: dummy_hcd: execute hrtimer callback in softirq context
Marcello Sylvester Bauer (2): usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler usb: gadget: dummy_hcd: Set transfer interval to 1 microframe
drivers/usb/gadget/udc/dummy_hcd.c | 57 ++++++++++++++++++++---------- 1 file changed, 38 insertions(+), 19 deletions(-)
I'm not aware of any reasons not to backport these commits to the stable kernels, if they fix a real problem for you.
However, it probably wasn't necessary to post the patches explicitly. (Not unless they required some modifications for the backports.) I should think all you really needed to do was ask the appropriate maintainers to queue those commits for the stable kernels you listed.
Alan Stern
On Sat, Nov 02, 2024 at 10:41:19PM -0400, Alan Stern wrote:
On Sat, Nov 02, 2024 at 11:13:49PM -0300, Guilherme G. Piccoli wrote:
Hi folks, here is a series with some fixes for dummy_hcd. First of all, the reasoning behind it.
Syzkaller report [0] shows a hung task on uevent_show, and despite it was fixed with a patch on drivers/base (a race between drivers shutdown and uevent_show), another issue remains: a problem with Realtek emulated wifi device [1]. While working the fix ([1]), we noticed that if it is applied to recent kernels, all fine. But in v6.1.y and v6.6.y for example, it didn't solve entirely the issue, and after some debugging, it was narrowed to dummy_hcd transfer rates being waaay slower in such stable versions.
The reason of such slowness is well-described in the first 2 patches of this backport, but the thing is that these patches introduced subtle issues as well, fixed in the other 2 patches. Hence, I decided to backport all of them for the 2 latest LTS kernels.
Maybe this is not a good idea - I don't see a strong con, but who's better to judge the benefits vs the risks than the patch authors, reviewers, and the USB maintainer?! So, I've CCed Alan, Andrey, Greg and Marcello here, and I thank you all in advance for reviews on this. And my apologies for bothering you with the emails, I hope this is a simple "OK, makes sense" or "Nah, doesn't worth it" situation =)
Cheers,
Guilherme
[0] https://syzkaller.appspot.com/bug?extid=edd9fe0d3a65b14588d5 [1] https://lore.kernel.org/r/20241101193412.1390391-1-gpiccoli@igalia.com/
Alan Stern (1): USB: gadget: dummy-hcd: Fix "task hung" problem
Andrey Konovalov (1): usb: gadget: dummy_hcd: execute hrtimer callback in softirq context
Marcello Sylvester Bauer (2): usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler usb: gadget: dummy_hcd: Set transfer interval to 1 microframe
drivers/usb/gadget/udc/dummy_hcd.c | 57 ++++++++++++++++++++---------- 1 file changed, 38 insertions(+), 19 deletions(-)
I'm not aware of any reasons not to backport these commits to the stable kernels, if they fix a real problem for you.
However, it probably wasn't necessary to post the patches explicitly. (Not unless they required some modifications for the backports.) I should think all you really needed to do was ask the appropriate maintainers to queue those commits for the stable kernels you listed.
I've queued this up. And yes, giving us the IDs would be easier (which is what I ended up doing here).
Thanks!
On Sun, Nov 3, 2024 at 3:28 AM Guilherme G. Piccoli gpiccoli@igalia.com wrote:
Hi folks, here is a series with some fixes for dummy_hcd. First of all, the reasoning behind it.
Syzkaller report [0] shows a hung task on uevent_show, and despite it was fixed with a patch on drivers/base (a race between drivers shutdown and uevent_show), another issue remains: a problem with Realtek emulated wifi device [1]. While working the fix ([1]), we noticed that if it is applied to recent kernels, all fine. But in v6.1.y and v6.6.y for example, it didn't solve entirely the issue, and after some debugging, it was narrowed to dummy_hcd transfer rates being waaay slower in such stable versions.
The reason of such slowness is well-described in the first 2 patches of this backport, but the thing is that these patches introduced subtle issues as well, fixed in the other 2 patches. Hence, I decided to backport all of them for the 2 latest LTS kernels.
Maybe this is not a good idea - I don't see a strong con, but who's better to judge the benefits vs the risks than the patch authors, reviewers, and the USB maintainer?! So, I've CCed Alan, Andrey, Greg and Marcello here, and I thank you all in advance for reviews on this. And my apologies for bothering you with the emails, I hope this is a simple "OK, makes sense" or "Nah, doesn't worth it" situation =)
Sounds good to me, thank you!
On 03/11/2024 22:29, Andrey Konovalov wrote:
On Sun, Nov 3, 2024 at 3:28 AM Guilherme G. Piccoli gpiccoli@igalia.com wrote:
Hi folks, here is a series with some fixes for dummy_hcd. First of all, the reasoning behind it.
Syzkaller report [0] shows a hung task on uevent_show, and despite it was fixed with a patch on drivers/base (a race between drivers shutdown and uevent_show), another issue remains: a problem with Realtek emulated wifi device [1]. While working the fix ([1]), we noticed that if it is applied to recent kernels, all fine. But in v6.1.y and v6.6.y for example, it didn't solve entirely the issue, and after some debugging, it was narrowed to dummy_hcd transfer rates being waaay slower in such stable versions.
The reason of such slowness is well-described in the first 2 patches of this backport, but the thing is that these patches introduced subtle issues as well, fixed in the other 2 patches. Hence, I decided to backport all of them for the 2 latest LTS kernels.
Maybe this is not a good idea - I don't see a strong con, but who's better to judge the benefits vs the risks than the patch authors, reviewers, and the USB maintainer?! So, I've CCed Alan, Andrey, Greg and Marcello here, and I thank you all in advance for reviews on this. And my apologies for bothering you with the emails, I hope this is a simple "OK, makes sense" or "Nah, doesn't worth it" situation =)
Sounds good to me, thank you!
Thanks a bunch to all of you folks! For the reviews and the suggestion about the commit-ids. I've always sent patches to stable this way, be it a backport or even a cherry-pick, but it's interesting and definitely easier to just mention the IDs and ask for merge - thanks for the suggestion, I'll do that in case of future clean cherry-picks =)
Cheers,
Guilherme
linux-stable-mirror@lists.linaro.org