Error injection testing uncovered a pretty severe problem where we could
end up committing a super that pointed to the wrong tree roots,
resulting in transid mismatch errors.
The way we commit the transaction is we update the super copy with the
current generations and bytenrs of the important roots, and then copy
that into our super_for_commit. Then we allow transactions to continue
again, we write out the dirty pages for the transaction, and then we
write the super. If the write out fails we'll bail and skip writing the
supers.
However since we've allowed a new transaction to start, we can have a
log attempting to sync at this point, which would be blocked on
fs_info->tree_log_mutex. Once the commit fails we're allowed to do the
log tree commit, which uses super_for_commit, which now points at fs
tree's that were not written out.
Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the
tree_log_mutex. This way if the transaction commit fails we're sure to
see this bit set and we can skip writing the super out. This patch
fixes this specific transid mismatch error I was seeing with this
particular error path.
cc: stable(a)vger.kernel.org
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
---
fs/btrfs/tree-log.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 0278f489e7d9..83e8f105869b 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3302,6 +3302,22 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
* begins and releases it only after writing its superblock.
*/
mutex_lock(&fs_info->tree_log_mutex);
+
+ /*
+ * The transaction writeout phase could have failed, and thus marked the
+ * fs in an error state. We must not commit here, as we could have
+ * updated our generation in the super_for_commit and writing the super
+ * here would result in transid mismatches. If there is an error here
+ * just bail.
+ */
+ if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
+ ret = -EIO;
+ btrfs_set_log_full_commit(trans);
+ btrfs_abort_transaction(trans, ret);
+ mutex_unlock(&fs_info->tree_log_mutex);
+ goto out_wake_log_root;
+ }
+
btrfs_set_super_log_root(fs_info->super_for_commit, log_root_start);
btrfs_set_super_log_root_level(fs_info->super_for_commit, log_root_level);
ret = write_all_supers(fs_info, 1);
--
2.26.3
commit 6f755e85c332 ("coresight: Add helper for inserting synchronization
packets") removed trailing '\0' from barrier_pkt array and updated the
call sites like etb_update_buffer() to have proper checks for barrier_pkt
size before read but missed updating tmc_update_etf_buffer() which still
reads barrier_pkt past the array size resulting in KASAN out-of-bounds
bug. Fix this by adding a check for barrier_pkt size before accessing
like it is done in etb_update_buffer().
BUG: KASAN: global-out-of-bounds in tmc_update_etf_buffer+0x4b8/0x698
Read of size 4 at addr ffffffd05b7d1030 by task perf/2629
Call trace:
dump_backtrace+0x0/0x27c
show_stack+0x20/0x2c
dump_stack+0x11c/0x188
print_address_description+0x3c/0x4a4
__kasan_report+0x140/0x164
kasan_report+0x10/0x18
__asan_report_load4_noabort+0x1c/0x24
tmc_update_etf_buffer+0x4b8/0x698
etm_event_stop+0x248/0x2d8
etm_event_del+0x20/0x2c
event_sched_out+0x214/0x6f0
group_sched_out+0xd0/0x270
ctx_sched_out+0x2ec/0x518
__perf_event_task_sched_out+0x4fc/0xe6c
__schedule+0x1094/0x16a0
preempt_schedule_irq+0x88/0x170
arm64_preempt_schedule_irq+0xf0/0x18c
el1_irq+0xe8/0x180
perf_event_exec+0x4d8/0x56c
setup_new_exec+0x204/0x400
load_elf_binary+0x72c/0x18c0
search_binary_handler+0x13c/0x420
load_script+0x500/0x6c4
search_binary_handler+0x13c/0x420
exec_binprm+0x118/0x654
__do_execve_file+0x77c/0xba4
__arm64_compat_sys_execve+0x98/0xac
el0_svc_common+0x1f8/0x5e0
el0_svc_compat_handler+0x84/0xb0
el0_svc_compat+0x10/0x50
The buggy address belongs to the variable:
barrier_pkt+0x10/0x40
Memory state around the buggy address:
ffffffd05b7d0f00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
ffffffd05b7d0f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffffd05b7d1000: 00 00 00 00 00 00 fa fa fa fa fa fa 00 00 00 03
^
ffffffd05b7d1080: fa fa fa fa 00 02 fa fa fa fa fa fa 03 fa fa fa
ffffffd05b7d1100: fa fa fa fa 00 00 00 00 05 fa fa fa fa fa fa fa
==================================================================
Fixes: 6f755e85c332 ("coresight: Add helper for inserting synchronization packets")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan(a)codeaurora.org>
---
drivers/hwtracing/coresight/coresight-tmc-etf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 45b85edfc690..cd0fb7bfba68 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -530,7 +530,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
buf_ptr = buf->data_pages[cur] + offset;
*buf_ptr = readl_relaxed(drvdata->base + TMC_RRD);
- if (lost && *barrier) {
+ if (lost && i < CORESIGHT_BARRIER_PKT_SIZE) {
*buf_ptr = *barrier;
barrier++;
}
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
This is a note to let you know that I've just added the patch titled
xhci: Fix 5.12 regression of missing xHC cache clearing command after
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From a7f2e9272aff1ccfe0fc801dab1d5a7a1c6b7ed2 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 25 May 2021 10:41:00 +0300
Subject: xhci: Fix 5.12 regression of missing xHC cache clearing command after
a Stall
If endpoints halts due to a stall then the dequeue pointer read from
hardware may already be set ahead of the stalled TRB.
After commit 674f8438c121 ("xhci: split handling halted endpoints into two
steps") in 5.12 xhci driver won't issue a Set TR Dequeue if hardware
dequeue pointer is already in the right place.
Turns out the "Set TR Dequeue pointer" command is anyway needed as it in
addition to moving the dequeue pointer also clears endpoint state and
cache.
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: <stable(a)vger.kernel.org> # 5.12
Reported-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Tested-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210525074100.1154090-3-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-ring.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 256d336354a0..6acd2329e08d 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -933,14 +933,18 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
continue;
}
/*
- * If ring stopped on the TD we need to cancel, then we have to
+ * If a ring stopped on the TD we need to cancel then we have to
* move the xHC endpoint ring dequeue pointer past this TD.
+ * Rings halted due to STALL may show hw_deq is past the stalled
+ * TD, but still require a set TR Deq command to flush xHC cache.
*/
hw_deq = xhci_get_hw_deq(xhci, ep->vdev, ep->ep_index,
td->urb->stream_id);
hw_deq &= ~0xf;
- if (trb_in_td(xhci, td->start_seg, td->first_trb,
+ if (td->cancel_status == TD_HALTED) {
+ cached_td = td;
+ } else if (trb_in_td(xhci, td->start_seg, td->first_trb,
td->last_trb, hw_deq, false)) {
switch (td->cancel_status) {
case TD_CLEARED: /* TD is already no-op */
--
2.31.1
This is a note to let you know that I've just added the patch titled
xhci: fix giving back URB with incorrect status regression in 5.12
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From a80c203c3f1c06d2201c19ae071d0ae770a2b1ca Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Tue, 25 May 2021 10:40:59 +0300
Subject: xhci: fix giving back URB with incorrect status regression in 5.12
5.12 kernel changes how xhci handles cancelled URBs and halted
endpoints. Among these changes cancelled and stalled URBs are no longer
given back before they are cleared from xHC hardware cache.
These changes unfortunately cleared the -EPIPE status of a stalled
transfer in one case before giving bak the URB, causing a USB card reader
to fail from working.
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: <stable(a)vger.kernel.org> # 5.12
Reported-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Tested-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210525074100.1154090-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-ring.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a8e4189277da..256d336354a0 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -828,14 +828,10 @@ static void xhci_giveback_invalidated_tds(struct xhci_virt_ep *ep)
list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list,
cancelled_td_list) {
- /*
- * Doesn't matter what we pass for status, since the core will
- * just overwrite it (because the URB has been unlinked).
- */
ring = xhci_urb_to_transfer_ring(ep->xhci, td->urb);
if (td->cancel_status == TD_CLEARED)
- xhci_td_cleanup(ep->xhci, td, ring, 0);
+ xhci_td_cleanup(ep->xhci, td, ring, td->status);
if (ep->xhci->xhc_state & XHCI_STATE_DYING)
return;
--
2.31.1
If endpoints halts due to a stall then the dequeue pointer read from
hardware may already be set ahead of the stalled TRB.
After commit 674f8438c121 ("xhci: split handling halted endpoints into two
steps") in 5.12 xhci driver won't issue a Set TR Dequeue if hardware
dequeue pointer is already in the right place.
Turns out the "Set TR Dequeue pointer" command is anyway needed as it in
addition to moving the dequeue pointer also clears endpoint state and
cache.
Reported-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Tested-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: <stable(a)vger.kernel.org> # 5.12
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 256d336354a0..6acd2329e08d 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -933,14 +933,18 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
continue;
}
/*
- * If ring stopped on the TD we need to cancel, then we have to
+ * If a ring stopped on the TD we need to cancel then we have to
* move the xHC endpoint ring dequeue pointer past this TD.
+ * Rings halted due to STALL may show hw_deq is past the stalled
+ * TD, but still require a set TR Deq command to flush xHC cache.
*/
hw_deq = xhci_get_hw_deq(xhci, ep->vdev, ep->ep_index,
td->urb->stream_id);
hw_deq &= ~0xf;
- if (trb_in_td(xhci, td->start_seg, td->first_trb,
+ if (td->cancel_status == TD_HALTED) {
+ cached_td = td;
+ } else if (trb_in_td(xhci, td->start_seg, td->first_trb,
td->last_trb, hw_deq, false)) {
switch (td->cancel_status) {
case TD_CLEARED: /* TD is already no-op */
--
2.25.1
5.12 kernel changes how xhci handles cancelled URBs and halted
endpoints. Among these changes cancelled and stalled URBs are no longer
given back before they are cleared from xHC hardware cache.
These changes unfortunately cleared the -EPIPE status of a stalled
transfer in one case before giving bak the URB, causing a USB card reader
to fail from working.
Reported-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Tested-by: Peter Ganzhorn <peter.ganzhorn(a)googlemail.com>
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: <stable(a)vger.kernel.org> # 5.12
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a8e4189277da..256d336354a0 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -828,14 +828,10 @@ static void xhci_giveback_invalidated_tds(struct xhci_virt_ep *ep)
list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list,
cancelled_td_list) {
- /*
- * Doesn't matter what we pass for status, since the core will
- * just overwrite it (because the URB has been unlinked).
- */
ring = xhci_urb_to_transfer_ring(ep->xhci, td->urb);
if (td->cancel_status == TD_CLEARED)
- xhci_td_cleanup(ep->xhci, td, ring, 0);
+ xhci_td_cleanup(ep->xhci, td, ring, td->status);
if (ep->xhci->xhc_state & XHCI_STATE_DYING)
return;
--
2.25.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d1d90dd27254c44d087ad3f8b5b3e4fff0571f45 Mon Sep 17 00:00:00 2001
From: Jack Pham <jackp(a)codeaurora.org>
Date: Wed, 28 Apr 2021 02:01:10 -0700
Subject: [PATCH] usb: dwc3: gadget: Enable suspend events
commit 72704f876f50 ("dwc3: gadget: Implement the suspend entry event
handler") introduced (nearly 5 years ago!) an interrupt handler for
U3/L1-L2 suspend events. The problem is that these events aren't
currently enabled in the DEVTEN register so the handler is never
even invoked. Fix this simply by enabling the corresponding bit
in dwc3_gadget_enable_irq() using the same revision check as found
in the handler.
Fixes: 72704f876f50 ("dwc3: gadget: Implement the suspend entry event handler")
Acked-by: Felipe Balbi <balbi(a)kernel.org>
Signed-off-by: Jack Pham <jackp(a)codeaurora.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20210428090111.3370-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index dd80e5ca8c78..cab3a9184068 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2323,6 +2323,10 @@ static void dwc3_gadget_enable_irq(struct dwc3 *dwc)
if (DWC3_VER_IS_PRIOR(DWC3, 250A))
reg |= DWC3_DEVTEN_ULSTCNGEN;
+ /* On 2.30a and above this bit enables U3/L2-L1 Suspend Events */
+ if (!DWC3_VER_IS_PRIOR(DWC3, 230A))
+ reg |= DWC3_DEVTEN_EOPFEN;
+
dwc3_writel(dwc->regs, DWC3_DEVTEN, reg);
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d1d90dd27254c44d087ad3f8b5b3e4fff0571f45 Mon Sep 17 00:00:00 2001
From: Jack Pham <jackp(a)codeaurora.org>
Date: Wed, 28 Apr 2021 02:01:10 -0700
Subject: [PATCH] usb: dwc3: gadget: Enable suspend events
commit 72704f876f50 ("dwc3: gadget: Implement the suspend entry event
handler") introduced (nearly 5 years ago!) an interrupt handler for
U3/L1-L2 suspend events. The problem is that these events aren't
currently enabled in the DEVTEN register so the handler is never
even invoked. Fix this simply by enabling the corresponding bit
in dwc3_gadget_enable_irq() using the same revision check as found
in the handler.
Fixes: 72704f876f50 ("dwc3: gadget: Implement the suspend entry event handler")
Acked-by: Felipe Balbi <balbi(a)kernel.org>
Signed-off-by: Jack Pham <jackp(a)codeaurora.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20210428090111.3370-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index dd80e5ca8c78..cab3a9184068 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2323,6 +2323,10 @@ static void dwc3_gadget_enable_irq(struct dwc3 *dwc)
if (DWC3_VER_IS_PRIOR(DWC3, 250A))
reg |= DWC3_DEVTEN_ULSTCNGEN;
+ /* On 2.30a and above this bit enables U3/L2-L1 Suspend Events */
+ if (!DWC3_VER_IS_PRIOR(DWC3, 230A))
+ reg |= DWC3_DEVTEN_EOPFEN;
+
dwc3_writel(dwc->regs, DWC3_DEVTEN, reg);
}
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d1d90dd27254c44d087ad3f8b5b3e4fff0571f45 Mon Sep 17 00:00:00 2001
From: Jack Pham <jackp(a)codeaurora.org>
Date: Wed, 28 Apr 2021 02:01:10 -0700
Subject: [PATCH] usb: dwc3: gadget: Enable suspend events
commit 72704f876f50 ("dwc3: gadget: Implement the suspend entry event
handler") introduced (nearly 5 years ago!) an interrupt handler for
U3/L1-L2 suspend events. The problem is that these events aren't
currently enabled in the DEVTEN register so the handler is never
even invoked. Fix this simply by enabling the corresponding bit
in dwc3_gadget_enable_irq() using the same revision check as found
in the handler.
Fixes: 72704f876f50 ("dwc3: gadget: Implement the suspend entry event handler")
Acked-by: Felipe Balbi <balbi(a)kernel.org>
Signed-off-by: Jack Pham <jackp(a)codeaurora.org>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20210428090111.3370-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index dd80e5ca8c78..cab3a9184068 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2323,6 +2323,10 @@ static void dwc3_gadget_enable_irq(struct dwc3 *dwc)
if (DWC3_VER_IS_PRIOR(DWC3, 250A))
reg |= DWC3_DEVTEN_ULSTCNGEN;
+ /* On 2.30a and above this bit enables U3/L2-L1 Suspend Events */
+ if (!DWC3_VER_IS_PRIOR(DWC3, 230A))
+ reg |= DWC3_DEVTEN_EOPFEN;
+
dwc3_writel(dwc->regs, DWC3_DEVTEN, reg);
}
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Fix the four control requests which erroneously used usb_rcvctrlpipe().
Fixes: 1d3e20236d7a ("[PATCH] USB: usbtouchscreen: unified USB touchscreen driver")
Fixes: 24ced062a296 ("usbtouchscreen: add support for DMC TSC-10/25 devices")
Fixes: 9e3b25837a20 ("Input: usbtouchscreen - add support for e2i touchscreen controller")
Cc: stable(a)vger.kernel.org # 2.6.17
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
Changes in v2
- include also the request in e2i_init which did not use USB_DIR_OUT
drivers/input/touchscreen/usbtouchscreen.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/input/touchscreen/usbtouchscreen.c b/drivers/input/touchscreen/usbtouchscreen.c
index c847453a03c2..43c521f50c85 100644
--- a/drivers/input/touchscreen/usbtouchscreen.c
+++ b/drivers/input/touchscreen/usbtouchscreen.c
@@ -251,7 +251,7 @@ static int e2i_init(struct usbtouch_usb *usbtouch)
int ret;
struct usb_device *udev = interface_to_usbdev(usbtouch->interface);
- ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+ ret = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
0x01, 0x02, 0x0000, 0x0081,
NULL, 0, USB_CTRL_SET_TIMEOUT);
@@ -531,7 +531,7 @@ static int mtouch_init(struct usbtouch_usb *usbtouch)
if (ret)
return ret;
- ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+ ret = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
MTOUCHUSB_RESET,
USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
1, 0, NULL, 0, USB_CTRL_SET_TIMEOUT);
@@ -543,7 +543,7 @@ static int mtouch_init(struct usbtouch_usb *usbtouch)
msleep(150);
for (i = 0; i < 3; i++) {
- ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+ ret = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
MTOUCHUSB_ASYNC_REPORT,
USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
1, 1, NULL, 0, USB_CTRL_SET_TIMEOUT);
@@ -722,7 +722,7 @@ static int dmc_tsc10_init(struct usbtouch_usb *usbtouch)
}
/* start sending data */
- ret = usb_control_msg(dev, usb_rcvctrlpipe (dev, 0),
+ ret = usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
TSC10_CMD_DATA1,
USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
0, 0, NULL, 0, USB_CTRL_SET_TIMEOUT);
--
2.26.3
Angesichts der unvermeidlichen Umstände war ich gezwungen, mich an Sie zu wenden, um zu sehen, ob Sie mit mir zusammenarbeiten können, um ein Projekt von hohem Wert in Höhe von Zweiund vierzig Millionen Dollar durchzuführen. Wenden Sie sich an mich, um weitere Informationen zu dieser äußerst wichtigen Angelegenheit zu erhalten, da Zeit von entscheidender Bedeutung ist. Bitte behandeln Sie diese Nachricht als klassifiziert.
Mit bestem Gruß,
Sharon
Persönliche E-Mail: shannon.olsens(a)aol.com
When receiving a packet with multiple fragments, hardware may still
touch the first fragment until the entire packet has been received. The
driver therefore keeps the first fragment mapped for DMA until end of
packet has been asserted, and delays its dma_sync call until then.
The driver tries to fit multiple receive buffers on one page. When using
3K receive buffers (e.g. using Jumbo frames and legacy-rx is turned
off/build_skb is being used) on an architecture with 4K pages, the
driver allocates an order 1 compound page and uses one page per receive
buffer. To determine the correct offset for a delayed DMA sync of the
first fragment of a multi-fragment packet, the driver then cannot just
use PAGE_MASK on the DMA address but has to construct a mask based on
the actual size of the backing page.
Using PAGE_MASK in the 3K RX buffer/4K page architecture configuration
will always sync the first page of a compound page. With the SWIOTLB
enabled this can lead to corrupted packets (zeroed out first fragment,
re-used garbage from another packet) and various consequences, such as
slow/stalling data transfers and connection resets. For example, testing
on a link with MTU exceeding 3058 bytes on a host with SWIOTLB enabled
(e.g. "iommu=soft swiotlb=262144,force") TCP transfers quickly fizzle
out without this patch.
Cc: stable(a)vger.kernel.org
Fixes: 0c5661ecc5dd7 ("ixgbe: fix crash in build_skb Rx code path")
Signed-off-by: Markus Boehme <markubo(a)amazon.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index c5ec17d19c59..507ef3471301 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1825,7 +1825,8 @@ static void ixgbe_dma_sync_frag(struct ixgbe_ring *rx_ring,
struct sk_buff *skb)
{
if (ring_uses_build_skb(rx_ring)) {
- unsigned long offset = (unsigned long)(skb->data) & ~PAGE_MASK;
+ unsigned long mask = (unsigned long)ixgbe_rx_pg_size(rx_ring) - 1;
+ unsigned long offset = (unsigned long)(skb->data) & mask;
dma_sync_single_range_for_cpu(rx_ring->dev,
IXGBE_CB(skb)->dma,
--
2.25.1
The patch titled
Subject: pid: take a reference when initializing `cad_pid`
has been added to the -mm tree. Its filename is
pid-take-a-reference-when-initializing-cad_pid.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/pid-take-a-reference-when-initial…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/pid-take-a-reference-when-initial…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mark Rutland <mark.rutland(a)arm.com>
Subject: pid: take a reference when initializing `cad_pid`
During boot, kernel_init_freeable() initializes `cad_pid` to the init
task's struct pid. Later on, we may change `cad_pid` via a sysctl, and
when this happens proc_do_cad_pid() will increment the refcount on the new
pid via get_pid(), and will decrement the refcount on the old pid via
put_pid(). As we never called get_pid() when we initialized `cad_pid`, we
decrement a reference we never incremented, can therefore free the init
task's struct pid early. As there can be dangling references to the
struct pid, we can later encounter a use-after-free (e.g. when delivering
signals).
This was spotted when fuzzing v5.13-rc3 with Syzkaller, but seems to have
been around since the conversion of `cad_pid` to struct pid in commit:
9ec52099e4b8678a ("[PATCH] replace cad_pid by a struct pid")
... from the pre-KASAN stone age of v2.6.19.
Fix this by getting a reference to the init task's struct pid when we
assign it to `cad_pid`.
Full KASAN splat below.
==================================================================
BUG: KASAN: use-after-free in ns_of_pid include/linux/pid.h:153 [inline]
BUG: KASAN: use-after-free in task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509
Read of size 4 at addr ffff23794dda0004 by task syz-executor.0/273
CPU: 1 PID: 273 Comm: syz-executor.0 Not tainted 5.12.0-00001-g9aef892b2d15 #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x4a8 arch/arm64/kernel/stacktrace.c:105
show_stack+0x34/0x48 arch/arm64/kernel/stacktrace.c:191
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x1d4/0x2a0 lib/dump_stack.c:120
print_address_description.constprop.11+0x60/0x3a8 mm/kasan/report.c:232
__kasan_report mm/kasan/report.c:399 [inline]
kasan_report+0x1e8/0x200 mm/kasan/report.c:416
__asan_report_load4_noabort+0x30/0x48 mm/kasan/report_generic.c:308
ns_of_pid include/linux/pid.h:153 [inline]
task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509
do_notify_parent+0x308/0xe60 kernel/signal.c:1950
exit_notify kernel/exit.c:682 [inline]
do_exit+0x2334/0x2bd0 kernel/exit.c:845
do_group_exit+0x108/0x2c8 kernel/exit.c:922
get_signal+0x4e4/0x2a88 kernel/signal.c:2781
do_signal arch/arm64/kernel/signal.c:882 [inline]
do_notify_resume+0x300/0x970 arch/arm64/kernel/signal.c:936
work_pending+0xc/0x2dc
Allocated by task 0:
kasan_save_stack+0x28/0x58 mm/kasan/common.c:38
kasan_set_track mm/kasan/common.c:46 [inline]
set_alloc_info mm/kasan/common.c:427 [inline]
__kasan_slab_alloc+0x88/0xa8 mm/kasan/common.c:460
kasan_slab_alloc include/linux/kasan.h:223 [inline]
slab_post_alloc_hook+0x50/0x5c0 mm/slab.h:516
slab_alloc_node mm/slub.c:2907 [inline]
slab_alloc mm/slub.c:2915 [inline]
kmem_cache_alloc+0x1f4/0x4c0 mm/slub.c:2920
alloc_pid+0xdc/0xc00 kernel/pid.c:180
copy_process+0x2794/0x5e18 kernel/fork.c:2129
kernel_clone+0x194/0x13c8 kernel/fork.c:2500
kernel_thread+0xd4/0x110 kernel/fork.c:2552
rest_init+0x44/0x4a0 init/main.c:687
arch_call_rest_init+0x1c/0x28
start_kernel+0x520/0x554 init/main.c:1064
0x0
Freed by task 270:
kasan_save_stack+0x28/0x58 mm/kasan/common.c:38
kasan_set_track+0x28/0x40 mm/kasan/common.c:46
kasan_set_free_info+0x28/0x50 mm/kasan/generic.c:357
____kasan_slab_free mm/kasan/common.c:360 [inline]
____kasan_slab_free mm/kasan/common.c:325 [inline]
__kasan_slab_free+0xf4/0x148 mm/kasan/common.c:367
kasan_slab_free include/linux/kasan.h:199 [inline]
slab_free_hook mm/slub.c:1562 [inline]
slab_free_freelist_hook+0x98/0x260 mm/slub.c:1600
slab_free mm/slub.c:3161 [inline]
kmem_cache_free+0x224/0x8e0 mm/slub.c:3177
put_pid.part.4+0xe0/0x1a8 kernel/pid.c:114
put_pid+0x30/0x48 kernel/pid.c:109
proc_do_cad_pid+0x190/0x1b0 kernel/sysctl.c:1401
proc_sys_call_handler+0x338/0x4b0 fs/proc/proc_sysctl.c:591
proc_sys_write+0x34/0x48 fs/proc/proc_sysctl.c:617
call_write_iter include/linux/fs.h:1977 [inline]
new_sync_write+0x3ac/0x510 fs/read_write.c:518
vfs_write fs/read_write.c:605 [inline]
vfs_write+0x9c4/0x1018 fs/read_write.c:585
ksys_write+0x124/0x240 fs/read_write.c:658
__do_sys_write fs/read_write.c:670 [inline]
__se_sys_write fs/read_write.c:667 [inline]
__arm64_sys_write+0x78/0xb0 fs/read_write.c:667
__invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:49 [inline]
el0_svc_common.constprop.1+0x16c/0x388 arch/arm64/kernel/syscall.c:129
do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:168
el0_svc+0x28/0x38 arch/arm64/kernel/entry-common.c:416
el0_sync_handler+0x134/0x180 arch/arm64/kernel/entry-common.c:432
el0_sync+0x154/0x180 arch/arm64/kernel/entry.S:701
The buggy address belongs to the object at ffff23794dda0000
which belongs to the cache pid of size 224
The buggy address is located 4 bytes inside of
224-byte region [ffff23794dda0000, ffff23794dda00e0)
The buggy address belongs to the page:
page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4dda0
head:(____ptrval____) order:1 compound_mapcount:0
flags: 0x3fffc0000010200(slab|head)
raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff23794d40d080
raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff23794dd9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff23794dd9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff23794dda0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff23794dda0080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff23794dda0100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
==================================================================
Link: https://lkml.kernel.org/r/20210524172230.38715-1-mark.rutland@arm.com
Fixes: 9ec52099e4b8678a ("[PATCH] replace cad_pid by a struct pid")
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Cedric Le Goater <clg(a)fr.ibm.com>
Cc: Christian Brauner <christian(a)brauner.io>
Cc: Eric W. Biederman <ebiederm(a)xmission.com>
Cc: Kees Cook <keescook(a)chromium.org
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/init/main.c~pid-take-a-reference-when-initializing-cad_pid
+++ a/init/main.c
@@ -1537,7 +1537,7 @@ static noinline void __init kernel_init_
*/
set_mems_allowed(node_states[N_MEMORY]);
- cad_pid = task_pid(current);
+ cad_pid = get_pid(task_pid(current));
smp_prepare_cpus(setup_max_cpus);
_
Patches currently in -mm which might be from mark.rutland(a)arm.com are
pid-take-a-reference-when-initializing-cad_pid.patch
Hi Greg,
This is based on the 190 commits that you posted for mainline. Out of
that 24 patches had a reply mail asking not to revert. So, the attached
series for stable is based on the remaining 166 commits. I have borrowed
the commit message from your series.
--
Regards
Sudip
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Fix the SET_ROM_WAIT_STATES request which erroneously used
usb_rcvctrlpipe().
Fixes: 88095e7b473a ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver")
Cc: stable(a)vger.kernel.org # 3.0
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/mmc/host/vub300.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mmc/host/vub300.c b/drivers/mmc/host/vub300.c
index 739cf63ef6e2..4950d10d3a19 100644
--- a/drivers/mmc/host/vub300.c
+++ b/drivers/mmc/host/vub300.c
@@ -2279,7 +2279,7 @@ static int vub300_probe(struct usb_interface *interface,
if (retval < 0)
goto error5;
retval =
- usb_control_msg(vub300->udev, usb_rcvctrlpipe(vub300->udev, 0),
+ usb_control_msg(vub300->udev, usb_sndctrlpipe(vub300->udev, 0),
SET_ROM_WAIT_STATES,
USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
firmware_rom_wait_states, 0x0000, NULL, 0, HZ);
--
2.26.3
This is a note to let you know that I've just added the patch titled
usb: typec: tcpm: Respond Not_Supported if no snk_vdo
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From a20dcf53ea9836387b229c4878f9559cf1b55b71 Mon Sep 17 00:00:00 2001
From: Kyle Tso <kyletso(a)google.com>
Date: Sun, 23 May 2021 09:58:55 +0800
Subject: usb: typec: tcpm: Respond Not_Supported if no snk_vdo
If snk_vdo is not populated from fwnode, it implies the port does not
support responding to SVDM commands. Not_Supported Message shall be sent
if the contract is in PD3. And for PD2, the port shall ignore the
commands.
Fixes: 193a68011fdc ("staging: typec: tcpm: Respond to Discover Identity commands")
Cc: stable <stable(a)vger.kernel.org>
Reviewed-by: Guenter Roeck <linux(a)roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Kyle Tso <kyletso(a)google.com>
Link: https://lore.kernel.org/r/20210523015855.1785484-3-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/typec/tcpm/tcpm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 6ea5df3782cf..9ce8c9af4da5 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -2430,7 +2430,10 @@ static void tcpm_pd_data_request(struct tcpm_port *port,
NONE_AMS);
break;
case PD_DATA_VENDOR_DEF:
- tcpm_handle_vdm_request(port, msg->payload, cnt);
+ if (tcpm_vdm_ams(port) || port->nr_snk_vdo)
+ tcpm_handle_vdm_request(port, msg->payload, cnt);
+ else if (port->negotiated_rev > PD_REV20)
+ tcpm_pd_handle_msg(port, PD_MSG_CTRL_NOT_SUPP, NONE_AMS);
break;
case PD_DATA_BIST:
port->bist_request = le32_to_cpu(msg->payload[0]);
--
2.31.1
This is a note to let you know that I've just added the patch titled
usb: typec: tcpm: Properly interrupt VDM AMS
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0bc3ee92880d910a1d100b73a781904f359e1f1c Mon Sep 17 00:00:00 2001
From: Kyle Tso <kyletso(a)google.com>
Date: Sun, 23 May 2021 09:58:54 +0800
Subject: usb: typec: tcpm: Properly interrupt VDM AMS
When a VDM AMS is interrupted by Messages other than VDM, the AMS needs
to be finished properly. Also start a VDM AMS if receiving SVDM Commands
from the port partner to complement the functionality of tcpm_vdm_ams().
Fixes: 0908c5aca31e ("usb: typec: tcpm: AMS and Collision Avoidance")
Cc: stable <stable(a)vger.kernel.org>
Reviewed-by: Guenter Roeck <linux(a)roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Kyle Tso <kyletso(a)google.com>
Link: https://lore.kernel.org/r/20210523015855.1785484-2-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/typec/tcpm/tcpm.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 8fdfd7f65ad7..6ea5df3782cf 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -1550,6 +1550,8 @@ static int tcpm_pd_svdm(struct tcpm_port *port, struct typec_altmode *adev,
if (PD_VDO_SVDM_VER(p[0]) < svdm_version)
typec_partner_set_svdm_version(port->partner,
PD_VDO_SVDM_VER(p[0]));
+
+ tcpm_ams_start(port, DISCOVER_IDENTITY);
/* 6.4.4.3.1: Only respond as UFP (device) */
if (port->data_role == TYPEC_DEVICE &&
port->nr_snk_vdo) {
@@ -1568,14 +1570,19 @@ static int tcpm_pd_svdm(struct tcpm_port *port, struct typec_altmode *adev,
}
break;
case CMD_DISCOVER_SVID:
+ tcpm_ams_start(port, DISCOVER_SVIDS);
break;
case CMD_DISCOVER_MODES:
+ tcpm_ams_start(port, DISCOVER_MODES);
break;
case CMD_ENTER_MODE:
+ tcpm_ams_start(port, DFP_TO_UFP_ENTER_MODE);
break;
case CMD_EXIT_MODE:
+ tcpm_ams_start(port, DFP_TO_UFP_EXIT_MODE);
break;
case CMD_ATTENTION:
+ tcpm_ams_start(port, ATTENTION);
/* Attention command does not have response */
*adev_action = ADEV_ATTENTION;
return 0;
@@ -2287,6 +2294,12 @@ static void tcpm_pd_data_request(struct tcpm_port *port,
bool frs_enable;
int ret;
+ if (tcpm_vdm_ams(port) && type != PD_DATA_VENDOR_DEF) {
+ port->vdm_state = VDM_STATE_ERR_BUSY;
+ tcpm_ams_finish(port);
+ mod_vdm_delayed_work(port, 0);
+ }
+
switch (type) {
case PD_DATA_SOURCE_CAP:
for (i = 0; i < cnt; i++)
@@ -2459,6 +2472,16 @@ static void tcpm_pd_ctrl_request(struct tcpm_port *port,
enum pd_ctrl_msg_type type = pd_header_type_le(msg->header);
enum tcpm_state next_state;
+ /*
+ * Stop VDM state machine if interrupted by other Messages while NOT_SUPP is allowed in
+ * VDM AMS if waiting for VDM responses and will be handled later.
+ */
+ if (tcpm_vdm_ams(port) && type != PD_CTRL_NOT_SUPP && type != PD_CTRL_GOOD_CRC) {
+ port->vdm_state = VDM_STATE_ERR_BUSY;
+ tcpm_ams_finish(port);
+ mod_vdm_delayed_work(port, 0);
+ }
+
switch (type) {
case PD_CTRL_GOOD_CRC:
case PD_CTRL_PING:
@@ -2717,6 +2740,13 @@ static void tcpm_pd_ext_msg_request(struct tcpm_port *port,
enum pd_ext_msg_type type = pd_header_type_le(msg->header);
unsigned int data_size = pd_ext_header_data_size_le(msg->ext_msg.header);
+ /* stopping VDM state machine if interrupted by other Messages */
+ if (tcpm_vdm_ams(port)) {
+ port->vdm_state = VDM_STATE_ERR_BUSY;
+ tcpm_ams_finish(port);
+ mod_vdm_delayed_work(port, 0);
+ }
+
if (!(le16_to_cpu(msg->ext_msg.header) & PD_EXT_HDR_CHUNKED)) {
tcpm_pd_handle_msg(port, PD_MSG_CTRL_NOT_SUPP, NONE_AMS);
tcpm_log(port, "Unchunked extended messages unsupported");
--
2.31.1
From: Colin Ian King <colin.king(a)canonical.com>
commit af0e1871d79cfbb91f732d2c6fa7558e45c31038 upstream.
The lux_val returned from tsl2583_get_lux can potentially be zero,
so check for this to avoid a division by zero and an overflowed
gain_trim_val.
Fixes clang scan-build warning:
drivers/iio/light/tsl2583.c:345:40: warning: Either the
condition 'lux_val<0' is redundant or there is division
by zero at line 345. [zerodivcond]
Fixes: ac4f6eee8fe8 ("staging: iio: TAOS tsl258x: Device driver")
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
[iwamatsu: Change file path.]
[Colin Ian King: minor context adjustments for 4.9.y]
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu(a)toshiba.co.jp>
---
drivers/staging/iio/light/tsl2583.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/staging/iio/light/tsl2583.c b/drivers/staging/iio/light/tsl2583.c
index 08f1583ee34e..fb3ec56d45b6 100644
--- a/drivers/staging/iio/light/tsl2583.c
+++ b/drivers/staging/iio/light/tsl2583.c
@@ -382,6 +382,15 @@ static int taos_als_calibrate(struct iio_dev *indio_dev)
dev_err(&chip->client->dev, "taos_als_calibrate failed to get lux\n");
return lux_val;
}
+
+ /* Avoid division by zero of lux_value later on */
+ if (lux_val == 0) {
+ dev_err(&chip->client->dev,
+ "%s: lux_val of 0 will produce out of range trim_value\n",
+ __func__);
+ return -ENODATA;
+ }
+
gain_trim_val = (unsigned int)(((chip->taos_settings.als_cal_target)
* chip->taos_settings.als_gain_trim) / lux_val);
--
2.31.1
From: Eric Biggers <ebiggers(a)google.com>
commit f053cf7aa66cd9d592b0fc967f4d887c2abff1b7 upstream.
[Please apply to 5.4-stable.]
ext4 didn't properly clean up if verity failed to be enabled on a file:
- It left verity metadata (pages past EOF) in the page cache, which
would be exposed to userspace if the file was later extended.
- It didn't truncate the verity metadata at all (either from cache or
from disk) if an error occurred while setting the verity bit.
Fix these bugs by adding a call to truncate_inode_pages() and ensuring
that we truncate the verity metadata (both from cache and from disk) in
all error paths. Also rework the code to cleanly separate the success
path from the error paths, which makes it much easier to understand.
Reported-by: Yunlei He <heyunlei(a)hihonor.com>
Fixes: c93d8f885809 ("ext4: add basic fs-verity support")
Cc: stable(a)vger.kernel.org # v5.4+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Link: https://lore.kernel.org/r/20210302200420.137977-2-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
---
fs/ext4/verity.c | 89 ++++++++++++++++++++++++++++++------------------
1 file changed, 55 insertions(+), 34 deletions(-)
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index d0d8a9795dd6..6a30e54c1128 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -198,55 +198,76 @@ static int ext4_end_enable_verity(struct file *filp, const void *desc,
struct inode *inode = file_inode(filp);
const int credits = 2; /* superblock and inode for ext4_orphan_del() */
handle_t *handle;
+ struct ext4_iloc iloc;
int err = 0;
- int err2;
- if (desc != NULL) {
- /* Succeeded; write the verity descriptor. */
- err = ext4_write_verity_descriptor(inode, desc, desc_size,
- merkle_tree_size);
-
- /* Write all pages before clearing VERITY_IN_PROGRESS. */
- if (!err)
- err = filemap_write_and_wait(inode->i_mapping);
- }
+ /*
+ * If an error already occurred (which fs/verity/ signals by passing
+ * desc == NULL), then only clean-up is needed.
+ */
+ if (desc == NULL)
+ goto cleanup;
- /* If we failed, truncate anything we wrote past i_size. */
- if (desc == NULL || err)
- ext4_truncate(inode);
+ /* Append the verity descriptor. */
+ err = ext4_write_verity_descriptor(inode, desc, desc_size,
+ merkle_tree_size);
+ if (err)
+ goto cleanup;
/*
- * We must always clean up by clearing EXT4_STATE_VERITY_IN_PROGRESS and
- * deleting the inode from the orphan list, even if something failed.
- * If everything succeeded, we'll also set the verity bit in the same
- * transaction.
+ * Write all pages (both data and verity metadata). Note that this must
+ * happen before clearing EXT4_STATE_VERITY_IN_PROGRESS; otherwise pages
+ * beyond i_size won't be written properly. For crash consistency, this
+ * also must happen before the verity inode flag gets persisted.
*/
+ err = filemap_write_and_wait(inode->i_mapping);
+ if (err)
+ goto cleanup;
- ext4_clear_inode_state(inode, EXT4_STATE_VERITY_IN_PROGRESS);
+ /*
+ * Finally, set the verity inode flag and remove the inode from the
+ * orphan list (in a single transaction).
+ */
handle = ext4_journal_start(inode, EXT4_HT_INODE, credits);
if (IS_ERR(handle)) {
- ext4_orphan_del(NULL, inode);
- return PTR_ERR(handle);
+ err = PTR_ERR(handle);
+ goto cleanup;
}
- err2 = ext4_orphan_del(handle, inode);
- if (err2)
- goto out_stop;
+ err = ext4_orphan_del(handle, inode);
+ if (err)
+ goto stop_and_cleanup;
- if (desc != NULL && !err) {
- struct ext4_iloc iloc;
+ err = ext4_reserve_inode_write(handle, inode, &iloc);
+ if (err)
+ goto stop_and_cleanup;
- err = ext4_reserve_inode_write(handle, inode, &iloc);
- if (err)
- goto out_stop;
- ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
- ext4_set_inode_flags(inode);
- err = ext4_mark_iloc_dirty(handle, inode, &iloc);
- }
-out_stop:
+ ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
+ ext4_set_inode_flags(inode);
+ err = ext4_mark_iloc_dirty(handle, inode, &iloc);
+ if (err)
+ goto stop_and_cleanup;
+
+ ext4_journal_stop(handle);
+
+ ext4_clear_inode_state(inode, EXT4_STATE_VERITY_IN_PROGRESS);
+ return 0;
+
+stop_and_cleanup:
ext4_journal_stop(handle);
- return err ?: err2;
+cleanup:
+ /*
+ * Verity failed to be enabled, so clean up by truncating any verity
+ * metadata that was written beyond i_size (both from cache and from
+ * disk), removing the inode from the orphan list (if it wasn't done
+ * already), and clearing EXT4_STATE_VERITY_IN_PROGRESS.
+ */
+ truncate_inode_pages(inode->i_mapping, inode->i_size);
+ ext4_truncate(inode);
+ ext4_orphan_del(NULL, inode);
+ ext4_clear_inode_state(inode, EXT4_STATE_VERITY_IN_PROGRESS);
+ return err;
}
static int ext4_get_verity_descriptor_location(struct inode *inode,
--
2.31.1.818.g46aad6cb9e-goog
Hello Greg,
On 17.04.2021 00:16:40, Alexandre Belloni wrote:
> On Wed, 10 Mar 2021 16:10:26 -0500, Francois Gervais wrote:
> > The rtc device node is always or at the very least can possibly be NULL.
> >
> > Since v5.12-rc1-dontuse/3c9ea42802a1fbf7ef29660ff8c6e526c58114f6 this
> > will lead to a NULL pointer dereference.
> >
> > To fix this we fallback to using the parent node which is the i2c client
> > node as set by devm_rtc_allocate_device().
> >
> > [...]
>
> Applied, thanks!
>
> [1/1] rtc: pcf85063: fallback to parent of_node
> commit: 03531606ef4cda25b629f500d1ffb6173b805c05
>
> I made the fallback unconditionnal because this should have been that way from
> the beginning as you point out.
can you queue this for stable, as it causes a NULL Pointer deref with
(at least) v5.12.
Thanks,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ae897fda4f507e4b239f0bdfd578b3688ca96fb4 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich(a)suse.com>
Date: Thu, 20 May 2021 13:42:42 +0200
Subject: [PATCH] x86/Xen: swap NX determination and GDT setup on BSP
xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
For this to work when NX is not available, x86_configure_nx() needs to
be called first.
[jgross] Note that this is a revert of 36104cb9012a82e73 ("x86/xen:
Delay get_cpu_cap until stack canary is established"), which is possible
now that we no longer support running as PV guest in 32-bit mode.
Cc: <stable.vger.kernel.org> # 5.9
Fixes: 36104cb9012a82e73 ("x86/xen: Delay get_cpu_cap until stack canary is established")
Reported-by: Olaf Hering <olaf(a)aepfle.de>
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/12a866b0-9e89-59f7-ebeb-a2a6cec0987a@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 17503fed2017..e87699aa2dc8 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1273,16 +1273,16 @@ asmlinkage __visible void __init xen_start_kernel(void)
/* Get mfn list */
xen_build_dynamic_phys_to_machine();
+ /* Work out if we support NX */
+ get_cpu_cap(&boot_cpu_data);
+ x86_configure_nx();
+
/*
* Set up kernel GDT and segment registers, mainly so that
* -fstack-protector code can be executed.
*/
xen_setup_gdt(0);
- /* Work out if we support NX */
- get_cpu_cap(&boot_cpu_data);
- x86_configure_nx();
-
/* Determine virtual and physical address sizes */
get_cpu_address_sizes(&boot_cpu_data);
From: xinhui pan <xinhui.pan(a)amd.com>
[ Upstream commit ad2c28bd9a4083816fa45a7e90c2486cde8a9873 ]
BO would be added into swap list if it is validated into system domain.
If BO is validated again into non-system domain, say, VRAM domain. It
actually should not be in the swap list.
Signed-off-by: xinhui pan <xinhui.pan(a)amd.com>
Acked-by: Guchun Chen <guchun.chen(a)amd.com>
Acked-by: Alex Deucher <alexander.deucher(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210224032808.150465-1-xinhu…
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 101a68dc615b..799ec7a7caa4 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -153,6 +153,8 @@ void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
swap = &ttm_bo_glob.swap_lru[bo->priority];
list_move_tail(&bo->swap, swap);
+ } else {
+ list_del_init(&bo->swap);
}
if (bdev->driver->del_from_lru_notify)
--
2.30.2
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Control transfers without a data stage are treated as OUT requests by
the USB stack and should be using usb_sndctrlpipe(). Failing to do so
will now trigger a warning.
Fix the single zero-length control request which was using the
read-register helper, and update the helper so that zero-length reads
fail with an error message instead.
Fixes: 6a7eba24e4f0 ("V4L/DVB (8157): gspca: all subdrivers")
Cc: stable(a)vger.kernel.org # 2.6.27
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/media/usb/gspca/sunplus.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/media/usb/gspca/sunplus.c b/drivers/media/usb/gspca/sunplus.c
index ace3da40006e..971dee0a56da 100644
--- a/drivers/media/usb/gspca/sunplus.c
+++ b/drivers/media/usb/gspca/sunplus.c
@@ -242,6 +242,10 @@ static void reg_r(struct gspca_dev *gspca_dev,
gspca_err(gspca_dev, "reg_r: buffer overflow\n");
return;
}
+ if (len == 0) {
+ gspca_err(gspca_dev, "reg_r: zero-length read\n");
+ return;
+ }
if (gspca_dev->usb_err < 0)
return;
ret = usb_control_msg(gspca_dev->dev,
@@ -250,7 +254,7 @@ static void reg_r(struct gspca_dev *gspca_dev,
USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
0, /* value */
index,
- len ? gspca_dev->usb_buf : NULL, len,
+ gspca_dev->usb_buf, len,
500);
if (ret < 0) {
pr_err("reg_r err %d\n", ret);
@@ -727,7 +731,7 @@ static int sd_start(struct gspca_dev *gspca_dev)
case MegaImageVI:
reg_w_riv(gspca_dev, 0xf0, 0, 0);
spca504B_WaitCmdStatus(gspca_dev);
- reg_r(gspca_dev, 0xf0, 4, 0);
+ reg_w_riv(gspca_dev, 0xf0, 4, 0);
spca504B_WaitCmdStatus(gspca_dev);
break;
default:
--
2.26.3
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4ba50e7c423c29639878c00573288869aa627068 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich(a)suse.com>
Date: Tue, 18 May 2021 18:13:42 +0200
Subject: [PATCH] xen-pciback: redo VF placement in the virtual topology
The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().
Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).
Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 4162d0e7e00d..cc7450f2b2a9 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -70,7 +70,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
struct pci_dev *dev, int devid,
publish_pci_dev_cb publish_cb)
{
- int err = 0, slot, func = -1;
+ int err = 0, slot, func = PCI_FUNC(dev->devfn);
struct pci_dev_entry *t, *dev_entry;
struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
/*
* Keep multi-function devices together on the virtual PCI bus, except
- * virtual functions.
+ * that we want to keep virtual functions at func 0 on their own. They
+ * aren't multi-function devices and hence their presence at func 0
+ * may cause guests to not scan the other functions.
*/
- if (!dev->is_virtfn) {
+ if (!dev->is_virtfn || func) {
for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
if (list_empty(&vpci_dev->dev_list[slot]))
continue;
t = list_entry(list_first(&vpci_dev->dev_list[slot]),
struct pci_dev_entry, list);
+ if (t->dev->is_virtfn && !PCI_FUNC(t->dev->devfn))
+ continue;
if (match_slot(dev, t->dev)) {
dev_info(&dev->dev, "vpci: assign to virtual slot %d func %d\n",
- slot, PCI_FUNC(dev->devfn));
+ slot, func);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = PCI_FUNC(dev->devfn);
goto unlock;
}
}
@@ -123,7 +126,6 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
slot);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn);
goto unlock;
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4ba50e7c423c29639878c00573288869aa627068 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich(a)suse.com>
Date: Tue, 18 May 2021 18:13:42 +0200
Subject: [PATCH] xen-pciback: redo VF placement in the virtual topology
The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().
Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).
Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 4162d0e7e00d..cc7450f2b2a9 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -70,7 +70,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
struct pci_dev *dev, int devid,
publish_pci_dev_cb publish_cb)
{
- int err = 0, slot, func = -1;
+ int err = 0, slot, func = PCI_FUNC(dev->devfn);
struct pci_dev_entry *t, *dev_entry;
struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
/*
* Keep multi-function devices together on the virtual PCI bus, except
- * virtual functions.
+ * that we want to keep virtual functions at func 0 on their own. They
+ * aren't multi-function devices and hence their presence at func 0
+ * may cause guests to not scan the other functions.
*/
- if (!dev->is_virtfn) {
+ if (!dev->is_virtfn || func) {
for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
if (list_empty(&vpci_dev->dev_list[slot]))
continue;
t = list_entry(list_first(&vpci_dev->dev_list[slot]),
struct pci_dev_entry, list);
+ if (t->dev->is_virtfn && !PCI_FUNC(t->dev->devfn))
+ continue;
if (match_slot(dev, t->dev)) {
dev_info(&dev->dev, "vpci: assign to virtual slot %d func %d\n",
- slot, PCI_FUNC(dev->devfn));
+ slot, func);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = PCI_FUNC(dev->devfn);
goto unlock;
}
}
@@ -123,7 +126,6 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
slot);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn);
goto unlock;
}
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4ba50e7c423c29639878c00573288869aa627068 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich(a)suse.com>
Date: Tue, 18 May 2021 18:13:42 +0200
Subject: [PATCH] xen-pciback: redo VF placement in the virtual topology
The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().
Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).
Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 4162d0e7e00d..cc7450f2b2a9 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -70,7 +70,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
struct pci_dev *dev, int devid,
publish_pci_dev_cb publish_cb)
{
- int err = 0, slot, func = -1;
+ int err = 0, slot, func = PCI_FUNC(dev->devfn);
struct pci_dev_entry *t, *dev_entry;
struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
/*
* Keep multi-function devices together on the virtual PCI bus, except
- * virtual functions.
+ * that we want to keep virtual functions at func 0 on their own. They
+ * aren't multi-function devices and hence their presence at func 0
+ * may cause guests to not scan the other functions.
*/
- if (!dev->is_virtfn) {
+ if (!dev->is_virtfn || func) {
for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
if (list_empty(&vpci_dev->dev_list[slot]))
continue;
t = list_entry(list_first(&vpci_dev->dev_list[slot]),
struct pci_dev_entry, list);
+ if (t->dev->is_virtfn && !PCI_FUNC(t->dev->devfn))
+ continue;
if (match_slot(dev, t->dev)) {
dev_info(&dev->dev, "vpci: assign to virtual slot %d func %d\n",
- slot, PCI_FUNC(dev->devfn));
+ slot, func);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = PCI_FUNC(dev->devfn);
goto unlock;
}
}
@@ -123,7 +126,6 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
slot);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn);
goto unlock;
}
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4ba50e7c423c29639878c00573288869aa627068 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich(a)suse.com>
Date: Tue, 18 May 2021 18:13:42 +0200
Subject: [PATCH] xen-pciback: redo VF placement in the virtual topology
The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().
Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).
Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 4162d0e7e00d..cc7450f2b2a9 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -70,7 +70,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
struct pci_dev *dev, int devid,
publish_pci_dev_cb publish_cb)
{
- int err = 0, slot, func = -1;
+ int err = 0, slot, func = PCI_FUNC(dev->devfn);
struct pci_dev_entry *t, *dev_entry;
struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
/*
* Keep multi-function devices together on the virtual PCI bus, except
- * virtual functions.
+ * that we want to keep virtual functions at func 0 on their own. They
+ * aren't multi-function devices and hence their presence at func 0
+ * may cause guests to not scan the other functions.
*/
- if (!dev->is_virtfn) {
+ if (!dev->is_virtfn || func) {
for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
if (list_empty(&vpci_dev->dev_list[slot]))
continue;
t = list_entry(list_first(&vpci_dev->dev_list[slot]),
struct pci_dev_entry, list);
+ if (t->dev->is_virtfn && !PCI_FUNC(t->dev->devfn))
+ continue;
if (match_slot(dev, t->dev)) {
dev_info(&dev->dev, "vpci: assign to virtual slot %d func %d\n",
- slot, PCI_FUNC(dev->devfn));
+ slot, func);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = PCI_FUNC(dev->devfn);
goto unlock;
}
}
@@ -123,7 +126,6 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
slot);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn);
goto unlock;
}
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4ba50e7c423c29639878c00573288869aa627068 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich(a)suse.com>
Date: Tue, 18 May 2021 18:13:42 +0200
Subject: [PATCH] xen-pciback: redo VF placement in the virtual topology
The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().
Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).
Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich(a)suse.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 4162d0e7e00d..cc7450f2b2a9 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -70,7 +70,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
struct pci_dev *dev, int devid,
publish_pci_dev_cb publish_cb)
{
- int err = 0, slot, func = -1;
+ int err = 0, slot, func = PCI_FUNC(dev->devfn);
struct pci_dev_entry *t, *dev_entry;
struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
/*
* Keep multi-function devices together on the virtual PCI bus, except
- * virtual functions.
+ * that we want to keep virtual functions at func 0 on their own. They
+ * aren't multi-function devices and hence their presence at func 0
+ * may cause guests to not scan the other functions.
*/
- if (!dev->is_virtfn) {
+ if (!dev->is_virtfn || func) {
for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
if (list_empty(&vpci_dev->dev_list[slot]))
continue;
t = list_entry(list_first(&vpci_dev->dev_list[slot]),
struct pci_dev_entry, list);
+ if (t->dev->is_virtfn && !PCI_FUNC(t->dev->devfn))
+ continue;
if (match_slot(dev, t->dev)) {
dev_info(&dev->dev, "vpci: assign to virtual slot %d func %d\n",
- slot, PCI_FUNC(dev->devfn));
+ slot, func);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = PCI_FUNC(dev->devfn);
goto unlock;
}
}
@@ -123,7 +126,6 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
slot);
list_add_tail(&dev_entry->list,
&vpci_dev->dev_list[slot]);
- func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn);
goto unlock;
}
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cabb1bb60e88ccaaa122ba01862403cd44e8e8f8 Mon Sep 17 00:00:00 2001
From: Neil Armstrong <narmstrong(a)baylibre.com>
Date: Mon, 26 Apr 2021 19:55:58 +0200
Subject: [PATCH] mmc: meson-gx: make replace WARN_ONCE with dev_warn_once
about scatterlist offset alignment
Some drivers like ath10k can sometimg give an sg buffer with an offset whose alignment
is not compatible with the Amlogic DMA descriptor engine requirements.
Simply replace with dev_warn_once() to inform user this should be fixed to avoid
degraded performance.
This should be ultimately fixed in ath10k, but since it's only a performance issue
the warning should be removed.
Fixes: 79ed05e329c3 ("mmc: meson-gx: add support for descriptor chain mode")
Cc: stable(a)vger.kernel.org
Reported-by: Christian Hewitt <christianshewitt(a)gmail.com>
Signed-off-by: Neil Armstrong <narmstrong(a)baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Link: https://lore.kernel.org/r/20210426175559.3110575-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index b8b771b643cc..1c61f0f24c09 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -258,7 +258,9 @@ static void meson_mmc_get_transfer_mode(struct mmc_host *mmc,
for_each_sg(data->sg, sg, data->sg_len, i) {
/* check for 8 byte alignment */
if (sg->offset % 8) {
- WARN_ONCE(1, "unaligned scatterlist buffer\n");
+ dev_warn_once(mmc_dev(mmc),
+ "unaligned sg offset %u, disabling descriptor DMA for transfer\n",
+ sg->offset);
return;
}
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cabb1bb60e88ccaaa122ba01862403cd44e8e8f8 Mon Sep 17 00:00:00 2001
From: Neil Armstrong <narmstrong(a)baylibre.com>
Date: Mon, 26 Apr 2021 19:55:58 +0200
Subject: [PATCH] mmc: meson-gx: make replace WARN_ONCE with dev_warn_once
about scatterlist offset alignment
Some drivers like ath10k can sometimg give an sg buffer with an offset whose alignment
is not compatible with the Amlogic DMA descriptor engine requirements.
Simply replace with dev_warn_once() to inform user this should be fixed to avoid
degraded performance.
This should be ultimately fixed in ath10k, but since it's only a performance issue
the warning should be removed.
Fixes: 79ed05e329c3 ("mmc: meson-gx: add support for descriptor chain mode")
Cc: stable(a)vger.kernel.org
Reported-by: Christian Hewitt <christianshewitt(a)gmail.com>
Signed-off-by: Neil Armstrong <narmstrong(a)baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Link: https://lore.kernel.org/r/20210426175559.3110575-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index b8b771b643cc..1c61f0f24c09 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -258,7 +258,9 @@ static void meson_mmc_get_transfer_mode(struct mmc_host *mmc,
for_each_sg(data->sg, sg, data->sg_len, i) {
/* check for 8 byte alignment */
if (sg->offset % 8) {
- WARN_ONCE(1, "unaligned scatterlist buffer\n");
+ dev_warn_once(mmc_dev(mmc),
+ "unaligned sg offset %u, disabling descriptor DMA for transfer\n",
+ sg->offset);
return;
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cabb1bb60e88ccaaa122ba01862403cd44e8e8f8 Mon Sep 17 00:00:00 2001
From: Neil Armstrong <narmstrong(a)baylibre.com>
Date: Mon, 26 Apr 2021 19:55:58 +0200
Subject: [PATCH] mmc: meson-gx: make replace WARN_ONCE with dev_warn_once
about scatterlist offset alignment
Some drivers like ath10k can sometimg give an sg buffer with an offset whose alignment
is not compatible with the Amlogic DMA descriptor engine requirements.
Simply replace with dev_warn_once() to inform user this should be fixed to avoid
degraded performance.
This should be ultimately fixed in ath10k, but since it's only a performance issue
the warning should be removed.
Fixes: 79ed05e329c3 ("mmc: meson-gx: add support for descriptor chain mode")
Cc: stable(a)vger.kernel.org
Reported-by: Christian Hewitt <christianshewitt(a)gmail.com>
Signed-off-by: Neil Armstrong <narmstrong(a)baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Link: https://lore.kernel.org/r/20210426175559.3110575-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index b8b771b643cc..1c61f0f24c09 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -258,7 +258,9 @@ static void meson_mmc_get_transfer_mode(struct mmc_host *mmc,
for_each_sg(data->sg, sg, data->sg_len, i) {
/* check for 8 byte alignment */
if (sg->offset % 8) {
- WARN_ONCE(1, "unaligned scatterlist buffer\n");
+ dev_warn_once(mmc_dev(mmc),
+ "unaligned sg offset %u, disabling descriptor DMA for transfer\n",
+ sg->offset);
return;
}
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cabb1bb60e88ccaaa122ba01862403cd44e8e8f8 Mon Sep 17 00:00:00 2001
From: Neil Armstrong <narmstrong(a)baylibre.com>
Date: Mon, 26 Apr 2021 19:55:58 +0200
Subject: [PATCH] mmc: meson-gx: make replace WARN_ONCE with dev_warn_once
about scatterlist offset alignment
Some drivers like ath10k can sometimg give an sg buffer with an offset whose alignment
is not compatible with the Amlogic DMA descriptor engine requirements.
Simply replace with dev_warn_once() to inform user this should be fixed to avoid
degraded performance.
This should be ultimately fixed in ath10k, but since it's only a performance issue
the warning should be removed.
Fixes: 79ed05e329c3 ("mmc: meson-gx: add support for descriptor chain mode")
Cc: stable(a)vger.kernel.org
Reported-by: Christian Hewitt <christianshewitt(a)gmail.com>
Signed-off-by: Neil Armstrong <narmstrong(a)baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Link: https://lore.kernel.org/r/20210426175559.3110575-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index b8b771b643cc..1c61f0f24c09 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -258,7 +258,9 @@ static void meson_mmc_get_transfer_mode(struct mmc_host *mmc,
for_each_sg(data->sg, sg, data->sg_len, i) {
/* check for 8 byte alignment */
if (sg->offset % 8) {
- WARN_ONCE(1, "unaligned scatterlist buffer\n");
+ dev_warn_once(mmc_dev(mmc),
+ "unaligned sg offset %u, disabling descriptor DMA for transfer\n",
+ sg->offset);
return;
}
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 0b0226be3a52dadd965644bc52a807961c2c26df Mon Sep 17 00:00:00 2001
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Date: Sun, 9 May 2021 09:13:12 +0200
Subject: [PATCH] uio_hv_generic: Fix another memory leak in error handling
paths
Memory allocated by 'vmbus_alloc_ring()' at the beginning of the probe
function is never freed in the error handling path.
Add the missing 'vmbus_free_ring()' call.
Note that it is already freed in the .remove function.
Fixes: cdfa835c6e5e ("uio_hv_generic: defer opening vmbus until first use")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Link: https://lore.kernel.org/r/0d86027b8eeed8e6360bc3d52bcdb328ff9bdca1.16205440…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index eebc399f2cc7..652fe2547587 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -291,7 +291,7 @@ hv_uio_probe(struct hv_device *dev,
pdata->recv_buf = vzalloc(RECV_BUFFER_SIZE);
if (pdata->recv_buf == NULL) {
ret = -ENOMEM;
- goto fail_close;
+ goto fail_free_ring;
}
ret = vmbus_establish_gpadl(channel, pdata->recv_buf,
@@ -351,6 +351,8 @@ hv_uio_probe(struct hv_device *dev,
fail_close:
hv_uio_cleanup(dev, pdata);
+fail_free_ring:
+ vmbus_free_ring(dev->channel);
return ret;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 0b0226be3a52dadd965644bc52a807961c2c26df Mon Sep 17 00:00:00 2001
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Date: Sun, 9 May 2021 09:13:12 +0200
Subject: [PATCH] uio_hv_generic: Fix another memory leak in error handling
paths
Memory allocated by 'vmbus_alloc_ring()' at the beginning of the probe
function is never freed in the error handling path.
Add the missing 'vmbus_free_ring()' call.
Note that it is already freed in the .remove function.
Fixes: cdfa835c6e5e ("uio_hv_generic: defer opening vmbus until first use")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Link: https://lore.kernel.org/r/0d86027b8eeed8e6360bc3d52bcdb328ff9bdca1.16205440…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index eebc399f2cc7..652fe2547587 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -291,7 +291,7 @@ hv_uio_probe(struct hv_device *dev,
pdata->recv_buf = vzalloc(RECV_BUFFER_SIZE);
if (pdata->recv_buf == NULL) {
ret = -ENOMEM;
- goto fail_close;
+ goto fail_free_ring;
}
ret = vmbus_establish_gpadl(channel, pdata->recv_buf,
@@ -351,6 +351,8 @@ hv_uio_probe(struct hv_device *dev,
fail_close:
hv_uio_cleanup(dev, pdata);
+fail_free_ring:
+ vmbus_free_ring(dev->channel);
return ret;
}
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Fix the three requests which erroneously used usb_rcvctrlpipe().
Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver")
Cc: stable(a)vger.kernel.org # 3.5
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
Changes in v2
- fix also the request in attach
drivers/usb/serial/quatech2.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/serial/quatech2.c b/drivers/usb/serial/quatech2.c
index 3b5f2032ecdb..b0bb889f002c 100644
--- a/drivers/usb/serial/quatech2.c
+++ b/drivers/usb/serial/quatech2.c
@@ -416,7 +416,7 @@ static void qt2_close(struct usb_serial_port *port)
/* flush the port transmit buffer */
i = usb_control_msg(serial->dev,
- usb_rcvctrlpipe(serial->dev, 0),
+ usb_sndctrlpipe(serial->dev, 0),
QT2_FLUSH_DEVICE, 0x40, 1,
port_priv->device_port, NULL, 0, QT2_USB_TIMEOUT);
@@ -426,7 +426,7 @@ static void qt2_close(struct usb_serial_port *port)
/* flush the port receive buffer */
i = usb_control_msg(serial->dev,
- usb_rcvctrlpipe(serial->dev, 0),
+ usb_sndctrlpipe(serial->dev, 0),
QT2_FLUSH_DEVICE, 0x40, 0,
port_priv->device_port, NULL, 0, QT2_USB_TIMEOUT);
@@ -639,7 +639,7 @@ static int qt2_attach(struct usb_serial *serial)
int status;
/* power on unit */
- status = usb_control_msg(serial->dev, usb_rcvctrlpipe(serial->dev, 0),
+ status = usb_control_msg(serial->dev, usb_sndctrlpipe(serial->dev, 0),
0xc2, 0x40, 0x8000, 0, NULL, 0,
QT2_USB_TIMEOUT);
if (status < 0) {
--
2.26.3
During system resume, MHI host triggers M3->M0 transition and then waits
for target device to enter M0 state. Once done, the device queues a state
change event into ctrl event ring and notifies MHI host by raising an
interrupt, where a tasklet is scheduled to process this event. In most cases,
the tasklet is served timely and wait operation succeeds.
However, there are cases where CPU is busy and cannot serve this tasklet
for some time. Once delay goes long enough, the device moves itself to M1
state and also interrupts MHI host after inserting a new state change
event to ctrl ring. Later CPU finally has time to process the ring, however
there are two events in it now:
1. for M3->M0 event, which is processed first as queued first,
tasklet handler updates device state to M0 and wakes up the task,
i.e., the MHI host.
2. for M0->M1 event, which is processed later, tasklet handler
triggers M1->M2 transition and updates device state to M2 directly,
then wakes up the MHI host(if still sleeping on this wait queue).
Note that although MHI host has been woken up while processing the first
event, it may still has no chance to run before the second event is processed.
In other words, MHI host has to keep waiting till timeout cause the M0 state
has been missed.
kernel log here:
...
Apr 15 01:45:14 test-NUC8i7HVK kernel: [ 4247.911251] mhi 0000:06:00.0: Entered with PM state: M3, MHI state: M3
Apr 15 01:45:14 test-NUC8i7HVK kernel: [ 4247.917762] mhi 0000:06:00.0: State change event to state: M0
Apr 15 01:45:14 test-NUC8i7HVK kernel: [ 4247.917767] mhi 0000:06:00.0: State change event to state: M1
Apr 15 01:45:14 test-NUC8i7HVK kernel: [ 4338.788231] mhi 0000:06:00.0: Did not enter M0 state, MHI state: M2, PM state: M2
...
Fix this issue by simply adding M2 as a valid state for resume.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Fixes: 0c6b20a1d720 ("bus: mhi: core: Add support for MHI suspend and resume")
Signed-off-by: Baochen Qiang <bqiang(a)codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk(a)codeaurora.org>
---
drivers/bus/mhi/core/pm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
index e2e59a341fef..59b009a3ee9b 100644
--- a/drivers/bus/mhi/core/pm.c
+++ b/drivers/bus/mhi/core/pm.c
@@ -934,6 +934,7 @@ int mhi_pm_resume(struct mhi_controller *mhi_cntrl)
ret = wait_event_timeout(mhi_cntrl->state_event,
mhi_cntrl->dev_state == MHI_STATE_M0 ||
+ mhi_cntrl->dev_state == MHI_STATE_M2 ||
MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state),
msecs_to_jiffies(mhi_cntrl->timeout_ms));
--
2.25.1
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 71795ee590111e3636cc3c148289dfa9fa0a5fc3 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Thu, 29 Apr 2021 10:51:34 -0400
Subject: [PATCH] btrfs: avoid RCU stalls while running delayed iputs
Generally a delayed iput is added when we might do the final iput, so
usually we'll end up sleeping while processing the delayed iputs
naturally. However there's no guarantee of this, especially for small
files. In production we noticed 5 instances of RCU stalls while testing
a kernel release overnight across 1000 machines, so this is relatively
common:
host count: 5
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: ....: (20998 ticks this GP) idle=59e/1/0x4000000000000002 softirq=12333372/12333372 fqs=3208
(t=21031 jiffies g=27810193 q=41075) NMI backtrace for cpu 1
CPU: 1 PID: 1713 Comm: btrfs-cleaner Kdump: loaded Not tainted 5.6.13-0_fbk12_rc1_5520_gec92bffc1ec9 #1
Call Trace:
<IRQ> dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x30/0x65
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b2/0x3a3
? trigger_load_balance+0x5c/0x200
? tick_sched_do_timer+0x60/0x60
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20 </IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x17d/0x1b0
RSP: 0018:ffffc9000da5fe48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000000 RBX: ffff889fa81d0cd8 RCX: 0000000000000029
RDX: ffff889fff86c0c0 RSI: 0000000000080000 RDI: ffff88bfc2da7200
RBP: ffff888f2dcdd768 R08: 0000000001040000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff82a55560 R12: ffff88bfc2da7200
R13: 0000000000000000 R14: ffff88bff6c2a360 R15: ffffffff814bd870
? kzalloc.constprop.57+0x30/0x30
list_lru_add+0x5a/0x100
inode_lru_list_add+0x20/0x40
iput+0x1c1/0x1f0
run_delayed_iput_locked+0x46/0x90
btrfs_run_delayed_iputs+0x3f/0x60
cleaner_kthread+0xf2/0x120
kthread+0x10b/0x130
Fix this by adding a cond_resched_lock() to the loop processing delayed
iputs so we can avoid these sort of stalls.
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 69fcdf8f0b1c..095e452f59f0 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3246,6 +3246,7 @@ void btrfs_run_delayed_iputs(struct btrfs_fs_info *fs_info)
inode = list_first_entry(&fs_info->delayed_iputs,
struct btrfs_inode, delayed_iput);
run_delayed_iput_locked(fs_info, inode);
+ cond_resched_lock(&fs_info->delayed_iput_lock);
}
spin_unlock(&fs_info->delayed_iput_lock);
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 71795ee590111e3636cc3c148289dfa9fa0a5fc3 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Thu, 29 Apr 2021 10:51:34 -0400
Subject: [PATCH] btrfs: avoid RCU stalls while running delayed iputs
Generally a delayed iput is added when we might do the final iput, so
usually we'll end up sleeping while processing the delayed iputs
naturally. However there's no guarantee of this, especially for small
files. In production we noticed 5 instances of RCU stalls while testing
a kernel release overnight across 1000 machines, so this is relatively
common:
host count: 5
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: ....: (20998 ticks this GP) idle=59e/1/0x4000000000000002 softirq=12333372/12333372 fqs=3208
(t=21031 jiffies g=27810193 q=41075) NMI backtrace for cpu 1
CPU: 1 PID: 1713 Comm: btrfs-cleaner Kdump: loaded Not tainted 5.6.13-0_fbk12_rc1_5520_gec92bffc1ec9 #1
Call Trace:
<IRQ> dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x30/0x65
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b2/0x3a3
? trigger_load_balance+0x5c/0x200
? tick_sched_do_timer+0x60/0x60
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20 </IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x17d/0x1b0
RSP: 0018:ffffc9000da5fe48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000000 RBX: ffff889fa81d0cd8 RCX: 0000000000000029
RDX: ffff889fff86c0c0 RSI: 0000000000080000 RDI: ffff88bfc2da7200
RBP: ffff888f2dcdd768 R08: 0000000001040000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff82a55560 R12: ffff88bfc2da7200
R13: 0000000000000000 R14: ffff88bff6c2a360 R15: ffffffff814bd870
? kzalloc.constprop.57+0x30/0x30
list_lru_add+0x5a/0x100
inode_lru_list_add+0x20/0x40
iput+0x1c1/0x1f0
run_delayed_iput_locked+0x46/0x90
btrfs_run_delayed_iputs+0x3f/0x60
cleaner_kthread+0xf2/0x120
kthread+0x10b/0x130
Fix this by adding a cond_resched_lock() to the loop processing delayed
iputs so we can avoid these sort of stalls.
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 69fcdf8f0b1c..095e452f59f0 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3246,6 +3246,7 @@ void btrfs_run_delayed_iputs(struct btrfs_fs_info *fs_info)
inode = list_first_entry(&fs_info->delayed_iputs,
struct btrfs_inode, delayed_iput);
run_delayed_iput_locked(fs_info, inode);
+ cond_resched_lock(&fs_info->delayed_iput_lock);
}
spin_unlock(&fs_info->delayed_iput_lock);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 71795ee590111e3636cc3c148289dfa9fa0a5fc3 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Thu, 29 Apr 2021 10:51:34 -0400
Subject: [PATCH] btrfs: avoid RCU stalls while running delayed iputs
Generally a delayed iput is added when we might do the final iput, so
usually we'll end up sleeping while processing the delayed iputs
naturally. However there's no guarantee of this, especially for small
files. In production we noticed 5 instances of RCU stalls while testing
a kernel release overnight across 1000 machines, so this is relatively
common:
host count: 5
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: ....: (20998 ticks this GP) idle=59e/1/0x4000000000000002 softirq=12333372/12333372 fqs=3208
(t=21031 jiffies g=27810193 q=41075) NMI backtrace for cpu 1
CPU: 1 PID: 1713 Comm: btrfs-cleaner Kdump: loaded Not tainted 5.6.13-0_fbk12_rc1_5520_gec92bffc1ec9 #1
Call Trace:
<IRQ> dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x30/0x65
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b2/0x3a3
? trigger_load_balance+0x5c/0x200
? tick_sched_do_timer+0x60/0x60
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20 </IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x17d/0x1b0
RSP: 0018:ffffc9000da5fe48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000000 RBX: ffff889fa81d0cd8 RCX: 0000000000000029
RDX: ffff889fff86c0c0 RSI: 0000000000080000 RDI: ffff88bfc2da7200
RBP: ffff888f2dcdd768 R08: 0000000001040000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff82a55560 R12: ffff88bfc2da7200
R13: 0000000000000000 R14: ffff88bff6c2a360 R15: ffffffff814bd870
? kzalloc.constprop.57+0x30/0x30
list_lru_add+0x5a/0x100
inode_lru_list_add+0x20/0x40
iput+0x1c1/0x1f0
run_delayed_iput_locked+0x46/0x90
btrfs_run_delayed_iputs+0x3f/0x60
cleaner_kthread+0xf2/0x120
kthread+0x10b/0x130
Fix this by adding a cond_resched_lock() to the loop processing delayed
iputs so we can avoid these sort of stalls.
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 69fcdf8f0b1c..095e452f59f0 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3246,6 +3246,7 @@ void btrfs_run_delayed_iputs(struct btrfs_fs_info *fs_info)
inode = list_first_entry(&fs_info->delayed_iputs,
struct btrfs_inode, delayed_iput);
run_delayed_iput_locked(fs_info, inode);
+ cond_resched_lock(&fs_info->delayed_iput_lock);
}
spin_unlock(&fs_info->delayed_iput_lock);
}
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags") the use of PagePrivate to indicate a reservation count
should be restored at free time was changed to the hugetlb specific flag
HPageRestoreReserve. Changes to a userfaultfd error path as well as a
VM_BUG_ON() in remove_inode_hugepages() were overlooked.
Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation. Specifically, this would be the
result of an unlikely copy_huge_page_from_user error. There is not an
increased chance of hitting the VM_BUG_ON.
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
fs/hugetlbfs/inode.c | 2 +-
mm/userfaultfd.c | 28 ++++++++++++++--------------
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9d9e0097c1d3..55efd3dd04f6 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -529,7 +529,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
* the subpool and global reserve usage count can need
* to be adjusted.
*/
- VM_BUG_ON(PagePrivate(page));
+ VM_BUG_ON(HPageRestoreReserve(page));
remove_huge_page(page);
freed++;
if (!truncate_op) {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5508f7d9e2dc..9ce5a3793ad4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -432,38 +432,38 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
* If a reservation for the page existed in the reservation
* map of a private mapping, the map was modified to indicate
* the reservation was consumed when the page was allocated.
- * We clear the PagePrivate flag now so that the global
+ * We clear the HPageRestoreReserve flag now so that the global
* reserve count will not be incremented in free_huge_page.
* The reservation map will still indicate the reservation
* was consumed and possibly prevent later page allocation.
* This is better than leaking a global reservation. If no
- * reservation existed, it is still safe to clear PagePrivate
- * as no adjustments to reservation counts were made during
- * allocation.
+ * reservation existed, it is still safe to clear
+ * HPageRestoreReserve as no adjustments to reservation counts
+ * were made during allocation.
*
* The reservation map for shared mappings indicates which
* pages have reservations. When a huge page is allocated
* for an address with a reservation, no change is made to
- * the reserve map. In this case PagePrivate will be set
- * to indicate that the global reservation count should be
+ * the reserve map. In this case HPageRestoreReserve will be
+ * set to indicate that the global reservation count should be
* incremented when the page is freed. This is the desired
* behavior. However, when a huge page is allocated for an
* address without a reservation a reservation entry is added
- * to the reservation map, and PagePrivate will not be set.
- * When the page is freed, the global reserve count will NOT
- * be incremented and it will appear as though we have leaked
- * reserved page. In this case, set PagePrivate so that the
- * global reserve count will be incremented to match the
- * reservation map entry which was created.
+ * to the reservation map, and HPageRestoreReserve will not be
+ * set. When the page is freed, the global reserve count will
+ * NOT be incremented and it will appear as though we have
+ * leaked reserved page. In this case, set HPageRestoreReserve
+ * so that the global reserve count will be incremented to
+ * match the reservation map entry which was created.
*
* Note that vm_alloc_shared is based on the flags of the vma
* for which the page was originally allocated. dst_vma could
* be different or NULL on error.
*/
if (vm_alloc_shared)
- SetPagePrivate(page);
+ SetHPageRestoreReserve(page);
else
- ClearPagePrivate(page);
+ ClearHPageRestoreReserve(page);
put_page(page);
}
BUG_ON(copied < 0);
--
2.31.1
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: userfaultfd: hugetlbfs: fix new flag usage in error path
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags") the use of PagePrivate to indicate a reservation count should
be restored at free time was changed to the hugetlb specific flag
HPageRestoreReserve. Changes to a userfaultfd error path as well as a
VM_BUG_ON() in remove_inode_hugepages() were overlooked.
Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation. Specifically, this would be the
result of an unlikely copy_huge_page_from_user error. There is not an
increased chance of hitting the VM_BUG_ON.
Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Mina Almasry <almasry.mina(a)google.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 2 +-
mm/userfaultfd.c | 28 ++++++++++++++--------------
2 files changed, 15 insertions(+), 15 deletions(-)
--- a/fs/hugetlbfs/inode.c~userfaultfd-hugetlbfs-fix-new-flag-usage-in-error-path
+++ a/fs/hugetlbfs/inode.c
@@ -529,7 +529,7 @@ static void remove_inode_hugepages(struc
* the subpool and global reserve usage count can need
* to be adjusted.
*/
- VM_BUG_ON(PagePrivate(page));
+ VM_BUG_ON(HPageRestoreReserve(page));
remove_huge_page(page);
freed++;
if (!truncate_op) {
--- a/mm/userfaultfd.c~userfaultfd-hugetlbfs-fix-new-flag-usage-in-error-path
+++ a/mm/userfaultfd.c
@@ -360,38 +360,38 @@ out:
* If a reservation for the page existed in the reservation
* map of a private mapping, the map was modified to indicate
* the reservation was consumed when the page was allocated.
- * We clear the PagePrivate flag now so that the global
+ * We clear the HPageRestoreReserve flag now so that the global
* reserve count will not be incremented in free_huge_page.
* The reservation map will still indicate the reservation
* was consumed and possibly prevent later page allocation.
* This is better than leaking a global reservation. If no
- * reservation existed, it is still safe to clear PagePrivate
- * as no adjustments to reservation counts were made during
- * allocation.
+ * reservation existed, it is still safe to clear
+ * HPageRestoreReserve as no adjustments to reservation counts
+ * were made during allocation.
*
* The reservation map for shared mappings indicates which
* pages have reservations. When a huge page is allocated
* for an address with a reservation, no change is made to
- * the reserve map. In this case PagePrivate will be set
- * to indicate that the global reservation count should be
+ * the reserve map. In this case HPageRestoreReserve will be
+ * set to indicate that the global reservation count should be
* incremented when the page is freed. This is the desired
* behavior. However, when a huge page is allocated for an
* address without a reservation a reservation entry is added
- * to the reservation map, and PagePrivate will not be set.
- * When the page is freed, the global reserve count will NOT
- * be incremented and it will appear as though we have leaked
- * reserved page. In this case, set PagePrivate so that the
- * global reserve count will be incremented to match the
- * reservation map entry which was created.
+ * to the reservation map, and HPageRestoreReserve will not be
+ * set. When the page is freed, the global reserve count will
+ * NOT be incremented and it will appear as though we have
+ * leaked reserved page. In this case, set HPageRestoreReserve
+ * so that the global reserve count will be incremented to
+ * match the reservation map entry which was created.
*
* Note that vm_alloc_shared is based on the flags of the vma
* for which the page was originally allocated. dst_vma could
* be different or NULL on error.
*/
if (vm_alloc_shared)
- SetPagePrivate(page);
+ SetHPageRestoreReserve(page);
else
- ClearPagePrivate(page);
+ ClearHPageRestoreReserve(page);
put_page(page);
}
BUG_ON(copied < 0);
_
From: Varad Gautam <varad.gautam(a)suse.com>
Subject: ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
do_mq_timedreceive calls wq_sleep with a stack local address. The sender
(do_mq_timedsend) uses this address to later call pipelined_send.
This leads to a very hard to trigger race where a do_mq_timedreceive call
might return and leave do_mq_timedsend to rely on an invalid address,
causing the following crash:
[ 240.739977] RIP: 0010:wake_q_add_safe+0x13/0x60
[ 240.739991] Call Trace:
[ 240.739999] __x64_sys_mq_timedsend+0x2a9/0x490
[ 240.740003] ? auditd_test_task+0x38/0x40
[ 240.740007] ? auditd_test_task+0x38/0x40
[ 240.740011] do_syscall_64+0x80/0x680
[ 240.740017] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 240.740019] RIP: 0033:0x7f5928e40343
The race occurs as:
1. do_mq_timedreceive calls wq_sleep with the address of `struct
ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it
holds a valid `struct ext_wait_queue *` as long as the stack has not
been overwritten.
2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and
do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
__pipelined_op.
3. Sender calls __pipelined_op::smp_store_release(&this->state,
STATE_READY). Here is where the race window begins. (`this` is
`ewq_addr`.)
4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it
will see `state == STATE_READY` and break.
5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed
to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's
stack. (Although the address may not get overwritten until another
function happens to touch it, which means it can persist around for an
indefinite time.)
6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a
`struct ext_wait_queue *`, and uses it to find a task_struct to pass to
the wake_q_add_safe call. In the lucky case where nothing has
overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct.
In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a
bogus address as the receiver's task_struct causing the crash.
do_mq_timedsend::__pipelined_op() should not dereference `this` after
setting STATE_READY, as the receiver counterpart is now free to return.
Change __pipelined_op to call wake_q_add_safe on the receiver's
task_struct returned by get_task_struct, instead of dereferencing `this`
which sits on the receiver's stack.
As Manfred pointed out, the race potentially also exists in
ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare. Fix
those in the same way.
Link: https://lkml.kernel.org/r/20210510102950.12551-1-varad.gautam@suse.com
Fixes: c5b2cbdbdac563 ("ipc/mqueue.c: update/document memory barriers")
Fixes: 8116b54e7e23ef ("ipc/sem.c: document and update memory barriers")
Fixes: 0d97a82ba830d8 ("ipc/msg.c: update and document memory barriers")
Signed-off-by: Varad Gautam <varad.gautam(a)suse.com>
Reported-by: Matthias von Faber <matthias.vonfaber(a)aox-tech.de>
Acked-by: Davidlohr Bueso <dbueso(a)suse.de>
Acked-by: Manfred Spraul <manfred(a)colorfullife.com>
Cc: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: "Eric W. Biederman" <ebiederm(a)xmission.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
ipc/mqueue.c | 6 ++++--
ipc/msg.c | 6 ++++--
ipc/sem.c | 6 ++++--
3 files changed, 12 insertions(+), 6 deletions(-)
--- a/ipc/mqueue.c~ipc-mqueue-msg-sem-avoid-relying-on-a-stack-reference-past-its-expiry
+++ a/ipc/mqueue.c
@@ -1004,12 +1004,14 @@ static inline void __pipelined_op(struct
struct mqueue_inode_info *info,
struct ext_wait_queue *this)
{
+ struct task_struct *task;
+
list_del(&this->list);
- get_task_struct(this->task);
+ task = get_task_struct(this->task);
/* see MQ_BARRIER for purpose/pairing */
smp_store_release(&this->state, STATE_READY);
- wake_q_add_safe(wake_q, this->task);
+ wake_q_add_safe(wake_q, task);
}
/* pipelined_send() - send a message directly to the task waiting in
--- a/ipc/msg.c~ipc-mqueue-msg-sem-avoid-relying-on-a-stack-reference-past-its-expiry
+++ a/ipc/msg.c
@@ -251,11 +251,13 @@ static void expunge_all(struct msg_queue
struct msg_receiver *msr, *t;
list_for_each_entry_safe(msr, t, &msq->q_receivers, r_list) {
- get_task_struct(msr->r_tsk);
+ struct task_struct *r_tsk;
+
+ r_tsk = get_task_struct(msr->r_tsk);
/* see MSG_BARRIER for purpose/pairing */
smp_store_release(&msr->r_msg, ERR_PTR(res));
- wake_q_add_safe(wake_q, msr->r_tsk);
+ wake_q_add_safe(wake_q, r_tsk);
}
}
--- a/ipc/sem.c~ipc-mqueue-msg-sem-avoid-relying-on-a-stack-reference-past-its-expiry
+++ a/ipc/sem.c
@@ -784,12 +784,14 @@ would_block:
static inline void wake_up_sem_queue_prepare(struct sem_queue *q, int error,
struct wake_q_head *wake_q)
{
- get_task_struct(q->sleeper);
+ struct task_struct *sleeper;
+
+ sleeper = get_task_struct(q->sleeper);
/* see SEM_BARRIER_2 for purpose/pairing */
smp_store_release(&q->status, error);
- wake_q_add_safe(wake_q, q->sleeper);
+ wake_q_add_safe(wake_q, sleeper);
}
static void unlink_queue(struct sem_array *sma, struct sem_queue *q)
_
From: Michal Hocko <mhocko(a)suse.com>
Subject: Revert "mm/gup: check page posion status for coredump."
While reviewing
http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com I have
crossed d3378e86d182 ("mm/gup: check page posion status for coredump.")
and noticed that this patch is broken in two ways. First it doesn't
really prevent hwpoison pages from being dumped because hwpoison pages can
be marked asynchornously at any time after the check. Secondly, and more
importantly, the patch introduces a ref count leak because get_dump_page
takes a reference on the page which is not released.
It also seems that the patch was merged incorrectly because there were
follow up changes not included as well as discussions on how to address
the underlying problem
http://lkml.kernel.org/r/57ac524c-b49a-99ec-c1e4-ef5027bfb61b@redhat.com
Therefore revert the original patch.
Link: https://lkml.kernel.org/r/20210505135407.31590-1-mhocko@kernel.org
Fixes: d3378e86d182 ("mm/gup: check page posion status for coredump.")
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Aili Yao <yaoaili(a)kingsoft.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 4 ----
mm/internal.h | 20 --------------------
2 files changed, 24 deletions(-)
--- a/mm/gup.c~revert-mm-gup-check-page-posion-status-for-coredump
+++ a/mm/gup.c
@@ -1593,10 +1593,6 @@ struct page *get_dump_page(unsigned long
FOLL_FORCE | FOLL_DUMP | FOLL_GET);
if (locked)
mmap_read_unlock(mm);
-
- if (ret == 1 && is_page_poisoned(page))
- return NULL;
-
return (ret == 1) ? page : NULL;
}
#endif /* CONFIG_ELF_CORE */
--- a/mm/internal.h~revert-mm-gup-check-page-posion-status-for-coredump
+++ a/mm/internal.h
@@ -96,26 +96,6 @@ static inline void set_page_refcounted(s
set_page_count(page, 1);
}
-/*
- * When kernel touch the user page, the user page may be have been marked
- * poison but still mapped in user space, if without this page, the kernel
- * can guarantee the data integrity and operation success, the kernel is
- * better to check the posion status and avoid touching it, be good not to
- * panic, coredump for process fatal signal is a sample case matching this
- * scenario. Or if kernel can't guarantee the data integrity, it's better
- * not to call this function, let kernel touch the poison page and get to
- * panic.
- */
-static inline bool is_page_poisoned(struct page *page)
-{
- if (PageHWPoison(page))
- return true;
- else if (PageHuge(page) && PageHWPoison(compound_head(page)))
- return true;
-
- return false;
-}
-
extern unsigned long highest_memmap_pfn;
/*
_
The patch titled
Subject: kfence: use TASK_IDLE when awaiting allocation
has been added to the -mm tree. Its filename is
kfence-use-task_idle-when-awaiting-allocation.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/kfence-use-task_idle-when-awaitin…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/kfence-use-task_idle-when-awaitin…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Marco Elver <elver(a)google.com>
Subject: kfence: use TASK_IDLE when awaiting allocation
Since wait_event() uses TASK_UNINTERRUPTIBLE by default, waiting for an
allocation counts towards load. However, for KFENCE, this does not make
any sense, since there is no busy work we're awaiting.
Instead, use TASK_IDLE via wait_event_idle() to not count towards load.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1185565
Link: https://lkml.kernel.org/r/20210521083209.3740269-1-elver@google.com
Fixes: 407f1d8c1b5f ("kfence: await for allocation using wait_event")
Signed-off-by: Marco Elver <elver(a)google.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: David Laight <David.Laight(a)ACULAB.COM>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: <stable(a)vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kfence/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/kfence/core.c~kfence-use-task_idle-when-awaiting-allocation
+++ a/mm/kfence/core.c
@@ -627,10 +627,10 @@ static void toggle_allocation_gate(struc
* During low activity with no allocations we might wait a
* while; let's avoid the hung task warning.
*/
- wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
- sysctl_hung_task_timeout_secs * HZ / 2);
+ wait_event_idle_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
+ sysctl_hung_task_timeout_secs * HZ / 2);
} else {
- wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
+ wait_event_idle(allocation_wait, atomic_read(&kfence_allocation_gate));
}
/* Disable static key and reset timer. */
_
Patches currently in -mm which might be from elver(a)google.com are
kfence-use-task_idle-when-awaiting-allocation.patch
mm-slub-change-run-time-assertion-in-kmalloc_index-to-compile-time-fix.patch
printk-introduce-dump_stack_lvl-fix.patch
kfence-unconditionally-use-unbound-work-queue.patch
The patch titled
Subject: userfaultfd: hugetlbfs: fix new flag usage in error path
has been added to the -mm tree. Its filename is
userfaultfd-hugetlbfs-fix-new-flag-usage-in-error-path.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/userfaultfd-hugetlbfs-fix-new-fla…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/userfaultfd-hugetlbfs-fix-new-fla…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: userfaultfd: hugetlbfs: fix new flag usage in error path
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags") the use of PagePrivate to indicate a reservation count should
be restored at free time was changed to the hugetlb specific flag
HPageRestoreReserve. Changes to a userfaultfd error path as well as a
VM_BUG_ON() in remove_inode_hugepages() were overlooked.
Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation. Specifically, this would be the
result of an unlikely copy_huge_page_from_user error. There is not an
increased chance of hitting the VM_BUG_ON.
Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Mina Almasry <almasry.mina(a)google.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 2 +-
mm/userfaultfd.c | 28 ++++++++++++++--------------
2 files changed, 15 insertions(+), 15 deletions(-)
--- a/fs/hugetlbfs/inode.c~userfaultfd-hugetlbfs-fix-new-flag-usage-in-error-path
+++ a/fs/hugetlbfs/inode.c
@@ -529,7 +529,7 @@ static void remove_inode_hugepages(struc
* the subpool and global reserve usage count can need
* to be adjusted.
*/
- VM_BUG_ON(PagePrivate(page));
+ VM_BUG_ON(HPageRestoreReserve(page));
remove_huge_page(page);
freed++;
if (!truncate_op) {
--- a/mm/userfaultfd.c~userfaultfd-hugetlbfs-fix-new-flag-usage-in-error-path
+++ a/mm/userfaultfd.c
@@ -360,38 +360,38 @@ out:
* If a reservation for the page existed in the reservation
* map of a private mapping, the map was modified to indicate
* the reservation was consumed when the page was allocated.
- * We clear the PagePrivate flag now so that the global
+ * We clear the HPageRestoreReserve flag now so that the global
* reserve count will not be incremented in free_huge_page.
* The reservation map will still indicate the reservation
* was consumed and possibly prevent later page allocation.
* This is better than leaking a global reservation. If no
- * reservation existed, it is still safe to clear PagePrivate
- * as no adjustments to reservation counts were made during
- * allocation.
+ * reservation existed, it is still safe to clear
+ * HPageRestoreReserve as no adjustments to reservation counts
+ * were made during allocation.
*
* The reservation map for shared mappings indicates which
* pages have reservations. When a huge page is allocated
* for an address with a reservation, no change is made to
- * the reserve map. In this case PagePrivate will be set
- * to indicate that the global reservation count should be
+ * the reserve map. In this case HPageRestoreReserve will be
+ * set to indicate that the global reservation count should be
* incremented when the page is freed. This is the desired
* behavior. However, when a huge page is allocated for an
* address without a reservation a reservation entry is added
- * to the reservation map, and PagePrivate will not be set.
- * When the page is freed, the global reserve count will NOT
- * be incremented and it will appear as though we have leaked
- * reserved page. In this case, set PagePrivate so that the
- * global reserve count will be incremented to match the
- * reservation map entry which was created.
+ * to the reservation map, and HPageRestoreReserve will not be
+ * set. When the page is freed, the global reserve count will
+ * NOT be incremented and it will appear as though we have
+ * leaked reserved page. In this case, set HPageRestoreReserve
+ * so that the global reserve count will be incremented to
+ * match the reservation map entry which was created.
*
* Note that vm_alloc_shared is based on the flags of the vma
* for which the page was originally allocated. dst_vma could
* be different or NULL on error.
*/
if (vm_alloc_shared)
- SetPagePrivate(page);
+ SetHPageRestoreReserve(page);
else
- ClearPagePrivate(page);
+ ClearHPageRestoreReserve(page);
put_page(page);
}
BUG_ON(copied < 0);
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
userfaultfd-hugetlbfs-fix-new-flag-usage-in-error-path.patch