This is a note to let you know that I've just added the patch titled
w1: w1_therm: Fix conversion result for negative temperatures
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 2f6055c26f1913763eabc66c7c27d0693561e966 Mon Sep 17 00:00:00 2001
From: Ivan Zaentsev <ivan.zaentsev(a)wirenboard.ru>
Date: Thu, 21 Jan 2021 12:30:21 +0300
Subject: w1: w1_therm: Fix conversion result for negative temperatures
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
DS18B20 device driver returns an incorrect value for negative temperatures
due to a missing sign-extension in w1_DS18B20_convert_temp().
Fix by using s16 temperature value when converting to int.
Fixes: 9ace0b4dab1c (w1: w1_therm: Add support for GXCAS GX20MH01 device.)
Cc: stable <stable(a)vger.kernel.org>
Reported-by: Paweł Marciniak <sunwire(a)gmail.com>
Signed-off-by: Ivan Zaentsev <ivan.zaentsev(a)wirenboard.ru>
Link: https://lore.kernel.org/r/20210121093021.224764-1-ivan.zaentsev@wirenboard.…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/w1/slaves/w1_therm.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/w1/slaves/w1_therm.c b/drivers/w1/slaves/w1_therm.c
index 3712b1e6dc71..976eea28f268 100644
--- a/drivers/w1/slaves/w1_therm.c
+++ b/drivers/w1/slaves/w1_therm.c
@@ -667,28 +667,24 @@ static inline int w1_DS18B20_get_resolution(struct w1_slave *sl)
*/
static inline int w1_DS18B20_convert_temp(u8 rom[9])
{
- int t;
- u32 bv;
+ u16 bv;
+ s16 t;
+
+ /* Signed 16-bit value to unsigned, cpu order */
+ bv = le16_to_cpup((__le16 *)rom);
/* Config register bit R2 = 1 - GX20MH01 in 13 or 14 bit resolution mode */
if (rom[4] & 0x80) {
- /* Signed 16-bit value to unsigned, cpu order */
- bv = le16_to_cpup((__le16 *)rom);
-
/* Insert two temperature bits from config register */
/* Avoid arithmetic shift of signed value */
bv = (bv << 2) | (rom[4] & 3);
-
- t = (int) sign_extend32(bv, 17); /* Degrees, lowest bit is 2^-6 */
- return (t*1000)/64; /* Millidegrees */
+ t = (s16) bv; /* Degrees, lowest bit is 2^-6 */
+ return (int)t * 1000 / 64; /* Sign-extend to int; millidegrees */
}
-
- t = (int)le16_to_cpup((__le16 *)rom);
- return t*1000/16;
+ t = (s16)bv; /* Degrees, lowest bit is 2^-4 */
+ return (int)t * 1000 / 16; /* Sign-extend to int; millidegrees */
}
-
-
/**
* w1_DS18S20_convert_temp() - temperature computation for DS18S20
* @rom: data read from device RAM (8 data bytes + 1 CRC byte)
--
2.30.0
This is a note to let you know that I've just added the patch titled
virt: vbox: Do not use wait_event_interruptible when called from
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From c35901b39ddc20077f4ae7b9f7bf344487f62212 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Thu, 21 Jan 2021 16:07:54 +0100
Subject: virt: vbox: Do not use wait_event_interruptible when called from
kernel context
Do not use wait_event_interruptible when vbg_hgcm_call() gets called from
kernel-context, such as it being called by the vboxsf filesystem code.
This fixes some filesystem related system calls on shared folders
unexpectedly failing with -EINTR.
Fixes: 0532a1b0d045 ("virt: vbox: Implement passing requestor info to the host for VirtualBox 6.0.x")
Reported-by: Ludovic Pouzenc <bugreports(a)pouzenc.fr>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20210121150754.147598-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/virt/vboxguest/vboxguest_utils.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/virt/vboxguest/vboxguest_utils.c b/drivers/virt/vboxguest/vboxguest_utils.c
index ea05af41ec69..8d195e3f8301 100644
--- a/drivers/virt/vboxguest/vboxguest_utils.c
+++ b/drivers/virt/vboxguest/vboxguest_utils.c
@@ -468,7 +468,7 @@ static int hgcm_cancel_call(struct vbg_dev *gdev, struct vmmdev_hgcm_call *call)
* Cancellation fun.
*/
static int vbg_hgcm_do_call(struct vbg_dev *gdev, struct vmmdev_hgcm_call *call,
- u32 timeout_ms, bool *leak_it)
+ u32 timeout_ms, bool interruptible, bool *leak_it)
{
int rc, cancel_rc, ret;
long timeout;
@@ -495,10 +495,15 @@ static int vbg_hgcm_do_call(struct vbg_dev *gdev, struct vmmdev_hgcm_call *call,
else
timeout = msecs_to_jiffies(timeout_ms);
- timeout = wait_event_interruptible_timeout(
- gdev->hgcm_wq,
- hgcm_req_done(gdev, &call->header),
- timeout);
+ if (interruptible) {
+ timeout = wait_event_interruptible_timeout(gdev->hgcm_wq,
+ hgcm_req_done(gdev, &call->header),
+ timeout);
+ } else {
+ timeout = wait_event_timeout(gdev->hgcm_wq,
+ hgcm_req_done(gdev, &call->header),
+ timeout);
+ }
/* timeout > 0 means hgcm_req_done has returned true, so success */
if (timeout > 0)
@@ -631,7 +636,8 @@ int vbg_hgcm_call(struct vbg_dev *gdev, u32 requestor, u32 client_id,
hgcm_call_init_call(call, client_id, function, parms, parm_count,
bounce_bufs);
- ret = vbg_hgcm_do_call(gdev, call, timeout_ms, &leak_it);
+ ret = vbg_hgcm_do_call(gdev, call, timeout_ms,
+ requestor & VMMDEV_REQUESTOR_USERMODE, &leak_it);
if (ret == 0) {
*vbox_status = call->header.result;
ret = hgcm_call_copy_back_result(call, parms, parm_count,
--
2.30.0
No upstream commit, this is a fix to a stable 5.4 specific backport.
The intention of backport commit cac68d12c531 ("io_uring: grab ->fs as part
of async offload") as found in the stable 5.4 tree was to make
io_sq_wq_submit_work() to switch the workqueue task's ->fs over to the
submitting task's one during the IO operation.
However, due to a small logic error, this change turned out to not have any
actual effect. From a high level, the relevant code in
io_sq_wq_submit_work() looks like
old_fs_struct = current->fs;
do {
...
if (req->fs != current->fs && current->fs != old_fs_struct) {
task_lock(current);
if (req->fs)
current->fs = req->fs;
else
current->fs = old_fs_struct;
task_unlock(current);
}
...
} while (req);
The if condition is supposed to cover the case that current->fs doesn't
match what's needed for processing the request, but observe how it fails
to ever evaluate to true due to the second clause:
current->fs != old_fs_struct will be false in the first iteration as per
the initialization of old_fs_struct and because this prevents current->fs
from getting replaced, the same follows inductively for all subsequent
iterations.
Fix said if condition such that
- if req->fs is set and doesn't match current->fs, the latter will be
switched to the former
- or if req->fs is unset, the switch back to the initial old_fs_struct
will be made, if necessary.
While at it, also correct the condition for the ->fs related cleanup right
before the return of io_sq_wq_submit_work(): currently, old_fs_struct is
restored only if it's non-NULL. It is always non-NULL though and thus, the
if-condition is rendundant. Supposedly, the motivation had been to optimize
and avoid switching current->fs back to the initial old_fs_struct in case
it is found to have the desired value already. Make it so.
Cc: stable(a)vger.kernel.org # v5.4
Fixes: cac68d12c531 ("io_uring: grab ->fs as part of async offload")
Reviewed-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Nicolai Stange <nstange(a)suse.de>
---
Tested on top of v5.4.90.
fs/io_uring.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4127ea027a14..478df7e10767 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2226,7 +2226,8 @@ static void io_sq_wq_submit_work(struct work_struct *work)
/* Ensure we clear previously set non-block flag */
req->rw.ki_flags &= ~IOCB_NOWAIT;
- if (req->fs != current->fs && current->fs != old_fs_struct) {
+ if ((req->fs && req->fs != current->fs) ||
+ (!req->fs && current->fs != old_fs_struct)) {
task_lock(current);
if (req->fs)
current->fs = req->fs;
@@ -2351,7 +2352,7 @@ static void io_sq_wq_submit_work(struct work_struct *work)
mmput(cur_mm);
}
revert_creds(old_cred);
- if (old_fs_struct) {
+ if (old_fs_struct != current->fs) {
task_lock(current);
current->fs = old_fs_struct;
task_unlock(current);
--
2.26.2
Currently, the __is_lm_address() check just masks out the top 12 bits
of the address, but if they are 0, it still yields a true result.
This has as a side effect that virt_addr_valid() returns true even for
invalid virtual addresses (e.g. 0x0).
Fix the detection checking that it's actually a kernel address starting
at PAGE_OFFSET.
Fixes: f4693c2716b35 ("arm64: mm: extend linear region for 52-bit VA configurations")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Suggested-by: Catalin Marinas <catalin.marinas(a)arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Acked-by: Mark Rutland <mark.rutland(a)arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
---
arch/arm64/include/asm/memory.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 18fce223b67b..99d7e1494aaa 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -247,9 +247,11 @@ static inline const void *__tag_set(const void *addr, u8 tag)
/*
- * The linear kernel range starts at the bottom of the virtual address space.
+ * Check whether an arbitrary address is within the linear map, which
+ * lives in the [PAGE_OFFSET, PAGE_END) interval at the bottom of the
+ * kernel's TTBR1 address range.
*/
-#define __is_lm_address(addr) (((u64)(addr) & ~PAGE_OFFSET) < (PAGE_END - PAGE_OFFSET))
+#define __is_lm_address(addr) (((u64)(addr) ^ PAGE_OFFSET) < (PAGE_END - PAGE_OFFSET))
#define __lm_to_phys(addr) (((addr) & ~PAGE_OFFSET) + PHYS_OFFSET)
#define __kimg_to_phys(addr) ((addr) - kimage_voffset)
--
2.30.0
From: Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
dwc2_hsotg_process_req_status uses ep_from_windex() to retrieve
the endpoint for the index provided in the wIndex request param.
In a test-case with a rndis gadget running and sending a malformed
packet to it like:
dev.ctrl_transfer(
0x82, # bmRequestType
0x00, # bRequest
0x0000, # wValue
0x0001, # wIndex
0x00 # wLength
)
it is possible to cause a crash:
[ 217.533022] dwc2 ff300000.usb: dwc2_hsotg_process_req_status: USB_REQ_GET_STATUS
[ 217.559003] Unable to handle kernel read from unreadable memory at virtual address 0000000000000088
...
[ 218.313189] Call trace:
[ 218.330217] ep_from_windex+0x3c/0x54
[ 218.348565] usb_gadget_giveback_request+0x10/0x20
[ 218.368056] dwc2_hsotg_complete_request+0x144/0x184
This happens because ep_from_windex wants to compare the endpoint
direction even if index_to_ep() didn't return an endpoint due to
the direction not matching.
The fix is easy insofar that the actual direction check is already
happening when calling index_to_ep() which will return NULL if there
is no endpoint for the targeted direction, so the offending check
can go away completely.
Fixes: c6f5c050e2a7 ("usb: dwc2: gadget: add bi-directional endpoint support")
Signed-off-by: Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
Cc: stable(a)vger.kernel.org
---
drivers/usb/dwc2/gadget.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 70ac47a341ac..a68c01b1dd73 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -1553,12 +1553,7 @@ static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg,
if (idx > hsotg->num_of_eps)
return NULL;
- ep = index_to_ep(hsotg, idx, dir);
-
- if (idx && ep->dir_in != dir)
- return NULL;
-
- return ep;
+ return index_to_ep(hsotg, idx, dir);
}
/**
--
2.29.2
From: Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
dwc2_hsotg_process_req_status uses ep_from_windex() to retrieve
the endpoint for the index provided in the wIndex request param.
In a test-case with a rndis gadget running and sending a malformed
packet to it like:
dev.ctrl_transfer(
0x82, # bmRequestType
0x00, # bRequest
0x0000, # wValue
0x0001, # wIndex
0x00 # wLength
)
it is possible to cause a crash:
[ 217.533022] dwc2 ff300000.usb: dwc2_hsotg_process_req_status: USB_REQ_GET_STATUS
[ 217.559003] Unable to handle kernel read from unreadable memory at virtual address 0000000000000088
...
[ 218.313189] Call trace:
[ 218.330217] ep_from_windex+0x3c/0x54
[ 218.348565] usb_gadget_giveback_request+0x10/0x20
[ 218.368056] dwc2_hsotg_complete_request+0x144/0x184
This happens because ep_from_windex wants to compare the endpoint
direction even if index_to_ep() didn't return an endpoint due to
the direction not matching.
The fix is easy insofar that the actual direction check is already
happening when calling index_to_ep() which will return NULL if there
is no endpoint for the targeted direction, so the offending check
can go away completely.
Fixes: c6f5c050e2a7 ("usb: dwc2: gadget: add bi-directional endpoint support")
Signed-off-by: Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
Cc: stable(a)vger.kernel.org
---
changes in v2:
- remove unused struct dwc2_hsotg_ep *ep;
drivers/usb/dwc2/gadget.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 0a0d11151cfb..ad4c94366dad 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -1543,7 +1543,6 @@ static void dwc2_hsotg_complete_oursetup(struct usb_ep *ep,
static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg,
u32 windex)
{
- struct dwc2_hsotg_ep *ep;
int dir = (windex & USB_DIR_IN) ? 1 : 0;
int idx = windex & 0x7F;
@@ -1553,12 +1552,7 @@ static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg,
if (idx > hsotg->num_of_eps)
return NULL;
- ep = index_to_ep(hsotg, idx, dir);
-
- if (idx && ep->dir_in != dir)
- return NULL;
-
- return ep;
+ return index_to_ep(hsotg, idx, dir);
}
/**
--
2.29.2
From: Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
dwc2_hsotg_process_req_status uses ep_from_windex() to retrieve
the endpoint for the index provided in the wIndex request param.
In a test-case with a rndis gadget running and sending a malformed
packet to it like:
dev.ctrl_transfer(
0x82, # bmRequestType
0x00, # bRequest
0x0000, # wValue
0x0001, # wIndex
0x00 # wLength
)
it is possible to cause a crash:
[ 217.533022] dwc2 ff300000.usb: dwc2_hsotg_process_req_status: USB_REQ_GET_STATUS
[ 217.559003] Unable to handle kernel read from unreadable memory at virtual address 0000000000000088
...
[ 218.313189] Call trace:
[ 218.330217] ep_from_windex+0x3c/0x54
[ 218.348565] usb_gadget_giveback_request+0x10/0x20
[ 218.368056] dwc2_hsotg_complete_request+0x144/0x184
This happens because ep_from_windex wants to compare the endpoint
direction even if index_to_ep() didn't return an endpoint due to
the direction not matching.
The fix is easy insofar that the actual direction check is already
happening when calling index_to_ep() which will return NULL if there
is no endpoint for the targeted direction, so the offending check
can go away completely.
Fixes: c6f5c050e2a7 ("usb: dwc2: gadget: add bi-directional endpoint support")
Reported-by: Gerhard Klostermeier <gerhard.klostermeier(a)syss.de>
Signed-off-by: Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
Cc: stable(a)vger.kernel.org
---
changes in v3:
- added Reported-by tag
changes in v2:
- remove unused struct dwc2_hsotg_ep *ep;
drivers/usb/dwc2/gadget.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 0a0d11151cfb..ad4c94366dad 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -1543,7 +1543,6 @@ static void dwc2_hsotg_complete_oursetup(struct usb_ep *ep,
static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg,
u32 windex)
{
- struct dwc2_hsotg_ep *ep;
int dir = (windex & USB_DIR_IN) ? 1 : 0;
int idx = windex & 0x7F;
@@ -1553,12 +1552,7 @@ static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg,
if (idx > hsotg->num_of_eps)
return NULL;
- ep = index_to_ep(hsotg, idx, dir);
-
- if (idx && ep->dir_in != dir)
- return NULL;
-
- return ep;
+ return index_to_ep(hsotg, idx, dir);
}
/**
--
2.29.2
This is the start of the stable review cycle for the 4.19.171 release.
There are 58 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 27 Jan 2021 18:31:44 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.171-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.171-rc1
Dan Carpenter <dan.carpenter(a)oracle.com>
net: dsa: b53: fix an off by one in checking "vlan->vid"
Tariq Toukan <tariqt(a)nvidia.com>
net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: mscc: ocelot: allow offloading of bridge on top of LAG
Matteo Croce <mcroce(a)microsoft.com>
ipv6: set multicast flag on the multicast route
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Eric Dumazet <edumazet(a)google.com>
net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
Matteo Croce <mcroce(a)microsoft.com>
ipv6: create multicast route with RTPROT_KERNEL
Guillaume Nault <gnault(a)redhat.com>
udp: mask TOS bits in udp_v4_early_demux()
Lecopzer Chen <lecopzer(a)gmail.com>
kasan: fix incorrect arguments passing in kasan_add_zero_shadow
Lecopzer Chen <lecopzer(a)gmail.com>
kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
Alexander Lobakin <alobakin(a)pm.me>
skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
Geert Uytterhoeven <geert+renesas(a)glider.be>
sh_eth: Fix power down vs. is_opened flag ordering
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
net: dsa: mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
sh: dma: fix kconfig dependency for G2_DMA
Guillaume Nault <gnault(a)redhat.com>
netfilter: rpfilter: mask ecn bits before fib lookup
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core: Extend device_is_dependent()
JC Kuo <jckuo(a)nvidia.com>
xhci: tegra: Delay for disabling LFPS detector
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: make sure TRB is fully written before giving it to the controller
Patrik Jakobsson <patrik.r.jakobsson(a)gmail.com>
usb: bdc: Make bdc pci driver depend on BROKEN
Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
usb: udc: core: Use lock when write to soft_connect
Ryan Chen <ryan_chen(a)aspeedtech.com>
usb: gadget: aspeed: fix stop dma register setting.
Longfang Liu <liulongfang(a)huawei.com>
USB: ehci: fix an interrupt calltrace error
Eugene Korenevsky <ekorenevsky(a)astralinux.ru>
ehci: fix EHCI host controller initialization sequence
Pali Rohár <pali(a)kernel.org>
serial: mvebu-uart: fix tx lost characters at power off
Wang Hui <john.wanghui(a)huawei.com>
stm class: Fix module init return on allocation failure
Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
intel_th: pci: Add Alder Lake-P support
Mathias Kresin <dev(a)kresin.me>
irqchip/mips-cpu: Set IPI domain parent chip
Lars-Peter Clausen <lars(a)metafoo.de>
iio: ad5504: Fix setting power-down state
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: peak_usb: fix use after free bugs
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: vxcan: vxcan_xmit: fix use after free bug
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: dev: can_restart: fix use after free bug
Hangbin Liu <liuhangbin(a)gmail.com>
selftests: net: fib_tests: remove duplicate log test
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: intel-vbtn: Drop HP Stream x360 Convertible PC 11 from allow-list
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: octeon: check correct size of maximum RECV_LEN packet
Arnd Bergmann <arnd(a)arndb.de>
scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/kms/nv50-: fix case where notifier buffer is at offset 0
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/mmu: fix vram heap sizing
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/i2c/gm200: increase width of aux semaphore owner fields
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/privring: ack interrupts the same way as RM
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/bios: fix issue shadowing expansion ROMs
David Woodhouse <dwmw(a)amazon.co.uk>
xen: Fix event channel callback via INTX/GSI
Peter Geis <pgwipeout(a)gmail.com>
clk: tegra30: Add hda clock default rates to clock driver
Seth Miller <miller.seth(a)gmail.com>
HID: Ignore battery for Elan touchscreen on ASUS UX550
Damien Le Moal <damien.lemoal(a)wdc.com>
riscv: Fix kernel time_init()
Nilesh Javali <njavali(a)marvell.com>
scsi: qedi: Correct max length of CHAP secret
Can Guo <cang(a)codeaurora.org>
scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
Anthony Iliopoulos <ailiop(a)suse.com>
dm integrity: select CRYPTO_SKCIPHER
Cezary Rojewski <cezary.rojewski(a)intel.com>
ASoC: Intel: haswell: Add missing pm_ops
Pan Bian <bianpan2016(a)163.com>
drm/atomic: put state on error path
Mikulas Patocka <mpatocka(a)redhat.com>
dm integrity: fix a crash if "recalculate" used without "internal_hash"
Hannes Reinecke <hare(a)suse.de>
dm: avoid filesystem lookup in dm_get_dev_t()
Alex Leibovich <alexl(a)marvell.com>
mmc: sdhci-xenon: fix 1.8v regulator stabilization
Peter Collingbourne <pcc(a)google.com>
mmc: core: don't initialize block size from ext_csd if not present
Josef Bacik <josef(a)toxicpanda.com>
btrfs: fix lockdep splat in btrfs_recover_relocation
Hans de Goede <hdegoede(a)redhat.com>
ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda/via: Add minimum mute flag
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
Mikko Perttunen <mperttunen(a)nvidia.com>
i2c: bpmp-tegra: Ignore unknown I2C_M flags
-------------
Diffstat:
Makefile | 4 +-
arch/arm/xen/enlighten.c | 2 +-
arch/riscv/kernel/time.c | 3 +
arch/sh/drivers/dma/Kconfig | 3 +-
drivers/acpi/scan.c | 2 +
drivers/base/core.c | 17 ++++-
drivers/clk/tegra/clk-tegra30.c | 2 +
drivers/gpu/drm/drm_atomic_helper.c | 2 +-
drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 +-
drivers/gpu/drm/nouveau/dispnv50/disp.h | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wimmc37b.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/auxgm200.c | 8 +--
drivers/gpu/drm/nouveau/nvkm/subdev/ibus/gf100.c | 10 ++-
drivers/gpu/drm/nouveau/nvkm/subdev/ibus/gk104.c | 10 ++-
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/base.c | 6 +-
drivers/hid/hid-ids.h | 1 +
drivers/hid/hid-input.c | 2 +
drivers/hwtracing/intel_th/pci.c | 5 ++
drivers/hwtracing/stm/heartbeat.c | 6 +-
drivers/i2c/busses/i2c-octeon-core.c | 2 +-
drivers/i2c/busses/i2c-tegra-bpmp.c | 2 +-
drivers/iio/dac/ad5504.c | 4 +-
drivers/irqchip/irq-mips-cpu.c | 7 ++
drivers/md/Kconfig | 1 +
drivers/md/dm-integrity.c | 6 ++
drivers/md/dm-table.c | 15 +++-
drivers/mmc/core/queue.c | 4 +-
drivers/mmc/host/sdhci-xenon.c | 7 +-
drivers/net/can/dev.c | 4 +-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 8 +--
drivers/net/can/vxcan.c | 6 +-
drivers/net/dsa/b53/b53_common.c | 2 +-
drivers/net/dsa/mv88e6xxx/global1_vtu.c | 4 ++
drivers/net/ethernet/mscc/ocelot.c | 4 +-
drivers/net/ethernet/renesas/sh_eth.c | 4 +-
drivers/platform/x86/intel-vbtn.c | 6 --
drivers/scsi/megaraid/megaraid_sas_base.c | 6 +-
drivers/scsi/qedi/qedi_main.c | 4 +-
drivers/scsi/ufs/ufshcd.c | 11 ++-
drivers/tty/serial/mvebu-uart.c | 10 ++-
drivers/usb/gadget/udc/aspeed-vhub/epn.c | 5 +-
drivers/usb/gadget/udc/bdc/Kconfig | 2 +-
drivers/usb/gadget/udc/core.c | 13 +++-
drivers/usb/host/ehci-hcd.c | 12 ++++
drivers/usb/host/ehci-hub.c | 3 +
drivers/usb/host/xhci-ring.c | 2 +
drivers/usb/host/xhci-tegra.c | 7 ++
drivers/xen/events/events_base.c | 10 ---
drivers/xen/platform-pci.c | 1 -
drivers/xen/xenbus/xenbus.h | 1 +
drivers/xen/xenbus/xenbus_comms.c | 8 ---
drivers/xen/xenbus/xenbus_probe.c | 81 ++++++++++++++++++----
fs/btrfs/volumes.c | 2 +
include/xen/xenbus.h | 2 +-
mm/kasan/kasan_init.c | 23 +++---
net/core/dev.c | 5 ++
net/core/skbuff.c | 6 +-
net/ipv4/netfilter/ipt_rpfilter.c | 2 +-
net/ipv4/udp.c | 3 +-
net/ipv6/addrconf.c | 3 +-
net/sched/cls_tcindex.c | 8 ++-
net/sched/sch_api.c | 3 +-
sound/core/seq/oss/seq_oss_synth.c | 3 +-
sound/pci/hda/patch_via.c | 1 +
sound/soc/intel/boards/haswell.c | 1 +
tools/testing/selftests/net/fib_tests.sh | 1 -
67 files changed, 290 insertions(+), 128 deletions(-)
On Tue, Jan 26, 2021 at 8:25 AM Mike Rapoport <rppt(a)linux.ibm.com> wrote:
>
> On Mon, Jan 25, 2021 at 09:46:19PM +0000, Chris Wilson wrote:
> >
> > CI does confirm that the revert of d3921cb8be29 brings the machines back
> > to life.
>
> I still cannot see what could possibly go wrong, so let's revert
> d3921cb8be29 for now and I'll continue to work with Chris to debug this.
Ok, reverted in my tree.
And added stable to the cc, so that they know not to pick up that
commit d3921cb8be29, despite it being marked for stable.
Linus
From: Nadav Amit <namit(a)vmware.com>
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.
Synchronizing the virtual and the physical TLBs is expensive if the
hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.
Getting batched TLB flushes to use in such circumstances page-specific
invalidations again is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.
Fixes: 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Lu Baolu <baolu.lu(a)linux.intel.com>
Cc: Joerg Roedel <joro(a)8bytes.org>
Cc: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/iommu/intel/iommu.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c5b021..4e08f5e17175 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5373,6 +5373,30 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
}
+static int
+intel_iommu_domain_get_attr_use_flush_queue(struct iommu_domain *domain)
+{
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ struct intel_iommu *iommu = domain_get_iommu(dmar_domain);
+
+ if (intel_iommu_strict)
+ return 0;
+
+ /*
+ * The flush queue implementation does not perform page-selective
+ * invalidations that are required for efficient TLB flushes in virtual
+ * environments. The benefit of batching is likely to be much lower than
+ * the overhead of synchronizing the virtual and physical IOMMU
+ * page-tables.
+ */
+ if (iommu && cap_caching_mode(iommu->cap)) {
+ pr_warn_once("IOMMU batching is partially disabled due to virtualization");
+ return 0;
+ }
+
+ return 1;
+}
+
static int
intel_iommu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
@@ -5383,7 +5407,7 @@ intel_iommu_domain_get_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
- *(int *)data = !intel_iommu_strict;
+ *(int *)data = !intel_iommu_domain_get_attr_use_flush_queue(domain);
return 0;
default:
return -ENODEV;
--
2.25.1
Commit 1d489151e9f9d1647110277ff77282fe4d96d09b upstream.
Thanks to a recent binutils change which doesn't generate unused
symbols, it's now possible for thunk_64.o be completely empty without
CONFIG_PREEMPTION: no text, no data, no symbols.
We could edit the Makefile to only build that file when
CONFIG_PREEMPTION is enabled, but that will likely create confusion
if/when the thunks end up getting used by some other code again.
Just ignore it and move on.
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Reviewed-by: Miroslav Benes <mbenes(a)suse.cz>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1254
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
---
This fixes a build break caused by the most recent version of binutils
(2.36).
tools/objtool/elf.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 4e1d7460574b..9452cfb01ef1 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -354,8 +354,11 @@ static int read_symbols(struct elf *elf)
symtab = find_section_by_name(elf, ".symtab");
if (!symtab) {
- WARN("missing symbol table");
- return -1;
+ /*
+ * A missing symbol table is actually possible if it's an empty
+ * .o file. This can happen for thunk_64.o.
+ */
+ return 0;
}
symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
--
2.29.2
When extracting the mask for a SMR that was programmed by the
bootloader, the SMR's valid bit is also extracted and is treated
as part of the mask, which is not correct. Consider the scenario
where an SMMU master whose context is determined by a bootloader
programmed SMR is removed (omitting parts of device/driver core):
->iommu_release_device()
-> arm_smmu_release_device()
-> arm_smmu_master_free_smes()
-> arm_smmu_free_sme() /* Assume that the SME is now free */
-> arm_smmu_write_sme()
-> arm_smmu_write_smr() /* Construct SMR value using mask and SID */
Since the valid bit was considered as part of the mask, the SMR will
be programmed as valid.
Fix the SMR mask extraction step for bootloader programmed SMRs
by masking out the valid bit when we know that we're already
working with a valid SMR.
Fixes: 07a7f2caaa5a ("iommu/arm-smmu-qcom: Read back stream mappings")
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Cc: stable(a)vger.kernel.org
---
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index bcda170..abb1d2f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -206,6 +206,8 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) {
+ /* Ignore valid bit for SMR mask extraction. */
+ smr &= ~ARM_SMMU_SMR_VALID;
smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
smmu->smrs[i].valid = true;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The patch titled
Subject: mm: hugetlb: fix missing put_page in gather_surplus_pages()
has been added to the -mm tree. Its filename is
mm-hugetlb-fix-missing-put_page-in-gather_surplus_pages.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-fix-missing-put_page-i…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-fix-missing-put_page-i…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: fix missing put_page in gather_surplus_pages()
The VM_BUG_ON_PAGE avoids the generation of any code, even if that
expression has side-effects when !CONFIG_DEBUG_VM.
Link: https://lkml.kernel.org/r/20210126031009.96266-1-songmuchun@bytedance.com
Fixes: e5dfacebe4a4 ("mm/hugetlb.c: just use put_page_testzero() instead of page_count()")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-missing-put_page-in-gather_surplus_pages
+++ a/mm/hugetlb.c
@@ -2047,13 +2047,16 @@ retry:
/* Free the needed pages to the hugetlb pool */
list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ int zeroed;
+
if ((--needed) < 0)
break;
/*
* This page is now managed by the hugetlb allocator and has
* no users -- drop the buddy allocator's reference.
*/
- VM_BUG_ON_PAGE(!put_page_testzero(page), page);
+ zeroed = put_page_testzero(page);
+ VM_BUG_ON_PAGE(!zeroed, page);
enqueue_huge_page(h, page);
}
free:
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
mm-migrate-do-not-migrate-hugetlb-page-whose-refcount-is-one.patch
mm-hugetlb-fix-missing-put_page-in-gather_surplus_pages.patch
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7955f105afb6034af344038d663bc98809483cdd Mon Sep 17 00:00:00 2001
From: Steve French <stfrench(a)microsoft.com>
Date: Wed, 9 Dec 2020 22:19:00 -0600
Subject: [PATCH] SMB3.1.1: do not log warning message if server doesn't
populate salt
In the negotiate protocol preauth context, the server is not required
to populate the salt (although it is done by most servers) so do
not warn on mount.
We retain the checks (warn) that the preauth context is the minimum
size and that the salt does not exceed DataLength of the SMB response.
Although we use the defaults in the case that the preauth context
response is invalid, these checks may be useful in the future
as servers add support for additional mechanisms.
CC: Stable <stable(a)vger.kernel.org>
Reviewed-by: Shyam Prasad N <sprasad(a)microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov(a)microsoft.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index acb72705062d..fc06c762fbbf 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -427,8 +427,8 @@ build_preauth_ctxt(struct smb2_preauth_neg_context *pneg_ctxt)
pneg_ctxt->ContextType = SMB2_PREAUTH_INTEGRITY_CAPABILITIES;
pneg_ctxt->DataLength = cpu_to_le16(38);
pneg_ctxt->HashAlgorithmCount = cpu_to_le16(1);
- pneg_ctxt->SaltLength = cpu_to_le16(SMB311_SALT_SIZE);
- get_random_bytes(pneg_ctxt->Salt, SMB311_SALT_SIZE);
+ pneg_ctxt->SaltLength = cpu_to_le16(SMB311_LINUX_CLIENT_SALT_SIZE);
+ get_random_bytes(pneg_ctxt->Salt, SMB311_LINUX_CLIENT_SALT_SIZE);
pneg_ctxt->HashAlgorithms = SMB2_PREAUTH_INTEGRITY_SHA512;
}
@@ -566,6 +566,9 @@ static void decode_preauth_context(struct smb2_preauth_neg_context *ctxt)
if (len < MIN_PREAUTH_CTXT_DATA_LEN) {
pr_warn_once("server sent bad preauth context\n");
return;
+ } else if (len < MIN_PREAUTH_CTXT_DATA_LEN + le16_to_cpu(ctxt->SaltLength)) {
+ pr_warn_once("server sent invalid SaltLength\n");
+ return;
}
if (le16_to_cpu(ctxt->HashAlgorithmCount) != 1)
pr_warn_once("Invalid SMB3 hash algorithm count\n");
diff --git a/fs/cifs/smb2pdu.h b/fs/cifs/smb2pdu.h
index fa57b03ca98c..204a622b89ed 100644
--- a/fs/cifs/smb2pdu.h
+++ b/fs/cifs/smb2pdu.h
@@ -333,12 +333,20 @@ struct smb2_neg_context {
/* Followed by array of data */
} __packed;
-#define SMB311_SALT_SIZE 32
+#define SMB311_LINUX_CLIENT_SALT_SIZE 32
/* Hash Algorithm Types */
#define SMB2_PREAUTH_INTEGRITY_SHA512 cpu_to_le16(0x0001)
#define SMB2_PREAUTH_HASH_SIZE 64
-#define MIN_PREAUTH_CTXT_DATA_LEN (SMB311_SALT_SIZE + 6)
+/*
+ * SaltLength that the server send can be zero, so the only three required
+ * fields (all __le16) end up six bytes total, so the minimum context data len
+ * in the response is six bytes which accounts for
+ *
+ * HashAlgorithmCount, SaltLength, and 1 HashAlgorithm.
+ */
+#define MIN_PREAUTH_CTXT_DATA_LEN 6
+
struct smb2_preauth_neg_context {
__le16 ContextType; /* 1 */
__le16 DataLength;
@@ -346,7 +354,7 @@ struct smb2_preauth_neg_context {
__le16 HashAlgorithmCount; /* 1 */
__le16 SaltLength;
__le16 HashAlgorithms; /* HashAlgorithms[0] since only one defined */
- __u8 Salt[SMB311_SALT_SIZE];
+ __u8 Salt[SMB311_LINUX_CLIENT_SALT_SIZE];
} __packed;
/* Encryption Algorithms Ciphers */
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9
Gitweb: https://git.kernel.org/tip/12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Wed, 20 Jan 2021 16:00:24 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Jan 2021 15:10:58 +01:00
futex: Ensure the correct return value from futex_lock_pi()
In case that futex_lock_pi() was aborted by a signal or a timeout and the
task returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner()
which in turn tries to acquire the rtmutex again. If that succeeds then it
does not propagate this success to fixup_owner() and futex_lock_pi()
returns -EINTR or -ETIMEOUT despite having the futex locked.
Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex and therefore the futex and propagate it correctly
through fixup_owner(). Fixup the other callsite which does not expect a
positive return value.
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
kernel/futex.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index c47d101..d5e61c2 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2373,8 +2373,8 @@ retry:
}
if (__rt_mutex_futex_trylock(&pi_state->pi_mutex)) {
- /* We got the lock after all, nothing to fix. */
- ret = 0;
+ /* We got the lock. pi_state is correct. Tell caller. */
+ ret = 1;
goto out_unlock;
}
@@ -2402,7 +2402,7 @@ retry:
* We raced against a concurrent self; things are
* already fixed up. Nothing to do.
*/
- ret = 0;
+ ret = 1;
goto out_unlock;
}
newowner = argowner;
@@ -2448,7 +2448,7 @@ retry:
raw_spin_unlock(&newowner->pi_lock);
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
- return 0;
+ return argowner == current;
/*
* In order to reschedule or handle a page fault, we need to drop the
@@ -2490,7 +2490,7 @@ handle_err:
* Check if someone else fixed it for us:
*/
if (pi_state->owner != oldowner) {
- ret = 0;
+ ret = argowner == current;
goto out_unlock;
}
@@ -2523,8 +2523,6 @@ static long futex_wait_restart(struct restart_block *restart);
*/
static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
{
- int ret = 0;
-
if (locked) {
/*
* Got the lock. We might not be the anticipated owner if we
@@ -2535,8 +2533,8 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
* stable state, anything else needs more attention.
*/
if (q->pi_state->owner != current)
- ret = fixup_pi_state_owner(uaddr, q, current);
- return ret ? ret : locked;
+ return fixup_pi_state_owner(uaddr, q, current);
+ return 1;
}
/*
@@ -2547,10 +2545,8 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
* Another speculative read; pi_state->owner == current is unstable
* but needs our attention.
*/
- if (q->pi_state->owner == current) {
- ret = fixup_pi_state_owner(uaddr, q, NULL);
- return ret;
- }
+ if (q->pi_state->owner == current)
+ return fixup_pi_state_owner(uaddr, q, NULL);
/*
* Paranoia check. If we did not take the lock, then we should not be
@@ -2563,7 +2559,7 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
q->pi_state->owner);
}
- return ret;
+ return 0;
}
/**
@@ -3261,7 +3257,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
- if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current) {
+ if (ret < 0 && rt_mutex_owner(&q.pi_state->pi_mutex) == current) {
pi_state = q.pi_state;
get_pi_state(pi_state);
}
@@ -3271,6 +3267,11 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
*/
put_pi_state(q.pi_state);
spin_unlock(q.lock_ptr);
+ /*
+ * Adjust the return value. It's either -EFAULT or
+ * success (1) but the caller expects 0 for success.
+ */
+ ret = ret < 0 ? ret : 0;
}
} else {
struct rt_mutex *pi_mutex;
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: 04b79c55201f02ffd675e1231d731365e335c307
Gitweb: https://git.kernel.org/tip/04b79c55201f02ffd675e1231d731365e335c307
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Tue, 19 Jan 2021 16:06:10 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Jan 2021 15:10:58 +01:00
futex: Replace pointless printk in fixup_owner()
If that unexpected case of inconsistent arguments ever happens then the
futex state is left completely inconsistent and the printk is not really
helpful. Replace it with a warning and make the state consistent.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
kernel/futex.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index d5e61c2..5dc8f89 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2550,14 +2550,10 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
/*
* Paranoia check. If we did not take the lock, then we should not be
- * the owner of the rt_mutex.
+ * the owner of the rt_mutex. Warn and establish consistent state.
*/
- if (rt_mutex_owner(&q->pi_state->pi_mutex) == current) {
- printk(KERN_ERR "fixup_owner: ret = %d pi-mutex: %p "
- "pi-state %p\n", ret,
- q->pi_state->pi_mutex.owner,
- q->pi_state->owner);
- }
+ if (WARN_ON_ONCE(rt_mutex_owner(&q->pi_state->pi_mutex) == current))
+ return fixup_pi_state_owner(uaddr, q, current);
return 0;
}
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: c5cade200ab9a2a3be9e7f32a752c8d86b502ec7
Gitweb: https://git.kernel.org/tip/c5cade200ab9a2a3be9e7f32a752c8d86b502ec7
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Tue, 19 Jan 2021 15:21:35 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Jan 2021 15:10:58 +01:00
futex: Provide and use pi_state_update_owner()
Updating pi_state::owner is done at several places with the same
code. Provide a function for it and use that at the obvious places.
This is also a preparation for a bug fix to avoid yet another copy of the
same code or alternatively introducing a completely unpenetratable mess of
gotos.
Originally-by: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
kernel/futex.c | 66 ++++++++++++++++++++++++-------------------------
1 file changed, 33 insertions(+), 33 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index 5dc8f89..7837f9e 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -763,6 +763,29 @@ static struct futex_pi_state *alloc_pi_state(void)
return pi_state;
}
+static void pi_state_update_owner(struct futex_pi_state *pi_state,
+ struct task_struct *new_owner)
+{
+ struct task_struct *old_owner = pi_state->owner;
+
+ lockdep_assert_held(&pi_state->pi_mutex.wait_lock);
+
+ if (old_owner) {
+ raw_spin_lock(&old_owner->pi_lock);
+ WARN_ON(list_empty(&pi_state->list));
+ list_del_init(&pi_state->list);
+ raw_spin_unlock(&old_owner->pi_lock);
+ }
+
+ if (new_owner) {
+ raw_spin_lock(&new_owner->pi_lock);
+ WARN_ON(!list_empty(&pi_state->list));
+ list_add(&pi_state->list, &new_owner->pi_state_list);
+ pi_state->owner = new_owner;
+ raw_spin_unlock(&new_owner->pi_lock);
+ }
+}
+
static void get_pi_state(struct futex_pi_state *pi_state)
{
WARN_ON_ONCE(!refcount_inc_not_zero(&pi_state->refcount));
@@ -1521,26 +1544,15 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_
ret = -EINVAL;
}
- if (ret)
- goto out_unlock;
-
- /*
- * This is a point of no return; once we modify the uval there is no
- * going back and subsequent operations must not fail.
- */
-
- raw_spin_lock(&pi_state->owner->pi_lock);
- WARN_ON(list_empty(&pi_state->list));
- list_del_init(&pi_state->list);
- raw_spin_unlock(&pi_state->owner->pi_lock);
-
- raw_spin_lock(&new_owner->pi_lock);
- WARN_ON(!list_empty(&pi_state->list));
- list_add(&pi_state->list, &new_owner->pi_state_list);
- pi_state->owner = new_owner;
- raw_spin_unlock(&new_owner->pi_lock);
-
- postunlock = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
+ if (!ret) {
+ /*
+ * This is a point of no return; once we modified the uval
+ * there is no going back and subsequent operations must
+ * not fail.
+ */
+ pi_state_update_owner(pi_state, new_owner);
+ postunlock = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
+ }
out_unlock:
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
@@ -2433,19 +2445,7 @@ retry:
* We fixed up user space. Now we need to fix the pi_state
* itself.
*/
- if (pi_state->owner != NULL) {
- raw_spin_lock(&pi_state->owner->pi_lock);
- WARN_ON(list_empty(&pi_state->list));
- list_del_init(&pi_state->list);
- raw_spin_unlock(&pi_state->owner->pi_lock);
- }
-
- pi_state->owner = newowner;
-
- raw_spin_lock(&newowner->pi_lock);
- WARN_ON(!list_empty(&pi_state->list));
- list_add(&pi_state->list, &newowner->pi_state_list);
- raw_spin_unlock(&newowner->pi_lock);
+ pi_state_update_owner(pi_state, newowner);
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
return argowner == current;
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: 6ccc84f917d33312eb2846bd7b567639f585ad6d
Gitweb: https://git.kernel.org/tip/6ccc84f917d33312eb2846bd7b567639f585ad6d
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Wed, 20 Jan 2021 11:35:19 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Jan 2021 15:10:59 +01:00
futex: Use pi_state_update_owner() in put_pi_state()
No point in open coding it. This way it gains the extra sanity checks.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
kernel/futex.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index cfca221..a0fe63c 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -808,16 +808,10 @@ static void put_pi_state(struct futex_pi_state *pi_state)
* and has cleaned up the pi_state already
*/
if (pi_state->owner) {
- struct task_struct *owner;
unsigned long flags;
raw_spin_lock_irqsave(&pi_state->pi_mutex.wait_lock, flags);
- owner = pi_state->owner;
- if (owner) {
- raw_spin_lock(&owner->pi_lock);
- list_del_init(&pi_state->list);
- raw_spin_unlock(&owner->pi_lock);
- }
+ pi_state_update_owner(pi_state, NULL);
rt_mutex_proxy_unlock(&pi_state->pi_mutex);
raw_spin_unlock_irqrestore(&pi_state->pi_mutex.wait_lock, flags);
}
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: 34b1a1ce1458f50ef27c54e28eb9b1947012907a
Gitweb: https://git.kernel.org/tip/34b1a1ce1458f50ef27c54e28eb9b1947012907a
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Mon, 18 Jan 2021 19:01:21 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 26 Jan 2021 15:11:00 +01:00
futex: Handle faults correctly for PI futexes
fixup_pi_state_owner() tries to ensure that the state of the rtmutex,
pi_state and the user space value related to the PI futex are consistent
before returning to user space. In case that the user space value update
faults and the fault cannot be resolved by faulting the page in via
fault_in_user_writeable() the function returns with -EFAULT and leaves
the rtmutex and pi_state owner state inconsistent.
A subsequent futex_unlock_pi() operates on the inconsistent pi_state and
releases the rtmutex despite not owning it which can corrupt the RB tree of
the rtmutex and cause a subsequent kernel stack use after free.
It was suggested to loop forever in fixup_pi_state_owner() if the fault
cannot be resolved, but that results in runaway tasks which is especially
undesired when the problem happens due to a programming error and not due
to malice.
As the user space value cannot be fixed up, the proper solution is to make
the rtmutex and the pi_state consistent so both have the same owner. This
leaves the user space value out of sync. Any subsequent operation on the
futex will fail because the 10th rule of PI futexes (pi_state owner and
user space value are consistent) has been violated.
As a consequence this removes the inept attempts of 'fixing' the situation
in case that the current task owns the rtmutex when returning with an
unresolvable fault by unlocking the rtmutex which left pi_state::owner and
rtmutex::owner out of sync in a different and only slightly less dangerous
way.
Fixes: 1b7558e457ed ("futexes: fix fault handling in futex_lock_pi")
Reported-by: gzobqq(a)gmail.com
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
kernel/futex.c | 57 +++++++++++++++++--------------------------------
1 file changed, 20 insertions(+), 37 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index 7a38ead..45a13eb 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -958,7 +958,8 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
* FUTEX_OWNER_DIED bit. See [4]
*
* [10] There is no transient state which leaves owner and user space
- * TID out of sync.
+ * TID out of sync. Except one error case where the kernel is denied
+ * write access to the user address, see fixup_pi_state_owner().
*
*
* Serialization and lifetime rules:
@@ -2480,6 +2481,24 @@ handle_err:
if (!err)
goto retry;
+ /*
+ * fault_in_user_writeable() failed so user state is immutable. At
+ * best we can make the kernel state consistent but user state will
+ * be most likely hosed and any subsequent unlock operation will be
+ * rejected due to PI futex rule [10].
+ *
+ * Ensure that the rtmutex owner is also the pi_state owner despite
+ * the user space value claiming something different. There is no
+ * point in unlocking the rtmutex if current is the owner as it
+ * would need to wait until the next waiter has taken the rtmutex
+ * to guarantee consistent state. Keep it simple. Userspace asked
+ * for this wreckaged state.
+ *
+ * The rtmutex has an owner - either current or some other
+ * task. See the EAGAIN loop above.
+ */
+ pi_state_update_owner(pi_state, rt_mutex_owner(&pi_state->pi_mutex));
+
return err;
}
@@ -2756,7 +2775,6 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags,
ktime_t *time, int trylock)
{
struct hrtimer_sleeper timeout, *to;
- struct futex_pi_state *pi_state = NULL;
struct task_struct *exiting = NULL;
struct rt_mutex_waiter rt_waiter;
struct futex_hash_bucket *hb;
@@ -2892,23 +2910,8 @@ no_block:
if (res)
ret = (res < 0) ? res : 0;
- /*
- * If fixup_owner() faulted and was unable to handle the fault, unlock
- * it and return the fault to userspace.
- */
- if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current)) {
- pi_state = q.pi_state;
- get_pi_state(pi_state);
- }
-
/* Unqueue and drop the lock */
unqueue_me_pi(&q);
-
- if (pi_state) {
- rt_mutex_futex_unlock(&pi_state->pi_mutex);
- put_pi_state(pi_state);
- }
-
goto out;
out_unlock_put_key:
@@ -3168,7 +3171,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
u32 __user *uaddr2)
{
struct hrtimer_sleeper timeout, *to;
- struct futex_pi_state *pi_state = NULL;
struct rt_mutex_waiter rt_waiter;
struct futex_hash_bucket *hb;
union futex_key key2 = FUTEX_KEY_INIT;
@@ -3246,10 +3248,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
- if (ret < 0 && rt_mutex_owner(&q.pi_state->pi_mutex) == current) {
- pi_state = q.pi_state;
- get_pi_state(pi_state);
- }
/*
* Drop the reference to the pi state which
* the requeue_pi() code acquired for us.
@@ -3291,25 +3289,10 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
if (res)
ret = (res < 0) ? res : 0;
- /*
- * If fixup_pi_state_owner() faulted and was unable to handle
- * the fault, unlock the rt_mutex and return the fault to
- * userspace.
- */
- if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current) {
- pi_state = q.pi_state;
- get_pi_state(pi_state);
- }
-
/* Unqueue and drop the lock. */
unqueue_me_pi(&q);
}
- if (pi_state) {
- rt_mutex_futex_unlock(&pi_state->pi_mutex);
- put_pi_state(pi_state);
- }
-
if (ret == -EINTR) {
/*
* We've already been requeued, but cannot restart by calling
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: cedrus: Fix H264 decoding
Author: Jernej Skrabec <jernej.skrabec(a)siol.net>
Date: Wed Dec 23 12:06:59 2020 +0100
During H264 API overhaul subtle bug was introduced Cedrus driver.
Progressive references have both, top and bottom reference flags set.
Cedrus reference list expects only bottom reference flag and only when
interlaced frames are decoded. However, due to a bug in Cedrus check,
exclusivity is not tested and that flag is set also for progressive
references. That causes "jumpy" background with many videos.
Fix that by checking that only bottom reference flag is set in control
and nothing else.
Tested-by: Andre Heider <a.heider(a)gmail.com>
Fixes: cfc8c3ed533e ("media: cedrus: h264: Properly configure reference field")
Signed-off-by: Jernej Skrabec <jernej.skrabec(a)siol.net>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/staging/media/sunxi/cedrus/cedrus_h264.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
index 781c84a9b1b7..de7442d4834d 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
@@ -203,7 +203,7 @@ static void _cedrus_write_ref_list(struct cedrus_ctx *ctx,
position = cedrus_buf->codec.h264.position;
sram_array[i] |= position << 1;
- if (ref_list[i].fields & V4L2_H264_BOTTOM_FIELD_REF)
+ if (ref_list[i].fields == V4L2_H264_BOTTOM_FIELD_REF)
sram_array[i] |= BIT(0);
}
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: cedrus: Fix H264 decoding
Author: Jernej Skrabec <jernej.skrabec(a)siol.net>
Date: Wed Dec 23 12:06:59 2020 +0100
During H264 API overhaul subtle bug was introduced Cedrus driver.
Progressive references have both, top and bottom reference flags set.
Cedrus reference list expects only bottom reference flag and only when
interlaced frames are decoded. However, due to a bug in Cedrus check,
exclusivity is not tested and that flag is set also for progressive
references. That causes "jumpy" background with many videos.
Fix that by checking that only bottom reference flag is set in control
and nothing else.
Tested-by: Andre Heider <a.heider(a)gmail.com>
Fixes: cfc8c3ed533e ("media: cedrus: h264: Properly configure reference field")
Signed-off-by: Jernej Skrabec <jernej.skrabec(a)siol.net>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/staging/media/sunxi/cedrus/cedrus_h264.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
index 781c84a9b1b7..de7442d4834d 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
@@ -203,7 +203,7 @@ static void _cedrus_write_ref_list(struct cedrus_ctx *ctx,
position = cedrus_buf->codec.h264.position;
sram_array[i] |= position << 1;
- if (ref_list[i].fields & V4L2_H264_BOTTOM_FIELD_REF)
+ if (ref_list[i].fields == V4L2_H264_BOTTOM_FIELD_REF)
sram_array[i] |= BIT(0);
}
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: v4l2-subdev.h: BIT() is not available in userspace
Author: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Date: Mon Jan 18 16:37:00 2021 +0100
The BIT macro is not available in userspace, so replace BIT(0) by
0x00000001.
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Fixes: 6446ec6cbf46 ("media: v4l2-subdev: add VIDIOC_SUBDEV_QUERYCAP ioctl")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
include/uapi/linux/v4l2-subdev.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/include/uapi/linux/v4l2-subdev.h b/include/uapi/linux/v4l2-subdev.h
index 00850b98078a..a38454d9e0f5 100644
--- a/include/uapi/linux/v4l2-subdev.h
+++ b/include/uapi/linux/v4l2-subdev.h
@@ -176,7 +176,7 @@ struct v4l2_subdev_capability {
};
/* The v4l2 sub-device video device node is registered in read-only mode. */
-#define V4L2_SUBDEV_CAP_RO_SUBDEV BIT(0)
+#define V4L2_SUBDEV_CAP_RO_SUBDEV 0x00000001
/* Backwards compatibility define --- to be removed */
#define v4l2_subdev_edid v4l2_edid
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: v4l2-subdev.h: BIT() is not available in userspace
Author: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Date: Mon Jan 18 16:37:00 2021 +0100
The BIT macro is not available in userspace, so replace BIT(0) by
0x00000001.
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Fixes: 6446ec6cbf46 ("media: v4l2-subdev: add VIDIOC_SUBDEV_QUERYCAP ioctl")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
include/uapi/linux/v4l2-subdev.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/include/uapi/linux/v4l2-subdev.h b/include/uapi/linux/v4l2-subdev.h
index 00850b98078a..a38454d9e0f5 100644
--- a/include/uapi/linux/v4l2-subdev.h
+++ b/include/uapi/linux/v4l2-subdev.h
@@ -176,7 +176,7 @@ struct v4l2_subdev_capability {
};
/* The v4l2 sub-device video device node is registered in read-only mode. */
-#define V4L2_SUBDEV_CAP_RO_SUBDEV BIT(0)
+#define V4L2_SUBDEV_CAP_RO_SUBDEV 0x00000001
/* Backwards compatibility define --- to be removed */
#define v4l2_subdev_edid v4l2_edid
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c02406428d5219c367c5f53457698c58bc5f917 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 20 Jan 2021 13:59:11 -0500
Subject: [PATCH] dm integrity: conditionally disable "recalculate" feature
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.
Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.
This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reported-by: Daniel Glockner <dg(a)emlix.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index 4e6f504474ac..2cc5488acbd9 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -177,14 +177,20 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
fix_padding
Use a smaller padding of the tag area that is more
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
-allow_discards
- Allow block discard requests (a.k.a. TRIM) for the integrity device.
- Discards are only allowed to devices using internal hash.
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index cce203adcf77..b64fede032dc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -257,8 +257,9 @@ struct dm_integrity_c {
bool journal_uptodate;
bool just_formatted;
bool recalculate_flag;
- bool fix_padding;
bool discard;
+ bool fix_padding;
+ bool legacy_recalculate;
struct alg_spec internal_hash_alg;
struct alg_spec journal_crypt_alg;
@@ -386,6 +387,14 @@ static int dm_integrity_failed(struct dm_integrity_c *ic)
return READ_ONCE(ic->failed);
}
+static bool dm_integrity_disable_recalculate(struct dm_integrity_c *ic)
+{
+ if ((ic->internal_hash_alg.key || ic->journal_mac_alg.key) &&
+ !ic->legacy_recalculate)
+ return true;
+ return false;
+}
+
static commit_id_t dm_integrity_commit_id(struct dm_integrity_c *ic, unsigned i,
unsigned j, unsigned char seq)
{
@@ -3140,6 +3149,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
arg_count += !!ic->journal_crypt_alg.alg_string;
arg_count += !!ic->journal_mac_alg.alg_string;
arg_count += (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0;
+ arg_count += ic->legacy_recalculate;
DMEMIT("%s %llu %u %c %u", ic->dev->name, ic->start,
ic->tag_size, ic->mode, arg_count);
if (ic->meta_dev)
@@ -3163,6 +3173,8 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
}
if ((ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0)
DMEMIT(" fix_padding");
+ if (ic->legacy_recalculate)
+ DMEMIT(" legacy_recalculate");
#define EMIT_ALG(a, n) \
do { \
@@ -3792,7 +3804,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
unsigned extra_args;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
- {0, 15, "Invalid number of feature args"},
+ {0, 16, "Invalid number of feature args"},
};
unsigned journal_sectors, interleave_sectors, buffer_sectors, journal_watermark, sync_msec;
bool should_write_sb;
@@ -3940,6 +3952,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
ic->discard = true;
} else if (!strcmp(opt_string, "fix_padding")) {
ic->fix_padding = true;
+ } else if (!strcmp(opt_string, "legacy_recalculate")) {
+ ic->legacy_recalculate = true;
} else {
r = -EINVAL;
ti->error = "Invalid argument";
@@ -4243,6 +4257,14 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
}
}
+ if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
+ le64_to_cpu(ic->sb->recalc_sector) < ic->provided_data_sectors &&
+ dm_integrity_disable_recalculate(ic)) {
+ ti->error = "Recalculating with HMAC is disabled for security reasons - if you really need it, use the argument \"legacy_recalculate\"";
+ r = -EOPNOTSUPP;
+ goto bad;
+ }
+
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL);
if (IS_ERR(ic->bufio)) {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c02406428d5219c367c5f53457698c58bc5f917 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 20 Jan 2021 13:59:11 -0500
Subject: [PATCH] dm integrity: conditionally disable "recalculate" feature
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.
Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.
This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reported-by: Daniel Glockner <dg(a)emlix.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index 4e6f504474ac..2cc5488acbd9 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -177,14 +177,20 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
fix_padding
Use a smaller padding of the tag area that is more
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
-allow_discards
- Allow block discard requests (a.k.a. TRIM) for the integrity device.
- Discards are only allowed to devices using internal hash.
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index cce203adcf77..b64fede032dc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -257,8 +257,9 @@ struct dm_integrity_c {
bool journal_uptodate;
bool just_formatted;
bool recalculate_flag;
- bool fix_padding;
bool discard;
+ bool fix_padding;
+ bool legacy_recalculate;
struct alg_spec internal_hash_alg;
struct alg_spec journal_crypt_alg;
@@ -386,6 +387,14 @@ static int dm_integrity_failed(struct dm_integrity_c *ic)
return READ_ONCE(ic->failed);
}
+static bool dm_integrity_disable_recalculate(struct dm_integrity_c *ic)
+{
+ if ((ic->internal_hash_alg.key || ic->journal_mac_alg.key) &&
+ !ic->legacy_recalculate)
+ return true;
+ return false;
+}
+
static commit_id_t dm_integrity_commit_id(struct dm_integrity_c *ic, unsigned i,
unsigned j, unsigned char seq)
{
@@ -3140,6 +3149,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
arg_count += !!ic->journal_crypt_alg.alg_string;
arg_count += !!ic->journal_mac_alg.alg_string;
arg_count += (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0;
+ arg_count += ic->legacy_recalculate;
DMEMIT("%s %llu %u %c %u", ic->dev->name, ic->start,
ic->tag_size, ic->mode, arg_count);
if (ic->meta_dev)
@@ -3163,6 +3173,8 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
}
if ((ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0)
DMEMIT(" fix_padding");
+ if (ic->legacy_recalculate)
+ DMEMIT(" legacy_recalculate");
#define EMIT_ALG(a, n) \
do { \
@@ -3792,7 +3804,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
unsigned extra_args;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
- {0, 15, "Invalid number of feature args"},
+ {0, 16, "Invalid number of feature args"},
};
unsigned journal_sectors, interleave_sectors, buffer_sectors, journal_watermark, sync_msec;
bool should_write_sb;
@@ -3940,6 +3952,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
ic->discard = true;
} else if (!strcmp(opt_string, "fix_padding")) {
ic->fix_padding = true;
+ } else if (!strcmp(opt_string, "legacy_recalculate")) {
+ ic->legacy_recalculate = true;
} else {
r = -EINVAL;
ti->error = "Invalid argument";
@@ -4243,6 +4257,14 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
}
}
+ if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
+ le64_to_cpu(ic->sb->recalc_sector) < ic->provided_data_sectors &&
+ dm_integrity_disable_recalculate(ic)) {
+ ti->error = "Recalculating with HMAC is disabled for security reasons - if you really need it, use the argument \"legacy_recalculate\"";
+ r = -EOPNOTSUPP;
+ goto bad;
+ }
+
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL);
if (IS_ERR(ic->bufio)) {
Rebased on linux-5.10.y, stable tags added. The first was dropped
before, most of others make it right.
Pavel Begunkov (11):
kernel/io_uring: cancel io_uring before task works
io_uring: inline io_uring_attempt_task_drop()
io_uring: add warn_once for io_uring_flush()
io_uring: stop SQPOLL submit on creator's death
io_uring: fix null-deref in io_disable_sqo_submit
io_uring: do sqo disable on install_fd error
io_uring: fix false positive sqo warning on flush
io_uring: fix uring_flush in exit_files() warning
io_uring: fix skipping disabling sqo on exec
io_uring: dont kill fasync under completion_lock
io_uring: fix sleeping under spin in __io_clean_op
fs/file.c | 2 -
fs/io_uring.c | 119 +++++++++++++++++++++++++++++++++++---------------
kernel/exit.c | 2 +
3 files changed, 86 insertions(+), 37 deletions(-)
--
2.24.0
From: Arnd Bergmann <arnd(a)arndb.de>
sdhci_pltfm_suspend() is only available when CONFIG_PM_SLEEP
support is built into the kernel, which caused a regression
in a recent bugfix:
ld.lld: error: undefined symbol: sdhci_pltfm_suspend
>>> referenced by sdhci-brcmstb.c
>>> mmc/host/sdhci-brcmstb.o:(sdhci_brcmstb_shutdown) in archive drivers/built-in.a
Making the call conditional on the symbol fixes the link
error.
Fixes: 5b191dcba719 ("mmc: sdhci-brcmstb: Fix mmc timeout errors on S5 suspend")
Fixes: e7b5d63a82fe ("mmc: sdhci-brcmstb: Add shutdown callback")
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
It would be helpful if someone could test this to ensure that the
driver works correctly even when CONFIG_PM_SLEEP is disabled
---
drivers/mmc/host/sdhci-brcmstb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/sdhci-brcmstb.c b/drivers/mmc/host/sdhci-brcmstb.c
index f9780c65ebe9..dc9280b149db 100644
--- a/drivers/mmc/host/sdhci-brcmstb.c
+++ b/drivers/mmc/host/sdhci-brcmstb.c
@@ -314,7 +314,8 @@ static int sdhci_brcmstb_probe(struct platform_device *pdev)
static void sdhci_brcmstb_shutdown(struct platform_device *pdev)
{
- sdhci_pltfm_suspend(&pdev->dev);
+ if (IS_ENABLED(CONFIG_PM_SLEEP))
+ sdhci_pltfm_suspend(&pdev->dev);
}
MODULE_DEVICE_TABLE(of, sdhci_brcm_of_match);
--
2.29.2
If the tctx inflight number haven't changed because of cancellation,
__io_uring_task_cancel() will continue leaving the task in
TASK_UNINTERRUPTIBLE state, that's not expected by
__io_uring_files_cancel(). Ensure we always call finish_wait() before
retrying.
Cc: stable(a)vger.kernel.org # 5.9+
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
---
fs/io_uring.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2166c469789d..09aada153a71 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9124,16 +9124,15 @@ void __io_uring_task_cancel(void)
prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
/*
- * If we've seen completions, retry. This avoids a race where
- * a completion comes in before we did prepare_to_wait().
+ * If we've seen completions, retry without waiting. This
+ * avoids a race where a completion comes in before we did
+ * prepare_to_wait().
*/
- if (inflight != tctx_inflight(tctx))
- continue;
- schedule();
+ if (inflight == tctx_inflight(tctx))
+ schedule();
finish_wait(&tctx->wait, &wait);
} while (1);
- finish_wait(&tctx->wait, &wait);
atomic_dec(&tctx->in_idle);
io_uring_remove_task_files(tctx);
--
2.24.0
This is a note to let you know that I've just added the patch titled
usb: gadget: aspeed: add missing of_node_put
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From a55a9a4c5c6253f6e4dea268af728664ac997790 Mon Sep 17 00:00:00 2001
From: kernel test robot <lkp(a)intel.com>
Date: Thu, 21 Jan 2021 19:12:54 +0100
Subject: usb: gadget: aspeed: add missing of_node_put
Breaking out of for_each_child_of_node requires a put on the
child value.
Generated by: scripts/coccinelle/iterators/for_each_child.cocci
Fixes: 82c2d81361ec ("coccinelle: iterators: Add for_each_child.cocci script")
CC: Sumera Priyadarsini <sylphrenadin(a)gmail.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Julia Lawall <julia.lawall(a)inria.fr>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2101211907060.14700@hadrien
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/gadget/udc/aspeed-vhub/hub.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/aspeed-vhub/hub.c b/drivers/usb/gadget/udc/aspeed-vhub/hub.c
index 6497185ec4e7..bfd8e77788e2 100644
--- a/drivers/usb/gadget/udc/aspeed-vhub/hub.c
+++ b/drivers/usb/gadget/udc/aspeed-vhub/hub.c
@@ -999,8 +999,10 @@ static int ast_vhub_of_parse_str_desc(struct ast_vhub *vhub,
str_array[offset].s = NULL;
ret = ast_vhub_str_alloc_add(vhub, &lang_str);
- if (ret)
+ if (ret) {
+ of_node_put(child);
break;
+ }
}
return ret;
--
2.30.0
The most-significant bit of the sub-integer-prescaler index is set in
the high byte of the baudrate request wIndex also for FTX devices.
This fixes rates like 1152000 which got mapped to 12 MBd.
Reported-by: Vladimir <svv75(a)mail.ru>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210351
Cc: stable(a)vger.kernel.org
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/ftdi_sio.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index 94398f89e600..4168801b9595 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -1386,8 +1386,9 @@ static int change_speed(struct tty_struct *tty, struct usb_serial_port *port)
index_value = get_ftdi_divisor(tty, port);
value = (u16)index_value;
index = (u16)(index_value >> 16);
- if ((priv->chip_type == FT2232C) || (priv->chip_type == FT2232H) ||
- (priv->chip_type == FT4232H) || (priv->chip_type == FT232H)) {
+ if (priv->chip_type == FT2232C || priv->chip_type == FT2232H ||
+ priv->chip_type == FT4232H || priv->chip_type == FT232H ||
+ priv->chip_type == FTX) {
/* Probably the BM type needs the MSB of the encoded fractional
* divider also moved like for the chips above. Any infos? */
index = (u16)((index << 8) | priv->interface);
--
2.26.2
This is a note to let you know that I've just added the patch titled
USB: usblp: don't call usb_set_interface if there's a single alt
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From d8c6edfa3f4ee0d45d7ce5ef18d1245b78774b9d Mon Sep 17 00:00:00 2001
From: Jeremy Figgins <kernel(a)jeremyfiggins.com>
Date: Sat, 23 Jan 2021 18:21:36 -0600
Subject: USB: usblp: don't call usb_set_interface if there's a single alt
Some devices, such as the Winbond Electronics Corp. Virtual Com Port
(Vendor=0416, ProdId=5011), lockup when usb_set_interface() or
usb_clear_halt() are called. This device has only a single
altsetting, so it should not be necessary to call usb_set_interface().
Acked-by: Pete Zaitcev <zaitcev(a)redhat.com>
Signed-off-by: Jeremy Figgins <kernel(a)jeremyfiggins.com>
Link: https://lore.kernel.org/r/YAy9kJhM/rG8EQXC@watson
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/class/usblp.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/class/usblp.c b/drivers/usb/class/usblp.c
index 134dc2005ce9..c9f6e9758288 100644
--- a/drivers/usb/class/usblp.c
+++ b/drivers/usb/class/usblp.c
@@ -1329,14 +1329,17 @@ static int usblp_set_protocol(struct usblp *usblp, int protocol)
if (protocol < USBLP_FIRST_PROTOCOL || protocol > USBLP_LAST_PROTOCOL)
return -EINVAL;
- alts = usblp->protocol[protocol].alt_setting;
- if (alts < 0)
- return -EINVAL;
- r = usb_set_interface(usblp->dev, usblp->ifnum, alts);
- if (r < 0) {
- printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n",
- alts, usblp->ifnum);
- return r;
+ /* Don't unnecessarily set the interface if there's a single alt. */
+ if (usblp->intf->num_altsetting > 1) {
+ alts = usblp->protocol[protocol].alt_setting;
+ if (alts < 0)
+ return -EINVAL;
+ r = usb_set_interface(usblp->dev, usblp->ifnum, alts);
+ if (r < 0) {
+ printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n",
+ alts, usblp->ifnum);
+ return r;
+ }
}
usblp->bidir = (usblp->protocol[protocol].epread != NULL);
--
2.30.0
This is the start of the stable review cycle for the 5.4.93 release.
There are 88 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 28 Jan 2021 09:42:44 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.93-rc2…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.93-rc2
Enke Chen <enchen(a)paloaltonetworks.com>
tcp: fix TCP_USER_TIMEOUT with zero window
Eric Dumazet <edumazet(a)google.com>
tcp: do not mess with cloned skbs in tcp_add_backlog()
Dan Carpenter <dan.carpenter(a)oracle.com>
net: dsa: b53: fix an off by one in checking "vlan->vid"
Tariq Toukan <tariqt(a)nvidia.com>
net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: mscc: ocelot: allow offloading of bridge on top of LAG
Matteo Croce <mcroce(a)microsoft.com>
ipv6: set multicast flag on the multicast route
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Eric Dumazet <edumazet(a)google.com>
net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
Matteo Croce <mcroce(a)microsoft.com>
ipv6: create multicast route with RTPROT_KERNEL
Guillaume Nault <gnault(a)redhat.com>
udp: mask TOS bits in udp_v4_early_demux()
Lecopzer Chen <lecopzer(a)gmail.com>
kasan: fix incorrect arguments passing in kasan_add_zero_shadow
Lecopzer Chen <lecopzer(a)gmail.com>
kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
Alexander Lobakin <alobakin(a)pm.me>
skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
Pan Bian <bianpan2016(a)163.com>
lightnvm: fix memory leak when submit fails
Geert Uytterhoeven <geert+renesas(a)glider.be>
sh_eth: Fix power down vs. is_opened flag ordering
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
net: dsa: mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
sh: dma: fix kconfig dependency for G2_DMA
Guillaume Nault <gnault(a)redhat.com>
netfilter: rpfilter: mask ecn bits before fib lookup
Yazen Ghannam <Yazen.Ghannam(a)amd.com>
x86/cpu/amd: Set __max_die_per_package on AMD
Paul Cercueil <paul(a)crapouillou.net>
pinctrl: ingenic: Fix JZ4760 support
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core: Extend device_is_dependent()
JC Kuo <jckuo(a)nvidia.com>
xhci: tegra: Delay for disabling LFPS detector
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: make sure TRB is fully written before giving it to the controller
Patrik Jakobsson <patrik.r.jakobsson(a)gmail.com>
usb: bdc: Make bdc pci driver depend on BROKEN
Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
usb: udc: core: Use lock when write to soft_connect
Ryan Chen <ryan_chen(a)aspeedtech.com>
usb: gadget: aspeed: fix stop dma register setting.
Longfang Liu <liulongfang(a)huawei.com>
USB: ehci: fix an interrupt calltrace error
Eugene Korenevsky <ekorenevsky(a)astralinux.ru>
ehci: fix EHCI host controller initialization sequence
Pali Rohár <pali(a)kernel.org>
serial: mvebu-uart: fix tx lost characters at power off
Wang Hui <john.wanghui(a)huawei.com>
stm class: Fix module init return on allocation failure
Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
intel_th: pci: Add Alder Lake-P support
Andy Lutomirski <luto(a)kernel.org>
x86/mmx: Use KFPU_387 for MMX string operations
Borislav Petkov <bp(a)suse.de>
x86/topology: Make __max_die_per_package available unconditionally
Andy Lutomirski <luto(a)kernel.org>
x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
Mathias Kresin <dev(a)kresin.me>
irqchip/mips-cpu: Set IPI domain parent chip
Ronnie Sahlberg <lsahlber(a)redhat.com>
cifs: do not fail __smb_send_rqst if non-fatal signals are pending
Lars-Peter Clausen <lars(a)metafoo.de>
iio: ad5504: Fix setting power-down state
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: peak_usb: fix use after free bugs
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: vxcan: vxcan_xmit: fix use after free bug
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: dev: can_restart: fix use after free bug
Hangbin Liu <liuhangbin(a)gmail.com>
selftests: net: fib_tests: remove duplicate log test
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: intel-vbtn: Drop HP Stream x360 Convertible PC 11 from allow-list
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: octeon: check correct size of maximum RECV_LEN packet
Ariel Marcovitch <arielmarcovitch(a)gmail.com>
powerpc: Fix alignment bug within the init sections
Arnd Bergmann <arnd(a)arndb.de>
scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression
Billy Tsai <billy_tsai(a)aspeedtech.com>
pinctrl: aspeed: g6: Fix PWMG0 pinctrl setting
Youling Tang <tangyouling(a)loongson.cn>
powerpc: Use the common INIT_DATA_SECTION macro in vmlinux.lds.S
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/kms/nv50-: fix case where notifier buffer is at offset 0
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/mmu: fix vram heap sizing
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/i2c/gm200: increase width of aux semaphore owner fields
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/privring: ack interrupts the same way as RM
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/bios: fix issue shadowing expansion ROMs
Wayne Lin <Wayne.Lin(a)amd.com>
drm/amd/display: Fix to be able to stop crc calculation
Victor Zhao <Victor.Zhao(a)amd.com>
drm/amdgpu/psp: fix psp gfx ctrl cmds
Sagar Shrikant Kadam <sagar.kadam(a)sifive.com>
riscv: defconfig: enable gpio support for HiFive Unleashed
Sagar Shrikant Kadam <sagar.kadam(a)sifive.com>
dts: phy: add GPIO number and active state used for phy reset
Sagar Shrikant Kadam <sagar.kadam(a)sifive.com>
dts: phy: fix missing mdio device and probe failure of vsc8541-01 device
David Woodhouse <dwmw(a)amazon.co.uk>
x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
David Woodhouse <dwmw(a)amazon.co.uk>
xen: Fix event channel callback via INTX/GSI
Arnd Bergmann <arnd(a)arndb.de>
arm64: make atomic helpers __always_inline
Peter Geis <pgwipeout(a)gmail.com>
clk: tegra30: Add hda clock default rates to clock driver
Seth Miller <miller.seth(a)gmail.com>
HID: Ignore battery for Elan touchscreen on ASUS UX550
Filipe Laíns <lains(a)archlinux.org>
HID: logitech-dj: add the G602 receiver
Damien Le Moal <damien.lemoal(a)wdc.com>
riscv: Fix sifive serial driver
Damien Le Moal <damien.lemoal(a)wdc.com>
riscv: Fix kernel time_init()
Ewan D. Milne <emilne(a)redhat.com>
scsi: sd: Suppress spurious errors when WRITE SAME is being disabled
Nilesh Javali <njavali(a)marvell.com>
scsi: qedi: Correct max length of CHAP secret
Can Guo <cang(a)codeaurora.org>
scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
Anthony Iliopoulos <ailiop(a)suse.com>
dm integrity: select CRYPTO_SKCIPHER
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad device
Cezary Rojewski <cezary.rojewski(a)intel.com>
ASoC: Intel: haswell: Add missing pm_ops
Chris Wilson <chris(a)chris-wilson.co.uk>
drm/i915/gt: Prevent use of engine->wa_ctx after error
Daniel Vetter <daniel.vetter(a)ffwll.ch>
drm/syncobj: Fix use-after-free
Pan Bian <bianpan2016(a)163.com>
drm/atomic: put state on error path
Mikulas Patocka <mpatocka(a)redhat.com>
dm integrity: fix a crash if "recalculate" used without "internal_hash"
Hannes Reinecke <hare(a)suse.de>
dm: avoid filesystem lookup in dm_get_dev_t()
Alex Leibovich <alexl(a)marvell.com>
mmc: sdhci-xenon: fix 1.8v regulator stabilization
Peter Collingbourne <pcc(a)google.com>
mmc: core: don't initialize block size from ext_csd if not present
Filipe Manana <fdmanana(a)suse.com>
btrfs: send: fix invalid clone operations when cloning from the same file and root
Josef Bacik <josef(a)toxicpanda.com>
btrfs: don't clear ret in btrfs_start_dirty_block_groups
Josef Bacik <josef(a)toxicpanda.com>
btrfs: fix lockdep splat in btrfs_recover_relocation
Josef Bacik <josef(a)toxicpanda.com>
btrfs: don't get an EINTR during drop_snapshot for reloc
Hans de Goede <hdegoede(a)redhat.com>
ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda/via: Add minimum mute flag
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
Jiaxun Yang <jiaxun.yang(a)flygoat.com>
platform/x86: ideapad-laptop: Disable touchpad_switch for ELAN0634
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
platform/x86: i2c-multi-instantiate: Don't create platform device for INT3515 ACPI nodes
Mikko Perttunen <mperttunen(a)nvidia.com>
i2c: bpmp-tegra: Ignore unknown I2C_M flags
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 4 ++
Makefile | 4 +-
arch/arm/xen/enlighten.c | 2 +-
arch/arm64/include/asm/atomic.h | 10 +--
arch/powerpc/kernel/vmlinux.lds.S | 25 +++----
.../riscv/boot/dts/sifive/hifive-unleashed-a00.dts | 2 +
arch/riscv/configs/defconfig | 2 +
arch/riscv/kernel/time.c | 3 +
arch/sh/drivers/dma/Kconfig | 3 +-
arch/x86/include/asm/fpu/api.h | 15 +++-
arch/x86/include/asm/topology.h | 4 +-
arch/x86/kernel/cpu/amd.c | 4 +-
arch/x86/kernel/cpu/topology.c | 2 +-
arch/x86/kernel/fpu/core.c | 9 +--
arch/x86/lib/mmx_32.c | 20 ++++--
arch/x86/xen/enlighten_hvm.c | 11 ++-
drivers/acpi/scan.c | 2 +
drivers/base/core.c | 17 ++++-
drivers/clk/tegra/clk-tegra30.c | 2 +
drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 2 +-
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c | 2 +-
drivers/gpu/drm/drm_atomic_helper.c | 2 +-
drivers/gpu/drm/drm_syncobj.c | 8 ++-
drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +
drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 +-
drivers/gpu/drm/nouveau/dispnv50/disp.h | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wimmc37b.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/auxgm200.c | 8 +--
drivers/gpu/drm/nouveau/nvkm/subdev/ibus/gf100.c | 10 ++-
drivers/gpu/drm/nouveau/nvkm/subdev/ibus/gk104.c | 10 ++-
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/base.c | 6 +-
drivers/hid/hid-ids.h | 1 +
drivers/hid/hid-input.c | 2 +
drivers/hid/hid-logitech-dj.c | 4 ++
drivers/hid/hid-multitouch.c | 4 ++
drivers/hwtracing/intel_th/pci.c | 5 ++
drivers/hwtracing/stm/heartbeat.c | 6 +-
drivers/i2c/busses/i2c-octeon-core.c | 2 +-
drivers/i2c/busses/i2c-tegra-bpmp.c | 2 +-
drivers/iio/dac/ad5504.c | 4 +-
drivers/irqchip/irq-mips-cpu.c | 7 ++
drivers/lightnvm/core.c | 3 +-
drivers/md/Kconfig | 1 +
drivers/md/dm-integrity.c | 6 ++
drivers/md/dm-table.c | 15 +++-
drivers/mmc/core/queue.c | 4 +-
drivers/mmc/host/sdhci-xenon.c | 7 +-
drivers/net/can/dev.c | 4 +-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 8 +--
drivers/net/can/vxcan.c | 6 +-
drivers/net/dsa/b53/b53_common.c | 2 +-
drivers/net/dsa/mv88e6xxx/global1_vtu.c | 4 ++
drivers/net/ethernet/mscc/ocelot.c | 4 +-
drivers/net/ethernet/renesas/sh_eth.c | 4 +-
drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c | 2 +-
drivers/pinctrl/pinctrl-ingenic.c | 24 +++----
drivers/platform/x86/i2c-multi-instantiate.c | 31 ++++++---
drivers/platform/x86/ideapad-laptop.c | 15 +++-
drivers/platform/x86/intel-vbtn.c | 6 --
drivers/scsi/megaraid/megaraid_sas_base.c | 6 +-
drivers/scsi/qedi/qedi_main.c | 4 +-
drivers/scsi/sd.c | 4 +-
drivers/scsi/ufs/ufshcd.c | 11 ++-
drivers/tty/serial/mvebu-uart.c | 10 ++-
drivers/tty/serial/sifive.c | 1 +
drivers/usb/gadget/udc/aspeed-vhub/epn.c | 5 +-
drivers/usb/gadget/udc/bdc/Kconfig | 2 +-
drivers/usb/gadget/udc/core.c | 13 +++-
drivers/usb/host/ehci-hcd.c | 12 ++++
drivers/usb/host/ehci-hub.c | 3 +
drivers/usb/host/xhci-ring.c | 2 +
drivers/usb/host/xhci-tegra.c | 7 ++
drivers/xen/events/events_base.c | 10 ---
drivers/xen/platform-pci.c | 1 -
drivers/xen/xenbus/xenbus.h | 1 +
drivers/xen/xenbus/xenbus_comms.c | 8 ---
drivers/xen/xenbus/xenbus_probe.c | 81 ++++++++++++++++++----
fs/btrfs/block-group.c | 3 +-
fs/btrfs/extent-tree.c | 10 ++-
fs/btrfs/send.c | 15 ++++
fs/btrfs/volumes.c | 2 +
fs/cifs/transport.c | 4 +-
include/asm-generic/bitops/atomic.h | 6 +-
include/net/inet_connection_sock.h | 3 +
include/xen/xenbus.h | 2 +-
mm/kasan/init.c | 23 +++---
net/core/dev.c | 5 ++
net/core/skbuff.c | 6 +-
net/ipv4/inet_connection_sock.c | 1 +
net/ipv4/netfilter/ipt_rpfilter.c | 2 +-
net/ipv4/tcp.c | 1 +
net/ipv4/tcp_input.c | 1 +
net/ipv4/tcp_ipv4.c | 25 +++----
net/ipv4/tcp_output.c | 1 +
net/ipv4/tcp_timer.c | 14 ++--
net/ipv4/udp.c | 3 +-
net/ipv6/addrconf.c | 3 +-
net/sched/cls_tcindex.c | 8 ++-
net/sched/sch_api.c | 3 +-
sound/core/seq/oss/seq_oss_synth.c | 3 +-
sound/pci/hda/patch_via.c | 1 +
sound/soc/intel/boards/haswell.c | 1 +
tools/testing/selftests/net/fib_tests.sh | 1 -
104 files changed, 490 insertions(+), 223 deletions(-)
The recent commit to fix a memory leak introduced an inadvertant NULL
pointer dereference. The `wacom_wac->pen_fifo` variable was never
intialized, resuling in a crash whenever functions tried to use it.
Since the FIFO is only used by AES pens (to buffer events from pen
proximity until the hardware reports the pen serial number) this would
have been easily overlooked without testing an AES device.
This patch converts `wacom_wac->pen_fifo` over to a pointer (since the
call to `devres_alloc` allocates memory for us) and ensures that we assign
it to point to the allocated and initalized `pen_fifo` before the function
returns.
Link: https://github.com/linuxwacom/input-wacom/issues/230
Fixes: 37309f47e2f5 ("HID: wacom: Fix memory leakage caused by kfifo_alloc")
CC: stable(a)vger.kernel.org # v4.19+
Signed-off-by: Jason Gerecke <jason.gerecke(a)wacom.com>
Tested-by: Ping Cheng <ping.cheng(a)wacom.com>
---
drivers/hid/wacom_sys.c | 7 ++++---
drivers/hid/wacom_wac.h | 2 +-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c
index e8acd235db2a..aa9e48876ced 100644
--- a/drivers/hid/wacom_sys.c
+++ b/drivers/hid/wacom_sys.c
@@ -147,9 +147,9 @@ static int wacom_wac_pen_serial_enforce(struct hid_device *hdev,
}
if (flush)
- wacom_wac_queue_flush(hdev, &wacom_wac->pen_fifo);
+ wacom_wac_queue_flush(hdev, wacom_wac->pen_fifo);
else if (insert)
- wacom_wac_queue_insert(hdev, &wacom_wac->pen_fifo,
+ wacom_wac_queue_insert(hdev, wacom_wac->pen_fifo,
raw_data, report_size);
return insert && !flush;
@@ -1280,7 +1280,7 @@ static void wacom_devm_kfifo_release(struct device *dev, void *res)
static int wacom_devm_kfifo_alloc(struct wacom *wacom)
{
struct wacom_wac *wacom_wac = &wacom->wacom_wac;
- struct kfifo_rec_ptr_2 *pen_fifo = &wacom_wac->pen_fifo;
+ struct kfifo_rec_ptr_2 *pen_fifo;
int error;
pen_fifo = devres_alloc(wacom_devm_kfifo_release,
@@ -1297,6 +1297,7 @@ static int wacom_devm_kfifo_alloc(struct wacom *wacom)
}
devres_add(&wacom->hdev->dev, pen_fifo);
+ wacom_wac->pen_fifo = pen_fifo;
return 0;
}
diff --git a/drivers/hid/wacom_wac.h b/drivers/hid/wacom_wac.h
index da612b6e9c77..195910dd2154 100644
--- a/drivers/hid/wacom_wac.h
+++ b/drivers/hid/wacom_wac.h
@@ -342,7 +342,7 @@ struct wacom_wac {
struct input_dev *pen_input;
struct input_dev *touch_input;
struct input_dev *pad_input;
- struct kfifo_rec_ptr_2 pen_fifo;
+ struct kfifo_rec_ptr_2 *pen_fifo;
int pid;
int num_contacts_left;
u8 bt_features;
--
2.30.0
This is the start of the stable review cycle for the 5.4.93 release.
There are 86 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 27 Jan 2021 18:31:44 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.93-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.93-rc1
Enke Chen <enchen(a)paloaltonetworks.com>
tcp: fix TCP_USER_TIMEOUT with zero window
Eric Dumazet <edumazet(a)google.com>
tcp: do not mess with cloned skbs in tcp_add_backlog()
Dan Carpenter <dan.carpenter(a)oracle.com>
net: dsa: b53: fix an off by one in checking "vlan->vid"
Tariq Toukan <tariqt(a)nvidia.com>
net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: mscc: ocelot: allow offloading of bridge on top of LAG
Matteo Croce <mcroce(a)microsoft.com>
ipv6: set multicast flag on the multicast route
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Eric Dumazet <edumazet(a)google.com>
net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
Matteo Croce <mcroce(a)microsoft.com>
ipv6: create multicast route with RTPROT_KERNEL
Guillaume Nault <gnault(a)redhat.com>
udp: mask TOS bits in udp_v4_early_demux()
Lecopzer Chen <lecopzer(a)gmail.com>
kasan: fix incorrect arguments passing in kasan_add_zero_shadow
Lecopzer Chen <lecopzer(a)gmail.com>
kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
Alexander Lobakin <alobakin(a)pm.me>
skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
Pan Bian <bianpan2016(a)163.com>
lightnvm: fix memory leak when submit fails
Geert Uytterhoeven <geert+renesas(a)glider.be>
sh_eth: Fix power down vs. is_opened flag ordering
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
net: dsa: mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
sh: dma: fix kconfig dependency for G2_DMA
Guillaume Nault <gnault(a)redhat.com>
netfilter: rpfilter: mask ecn bits before fib lookup
Yazen Ghannam <Yazen.Ghannam(a)amd.com>
x86/cpu/amd: Set __max_die_per_package on AMD
Paul Cercueil <paul(a)crapouillou.net>
pinctrl: ingenic: Fix JZ4760 support
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core: Extend device_is_dependent()
JC Kuo <jckuo(a)nvidia.com>
xhci: tegra: Delay for disabling LFPS detector
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: make sure TRB is fully written before giving it to the controller
Patrik Jakobsson <patrik.r.jakobsson(a)gmail.com>
usb: bdc: Make bdc pci driver depend on BROKEN
Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
usb: udc: core: Use lock when write to soft_connect
Ryan Chen <ryan_chen(a)aspeedtech.com>
usb: gadget: aspeed: fix stop dma register setting.
Longfang Liu <liulongfang(a)huawei.com>
USB: ehci: fix an interrupt calltrace error
Eugene Korenevsky <ekorenevsky(a)astralinux.ru>
ehci: fix EHCI host controller initialization sequence
Pali Rohár <pali(a)kernel.org>
serial: mvebu-uart: fix tx lost characters at power off
Wang Hui <john.wanghui(a)huawei.com>
stm class: Fix module init return on allocation failure
Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
intel_th: pci: Add Alder Lake-P support
Andy Lutomirski <luto(a)kernel.org>
x86/mmx: Use KFPU_387 for MMX string operations
Mathias Kresin <dev(a)kresin.me>
irqchip/mips-cpu: Set IPI domain parent chip
Ronnie Sahlberg <lsahlber(a)redhat.com>
cifs: do not fail __smb_send_rqst if non-fatal signals are pending
Lars-Peter Clausen <lars(a)metafoo.de>
iio: ad5504: Fix setting power-down state
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: peak_usb: fix use after free bugs
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: vxcan: vxcan_xmit: fix use after free bug
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
can: dev: can_restart: fix use after free bug
Hangbin Liu <liuhangbin(a)gmail.com>
selftests: net: fib_tests: remove duplicate log test
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: intel-vbtn: Drop HP Stream x360 Convertible PC 11 from allow-list
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: octeon: check correct size of maximum RECV_LEN packet
Ariel Marcovitch <arielmarcovitch(a)gmail.com>
powerpc: Fix alignment bug within the init sections
Arnd Bergmann <arnd(a)arndb.de>
scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression
Billy Tsai <billy_tsai(a)aspeedtech.com>
pinctrl: aspeed: g6: Fix PWMG0 pinctrl setting
Youling Tang <tangyouling(a)loongson.cn>
powerpc: Use the common INIT_DATA_SECTION macro in vmlinux.lds.S
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/kms/nv50-: fix case where notifier buffer is at offset 0
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/mmu: fix vram heap sizing
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/i2c/gm200: increase width of aux semaphore owner fields
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/privring: ack interrupts the same way as RM
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/bios: fix issue shadowing expansion ROMs
Wayne Lin <Wayne.Lin(a)amd.com>
drm/amd/display: Fix to be able to stop crc calculation
Victor Zhao <Victor.Zhao(a)amd.com>
drm/amdgpu/psp: fix psp gfx ctrl cmds
Sagar Shrikant Kadam <sagar.kadam(a)sifive.com>
riscv: defconfig: enable gpio support for HiFive Unleashed
Sagar Shrikant Kadam <sagar.kadam(a)sifive.com>
dts: phy: add GPIO number and active state used for phy reset
Sagar Shrikant Kadam <sagar.kadam(a)sifive.com>
dts: phy: fix missing mdio device and probe failure of vsc8541-01 device
David Woodhouse <dwmw(a)amazon.co.uk>
x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
David Woodhouse <dwmw(a)amazon.co.uk>
xen: Fix event channel callback via INTX/GSI
Arnd Bergmann <arnd(a)arndb.de>
arm64: make atomic helpers __always_inline
Peter Geis <pgwipeout(a)gmail.com>
clk: tegra30: Add hda clock default rates to clock driver
Seth Miller <miller.seth(a)gmail.com>
HID: Ignore battery for Elan touchscreen on ASUS UX550
Filipe Laíns <lains(a)archlinux.org>
HID: logitech-dj: add the G602 receiver
Damien Le Moal <damien.lemoal(a)wdc.com>
riscv: Fix sifive serial driver
Damien Le Moal <damien.lemoal(a)wdc.com>
riscv: Fix kernel time_init()
Ewan D. Milne <emilne(a)redhat.com>
scsi: sd: Suppress spurious errors when WRITE SAME is being disabled
Nilesh Javali <njavali(a)marvell.com>
scsi: qedi: Correct max length of CHAP secret
Can Guo <cang(a)codeaurora.org>
scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
Anthony Iliopoulos <ailiop(a)suse.com>
dm integrity: select CRYPTO_SKCIPHER
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad device
Cezary Rojewski <cezary.rojewski(a)intel.com>
ASoC: Intel: haswell: Add missing pm_ops
Chris Wilson <chris(a)chris-wilson.co.uk>
drm/i915/gt: Prevent use of engine->wa_ctx after error
Daniel Vetter <daniel.vetter(a)ffwll.ch>
drm/syncobj: Fix use-after-free
Pan Bian <bianpan2016(a)163.com>
drm/atomic: put state on error path
Mikulas Patocka <mpatocka(a)redhat.com>
dm integrity: fix a crash if "recalculate" used without "internal_hash"
Hannes Reinecke <hare(a)suse.de>
dm: avoid filesystem lookup in dm_get_dev_t()
Alex Leibovich <alexl(a)marvell.com>
mmc: sdhci-xenon: fix 1.8v regulator stabilization
Peter Collingbourne <pcc(a)google.com>
mmc: core: don't initialize block size from ext_csd if not present
Filipe Manana <fdmanana(a)suse.com>
btrfs: send: fix invalid clone operations when cloning from the same file and root
Josef Bacik <josef(a)toxicpanda.com>
btrfs: don't clear ret in btrfs_start_dirty_block_groups
Josef Bacik <josef(a)toxicpanda.com>
btrfs: fix lockdep splat in btrfs_recover_relocation
Josef Bacik <josef(a)toxicpanda.com>
btrfs: don't get an EINTR during drop_snapshot for reloc
Hans de Goede <hdegoede(a)redhat.com>
ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda/via: Add minimum mute flag
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
Jiaxun Yang <jiaxun.yang(a)flygoat.com>
platform/x86: ideapad-laptop: Disable touchpad_switch for ELAN0634
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
platform/x86: i2c-multi-instantiate: Don't create platform device for INT3515 ACPI nodes
Mikko Perttunen <mperttunen(a)nvidia.com>
i2c: bpmp-tegra: Ignore unknown I2C_M flags
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 4 ++
Makefile | 4 +-
arch/arm/xen/enlighten.c | 2 +-
arch/arm64/include/asm/atomic.h | 10 +--
arch/powerpc/kernel/vmlinux.lds.S | 25 +++----
.../riscv/boot/dts/sifive/hifive-unleashed-a00.dts | 2 +
arch/riscv/configs/defconfig | 2 +
arch/riscv/kernel/time.c | 3 +
arch/sh/drivers/dma/Kconfig | 3 +-
arch/x86/kernel/cpu/amd.c | 4 +-
arch/x86/lib/mmx_32.c | 20 ++++--
arch/x86/xen/enlighten_hvm.c | 11 ++-
drivers/acpi/scan.c | 2 +
drivers/base/core.c | 17 ++++-
drivers/clk/tegra/clk-tegra30.c | 2 +
drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 2 +-
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c | 2 +-
drivers/gpu/drm/drm_atomic_helper.c | 2 +-
drivers/gpu/drm/drm_syncobj.c | 8 ++-
drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +
drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 +-
drivers/gpu/drm/nouveau/dispnv50/disp.h | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wimmc37b.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/auxgm200.c | 8 +--
drivers/gpu/drm/nouveau/nvkm/subdev/ibus/gf100.c | 10 ++-
drivers/gpu/drm/nouveau/nvkm/subdev/ibus/gk104.c | 10 ++-
drivers/gpu/drm/nouveau/nvkm/subdev/mmu/base.c | 6 +-
drivers/hid/hid-ids.h | 1 +
drivers/hid/hid-input.c | 2 +
drivers/hid/hid-logitech-dj.c | 4 ++
drivers/hid/hid-multitouch.c | 4 ++
drivers/hwtracing/intel_th/pci.c | 5 ++
drivers/hwtracing/stm/heartbeat.c | 6 +-
drivers/i2c/busses/i2c-octeon-core.c | 2 +-
drivers/i2c/busses/i2c-tegra-bpmp.c | 2 +-
drivers/iio/dac/ad5504.c | 4 +-
drivers/irqchip/irq-mips-cpu.c | 7 ++
drivers/lightnvm/core.c | 3 +-
drivers/md/Kconfig | 1 +
drivers/md/dm-integrity.c | 6 ++
drivers/md/dm-table.c | 15 +++-
drivers/mmc/core/queue.c | 4 +-
drivers/mmc/host/sdhci-xenon.c | 7 +-
drivers/net/can/dev.c | 4 +-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 8 +--
drivers/net/can/vxcan.c | 6 +-
drivers/net/dsa/b53/b53_common.c | 2 +-
drivers/net/dsa/mv88e6xxx/global1_vtu.c | 4 ++
drivers/net/ethernet/mscc/ocelot.c | 4 +-
drivers/net/ethernet/renesas/sh_eth.c | 4 +-
drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c | 2 +-
drivers/pinctrl/pinctrl-ingenic.c | 24 +++----
drivers/platform/x86/i2c-multi-instantiate.c | 31 ++++++---
drivers/platform/x86/ideapad-laptop.c | 15 +++-
drivers/platform/x86/intel-vbtn.c | 6 --
drivers/scsi/megaraid/megaraid_sas_base.c | 6 +-
drivers/scsi/qedi/qedi_main.c | 4 +-
drivers/scsi/sd.c | 4 +-
drivers/scsi/ufs/ufshcd.c | 11 ++-
drivers/tty/serial/mvebu-uart.c | 10 ++-
drivers/tty/serial/sifive.c | 1 +
drivers/usb/gadget/udc/aspeed-vhub/epn.c | 5 +-
drivers/usb/gadget/udc/bdc/Kconfig | 2 +-
drivers/usb/gadget/udc/core.c | 13 +++-
drivers/usb/host/ehci-hcd.c | 12 ++++
drivers/usb/host/ehci-hub.c | 3 +
drivers/usb/host/xhci-ring.c | 2 +
drivers/usb/host/xhci-tegra.c | 7 ++
drivers/xen/events/events_base.c | 10 ---
drivers/xen/platform-pci.c | 1 -
drivers/xen/xenbus/xenbus.h | 1 +
drivers/xen/xenbus/xenbus_comms.c | 8 ---
drivers/xen/xenbus/xenbus_probe.c | 81 ++++++++++++++++++----
fs/btrfs/block-group.c | 3 +-
fs/btrfs/extent-tree.c | 10 ++-
fs/btrfs/send.c | 15 ++++
fs/btrfs/volumes.c | 2 +
fs/cifs/transport.c | 4 +-
include/asm-generic/bitops/atomic.h | 6 +-
include/net/inet_connection_sock.h | 3 +
include/xen/xenbus.h | 2 +-
mm/kasan/init.c | 23 +++---
net/core/dev.c | 5 ++
net/core/skbuff.c | 6 +-
net/ipv4/inet_connection_sock.c | 1 +
net/ipv4/netfilter/ipt_rpfilter.c | 2 +-
net/ipv4/tcp.c | 1 +
net/ipv4/tcp_input.c | 1 +
net/ipv4/tcp_ipv4.c | 25 +++----
net/ipv4/tcp_output.c | 1 +
net/ipv4/tcp_timer.c | 14 ++--
net/ipv4/udp.c | 3 +-
net/ipv6/addrconf.c | 3 +-
net/sched/cls_tcindex.c | 8 ++-
net/sched/sch_api.c | 3 +-
sound/core/seq/oss/seq_oss_synth.c | 3 +-
sound/pci/hda/patch_via.c | 1 +
sound/soc/intel/boards/haswell.c | 1 +
tools/testing/selftests/net/fib_tests.sh | 1 -
100 files changed, 469 insertions(+), 214 deletions(-)
The VM_BUG_ON_PAGE avoids the generation of any code, even if that
expression has side-effects when !CONFIG_DEBUG_VM.
Fixes: e5dfacebe4a4 ("mm/hugetlb.c: just use put_page_testzero() instead of page_count()")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a6bad1f686c5..082ed643020b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2047,13 +2047,16 @@ static int gather_surplus_pages(struct hstate *h, long delta)
/* Free the needed pages to the hugetlb pool */
list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ int zeroed;
+
if ((--needed) < 0)
break;
/*
* This page is now managed by the hugetlb allocator and has
* no users -- drop the buddy allocator's reference.
*/
- VM_BUG_ON_PAGE(!put_page_testzero(page), page);
+ zeroed = put_page_testzero(page);
+ VM_BUG_ON_PAGE(!zeroed, page);
enqueue_huge_page(h, page);
}
free:
--
2.11.0
The patch titled
Subject: mm: memcontrol: prevent starvation when writing memory.high
has been removed from the -mm tree. Its filename was
mm-memcontrol-prevent-starvation-when-writing-memoryhigh.patch
This patch was dropped because an alternative patch was merged
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: prevent starvation when writing memory.high
When a value is written to a cgroup's memory.high control file, the
write() context first tries to reclaim the cgroup to size before putting
the limit in place for the workload. Concurrent charges from the workload
can keep such a write() looping in reclaim indefinitely.
In the past, a write to memory.high would first put the limit in place for
the workload, then do targeted reclaim until the new limit has been met -
similar to how we do it for memory.max. This wasn't prone to the
described starvation issue. However, this sequence could cause excessive
latencies in the workload, when allocating threads could be put into long
penalty sleeps on the sudden memory.high overage created by the write(),
before that had a chance to work it off.
Now that memory_high_write() performs reclaim before enforcing the new
limit, reflect that the cgroup may well fail to converge due to concurrent
workload activity. Bail out of the loop after a few tries.
Link: https://lkml.kernel.org/r/20210112163011.127833-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reported-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Michal Koutný <mkoutny(a)suse.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
--- a/mm/memcontrol.c~mm-memcontrol-prevent-starvation-when-writing-memoryhigh
+++ a/mm/memcontrol.c
@@ -6273,7 +6273,6 @@ static ssize_t memory_high_write(struct
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
- unsigned long reclaimed;
if (nr_pages <= high)
break;
@@ -6287,10 +6286,10 @@ static ssize_t memory_high_write(struct
continue;
}
- reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
- GFP_KERNEL, true);
+ try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
+ GFP_KERNEL, true);
- if (!reclaimed && !nr_retries--)
+ if (!nr_retries--)
break;
}
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
revert-mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
Dear stable maintainers,
Please consider cherry-picking c8a950d0d3b9 ("tools: Factor HOSTCC,
HOSTLD, HOSTAR definitions") to 5.10, 5.4 and 4.19. It fixes a problem
where the host tools set by the user could be ignored for
'prepare-objtool', and is needed on x86 to enable the ORC unwinder,
dynamic ftrace, etc. with LLVM=1 and a 'hermetic' toolchain
environment.
On 5.10.10 it cherry-picks cleanly. On 5.4.92 and 4.19.170 it
cherry-picks cleanly, besides 'tools/bpf/resolve_btfids/Makefile'
which was introduced after 5.8-rc2. (This file can be ignored.)
Thanks!
Hello stable,
The following printk patches fixing a buffer overflow potential in 5.10
are now available in Linus' tree:
f0e386ee0c0b71ea6f7238506a4d0965a2dbef11 ("printk: fix buffer overflow
potential for print_text()")
08d60e5999540110576e7c1346d486220751b7f9 ("printk: fix string
termination for record_print_text()")
The first one (f0e386ee0c0b) was already queued up for 5.10-stable but I
requested it not be applied until this second one was accepted. Now they
are both accepted and both should be applied. Thanks.
John Ogness
The patch titled
Subject: mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
has been added to the -mm tree. Its filename is
mm-filemap-adding-missing-mem_cgroup_uncharge-to-__add_to_page_cache_locked.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-filemap-adding-missing-mem_cgr…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-filemap-adding-missing-mem_cgr…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Waiman Long <longman(a)redhat.com>
Subject: mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
commit 3fea5a499d57 ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") introduced a bug in __add_to_page_cache_locked()
causing the following splat:
[ 1570.068330] page dumped because: VM_BUG_ON_PAGE(page_memcg(page))
[ 1570.068333] pages's memcg:ffff8889a4116000
[ 1570.068343] ------------[ cut here ]------------
[ 1570.068346] kernel BUG at mm/memcontrol.c:2924!
[ 1570.068355] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 1570.068359] CPU: 35 PID: 12345 Comm: cat Tainted: G S W I 5.11.0-rc4-debug+ #1
[ 1570.068363] Hardware name: HP HP Z8 G4 Workstation/81C7, BIOS P60 v01.25 12/06/2017
[ 1570.068365] RIP: 0010:commit_charge+0xf4/0x130
:
[ 1570.068375] RSP: 0018:ffff8881b38d70e8 EFLAGS: 00010286
[ 1570.068379] RAX: 0000000000000000 RBX: ffffea00260ddd00 RCX: 0000000000000027
[ 1570.068382] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88907ebe05a8
[ 1570.068384] RBP: ffffea00260ddd00 R08: ffffed120fd7c0b6 R09: ffffed120fd7c0b6
[ 1570.068386] R10: ffff88907ebe05ab R11: ffffed120fd7c0b5 R12: ffffea00260ddd38
[ 1570.068389] R13: ffff8889a4116000 R14: ffff8889a4116000 R15: 0000000000000001
[ 1570.068391] FS: 00007ff039638680(0000) GS:ffff88907ea00000(0000) knlGS:0000000000000000
[ 1570.068394] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1570.068396] CR2: 00007f36f354cc20 CR3: 00000008a0126006 CR4: 00000000007706e0
[ 1570.068398] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1570.068400] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1570.068402] PKRU: 55555554
[ 1570.068404] Call Trace:
[ 1570.068407] mem_cgroup_charge+0x175/0x770
[ 1570.068413] __add_to_page_cache_locked+0x712/0xad0
[ 1570.068439] add_to_page_cache_lru+0xc5/0x1f0
[ 1570.068461] cachefiles_read_or_alloc_pages+0x895/0x2e10 [cachefiles]
[ 1570.068524] __fscache_read_or_alloc_pages+0x6c0/0xa00 [fscache]
[ 1570.068540] __nfs_readpages_from_fscache+0x16d/0x630 [nfs]
[ 1570.068585] nfs_readpages+0x24e/0x540 [nfs]
[ 1570.068693] read_pages+0x5b1/0xc40
[ 1570.068711] page_cache_ra_unbounded+0x460/0x750
[ 1570.068729] generic_file_buffered_read_get_pages+0x290/0x1710
[ 1570.068756] generic_file_buffered_read+0x2a9/0xc30
[ 1570.068832] nfs_file_read+0x13f/0x230 [nfs]
[ 1570.068872] new_sync_read+0x3af/0x610
[ 1570.068901] vfs_read+0x339/0x4b0
[ 1570.068909] ksys_read+0xf1/0x1c0
[ 1570.068920] do_syscall_64+0x33/0x40
[ 1570.068926] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1570.068930] RIP: 0033:0x7ff039135595
Before that commit, there was a try_charge() and commit_charge() in
__add_to_page_cache_locked(). These 2 separated charge functions were
replaced by a single mem_cgroup_charge(). However, it forgot to add a
matching mem_cgroup_uncharge() when the xarray insertion failed with the
page released back to the pool. Fix this by adding a
mem_cgroup_uncharge() call when insertion error happens.
Link: https://lkml.kernel.org/r/20210125042441.20030-1-longman@redhat.com
Fixes: 3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
Signed-off-by: Waiman Long <longman(a)redhat.com>
Reviewed-by: Alex Shi <alex.shi(a)linux.alibaba.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <smuchun(a)gmail.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/filemap.c~mm-filemap-adding-missing-mem_cgroup_uncharge-to-__add_to_page_cache_locked
+++ a/mm/filemap.c
@@ -835,6 +835,7 @@ noinline int __add_to_page_cache_locked(
XA_STATE(xas, &mapping->i_pages, offset);
int huge = PageHuge(page);
int error;
+ bool charged = false;
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(PageSwapBacked(page), page);
@@ -848,6 +849,7 @@ noinline int __add_to_page_cache_locked(
error = mem_cgroup_charge(page, current->mm, gfp);
if (error)
goto error;
+ charged = true;
}
gfp &= GFP_RECLAIM_MASK;
@@ -896,6 +898,8 @@ unlock:
if (xas_error(&xas)) {
error = xas_error(&xas);
+ if (charged)
+ mem_cgroup_uncharge(page);
goto error;
}
_
Patches currently in -mm which might be from longman(a)redhat.com are
mm-filemap-adding-missing-mem_cgroup_uncharge-to-__add_to_page_cache_locked.patch
Ville noticed that the last mocs entry is used unconditionally by the HW
when it performs cache evictions, and noted that while the value is not
meant to be writable by the driver, we should program it to a reasonable
value nevertheless.
As it turns out, we can change the value of mocs:63 and the value we
were programming into it would cause hard hangs in conjunction with
atomic operations.
Suggested-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2707
Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Jason Ekstrand <jason(a)jlekstrand.net>
Cc: <stable(a)vger.kernel.org> # v4.3+
---
drivers/gpu/drm/i915/gt/intel_mocs.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 254873e1646e..6ae512847f64 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -131,7 +131,10 @@ static const struct drm_i915_mocs_entry skl_mocs_table[] = {
GEN9_MOCS_ENTRIES,
MOCS_ENTRY(I915_MOCS_CACHED,
LE_3_WB | LE_TC_2_LLC_ELLC | LE_LRUM(3),
- L3_3_WB)
+ L3_3_WB),
+ MOCS_ENTRY(63,
+ LE_3_WB | LE_TC_1_LLC | LE_LRUM(3),
+ L3_1_UC)
};
/* NOTE: the LE_TGT_CACHE is not used on Broxton */
--
2.20.1
The patch titled
Subject: proc_sysctl: fix oops caused by incorrect command parameters
has been removed from the -mm tree. Its filename was
proc_sysctl-fix-oops-caused-by-incorrect-command-parameters.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Xiaoming Ni <nixiaoming(a)huawei.com>
Subject: proc_sysctl: fix oops caused by incorrect command parameters
The process_sysctl_arg() does not check whether val is empty before
invoking strlen(val). If the command line parameter () is incorrectly
configured and val is empty, oops is triggered.
For example:
"hung_task_panic=1" is incorrectly written as "hung_task_panic", oops is
triggered. The call stack is as follows:
Kernel command line: .... hung_task_panic
......
Call trace:
__pi_strlen+0x10/0x98
parse_args+0x278/0x344
do_sysctl_args+0x8c/0xfc
kernel_init+0x5c/0xf4
ret_from_fork+0x10/0x30
To fix it, check whether "val" is empty when "phram" is a sysctl field.
Error codes are returned in the failure branch, and error logs are
generated by parse_args().
Link: https://lkml.kernel.org/r/20210118133029.28580-1-nixiaoming@huawei.com
Fixes: 3db978d480e2843 ("kernel/sysctl: support setting sysctl parameters from kernel command line")
Signed-off-by: Xiaoming Ni <nixiaoming(a)huawei.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Iurii Zaikin <yzaikin(a)google.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Heiner Kallweit <hkallweit1(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/proc_sysctl.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/fs/proc/proc_sysctl.c~proc_sysctl-fix-oops-caused-by-incorrect-command-parameters
+++ a/fs/proc/proc_sysctl.c
@@ -1770,6 +1770,12 @@ static int process_sysctl_arg(char *para
return 0;
}
+ if (!val)
+ return -EINVAL;
+ len = strlen(val);
+ if (len == 0)
+ return -EINVAL;
+
/*
* To set sysctl options, we use a temporary mount of proc, look up the
* respective sys/ file and write to it. To avoid mounting it when no
@@ -1811,7 +1817,6 @@ static int process_sysctl_arg(char *para
file, param, val);
goto out;
}
- len = strlen(val);
wret = kernel_write(file, val, len, &pos);
if (wret < 0) {
err = wret;
_
Patches currently in -mm which might be from nixiaoming(a)huawei.com are
The patch titled
Subject: mm: fix page reference leak in soft_offline_page()
has been removed from the -mm tree. Its filename was
mm-fix-page-reference-leak-in-soft_offline_page.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Dan Williams <dan.j.williams(a)intel.com>
Subject: mm: fix page reference leak in soft_offline_page()
The conversion to move pfn_to_online_page() internal to
soft_offline_page() missed that the get_user_pages() reference taken by
the madvise() path needs to be dropped when pfn_to_online_page() fails.
Note the direct sysfs-path to soft_offline_page() does not perform a
get_user_pages() lookup.
When soft_offline_page() is handed a pfn_valid() && !pfn_to_online_page()
pfn the kernel hangs at dax-device shutdown due to a leaked reference.
Link: https://lkml.kernel.org/r/161058501210.1840162.8108917599181157327.stgit@dw…
Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
--- a/mm/memory-failure.c~mm-fix-page-reference-leak-in-soft_offline_page
+++ a/mm/memory-failure.c
@@ -1885,6 +1885,12 @@ static int soft_offline_free_page(struct
return rc;
}
+static void put_ref_page(struct page *page)
+{
+ if (page)
+ put_page(page);
+}
+
/**
* soft_offline_page - Soft offline a page.
* @pfn: pfn to soft-offline
@@ -1910,20 +1916,26 @@ static int soft_offline_free_page(struct
int soft_offline_page(unsigned long pfn, int flags)
{
int ret;
- struct page *page;
bool try_again = true;
+ struct page *page, *ref_page = NULL;
+
+ WARN_ON_ONCE(!pfn_valid(pfn) && (flags & MF_COUNT_INCREASED));
if (!pfn_valid(pfn))
return -ENXIO;
+ if (flags & MF_COUNT_INCREASED)
+ ref_page = pfn_to_page(pfn);
+
/* Only online pages can be soft-offlined (esp., not ZONE_DEVICE). */
page = pfn_to_online_page(pfn);
- if (!page)
+ if (!page) {
+ put_ref_page(ref_page);
return -EIO;
+ }
if (PageHWPoison(page)) {
pr_info("%s: %#lx page already poisoned\n", __func__, pfn);
- if (flags & MF_COUNT_INCREASED)
- put_page(page);
+ put_ref_page(ref_page);
return 0;
}
_
Patches currently in -mm which might be from dan.j.williams(a)intel.com are
mm-move-pfn_to_online_page-out-of-line.patch
mm-teach-pfn_to_online_page-to-consider-subsection-validity.patch
mm-teach-pfn_to_online_page-about-zone_device-section-collisions.patch
mm-teach-pfn_to_online_page-about-zone_device-section-collisions-fix.patch
mm-fix-memory_failure-handling-of-dax-namespace-metadata.patch
The patch titled
Subject: mm: fix numa stats for thp migration
has been removed from the -mm tree. Its filename was
mm-fix-numa-stats-for-thp-migration.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Shakeel Butt <shakeelb(a)google.com>
Subject: mm: fix numa stats for thp migration
Currently the kernel is not correctly updating the numa stats for
NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY
and NR_ZONE_WRITE_PENDING, although at the moment there is no need to
handle THP migration as kernel still does not have write support for file
THP but to be more future proof, this patch adds the THP support for those
stats as well.
Link: https://lkml.kernel.org/r/20210108155813.2914586-2-shakeelb@google.com
Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Yang Shi <shy828301(a)gmail.com>
Reviewed-by: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
--- a/mm/migrate.c~mm-fix-numa-stats-for-thp-migration
+++ a/mm/migrate.c
@@ -402,6 +402,7 @@ int migrate_page_move_mapping(struct add
struct zone *oldzone, *newzone;
int dirty;
int expected_count = expected_page_refs(mapping, page) + extra_count;
+ int nr = thp_nr_pages(page);
if (!mapping) {
/* Anonymous page without mapping */
@@ -437,7 +438,7 @@ int migrate_page_move_mapping(struct add
*/
newpage->index = page->index;
newpage->mapping = page->mapping;
- page_ref_add(newpage, thp_nr_pages(page)); /* add cache reference */
+ page_ref_add(newpage, nr); /* add cache reference */
if (PageSwapBacked(page)) {
__SetPageSwapBacked(newpage);
if (PageSwapCache(page)) {
@@ -459,7 +460,7 @@ int migrate_page_move_mapping(struct add
if (PageTransHuge(page)) {
int i;
- for (i = 1; i < HPAGE_PMD_NR; i++) {
+ for (i = 1; i < nr; i++) {
xas_next(&xas);
xas_store(&xas, newpage);
}
@@ -470,7 +471,7 @@ int migrate_page_move_mapping(struct add
* to one less reference.
* We know this isn't the last reference.
*/
- page_ref_unfreeze(page, expected_count - thp_nr_pages(page));
+ page_ref_unfreeze(page, expected_count - nr);
xas_unlock(&xas);
/* Leave irq disabled to prevent preemption while updating stats */
@@ -493,17 +494,17 @@ int migrate_page_move_mapping(struct add
old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
- __dec_lruvec_state(old_lruvec, NR_FILE_PAGES);
- __inc_lruvec_state(new_lruvec, NR_FILE_PAGES);
+ __mod_lruvec_state(old_lruvec, NR_FILE_PAGES, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_PAGES, nr);
if (PageSwapBacked(page) && !PageSwapCache(page)) {
- __dec_lruvec_state(old_lruvec, NR_SHMEM);
- __inc_lruvec_state(new_lruvec, NR_SHMEM);
+ __mod_lruvec_state(old_lruvec, NR_SHMEM, -nr);
+ __mod_lruvec_state(new_lruvec, NR_SHMEM, nr);
}
if (dirty && mapping_can_writeback(mapping)) {
- __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
- __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
- __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
- __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
+ __mod_lruvec_state(old_lruvec, NR_FILE_DIRTY, -nr);
+ __mod_zone_page_state(oldzone, NR_ZONE_WRITE_PENDING, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
+ __mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
}
}
local_irq_enable();
_
Patches currently in -mm which might be from shakeelb(a)google.com are
mm-memcg-add-swapcache-stat-for-memcg-v2.patch
The patch titled
Subject: mm: memcg: fix memcg file_dirty numa stat
has been removed from the -mm tree. Its filename was
mm-memcg-fix-memcg-file_dirty-numa-stat.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Shakeel Butt <shakeelb(a)google.com>
Subject: mm: memcg: fix memcg file_dirty numa stat
The kernel updates the per-node NR_FILE_DIRTY stats on page migration but
not the memcg numa stats. That was not an issue until recently the commit
5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for
cgroup v2") exposed numa stats for the memcg. So fix the file_dirty
per-memcg numa stat.
Link: https://lkml.kernel.org/r/20210108155813.2914586-1-shakeelb@google.com
Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2")
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Acked-by: Yang Shi <shy828301(a)gmail.com>
Reviewed-by: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/migrate.c~mm-memcg-fix-memcg-file_dirty-numa-stat
+++ a/mm/migrate.c
@@ -500,9 +500,9 @@ int migrate_page_move_mapping(struct add
__inc_lruvec_state(new_lruvec, NR_SHMEM);
}
if (dirty && mapping_can_writeback(mapping)) {
- __dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY);
+ __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
__dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
- __inc_node_state(newzone->zone_pgdat, NR_FILE_DIRTY);
+ __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
__inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
}
}
_
Patches currently in -mm which might be from shakeelb(a)google.com are
mm-memcg-add-swapcache-stat-for-memcg-v2.patch
The patch titled
Subject: mm: memcg/slab: optimize objcg stock draining
has been removed from the -mm tree. Its filename was
mm-memcg-slab-optimize-objcg-stock-draining.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Roman Gushchin <guro(a)fb.com>
Subject: mm: memcg/slab: optimize objcg stock draining
Imran Khan reported a 16% regression in hackbench results caused by the
commit f2fe7b09a52b ("mm: memcg/slab: charge individual slab objects
instead of pages"). The regression is noticeable in the case of a
consequent allocation of several relatively large slab objects, e.g.
skb's. As soon as the amount of stocked bytes exceeds PAGE_SIZE,
drain_obj_stock() and __memcg_kmem_uncharge() are called, and it leads
to a number of atomic operations in page_counter_uncharge().
The corresponding call graph is below (provided by Imran Khan):
|__alloc_skb
| |
| |__kmalloc_reserve.isra.61
| | |
| | |__kmalloc_node_track_caller
| | | |
| | | |slab_pre_alloc_hook.constprop.88
| | | obj_cgroup_charge
| | | | |
| | | | |__memcg_kmem_charge
| | | | | |
| | | | | |page_counter_try_charge
| | | | |
| | | | |refill_obj_stock
| | | | | |
| | | | | |drain_obj_stock.isra.68
| | | | | | |
| | | | | | |__memcg_kmem_uncharge
| | | | | | | |
| | | | | | | |page_counter_uncharge
| | | | | | | | |
| | | | | | | | |page_counter_cancel
| | | |
| | | |
| | | |__slab_alloc
| | | | |
| | | | |___slab_alloc
| | | | |
| | | |slab_post_alloc_hook
Instead of directly uncharging the accounted kernel memory, it's possible
to refill the generic page-sized per-cpu stock instead. It's a much
faster operation, especially on a default hierarchy. As a bonus,
__memcg_kmem_uncharge_page() will also get faster, so the freeing of
page-sized kernel allocations (e.g. large kmallocs) will become faster.
A similar change has been done earlier for the socket memory by the commit
475d0487a2ad ("mm: memcontrol: use per-cpu stocks for socket memory
uncharging").
Link: https://lkml.kernel.org/r/20210106042239.2860107-1-guro@fb.com
Fixes: f2fe7b09a52b ("mm: memcg/slab: charge individual slab objects instead of
pages")
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Reported-by: Imran Khan <imran.f.khan(a)oracle.com>
Tested-by: Imran Khan <imran.f.khan(a)oracle.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Michal Koutn <mkoutny(a)suse.com>
Cc: Michal Koutný <mkoutny(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
--- a/mm/memcontrol.c~mm-memcg-slab-optimize-objcg-stock-draining
+++ a/mm/memcontrol.c
@@ -3115,9 +3115,7 @@ void __memcg_kmem_uncharge(struct mem_cg
if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
page_counter_uncharge(&memcg->kmem, nr_pages);
- page_counter_uncharge(&memcg->memory, nr_pages);
- if (do_memsw_account())
- page_counter_uncharge(&memcg->memsw, nr_pages);
+ refill_stock(memcg, nr_pages);
}
/**
_
Patches currently in -mm which might be from guro(a)fb.com are
mm-memcg-slab-pre-allocate-obj_cgroups-for-slab-caches-with-slab_account.patch
mm-kmem-make-__memcg_kmem_uncharge-static.patch
mm-cma-allocate-cma-areas-bottom-up.patch
mm-cma-allocate-cma-areas-bottom-up-fix.patch
mm-cma-allocate-cma-areas-bottom-up-fix-2.patch
mm-cma-allocate-cma-areas-bottom-up-fix-3.patch
memblock-do-not-start-bottom-up-allocations-with-kernel_end.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch
The patch titled
Subject: mm: fix initialization of struct page for holes in memory layout
has been removed from the -mm tree. Its filename was
mm-fix-initialization-of-struct-page-for-holes-in-memory-layout.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Rapoport <rppt(a)linux.ibm.com>
Subject: mm: fix initialization of struct page for holes in memory layout
There could be struct pages that are not backed by actual physical memory.
This can happen when the actual memory bank is not a multiple of
SECTION_SIZE or when an architecture does not register memory holes
reserved by the firmware as memblock.memory.
Such pages are currently initialized using init_unavailable_mem() function
that iterates through PFNs in holes in memblock.memory and if there is a
struct page corresponding to a PFN, the fields if this page are set to
default values and the page is marked as Reserved.
init_unavailable_mem() does not take into account zone and node the page
belongs to and sets both zone and node links in struct page to zero.
On a system that has firmware reserved holes in a zone above ZONE_DMA, for
instance in a configuration below:
# grep -A1 E820 /proc/iomem
7a17b000-7a216fff : Unknown E820 type
7a217000-7bffffff : System RAM
unset zone link in struct page will trigger
VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
because there are pages in both ZONE_DMA32 and ZONE_DMA (unset zone link
in struct page) in the same pageblock.
Update init_unavailable_mem() to use zone constraints defined by an
architecture to properly setup the zone link and use node ID of the
adjacent range in memblock.memory to set the node link.
Link: https://lkml.kernel.org/r/20210111194017.22696-3-rppt@kernel.org
Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reported-by: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 84 +++++++++++++++++++++++++++-------------------
1 file changed, 50 insertions(+), 34 deletions(-)
--- a/mm/page_alloc.c~mm-fix-initialization-of-struct-page-for-holes-in-memory-layout
+++ a/mm/page_alloc.c
@@ -7078,23 +7078,26 @@ void __init free_area_init_memoryless_no
* Initialize all valid struct pages in the range [spfn, epfn) and mark them
* PageReserved(). Return the number of struct pages that were initialized.
*/
-static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn)
+static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
+ int zone, int nid)
{
- unsigned long pfn;
+ unsigned long pfn, zone_spfn, zone_epfn;
u64 pgcnt = 0;
+ zone_spfn = arch_zone_lowest_possible_pfn[zone];
+ zone_epfn = arch_zone_highest_possible_pfn[zone];
+
+ spfn = clamp(spfn, zone_spfn, zone_epfn);
+ epfn = clamp(epfn, zone_spfn, zone_epfn);
+
for (pfn = spfn; pfn < epfn; pfn++) {
if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
+ pageblock_nr_pages - 1;
continue;
}
- /*
- * Use a fake node/zone (0) for now. Some of these pages
- * (in memblock.reserved but not in memblock.memory) will
- * get re-initialized via reserve_bootmem_region() later.
- */
- __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
+
+ __init_single_page(pfn_to_page(pfn), pfn, zone, nid);
__SetPageReserved(pfn_to_page(pfn));
pgcnt++;
}
@@ -7103,51 +7106,64 @@ static u64 __init init_unavailable_range
}
/*
- * Only struct pages that are backed by physical memory are zeroed and
- * initialized by going through __init_single_page(). But, there are some
- * struct pages which are reserved in memblock allocator and their fields
- * may be accessed (for example page_to_pfn() on some configuration accesses
- * flags). We must explicitly initialize those struct pages.
+ * Only struct pages that correspond to ranges defined by memblock.memory
+ * are zeroed and initialized by going through __init_single_page() during
+ * memmap_init().
*
- * This function also addresses a similar issue where struct pages are left
- * uninitialized because the physical address range is not covered by
- * memblock.memory or memblock.reserved. That could happen when memblock
- * layout is manually configured via memmap=, or when the highest physical
- * address (max_pfn) does not end on a section boundary.
+ * But, there could be struct pages that correspond to holes in
+ * memblock.memory. This can happen because of the following reasons:
+ * - phyiscal memory bank size is not necessarily the exact multiple of the
+ * arbitrary section size
+ * - early reserved memory may not be listed in memblock.memory
+ * - memory layouts defined with memmap= kernel parameter may not align
+ * nicely with memmap sections
+ *
+ * Explicitly initialize those struct pages so that:
+ * - PG_Reserved is set
+ * - zone link is set accorging to the architecture constrains
+ * - node is set to node id of the next populated region except for the
+ * trailing hole where last node id is used
*/
-static void __init init_unavailable_mem(void)
+static void __init init_zone_unavailable_mem(int zone)
{
- phys_addr_t start, end;
- u64 i, pgcnt;
- phys_addr_t next = 0;
+ unsigned long start, end;
+ int i, nid;
+ u64 pgcnt;
+ unsigned long next = 0;
/*
- * Loop through unavailable ranges not covered by memblock.memory.
+ * Loop through holes in memblock.memory and initialize struct
+ * pages corresponding to these holes
*/
pgcnt = 0;
- for_each_mem_range(i, &start, &end) {
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid) {
if (next < start)
- pgcnt += init_unavailable_range(PFN_DOWN(next),
- PFN_UP(start));
+ pgcnt += init_unavailable_range(next, start, zone, nid);
next = end;
}
/*
- * Early sections always have a fully populated memmap for the whole
- * section - see pfn_valid(). If the last section has holes at the
- * end and that section is marked "online", the memmap will be
- * considered initialized. Make sure that memmap has a well defined
- * state.
+ * Last section may surpass the actual end of memory (e.g. we can
+ * have 1Gb section and 512Mb of RAM pouplated).
+ * Make sure that memmap has a well defined state in this case.
*/
- pgcnt += init_unavailable_range(PFN_DOWN(next),
- round_up(max_pfn, PAGES_PER_SECTION));
+ end = round_up(max_pfn, PAGES_PER_SECTION);
+ pgcnt += init_unavailable_range(next, end, zone, nid);
/*
* Struct pages that do not have backing memory. This could be because
* firmware is using some of this memory, or for some other reasons.
*/
if (pgcnt)
- pr_info("Zeroed struct page in unavailable ranges: %lld pages", pgcnt);
+ pr_info("Zone %s: zeroed struct page in unavailable ranges: %lld pages", zone_names[zone], pgcnt);
+}
+
+static void __init init_unavailable_mem(void)
+{
+ int zone;
+
+ for (zone = 0; zone < ZONE_MOVABLE; zone++)
+ init_zone_unavailable_mem(zone);
}
#else
static inline void __init init_unavailable_mem(void)
_
Patches currently in -mm which might be from rppt(a)linux.ibm.com are
mm-add-definition-of-pmd_page_order.patch
mmap-make-mlock_future_check-global.patch
riscv-kconfig-make-direct-map-manipulation-options-depend-on-mmu.patch
set_memory-allow-set_direct_map__noflush-for-multiple-pages.patch
set_memory-allow-querying-whether-set_direct_map_-is-actually-enabled.patch
mm-introduce-memfd_secret-system-call-to-create-secret-memory-areas.patch
secretmem-use-pmd-size-pages-to-amortize-direct-map-fragmentation.patch
secretmem-add-memcg-accounting.patch
pm-hibernate-disable-when-there-are-active-secretmem-users.patch
arch-mm-wire-up-memfd_secret-system-call-where-relevant.patch
secretmem-test-add-basic-selftest-for-memfd_secret2.patch
The patch titled
Subject: x86/setup: don't remove E820_TYPE_RAM for pfn 0
has been removed from the -mm tree. Its filename was
x86-setup-dont-remove-e820_type_ram-for-pfn-0.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Rapoport <rppt(a)linux.ibm.com>
Subject: x86/setup: don't remove E820_TYPE_RAM for pfn 0
Patch series "mm: fix initialization of struct page for holes in memory layout", v3.
Commit 73a6e474cb37 ("mm: memmap_init: iterate over
memblock regions rather that check each PFN") exposed several issues with
the memory map initialization and these patches fix those issues.
Initially there were crashes during compaction that Qian Cai reported back
in April [1]. It seemed back then that the problem was fixed, but a few
weeks ago Andrea Arcangeli hit the same bug [2] and there was an additional
discussion at [3].
[1] https://lore.kernel.org/lkml/8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw
[2] https://lore.kernel.org/lkml/20201121194506.13464-1-aarcange@redhat.com
[3] https://lore.kernel.org/mm-commits/20201206005401.qKuAVgOXr%akpm@linux-foun…
This patch (of 2):
The first 4Kb of memory is a BIOS owned area and to avoid its allocation
for the kernel it was not listed in e820 tables as memory. As the result,
pfn 0 was never recognised by the generic memory management and it is not
a part of neither node 0 nor ZONE_DMA.
If set_pfnblock_flags_mask() would be ever called for the pageblock
corresponding to the first 2Mbytes of memory, having pfn 0 outside of
ZONE_DMA would trigger
VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
Along with reserving the first 4Kb in e820 tables, several first pages are
reserved with memblock in several places during setup_arch(). These
reservations are enough to ensure the kernel does not touch the BIOS area
and it is not necessary to remove E820_TYPE_RAM for pfn 0.
Remove the update of e820 table that changes the type of pfn 0 and move
the comment describing why it was done to trim_low_memory_range() that
reserves the beginning of the memory.
Link: https://lkml.kernel.org/r/20210111194017.22696-2-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/kernel/setup.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
--- a/arch/x86/kernel/setup.c~x86-setup-dont-remove-e820_type_ram-for-pfn-0
+++ a/arch/x86/kernel/setup.c
@@ -661,17 +661,6 @@ static void __init trim_platform_memory_
static void __init trim_bios_range(void)
{
/*
- * A special case is the first 4Kb of memory;
- * This is a BIOS owned area, not kernel ram, but generally
- * not listed as such in the E820 table.
- *
- * This typically reserves additional memory (64KiB by default)
- * since some BIOSes are known to corrupt low memory. See the
- * Kconfig help text for X86_RESERVE_LOW.
- */
- e820__range_update(0, PAGE_SIZE, E820_TYPE_RAM, E820_TYPE_RESERVED);
-
- /*
* special case: Some BIOSes report the PC BIOS
* area (640Kb -> 1Mb) as RAM even though it is not.
* take them out.
@@ -728,6 +717,15 @@ early_param("reservelow", parse_reservel
static void __init trim_low_memory_range(void)
{
+ /*
+ * A special case is the first 4Kb of memory;
+ * This is a BIOS owned area, not kernel ram, but generally
+ * not listed as such in the E820 table.
+ *
+ * This typically reserves additional memory (64KiB by default)
+ * since some BIOSes are known to corrupt low memory. See the
+ * Kconfig help text for X86_RESERVE_LOW.
+ */
memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
}
_
Patches currently in -mm which might be from rppt(a)linux.ibm.com are
mm-add-definition-of-pmd_page_order.patch
mmap-make-mlock_future_check-global.patch
riscv-kconfig-make-direct-map-manipulation-options-depend-on-mmu.patch
set_memory-allow-set_direct_map__noflush-for-multiple-pages.patch
set_memory-allow-querying-whether-set_direct_map_-is-actually-enabled.patch
mm-introduce-memfd_secret-system-call-to-create-secret-memory-areas.patch
secretmem-use-pmd-size-pages-to-amortize-direct-map-fragmentation.patch
secretmem-add-memcg-accounting.patch
pm-hibernate-disable-when-there-are-active-secretmem-users.patch
arch-mm-wire-up-memfd_secret-system-call-where-relevant.patch
secretmem-test-add-basic-selftest-for-memfd_secret2.patch
Backport a lazytime fix from upstream to 4.14-stable. To make it
cherry-pick cleanly, I first cherry-picked two other commits. Commit #2
had a conflict due to it making a change to fs/xfs/ which isn't
applicable to 4.14. I dropped that part of the change.
Christoph Hellwig (1):
fs: move I_DIRTY_INODE to fs.h
Eric Biggers (1):
fs: fix lazytime expiration handling in __writeback_single_inode()
Jan Kara (1):
writeback: Drop I_DIRTY_TIME_EXPIRE
fs/ext4/inode.c | 6 ++---
fs/fs-writeback.c | 43 +++++++++++++-------------------
fs/gfs2/super.c | 2 +-
include/linux/fs.h | 4 +--
include/trace/events/writeback.h | 1 -
5 files changed, 24 insertions(+), 32 deletions(-)
--
2.30.0
Backport a lazytime fix from upstream to 4.19-stable. The first commit
had a trivial conflict in fs/xfs/ due to a file being renamed.
Following that, the second commit is a clean cherry-pick.
Eric Biggers (1):
fs: fix lazytime expiration handling in __writeback_single_inode()
Jan Kara (1):
writeback: Drop I_DIRTY_TIME_EXPIRE
fs/ext4/inode.c | 2 +-
fs/fs-writeback.c | 36 ++++++++++++++------------------
fs/xfs/xfs_trans_inode.c | 4 ++--
include/linux/fs.h | 1 -
include/trace/events/writeback.h | 1 -
5 files changed, 19 insertions(+), 25 deletions(-)
--
2.30.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 12 Jan 2021 11:02:43 -0800
Subject: [PATCH] fs: fix lazytime expiration handling in
__writeback_single_inode()
When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.
This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state. This causes two bugs:
- mark_inode_dirty_sync() redirties the inode, causing it to remain
dirty. This wastefully causes the inode to be written twice. But
more importantly, it breaks cases where sync_filesystem() is expected
to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
ioctl (as reported at
https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
as possibly filesystem freezing (freeze_super()).
- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
called from __writeback_single_inode() for lazytime expiration,
xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
lazytime expirations, and it assumes that i_state will contain
I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state. This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.
This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled. It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Cc: stable(a)vger.kernel.org
Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index acfb55834af2..c41cb887eb7d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
}
/*
- * Some filesystems may redirty the inode during the writeback
- * due to delalloc, clear dirty metadata flags right before
- * write_inode()
+ * If the inode has dirty timestamps and we need to write them, call
+ * mark_inode_dirty_sync() to notify the filesystem about it and to
+ * change I_DIRTY_TIME into I_DIRTY_SYNC.
*/
- spin_lock(&inode->i_lock);
-
- dirty = inode->i_state & I_DIRTY;
if ((inode->i_state & I_DIRTY_TIME) &&
- ((dirty & I_DIRTY_INODE) ||
- wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
+ (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
time_after(jiffies, inode->dirtied_time_when +
dirtytime_expire_interval * HZ))) {
- dirty |= I_DIRTY_TIME;
trace_writeback_lazytime(inode);
+ mark_inode_dirty_sync(inode);
}
+
+ /*
+ * Some filesystems may redirty the inode during the writeback
+ * due to delalloc, clear dirty metadata flags right before
+ * write_inode()
+ */
+ spin_lock(&inode->i_lock);
+ dirty = inode->i_state & I_DIRTY;
inode->i_state &= ~dirty;
/*
@@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
spin_unlock(&inode->i_lock);
- if (dirty & I_DIRTY_TIME)
- mark_inode_dirty_sync(inode);
/* Don't write the inode if only I_DIRTY_PAGES was set */
if (dirty & ~I_DIRTY_PAGES) {
int err = write_inode(inode, wbc);
From: Gaurav Kohli <gkohli(a)codeaurora.org>
[ upstream commit bbeb97464eefc65f506084fd9f18f21653e01137 ]
Below race can come, if trace_open and resize of
cpu buffer is running parallely on different cpus
CPUX CPUY
ring_buffer_resize
atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
rb_update_pages
remove/insert pages
resetting pointer
This race can cause data abort or some times infinite loop in
rb_remove_pages and rb_insert_pages while checking pages
for sanity.
Take buffer lock to fix this.
Link: https://lkml.kernel.org/r/1601976833-24377-1-git-send-email-gkohli@codeauro…
Cc: stable(a)vger.kernel.org
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic")
Reported-by: Denis Efremov <efremov(a)linux.com>
Signed-off-by: Gaurav Kohli <gkohli(a)codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
I updated the "Fixes:" tag with the proper commit it fixes.
-- Steve
kernel/trace/ring_buffer.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 077877ed54f7..728374166653 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4448,6 +4448,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
if (!cpumask_test_cpu(cpu, buffer->cpumask))
return;
+ /* prevent another thread from changing buffer sizes */
+ mutex_lock(&buffer->mutex);
atomic_inc(&buffer->resize_disabled);
atomic_inc(&cpu_buffer->record_disabled);
@@ -4471,6 +4473,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
atomic_dec(&cpu_buffer->record_disabled);
atomic_dec(&buffer->resize_disabled);
+
+ mutex_unlock(&buffer->mutex);
}
EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
--
2.25.4
commit dca5244d2f5b94f1809f0c02a549edf41ccd5493 upstream.
GCC versions >= 4.9 and < 5.1 have been shown to emit memory references
beyond the stack pointer, resulting in memory corruption if an interrupt
is taken after the stack pointer has been adjusted but before the
reference has been executed. This leads to subtle, infrequent data
corruption such as the EXT4 problems reported by Russell King at the
link below.
Life is too short for buggy compilers, so raise the minimum GCC version
required by arm64 to 5.1.
Reported-by: Russell King <linux(a)armlinux.org.uk>
Suggested-by: Arnd Bergmann <arnd(a)kernel.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org> # 4.4.y, 4.9.y and 4.14.y only
Cc: Theodore Ts'o <tytso(a)mit.edu>
Cc: Florian Weimer <fweimer(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Nathan Chancellor <natechancellor(a)gmail.com>
Cc: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Link: https://lore.kernel.org/r/20210105154726.GD1551@shell.armlinux.org.uk
Link: https://lore.kernel.org/r/20210112224832.10980-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
[will: backport to 4.4.y/4.9.y/4.14.y; add __clang__ check]
Link: https://lore.kernel.org/r/CA+G9fYuzE9WMSB7uGjV4gTzK510SHEdJb_UXQCzsQ5MqA=h9…
Signed-off-by: Will Deacon <will(a)kernel.org>
---
include/linux/compiler-gcc.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index af8b4a879934..9485abe76b68 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -145,6 +145,12 @@
#if GCC_VERSION < 30200
# error Sorry, your compiler is too old - please upgrade it.
+#elif defined(CONFIG_ARM64) && GCC_VERSION < 50100 && !defined(__clang__)
+/*
+ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63293
+ * https://lore.kernel.org/r/20210107111841.GN1551@shell.armlinux.org.uk
+ */
+# error Sorry, your version of GCC is too old - please use 5.1 or newer.
#endif
#if GCC_VERSION < 30300
--
2.30.0.280.ga3ce27912f-goog
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2a0435df963f996ca870a2ef1cbf1773dc0ea25a Mon Sep 17 00:00:00 2001
From: Stephan Gerhold <stephan(a)gerhold.net>
Date: Thu, 7 Jan 2021 17:51:31 +0100
Subject: [PATCH] ASoC: hdmi-codec: Fix return value in hdmi_codec_set_jack()
Sound is broken on the DragonBoard 410c (apq8016_sbc) since 5.10:
hdmi-audio-codec hdmi-audio-codec.1.auto: ASoC: error at snd_soc_component_set_jack on hdmi-audio-codec.1.auto: -95
qcom-apq8016-sbc 7702000.sound: Failed to set jack: -95
ADV7533: ASoC: error at snd_soc_link_init on ADV7533: -95
hdmi-audio-codec hdmi-audio-codec.1.auto: ASoC: error at snd_soc_component_set_jack on hdmi-audio-codec.1.auto: -95
qcom-apq8016-sbc: probe of 7702000.sound failed with error -95
This happens because apq8016_sbc calls snd_soc_component_set_jack() on
all codec DAIs and attempts to ignore failures with return code -ENOTSUPP.
-ENOTSUPP is also excluded from error logging in soc_component_ret().
However, hdmi_codec_set_jack() returns -E*OP*NOTSUPP if jack detection
is not supported, which is not handled in apq8016_sbc and soc_component_ret().
Make it return -ENOTSUPP instead to fix sound and silence the errors.
Cc: Cheng-Yi Chiang <cychiang(a)chromium.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Fixes: 55c5cc63ab32 ("ASoC: hdmi-codec: Use set_jack ops to set jack")
Signed-off-by: Stephan Gerhold <stephan(a)gerhold.net>
Acked-by: Nicolin Chen <nicoleotsuka(a)gmail.com>
Link: https://lore.kernel.org/r/20210107165131.2535-1-stephan@gerhold.net
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c
index d5fcc4db8284..0f3ac22f2cf8 100644
--- a/sound/soc/codecs/hdmi-codec.c
+++ b/sound/soc/codecs/hdmi-codec.c
@@ -717,7 +717,7 @@ static int hdmi_codec_set_jack(struct snd_soc_component *component,
void *data)
{
struct hdmi_codec_priv *hcp = snd_soc_component_get_drvdata(component);
- int ret = -EOPNOTSUPP;
+ int ret = -ENOTSUPP;
if (hcp->hcd.ops->hook_plugged_cb) {
hcp->jack = jack;
diff --git a/sound/soc/fsl/imx-hdmi.c b/sound/soc/fsl/imx-hdmi.c
index ede4a9ad1054..dbbb7618351c 100644
--- a/sound/soc/fsl/imx-hdmi.c
+++ b/sound/soc/fsl/imx-hdmi.c
@@ -90,7 +90,7 @@ static int imx_hdmi_init(struct snd_soc_pcm_runtime *rtd)
}
ret = snd_soc_component_set_jack(component, &data->hdmi_jack, NULL);
- if (ret && ret != -EOPNOTSUPP) {
+ if (ret && ret != -ENOTSUPP) {
dev_err(card->dev, "Can't set HDMI Jack %d\n", ret);
return ret;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a95881d6aa2c000e3649f27a1a7329cf356e6bb3 Mon Sep 17 00:00:00 2001
From: Douglas Anderson <dianders(a)chromium.org>
Date: Thu, 14 Jan 2021 19:16:23 -0800
Subject: [PATCH] pinctrl: qcom: Properly clear "intr_ack_high" interrupts when
unmasking
In commit 4b7618fdc7e6 ("pinctrl: qcom: Add irq_enable callback for
msm gpio") we tried to Ack interrupts during unmask. However, that
patch forgot to check "intr_ack_high" so, presumably, it only worked
for a certain subset of SoCs.
Let's add a small accessor so we don't need to open-code the logic in
both places.
This was found by code inspection. I don't have any access to the
hardware in question nor software that needs the Ack during unmask.
Fixes: 4b7618fdc7e6 ("pinctrl: qcom: Add irq_enable callback for msm gpio")
Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
Reviewed-by: Maulik Shah <mkshah(a)codeaurora.org>
Tested-by: Maulik Shah <mkshah(a)codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd(a)chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Link: https://lore.kernel.org/r/20210114191601.v7.3.I32d0f4e174d45363b49ab611a13c…
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index 2f363c28d9d9..192ed31eabf4 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -96,6 +96,14 @@ MSM_ACCESSOR(intr_cfg)
MSM_ACCESSOR(intr_status)
MSM_ACCESSOR(intr_target)
+static void msm_ack_intr_status(struct msm_pinctrl *pctrl,
+ const struct msm_pingroup *g)
+{
+ u32 val = g->intr_ack_high ? BIT(g->intr_status_bit) : 0;
+
+ msm_writel_intr_status(val, pctrl, g);
+}
+
static int msm_get_groups_count(struct pinctrl_dev *pctldev)
{
struct msm_pinctrl *pctrl = pinctrl_dev_get_drvdata(pctldev);
@@ -797,7 +805,7 @@ static void msm_gpio_irq_clear_unmask(struct irq_data *d, bool status_clear)
* when the interrupt is not in use.
*/
if (status_clear)
- msm_writel_intr_status(0, pctrl, g);
+ msm_ack_intr_status(pctrl, g);
val = msm_readl_intr_cfg(pctrl, g);
val |= BIT(g->intr_raw_status_bit);
@@ -890,7 +898,6 @@ static void msm_gpio_irq_ack(struct irq_data *d)
struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
const struct msm_pingroup *g;
unsigned long flags;
- u32 val;
if (test_bit(d->hwirq, pctrl->skip_wake_irqs)) {
if (test_bit(d->hwirq, pctrl->dual_edge_irqs))
@@ -902,8 +909,7 @@ static void msm_gpio_irq_ack(struct irq_data *d)
raw_spin_lock_irqsave(&pctrl->lock, flags);
- val = g->intr_ack_high ? BIT(g->intr_status_bit) : 0;
- msm_writel_intr_status(val, pctrl, g);
+ msm_ack_intr_status(pctrl, g);
if (test_bit(d->hwirq, pctrl->dual_edge_irqs))
msm_gpio_update_dual_edge_pos(pctrl, g, d);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a95881d6aa2c000e3649f27a1a7329cf356e6bb3 Mon Sep 17 00:00:00 2001
From: Douglas Anderson <dianders(a)chromium.org>
Date: Thu, 14 Jan 2021 19:16:23 -0800
Subject: [PATCH] pinctrl: qcom: Properly clear "intr_ack_high" interrupts when
unmasking
In commit 4b7618fdc7e6 ("pinctrl: qcom: Add irq_enable callback for
msm gpio") we tried to Ack interrupts during unmask. However, that
patch forgot to check "intr_ack_high" so, presumably, it only worked
for a certain subset of SoCs.
Let's add a small accessor so we don't need to open-code the logic in
both places.
This was found by code inspection. I don't have any access to the
hardware in question nor software that needs the Ack during unmask.
Fixes: 4b7618fdc7e6 ("pinctrl: qcom: Add irq_enable callback for msm gpio")
Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
Reviewed-by: Maulik Shah <mkshah(a)codeaurora.org>
Tested-by: Maulik Shah <mkshah(a)codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd(a)chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Link: https://lore.kernel.org/r/20210114191601.v7.3.I32d0f4e174d45363b49ab611a13c…
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index 2f363c28d9d9..192ed31eabf4 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -96,6 +96,14 @@ MSM_ACCESSOR(intr_cfg)
MSM_ACCESSOR(intr_status)
MSM_ACCESSOR(intr_target)
+static void msm_ack_intr_status(struct msm_pinctrl *pctrl,
+ const struct msm_pingroup *g)
+{
+ u32 val = g->intr_ack_high ? BIT(g->intr_status_bit) : 0;
+
+ msm_writel_intr_status(val, pctrl, g);
+}
+
static int msm_get_groups_count(struct pinctrl_dev *pctldev)
{
struct msm_pinctrl *pctrl = pinctrl_dev_get_drvdata(pctldev);
@@ -797,7 +805,7 @@ static void msm_gpio_irq_clear_unmask(struct irq_data *d, bool status_clear)
* when the interrupt is not in use.
*/
if (status_clear)
- msm_writel_intr_status(0, pctrl, g);
+ msm_ack_intr_status(pctrl, g);
val = msm_readl_intr_cfg(pctrl, g);
val |= BIT(g->intr_raw_status_bit);
@@ -890,7 +898,6 @@ static void msm_gpio_irq_ack(struct irq_data *d)
struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
const struct msm_pingroup *g;
unsigned long flags;
- u32 val;
if (test_bit(d->hwirq, pctrl->skip_wake_irqs)) {
if (test_bit(d->hwirq, pctrl->dual_edge_irqs))
@@ -902,8 +909,7 @@ static void msm_gpio_irq_ack(struct irq_data *d)
raw_spin_lock_irqsave(&pctrl->lock, flags);
- val = g->intr_ack_high ? BIT(g->intr_status_bit) : 0;
- msm_writel_intr_status(val, pctrl, g);
+ msm_ack_intr_status(pctrl, g);
if (test_bit(d->hwirq, pctrl->dual_edge_irqs))
msm_gpio_update_dual_edge_pos(pctrl, g, d);
Commit fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3
core") introduced clock support and a new function named
dwc3_core_init_for_resume() which enables the clock before calling
dwc3_core_init() during resume as clocks get disabled during suspend.
Unfortunately in this commit the DWC3_GCTL_PRTCAP_OTG case was forgotten
and therefore during resume, a platform could call dwc3_core_init()
without re-enabling the clocks first, preventing to resume properly.
So update the resume path to call dwc3_core_init_for_resume() as it
should.
Fixes: fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3
core")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gary Bisson <gary.bisson(a)boundarydevices.com>
---
drivers/usb/dwc3/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 841daec70b6e..3101f0dcf6ae 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1758,7 +1758,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
if (PMSG_IS_AUTO(msg))
break;
- ret = dwc3_core_init(dwc);
+ ret = dwc3_core_init_for_resume(dwc);
if (ret)
return ret;
--
2.29.2
Hi Greg,
On 2021-01-24, <gregkh(a)linuxfoundation.org> wrote:
> This is a note to let you know that I've just added the patch titled
>
> printk: fix buffer overflow potential for print_text()
We just learned that this patch introduces a new problem. I have just
posted a patch to fix the new problem:
https://lkml.kernel.org/r/20210124202728.4718-1-john.ogness@linutronix.de
You may want to hold off on applying the first fix until the second fix
has been accepted. Then you can apply both.
John Ogness
While running btrfs/011 in a loop I would often ASSERT() while trying to
add a new free space entry that already existed, or get an -EEXIST while
adding a new block to the extent tree, which is another indication of
double allocation.
This occurs because when we do the free space tree population, we create
the new root and then populate the tree and commit the transaction.
The problem is when you create a new root, the root node and commit root
node are the same. During this initial transaction commit we will run
all of the delayed refs that were paused during the free space tree
generation, and thus begin to cache block groups. While caching block
groups the caching thread will be reading from the main root for the
free space tree, so as we make allocations we'll be changing the free
space tree, which can cause us to add the same range twice which results
in either the ASSERT(ret != -EEXIST); in __btrfs_add_free_space, or in a
variety of different errors when running delayed refs because of a
double allocation.
Fix this by marking the fs_info as unsafe to load the free space tree,
and fall back on the old slow method. We could be smarter than this,
for example caching the block group while we're populating the free
space tree, but since this is a serious problem I've opted for the
simplest solution.
CC: stable(a)vger.kernel.org # 4.5+
Fixes: a5ed91828518 ("Btrfs: implement the free space B-tree")
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
---
v2->v3:
- Updated the changelog to be more specific about the scenario that the problem
happens under.
- Fixed the stable tag as well.
v1->v2:
- Dropped the UNTRUSTED bit on abort.
fs/btrfs/block-group.c | 11 ++++++++++-
fs/btrfs/ctree.h | 3 +++
fs/btrfs/free-space-tree.c | 10 +++++++++-
3 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 0886e81e5540..290f05e87a6b 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -673,7 +673,16 @@ static noinline void caching_thread(struct btrfs_work *work)
wake_up(&caching_ctl->wait);
}
- if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE))
+ /*
+ * If we are in the transaction that populated the free space tree we
+ * can't actually cache from the free space tree as our commit root and
+ * real root are the same, so we could change the contents of the blocks
+ * while caching. Instead do the slow caching in this case, and after
+ * the transaction has committed we will be safe.
+ */
+ if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
+ !(test_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
+ &fs_info->flags)))
ret = load_free_space_tree(caching_ctl);
else
ret = load_extent_tree_free(caching_ctl);
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 1bd416e5a731..1fdb5dbe5287 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -563,6 +563,9 @@ enum {
/* Indicate that we need to cleanup space cache v1 */
BTRFS_FS_CLEANUP_SPACE_CACHE_V1,
+
+ /* Indicate that we can't trust the free space tree for caching yet. */
+ BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
};
/*
diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
index e33a65bd9a0c..a33bca94d133 100644
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -1150,6 +1150,7 @@ int btrfs_create_free_space_tree(struct btrfs_fs_info *fs_info)
return PTR_ERR(trans);
set_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
+ set_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
free_space_root = btrfs_create_tree(trans,
BTRFS_FREE_SPACE_TREE_OBJECTID);
if (IS_ERR(free_space_root)) {
@@ -1171,11 +1172,18 @@ int btrfs_create_free_space_tree(struct btrfs_fs_info *fs_info)
btrfs_set_fs_compat_ro(fs_info, FREE_SPACE_TREE);
btrfs_set_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID);
clear_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
+ ret = btrfs_commit_transaction(trans);
- return btrfs_commit_transaction(trans);
+ /*
+ * Now that we've committed the transaction any reading of our commit
+ * root will be safe, so we can cache from the free space tree now.
+ */
+ clear_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
+ return ret;
abort:
clear_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
+ clear_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
btrfs_abort_transaction(trans, ret);
btrfs_end_transaction(trans);
return ret;
--
2.26.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b7ba6cfabc42fc846eb96e33f1edcd3ea6290a27 Mon Sep 17 00:00:00 2001
From: Yingjie Wang <wangyingjie55(a)126.com>
Date: Fri, 15 Jan 2021 06:10:04 -0800
Subject: [PATCH] octeontx2-af: Fix missing check bugs in rvu_cgx.c
In rvu_mbox_handler_cgx_mac_addr_get()
and rvu_mbox_handler_cgx_mac_addr_set(),
the msg is expected only from PFs that are mapped to CGX LMACs.
It should be checked before mapping,
so we add the is_cgx_config_permitted() in the functions.
Fixes: 96be2e0da85e ("octeontx2-af: Support for MAC address filters in CGX")
Signed-off-by: Yingjie Wang <wangyingjie55(a)126.com>
Reviewed-by: Geetha sowjanya<gakula(a)marvell.com>
Link: https://lore.kernel.org/r/1610719804-35230-1-git-send-email-wangyingjie55@1…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
index d298b9357177..6c6b411e78fd 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
@@ -469,6 +469,9 @@ int rvu_mbox_handler_cgx_mac_addr_set(struct rvu *rvu,
int pf = rvu_get_pf(req->hdr.pcifunc);
u8 cgx_id, lmac_id;
+ if (!is_cgx_config_permitted(rvu, req->hdr.pcifunc))
+ return -EPERM;
+
rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx_id, &lmac_id);
cgx_lmac_addr_set(cgx_id, lmac_id, req->mac_addr);
@@ -485,6 +488,9 @@ int rvu_mbox_handler_cgx_mac_addr_get(struct rvu *rvu,
int rc = 0, i;
u64 cfg;
+ if (!is_cgx_config_permitted(rvu, req->hdr.pcifunc))
+ return -EPERM;
+
rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx_id, &lmac_id);
rsp->hdr.rc = rc;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e708789c4a87989faff1131ccfdc465a1c1eddbc Mon Sep 17 00:00:00 2001
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
Date: Thu, 7 Jan 2021 09:38:13 +0100
Subject: [PATCH] mtd: spinand: Fix MTD_OPS_AUTO_OOB requests
The initial change breaking the logic is
commit 3d1f08b032dc ("mtd: spinand: Use the external ECC engine logic")
It inadvertently dropped proper OOB support while doing something
else.
Shortly later, half of it got re-integrated by
commit 868cbe2a6dce ("mtd: spinand: Fix OOB read")
(pointing by the way to a more early change which had nothing to do
with the issue). Problem is, this commit failed to revert the faulty
change entirely and missed the logic handling MTD_OPS_AUTO_OOB
requests.
Let's fix this mess by re-inserting the missing part now.
Fixes: 868cbe2a6dce ("mtd: spinand: Fix OOB read")
Reported-by: Felix Fietkau <nbd(a)nbd.name>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210107083813.24283-1-miquel.raynal@boot…
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index 8ea545bb924d..61d932c1b718 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -343,6 +343,7 @@ static int spinand_read_from_cache_op(struct spinand_device *spinand,
const struct nand_page_io_req *req)
{
struct nand_device *nand = spinand_to_nand(spinand);
+ struct mtd_info *mtd = spinand_to_mtd(spinand);
struct spi_mem_dirmap_desc *rdesc;
unsigned int nbytes = 0;
void *buf = NULL;
@@ -382,9 +383,16 @@ static int spinand_read_from_cache_op(struct spinand_device *spinand,
memcpy(req->databuf.in, spinand->databuf + req->dataoffs,
req->datalen);
- if (req->ooblen)
- memcpy(req->oobbuf.in, spinand->oobbuf + req->ooboffs,
- req->ooblen);
+ if (req->ooblen) {
+ if (req->mode == MTD_OPS_AUTO_OOB)
+ mtd_ooblayout_get_databytes(mtd, req->oobbuf.in,
+ spinand->oobbuf,
+ req->ooboffs,
+ req->ooblen);
+ else
+ memcpy(req->oobbuf.in, spinand->oobbuf + req->ooboffs,
+ req->ooblen);
+ }
return 0;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e708789c4a87989faff1131ccfdc465a1c1eddbc Mon Sep 17 00:00:00 2001
From: Miquel Raynal <miquel.raynal(a)bootlin.com>
Date: Thu, 7 Jan 2021 09:38:13 +0100
Subject: [PATCH] mtd: spinand: Fix MTD_OPS_AUTO_OOB requests
The initial change breaking the logic is
commit 3d1f08b032dc ("mtd: spinand: Use the external ECC engine logic")
It inadvertently dropped proper OOB support while doing something
else.
Shortly later, half of it got re-integrated by
commit 868cbe2a6dce ("mtd: spinand: Fix OOB read")
(pointing by the way to a more early change which had nothing to do
with the issue). Problem is, this commit failed to revert the faulty
change entirely and missed the logic handling MTD_OPS_AUTO_OOB
requests.
Let's fix this mess by re-inserting the missing part now.
Fixes: 868cbe2a6dce ("mtd: spinand: Fix OOB read")
Reported-by: Felix Fietkau <nbd(a)nbd.name>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210107083813.24283-1-miquel.raynal@boot…
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index 8ea545bb924d..61d932c1b718 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -343,6 +343,7 @@ static int spinand_read_from_cache_op(struct spinand_device *spinand,
const struct nand_page_io_req *req)
{
struct nand_device *nand = spinand_to_nand(spinand);
+ struct mtd_info *mtd = spinand_to_mtd(spinand);
struct spi_mem_dirmap_desc *rdesc;
unsigned int nbytes = 0;
void *buf = NULL;
@@ -382,9 +383,16 @@ static int spinand_read_from_cache_op(struct spinand_device *spinand,
memcpy(req->databuf.in, spinand->databuf + req->dataoffs,
req->datalen);
- if (req->ooblen)
- memcpy(req->oobbuf.in, spinand->oobbuf + req->ooboffs,
- req->ooblen);
+ if (req->ooblen) {
+ if (req->mode == MTD_OPS_AUTO_OOB)
+ mtd_ooblayout_get_databytes(mtd, req->oobbuf.in,
+ spinand->oobbuf,
+ req->ooboffs,
+ req->ooblen);
+ else
+ memcpy(req->oobbuf.in, spinand->oobbuf + req->ooboffs,
+ req->ooblen);
+ }
return 0;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e73b0101ae5124bf7cd3fb5d250302ad2f16a416 Mon Sep 17 00:00:00 2001
From: Baruch Siach <baruch(a)tkos.co.il>
Date: Sun, 17 Jan 2021 15:17:02 +0200
Subject: [PATCH] gpio: mvebu: fix pwm .get_state period calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The period is the sum of on and off values. That is, calculate period as
($on + $off) / clkrate
instead of
$off / clkrate - $on / clkrate
that makes no sense.
Reported-by: Russell King <linux(a)armlinux.org.uk>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Fixes: 757642f9a584e ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Baruch Siach <baruch(a)tkos.co.il>
Signed-off-by: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
index 672681a976f5..a912a8fed197 100644
--- a/drivers/gpio/gpio-mvebu.c
+++ b/drivers/gpio/gpio-mvebu.c
@@ -676,20 +676,17 @@ static void mvebu_pwm_get_state(struct pwm_chip *chip,
else
state->duty_cycle = 1;
+ val = (unsigned long long) u; /* on duration */
regmap_read(mvpwm->regs, mvebu_pwmreg_blink_off_duration(mvpwm), &u);
- val = (unsigned long long) u * NSEC_PER_SEC;
+ val += (unsigned long long) u; /* period = on + off duration */
+ val *= NSEC_PER_SEC;
do_div(val, mvpwm->clk_rate);
- if (val < state->duty_cycle) {
+ if (val > UINT_MAX)
+ state->period = UINT_MAX;
+ else if (val)
+ state->period = val;
+ else
state->period = 1;
- } else {
- val -= state->duty_cycle;
- if (val > UINT_MAX)
- state->period = UINT_MAX;
- else if (val)
- state->period = val;
- else
- state->period = 1;
- }
regmap_read(mvchip->regs, GPIO_BLINK_EN_OFF + mvchip->offset, &u);
if (u)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e73b0101ae5124bf7cd3fb5d250302ad2f16a416 Mon Sep 17 00:00:00 2001
From: Baruch Siach <baruch(a)tkos.co.il>
Date: Sun, 17 Jan 2021 15:17:02 +0200
Subject: [PATCH] gpio: mvebu: fix pwm .get_state period calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The period is the sum of on and off values. That is, calculate period as
($on + $off) / clkrate
instead of
$off / clkrate - $on / clkrate
that makes no sense.
Reported-by: Russell King <linux(a)armlinux.org.uk>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Fixes: 757642f9a584e ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Baruch Siach <baruch(a)tkos.co.il>
Signed-off-by: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
index 672681a976f5..a912a8fed197 100644
--- a/drivers/gpio/gpio-mvebu.c
+++ b/drivers/gpio/gpio-mvebu.c
@@ -676,20 +676,17 @@ static void mvebu_pwm_get_state(struct pwm_chip *chip,
else
state->duty_cycle = 1;
+ val = (unsigned long long) u; /* on duration */
regmap_read(mvpwm->regs, mvebu_pwmreg_blink_off_duration(mvpwm), &u);
- val = (unsigned long long) u * NSEC_PER_SEC;
+ val += (unsigned long long) u; /* period = on + off duration */
+ val *= NSEC_PER_SEC;
do_div(val, mvpwm->clk_rate);
- if (val < state->duty_cycle) {
+ if (val > UINT_MAX)
+ state->period = UINT_MAX;
+ else if (val)
+ state->period = val;
+ else
state->period = 1;
- } else {
- val -= state->duty_cycle;
- if (val > UINT_MAX)
- state->period = UINT_MAX;
- else if (val)
- state->period = val;
- else
- state->period = 1;
- }
regmap_read(mvchip->regs, GPIO_BLINK_EN_OFF + mvchip->offset, &u);
if (u)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e73b0101ae5124bf7cd3fb5d250302ad2f16a416 Mon Sep 17 00:00:00 2001
From: Baruch Siach <baruch(a)tkos.co.il>
Date: Sun, 17 Jan 2021 15:17:02 +0200
Subject: [PATCH] gpio: mvebu: fix pwm .get_state period calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The period is the sum of on and off values. That is, calculate period as
($on + $off) / clkrate
instead of
$off / clkrate - $on / clkrate
that makes no sense.
Reported-by: Russell King <linux(a)armlinux.org.uk>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Fixes: 757642f9a584e ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Baruch Siach <baruch(a)tkos.co.il>
Signed-off-by: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
index 672681a976f5..a912a8fed197 100644
--- a/drivers/gpio/gpio-mvebu.c
+++ b/drivers/gpio/gpio-mvebu.c
@@ -676,20 +676,17 @@ static void mvebu_pwm_get_state(struct pwm_chip *chip,
else
state->duty_cycle = 1;
+ val = (unsigned long long) u; /* on duration */
regmap_read(mvpwm->regs, mvebu_pwmreg_blink_off_duration(mvpwm), &u);
- val = (unsigned long long) u * NSEC_PER_SEC;
+ val += (unsigned long long) u; /* period = on + off duration */
+ val *= NSEC_PER_SEC;
do_div(val, mvpwm->clk_rate);
- if (val < state->duty_cycle) {
+ if (val > UINT_MAX)
+ state->period = UINT_MAX;
+ else if (val)
+ state->period = val;
+ else
state->period = 1;
- } else {
- val -= state->duty_cycle;
- if (val > UINT_MAX)
- state->period = UINT_MAX;
- else if (val)
- state->period = val;
- else
- state->period = 1;
- }
regmap_read(mvchip->regs, GPIO_BLINK_EN_OFF + mvchip->offset, &u);
if (u)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e73b0101ae5124bf7cd3fb5d250302ad2f16a416 Mon Sep 17 00:00:00 2001
From: Baruch Siach <baruch(a)tkos.co.il>
Date: Sun, 17 Jan 2021 15:17:02 +0200
Subject: [PATCH] gpio: mvebu: fix pwm .get_state period calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The period is the sum of on and off values. That is, calculate period as
($on + $off) / clkrate
instead of
$off / clkrate - $on / clkrate
that makes no sense.
Reported-by: Russell King <linux(a)armlinux.org.uk>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Fixes: 757642f9a584e ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Baruch Siach <baruch(a)tkos.co.il>
Signed-off-by: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
index 672681a976f5..a912a8fed197 100644
--- a/drivers/gpio/gpio-mvebu.c
+++ b/drivers/gpio/gpio-mvebu.c
@@ -676,20 +676,17 @@ static void mvebu_pwm_get_state(struct pwm_chip *chip,
else
state->duty_cycle = 1;
+ val = (unsigned long long) u; /* on duration */
regmap_read(mvpwm->regs, mvebu_pwmreg_blink_off_duration(mvpwm), &u);
- val = (unsigned long long) u * NSEC_PER_SEC;
+ val += (unsigned long long) u; /* period = on + off duration */
+ val *= NSEC_PER_SEC;
do_div(val, mvpwm->clk_rate);
- if (val < state->duty_cycle) {
+ if (val > UINT_MAX)
+ state->period = UINT_MAX;
+ else if (val)
+ state->period = val;
+ else
state->period = 1;
- } else {
- val -= state->duty_cycle;
- if (val > UINT_MAX)
- state->period = UINT_MAX;
- else if (val)
- state->period = val;
- else
- state->period = 1;
- }
regmap_read(mvchip->regs, GPIO_BLINK_EN_OFF + mvchip->offset, &u);
if (u)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9c7d9017a49fb8516c13b7bff59b7da2abed23e1 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Fri, 8 Jan 2021 19:05:59 +0100
Subject: [PATCH] x86: PM: Register syscore_ops for scale invariance
On x86 scale invariace tends to be disabled during resume from
suspend-to-RAM, because the MPERF or APERF MSR values are not as
expected then due to updates taking place after the platform
firmware has been invoked to complete the suspend transition.
That, of course, is not desirable, especially if the schedutil
scaling governor is in use, because the lack of scale invariance
causes it to be less reliable.
To counter that effect, modify init_freq_invariance() to register
a syscore_ops object for scale invariance with the ->resume callback
pointing to init_counter_refs() which will run on the CPU starting
the resume transition (the other CPUs will be taken care of the
"online" operations taking place later).
Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Giovanni Gherdovich <ggherdovich(a)suse.cz>
Link: https://lkml.kernel.org/r/1803209.Mvru99baaF@kreacher
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8ca66af96a54..117e24fbfd8a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -56,6 +56,7 @@
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
+#include <linux/syscore_ops.h>
#include <asm/acpi.h>
#include <asm/desc.h>
@@ -2083,6 +2084,23 @@ static void init_counter_refs(void)
this_cpu_write(arch_prev_mperf, mperf);
}
+#ifdef CONFIG_PM_SLEEP
+static struct syscore_ops freq_invariance_syscore_ops = {
+ .resume = init_counter_refs,
+};
+
+static void register_freq_invariance_syscore_ops(void)
+{
+ /* Bail out if registered already. */
+ if (freq_invariance_syscore_ops.node.prev)
+ return;
+
+ register_syscore_ops(&freq_invariance_syscore_ops);
+}
+#else
+static inline void register_freq_invariance_syscore_ops(void) {}
+#endif
+
static void init_freq_invariance(bool secondary, bool cppc_ready)
{
bool ret = false;
@@ -2109,6 +2127,7 @@ static void init_freq_invariance(bool secondary, bool cppc_ready)
if (ret) {
init_counter_refs();
static_branch_enable(&arch_scale_freq_key);
+ register_freq_invariance_syscore_ops();
pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
} else {
pr_debug("Couldn't determine max cpu frequency, necessary for scale-invariant accounting.\n");
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cf9d052aa6005f1e8dfaf491d83bf37f368af69e Mon Sep 17 00:00:00 2001
From: Douglas Anderson <dianders(a)chromium.org>
Date: Thu, 14 Jan 2021 19:16:24 -0800
Subject: [PATCH] pinctrl: qcom: Don't clear pending interrupts when enabling
In Linux, if a driver does disable_irq() and later does enable_irq()
on its interrupt, I believe it's expecting these properties:
* If an interrupt was pending when the driver disabled then it will
still be pending after the driver re-enables.
* If an edge-triggered interrupt comes in while an interrupt is
disabled it should assert when the interrupt is re-enabled.
If you think that the above sounds a lot like the disable_irq() and
enable_irq() are supposed to be masking/unmasking the interrupt
instead of disabling/enabling it then you've made an astute
observation. Specifically when talking about interrupts, "mask"
usually means to stop posting interrupts but keep tracking them and
"disable" means to fully shut off interrupt detection. It's
unfortunate that this is so confusing, but presumably this is all the
way it is for historical reasons.
Perhaps more confusing than the above is that, even though clients of
IRQs themselves don't have a way to request mask/unmask
vs. disable/enable calls, IRQ chips themselves can implement both.
...and yet more confusing is that if an IRQ chip implements
disable/enable then they will be called when a client driver calls
disable_irq() / enable_irq().
It does feel like some of the above could be cleared up. However,
without any other core interrupt changes it should be clear that when
an IRQ chip gets a request to "disable" an IRQ that it has to treat it
like a mask of that IRQ.
In any case, after that long interlude you can see that the "unmask
and clear" can break things. Maulik tried to fix it so that we no
longer did "unmask and clear" in commit 71266d9d3936 ("pinctrl: qcom:
Move clearing pending IRQ to .irq_request_resources callback"), but it
only handled the PDC case and it had problems (it caused
sc7180-trogdor devices to fail to suspend). Let's fix.
>From my understanding the source of the phantom interrupt in the
were these two things:
1. One that could have been introduced in msm_gpio_irq_set_type()
(only for the non-PDC case).
2. Edges could have been detected when a GPIO was muxed away.
Fixing case #1 is easy. We can just add a clear in
msm_gpio_irq_set_type().
Fixing case #2 is harder. Let's use a concrete example. In
sc7180-trogdor.dtsi we configure the uart3 to have two pinctrl states,
sleep and default, and mux between the two during runtime PM and
system suspend (see geni_se_resources_{on,off}() for more
details). The difference between the sleep and default state is that
the RX pin is muxed to a GPIO during sleep and muxed to the UART
otherwise.
As per Qualcomm, when we mux the pin over to the UART function the PDC
(or the non-PDC interrupt detection logic) is still watching it /
latching edges. These edges don't cause interrupts because the
current code masks the interrupt unless we're entering suspend.
However, as soon as we enter suspend we unmask the interrupt and it's
counted as a wakeup.
Let's deal with the problem like this:
* When we mux away, we'll mask our interrupt. This isn't necessary in
the above case since the client already masked us, but it's a good
idea in general.
* When we mux back will clear any interrupts and unmask.
Fixes: 4b7618fdc7e6 ("pinctrl: qcom: Add irq_enable callback for msm gpio")
Fixes: 71266d9d3936 ("pinctrl: qcom: Move clearing pending IRQ to .irq_request_resources callback")
Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
Reviewed-by: Maulik Shah <mkshah(a)codeaurora.org>
Tested-by: Maulik Shah <mkshah(a)codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd(a)chromium.org>
Link: https://lore.kernel.org/r/20210114191601.v7.4.I7cf3019783720feb57b958c95c2b…
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index 192ed31eabf4..d70caecd21d2 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -51,6 +51,7 @@
* @dual_edge_irqs: Bitmap of irqs that need sw emulated dual edge
* detection.
* @skip_wake_irqs: Skip IRQs that are handled by wakeup interrupt controller
+ * @disabled_for_mux: These IRQs were disabled because we muxed away.
* @soc: Reference to soc_data of platform specific data.
* @regs: Base addresses for the TLMM tiles.
* @phys_base: Physical base address
@@ -72,6 +73,7 @@ struct msm_pinctrl {
DECLARE_BITMAP(dual_edge_irqs, MAX_NR_GPIO);
DECLARE_BITMAP(enabled_irqs, MAX_NR_GPIO);
DECLARE_BITMAP(skip_wake_irqs, MAX_NR_GPIO);
+ DECLARE_BITMAP(disabled_for_mux, MAX_NR_GPIO);
const struct msm_pinctrl_soc_data *soc;
void __iomem *regs[MAX_NR_TILES];
@@ -179,6 +181,10 @@ static int msm_pinmux_set_mux(struct pinctrl_dev *pctldev,
unsigned group)
{
struct msm_pinctrl *pctrl = pinctrl_dev_get_drvdata(pctldev);
+ struct gpio_chip *gc = &pctrl->chip;
+ unsigned int irq = irq_find_mapping(gc->irq.domain, group);
+ struct irq_data *d = irq_get_irq_data(irq);
+ unsigned int gpio_func = pctrl->soc->gpio_func;
const struct msm_pingroup *g;
unsigned long flags;
u32 val, mask;
@@ -195,6 +201,20 @@ static int msm_pinmux_set_mux(struct pinctrl_dev *pctldev,
if (WARN_ON(i == g->nfuncs))
return -EINVAL;
+ /*
+ * If an GPIO interrupt is setup on this pin then we need special
+ * handling. Specifically interrupt detection logic will still see
+ * the pin twiddle even when we're muxed away.
+ *
+ * When we see a pin with an interrupt setup on it then we'll disable
+ * (mask) interrupts on it when we mux away until we mux back. Note
+ * that disable_irq() refcounts and interrupts are disabled as long as
+ * at least one disable_irq() has been called.
+ */
+ if (d && i != gpio_func &&
+ !test_and_set_bit(d->hwirq, pctrl->disabled_for_mux))
+ disable_irq(irq);
+
raw_spin_lock_irqsave(&pctrl->lock, flags);
val = msm_readl_ctl(pctrl, g);
@@ -204,6 +224,20 @@ static int msm_pinmux_set_mux(struct pinctrl_dev *pctldev,
raw_spin_unlock_irqrestore(&pctrl->lock, flags);
+ if (d && i == gpio_func &&
+ test_and_clear_bit(d->hwirq, pctrl->disabled_for_mux)) {
+ /*
+ * Clear interrupts detected while not GPIO since we only
+ * masked things.
+ */
+ if (d->parent_data && test_bit(d->hwirq, pctrl->skip_wake_irqs))
+ irq_chip_set_parent_state(d, IRQCHIP_STATE_PENDING, false);
+ else
+ msm_ack_intr_status(pctrl, g);
+
+ enable_irq(irq);
+ }
+
return 0;
}
@@ -781,7 +815,7 @@ static void msm_gpio_irq_mask(struct irq_data *d)
raw_spin_unlock_irqrestore(&pctrl->lock, flags);
}
-static void msm_gpio_irq_clear_unmask(struct irq_data *d, bool status_clear)
+static void msm_gpio_irq_unmask(struct irq_data *d)
{
struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
@@ -799,14 +833,6 @@ static void msm_gpio_irq_clear_unmask(struct irq_data *d, bool status_clear)
raw_spin_lock_irqsave(&pctrl->lock, flags);
- /*
- * clear the interrupt status bit before unmask to avoid
- * any erroneous interrupts that would have got latched
- * when the interrupt is not in use.
- */
- if (status_clear)
- msm_ack_intr_status(pctrl, g);
-
val = msm_readl_intr_cfg(pctrl, g);
val |= BIT(g->intr_raw_status_bit);
val |= BIT(g->intr_enable_bit);
@@ -826,7 +852,7 @@ static void msm_gpio_irq_enable(struct irq_data *d)
irq_chip_enable_parent(d);
if (!test_bit(d->hwirq, pctrl->skip_wake_irqs))
- msm_gpio_irq_clear_unmask(d, true);
+ msm_gpio_irq_unmask(d);
}
static void msm_gpio_irq_disable(struct irq_data *d)
@@ -841,11 +867,6 @@ static void msm_gpio_irq_disable(struct irq_data *d)
msm_gpio_irq_mask(d);
}
-static void msm_gpio_irq_unmask(struct irq_data *d)
-{
- msm_gpio_irq_clear_unmask(d, false);
-}
-
/**
* msm_gpio_update_dual_edge_parent() - Prime next edge for IRQs handled by parent.
* @d: The irq dta.
@@ -934,6 +955,7 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
const struct msm_pingroup *g;
unsigned long flags;
+ bool was_enabled;
u32 val;
if (msm_gpio_needs_dual_edge_parent_workaround(d, type)) {
@@ -995,6 +1017,7 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
* could cause the INTR_STATUS to be set for EDGE interrupts.
*/
val = msm_readl_intr_cfg(pctrl, g);
+ was_enabled = val & BIT(g->intr_raw_status_bit);
val |= BIT(g->intr_raw_status_bit);
if (g->intr_detection_width == 2) {
val &= ~(3 << g->intr_detection_bit);
@@ -1044,6 +1067,14 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
}
msm_writel_intr_cfg(val, pctrl, g);
+ /*
+ * The first time we set RAW_STATUS_EN it could trigger an interrupt.
+ * Clear the interrupt. This is safe because we have
+ * IRQCHIP_SET_TYPE_MASKED.
+ */
+ if (!was_enabled)
+ msm_ack_intr_status(pctrl, g);
+
if (test_bit(d->hwirq, pctrl->dual_edge_irqs))
msm_gpio_update_dual_edge_pos(pctrl, g, d);
@@ -1097,16 +1128,11 @@ static int msm_gpio_irq_reqres(struct irq_data *d)
}
/*
- * Clear the interrupt that may be pending before we enable
- * the line.
- * This is especially a problem with the GPIOs routed to the
- * PDC. These GPIOs are direct-connect interrupts to the GIC.
- * Disabling the interrupt line at the PDC does not prevent
- * the interrupt from being latched at the GIC. The state at
- * GIC needs to be cleared before enabling.
+ * The disable / clear-enable workaround we do in msm_pinmux_set_mux()
+ * only works if disable is not lazy since we only clear any bogus
+ * interrupt in hardware. Explicitly mark the interrupt as UNLAZY.
*/
- if (d->parent_data && test_bit(d->hwirq, pctrl->skip_wake_irqs))
- irq_chip_set_parent_state(d, IRQCHIP_STATE_PENDING, 0);
+ irq_set_status_flags(d->irq, IRQ_DISABLE_UNLAZY);
return 0;
out:
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From dad4e5b390866ca902653df0daa864ae4b8d4147 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Sat, 23 Jan 2021 21:01:52 -0800
Subject: [PATCH] mm: fix page reference leak in soft_offline_page()
The conversion to move pfn_to_online_page() internal to
soft_offline_page() missed that the get_user_pages() reference taken by
the madvise() path needs to be dropped when pfn_to_online_page() fails.
Note the direct sysfs-path to soft_offline_page() does not perform a
get_user_pages() lookup.
When soft_offline_page() is handed a pfn_valid() && !pfn_to_online_page()
pfn the kernel hangs at dax-device shutdown due to a leaked reference.
Link: https://lkml.kernel.org/r/161058501210.1840162.8108917599181157327.stgit@dw…
Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 04d9f154a130..e9481632fcd1 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1885,6 +1885,12 @@ static int soft_offline_free_page(struct page *page)
return rc;
}
+static void put_ref_page(struct page *page)
+{
+ if (page)
+ put_page(page);
+}
+
/**
* soft_offline_page - Soft offline a page.
* @pfn: pfn to soft-offline
@@ -1910,20 +1916,26 @@ static int soft_offline_free_page(struct page *page)
int soft_offline_page(unsigned long pfn, int flags)
{
int ret;
- struct page *page;
bool try_again = true;
+ struct page *page, *ref_page = NULL;
+
+ WARN_ON_ONCE(!pfn_valid(pfn) && (flags & MF_COUNT_INCREASED));
if (!pfn_valid(pfn))
return -ENXIO;
+ if (flags & MF_COUNT_INCREASED)
+ ref_page = pfn_to_page(pfn);
+
/* Only online pages can be soft-offlined (esp., not ZONE_DEVICE). */
page = pfn_to_online_page(pfn);
- if (!page)
+ if (!page) {
+ put_ref_page(ref_page);
return -EIO;
+ }
if (PageHWPoison(page)) {
pr_info("%s: %#lx page already poisoned\n", __func__, pfn);
- if (flags & MF_COUNT_INCREASED)
- put_page(page);
+ put_ref_page(ref_page);
return 0;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c447d274f3746fbed6e695e7b9a2d7bd8b31b71 Mon Sep 17 00:00:00 2001
From: Shakeel Butt <shakeelb(a)google.com>
Date: Sat, 23 Jan 2021 21:01:15 -0800
Subject: [PATCH] mm: fix numa stats for thp migration
Currently the kernel is not correctly updating the numa stats for
NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that.
For NR_FILE_DIRTY and NR_ZONE_WRITE_PENDING, although at the moment
there is no need to handle THP migration as kernel still does not have
write support for file THP but to be more future proof, this patch adds
the THP support for those stats as well.
Link: https://lkml.kernel.org/r/20210108155813.2914586-2-shakeelb@google.com
Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Yang Shi <shy828301(a)gmail.com>
Reviewed-by: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/migrate.c b/mm/migrate.c
index 613794f6a433..c0efe921bca5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -402,6 +402,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
struct zone *oldzone, *newzone;
int dirty;
int expected_count = expected_page_refs(mapping, page) + extra_count;
+ int nr = thp_nr_pages(page);
if (!mapping) {
/* Anonymous page without mapping */
@@ -437,7 +438,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
*/
newpage->index = page->index;
newpage->mapping = page->mapping;
- page_ref_add(newpage, thp_nr_pages(page)); /* add cache reference */
+ page_ref_add(newpage, nr); /* add cache reference */
if (PageSwapBacked(page)) {
__SetPageSwapBacked(newpage);
if (PageSwapCache(page)) {
@@ -459,7 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
if (PageTransHuge(page)) {
int i;
- for (i = 1; i < HPAGE_PMD_NR; i++) {
+ for (i = 1; i < nr; i++) {
xas_next(&xas);
xas_store(&xas, newpage);
}
@@ -470,7 +471,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
* to one less reference.
* We know this isn't the last reference.
*/
- page_ref_unfreeze(page, expected_count - thp_nr_pages(page));
+ page_ref_unfreeze(page, expected_count - nr);
xas_unlock(&xas);
/* Leave irq disabled to prevent preemption while updating stats */
@@ -493,17 +494,17 @@ int migrate_page_move_mapping(struct address_space *mapping,
old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
- __dec_lruvec_state(old_lruvec, NR_FILE_PAGES);
- __inc_lruvec_state(new_lruvec, NR_FILE_PAGES);
+ __mod_lruvec_state(old_lruvec, NR_FILE_PAGES, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_PAGES, nr);
if (PageSwapBacked(page) && !PageSwapCache(page)) {
- __dec_lruvec_state(old_lruvec, NR_SHMEM);
- __inc_lruvec_state(new_lruvec, NR_SHMEM);
+ __mod_lruvec_state(old_lruvec, NR_SHMEM, -nr);
+ __mod_lruvec_state(new_lruvec, NR_SHMEM, nr);
}
if (dirty && mapping_can_writeback(mapping)) {
- __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
- __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
- __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
- __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
+ __mod_lruvec_state(old_lruvec, NR_FILE_DIRTY, -nr);
+ __mod_zone_page_state(oldzone, NR_ZONE_WRITE_PENDING, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
+ __mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
}
}
local_irq_enable();
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c447d274f3746fbed6e695e7b9a2d7bd8b31b71 Mon Sep 17 00:00:00 2001
From: Shakeel Butt <shakeelb(a)google.com>
Date: Sat, 23 Jan 2021 21:01:15 -0800
Subject: [PATCH] mm: fix numa stats for thp migration
Currently the kernel is not correctly updating the numa stats for
NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that.
For NR_FILE_DIRTY and NR_ZONE_WRITE_PENDING, although at the moment
there is no need to handle THP migration as kernel still does not have
write support for file THP but to be more future proof, this patch adds
the THP support for those stats as well.
Link: https://lkml.kernel.org/r/20210108155813.2914586-2-shakeelb@google.com
Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp")
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Yang Shi <shy828301(a)gmail.com>
Reviewed-by: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/migrate.c b/mm/migrate.c
index 613794f6a433..c0efe921bca5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -402,6 +402,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
struct zone *oldzone, *newzone;
int dirty;
int expected_count = expected_page_refs(mapping, page) + extra_count;
+ int nr = thp_nr_pages(page);
if (!mapping) {
/* Anonymous page without mapping */
@@ -437,7 +438,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
*/
newpage->index = page->index;
newpage->mapping = page->mapping;
- page_ref_add(newpage, thp_nr_pages(page)); /* add cache reference */
+ page_ref_add(newpage, nr); /* add cache reference */
if (PageSwapBacked(page)) {
__SetPageSwapBacked(newpage);
if (PageSwapCache(page)) {
@@ -459,7 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
if (PageTransHuge(page)) {
int i;
- for (i = 1; i < HPAGE_PMD_NR; i++) {
+ for (i = 1; i < nr; i++) {
xas_next(&xas);
xas_store(&xas, newpage);
}
@@ -470,7 +471,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
* to one less reference.
* We know this isn't the last reference.
*/
- page_ref_unfreeze(page, expected_count - thp_nr_pages(page));
+ page_ref_unfreeze(page, expected_count - nr);
xas_unlock(&xas);
/* Leave irq disabled to prevent preemption while updating stats */
@@ -493,17 +494,17 @@ int migrate_page_move_mapping(struct address_space *mapping,
old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
- __dec_lruvec_state(old_lruvec, NR_FILE_PAGES);
- __inc_lruvec_state(new_lruvec, NR_FILE_PAGES);
+ __mod_lruvec_state(old_lruvec, NR_FILE_PAGES, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_PAGES, nr);
if (PageSwapBacked(page) && !PageSwapCache(page)) {
- __dec_lruvec_state(old_lruvec, NR_SHMEM);
- __inc_lruvec_state(new_lruvec, NR_SHMEM);
+ __mod_lruvec_state(old_lruvec, NR_SHMEM, -nr);
+ __mod_lruvec_state(new_lruvec, NR_SHMEM, nr);
}
if (dirty && mapping_can_writeback(mapping)) {
- __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY);
- __dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
- __inc_lruvec_state(new_lruvec, NR_FILE_DIRTY);
- __inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
+ __mod_lruvec_state(old_lruvec, NR_FILE_DIRTY, -nr);
+ __mod_zone_page_state(oldzone, NR_ZONE_WRITE_PENDING, -nr);
+ __mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
+ __mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
}
}
local_irq_enable();
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 40c48fb79b9798954691f24b8ece1d3a7eb1b353 Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Tue, 8 Dec 2020 15:36:40 +0100
Subject: [PATCH] iio: common: st_sensors: fix possible infinite loop in
st_sensors_irq_thread
Return a boolean value in st_sensors_new_samples_available routine in
order to avoid an infinite loop in st_sensors_irq_thread if
stat_drdy.addr is not defined or stat_drdy read fails
Fixes: 90efe05562921 ("iio: st_sensors: harden interrupt handling")
Reported-by: Jonathan Cameron <jic23(a)kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Link: https://lore.kernel.org/r/c9ec69ed349e7200c779fd7a5bf04c1aaa2817aa.16074381…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/common/st_sensors/st_sensors_trigger.c b/drivers/iio/common/st_sensors/st_sensors_trigger.c
index 0507283bd4c1..2dbd2646e44e 100644
--- a/drivers/iio/common/st_sensors/st_sensors_trigger.c
+++ b/drivers/iio/common/st_sensors/st_sensors_trigger.c
@@ -23,35 +23,31 @@
* @sdata: Sensor data.
*
* returns:
- * 0 - no new samples available
- * 1 - new samples available
- * negative - error or unknown
+ * false - no new samples available or read error
+ * true - new samples available
*/
-static int st_sensors_new_samples_available(struct iio_dev *indio_dev,
- struct st_sensor_data *sdata)
+static bool st_sensors_new_samples_available(struct iio_dev *indio_dev,
+ struct st_sensor_data *sdata)
{
int ret, status;
/* How would I know if I can't check it? */
if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr)
- return -EINVAL;
+ return true;
/* No scan mask, no interrupt */
if (!indio_dev->active_scan_mask)
- return 0;
+ return false;
ret = regmap_read(sdata->regmap,
sdata->sensor_settings->drdy_irq.stat_drdy.addr,
&status);
if (ret < 0) {
dev_err(sdata->dev, "error checking samples available\n");
- return ret;
+ return false;
}
- if (status & sdata->sensor_settings->drdy_irq.stat_drdy.mask)
- return 1;
-
- return 0;
+ return !!(status & sdata->sensor_settings->drdy_irq.stat_drdy.mask);
}
/**
@@ -180,9 +176,15 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
/* Tell the interrupt handler that we're dealing with edges */
if (irq_trig == IRQF_TRIGGER_FALLING ||
- irq_trig == IRQF_TRIGGER_RISING)
+ irq_trig == IRQF_TRIGGER_RISING) {
+ if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr) {
+ dev_err(&indio_dev->dev,
+ "edge IRQ not supported w/o stat register.\n");
+ err = -EOPNOTSUPP;
+ goto iio_trigger_free;
+ }
sdata->edge_irq = true;
- else
+ } else {
/*
* If we're not using edges (i.e. level interrupts) we
* just mask off the IRQ, handle one interrupt, then
@@ -190,6 +192,7 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
* interrupt handler top half again and start over.
*/
irq_trig |= IRQF_ONESHOT;
+ }
/*
* If the interrupt pin is Open Drain, by definition this
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 40c48fb79b9798954691f24b8ece1d3a7eb1b353 Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Tue, 8 Dec 2020 15:36:40 +0100
Subject: [PATCH] iio: common: st_sensors: fix possible infinite loop in
st_sensors_irq_thread
Return a boolean value in st_sensors_new_samples_available routine in
order to avoid an infinite loop in st_sensors_irq_thread if
stat_drdy.addr is not defined or stat_drdy read fails
Fixes: 90efe05562921 ("iio: st_sensors: harden interrupt handling")
Reported-by: Jonathan Cameron <jic23(a)kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Link: https://lore.kernel.org/r/c9ec69ed349e7200c779fd7a5bf04c1aaa2817aa.16074381…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/common/st_sensors/st_sensors_trigger.c b/drivers/iio/common/st_sensors/st_sensors_trigger.c
index 0507283bd4c1..2dbd2646e44e 100644
--- a/drivers/iio/common/st_sensors/st_sensors_trigger.c
+++ b/drivers/iio/common/st_sensors/st_sensors_trigger.c
@@ -23,35 +23,31 @@
* @sdata: Sensor data.
*
* returns:
- * 0 - no new samples available
- * 1 - new samples available
- * negative - error or unknown
+ * false - no new samples available or read error
+ * true - new samples available
*/
-static int st_sensors_new_samples_available(struct iio_dev *indio_dev,
- struct st_sensor_data *sdata)
+static bool st_sensors_new_samples_available(struct iio_dev *indio_dev,
+ struct st_sensor_data *sdata)
{
int ret, status;
/* How would I know if I can't check it? */
if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr)
- return -EINVAL;
+ return true;
/* No scan mask, no interrupt */
if (!indio_dev->active_scan_mask)
- return 0;
+ return false;
ret = regmap_read(sdata->regmap,
sdata->sensor_settings->drdy_irq.stat_drdy.addr,
&status);
if (ret < 0) {
dev_err(sdata->dev, "error checking samples available\n");
- return ret;
+ return false;
}
- if (status & sdata->sensor_settings->drdy_irq.stat_drdy.mask)
- return 1;
-
- return 0;
+ return !!(status & sdata->sensor_settings->drdy_irq.stat_drdy.mask);
}
/**
@@ -180,9 +176,15 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
/* Tell the interrupt handler that we're dealing with edges */
if (irq_trig == IRQF_TRIGGER_FALLING ||
- irq_trig == IRQF_TRIGGER_RISING)
+ irq_trig == IRQF_TRIGGER_RISING) {
+ if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr) {
+ dev_err(&indio_dev->dev,
+ "edge IRQ not supported w/o stat register.\n");
+ err = -EOPNOTSUPP;
+ goto iio_trigger_free;
+ }
sdata->edge_irq = true;
- else
+ } else {
/*
* If we're not using edges (i.e. level interrupts) we
* just mask off the IRQ, handle one interrupt, then
@@ -190,6 +192,7 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
* interrupt handler top half again and start over.
*/
irq_trig |= IRQF_ONESHOT;
+ }
/*
* If the interrupt pin is Open Drain, by definition this
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 40c48fb79b9798954691f24b8ece1d3a7eb1b353 Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Tue, 8 Dec 2020 15:36:40 +0100
Subject: [PATCH] iio: common: st_sensors: fix possible infinite loop in
st_sensors_irq_thread
Return a boolean value in st_sensors_new_samples_available routine in
order to avoid an infinite loop in st_sensors_irq_thread if
stat_drdy.addr is not defined or stat_drdy read fails
Fixes: 90efe05562921 ("iio: st_sensors: harden interrupt handling")
Reported-by: Jonathan Cameron <jic23(a)kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Link: https://lore.kernel.org/r/c9ec69ed349e7200c779fd7a5bf04c1aaa2817aa.16074381…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/common/st_sensors/st_sensors_trigger.c b/drivers/iio/common/st_sensors/st_sensors_trigger.c
index 0507283bd4c1..2dbd2646e44e 100644
--- a/drivers/iio/common/st_sensors/st_sensors_trigger.c
+++ b/drivers/iio/common/st_sensors/st_sensors_trigger.c
@@ -23,35 +23,31 @@
* @sdata: Sensor data.
*
* returns:
- * 0 - no new samples available
- * 1 - new samples available
- * negative - error or unknown
+ * false - no new samples available or read error
+ * true - new samples available
*/
-static int st_sensors_new_samples_available(struct iio_dev *indio_dev,
- struct st_sensor_data *sdata)
+static bool st_sensors_new_samples_available(struct iio_dev *indio_dev,
+ struct st_sensor_data *sdata)
{
int ret, status;
/* How would I know if I can't check it? */
if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr)
- return -EINVAL;
+ return true;
/* No scan mask, no interrupt */
if (!indio_dev->active_scan_mask)
- return 0;
+ return false;
ret = regmap_read(sdata->regmap,
sdata->sensor_settings->drdy_irq.stat_drdy.addr,
&status);
if (ret < 0) {
dev_err(sdata->dev, "error checking samples available\n");
- return ret;
+ return false;
}
- if (status & sdata->sensor_settings->drdy_irq.stat_drdy.mask)
- return 1;
-
- return 0;
+ return !!(status & sdata->sensor_settings->drdy_irq.stat_drdy.mask);
}
/**
@@ -180,9 +176,15 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
/* Tell the interrupt handler that we're dealing with edges */
if (irq_trig == IRQF_TRIGGER_FALLING ||
- irq_trig == IRQF_TRIGGER_RISING)
+ irq_trig == IRQF_TRIGGER_RISING) {
+ if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr) {
+ dev_err(&indio_dev->dev,
+ "edge IRQ not supported w/o stat register.\n");
+ err = -EOPNOTSUPP;
+ goto iio_trigger_free;
+ }
sdata->edge_irq = true;
- else
+ } else {
/*
* If we're not using edges (i.e. level interrupts) we
* just mask off the IRQ, handle one interrupt, then
@@ -190,6 +192,7 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
* interrupt handler top half again and start over.
*/
irq_trig |= IRQF_ONESHOT;
+ }
/*
* If the interrupt pin is Open Drain, by definition this
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 40c48fb79b9798954691f24b8ece1d3a7eb1b353 Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Tue, 8 Dec 2020 15:36:40 +0100
Subject: [PATCH] iio: common: st_sensors: fix possible infinite loop in
st_sensors_irq_thread
Return a boolean value in st_sensors_new_samples_available routine in
order to avoid an infinite loop in st_sensors_irq_thread if
stat_drdy.addr is not defined or stat_drdy read fails
Fixes: 90efe05562921 ("iio: st_sensors: harden interrupt handling")
Reported-by: Jonathan Cameron <jic23(a)kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Link: https://lore.kernel.org/r/c9ec69ed349e7200c779fd7a5bf04c1aaa2817aa.16074381…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/common/st_sensors/st_sensors_trigger.c b/drivers/iio/common/st_sensors/st_sensors_trigger.c
index 0507283bd4c1..2dbd2646e44e 100644
--- a/drivers/iio/common/st_sensors/st_sensors_trigger.c
+++ b/drivers/iio/common/st_sensors/st_sensors_trigger.c
@@ -23,35 +23,31 @@
* @sdata: Sensor data.
*
* returns:
- * 0 - no new samples available
- * 1 - new samples available
- * negative - error or unknown
+ * false - no new samples available or read error
+ * true - new samples available
*/
-static int st_sensors_new_samples_available(struct iio_dev *indio_dev,
- struct st_sensor_data *sdata)
+static bool st_sensors_new_samples_available(struct iio_dev *indio_dev,
+ struct st_sensor_data *sdata)
{
int ret, status;
/* How would I know if I can't check it? */
if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr)
- return -EINVAL;
+ return true;
/* No scan mask, no interrupt */
if (!indio_dev->active_scan_mask)
- return 0;
+ return false;
ret = regmap_read(sdata->regmap,
sdata->sensor_settings->drdy_irq.stat_drdy.addr,
&status);
if (ret < 0) {
dev_err(sdata->dev, "error checking samples available\n");
- return ret;
+ return false;
}
- if (status & sdata->sensor_settings->drdy_irq.stat_drdy.mask)
- return 1;
-
- return 0;
+ return !!(status & sdata->sensor_settings->drdy_irq.stat_drdy.mask);
}
/**
@@ -180,9 +176,15 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
/* Tell the interrupt handler that we're dealing with edges */
if (irq_trig == IRQF_TRIGGER_FALLING ||
- irq_trig == IRQF_TRIGGER_RISING)
+ irq_trig == IRQF_TRIGGER_RISING) {
+ if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr) {
+ dev_err(&indio_dev->dev,
+ "edge IRQ not supported w/o stat register.\n");
+ err = -EOPNOTSUPP;
+ goto iio_trigger_free;
+ }
sdata->edge_irq = true;
- else
+ } else {
/*
* If we're not using edges (i.e. level interrupts) we
* just mask off the IRQ, handle one interrupt, then
@@ -190,6 +192,7 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev,
* interrupt handler top half again and start over.
*/
irq_trig |= IRQF_ONESHOT;
+ }
/*
* If the interrupt pin is Open Drain, by definition this
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: prevent starvation when writing memory.high
When a value is written to a cgroup's memory.high control file, the
write() context first tries to reclaim the cgroup to size before putting
the limit in place for the workload. Concurrent charges from the workload
can keep such a write() looping in reclaim indefinitely.
In the past, a write to memory.high would first put the limit in place for
the workload, then do targeted reclaim until the new limit has been met -
similar to how we do it for memory.max. This wasn't prone to the
described starvation issue. However, this sequence could cause excessive
latencies in the workload, when allocating threads could be put into long
penalty sleeps on the sudden memory.high overage created by the write(),
before that had a chance to work it off.
Now that memory_high_write() performs reclaim before enforcing the new
limit, reflect that the cgroup may well fail to converge due to concurrent
workload activity. Bail out of the loop after a few tries.
Link: https://lkml.kernel.org/r/20210112163011.127833-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reported-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Michal Koutný <mkoutny(a)suse.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
--- a/mm/memcontrol.c~mm-memcontrol-prevent-starvation-when-writing-memoryhigh
+++ a/mm/memcontrol.c
@@ -6273,7 +6273,6 @@ static ssize_t memory_high_write(struct
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
- unsigned long reclaimed;
if (nr_pages <= high)
break;
@@ -6287,10 +6286,10 @@ static ssize_t memory_high_write(struct
continue;
}
- reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
- GFP_KERNEL, true);
+ try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
+ GFP_KERNEL, true);
- if (!reclaimed && !nr_retries--)
+ if (!nr_retries--)
break;
}
_
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
The MEI bus has a special behavior on suspend it destroys
all the attached devices, this is due to the fact that also
firmware context is not persistent across power flows.
If watchdog on MEI bus is ticking before suspending the firmware
times out and reports that the OS is missing watchdog tick.
Send the stop command to the firmware on watchdog unregistered
to eliminate the false event on suspend.
This does not make the things worse from the user-space perspective
as a user-space should re-open watchdog device after
suspending before this patch.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
---
V2: Update the commit message with better explanation
drivers/watchdog/mei_wdt.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/watchdog/mei_wdt.c b/drivers/watchdog/mei_wdt.c
index 5391bf3e6b11..c5967d8b4256 100644
--- a/drivers/watchdog/mei_wdt.c
+++ b/drivers/watchdog/mei_wdt.c
@@ -382,6 +382,7 @@ static int mei_wdt_register(struct mei_wdt *wdt)
watchdog_set_drvdata(&wdt->wdd, wdt);
watchdog_stop_on_reboot(&wdt->wdd);
+ watchdog_stop_on_unregister(&wdt->wdd);
ret = watchdog_register_device(&wdt->wdd);
if (ret)
--
2.26.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 45db630e5f7ec83817c57c8ae387fe219bd42adf Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 18 Jan 2021 10:17:55 +0000
Subject: [PATCH] drm/i915: Check for rq->hwsp validity after acquiring RCU
lock
Since we allow removing the timeline map at runtime, there is a risk
that rq->hwsp points into a stale page. To control that risk, we hold
the RCU read lock while reading *rq->hwsp, but we missed a couple of
important barriers. First, the unpinning / removal of the timeline map
must be after all RCU readers into that map are complete, i.e. after an
rcu barrier (in this case courtesy of call_rcu()). Secondly, we must
make sure that the rq->hwsp we are about to dereference under the RCU
lock is valid. In this case, we make the rq->hwsp pointer safe during
i915_request_retire() and so we know that rq->hwsp may become invalid
only after the request has been signaled. Therefore is the request is
not yet signaled when we acquire rq->hwsp under the RCU, we know that
rq->hwsp will remain valid for the duration of the RCU read lock.
This is a very small window that may lead to either considering the
request not completed (causing a delay until the request is checked
again, any wait for the request is not affected) or dereferencing an
invalid pointer.
Fixes: 3adac4689f58 ("drm/i915: Introduce concept of per-timeline (context) HWSP")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.1+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201218122421.18344-1-chris@…
(cherry picked from commit 9bb36cf66091ddf2d8840e5aa705ad3c93a6279b)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210118101755.476744-1-chris…
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index a24cc1ff08a0..0625cbb3b431 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -134,11 +134,6 @@ static bool remove_signaling_context(struct intel_breadcrumbs *b,
return true;
}
-static inline bool __request_completed(const struct i915_request *rq)
-{
- return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
-}
-
__maybe_unused static bool
check_signal_order(struct intel_context *ce, struct i915_request *rq)
{
@@ -257,7 +252,7 @@ static void signal_irq_work(struct irq_work *work)
list_for_each_entry_rcu(rq, &ce->signals, signal_link) {
bool release;
- if (!__request_completed(rq))
+ if (!__i915_request_is_complete(rq))
break;
if (!test_and_clear_bit(I915_FENCE_FLAG_SIGNAL,
@@ -379,7 +374,7 @@ static void insert_breadcrumb(struct i915_request *rq)
* straight onto a signaled list, and queue the irq worker for
* its signal completion.
*/
- if (__request_completed(rq)) {
+ if (__i915_request_is_complete(rq)) {
if (__signal_request(rq) &&
llist_add(&rq->signal_node, &b->signaled_requests))
irq_work_queue(&b->irq_work);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 7ea94d201fe6..8015964043eb 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -126,6 +126,10 @@ static void __rcu_cacheline_free(struct rcu_head *rcu)
struct intel_timeline_cacheline *cl =
container_of(rcu, typeof(*cl), rcu);
+ /* Must wait until after all *rq->hwsp are complete before removing */
+ i915_gem_object_unpin_map(cl->hwsp->vma->obj);
+ __idle_hwsp_free(cl->hwsp, ptr_unmask_bits(cl->vaddr, CACHELINE_BITS));
+
i915_active_fini(&cl->active);
kfree(cl);
}
@@ -133,11 +137,6 @@ static void __rcu_cacheline_free(struct rcu_head *rcu)
static void __idle_cacheline_free(struct intel_timeline_cacheline *cl)
{
GEM_BUG_ON(!i915_active_is_idle(&cl->active));
-
- i915_gem_object_unpin_map(cl->hwsp->vma->obj);
- i915_vma_put(cl->hwsp->vma);
- __idle_hwsp_free(cl->hwsp, ptr_unmask_bits(cl->vaddr, CACHELINE_BITS));
-
call_rcu(&cl->rcu, __rcu_cacheline_free);
}
@@ -179,7 +178,6 @@ cacheline_alloc(struct intel_timeline_hwsp *hwsp, unsigned int cacheline)
return ERR_CAST(vaddr);
}
- i915_vma_get(hwsp->vma);
cl->hwsp = hwsp;
cl->vaddr = page_pack_bits(vaddr, cacheline);
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 620b6fab2c5c..92adfee30c7c 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -434,7 +434,7 @@ static inline u32 hwsp_seqno(const struct i915_request *rq)
static inline bool __i915_request_has_started(const struct i915_request *rq)
{
- return i915_seqno_passed(hwsp_seqno(rq), rq->fence.seqno - 1);
+ return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno - 1);
}
/**
@@ -465,11 +465,19 @@ static inline bool __i915_request_has_started(const struct i915_request *rq)
*/
static inline bool i915_request_started(const struct i915_request *rq)
{
+ bool result;
+
if (i915_request_signaled(rq))
return true;
- /* Remember: started but may have since been preempted! */
- return __i915_request_has_started(rq);
+ result = true;
+ rcu_read_lock(); /* the HWSP may be freed at runtime */
+ if (likely(!i915_request_signaled(rq)))
+ /* Remember: started but may have since been preempted! */
+ result = __i915_request_has_started(rq);
+ rcu_read_unlock();
+
+ return result;
}
/**
@@ -482,10 +490,16 @@ static inline bool i915_request_started(const struct i915_request *rq)
*/
static inline bool i915_request_is_running(const struct i915_request *rq)
{
+ bool result;
+
if (!i915_request_is_active(rq))
return false;
- return __i915_request_has_started(rq);
+ rcu_read_lock();
+ result = __i915_request_has_started(rq) && i915_request_is_active(rq);
+ rcu_read_unlock();
+
+ return result;
}
/**
@@ -509,12 +523,25 @@ static inline bool i915_request_is_ready(const struct i915_request *rq)
return !list_empty(&rq->sched.link);
}
+static inline bool __i915_request_is_complete(const struct i915_request *rq)
+{
+ return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
+}
+
static inline bool i915_request_completed(const struct i915_request *rq)
{
+ bool result;
+
if (i915_request_signaled(rq))
return true;
- return i915_seqno_passed(hwsp_seqno(rq), rq->fence.seqno);
+ result = true;
+ rcu_read_lock(); /* the HWSP may be freed at runtime */
+ if (likely(!i915_request_signaled(rq)))
+ result = __i915_request_is_complete(rq);
+ rcu_read_unlock();
+
+ return result;
}
static inline void i915_request_mark_complete(struct i915_request *rq)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 488751a0ef9b5ce572c47301ce62d54fc6b5a74d Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 18 Jan 2021 09:53:32 +0000
Subject: [PATCH] drm/i915/gt: Prevent use of engine->wa_ctx after error
On error we unpin and free the wa_ctx.vma, but do not clear any of the
derived flags. During lrc_init, we look at the flags and attempt to
dereference the wa_ctx.vma if they are set. To protect the error path
where we try to limp along without the wa_ctx, make sure we clear those
flags!
Reported-by: Matt Roper <matthew.d.roper(a)intel.com>
Fixes: 604a8f6f1e33 ("drm/i915/lrc: Only enable per-context and per-bb buffers if set")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v4.15+
Reviewed-by: Matt Roper <matthew.d.roper(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-1-chris@…
(cherry-picked from 5b4dc95cf7f573e927fbbd406ebe54225d41b9b2)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210118095332.458813-1-chris…
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 7614a3d24fca..26c7d0a50585 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3988,6 +3988,9 @@ static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
{
i915_vma_unpin_and_release(&engine->wa_ctx.vma, 0);
+
+ /* Called on error unwind, clear all flags to prevent further use */
+ memset(&engine->wa_ctx, 0, sizeof(engine->wa_ctx));
}
typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 51e87da7d4014f49769dcf60b8626a81492df2c4 Mon Sep 17 00:00:00 2001
From: Prike Liang <Prike.Liang(a)amd.com>
Date: Thu, 17 Dec 2020 13:55:46 +0800
Subject: [PATCH] drm/amdgpu/pm: no need GPU status set since
mmnbif_gpu_BIF_DOORBELL_FENCE_CNTL added in FSDL
In the renoir there is no need GpuChangeState message set to exit gfxoff in the s0i3 resume since
mmnbif_gpu_BIF_DOORBELL_FENCE_CNTL has been added in the s0i3 FSDL.
Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
Reviewed-by: Huang Rui <ray.huang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
index f743685a20e8..9a9697038016 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
@@ -1121,7 +1121,7 @@ static ssize_t renoir_get_gpu_metrics(struct smu_context *smu,
static int renoir_gfx_state_change_set(struct smu_context *smu, uint32_t state)
{
- return smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GpuChangeState, state, NULL);
+ return 0;
}
static const struct pptable_funcs renoir_ppt_funcs = {
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c02406428d5219c367c5f53457698c58bc5f917 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 20 Jan 2021 13:59:11 -0500
Subject: [PATCH] dm integrity: conditionally disable "recalculate" feature
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.
Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.
This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reported-by: Daniel Glockner <dg(a)emlix.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index 4e6f504474ac..2cc5488acbd9 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -177,14 +177,20 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
fix_padding
Use a smaller padding of the tag area that is more
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
-allow_discards
- Allow block discard requests (a.k.a. TRIM) for the integrity device.
- Discards are only allowed to devices using internal hash.
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index cce203adcf77..b64fede032dc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -257,8 +257,9 @@ struct dm_integrity_c {
bool journal_uptodate;
bool just_formatted;
bool recalculate_flag;
- bool fix_padding;
bool discard;
+ bool fix_padding;
+ bool legacy_recalculate;
struct alg_spec internal_hash_alg;
struct alg_spec journal_crypt_alg;
@@ -386,6 +387,14 @@ static int dm_integrity_failed(struct dm_integrity_c *ic)
return READ_ONCE(ic->failed);
}
+static bool dm_integrity_disable_recalculate(struct dm_integrity_c *ic)
+{
+ if ((ic->internal_hash_alg.key || ic->journal_mac_alg.key) &&
+ !ic->legacy_recalculate)
+ return true;
+ return false;
+}
+
static commit_id_t dm_integrity_commit_id(struct dm_integrity_c *ic, unsigned i,
unsigned j, unsigned char seq)
{
@@ -3140,6 +3149,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
arg_count += !!ic->journal_crypt_alg.alg_string;
arg_count += !!ic->journal_mac_alg.alg_string;
arg_count += (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0;
+ arg_count += ic->legacy_recalculate;
DMEMIT("%s %llu %u %c %u", ic->dev->name, ic->start,
ic->tag_size, ic->mode, arg_count);
if (ic->meta_dev)
@@ -3163,6 +3173,8 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
}
if ((ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0)
DMEMIT(" fix_padding");
+ if (ic->legacy_recalculate)
+ DMEMIT(" legacy_recalculate");
#define EMIT_ALG(a, n) \
do { \
@@ -3792,7 +3804,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
unsigned extra_args;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
- {0, 15, "Invalid number of feature args"},
+ {0, 16, "Invalid number of feature args"},
};
unsigned journal_sectors, interleave_sectors, buffer_sectors, journal_watermark, sync_msec;
bool should_write_sb;
@@ -3940,6 +3952,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
ic->discard = true;
} else if (!strcmp(opt_string, "fix_padding")) {
ic->fix_padding = true;
+ } else if (!strcmp(opt_string, "legacy_recalculate")) {
+ ic->legacy_recalculate = true;
} else {
r = -EINVAL;
ti->error = "Invalid argument";
@@ -4243,6 +4257,14 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
}
}
+ if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
+ le64_to_cpu(ic->sb->recalc_sector) < ic->provided_data_sectors &&
+ dm_integrity_disable_recalculate(ic)) {
+ ti->error = "Recalculating with HMAC is disabled for security reasons - if you really need it, use the argument \"legacy_recalculate\"";
+ r = -EOPNOTSUPP;
+ goto bad;
+ }
+
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL);
if (IS_ERR(ic->bufio)) {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c02406428d5219c367c5f53457698c58bc5f917 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 20 Jan 2021 13:59:11 -0500
Subject: [PATCH] dm integrity: conditionally disable "recalculate" feature
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.
Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.
This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reported-by: Daniel Glockner <dg(a)emlix.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index 4e6f504474ac..2cc5488acbd9 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -177,14 +177,20 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
fix_padding
Use a smaller padding of the tag area that is more
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
-allow_discards
- Allow block discard requests (a.k.a. TRIM) for the integrity device.
- Discards are only allowed to devices using internal hash.
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index cce203adcf77..b64fede032dc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -257,8 +257,9 @@ struct dm_integrity_c {
bool journal_uptodate;
bool just_formatted;
bool recalculate_flag;
- bool fix_padding;
bool discard;
+ bool fix_padding;
+ bool legacy_recalculate;
struct alg_spec internal_hash_alg;
struct alg_spec journal_crypt_alg;
@@ -386,6 +387,14 @@ static int dm_integrity_failed(struct dm_integrity_c *ic)
return READ_ONCE(ic->failed);
}
+static bool dm_integrity_disable_recalculate(struct dm_integrity_c *ic)
+{
+ if ((ic->internal_hash_alg.key || ic->journal_mac_alg.key) &&
+ !ic->legacy_recalculate)
+ return true;
+ return false;
+}
+
static commit_id_t dm_integrity_commit_id(struct dm_integrity_c *ic, unsigned i,
unsigned j, unsigned char seq)
{
@@ -3140,6 +3149,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
arg_count += !!ic->journal_crypt_alg.alg_string;
arg_count += !!ic->journal_mac_alg.alg_string;
arg_count += (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0;
+ arg_count += ic->legacy_recalculate;
DMEMIT("%s %llu %u %c %u", ic->dev->name, ic->start,
ic->tag_size, ic->mode, arg_count);
if (ic->meta_dev)
@@ -3163,6 +3173,8 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
}
if ((ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0)
DMEMIT(" fix_padding");
+ if (ic->legacy_recalculate)
+ DMEMIT(" legacy_recalculate");
#define EMIT_ALG(a, n) \
do { \
@@ -3792,7 +3804,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
unsigned extra_args;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
- {0, 15, "Invalid number of feature args"},
+ {0, 16, "Invalid number of feature args"},
};
unsigned journal_sectors, interleave_sectors, buffer_sectors, journal_watermark, sync_msec;
bool should_write_sb;
@@ -3940,6 +3952,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
ic->discard = true;
} else if (!strcmp(opt_string, "fix_padding")) {
ic->fix_padding = true;
+ } else if (!strcmp(opt_string, "legacy_recalculate")) {
+ ic->legacy_recalculate = true;
} else {
r = -EINVAL;
ti->error = "Invalid argument";
@@ -4243,6 +4257,14 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
}
}
+ if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
+ le64_to_cpu(ic->sb->recalc_sector) < ic->provided_data_sectors &&
+ dm_integrity_disable_recalculate(ic)) {
+ ti->error = "Recalculating with HMAC is disabled for security reasons - if you really need it, use the argument \"legacy_recalculate\"";
+ r = -EOPNOTSUPP;
+ goto bad;
+ }
+
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL);
if (IS_ERR(ic->bufio)) {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c02406428d5219c367c5f53457698c58bc5f917 Mon Sep 17 00:00:00 2001
From: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed, 20 Jan 2021 13:59:11 -0500
Subject: [PATCH] dm integrity: conditionally disable "recalculate" feature
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.
Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.
This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reported-by: Daniel Glockner <dg(a)emlix.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index 4e6f504474ac..2cc5488acbd9 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -177,14 +177,20 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
fix_padding
Use a smaller padding of the tag area that is more
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
-allow_discards
- Allow block discard requests (a.k.a. TRIM) for the integrity device.
- Discards are only allowed to devices using internal hash.
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index cce203adcf77..b64fede032dc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -257,8 +257,9 @@ struct dm_integrity_c {
bool journal_uptodate;
bool just_formatted;
bool recalculate_flag;
- bool fix_padding;
bool discard;
+ bool fix_padding;
+ bool legacy_recalculate;
struct alg_spec internal_hash_alg;
struct alg_spec journal_crypt_alg;
@@ -386,6 +387,14 @@ static int dm_integrity_failed(struct dm_integrity_c *ic)
return READ_ONCE(ic->failed);
}
+static bool dm_integrity_disable_recalculate(struct dm_integrity_c *ic)
+{
+ if ((ic->internal_hash_alg.key || ic->journal_mac_alg.key) &&
+ !ic->legacy_recalculate)
+ return true;
+ return false;
+}
+
static commit_id_t dm_integrity_commit_id(struct dm_integrity_c *ic, unsigned i,
unsigned j, unsigned char seq)
{
@@ -3140,6 +3149,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
arg_count += !!ic->journal_crypt_alg.alg_string;
arg_count += !!ic->journal_mac_alg.alg_string;
arg_count += (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0;
+ arg_count += ic->legacy_recalculate;
DMEMIT("%s %llu %u %c %u", ic->dev->name, ic->start,
ic->tag_size, ic->mode, arg_count);
if (ic->meta_dev)
@@ -3163,6 +3173,8 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
}
if ((ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0)
DMEMIT(" fix_padding");
+ if (ic->legacy_recalculate)
+ DMEMIT(" legacy_recalculate");
#define EMIT_ALG(a, n) \
do { \
@@ -3792,7 +3804,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
unsigned extra_args;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
- {0, 15, "Invalid number of feature args"},
+ {0, 16, "Invalid number of feature args"},
};
unsigned journal_sectors, interleave_sectors, buffer_sectors, journal_watermark, sync_msec;
bool should_write_sb;
@@ -3940,6 +3952,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
ic->discard = true;
} else if (!strcmp(opt_string, "fix_padding")) {
ic->fix_padding = true;
+ } else if (!strcmp(opt_string, "legacy_recalculate")) {
+ ic->legacy_recalculate = true;
} else {
r = -EINVAL;
ti->error = "Invalid argument";
@@ -4243,6 +4257,14 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
}
}
+ if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
+ le64_to_cpu(ic->sb->recalc_sector) < ic->provided_data_sectors &&
+ dm_integrity_disable_recalculate(ic)) {
+ ti->error = "Recalculating with HMAC is disabled for security reasons - if you really need it, use the argument \"legacy_recalculate\"";
+ r = -EOPNOTSUPP;
+ goto bad;
+ }
+
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL);
if (IS_ERR(ic->bufio)) {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ca1219c0a7432272324660fc9f61a9940f90c50b Mon Sep 17 00:00:00 2001
From: Jisheng Zhang <Jisheng.Zhang(a)synaptics.com>
Date: Tue, 29 Dec 2020 16:16:25 +0800
Subject: [PATCH] mmc: sdhci-of-dwcmshc: fix rpmb access
Commit a44f7cb93732 ("mmc: core: use mrq->sbc when sending CMD23 for
RPMB") began to use ACMD23 for RPMB if the host supports ACMD23. In
RPMB ACM23 case, we need to set bit 31 to CMD23 argument, otherwise
RPMB write operation will return general fail.
However, no matter V4 is enabled or not, the dwcmshc's ARGUMENT2
register is 32-bit block count register which doesn't support stuff
bits of CMD23 argument. So let's handle this specific ACMD23 case.
>From another side, this patch also prepare for future v4 enabling
for dwcmshc, because from the 4.10 spec, the ARGUMENT2 register is
redefined as 32bit block count which doesn't support stuff bits of
CMD23 argument.
Fixes: a44f7cb93732 ("mmc: core: use mrq->sbc when sending CMD23 for RPMB")
Signed-off-by: Jisheng Zhang <Jisheng.Zhang(a)synaptics.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Link: https://lore.kernel.org/r/20201229161625.38255233@xhacker.debian
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
index 4b673792b5a4..d90020ed3622 100644
--- a/drivers/mmc/host/sdhci-of-dwcmshc.c
+++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
@@ -16,6 +16,8 @@
#include "sdhci-pltfm.h"
+#define SDHCI_DWCMSHC_ARG2_STUFF GENMASK(31, 16)
+
/* DWCMSHC specific Mode Select value */
#define DWCMSHC_CTRL_HS400 0x7
@@ -49,6 +51,29 @@ static void dwcmshc_adma_write_desc(struct sdhci_host *host, void **desc,
sdhci_adma_write_desc(host, desc, addr, len, cmd);
}
+static void dwcmshc_check_auto_cmd23(struct mmc_host *mmc,
+ struct mmc_request *mrq)
+{
+ struct sdhci_host *host = mmc_priv(mmc);
+
+ /*
+ * No matter V4 is enabled or not, ARGUMENT2 register is 32-bit
+ * block count register which doesn't support stuff bits of
+ * CMD23 argument on dwcmsch host controller.
+ */
+ if (mrq->sbc && (mrq->sbc->arg & SDHCI_DWCMSHC_ARG2_STUFF))
+ host->flags &= ~SDHCI_AUTO_CMD23;
+ else
+ host->flags |= SDHCI_AUTO_CMD23;
+}
+
+static void dwcmshc_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+ dwcmshc_check_auto_cmd23(mmc, mrq);
+
+ sdhci_request(mmc, mrq);
+}
+
static void dwcmshc_set_uhs_signaling(struct sdhci_host *host,
unsigned int timing)
{
@@ -133,6 +158,8 @@ static int dwcmshc_probe(struct platform_device *pdev)
sdhci_get_of_property(pdev);
+ host->mmc_host_ops.request = dwcmshc_request;
+
err = sdhci_add_host(host);
if (err)
goto err_clk;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ca1219c0a7432272324660fc9f61a9940f90c50b Mon Sep 17 00:00:00 2001
From: Jisheng Zhang <Jisheng.Zhang(a)synaptics.com>
Date: Tue, 29 Dec 2020 16:16:25 +0800
Subject: [PATCH] mmc: sdhci-of-dwcmshc: fix rpmb access
Commit a44f7cb93732 ("mmc: core: use mrq->sbc when sending CMD23 for
RPMB") began to use ACMD23 for RPMB if the host supports ACMD23. In
RPMB ACM23 case, we need to set bit 31 to CMD23 argument, otherwise
RPMB write operation will return general fail.
However, no matter V4 is enabled or not, the dwcmshc's ARGUMENT2
register is 32-bit block count register which doesn't support stuff
bits of CMD23 argument. So let's handle this specific ACMD23 case.
>From another side, this patch also prepare for future v4 enabling
for dwcmshc, because from the 4.10 spec, the ARGUMENT2 register is
redefined as 32bit block count which doesn't support stuff bits of
CMD23 argument.
Fixes: a44f7cb93732 ("mmc: core: use mrq->sbc when sending CMD23 for RPMB")
Signed-off-by: Jisheng Zhang <Jisheng.Zhang(a)synaptics.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Link: https://lore.kernel.org/r/20201229161625.38255233@xhacker.debian
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
index 4b673792b5a4..d90020ed3622 100644
--- a/drivers/mmc/host/sdhci-of-dwcmshc.c
+++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
@@ -16,6 +16,8 @@
#include "sdhci-pltfm.h"
+#define SDHCI_DWCMSHC_ARG2_STUFF GENMASK(31, 16)
+
/* DWCMSHC specific Mode Select value */
#define DWCMSHC_CTRL_HS400 0x7
@@ -49,6 +51,29 @@ static void dwcmshc_adma_write_desc(struct sdhci_host *host, void **desc,
sdhci_adma_write_desc(host, desc, addr, len, cmd);
}
+static void dwcmshc_check_auto_cmd23(struct mmc_host *mmc,
+ struct mmc_request *mrq)
+{
+ struct sdhci_host *host = mmc_priv(mmc);
+
+ /*
+ * No matter V4 is enabled or not, ARGUMENT2 register is 32-bit
+ * block count register which doesn't support stuff bits of
+ * CMD23 argument on dwcmsch host controller.
+ */
+ if (mrq->sbc && (mrq->sbc->arg & SDHCI_DWCMSHC_ARG2_STUFF))
+ host->flags &= ~SDHCI_AUTO_CMD23;
+ else
+ host->flags |= SDHCI_AUTO_CMD23;
+}
+
+static void dwcmshc_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+ dwcmshc_check_auto_cmd23(mmc, mrq);
+
+ sdhci_request(mmc, mrq);
+}
+
static void dwcmshc_set_uhs_signaling(struct sdhci_host *host,
unsigned int timing)
{
@@ -133,6 +158,8 @@ static int dwcmshc_probe(struct platform_device *pdev)
sdhci_get_of_property(pdev);
+ host->mmc_host_ops.request = dwcmshc_request;
+
err = sdhci_add_host(host);
if (err)
goto err_clk;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b503087445ce7e45fabdee87ca9e460d5b5b5168 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne <pcc(a)google.com>
Date: Thu, 14 Jan 2021 12:14:05 -0800
Subject: [PATCH] mmc: core: don't initialize block size from ext_csd if not
present
If extended CSD was not available, the eMMC driver would incorrectly
set the block size to 0, as the data_sector_size field of ext_csd
was never initialized. This issue was exposed by commit 817046ecddbc
("block: Align max_hw_sectors to logical blocksize") which caused
max_sectors and max_hw_sectors to be set to 0 after setting the block
size to 0, resulting in a kernel panic in bio_split when attempting
to read from the device. Fix it by only reading the block size from
ext_csd if it is available.
Fixes: a5075eb94837 ("mmc: block: Allow disabling 512B sector size emulation")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Link: https://linux-review.googlesource.com/id/If244d178da4d86b52034459438fec295b…
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210114201405.2934886-1-pcc@google.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index de7cb0369c30..002426e3cf76 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -384,8 +384,10 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
"merging was advertised but not possible");
blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));
- if (mmc_card_mmc(card))
+ if (mmc_card_mmc(card) && card->ext_csd.data_sector_size) {
block_size = card->ext_csd.data_sector_size;
+ WARN_ON(block_size != 512 && block_size != 4096);
+ }
blk_queue_logical_block_size(mq->queue, block_size);
/*
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b503087445ce7e45fabdee87ca9e460d5b5b5168 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne <pcc(a)google.com>
Date: Thu, 14 Jan 2021 12:14:05 -0800
Subject: [PATCH] mmc: core: don't initialize block size from ext_csd if not
present
If extended CSD was not available, the eMMC driver would incorrectly
set the block size to 0, as the data_sector_size field of ext_csd
was never initialized. This issue was exposed by commit 817046ecddbc
("block: Align max_hw_sectors to logical blocksize") which caused
max_sectors and max_hw_sectors to be set to 0 after setting the block
size to 0, resulting in a kernel panic in bio_split when attempting
to read from the device. Fix it by only reading the block size from
ext_csd if it is available.
Fixes: a5075eb94837 ("mmc: block: Allow disabling 512B sector size emulation")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Link: https://linux-review.googlesource.com/id/If244d178da4d86b52034459438fec295b…
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210114201405.2934886-1-pcc@google.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index de7cb0369c30..002426e3cf76 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -384,8 +384,10 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
"merging was advertised but not possible");
blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));
- if (mmc_card_mmc(card))
+ if (mmc_card_mmc(card) && card->ext_csd.data_sector_size) {
block_size = card->ext_csd.data_sector_size;
+ WARN_ON(block_size != 512 && block_size != 4096);
+ }
blk_queue_logical_block_size(mq->queue, block_size);
/*
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b503087445ce7e45fabdee87ca9e460d5b5b5168 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne <pcc(a)google.com>
Date: Thu, 14 Jan 2021 12:14:05 -0800
Subject: [PATCH] mmc: core: don't initialize block size from ext_csd if not
present
If extended CSD was not available, the eMMC driver would incorrectly
set the block size to 0, as the data_sector_size field of ext_csd
was never initialized. This issue was exposed by commit 817046ecddbc
("block: Align max_hw_sectors to logical blocksize") which caused
max_sectors and max_hw_sectors to be set to 0 after setting the block
size to 0, resulting in a kernel panic in bio_split when attempting
to read from the device. Fix it by only reading the block size from
ext_csd if it is available.
Fixes: a5075eb94837 ("mmc: block: Allow disabling 512B sector size emulation")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Link: https://linux-review.googlesource.com/id/If244d178da4d86b52034459438fec295b…
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210114201405.2934886-1-pcc@google.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index de7cb0369c30..002426e3cf76 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -384,8 +384,10 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
"merging was advertised but not possible");
blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));
- if (mmc_card_mmc(card))
+ if (mmc_card_mmc(card) && card->ext_csd.data_sector_size) {
block_size = card->ext_csd.data_sector_size;
+ WARN_ON(block_size != 512 && block_size != 4096);
+ }
blk_queue_logical_block_size(mq->queue, block_size);
/*
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 12 Jan 2021 11:02:43 -0800
Subject: [PATCH] fs: fix lazytime expiration handling in
__writeback_single_inode()
When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.
This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state. This causes two bugs:
- mark_inode_dirty_sync() redirties the inode, causing it to remain
dirty. This wastefully causes the inode to be written twice. But
more importantly, it breaks cases where sync_filesystem() is expected
to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
ioctl (as reported at
https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
as possibly filesystem freezing (freeze_super()).
- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
called from __writeback_single_inode() for lazytime expiration,
xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
lazytime expirations, and it assumes that i_state will contain
I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state. This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.
This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled. It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Cc: stable(a)vger.kernel.org
Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index acfb55834af2..c41cb887eb7d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
}
/*
- * Some filesystems may redirty the inode during the writeback
- * due to delalloc, clear dirty metadata flags right before
- * write_inode()
+ * If the inode has dirty timestamps and we need to write them, call
+ * mark_inode_dirty_sync() to notify the filesystem about it and to
+ * change I_DIRTY_TIME into I_DIRTY_SYNC.
*/
- spin_lock(&inode->i_lock);
-
- dirty = inode->i_state & I_DIRTY;
if ((inode->i_state & I_DIRTY_TIME) &&
- ((dirty & I_DIRTY_INODE) ||
- wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
+ (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
time_after(jiffies, inode->dirtied_time_when +
dirtytime_expire_interval * HZ))) {
- dirty |= I_DIRTY_TIME;
trace_writeback_lazytime(inode);
+ mark_inode_dirty_sync(inode);
}
+
+ /*
+ * Some filesystems may redirty the inode during the writeback
+ * due to delalloc, clear dirty metadata flags right before
+ * write_inode()
+ */
+ spin_lock(&inode->i_lock);
+ dirty = inode->i_state & I_DIRTY;
inode->i_state &= ~dirty;
/*
@@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
spin_unlock(&inode->i_lock);
- if (dirty & I_DIRTY_TIME)
- mark_inode_dirty_sync(inode);
/* Don't write the inode if only I_DIRTY_PAGES was set */
if (dirty & ~I_DIRTY_PAGES) {
int err = write_inode(inode, wbc);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 12 Jan 2021 11:02:43 -0800
Subject: [PATCH] fs: fix lazytime expiration handling in
__writeback_single_inode()
When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.
This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state. This causes two bugs:
- mark_inode_dirty_sync() redirties the inode, causing it to remain
dirty. This wastefully causes the inode to be written twice. But
more importantly, it breaks cases where sync_filesystem() is expected
to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
ioctl (as reported at
https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
as possibly filesystem freezing (freeze_super()).
- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
called from __writeback_single_inode() for lazytime expiration,
xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
lazytime expirations, and it assumes that i_state will contain
I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state. This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.
This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled. It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Cc: stable(a)vger.kernel.org
Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index acfb55834af2..c41cb887eb7d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
}
/*
- * Some filesystems may redirty the inode during the writeback
- * due to delalloc, clear dirty metadata flags right before
- * write_inode()
+ * If the inode has dirty timestamps and we need to write them, call
+ * mark_inode_dirty_sync() to notify the filesystem about it and to
+ * change I_DIRTY_TIME into I_DIRTY_SYNC.
*/
- spin_lock(&inode->i_lock);
-
- dirty = inode->i_state & I_DIRTY;
if ((inode->i_state & I_DIRTY_TIME) &&
- ((dirty & I_DIRTY_INODE) ||
- wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
+ (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
time_after(jiffies, inode->dirtied_time_when +
dirtytime_expire_interval * HZ))) {
- dirty |= I_DIRTY_TIME;
trace_writeback_lazytime(inode);
+ mark_inode_dirty_sync(inode);
}
+
+ /*
+ * Some filesystems may redirty the inode during the writeback
+ * due to delalloc, clear dirty metadata flags right before
+ * write_inode()
+ */
+ spin_lock(&inode->i_lock);
+ dirty = inode->i_state & I_DIRTY;
inode->i_state &= ~dirty;
/*
@@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
spin_unlock(&inode->i_lock);
- if (dirty & I_DIRTY_TIME)
- mark_inode_dirty_sync(inode);
/* Don't write the inode if only I_DIRTY_PAGES was set */
if (dirty & ~I_DIRTY_PAGES) {
int err = write_inode(inode, wbc);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 12 Jan 2021 11:02:43 -0800
Subject: [PATCH] fs: fix lazytime expiration handling in
__writeback_single_inode()
When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.
This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state. This causes two bugs:
- mark_inode_dirty_sync() redirties the inode, causing it to remain
dirty. This wastefully causes the inode to be written twice. But
more importantly, it breaks cases where sync_filesystem() is expected
to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
ioctl (as reported at
https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
as possibly filesystem freezing (freeze_super()).
- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
called from __writeback_single_inode() for lazytime expiration,
xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
lazytime expirations, and it assumes that i_state will contain
I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state. This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.
This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled. It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Cc: stable(a)vger.kernel.org
Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index acfb55834af2..c41cb887eb7d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
}
/*
- * Some filesystems may redirty the inode during the writeback
- * due to delalloc, clear dirty metadata flags right before
- * write_inode()
+ * If the inode has dirty timestamps and we need to write them, call
+ * mark_inode_dirty_sync() to notify the filesystem about it and to
+ * change I_DIRTY_TIME into I_DIRTY_SYNC.
*/
- spin_lock(&inode->i_lock);
-
- dirty = inode->i_state & I_DIRTY;
if ((inode->i_state & I_DIRTY_TIME) &&
- ((dirty & I_DIRTY_INODE) ||
- wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
+ (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
time_after(jiffies, inode->dirtied_time_when +
dirtytime_expire_interval * HZ))) {
- dirty |= I_DIRTY_TIME;
trace_writeback_lazytime(inode);
+ mark_inode_dirty_sync(inode);
}
+
+ /*
+ * Some filesystems may redirty the inode during the writeback
+ * due to delalloc, clear dirty metadata flags right before
+ * write_inode()
+ */
+ spin_lock(&inode->i_lock);
+ dirty = inode->i_state & I_DIRTY;
inode->i_state &= ~dirty;
/*
@@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
spin_unlock(&inode->i_lock);
- if (dirty & I_DIRTY_TIME)
- mark_inode_dirty_sync(inode);
/* Don't write the inode if only I_DIRTY_PAGES was set */
if (dirty & ~I_DIRTY_PAGES) {
int err = write_inode(inode, wbc);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 12 Jan 2021 11:02:43 -0800
Subject: [PATCH] fs: fix lazytime expiration handling in
__writeback_single_inode()
When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.
This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state. This causes two bugs:
- mark_inode_dirty_sync() redirties the inode, causing it to remain
dirty. This wastefully causes the inode to be written twice. But
more importantly, it breaks cases where sync_filesystem() is expected
to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
ioctl (as reported at
https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
as possibly filesystem freezing (freeze_super()).
- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
called from __writeback_single_inode() for lazytime expiration,
xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
lazytime expirations, and it assumes that i_state will contain
I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state. This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.
This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled. It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Cc: stable(a)vger.kernel.org
Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index acfb55834af2..c41cb887eb7d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
}
/*
- * Some filesystems may redirty the inode during the writeback
- * due to delalloc, clear dirty metadata flags right before
- * write_inode()
+ * If the inode has dirty timestamps and we need to write them, call
+ * mark_inode_dirty_sync() to notify the filesystem about it and to
+ * change I_DIRTY_TIME into I_DIRTY_SYNC.
*/
- spin_lock(&inode->i_lock);
-
- dirty = inode->i_state & I_DIRTY;
if ((inode->i_state & I_DIRTY_TIME) &&
- ((dirty & I_DIRTY_INODE) ||
- wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
+ (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
time_after(jiffies, inode->dirtied_time_when +
dirtytime_expire_interval * HZ))) {
- dirty |= I_DIRTY_TIME;
trace_writeback_lazytime(inode);
+ mark_inode_dirty_sync(inode);
}
+
+ /*
+ * Some filesystems may redirty the inode during the writeback
+ * due to delalloc, clear dirty metadata flags right before
+ * write_inode()
+ */
+ spin_lock(&inode->i_lock);
+ dirty = inode->i_state & I_DIRTY;
inode->i_state &= ~dirty;
/*
@@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
spin_unlock(&inode->i_lock);
- if (dirty & I_DIRTY_TIME)
- mark_inode_dirty_sync(inode);
/* Don't write the inode if only I_DIRTY_PAGES was set */
if (dirty & ~I_DIRTY_PAGES) {
int err = write_inode(inode, wbc);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 34d1eb0e599875064955a74712f08ff14c8e3d5f Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:17 -0500
Subject: [PATCH] btrfs: don't clear ret in btrfs_start_dirty_block_groups
If we fail to update a block group item in the loop we'll break, however
we'll do btrfs_run_delayed_refs and lose our error value in ret, and
thus not clean up properly. Fix this by only running the delayed refs
if there was no failure.
CC: stable(a)vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 52f2198d44c9..0886e81e5540 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2669,7 +2669,8 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
* Go through delayed refs for all the stuff we've just kicked off
* and then loop back (just once)
*/
- ret = btrfs_run_delayed_refs(trans, 0);
+ if (!ret)
+ ret = btrfs_run_delayed_refs(trans, 0);
if (!ret && loops == 0) {
loops++;
spin_lock(&cur_trans->dirty_bgs_lock);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 34d1eb0e599875064955a74712f08ff14c8e3d5f Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:17 -0500
Subject: [PATCH] btrfs: don't clear ret in btrfs_start_dirty_block_groups
If we fail to update a block group item in the loop we'll break, however
we'll do btrfs_run_delayed_refs and lose our error value in ret, and
thus not clean up properly. Fix this by only running the delayed refs
if there was no failure.
CC: stable(a)vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 52f2198d44c9..0886e81e5540 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2669,7 +2669,8 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
* Go through delayed refs for all the stuff we've just kicked off
* and then loop back (just once)
*/
- ret = btrfs_run_delayed_refs(trans, 0);
+ if (!ret)
+ ret = btrfs_run_delayed_refs(trans, 0);
if (!ret && loops == 0) {
loops++;
spin_lock(&cur_trans->dirty_bgs_lock);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 34d1eb0e599875064955a74712f08ff14c8e3d5f Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:17 -0500
Subject: [PATCH] btrfs: don't clear ret in btrfs_start_dirty_block_groups
If we fail to update a block group item in the loop we'll break, however
we'll do btrfs_run_delayed_refs and lose our error value in ret, and
thus not clean up properly. Fix this by only running the delayed refs
if there was no failure.
CC: stable(a)vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 52f2198d44c9..0886e81e5540 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2669,7 +2669,8 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
* Go through delayed refs for all the stuff we've just kicked off
* and then loop back (just once)
*/
- ret = btrfs_run_delayed_refs(trans, 0);
+ if (!ret)
+ ret = btrfs_run_delayed_refs(trans, 0);
if (!ret && loops == 0) {
loops++;
spin_lock(&cur_trans->dirty_bgs_lock);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 34d1eb0e599875064955a74712f08ff14c8e3d5f Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:17 -0500
Subject: [PATCH] btrfs: don't clear ret in btrfs_start_dirty_block_groups
If we fail to update a block group item in the loop we'll break, however
we'll do btrfs_run_delayed_refs and lose our error value in ret, and
thus not clean up properly. Fix this by only running the delayed refs
if there was no failure.
CC: stable(a)vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 52f2198d44c9..0886e81e5540 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2669,7 +2669,8 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans)
* Go through delayed refs for all the stuff we've just kicked off
* and then loop back (just once)
*/
- ret = btrfs_run_delayed_refs(trans, 0);
+ if (!ret)
+ ret = btrfs_run_delayed_refs(trans, 0);
if (!ret && loops == 0) {
loops++;
spin_lock(&cur_trans->dirty_bgs_lock);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 49ecc679ab48b40ca799bf94b327d5284eac9e46 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:11 -0500
Subject: [PATCH] btrfs: do not double free backref nodes on error
Zygo reported the following KASAN splat:
BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
Read of size 8 at addr ffff888112402950 by task btrfs/28836
CPU: 0 PID: 28836 Comm: btrfs Tainted: G W 5.10.0-e35f27394290-for-next+ #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0xbc/0xf9
? btrfs_backref_cleanup_node+0x18a/0x420
print_address_description.constprop.8+0x21/0x210
? record_print_text.cold.34+0x11/0x11
? btrfs_backref_cleanup_node+0x18a/0x420
? btrfs_backref_cleanup_node+0x18a/0x420
kasan_report.cold.10+0x20/0x37
? btrfs_backref_cleanup_node+0x18a/0x420
__asan_load8+0x69/0x90
btrfs_backref_cleanup_node+0x18a/0x420
btrfs_backref_release_cache+0x83/0x1b0
relocate_block_group+0x394/0x780
? merge_reloc_roots+0x4a0/0x4a0
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
? check_flags.part.50+0x6c/0x1e0
? btrfs_relocate_chunk+0x120/0x120
? kmem_cache_alloc_trace+0xa06/0xcb0
? _copy_from_user+0x83/0xc0
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
? __kasan_check_read+0x11/0x20
? check_chain_key+0x1f4/0x2f0
? __asan_loadN+0xf/0x20
? btrfs_ioctl_get_supported_features+0x30/0x30
? kvm_sched_clock_read+0x18/0x30
? check_chain_key+0x1f4/0x2f0
? lock_downgrade+0x3f0/0x3f0
? handle_mm_fault+0xad6/0x2150
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? check_flags.part.50+0x6c/0x1e0
? check_flags.part.50+0x6c/0x1e0
? check_flags+0x26/0x30
? lock_is_held_type+0xc3/0xf0
? syscall_enter_from_user_mode+0x1b/0x60
? do_syscall_64+0x13/0x80
? rcu_read_lock_sched_held+0xa1/0xd0
? __kasan_check_read+0x11/0x20
? __fget_light+0xae/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4c4bdfe427
Allocated by task 28836:
kasan_save_stack+0x21/0x50
__kasan_kmalloc.constprop.18+0xbe/0xd0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x410/0xcb0
btrfs_backref_alloc_node+0x46/0xf0
btrfs_backref_add_tree_node+0x60d/0x11d0
build_backref_tree+0xc5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 28836:
kasan_save_stack+0x21/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x1f/0x30
__kasan_slab_free+0xf3/0x140
kasan_slab_free+0xe/0x10
kfree+0xde/0x200
btrfs_backref_error_cleanup+0x452/0x530
build_backref_tree+0x1a5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache(). This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up. However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache. Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.
Fixes: 75bfb9aff45e ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 02d7d7b2563b..9cadacf3ec27 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -3117,7 +3117,7 @@ void btrfs_backref_error_cleanup(struct btrfs_backref_cache *cache,
list_del_init(&lower->list);
if (lower == node)
node = NULL;
- btrfs_backref_free_node(cache, lower);
+ btrfs_backref_drop_node(cache, lower);
}
btrfs_backref_cleanup_node(cache, node);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 49ecc679ab48b40ca799bf94b327d5284eac9e46 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:11 -0500
Subject: [PATCH] btrfs: do not double free backref nodes on error
Zygo reported the following KASAN splat:
BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
Read of size 8 at addr ffff888112402950 by task btrfs/28836
CPU: 0 PID: 28836 Comm: btrfs Tainted: G W 5.10.0-e35f27394290-for-next+ #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0xbc/0xf9
? btrfs_backref_cleanup_node+0x18a/0x420
print_address_description.constprop.8+0x21/0x210
? record_print_text.cold.34+0x11/0x11
? btrfs_backref_cleanup_node+0x18a/0x420
? btrfs_backref_cleanup_node+0x18a/0x420
kasan_report.cold.10+0x20/0x37
? btrfs_backref_cleanup_node+0x18a/0x420
__asan_load8+0x69/0x90
btrfs_backref_cleanup_node+0x18a/0x420
btrfs_backref_release_cache+0x83/0x1b0
relocate_block_group+0x394/0x780
? merge_reloc_roots+0x4a0/0x4a0
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
? check_flags.part.50+0x6c/0x1e0
? btrfs_relocate_chunk+0x120/0x120
? kmem_cache_alloc_trace+0xa06/0xcb0
? _copy_from_user+0x83/0xc0
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
? __kasan_check_read+0x11/0x20
? check_chain_key+0x1f4/0x2f0
? __asan_loadN+0xf/0x20
? btrfs_ioctl_get_supported_features+0x30/0x30
? kvm_sched_clock_read+0x18/0x30
? check_chain_key+0x1f4/0x2f0
? lock_downgrade+0x3f0/0x3f0
? handle_mm_fault+0xad6/0x2150
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? check_flags.part.50+0x6c/0x1e0
? check_flags.part.50+0x6c/0x1e0
? check_flags+0x26/0x30
? lock_is_held_type+0xc3/0xf0
? syscall_enter_from_user_mode+0x1b/0x60
? do_syscall_64+0x13/0x80
? rcu_read_lock_sched_held+0xa1/0xd0
? __kasan_check_read+0x11/0x20
? __fget_light+0xae/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4c4bdfe427
Allocated by task 28836:
kasan_save_stack+0x21/0x50
__kasan_kmalloc.constprop.18+0xbe/0xd0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x410/0xcb0
btrfs_backref_alloc_node+0x46/0xf0
btrfs_backref_add_tree_node+0x60d/0x11d0
build_backref_tree+0xc5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 28836:
kasan_save_stack+0x21/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x1f/0x30
__kasan_slab_free+0xf3/0x140
kasan_slab_free+0xe/0x10
kfree+0xde/0x200
btrfs_backref_error_cleanup+0x452/0x530
build_backref_tree+0x1a5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache(). This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up. However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache. Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.
Fixes: 75bfb9aff45e ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 02d7d7b2563b..9cadacf3ec27 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -3117,7 +3117,7 @@ void btrfs_backref_error_cleanup(struct btrfs_backref_cache *cache,
list_del_init(&lower->list);
if (lower == node)
node = NULL;
- btrfs_backref_free_node(cache, lower);
+ btrfs_backref_drop_node(cache, lower);
}
btrfs_backref_cleanup_node(cache, node);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 49ecc679ab48b40ca799bf94b327d5284eac9e46 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:11 -0500
Subject: [PATCH] btrfs: do not double free backref nodes on error
Zygo reported the following KASAN splat:
BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
Read of size 8 at addr ffff888112402950 by task btrfs/28836
CPU: 0 PID: 28836 Comm: btrfs Tainted: G W 5.10.0-e35f27394290-for-next+ #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0xbc/0xf9
? btrfs_backref_cleanup_node+0x18a/0x420
print_address_description.constprop.8+0x21/0x210
? record_print_text.cold.34+0x11/0x11
? btrfs_backref_cleanup_node+0x18a/0x420
? btrfs_backref_cleanup_node+0x18a/0x420
kasan_report.cold.10+0x20/0x37
? btrfs_backref_cleanup_node+0x18a/0x420
__asan_load8+0x69/0x90
btrfs_backref_cleanup_node+0x18a/0x420
btrfs_backref_release_cache+0x83/0x1b0
relocate_block_group+0x394/0x780
? merge_reloc_roots+0x4a0/0x4a0
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
? check_flags.part.50+0x6c/0x1e0
? btrfs_relocate_chunk+0x120/0x120
? kmem_cache_alloc_trace+0xa06/0xcb0
? _copy_from_user+0x83/0xc0
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
? __kasan_check_read+0x11/0x20
? check_chain_key+0x1f4/0x2f0
? __asan_loadN+0xf/0x20
? btrfs_ioctl_get_supported_features+0x30/0x30
? kvm_sched_clock_read+0x18/0x30
? check_chain_key+0x1f4/0x2f0
? lock_downgrade+0x3f0/0x3f0
? handle_mm_fault+0xad6/0x2150
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? check_flags.part.50+0x6c/0x1e0
? check_flags.part.50+0x6c/0x1e0
? check_flags+0x26/0x30
? lock_is_held_type+0xc3/0xf0
? syscall_enter_from_user_mode+0x1b/0x60
? do_syscall_64+0x13/0x80
? rcu_read_lock_sched_held+0xa1/0xd0
? __kasan_check_read+0x11/0x20
? __fget_light+0xae/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4c4bdfe427
Allocated by task 28836:
kasan_save_stack+0x21/0x50
__kasan_kmalloc.constprop.18+0xbe/0xd0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x410/0xcb0
btrfs_backref_alloc_node+0x46/0xf0
btrfs_backref_add_tree_node+0x60d/0x11d0
build_backref_tree+0xc5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 28836:
kasan_save_stack+0x21/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x1f/0x30
__kasan_slab_free+0xf3/0x140
kasan_slab_free+0xe/0x10
kfree+0xde/0x200
btrfs_backref_error_cleanup+0x452/0x530
build_backref_tree+0x1a5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache(). This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up. However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache. Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.
Fixes: 75bfb9aff45e ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 02d7d7b2563b..9cadacf3ec27 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -3117,7 +3117,7 @@ void btrfs_backref_error_cleanup(struct btrfs_backref_cache *cache,
list_del_init(&lower->list);
if (lower == node)
node = NULL;
- btrfs_backref_free_node(cache, lower);
+ btrfs_backref_drop_node(cache, lower);
}
btrfs_backref_cleanup_node(cache, node);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 49ecc679ab48b40ca799bf94b327d5284eac9e46 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:11 -0500
Subject: [PATCH] btrfs: do not double free backref nodes on error
Zygo reported the following KASAN splat:
BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
Read of size 8 at addr ffff888112402950 by task btrfs/28836
CPU: 0 PID: 28836 Comm: btrfs Tainted: G W 5.10.0-e35f27394290-for-next+ #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0xbc/0xf9
? btrfs_backref_cleanup_node+0x18a/0x420
print_address_description.constprop.8+0x21/0x210
? record_print_text.cold.34+0x11/0x11
? btrfs_backref_cleanup_node+0x18a/0x420
? btrfs_backref_cleanup_node+0x18a/0x420
kasan_report.cold.10+0x20/0x37
? btrfs_backref_cleanup_node+0x18a/0x420
__asan_load8+0x69/0x90
btrfs_backref_cleanup_node+0x18a/0x420
btrfs_backref_release_cache+0x83/0x1b0
relocate_block_group+0x394/0x780
? merge_reloc_roots+0x4a0/0x4a0
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
? check_flags.part.50+0x6c/0x1e0
? btrfs_relocate_chunk+0x120/0x120
? kmem_cache_alloc_trace+0xa06/0xcb0
? _copy_from_user+0x83/0xc0
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
? __kasan_check_read+0x11/0x20
? check_chain_key+0x1f4/0x2f0
? __asan_loadN+0xf/0x20
? btrfs_ioctl_get_supported_features+0x30/0x30
? kvm_sched_clock_read+0x18/0x30
? check_chain_key+0x1f4/0x2f0
? lock_downgrade+0x3f0/0x3f0
? handle_mm_fault+0xad6/0x2150
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? check_flags.part.50+0x6c/0x1e0
? check_flags.part.50+0x6c/0x1e0
? check_flags+0x26/0x30
? lock_is_held_type+0xc3/0xf0
? syscall_enter_from_user_mode+0x1b/0x60
? do_syscall_64+0x13/0x80
? rcu_read_lock_sched_held+0xa1/0xd0
? __kasan_check_read+0x11/0x20
? __fget_light+0xae/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4c4bdfe427
Allocated by task 28836:
kasan_save_stack+0x21/0x50
__kasan_kmalloc.constprop.18+0xbe/0xd0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x410/0xcb0
btrfs_backref_alloc_node+0x46/0xf0
btrfs_backref_add_tree_node+0x60d/0x11d0
build_backref_tree+0xc5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 28836:
kasan_save_stack+0x21/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x1f/0x30
__kasan_slab_free+0xf3/0x140
kasan_slab_free+0xe/0x10
kfree+0xde/0x200
btrfs_backref_error_cleanup+0x452/0x530
build_backref_tree+0x1a5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache(). This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up. However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache. Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.
Fixes: 75bfb9aff45e ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 02d7d7b2563b..9cadacf3ec27 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -3117,7 +3117,7 @@ void btrfs_backref_error_cleanup(struct btrfs_backref_cache *cache,
list_del_init(&lower->list);
if (lower == node)
node = NULL;
- btrfs_backref_free_node(cache, lower);
+ btrfs_backref_drop_node(cache, lower);
}
btrfs_backref_cleanup_node(cache, node);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 49ecc679ab48b40ca799bf94b327d5284eac9e46 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Wed, 16 Dec 2020 11:22:11 -0500
Subject: [PATCH] btrfs: do not double free backref nodes on error
Zygo reported the following KASAN splat:
BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
Read of size 8 at addr ffff888112402950 by task btrfs/28836
CPU: 0 PID: 28836 Comm: btrfs Tainted: G W 5.10.0-e35f27394290-for-next+ #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0xbc/0xf9
? btrfs_backref_cleanup_node+0x18a/0x420
print_address_description.constprop.8+0x21/0x210
? record_print_text.cold.34+0x11/0x11
? btrfs_backref_cleanup_node+0x18a/0x420
? btrfs_backref_cleanup_node+0x18a/0x420
kasan_report.cold.10+0x20/0x37
? btrfs_backref_cleanup_node+0x18a/0x420
__asan_load8+0x69/0x90
btrfs_backref_cleanup_node+0x18a/0x420
btrfs_backref_release_cache+0x83/0x1b0
relocate_block_group+0x394/0x780
? merge_reloc_roots+0x4a0/0x4a0
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
? check_flags.part.50+0x6c/0x1e0
? btrfs_relocate_chunk+0x120/0x120
? kmem_cache_alloc_trace+0xa06/0xcb0
? _copy_from_user+0x83/0xc0
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
? __kasan_check_read+0x11/0x20
? check_chain_key+0x1f4/0x2f0
? __asan_loadN+0xf/0x20
? btrfs_ioctl_get_supported_features+0x30/0x30
? kvm_sched_clock_read+0x18/0x30
? check_chain_key+0x1f4/0x2f0
? lock_downgrade+0x3f0/0x3f0
? handle_mm_fault+0xad6/0x2150
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? check_flags.part.50+0x6c/0x1e0
? check_flags.part.50+0x6c/0x1e0
? check_flags+0x26/0x30
? lock_is_held_type+0xc3/0xf0
? syscall_enter_from_user_mode+0x1b/0x60
? do_syscall_64+0x13/0x80
? rcu_read_lock_sched_held+0xa1/0xd0
? __kasan_check_read+0x11/0x20
? __fget_light+0xae/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4c4bdfe427
Allocated by task 28836:
kasan_save_stack+0x21/0x50
__kasan_kmalloc.constprop.18+0xbe/0xd0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x410/0xcb0
btrfs_backref_alloc_node+0x46/0xf0
btrfs_backref_add_tree_node+0x60d/0x11d0
build_backref_tree+0xc5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 28836:
kasan_save_stack+0x21/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x1f/0x30
__kasan_slab_free+0xf3/0x140
kasan_slab_free+0xe/0x10
kfree+0xde/0x200
btrfs_backref_error_cleanup+0x452/0x530
build_backref_tree+0x1a5/0x700
relocate_tree_blocks+0x2be/0xb90
relocate_block_group+0x2eb/0x780
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x1900
btrfs_ioctl_balance+0x3a7/0x460
btrfs_ioctl+0x24c8/0x4360
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache(). This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up. However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache. Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.
Fixes: 75bfb9aff45e ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 02d7d7b2563b..9cadacf3ec27 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -3117,7 +3117,7 @@ void btrfs_backref_error_cleanup(struct btrfs_backref_cache *cache,
list_del_init(&lower->list);
if (lower == node)
node = NULL;
- btrfs_backref_free_node(cache, lower);
+ btrfs_backref_drop_node(cache, lower);
}
btrfs_backref_cleanup_node(cache, node);
Below race can come, if trace_open and resize of
cpu buffer is running parallely on different cpus
CPUX CPUY
ring_buffer_resize
atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
rb_update_pages
remove/insert pages
resetting pointer
This race can cause data abort or some times infinte loop in
rb_remove_pages and rb_insert_pages while checking pages
for sanity.
Take buffer lock to fix this.
Signed-off-by: Gaurav Kohli <gkohli(a)codeaurora.org>
Cc: stable(a)vger.kernel.org
---
Changes since v0:
-Addressed Steven's review comments.
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 93ef0ab..15bf28b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4866,6 +4866,9 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
if (!cpumask_test_cpu(cpu, buffer->cpumask))
return;
+ /* prevent another thread from changing buffer sizes */
+ mutex_lock(&buffer->mutex);
+
atomic_inc(&cpu_buffer->resize_disabled);
atomic_inc(&cpu_buffer->record_disabled);
@@ -4876,6 +4879,8 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
atomic_dec(&cpu_buffer->record_disabled);
atomic_dec(&cpu_buffer->resize_disabled);
+
+ mutex_unlock(&buffer->mutex);
}
EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
@@ -4889,6 +4894,9 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
struct ring_buffer_per_cpu *cpu_buffer;
int cpu;
+ /* prevent another thread from changing buffer sizes */
+ mutex_lock(&buffer->mutex);
+
for_each_online_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
@@ -4907,6 +4915,8 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
atomic_dec(&cpu_buffer->record_disabled);
atomic_dec(&cpu_buffer->resize_disabled);
}
+
+ mutex_unlock(&buffer->mutex);
}
/**
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
musb_queue_resume_work() would call the provided callback if the runtime
PM status was 'active'. Otherwise, it would enqueue the request if the
hardware was still suspended (musb->is_runtime_suspended is true).
This causes a race with the runtime PM handlers, as it is possible to be
in the case where the runtime PM status is not yet 'active', but the
hardware has been awaken (PM resume function has been called).
When hitting the race, the resume work was not enqueued, which probably
triggered other bugs further down the stack. For instance, a telnet
connection on Ingenic SoCs would result in a 50/50 chance of a
segmentation fault somewhere in the musb code.
Rework the code so that either we call the callback directly if
(musb->is_runtime_suspended == 0), or enqueue the query otherwise.
Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
Cc: stable(a)vger.kernel.org # v4.9+
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Reviewed-by: Tony Lindgren <tony(a)atomide.com>
Tested-by: Tony Lindgren <tony(a)atomide.com>
---
drivers/usb/musb/musb_core.c | 31 +++++++++++++++++--------------
1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index 849e0b770130..1cd87729ba60 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -2240,32 +2240,35 @@ int musb_queue_resume_work(struct musb *musb,
{
struct musb_pending_work *w;
unsigned long flags;
+ bool is_suspended;
int error;
if (WARN_ON(!callback))
return -EINVAL;
- if (pm_runtime_active(musb->controller))
- return callback(musb, data);
+ spin_lock_irqsave(&musb->list_lock, flags);
+ is_suspended = musb->is_runtime_suspended;
+
+ if (is_suspended) {
+ w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
+ if (!w) {
+ error = -ENOMEM;
+ goto out_unlock;
+ }
- w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
- if (!w)
- return -ENOMEM;
+ w->callback = callback;
+ w->data = data;
- w->callback = callback;
- w->data = data;
- spin_lock_irqsave(&musb->list_lock, flags);
- if (musb->is_runtime_suspended) {
list_add_tail(&w->node, &musb->pending_list);
error = 0;
- } else {
- dev_err(musb->controller, "could not add resume work %p\n",
- callback);
- devm_kfree(musb->controller, w);
- error = -EINPROGRESS;
}
+
+out_unlock:
spin_unlock_irqrestore(&musb->list_lock, flags);
+ if (!is_suspended)
+ error = callback(musb, data);
+
return error;
}
EXPORT_SYMBOL_GPL(musb_queue_resume_work);
--
2.29.2
From: Xiaoming Ni <nixiaoming(a)huawei.com>
Subject: proc_sysctl: fix oops caused by incorrect command parameters
The process_sysctl_arg() does not check whether val is empty before
invoking strlen(val). If the command line parameter () is incorrectly
configured and val is empty, oops is triggered.
For example:
"hung_task_panic=1" is incorrectly written as "hung_task_panic", oops is
triggered. The call stack is as follows:
Kernel command line: .... hung_task_panic
......
Call trace:
__pi_strlen+0x10/0x98
parse_args+0x278/0x344
do_sysctl_args+0x8c/0xfc
kernel_init+0x5c/0xf4
ret_from_fork+0x10/0x30
To fix it, check whether "val" is empty when "phram" is a sysctl field.
Error codes are returned in the failure branch, and error logs are
generated by parse_args().
Link: https://lkml.kernel.org/r/20210118133029.28580-1-nixiaoming@huawei.com
Fixes: 3db978d480e2843 ("kernel/sysctl: support setting sysctl parameters from kernel command line")
Signed-off-by: Xiaoming Ni <nixiaoming(a)huawei.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Iurii Zaikin <yzaikin(a)google.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Heiner Kallweit <hkallweit1(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/proc_sysctl.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/fs/proc/proc_sysctl.c~proc_sysctl-fix-oops-caused-by-incorrect-command-parameters
+++ a/fs/proc/proc_sysctl.c
@@ -1770,6 +1770,12 @@ static int process_sysctl_arg(char *para
return 0;
}
+ if (!val)
+ return -EINVAL;
+ len = strlen(val);
+ if (len == 0)
+ return -EINVAL;
+
/*
* To set sysctl options, we use a temporary mount of proc, look up the
* respective sys/ file and write to it. To avoid mounting it when no
@@ -1811,7 +1817,6 @@ static int process_sysctl_arg(char *para
file, param, val);
goto out;
}
- len = strlen(val);
wret = kernel_write(file, val, len, &pos);
if (wret < 0) {
err = wret;
_