From: Joerg Roedel <jroedel(a)suse.de>
In kernels compiled with CONFIG_PARAVIRT=n the compiler re-orders the
DR7 read in exc_nmi() to happen before the call to sev_es_ist_enter().
This is problematic when running as an SEV-ES guest because in this
environment the DR7 read might cause a #VC exception, and taking #VC
exceptions is not safe in exc_nmi() before sev_es_ist_enter() has run.
The result is stack recursion if the NMI was caused on the #VC IST
stack, because a subsequent #VC exception in the NMI handler will
overwrite the stack frame of the interrupted #VC handler.
As there are no compiler barriers affecting the ordering of DR7
reads/writes, make the accesses to this register volatile, forbidding
the compiler from re-ordering them.
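For reference, __FORCE_ORDER is the dummy memory operand defined in
arch/x86/include/asm/special_insns.h; listing it as an asm input gives
the compiler an artificial dependency it cannot reorder across:

/*
 * From arch/x86/include/asm/special_insns.h: a fake memory operand at a
 * fixed address. The compiler cannot prove anything about its contents,
 * so asm statements that take it as an input keep their program order.
 */
#define __FORCE_ORDER "m"(*(unsigned int *)0x1000UL)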
[ bp: Massage text. ]
Cc: stable(a)vger.kernel.org
Fixes: 315562c9af3d ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler")
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
arch/x86/include/asm/debugreg.h | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index b049d950612f..d54d5dbfb4b9 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -39,7 +39,20 @@ static __always_inline unsigned long native_get_debugreg(int regno)
asm("mov %%db6, %0" :"=r" (val));
break;
case 7:
- asm("mov %%db7, %0" :"=r" (val));
+ /*
+ * Apply __FORCE_ORDER to DR7 reads to forbid re-ordering them
+ * with other code.
+ *
+ * This is needed because a DR7 access can cause a #VC exception
+ * when running under SEV-ES. Taking a #VC exception is not a
+ * safe thing to do just anywhere in the entry code and
+ * re-ordering might place the access into an unsafe location.
+ *
+ * This happened in the NMI handler, where the DR7 read was
+ * re-ordered to happen before the call to sev_es_ist_enter(),
+ * causing stack recursion.
+ */
+ asm("mov %%db7, %0" : "=r" (val) : __FORCE_ORDER);
break;
default:
BUG();
@@ -66,8 +79,16 @@ static __always_inline void native_set_debugreg(int regno, unsigned long value)
asm("mov %0, %%db6" ::"r" (value));
break;
case 7:
- asm("mov %0, %%db7" ::"r" (value));
- break;
+ /*
+ * Apply __FORCE_ORDER to DR7 writes to forbid re-ordering them
+ * with other code.
+ *
+ * While it didn't happen with a DR7 write (see the DR7 read
+ * comment above which explains where it happened), add the
+ * __FORCE_ORDER here too to avoid similar problems in the
+ * future.
+ */
+ asm("mov %0, %%db7" ::"r" (value), __FORCE_ORDER); break;
default:
BUG();
}
--
2.39.0
From: Eric Biggers <ebiggers(a)google.com>
When converting an inline directory to a regular one, f2fs is leaking
uninitialized memory to disk because it doesn't initialize the entire
directory block. Fix this by zero-initializing the block.
This bug was introduced by commit 4ec17d688d74 ("f2fs: avoid unneeded
initializing when converting inline dentry"), which didn't consider the
security implications of leaking uninitialized memory to disk.
This was found by running xfstest generic/435 on a KMSAN-enabled kernel.
Fixes: 4ec17d688d74 ("f2fs: avoid unneeded initializing when converting inline dentry")
Cc: <stable(a)vger.kernel.org> # v4.3+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/f2fs/inline.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 08e302d32118d..72269e7efd260 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -421,18 +421,17 @@ static int f2fs_move_inline_dirents(struct inode *dir, struct page *ipage,
dentry_blk = page_address(page);
+ /*
+ * Start by zeroing the full block, to ensure that all unused space is
+ * zeroed and no uninitialized memory is leaked to disk.
+ */
+ memset(dentry_blk, 0, F2FS_BLKSIZE);
+
make_dentry_ptr_inline(dir, &src, inline_dentry);
make_dentry_ptr_block(dir, &dst, dentry_blk);
/* copy data from inline dentry block to new dentry block */
memcpy(dst.bitmap, src.bitmap, src.nr_bitmap);
- memset(dst.bitmap + src.nr_bitmap, 0, dst.nr_bitmap - src.nr_bitmap);
- /*
- * we do not need to zero out remainder part of dentry and filename
- * field, since we have used bitmap for marking the usage status of
- * them, besides, we can also ignore copying/zeroing reserved space
- * of dentry block, because them haven't been used so far.
- */
memcpy(dst.dentry, src.dentry, SIZE_OF_DIR_ENTRY * src.max);
memcpy(dst.filename, src.filename, src.max * F2FS_SLOT_LEN);
base-commit: 7a2b15cfa8dbbd54beb4e2ce7b2f42eb0ad00425
--
2.39.1
Fix mt7615_init_tx_queues() so that it returns the result value of the
final call to mt7615_init_tx_queue() instead of a default 0. Otherwise,
a failure in mt7615_init_tx_queue() is not tracked by the parent
function. The patch can be cleanly applied to the 5.10 branch.
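For illustration only, a minimal userspace sketch of the bug pattern
(helper names are hypothetical, not the driver's actual code): the loop
records each call's result, but the function returns a hard-coded 0, so
a failure in the final init call is lost.

#include <stdio.h>

/* Hypothetical stand-in for mt7615_init_tx_queue(); fails on queue 2. */
static int init_one_queue(int idx)
{
        return idx == 2 ? -12 /* -ENOMEM */ : 0;
}

/* Bug pattern: ret is assigned but a literal 0 is returned. */
static int init_tx_queues_buggy(int nq)
{
        int i, ret = 0;

        for (i = 0; i < nq; i++)
                ret = init_one_queue(i);
        return 0;                       /* failure is silently dropped */
}

/* Fixed pattern: propagate the result of the final call. */
static int init_tx_queues_fixed(int nq)
{
        int i, ret = 0;

        for (i = 0; i < nq; i++) {
                ret = init_one_queue(i);
                if (ret)
                        break;
        }
        return ret;
}

int main(void)
{
        printf("buggy: %d, fixed: %d\n",
               init_tx_queues_buggy(4), init_tx_queues_fixed(4));
        return 0;
}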
This issue is addressed in b671da33d1c5 upstream. I decided to refrain
from backporting it in full for now.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
ef5c600adb1d ("io_uring: always prep_async for drain requests")
973fc83f3a94 ("io_uring: defer all io_req_complete_failed")
1bec951c3809 ("io_uring: iopoll protect complete_post")
fa18fa2272c7 ("io_uring: inline __io_req_complete_put()")
e276ae344a77 ("io_uring: hold locks for io_req_complete_failed")
f9d567c75ec2 ("io_uring: inline __io_req_complete_post()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ef5c600adb1d985513d2b612cc90403a148ff287 Mon Sep 17 00:00:00 2001
From: Dylan Yudaken <dylany(a)meta.com>
Date: Fri, 27 Jan 2023 02:59:11 -0800
Subject: [PATCH] io_uring: always prep_async for drain requests
Drain requests all go through io_drain_req, which has a quick exit in case
there is nothing pending (ie the drain is not useful). In that case it can
issue the request immediately.
However for safety it queues it through task work.
The problem is that in this case the request is run asynchronously, but
the async work has not been prepared through io_req_prep_async.
This has not been a problem up to now, as the task work always would run
before returning to userspace, and so the user would not have a chance to
race with it.
However - with IORING_SETUP_DEFER_TASKRUN - this is no longer the case and
the work might be deferred, giving userspace a chance to change data being
referred to in the request.
Instead _always_ prep_async for drain requests, which is simpler anyway
and removes this issue.
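For context, a hedged userspace sketch of the window (the liburing calls
are real, but this is an assumed illustration, not the actual reproducer,
and hitting the race depends on timing):

#include <liburing.h>
#include <string.h>

int main(void)
{
        struct io_uring ring;
        struct io_uring_sqe *sqe;
        char buf[64] = "original data";

        /* DEFER_TASKRUN (which requires SINGLE_ISSUER) defers task work
         * until the task next enters the kernel, opening the window
         * described above. */
        io_uring_queue_init(8, &ring, IORING_SETUP_SINGLE_ISSUER |
                                      IORING_SETUP_DEFER_TASKRUN);

        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_write(sqe, 1 /* stdout */, buf, strlen(buf), 0);
        sqe->flags |= IOSQE_IO_DRAIN;   /* routed through io_drain_req() */
        io_uring_submit(&ring);

        /* Before the fix, the drained request might not have been
         * prepped yet, so it could observe this store. */
        memcpy(buf, "modified data", 13);

        io_uring_queue_exit(&ring);
        return 0;
}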
Cc: stable(a)vger.kernel.org
Fixes: c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Dylan Yudaken <dylany(a)meta.com>
Link: https://lore.kernel.org/r/20230127105911.2420061-1-dylany@meta.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 0a4efada9b3c..db623b3185c8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1765,17 +1765,12 @@ static __cold void io_drain_req(struct io_kiocb *req)
}
spin_unlock(&ctx->completion_lock);
- ret = io_req_prep_async(req);
- if (ret) {
-fail:
- io_req_defer_failed(req, ret);
- return;
- }
io_prep_async_link(req);
de = kmalloc(sizeof(*de), GFP_KERNEL);
if (!de) {
ret = -ENOMEM;
- goto fail;
+ io_req_defer_failed(req, ret);
+ return;
}
spin_lock(&ctx->completion_lock);
@@ -2048,13 +2043,16 @@ static void io_queue_sqe_fallback(struct io_kiocb *req)
req->flags &= ~REQ_F_HARDLINK;
req->flags |= REQ_F_LINK;
io_req_defer_failed(req, req->cqe.res);
- } else if (unlikely(req->ctx->drain_active)) {
- io_drain_req(req);
} else {
int ret = io_req_prep_async(req);
- if (unlikely(ret))
+ if (unlikely(ret)) {
io_req_defer_failed(req, ret);
+ return;
+ }
+
+ if (unlikely(req->ctx->drain_active))
+ io_drain_req(req);
else
io_queue_iowq(req, NULL);
}
[CC stable as I only tested the stable tree for now]
I'm running a current Alpine Linux with linux-edge-6.1.8-r0 installed on a
Lenovo Thinkpad L540 where an external disk enclosure with two disks is
attached via USB. The Alpine Linux kernel appears to track Linux stable
and is more or less vanilla. Also, the machine boots into Xen 4.17.0 and
then starts a few headless VMs, nothing too exotic here.
But when updating from Linux 6.1.1 to 6.1.8, the disks from the external
enclosure did not show up. Unplug, replug, no dice, and this is 100%
reproducible. dmesg now has these new lines:
+ioremap error for 0xf2520000-0xf2530000, requested 0x2, got 0x0
+ioremap error for 0xf2520000-0xf2530000, requested 0x2, got 0x0
+xhci_hcd 0000:00:14.0: init 0000:00:14.0 fail, -14
+ioremap error for 0xfed1f000-0xfed20000, requested 0x2, got 0x0
+iTCO_wdt iTCO_wdt.1.auto: ioremap failed for resource [mem 0xfed1f410-0xfed1f414]
I'm not sure if the ioremap error is related here (booted with
early_ioremap_debug but then dmesg was filled with WARNINGS for both
versions, so I disabled it again), but that xhci_hcd error looks
suspicious.
Curiously 6.1.8 works just fine when NOT booted via Xen. I booted into
Xen + vanilla 6.1.8 now and was able to reproduce this issue. Xen +
vanilla 6.1.1 works fine.
From v6.1.1 to v6.1.8 there's only one commit in drivers/xen, but 54
commits in drivers/usb. Compiling takes time because the distribution
kernel has almost everything enabled and I still need to cut down the
enabled options to be able to attempt a git bisect in a reasonable time,
but I still wanted to report this, maybe someone has an idea about this.
Full dmesg and lshw outputs: https://nerdbynature.de/bits/usb_v6.1.8/
Thanks,
Christian.
PS: I found this workaround on the interwebs[0] to force the USB ports
of that machine to USB 2.0 and then the missing disks magically appear:
$ lspci -nn | grep -i usb
00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05) <=== !!!
00:1a.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 [8086:8c2d] (rev 05)
00:1d.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 [8086:8c26] (rev 05)
$ setpci -H1 -d 8086:8c31 d8.l=0
$ setpci -H1 -d 8086:8c31 d0.l=0
$ dmesg
usb 1-1.3: new full-speed USB device number 3 using ehci-pci
usb 2-1.3: new high-speed USB device number 3 using ehci-pci
usb 1-1.3: New USB device found, idVendor=138a, idProduct=0011, bcdDevice=0.78
usb 1-1.3: New USB device strings: Mfr=0, Product=0, SerialNumber=1
usb 1-1.3: SerialNumber: aa32bf84ed47
usb 1-1.5: new full-speed USB device number 4 using ehci-pci
usb 2-1.3: New USB device found, idVendor=1e91, idProduct=a3a8, bcdDevice=2.07
usb 2-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=5
usb 2-1.3: Product: Elite Pro Dual
usb 2-1.3: Manufacturer: OWC
usb 2-1.3: SerialNumber: RANDOM__1E359879645F
usb 2-1.3: UAS is ignored for this device, using usb-storage instead
usb-storage 2-1.3:1.0: USB Mass Storage device detected
usb-storage 2-1.3:1.0: Quirks match for vid 1e91 pid a3a8: 800000
scsi host5: usb-storage 2-1.3:1.0
usb 1-1.5: New USB device found, idVendor=8087, idProduct=07dc, bcdDevice=0.01
usb 1-1.5: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Bluetooth: hci0: Legacy ROM 2.5 revision 8.0 build 1 week 45 2013
Bluetooth: hci0: Intel Bluetooth firmware file: intel/ibt-hw-37.7.10-fw-1.80.1.2d.d.bseq
usb 1-1.6: new high-speed USB device number 5 using ehci-pci
usb 1-1.6: New USB device found, idVendor=04f2, idProduct=b398, bcdDevice=39.98
usb 1-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-1.6: Product: Integrated Camera
usb 1-1.6: Manufacturer: Vimicro corp.
Bluetooth: hci0: Intel BT fw patch 0x2a completed & activated
scsi 5:0:0:0: Direct-Access ElitePro Dual U3FW-1 0207 PQ: 0 ANSI: 6
scsi 5:0:0:1: Direct-Access ElitePro Dual U3FW-2 0207 PQ: 0 ANSI: 6
sd 5:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
sd 5:0:0:0: [sdc] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
sd 5:0:0:1: [sdd] Very big device. Trying to use READ CAPACITY(16).
sd 5:0:0:1: [sdd] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
sd 5:0:0:1: [sdd] Write Protect is off
sd 5:0:0:1: [sdd] Mode Sense: 47 00 10 08
sd 5:0:0:0: [sdc] Write Protect is off
sd 5:0:0:0: [sdc] Mode Sense: 47 00 10 08
sd 5:0:0:0: [sdc] No Caching mode page found
sd 5:0:0:0: [sdc] Assuming drive cache: write through
sd 5:0:0:1: [sdd] No Caching mode page found
sd 5:0:0:1: [sdd] Assuming drive cache: write through
sd 5:0:0:0: [sdc] Attached SCSI disk
sd 5:0:0:1: [sdd] Attached SCSI disk
$ lsblk /dev/sd[cd]
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdc 8:32 0 3.6T 0 disk
sdd 8:48 0 3.6T 0 disk
[0] https://superuser.com/a/875863/218574
--
BOFH excuse #135:
You put the disk in upside down.
From: Mel Gorman <mgorman(a)techsingularity.net>
commit 1c0908d8e441631f5b8ba433523cf39339ee2ba0 upstream.
Jan Kara reported the following bug triggering on 6.0.5-rt14 running dbench
on XFS on arm64.
kernel BUG at fs/inode.c:625!
Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
CPU: 11 PID: 6611 Comm: dbench Tainted: G E 6.0.0-rt14-rt+ #1
pc : clear_inode+0xa0/0xc0
lr : clear_inode+0x38/0xc0
Call trace:
clear_inode+0xa0/0xc0
evict+0x160/0x180
iput+0x154/0x240
do_unlinkat+0x184/0x300
__arm64_sys_unlinkat+0x48/0xc0
el0_svc_common.constprop.4+0xe4/0x2c0
do_el0_svc+0xac/0x100
el0_svc+0x78/0x200
el0t_64_sync_handler+0x9c/0xc0
el0t_64_sync+0x19c/0x1a0
It also affects 6.1-rc7-rt5 and a preempt-rt fork of 5.14, so this is
likely a bug that has existed forever and only became visible when ARM
support was added to preempt-rt. The same problem does not occur on x86-64
and he also reported that converting sb->s_inode_wblist_lock to
raw_spinlock_t makes the problem disappear indicating that the RT spinlock
variant is the problem.
Which in turn means that RT mutexes on ARM64 and any other weakly ordered
architecture are affected by this independent of RT.
Will Deacon observed:
"I'd be more inclined to be suspicious of the slowpath tbh, as we need to
make sure that we have acquire semantics on all paths where the lock can
be taken. Looking at the rtmutex code, this really isn't obvious to me
-- for example, try_to_take_rt_mutex() appears to be able to return via
the 'takeit' label without acquire semantics and it looks like we might
be relying on the caller's subsequent _unlock_ of the wait_lock for
ordering, but that will give us release semantics which aren't correct."
Sebastian Andrzej Siewior prototyped a fix that does work based on that
comment but it was a little bit overkill and added some fences that should
not be necessary.
The lock owner is updated with an IRQ-safe raw spinlock held, but the
spin_unlock does not provide acquire semantics which are needed when
acquiring a mutex.
Add the necessary acquire semantics for lock owner updates in the slow
path acquisition and the waiter bit logic.
It successfully completed 10 iterations of the dbench workload while the
vanilla kernel fails on the first iteration.
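As background (a standalone C11 illustration, not the kernel code): a
lock acquisition must publish ownership with acquire semantics so the
critical section cannot be speculated ahead of it; a plain store, like
the WRITE_ONCE() being replaced, gives no such guarantee on weakly
ordered CPUs.

#include <stdatomic.h>

static _Atomic(void *) owner;

/* Broken on arm64: with relaxed ordering, accesses in the critical
 * section may be reordered before ownership is recorded. */
void set_owner_relaxed(void *me)
{
        atomic_store_explicit(&owner, me, memory_order_relaxed);
}

/* Analogous to the patch's xchg_acquire(): no later access can be
 * reordered before the exchange. */
void set_owner_acquire(void *me)
{
        atomic_exchange_explicit(&owner, me, memory_order_acquire);
}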
[ bigeasy(a)linutronix.de: Initial prototype fix ]
Fixes: 700318d1d7b38 ("locking/rtmutex: Use acquire/release semantics")
Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
Reported-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20221202100223.6mevpbl7i6x5udfd@techsingularity.n…
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
---
Could this please be backported to 5.15 and earlier? It is already part
of the 6.x kernels. Either I am not waiting long enough or it has been
skipped because it does not apply cleanly. In case of the latter, here
is a patch that applies against v5.15.
I received reports that this fixes "mysterious" crashes and that is how
I noticed that it is not part of the earlier kernels.
kernel/locking/rtmutex.c | 55 ++++++++++++++++++++++++++++++------
kernel/locking/rtmutex_api.c | 6 ++--
2 files changed, 49 insertions(+), 12 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ea5a701ab2408..c9b21fd30bed5 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -87,15 +87,31 @@ static inline int __ww_mutex_check_kill(struct rt_mutex *lock,
* set this bit before looking at the lock.
*/
-static __always_inline void
-rt_mutex_set_owner(struct rt_mutex_base *lock, struct task_struct *owner)
+static __always_inline struct task_struct *
+rt_mutex_owner_encode(struct rt_mutex_base *lock, struct task_struct *owner)
{
unsigned long val = (unsigned long)owner;
if (rt_mutex_has_waiters(lock))
val |= RT_MUTEX_HAS_WAITERS;
- WRITE_ONCE(lock->owner, (struct task_struct *)val);
+ return (struct task_struct *)val;
+}
+
+static __always_inline void
+rt_mutex_set_owner(struct rt_mutex_base *lock, struct task_struct *owner)
+{
+ /*
+ * lock->wait_lock is held but explicit acquire semantics are needed
+ * for a new lock owner so WRITE_ONCE is insufficient.
+ */
+ xchg_acquire(&lock->owner, rt_mutex_owner_encode(lock, owner));
+}
+
+static __always_inline void rt_mutex_clear_owner(struct rt_mutex_base *lock)
+{
+ /* lock->wait_lock is held so the unlock provides release semantics. */
+ WRITE_ONCE(lock->owner, rt_mutex_owner_encode(lock, NULL));
}
static __always_inline void clear_rt_mutex_waiters(struct rt_mutex_base *lock)
@@ -104,7 +120,8 @@ static __always_inline void clear_rt_mutex_waiters(struct rt_mutex_base *lock)
((unsigned long)lock->owner & ~RT_MUTEX_HAS_WAITERS);
}
-static __always_inline void fixup_rt_mutex_waiters(struct rt_mutex_base *lock)
+static __always_inline void
+fixup_rt_mutex_waiters(struct rt_mutex_base *lock, bool acquire_lock)
{
unsigned long owner, *p = (unsigned long *) &lock->owner;
@@ -170,8 +187,21 @@ static __always_inline void fixup_rt_mutex_waiters(struct rt_mutex_base *lock)
* still set.
*/
owner = READ_ONCE(*p);
- if (owner & RT_MUTEX_HAS_WAITERS)
- WRITE_ONCE(*p, owner & ~RT_MUTEX_HAS_WAITERS);
+ if (owner & RT_MUTEX_HAS_WAITERS) {
+ /*
+ * See rt_mutex_set_owner() and rt_mutex_clear_owner() on
+ * why xchg_acquire() is used for updating owner for
+ * locking and WRITE_ONCE() for unlocking.
+ *
+ * WRITE_ONCE() would work for the acquire case too, but
+ * in case that the lock acquisition failed it might
+ * force other lockers into the slow path unnecessarily.
+ */
+ if (acquire_lock)
+ xchg_acquire(p, owner & ~RT_MUTEX_HAS_WAITERS);
+ else
+ WRITE_ONCE(*p, owner & ~RT_MUTEX_HAS_WAITERS);
+ }
}
/*
@@ -206,6 +236,13 @@ static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
owner = *p;
} while (cmpxchg_relaxed(p, owner,
owner | RT_MUTEX_HAS_WAITERS) != owner);
+
+ /*
+ * The cmpxchg loop above is relaxed to avoid back-to-back ACQUIRE
+ * operations in the event of contention. Ensure the successful
+ * cmpxchg is visible.
+ */
+ smp_mb__after_atomic();
}
/*
@@ -1231,7 +1268,7 @@ static int __sched __rt_mutex_slowtrylock(struct rt_mutex_base *lock)
* try_to_take_rt_mutex() sets the lock waiters bit
* unconditionally. Clean this up.
*/
- fixup_rt_mutex_waiters(lock);
+ fixup_rt_mutex_waiters(lock, true);
return ret;
}
@@ -1591,7 +1628,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
* try_to_take_rt_mutex() sets the waiter bit
* unconditionally. We might have to fix that up.
*/
- fixup_rt_mutex_waiters(lock);
+ fixup_rt_mutex_waiters(lock, true);
return ret;
}
@@ -1701,7 +1738,7 @@ static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock)
* try_to_take_rt_mutex() sets the waiter bit unconditionally.
* We might have to fix that up:
*/
- fixup_rt_mutex_waiters(lock);
+ fixup_rt_mutex_waiters(lock, true);
debug_rt_mutex_free_waiter(&waiter);
}
diff --git a/kernel/locking/rtmutex_api.c b/kernel/locking/rtmutex_api.c
index 5c9299aaabae1..a461be2f873db 100644
--- a/kernel/locking/rtmutex_api.c
+++ b/kernel/locking/rtmutex_api.c
@@ -245,7 +245,7 @@ void __sched rt_mutex_init_proxy_locked(struct rt_mutex_base *lock,
void __sched rt_mutex_proxy_unlock(struct rt_mutex_base *lock)
{
debug_rt_mutex_proxy_unlock(lock);
- rt_mutex_set_owner(lock, NULL);
+ rt_mutex_clear_owner(lock);
}
/**
@@ -360,7 +360,7 @@ int __sched rt_mutex_wait_proxy_lock(struct rt_mutex_base *lock,
* try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
* have to fix that up.
*/
- fixup_rt_mutex_waiters(lock);
+ fixup_rt_mutex_waiters(lock, true);
raw_spin_unlock_irq(&lock->wait_lock);
return ret;
@@ -416,7 +416,7 @@ bool __sched rt_mutex_cleanup_proxy_lock(struct rt_mutex_base *lock,
* try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
* have to fix that up.
*/
- fixup_rt_mutex_waiters(lock);
+ fixup_rt_mutex_waiters(lock, false);
raw_spin_unlock_irq(&lock->wait_lock);
--
2.39.1
The bug that upstream commit c6c7f2a84da45 ("nfsd: Ensure knfsd shuts
down when the "nfsd" pseudofs is unmounted") fixes in kernel v5.13
exists in kernel v5.4 too.
That is, knfsd threads are left behind once the nfsd pseudofs is
unmounted, e.g. when the container is killed.
I backported the patch to kernel v5.4, and tested it.
Trond Myklebust (1):
nfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted
fs/nfsd/netns.h | 6 +++---
fs/nfsd/nfs4state.c | 8 +-------
fs/nfsd/nfsctl.c | 14 ++------------
fs/nfsd/nfsd.h | 3 +--
fs/nfsd/nfssvc.c | 35 ++++++++++++++++++++++++++++++++++-
5 files changed, 41 insertions(+), 25 deletions(-)
--
2.30.2
The following patch is required to fix the ext4 online resize in 5.15.
Baokun Li (1):
ext4: fix bad checksum after online resize
fs/ext4/resize.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--
2.39.0.246.g2a6d74b583-goog
This bug is marked as fixed by commit:
net: core: netlink: add helper refcount dec and lock function
net: sched: add helper function to take reference to Qdisc
net: sched: extend Qdisc with rcu
net: sched: rename qdisc_destroy() to qdisc_put()
net: sched: use Qdisc rcu API instead of relying on rtnl lock
But I can't find it in the tested trees[1] for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and new crashes with
the same signature are ignored.
Kernel: Linux 4.19
Dashboard link: https://syzkaller.appspot.com/bug?extid=5f229e48cccc804062c0
---
[1] I expect the commit to be present in:
1. linux-4.19.y branch of
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
From: Linux kernel regression tracking (Thorsten Leemhuis)
Hi Greg. Just a heads up that likely isn't needed, as the change
mentioned below has a proper Cc: <stable@...> tag, so your scripts will
likely do the right thing automatically. But just to be sure:
I'm pretty sure Vlastimil would be extremely happy if you include
95e7a450b819 ("Revert "mm/compaction: fix set skip in
fast_find_migrateblock"") [was merged ~2 hours ago] in the next 6.1.y
release, as he in [1] wrote:
> FWIW, I consider this serious enough to be fixed in mainline+stable ASAP,
> hopefully in rc5, as it does hurt people using 6.1. mm-fixes PR for rc5 was
> sent 2 days ago [1] so please flag this in your regression report for Linus
> etc. Thanks.
[1]
https://lore.kernel.org/all/f47f69f9-7378-f18c-399b-b277c753532e@suse.cz/
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
The quilt patch titled
Subject: Revert "mm/compaction: fix set skip in fast_find_migrateblock"
has been removed from the -mm tree. Its filename was
revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: Revert "mm/compaction: fix set skip in fast_find_migrateblock"
Date: Fri, 13 Jan 2023 18:33:45 +0100
This reverts commit 7efc3b7261030da79001c00d92bc3392fd6c664c.
We have got openSUSE reports (Link 1) for 6.1 kernel with khugepaged
stalling CPU for long periods of time. Investigation of tracepoint data
shows that compaction is stuck in repeating fast_find_migrateblock() based
migrate page isolation, and then fails to migrate all isolated pages.
Commit 7efc3b726103 ("mm/compaction: fix set skip in
fast_find_migrateblock") was suspected as it was merged in 6.1 and in
theory can indeed remove a termination condition for
fast_find_migrateblock() under certain conditions, as it removes a place
that always marks a scanned pageblock from being re-scanned. There are
other such places, but those can be skipped under certain conditions,
which seems to match the tracepoint data.
Testing of revert also appears to have resolved the issue, thus revert the
commit until a more robust solution for the original problem is developed.
It's also likely this will fix qemu stalls with 6.1 kernel reported in
Link 2, but that is not yet confirmed.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206848
Link: https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@kernel.org/
Link: https://lkml.kernel.org/r/20230113173345.9692-1-vbabka@suse.cz
Fixes: 7efc3b726103 ("mm/compaction: fix set skip in fast_find_migrateblock")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Tested-by: Jiri Slaby <jirislaby(a)kernel.org>
Tested-by: Pedro Falcato <pedro.falcato(a)gmail.com>
Cc: Chuyi Zhou <zhouchuyi(a)bytedance.com>
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Thorsten Leemhuis <regressions(a)leemhuis.info>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/compaction.c~revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock
+++ a/mm/compaction.c
@@ -1839,6 +1839,7 @@ static unsigned long fast_find_migratebl
pfn = cc->zone->zone_start_pfn;
cc->fast_search_fail = 0;
found_block = true;
+ set_pageblock_skip(freepage);
break;
}
}
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-mremap-fix-mremap-expanding-for-vmas-with-vm_ops-close.patch
mm-use-stack_depot_early_init-for-kmemleak-fix.patch
The patch titled
Subject: memory tier: release the new_memtier in find_create_memory_tier()
has been added to the -mm mm-unstable branch. Its filename is
memory-tier-release-the-new_memtier-in-find_create_memory_tier.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Tong Tiangen <tongtiangen(a)huawei.com>
Subject: memory tier: release the new_memtier in find_create_memory_tier()
Date: Sun, 29 Jan 2023 04:06:51 +0000
In find_create_memory_tier(), if we fail to register the device, we
should release new_memtier from the tier list and put its device,
instead of doing so for memtier.
Link: https://lkml.kernel.org/r/20230129040651.1329208-1-tongtiangen@huawei.com
Fixes: 9832fb87834e ("mm/demotion: expose memory tier details via sysfs")
Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Hanjun Guo <guohanjun(a)huawei.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Guohanjun <guohanjun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/memory-tiers.c~memory-tier-release-the-new_memtier-in-find_create_memory_tier
+++ a/mm/memory-tiers.c
@@ -211,8 +211,8 @@ static struct memory_tier *find_create_m
ret = device_register(&new_memtier->dev);
if (ret) {
- list_del(&memtier->list);
- put_device(&memtier->dev);
+ list_del(&new_memtier->list);
+ put_device(&new_memtier->dev);
return ERR_PTR(ret);
}
memtier = new_memtier;
_
Patches currently in -mm which might be from tongtiangen(a)huawei.com are
memory-tier-release-the-new_memtier-in-find_create_memory_tier.patch
We need to clear the affinity hint before free_irq(); otherwise there is
a one-time warning and stack trace during module unloading.
To fix this bug, set the affinity hint to NULL before freeing the IRQ,
as required.
Cc: stable(a)vger.kernel.org
Fixes: 71fa6887eeca ("net: mana: Assign interrupts to CPUs based on NUMA nodes")
Signed-off-by: Haiyang Zhang <haiyangz(a)microsoft.com>
---
drivers/net/ethernet/microsoft/mana/gdma_main.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index b144f2237748..3bae9d4c1f08 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1297,6 +1297,8 @@ static int mana_gd_setup_irqs(struct pci_dev *pdev)
for (j = i - 1; j >= 0; j--) {
irq = pci_irq_vector(pdev, j);
gic = &gc->irq_contexts[j];
+
+ irq_update_affinity_hint(irq, NULL);
free_irq(irq, gic);
}
@@ -1324,6 +1326,9 @@ static void mana_gd_remove_irqs(struct pci_dev *pdev)
continue;
gic = &gc->irq_contexts[i];
+
+ /* Need to clear the hint before free_irq */
+ irq_update_affinity_hint(irq, NULL);
free_irq(irq, gic);
}
--
2.25.1
blit_x and blit_y are uint32_t, so fbcon currently cannot support fonts
larger than 32x32.
The 32x32 case also needs to shift an unsigned int to properly set bit
31; otherwise we get "UBSAN: shift-out-of-bounds in fbcon_set_font",
as reported on
http://lore.kernel.org/all/IA1PR07MB98308653E259A6F2CE94A4AFABCE9@IA1PR07MB…
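As background (a minimal standalone illustration, not part of the
patch): for a 32x32 font, width - 1 is 31, and left-shifting the signed
literal 1 by 31 is undefined behavior in C, which is exactly what UBSAN
reports; the unsigned literal is well defined.

#include <stdint.h>

/* font->width == 32  =>  shift count of 31 */
uint32_t mask_bad(unsigned int width)
{
        return 1 << (width - 1);        /* UB at width == 32: signed overflow */
}

uint32_t mask_good(unsigned int width)
{
        return 1U << (width - 1);       /* well defined up to width == 32 */
}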
Kernel Branch: 6.2.0-rc5-next-20230124
Kernel config: https://drive.google.com/file/d/1F-LszDAizEEH0ZX0HcSR06v5q8FPl2Uv/view?usp=…
Reproducer: https://drive.google.com/file/d/1mP1jcLBY7vWCNM60OMf-ogw-urQRjNrm/view?usp=…
Reported-by: Sanan Hasanov <sanan.hasanov(a)Knights.ucf.edu>
Signed-off-by: Samuel Thibault <samuel.thibault(a)ens-lyon.org>
Index: linux-6.0/drivers/video/fbdev/core/fbcon.c
===================================================================
--- linux-6.0.orig/drivers/video/fbdev/core/fbcon.c
+++ linux-6.0/drivers/video/fbdev/core/fbcon.c
@@ -2489,9 +2489,12 @@ static int fbcon_set_font(struct vc_data
h > FBCON_SWAP(info->var.rotate, info->var.yres, info->var.xres))
return -EINVAL;
+ if (font->width > 32 || font->height > 32)
+ return -EINVAL;
+
/* Make sure drawing engine can handle the font */
- if (!(info->pixmap.blit_x & (1 << (font->width - 1))) ||
- !(info->pixmap.blit_y & (1 << (font->height - 1))))
+ if (!(info->pixmap.blit_x & (1U << (font->width - 1))) ||
+ !(info->pixmap.blit_y & (1U << (font->height - 1))))
return -EINVAL;
/* Make sure driver can handle the font length */
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
ba81043753ff ("scsi: ufs: core: Fix devfreq deadlocks")
87bd05016a64 ("scsi: ufs: core: Allow host driver to disable wb toggling during clock scaling")
dd11376b9f1b ("scsi: ufs: Split the drivers/scsi/ufs directory")
4bc26113c603 ("scsi: ufs: Split the ufshcd.h header file")
3f06f7800b80 ("scsi: ufs: Minimize #include directives")
cff91daf52d3 ("scsi: ufs: Fix kernel-doc syntax in ufshcd.h")
c10d52d73ae0 ("scsi: ufs: Remove unnecessary ufshcd-crypto.h include directives")
e2106584d011 ("scsi: ufs: Rename sdev_ufs_device into ufs_device_wlun")
c906e8328de8 ("scsi: ufs: Make the config_scaling_param calls type safe")
8d077ede48c1 ("scsi: ufs: Optimize the command queueing code")
5675c381ea51 ("scsi: ufs: Stop using the clock scaling lock in the error handler")
511a083b8b6b ("scsi: ufs: Remove hba->cmd_queue")
59830c095cf0 ("scsi: ufs: Remove the sdev_rpmb member")
db33028647a3 ("scsi: Remove superfluous #include <linux/async.h> directives")
ddba1cf7a506 ("scsi: ufs: Let devices remain runtime suspended during system suspend")
659109a45c6c ("scsi: ufs: Fix double space in SCSI_UFS_HWMON description")
d28a78537d1d ("scsi: ufs: Wrap Universal Flash Storage drivers in SCSI_UFSHCD")
fe91c4725aee ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ba81043753fffbc2ad6e0c5ff2659f12ac2f46b4 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Mon, 16 Jan 2023 17:12:01 +0100
Subject: [PATCH] scsi: ufs: core: Fix devfreq deadlocks
There is a lock inversion and rwsem read-lock recursion in the devfreq
target callback which can lead to deadlocks.
Specifically, ufshcd_devfreq_scale() already holds a clk_scaling_lock
read lock when toggling the write booster, which involves taking the
dev_cmd mutex before taking another clk_scaling_lock read lock.
This can lead to a deadlock if another thread:
1) tries to acquire the dev_cmd and clk_scaling locks in the correct
order, or
2) takes a clk_scaling write lock before the attempt to take the
clk_scaling read lock a second time.
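For case 1, a minimal standalone ABBA sketch (hypothetical names,
simplified to two plain mutexes; the kernel pairs a mutex with an rwsem,
where a queued writer makes even the second read lock block):

#include <pthread.h>

static pthread_mutex_t dev_cmd = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t clk_scaling = PTHREAD_MUTEX_INITIALIZER;

/* devfreq path: clk_scaling first, then dev_cmd. */
static void *scale_path(void *unused)
{
        pthread_mutex_lock(&clk_scaling);
        pthread_mutex_lock(&dev_cmd);           /* blocks if cmd_path won */
        pthread_mutex_unlock(&dev_cmd);
        pthread_mutex_unlock(&clk_scaling);
        return NULL;
}

/* dev-cmd path: the opposite order -- the inversion lockdep flags. */
static void *cmd_path(void *unused)
{
        pthread_mutex_lock(&dev_cmd);
        pthread_mutex_lock(&clk_scaling);       /* ABBA deadlock here */
        pthread_mutex_unlock(&clk_scaling);
        pthread_mutex_unlock(&dev_cmd);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, scale_path, NULL);
        pthread_create(&b, NULL, cmd_path, NULL);
        pthread_join(a, NULL);                  /* may never return */
        pthread_join(b, NULL);
        return 0;
}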
Fix this by dropping the clk_scaling_lock before toggling the write booster
as was done before commit 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts
from unexpected clock scaling").
While the devfreq callbacks are already serialised, add a second
serialising mutex to handle the unlikely case where a callback triggered
through the devfreq sysfs interface is racing with a request to disable
clock scaling through the UFS controller 'clkscale_enable' sysfs
attribute. This could otherwise lead to the write booster being left
disabled after having disabled clock scaling.
Also take the new mutex in ufshcd_clk_scaling_allow() to make sure that any
pending write booster update has completed on return.
Note that this currently only affects Qualcomm platforms since commit
87bd05016a64 ("scsi: ufs: core: Allow host driver to disable wb toggling
during clock scaling").
The lock inversion (i.e. 1 above) was reported by lockdep as:
======================================================
WARNING: possible circular locking dependency detected
6.1.0-next-20221216 #211 Not tainted
------------------------------------------------------
kworker/u16:2/71 is trying to acquire lock:
ffff076280ba98a0 (&hba->dev_cmd.lock){+.+.}-{3:3}, at: ufshcd_query_flag+0x50/0x1c0
but task is already holding lock:
ffff076280ba9cf0 (&hba->clk_scaling_lock){++++}-{3:3}, at: ufshcd_devfreq_scale+0x2b8/0x380
which lock already depends on the new lock.
[ +0.011606]
the existing dependency chain (in reverse order) is:
-> #1 (&hba->clk_scaling_lock){++++}-{3:3}:
lock_acquire+0x68/0x90
down_read+0x58/0x80
ufshcd_exec_dev_cmd+0x70/0x2c0
ufshcd_verify_dev_init+0x68/0x170
ufshcd_probe_hba+0x398/0x1180
ufshcd_async_scan+0x30/0x320
async_run_entry_fn+0x34/0x150
process_one_work+0x288/0x6c0
worker_thread+0x74/0x450
kthread+0x118/0x120
ret_from_fork+0x10/0x20
-> #0 (&hba->dev_cmd.lock){+.+.}-{3:3}:
__lock_acquire+0x12a0/0x2240
lock_acquire.part.0+0xcc/0x220
lock_acquire+0x68/0x90
__mutex_lock+0x98/0x430
mutex_lock_nested+0x2c/0x40
ufshcd_query_flag+0x50/0x1c0
ufshcd_query_flag_retry+0x64/0x100
ufshcd_wb_toggle+0x5c/0x120
ufshcd_devfreq_scale+0x2c4/0x380
ufshcd_devfreq_target+0xf4/0x230
devfreq_set_target+0x84/0x2f0
devfreq_update_target+0xc4/0xf0
devfreq_monitor+0x38/0x1f0
process_one_work+0x288/0x6c0
worker_thread+0x74/0x450
kthread+0x118/0x120
ret_from_fork+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&hba->clk_scaling_lock);
                               lock(&hba->dev_cmd.lock);
                               lock(&hba->clk_scaling_lock);
  lock(&hba->dev_cmd.lock);
*** DEADLOCK ***
Fixes: 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts from unexpected clock scaling")
Cc: stable(a)vger.kernel.org # 5.12
Cc: Can Guo <quic_cang(a)quicinc.com>
Tested-by: Andrew Halaney <ahalaney(a)redhat.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lore.kernel.org/r/20230116161201.16923-1-johan+linaro@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index bda61be5f035..3a1c4d31e010 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1234,12 +1234,14 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba)
* clock scaling is in progress
*/
ufshcd_scsi_block_requests(hba);
+ mutex_lock(&hba->wb_mutex);
down_write(&hba->clk_scaling_lock);
if (!hba->clk_scaling.is_allowed ||
ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) {
ret = -EBUSY;
up_write(&hba->clk_scaling_lock);
+ mutex_unlock(&hba->wb_mutex);
ufshcd_scsi_unblock_requests(hba);
goto out;
}
@@ -1251,12 +1253,16 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba)
return ret;
}
-static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, bool writelock)
+static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err, bool scale_up)
{
- if (writelock)
- up_write(&hba->clk_scaling_lock);
- else
- up_read(&hba->clk_scaling_lock);
+ up_write(&hba->clk_scaling_lock);
+
+ /* Enable Write Booster if we have scaled up else disable it */
+ if (ufshcd_enable_wb_if_scaling_up(hba) && !err)
+ ufshcd_wb_toggle(hba, scale_up);
+
+ mutex_unlock(&hba->wb_mutex);
+
ufshcd_scsi_unblock_requests(hba);
ufshcd_release(hba);
}
@@ -1273,7 +1279,6 @@ static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, bool writelock)
static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool scale_up)
{
int ret = 0;
- bool is_writelock = true;
ret = ufshcd_clock_scaling_prepare(hba);
if (ret)
@@ -1302,15 +1307,8 @@ static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool scale_up)
}
}
- /* Enable Write Booster if we have scaled up else disable it */
- if (ufshcd_enable_wb_if_scaling_up(hba)) {
- downgrade_write(&hba->clk_scaling_lock);
- is_writelock = false;
- ufshcd_wb_toggle(hba, scale_up);
- }
-
out_unprepare:
- ufshcd_clock_scaling_unprepare(hba, is_writelock);
+ ufshcd_clock_scaling_unprepare(hba, ret, scale_up);
return ret;
}
@@ -6066,9 +6064,11 @@ static void ufshcd_force_error_recovery(struct ufs_hba *hba)
static void ufshcd_clk_scaling_allow(struct ufs_hba *hba, bool allow)
{
+ mutex_lock(&hba->wb_mutex);
down_write(&hba->clk_scaling_lock);
hba->clk_scaling.is_allowed = allow;
up_write(&hba->clk_scaling_lock);
+ mutex_unlock(&hba->wb_mutex);
}
static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend)
@@ -9793,6 +9793,7 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
/* Initialize mutex for exception event control */
mutex_init(&hba->ee_ctrl_mutex);
+ mutex_init(&hba->wb_mutex);
init_rwsem(&hba->clk_scaling_lock);
ufshcd_init_clk_gating(hba);
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index 5cf81dff60aa..727084cd79be 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -808,6 +808,7 @@ struct ufs_hba_monitor {
* @urgent_bkops_lvl: keeps track of urgent bkops level for device
* @is_urgent_bkops_lvl_checked: keeps track if the urgent bkops level for
* device is known or not.
+ * @wb_mutex: used to serialize devfreq and sysfs write booster toggling
* @clk_scaling_lock: used to serialize device commands and clock scaling
* @desc_size: descriptor sizes reported by device
* @scsi_block_reqs_cnt: reference counting for scsi block requests
@@ -951,6 +952,7 @@ struct ufs_hba {
enum bkops_status urgent_bkops_lvl;
bool is_urgent_bkops_lvl_checked;
+ struct mutex wb_mutex;
struct rw_semaphore clk_scaling_lock;
unsigned char desc_size[QUERY_DESC_IDN_MAX];
atomic_t scsi_block_reqs_cnt;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
b7ab9161cf5d ("cifs: Fix oops due to uncleared server->smbd_conn in reconnect")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b7ab9161cf5ddc42a288edf9d1a61f3bdffe17c7 Mon Sep 17 00:00:00 2001
From: David Howells <dhowells(a)redhat.com>
Date: Wed, 25 Jan 2023 14:02:13 +0000
Subject: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in
reconnect
In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
smbd_connection struct that it points to so that reconnection doesn't get
confused.
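The bug class, as a generic sketch (not the cifs code): freeing an
object while a longer-lived structure keeps pointing at it.

#include <stdlib.h>

struct server { void *conn; };

static void destroy_conn(struct server *s)
{
        free(s->conn);
        s->conn = NULL; /* without this, a later reconnect would
                         * dereference a dangling pointer */
}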
Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
Cc: stable(a)vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc(a)cjr.nz>
Acked-by: Tom Talpey <tom(a)talpey.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Cc: Long Li <longli(a)microsoft.com>
Cc: Pavel Shilovsky <piastryyy(a)gmail.com>
Cc: Ronnie Sahlberg <lsahlber(a)redhat.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index 90789aaa6567..8c816b25ce7c 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
destroy_workqueue(info->workqueue);
log_rdma_event(INFO, "rdma session destroyed\n");
kfree(info);
+ server->smbd_conn = NULL;
}
/*
Hi,
On 1/24/2023 7:33 PM, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> bpf: Always use raw spinlock for hash bucket lock
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> bpf-always-use-raw-spinlock-for-hash-bucket-lock.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Please drop it for v5.15. The fix depends on the bpf memory allocator [0],
which was merged in v6.1.
[0]: 7c8199e24fa0 bpf: Introduce any context BPF specific memory allocator.
>
>
>
> commit 9515b63fddd8c96797b0513c8d6509a9cc767611
> Author: Hou Tao <houtao1(a)huawei.com>
> Date: Wed Sep 21 15:38:26 2022 +0800
>
> bpf: Always use raw spinlock for hash bucket lock
>
> [ Upstream commit 1d8b82c613297f24354b4d750413a7456b5cd92c ]
>
> For a non-preallocated hash map on RT kernel, regular spinlock instead
> of raw spinlock is used for bucket lock. The reason is that on RT kernel
> memory allocation is forbidden under atomic context and regular spinlock
> is sleepable under RT.
>
> Now hash map has been fully converted to use bpf_mem_alloc, and there
> will be no synchronous memory allocation for non-preallocated hash map,
> so it is safe to always use raw spinlock for bucket lock on RT. So
> removing the usage of htab_use_raw_lock() and updating the comments
> accordingly.
>
> Signed-off-by: Hou Tao <houtao1(a)huawei.com>
> Link: https://lore.kernel.org/r/20220921073826.2365800-1-houtao@huaweicloud.com
> Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
> Stable-dep-of: 9f907439dc80 ("bpf: hash map, avoid deadlock with suitable hash mask")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index e7f45a966e6b..ea2051a913fb 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -66,24 +66,16 @@
> * In theory the BPF locks could be converted to regular spinlocks as well,
> * but the bucket locks and percpu_freelist locks can be taken from
> * arbitrary contexts (perf, kprobes, tracepoints) which are required to be
> - * atomic contexts even on RT. These mechanisms require preallocated maps,
> - * so there is no need to invoke memory allocations within the lock held
> - * sections.
> - *
> - * BPF maps which need dynamic allocation are only used from (forced)
> - * thread context on RT and can therefore use regular spinlocks which in
> - * turn allows to invoke memory allocations from the lock held section.
> - *
> - * On a non RT kernel this distinction is neither possible nor required.
> - * spinlock maps to raw_spinlock and the extra code is optimized out by the
> - * compiler.
> + * atomic contexts even on RT. Before the introduction of bpf_mem_alloc,
> + * it is only safe to use raw spinlock for preallocated hash map on a RT kernel,
> + * because there is no memory allocation within the lock held sections. However
> + * after hash map was fully converted to use bpf_mem_alloc, there will be
> + * non-synchronous memory allocation for non-preallocated hash map, so it is
> + * safe to always use raw spinlock for bucket lock.
> */
> struct bucket {
> struct hlist_nulls_head head;
> - union {
> - raw_spinlock_t raw_lock;
> - spinlock_t lock;
> - };
> + raw_spinlock_t raw_lock;
> };
>
> #define HASHTAB_MAP_LOCK_COUNT 8
> @@ -132,26 +124,15 @@ static inline bool htab_is_prealloc(const struct bpf_htab *htab)
> return !(htab->map.map_flags & BPF_F_NO_PREALLOC);
> }
>
> -static inline bool htab_use_raw_lock(const struct bpf_htab *htab)
> -{
> - return (!IS_ENABLED(CONFIG_PREEMPT_RT) || htab_is_prealloc(htab));
> -}
> -
> static void htab_init_buckets(struct bpf_htab *htab)
> {
> unsigned i;
>
> for (i = 0; i < htab->n_buckets; i++) {
> INIT_HLIST_NULLS_HEAD(&htab->buckets[i].head, i);
> - if (htab_use_raw_lock(htab)) {
> - raw_spin_lock_init(&htab->buckets[i].raw_lock);
> - lockdep_set_class(&htab->buckets[i].raw_lock,
> + raw_spin_lock_init(&htab->buckets[i].raw_lock);
> + lockdep_set_class(&htab->buckets[i].raw_lock,
> &htab->lockdep_key);
> - } else {
> - spin_lock_init(&htab->buckets[i].lock);
> - lockdep_set_class(&htab->buckets[i].lock,
> - &htab->lockdep_key);
> - }
> cond_resched();
> }
> }
> @@ -161,28 +142,17 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
> unsigned long *pflags)
> {
> unsigned long flags;
> - bool use_raw_lock;
>
> hash = hash & HASHTAB_MAP_LOCK_MASK;
>
> - use_raw_lock = htab_use_raw_lock(htab);
> - if (use_raw_lock)
> - preempt_disable();
> - else
> - migrate_disable();
> + preempt_disable();
> if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> __this_cpu_dec(*(htab->map_locked[hash]));
> - if (use_raw_lock)
> - preempt_enable();
> - else
> - migrate_enable();
> + preempt_enable();
> return -EBUSY;
> }
>
> - if (use_raw_lock)
> - raw_spin_lock_irqsave(&b->raw_lock, flags);
> - else
> - spin_lock_irqsave(&b->lock, flags);
> + raw_spin_lock_irqsave(&b->raw_lock, flags);
> *pflags = flags;
>
> return 0;
> @@ -192,18 +162,10 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
> struct bucket *b, u32 hash,
> unsigned long flags)
> {
> - bool use_raw_lock = htab_use_raw_lock(htab);
> -
> hash = hash & HASHTAB_MAP_LOCK_MASK;
> - if (use_raw_lock)
> - raw_spin_unlock_irqrestore(&b->raw_lock, flags);
> - else
> - spin_unlock_irqrestore(&b->lock, flags);
> + raw_spin_unlock_irqrestore(&b->raw_lock, flags);
> __this_cpu_dec(*(htab->map_locked[hash]));
> - if (use_raw_lock)
> - preempt_enable();
> - else
> - migrate_enable();
> + preempt_enable();
> }
>
> static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> .
Build ID on arm64 is broken if CONFIG_MODVERSIONS=y
on 5.4, 5.10, and 5.15
I've build tested on {x86_64, arm64, riscv, powerpc, s390, sh}
Without these patches Build ID is missing for arm64 (ld >= 2.36):
$ readelf -n vmlinux | grep "Build ID"
*NOTE* the following is not in mainline yet:
[PATCH 5.15 fix build id for arm64 5/5] sh: define RUNTIME_DISCARD_EXIT
https://lore.kernel.org/all/9166a8abdc0f979e50377e61780a4bba1dfa2f52.167451…
Masahiro Yamada (2):
arch: fix broken BuildID for arm64 and riscv
s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
Michael Ellerman (2):
powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
Tom Saeger (1):
sh: define RUNTIME_DISCARD_EXIT
arch/powerpc/kernel/vmlinux.lds.S | 6 +++++-
arch/s390/kernel/vmlinux.lds.S | 2 ++
arch/sh/kernel/vmlinux.lds.S | 2 ++
include/asm-generic/vmlinux.lds.h | 5 +++++
4 files changed, 14 insertions(+), 1 deletion(-)
base-commit: aabd5ba7e9b03e9a211a4842ab4a93d46f684d2c
--
2.39.1
Build ID on arm64 is broken if CONFIG_MODVERSIONS=y
on 5.4, 5.10, and 5.15
I've build tested on {x86_64, arm64, riscv, powerpc, s390, sh}
Without these patches Build ID is missing for arm64 (ld >= 2.36):
$ readelf -n vmlinux | grep "Build ID"
*NOTE* the following is not in mainline yet:
[PATCH 5.10 fix build id for arm64 5/5] sh: define RUNTIME_DISCARD_EXIT
https://lore.kernel.org/all/9166a8abdc0f979e50377e61780a4bba1dfa2f52.167451…
Masahiro Yamada (2):
arch: fix broken BuildID for arm64 and riscv
s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
Michael Ellerman (2):
powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
Tom Saeger (1):
sh: define RUNTIME_DISCARD_EXIT
arch/powerpc/kernel/vmlinux.lds.S | 6 +++++-
arch/s390/kernel/vmlinux.lds.S | 2 ++
arch/sh/kernel/vmlinux.lds.S | 2 ++
include/asm-generic/vmlinux.lds.h | 5 +++++
4 files changed, 14 insertions(+), 1 deletion(-)
base-commit: 179624a57b78c02de833370b7bdf0b0f4a27ca31
--
2.39.1
The remark about the error code from Simon Horman <simon.horman(a)corigine.com>
was taken into account: the return value -ENOENT was changed to -EINVAL.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
The patch titled
Subject: Squashfs: fix handling and sanity checking of xattr_ids count
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: fix handling and sanity checking of xattr_ids count
Date: Fri, 27 Jan 2023 06:18:42 +0000
A Syzbot [1] corrupted filesystem exposes two flaws in the handling and
sanity checking of the xattr_ids count in the filesystem. Both of these
flaws cause computation overflow due to incorrect typing.
In the corrupted filesystem the xattr_ids value is 4294967071, which,
stored in a signed variable, becomes the negative number -225.
Flaw 1 (64-bit systems only):
The signed integer xattr_ids variable causes sign extension.
This causes variable overflow in the SQUASHFS_XATTR_*(A) macros. The
variable is first multiplied by sizeof(struct squashfs_xattr_id) where the
type of the sizeof operator is "unsigned long".
On a 64-bit system this is 64-bits in size, and causes the negative number
to be sign extended and widened to 64-bits and then become unsigned. This
produces the very large number 18446744073709548016 or 2^64 - 3600. This
number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0
(stored in len).
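Flaw 1 can be demonstrated standalone on a 64-bit system (a sketch
assuming the 16-byte on-disk layout of struct squashfs_xattr_id):

#include <stdio.h>

struct squashfs_xattr_id {
        unsigned long long xattr;
        unsigned int count, size;
};                                      /* 16 bytes */

int main(void)
{
        int xattr_ids = (int)4294967071u;       /* stored signed: -225 */
        unsigned long long bytes =
                xattr_ids * sizeof(struct squashfs_xattr_id);

        /* sign extension, then unsigned wrap: prints 2^64 - 3600 */
        printf("%d * 16 = %llu\n", xattr_ids, bytes);
        return 0;
}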
Flaw 2 (32-bit systems only):
On a 32-bit system the integer variable is not widened by the unsigned
long type of the sizeof operator (32-bits), and the signedness of the
variable has no effect due to it always being treated as unsigned.
The above corrupted xattr_ids value of 4294967071, when multiplied,
overflows and produces the number 4294963696 or 2^32 - 3600. This
number, when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE, overflows again and produces a
length of 0.
The effect of the 0 length computation:
In conjunction with the corrupted xattr_ids field, the filesystem also has
a corrupted xattr_table_start value, where it matches the end of
filesystem value of 850.
This causes the following sanity check code to fail because the
incorrectly computed len of 0 matches the incorrect size of the table
reported by the superblock (0 bytes).
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
/*
* The computed size of the index table (len bytes) should exactly
* match the table start and end points
*/
start = table_start + sizeof(*id_table);
end = msblk->bytes_used;
if (len != (end - start))
return ERR_PTR(-EINVAL);
Changing the xattr_ids variable to be "unsigned int" fixes the flaw on a
64-bit system. This relies on the fact that the computation is widened
by the unsigned long type of the sizeof operator.
Casting the variable to u64 in the above macro fixes this flaw on a 32-bit
system.
It also means 64-bit systems do not implicitly rely on the type of the
sizeof operator to widen the computation.
[1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/
Link: https://lkml.kernel.org/r/20230127061842.10965-1-phillip@squashfs.org.uk
Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: <syzbot+082fa4af80a5bb1a9843(a)syzkaller.appspotmail.com>
Cc: Alexey Khoroshilov <khoroshilov(a)ispras.ru>
Cc: Fedor Pchelkin <pchelkin(a)ispras.ru>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/squashfs/squashfs_fs.h~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/squashfs_fs.h
@@ -183,7 +183,7 @@ static inline int squashfs_block_size(__
#define SQUASHFS_ID_BLOCK_BYTES(A) (SQUASHFS_ID_BLOCKS(A) *\
sizeof(u64))
/* xattr id lookup table defines */
-#define SQUASHFS_XATTR_BYTES(A) ((A) * sizeof(struct squashfs_xattr_id))
+#define SQUASHFS_XATTR_BYTES(A) (((u64) (A)) * sizeof(struct squashfs_xattr_id))
#define SQUASHFS_XATTR_BLOCK(A) (SQUASHFS_XATTR_BYTES(A) / \
SQUASHFS_METADATA_SIZE)
--- a/fs/squashfs/squashfs_fs_sb.h~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/squashfs_fs_sb.h
@@ -63,7 +63,7 @@ struct squashfs_sb_info {
long long bytes_used;
unsigned int inodes;
unsigned int fragments;
- int xattr_ids;
+ unsigned int xattr_ids;
unsigned int ids;
bool panic_on_errors;
const struct squashfs_decompressor_thread_ops *thread_ops;
--- a/fs/squashfs/xattr.h~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/xattr.h
@@ -10,12 +10,12 @@
#ifdef CONFIG_SQUASHFS_XATTR
extern __le64 *squashfs_read_xattr_id_table(struct super_block *, u64,
- u64 *, int *);
+ u64 *, unsigned int *);
extern int squashfs_xattr_lookup(struct super_block *, unsigned int, int *,
unsigned int *, unsigned long long *);
#else
static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb,
- u64 start, u64 *xattr_table_start, int *xattr_ids)
+ u64 start, u64 *xattr_table_start, unsigned int *xattr_ids)
{
struct squashfs_xattr_id_table *id_table;
--- a/fs/squashfs/xattr_id.c~squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count
+++ a/fs/squashfs/xattr_id.c
@@ -56,7 +56,7 @@ int squashfs_xattr_lookup(struct super_b
* Read uncompressed xattr id lookup table indexes from disk into memory
*/
__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
- u64 *xattr_table_start, int *xattr_ids)
+ u64 *xattr_table_start, unsigned int *xattr_ids)
{
struct squashfs_sb_info *msblk = sb->s_fs_info;
unsigned int len, indexes;
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
squashfs-fix-handling-and-sanity-checking-of-xattr_ids-count.patch
We already round down the address in kunmap_local_indexed() which is
the other implementation of __kunmap_local(). The only implementation
of kunmap_flush_on_unmap() is PA-RISC which is expecting a page-aligned
address. This may be causing PA-RISC to be flushing the wrong addresses
currently.
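For reference, the rounding is plain mask arithmetic; below is a minimal
user-space sketch of what the kernel's PTR_ALIGN_DOWN(addr, PAGE_SIZE)
computes (the helper name is reused here purely for illustration, this is
not the kernel definition):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Same arithmetic as the kernel's PTR_ALIGN_DOWN(): clear the low bits */
static void *ptr_align_down(const void *p, unsigned long align)
{
	return (void *)((uintptr_t)p & ~(align - 1));
}

int main(void)
{
	const void *addr = (const void *)0x7f0000001234UL; /* mid-page address */

	/* kunmap_flush_on_unmap() expects the page base: 0x7f0000001000 */
	printf("%p -> %p\n", (void *)addr, ptr_align_down(addr, PAGE_SIZE));
	return 0;
}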
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Fixes: 298fa1ad5571 ("highmem: Provide generic variant of kmap_atomic*")
Cc: stable(a)vger.kernel.org
---
include/linux/highmem-internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h
index 034b1106d022..e098f38422af 100644
--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@ -200,7 +200,7 @@ static inline void *kmap_local_pfn(unsigned long pfn)
static inline void __kunmap_local(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PTR_ALIGN_DOWN(addr, PAGE_SIZE));
#endif
}
@@ -227,7 +227,7 @@ static inline void *kmap_atomic_pfn(unsigned long pfn)
static inline void __kunmap_atomic(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PTR_ALIGN_DOWN(addr, PAGE_SIZE));
#endif
pagefault_enable();
if (IS_ENABLED(CONFIG_PREEMPT_RT))
--
2.35.1
This reverts commit 972fa3a7c17c9d60212e32ecc0205dc585b1e769.
Kmemleak operates by periodically scanning memory regions for pointers
to allocated memory blocks to determine if they are leaked or not.
However, reserved memory regions can be used for DMA transactions
between a device and a CPU, and thus, wouldn't contain pointers to
allocated memory blocks, making them inappropriate for kmemleak to
scan. Thus, revert this commit.
Cc: stable(a)vger.kernel.org # 5.17+
Cc: Calvin Zhang <calvinzhang.cool(a)gmail.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
drivers/of/fdt.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index f08b25195ae7..d1a68b6d03b3 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -26,7 +26,6 @@
#include <linux/serial_core.h>
#include <linux/sysfs.h>
#include <linux/random.h>
-#include <linux/kmemleak.h>
#include <asm/setup.h> /* for COMMAND_LINE_SIZE */
#include <asm/page.h>
@@ -525,12 +524,9 @@ static int __init __reserved_mem_reserve_reg(unsigned long node,
size = dt_mem_next_cell(dt_root_size_cells, &prop);
if (size &&
- early_init_dt_reserve_memory(base, size, nomap) == 0) {
+ early_init_dt_reserve_memory(base, size, nomap) == 0)
pr_debug("Reserved memory: reserved region for node '%s': base %pa, size %lu MiB\n",
uname, &base, (unsigned long)(size / SZ_1M));
- if (!nomap)
- kmemleak_alloc_phys(base, size, 0);
- }
else
pr_err("Reserved memory: failed to reserve memory for node '%s': base %pa, size %lu MiB\n",
uname, &base, (unsigned long)(size / SZ_1M));
--
2.39.1.405.gd4c25cc71f-goog
This is the start of the stable review cycle for the 4.19.271 release.
There are 37 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Tue, 24 Jan 2023 15:02:08 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.271-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.271-rc1
YingChi Long <me(a)inclyc.cn>
x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ext4: generalize extents status tree search functions"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ext4: add new pending reservation mechanism"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ext4: fix reserved cluster accounting at delayed write time"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline"
Khazhismel Kumykov <khazhy(a)chromium.org>
gsmi: fix null-deref in gsmi_get_variable
Tobias Schramm <t.schramm(a)manjaro.org>
serial: atmel: fix incorrect baudrate setup
Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
serial: pch_uart: Pass correct sg to dma_unmap_sg()
Juhyung Park <qkrwngud825(a)gmail.com>
usb-storage: apply IGNORE_UAS only for HIKSEMI MD202 on RTL9210
Maciej Żenczykowski <maze(a)google.com>
usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate()
Daniel Scally <dan.scally(a)ideasonboard.com>
usb: gadget: g_webcam: Send color matching descriptor per frame
Prashant Malani <pmalani(a)chromium.org>
usb: typec: altmodes/displayport: Fix pin assignment calculation
Prashant Malani <pmalani(a)chromium.org>
usb: typec: altmodes/displayport: Add pin assignment helper
Alexander Stein <alexander.stein(a)ew.tq-group.com>
usb: host: ehci-fsl: Fix module alias
Michael Adler <michael.adler(a)siemens.com>
USB: serial: cp210x: add SCALANCE LPE-9000 device id
Enzo Matsumiya <ematsumiya(a)suse.de>
cifs: do not include page data when checking signature
Samuel Holland <samuel(a)sholland.org>
mmc: sunxi-mmc: Fix clock refcount imbalance during unbind
Ian Abbott <abbotti(a)mev.co.uk>
comedi: adv_pci1760: Fix PWM instruction handling
Flavio Suligoi <f.suligoi(a)asem.it>
usb: core: hub: disable autosuspend for TI TUSB8041
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
USB: misc: iowarrior: fix up header size for USB_DEVICE_ID_CODEMERCS_IOW100
Duke Xin(辛安文) <duke_xinanwen(a)163.com>
USB: serial: option: add Quectel EM05CN modem
Duke Xin(辛安文) <duke_xinanwen(a)163.com>
USB: serial: option: add Quectel EM05CN (SG) modem
Ali Mirghasemi <ali.mirghasemi1376(a)gmail.com>
USB: serial: option: add Quectel EC200U modem
Duke Xin(辛安文) <duke_xinanwen(a)163.com>
USB: serial: option: add Quectel EM05-G (RS) modem
Duke Xin(辛安文) <duke_xinanwen(a)163.com>
USB: serial: option: add Quectel EM05-G (CS) modem
Duke Xin(辛安文) <duke_xinanwen(a)163.com>
USB: serial: option: add Quectel EM05-G (GR) modem
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
prlimit: do_prlimit needs to have a speculation check
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Add a flag to disable USB3 lpm on a xhci root port level.
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Fix null pointer dereference when host dies
Jimmy Hu <hhhuuu(a)google.com>
usb: xhci: Check endpoint is valid before dereferencing it
Ricardo Ribalda <ribalda(a)chromium.org>
xhci-pci: set the dma max_seg_size
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: fix general protection fault in nilfs_btree_insert()
Shawn.Shao <shawn.shao(a)jaguarmicro.com>
Add exception protection processing for vd in axi_chan_handle_err function
Jaegeuk Kim <jaegeuk(a)kernel.org>
f2fs: let's avoid panic if extent_tree is not created
Jiri Slaby (SUSE) <jirislaby(a)kernel.org>
RDMA/srp: Move large values to a new enum for gcc13
Daniil Tatianin <d-tatianin(a)yandex-team.ru>
net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats
Olga Kornievskaia <olga.kornievskaia(a)gmail.com>
pNFS/filelayout: Fix coalescing test for single DS
-------------
Diffstat:
Makefile | 4 +-
arch/x86/kernel/fpu/init.c | 7 +-
drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c | 6 +
drivers/firmware/google/gsmi.c | 7 +-
drivers/infiniband/ulp/srp/ib_srp.h | 8 +-
drivers/mmc/host/sunxi-mmc.c | 8 +-
drivers/staging/comedi/drivers/adv_pci1760.c | 2 +-
drivers/tty/serial/atmel_serial.c | 8 +-
drivers/tty/serial/pch_uart.c | 2 +-
drivers/usb/core/hub.c | 13 +
drivers/usb/gadget/function/f_ncm.c | 4 +-
drivers/usb/gadget/legacy/webcam.c | 3 +
drivers/usb/host/ehci-fsl.c | 2 +-
drivers/usb/host/xhci-pci.c | 2 +
drivers/usb/host/xhci-ring.c | 5 +-
drivers/usb/host/xhci.c | 13 +
drivers/usb/host/xhci.h | 1 +
drivers/usb/misc/iowarrior.c | 2 +-
drivers/usb/serial/cp210x.c | 1 +
drivers/usb/serial/option.c | 17 ++
drivers/usb/storage/uas-detect.h | 13 +
drivers/usb/storage/unusual_uas.h | 7 -
drivers/usb/typec/altmodes/displayport.c | 22 +-
fs/cifs/smb2pdu.c | 15 +-
fs/ext4/ext4.h | 8 +-
fs/ext4/extents.c | 139 +++------
fs/ext4/extents_status.c | 389 ++-----------------------
fs/ext4/extents_status.h | 76 +----
fs/ext4/inode.c | 92 ++----
fs/ext4/super.c | 8 -
fs/f2fs/extent_cache.c | 3 +-
fs/nfs/filelayout/filelayout.c | 8 +
fs/nilfs2/btree.c | 15 +-
include/trace/events/ext4.h | 39 +--
kernel/sys.c | 2 +
net/core/ethtool.c | 3 +-
36 files changed, 238 insertions(+), 716 deletions(-)
If CONFIG_PREEMPT_NONE is set and the task_work chains are long, we
could be running into issues blocking others for too long. Add a
reschedule check in handle_tw_list(), and flush the ctx if we need to
reschedule.
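The change follows the usual pattern for bounding long drain loops; here
is a hedged user-space sketch of the shape only, with sched_yield()
standing in for cond_resched() and the list/work types being illustrative
(the real patch additionally flushes and drops the ctx before yielding):

#include <sched.h>
#include <stddef.h>
#include <stdio.h>

struct node { struct node *next; int work; };

/* Drain the list, yielding the CPU periodically so one long chain
 * cannot monopolize a non-preemptible context. */
static unsigned int drain(struct node *n)
{
	unsigned int count = 0;

	while (n) {
		printf("handling item %d\n", n->work);	/* the real work */
		n = n->next;
		if (++count % 32 == 0)
			sched_yield();	/* analog of cond_resched() */
	}
	return count;
}

int main(void)
{
	struct node c = { NULL, 3 }, b = { &c, 2 }, a = { &b, 1 };

	printf("handled %u items\n", drain(&a));
	return 0;
}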
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
io_uring/io_uring.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index bb8a532b051e..bc8845de2829 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1179,10 +1179,16 @@ static unsigned int handle_tw_list(struct llist_node *node,
/* if not contended, grab and improve batching */
*locked = mutex_trylock(&(*ctx)->uring_lock);
percpu_ref_get(&(*ctx)->refs);
- }
+ } else if (!*locked)
+ *locked = mutex_trylock(&(*ctx)->uring_lock);
req->io_task_work.func(req, locked);
node = next;
count++;
+ if (unlikely(need_resched())) {
+ ctx_flush_and_put(*ctx, locked);
+ *ctx = NULL;
+ cond_resched();
+ }
}
return count;
--
2.39.0
migrate_pages/mempolicy semantics state that CAP_SYS_NICE is required
to move pages shared with another process to a different node.
page_mapcount > 1 is being used to determine if a hugetlb page is shared.
However, a hugetlb page will have a mapcount of 1 if mapped by multiple
processes via a shared PMD. As a result, hugetlb pages shared by multiple
processes and mapped with a shared PMD can be moved by a process without
CAP_SYS_NICE.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is
found, consider the page shared.
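Reduced to its decision logic, the check now looks like the sketch below.
hugetlb_pmd_shared() is the real helper the patch uses; the rest of the
scaffolding is illustrative, with the flag values copied from uapi
mempolicy.h:

#include <stdbool.h>
#include <stdio.h>

#define MPOL_MF_MOVE     (1 << 1)
#define MPOL_MF_MOVE_ALL (1 << 2)

/* May this caller migrate the hugetlb page without CAP_SYS_NICE? */
static bool may_move(unsigned int flags, int mapcount, bool pmd_shared)
{
	if (flags & MPOL_MF_MOVE_ALL)
		return true;	/* privileged path, capability checked elsewhere */
	/* mapcount == 1 alone is no longer proof the page is unshared */
	return (flags & MPOL_MF_MOVE) && mapcount == 1 && !pmd_shared;
}

int main(void)
{
	printf("%d\n", may_move(MPOL_MF_MOVE, 1, false));	/* 1: private */
	printf("%d\n", may_move(MPOL_MF_MOVE, 1, true));	/* 0: shared PMD */
	return 0;
}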
Fixes: e2d8cf405525 ("migrate: add hugepage migration code to migrate_pages()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
mm/mempolicy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 85a34f1f3ab8..72142fbe7652 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -600,7 +600,8 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
if (flags & (MPOL_MF_MOVE_ALL) ||
- (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+ (flags & MPOL_MF_MOVE && page_mapcount(page) == 1 &&
+ !hugetlb_pmd_shared(pte))) {
if (isolate_hugetlb(page, qp->pagelist) &&
(flags & MPOL_MF_STRICT))
/*
--
2.39.1
Drain requests all go through io_drain_req(), which has a quick exit in
case there is nothing pending (i.e. the drain is not useful). In that
case it can issue the request immediately.
However for safety it queues it through task work.
The problem is that in this case the request is run asynchronously, but
the async work has not been prepared through io_req_prep_async.
This has not been a problem up to now, as the task work always would run
before returning to userspace, and so the user would not have a chance to
race with it.
However - with IORING_SETUP_DEFER_TASKRUN - this is no longer the case and
the work might be deferred, giving userspace a chance to change data being
referred to in the request.
Instead _always_ prep_async for drain requests, which is simpler anyway
and removes this issue.
Cc: stable(a)vger.kernel.org
Fixes: c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Dylan Yudaken <dylany(a)meta.com>
---
Hi,
Worth pointing out in discussion with Pavel that we considered just
removing the fast path in io_drain_req. That felt like more of an invasive
change as it can add an extra kmalloc + io_prep_async_link(), but perhaps
you feel that is a better approach.
I'll send out a liburing test shortly.
Dylan
io_uring/io_uring.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 0a4efada9b3c..db623b3185c8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1765,17 +1765,12 @@ static __cold void io_drain_req(struct io_kiocb *req)
}
spin_unlock(&ctx->completion_lock);
- ret = io_req_prep_async(req);
- if (ret) {
-fail:
- io_req_defer_failed(req, ret);
- return;
- }
io_prep_async_link(req);
de = kmalloc(sizeof(*de), GFP_KERNEL);
if (!de) {
ret = -ENOMEM;
- goto fail;
+ io_req_defer_failed(req, ret);
+ return;
}
spin_lock(&ctx->completion_lock);
@@ -2048,13 +2043,16 @@ static void io_queue_sqe_fallback(struct io_kiocb *req)
req->flags &= ~REQ_F_HARDLINK;
req->flags |= REQ_F_LINK;
io_req_defer_failed(req, req->cqe.res);
- } else if (unlikely(req->ctx->drain_active)) {
- io_drain_req(req);
} else {
int ret = io_req_prep_async(req);
- if (unlikely(ret))
+ if (unlikely(ret)) {
io_req_defer_failed(req, ret);
+ return;
+ }
+
+ if (unlikely(req->ctx->drain_active))
+ io_drain_req(req);
else
io_queue_iowq(req, NULL);
}
base-commit: b00c51ef8f72ced0965d021a291b98ff822c5337
--
2.30.2
Build ID on arm64 is broken if CONFIG_MODVERSIONS=y
on 5.4, 5.10, and 5.15
Discussed:
https://lore.kernel.org/all/3df32572ec7016e783d37e185f88495831671f5d.167114…
https://lore.kernel.org/all/cover.1670358255.git.tom.saeger@oracle.com/
Which is fixed for arm64 with backporting:
[2/6] ("arch: fix broken BuildID for arm64 and riscv")
Which had fixes:
[3/6] powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
[4/6] powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
[5/6] s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
But broke arch/sh (or so it was thought).
arch/sh is also broken in mainline.
[6/6] sh: define RUNTIME_DISCARD_EXIT
*NOTE* is not in mainline, but was sent here:
https://lore.kernel.org/all/9166a8abdc0f979e50377e61780a4bba1dfa2f52.167451…
There was enough breakage in 5.4.230-rc1 that the previous series was
dropped (which didn't have 1/6 or 6/6).
https://lore.kernel.org/stable/CA+G9fYuYi1Rvv19R_EVdht_7LV9qiR-6KVvZUGjct3k…
[1/6] 84d5f77fc2ee ("x86, vmlinux.lds: Add RUNTIME_DISCARD_EXIT to generic DISCARDS")
First defined RUNTIME_DISCARD_EXIT generically and its use for x86_64
specifically.
$ git describe --contains 84d5f77fc2ee4e0
v5.7-rc1~164^2~1
Which explains why the previous series broke 5.4.
I've build tested on a fairly wide matrix so far, but would appreciate
more testing.
with and without CONFIG_MODVERSIONS=y
# view Build ID
$ readelf -n vmlinux | grep "Build ID"
5.10 and 5.15 will have a similar series [2-6], as both already have [1/6].
If arch/sh is a must have, then [6/6] needs to find its way into mainline.
Regards,
--Tom
H.J. Lu (1):
x86, vmlinux.lds: Add RUNTIME_DISCARD_EXIT to generic DISCARDS
Masahiro Yamada (2):
arch: fix broken BuildID for arm64 and riscv
s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
Michael Ellerman (2):
powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
Tom Saeger (1):
sh: define RUNTIME_DISCARD_EXIT
arch/powerpc/kernel/vmlinux.lds.S | 6 +++++-
arch/s390/kernel/vmlinux.lds.S | 2 ++
arch/sh/kernel/vmlinux.lds.S | 1 +
arch/x86/kernel/vmlinux.lds.S | 1 +
include/asm-generic/vmlinux.lds.h | 16 ++++++++++++++--
5 files changed, 23 insertions(+), 3 deletions(-)
base-commit: 90245959a5b936ee013266e5d1e6a508ed69274e
--
2.39.1
From: "Russell King (Oracle)" <rmk+kernel(a)armlinux.org.uk>
Dan Carpenter points out that the return code was not set in commit
60c8b4aebd8e ("nvmem: core: fix cleanup after dev_set_name()"), but
this is not the only issue - we also need to zero wp_gpio to prevent
gpiod_put() being called on an error value.
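The underlying idiom is the kernel's ERR_PTR()/IS_ERR() convention: a
pointer field must end up holding either a valid object or NULL, never an
encoded errno, or later cleanup will operate on garbage. A reduced
user-space sketch of the pattern follows; the macros mirror the kernel
definitions, while gpio_desc and the release helper are stand-ins:

#include <stdio.h>

#define MAX_ERRNO 4095
#define IS_ERR(p)  ((unsigned long)(p) >= (unsigned long)-MAX_ERRNO)
#define PTR_ERR(p) ((long)(p))

struct gpio_desc;

static void gpiod_put_standin(struct gpio_desc *d)
{
	if (d)	/* NULL means "nothing to release"; an ERR_PTR would be garbage */
		printf("releasing %p\n", (void *)d);
}

int main(void)
{
	/* simulate the GPIO lookup failing with -EINVAL (-22) */
	struct gpio_desc *wp_gpio = (struct gpio_desc *)-22L;

	if (IS_ERR(wp_gpio)) {
		long rval = PTR_ERR(wp_gpio);

		wp_gpio = NULL;	/* the fix: never leave an error value behind */
		printf("registration failed: %ld\n", rval);
	}
	gpiod_put_standin(wp_gpio);	/* safe on both paths now */
	return 0;
}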
Cc: stable(a)vger.kernel.org
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <error27(a)gmail.com>
Fixes: 60c8b4aebd8e ("nvmem: core: fix cleanup after dev_set_name()")
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
drivers/nvmem/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 563116405b3d..34ee9d36ee7b 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -783,6 +783,7 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
GPIOD_OUT_HIGH);
if (IS_ERR(nvmem->wp_gpio)) {
rval = PTR_ERR(nvmem->wp_gpio);
+ nvmem->wp_gpio = NULL;
goto err_put_device;
}
--
2.25.1
From: "Russell King (Oracle)" <rmk+kernel(a)armlinux.org.uk>
The i.MX6 CPU frequency driver sometimes fails to register at boot time
due to nvmem_cell_read_u32() sporadically returning -ENOENT.
This happens because there is a window where __nvmem_device_get() in
of_nvmem_cell_get() is able to return the nvmem device, but, as the
cells have not yet been set up, nvmem_find_cell_entry_by_node() returns
NULL. This occurs because the nvmem core registration code violates one
of the
fundamental principles of kernel programming: do not publish data
structures before their setup is complete.
Fix this by making nvmem core code conform with this principle.
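As a hedged illustration of the principle (all names below are stand-ins,
not the real nvmem functions): initialization must fully populate the
object, and only then publish it where lookups can find it, so there is
no window in which a concurrent reader sees a half-built device:

#include <stdio.h>

struct dev {
	int cells_ready;	/* set up by the cell-adding code in the real driver */
};

static struct dev *registry;	/* what concurrent lookups can see */

static void register_dev(struct dev *d)
{
	/* 1. finish ALL setup first ... */
	d->cells_ready = 1;

	/* 2. ... and publish as the very last step (device_add() in nvmem).
	 * Reversing these two steps opens the race the patch fixes. */
	registry = d;
}

int main(void)
{
	struct dev d = { 0 };

	register_dev(&d);
	printf("lookup sees cells_ready=%d\n", registry->cells_ready);
	return 0;
}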
Cc: stable(a)vger.kernel.org
Fixes: eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers")
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
drivers/nvmem/core.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index ac77a019aed7..e92c6f1aadbb 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -832,22 +832,16 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
nvmem->dev.groups = nvmem_dev_groups;
#endif
- dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
-
- rval = device_add(&nvmem->dev);
- if (rval)
- goto err_put_device;
-
if (nvmem->nkeepout) {
rval = nvmem_validate_keepouts(nvmem);
if (rval)
- goto err_device_del;
+ goto err_put_device;
}
if (config->compat) {
rval = nvmem_sysfs_setup_compat(nvmem, config);
if (rval)
- goto err_device_del;
+ goto err_put_device;
}
if (config->cells) {
@@ -864,6 +858,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
if (rval)
goto err_remove_cells;
+ dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
+
+ rval = device_add(&nvmem->dev);
+ if (rval)
+ goto err_remove_cells;
+
blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
return nvmem;
@@ -873,8 +873,6 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
err_teardown_compat:
if (config->compat)
nvmem_sysfs_remove_compat(nvmem, config);
-err_device_del:
- device_del(&nvmem->dev);
err_put_device:
put_device(&nvmem->dev);
--
2.25.1
From: "Russell King (Oracle)" <rmk+kernel(a)armlinux.org.uk>
The error path for wp_gpio attempts to free the IDA nvmem->id, but
this has yet to be assigned, so will always be zero - leaking the
ID allocated by ida_alloc(). Fix this by moving the initialisation
of nvmem->id earlier.
Cc: stable(a)vger.kernel.org
Fixes: f7d8d7dcd978 ("nvmem: fix memory leak in error path")
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
drivers/nvmem/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 321d7d63e068..7394a7598efa 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -770,6 +770,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
return ERR_PTR(rval);
}
+ nvmem->id = rval;
+
if (config->wp_gpio)
nvmem->wp_gpio = config->wp_gpio;
else if (!config->ignore_wp)
@@ -785,7 +787,6 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
kref_init(&nvmem->refcnt);
INIT_LIST_HEAD(&nvmem->cells);
- nvmem->id = rval;
nvmem->owner = config->owner;
if (!nvmem->owner && config->dev->driver)
nvmem->owner = config->dev->driver->owner;
--
2.25.1
From: Samuel Holland <samuel(a)sholland.org>
The SID SRAM on at least some SoCs (A64 and D1) returns different values
when read with bus cycles narrower than 32 bits. This is not immediately
obvious, because memcpy_fromio() uses word-size accesses as long as
enough data is being copied.
The vendor driver always uses 32-bit MMIO reads, so do the same here.
This is faster than the register-based method, which is currently used
as a workaround on A64. And it fixes the values returned on D1, where
the SRAM method was being used.
The special case for the last word is needed to maintain .word_size == 1
for sysfs ABI compatibility, as noted previously in commit de2a3eaea552
("nvmem: sunxi_sid: Optimize register read-out method").
Cc: stable(a)vger.kernel.org
Fixes: 07ae4fde9efa ("nvmem: sunxi_sid: Add support for D1 variant")
Tested-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Samuel Holland <samuel(a)sholland.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
drivers/nvmem/sunxi_sid.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/nvmem/sunxi_sid.c b/drivers/nvmem/sunxi_sid.c
index 5750e1f4bcdb..92dfe4cb10e3 100644
--- a/drivers/nvmem/sunxi_sid.c
+++ b/drivers/nvmem/sunxi_sid.c
@@ -41,8 +41,21 @@ static int sunxi_sid_read(void *context, unsigned int offset,
void *val, size_t bytes)
{
struct sunxi_sid *sid = context;
+ u32 word;
+
+ /* .stride = 4 so offset is guaranteed to be aligned */
+ __ioread32_copy(val, sid->base + sid->value_offset + offset, bytes / 4);
- memcpy_fromio(val, sid->base + sid->value_offset + offset, bytes);
+ val += round_down(bytes, 4);
+ offset += round_down(bytes, 4);
+ bytes = bytes % 4;
+
+ if (!bytes)
+ return 0;
+
+ /* Handle any trailing bytes */
+ word = readl_relaxed(sid->base + sid->value_offset + offset);
+ memcpy(val, &word, bytes);
return 0;
}
--
2.25.1
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Commit 98de59bfe4b2f ("take calculation of final prot in
security_mmap_file() into a helper") moved the code to update prot, to be
the actual protections applied by the kernel, to a new helper called
mmap_prot().
However, while without the helper ima_file_mmap() was getting the updated
prot, with the helper ima_file_mmap() gets the original prot, which
contains the protections requested by the application.
A possible consequence of this change is that, if an application calls
mmap() with only PROT_READ, and the kernel applies PROT_EXEC in addition,
that application would have access to executable memory without having this
event recorded in the IMA measurement list. This situation would occur for
example if the application, before mmap(), calls the personality() system
call with READ_IMPLIES_EXEC as the first argument.
Align ima_file_mmap() parameters with those of the mmap_file LSM hook, so
that IMA can receive both the requested prot and the final prot. Since the
requested protections are stored in a new variable, and the final
protections are stored in the existing variable, this effectively restores
the original behavior of the MMAP_CHECK hook.
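The scenario described above can be reproduced from user space with a few
lines; a minimal sketch (unprivileged; whether the kernel honors
READ_IMPLIES_EXEC is architecture-dependent, x86 being the common case):

#include <stdio.h>
#include <sys/mman.h>
#include <sys/personality.h>

int main(void)
{
	/* After this, mmap(PROT_READ) is granted PROT_EXEC as well */
	if (personality(READ_IMPLIES_EXEC) == -1) {
		perror("personality");
		return 1;
	}

	void *p = mmap(NULL, 4096, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* /proc/self/maps now shows this region as r-xp even though the
	 * application only asked for PROT_READ; before the fix, IMA only
	 * ever saw the requested PROT_READ. */
	printf("mapped at %p; check /proc/self/maps\n", p);
	return 0;
}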
Cc: stable(a)vger.kernel.org
Fixes: 98de59bfe4b2 ("take calculation of final prot in security_mmap_file() into a helper")
Signed-off-by: Roberto Sassu <roberto.sassu(a)huawei.com>
---
include/linux/ima.h | 6 ++++--
security/integrity/ima/ima_main.c | 7 +++++--
security/security.c | 7 ++++---
3 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/include/linux/ima.h b/include/linux/ima.h
index 5a0b2a285a18..d79fee67235e 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -21,7 +21,8 @@ extern int ima_file_check(struct file *file, int mask);
extern void ima_post_create_tmpfile(struct user_namespace *mnt_userns,
struct inode *inode);
extern void ima_file_free(struct file *file);
-extern int ima_file_mmap(struct file *file, unsigned long prot);
+extern int ima_file_mmap(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags);
extern int ima_file_mprotect(struct vm_area_struct *vma, unsigned long prot);
extern int ima_load_data(enum kernel_load_data_id id, bool contents);
extern int ima_post_load_data(char *buf, loff_t size,
@@ -76,7 +77,8 @@ static inline void ima_file_free(struct file *file)
return;
}
-static inline int ima_file_mmap(struct file *file, unsigned long prot)
+static inline int ima_file_mmap(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags)
{
return 0;
}
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 377300973e6c..f48f4e694921 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -397,7 +397,9 @@ static int process_measurement(struct file *file, const struct cred *cred,
/**
* ima_file_mmap - based on policy, collect/store measurement.
* @file: pointer to the file to be measured (May be NULL)
- * @prot: contains the protection that will be applied by the kernel.
+ * @reqprot: protection requested by the application
+ * @prot: protection that will be applied by the kernel
+ * @flags: operational flags
*
* Measure files being mmapped executable based on the ima_must_measure()
* policy decision.
@@ -405,7 +407,8 @@ static int process_measurement(struct file *file, const struct cred *cred,
* On success return 0. On integrity appraisal error, assuming the file
* is in policy and IMA-appraisal is in enforcing mode, return -EACCES.
*/
-int ima_file_mmap(struct file *file, unsigned long prot)
+int ima_file_mmap(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags)
{
u32 secid;
diff --git a/security/security.c b/security/security.c
index d1571900a8c7..174afa4fad81 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1661,12 +1661,13 @@ static inline unsigned long mmap_prot(struct file *file, unsigned long prot)
int security_mmap_file(struct file *file, unsigned long prot,
unsigned long flags)
{
+ unsigned long prot_adj = mmap_prot(file, prot);
int ret;
- ret = call_int_hook(mmap_file, 0, file, prot,
- mmap_prot(file, prot), flags);
+
+ ret = call_int_hook(mmap_file, 0, file, prot, prot_adj, flags);
if (ret)
return ret;
- return ima_file_mmap(file, prot);
+ return ima_file_mmap(file, prot, prot_adj, flags);
}
int security_mmap_addr(unsigned long addr)
--
2.25.1
This is a backport of 'commit 4444bc2116ae ("wifi: mac80211: Proper mark
iTXQs for resumption")' from linux 6.2.
If a hw queue is stopped ieee80211_tx_dequeue() should abort any
potential running iTXQ run and mark the queue for resumption later.
This also drops the redundant @txqs_stopped and
@IEEE80211_TXQ_STOP_NETIF_TX is renamed to @IEEE80211_TXQ_DIRTY to
better describe the flag.
Additionally this fixes a use-after-free caused by
ieee80211_tx_dequeue() potentially returning a pointer to a deleted skb.
The original 'commit 4444bc2116ae ("wifi: mac80211: Proper mark
iTXQs for resumption")' in 6.2 only fixed the issue only in combination
with 'commit 592234e941f1 ("wifi: mac80211: Fix iTXQ AMPDU fragmentation
handling")'
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/065cf0e5-2c64-56c6-ee66-a6b61be2dddf@roeck-us.net
Link: https://lore.kernel.org/r/20221230121850.218810-1-alexander@wetzel-home.de
Signed-off-by: Alexander Wetzel <alexander(a)wetzel-home.de>
---
The automatic backport for this and the next patch failed as expected:
https://lore.kernel.org/r/16742967949726@kroah.com
https://lore.kernel.org/r/167429677624186@kroah.com
Since these patches stack only I've put them into a mini series.
They fix different things but the logic overlaps.
In kernels < 6.2 we still support the old push path and since backporting
'commit 107395f9cf44 ("wifi: mac80211: Drop support for TX push path")'
to stable kernels is a clear no go some changes had to be done to these
patches.
Therefore here are quick manual ports, taking the old push path into
account.
I developed and verified basic functionality with both patches applied
to the v6.1 tree from
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
These versions should also work for kernels < 6.1 with no or minimal
changes.
A quick hostap hwsim test shows no regressions. (Single run, compared to
reference runs I use with wireless-testing kernel)
But it also happened to trigger the KASAN I reported here again:
https://lore.kernel.org/r/20230112173808.6205-1-alexander@wetzel-home.de
So that's indeed an issue in stable...
I'll try to give that another shot with your feedback, soon.
Alexander
---
include/net/mac80211.h | 4 ----
net/mac80211/debugfs_sta.c | 5 +++--
net/mac80211/driver-ops.h | 2 +-
net/mac80211/ieee80211_i.h | 2 +-
net/mac80211/tx.c | 23 +++++++++++++++--------
net/mac80211/util.c | 20 ++++++--------------
6 files changed, 26 insertions(+), 30 deletions(-)
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index ac2bad57933f..72b739dc6d53 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -1827,8 +1827,6 @@ struct ieee80211_vif_cfg {
* @drv_priv: data area for driver use, will always be aligned to
* sizeof(void \*).
* @txq: the multicast data TX queue (if driver uses the TXQ abstraction)
- * @txqs_stopped: per AC flag to indicate that intermediate TXQs are stopped,
- * protected by fq->lock.
* @offload_flags: 802.3 -> 802.11 enapsulation offload flags, see
* &enum ieee80211_offload_flags.
* @mbssid_tx_vif: Pointer to the transmitting interface if MBSSID is enabled.
@@ -1857,8 +1855,6 @@ struct ieee80211_vif {
bool probe_req_reg;
bool rx_mcast_action_reg;
- bool txqs_stopped[IEEE80211_NUM_ACS];
-
struct ieee80211_vif *mbssid_tx_vif;
/* must be last */
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index d3397c1248d3..b057253db28d 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -167,7 +167,7 @@ static ssize_t sta_aqm_read(struct file *file, char __user *userbuf,
continue;
txqi = to_txq_info(sta->sta.txq[i]);
p += scnprintf(p, bufsz + buf - p,
- "%d %d %u %u %u %u %u %u %u %u %u 0x%lx(%s%s%s)\n",
+ "%d %d %u %u %u %u %u %u %u %u %u 0x%lx(%s%s%s%s)\n",
txqi->txq.tid,
txqi->txq.ac,
txqi->tin.backlog_bytes,
@@ -182,7 +182,8 @@ static ssize_t sta_aqm_read(struct file *file, char __user *userbuf,
txqi->flags,
test_bit(IEEE80211_TXQ_STOP, &txqi->flags) ? "STOP" : "RUN",
test_bit(IEEE80211_TXQ_AMPDU, &txqi->flags) ? " AMPDU" : "",
- test_bit(IEEE80211_TXQ_NO_AMSDU, &txqi->flags) ? " NO-AMSDU" : "");
+ test_bit(IEEE80211_TXQ_NO_AMSDU, &txqi->flags) ? " NO-AMSDU" : "",
+ test_bit(IEEE80211_TXQ_DIRTY, &txqi->flags) ? " DIRTY" : "");
}
rcu_read_unlock();
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 81e40b0a3b16..e685c12757f4 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -1183,7 +1183,7 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local,
/* In reconfig don't transmit now, but mark for waking later */
if (local->in_reconfig) {
- set_bit(IEEE80211_TXQ_STOP_NETIF_TX, &txq->flags);
+ set_bit(IEEE80211_TXQ_DIRTY, &txq->flags);
return;
}
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index a842f2e1c230..9027c6354251 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -835,7 +835,7 @@ enum txq_info_flags {
IEEE80211_TXQ_STOP,
IEEE80211_TXQ_AMPDU,
IEEE80211_TXQ_NO_AMSDU,
- IEEE80211_TXQ_STOP_NETIF_TX,
+ IEEE80211_TXQ_DIRTY,
};
/**
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 874f2a4d831d..3363e322cfd9 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3709,13 +3709,15 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
struct ieee80211_local *local = hw_to_local(hw);
struct txq_info *txqi = container_of(txq, struct txq_info, txq);
struct ieee80211_hdr *hdr;
- struct sk_buff *skb = NULL;
struct fq *fq = &local->fq;
struct fq_tin *tin = &txqi->tin;
struct ieee80211_tx_info *info;
struct ieee80211_tx_data tx;
+ struct sk_buff *skb;
ieee80211_tx_result r;
struct ieee80211_vif *vif = txq->vif;
+ int q = vif->hw_queue[txq->ac];
+ bool q_stopped;
WARN_ON_ONCE(softirq_count() == 0);
@@ -3723,16 +3725,21 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
return NULL;
begin:
- spin_lock_bh(&fq->lock);
+ skb = NULL;
+ spin_lock(&local->queue_stop_reason_lock);
+ q_stopped = local->queue_stop_reasons[q];
+ spin_unlock(&local->queue_stop_reason_lock);
+
+ if (unlikely(q_stopped)) {
+ /* mark for waking later */
+ set_bit(IEEE80211_TXQ_DIRTY, &txqi->flags);
+ return NULL;
+ }
- if (test_bit(IEEE80211_TXQ_STOP, &txqi->flags) ||
- test_bit(IEEE80211_TXQ_STOP_NETIF_TX, &txqi->flags))
- goto out;
+ spin_lock_bh(&fq->lock);
- if (vif->txqs_stopped[txq->ac]) {
- set_bit(IEEE80211_TXQ_STOP_NETIF_TX, &txqi->flags);
+ if (unlikely(test_bit(IEEE80211_TXQ_STOP, &txqi->flags)))
goto out;
- }
/* Make sure fragments stay together. */
skb = __skb_dequeue(&txqi->frags);
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index b512cb37aafb..ed53c51bbc32 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -301,8 +301,6 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
local_bh_disable();
spin_lock(&fq->lock);
- sdata->vif.txqs_stopped[ac] = false;
-
if (!test_bit(SDATA_STATE_RUNNING, &sdata->state))
goto out;
@@ -324,7 +322,7 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
if (ac != txq->ac)
continue;
- if (!test_and_clear_bit(IEEE80211_TXQ_STOP_NETIF_TX,
+ if (!test_and_clear_bit(IEEE80211_TXQ_DIRTY,
&txqi->flags))
continue;
@@ -339,7 +337,7 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
txqi = to_txq_info(vif->txq);
- if (!test_and_clear_bit(IEEE80211_TXQ_STOP_NETIF_TX, &txqi->flags) ||
+ if (!test_and_clear_bit(IEEE80211_TXQ_DIRTY, &txqi->flags) ||
(ps && atomic_read(&ps->num_sta_ps)) || ac != vif->txq->ac)
goto out;
@@ -537,16 +535,10 @@ static void __ieee80211_stop_queue(struct ieee80211_hw *hw, int queue,
continue;
for (ac = 0; ac < n_acs; ac++) {
- if (sdata->vif.hw_queue[ac] == queue ||
- sdata->vif.cab_queue == queue) {
- if (!local->ops->wake_tx_queue) {
- netif_stop_subqueue(sdata->dev, ac);
- continue;
- }
- spin_lock(&local->fq.lock);
- sdata->vif.txqs_stopped[ac] = true;
- spin_unlock(&local->fq.lock);
- }
+ if (!local->ops->wake_tx_queue &&
+ (sdata->vif.hw_queue[ac] == queue ||
+ sdata->vif.cab_queue == queue))
+ netif_stop_subqueue(sdata->dev, ac);
}
}
rcu_read_unlock();
--
2.39.0
Svace has identified an unchecked return value of the
mt7615_init_tx_queue() function in the 5.10 branch, even though it makes
sense to track this value rather than ignore it. This issue is fixed in
the upstream version by Lorenzo's patch.
The same patch can be cleanly applied to the 5.10 branch.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Hi,
This commit, which went into 6.2, has been shown to help with suspend problems in timing corner cases.
4b31b92b143f ("drm/amdgpu: complete gfxoff allow signal during suspend without delay")
Can you please take it to 6.1.y and 5.15.y?
Thanks,
From: Adrian Zaharia <Adrian.Zaharia(a)windriver.com>
This patch enables the flag to defer registering the primary roothub if the xHCI controller has two roothubs.
Link to upstream patch:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?id=b…
The support for deferring roothub registration was added in 5.10.121.
(See: a2532c441705 "usb: core: hcd: Add support for deferring roothub registration")
Kishon Vijay Abraham I (1):
xhci: Set HCD flag to defer primary roothub registration
drivers/usb/host/xhci.c | 2 ++
1 file changed, 2 insertions(+)
base-commit: 179624a57b78c02de833370b7bdf0b0f4a27ca31
--
2.39.1
Passing the host topology to the guest is almost certainly wrong
and will confuse the scheduler. In addition, several fields of
these CPUID leaves vary on each processor; it is simply impossible to
return the right values from KVM_GET_SUPPORTED_CPUID in such a way that
they can be passed to KVM_SET_CPUID2.
The values that will most likely prevent confusion are all zeroes.
Userspace will have to override it anyway if it wishes to present a
specific topology to the guest.
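For reference, the leaves in question are easy to inspect from user space
on x86; a small sketch using GCC's <cpuid.h> that walks CPUID 0xb
subleaves until the level type in ECX[15:8] is zero, the same termination
rule the removed KVM loop relied on (on a guest with these leaves zeroed,
the loop exits immediately):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	for (unsigned int sub = 0; ; sub++) {
		if (!__get_cpuid_count(0x0b, sub, &eax, &ebx, &ecx, &edx))
			break;		/* leaf not supported at all */
		if (!(ecx & 0xff00))	/* level type 0: end of list */
			break;
		printf("subleaf %u: level type %u, x2APIC id %u\n",
		       sub, (ecx >> 8) & 0xff, edx);
	}
	return 0;
}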
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
Documentation/virt/kvm/api.rst | 14 ++++++++++++++
arch/x86/kvm/cpuid.c | 32 ++++++++++++++++----------------
2 files changed, 30 insertions(+), 16 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index eee9f857a986..20f4f6b302ff 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8249,6 +8249,20 @@ CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``
It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel
has enabled in-kernel emulation of the local APIC.
+CPU topology
+~~~~~~~~~~~~
+
+Several CPUID values include topology information for the host CPU:
+0x0b and 0x1f for Intel systems, 0x8000001e for AMD systems. Different
+versions of KVM return different values for this information and userspace
+should not rely on it. Currently they return all zeroes.
+
+If userspace wishes to set up a guest topology, it should be careful that
+the values of these three leaves differ for each CPU. In particular,
+the APIC ID is found in EDX for all subleaves of 0x0b and 0x1f, and in EAX
+for 0x8000001e; the latter also encodes the core id and node id in bits
+7:0 of EBX and ECX respectively.
+
Obsolete ioctls and capabilities
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0810e93cbedc..164bfb7e7a16 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -759,16 +759,22 @@ struct kvm_cpuid_array {
int nent;
};
+static struct kvm_cpuid_entry2 *get_next_cpuid(struct kvm_cpuid_array *array)
+{
+ if (array->nent >= array->maxnent)
+ return NULL;
+
+ return &array->entries[array->nent++];
+}
+
static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
u32 function, u32 index)
{
- struct kvm_cpuid_entry2 *entry;
+ struct kvm_cpuid_entry2 *entry = get_next_cpuid(array);
- if (array->nent >= array->maxnent)
+ if (!entry)
return NULL;
- entry = &array->entries[array->nent++];
-
memset(entry, 0, sizeof(*entry));
entry->function = function;
entry->index = index;
@@ -945,22 +951,13 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
entry->edx = edx.full;
break;
}
- /*
- * Per Intel's SDM, the 0x1f is a superset of 0xb,
- * thus they can be handled by common code.
- */
case 0x1f:
case 0xb:
/*
- * Populate entries until the level type (ECX[15:8]) of the
- * previous entry is zero. Note, CPUID EAX.{0x1f,0xb}.0 is
- * the starting entry, filled by the primary do_host_cpuid().
+ * No topology; a valid topology is indicated by the presence
+ * of subleaf 1.
*/
- for (i = 1; entry->ecx & 0xff00; ++i) {
- entry = do_host_cpuid(array, function, i);
- if (!entry)
- goto out;
- }
+ entry->eax = entry->ebx = entry->ecx = 0;
break;
case 0xd: {
u64 permitted_xcr0 = kvm_caps.supported_xcr0 & xstate_get_guest_group_perm();
@@ -1193,6 +1190,9 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
entry->ebx = entry->ecx = entry->edx = 0;
break;
case 0x8000001e:
+ /* Do not return host topology information. */
+ entry->eax = entry->ebx = entry->ecx = 0;
+ entry->edx = 0; /* reserved */
break;
case 0x8000001F:
if (!kvm_cpu_cap_has(X86_FEATURE_SEV)) {
--
2.31.1
The state_lock spinlock is meant to protect against races between
concurrent MHI state transitions. In mhi_ep_set_m0_state(), while the
state_lock is being held, the channels are resumed in
mhi_ep_resume_channels() if the previous state was M3. This causes a
sleeping-in-atomic bug, since mhi_ep_resume_channels() uses a mutex
internally.
Since the state_lock is supposed to be held throughout the state change,
it is not ideal to drop the lock before calling mhi_ep_resume_channels().
So to fix this issue, let's change the type of state_lock to mutex. This
would also allow holding the lock throughout all state transitions thereby
avoiding any potential race.
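The same trade-off can be sketched with pthreads; an illustrative analogue
of the fixed shape, where the lock held across a sleeping operation is a
mutex rather than a spinlock (in the kernel, sleeping under a spinlock is
an outright bug, not just a latency problem; all names here are stand-ins):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int mhi_state;

/* Stand-in for mhi_ep_resume_channels(): takes a mutex internally,
 * i.e. it can sleep, so it must never run under a spinlock. */
static void resume_channels(void)
{
	usleep(1000);	/* blocking work */
}

static void set_m0_state(void)
{
	pthread_mutex_lock(&state_lock);	/* was a spinlock */
	if (mhi_state == 3)			/* M3: resume suspended channels */
		resume_channels();		/* sleeping is fine under a mutex */
	mhi_state = 0;
	pthread_mutex_unlock(&state_lock);
}

int main(void)
{
	mhi_state = 3;
	set_m0_state();
	printf("state=%d\n", mhi_state);
	return 0;
}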
Cc: <stable(a)vger.kernel.org> # 5.19
Fixes: e4b7b5f0f30a ("bus: mhi: ep: Add support for suspending and resuming channels")
Reported-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/bus/mhi/ep/main.c | 8 +++++---
drivers/bus/mhi/ep/sm.c | 42 ++++++++++++++++++++++-----------------
include/linux/mhi_ep.h | 4 ++--
3 files changed, 31 insertions(+), 23 deletions(-)
diff --git a/drivers/bus/mhi/ep/main.c b/drivers/bus/mhi/ep/main.c
index bcaaba97ef63..528c00b232bf 100644
--- a/drivers/bus/mhi/ep/main.c
+++ b/drivers/bus/mhi/ep/main.c
@@ -1001,11 +1001,11 @@ static void mhi_ep_reset_worker(struct work_struct *work)
mhi_ep_power_down(mhi_cntrl);
- spin_lock_bh(&mhi_cntrl->state_lock);
+ mutex_lock(&mhi_cntrl->state_lock);
+
/* Reset MMIO to signal host that the MHI_RESET is completed in endpoint */
mhi_ep_mmio_reset(mhi_cntrl);
cur_state = mhi_cntrl->mhi_state;
- spin_unlock_bh(&mhi_cntrl->state_lock);
/*
* Only proceed further if the reset is due to SYS_ERR. The host will
@@ -1014,6 +1014,8 @@ static void mhi_ep_reset_worker(struct work_struct *work)
*/
if (cur_state == MHI_STATE_SYS_ERR)
mhi_ep_power_up(mhi_cntrl);
+
+ mutex_unlock(&mhi_cntrl->state_lock);
}
/*
@@ -1386,8 +1388,8 @@ int mhi_ep_register_controller(struct mhi_ep_cntrl *mhi_cntrl,
INIT_LIST_HEAD(&mhi_cntrl->st_transition_list);
INIT_LIST_HEAD(&mhi_cntrl->ch_db_list);
- spin_lock_init(&mhi_cntrl->state_lock);
spin_lock_init(&mhi_cntrl->list_lock);
+ mutex_init(&mhi_cntrl->state_lock);
mutex_init(&mhi_cntrl->event_lock);
/* Set MHI version and AMSS EE before enumeration */
diff --git a/drivers/bus/mhi/ep/sm.c b/drivers/bus/mhi/ep/sm.c
index 3655c19e23c7..fd200b2ac0bb 100644
--- a/drivers/bus/mhi/ep/sm.c
+++ b/drivers/bus/mhi/ep/sm.c
@@ -63,24 +63,23 @@ int mhi_ep_set_m0_state(struct mhi_ep_cntrl *mhi_cntrl)
int ret;
/* If MHI is in M3, resume suspended channels */
- spin_lock_bh(&mhi_cntrl->state_lock);
+ mutex_lock(&mhi_cntrl->state_lock);
+
old_state = mhi_cntrl->mhi_state;
if (old_state == MHI_STATE_M3)
mhi_ep_resume_channels(mhi_cntrl);
ret = mhi_ep_set_mhi_state(mhi_cntrl, MHI_STATE_M0);
- spin_unlock_bh(&mhi_cntrl->state_lock);
-
if (ret) {
mhi_ep_handle_syserr(mhi_cntrl);
- return ret;
+ goto err_unlock;
}
/* Signal host that the device moved to M0 */
ret = mhi_ep_send_state_change_event(mhi_cntrl, MHI_STATE_M0);
if (ret) {
dev_err(dev, "Failed sending M0 state change event\n");
- return ret;
+ goto err_unlock;
}
if (old_state == MHI_STATE_READY) {
@@ -88,11 +87,14 @@ int mhi_ep_set_m0_state(struct mhi_ep_cntrl *mhi_cntrl)
ret = mhi_ep_send_ee_event(mhi_cntrl, MHI_EE_AMSS);
if (ret) {
dev_err(dev, "Failed sending AMSS EE event\n");
- return ret;
+ goto err_unlock;
}
}
- return 0;
+err_unlock:
+ mutex_unlock(&mhi_cntrl->state_lock);
+
+ return ret;
}
int mhi_ep_set_m3_state(struct mhi_ep_cntrl *mhi_cntrl)
@@ -100,13 +102,12 @@ int mhi_ep_set_m3_state(struct mhi_ep_cntrl *mhi_cntrl)
struct device *dev = &mhi_cntrl->mhi_dev->dev;
int ret;
- spin_lock_bh(&mhi_cntrl->state_lock);
- ret = mhi_ep_set_mhi_state(mhi_cntrl, MHI_STATE_M3);
- spin_unlock_bh(&mhi_cntrl->state_lock);
+ mutex_lock(&mhi_cntrl->state_lock);
+ ret = mhi_ep_set_mhi_state(mhi_cntrl, MHI_STATE_M3);
if (ret) {
mhi_ep_handle_syserr(mhi_cntrl);
- return ret;
+ goto err_unlock;
}
mhi_ep_suspend_channels(mhi_cntrl);
@@ -115,10 +116,13 @@ int mhi_ep_set_m3_state(struct mhi_ep_cntrl *mhi_cntrl)
ret = mhi_ep_send_state_change_event(mhi_cntrl, MHI_STATE_M3);
if (ret) {
dev_err(dev, "Failed sending M3 state change event\n");
- return ret;
+ goto err_unlock;
}
- return 0;
+err_unlock:
+ mutex_unlock(&mhi_cntrl->state_lock);
+
+ return ret;
}
int mhi_ep_set_ready_state(struct mhi_ep_cntrl *mhi_cntrl)
@@ -127,22 +131,24 @@ int mhi_ep_set_ready_state(struct mhi_ep_cntrl *mhi_cntrl)
enum mhi_state mhi_state;
int ret, is_ready;
- spin_lock_bh(&mhi_cntrl->state_lock);
+ mutex_lock(&mhi_cntrl->state_lock);
+
/* Ensure that the MHISTATUS is set to RESET by host */
mhi_state = mhi_ep_mmio_masked_read(mhi_cntrl, EP_MHISTATUS, MHISTATUS_MHISTATE_MASK);
is_ready = mhi_ep_mmio_masked_read(mhi_cntrl, EP_MHISTATUS, MHISTATUS_READY_MASK);
if (mhi_state != MHI_STATE_RESET || is_ready) {
dev_err(dev, "READY state transition failed. MHI host not in RESET state\n");
- spin_unlock_bh(&mhi_cntrl->state_lock);
- return -EIO;
+ ret = -EIO;
+ goto err_unlock;
}
ret = mhi_ep_set_mhi_state(mhi_cntrl, MHI_STATE_READY);
- spin_unlock_bh(&mhi_cntrl->state_lock);
-
if (ret)
mhi_ep_handle_syserr(mhi_cntrl);
+err_unlock:
+ mutex_unlock(&mhi_cntrl->state_lock);
+
return ret;
}
diff --git a/include/linux/mhi_ep.h b/include/linux/mhi_ep.h
index 478aece17046..f198a8ac7ee7 100644
--- a/include/linux/mhi_ep.h
+++ b/include/linux/mhi_ep.h
@@ -70,8 +70,8 @@ struct mhi_ep_db_info {
* @cmd_ctx_cache_phys: Physical address of the host command context cache
* @chdb: Array of channel doorbell interrupt info
* @event_lock: Lock for protecting event rings
- * @list_lock: Lock for protecting state transition and channel doorbell lists
* @state_lock: Lock for protecting state transitions
+ * @list_lock: Lock for protecting state transition and channel doorbell lists
* @st_transition_list: List of state transitions
* @ch_db_list: List of queued channel doorbells
* @wq: Dedicated workqueue for handling rings and state changes
@@ -117,8 +117,8 @@ struct mhi_ep_cntrl {
struct mhi_ep_db_info chdb[4];
struct mutex event_lock;
+ struct mutex state_lock;
spinlock_t list_lock;
- spinlock_t state_lock;
struct list_head st_transition_list;
struct list_head ch_db_list;
--
2.25.1
A syzbot [1] corrupted filesystem exposes two flaws in the
handling and sanity checking of the xattr_ids count in the
filesystem. Both of these flaws cause computation overflow
due to incorrect typing.
In the corrupted filesystem the xattr_ids value is 4294967071 which,
stored in a signed variable, becomes the negative number -225.
Flaw 1 (64-bit systems only):
The signed integer xattr_ids variable causes sign extension.
This causes variable overflow in the SQUASHFS_XATTR_*(A) macros.
The variable is first multiplied by sizeof(struct squashfs_xattr_id)
where the type of the sizeof operator is "unsigned long".
On a 64-bit system this is 64-bits in size, and causes the negative
number to be sign extended and widened to 64-bits and then become unsigned.
This produces the very large number 18446744073709548016 or 2^64 - 3600.
This number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0
(stored in len).
Flaw 2 (32-bit systems only):
On a 32-bit system the integer variable is not widened by the unsigned
long type of the sizeof operator (32-bits), and the signedness of the
variable has no effect due to it always being treated as unsigned.
The above corrupted xattr_ids value of 4294967071, when multiplied
overflows and produces the number 4294963696 or 2^32 - 3400. This
number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and
divided by SQUASHFS_METADATA_SIZE overflows again and produces a length
of 0.
The effect of the 0 length computation:
In conjunction with the corrupted xattr_ids field, the filesystem
also has a corrupted xattr_table_start value, where it matches
the end of filesystem value of 850.
This causes the following sanity check code to fail because the incorrectly
computed len of 0 matches the incorrect size of the table reported by the
superblock (0 bytes).
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
/*
* The computed size of the index table (len bytes) should exactly
* match the table start and end points
*/
start = table_start + sizeof(*id_table);
end = msblk->bytes_used;
if (len != (end - start))
return ERR_PTR(-EINVAL);
Changing the xattr_ids variable to be "unsigned int" fixes the flaw
on a 64-bit system. This relies on the fact the computation is widened
by the unsigned long type of the sizeof operator.
Casting the variable to u64 in the above macro fixes this flaw on a 32-bit
system.
It also means 64-bit systems do not implicitly rely on the type of the
sizeof operator to widen the computation.
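Both flaws can be demonstrated with ordinary C arithmetic; a
self-contained sketch, where 16 is assumed for
sizeof(struct squashfs_xattr_id) and the 32-bit case is emulated with
uint32_t so the demo runs anywhere:

#include <stdint.h>
#include <stdio.h>

#define METADATA_SIZE 8192u	/* SQUASHFS_METADATA_SIZE */

int main(void)
{
	int xattr_ids = (int)4294967071u;	/* stored in a signed variable: -225 */

	/* 64-bit flaw: the int is sign-extended to 2^64 - 225 before the
	 * multiply, giving 2^64 - 3600; the round-up then wraps past zero
	 * and the division produces 0 blocks. */
	uint64_t bytes64 = xattr_ids * (uint64_t)16;
	printf("64-bit: bytes=%llu blocks=%llu\n",	/* 18446744073709548016, 0 */
	       (unsigned long long)bytes64,
	       (unsigned long long)((bytes64 + METADATA_SIZE - 1) / METADATA_SIZE));

	/* 32-bit flaw: no widening happens, the multiply wraps to
	 * 2^32 - 3600 and the round-up wraps again, also yielding 0. */
	uint32_t bytes32 = (uint32_t)xattr_ids * 16u;
	printf("32-bit: bytes=%u blocks=%u\n",		/* 4294963696, 0 */
	       bytes32, (bytes32 + METADATA_SIZE - 1) / METADATA_SIZE);
	return 0;
}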
[1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/
Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
Reported-by: syzbot+082fa4af80a5bb1a9843(a)syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Cc: <stable(a)vger.kernel.org>
---
fs/squashfs/squashfs_fs.h | 2 +-
fs/squashfs/squashfs_fs_sb.h | 2 +-
fs/squashfs/xattr.h | 4 ++--
fs/squashfs/xattr_id.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
index b3fdc8212c5f..95f8e8901768 100644
--- a/fs/squashfs/squashfs_fs.h
+++ b/fs/squashfs/squashfs_fs.h
@@ -183,7 +183,7 @@ static inline int squashfs_block_size(__le32 raw)
#define SQUASHFS_ID_BLOCK_BYTES(A) (SQUASHFS_ID_BLOCKS(A) *\
sizeof(u64))
/* xattr id lookup table defines */
-#define SQUASHFS_XATTR_BYTES(A) ((A) * sizeof(struct squashfs_xattr_id))
+#define SQUASHFS_XATTR_BYTES(A) (((u64) (A)) * sizeof(struct squashfs_xattr_id))
#define SQUASHFS_XATTR_BLOCK(A) (SQUASHFS_XATTR_BYTES(A) / \
SQUASHFS_METADATA_SIZE)
diff --git a/fs/squashfs/squashfs_fs_sb.h b/fs/squashfs/squashfs_fs_sb.h
index 659082e9e51d..72f6f4b37863 100644
--- a/fs/squashfs/squashfs_fs_sb.h
+++ b/fs/squashfs/squashfs_fs_sb.h
@@ -63,7 +63,7 @@ struct squashfs_sb_info {
long long bytes_used;
unsigned int inodes;
unsigned int fragments;
- int xattr_ids;
+ unsigned int xattr_ids;
unsigned int ids;
bool panic_on_errors;
const struct squashfs_decompressor_thread_ops *thread_ops;
diff --git a/fs/squashfs/xattr.h b/fs/squashfs/xattr.h
index d8a270d3ac4c..f1a463d8bfa0 100644
--- a/fs/squashfs/xattr.h
+++ b/fs/squashfs/xattr.h
@@ -10,12 +10,12 @@
#ifdef CONFIG_SQUASHFS_XATTR
extern __le64 *squashfs_read_xattr_id_table(struct super_block *, u64,
- u64 *, int *);
+ u64 *, unsigned int *);
extern int squashfs_xattr_lookup(struct super_block *, unsigned int, int *,
unsigned int *, unsigned long long *);
#else
static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb,
- u64 start, u64 *xattr_table_start, int *xattr_ids)
+ u64 start, u64 *xattr_table_start, unsigned int *xattr_ids)
{
struct squashfs_xattr_id_table *id_table;
diff --git a/fs/squashfs/xattr_id.c b/fs/squashfs/xattr_id.c
index 087cab8c78f4..c8469c656e0d 100644
--- a/fs/squashfs/xattr_id.c
+++ b/fs/squashfs/xattr_id.c
@@ -56,7 +56,7 @@ int squashfs_xattr_lookup(struct super_block *sb, unsigned int index,
* Read uncompressed xattr id lookup table indexes from disk into memory
*/
__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
- u64 *xattr_table_start, int *xattr_ids)
+ u64 *xattr_table_start, unsigned int *xattr_ids)
{
struct squashfs_sb_info *msblk = sb->s_fs_info;
unsigned int len, indexes;
--
2.35.1
Stable team, please backport these two commits to v6.1:
2bd0db4b3f0b ("drm/i915: Allow panel fixed modes to have differing sync polarities")
55cfeecc2197 ("drm/i915: Allow alternate fixed modes always for eDP")
Reference for posterity: https://gitlab.freedesktop.org/drm/intel/-/issues/7841
Thanks,
Jani.
--
Jani Nikula, Intel Open Source Graphics Center
Syzkaller reports use-after-free in hci_cmd_timeout(). The bug was fixed
in the following patch and can be cleanly applied to 6.1 stable tree.
Due to some technical rearrangement, the fix for older stable branches
requires a different patch which I'll send you in another thread.
The patch titled
Subject: highmem: round down the address passed to kunmap_flush_on_unmap()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
highmem-round-down-the-address-passed-to-kunmap_flush_on_unmap.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: highmem: round down the address passed to kunmap_flush_on_unmap()
Date: Thu, 26 Jan 2023 20:07:27 +0000
We already round down the address in kunmap_local_indexed() which is the
other implementation of __kunmap_local(). The only implementation of
kunmap_flush_on_unmap() is PA-RISC which is expecting a page-aligned
address. This may be causing PA-RISC to be flushing the wrong addresses
currently.
Link: https://lkml.kernel.org/r/20230126200727.1680362-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Fixes: 298fa1ad5571 ("highmem: Provide generic variant of kmap_atomic*")
Cc: "Fabio M. De Francesco" <fmdefrancesco(a)gmail.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Helge Deller <deller(a)gmx.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: David Sterba <dsterba(a)suse.com>
Cc: "Fabio M. De Francesco" <fmdefrancesco(a)gmail.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/highmem-internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/highmem-internal.h~highmem-round-down-the-address-passed-to-kunmap_flush_on_unmap
+++ a/include/linux/highmem-internal.h
@@ -200,7 +200,7 @@ static inline void *kmap_local_pfn(unsig
static inline void __kunmap_local(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PTR_ALIGN_DOWN(addr, PAGE_SIZE));
#endif
}
@@ -227,7 +227,7 @@ static inline void *kmap_atomic_pfn(unsi
static inline void __kunmap_atomic(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PTR_ALIGN_DOWN(addr, PAGE_SIZE));
#endif
pagefault_enable();
if (IS_ENABLED(CONFIG_PREEMPT_RT))
_
Patches currently in -mm which might be from willy(a)infradead.org are
highmem-round-down-the-address-passed-to-kunmap_flush_on_unmap.patch
mm-remove-folio_pincount_ptr-and-head_compound_pincount.patch
mm-convert-head_subpages_mapcount-into-folio_nr_pages_mapped.patch
doc-clarify-refcount-section-by-referring-to-folios-pages.patch
mm-convert-total_compound_mapcount-to-folio_total_mapcount.patch
mm-convert-page_remove_rmap-to-use-a-folio-internally.patch
mm-convert-page_add_anon_rmap-to-use-a-folio-internally.patch
mm-convert-page_add_file_rmap-to-use-a-folio-internally.patch
mm-add-folio_add_new_anon_rmap.patch
mm-add-folio_add_new_anon_rmap-fix-2.patch
page_alloc-use-folio-fields-directly.patch
mm-use-a-folio-in-hugepage_add_anon_rmap-and-hugepage_add_new_anon_rmap.patch
mm-use-entire_mapcount-in-__page_dup_rmap.patch
mm-debug-remove-call-to-head_compound_mapcount.patch
hugetlb-remove-uses-of-folio_mapcount_ptr.patch
mm-convert-page_mapcount-to-use-folio_entire_mapcount.patch
mm-remove-head_compound_mapcount-and-_ptr-functions.patch
mm-reimplement-compound_order.patch
mm-reimplement-compound_nr.patch
mm-reimplement-compound_nr-fix.patch
mm-convert-set_compound_page_dtor-and-set_compound_order-to-folios.patch
mm-convert-is_transparent_hugepage-to-use-a-folio.patch
mm-convert-destroy_large_folio-to-use-folio_dtor.patch
hugetlb-remove-uses-of-compound_dtor-and-compound_nr.patch
mm-remove-first-tail-page-members-from-struct-page.patch
doc-correct-struct-folio-kernel-doc.patch
mm-move-page-deferred_list-to-folio-_deferred_list.patch
mm-huge_memory-remove-page_deferred_list.patch
mm-huge_memory-convert-get_deferred_split_queue-to-take-a-folio.patch
mm-convert-deferred_split_huge_page-to-deferred_split_folio.patch
shmem-convert-shmem_write_end-to-use-a-folio.patch
mm-add-vma_alloc_zeroed_movable_folio.patch
mm-convert-do_anonymous_page-to-use-a-folio.patch
mm-convert-wp_page_copy-to-use-folios.patch
mm-use-a-folio-in-copy_pte_range.patch
mm-use-a-folio-in-copy_present_pte.patch
mm-fs-convert-inode_attach_wb-to-take-a-folio.patch
mm-convert-mem_cgroup_css_from_page-to-mem_cgroup_css_from_folio.patch
mm-remove-page_evictable.patch
mm-remove-mlock_vma_page.patch
mm-remove-munlock_vma_page.patch
mm-clean-up-mlock_page-munlock_page-references-in-comments.patch
rmap-add-folio-parameter-to-__page_set_anon_rmap.patch
filemap-convert-filemap_map_pmd-to-take-a-folio.patch
filemap-convert-filemap_range_has_page-to-use-a-folio.patch
readahead-convert-readahead_expand-to-use-a-folio.patch
mm-add-memcpy_from_file_folio.patch
fs-convert-writepage_t-callback-to-pass-a-folio.patch
mpage-convert-__mpage_writepage-to-use-a-folio-more-fully.patch
The patch titled
Subject: migrate: hugetlb: check for hugetlb shared PMD in node migration
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
migrate-hugetlb-check-for-hugetlb-shared-pmd-in-node-migration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: migrate: hugetlb: check for hugetlb shared PMD in node migration
Date: Thu, 26 Jan 2023 14:27:21 -0800
migrate_pages/mempolicy semantics state that CAP_SYS_NICE is required to
move pages shared with another process to a different node. The check
"page_mapcount(page) > 1" is used to determine if a hugetlb page is
shared. However, a hugetlb page will have a mapcount of 1 if mapped by
multiple processes via a shared PMD. As a result, hugetlb pages shared
by multiple processes and mapped with a shared PMD can be moved by a
process without CAP_SYS_NICE.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is
found, consider the page shared.
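For illustration, a user-space sketch of the corrected decision; the
flag values and the pmd_shared parameter are simplified stand-ins, not
the mempolicy code itself:

/*
 * mapcount == 1 alone no longer implies the hugetlb page is unshared;
 * a shared PMD must also be ruled out before allowing the move.
 */
#include <stdbool.h>
#include <stdio.h>

#define MPOL_MF_MOVE     (1 << 0)  /* stand-in values, not the uapi ones */
#define MPOL_MF_MOVE_ALL (1 << 1)

static bool may_move_hugetlb(unsigned int flags, int mapcount, bool pmd_shared)
{
        return (flags & MPOL_MF_MOVE_ALL) ||
               ((flags & MPOL_MF_MOVE) && mapcount == 1 && !pmd_shared);
}

int main(void)
{
        /* Shared-PMD page: mapcount 1, but PMD shared -> must not move. */
        printf("%d\n", may_move_hugetlb(MPOL_MF_MOVE, 1, true));   /* 0 */
        printf("%d\n", may_move_hugetlb(MPOL_MF_MOVE, 1, false));  /* 1 */
        return 0;
}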
Link: https://lkml.kernel.org/r/20230126222721.222195-3-mike.kravetz@oracle.com
Fixes: e2d8cf405525 ("migrate: add hugepage migration code to migrate_pages()")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: James Houghton <jthoughton(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/mempolicy.c~migrate-hugetlb-check-for-hugetlb-shared-pmd-in-node-migration
+++ a/mm/mempolicy.c
@@ -600,7 +600,8 @@ static int queue_pages_hugetlb(pte_t *pt
/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
if (flags & (MPOL_MF_MOVE_ALL) ||
- (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+ (flags & MPOL_MF_MOVE && page_mapcount(page) == 1 &&
+ !hugetlb_pmd_shared(pte))) {
if (isolate_hugetlb(page, qp->pagelist) &&
(flags & MPOL_MF_STRICT))
/*
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps.patch
migrate-hugetlb-check-for-hugetlb-shared-pmd-in-node-migration.patch
The patch titled
Subject: mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
Date: Thu, 26 Jan 2023 14:27:20 -0800
Patch series "Fixes for hugetlb mapcount at most 1 for shared PMDs".
This issue of mapcount in hugetlb pages referenced by shared PMDs was
discussed in [1]. The following two patches address the user-visible
behavior caused by this issue.
[1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/
This patch (of 2):
A hugetlb page will have a mapcount of 1 if mapped by multiple processes
via a shared PMD. This is because only the first process increases the
map count, and subsequent processes just add the shared PMD page to their
page table.
page_mapcount is being used to decide if a hugetlb page is shared or
private in /proc/PID/smaps. Pages referenced via a shared PMD were
incorrectly being counted as private.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is
found, count the hugetlb page as shared. A new helper to check for a
shared PMD is added.
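For illustration, a user-space sketch of why mapcount alone is
insufficient; the structs and counts are simplified stand-ins mirroring
the page_count(virt_to_page(pte)) > 1 check added below, not the kernel
implementation:

/*
 * Only the first mapper raises the hugetlb page's mapcount; each later
 * process just takes a reference on the page backing the shared PMD.
 */
#include <stdbool.h>
#include <stdio.h>

struct hugetlb_page   { int mapcount; };
struct pmd_table_page { int refcount; };  /* page table page backing the PMD */

static bool hugetlb_pmd_shared(const struct pmd_table_page *pmd)
{
        return pmd->refcount > 1;
}

int main(void)
{
        struct hugetlb_page page  = { .mapcount = 1 };  /* first mapper only */
        struct pmd_table_page pmd = { .refcount = 2 };  /* two processes share it */

        /* Old smaps logic: mapcount >= 2 ? shared : private -> "private". */
        printf("old: %s\n", page.mapcount >= 2 ? "shared" : "private");
        /* Fixed logic: also consult the PMD page's refcount -> "shared". */
        printf("new: %s\n", page.mapcount >= 2 || hugetlb_pmd_shared(&pmd) ?
               "shared" : "private");
        return 0;
}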
Link: https://lkml.kernel.org/r/20230126222721.222195-2-mike.kravetz@oracle.com
Fixes: 25ee01a2fca0 ("mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: James Houghton <jthoughton(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 10 ++++++++--
include/linux/hugetlb.h | 12 ++++++++++++
2 files changed, 20 insertions(+), 2 deletions(-)
--- a/fs/proc/task_mmu.c~mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps
+++ a/fs/proc/task_mmu.c
@@ -749,8 +749,14 @@ static int smaps_hugetlb_range(pte_t *pt
if (mapcount >= 2)
mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
- else
- mss->private_hugetlb += huge_page_size(hstate_vma(vma));
+ else {
+ if (hugetlb_pmd_shared(pte))
+ mss->shared_hugetlb +=
+ huge_page_size(hstate_vma(vma));
+ else
+ mss->private_hugetlb +=
+ huge_page_size(hstate_vma(vma));
+ }
}
return 0;
}
--- a/include/linux/hugetlb.h~mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps
+++ a/include/linux/hugetlb.h
@@ -1187,6 +1187,18 @@ static inline __init void hugetlb_cma_re
}
#endif
+#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+static inline bool hugetlb_pmd_shared(pte_t *pte)
+{
+ return page_count(virt_to_page(pte)) > 1;
+}
+#else
+static inline bool hugetlb_pmd_shared(pte_t *pte)
+{
+ return false;
+}
+#endif
+
bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
mm-hugetlb-proc-check-for-hugetlb-shared-pmd-in-proc-pid-smaps.patch
migrate-hugetlb-check-for-hugetlb-shared-pmd-in-node-migration.patch
Use the correct old/new topology and payload states in
intel_mst_disable_dp(). So far the drm_atomic_get_mst_topology_state()
call it used returned either the old state, in case the state had
already been added earlier during the atomic check phase, or otherwise
the new state (but the latter could fail, which can't be handled in the
enable/disable hooks). After the first patch in the patchset, the state
should always get added during the check phase, so here we can get the
old/new states without a failure.
drm_dp_remove_payload() should use time_slots from the old payload state
and vc_start_slot in the new one. It should update the new payload
states to reflect the sink's current payload table after the payload is
removed. Pass the new topology state and the old and new payload states
accordingly.
This also fixes a problem where the payload allocations for multiple MST
streams on the same link got inconsistent after a few commits, as
during payload removal the old instead of the new payload state got
updated, so the subsequent enabling sequence and commits used a stale
payload state.
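For illustration, a standalone sketch of the old/new-state mix during
removal; the types and helper are simplified stand-ins, not the
drm_dp_remove_payload() implementation:

/*
 * The number of slots freed comes from the old payload state, the
 * position to free from the new one, and the remaining payloads in the
 * new state shift down so later commits see the sink's actual table.
 */
#include <stdio.h>

struct payload { int vc_start_slot; int time_slots; };

static void remove_payload(struct payload *new_table, int n,
                           const struct payload *old_removed,
                           int removed_start)
{
        /* The removed entry itself would be dropped from the real table. */
        for (int i = 0; i < n; i++)
                if (new_table[i].vc_start_slot > removed_start)
                        new_table[i].vc_start_slot -= old_removed->time_slots;
}

int main(void)
{
        struct payload new_table[2] = {
                { .vc_start_slot = 1,  .time_slots = 10 },  /* stream A */
                { .vc_start_slot = 11, .time_slots = 20 },  /* stream B */
        };
        struct payload old_a = new_table[0];  /* old state: A's allocation */

        /* Remove stream A: B's start slot drops by A's old time_slots. */
        remove_payload(new_table, 2, &old_a, old_a.vc_start_slot);
        printf("stream B now starts at slot %d\n",
               new_table[1].vc_start_slot);  /* 1 */
        return 0;
}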
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org # 6.1
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_dp_mst.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c b/drivers/gpu/drm/i915/display/intel_dp_mst.c
index 5f7bcb5c14847..800fa12a61d93 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
@@ -524,10 +524,14 @@ static void intel_mst_disable_dp(struct intel_atomic_state *state,
struct intel_dp *intel_dp = &dig_port->dp;
struct intel_connector *connector =
to_intel_connector(old_conn_state->connector);
- struct drm_dp_mst_topology_state *mst_state =
- drm_atomic_get_mst_topology_state(&state->base, &intel_dp->mst_mgr);
- struct drm_dp_mst_atomic_payload *payload =
- drm_atomic_get_mst_payload_state(mst_state, connector->port);
+ struct drm_dp_mst_topology_state *old_mst_state =
+ drm_atomic_get_old_mst_topology_state(&state->base, &intel_dp->mst_mgr);
+ struct drm_dp_mst_topology_state *new_mst_state =
+ drm_atomic_get_new_mst_topology_state(&state->base, &intel_dp->mst_mgr);
+ struct drm_dp_mst_atomic_payload *old_payload =
+ drm_atomic_get_mst_payload_state(old_mst_state, connector->port);
+ struct drm_dp_mst_atomic_payload *new_payload =
+ drm_atomic_get_mst_payload_state(new_mst_state, connector->port);
struct drm_i915_private *i915 = to_i915(connector->base.dev);
drm_dbg_kms(&i915->drm, "active links %d\n",
@@ -535,8 +539,8 @@ static void intel_mst_disable_dp(struct intel_atomic_state *state,
intel_hdcp_disable(intel_mst->connector);
- drm_dp_remove_payload(&intel_dp->mst_mgr, mst_state,
- payload, payload);
+ drm_dp_remove_payload(&intel_dp->mst_mgr, new_mst_state,
+ old_payload, new_payload);
intel_audio_codec_disable(encoder, old_crtc_state, old_conn_state);
}
--
2.37.1