Unmapped folios accessed through file descriptors can be
underprotected. Those folios are added to the oldest generation based
on:
1. The fact that they are less costly to reclaim (no need to walk the
rmap and flush the TLB) and have less impact on performance (don't
cause major PFs and can be non-blocking if needed again).
2. The observation that they are likely to be single-use. E.g., for
client use cases like Android, its apps parse configuration files
and store the data in heap (anon); for server use cases like MySQL,
it reads from InnoDB files and holds the cached data for tables in
buffer pools (anon).
However, the oldest generation can be very short-lived, and if so, it
doesn't give the PID controller enough time to respond to a surge of
refaults. (Note that the PID controller uses weighted refaults, and
those from evicted generations carry only half of the whole weight.)
In other words, for a short-lived generation, the moving average
smooths out the spike quickly.
To fix the problem:
1. For folios that are already on LRU, if they can be beyond the
tracking range of tiers, i.e., five accesses through file
descriptors, move them to the second oldest generation to give them
more time to age. (Note that tiers are used by the PID controller
to statistically determine whether folios accessed multiple times
through file descriptors are worth protecting.)
2. When adding unmapped folios to LRU, adjust the placement of them so
that they are not too close to the tail. The effect of this is
similar to the above.
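The placement rule can be sketched as a small userspace model. This is only an illustration that follows the branch order of the lru_gen_add_folio() hunk in this patch; struct folio_state, pick_seq() and the value of MIN_NR_GENS are assumptions made for the sketch, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN_NR_GENS 2UL /* assumption: mirrors the kernel constant */

/* Hypothetical, flattened view of the state the decision reads. */
struct folio_state {
	bool active;      /* stands in for folio_test_active() */
	bool anon;        /* stands in for type == LRU_GEN_ANON */
	bool swapcache;   /* stands in for folio_test_swapcache() */
	bool reclaim;     /* stands in for folio_test_reclaim() */
	bool dirty_or_wb; /* folio_test_dirty() || folio_test_writeback() */
};

/* Returns the sequence number of the generation the folio would join. */
static unsigned long pick_seq(const struct folio_state *f, bool reclaiming,
			      unsigned long min_seq, unsigned long max_seq)
{
	/* Assumed invariant for this sketch: at least two generations exist. */
	assert(min_seq < max_seq);

	if (f->active)
		return max_seq;
	if ((f->anon && !f->swapcache) || (f->reclaim && f->dirty_or_wb))
		return max_seq - 1;
	if (reclaiming || min_seq + MIN_NR_GENS >= max_seq)
		return min_seq;
	return min_seq + 1;
}
```

With min_seq = 10 and max_seq = 13, a clean unmapped folio added outside of reclaim lands in generation 11 rather than 10, which is the extra aging time the commit message describes.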
On Android, launching 55 apps sequentially:
                           Before      After       Change
  workingset_refault_anon  25641024    25598972    0%
  workingset_refault_file  115016834   106178438   -8%
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Reported-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
Tested-by: Kalesh Singh <kaleshsingh(a)google.com>
Cc: stable(a)vger.kernel.org
---
include/linux/mm_inline.h | 23 ++++++++++++++---------
mm/vmscan.c | 2 +-
mm/workingset.c | 6 +++---
3 files changed, 18 insertions(+), 13 deletions(-)
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 9ae7def16cb2..f4fe593c1400 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -232,22 +232,27 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
if (folio_test_unevictable(folio) || !lrugen->enabled)
return false;
/*
- * There are three common cases for this page:
- * 1. If it's hot, e.g., freshly faulted in or previously hot and
- * migrated, add it to the youngest generation.
- * 2. If it's cold but can't be evicted immediately, i.e., an anon page
- * not in swapcache or a dirty page pending writeback, add it to the
- * second oldest generation.
- * 3. Everything else (clean, cold) is added to the oldest generation.
+ * There are four common cases for this page:
+ * 1. If it's hot, i.e., freshly faulted in, add it to the youngest
+ * generation, and it's protected over the rest below.
+ * 2. If it can't be evicted immediately, i.e., a dirty page pending
+ * writeback, add it to the second youngest generation.
+ * 3. If it should be evicted first, e.g., cold and clean from
+ * folio_rotate_reclaimable(), add it to the oldest generation.
+ * 4. Everything else falls between 2 & 3 above and is added to the
+ * second oldest generation if it's considered inactive, or the
+ * oldest generation otherwise. See lru_gen_is_active().
*/
if (folio_test_active(folio))
seq = lrugen->max_seq;
else if ((type == LRU_GEN_ANON && !folio_test_swapcache(folio)) ||
(folio_test_reclaim(folio) &&
(folio_test_dirty(folio) || folio_test_writeback(folio))))
- seq = lrugen->min_seq[type] + 1;
- else
+ seq = lrugen->max_seq - 1;
+ else if (reclaiming || lrugen->min_seq[type] + MIN_NR_GENS >= lrugen->max_seq)
seq = lrugen->min_seq[type];
+ else
+ seq = lrugen->min_seq[type] + 1;
gen = lru_gen_from_seq(seq);
flags = (gen + 1UL) << LRU_GEN_PGOFF;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4e3b835c6b4a..e67631c60ac0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4260,7 +4260,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
}
/* protected */
- if (tier > tier_idx) {
+ if (tier > tier_idx || refs == BIT(LRU_REFS_WIDTH)) {
int hist = lru_hist_from_seq(lrugen->min_seq[type]);
gen = folio_inc_gen(lruvec, folio, false);
diff --git a/mm/workingset.c b/mm/workingset.c
index 7d3dacab8451..2a2a34234df9 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -313,10 +313,10 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
* 1. For pages accessed through page tables, hotter pages pushed out
* hot pages which refaulted immediately.
* 2. For pages accessed multiple times through file descriptors,
- * numbers of accesses might have been out of the range.
+ * they would have been protected by sort_folio().
*/
- if (lru_gen_in_fault() || refs == BIT(LRU_REFS_WIDTH)) {
- folio_set_workingset(folio);
+ if (lru_gen_in_fault() || refs >= BIT(LRU_REFS_WIDTH) - 1) {
+ set_mask_bits(&folio->flags, 0, LRU_REFS_MASK | BIT(PG_workingset));
mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + type, delta);
}
unlock:
--
2.43.0.472.g3155946c3a-goog
When a queue is unbound from the vfio_ap device driver, it is reset to
ensure its crypto data is not leaked when it is bound to another device
driver. If the queue is unbound because the adapter or domain was removed
from the host's AP configuration, then attempting to reset it will fail
with response code 01 (APQN not valid) returned from the reset command.
Ensure that the queue is assigned to the host's configuration before
resetting it.
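The guard can be illustrated with a userspace model of the two mask lookups. This is a sketch under stated assumptions: test_bit_msb0() imitates the MSB-first bit numbering that test_bit_inv() uses on s390, and queue_in_host_config(), AP_MASK_BITS and the byte-array masks are hypothetical names introduced here, not the driver's API.

```c
#include <assert.h>
#include <stdbool.h>

#define AP_MASK_BITS 256 /* assumption: mask width used for this sketch */

/* MSB-first bit test: bit 0 is the most significant bit of byte 0. */
static bool test_bit_msb0(unsigned int nr, const unsigned char *mask)
{
	assert(nr < AP_MASK_BITS);
	return (mask[nr / 8] & (0x80u >> (nr % 8))) != 0;
}

/*
 * Hypothetical helper: a queue may be reset only if both its adapter
 * (APID) and its domain (APQI) are present in the host's masks.
 */
static bool queue_in_host_config(unsigned int apid, unsigned int apqi,
				 const unsigned char *apm,
				 const unsigned char *aqm)
{
	return test_bit_msb0(apid, apm) && test_bit_msb0(apqi, aqm);
}
```

A caller would skip the reset when the helper returns false, since the reset command is expected to fail for a queue that is no longer in the host's configuration.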
Signed-off-by: Tony Krowiak <akrowiak(a)linux.ibm.com>
Fixes: eeb386aeb5b7 ("s390/vfio-ap: handle config changed and scan complete notification")
Cc: <stable(a)vger.kernel.org>
---
drivers/s390/crypto/vfio_ap_ops.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index e014108067dc..84decb0d5c97 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -2197,6 +2197,8 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev)
q = dev_get_drvdata(&apdev->device);
get_update_locks_for_queue(q);
matrix_mdev = q->matrix_mdev;
+ apid = AP_QID_CARD(q->apqn);
+ apqi = AP_QID_QUEUE(q->apqn);
if (matrix_mdev) {
/* If the queue is assigned to the guest's AP configuration */
@@ -2214,8 +2216,16 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev)
}
}
- vfio_ap_mdev_reset_queue(q);
- flush_work(&q->reset_work);
+ /*
+ * If the queue is not in the host's AP configuration, then resetting
+ * it will fail with response code 01, (APQN not valid); so, let's make
+ * sure it is in the host's config.
+ */
+ if (test_bit_inv(apid, (unsigned long *)matrix_dev->info.apm) &&
+ test_bit_inv(apqi, (unsigned long *)matrix_dev->info.aqm)) {
+ vfio_ap_mdev_reset_queue(q);
+ flush_work(&q->reset_work);
+ }
done:
if (matrix_mdev)
--
2.43.0
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x b8bd342d50cbf606666488488f9fea374aceb2d5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023091601-spotted-untie-0ba4@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
b8bd342d50cb ("fuse: nlookup missing decrement in fuse_direntplus_link")
d123d8e1833c ("fuse: split out readdir.c")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b8bd342d50cbf606666488488f9fea374aceb2d5 Mon Sep 17 00:00:00 2001
From: ruanmeisi <ruan.meisi(a)zte.com.cn>
Date: Tue, 25 Apr 2023 19:13:54 +0800
Subject: [PATCH] fuse: nlookup missing decrement in fuse_direntplus_link
While debugging glusterfs, we hit an "Assertion failed" error,
inode_lookup >= nlookup, which was caused by the nlookup value in the
kernel being greater than that in the FUSE file system.
The issue is in fuse_direntplus_link(): fuse_iget() increments nlookup,
and if d_splice_alias() returns failure, fuse_direntplus_link() returns
failure without decrementing nlookup.
https://github.com/gluster/glusterfs/pull/4081
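The accounting rule behind the fix can be sketched in userspace as follows. This is a simplified model, not the kernel code: struct model_inode, model_iget() and model_direntplus_link() are hypothetical names, the -EIO error value is arbitrary, and the fi->lock spinlock that protects nlookup in the real code is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-inode lookup count. */
struct model_inode {
	unsigned long nlookup;
};

/* Models fuse_iget(): every successful call takes one lookup reference. */
static void model_iget(struct model_inode *inode)
{
	inode->nlookup++;
}

/*
 * Models the fixed error path: if linking the dentry fails after the
 * lookup reference was taken, drop that reference before returning.
 */
static int model_direntplus_link(struct model_inode *inode, bool splice_fails)
{
	model_iget(inode);
	if (splice_fails) {
		assert(inode->nlookup > 0);
		inode->nlookup--;
		return -EIO; /* arbitrary error value for the sketch */
	}
	return 0;
}
```

After a failed link the count is back where it started, so the kernel's view cannot drift above what the userspace file system has handed out.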
Signed-off-by: ruanmeisi <ruan.meisi(a)zte.com.cn>
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Cc: <stable(a)vger.kernel.org> # v3.9
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index dc603479b30e..b3d498163f97 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -243,8 +243,16 @@ static int fuse_direntplus_link(struct file *file,
dput(dentry);
dentry = alias;
}
- if (IS_ERR(dentry))
+ if (IS_ERR(dentry)) {
+ if (!IS_ERR(inode)) {
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ spin_lock(&fi->lock);
+ fi->nlookup--;
+ spin_unlock(&fi->lock);
+ }
return PTR_ERR(dentry);
+ }
}
if (fc->readdirplus_auto)
set_bit(FUSE_I_INIT_RDPLUS, &get_fuse_inode(inode)->state);
This is the start of the stable review cycle for the 5.15.142 release.
There are 67 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 07 Dec 2023 03:14:57 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.142-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.142-rc1
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix deadlock on RTL8125 in jumbo mtu mode
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: disable ASPM in case of tx timeout
Wenchao Chen <wenchao.chen(a)unisoc.com>
mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled
Heiner Kallweit <hkallweit1(a)gmail.com>
mmc: core: add helpers mmc_regulator_enable/disable_vqmmc
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Make context clearing consistent with context mapping
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Omit devTLB invalidation requests when TES=0
Christoph Niedermaier <cniedermaier(a)dh-electronics.com>
cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
Christoph Niedermaier <cniedermaier(a)dh-electronics.com>
cpufreq: imx6q: don't warn for disabling a non-existing frequency
Steve French <stfrench(a)microsoft.com>
smb3: fix caching of ctime on setxattr
Jeff Layton <jlayton(a)kernel.org>
fs: add ctime accessors infrastructure
Helge Deller <deller(a)gmx.de>
fbdev: stifb: Make the STI next font pointer a 32-bit signed offset
Mark Hasemeyer <markhas(a)chromium.org>
ASoC: SOF: sof-pci-dev: Fix community key quirk detection
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: SOF: sof-pci-dev: don't use the community key on APL Chromebooks
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: SOF: sof-pci-dev: add parameter to override topology filename
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: SOF: sof-pci-dev: use community key on all Up boards
Hans de Goede <hdegoede(a)redhat.com>
ASoC: Intel: Move soc_intel_is_foo() helpers to a generic header
Steve French <stfrench(a)microsoft.com>
smb3: fix touch -h of symlink
Gaurav Batra <gbatra(a)linux.vnet.ibm.com>
powerpc/pseries/iommu: enable_ddw incorrectly returns direct mapping for SR-IOV device
Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
selftests/resctrl: Move _GNU_SOURCE define into Makefile
Shaopeng Tan <tan.shaopeng(a)jp.fujitsu.com>
selftests/resctrl: Add missing SPDX license to Makefile
Adrian Hunter <adrian.hunter(a)intel.com>
perf intel-pt: Fix async branch flags
Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
net: ravb: Stop DMA in case of failures on ravb_open()
Phil Edworthy <phil.edworthy(a)renesas.com>
ravb: Support separate Line0 (Desc), Line1 (Err) and Line2 (Mgmt) irqs
Phil Edworthy <phil.edworthy(a)renesas.com>
ravb: Separate handling of irq enable/disable regs into feature
Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
net: ravb: Start TX queues after HW initialization succeeded
Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
net: ravb: Use pm_runtime_resume_and_get()
Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
net: ravb: Check return value of reset_control_deassert()
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
ravb: Fix races between ravb_tx_timeout_work() and net related ops
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: prevent potential deadlock in rtl8169_close
Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Revert "workqueue: remove unused cancel_work()"
Geetha sowjanya <gakula(a)marvell.com>
octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64
Furong Xu <0x1207(a)gmail.com>
net: stmmac: xgmac: Disable FPE MMC interrupts
Elena Salomatkina <elena.salomatkina.cmc(a)gmail.com>
octeontx2-af: Fix possible buffer overflow
Willem de Bruijn <willemb(a)google.com>
selftests/net: ipsec: fix constant out of range
Dmitry Antipov <dmantipov(a)yandex.ru>
uapi: propagate __struct_group() attributes to the container union
Ioana Ciornei <ioana.ciornei(a)nxp.com>
dpaa2-eth: increase the needed headroom to account for alignment
Zhengchao Shao <shaozhengchao(a)huawei.com>
ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet
Niklas Neronin <niklas.neronin(a)linux.intel.com>
usb: config: fix iteration issue in 'usb_get_bos_descriptor()'
Alan Stern <stern(a)rowland.harvard.edu>
USB: core: Change configuration warnings to notices
Haiyang Zhang <haiyangz(a)microsoft.com>
hv_netvsc: fix race of netvsc and VF register_netdevice
Patrick Wang <patrick.wang.shcn(a)gmail.com>
rcu: Avoid tracing a few functions executed in stop machine
Xin Long <lucien.xin(a)gmail.com>
vlan: move dev_put into vlan_dev_uninit
Xin Long <lucien.xin(a)gmail.com>
vlan: introduce vlan_dev_free_egress_priority
Max Nguyen <maxwell.nguyen(a)hp.com>
Input: xpad - add HyperX Clutch Gladiate Support
Filipe Manana <fdmanana(a)suse.com>
btrfs: make error messages more clear when getting a chunk map
Jann Horn <jannh(a)google.com>
btrfs: send: ensure send_fd is writable
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix off-by-one when checking chunk map includes logical address
Bragatheswaran Manickavel <bragathemanick0908(a)gmail.com>
btrfs: ref-verify: fix memory leaks in btrfs_ref_tree_mod()
Qu Wenruo <wqu(a)suse.com>
btrfs: add dmesg output for first mount and last unmount of a filesystem
Helge Deller <deller(a)gmx.de>
parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
Timothy Pearson <tpearson(a)raptorengineering.com>
powerpc: Don't clobber f0/vs0 during fp|altivec register save
Abdul Halim, Mohd Syazwan <mohd.syazwan.abdul.halim(a)intel.com>
iommu/vt-d: Add MTL to quirk list to skip TE disabling
Markus Weippert <markus(a)gekmihesg.de>
bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
Wu Bo <bo.wu(a)vivo.com>
dm verity: don't perform FEC for failed readahead IO
Mikulas Patocka <mpatocka(a)redhat.com>
dm-verity: align struct dm_verity_fec_io properly
Kailang Yang <kailang(a)realtek.com>
ALSA: hda/realtek: Add supported ALC257 for ChromeOS
Kailang Yang <kailang(a)realtek.com>
ALSA: hda/realtek: Headset Mic VREF to 100%
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda: Disable power-save on KONTRON SinglePC
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: block: Be sure to wait while busy in CQE error recovery
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: block: Do not lose cache flush during CQE error recovery
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: block: Retry commands in CQE error recovery
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: cqhci: Fix task clearing in CQE error recovery
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: cqhci: Warn of halt or task clear failure
Adrian Hunter <adrian.hunter(a)intel.com>
mmc: cqhci: Increase recovery halt timeout
Yang Yingliang <yangyingliang(a)huawei.com>
firewire: core: fix possible memory leak in create_units()
Maria Yu <quic_aiquny(a)quicinc.com>
pinctrl: avoid reload of p state in list iteration
Adrian Hunter <adrian.hunter(a)intel.com>
perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
-------------
Diffstat:
Makefile | 4 +-
arch/parisc/include/uapi/asm/errno.h | 2 -
arch/powerpc/kernel/fpu.S | 13 ++++
arch/powerpc/kernel/vector.S | 2 +
arch/powerpc/platforms/pseries/iommu.c | 8 +-
drivers/cpufreq/imx6q-cpufreq.c | 32 ++++----
drivers/firewire/core-device.c | 11 +--
drivers/input/joystick/xpad.c | 2 +
drivers/iommu/intel/dmar.c | 18 +++++
drivers/iommu/intel/iommu.c | 6 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/dm-verity-fec.c | 3 +-
drivers/md/dm-verity-target.c | 4 +-
drivers/md/dm-verity.h | 6 --
drivers/mmc/core/block.c | 2 +
drivers/mmc/core/core.c | 9 ++-
drivers/mmc/core/regulator.c | 41 ++++++++++
drivers/mmc/host/cqhci-core.c | 44 +++++------
drivers/mmc/host/sdhci-sprd.c | 25 ++++++
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 8 +-
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h | 2 +-
.../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 4 +-
.../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 7 +-
drivers/net/ethernet/realtek/r8169_main.c | 23 +++++-
drivers/net/ethernet/renesas/ravb.h | 4 +
drivers/net/ethernet/renesas/ravb_main.c | 91 ++++++++++++++++++----
drivers/net/ethernet/renesas/ravb_ptp.c | 6 +-
drivers/net/ethernet/stmicro/stmmac/mmc_core.c | 4 +
drivers/net/hyperv/netvsc_drv.c | 25 +++---
drivers/pinctrl/core.c | 6 +-
drivers/usb/core/config.c | 85 ++++++++++----------
drivers/video/fbdev/sticore.h | 2 +-
fs/btrfs/disk-io.c | 1 +
fs/btrfs/ref-verify.c | 2 +
fs/btrfs/send.c | 2 +-
fs/btrfs/super.c | 5 +-
fs/btrfs/volumes.c | 9 ++-
fs/cifs/cifsfs.c | 1 +
fs/cifs/xattr.c | 5 +-
fs/inode.c | 16 ++++
include/linux/fs.h | 45 ++++++++++-
include/linux/mmc/host.h | 3 +
include/linux/platform_data/x86/soc.h | 65 ++++++++++++++++
include/linux/workqueue.h | 1 +
include/uapi/linux/stddef.h | 2 +-
kernel/rcu/tree_plugin.h | 8 +-
kernel/workqueue.c | 9 +++
lib/errname.c | 6 --
net/8021q/vlan.h | 2 +-
net/8021q/vlan_dev.c | 15 +++-
net/8021q/vlan_netlink.c | 7 +-
net/ipv4/igmp.c | 6 +-
sound/pci/hda/hda_intel.c | 2 +
sound/pci/hda/patch_realtek.c | 12 +++
sound/soc/intel/common/soc-intel-quirks.h | 51 +-----------
sound/soc/sof/sof-pci-dev.c | 62 +++++++++++----
tools/arch/parisc/include/uapi/asm/errno.h | 2 -
tools/perf/util/genelf.h | 4 +-
tools/perf/util/intel-pt.c | 2 +
tools/testing/selftests/net/ipsec.c | 4 +-
tools/testing/selftests/resctrl/Makefile | 4 +-
tools/testing/selftests/resctrl/resctrl.h | 1 -
62 files changed, 606 insertions(+), 249 deletions(-)
In min_key_size_set():
if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
In max_key_size_set():
if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
return -EINVAL;
hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
The atomicity violation occurs when min_key_size_set() and
max_key_size_set() run concurrently. Consider a scenario where
min_key_size_set() writes a new, valid 'min' value while, concurrently,
max_key_size_set() writes a value that is greater than the old 'min' but
smaller than the new 'min'. In this case, max_key_size_set() might check
against the old 'min' value (before acquiring the lock) but write its
value after 'min' has been updated. The 'max' value then ends up smaller
than the 'min' value, which is an inconsistency.
This possible bug was found by an experimental static analysis tool
developed by our team, BassCheck[1]. The tool analyzes the locking APIs
to extract function pairs that can be executed concurrently, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs, including data races and atomicity violations. The
above possible bug was reported when our tool analyzed the source code of
Linux 5.17.
To resolve this issue, move the validity checks inside the locked
sections of both min_key_size_set() and max_key_size_set(). This makes
the validation of 'val' against the current min/max values atomic with
the update, thus maintaining the integrity of the settings. With this
patch applied, our tool no longer reports the bug with the allyesconfig
kernel configuration for x86_64. Due to the lack of associated hardware,
we could not test the patch at runtime and verified it only against the
code logic.
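The check-under-lock pattern can be sketched in userspace with a pthread mutex. This is a minimal illustration, not the Bluetooth code: struct key_limits, set_min(), set_max() and the KEY_SIZE_* bounds are hypothetical and only assumed to mirror the kernel constants.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define KEY_SIZE_MIN 7  /* assumption: mirrors SMP_MIN_ENC_KEY_SIZE */
#define KEY_SIZE_MAX 16 /* assumption: mirrors SMP_MAX_ENC_KEY_SIZE */

struct key_limits {
	pthread_mutex_t lock;
	uint8_t min;
	uint8_t max;
};

/* Validate and update 'min' inside one critical section. */
static int set_min(struct key_limits *kl, uint64_t val)
{
	int ret = 0;

	pthread_mutex_lock(&kl->lock);
	if (val > kl->max || val < KEY_SIZE_MIN)
		ret = -EINVAL;
	else
		kl->min = (uint8_t)val;
	assert(kl->min <= kl->max); /* invariant holds while locked */
	pthread_mutex_unlock(&kl->lock);
	return ret;
}

/* Validate and update 'max' inside one critical section. */
static int set_max(struct key_limits *kl, uint64_t val)
{
	int ret = 0;

	pthread_mutex_lock(&kl->lock);
	if (val > KEY_SIZE_MAX || val < kl->min)
		ret = -EINVAL;
	else
		kl->max = (uint8_t)val;
	assert(kl->min <= kl->max);
	pthread_mutex_unlock(&kl->lock);
	return ret;
}
```

Because the comparison and the store share one critical section, no other setter can change the opposite bound between the check and the write.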
[1] https://sites.google.com/view/basscheck/
Fixes: 18f81241b74f ("Bluetooth: Move {min,max}_key_size debugfs ...")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
v2:
* Adjust the format to pass the CI.
---
net/bluetooth/hci_debugfs.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c
index 6b7741f6e95b..3ffbf3f25363 100644
--- a/net/bluetooth/hci_debugfs.c
+++ b/net/bluetooth/hci_debugfs.c
@@ -1045,11 +1045,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(adv_max_interval_fops, adv_max_interval_get,
static int min_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE)
+
+ hci_dev_lock(hdev);
+ if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_min_key_size = val;
hci_dev_unlock(hdev);
@@ -1073,11 +1075,13 @@ DEFINE_DEBUGFS_ATTRIBUTE(min_key_size_fops, min_key_size_get,
static int max_key_size_set(void *data, u64 val)
{
struct hci_dev *hdev = data;
-
- if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size)
+
+ hci_dev_lock(hdev);
+ if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) {
+ hci_dev_unlock(hdev);
return -EINVAL;
+ }
- hci_dev_lock(hdev);
hdev->le_max_key_size = val;
hci_dev_unlock(hdev);
--
2.34.1
In xc4000_get_frequency():
*freq = priv->freq_hz + priv->freq_offset;
The code accesses priv->freq_hz and priv->freq_offset without holding any
lock.
In xc4000_set_params():
// Code that updates priv->freq_hz and priv->freq_offset
...
xc4000_get_frequency() and xc4000_set_params() may execute concurrently,
risking inconsistent reads of priv->freq_hz and priv->freq_offset. Because
these related fields may be updated between the two reads, the computed
frequency can be incorrect, which is an atomicity violation.
This possible bug was found by an experimental static analysis tool
developed by our team, BassCheck[1]. The tool analyzes the locking APIs
to extract function pairs that can be executed concurrently, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs, including data races and atomicity violations. The
above possible bug was reported when our tool analyzed the source code of
Linux 6.2.
To address this issue, extend the priv->lock critical section in
xc4000_get_frequency() so that it covers the read of both fields. With
this patch applied, our tool no longer reports the possible bug with the
allyesconfig kernel configuration for x86_64. Due to the lack of
associated hardware, we could not test the patch at runtime and verified
it only against the code logic.
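A userspace sketch of the same idea: the reader takes the lock that the writer holds while updating the pair, so it always sees a matching freq_hz and freq_offset. The struct and function names here are hypothetical, and a pthread mutex stands in for the driver's mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the two related tuner fields. */
struct tuner_state {
	pthread_mutex_t lock;
	uint32_t freq_hz;
	uint32_t freq_offset;
};

/* Writer: updates both fields as one unit. */
static void set_params(struct tuner_state *t, uint32_t hz, uint32_t offset)
{
	assert(t);
	pthread_mutex_lock(&t->lock);
	t->freq_hz = hz;
	t->freq_offset = offset;
	pthread_mutex_unlock(&t->lock);
}

/* Reader: takes the same lock, so it never mixes old and new values. */
static uint32_t get_frequency(struct tuner_state *t)
{
	uint32_t freq;

	assert(t);
	pthread_mutex_lock(&t->lock);
	freq = t->freq_hz + t->freq_offset;
	pthread_mutex_unlock(&t->lock);
	return freq;
}
```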
[1] https://sites.google.com/view/basscheck/
Fixes: 4c07e32884ab6 ("[media] xc4000: Fix get_frequency()")
Cc: stable(a)vger.kernel.org
Reported-by: BassCheck <bass(a)buaa.edu.cn>
Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com>
---
v2:
* In v2, we added information about the static analysis tool used, as
per the researcher guidelines. We also added a Cc in the Signed-off-by
area, according to the stable-kernel-rules.
Thanks to Greg KH for the helpful advice.
---
drivers/media/tuners/xc4000.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/media/tuners/xc4000.c b/drivers/media/tuners/xc4000.c
index 57ded9ff3f04..29bc63021c5a 100644
--- a/drivers/media/tuners/xc4000.c
+++ b/drivers/media/tuners/xc4000.c
@@ -1515,10 +1515,10 @@ static int xc4000_get_frequency(struct dvb_frontend *fe, u32 *freq)
{
struct xc4000_priv *priv = fe->tuner_priv;
+ mutex_lock(&priv->lock);
*freq = priv->freq_hz + priv->freq_offset;
if (debug) {
- mutex_lock(&priv->lock);
if ((priv->cur_fw.type
& (BASE | FM | DTV6 | DTV7 | DTV78 | DTV8)) == BASE) {
u16 snr = 0;
@@ -1529,8 +1529,8 @@ static int xc4000_get_frequency(struct dvb_frontend *fe, u32 *freq)
return 0;
}
}
- mutex_unlock(&priv->lock);
}
+ mutex_unlock(&priv->lock);
dprintk(1, "%s()\n", __func__);
--
2.34.1
From: José Pekkarinen <jose.pekkarinen(a)foxhound.fi>
[ Upstream commit c1f342f35f820b33390571293498c3e2e9bc77ec ]
The dmesg output on my laptop shows the following:
[ 19.898700] psmouse serio1: synaptics: queried max coordinates: x [..5678], y [..4694]
[ 19.936057] psmouse serio1: synaptics: queried min coordinates: x [1266..], y [1162..]
[ 19.936076] psmouse serio1: synaptics: Your touchpad (PNP: LEN0411 PNP0f13) says it can support a different bus. If i2c-hid and hid-rmi are not used, you might want to try setting psmouse.synaptics_intertouch to 1 and report this to linux-input(a)vger.kernel.org.
[ 20.008901] psmouse serio1: synaptics: Touchpad model: 1, fw: 10.32, id: 0x1e2a1, caps: 0xf014a3/0x940300/0x12e800/0x500000, board id: 3471, fw id: 2909640
[ 20.008925] psmouse serio1: synaptics: serio: Synaptics pass-through port at isa0060/serio1/input0
[ 20.053344] input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio1/input/input7
[ 20.397608] mousedev: PS/2 mouse device common for all mice
This patch adds its PNP ID to the smbus list so that intertouch is set
up for the device.
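As an illustration of how such an ID table can be consulted, here is a small sketch. The table is an excerpt of the entries visible in this patch; pnp_id_in_list() is a hypothetical helper, and both the NULL terminator and the case-insensitive comparison are assumptions of the sketch rather than statements about the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Excerpt of the ID table from this patch, NULL-terminated for the sketch. */
static const char * const smbus_ids_excerpt[] = {
	"LEN0402", /* X1 Extreme Gen 2 / P1 Gen 2 */
	"LEN040f", /* P1 Gen 3 */
	"LEN0411", /* L14 Gen 1 */
	"LEN200f", /* T450s */
	NULL
};

/* Hypothetical helper: linear search of a NULL-terminated ID list. */
static bool pnp_id_in_list(const char *id, const char * const ids[])
{
	assert(id && ids);
	for (size_t i = 0; ids[i]; i++)
		if (strcasecmp(id, ids[i]) == 0)
			return true;
	return false;
}
```

With the new entry in place, the "LEN0411" ID reported in the dmesg output above is found in the list.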
Signed-off-by: José Pekkarinen <jose.pekkarinen(a)foxhound.fi>
Link: https://lore.kernel.org/r/20231114063607.71772-1-jose.pekkarinen@foxhound.fi
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/input/mouse/synaptics.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index e43e93ac2798a..b6749af462620 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -183,6 +183,7 @@ static const char * const smbus_pnp_ids[] = {
"LEN009b", /* T580 */
"LEN0402", /* X1 Extreme Gen 2 / P1 Gen 2 */
"LEN040f", /* P1 Gen 3 */
+ "LEN0411", /* L14 Gen 1 */
"LEN200f", /* T450s */
"LEN2044", /* L470 */
"LEN2054", /* E480 */
--
2.43.0
The ext4 filesystem tracks the trim status of blocks at the group level.
When an entire group has been trimmed, it is marked as such, and subsequent
trim invocations with the same minimum trim size are not attempted on that
group until it is marked as trimmable again, such as when a block is
freed.
Currently the last group can't be marked as trimmed due to incorrect logic
in ext4_last_grp_cluster(). ext4_last_grp_cluster() is supposed to return the
zero based index of the last cluster in a group. This is then used by
ext4_try_to_trim_range() to determine if the trim operation spans the entire
group and as such if the trim status of the group should be recorded.
ext4_last_grp_cluster() takes a zero-based group index, so the valid values
for grp are 0..(ext4_get_groups_count() - 1). Any group index less than
(ext4_get_groups_count() - 1) is not the last group and must have
EXT4_CLUSTERS_PER_GROUP(sb) clusters. For the last group we need to calculate
the number of clusters based on the number of blocks in the group. Finally,
subtract 1 from the number of clusters, as zero-based indexing is expected.
Rearrange the function slightly to make it clear what we are calculating
and returning.
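The calculation can be checked with a userspace sketch. struct fs_geom and the helper names are hypothetical stand-ins for the superblock accessors, and the worked numbers assume the reproducer's geometry yields a single group of 8191 blocks with one block per cluster.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, flattened subset of the superblock fields used here. */
struct fs_geom {
	uint64_t blocks_count;      /* total blocks in the filesystem */
	uint64_t first_data_block;  /* assumed 0 for 4 KiB blocks */
	uint32_t blocks_per_group;
	uint32_t clusters_per_group;
	uint32_t cluster_bits;      /* log2(blocks per cluster) */
	uint32_t groups_count;
};

static uint64_t group_first_block(const struct fs_geom *g, uint32_t grp)
{
	return g->first_data_block + (uint64_t)grp * g->blocks_per_group;
}

/* Zero-based index of the last cluster in group grp, per the fixed logic. */
static uint32_t last_grp_cluster(const struct fs_geom *g, uint32_t grp)
{
	uint64_t nr_clusters;

	assert(grp < g->groups_count);
	if (grp < g->groups_count - 1)
		nr_clusters = g->clusters_per_group;
	else
		nr_clusters = (g->blocks_count - group_first_block(g, grp))
			      >> g->cluster_bits;
	return (uint32_t)(nr_clusters - 1);
}
```

For the assumed reproducer geometry the only group holds 8191 clusters, so the last index is 8190. The old `grp < ext4_get_groups_count(sb)` test was true for every valid group and returned 8191, one past the end, which is why the trim never appeared to span the whole group.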
Reproducer:
// Create file system where the last group has fewer blocks than blocks per group
$ mkfs.ext4 -b 4096 -g 8192 /dev/nvme0n1 8191
$ mount /dev/nvme0n1 /mnt
Before Patch:
$ fstrim -v /mnt
/mnt: 25.9 MiB (27156480 bytes) trimmed
// Group not marked as trimmed so second invocation still discards blocks
$ fstrim -v /mnt
/mnt: 25.9 MiB (27156480 bytes) trimmed
After Patch:
$ fstrim -v /mnt
/mnt: 25.9 MiB (27156480 bytes) trimmed
// Group marked as trimmed so second invocation DOESN'T discard any blocks
$ fstrim -v /mnt
/mnt: 0 B (0 bytes) trimmed
Fixes: 45e4ab320c9b ("ext4: move setting of trimmed bit into ext4_try_to_trim_range()")
Cc: stable(a)vger.kernel.org # 4.19+
Signed-off-by: Suraj Jitindar Singh <surajjs(a)amazon.com>
---
fs/ext4/mballoc.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 454d5612641ee..c15d8b6f887dd 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6731,11 +6731,16 @@ __acquires(bitlock)
static ext4_grpblk_t ext4_last_grp_cluster(struct super_block *sb,
ext4_group_t grp)
{
- if (grp < ext4_get_groups_count(sb))
- return EXT4_CLUSTERS_PER_GROUP(sb) - 1;
- return (ext4_blocks_count(EXT4_SB(sb)->s_es) -
- ext4_group_first_block_no(sb, grp) - 1) >>
- EXT4_CLUSTER_BITS(sb);
+ unsigned long nr_clusters_in_group;
+
+ if (grp < (ext4_get_groups_count(sb) - 1))
+ nr_clusters_in_group = EXT4_CLUSTERS_PER_GROUP(sb);
+ else
+ nr_clusters_in_group = (ext4_blocks_count(EXT4_SB(sb)->s_es) -
+ ext4_group_first_block_no(sb, grp))
+ >> EXT4_CLUSTER_BITS(sb);
+
+ return nr_clusters_in_group - 1;
}
static bool ext4_trim_interrupted(void)
--
2.34.1
commit c5a595000e2677e865a39f249c056bc05d6e55fd upstream.
Backport of upstream fix for tls on 6.1 and lower kernels.
The curr pointer must also be updated on the splice similar to how
we do this for other copy types.
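The intent can be sketched with a simplified scatterlist ring. This is a userspace model under stated assumptions: struct sg_ring, RING_SLOTS and the helpers are hypothetical, and the field meanings in the comments are a simplification of the sk_msg bookkeeping, not its definition.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 8 /* hypothetical ring size for the sketch */

struct sg_ring {
	uint32_t start;
	uint32_t curr;      /* slot that byte copies continue into */
	uint32_t end;       /* first free slot */
	uint32_t copybreak; /* bytes already copied into slot curr */
	uint32_t len[RING_SLOTS];
};

/* Appends a whole page reference as a new slot, as a splice would. */
static void ring_page_add(struct sg_ring *r, uint32_t len)
{
	uint32_t next = (r->end + 1) % RING_SLOTS;

	assert(next != r->start); /* ring must not be full */
	r->len[r->end] = len;
	r->end = next;
}

/* Splice path with the fix: move the copy cursor past the spliced page. */
static void ring_splice(struct sg_ring *r, uint32_t len)
{
	ring_page_add(r, len);
	r->copybreak = 0;
	r->curr = r->end;
}
```

In this model, without the last two assignments a later byte copy would resume at a stale slot and offset instead of after the page that was just spliced in.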
Cc: stable(a)vger.kernel.org # 6.1.x-
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend(a)gmail.com>
---
net/tls/tls_sw.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 2e60bf06adff..0323040d34bc 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1225,6 +1225,8 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
}
sk_msg_page_add(msg_pl, page, copy, offset);
+ msg_pl->sg.copybreak = 0;
+ msg_pl->sg.curr = msg_pl->sg.end;
sk_mem_charge(sk, copy);
offset += copy;
--
2.33.0