On 2024/10/23 1:44, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> selftests: mm: fix the incorrect usage() info of khugepaged
>
> to the 6.11-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> selftests-mm-fix-the-incorrect-usage-info-of-khugepa.patch
> and it can be found in the queue-6.11 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Hi,
I don't think this patch needs to be added to the stable tree, because it
only fixes usage information, as Andrew had previously said:
https://lore.kernel.org/lkml/20241017001441.2db5adaaa63dc3faa0934204@linux-…
>
>
>
> commit ad8b93ffe0a86e3b6be297826cd34b12080fc877
> Author: Nanyong Sun <sunnanyong(a)huawei.com>
> Date: Tue Oct 15 10:02:57 2024 +0800
>
> selftests: mm: fix the incorrect usage() info of khugepaged
>
> [ Upstream commit 3e822bed2fbd1527d88f483342b1d2a468520a9a ]
>
> The mount option of tmpfs should be huge=advise, not madvise which is not
> supported and may mislead the users.
>
> Link: https://lkml.kernel.org/r/20241015020257.139235-1-sunnanyong@huawei.com
> Fixes: 1b03d0d558a2 ("selftests/vm: add thp collapse file and tmpfs testing")
> Signed-off-by: Nanyong Sun <sunnanyong(a)huawei.com>
> Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
> Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
> Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
> Cc: Shuah Khan <shuah(a)kernel.org>
> Cc: Zach O'Keefe <zokeefe(a)google.com>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
> index 829320a519e72..89dec42986825 100644
> --- a/tools/testing/selftests/mm/khugepaged.c
> +++ b/tools/testing/selftests/mm/khugepaged.c
> @@ -1091,7 +1091,7 @@ static void usage(void)
> fprintf(stderr, "\n\t\"file,all\" mem_type requires kernel built with\n");
> fprintf(stderr, "\tCONFIG_READ_ONLY_THP_FOR_FS=y\n");
> fprintf(stderr, "\n\tif [dir] is a (sub)directory of a tmpfs mount, tmpfs must be\n");
> - fprintf(stderr, "\tmounted with huge=madvise option for khugepaged tests to work\n");
> + fprintf(stderr, "\tmounted with huge=advise option for khugepaged tests to work\n");
> fprintf(stderr, "\n\tSupported Options:\n");
> fprintf(stderr, "\t\t-h: This help message.\n");
> fprintf(stderr, "\t\t-s: mTHP size, expressed as page order.\n");
> .
This is the start of the stable review cycle for the 6.1.114 release.
There are 91 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 23 Oct 2024 10:22:25 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.114-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.114-rc1
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
Nicholas Piggin <npiggin(a)gmail.com>
powerpc/64: Add big-endian ELFv2 flavour to crypto VMX asm generation
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: propagate directory read errors from nilfs_find_entry()
Paolo Abeni <pabeni(a)redhat.com>
mptcp: prevent MPC handshake on port-based signal endpoints
Paolo Abeni <pabeni(a)redhat.com>
tcp: fix mptcp DSS corruption due to large pmtu xmit
Nam Cao <namcao(a)linutronix.de>
irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()
Marc Zyngier <maz(a)kernel.org>
irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
Ma Ke <make24(a)iscas.ac.cn>
pinctrl: apple: check devm_kasprintf() returned value
Sergey Matsievskiy <matsievskiysv(a)gmail.com>
pinctrl: ocelot: fix system hang on level based interrupts
Longlong Xia <xialonglong(a)kylinos.cn>
tty: n_gsm: Fix use-after-free in gsm_cleanup_mux
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Clear CPU buffers after register restore in NMI return
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Do not clobber user EFLAGS.ZF
Zhang Rui <rui.zhang(a)intel.com>
x86/apic: Always explicitly disarm TSC-deadline timer
Nathan Chancellor <nathan(a)kernel.org>
x86/resctrl: Annotate get_mem_config() functions as __init
Takashi Iwai <tiwai(a)suse.de>
parport: Proper fix for array out-of-bounds access
Prashanth K <quic_prashk(a)quicinc.com>
usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG
Daniele Palmas <dnlplm(a)gmail.com>
USB: serial: option: add Telit FN920C04 MBIM compositions
Benjamin B. Frost <benjamin(a)geanix.com>
USB: serial: option: add support for Quectel EG916Q-GL
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Mitigate failed set dequeue pointer commands
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Fix incorrect stream context type macro
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Aaron Thompson <dev(a)aaront.org>
Bluetooth: ISO: Fix multiple init when debugfs is disabled
Aaron Thompson <dev(a)aaront.org>
Bluetooth: Remove debugfs directory on module init failure
Aaron Thompson <dev(a)aaront.org>
Bluetooth: Call iso_exit() on module unload
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad3552r: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad5766: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig
Emil Gedenryd <emil.gedenryd(a)axis.com>
iio: light: opt3001: add missing full-scale range value
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix IIO device retrieval from embedded device
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix ALS sensor resolution
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
Mohammed Anees <pvmohammedanees2003(a)gmail.com>
drm/amdgpu: prevent BO_HANDLES error from being overwritten
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu/swsmu: Only force workload setup on init
Nikolay Kuratov <kniv(a)yandex-team.ru>
drm/vmwgfx: Handle surface check failure correctly
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/radeon: Fix encoder->possible_clones
Seunghwan Baek <sh8267.baek(a)samsung.com>
scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down
Jens Axboe <axboe(a)kernel.dk>
io_uring/sqpoll: close race on waiting for sqring entries
Omar Sandoval <osandov(a)fb.com>
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Do not use UNTRAIN_RET with IBPB on entry
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Skip RSB fill at VMEXIT
Johannes Wikner <kwikner(a)ethz.ch>
x86/entry: Have entry_ibpb() invalidate return predictions
Johannes Wikner <kwikner(a)ethz.ch>
x86/cpufeatures: Add a IBPB_NO_RET BUG flag
Jim Mattson <jmattson(a)google.com>
x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
Michael Mueller <mimu(a)linux.ibm.com>
KVM: s390: Change virtual to physical address access in diag 0x258 handler
Nico Boehr <nrb(a)linux.ibm.com>
KVM: s390: gaccess: Check if guest address is in memslot
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
s390/sclp: Deactivate sclp after all its users
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices
Wachowski, Karol <karol.wachowski(a)intel.com>
drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
maple_tree: correct tree corruption on spanning store
Jakub Kicinski <kuba(a)kernel.org>
devlink: bump the instance index directly when iterating
Jakub Kicinski <kuba(a)kernel.org>
devlink: drop the filter argument from devlinks_xa_find_get
Liu Shixin <liushixin2(a)huawei.com>
mm/swapfile: skip HugeTLB pages for unuse_vma
OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
fat: fix uninitialized variable
Nianyao Tang <tangnianyao(a)huawei.com>
irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1
Oleksij Rempel <linux(a)rempel-privat.de>
net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Fix simulate_ldr*_literal()
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Remove broken LDR (literal) uprobe support
Jinjie Ruan <ruanjinjie(a)huawei.com>
posix-clock: Fix missing timespec64 check in pc_clock_settime()
Wei Fang <wei.fang(a)nxp.com>
net: enetc: add missing static descriptor and inline keyword
Wei Fang <wei.fang(a)nxp.com>
net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
Jan Kara <jack(a)suse.cz>
udf: Don't return bh from udf_expand_dir_adinicb()
Jan Kara <jack(a)suse.cz>
udf: Handle error when expanding directory
Jan Kara <jack(a)suse.cz>
udf: Remove old directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_link() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_mkdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_add_nondir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Implement adding of dir entries using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_unlink() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rmdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert empty_dir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_get_parent() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_lookup() to use new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_readdir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rename() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Provide function to mark entry as deleted using new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Implement searching for directory entry using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Move udf_expand_dir_adinicb() to its callsite
Jan Kara <jack(a)suse.cz>
udf: Convert udf_expand_dir_adinicb() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: New directory iteration code
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix user-after-free from session log off
Roi Martin <jroi.martin(a)gmail.com>
btrfs: fix uninitialized pointer free on read_alloc_one_name() error
Roi Martin <jroi.martin(a)gmail.com>
btrfs: fix uninitialized pointer free in add_inode_ref()
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/kernel/probes/decode-insn.c | 16 +-
arch/arm64/kernel/probes/simulate-insn.c | 18 +-
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/gaccess.c | 4 +
arch/s390/kvm/gaccess.h | 14 +-
arch/x86/entry/entry.S | 5 +
arch/x86/entry/entry_32.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 4 +-
arch/x86/kernel/apic/apic.c | 14 +-
arch/x86/kernel/cpu/bugs.c | 32 +
arch/x86/kernel/cpu/common.c | 3 +
arch/x86/kernel/cpu/resctrl/core.c | 4 +-
block/blk-rq-qos.c | 2 +-
drivers/bluetooth/btusb.c | 13 +-
drivers/crypto/vmx/Makefile | 12 +-
drivers/crypto/vmx/ppc-xlate.pl | 10 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 6 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +
drivers/gpu/drm/radeon/radeon_encoders.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 1 +
drivers/iio/adc/Kconfig | 4 +
drivers/iio/amplifiers/Kconfig | 1 +
.../iio/common/hid-sensors/hid-sensor-trigger.c | 2 +-
drivers/iio/dac/Kconfig | 7 +
drivers/iio/light/opt3001.c | 4 +
drivers/iio/light/veml6030.c | 5 +-
drivers/iio/proximity/Kconfig | 2 +
drivers/iommu/intel/iommu.c | 4 +-
drivers/irqchip/irq-gic-v3-its.c | 26 +-
drivers/irqchip/irq-sifive-plic.c | 21 +-
drivers/net/ethernet/cadence/macb_main.c | 14 +-
drivers/net/ethernet/freescale/enetc/enetc.c | 2 +-
drivers/parport/procfs.c | 22 +-
drivers/pinctrl/pinctrl-apple-gpio.c | 3 +
drivers/pinctrl/pinctrl-ocelot.c | 8 +-
drivers/s390/char/sclp.c | 3 +-
drivers/s390/char/sclp_vt220.c | 4 +-
drivers/tty/n_gsm.c | 2 +
drivers/ufs/core/ufshcd.c | 4 +-
drivers/usb/dwc3/gadget.c | 10 +-
drivers/usb/host/xhci-ring.c | 2 +-
drivers/usb/host/xhci.h | 2 +-
drivers/usb/serial/option.c | 8 +
fs/btrfs/tree-log.c | 6 +-
fs/fat/namei_vfat.c | 2 +-
fs/nilfs2/dir.c | 50 +-
fs/nilfs2/namei.c | 39 +-
fs/nilfs2/nilfs.h | 2 +-
fs/smb/server/mgmt/user_session.c | 26 +-
fs/smb/server/mgmt/user_session.h | 4 +
fs/smb/server/server.c | 2 +
fs/smb/server/smb2pdu.c | 8 +-
fs/udf/dir.c | 148 +--
fs/udf/directory.c | 594 ++++++++---
fs/udf/inode.c | 90 --
fs/udf/namei.c | 1037 +++++++-------------
fs/udf/udfdecl.h | 45 +-
include/linux/fsl/enetc_mdio.h | 3 +-
include/linux/irqchip/arm-gic-v4.h | 4 +-
io_uring/io_uring.h | 9 +-
kernel/time/posix-clock.c | 3 +
lib/maple_tree.c | 12 +-
mm/swapfile.c | 2 +-
net/bluetooth/af_bluetooth.c | 3 +
net/bluetooth/iso.c | 6 +-
net/devlink/leftover.c | 40 +-
net/ipv4/tcp_output.c | 4 +-
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 1 +
net/mptcp/pm_netlink.c | 3 +-
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 11 +
sound/pci/hda/patch_conexant.c | 19 +
75 files changed, 1237 insertions(+), 1275 deletions(-)
From: Chris Wilson <chris.p.wilson(a)intel.com>
commit 78a033433a5ae4fee85511ee075bc9a48312c79e upstream.
If we abort driver initialisation in the middle of gt/engine discovery,
some engines will be fully setup and some not. Those incompletely setup
engines only have 'engine->release == NULL' and so will leak any of the
common objects allocated.
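The failure path described above can be sketched in isolation. This is a minimal toy model, not the i915 code: `struct engine`, `engine_setup_common()`, `engine_cleanup_common()`, `engines_init()` and the two backend callbacks are hypothetical names invented for illustration. It only shows the ordering: common state is allocated first, and if the backend setup fails before it has installed a release hook, the caller has to undo the common part itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an engine; not the real struct intel_engine_cs. */
struct engine {
	bool common_ready;                  /* common objects allocated */
	void (*release)(struct engine *e);  /* set by the backend on success */
};

static void engine_release(struct engine *e) { e->common_ready = false; }

static int engine_setup_common(struct engine *e)
{
	e->common_ready = true;
	return 0;
}

static void engine_cleanup_common(struct engine *e)
{
	e->common_ready = false;
}

/* Two hypothetical backends: one succeeds, one fails before setting release. */
static int backend_setup_ok(struct engine *e) { e->release = engine_release; return 0; }
static int backend_setup_fail(struct engine *e) { (void)e; return -1; }

static int engines_init(struct engine *engines, size_t n,
			int (*setup)(struct engine *))
{
	for (size_t i = 0; i < n; i++) {
		struct engine *e = &engines[i];
		int err = engine_setup_common(e);

		if (err)
			return err;

		err = setup(e);
		if (err) {
			/* No release hook yet, so clean up the common part here. */
			engine_cleanup_common(e);
			return err;
		}

		/* From here on the backend is responsible for cleanup. */
		assert(e->release != NULL);
	}
	return 0;
}
```

Calling `engines_init()` with `backend_setup_fail` leaves `common_ready` false for the failing engine, which is the property the patch restores.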
v2:
- Drop the destroy_pinned_context() helper for now. It's not really
worth it with just a single callsite at the moment. (Janusz)
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Janusz Krzysztofik <janusz.krzysztofik(a)linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper(a)intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220915232654.3283095-2-matt…
Cc: <stable(a)vger.kernel.org> # 5.10+
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a19537706ed1..eb6f4d7f1e34 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -904,8 +904,13 @@ int intel_engines_init(struct intel_gt *gt)
return err;
err = setup(engine);
- if (err)
+ if (err) {
+ intel_engine_cleanup_common(engine);
return err;
+ }
+
+ /* The backend should now be responsible for cleanup */
+ GEM_BUG_ON(engine->release == NULL);
err = engine_init_common(engine);
if (err)
--
2.34.1
From: Johannes Berg <johannes.berg(a)intel.com>
If more than 255 colocated APs exist for the set of all
APs found during 2.4/5 GHz scanning, then the 6 GHz scan
construction will loop forever: the loop variable has type
u8, so it can never reach the number found (which is stored
in a u32 variable) when that number is bigger than 255.
Also move it into the loops to have a smaller scope.
Using a u32 there is fine, we limit the number of APs in
the scan list and each has a limit on the number of RNR
entries due to the frame size. With a limit of 1000 scan
results, a frame size upper bound of 4096 (really it's
more like ~2300) and a TBTT entry size of at least 11,
we get an upper bound of ~372k for that number, well
within the bounds of a u32.
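Both points above can be checked with a small standalone sketch. The helper names `u8_loop_terminates()` and `max_6ghz_params()` and the `max_steps` safety cap are assumptions made for this illustration; they are not part of the iwlwifi driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models "for (j = 0; j < n; j++)" with a u8 counter and reports whether
 * it terminates. max_steps is a demo-only cap so the check itself cannot
 * spin forever.
 */
static bool u8_loop_terminates(uint32_t n, uint32_t max_steps)
{
	uint8_t j = 0;
	uint32_t steps = 0;

	while (j < n) {
		if (++steps > max_steps)
			return false;	/* j wrapped around and never reached n */
		j++;			/* 255 + 1 wraps to 0 */
	}
	return true;
}

/* Upper bound from the text: scan results * (frame size / TBTT entry size). */
static uint32_t max_6ghz_params(uint32_t scan_results, uint32_t frame_size,
				uint32_t entry_size)
{
	return scan_results * (frame_size / entry_size);
}
```

With a bound of 255 the u8 loop still ends, with 256 it does not, and `max_6ghz_params(1000, 4096, 11)` evaluates to 372000, which fits comfortably in a u32.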
Cc: stable(a)vger.kernel.org
Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
index 3ce9150213a7..ddcbd80a49fb 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
@@ -1774,7 +1774,7 @@ iwl_mvm_umac_scan_cfg_channels_v7_6g(struct iwl_mvm *mvm,
&cp->channel_config[ch_cnt];
u32 s_ssid_bitmap = 0, bssid_bitmap = 0, flags = 0;
- u8 j, k, n_s_ssids = 0, n_bssids = 0;
+ u8 k, n_s_ssids = 0, n_bssids = 0;
u8 max_s_ssids, max_bssids;
bool force_passive = false, found = false, allow_passive = true,
unsolicited_probe_on_chan = false, psc_no_listen = false;
@@ -1799,7 +1799,7 @@ iwl_mvm_umac_scan_cfg_channels_v7_6g(struct iwl_mvm *mvm,
cfg->v5.iter_count = 1;
cfg->v5.iter_interval = 0;
- for (j = 0; j < params->n_6ghz_params; j++) {
+ for (u32 j = 0; j < params->n_6ghz_params; j++) {
s8 tmp_psd_20;
if (!(scan_6ghz_params[j].channel_idx == i))
@@ -1873,7 +1873,7 @@ iwl_mvm_umac_scan_cfg_channels_v7_6g(struct iwl_mvm *mvm,
* SSID.
* TODO: improve this logic
*/
- for (j = 0; j < params->n_6ghz_params; j++) {
+ for (u32 j = 0; j < params->n_6ghz_params; j++) {
if (!(scan_6ghz_params[j].channel_idx == i))
continue;
--
2.47.0
On 22. 10. 24, 19:45, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> xhci: dbgtty: remove kfifo_out() wrapper
This is a cleanup, not needed in stable.
--
js
suse labs
The comment before the config of the GPLL3 PLL says that the
PLL should run at 930 MHz. Contrary to this, calculating
the frequency from the current configuration values, using
the 19.2 MHz input frequency defined in 'qcs404.dtsi', gives
921.6 MHz:
$ xo=19200000; l=48; alpha=0x0; alpha_hi=0x0
$ echo "$xo * ($((l)) + $(((alpha_hi << 32 | alpha) >> 8)) / 2^32)" | bc -l
921600000.00000000000000000000
Set 'alpha_hi' in the configuration to a value used in downstream
kernels [1][2] in order to get the correct output rate:
$ xo=19200000; l=48; alpha=0x0; alpha_hi=0x70
$ echo "$xo * ($((l)) + $(((alpha_hi << 32 | alpha) >> 8)) / 2^32)" | bc -l
930000000.00000000000000000000
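The same arithmetic can be written in integer C. This is only a sketch of the expression used in the bc one-liners above, not the kernel's clk-alpha-pll implementation; the function name `alpha_pll_rate_hz()` and the choice of an 8-bit `alpha_hi` are assumptions made for the example.

```c
#include <stdint.h>

/*
 * rate = xo * (l + ((alpha_hi << 32 | alpha) >> 8) / 2^32)
 *
 * Assumes alpha_hi is 8 bits wide, so the shifted alpha value fits in
 * 32 bits and xo_hz * a cannot overflow 64 bits.
 */
static uint64_t alpha_pll_rate_hz(uint32_t xo_hz, uint32_t l,
				  uint8_t alpha_hi, uint32_t alpha)
{
	/* 40-bit alpha value reduced to its upper 32 bits */
	uint64_t a = (((uint64_t)alpha_hi << 32) | alpha) >> 8;

	/* integer part plus fractional part; the shift divides by 2^32 */
	return (uint64_t)xo_hz * l + (((uint64_t)xo_hz * a) >> 32);
}
```

For xo = 19200000 and l = 48 this gives 921600000 with `alpha_hi = 0x0` and 930000000 with `alpha_hi = 0x70`, matching the two bc results.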
The change is based on static code analysis, compile tested only.
[1] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/kernel.lnx.5.4.r56-…
[2] https://git.codelinaro.org/clo/la/kernel/msm-5.15/-/blob/kernel.lnx.5.15.r4…
Cc: stable(a)vger.kernel.org
Fixes: 652f1813c113 ("clk: qcom: gcc: Add global clock controller driver for QCS404")
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
---
Note: due to a bug in the clk_alpha_pll_configure() function, the following
patch is also needed in order for this fix to take effect:
https://lore.kernel.org/all/20241019-qcs615-mm-clockcontroller-v1-1-4cfb96d…
---
drivers/clk/qcom/gcc-qcs404.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/clk/qcom/gcc-qcs404.c b/drivers/clk/qcom/gcc-qcs404.c
index c3cfd572e7c1e0a987519be2cb2050c9bc7992c7..5ca003c9bfba89bee2e626b3c35936452cc02765 100644
--- a/drivers/clk/qcom/gcc-qcs404.c
+++ b/drivers/clk/qcom/gcc-qcs404.c
@@ -131,6 +131,7 @@ static struct clk_alpha_pll gpll1_out_main = {
/* 930MHz configuration */
static const struct alpha_pll_config gpll3_config = {
.l = 48,
+ .alpha_hi = 0x70,
.alpha = 0x0,
.alpha_en_mask = BIT(24),
.post_div_mask = 0xf << 8,
---
base-commit: 03dc72319cee7d0dfefee9ae7041b67732f6b8cd
change-id: 20241021-fix-gcc-qcs404-gpll3-f314335c8ecf
Best regards,
--
Gabor Juhos <j4g8y7(a)gmail.com>
From: Yu Kuai <yukuai3(a)huawei.com>
[ Upstream commit 73aeab373557fa6ee4ae0b742c6211ccd9859280 ]
Original state:
        Process 1       Process 2       Process 3       Process 4
         (BIC1)          (BIC2)          (BIC3)          (BIC4)
          Λ                |               |               |
           \--------------\ \-------------\ \-------------\|
                           V               V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               1               2               4
After commit 0e456dba86c7 ("block, bfq: choose the last bfqq from merge
chain in bfq_setup_cooperator()"), if P1 issues a new IO:
Without the patch:
        Process 1       Process 2       Process 3       Process 4
         (BIC1)          (BIC2)          (BIC3)          (BIC4)
          Λ                |               |               |
           \------------------------------\ \-------------\|
                                           V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               0               2               4
bfqq3 will be used to handle IO from P1. This is not expected; IO
should be redirected to bfqq4.
With the patch:
          -------------------------------------------
          |                                         |
        Process 1       Process 2       Process 3   |   Process 4
         (BIC1)          (BIC2)          (BIC3)     |    (BIC4)
                           |               |        |      |
                            \-------------\ \-------------\|
                                           V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               0               2               4
IO is redirected to bfqq4; however, the process reference of bfqq3 is
still 2, while there is only P2 using it.
Fix the problem by calling bfq_merge_bfqqs() for each bfqq in the merge
chain. Also change bfq_merge_bfqqs() to return new_bfqq to simplify
code.
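The idea of merging once per link of the chain can be illustrated with a toy model. Everything here is an assumption made for the sketch: `struct queue`, `struct io_ctx`, `merge_one()` and `merge_chain()` are hypothetical names, and the reference handling is deliberately simplified (each merge step just redirects the context and drops one process reference from the queue being left). It is not the BFQ implementation.

```c
#include <stddef.h>

/* Toy stand-ins for bfq_queue and bfq_io_cq; names are hypothetical. */
struct queue {
	int process_refs;
	struct queue *new_queue;	/* next queue in the merge chain */
};

struct io_ctx {
	struct queue *queue;		/* queue this context currently uses */
};

/* Merge q into its successor and return the successor. */
static struct queue *merge_one(struct io_ctx *ctx, struct queue *q)
{
	struct queue *next = q->new_queue;

	ctx->queue = next;	/* redirect the context */
	q->process_refs--;	/* the context no longer uses q */
	return next;
}

/* Walk the chain link by link so that every intermediate queue drops its ref. */
static struct queue *merge_chain(struct io_ctx *ctx, struct queue *q,
				 struct queue *last)
{
	while (q != last)
		q = merge_one(ctx, q);
	return q;
}
```

Starting from the counts in the diagrams (bfqq2 = 1, bfqq3 = 2, bfqq4 = 4) with the context on bfqq2, walking the chain to bfqq4 leaves bfqq2 at 0 and bfqq3 at 1 in this model, whereas a single direct merge to bfqq4 would leave bfqq3 at 2.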
Fixes: 0e456dba86c7 ("block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240909134154.954924-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bfq-iosched.c | 33 ++++++++++++++++-----------------
1 file changed, 16 insertions(+), 17 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 515e3c1a5475..c1600e3ac333 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2774,10 +2774,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void
-bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
- struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+static struct bfq_queue *bfq_merge_bfqqs(struct bfq_data *bfqd,
+ struct bfq_io_cq *bic,
+ struct bfq_queue *bfqq)
{
+ struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
bfq_log_bfqq(bfqd, bfqq, "merging with queue %lu",
(unsigned long)new_bfqq->pid);
/* Save weight raising and idle window of the merged queues */
@@ -2845,6 +2847,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
new_bfqq->pid = -1;
bfqq->bic = NULL;
bfq_release_process_ref(bfqd, bfqq);
+
+ return new_bfqq;
}
static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
@@ -2880,14 +2884,8 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
* fulfilled, i.e., bic can be redirected to new_bfqq
* and bfqq can be put.
*/
- bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq,
- new_bfqq);
- /*
- * If we get here, bio will be queued into new_queue,
- * so use new_bfqq to decide whether bio and rq can be
- * merged.
- */
- bfqq = new_bfqq;
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq);
/*
* Change also bqfd->bio_bfqq, as
@@ -5444,6 +5442,7 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
bool waiting, idle_timer_disabled = false;
if (new_bfqq) {
+ struct bfq_queue *old_bfqq = bfqq;
/*
* Release the request's reference to the old bfqq
* and make sure one is taken to the shared queue.
@@ -5459,18 +5458,18 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
* then complete the merge and redirect it to
* new_bfqq.
*/
- if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
- bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
- bfqq, new_bfqq);
+ if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq) {
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, RQ_BIC(rq), bfqq);
+ }
- bfq_clear_bfqq_just_created(bfqq);
+ bfq_clear_bfqq_just_created(old_bfqq);
/*
* rq is about to be enqueued into new_bfqq,
* release rq reference on bfqq
*/
- bfq_put_queue(bfqq);
+ bfq_put_queue(old_bfqq);
rq->elv.priv[1] = new_bfqq;
- bfqq = new_bfqq;
}
bfq_update_io_thinktime(bfqd, bfqq);
--
2.39.2
From: Yu Kuai <yukuai3(a)huawei.com>
[ Upstream commit 73aeab373557fa6ee4ae0b742c6211ccd9859280 ]
Original state:
        Process 1       Process 2       Process 3       Process 4
         (BIC1)          (BIC2)          (BIC3)          (BIC4)
          Λ                |               |               |
           \--------------\ \-------------\ \-------------\|
                           V               V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               1               2               4
After commit 0e456dba86c7 ("block, bfq: choose the last bfqq from merge
chain in bfq_setup_cooperator()"), if P1 issues a new IO:
Without the patch:
        Process 1       Process 2       Process 3       Process 4
         (BIC1)          (BIC2)          (BIC3)          (BIC4)
          Λ                |               |               |
           \------------------------------\ \-------------\|
                                           V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               0               2               4
bfqq3 will be used to handle IO from P1. This is not expected; IO
should be redirected to bfqq4.
With the patch:
          -------------------------------------------
          |                                         |
        Process 1       Process 2       Process 3   |   Process 4
         (BIC1)          (BIC2)          (BIC3)     |    (BIC4)
                           |               |        |      |
                            \-------------\ \-------------\|
                                           V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               0               2               4
IO is redirected to bfqq4; however, the process reference of bfqq3 is
still 2, while there is only P2 using it.
Fix the problem by calling bfq_merge_bfqqs() for each bfqq in the merge
chain. Also change bfq_merge_bfqqs() to return new_bfqq to simplify
code.
Fixes: 0e456dba86c7 ("block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240909134154.954924-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bfq-iosched.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index b0bdb5197530..c985c944fa65 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2981,10 +2981,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void
-bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
- struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+static struct bfq_queue *bfq_merge_bfqqs(struct bfq_data *bfqd,
+ struct bfq_io_cq *bic,
+ struct bfq_queue *bfqq)
{
+ struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
bfq_log_bfqq(bfqd, bfqq, "merging with queue %lu",
(unsigned long)new_bfqq->pid);
/* Save weight raising and idle window of the merged queues */
@@ -3078,6 +3080,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
bfq_reassign_last_bfqq(bfqq, new_bfqq);
bfq_release_process_ref(bfqd, bfqq);
+
+ return new_bfqq;
}
static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
@@ -3113,14 +3117,8 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
* fulfilled, i.e., bic can be redirected to new_bfqq
* and bfqq can be put.
*/
- bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq,
- new_bfqq);
- /*
- * If we get here, bio will be queued into new_queue,
- * so use new_bfqq to decide whether bio and rq can be
- * merged.
- */
- bfqq = new_bfqq;
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq);
/*
* Change also bqfd->bio_bfqq, as
@@ -5482,9 +5480,7 @@ bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
* state before killing it.
*/
bfqq->bic = bic;
- bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
-
- return new_bfqq;
+ return bfq_merge_bfqqs(bfqd, bic, bfqq);
}
/*
@@ -5916,6 +5912,7 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
bool waiting, idle_timer_disabled = false;
if (new_bfqq) {
+ struct bfq_queue *old_bfqq = bfqq;
/*
* Release the request's reference to the old bfqq
* and make sure one is taken to the shared queue.
@@ -5931,18 +5928,18 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
* then complete the merge and redirect it to
* new_bfqq.
*/
- if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
- bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
- bfqq, new_bfqq);
+ if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq) {
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, RQ_BIC(rq), bfqq);
+ }
- bfq_clear_bfqq_just_created(bfqq);
+ bfq_clear_bfqq_just_created(old_bfqq);
/*
* rq is about to be enqueued into new_bfqq,
* release rq reference on bfqq
*/
- bfq_put_queue(bfqq);
+ bfq_put_queue(old_bfqq);
rq->elv.priv[1] = new_bfqq;
- bfqq = new_bfqq;
}
bfq_update_io_thinktime(bfqd, bfqq);
--
2.39.2
From: Yu Kuai <yukuai3(a)huawei.com>
[ Upstream commit 73aeab373557fa6ee4ae0b742c6211ccd9859280 ]
Original state:
        Process 1       Process 2       Process 3       Process 4
         (BIC1)          (BIC2)          (BIC3)          (BIC4)
          Λ                |               |               |
           \--------------\ \-------------\ \-------------\|
                           V               V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               1               2               4
After commit 0e456dba86c7 ("block, bfq: choose the last bfqq from merge
chain in bfq_setup_cooperator()"), if P1 issues a new IO:
Without the patch:
        Process 1       Process 2       Process 3       Process 4
         (BIC1)          (BIC2)          (BIC3)          (BIC4)
          Λ                |               |               |
           \------------------------------\ \-------------\|
                                           V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               0               2               4
bfqq3 will be used to handle IO from P1. This is not expected; IO
should be redirected to bfqq4.
With the patch:
          -------------------------------------------
          |                                         |
        Process 1       Process 2       Process 3   |   Process 4
         (BIC1)          (BIC2)          (BIC3)     |    (BIC4)
                           |               |        |      |
                            \-------------\ \-------------\|
                                           V               V
          bfqq1--------->bfqq2---------->bfqq3----------->bfqq4
    ref    0               0               2               4
IO is redirected to bfqq4; however, the process reference of bfqq3 is
still 2, while there is only P2 using it.
Fix the problem by calling bfq_merge_bfqqs() for each bfqq in the merge
chain. Also change bfq_merge_bfqqs() to return new_bfqq to simplify
code.
Fixes: 0e456dba86c7 ("block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240909134154.954924-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bfq-iosched.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index bfce6343a577..8e797782cfe3 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3117,10 +3117,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_put_queue(bfqq);
}
-static void
-bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
- struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+static struct bfq_queue *bfq_merge_bfqqs(struct bfq_data *bfqd,
+ struct bfq_io_cq *bic,
+ struct bfq_queue *bfqq)
{
+ struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
bfq_log_bfqq(bfqd, bfqq, "merging with queue %lu",
(unsigned long)new_bfqq->pid);
/* Save weight raising and idle window of the merged queues */
@@ -3214,6 +3216,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
bfq_reassign_last_bfqq(bfqq, new_bfqq);
bfq_release_process_ref(bfqd, bfqq);
+
+ return new_bfqq;
}
static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
@@ -3249,14 +3253,8 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
* fulfilled, i.e., bic can be redirected to new_bfqq
* and bfqq can be put.
*/
- bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq,
- new_bfqq);
- /*
- * If we get here, bio will be queued into new_queue,
- * so use new_bfqq to decide whether bio and rq can be
- * merged.
- */
- bfqq = new_bfqq;
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, bfqd->bio_bic, bfqq);
/*
* Change also bqfd->bio_bfqq, as
@@ -5616,9 +5614,7 @@ bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
* state before killing it.
*/
bfqq->bic = bic;
- bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
-
- return new_bfqq;
+ return bfq_merge_bfqqs(bfqd, bic, bfqq);
}
/*
@@ -6066,6 +6062,7 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
bool waiting, idle_timer_disabled = false;
if (new_bfqq) {
+ struct bfq_queue *old_bfqq = bfqq;
/*
* Release the request's reference to the old bfqq
* and make sure one is taken to the shared queue.
@@ -6081,18 +6078,18 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
* then complete the merge and redirect it to
* new_bfqq.
*/
- if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
- bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
- bfqq, new_bfqq);
+ if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq) {
+ while (bfqq != new_bfqq)
+ bfqq = bfq_merge_bfqqs(bfqd, RQ_BIC(rq), bfqq);
+ }
- bfq_clear_bfqq_just_created(bfqq);
+ bfq_clear_bfqq_just_created(old_bfqq);
/*
* rq is about to be enqueued into new_bfqq,
* release rq reference on bfqq
*/
- bfq_put_queue(bfqq);
+ bfq_put_queue(old_bfqq);
rq->elv.priv[1] = new_bfqq;
- bfqq = new_bfqq;
}
bfq_update_io_thinktime(bfqd, bfqq);
--
2.39.2
From: Huacai Chen <chenhuacai(a)loongson.cn>
mainline inclusion
from mainline-v6.11-rc1
commit 7697a0fe0154468f5df35c23ebd7aa48994c2cdc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IAZ33N
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------
Chromium sandbox apparently wants to deny statx [1] so it could properly
inspect arguments after the sandboxed process later falls back to fstat.
Because there is currently no "fd-only" version of statx, the
sandbox has no way to ensure the path argument is empty without being
able to peek into the sandboxed process's memory. For architectures able
to do newfstatat though, glibc falls back to newfstatat after getting
-ENOSYS for statx, then the respective SIGSYS handler [2] takes care of
inspecting the path argument, transforming allowed newfstatat's into
fstat instead which is allowed and has the same type of return value.
But, as LoongArch is the first architecture to not have fstat nor
newfstatat, the LoongArch glibc does not attempt falling back at all
when it gets -ENOSYS for statx -- and you see the problem there!
Actually, back when the LoongArch port was under review, people were
aware of the same problem with sandboxing clone3 [3], so clone was
eventually kept. Unfortunately it seemed at that time no one had noticed
statx, so besides restoring fstat/newfstatat to LoongArch uapi (and
postponing the problem further), it seems inevitable that we would need
to tackle seccomp deep argument inspection.
However, this is obviously a decision that shouldn't be taken lightly,
so we just restore fstat/newfstatat by defining __ARCH_WANT_NEW_STAT
in unistd.h. This is the simplest solution for now, and so we hope the
community will tackle the long-standing problem of seccomp deep argument
inspection in the future [4][5].
Also add "newstat" to syscall_abis_64 in Makefile.syscalls due to
upstream asm-generic changes.
For more information, please read this thread [6].
[1] https://chromium-review.googlesource.com/c/chromium/src/+/2823150
[2] https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux…
[3] https://lore.kernel.org/linux-arch/20220511211231.GG7074@brightrain.aerifal…
[4] https://lwn.net/Articles/799557/
[5] https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-i…
[6] https://lore.kernel.org/loongarch/20240226-granit-seilschaft-eccc2433014d@b…
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
arch/loongarch/include/asm/unistd.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/include/asm/unistd.h b/arch/loongarch/include/asm/unistd.h
index cfddb0116a8c..f1d2b7e5c062 100644
--- a/arch/loongarch/include/asm/unistd.h
+++ b/arch/loongarch/include/asm/unistd.h
@@ -8,4 +8,5 @@
#include <uapi/asm/unistd.h>
+#define __ARCH_WANT_NEW_STAT
#define NR_syscalls (__NR_syscalls)
--
2.33.0
From: Barry Song <v-songbaohua(a)oppo.com>
Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
introduced an unconditional one-tick sleep when `swapcache_prepare()`
fails, which has led to reports of UI stuttering on latency-sensitive
Android devices. To address this, we can use a waitqueue to wake up
tasks that fail `swapcache_prepare()` sooner, instead of always
sleeping for a full tick. While tasks may occasionally be woken by an
unrelated `do_swap_page()`, this method is preferable to two scenarios:
rapid re-entry into page faults, which can cause livelocks, and
multiple millisecond sleeps, which visibly degrade user experience.
Oven's testing shows that a single waitqueue resolves the UI
stuttering issue. If a 'thundering herd' problem becomes apparent
later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]`
for page bit locks can be introduced.
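As a rough user-space analogue of "sleep for at most one tick, but wake earlier if someone releases the entry", the sketch below uses a POSIX condition variable with a deadline. It is not the kernel code: `wait_for_release()`, `release()` and the `released` flag are hypothetical names, and unlike the kernel patch (which has no predicate and tolerates unrelated wakeups) this version rechecks a flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq = PTHREAD_COND_INITIALIZER;
static bool released;

/*
 * Wait until release() has been called or timeout_ms has elapsed.
 * Returns true if the release was observed, false on timeout.
 */
static bool wait_for_release(long timeout_ms)
{
	struct timespec deadline;
	bool ok;
	int rc = 0;

	/* Condition variables use CLOCK_REALTIME deadlines by default. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&lock);
	while (!released && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&wq, &lock, &deadline);
	ok = released;
	pthread_mutex_unlock(&lock);
	return ok;
}

/* Counterpart of the wake_up() after swapcache_clear(). */
static void release(void)
{
	pthread_mutex_lock(&lock);
	released = true;
	pthread_mutex_unlock(&lock);
	pthread_cond_broadcast(&wq);
}
```

The timeout keeps the worst case identical to the old fixed sleep, while a wakeup shortens the common case.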
Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
Cc: Kairui Song <kasong(a)tencent.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Reported-by: Oven Liyang <liyangouwen1(a)oppo.com>
Tested-by: Oven Liyang <liyangouwen1(a)oppo.com>
Signed-off-by: Barry Song <v-songbaohua(a)oppo.com>
---
mm/memory.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 2366578015ad..6913174f7f41 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4192,6 +4192,8 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
+
/*
* We enter with non-exclusive mmap_lock (to exclude vma changes,
* but allow concurrent faults), and pte mapped but not yet locked.
@@ -4204,6 +4206,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
{
struct vm_area_struct *vma = vmf->vma;
struct folio *swapcache, *folio = NULL;
+ DECLARE_WAITQUEUE(wait, current);
struct page *page;
struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
@@ -4302,7 +4305,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
* Relax a bit to prevent rapid
* repeated page faults.
*/
+ add_wait_queue(&swapcache_wq, &wait);
schedule_timeout_uninterruptible(1);
+ remove_wait_queue(&swapcache_wq, &wait);
goto out_page;
}
need_clear_cache = true;
@@ -4609,8 +4614,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
/* Clear the swap cache pin for direct swapin after PTL unlock */
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry, nr_pages);
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
@@ -4625,8 +4632,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry, nr_pages);
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
--
2.34.1
Make sure that the tag_list_lock mutex is not held longer than necessary.
This change reduces latency if e.g. blk_mq_quiesce_tagset() is called
concurrently from more than one thread. This function is used by the
NVMe core and also by the UFS driver.
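The locking pattern can be illustrated with a small user-space sketch: mutate the shared list under the mutex, drop the mutex, and only then perform the slow wait. All names here are stand-ins chosen for illustration (a pthread mutex instead of the kernel mutex, a `bool` array instead of the request queue list), and `wait_quiesce_done()` merely records whether the lock was free while it ran.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t tag_list_lock = PTHREAD_MUTEX_INITIALIZER;
static bool lock_was_free_during_wait;

/*
 * Stand-in for the slow grace-period wait. trylock succeeds only if
 * nobody, including the caller, currently holds the mutex.
 */
static void wait_quiesce_done(void)
{
	if (pthread_mutex_trylock(&tag_list_lock) == 0) {
		lock_was_free_during_wait = true;
		pthread_mutex_unlock(&tag_list_lock);
	}
}

static void quiesce_tagset(bool *quiesced, size_t n)
{
	/* Only the list walk needs the lock. */
	pthread_mutex_lock(&tag_list_lock);
	for (size_t i = 0; i < n; i++)
		quiesced[i] = true;
	pthread_mutex_unlock(&tag_list_lock);

	/* The slow wait runs with the lock already released. */
	wait_quiesce_done();
}
```

With the wait outside the critical section, a second caller can start marking its queues while the first is still waiting.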
Reported-by: Peter Wang <peter.wang(a)mediatek.com>
Cc: Chao Leng <lengchao(a)huawei.com>
Cc: Ming Lei <ming.lei(a)redhat.com>
Cc: stable(a)vger.kernel.org
Fixes: 414dd48e882c ("blk-mq: add tagset quiesce interface")
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
block/blk-mq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4b2c8e940f59..1ef227dfb9ba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -283,8 +283,9 @@ void blk_mq_quiesce_tagset(struct blk_mq_tag_set *set)
if (!blk_queue_skip_tagset_quiesce(q))
blk_mq_quiesce_queue_nowait(q);
}
- blk_mq_wait_quiesce_done(set);
mutex_unlock(&set->tag_list_lock);
+
+ blk_mq_wait_quiesce_done(set);
}
EXPORT_SYMBOL_GPL(blk_mq_quiesce_tagset);
Gregory's modest proposal to fix CXL cxl_mem_probe() failures due to
delayed arrival of the CXL "root" infrastructure [1] prompted questions
of how the existing mechanism for retrying cxl_mem_probe() could be
failing.
The critical missing piece in the debug was that Gregory's setup had
almost all CXL modules built-in to the kernel.
On the way to that discovery several other bugs and init-order corner
cases were discovered.
The main fix is to make sure the drivers/cxl/Makefile object order
supports root CXL ports being fully initialized upon cxl_acpi_probe()
exit. The modular case has some similar potential holes that are fixed
with MODULE_SOFTDEP() and other fix ups. Finally, an attempt to update
cxl_test to reproduce the original report resulted in the discovery of a
separate long standing use after free bug in cxl_region_detach().
[1]: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
---
Dan Williams (5):
cxl/port: Fix CXL port initialization order when the subsystem is built-in
cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()
cxl/acpi: Ensure ports ready at cxl_acpi_probe() return
cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
cxl/test: Improve init-order fidelity relative to real-world systems
drivers/base/core.c | 35 +++++++
drivers/cxl/Kconfig | 1
drivers/cxl/Makefile | 12 +--
drivers/cxl/acpi.c | 7 +
drivers/cxl/core/hdm.c | 50 +++++++++--
drivers/cxl/core/port.c | 13 ++-
drivers/cxl/core/region.c | 48 +++-------
drivers/cxl/cxl.h | 3 -
include/linux/device.h | 3 +
tools/testing/cxl/test/cxl.c | 200 +++++++++++++++++++++++-------------------
tools/testing/cxl/test/mem.c | 1
11 files changed, 228 insertions(+), 145 deletions(-)
base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
The quilt patch titled
Subject: nilfs2: fix kernel bug due to missing clearing of buffer delay flag
has been removed from the -mm tree. Its filename was
nilfs2-fix-kernel-bug-due-to-missing-clearing-of-buffer-delay-flag.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix kernel bug due to missing clearing of buffer delay flag
Date: Wed, 16 Oct 2024 06:32:07 +0900
Syzbot reported that after nilfs2 reads a corrupted file system image and
degrades to read-only, the BUG_ON check for the buffer delay flag in
submit_bh_wbc() may fail, causing a kernel bug.
This is because the buffer delay flag is not cleared when clearing the
buffer state flags to discard a page/folio or a buffer head. So, fix
this.
This became necessary when the use of nilfs2's own page clear routine was
expanded. This state inconsistency does not occur if the buffer is
written normally by log writing.
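The effect of a flag missing from the clear mask can be shown with a minimal user-space sketch. The bit numbers below are illustrative and are not the kernel's enum values, and `clear_state()` is a hypothetical, non-atomic stand-in for what set_mask_bits(&bh->b_state, clear_bits, 0) achieves.

```c
#define BIT(n) (1UL << (n))

/* Illustrative bit numbers only; the kernel's enum values differ. */
enum { BH_Uptodate, BH_Dirty, BH_Mapped, BH_Delay };

/* Non-atomic model of clearing every bit in clear_bits from state. */
static unsigned long clear_state(unsigned long state,
				 unsigned long clear_bits)
{
	return state & ~clear_bits;
}

/* Mask without the delay flag, as before the fix. */
static const unsigned long old_clear_bits =
	BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped);

/* Mask including the delay flag, as after the fix. */
static const unsigned long new_clear_bits =
	BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) | BIT(BH_Delay);
```

A buffer that was dirty and delayed keeps its delay bit under the old mask, which is the stale state the BUG_ON check later trips over.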
Link: https://lkml.kernel.org/r/20241015213300.7114-1-konishi.ryusuke@gmail.com
Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+985ada84bf055a575c07(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=985ada84bf055a575c07
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/page.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/nilfs2/page.c~nilfs2-fix-kernel-bug-due-to-missing-clearing-of-buffer-delay-flag
+++ a/fs/nilfs2/page.c
@@ -77,7 +77,8 @@ void nilfs_forget_buffer(struct buffer_h
const unsigned long clear_bits =
(BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) |
BIT(BH_Async_Write) | BIT(BH_NILFS_Volatile) |
- BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected));
+ BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected) |
+ BIT(BH_Delay));
lock_buffer(bh);
set_mask_bits(&bh->b_state, clear_bits, 0);
@@ -406,7 +407,8 @@ void nilfs_clear_folio_dirty(struct foli
const unsigned long clear_bits =
(BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) |
BIT(BH_Async_Write) | BIT(BH_NILFS_Volatile) |
- BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected));
+ BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected) |
+ BIT(BH_Delay));
bh = head;
do {
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-kernel-bug-due-to-missing-clearing-of-checked-flag.patch
The following commit has been merged into the locking/core branch of tip:
Commit-ID: d7fe143cb115076fed0126ad8cf5ba6c3e575e43
Gitweb: https://git.kernel.org/tip/d7fe143cb115076fed0126ad8cf5ba6c3e575e43
Author: Ahmed Ehab <bottaawesome633(a)gmail.com>
AuthorDate: Sun, 25 Aug 2024 01:10:30 +03:00
Committer: Boqun Feng <boqun.feng(a)gmail.com>
CommitterDate: Thu, 17 Oct 2024 20:07:23 -07:00
locking/lockdep: Avoid creating new name string literals in lockdep_set_subclass()
Syzbot reports that a warning is triggered while searching for a lock
class in look_up_lock_class().
The cause is that lockdep_set_subclass() creates and uses a new name
string literal instead of the existing one. As a result, a lock instance
has a different name pointer than the previously registered one stored
in the lock class, and WARN_ONCE() is triggered in look_up_lock_class().
To fix this, change lockdep_set_subclass() to use the existing name
instead of a new one. No new name is then created by
lockdep_set_subclass(), and the warning is avoided.
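The difference between stringifying the macro argument and reusing the stored pointer can be reproduced in plain C. This is a reduced model with hypothetical names (`struct dep_map`, `init_map()`, `set_subclass_old()`, `set_subclass_new()`), not the lockdep implementation.

```c
#include <stddef.h>

struct dep_map {
	const char *name;
	int subclass;
};

struct lock {
	struct dep_map dep_map;
};

static void init_map(struct dep_map *m, const char *name, int sub)
{
	m->name = name;
	m->subclass = sub;
}

/* Old behaviour: #lock creates a fresh string literal at each use. */
#define set_subclass_old(lock, sub) \
	init_map(&(lock)->dep_map, #lock, sub)

/* New behaviour: reuse the name pointer already stored in the map. */
#define set_subclass_new(lock, sub) \
	init_map(&(lock)->dep_map, (lock)->dep_map.name, sub)
```

In this model the old macro replaces the registered name with the text of the argument expression, whereas the new macro leaves the pointer untouched, so a pointer comparison against the registered name keeps succeeding.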
[boqun: Reword the commit log to state the correct issue]
Reported-by: <syzbot+7f4a6f7f7051474e40ad(a)syzkaller.appspotmail.com>
Fixes: de8f5e4f2dc1f ("lockdep: Introduce wait-type checks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ahmed Ehab <bottaawesome633(a)gmail.com>
Signed-off-by: Boqun Feng <boqun.feng(a)gmail.com>
Link: https://lore.kernel.org/lkml/20240824221031.7751-1-bottaawesome633@gmail.co…
---
include/linux/lockdep.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 217f7ab..67964dc 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -173,7 +173,7 @@ static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
(lock)->dep_map.lock_type)
#define lockdep_set_subclass(lock, sub) \
- lockdep_init_map_type(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
+ lockdep_init_map_type(&(lock)->dep_map, (lock)->dep_map.name, (lock)->dep_map.key, sub,\
(lock)->dep_map.wait_type_inner, \
(lock)->dep_map.wait_type_outer, \
(lock)->dep_map.lock_type)
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed, said checks were wrapped in WARN_ON, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check, even though it is mindless
nonsense checking for a guarantee that isn't given, but drop the WARN.
Reword the commentary and do small tidy ups while here.
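The change boils down to moving one closing parenthesis so that only the file-type test is covered by the warning. The sketch below models this in user space; `warn_on()` is a hypothetical stand-in that counts every hit (unlike WARN_ON_ONCE(), which reports only once), and `reject_old()` / `reject_new()` are illustrative names.

```c
#include <stdbool.h>

static int warn_count;

/* Stand-in for WARN_ON_ONCE(): counts hits and returns the condition. */
static bool warn_on(bool cond)
{
	if (cond)
		warn_count++;
	return cond;
}

/* Before: both conditions sit inside the warning. */
static bool reject_old(bool is_regular, bool noexec)
{
	return warn_on(!is_regular || noexec);
}

/* After: only the file-type invariant warns; noexec rejects silently. */
static bool reject_new(bool is_regular, bool noexec)
{
	return warn_on(!is_regular) || noexec;
}
```

Both variants reject the same inputs; the difference is that a racy noexec change no longer produces a warning.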
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 6e5324c7e9b6..7144c541818f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -144,13 +144,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
fsnotify_open(file);
@@ -919,16 +917,16 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
@@ -938,7 +936,6 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (name->name[0] != '\0')
fsnotify_open(file);
-out:
return file;
exit:
--
2.34.1
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed, said checks were wrapped in WARN_ON, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check, even though it is mindless
nonsense checking for a guarantee that isn't given, but drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 26f0b79cb4f9..8395e7ff7b94 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -142,13 +142,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
fsnotify_open(file);
@@ -919,16 +917,16 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
@@ -938,7 +936,6 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (name->name[0] != '\0')
fsnotify_open(file);
-out:
return file;
exit:
--
2.34.1
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed, said checks were wrapped in WARN_ON, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check, even though it is mindless
nonsense checking for a guarantee that isn't given, but drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 65d3ebc24fd3..a42c9b8b070d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -141,13 +141,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
fsnotify_open(file);
@@ -927,16 +925,16 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
@@ -946,7 +944,6 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (name->name[0] != '\0')
fsnotify_open(file);
-out:
return file;
exit:
--
2.34.1
From: Mateusz Guzik <mjguzik(a)gmail.com>
[ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
of the previous implementation. They used to legitimately check for the
condition, but that got moved up in two commits:
633fb6ac3980 ("exec: move S_ISREG() check earlier")
0fd338b2d2cd ("exec: move path_noexec() check earlier")
Instead of being removed, said checks were wrapped in WARN_ON, which
has some debug value.
However, the spurious path_noexec check is racy, resulting in
unwarranted warnings should someone race with setting the noexec flag.
One can note there is more to perm-checking whether execve is allowed
and none of the conditions are guaranteed to still hold after they were
tested for.
Additionally this does not validate whether the code path did any perm
checking to begin with -- it will pass if the inode happens to be
regular.
Keep the redundant path_noexec() check, even though it is mindless
nonsense checking for a guarantee that isn't given, but drop the WARN.
Reword the commentary and do small tidy ups while here.
Signed-off-by: Mateusz Guzik <mjguzik(a)gmail.com>
Link: https://lore.kernel.org/r/20240805131721.765484-1-mjguzik@gmail.com
[brauner: keep redundant path_noexec() check]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
[cascardo: keep exit label and use it]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
fs/exec.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index f49b352a6032..7776209d98c1 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -143,13 +143,11 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
goto out;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * Check do_open_execat() for an explanation.
*/
error = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
error = -ENOEXEC;
@@ -925,23 +923,22 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
- goto out;
+ return file;
/*
- * may_open() has already checked for this, so it should be
- * impossible to trip now. But we need to be extra cautious
- * and check again at the very end too.
+ * In the past the regular type check was here. It moved to may_open() in
+ * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is
+ * an invariant that all non-regular files error out before we get here.
*/
err = -EACCES;
- if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
- path_noexec(&file->f_path)))
+ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) ||
+ path_noexec(&file->f_path))
goto exit;
err = deny_write_access(file);
if (err)
goto exit;
-out:
return file;
exit:
--
2.34.1
This is the start of the stable review cycle for the 5.15.169 release.
There are 82 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 23 Oct 2024 10:22:25 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.169-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.169-rc1
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
powerpc/mm: Always update max/min_low_pfn in mem_topology_setup()
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: propagate directory read errors from nilfs_find_entry()
Paolo Abeni <pabeni(a)redhat.com>
mptcp: prevent MPC handshake on port-based signal endpoints
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: fallback when MPTCP opts are dropped after 1st data
Paolo Abeni <pabeni(a)redhat.com>
tcp: fix mptcp DSS corruption due to large pmtu xmit
Paolo Abeni <pabeni(a)redhat.com>
mptcp: handle consistently DSS corruption
Geliang Tang <geliang.tang(a)suse.com>
mptcp: track and update contiguous data status
Marc Zyngier <maz(a)kernel.org>
irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
Sergey Matsievskiy <matsievskiysv(a)gmail.com>
pinctrl: ocelot: fix system hang on level based interrupts
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Clear CPU buffers after register restore in NMI return
Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
x86/entry_32: Do not clobber user EFLAGS.ZF
Zhang Rui <rui.zhang(a)intel.com>
x86/apic: Always explicitly disarm TSC-deadline timer
Nathan Chancellor <nathan(a)kernel.org>
x86/resctrl: Annotate get_mem_config() functions as __init
Takashi Iwai <tiwai(a)suse.de>
parport: Proper fix for array out-of-bounds access
Daniele Palmas <dnlplm(a)gmail.com>
USB: serial: option: add Telit FN920C04 MBIM compositions
Benjamin B. Frost <benjamin(a)geanix.com>
USB: serial: option: add support for Quectel EG916Q-GL
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Mitigate failed set dequeue pointer commands
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Fix incorrect stream context type macro
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Aaron Thompson <dev(a)aaront.org>
Bluetooth: Remove debugfs directory on module init failure
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Emil Gedenryd <emil.gedenryd(a)axis.com>
iio: light: opt3001: add missing full-scale range value
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix IIO device retrieval from embedded device
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: light: veml6030: fix ALS sensor resolution
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
Nikolay Kuratov <kniv(a)yandex-team.ru>
drm/vmwgfx: Handle surface check failure correctly
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/radeon: Fix encoder->possible_clones
Jens Axboe <axboe(a)kernel.dk>
io_uring/sqpoll: close race on waiting for sqring entries
Omar Sandoval <osandov(a)fb.com>
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Do not use UNTRAIN_RET with IBPB on entry
Johannes Wikner <kwikner(a)ethz.ch>
x86/bugs: Skip RSB fill at VMEXIT
Johannes Wikner <kwikner(a)ethz.ch>
x86/entry: Have entry_ibpb() invalidate return predictions
Johannes Wikner <kwikner(a)ethz.ch>
x86/cpufeatures: Add a IBPB_NO_RET BUG flag
Jim Mattson <jmattson(a)google.com>
x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
Michael Mueller <mimu(a)linux.ibm.com>
KVM: s390: Change virtual to physical address access in diag 0x258 handler
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices
Felix Moessbauer <felix.moessbauer(a)siemens.com>
io_uring/sqpoll: do not put cpumask on stack
Jens Axboe <axboe(a)kernel.dk>
io_uring/sqpoll: retain test for whether the CPU is valid
Felix Moessbauer <felix.moessbauer(a)siemens.com>
io_uring/sqpoll: do not allow pinning outside of cpuset
Wachowski, Karol <karol.wachowski(a)intel.com>
drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
Breno Leitao <leitao(a)debian.org>
KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()
Mikulas Patocka <mpatocka(a)redhat.com>
dm-crypt, dm-verity: disable tasklets
Johannes Berg <johannes.berg(a)intel.com>
wifi: mac80211: fix potential key use-after-free
Patrick Roy <roypat(a)amazon.co.uk>
secretmem: disable memfd_secret() if arch cannot set direct map
Liu Shixin <liushixin2(a)huawei.com>
mm/swapfile: skip HugeTLB pages for unuse_vma
OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
fat: fix uninitialized variable
Nianyao Tang <tangnianyao(a)huawei.com>
irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1
Oleksij Rempel <linux(a)rempel-privat.de>
net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Fix simulate_ldr*_literal()
Mark Rutland <mark.rutland(a)arm.com>
arm64: probes: Remove broken LDR (literal) uprobe support
Jinjie Ruan <ruanjinjie(a)huawei.com>
posix-clock: Fix missing timespec64 check in pc_clock_settime()
Wei Fang <wei.fang(a)nxp.com>
net: enetc: add missing static descriptor and inline keyword
Wei Fang <wei.fang(a)nxp.com>
net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
Jan Kara <jack(a)suse.cz>
udf: Fix bogus checksum computation in udf_rename()
Jan Kara <jack(a)suse.cz>
udf: Don't return bh from udf_expand_dir_adinicb()
Jan Kara <jack(a)suse.cz>
udf: Handle error when expanding directory
Jan Kara <jack(a)suse.cz>
udf: Remove old directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_link() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_mkdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_add_nondir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Implement adding of dir entries using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_unlink() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rmdir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert empty_dir() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_get_parent() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_lookup() to use new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Convert udf_readdir() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: Convert udf_rename() to new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Provide function to mark entry as deleted using new directory iteration code
Jan Kara <jack(a)suse.cz>
udf: Implement searching for directory entry using new iteration code
Jan Kara <jack(a)suse.cz>
udf: Move udf_expand_dir_adinicb() to its callsite
Jan Kara <jack(a)suse.cz>
udf: Convert udf_expand_dir_adinicb() to new directory iteration
Jan Kara <jack(a)suse.cz>
udf: New directory iteration code
Vasiliy Kovalev <kovalev(a)altlinux.org>
ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/kernel/probes/decode-insn.c | 16 +-
arch/arm64/kernel/probes/simulate-insn.c | 18 +-
arch/powerpc/mm/numa.c | 6 +-
arch/s390/kvm/diag.c | 2 +-
arch/x86/entry/entry.S | 5 +
arch/x86/entry/entry_32.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 4 +-
arch/x86/kernel/apic/apic.c | 14 +-
arch/x86/kernel/cpu/bugs.c | 32 +
arch/x86/kernel/cpu/common.c | 3 +
arch/x86/kernel/cpu/resctrl/core.c | 4 +-
block/blk-rq-qos.c | 2 +-
drivers/bluetooth/btusb.c | 13 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +
drivers/gpu/drm/radeon/radeon_encoders.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 1 +
drivers/iio/adc/Kconfig | 4 +
.../iio/common/hid-sensors/hid-sensor-trigger.c | 2 +-
drivers/iio/dac/Kconfig | 3 +
drivers/iio/light/opt3001.c | 4 +
drivers/iio/light/veml6030.c | 5 +-
drivers/iio/proximity/Kconfig | 2 +
drivers/iommu/intel/iommu.c | 4 +-
drivers/irqchip/irq-gic-v3-its.c | 26 +-
drivers/md/dm-crypt.c | 37 +-
drivers/net/ethernet/cadence/macb_main.c | 14 +-
drivers/net/ethernet/freescale/enetc/enetc.c | 2 +-
drivers/parport/procfs.c | 22 +-
drivers/pinctrl/pinctrl-ocelot.c | 8 +-
drivers/s390/char/sclp_vt220.c | 4 +-
drivers/usb/host/xhci-ring.c | 2 +-
drivers/usb/host/xhci.h | 2 +-
drivers/usb/serial/option.c | 8 +
fs/fat/namei_vfat.c | 2 +-
fs/nilfs2/dir.c | 50 +-
fs/nilfs2/namei.c | 39 +-
fs/nilfs2/nilfs.h | 2 +-
fs/udf/dir.c | 148 +--
fs/udf/directory.c | 594 ++++++++---
fs/udf/inode.c | 90 --
fs/udf/namei.c | 1038 +++++++-------------
fs/udf/udfdecl.h | 45 +-
include/linux/fsl/enetc_mdio.h | 3 +-
include/linux/irqchip/arm-gic-v4.h | 4 +-
io_uring/io_uring.c | 21 +-
kernel/time/posix-clock.c | 3 +
mm/secretmem.c | 4 +-
mm/swapfile.c | 2 +-
net/bluetooth/af_bluetooth.c | 1 +
net/ipv4/tcp_output.c | 2 +-
net/mac80211/cfg.c | 3 +
net/mac80211/key.c | 2 +-
net/mptcp/mib.c | 3 +
net/mptcp/mib.h | 3 +
net/mptcp/pm_netlink.c | 3 +-
net/mptcp/protocol.c | 23 +-
net/mptcp/protocol.h | 2 +
net/mptcp/subflow.c | 19 +-
sound/pci/hda/patch_conexant.c | 19 +
virt/kvm/kvm_main.c | 5 +-
61 files changed, 1172 insertions(+), 1242 deletions(-)
From: Christoph Hellwig <hch(a)lst.de>
upstream 936e114a245b6e38e0dbf706a67e7611fc993da1 commit.
Move the ki_pos update down a bit to prepare for a better common helper
that invalidates pages based on an iocb.
Link: https://lkml.kernel.org/r/20230601145904.1385409-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Andreas Gruenbacher <agruenba(a)redhat.com>
Cc: Anna Schumaker <anna(a)kernel.org>
Cc: Chao Yu <chao(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Ilya Dryomov <idryomov(a)gmail.com>
Cc: Jaegeuk Kim <jaegeuk(a)kernel.org>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Miklos Szeredi <miklos(a)szeredi.hu>
Cc: Miklos Szeredi <mszeredi(a)redhat.com>
Cc: Theodore Ts'o <tytso(a)mit.edu>
Cc: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Cc: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Mahmoud Adam <mngyadam(a)amazon.com>
---
This fixes the ext3/4 data corruption caused by dde4c1e1663b6 ("ext4:
properly sync file size update after O_SYNC direct IO"), as
reported here: https://lore.kernel.org/all/2024102130-thieving-parchment-7885@gregkh/T/#mf…
fs/iomap/direct-io.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 933f234d5becd..8a49c0d3a7b46 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -93,7 +93,6 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
if (offset + ret > dio->i_size &&
!(dio->flags & IOMAP_DIO_WRITE))
ret = dio->i_size - offset;
- iocb->ki_pos += ret;
}
/*
@@ -119,15 +118,18 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
}
inode_dio_end(file_inode(iocb->ki_filp));
- /*
- * If this is a DSYNC write, make sure we push it to stable storage now
- * that we've written data.
- */
- if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
- ret = generic_write_sync(iocb, ret);
- kfree(dio);
+ if (ret > 0) {
+ iocb->ki_pos += ret;
+ /*
+ * If this is a DSYNC write, make sure we push it to stable
+ * storage now that we've written data.
+ */
+ if (dio->flags & IOMAP_DIO_NEED_SYNC)
+ ret = generic_write_sync(iocb, ret);
+ }
+ kfree(dio);
return ret;
}
EXPORT_SYMBOL_GPL(iomap_dio_complete);
--
2.40.1
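For readers following the reordering above, here is a minimal userspace sketch of the control flow after the patch. It is not kernel code: struct dio_sketch and both functions are hypothetical stand-ins, and the fake sync helper only records the file position it observes. The position is advanced first, and only for a positive result, so the sync step sees the updated position.

```c
#include <stddef.h>

/* Hypothetical stand-in for the iocb/dio fields touched by the patch. */
struct dio_sketch {
	long long pos;          /* models iocb->ki_pos */
	int need_sync;          /* models the IOMAP_DIO_NEED_SYNC flag */
	int sync_calls;         /* how often the fake sync helper ran */
	long long pos_at_sync;  /* position the fake sync helper observed */
};

/* Fake generic_write_sync(): records what it sees, returns ret unchanged. */
static long dio_sketch_write_sync(struct dio_sketch *dio, long ret)
{
	dio->sync_calls++;
	dio->pos_at_sync = dio->pos;
	return ret;
}

/* Mirrors the post-patch ordering: advance the position, then sync. */
static long dio_sketch_complete(struct dio_sketch *dio, long ret)
{
	if (ret > 0) {
		dio->pos += ret;
		if (dio->need_sync)
			ret = dio_sketch_write_sync(dio, ret);
	}
	return ret;
}
```

With a starting position of 100 and a 50-byte write, the fake sync helper observes position 150. An error return leaves the position untouched and skips the sync.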
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
remap_file_pages()") fixed a security issue by adding an LSM check when
remapping file pages, so that LSMs have the opportunity to evaluate the
action as they do for other memory operations such as mmap() and mprotect().
However, that commit called security_mmap_file() inside the mmap_lock lock,
while the other calls do it before taking the lock, after commit
8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
This caused a lock inversion issue with IMA, which takes the mmap_lock
and i_mutex locks in the opposite order, when the remap_file_pages()
system call was called.
Solve the issue by splitting the critical region in remap_file_pages() in
two regions: the first takes a read lock of mmap_lock, retrieves the VMA
and the file descriptor associated, and calculates the 'prot' and 'flags'
variables; the second takes a write lock on mmap_lock, checks that the VMA
flags and the VMA file descriptor are the same as the ones obtained in the
first critical region (otherwise the system call fails), and calls
do_mmap().
In between, after releasing the read lock and before taking the write lock,
call security_mmap_file(), which resolves the lock inversion issue.
Cc: stable(a)vger.kernel.org # v6.12-rcx
Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
Reported-by: syzbot+1cd571a672400ef3a930(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.…
Reviewed-by: Roberto Sassu <roberto.sassu(a)huawei.com>
Reviewed-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Tested-by: Roberto Sassu <roberto.sassu(a)huawei.com>
Tested-by: syzbot+1cd571a672400ef3a930(a)syzkaller.appspotmail.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
---
mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 52 insertions(+), 17 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 9c0fb43064b5..f731dd69e162 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
unsigned long populate = 0;
unsigned long ret = -EINVAL;
struct file *file;
+ vm_flags_t vm_flags;
pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
current->comm, current->pid);
@@ -1656,12 +1657,60 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
if (pgoff + (size >> PAGE_SHIFT) < pgoff)
return ret;
- if (mmap_write_lock_killable(mm))
+ if (mmap_read_lock_killable(mm))
return -EINTR;
+ /*
+ * Look up VMA under read lock first so we can perform the security
+ * without holding locks (which can be problematic). We reacquire a
+ * write lock later and check nothing changed underneath us.
+ */
vma = vma_lookup(mm, start);
- if (!vma || !(vma->vm_flags & VM_SHARED))
+ if (!vma || !(vma->vm_flags & VM_SHARED)) {
+ mmap_read_unlock(mm);
+ return -EINVAL;
+ }
+
+ prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
+ prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
+ prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
+
+ flags &= MAP_NONBLOCK;
+ flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
+ if (vma->vm_flags & VM_LOCKED)
+ flags |= MAP_LOCKED;
+
+ /* Save vm_flags used to calculate prot and flags, and recheck later. */
+ vm_flags = vma->vm_flags;
+ file = get_file(vma->vm_file);
+
+ mmap_read_unlock(mm);
+
+ /* Call outside mmap_lock to be consistent with other callers. */
+ ret = security_mmap_file(file, prot, flags);
+ if (ret) {
+ fput(file);
+ return ret;
+ }
+
+ ret = -EINVAL;
+
+ /* OK security check passed, take write lock + let it rip. */
+ if (mmap_write_lock_killable(mm)) {
+ fput(file);
+ return -EINTR;
+ }
+
+ vma = vma_lookup(mm, start);
+
+ if (!vma)
+ goto out;
+
+ /* Make sure things didn't change under us. */
+ if (vma->vm_flags != vm_flags)
+ goto out;
+ if (vma->vm_file != file)
goto out;
if (start + size > vma->vm_end) {
@@ -1689,25 +1738,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
goto out;
}
- prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
- prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
- prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
-
- flags &= MAP_NONBLOCK;
- flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
- if (vma->vm_flags & VM_LOCKED)
- flags |= MAP_LOCKED;
-
- file = get_file(vma->vm_file);
- ret = security_mmap_file(vma->vm_file, prot, flags);
- if (ret)
- goto out_fput;
ret = do_mmap(vma->vm_file, start, size,
prot, flags, 0, pgoff, &populate, NULL);
-out_fput:
- fput(file);
out:
mmap_write_unlock(mm);
+ fput(file);
if (populate)
mm_populate(ret, populate);
if (!IS_ERR_VALUE(ret))
--
2.34.1
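The three-phase scheme described in the commit message (snapshot under the read lock, run the hook unlocked, revalidate under the write lock) can be illustrated with a small single-threaded userspace model. Everything here is an assumption made for illustration: the lockdemo_* names are hypothetical, the "lock" is just a depth counter, and a hook that mutates the mapping stands in for a concurrent change between the two critical regions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a VMA: only the fields the recheck compares. */
struct lockdemo_vma {
	unsigned long flags;
	const void *file;
};

struct lockdemo_mm {
	struct lockdemo_vma *vma;  /* single mapping, NULL if none */
	int lock_depth;            /* stand-in for mmap_lock; 0 == unlocked */
};

typedef int (*lockdemo_hook_t)(struct lockdemo_mm *mm, const void *file);

/* Hook that passes, but complains if it is called with the lock held. */
static int lockdemo_hook_allow(struct lockdemo_mm *mm, const void *file)
{
	(void)file;
	return mm->lock_depth ? -EDEADLK : 0;
}

/* Hook that rejects the operation. */
static int lockdemo_hook_deny(struct lockdemo_mm *mm, const void *file)
{
	(void)mm;
	(void)file;
	return -EACCES;
}

/* Hook that passes but simulates a concurrent change to the mapping. */
static int lockdemo_hook_racy(struct lockdemo_mm *mm, const void *file)
{
	(void)file;
	mm->vma->flags ^= 1UL;
	return 0;
}

static int lockdemo_remap(struct lockdemo_mm *mm, lockdemo_hook_t hook)
{
	unsigned long flags;
	const void *file;
	int ret;

	/* Phase 1: "read lock", snapshot what the hook needs. */
	mm->lock_depth++;
	if (!mm->vma) {
		mm->lock_depth--;
		return -EINVAL;
	}
	flags = mm->vma->flags;
	file = mm->vma->file;
	mm->lock_depth--;

	/* Phase 2: the hook runs with no lock held. */
	ret = hook(mm, file);
	if (ret)
		return ret;

	/* Phase 3: "write lock", act only if the snapshot is still valid. */
	mm->lock_depth++;
	ret = -EINVAL;
	if (mm->vma && mm->vma->flags == flags && mm->vma->file == file)
		ret = 0;  /* the real syscall would call do_mmap() here */
	mm->lock_depth--;
	return ret;
}
```

The price of running the hook unlocked is that the operation can now fail with -EINVAL when the mapping changes in the window between the two phases, which the racy hook demonstrates.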
During the aborting of a command, the software receives a command
completion event for the command ring stopped, with the TRB pointing
to the next TRB after the aborted command.
If the command we abort is located just before the Link TRB in the
command ring, then during the 'command ring stopped' completion event,
the xHC gives the Link TRB in the event's cmd DMA, which causes a
mismatch in handling the command completion event.
To handle this situation, an additional check has been added to ignore
the mismatch error and continue the operation.
Fixes: 7f84eef0dafb ("USB: xhci: No-op command queueing and irq handler.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Faisal Hassan <quic_faisalh(a)quicinc.com>
---
Changes in v2:
- Removed traversing of TRBs with in_range() API.
- Simplified the if condition check.
v1 link:
https://lore.kernel.org/all/20241018195953.12315-1-quic_faisalh@quicinc.com
drivers/usb/host/xhci-ring.c | 43 +++++++++++++++++++++++++++++++-----
1 file changed, 38 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b2950c35c740..de375c9f08ca 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -126,6 +126,29 @@ static void inc_td_cnt(struct urb *urb)
urb_priv->num_tds_done++;
}
+/*
+ * Return true if the DMA is pointing to a Link TRB in the ring;
+ * otherwise, return false.
+ */
+static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
+{
+ struct xhci_segment *seg;
+ union xhci_trb *trb;
+
+ seg = ring->first_seg;
+ do {
+ if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE)) {
+ /* found the TRB, check if it's link */
+ trb = &seg->trbs[(dma - seg->dma) / sizeof(*trb)];
+ return trb_is_link(trb);
+ }
+
+ seg = seg->next;
+ } while (seg != ring->first_seg);
+
+ return false;
+}
+
static void trb_to_noop(union xhci_trb *trb, u32 noop_type)
{
if (trb_is_link(trb)) {
@@ -1718,6 +1741,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
trace_xhci_handle_command(xhci->cmd_ring, &cmd_trb->generic);
+ cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
cmd_dequeue_dma = xhci_trb_virt_to_dma(xhci->cmd_ring->deq_seg,
cmd_trb);
/*
@@ -1725,17 +1749,26 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
* command.
*/
if (!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) {
- xhci_warn(xhci,
- "ERROR mismatched command completion event\n");
- return;
+ /*
+ * For the 'command ring stopped' completion event, there
+ * is a risk of a mismatch in dequeue pointers if we abort
+ * the command just before the link TRB in the command ring.
+ * In this scenario, the cmd_dma in the event would point
+ * to a link TRB, while the software dequeue pointer circles
+ * back to the start.
+ */
+ if (!(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
+ is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {
+ xhci_warn(xhci,
+ "ERROR mismatched command completion event\n");
+ return;
+ }
}
cmd = list_first_entry(&xhci->cmd_list, struct xhci_command, cmd_list);
cancel_delayed_work(&xhci->cmd_timer);
- cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
-
/* If CMD ring stopped we own the trbs between enqueue and dequeue */
if (cmd_comp_code == COMP_COMMAND_RING_STOPPED) {
complete_all(&xhci->cmd_ring_stop_completion);
--
2.17.1
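As an illustration of the lookup that is_dma_link_trb() performs, the following userspace sketch walks a circular list of segments, finds the one whose address range contains the given DMA address, and reports whether the TRB at that offset is a link TRB. The ring_sketch_* types, the four-TRB segment size and the boolean is_link field are simplifications assumed for the example; only the 16-byte TRB size is taken from xHCI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SKETCH_TRBS_PER_SEG 4    /* assumed, far smaller than a real ring */
#define RING_SKETCH_TRB_SIZE     16u  /* xHCI TRBs are 16 bytes */

/* Hypothetical TRB: the real one encodes its type in a control word. */
struct ring_sketch_trb {
	bool is_link;
};

struct ring_sketch_seg {
	uint64_t dma;                  /* bus address of trbs[0] */
	struct ring_sketch_trb trbs[RING_SKETCH_TRBS_PER_SEG];
	struct ring_sketch_seg *next;  /* circular list of segments */
};

/* Return true only if dma falls inside the ring and names a link TRB. */
static bool ring_sketch_is_dma_link_trb(const struct ring_sketch_seg *first,
					uint64_t dma)
{
	const uint64_t seg_bytes =
		(uint64_t)RING_SKETCH_TRBS_PER_SEG * RING_SKETCH_TRB_SIZE;
	const struct ring_sketch_seg *seg = first;

	do {
		if (dma >= seg->dma && dma - seg->dma < seg_bytes) {
			size_t idx = (size_t)((dma - seg->dma) / RING_SKETCH_TRB_SIZE);

			return seg->trbs[idx].is_link;
		}
		seg = seg->next;
	} while (seg != first);

	return false;
}
```

An address outside every segment returns false, so the caller still reports the mismatch for a genuinely bogus completion event.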
From: Johannes Berg <johannes.berg(a)intel.com>
When we free wdev->cqm_config when unregistering, we also
need to clear out the pointer since the same wdev/netdev
may get re-registered in another network namespace, then
destroyed later, running this code again, which results in
a double-free.
Reported-by: syzbot+36218cddfd84b5cc263e(a)syzkaller.appspotmail.com
Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
Cc: stable(a)vger.kernel.org # 6.6+
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
---
net/wireless/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 4c8d8f167409..d3c7b7978f00 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1280,6 +1280,7 @@ static void _cfg80211_unregister_wdev(struct wireless_dev *wdev,
/* deleted from the list, so can't be found from nl80211 any more */
cqm_config = rcu_access_pointer(wdev->cqm_config);
kfree_rcu(cqm_config, rcu_head);
+ RCU_INIT_POINTER(wdev->cqm_config, NULL);
/*
* Ensure that all events have been processed and
--
2.47.0
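The double free described above follows a common pattern: a teardown path that can run twice on the same object must clear the pointer it frees. A minimal userspace sketch, assuming plain malloc()/free() in place of RCU (so the deferred-free aspect of kfree_rcu() is not modelled) and with hypothetical wdev_sketch names:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the wireless_dev field of interest. */
struct wdev_sketch {
	int *cqm_config;
};

static int wdev_sketch_real_frees;  /* counts frees of a non-NULL pointer */

/* May be called more than once for the same device. */
static void wdev_sketch_unregister(struct wdev_sketch *wdev)
{
	int *cfg = wdev->cqm_config;

	if (cfg)
		wdev_sketch_real_frees++;
	free(cfg);                /* free(NULL) is a no-op */
	wdev->cqm_config = NULL;  /* the step the patch adds */
}
```

Without the final assignment, a second call would pass the stale pointer to free() again.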
daddr can be NULL if there is no neighbour table entry present,
in that case the tx packet should be dropped.
saddr will normally be set by MCTP core, but in case it is NULL it
should be set to the device address.
Incorrect indent of the function arguments is also fixed.
Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Cc: stable(a)vger.kernel.org
Reported-by: Dung Cao <dung(a)os.amperecomputing.com>
Signed-off-by: Matt Johnston <matt(a)codeconstruct.com.au>
---
Changes in v2:
- Set saddr to device address if NULL, mention in commit message
- Fix patch prefix formatting
- Link to v1: https://lore.kernel.org/r/20241018-mctp-i2c-null-dest-v1-1-ba1ab52966e9@cod…
---
drivers/net/mctp/mctp-i2c.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/mctp/mctp-i2c.c b/drivers/net/mctp/mctp-i2c.c
index 4dc057c121f5d0fb9c9c48bf16b6933ae2f7b2ac..c909254e03c21518c17daf8b813e610558e074c1 100644
--- a/drivers/net/mctp/mctp-i2c.c
+++ b/drivers/net/mctp/mctp-i2c.c
@@ -579,7 +579,7 @@ static void mctp_i2c_flow_release(struct mctp_i2c_dev *midev)
static int mctp_i2c_header_create(struct sk_buff *skb, struct net_device *dev,
unsigned short type, const void *daddr,
- const void *saddr, unsigned int len)
+ const void *saddr, unsigned int len)
{
struct mctp_i2c_hdr *hdr;
struct mctp_hdr *mhdr;
@@ -588,8 +588,15 @@ static int mctp_i2c_header_create(struct sk_buff *skb, struct net_device *dev,
if (len > MCTP_I2C_MAXMTU)
return -EMSGSIZE;
- lldst = *((u8 *)daddr);
- llsrc = *((u8 *)saddr);
+ if (daddr)
+ lldst = *((u8 *)daddr);
+ else
+ return -EINVAL;
+
+ if (saddr)
+ llsrc = *((u8 *)saddr);
+ else
+ llsrc = *((u8 *)dev->dev_addr);
skb_push(skb, sizeof(struct mctp_i2c_hdr));
skb_reset_mac_header(skb);
---
base-commit: cb560795c8c2ceca1d36a95f0d1b2eafc4074e37
change-id: 20241018-mctp-i2c-null-dest-a0ba271e0c48
Best regards,
--
Matt Johnston <matt(a)codeconstruct.com.au>
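The address handling described in the commit message can be sketched in userspace as follows. This is an illustration under assumptions, not the driver code: the hdr_sketch_* types are hypothetical, link-layer addresses are modelled as single bytes, and the fallback reads the first byte of the device address.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: only the link-layer address matters here. */
struct hdr_sketch_dev {
	const uint8_t *dev_addr;
};

struct hdr_sketch_hdr {
	uint8_t dst;
	uint8_t src;
};

static int hdr_sketch_create(const struct hdr_sketch_dev *dev,
			     const void *daddr, const void *saddr,
			     struct hdr_sketch_hdr *out)
{
	/* No neighbour entry: there is nowhere to send, drop the packet. */
	if (!daddr)
		return -EINVAL;

	out->dst = *(const uint8_t *)daddr;
	/* A missing source falls back to the device's own address. */
	out->src = saddr ? *(const uint8_t *)saddr : dev->dev_addr[0];
	return 0;
}
```

Note that the fallback dereferences the device address to obtain a one-byte value rather than storing the pointer itself.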
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2b0f922323ccfa76219bcaacd35cd50aeaa13592
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101837-mammogram-headsman-2dec@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b0f922323ccfa76219bcaacd35cd50aeaa13592 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Fri, 11 Oct 2024 12:24:45 +0200
Subject: [PATCH] mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma
We (or rather, readahead logic :) ) might be allocating a THP in the
pagecache and then try mapping it into a process that explicitly disabled
THP: we might end up installing PMD mappings.
This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
starting the VM.
For example, starting a VM backed on a file system with large folios
supported makes the VM crash when the VM tries accessing such a mapping
using KVM.
Is it also a problem when the HW disabled THP using
TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case
without X86_FEATURE_PSE.
In the future, we might be able to do better on s390x and only disallow
PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
really wants. For now, fix it by essentially performing the same check as
would be done in __thp_vma_allowable_orders() or in shmem code, where this
works as expected, and disallow PMD mappings, making us fall back to PTE
mappings.
Link: https://lkml.kernel.org/r/20241011102445.934409-3-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Cc: Thomas Huth <thuth(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index c0869a962ddd..30feedabc932 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4920,6 +4920,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
pmd_t entry;
vm_fault_t ret = VM_FAULT_FALLBACK;
+ /*
+ * It is too late to allocate a small folio, we already have a large
+ * folio in the pagecache: especially s390 KVM cannot tolerate any
+ * PMD mappings, but PTE-mapped THP are fine. So let's simply refuse any
+ * PMD mappings if THPs are disabled.
+ */
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags))
+ return ret;
+
if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
return ret;
The Voltorb device uses a speaker codec different from the original
Corsola device. When the Voltorb device tree was first added, the new
codec was added as a separate node when it should have just replaced the
existing one.
Merge the two nodes. The only differences are the compatible string and
the GPIO line property name. This keeps the device node path for the
speaker codec the same across the MT8186 Chromebook line. Also rename
the related labels and node names from rt1019p to speaker codec.
Cc: <stable(a)vger.kernel.org> # v6.11+
Signed-off-by: Chen-Yu Tsai <wenst(a)chromium.org>
---
This is technically not a fix, but having the same device tree structure
in stable kernels would be more consistent for consumers of the device
tree. Hence the request for a stable backport.
Changes since v1:
- Dropped Fixes tag, since this is technically a cleanup, not a fix
- Rename existing rt1019p related node names and labels to the generic
"speaker codec" name
---
.../dts/mediatek/mt8186-corsola-voltorb.dtsi | 21 +++++--------------
.../boot/dts/mediatek/mt8186-corsola.dtsi | 8 +++----
2 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi b/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi
index 52ec58128d56..b495a241b443 100644
--- a/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8186-corsola-voltorb.dtsi
@@ -10,12 +10,6 @@
/ {
chassis-type = "laptop";
-
- max98360a: max98360a {
- compatible = "maxim,max98360a";
- sdmode-gpios = <&pio 150 GPIO_ACTIVE_HIGH>;
- #sound-dai-cells = <0>;
- };
};
&cpu6 {
@@ -59,19 +53,14 @@ &cluster1_opp_15 {
opp-hz = /bits/ 64 <2200000000>;
};
-&rt1019p{
- status = "disabled";
-};
-
&sound {
compatible = "mediatek,mt8186-mt6366-rt5682s-max98360-sound";
- status = "okay";
+};
- spk-hdmi-playback-dai-link {
- codec {
- sound-dai = <&it6505dptx>, <&max98360a>;
- };
- };
+&speaker_codec {
+ compatible = "maxim,max98360a";
+ sdmode-gpios = <&pio 150 GPIO_ACTIVE_HIGH>;
+ /delete-property/ sdb-gpios;
};
&spmi {
diff --git a/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi b/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi
index c7580ac1e2d4..cf288fe7a238 100644
--- a/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8186-corsola.dtsi
@@ -259,15 +259,15 @@ spk-hdmi-playback-dai-link {
mediatek,clk-provider = "cpu";
/* RT1019P and IT6505 connected to the same I2S line */
codec {
- sound-dai = <&it6505dptx>, <&rt1019p>;
+ sound-dai = <&it6505dptx>, <&speaker_codec>;
};
};
};
- rt1019p: speaker-codec {
+ speaker_codec: speaker-codec {
compatible = "realtek,rt1019p";
pinctrl-names = "default";
- pinctrl-0 = <&rt1019p_pins_default>;
+ pinctrl-0 = <&speaker_codec_pins_default>;
#sound-dai-cells = <0>;
sdb-gpios = <&pio 150 GPIO_ACTIVE_HIGH>;
};
@@ -1195,7 +1195,7 @@ pins {
};
};
- rt1019p_pins_default: rt1019p-default-pins {
+ speaker_codec_pins_default: speaker-codec-default-pins {
pins-sdb {
pinmux = <PINMUX_GPIO150__FUNC_GPIO150>;
output-low;
--
2.47.0.rc1.288.g06298d1525-goog
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 963756aac1f011d904ddd9548ae82286d3a91f96
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101835-eloquent-could-27ce@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 963756aac1f011d904ddd9548ae82286d3a91f96 Mon Sep 17 00:00:00 2001
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Date: Fri, 11 Oct 2024 12:24:44 +0200
Subject: [PATCH] mm: huge_memory: add vma_thp_disabled() and
thp_disabled_by_hw()
Patch series "mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma".
During testing, it was found that we can get PMD mappings in processes
where THP (and more precisely, PMD mappings) are supposed to be disabled.
While it works as expected for anon+shmem, the pagecache is the
problematic bit.
For s390 KVM this currently means that a VM backed by a file located on a
filesystem with large folio support can crash when KVM tries accessing the
problematic page, because the readahead logic might decide to use a
PMD-sized THP and faulting it into the page tables will install a PMD
mapping, something that s390 KVM cannot tolerate.
This might also be a problem with HW that does not support PMD mappings,
but I did not try reproducing it.
Fix it by respecting the ways to disable THPs when deciding whether we can
install a PMD mapping. khugepaged should already be taking care of not
collapsing if THPs are effectively disabled for the hw/process/vma.
This patch (of 2):
Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
shmem_allowable_huge_orders() and __thp_vma_allowable_orders().
[david(a)redhat.com: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
Link: https://lkml.kernel.org/r/20241011102445.934409-2-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Boqiao Fu <bfu(a)redhat.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 67d0ab3c3bba..ef5b80e48599 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -322,6 +322,24 @@ struct thpsize {
(transparent_hugepage_flags & \
(1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
+static inline bool vma_thp_disabled(struct vm_area_struct *vma,
+ unsigned long vm_flags)
+{
+ /*
+ * Explicitly disabled through madvise or prctl, or some
+ * architectures may disable THP for some mappings, for
+ * example, s390 kvm.
+ */
+ return (vm_flags & VM_NOHUGEPAGE) ||
+ test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags);
+}
+
+static inline bool thp_disabled_by_hw(void)
+{
+ /* If the hardware/firmware marked hugepage support disabled. */
+ return transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED);
+}
+
unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 87b49ecc7b1e..2fb328880b50 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -109,18 +109,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
if (!vma->vm_mm) /* vdso */
return 0;
- /*
- * Explicitly disabled through madvise or prctl, or some
- * architectures may disable THP for some mappings, for
- * example, s390 kvm.
- * */
- if ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
- return 0;
- /*
- * If the hardware/firmware marked hugepage support disabled.
- */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags))
return 0;
/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
diff --git a/mm/shmem.c b/mm/shmem.c
index 4f11b5506363..c5adb987b23c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1664,12 +1664,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
loff_t i_size;
int order;
- if (vma && ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
- return 0;
-
- /* If the hardware/firmware marked hugepage support disabled. */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags)))
return 0;
global_huge = shmem_huge_global_enabled(inode, index, write_end,
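The shmem hunk above keeps the NULL-vma guard while reusing the new helpers. A minimal userspace sketch of that combined condition follows. All names here (model_mm, model_vma, MODEL_VM_NOHUGEPAGE, model_shmem_thp_blocked) are hypothetical stand-ins and not kernel definitions; the hardware check is reduced to a plain boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VM_NOHUGEPAGE; the value is arbitrary. */
#define MODEL_VM_NOHUGEPAGE 0x1UL

/* Hypothetical model types, not the kernel's mm_struct/vm_area_struct. */
struct model_mm {
	bool disable_thp;	/* models the MMF_DISABLE_THP bit */
};

struct model_vma {
	struct model_mm *mm;
};

/* Models vma_thp_disabled(): per-mapping or per-process opt-out. */
static bool model_vma_thp_disabled(const struct model_vma *vma,
				   unsigned long vm_flags)
{
	return (vm_flags & MODEL_VM_NOHUGEPAGE) || vma->mm->disable_thp;
}

/*
 * Models the condition used in the shmem hunk: the hardware check applies
 * unconditionally, the vma check only when a vma is present.
 */
static bool model_shmem_thp_blocked(bool hw_unsupported,
				    const struct model_vma *vma,
				    unsigned long vm_flags)
{
	return hw_unsupported || (vma && model_vma_thp_disabled(vma, vm_flags));
}
```

In this model, as in the hunk, vm_flags is ignored when no vma is passed, which matches the behaviour of the code that was removed.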
Hi,
This is a backport of the series [1] for the 6.6.x stable kernel.
It fixes a user-reported [2] NULL dereference kernel panic that occurs
after updating the SOF firmware and topology files.
While Meteor Lake is not supported by the 6.6 kernel, users might need to
use 6.6 as a stepping stone when updating to a kernel version which
supports Meteor Lake (6.7+), so a kernel panic should be avoided in order
not to block the transition.
[1] https://lore.kernel.org/alsa-devel/20230919103115.30783-1-peter.ujfalusi@li…
[2] https://github.com/thesofproject/sof/issues/9600
Regards,
Peter
---
Peter Ujfalusi (3):
ASoC: SOF: ipc4-topology: Add definition for generic switch/enum
control
ASoC: SOF: ipc4-control: Add support for ALSA switch control
ASoC: SOF: ipc4-control: Add support for ALSA enum control
sound/soc/sof/ipc4-control.c | 175 +++++++++++++++++++++++++++++++++-
sound/soc/sof/ipc4-topology.c | 49 +++++++++-
sound/soc/sof/ipc4-topology.h | 19 +++-
3 files changed, 237 insertions(+), 6 deletions(-)
--
2.47.0
[CCing Greg and the stable list, to ensure he is aware of this, as well
as the regressions list]
On 21.10.24 11:45, Pablo Neira Ayuso wrote:
> - There is no NFPROTO_IPV6 family for mark and NFLOG.
> - TRACE is also missing module autoload with NFPROTO_IPV6.
>
> This results in ip6tables failing to restore a ruleset. This issue has been
> reported by several users providing incomplete patches.
>
> Very similar to Ilya Katsnelson's patch including a missing chunk in the
> TRACE extension.
>
> Fixes: 0bfcb7b71e73 ("netfilter: xtables: avoid NFPROTO_UNSPEC where needed")
> [...]
Just FYI as the culprit recently hit various stable series (v6.11.4,
v6.6.57, v6.1.113, v5.15.168) quite a few reports came in that look like
issues that might be fixed by this to my untrained eyes. I suppose they
won't tell you anything new and maybe you even have seen them, but on
the off-chance that this might not be the case you can find them here:
https://bugzilla.kernel.org/show_bug.cgi?id=219397
https://bugzilla.kernel.org/show_bug.cgi?id=219402
https://bugzilla.kernel.org/show_bug.cgi?id=219409
Ciao, Thorsten
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102152-salvage-pursuable-3b7c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (henceforth called the 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resolves these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
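The save/enable/restore sequence in the patch above can be modelled in plain C. This is only a sketch under stated assumptions: model_dwc and the model_* functions are hypothetical and stand in for the driver state and register accesses; the two booleans model the DWC3_GUSB2PHYCFG_SUSPHY and DWC3_GUSB3PIPECTL_SUSPHY bits, and is_auto models PMSG_IS_AUTO(msg).

```c
#include <stdbool.h>

/* Hypothetical model of the controller state, not struct dwc3. */
struct model_dwc {
	bool usb2_susphy;	/* models DWC3_GUSB2PHYCFG_SUSPHY */
	bool usb3_susphy;	/* models DWC3_GUSB3PIPECTL_SUSPHY */
	bool susphy_state;	/* snapshot taken when suspending */
};

/* Models dwc3_enable_susphy(): sets or clears both bits together. */
static void model_enable_susphy(struct model_dwc *dwc, bool enable)
{
	dwc->usb2_susphy = enable;
	dwc->usb3_susphy = enable;
}

/* Snapshot the bits, then force them on for a system (non-auto) suspend. */
static void model_suspend(struct model_dwc *dwc, bool is_auto)
{
	dwc->susphy_state = dwc->usb2_susphy || dwc->usb3_susphy;
	if (!is_auto && !dwc->susphy_state)
		model_enable_susphy(dwc, true);
}

/* Put the bits back to the snapshot after a system (non-auto) resume. */
static void model_resume(struct model_dwc *dwc, bool is_auto)
{
	if (!is_auto)
		model_enable_susphy(dwc, dwc->susphy_state);
}
```

Two cases from the commit message map onto this model: a device-mode controller with the bits clear gets them set for the duration of the suspend and cleared again on resume, and a host-mode controller whose bits are cleared by re-initialisation gets them set again on resume.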
Hello, I wanted to check on the backport of the fix for CVE-2024-26800
on the 5.15 kernel.
Here is the commit fixing the issue:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
As you can see, the commit that it says it fixes was backported to 5.15
to fix a different CVE, but as far as I can tell this one was not.
Thank You,
Michael Kochera
> Jeongjun Park <aha310510(a)gmail.com> wrote:
>
>
>
>> Kalle Valo <kvalo(a)kernel.org> wrote:
>>
>> Jeongjun Park <aha310510(a)gmail.com> wrote:
>>
>>> I found the following bug in my fuzzer:
>>>
>>> UBSAN: array-index-out-of-bounds in drivers/net/wireless/ath/ath9k/htc_hst.c:26:51
>>> index 255 is out of range for type 'htc_endpoint [22]'
>>> CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.11.0-rc6-dirty #14
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>>> Workqueue: events request_firmware_work_func
>>> Call Trace:
>>> <TASK>
>>> dump_stack_lvl+0x180/0x1b0
>>> __ubsan_handle_out_of_bounds+0xd4/0x130
>>> htc_issue_send.constprop.0+0x20c/0x230
>>> ? _raw_spin_unlock_irqrestore+0x3c/0x70
>>> ath9k_wmi_cmd+0x41d/0x610
>>> ? mark_held_locks+0x9f/0xe0
>>> ...
>>>
>>> Since this bug has been confirmed to be caused by insufficient verification
>>> of conn_rsp_epid, I think it would be appropriate to add a range check for
>>> conn_rsp_epid to htc_connect_service() to prevent the bug from occurring.
>>>
>>> Fixes: fb9987d0f748 ("ath9k_htc: Support for AR9271 chipset.")
>>> Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
>>> Acked-by: Toke Høiland-Jørgensen <toke(a)toke.dk>
>>> Signed-off-by: Kalle Valo <quic_kvalo(a)quicinc.com>
>>
>> Patch applied to ath-next branch of ath.git, thanks.
>>
>
Cc: <stable(a)vger.kernel.org>
> I think this patch should be applied to the next rc version immediately
> to fix the oob vulnerability as soon as possible, and also to the
> stable version.
>
> Regards,
>
> Jeongjun Park
>
>> 8619593634cb wifi: ath9k: add range check for conn_rsp_epid in htc_connect_service()
>>
>> --
>> https://patchwork.kernel.org/project/linux-wireless/patch/20240909103855.68…
>>
>> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatc…
>> https://docs.kernel.org/process/submitting-patches.html
>>
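For readers following the thread, the kind of range check discussed in the quoted report can be sketched as below. This is not the applied patch: MODEL_ENDPOINT_COUNT and model_epid_valid() are hypothetical names, and the bound of 22 is taken only from the array type 'htc_endpoint [22]' printed in the UBSAN report.

```c
#include <stdbool.h>

/* Array size as printed by UBSAN above: 'htc_endpoint [22]'. */
#define MODEL_ENDPOINT_COUNT 22

/*
 * Hypothetical helper: returns true only when an endpoint id received
 * from the device can be used as an index into the endpoint array.
 */
static bool model_epid_valid(unsigned int epid)
{
	return epid < MODEL_ENDPOINT_COUNT;
}
```

With such a check, the index 255 from the report would be rejected before it is used to index the array.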
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102153-fiddling-unblended-6e63@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occurs:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped .
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
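The stuck-port scenario described in the commit message can be reproduced in a small userspace model. This is a sketch only: model_port, MODEL_CTS and the model_* functions are hypothetical, the line state is reduced to a single boolean (true meaning the port may transmit), and update_snapshot selects between the behaviour before and after the patch.

```c
#include <stdbool.h>

/* Hypothetical bit standing in for TIOCM_CTS; the value is arbitrary. */
#define MODEL_CTS 0x1u

/* Hypothetical model of the port state, not struct imx_port. */
struct model_port {
	unsigned int old_status;	/* snapshot compared by the mctrl check */
	bool hw_stopped;		/* true while transmission is blocked */
};

/* Models uart_handle_cts_change(): the port is stopped while CTS is low. */
static void model_handle_cts_change(struct model_port *p, bool cts)
{
	p->hw_stopped = !cts;
}

/*
 * Models the RTSD interrupt handler. With update_snapshot set (patched
 * behaviour) the snapshot follows the line state seen by the interrupt.
 */
static void model_rts_irq(struct model_port *p, bool cts, bool update_snapshot)
{
	if (update_snapshot) {
		if (cts)
			p->old_status |= MODEL_CTS;
		else
			p->old_status &= ~MODEL_CTS;
	}
	model_handle_cts_change(p, cts);
}

/* Models imx_uart_mctrl_check(): acts only if the line differs from the snapshot. */
static void model_mctrl_check(struct model_port *p, bool cts_now)
{
	unsigned int status = cts_now ? MODEL_CTS : 0;
	unsigned int changed = status ^ p->old_status;

	if (!changed)
		return;
	p->old_status = status;
	if (changed & MODEL_CTS)
		model_handle_cts_change(p, cts_now);
}
```

Starting from a snapshot with MODEL_CTS set, an interrupt reporting a low line followed by a check that sees the line high again leaves hw_stopped set when update_snapshot is false, and clears it when update_snapshot is true.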
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 3267cb6d3a174ff83d6287dcd5b0047bbd912452
Gitweb: https://git.kernel.org/tip/3267cb6d3a174ff83d6287dcd5b0047bbd912452
Author: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
AuthorDate: Tue, 23 Jan 2024 19:55:21 -08:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Mon, 21 Oct 2024 15:05:43 -07:00
x86/lam: Disable ADDRESS_MASKING in most cases
Linear Address Masking (LAM) has a weakness related to transient
execution as described in the SLAM paper[1]. Unless Linear Address
Space Separation (LASS) is enabled this weakness may be exploitable.
Until the kernel adds support for LASS[2], only allow LAM for COMPILE_TEST,
or when speculation mitigations have been disabled at compile time;
otherwise keep LAM disabled.
There are no processors on the market that support LAM yet, so currently
nobody is affected by this issue.
[1] SLAM: https://download.vusec.net/papers/slam_sp24.pdf
[2] LASS: https://lore.kernel.org/lkml/20230609183632.48706-1-alexander.shishkin@linu…
[ dhansen: update SPECULATION_MITIGATIONS -> CPU_MITIGATIONS ]
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta(a)intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/5373262886f2783f054256babdf5a98545dc986b.170606…
---
arch/x86/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2852fcd..16354df 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2257,6 +2257,7 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
config ADDRESS_MASKING
bool "Linear Address Masking support"
depends on X86_64
+ depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS
help
Linear Address Masking (LAM) modifies the checking that is applied
to 64-bit linear addresses, allowing software to use of the
This is something that I've been thinking about for a while. We had a
discussion at LPC 2020 about this[1] but the proposals suggested there
never materialised.
In short, it is quite difficult for userspace to detect the feature
capability of syscalls at runtime. This is something a lot of programs
want to do, but they are forced to create elaborate scenarios to try to
figure out if a feature is supported without causing damage to the
system. For the vast majority of cases, each individual feature also
needs to be tested individually (because syscall results are
all-or-nothing), so testing even a single syscall's feature set can
easily inflate the startup time of programs.
This patchset implements the fairly minimal design I proposed in this
talk[2] and in some old LKML threads (though I can't find the exact
references ATM). The general flow looks like:
1. Userspace will indicate to the kernel that a syscall should be a
no-op by setting the top bit of the extensible struct size argument.
We will almost certainly never support exabyte sized structs, so the
top bits are free for us to use as makeshift flag bits. This is
preferable to using the per-syscall flag field inside the structure
because seccomp can easily detect the bit in the flag and allow the
probe or forcefully return -EEXTSYS_NOOP.
2. The kernel will then fill the provided structure with every valid
bit pattern that the current kernel understands.
For flags or other bitflag-like fields, this is the set of valid
flags or bits. For pointer fields or fields that take an arbitrary
value, the field has every bit set (0xFF... to fill the field) to
indicate that any value is valid in the field.
3. The syscall then returns -EEXTSYS_NOOP which is an errno that will
only ever be used for this purpose (so userspace can be sure that
the request succeeded).
On older kernels, the syscall will return a different error (usually
-E2BIG or -EFAULT) and userspace can do their old-fashioned checks.
4. Userspace can then check which flags and fields are supported by
looking at the fields in the returned structure. Flags are checked
by doing an AND with the flags field, and field support can be checked
by comparing to 0. In principle you could just AND the entire
structure if you wanted to do this check generically without caring
about the structure contents (this is what libraries might consider
doing).
Userspace can even find out the internal kernel structure size by
passing a PAGE_SIZE buffer and seeing how many bytes are non-zero.
As with copy_struct_from_user(), this is designed to be forward- and
backwards- compatible.
This allows programs to get a one-shot understanding of what features a
syscall supports without having to do any elaborate setups or tricks to
detect support for destructive features. Flags can simply be ANDed to
check if they are in the supported set, and fields can just be checked
to see if they are non-zero.
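To make the flow above concrete, here is a self-contained userspace mock of the probe. It is a sketch under stated assumptions and not the kernel implementation: mock_how, mock_probe(), the MOCK_* flag values and the MOCK_EEXTSYS_NOOP number are all hypothetical, and the mock only implements the probe path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values; none of them are taken from the patchset. */
#define MOCK_CHECK_FIELDS	(1ULL << 63)	/* top bit of the size argument */
#define MOCK_EEXTSYS_NOOP	4096		/* placeholder errno number */
#define MOCK_FLAG_A		0x1ULL
#define MOCK_FLAG_B		0x2ULL
#define MOCK_FLAG_FUTURE	0x4ULL		/* unknown to the mock "kernel" */

/* Hypothetical extensible struct as seen by a newer userspace. */
struct mock_how {
	uint64_t flags;		/* bitflag field */
	uint64_t mode;		/* arbitrary-value field */
	uint64_t future;	/* field the mock "kernel" does not know */
};

/*
 * Mock of a CHECK_FIELDS probe: fill the caller's struct with every bit
 * the mock "kernel" understands and report the dedicated no-op error.
 */
static int mock_probe(struct mock_how *how, uint64_t usize)
{
	struct mock_how known;
	size_t ksize = offsetof(struct mock_how, future);
	size_t size;

	if (!(usize & MOCK_CHECK_FIELDS))
		return -EINVAL;	/* the mock only implements the probe path */
	size = (size_t)(usize & ~MOCK_CHECK_FIELDS);
	if (size < sizeof(uint64_t))
		return -EINVAL;
	if (size > sizeof(*how))
		return -E2BIG;

	memset(&known, 0, sizeof(known));
	known.flags = MOCK_FLAG_A | MOCK_FLAG_B;	/* valid flag bits */
	known.mode = UINT64_MAX;			/* any value is accepted */

	/* Unknown trailing fields stay zero, known ones are copied in. */
	memset(how, 0, size);
	memcpy(how, &known, size < ksize ? size : ksize);
	return -MOCK_EEXTSYS_NOOP;
}
```

A caller would pass MOCK_CHECK_FIELDS | sizeof(how), then test (how.flags & MOCK_FLAG_FUTURE) and (how.future != 0); in this mock both come back as unsupported, while MOCK_FLAG_A and the mode field are reported as supported.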
This patchset is IMHO the simplest way we can add the ability to
introspect the feature set of extensible struct (copy_struct_from_user)
syscalls. It doesn't preclude the chance of a more generic mechanism
being added later.
The intended way of using this interface to get feature information
looks something like the following (imagine that openat2 has gained a
new field and a new flag in the future):
static bool openat2_no_automount_supported;
static bool openat2_cwd_fd_supported;

int check_openat2_support(void)
{
	int err;
	struct open_how how = {};

	/* Probe only: CHECK_FIELDS turns the call into a no-op. */
	err = openat2(AT_FDCWD, ".", &how, CHECK_FIELDS | sizeof(how));
	assert(err < 0);
	switch (errno) {
	case EFAULT: case E2BIG:
		/* Old kernel... */
		check_support_the_old_way();
		break;
	case EEXTSYS_NOOP:
		openat2_no_automount_supported = (how.flags & RESOLVE_NO_AUTOMOUNT);
		openat2_cwd_fd_supported = (how.cwd_fd != 0);
		break;
	}
	return 0;
}
This series adds CHECK_FIELDS support for the following extensible
struct syscalls, as they are quite likely to grow flags in the near
future:
* openat2
* clone3
* mount_setattr
[1]: https://lwn.net/Articles/830666/
[2]: https://youtu.be/ggD-eb3yPVs
Signed-off-by: Aleksa Sarai <cyphar(a)cyphar.com>
---
Changes in v3:
- Fix copy_struct_to_user() return values in case of clear_user() failure.
- v2: <https://lore.kernel.org/r/20240906-extensible-structs-check_fields-v2-0-0f4…>
Changes in v2:
- Add CHECK_FIELDS support to mount_setattr(2).
- Fix build failure on architectures with custom errno values.
- Rework selftests to use the tools/ uAPI headers rather than custom
defining EEXTSYS_NOOP.
- Make sure we return -EINVAL and -E2BIG for invalid sizes even if
CHECK_FIELDS is set, and add some tests for that.
- v1: <https://lore.kernel.org/r/20240902-extensible-structs-check_fields-v1-0-545…>
---
Aleksa Sarai (10):
uaccess: add copy_struct_to_user helper
sched_getattr: port to copy_struct_to_user
openat2: explicitly return -E2BIG for (usize > PAGE_SIZE)
openat2: add CHECK_FIELDS flag to usize argument
selftests: openat2: add 0xFF poisoned data after misaligned struct
selftests: openat2: add CHECK_FIELDS selftests
clone3: add CHECK_FIELDS flag to usize argument
selftests: clone3: add CHECK_FIELDS selftests
mount_setattr: add CHECK_FIELDS flag to usize argument
selftests: mount_setattr: add CHECK_FIELDS selftest
arch/alpha/include/uapi/asm/errno.h | 3 +
arch/mips/include/uapi/asm/errno.h | 3 +
arch/parisc/include/uapi/asm/errno.h | 3 +
arch/sparc/include/uapi/asm/errno.h | 3 +
fs/namespace.c | 17 ++
fs/open.c | 18 ++
include/linux/uaccess.h | 97 ++++++++
include/uapi/asm-generic/errno.h | 3 +
include/uapi/linux/openat2.h | 2 +
kernel/fork.c | 30 ++-
kernel/sched/syscalls.c | 42 +---
tools/arch/alpha/include/uapi/asm/errno.h | 3 +
tools/arch/mips/include/uapi/asm/errno.h | 3 +
tools/arch/parisc/include/uapi/asm/errno.h | 3 +
tools/arch/sparc/include/uapi/asm/errno.h | 3 +
tools/include/uapi/asm-generic/errno.h | 3 +
tools/include/uapi/asm-generic/posix_types.h | 101 ++++++++
tools/testing/selftests/clone3/.gitignore | 1 +
tools/testing/selftests/clone3/Makefile | 4 +-
.../testing/selftests/clone3/clone3_check_fields.c | 264 +++++++++++++++++++++
tools/testing/selftests/mount_setattr/Makefile | 2 +-
.../selftests/mount_setattr/mount_setattr_test.c | 53 ++++-
tools/testing/selftests/openat2/Makefile | 2 +
tools/testing/selftests/openat2/openat2_test.c | 165 ++++++++++++-
24 files changed, 777 insertions(+), 51 deletions(-)
---
base-commit: 98f7e32f20d28ec452afb208f9cffc08448a2652
change-id: 20240803-extensible-structs-check_fields-a47e94cef691
Best regards,
--
Aleksa Sarai <cyphar(a)cyphar.com>
The first patch simply adds the missing call to fwnode_handle_put() when
the node is no longer required to make it compatible with stable kernels
that don't support the cleanup attribute in its current form. The second
patch adds the __free() macro to make the code more robust and avoid
similar issues in the future.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Changes in v2:
- Add patch for the automatic cleanup facility.
- Link to v1: https://lore.kernel.org/r/20241019-typec-class-fwnode_handle_put-v1-1-a3b5a…
---
Javier Carrasco (2):
usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes()
usb: typec: use cleanup facility for 'altmodes_node'
drivers/usb/typec/class.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-typec-class-fwnode_handle_put-b3648f5bc51b
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e4d2102018542e3ae5e297bc6e229303abff8a0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102131-blissful-iodize-4056@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4d2102018542e3ae5e297bc6e229303abff8a0f Mon Sep 17 00:00:00 2001
From: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu, 26 Sep 2024 09:10:31 -0700
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand
Robert Gill reported below #GP in 32-bit mode when dosemu software was
executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
EIP: restore_all_switch_stack+0xbe/0xcf
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
Call Trace:
show_regs+0x70/0x78
die_addr+0x29/0x70
exc_general_protection+0x13c/0x348
exc_bounds+0x98/0x98
handle_exception+0x14d/0x14d
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
are enabled. This is because segment registers with an arbitrary user value
can result in #GP when executing VERW. Intel SDM vol. 2C documents the
following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES,
FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
space. Use %cs selector to reference VERW operand. This ensures VERW will
not #GP for an arbitrary user %ds.
[ mingo: Fixed the SOB chain. ]
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Reported-by: Robert Gill <rtgill82(a)gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3(a)citrix.com>
Cc: stable(a)vger.kernel.org # 5.10+
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.i…
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Suggested-by: Brian Gerst <brgerst(a)gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
- ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+ ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+ /*
+ * In 32bit mode, the memory operand must be a %cs reference. The data
+ * segments may not be usable (vm86 mode), and the stack segment may not
+ * be flat (ESPFIX32).
+ */
+ ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
.endm
#ifdef CONFIG_X86_64
Stuart Hayhurst has found that, with the Parade 08-01 TCON, both bootup and
fullscreen VA-API video lead to black screens lasting around 1 second and to
kernel WARNING traces [1] when dmub_psr_enable() is called.
These symptoms all go away with PSR-SU disabled for this TCON, so disable
it for now while DMUB traces [2] from the failure can be analyzed and the
failure state properly root caused.
Cc: stable(a)vger.kernel.org
Cc: Marc Rossi <Marc.Rossi(a)amd.com>
Cc: Hamza Mahfooz <Hamza.Mahfooz(a)amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/uploads/a832dd515b571ee171b3e3b566e9… [1]
Link: https://gitlab.freedesktop.org/drm/amd/uploads/8f13ff3b00963c833e23e68aa811… [2]
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2645
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/gpu/drm/amd/display/modules/power/power_helpers.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
index e304e8435fb8..477289846a0a 100644
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
@@ -841,6 +841,8 @@ bool is_psr_su_specific_panel(struct dc_link *link)
isPSRSUSupported = false;
else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x03)
isPSRSUSupported = false;
+ else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x01)
+ isPSRSUSupported = false;
else if (dpcd_caps->psr_info.force_psrsu_cap == 0x1)
isPSRSUSupported = true;
}
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e4d2102018542e3ae5e297bc6e229303abff8a0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102130-saturday-bountiful-5087@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4d2102018542e3ae5e297bc6e229303abff8a0f Mon Sep 17 00:00:00 2001
From: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu, 26 Sep 2024 09:10:31 -0700
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand
Robert Gill reported below #GP in 32-bit mode when dosemu software was
executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
EIP: restore_all_switch_stack+0xbe/0xcf
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
Call Trace:
show_regs+0x70/0x78
die_addr+0x29/0x70
exc_general_protection+0x13c/0x348
exc_bounds+0x98/0x98
handle_exception+0x14d/0x14d
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
are enabled. This is because segment registers with an arbitrary user value
can result in #GP when executing VERW. Intel SDM vol. 2C documents the
following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES,
FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
space. Use %cs selector to reference VERW operand. This ensures VERW will
not #GP for an arbitrary user %ds.
[ mingo: Fixed the SOB chain. ]
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Reported-by: Robert Gill <rtgill82(a)gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3(a)citrix.com>
Cc: stable(a)vger.kernel.org # 5.10+
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.i…
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Suggested-by: Brian Gerst <brgerst(a)gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
- ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+ ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+ /*
+ * In 32bit mode, the memory operand must be a %cs reference. The data
+ * segments may not be usable (vm86 mode), and the stack segment may not
+ * be flat (ESPFIX32).
+ */
+ ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
.endm
#ifdef CONFIG_X86_64
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e4d2102018542e3ae5e297bc6e229303abff8a0f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102128-omega-phosphate-db6c@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e4d2102018542e3ae5e297bc6e229303abff8a0f Mon Sep 17 00:00:00 2001
From: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu, 26 Sep 2024 09:10:31 -0700
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand
Robert Gill reported below #GP in 32-bit mode when dosemu software was
executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
EIP: restore_all_switch_stack+0xbe/0xcf
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
Call Trace:
show_regs+0x70/0x78
die_addr+0x29/0x70
exc_general_protection+0x13c/0x348
exc_bounds+0x98/0x98
handle_exception+0x14d/0x14d
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
exc_bounds+0x98/0x98
restore_all_switch_stack+0xbe/0xcf
This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
are enabled. This is because segment registers with an arbitrary user value
can result in #GP when executing VERW. Intel SDM vol. 2C documents the
following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES,
FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
space. Use %cs selector to reference VERW operand. This ensures VERW will
not #GP for an arbitrary user %ds.
[ mingo: Fixed the SOB chain. ]
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Reported-by: Robert Gill <rtgill82(a)gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3(a)citrix.com>
Cc: stable(a)vger.kernel.org # 5.10+
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.i…
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Suggested-by: Brian Gerst <brgerst(a)gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
- ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+ ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+ /*
+ * In 32bit mode, the memory operand must be a %cs reference. The data
+ * segments may not be usable (vm86 mode), and the stack segment may not
+ * be flat (ESPFIX32).
+ */
+ ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
.endm
#ifdef CONFIG_X86_64
Commit 9bf4e919ccad worked around an issue introduced after an innocuous
optimisation change in LLVM main:
> len is defined as an 'int' because it is assigned from
> '__user int *optlen'. However, it is clamped against the result of
> sizeof(), which has a type of 'size_t' ('unsigned long' for 64-bit
> platforms). This is done with min_t() because min() requires compatible
> types, which results in both len and the result of sizeof() being casted
> to 'unsigned int', meaning len changes signs and the result of sizeof()
> is truncated. From there, len is passed to copy_to_user(), which has a
> third parameter type of 'unsigned long', so it is widened and changes
> signs again. This excessive casting in combination with the KCSAN
> instrumentation causes LLVM to fail to eliminate the __bad_copy_from()
> call, failing the build.
The same issue occurs in rfcomm in functions rfcomm_sock_getsockopt and
rfcomm_sock_getsockopt_old.
Change the type of len to size_t in both rfcomm_sock_getsockopt and
rfcomm_sock_getsockopt_old and replace min_t() with min().
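The conversion chain described in the quoted text can be modelled in userspace. This is only a sketch: clamp_len_old() and clamp_len_new() are hypothetical helpers that imitate the min_t(unsigned int, ...) and min() forms, and they are not the kernel macros themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical model of the old code: 'len' is an int and both operands
 * are cast to unsigned int before the comparison, which is what
 * min_t(unsigned int, len, sizeof(obj)) amounts to.
 */
static int clamp_len_old(int len, size_t objsize)
{
	unsigned int a, b;

	assert(objsize <= INT_MAX);	/* the sketch only models small objects */

	a = (unsigned int)len;		/* sign change for negative len */
	b = (unsigned int)objsize;	/* truncation where size_t is 64-bit */

	return (int)(a < b ? a : b);
}

/*
 * Hypothetical model of the new code: 'len' already has the type of
 * sizeof(), so the comparison needs no casts at all.
 */
static size_t clamp_len_new(size_t len, size_t objsize)
{
	return len < objsize ? len : objsize;
}
```

Both helpers return the same value for sane inputs. The difference is the number of conversions the compiler has to see through before it can prove that the copy size is bounded by the object size.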
Cc: stable(a)vger.kernel.org
Co-authored-by: Aleksei Vetrov <vvvvvv(a)google.com>
Improves: 9bf4e919ccad ("Bluetooth: Fix type of len in {l2cap,sco}_sock_getsockopt_old()")
Link: https://github.com/ClangBuiltLinux/linux/issues/2007
Link: https://github.com/llvm/llvm-project/issues/85647
Signed-off-by: Andrej Shadura <andrew.shadura(a)collabora.co.uk>
Reviewed-by: Nathan Chancellor <nathan(a)kernel.org>
---
net/bluetooth/rfcomm/sock.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 37d63d768afb..5f9d370e09b1 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -729,7 +729,8 @@ static int rfcomm_sock_getsockopt_old(struct socket *sock, int optname, char __u
struct sock *l2cap_sk;
struct l2cap_conn *conn;
struct rfcomm_conninfo cinfo;
- int len, err = 0;
+ int err = 0;
+ size_t len;
u32 opt;
BT_DBG("sk %p", sk);
@@ -783,7 +784,7 @@ static int rfcomm_sock_getsockopt_old(struct socket *sock, int optname, char __u
cinfo.hci_handle = conn->hcon->handle;
memcpy(cinfo.dev_class, conn->hcon->dev_class, 3);
- len = min_t(unsigned int, len, sizeof(cinfo));
+ len = min(len, sizeof(cinfo));
if (copy_to_user(optval, (char *) &cinfo, len))
err = -EFAULT;
@@ -802,7 +803,8 @@ static int rfcomm_sock_getsockopt(struct socket *sock, int level, int optname, c
{
struct sock *sk = sock->sk;
struct bt_security sec;
- int len, err = 0;
+ int err = 0;
+ size_t len;
BT_DBG("sk %p", sk);
@@ -827,7 +829,7 @@ static int rfcomm_sock_getsockopt(struct socket *sock, int level, int optname, c
sec.level = rfcomm_pi(sk)->sec_level;
sec.key_size = 0;
- len = min_t(unsigned int, len, sizeof(sec));
+ len = min(len, sizeof(sec));
if (copy_to_user(optval, (char *) &sec, len))
err = -EFAULT;
--
2.43.0
The read buffer is allocated according to the maximum message size
reported by the firmware, and may reach 64K on systems with a PXP client.
A contiguous 64K allocation may fail under memory pressure.
The read buffer is used as in-driver message storage and is not required
to be contiguous.
Use kvmalloc() to allow the kernel to allocate non-contiguous memory.
Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.")
Reported-by: Rohit Agarwal <rohiagar(a)chromium.org>
Closes: https://lore.kernel.org/all/20240813084542.2921300-1-rohiagar@chromium.org/
Tested-by: Brian Geffon <bgeffon(a)google.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
---
Changes since V2:
- add Fixes and CC:stable
Changes since V1:
- add Tested-by and Reported-by
drivers/misc/mei/client.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mei/client.c b/drivers/misc/mei/client.c
index 9d090fa07516..be011cef12e5 100644
--- a/drivers/misc/mei/client.c
+++ b/drivers/misc/mei/client.c
@@ -321,7 +321,7 @@ void mei_io_cb_free(struct mei_cl_cb *cb)
return;
list_del(&cb->list);
- kfree(cb->buf.data);
+ kvfree(cb->buf.data);
kfree(cb->ext_hdr);
kfree(cb);
}
@@ -497,7 +497,7 @@ struct mei_cl_cb *mei_cl_alloc_cb(struct mei_cl *cl, size_t length,
if (length == 0)
return cb;
- cb->buf.data = kmalloc(roundup(length, MEI_SLOT_SIZE), GFP_KERNEL);
+ cb->buf.data = kvmalloc(roundup(length, MEI_SLOT_SIZE), GFP_KERNEL);
if (!cb->buf.data) {
mei_io_cb_free(cb);
return NULL;
--
2.43.0
The 'altmodes_node' fwnode_handle is never released after it is no
longer required, which leaks the resource.
Add the required call to fwnode_handle_put() when 'altmodes_node' is no
longer required.
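The get/put balance behind this fix can be sketched in userspace. Everything here is hypothetical: struct handle stands in for a reference-counted fwnode_handle, and handle_get(), handle_put() and use_child_node() are illustrative names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted handle, standing in for a fwnode. */
struct handle {
	int refs;
};

static struct handle *handle_get(struct handle *h)
{
	if (h)
		h->refs++;
	return h;
}

static void handle_put(struct handle *h)
{
	if (!h)
		return;
	assert(h->refs > 0);	/* a put without a matching get is a bug */
	h->refs--;
}

/*
 * Take a temporary reference, use the node, and drop the reference
 * before returning. Returns 1 if a node was processed, 0 otherwise.
 */
static int use_child_node(struct handle *child)
{
	struct handle *node = handle_get(child);

	if (!node)
		return 0;

	/* ... walk the node here ... */

	handle_put(node);	/* omitting this put leaks one reference */
	return 1;
}
```

After use_child_node() returns, the reference count is back at its starting value, which is the property the added fwnode_handle_put() call restores.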
Cc: stable(a)vger.kernel.org
Fixes: 7b458a4c5d73 ("usb: typec: Add typec_port_register_altmodes()")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
drivers/usb/typec/class.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
index d61b4c74648d..1eb240604cf6 100644
--- a/drivers/usb/typec/class.c
+++ b/drivers/usb/typec/class.c
@@ -2341,6 +2341,7 @@ void typec_port_register_altmodes(struct typec_port *port,
altmodes[index] = alt;
index++;
}
+ fwnode_handle_put(altmodes_node);
}
EXPORT_SYMBOL_GPL(typec_port_register_altmodes);
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-typec-class-fwnode_handle_put-b3648f5bc51b
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
From: Vimal Agrawal <vimal.agrawal(a)sophos.com>
misc_minor_alloc() allocated an id from the IDA only in the case of
MISC_DYNAMIC_MINOR, but misc_minor_free() always freed ids using
ida_free(), causing a mismatch and the following warning:
> > WARNING: CPU: 0 PID: 159 at lib/idr.c:525 ida_free+0x3e0/0x41f
> > ida_free called for id=127 which is not allocated.
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
...
> > [<60941eb4>] ida_free+0x3e0/0x41f
> > [<605ac993>] misc_minor_free+0x3e/0xbc
> > [<605acb82>] misc_deregister+0x171/0x1b3
misc_minor_alloc() is changed to allocate an id from the IDA for all minors
that fall in the dynamic or misc dynamic minor ranges.
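The minor-to-id arithmetic used by the patch can be checked with a small userspace sketch. misc_minor_to_ida_id() and NO_IDA_ID are hypothetical names, and the value 255 for MISC_DYNAMIC_MINOR is an assumption taken from <linux/miscdevice.h>.

```c
#include <assert.h>

#define DYNAMIC_MINORS		128	/* from the patch context */
#define MISC_DYNAMIC_MINOR	255	/* assumed, as in <linux/miscdevice.h> */
#define NO_IDA_ID		(-1)	/* hypothetical: minor is not tracked */

/*
 * Hypothetical helper: map a specific misc minor to the IDA id that
 * tracks it, following the arithmetic in the patch. Minors in
 * [DYNAMIC_MINORS, MISC_DYNAMIC_MINOR] have no id here.
 */
static int misc_minor_to_ida_id(int minor)
{
	assert(minor >= 0);

	if (minor < DYNAMIC_MINORS)
		return DYNAMIC_MINORS - minor - 1;	/* reversed low range */
	if (minor > MISC_DYNAMIC_MINOR)
		return minor;				/* extended range */
	return NO_IDA_ID;
}
```

Under this mapping, id 127 corresponds to minor 0, so the id=127 in the warning above would be consistent with a low static minor being freed without ever having been allocated from the IDA.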
Fixes: ab760791c0cf ("char: misc: Increase the maximum number of dynamic misc devices to 1048448")
Signed-off-by: Vimal Agrawal <vimal.agrawal(a)sophos.com>
Reviewed-by: Dirk VanDerMerwe <dirk.vandermerwe(a)sophos.com>
Cc: stable(a)vger.kernel.org
---
v2: Added Fixes:
Added missed case for static minor in misc_minor_alloc
v3: Removed kunit changes as that will be added as second patch in this two patch series
v4: Updated Signed-off-by: to match from:
v5: Used corporate id in from: and Signed-off-by:
drivers/char/misc.c | 39 ++++++++++++++++++++++++++++++---------
1 file changed, 30 insertions(+), 9 deletions(-)
diff --git a/drivers/char/misc.c b/drivers/char/misc.c
index 541edc26ec89..2cf595d2e10b 100644
--- a/drivers/char/misc.c
+++ b/drivers/char/misc.c
@@ -63,16 +63,30 @@ static DEFINE_MUTEX(misc_mtx);
#define DYNAMIC_MINORS 128 /* like dynamic majors */
static DEFINE_IDA(misc_minors_ida);
-static int misc_minor_alloc(void)
+static int misc_minor_alloc(int minor)
{
- int ret;
-
- ret = ida_alloc_max(&misc_minors_ida, DYNAMIC_MINORS - 1, GFP_KERNEL);
- if (ret >= 0) {
- ret = DYNAMIC_MINORS - ret - 1;
+ int ret = 0;
+
+ if (minor == MISC_DYNAMIC_MINOR) {
+ /* allocate free id */
+ ret = ida_alloc_max(&misc_minors_ida, DYNAMIC_MINORS - 1, GFP_KERNEL);
+ if (ret >= 0) {
+ ret = DYNAMIC_MINORS - ret - 1;
+ } else {
+ ret = ida_alloc_range(&misc_minors_ida, MISC_DYNAMIC_MINOR + 1,
+ MINORMASK, GFP_KERNEL);
+ }
} else {
- ret = ida_alloc_range(&misc_minors_ida, MISC_DYNAMIC_MINOR + 1,
- MINORMASK, GFP_KERNEL);
+ /* specific minor, check if it is in dynamic or misc dynamic range */
+ if (minor < DYNAMIC_MINORS) {
+ minor = DYNAMIC_MINORS - minor - 1;
+ ret = ida_alloc_range(&misc_minors_ida, minor, minor, GFP_KERNEL);
+ } else if (minor > MISC_DYNAMIC_MINOR) {
+ ret = ida_alloc_range(&misc_minors_ida, minor, minor, GFP_KERNEL);
+ } else {
+ /* case of non-dynamic minors, no need to allocate id */
+ ret = 0;
+ }
}
return ret;
}
@@ -219,7 +233,7 @@ int misc_register(struct miscdevice *misc)
mutex_lock(&misc_mtx);
if (is_dynamic) {
- int i = misc_minor_alloc();
+ int i = misc_minor_alloc(misc->minor);
if (i < 0) {
err = -EBUSY;
@@ -228,6 +242,7 @@ int misc_register(struct miscdevice *misc)
misc->minor = i;
} else {
struct miscdevice *c;
+ int i;
list_for_each_entry(c, &misc_list, list) {
if (c->minor == misc->minor) {
@@ -235,6 +250,12 @@ int misc_register(struct miscdevice *misc)
goto out;
}
}
+
+ i = misc_minor_alloc(misc->minor);
+ if (i < 0) {
+ err = -EBUSY;
+ goto out;
+ }
}
dev = MKDEV(MISC_MAJOR, misc->minor);
--
2.17.1
This series fixes the handling of an fwnode that is not released in all
error paths and is released with the wrong function (spotted by Dmitry
Baryshkov).
To: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
To: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
To: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
To: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
To: Caleb Connolly <caleb.connolly(a)linaro.org>
To: Guenter Roeck <linux(a)roeck-us.net>
Cc: linux-arm-msm(a)vger.kernel.org
Cc: linux-usb(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Changes in v2:
- add patch to use fwnode_handle_put() instead of
fwnode_remove_software_node().
- Link to v1: https://lore.kernel.org/r/20241019-qcom_pmic_typec-fwnode_remove-v1-1-88496…
---
Javier Carrasco (2):
usb: typec: qcom-pmic-typec: use fwnode_handle_put() to release fwnodes
usb: typec: qcom-pmic-typec: fix missing fwnode removal in error path
drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-qcom_pmic_typec-fwnode_remove-00dc49054cf7
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
When splicing a 0-length bvec, the block layer may loop without ever
advancing the bvec, causing lockups or hangs.
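One way to avoid handing zero-length segments to a consumer is to filter them out while building the segment array. The sketch below is a userspace illustration only and is not the actual fs/splice.c change; struct seg and pack_nonempty_segs() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bvec: a base pointer and a length. */
struct seg {
	const void *base;
	size_t len;
};

/*
 * Copy src[0..n) into dst[], dropping zero-length segments.
 * Returns the number of segments kept. dst must have room for n entries.
 */
static size_t pack_nonempty_segs(const struct seg *src, size_t n,
				 struct seg *dst)
{
	size_t i, kept = 0;

	assert(src && dst);

	for (i = 0; i < n; i++) {
		if (src[i].len == 0)
			continue;	/* never emit an empty segment */
		dst[kept++] = src[i];
	}
	return kept;
}
```

A consumer that advances by the segment length on each step can then always make progress, because every segment it sees has a non-zero length.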
Pavel Begunkov (1):
splice: don't generate zero-len segement bvecs
fs/splice.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--
2.34.1
If USB virtualization is enabled, USB2 ports are shared between all
Virtual Functions. The number of USB2 ports owned by a USB2 root hub in
a Virtual Function may be less than the total number of USB2 PHYs
supported by the Tegra XUSB controller.
Using the total number of USB2 PHYs as the port count when checking all
PORTSC values would cause an invalid memory access.
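The bound mismatch can be illustrated with a userspace sketch. The structures below are hypothetical stand-ins and do not reflect the real xHCI or Tegra driver layouts; the point is only that the loop bound comes from the array's owner.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct port {
	unsigned int portsc;	/* mock PORTSC value */
};

struct rhub {
	struct port **ports;
	size_t num_ports;	/* ports actually owned by this root hub */
};

/*
 * Count ports with a non-zero mock PORTSC. The loop is bounded by the
 * root hub's own num_ports rather than by a controller-wide PHY count,
 * so ports[] is never indexed past its end.
 */
static size_t count_active_ports(const struct rhub *hub)
{
	size_t i, n = 0;

	assert(hub && hub->ports);

	for (i = 0; i < hub->num_ports; i++) {
		if (!hub->ports[i])
			continue;
		if (hub->ports[i]->portsc)
			n++;
	}
	return n;
}
```

If the bound were a larger, unrelated count, the loop would read pointers beyond the end of ports[] and dereference whatever happened to be stored there, which matches the paging-request fault shown below.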
[ 116.923438] Unable to handle kernel paging request at virtual address 006c622f7665642f
...
[ 117.213640] Call trace:
[ 117.216783] tegra_xusb_enter_elpg+0x23c/0x658
[ 117.222021] tegra_xusb_runtime_suspend+0x40/0x68
[ 117.227260] pm_generic_runtime_suspend+0x30/0x50
[ 117.232847] __rpm_callback+0x84/0x3c0
[ 117.237038] rpm_suspend+0x2dc/0x740
[ 117.241229] pm_runtime_work+0xa0/0xb8
[ 117.245769] process_scheduled_works+0x24c/0x478
[ 117.251007] worker_thread+0x23c/0x328
[ 117.255547] kthread+0x104/0x1b0
[ 117.259389] ret_from_fork+0x10/0x20
[ 117.263582] Code: 54000222 f9461ae8 f8747908 b4ffff48 (f9400100)
Cc: <stable(a)vger.kernel.org> # v6.3+
Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls")
Signed-off-by: Henry Lin <henryl(a)nvidia.com>
---
V1 -> V2: Add Fixes tag and the cc stable line
V2 -> V3: Update commit message to clarify issue
V3 -> V4: Resend for patch changelogs that are missing in V3
drivers/usb/host/xhci-tegra.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
index 6246d5ad1468..76f228e7443c 100644
--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -2183,7 +2183,7 @@ static int tegra_xusb_enter_elpg(struct tegra_xusb *tegra, bool runtime)
goto out;
}
- for (i = 0; i < tegra->num_usb_phys; i++) {
+ for (i = 0; i < xhci->usb2_rhub.num_ports; i++) {
if (!xhci->usb2_rhub.ports[i])
continue;
portsc = readl(xhci->usb2_rhub.ports[i]->addr);
--
2.25.1
The LG Gram Pro 16 2-in-1 (2024), model 16T90SP, has its keyboard IRQ (1)
described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh,
breaking the keyboard.
Add the 16T90SP to the irq1_level_low_skip_override[] quirk table to fix
this.
Reported-by: Dirk Holten <dirk.holten(a)gmx.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219382
Cc: stable(a)vger.kernel.org
Suggested-by: Dirk Holten <dirk.holten(a)gmx.de>
Signed-off-by: Christian Heusel <christian(a)heusel.eu>
---
Note that I do not have the relevant hardware since I'm sending in this
quirk at the request of someone else.
---
Changes in v2:
- fix the double initialization warning reported by the kernel test
robot, which accidentally overwrote another quirk
- Link to v1: https://lore.kernel.org/r/20241016-lg-gram-pro-keyboard-v1-1-34306123102f@h…
---
drivers/acpi/resource.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
index 129bceb1f4a27df93439bcefdb27fd9c91258028..7fe842dae1ec05ce6726af2ae4fcc8eff3698dcb 100644
--- a/drivers/acpi/resource.c
+++ b/drivers/acpi/resource.c
@@ -503,6 +503,13 @@ static const struct dmi_system_id irq1_level_low_skip_override[] = {
DMI_MATCH(DMI_BOARD_NAME, "17U70P"),
},
},
+ {
+ /* LG Electronics 16T90SP */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LG Electronics"),
+ DMI_MATCH(DMI_BOARD_NAME, "16T90SP"),
+ },
+ },
{ }
};
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241016-lg-gram-pro-keyboard-9a9d8b9aa647
Best regards,
--
Christian Heusel <christian(a)heusel.eu>
A few commits from Yu Zhao have been merged into 6.12.
They need to be backported to 6.11.
- c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
- 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
- e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")
---
Changes in v2:
- Add signed off tag
- Link to v1: https://lore.kernel.org/r/20241017-stable-yuzhao-v1-0-3a4566660d44@kernel.o…
---
Yu Zhao (3):
mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO
mm/codetag: fix pgalloc_tag_split()
mm/codetag: add pgalloc_tag_copy()
include/linux/alloc_tag.h | 24 ++++++++-----------
include/linux/mm.h | 57 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/pgalloc_tag.h | 31 ------------------------
mm/huge_memory.c | 2 +-
mm/hugetlb_vmemmap.c | 40 +++++++++++++++----------------
mm/migrate.c | 1 +
mm/page_alloc.c | 4 ++--
7 files changed, 91 insertions(+), 68 deletions(-)
---
base-commit: 8e24a758d14c0b1cd42ab0aea980a1030eea811f
change-id: 20241016-stable-yuzhao-7779910482e8
Best regards,
--
Chris Li <chrisl(a)kernel.org>
commit 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de upstream.
Currently, EROFS can map another compressed buffer for in-place
decompression; this was used to handle the case where some pages of
compressed data are not actually in-place I/O.
However, like most simple LZ77 algorithms, LZ4 expects the compressed
data to be arranged at the end of the decompressed buffer, and it
explicitly uses memmove() to handle overlapping:
__________________________________________________________
|_ direction of decompression --> ____ |_ compressed data _|
Although EROFS arranges compressed data like this, it typically maps two
individual virtual buffers so the relative order is uncertain.
Previously, this was rarely observed, since LZ4 only uses memmove() for
short overlapped literals, and the x86/arm64 memmove implementations seem
to cover it up completely, so they do not show this issue. Juhyung reported
that EROFS data corruption can be found on a new Intel x86 processor.
After some analysis, it seems that recent x86 processors with the new
FSRM feature expose this issue with "rep movsb".
Let's strictly use the decompressed buffer for lz4 in-place
decompression for now. Later, as a useful improvement, we could try
to tie these two buffers together in the correct order.
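The layout requirement described above can be written down as a simple check on offsets within one buffer. This is a sketch under assumptions: inplace_layout_ok() is a hypothetical helper, and the margin formula is assumed to mirror LZ4_DECOMPRESS_INPLACE_MARGIN().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror LZ4_DECOMPRESS_INPLACE_MARGIN(); illustrative only. */
#define INPLACE_MARGIN(inlen)	(((inlen) >> 8) + 32)

/*
 * Hypothetical layout check for in-place decompression in ONE buffer of
 * 'bufsize' bytes. The output starts at offset 0 and needs 'outlen'
 * bytes; the compressed input occupies [in_off, in_off + inlen).
 * Returns 1 if the input ends exactly at the end of the buffer and the
 * buffer leaves the required safety margin, 0 otherwise.
 */
static int inplace_layout_ok(size_t bufsize, size_t in_off,
			     size_t inlen, size_t outlen)
{
	assert(inlen > 0);

	if (in_off > bufsize || bufsize - in_off != inlen)
		return 0;	/* input is not at the tail of the buffer */

	return bufsize >= outlen + INPLACE_MARGIN(inlen);
}
```

The check only makes sense when input and output share one address range. With two separate virtual mappings, as described above, there is no fixed offset between them, so the relative order that LZ4 relies on cannot be guaranteed.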
Reported-and-tested-by: Juhyung Park <qkrwngud825(a)gmail.com>
Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR…
Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace")
Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend")
Cc: stable <stable(a)vger.kernel.org> # 5.4+
Tested-by: Yifan Zhao <zhaoyifan(a)sjtu.edu.cn>
Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.…
Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
---
The remaining stable patch to address the issue "CVE-2023-52497" for
5.4.y, which is the same as the 5.10.y one [1].
[1] https://lore.kernel.org/r/20240224063248.2157885-1-hsiangkao@linux.alibaba.…
fs/erofs/decompressor.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 38eeec5e3032..d06a3b77fb39 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -24,7 +24,8 @@ struct z_erofs_decompressor {
*/
int (*prepare_destpages)(struct z_erofs_decompress_req *rq,
struct list_head *pagepool);
- int (*decompress)(struct z_erofs_decompress_req *rq, u8 *out);
+ int (*decompress)(struct z_erofs_decompress_req *rq, u8 *out,
+ u8 *obase);
char *name;
};
@@ -114,10 +115,13 @@ static void *generic_copy_inplace_data(struct z_erofs_decompress_req *rq,
return tmp;
}
-static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
+static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out,
+ u8 *obase)
{
+ const uint nrpages_out = PAGE_ALIGN(rq->pageofs_out +
+ rq->outputsize) >> PAGE_SHIFT;
unsigned int inputmargin, inlen;
- u8 *src;
+ u8 *src, *src2;
bool copied, support_0padding;
int ret;
@@ -125,6 +129,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
return -EOPNOTSUPP;
src = kmap_atomic(*rq->in);
+ src2 = src;
inputmargin = 0;
support_0padding = false;
@@ -148,16 +153,15 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
if (rq->inplace_io) {
const uint oend = (rq->pageofs_out +
rq->outputsize) & ~PAGE_MASK;
- const uint nr = PAGE_ALIGN(rq->pageofs_out +
- rq->outputsize) >> PAGE_SHIFT;
-
if (rq->partial_decoding || !support_0padding ||
- rq->out[nr - 1] != rq->in[0] ||
+ rq->out[nrpages_out - 1] != rq->in[0] ||
rq->inputsize - oend <
LZ4_DECOMPRESS_INPLACE_MARGIN(inlen)) {
src = generic_copy_inplace_data(rq, src, inputmargin);
inputmargin = 0;
copied = true;
+ } else {
+ src = obase + ((nrpages_out - 1) << PAGE_SHIFT);
}
}
@@ -178,7 +182,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
if (copied)
erofs_put_pcpubuf(src);
else
- kunmap_atomic(src);
+ kunmap_atomic(src2);
return ret;
}
@@ -248,7 +252,7 @@ static int z_erofs_decompress_generic(struct z_erofs_decompress_req *rq,
return PTR_ERR(dst);
rq->inplace_io = false;
- ret = alg->decompress(rq, dst);
+ ret = alg->decompress(rq, dst, NULL);
if (!ret)
copy_from_pcpubuf(rq->out, dst, rq->pageofs_out,
rq->outputsize);
@@ -282,7 +286,7 @@ static int z_erofs_decompress_generic(struct z_erofs_decompress_req *rq,
dst_maptype = 2;
dstmap_out:
- ret = alg->decompress(rq, dst + rq->pageofs_out);
+ ret = alg->decompress(rq, dst + rq->pageofs_out, dst);
if (!dst_maptype)
kunmap_atomic(dst);
--
2.43.5
Greg recently reported 2 patches that could not be applied without
conflicts in v5.10:
- e32d262c89e2 ("mptcp: handle consistently DSS corruption")
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
Conflicts have been resolved, and documented in each patch.
One extra commit has been backported, to support allow_infinite_fallback
which is used by one commit from the list above:
- 0530020a7c8f ("mptcp: track and update contiguous data status")
Geliang Tang (1):
mptcp: track and update contiguous data status
Paolo Abeni (2):
mptcp: handle consistently DSS corruption
tcp: fix mptcp DSS corruption due to large pmtu xmit
net/ipv4/tcp_output.c | 2 +-
net/mptcp/mib.c | 2 ++
net/mptcp/mib.h | 2 ++
net/mptcp/protocol.c | 26 ++++++++++++++++++++++----
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 3 ++-
6 files changed, 30 insertions(+), 6 deletions(-)
--
2.45.2
Greg recently reported 6 patches that could not be applied without
conflicts in v5.15:
- e32d262c89e2 ("mptcp: handle consistently DSS corruption")
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
- 119d51e225fe ("mptcp: fallback when MPTCP opts are dropped after 1st
data")
- 7decd1f5904a ("mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow")
- 3d041393ea8c ("mptcp: prevent MPC handshake on port-based signal
endpoints")
- 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to
port-based endp")
Conflicts have been resolved for the first 5, and documented in
each patch.
The last patch has not been backported: this is an extra test for the
selftests validating the previous commit, and there are a lot of
conflicts. It is fine not to backport this test: it is still possible
to use the selftests from a newer version and run them on this older
kernel.
One extra commit has been backported, to support allow_infinite_fallback
which is used by two commits from the list above:
- 0530020a7c8f ("mptcp: track and update contiguous data status")
Geliang Tang (1):
mptcp: track and update contiguous data status
Matthieu Baerts (NGI0) (2):
mptcp: fallback when MPTCP opts are dropped after 1st data
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
Paolo Abeni (3):
mptcp: handle consistently DSS corruption
tcp: fix mptcp DSS corruption due to large pmtu xmit
mptcp: prevent MPC handshake on port-based signal endpoints
net/ipv4/tcp_output.c | 2 +-
net/mptcp/mib.c | 3 +++
net/mptcp/mib.h | 3 +++
net/mptcp/pm_netlink.c | 3 ++-
net/mptcp/protocol.c | 23 ++++++++++++++++++++---
net/mptcp/protocol.h | 2 ++
net/mptcp/subflow.c | 19 ++++++++++++++++---
7 files changed, 47 insertions(+), 8 deletions(-)
--
2.45.2
Greg recently reported 3 patches that could not be applied without
conflicts in v6.1:
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
- 3d041393ea8c ("mptcp: prevent MPC handshake on port-based signal
endpoints")
- 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to
port-based endp")
Conflicts have been resolved for the first two, and documented in
each patch.
The last patch has not been backported: this is an extra test for the
selftests validating the previous commit, and there are a lot of
conflicts. It is fine not to backport this test: it is still possible
to use the selftests from a newer version and run them on this older
kernel.
Paolo Abeni (2):
tcp: fix mptcp DSS corruption due to large pmtu xmit
mptcp: prevent MPC handshake on port-based signal endpoints
net/ipv4/tcp_output.c | 4 +---
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 1 +
net/mptcp/pm_netlink.c | 1 +
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 11 +++++++++++
6 files changed, 16 insertions(+), 3 deletions(-)
--
2.45.2
Greg recently reported 2 patches that could not be applied without
conflict in v6.6:
- 4dabcdf58121 ("tcp: fix mptcp DSS corruption due to large pmtu xmit")
- 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to
port-based endp")
Conflicts have been resolved, and documented in each patch.
Note that there are two extra patches:
- 8c6f6b4bb53a ("selftests: mptcp: join: change capture/checksum as
bool"): to avoid some conflicts
- "selftests: mptcp: remove duplicated variables": a dedicated patch
for v6.6, to fix some previous backport issues.
Geliang Tang (1):
selftests: mptcp: join: change capture/checksum as bool
Matthieu Baerts (NGI0) (1):
selftests: mptcp: remove duplicated variables
Paolo Abeni (2):
tcp: fix mptcp DSS corruption due to large pmtu xmit
selftests: mptcp: join: test for prohibited MPC to port-based endp
net/ipv4/tcp_output.c | 4 +-
.../testing/selftests/net/mptcp/mptcp_join.sh | 135 ++++++++++++------
.../testing/selftests/net/mptcp/mptcp_lib.sh | 11 --
3 files changed, 96 insertions(+), 54 deletions(-)
--
2.45.2
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 23f5f5debcaac1399cfeacec215278bf6dbc1d11
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102142-pristine-mayday-9998@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 23f5f5debcaac1399cfeacec215278bf6dbc1d11 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Wed, 9 Oct 2024 16:51:04 +0200
Subject: [PATCH] serial: qcom-geni: fix shutdown race
A commit adding back the stopping of tx on port shutdown failed to add
back the locking which had also been removed by commit e83766334f96
("tty: serial: qcom_geni_serial: No need to stop tx/rx on UART
shutdown").
Holding the port lock is needed to serialise against the console code,
which may update the interrupt enable register and access the port
state.
Fixes: d8aca2f96813 ("tty: serial: qcom-geni-serial: stop operations in progress at shutdown")
Fixes: 947cc4ecc06c ("serial: qcom-geni: fix soft lockup on sw flow control and suspend")
Cc: stable(a)vger.kernel.org # 6.3
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
Link: https://lore.kernel.org/r/20241009145110.16847-4-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 2e4a5361f137..87cd974b76bf 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1114,10 +1114,12 @@ static void qcom_geni_serial_shutdown(struct uart_port *uport)
{
disable_irq(uport->irq);
+ uart_port_lock_irq(uport);
qcom_geni_serial_stop_tx(uport);
qcom_geni_serial_stop_rx(uport);
qcom_geni_serial_cancel_tx_cmd(uport);
+ uart_port_unlock_irq(uport);
}
static void qcom_geni_serial_flush_buffer(struct uart_port *uport)
This problem, reported by Clement LE GOFFIC, manifests when
using CONFIG_KASAN_IN_VMALLOC and VMAP_STACK:
https://lore.kernel.org/linux-arm-kernel/a1a1d062-f3a2-4d05-9836-3b098de9db…
After some analysis, it seems we fail to sync the
VMALLOC shadow memory in the top-level PGD to all CPUs.
Add some code to perform this sync, and the bug appears
to go away.
As suggested by Ard, also perform a dummy read from the
shadow memory of the new VMAP_STACK in the low level
assembly.
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
Changes in v2:
- Implement the two helper functions suggested by Russell
making the KASAN PGD copying less messy.
- Link to v1: https://lore.kernel.org/r/20241015-arm-kasan-vmalloc-crash-v1-0-dbb23592ca8…
---
Linus Walleij (2):
ARM: ioremap: Sync PGDs for VMALLOC shadow
ARM: entry: Do a dummy read from VMAP shadow
arch/arm/kernel/entry-armv.S | 8 ++++++++
arch/arm/mm/ioremap.c | 25 +++++++++++++++++++++----
2 files changed, 29 insertions(+), 4 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241015-arm-kasan-vmalloc-crash-fcbd51416457
Best regards,
--
Linus Walleij <linus.walleij(a)linaro.org>
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102101-glacial-outsell-b7f5@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occurs:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status.
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped.
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102100-unselect-chevron-960a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occurs:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status.
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped.
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102158-vanquish-ignition-83d0@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occurs:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status.
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped.
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102157-tactical-darkened-7877@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occurs:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status.
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped.
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 40d7903386df4d18f04d90510ba90eedee260085
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102155-anemia-fructose-ab64@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 40d7903386df4d18f04d90510ba90eedee260085 Mon Sep 17 00:00:00 2001
From: Marek Vasut <marex(a)denx.de>
Date: Wed, 2 Oct 2024 20:40:38 +0200
Subject: [PATCH] serial: imx: Update mctrl old_status on RTSD interrupt
When sending data using DMA at high baudrate (4 Mbdps in local test case) to
a device with small RX buffer which keeps asserting RTS after every received
byte, it is possible that the iMX UART driver would not recognize the falling
edge of RTS input signal and get stuck, unable to transmit any more data.
This condition happens when the following sequence of events occurs:
- imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
control signal status into sport->old_status using imx_uart_get_hwmctrl().
The RTSS/TIOCM_CTS bit is of interest here (*).
- DMA transfer occurs, the remote device asserts RTS signal after each byte.
The i.MX UART driver recognizes each such RTS signal change, raises an
interrupt with USR1 register RTSD bit set, which leads to invocation of
__imx_uart_rtsint(), which calls uart_handle_cts_change().
- If the RTS signal is deasserted, uart_handle_cts_change() clears
port->hw_stopped and unblocks the port for further data transfers.
- If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
and blocks the port for further data transfers. This may occur as the
last interrupt of a transfer, which means port->hw_stopped remains set
and the port remains blocked (**).
- Any further data transfer attempts will trigger imx_uart_mctrl_check(),
which will read current status of UART control signals by calling
imx_uart_get_hwmctrl() (***) and compare it with sport->old_status.
- If current status differs from sport->old_status for RTS signal,
uart_handle_cts_change() is called and possibly unblocks the port
by clearing port->hw_stopped.
- If current status does not differ from sport->old_status for RTS
signal, no action occurs. This may occur in case prior snapshot (*)
was taken before any transfer so the RTS is deasserted, current
snapshot (***) was taken after a transfer and therefore RTS is
deasserted again, which means current status and sport->old_status
are identical. In case (**) triggered when RTS got asserted, and
made port->hw_stopped set, the port->hw_stopped will remain set
because no change on RTS line is recognized by this driver and
uart_handle_cts_change() is not called from here to unblock the
port->hw_stopped.
Update sport->old_status in __imx_uart_rtsint() accordingly to make
imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
and TIOCM_RI bits in sport->old_status do not suffer from this problem.
Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
Cc: stable <stable(a)kernel.org>
Reviewed-by: Esben Haabendal <esben(a)geanix.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 67d4a72eda77..90974d338f3c 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
imx_uart_writel(sport, USR1_RTSD, USR1);
usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
+ /*
+ * Update sport->old_status here, so any follow-up calls to
+ * imx_uart_mctrl_check() will be able to recognize that RTS
+ * state changed since last imx_uart_mctrl_check() call.
+ *
+ * In case RTS has been detected as asserted here and later on
+ * deasserted by the time imx_uart_mctrl_check() was called,
+ * imx_uart_mctrl_check() can detect the RTS state change and
+ * trigger uart_handle_cts_change() to unblock the port for
+ * further TX transfers.
+ */
+ if (usr1 & USR1_RTSS)
+ sport->old_status |= TIOCM_CTS;
+ else
+ sport->old_status &= ~TIOCM_CTS;
uart_handle_cts_change(&sport->port, usr1);
wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102151-shove-lucid-37a2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (henceforth called the 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resolves these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102150-sneer-daughter-91eb@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (henceforth called the 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resolves these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
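
For readers following the save/restore logic in the patch above, here is a
minimal user-space sketch of the same idea. It is not the driver code: the
struct, the fake_* helpers and the plain integer "registers" are hypothetical
stand-ins, and the bit positions are only illustrative. It models a core whose
SUSPHY bits are remembered before system suspend, forced on for the suspend,
and put back to the remembered state on resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; treat them as assumptions, not datasheet values. */
#define FAKE_USB2_SUSPHY (1u << 6)
#define FAKE_USB3_SUSPHY (1u << 17)

/* Hypothetical stand-in for the controller state touched by the patch. */
struct fake_dwc {
	uint32_t usb2phycfg;   /* models GUSB2PHYCFG(0) */
	uint32_t usb3pipectl;  /* models GUSB3PIPECTL(0) */
	bool susphy_state;     /* SUSPHY state seen before system suspend */
};

/* Set or clear both SUSPHY bits together. */
static void fake_enable_susphy(struct fake_dwc *dwc, bool enable)
{
	assert(dwc);
	if (enable) {
		dwc->usb2phycfg |= FAKE_USB2_SUSPHY;
		dwc->usb3pipectl |= FAKE_USB3_SUSPHY;
	} else {
		dwc->usb2phycfg &= ~FAKE_USB2_SUSPHY;
		dwc->usb3pipectl &= ~FAKE_USB3_SUSPHY;
	}
}

/* System suspend: remember the current state, then make sure SUSPHY is on. */
static void fake_system_suspend(struct fake_dwc *dwc)
{
	assert(dwc);
	dwc->susphy_state = (dwc->usb2phycfg & FAKE_USB2_SUSPHY) ||
			    (dwc->usb3pipectl & FAKE_USB3_SUSPHY);
	if (!dwc->susphy_state)
		fake_enable_susphy(dwc, true);
}

/* System resume: restore whatever state was recorded at suspend time. */
static void fake_system_resume(struct fake_dwc *dwc)
{
	assert(dwc);
	fake_enable_susphy(dwc, dwc->susphy_state);
}
```

Starting from a state with both bits clear, fake_system_suspend() leaves both
bits set and fake_system_resume() clears them again, which mirrors the
"set before suspend, restore on resume" behaviour the commit message describes.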
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 705e3ce37bccdf2ed6f848356ff355f480d51a91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102149-useable-steep-0218@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 705e3ce37bccdf2ed6f848356ff355f480d51a91 Mon Sep 17 00:00:00 2001
From: Roger Quadros <rogerq(a)kernel.org>
Date: Fri, 11 Oct 2024 13:53:24 +0300
Subject: [PATCH] usb: dwc3: core: Fix system suspend on TI AM62 platforms
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"),
system suspend is broken on AM62 TI platforms.
Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY
bits (henceforth called 2 SUSPHY bits) were being set during core
initialization and even during core re-initialization after a system
suspend/resume.
These bits are required to be set for system suspend/resume to work correctly
on AM62 platforms.
Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget
driver is not loaded and started.
For Host mode, the 2 SUSPHY bits are set before the first system suspend but
get cleared at system resume during core re-init and are never set again.
This patch resolves these two issues by ensuring the 2 SUSPHY bits are set
before system suspend and restored to the original state during system resume.
Cc: stable(a)vger.kernel.org # v6.9+
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/
Signed-off-by: Roger Quadros <rogerq(a)kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Tested-by: Markus Schneider-Pargmann <msp(a)baylibre.com>
Reviewed-by: Dhruva Gole <d-gole(a)ti.com>
Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 21740e2b8f07..427e5660f87c 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
u32 reg;
int i;
+ dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) &
+ DWC3_GUSB2PHYCFG_SUSPHY) ||
+ (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) &
+ DWC3_GUSB3PIPECTL_SUSPHY);
+
switch (dwc->current_dr_role) {
case DWC3_GCTL_PRTCAP_DEVICE:
if (pm_runtime_suspended(dwc->dev))
@@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /*
+ * TI AM62 platform requires SUSPHY to be
+ * enabled for system suspend to work.
+ */
+ if (!dwc->susphy_state)
+ dwc3_enable_susphy(dwc, true);
+ }
+
return 0;
}
@@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
break;
}
+ if (!PMSG_IS_AUTO(msg)) {
+ /* restore SUSPHY state to that before system suspend. */
+ dwc3_enable_susphy(dwc, dwc->susphy_state);
+ }
+
return 0;
}
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 9c508e0c5cdf..eab81dfdcc35 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array {
* @sys_wakeup: set if the device may do system wakeup.
* @wakeup_configured: set if the device is configured for remote wakeup.
* @suspended: set to track suspend event due to U3/L2.
+ * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY
+ * before PM suspend.
* @imod_interval: set the interrupt moderation interval in 250ns
* increments or 0 to disable.
* @max_cfg_eps: current max number of IN eps used across all USB configs.
@@ -1382,6 +1384,7 @@ struct dwc3 {
unsigned sys_wakeup:1;
unsigned wakeup_configured:1;
unsigned suspended:1;
+ unsigned susphy_state:1;
u16 imod_interval;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 9499327714de7bc5cf6c792112c1474932d8ad31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102111-sandstone-affected-1fd4@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9499327714de7bc5cf6c792112c1474932d8ad31 Mon Sep 17 00:00:00 2001
From: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Date: Sun, 6 Oct 2024 19:26:31 -0400
Subject: [PATCH] usb: gadget: f_uac2: fix return value for
UAC2_ATTRIBUTE_STRING store
The configfs store callback should return the number of bytes consumed,
not the total number of bytes we actually stored. These could differ if,
for example, the passed-in string had a newline we did not store.
If the returned value does not match the number of bytes written the
writer might assume a failure or keep trying to write the remaining bytes.
For example the following command will hang trying to write the final
newline over and over again (tested on bash 2.05b):
echo foo > function_name
Fixes: 993a44fa85c1 ("usb: gadget: f_uac2: allow changing interface name via configfs")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/function/f_uac2.c b/drivers/usb/gadget/function/f_uac2.c
index 1cdda44455b3..ce5b77f89190 100644
--- a/drivers/usb/gadget/function/f_uac2.c
+++ b/drivers/usb/gadget/function/f_uac2.c
@@ -2061,7 +2061,7 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
const char *page, size_t len) \
{ \
struct f_uac2_opts *opts = to_f_uac2_opts(item); \
- int ret = 0; \
+ int ret = len; \
\
mutex_lock(&opts->lock); \
if (opts->refcnt) { \
@@ -2072,8 +2072,8 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
if (len && page[len - 1] == '\n') \
len--; \
\
- ret = scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
- "%s", page); \
+ scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
+ "%s", page); \
\
end: \
mutex_unlock(&opts->lock); \
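
The hang described in the commit message above can be reproduced in a small
user-space model. This is only a sketch under stated assumptions: struct
fake_opts, store_buggy(), store_fixed() and write_all() are hypothetical names,
and write_all() stands in for a writer that keeps retrying the unconsumed tail
of its buffer. It is not the configfs code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute storage. */
struct fake_opts {
	char name[32];
};

typedef long (*store_fn)(struct fake_opts *o, const char *page, size_t len);

/* Copy the string without a trailing newline; return the bytes kept. */
static size_t copy_name(struct fake_opts *o, const char *page, size_t len)
{
	if (len && page[len - 1] == '\n')
		len--;
	if (len > sizeof(o->name) - 1)
		len = sizeof(o->name) - 1;
	memcpy(o->name, page, len);
	o->name[len] = '\0';
	return len;
}

/* Old behaviour: report the number of bytes stored. */
static long store_buggy(struct fake_opts *o, const char *page, size_t len)
{
	return (long)copy_name(o, page, len);
}

/* New behaviour: report the number of bytes consumed, newline included. */
static long store_fixed(struct fake_opts *o, const char *page, size_t len)
{
	long ret = (long)len;

	copy_name(o, page, len);
	return ret;
}

/*
 * Model of a writer that retries the remaining bytes. Returns the number of
 * store calls needed, or -1 if it gives up after max_calls attempts.
 */
static int write_all(store_fn store, struct fake_opts *o,
		     const char *buf, size_t len, int max_calls)
{
	int calls = 0;

	assert(store && o && buf);
	while (len > 0) {
		long n;

		if (calls == max_calls)
			return -1;
		n = store(o, buf, len);
		calls++;
		if (n < 0)
			return -1;
		buf += n;
		len -= (size_t)n;
	}
	return calls;
}
```

With the input "foo\n", the fixed variant finishes in one call. The buggy
variant reports 3 bytes, is then handed the lone newline, reports 0 bytes for
it, and never makes progress, so write_all() gives up.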
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 9499327714de7bc5cf6c792112c1474932d8ad31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102111-unbalance-roman-5bdd@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9499327714de7bc5cf6c792112c1474932d8ad31 Mon Sep 17 00:00:00 2001
From: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Date: Sun, 6 Oct 2024 19:26:31 -0400
Subject: [PATCH] usb: gadget: f_uac2: fix return value for
UAC2_ATTRIBUTE_STRING store
The configfs store callback should return the number of bytes consumed,
not the total number of bytes we actually stored. These could differ if,
for example, the passed-in string had a newline we did not store.
If the returned value does not match the number of bytes written the
writer might assume a failure or keep trying to write the remaining bytes.
For example the following command will hang trying to write the final
newline over and over again (tested on bash 2.05b):
echo foo > function_name
Fixes: 993a44fa85c1 ("usb: gadget: f_uac2: allow changing interface name via configfs")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Kevin Groeneveld <kgroeneveld(a)lenbrook.com>
Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/gadget/function/f_uac2.c b/drivers/usb/gadget/function/f_uac2.c
index 1cdda44455b3..ce5b77f89190 100644
--- a/drivers/usb/gadget/function/f_uac2.c
+++ b/drivers/usb/gadget/function/f_uac2.c
@@ -2061,7 +2061,7 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
const char *page, size_t len) \
{ \
struct f_uac2_opts *opts = to_f_uac2_opts(item); \
- int ret = 0; \
+ int ret = len; \
\
mutex_lock(&opts->lock); \
if (opts->refcnt) { \
@@ -2072,8 +2072,8 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \
if (len && page[len - 1] == '\n') \
len--; \
\
- ret = scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
- "%s", page); \
+ scnprintf(opts->name, min(sizeof(opts->name), len + 1), \
+ "%s", page); \
\
end: \
mutex_unlock(&opts->lock); \
When BPF_TRAMP_F_CALL_ORIG is enabled, the address of a bpf_tramp_image
struct on the stack is passed during the size calculation pass and
an address on the heap is passed during code generation. This may
cause a heap buffer overflow if the heap address is tagged because
emit_a64_mov_i64() will emit longer code than it did during the size
calculation pass. The same problem could occur without tag-based
KASAN if one of the 16-bit words of the stack address happened to
be all-ones during the size calculation pass. Fix the problem by
assuming the worst case (4 instructions) when calculating the size
of the bpf_tramp_image address emission.
Fixes: 19d3c179a377 ("bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Link: https://linux-review.googlesource.com/id/I1496f2bc24fba7a1d492e16e2b94cf437…
Cc: stable(a)vger.kernel.org
---
arch/arm64/net/bpf_jit_comp.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 8bbd0b20136a8..5db82bfc9dc11 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2220,7 +2220,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
emit(A64_STR64I(A64_R(20), A64_SP, regs_off + 8), ctx);
if (flags & BPF_TRAMP_F_CALL_ORIG) {
- emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
+ /* for the first pass, assume the worst case */
+ if (!ctx->image)
+ ctx->idx += 4;
+ else
+ emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
emit_call((const u64)__bpf_tramp_enter, ctx);
}
@@ -2264,7 +2268,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
if (flags & BPF_TRAMP_F_CALL_ORIG) {
im->ip_epilogue = ctx->ro_image + ctx->idx;
- emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
+ /* for the first pass, assume the worst case */
+ if (!ctx->image)
+ ctx->idx += 4;
+ else
+ emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
emit_call((const u64)__bpf_tramp_exit, ctx);
}
--
2.47.0.rc1.288.g06298d1525-goog
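
To see why the two passes can disagree on size, it helps to count how many
instructions a 64-bit immediate load needs. The sketch below is a simplified
cost model, not the JIT's actual emitter: it assumes one MOVZ or MOVN for the
first 16-bit word plus one MOVK for every other word that differs from the
background pattern (all-zeros or all-ones, whichever is cheaper). The function
names and the example addresses are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Count the 16-bit words of val that differ from the given pattern. */
static int words_not_equal(uint64_t val, uint16_t pattern)
{
	int n = 0;

	for (int shift = 0; shift < 64; shift += 16)
		if (((val >> shift) & 0xffff) != pattern)
			n++;
	return n;
}

/*
 * Simplified instruction count for materialising val in a register:
 * start from all-zeros (MOVZ) or all-ones (MOVN), whichever leaves fewer
 * words to patch, and spend one instruction per remaining word.
 */
static int mov_i64_insn_count(uint64_t val)
{
	int from_zeros = words_not_equal(val, 0x0000);
	int from_ones = words_not_equal(val, 0xffff);
	int n = from_zeros < from_ones ? from_zeros : from_ones;

	if (n == 0)
		n = 1; /* a single MOVZ or MOVN still has to be emitted */
	assert(n <= 4); /* the worst case the patch reserves room for */
	return n;
}
```

Under this model an untagged kernel-style address such as 0xffff800080123456
costs 3 instructions because its top word is all-ones, while the same address
with a tag in the top byte (for example 0xf2ff800080123456) costs 4. Reserving
4 slots in the sizing pass covers both cases.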
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102146-theme-encircle-9ead@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find an available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfers was lost as xhci-dbgtty then turned
everything in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size.
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
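
The boundary tracking in the patch above can be modelled with plain byte
counts. The following is a rough user-space sketch, not the driver: struct
fake_port and the fake_* functions are hypothetical, the FIFO is treated as
unbounded so every write is accepted in full, and FAKE_MAX_PACKET is an
assumed packet size chosen for the example.

```c
#include <assert.h>

#define FAKE_MAX_PACKET 1024u /* assumed packet size for this sketch */

/* Hypothetical port state: only byte counts, no actual data. */
struct fake_port {
	unsigned int fifo_len;    /* bytes waiting in the FIFO */
	unsigned int tx_boundary; /* bytes left of the current write, 0 if none */
};

/* Accept a write only when the previous one has been fully packetised. */
static unsigned int fake_write(struct fake_port *p, unsigned int count)
{
	assert(p);
	if (p->tx_boundary)
		return 0; /* previous write not yet turned into packets */
	p->fifo_len += count;
	p->tx_boundary = p->fifo_len;
	return count;
}

/* Take the next packet: at most one max packet, never past the boundary. */
static unsigned int fake_next_packet(struct fake_port *p)
{
	unsigned int len;

	assert(p);
	assert(p->tx_boundary <= p->fifo_len);
	len = p->fifo_len;
	if (len == 0)
		return 0;
	if (len > FAKE_MAX_PACKET)
		len = FAKE_MAX_PACKET;
	if (p->tx_boundary && p->tx_boundary < len)
		len = p->tx_boundary;
	p->fifo_len -= len;
	if (p->tx_boundary)
		p->tx_boundary -= len;
	return len;
}
```

A 2500-byte write is split into packets of 1024, 1024 and 452 bytes, and a
second write attempted in between returns 0 until the last short packet has
been taken. That is the "full packets, then one short packet per tty write"
shape the commit message asks for.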
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102144-cinch-foster-09d5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find an available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfers was lost as xhci-dbgtty then turned
everything in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size.
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102143-chevy-extras-add3@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find an available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfers was lost as xhci-dbgtty then turned
everything in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size.
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 30c9ae5ece8ecd69d36e6912c2c0896418f2468c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102142-eskimo-lumber-a654@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30c9ae5ece8ecd69d36e6912c2c0896418f2468c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 17:00:00 +0300
Subject: [PATCH] xhci: dbc: honor usb transfer size boundaries.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find an available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfers was lost as xhci-dbgtty then turned
everything in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size.
Tested-by: Uday M Bhat <uday.m.bhat(a)intel.com>
Tested-by: Łukasz Bartosik <ukaszb(a)chromium.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 8ec813b6e9fd..9dc8f4d8077c 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -110,6 +110,7 @@ struct dbc_port {
struct tasklet_struct push;
struct list_head write_pool;
+ unsigned int tx_boundary;
bool registered;
};
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index b8e78867e25a..d719c16ea30b 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc)
return dbc->priv;
}
+static unsigned int
+dbc_kfifo_to_req(struct dbc_port *port, char *packet)
+{
+ unsigned int len;
+
+ len = kfifo_len(&port->port.xmit_fifo);
+
+ if (len == 0)
+ return 0;
+
+ len = min(len, DBC_MAX_PACKET);
+
+ if (port->tx_boundary)
+ len = min(port->tx_boundary, len);
+
+ len = kfifo_out(&port->port.xmit_fifo, packet, len);
+
+ if (port->tx_boundary)
+ port->tx_boundary -= len;
+
+ return len;
+}
+
static int dbc_start_tx(struct dbc_port *port)
__releases(&port->port_lock)
__acquires(&port->port_lock)
@@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port)
while (!list_empty(pool)) {
req = list_entry(pool->next, struct dbc_request, list_pool);
- len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET);
+ len = dbc_kfifo_to_req(port, req->buf);
if (len == 0)
break;
do_tty_wake = true;
@@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf,
{
struct dbc_port *port = tty->driver_data;
unsigned long flags;
+ unsigned int written = 0;
spin_lock_irqsave(&port->port_lock, flags);
- if (count)
- count = kfifo_in(&port->port.xmit_fifo, buf, count);
- dbc_start_tx(port);
+
+ /*
+ * Treat tty write as one usb transfer. Make sure the writes are turned
+ * into TRB request having the same size boundaries as the tty writes.
+ * Don't add data to kfifo before previous write is turned into TRBs
+ */
+ if (port->tx_boundary) {
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ return 0;
+ }
+
+ if (count) {
+ written = kfifo_in(&port->port.xmit_fifo, buf, count);
+
+ if (written == count)
+ port->tx_boundary = kfifo_len(&port->port.xmit_fifo);
+
+ dbc_start_tx(port);
+ }
+
spin_unlock_irqrestore(&port->port_lock, flags);
- return count;
+ return written;
}
static int dbc_tty_put_char(struct tty_struct *tty, u8 ch)
@@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty)
spin_lock_irqsave(&port->port_lock, flags);
room = kfifo_avail(&port->port.xmit_fifo);
+
+ if (port->tx_boundary)
+ room = 0;
+
spin_unlock_irqrestore(&port->port_lock, flags);
return room;
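
The boundary handling in the hunks above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `model_*` names are hypothetical, the FIFO is reduced to a byte counter, and the values of MODEL_MAX_PACKET and MODEL_FIFO_SIZE are assumed rather than taken from the driver. It shows one tty write being cut into requests that never cross the write boundary, and a second write being refused until the first is fully drained.

```c
#include <assert.h>

#define MODEL_MAX_PACKET 1024u  /* stand-in for DBC_MAX_PACKET; value assumed */
#define MODEL_FIFO_SIZE  4096u  /* hypothetical FIFO capacity */

struct model_port {
    unsigned int fifo_len;     /* bytes queued, models kfifo_len() */
    unsigned int tx_boundary;  /* bytes of the current write not yet in a request */
};

/* Models dbc_tty_write(): accept nothing while a previous write is pending. */
static unsigned int model_write(struct model_port *p, unsigned int count)
{
    unsigned int room, written;

    assert(p->fifo_len <= MODEL_FIFO_SIZE);

    if (p->tx_boundary || count == 0)
        return 0;

    room = MODEL_FIFO_SIZE - p->fifo_len;
    written = count < room ? count : room;
    p->fifo_len += written;

    /* Only a write that fit completely sets a boundary, as in the patch. */
    if (written == count)
        p->tx_boundary = p->fifo_len;

    return written;
}

/* Models dbc_kfifo_to_req(): size of the next request taken from the FIFO. */
static unsigned int model_next_req(struct model_port *p)
{
    unsigned int len = p->fifo_len;

    if (len == 0)
        return 0;
    if (len > MODEL_MAX_PACKET)
        len = MODEL_MAX_PACKET;
    if (p->tx_boundary && p->tx_boundary < len)
        len = p->tx_boundary;

    p->fifo_len -= len;
    if (p->tx_boundary)
        p->tx_boundary -= len;

    return len;
}
```

With these assumed sizes, a 2500-byte write is drained as requests of 1024, 1024 and 452 bytes, and model_write() returns 0 for any further write until the last of them has been taken.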
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x fe49df60cdb7c2975aa743dc295f8786e4b7db10
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102120-valid-uncured-dcca@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fe49df60cdb7c2975aa743dc295f8786e4b7db10 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 16 Oct 2024 16:59:58 +0300
Subject: [PATCH] xhci: Mitigate failed set dequeue pointer commands
Prevent the xHC host from processing a cancelled URB by always turning
cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command.
If the command fails, the xHC will start processing the cancelled TD
instead of skipping it once the endpoint is restarted, causing issues such
as Babble errors.
This is not a complete solution as a failed 'Set TR Deq' command does not
guarantee xHC TRB caches are cleared.
Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 4d664ba53fe9..7dedf31bbddd 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1023,7 +1023,7 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
td_to_noop(xhci, ring, cached_td, false);
cached_td->cancel_status = TD_CLEARED;
}
-
+ td_to_noop(xhci, ring, td, false);
td->cancel_status = TD_CLEARING_CACHE;
cached_td = td;
break;
Hi Greg! Please consider picking up the following two bluetooth fixes
for the next round of stable updates, they fix problems quite a few
users hit in various stable series due to backports:
4084286151fc91 ("Bluetooth: btusb: Fix not being able to reconnect after
suspend") [v6.12-rc4] for 6.11.y
and
2c1dda2acc4192 ("Bluetooth: btusb: Fix regression with fake CSR
controllers 0a12:0001") [v6.12-rc4] for 5.10.y and later
For details see also:
https://lore.kernel.org/all/CABBYNZL0_j4EDWzDS=kXc1Vy0D6ToU+oYnP_uBWTKoXbEa…
tia!
Ciao, Thorsten
From: Eric Biggers <ebiggers(a)google.com>
Fix the kconfig option for the tas2781 HDA driver to select CRC32 rather
than CRC32_SARWATE. CRC32_SARWATE is an option from the kconfig
'choice' that selects the specific CRC32 implementation. Selecting a
'choice' option seems to have no effect, but even if it did work, it
would be incorrect for a random driver to override the user's choice.
CRC32 is the correct option to select for crc32() to be available.
Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
sound/pci/hda/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/hda/Kconfig b/sound/pci/hda/Kconfig
index bb15a0248250c..68f1eee9e5c93 100644
--- a/sound/pci/hda/Kconfig
+++ b/sound/pci/hda/Kconfig
@@ -196,11 +196,11 @@ config SND_HDA_SCODEC_TAS2781_I2C
depends on ACPI
depends on EFI
depends on SND_SOC
select SND_SOC_TAS2781_COMLIB
select SND_SOC_TAS2781_FMWLIB
- select CRC32_SARWATE
+ select CRC32
help
Say Y or M here to include TAS2781 I2C HD-audio side codec support
in snd-hda-intel driver, such as ALC287.
comment "Set to Y if you want auto-loading the side codec driver"
base-commit: 715ca9dd687f89ddaac8ec8ccb3b5e5a30311a99
--
2.47.0
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x a985576af824426e33100554a5958a6beda60a13
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102110-tableful-unnatural-5dab@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a985576af824426e33100554a5958a6beda60a13 Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Thu, 3 Oct 2024 23:04:52 +0200
Subject: [PATCH] iio: adc: ti-lmp92064: add missing select
IIO_(TRIGGERED_)BUFFER in Kconfig
This driver makes use of triggered buffers, but does not select the
required modules.
Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
Fixes: 6c7bc1d27bb2 ("iio: adc: ti-lmp92064: add buffering support")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Link: https://patch.msgid.link/20241003-iio-select-v1-6-67c0385197cd@gmail.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
index 68640fa26f4e..c1197ee3dc68 100644
--- a/drivers/iio/adc/Kconfig
+++ b/drivers/iio/adc/Kconfig
@@ -1530,6 +1530,8 @@ config TI_LMP92064
tristate "Texas Instruments LMP92064 ADC driver"
depends on SPI
select REGMAP_SPI
+ select IIO_BUFFER
+ select IIO_TRIGGERED_BUFFER
help
Say yes here to build support for the LMP92064 Precision Current and Voltage
sensor.
Hi,
After upgrading to 6.6.57 I noticed that my IPv6 firewall config failed to load.
Quick investigation flagged NFLOG to be the issue:
# ip6tables -I INPUT -j NFLOG
Warning: Extension NFLOG revision 0 not supported, missing kernel module?
ip6tables: No chain/target/match by that name.
The regression is caused by the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/…
More precisely, the bug is in the change below:
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+ {
+ .name = "NFLOG",
+ .revision = 0,
+ .family = NFPROTO_IPV4,
+ .checkentry = nflog_tg_check,
+ .destroy = nflog_tg_destroy,
+ .target = nflog_tg,
+ .targetsize = sizeof(struct xt_nflog_info),
+ .me = THIS_MODULE,
+ },
+#endif
Replacing NFPROTO_IPV4 with NFPROTO_IPV6 fixed the issue.
Looking at the commit, it seems that at least one more target (MARK) may also be impacted:
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+ {
+ .name = "MARK",
+ .revision = 2,
+ .family = NFPROTO_IPV4,
+ .target = mark_tg,
+ .targetsize = sizeof(struct xt_mark_tginfo2),
+ .me = THIS_MODULE,
+ },
+#endif
The same errors seem to be present in the main tree:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
I also suspect other -stable trees may be impacted by the same issue.
Best regards,
Krzysztof Olędzki
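
To make the failure mode concrete, here is a minimal userspace sketch of a target lookup keyed on protocol family. It is an assumption-laden model, not the x_tables implementation: the `model_*` names and the enum are hypothetical, and the real lookup is more involved. It only shows why a second NFLOG entry registered with the IPv4 family leaves an IPv6 lookup with nothing to match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for NFPROTO_IPV4 / NFPROTO_IPV6. */
enum model_family { MODEL_NFPROTO_IPV4, MODEL_NFPROTO_IPV6 };

struct model_target {
    const char *name;
    unsigned int revision;
    enum model_family family;
};

/* Registration as in the quoted hunk: the IPv6 slot carries the IPv4 family. */
static const struct model_target model_buggy_targets[] = {
    { "NFLOG", 0, MODEL_NFPROTO_IPV4 },
    { "NFLOG", 0, MODEL_NFPROTO_IPV4 },
};

/* Registration with the family corrected for the second entry. */
static const struct model_target model_fixed_targets[] = {
    { "NFLOG", 0, MODEL_NFPROTO_IPV4 },
    { "NFLOG", 0, MODEL_NFPROTO_IPV6 },
};

/* Return the entry matching family, name and revision, or NULL if none. */
static const struct model_target *
model_find_target(const struct model_target *tbl, size_t n,
                  enum model_family family, const char *name,
                  unsigned int revision)
{
    assert(name != NULL);

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].family == family &&
            tbl[i].revision == revision &&
            strcmp(tbl[i].name, name) == 0)
            return &tbl[i];
    }
    return NULL;
}
```

In this model an IPv6 lookup for "NFLOG" revision 0 returns NULL against model_buggy_targets and succeeds against model_fixed_targets, which mirrors the ip6tables symptom described above.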
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
When multiple ODR switches happen while the FIFO is off, the change may
not be taken into account if you return to the previous FIFO-on value.
For example, if you run the sensor buffer at 50Hz, stop, change to
200Hz, then back to 50Hz and restart the buffer, data will be timestamped
at 200Hz. This is due to testing against mult and not new_mult.
To prevent this, let's just run apply_odr automatically when FIFO is
off. It will also simplify driver code.
Update inv_mpu6050 and inv_icm42600 to delete now useless apply_odr.
Fixes: 95444b9eeb8c ("iio: invensense: fix odr switching to same value")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
---
drivers/iio/common/inv_sensors/inv_sensors_timestamp.c | 4 ++++
drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c | 1 -
drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c | 1 -
drivers/iio/imu/inv_mpu6050/inv_mpu_trigger.c | 1 -
4 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
index f44458c380d92823ce2e7e5f78ca877ea4c06118..37d0bdaa8d824f79dcd2f341be7501d249926951 100644
--- a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
+++ b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
@@ -70,6 +70,10 @@ int inv_sensors_timestamp_update_odr(struct inv_sensors_timestamp *ts,
if (mult != ts->mult)
ts->new_mult = mult;
+ /* When FIFO is off, directly apply the new ODR */
+ if (!fifo)
+ inv_sensors_timestamp_apply_odr(ts, 0, 0, 0);
+
return 0;
}
EXPORT_SYMBOL_NS_GPL(inv_sensors_timestamp_update_odr, IIO_INV_SENSORS_TIMESTAMP);
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c
index 56ac198142500a2e1fc40b62cdd465cc736d8bf0..d061a64ebbf71859a3bc44644a14137dff0f9efe 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c
@@ -229,7 +229,6 @@ static int inv_icm42600_accel_update_scan_mode(struct iio_dev *indio_dev,
}
/* update data FIFO write */
- inv_sensors_timestamp_apply_odr(ts, 0, 0, 0);
ret = inv_icm42600_buffer_set_fifo_en(st, fifo_en | st->fifo.en);
out_unlock:
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c
index 938af5b640b00f58d2b8185f752c4755edfb0d25..f1e5a9648c4f5dd34f40136d02c72c90473eff37 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c
@@ -128,7 +128,6 @@ static int inv_icm42600_gyro_update_scan_mode(struct iio_dev *indio_dev,
}
/* update data FIFO write */
- inv_sensors_timestamp_apply_odr(ts, 0, 0, 0);
ret = inv_icm42600_buffer_set_fifo_en(st, fifo_en | st->fifo.en);
out_unlock:
diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_trigger.c b/drivers/iio/imu/inv_mpu6050/inv_mpu_trigger.c
index 3bfeabab0ec4f6fa28fbbcd47afe92af5b8a58e2..5b1088cc3704f1ad1288a0d65b2f957b91455d7f 100644
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_trigger.c
+++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_trigger.c
@@ -112,7 +112,6 @@ int inv_mpu6050_prepare_fifo(struct inv_mpu6050_state *st, bool enable)
if (enable) {
/* reset timestamping */
inv_sensors_timestamp_reset(&st->timestamp);
- inv_sensors_timestamp_apply_odr(&st->timestamp, 0, 0, 0);
/* reset FIFO */
d = st->chip_config.user_ctrl | INV_MPU6050_BIT_FIFO_RST;
ret = regmap_write(st->map, st->reg->user_ctrl, d);
---
base-commit: c3e9df514041ec6c46be83801b1891392f4522f7
change-id: 20241017-invn-inv-sensors-timestamp-fix-switch-fifo-off-3f29110e95d0
Best regards,
--
Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
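
The 50Hz -> 200Hz -> 50Hz case from the description can be traced with a reduced userspace model. This is a sketch only: the `model_*` names and the multiplier values are assumptions, and the real functions also handle the period computation and the pending-change check while the FIFO is on, both omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct model_ts {
    uint32_t mult;      /* multiplier currently used for timestamping */
    uint32_t new_mult;  /* pending multiplier, 0 when none is pending */
};

/* Models the core of apply_odr: commit a pending multiplier, if any. */
static void model_apply_odr(struct model_ts *ts)
{
    if (ts->new_mult == 0)
        return;
    ts->mult = ts->new_mult;
    ts->new_mult = 0;
}

/*
 * Models update_odr. With apply_when_fifo_off set it behaves like the
 * patched function and commits immediately when the FIFO is off.
 */
static void model_update_odr(struct model_ts *ts, uint32_t mult, bool fifo,
                             bool apply_when_fifo_off)
{
    assert(mult != 0);

    if (mult != ts->mult)
        ts->new_mult = mult;

    if (apply_when_fifo_off && !fifo)
        model_apply_odr(ts);
}

/* Run at 50Hz, stop, switch to 200Hz, switch back to 50Hz, restart. */
static uint32_t model_switch_back(bool patched)
{
    const uint32_t mult_50hz = 20, mult_200hz = 5;  /* assumed values */
    struct model_ts ts = { .mult = mult_50hz, .new_mult = 0 };

    model_update_odr(&ts, mult_200hz, false, patched);
    model_update_odr(&ts, mult_50hz, false, patched);

    /* What the drivers did on buffer start before the patch; a no-op after. */
    model_apply_odr(&ts);

    return ts.mult;
}
```

Without the patch the second switch compares 50Hz against the unchanged mult, leaves the stale 200Hz value in new_mult, and the restart commits it. With the patch each switch is committed at once, so the model ends on the 50Hz multiplier.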
From: Zijun Hu <quic_zijuhu(a)quicinc.com>
The comment for devm_pci_epc_destroy() says it destroys the EPC device,
but it does not actually do so, and therefore cannot fully undo what
devm_pci_epc_create() does. Fix this by using devres_release() instead
of devres_destroy() within the API.
Fixes: 5e8cb4033807 ("PCI: endpoint: Add EP core layer to enable EP controller and EP functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
drivers/pci/endpoint/pci-epc-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index 17f007109255..71b6d100056e 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -857,7 +857,7 @@ void devm_pci_epc_destroy(struct device *dev, struct pci_epc *epc)
{
int r;
- r = devres_destroy(dev, devm_pci_epc_release, devm_pci_epc_match,
+ r = devres_release(dev, devm_pci_epc_release, devm_pci_epc_match,
epc);
dev_WARN_ONCE(dev, r, "couldn't find PCI EPC resource\n");
}
---
base-commit: 715ca9dd687f89ddaac8ec8ccb3b5e5a30311a99
change-id: 20241020-pci-epc-core_fix-a92512fa9d19
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
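
The difference the fix relies on can be shown with a toy model of a managed resource. This is a hedged sketch, not the devres implementation: the `model_*` names are hypothetical and a single struct stands in for the per-device resource list. The only point is that the destroy variant drops the entry without running its release callback, while the release variant drops it and runs the callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_res {
    bool registered;              /* entry is still on the device's list */
    void (*release)(void *data);  /* callback that undoes the acquisition */
    void *data;
};

/* Example release callback: count how often it ran. */
static void model_count_release(void *data)
{
    ++*(int *)data;
}

/* Models devres_destroy(): remove the entry, do NOT call release. */
static int model_devres_destroy(struct model_res *r)
{
    if (!r->registered)
        return -ENOENT;
    r->registered = false;
    return 0;
}

/* Models devres_release(): remove the entry and call release. */
static int model_devres_release(struct model_res *r)
{
    assert(r->release != NULL);

    if (!r->registered)
        return -ENOENT;
    r->registered = false;
    r->release(r->data);
    return 0;
}
```

Applied to the patch, the release callback is what actually destroys the EPC device, so only the release variant undoes the matching devm create call.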
From: Zijun Hu <quic_zijuhu(a)quicinc.com>
The comment for devm_usb_put_phy() says it invokes usb_put_phy() to
release the phy, but it does not actually do so, and therefore cannot
fully undo what devm_usb_get_phy() does. Fix this by using
devres_release() instead of devres_destroy() within the API.
Fixes: cedf8602373a ("usb: phy: move bulk of otg/otg.c to phy/phy.c")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
The API is directly used by drivers/usb/musb/sunxi.c. Sorry, I can't
evaluate the relevant impact since I know nothing about sunxi.
---
drivers/usb/phy/phy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/phy/phy.c b/drivers/usb/phy/phy.c
index 06e0fb23566c..06f789097989 100644
--- a/drivers/usb/phy/phy.c
+++ b/drivers/usb/phy/phy.c
@@ -628,7 +628,7 @@ void devm_usb_put_phy(struct device *dev, struct usb_phy *phy)
{
int r;
- r = devres_destroy(dev, devm_usb_phy_release, devm_usb_phy_match, phy);
+ r = devres_release(dev, devm_usb_phy_release, devm_usb_phy_match, phy);
dev_WARN_ONCE(dev, r, "couldn't find PHY resource\n");
}
EXPORT_SYMBOL_GPL(devm_usb_put_phy);
---
base-commit: 07b887f8236eb3ed52f1fe83e385e6436dc4b052
change-id: 20241020-usb_phy_fix-9d7c67ef4ab4
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 8fa075804cb3b00960dd5c06554308175c834530
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102014-dorsal-renounce-5242@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8fa075804cb3b00960dd5c06554308175c834530 Mon Sep 17 00:00:00 2001
From: Peter Wang <peter.wang(a)mediatek.com>
Date: Tue, 1 Oct 2024 17:19:17 +0800
Subject: [PATCH] scsi: ufs: core: Requeue aborted request
After the SQ cleanup fix, the CQ will receive a response with the
corresponding tag marked as OCS: ABORTED. To align with the behavior of
Legacy SDB mode, the handling of OCS: ABORTED has been changed to match
that of OCS_INVALID_COMMAND_STATUS (SDB), with both returning a SCSI
result of DID_REQUEUE.
Furthermore, the workaround implemented before the SQ cleanup fix can be
removed.
Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: Peter Wang <peter.wang(a)mediatek.com>
Link: https://lore.kernel.org/r/20241001091917.6917-3-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 6a71ebf953e2..f845166dc0d7 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5416,10 +5416,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
}
break;
case OCS_ABORTED:
- result |= DID_ABORT << 16;
- break;
case OCS_INVALID_COMMAND_STATUS:
result |= DID_REQUEUE << 16;
+ dev_warn(hba->dev,
+ "OCS %s from controller for tag %d\n",
+ (ocs == OCS_ABORTED ? "aborted" : "invalid"),
+ lrbp->task_tag);
break;
case OCS_INVALID_CMD_TABLE_ATTR:
case OCS_INVALID_PRDT_ATTR:
@@ -6465,26 +6467,12 @@ static bool ufshcd_abort_one(struct request *rq, void *priv)
struct scsi_device *sdev = cmd->device;
struct Scsi_Host *shost = sdev->host;
struct ufs_hba *hba = shost_priv(shost);
- struct ufshcd_lrb *lrbp = &hba->lrb[tag];
- struct ufs_hw_queue *hwq;
- unsigned long flags;
*ret = ufshcd_try_to_abort_task(hba, tag);
dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag,
hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1,
*ret ? "failed" : "succeeded");
- /* Release cmd in MCQ mode if abort succeeds */
- if (hba->mcq_enabled && (*ret == 0)) {
- hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp->cmd));
- if (!hwq)
- return 0;
- spin_lock_irqsave(&hwq->cq_lock, flags);
- if (ufshcd_cmd_inflight(lrbp->cmd))
- ufshcd_release_scsi_cmd(hba, lrbp);
- spin_unlock_irqrestore(&hwq->cq_lock, flags);
- }
-
return *ret == 0;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 19a198b67767d952c8f3d0cf24eb3100522a8223
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102050-unequal-radiator-f679@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19a198b67767d952c8f3d0cf24eb3100522a8223 Mon Sep 17 00:00:00 2001
From: Seunghwan Baek <sh8267.baek(a)samsung.com>
Date: Thu, 29 Aug 2024 18:39:13 +0900
Subject: [PATCH] scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down
A deadlock has been observed when a reboot is performed early in the
boot process. UFS shutdown set SDEV_QUIESCE for the scsi_devices of all
LUs while the audio driver was blocked in blk_mq_submit_bio(), holding a
mutex, as it read its firmware binary. The audio driver's shutdown then
waited for that mutex to be released, resulting in a deadlock. To solve
this, set SDEV_OFFLINE for all LUs except WLUN, so that any I/O that
comes down after a UFS shutdown will return an error.
[ 31.907781]I[0: swapper/0: 0] 1 130705007 1651079834 11289729804 0 D( 2) 3 ffffff882e208000 * init [device_shutdown]
[ 31.907793]I[0: swapper/0: 0] Mutex: 0xffffff8849a2b8b0: owner[0xffffff882e28cb00 kworker/6:0 :49]
[ 31.907806]I[0: swapper/0: 0] Call trace:
[ 31.907810]I[0: swapper/0: 0] __switch_to+0x174/0x338
[ 31.907819]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc
[ 31.907826]I[0: swapper/0: 0] schedule+0x7c/0xe8
[ 31.907834]I[0: swapper/0: 0] schedule_preempt_disabled+0x24/0x40
[ 31.907842]I[0: swapper/0: 0] __mutex_lock+0x408/0xdac
[ 31.907849]I[0: swapper/0: 0] __mutex_lock_slowpath+0x14/0x24
[ 31.907858]I[0: swapper/0: 0] mutex_lock+0x40/0xec
[ 31.907866]I[0: swapper/0: 0] device_shutdown+0x108/0x280
[ 31.907875]I[0: swapper/0: 0] kernel_restart+0x4c/0x11c
[ 31.907883]I[0: swapper/0: 0] __arm64_sys_reboot+0x15c/0x280
[ 31.907890]I[0: swapper/0: 0] invoke_syscall+0x70/0x158
[ 31.907899]I[0: swapper/0: 0] el0_svc_common+0xb4/0xf4
[ 31.907909]I[0: swapper/0: 0] do_el0_svc+0x2c/0xb0
[ 31.907918]I[0: swapper/0: 0] el0_svc+0x34/0xe0
[ 31.907928]I[0: swapper/0: 0] el0t_64_sync_handler+0x68/0xb4
[ 31.907937]I[0: swapper/0: 0] el0t_64_sync+0x1a0/0x1a4
[ 31.908774]I[0: swapper/0: 0] 49 0 11960702 11236868007 0 D( 2) 6 ffffff882e28cb00 * kworker/6:0 [__bio_queue_enter]
[ 31.908783]I[0: swapper/0: 0] Call trace:
[ 31.908788]I[0: swapper/0: 0] __switch_to+0x174/0x338
[ 31.908796]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc
[ 31.908803]I[0: swapper/0: 0] schedule+0x7c/0xe8
[ 31.908811]I[0: swapper/0: 0] __bio_queue_enter+0xb8/0x178
[ 31.908818]I[0: swapper/0: 0] blk_mq_submit_bio+0x194/0x67c
[ 31.908827]I[0: swapper/0: 0] __submit_bio+0xb8/0x19c
Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
Cc: stable(a)vger.kernel.org
Signed-off-by: Seunghwan Baek <sh8267.baek(a)samsung.com>
Link: https://lore.kernel.org/r/20240829093913.6282-2-sh8267.baek@samsung.com
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index f845166dc0d7..706dc81eb924 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -10197,7 +10197,9 @@ static void ufshcd_wl_shutdown(struct device *dev)
shost_for_each_device(sdev, hba->host) {
if (sdev == hba->ufs_device_wlun)
continue;
- scsi_device_quiesce(sdev);
+ mutex_lock(&sdev->state_mutex);
+ scsi_device_set_state(sdev, SDEV_OFFLINE);
+ mutex_unlock(&sdev->state_mutex);
}
__ufshcd_wl_suspend(hba, UFS_SHUTDOWN_PM);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e972b08b91ef48488bae9789f03cfedb148667fb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102019-roundness-penholder-bb3f@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e972b08b91ef48488bae9789f03cfedb148667fb Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov(a)fb.com>
Date: Tue, 15 Oct 2024 10:59:46 -0700
Subject: [PATCH] blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function
race
We're seeing crashes from rq_qos_wake_function that look like this:
BUG: unable to handle page fault for address: ffffafe180a40084
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
Oops: Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
try_to_wake_up+0x5a/0x6a0
rq_qos_wake_function+0x71/0x80
__wake_up_common+0x75/0xa0
__wake_up+0x36/0x60
scale_up.part.0+0x50/0x110
wb_timer_fn+0x227/0x450
...
So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).
p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.
What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:
rq_qos_wait() rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
data->got_token = true;
list_del_init(&curr->entry);
if (data.got_token)
break;
finish_wait(&rqw->wait, &data.wq);
^- returns immediately because
list_empty_careful(&wq_entry->entry)
is true
... return, go do something else ...
wake_up_process(data->task)
(NO LONGER VALID!)-^
Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.
The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.
Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
Fixes: 38cfb5a45ee0 ("blk-wbt: improve waking of tasks")
Cc: stable(a)vger.kernel.org
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.17290145…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 2cfb297d9a62..058f92c4f9d5 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -219,8 +219,8 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr,
data->got_token = true;
smp_wmb();
- list_del_init(&curr->entry);
wake_up_process(data->task);
+ list_del_init_careful(&curr->entry);
return 1;
}
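
The ordering change can be checked in a single-threaded userspace model. This is a sketch under assumptions: the `model_*` names are hypothetical, a C11 atomic flag stands in for the waitqueue entry being linked, and a release store stands in for list_del_init_careful(). It does not reproduce the race itself; it only records whether the entry was still linked at the moment the wakeup touched the waiter's data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_waiter {
    atomic_bool on_queue;   /* models the waitqueue entry being linked */
    bool got_token;
    bool linked_at_wake;    /* recorded by model_wake() for inspection */
};

static void model_waiter_init(struct model_waiter *w)
{
    atomic_init(&w->on_queue, true);
    w->got_token = false;
    w->linked_at_wake = false;
}

/* Stands in for wake_up_process(data->task): it reads the waiter's data. */
static void model_wake(struct model_waiter *w)
{
    w->linked_at_wake =
        atomic_load_explicit(&w->on_queue, memory_order_relaxed);
}

/* Models rq_qos_wake_function() before (patched == false) and after. */
static int model_wake_function(struct model_waiter *w, bool patched)
{
    assert(atomic_load(&w->on_queue));  /* entry must be queued on entry */

    w->got_token = true;
    if (patched) {
        /* Use the entry first, then unlink it with release semantics. */
        model_wake(w);
        atomic_store_explicit(&w->on_queue, false, memory_order_release);
    } else {
        /* Old order: unlink first, then touch data the waiter may reuse. */
        atomic_store_explicit(&w->on_queue, false, memory_order_relaxed);
        model_wake(w);
    }
    return 1;
}
```

In the old order the wakeup runs after the entry is already unlinked, which is the window in which the waiter can return and reuse its stack. In the patched order the wakeup completes while the entry is still linked.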
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 2c02f7375e658ae93d57a31a66f91b62754ef8f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102001-badly-overvalue-6662@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02f7375e658ae93d57a31a66f91b62754ef8f1 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 18 Oct 2024 21:43:00 -0400
Subject: [PATCH] fgraph: Use CPU hotplug mechanism to initialize idle shadow
stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
This issue can be reproduced by:
# cd /sys/kernel/tracing
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > set_ftrace_pid
# echo function_graph > current_tracer
# echo 1 > options/funcgraph-proc
# echo 1 > /sys/devices/system/cpu/cpu1/online
# grep '<idle>' per_cpu/cpu1/trace | head
Before, nothing would show up.
After:
1) <idle>-0 | 0.811 us | __enqueue_entity();
1) <idle>-0 | 5.626 us | } /* enqueue_entity */
1) <idle>-0 | | dl_server_update_idle_time() {
1) <idle>-0 | | dl_scaled_delta_exec() {
1) <idle>-0 | 0.450 us | arch_scale_cpu_capacity();
1) <idle>-0 | 1.242 us | }
1) <idle>-0 | 1.908 us | }
1) <idle>-0 | | dl_server_start() {
1) <idle>-0 | | enqueue_dl_entity() {
1) <idle>-0 | | task_contending() {
Note, if tracing stops and restarts, the old way would then initialize
the onlined CPUs.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/20241018214300.6df82178@rorschach
Fixes: 868baf07b1a25 ("ftrace: Fix memory leak with function graph and cpu hotplug")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d7d4fb403f6f..43f4e3f57438 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1160,19 +1160,13 @@ void fgraph_update_pid_func(void)
static int start_graph_tracing(void)
{
unsigned long **ret_stack_list;
- int ret, cpu;
+ int ret;
ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL);
if (!ret_stack_list)
return -ENOMEM;
- /* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu) {
- if (!idle_task(cpu)->ret_stack)
- ftrace_graph_init_idle_task(idle_task(cpu), cpu);
- }
-
do {
ret = alloc_retstack_tasklist(ret_stack_list);
} while (ret == -EAGAIN);
@@ -1242,14 +1236,34 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
}
+/* The cpu_boot init_task->ret_stack will never be freed */
+static int fgraph_cpu_init(unsigned int cpu)
+{
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);
+ return 0;
+}
+
int register_ftrace_graph(struct fgraph_ops *gops)
{
+ static bool fgraph_initialized;
int command = 0;
int ret = 0;
int i = -1;
mutex_lock(&ftrace_lock);
+ if (!fgraph_initialized) {
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init",
+ fgraph_cpu_init, NULL);
+ if (ret < 0) {
+ pr_warn("fgraph: Error to init cpu hotplug support\n");
+ return ret;
+ }
+ fgraph_initialized = true;
+ ret = 0;
+ }
+
if (!fgraph_array[0]) {
/* The array must always have real data on it */
for (i = 0; i < FGRAPH_ARRAY_SIZE; i++)
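
The behavioural difference described in the commit message can be sketched with a small userspace model. Everything here is an assumption made for illustration: the `model_*` names are hypothetical, a boolean array stands in for the idle tasks' shadow stacks, and the hotplug registration is modelled as running the callback for CPUs that are already online and again for each CPU that comes online later.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4  /* hypothetical CPU count */

struct model_system {
    bool online[MODEL_NR_CPUS];
    bool has_shadow_stack[MODEL_NR_CPUS];  /* idle task's stack allocated */
    bool hotplug_cb_registered;
};

/* Stand-in for fgraph_cpu_init(): give the CPU's idle task a shadow stack. */
static void model_cpu_init(struct model_system *s, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    s->has_shadow_stack[cpu] = true;
}

/* Old behaviour: only CPUs online when tracing starts are initialized. */
static void model_start_tracing_old(struct model_system *s)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (s->online[cpu])
            model_cpu_init(s, cpu);
}

/* New behaviour: register a callback, which also runs for online CPUs. */
static void model_start_tracing_new(struct model_system *s)
{
    s->hotplug_cb_registered = true;
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (s->online[cpu])
            model_cpu_init(s, cpu);
}

/* A CPU comes online; a registered callback runs for it. */
static void model_cpu_up(struct model_system *s, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    s->online[cpu] = true;
    if (s->hotplug_cb_registered)
        model_cpu_init(s, cpu);
}
```

With CPU 1 offline when tracing starts, the old path leaves its idle task without a shadow stack after it comes online, while the callback path initializes it as part of onlining.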
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 2c02f7375e658ae93d57a31a66f91b62754ef8f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102000-mortician-chant-190e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02f7375e658ae93d57a31a66f91b62754ef8f1 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 18 Oct 2024 21:43:00 -0400
Subject: [PATCH] fgraph: Use CPU hotplug mechanism to initialize idle shadow
stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
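As a rough illustration of why a hotplug callback covers both cases, here is a minimal user-space sketch. It is not kernel code: every model_* name is hypothetical, and the only behaviour assumed from the kernel is that cpuhp_setup_state() runs the startup callback for CPUs that are already online and again for each CPU that comes online later.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for a per-CPU idle task and its shadow stack. */
struct model_idle_task {
	bool online;
	void *ret_stack;
};

static struct model_idle_task model_idle[MODEL_NR_CPUS];
static int (*model_online_cb)(unsigned int cpu);

/* Same shape as fgraph_cpu_init(): allocate only if the stack is missing. */
static int model_cpu_init(unsigned int cpu)
{
	if (!model_idle[cpu].ret_stack)
		model_idle[cpu].ret_stack = malloc(64); /* never freed, like the idle stacks */
	return model_idle[cpu].ret_stack ? 0 : -1;
}

/* Register a callback and run it for every CPU that is already online. */
static int model_setup_state(int (*cb)(unsigned int cpu))
{
	unsigned int cpu;
	int ret;

	model_online_cb = cb;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!model_idle[cpu].online)
			continue;
		ret = cb(cpu);
		if (ret)
			return ret;
	}
	return 0;
}

/* Bring a CPU online; the registered callback runs as part of that step. */
static int model_cpu_up(unsigned int cpu)
{
	model_idle[cpu].online = true;
	return model_online_cb ? model_online_cb(cpu) : 0;
}
```

In this model a CPU that is offline at registration time still gets its stack as soon as model_cpu_up() is called, which is the case the old for_each_online_cpu() loop missed.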
This issue can be reproduced by:
# cd /sys/kernel/tracing
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > set_ftrace_pid
# echo function_graph > current_tracer
# echo 1 > options/funcgraph-proc
# echo 1 > /sys/devices/system/cpu/cpu1/online
# grep '<idle>' per_cpu/cpu1/trace | head
Before, nothing would show up.
After:
1) <idle>-0 | 0.811 us | __enqueue_entity();
1) <idle>-0 | 5.626 us | } /* enqueue_entity */
1) <idle>-0 | | dl_server_update_idle_time() {
1) <idle>-0 | | dl_scaled_delta_exec() {
1) <idle>-0 | 0.450 us | arch_scale_cpu_capacity();
1) <idle>-0 | 1.242 us | }
1) <idle>-0 | 1.908 us | }
1) <idle>-0 | | dl_server_start() {
1) <idle>-0 | | enqueue_dl_entity() {
1) <idle>-0 | | task_contending() {
Note, if tracing stops and restarts, the old way would then initialize
the onlined CPUs.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/20241018214300.6df82178@rorschach
Fixes: 868baf07b1a25 ("ftrace: Fix memory leak with function graph and cpu hotplug")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d7d4fb403f6f..43f4e3f57438 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1160,19 +1160,13 @@ void fgraph_update_pid_func(void)
static int start_graph_tracing(void)
{
unsigned long **ret_stack_list;
- int ret, cpu;
+ int ret;
ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL);
if (!ret_stack_list)
return -ENOMEM;
- /* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu) {
- if (!idle_task(cpu)->ret_stack)
- ftrace_graph_init_idle_task(idle_task(cpu), cpu);
- }
-
do {
ret = alloc_retstack_tasklist(ret_stack_list);
} while (ret == -EAGAIN);
@@ -1242,14 +1236,34 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
}
+/* The cpu_boot init_task->ret_stack will never be freed */
+static int fgraph_cpu_init(unsigned int cpu)
+{
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);
+ return 0;
+}
+
int register_ftrace_graph(struct fgraph_ops *gops)
{
+ static bool fgraph_initialized;
int command = 0;
int ret = 0;
int i = -1;
mutex_lock(&ftrace_lock);
+ if (!fgraph_initialized) {
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init",
+ fgraph_cpu_init, NULL);
+ if (ret < 0) {
+ pr_warn("fgraph: Error to init cpu hotplug support\n");
+ return ret;
+ }
+ fgraph_initialized = true;
+ ret = 0;
+ }
+
if (!fgraph_array[0]) {
/* The array must always have real data on it */
for (i = 0; i < FGRAPH_ARRAY_SIZE; i++)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 2c02f7375e658ae93d57a31a66f91b62754ef8f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102058-headphone-embody-6747@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02f7375e658ae93d57a31a66f91b62754ef8f1 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 18 Oct 2024 21:43:00 -0400
Subject: [PATCH] fgraph: Use CPU hotplug mechanism to initialize idle shadow
stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
This issue can be reproduced by:
# cd /sys/kernel/tracing
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > set_ftrace_pid
# echo function_graph > current_tracer
# echo 1 > options/funcgraph-proc
# echo 1 > /sys/devices/system/cpu/cpu1/online
# grep '<idle>' per_cpu/cpu1/trace | head
Before, nothing would show up.
After:
1) <idle>-0 | 0.811 us | __enqueue_entity();
1) <idle>-0 | 5.626 us | } /* enqueue_entity */
1) <idle>-0 | | dl_server_update_idle_time() {
1) <idle>-0 | | dl_scaled_delta_exec() {
1) <idle>-0 | 0.450 us | arch_scale_cpu_capacity();
1) <idle>-0 | 1.242 us | }
1) <idle>-0 | 1.908 us | }
1) <idle>-0 | | dl_server_start() {
1) <idle>-0 | | enqueue_dl_entity() {
1) <idle>-0 | | task_contending() {
Note, if tracing stops and restarts, the old way would then initialize
the onlined CPUs.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/20241018214300.6df82178@rorschach
Fixes: 868baf07b1a25 ("ftrace: Fix memory leak with function graph and cpu hotplug")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d7d4fb403f6f..43f4e3f57438 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1160,19 +1160,13 @@ void fgraph_update_pid_func(void)
static int start_graph_tracing(void)
{
unsigned long **ret_stack_list;
- int ret, cpu;
+ int ret;
ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL);
if (!ret_stack_list)
return -ENOMEM;
- /* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu) {
- if (!idle_task(cpu)->ret_stack)
- ftrace_graph_init_idle_task(idle_task(cpu), cpu);
- }
-
do {
ret = alloc_retstack_tasklist(ret_stack_list);
} while (ret == -EAGAIN);
@@ -1242,14 +1236,34 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
}
+/* The cpu_boot init_task->ret_stack will never be freed */
+static int fgraph_cpu_init(unsigned int cpu)
+{
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);
+ return 0;
+}
+
int register_ftrace_graph(struct fgraph_ops *gops)
{
+ static bool fgraph_initialized;
int command = 0;
int ret = 0;
int i = -1;
mutex_lock(&ftrace_lock);
+ if (!fgraph_initialized) {
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init",
+ fgraph_cpu_init, NULL);
+ if (ret < 0) {
+ pr_warn("fgraph: Error to init cpu hotplug support\n");
+ return ret;
+ }
+ fgraph_initialized = true;
+ ret = 0;
+ }
+
if (!fgraph_array[0]) {
/* The array must always have real data on it */
for (i = 0; i < FGRAPH_ARRAY_SIZE; i++)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 2c02f7375e658ae93d57a31a66f91b62754ef8f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102057-skipper-growl-db34@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02f7375e658ae93d57a31a66f91b62754ef8f1 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 18 Oct 2024 21:43:00 -0400
Subject: [PATCH] fgraph: Use CPU hotplug mechanism to initialize idle shadow
stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
This issue can be reproduced by:
# cd /sys/kernel/tracing
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > set_ftrace_pid
# echo function_graph > current_tracer
# echo 1 > options/funcgraph-proc
# echo 1 > /sys/devices/system/cpu/cpu1/online
# grep '<idle>' per_cpu/cpu1/trace | head
Before, nothing would show up.
After:
1) <idle>-0 | 0.811 us | __enqueue_entity();
1) <idle>-0 | 5.626 us | } /* enqueue_entity */
1) <idle>-0 | | dl_server_update_idle_time() {
1) <idle>-0 | | dl_scaled_delta_exec() {
1) <idle>-0 | 0.450 us | arch_scale_cpu_capacity();
1) <idle>-0 | 1.242 us | }
1) <idle>-0 | 1.908 us | }
1) <idle>-0 | | dl_server_start() {
1) <idle>-0 | | enqueue_dl_entity() {
1) <idle>-0 | | task_contending() {
Note, if tracing stops and restarts, the old way would then initialize
the onlined CPUs.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/20241018214300.6df82178@rorschach
Fixes: 868baf07b1a25 ("ftrace: Fix memory leak with function graph and cpu hotplug")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d7d4fb403f6f..43f4e3f57438 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1160,19 +1160,13 @@ void fgraph_update_pid_func(void)
static int start_graph_tracing(void)
{
unsigned long **ret_stack_list;
- int ret, cpu;
+ int ret;
ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL);
if (!ret_stack_list)
return -ENOMEM;
- /* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu) {
- if (!idle_task(cpu)->ret_stack)
- ftrace_graph_init_idle_task(idle_task(cpu), cpu);
- }
-
do {
ret = alloc_retstack_tasklist(ret_stack_list);
} while (ret == -EAGAIN);
@@ -1242,14 +1236,34 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
}
+/* The cpu_boot init_task->ret_stack will never be freed */
+static int fgraph_cpu_init(unsigned int cpu)
+{
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);
+ return 0;
+}
+
int register_ftrace_graph(struct fgraph_ops *gops)
{
+ static bool fgraph_initialized;
int command = 0;
int ret = 0;
int i = -1;
mutex_lock(&ftrace_lock);
+ if (!fgraph_initialized) {
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init",
+ fgraph_cpu_init, NULL);
+ if (ret < 0) {
+ pr_warn("fgraph: Error to init cpu hotplug support\n");
+ return ret;
+ }
+ fgraph_initialized = true;
+ ret = 0;
+ }
+
if (!fgraph_array[0]) {
/* The array must always have real data on it */
for (i = 0; i < FGRAPH_ARRAY_SIZE; i++)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2c02f7375e658ae93d57a31a66f91b62754ef8f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102055-nugget-delicious-edfe@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02f7375e658ae93d57a31a66f91b62754ef8f1 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 18 Oct 2024 21:43:00 -0400
Subject: [PATCH] fgraph: Use CPU hotplug mechanism to initialize idle shadow
stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
This issue can be reproduced by:
# cd /sys/kernel/tracing
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > set_ftrace_pid
# echo function_graph > current_tracer
# echo 1 > options/funcgraph-proc
# echo 1 > /sys/devices/system/cpu/cpu1/online
# grep '<idle>' per_cpu/cpu1/trace | head
Before, nothing would show up.
After:
1) <idle>-0 | 0.811 us | __enqueue_entity();
1) <idle>-0 | 5.626 us | } /* enqueue_entity */
1) <idle>-0 | | dl_server_update_idle_time() {
1) <idle>-0 | | dl_scaled_delta_exec() {
1) <idle>-0 | 0.450 us | arch_scale_cpu_capacity();
1) <idle>-0 | 1.242 us | }
1) <idle>-0 | 1.908 us | }
1) <idle>-0 | | dl_server_start() {
1) <idle>-0 | | enqueue_dl_entity() {
1) <idle>-0 | | task_contending() {
Note, if tracing stops and restarts, the old way would then initialize
the onlined CPUs.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/20241018214300.6df82178@rorschach
Fixes: 868baf07b1a25 ("ftrace: Fix memory leak with function graph and cpu hotplug")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d7d4fb403f6f..43f4e3f57438 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1160,19 +1160,13 @@ void fgraph_update_pid_func(void)
static int start_graph_tracing(void)
{
unsigned long **ret_stack_list;
- int ret, cpu;
+ int ret;
ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL);
if (!ret_stack_list)
return -ENOMEM;
- /* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu) {
- if (!idle_task(cpu)->ret_stack)
- ftrace_graph_init_idle_task(idle_task(cpu), cpu);
- }
-
do {
ret = alloc_retstack_tasklist(ret_stack_list);
} while (ret == -EAGAIN);
@@ -1242,14 +1236,34 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
}
+/* The cpu_boot init_task->ret_stack will never be freed */
+static int fgraph_cpu_init(unsigned int cpu)
+{
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);
+ return 0;
+}
+
int register_ftrace_graph(struct fgraph_ops *gops)
{
+ static bool fgraph_initialized;
int command = 0;
int ret = 0;
int i = -1;
mutex_lock(&ftrace_lock);
+ if (!fgraph_initialized) {
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init",
+ fgraph_cpu_init, NULL);
+ if (ret < 0) {
+ pr_warn("fgraph: Error to init cpu hotplug support\n");
+ return ret;
+ }
+ fgraph_initialized = true;
+ ret = 0;
+ }
+
if (!fgraph_array[0]) {
/* The array must always have real data on it */
for (i = 0; i < FGRAPH_ARRAY_SIZE; i++)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 2c02f7375e658ae93d57a31a66f91b62754ef8f1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024102054-nineteen-exemplary-3f78@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02f7375e658ae93d57a31a66f91b62754ef8f1 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 18 Oct 2024 21:43:00 -0400
Subject: [PATCH] fgraph: Use CPU hotplug mechanism to initialize idle shadow
stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
This issue can be reproduced by:
# cd /sys/kernel/tracing
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > set_ftrace_pid
# echo function_graph > current_tracer
# echo 1 > options/funcgraph-proc
# echo 1 > /sys/devices/system/cpu/cpu1/online
# grep '<idle>' per_cpu/cpu1/trace | head
Before, nothing would show up.
After:
1) <idle>-0 | 0.811 us | __enqueue_entity();
1) <idle>-0 | 5.626 us | } /* enqueue_entity */
1) <idle>-0 | | dl_server_update_idle_time() {
1) <idle>-0 | | dl_scaled_delta_exec() {
1) <idle>-0 | 0.450 us | arch_scale_cpu_capacity();
1) <idle>-0 | 1.242 us | }
1) <idle>-0 | 1.908 us | }
1) <idle>-0 | | dl_server_start() {
1) <idle>-0 | | enqueue_dl_entity() {
1) <idle>-0 | | task_contending() {
Note, if tracing stops and restarts, the old way would then initialize
the onlined CPUs.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/20241018214300.6df82178@rorschach
Fixes: 868baf07b1a25 ("ftrace: Fix memory leak with function graph and cpu hotplug")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d7d4fb403f6f..43f4e3f57438 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -1160,19 +1160,13 @@ void fgraph_update_pid_func(void)
static int start_graph_tracing(void)
{
unsigned long **ret_stack_list;
- int ret, cpu;
+ int ret;
ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL);
if (!ret_stack_list)
return -ENOMEM;
- /* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu) {
- if (!idle_task(cpu)->ret_stack)
- ftrace_graph_init_idle_task(idle_task(cpu), cpu);
- }
-
do {
ret = alloc_retstack_tasklist(ret_stack_list);
} while (ret == -EAGAIN);
@@ -1242,14 +1236,34 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
}
+/* The cpu_boot init_task->ret_stack will never be freed */
+static int fgraph_cpu_init(unsigned int cpu)
+{
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);
+ return 0;
+}
+
int register_ftrace_graph(struct fgraph_ops *gops)
{
+ static bool fgraph_initialized;
int command = 0;
int ret = 0;
int i = -1;
mutex_lock(&ftrace_lock);
+ if (!fgraph_initialized) {
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init",
+ fgraph_cpu_init, NULL);
+ if (ret < 0) {
+ pr_warn("fgraph: Error to init cpu hotplug support\n");
+ return ret;
+ }
+ fgraph_initialized = true;
+ ret = 0;
+ }
+
if (!fgraph_array[0]) {
/* The array must always have real data on it */
for (i = 0; i < FGRAPH_ARRAY_SIZE; i++)
This patch series fixes bugs in the following APIs:
devm_phy_put()
devm_of_phy_provider_unregister()
devm_phy_destroy()
phy_get()
of_phy_get()
devm_of_phy_get_by_index()
It also simplifies the implementation of of_phy_simple_xlate().
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Zijun Hu (6):
phy: core: Fix API devm_phy_put() can not release the phy
phy: core: Fix API devm_of_phy_provider_unregister() can not unregister the phy provider
phy: core: Fix API devm_phy_destroy() can not destroy the phy
phy: core: Add missing of_node_put() for an error handling path of _of_phy_get()
phy: core: Add missing of_node_put() in of_phy_provider_lookup()
phy: core: Simplify API of_phy_simple_xlate() implementation
drivers/phy/phy-core.c | 39 ++++++++++++++++++---------------------
1 file changed, 18 insertions(+), 21 deletions(-)
---
base-commit: d8f9d6d826fc15780451802796bb88ec52978f17
change-id: 20241020-phy_core_fix-e3ad65db98f7
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
Some page flags (page->flags) were converted to page types
(page->page_type). A recent example is PG_hugetlb.
From the exclusive writer's perspective, e.g., a thread doing
__folio_set_hugetlb(), there is a difference between the page flag and
type APIs: the former allows the same non-atomic operation to be
repeated whereas the latter does not. For example, calling
__folio_set_hugetlb() twice triggers VM_BUG_ON_FOLIO(), since the
second call expects the type (PG_hugetlb) not to be set previously.
Take add_hugetlb_folio() as an example: it calls
__folio_set_hugetlb() in the following error-handling path, and when
that happens it triggers the aforementioned VM_BUG_ON_FOLIO().
if (folio_test_hugetlb(folio)) {
rc = hugetlb_vmemmap_restore_folio(h, folio);
if (rc) {
spin_lock_irq(&hugetlb_lock);
add_hugetlb_folio(h, folio, false);
...
It is possible to make hugeTLB comply with the new requirements from
the page type API. However, a straightforward fix would be to just
allow the same page type to be set or cleared again inside the API,
to avoid any changes to its callers.
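The idea of tolerating a repeated set or clear can be sketched in user space as follows. This is only a model: struct model_page, MODEL_PGTY_EXAMPLE (an arbitrary value) and the model_* helpers are hypothetical, and assert() stands in for the VM_BUG_ON checks.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_TYPE_NONE    UINT_MAX	/* "no type set" sentinel, as in the patch */
#define MODEL_PGTY_EXAMPLE 0x42u	/* arbitrary type id, not a real PGTY_* value */

struct model_page {
	unsigned int page_type;
};

/* The type id lives in the top byte of page_type. */
static bool model_test_type(const struct model_page *page, unsigned int type)
{
	return (page->page_type >> 24) == type;
}

static void model_set_type(struct model_page *page, unsigned int type)
{
	if (model_test_type(page, type))
		return;				/* already set: repeat is a no-op */
	assert(page->page_type == MODEL_TYPE_NONE);	/* still reject a different type */
	page->page_type = type << 24;
}

static void model_clear_type(struct model_page *page, unsigned int type)
{
	if (page->page_type == MODEL_TYPE_NONE)
		return;				/* already clear: repeat is a no-op */
	assert(model_test_type(page, type));
	page->page_type = MODEL_TYPE_NONE;
}
```

Note that the early return only relaxes the same-type case; setting a type on a page that already carries a different type still trips the check.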
Fixes: d99e3140a4d3 ("mm: turn folio_test_hugetlb into a PageType")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
---
include/linux/page-flags.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index ccf3c78faefc..e80665bc51fa 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -977,12 +977,16 @@ static __always_inline bool folio_test_##fname(const struct folio *folio) \
} \
static __always_inline void __folio_set_##fname(struct folio *folio) \
{ \
+ if (folio_test_##fname(folio)) \
+ return; \
VM_BUG_ON_FOLIO(data_race(folio->page.page_type) != UINT_MAX, \
folio); \
folio->page.page_type = (unsigned int)PGTY_##lname << 24; \
} \
static __always_inline void __folio_clear_##fname(struct folio *folio) \
{ \
+ if (folio->page.page_type == UINT_MAX) \
+ return; \
VM_BUG_ON_FOLIO(!folio_test_##fname(folio), folio); \
folio->page.page_type = UINT_MAX; \
}
@@ -995,11 +999,15 @@ static __always_inline int Page##uname(const struct page *page) \
} \
static __always_inline void __SetPage##uname(struct page *page) \
{ \
+ if (Page##uname(page)) \
+ return; \
VM_BUG_ON_PAGE(data_race(page->page_type) != UINT_MAX, page); \
page->page_type = (unsigned int)PGTY_##lname << 24; \
} \
static __always_inline void __ClearPage##uname(struct page *page) \
{ \
+ if (page->page_type == UINT_MAX) \
+ return; \
VM_BUG_ON_PAGE(!Page##uname(page), page); \
page->page_type = UINT_MAX; \
}
--
2.47.0.rc1.288.g06298d1525-goog
tpm2_sessions_init() ignores a failure of tpm2_create_null_primary(): it
only logs the error and carries on. Address this by returning the error to the
caller. Given that upper layers cannot help healing the situation
further, deal with the TPM error here by
Cc: stable(a)vger.kernel.org # v6.10+
Fixes: d2add27cf2b8 ("tpm: Add NULL primary creation")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v6:
- Address:
https://lore.kernel.org/linux-integrity/69c893e7-6b87-4daa-80db-44d1120e80f…
as TPM RC is taken care of at the call site. Add also the missing
documentation for the return values.
v5:
- Do not print klog messages on error, as tpm2_save_context() already
takes care of this.
v4:
- Fixed up stable version.
v3:
- Handle TPM and POSIX error separately and return -ENODEV always back
to the caller.
v2:
- Refined the commit message.
---
drivers/char/tpm/tpm2-sessions.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
index 511c67061728..253639767c1e 100644
--- a/drivers/char/tpm/tpm2-sessions.c
+++ b/drivers/char/tpm/tpm2-sessions.c
@@ -1347,6 +1347,11 @@ static int tpm2_create_null_primary(struct tpm_chip *chip)
*
* Derive and context save the null primary and allocate memory in the
* struct tpm_chip for the authorizations.
+ *
+ * Return:
+ * * 0 - OK
+ * * -errno - A system error
+ * * TPM_RC - A TPM error
*/
int tpm2_sessions_init(struct tpm_chip *chip)
{
@@ -1354,7 +1359,7 @@ int tpm2_sessions_init(struct tpm_chip *chip)
rc = tpm2_create_null_primary(chip);
if (rc)
- dev_err(&chip->dev, "TPM: security failed (NULL seed derivation): %d\n", rc);
+ return rc;
chip->auth = kmalloc(sizeof(*chip->auth), GFP_KERNEL);
if (!chip->auth)
--
2.47.0
Returning an abort to the guest for an unsupported MMIO access is a
documented feature of the KVM UAPI. Nevertheless, it's clear that this
plumbing has seen limited testing, since userspace can trivially cause a
WARN in the MMIO return:
WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
Call trace:
kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
__do_sys_ioctl fs/ioctl.c:51 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The splat is complaining that KVM is advancing PC while an exception is
pending, i.e. that KVM is retiring the MMIO instruction despite a
pending external abort. Womp womp.
Fix the glaring UAPI bug by skipping over all the MMIO emulation in
case there is a pending synchronous exception. Note that while userspace
is capable of pending an asynchronous exception (SError, IRQ, or FIQ),
it is still safe to retire the MMIO instruction in this case as (1) they
are by definition asynchronous, and (2) KVM relies on hardware support
for pending/delivering these exceptions instead of the software state
machine for advancing PC.
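The intended control flow can be modelled in plain C as below. This is a simplified sketch, not the KVM code: the model_* types and names are hypothetical, and the fixed 4-byte PC step is an assumption made only to keep the example small.

```c
#include <stdbool.h>

/* Hypothetical classification of what userspace may have made pending. */
enum model_exception {
	MODEL_EXC_NONE,
	MODEL_EXC_SYNC,		/* e.g. an injected external data abort */
	MODEL_EXC_SERROR,	/* asynchronous */
	MODEL_EXC_IRQ,		/* asynchronous */
	MODEL_EXC_FIQ,		/* asynchronous */
};

struct model_vcpu {
	bool mmio_needed;
	enum model_exception pending;
	unsigned long pc;
};

static bool model_pending_sync_exception(const struct model_vcpu *vcpu)
{
	return vcpu->pending == MODEL_EXC_SYNC;
}

/* Returns true if the MMIO instruction was retired (PC advanced). */
static bool model_handle_mmio_return(struct model_vcpu *vcpu)
{
	/* Skip all emulation if already handled or if userspace aborted it. */
	if (!vcpu->mmio_needed || model_pending_sync_exception(vcpu))
		return false;

	vcpu->mmio_needed = false;
	vcpu->pc += 4;		/* retire the instruction */
	return true;
}
```

Only the synchronous case blocks the PC update; a pending asynchronous exception leaves the MMIO instruction free to retire, as described above.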
Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
---
arch/arm64/include/asm/kvm_emulate.h | 25 +++++++++++++++++++++++++
arch/arm64/kvm/mmio.c | 7 +++++--
2 files changed, 30 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index a601a9305b10..1b229099f684 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -544,6 +544,31 @@ static __always_inline void kvm_incr_pc(struct kvm_vcpu *vcpu)
vcpu_set_flag((v), e); \
} while (0)
+static inline bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+ return false;
+
+ if (vcpu_el1_is_32bit(vcpu)) {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA32_UND):
+ case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+ case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+ return true;
+ default:
+ return false;
+ }
+ } else {
+ switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+ case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+ case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+ return true;
+ default:
+ return false;
+ }
+ }
+}
+
#define __build_check_all_or_none(r, bits) \
BUILD_BUG_ON(((r) & (bits)) && ((r) & (bits)) != (bits))
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..0155ba665717 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -84,8 +84,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
unsigned int len;
int mask;
- /* Detect an already handled MMIO return */
- if (unlikely(!vcpu->mmio_needed))
+ /*
+ * Detect if the MMIO return was already handled or if userspace aborted
+ * the MMIO access.
+ */
+ if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
return 1;
vcpu->mmio_needed = 0;
--
2.47.0.rc1.288.g06298d1525-goog
During the aborting of a command, the software receives a command
completion event for the command ring stopped, with the TRB pointing
to the next TRB after the aborted command.
If the command we abort is located just before the Link TRB in the
command ring, then during the 'command ring stopped' completion event,
the xHC gives the Link TRB in the event's cmd DMA, which causes a
mismatch in handling command completion event.
To handle this situation, an additional check has been added to ignore
the mismatch error and continue the operation.
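A stand-alone sketch of the relaxed check, for illustration only: the model_* structures are hypothetical simplifications of the xHCI ring (a per-slot is_link flag instead of real TRBs), and MODEL_TRB_SIZE assumes 16-byte TRBs.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TRBS_PER_SEGMENT 4
#define MODEL_TRB_SIZE 16u	/* assumed TRB size in bytes */

/* Hypothetical, simplified ring segment: only what the check needs. */
struct model_segment {
	bool is_link[MODEL_TRBS_PER_SEGMENT];
	uint64_t dma;			/* bus address of the first TRB */
	struct model_segment *next;	/* circular list of segments */
};

/* Walk the whole ring and report whether dma is the address of a Link TRB. */
static bool model_is_dma_link_trb(const struct model_segment *first, uint64_t dma)
{
	const struct model_segment *seg = first;
	int i;

	do {
		for (i = 0; i < MODEL_TRBS_PER_SEGMENT; i++) {
			uint64_t trb_dma = seg->dma + (uint64_t)i * MODEL_TRB_SIZE;

			if (seg->is_link[i] && trb_dma == dma)
				return true;
		}
		seg = seg->next;
	} while (seg != first);

	return false;
}

/*
 * Accept the event if it matches the software dequeue pointer, or if it
 * is a 'command ring stopped' event whose pointer lands on a Link TRB.
 */
static bool model_event_matches(const struct model_segment *first,
				uint64_t deq_dma, uint64_t cmd_dma,
				bool ring_stopped)
{
	if (deq_dma && cmd_dma == deq_dma)
		return true;
	return ring_stopped && model_is_dma_link_trb(first, cmd_dma);
}
```

The exception is deliberately narrow: a Link TRB address is accepted only for the ring-stopped completion code, so any other mismatch is still reported.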
Cc: stable(a)vger.kernel.org
Signed-off-by: Faisal Hassan <quic_faisalh(a)quicinc.com>
---
drivers/usb/host/xhci-ring.c | 38 +++++++++++++++++++++++++++++++++---
1 file changed, 35 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b2950c35c740..43926c378df9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -126,6 +126,32 @@ static void inc_td_cnt(struct urb *urb)
urb_priv->num_tds_done++;
}
+/*
+ * Return true if the DMA is pointing to a Link TRB in the ring;
+ * otherwise, return false.
+ */
+static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
+{
+ struct xhci_segment *seg;
+ union xhci_trb *trb;
+ dma_addr_t trb_dma;
+ int i;
+
+ seg = ring->first_seg;
+ do {
+ for (i = 0; i < TRBS_PER_SEGMENT; i++) {
+ trb = &seg->trbs[i];
+ trb_dma = seg->dma + (i * sizeof(union xhci_trb));
+
+ if (trb_is_link(trb) && trb_dma == dma)
+ return true;
+ }
+ seg = seg->next;
+ } while (seg != ring->first_seg);
+
+ return false;
+}
+
static void trb_to_noop(union xhci_trb *trb, u32 noop_type)
{
if (trb_is_link(trb)) {
@@ -1718,13 +1744,21 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
trace_xhci_handle_command(xhci->cmd_ring, &cmd_trb->generic);
+ cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
cmd_dequeue_dma = xhci_trb_virt_to_dma(xhci->cmd_ring->deq_seg,
cmd_trb);
/*
* Check whether the completion event is for our internal kept
* command.
+ * For the 'command ring stopped' completion event, there is a
+ * risk of a mismatch in dequeue pointers if we abort the command
+ * just before the link TRB in the command ring. In this scenario,
+ * the cmd_dma in the event would point to a link TRB, while the
+ * software dequeue pointer circles back to the start.
*/
- if (!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) {
+ if ((!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) &&
+ !(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
+ is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {
xhci_warn(xhci,
"ERROR mismatched command completion event\n");
return;
@@ -1734,8 +1768,6 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
cancel_delayed_work(&xhci->cmd_timer);
- cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
-
/* If CMD ring stopped we own the trbs between enqueue and dequeue */
if (cmd_comp_code == COMP_COMMAND_RING_STOPPED) {
complete_all(&xhci->cmd_ring_stop_completion);
--
2.17.1
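The check added by the patch above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver's code: the `model_*` types and functions, the four-TRB segment, and the example addresses are all hypothetical stand-ins. It assumes only what the commit message states (a Link TRB closes the segment, and a "command ring stopped" event may report that Link TRB's address).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model: these types are NOT the xHCI driver's. */
#define MODEL_TRBS_PER_SEGMENT 4
#define MODEL_TRB_SIZE 16u	/* an xHCI TRB is 16 bytes */

struct model_trb {
	bool is_link;		/* true for the Link TRB that closes a segment */
};

struct model_segment {
	struct model_trb trbs[MODEL_TRBS_PER_SEGMENT];
	uint64_t dma;		/* bus address of trbs[0] */
	struct model_segment *next;
};

/* Build a one-segment ring whose last TRB links back to the start. */
static void model_ring_init(struct model_segment *seg, uint64_t dma)
{
	for (size_t i = 0; i < MODEL_TRBS_PER_SEGMENT; i++)
		seg->trbs[i].is_link = false;
	seg->trbs[MODEL_TRBS_PER_SEGMENT - 1].is_link = true;
	seg->dma = dma;
	seg->next = seg;
}

/* Walk every segment once; report whether dma is the address of a Link TRB. */
static bool model_is_dma_link_trb(struct model_segment *first, uint64_t dma)
{
	struct model_segment *seg = first;

	assert(first != NULL);
	do {
		for (size_t i = 0; i < MODEL_TRBS_PER_SEGMENT; i++) {
			uint64_t trb_dma = seg->dma + i * MODEL_TRB_SIZE;

			if (seg->trbs[i].is_link && trb_dma == dma)
				return true;
		}
		seg = seg->next;
	} while (seg != first);

	return false;
}

/*
 * Mirror of the patched condition: a mismatch is tolerated only for a
 * "command ring stopped" event whose address is a Link TRB.
 */
static bool model_event_is_mismatch(struct model_segment *ring,
				    uint64_t event_dma, uint64_t sw_deq_dma,
				    bool ring_stopped)
{
	bool differs = (sw_deq_dma == 0 || event_dma != sw_deq_dma);

	return differs && !(ring_stopped &&
			    model_is_dma_link_trb(ring, event_dma));
}
```

With a ring at the made-up address 0x1000, the Link TRB sits at 0x1030. An event carrying 0x1030 while the software dequeue pointer has wrapped to 0x1000 is accepted only when the completion code is "command ring stopped".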
Since Linux 6.11 we support AT_EMPTY_PATH with a NULL path for fstatat and
statx in "some circumstances", mostly for performance and to allow
seccomp auditing. But to make the API easier to document and use,
we should just treat AT_EMPTY_PATH with NULL the same as AT_EMPTY_PATH with
an empty string, even if there are no performance or seccomp benefits.
Cc: Miao Wang <shankerwangmiao(a)gmail.com>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Xi Ruoyao (2):
vfs: support fstatat(..., NULL, AT_EMPTY_PATH | AT_NO_AUTOMOUNT, ...)
vfs: Make sure {statx,fstatat}(..., AT_EMPTY_PATH | ..., NULL, ...)
behave as (..., AT_EMPTY_PATH | ..., "", ...)
fs/stat.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--
2.46.2
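For readers unfamiliar with the semantics the cover letter above refers to, here is a small userspace sketch of the long-standing empty-string form, fstatat(fd, "", ..., AT_EMPTY_PATH), which stats the descriptor itself. The helper names are hypothetical. The NULL-path form discussed in the series is deliberately not exercised here, since whether it works depends on the running kernel and on how the C library declares the path argument.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for AT_EMPTY_PATH */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000	/* Linux value; fallback for older headers */
#endif

/*
 * Stat an already-open descriptor through fstatat() with an empty path.
 * Returns 0 on success, -1 on error (errno is set by fstatat()).
 */
static int stat_by_fd(int fd, struct stat *st)
{
	assert(st != NULL);
	return fstatat(fd, "", st, AT_EMPTY_PATH);
}

/*
 * Returns 1 if fstatat(fd, "", AT_EMPTY_PATH) and fstat(fd) describe the
 * same inode, 0 if they differ, and -1 if any call fails.
 */
static int empty_path_matches_fstat(const char *path)
{
	struct stat a, b;
	int fd = open(path, O_RDONLY);
	int same;

	if (fd < 0)
		return -1;
	if (stat_by_fd(fd, &a) != 0 || fstat(fd, &b) != 0) {
		close(fd);
		return -1;
	}
	same = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);
	close(fd);
	return same;
}
```

Calling empty_path_matches_fstat("/") is expected to report a match on Linux, because both calls operate on the same open descriptor.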
When an i915 PMU counter is enabled and the driver is then unbound, the
PMU will be unregistered via perf_pmu_unregister(), however the event
will still be alive. i915 currently tries to deal with this situation
by:
a) Marking the pmu as "closed" and shortcutting the calls from perf
b) Taking a reference from i915, that is put back when the event
is destroyed.
c) Setting event_init to NULL to avoid any further event
(a) is ugly, but may be left as is since it protects against trying to
access the HW that is now gone. Unless a pmu driver can call
perf_pmu_unregister() and not receive any more calls, it's a necessary
ugliness.
(b) doesn't really work: when the event is destroyed and the i915 ref is
put it may free the i915 object, that contains the pmu, not only the
event. After event->destroy() callback, perf still expects the pmu
object to be alive.
Instead of piggybacking on the event->destroy() to take and put the
device reference, implement the new get()/put() on the pmu object for
that purpose.
(c) is only done to have a flag to avoid some function entrypoints when
pmu is unregistered.
Cc: stable(a)vger.kernel.org # 5.11+
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
---
drivers/gpu/drm/i915/i915_pmu.c | 36 ++++++++++++++++++++-------------
1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 4d05d98f51b8e..dc9f753369170 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -515,15 +515,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
return HRTIMER_RESTART;
}
-static void i915_pmu_event_destroy(struct perf_event *event)
-{
- struct i915_pmu *pmu = event_to_pmu(event);
- struct drm_i915_private *i915 = pmu_to_i915(pmu);
-
- drm_WARN_ON(&i915->drm, event->parent);
-
- drm_dev_put(&i915->drm);
-}
static int
engine_event_status(struct intel_engine_cs *engine,
@@ -629,11 +620,6 @@ static int i915_pmu_event_init(struct perf_event *event)
if (ret)
return ret;
- if (!event->parent) {
- drm_dev_get(&i915->drm);
- event->destroy = i915_pmu_event_destroy;
- }
-
return 0;
}
@@ -872,6 +858,24 @@ static int i915_pmu_event_event_idx(struct perf_event *event)
return 0;
}
+static struct pmu *i915_pmu_get(struct pmu *base)
+{
+ struct i915_pmu *pmu = container_of(base, struct i915_pmu, base);
+ struct drm_i915_private *i915 = pmu_to_i915(pmu);
+
+ drm_dev_get(&i915->drm);
+
+ return base;
+}
+
+static void i915_pmu_put(struct pmu *base)
+{
+ struct i915_pmu *pmu = container_of(base, struct i915_pmu, base);
+ struct drm_i915_private *i915 = pmu_to_i915(pmu);
+
+ drm_dev_put(&i915->drm);
+}
+
struct i915_str_attribute {
struct device_attribute attr;
const char *str;
@@ -1154,6 +1158,8 @@ static void free_pmu(struct drm_device *dev, void *res)
struct i915_pmu *pmu = res;
struct drm_i915_private *i915 = pmu_to_i915(pmu);
+ perf_pmu_free(&pmu->base);
+
free_event_attributes(pmu);
kfree(pmu->base.attr_groups);
if (IS_DGFX(i915))
@@ -1299,6 +1305,8 @@ void i915_pmu_register(struct drm_i915_private *i915)
pmu->base.stop = i915_pmu_event_stop;
pmu->base.read = i915_pmu_event_read;
pmu->base.event_idx = i915_pmu_event_event_idx;
+ pmu->base.get = i915_pmu_get;
+ pmu->base.put = i915_pmu_put;
ret = perf_pmu_register(&pmu->base, pmu->name, -1);
if (ret)
--
2.47.0
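The idea in the patch above, tying the device reference to the PMU object rather than to each event's destroy callback, can be sketched in plain C. This is only a model under stated assumptions: `model_dev`, `model_pmu`, and `model_container_of` are hypothetical names, a plain integer stands in for the DRM device refcount, and the get()/put() hooks on the perf side come from the series this patch depends on rather than from anything shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_pmu {
	int unused;		/* placeholder for PMU state */
};

struct model_dev {
	int refs;		/* device reference count */
	bool released;		/* set when the last reference is dropped */
	struct model_pmu pmu;	/* PMU embedded in the device, as in i915 */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Taking a PMU reference pins the device that embeds it. */
static struct model_pmu *model_pmu_get(struct model_pmu *pmu)
{
	struct model_dev *dev = model_container_of(pmu, struct model_dev, pmu);

	dev->refs++;
	return pmu;
}

/* Dropping the last reference is the only point where the device may go. */
static void model_pmu_put(struct model_pmu *pmu)
{
	struct model_dev *dev = model_container_of(pmu, struct model_dev, pmu);

	assert(dev->refs > 0);
	if (--dev->refs == 0)
		dev->released = true;	/* a real driver would free here */
}
```

Because the PMU lives inside the device object, whoever still needs the PMU after an event is destroyed must hold a reference that keeps the whole device alive, which is what the get/put pair expresses.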
[ Upstream commit 0885ef4705607936fc36a38fd74356e1c465b023 ]
I found a regression on mm-unstable during my swap stress test, using
tmpfs to compile linux. The test OOMs very soon after make spawns many
cc processes.
It bisects down to this change: 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9
(mm/gup: clear the LRU flag of a page before adding to LRU batch)
Yu Zhao proposed the fix: "I think this is one of the potential side
effects -- Huge mentioned earlier about isolate_lru_folios():"
I tested that with it the swap stress test no longer OOMs.
Link: https://lore.kernel.org/r/CAOUHufYi9h0kz5uW3LHHS3ZrVwEq-kKp8S6N-MZUmErNAXoX…
Link: https://lkml.kernel.org/r/20240905-lru-flag-v2-1-8a2d9046c594@kernel.org
Fixes: 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch")
Signed-off-by: Chris Li <chrisl(a)kernel.org>
Suggested-by: Yu Zhao <yuzhao(a)google.com>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Closes: https://lore.kernel.org/all/CAF8kJuNP5iTj2p07QgHSGOJsiUfYpJ2f4R1Q5-3BN9JiD9…
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bd489c1af2289..a8d61a8b68944 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4300,7 +4300,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
}
/* ineligible */
- if (zone > sc->reclaim_idx) {
+ if (!folio_test_lru(folio) || zone > sc->reclaim_idx) {
gen = folio_inc_gen(lruvec, folio, false);
list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
return true;
---
base-commit: 8e24a758d14c0b1cd42ab0aea980a1030eea811f
change-id: 20241015-stable-oom-fix-a6ab273b1817
Best regards,
--
Chris Li <chrisl(a)kernel.org>
The patch titled
Subject: Revert "selftests/mm: replace atomic_bool with pthread_barrier_t"
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
revert-selftests-mm-replace-atomic_bool-with-pthread_barrier_t.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: Revert "selftests/mm: replace atomic_bool with pthread_barrier_t"
Date: Fri, 18 Oct 2024 17:17:23 +0000
This reverts commit e61ef21e27e8deed8c474e9f47f4aa7bc37e138c.
uffd_poll_thread may be called by other tests that do not initialize the
pthread_barrier, so this approach is not correct. This will revert to
using atomic_bool instead.
Link: https://lkml.kernel.org/r/20241018171734.2315053-3-edliaw@google.com
Fixes: e61ef21e27e8 ("selftests/mm: replace atomic_bool with pthread_barrier_t")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-common.c | 5 ++---
tools/testing/selftests/mm/uffd-common.h | 3 ++-
tools/testing/selftests/mm/uffd-unit-tests.c | 14 ++++++--------
3 files changed, 10 insertions(+), 12 deletions(-)
--- a/tools/testing/selftests/mm/uffd-common.c~revert-selftests-mm-replace-atomic_bool-with-pthread_barrier_t
+++ a/tools/testing/selftests/mm/uffd-common.c
@@ -18,7 +18,7 @@ bool test_uffdio_wp = true;
unsigned long long *count_verify;
uffd_test_ops_t *uffd_test_ops;
uffd_test_case_ops_t *uffd_test_case_ops;
-pthread_barrier_t ready_for_fork;
+atomic_bool ready_for_fork;
static int uffd_mem_fd_create(off_t mem_size, bool hugetlb)
{
@@ -519,8 +519,7 @@ void *uffd_poll_thread(void *arg)
pollfd[1].fd = pipefd[cpu*2];
pollfd[1].events = POLLIN;
- /* Ready for parent thread to fork */
- pthread_barrier_wait(&ready_for_fork);
+ ready_for_fork = true;
for (;;) {
ret = poll(pollfd, 2, -1);
--- a/tools/testing/selftests/mm/uffd-common.h~revert-selftests-mm-replace-atomic_bool-with-pthread_barrier_t
+++ a/tools/testing/selftests/mm/uffd-common.h
@@ -33,6 +33,7 @@
#include <inttypes.h>
#include <stdint.h>
#include <sys/random.h>
+#include <stdatomic.h>
#include "../kselftest.h"
#include "vm_util.h"
@@ -104,7 +105,7 @@ extern bool map_shared;
extern bool test_uffdio_wp;
extern unsigned long long *count_verify;
extern volatile bool test_uffdio_copy_eexist;
-extern pthread_barrier_t ready_for_fork;
+extern atomic_bool ready_for_fork;
extern uffd_test_ops_t anon_uffd_test_ops;
extern uffd_test_ops_t shmem_uffd_test_ops;
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~revert-selftests-mm-replace-atomic_bool-with-pthread_barrier_t
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -774,7 +774,7 @@ static void uffd_sigbus_test_common(bool
char c;
struct uffd_args args = { 0 };
- pthread_barrier_init(&ready_for_fork, NULL, 2);
+ ready_for_fork = false;
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
@@ -791,9 +791,8 @@ static void uffd_sigbus_test_common(bool
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
- /* Wait for child thread to start before forking */
- pthread_barrier_wait(&ready_for_fork);
- pthread_barrier_destroy(&ready_for_fork);
+ while (!ready_for_fork)
+ ; /* Wait for the poll_thread to start executing before forking */
pid = fork();
if (pid < 0)
@@ -834,7 +833,7 @@ static void uffd_events_test_common(bool
char c;
struct uffd_args args = { 0 };
- pthread_barrier_init(&ready_for_fork, NULL, 2);
+ ready_for_fork = false;
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
if (uffd_register(uffd, area_dst, nr_pages * page_size,
@@ -845,9 +844,8 @@ static void uffd_events_test_common(bool
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
- /* Wait for child thread to start before forking */
- pthread_barrier_wait(&ready_for_fork);
- pthread_barrier_destroy(&ready_for_fork);
+ while (!ready_for_fork)
+ ; /* Wait for the poll_thread to start executing before forking */
pid = fork();
if (pid < 0)
_
Patches currently in -mm which might be from edliaw(a)google.com are
revert-selftests-mm-fix-deadlock-for-fork-after-pthread_create-on-arm.patch
revert-selftests-mm-replace-atomic_bool-with-pthread_barrier_t.patch
selftests-mm-fix-deadlock-for-fork-after-pthread_create-with-atomic_bool.patch
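The flag-based handshake that the revert above restores can be shown in isolation. The sketch below is an assumption-laden illustration, not the selftest's code: `ready_flag`, `worker`, and `fork_after_thread_ready` are hypothetical names, and the child simply exits with a marker status. The point it illustrates is the one from the commit message: a thread can always set an atomic flag, even when its caller never prepared any synchronization object, whereas waiting on an uninitialized barrier is not safe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static atomic_bool ready_flag;	/* hypothetical stand-in for ready_for_fork */

static void *worker(void *arg)
{
	(void)arg;
	/* Announce that the thread is running; no barrier object is needed. */
	atomic_store(&ready_flag, true);
	return NULL;
}

/*
 * Start a thread, wait until it has signalled readiness, then fork.
 * Returns the child's exit status, or -1 on failure.
 */
static int fork_after_thread_ready(void)
{
	pthread_t tid;
	int status;
	pid_t pid;

	atomic_store(&ready_flag, false);
	if (pthread_create(&tid, NULL, worker, NULL))
		return -1;

	while (!atomic_load(&ready_flag))
		;	/* spin until the worker has started */

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(7);	/* child: exit immediately with a marker status */

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	pthread_join(tid, NULL);
	assert(atomic_load(&ready_flag));
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A busy-wait is acceptable here only because the wait is expected to be very short; the trade-off against a barrier is simplicity and the absence of any initialization requirement on other callers of the thread function.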
The patch titled
Subject: Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM"
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
revert-selftests-mm-fix-deadlock-for-fork-after-pthread_create-on-arm.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Edward Liaw <edliaw(a)google.com>
Subject: Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM"
Date: Fri, 18 Oct 2024 17:17:22 +0000
Patch series "selftests/mm: revert pthread_barrier change"
On Android arm, pthread_create followed by a fork caused a deadlock in
the case where the fork required work to be completed by the created
thread.
The previous patches incorrectly assumed that the parent would
always initialize the pthread_barrier for the child thread. This
reverts the change and replaces the fix for wp-fork-with-event with the
original use of atomic_bool.
This patch (of 3):
This reverts commit e142cc87ac4ec618f2ccf5f68aedcd6e28a59d9d.
fork_event_consumer may be called by other tests that do not initialize
the pthread_barrier, so this approach is not correct. The subsequent
patch will revert to using atomic_bool instead.
Link: https://lkml.kernel.org/r/20241018171734.2315053-1-edliaw@google.com
Link: https://lkml.kernel.org/r/20241018171734.2315053-2-edliaw@google.com
Fixes: e142cc87ac4e ("fix deadlock for fork after pthread_create on ARM")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 7 -------
1 file changed, 7 deletions(-)
--- a/tools/testing/selftests/mm/uffd-unit-tests.c~revert-selftests-mm-fix-deadlock-for-fork-after-pthread_create-on-arm
+++ a/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -241,9 +241,6 @@ static void *fork_event_consumer(void *d
fork_event_args *args = data;
struct uffd_msg msg = { 0 };
- /* Ready for parent thread to fork */
- pthread_barrier_wait(&ready_for_fork);
-
/* Read until a full msg received */
while (uffd_read_msg(args->parent_uffd, &msg));
@@ -311,12 +308,8 @@ static int pagemap_test_fork(int uffd, b
/* Prepare a thread to resolve EVENT_FORK */
if (with_event) {
- pthread_barrier_init(&ready_for_fork, NULL, 2);
if (pthread_create(&thread, NULL, fork_event_consumer, &args))
err("pthread_create()");
- /* Wait for child thread to start before forking */
- pthread_barrier_wait(&ready_for_fork);
- pthread_barrier_destroy(&ready_for_fork);
}
child = fork();
_
Patches currently in -mm which might be from edliaw(a)google.com are
revert-selftests-mm-fix-deadlock-for-fork-after-pthread_create-on-arm.patch
revert-selftests-mm-replace-atomic_bool-with-pthread_barrier_t.patch
selftests-mm-fix-deadlock-for-fork-after-pthread_create-with-atomic_bool.patch
There is a race between the laundromat's handling of revoked delegations
and a client sending a free_stateid operation. The laundromat thread
finds that a delegation has expired and needs to be revoked, so it
marks the delegation stid revoked and puts it on a reaper list,
but then it unlocks the state lock and the actual delegation revocation
happens without the lock. Once the stid is marked revoked, a racing
free_stateid processing thread does the following: (1) it calls
list_del_init(), which removes it from the reaper list, and (2) it frees
the delegation stid structure. The laundromat thread ends up not
calling the revoke_delegation() function for this particular delegation,
which means it will not release the lock lease that exists on
the file.
Now, a new open for this file comes in, finds that the
lease list isn't empty, and calls nfsd_breaker_owns_lease(), which ends
up trying to dereference a freed delegation stateid, leading to the
following use-after-free KASAN warning:
kernel: ==================================================================
kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205
kernel:
kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9
kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024
kernel: Call trace:
kernel: dump_backtrace+0x98/0x120
kernel: show_stack+0x1c/0x30
kernel: dump_stack_lvl+0x80/0xe8
kernel: print_address_description.constprop.0+0x84/0x390
kernel: print_report+0xa4/0x268
kernel: kasan_report+0xb4/0xf8
kernel: __asan_report_load8_noabort+0x1c/0x28
kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd]
kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd]
kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd]
kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd]
kernel: nfsd4_open+0xa08/0xe80 [nfsd]
kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd]
kernel: nfsd_dispatch+0x22c/0x718 [nfsd]
kernel: svc_process_common+0x8e8/0x1960 [sunrpc]
kernel: svc_process+0x3d4/0x7e0 [sunrpc]
kernel: svc_handle_xprt+0x828/0xe10 [sunrpc]
kernel: svc_recv+0x2cc/0x6a8 [sunrpc]
kernel: nfsd+0x270/0x400 [nfsd]
kernel: kthread+0x288/0x310
kernel: ret_from_fork+0x10/0x20
This patch proposes a fix that's based on adding 2 new
stid sc_status values that help coordinate between the laundromat
and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()).
First, to make sure that once the stid is marked revoked it is not
removed by nfsd4_free_stateid(), the laundromat takes a reference
on the stateid. Then, to coordinate whether the stid has been put
on the cl_revoked list or we are processing FREE_STATEID and need to
make sure to remove it from the list, each side checks that state and acts
accordingly. If the laundromat has added it to the cl_revoked list before
the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to remove
it from the list. If nfsd4_free_stateid() finds that the operation arrived
before the laundromat has placed it on the cl_revoked list, it marks the
state freed and then the laundromat will no longer add it to the list.
Also, for nfsd4_delegreturn(), when looking for the specified stid,
we need to access stids that are marked revoked or freeable; this means
the laundromat has started processing it but hasn't finished, and this
delegreturn needs to return nfserr_deleg_revoked and not
nfserr_bad_stateid. The latter will not trigger a FREE_STATEID and the
lack of it will leave this stid on the cl_revoked list indefinitely.
Fixes: 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is
protected by clp->cl_lock")
CC: stable(a)vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev(a)redhat.com>
--- v3. (1) adds refcount to nfsd4_revoke_states() (2) adds comments
to revoke_delegation(), adds the WARN_ON_ONCE to make sure stid
state is what is expected and changes unlock placement.
---
fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++++--------
fs/nfsd/state.h | 2 ++
2 files changed, 40 insertions(+), 8 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7905ab9d8bc6..28e9b52b01fd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1351,21 +1351,47 @@ static void destroy_delegation(struct nfs4_delegation *dp)
destroy_unhashed_deleg(dp);
}
+/**
+ * revoke_delegation - perform nfs4 delegation structure cleanup
+ * @dp: pointer to the delegation
+ *
+ * This function assumes that it's called either from the administrative
+ * interface (nfsd4_revoke_states()) that's revoking a specific delegation
+ * stateid or it's called from a laundromat thread (nfs4_laundromat()) that
+ * determined that this specific state has expired and needs to be revoked
+ * (both mark state with the appropriate stid sc_status mode). It is also
+ * assumed that a reference was taken on the @dp state.
+ *
+ * If this function finds that the @dp state is SC_STATUS_FREED it means
+ * that a FREE_STATEID operation for this stateid has been processed and
+ * we can proceed to removing it from recalled list. However, if @dp state
+ * isn't marked SC_STATUS_FREED, it means we need to place it on the cl_revoked
+ * list and wait for the FREE_STATEID to arrive from the client. At the same
+ * time, we need to mark it as SC_STATUS_FREEABLE to indicate to the
+ * nfsd4_free_stateid() function that this stateid has already been added
+ * to the cl_revoked list and that nfsd4_free_stateid() is now responsible
+ * for removing it from the list. Inspection of where the delegation state
+ * in the revocation process is protected by the clp->cl_lock.
+ */
static void revoke_delegation(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_stid.sc_client;
WARN_ON(!list_empty(&dp->dl_recall_lru));
+ WARN_ON_ONCE(!(dp->dl_stid.sc_status &
+ (SC_STATUS_REVOKED | SC_STATUS_ADMIN_REVOKED)));
trace_nfsd_stid_revoke(&dp->dl_stid);
- if (dp->dl_stid.sc_status &
- (SC_STATUS_REVOKED | SC_STATUS_ADMIN_REVOKED)) {
- spin_lock(&clp->cl_lock);
- refcount_inc(&dp->dl_stid.sc_count);
- list_add(&dp->dl_recall_lru, &clp->cl_revoked);
- spin_unlock(&clp->cl_lock);
+ spin_lock(&clp->cl_lock);
+ if (dp->dl_stid.sc_status & SC_STATUS_FREED) {
+ list_del_init(&dp->dl_recall_lru);
+ goto out;
}
+ list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+ dp->dl_stid.sc_status |= SC_STATUS_FREEABLE;
+out:
+ spin_unlock(&clp->cl_lock);
destroy_unhashed_deleg(dp);
}
@@ -1772,6 +1798,7 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
mutex_unlock(&stp->st_mutex);
break;
case SC_TYPE_DELEG:
+ refcount_inc(&stid->sc_count);
dp = delegstateid(stid);
spin_lock(&state_lock);
if (!unhash_delegation_locked(
@@ -6606,6 +6633,7 @@ nfs4_laundromat(struct nfsd_net *nn)
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
if (!state_expired(<, dp->dl_time))
break;
+ refcount_inc(&dp->dl_stid.sc_count);
unhash_delegation_locked(dp, SC_STATUS_REVOKED);
list_add(&dp->dl_recall_lru, &reaplist);
}
@@ -7218,7 +7246,9 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
s->sc_status |= SC_STATUS_CLOSED;
spin_unlock(&s->sc_lock);
dp = delegstateid(s);
- list_del_init(&dp->dl_recall_lru);
+ if (s->sc_status & SC_STATUS_FREEABLE)
+ list_del_init(&dp->dl_recall_lru);
+ s->sc_status |= SC_STATUS_FREED;
spin_unlock(&cl->cl_lock);
nfs4_put_stid(s);
ret = nfs_ok;
@@ -7548,7 +7578,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
return status;
- status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, 0, &s, nn);
+ status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, SC_STATUS_REVOKED|SC_STATUS_FREEABLE, &s, nn);
if (status)
goto out;
dp = delegstateid(s);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 6351e6eca7cc..cc00d6b64b88 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -114,6 +114,8 @@ struct nfs4_stid {
/* For a deleg stateid kept around only to process free_stateid's: */
#define SC_STATUS_REVOKED BIT(1)
#define SC_STATUS_ADMIN_REVOKED BIT(2)
+#define SC_STATUS_FREEABLE BIT(3)
+#define SC_STATUS_FREED BIT(4)
unsigned short sc_status;
struct list_head sc_cp_list;
--
2.43.5
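The two-flag handshake described in the patch above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not nfsd code: the `M_*` flag values, `struct model_stid`, and the two functions are hypothetical, a pthread mutex plays the role of clp->cl_lock, a boolean stands in for membership of the cl_revoked list, and the reference counting that the patch also adds is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only. */
#define M_REVOKED	(1u << 1)
#define M_FREEABLE	(1u << 3)
#define M_FREED		(1u << 4)

struct model_stid {
	pthread_mutex_t lock;	/* plays the role of clp->cl_lock */
	unsigned int status;
	bool on_revoked_list;	/* stand-in for cl_revoked membership */
};

/* Start in the state the laundromat leaves behind: marked revoked. */
static void model_init(struct model_stid *s)
{
	assert(s != NULL);
	pthread_mutex_init(&s->lock, NULL);
	s->status = M_REVOKED;
	s->on_revoked_list = false;
}

/* Laundromat side: runs some time after the stid was marked revoked. */
static void model_revoke(struct model_stid *s)
{
	pthread_mutex_lock(&s->lock);
	if (s->status & M_FREED) {
		/* FREE_STATEID already ran: do not publish the stid. */
		s->on_revoked_list = false;
	} else {
		s->on_revoked_list = true;
		s->status |= M_FREEABLE;	/* removal is now the client's job */
	}
	pthread_mutex_unlock(&s->lock);
}

/* FREE_STATEID side. */
static void model_free_stateid(struct model_stid *s)
{
	pthread_mutex_lock(&s->lock);
	if (s->status & M_FREEABLE)
		s->on_revoked_list = false;	/* it was published: remove it */
	s->status |= M_FREED;
	pthread_mutex_unlock(&s->lock);
}
```

Whichever of the two functions runs first, the stid ends up off the list exactly once: either the laundromat publishes it and FREE_STATEID removes it, or FREE_STATEID marks it freed and the laundromat never publishes it.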
There is a race between the laundromat's handling of revoked delegations
and a client sending a free_stateid operation. The laundromat thread
finds that a delegation has expired and needs to be revoked, so it
marks the delegation stid revoked and puts it on a reaper list,
but then it unlocks the state lock and the actual delegation revocation
happens without the lock. Once the stid is marked revoked, a racing
free_stateid processing thread does the following: (1) it calls
list_del_init(), which removes it from the reaper list, and (2) it frees
the delegation stid structure. The laundromat thread ends up not
calling the revoke_delegation() function for this particular delegation,
which means it will not release the lock lease that exists on
the file.
Now, a new open for this file comes in, finds that the
lease list isn't empty, and calls nfsd_breaker_owns_lease(), which ends
up trying to dereference a freed delegation stateid, leading to the
following use-after-free KASAN warning:
kernel: ==================================================================
kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205
kernel:
kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9
kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024
kernel: Call trace:
kernel: dump_backtrace+0x98/0x120
kernel: show_stack+0x1c/0x30
kernel: dump_stack_lvl+0x80/0xe8
kernel: print_address_description.constprop.0+0x84/0x390
kernel: print_report+0xa4/0x268
kernel: kasan_report+0xb4/0xf8
kernel: __asan_report_load8_noabort+0x1c/0x28
kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
kernel: leases_conflict+0x68/0x370
kernel: __break_lease+0x204/0xc38
kernel: nfsd_open_break_lease+0x8c/0xf0 [nfsd]
kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd]
kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd]
kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd]
kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd]
kernel: nfsd4_open+0xa08/0xe80 [nfsd]
kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd]
kernel: nfsd_dispatch+0x22c/0x718 [nfsd]
kernel: svc_process_common+0x8e8/0x1960 [sunrpc]
kernel: svc_process+0x3d4/0x7e0 [sunrpc]
kernel: svc_handle_xprt+0x828/0xe10 [sunrpc]
kernel: svc_recv+0x2cc/0x6a8 [sunrpc]
kernel: nfsd+0x270/0x400 [nfsd]
kernel: kthread+0x288/0x310
kernel: ret_from_fork+0x10/0x20
This patch proposes a fix that's based on adding 2 new
stid sc_status values that help coordinate between the laundromat
and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()).
First, to make sure that once the stid is marked revoked it is not
removed by nfsd4_free_stateid(), the laundromat takes a reference
on the stateid. Then, to coordinate whether the stid has been put
on the cl_revoked list or we are processing FREE_STATEID and need to
make sure to remove it from the list, each side checks that state and acts
accordingly. If the laundromat has added it to the cl_revoked list before
the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to remove
it from the list. If nfsd4_free_stateid() finds that the operation arrived
before the laundromat has placed it on the cl_revoked list, it marks the
state freed and then the laundromat will no longer add it to the list.
Also, for nfsd4_delegreturn(), when looking for the specified stid,
we need to access stids that are marked revoked or freeable; this means
the laundromat has started processing it but hasn't finished, and this
delegreturn needs to return nfserr_deleg_revoked and not
nfserr_bad_stateid. The latter will not trigger a FREE_STATEID and the
lack of it will leave this stid on the cl_revoked list indefinitely.
Fixes: 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is
protected by clp->cl_lock")
CC: stable(a)vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev(a)redhat.com>
---
fs/nfsd/nfs4state.c | 15 ++++++++++++---
fs/nfsd/state.h | 2 ++
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ac1859c7cc9d..cb989802e896 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1370,10 +1370,16 @@ static void revoke_delegation(struct nfs4_delegation *dp)
if (dp->dl_stid.sc_status &
(SC_STATUS_REVOKED | SC_STATUS_ADMIN_REVOKED)) {
spin_lock(&clp->cl_lock);
- refcount_inc(&dp->dl_stid.sc_count);
+ if (dp->dl_stid.sc_status & SC_STATUS_FREED) {
+ list_del_init(&dp->dl_recall_lru);
+ spin_unlock(&clp->cl_lock);
+ goto out;
+ }
list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+ dp->dl_stid.sc_status |= SC_STATUS_FREEABLE;
spin_unlock(&clp->cl_lock);
}
+out:
destroy_unhashed_deleg(dp);
}
@@ -6545,6 +6551,7 @@ nfs4_laundromat(struct nfsd_net *nn)
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
if (!state_expired(<, dp->dl_time))
break;
+ refcount_inc(&dp->dl_stid.sc_count);
unhash_delegation_locked(dp, SC_STATUS_REVOKED);
list_add(&dp->dl_recall_lru, &reaplist);
}
@@ -7156,7 +7163,9 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (s->sc_status & SC_STATUS_REVOKED) {
spin_unlock(&s->sc_lock);
dp = delegstateid(s);
- list_del_init(&dp->dl_recall_lru);
+ if (s->sc_status & SC_STATUS_FREEABLE)
+ list_del_init(&dp->dl_recall_lru);
+ s->sc_status |= SC_STATUS_FREED;
spin_unlock(&cl->cl_lock);
nfs4_put_stid(s);
ret = nfs_ok;
@@ -7486,7 +7495,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
return status;
- status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, 0, &s, nn);
+ status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, SC_STATUS_REVOKED|SC_STATUS_FREEABLE, &s, nn);
if (status)
goto out;
dp = delegstateid(s);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 79c743c01a47..35b3564c065f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -114,6 +114,8 @@ struct nfs4_stid {
/* For a deleg stateid kept around only to process free_stateid's: */
#define SC_STATUS_REVOKED BIT(1)
#define SC_STATUS_ADMIN_REVOKED BIT(2)
+#define SC_STATUS_FREEABLE BIT(3)
+#define SC_STATUS_FREED BIT(4)
unsigned short sc_status;
struct list_head sc_cp_list;
--
2.43.5
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
remap_file_pages()") fixed a security issue by adding an LSM check when
trying to remap file pages, so that LSMs have the opportunity to evaluate
such an action as they do for other memory operations such as mmap() and
mprotect().
However, that commit called security_mmap_file() inside the mmap_lock lock,
while the other calls do it before taking the lock, after commit
8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
This caused a lock inversion issue with IMA, which was taking the mmap_lock
and i_mutex locks in the opposite order when the remap_file_pages() system
call was called.
Solve the issue by splitting the critical region in remap_file_pages() in
two regions: the first takes a read lock of mmap_lock, retrieves the VMA
and the associated file, and calculates the 'prot' and 'flags' variables; the
second takes a write lock on mmap_lock, checks that the VMA flags and the
VMA file descriptor are the same as the ones obtained in the first critical
region (otherwise the system call fails), and calls do_mmap().
In between, after releasing the read lock and before taking the write lock,
call security_mmap_file(), thereby solving the lock inversion issue.
Cc: stable(a)vger.kernel.org
Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
Reported-by: syzbot+91ae49e1c1a2634d20c0(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.…
Reviewed-by: Roberto Sassu <roberto.sassu(a)huawei.com> (Calculate prot and flags earlier)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
---
mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 45 insertions(+), 17 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 9c0fb43064b5..762944427e03 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
unsigned long populate = 0;
unsigned long ret = -EINVAL;
struct file *file;
+ vm_flags_t vm_flags;
pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
current->comm, current->pid);
@@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
if (pgoff + (size >> PAGE_SHIFT) < pgoff)
return ret;
- if (mmap_write_lock_killable(mm))
+ if (mmap_read_lock_killable(mm))
+ return -EINTR;
+
+ vma = vma_lookup(mm, start);
+
+ if (!vma || !(vma->vm_flags & VM_SHARED)) {
+ mmap_read_unlock(mm);
+ return -EINVAL;
+ }
+
+ prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
+ prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
+ prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
+
+ flags &= MAP_NONBLOCK;
+ flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
+ if (vma->vm_flags & VM_LOCKED)
+ flags |= MAP_LOCKED;
+
+ /* Save vm_flags used to calculate prot and flags, and recheck later. */
+ vm_flags = vma->vm_flags;
+ file = get_file(vma->vm_file);
+
+ mmap_read_unlock(mm);
+
+ ret = security_mmap_file(file, prot, flags);
+ if (ret) {
+ fput(file);
+ return ret;
+ }
+
+ ret = -EINVAL;
+
+ if (mmap_write_lock_killable(mm)) {
+ fput(file);
return -EINTR;
+ }
vma = vma_lookup(mm, start);
- if (!vma || !(vma->vm_flags & VM_SHARED))
+ if (!vma)
+ goto out;
+
+ if (vma->vm_flags != vm_flags)
+ goto out;
+
+ if (vma->vm_file != file)
goto out;
if (start + size > vma->vm_end) {
@@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
goto out;
}
- prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
- prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
- prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
-
- flags &= MAP_NONBLOCK;
- flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
- if (vma->vm_flags & VM_LOCKED)
- flags |= MAP_LOCKED;
-
- file = get_file(vma->vm_file);
- ret = security_mmap_file(vma->vm_file, prot, flags);
- if (ret)
- goto out_fput;
ret = do_mmap(vma->vm_file, start, size,
prot, flags, 0, pgoff, &populate, NULL);
-out_fput:
- fput(file);
out:
mmap_write_unlock(mm);
+ fput(file);
if (populate)
mm_populate(ret, populate);
if (!IS_ERR_VALUE(ret))
--
2.34.1
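
The two-phase scheme described in the remap_file_pages() patch above can be
sketched outside the kernel. This is a minimal single-threaded model, not the
kernel code: map_lock, struct region, remap_region() and the three hooks are
hypothetical names, and the lock is only a state variable so that the ordering
(snapshot under the read lock, hook with no lock held, recheck under the write
lock) can be asserted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Model of the lock: a state variable instead of a real rwsem (assumption). */
enum lock_state { UNLOCKED, READ_LOCKED, WRITE_LOCKED };
static enum lock_state map_lock = UNLOCKED;

static void lock_as(enum lock_state s)
{
	assert(map_lock == UNLOCKED);
	map_lock = s;
}

static void unlock(void)
{
	assert(map_lock != UNLOCKED);
	map_lock = UNLOCKED;
}

/* Hypothetical stand-in for a VMA: only the fields that get rechecked. */
struct region {
	unsigned long flags;
	const void *file;
};

/* Hypothetical policy hook; it is always called with map_lock released. */
typedef int (*policy_hook)(struct region *r);

static int allow_hook(struct region *r)
{
	(void)r;
	assert(map_lock == UNLOCKED);
	return 0;
}

static int deny_hook(struct region *r)
{
	(void)r;
	return -EACCES;
}

/* Simulates another thread changing the region between the two phases. */
static int racing_hook(struct region *r)
{
	lock_as(WRITE_LOCKED);
	r->flags ^= 0x2;
	unlock();
	return 0;
}

static int remap_region(struct region *r, policy_hook hook)
{
	unsigned long seen_flags;
	const void *seen_file;
	int ret;

	/* Phase 1: snapshot what the hook needs under the read lock. */
	lock_as(READ_LOCKED);
	seen_flags = r->flags;
	seen_file = r->file;
	unlock();

	/* In between: run the hook with no lock held. */
	ret = hook(r);
	if (ret)
		return ret;

	/* Phase 2: retake the lock for writing and revalidate the snapshot. */
	lock_as(WRITE_LOCKED);
	if (r->flags != seen_flags || r->file != seen_file)
		ret = -EINVAL;
	/* A real implementation would perform the remap here when ret == 0. */
	unlock();
	return ret;
}
```

In this model remap_region() returns 0 with allow_hook, the hook's error with
deny_hook, and -EINVAL with racing_hook, because the recheck notices that the
flags changed while no lock was held.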
If some remap_pfn_range() calls succeeded before one failed, we still have
buffer pages mapped into the userspace page tables when we drop the buffer
reference with comedi_buf_map_put(bm). The userspace mappings are only
cleaned up later in the mmap error path.
Fix it by explicitly flushing all mappings in our VMA on the error path.
See commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
error case").
Cc: stable(a)vger.kernel.org
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Signed-off-by: Jann Horn <jannh(a)google.com>
---
Note: compile-tested only; I don't actually have comedi hardware, and I
don't know anything about comedi.
---
Changes in v3:
- gate zapping ptes on CONFIG_MMU (Intel kernel test robot)
- Link to v2: https://lore.kernel.org/r/20241015-comedi-tlb-v2-1-cafb0e27dd9a@google.com
Changes in v2:
- only do the zapping in the pfnmap path (Ian Abbott)
- use zap_vma_ptes() instead of zap_page_range_single() (Ian Abbott)
- Link to v1: https://lore.kernel.org/r/20241014-comedi-tlb-v1-1-4b699144b438@google.com
---
drivers/comedi/comedi_fops.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/comedi/comedi_fops.c b/drivers/comedi/comedi_fops.c
index 1b481731df96..b9df9b19d4bd 100644
--- a/drivers/comedi/comedi_fops.c
+++ b/drivers/comedi/comedi_fops.c
@@ -2407,6 +2407,18 @@ static int comedi_mmap(struct file *file, struct vm_area_struct *vma)
start += PAGE_SIZE;
}
+
+#ifdef CONFIG_MMU
+ /*
+ * Leaving behind a partial mapping of a buffer we're about to
+ * drop is unsafe, see remap_pfn_range_notrack().
+ * We need to zap the range here ourselves instead of relying
+ * on the automatic zapping in remap_pfn_range() because we call
+ * remap_pfn_range() in a loop.
+ */
+ if (retval)
+ zap_vma_ptes(vma, vma->vm_start, size);
+#endif
}
if (retval == 0) {
---
base-commit: 6485cf5ea253d40d507cd71253c9568c5470cd27
change-id: 20241014-comedi-tlb-400246505961
--
Jann Horn <jannh(a)google.com>
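
The error handling added by the comedi patch above follows a common pattern:
when pages are mapped one at a time in a loop, a failure partway through has to
undo the mappings that already succeeded. A minimal userspace sketch of that
pattern, assuming a hypothetical per-page mapper (map_one), a fixed page count
and a flag array in place of real page tables:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Stand-in for page table state: true means "page i is mapped". */
static bool mapped[NPAGES];

/* Hypothetical mapper: fails at index fail_at, never fails if fail_at < 0. */
static int map_one(size_t i, int fail_at)
{
	if ((int)i == fail_at)
		return -1;
	mapped[i] = true;
	return 0;
}

/* Drop every mapping in the range, whether or not it was established. */
static void zap_all(void)
{
	for (size_t i = 0; i < NPAGES; i++)
		mapped[i] = false;
}

static size_t count_mapped(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NPAGES; i++)
		n += mapped[i];
	return n;
}

static int map_buffer(int fail_at)
{
	int ret = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		ret = map_one(i, fail_at);
		if (ret)
			break;
	}
	/* Do not leave a partial mapping behind on failure. */
	if (ret)
		zap_all();
	return ret;
}
```

With fail_at set to 2, the first two pages are mapped and then removed again,
so count_mapped() is 0 after the failed call.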
In psnet_open_pf_bar() and snet_open_vf_bar() a string later passed to
pcim_iomap_regions() is placed on the stack. Neither
pcim_iomap_regions() nor the functions it calls copy that string.
Consequently, any later use of the string causes undefined behavior,
since the stack frame will have disappeared by then.
Fix the bug by allocating the strings on the heap through
devm_kasprintf().
Cc: stable(a)vger.kernel.org # v6.3
Fixes: 51a8f9d7f587 ("virtio: vdpa: new SolidNET DPU driver.")
Reported-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Closes: https://lore.kernel.org/all/74e9109a-ac59-49e2-9b1d-d825c9c9f891@wanadoo.fr/
Suggested-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Philipp Stanner <pstanner(a)redhat.com>
---
drivers/vdpa/solidrun/snet_main.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/vdpa/solidrun/snet_main.c b/drivers/vdpa/solidrun/snet_main.c
index 99428a04068d..c8b74980dbd1 100644
--- a/drivers/vdpa/solidrun/snet_main.c
+++ b/drivers/vdpa/solidrun/snet_main.c
@@ -555,7 +555,7 @@ static const struct vdpa_config_ops snet_config_ops = {
static int psnet_open_pf_bar(struct pci_dev *pdev, struct psnet *psnet)
{
- char name[50];
+ char *name;
int ret, i, mask = 0;
/* We don't know which BAR will be used to communicate..
* We will map every bar with len > 0.
@@ -573,7 +573,10 @@ static int psnet_open_pf_bar(struct pci_dev *pdev, struct psnet *psnet)
return -ENODEV;
}
- snprintf(name, sizeof(name), "psnet[%s]-bars", pci_name(pdev));
+ name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "psnet[%s]-bars", pci_name(pdev));
+ if (!name)
+ return -ENOMEM;
+
ret = pcim_iomap_regions(pdev, mask, name);
if (ret) {
SNET_ERR(pdev, "Failed to request and map PCI BARs\n");
@@ -590,10 +593,13 @@ static int psnet_open_pf_bar(struct pci_dev *pdev, struct psnet *psnet)
static int snet_open_vf_bar(struct pci_dev *pdev, struct snet *snet)
{
- char name[50];
+ char *name;
int ret;
- snprintf(name, sizeof(name), "snet[%s]-bar", pci_name(pdev));
+ name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "snet[%s]-bars", pci_name(pdev));
+ if (!name)
+ return -ENOMEM;
+
/* Request and map BAR */
ret = pcim_iomap_regions(pdev, BIT(snet->psnet->cfg.vf_bar), name);
if (ret) {
--
2.46.1
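
The lifetime problem fixed by the snet patch above can be illustrated in plain
C. This is a sketch under stated assumptions: register_region() is a
hypothetical callee that keeps the pointer it is given without copying the
string, and malloc()/free() stand in for the device-managed allocation that
devm_kasprintf() provides in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical registry: it stores the pointer, not a copy of the string. */
static char *registered_name;

static void register_region(char *name)
{
	registered_name = name;
}

/* Builds the name on the heap so it stays valid after open_bar() returns. */
static int open_bar(const char *dev)
{
	int len = snprintf(NULL, 0, "snet[%s]-bar", dev);
	char *name;

	if (len < 0)
		return -1;
	name = malloc((size_t)len + 1);
	if (!name)
		return -1;
	snprintf(name, (size_t)len + 1, "snet[%s]-bar", dev);

	register_region(name);
	return 0;
}

/* Counterpart of the automatic devres cleanup: free the name on teardown. */
static void close_bar(void)
{
	free(registered_name);
	registered_name = NULL;
}
```

Had open_bar() used a local char array instead, registered_name would point
into a dead stack frame as soon as the function returned.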
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e8061f06185be0a06a73760d6526b8b0feadfe52
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101827-implosion-twilight-c8e1@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e8061f06185be0a06a73760d6526b8b0feadfe52 Mon Sep 17 00:00:00 2001
From: Nico Boehr <nrb(a)linux.ibm.com>
Date: Tue, 17 Sep 2024 17:18:33 +0200
Subject: [PATCH] KVM: s390: gaccess: Check if guest address is in memslot
Previously, access_guest_page() did not check whether the given guest
address is inside of a memslot. This is not a problem, since
kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case.
However, -EFAULT is also returned when copy_to/from_user fails.
When emulating a guest instruction, the address being outside a memslot
usually means that an addressing exception should be injected into the
guest.
Failure in copy_to/from_user however indicates that something is wrong
in userspace and hence should be handled there.
To be able to distinguish these two cases, return PGM_ADDRESSING in
access_guest_page() when the guest address is outside guest memory. In
access_guest_real(), populate vcpu->arch.pgm.code such that
kvm_s390_inject_prog_cond() can be used in the caller for injecting into
the guest (if applicable).
Since this adds a new return value to access_guest_page(), we need to make
sure that other callers are not confused by the new positive return value.
There are the following users of access_guest_page():
- access_guest_with_key() does the checking itself (in
guest_range_to_gpas()), so this case should never happen. Even if it does,
the handling is set up properly.
- access_guest_real() just passes the return code to its callers, which
are:
- read_guest_real() - see below
- write_guest_real() - see below
There are the following users of read_guest_real():
- ar_translation() in gaccess.c which already returns PGM_*
- setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always
return -EFAULT on read_guest_real() nonzero return - no change
- shadow_crycb(), handle_stfle() always present this as validity, this
could be handled better but doesn't change current behaviour - no change
There are the following users of write_guest_real():
- kvm_s390_store_status_unloaded() always returns -EFAULT on
write_guest_real() failure.
Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nico Boehr <nrb(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com
Acked-by: Janosch Frank <frankja(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index e65f597e3044..a688351f4ab5 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
const gfn_t gfn = gpa_to_gfn(gpa);
int rc;
+ if (!gfn_to_memslot(kvm, gfn))
+ return PGM_ADDRESSING;
if (mode == GACC_STORE)
rc = kvm_write_guest_page(kvm, gfn, data, offset, len);
else
@@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
gra += fragment_len;
data += fragment_len;
}
+ if (rc > 0)
+ vcpu->arch.pgm.code = rc;
return rc;
}
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index b320d12aa049..3fde45a151f2 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @data (kernel space) to @gra (guest real address).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest low address and key protection are not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying from @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to guest memory.
*/
@@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @gra (guest real address) to @data (kernel space).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest key protection is not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying to @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to kernel space.
*/
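
The return-code convention introduced by the KVM patch above (positive program
interruption code for guest-caused errors, negative errno for host-side
failures) can be sketched as follows. This is an illustrative model only:
struct memslot, find_slot(), access_page() and classify() are hypothetical,
the host_copy_fails flag simulates a failing copy_to/from_user, and the value
0x05 for PGM_ADDRESSING is assumed here rather than taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define PGM_ADDRESSING 0x05	/* assumed value, for illustration */

struct memslot {
	unsigned long base_gfn;
	unsigned long npages;
	bool host_copy_fails;	/* simulates a failing user copy */
};

static const struct memslot *find_slot(const struct memslot *slots, size_t n,
				       unsigned long gfn)
{
	for (size_t i = 0; i < n; i++)
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	return NULL;
}

/* Returns 0, a positive PGM_* code, or a negative errno. */
static int access_page(const struct memslot *slots, size_t n, unsigned long gfn)
{
	const struct memslot *slot = find_slot(slots, n, gfn);

	if (!slot)
		return PGM_ADDRESSING;	/* address is outside guest memory */
	if (slot->host_copy_fails)
		return -EFAULT;		/* something is wrong on the host side */
	return 0;
}

enum action { ACT_NONE, ACT_INJECT_INTO_GUEST, ACT_REPORT_TO_USERSPACE };

/* The sign of the return code tells the caller who has to handle it. */
static enum action classify(int rc)
{
	if (rc > 0)
		return ACT_INJECT_INTO_GUEST;
	if (rc < 0)
		return ACT_REPORT_TO_USERSPACE;
	return ACT_NONE;
}
```

Callers that only test for a nonzero result keep working unchanged, which is
why the commit message walks through every user of the function.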
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e8061f06185be0a06a73760d6526b8b0feadfe52
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101826-gracious-singer-816f@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e8061f06185be0a06a73760d6526b8b0feadfe52 Mon Sep 17 00:00:00 2001
From: Nico Boehr <nrb(a)linux.ibm.com>
Date: Tue, 17 Sep 2024 17:18:33 +0200
Subject: [PATCH] KVM: s390: gaccess: Check if guest address is in memslot
Previously, access_guest_page() did not check whether the given guest
address is inside of a memslot. This is not a problem, since
kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case.
However, -EFAULT is also returned when copy_to/from_user fails.
When emulating a guest instruction, the address being outside a memslot
usually means that an addressing exception should be injected into the
guest.
Failure in copy_to/from_user however indicates that something is wrong
in userspace and hence should be handled there.
To be able to distinguish these two cases, return PGM_ADDRESSING in
access_guest_page() when the guest address is outside guest memory. In
access_guest_real(), populate vcpu->arch.pgm.code such that
kvm_s390_inject_prog_cond() can be used in the caller for injecting into
the guest (if applicable).
Since this adds a new return value to access_guest_page(), we need to make
sure that other callers are not confused by the new positive return value.
There are the following users of access_guest_page():
- access_guest_with_key() does the checking itself (in
guest_range_to_gpas()), so this case should never happen. Even if it does,
the handling is set up properly.
- access_guest_real() just passes the return code to its callers, which
are:
- read_guest_real() - see below
- write_guest_real() - see below
There are the following users of read_guest_real():
- ar_translation() in gaccess.c which already returns PGM_*
- setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always
return -EFAULT on read_guest_real() nonzero return - no change
- shadow_crycb(), handle_stfle() always present this as validity, this
could be handled better but doesn't change current behaviour - no change
There are the following users of write_guest_real():
- kvm_s390_store_status_unloaded() always returns -EFAULT on
write_guest_real() failure.
Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nico Boehr <nrb(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com
Acked-by: Janosch Frank <frankja(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index e65f597e3044..a688351f4ab5 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
const gfn_t gfn = gpa_to_gfn(gpa);
int rc;
+ if (!gfn_to_memslot(kvm, gfn))
+ return PGM_ADDRESSING;
if (mode == GACC_STORE)
rc = kvm_write_guest_page(kvm, gfn, data, offset, len);
else
@@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
gra += fragment_len;
data += fragment_len;
}
+ if (rc > 0)
+ vcpu->arch.pgm.code = rc;
return rc;
}
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index b320d12aa049..3fde45a151f2 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @data (kernel space) to @gra (guest real address).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest low address and key protection are not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying from @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to guest memory.
*/
@@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @gra (guest real address) to @data (kernel space).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest key protection is not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying to @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to kernel space.
*/
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e8061f06185be0a06a73760d6526b8b0feadfe52
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101824-departure-oversight-aa1e@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e8061f06185be0a06a73760d6526b8b0feadfe52 Mon Sep 17 00:00:00 2001
From: Nico Boehr <nrb(a)linux.ibm.com>
Date: Tue, 17 Sep 2024 17:18:33 +0200
Subject: [PATCH] KVM: s390: gaccess: Check if guest address is in memslot
Previously, access_guest_page() did not check whether the given guest
address is inside of a memslot. This is not a problem, since
kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case.
However, -EFAULT is also returned when copy_to/from_user fails.
When emulating a guest instruction, the address being outside a memslot
usually means that an addressing exception should be injected into the
guest.
Failure in copy_to/from_user however indicates that something is wrong
in userspace and hence should be handled there.
To be able to distinguish these two cases, return PGM_ADDRESSING in
access_guest_page() when the guest address is outside guest memory. In
access_guest_real(), populate vcpu->arch.pgm.code such that
kvm_s390_inject_prog_cond() can be used in the caller for injecting into
the guest (if applicable).
Since this adds a new return value to access_guest_page(), we need to make
sure that other callers are not confused by the new positive return value.
There are the following users of access_guest_page():
- access_guest_with_key() does the checking itself (in
guest_range_to_gpas()), so this case should never happen. Even if it does,
the handling is set up properly.
- access_guest_real() just passes the return code to its callers, which
are:
- read_guest_real() - see below
- write_guest_real() - see below
There are the following users of read_guest_real():
- ar_translation() in gaccess.c which already returns PGM_*
- setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always
return -EFAULT on read_guest_real() nonzero return - no change
- shadow_crycb(), handle_stfle() always present this as validity, this
could be handled better but doesn't change current behaviour - no change
There are the following users of write_guest_real():
- kvm_s390_store_status_unloaded() always returns -EFAULT on
write_guest_real() failure.
Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nico Boehr <nrb(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com
Acked-by: Janosch Frank <frankja(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index e65f597e3044..a688351f4ab5 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
const gfn_t gfn = gpa_to_gfn(gpa);
int rc;
+ if (!gfn_to_memslot(kvm, gfn))
+ return PGM_ADDRESSING;
if (mode == GACC_STORE)
rc = kvm_write_guest_page(kvm, gfn, data, offset, len);
else
@@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
gra += fragment_len;
data += fragment_len;
}
+ if (rc > 0)
+ vcpu->arch.pgm.code = rc;
return rc;
}
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index b320d12aa049..3fde45a151f2 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @data (kernel space) to @gra (guest real address).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest low address and key protection are not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying from @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to guest memory.
*/
@@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @gra (guest real address) to @data (kernel space).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest key protection is not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying to @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to kernel space.
*/
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e8061f06185be0a06a73760d6526b8b0feadfe52
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101823-tractor-twitter-a318@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e8061f06185be0a06a73760d6526b8b0feadfe52 Mon Sep 17 00:00:00 2001
From: Nico Boehr <nrb(a)linux.ibm.com>
Date: Tue, 17 Sep 2024 17:18:33 +0200
Subject: [PATCH] KVM: s390: gaccess: Check if guest address is in memslot
Previously, access_guest_page() did not check whether the given guest
address is inside of a memslot. This is not a problem, since
kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case.
However, -EFAULT is also returned when copy_to/from_user fails.
When emulating a guest instruction, the address being outside a memslot
usually means that an addressing exception should be injected into the
guest.
Failure in copy_to/from_user however indicates that something is wrong
in userspace and hence should be handled there.
To be able to distinguish these two cases, return PGM_ADDRESSING in
access_guest_page() when the guest address is outside guest memory. In
access_guest_real(), populate vcpu->arch.pgm.code such that
kvm_s390_inject_prog_cond() can be used in the caller for injecting into
the guest (if applicable).
Since this adds a new return value to access_guest_page(), we need to make
sure that other callers are not confused by the new positive return value.
There are the following users of access_guest_page():
- access_guest_with_key() does the checking itself (in
guest_range_to_gpas()), so this case should never happen. Even if it does,
the handling is set up properly.
- access_guest_real() just passes the return code to its callers, which
are:
- read_guest_real() - see below
- write_guest_real() - see below
There are the following users of read_guest_real():
- ar_translation() in gaccess.c which already returns PGM_*
- setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always
return -EFAULT on read_guest_real() nonzero return - no change
- shadow_crycb(), handle_stfle() always present this as validity, this
could be handled better but doesn't change current behaviour - no change
There are the following users of write_guest_real():
- kvm_s390_store_status_unloaded() always returns -EFAULT on
write_guest_real() failure.
Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nico Boehr <nrb(a)linux.ibm.com>
Reviewed-by: Heiko Carstens <hca(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com
Acked-by: Janosch Frank <frankja(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index e65f597e3044..a688351f4ab5 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
const gfn_t gfn = gpa_to_gfn(gpa);
int rc;
+ if (!gfn_to_memslot(kvm, gfn))
+ return PGM_ADDRESSING;
if (mode == GACC_STORE)
rc = kvm_write_guest_page(kvm, gfn, data, offset, len);
else
@@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
gra += fragment_len;
data += fragment_len;
}
+ if (rc > 0)
+ vcpu->arch.pgm.code = rc;
return rc;
}
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index b320d12aa049..3fde45a151f2 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @data (kernel space) to @gra (guest real address).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest low address and key protection are not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying from @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to guest memory.
*/
@@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @gra (guest real address) to @data (kernel space).
- * It is up to the caller to ensure that the entire guest memory range is
- * valid memory before calling this function.
* Guest key protection is not checked.
*
- * Returns zero on success or -EFAULT on error.
+ * Returns zero on success, -EFAULT when copying to @data failed, or
+ * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
+ * is also stored to allow injecting into the guest (if applicable) using
+ * kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to kernel space.
*/
Currently, the rproc "atomic_t power" variable is incremented during:
a. WPSS rproc auto boot.
b. AHB power on for ath11k.
During AHB power off (rmmod ath11k_ahb.ko), rproc_shutdown fails
to unload the WPSS firmware because the rproc->power value is '2',
causing the atomic_dec_and_test(&rproc->power) condition to fail.
Consequently, during AHB power on (insmod ath11k_ahb.ko),
QMI_WLANFW_HOST_CAP_REQ_V01 fails due to the host and firmware QMI
states being out of sync.
Fixes: 300ed425dfa9 ("remoteproc: qcom_q6v5_pas: Add SC7280 ADSP, CDSP & WPSS")
Cc: stable(a)vger.kernel.org
Signed-off-by: Balaji Pothunoori <quic_bpothuno(a)quicinc.com>
---
v2: updated commit text.
added Fixes/cc:stable tags.
drivers/remoteproc/qcom_q6v5_pas.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
index ef82835e98a4..05963d7924df 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -1344,7 +1344,7 @@ static const struct adsp_data sc7280_wpss_resource = {
.crash_reason_smem = 626,
.firmware_name = "wpss.mdt",
.pas_id = 6,
- .auto_boot = true,
+ .auto_boot = false,
.proxy_pd_names = (char*[]){
"cx",
"mx",
--
2.34.1
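
The reference-count behaviour described in the remoteproc patch above can be
reproduced with a small userspace model. This is a sketch, assuming a
hypothetical struct proc with a C11 atomic counter in place of the kernel's
atomic_t; only the first boot powers the processor on and only the last
shutdown powers it off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the remote processor state. */
struct proc {
	atomic_int power;	/* number of users that booted the processor */
	bool running;
};

static void proc_init(struct proc *p)
{
	atomic_init(&p->power, 0);
	p->running = false;
}

static void proc_boot(struct proc *p)
{
	/* Only the first user actually powers the processor on. */
	if (atomic_fetch_add(&p->power, 1) == 0)
		p->running = true;
}

static void proc_shutdown(struct proc *p)
{
	/* Like a decrement-and-test: only the last user powers it off. */
	if (atomic_fetch_sub(&p->power, 1) == 1)
		p->running = false;
}
```

If an automatic boot and the driver each call proc_boot(), a single
proc_shutdown() from the driver leaves the counter at 1 and the processor
running, which matches the failure described in the commit message. With the
automatic boot disabled, the driver's shutdown drops the counter to 0.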
commit f011c9cf04c06f16b24f583d313d3c012e589e50 upstream.
The submit queue polling threads are userland threads that just never
exit to the userland. When creating the thread with IORING_SETUP_SQ_AFF,
the affinity of the poller thread is set to the cpu specified in
sq_thread_cpu. However, this CPU can be outside of the cpuset defined
by the cgroup cpuset controller. This violates the rules defined by the
cpuset controller and is a potential issue for realtime applications.
In b7ed6d8ffd6 we fixed the default affinity of the poller thread for the
case where no explicit pinning is requested, by inheriting the affinity of
the creating task. With explicit pinning the check is more
complicated, as a cpu outside of the parent cpumask is also allowed.
We implement the check by using cpuset_cpus_allowed() (which has support
for cgroup cpusets) and testing whether the requested cpu is in the set.
Fixes: 37d1e2e3642e ("io_uring: move SQPOLL thread io-wq forked worker")
Signed-off-by: Felix Moessbauer <felix.moessbauer(a)siemens.com>
Link: https://lore.kernel.org/r/20240909150036.55921-1-felix.moessbauer@siemens.c…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
io_uring/io_uring.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 8ed2c65529714..6b6fd244233f8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -56,6 +56,7 @@
#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/percpu.h>
+#include <linux/cpuset.h>
#include <linux/slab.h>
#include <linux/blkdev.h>
#include <linux/bvec.h>
@@ -8746,10 +8747,12 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
return 0;
if (p->flags & IORING_SETUP_SQ_AFF) {
+ struct cpumask allowed_mask;
int cpu = p->sq_thread_cpu;
ret = -EINVAL;
- if (cpu >= nr_cpu_ids || !cpu_online(cpu))
+ cpuset_cpus_allowed(current, &allowed_mask);
+ if (!cpumask_test_cpu(cpu, &allowed_mask))
goto err_sqpoll;
sqd->sq_cpu = cpu;
} else {
--
2.39.5
Svacer reports a NULL-pointer dereference in rtl8xxxu_probe().
After having been compared to a NULL value, pointer hw is passed as
1st parameter in call to ieee80211_free_hw(), where it is dereferenced.
The problem is present in 5.10 stable release and can be fixed by the
following upstream patch that can be cleanly applied to 5.10 branch.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x bea07fd63192b61209d48cbb81ef474cc3ee4c62
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101840-army-handstand-92f8@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bea07fd63192b61209d48cbb81ef474cc3ee4c62 Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Date: Mon, 7 Oct 2024 16:28:32 +0100
Subject: [PATCH] maple_tree: correct tree corruption on spanning store
Patch series "maple_tree: correct tree corruption on spanning store", v3.
There has been a nasty yet subtle maple tree corruption bug that appears
to have been in existence since the inception of the algorithm.
This bug seems far more likely to happen since commit f8d112a4e657
("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
at which reports started to be submitted concerning this bug.
We were made definitely aware of the bug thanks to the kind efforts of
Bert Karwatzki who helped enormously in my being able to track this down
and identify the cause of it.
The bug arises when an attempt is made to perform a spanning store across
two leaf nodes, where the right leaf node is the rightmost child of the
shared parent, AND the store completely consumes the right-most node.
This results in mas_wr_spanning_store() mistakenly duplicating the new and
existing entries at the maximum pivot within the range, and thus maple
tree corruption.
The fix patch corrects this by detecting this scenario and disallowing the
mistaken duplicate copy.
The fix patch commit message goes into great detail as to how this occurs.
This series also includes a test which reliably reproduces the issue, and
asserts that the fix works correctly.
Bert has kindly tested the fix and confirmed it resolved his issues. Also
Mikhail Gavrilov kindly reported what appears to be precisely the same
bug, which this fix should also resolve.
This patch (of 2):
There has been a subtle bug present in the maple tree implementation from
its inception.
This arises from how stores are performed - when a store occurs, it will
overwrite overlapping ranges and adjust the tree as necessary to
accommodate this.
A range may always ultimately span two leaf nodes. In this instance we
walk the two leaf nodes, determine which elements are not overwritten to
the left and to the right of the start and end of the ranges respectively
and then rebalance the tree to contain these entries and the newly
inserted one.
This kind of store is dubbed a 'spanning store' and is implemented by
mas_wr_spanning_store().
In order to reach this stage, mas_store_gfp() invokes
mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
walk the tree and update the object (mas) to traverse to the location
where the write should be performed, determining its store type.
When a spanning store is required, this function returns false stopping at
the parent node which contains the target range, and mas_wr_store_type()
marks the mas->store_type as wr_spanning_store to denote this fact.
When we go to perform the store in mas_wr_spanning_store(), we first
determine the elements AFTER the END of the range we wish to store (that
is, to the right of the entry to be inserted) - we do this by walking to
the NEXT pivot in the tree (i.e. r_mas.last + 1), starting at the node we
have just determined contains the range over which we intend to write.
We then turn our attention to the entries to the left of the entry we are
inserting, whose state is represented by l_mas, and copy these into a 'big
node', which is a special node which contains enough slots to contain two
leaf nodes' worth of data.
We then copy the entry we wish to store immediately after this - the copy
and the insertion of the new entry is performed by mas_store_b_node().
After this we copy the elements to the right of the end of the range which
we are inserting, if we have not exceeded the length of the node (i.e.
r_mas.offset <= r_mas.end).
Herein lies the bug - under very specific circumstances, this logic can
break and corrupt the maple tree.
Consider the following tree:
Height
  0                              Root Node
                                  /      \
                  pivot = 0xffff /        \ pivot = ULONG_MAX
                                /          \
  1                       A [-----]       ...
                             /   \
             pivot = 0x4fff /     \ pivot = 0xffff
                           /       \
  2 (LEAVES)          B [-----]  [-----] C
                                      ^--- Last pivot 0xffff.
Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
that all ranges expressed in maple tree code are inclusive):
1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
determines that this is a spanning store across nodes B and C. The mas
state is set such that the current node from which we traverse further
is node A.
2. In mas_wr_spanning_store() we try to find elements to the right of pivot
0xffff by searching for an index of 0x10000:
- mas_wr_walk_index() invokes mas_wr_walk_descend() and
mas_wr_node_walk() in turn.
- mas_wr_node_walk() loops over entries in node A until EITHER it
finds an entry whose pivot equals or exceeds 0x10000 OR it
reaches the final entry.
- Since no entry has a pivot equal to or exceeding 0x10000, pivot
0xffff is selected, leading to node C.
- mas_wr_walk_traverse() resets the mas state to traverse node C. We
loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
in turn once again.
- Again, we reach the last entry in node C, which has a pivot of
0xffff.
3. We then copy the elements to the left of 0x4000 in node B to the big
node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
too.
4. We determine whether we have any entries to copy from the right of the
end of the range by checking r_mas.offset <= r_mas.end. With r_mas set
up at the entry at pivot 0xffff, this check passes, and so we DUPLICATE
the entry at pivot 0xffff.
5. BUG! The maple tree is corrupted with a duplicate entry.
This requires a very specific set of circumstances - we must be spanning
the last element in a leaf node, which is the last element in the parent
node, in a spanning store across two leaf nodes with a range that ends at
that shared pivot.
A potential solution to this problem would simply be to reset the walk
each time we traverse r_mas, however given the rarity of this situation it
seems that would be rather inefficient.
Instead, this patch detects if the right hand node is populated, i.e. has
anything we need to copy.
We do so by only copying elements from the right of the entry being
inserted when the maximum value present exceeds the last, rather than
basing this on offset position.
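The difference between the two checks can be sketched with a toy model.
This is not the maple tree implementation: struct toy_r_mas, the toy_*
helpers and the flat pivot array are hypothetical simplifications that
only mirror the field names used in this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the right-hand state. */
struct toy_r_mas {
	unsigned long max;	/* largest index covered by the node */
	unsigned long last;	/* last index of the range being stored */
	size_t offset;		/* slot at which the walk stopped */
	size_t end;		/* last occupied slot in the node */
};

/*
 * Walk the pivots looking for @index. If no pivot equals or exceeds
 * @index, stop at the final slot, as the walk described above does.
 */
static size_t toy_node_walk(const unsigned long *pivots, size_t count,
			    unsigned long index)
{
	size_t i;

	for (i = 0; i + 1 < count; i++)
		if (pivots[i] >= index)
			break;
	return i;
}

/* Old check: based purely on the slot position. */
static bool toy_old_should_copy(const struct toy_r_mas *r)
{
	return r->offset <= r->end;
}

/* New check: copy only if the node covers indices beyond the store. */
static bool toy_new_should_copy(const struct toy_r_mas *r)
{
	return r->max > r->last;
}
```

For a leaf with pivots { 0x7fff, 0xffff } and a store ending at 0xffff,
walking to index 0x10000 stops at the final slot, so the old check still
asks for a copy while the new one does not. For a store ending at 0x7fff
both checks agree that the entry up to 0xffff must be copied.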
The patch also updates some comments and eliminates the unused bool return
value in mas_wr_walk_index().
The work performed in commit f8d112a4e657 ("mm/mmap: avoid zeroing vma
tree in mmap_region()") seems to have made the probability of this event
much more likely, which is the point at which reports started to be
submitted concerning this bug.
The motivation for this change arose from Bert Karwatzki's report of
encountering mm instability after the release of kernel v6.12-rc1 which,
after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
options, was identified as maple tree corruption.
After Bert very generously provided his time and ability to reproduce this
event consistently, I was able to finally identify that the issue
discussed in this commit message was occurring for him.
Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.17283144…
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Bert Karwatzki <spasswolf(a)web.de>
Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf(a)web.de>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7…
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com>
Reviewed-by: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index ce7c7a7a8258..3619301dda2e 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2196,6 +2196,8 @@ static inline void mas_node_or_none(struct ma_state *mas,
/*
* mas_wr_node_walk() - Find the correct offset for the index in the @mas.
+ * If @mas->index cannot be found within the containing
+ * node, we traverse to the last entry in the node.
* @wr_mas: The maple write state
*
* Uses mas_slot_locked() and does not need to worry about dead nodes.
@@ -3532,7 +3534,7 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas)
return true;
}
-static bool mas_wr_walk_index(struct ma_wr_state *wr_mas)
+static void mas_wr_walk_index(struct ma_wr_state *wr_mas)
{
struct ma_state *mas = wr_mas->mas;
@@ -3541,11 +3543,9 @@ static bool mas_wr_walk_index(struct ma_wr_state *wr_mas)
wr_mas->content = mas_slot_locked(mas, wr_mas->slots,
mas->offset);
if (ma_is_leaf(wr_mas->type))
- return true;
+ return;
mas_wr_walk_traverse(wr_mas);
-
}
- return true;
}
/*
* mas_extend_spanning_null() - Extend a store of a %NULL to include surrounding %NULLs.
@@ -3765,8 +3765,8 @@ static noinline void mas_wr_spanning_store(struct ma_wr_state *wr_mas)
memset(&b_node, 0, sizeof(struct maple_big_node));
/* Copy l_mas and store the value in b_node. */
mas_store_b_node(&l_wr_mas, &b_node, l_mas.end);
- /* Copy r_mas into b_node. */
- if (r_mas.offset <= r_mas.end)
+ /* Copy r_mas into b_node if there is anything to copy. */
+ if (r_mas.max > r_mas.last)
mas_mab_cp(&r_mas, r_mas.offset, r_mas.end,
&b_node, b_node.b_end + 1);
else
Prevent the xHC host from processing a cancelled URB by always turning
cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command.
If the command fails then the xHC will start processing the cancelled TD
instead of skipping it once the endpoint is restarted, causing issues
like Babble errors.
This is not a complete solution as a failed 'Set TR Deq' command does not
guarantee xHC TRB caches are cleared.
Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 4d664ba53fe9..7dedf31bbddd 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1023,7 +1023,7 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
td_to_noop(xhci, ring, cached_td, false);
cached_td->cancel_status = TD_CLEARED;
}
-
+ td_to_noop(xhci, ring, td, false);
td->cancel_status = TD_CLEARING_CACHE;
cached_td = td;
break;
--
2.25.1
The PLL checks are comparing 64 bit integers with 32 bit
ones, as reported by Coverity. Depending on the values of
the variables, this may underflow.
Fix it by ensuring that both sides of the expression are u64.
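A small userspace sketch of why the cast matters, assuming a 32-bit
unsigned int. TOY_PLL_MAX and the toy_* helpers are hypothetical and
chosen only for illustration; they are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical limit, picked so that a small multiplier exceeds 32 bits. */
#define TOY_PLL_MAX 1280000000u

/* Both operands are 32 bits wide, so the product wraps before widening. */
static uint64_t toy_limit_32bit(uint32_t pre)
{
	return (uint32_t)TOY_PLL_MAX * pre;
}

/* Casting one operand first makes the whole multiplication 64-bit. */
static uint64_t toy_limit_64bit(uint32_t pre)
{
	return (uint64_t)TOY_PLL_MAX * pre;
}
```

With pre = 4 the 64-bit variant yields 5120000000, whereas the 32-bit
variant wraps to 825032704, so a comparison against it can go the wrong
way.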
Fixes: 852b50aeed15 ("media: On Semi AR0521 sensor driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
Acked-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
---
drivers/media/i2c/ar0521.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/media/i2c/ar0521.c b/drivers/media/i2c/ar0521.c
index fc27238dd4d3..24873149096c 100644
--- a/drivers/media/i2c/ar0521.c
+++ b/drivers/media/i2c/ar0521.c
@@ -255,10 +255,10 @@ static u32 calc_pll(struct ar0521_dev *sensor, u32 freq, u16 *pre_ptr, u16 *mult
continue; /* Minimum value */
if (new_mult > 254)
break; /* Maximum, larger pre won't work either */
- if (sensor->extclk_freq * (u64)new_mult < AR0521_PLL_MIN *
+ if (sensor->extclk_freq * (u64)new_mult < (u64)AR0521_PLL_MIN *
new_pre)
continue;
- if (sensor->extclk_freq * (u64)new_mult > AR0521_PLL_MAX *
+ if (sensor->extclk_freq * (u64)new_mult > (u64)AR0521_PLL_MAX *
new_pre)
break; /* Larger pre won't work either */
new_pll = div64_round_up(sensor->extclk_freq * (u64)new_mult,
--
2.47.0
Svacer reports possible dereference of a NULL-pointer in
amd_iommu_probe_finalize(). The problem is present in 5.10 stable release
and can be fixed by the following upstream patch. In order to apply this
patch, the incoming changes had to be manually accepted. This action was
necessary due to some differences in the code of amd_iommu_probe_finalize()
of the upstream version and 5.10 version of the kernel.
A commit adding back the stopping of tx on port shutdown failed to add
back the locking which had also been removed by commit e83766334f96
("tty: serial: qcom_geni_serial: No need to stop tx/rx on UART
shutdown").
Holding the port lock is needed to serialise against the console code,
which may update the interrupt enable register and access the port
state.
Fixes: d8aca2f96813 ("tty: serial: qcom-geni-serial: stop operations in progress at shutdown")
Fixes: 947cc4ecc06c ("serial: qcom-geni: fix soft lockup on sw flow control and suspend")
Cc: stable(a)vger.kernel.org # 6.3
Cc: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/tty/serial/qcom_geni_serial.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 9ea6bd09e665..b6a8729cee6d 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1096,10 +1096,12 @@ static void qcom_geni_serial_shutdown(struct uart_port *uport)
{
disable_irq(uport->irq);
+ uart_port_lock_irq(uport);
qcom_geni_serial_stop_tx(uport);
qcom_geni_serial_stop_rx(uport);
qcom_geni_serial_cancel_tx_cmd(uport);
+ uart_port_unlock_irq(uport);
}
static void qcom_geni_serial_flush_buffer(struct uart_port *uport)
--
2.45.2
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x bea07fd63192b61209d48cbb81ef474cc3ee4c62
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101818-ducky-dallying-2814@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bea07fd63192b61209d48cbb81ef474cc3ee4c62 Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Date: Mon, 7 Oct 2024 16:28:32 +0100
Subject: [PATCH] maple_tree: correct tree corruption on spanning store
Patch series "maple_tree: correct tree corruption on spanning store", v3.
There has been a nasty yet subtle maple tree corruption bug that appears
to have been in existence since the inception of the algorithm.
This bug seems far more likely to happen since commit f8d112a4e657
("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
at which reports started to be submitted concerning this bug.
We were made definitely aware of the bug thanks to the kind efforts of
Bert Karwatzki who helped enormously in my being able to track this down
and identify the cause of it.
The bug arises when an attempt is made to perform a spanning store across
two leaf nodes, where the right leaf node is the rightmost child of the
shared parent, AND the store completely consumes the right-most node.
This results in mas_wr_spanning_store() mistakenly duplicating the new and
existing entries at the maximum pivot within the range, and thus maple
tree corruption.
The fix patch corrects this by detecting this scenario and disallowing the
mistaken duplicate copy.
The fix patch commit message goes into great detail as to how this occurs.
This series also includes a test which reliably reproduces the issue, and
asserts that the fix works correctly.
Bert has kindly tested the fix and confirmed it resolved his issues. Also
Mikhail Gavrilov kindly reported what appears to be precisely the same
bug, which this fix should also resolve.
This patch (of 2):
There has been a subtle bug present in the maple tree implementation from
its inception.
This arises from how stores are performed - when a store occurs, it will
overwrite overlapping ranges and adjust the tree as necessary to
accommodate this.
A range may always ultimately span two leaf nodes. In this instance we
walk the two leaf nodes, determine which elements are not overwritten to
the left and to the right of the start and end of the ranges respectively
and then rebalance the tree to contain these entries and the newly
inserted one.
This kind of store is dubbed a 'spanning store' and is implemented by
mas_wr_spanning_store().
In order to reach this stage, mas_store_gfp() invokes
mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
walk the tree and update the object (mas) to traverse to the location
where the write should be performed, determining its store type.
When a spanning store is required, this function returns false stopping at
the parent node which contains the target range, and mas_wr_store_type()
marks the mas->store_type as wr_spanning_store to denote this fact.
When we go to perform the store in mas_wr_spanning_store(), we first
determine the elements AFTER the END of the range we wish to store (that
is, to the right of the entry to be inserted) - we do this by walking to
the NEXT pivot in the tree (i.e. r_mas.last + 1), starting at the node we
have just determined contains the range over which we intend to write.
We then turn our attention to the entries to the left of the entry we are
inserting, whose state is represented by l_mas, and copy these into a 'big
node', which is a special node which contains enough slots to contain two
leaf nodes' worth of data.
We then copy the entry we wish to store immediately after this - the copy
and the insertion of the new entry is performed by mas_store_b_node().
After this we copy the elements to the right of the end of the range which
we are inserting, if we have not exceeded the length of the node (i.e.
r_mas.offset <= r_mas.end).
Herein lies the bug - under very specific circumstances, this logic can
break and corrupt the maple tree.
Consider the following tree:
Height
  0                              Root Node
                                  /      \
                  pivot = 0xffff /        \ pivot = ULONG_MAX
                                /          \
  1                       A [-----]       ...
                             /   \
             pivot = 0x4fff /     \ pivot = 0xffff
                           /       \
  2 (LEAVES)          B [-----]  [-----] C
                                      ^--- Last pivot 0xffff.
Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
that all ranges expressed in maple tree code are inclusive):
1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
determines that this is a spanning store across nodes B and C. The mas
state is set such that the current node from which we traverse further
is node A.
2. In mas_wr_spanning_store() we try to find elements to the right of pivot
0xffff by searching for an index of 0x10000:
- mas_wr_walk_index() invokes mas_wr_walk_descend() and
mas_wr_node_walk() in turn.
- mas_wr_node_walk() loops over entries in node A until EITHER it
finds an entry whose pivot equals or exceeds 0x10000 OR it
reaches the final entry.
- Since no entry has a pivot equal to or exceeding 0x10000, pivot
0xffff is selected, leading to node C.
- mas_wr_walk_traverse() resets the mas state to traverse node C. We
loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
in turn once again.
- Again, we reach the last entry in node C, which has a pivot of
0xffff.
3. We then copy the elements to the left of 0x4000 in node B to the big
node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
too.
4. We determine whether we have any entries to copy from the right of the
end of the range by checking r_mas.offset <= r_mas.end. With r_mas set
up at the entry at pivot 0xffff, this check passes, and so we DUPLICATE
the entry at pivot 0xffff.
5. BUG! The maple tree is corrupted with a duplicate entry.
This requires a very specific set of circumstances - we must be spanning
the last element in a leaf node, which is the last element in the parent
node, in a spanning store across two leaf nodes with a range that ends at
that shared pivot.
A potential solution to this problem would simply be to reset the walk
each time we traverse r_mas, however given the rarity of this situation it
seems that would be rather inefficient.
Instead, this patch detects if the right hand node is populated, i.e. has
anything we need to copy.
We do so by only copying elements from the right of the entry being
inserted when the maximum value present exceeds the last, rather than
basing this on offset position.
The patch also updates some comments and eliminates the unused bool return
value in mas_wr_walk_index().
The work performed in commit f8d112a4e657 ("mm/mmap: avoid zeroing vma
tree in mmap_region()") seems to have made the probability of this event
much more likely, which is the point at which reports started to be
submitted concerning this bug.
The motivation for this change arose from Bert Karwatzki's report of
encountering mm instability after the release of kernel v6.12-rc1 which,
after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
options, was identified as maple tree corruption.
After Bert very generously provided his time and ability to reproduce this
event consistently, I was able to finally identify that the issue
discussed in this commit message was occurring for him.
Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.17283144…
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reported-by: Bert Karwatzki <spasswolf(a)web.de>
Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf(a)web.de>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7…
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com>
Reviewed-by: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index ce7c7a7a8258..3619301dda2e 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2196,6 +2196,8 @@ static inline void mas_node_or_none(struct ma_state *mas,
/*
* mas_wr_node_walk() - Find the correct offset for the index in the @mas.
+ * If @mas->index cannot be found within the containing
+ * node, we traverse to the last entry in the node.
* @wr_mas: The maple write state
*
* Uses mas_slot_locked() and does not need to worry about dead nodes.
@@ -3532,7 +3534,7 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas)
return true;
}
-static bool mas_wr_walk_index(struct ma_wr_state *wr_mas)
+static void mas_wr_walk_index(struct ma_wr_state *wr_mas)
{
struct ma_state *mas = wr_mas->mas;
@@ -3541,11 +3543,9 @@ static bool mas_wr_walk_index(struct ma_wr_state *wr_mas)
wr_mas->content = mas_slot_locked(mas, wr_mas->slots,
mas->offset);
if (ma_is_leaf(wr_mas->type))
- return true;
+ return;
mas_wr_walk_traverse(wr_mas);
-
}
- return true;
}
/*
* mas_extend_spanning_null() - Extend a store of a %NULL to include surrounding %NULLs.
@@ -3765,8 +3765,8 @@ static noinline void mas_wr_spanning_store(struct ma_wr_state *wr_mas)
memset(&b_node, 0, sizeof(struct maple_big_node));
/* Copy l_mas and store the value in b_node. */
mas_store_b_node(&l_wr_mas, &b_node, l_mas.end);
- /* Copy r_mas into b_node. */
- if (r_mas.offset <= r_mas.end)
+ /* Copy r_mas into b_node if there is anything to copy. */
+ if (r_mas.max > r_mas.last)
mas_mab_cp(&r_mas, r_mas.offset, r_mas.end,
&b_node, b_node.b_end + 1);
else
Svacer reports redundant comparison in cdns_xfer_msg(). The problem is
present in 5.10 stable release and can be fixed by the following
upstream patch that can be cleanly applied to 5.10 stable branch.
From: Johannes Berg <johannes.berg(a)intel.com>
[ Upstream commit 31db78a4923ef5e2008f2eed321811ca79e7f71b ]
When ieee80211_key_link() is called by ieee80211_gtk_rekey_add()
but returns 0 due to KRACK protection (identical key reinstall),
ieee80211_gtk_rekey_add() will still return a pointer into the
key, in a potential use-after-free. This normally doesn't happen
since it's only called by iwlwifi in case of WoWLAN rekey offload
which has its own KRACK protection, but it is still better to fix.
Do that by returning an error code and converting it to success at
the cfg80211 boundary only, leaving the error for bad callers of
ieee80211_gtk_rekey_add().
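The error-conversion pattern can be sketched in isolation as follows.
This is a simplified model, not the mac80211 code: the toy_* functions
and the boolean "identical" flag are hypothetical stand-ins for the real
key comparison.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the link step: an identical key is freed and reported. */
static int toy_key_link(bool identical_to_old)
{
	if (identical_to_old)
		return -EALREADY;	/* the new key no longer exists */
	return 0;
}

/* Configuration boundary: silently accept the reinstall as success. */
static int toy_add_key(bool identical_to_old)
{
	int err = toy_key_link(identical_to_old);

	if (err == -EALREADY)
		err = 0;
	return err;
}

/* Other callers see the error and must not touch the freed key. */
static int toy_gtk_rekey_add(bool identical_to_old)
{
	return toy_key_link(identical_to_old);
}
```

Only the configuration path hides -EALREADY; the rekey path propagates
it, so its caller can avoid handing out a pointer to a key that has
already been freed.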
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Fixes: fdf7cb4185b6 ("mac80211: accept key reinstall without changing anything")
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Sherry: bp to fix CVE-2023-52530, resolved minor conflicts in
net/mac80211/cfg.c because of context change due to missing commit
23a5f0af6ff4 ("wifi: mac80211: remove cipher scheme support")
ccdde7c74ffd ("wifi: mac80211: properly implement MLO key handling")]
Signed-off-by: Sherry Yang <sherry.yang(a)oracle.com>
---
net/mac80211/cfg.c | 3 +++
net/mac80211/key.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index f652982a106b..c54b3be62c0a 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -511,6 +511,9 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev,
sta->cipher_scheme = cs;
err = ieee80211_key_link(key, sdata, sta);
+ /* KRACK protection, shouldn't happen but just silently accept key */
+ if (err == -EALREADY)
+ err = 0;
out_unlock:
mutex_unlock(&local->sta_mtx);
diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index f695fc80088b..7b427e39831b 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -843,7 +843,7 @@ int ieee80211_key_link(struct ieee80211_key *key,
*/
if (ieee80211_key_identical(sdata, old_key, key)) {
ieee80211_key_free_unused(key);
- ret = 0;
+ ret = -EALREADY;
goto out;
}
--
2.46.0
[ Upstream commit 7c2fd76048e95dd267055b5f5e0a48e6e7c81fd9 ]
On an NVMe namespace that does not support metadata, it is possible to
send an IO command with metadata through io-passthru. This allows issues
like [1] to trigger in the completion code path.
nvme_map_user_request() doesn't check if the namespace supports metadata
before sending it forward. It also allows admin commands with metadata to
be processed as it ignores metadata when bdev == NULL and may report
success.
Reject an IO command with metadata when the NVMe namespace doesn't
support it and reject an admin command if it has metadata.
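The rejection rule reduces to two booleans, sketched here outside the
kernel. toy_check_metadata() and its has_bdev / ns_has_integrity
parameters are hypothetical simplifications of the block-device and
integrity-profile lookups.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Returns -EINVAL when metadata is supplied but cannot be honoured.
 * @has_bdev is false for admin commands, which have no namespace.
 */
static int toy_check_metadata(bool has_bdev, bool ns_has_integrity,
			      const void *meta_buffer, unsigned int meta_len)
{
	bool supports_metadata = has_bdev && ns_has_integrity;
	bool has_metadata = meta_buffer && meta_len;

	if (has_metadata && !supports_metadata)
		return -EINVAL;
	return 0;
}
```

An admin command carrying a metadata buffer, or an IO command with
metadata on a namespace without integrity support, is rejected up front
rather than having its metadata silently ignored.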
[1] https://lore.kernel.org/all/mb61pcylvnym8.fsf@amazon.com/
Suggested-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Anuj Gupta <anuj20.g(a)samsung.com>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
[ Minor changes to make it work on 6.1 ]
Signed-off-by: Puranjay Mohan <pjy(a)amazon.com>
---
drivers/nvme/host/ioctl.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index b3e322e4ade38..a02873792890e 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -3,6 +3,7 @@
* Copyright (c) 2011-2014, Intel Corporation.
* Copyright (c) 2017-2021 Christoph Hellwig.
*/
+#include <linux/blk-integrity.h>
#include <linux/ptrace.h> /* for force_successful_syscall_return */
#include <linux/nvme_ioctl.h>
#include <linux/io_uring.h>
@@ -95,10 +96,15 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
struct request_queue *q = req->q;
struct nvme_ns *ns = q->queuedata;
struct block_device *bdev = ns ? ns->disk->part0 : NULL;
+ bool supports_metadata = bdev && blk_get_integrity(bdev->bd_disk);
+ bool has_metadata = meta_buffer && meta_len;
struct bio *bio = NULL;
void *meta = NULL;
int ret;
+ if (has_metadata && !supports_metadata)
+ return -EINVAL;
+
if (ioucmd && (ioucmd->flags & IORING_URING_CMD_FIXED)) {
struct iov_iter iter;
@@ -122,7 +128,7 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
if (bdev)
bio_set_dev(bio, bdev);
- if (bdev && meta_buffer && meta_len) {
+ if (has_metadata) {
meta = nvme_add_user_metadata(req, meta_buffer, meta_len,
meta_seed);
if (IS_ERR(meta)) {
--
2.40.1
Upstream commit c2368b19807a ("net: devlink: introduce "unregistering"
mark and use it during devlinks iteration") in v6.0 introduced a race
when unregistering a devlink instance that can result in RCU stalls and
in the system completely locking up. Exact details and reproducer can be
found here [1]. The bug was inadvertently fixed in v6.3 by upstream
commit d77278196441 ("devlink: bump the instance index directly when
iterating").
This patchset fixes the bug by backporting the second commit and a
related dependency from v6.3 to v6.1.y while adjusting them to the
devlink file structure in v6.1.y (net/devlink/{core.c,devl_internal.h}
-> net/devlink/leftover.c).
Tested by running the devlink tests under
tools/testing/selftests/drivers/net/netdevsim/ and the reproducer
mentioned in [1].
[1] https://lore.kernel.org/stable/20241001112035.973187-1-idosch@nvidia.com/
Jakub Kicinski (2):
devlink: drop the filter argument from devlinks_xa_find_get
devlink: bump the instance index directly when iterating
net/devlink/leftover.c | 40 ++++++++++------------------------------
1 file changed, 10 insertions(+), 30 deletions(-)
--
2.47.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 532b53cebe58f34ce1c0f34d866f5c0e335c53c6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101412-prowling-snowflake-9fe0@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
532b53cebe58 ("secretmem: disable memfd_secret() if arch cannot set direct map")
f7c5b1aab5ef ("mm/secretmem: remove reduntant return value")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 532b53cebe58f34ce1c0f34d866f5c0e335c53c6 Mon Sep 17 00:00:00 2001
From: Patrick Roy <roypat(a)amazon.co.uk>
Date: Tue, 1 Oct 2024 09:00:41 +0100
Subject: [PATCH] secretmem: disable memfd_secret() if arch cannot set direct
map
Return -ENOSYS from memfd_secret() syscall if !can_set_direct_map(). This
is the case for example on some arm64 configurations, where marking 4k
PTEs in the direct map not present can only be done if the direct map is
set up at 4k granularity in the first place (as ARM's break-before-make
semantics do not easily allow breaking apart large/gigantic pages).
More precisely, on arm64 systems with !can_set_direct_map(),
set_direct_map_invalid_noflush() is a no-op, however it returns success
(0) instead of an error. This means that memfd_secret will seemingly
"work" (e.g. syscall succeeds, you can mmap the fd and fault in pages),
but it does not actually achieve its goal of removing its memory from the
direct map.
Note that with this patch, memfd_secret() will start erroring on systems
where can_set_direct_map() returns false (arm64 with
CONFIG_RODATA_FULL_DEFAULT_ENABLED=n, CONFIG_DEBUG_PAGEALLOC=n and
CONFIG_KFENCE=n), but that still seems better than the current silent
failure. Since CONFIG_RODATA_FULL_DEFAULT_ENABLED defaults to 'y', most
arm64 systems actually have a working memfd_secret() and aren't
affected.
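The behaviour change described above can be sketched as a small userspace
model in C. This is only an illustration under stated assumptions: the
function name toy_memfd_secret_gate() is hypothetical, and its two boolean
parameters stand in for secretmem_enable and the result of
can_set_direct_map().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the entry check of memfd_secret() after the
 * patch. The two booleans model secretmem_enable and the result of
 * can_set_direct_map(); 0 means "continue with the syscall".
 */
static int toy_memfd_secret_gate(bool secretmem_enable, bool can_set_direct_map)
{
    if (!secretmem_enable || !can_set_direct_map)
        return -ENOSYS;
    return 0;
}
```

A userspace caller would presumably see the call fail with errno set to
ENOSYS and could treat that as "feature unavailable" instead of relying on
protection that is silently absent.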
From going through the iterations of the original memfd_secret patch
series, it seems that disabling the syscall in these scenarios was the
intended behavior [1] (preferred over having
set_direct_map_invalid_noflush return an error as that would result in
SIGBUSes at page-fault time), however the check for it got dropped between
v16 [2] and v17 [3], when secretmem moved away from CMA allocations.
[1]: https://lore.kernel.org/lkml/20201124164930.GK8537@kernel.org/
[2]: https://lore.kernel.org/lkml/20210121122723.3446-11-rppt@kernel.org/#t
[3]: https://lore.kernel.org/lkml/20201125092208.12544-10-rppt@kernel.org/
Link: https://lkml.kernel.org/r/20241001080056.784735-1-roypat@amazon.co.uk
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Patrick Roy <roypat(a)amazon.co.uk>
Reviewed-by: Mike Rapoport (Microsoft) <rppt(a)kernel.org>
Cc: Alexander Graf <graf(a)amazon.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: James Gowans <jgowans(a)amazon.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 3afb5ad701e1..399552814fd0 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -238,7 +238,7 @@ SYSCALL_DEFINE1(memfd_secret, unsigned int, flags)
/* make sure local flags do not confict with global fcntl.h */
BUILD_BUG_ON(SECRETMEM_FLAGS_MASK & O_CLOEXEC);
- if (!secretmem_enable)
+ if (!secretmem_enable || !can_set_direct_map())
return -ENOSYS;
if (flags & ~(SECRETMEM_FLAGS_MASK | O_CLOEXEC))
@@ -280,7 +280,7 @@ static struct file_system_type secretmem_fs = {
static int __init secretmem_init(void)
{
- if (!secretmem_enable)
+ if (!secretmem_enable || !can_set_direct_map())
return 0;
secretmem_mnt = kern_mount(&secretmem_fs);
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x 2b0f922323ccfa76219bcaacd35cd50aeaa13592
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101841-keep-coma-4963@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2b0f922323ccfa76219bcaacd35cd50aeaa13592 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Fri, 11 Oct 2024 12:24:45 +0200
Subject: [PATCH] mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma
We (or rather, readahead logic :) ) might be allocating a THP in the
pagecache and then try mapping it into a process that explicitly disabled
THP: we might end up installing PMD mappings.
This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
starting the VM.
For example, starting a VM backed on a file system with large folios
supported makes the VM crash when the VM tries accessing such a mapping
using KVM.
Is it also a problem when the HW disabled THP using
TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case
without X86_FEATURE_PSE.
In the future, we might be able to do better on s390x and only disallow
PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
really wants. For now, fix it by essentially performing the same check as
would be done in __thp_vma_allowable_orders() or in shmem code, where this
works as expected, and disallow PMD mappings, making us fall back to PTE
mappings.
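The decision described above can be modelled in a few lines of userspace C.
This is a sketch, not the kernel implementation: the toy_ names, the bit
values and the toy_vma structure are hypothetical, and the global THP flags
are passed in as a parameter to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the kernel's real definitions live elsewhere. */
#define TOY_VM_NOHUGEPAGE   (1UL << 0)
#define TOY_HW_UNSUPPORTED  (1UL << 1)

enum toy_mapping { TOY_MAP_PTE, TOY_MAP_PMD };

struct toy_vma {
    unsigned long vm_flags;
    bool mm_disable_thp;   /* models the MMF_DISABLE_THP bit of the mm */
};

static bool toy_thp_disabled_by_hw(unsigned long thp_flags)
{
    return thp_flags & TOY_HW_UNSUPPORTED;
}

static bool toy_vma_thp_disabled(const struct toy_vma *vma)
{
    return (vma->vm_flags & TOY_VM_NOHUGEPAGE) || vma->mm_disable_thp;
}

/* Decide how a large pagecache folio gets mapped: refuse PMD if disabled. */
static enum toy_mapping toy_choose_mapping(const struct toy_vma *vma,
                                           unsigned long thp_flags)
{
    if (toy_thp_disabled_by_hw(thp_flags) || toy_vma_thp_disabled(vma))
        return TOY_MAP_PTE;    /* fall back to PTE mappings */
    return TOY_MAP_PMD;
}
```

The large folio stays in the pagecache in every case; only the way it is
mapped into the process changes.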
Link: https://lkml.kernel.org/r/20241011102445.934409-3-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Cc: Thomas Huth <thuth(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index c0869a962ddd..30feedabc932 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4920,6 +4920,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
pmd_t entry;
vm_fault_t ret = VM_FAULT_FALLBACK;
+ /*
+ * It is too late to allocate a small folio, we already have a large
+ * folio in the pagecache: especially s390 KVM cannot tolerate any
+ * PMD mappings, but PTE-mapped THP are fine. So let's simply refuse any
+ * PMD mappings if THPs are disabled.
+ */
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags))
+ return ret;
+
if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
return ret;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 963756aac1f011d904ddd9548ae82286d3a91f96
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101848-lucid-mountain-2cdf@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 963756aac1f011d904ddd9548ae82286d3a91f96 Mon Sep 17 00:00:00 2001
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Date: Fri, 11 Oct 2024 12:24:44 +0200
Subject: [PATCH] mm: huge_memory: add vma_thp_disabled() and
thp_disabled_by_hw()
Patch series "mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma".
During testing, it was found that we can get PMD mappings in processes
where THP (and more precisely, PMD mappings) are supposed to be disabled.
While it works as expected for anon+shmem, the pagecache is the
problematic bit.
For s390 KVM this currently means that a VM backed by a file located on a
filesystem with large folio support can crash when KVM tries accessing the
problematic page, because the readahead logic might decide to use a
PMD-sized THP and faulting it into the page tables will install a PMD
mapping, something that s390 KVM cannot tolerate.
This might also be a problem with HW that does not support PMD mappings,
but I did not try reproducing it.
Fix it by respecting the ways to disable THPs when deciding whether we can
install a PMD mapping. khugepaged should already be taking care of not
collapsing if THPs are effectively disabled for the hw/process/vma.
This patch (of 2):
Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
shmem_allowable_huge_orders() and __thp_vma_allowable_orders().
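One way to convince oneself that moving the checks into shared helpers does
not change behaviour is to compare the old and new shape of the shmem check
over every input combination. The userspace C sketch below does that with
hypothetical toy_ names and booleans in place of the real flag words; it is
a model of the condition only, not of the surrounding kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vma {
    bool nohugepage;      /* models VM_NOHUGEPAGE in vm_flags */
    bool mm_disable_thp;  /* models MMF_DISABLE_THP on the mm */
};

/* Shape of the shmem check before the patch: two separate early returns. */
static bool toy_old_disallowed(const struct toy_vma *vma, bool hw_unsupported)
{
    if (vma && (vma->nohugepage || vma->mm_disable_thp))
        return true;
    if (hw_unsupported)
        return true;
    return false;
}

static bool toy_vma_thp_disabled(const struct toy_vma *vma)
{
    return vma->nohugepage || vma->mm_disable_thp;
}

/* Shape of the check after the patch: one condition built from helpers. */
static bool toy_new_disallowed(const struct toy_vma *vma, bool hw_unsupported)
{
    return hw_unsupported || (vma && toy_vma_thp_disabled(vma));
}

/* Count input combinations on which the two forms disagree. */
static int toy_count_mismatches(void)
{
    int mismatches = 0;

    for (int bits = 0; bits < 16; bits++) {
        struct toy_vma v = { bits & 1, (bits >> 1) & 1 };
        const struct toy_vma *vma = (bits & 4) ? &v : NULL;
        bool hw = bits & 8;

        if (toy_old_disallowed(vma, hw) != toy_new_disallowed(vma, hw))
            mismatches++;
    }
    return mismatches;
}
```

Note that the shmem caller may pass a NULL vma, which is why the helper call
stays behind a "vma &&" guard there.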
[david(a)redhat.com: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
Link: https://lkml.kernel.org/r/20241011102445.934409-2-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Boqiao Fu <bfu(a)redhat.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 67d0ab3c3bba..ef5b80e48599 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -322,6 +322,24 @@ struct thpsize {
(transparent_hugepage_flags & \
(1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
+static inline bool vma_thp_disabled(struct vm_area_struct *vma,
+ unsigned long vm_flags)
+{
+ /*
+ * Explicitly disabled through madvise or prctl, or some
+ * architectures may disable THP for some mappings, for
+ * example, s390 kvm.
+ */
+ return (vm_flags & VM_NOHUGEPAGE) ||
+ test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags);
+}
+
+static inline bool thp_disabled_by_hw(void)
+{
+ /* If the hardware/firmware marked hugepage support disabled. */
+ return transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED);
+}
+
unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 87b49ecc7b1e..2fb328880b50 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -109,18 +109,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
if (!vma->vm_mm) /* vdso */
return 0;
- /*
- * Explicitly disabled through madvise or prctl, or some
- * architectures may disable THP for some mappings, for
- * example, s390 kvm.
- * */
- if ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
- return 0;
- /*
- * If the hardware/firmware marked hugepage support disabled.
- */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags))
return 0;
/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
diff --git a/mm/shmem.c b/mm/shmem.c
index 4f11b5506363..c5adb987b23c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1664,12 +1664,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
loff_t i_size;
int order;
- if (vma && ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
- return 0;
-
- /* If the hardware/firmware marked hugepage support disabled. */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags)))
return 0;
global_huge = shmem_huge_global_enabled(inode, index, write_end,
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 963756aac1f011d904ddd9548ae82286d3a91f96
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101842-flatness-osmosis-b08e@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 963756aac1f011d904ddd9548ae82286d3a91f96 Mon Sep 17 00:00:00 2001
From: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Date: Fri, 11 Oct 2024 12:24:44 +0200
Subject: [PATCH] mm: huge_memory: add vma_thp_disabled() and
thp_disabled_by_hw()
Patch series "mm: don't install PMD mappings when THPs are disabled by the
hw/process/vma".
During testing, it was found that we can get PMD mappings in processes
where THP (and more precisely, PMD mappings) are supposed to be disabled.
While it works as expected for anon+shmem, the pagecache is the
problematic bit.
For s390 KVM this currently means that a VM backed by a file located on a
filesystem with large folio support can crash when KVM tries accessing the
problematic page, because the readahead logic might decide to use a
PMD-sized THP and faulting it into the page tables will install a PMD
mapping, something that s390 KVM cannot tolerate.
This might also be a problem with HW that does not support PMD mappings,
but I did not try reproducing it.
Fix it by respecting the ways to disable THPs when deciding whether we can
install a PMD mapping. khugepaged should already be taking care of not
collapsing if THPs are effectively disabled for the hw/process/vma.
This patch (of 2):
Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
shmem_allowable_huge_orders() and __thp_vma_allowable_orders().
[david(a)redhat.com: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
Link: https://lkml.kernel.org/r/20241011102445.934409-2-david@redhat.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Leo Fu <bfu(a)redhat.com>
Tested-by: Thomas Huth <thuth(a)redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Boqiao Fu <bfu(a)redhat.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Janosch Frank <frankja(a)linux.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 67d0ab3c3bba..ef5b80e48599 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -322,6 +322,24 @@ struct thpsize {
(transparent_hugepage_flags & \
(1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
+static inline bool vma_thp_disabled(struct vm_area_struct *vma,
+ unsigned long vm_flags)
+{
+ /*
+ * Explicitly disabled through madvise or prctl, or some
+ * architectures may disable THP for some mappings, for
+ * example, s390 kvm.
+ */
+ return (vm_flags & VM_NOHUGEPAGE) ||
+ test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags);
+}
+
+static inline bool thp_disabled_by_hw(void)
+{
+ /* If the hardware/firmware marked hugepage support disabled. */
+ return transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED);
+}
+
unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 87b49ecc7b1e..2fb328880b50 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -109,18 +109,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
if (!vma->vm_mm) /* vdso */
return 0;
- /*
- * Explicitly disabled through madvise or prctl, or some
- * architectures may disable THP for some mappings, for
- * example, s390 kvm.
- * */
- if ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
- return 0;
- /*
- * If the hardware/firmware marked hugepage support disabled.
- */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags))
return 0;
/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
diff --git a/mm/shmem.c b/mm/shmem.c
index 4f11b5506363..c5adb987b23c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1664,12 +1664,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
loff_t i_size;
int order;
- if (vma && ((vm_flags & VM_NOHUGEPAGE) ||
- test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
- return 0;
-
- /* If the hardware/firmware marked hugepage support disabled. */
- if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+ if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags)))
return 0;
global_huge = shmem_huge_global_enabled(inode, index, write_end,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 7528c4fb1237512ee18049f852f014eba80bbe8d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101858-rewire-vocation-c981@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7528c4fb1237512ee18049f852f014eba80bbe8d Mon Sep 17 00:00:00 2001
From: Liu Shixin <liushixin2(a)huawei.com>
Date: Tue, 15 Oct 2024 09:45:21 +0800
Subject: [PATCH] mm/swapfile: skip HugeTLB pages for unuse_vma
I got a bad pud error and lost a 1GB HugeTLB when calling swapoff. The
problem can be reproduced by the following steps:
1. Allocate an anonymous 1GB HugeTLB and some other anonymous memory.
2. Swap out the above anonymous memory.
3. Run swapoff and we will get a bad pud error in the kernel log:
mm/pgtable-generic.c:42: bad pud 00000000743d215d(84000001400000e7)
Using ftrace, we can tell that pud_clear_bad is called by
pud_none_or_clear_bad in unuse_pud_range(). As a result, the HugeTLB page
will never be freed because it has been lost from the page table. We can
skip HugeTLB pages for unuse_vma to fix it.
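The effect of the fix on the VMA walk can be sketched with a toy userspace
model in C. The structure and the function toy_unuse_mm() are hypothetical
and only mirror the condition in the loop: a VMA is processed when it has an
anon_vma and is not a HugeTLB mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vma {
    bool has_anon_vma;  /* models vma->anon_vma != NULL */
    bool is_hugetlb;    /* models is_vm_hugetlb_page(vma) */
    int visited;        /* set when the swapoff walk touches this VMA */
};

/* Walk all VMAs as the patched unuse_mm() loop would, skipping HugeTLB. */
static size_t toy_unuse_mm(struct toy_vma *vmas, size_t n)
{
    size_t walked = 0;

    for (size_t i = 0; i < n; i++) {
        if (vmas[i].has_anon_vma && !vmas[i].is_hugetlb) {
            vmas[i].visited = 1;   /* stand-in for unuse_vma() */
            walked++;
        }
    }
    return walked;
}
```

In this model the anonymous HugeTLB mapping is never handed to the page
table walk, so its entries are left untouched.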
Link: https://lkml.kernel.org/r/20241015014521.570237-1-liushixin2@huawei.com
Fixes: 0fe6e20b9c4c ("hugetlb, rmap: add reverse mapping for hugepage")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Acked-by: Muchun Song <muchun.song(a)linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/swapfile.c b/mm/swapfile.c
index eb782fcd5627..b0915f3fab31 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2313,7 +2313,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
mmap_read_lock(mm);
for_each_vma(vmi, vma) {
- if (vma->anon_vma) {
+ if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
ret = unuse_vma(vma, type);
if (ret)
break;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 37f0b47c5143c2957909ced44fc09ffb118c99f7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101803-cage-smokiness-cb8b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 37f0b47c5143c2957909ced44fc09ffb118c99f7 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang(a)os.amperecomputing.com>
Date: Fri, 11 Oct 2024 18:17:02 -0700
Subject: [PATCH] mm: khugepaged: fix the arguments order in
khugepaged_collapse_file trace point
The "addr" and "is_shmem" arguments are in a different order in TP_PROTO
and TP_ARGS. This resulted in an incorrect trace result:
text-hugepage-644429 [276] 392092.878683: mm_khugepaged_collapse_file:
mm=0xffff20025d52c440, hpage_pfn=0x200678c00, index=512, addr=1, is_shmem=0,
filename=text-hugepage, nr=512, result=failed
The value of "addr" is wrong because it was treated as a bool value, the
type of is_shmem.
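Why an address ends up as 1 can be shown with plain C conversion rules. The
sketch below is a userspace illustration with hypothetical function names;
it does not model the tracepoint machinery itself, only what happens to a
value that is forwarded through a parameter declared as bool.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical forwarding step: the value travels through a parameter
 * declared as bool before being stored as an unsigned long again.
 */
static unsigned long toy_through_bool_slot(bool slot)
{
    return slot;   /* any nonzero argument has already collapsed to 1 */
}

/* The same value forwarded through a correctly typed parameter. */
static unsigned long toy_through_ulong_slot(unsigned long slot)
{
    return slot;
}
```

Passing 0x400000 through the bool slot yields 1, which matches the "addr=1"
seen in the broken trace output, while the correctly typed slot preserves
the value.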
Fix the order in TP_PROTO so that "addr" comes before "is_shmem", since the
original patch review suggested this order to achieve the best packing.
Also use "lx" for "addr" instead of "ld" in TP_printk because an address is
typically shown in hex.
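The difference between the two conversions is easy to see with the standard
C snprintf(), used here as a userspace stand-in for the trace format string.
The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the address the old way: signed decimal. */
static void toy_format_dec(char *buf, size_t len, unsigned long addr)
{
    snprintf(buf, len, "addr=%ld", (long)addr);
}

/* Format the address the new way: hexadecimal, without a 0x prefix. */
static void toy_format_hex(char *buf, size_t len, unsigned long addr)
{
    snprintf(buf, len, "addr=%lx", addr);
}
```

For the address 0x400000 the decimal form prints 4194304, which is much
harder to relate to a mapping than the hexadecimal form.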
After the fix, the trace result looks correct:
text-hugepage-7291 [004] 128.627251: mm_khugepaged_collapse_file:
mm=0xffff0001328f9500, hpage_pfn=0x20016ea00, index=512, addr=0x400000,
is_shmem=0, filename=text-hugepage, nr=512, result=failed
Link: https://lkml.kernel.org/r/20241012011702.1084846-1-yang@os.amperecomputing.…
Fixes: 4c9473e87e75 ("mm/khugepaged: add tracepoint to collapse_file()")
Signed-off-by: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Gautam Menghani <gautammenghani201(a)gmail.com>
Cc: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Cc: <stable(a)vger.kernel.org> [6.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index b5f5369b6300..9d5c00b0285c 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -208,7 +208,7 @@ TRACE_EVENT(mm_khugepaged_scan_file,
TRACE_EVENT(mm_khugepaged_collapse_file,
TP_PROTO(struct mm_struct *mm, struct folio *new_folio, pgoff_t index,
- bool is_shmem, unsigned long addr, struct file *file,
+ unsigned long addr, bool is_shmem, struct file *file,
int nr, int result),
TP_ARGS(mm, new_folio, index, addr, is_shmem, file, nr, result),
TP_STRUCT__entry(
@@ -233,7 +233,7 @@ TRACE_EVENT(mm_khugepaged_collapse_file,
__entry->result = result;
),
- TP_printk("mm=%p, hpage_pfn=0x%lx, index=%ld, addr=%ld, is_shmem=%d, filename=%s, nr=%d, result=%s",
+ TP_printk("mm=%p, hpage_pfn=0x%lx, index=%ld, addr=%lx, is_shmem=%d, filename=%s, nr=%d, result=%s",
__entry->mm,
__entry->hpfn,
__entry->index,
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f9c39898eaff..a420eff92011 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2227,7 +2227,7 @@ rollback:
folio_put(new_folio);
out:
VM_BUG_ON(!list_empty(&pagelist));
- trace_mm_khugepaged_collapse_file(mm, new_folio, index, is_shmem, addr, file, HPAGE_PMD_NR, result);
+ trace_mm_khugepaged_collapse_file(mm, new_folio, index, addr, is_shmem, file, HPAGE_PMD_NR, result);
return result;
}
Hi Greg, Sasha,
Could you please help to backport the upstream commit
80e9963fb3b5509dfcabe9652d56bf4b35542055 ("irqchip/gic-v3-its: Fix VSYNC
referencing an unmapped VPE on GIC v4.1") to
* 5.10
* 5.15
* 6.1
* 6.6
trees? It can be applied and built (with arm64's defconfig) cleanly on
top of the mentioned stable branches.
Thanks,
Zenghui
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5afca7e996c42aed1b4a42d4712817601ba42aff
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101827-regulate-lining-6c3e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5afca7e996c42aed1b4a42d4712817601ba42aff Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Mon, 14 Oct 2024 16:06:01 +0200
Subject: [PATCH] selftests: mptcp: join: test for prohibited MPC to port-based
endp
Explicitly verify that MPC connection attempts towards a port-based
signal endpoint fail with a reset.
Note that this new test is a bit different from the other ones, as it does
not use 'run_tests'. It therefore needs the capture capability and the
logic picking the right port, which have been extracted into three new
helpers. The info about the capture can also be printed from a single
point, which simplifies the exit paths in do_transfer().
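The two pieces of logic that the new test relies on, the port selection and
the classification of the connect attempt's exit status, can be modelled in
C as below. This is only a sketch of the shell logic in another language:
the toy_ function names are hypothetical, and the counter is passed in as a
parameter instead of being read from MPTCP_LIB_TEST_COUNTER.

```c
#include <assert.h>
#include <string.h>

/* Port used by test number `counter`, mirroring get_port() in the script. */
static int toy_get_port(int counter)
{
    return 10000 + counter - 1;
}

/*
 * Classify the exit status of the connect attempt the way the new check
 * does: 124 is what timeout(1) reports when the time limit is hit, 0
 * means the connection unexpectedly succeeded, anything else is the
 * expected failure.
 */
static const char *toy_classify_connect(int status)
{
    if (status == 124)
        return "timeout on connect";
    if (status == 0)
        return "unexpected successful connect";
    return "ok";
}
```

Only a failing, non-timeout status counts as a pass, since the test expects
the connection attempt to be rejected.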
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
Cc: stable(a)vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-2-7faea8e6b6ae…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index e8d0a01b4144..c07e2bd3a315 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -23,6 +23,7 @@ tmpfile=""
cout=""
err=""
capout=""
+cappid=""
ns1=""
ns2=""
iptables="iptables"
@@ -887,6 +888,44 @@ check_cestab()
fi
}
+cond_start_capture()
+{
+ local ns="$1"
+
+ :> "$capout"
+
+ if $capture; then
+ local capuser capfile
+ if [ -z $SUDO_USER ]; then
+ capuser=""
+ else
+ capuser="-Z $SUDO_USER"
+ fi
+
+ capfile=$(printf "mp_join-%02u-%s.pcap" "$MPTCP_LIB_TEST_COUNTER" "$ns")
+
+ echo "Capturing traffic for test $MPTCP_LIB_TEST_COUNTER into $capfile"
+ ip netns exec "$ns" tcpdump -i any -s 65535 -B 32768 $capuser -w "$capfile" > "$capout" 2>&1 &
+ cappid=$!
+
+ sleep 1
+ fi
+}
+
+cond_stop_capture()
+{
+ if $capture; then
+ sleep 1
+ kill $cappid
+ cat "$capout"
+ fi
+}
+
+get_port()
+{
+ echo "$((10000 + MPTCP_LIB_TEST_COUNTER - 1))"
+}
+
do_transfer()
{
local listener_ns="$1"
@@ -894,33 +933,17 @@ do_transfer()
local cl_proto="$3"
local srv_proto="$4"
local connect_addr="$5"
+ local port
- local port=$((10000 + MPTCP_LIB_TEST_COUNTER - 1))
- local cappid
local FAILING_LINKS=${FAILING_LINKS:-""}
local fastclose=${fastclose:-""}
local speed=${speed:-"fast"}
+ port=$(get_port)
:> "$cout"
:> "$sout"
- :> "$capout"
- if $capture; then
- local capuser
- if [ -z $SUDO_USER ] ; then
- capuser=""
- else
- capuser="-Z $SUDO_USER"
- fi
-
- capfile=$(printf "mp_join-%02u-%s.pcap" "$MPTCP_LIB_TEST_COUNTER" "${listener_ns}")
-
- echo "Capturing traffic for test $MPTCP_LIB_TEST_COUNTER into $capfile"
- ip netns exec ${listener_ns} tcpdump -i any -s 65535 -B 32768 $capuser -w $capfile > "$capout" 2>&1 &
- cappid=$!
-
- sleep 1
- fi
+ cond_start_capture ${listener_ns}
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat -n
@@ -1007,10 +1030,7 @@ do_transfer()
wait $spid
local rets=$?
- if $capture; then
- sleep 1
- kill $cappid
- fi
+ cond_stop_capture
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat | grep Tcp > /tmp/${listener_ns}.out
@@ -1026,7 +1046,6 @@ do_transfer()
ip netns exec ${connector_ns} ss -Menita 1>&2 -o "dport = :$port"
cat /tmp/${connector_ns}.out
- cat "$capout"
return 1
fi
@@ -1043,13 +1062,7 @@ do_transfer()
fi
rets=$?
- if [ $retc -eq 0 ] && [ $rets -eq 0 ];then
- cat "$capout"
- return 0
- fi
-
- cat "$capout"
- return 1
+ [ $retc -eq 0 ] && [ $rets -eq 0 ]
}
make_file()
@@ -2873,6 +2886,32 @@ verify_listener_events()
fail_test
}
+chk_mpc_endp_attempt()
+{
+ local retl=$1
+ local attempts=$2
+
+ print_check "Connect"
+
+ if [ ${retl} = 124 ]; then
+ fail_test "timeout on connect"
+ elif [ ${retl} = 0 ]; then
+ fail_test "unexpected successful connect"
+ else
+ print_ok
+
+ print_check "Attempts"
+ count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMPCapableEndpAttempt")
+ if [ -z "$count" ]; then
+ print_skip
+ elif [ "$count" != "$attempts" ]; then
+ fail_test "got ${count} MPC attempt[s] on port-based endpoint, expected ${attempts}"
+ else
+ print_ok
+ fi
+ fi
+}
+
add_addr_ports_tests()
{
# signal address with port
@@ -2963,6 +3002,22 @@ add_addr_ports_tests()
chk_join_nr 2 2 2
chk_add_nr 2 2 2
fi
+
+ if reset "port-based signal endpoint must not accept mpc"; then
+ local port retl count
+ port=$(get_port)
+
+ cond_start_capture ${ns1}
+ pm_nl_add_endpoint ${ns1} 10.0.2.1 flags signal port ${port}
+ mptcp_lib_wait_local_port_listen ${ns1} ${port}
+
+ timeout 1 ip netns exec ${ns2} \
+ ./mptcp_connect -t ${timeout_poll} -p $port -s MPTCP 10.0.2.1 >/dev/null 2>&1
+ retl=$?
+ cond_stop_capture
+
+ chk_mpc_endp_attempt ${retl} 1
+ fi
}
syncookies_tests()
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5afca7e996c42aed1b4a42d4712817601ba42aff
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101826-outshine-powdered-a548@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5afca7e996c42aed1b4a42d4712817601ba42aff Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Mon, 14 Oct 2024 16:06:01 +0200
Subject: [PATCH] selftests: mptcp: join: test for prohibited MPC to port-based
endp
Explicitly verify that MPC connection attempts towards a port-based
signal endpoint fail with a reset.
Note that this new test is a bit different from the other ones, as it does
not use 'run_tests'. It therefore needs the capture capability and the
logic picking the right port, which have been extracted into three new
helpers. The info about the capture can also be printed from a single
point, which simplifies the exit paths in do_transfer().
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
Cc: stable(a)vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-2-7faea8e6b6ae…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index e8d0a01b4144..c07e2bd3a315 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -23,6 +23,7 @@ tmpfile=""
cout=""
err=""
capout=""
+cappid=""
ns1=""
ns2=""
iptables="iptables"
@@ -887,6 +888,44 @@ check_cestab()
fi
}
+cond_start_capture()
+{
+ local ns="$1"
+
+ :> "$capout"
+
+ if $capture; then
+ local capuser capfile
+ if [ -z $SUDO_USER ]; then
+ capuser=""
+ else
+ capuser="-Z $SUDO_USER"
+ fi
+
+ capfile=$(printf "mp_join-%02u-%s.pcap" "$MPTCP_LIB_TEST_COUNTER" "$ns")
+
+ echo "Capturing traffic for test $MPTCP_LIB_TEST_COUNTER into $capfile"
+ ip netns exec "$ns" tcpdump -i any -s 65535 -B 32768 $capuser -w "$capfile" > "$capout" 2>&1 &
+ cappid=$!
+
+ sleep 1
+ fi
+}
+
+cond_stop_capture()
+{
+ if $capture; then
+ sleep 1
+ kill $cappid
+ cat "$capout"
+ fi
+}
+
+get_port()
+{
+ echo "$((10000 + MPTCP_LIB_TEST_COUNTER - 1))"
+}
+
do_transfer()
{
local listener_ns="$1"
@@ -894,33 +933,17 @@ do_transfer()
local cl_proto="$3"
local srv_proto="$4"
local connect_addr="$5"
+ local port
- local port=$((10000 + MPTCP_LIB_TEST_COUNTER - 1))
- local cappid
local FAILING_LINKS=${FAILING_LINKS:-""}
local fastclose=${fastclose:-""}
local speed=${speed:-"fast"}
+ port=$(get_port)
:> "$cout"
:> "$sout"
- :> "$capout"
- if $capture; then
- local capuser
- if [ -z $SUDO_USER ] ; then
- capuser=""
- else
- capuser="-Z $SUDO_USER"
- fi
-
- capfile=$(printf "mp_join-%02u-%s.pcap" "$MPTCP_LIB_TEST_COUNTER" "${listener_ns}")
-
- echo "Capturing traffic for test $MPTCP_LIB_TEST_COUNTER into $capfile"
- ip netns exec ${listener_ns} tcpdump -i any -s 65535 -B 32768 $capuser -w $capfile > "$capout" 2>&1 &
- cappid=$!
-
- sleep 1
- fi
+ cond_start_capture ${listener_ns}
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat -n
@@ -1007,10 +1030,7 @@ do_transfer()
wait $spid
local rets=$?
- if $capture; then
- sleep 1
- kill $cappid
- fi
+ cond_stop_capture
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat | grep Tcp > /tmp/${listener_ns}.out
@@ -1026,7 +1046,6 @@ do_transfer()
ip netns exec ${connector_ns} ss -Menita 1>&2 -o "dport = :$port"
cat /tmp/${connector_ns}.out
- cat "$capout"
return 1
fi
@@ -1043,13 +1062,7 @@ do_transfer()
fi
rets=$?
- if [ $retc -eq 0 ] && [ $rets -eq 0 ];then
- cat "$capout"
- return 0
- fi
-
- cat "$capout"
- return 1
+ [ $retc -eq 0 ] && [ $rets -eq 0 ]
}
make_file()
@@ -2873,6 +2886,32 @@ verify_listener_events()
fail_test
}
+chk_mpc_endp_attempt()
+{
+ local retl=$1
+ local attempts=$2
+
+ print_check "Connect"
+
+ if [ ${retl} = 124 ]; then
+ fail_test "timeout on connect"
+ elif [ ${retl} = 0 ]; then
+ fail_test "unexpected successful connect"
+ else
+ print_ok
+
+ print_check "Attempts"
+ count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMPCapableEndpAttempt")
+ if [ -z "$count" ]; then
+ print_skip
+ elif [ "$count" != "$attempts" ]; then
+ fail_test "got ${count} MPC attempt[s] on port-based endpoint, expected ${attempts}"
+ else
+ print_ok
+ fi
+ fi
+}
+
add_addr_ports_tests()
{
# signal address with port
@@ -2963,6 +3002,22 @@ add_addr_ports_tests()
chk_join_nr 2 2 2
chk_add_nr 2 2 2
fi
+
+ if reset "port-based signal endpoint must not accept mpc"; then
+ local port retl count
+ port=$(get_port)
+
+ cond_start_capture ${ns1}
+ pm_nl_add_endpoint ${ns1} 10.0.2.1 flags signal port ${port}
+ mptcp_lib_wait_local_port_listen ${ns1} ${port}
+
+ timeout 1 ip netns exec ${ns2} \
+ ./mptcp_connect -t ${timeout_poll} -p $port -s MPTCP 10.0.2.1 >/dev/null 2>&1
+ retl=$?
+ cond_stop_capture
+
+ chk_mpc_endp_attempt ${retl} 1
+ fi
}
syncookies_tests()
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5afca7e996c42aed1b4a42d4712817601ba42aff
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101826-algorithm-figure-3cf3@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5afca7e996c42aed1b4a42d4712817601ba42aff Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Mon, 14 Oct 2024 16:06:01 +0200
Subject: [PATCH] selftests: mptcp: join: test for prohibited MPC to port-based
endp
Explicitly verify that MPC connection attempts towards a port-based
signal endpoint fail with a reset.
Note that this new test is a bit different from the other ones, as it
does not use 'run_tests'. It therefore needs the capture capability and
the port selection, which have been extracted into three new helpers.
The info about the capture can also be printed from a single
point, which simplifies the exit paths in do_transfer().
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
Cc: stable(a)vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-2-7faea8e6b6ae…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index e8d0a01b4144..c07e2bd3a315 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -23,6 +23,7 @@ tmpfile=""
cout=""
err=""
capout=""
+cappid=""
ns1=""
ns2=""
iptables="iptables"
@@ -887,6 +888,44 @@ check_cestab()
fi
}
+cond_start_capture()
+{
+ local ns="$1"
+
+ :> "$capout"
+
+ if $capture; then
+ local capuser capfile
+ if [ -z $SUDO_USER ]; then
+ capuser=""
+ else
+ capuser="-Z $SUDO_USER"
+ fi
+
+ capfile=$(printf "mp_join-%02u-%s.pcap" "$MPTCP_LIB_TEST_COUNTER" "$ns")
+
+ echo "Capturing traffic for test $MPTCP_LIB_TEST_COUNTER into $capfile"
+ ip netns exec "$ns" tcpdump -i any -s 65535 -B 32768 $capuser -w "$capfile" > "$capout" 2>&1 &
+ cappid=$!
+
+ sleep 1
+ fi
+}
+
+cond_stop_capture()
+{
+ if $capture; then
+ sleep 1
+ kill $cappid
+ cat "$capout"
+ fi
+}
+
+get_port()
+{
+ echo "$((10000 + MPTCP_LIB_TEST_COUNTER - 1))"
+}
+
do_transfer()
{
local listener_ns="$1"
@@ -894,33 +933,17 @@ do_transfer()
local cl_proto="$3"
local srv_proto="$4"
local connect_addr="$5"
+ local port
- local port=$((10000 + MPTCP_LIB_TEST_COUNTER - 1))
- local cappid
local FAILING_LINKS=${FAILING_LINKS:-""}
local fastclose=${fastclose:-""}
local speed=${speed:-"fast"}
+ port=$(get_port)
:> "$cout"
:> "$sout"
- :> "$capout"
- if $capture; then
- local capuser
- if [ -z $SUDO_USER ] ; then
- capuser=""
- else
- capuser="-Z $SUDO_USER"
- fi
-
- capfile=$(printf "mp_join-%02u-%s.pcap" "$MPTCP_LIB_TEST_COUNTER" "${listener_ns}")
-
- echo "Capturing traffic for test $MPTCP_LIB_TEST_COUNTER into $capfile"
- ip netns exec ${listener_ns} tcpdump -i any -s 65535 -B 32768 $capuser -w $capfile > "$capout" 2>&1 &
- cappid=$!
-
- sleep 1
- fi
+ cond_start_capture ${listener_ns}
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat -n
@@ -1007,10 +1030,7 @@ do_transfer()
wait $spid
local rets=$?
- if $capture; then
- sleep 1
- kill $cappid
- fi
+ cond_stop_capture
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat | grep Tcp > /tmp/${listener_ns}.out
@@ -1026,7 +1046,6 @@ do_transfer()
ip netns exec ${connector_ns} ss -Menita 1>&2 -o "dport = :$port"
cat /tmp/${connector_ns}.out
- cat "$capout"
return 1
fi
@@ -1043,13 +1062,7 @@ do_transfer()
fi
rets=$?
- if [ $retc -eq 0 ] && [ $rets -eq 0 ];then
- cat "$capout"
- return 0
- fi
-
- cat "$capout"
- return 1
+ [ $retc -eq 0 ] && [ $rets -eq 0 ]
}
make_file()
@@ -2873,6 +2886,32 @@ verify_listener_events()
fail_test
}
+chk_mpc_endp_attempt()
+{
+ local retl=$1
+ local attempts=$2
+
+ print_check "Connect"
+
+ if [ ${retl} = 124 ]; then
+ fail_test "timeout on connect"
+ elif [ ${retl} = 0 ]; then
+ fail_test "unexpected successful connect"
+ else
+ print_ok
+
+ print_check "Attempts"
+ count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMPCapableEndpAttempt")
+ if [ -z "$count" ]; then
+ print_skip
+ elif [ "$count" != "$attempts" ]; then
+ fail_test "got ${count} MPC attempt[s] on port-based endpoint, expected ${attempts}"
+ else
+ print_ok
+ fi
+ fi
+}
+
add_addr_ports_tests()
{
# signal address with port
@@ -2963,6 +3002,22 @@ add_addr_ports_tests()
chk_join_nr 2 2 2
chk_add_nr 2 2 2
fi
+
+ if reset "port-based signal endpoint must not accept mpc"; then
+ local port retl count
+ port=$(get_port)
+
+ cond_start_capture ${ns1}
+ pm_nl_add_endpoint ${ns1} 10.0.2.1 flags signal port ${port}
+ mptcp_lib_wait_local_port_listen ${ns1} ${port}
+
+ timeout 1 ip netns exec ${ns2} \
+ ./mptcp_connect -t ${timeout_poll} -p $port -s MPTCP 10.0.2.1 >/dev/null 2>&1
+ retl=$?
+ cond_stop_capture
+
+ chk_mpc_endp_attempt ${retl} 1
+ fi
}
syncookies_tests()
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101825-oaf-glaucoma-1d13@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland(a)arm.com>
Date: Tue, 8 Oct 2024 16:58:48 +0100
Subject: [PATCH] arm64: probes: Fix uprobes for big-endian kernels
The arm64 uprobes code is broken for big-endian kernels as it doesn't
convert the in-memory instruction encoding (which is always
little-endian) into the kernel's native endianness before analyzing and
simulating instructions. This may result in a few distinct problems:
* The kernel may erroneously reject probing an instruction which can
safely be probed.
* The kernel may erroneously permit stepping an
instruction out-of-line when that instruction cannot be stepped
out-of-line safely.
* The kernel may simulate an instruction incorrectly due to
interpreting the byte-swapped encoding.
The endianness mismatch isn't caught by the compiler or sparse because:
* The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so
the compiler and sparse have no idea these contain a little-endian
32-bit value. The core uprobes code populates these with a memcpy()
which similarly does not handle endianness.
* While the uprobe_opcode_t type is an alias for __le32, both
arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[]
to the similarly-named probe_opcode_t, which is an alias for u32.
Hence there is no endianness conversion warning.
Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and
adding the appropriate __le32_to_cpu() conversions prior to consuming
the instruction encoding. The core uprobes copies these fields as opaque
ranges of bytes, and so is unaffected by this change.
At the same time, remove MAX_UINSN_BYTES and consistently use
AARCH64_INSN_SIZE for clarity.
Tested with the following:
| #include <stdio.h>
| #include <stdbool.h>
|
| #define noinline __attribute__((noinline))
|
| static noinline void *adrp_self(void)
| {
| void *addr;
|
| asm volatile(
| " adrp %x0, adrp_self\n"
| " add %x0, %x0, :lo12:adrp_self\n"
| : "=r" (addr));
|
| return addr;
| }
|
|
| int main(int argc, char *argv[])
| {
| void *ptr = adrp_self();
| bool equal = (ptr == adrp_self);
|
| printf("adrp_self => %p\n"
| "adrp_self() => %p\n"
| "%s\n",
| adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
|
| return 0;
| }
.... where the adrp_self() function was compiled to:
| 00000000004007e0 <adrp_self>:
| 4007e0: 90000000 adrp x0, 400000 <__ehdr_start>
| 4007e4: 911f8000 add x0, x0, #0x7e0
| 4007e8: d65f03c0 ret
Before this patch, the ADRP is not recognized, and is assumed to be
steppable, resulting in corruption of the result:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0xffffffffff7e0
| NOT EQUAL
After this patch, the ADRP is correctly recognized and simulated:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| #
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h
index 2b09495499c6..014b02897f8e 100644
--- a/arch/arm64/include/asm/uprobes.h
+++ b/arch/arm64/include/asm/uprobes.h
@@ -10,11 +10,9 @@
#include <asm/insn.h>
#include <asm/probes.h>
-#define MAX_UINSN_BYTES AARCH64_INSN_SIZE
-
#define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES)
#define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE
-#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES
+#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe {
union {
- u8 insn[MAX_UINSN_BYTES];
- u8 ixol[MAX_UINSN_BYTES];
+ __le32 insn;
+ __le32 ixol;
};
struct arch_probe_insn api;
bool simulate;
diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c
index d49aef2657cd..a2f137a595fc 100644
--- a/arch/arm64/kernel/probes/uprobes.c
+++ b/arch/arm64/kernel/probes/uprobes.c
@@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))
return -EINVAL;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) {
case INSN_REJECTED:
@@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
if (!auprobe->simulate)
return false;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
addr = instruction_pointer(regs);
if (auprobe->api.handler)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101822-deplored-dictator-689d@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland(a)arm.com>
Date: Tue, 8 Oct 2024 16:58:48 +0100
Subject: [PATCH] arm64: probes: Fix uprobes for big-endian kernels
The arm64 uprobes code is broken for big-endian kernels as it doesn't
convert the in-memory instruction encoding (which is always
little-endian) into the kernel's native endianness before analyzing and
simulating instructions. This may result in a few distinct problems:
* The kernel may erroneously reject probing an instruction which can
safely be probed.
* The kernel may erroneously permit stepping an
instruction out-of-line when that instruction cannot be stepped
out-of-line safely.
* The kernel may simulate an instruction incorrectly due to
interpreting the byte-swapped encoding.
The endianness mismatch isn't caught by the compiler or sparse because:
* The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so
the compiler and sparse have no idea these contain a little-endian
32-bit value. The core uprobes code populates these with a memcpy()
which similarly does not handle endianness.
* While the uprobe_opcode_t type is an alias for __le32, both
arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[]
to the similarly-named probe_opcode_t, which is an alias for u32.
Hence there is no endianness conversion warning.
Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and
adding the appropriate __le32_to_cpu() conversions prior to consuming
the instruction encoding. The core uprobes copies these fields as opaque
ranges of bytes, and so is unaffected by this change.
At the same time, remove MAX_UINSN_BYTES and consistently use
AARCH64_INSN_SIZE for clarity.
Tested with the following:
| #include <stdio.h>
| #include <stdbool.h>
|
| #define noinline __attribute__((noinline))
|
| static noinline void *adrp_self(void)
| {
| void *addr;
|
| asm volatile(
| " adrp %x0, adrp_self\n"
| " add %x0, %x0, :lo12:adrp_self\n"
| : "=r" (addr));
|
| return addr;
| }
|
|
| int main(int argc, char *argv[])
| {
| void *ptr = adrp_self();
| bool equal = (ptr == adrp_self);
|
| printf("adrp_self => %p\n"
| "adrp_self() => %p\n"
| "%s\n",
| adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
|
| return 0;
| }
.... where the adrp_self() function was compiled to:
| 00000000004007e0 <adrp_self>:
| 4007e0: 90000000 adrp x0, 400000 <__ehdr_start>
| 4007e4: 911f8000 add x0, x0, #0x7e0
| 4007e8: d65f03c0 ret
Before this patch, the ADRP is not recognized, and is assumed to be
steppable, resulting in corruption of the result:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0xffffffffff7e0
| NOT EQUAL
After this patch, the ADRP is correctly recognized and simulated:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| #
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h
index 2b09495499c6..014b02897f8e 100644
--- a/arch/arm64/include/asm/uprobes.h
+++ b/arch/arm64/include/asm/uprobes.h
@@ -10,11 +10,9 @@
#include <asm/insn.h>
#include <asm/probes.h>
-#define MAX_UINSN_BYTES AARCH64_INSN_SIZE
-
#define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES)
#define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE
-#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES
+#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe {
union {
- u8 insn[MAX_UINSN_BYTES];
- u8 ixol[MAX_UINSN_BYTES];
+ __le32 insn;
+ __le32 ixol;
};
struct arch_probe_insn api;
bool simulate;
diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c
index d49aef2657cd..a2f137a595fc 100644
--- a/arch/arm64/kernel/probes/uprobes.c
+++ b/arch/arm64/kernel/probes/uprobes.c
@@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))
return -EINVAL;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) {
case INSN_REJECTED:
@@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
if (!auprobe->simulate)
return false;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
addr = instruction_pointer(regs);
if (auprobe->api.handler)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101820-delirious-wrongful-e7f1@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland(a)arm.com>
Date: Tue, 8 Oct 2024 16:58:48 +0100
Subject: [PATCH] arm64: probes: Fix uprobes for big-endian kernels
The arm64 uprobes code is broken for big-endian kernels as it doesn't
convert the in-memory instruction encoding (which is always
little-endian) into the kernel's native endianness before analyzing and
simulating instructions. This may result in a few distinct problems:
* The kernel may erroneously reject probing an instruction which can
safely be probed.
* The kernel may erroneously permit stepping an
instruction out-of-line when that instruction cannot be stepped
out-of-line safely.
* The kernel may simulate an instruction incorrectly due to
interpreting the byte-swapped encoding.
The endianness mismatch isn't caught by the compiler or sparse because:
* The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so
the compiler and sparse have no idea these contain a little-endian
32-bit value. The core uprobes code populates these with a memcpy()
which similarly does not handle endianness.
* While the uprobe_opcode_t type is an alias for __le32, both
arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[]
to the similarly-named probe_opcode_t, which is an alias for u32.
Hence there is no endianness conversion warning.
Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and
adding the appropriate __le32_to_cpu() conversions prior to consuming
the instruction encoding. The core uprobes copies these fields as opaque
ranges of bytes, and so is unaffected by this change.
At the same time, remove MAX_UINSN_BYTES and consistently use
AARCH64_INSN_SIZE for clarity.
Tested with the following:
| #include <stdio.h>
| #include <stdbool.h>
|
| #define noinline __attribute__((noinline))
|
| static noinline void *adrp_self(void)
| {
| void *addr;
|
| asm volatile(
| " adrp %x0, adrp_self\n"
| " add %x0, %x0, :lo12:adrp_self\n"
| : "=r" (addr));
|
| return addr;
| }
|
|
| int main(int argc, char *argv[])
| {
| void *ptr = adrp_self();
| bool equal = (ptr == adrp_self);
|
| printf("adrp_self => %p\n"
| "adrp_self() => %p\n"
| "%s\n",
| adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
|
| return 0;
| }
.... where the adrp_self() function was compiled to:
| 00000000004007e0 <adrp_self>:
| 4007e0: 90000000 adrp x0, 400000 <__ehdr_start>
| 4007e4: 911f8000 add x0, x0, #0x7e0
| 4007e8: d65f03c0 ret
Before this patch, the ADRP is not recognized, and is assumed to be
steppable, resulting in corruption of the result:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0xffffffffff7e0
| NOT EQUAL
After this patch, the ADRP is correctly recognized and simulated:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| #
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h
index 2b09495499c6..014b02897f8e 100644
--- a/arch/arm64/include/asm/uprobes.h
+++ b/arch/arm64/include/asm/uprobes.h
@@ -10,11 +10,9 @@
#include <asm/insn.h>
#include <asm/probes.h>
-#define MAX_UINSN_BYTES AARCH64_INSN_SIZE
-
#define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES)
#define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE
-#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES
+#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe {
union {
- u8 insn[MAX_UINSN_BYTES];
- u8 ixol[MAX_UINSN_BYTES];
+ __le32 insn;
+ __le32 ixol;
};
struct arch_probe_insn api;
bool simulate;
diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c
index d49aef2657cd..a2f137a595fc 100644
--- a/arch/arm64/kernel/probes/uprobes.c
+++ b/arch/arm64/kernel/probes/uprobes.c
@@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))
return -EINVAL;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) {
case INSN_REJECTED:
@@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
if (!auprobe->simulate)
return false;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
addr = instruction_pointer(regs);
if (auprobe->api.handler)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101818-tying-implement-f714@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland(a)arm.com>
Date: Tue, 8 Oct 2024 16:58:48 +0100
Subject: [PATCH] arm64: probes: Fix uprobes for big-endian kernels
The arm64 uprobes code is broken for big-endian kernels as it doesn't
convert the in-memory instruction encoding (which is always
little-endian) into the kernel's native endianness before analyzing and
simulating instructions. This may result in a few distinct problems:
* The kernel may erroneously reject probing an instruction which can
safely be probed.
* The kernel may erroneously permit stepping an
instruction out-of-line when that instruction cannot be stepped
out-of-line safely.
* The kernel may simulate an instruction incorrectly due to
interpreting the byte-swapped encoding.
The endianness mismatch isn't caught by the compiler or sparse because:
* The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so
the compiler and sparse have no idea these contain a little-endian
32-bit value. The core uprobes code populates these with a memcpy()
which similarly does not handle endianness.
* While the uprobe_opcode_t type is an alias for __le32, both
arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[]
to the similarly-named probe_opcode_t, which is an alias for u32.
Hence there is no endianness conversion warning.
Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and
adding the appropriate __le32_to_cpu() conversions prior to consuming
the instruction encoding. The core uprobes copies these fields as opaque
ranges of bytes, and so is unaffected by this change.
At the same time, remove MAX_UINSN_BYTES and consistently use
AARCH64_INSN_SIZE for clarity.
Tested with the following:
| #include <stdio.h>
| #include <stdbool.h>
|
| #define noinline __attribute__((noinline))
|
| static noinline void *adrp_self(void)
| {
| void *addr;
|
| asm volatile(
| " adrp %x0, adrp_self\n"
| " add %x0, %x0, :lo12:adrp_self\n"
| : "=r" (addr));
|
| return addr;
| }
|
|
| int main(int argc, char *argv[])
| {
| void *ptr = adrp_self();
| bool equal = (ptr == adrp_self);
|
| printf("adrp_self => %p\n"
| "adrp_self() => %p\n"
| "%s\n",
| adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
|
| return 0;
| }
.... where the adrp_self() function was compiled to:
| 00000000004007e0 <adrp_self>:
| 4007e0: 90000000 adrp x0, 400000 <__ehdr_start>
| 4007e4: 911f8000 add x0, x0, #0x7e0
| 4007e8: d65f03c0 ret
Before this patch, the ADRP is not recognized, and is assumed to be
steppable, resulting in corruption of the result:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0xffffffffff7e0
| NOT EQUAL
After this patch, the ADRP is correctly recognized and simulated:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| #
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h
index 2b09495499c6..014b02897f8e 100644
--- a/arch/arm64/include/asm/uprobes.h
+++ b/arch/arm64/include/asm/uprobes.h
@@ -10,11 +10,9 @@
#include <asm/insn.h>
#include <asm/probes.h>
-#define MAX_UINSN_BYTES AARCH64_INSN_SIZE
-
#define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES)
#define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE
-#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES
+#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe {
union {
- u8 insn[MAX_UINSN_BYTES];
- u8 ixol[MAX_UINSN_BYTES];
+ __le32 insn;
+ __le32 ixol;
};
struct arch_probe_insn api;
bool simulate;
diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c
index d49aef2657cd..a2f137a595fc 100644
--- a/arch/arm64/kernel/probes/uprobes.c
+++ b/arch/arm64/kernel/probes/uprobes.c
@@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))
return -EINVAL;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) {
case INSN_REJECTED:
@@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
if (!auprobe->simulate)
return false;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
addr = instruction_pointer(regs);
if (auprobe->api.handler)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101815-chapped-decibel-91ed@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland(a)arm.com>
Date: Tue, 8 Oct 2024 16:58:48 +0100
Subject: [PATCH] arm64: probes: Fix uprobes for big-endian kernels
The arm64 uprobes code is broken for big-endian kernels as it doesn't
convert the in-memory instruction encoding (which is always
little-endian) into the kernel's native endianness before analyzing and
simulating instructions. This may result in a few distinct problems:
* The kernel may erroneously reject probing an instruction which can
safely be probed.
* The kernel may erroneously permit stepping an
instruction out-of-line when that instruction cannot be stepped
out-of-line safely.
* The kernel may simulate an instruction incorrectly due to
interpreting the byte-swapped encoding.
The endianness mismatch isn't caught by the compiler or sparse because:
* The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so
the compiler and sparse have no idea these contain a little-endian
32-bit value. The core uprobes code populates these with a memcpy()
which similarly does not handle endianness.
* While the uprobe_opcode_t type is an alias for __le32, both
arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[]
to the similarly-named probe_opcode_t, which is an alias for u32.
Hence there is no endianness conversion warning.
Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and
adding the appropriate __le32_to_cpu() conversions prior to consuming
the instruction encoding. The core uprobes copies these fields as opaque
ranges of bytes, and so is unaffected by this change.
At the same time, remove MAX_UINSN_BYTES and consistently use
AARCH64_INSN_SIZE for clarity.
Tested with the following:
| #include <stdio.h>
| #include <stdbool.h>
|
| #define noinline __attribute__((noinline))
|
| static noinline void *adrp_self(void)
| {
| 	void *addr;
|
| 	asm volatile(
| 	"	adrp	%x0, adrp_self\n"
| 	"	add	%x0, %x0, :lo12:adrp_self\n"
| 	: "=r" (addr));
|
| 	return addr;
| }
|
|
| int main(int argc, char **argv)
| {
| 	void *ptr = adrp_self();
| 	bool equal = (ptr == adrp_self);
|
| 	printf("adrp_self   => %p\n"
| 	       "adrp_self() => %p\n"
| 	       "%s\n",
| 	       adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
|
| 	return 0;
| }
.... where the adrp_self() function was compiled to:
| 00000000004007e0 <adrp_self>:
| 4007e0: 90000000 adrp x0, 400000 <__ehdr_start>
| 4007e4: 911f8000 add x0, x0, #0x7e0
| 4007e8: d65f03c0 ret
Before this patch, the ADRP is not recognized, and is assumed to be
steppable, resulting in corruption of the result:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0xffffffffff7e0
| NOT EQUAL
After this patch, the ADRP is correctly recognized and simulated:
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
| #
| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
| # echo 1 > /sys/kernel/tracing/events/uprobes/enable
| # ./adrp-self
| adrp_self => 0x4007e0
| adrp_self() => 0x4007e0
| EQUAL
Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h
index 2b09495499c6..014b02897f8e 100644
--- a/arch/arm64/include/asm/uprobes.h
+++ b/arch/arm64/include/asm/uprobes.h
@@ -10,11 +10,9 @@
#include <asm/insn.h>
#include <asm/probes.h>
-#define MAX_UINSN_BYTES AARCH64_INSN_SIZE
-
#define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES)
#define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE
-#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES
+#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe {
union {
- u8 insn[MAX_UINSN_BYTES];
- u8 ixol[MAX_UINSN_BYTES];
+ __le32 insn;
+ __le32 ixol;
};
struct arch_probe_insn api;
bool simulate;
diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c
index d49aef2657cd..a2f137a595fc 100644
--- a/arch/arm64/kernel/probes/uprobes.c
+++ b/arch/arm64/kernel/probes/uprobes.c
@@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))
return -EINVAL;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) {
case INSN_REJECTED:
@@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
if (!auprobe->simulate)
return false;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]);
+ insn = le32_to_cpu(auprobe->insn);
addr = instruction_pointer(regs);
if (auprobe->api.handler)
As detected by Coverity, the error check logic at get_ctrl() is
broken: if ptr_to_user() fails to fill a control due to an error,
no errors are returned and v4l2_g_ctrl() returns success on a
failed operation, which may cause applications to fail.
Add an error check at get_ctrl() and ensure that the error is
returned to userspace without filling the control value if
get_ctrl() fails.
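
The intended pattern can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: struct ext_control, copy_value_stub(), get_ctrl_sketch() and g_ctrl_sketch() are hypothetical stand-ins, not the V4L2 functions, and the "fail" flag simply forces the copy helper to report an error.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the control payload handed back to the caller. */
struct ext_control {
	int value;
};

/* Stand-in for a copy helper that can fail; "fail" forces the error path. */
static int copy_value_stub(struct ext_control *c, int src, int fail)
{
	if (fail)
		return -EFAULT;
	c->value = src;
	return 0;
}

/* Propagate the helper's return code instead of discarding it. */
static int get_ctrl_sketch(struct ext_control *c, int cur, int fail)
{
	return copy_value_stub(c, cur, fail);
}

/* Only publish the value to the caller when the lookup succeeded. */
static int g_ctrl_sketch(int *value, int cur, int fail)
{
	struct ext_control c = { 0 };
	int ret = get_ctrl_sketch(&c, cur, fail);

	if (!ret)
		*value = c.value;

	return ret;
}
```

On the failure path the caller's value is left untouched and the negative error code is returned, which mirrors the behaviour the description above asks for.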
Fixes: 71c689dc2e73 ("media: v4l2-ctrls: split up into four source files")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
---
drivers/media/v4l2-core/v4l2-ctrls-api.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-api.c b/drivers/media/v4l2-core/v4l2-ctrls-api.c
index e5a364efd5e6..a0de7eeaf085 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-api.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-api.c
@@ -753,9 +753,10 @@ static int get_ctrl(struct v4l2_ctrl *ctrl, struct v4l2_ext_control *c)
for (i = 0; i < master->ncontrols; i++)
cur_to_new(master->cluster[i]);
ret = call_op(master, g_volatile_ctrl);
- new_to_user(c, ctrl);
+ if (!ret)
+ ret = new_to_user(c, ctrl);
} else {
- cur_to_user(c, ctrl);
+ ret = cur_to_user(c, ctrl);
}
v4l2_ctrl_unlock(master);
return ret;
@@ -770,7 +771,10 @@ int v4l2_g_ctrl(struct v4l2_ctrl_handler *hdl, struct v4l2_control *control)
if (!ctrl || !ctrl->is_int)
return -EINVAL;
ret = get_ctrl(ctrl, &c);
- control->value = c.value;
+
+ if (!ret)
+ control->value = c.value;
+
return ret;
}
EXPORT_SYMBOL(v4l2_g_ctrl);
@@ -811,10 +815,12 @@ static int set_ctrl_lock(struct v4l2_fh *fh, struct v4l2_ctrl *ctrl,
int ret;
v4l2_ctrl_lock(ctrl);
- user_to_new(c, ctrl);
+ ret = user_to_new(c, ctrl);
+ if (ret)
+ return ret;
ret = set_ctrl(fh, ctrl, 0);
if (!ret)
- cur_to_user(c, ctrl);
+ ret = cur_to_user(c, ctrl);
v4l2_ctrl_unlock(ctrl);
return ret;
}
--
2.47.0