December 2023 - Linux-stable-mirror

selftests: ftrace: Internal error: Oops: sve_save_state

by Naresh Kamboju

Following kernel crash noticed while running selftests: ftrace: ftracetest-ktap on FVP models running stable-rc 6.5.8-rc2.

This is not an easy to reproduce issue and not seen on mainline and next.
We are investigating this report.

Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Reported-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>

Test log:
----------
kselftest: Running tests in ftrace
TAP version 13
1..1
# timeout set to 0
# selftests: ftrace: ftracetest-ktap
# TAP version 13
# 1..129
# ok 1 Basic trace file check
# ok 2 Basic test for tracers
# ok 3 Basic trace clock test
# ok 4 Basic event tracing check
# ok 5 Change the ringbuffer size
# ok 6 Snapshot and tracing setting
# ok 7 Snapshot and tracing_cpumask
# ok 8 trace_pipe and trace_marker
[ 471.689140]
[ 471.689264] **********************************************************
[ 471.689422] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
[ 471.689574] ** **
[ 471.689716] ** trace_printk() being used. Allocating extra memory. **
[ 471.689878] ** **
[ 471.690031] ** This means that this is a DEBUG kernel and it is **
[ 471.690183] ** unsafe for production use. **
[ 471.690335] ** **
[ 471.690487] ** If you see this message and you are not debugging **
[ 471.690728] ** the kernel, report this immediately to your vendor! **
[ 471.690881] ** **
[ 471.691033] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
[ 471.691185] **********************************************************
[ 543.243648] hrtimer: interrupt took 11937170 ns
[ 764.987161] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 764.992518] Mem abort info:
[ 764.995330] ESR = 0x0000000096000044
[ 764.998562] EC = 0x25: DABT (current EL), IL = 32 bits
[ 765.002434] SET = 0, FnV = 0
[ 765.005361] EA = 0, S1PTW = 0
[ 765.008327] FSC = 0x04: level 0 translation fault
[ 765.012011] Data abort info:
[ 765.014858] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 765.018797] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 765.022562] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 765.026438] user pgtable: 4k pages, 48-bit VAs, pgdp=00000008848bd000
[ 765.030782] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 765.037045] Internal error: Oops: 0000000096000044 [#1] PREEMPT SMP
[ 765.038392] Modules linked in: ftrace_direct pl111_drm arm_spe_pmu drm_dma_helper crct10dif_ce panel_simple drm_kms_helper fuse drm dm_mod ip_tables x_tables [last unloaded: ftrace_direct]
[ 765.044892] CPU: 3 PID: 808 Comm: rmmod Not tainted 6.5.8-rc2 #1
[ 765.046192] Hardware name: FVP Base RevC (DT)
[ 765.047264] pstate: 234020c9 (nzCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 765.048693] pc : sve_save_state+0x4/0xf0
[ 765.049820] lr : fpsimd_save+0x1b8/0x218
[ 765.050933] sp : ffff800080a83ac0
[ 765.051871] x29: ffff800080a83ac0 x28: ffff000805257058 x27: 0000000000000001
[ 765.054108] x26: 0000000000000000 x25: ffffd7c64d332980 x24: 0000000000000000
[ 765.056341] x23: 0000000000000001 x22: ffff284232103000 x21: 0000000000000040
[ 765.058575] x20: ffff00087f7470b0 x19: ffffd7c64d6440b0 x18: 0000000000000000
[ 765.060811] x17: 0000000000000000 x16: ffff800080018000 x15: 0000000000000000
[ 765.063041] x14: 0000000000000000 x13: 0000000000000000 x12: 0000380a873b560e
[ 765.065277] x11: ffffd7c64e0ae390 x10: ffff800080a83b10 x9 : ffffd7c64b5b7710
[ 765.067516] x8 : ffff800080a839b8 x7 : 000000000000001e x6 : ffff00080000c200
[ 765.069752] x5 : ffffd7c64b78cc30 x4 : 0000000000000000 x3 : 0000000000000000
[ 765.071983] x2 : 0000000000000001 x1 : ffff000805257820 x0 : 0000000000000880
[ 765.074221] Call trace:
[ 765.075045] sve_save_state+0x4/0xf0
[ 765.076138] fpsimd_thread_switch+0x2c/0xe8
[ 765.077305] __switch_to+0x20/0x158
[ 765.078384] __schedule+0x2cc/0xb38
[ 765.079464] preempt_schedule_irq+0x44/0xa8
[ 765.080633] el1_interrupt+0x4c/0x68
[ 765.081691] el1h_64_irq_handler+0x18/0x28
[ 765.082829] el1h_64_irq+0x64/0x68
[ 765.083874] ftrace_return_to_handler+0x98/0x158
[ 765.085090] return_to_handler+0x20/0x48
[ 765.086205] do_sve_acc+0x64/0x128
[ 765.087272] el0_sve_acc+0x3c/0xa0
[ 765.088356] el0t_64_sync_handler+0x114/0x130
[ 765.089524] el0t_64_sync+0x190/0x198
[ 765.090712] Code: d51b4408 d65f03c0 d503201f d503245f (e5bb5800)
[ 765.092024] ---[ end trace 0000000000000000 ]---
[ 765.904294] pstore: backend (efi_pstore) writing error (-5)
[ 765.905531] note: rmmod[808] exited with irqs disabled

Links:
test log link:
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.5.y/build/v6.5.7…
Details of tests:
 - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2WrHwIIFdZ…
Build link:
 - https://storage.tuxsuite.com/public/linaro/lkft/builds/2WrHvExYOOOZVoxlISTd…

--
Linaro LKFT
https://lkft.linaro.org
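The "Mem abort info" and "Data abort info" fields in the oops can be cross-checked by hand from the raw ESR value. The following is a minimal sketch, assuming the architectural AArch64 ESR_ELx layout for data aborts (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], WnR in ISS bit 6, fault status code in ISS bits [5:0]). The `decode_esr` helper and `struct esr_fields` are hypothetical names for illustration, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an AArch64 ESR_ELx value that matter for a data abort. */
struct esr_fields {
	uint32_t ec;	/* exception class, bits [31:26] */
	bool il;	/* instruction length, bit 25 (1 = 32-bit instruction) */
	uint32_t iss;	/* instruction specific syndrome, bits [24:0] */
	bool wnr;	/* write-not-read, ISS bit 6 */
	uint32_t fsc;	/* fault status code, ISS bits [5:0] */
};

/* Hypothetical helper (not a kernel API): split a raw ESR into fields. */
static struct esr_fields decode_esr(uint64_t esr)
{
	struct esr_fields f;

	f.ec = (uint32_t)((esr >> 26) & 0x3f);
	f.il = ((esr >> 25) & 1) != 0;
	f.iss = (uint32_t)(esr & 0x1ffffff);
	f.wnr = ((esr >> 6) & 1) != 0;
	f.fsc = (uint32_t)(esr & 0x3f);
	return f;
}
```

Feeding it the logged value 0x96000044 gives EC 0x25, ISS 0x44, WnR set, and FSC 0x04, which agrees with the decoded lines in the log: the abort was taken on a write to the NULL address.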

1 year, 10 months


[PATCH] thermal/drivers/mediatek: Fix control buffer enablement on MT7896

by Frank Wunderlich

From: Frank Wunderlich <frank-w(a)public-files.de>

Reading thermal sensor on mt7986 devices returns invalid temperature:

bpi-r3 ~ # cat /sys/class/thermal/thermal_zone0/temp
-274000

Fix this by adding missing members in mtk_thermal_data struct which were used in mtk_thermal_turn_on_buffer after commit 33140e668b10.

Cc: stable(a)vger.kernel.org
Fixes: 33140e668b10 ("thermal/drivers/mediatek: Control buffer enablement tweaks")
Signed-off-by: Frank Wunderlich <frank-w(a)public-files.de>
---
 drivers/thermal/mediatek/auxadc_thermal.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/thermal/mediatek/auxadc_thermal.c b/drivers/thermal/mediatek/auxadc_thermal.c
index 843214d30bd8..967b9a1aead4 100644
--- a/drivers/thermal/mediatek/auxadc_thermal.c
+++ b/drivers/thermal/mediatek/auxadc_thermal.c
@@ -690,6 +690,9 @@ static const struct mtk_thermal_data mt7986_thermal_data = {
 	.adcpnp = mt7986_adcpnp,
 	.sensor_mux_values = mt7986_mux_values,
 	.version = MTK_THERMAL_V3,
+	.apmixed_buffer_ctl_reg = APMIXED_SYS_TS_CON1,
+	.apmixed_buffer_ctl_mask = GENMASK(31, 6) | BIT(3),
+	.apmixed_buffer_ctl_set = BIT(0),
 };
 
 static bool mtk_thermal_temp_is_valid(int temp)
--
2.34.1
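To see what the three added members control, here is a small userspace sketch. It assumes that mtk_thermal_turn_on_buffer does a read-modify-write of the form "keep the bits in the mask, then OR in the set bits"; the body of that function is not part of this mail, so treat that as an assumption. `BIT32`, `GENMASK32` and `buffer_ctl_value` are illustrative names, not the kernel's macros or helpers.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's BIT() and GENMASK() on 32-bit values. */
#define BIT32(n)	(UINT32_C(1) << (n))
#define GENMASK32(h, l)	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/*
 * Assumed read-modify-write step: keep only the bits selected by mask,
 * then force the bits in set to 1.
 */
static uint32_t buffer_ctl_value(uint32_t old, uint32_t mask, uint32_t set)
{
	return (old & mask) | set;
}
```

With the values from the patch, the mask GENMASK(31, 6) | BIT(3) is 0xFFFFFFC8, so bits 5, 4, 2 and 1 are cleared, bit 3 is preserved and bit 0 is set. With the members left zero-initialized, the same step would produce 0 regardless of the old register content, which is consistent with the buffer not being enabled.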

1 year, 10 months


[PATCH v2] rpm-pkg: simplify installkernel %post

by Jose Ignacio Tornos Martinez

A new installkernel application is now included in systemd-udev package and it has been improved to allow simplifications.

For the new installkernel application, as Davide says:
<<The %post currently does a shuffling dance before calling installkernel. This isn't actually necessary afaict, and the current implementation ends up triggering downstream issues such as https://github.com/systemd/systemd/issues/29568
This commit simplifies the logic to remove the shuffling. For reference, the original logic was added in commit 3c9c7a14b627("rpm-pkg: add %post section to create initramfs and grub hooks").>>

But we need to keep the old behavior as well, because the old installkernel application from grubby package, does not allow this simplification and we need to be backward compatible to avoid issues with the different packages. So the easiest solution is to check the package that provides the installkernel application, and simplify (and fix for this application at the same time), only if the package is systemd-udev.

cc: stable(a)vger.kernel.org
Co-Developed-by: Davide Cavalca <dcavalca(a)meta.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
---
V1 -> V2:
- Complete to be backward compatible with the previous installkernel application.

 scripts/package/kernel.spec | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/scripts/package/kernel.spec b/scripts/package/kernel.spec
index 3eee0143e0c5..d4276ddb6645 100644
--- a/scripts/package/kernel.spec
+++ b/scripts/package/kernel.spec
@@ -77,12 +77,16 @@ rm -rf %{buildroot}
 
 %post
 if [ -x /sbin/installkernel -a -r /boot/vmlinuz-%{KERNELRELEASE} -a -r /boot/System.map-%{KERNELRELEASE} ]; then
+if [ $(rpm -qf /sbin/installkernel --queryformat "%{n}") = systemd-udev ];then
+/sbin/installkernel %{KERNELRELEASE} /boot/vmlinuz-%{KERNELRELEASE} /boot/System.map-%{KERNELRELEASE}
+else
 cp /boot/vmlinuz-%{KERNELRELEASE} /boot/.vmlinuz-%{KERNELRELEASE}-rpm
 cp /boot/System.map-%{KERNELRELEASE} /boot/.System.map-%{KERNELRELEASE}-rpm
 rm -f /boot/vmlinuz-%{KERNELRELEASE} /boot/System.map-%{KERNELRELEASE}
 /sbin/installkernel %{KERNELRELEASE} /boot/.vmlinuz-%{KERNELRELEASE}-rpm /boot/.System.map-%{KERNELRELEASE}-rpm
 rm -f /boot/.vmlinuz-%{KERNELRELEASE}-rpm /boot/.System.map-%{KERNELRELEASE}-rpm
 fi
+fi
 
 %preun
 if [ -x /sbin/new-kernel-pkg ]; then
--
2.43.0

1 year, 10 months


[PATCH] arm64: dts: qcom: sc7280: Add additional MSI interrupts

by Krishna chaitanya chundru

The current MSI mapping doesn't have all the vectors. This platform supports 8 vectors, and each vector supports 32 MSIs, so the total number of MSIs supported is 256.

Add all the MSI groups supported for this PCIe instance in this platform.

Fixes: 92e0ee9f83b3 ("arm64: dts: qcom: sc7280: Add PCIe and PHY related nodes")
cc: stable(a)vger.kernel.org
Signed-off-by: Krishna chaitanya chundru <quic_krichai(a)quicinc.com>
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index 66f1eb83cca7..e1dc41705f61 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -2146,8 +2146,16 @@ pcie1: pci@1c08000 {
 			ranges = <0x01000000 0x0 0x00000000 0x0 0x40200000 0x0 0x100000>,
 				 <0x02000000 0x0 0x40300000 0x0 0x40300000 0x0 0x1fd00000>;
 
-			interrupts = <GIC_SPI 307 IRQ_TYPE_LEVEL_HIGH>;
-			interrupt-names = "msi";
+			interrupts = <GIC_SPI 307 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 308 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 309 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 312 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 313 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 314 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 374 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 375 IRQ_TYPE_LEVEL_HIGH>;
+			interrupt-names = "msi0", "msi1", "msi2", "msi3",
+					  "msi4", "msi5", "msi6", "msi7";
 			#interrupt-cells = <1>;
 			interrupt-map-mask = <0 0 0 0x7>;
 			interrupt-map = <0 0 0 1 &intc 0 0 0 434 IRQ_TYPE_LEVEL_HIGH>,

---
base-commit: 5bd7ef53ffe5ca580e93e74eb8c81ed191ddc4bd
change-id: 20231218-additional_msi-6062dc812c29

Best regards,
--
Krishna chaitanya chundru <quic_krichai(a)quicinc.com>
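The vector arithmetic from the commit message (8 groups of 32 MSIs, 256 in total) can be sketched as below. The linear numbering of MSIs across the "msi0" to "msi7" groups is an assumption made for illustration; the mail does not say how the controller driver assigns them. `msi_total`, `msi_to_slot` and `struct msi_slot` are hypothetical names.

```c
/* Counts taken from the commit message. */
#define MSI_GROUPS	8	/* one SPI per group: "msi0" .. "msi7" */
#define MSIS_PER_GROUP	32

struct msi_slot {
	unsigned int group;	/* index of the "msiN" interrupt */
	unsigned int bit;	/* position inside that group */
};

static unsigned int msi_total(void)
{
	return MSI_GROUPS * MSIS_PER_GROUP;
}

/*
 * Map a flat MSI number to (group, bit), assuming linear numbering.
 * Returns 0 on success, -1 if the number is out of range.
 */
static int msi_to_slot(unsigned int msi, struct msi_slot *slot)
{
	if (msi >= msi_total())
		return -1;

	slot->group = msi / MSIS_PER_GROUP;
	slot->bit = msi % MSIS_PER_GROUP;
	return 0;
}
```

Under this numbering, a device tree that lists only the first interrupt could serve at most MSI numbers 0 to 31, which is the gap the patch closes.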

1 year, 10 months


[PATCH] drm/ttm: Make sure the mapped tt pages are decrypted when needed

by Zack Rusin

From: Zack Rusin <zackr(a)vmware.com>

Some drivers require the mapped tt pages to be decrypted. In an ideal world this would have been handled by the dma layer, but the TTM page fault handling would have to be rewritten to be able to do that.

A side-effect of the TTM page fault handling is using a dma allocation per order (via ttm_pool_alloc_page) which makes it impossible to just trivially use dma_mmap_attrs. As a result ttm has to be very careful about trying to make its pgprot for the mapped tt pages match what the dma layer thinks it is. At the ttm layer it's possible to deduce the requirement to have tt pages decrypted by checking whether coherent dma allocations have been requested and the system is running with confidential computing technologies.

This approach isn't ideal, but keeping TTM matching DMA's expectations for the page properties is in general fragile; unfortunately a proper fix would require a rewrite of TTM's page fault handling.

Fixes vmwgfx with SEV enabled.

Signed-off-by: Zack Rusin <zackr(a)vmware.com>
Fixes: 3bf3710e3718 ("drm/ttm: Add a generic TTM memcpy move for page-based iomem")
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Huang Rui <ray.huang(a)amd.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kernel(a)vger.kernel.org
Cc: <stable(a)vger.kernel.org> # v5.14+
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 13 +++++++++++--
 drivers/gpu/drm/ttm/ttm_tt.c      |  7 +++++++
 include/drm/ttm/ttm_tt.h          |  9 ++++++++-
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index fd9fd3d15101..0b3f4267130c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -294,7 +294,13 @@ pgprot_t ttm_io_prot(struct ttm_buffer_object *bo, struct ttm_resource *res,
 	enum ttm_caching caching;
 
 	man = ttm_manager_type(bo->bdev, res->mem_type);
-	caching = man->use_tt ? bo->ttm->caching : res->bus.caching;
+	if (man->use_tt) {
+		caching = bo->ttm->caching;
+		if (bo->ttm->page_flags & TTM_TT_FLAG_DECRYPTED)
+			tmp = pgprot_decrypted(tmp);
+	} else {
+		caching = res->bus.caching;
+	}
 
 	return ttm_prot_from_caching(caching, tmp);
 }
@@ -337,6 +343,8 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 		.no_wait_gpu = false
 	};
 	struct ttm_tt *ttm = bo->ttm;
+	struct ttm_resource_manager *man =
+			ttm_manager_type(bo->bdev, bo->resource->mem_type);
 	pgprot_t prot;
 	int ret;
 
@@ -346,7 +354,8 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 	if (ret)
 		return ret;
 
-	if (num_pages == 1 && ttm->caching == ttm_cached) {
+	if (num_pages == 1 && ttm->caching == ttm_cached &&
+	    !(man->use_tt && (ttm->page_flags & TTM_TT_FLAG_DECRYPTED))) {
 		/*
 		 * We're mapping a single page, and the desired
 		 * page protection is consistent with the bo.
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index e0a77671edd6..02dcb728e29c 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -81,6 +81,13 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
 		pr_err("Illegal buffer object type\n");
 		return -EINVAL;
 	}
+	/*
+	 * When using dma_alloc_coherent with memory encryption the
+	 * mapped TT pages need to be decrypted or otherwise the drivers
+	 * will end up sending encrypted mem to the gpu.
+	 */
+	if (bdev->pool.use_dma_alloc && cc_platform_has(CC_ATTR_MEM_ENCRYPT))
+		page_flags |= TTM_TT_FLAG_DECRYPTED;
 
 	bo->ttm = bdev->funcs->ttm_tt_create(bo, page_flags);
 	if (unlikely(bo->ttm == NULL))
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index a4eff85b1f44..2b9d856ff388 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -79,6 +79,12 @@ struct ttm_tt {
 	 *   page_flags = TTM_TT_FLAG_EXTERNAL |
 	 *		  TTM_TT_FLAG_EXTERNAL_MAPPABLE;
 	 *
+	 * TTM_TT_FLAG_DECRYPTED: The mapped ttm pages should be marked as
+	 * not encrypted. The framework will try to match what the dma layer
+	 * is doing, but note that it is a little fragile because ttm page
+	 * fault handling abuses the DMA api a bit and dma_map_attrs can't be
+	 * used to assure pgprot always matches.
+	 *
 	 * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is
 	 * set by TTM after ttm_tt_populate() has successfully returned, and is
 	 * then unset when TTM calls ttm_tt_unpopulate().
@@ -87,8 +93,9 @@ struct ttm_tt {
 #define TTM_TT_FLAG_ZERO_ALLOC		BIT(1)
 #define TTM_TT_FLAG_EXTERNAL		BIT(2)
 #define TTM_TT_FLAG_EXTERNAL_MAPPABLE	BIT(3)
+#define TTM_TT_FLAG_DECRYPTED		BIT(4)
 
-#define TTM_TT_FLAG_PRIV_POPULATED	BIT(4)
+#define TTM_TT_FLAG_PRIV_POPULATED	BIT(5)
 	uint32_t page_flags;
 	/** @num_pages: Number of pages in the page array. */
 	uint32_t num_pages;
--
2.39.2
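The two decisions the patch adds can be modelled outside the kernel as plain boolean logic. This is a sketch only: the flag value is taken from the diff, while `tt_create_flags` and `can_use_direct_kmap` are hypothetical helpers whose boolean parameters stand in for `bdev->pool.use_dma_alloc`, `cc_platform_has(CC_ATTR_MEM_ENCRYPT)`, `man->use_tt` and `ttm->caching == ttm_cached`.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT(n)		(UINT32_C(1) << (n))
#define TTM_TT_FLAG_DECRYPTED	SKETCH_BIT(4)	/* value from the patch */

/* Mirrors the flag decision added to ttm_tt_create(). */
static uint32_t tt_create_flags(uint32_t page_flags, bool use_dma_alloc,
				bool mem_encrypt)
{
	if (use_dma_alloc && mem_encrypt)
		page_flags |= TTM_TT_FLAG_DECRYPTED;
	return page_flags;
}

/*
 * Mirrors the condition in ttm_bo_kmap_ttm(): the single-page shortcut
 * is taken only when the page does not need a decrypted mapping.
 */
static bool can_use_direct_kmap(uint32_t num_pages, bool cached, bool use_tt,
				uint32_t page_flags)
{
	return num_pages == 1 && cached &&
	       !(use_tt && (page_flags & TTM_TT_FLAG_DECRYPTED));
}
```

The second helper shows why the extra condition is needed: a single cached page that carries the decrypted flag no longer takes the shortcut, so it goes through the path where the page protection is computed explicitly.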

1 year, 11 months


[PATCH v4 0/4] x86/cacheinfo: Set the number of leaves per CPU

by Ricardo Neri

Hi,

The interface /sys/devices/system/cpu/cpuX/cache is broken (not populated) if CPUs have different numbers of subleaves in CPUID 4. This is the case of Intel Meteor Lake.

This is v4 of a patchset to fix the cache sysfs interface by setting the number of cache leaves independently for each CPU.

v1, v2, and v3 can be found here[1], here[2], and here[3].

All the tests described in [4] passed.

Changes since v3:
 * Fixed another NULL-pointer dereference when checking the validity of the last-level cache info.
 * Added the Reviewed-by tags from Radu and Sudeep. Thanks!
 * Rebased on v6.7-rc5.

Changes since v2:
 * This version uncovered a NULL-pointer dereference in recent changes to cacheinfo[5]. This dereference is observed when the system does not configure cacheinfo early during boot nor makes corrections later during CPU hotplug; as is the case in x86. Patch 1 fixes this issue.

Changes since v1:
 * Dave Hansen suggested to use the existing per-CPU ci_cpu_cacheinfo variable. Now the global variable num_cache_leaves became useless.
 * While here, I noticed that init_cache_level() also became useless: x86 does not need ci_cpu_cacheinfo::num_levels.

Thanks and BR,
Ricardo

[1]. https://lore.kernel.org/lkml/20230314231658.30169-1-ricardo.neri-calderon@l…
[2]. https://lore.kernel.org/all/20230424001956.21434-1-ricardo.neri-calderon@li…
[3]. https://lore.kernel.org/lkml/20230805012421.7002-1-ricardo.neri-calderon@li…
[4]. https://lore.kernel.org/lkml/20230912032350.GA17008@ranerica-svr.sc.intel.c…
[5]. https://lore.kernel.org/all/20230412185759.755408-1-rrendec@redhat.com/

Ricardo Neri (4):
  cacheinfo: Check for null last-level cache info
  cacheinfo: Allocate memory for memory if not done from the primary CPU
  x86/cacheinfo: Delete global num_cache_leaves
  x86/cacheinfo: Clean out init_cache_level()

 arch/x86/kernel/cpu/cacheinfo.c | 49 +++++++++++++++++----------------
 drivers/base/cacheinfo.c        |  9 +++++-
 2 files changed, 34 insertions(+), 24 deletions(-)

--
2.25.1
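The core idea of the series, replacing one global leaf count with a per-CPU value, can be illustrated with a toy model. This is not the kernel's data structure: `SKETCH_NR_CPUS`, `struct cpu_cacheinfo_sketch`, `set_num_leaves` and `get_num_leaves` are all hypothetical stand-ins, and the leaf counts used are made up for the example.

```c
#define SKETCH_NR_CPUS 4

/* Toy stand-in for a per-CPU cacheinfo record. */
struct cpu_cacheinfo_sketch {
	unsigned int num_leaves;
};

static struct cpu_cacheinfo_sketch per_cpu_ci[SKETCH_NR_CPUS];

/* Record the number of cache leaves enumerated on one CPU. */
static int set_num_leaves(unsigned int cpu, unsigned int leaves)
{
	if (cpu >= SKETCH_NR_CPUS)
		return -1;

	per_cpu_ci[cpu].num_leaves = leaves;
	return 0;
}

/* Look up the count for one CPU; unknown CPUs report 0 leaves. */
static unsigned int get_num_leaves(unsigned int cpu)
{
	return cpu < SKETCH_NR_CPUS ? per_cpu_ci[cpu].num_leaves : 0;
}
```

With a single global variable, the value written for the last CPU enumerated would be used for every CPU; with one slot per CPU, two CPUs that report different numbers of subleaves each keep their own count.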

1 year, 11 months


[PATCH v2 1/2] EDAC/device_sysfs: Fix calling kobject_put() with ->state_initialized unset

by Harshit Mogalapalli

In edac_device_register_sysfs_main_kobj(), when dev_root is NULL, kobject_init_and_add() is not called.

	if (err) { // err = -ENODEV
		edac_dbg(1, "Failed to register '.../edac/%s'\n",
			 edac_dev->name);
		goto err_kobj_reg; // This calls kobj_put()
	}

This will cause a runtime warning in kobject_put() if the above happens.
Warning:
"kobject: '%s' (%p): is not initialized, yet kobject_put() is being called."

Fix the error handling to avoid the above possible situation.

Cc: <stable(a)vger.kernel.org>
Fixes: cb4a0bec0bb9 ("EDAC/sysfs: move to use bus_get_dev_root()")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
---
This is based on static analysis and only compile tested.
v1->v2: Resend as a patchset as they are two similar bugs.
---
 drivers/edac/edac_device_sysfs.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c
index 010c26be5846..4cac14cbdb60 100644
--- a/drivers/edac/edac_device_sysfs.c
+++ b/drivers/edac/edac_device_sysfs.c
@@ -253,11 +253,13 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 
 	/* register */
 	dev_root = bus_get_dev_root(edac_subsys);
-	if (dev_root) {
-		err = kobject_init_and_add(&edac_dev->kobj, &ktype_device_ctrl,
-					   &dev_root->kobj, "%s", edac_dev->name);
-		put_device(dev_root);
-	}
+	if (!dev_root)
+		goto module_put;
+
+	err = kobject_init_and_add(&edac_dev->kobj, &ktype_device_ctrl,
+				   &dev_root->kobj, "%s", edac_dev->name);
+	put_device(dev_root);
+
 	if (err) {
 		edac_dbg(1, "Failed to register '.../edac/%s'\n",
 			 edac_dev->name);
@@ -276,8 +278,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	/* Error exit stack */
 err_kobj_reg:
 	kobject_put(&edac_dev->kobj);
+module_put:
 	module_put(edac_dev->owner);
-
 err_out:
 	return err;
 }
--
2.42.0
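The control flow of the fix can be modelled in userspace to show which cleanup label each failure should reach. This is a sketch under stated assumptions: `struct obj_sketch` and `register_obj` are hypothetical, the `initialized` flag stands in for the kobject's `state_initialized`, and the counters only exist to make the behaviour observable.

```c
#include <errno.h>
#include <stdbool.h>

struct obj_sketch {
	bool initialized;	/* stands in for kobj->state_initialized */
	int puts_on_uninit;	/* times a "put" hit an uninitialized object */
	int module_refs;	/* stands in for the module reference count */
};

static int register_obj(struct obj_sketch *o, bool have_root, bool init_fails)
{
	int err;

	o->module_refs++;		/* models taking the module reference */

	if (!have_root) {
		err = -ENODEV;
		goto module_put;	/* nothing was initialized: skip the put */
	}

	o->initialized = true;		/* models kobject_init_and_add() */
	err = init_fails ? -ENOMEM : 0;
	if (err)
		goto err_kobj_reg;

	return 0;

err_kobj_reg:
	if (!o->initialized)
		o->puts_on_uninit++;	/* this is what would trigger the warning */
	o->initialized = false;		/* models kobject_put() */
module_put:
	o->module_refs--;
	return err;
}
```

The missing-root case jumps straight to `module_put`, so the put is never applied to an object that was not initialized, while a failure after initialization still unwinds through both labels.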

1 year, 11 months


[PATCH] firmware: arm_scmi: Check Mailbox/SMT channel for consistency

by Cristian Marussi

On reception of a completion interrupt the SMT memory area is accessed to retrieve the message header at first and then, if the message sequence number identifies a transaction which is still pending, the related payload is fetched too.

When an SCMI command times out the channel ownership remains with the platform until eventually a late reply is received and, as a consequence, any further transmission attempt remains pending, waiting for the channel to be relinquished by the platform.

Once that late reply is received the channel ownership is given back to the agent and any pending request is then allowed to proceed and overwrite the SMT area of the just delivered late reply; then the wait for the reply to the new request starts.

It has been observed that the spurious IRQ related to the late reply can be wrongly associated with the freshly enqueued request: when that happens the SCMI stack in-flight lookup procedure is fooled by the fact that the message header now present in the SMT area is related to the new pending transaction, even though the real reply has still to arrive.

This race-condition on the A2P channel can be detected by looking at the channel status bits: a genuine reply from the platform will have set the channel free bit before triggering the completion IRQ.

Add a consistency check to validate such condition in the A2P ISR.

Reported-by: Xinglong Yang <xinglong.yang(a)cixtech.com>
Closes: https://lore.kernel.org/all/PUZPR06MB54981E6FA00D82BFDBB864FBF08DA@PUZPR06M…
Fixes: 5c8a47a5a91d ("firmware: arm_scmi: Make scmi core independent of the transport type")
CC: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Cristian Marussi <cristian.marussi(a)arm.com>
---
 drivers/firmware/arm_scmi/common.h  |  1 +
 drivers/firmware/arm_scmi/mailbox.c | 14 ++++++++++++++
 drivers/firmware/arm_scmi/shmem.c   |  6 ++++++
 3 files changed, 21 insertions(+)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 3b7c68a11fd0..0956c2443840 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -329,6 +329,7 @@ void shmem_fetch_notification(struct scmi_shared_mem __iomem *shmem,
 void shmem_clear_channel(struct scmi_shared_mem __iomem *shmem);
 bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
 		     struct scmi_xfer *xfer);
+bool shmem_channel_free(struct scmi_shared_mem __iomem *shmem);
 
 /* declarations for message passing transports */
 struct scmi_msg_payld;
diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index 19246ed1f01f..b8d470417e8f 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -45,6 +45,20 @@ static void rx_callback(struct mbox_client *cl, void *m)
 {
 	struct scmi_mailbox *smbox = client_to_scmi_mailbox(cl);
 
+	/*
+	 * An A2P IRQ is NOT valid when received while the platform still has
+	 * the ownership of the channel, because the platform at first releases
+	 * the SMT channel and then sends the completion interrupt.
+	 *
+	 * This addresses a possible race condition in which a spurious IRQ from
+	 * a previous timed-out reply which arrived late could be wrongly
+	 * associated with the next pending transaction.
+	 */
+	if (cl->knows_txdone && !shmem_channel_free(smbox->shmem)) {
+		dev_warn(smbox->cinfo->dev, "Ignoring spurious A2P IRQ !\n");
+		return;
+	}
+
 	scmi_rx_callback(smbox->cinfo, shmem_read_header(smbox->shmem), NULL);
 }
 
diff --git a/drivers/firmware/arm_scmi/shmem.c b/drivers/firmware/arm_scmi/shmem.c
index 87b4f4d35f06..517d52fb3bcb 100644
--- a/drivers/firmware/arm_scmi/shmem.c
+++ b/drivers/firmware/arm_scmi/shmem.c
@@ -122,3 +122,9 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
 		(SCMI_SHMEM_CHAN_STAT_CHANNEL_ERROR |
 		 SCMI_SHMEM_CHAN_STAT_CHANNEL_FREE);
 }
+
+bool shmem_channel_free(struct scmi_shared_mem __iomem *shmem)
+{
+	return (ioread32(&shmem->channel_status) &
+			SCMI_SHMEM_CHAN_STAT_CHANNEL_FREE);
+}
--
2.34.1
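The consistency check can be exercised in isolation with a small model of the shared-memory status word. This is a sketch: the bit positions (free in bit 0, error in bit 1) are assumptions for illustration since the mail does not show the macro definitions, and `struct shmem_sketch`, `channel_free` and `accept_a2p_irq` are hypothetical names that only mimic the logic of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the channel status flags. */
#define CHAN_STAT_FREE	(UINT32_C(1) << 0)
#define CHAN_STAT_ERROR	(UINT32_C(1) << 1)

/* Toy stand-in for the SMT shared-memory area. */
struct shmem_sketch {
	uint32_t channel_status;
	uint32_t msg_header;
};

static bool channel_free(const struct shmem_sketch *shmem)
{
	return (shmem->channel_status & CHAN_STAT_FREE) != 0;
}

/*
 * Returns true if a completion IRQ should be processed. An IRQ that
 * arrives while the platform still owns the channel is treated as
 * spurious, mirroring the check added to rx_callback().
 */
static bool accept_a2p_irq(const struct shmem_sketch *shmem, bool knows_txdone)
{
	if (knows_txdone && !channel_free(shmem))
		return false;

	return true;
}
```

In the race described above, the agent has just written a new request, so the free bit is clear when the late IRQ fires and the header in the area belongs to the new transaction; the check rejects that IRQ instead of matching it to the pending request.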

1 year, 11 months


[PATCH AUTOSEL 4.14 1/6] clk: rockchip: rk3128: Fix HCLK_OTG gate register

by Sasha Levin

From: Weihao Li <cn.liweihao(a)gmail.com>

[ Upstream commit c6c5a5580dcb6631aa6369dabe12ef3ce784d1d2 ]

The HCLK_OTG gate control is in CRU_CLKGATE5_CON, not CRU_CLKGATE3_CON.

Signed-off-by: Weihao Li <cn.liweihao(a)gmail.com>
Link: https://lore.kernel.org/r/20231031111816.8777-1-cn.liweihao@gmail.com
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/clk/rockchip/clk-rk3128.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/rockchip/clk-rk3128.c b/drivers/clk/rockchip/clk-rk3128.c
index 5970a50671b9a..83c7eb18321f4 100644
--- a/drivers/clk/rockchip/clk-rk3128.c
+++ b/drivers/clk/rockchip/clk-rk3128.c
@@ -497,7 +497,7 @@ static struct rockchip_clk_branch common_clk_branches[] __initdata = {
 	GATE(HCLK_I2S_2CH, "hclk_i2s_2ch", "hclk_peri", 0, RK2928_CLKGATE_CON(7), 2, GFLAGS),
 	GATE(0, "hclk_usb_peri", "hclk_peri", CLK_IGNORE_UNUSED, RK2928_CLKGATE_CON(9), 13, GFLAGS),
 	GATE(HCLK_HOST2, "hclk_host2", "hclk_peri", 0, RK2928_CLKGATE_CON(7), 3, GFLAGS),
-	GATE(HCLK_OTG, "hclk_otg", "hclk_peri", 0, RK2928_CLKGATE_CON(3), 13, GFLAGS),
+	GATE(HCLK_OTG, "hclk_otg", "hclk_peri", 0, RK2928_CLKGATE_CON(5), 13, GFLAGS),
 	GATE(0, "hclk_peri_ahb", "hclk_peri", CLK_IGNORE_UNUSED, RK2928_CLKGATE_CON(9), 14, GFLAGS),
 	GATE(HCLK_SPDIF, "hclk_spdif", "hclk_peri", 0, RK2928_CLKGATE_CON(10), 9, GFLAGS),
 	GATE(HCLK_TSP, "hclk_tsp", "hclk_peri", 0, RK2928_CLKGATE_CON(10), 12, GFLAGS),
--
2.43.0
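The one-character change moves the gate to a different register, not just a different bit. The sketch below assumes that RK2928_CLKGATE_CON(x) expands to `x * 4 + 0xd0`; that definition is not shown in this mail, so the base offset is an assumption labelled as such in the code. `clkgate_con_offset` and `gate_bit_mask` are hypothetical helper names.

```c
#include <stdint.h>

/*
 * Assumed layout: 32-bit gate registers laid out consecutively starting
 * at offset 0xd0 in the CRU. The base is an assumption, not taken from
 * the patch.
 */
#define SKETCH_CLKGATE_BASE	0xd0u

/* Register offset of gate control register number idx. */
static uint32_t clkgate_con_offset(uint32_t idx)
{
	return idx * 4u + SKETCH_CLKGATE_BASE;
}

/* Mask of a single gate bit inside its register. */
static uint32_t gate_bit_mask(uint32_t bit)
{
	return UINT32_C(1) << bit;
}
```

Under that assumption, the old entry toggled bit 13 at offset 0xdc and the fixed entry toggles bit 13 at offset 0xe4, so before the fix the driver was gating whatever clock actually lives at CLKGATE3 bit 13 rather than HCLK_OTG.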

1 year, 11 months


[PATCH v2 1/2] usb: dwc3: host: Set XHCI_SG_TRB_CACHE_SIZE_QUIRK

by Prashanth K

Upstream commit bac1ec551434 ("usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK") introduced a new quirk in XHCI which fixes an XHC timeout seen on Synopsys XHCs while using SG buffers. But the support for this quirk isn't present in the DWC3 layer.

We will encounter this XHCI timeout/hung issue if we run iperf loopback tests using an RTL8156 ethernet adaptor on DWC3 targets with scatter-gather enabled. This gets resolved after enabling the XHCI_SG_TRB_CACHE_SIZE_QUIRK. This patch enables it using the xhci device property since it is needed for the DWC3 controller.

In Synopsys DWC3 databook,
Table 9-3: xHCI Debug Capability Limitations
Chained TRBs greater than TRB cache size: The debug capability driver must not create a multi-TRB TD that describes smaller than a 1K packet that spreads across 8 or more TRBs on either the IN TR or the OUT TR.

Cc: <stable(a)vger.kernel.org>
Signed-off-by: Prashanth K <quic_prashk(a)quicinc.com>
---
 drivers/usb/dwc3/host.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
index 61f57fe5bb78..31a496233d87 100644
--- a/drivers/usb/dwc3/host.c
+++ b/drivers/usb/dwc3/host.c
@@ -89,6 +89,8 @@ int dwc3_host_init(struct dwc3 *dwc)
 
 	memset(props, 0, sizeof(struct property_entry) * ARRAY_SIZE(props));
 
+	props[prop_idx++] = PROPERTY_ENTRY_BOOL("xhci-sg-trb-cache-size-quirk");
+
 	if (dwc->usb3_lpm_capable)
 		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb3-lpm-capable");
 
--
2.25.1
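The `props[prop_idx++] = ...` idiom in the diff builds a list of boolean device properties that the xHCI side later looks up by name. A userspace model of that pattern is sketched below; `struct bool_prop` and `build_host_props` are hypothetical and only imitate the shape of the kernel code, with an explicit capacity check added for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a boolean property entry: presence means "true". */
struct bool_prop {
	const char *name;
};

/*
 * Fill props with the entries the host would pass down and return how
 * many were written. The quirk is added unconditionally, as in the
 * patch; the LPM property depends on a capability flag.
 */
static size_t build_host_props(struct bool_prop *props, size_t cap,
			       bool usb3_lpm_capable)
{
	size_t idx = 0;

	if (idx < cap)
		props[idx++].name = "xhci-sg-trb-cache-size-quirk";

	if (usb3_lpm_capable && idx < cap)
		props[idx++].name = "usb3-lpm-capable";

	return idx;
}
```

Because the quirk entry is appended before any conditional entries, every DWC3 host instance would carry it regardless of which other capabilities are enabled.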

1 year, 11 months

