The patch titled
Subject: mm: vmalloc: check if a hash-index is in cpu_possible_mask
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmalloc-check-if-a-hash-index-is-in-cpu_possible_mask.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com>
Subject: mm: vmalloc: check if a hash-index is in cpu_possible_mask
Date: Wed, 26 Jun 2024 16:03:30 +0200
On some systems, for example SPARC, cpu_possible_mask has gaps between
the set CPUs. In that case the addr_to_vb_xa() hash function can return
an index that refers to a CPU which is not possible, so the per_cpu()
macro accesses a per-CPU area that was never set up. This results in an
oops on SPARC.
The per-cpu vmap_block_queue is also used as a hash table, which
incorrectly assumes that cpu_possible_mask has no gaps. Fix it by
adjusting the index to the next possible CPU.
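The indexing problem described above can be illustrated with a small userspace sketch. This is not kernel code: NR_CPU_IDS, the possible[] array, count_possible() and next_possible() are hypothetical stand-ins for nr_cpu_ids, cpu_possible_mask, num_possible_cpus() and cpumask_next(), and the mask layout (CPUs 0, 2, 5 and 7) is made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for nr_cpu_ids: highest possible CPU id + 1. */
#define NR_CPU_IDS 8

/* Hypothetical sparse mask: only CPUs 0, 2, 5 and 7 are possible. */
static const bool possible[NR_CPU_IDS] = {
	true, false, true, false, false, true, false, true,
};

/* Stand-in for num_possible_cpus(): number of set bits in the mask. */
static int count_possible(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		n += possible[cpu];
	return n;
}

/* Stand-in for cpumask_next(): first possible CPU with an id above n. */
static int next_possible(int n)
{
	for (int cpu = n + 1; cpu < NR_CPU_IDS; cpu++)
		if (possible[cpu])
			return cpu;
	return NR_CPU_IDS;
}

/* Old scheme: modulo the number of possible CPUs, assuming no gaps. */
static int old_index(unsigned long block)
{
	return (int)(block % (unsigned long)count_possible());
}

/* New scheme: modulo the id range, then skip forward over a hole. */
static int new_index(unsigned long block)
{
	int index = (int)(block % NR_CPU_IDS);

	if (!possible[index])
		index = next_possible(index);
	return index;
}
```

With this made-up mask, old_index(1) yields 1, which is not a possible CPU, while new_index(1) yields 2. Because the last id in the range is itself possible, the forward search in new_index() never runs past the end of the mask.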
Link: https://lkml.kernel.org/r/20240626140330.89836-1-urezki@gmail.com
Fixes: 062eacf57ad9 ("mm: vmalloc: remove a global vmap_blocks xarray")
Reported-by: Nick Bowler <nbowler(a)draconx.ca>
Closes: https://lore.kernel.org/linux-kernel/ZntjIE6msJbF8zTa@MiWiFi-R3L-srv/T/
Signed-off-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Hailong.Liu <hailong.liu(a)oppo.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko(a)sony.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmalloc.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/mm/vmalloc.c~mm-vmalloc-check-if-a-hash-index-is-in-cpu_possible_mask
+++ a/mm/vmalloc.c
@@ -2543,7 +2543,15 @@ static DEFINE_PER_CPU(struct vmap_block_
static struct xarray *
addr_to_vb_xa(unsigned long addr)
{
- int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
+ int index = (addr / VMAP_BLOCK_SIZE) % nr_cpu_ids;
+
+ /*
+ * Please note, nr_cpu_ids points on a highest set
+ * possible bit, i.e. we never invoke cpumask_next()
+ * if an index points on it which is nr_cpu_ids - 1.
+ */
+ if (!cpu_possible(index))
+ index = cpumask_next(index, cpu_possible_mask);
return &per_cpu(vmap_block_queue, index).vmap_blocks;
}
_
Patches currently in -mm which might be from urezki(a)gmail.com are
mm-vmalloc-check-if-a-hash-index-is-in-cpu_possible_mask.patch
From: Chuck Lever <chuck.lever(a)oracle.com>
Hi-
It was pointed out that the NFSD fixes that went into 5.10.220 were
missing a few forward fixes from upstream. These five are the ones
I identified. I've run them through the usual NFSD CI tests and
found no new issues.
Chuck Lever (3):
SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency()
SUNRPC: Fix svcxdr_init_decode's end-of-buffer calculation
SUNRPC: Fix svcxdr_init_encode's buflen calculation
Jeff Layton (1):
nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
Yunjian Wang (1):
SUNRPC: Fix null pointer dereference in svc_rqst_free()
fs/nfsd/nfs4state.c | 7 ++-----
include/linux/sunrpc/svc.h | 20 ++++++++++++++++----
include/trace/events/sunrpc.h | 8 ++++----
net/sunrpc/svc.c | 18 +++++++++++++++++-
4 files changed, 39 insertions(+), 14 deletions(-)
--
2.45.1
Work for __counted_by on generic pointers in structures (not just
flexible array members) has started landing in Clang 19 (current tip of
tree). During the development of this feature, a restriction was added
to __counted_by to prevent the flexible array member's element type from
including a flexible array member itself such as:
struct foo {
int count;
char buf[];
};
struct bar {
int count;
struct foo data[] __counted_by(count);
};
because the size of data cannot be calculated with the standard array
size formula:
sizeof(struct foo) * count
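As a rough illustration of why that formula falls short, the sketch below compares it with the bytes needed when every element carries trailing data. The helper names naive_bytes() and actual_bytes() are hypothetical, and the per-element buf_len is an assumption made for the example; nothing here reflects how the compiler itself reasons about __counted_by.

```c
#include <stddef.h>

struct foo {
	int count;
	char buf[];	/* flexible array member: not counted by sizeof */
};

/* What the standard array size formula would compute. */
static size_t naive_bytes(size_t nelems)
{
	return sizeof(struct foo) * nelems;
}

/*
 * Bytes needed if every element carries buf_len trailing bytes.
 * Hypothetical layout: ignores any alignment padding between elements.
 */
static size_t actual_bytes(size_t nelems, size_t buf_len)
{
	return (sizeof(struct foo) + buf_len) * nelems;
}
```

For four elements with 16 trailing bytes each, actual_bytes(4, 16) exceeds naive_bytes(4) by 64 bytes, and that difference is something the count alone cannot express.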
This restriction was downgraded to a warning but due to CONFIG_WERROR,
it can still break the build. The application of __counted_by on the fod
member of 'struct nvmet_fc_tgt_queue' triggers this restriction,
resulting in:
drivers/nvme/target/fc.c:151:2: error: 'counted_by' should not be applied to an array with element of unknown size because 'struct nvmet_fc_fcp_iod' is a struct type with a flexible array member. This will be an error in a future compiler version [-Werror,-Wbounds-safety-counted-by-elt-type-unknown-size]
151 | struct nvmet_fc_fcp_iod fod[] __counted_by(sqsize);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Remove this use of __counted_by to fix the warning/error. However,
rather than remove it altogether, leave it commented out, as it may be
possible to support this in future compiler releases.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2027
Fixes: ccd3129aca28 ("nvmet-fc: Annotate struct nvmet_fc_tgt_queue with __counted_by")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/nvme/target/fc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 337ee1cb09ae..381b4394731f 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -148,7 +148,7 @@ struct nvmet_fc_tgt_queue {
struct workqueue_struct *work_q;
struct kref ref;
/* array of fcp_iods */
- struct nvmet_fc_fcp_iod fod[] __counted_by(sqsize);
+ struct nvmet_fc_fcp_iod fod[] /* __counted_by(sqsize) */;
} __aligned(sizeof(unsigned long long));
struct nvmet_fc_hostport {
---
base-commit: c758b77d4a0a0ed3a1292b3fd7a2aeccd1a169a4
change-id: 20240529-drop-counted-by-fod-nvmet-fc-tgt-queue-50edd2f8d60e
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
For Gen-1 targets like IPQ6018, stressing the controller in host mode
results in an "HC died" error:
xhci-hcd.12.auto: xHCI host not responding to stop endpoint command
xhci-hcd.12.auto: xHCI host controller not responding, assume dead
xhci-hcd.12.auto: HC died; cleaning up
Once this happens, only restarting host mode recovers the controller.
Disable the SuperSpeed instance in park mode for IPQ6018 to mitigate this
issue.
Cc: <stable(a)vger.kernel.org>
Fixes: 20bb9e3dd2e4 ("arm64: dts: qcom: ipq6018: add usb3 DT description")
Signed-off-by: Krishna Kurapati <quic_kriskura(a)quicinc.com>
---
arch/arm64/boot/dts/qcom/ipq6018.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
index 9694140881c6..8b63c1a6da10 100644
--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
@@ -685,6 +685,7 @@ dwc_0: usb@8a00000 {
clocks = <&xo>;
clock-names = "ref";
tx-fifo-resize;
+ snps,parkmode-disable-ss-quirk;
snps,is-utmi-l1-suspend;
snps,hird-threshold = /bits/ 8 <0x0>;
snps,dis_u2_susphy_quirk;
--
2.34.1
For Gen-1 targets like MSM8998, stressing the controller in host mode
results in an "HC died" error:
xhci-hcd.12.auto: xHCI host not responding to stop endpoint command
xhci-hcd.12.auto: xHCI host controller not responding, assume dead
xhci-hcd.12.auto: HC died; cleaning up
Once this happens, only restarting host mode recovers the controller.
Disable the SuperSpeed instance in park mode for MSM8998 to mitigate this
issue.
Cc: <stable(a)vger.kernel.org>
Fixes: 026dad8f5873 ("arm64: dts: qcom: msm8998: Add USB-related nodes")
Signed-off-by: Krishna Kurapati <quic_kriskura(a)quicinc.com>
---
arch/arm64/boot/dts/qcom/msm8998.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/qcom/msm8998.dtsi b/arch/arm64/boot/dts/qcom/msm8998.dtsi
index 3c94d823a514..6d5e1c7b2da5 100644
--- a/arch/arm64/boot/dts/qcom/msm8998.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8998.dtsi
@@ -2144,6 +2144,7 @@ usb3_dwc3: usb@a800000 {
interrupts = <GIC_SPI 131 IRQ_TYPE_LEVEL_HIGH>;
snps,dis_u2_susphy_quirk;
snps,dis_enblslpm_quirk;
+ snps,parkmode-disable-ss-quirk;
phys = <&qusb2phy>, <&usb3phy>;
phy-names = "usb2-phy", "usb3-phy";
snps,has-lpm-erratum;
--
2.34.1
[Why]
After suspend/resume with the topology unchanged, link_address_sent of
every mstb is marked as false even though topology probing completed
without any error.
This is caused by wrongly treating the "ret == 0" case as a probing
failure as well.
[How]
Remove inappropriate checking conditions.
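The effect of the condition change can be sketched in isolation. The two helpers below are hypothetical and are not part of the DRM API; they only model how the link_address_sent flag ends up after each form of the check, under the assumption that negative values are errors and non-negative values are not.

```c
#include <stdbool.h>

/* Old check: a return value of 0 is lumped in with real errors. */
static bool flag_after_old_check(bool link_address_sent, int ret)
{
	if (ret <= 0)
		link_address_sent = false;
	return link_address_sent;
}

/* New check: only negative values count as a probing failure. */
static bool flag_after_new_check(bool link_address_sent, int ret)
{
	if (ret < 0)
		link_address_sent = false;
	return link_address_sent;
}
```

With ret == 0, the old form clears the flag while the new form leaves it set; both forms still clear it for a negative ret.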
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Harry Wentland <hwentlan(a)amd.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: Imre Deak <imre.deak(a)intel.com>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: stable(a)vger.kernel.org
Fixes: 37dfdc55ffeb ("drm/dp_mst: Cleanup drm_dp_send_link_address() a bit")
Signed-off-by: Wayne Lin <Wayne.Lin(a)amd.com>
---
drivers/gpu/drm/display/drm_dp_mst_topology.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index 7f8e1cfbe19d..68831f4e502a 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -2929,7 +2929,7 @@ static int drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr,
/* FIXME: Actually do some real error handling here */
ret = drm_dp_mst_wait_tx_reply(mstb, txmsg);
- if (ret <= 0) {
+ if (ret < 0) {
drm_err(mgr->dev, "Sending link address failed with %d\n", ret);
goto out;
}
@@ -2981,7 +2981,7 @@ static int drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr,
mutex_unlock(&mgr->lock);
out:
- if (ret <= 0)
+ if (ret < 0)
mstb->link_address_sent = false;
kfree(txmsg);
return ret < 0 ? ret : changed;
--
2.37.3
From: Reka Norman <rekanorman(a)chromium.org>
TGL systems have the same issue as ADL, where a large boot firmware
delay is seen if USB ports are left in U3 at shutdown. So apply the
XHCI_RESET_TO_DEFAULT quirk to TGL as well.
The issue it fixes is a ~20s boot time delay when booting from S5. It
affects TGL devices, and TGL support was added starting from v5.3.
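The device match that selects the quirk can be modelled outside the driver as follows. This is a simplified sketch, not the xhci-pci code: the DEMO_ names and needs_reset_to_default() are hypothetical, while the device ids are the ones that appear in the patch and 0x8086 is Intel's PCI vendor id.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VENDOR_INTEL		0x8086
#define DEMO_ID_TIGER_LAKE_PCH_XHCI	0xa0ed
#define DEMO_ID_ALDER_LAKE_PCH_XHCI	0x51ed
#define DEMO_ID_ALDER_LAKE_N_PCH_XHCI	0x54ed

/* Returns true for controllers that should get the reset-to-default quirk. */
static bool needs_reset_to_default(uint16_t vendor, uint16_t device)
{
	if (vendor != DEMO_VENDOR_INTEL)
		return false;

	switch (device) {
	case DEMO_ID_TIGER_LAKE_PCH_XHCI:
	case DEMO_ID_ALDER_LAKE_PCH_XHCI:
	case DEMO_ID_ALDER_LAKE_N_PCH_XHCI:
		return true;
	default:
		return false;
	}
}
```

In this model only the PCH controller id 0xa0ed is added for TGL; the existing Tiger Lake id 0x9a13 is left out of the match, mirroring the condition in the patch.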
Cc: stable(a)vger.kernel.org
Signed-off-by: Reka Norman <rekanorman(a)chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-pci.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 05881153883e..dc1e345ab67e 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -50,6 +50,7 @@
#define PCI_DEVICE_ID_INTEL_DENVERTON_XHCI 0x19d0
#define PCI_DEVICE_ID_INTEL_ICE_LAKE_XHCI 0x8a13
#define PCI_DEVICE_ID_INTEL_TIGER_LAKE_XHCI 0x9a13
+#define PCI_DEVICE_ID_INTEL_TIGER_LAKE_PCH_XHCI 0xa0ed
#define PCI_DEVICE_ID_INTEL_COMET_LAKE_XHCI 0xa3af
#define PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI 0x51ed
#define PCI_DEVICE_ID_INTEL_ALDER_LAKE_N_PCH_XHCI 0x54ed
@@ -373,7 +374,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_MISSING_CAS;
if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
- (pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI ||
+ (pdev->device == PCI_DEVICE_ID_INTEL_TIGER_LAKE_PCH_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI ||
pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_N_PCH_XHCI))
xhci->quirks |= XHCI_RESET_TO_DEFAULT;
--
2.25.1
BPF kfuncs are often not directly referenced and may be inadvertently
removed by optimization steps during kernel builds, thus the __bpf_kfunc
tag mitigates against this removal by including the __used macro. However,
this macro alone does not prevent removal during linking, and may still
yield build warnings (e.g. on mips64el):
LD vmlinux
BTFIDS vmlinux
WARN: resolve_btfids: unresolved symbol bpf_verify_pkcs7_signature
WARN: resolve_btfids: unresolved symbol bpf_lookup_user_key
WARN: resolve_btfids: unresolved symbol bpf_lookup_system_key
WARN: resolve_btfids: unresolved symbol bpf_key_put
WARN: resolve_btfids: unresolved symbol bpf_iter_task_next
WARN: resolve_btfids: unresolved symbol bpf_iter_css_task_new
WARN: resolve_btfids: unresolved symbol bpf_get_file_xattr
WARN: resolve_btfids: unresolved symbol bpf_ct_insert_entry
WARN: resolve_btfids: unresolved symbol bpf_cgroup_release
WARN: resolve_btfids: unresolved symbol bpf_cgroup_from_id
WARN: resolve_btfids: unresolved symbol bpf_cgroup_acquire
WARN: resolve_btfids: unresolved symbol bpf_arena_free_pages
NM System.map
SORTTAB vmlinux
OBJCOPY vmlinux.32
Update the __bpf_kfunc tag to better guard against linker optimization by
including the new __retain compiler macro, which fixes the warnings above.
Verify the __retain macro with readelf by checking object flags for 'R':
$ readelf -Wa kernel/trace/bpf_trace.o
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
...
[178] .text.bpf_key_put PROGBITS 00000000 6420 0050 00 AXR 0 0 8
...
Key to Flags:
...
R (retain), D (mbind), p (processor specific)
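The combination of attributes can be tried in userspace with a sketch like the one below. DEMO_RETAIN, DEMO_KFUNC and demo_kfunc_add() are hypothetical names, and the __has_attribute guard is an assumption made so the example still builds with compilers that lack the retain attribute; the kernel's own __retain definition may be conditioned differently.

```c
/* Fall back to an empty macro when the compiler lacks the attribute. */
#ifdef __has_attribute
# if __has_attribute(__retain__)
#  define DEMO_RETAIN __attribute__((__retain__))
# endif
#endif
#ifndef DEMO_RETAIN
# define DEMO_RETAIN
#endif

/*
 * "used" keeps the compiler from discarding an unreferenced function,
 * "retain" asks the linker to keep its section during garbage
 * collection, and "noinline" keeps it as a standalone symbol.
 */
#define DEMO_KFUNC __attribute__((__used__)) DEMO_RETAIN __attribute__((__noinline__))

/* Hypothetical function standing in for a kfunc with no direct callers. */
DEMO_KFUNC int demo_kfunc_add(int a, int b)
{
	return a + b;
}
```

Building this with -ffunction-sections and inspecting the object with readelf, as in the commit message, should show the 'R' flag on the function's section when the toolchain supports the attribute.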
Link: https://lore.kernel.org/bpf/ZlmGoT9KiYLZd91S@krava/T/
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/r/202401211357.OCX9yllM-lkp@intel.com/
Fixes: 57e7c169cd6a ("bpf: Add __bpf_kfunc tag for marking kernel functions as kfuncs")
Cc: stable(a)vger.kernel.org # v6.6+
Signed-off-by: Tony Ambardar <Tony.Ambardar(a)gmail.com>
---
include/linux/btf.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index f9e56fd12a9f..7c3e40c3295e 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -82,7 +82,7 @@
* as to avoid issues such as the compiler inlining or eliding either a static
* kfunc, or a global kfunc in an LTO build.
*/
-#define __bpf_kfunc __used noinline
+#define __bpf_kfunc __used __retain noinline
#define __bpf_kfunc_start_defs() \
__diag_push(); \
--
2.34.1