This targets the 6.12-y branch and fixes stability issues in the flow
scheduling Tx send/clean path that results in a Tx timeouts and can
occasionally crash in certain environments.
The majority of the patches come from the series "idpf: replace Tx flow
scheduling buffer ring with buffer pool" [1] except for the first two
patches which are included as they address additional situations that
can result in Tx timeouts. There are two minor differences from the
original patch (3&8), also noted in the respective patches, for size
assertions due to differences in struct sizes between the original
version and what is present here.
Snippet from the cover letter of the referenced series:
The existing guardrails in the Tx path were not sufficient to prevent
the driver from reusing completion tags that were still in flight (held
by the HW). This collision would cause the driver to erroneously clean
the wrong packet thus leaving the descriptor ring in a bad state.
The main point of this fix is to replace the flow scheduling buffer ring
with a large pool/array of buffers. The completion tag then simply is
the index into this array. The driver tracks the free tags and pulls
the next free one from a refillq. The cleaning routines simply use the
completion tag from the completion descriptor to index into the array to
quickly find the buffers to clean.
All of the code to support this is added first to ensure traffic still
passes with each patch. The final patch then removes all of the
obsolete stashing code.
[1] https://lore.kernel.org/netdev/20250821180100.401955-1-anthony.l.nguyen@int…
---
We do realize this request is larger than stable rules, however, one of
our customers asked if this could be backported to this LTS kernel. We're
hoping this can be accepted since these changes are isolated to this
driver alone and have been tested by the customer and Intel validation.
Joshua Hay (8):
idpf: add support for SW triggered interrupts
idpf: trigger SW interrupt when exiting wb_on_itr mode
idpf: add support for Tx refillqs in flow scheduling mode
idpf: improve when to set RE bit logic
idpf: simplify and fix splitq Tx packet rollback error path
idpf: replace flow scheduling buffer ring with buffer pool
idpf: stop Tx if there are insufficient buffer resources
idpf: remove obsolete stashing code
drivers/net/ethernet/intel/idpf/idpf_dev.c | 3 +
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 61 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 750 +++++++-----------
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 95 +--
drivers/net/ethernet/intel/idpf/idpf_vf_dev.c | 3 +
5 files changed, 390 insertions(+), 522 deletions(-)
--
2.47.1
svm_update_lbrv() always updates LBR MSRs intercepts, even when they are
already set correctly. This results in force_msr_bitmap_recalc always
being set to true on every nested transition, essentially undoing the
hyperv optimization in nested_svm_merge_msrpm().
Fix it by keeping track of whether LBR MSRs are intercepted or not and
only doing the update if needed, similar to x2avic_msrs_intercepted.
Avoid using svm_test_msr_bitmap_*() to check the status of the
intercepts, as an arbitrary MSR will need to be chosen as a
representative of all LBR MSRs, and this could theoretically break if
some of the MSRs intercepts are handled differently from the rest.
Also, using svm_test_msr_bitmap_*() makes backports difficult as it was
only recently introduced with no direct alternatives in older kernels.
Fixes: fbe5e5f030c2 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed(a)linux.dev>
---
arch/x86/kvm/svm/svm.c | 9 ++++++++-
arch/x86/kvm/svm/svm.h | 1 +
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 10c21e4c5406f..9d29b2e7e855d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -705,7 +705,11 @@ void *svm_alloc_permissions_map(unsigned long size, gfp_t gfp_mask)
static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
{
- bool intercept = !(to_svm(vcpu)->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK);
+ struct vcpu_svm *svm = to_svm(vcpu);
+ bool intercept = !(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK);
+
+ if (intercept == svm->lbr_msrs_intercepted)
+ return;
svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW, intercept);
svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW, intercept);
@@ -714,6 +718,8 @@ static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
if (sev_es_guest(vcpu->kvm))
svm_set_intercept_for_msr(vcpu, MSR_IA32_DEBUGCTLMSR, MSR_TYPE_RW, intercept);
+
+ svm->lbr_msrs_intercepted = intercept;
}
void svm_vcpu_free_msrpm(void *msrpm)
@@ -1221,6 +1227,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
}
svm->x2avic_msrs_intercepted = true;
+ svm->lbr_msrs_intercepted = true;
svm->vmcb01.ptr = page_address(vmcb01_page);
svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index c856d8e0f95e7..dd78e64023450 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -336,6 +336,7 @@ struct vcpu_svm {
bool guest_state_loaded;
bool x2avic_msrs_intercepted;
+ bool lbr_msrs_intercepted;
/* Guest GIF value, used when vGIF is not enabled */
bool guest_gif;
base-commit: 8a4821412cf2c1429fffa07c012dd150f2edf78c
--
2.51.2.1041.gc1ab5b90ca-goog
tegra_ahb_enable_smmu() utilizes driver_find_device_by_of_node() which
internally calls driver_find_device() to locate the matching device.
driver_find_device() increments the ref count of the found device by
calling get_device(), but tegra_ahb_enable_smmu() fails to call
put_device() to decrement the reference count before returning. This
results in a reference count leak of the device, which may prevent the
device from being properly released and cause a memory leak.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 89c788bab1f0 ("ARM: tegra: Add SMMU enabler in AHB")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/amba/tegra-ahb.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/amba/tegra-ahb.c b/drivers/amba/tegra-ahb.c
index f23c3ed01810..3ed5cef34806 100644
--- a/drivers/amba/tegra-ahb.c
+++ b/drivers/amba/tegra-ahb.c
@@ -148,6 +148,7 @@ int tegra_ahb_enable_smmu(struct device_node *dn)
val = gizmo_readl(ahb, AHB_ARBITRATION_XBAR_CTRL);
val |= AHB_ARBITRATION_XBAR_CTRL_SMMU_INIT_DONE;
gizmo_writel(ahb, val, AHB_ARBITRATION_XBAR_CTRL);
+ put_device(dev);
return 0;
}
EXPORT_SYMBOL(tegra_ahb_enable_smmu);
--
2.17.1
For some odd reason 5.10 kernel series doesn't compile with a newer
toolchain since 2025-02-09:
2025-02-09T17:32:07.7991299Z GEN .version
2025-02-09T17:32:07.8270062Z CHK include/generated/compile.h
2025-02-09T17:32:07.8540777Z LD vmlinux.o
2025-02-09T17:32:11.7210899Z MODPOST vmlinux.symvers
2025-02-09T17:32:12.0869599Z MODINFO modules.builtin.modinfo
2025-02-09T17:32:12.1403022Z GEN modules.builtin
2025-02-09T17:32:12.1475659Z LD .tmp_vmlinux.btf
2025-02-09T17:32:19.6117204Z BTF .btf.vmlinux.bin.o
2025-02-09T17:32:31.2916650Z LD .tmp_vmlinux.kallsyms1
2025-02-09T17:32:34.8731104Z KSYMS .tmp_vmlinux.kallsyms1.S
2025-02-09T17:32:35.4910608Z AS .tmp_vmlinux.kallsyms1.o
2025-02-09T17:32:35.9662538Z LD .tmp_vmlinux.kallsyms2
2025-02-09T17:32:39.2595984Z KSYMS .tmp_vmlinux.kallsyms2.S
2025-02-09T17:32:39.8802028Z AS .tmp_vmlinux.kallsyms2.o
2025-02-09T17:32:40.3659440Z LD vmlinux
2025-02-09T17:32:48.0031558Z BTFIDS vmlinux
2025-02-09T17:32:48.0143553Z FAILED unresolved symbol filp_close
2025-02-09T17:32:48.5019928Z make: *** [Makefile:1207: vmlinux] Error 255
2025-02-09T17:32:48.5061241Z ==> ERROR: A failure occurred in build().
5.10.234 built fine couple of days ago with the older one. There were
slight changes made. 5.4 and 5.15 still compile.
Wonder what might be missing here ...
--
Best, Philip
Hi Ilpo,
I managed to get my hands on acpidumps for these models so this is
verified against those.
Thanks for all your latest reviews!
Signed-off-by: Kurt Borja <kuurtb(a)gmail.com>
---
Kurt Borja (3):
platform/x86: alienware-wmi-wmax: Add support for new Area-51 laptops
platform/x86: alienware-wmi-wmax: Add AWCC support for Alienware x16
platform/x86: alienware-wmi-wmax: Add support for Alienware 16X Aurora
drivers/platform/x86/dell/alienware-wmi-wmax.c | 32 ++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
---
base-commit: 9b9c0adbc3f8a524d291baccc9d0c04097fb4869
change-id: 20251111-area-51-7e6c2501e4eb
--
~ Kurt
In function `scmi_devm_notifier_unregister` the notifier-block parameter
was unused and therefore never passed to `devres_release`. This causes
the function to always return -ENOENT and fail to unregister the
notifier.
In drivers that rely on this function for cleanup this causes
unexpected failures including kernel-panic.
This is not needed upstream becaues the bug was fixed
in a refactor by commit 264a2c520628 ("firmware: arm_scmi: Simplify
scmi_devm_notifier_unregister"). It is needed for the 5.15, 6.1 and
6.6 kernels.
Cc: <stable(a)vger.kernel.org> # 5.15.x, 6.1.x, and 6.6.x
Fixes: 5ad3d1cf7d34 ("firmware: arm_scmi: Introduce new devres notification ops")
Reviewed-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Reviewed-by: Cristian Marussi <cristian.marussi(a)arm.com>
Signed-off-by: Amitai Gottlieb <amitaig(a)hailo.ai>
---
drivers/firmware/arm_scmi/notify.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/firmware/arm_scmi/notify.c b/drivers/firmware/arm_scmi/notify.c
index 0efd20cd9d69..4782b115e6ec 100644
--- a/drivers/firmware/arm_scmi/notify.c
+++ b/drivers/firmware/arm_scmi/notify.c
@@ -1539,6 +1539,7 @@ static int scmi_devm_notifier_unregister(struct scmi_device *sdev,
dres.handle = sdev->handle;
dres.proto_id = proto_id;
dres.evt_id = evt_id;
+ dres.nb = nb;
if (src_id) {
dres.__src_id = *src_id;
dres.src_id = &dres.__src_id;
--
2.34.1
When the filesystem is being mounted, the kernel panics while the data
regarding slot map allocation to the local node, is being written to the
disk. This occurs because the value of slot map buffer head block
number, which should have been greater than or equal to
`OCFS2_SUPER_BLOCK_BLKNO` (evaluating to 2) is less than it, indicative
of disk metadata corruption. This triggers
BUG_ON(bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) in ocfs2_write_block(),
causing the kernel to panic.
This is fixed by introducing function ocfs2_validate_slot_map_block() to
validate slot map blocks. It first checks if the buffer head passed to it
is up to date and valid, else it panics the kernel at that point itself.
Further, it contains an if condition block, which checks if `bh->b_blocknr`
is lesser than `OCFS2_SUPER_BLOCK_BLKNO`; if yes, then ocfs2_error is
called, which prints the error log, for debugging purposes, and the return
value of ocfs2_error() is returned. If the return value is zero. then error
code EIO is returned.
This function is used as validate function in calls to ocfs2_read_blocks()
in ocfs2_refresh_slot_info() and ocfs2_map_slot_buffers().
In addition, the function also contains
Reported-by: syzbot+c818e5c4559444f88aa0(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c818e5c4559444f88aa0
Tested-by: syzbot+c818e5c4559444f88aa0(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Prithvi Tambewagh <activprithvi(a)gmail.com>
---
v2->v3:
- Create new function ocfs2_validate_slot_map_block() to validate block
number of slot map blocks, to be greater then or equal to
OCFS2_SUPER_BLOCK_BLKNO
- Use ocfs2_validate_slot_map_block() in calls to ocfs2_read_blocks() in
ocfs2_refresh_slot_info() and ocfs2_map_slot_buffers()
- In addition to using previously formulated if block in
ocfs2_validate_slot_map_block(), also check if the buffer head passed
in this function is up to date; if not, then kernel panics at that point
- Change title of patch to 'ocfs2: Add validate function for slot map blocks'
v2 link: https://lore.kernel.org/ocfs2-devel/nwkfpkm2wlajswykywnpt4sc6gdkesakw2sw7et…
v1->v2:
- Remove usage of le16_to_cpu() from ocfs2_error()
- Cast bh->b_blocknr to unsigned long long
- Remove type casting for OCFS2_SUPER_BLOCK_BLKNO
- Fix Sparse warnings reported in v1 by kernel test robot
- Update title from 'ocfs2: Fix kernel BUG in ocfs2_write_block' to
'ocfs2: fix kernel BUG in ocfs2_write_block'
v1 link: https://lore.kernel.org/all/20251206154819.175479-1-activprithvi@gmail.com/…
fs/ocfs2/slot_map.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/fs/ocfs2/slot_map.c b/fs/ocfs2/slot_map.c
index e544c704b583..50ddd7f50f8f 100644
--- a/fs/ocfs2/slot_map.c
+++ b/fs/ocfs2/slot_map.c
@@ -44,6 +44,9 @@ struct ocfs2_slot_info {
static int __ocfs2_node_num_to_slot(struct ocfs2_slot_info *si,
unsigned int node_num);
+static int ocfs2_validate_slot_map_block(struct super_block *sb,
+ struct buffer_head *bh);
+
static void ocfs2_invalidate_slot(struct ocfs2_slot_info *si,
int slot_num)
{
@@ -132,7 +135,8 @@ int ocfs2_refresh_slot_info(struct ocfs2_super *osb)
* this is not true, the read of -1 (UINT64_MAX) will fail.
*/
ret = ocfs2_read_blocks(INODE_CACHE(si->si_inode), -1, si->si_blocks,
- si->si_bh, OCFS2_BH_IGNORE_CACHE, NULL);
+ si->si_bh, OCFS2_BH_IGNORE_CACHE,
+ ocfs2_validate_slot_map_block);
if (ret == 0) {
spin_lock(&osb->osb_lock);
ocfs2_update_slot_info(si);
@@ -332,6 +336,26 @@ int ocfs2_clear_slot(struct ocfs2_super *osb, int slot_num)
return ocfs2_update_disk_slot(osb, osb->slot_info, slot_num);
}
+static int ocfs2_validate_slot_map_block(struct super_block *sb,
+ struct buffer_head *bh)
+{
+ int rc;
+
+ BUG_ON(!buffer_uptodate(bh));
+
+ if (bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) {
+ rc = ocfs2_error(sb,
+ "Invalid Slot Map Buffer Head "
+ "Block Number : %llu, Should be >= %d",
+ (unsigned long long)bh->b_blocknr,
+ OCFS2_SUPER_BLOCK_BLKNO);
+ if (!rc)
+ return -EIO;
+ return rc;
+ }
+ return 0;
+}
+
static int ocfs2_map_slot_buffers(struct ocfs2_super *osb,
struct ocfs2_slot_info *si)
{
@@ -383,7 +407,8 @@ static int ocfs2_map_slot_buffers(struct ocfs2_super *osb,
bh = NULL; /* Acquire a fresh bh */
status = ocfs2_read_blocks(INODE_CACHE(si->si_inode), blkno,
- 1, &bh, OCFS2_BH_IGNORE_CACHE, NULL);
+ 1, &bh, OCFS2_BH_IGNORE_CACHE,
+ ocfs2_validate_slot_map_block);
if (status < 0) {
mlog_errno(status);
goto bail;
base-commit: 24172e0d79900908cf5ebf366600616d29c9b417
--
2.43.0