The RK3399 Puma SoM contains the internal Cypress CYUSB3304 USB
hub, that shows instability due to improper reset pin configuration.
Currently reset pin is modeled as a vcc5v0_host regulator, that
might result in too short reset pulse duration.
Starting with the v6.6, the Onboard USB hub driver (later renamed
to Onboard USB dev) contains support for Cypress HX3 hub family.
It can be now used to correctly model the RK3399 Puma SoM hardware.
The first commits in this series fix the onboard USB dev driver to
support all HX3 hub variants, including the CYUSB3304 found in
the RK3399 Puma SoM.
This allows to introduce fix for internal USB hub instability on
RK3399 Puma, by replacing the vcc5v0_host regulator with
cy3304_reset, used inside the hub node.
Please be aware that the patch that fixes USB hub instability in
arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi can me merged only
after updating the Onboard USB dev driver, otherwise the hub
will not work.
Two last commits in the series disable unrouted USB controllers
and PHYs on RK3399 Puma SOM and Haikou carrier board, with no
intended functional changes.
This series depends on the patch:
Link: https://lore.kernel.org/linux-usb/20250418-dt-binding-usb-device-compatible…
("dt-bindings: usb: usb-device: relax compatible pattern to a contains")
Signed-off-by: Lukasz Czechowski <lukasz.czechowski(a)thaumatec.com>
---
Changes in v2:
- Removed additional entries from onboard_dev_match table and
updated dt-bindings list, as suggested by Krzysztof and Conor.
Fallback compatible entry in SoM's dtsi file is used instead.
- Added vdd-supply and vdd2-supply entries to onboard hub nodes
to satisfy bindings checks.
- Changed the default cy3304-reset pin configuration to pcfg_output_high.
- Added dependency to: change-id: 20250415-dt-binding-usb-device-compatibles-188f7b0a81b4
- Link to v1: https://lore.kernel.org/r/20250326-onboard_usb_dev-v1-0-a4b0a5d1b32c@thauma…
---
Lukasz Czechowski (3):
usb: misc: onboard_usb_dev: fix support for Cypress HX3 hubs
dt-bindings: usb: cypress,hx3: Add support for all variants
arm64: dts: rockchip: fix internal USB hub instability on RK3399 Puma
Quentin Schulz (2):
arm64: dts: rockchip: disable unrouted USB controllers and PHY on RK3399 Puma
arm64: dts: rockchip: disable unrouted USB controllers and PHY on RK3399 Puma with Haikou
.../devicetree/bindings/usb/cypress,hx3.yaml | 19 +++++++--
.../arm64/boot/dts/rockchip/rk3399-puma-haikou.dts | 8 ----
arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi | 48 +++++++++++-----------
drivers/usb/misc/onboard_usb_dev.c | 10 ++++-
4 files changed, 48 insertions(+), 37 deletions(-)
---
base-commit: 834a4a689699090a406d1662b03affa8b155d025
change-id: 20250326-onboard_usb_dev-a7c063a8a515
prerequisite-change-id: 20250415-dt-binding-usb-device-compatibles-188f7b0a81b4:v2
prerequisite-patch-id: f5b90f95302ac9065fbbe5244cc7845c2a772ab6
Best regards,
--
Lukasz Czechowski <lukasz.czechowski(a)thaumatec.com>
If the 'buf' array received from the user contains an empty string, the
'length' variable will be zero. Accessing the 'buf' array element with
index 'length - 1' will result in a buffer overflow.
Add a check for an empty string.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vladimir Moskovkin <Vladimir.Moskovkin(a)kaspersky.com>
---
drivers/platform/x86/dell/dell-wmi-sysman/passobj-attributes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/dell/dell-wmi-sysman/passobj-attributes.c b/drivers/platform/x86/dell/dell-wmi-sysman/passobj-attributes.c
index 230e6ee96636..d8f1bf5e58a0 100644
--- a/drivers/platform/x86/dell/dell-wmi-sysman/passobj-attributes.c
+++ b/drivers/platform/x86/dell/dell-wmi-sysman/passobj-attributes.c
@@ -45,7 +45,7 @@ static ssize_t current_password_store(struct kobject *kobj,
int length;
length = strlen(buf);
- if (buf[length-1] == '\n')
+ if (length && buf[length - 1] == '\n')
length--;
/* firmware does verifiation of min/max password length,
--
2.25.1
From: Michael Kelley <mhklinux(a)outlook.com>
Starting with commit dca5161f9bd0 in the 6.3 kernel, the Linux driver
for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9bd0 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.
This patch set fixes the root cause of the problem, which is that the
netvsc driver in Linux may send an incorrectly formatted VMBus message
to Hyper-V when transmitting the network packet. The incorrect
formatting occurs when the rndis header of the VMBus message crosses a
page boundary due to how the Linux skb head memory is aligned. In such
a case, two PFNs are required to describe the location of the rndis
header, even though they are contiguous in guest physical address
(GPA) space. Hyper-V requires that two PFNs be in a single "GPA range"
data struture, but current netvsc code puts each PFN in its own GPA
range, which Hyper-V rejects as an error in the case of the rndis
header.
The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.
After commit 14ad6ed30a10 in the 6.14 kernel, the error occurs much
more frequently in VMs with 16 or more vCPUs. It may occur every few
seconds, or even more frequently, in a ssh session that outputs a lot
of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more. The changes in
commit 14ad6ed30a10 are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.
One fix is to check for adjacent PFNs in vmbus_sendpacket_pagebuffer()
and just combine them into a single GPA range. Such a fix is very
contained. But conceptually it is fixing the problem at the wrong
level. So this patch set takes the broader approach of maintaining
the already known grouping of contiguous PFNs at a higher level in
the netvsc driver code, and propagating that grouping down to the
creation of the VMBus message to send to Hyper-V. Maintaining the
grouping fixes this problem, and has the added benefit of allowing
netvsc_dma_map() to make fewer calls to dma_map_single() to do bounce
buffering in CoCo VMs.
Patch 1 is a preparatory change to allow vmbus_sendpacket_mpb_desc()
to specify multiple GPA ranges. In current code
vmbus_sendpacket_mpb_desc() is used only by the storvsc synthetic SCSI
driver, and it always creates a single GPA range.
Patch 2 updates the netvsc driver to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(). Because the higher levels of
netvsc still don't group contiguous PFNs, this patch is functionally
neutral. The VMBus message to Hyper-V still has many GPA ranges, each
with a single PFN. But it lays the groundwork for the next patch.
Patch 3 changes the higher levels of netvsc to preserve the already
known grouping of contiguous PFNs. When the contiguous groupings are
passed to vmbus_sendpacket_mpb_desc(), GPA ranges containing multiple
PFNs are produced, as expected by Hyper-V. This is point at which the
core problem is fixed.
Patches 4 and 5 remove code that is no longer necessary after the
previous patches.
These changes provide a net reduction of about 65 lines of code, which
is an added benefit.
These changes have been tested in normal VMs, in SEV-SNP and TDX CoCo
VMs, and in Dv6-series VMs where the netvsp implementation is in the
OpenHCL paravisor instead of the Hyper-V host.
These changes are built against kernel version 6.15-rc6.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
Michael Kelley (5):
Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple
ranges
hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
hv_netvsc: Remove rmsg_pgcnt
Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
drivers/hv/channel.c | 65 ++-----------------------------
drivers/net/hyperv/hyperv_net.h | 13 ++++++-
drivers/net/hyperv/netvsc.c | 57 ++++++++++++++++++++++-----
drivers/net/hyperv/netvsc_drv.c | 62 +++++++----------------------
drivers/net/hyperv/rndis_filter.c | 24 +++---------
drivers/scsi/storvsc_drv.c | 1 +
include/linux/hyperv.h | 7 ----
7 files changed, 83 insertions(+), 146 deletions(-)
--
2.25.1
Microdia JP001 does not support reading the sample rate which leads to
many lines of "cannot get freq at ep 0x84".
This patch adds the USB ID to quirks.c and avoids those error messages.
usb 7-4: New USB device found, idVendor=0c45, idProduct=636b, bcdDevice= 1.00
usb 7-4: New USB device strings: Mfr=2, Product=1, SerialNumber=3
usb 7-4: Product: JP001
usb 7-4: Manufacturer: JP001
usb 7-4: SerialNumber: JP001
usb 7-4: 3:1: cannot get freq at ep 0x84
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Nicolas Chauvet <kwizart(a)gmail.com>
---
sound/usb/quirks.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
index 9112313a9dbc..d3e45240329d 100644
--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -2242,6 +2242,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
QUIRK_FLAG_CTL_MSG_DELAY_1M),
DEVICE_FLG(0x0c45, 0x6340, /* Sonix HD USB Camera */
QUIRK_FLAG_GET_SAMPLE_RATE),
+ DEVICE_FLG(0x0c45, 0x636b, /* Microdia JP001 USB Camera */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
DEVICE_FLG(0x0d8c, 0x0014, /* USB Audio Device */
QUIRK_FLAG_CTL_MSG_DELAY_1M),
DEVICE_FLG(0x0ecb, 0x205c, /* JBL Quantum610 Wireless */
--
2.49.0
Hi All,
Chages since v7:
- drop "unnecessary free pages" optimization
- fix error path page leak
Chages since v6:
- do not unnecessary free pages across iterations
Chages since v5:
- full error message included into commit description
Chages since v4:
- unused pages leak is avoided
Chages since v3:
- pfn_to_virt() changed to page_to_virt() due to compile error
Chages since v2:
- page allocation moved out of the atomic context
Chages since v1:
- Fixes: and -stable tags added to the patch description
Thanks!
Alexander Gordeev (1):
kasan: Avoid sleepable page allocation from atomic context
mm/kasan/shadow.c | 77 ++++++++++++++++++++++++++++++++++++++---------
1 file changed, 63 insertions(+), 14 deletions(-)
--
2.45.2
If the directory is corrupted and the number of nlinks is less than 2
(valid nlinks have at least 2), then when the directory is deleted, the
minix_rmdir will try to reduce the nlinks(unsigned int) to a negative
value.
Make nlinks validity check for directory.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrey Kriulin <kitotavrik.media(a)gmail.com>
Signed-off-by: Andrey Kriulin <kitotavrik.s(a)gmail.com>
---
v2: Move nlinks validaty check to V[12]_minix_iget() per Jan Kara
<jack(a)suse.cz> request. Change return error code to EUCLEAN. Don't block
directory in r/o mode per Al Viro <viro(a)zeniv.linux.org.uk> request.
fs/minix/inode.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index f007e389d5d2..d815397b8b0d 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -517,6 +517,14 @@ static struct inode *V1_minix_iget(struct inode *inode)
iget_failed(inode);
return ERR_PTR(-ESTALE);
}
+ if (S_ISDIR(raw_inode->i_mode) && raw_inode->i_nlinks < 2) {
+ printk("MINIX-fs: inode directory with corrupted number of links");
+ if (!sb_rdonly(inode->i_sb)) {
+ brelse(bh);
+ iget_failed(inode);
+ return ERR_PTR(-EUCLEAN);
+ }
+ }
inode->i_mode = raw_inode->i_mode;
i_uid_write(inode, raw_inode->i_uid);
i_gid_write(inode, raw_inode->i_gid);
@@ -555,6 +563,14 @@ static struct inode *V2_minix_iget(struct inode *inode)
iget_failed(inode);
return ERR_PTR(-ESTALE);
}
+ if (S_ISDIR(raw_inode->i_mode) && raw_inode->i_nlinks < 2) {
+ printk("MINIX-fs: inode directory with corrupted number of links");
+ if (!sb_rdonly(inode->i_sb)) {
+ brelse(bh);
+ iget_failed(inode);
+ return ERR_PTR(-EUCLEAN);
+ }
+ }
inode->i_mode = raw_inode->i_mode;
i_uid_write(inode, raw_inode->i_uid);
i_gid_write(inode, raw_inode->i_gid);
--
2.47.2
The avs_card_suspend_pre() and avs_card_resume_post() in rt274
calls the snd_soc_card_get_codec_dai(), but does not check its return
value which is a null pointer if the function fails. This can result
in a null pointer dereference. A proper implementation can be found
in acp5x_nau8821_hw_params() and card_suspend_pre().
Add a null pointer check for snd_soc_card_get_codec_dai() to avoid null
pointer dereference when the function fails.
Fixes: a08797afc1f9 ("ASoC: Intel: avs: rt274: Refactor jack handling")
Cc: stable(a)vger.kernel.org # v6.2
Signed-off-by: Wentao Liang <vulab(a)iscas.ac.cn>
---
sound/soc/intel/avs/boards/rt274.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/sound/soc/intel/avs/boards/rt274.c b/sound/soc/intel/avs/boards/rt274.c
index 4b6c02a40204..7a8b6ee79f4c 100644
--- a/sound/soc/intel/avs/boards/rt274.c
+++ b/sound/soc/intel/avs/boards/rt274.c
@@ -194,6 +194,11 @@ static int avs_card_suspend_pre(struct snd_soc_card *card)
{
struct snd_soc_dai *codec_dai = snd_soc_card_get_codec_dai(card, RT274_CODEC_DAI);
+ if (!codec_dai) {
+ dev_err(card->dev, "Codec dai not found\n");
+ return -EINVAL;
+ }
+
return snd_soc_component_set_jack(codec_dai->component, NULL, NULL);
}
@@ -202,6 +207,11 @@ static int avs_card_resume_post(struct snd_soc_card *card)
struct snd_soc_dai *codec_dai = snd_soc_card_get_codec_dai(card, RT274_CODEC_DAI);
struct snd_soc_jack *jack = snd_soc_card_get_drvdata(card);
+ if (!codec_dai) {
+ dev_err(card->dev, "Codec dai not found\n");
+ return -EINVAL;
+ }
+
return snd_soc_component_set_jack(codec_dai->component, jack, NULL);
}
--
2.42.0.windows.2
The function avs_card_suspend_pre() in nau8825 calls the function
snd_soc_card_get_codec_dai(), but does not check its return
value which is a null pointer if the function fails. This can result
in a null pointer dereference. A proper implementation can be found
in acp5x_nau8821_hw_params() and card_suspend_pre().
Add a null pointer check for snd_soc_card_get_codec_dai() to avoid null
pointer dereference when the function fails.
Fixes: 9febcd7a0180 ("ASoC: Intel: avs: nau8825: Refactor jack handling")
Cc: stable(a)vger.kernel.org # v6.2
Signed-off-by: Wentao Liang <vulab(a)iscas.ac.cn>
---
sound/soc/intel/avs/boards/nau8825.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/sound/soc/intel/avs/boards/nau8825.c b/sound/soc/intel/avs/boards/nau8825.c
index bf902540744c..5baeb95cd5a6 100644
--- a/sound/soc/intel/avs/boards/nau8825.c
+++ b/sound/soc/intel/avs/boards/nau8825.c
@@ -220,6 +220,11 @@ static int avs_card_suspend_pre(struct snd_soc_card *card)
{
struct snd_soc_dai *codec_dai = snd_soc_card_get_codec_dai(card, SKL_NUVOTON_CODEC_DAI);
+ if (!codec_dai) {
+ dev_err(card->dev, "Codec dai not found\n");
+ return -EINVAL;
+ }
+
return snd_soc_component_set_jack(codec_dai->component, NULL, NULL);
}
--
2.42.0.windows.2
Hi,
As an exhibiting company at All-Energy Exhibition and Conference 2025 we have access to the fully updated attendee list, including last-minute registrants who attended the show.
This exclusive list can help you:
✅ Connect with high-intent leads
✅ Follow up with attendees who visited (or missed) your booth
✅ Maximize your ROI from the event
If you're Interested, I’d love to share more details—and I can also provide pricing information.
Looking forward to your thoughts.
Regards,
Jack Reacher
Sr. Marketing Manager
If you do not wish to receive this newsletter reply as "Not interested"
The series contains several small patches to fix various
issues in the pinctrl driver for Armada 3700.
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
---
Changes in v2:
- remove 'stable' and 'Fixes' tags from the error propagating patches
- collect 'Reviewed-by' tags from Andrew
- swap patches 2 and 3 so the bug fix in the latter can be applied cleanly
without depending on the change in the former
- Link to v1: https://lore.kernel.org/r/20250512-pinctrl-a37xx-fixes-v1-0-d470fb1116a5@gm…
---
Gabor Juhos (7):
pinctrl: armada-37xx: use correct OUTPUT_VAL register for GPIOs > 31
pinctrl: armada-37xx: set GPIO output value before setting direction
pinctrl: armada-37xx: propagate error from armada_37xx_gpio_direction_output()
pinctrl: armada-37xx: propagate error from armada_37xx_gpio_get()
pinctrl: armada-37xx: propagate error from armada_37xx_pmx_gpio_set_direction()
pinctrl: armada-37xx: propagate error from armada_37xx_gpio_get_direction()
pinctrl: armada-37xx: propagate error from armada_37xx_pmx_set_by_name()
drivers/pinctrl/mvebu/pinctrl-armada-37xx.c | 35 ++++++++++++++++-------------
1 file changed, 20 insertions(+), 15 deletions(-)
---
base-commit: 82f2b0b97b36ee3fcddf0f0780a9a0825d52fec3
change-id: 20250512-pinctrl-a37xx-fixes-98fabc45cb11
Best regards,
--
Gabor Juhos <j4g8y7(a)gmail.com>
In some cases, there is a small-time gap in which CMD_RING_BUSY
can be cleared by controller but adding command completion event
to event ring will be delayed. As the result driver will return
error code.
This behavior has been detected on usbtest driver (test 9) with
configuration including ep1in/ep1out bulk and ep2in/ep2out isoc
endpoint.
Probably this gap occurred because controller was busy with adding
some other events to event ring.
The CMD_RING_BUSY is cleared to '0' when the Command Descriptor
has been executed and not when command completion event has been
added to event ring.
To fix this issue for this test the small delay is sufficient
less than 10us) but to make sure the problem doesn't happen again
in the future the patch introduces 10 retries to check with delay
about 20us before returning error code.
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
cc: stable(a)vger.kernel.org
Signed-off-by: Pawel Laszczak <pawell(a)cadence.com>
---
Changelog:
v2:
- replaced usleep_range with udelay
- increased retry counter and decreased the udelay value
drivers/usb/cdns3/cdnsp-gadget.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/cdns3/cdnsp-gadget.c b/drivers/usb/cdns3/cdnsp-gadget.c
index 4824a10df07e..58650b7f4173 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.c
+++ b/drivers/usb/cdns3/cdnsp-gadget.c
@@ -547,6 +547,7 @@ int cdnsp_wait_for_cmd_compl(struct cdnsp_device *pdev)
dma_addr_t cmd_deq_dma;
union cdnsp_trb *event;
u32 cycle_state;
+ u32 retry = 10;
int ret, val;
u64 cmd_dma;
u32 flags;
@@ -578,8 +579,23 @@ int cdnsp_wait_for_cmd_compl(struct cdnsp_device *pdev)
flags = le32_to_cpu(event->event_cmd.flags);
/* Check the owner of the TRB. */
- if ((flags & TRB_CYCLE) != cycle_state)
+ if ((flags & TRB_CYCLE) != cycle_state) {
+ /*
+ * Give some extra time to get chance controller
+ * to finish command before returning error code.
+ * Checking CMD_RING_BUSY is not sufficient because
+ * this bit is cleared to '0' when the Command
+ * Descriptor has been executed by controller
+ * and not when command completion event has
+ * be added to event ring.
+ */
+ if (retry--) {
+ udelay(20);
+ continue;
+ }
+
return -EINVAL;
+ }
cmd_dma = le64_to_cpu(event->event_cmd.cmd_trb);
--
2.43.0
In xe_vm_close_and_put() we need to be able to call
flush_work(rebind_work), however during vm creation we can call this on
the error path, before having actually set up the worker, leading to a
splat from flush_work().
It looks like we can simply move the worker init step earlier to fix
this.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_vm.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 5a978da411b0..168756fb140b 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1704,8 +1704,10 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
* scheduler drops all the references of it, hence protecting the VM
* for this case is necessary.
*/
- if (flags & XE_VM_FLAG_LR_MODE)
+ if (flags & XE_VM_FLAG_LR_MODE) {
+ INIT_WORK(&vm->preempt.rebind_work, preempt_rebind_work_func);
xe_pm_runtime_get_noresume(xe);
+ }
vm_resv_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
if (!vm_resv_obj) {
@@ -1750,10 +1752,8 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
vm->batch_invalidate_tlb = true;
}
- if (vm->flags & XE_VM_FLAG_LR_MODE) {
- INIT_WORK(&vm->preempt.rebind_work, preempt_rebind_work_func);
+ if (vm->flags & XE_VM_FLAG_LR_MODE)
vm->batch_invalidate_tlb = false;
- }
/* Fill pt_root after allocating scratch tables */
for_each_tile(tile, xe, id) {
--
2.49.0
This patch series is to fix OF device node refcount leakage for
- of_irq_parse_and_map_pci()
- of_pci_prop_intr_map()
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Zijun Hu (2):
PCI: of: Fix OF device node refcount leakage in API of_irq_parse_and_map_pci()
PCI: of: Fix OF device node refcount leakages in of_pci_prop_intr_map()
drivers/pci/of.c | 2 ++
drivers/pci/of_property.c | 20 +++++++++++---------
2 files changed, 13 insertions(+), 9 deletions(-)
---
base-commit: 7d06015d936c861160803e020f68f413b5c3cd9d
change-id: 20250407-fix_of_pci-20b45dcc26b5
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
The patch titled
Subject: highmem: add folio_test_partial_kmap()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
highmem-add-folio_test_partial_kmap.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: highmem: add folio_test_partial_kmap()
Date: Wed, 14 May 2025 18:06:02 +0100
In commit c749d9b7ebbc (iov_iter: fix copy_page_from_iter_atomic() if
KMAP_LOCAL_FORCE_MAP), Hugh correctly noted that if KMAP_LOCAL_FORCE_MAP
is enabled, we must limit ourselves to PAGE_SIZE bytes per call to
kmap_local(). The same problem exists in memcpy_from_folio(),
memcpy_to_folio(), folio_zero_tail(), folio_fill_tail() and
memcpy_from_file_folio(), so add folio_test_partial_kmap() to do this more
succinctly.
Link: https://lkml.kernel.org/r/20250514170607.3000994-2-willy@infradead.org
Fixes: 00cdf76012ab ("mm: add memcpy_from_file_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/highmem.h | 10 +++++-----
include/linux/page-flags.h | 7 +++++++
2 files changed, 12 insertions(+), 5 deletions(-)
--- a/include/linux/highmem.h~highmem-add-folio_test_partial_kmap
+++ a/include/linux/highmem.h
@@ -461,7 +461,7 @@ static inline void memcpy_from_folio(cha
const char *from = kmap_local_folio(folio, offset);
size_t chunk = len;
- if (folio_test_highmem(folio) &&
+ if (folio_test_partial_kmap(folio) &&
chunk > PAGE_SIZE - offset_in_page(offset))
chunk = PAGE_SIZE - offset_in_page(offset);
memcpy(to, from, chunk);
@@ -489,7 +489,7 @@ static inline void memcpy_to_folio(struc
char *to = kmap_local_folio(folio, offset);
size_t chunk = len;
- if (folio_test_highmem(folio) &&
+ if (folio_test_partial_kmap(folio) &&
chunk > PAGE_SIZE - offset_in_page(offset))
chunk = PAGE_SIZE - offset_in_page(offset);
memcpy(to, from, chunk);
@@ -522,7 +522,7 @@ static inline __must_check void *folio_z
{
size_t len = folio_size(folio) - offset;
- if (folio_test_highmem(folio)) {
+ if (folio_test_partial_kmap(folio)) {
size_t max = PAGE_SIZE - offset_in_page(offset);
while (len > max) {
@@ -560,7 +560,7 @@ static inline void folio_fill_tail(struc
VM_BUG_ON(offset + len > folio_size(folio));
- if (folio_test_highmem(folio)) {
+ if (folio_test_partial_kmap(folio)) {
size_t max = PAGE_SIZE - offset_in_page(offset);
while (len > max) {
@@ -597,7 +597,7 @@ static inline size_t memcpy_from_file_fo
size_t offset = offset_in_folio(folio, pos);
char *from = kmap_local_folio(folio, offset);
- if (folio_test_highmem(folio)) {
+ if (folio_test_partial_kmap(folio)) {
offset = offset_in_page(offset);
len = min_t(size_t, len, PAGE_SIZE - offset);
} else
--- a/include/linux/page-flags.h~highmem-add-folio_test_partial_kmap
+++ a/include/linux/page-flags.h
@@ -615,6 +615,13 @@ FOLIO_FLAG(dropbehind, FOLIO_HEAD_PAGE)
PAGEFLAG_FALSE(HighMem, highmem)
#endif
+/* Does kmap_local_folio() only allow access to one page of the folio? */
+#ifdef CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
+#define folio_test_partial_kmap(f) true
+#else
+#define folio_test_partial_kmap(f) folio_test_highmem(f)
+#endif
+
#ifdef CONFIG_SWAP
static __always_inline bool folio_test_swapcache(const struct folio *folio)
{
_
Patches currently in -mm which might be from willy(a)infradead.org are
highmem-add-folio_test_partial_kmap.patch
mm-rename-page-index-to-page-__folio_index.patch
Two of the patches are older rethunk fixes and one is a build fix for
CONFIG_MODULES=n.
---
Borislav Petkov (AMD) (1):
x86/alternative: Optimize returns patching
Eric Biggers (1):
x86/its: Fix build errors when CONFIG_MODULES=n
Josh Poimboeuf (1):
x86/alternatives: Remove faulty optimization
arch/x86/kernel/alternative.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
---
change-id: 20250513-its-fixes-6-1-d21ce20b0d1d
From: Shravya KN <shravya.k-n(a)broadcom.com>
[ Upstream commit 3051a77a09dfe3022aa012071346937fdf059033 ]
The MTU setting at the time an XDP multi-buffer is attached
determines whether the aggregation ring will be used and the
rx_skb_func handler. This is done in bnxt_set_rx_skb_mode().
If the MTU is later changed, the aggregation ring setting may need
to be changed and it may become out-of-sync with the settings
initially done in bnxt_set_rx_skb_mode(). This may result in
random memory corruption and crashes as the HW may DMA data larger
than the allocated buffer size, such as:
BUG: kernel NULL pointer dereference, address: 00000000000003c0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 17 PID: 0 Comm: swapper/17 Kdump: loaded Tainted: G S OE 6.1.0-226bf9805506 #1
Hardware name: Wiwynn Delta Lake PVT BZA.02601.0150/Delta Lake-Class1, BIOS F0E_3A12 08/26/2021
RIP: 0010:bnxt_rx_pkt+0xe97/0x1ae0 [bnxt_en]
Code: 8b 95 70 ff ff ff 4c 8b 9d 48 ff ff ff 66 41 89 87 b4 00 00 00 e9 0b f7 ff ff 0f b7 43 0a 49 8b 95 a8 04 00 00 25 ff 0f 00 00 <0f> b7 14 42 48 c1 e2 06 49 03 95 a0 04 00 00 0f b6 42 33f
RSP: 0018:ffffa19f40cc0d18 EFLAGS: 00010202
RAX: 00000000000001e0 RBX: ffff8e2c805c6100 RCX: 00000000000007ff
RDX: 0000000000000000 RSI: ffff8e2c271ab990 RDI: ffff8e2c84f12380
RBP: ffffa19f40cc0e48 R08: 000000000001000d R09: 974ea2fcddfa4cbf
R10: 0000000000000000 R11: ffffa19f40cc0ff8 R12: ffff8e2c94b58980
R13: ffff8e2c952d6600 R14: 0000000000000016 R15: ffff8e2c271ab990
FS: 0000000000000000(0000) GS:ffff8e3b3f840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000003c0 CR3: 0000000e8580a004 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
__bnxt_poll_work+0x1c2/0x3e0 [bnxt_en]
To address the issue, we now call bnxt_set_rx_skb_mode() within
bnxt_change_mtu() to properly set the AGG rings configuration and
update rx_skb_func based on the new MTU value.
Additionally, BNXT_FLAG_NO_AGG_RINGS is cleared at the beginning of
bnxt_set_rx_skb_mode() to make sure it gets set or cleared based on
the current MTU.
Fixes: 08450ea98ae9 ("bnxt_en: Fix max_mtu setting for multi-buf XDP")
Co-developed-by: Somnath Kotur <somnath.kotur(a)broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur(a)broadcom.com>
Signed-off-by: Shravya KN <shravya.k-n(a)broadcom.com>
Signed-off-by: Michael Chan <michael.chan(a)broadcom.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Zhaoyang Li <lizy04(a)hust.edu.cn>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 393a983f6d69..6b1245a3ab4b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4041,7 +4041,7 @@ int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
struct net_device *dev = bp->dev;
if (page_mode) {
- bp->flags &= ~BNXT_FLAG_AGG_RINGS;
+ bp->flags &= ~(BNXT_FLAG_AGG_RINGS | BNXT_FLAG_NO_AGG_RINGS);
bp->flags |= BNXT_FLAG_RX_PAGE_MODE;
if (bp->xdp_prog->aux->xdp_has_frags)
@@ -12799,6 +12799,14 @@ static int bnxt_change_mtu(struct net_device *dev, int new_mtu)
bnxt_close_nic(bp, true, false);
dev->mtu = new_mtu;
+
+ /* MTU change may change the AGG ring settings if an XDP multi-buffer
+ * program is attached. We need to set the AGG rings settings and
+ * rx_skb_func accordingly.
+ */
+ if (READ_ONCE(bp->xdp_prog))
+ bnxt_set_rx_skb_mode(bp, true);
+
bnxt_set_ring_params(bp);
if (netif_running(dev))
--
2.25.1
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 03552d8ac0afcc080c339faa0b726e2c0e9361cb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025051239-ceremony-rehab-8e6c@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 03552d8ac0afcc080c339faa0b726e2c0e9361cb Mon Sep 17 00:00:00 2001
From: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Date: Fri, 2 May 2025 08:51:04 -0700
Subject: [PATCH] drm/xe/gsc: do not flush the GSC worker from the reset path
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.
The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:
1) GSC proxy init: this only happens once immediately after GSC FW load
and does not support being interrupted. The only way to recover from
an interruption of the proxy init is to do an FLR and re-load the GSC.
2) GSC proxy request: this can happen in response to a request that
the driver sends to the GSC. If this is interrupted, the GSC FW will
timeout and the driver request will be failed, but overall the GSC
will keep working fine.
Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).
Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.
v2: fix spelling in commit msg, rename waiter function (Julia)
Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Cc: John Harrison <John.C.Harrison(a)Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk(a)intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@in…
(cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5)
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
diff --git a/drivers/gpu/drm/xe/xe_gsc.c b/drivers/gpu/drm/xe/xe_gsc.c
index fd41113f8572..0bcf97063ff6 100644
--- a/drivers/gpu/drm/xe/xe_gsc.c
+++ b/drivers/gpu/drm/xe/xe_gsc.c
@@ -555,6 +555,28 @@ void xe_gsc_wait_for_worker_completion(struct xe_gsc *gsc)
flush_work(&gsc->work);
}
+void xe_gsc_stop_prepare(struct xe_gsc *gsc)
+{
+ struct xe_gt *gt = gsc_to_gt(gsc);
+ int ret;
+
+ if (!xe_uc_fw_is_loadable(&gsc->fw) || xe_uc_fw_is_in_error_state(&gsc->fw))
+ return;
+
+ xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GSC);
+
+ /*
+ * If the GSC FW load or the proxy init are interrupted, the only way
+ * to recover it is to do an FLR and reload the GSC from scratch.
+ * Therefore, let's wait for the init to complete before stopping
+ * operations. The proxy init is the last step, so we can just wait on
+ * that
+ */
+ ret = xe_gsc_wait_for_proxy_init_done(gsc);
+ if (ret)
+ xe_gt_err(gt, "failed to wait for GSC init completion before uc stop\n");
+}
+
/*
* wa_14015076503: if the GSC FW is loaded, we need to alert it before doing a
* GSC engine reset by writing a notification bit in the GS1 register and then
diff --git a/drivers/gpu/drm/xe/xe_gsc.h b/drivers/gpu/drm/xe/xe_gsc.h
index d99f66c38075..b8b8e0810ad9 100644
--- a/drivers/gpu/drm/xe/xe_gsc.h
+++ b/drivers/gpu/drm/xe/xe_gsc.h
@@ -16,6 +16,7 @@ struct xe_hw_engine;
int xe_gsc_init(struct xe_gsc *gsc);
int xe_gsc_init_post_hwconfig(struct xe_gsc *gsc);
void xe_gsc_wait_for_worker_completion(struct xe_gsc *gsc);
+void xe_gsc_stop_prepare(struct xe_gsc *gsc);
void xe_gsc_load_start(struct xe_gsc *gsc);
void xe_gsc_hwe_irq_handler(struct xe_hw_engine *hwe, u16 intr_vec);
diff --git a/drivers/gpu/drm/xe/xe_gsc_proxy.c b/drivers/gpu/drm/xe/xe_gsc_proxy.c
index 8cf70b228ff3..d0519cd6704a 100644
--- a/drivers/gpu/drm/xe/xe_gsc_proxy.c
+++ b/drivers/gpu/drm/xe/xe_gsc_proxy.c
@@ -71,6 +71,17 @@ bool xe_gsc_proxy_init_done(struct xe_gsc *gsc)
HECI1_FWSTS1_PROXY_STATE_NORMAL;
}
+int xe_gsc_wait_for_proxy_init_done(struct xe_gsc *gsc)
+{
+ struct xe_gt *gt = gsc_to_gt(gsc);
+
+ /* Proxy init can take up to 500ms, so wait double that for safety */
+ return xe_mmio_wait32(>->mmio, HECI_FWSTS1(MTL_GSC_HECI1_BASE),
+ HECI1_FWSTS1_CURRENT_STATE,
+ HECI1_FWSTS1_PROXY_STATE_NORMAL,
+ USEC_PER_SEC, NULL, false);
+}
+
static void __gsc_proxy_irq_rmw(struct xe_gsc *gsc, u32 clr, u32 set)
{
struct xe_gt *gt = gsc_to_gt(gsc);
diff --git a/drivers/gpu/drm/xe/xe_gsc_proxy.h b/drivers/gpu/drm/xe/xe_gsc_proxy.h
index fdef56995cd4..765602221dbc 100644
--- a/drivers/gpu/drm/xe/xe_gsc_proxy.h
+++ b/drivers/gpu/drm/xe/xe_gsc_proxy.h
@@ -12,6 +12,7 @@ struct xe_gsc;
int xe_gsc_proxy_init(struct xe_gsc *gsc);
bool xe_gsc_proxy_init_done(struct xe_gsc *gsc);
+int xe_gsc_wait_for_proxy_init_done(struct xe_gsc *gsc);
int xe_gsc_proxy_start(struct xe_gsc *gsc);
int xe_gsc_proxy_request_handler(struct xe_gsc *gsc);
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 10a9e3c72b36..66198cf2662c 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -857,7 +857,7 @@ void xe_gt_suspend_prepare(struct xe_gt *gt)
fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
- xe_uc_stop_prepare(>->uc);
+ xe_uc_suspend_prepare(>->uc);
xe_force_wake_put(gt_to_fw(gt), fw_ref);
}
diff --git a/drivers/gpu/drm/xe/xe_uc.c b/drivers/gpu/drm/xe/xe_uc.c
index c14bd2282044..3a8751a8b92d 100644
--- a/drivers/gpu/drm/xe/xe_uc.c
+++ b/drivers/gpu/drm/xe/xe_uc.c
@@ -244,7 +244,7 @@ void xe_uc_gucrc_disable(struct xe_uc *uc)
void xe_uc_stop_prepare(struct xe_uc *uc)
{
- xe_gsc_wait_for_worker_completion(&uc->gsc);
+ xe_gsc_stop_prepare(&uc->gsc);
xe_guc_stop_prepare(&uc->guc);
}
@@ -278,6 +278,12 @@ static void uc_reset_wait(struct xe_uc *uc)
goto again;
}
+void xe_uc_suspend_prepare(struct xe_uc *uc)
+{
+ xe_gsc_wait_for_worker_completion(&uc->gsc);
+ xe_guc_stop_prepare(&uc->guc);
+}
+
int xe_uc_suspend(struct xe_uc *uc)
{
/* GuC submission not enabled, nothing to do */
diff --git a/drivers/gpu/drm/xe/xe_uc.h b/drivers/gpu/drm/xe/xe_uc.h
index 3813c1ede450..c23e6f5e2514 100644
--- a/drivers/gpu/drm/xe/xe_uc.h
+++ b/drivers/gpu/drm/xe/xe_uc.h
@@ -18,6 +18,7 @@ int xe_uc_reset_prepare(struct xe_uc *uc);
void xe_uc_stop_prepare(struct xe_uc *uc);
void xe_uc_stop(struct xe_uc *uc);
int xe_uc_start(struct xe_uc *uc);
+void xe_uc_suspend_prepare(struct xe_uc *uc);
int xe_uc_suspend(struct xe_uc *uc);
int xe_uc_sanitize_reset(struct xe_uc *uc);
void xe_uc_declare_wedged(struct xe_uc *uc);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x f31fe8165d365379d858c53bef43254c7d6d1cfd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025051209-overturn-supreme-26f1@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f31fe8165d365379d858c53bef43254c7d6d1cfd Mon Sep 17 00:00:00 2001
From: Naman Jain <namjain(a)linux.microsoft.com>
Date: Fri, 2 May 2025 13:18:10 +0530
Subject: [PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
On regular bootup, devices get registered to VMBus first, so when
uio_hv_generic driver for a particular device type is probed,
the device is already initialized and added, so sysfs creation in
hv_uio_probe() works fine. However, when the device is removed
and brought back, the channel gets rescinded and the device again gets
registered to VMBus. However this time, the uio_hv_generic driver is
already registered to probe for that device and in this case sysfs
creation is tried before the device's kobject gets initialized
completely.
Fix this by moving the core logic of sysfs creation of ring buffer,
from uio_hv_generic to HyperV's VMBus driver, where the rest of the
sysfs attributes for the channels are defined. While doing that, make
use of attribute groups and macros, instead of creating sysfs
directly, to ensure better error handling and code flow.
Problematic path:
vmbus_process_offer (A new offer comes for the VMBus device)
vmbus_add_channel_work
vmbus_device_register
|-> device_register
| |...
| |-> hv_uio_probe
| |...
| |-> sysfs_create_bin_file (leads to a warning as
| the primary channel's kobject, which is used to
| create the sysfs file, is not yet initialized)
|-> kset_create_and_add
|-> vmbus_add_channel_kobj (initialization of the primary
channel's kobject happens later)
Above code flow is sequential and the warning is always reproducible in
this path.
Fixes: 9ab877a6ccf8 ("uio_hv_generic: make ring buffer attribute for primary channel")
Cc: stable(a)kernel.org
Suggested-by: Saurabh Sengar <ssengar(a)linux.microsoft.com>
Suggested-by: Michael Kelley <mhklinux(a)outlook.com>
Reviewed-by: Michael Kelley <mhklinux(a)outlook.com>
Tested-by: Michael Kelley <mhklinux(a)outlook.com>
Reviewed-by: Dexuan Cui <decui(a)microsoft.com>
Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com>
Link: https://lore.kernel.org/r/20250502074811.2022-2-namjain@linux.microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 29780f3a7478..0b450e53161e 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -477,4 +477,10 @@ static inline int hv_debug_add_dev_dir(struct hv_device *dev)
#endif /* CONFIG_HYPERV_TESTING */
+/* Create and remove sysfs entry for memory mapped ring buffers for a channel */
+int hv_create_ring_sysfs(struct vmbus_channel *channel,
+ int (*hv_mmap_ring_buffer)(struct vmbus_channel *channel,
+ struct vm_area_struct *vma));
+int hv_remove_ring_sysfs(struct vmbus_channel *channel);
+
#endif /* _HYPERV_VMBUS_H */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 8d3cff42bdbb..0f16a83cc2d6 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1802,6 +1802,27 @@ static ssize_t subchannel_id_show(struct vmbus_channel *channel,
}
static VMBUS_CHAN_ATTR_RO(subchannel_id);
+static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
+ const struct bin_attribute *attr,
+ struct vm_area_struct *vma)
+{
+ struct vmbus_channel *channel = container_of(kobj, struct vmbus_channel, kobj);
+
+ /*
+ * hv_(create|remove)_ring_sysfs implementation ensures that mmap_ring_buffer
+ * is not NULL.
+ */
+ return channel->mmap_ring_buffer(channel, vma);
+}
+
+static struct bin_attribute chan_attr_ring_buffer = {
+ .attr = {
+ .name = "ring",
+ .mode = 0600,
+ },
+ .size = 2 * SZ_2M,
+ .mmap = hv_mmap_ring_buffer_wrapper,
+};
static struct attribute *vmbus_chan_attrs[] = {
&chan_attr_out_mask.attr,
&chan_attr_in_mask.attr,
@@ -1821,6 +1842,11 @@ static struct attribute *vmbus_chan_attrs[] = {
NULL
};
+static struct bin_attribute *vmbus_chan_bin_attrs[] = {
+ &chan_attr_ring_buffer,
+ NULL
+};
+
/*
* Channel-level attribute_group callback function. Returns the permission for
* each attribute, and returns 0 if an attribute is not visible.
@@ -1841,9 +1867,24 @@ static umode_t vmbus_chan_attr_is_visible(struct kobject *kobj,
return attr->mode;
}
+static umode_t vmbus_chan_bin_attr_is_visible(struct kobject *kobj,
+ const struct bin_attribute *attr, int idx)
+{
+ const struct vmbus_channel *channel =
+ container_of(kobj, struct vmbus_channel, kobj);
+
+ /* Hide ring attribute if channel's ring_sysfs_visible is set to false */
+ if (attr == &chan_attr_ring_buffer && !channel->ring_sysfs_visible)
+ return 0;
+
+ return attr->attr.mode;
+}
+
static const struct attribute_group vmbus_chan_group = {
.attrs = vmbus_chan_attrs,
- .is_visible = vmbus_chan_attr_is_visible
+ .bin_attrs = vmbus_chan_bin_attrs,
+ .is_visible = vmbus_chan_attr_is_visible,
+ .is_bin_visible = vmbus_chan_bin_attr_is_visible,
};
static const struct kobj_type vmbus_chan_ktype = {
@@ -1851,6 +1892,63 @@ static const struct kobj_type vmbus_chan_ktype = {
.release = vmbus_chan_release,
};
+/**
+ * hv_create_ring_sysfs() - create "ring" sysfs entry corresponding to ring buffers for a channel.
+ * @channel: Pointer to vmbus_channel structure
+ * @hv_mmap_ring_buffer: function pointer for initializing the function to be called on mmap of
+ * channel's "ring" sysfs node, which is for the ring buffer of that channel.
+ * Function pointer is of below type:
+ * int (*hv_mmap_ring_buffer)(struct vmbus_channel *channel,
+ * struct vm_area_struct *vma))
+ * This has a pointer to the channel and a pointer to vm_area_struct,
+ * used for mmap, as arguments.
+ *
+ * Sysfs node for ring buffer of a channel is created along with other fields, however its
+ * visibility is disabled by default. Sysfs creation needs to be controlled when the use-case
+ * is running.
+ * For example, HV_NIC device is used either by uio_hv_generic or hv_netvsc at any given point of
+ * time, and "ring" sysfs is needed only when uio_hv_generic is bound to that device. To avoid
+ * exposing the ring buffer by default, this function is reponsible to enable visibility of
+ * ring for userspace to use.
+ * Note: Race conditions can happen with userspace and it is not encouraged to create new
+ * use-cases for this. This was added to maintain backward compatibility, while solving
+ * one of the race conditions in uio_hv_generic while creating sysfs.
+ *
+ * Returns 0 on success or error code on failure.
+ */
+int hv_create_ring_sysfs(struct vmbus_channel *channel,
+ int (*hv_mmap_ring_buffer)(struct vmbus_channel *channel,
+ struct vm_area_struct *vma))
+{
+ struct kobject *kobj = &channel->kobj;
+
+ channel->mmap_ring_buffer = hv_mmap_ring_buffer;
+ channel->ring_sysfs_visible = true;
+
+ return sysfs_update_group(kobj, &vmbus_chan_group);
+}
+EXPORT_SYMBOL_GPL(hv_create_ring_sysfs);
+
+/**
+ * hv_remove_ring_sysfs() - remove ring sysfs entry corresponding to ring buffers for a channel.
+ * @channel: Pointer to vmbus_channel structure
+ *
+ * Hide "ring" sysfs for a channel by changing its is_visible attribute and updating sysfs group.
+ *
+ * Returns 0 on success or error code on failure.
+ */
+int hv_remove_ring_sysfs(struct vmbus_channel *channel)
+{
+ struct kobject *kobj = &channel->kobj;
+ int ret;
+
+ channel->ring_sysfs_visible = false;
+ ret = sysfs_update_group(kobj, &vmbus_chan_group);
+ channel->mmap_ring_buffer = NULL;
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hv_remove_ring_sysfs);
+
/*
* vmbus_add_channel_kobj - setup a sub-directory under device/channels
*/
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 1b19b5647495..69c1df0f4ca5 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -131,15 +131,12 @@ static void hv_uio_rescind(struct vmbus_channel *channel)
vmbus_device_unregister(channel->device_obj);
}
-/* Sysfs API to allow mmap of the ring buffers
+/* Function used for mmap of ring buffer sysfs interface.
* The ring buffer is allocated as contiguous memory by vmbus_open
*/
-static int hv_uio_ring_mmap(struct file *filp, struct kobject *kobj,
- const struct bin_attribute *attr,
- struct vm_area_struct *vma)
+static int
+hv_uio_ring_mmap(struct vmbus_channel *channel, struct vm_area_struct *vma)
{
- struct vmbus_channel *channel
- = container_of(kobj, struct vmbus_channel, kobj);
void *ring_buffer = page_address(channel->ringbuffer_page);
if (channel->state != CHANNEL_OPENED_STATE)
@@ -149,15 +146,6 @@ static int hv_uio_ring_mmap(struct file *filp, struct kobject *kobj,
channel->ringbuffer_pagecount << PAGE_SHIFT);
}
-static const struct bin_attribute ring_buffer_bin_attr = {
- .attr = {
- .name = "ring",
- .mode = 0600,
- },
- .size = 2 * SZ_2M,
- .mmap = hv_uio_ring_mmap,
-};
-
/* Callback from VMBUS subsystem when new channel created. */
static void
hv_uio_new_channel(struct vmbus_channel *new_sc)
@@ -178,8 +166,7 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
/* Disable interrupts on sub channel */
new_sc->inbound.ring_buffer->interrupt_mask = 1;
set_channel_read_mode(new_sc, HV_CALL_ISR);
-
- ret = sysfs_create_bin_file(&new_sc->kobj, &ring_buffer_bin_attr);
+ ret = hv_create_ring_sysfs(new_sc, hv_uio_ring_mmap);
if (ret) {
dev_err(device, "sysfs create ring bin file failed; %d\n", ret);
vmbus_close(new_sc);
@@ -350,10 +337,18 @@ hv_uio_probe(struct hv_device *dev,
goto fail_close;
}
- ret = sysfs_create_bin_file(&channel->kobj, &ring_buffer_bin_attr);
- if (ret)
- dev_notice(&dev->device,
- "sysfs create ring bin file failed; %d\n", ret);
+ /*
+ * This internally calls sysfs_update_group, which returns a non-zero value if it executes
+ * before sysfs_create_group. This is expected as the 'ring' will be created later in
+ * vmbus_device_register() -> vmbus_add_channel_kobj(). Thus, no need to check the return
+ * value and print warning.
+ *
+ * Creating/exposing sysfs in driver probe is not encouraged as it can lead to race
+ * conditions with userspace. For backward compatibility, "ring" sysfs could not be removed
+ * or decoupled from uio_hv_generic probe. Userspace programs can make use of inotify
+ * APIs to make sure that ring is created.
+ */
+ hv_create_ring_sysfs(channel, hv_uio_ring_mmap);
hv_set_drvdata(dev, pdata);
@@ -375,7 +370,7 @@ hv_uio_remove(struct hv_device *dev)
if (!pdata)
return;
- sysfs_remove_bin_file(&dev->channel->kobj, &ring_buffer_bin_attr);
+ hv_remove_ring_sysfs(dev->channel);
uio_unregister_device(&pdata->info);
hv_uio_cleanup(dev, pdata);
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 675959fb97ba..d6ffe01962c2 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1002,6 +1002,12 @@ struct vmbus_channel {
/* The max size of a packet on this channel */
u32 max_pkt_size;
+
+ /* function to mmap ring buffer memory to the channel's sysfs ring attribute */
+ int (*mmap_ring_buffer)(struct vmbus_channel *channel, struct vm_area_struct *vma);
+
+ /* boolean to control visibility of sysfs for ring buffer */
+ bool ring_sysfs_visible;
};
#define lock_requestor(channel, flags) \
From: Nick Child <nnac123(a)linux.ibm.com>
[ Upstream commit bdf5d13aa05ec314d4385b31ac974d6c7e0997c9 ]
Previously, after successfully flushing the xmit buffer to VIOS,
the tx_bytes stat was incremented by the length of the skb.
It is invalid to access the skb memory after sending the buffer to
the VIOS because, at any point after sending, the VIOS can trigger
an interrupt to free this memory. A race between reading skb->len
and freeing the skb is possible (especially during LPM) and will
result in use-after-free:
==================================================================
BUG: KASAN: slab-use-after-free in ibmvnic_xmit+0x75c/0x1808 [ibmvnic]
Read of size 4 at addr c00000024eb48a70 by task hxecom/14495
<...>
Call Trace:
[c000000118f66cf0] [c0000000018cba6c] dump_stack_lvl+0x84/0xe8 (unreliable)
[c000000118f66d20] [c0000000006f0080] print_report+0x1a8/0x7f0
[c000000118f66df0] [c0000000006f08f0] kasan_report+0x128/0x1f8
[c000000118f66f00] [c0000000006f2868] __asan_load4+0xac/0xe0
[c000000118f66f20] [c0080000046eac84] ibmvnic_xmit+0x75c/0x1808 [ibmvnic]
[c000000118f67340] [c0000000014be168] dev_hard_start_xmit+0x150/0x358
<...>
Freed by task 0:
kasan_save_stack+0x34/0x68
kasan_save_track+0x2c/0x50
kasan_save_free_info+0x64/0x108
__kasan_mempool_poison_object+0x148/0x2d4
napi_skb_cache_put+0x5c/0x194
net_tx_action+0x154/0x5b8
handle_softirqs+0x20c/0x60c
do_softirq_own_stack+0x6c/0x88
<...>
The buggy address belongs to the object at c00000024eb48a00 which
belongs to the cache skbuff_head_cache of size 224
==================================================================
Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Nick Child <nnac123(a)linux.ibm.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Link: https://patch.msgid.link/20250214155233.235559-1-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[Minor context change fixed.]
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
Signed-off-by: He Zhe <zhe.he(a)windriver.com>
---
Build test passed.
---
drivers/net/ethernet/ibm/ibmvnic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 7f4539a2e551..4b87d9f59628 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1758,6 +1758,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
dma_addr_t data_dma_addr;
struct netdev_queue *txq;
unsigned long lpar_rc;
+ unsigned int skblen;
union sub_crq tx_crq;
unsigned int offset;
int num_entries = 1;
@@ -1843,6 +1844,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
tx_buff->skb = skb;
tx_buff->index = index;
tx_buff->pool_index = queue_num;
+ skblen = skb->len;
memset(&tx_crq, 0, sizeof(tx_crq));
tx_crq.v1.first = IBMVNIC_CRQ_CMD;
@@ -1919,7 +1921,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
}
tx_packets++;
- tx_bytes += skb->len;
+ tx_bytes += skblen;
txq->trans_start = jiffies;
ret = NETDEV_TX_OK;
goto out;
--
2.34.1
From: Nick Child <nnac123(a)linux.ibm.com>
[ Upstream commit bdf5d13aa05ec314d4385b31ac974d6c7e0997c9 ]
Previously, after successfully flushing the xmit buffer to VIOS,
the tx_bytes stat was incremented by the length of the skb.
It is invalid to access the skb memory after sending the buffer to
the VIOS because, at any point after sending, the VIOS can trigger
an interrupt to free this memory. A race between reading skb->len
and freeing the skb is possible (especially during LPM) and will
result in use-after-free:
==================================================================
BUG: KASAN: slab-use-after-free in ibmvnic_xmit+0x75c/0x1808 [ibmvnic]
Read of size 4 at addr c00000024eb48a70 by task hxecom/14495
<...>
Call Trace:
[c000000118f66cf0] [c0000000018cba6c] dump_stack_lvl+0x84/0xe8 (unreliable)
[c000000118f66d20] [c0000000006f0080] print_report+0x1a8/0x7f0
[c000000118f66df0] [c0000000006f08f0] kasan_report+0x128/0x1f8
[c000000118f66f00] [c0000000006f2868] __asan_load4+0xac/0xe0
[c000000118f66f20] [c0080000046eac84] ibmvnic_xmit+0x75c/0x1808 [ibmvnic]
[c000000118f67340] [c0000000014be168] dev_hard_start_xmit+0x150/0x358
<...>
Freed by task 0:
kasan_save_stack+0x34/0x68
kasan_save_track+0x2c/0x50
kasan_save_free_info+0x64/0x108
__kasan_mempool_poison_object+0x148/0x2d4
napi_skb_cache_put+0x5c/0x194
net_tx_action+0x154/0x5b8
handle_softirqs+0x20c/0x60c
do_softirq_own_stack+0x6c/0x88
<...>
The buggy address belongs to the object at c00000024eb48a00 which
belongs to the cache skbuff_head_cache of size 224
==================================================================
Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Nick Child <nnac123(a)linux.ibm.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Link: https://patch.msgid.link/20250214155233.235559-1-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[Minor context change fixed.]
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
Signed-off-by: He Zhe <zhe.he(a)windriver.com>
---
Build test passed.
---
drivers/net/ethernet/ibm/ibmvnic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 84da6ccaf339..73e3165aa9ae 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1548,6 +1548,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
dma_addr_t data_dma_addr;
struct netdev_queue *txq;
unsigned long lpar_rc;
+ unsigned int skblen;
union sub_crq tx_crq;
unsigned int offset;
int num_entries = 1;
@@ -1631,6 +1632,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
tx_buff->index = index;
tx_buff->pool_index = queue_num;
tx_buff->last_frag = true;
+ skblen = skb->len;
memset(&tx_crq, 0, sizeof(tx_crq));
tx_crq.v1.first = IBMVNIC_CRQ_CMD;
@@ -1733,7 +1735,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
}
tx_packets++;
- tx_bytes += skb->len;
+ tx_bytes += skblen;
txq->trans_start = jiffies;
ret = NETDEV_TX_OK;
goto out;
--
2.34.1