From: Lukas Wunner <lukas(a)wunner.de>
The Event Handler Busy bit shall be cleared by software when the Event
Ring is empty. The xHC is thereby informed that it may raise another
interrupt once it has enqueued new events (sec 4.17.2).
However, since commit dc0ffbea5729 ("usb: host: xhci: update event ring
dequeue pointer on purpose"), the EHB bit is already cleared after half
a segment has been processed.
As a result, spurious interrupts may occur:
- xhci_irq() processes half a segment, clears EHB, continues processing
remaining events.
- xHC enqueues new events. Because EHB has been cleared, xHC sets
Interrupt Pending bit. Interrupt moderation countdown begins.
- Meanwhile xhci_irq() continues processing events. Interrupt
moderation countdown reaches zero, so an MSI interrupt is signaled.
- xhci_irq() empties the Event Ring, clears EHB again and is done.
- Because an MSI interrupt has been signaled, xhci_irq() is run again.
It discovers there's nothing to do and returns IRQ_NONE.
Avoid this by clearing the EHB bit only at the end of xhci_irq().
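With that change, the event loop takes roughly the following shape (a
condensed sketch of the code below, not the literal driver source; the
isoc interval adjustment is omitted):

  while (xhci_handle_event(xhci, ir) > 0) {
          if (event_loop++ < TRBS_PER_SEGMENT / 2)
                  continue;

          /* update the dequeue pointer, but leave EHB set */
          xhci_update_erst_dequeue(xhci, ir, event_ring_deq, false);
          event_ring_deq = ir->event_ring->dequeue;
          event_loop = 0;
  }

  /* ring is empty: update the dequeue pointer and clear EHB */
  xhci_update_erst_dequeue(xhci, ir, event_ring_deq, true);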
Fixes: dc0ffbea5729 ("usb: host: xhci: update event ring dequeue pointer on purpose")
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org # v5.5+
Cc: Peter Chen <peter.chen(a)kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 98389b568633..3e5dc0723a8f 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2996,7 +2996,8 @@ static int xhci_handle_event(struct xhci_hcd *xhci, struct xhci_interrupter *ir)
*/
static void xhci_update_erst_dequeue(struct xhci_hcd *xhci,
struct xhci_interrupter *ir,
- union xhci_trb *event_ring_deq)
+ union xhci_trb *event_ring_deq,
+ bool clear_ehb)
{
u64 temp_64;
dma_addr_t deq;
@@ -3017,12 +3018,13 @@ static void xhci_update_erst_dequeue(struct xhci_hcd *xhci,
return;
/* Update HC event ring dequeue pointer */
- temp_64 &= ERST_PTR_MASK;
+ temp_64 &= ERST_DESI_MASK;
temp_64 |= ((u64) deq & (u64) ~ERST_PTR_MASK);
}
/* Clear the event handler busy flag (RW1C) */
- temp_64 |= ERST_EHB;
+ if (clear_ehb)
+ temp_64 |= ERST_EHB;
xhci_write_64(xhci, temp_64, &ir->ir_set->erst_dequeue);
}
@@ -3103,7 +3105,7 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
while (xhci_handle_event(xhci, ir) > 0) {
if (event_loop++ < TRBS_PER_SEGMENT / 2)
continue;
- xhci_update_erst_dequeue(xhci, ir, event_ring_deq);
+ xhci_update_erst_dequeue(xhci, ir, event_ring_deq, false);
event_ring_deq = ir->event_ring->dequeue;
/* ring is half-full, force isoc trbs to interrupt more often */
@@ -3113,7 +3115,7 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
event_loop = 0;
}
- xhci_update_erst_dequeue(xhci, ir, event_ring_deq);
+ xhci_update_erst_dequeue(xhci, ir, event_ring_deq, true);
ret = IRQ_HANDLED;
out:
--
2.25.1
xhci-hub.c tracks suspended ports in a suspended_ports bitfield.
This is checked when responding to a Get_Status(PORT) request, to see if
a port running in U0 state was recently resumed; in that case the
required USB_PORT_STAT_C_SUSPEND change bit is added to the response.
The suspended_ports bit was left uncleared if a device was disconnected
while suspended. The bit remained set even when a new device was
connected and enumerated. The set bit resulted in an incorrect
Get_Status(PORT) response with a bogus USB_PORT_STAT_C_SUSPEND change
bit set once the new device reached U0 link state.
The USB_PORT_STAT_C_SUSPEND change bit is only used for USB2 ports, but
xhci-hub keeps track of both USB2 and USB3 suspended ports.
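For illustration, the consuming check is roughly the following (a
condensed sketch of the xhci-hub.c logic, not the exact driver code):

  /* port back in U0: a set suspended_ports bit is taken to mean the
   * port just resumed, so the suspend change bit will be reported */
  if (link_state == XDEV_U0 &&
      (bus_state->suspended_ports & (1 << portnum))) {
          bus_state->suspended_ports &= ~(1 << portnum);
          bus_state->port_c_suspend |= 1 << portnum;
  }

A stale suspended_ports bit thus flags a suspend change for a device
that was never suspended.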
Cc: stable(a)vger.kernel.org
Reported-by: Wesley Cheng <quic_wcheng(a)quicinc.com>
Closes: https://lore.kernel.org/linux-usb/d68aa806-b26a-0e43-42fb-b8067325e967@quic…
Fixes: 1d5810b6923c ("xhci: Rework port suspend structures for limited ports.")
Tested-by: Wesley Cheng <quic_wcheng(a)quicinc.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-hub.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 0054d02239e2..0df5d807a77e 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -1062,19 +1062,19 @@ static void xhci_get_usb3_port_status(struct xhci_port *port, u32 *status,
*status |= USB_PORT_STAT_C_CONFIG_ERROR << 16;
/* USB3 specific wPortStatus bits */
- if (portsc & PORT_POWER) {
+ if (portsc & PORT_POWER)
*status |= USB_SS_PORT_STAT_POWER;
- /* link state handling */
- if (link_state == XDEV_U0)
- bus_state->suspended_ports &= ~(1 << portnum);
- }
- /* remote wake resume signaling complete */
- if (bus_state->port_remote_wakeup & (1 << portnum) &&
+ /* no longer suspended or resuming */
+ if (link_state != XDEV_U3 &&
link_state != XDEV_RESUME &&
link_state != XDEV_RECOVERY) {
- bus_state->port_remote_wakeup &= ~(1 << portnum);
- usb_hcd_end_port_resume(&hcd->self, portnum);
+ /* remote wake resume signaling complete */
+ if (bus_state->port_remote_wakeup & (1 << portnum)) {
+ bus_state->port_remote_wakeup &= ~(1 << portnum);
+ usb_hcd_end_port_resume(&hcd->self, portnum);
+ }
+ bus_state->suspended_ports &= ~(1 << portnum);
}
xhci_hub_report_usb3_link_state(xhci, status, portsc);
@@ -1131,6 +1131,7 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
usb_hcd_end_port_resume(&port->rhub->hcd->self, portnum);
}
port->rexit_active = 0;
+ bus_state->suspended_ports &= ~(1 << portnum);
}
}
--
2.25.1
From: Wesley Cheng <quic_wcheng(a)quicinc.com>
As mentioned in:
commit 474ed23a6257 ("xhci: align the last trb before link if it is
easily splittable.")
A bounce buffer is utilized to ensure that transfers spanning ring
segments are aligned to the endpoint's max packet size. However, the
device currently used to map the DMA buffer is the xHCI HCD, which does
not carry any DMA operations in certain configurations.
Migration to the sysdev entry was introduced for DWC3 based
implementations where the IOMMU operations are present.
Replace the reference to the controller device with sysdev instead. This
allows the bounce buffer to be properly mapped on any implementation
that has an IOMMU involved.
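Condensed, the intended mapping looks like this (a sketch based on the
hunks below; error handling omitted):

  /* map the bounce buffer with the device that owns the DMA ops */
  struct device *dev = xhci_to_hcd(xhci)->self.sysdev;

  seg->bounce_dma = dma_map_single(dev, seg->bounce_buf,
                                   max_pkt, DMA_FROM_DEVICE);

On DWC3-style platforms sysdev refers to the device that sits behind the
IOMMU, so mappings made through it honor the IOMMU configuration.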
Cc: <stable(a)vger.kernel.org>
Fixes: 4c39d4b949d3 ("usb: xhci: use bus->sysdev for DMA configuration")
Signed-off-by: Wesley Cheng <quic_wcheng(a)quicinc.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 1dde53f6eb31..98389b568633 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -798,7 +798,7 @@ static void xhci_giveback_urb_in_irq(struct xhci_hcd *xhci,
static void xhci_unmap_td_bounce_buffer(struct xhci_hcd *xhci,
struct xhci_ring *ring, struct xhci_td *td)
{
- struct device *dev = xhci_to_hcd(xhci)->self.controller;
+ struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
struct xhci_segment *seg = td->bounce_seg;
struct urb *urb = td->urb;
size_t len;
@@ -3469,7 +3469,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int transferred,
static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
u32 *trb_buff_len, struct xhci_segment *seg)
{
- struct device *dev = xhci_to_hcd(xhci)->self.controller;
+ struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
unsigned int unalign;
unsigned int max_pkt;
u32 new_buff_len;
--
2.25.1
From: Anthony Krowiak <akrowiak(a)linux.ibm.com>
In the vfio_ap_irq_enable function, after the page containing the
notification indicator byte (NIB) is pinned, the function attempts
to register the guest ISC. If registration fails, the function sets the
status response code and returns without unpinning the page containing
the NIB. In order to avoid a memory leak, the NIB should be unpinned before
returning from the vfio_ap_irq_enable function.
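Condensed, the corrected failure branch looks as follows (a sketch of
the function's error path, not the complete code):

  nisc = kvm_s390_gisc_register(kvm, isc);
  if (nisc < 0) {
          VFIO_AP_DBF_WARN("%s: gisc registration failed: nisc=%d, isc=%d, apqn=%#04x\n",
                           __func__, nisc, isc, q->apqn);
          /* undo the earlier vfio_pin_pages() of the NIB page */
          vfio_unpin_pages(&q->matrix_mdev->vdev, nib, 1);
          status.response_code = AP_RESPONSE_INVALID_GISA;
          return status;
  }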
Fixes: 783f0a3ccd79 ("s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function")
Signed-off-by: Janosch Frank <frankja(a)linux.ibm.com>
Signed-off-by: Anthony Krowiak <akrowiak(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/s390/crypto/vfio_ap_ops.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 4db538a55192..9cb28978c186 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -457,6 +457,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
VFIO_AP_DBF_WARN("%s: gisc registration failed: nisc=%d, isc=%d, apqn=%#04x\n",
__func__, nisc, isc, q->apqn);
+ vfio_unpin_pages(&q->matrix_mdev->vdev, nib, 1);
status.response_code = AP_RESPONSE_INVALID_GISA;
return status;
}
--
2.41.0
On 09/14/2023, Felix Kuehling wrote:
> On 2023-09-14 10:02, Christian König wrote:
Do we still need to use a legacy flush to emulate a heavyweight flush
if we don't use SVM? And can I push this now?
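For context, the patch quoted in full below condenses to a single
override (a paraphrase of the diff, not additional code):

  /* cyan_skilfish mishandles other flush types: force a legacy flush */
  flush_type = (adev->asic_type != CHIP_CYAN_SKILLFISH) ? flush_type : 0;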
Regards,
Lang
> >
> > Am 14.09.23 um 15:59 schrieb Felix Kuehling:
> > >
> > > On 2023-09-14 9:39, Christian König wrote:
> > > > Is a single legacy flush sufficient to emulate a heavyweight
> > > > flush as well?
> > > >
> > > > On previous generations we needed to issue at least two legacy
> > > > flushes for this.
> > > I assume you are referring to the Vega20 XGMI workaround. That is a
> > > very different issue. Because PTEs would be cached in L2, we had to
> > > always use a heavy-weight flush that would also flush the L2 cache,
> > > and follow that with another legacy flush to deal with race
> > > conditions where stale PTEs could be re-fetched from L2 before the
> > > L2 flush was complete.
> >
> > No, we also have another (badly documented) workaround which issues a
> > legacy flush before each heavy-weight flush on some hw generations. See
> > my TLB flush cleanup patches.
> >
> > >
> > > A heavy-weight flush guarantees that there are no more possible
> > > memory accesses using the old PTEs. With physically addressed caches
> > > on GFXv9 that includes a cache flush because the address translation
> > > happened before putting data into the cache. I think the address
> > > translation and cache architecture works differently on GFXv10. So
> > > maybe the cache-flush is not required here.
> > >
> > > But even then a legacy flush probably allows in-flight memory
> > > accesses with old physical addresses to complete after the TLB
> > > flush. So there is a small risk of corrupting memory that was
> > > assumed to no longer be accessed by the GPU. Or, when using IOMMU
> > > device isolation, it would result in IOMMU faults if the DMA
> > > mappings are invalidated slightly too early.
> >
> > Mhm, that's quite bad. Any idea how to avoid that?
>
> A few ideas
>
> * Add an arbitrary delay and hope that it is longer than the FIFOs in
> the HW
> * Execute an atomic operation to memory on some GPU engine that could
> act as a fence, maybe just a RELEASE_MEM on the CP to some writeback
> location would do the job
> * If needed, RELEASE_MEM could also perform a cache flush
>
> Regards,
> Felix
>
>
> >
> > Regards,
> > Christian.
> >
> > >
> > > Regards,
> > > Felix
> > >
> > >
> > > >
> > > > And please don't push before getting an rb from Felix as well.
> > > >
> > > > Regards,
> > > > Christian.
> > > >
> > > >
> > > > Am 14.09.23 um 11:23 schrieb Lang Yu:
> > > > > cyan_skilfish has problems with other flush types.
> > > > >
> > > > > v2: fix incorrect ternary conditional operator usage. (Yifan)
> > > > >
> > > > > Signed-off-by: Lang Yu <Lang.Yu(a)amd.com>
> > > > > Cc: <stable(a)vger.kernel.org> # v5.15+
> > > > > ---
> > > > > drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 7 ++++++-
> > > > > 1 file changed, 6 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > > > > index d3da13f4c80e..c6d11047169a 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > > > > @@ -236,7 +236,8 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device *adev, uint32_t vmid,
> > > > >  {
> > > > >  	bool use_semaphore = gmc_v10_0_use_invalidate_semaphore(adev, vmhub);
> > > > >  	struct amdgpu_vmhub *hub = &adev->vmhub[vmhub];
> > > > > -	u32 inv_req = hub->vmhub_funcs->get_invalidate_req(vmid, flush_type);
> > > > > +	u32 inv_req = hub->vmhub_funcs->get_invalidate_req(vmid,
> > > > > +			(adev->asic_type != CHIP_CYAN_SKILLFISH) ? flush_type : 0);
> > > > >  	u32 tmp;
> > > > >  	/* Use register 17 for GART */
> > > > >  	const unsigned int eng = 17;
> > > > > @@ -331,6 +332,8 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
> > > > >  	int r;
> > > > >
> > > > > +	flush_type = (adev->asic_type != CHIP_CYAN_SKILLFISH) ? flush_type : 0;
> > > > > +
> > > > >  	/* flush hdp cache */
> > > > >  	adev->hdp.funcs->flush_hdp(adev, NULL);
> > > > > @@ -426,6 +429,8 @@ static int gmc_v10_0_flush_gpu_tlb_pasid(struct amdgpu_device *adev,
> > > > >  	struct amdgpu_ring *ring = &adev->gfx.kiq[0].ring;
> > > > >  	struct amdgpu_kiq *kiq = &adev->gfx.kiq[0];
> > > > >
> > > > > +	flush_type = (adev->asic_type != CHIP_CYAN_SKILLFISH) ? flush_type : 0;
> > > > > +
> > > > >  	if (amdgpu_emu_mode == 0 && ring->sched.ready) {
> > > > >  		spin_lock(&adev->gfx.kiq[0].ring_lock);
> > > > >  		/* 2 dwords flush + 8 dwords fence */
> > > >
> >