From: Jeff Vanhoof <qjv001(a)motorola.com>
arm-smmu related crashes seen after a Missed ISOC interrupt when
no_interrupt=1 is used. This can happen if the hardware is still using
the data associated with a TRB after the usb_request's ->complete call
has been made. Instead of immediately releasing a request when a Missed
ISOC interrupt has occurred, this change will add logic to cancel the
request instead where it will eventually be released when the
END_TRANSFER command has completed. This logic is similar to some of the
cleanup done in dwc3_gadget_ep_dequeue.
Fixes: 6d8a019614f3 ("usb: dwc3: gadget: check for Missed Isoc from event status")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jeff Vanhoof <qjv001(a)motorola.com>
Co-developed-by: Dan Vacura <w36195(a)motorola.com>
Signed-off-by: Dan Vacura <w36195(a)motorola.com>
---
V1 -> V3:
- no change, new patch in series
drivers/usb/dwc3/core.h | 1 +
drivers/usb/dwc3/gadget.c | 38 ++++++++++++++++++++++++++------------
2 files changed, 27 insertions(+), 12 deletions(-)
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 8f9959ba9fd4..9b005d912241 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -943,6 +943,7 @@ struct dwc3_request {
#define DWC3_REQUEST_STATUS_DEQUEUED 3
#define DWC3_REQUEST_STATUS_STALLED 4
#define DWC3_REQUEST_STATUS_COMPLETED 5
+#define DWC3_REQUEST_STATUS_MISSED_ISOC 6
#define DWC3_REQUEST_STATUS_UNKNOWN -1
u8 epnum;
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 079cd333632e..411532c5c378 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2021,6 +2021,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
case DWC3_REQUEST_STATUS_STALLED:
dwc3_gadget_giveback(dep, req, -EPIPE);
break;
+ case DWC3_REQUEST_STATUS_MISSED_ISOC:
+ dwc3_gadget_giveback(dep, req, -EXDEV);
+ break;
default:
dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
dwc3_gadget_giveback(dep, req, -ECONNRESET);
@@ -3402,21 +3405,32 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
struct dwc3 *dwc = dep->dwc;
bool no_started_trb = true;
- dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
+ if (status == -EXDEV) {
+ struct dwc3_request *tmp;
+ struct dwc3_request *req;
- if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
- goto out;
+ if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
+ dwc3_stop_active_transfer(dep, true, true);
- if (!dep->endpoint.desc)
- return no_started_trb;
+ list_for_each_entry_safe(req, tmp, &dep->started_list, list)
+ dwc3_gadget_move_cancelled_request(req,
+ DWC3_REQUEST_STATUS_MISSED_ISOC);
+ } else {
+ dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
- if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
- list_empty(&dep->started_list) &&
- (list_empty(&dep->pending_list) || status == -EXDEV))
- dwc3_stop_active_transfer(dep, true, true);
- else if (dwc3_gadget_ep_should_continue(dep))
- if (__dwc3_gadget_kick_transfer(dep) == 0)
- no_started_trb = false;
+ if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
+ goto out;
+
+ if (!dep->endpoint.desc)
+ return no_started_trb;
+
+ if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
+ list_empty(&dep->started_list) && list_empty(&dep->pending_list))
+ dwc3_stop_active_transfer(dep, true, true);
+ else if (dwc3_gadget_ep_should_continue(dep))
+ if (__dwc3_gadget_kick_transfer(dep) == 0)
+ no_started_trb = false;
+ }
out:
/*
--
2.34.1
Hi,
On Bugzilla, danilrybakov249(a)gmail.com reported stable-specific, ACPI error
regression that led into high CPU temperature [1]. He wrote:
> Overview:
>
> After updating from lts v6.6.14-2 to lts v6.6.17-1 noticed high CPU temperature and lag. After running htop noticed that journald was using 30-60% of CPU. Afterwards, tried switching to stable, or lts v6.6.18-1, but encountered the same issue.
>
> Running journalctl -f gives these lines over and over again:
>
> Feb 19 21:09:12 danirybe kernel: ACPI Error: Could not disable RealTimeClock events (20230628/evxfevnt-243)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 08, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0A, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0B, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PM_Timer (0), disabling (20230628/evevent-255)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PowerButton (2), disabling (20230628/evevent-255)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - SleepButton (3), disabling (20230628/evevent-255)
>
> My system info:
>
> Laptop model: ASUS VivoBook D540NV-GQ065T
> OS: Arch Linux x86_64
> Kernel: 6.6.14-2-lts
> WM: sway
> CPU: Intel Pentium N420 (4) @ 2.500GHz
> GPU1: Intel Apollo Lake [HD Graphics 505]
> GPU2: NVIDIA GeForce 920MX
>
> I've pinned down the commit after which the problem occurs:
>
> 847e1eb30e269a094da046c08273abe3f3361cf2 is the first bad commit
> commit 847e1eb30e269a094da046c08273abe3f3361cf2
> Author: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
> Date: Mon Jan 8 15:20:58 2024 +0900
>
> platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
>
> commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream.
>
> <snipped>...
See Bugzilla for the full thread.
Thanks.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218531
--
An old man doll... just what I always wanted! - Clara
From: Muhammad Ahmed <ahmed.ahmed(a)amd.com>
[WHY]
Blackscreen hang @ PC EF000025 when trying to wake up from S0i3. DCN
gets powered off due to dc_power_down_on_boot() being called after
timeout.
[HOW]
Setting the power_down_on_boot function pointer to null since we don't
expect the function to be called for APU.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
Acked-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Muhammad Ahmed <ahmed.ahmed(a)amd.com>
---
drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c
index dce620d359a6..d4e0abbef28e 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c
@@ -39,7 +39,7 @@
static const struct hw_sequencer_funcs dcn35_funcs = {
.program_gamut_remap = dcn30_program_gamut_remap,
.init_hw = dcn35_init_hw,
- .power_down_on_boot = dcn35_power_down_on_boot,
+ .power_down_on_boot = NULL,
.apply_ctx_to_hw = dce110_apply_ctx_to_hw,
.apply_ctx_for_surface = NULL,
.program_front_end_for_ctx = dcn20_program_front_end_for_ctx,
--
2.34.1
We need to avoid the first page if we don't read it entirely.
We need to avoid the last page if we don't read it entirely.
While rather simple, this logic has been failed in the previous
fix. This time I wrote about 30 unit tests locally to check each
possible condition, hopefully I covered them all.
Reported-by: Christophe Kerello <christophe.kerello(a)foss.st.com>
Closes: https://lore.kernel.org/linux-mtd/20240221175327.42f7076d@xps-13/T/#m399bac…
Suggested-by: Christophe Kerello <christophe.kerello(a)foss.st.com>
Fixes: 828f6df1bcba ("mtd: rawnand: Clarify conditions to enable continuous reads")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
drivers/mtd/nand/raw/nand_base.c | 38 ++++++++++++++++++--------------
1 file changed, 22 insertions(+), 16 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3b3ce2926f5d..bcfd99a1699f 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -3466,30 +3466,36 @@ static void rawnand_enable_cont_reads(struct nand_chip *chip, unsigned int page,
u32 readlen, int col)
{
struct mtd_info *mtd = nand_to_mtd(chip);
- unsigned int end_page, end_col;
+ unsigned int first_page, last_page;
chip->cont_read.ongoing = false;
if (!chip->controller->supported_op.cont_read)
return;
- end_page = DIV_ROUND_UP(col + readlen, mtd->writesize);
- end_col = (col + readlen) % mtd->writesize;
+ /*
+ * Don't bother making any calculations if the length is too small.
+ * Side effect: avoids possible integer underflows below.
+ */
+ if (readlen < (2 * mtd->writesize))
+ return;
+ /* Derive the page where continuous read should start (the first full page read) */
+ first_page = page;
if (col)
- page++;
-
- if (end_col && end_page)
- end_page--;
-
- if (page + 1 > end_page)
- return;
-
- chip->cont_read.first_page = page;
- chip->cont_read.last_page = end_page;
- chip->cont_read.ongoing = true;
-
- rawnand_cap_cont_reads(chip);
+ first_page++;
+
+ /* Derive the page where continuous read should stop (the last full page read) */
+ last_page = page + ((col + readlen) / mtd->writesize) - 1;
+
+ /* Configure and enable continuous read when suitable */
+ if (first_page < last_page) {
+ chip->cont_read.first_page = first_page;
+ chip->cont_read.last_page = last_page;
+ chip->cont_read.ongoing = true;
+ /* May reset the ongoing flag */
+ rawnand_cap_cont_reads(chip);
+ }
}
static void rawnand_cont_read_skip_first_page(struct nand_chip *chip, unsigned int page)
--
2.34.1
While crossing a LUN boundary, it is probably safer (and clearer) to
keep all members of the continuous read structure aligned, including the
pause page (which is the last page of the lun or the last page of the
continuous read). Once these members properly in sync, we can use the
rawnand_cap_cont_reads() helper everywhere to "prepare" the next
continuous read if there is one.
Fixes: bbcd80f53a5e ("mtd: rawnand: Prevent crossing LUN boundaries during sequential reads")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
This is not 100% a fix but I believe it is worth backporting as there
may be corner cases which were not identified with the initial
implementation.
---
drivers/mtd/nand/raw/nand_base.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index d6a27e08b112..4d5a663e4e05 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -1232,6 +1232,15 @@ static void rawnand_cap_cont_reads(struct nand_chip *chip)
chip->cont_read.pause_page = rawnand_last_page_of_lun(ppl, first_lun);
else
chip->cont_read.pause_page = chip->cont_read.last_page;
+
+ if (chip->cont_read.first_page == chip->cont_read.pause_page) {
+ chip->cont_read.first_page++;
+ chip->cont_read.pause_page = min(chip->cont_read.last_page,
+ rawnand_last_page_of_lun(ppl, first_lun + 1));
+ }
+
+ if (chip->cont_read.first_page >= chip->cont_read.last_page)
+ chip->cont_read.ongoing = false;
}
static int nand_lp_exec_cont_read_page_op(struct nand_chip *chip, unsigned int page,
@@ -1298,12 +1307,11 @@ static int nand_lp_exec_cont_read_page_op(struct nand_chip *chip, unsigned int p
if (!chip->cont_read.ongoing)
return 0;
- if (page == chip->cont_read.pause_page &&
- page != chip->cont_read.last_page) {
- chip->cont_read.first_page = chip->cont_read.pause_page + 1;
- rawnand_cap_cont_reads(chip);
- } else if (page == chip->cont_read.last_page) {
+ if (page == chip->cont_read.last_page) {
chip->cont_read.ongoing = false;
+ } else if (page == chip->cont_read.pause_page) {
+ chip->cont_read.first_page++;
+ rawnand_cap_cont_reads(chip);
}
return 0;
@@ -3510,10 +3518,7 @@ static void rawnand_cont_read_skip_first_page(struct nand_chip *chip, unsigned i
return;
chip->cont_read.first_page++;
- if (chip->cont_read.first_page == chip->cont_read.pause_page)
- chip->cont_read.first_page++;
- if (chip->cont_read.first_page >= chip->cont_read.last_page)
- chip->cont_read.ongoing = false;
+ rawnand_cap_cont_reads(chip);
}
/**
--
2.34.1
when AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
congestion) it is important that the folio is redirtied.
nfs_writepage_locked() doesn't do this, so files can become corrupted as
writes can be lost.
Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
returned. It is needed for kernels v5.18..v6.7. Prior to 6.3 the patch
is different as it needs to mention "page", not "folio".
Reported-and-tested-by: Jacek Tomaka <Jacek.Tomaka(a)poczta.fm>
Fixes: 6df25e58532b ("nfs: remove reliance on bdi congestion")
Signed-off-by: NeilBrown <neilb(a)suse.de>
---
fs/nfs/write.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index b664caea8b4e..9e345d3c305a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -668,8 +668,10 @@ static int nfs_writepage_locked(struct folio *folio,
int err;
if (wbc->sync_mode == WB_SYNC_NONE &&
- NFS_SERVER(inode)->write_congested)
+ NFS_SERVER(inode)->write_congested) {
+ folio_redirty_for_writepage(wbc, folio);
return AOP_WRITEPAGE_ACTIVATE;
+ }
nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE);
nfs_pageio_init_write(&pgio, inode, 0, false,
--
2.43.0
Hello,
54d217406afe250d7a768783baaa79a035f21d38 fixed an issue in
drm_dp_add_payload_part2 that lead to a NULL pointer dereference in
case state is NULL.
The change was (accidentally?) reverted in
5aa1dfcdf0a429e4941e2eef75b006a8c7a8ac49 and the problem reappeared.
The issue is rather spurious, but I've had it appear when unplugging a
thunderbolt dock.
#regzbot introduced 5aa1dfcdf0a429e4941e2eef75b006a8c7a8ac49