June 2025 - Linux-stable-mirror

[PATCH wireless v2] Revert "wifi: mwifiex: Fix HT40 bandwidth issue."

by Francesco Dolcini

From: Francesco Dolcini <francesco.dolcini(a)toradex.com>

This reverts commit 4fcfcbe457349267fe048524078e8970807c1a5b.

That commit introduces a regression: when HT40 mode is enabled, received packets are lost. This was experienced with W8997 with both SDIO-UART and SDIO-SDIO variants. From an initial investigation the issue resolves on its own after some time, but it's not clear what the reason is.

Given that this was just a performance optimization, let's revert it till we have a better understanding of the issue and a proper fix.

Cc: Jeff Chen <jeff.chen_1(a)nxp.com>
Cc: stable(a)vger.kernel.org
Fixes: 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.")
Closes: https://lore.kernel.org/all/20250603203337.GA109929@francesco-nb/
Signed-off-by: Francesco Dolcini <francesco.dolcini(a)toradex.com>
---
v2: fix reverted commit sha
v1: https://lore.kernel.org/all/20250605100313.34014-1-francesco@dolcini.it/
---
 drivers/net/wireless/marvell/mwifiex/11n.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/11n.c b/drivers/net/wireless/marvell/mwifiex/11n.c
index 738bafc3749b..66f0f5377ac1 100644
--- a/drivers/net/wireless/marvell/mwifiex/11n.c
+++ b/drivers/net/wireless/marvell/mwifiex/11n.c
@@ -403,14 +403,12 @@ mwifiex_cmd_append_11n_tlv(struct mwifiex_private *priv,
 
 		if (sband->ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40 &&
 		    bss_desc->bcn_ht_oper->ht_param &
-		    IEEE80211_HT_PARAM_CHAN_WIDTH_ANY) {
-			chan_list->chan_scan_param[0].radio_type |=
-				CHAN_BW_40MHZ << 2;
+		    IEEE80211_HT_PARAM_CHAN_WIDTH_ANY)
 			SET_SECONDARYCHAN(chan_list->chan_scan_param[0].
 					  radio_type,
 					  (bss_desc->bcn_ht_oper->ht_param &
 					  IEEE80211_HT_PARAM_CHA_SEC_OFFSET));
-		}
+
 		*buffer += struct_size(chan_list, chan_scan_param, 1);
 		ret_len += struct_size(chan_list, chan_scan_param, 1);
 	}
-- 
2.39.5

2 weeks, 3 days


+ mm-gup-revert-mm-gup-fix-infinite-loop-within-__get_longterm_locked.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked"
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
     mm-gup-revert-mm-gup-fix-infinite-loop-within-__get_longterm_locked.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days

------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked"
Date: Wed, 11 Jun 2025 15:13:14 +0200

After commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked") we are able to longterm pin folios that are not supposed to get longterm pinned, simply because they temporarily have the LRU flag cleared (esp. temporarily isolated).

For example, two __get_longterm_locked() callers can race, or __get_longterm_locked() can race with anything else that temporarily isolates folios.

The introducing commit mentions the use case of a driver that uses vm_ops->fault to insert pages allocated through cma_alloc() into the page tables, assuming they can later get longterm pinned. These pages/folios would never have the LRU flag set and consequently cannot get isolated. There is no known in-tree user making use of that so far, fortunately.

To handle that in the future -- and avoid retrying forever to isolate/migrate them -- we will need a different mechanism for the CMA area *owner* to indicate that it actually already allocated the page and is fine with longterm pinning it. The LRU flag is not suitable for that.

Probably we can look up the relevant CMA area and query the bitmap; we only have to care about some races, probably. If already allocated, we could just allow longterm pinning.

Anyhow, let's fix the "must not be longterm pinned" problem first by reverting the original commit.

Link: https://lkml.kernel.org/r/20250611131314.594529-1-david@redhat.com
Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Closes: https://lore.kernel.org/all/20250522092755.GA3277597@tiffany/
Reported-by: Hyesoo Yu <hyesoo.yu(a)samsung.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Zhaoyang Huang <zhaoyang.huang(a)unisoc.com>
Cc: Aijun Sun <aijun.sun(a)unisoc.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/gup.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

--- a/mm/gup.c~mm-gup-revert-mm-gup-fix-infinite-loop-within-__get_longterm_locked
+++ a/mm/gup.c
@@ -2303,13 +2303,13 @@ static void pofs_unpin(struct pages_or_f
 /*
  * Returns the number of collected folios. Return value is always >= 0.
  */
-static void collect_longterm_unpinnable_folios(
+static unsigned long collect_longterm_unpinnable_folios(
 		struct list_head *movable_folio_list,
 		struct pages_or_folios *pofs)
 {
+	unsigned long i, collected = 0;
 	struct folio *prev_folio = NULL;
 	bool drain_allow = true;
-	unsigned long i;
 
 	for (i = 0; i < pofs->nr_entries; i++) {
 		struct folio *folio = pofs_get_folio(pofs, i);
@@ -2321,6 +2321,8 @@ static void collect_longterm_unpinnable_
 		if (folio_is_longterm_pinnable(folio))
 			continue;
 
+		collected++;
+
 		if (folio_is_device_coherent(folio))
 			continue;
 
@@ -2342,6 +2344,8 @@ static void collect_longterm_unpinnable_
 				    NR_ISOLATED_ANON + folio_is_file_lru(folio),
 				    folio_nr_pages(folio));
 	}
+
+	return collected;
 }
 
 /*
@@ -2418,9 +2422,11 @@ static long
 check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
 {
 	LIST_HEAD(movable_folio_list);
+	unsigned long collected;
 
-	collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
-	if (list_empty(&movable_folio_list))
+	collected = collect_longterm_unpinnable_folios(&movable_folio_list,
+						       pofs);
+	if (!collected)
 		return 0;
 
 	return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs);
_

Patches currently in -mm which might be from david(a)redhat.com are

mm-gup-revert-mm-gup-fix-infinite-loop-within-__get_longterm_locked.patch
mm-gup-remove-vm_bug_ons.patch
mm-gup-remove-vm_bug_ons-fix.patch
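
The difference between the two checks in the patch above ("is the isolation list empty?" versus "were any unpinnable folios counted?") can be illustrated with a small userspace model. This is only a sketch: `struct folio_model`, `collect_unpinnable()` and the two `may_pin_*()` helpers are hypothetical names, and the two booleans stand in for the kernel's real folio state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the folio state that matters here. */
struct folio_model {
	bool longterm_pinnable;	/* false: must be migrated before a longterm pin */
	bool on_lru;		/* false: cannot be isolated right now */
};

struct collect_result {
	unsigned long collected;	/* folios that are not longterm pinnable */
	unsigned long isolated;		/* subset that made it onto the movable list */
};

static struct collect_result collect_unpinnable(const struct folio_model *f,
						 size_t n)
{
	struct collect_result r = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (f[i].longterm_pinnable)
			continue;
		r.collected++;
		/* A folio that is temporarily off the LRU cannot be isolated. */
		if (!f[i].on_lru)
			continue;
		r.isolated++;
	}
	return r;
}

/* Check modelled on the reverted commit: an empty list means "nothing to do". */
static bool may_pin_list_check(struct collect_result r)
{
	return r.isolated == 0;
}

/* Check modelled on the code after the revert: decide on the collected count. */
static bool may_pin_count_check(struct collect_result r)
{
	return r.collected == 0;
}
```

With one unpinnable folio that is temporarily off the LRU, the list-based check would let the pin through while the count-based check would not, which is the race described in the commit message.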

2 weeks, 3 days


[REGRESSION] Linux 6.15.1 xen/dom0 domain_crash_sync called from entry.S

by Chuck Zmudzinski

Hi,

I am seeing the following regression between Linux 6.14.8 and 6.15.1.

Kernel version 6.14.8 boots fine but version 6.15.1 crashes and reboots on Xen. I don't know if 6.14.9 or 6.14.10 is affected, or if 6.15 or the 6.15 release candidates are affected because I did not test them.

Also, Linux 6.15.1 boots fine on bare metal without Xen.

Hardware: Intel i5-14500 Raptor Lake CPU, and ASRock B760M PG motherboard and 32 GB RAM.

Xen version: 4.19.2 (mockbuild(a)dynavirt.com) (gcc (GCC) 13.3.1 20240611 (Red Hat 13.3.1-2)) debug=n Sun Apr 13 15:24:29 PDT 2025

Xen Command line: placeholder dom0_mem=2G,max:2G conring_size=32k com1=9600,8n1,0x40c0,16,1:0.0 console=com1

Linux version 6.15.1-1.el9.elrepo.x86_64 (mockbuild@5b7a5dab3b71429898b4f8474fab8fa0) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), GNU ld version 2.35.2-63.el9) #1 SMP PREEMPT_DYNAMIC Wed Jun 4 16:42:58 EDT 2025

Linux Kernel Command line: placeholder root=/dev/mapper/systems-rootalma ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=UUID=2ddc2e3b-8f7b-498b-a4e8-bb4d33a1e5a7 console=hvc0

The Linux 6.15.1 dom0 kernel causes Xen to crash and reboot, here are the last messages on the serial console (includes messages from both dom0 and Xen) before crash:

[ 0.301573] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
[ 0.301577] Register File Data Sampling: Vulnerable: No microcode
[ 0.301581] ITS: Mitigation: Aligned branch/return thunks
[ 0.301594] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.301598] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.301602] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.301605] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.301609] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
(XEN) Pagetable walk from ffffc9003ffffff8:
(XEN) L4[0x192] = 0000000855bee067 0000000000060e56
(XEN) L3[0x000] = 0000000855bed067 0000000000060e55
(XEN) L2[0x1ff] = 0000000855bf0067 0000000000060e58
(XEN) L1[0x1ff] = 8010000855bf2025 0000000000060e5a
(XEN) domain_crash_sync called from entry.S: fault at ffff82d04036e5b0 x86_64/entry.S#domain_crash_page_fault_6x8+0/0x4
(XEN) Domain 0 (vcpu#0) crashed on cpu#11:
(XEN) ----[ Xen-4.19.2 x86_64 debug=n Not tainted ]----
(XEN) CPU: 11
(XEN) RIP: e033:[<ffffffff810014fe>]
(XEN) RFLAGS: 0000000000010206 EM: 1 CONTEXT: pv guest (d0v0)
(XEN) rax: ffffffff81fb12d0 rbx: 000000000000029a rcx: 000000000000000c
(XEN) rdx: 000000000000029a rsi: ffffffff81000b99 rdi: ffffc900400000f0
(XEN) rbp: 000000000000014d rsp: ffffc90040000000 r8: 0000000000000f9c
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 000000000000000c r13: ffffffff82771530 r14: ffffffff827724cc
(XEN) r15: ffffc900400000f0 cr0: 0000000080050033 cr4: 0000000000b526e0
(XEN) cr3: 000000086ae24000 cr2: ffffc9003ffffff8
(XEN) fsb: 0000000000000000 gsb: ffff88819ac55000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffc90040000000:
(XEN) Stack empty.
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

I searched mailing lists but could not find a report similar to what I am seeing here.

I don't know what to try except to git bisect, but I have not done that yet.

Chuck Zmudzinski
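
For readers decoding the "Pagetable walk" lines in the log above, the bracketed indices follow from the faulting address under standard 4-level x86-64 paging (9 index bits per level above a 12-bit page offset). The helper below is a minimal sketch with a hypothetical name, `pt_index()`; it is not taken from Xen or Linux.

```c
#include <stdint.h>

/*
 * Index into one level of a 4-level x86-64 page table.
 * level 1 is the lowest level (L1, 4 KiB pages), level 4 the top (L4).
 * Hypothetical helper for illustration only.
 */
static unsigned int pt_index(uint64_t vaddr, unsigned int level)
{
	unsigned int shift = 12 + 9 * (level - 1);

	return (unsigned int)((vaddr >> shift) & 0x1ff);
}
```

Applied to the cr2 value ffffc9003ffffff8, this yields 0x192, 0x000, 0x1ff and 0x1ff for L4 down to L1, matching the log. The same register dump also shows that cr2 lies exactly 8 bytes below rsp (ffffc90040000000).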

2 weeks, 3 days


[PATCH 0/5] Fixes for ITS mitigation and execmem

by Mike Rapoport

From: "Mike Rapoport (Microsoft)" <rppt(a)kernel.org>

Hi,

Jürgen Groß reported some bugs in the interaction of the ITS mitigation with execmem [1] when running on a Xen PV guest.

These patches fix the issue by moving all the permissions management of ITS memory allocated from execmem into ITS code.

I didn't test on a real Xen PV guest, but I emulated the !PSE variant by force-disabling the ROX cache in x86::execmem_arch_setup().

Peter, I took the liberty of putting your SoB in the patch that actually implements the execmem permissions management in ITS, please let me know if I need to update something about the authorship.

The patches are against v6.15. They are also available in git:
https://web.git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=i…

[1] https://lore.kernel.org/all/20250528123557.12847-2-jgross@suse.com/

Juergen Gross (1):
  x86/mm/pat: don't collapse pages without PSE set

Mike Rapoport (Microsoft) (3):
  x86/Kconfig: only enable ROX cache in execmem when STRICT_MODULE_RWX is set
  x86/its: move its_pages array to struct mod_arch_specific
  Revert "mm/execmem: Unify early execmem_cache behaviour"

Peter Zijlstra (Intel) (1):
  x86/its: explicitly manage permissions for ITS pages

 arch/x86/Kconfig              |  2 +-
 arch/x86/include/asm/module.h |  8 ++++
 arch/x86/kernel/alternative.c | 89 ++++++++++++++++++++++++++---------
 arch/x86/mm/init_32.c         |  3 --
 arch/x86/mm/init_64.c         |  3 --
 arch/x86/mm/pat/set_memory.c  |  3 ++
 include/linux/execmem.h       |  8 +---
 include/linux/module.h        |  5 --
 mm/execmem.c                  | 40 ++--------------
 9 files changed, 82 insertions(+), 79 deletions(-)

base-commit: 0ff41df1cb268fc69e703a08a57ee14ae967d0ca
-- 
2.47.2

2 weeks, 3 days


[PATCH] s390/pkey: prevent overflow in size calculation for memdup_user()

by Fedor Pchelkin

The number of APQN target list entries held in the 'nr_apqns' variable is determined by userspace via an ioctl call, so the product computed for the size passed to memdup_user() may overflow.

In this case the actual size of the allocated area and the value describing it won't be in sync, leading to various types of unpredictable behaviour later.

Return an error if an overflow is detected. Note that this is different from the case where nr_apqns is zero: that case is considered valid and should be handled in subsequent pkey_handler implementations.

Found by Linux Verification Center (linuxtesting.org).

Fixes: f2bbc96e7cfa ("s390/pkey: add CCA AES cipher key support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
 drivers/s390/crypto/pkey_api.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/crypto/pkey_api.c b/drivers/s390/crypto/pkey_api.c
index cef60770f68b..a731fc9c62a7 100644
--- a/drivers/s390/crypto/pkey_api.c
+++ b/drivers/s390/crypto/pkey_api.c
@@ -83,10 +83,15 @@ static void *_copy_key_from_user(void __user *ukey, size_t keylen)
 
 static void *_copy_apqns_from_user(void __user *uapqns, size_t nr_apqns)
 {
+	size_t size;
+
 	if (!uapqns || nr_apqns == 0)
 		return NULL;
 
-	return memdup_user(uapqns, nr_apqns * sizeof(struct pkey_apqn));
+	if (check_mul_overflow(nr_apqns, sizeof(struct pkey_apqn), &size))
+		return ERR_PTR(-EINVAL);
+
+	return memdup_user(uapqns, size);
 }
 
 static int pkey_ioctl_genseck(struct pkey_genseck __user *ugs)
-- 
2.49.0
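
The overflow check in the patch above can be tried out in userspace. The sketch below assumes GCC or Clang, whose `__builtin_mul_overflow()` provides the same kind of checked multiplication as the kernel's `check_mul_overflow()`; `struct apqn_entry` and `copy_apqns()` are hypothetical stand-ins, and `malloc()`/`memcpy()` replace `memdup_user()`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct pkey_apqn; the real layout is in the kernel headers. */
struct apqn_entry {
	uint16_t card;
	uint16_t domain;
};

/*
 * Duplicate nr entries from src into a freshly allocated buffer.
 * Returns 0 on success (*out stays NULL when nr == 0, a valid case),
 * -EINVAL if nr * sizeof(entry) overflows size_t, -ENOMEM on allocation
 * failure.
 */
static int copy_apqns(const struct apqn_entry *src, size_t nr,
		      struct apqn_entry **out)
{
	size_t size;

	*out = NULL;
	if (!src || nr == 0)
		return 0;

	/* Returns true when the mathematical product does not fit in size. */
	if (__builtin_mul_overflow(nr, sizeof(*src), &size))
		return -EINVAL;

	*out = malloc(size);
	if (!*out)
		return -ENOMEM;
	memcpy(*out, src, size);
	return 0;
}
```

A caller passing a huge count such as `SIZE_MAX / 2` gets `-EINVAL` back instead of a buffer that is far smaller than the count suggests.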

2 weeks, 3 days


[PATCH 05/10] drm/amd/display: Check dce_hwseq before dereferencing it

by Aurabindo Pillai

From: Alex Hung <alex.hung(a)amd.com>

[WHAT]
hws was checked for null earlier in dce110_blank_stream, indicating hws can be null, and should be checked whenever it is used.

Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
---
 drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c
index c717cc1eca6d..542468224789 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c
@@ -1227,7 +1227,7 @@ void dce110_blank_stream(struct pipe_ctx *pipe_ctx)
 		return;
 
 	if (link->local_sink && link->local_sink->sink_signal == SIGNAL_TYPE_EDP) {
-		if (!link->skip_implict_edp_power_control)
+		if (!link->skip_implict_edp_power_control && hws)
 			hws->funcs.edp_backlight_control(link, false);
 		link->dc->hwss.set_abm_immediate_disable(pipe_ctx);
 	}
-- 
2.49.0

2 weeks, 3 days


Re: [PATCH 5.10.y 2/3] rtc: Make rtc_time64_to_tm() support dates before 1970

by Alexandre Belloni

Hello Cassio,

On 10/06/2025 21:31:48+0100, Cassio Neri wrote:
> Hi all,
>
> Although untested, I'm pretty sure that with very small changes, the
> previous revision (1d1bb12) can handle dates prior to 1970-01-01 with no
> need to add extra branches or arithmetic operations. Indeed, 1d1bb12
> contains:
>
> <code>
> /* time must be positive */
> days = div_s64_rem(time, 86400, &secs);
>
> /* day of the week, 1970-01-01 was a Thursday */
> tm->tm_wday = (days + 4) % 7;
>
> /* long comments */
>
> udays = ((u32) days) + 719468;
> </code>
>
> This could have been changed to:
>
> <code>
> /* time must be >= -719468 * 86400 which corresponds to 0000-03-01 */
> udays = div_u64_rem(time + 719468 * 86400, 86400, &secs);
>
> /* day of the week, 0000-03-01 was a Wednesday (in the proleptic Gregorian
> calendar) */
> tm->tm_wday = (days + 3) % 7;
>
> /* long comments */
> </code>
>
> Indeed, the addition of 719468 * 86400 to `time` makes `days` to be 719468
> more than it should be. Therefore, in the calculation of `udays`, the
> addition of 719468 becomes unnecessary and thus, `udays == days`. Moreover,
> this means that `days` can be removed altogether and replaced by `udays`.
> (Not the other way around because in the remaining code `udays` must be
> u32.)
>
> Now, 719468 % 7 = 1 and thus tm->wday is 1 day after what it should be and
> we correct that by adding 3 instead of 4.
>
> Therefore, I suggest these changes on top of 1d1bb12 instead of those made
> in 7df4cfe. Since you're working on this, can I please kindly suggest two
> other changes?
>
> 1) Change the reference provided in the long comment. It should say, "The
> following algorithm is, basically, Figure 12 of Neri and Schneider [1]" and
> [1] should refer to the published article:
>
> Neri C, Schneider L. Euclidean affine functions and their application to
> calendar algorithms. Softw Pract Exper. 2023;53(4):937-970. doi:
> 10.1002/spe.3172
> https://doi.org/10.1002/spe.3172
>
> The article is much better written and clearer than the pre-print currently
> referred to.
>

Thanks for your input, I wanted to look again at your paper and make those optimizations which is why I took so long to review the original patch. Unfortunately, I didn't have the time before the merge window.

I would also gladly take patches for this if you are up for the task.

> 2) Function rtc_time64_to_tm_test_date_range in drivers/rtc/lib_test.c, is
> a kunit test that checks the result for everyday in a 160000 years range
> starting at 1970-01-01. It'd be nice if this test is adapted to the new
> code and starts at 1900-01-01 (technically, it could start at 0000-03-01
> but since tm->year counts from 1900, it would be weird to see tm->year ==
> -1900 to mean that the calendar year is 0.) Also 160000 is definitely an
> overkill (my bad!) and a couple of thousands of years, say 3000, should be
> more than safe for anyone. :-)

This is also something on my radar as some have been complaining about the time it takes to run those tests.

>
> Many thanks,
> Cassio.
>
>
>
> On Tue, 10 Jun 2025 at 08:35, Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
> wrote:
>
> > From: Alexandre Mergnat <amergnat(a)baylibre.com>
> >
> > commit 7df4cfef8b351fec3156160bedfc7d6d29de4cce upstream.
> >
> > Conversion of dates before 1970 is still relevant today because these
> > dates are reused on some hardwares to store dates bigger than the
> > maximal date that is representable in the device's native format.
> > This prominently and very soon affects the hardware covered by the
> > rtc-mt6397 driver that can only natively store dates in the interval
> > 1900-01-01 up to 2027-12-31. So to store the date 2028-01-01 00:00:00
> > to such a device, rtc_time64_to_tm() must do the right thing for
> > time=-2208988800.
> >
> > Signed-off-by: Alexandre Mergnat <amergnat(a)baylibre.com>
> > Reviewed-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
> > Link:
> > https://lore.kernel.org/r/20250428-enable-rtc-v4-1-2b2f7e3f9349@baylibre.com
> > Signed-off-by: Alexandre Belloni <alexandre.belloni(a)bootlin.com>
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
> > ---
> >  drivers/rtc/lib.c | 24 +++++++++++++++++++-----
> >  1 file changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/rtc/lib.c b/drivers/rtc/lib.c
> > index fe361652727a..13b5b1f20465 100644
> > --- a/drivers/rtc/lib.c
> > +++ b/drivers/rtc/lib.c
> > @@ -46,24 +46,38 @@ EXPORT_SYMBOL(rtc_year_days);
> >   * rtc_time64_to_tm - converts time64_t to rtc_time.
> >   *
> >   * @time:	The number of seconds since 01-01-1970 00:00:00.
> > - *		(Must be positive.)
> > + *		Works for values since at least 1900
> >   * @tm:		Pointer to the struct rtc_time.
> >   */
> >  void rtc_time64_to_tm(time64_t time, struct rtc_time *tm)
> >  {
> > -	unsigned int secs;
> > -	int days;
> > +	int days, secs;
> >
> >  	u64 u64tmp;
> >  	u32 u32tmp, udays, century, day_of_century, year_of_century, year,
> >  		day_of_year, month, day;
> >  	bool is_Jan_or_Feb, is_leap_year;
> >
> > -	/* time must be positive */
> > +	/*
> > +	 * Get days and seconds while preserving the sign to
> > +	 * handle negative time values (dates before 1970-01-01)
> > +	 */
> >  	days = div_s64_rem(time, 86400, &secs);
> >
> > +	/*
> > +	 * We need 0 <= secs < 86400 which isn't given for negative
> > +	 * values of time. Fixup accordingly.
> > +	 */
> > +	if (secs < 0) {
> > +		days -= 1;
> > +		secs += 86400;
> > +	}
> > +
> >  	/* day of the week, 1970-01-01 was a Thursday */
> >  	tm->tm_wday = (days + 4) % 7;
> > +	/* Ensure tm_wday is always positive */
> > +	if (tm->tm_wday < 0)
> > +		tm->tm_wday += 7;
> >
> >  	/*
> >  	 * The following algorithm is, basically, Proposition 6.3 of Neri
> > @@ -93,7 +107,7 @@ void rtc_time64_to_tm(time64_t time, struct rtc_time
> > *tm)
> >  	 * thus, is slightly different from [1].
> >  	 */
> >
> > -	udays = ((u32) days) + 719468;
> > +	udays = days + 719468;
> >
> >  	u32tmp = 4 * udays + 3;
> >  	century = u32tmp / 146097;
> > --
> > 2.49.0
> >
> >

2 weeks, 3 days
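
The two approaches discussed in the rtc_time64_to_tm() thread above (signed division with a fix-up, versus shifting the origin to 0000-03-01 before an unsigned division) can be compared in a small userspace sketch. The function and struct names are hypothetical, plain C division replaces `div_s64_rem()`/`div_u64_rem()`, and the shifted variant reads the weekday from `udays`, which is an interpretation of the suggestion rather than code from either author.

```c
#include <stdint.h>

#define SECS_PER_DAY 86400
/* Days from 0000-03-01 to 1970-01-01, the constant quoted in the thread. */
#define EPOCH_SHIFT_DAYS 719468

struct day_split {
	int64_t days;	/* whole days since 1970-01-01, negative before it */
	int32_t secs;	/* seconds into that day, 0 <= secs < 86400 */
	int wday;	/* 0 = Sunday ... 6 = Saturday */
};

/* Variant 1: signed division followed by fix-ups, as in the quoted patch. */
static struct day_split split_signed(int64_t time)
{
	struct day_split s;

	/* C division truncates toward zero, so secs may come out negative. */
	s.days = time / SECS_PER_DAY;
	s.secs = (int32_t)(time % SECS_PER_DAY);
	if (s.secs < 0) {
		s.days -= 1;
		s.secs += SECS_PER_DAY;
	}

	/* 1970-01-01 was a Thursday (4). */
	s.wday = (int)((s.days + 4) % 7);
	if (s.wday < 0)
		s.wday += 7;
	return s;
}

/*
 * Variant 2: move the origin to 0000-03-01, then divide unsigned.
 * Only valid for time >= -719468 * 86400. The multiplication is done
 * in 64 bits because the product does not fit in a 32-bit int.
 */
static struct day_split split_shifted(int64_t time)
{
	struct day_split s;
	uint64_t shifted = (uint64_t)(time + INT64_C(719468) * SECS_PER_DAY);
	uint64_t udays = shifted / SECS_PER_DAY;

	s.secs = (int32_t)(shifted % SECS_PER_DAY);
	s.days = (int64_t)udays - EPOCH_SHIFT_DAYS;
	/* 0000-03-01 was a Wednesday (3) in the proleptic Gregorian calendar. */
	s.wday = (int)((udays + 3) % 7);
	return s;
}
```

For time = -2208988800 (1900-01-01 00:00:00, the value named in the commit message) both variants give days = -25567, secs = 0 and a Monday.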


[PATCH] Revert "block: don't reorder requests in blk_add_rq_to_plug"

by Hazem Mohamed Abuelfotoh

This reverts commit e70c301faece15b618e54b613b1fd6ece3dd05b4.

Commit <e70c301faece> ("block: don't reorder requests in blk_add_rq_to_plug") reversed how requests are stored in the blk_plug list. This had a significant impact on merging bios with requests that already exist on the plug list. The impact has been reported in [1] and is easily reproducible using a 4k randwrite fio benchmark on an NVMe-based SSD without any filesystem on the disk.

My benchmark is:

    fio --time_based --name=benchmark --size=50G --rw=randwrite \
        --runtime=60 --filename="/dev/nvme1n1" --ioengine=psync \
        --randrepeat=0 --iodepth=1 --fsync=64 --invalidate=1 \
        --verify=0 --verify_fatal=0 --blocksize=4k --numjobs=4 \
        --group_reporting

On a 1.9TiB SSD (180K max IOPS) attached to an i3.16xlarge AWS EC2 instance.

Kernel        | fio (B.W MiB/sec)   | I/O size (iostat)
--------------+---------------------+--------------------
6.15.1        | 362                 | 2KiB
6.15.1+revert | 660 (+82%)          | 4KiB
--------------+---------------------+--------------------

I ran iostat while the fio benchmark was running and saw that the I/O size seen on the disk is 2KB without this revert and 4KB with the revert. In the bad case the write bandwidth is capped at around 362MiB/sec, which is almost 2KiB * 180K IOPS, so we are hitting the SSD IOPS limit of 180K. After the revert the I/O size doubles to 4KiB, hence the bandwidth almost doubles as we no longer hit the disk IOPS limit.

I did some tracing using bpftrace & bcc and concluded that the reason behind the I/O size discrepancy is that this fio benchmark is submitting each 4k I/O as 2 contiguous 2KB bios. In the good case each pair of bios is merged into a 4KB request that is then submitted to the disk, while in the bad case 2K bios are submitted to the disk without merging because blk_attempt_plug_merge() failed to merge them, as seen below.

**Without the revert**

[12:12:28]
r::blk_attempt_plug_merge():int:$retval
        COUNT      EVENT
        5618       $retval = 1
        176578     $retval = 0

**With the revert**

[12:11:43]
r::blk_attempt_plug_merge():int:$retval
        COUNT      EVENT
        146684     $retval = 0
        146686     $retval = 1

In blk_attempt_plug_merge() we iterate through the plug list from head to tail looking for a request with which we can merge the most recently submitted bio.

With commit <e70c301faece> ("block: don't reorder requests in blk_add_rq_to_plug") the most recent request will be at the tail, so blk_attempt_plug_merge() will fail because it tries to merge the bio with the plug list head. In blk_attempt_plug_merge() we don't iterate across the whole plug list because we exit the loop once we fail merging in blk_attempt_bio_merge().

In commit <bc490f81731> ("block: change plugging to use a singly linked list") the plug list was changed to a singly linked list, so there's no way to iterate the list from tail to head, which is the only way to mitigate the impact on bio merging if we want to keep commit <e70c301faece> ("block: don't reorder requests in blk_add_rq_to_plug").

Given that moving the plug list to a singly linked list was mainly for performance reasons, let's revert commit <e70c301faece> ("block: don't reorder requests in blk_add_rq_to_plug") for now to mitigate the reported performance regression.

[1] https://lore.kernel.org/lkml/202412122112.ca47bcec-lkp@intel.com/

Cc: stable(a)vger.kernel.org # 6.12
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Reported-by: Hagar Hemdan <hagarhem(a)amazon.com>
Reported-and-bisected-by: Shaoying Xu <shaoyi(a)amazon.com>
Signed-off-by: Hazem Mohamed Abuelfotoh <abuehaze(a)amazon.com>
---
 block/blk-mq.c             | 4 ++--
 drivers/block/virtio_blk.c | 2 +-
 drivers/nvme/host/pci.c    | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c2697db59109..28965cac19fb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1394,7 +1394,7 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
 	 */
 	if (!plug->has_elevator && (rq->rq_flags & RQF_SCHED_TAGS))
 		plug->has_elevator = true;
-	rq_list_add_tail(&plug->mq_list, rq);
+	rq_list_add_head(&plug->mq_list, rq);
 	plug->rq_count++;
 }
 
@@ -2846,7 +2846,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
 			rq_list_add_tail(&requeue_list, rq);
 			continue;
 		}
-		list_add_tail(&rq->queuelist, &list);
+		list_add(&rq->queuelist, &list);
 		depth++;
 	} while (!rq_list_empty(&plug->mq_list));
 
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 7cffea01d868..7992a171f905 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -513,7 +513,7 @@ static void virtio_queue_rqs(struct rq_list *rqlist)
 			vq = this_vq;
 
 		if (virtblk_prep_rq_batch(req))
-			rq_list_add_tail(&submit_list, req);
+			rq_list_add_head(&submit_list, req); /* reverse order */
 		else
 			rq_list_add_tail(&requeue_list, req);
 	}
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f1dd804151b1..5f7da42f9dac 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1026,7 +1026,7 @@ static void nvme_queue_rqs(struct rq_list *rqlist)
 		nvmeq = req->mq_hctx->driver_data;
 
 		if (nvme_prep_rq_batch(nvmeq, req))
-			rq_list_add_tail(&submit_list, req);
+			rq_list_add_head(&submit_list, req); /* reverse order */
 		else
 			rq_list_add_tail(&requeue_list, req);
 	}
-- 
2.47.1
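
The merging behaviour described in the message above can be reproduced with a toy model. This is a deliberately simplified sketch, not kernel code: `struct req`, `struct plug`, `plug_add()`, `plug_try_merge()` and `count_merges()` are hypothetical names, the merge attempt looks only at the list head (the message says the real loop stops after the first failed attempt), and each "write" is two contiguous 4-sector bios at otherwise unrelated offsets.

```c
#include <stdbool.h>
#include <stddef.h>

struct req {
	unsigned long sector;		/* first sector of the request */
	unsigned long nr_sectors;	/* length in sectors */
	struct req *next;
};

struct plug {
	struct req *head;
	struct req *tail;
	unsigned int merged;		/* bios merged into an existing request */
};

static void plug_add(struct plug *p, struct req *r, bool at_head)
{
	r->next = NULL;
	if (!p->head) {
		p->head = p->tail = r;
	} else if (at_head) {
		r->next = p->head;
		p->head = r;
	} else {
		p->tail->next = r;
		p->tail = r;
	}
}

/* Back-merge attempt against the list head only (model assumption). */
static bool plug_try_merge(struct plug *p, unsigned long sector,
			   unsigned long nr)
{
	struct req *r = p->head;

	if (!r || r->sector + r->nr_sectors != sector)
		return false;
	r->nr_sectors += nr;
	p->merged++;
	return true;
}

/* Submit four writes, each split into two contiguous bios; count merges. */
static unsigned int count_merges(bool at_head)
{
	struct req slots[8];
	struct plug p = { NULL, NULL, 0 };
	size_t used = 0;

	for (unsigned long w = 0; w < 4; w++) {
		unsigned long base = 1000 * (w + 1);	/* scattered offsets */

		for (unsigned long half = 0; half < 2; half++) {
			unsigned long sector = base + 4 * half;

			if (plug_try_merge(&p, sector, 4))
				continue;
			slots[used].sector = sector;
			slots[used].nr_sectors = 4;
			plug_add(&p, &slots[used], at_head);
			used++;
		}
	}
	return p.merged;
}
```

In this model, head insertion merges every pair, while tail insertion merges only the first pair, because the list head stays pinned to the oldest request.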

2 weeks, 3 days


[PATCH] pinctrl: qcom: msm: mark certain pins as invalid for interrupts

by Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>

When requesting pins whose intr_detection_width setting is not 1 or 2 for interrupts (for example by running `gpiomon -c 0 113` on RB2), we'll hit a BUG() in msm_gpio_irq_set_type(). Potentially crashing the kernel due to an invalid request from user-space is not optimal, so let's go through the pins and mark those that would fail the check as invalid for the irq chip as we should not even register them as available irqs.

This function can be extended if we determine that there are more corner-cases like this.

Fixes: f365be092572 ("pinctrl: Add Qualcomm TLMM driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
---
 drivers/pinctrl/qcom/pinctrl-msm.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index f012ea88aa22c..77e0c2f023455 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -1038,6 +1038,24 @@ static bool msm_gpio_needs_dual_edge_parent_workaround(struct irq_data *d,
 	       test_bit(d->hwirq, pctrl->skip_wake_irqs);
 }
 
+static void msm_gpio_irq_init_valid_mask(struct gpio_chip *gc,
+					 unsigned long *valid_mask,
+					 unsigned int ngpios)
+{
+	struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+	const struct msm_pingroup *g;
+	int i;
+
+	bitmap_fill(valid_mask, ngpios);
+
+	for (i = 0; i < ngpios; i++) {
+		g = &pctrl->soc->groups[i];
+		if (g->intr_detection_width != 1 &&
+		    g->intr_detection_width != 2)
+			clear_bit(i, valid_mask);
+	}
+}
+
 static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
 {
 	struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
@@ -1441,6 +1459,7 @@ static int msm_gpio_init(struct msm_pinctrl *pctrl)
 	girq->default_type = IRQ_TYPE_NONE;
 	girq->handler = handle_bad_irq;
 	girq->parents[0] = pctrl->irq;
+	girq->init_valid_mask = msm_gpio_irq_init_valid_mask;
 
 	ret = gpiochip_add_data(&pctrl->chip, pctrl);
 	if (ret) {
-- 
2.48.1
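
The fill-then-clear pattern used by the patch above can be modelled in userspace as follows. This is a sketch under stated assumptions: `struct pingroup_model`, `mask_set()`, `mask_test()` and `init_valid_mask()` are hypothetical names, and the hand-rolled bit helpers stand in for the kernel's `bitmap_fill()`, `clear_bit()` and `test_bit()`.

```c
#include <limits.h>
#include <stdbool.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the one msm_pingroup field that matters here. */
struct pingroup_model {
	unsigned int intr_detection_width;
};

static void mask_set(unsigned long *mask, unsigned int bit, bool on)
{
	unsigned long m = 1UL << (bit % WORD_BITS);

	if (on)
		mask[bit / WORD_BITS] |= m;
	else
		mask[bit / WORD_BITS] &= ~m;
}

static bool mask_test(const unsigned long *mask, unsigned int bit)
{
	return (mask[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1UL;
}

/* Mark every pin valid, then clear those whose width is neither 1 nor 2. */
static void init_valid_mask(unsigned long *mask,
			    const struct pingroup_model *groups,
			    unsigned int ngpios)
{
	unsigned int i;

	for (i = 0; i < ngpios; i++)
		mask_set(mask, i, true);

	for (i = 0; i < ngpios; i++) {
		if (groups[i].intr_detection_width != 1 &&
		    groups[i].intr_detection_width != 2)
			mask_set(mask, i, false);
	}
}
```

A pin whose bit is cleared is never offered as an interrupt source, so a request for it is rejected up front rather than reaching the type-setting code.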

2 weeks, 3 days


[PATCH v3 0/3] m68k: Bug fix and cleanup for framebuffer debug console

by Finn Thain

This series has a bug fix for the early bootconsole as well as some related efficiency improvements and cleanup.

The relevant code is subject to CONSOLE_DEBUG, which is presently only used with CONFIG_MAC. To test this series (in qemu-system-m68k, for example) it's helpful to enable CONFIG_EARLY_PRINTK and CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER and boot with kernel parameters 'console=ttyS0 earlyprintk keep_bootcon'.

---
Changed since v1:
 - Solved problem with line wrap while scrolling.
 - Added two additional patches.

Changed since v2:
 - Adopted addq and subq as suggested by Andreas.

Finn Thain (3):
  m68k: Fix lost column on framebuffer debug console
  m68k: Avoid pointless recursion in debug console rendering
  m68k: Remove unused "cursor home" code from debug console

 arch/m68k/kernel/head.S | 73 +++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 36 deletions(-)

-- 
2.45.3

2 weeks, 3 days

