Guangshuo Li wrote:
> Hi Alison, Dave, and all,
>
> Thanks for the feedback. I’ve adopted your suggestions. Below is what I
> plan to take in v3.
I would just post v3. The review tags given on that version will be
picked up when the patch is merged if it is ok.
Thanks,
Ira
[snip]
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Commit 43c51bb573aa ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.
Therefore, remove useless re-enable of enhanced features in
sc16is7xx_set_baud().
Fixes: 43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
---
drivers/tty/serial/sc16is7xx.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 1a2c4c14f6aac..c7435595dce13 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -588,13 +588,6 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
div /= prescaler;
}
- /* Enable enhanced features */
- sc16is7xx_efr_lock(port);
- sc16is7xx_port_update(port, SC16IS7XX_EFR_REG,
- SC16IS7XX_EFR_ENABLE_BIT,
- SC16IS7XX_EFR_ENABLE_BIT);
- sc16is7xx_efr_unlock(port);
-
/* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
--
2.39.5
The comedi_buf_munge() function performs a modulo operation
`async->munge_chan %= async->cmd.chanlist_len` without first
checking if chanlist_len is zero. If a user program submits a command with
chanlist_len set to zero, this causes a divide-by-zero error when the device
processes data in the interrupt handler path.
Add a check for zero chanlist_len at the beginning of the
function, similar to the existing checks for !map and
CMDF_RAWDATA flag. When chanlist_len is zero, update
munge_count and return early, indicating the data was
handled without munging.
This prevents potential kernel panics from malformed user commands.
Reported-by: syzbot+f6c3c066162d2c43a66c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
v2: Merged the chanlist_len check with existing early return
check as suggested by Ian Abbott
---
drivers/comedi/comedi_buf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/comedi/comedi_buf.c b/drivers/comedi/comedi_buf.c
index 002c0e76baff..c7c262a2d8ca 100644
--- a/drivers/comedi/comedi_buf.c
+++ b/drivers/comedi/comedi_buf.c
@@ -317,7 +317,7 @@ static unsigned int comedi_buf_munge(struct comedi_subdevice *s,
unsigned int count = 0;
const unsigned int num_sample_bytes = comedi_bytes_per_sample(s);
- if (!s->munge || (async->cmd.flags & CMDF_RAWDATA)) {
+ if (!s->munge || (async->cmd.flags & CMDF_RAWDATA) || async->cmd.chanlist_len == 0) {
async->munge_count += num_bytes;
return num_bytes;
}
--
2.43.0
###**🌟 فرصتك للنشر في مجلات علمية محكمة دوليًا – دعوة مفتوحة للباحثين والأكاديميين**
**انضم إلى النخبة العلمية… وانشر أبحاثك معنا.**
**انطلاقًا من إيماننا بأن البحث العلمي هو الركيزة الأساسية لنهضة المجتمعات وتقدمها،**
يسر*فكر للدراسات والتطوير* أن تتشرف بدعوتكم للنشر في مجلاتها العلمية المحكمة، التي تمثل منبرًا أكاديميًا رصينًا لاحتضان الأفكار الأصيلة، والمشاريع البحثية الجادة، والرؤى التي تسهم في إنتاج معرفة تطبيقية تخدم قضايا الإنسان والمجتمع.
إن دعوتنا هذه تأتي ضمن رؤيتنا في تمكين الباحثين والكتّاب في العالم العربي والإسلامي من إيصال أبحاثهم إلى أوسع نطاق، والمساهمة الفاعلة في الحراك العلمي محليًا ودوليًا، عبر منصات نشر معتمدة وموثوقة.
➡️ **قدّم مخطوطتك الآن:** [https://7m8ue.r.ag.d.sendibm3.com/mk/cl/f/sh/WCPzyXJTZ7nvI8YYc4qB1knr1PmTmi…
####عامل تأثير عربي=2.7
مجلة ريحان للنشر العلمي
-----------------------
(Rihan Journal for Scientific Publishing)
###$50
*مجلة علمية دولية، محكّمة، شهرية، مفتوحة الوصول، تصدر عن مركز فكر للدراسات والتطوير.*
رقم التسلسل المعياري الدولي: **ISSN-E: 2709-2097**
تستقبل المجلة الأبحاث والمقالات العلمية بثلاث لغات: **العربية، الإنجليزية، والتركية**، في مختلف التخصصات، وتخضع جميع المواد المقدمة لعملية تحكيم علمي صارمة، تضمن جودة المحتوى وموثوقيته الأكاديمية.
####عامل تأثير عربي=1.7
مجلة ايبرس للنشر الطبي
----------------------
###$50
*مجلة علمية دولية، محكّمة، فصلية، مفتوحة الوصول، تصدر عن مركز فكر للدراسات والتطوير.*
رقم التسلسل المعياري الدولي: **ISSN-E: 2959-5371**
تأسست مجلة إيبرس للنشر الطبي عام 2022، لتكون منصة علمية رصينة في مجال العلوم الطبية والصحية، ووجهة موثوقة للباحثين والطلبة وأعضاء الهيئات التدريسية والأكاديميين لنشر أبحاثهم الأصيلة، ومراجعاتهم العلمية، ومشاركتهم في تطوير المعرفة الطبية الحديثة.
تستقبل المجلة الدراسات والأبحاث باللغة **العربية، الإنجليزية، والتركية**، وتخضع جميع المواد لعملية تحكيم علمي دقيقة، وفق أعلى المعايير الأكاديمية الدولية.
مجلة طُوى للعلوم الاجتماعية
---------------------------
###$50
*مجلة علمية دولية، محكّمة، فصلية، مفتوحة الوصول، تصدر عن مركز فكر للدراسات والتطوير.*
رقم التسلسل المعياري الدولي:
**ISSN: 3104-7211**
**مجلة طُوى** هي منصة أكاديمية تُعنى بنشر البحوث والدراسات الأصيلة في **مجالات العلوم الاجتماعية المتداخلة**، بما يسهم في إنتاج معرفة تحليلية معمقة حول قضايا الإنسان والمجتمع. تلتزم المجلة بمعايير **التحكيم العلمي الرصين**، وتُرحّب بالأعمال المقدّمة باللغات: **العربية، الإنجليزية، والتركية**.
جاء اختيار اسم "طُوى" استلهامًا من **الوادي المقدّس طُوى**، تعبيرًا عن قدسية المعرفة، وإيمانًا بأن الفكر العلمي رسالة إنسانية لا تقل أثرًا وعمقًا عن أي رسالة تغيير أو بناء مجتمعي.
مجلة زنوبيا لدراسات المرأة والطفل والاسرة
-----------------------------------------
###$50
*مجلة علمية دولية، محكّمة، فصلية، مفتوحة الوصول، تصدر عن مركز فكر للدراسات والتطوير*
بالتعاون مع **المنتدى الثقافي النسائي السوري**
رقم التسلسل المعياري الدولي:
**ISSN: 3104-7874**
تُعنى **مجلة زنوبيا** بنشر البحوث والدراسات الأكاديمية التي تتناول قضايا **المرأة، والطفل، والأسرة** من زوايا علمية، اجتماعية، وثقافية متعددة، مع التركيز على التحديات والتحولات المعاصرة التي تمس هذه الفئات داخل السياقات العربية والعالمية. وتوفّر المجلة منبرًا أكاديميًا موثوقًا يعزز التفكير النقدي والتحليل العلمي البنّاء، ويسعى لإبراز التجارب والرؤى التي تدعم التمكين المجتمعي.
مجلة زكا للعلوم المالية والاقتصادية والإدارية
---------------------------------------------
###$50
*مجلة علمية دولية، محكّمة، فصلية، مفتوحة الوصول، تصدر عن مركز فكر للدراسات والتطوير.*
رقم التسلسل المعياري الدولي
**ISSN: 3104-7289**
تُعنى **مجلة زكا** بنشر أبحاث علمية أصيلة وعالية الجودة في مجالات: **إدارة الأعمال، الاقتصاد، المحاسبة، العلوم المالية، التسويق، الاقتصاد الإسلامي، الإدارة العامة، والتمويل**.
وتمثل المجلة منصة أكاديمية رصينة تستهدف الباحثين، الأكاديميين، وطلبة الدراسات العليا في التخصصات المالية والإدارية، وتسعى إلى الإسهام الفعّال في تطوير المعرفة الاقتصادية والإدارية، وفقًا لأحدث المعايير العلمية والمنهجيات البحثية الحديثة.
مجلة روح للعلوم الإنسانية
-------------------------
###$50
**(RUH Journal of Humanities)**
*مجلة علمية دولية، محكّمة، فصلية، مفتوحة الوصول، تصدر عن مركز فكر للدراسات والتطوير.*
رقم التسلسل المعياري الدولي:
**ISSN: 3105-2436**
تهدف **مجلة روح للعلوم الإنسانية** إلى نشر الأبحاث العلمية المتميزة في مختلف مجالات العلوم الإنسانية، وتعزيز الفكر الأكاديمي والنقاش العلمي المتخصص في القضايا الإنسانية المعاصرة على المستويين العربي والعالمي.
تُعد المجلة منصة أكاديمية رصينة تجمع الباحثين من تخصصات متعددة لتبادل المعرفة، وتحفيز الدراسات الرصينة ذات الأثر المجتمعي والثقافي، وتوسيع دائرة الحوار العلمي حول الإنسان والمجتمع.
جاء اختيار اسم **"روح"** ليعكس جوهر العلوم الإنسانية، التي تنطلق من فهم الإنسان: مشاعره، ثقافته، سلوكه، وتاريخه.
تشير الكلمة إلى **العنصر الحيوي** الذي يمنح الإنسان والمجتمع معناهما، مما يرسّخ رؤية المجلة في أن **العلوم الإنسانية هي الروح النابضة لفهم المجتمعات وتطورها**.
دعوة للتبرع
------------
###$10
###🎓 ساهم في نهضة العلم في سوريا – كن جزءًا من التغيير الحقيقي!
إذا كنت تؤمن بقوة المعرفة وأهمية دعم البحث العلمي في سوريا، فقد حان الوقت لتشارك في صناعة الأمل.
**تبرعك اليوم لا يُقدَّر بثمن…**
إنه استثمار في العقول، وفي جيل جديد من الباحثين والمفكرين الذين يسعون لإعادة بناء المجتمع على أسس علمية وإنسانية.
📢 **ساهم في صناعة الأمل… تبرعك اليوم يصنع فرقاً حقيقياً غداً!**
المؤتمرات العلمية
-----------------
###$100
###🏛️ **مركز فكر للمؤتمرات العلمية**
**مركز فكر للمؤتمرات العلمية** هو أحد برامج مركز فكر للدراسات والتطوير، يُعنى بتنظيم المؤتمرات والملتقيات العلمية المتخصصة، بهدف دعم البحث الأكاديمي وتوسيع دائرة الحوار المعرفي حول القضايا المعاصرة التي تهم المجتمعات العربية والعالمية.
ينطلق المركز من رؤية تؤمن بأن **المؤتمر العلمي ليس مجرد حدث أكاديمي، بل هو منصة استراتيجية لإنتاج المعرفة، وتبادل الخبرات، وصياغة حلول مستندة إلى البحث العلمي**.
###📚 **مجالات المؤتمرات:**
العلوم الإنسانية والاجتماعية
العلوم الطبية والصحية
الدراسات الاقتصادية والمالية
دراسات المرأة والطفل والأسرة
البيئة والتنمية المستدامة
التكنولوجيا والتحول الرقمي
التعليم والتربية
**مع خالص الشكر والتقدير،**
============================
**نثمّن وقتكم واهتمامكم، ونتطلع إلى تعاون مثمر يجمعنا في خدمة البحث العلمي والمجتمع.**
======================================================================================
**نؤمن أن العمل المشترك هو مفتاح التغيير الحقيقي، ونرحّب بكم دائمًا ضمن شبكتنا العلمية والمجتمعية.**
====================================================================================================
➡️ **قدّم مخطوطتك الآن:** [https://forms.gle/g7McfPrkaYYDFC2N6](https://forms.gle/g7McfPrkaYYDFC2N6)
Maria Abdel Rahim
[pr@rjsp.org](mailto:ahmet@rjsp.org)
rihanjournal(a)gmail.com
[00905306359001](https://)
gazimuhtarpas
29600,gazantap
[Unsubscribe](https://7m8ue.r.ag.d.sendibm3.com/mk/un/v2/sh/7nVTPdbLJ2bPbEmD…
From: Lance Yang <lance.yang(a)linux.dev>
When both THP and MTE are enabled, splitting a THP and replacing its
zero-filled subpages with the shared zeropage can cause MTE tag mismatch
faults in userspace.
Remapping zero-filled subpages to the shared zeropage is unsafe, as the
zeropage has a fixed tag of zero, which may not match the tag expected by
the userspace pointer.
KSM already avoids this problem by using memcmp_pages(), which on arm64
intentionally reports MTE-tagged pages as non-identical to prevent unsafe
merging.
As suggested by David[1], this patch adopts the same pattern, replacing the
memchr_inv() byte-level check with a call to pages_identical(). This
leverages existing architecture-specific logic to determine if a page is
truly identical to the shared zeropage.
Having both the THP shrinker and KSM rely on pages_identical() makes the
design more future-proof, IMO. Instead of handling quirks in generic code,
we just let the architecture decide what makes two pages identical.
[1] https://lore.kernel.org/all/ca2106a3-4bb2-4457-81af-301fd99fbef4@redhat.com
Cc: <stable(a)vger.kernel.org>
Reported-by: Qun-wei Lin <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/a7944523fcc3634607691c35311a5d59d1a3f8d4.camel@…
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
Tested on x86_64 and on QEMU for arm64 (with and without MTE support),
and the fix works as expected.
mm/huge_memory.c | 15 +++------------
mm/migrate.c | 8 +-------
2 files changed, 4 insertions(+), 19 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 32e0ec2dde36..28d4b02a1aa5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4104,29 +4104,20 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
static bool thp_underused(struct folio *folio)
{
int num_zero_pages = 0, num_filled_pages = 0;
- void *kaddr;
int i;
for (i = 0; i < folio_nr_pages(folio); i++) {
- kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
- if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
- num_zero_pages++;
- if (num_zero_pages > khugepaged_max_ptes_none) {
- kunmap_local(kaddr);
+ if (pages_identical(folio_page(folio, i), ZERO_PAGE(0))) {
+ if (++num_zero_pages > khugepaged_max_ptes_none)
return true;
- }
} else {
/*
* Another path for early exit once the number
* of non-zero filled pages exceeds threshold.
*/
- num_filled_pages++;
- if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
- kunmap_local(kaddr);
+ if (++num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none)
return false;
- }
}
- kunmap_local(kaddr);
}
return false;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index aee61a980374..ce83c2c3c287 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -300,9 +300,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
unsigned long idx)
{
struct page *page = folio_page(folio, idx);
- bool contains_data;
pte_t newpte;
- void *addr;
if (PageCompound(page))
return false;
@@ -319,11 +317,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
* this subpage has been non present. If the subpage is only zero-filled
* then map it to the shared zeropage.
*/
- addr = kmap_local_page(page);
- contains_data = memchr_inv(addr, 0, PAGE_SIZE);
- kunmap_local(addr);
-
- if (contains_data)
+ if (!pages_identical(page, ZERO_PAGE(0)))
return false;
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
--
2.49.0
This is the mail system at host zihnyunrui.com.
I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.
For further assistance, please send mail to postmaster.
If you do so, please include this problem report. You can
delete your own text from the attached returned message.
The mail system
<linux-stable-mirror(a)lists.linaro.org>: host lists.linaro.org[3.208.193.21]
said: 554 5.7.1 Spam message rejected (in reply to end of DATA command)
The comedi_buf_munge() function performs a modulo operation
`async->munge_chan %= async->cmd.chanlist_len` without first
checking if chanlist_len is zero. If a user program submits
a command with chanlist_len set to zero, this causes a
divide-by-zero error when the device processes data in the
interrupt handler path.
Add a check for zero chanlist_len at the beginning of the
function, similar to the existing checks for !map and
CMDF_RAWDATA flag. When chanlist_len is zero, update
munge_count and return early, indicating the data was
handled without munging.
This prevents potential kernel panics from malformed user commands.
Reported-by: syzbot+f6c3c066162d2c43a66c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
drivers/comedi/comedi_buf.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/comedi/comedi_buf.c b/drivers/comedi/comedi_buf.c
index 002c0e76baff..786f888299ce 100644
--- a/drivers/comedi/comedi_buf.c
+++ b/drivers/comedi/comedi_buf.c
@@ -321,6 +321,11 @@ static unsigned int comedi_buf_munge(struct comedi_subdevice *s,
async->munge_count += num_bytes;
return num_bytes;
}
+
+ if (async->cmd.chanlist_len == 0) {
+ async->munge_count += num_bytes;
+ return num_bytes;
+ }
/* don't munge partial samples */
num_bytes -= num_bytes % num_sample_bytes;
--
2.43.0
Fix incorrect use of PTR_ERR_OR_ZERO() in topology_parse_cpu_capacity()
which causes the code to proceed with NULL clock pointers. The current
logic uses !PTR_ERR_OR_ZERO(cpu_clk) which evaluates to true for both
valid pointers and NULL, leading to potential NULL pointer dereference
in clk_get_rate().
Per include/linux/err.h documentation, PTR_ERR_OR_ZERO(ptr) returns:
"The error code within @ptr if it is an error pointer; 0 otherwise."
This means PTR_ERR_OR_ZERO() returns 0 for both valid pointers AND NULL
pointers. Therefore !PTR_ERR_OR_ZERO(cpu_clk) evaluates to true (proceed)
when cpu_clk is either valid or NULL, causing clk_get_rate(NULL) to be
called when of_clk_get() returns NULL.
Replace with !IS_ERR_OR_NULL(cpu_clk) which only proceeds for valid
pointers, preventing potential NULL pointer dereference in clk_get_rate().
Fixes: b8fe128dad8f ("arch_topology: Adjust initial CPU capacities with current freq")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar(a)intel.com>
---
Changes in v4:
- recipient list adjustment as per kernel patch review process
Changes in v3:
- Used accurate "function call properties" terminology in commit description
(suggested by Markus Elfring)
- Added stable backport justification
- Removed duplicate marker line per kernel documentation
Changes in v2:
- Refined description based on documented macro properties (suggested by Markus Elfring)
- Added proper Fixes
drivers/base/arch_topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1037169abb45..e1eff05bea4a 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -292,7 +292,7 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
* frequency (by keeping the initial capacity_freq_ref value).
*/
cpu_clk = of_clk_get(cpu_node, 0);
- if (!PTR_ERR_OR_ZERO(cpu_clk)) {
+ if (!IS_ERR_OR_NULL(cpu_clk)) {
per_cpu(capacity_freq_ref, cpu) =
clk_get_rate(cpu_clk) / HZ_PER_KHZ;
clk_put(cpu_clk);
--
2.34.1
Fix incorrect use of PTR_ERR_OR_ZERO() in topology_parse_cpu_capacity()
which causes the code to proceed with NULL clock pointers. The current
logic uses !PTR_ERR_OR_ZERO(cpu_clk) which evaluates to true for both
valid pointers and NULL, leading to potential NULL pointer dereference
in clk_get_rate().
Per include/linux/err.h documentation, PTR_ERR_OR_ZERO(ptr) returns:
"The error code within @ptr if it is an error pointer; 0 otherwise."
This means PTR_ERR_OR_ZERO() returns 0 for both valid pointers AND NULL
pointers. Therefore !PTR_ERR_OR_ZERO(cpu_clk) evaluates to true (proceed)
when cpu_clk is either valid or NULL, causing clk_get_rate(NULL) to be
called when of_clk_get() returns NULL.
Replace with !IS_ERR_OR_NULL(cpu_clk) which only proceeds for valid
pointers, preventing potential NULL pointer dereference in clk_get_rate().
Fixes: b8fe128dad8f ("arch_topology: Adjust initial CPU capacities with current freq")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar(a)intel.com>
---
Changes in v3:
- Used accurate "function call properties" terminology in commit description
(suggested by Markus Elfring)
- Added stable backport justification
- Removed duplicate marker line per kernel documentation
Changes in v2:
- Refined description based on documented macro properties (suggested by Markus Elfring)
- Added proper Fixes
drivers/base/arch_topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1037169abb45..e1eff05bea4a 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -292,7 +292,7 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
* frequency (by keeping the initial capacity_freq_ref value).
*/
cpu_clk = of_clk_get(cpu_node, 0);
- if (!PTR_ERR_OR_ZERO(cpu_clk)) {
+ if (!IS_ERR_OR_NULL(cpu_clk)) {
per_cpu(capacity_freq_ref, cpu) =
clk_get_rate(cpu_clk) / HZ_PER_KHZ;
clk_put(cpu_clk);
--
2.34.1
From: HariKrishna Sagala <hariconscious(a)gmail.com>
Syzbot reported an uninit-value bug on at kmalloc_reserve for
commit 320475fbd590 ("Merge tag 'mtd/fixes-for-6.17-rc6' of
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux")'
Syzbot KMSAN reported use of uninitialized memory originating from functions
"kmalloc_reserve()", where memory allocated via "kmem_cache_alloc_node()" or
"kmalloc_node_track_caller()" was not explicitly initialized.
This can lead to undefined behavior when the allocated buffer
is later accessed.
Fix this by requesting the initialized memory using the gfp flag
appended with the option "__GFP_ZERO".
Reported-by: syzbot+9a4fbb77c9d4aacd3388(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9a4fbb77c9d4aacd3388
Fixes: 915d975b2ffa ("net: deal with integer overflows in
kmalloc_reserve()")
Tested-by: syzbot+9a4fbb77c9d4aacd3388(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org> # 6.16
Signed-off-by: HariKrishna Sagala <hariconscious(a)gmail.com>
---
RESEND:
- added Cc stable as suggested from kernel test robot
net/core/skbuff.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ee0274417948..2308ebf99bbd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -573,6 +573,7 @@ static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
void *obj;
obj_size = SKB_HEAD_ALIGN(*size);
+ flags |= __GFP_ZERO;
if (obj_size <= SKB_SMALL_HEAD_CACHE_SIZE &&
!(flags & KMALLOC_NOT_NORMAL_BITS)) {
obj = kmem_cache_alloc_node(net_hotdata.skb_small_head_cache,
--
2.43.0
In register_shm_helper(), fix incorrect error handling for a call to
iov_iter_extract_pages(). A case is missing for when
iov_iter_extract_pages() only got some pages and return a number larger
than 0, but not the requested amount.
This fixes a possible NULL pointer dereference following a bad input from
ioctl(TEE_IOC_SHM_REGISTER) where parts of the buffer isn't mapped.
Cc: stable(a)vger.kernel.org
Reported-by: Masami Ichikawa <masami256(a)gmail.com>
Closes: https://lore.kernel.org/op-tee/CACOXgS-Bo2W72Nj1_44c7bntyNYOavnTjJAvUbEiQfq…
Tested-by: Masami Ichikawa <masami256(a)gmail.com>
Fixes: 7bdee4157591 ("tee: Use iov_iter to better support shared buffer registration")
Signed-off-by: Jens Wiklander <jens.wiklander(a)linaro.org>
---
Changes from v1
- Refactor the if statement as requested by Sumit
- Adding Tested-by: Masami Ichikawa <masami256(a)gmail.com
- Link to v1:
https://lore.kernel.org/op-tee/20250919124217.2934718-1-jens.wiklander@lina…
---
drivers/tee/tee_shm.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index daf6e5cfd59a..76c54e1dc98c 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -319,6 +319,14 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,
if (unlikely(len <= 0)) {
ret = len ? ERR_PTR(len) : ERR_PTR(-ENOMEM);
goto err_free_shm_pages;
+ } else if (DIV_ROUND_UP(len + off, PAGE_SIZE) != num_pages) {
+ /*
+ * If we only got a few pages, update to release the
+ * correct amount below.
+ */
+ shm->num_pages = len / PAGE_SIZE;
+ ret = ERR_PTR(-ENOMEM);
+ goto err_put_shm_pages;
}
/*
--
2.43.0
Make sure to drop the reference taken when looking up the genpool
platform device in of_gen_pool_get() before returning the pool.
Note that holding a reference to a device does typically not prevent its
devres managed resources from being released so there is no point in
keeping the reference.
Fixes: 9375db07adea ("genalloc: add devres support, allow to find a managed pool by device")
Cc: stable(a)vger.kernel.org # 3.10
Cc: Philipp Zabel <p.zabel(a)pengutronix.de>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
lib/genalloc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/genalloc.c b/lib/genalloc.c
index 4fa5635bf81b..841f29783833 100644
--- a/lib/genalloc.c
+++ b/lib/genalloc.c
@@ -899,8 +899,11 @@ struct gen_pool *of_gen_pool_get(struct device_node *np,
if (!name)
name = of_node_full_name(np_pool);
}
- if (pdev)
+ if (pdev) {
pool = gen_pool_get(&pdev->dev, name);
+ put_device(&pdev->dev);
+ }
+
of_node_put(np_pool);
return pool;
--
2.49.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092154-unnoticed-collide-5621@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092111-hedging-brunch-bdeb@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
When waiting for the PCIe link to come up, both link up and link down
are valid results depending on the device state.
Since the link may come up later and to get rid of the following
mis-reported PM errors. Do not return an -ETIMEDOUT error, as the
outcome has already been reported in dw_pcie_wait_for_link().
PM error logs introduced by the -ETIMEDOUT error return.
imx6q-pcie 33800000.pcie: Phy link never came up
imx6q-pcie 33800000.pcie: PM: dpm_run_callback(): genpd_resume_noirq returns -110
imx6q-pcie 33800000.pcie: PM: failed to resume noirq: error -110
Cc: stable(a)vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Signed-off-by: Richard Zhu <hongxing.zhu(a)nxp.com>
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
---
drivers/pci/controller/dwc/pcie-designware-host.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index b303a74b0fd7..c4386be38a07 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1084,10 +1084,9 @@ int dw_pcie_resume_noirq(struct dw_pcie *pci)
if (ret)
return ret;
- ret = dw_pcie_wait_for_link(pci);
- if (ret)
- return ret;
+ /* Ignore errors, the link may come up later */
+ dw_pcie_wait_for_link(pci);
- return ret;
+ return 0;
}
EXPORT_SYMBOL_GPL(dw_pcie_resume_noirq);
--
2.37.1
This series of patches clears the compiler warnings for the dc395x
driver.
The first patch introduces a new macro that casts the value returned by
a read operation to void, since some values returned by some specific
read operations (which just simply clears the FIFO buffer or resets the
interrupt status) can be ignored. Creating a new macro that casts the
return value to void to fix the warning.
During the fix, checkpatch.pl complained about missing white space
between macro arguments and missing parentheses around complex
expressions. To align with the changes in the first patch, the
formatting of macros above and below the introduced macro are also
fixed.
Since in Patch v2 [2] Bart pointed out that such change can't be made
to the stable tree, the patch is split to two parts.
---
Changes since v3 [1]:
- Undo some mistakes in the patch 2 of the patch v2 where extra
parentheses are added around function calls.
- Remove the unnecessary casts from the inb(), inw() and inl() calls,
so that extra parentheses are no longer required. They are now returns
the type it was being casted to, as James pointed out.
Changes since v2 [2]:
- Split the patch into two parts, the first one fixes the warning, and
the second one improves the formatting of the surrounding macros.
- Make the description of the formatting changes more clear.
Changes since v1 [3]:
- Add Cc: tag to include this patch to the stable tree.
- Add additional description about the formatting changes.
[1]: https://lore.kernel.org/linux-scsi/20250923125226.1883391-1-cyan@cyano.uk/
[2]: https://lore.kernel.org/linux-scsi/20250922152609.827311-1-cyan@cyano.uk/
[3]: https://lore.kernel.org/linux-scsi/20250922143619.824129-1-cyan@cyano.uk/
---
Xinhui Yang (2):
scsi: dc395x: correctly discard the return value in certain reads
scsi: dc395x: improve code formatting for the macros
drivers/scsi/dc395x.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
---
Xinhui Yang (2):
scsi: dc395x: correctly discard the return value in certain reads
scsi: dc395x: improve code formatting for the macros
drivers/scsi/dc395x.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
--
2.51.0
Good day,
I hope this email finds you well.
Mr. Kostyantyn a Ukrainian Businessman and a Philanthropist, has asked me to reach out to you because he is in difficult situation and seeking to relocate some asset sum of US$41.5M out of Ukraine to overseas for safe keeping due to the current ongoing crisis (WAR) in Ukraine.
He is looking for someone to help him receive and manage this asset into any business project over a period of 5-10 years till the crisis is over. Please reply to this email with your contact number for further discussion on this urgent matter.
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
The comedi_buf_munge() function performs a modulo operation
`async->munge_chan %= async->cmd.chanlist_len` without first
checking if chanlist_len is zero. If a user program submits
a command with chanlist_len set to zero, this causes a
divide-by-zero error when the device processes data in the
interrupt handler path.
Add a check for zero chanlist_len at the beginning of the
function, similar to the existing checks for !map and
CMDF_RAWDATA flag. When chanlist_len is zero, update
munge_count and return early, indicating the data was
handled without munging.
This prevents potential kernel panics from malformed user commands.
Reported-by: syzbot+f6c3c066162d2c43a66c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
drivers/comedi/comedi_buf.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/comedi/comedi_buf.c b/drivers/comedi/comedi_buf.c
index 002c0e76baff..786f888299ce 100644
--- a/drivers/comedi/comedi_buf.c
+++ b/drivers/comedi/comedi_buf.c
@@ -321,6 +321,11 @@ static unsigned int comedi_buf_munge(struct comedi_subdevice *s,
async->munge_count += num_bytes;
return num_bytes;
}
+
+ if (async->cmd.chanlist_len == 0) {
+ async->munge_count += num_bytes;
+ return num_bytes;
+ }
/* don't munge partial samples */
num_bytes -= num_bytes % num_sample_bytes;
--
2.43.0
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
The comedi_buf_munge() function performs a modulo operation
`async->munge_chan %= async->cmd.chanlist_len` without first
checking if chanlist_len is zero. If a user program submits
a command with chanlist_len set to zero, this causes a
divide-by-zero error when the device processes data in the
interrupt handler path.
Add a check for zero chanlist_len at the beginning of the
function, similar to the existing checks for !map and
CMDF_RAWDATA flag. When chanlist_len is zero, update
munge_count and return early, indicating the data was
handled without munging.
This prevents potential kernel panics from malformed user commands.
Reported-by: syzbot+f6c3c066162d2c43a66c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
Cc: stable(a)vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
---
drivers/comedi/comedi_buf.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/comedi/comedi_buf.c b/drivers/comedi/comedi_buf.c
index 002c0e76baff..786f888299ce 100644
--- a/drivers/comedi/comedi_buf.c
+++ b/drivers/comedi/comedi_buf.c
@@ -321,6 +321,11 @@ static unsigned int comedi_buf_munge(struct comedi_subdevice *s,
async->munge_count += num_bytes;
return num_bytes;
}
+
+ if (async->cmd.chanlist_len == 0) {
+ async->munge_count += num_bytes;
+ return num_bytes;
+ }
/* don't munge partial samples */
num_bytes -= num_bytes % num_sample_bytes;
--
2.43.0
This takes the fix from 2cb66ae604 ("nouveau: Membar before between
semaphore writes and the interrupt") and applies it to nvc0_fence.c.
If I force my ampere system down the nvc0_fence path then I reproduce the
same issues with transfer queues + The Talos Principle that were fixed by
the above commit. This fixes that issue in exactly the same way for the
old code path.
Fixes: 60cdadace320 ("drm/nouveau/fence: use NVIDIA's headers for emit()")
Signed-off-by: Mel Henning <mhenning(a)darkrefraction.com>
Cc: stable(a)vger.kernel.org
---
.../drm/nouveau/include/nvhw/class/cl906f.h | 23 +++++
.../drm/nouveau/include/nvhw/class/clb06f.h | 54 +++++++++++
.../drm/nouveau/include/nvhw/class/clc06f.h | 93 +++++++++++++++++++
.../gpu/drm/nouveau/include/nvif/push906f.h | 2 +
drivers/gpu/drm/nouveau/nvc0_fence.c | 31 ++++++-
5 files changed, 200 insertions(+), 3 deletions(-)
create mode 100644 drivers/gpu/drm/nouveau/include/nvhw/class/clb06f.h
create mode 100644 drivers/gpu/drm/nouveau/include/nvhw/class/clc06f.h
diff --git a/drivers/gpu/drm/nouveau/include/nvhw/class/cl906f.h b/drivers/gpu/drm/nouveau/include/nvhw/class/cl906f.h
index 673d668885bb..529c785b4651 100644
--- a/drivers/gpu/drm/nouveau/include/nvhw/class/cl906f.h
+++ b/drivers/gpu/drm/nouveau/include/nvhw/class/cl906f.h
@@ -47,6 +47,29 @@
#define NV906F_SEMAPHORED_RELEASE_SIZE_4BYTE 0x00000001
#define NV906F_NON_STALL_INTERRUPT (0x00000020)
#define NV906F_NON_STALL_INTERRUPT_HANDLE 31:0
+#define NV906F_MEM_OP_A (0x00000028)
+#define NV906F_MEM_OP_A_OPERAND_LOW 31:2
+#define NV906F_MEM_OP_A_TLB_INVALIDATE_ADDR 29:2
+#define NV906F_MEM_OP_A_TLB_INVALIDATE_TARGET 31:30
+#define NV906F_MEM_OP_A_TLB_INVALIDATE_TARGET_VID_MEM 0x00000000
+#define NV906F_MEM_OP_A_TLB_INVALIDATE_TARGET_SYS_MEM_COHERENT 0x00000002
+#define NV906F_MEM_OP_A_TLB_INVALIDATE_TARGET_SYS_MEM_NONCOHERENT 0x00000003
+#define NV906F_MEM_OP_B (0x0000002c)
+#define NV906F_MEM_OP_B_OPERAND_HIGH 7:0
+#define NV906F_MEM_OP_B_OPERATION 31:27
+#define NV906F_MEM_OP_B_OPERATION_SYSMEMBAR_FLUSH 0x00000005
+#define NV906F_MEM_OP_B_OPERATION_SOFT_FLUSH 0x00000006
+#define NV906F_MEM_OP_B_OPERATION_MMU_TLB_INVALIDATE 0x00000009
+#define NV906F_MEM_OP_B_OPERATION_L2_PEERMEM_INVALIDATE 0x0000000d
+#define NV906F_MEM_OP_B_OPERATION_L2_SYSMEM_INVALIDATE 0x0000000e
+#define NV906F_MEM_OP_B_OPERATION_L2_CLEAN_COMPTAGS 0x0000000f
+#define NV906F_MEM_OP_B_OPERATION_L2_FLUSH_DIRTY 0x00000010
+#define NV906F_MEM_OP_B_MMU_TLB_INVALIDATE_PDB 0:0
+#define NV906F_MEM_OP_B_MMU_TLB_INVALIDATE_PDB_ONE 0x00000000
+#define NV906F_MEM_OP_B_MMU_TLB_INVALIDATE_PDB_ALL 0x00000001
+#define NV906F_MEM_OP_B_MMU_TLB_INVALIDATE_GPC 1:1
+#define NV906F_MEM_OP_B_MMU_TLB_INVALIDATE_GPC_ENABLE 0x00000000
+#define NV906F_MEM_OP_B_MMU_TLB_INVALIDATE_GPC_DISABLE 0x00000001
#define NV906F_SET_REFERENCE (0x00000050)
#define NV906F_SET_REFERENCE_COUNT 31:0
diff --git a/drivers/gpu/drm/nouveau/include/nvhw/class/clb06f.h b/drivers/gpu/drm/nouveau/include/nvhw/class/clb06f.h
new file mode 100644
index 000000000000..15edc9d8dcbe
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/include/nvhw/class/clb06f.h
@@ -0,0 +1,54 @@
+/*******************************************************************************
+ Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
+
+ Permission is hereby granted, free of charge, to any person obtaining a
+ copy of this software and associated documentation files (the "Software"),
+ to deal in the Software without restriction, including without limitation
+ the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ and/or sell copies of the Software, and to permit persons to whom the
+ Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+#ifndef _clb06f_h_
+#define _clb06f_h_
+
+/* fields and values */
+// NOTE - MEM_OP_A and MEM_OP_B have been removed for gm20x to make room for
+// possible future MEM_OP features. MEM_OP_C/D have identical functionality
+// to the previous MEM_OP_A/B methods.
+#define NVB06F_MEM_OP_C (0x00000030)
+#define NVB06F_MEM_OP_C_OPERAND_LOW 31:2
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_PDB 0:0
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_PDB_ONE 0x00000000
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_PDB_ALL 0x00000001
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_GPC 1:1
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_GPC_ENABLE 0x00000000
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_GPC_DISABLE 0x00000001
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_TARGET 11:10
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_TARGET_VID_MEM 0x00000000
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_TARGET_SYS_MEM_COHERENT 0x00000002
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_TARGET_SYS_MEM_NONCOHERENT 0x00000003
+#define NVB06F_MEM_OP_C_TLB_INVALIDATE_ADDR_LO 31:12
+#define NVB06F_MEM_OP_D (0x00000034)
+#define NVB06F_MEM_OP_D_OPERAND_HIGH 7:0
+#define NVB06F_MEM_OP_D_OPERATION 31:27
+#define NVB06F_MEM_OP_D_OPERATION_MEMBAR 0x00000005
+#define NVB06F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE 0x00000009
+#define NVB06F_MEM_OP_D_OPERATION_L2_PEERMEM_INVALIDATE 0x0000000d
+#define NVB06F_MEM_OP_D_OPERATION_L2_SYSMEM_INVALIDATE 0x0000000e
+#define NVB06F_MEM_OP_D_OPERATION_L2_CLEAN_COMPTAGS 0x0000000f
+#define NVB06F_MEM_OP_D_OPERATION_L2_FLUSH_DIRTY 0x00000010
+#define NVB06F_MEM_OP_D_TLB_INVALIDATE_ADDR_HI 7:0
+
+#endif /* _clb06f_h_ */
diff --git a/drivers/gpu/drm/nouveau/include/nvhw/class/clc06f.h b/drivers/gpu/drm/nouveau/include/nvhw/class/clc06f.h
new file mode 100644
index 000000000000..4d0f13f79c17
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/include/nvhw/class/clc06f.h
@@ -0,0 +1,93 @@
+/*******************************************************************************
+ Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
+
+ Permission is hereby granted, free of charge, to any person obtaining a
+ copy of this software and associated documentation files (the "Software"),
+ to deal in the Software without restriction, including without limitation
+ the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ and/or sell copies of the Software, and to permit persons to whom the
+ Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+#ifndef _clc06f_h_
+#define _clc06f_h_
+
+/* fields and values */
+// NOTE - MEM_OP_A and MEM_OP_B have been replaced in gp100 with methods for
+// specifying the page address for a targeted TLB invalidate and the uTLB for
+// a targeted REPLAY_CANCEL for UVM.
+// The previous MEM_OP_A/B functionality is in MEM_OP_C/D, with slightly
+// rearranged fields.
+#define NVC06F_MEM_OP_A (0x00000028)
+#define NVC06F_MEM_OP_A_TLB_INVALIDATE_CANCEL_TARGET_CLIENT_UNIT_ID 5:0 // only relevant for REPLAY_CANCEL_TARGETED
+#define NVC06F_MEM_OP_A_TLB_INVALIDATE_CANCEL_TARGET_GPC_ID 10:6 // only relevant for REPLAY_CANCEL_TARGETED
+#define NVC06F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR 11:11
+#define NVC06F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR_EN 0x00000001
+#define NVC06F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR_DIS 0x00000000
+#define NVC06F_MEM_OP_A_TLB_INVALIDATE_TARGET_ADDR_LO 31:12
+#define NVC06F_MEM_OP_B (0x0000002c)
+#define NVC06F_MEM_OP_B_TLB_INVALIDATE_TARGET_ADDR_HI 31:0
+#define NVC06F_MEM_OP_C (0x00000030)
+#define NVC06F_MEM_OP_C_MEMBAR_TYPE 2:0
+#define NVC06F_MEM_OP_C_MEMBAR_TYPE_SYS_MEMBAR 0x00000000
+#define NVC06F_MEM_OP_C_MEMBAR_TYPE_MEMBAR 0x00000001
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB 0:0
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_ONE 0x00000000
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_ALL 0x00000001 // Probably nonsensical for MMU_TLB_INVALIDATE_TARGETED
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_GPC 1:1
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_GPC_ENABLE 0x00000000
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_GPC_DISABLE 0x00000001
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_REPLAY 4:2 // only relevant if GPC ENABLE
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_REPLAY_NONE 0x00000000
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_REPLAY_START 0x00000001
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_REPLAY_START_ACK_ALL 0x00000002
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_TARGETED 0x00000003
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_GLOBAL 0x00000004
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE 6:5 // only relevant if GPC ENABLE
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_NONE 0x00000000
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_GLOBALLY 0x00000001
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_INTRANODE 0x00000002
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL 9:7 // Invalidate affects this level and all below
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_ALL 0x00000000 // Invalidate tlb caches at all levels of the page table
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_PTE_ONLY 0x00000001
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE0 0x00000002
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE1 0x00000003
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE2 0x00000004
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3 0x00000005
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4 0x00000006
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE5 0x00000007
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE 11:10 // only relevant if PDB_ONE
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_VID_MEM 0x00000000
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_SYS_MEM_COHERENT 0x00000002
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_SYS_MEM_NONCOHERENT 0x00000003
+#define NVC06F_MEM_OP_C_TLB_INVALIDATE_PDB_ADDR_LO 31:12 // only relevant if PDB_ONE
+// MEM_OP_D MUST be preceded by MEM_OPs A-C.
+#define NVC06F_MEM_OP_D (0x00000034)
+#define NVC06F_MEM_OP_D_TLB_INVALIDATE_PDB_ADDR_HI 26:0 // only relevant if PDB_ONE
+#define NVC06F_MEM_OP_D_OPERATION 31:27
+#define NVC06F_MEM_OP_D_OPERATION_MEMBAR 0x00000005
+#define NVC06F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE 0x00000009
+#define NVC06F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE_TARGETED 0x0000000a
+#define NVC06F_MEM_OP_D_OPERATION_L2_PEERMEM_INVALIDATE 0x0000000d
+#define NVC06F_MEM_OP_D_OPERATION_L2_SYSMEM_INVALIDATE 0x0000000e
+// CLEAN_LINES is an alias for Tegra/GPU IP usage
+#define NVC06F_MEM_OP_D_OPERATION_L2_INVALIDATE_CLEAN_LINES 0x0000000e
+// This B alias is confusing but it was missed as part of the update. Left here
+// for compatibility.
+#define NVC06F_MEM_OP_B_OPERATION_L2_INVALIDATE_CLEAN_LINES 0x0000000e
+#define NVC06F_MEM_OP_D_OPERATION_L2_CLEAN_COMPTAGS 0x0000000f
+#define NVC06F_MEM_OP_D_OPERATION_L2_FLUSH_DIRTY 0x00000010
+#define NVC06F_MEM_OP_D_OPERATION_L2_WAIT_FOR_SYS_PENDING_READS 0x00000015
+
+#endif /* _clc06f_h_ */
diff --git a/drivers/gpu/drm/nouveau/include/nvif/push906f.h b/drivers/gpu/drm/nouveau/include/nvif/push906f.h
index 79df71de98d2..3d506f4dc2c9 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/push906f.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/push906f.h
@@ -7,6 +7,8 @@
#ifndef PUSH906F_SUBC
// Host methods
#define PUSH906F_SUBC_NV906F 0
+#define PUSH906F_SUBC_NVB06F 0
+#define PUSH906F_SUBC_NVC06F 0
#define PUSH906F_SUBC_NVC36F 0
// Twod
diff --git a/drivers/gpu/drm/nouveau/nvc0_fence.c b/drivers/gpu/drm/nouveau/nvc0_fence.c
index a5e98d0d4217..8b36deaaf8cf 100644
--- a/drivers/gpu/drm/nouveau/nvc0_fence.c
+++ b/drivers/gpu/drm/nouveau/nvc0_fence.c
@@ -27,15 +27,18 @@
#include "nv50_display.h"
+#include <nvif/class.h>
#include <nvif/push906f.h>
#include <nvhw/class/cl906f.h>
+#include <nvhw/class/clb06f.h>
+#include <nvhw/class/clc06f.h>
static int
nvc0_fence_emit32(struct nouveau_channel *chan, u64 virtual, u32 sequence)
{
struct nvif_push *push = &chan->chan.push;
- int ret = PUSH_WAIT(push, 6);
+ int ret = PUSH_WAIT(push, 12);
if (ret == 0) {
PUSH_MTHD(push, NV906F, SEMAPHOREA,
NVVAL(NV906F, SEMAPHOREA, OFFSET_UPPER, upper_32_bits(virtual)),
@@ -46,9 +49,31 @@ nvc0_fence_emit32(struct nouveau_channel *chan, u64 virtual, u32 sequence)
SEMAPHORED,
NVDEF(NV906F, SEMAPHORED, OPERATION, RELEASE) |
NVDEF(NV906F, SEMAPHORED, RELEASE_WFI, EN) |
- NVDEF(NV906F, SEMAPHORED, RELEASE_SIZE, 16BYTE),
+ NVDEF(NV906F, SEMAPHORED, RELEASE_SIZE, 16BYTE));
+
+ if (chan->user.oclass >= PASCAL_CHANNEL_GPFIFO_A) {
+ PUSH_MTHD(push, NVC06F, MEM_OP_A, 0,
+ MEM_OP_B, 0,
+
+ MEM_OP_C,
+ NVDEF(NVC06F, MEM_OP_C, MEMBAR_TYPE, SYS_MEMBAR),
+
+ MEM_OP_D,
+ NVDEF(NVC06F, MEM_OP_D, OPERATION, MEMBAR));
+ } else if (chan->user.oclass >= MAXWELL_CHANNEL_GPFIFO_A) {
+ PUSH_MTHD(push, NVB06F, MEM_OP_C, 0,
+
+ MEM_OP_D,
+ NVDEF(NVC06F, MEM_OP_D, OPERATION, MEMBAR));
+ } else {
+ PUSH_MTHD(push, NV906F, MEM_OP_A, 0,
+
+ MEM_OP_B,
+ NVDEF(NV906F, MEM_OP_B, OPERATION, SYSMEMBAR_FLUSH));
+ }
+
+ PUSH_MTHD(push, NV906F, NON_STALL_INTERRUPT, 0);
- NON_STALL_INTERRUPT, 0);
PUSH_KICK(push);
}
return ret;
--
2.51.0
The patch titled
Subject: mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
has been added to the -mm mm-unstable branch. Its filename is
mm-ksm-fix-incorrect-ksm-counter-handling-in-mm_struct-during-fork.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Donet Tom <donettom(a)linux.ibm.com>
Subject: mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
Date: Wed, 24 Sep 2025 00:16:59 +0530
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.17586487…
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list(a)gmail.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/ksm.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/include/linux/ksm.h~mm-ksm-fix-incorrect-ksm-counter-handling-in-mm_struct-during-fork
+++ a/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(str
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
_
Patches currently in -mm which might be from donettom(a)linux.ibm.com are
drivers-base-node-fix-double-free-in-register_one_node.patch
mm-ksm-fix-incorrect-ksm-counter-handling-in-mm_struct-during-fork.patch
selftests-mm-added-fork-inheritance-test-for-ksm_merging_pages-counter.patch
devm_kcalloc() may fail. ndtest_probe() allocates three DMA address
arrays (dcr_dma, label_dma, dimm_dma) and later unconditionally uses
them in ndtest_nvdimm_init(), which can lead to a NULL pointer
dereference under low-memory conditions.
Check all three allocations and return -ENOMEM if any allocation fails.
Do not emit an extra error message since the allocator already warns on
allocation failure.
Fixes: 9399ab61ad82 ("ndtest: Add dimms to the two buses")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
Changes in v2:
- Drop pr_err() on allocation failure; only NULL-check and return -ENOMEM.
- No other changes.
---
tools/testing/nvdimm/test/ndtest.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/nvdimm/test/ndtest.c b/tools/testing/nvdimm/test/ndtest.c
index 68a064ce598c..abdbe0c1cb63 100644
--- a/tools/testing/nvdimm/test/ndtest.c
+++ b/tools/testing/nvdimm/test/ndtest.c
@@ -855,6 +855,9 @@ static int ndtest_probe(struct platform_device *pdev)
p->dimm_dma = devm_kcalloc(&p->pdev.dev, NUM_DCR,
sizeof(dma_addr_t), GFP_KERNEL);
+ if (!p->dcr_dma || !p->label_dma || !p->dimm_dma)
+ return -ENOMEM;
+
rc = ndtest_nvdimm_init(p);
if (rc)
goto err;
--
2.43.0
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are
inherited by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a
fork, these counters are copied to the child while the corresponding
rmap items are not. As a result, when the child later triggers an
unmerge, there are no rmap items present in the child, so the counters
remain stale, leading to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0
during fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
cc: stable(a)vger.kernel.org # v6.6
Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
---
include/linux/ksm.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 22e67ca7cba3..067538fc4d58 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -56,8 +56,14 @@ static inline long mm_ksm_zero_pages(struct mm_struct *mm)
static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
/* Adding mm to ksm is best effort on fork. */
- if (mm_flags_test(MMF_VM_MERGEABLE, oldmm))
+ if (mm_flags_test(MMF_VM_MERGEABLE, oldmm)) {
+ long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages);
+
+ mm->ksm_merging_pages = 0;
+ mm->ksm_rmap_items = 0;
+ atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages);
__ksm_enter(mm);
+ }
}
static inline int ksm_execve(struct mm_struct *mm)
--
2.51.0
The Mediatek DRM driver allocates private data for components without a
platform driver but as the lifetime is tied to each component device,
the memory is never freed.
Tie the allocation lifetime to the DRM platform device so that the
memory is released on probe failure (e.g. probe deferral) and when the
driver is unbound.
Fixes: c0d36de868a6 ("drm/mediatek: Move clk info from struct mtk_ddp_comp to sub driver private data")
Cc: stable(a)vger.kernel.org # 5.12
Cc: CK Hu <ck.hu(a)mediatek.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/gpu/drm/mediatek/mtk_ddp_comp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
index 0264017806ad..31d67a131c50 100644
--- a/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_ddp_comp.c
@@ -671,7 +671,7 @@ int mtk_ddp_comp_init(struct device *dev, struct device_node *node, struct mtk_d
type == MTK_DSI)
return 0;
- priv = devm_kzalloc(comp->dev, sizeof(*priv), GFP_KERNEL);
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
--
2.49.1
Commit 1a194e6c8e1ee745e914b0b7f50fa86c89ed13fe introduced overflow
checking for the font allocation size calculation, but in doing so moved
the addition of the size for font housekeeping data out of the kmalloc
call.
As a result, the calculated size now includes those extra bytes, which
marks the same number of bytes beyond the allocation as valid font data.
The crc32() call and the later memcmp() in fbcon_set_font() already perform
an out-of-bounds read, the latter is flagged on ppc64el:
memcmp: detected buffer overflow: 4112 byte read of buffer size 4096
when loading Lat15-Fixed16.psf.gz.
Since the addition of the extra size should only go into the kmalloc()
call, calculate this size in a separate variable.
Signed-off-by: Simon Richter <Simon.Richter(a)hogyros.de>
Fixes: 1a194e6c8e1e ("fbcon: fix integer overflow in fbcon_do_set_font")
Cc: stable <stable(a)vger.kernel.org> #v5.9+
---
drivers/video/fbdev/core/fbcon.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 5fade44931b8..a3fbf42c57d9 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -2518,7 +2518,7 @@ static int fbcon_set_font(struct vc_data *vc, const struct console_font *font,
unsigned charcount = font->charcount;
int w = font->width;
int h = font->height;
- int size;
+ int size, allocsize;
int i, csum;
u8 *new_data, *data = font->data;
int pitch = PITCH(font->width);
@@ -2551,10 +2551,10 @@ static int fbcon_set_font(struct vc_data *vc, const struct console_font *font,
return -EINVAL;
/* Check for overflow in allocation size calculation */
- if (check_add_overflow(FONT_EXTRA_WORDS * sizeof(int), size, &size))
+ if (check_add_overflow(FONT_EXTRA_WORDS * sizeof(int), size, &allocsize))
return -EINVAL;
- new_data = kmalloc(size, GFP_USER);
+ new_data = kmalloc(allocsize, GFP_USER);
if (!new_data)
return -ENOMEM;
--
2.47.3
According to the PLIC specification[1], global interrupt sources are
assigned small unsigned integer identifiers beginning at the value 1.
An interrupt ID of 0 is reserved to mean "no interrupt".
The current plic_irq_resume() and plic_irq_suspend() functions incorrectly
start the loop from index 0, which accesses the register space for the
reserved interrupt ID 0.
Change the loop to start from index 1, skipping the reserved
interrupt ID 0 as per the PLIC specification.
This prevents potential undefined behavior when accessing the reserved
register space during suspend/resume cycles.
Link: https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0
Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Co-developed-by: Jia Wang <wangjia(a)ultrarisc.com>
Signed-off-by: Jia Wang <wangjia(a)ultrarisc.com>
Co-developed-by: Charles Mirabile <cmirabil(a)redhat.com>
Signed-off-by: Charles Mirabile <cmirabil(a)redhat.com>
Signed-off-by: Lucas Zampieri <lzampier(a)redhat.com>
---
drivers/irqchip/irq-sifive-plic.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index bf69a4802b71e..9c4af7d588463 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -252,7 +252,8 @@ static int plic_irq_suspend(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) {
+ /* irq ID 0 is reserved */
+ for (i = 1; i < priv->nr_irqs; i++) {
__assign_bit(i, priv->prio_save,
readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID));
}
@@ -283,7 +284,8 @@ static void plic_irq_resume(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) {
+ /* irq ID 0 is reserved */
+ for (i = 1; i < priv->nr_irqs; i++) {
index = BIT_WORD(i);
writel((priv->prio_save[index] & BIT_MASK(i)) ? 1 : 0,
priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID);
--
2.51.0
This series of patches clears the compiler warnings for the dc395x
driver.
The first patch introduces a new macro that casts the value returned by
a read operation to void, since some values returned by some specific
read operations (which just simply clears the FIFO buffer or resets the
interrupt status) can be ignored. Creating a new macro that casts the
return value to void to fix the warning.
During the fix, checkpatch.pl complained about missing whitespace
between macro arguments and missing parentheses around complex
expressions. To align with the changes in the first patch, the
formatting of macros above and below the introduced macro are also
fixed.
Since in Patch v2 [1] Bart pointed out that such change can't be made
to the stable tree, the patch is splitted to two parts.
---
Changes since v2 [1]:
- Split the patch into two parts, the first one fixes the warning, and
the second one improves the formatting of the surrounding macros.
- Make the description of the formatting changes more clear.
Changes since v1 [2]:
- Add Cc: tag to include this patch to the stable tree.
- Add additional description about the formatting changes.
[1]: https://lore.kernel.org/linux-scsi/20250922152609.827311-1-cyan@cyano.uk/
[2]: https://lore.kernel.org/linux-scsi/20250922143619.824129-1-cyan@cyano.uk/
---
Xinhui Yang (2):
scsi: dc395x: correctly discard the return value in certain reads
scsi: dc395x: improve code formatting for the macros
drivers/scsi/dc395x.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
--
2.51.0
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v4:
- removed the redundant NULL check as put_device() can handle the NULL dev;
Changes in v3:
- added a wrapper function due to the warning report from kernel test robot;
Changes in v2:
- modified the handling in the success path and fixed the memory leak for regmap as suggestions.
---
sound/soc/codecs/wcd934x.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..e92939068bf7 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5831,6 +5831,13 @@ static const struct snd_soc_component_driver wcd934x_component_drv = {
.endianness = 1,
};
+static void wcd934x_put_device_action(void *data)
+{
+ struct device *dev = data;
+
+ put_device(dev);
+}
+
static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
{
struct device *dev = &wcd->sdev->dev;
@@ -5847,11 +5854,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
slim_get_logical_addr(wcd->sidev);
- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
+ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
@@ -5893,6 +5902,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
if (ret)
return ret;
+ ret = devm_add_action_or_reset(dev, wcd934x_put_device_action, &wcd->sidev->dev);
+ if (ret)
+ return ret;
+
/* set default rate 9P6MHz */
regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
--
2.17.1
First patch hardens the IRQ handler so the driver can resume after
hitting bus errors (presumably AXI errors). This will reduce the
risk of having to reboot the system in order to recover.
The second patch actually fix the issue, ensuring the decoder won't be
instructed to use address 0x0 as a reference anymore.
The issues was discovered using GStreamer. A bug in the userspace lead
to skippable (RASL) frames not being skipped if its past reference are
missing. This will happen when seeking inside a video that makes use of
this GOP configuration. The probably could also have been hit in lossy
streaming use cases, such as WebRTC. The JCT-VC-HEVC_V1 ITU tests still
results in 141/147 with two concurrent decoders.
Signed-off-by: Nicolas Dufresne <nicolas.dufresne(a)collabora.com>
---
Nicolas Dufresne (2):
media: verisilicon: Fix CPU stalls on G2 bus error
media: verisilicon: Protect G2 HEVC decoder against invalid DPB index
drivers/media/platform/verisilicon/hantro_g2.c | 88 +++++++++++++++++-----
.../platform/verisilicon/hantro_g2_hevc_dec.c | 17 ++++-
.../media/platform/verisilicon/hantro_g2_regs.h | 13 ++++
.../media/platform/verisilicon/hantro_g2_vp9_dec.c | 2 -
drivers/media/platform/verisilicon/hantro_hw.h | 1 +
drivers/media/platform/verisilicon/imx8m_vpu_hw.c | 2 +
6 files changed, 98 insertions(+), 25 deletions(-)
---
base-commit: 40b7a19f321e65789612ebaca966472055dab48c
change-id: 20250919-imx8mq-hantro-g2-hang-cb04dcd07a84
Best regards,
--
Nicolas Dufresne <nicolas.dufresne(a)collabora.com>
devm_kcalloc() may fail. mt8365_afe_suspend() uses afe->reg_back_up
unconditionally after allocation and writes afe->reg_back_up[i], which
can lead to a NULL pointer dereference under low-memory conditions.
Add a NULL check and bail out with -ENOMEM, making sure to disable the
main clock before returning to keep clock state balanced.
Fixes: e1991d102bc2 ("ASoC: mediatek: mt8365: Add the AFE driver support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
changelog:
v3:
- Move the Signed-off-by line above the '---' separator.
- Restore the original spacing/indentation in the devm_kcalloc() continuation line.
- No functional changes.
v2:
- Return -ENOMEM directly on allocation failure without goto/label.
- Disable the main clock before returning to keep clock state balanced.
---
sound/soc/mediatek/mt8365/mt8365-afe-pcm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c b/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
index 10793bbe9275..570fb10c306d 100644
--- a/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
+++ b/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
@@ -1975,11 +1975,15 @@ static int mt8365_afe_suspend(struct device *dev)
mt8365_afe_enable_main_clk(afe);
- if (!afe->reg_back_up)
+ if (!afe->reg_back_up) {
afe->reg_back_up =
devm_kcalloc(dev, afe->reg_back_up_list_num,
sizeof(unsigned int), GFP_KERNEL);
-
+ if (!afe->reg_back_up) {
+ mt8365_afe_disable_main_clk(afe);
+ return -ENOMEM;
+ }
+ }
for (i = 0; i < afe->reg_back_up_list_num; i++)
regmap_read(regmap, afe->reg_back_up_list[i],
&afe->reg_back_up[i]);
--
2.43.0
This is a note to let you know that I've just added the patch titled
iio: imu: inv_icm42600: Avoid configuring if already pm_runtime
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 466f7a2fef2a4e426f809f79845a1ec1aeb558f4 Mon Sep 17 00:00:00 2001
From: Sean Nyekjaer <sean(a)geanix.com>
Date: Mon, 1 Sep 2025 09:49:15 +0200
Subject: iio: imu: inv_icm42600: Avoid configuring if already pm_runtime
suspended
Do as in suspend, skip resume configuration steps if the device is already
pm_runtime suspended. This avoids reconfiguring a device that is already
in the correct low-power state and ensures that pm_runtime handles the
power state transitions properly.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean(a)geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-3-ef1336246960@geanix.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index 41b275ecc7e2..ee780f530dc8 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -837,17 +837,15 @@ static int inv_icm42600_suspend(struct device *dev)
struct device *accel_dev;
bool wakeup;
int accel_conf;
- int ret;
+ int ret = 0;
mutex_lock(&st->lock);
st->suspended.gyro = st->conf.gyro.mode;
st->suspended.accel = st->conf.accel.mode;
st->suspended.temp = st->conf.temp_en;
- if (pm_runtime_suspended(dev)) {
- ret = 0;
+ if (pm_runtime_suspended(dev))
goto out_unlock;
- }
/* disable FIFO data streaming */
if (st->fifo.on) {
@@ -900,10 +898,13 @@ static int inv_icm42600_resume(struct device *dev)
struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
struct device *accel_dev;
bool wakeup;
- int ret;
+ int ret = 0;
mutex_lock(&st->lock);
+ if (pm_runtime_suspended(dev))
+ goto out_unlock;
+
/* check wakeup capability */
accel_dev = &st->indio_accel->dev;
wakeup = st->apex.on && device_may_wakeup(accel_dev);
--
2.51.0
This is a note to let you know that I've just added the patch titled
iio: imu: inv_icm42600: Drop redundant pm_runtime reinitialization in
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From a95a0b4e471a6d8860f40c6ac8f1cad9dde3189a Mon Sep 17 00:00:00 2001
From: Sean Nyekjaer <sean(a)geanix.com>
Date: Mon, 1 Sep 2025 09:49:14 +0200
Subject: iio: imu: inv_icm42600: Drop redundant pm_runtime reinitialization in
resume
Remove unnecessary calls to pm_runtime_disable(), pm_runtime_set_active(),
and pm_runtime_enable() from the resume path. These operations are not
required here and can interfere with proper pm_runtime state handling,
especially when resuming from a pm_runtime suspended state.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean(a)geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-2-ef1336246960@geanix.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index 76d8e4f14d87..41b275ecc7e2 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -917,10 +917,6 @@ static int inv_icm42600_resume(struct device *dev)
goto out_unlock;
}
- pm_runtime_disable(dev);
- pm_runtime_set_active(dev);
- pm_runtime_enable(dev);
-
/* restore sensors state */
ret = inv_icm42600_set_pwr_mgmt0(st, st->suspended.gyro,
st->suspended.accel,
--
2.51.0
This is a note to let you know that I've just added the patch titled
iio: imu: inv_icm42600: Simplify pm_runtime setup
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 0792c1984a45ccd7a296d6b8cb78088bc99a212e Mon Sep 17 00:00:00 2001
From: Sean Nyekjaer <sean(a)geanix.com>
Date: Mon, 1 Sep 2025 09:49:13 +0200
Subject: iio: imu: inv_icm42600: Simplify pm_runtime setup
Rework the power management in inv_icm42600_core_probe() to use
devm_pm_runtime_set_active_enabled(), which simplifies the runtime PM
setup by handling activation and enabling in one step.
Remove the separate inv_icm42600_disable_pm callback, as it's no longer
needed with the devm-managed approach.
Using devm_pm_runtime_enable() also fixes the missing disable of
autosuspend.
Update inv_icm42600_disable_vddio_reg() to only disable the regulator if
the device is not suspended i.e. powered-down, preventing unbalanced
disables.
Also remove redundant error msg on regulator_disable(), the regulator
framework already emits an error message when regulator_disable() fails.
This simplifies the PM setup and avoids manipulating the usage counter
unnecessarily.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean(a)geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-1-ef1336246960@geanix.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
.../iio/imu/inv_icm42600/inv_icm42600_core.c | 24 ++++++-------------
1 file changed, 7 insertions(+), 17 deletions(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index a4d42e7e2180..76d8e4f14d87 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -711,20 +711,12 @@ static void inv_icm42600_disable_vdd_reg(void *_data)
static void inv_icm42600_disable_vddio_reg(void *_data)
{
struct inv_icm42600_state *st = _data;
- const struct device *dev = regmap_get_device(st->map);
- int ret;
+ struct device *dev = regmap_get_device(st->map);
- ret = regulator_disable(st->vddio_supply);
- if (ret)
- dev_err(dev, "failed to disable vddio error %d\n", ret);
-}
+ if (pm_runtime_status_suspended(dev))
+ return;
-static void inv_icm42600_disable_pm(void *_data)
-{
- struct device *dev = _data;
-
- pm_runtime_put_sync(dev);
- pm_runtime_disable(dev);
+ regulator_disable(st->vddio_supply);
}
int inv_icm42600_core_probe(struct regmap *regmap, int chip,
@@ -824,16 +816,14 @@ int inv_icm42600_core_probe(struct regmap *regmap, int chip,
return ret;
/* setup runtime power management */
- ret = pm_runtime_set_active(dev);
+ ret = devm_pm_runtime_set_active_enabled(dev);
if (ret)
return ret;
- pm_runtime_get_noresume(dev);
- pm_runtime_enable(dev);
+
pm_runtime_set_autosuspend_delay(dev, INV_ICM42600_SUSPEND_DELAY_MS);
pm_runtime_use_autosuspend(dev);
- pm_runtime_put(dev);
- return devm_add_action_or_reset(dev, inv_icm42600_disable_pm, dev);
+ return ret;
}
EXPORT_SYMBOL_NS_GPL(inv_icm42600_core_probe, "IIO_ICM42600");
--
2.51.0
This is a note to let you know that I've just added the patch titled
iio: imu: inv_icm42600: Drop redundant pm_runtime reinitialization in
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From a95a0b4e471a6d8860f40c6ac8f1cad9dde3189a Mon Sep 17 00:00:00 2001
From: Sean Nyekjaer <sean(a)geanix.com>
Date: Mon, 1 Sep 2025 09:49:14 +0200
Subject: iio: imu: inv_icm42600: Drop redundant pm_runtime reinitialization in
resume
Remove unnecessary calls to pm_runtime_disable(), pm_runtime_set_active(),
and pm_runtime_enable() from the resume path. These operations are not
required here and can interfere with proper pm_runtime state handling,
especially when resuming from a pm_runtime suspended state.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean(a)geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-2-ef1336246960@geanix.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index 76d8e4f14d87..41b275ecc7e2 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -917,10 +917,6 @@ static int inv_icm42600_resume(struct device *dev)
goto out_unlock;
}
- pm_runtime_disable(dev);
- pm_runtime_set_active(dev);
- pm_runtime_enable(dev);
-
/* restore sensors state */
ret = inv_icm42600_set_pwr_mgmt0(st, st->suspended.gyro,
st->suspended.accel,
--
2.51.0
This is a note to let you know that I've just added the patch titled
iio: imu: inv_icm42600: Avoid configuring if already pm_runtime
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From 466f7a2fef2a4e426f809f79845a1ec1aeb558f4 Mon Sep 17 00:00:00 2001
From: Sean Nyekjaer <sean(a)geanix.com>
Date: Mon, 1 Sep 2025 09:49:15 +0200
Subject: iio: imu: inv_icm42600: Avoid configuring if already pm_runtime
suspended
Do as in suspend, skip resume configuration steps if the device is already
pm_runtime suspended. This avoids reconfiguring a device that is already
in the correct low-power state and ensures that pm_runtime handles the
power state transitions properly.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean(a)geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-3-ef1336246960@geanix.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index 41b275ecc7e2..ee780f530dc8 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -837,17 +837,15 @@ static int inv_icm42600_suspend(struct device *dev)
struct device *accel_dev;
bool wakeup;
int accel_conf;
- int ret;
+ int ret = 0;
mutex_lock(&st->lock);
st->suspended.gyro = st->conf.gyro.mode;
st->suspended.accel = st->conf.accel.mode;
st->suspended.temp = st->conf.temp_en;
- if (pm_runtime_suspended(dev)) {
- ret = 0;
+ if (pm_runtime_suspended(dev))
goto out_unlock;
- }
/* disable FIFO data streaming */
if (st->fifo.on) {
@@ -900,10 +898,13 @@ static int inv_icm42600_resume(struct device *dev)
struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
struct device *accel_dev;
bool wakeup;
- int ret;
+ int ret = 0;
mutex_lock(&st->lock);
+ if (pm_runtime_suspended(dev))
+ goto out_unlock;
+
/* check wakeup capability */
accel_dev = &st->indio_accel->dev;
wakeup = st->apex.on && device_may_wakeup(accel_dev);
--
2.51.0
This is a note to let you know that I've just added the patch titled
iio: imu: inv_icm42600: Simplify pm_runtime setup
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From 0792c1984a45ccd7a296d6b8cb78088bc99a212e Mon Sep 17 00:00:00 2001
From: Sean Nyekjaer <sean(a)geanix.com>
Date: Mon, 1 Sep 2025 09:49:13 +0200
Subject: iio: imu: inv_icm42600: Simplify pm_runtime setup
Rework the power management in inv_icm42600_core_probe() to use
devm_pm_runtime_set_active_enabled(), which simplifies the runtime PM
setup by handling activation and enabling in one step.
Remove the separate inv_icm42600_disable_pm callback, as it's no longer
needed with the devm-managed approach.
Using devm_pm_runtime_enable() also fixes the missing disable of
autosuspend.
Update inv_icm42600_disable_vddio_reg() to only disable the regulator if
the device is not suspended i.e. powered-down, preventing unbalanced
disables.
Also remove redundant error msg on regulator_disable(), the regulator
framework already emits an error message when regulator_disable() fails.
This simplifies the PM setup and avoids manipulating the usage counter
unnecessarily.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean(a)geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-1-ef1336246960@geanix.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
.../iio/imu/inv_icm42600/inv_icm42600_core.c | 24 ++++++-------------
1 file changed, 7 insertions(+), 17 deletions(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index a4d42e7e2180..76d8e4f14d87 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -711,20 +711,12 @@ static void inv_icm42600_disable_vdd_reg(void *_data)
static void inv_icm42600_disable_vddio_reg(void *_data)
{
struct inv_icm42600_state *st = _data;
- const struct device *dev = regmap_get_device(st->map);
- int ret;
+ struct device *dev = regmap_get_device(st->map);
- ret = regulator_disable(st->vddio_supply);
- if (ret)
- dev_err(dev, "failed to disable vddio error %d\n", ret);
-}
+ if (pm_runtime_status_suspended(dev))
+ return;
-static void inv_icm42600_disable_pm(void *_data)
-{
- struct device *dev = _data;
-
- pm_runtime_put_sync(dev);
- pm_runtime_disable(dev);
+ regulator_disable(st->vddio_supply);
}
int inv_icm42600_core_probe(struct regmap *regmap, int chip,
@@ -824,16 +816,14 @@ int inv_icm42600_core_probe(struct regmap *regmap, int chip,
return ret;
/* setup runtime power management */
- ret = pm_runtime_set_active(dev);
+ ret = devm_pm_runtime_set_active_enabled(dev);
if (ret)
return ret;
- pm_runtime_get_noresume(dev);
- pm_runtime_enable(dev);
+
pm_runtime_set_autosuspend_delay(dev, INV_ICM42600_SUSPEND_DELAY_MS);
pm_runtime_use_autosuspend(dev);
- pm_runtime_put(dev);
- return devm_add_action_or_reset(dev, inv_icm42600_disable_pm, dev);
+ return ret;
}
EXPORT_SYMBOL_NS_GPL(inv_icm42600_core_probe, "IIO_ICM42600");
--
2.51.0
Hi all,
Here's a collection of fixes that I *think* are bugs in fuse, along with
some scattered improvements.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fu…
---
Commits in this patchset:
* fuse: fix livelock in synchronous file put from fuseblk workers
* fuse: flush pending fuse events before aborting the connection
* fuse: capture the unique id of fuse commands being sent
* fuse: signal that a fuse filesystem should exhibit local fs behaviors
* fuse: implement file attributes mask for statx
* fuse: update file mode when updating acls
* fuse: propagate default and file acls on creation
* fuse: enable FUSE_SYNCFS for all fuseblk servers
---
fs/fuse/fuse_i.h | 55 +++++++++++++++++++++++++++
fs/fuse/acl.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dev.c | 60 +++++++++++++++++++++++++++--
fs/fuse/dev_uring.c | 4 +-
fs/fuse/dir.c | 96 +++++++++++++++++++++++++++++++++++------------
fs/fuse/file.c | 8 +++-
fs/fuse/inode.c | 17 ++++++++
fs/fuse/virtio_fs.c | 3 -
8 files changed, 314 insertions(+), 34 deletions(-)
Several crypto user API contexts and requests allocated with
sock_kmalloc() were left uninitialized, relying on callers to
set fields explicitly. This resulted in the use of uninitialized
data in certain error paths or when new fields are added in the
future.
The ACVP patches also contain two user-space interface files:
algif_kpp.c and algif_akcipher.c. These too rely on proper
initialization of their context structures.
A particular issue has been observed with the newly added
'inflight' variable introduced in af_alg_ctx by commit:
67b164a871af ("crypto: af_alg - Disallow multiple in-flight AIO requests")
Because the context is not memset to zero after allocation,
the inflight variable has contained garbage values. As a result,
af_alg_alloc_areq() has incorrectly returned -EBUSY randomly when
the garbage value was interpreted as true:
https://github.com/gregkh/linux/blame/master/crypto/af_alg.c#L1209
The check directly tests ctx->inflight without explicitly
comparing against true/false. Since inflight is only ever set to
true or false later, an uninitialized value has triggered
-EBUSY failures. Zero-initializing memory allocated with
sock_kmalloc() ensures inflight and other fields start in a known
state, removing random issues caused by uninitialized data.
Fixes: fe869cdb89c9 ("crypto: algif_hash - User-space interface for hash operations")
Fixes: 5afdfd22e6ba ("crypto: algif_rng - add random number generator support")
Fixes: 2d97591ef43d ("crypto: af_alg - consolidation of duplicate code")
Fixes: 99bd99d3e3a7 ("crypto: algif_skcipher - Fix stream cipher chaining")
Fixes: 67b164a871af ("crypto: af_alg - Disallow multiple in-flight AIO requests")
Cc: stable(a)vger.kernel.org
Signed-off-by: Shivani Agarwal <shivani.agarwal(a)broadcom.com>
---
crypto/af_alg.c | 5 ++---
crypto/algif_hash.c | 3 +--
crypto/algif_rng.c | 3 +--
crypto/algif_skcipher.c | 1 +
4 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index ca6fdcc6c54a..6c271e55f44d 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1212,15 +1212,14 @@ struct af_alg_async_req *af_alg_alloc_areq(struct sock *sk,
if (unlikely(!areq))
return ERR_PTR(-ENOMEM);
+ memset(areq, 0, areqlen);
+
ctx->inflight = true;
areq->areqlen = areqlen;
areq->sk = sk;
areq->first_rsgl.sgl.sgt.sgl = areq->first_rsgl.sgl.sgl;
- areq->last_rsgl = NULL;
INIT_LIST_HEAD(&areq->rsgl_list);
- areq->tsgl = NULL;
- areq->tsgl_entries = 0;
return areq;
}
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index e3f1a4852737..4d3dfc60a16a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -416,9 +416,8 @@ static int hash_accept_parent_nokey(void *private, struct sock *sk)
if (!ctx)
return -ENOMEM;
- ctx->result = NULL;
+ memset(ctx, 0, len);
ctx->len = len;
- ctx->more = false;
crypto_init_wait(&ctx->wait);
ask->private = ctx;
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 10c41adac3b1..1a86e40c8372 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -248,9 +248,8 @@ static int rng_accept_parent(void *private, struct sock *sk)
if (!ctx)
return -ENOMEM;
+ memset(ctx, 0, len);
ctx->len = len;
- ctx->addtl = NULL;
- ctx->addtl_len = 0;
/*
* No seeding done at that point -- if multiple accepts are
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 125d395c5e00..f4ce5473324f 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -70,6 +70,7 @@ static int algif_skcipher_export(struct sock *sk, struct skcipher_request *req)
if (!ctx->state)
return -ENOMEM;
+ memset(ctx->state, 0, statesize);
err = crypto_skcipher_export(req, ctx->state);
if (err) {
sock_kzfree_s(sk, ctx->state, statesize);
--
2.40.4
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092142-easiness-blatancy-23af@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
devm_kcalloc() may fail. mt8365_afe_suspend() uses afe->reg_back_up
unconditionally after allocation and writes afe->reg_back_up[i], which
can lead to a NULL pointer dereference under low-memory conditions.
Add a NULL check and bail out with -ENOMEM, making sure to disable the
main clock via the existing error path to keep clock state balanced.
Fixes: e1991d102bc2 ("ASoC: mediatek: mt8365: Add the AFE driver support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
sound/soc/mediatek/mt8365/mt8365-afe-pcm.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c b/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
index 10793bbe9275..eaeb14e1fce9 100644
--- a/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
+++ b/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
@@ -1979,6 +1979,10 @@ static int mt8365_afe_suspend(struct device *dev)
afe->reg_back_up =
devm_kcalloc(dev, afe->reg_back_up_list_num,
sizeof(unsigned int), GFP_KERNEL);
+ if (!afe->reg_back_up) {
+ mt8365_afe_disable_main_clk(afe);
+ return -ENOMEM;
+ }
for (i = 0; i < afe->reg_back_up_list_num; i++)
regmap_read(regmap, afe->reg_back_up_list[i],
--
2.43.0
The fmh_gpib driver contains a device reference count leak in
fmh_gpib_attach_impl() where driver_find_device() increases the
reference count of the device by get_device() when matching but this
reference is not properly decreased. Add put_device() in
fmh_gpib_detach(), which ensures that the reference count of the
device is correctly managed.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 8e4841a0888c ("staging: gpib: Add Frank Mori Hess FPGA PCI GPIB driver")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v3:
- deleted the redundant put_device() to avoid double free as suggestions;
Changes in v2:
- modified the free operations as suggestions. Thanks for dan carpenter's instructions.
---
drivers/staging/gpib/fmh_gpib/fmh_gpib.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/staging/gpib/fmh_gpib/fmh_gpib.c b/drivers/staging/gpib/fmh_gpib/fmh_gpib.c
index 4138f3d2bae7..efce01b39b9b 100644
--- a/drivers/staging/gpib/fmh_gpib/fmh_gpib.c
+++ b/drivers/staging/gpib/fmh_gpib/fmh_gpib.c
@@ -1517,6 +1517,11 @@ void fmh_gpib_detach(struct gpib_board *board)
resource_size(e_priv->gpib_iomem_res));
}
fmh_gpib_generic_detach(board);
+
+ if (board->dev) {
+ put_device(board->dev);
+ board->dev = NULL;
+ }
}
static int fmh_gpib_pci_attach_impl(struct gpib_board *board,
--
2.17.1
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v3:
- added a wrapper function due to the warning report from kernel test robot;
Changes in v2:
- modified the handling in the success path and fixed the memory leak for regmap as suggestions.
---
sound/soc/codecs/wcd934x.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..d9d8cf64977a 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5831,6 +5831,15 @@ static const struct snd_soc_component_driver wcd934x_component_drv = {
.endianness = 1,
};
+static void wcd934x_put_device_action(void *data)
+{
+ struct device *dev = data;
+
+ if (dev) {
+ put_device(dev);
+ }
+}
+
static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
{
struct device *dev = &wcd->sdev->dev;
@@ -5847,11 +5856,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
slim_get_logical_addr(wcd->sidev);
- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
+ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
@@ -5893,6 +5904,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
if (ret)
return ret;
+ ret = devm_add_action_or_reset(dev, wcd934x_put_device_action, &wcd->sidev->dev);
+ if (ret)
+ return ret;
+
/* set default rate 9P6MHz */
regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
--
2.17.1
The quilt patch titled
Subject: selftests/mm: skip soft-dirty tests when CONFIG_MEM_SOFT_DIRTY is disabled
has been removed from the -mm tree. Its filename was
selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lance Yang <lance.yang(a)linux.dev>
Subject: selftests/mm: skip soft-dirty tests when CONFIG_MEM_SOFT_DIRTY is disabled
Date: Wed, 17 Sep 2025 21:31:37 +0800
The madv_populate and soft-dirty kselftests currently fail on systems
where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure
tests are properly skipped when the feature is not enabled.
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev
Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Gabriel Krisman Bertazi <krisman(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/madv_populate.c | 21 +------------------
tools/testing/selftests/mm/soft-dirty.c | 5 +++-
tools/testing/selftests/mm/vm_util.c | 17 +++++++++++++++
tools/testing/selftests/mm/vm_util.h | 1
4 files changed, 24 insertions(+), 20 deletions(-)
--- a/tools/testing/selftests/mm/madv_populate.c~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/madv_populate.c
@@ -264,23 +264,6 @@ static void test_softdirty(void)
munmap(addr, SIZE);
}
-static int system_has_softdirty(void)
-{
- /*
- * There is no way to check if the kernel supports soft-dirty, other
- * than by writing to a page and seeing if the bit was set. But the
- * tests are intended to check that the bit gets set when it should, so
- * doing that check would turn a potentially legitimate fail into a
- * skip. Fortunately, we know for sure that arm64 does not support
- * soft-dirty. So for now, let's just use the arch as a corse guide.
- */
-#if defined(__aarch64__)
- return 0;
-#else
- return 1;
-#endif
-}
-
int main(int argc, char **argv)
{
int nr_tests = 16;
@@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty())
+ if (softdirty_supported())
nr_tests += 5;
ksft_print_header();
@@ -300,7 +283,7 @@ int main(int argc, char **argv)
test_holes();
test_populate_read();
test_populate_write();
- if (system_has_softdirty())
+ if (softdirty_supported())
test_softdirty();
err = ksft_get_fail_cnt();
--- a/tools/testing/selftests/mm/soft-dirty.c~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/soft-dirty.c
@@ -200,8 +200,11 @@ int main(int argc, char **argv)
int pagesize;
ksft_print_header();
- ksft_set_plan(15);
+ if (!softdirty_supported())
+ ksft_exit_skip("soft-dirty is not support\n");
+
+ ksft_set_plan(15);
pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
--- a/tools/testing/selftests/mm/vm_util.c~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/vm_util.c
@@ -449,6 +449,23 @@ bool check_vmflag_pfnmap(void *addr)
return check_vmflag(addr, "pf");
}
+bool softdirty_supported(void)
+{
+ char *addr;
+ bool supported = false;
+ const size_t pagesize = getpagesize();
+
+ /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */
+ addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+ if (!addr)
+ ksft_exit_fail_msg("mmap failed\n");
+
+ supported = check_vmflag(addr, "sd");
+ munmap(addr, pagesize);
+ return supported;
+}
+
/*
* Open an fd at /proc/$pid/maps and configure procmap_out ready for
* PROCMAP_QUERY query. Returns 0 on success, or an error code otherwise.
--- a/tools/testing/selftests/mm/vm_util.h~selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled
+++ a/tools/testing/selftests/mm/vm_util.h
@@ -104,6 +104,7 @@ bool find_vma_procmap(struct procmap_fd
int close_procmap(struct procmap_fd *procmap);
int write_sysfs(const char *file_path, unsigned long val);
int read_sysfs(const char *file_path, unsigned long *val);
+bool softdirty_supported(void);
static inline int open_self_procmap(struct procmap_fd *procmap_out)
{
_
Patches currently in -mm which might be from lance.yang(a)linux.dev are
hung_task-fix-warnings-caused-by-unaligned-lock-pointers.patch
mm-thp-fix-mte-tag-mismatch-when-replacing-zero-filled-subpages.patch
The quilt patch titled
Subject: kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths
has been removed from the -mm tree. Its filename was
fix-the-racy-usage-of-task_locktsk-group_leader-in-sys_prlimit64-paths.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Oleg Nesterov <oleg(a)redhat.com>
Subject: kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths
Date: Mon, 15 Sep 2025 14:09:17 +0200
The usage of task_lock(tsk->group_leader) in sys_prlimit64()->do_prlimit()
path is very broken.
sys_prlimit64() does get_task_struct(tsk) but this only protects task_struct
itself. If tsk != current and tsk is not a leader, this process can exit/exec
and task_lock(tsk->group_leader) may use the already freed task_struct.
Another problem is that sys_prlimit64() can race with mt-exec which changes
->group_leader. In this case do_prlimit() may take the wrong lock, or (worse)
->group_leader may change between task_lock() and task_unlock().
Change sys_prlimit64() to take tasklist_lock when necessary. This is not
nice, but I don't see a better fix for -stable.
Link: https://lkml.kernel.org/r/20250915120917.GA27702@redhat.com
Fixes: 18c91bb2d872 ("prlimit: do not grab the tasklist_lock")
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Mateusz Guzik <mjguzik(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sys.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
--- a/kernel/sys.c~fix-the-racy-usage-of-task_locktsk-group_leader-in-sys_prlimit64-paths
+++ a/kernel/sys.c
@@ -1734,6 +1734,7 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, u
struct rlimit old, new;
struct task_struct *tsk;
unsigned int checkflags = 0;
+ bool need_tasklist;
int ret;
if (old_rlim)
@@ -1760,8 +1761,25 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, u
get_task_struct(tsk);
rcu_read_unlock();
- ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
- old_rlim ? &old : NULL);
+ need_tasklist = !same_thread_group(tsk, current);
+ if (need_tasklist) {
+ /*
+ * Ensure we can't race with group exit or de_thread(),
+ * so tsk->group_leader can't be freed or changed until
+ * read_unlock(tasklist_lock) below.
+ */
+ read_lock(&tasklist_lock);
+ if (!pid_alive(tsk))
+ ret = -ESRCH;
+ }
+
+ if (!ret) {
+ ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
+ old_rlim ? &old : NULL);
+ }
+
+ if (need_tasklist)
+ read_unlock(&tasklist_lock);
if (!ret && old_rlim) {
rlim_to_rlim64(&old, &old64);
_
Patches currently in -mm which might be from oleg(a)redhat.com are
Przemek Kitszel says:
Improvements hardening PF-VF communication for i40e driver.
This patchset targets several issues that can cause undefined behavior
or be exploited in some other way.
---
IWL: https://lore.kernel.org/intel-wired-lan/20250813104552.61027-1-przemyslaw.k…
The following are changes since commit cbf658dd09419f1ef9de11b9604e950bdd5c170b:
Merge tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 40GbE
Lukasz Czapnik (8):
i40e: add validation for ring_len param
i40e: fix idx validation in i40e_validate_queue_map
i40e: fix idx validation in config queues msg
i40e: fix input validation logic for action_meta
i40e: fix validation of VF state in get resources
i40e: add max boundary check for VF filters
i40e: add mask to apply valid bits for itr_idx
i40e: improve VF MAC filters accounting
drivers/net/ethernet/intel/i40e/i40e.h | 3 +-
drivers/net/ethernet/intel/i40e/i40e_main.c | 26 ++++-
.../ethernet/intel/i40e/i40e_virtchnl_pf.c | 110 ++++++++++--------
.../ethernet/intel/i40e/i40e_virtchnl_pf.h | 3 +-
4 files changed, 90 insertions(+), 52 deletions(-)
--
2.47.1
commit 2b28fe75c7dbe7ec322e706eed4622964409e21d upstream
A number of recent Broadcom STB SoCs utilize a GIC-600 interrupt
controller thus requiring the use of the GICv3 driver.
Link: https://lore.kernel.org/r/20240726233414.2305526-1-florian.fainelli@broadco…
Signed-off-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
---
While technically not a bug fix, this allows me to properly build test
6.6 on additional platforms within our test rack. Thanks!
arch/arm/mach-bcm/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
index 8789d93a7c04..c705bec1410c 100644
--- a/arch/arm/mach-bcm/Kconfig
+++ b/arch/arm/mach-bcm/Kconfig
@@ -186,6 +186,7 @@ config ARCH_BRCMSTB
select ARCH_HAS_RESET_CONTROLLER
select ARM_AMBA
select ARM_GIC
+ select ARM_GIC_V3
select ARM_ERRATA_798181 if SMP
select HAVE_ARM_ARCH_TIMER
select ZONE_DMA if ARM_LPAE
--
2.34.1
The patch titled
Subject: mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Akinobu Mita <akinobu.mita(a)gmail.com>
Subject: mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
Date: Sat, 20 Sep 2025 22:25:46 +0900
The callback return value is ignored in damon_sysfs_damon_call(), which
means that it is not possible to detect invalid user input when writing
commands such as 'commit' to
/sys/kernel/mm/damon/admin/kdamonds/<K>/state. Fix it.
Link: https://lkml.kernel.org/r/20250920132546.5822-1-akinobu.mita@gmail.com
Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats")
Signed-off-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call
+++ a/mm/damon/sysfs.c
@@ -1592,12 +1592,14 @@ static int damon_sysfs_damon_call(int (*
struct damon_sysfs_kdamond *kdamond)
{
struct damon_call_control call_control = {};
+ int err;
if (!kdamond->damon_ctx)
return -EINVAL;
call_control.fn = fn;
call_control.data = kdamond;
- return damon_call(kdamond->damon_ctx, &call_control);
+ err = damon_call(kdamond->damon_ctx, &call_control);
+ return err ? err : call_control.return_code;
}
struct damon_sysfs_schemes_walk_data {
_
Patches currently in -mm which might be from akinobu.mita(a)gmail.com are
mm-damon-sysfs-do-not-ignore-callbacks-return-value-in-damon_sysfs_damon_call.patch
The patch titled
Subject: mm/thp: fix MTE tag mismatch when replacing zero-filled subpages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-thp-fix-mte-tag-mismatch-when-replacing-zero-filled-subpages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lance Yang <lance.yang(a)linux.dev>
Subject: mm/thp: fix MTE tag mismatch when replacing zero-filled subpages
Date: Mon, 22 Sep 2025 10:14:58 +0800
When both THP and MTE are enabled, splitting a THP and replacing its
zero-filled subpages with the shared zeropage can cause MTE tag mismatch
faults in userspace.
Remapping zero-filled subpages to the shared zeropage is unsafe, as the
zeropage has a fixed tag of zero, which may not match the tag expected by
the userspace pointer.
KSM already avoids this problem by using memcmp_pages(), which on arm64
intentionally reports MTE-tagged pages as non-identical to prevent unsafe
merging.
As suggested by David[1], this patch adopts the same pattern, replacing the
memchr_inv() byte-level check with a call to pages_identical(). This
leverages existing architecture-specific logic to determine if a page is
truly identical to the shared zeropage.
Having both the THP shrinker and KSM rely on pages_identical() makes the
design more future-proof, IMO. Instead of handling quirks in generic code,
we just let the architecture decide what makes two pages identical.
[1] https://lore.kernel.org/all/ca2106a3-4bb2-4457-81af-301fd99fbef4@redhat.com
Link: https://lkml.kernel.org/r/20250922021458.68123-1-lance.yang@linux.dev
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
Reported-by: Qun-wei Lin <Qun-wei.Lin(a)mediatek.com>
Closes: https://lore.kernel.org/all/a7944523fcc3634607691c35311a5d59d1a3f8d4.camel@…
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Zi Yan <ziy(a)nvidia.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Usama Arif <usamaarif642(a)gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: andrew.yang <andrew.yang(a)mediatek.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: Charlie Jenkins <charlie(a)rivosinc.com>
Cc: Chinwen Chang <chinwen.chang(a)mediatek.com>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: Gregory Price <gourry(a)gourry.net>
Cc: "Huang, Ying" <ying.huang(a)linux.alibaba.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Joshua Hahn <joshua.hahnjy(a)gmail.com>
Cc: Kairui Song <ryncsn(a)gmail.com>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mariano Pache <npache(a)redhat.com>
Cc: Mathew Brost <matthew.brost(a)intel.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Palmer Dabbelt <palmer(a)rivosinc.com>
Cc: Rakie Kim <rakie.kim(a)sk.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Samuel Holland <samuel.holland(a)sifive.com>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 15 +++------------
mm/migrate.c | 8 +-------
2 files changed, 4 insertions(+), 19 deletions(-)
--- a/mm/huge_memory.c~mm-thp-fix-mte-tag-mismatch-when-replacing-zero-filled-subpages
+++ a/mm/huge_memory.c
@@ -4115,32 +4115,23 @@ static unsigned long deferred_split_coun
static bool thp_underused(struct folio *folio)
{
int num_zero_pages = 0, num_filled_pages = 0;
- void *kaddr;
int i;
if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
return false;
for (i = 0; i < folio_nr_pages(folio); i++) {
- kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
- if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
- num_zero_pages++;
- if (num_zero_pages > khugepaged_max_ptes_none) {
- kunmap_local(kaddr);
+ if (pages_identical(folio_page(folio, i), ZERO_PAGE(0))) {
+ if (++num_zero_pages > khugepaged_max_ptes_none)
return true;
- }
} else {
/*
* Another path for early exit once the number
* of non-zero filled pages exceeds threshold.
*/
- num_filled_pages++;
- if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
- kunmap_local(kaddr);
+ if (++num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none)
return false;
- }
}
- kunmap_local(kaddr);
}
return false;
}
--- a/mm/migrate.c~mm-thp-fix-mte-tag-mismatch-when-replacing-zero-filled-subpages
+++ a/mm/migrate.c
@@ -301,9 +301,7 @@ static bool try_to_map_unused_to_zeropag
unsigned long idx)
{
struct page *page = folio_page(folio, idx);
- bool contains_data;
pte_t newpte;
- void *addr;
if (PageCompound(page))
return false;
@@ -320,11 +318,7 @@ static bool try_to_map_unused_to_zeropag
* this subpage has been non present. If the subpage is only zero-filled
* then map it to the shared zeropage.
*/
- addr = kmap_local_page(page);
- contains_data = memchr_inv(addr, 0, PAGE_SIZE);
- kunmap_local(addr);
-
- if (contains_data)
+ if (!pages_identical(page, ZERO_PAGE(0)))
return false;
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
_
Patches currently in -mm which might be from lance.yang(a)linux.dev are
hung_task-fix-warnings-caused-by-unaligned-lock-pointers.patch
mm-thp-fix-mte-tag-mismatch-when-replacing-zero-filled-subpages.patch
selftests-mm-skip-soft-dirty-tests-when-config_mem_soft_dirty-is-disabled.patch
The patch titled
Subject: fs/proc/task_mmu: check p->vec_buf for NULL
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fs-proc-task_mmu-check-p-vec_buf-for-null.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jakub Acs <acsjakub(a)amazon.de>
Subject: fs/proc/task_mmu: check p->vec_buf for NULL
Date: Mon, 22 Sep 2025 08:22:05 +0000
When the PAGEMAP_SCAN ioctl is invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), kernel panics with null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(), that
page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking p->vec_buf for NULL before dereferencing.
Other sites that might run into same deref-issue are already (directly or
transitively) protected by checking p->vec_buf.
Note:
From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output
is requested and it's only the side effects caller is interested in,
hence it passes check in pagemap_scan_get_args().
This issue was found by syzkaller.
Link: https://lkml.kernel.org/r/20250922082206.6889-1-acsjakub@amazon.de
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Micha�� Miros��aw" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/proc/task_mmu.c~fs-proc-task_mmu-check-p-vec_buf-for-null
+++ a/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(s
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!p->vec_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
_
Patches currently in -mm which might be from acsjakub(a)amazon.de are
fs-proc-task_mmu-check-p-vec_buf-for-null.patch
Upstream commit ce971233242b ("s390/cpum_cf: Deny all sampling events by
counter PMU"), backported to 6.6 as commit d660c8d8142e ("s390/cpum_cf:
Deny all sampling events by counter PMU"), implicitly depends on the
unconditional initialization of err to -ENOENT added by upstream
commit aa1ac98268cd ("s390/cpumf: Fix double free on error in
cpumf_pmu_event_init()"). The latter change is missing from 6.6,
resulting in an instance of -Wuninitialized, which is fairly obvious
from looking at the actual diff.
arch/s390/kernel/perf_cpum_cf.c:858:10: warning: variable 'err' is uninitialized when used here [-Wuninitialized]
858 | return err;
| ^~~
Commit aa1ac98268cd ("s390/cpumf: Fix double free on error in
cpumf_pmu_event_init()") depends on commit c70ca298036c ("perf/core:
Simplify the perf_event_alloc() error path"), which is a part of a much
larger series unsuitable for stable.
Extract the unconditional initialization of err to -ENOENT from
commit aa1ac98268cd ("s390/cpumf: Fix double free on error in
cpumf_pmu_event_init()") and apply it to 6.6 as a standalone change to
resolve the warning.
Fixes: d660c8d8142e ("s390/cpum_cf: Deny all sampling events by counter PMU")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
This was reported on 6.1 and the offending commit was dropped but 6.6
suffered from the same issue (I am surprised LKFT's testing caught it in
the 6.1 case but not 6.6...):
https://lore.kernel.org/CADYN=9Li3gHMJ+weE0khMBmpS1Wcj-XaUeaUZg2Nxdz0qY9sdg…
As it is already released, I figured I would just submit a fixup patch,
which could also be used to fix the issue in 6.1.
---
arch/s390/kernel/perf_cpum_cf.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index 771e1cb17540..e590b4c09625 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -852,7 +852,7 @@ static int cpumf_pmu_event_type(struct perf_event *event)
static int cpumf_pmu_event_init(struct perf_event *event)
{
unsigned int type = event->attr.type;
- int err;
+ int err = -ENOENT;
if (is_sampling_event(event)) /* No sampling support */
return err;
@@ -861,8 +861,6 @@ static int cpumf_pmu_event_init(struct perf_event *event)
else if (event->pmu->type == type)
/* Registered as unknown PMU */
err = __hw_perf_event_init(event, cpumf_pmu_event_type(event));
- else
- return -ENOENT;
if (unlikely(err) && event->destroy)
event->destroy(event);
---
base-commit: af1544b5d072514b219695b0a9fba0b1e0d5e289
change-id: 20250922-6-6-s390-cpum_cf-fix-uninit-err-010876c3d58c
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 2066f00e5b2dc061fb6d8c88fadaebc97f11feaa
Gitweb: https://git.kernel.org/tip/2066f00e5b2dc061fb6d8c88fadaebc97f11feaa
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Sun, 21 Sep 2025 10:56:40 +02:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Mon, 22 Sep 2025 21:25:36 +02:00
x86/topology: Implement topology_is_core_online() to address SMT regression
Christian reported that commit a430c11f4015 ("intel_idle: Rescan "dead" SMT
siblings during initialization") broke the use case in which both 'nosmt'
and 'maxcpus' are on the kernel command line because it onlines primary
threads, which were offline due to the maxcpus limit.
The initially proposed fix to skip primary threads in the loop is
inconsistent. While it prevents the primary thread to be onlined, it then
onlines the corresponding hyperthread(s), which does not really make sense.
The CPU iterator in cpuhp_smt_enable() contains a check which excludes all
threads of a core, when the primary thread is offline. The default
implementation is a NOOP and therefore not effective on x86.
Implement topology_is_core_online() on x86 to address this issue. This
makes the behaviour consistent between x86 and PowerPC.
Fixes: a430c11f4015 ("intel_idle: Rescan "dead" SMT siblings during initialization")
Fixes: f694481b1d31 ("ACPI: processor: Rescan "dead" SMT siblings during initialization")
Closes: https://lore.kernel.org/linux-pm/724616a2-6374-4ba3-8ce3-ea9c45e2ae3b@arm.c…
Reported-by: Christian Loehle <christian.loehle(a)arm.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael(a)kernel.org>
Tested-by: Christian Loehle <christian.loehle(a)arm.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/12740505.O9o76ZdvQC@rafael.j.wysocki
---
arch/x86/include/asm/topology.h | 10 ++++++++++
arch/x86/kernel/cpu/topology.c | 13 +++++++++++++
2 files changed, 23 insertions(+)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 6c79ee7..2104189 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -231,6 +231,16 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
}
#define topology_is_primary_thread topology_is_primary_thread
+int topology_get_primary_thread(unsigned int cpu);
+
+static inline bool topology_is_core_online(unsigned int cpu)
+{
+ int pcpu = topology_get_primary_thread(cpu);
+
+ return pcpu >= 0 ? cpu_online(pcpu) : false;
+}
+#define topology_is_core_online topology_is_core_online
+
#else /* CONFIG_SMP */
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
static inline int topology_max_smt_threads(void) { return 1; }
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index e35ccdc..6073a16 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -372,6 +372,19 @@ unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_uni
return topo_unit_count(lvlid, at_level, apic_maps[which_units].map);
}
+#ifdef CONFIG_SMP
+int topology_get_primary_thread(unsigned int cpu)
+{
+ u32 apic_id = cpuid_to_apicid[cpu];
+
+ /*
+ * Get the core domain level APIC id, which is the primary thread
+ * and return the CPU number assigned to it.
+ */
+ return topo_lookup_cpuid(topo_apicid(apic_id, TOPO_CORE_DOMAIN));
+}
+#endif
+
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/**
* topology_hotplug_apic - Handle a physical hotplugged APIC after boot
In of_unittest_pci_node_verify(), when the add parameter is false,
device_find_any_child() obtains a reference to a child device. This
function implicitly calls get_device() to increment the device's
reference count before returning the pointer. However, the caller
fails to properly release this reference by calling put_device(),
leading to a device reference count leak.
As the comment of device_find_any_child states: "NOTE: you will need
to drop the reference with put_device() after use".
Cc: stable(a)vger.kernel.org
Fixes: 26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/of/unittest.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index e3503ec20f6c..d225e73781fe 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -4271,7 +4271,7 @@ static struct platform_driver unittest_pci_driver = {
static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
{
struct device_node *pnp, *np = NULL;
- struct device *child_dev;
+ struct device *child_dev = NULL;
char *path = NULL;
const __be32 *reg;
int rc = 0;
@@ -4306,6 +4306,8 @@ static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
kfree(path);
if (np)
of_node_put(np);
+ if (child_dev)
+ put_device(child_dev);
return rc;
}
--
2.17.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092107-making-cough-9671@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indiciates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
This reverts commit ce8da5d13d8c ("loop: Avoid updating block size under
exclusive owner") for the 6.6 kernel, because if the LTP ioctl_loop06 test is
run with this patch in place, the test will fail, it leaves the host unable to
kexec into the kernel again (hangs forever) and "losetup -a" will hang on
attempting to access the /dev/loopN device that the test has set up.
The patch doesn't need to be reverted from 6.12, as it works fine there.
Cc: stable(a)vger.kernel.org # 6.6.x
Signed-off-by: Eric Hagberg <ehagberg(a)janestreet.com>
---
--- b/drivers/block/loop.c
+++ a/drivers/block/loop.c
@@ -1472,36 +1472,19 @@
return error;
}
+static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
-static int loop_set_block_size(struct loop_device *lo, blk_mode_t mode,
- struct block_device *bdev, unsigned long arg)
{
int err = 0;
+ if (lo->lo_state != Lo_bound)
+ return -ENXIO;
- /*
- * If we don't hold exclusive handle for the device, upgrade to it
- * here to avoid changing device under exclusive owner.
- */
- if (!(mode & BLK_OPEN_EXCL)) {
- err = bd_prepare_to_claim(bdev, loop_set_block_size, NULL);
- if (err)
- return err;
- }
-
- err = mutex_lock_killable(&lo->lo_mutex);
- if (err)
- goto abort_claim;
-
- if (lo->lo_state != Lo_bound) {
- err = -ENXIO;
- goto unlock;
- }
err = blk_validate_block_size(arg);
if (err)
return err;
if (lo->lo_queue->limits.logical_block_size == arg)
+ return 0;
- goto unlock;
sync_blockdev(lo->lo_device);
invalidate_bdev(lo->lo_device);
@@ -1513,11 +1496,6 @@
loop_update_dio(lo);
blk_mq_unfreeze_queue(lo->lo_queue);
-unlock:
- mutex_unlock(&lo->lo_mutex);
-abort_claim:
- if (!(mode & BLK_OPEN_EXCL))
- bd_abort_claiming(bdev, loop_set_block_size);
return err;
}
@@ -1536,6 +1514,9 @@
case LOOP_SET_DIRECT_IO:
err = loop_set_dio(lo, arg);
break;
+ case LOOP_SET_BLOCK_SIZE:
+ err = loop_set_block_size(lo, arg);
+ break;
default:
err = -EINVAL;
}
@@ -1590,12 +1571,9 @@
break;
case LOOP_GET_STATUS64:
return loop_get_status64(lo, argp);
- case LOOP_SET_BLOCK_SIZE:
- if (!(mode & BLK_OPEN_WRITE) && !capable(CAP_SYS_ADMIN))
- return -EPERM;
- return loop_set_block_size(lo, mode, bdev, arg);
case LOOP_SET_CAPACITY:
case LOOP_SET_DIRECT_IO:
+ case LOOP_SET_BLOCK_SIZE:
if (!(mode & BLK_OPEN_WRITE) && !capable(CAP_SYS_ADMIN))
return -EPERM;
fallthrough;
Commit 16c07342b542 ("gpiolib: acpi: Program debounce when finding GPIO")
adds a gpio_set_debounce_timeout() call to acpi_find_gpio() and makes
acpi_find_gpio() fail if this fails.
But gpio_set_debounce_timeout() failing is a somewhat normal occurrence,
since not all debounce values are supported on all GPIO/pinctrl chips.
Making this an error for example break getting the card-detect GPIO for
the micro-sd slot found on many Bay Trail tablets, breaking support for
the micro-sd slot on these tablets.
acpi_request_own_gpiod() already treats gpio_set_debounce_timeout()
failures as non-fatal, just warning about them.
Add a acpi_gpio_set_debounce_timeout() helper which wraps
gpio_set_debounce_timeout() and warns on failures and replace both existing
gpio_set_debounce_timeout() calls with the helper.
Since the helper only warns on failures this fixes the card-detect issue.
Fixes: 16c07342b542 ("gpiolib: acpi: Program debounce when finding GPIO")
Cc: stable(a)vger.kernel.org
Cc: Mario Limonciello <superm1(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
drivers/gpio/gpiolib-acpi-core.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/drivers/gpio/gpiolib-acpi-core.c b/drivers/gpio/gpiolib-acpi-core.c
index 284e762d92c4..67c4c38afb86 100644
--- a/drivers/gpio/gpiolib-acpi-core.c
+++ b/drivers/gpio/gpiolib-acpi-core.c
@@ -291,6 +291,19 @@ acpi_gpio_to_gpiod_flags(const struct acpi_resource_gpio *agpio, int polarity)
return GPIOD_ASIS;
}
+static void acpi_gpio_set_debounce_timeout(struct gpio_desc *desc,
+ unsigned int acpi_debounce)
+{
+ int ret;
+
+ /* ACPI uses hundredths of milliseconds units */
+ acpi_debounce *= 10;
+ ret = gpio_set_debounce_timeout(desc, acpi_debounce);
+ if (ret)
+ gpiod_warn(desc, "Failed to set debounce-timeout %u: %d\n",
+ acpi_debounce, ret);
+}
+
static struct gpio_desc *acpi_request_own_gpiod(struct gpio_chip *chip,
struct acpi_resource_gpio *agpio,
unsigned int index,
@@ -300,18 +313,12 @@ static struct gpio_desc *acpi_request_own_gpiod(struct gpio_chip *chip,
enum gpiod_flags flags = acpi_gpio_to_gpiod_flags(agpio, polarity);
unsigned int pin = agpio->pin_table[index];
struct gpio_desc *desc;
- int ret;
desc = gpiochip_request_own_desc(chip, pin, label, polarity, flags);
if (IS_ERR(desc))
return desc;
- /* ACPI uses hundredths of milliseconds units */
- ret = gpio_set_debounce_timeout(desc, agpio->debounce_timeout * 10);
- if (ret)
- dev_warn(chip->parent,
- "Failed to set debounce-timeout for pin 0x%04X, err %d\n",
- pin, ret);
+ acpi_gpio_set_debounce_timeout(desc, agpio->debounce_timeout);
return desc;
}
@@ -944,7 +951,6 @@ struct gpio_desc *acpi_find_gpio(struct fwnode_handle *fwnode,
bool can_fallback = acpi_can_fallback_to_crs(adev, con_id);
struct acpi_gpio_info info = {};
struct gpio_desc *desc;
- int ret;
desc = __acpi_find_gpio(fwnode, con_id, idx, can_fallback, &info);
if (IS_ERR(desc))
@@ -959,10 +965,7 @@ struct gpio_desc *acpi_find_gpio(struct fwnode_handle *fwnode,
acpi_gpio_update_gpiod_flags(dflags, &info);
acpi_gpio_update_gpiod_lookup_flags(lookupflags, &info);
- /* ACPI uses hundredths of milliseconds units */
- ret = gpio_set_debounce_timeout(desc, info.debounce * 10);
- if (ret)
- return ERR_PTR(ret);
+ acpi_gpio_set_debounce_timeout(desc, info.debounce);
return desc;
}
--
2.51.0
devm_kcalloc() may fail. mt8365_afe_suspend() uses afe->reg_back_up
unconditionally after allocation and writes afe->reg_back_up[i], which
can lead to a NULL pointer dereference under low-memory conditions.
Add a NULL check and bail out with -ENOMEM, making sure to disable the
main clock via the existing error path to keep clock state balanced.
Fixes: e1991d102bc2 ("ASoC: mediatek: mt8365: Add the AFE driver support")
Cc: stable(a)vger.kernel.org
---
changelog:
v2:
- Return -ENOMEM directly on allocation failure without goto/label.
- Disable the main clock before returning to keep clock state balanced.
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
sound/soc/mediatek/mt8365/mt8365-afe-pcm.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c b/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
index 10793bbe9275..55d832e05072 100644
--- a/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
+++ b/sound/soc/mediatek/mt8365/mt8365-afe-pcm.c
@@ -1975,11 +1975,15 @@ static int mt8365_afe_suspend(struct device *dev)
mt8365_afe_enable_main_clk(afe);
- if (!afe->reg_back_up)
+ if (!afe->reg_back_up) {
afe->reg_back_up =
devm_kcalloc(dev, afe->reg_back_up_list_num,
- sizeof(unsigned int), GFP_KERNEL);
-
+ sizeof(unsigned int), GFP_KERNEL);
+ if (!afe->reg_back_up) {
+ mt8365_afe_disable_main_clk(afe);
+ return -ENOMEM;
+ }
+ }
for (i = 0; i < afe->reg_back_up_list_num; i++)
regmap_read(regmap, afe->reg_back_up_list[i],
&afe->reg_back_up[i]);
--
2.43.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 4f115596133fa168bac06bb34c6efd8f4d84c22e
Gitweb: https://git.kernel.org/tip/4f115596133fa168bac06bb34c6efd8f4d84c22e
Author: Alexander Popov <alex.popov(a)linux.com>
AuthorDate: Sun, 21 Sep 2025 23:58:15 +03:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Mon, 22 Sep 2025 14:40:17 +02:00
x86/Kconfig: Reenable PTDUMP on i386
The commit
f9aad622006bd64c ("mm: rename GENERIC_PTDUMP and PTDUMP_CORE")
has broken PTDUMP and the Kconfig options that use it on ARCH=i386, including
CONFIG_DEBUG_WX.
CONFIG_GENERIC_PTDUMP was renamed into CONFIG_ARCH_HAS_PTDUMP, but it was
mistakenly moved from "config X86" to "config X86_64". That made PTDUMP
unavailable for i386.
Move CONFIG_ARCH_HAS_PTDUMP back to "config X86" to fix it.
[ bp: Massage commit message. ]
Fixes: f9aad622006bd64c ("mm: rename GENERIC_PTDUMP and PTDUMP_CORE")
Signed-off-by: Alexander Popov <alex.popov(a)linux.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)vger.kernel.org
---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 52c8910..0588030 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -26,7 +26,6 @@ config X86_64
depends on 64BIT
# Options that are inherently 64-bit kernel only:
select ARCH_HAS_GIGANTIC_PAGE
- select ARCH_HAS_PTDUMP
select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
select ARCH_SUPPORTS_PER_VMA_LOCK
@@ -99,6 +98,7 @@ config X86
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_PREEMPT_LAZY
+ select ARCH_HAS_PTDUMP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
select ARCH_HAS_NONLEAF_PMD_YOUNG if PGTABLE_LEVELS > 2
When a software-node gets added to a device which already has another
fwnode as primary node it will become the secondary fwnode for that
device.
Currently if a software-node with GPIO properties ends up as the secondary
fwnode then gpiod_find_by_fwnode() will fail to find the GPIOs.
Add a new gpiod_fwnode_lookup() helper which falls back to calling
gpiod_find_by_fwnode() with the secondary fwnode if the GPIO was not
found in the primary fwnode.
Fixes: e7f9ff5dc90c ("gpiolib: add support for software nodes")
Cc: stable(a)vger.kernel.org
Cc: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
Changes in v2:
- Add a new gpiod_fwnode_lookup() helper instead of putting the secondary
fwnode check inside gpiod_find_by_fwnode()
---
drivers/gpio/gpiolib.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 0d2b470a252e..74d54513730a 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -4604,6 +4604,23 @@ static struct gpio_desc *gpiod_find_by_fwnode(struct fwnode_handle *fwnode,
return desc;
}
+static struct gpio_desc *gpiod_fwnode_lookup(struct fwnode_handle *fwnode,
+ struct device *consumer,
+ const char *con_id,
+ unsigned int idx,
+ enum gpiod_flags *flags,
+ unsigned long *lookupflags)
+{
+ struct gpio_desc *desc;
+
+ desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx, flags, lookupflags);
+ if (gpiod_not_found(desc) && !IS_ERR_OR_NULL(fwnode))
+ desc = gpiod_find_by_fwnode(fwnode->secondary, consumer, con_id,
+ idx, flags, lookupflags);
+
+ return desc;
+}
+
struct gpio_desc *gpiod_find_and_request(struct device *consumer,
struct fwnode_handle *fwnode,
const char *con_id,
@@ -4622,8 +4639,8 @@ struct gpio_desc *gpiod_find_and_request(struct device *consumer,
int ret = 0;
scoped_guard(srcu, &gpio_devices_srcu) {
- desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx,
- &flags, &lookupflags);
+ desc = gpiod_fwnode_lookup(fwnode, consumer, con_id, idx,
+ &flags, &lookupflags);
if (gpiod_not_found(desc) && platform_lookup_allowed) {
/*
* Either we are not using DT or ACPI, or their lookup
--
2.51.0
On Wed 17-09-25 11:18:50, Eric Hagberg wrote:
> I stumbled across a problem where the 6.6.103 kernel will fail when
> running the ioctl_loop06 test from the LTP test suite... and worse
> than failing the test, it leaves the system in a state where you can't
> run "losetup -a" again because the /dev/loopN device that the test
> created and failed the test on... hangs in a LOOP_GET_STATUS64 ioctl.
>
> It also leaves the system in a state where you can't re-kexec into a
> copy of the kernel as it gets completely hung at the point where it
> says "starting Reboot via kexec"...
Thanks for the report! Please report issues with stable kernels to
stable(a)vger.kernel.org (CCed now) because they can act on them.
> If I revert just that patch from 6.6.103 (or newer) kernels, then the
> test succeeds and doesn't leave the host in a bad state. The patch
> applied to 6.12 doesn't cause this problem, but I also see that there
> are quite a few other changes to the loop subsystem in 6.12 that never
> made it to 6.6.
>
> For now, I'll probably just revert your patch in my 6.6 kernel builds,
> but I wouldn't be surprised if others stumble across this issue as
> well, so maybe it should be reverted or fixed some other way.
Yes, I think revert from 6.6 stable kernel is warranted (unless somebody
has time to figure out what else is missing to make the patch work with
that stable branch).
Honza
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR
On 9/22/25 00:52, gregkh(a)linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
>
> to the 6.12-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> x86-sev-guard-sev_evict_cache-with-config_amd_mem_encrypt.patch
> and it can be found in the queue-6.12 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Maybe I didn't use the tag correctly, but I put 6.16.x on the stable tag
to indicate that the patch only applied to 6.16 and above. Before 6.16,
there isn't a stub version of the function, so all off those releases
are fine.
So this patch doesn't need to be part of the 6.12 stable tree.
Thanks,
Tom
>
>
> From stable+bounces-180849-greg=kroah.com(a)vger.kernel.org Mon Sep 22 01:18:07 2025
> From: Sasha Levin <sashal(a)kernel.org>
> Date: Sun, 21 Sep 2025 19:17:59 -0400
> Subject: x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
> To: stable(a)vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky(a)amd.com>, "Borislav Petkov (AMD)" <bp(a)alien8.de>, stable(a)kernel.org, Sasha Levin <sashal(a)kernel.org>
> Message-ID: <20250921231759.3033314-1-sashal(a)kernel.org>
>
> From: Tom Lendacky <thomas.lendacky(a)amd.com>
>
> [ Upstream commit 7f830e126dc357fc086905ce9730140fd4528d66 ]
>
> The sev_evict_cache() is guest-related code and should be guarded by
> CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
>
> CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
> guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
> function of sev_evict_cache() instead of the version that performs the actual
> eviction. Move the function declarations under the appropriate #ifdef.
>
> Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
> Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
> Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
> Cc: stable(a)kernel.org # 6.16.x
> Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
> [ Move sev_evict_cache() out of shared.c ]
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
> arch/x86/coco/sev/shared.c | 18 ------------------
> arch/x86/include/asm/sev.h | 19 +++++++++++++++++++
> 2 files changed, 19 insertions(+), 18 deletions(-)
>
> --- a/arch/x86/coco/sev/shared.c
> +++ b/arch/x86/coco/sev/shared.c
> @@ -1243,24 +1243,6 @@ static void svsm_pval_terminate(struct s
> __pval_terminate(pfn, action, page_size, ret, svsm_ret);
> }
>
> -static inline void sev_evict_cache(void *va, int npages)
> -{
> - volatile u8 val __always_unused;
> - u8 *bytes = va;
> - int page_idx;
> -
> - /*
> - * For SEV guests, a read from the first/last cache-lines of a 4K page
> - * using the guest key is sufficient to cause a flush of all cache-lines
> - * associated with that 4K page without incurring all the overhead of a
> - * full CLFLUSH sequence.
> - */
> - for (page_idx = 0; page_idx < npages; page_idx++) {
> - val = bytes[page_idx * PAGE_SIZE];
> - val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
> - }
> -}
> -
> static void svsm_pval_4k_page(unsigned long paddr, bool validate)
> {
> struct svsm_pvalidate_call *pc;
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -400,6 +400,24 @@ u64 sev_get_status(void);
> void sev_show_status(void);
> void snp_update_svsm_ca(void);
>
> +static inline void sev_evict_cache(void *va, int npages)
> +{
> + volatile u8 val __always_unused;
> + u8 *bytes = va;
> + int page_idx;
> +
> + /*
> + * For SEV guests, a read from the first/last cache-lines of a 4K page
> + * using the guest key is sufficient to cause a flush of all cache-lines
> + * associated with that 4K page without incurring all the overhead of a
> + * full CLFLUSH sequence.
> + */
> + for (page_idx = 0; page_idx < npages; page_idx++) {
> + val = bytes[page_idx * PAGE_SIZE];
> + val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
> + }
> +}
> +
> #else /* !CONFIG_AMD_MEM_ENCRYPT */
>
> #define snp_vmpl 0
> @@ -435,6 +453,7 @@ static inline u64 snp_get_unsupported_fe
> static inline u64 sev_get_status(void) { return 0; }
> static inline void sev_show_status(void) { }
> static inline void snp_update_svsm_ca(void) { }
> +static inline void sev_evict_cache(void *va, int npages) {}
>
> #endif /* CONFIG_AMD_MEM_ENCRYPT */
>
>
>
> Patches currently in stable-queue which might be from sashal(a)kernel.org are
>
> queue-6.12/mptcp-tfo-record-deny-join-id0-info.patch
> queue-6.12/crypto-af_alg-set-merge-to-zero-early-in-af_alg_send.patch
> queue-6.12/asoc-wm8940-correct-pll-rate-rounding.patch
> queue-6.12/um-virtio_uml-fix-use-after-free-after-put_device-in.patch
> queue-6.12/x86-sev-guard-sev_evict_cache-with-config_amd_mem_encrypt.patch
> queue-6.12/mptcp-pm-nl-announce-deny-join-id0-flag.patch
> queue-6.12/drm-bridge-anx7625-fix-null-pointer-dereference-with.patch
> queue-6.12/asoc-sof-intel-hda-stream-fix-incorrect-variable-use.patch
> queue-6.12/qed-don-t-collect-too-many-protection-override-grc-e.patch
> queue-6.12/dpaa2-switch-fix-buffer-pool-seeding-for-control-tra.patch
> queue-6.12/nvme-fix-pi-insert-on-write.patch
> queue-6.12/xhci-dbc-fix-full-dbc-transfer-ring-after-several-reconnects.patch
> queue-6.12/pcmcia-omap_cf-mark-driver-struct-with-__refdata-to-.patch
> queue-6.12/tcp-clear-tcp_sk-sk-fastopen_rsk-in-tcp_disconnect.patch
> queue-6.12/wifi-mac80211-increase-scan_ies_len-for-s1g.patch
> queue-6.12/i40e-remove-redundant-memory-barrier-when-cleaning-t.patch
> queue-6.12/usb-xhci-remove-option-to-change-a-default-ring-s-trb-cycle-bit.patch
> queue-6.12/btrfs-fix-invalid-extref-key-setup-when-replaying-de.patch
> queue-6.12/io_uring-fix-incorrect-io_kiocb-reference-in-io_link.patch
> queue-6.12/ice-fix-rx-page-leak-on-multi-buffer-frames.patch
> queue-6.12/net-natsemi-fix-rx_dropped-double-accounting-on-neti.patch
> queue-6.12/drm-xe-tile-release-kobject-for-the-failure-path.patch
> queue-6.12/wifi-mac80211-fix-incorrect-type-for-ret.patch
> queue-6.12/smb-client-fix-smbdirect_recv_io-leak-in-smbd_negoti.patch
> queue-6.12/net-mlx5e-harden-uplink-netdev-access-against-device.patch
> queue-6.12/usb-xhci-introduce-macro-for-ring-segment-list-iteration.patch
> queue-6.12/revert-net-mlx5e-update-and-set-xon-xoff-upon-port-s.patch
> queue-6.12/net-liquidio-fix-overflow-in-octeon_init_instr_queue.patch
> queue-6.12/net-tcp-fix-a-null-pointer-dereference-when-using-tc.patch
> queue-6.12/drm-bridge-cdns-mhdp8546-fix-missing-mutex-unlock-on.patch
> queue-6.12/ice-store-max_frame-and-rx_buf_len-only-in-ice_rx_ri.patch
> queue-6.12/selftests-mptcp-userspace-pm-validate-deny-join-id0-.patch
> queue-6.12/bonding-set-random-address-only-when-slaves-already-.patch
> queue-6.12/drm-xe-fix-a-null-vs-is_err-in-xe_vm_add_compute_exe.patch
> queue-6.12/cnic-fix-use-after-free-bugs-in-cnic_delete_task.patch
> queue-6.12/mm-gup-check-ref_count-instead-of-lru-before-migration.patch
> queue-6.12/tls-make-sure-to-abort-the-stream-if-headers-are-bog.patch
> queue-6.12/um-fix-fd-copy-size-in-os_rcv_fd_msg.patch
> queue-6.12/smb-client-let-smbd_destroy-call-disable_work_sync-i.patch
> queue-6.12/bonding-don-t-set-oif-to-bond-dev-when-getting-ns-ta.patch
> queue-6.12/xhci-dbc-decouple-endpoint-allocation-from-initialization.patch
> queue-6.12/mptcp-set-remote_deny_join_id0-on-syn-recv.patch
> queue-6.12/octeontx2-pf-fix-use-after-free-bugs-in-otx2_sync_ts.patch
> queue-6.12/smb-client-fix-filename-matching-of-deferred-files.patch
> queue-6.12/igc-don-t-fail-igc_probe-on-led-setup-error.patch
> queue-6.12/octeon_ep-fix-vf-mac-address-lifecycle-handling.patch
> queue-6.12/selftests-mptcp-sockopt-fix-error-messages.patch
> queue-6.12/cgroup-split-cgroup_destroy_wq-into-3-workqueues.patch
> queue-6.12/alsa-firewire-motu-drop-epollout-from-poll-return-va.patch
> queue-6.12/asoc-wm8974-correct-pll-rate-rounding.patch
> queue-6.12/mm-add-folio_expected_ref_count-for-reference-count-calculation.patch
> queue-6.12/wifi-wilc1000-avoid-buffer-overflow-in-wid-string-co.patch
> queue-6.12/asoc-intel-catpt-expose-correct-bit-depth-to-userspa.patch
> queue-6.12/asoc-wm8940-correct-typo-in-control-name.patch
> queue-6.12/perf-x86-intel-fix-crash-in-icl_update_topdown_event.patch
From: Vidya Sagar <vidyas(a)nvidia.com>
The return value from tegra_bpmp_transfer() indicates the success or
failure of the IPC transaction with BPMP. If the transaction
succeeded, we also need to check the actual command's result code.
If we don't have error handling for tegra_bpmp_transfer(), we will
set the pcie->ep_state to EP_STATE_ENABLED (even though the
tegra_bpmp_transfer() command failed). Thus, the pcie->ep_state will
get out of sync with reality, and any further PERST# assert + deassert
will be a no-op (will not trigger the hardware initialization sequence).
This is because pex_ep_event_pex_rst_deassert() checks the current
pcie->ep_state, and does nothing if the current state is already
EP_STATE_ENABLED.
Thus, it is important to have error handling for tegra_bpmp_transfer(),
such that the pcie->ep_state can not get out of sync with reality, so
that we will try to initialize the hardware not only during the first
PERST# assert + deassert, but also during any succeeding PERST# assert +
deassert.
One example where this fix is needed is when using a rock5b as host.
During the initial PERST# assert + deassert (triggered by the bootloader
on the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for
some unknown reason, the tegra_bpmp_transfer() call to initialize the PHY
fails. Once Linux has been loaded on the rock5b, the PCIe driver will once
again assert + deassert PERST#. However, without tegra_bpmp_transfer()
error handling, this second PERST# assert + deassert will not trigger the
hardware initialization sequence.
With tegra_bpmp_transfer() error handling, the second PERST# assert +
deassert will once again trigger the hardware to be initialized and this
time the tegra_bpmp_transfer() succeeds.
Cc: stable(a)vger.kernel.org
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas(a)nvidia.com>
[cassel: improve commit log]
Reviewed-by: Jon Hunter <jonathanh(a)nvidia.com>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-tegra194.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 7eb48cc13648e..c4265b3f72048 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -1223,6 +1223,7 @@ static int tegra_pcie_bpmp_set_ctrl_state(struct tegra_pcie_dw *pcie,
struct mrq_uphy_response resp;
struct tegra_bpmp_message msg;
struct mrq_uphy_request req;
+ int err;
/*
* Controller-5 doesn't need to have its state set by BPMP-FW in
@@ -1245,7 +1246,13 @@ static int tegra_pcie_bpmp_set_ctrl_state(struct tegra_pcie_dw *pcie,
msg.rx.data = &resp;
msg.rx.size = sizeof(resp);
- return tegra_bpmp_transfer(pcie->bpmp, &msg);
+ err = tegra_bpmp_transfer(pcie->bpmp, &msg);
+ if (err)
+ return err;
+ if (msg.rx.ret)
+ return -EINVAL;
+
+ return 0;
}
static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie,
@@ -1254,6 +1261,7 @@ static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie,
struct mrq_uphy_response resp;
struct tegra_bpmp_message msg;
struct mrq_uphy_request req;
+ int err;
memset(&req, 0, sizeof(req));
memset(&resp, 0, sizeof(resp));
@@ -1273,7 +1281,13 @@ static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie,
msg.rx.data = &resp;
msg.rx.size = sizeof(resp);
- return tegra_bpmp_transfer(pcie->bpmp, &msg);
+ err = tegra_bpmp_transfer(pcie->bpmp, &msg);
+ if (err)
+ return err;
+ if (msg.rx.ret)
+ return -EINVAL;
+
+ return 0;
}
static void tegra_pcie_downstream_dev_to_D0(struct tegra_pcie_dw *pcie)
--
2.51.0
Tegra already defines all BARs expect for BAR0 as BAR_RESERVED.
This is sufficient for pci-epf-test to not allocate backing memory and to
not call set_bar() for those BARs. However, marking a BAR as BAR_RESERVED
does not mean that the BAR get disabled.
The host side driver, pci_endpoint_test, simply does an ioremap for all
enabled BARs, and will run tests against all enabled BARs. (I.e. it will
run tests also against the BARs marked as BAR_RESERVED.)
After running the BARs tests (which will write to all enabled BARs), the
inbound address translation is broken.
This is because the tegra controller exposes the ATU Port Logic Structure
in BAR4. So when BAR4 is written, the inbound address translation settings
get overwritten.
To avoid this, implement the dw_pcie_ep_ops .init() callback and start off
by disabling all BARs (pci-epf-test will later enable/configure BARs that
are not defined as BAR_RESERVED).
This matches the behavior of other PCIe endpoint drivers:
dra7xx, imx6, layerscape-ep, artpec6, dw-rockchip, qcom-ep, rcar-gen4, and
uniphier-ep.
With this, the PCI endpoint kselftest test case CONSECUTIVE_BAR_TEST
(which was specifically made to detect address translation issues) passes.
Cc: stable(a)vger.kernel.org
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-tegra194.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 63d310e5335f4..7eb48cc13648e 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -1955,6 +1955,15 @@ static irqreturn_t tegra_pcie_ep_pex_rst_irq(int irq, void *arg)
return IRQ_HANDLED;
}
+static void tegra_pcie_ep_init(struct dw_pcie_ep *ep)
+{
+ struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+ enum pci_barno bar;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++)
+ dw_pcie_ep_reset_bar(pci, bar);
+};
+
static int tegra_pcie_ep_raise_intx_irq(struct tegra_pcie_dw *pcie, u16 irq)
{
/* Tegra194 supports only INTA */
@@ -2030,6 +2039,7 @@ tegra_pcie_ep_get_features(struct dw_pcie_ep *ep)
}
static const struct dw_pcie_ep_ops pcie_ep_ops = {
+ .init = tegra_pcie_ep_init,
.raise_irq = tegra_pcie_ep_raise_irq,
.get_features = tegra_pcie_ep_get_features,
};
--
2.51.0
This series includes a total of 27 patches, to align minmax.h of
v5.15.y with v6.17-rc6.
The set consists of 24 commits that directly update minmax.h:
1) 92d23c6e9415 ("overflow, tracing: Define the is_signed_type() macro
once")
2) 5efcecd9a3b1 ("minmax: sanity check constant bounds when clamping")
3) 2122e2a4efc2 ("minmax: clamp more efficiently by avoiding extra
comparison")
4) f9bff0e31881 ("minmax: add in_range() macro")
5) c952c748c7a9 ("minmax: Introduce {min,max}_array()")
6) 5e57418a2031 ("minmax: deduplicate __unconst_integer_typeof()")
7) f6e9d38f8eb0 ("minmax: fix header inclusions")
8) d03eba99f5bf ("minmax: allow min()/max()/clamp() if the arguments
have the same signedness.")
9) f4b84b2ff851 ("minmax: fix indentation of __cmp_once() and
__clamp_once()")
10) 4ead534fba42 ("minmax: allow comparisons of 'int' against 'unsigned
char/short'")
11) 867046cc7027 ("minmax: relax check to allow comparison between
unsigned arguments and signed constants")
12) 3a7e02c040b1 ("minmax: avoid overly complicated constant
expressions in VM code")
14) 017fa3e89187 ("minmax: simplify and clarify min_t()/max_t()
implementation")
15) 1a251f52cfdc ("minmax: make generic MIN() and MAX() macros
available everywhere")
18) dc1c8034e31b ("minmax: simplify min()/max()/clamp()
implementation")
19) 22f546873149 ("minmax: improve macro expansion and type
checking")
20) 21b136cc63d2 ("minmax: fix up min3() and max3() too")
21) 71ee9b16251e ("minmax.h: add whitespace around operators and after
commas")
22) 10666e992048 ("minmax.h: update some comments")
23) b280bb27a9f7 ("minmax.h: reduce the #define expansion of min(),
max() and clamp()")
24) a5743f32baec ("minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi
test in clamp()")
25) c3939872ee4a ("minmax.h: move all the clamp() definitions after the
min/max() ones")
26) 495bba17cdf9 ("minmax.h: simplify the variants of clamp()")
27) 2b97aaf74ed5 ("minmax.h: remove some #defines that are only
expanded once")
2 prerequisite commits that adjust users of MIN and MAX macros (to
prevent compilation issues):
13) 4477b39c32fd ("minmax: add a few more MIN_T/MAX_T users")
17) cb04e8b1d2f2 ("minmax: don't use max() in situations that want a C
constant expression")
1 additional commit introduced to resolve a build failures during the
backport:
16) lib: zstd: drop local MIN/MAX macros in favor of generic ones
The primary motivation is to bring in commit (8).
In mainline, this change allows min()/max()/clamp() to accept mixed
argument types when both share the same signedness.
Backported patches to v5.10.y that use such forms trigger compiler
warnings, which in turn cause build failures when -Werror is enabled.
Originaly I aligned 5.10.y to 5.15.y, but David Laight commented that I
need to pick up the later changes (from Linus) as well.
Andy Shevchenko (2):
minmax: deduplicate __unconst_integer_typeof()
minmax: fix header inclusions
Bart Van Assche (1):
overflow, tracing: Define the is_signed_type() macro once
David Laight (11):
minmax: allow min()/max()/clamp() if the arguments have the same
signedness.
minmax: fix indentation of __cmp_once() and __clamp_once()
minmax: allow comparisons of 'int' against 'unsigned char/short'
minmax: relax check to allow comparison between unsigned arguments and
signed constants
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
Eliav Farber (1):
lib: zstd: drop local MIN/MAX macros in favor of generic ones
Herve Codina (1):
minmax: Introduce {min,max}_array()
Jason A. Donenfeld (2):
minmax: sanity check constant bounds when clamping
minmax: clamp more efficiently by avoiding extra comparison
Linus Torvalds (8):
minmax: avoid overly complicated constant expressions in VM code
minmax: add a few more MIN_T/MAX_T users
minmax: simplify and clarify min_t()/max_t() implementation
minmax: make generic MIN() and MAX() macros available everywhere
minmax: don't use max() in situations that want a C constant
expression
minmax: simplify min()/max()/clamp() implementation
minmax: improve macro expansion and type checking
minmax: fix up min3() and max3() too
Matthew Wilcox (Oracle) (1):
minmax: add in_range() macro
arch/arm/mm/pageattr.c | 6 +-
arch/um/drivers/mconsole_user.c | 2 +
arch/x86/mm/pgtable.c | 2 +-
drivers/edac/sb_edac.c | 4 +-
drivers/edac/skx_common.h | 1 -
.../drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
.../drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
.../drm/arm/display/include/malidp_utils.h | 2 +-
.../display/komeda/komeda_pipeline_state.c | 24 +-
drivers/gpu/drm/drm_color_mgmt.c | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 -
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hwmon/adt7475.c | 24 +-
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/md/dm-integrity.c | 2 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 18 +-
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
fs/btrfs/misc.h | 2 -
fs/btrfs/tree-checker.c | 2 +-
fs/ext2/balloc.c | 2 -
fs/ext4/ext4.h | 2 -
fs/ufs/util.h | 6 -
include/linux/compiler.h | 15 +
include/linux/minmax.h | 267 ++++++++++++++----
include/linux/overflow.h | 1 -
include/linux/trace_events.h | 2 -
kernel/trace/preemptirq_delay_test.c | 2 -
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/logic_pio.c | 3 -
lib/vsprintf.c | 2 +-
lib/zstd/zstd_internal.h | 2 -
mm/zsmalloc.c | 1 -
net/ipv4/proc.c | 2 +-
net/ipv6/proc.c | 2 +-
net/netfilter/nf_nat_core.c | 6 +-
net/tipc/core.h | 2 +-
net/tipc/link.c | 10 +-
44 files changed, 306 insertions(+), 164 deletions(-)
--
2.47.3
This series backports 7 patches to update minmax.h in the 6.12.y branch,
aligning it with v6.17-rc7.
The ultimate goal is to synchronize all longterm branches so that they
include the full set of minmax.h changes.
The key motivation is to bring in commit d03eba99f5bf ("minmax: allow
min()/max()/clamp() if the arguments have the same signedness"), which
is missing in older kernels.
In mainline, this change enables min()/max()/clamp() to accept mixed
argument types, provided both have the same signedness. Without it,
backported patches that use these forms may trigger compiler warnings,
which escalate to build failures when -Werror is enabled.
David Laight (7):
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
include/linux/minmax.h | 205 +++++++++++++++++++----------------------
1 file changed, 95 insertions(+), 110 deletions(-)
--
2.47.3
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x cba4262a19afae21665ee242b3404bcede5a94d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025091431-craftily-size-46c6@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cba4262a19afae21665ee242b3404bcede5a94d7 Mon Sep 17 00:00:00 2001
From: K Prateek Nayak <kprateek.nayak(a)amd.com>
Date: Mon, 1 Sep 2025 17:04:15 +0000
Subject: [PATCH] x86/cpu/topology: Always try cpu_parse_topology_ext() on
AMD/Hygon
Support for parsing the topology on AMD/Hygon processors using CPUID leaf 0xb
was added in
3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available").
In an effort to keep all the topology parsing bits in one place, this commit
also introduced a pseudo dependency on the TOPOEXT feature to parse the CPUID
leaf 0xb.
The TOPOEXT feature (CPUID 0x80000001 ECX[22]) advertises the support for
Cache Properties leaf 0x8000001d and the CPUID leaf 0x8000001e EAX for
"Extended APIC ID" however support for 0xb was introduced alongside the x2APIC
support not only on AMD [1], but also historically on x86 [2].
Similar to 0xb, the support for extended CPU topology leaf 0x80000026 too does
not depend on the TOPOEXT feature.
The support for these leaves is expected to be confirmed by ensuring
leaf <= {extended_}cpuid_level
and then parsing the level 0 of the respective leaf to confirm EBX[15:0]
(LogProcAtThisLevel) is non-zero as stated in the definition of
"CPUID_Fn0000000B_EAX_x00 [Extended Topology Enumeration]
(Core::X86::Cpuid::ExtTopEnumEax0)" in Processor Programming Reference (PPR)
for AMD Family 19h Model 01h Rev B1 Vol1 [3] Sec. 2.1.15.1 "CPUID Instruction
Functions".
This has not been a problem on baremetal platforms since support for TOPOEXT
(Fam 0x15 and later) predates the support for CPUID leaf 0xb (Fam 0x17[Zen2]
and later), however, for AMD guests on QEMU, the "x2apic" feature can be
enabled independent of the "topoext" feature where QEMU expects topology and
the initial APICID to be parsed using the CPUID leaf 0xb (especially when
number of cores > 255) which is populated independent of the "topoext" feature
flag.
Unconditionally call cpu_parse_topology_ext() on AMD and Hygon processors to
first parse the topology using the XTOPOLOGY leaves (0x80000026 / 0xb) before
using the TOPOEXT leaf (0x8000001e).
While at it, break down the single large comment in parse_topology_amd() to
better highlight the purpose of each CPUID leaf.
Fixes: 3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available")
Suggested-by: Naveen N Rao (AMD) <naveen(a)kernel.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)vger.kernel.org # Only v6.9 and above; depends on x86 topology rewrite
Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthi… [1]
Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [3]
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 827dd0dbb6e9..c79ebbb639cb 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -175,27 +175,30 @@ static void topoext_fixup(struct topo_scan *tscan)
static void parse_topology_amd(struct topo_scan *tscan)
{
- bool has_topoext = false;
-
/*
- * If the extended topology leaf 0x8000_001e is available
- * try to get SMT, CORE, TILE, and DIE shifts from extended
+ * Try to get SMT, CORE, TILE, and DIE shifts from extended
* CPUID leaf 0x8000_0026 on supported processors first. If
* extended CPUID leaf 0x8000_0026 is not supported, try to
- * get SMT and CORE shift from leaf 0xb first, then try to
- * get the CORE shift from leaf 0x8000_0008.
+ * get SMT and CORE shift from leaf 0xb. If either leaf is
+ * available, cpu_parse_topology_ext() will return true.
*/
- if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
- has_topoext = cpu_parse_topology_ext(tscan);
+ bool has_xtopology = cpu_parse_topology_ext(tscan);
if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
- if (!has_topoext && !parse_8000_0008(tscan))
+ /*
+ * If XTOPOLOGY leaves (0x26/0xb) are not available, try to
+ * get the CORE shift from leaf 0x8000_0008 first.
+ */
+ if (!has_xtopology && !parse_8000_0008(tscan))
return;
- /* Prefer leaf 0x8000001e if available */
- if (parse_8000_001e(tscan, has_topoext))
+ /*
+ * Prefer leaf 0x8000001e if available to get the SMT shift and
+ * the initial APIC ID if XTOPOLOGY leaves are not available.
+ */
+ if (parse_8000_001e(tscan, has_xtopology))
return;
/* Try the NODEID MSR */
The mei_ace driver contains a device reference count leak in
mei_ace_setup_dev_link() where device_find_child_by_name() increases
the reference count of the found device but this reference is not
properly decreased in the success path. Add put_device() in
mei_ace_setup_dev_link() and delete put_device() in mei_ace_remove(),
which ensures that the reference count of the device is correctly
managed regardless of whether the probe is successful or fails.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 78876f71b3e9 ("media: pci: intel: ivsc: Add ACE submodule")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- modified the put_device() operations and the patch title as suggestions.
---
drivers/media/pci/intel/ivsc/mei_ace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/pci/intel/ivsc/mei_ace.c b/drivers/media/pci/intel/ivsc/mei_ace.c
index 98310b8511b1..bb57656fc85a 100644
--- a/drivers/media/pci/intel/ivsc/mei_ace.c
+++ b/drivers/media/pci/intel/ivsc/mei_ace.c
@@ -420,6 +420,7 @@ static int mei_ace_setup_dev_link(struct mei_ace *ace)
goto err_put;
}
+ put_device(csi_dev);
ace->csi_dev = csi_dev;
return 0;
@@ -522,7 +523,6 @@ static void mei_ace_remove(struct mei_cl_device *cldev)
cancel_work_sync(&ace->work);
device_link_del(ace->csi_link);
- put_device(ace->csi_dev);
pm_runtime_disable(&cldev->dev);
pm_runtime_set_suspended(&cldev->dev);
--
2.17.1
This series backports 15 patches to update minmax.h in the 6.6.y branch,
aligning it with v6.17-rc7.
The ultimate goal is to synchronize all longterm branches so that they
include the full set of minmax.h changes.
The key motivation is to bring in commit d03eba99f5bf ("minmax: allow
min()/max()/clamp() if the arguments have the same signedness"), which
is missing in older kernels.
In mainline, this change enables min()/max()/clamp() to accept mixed
argument types, provided both have the same signedness. Without it,
backported patches that use these forms may trigger compiler warnings,
which escalate to build failures when -Werror is enabled.
David Laight (7):
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
Linus Torvalds (8):
minmax: avoid overly complicated constant expressions in VM code
minmax: simplify and clarify min_t()/max_t() implementation
minmax: add a few more MIN_T/MAX_T users
minmax: make generic MIN() and MAX() macros available everywhere
minmax: simplify min()/max()/clamp() implementation
minmax: don't use max() in situations that want a C constant
expression
minmax: improve macro expansion and type checking
minmax: fix up min3() and max3() too
arch/um/drivers/mconsole_user.c | 2 +
arch/x86/mm/pgtable.c | 2 +-
drivers/edac/sb_edac.c | 4 +-
drivers/edac/skx_common.h | 1 -
.../drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
.../drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
drivers/gpu/drm/drm_color_mgmt.c | 2 +-
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hwmon/adt7475.c | 24 +-
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/irqchip/irq-sun6i-r.c | 2 +-
drivers/md/dm-integrity.c | 6 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
.../net/can/usb/etas_es58x/es58x_devlink.c | 2 +-
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
fs/btrfs/tree-checker.c | 2 +-
include/linux/compiler.h | 9 +
include/linux/minmax.h | 228 +++++++++++-------
include/linux/pageblock-flags.h | 2 +-
kernel/trace/preemptirq_delay_test.c | 2 -
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/vsprintf.c | 2 +-
mm/zsmalloc.c | 2 -
net/ipv4/proc.c | 2 +-
net/ipv6/proc.c | 2 +-
tools/testing/selftests/mm/mremap_test.c | 2 +
tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +
34 files changed, 202 insertions(+), 146 deletions(-)
--
2.47.3
When PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), kernel panics with null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(),
that page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking p->vec_buf for NULL before dereferencing.
Other sites that might run into same deref-issue are already (directly
or transitively) protected by checking p->vec_buf.
Note:
From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output
is requested and it's only the side effects caller is interested in,
hence it passes check in pagemap_scan_get_args().
This issue was found by syzkaller.
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Michał Mirosław" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
v1 -> v2: check p->vec_buf instead of cur_buf
v2 -> v3: fix commit title
v1: https://lore.kernel.org/all/20250919142106.43527-1-acsjakub@amazon.de/
v2: https://lore.kernel.org/all/20250922081713.77303-1-acsjakub@amazon.de/
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 29cca0e6d0ff..b26ae556b446 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!p->vec_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
When PAGEMAP_SCAN ioctl invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), kernel panics with null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(),
that page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking cur_buf for NULL before dereferencing.
Other sites that might run into same deref-issue are already (directly
or transitively) protected by checking p->vec_buf.
Note:
From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output
is requested and it's only the side effects caller is interested in,
hence it passes check in pagemap_scan_get_args().
This issue was found by syzkaller.
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Michał Mirosław" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
linux-kernel(a)vger.kernel.org
linux-fsdevel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 29cca0e6d0ff..8c10a8135e74 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!cur_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
Hi,
We’re offering verified business contact data for the upcoming Fruit Attraction 2025 (FA), tailored for effective outreach before and after the event.
Place: Madrid, Spain
Date:SEP 30 - OCT 02, 2025
Contact Overview:
1,01,351 Attendees
2,179 Exhibiting Companies
6,537 Verified Exhibitor Contacts
Total: 107,885 Business Contacts
Each entry includes: Name, Job Title, Company, Website, Address, Phone, Official Email, LinkedIn Profile, and more.
100% GDPR-compliant Data with 20% Off Now.
If you'd like more details, just reply: “Send me pricing”
Best regards,
Natalie Foster
Sr. Marketing Manager
To opt out reply “Not Interested.”
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- modified the handling in the success path and fixed the memory leak for regmap as suggestions.
---
sound/soc/codecs/wcd934x.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..b472320d1ca4 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5847,11 +5847,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
return dev_err_probe(dev, -EINVAL, "Unable to get SLIM Interface device\n");
slim_get_logical_addr(wcd->sidev);
- wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
+ wcd->if_regmap = devm_regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
@@ -5893,6 +5895,10 @@ static int wcd934x_codec_probe(struct platform_device *pdev)
if (ret)
return ret;
+ ret = devm_add_action_or_reset(dev, (void (*)(void *))put_device, &wcd->sidev->dev);
+ if (ret)
+ return ret;
+
/* set default rate 9P6MHz */
regmap_update_bits(wcd->regmap, WCD934X_CODEC_RPM_CLK_MCLK_CFG,
WCD934X_CODEC_RPM_CLK_MCLK_CFG_MCLK_MASK,
--
2.17.1
Forbid USB runtime PM (autosuspend) for AX88772* in bind.
usbnet enables runtime PM by default in probe, so disabling it via the
usb_driver flag is ineffective. For AX88772B, autosuspend shows no
measurable power saving in my tests (no link partner, admin up/down).
The ~0.453 W -> ~0.248 W reduction on 6.1 comes from phylib powering
the PHY off on admin-down, not from USB autosuspend.
With autosuspend active, resume paths may require calling phylink/phylib
(caller must hold RTNL) and doing MDIO I/O. Taking RTNL from a USB PM
resume can deadlock (RTNL may already be held), and MDIO can attempt a
runtime-wake while the USB PM lock is held. Given the lack of benefit
and poor test coverage (autosuspend is usually disabled by default in
distros), forbid runtime PM here to avoid these hazards.
This affects only AX88772* devices (per-interface in bind). System
sleep/resume is unchanged.
Fixes: 4a2c7217cd5a ("net: usb: asix: ax88772: manage PHY PM from MAC")
Reported-by: Hubert Wiśniewski <hubert.wisniewski.25632(a)gmail.com>
Closes: https://lore.kernel.org/all/20220622141638.GE930160@montezuma.acc.umu.se
Reported-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Closes: https://lore.kernel.org/all/b5ea8296-f981-445d-a09a-2f389d7f6fdd@samsung.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
---
Link to the measurement results:
https://lore.kernel.org/all/aMkPMa650kfKfmF4@pengutronix.de/
---
drivers/net/usb/asix_devices.c | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 792ddda1ad49..0d341d7e6154 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -625,6 +625,22 @@ static void ax88772_suspend(struct usbnet *dev)
asix_read_medium_status(dev, 1));
}
+/*
+ * Notes on PM callbacks and locking context:
+ *
+ * - asix_suspend()/asix_resume() are invoked for both runtime PM and
+ * system-wide suspend/resume. For struct usb_driver the ->resume()
+ * callback does not receive pm_message_t, so the resume type cannot
+ * be distinguished here.
+ *
+ * - The MAC driver must hold RTNL when calling phylink interfaces such as
+ * phylink_suspend()/resume(). Those calls will also perform MDIO I/O.
+ *
+ * - Taking RTNL and doing MDIO from a runtime-PM resume callback (while
+ * the USB PM lock is held) is fragile. Since autosuspend brings no
+ * measurable power saving for this device with current driver version, it is
+ * disabled below.
+ */
static int asix_suspend(struct usb_interface *intf, pm_message_t message)
{
struct usbnet *dev = usb_get_intfdata(intf);
@@ -919,6 +935,16 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
if (ret)
goto initphy_err;
+ /* Disable USB runtime PM (autosuspend) for this interface.
+ * Rationale:
+ * - No measurable power saving from autosuspend for this device.
+ * - phylink/phylib calls require caller-held RTNL and do MDIO I/O,
+ * which is unsafe from USB PM resume paths (possible RTNL already
+ * held, USB PM lock held).
+ * System suspend/resume is unaffected.
+ */
+ pm_runtime_forbid(&intf->dev);
+
return 0;
initphy_err:
@@ -948,6 +974,10 @@ static void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf)
phylink_destroy(priv->phylink);
ax88772_mdio_unregister(priv);
asix_rx_fixup_common_free(dev->driver_priv);
+ /* Re-allow runtime PM on disconnect for tidiness. The interface
+ * goes away anyway, but this balances forbid for debug sanity.
+ */
+ pm_runtime_allow(&intf->dev);
}
static void ax88178_unbind(struct usbnet *dev, struct usb_interface *intf)
@@ -1600,6 +1630,10 @@ static struct usb_driver asix_driver = {
.resume = asix_resume,
.reset_resume = asix_resume,
.disconnect = usbnet_disconnect,
+ /* usbnet will force supports_autosuspend=1; we explicitly forbid RPM
+ * per-interface in bind to keep autosuspend disabled for this driver
+ * by using pm_runtime_forbid().
+ */
.supports_autosuspend = 1,
.disable_hub_initiated_lpm = 1,
};
--
2.47.3
In register_shm_helper(), fix incorrect error handling for a call to
iov_iter_extract_pages(). A case is missing for when
iov_iter_extract_pages() only got some pages and return a number larger
than 0, but not the requested amount.
This fixes a possible NULL pointer dereference following a bad input from
ioctl(TEE_IOC_SHM_REGISTER) where parts of the buffer isn't mapped.
Cc: stable(a)vger.kernel.org
Reported-by: Masami Ichikawa <masami256(a)gmail.com>
Closes: https://lore.kernel.org/op-tee/CACOXgS-Bo2W72Nj1_44c7bntyNYOavnTjJAvUbEiQfq…
Fixes: 7bdee4157591 ("tee: Use iov_iter to better support shared buffer registration")
Signed-off-by: Jens Wiklander <jens.wiklander(a)linaro.org>
---
drivers/tee/tee_shm.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index daf6e5cfd59a..6ed7d030f4ed 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -316,7 +316,16 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,
len = iov_iter_extract_pages(iter, &shm->pages, LONG_MAX, num_pages, 0,
&off);
- if (unlikely(len <= 0)) {
+ if (DIV_ROUND_UP(len + off, PAGE_SIZE) != num_pages) {
+ if (len > 0) {
+ /*
+ * If we only got a few pages, update to release
+ * the correct amount below.
+ */
+ shm->num_pages = len / PAGE_SIZE;
+ ret = ERR_PTR(-ENOMEM);
+ goto err_put_shm_pages;
+ }
ret = len ? ERR_PTR(len) : ERR_PTR(-ENOMEM);
goto err_free_shm_pages;
}
--
2.43.0
ep_events_available() checks for available events by looking at ep->rdllist
and ep->ovflist. However, this is done without a lock, therefore the
returned value is not reliable. Because it is possible that both checks on
ep->rdllist and ep->ovflist are false while ep_start_scan() or
ep_done_scan() is being executed on other CPUs, despite events are
available.
This bug can be observed by:
1. Create an eventpoll with at least one ready level-triggered event
2. Create multiple threads who do epoll_wait() with zero timeout. The
threads do not consume the events, therefore all epoll_wait() should
return at least one event.
If one thread is executing ep_events_available() while another thread is
executing ep_start_scan() or ep_done_scan(), epoll_wait() may wrongly
return no event for the former thread.
This reproducer is implemented as TEST(epoll65) in
tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
Fix it by skipping ep_events_available(), just call ep_try_send_events()
directly.
epoll_sendevents() (io_uring) suffers the same problem, fix that as well.
There is still ep_busy_loop() who uses ep_events_available() without lock,
but it is probably okay (?) for busy-polling.
Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
Fixes: e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout")
Fixes: ae3a4f1fdc2c ("eventpoll: add epoll_sendevents() helper")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
fs/eventpoll.c | 16 ++--------------
1 file changed, 2 insertions(+), 14 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 0fbf5dfedb24..541481eafc20 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2022,7 +2022,7 @@ static int ep_schedule_timeout(ktime_t *to)
static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
int maxevents, struct timespec64 *timeout)
{
- int res, eavail, timed_out = 0;
+ int res, eavail = 1, timed_out = 0;
u64 slack = 0;
wait_queue_entry_t wait;
ktime_t expires, *to = NULL;
@@ -2041,16 +2041,6 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
timed_out = 1;
}
- /*
- * This call is racy: We may or may not see events that are being added
- * to the ready list under the lock (e.g., in IRQ callbacks). For cases
- * with a non-zero timeout, this thread will check the ready list under
- * lock and will add to the wait queue. For cases with a zero
- * timeout, the user by definition should not care and will have to
- * recheck again.
- */
- eavail = ep_events_available(ep);
-
while (1) {
if (eavail) {
res = ep_try_send_events(ep, events, maxevents);
@@ -2496,9 +2486,7 @@ int epoll_sendevents(struct file *file, struct epoll_event __user *events,
* Racy call, but that's ok - it should get retried based on
* poll readiness anyway.
*/
- if (ep_events_available(ep))
- return ep_try_send_events(ep, events, maxevents);
- return 0;
+ return ep_try_send_events(ep, events, maxevents);
}
/*
--
2.39.5
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 3539b1467e94336d5854ebf976d9627bfb65d6c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092127-synthetic-squash-4d57@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3539b1467e94336d5854ebf976d9627bfb65d6c3 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Thu, 18 Sep 2025 10:21:14 -0600
Subject: [PATCH] io_uring: include dying ring in task_work "should cancel"
state
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.
Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.
Cc: stable(a)vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler <thaler(a)thaler.hu>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 93633613a165..bcec12256f34 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1406,8 +1406,10 @@ static void io_req_task_cancel(struct io_kiocb *req, io_tw_token_t tw)
void io_req_task_submit(struct io_kiocb *req, io_tw_token_t tw)
{
- io_tw_lock(req->ctx, tw);
- if (unlikely(io_should_terminate_tw()))
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_tw_lock(ctx, tw);
+ if (unlikely(io_should_terminate_tw(ctx)))
io_req_defer_failed(req, -EFAULT);
else if (req->flags & REQ_F_FORCE_ASYNC)
io_queue_iowq(req);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index abc6de227f74..1880902be6fd 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -476,9 +476,9 @@ static inline bool io_allowed_run_tw(struct io_ring_ctx *ctx)
* 2) PF_KTHREAD is set, in which case the invoker of the task_work is
* our fallback task_work.
*/
-static inline bool io_should_terminate_tw(void)
+static inline bool io_should_terminate_tw(struct io_ring_ctx *ctx)
{
- return current->flags & (PF_KTHREAD | PF_EXITING);
+ return (current->flags & (PF_KTHREAD | PF_EXITING)) || percpu_ref_is_dying(&ctx->refs);
}
static inline void io_req_queue_tw_complete(struct io_kiocb *req, s32 res)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index c786e587563b..6090a26975d4 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -224,7 +224,7 @@ static int io_poll_check_events(struct io_kiocb *req, io_tw_token_t tw)
{
int v;
- if (unlikely(io_should_terminate_tw()))
+ if (unlikely(io_should_terminate_tw(req->ctx)))
return -ECANCELED;
do {
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 7f13bfa9f2b6..17e3aab0af36 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -324,7 +324,7 @@ static void io_req_task_link_timeout(struct io_kiocb *req, io_tw_token_t tw)
int ret;
if (prev) {
- if (!io_should_terminate_tw()) {
+ if (!io_should_terminate_tw(req->ctx)) {
struct io_cancel_data cd = {
.ctx = req->ctx,
.data = prev->cqe.user_data,
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 053bac89b6c0..213716e10d70 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -118,7 +118,7 @@ static void io_uring_cmd_work(struct io_kiocb *req, io_tw_token_t tw)
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
unsigned int flags = IO_URING_F_COMPLETE_DEFER;
- if (io_should_terminate_tw())
+ if (io_should_terminate_tw(req->ctx))
flags |= IO_URING_F_TASK_DEAD;
/* task_work executor checks the deffered list completion */
The mei_ace driver contains a device reference count leak in
mei_ace_setup_dev_link() where device_find_child_by_name() increases
the reference count of the found device but this reference is not
properly decreased in the success path. Add put_device() in
mei_ace_setup_dev_link() and delete put_device() in mei_ace_remove(),
which ensures that the reference count of the device is correctly
managed regardless of whether the probe is successful or fails.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 78876f71b3e9 ("media: pci: intel: ivsc: Add ACE submodule")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/media/pci/intel/ivsc/mei_ace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/pci/intel/ivsc/mei_ace.c b/drivers/media/pci/intel/ivsc/mei_ace.c
index 98310b8511b1..261b30788118 100644
--- a/drivers/media/pci/intel/ivsc/mei_ace.c
+++ b/drivers/media/pci/intel/ivsc/mei_ace.c
@@ -421,6 +421,7 @@ static int mei_ace_setup_dev_link(struct mei_ace *ace)
}
ace->csi_dev = csi_dev;
+ put_device(csi_dev);
return 0;
@@ -522,7 +523,6 @@ static void mei_ace_remove(struct mei_cl_device *cldev)
cancel_work_sync(&ace->work);
device_link_del(ace->csi_link);
- put_device(ace->csi_dev);
pm_runtime_disable(&cldev->dev);
pm_runtime_set_suspended(&cldev->dev);
--
2.17.1
Make sure we return the right pud value and not a value that could
have been overwritten in between by a different core.
Fixes: c3cc2a4a3a23 ("riscv: Add support for PUD THP")
Cc: stable(a)vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
---
Note that this will conflict with
https://lore.kernel.org/linux-riscv/20250625063753.77511-1-ajd@linux.ibm.co…
if applied after 6.17.
---
arch/riscv/include/asm/pgtable.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 91697fbf1f9013005800f713797e4b6b1fc8d312..e69346307e78608dd98d8b7a77b7063c333448ee 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -942,6 +942,17 @@ static inline int pudp_test_and_clear_young(struct vm_area_struct *vma,
return ptep_test_and_clear_young(vma, address, (pte_t *)pudp);
}
+#define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR
+static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
+ unsigned long address, pud_t *pudp)
+{
+ pud_t pud = __pud(atomic_long_xchg((atomic_long_t *)pudp, 0));
+
+ page_table_check_pud_clear(mm, pud);
+
+ return pud;
+}
+
static inline int pud_young(pud_t pud)
{
return pte_young(pud_pte(pud));
---
base-commit: 62950c35a515743739e3d863eac25c20a5bd1613
change-id: 20250814-dev-alex-thp_pud_xchg-8153c313d946
Best regards,
--
Alexandre Ghiti <alexghiti(a)rivosinc.com>
Once of_device_register() failed, we should call put_device() to
decrement reference count for cleanup. Or it could cause memory leak.
So fix this by calling put_device(), then the name can be freed in
kobject_cleanup().
Calling path: of_device_register() -> of_device_add() -> device_add().
As comment of device_add() says, 'if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: cf44bbc26cf1 ("[SPARC]: Beginnings of generic of_device framework.")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v3:
- also fixed the same problem in arch/sparc/kernel/of_device_32.c as suggestions.
Changes in v2:
- retained kfree() manually due to the lack of a release callback function.
---
arch/sparc/kernel/of_device_32.c | 1 +
arch/sparc/kernel/of_device_64.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/sparc/kernel/of_device_32.c b/arch/sparc/kernel/of_device_32.c
index 06012e68bdca..284a4cafa432 100644
--- a/arch/sparc/kernel/of_device_32.c
+++ b/arch/sparc/kernel/of_device_32.c
@@ -387,6 +387,7 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
if (of_device_register(op)) {
printk("%pOF: Could not register of device.\n", dp);
+ put_device(&op->dev);
kfree(op);
op = NULL;
}
diff --git a/arch/sparc/kernel/of_device_64.c b/arch/sparc/kernel/of_device_64.c
index f98c2901f335..f53092b07b9e 100644
--- a/arch/sparc/kernel/of_device_64.c
+++ b/arch/sparc/kernel/of_device_64.c
@@ -677,6 +677,7 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
if (of_device_register(op)) {
printk("%pOF: Could not register of device.\n", dp);
+ put_device(&op->dev);
kfree(op);
op = NULL;
}
--
2.17.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f755be0b1ff429a2ecf709beeb1bcd7abc111c2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092143-trench-expiring-9a2f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f755be0b1ff429a2ecf709beeb1bcd7abc111c2b Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:50 +0200
Subject: [PATCH] mptcp: propagate shutdown to subflows when possible
When the MPTCP DATA FIN have been ACKed, there is no more MPTCP related
metadata to exchange, and all subflows can be safely shutdown.
Before this patch, the subflows were actually terminated at 'close()'
time. That's certainly fine most of the time, but not when the userspace
'shutdown()' a connection, without close()ing it. When doing so, the
subflows were staying in LAST_ACK state on one side -- and consequently
in FIN_WAIT2 on the other side -- until the 'close()' of the MPTCP
socket.
Now, when the DATA FIN have been ACKed, all subflows are shutdown. A
consequence of this is that the TCP 'FIN' flag can be set earlier now,
but the end result is the same. This affects the packetdrill tests
looking at the end of the MPTCP connections, but for a good reason.
Note that tcp_shutdown() will check the subflow state, so no need to do
that again before calling it.
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Cc: stable(a)vger.kernel.org
Fixes: 16a9a9da1723 ("mptcp: Add helper to process acks of DATA_FIN")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-1-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e6fd97b21e9e..5e497a83e967 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -371,6 +371,20 @@ static void mptcp_close_wake_up(struct sock *sk)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+static void mptcp_shutdown_subflows(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+ tcp_shutdown(ssk, SEND_SHUTDOWN);
+ unlock_sock_fast(ssk, slow);
+ }
+}
+
/* called under the msk socket lock */
static bool mptcp_pending_data_fin_ack(struct sock *sk)
{
@@ -395,6 +409,7 @@ static void mptcp_check_data_fin_ack(struct sock *sk)
break;
case TCP_CLOSING:
case TCP_LAST_ACK:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
}
@@ -563,6 +578,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
mptcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
default:
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f755be0b1ff429a2ecf709beeb1bcd7abc111c2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092143-detonator-snowcap-57d7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f755be0b1ff429a2ecf709beeb1bcd7abc111c2b Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:50 +0200
Subject: [PATCH] mptcp: propagate shutdown to subflows when possible
When the MPTCP DATA FIN have been ACKed, there is no more MPTCP related
metadata to exchange, and all subflows can be safely shutdown.
Before this patch, the subflows were actually terminated at 'close()'
time. That's certainly fine most of the time, but not when the userspace
'shutdown()' a connection, without close()ing it. When doing so, the
subflows were staying in LAST_ACK state on one side -- and consequently
in FIN_WAIT2 on the other side -- until the 'close()' of the MPTCP
socket.
Now, when the DATA FIN have been ACKed, all subflows are shutdown. A
consequence of this is that the TCP 'FIN' flag can be set earlier now,
but the end result is the same. This affects the packetdrill tests
looking at the end of the MPTCP connections, but for a good reason.
Note that tcp_shutdown() will check the subflow state, so no need to do
that again before calling it.
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Cc: stable(a)vger.kernel.org
Fixes: 16a9a9da1723 ("mptcp: Add helper to process acks of DATA_FIN")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-1-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e6fd97b21e9e..5e497a83e967 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -371,6 +371,20 @@ static void mptcp_close_wake_up(struct sock *sk)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+static void mptcp_shutdown_subflows(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+ tcp_shutdown(ssk, SEND_SHUTDOWN);
+ unlock_sock_fast(ssk, slow);
+ }
+}
+
/* called under the msk socket lock */
static bool mptcp_pending_data_fin_ack(struct sock *sk)
{
@@ -395,6 +409,7 @@ static void mptcp_check_data_fin_ack(struct sock *sk)
break;
case TCP_CLOSING:
case TCP_LAST_ACK:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
}
@@ -563,6 +578,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
mptcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
default:
The patch below was submitted to be applied to the 6.16-stable tree.
I fail to see how this patch meets the stable kernel rules as found at
Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to
<stable(a)vger.kernel.org> and let me know why this patch should be
applied. Otherwise, it is now dropped from my patch queues, never to be
seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d6d69f0edde63b553345d4efaceb7daed89fe04c Mon Sep 17 00:00:00 2001
From: Tao Cui <cuitao(a)kylinos.cn>
Date: Thu, 18 Sep 2025 19:44:04 +0800
Subject: [PATCH] LoongArch: Replace sprintf() with sysfs_emit()
As Documentation/filesystems/sysfs.rst suggested, show() should only use
sysfs_emit() or sysfs_emit_at() when formatting the value to be returned
to user space.
No functional change intended.
Cc: stable(a)vger.kernel.org
Signed-off-by: Tao Cui <cuitao(a)kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c
index be309a71f204..23bd5ae2212c 100644
--- a/arch/loongarch/kernel/env.c
+++ b/arch/loongarch/kernel/env.c
@@ -86,7 +86,7 @@ late_initcall(fdt_cpu_clk_init);
static ssize_t boardinfo_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
- return sprintf(buf,
+ return sysfs_emit(buf,
"BIOS Information\n"
"Vendor\t\t\t: %s\n"
"Version\t\t\t: %s\n"
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f755be0b1ff429a2ecf709beeb1bcd7abc111c2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092142-stingily-broadside-f761@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f755be0b1ff429a2ecf709beeb1bcd7abc111c2b Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:50 +0200
Subject: [PATCH] mptcp: propagate shutdown to subflows when possible
When the MPTCP DATA FIN have been ACKed, there is no more MPTCP related
metadata to exchange, and all subflows can be safely shutdown.
Before this patch, the subflows were actually terminated at 'close()'
time. That's certainly fine most of the time, but not when the userspace
'shutdown()' a connection, without close()ing it. When doing so, the
subflows were staying in LAST_ACK state on one side -- and consequently
in FIN_WAIT2 on the other side -- until the 'close()' of the MPTCP
socket.
Now, when the DATA FIN have been ACKed, all subflows are shutdown. A
consequence of this is that the TCP 'FIN' flag can be set earlier now,
but the end result is the same. This affects the packetdrill tests
looking at the end of the MPTCP connections, but for a good reason.
Note that tcp_shutdown() will check the subflow state, so no need to do
that again before calling it.
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Cc: stable(a)vger.kernel.org
Fixes: 16a9a9da1723 ("mptcp: Add helper to process acks of DATA_FIN")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-1-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e6fd97b21e9e..5e497a83e967 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -371,6 +371,20 @@ static void mptcp_close_wake_up(struct sock *sk)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+static void mptcp_shutdown_subflows(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+ tcp_shutdown(ssk, SEND_SHUTDOWN);
+ unlock_sock_fast(ssk, slow);
+ }
+}
+
/* called under the msk socket lock */
static bool mptcp_pending_data_fin_ack(struct sock *sk)
{
@@ -395,6 +409,7 @@ static void mptcp_check_data_fin_ack(struct sock *sk)
break;
case TCP_CLOSING:
case TCP_LAST_ACK:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
}
@@ -563,6 +578,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
mptcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
default:
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 14e22b43df25dbd4301351b882486ea38892ae4f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092157-mullets-tweed-dee4@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 14e22b43df25dbd4301351b882486ea38892ae4f Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:51 +0200
Subject: [PATCH] selftests: mptcp: connect: catch IO errors on listen side
IO errors were correctly printed to stderr, and propagated up to the
main loop for the server side, but the returned value was ignored. As a
consequence, the program for the listener side was no longer exiting
with an error code in case of IO issues.
Because of that, some issues might not have been seen. But very likely,
most issues either had an effect on the client side, or the file
transfer was not the expected one, e.g. the connection got reset before
the end. Still, it is better to fix this.
The main consequence of this issue is the error that was reported by the
selftests: the received and sent files were different, and the MIB
counters were not printed. Also, when such errors happened during the
'disconnect' tests, the program tried to continue until the timeout.
Now when an IO error is detected, the program exits directly with an
error.
Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
Cc: stable(a)vger.kernel.org
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-2-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 4f07ac9fa207..1408698df099 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -1093,6 +1093,7 @@ int main_loop_s(int listensock)
struct pollfd polls;
socklen_t salen;
int remotesock;
+ int err = 0;
int fd = 0;
again:
@@ -1125,7 +1126,7 @@ int main_loop_s(int listensock)
SOCK_TEST_TCPULP(remotesock, 0);
memset(&winfo, 0, sizeof(winfo));
- copyfd_io(fd, remotesock, 1, true, &winfo);
+ err = copyfd_io(fd, remotesock, 1, true, &winfo);
} else {
perror("accept");
return 1;
@@ -1134,10 +1135,10 @@ int main_loop_s(int listensock)
if (cfg_input)
close(fd);
- if (--cfg_repeat > 0)
+ if (!err && --cfg_repeat > 0)
goto again;
- return 0;
+ return err;
}
static void init_rng(void)
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x 225d1ee0f5ba3218d1814d36564fdb5f37b50474
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092125-thigh-immerse-6abd@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 225d1ee0f5ba3218d1814d36564fdb5f37b50474 Mon Sep 17 00:00:00 2001
From: Antheas Kapenekakis <lkml(a)antheas.dev>
Date: Tue, 16 Sep 2025 09:28:18 +0200
Subject: [PATCH] platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan
quirk
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It turns out that the dual screen models use 0x5E for attaching and
detaching the keyboard instead of 0x5F. So, re-add the codes by
reverting commit cf3940ac737d ("platform/x86: asus-wmi: Remove extra
keys from ignore_key_wlan quirk"). For our future reference, add a
comment next to 0x5E indicating that it is used for that purpose.
Fixes: cf3940ac737d ("platform/x86: asus-wmi: Remove extra keys from ignore_key_wlan quirk")
Reported-by: Rahul Chandra <rahul(a)chandra.net>
Closes: https://lore.kernel.org/all/10020-68c90c80-d-4ac6c580@106290038/
Cc: stable(a)kernel.org
Signed-off-by: Antheas Kapenekakis <lkml(a)antheas.dev>
Link: https://patch.msgid.link/20250916072818.196462-1-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c
index 3a488cf9ca06..6a62bc5b02fd 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -673,6 +673,8 @@ static void asus_nb_wmi_key_filter(struct asus_wmi_driver *asus_wmi, int *code,
if (atkbd_reports_vol_keys)
*code = ASUS_WMI_KEY_IGNORE;
break;
+ case 0x5D: /* Wireless console Toggle */
+ case 0x5E: /* Wireless console Enable / Keyboard Attach, Detach */
case 0x5F: /* Wireless console Disable / Special Key */
if (quirks->key_wlan_event)
*code = quirks->key_wlan_event;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x f755be0b1ff429a2ecf709beeb1bcd7abc111c2b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092141-slashing-postal-be15@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f755be0b1ff429a2ecf709beeb1bcd7abc111c2b Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Fri, 12 Sep 2025 14:25:50 +0200
Subject: [PATCH] mptcp: propagate shutdown to subflows when possible
When the MPTCP DATA FIN have been ACKed, there is no more MPTCP related
metadata to exchange, and all subflows can be safely shutdown.
Before this patch, the subflows were actually terminated at 'close()'
time. That's certainly fine most of the time, but not when the userspace
'shutdown()' a connection, without close()ing it. When doing so, the
subflows were staying in LAST_ACK state on one side -- and consequently
in FIN_WAIT2 on the other side -- until the 'close()' of the MPTCP
socket.
Now, when the DATA FIN have been ACKed, all subflows are shutdown. A
consequence of this is that the TCP 'FIN' flag can be set earlier now,
but the end result is the same. This affects the packetdrill tests
looking at the end of the MPTCP connections, but for a good reason.
Note that tcp_shutdown() will check the subflow state, so no need to do
that again before calling it.
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Cc: stable(a)vger.kernel.org
Fixes: 16a9a9da1723 ("mptcp: Add helper to process acks of DATA_FIN")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Reviewed-by: Geliang Tang <geliang(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-1-d40e77cbbf…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e6fd97b21e9e..5e497a83e967 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -371,6 +371,20 @@ static void mptcp_close_wake_up(struct sock *sk)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+static void mptcp_shutdown_subflows(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+ tcp_shutdown(ssk, SEND_SHUTDOWN);
+ unlock_sock_fast(ssk, slow);
+ }
+}
+
/* called under the msk socket lock */
static bool mptcp_pending_data_fin_ack(struct sock *sk)
{
@@ -395,6 +409,7 @@ static void mptcp_check_data_fin_ack(struct sock *sk)
break;
case TCP_CLOSING:
case TCP_LAST_ACK:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
}
@@ -563,6 +578,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
mptcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
+ mptcp_shutdown_subflows(msk);
mptcp_set_state(sk, TCP_CLOSE);
break;
default:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d02e48830e3fce9701265f6c5a58d9bdaf906a76
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092122-popper-small-d970@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d02e48830e3fce9701265f6c5a58d9bdaf906a76 Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <maciej.szmigiero(a)oracle.com>
Date: Mon, 25 Aug 2025 18:44:28 +0200
Subject: [PATCH] KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR even if AVIC
is active
Commit 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
inhibited pre-VMRUN sync of TPR from LAPIC into VMCB::V_TPR in
sync_lapic_to_cr8() when AVIC is active.
AVIC does automatically sync between these two fields, however it does
so only on explicit guest writes to one of these fields, not on a bare
VMRUN.
This meant that when AVIC is enabled host changes to TPR in the LAPIC
state might not get automatically copied into the V_TPR field of VMCB.
This is especially true when it is the userspace setting LAPIC state via
KVM_SET_LAPIC ioctl() since userspace does not have access to the guest
VMCB.
Practice shows that it is the V_TPR that is actually used by the AVIC to
decide whether to issue pending interrupts to the CPU (not TPR in TASKPRI),
so any leftover value in V_TPR will cause serious interrupt delivery issues
in the guest when AVIC is enabled.
Fix this issue by doing pre-VMRUN TPR sync from LAPIC into VMCB::V_TPR
even when AVIC is enabled.
Fixes: 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
Cc: stable(a)vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero(a)oracle.com>
Reviewed-by: Naveen N Rao (AMD) <naveen(a)kernel.org>
Link: https://lore.kernel.org/r/c231be64280b1461e854e1ce3595d70cde3a2e9d.17561396…
[sean: tag for stable@]
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9931c6c4bc6..1bfebe40854f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4046,8 +4046,7 @@ static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
u64 cr8;
- if (nested_svm_virtualize_tpr(vcpu) ||
- kvm_vcpu_apicv_active(vcpu))
+ if (nested_svm_virtualize_tpr(vcpu))
return;
cr8 = kvm_get_cr8(vcpu);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092157-imagines-darkroom-e5c5@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, fix the potential crash
by initializing type_name to NULL.
Note likely sofar this has not been caught because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092156-postal-sappiness-e1ac@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, fix the potential crash
by initializing type_name to NULL.
Note likely sofar this has not been caught because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092155-familiar-divisible-9535@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, fix the potential crash
by initializing type_name to NULL.
Note likely sofar this has not been caught because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b6f56a44e4c1014b08859dcf04ed246500e310e5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092155-evacuate-condition-525e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b6f56a44e4c1014b08859dcf04ed246500e310e5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hansg(a)kernel.org>
Date: Sat, 13 Sep 2025 13:35:15 +0200
Subject: [PATCH] net: rfkill: gpio: Fix crash due to dereferencering
uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, fix the potential crash
by initializing type_name to NULL.
Note likely sofar this has not been caught because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
Fixes: 7d5e9737efda ("net: rfkill: gpio: get the name and type from device property")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://patch.msgid.link/20250913113515.21698-1-hansg@kernel.org
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 41e657e97761..cf2dcec6ce5a 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -94,10 +94,10 @@ static const struct dmi_system_id rfkill_gpio_deny_table[] = {
static int rfkill_gpio_probe(struct platform_device *pdev)
{
struct rfkill_gpio_data *rfkill;
- struct gpio_desc *gpio;
+ const char *type_name = NULL;
const char *name_property;
const char *type_property;
- const char *type_name;
+ struct gpio_desc *gpio;
int ret;
if (dmi_check_system(rfkill_gpio_deny_table))
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 7f830e126dc357fc086905ce9730140fd4528d66
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092125-resurface-hypertext-5ca5@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7f830e126dc357fc086905ce9730140fd4528d66 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 15 Sep 2025 11:04:12 -0500
Subject: [PATCH] x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
The sev_evict_cache() is guest-related code and should be guarded by
CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
function of sev_evict_cache() instead of the version that performs the actual
eviction. Move the function declarations under the appropriate #ifdef.
Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)kernel.org # 6.16.x
Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 02236962fdb1..465b19fd1a2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -562,6 +562,24 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
extern struct ghcb *boot_ghcb;
+static inline void sev_evict_cache(void *va, int npages)
+{
+ volatile u8 val __always_unused;
+ u8 *bytes = va;
+ int page_idx;
+
+ /*
+ * For SEV guests, a read from the first/last cache-lines of a 4K page
+ * using the guest key is sufficient to cause a flush of all cache-lines
+ * associated with that 4K page without incurring all the overhead of a
+ * full CLFLUSH sequence.
+ */
+ for (page_idx = 0; page_idx < npages; page_idx++) {
+ val = bytes[page_idx * PAGE_SIZE];
+ val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
+ }
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -605,6 +623,7 @@ static inline int snp_send_guest_request(struct snp_msg_desc *mdesc,
static inline int snp_svsm_vtpm_send_command(u8 *buffer) { return -ENODEV; }
static inline void __init snp_secure_tsc_prepare(void) { }
static inline void __init snp_secure_tsc_init(void) { }
+static inline void sev_evict_cache(void *va, int npages) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */
@@ -619,24 +638,6 @@ int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
void kdump_sev_callback(void);
void snp_fixup_e820_tables(void);
-
-static inline void sev_evict_cache(void *va, int npages)
-{
- volatile u8 val __always_unused;
- u8 *bytes = va;
- int page_idx;
-
- /*
- * For SEV guests, a read from the first/last cache-lines of a 4K page
- * using the guest key is sufficient to cause a flush of all cache-lines
- * associated with that 4K page without incurring all the overhead of a
- * full CLFLUSH sequence.
- */
- for (page_idx = 0; page_idx < npages; page_idx++) {
- val = bytes[page_idx * PAGE_SIZE];
- val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
- }
-}
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_rmptable_init(void) { return -ENOSYS; }
@@ -652,7 +653,6 @@ static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
-static inline void sev_evict_cache(void *va, int npages) {}
#endif
#endif
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 68f27f7c7708183e7873c585ded2f1b057ac5b97
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092104-booting-overstate-c9cf@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 68f27f7c7708183e7873c585ded2f1b057ac5b97 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Thu, 4 Sep 2025 12:18:50 +0200
Subject: [PATCH] ASoC: qcom: q6apm-lpass-dais: Fix NULL pointer dereference if
source graph failed
If earlier opening of source graph fails (e.g. ADSP rejects due to
incorrect audioreach topology), the graph is closed and
"dai_data->graph[dai->id]" is assigned NULL. Preparing the DAI for sink
graph continues though and next call to q6apm_lpass_dai_prepare()
receives dai_data->graph[dai->id]=NULL leading to NULL pointer
exception:
qcom-apm gprsvc:service:2:1: Error (1) Processing 0x01001002 cmd
qcom-apm gprsvc:service:2:1: DSP returned error[1001002] 1
q6apm-lpass-dais 30000000.remoteproc:glink-edge:gpr:service@1:bedais: fail to start APM port 78
q6apm-lpass-dais 30000000.remoteproc:glink-edge:gpr:service@1:bedais: ASoC: error at snd_soc_pcm_dai_prepare on TX_CODEC_DMA_TX_3: -22
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8
...
Call trace:
q6apm_graph_media_format_pcm+0x48/0x120 (P)
q6apm_lpass_dai_prepare+0x110/0x1b4
snd_soc_pcm_dai_prepare+0x74/0x108
__soc_pcm_prepare+0x44/0x160
dpcm_be_dai_prepare+0x124/0x1c0
Fixes: 30ad723b93ad ("ASoC: qdsp6: audioreach: add q6apm lpass dai support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla(a)oss.qualcomm.com>
Message-ID: <20250904101849.121503-2-krzysztof.kozlowski(a)linaro.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
diff --git a/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c b/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c
index a0d90462fd6a..20974f10406b 100644
--- a/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c
+++ b/sound/soc/qcom/qdsp6/q6apm-lpass-dais.c
@@ -213,8 +213,10 @@ static int q6apm_lpass_dai_prepare(struct snd_pcm_substream *substream, struct s
return 0;
err:
- q6apm_graph_close(dai_data->graph[dai->id]);
- dai_data->graph[dai->id] = NULL;
+ if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+ q6apm_graph_close(dai_data->graph[dai->id]);
+ dai_data->graph[dai->id] = NULL;
+ }
return rc;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 96fa515e70f3e4b98685ef8cac9d737fc62f10e1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092135-stinky-correct-5051@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 96fa515e70f3e4b98685ef8cac9d737fc62f10e1 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 16 Sep 2025 07:54:06 +0930
Subject: [PATCH] btrfs: tree-checker: fix the incorrect inode ref size check
[BUG]
Inside check_inode_ref(), we need to make sure every structure,
including the btrfs_inode_extref header, is covered by the item. But
our code is incorrectly using "sizeof(iref)", where @iref is just a
pointer.
This means "sizeof(iref)" will always be "sizeof(void *)", which is much
smaller than "sizeof(struct btrfs_inode_extref)".
This will allow some bad inode extrefs to sneak in, defeating tree-checker.
[FIX]
Fix the typo by calling "sizeof(*iref)", which is the same as
"sizeof(struct btrfs_inode_extref)", and will be the correct behavior we
want.
Fixes: 71bf92a9b877 ("btrfs: tree-checker: Add check for INODE_REF")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 0f556f4de3f9..a997c7cc35a2 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1756,10 +1756,10 @@ static int check_inode_ref(struct extent_buffer *leaf,
while (ptr < end) {
u16 namelen;
- if (unlikely(ptr + sizeof(iref) > end)) {
+ if (unlikely(ptr + sizeof(*iref) > end)) {
inode_ref_err(leaf, slot,
"inode ref overflow, ptr %lu end %lu inode_ref_size %zu",
- ptr, end, sizeof(iref));
+ ptr, end, sizeof(*iref));
return -EUCLEAN;
}
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [0, 8K)
| | |- lock_delalloc_folios() succeeded.
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN.
| | | As it failed to lock the folio at [32K, 40K).
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable(a)vger.kernel.org #5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/extent_io.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ca7174fa0240..2fd82055a779 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
/* step one, find a bunch of delalloc bytes starting at start */
delalloc_start = *start;
delalloc_end = 0;
+
+ /*
+ * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
+ * return early without handling any dirty ranges.
+ */
+ ASSERT(max_bytes >= fs_info->sectorsize);
+
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
max_bytes, &cached_state);
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
@@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
delalloc_end);
ASSERT(!ret || ret == -EAGAIN);
if (ret == -EAGAIN) {
- /* some of the folios are gone, lets avoid looping by
- * shortening the size of the delalloc range we're searching
+ /*
+ * Some of the folios are gone, lets avoid looping by
+ * shortening the size of the delalloc range we're searching.
*/
btrfs_free_extent_state(cached_state);
cached_state = NULL;
if (!loops) {
- max_bytes = PAGE_SIZE;
+ max_bytes = fs_info->sectorsize;
loops = 1;
goto again;
} else {
--
2.50.1
The quilt patch titled
Subject: mm/damon/lru_sort: use param_ctx for damon_attrs staging
has been removed from the -mm tree. Its filename was
mm-damon-lru_sort-use-param_ctx-for-damon_attrs-staging.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/lru_sort: use param_ctx for damon_attrs staging
Date: Mon, 15 Sep 2025 20:15:49 -0700
damon_lru_sort_apply_parameters() allocates a new DAMON context, stages
user-specified DAMON parameters on it, and commits to running DAMON
context at once, using damon_commit_ctx(). The code is, however, directly
updating the monitoring attributes of the running context. And the
attributes are over-written by later damon_commit_ctx() call. This means
that the monitoring attributes parameters are not really working. Fix the
wrong use of the parameter context.
Link: https://lkml.kernel.org/r/20250916031549.115326-1-sj@kernel.org
Fixes: a30969436428 ("mm/damon/lru_sort: use damon_commit_ctx()")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy(a)gmail.com>
Cc: Joshua Hahn <joshua.hahnjy(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [6.11+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/lru_sort.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/damon/lru_sort.c~mm-damon-lru_sort-use-param_ctx-for-damon_attrs-staging
+++ a/mm/damon/lru_sort.c
@@ -219,7 +219,7 @@ static int damon_lru_sort_apply_paramete
goto out;
}
- err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs);
+ err = damon_set_attrs(param_ctx, &damon_lru_sort_mon_attrs);
if (err)
goto out;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-sysfs-set-damon_ctx-min_sz_region-only-for-paddr-use-case.patch
This series backports seven commits from v5.15.y that update minmax.h
and related code:
- ed6e37e30826 ("tracing: Define the is_signed_type() macro once")
- 998f03984e25 ("minmax: sanity check constant bounds when clamping")
- d470787b25e6 ("minmax: clamp more efficiently by avoiding extra
comparison")
- 1c2ee5bc9f11 ("minmax: fix header inclusions")
- d53b5d862acd ("minmax: allow min()/max()/clamp() if the arguments
have the same signedness.")
- 7ed91c5560df ("minmax: allow comparisons of 'int' against 'unsigned
char/short'")
- 22f7794ef5a3 ("minmax: relax check to allow comparison between
unsigned arguments and signed constants")
The main motivation is commit d53b5d862acd, which removes the strict
type check in min()/max() when both arguments have the same signedness.
Without this, kernel 5.10 builds can emit warnings that become build
failures when -Werror is used.
Additionally, commit ed6e37e30826 from tracing is required as a
dependency; without it, compilation fails.
Andy Shevchenko (1):
minmax: fix header inclusions
Bart Van Assche (1):
tracing: Define the is_signed_type() macro once
David Laight (3):
minmax: allow min()/max()/clamp() if the arguments have the same
signedness.
minmax: allow comparisons of 'int' against 'unsigned char/short'
minmax: relax check to allow comparison between unsigned arguments and
signed constants
Jason A. Donenfeld (2):
minmax: sanity check constant bounds when clamping
minmax: clamp more efficiently by avoiding extra comparison
include/linux/compiler.h | 6 +++
include/linux/minmax.h | 89 ++++++++++++++++++++++++++----------
include/linux/overflow.h | 1 -
include/linux/trace_events.h | 2 -
4 files changed, 70 insertions(+), 28 deletions(-)
--
2.47.3
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 225d1ee0f5ba3218d1814d36564fdb5f37b50474
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092126-upstream-favorite-2f89@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 225d1ee0f5ba3218d1814d36564fdb5f37b50474 Mon Sep 17 00:00:00 2001
From: Antheas Kapenekakis <lkml(a)antheas.dev>
Date: Tue, 16 Sep 2025 09:28:18 +0200
Subject: [PATCH] platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan
quirk
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It turns out that the dual screen models use 0x5E for attaching and
detaching the keyboard instead of 0x5F. So, re-add the codes by
reverting commit cf3940ac737d ("platform/x86: asus-wmi: Remove extra
keys from ignore_key_wlan quirk"). For our future reference, add a
comment next to 0x5E indicating that it is used for that purpose.
Fixes: cf3940ac737d ("platform/x86: asus-wmi: Remove extra keys from ignore_key_wlan quirk")
Reported-by: Rahul Chandra <rahul(a)chandra.net>
Closes: https://lore.kernel.org/all/10020-68c90c80-d-4ac6c580@106290038/
Cc: stable(a)kernel.org
Signed-off-by: Antheas Kapenekakis <lkml(a)antheas.dev>
Link: https://patch.msgid.link/20250916072818.196462-1-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c
index 3a488cf9ca06..6a62bc5b02fd 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -673,6 +673,8 @@ static void asus_nb_wmi_key_filter(struct asus_wmi_driver *asus_wmi, int *code,
if (atkbd_reports_vol_keys)
*code = ASUS_WMI_KEY_IGNORE;
break;
+ case 0x5D: /* Wireless console Toggle */
+ case 0x5E: /* Wireless console Enable / Keyboard Attach, Detach */
case 0x5F: /* Wireless console Disable / Special Key */
if (quirks->key_wlan_event)
*code = quirks->key_wlan_event;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5282491fc49d5614ac6ddcd012e5743eecb6a67c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092118-portside-cheesy-44d2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5282491fc49d5614ac6ddcd012e5743eecb6a67c Mon Sep 17 00:00:00 2001
From: Namjae Jeon <linkinjeon(a)kernel.org>
Date: Wed, 10 Sep 2025 11:22:52 +0900
Subject: [PATCH] ksmbd: smbdirect: validate data_offset and data_length field
of smb_direct_data_transfer
If data_offset and data_length of smb_direct_data_transfer struct are
invalid, out of bounds issue could happen.
This patch validate data_offset and data_length field in recv_done.
Cc: stable(a)vger.kernel.org
Fixes: 2ea086e35c3d ("ksmbd: add buffer validation for smb direct")
Reviewed-by: Stefan Metzmacher <metze(a)samba.org>
Reported-by: Luigino Camastra, Aisle Research <luigino.camastra(a)aisle.com>
Signed-off-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c
index cc4322bfa1d6..d52f37578276 100644
--- a/fs/smb/server/transport_rdma.c
+++ b/fs/smb/server/transport_rdma.c
@@ -554,7 +554,7 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc)
case SMB_DIRECT_MSG_DATA_TRANSFER: {
struct smb_direct_data_transfer *data_transfer =
(struct smb_direct_data_transfer *)recvmsg->packet;
- unsigned int data_length;
+ unsigned int data_offset, data_length;
int avail_recvmsg_count, receive_credits;
if (wc->byte_len <
@@ -565,14 +565,15 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc)
}
data_length = le32_to_cpu(data_transfer->data_length);
- if (data_length) {
- if (wc->byte_len < sizeof(struct smb_direct_data_transfer) +
- (u64)data_length) {
- put_recvmsg(t, recvmsg);
- smb_direct_disconnect_rdma_connection(t);
- return;
- }
+ data_offset = le32_to_cpu(data_transfer->data_offset);
+ if (wc->byte_len < data_offset ||
+ wc->byte_len < (u64)data_offset + data_length) {
+ put_recvmsg(t, recvmsg);
+ smb_direct_disconnect_rdma_connection(t);
+ return;
+ }
+ if (data_length) {
if (t->full_packet_received)
recvmsg->first_segment = true;
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092158-payee-omega-5893@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x c62cff40481c037307a13becbda795f7afdcfebd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092116-ceramics-stratus-5d18@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c62cff40481c037307a13becbda795f7afdcfebd Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Mon, 8 Sep 2025 19:22:38 -0700
Subject: [PATCH] samples/damon/mtier: avoid starting DAMON before
initialization
Commit 964314344eab ("samples/damon/mtier: support boot time enable
setup") is somehow incompletely applying the origin patch [1]. It is
missing the part that avoids starting DAMON before module initialization.
Probably a mistake during a merge has happened. Fix it by applying the
missed part again.
Link: https://lkml.kernel.org/r/20250909022238.2989-4-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-4-sj@kernel.org [1]
Fixes: 964314344eab ("samples/damon/mtier: support boot time enable setup")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index 7ebd352138e4..beaf36657dea 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -208,6 +208,9 @@ static int damon_sample_mtier_enable_store(
if (enabled == is_enabled)
return 0;
+ if (!init_called)
+ return 0;
+
if (enabled) {
err = damon_sample_mtier_start();
if (err)
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x f826edeb888c5a8bd1b6e95ae6a50b0db2b21902
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092111-specked-enviably-906d@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f826edeb888c5a8bd1b6e95ae6a50b0db2b21902 Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Mon, 8 Sep 2025 19:22:36 -0700
Subject: [PATCH] samples/damon/wsse: avoid starting DAMON before
initialization
Patch series "samples/damon: fix boot time enable handling fixup merge
mistakes".
First three patches of the patch series "mm/damon: fix misc bugs in DAMON
modules" [1] were trying to fix boot time DAMON sample modules enabling
issues. The issues are the modules can crash if those are enabled before
DAMON is enabled, like using boot time parameter options. The three
patches were fixing the issues by avoiding starting DAMON before the
module initialization phase.
However, probably by a mistake during a merge, only half of the change is
merged, and the part for avoiding the starting of DAMON before the module
initialized is missed. So the problem is not solved and thus the modules
can still crash if enabled before DAMON is initialized. Fix those by
applying the unmerged parts again.
Note that the broken commits are merged into 6.17-rc1, but also backported
to relevant stable kernels. So this series also needs to be merged into
the stable kernels. Hence Cc-ing stable@.
This patch (of 3):
Commit 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling")
is somehow incompletely applying the origin patch [2]. It is missing the
part that avoids starting DAMON before module initialization. Probably a
mistake during a merge has happened. Fix it by applying the missed part
again.
Link: https://lkml.kernel.org/r/20250909022238.2989-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250909022238.2989-2-sj@kernel.org
Link: https://lkml.kernel.org/r/20250706193207.39810-1-sj@kernel.org [1]
Link: https://lore.kernel.org/20250706193207.39810-2-sj@kernel.org [2]
Fixes: 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c
index da052023b099..21eaf15f987d 100644
--- a/samples/damon/wsse.c
+++ b/samples/damon/wsse.c
@@ -118,6 +118,9 @@ static int damon_sample_wsse_enable_store(
return 0;
if (enabled) {
+ if (!init_called)
+ return 0;
+
err = damon_sample_wsse_start();
if (err)
enabled = false;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 3539b1467e94336d5854ebf976d9627bfb65d6c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092128-embassy-flyable-e3fb@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3539b1467e94336d5854ebf976d9627bfb65d6c3 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Thu, 18 Sep 2025 10:21:14 -0600
Subject: [PATCH] io_uring: include dying ring in task_work "should cancel"
state
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.
Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.
Cc: stable(a)vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler <thaler(a)thaler.hu>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 93633613a165..bcec12256f34 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1406,8 +1406,10 @@ static void io_req_task_cancel(struct io_kiocb *req, io_tw_token_t tw)
void io_req_task_submit(struct io_kiocb *req, io_tw_token_t tw)
{
- io_tw_lock(req->ctx, tw);
- if (unlikely(io_should_terminate_tw()))
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_tw_lock(ctx, tw);
+ if (unlikely(io_should_terminate_tw(ctx)))
io_req_defer_failed(req, -EFAULT);
else if (req->flags & REQ_F_FORCE_ASYNC)
io_queue_iowq(req);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index abc6de227f74..1880902be6fd 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -476,9 +476,9 @@ static inline bool io_allowed_run_tw(struct io_ring_ctx *ctx)
* 2) PF_KTHREAD is set, in which case the invoker of the task_work is
* our fallback task_work.
*/
-static inline bool io_should_terminate_tw(void)
+static inline bool io_should_terminate_tw(struct io_ring_ctx *ctx)
{
- return current->flags & (PF_KTHREAD | PF_EXITING);
+ return (current->flags & (PF_KTHREAD | PF_EXITING)) || percpu_ref_is_dying(&ctx->refs);
}
static inline void io_req_queue_tw_complete(struct io_kiocb *req, s32 res)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index c786e587563b..6090a26975d4 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -224,7 +224,7 @@ static int io_poll_check_events(struct io_kiocb *req, io_tw_token_t tw)
{
int v;
- if (unlikely(io_should_terminate_tw()))
+ if (unlikely(io_should_terminate_tw(req->ctx)))
return -ECANCELED;
do {
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 7f13bfa9f2b6..17e3aab0af36 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -324,7 +324,7 @@ static void io_req_task_link_timeout(struct io_kiocb *req, io_tw_token_t tw)
int ret;
if (prev) {
- if (!io_should_terminate_tw()) {
+ if (!io_should_terminate_tw(req->ctx)) {
struct io_cancel_data cd = {
.ctx = req->ctx,
.data = prev->cqe.user_data,
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 053bac89b6c0..213716e10d70 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -118,7 +118,7 @@ static void io_uring_cmd_work(struct io_kiocb *req, io_tw_token_t tw)
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
unsigned int flags = IO_URING_F_COMPLETE_DEFER;
- if (io_should_terminate_tw())
+ if (io_should_terminate_tw(req->ctx))
flags |= IO_URING_F_TASK_DEAD;
/* task_work executor checks the deffered list completion */
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 3539b1467e94336d5854ebf976d9627bfb65d6c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092127-emit-dean-5272@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3539b1467e94336d5854ebf976d9627bfb65d6c3 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Thu, 18 Sep 2025 10:21:14 -0600
Subject: [PATCH] io_uring: include dying ring in task_work "should cancel"
state
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.
Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.
Cc: stable(a)vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler <thaler(a)thaler.hu>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 93633613a165..bcec12256f34 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1406,8 +1406,10 @@ static void io_req_task_cancel(struct io_kiocb *req, io_tw_token_t tw)
void io_req_task_submit(struct io_kiocb *req, io_tw_token_t tw)
{
- io_tw_lock(req->ctx, tw);
- if (unlikely(io_should_terminate_tw()))
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_tw_lock(ctx, tw);
+ if (unlikely(io_should_terminate_tw(ctx)))
io_req_defer_failed(req, -EFAULT);
else if (req->flags & REQ_F_FORCE_ASYNC)
io_queue_iowq(req);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index abc6de227f74..1880902be6fd 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -476,9 +476,9 @@ static inline bool io_allowed_run_tw(struct io_ring_ctx *ctx)
* 2) PF_KTHREAD is set, in which case the invoker of the task_work is
* our fallback task_work.
*/
-static inline bool io_should_terminate_tw(void)
+static inline bool io_should_terminate_tw(struct io_ring_ctx *ctx)
{
- return current->flags & (PF_KTHREAD | PF_EXITING);
+ return (current->flags & (PF_KTHREAD | PF_EXITING)) || percpu_ref_is_dying(&ctx->refs);
}
static inline void io_req_queue_tw_complete(struct io_kiocb *req, s32 res)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index c786e587563b..6090a26975d4 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -224,7 +224,7 @@ static int io_poll_check_events(struct io_kiocb *req, io_tw_token_t tw)
{
int v;
- if (unlikely(io_should_terminate_tw()))
+ if (unlikely(io_should_terminate_tw(req->ctx)))
return -ECANCELED;
do {
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 7f13bfa9f2b6..17e3aab0af36 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -324,7 +324,7 @@ static void io_req_task_link_timeout(struct io_kiocb *req, io_tw_token_t tw)
int ret;
if (prev) {
- if (!io_should_terminate_tw()) {
+ if (!io_should_terminate_tw(req->ctx)) {
struct io_cancel_data cd = {
.ctx = req->ctx,
.data = prev->cqe.user_data,
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 053bac89b6c0..213716e10d70 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -118,7 +118,7 @@ static void io_uring_cmd_work(struct io_kiocb *req, io_tw_token_t tw)
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
unsigned int flags = IO_URING_F_COMPLETE_DEFER;
- if (io_should_terminate_tw())
+ if (io_should_terminate_tw(req->ctx))
flags |= IO_URING_F_TASK_DEAD;
/* task_work executor checks the deffered list completion */
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x a09a8a1fbb374e0053b97306da9dbc05bd384685
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092110-music-knoll-828f@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a09a8a1fbb374e0053b97306da9dbc05bd384685 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:16:53 -0700
Subject: [PATCH] mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
In many cases, if collect_longterm_unpinnable_folios() does need to drain
the LRU cache to release a reference, the cache in question is on this
same CPU, and much more efficiently drained by a preliminary local
lru_add_drain(), than the later cross-CPU lru_add_drain_all().
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration". Note for clean
backports: can take 6.16 commit a03db236aebf ("gup: optimize longterm
pin_user_pages() for large folio") first.
Link: https://lkml.kernel.org/r/66f2751f-283e-816d-9530-765db7edc465@google.com
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index 82aec6443c0a..b47066a54f52 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2287,8 +2287,8 @@ static unsigned long collect_longterm_unpinnable_folios(
struct pages_or_folios *pofs)
{
unsigned long collected = 0;
- bool drain_allow = true;
struct folio *folio;
+ int drained = 0;
long i = 0;
for (folio = pofs_get_folio(pofs, i); folio;
@@ -2307,10 +2307,17 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drain_allow && folio_ref_count(folio) !=
- folio_expected_ref_count(folio) + 1) {
+ if (drained == 0 &&
+ folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
+ lru_add_drain();
+ drained = 1;
+ }
+ if (drained == 1 &&
+ folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
- drain_allow = false;
+ drained = 2;
}
if (!folio_isolate_lru(folio))
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092108-drinking-sloped-1caa@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indiciates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092108-unmarked-tropical-1899@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indiciates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092107-crowbar-posting-c6ba@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1b34cbbf4f011a121ef7b2d7d6e6920a036d5285 Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Tue, 16 Sep 2025 17:20:59 +0800
Subject: [PATCH] crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
Issuing two writes to the same af_alg socket is bogus as the
data will be interleaved in an unpredictable fashion. Furthermore,
concurrent writes may create inconsistencies in the internal
socket state.
Disallow this by adding a new ctx->write field that indiciates
exclusive ownership for writing.
Fixes: 8ff590903d5 ("crypto: algif_skcipher - User-space interface for skcipher operations")
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 407f2c238f2c..ca6fdcc6c54a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -970,6 +970,12 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
+ if (ctx->write) {
+ release_sock(sk);
+ return -EBUSY;
+ }
+ ctx->write = true;
+
if (ctx->init && !ctx->more) {
if (ctx->used) {
err = -EINVAL;
@@ -1105,6 +1111,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unlock:
af_alg_data_wakeup(sk);
+ ctx->write = false;
release_sock(sk);
return copied ?: err;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f7b3b93f3a49..0c70f3a55575 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @write: True if we are in the middle of a write.
* @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
* @inflight: Non-zero when AIO requests are in flight.
@@ -151,10 +152,11 @@ struct af_alg_ctx {
size_t used;
atomic_t rcvused;
- bool more;
- bool merge;
- bool enc;
- bool init;
+ u32 more:1,
+ merge:1,
+ enc:1,
+ write:1,
+ init:1;
unsigned int len;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 7f830e126dc357fc086905ce9730140fd4528d66
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092126-fabulous-despair-ac21@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7f830e126dc357fc086905ce9730140fd4528d66 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 15 Sep 2025 11:04:12 -0500
Subject: [PATCH] x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
The sev_evict_cache() is guest-related code and should be guarded by
CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
function of sev_evict_cache() instead of the version that performs the actual
eviction. Move the function declarations under the appropriate #ifdef.
Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)kernel.org # 6.16.x
Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 02236962fdb1..465b19fd1a2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -562,6 +562,24 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
extern struct ghcb *boot_ghcb;
+static inline void sev_evict_cache(void *va, int npages)
+{
+ volatile u8 val __always_unused;
+ u8 *bytes = va;
+ int page_idx;
+
+ /*
+ * For SEV guests, a read from the first/last cache-lines of a 4K page
+ * using the guest key is sufficient to cause a flush of all cache-lines
+ * associated with that 4K page without incurring all the overhead of a
+ * full CLFLUSH sequence.
+ */
+ for (page_idx = 0; page_idx < npages; page_idx++) {
+ val = bytes[page_idx * PAGE_SIZE];
+ val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
+ }
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -605,6 +623,7 @@ static inline int snp_send_guest_request(struct snp_msg_desc *mdesc,
static inline int snp_svsm_vtpm_send_command(u8 *buffer) { return -ENODEV; }
static inline void __init snp_secure_tsc_prepare(void) { }
static inline void __init snp_secure_tsc_init(void) { }
+static inline void sev_evict_cache(void *va, int npages) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */
@@ -619,24 +638,6 @@ int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
void kdump_sev_callback(void);
void snp_fixup_e820_tables(void);
-
-static inline void sev_evict_cache(void *va, int npages)
-{
- volatile u8 val __always_unused;
- u8 *bytes = va;
- int page_idx;
-
- /*
- * For SEV guests, a read from the first/last cache-lines of a 4K page
- * using the guest key is sufficient to cause a flush of all cache-lines
- * associated with that 4K page without incurring all the overhead of a
- * full CLFLUSH sequence.
- */
- for (page_idx = 0; page_idx < npages; page_idx++) {
- val = bytes[page_idx * PAGE_SIZE];
- val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
- }
-}
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_rmptable_init(void) { return -ENOSYS; }
@@ -652,7 +653,6 @@ static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
-static inline void sev_evict_cache(void *va, int npages) {}
#endif
#endif
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 7f830e126dc357fc086905ce9730140fd4528d66
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092125-stitch-starting-35cb@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7f830e126dc357fc086905ce9730140fd4528d66 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 15 Sep 2025 11:04:12 -0500
Subject: [PATCH] x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
The sev_evict_cache() is guest-related code and should be guarded by
CONFIG_AMD_MEM_ENCRYPT, not CONFIG_KVM_AMD_SEV.
CONFIG_AMD_MEM_ENCRYPT=y is required for a guest to run properly as an SEV-SNP
guest, but a guest kernel built with CONFIG_KVM_AMD_SEV=n would get the stub
function of sev_evict_cache() instead of the version that performs the actual
eviction. Move the function declarations under the appropriate #ifdef.
Fixes: 7b306dfa326f ("x86/sev: Evict cache lines during SNP memory validation")
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)kernel.org # 6.16.x
Link: https://lore.kernel.org/r/70e38f2c4a549063de54052c9f64929705313526.17577089…
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 02236962fdb1..465b19fd1a2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -562,6 +562,24 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
extern struct ghcb *boot_ghcb;
+static inline void sev_evict_cache(void *va, int npages)
+{
+ volatile u8 val __always_unused;
+ u8 *bytes = va;
+ int page_idx;
+
+ /*
+ * For SEV guests, a read from the first/last cache-lines of a 4K page
+ * using the guest key is sufficient to cause a flush of all cache-lines
+ * associated with that 4K page without incurring all the overhead of a
+ * full CLFLUSH sequence.
+ */
+ for (page_idx = 0; page_idx < npages; page_idx++) {
+ val = bytes[page_idx * PAGE_SIZE];
+ val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
+ }
+}
+
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define snp_vmpl 0
@@ -605,6 +623,7 @@ static inline int snp_send_guest_request(struct snp_msg_desc *mdesc,
static inline int snp_svsm_vtpm_send_command(u8 *buffer) { return -ENODEV; }
static inline void __init snp_secure_tsc_prepare(void) { }
static inline void __init snp_secure_tsc_init(void) { }
+static inline void sev_evict_cache(void *va, int npages) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */
@@ -619,24 +638,6 @@ int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
void kdump_sev_callback(void);
void snp_fixup_e820_tables(void);
-
-static inline void sev_evict_cache(void *va, int npages)
-{
- volatile u8 val __always_unused;
- u8 *bytes = va;
- int page_idx;
-
- /*
- * For SEV guests, a read from the first/last cache-lines of a 4K page
- * using the guest key is sufficient to cause a flush of all cache-lines
- * associated with that 4K page without incurring all the overhead of a
- * full CLFLUSH sequence.
- */
- for (page_idx = 0; page_idx < npages; page_idx++) {
- val = bytes[page_idx * PAGE_SIZE];
- val = bytes[page_idx * PAGE_SIZE + PAGE_SIZE - 1];
- }
-}
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_rmptable_init(void) { return -ENOSYS; }
@@ -652,7 +653,6 @@ static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
static inline void kdump_sev_callback(void) { }
static inline void snp_fixup_e820_tables(void) {}
-static inline void sev_evict_cache(void *va, int npages) {}
#endif
#endif
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 96fa515e70f3e4b98685ef8cac9d737fc62f10e1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092135-breeding-chrome-585a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 96fa515e70f3e4b98685ef8cac9d737fc62f10e1 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 16 Sep 2025 07:54:06 +0930
Subject: [PATCH] btrfs: tree-checker: fix the incorrect inode ref size check
[BUG]
Inside check_inode_ref(), we need to make sure every structure,
including the btrfs_inode_extref header, is covered by the item. But
our code is incorrectly using "sizeof(iref)", where @iref is just a
pointer.
This means "sizeof(iref)" will always be "sizeof(void *)", which is much
smaller than "sizeof(struct btrfs_inode_extref)".
This will allow some bad inode extrefs to sneak in, defeating tree-checker.
[FIX]
Fix the typo by calling "sizeof(*iref)", which is the same as
"sizeof(struct btrfs_inode_extref)", and will be the correct behavior we
want.
Fixes: 71bf92a9b877 ("btrfs: tree-checker: Add check for INODE_REF")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 0f556f4de3f9..a997c7cc35a2 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1756,10 +1756,10 @@ static int check_inode_ref(struct extent_buffer *leaf,
while (ptr < end) {
u16 namelen;
- if (unlikely(ptr + sizeof(iref) > end)) {
+ if (unlikely(ptr + sizeof(*iref) > end)) {
inode_ref_err(leaf, slot,
"inode ref overflow, ptr %lu end %lu inode_ref_size %zu",
- ptr, end, sizeof(iref));
+ ptr, end, sizeof(*iref));
return -EUCLEAN;
}
The patch below does not apply to the 6.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.16.y
git checkout FETCH_HEAD
git cherry-pick -x e6b733ca2f99e968d696c2e812c8eb8e090bf37b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092121-boned-marbles-55ea@gregkh' --subject-prefix 'PATCH 6.16.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6b733ca2f99e968d696c2e812c8eb8e090bf37b Mon Sep 17 00:00:00 2001
From: SeongJae Park <sj(a)kernel.org>
Date: Mon, 8 Sep 2025 19:22:37 -0700
Subject: [PATCH] samples/damon/prcl: avoid starting DAMON before
initialization
Commit 2780505ec2b4 ("samples/damon/prcl: fix boot time enable crash") is
somehow incompletely applying the origin patch [1]. It is missing the
part that avoids starting DAMON before module initialization. Probably a
mistake during a merge has happened. Fix it by applying the missed part
again.
Link: https://lkml.kernel.org/r/20250909022238.2989-3-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-3-sj@kernel.org [1]
Fixes: 2780505ec2b4 ("samples/damon/prcl: fix boot time enable crash")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c
index 1b839c06a612..0226652f94d5 100644
--- a/samples/damon/prcl.c
+++ b/samples/damon/prcl.c
@@ -137,6 +137,9 @@ static int damon_sample_prcl_enable_store(
if (enabled == is_enabled)
return 0;
+ if (!init_called)
+ return 0;
+
if (enabled) {
err = damon_sample_prcl_start();
if (err)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092147-truck-ceremony-311d@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092146-exhume-krypton-1383@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092145-system-superjet-a0fb@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092144-angler-cuddly-30db@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092143-defrost-backboned-d1ea@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2da6de30e60dd9bb14600eff1cc99df2fa2ddae3 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:23:15 -0700
Subject: [PATCH] mm: folio_may_be_lru_cached() unless folio_test_large()
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain[_all]() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (drained == 0 &&
+ if (drained == 0 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain();
drained = 1;
}
- if (drained == 1 &&
+ if (drained == 1 && folio_may_be_lru_cached(folio) &&
folio_ref_count(folio) !=
folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_lru_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092156-candied-rogue-bf08@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092154-canon-user-98b7@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 98c6d259319ecf6e8d027abd3f14b81324b8c0ad
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092152-bobtail-scarring-ffff@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98c6d259319ecf6e8d027abd3f14b81324b8c0ad Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Mon, 8 Sep 2025 15:15:03 -0700
Subject: [PATCH] mm/gup: check ref_count instead of lru before migration
Patch series "mm: better GUP pin lru_add_drain_all()", v2.
Series of lru_add_drain_all()-related patches, arising from recent mm/gup
migration report from Will Deacon.
This patch (of 5):
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation: but
longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this makes
easy the backport to earlier releases: up to and including 6.12, btrfs
also used PG_private_2, but without a ref_count increment.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm:
add folio_expected_ref_count() for reference count calculation").
Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com
Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Acked-by: Kiryl Shutsemau <kas(a)kernel.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data(),
which ensures that the reference count of the device is correctly
managed.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: a61f3b4f476e ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
sound/soc/codecs/wcd934x.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
index 1bb7e1dc7e6b..9ffa65329934 100644
--- a/sound/soc/codecs/wcd934x.c
+++ b/sound/soc/codecs/wcd934x.c
@@ -5849,10 +5849,13 @@ static int wcd934x_codec_parse_data(struct wcd934x_codec *wcd)
slim_get_logical_addr(wcd->sidev);
wcd->if_regmap = regmap_init_slimbus(wcd->sidev,
&wcd934x_ifc_regmap_config);
- if (IS_ERR(wcd->if_regmap))
+ if (IS_ERR(wcd->if_regmap)) {
+ put_device(&wcd->sidev->dev);
return dev_err_probe(dev, PTR_ERR(wcd->if_regmap),
"Failed to allocate ifc register map\n");
+ }
+ put_device(&wcd->sidev->dev);
of_property_read_u32(dev->parent->of_node, "qcom,dmic-sample-rate",
&wcd->dmic_sample_rate);
--
2.17.1
When do_task() exhausts its iteration budget (!ret), it sets the state
to TASK_STATE_IDLE to reschedule, without a secondary check on the
current task->state. This can overwrite the TASK_STATE_DRAINING state
set by a concurrent call to rxe_cleanup_task() or rxe_disable_task().
While state changes are protected by a spinlock, both rxe_cleanup_task()
and rxe_disable_task() release the lock while waiting for the task to
finish draining in the while(!is_done(task)) loop. The race occurs if
do_task() hits its iteration limit and acquires the lock in this window.
The cleanup logic may then proceed while the task incorrectly
reschedules itself, leading to a potential use-after-free.
This bug was introduced during the migration from tasklets to workqueues,
where the special handling for the draining case was lost.
Fix this by restoring the original pre-migration behavior. If the state is
TASK_STATE_DRAINING when iterations are exhausted, set cont to 1 to
force a new loop iteration. This allows the task to finish its work, so
that a subsequent iteration can reach the switch statement and correctly
transition the state to TASK_STATE_DRAINED, stopping the task as intended.
Fixes: 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue support for rxe tasks")
Cc: stable(a)vger.kernel.org
Reviewed-by: Zhu Yanjun <yanjun.zhu(a)linux.dev>
Signed-off-by: Gui-Dong Han <hanguidong02(a)gmail.com>
---
v2:
* Rewrite commit message for clarity. Thanks to Zhu Yanjun for the review.
---
drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 6f8f353e9583..f522820b950c 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -132,8 +132,12 @@ static void do_task(struct rxe_task *task)
* yield the cpu and reschedule the task
*/
if (!ret) {
- task->state = TASK_STATE_IDLE;
- resched = 1;
+ if (task->state != TASK_STATE_DRAINING) {
+ task->state = TASK_STATE_IDLE;
+ resched = 1;
+ } else {
+ cont = 1;
+ }
goto exit;
}
--
2.25.1
To: linux-kernel(a)vger.kernel.org
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Samuel Holland <samuel.holland(a)sifive.com>
Cc: stable(a)vger.kernel.org
Cc: linux-riscv(a)lists.infradead.org
Cc: Thomas Gleixner <tglx(a)linutronix.de>
According to the PLIC specification[1], global interrupt sources are
assigned small unsigned integer identifiers beginning at the value 1.
An interrupt ID of 0 is reserved to mean "no interrupt".
The current plic_irq_resume() and plic_irq_suspend() functions incorrectly
starts the loop from index 0, which could access the reserved interrupt ID
0 register space.
This fix changes the loop to start from index 1, skipping the reserved
interrupt ID 0 as per the PLIC specification.
This prevents potential undefined behavior when accessing the reserved
register space during suspend/resume cycles.
Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Co-developed-by: Jia Wang <wangjia(a)ultrarisc.com>
Signed-off-by: Jia Wang <wangjia(a)ultrarisc.com>
Signed-off-by: Lucas Zampieri <lzampier(a)redhat.com>
[1] https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0
---
drivers/irqchip/irq-sifive-plic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index bf69a4802b71..1c2b4d2575ac 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -252,7 +252,7 @@ static int plic_irq_suspend(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) {
+ for (i = 1; i < priv->nr_irqs; i++) {
__assign_bit(i, priv->prio_save,
readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID));
}
@@ -283,7 +283,7 @@ static void plic_irq_resume(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) {
+ for (i = 1; i < priv->nr_irqs; i++) {
index = BIT_WORD(i);
writel((priv->prio_save[index] & BIT_MASK(i)) ? 1 : 0,
priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID);
--
2.51.0
Hi Stable,
Please provide a quote for your products:
Include:
1.Pricing (per unit)
2.Delivery cost & timeline
3.Quote expiry date
Deadline: September
Thanks!
Kamal Prasad
Albinayah Trading
The Qualcomm SM6375 processor is a 7nm process SoC for the mid-range market with the following features:
CPU: Eight-core design, including high-performance Kryo 670 core and efficient Kryo 265 core, optimized performance and energy efficiency.
GPU: Equipped with Adreno 642L GPU, supporting high-quality graphics and gaming experience.
AI Engine: Integrated Qualcomm AI engine to enhance intelligent features such as voice recognition and image processing.
Connectivity: Supports modern wireless standards such as 5G, Wi-Fi 6 and Bluetooth 5.2.
Multimedia: Supports 4K video encoding and decoding
Mainly used in mid-to-high-end smartphones, tablets and some IoT devices, suitable for users who need to balance cost performance and performance.
.# Part Number Manufacturer Date Code Quantity Unit Price Lead Time Condition (PCS) USD/Each one 1 SM-6375-1-PSP837-TR-00-0-AB QUALCOMM 2023+ 12000pcs US$18.00/pcs 7days New & original - stock 2 PM-6375-0-FOWNSP144-TR-01-0;TR-01-1 QUALCOMM 2023+ 12000pcs US$1.00/pcs 3 PMR-735A-0-WLNSP48-TR-05-0,TR-05-1 QUALCOMM 2023+ 12000pcs US$0.85/pcs 4 PMK-8003-0-FOWPSP36-TR-01-0 QUALCOMM 2023+ 12000pcs US$0.24/pcs 5 SDR-735-0-PSP219B-TR-01-0;TR-01-1 QUALCOMM 2023+ 12000pcs US$2.50/pcs 6 WCD-9370-0-WLPSP55-TR-01-0;TR-01-4 QUALCOMM 2023+ 12000pcs US$0.50/pcs 7 WCN-3988-0-82BWLPSP-TR-00-0 QUALCOMM 2023+ 12000pcs US$3.50/pcs 8 QET-6105-0-WLNSP24B-TR-00-1 QUALCOMM 2023+ 12000pcs US$1.20/pcs 9 QET4101-0-12WLNSP-TR-00-0 QUALCOMM 2022+ 12000pcs US$0.21/pcs
These materials are sold as a set for $28/usd, and are guaranteed to be authentic.
If you need other Qualcomm materials, please feel free to contact me
Stay in tune with product evolutions—tap . Keep Receiving Notices
Feel like taking a break? Select Configure Your Mailing.
The callback return value is ignored in damon_sysfs_damon_call(), which
means that it is not possible to detect invalid user input when writing
commands such as 'commit' to /sys/kernel/mm/damon/admin/kdamonds/<K>/state.
Fix it.
Signed-off-by: Akinobu Mita <akinobu.mita(a)gmail.com>
Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats")
Cc: <stable(a)vger.kernel.org> # v6.14.x
Reviewed-by: SeongJae Park <sj(a)kernel.org>
---
mm/damon/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c
index fe4e73d0ebbb..3ffe3a77b5db 100644
--- a/mm/damon/sysfs.c
+++ b/mm/damon/sysfs.c
@@ -1627,12 +1627,14 @@ static int damon_sysfs_damon_call(int (*fn)(void *data),
struct damon_sysfs_kdamond *kdamond)
{
struct damon_call_control call_control = {};
+ int err;
if (!kdamond->damon_ctx)
return -EINVAL;
call_control.fn = fn;
call_control.data = kdamond;
- return damon_call(kdamond->damon_ctx, &call_control);
+ err = damon_call(kdamond->damon_ctx, &call_control);
+ return err ? err : call_control.return_code;
}
struct damon_sysfs_schemes_walk_data {
--
2.43.0
It seems like everywhere in this file, when the request is not
bidirectionala, req->src is mapped with DMA_TO_DEVICE and req->dst is
mapped with DMA_FROM_DEVICE.
Fixes: 62f58b1637b7 ("crypto: aspeed - add HACE crypto driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thomas Fourier <fourier.thomas(a)gmail.com>
---
v1->v2:
- fix confusion between dst and src in commit message
drivers/crypto/aspeed/aspeed-hace-crypto.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/aspeed/aspeed-hace-crypto.c b/drivers/crypto/aspeed/aspeed-hace-crypto.c
index a72dfebc53ff..fa201dae1f81 100644
--- a/drivers/crypto/aspeed/aspeed-hace-crypto.c
+++ b/drivers/crypto/aspeed/aspeed-hace-crypto.c
@@ -346,7 +346,7 @@ static int aspeed_sk_start_sg(struct aspeed_hace_dev *hace_dev)
} else {
dma_unmap_sg(hace_dev->dev, req->dst, rctx->dst_nents,
- DMA_TO_DEVICE);
+ DMA_FROM_DEVICE);
dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents,
DMA_TO_DEVICE);
}
--
2.43.0
Greetings!!
We are a 24+ yr old high tech Web Development firm with presence of over
18+ yrs in Mauritius; partners of RV Tec hAdvisora Ltd and headquartered in
India
We have catered to over 7000 customers. You may visit
https://www.mirackle.com for more information about our company. We create
designs that help businesses and individuals attract and engage readers. We
work with all the latest technologies.
We are Authorized Google Workspace Reseller Partner for Asia Pacific region
including Mauritius.
Our Services: Domain Registrations, Web hosting, Google Workspace, Mobile
Responsive Website Designing, Wordpress Websites, Mobile Apps, Web Apps,
E-commerce websites, Google Ads, SEO, Catalogue design & affiliated services
We create beautiful designs. Our brief website portfolio:
http://www.mirackle.com/portfolio.html
Note: We are also looking for tie-ups with IT/Web design cos. who would
want to outsource work for high end Websites/Mobile APP requirements etc.
We have a team of highly skilled php coders who can cater to any complex
requirement.
Get in touch with to get the best prices & offers
India Whatsapp: +91 9323272846 / 9323551195; Mauritius WharsApp: +230 5758
5497; Email: business(a)mirackle.com ; Web: http://www.mirackle.com
Regards,
Nishith Patel
commit 96939cec994070aa5df852c10fad5fc303a97ea3 upstream.
When a SYN containing the 'C' flag (deny join id0) was received, this
piece of information was not propagated to the path-manager.
Even if this flag is mainly set on the server side, a client can also
tell the server it cannot try to establish new subflows to the client's
initial IP address and port. The server's PM should then record such
info when received, and before sending events about the new connection.
Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received")
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-pm-uspace-deny_join_id0-v1-1-40…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Conflicts in subflow.c, because of differences in the context, e.g.
introduced by commit 3a236aef280e ("mptcp: refactor passive socket
initialization"), which is not in this version. The same lines --
using 'mptcp_sk(new_msk)' instead of 'owner' -- can still be added
approximately at the same place, before calling
mptcp_pm_new_connection(). ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
net/mptcp/subflow.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6bc36132d490..f67d8c98d58a 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -758,6 +758,9 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
*/
WRITE_ONCE(mptcp_sk(new_msk)->first, child);
+ if (mp_opt.deny_join_id0)
+ WRITE_ONCE(mptcp_sk(new_msk)->pm.remote_deny_join_id0, true);
+
/* new mpc subflow takes ownership of the newly
* created mptcp socket
*/
--
2.51.0
Currently this is hidden behind perfmon_capable() since this is
technically an info leak, given that this is a system wide metric.
However the granularity reported here is always PAGE_SIZE aligned, which
matches what the core kernel is already willing to expose to userspace
if querying how many free RAM pages there are on the system, and that
doesn't need any special privileges. In addition other drm drivers seem
happy to expose this.
The motivation here if with oneAPI where they want to use the system
wide 'used' reporting here, so not the per-client fdinfo stats. This has
also come up with some perf overlay applications wanting this
information.
Fixes: 1105ac15d2a1 ("drm/xe/uapi: restrict system wide accounting")
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Joshua Santosh <joshua.santosh.ranjan(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_query.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index e1b603aba61b..2e9ff33ed2fe 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -276,8 +276,7 @@ static int query_mem_regions(struct xe_device *xe,
mem_regions->mem_regions[0].instance = 0;
mem_regions->mem_regions[0].min_page_size = PAGE_SIZE;
mem_regions->mem_regions[0].total_size = man->size << PAGE_SHIFT;
- if (perfmon_capable())
- mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man);
+ mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man);
mem_regions->num_mem_regions = 1;
for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
@@ -293,13 +292,11 @@ static int query_mem_regions(struct xe_device *xe,
mem_regions->mem_regions[mem_regions->num_mem_regions].total_size =
man->size;
- if (perfmon_capable()) {
- xe_ttm_vram_get_used(man,
- &mem_regions->mem_regions
- [mem_regions->num_mem_regions].used,
- &mem_regions->mem_regions
- [mem_regions->num_mem_regions].cpu_visible_used);
- }
+ xe_ttm_vram_get_used(man,
+ &mem_regions->mem_regions
+ [mem_regions->num_mem_regions].used,
+ &mem_regions->mem_regions
+ [mem_regions->num_mem_regions].cpu_visible_used);
mem_regions->mem_regions[mem_regions->num_mem_regions].cpu_visible_size =
xe_ttm_vram_get_cpu_visible_size(man);
--
2.51.0
commit 2293c57484ae64c9a3c847c8807db8c26a3a4d41 upstream.
During the connection establishment, a peer can tell the other one that
it cannot establish new subflows to the initial IP address and port by
setting the 'C' flag [1]. Doing so makes sense when the sender is behind
a strict NAT, operating behind a legacy Layer 4 load balancer, or using
anycast IP address for example.
When this 'C' flag is set, the path-managers must then not try to
establish new subflows to the other peer's initial IP address and port.
The in-kernel PM has access to this info, but the userspace PM didn't.
The RFC8684 [1] is strict about that:
(...) therefore the receiver MUST NOT try to open any additional
subflows toward this address and port.
So it is important to tell the userspace about that as it is responsible
for the respect of this flag.
When a new connection is created and established, the Netlink events
now contain the existing but not currently used 'flags' attribute. When
MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 is set, it means no other subflows
to the initial IP address and port -- info that are also part of the
event -- can be established.
Link: https://datatracker.ietf.org/doc/html/rfc8684#section-3.1-20.6 [1]
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Reported-by: Marek Majkowski <marek(a)cloudflare.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/532
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-pm-uspace-deny_join_id0-v1-2-40…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Conflicts in mptcp_pm.yaml, because the indentation has been modified
in commit ec362192aa9e ("netlink: specs: fix up indentation errors"),
which is not in this version. Applying the same modifications, but at
a different level. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Documentation/netlink/specs/mptcp_pm.yaml | 4 ++--
include/uapi/linux/mptcp.h | 2 ++
include/uapi/linux/mptcp_pm.h | 4 ++--
net/mptcp/pm_netlink.c | 7 +++++++
4 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml
index 7e295bad8b29..a670a9bbe01b 100644
--- a/Documentation/netlink/specs/mptcp_pm.yaml
+++ b/Documentation/netlink/specs/mptcp_pm.yaml
@@ -28,13 +28,13 @@ definitions:
traffic-patterns it can take a long time until the
MPTCP_EVENT_ESTABLISHED is sent.
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: established
doc: >-
A MPTCP connection is established (can start new subflows).
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: closed
doc: >-
diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 67d015df8893..5fd5b4cf75ca 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -31,6 +31,8 @@
#define MPTCP_INFO_FLAG_FALLBACK _BITUL(0)
#define MPTCP_INFO_FLAG_REMOTE_KEY_RECEIVED _BITUL(1)
+#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 _BITUL(0)
+
#define MPTCP_PM_ADDR_FLAG_SIGNAL (1 << 0)
#define MPTCP_PM_ADDR_FLAG_SUBFLOW (1 << 1)
#define MPTCP_PM_ADDR_FLAG_BACKUP (1 << 2)
diff --git a/include/uapi/linux/mptcp_pm.h b/include/uapi/linux/mptcp_pm.h
index 6ac84b2f636c..7359d34da446 100644
--- a/include/uapi/linux/mptcp_pm.h
+++ b/include/uapi/linux/mptcp_pm.h
@@ -16,10 +16,10 @@
* good time to allocate memory and send ADD_ADDR if needed. Depending on the
* traffic-patterns it can take a long time until the MPTCP_EVENT_ESTABLISHED
* is sent. Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_ESTABLISHED: A MPTCP connection is established (can start new
* subflows). Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_CLOSED: A MPTCP connection has stopped. Attribute: token.
* @MPTCP_EVENT_ANNOUNCED: A new address has been announced by the peer.
* Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index b763729b85e0..463c2e7956d5 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -2211,6 +2211,7 @@ static int mptcp_event_created(struct sk_buff *skb,
const struct sock *ssk)
{
int err = nla_put_u32(skb, MPTCP_ATTR_TOKEN, READ_ONCE(msk->token));
+ u16 flags = 0;
if (err)
return err;
@@ -2218,6 +2219,12 @@ static int mptcp_event_created(struct sk_buff *skb,
if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side)))
return -EMSGSIZE;
+ if (READ_ONCE(msk->pm.remote_deny_join_id0))
+ flags |= MPTCP_PM_EV_FLAG_DENY_JOIN_ID0;
+
+ if (flags && nla_put_u16(skb, MPTCP_ATTR_FLAGS, flags))
+ return -EMSGSIZE;
+
return mptcp_event_add_subflow(skb, ssk);
}
--
2.51.0
commit 2293c57484ae64c9a3c847c8807db8c26a3a4d41 upstream.
During the connection establishment, a peer can tell the other one that
it cannot establish new subflows to the initial IP address and port by
setting the 'C' flag [1]. Doing so makes sense when the sender is behind
a strict NAT, operating behind a legacy Layer 4 load balancer, or using
anycast IP address for example.
When this 'C' flag is set, the path-managers must then not try to
establish new subflows to the other peer's initial IP address and port.
The in-kernel PM has access to this info, but the userspace PM didn't.
The RFC8684 [1] is strict about that:
(...) therefore the receiver MUST NOT try to open any additional
subflows toward this address and port.
So it is important to tell the userspace about that as it is responsible
for the respect of this flag.
When a new connection is created and established, the Netlink events
now contain the existing but not currently used 'flags' attribute. When
MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 is set, it means no other subflows
to the initial IP address and port -- info that are also part of the
event -- can be established.
Link: https://datatracker.ietf.org/doc/html/rfc8684#section-3.1-20.6 [1]
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Reported-by: Marek Majkowski <marek(a)cloudflare.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/532
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20250912-net-mptcp-pm-uspace-deny_join_id0-v1-2-40…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Conflicts in mptcp_pm.yaml, because the indentation has been modified
in commit ec362192aa9e ("netlink: specs: fix up indentation errors"),
which is not in this version. Applying the same modifications, but at
a different level. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Documentation/netlink/specs/mptcp_pm.yaml | 4 ++--
include/uapi/linux/mptcp.h | 2 ++
include/uapi/linux/mptcp_pm.h | 4 ++--
net/mptcp/pm_netlink.c | 7 +++++++
4 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml
index ecfe5ee33de2..c77f32cfcae9 100644
--- a/Documentation/netlink/specs/mptcp_pm.yaml
+++ b/Documentation/netlink/specs/mptcp_pm.yaml
@@ -28,13 +28,13 @@ definitions:
traffic-patterns it can take a long time until the
MPTCP_EVENT_ESTABLISHED is sent.
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: established
doc: >-
A MPTCP connection is established (can start new subflows).
Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
- dport, server-side.
+ dport, server-side, [flags].
-
name: closed
doc: >-
diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 67d015df8893..5fd5b4cf75ca 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -31,6 +31,8 @@
#define MPTCP_INFO_FLAG_FALLBACK _BITUL(0)
#define MPTCP_INFO_FLAG_REMOTE_KEY_RECEIVED _BITUL(1)
+#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 _BITUL(0)
+
#define MPTCP_PM_ADDR_FLAG_SIGNAL (1 << 0)
#define MPTCP_PM_ADDR_FLAG_SUBFLOW (1 << 1)
#define MPTCP_PM_ADDR_FLAG_BACKUP (1 << 2)
diff --git a/include/uapi/linux/mptcp_pm.h b/include/uapi/linux/mptcp_pm.h
index 6ac84b2f636c..7359d34da446 100644
--- a/include/uapi/linux/mptcp_pm.h
+++ b/include/uapi/linux/mptcp_pm.h
@@ -16,10 +16,10 @@
* good time to allocate memory and send ADD_ADDR if needed. Depending on the
* traffic-patterns it can take a long time until the MPTCP_EVENT_ESTABLISHED
* is sent. Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_ESTABLISHED: A MPTCP connection is established (can start new
* subflows). Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport, server-side.
+ * sport, dport, server-side, [flags].
* @MPTCP_EVENT_CLOSED: A MPTCP connection has stopped. Attribute: token.
* @MPTCP_EVENT_ANNOUNCED: A new address has been announced by the peer.
* Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 50aaf259959a..ce7d42d3bd00 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -408,6 +408,7 @@ static int mptcp_event_created(struct sk_buff *skb,
const struct sock *ssk)
{
int err = nla_put_u32(skb, MPTCP_ATTR_TOKEN, READ_ONCE(msk->token));
+ u16 flags = 0;
if (err)
return err;
@@ -415,6 +416,12 @@ static int mptcp_event_created(struct sk_buff *skb,
if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side)))
return -EMSGSIZE;
+ if (READ_ONCE(msk->pm.remote_deny_join_id0))
+ flags |= MPTCP_PM_EV_FLAG_DENY_JOIN_ID0;
+
+ if (flags && nla_put_u16(skb, MPTCP_ATTR_FLAGS, flags))
+ return -EMSGSIZE;
+
return mptcp_event_add_subflow(skb, ssk);
}
--
2.51.0
Hi Sasha,
Thank you for maintaining the stable versions with Greg!
If I remember well, you run some scripts on your side to maintain the
queue/* branches in the linux-stable-rc Git tree [1], is that correct?
These branches have not been updated for a bit more than 3 weeks. Is it
normal?
Personally, I find them useful. But if it is just me, I can work without
them.
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/…
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
The patch titled
Subject: kmsan: Fix out-of-bounds access to shadow memory
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kmsan-fix-out-of-bounds-access-to-shadow-memory.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Eric Biggers <ebiggers(a)kernel.org>
Subject: kmsan: Fix out-of-bounds access to shadow memory
Date: Thu, 11 Sep 2025 12:58:58 -0700
Running sha224_kunit on a KMSAN-enabled kernel results in a crash in
kmsan_internal_set_shadow_origin():
BUG: unable to handle page fault for address: ffffbc3840291000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary)
Tainted: [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100
[...]
Call Trace:
<TASK>
__msan_memset+0xee/0x1a0
sha224_final+0x9e/0x350
test_hash_buffer_overruns+0x46f/0x5f0
? kmsan_get_shadow_origin_ptr+0x46/0xa0
? __pfx_test_hash_buffer_overruns+0x10/0x10
kunit_try_run_case+0x198/0xa00
This occurs when memset() is called on a buffer that is not 4-byte aligned
and extends to the end of a guard page, i.e. the next page is unmapped.
The bug is that the loop at the end of kmsan_internal_set_shadow_origin()
accesses the wrong shadow memory bytes when the address is not 4-byte
aligned. Since each 4 bytes are associated with an origin, it rounds the
address and size so that it can access all the origins that contain the
buffer. However, when it checks the corresponding shadow bytes for a
particular origin, it incorrectly uses the original unrounded shadow
address. This results in reads from shadow memory beyond the end of the
buffer's shadow memory, which crashes when that memory is not mapped.
To fix this, correctly align the shadow address before accessing the 4
shadow bytes corresponding to each origin.
Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org
Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning")
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
Tested-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmsan/core.c | 10 +++++++---
mm/kmsan/kmsan_test.c | 16 ++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
--- a/mm/kmsan/core.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/core.c
@@ -195,7 +195,8 @@ void kmsan_internal_set_shadow_origin(vo
u32 origin, bool checked)
{
u64 address = (u64)addr;
- u32 *shadow_start, *origin_start;
+ void *shadow_start;
+ u32 *aligned_shadow, *origin_start;
size_t pad = 0;
KMSAN_WARN_ON(!kmsan_metadata_is_contiguous(addr, size));
@@ -214,9 +215,12 @@ void kmsan_internal_set_shadow_origin(vo
}
__memset(shadow_start, b, size);
- if (!IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ if (IS_ALIGNED(address, KMSAN_ORIGIN_SIZE)) {
+ aligned_shadow = shadow_start;
+ } else {
pad = address % KMSAN_ORIGIN_SIZE;
address -= pad;
+ aligned_shadow = shadow_start - pad;
size += pad;
}
size = ALIGN(size, KMSAN_ORIGIN_SIZE);
@@ -230,7 +234,7 @@ void kmsan_internal_set_shadow_origin(vo
* corresponding shadow slot is zero.
*/
for (int i = 0; i < size / KMSAN_ORIGIN_SIZE; i++) {
- if (origin || !shadow_start[i])
+ if (origin || !aligned_shadow[i])
origin_start[i] = origin;
}
}
--- a/mm/kmsan/kmsan_test.c~kmsan-fix-out-of-bounds-access-to-shadow-memory
+++ a/mm/kmsan/kmsan_test.c
@@ -556,6 +556,21 @@ DEFINE_TEST_MEMSETXX(16)
DEFINE_TEST_MEMSETXX(32)
DEFINE_TEST_MEMSETXX(64)
+/* Test case: ensure that KMSAN does not access shadow memory out of bounds. */
+static void test_memset_on_guarded_buffer(struct kunit *test)
+{
+ void *buf = vmalloc(PAGE_SIZE);
+
+ kunit_info(test,
+ "memset() on ends of guarded buffer should not crash\n");
+
+ for (size_t size = 0; size <= 128; size++) {
+ memset(buf, 0xff, size);
+ memset(buf + PAGE_SIZE - size, 0xff, size);
+ }
+ vfree(buf);
+}
+
static noinline void fibonacci(int *array, int size, int start)
{
if (start < 2 || (start == size))
@@ -677,6 +692,7 @@ static struct kunit_case kmsan_test_case
KUNIT_CASE(test_memset16),
KUNIT_CASE(test_memset32),
KUNIT_CASE(test_memset64),
+ KUNIT_CASE(test_memset_on_guarded_buffer),
KUNIT_CASE(test_long_origin_chain),
KUNIT_CASE(test_stackdepot_roundtrip),
KUNIT_CASE(test_unpoison_memory),
_
Patches currently in -mm which might be from ebiggers(a)kernel.org are
kmsan-fix-out-of-bounds-access-to-shadow-memory.patch
The patch titled
Subject: Squashfs: fix uninit-value in squashfs_get_parent
has been added to the -mm mm-nonmm-unstable branch. Its filename is
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: fix uninit-value in squashfs_get_parent
Date: Fri, 19 Sep 2025 00:33:08 +0100
Syzkaller reports a "KMSAN: uninit-value in squashfs_get_parent" bug.
This is caused by open_by_handle_at() being called with a file handle
containing an invalid parent inode number. In particular the inode number
is that of a symbolic link, rather than a directory.
Squashfs_get_parent() gets called with that symbolic link inode, and
accesses the parent member field.
unsigned int parent_ino = squashfs_i(inode)->parent;
Because non-directory inodes in Squashfs do not have a parent value, this
is uninitialised, and this causes an uninitialised value access.
The fix is to initialise parent with the invalid inode 0, which will cause
an EINVAL error to be returned.
Regular inodes used to share the parent field with the block_list_start
field. This is removed in this commit to enable the parent field to
contain the invalid inode number 0.
Link: https://lkml.kernel.org/r/20250918233308.293861-1-phillip@squashfs.org.uk
Fixes: 122601408d20 ("Squashfs: export operations")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+157bdef5cf596ad0da2c(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cc2431.050a0220.139b6.0001.GAE@google.com/
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/inode.c | 7 +++++++
fs/squashfs/squashfs_fs_i.h | 2 +-
2 files changed, 8 insertions(+), 1 deletion(-)
--- a/fs/squashfs/inode.c~squashfs-fix-uninit-value-in-squashfs_get_parent
+++ a/fs/squashfs/inode.c
@@ -169,6 +169,7 @@ int squashfs_read_inode(struct inode *in
squashfs_i(inode)->start = le32_to_cpu(sqsh_ino->start_block);
squashfs_i(inode)->block_list_start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
inode->i_data.a_ops = &squashfs_aops;
TRACE("File inode %x:%x, start_block %llx, block_list_start "
@@ -216,6 +217,7 @@ int squashfs_read_inode(struct inode *in
squashfs_i(inode)->start = le64_to_cpu(sqsh_ino->start_block);
squashfs_i(inode)->block_list_start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
inode->i_data.a_ops = &squashfs_aops;
TRACE("File inode %x:%x, start_block %llx, block_list_start "
@@ -296,6 +298,7 @@ int squashfs_read_inode(struct inode *in
inode->i_mode |= S_IFLNK;
squashfs_i(inode)->start = block;
squashfs_i(inode)->offset = offset;
+ squashfs_i(inode)->parent = 0;
if (type == SQUASHFS_LSYMLINK_TYPE) {
__le32 xattr;
@@ -333,6 +336,7 @@ int squashfs_read_inode(struct inode *in
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
rdev = le32_to_cpu(sqsh_ino->rdev);
init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
+ squashfs_i(inode)->parent = 0;
TRACE("Device inode %x:%x, rdev %x\n",
SQUASHFS_INODE_BLK(ino), offset, rdev);
@@ -357,6 +361,7 @@ int squashfs_read_inode(struct inode *in
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
rdev = le32_to_cpu(sqsh_ino->rdev);
init_special_inode(inode, inode->i_mode, new_decode_dev(rdev));
+ squashfs_i(inode)->parent = 0;
TRACE("Device inode %x:%x, rdev %x\n",
SQUASHFS_INODE_BLK(ino), offset, rdev);
@@ -377,6 +382,7 @@ int squashfs_read_inode(struct inode *in
inode->i_mode |= S_IFSOCK;
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
init_special_inode(inode, inode->i_mode, 0);
+ squashfs_i(inode)->parent = 0;
break;
}
case SQUASHFS_LFIFO_TYPE:
@@ -396,6 +402,7 @@ int squashfs_read_inode(struct inode *in
inode->i_op = &squashfs_inode_ops;
set_nlink(inode, le32_to_cpu(sqsh_ino->nlink));
init_special_inode(inode, inode->i_mode, 0);
+ squashfs_i(inode)->parent = 0;
break;
}
default:
--- a/fs/squashfs/squashfs_fs_i.h~squashfs-fix-uninit-value-in-squashfs_get_parent
+++ a/fs/squashfs/squashfs_fs_i.h
@@ -16,6 +16,7 @@ struct squashfs_inode_info {
u64 xattr;
unsigned int xattr_size;
int xattr_count;
+ int parent;
union {
struct {
u64 fragment_block;
@@ -27,7 +28,6 @@ struct squashfs_inode_info {
u64 dir_idx_start;
int dir_idx_offset;
int dir_idx_cnt;
- int parent;
};
};
struct inode vfs_inode;
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
squashfs-fix-uninit-value-in-squashfs_get_parent.patch
The patch titled
Subject: fs/proc/task_mmu: check cur_buf for NULL
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fs-proc-task_mmu-check-cur_buf-for-null.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jakub Acs <acsjakub(a)amazon.de>
Subject: fs/proc/task_mmu: check cur_buf for NULL
Date: Fri, 19 Sep 2025 14:21:04 +0000
When the PAGEMAP_SCAN ioctl is invoked with vec_len = 0 reaches
pagemap_scan_backout_range(), kernel panics with null-ptr-deref:
[ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none)
[ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80
<snip registers, unreliable trace>
[ 44.946828] Call Trace:
[ 44.947030] <TASK>
[ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0
[ 44.952593] walk_pmd_range.isra.0+0x302/0x910
[ 44.954069] walk_pud_range.isra.0+0x419/0x790
[ 44.954427] walk_p4d_range+0x41e/0x620
[ 44.954743] walk_pgd_range+0x31e/0x630
[ 44.955057] __walk_page_range+0x160/0x670
[ 44.956883] walk_page_range_mm+0x408/0x980
[ 44.958677] walk_page_range+0x66/0x90
[ 44.958984] do_pagemap_scan+0x28d/0x9c0
[ 44.961833] do_pagemap_cmd+0x59/0x80
[ 44.962484] __x64_sys_ioctl+0x18d/0x210
[ 44.962804] do_syscall_64+0x5b/0x290
[ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are
allocated and p->vec_buf remains set to NULL.
This breaks an assumption made later in pagemap_scan_backout_range(), that
page_region is always allocated for p->vec_buf_index.
Fix it by explicitly checking cur_buf for NULL before dereferencing.
Other sites that might run into same deref-issue are already (directly or
transitively) protected by checking p->vec_buf.
Note:
From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output
is requested and it's only the side effects caller is interested in, hence
it passes check in pagemap_scan_get_args().
This issue was found by syzkaller.
Link: https://lkml.kernel.org/r/20250919142106.43527-1-acsjakub@amazon.de
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Jakub Acs <acsjakub(a)amazon.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: "Micha�� Miros��aw" <mirq-linux(a)rere.qmqm.pl>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/proc/task_mmu.c~fs-proc-task_mmu-check-cur_buf-for-null
+++ a/fs/proc/task_mmu.c
@@ -2417,6 +2417,9 @@ static void pagemap_scan_backout_range(s
{
struct page_region *cur_buf = &p->vec_buf[p->vec_buf_index];
+ if (!cur_buf)
+ return;
+
if (cur_buf->start != addr)
cur_buf->end = addr;
else
_
Patches currently in -mm which might be from acsjakub(a)amazon.de are
fs-proc-task_mmu-check-cur_buf-for-null.patch
Bug-report: https://lore.kernel.org/all/915c0e00-b92d-4e37-9d4b-0f6a4580da97@oracle.com/
Summary: While backporting commit: 7c62c442b6eb ("x86/vmscape: Enumerate
VMSCAPE bug") to 6.12.y --> VULNBL_AMD(0x1a, SRSO | VMSCAPE) was added
even when 6.12.y doesn't have commit: 877818802c3e ("x86/bugs: Add
SRSO_USER_KERNEL_NO support").
Boris Ostrovsky suggested backporting three commits to 6.12.y:
1. commit: 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO support")
2. commit: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
and its fix
3. commit: e3417ab75ab2 ("KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0
<=> 1 VM count transitions") -- Maybe optional
Which changes current mitigation status on turin for 6.12.48 from Safe
RET to Reduced Speculation, leaving it with Safe RET liely causes heavy
performance regressions.
This three patches together change mitigation status from Safe RET to
Reduced Speculation
Tested on Turin:
[ 3.188134] Speculative Return Stack Overflow: Mitigation: Reduced Speculation
Backports:
1. Patch 1 had minor conflict as VMSCAPE commit added VULNBL_AMD(0x1a,
SRSO | VMSCAPE), and resolution is to skip that line.
2. Patch 2 and 3 are clean cherry-picks, 3 is a fix for 2.
Note: I verified if this problem is also on other stable trees like (6.6
--> 5.10, no they don't have this backport problem)
Thanks,
Harshit
Borislav Petkov (1):
x86/bugs: KVM: Add support for SRSO_MSR_FIX
Borislav Petkov (AMD) (1):
x86/bugs: Add SRSO_USER_KERNEL_NO support
Sean Christopherson (1):
KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count
transitions
Documentation/admin-guide/hw-vuln/srso.rst | 13 +++++
arch/x86/include/asm/cpufeatures.h | 5 ++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 28 ++++++++--
arch/x86/kvm/svm/svm.c | 65 ++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 2 +
arch/x86/lib/msr.c | 2 +
7 files changed, 112 insertions(+), 4 deletions(-)
--
2.50.1
The following commit has been merged into the x86/cpu branch of tip:
Commit-ID: 32278c677947ae2f042c9535674a7fff9a245dd3
Gitweb: https://git.kernel.org/tip/32278c677947ae2f042c9535674a7fff9a245dd3
Author: Sean Christopherson <seanjc(a)google.com>
AuthorDate: Fri, 08 Aug 2025 10:23:56 -07:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Fri, 19 Sep 2025 20:21:12 +02:00
x86/umip: Check that the instruction opcode is at least two bytes
When checking for a potential UMIP violation on #GP, verify the decoder found
at least two opcode bytes to avoid false positives when the kernel encounters
an unknown instruction that starts with 0f. Because the array of opcode.bytes
is zero-initialized by insn_init(), peeking at bytes[1] will misinterpret
garbage as a potential SLDT or STR instruction, and can incorrectly trigger
emulation.
E.g. if a VPALIGNR instruction
62 83 c5 05 0f 08 ff vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
hits a #GP, the kernel emulates it as STR and squashes the #GP (and corrupts
the userspace code stream).
Arguably the check should look for exactly two bytes, but no three byte
opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
impossible to get a false positive if the first two opcode bytes match '0f 00'
or '0f 01'. Go with a more conservative check with respect to the existing
code to minimize the chances of breaking userspace, e.g. due to decoder
weirdness.
Analyzed by Nick Bray <ncbray(a)google.com>.
Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Reported-by: Dan Snyder <dansnyder(a)google.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/umip.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 5a4b213..406ac01 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -156,8 +156,8 @@ static int identify_insn(struct insn *insn)
if (!insn->modrm.nbytes)
return -EINVAL;
- /* All the instructions of interest start with 0x0f. */
- if (insn->opcode.bytes[0] != 0xf)
+ /* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */
+ if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf)
return -EINVAL;
if (insn->opcode.bytes[1] == 0x1) {
The following commit has been merged into the x86/cpu branch of tip:
Commit-ID: 27b1fd62012dfe9d3eb8ecde344d7aa673695ecf
Gitweb: https://git.kernel.org/tip/27b1fd62012dfe9d3eb8ecde344d7aa673695ecf
Author: Sean Christopherson <seanjc(a)google.com>
AuthorDate: Fri, 08 Aug 2025 10:23:57 -07:00
Committer: Borislav Petkov (AMD) <bp(a)alien8.de>
CommitterDate: Fri, 19 Sep 2025 21:34:48 +02:00
x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)
Filter out the register forms of 0F 01 when determining whether or not to
emulate in response to a potential UMIP violation #GP, as SGDT and SIDT only
accept memory operands. The register variants of 0F 01 are used to encode
instructions for things like VMX and SGX, i.e. not checking the Mod field
would cause the kernel to incorrectly emulate on #GP, e.g. due to a CPL
violation on VMLAUNCH.
Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions")
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/umip.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 406ac01..d432f38 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -163,8 +163,19 @@ static int identify_insn(struct insn *insn)
if (insn->opcode.bytes[1] == 0x1) {
switch (X86_MODRM_REG(insn->modrm.value)) {
case 0:
+ /* The reg form of 0F 01 /0 encodes VMX instructions. */
+ if (X86_MODRM_MOD(insn->modrm.value) == 3)
+ return -EINVAL;
+
return UMIP_INST_SGDT;
case 1:
+ /*
+ * The reg form of 0F 01 /1 encodes MONITOR/MWAIT,
+ * STAC/CLAC, and ENCLS.
+ */
+ if (X86_MODRM_MOD(insn->modrm.value) == 3)
+ return -EINVAL;
+
return UMIP_INST_SIDT;
case 4:
return UMIP_INST_SMSW;
From: HariKrishna Sagala <hariconscious(a)gmail.com>
Syzbot reported an uninit-value bug on at kmalloc_reserve for
commit 320475fbd590 ("Merge tag 'mtd/fixes-for-6.17-rc6' of
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux")'
Syzbot KMSAN reported use of uninitialized memory originating from functions
"kmalloc_reserve()", where memory allocated via "kmem_cache_alloc_node()" or
"kmalloc_node_track_caller()" was not explicitly initialized.
This can lead to undefined behavior when the allocated buffer
is later accessed.
Fix this by requesting the initialized memory using the gfp flag
appended with the option "__GFP_ZERO".
Reported-by: syzbot+9a4fbb77c9d4aacd3388(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9a4fbb77c9d4aacd3388
Fixes: 915d975b2ffa ("net: deal with integer overflows in
kmalloc_reserve()")
Tested-by: syzbot+9a4fbb77c9d4aacd3388(a)syzkaller.appspotmail.com
Signed-off-by: HariKrishna Sagala <hariconscious(a)gmail.com>
---
net/core/skbuff.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ee0274417948..2308ebf99bbd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -573,6 +573,7 @@ static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
void *obj;
obj_size = SKB_HEAD_ALIGN(*size);
+ flags |= __GFP_ZERO;
if (obj_size <= SKB_SMALL_HEAD_CACHE_SIZE &&
!(flags & KMALLOC_NOT_NORMAL_BITS)) {
obj = kmem_cache_alloc_node(net_hotdata.skb_small_head_cache,
--
2.43.0
Once of_device_register() failed, we should call put_device() to
decrement reference count for cleanup. Or it could cause memory leak.
So fix this by calling put_device(), then the name can be freed in
kobject_cleanup().
Calling path: of_device_register() -> of_device_add() -> device_add().
As comment of device_add() says, 'if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: cf44bbc26cf1 ("[SPARC]: Beginnings of generic of_device framework.")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- retained kfree() manually due to the lack of a release callback function.
---
arch/sparc/kernel/of_device_64.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/sparc/kernel/of_device_64.c b/arch/sparc/kernel/of_device_64.c
index f98c2901f335..f53092b07b9e 100644
--- a/arch/sparc/kernel/of_device_64.c
+++ b/arch/sparc/kernel/of_device_64.c
@@ -677,6 +677,7 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
if (of_device_register(op)) {
printk("%pOF: Could not register of device.\n", dp);
+ put_device(&op->dev);
kfree(op);
op = NULL;
}
--
2.25.1
Hi stable maintainers,
While skimming over stable backports for VMSCAPE commits, I found
something unusual.
This is regarding the 6.12.y commit: 7c62c442b6eb ("x86/vmscape:
Enumerate VMSCAPE bug")
commit 7c62c442b6eb95d21bc4c5afc12fee721646ebe2
Author: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Date: Thu Aug 14 10:20:42 2025 -0700
x86/vmscape: Enumerate VMSCAPE bug
Commit a508cec6e5215a3fbc7e73ae86a5c5602187934d upstream.
The VMSCAPE vulnerability may allow a guest to cause Branch Target
Injection (BTI) in userspace hypervisors.
Kernels (both host and guest) have existing defenses against direct BTI
attacks from guests. There are also inter-process BTI mitigations which
prevent processes from attacking each other. However, the threat in
this
case is to a userspace hypervisor within the same process as the
attacker.
Userspace hypervisors have access to their own sensitive data like disk
encryption keys and also typically have access to all guest data. This
means guest userspace may use the hypervisor as a confused deputy
to attack
sensitive guest kernel data. There are no existing mitigations for
these
attacks.
Introduce X86_BUG_VMSCAPE for this vulnerability and set it on affected
Intel and AMD CPUs.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
So the problem in this commit is this part of the backport:
in file: arch/x86/kernel/cpu/common.c
VULNBL_AMD(0x15, RETBLEED),
VULNBL_AMD(0x16, RETBLEED),
- VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
- VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
- VULNBL_AMD(0x19, SRSO | TSA),
+ VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
+ VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
+ VULNBL_AMD(0x19, SRSO | TSA | VMSCAPE),
+ VULNBL_AMD(0x1a, SRSO | VMSCAPE),
+
{}
Notice the part where VULNBL_AMD(0x1a, SRSO | VMSCAPE) is added, 6.12.y
doesn't have commit: 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO
support") so I think we shouldn't be adding VULNBL_AMD(0x1a, SRSO |
VMSCAPE) directly.
Boris Ostrovsky suggested me to verify this on a Turin machine as this
could cause a very big performance regression : and stated if SRSO
mitigation status is Safe RET we are likely in a problem, and we are in
that situation.
# lscpu | grep -E "CPU family"
CPU family: 26
Notes: CPU ID 26 -> 0x1a
And Turin machine reports the SRSO mitigation status as "Safe RET"
# uname -r
6.12.48-master.20250917.el8.rc1.x86_64
# cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
Boris Ostrovsky suggested backporting three commits to 6.12.y:
1. commit: 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO support")
2. commit: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
and its fix
3. commit: e3417ab75ab2 ("KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0
<=> 1 VM count transitions") -- Maybe optional
After backporting these three:
# uname -r
6.12.48-master.20250919.el8.dev.x86_64 // Note this this is kernel with
patches above three applied.
# dmesg | grep -C 2 Reduce
[ 3.186135] Speculative Store Bypass: Mitigation: Speculative Store
Bypass disabled via prctl
[ 3.187135] Speculative Return Stack Overflow: Reducing speculation to
address VM/HV SRSO attack vector.
[ 3.188134] Speculative Return Stack Overflow: Mitigation: Reduced
Speculation
[ 3.189135] VMSCAPE: Mitigation: IBPB before exit to userspace
[ 3.191139] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point
registers'
# cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Reduced Speculation
I can send my backports to stable if this looks good. Thoughts ?
Thanks,
Harshit
The iput() function is a dangerous one - if the reference counter goes
to zero, the function may block for a long time due to:
- inode_wait_for_writeback() waits until writeback on this inode
completes
- the filesystem-specific "evict_inode" callback can do similar
things; e.g. all netfs-based filesystems will call
netfs_wait_for_outstanding_io() which is similar to
inode_wait_for_writeback()
Therefore, callers must carefully evaluate the context they're in and
check whether invoking iput() is a good idea at all.
Most of the time, this is not a problem because the dcache holds
references to all inodes, and the dcache is usually the one to release
the last reference. But this assumption is fragile. For example,
under (memcg) memory pressure, the dcache shrinker is more likely to
release inode references, moving the inode eviction to contexts where
that was extremely unlikely to occur.
Our production servers "found" at least two deadlock bugs in the Ceph
filesystem that were caused by this iput() behavior:
1. Writeback may lead to iput() calls in Ceph (e.g. from
ceph_put_wrbuffer_cap_refs()) which deadlocks in
inode_wait_for_writeback(). Waiting for writeback completion from
within writeback will obviously never be able to make any progress.
This leads to blocked kworkers like this:
INFO: task kworker/u777:6:1270802 blocked for more than 122 seconds.
Not tainted 6.16.7-i1-es #773
task:kworker/u777:6 state:D stack:0 pid:1270802 tgid:1270802 ppid:2
task_flags:0x4208060 flags:0x00004000
Workqueue: writeback wb_workfn (flush-ceph-3)
Call Trace:
<TASK>
__schedule+0x4ea/0x17d0
schedule+0x1c/0xc0
inode_wait_for_writeback+0x71/0xb0
evict+0xcf/0x200
ceph_put_wrbuffer_cap_refs+0xdd/0x220
ceph_invalidate_folio+0x97/0xc0
ceph_writepages_start+0x127b/0x14d0
do_writepages+0xba/0x150
__writeback_single_inode+0x34/0x290
writeback_sb_inodes+0x203/0x470
__writeback_inodes_wb+0x4c/0xe0
wb_writeback+0x189/0x2b0
wb_workfn+0x30b/0x3d0
process_one_work+0x143/0x2b0
worker_thread+0x30a/0x450
2. In the Ceph messenger thread (net/ceph/messenger*.c), any iput()
call may invoke ceph_evict_inode() which will deadlock in
netfs_wait_for_outstanding_io(); since this blocks the messenger
thread, completions from the Ceph servers will not ever be received
and handled.
It looks like these deadlock bugs have been in the Ceph filesystem
code since forever (therefore no "Fixes" tag in this patch). There
may be various ways to solve this:
- make iput() asynchronous and defer the actual eviction like fput()
(may add overhead)
- make iput() only asynchronous if I_SYNC is set (doesn't solve random
things happening inside the "evict_inode" callback)
- add iput_deferred() to make this asynchronous behavior/overhead
optional and explicit
- refactor Ceph to avoid iput() calls from within writeback and
messenger (if that is even possible)
- add a Ceph-specific workaround
After advice from Mateusz Guzik, I decided to do the latter. The
implementation is simple because it piggybacks on the existing
work_struct for ceph_queue_inode_work() - ceph_inode_work() calls
iput() at the end which means we can donate the last reference to it.
Since Ceph has a few iput() callers in a loop, it seemed simple enough
to pass this counter and use atomic_sub() instead of atomic_dec().
This patch adds ceph_iput_n_async() and converts lots of iput() calls
to it - at least those that may come through writeback and the
messenger.
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
Cc: Mateusz Guzik <mjguzik(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
fs/ceph/addr.c | 2 +-
fs/ceph/caps.c | 21 ++++++++++-----------
fs/ceph/dir.c | 2 +-
fs/ceph/inode.c | 42 ++++++++++++++++++++++++++++++++++++++++++
fs/ceph/mds_client.c | 32 ++++++++++++++++----------------
fs/ceph/quota.c | 4 ++--
fs/ceph/snap.c | 10 +++++-----
fs/ceph/super.h | 7 +++++++
8 files changed, 84 insertions(+), 36 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 322ed268f14a..fc497c91530e 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -265,7 +265,7 @@ static void finish_netfs_read(struct ceph_osd_request *req)
subreq->error = err;
trace_netfs_sreq(subreq, netfs_sreq_trace_io_progress);
netfs_read_subreq_terminated(subreq);
- iput(req->r_inode);
+ ceph_iput_async(req->r_inode);
ceph_dec_osd_stopping_blocker(fsc->mdsc);
}
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index b1a8ff612c41..bd88b5287a2b 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1771,7 +1771,7 @@ void ceph_flush_snaps(struct ceph_inode_info *ci,
spin_unlock(&mdsc->snap_flush_lock);
if (need_put)
- iput(inode);
+ ceph_iput_async(inode);
}
/*
@@ -3318,8 +3318,8 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
}
if (wake)
wake_up_all(&ci->i_cap_wq);
- while (put-- > 0)
- iput(inode);
+ if (put > 0)
+ ceph_iput_n_async(inode, put);
}
void ceph_put_cap_refs(struct ceph_inode_info *ci, int had)
@@ -3418,9 +3418,8 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr,
}
if (complete_capsnap)
wake_up_all(&ci->i_cap_wq);
- while (put-- > 0) {
- iput(inode);
- }
+ if (put > 0)
+ ceph_iput_n_async(inode, put);
}
/*
@@ -3917,7 +3916,7 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid,
if (wake_mdsc)
wake_up_all(&mdsc->cap_flushing_wq);
if (drop)
- iput(inode);
+ ceph_iput_async(inode);
}
void __ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap,
@@ -4008,7 +4007,7 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid,
wake_up_all(&ci->i_cap_wq);
if (wake_mdsc)
wake_up_all(&mdsc->cap_flushing_wq);
- iput(inode);
+ ceph_iput_async(inode);
}
}
@@ -4557,7 +4556,7 @@ void ceph_handle_caps(struct ceph_mds_session *session,
done:
mutex_unlock(&session->s_mutex);
done_unlocked:
- iput(inode);
+ ceph_iput_async(inode);
out:
ceph_dec_mds_stopping_blocker(mdsc);
@@ -4636,7 +4635,7 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
doutc(cl, "on %p %llx.%llx\n", inode,
ceph_vinop(inode));
ceph_check_caps(ci, 0);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_delay_lock);
}
@@ -4675,7 +4674,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s)
spin_unlock(&mdsc->cap_dirty_lock);
ceph_wait_on_async_create(inode);
ceph_check_caps(ci, CHECK_CAPS_FLUSH);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_dirty_lock);
}
spin_unlock(&mdsc->cap_dirty_lock);
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 32973c62c1a2..ec73ed52a227 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1290,7 +1290,7 @@ static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc,
ceph_mdsc_free_path_info(&path_info);
}
out:
- iput(req->r_old_inode);
+ ceph_iput_async(req->r_old_inode);
ceph_mdsc_release_dir_caps(req);
}
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index f67025465de0..385d5261632d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2191,6 +2191,48 @@ void ceph_queue_inode_work(struct inode *inode, int work_bit)
}
}
+/**
+ * Queue an asynchronous iput() call in a worker thread. Use this
+ * instead of iput() in contexts where evicting the inode is unsafe.
+ * For example, inode eviction may cause deadlocks in
+ * inode_wait_for_writeback() (when called from within writeback) or
+ * in netfs_wait_for_outstanding_io() (when called from within the
+ * Ceph messenger).
+ *
+ * @n: how many references to put
+ */
+void ceph_iput_n_async(struct inode *inode, int n)
+{
+ if (unlikely(!inode))
+ return;
+
+ if (likely(atomic_sub_return(n, &inode->i_count) > 0))
+ /* somebody else is holding another reference -
+ * nothing left to do for us
+ */
+ return;
+
+ doutc(ceph_inode_to_fs_client(inode)->client, "%p %llx.%llx\n", inode, ceph_vinop(inode));
+
+ /* the reference counter is now 0, i.e. nobody else is holding
+ * a reference to this inode; restore it to 1 and donate it to
+ * ceph_inode_work() which will call iput() at the end
+ */
+ atomic_set(&inode->i_count, 1);
+
+ /* simply queue a ceph_inode_work() without setting
+ * i_work_mask bit; other than putting the reference, there is
+ * nothing to do
+ */
+ WARN_ON_ONCE(!queue_work(ceph_inode_to_fs_client(inode)->inode_wq,
+ &ceph_inode(inode)->i_work));
+
+ /* note: queue_work() cannot fail; it i_work were already
+ * queued, then it would be holding another reference, but no
+ * such reference exists
+ */
+}
+
static void ceph_do_invalidate_pages(struct inode *inode)
{
struct ceph_client *cl = ceph_inode_to_client(inode);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 3bc72b47fe4d..d7fce1ad8073 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1097,14 +1097,14 @@ void ceph_mdsc_release_request(struct kref *kref)
ceph_msg_put(req->r_reply);
if (req->r_inode) {
ceph_put_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
- iput(req->r_inode);
+ ceph_iput_async(req->r_inode);
}
if (req->r_parent) {
ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
- iput(req->r_parent);
+ ceph_iput_async(req->r_parent);
}
- iput(req->r_target_inode);
- iput(req->r_new_inode);
+ ceph_iput_async(req->r_target_inode);
+ ceph_iput_async(req->r_new_inode);
if (req->r_dentry)
dput(req->r_dentry);
if (req->r_old_dentry)
@@ -1118,7 +1118,7 @@ void ceph_mdsc_release_request(struct kref *kref)
*/
ceph_put_cap_refs(ceph_inode(req->r_old_dentry_dir),
CEPH_CAP_PIN);
- iput(req->r_old_dentry_dir);
+ ceph_iput_async(req->r_old_dentry_dir);
}
kfree(req->r_path1);
kfree(req->r_path2);
@@ -1240,7 +1240,7 @@ static void __unregister_request(struct ceph_mds_client *mdsc,
}
if (req->r_unsafe_dir) {
- iput(req->r_unsafe_dir);
+ ceph_iput_async(req->r_unsafe_dir);
req->r_unsafe_dir = NULL;
}
@@ -1413,7 +1413,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
cap = rb_entry(rb_first(&ci->i_caps), struct ceph_cap, ci_node);
if (!cap) {
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
goto random;
}
mds = cap->session->s_mds;
@@ -1422,7 +1422,7 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
cap == ci->i_auth_cap ? "auth " : "", cap);
spin_unlock(&ci->i_ceph_lock);
out:
- iput(inode);
+ ceph_iput_async(inode);
return mds;
random:
@@ -1841,7 +1841,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
spin_unlock(&session->s_cap_lock);
if (last_inode) {
- iput(last_inode);
+ ceph_iput_async(last_inode);
last_inode = NULL;
}
if (old_cap) {
@@ -1874,7 +1874,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
session->s_cap_iterator = NULL;
spin_unlock(&session->s_cap_lock);
- iput(last_inode);
+ ceph_iput_async(last_inode);
if (old_cap)
ceph_put_cap(session->s_mdsc, old_cap);
@@ -1903,8 +1903,8 @@ static int remove_session_caps_cb(struct inode *inode, int mds, void *arg)
wake_up_all(&ci->i_cap_wq);
if (invalidate)
ceph_queue_invalidate(inode);
- while (iputs--)
- iput(inode);
+ if (iputs > 0)
+ ceph_iput_n_async(inode, iputs);
return 0;
}
@@ -1944,7 +1944,7 @@ static void remove_session_caps(struct ceph_mds_session *session)
spin_unlock(&session->s_cap_lock);
inode = ceph_find_inode(sb, vino);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&session->s_cap_lock);
}
@@ -2512,7 +2512,7 @@ static void ceph_cap_unlink_work(struct work_struct *work)
doutc(cl, "on %p %llx.%llx\n", inode,
ceph_vinop(inode));
ceph_check_caps(ci, CHECK_CAPS_FLUSH);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->cap_delay_lock);
}
}
@@ -3933,7 +3933,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
!req->r_reply_info.has_create_ino) {
/* This should never happen on an async create */
WARN_ON_ONCE(req->r_deleg_ino);
- iput(in);
+ ceph_iput_async(in);
in = NULL;
}
@@ -5313,7 +5313,7 @@ static void handle_lease(struct ceph_mds_client *mdsc,
out:
mutex_unlock(&session->s_mutex);
- iput(inode);
+ ceph_iput_async(inode);
ceph_dec_mds_stopping_blocker(mdsc);
return;
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index d90eda19bcc4..bba00f8926e6 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -76,7 +76,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
le64_to_cpu(h->max_files));
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
out:
ceph_dec_mds_stopping_blocker(mdsc);
}
@@ -190,7 +190,7 @@ void ceph_cleanup_quotarealms_inodes(struct ceph_mds_client *mdsc)
node = rb_first(&mdsc->quotarealms_inodes);
qri = rb_entry(node, struct ceph_quotarealm_inode, node);
rb_erase(node, &mdsc->quotarealms_inodes);
- iput(qri->inode);
+ ceph_iput_async(qri->inode);
kfree(qri);
}
mutex_unlock(&mdsc->quotarealms_inodes_mutex);
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index c65f2b202b2b..19f097e79b3c 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -735,7 +735,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
if (!inode)
continue;
spin_unlock(&realm->inodes_with_caps_lock);
- iput(lastinode);
+ ceph_iput_async(lastinode);
lastinode = inode;
/*
@@ -762,7 +762,7 @@ static void queue_realm_cap_snaps(struct ceph_mds_client *mdsc,
spin_lock(&realm->inodes_with_caps_lock);
}
spin_unlock(&realm->inodes_with_caps_lock);
- iput(lastinode);
+ ceph_iput_async(lastinode);
if (capsnap)
kmem_cache_free(ceph_cap_snap_cachep, capsnap);
@@ -955,7 +955,7 @@ static void flush_snaps(struct ceph_mds_client *mdsc)
ihold(inode);
spin_unlock(&mdsc->snap_flush_lock);
ceph_flush_snaps(ci, &session);
- iput(inode);
+ ceph_iput_async(inode);
spin_lock(&mdsc->snap_flush_lock);
}
spin_unlock(&mdsc->snap_flush_lock);
@@ -1116,12 +1116,12 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
ceph_get_snap_realm(mdsc, realm);
ceph_change_snap_realm(inode, realm);
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
continue;
skip_inode:
spin_unlock(&ci->i_ceph_lock);
- iput(inode);
+ ceph_iput_async(inode);
}
/* we may have taken some of the old realm's children. */
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index cf176aab0f82..15c09b6c94aa 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1085,6 +1085,13 @@ static inline void ceph_queue_flush_snaps(struct inode *inode)
ceph_queue_inode_work(inode, CEPH_I_WORK_FLUSH_SNAPS);
}
+void ceph_iput_n_async(struct inode *inode, int n);
+
+static inline void ceph_iput_async(struct inode *inode)
+{
+ ceph_iput_n_async(inode, 1);
+}
+
extern int ceph_try_to_choose_auth_mds(struct inode *inode, int mask);
extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
int mask, bool force);
--
2.47.3
From: Kan Liang <kan.liang(a)linux.intel.com>
[ Upstream commit b0823d5fbacb1c551d793cbfe7af24e0d1fa45ed ]
The perf_fuzzer found a hard-lockup crash on a RaptorLake machine:
Oops: general protection fault, maybe for address 0xffff89aeceab400: 0000
CPU: 23 UID: 0 PID: 0 Comm: swapper/23
Tainted: [W]=WARN
Hardware name: Dell Inc. Precision 9660/0VJ762
RIP: 0010:native_read_pmc+0x7/0x40
Code: cc e8 8d a9 01 00 48 89 03 5b cd cc cc cc cc 0f 1f ...
RSP: 000:fffb03100273de8 EFLAGS: 00010046
....
Call Trace:
<TASK>
icl_update_topdown_event+0x165/0x190
? ktime_get+0x38/0xd0
intel_pmu_read_event+0xf9/0x210
__perf_event_read+0xf9/0x210
CPUs 16-23 are E-core CPUs that don't support the perf metrics feature.
The icl_update_topdown_event() should not be invoked on these CPUs.
It's a regression of commit:
f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
The bug introduced by that commit is that the is_topdown_event() function
is mistakenly used to replace the is_topdown_count() call to check if the
topdown functions for the perf metrics feature should be invoked.
Fix it.
Fixes: f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
Closes: https://lore.kernel.org/lkml/352f0709-f026-cd45-e60c-60dfd97f73f3@maine.edu/
Reported-by: Vince Weaver <vincent.weaver(a)maine.edu>
Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Tested-by: Vince Weaver <vincent.weaver(a)maine.edu>
Cc: stable(a)vger.kernel.org # v6.15+
Link: https://lore.kernel.org/r/20250612143818.2889040-1-kan.liang@linux.intel.com
[ omitted PEBS check ]
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Angel Adetula <angeladetula(a)google.com>
---
arch/x86/events/intel/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5e43d390f7a3..36d8404f406d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2793,7 +2793,7 @@ static void intel_pmu_read_event(struct perf_event *event)
if (pmu_enabled)
intel_pmu_disable_all();
- if (is_topdown_event(event))
+ if (is_topdown_count(event))
static_call(intel_pmu_update_topdown_event)(event);
else
intel_pmu_drain_pebs_buffer();
--
2.51.0.470.ga7dc726c21-goog