From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit 99f9a97dce39ad413c39b92c90393bbd6778f3fd ]
An infinite loop may occur if the following conditions occur due to file system corruption.
(1) Condition for exfat_count_dir_entries() to loop infinitely.
    - The cluster chain includes a loop.
    - There is no UNUSED entry in the cluster chain.
(2) Condition for exfat_create_upcase_table() to loop infinitely.
    - The cluster chain of the root directory includes a loop.
    - There is no UNUSED entry and no up-case table entry in the cluster chain of the root directory.
(3) Condition for exfat_load_bitmap() to loop infinitely.
    - The cluster chain of the root directory includes a loop.
    - There is no UNUSED entry and no bitmap entry in the cluster chain of the root directory.
(4) Condition for exfat_find_dir_entry() to loop infinitely.
    - The cluster chain includes a loop.
    - The unused directory entries were exhausted by some operation.
(5) Condition for exfat_check_dir_empty() to loop infinitely.
    - The cluster chain includes a loop.
    - The unused directory entries were exhausted by some operation.
    - All files and sub-directories under the directory are deleted.
This commit adds checks to break the above infinite loops.
Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com
Signed-off-by: Namjae Jeon linkinjeon@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the following critical reasons:
## Security Impact - Prevents Multiple Infinite Loop Vulnerabilities
The commit fixes **five distinct infinite loop conditions** that can occur due to filesystem corruption, each representing a potential denial-of-service vulnerability:
1. **exfat_count_dir_entries()** - Adds loop detection using `sbi->used_clusters` as a bound
2. **exfat_create_upcase_table()** - Addressed through root directory chain validation
3. **exfat_load_bitmap()** - Addressed through root directory chain validation
4. **exfat_find_dir_entry()** - Adds loop detection using `EXFAT_DATA_CLUSTER_COUNT(sbi)`
5. **exfat_check_dir_empty()** - Adds loop detection using `EXFAT_DATA_CLUSTER_COUNT(sbi)`
## Critical Bug Fix Characteristics
1. **Fixes Real Security Issues**: The infinite loops can cause system hangs and DoS conditions when mounting corrupted/malicious exFAT filesystems
2. **Small, Contained Changes**: The fix adds simple counter checks (4-5 lines per location) without architectural changes
3. **Clear Root Cause**: Addresses missing validation of cluster chain loops in directory traversal
4. **Pattern of Similar Fixes**: This follows three previous infinite loop fixes in the same subsystem (commits b0522303f672, a5324b3a488d, fee873761bd9), all of which fix similar issues dating back to the original exfat implementation
## Code Analysis Shows Low Risk
The changes are minimal and safe:
- Adds `unsigned int clu_count = 0` declarations
- Increments counter when following cluster chains
- Breaks traversal if counter exceeds valid cluster count
- In `exfat_count_num_clusters()`: adds explicit loop detection with error message
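The counter-based loop detection listed above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the exfat code: `walk_chain`, `EOF_CLUSTER`, and the flat `fat[]` array are hypothetical stand-ins. It assumes a FAT-style table where `fat[i]` names the cluster that follows cluster `i`, and it gives up once more links have been followed than there are clusters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical end-of-chain marker for this sketch only. */
#define EOF_CLUSTER 0xFFFFFFFFu

/*
 * Walk a FAT-style chain starting at `start`. fat[i] holds the cluster
 * that follows cluster i. Returns the number of clusters in the chain,
 * or -1 if an entry points outside the table or if more than
 * `num_clusters` links are followed (which implies a loop).
 */
long walk_chain(const unsigned int *fat, unsigned int num_clusters,
                unsigned int start)
{
    unsigned int clu = start;
    unsigned int clu_count = 0;

    assert(fat != NULL);

    while (clu != EOF_CLUSTER) {
        if (clu >= num_clusters)
            return -1;      /* entry points outside the table */
        if (++clu_count > num_clusters)
            return -1;      /* more hops than clusters: chain loops */
        clu = fat[clu];
    }
    return (long)clu_count;
}
```

A valid chain can never visit more clusters than exist, so the bound costs one comparison per hop and turns an endless walk into an error return.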
## Follows Stable Kernel Rules
✓ Fixes critical bugs (infinite loops/DoS)
✓ Minimal code changes (~50 lines total)
✓ No new features or API changes
✓ Similar fixes already backported (the three previous infinite loop fixes)
✓ Clear error conditions with proper error returns (-EIO)
The commit message explicitly states these are corruption-triggered infinite loops, and the pattern matches previous fixes that have "Fixes:" tags pointing to the original exfat implementation. This is a critical reliability and security fix that prevents system hangs when handling corrupted exFAT filesystems.
 fs/exfat/dir.c    | 12 ++++++++++++
 fs/exfat/fatent.c | 10 ++++++++++
 fs/exfat/namei.c  |  5 +++++
 fs/exfat/super.c  | 32 +++++++++++++++++++++-----------
 4 files changed, 48 insertions(+), 11 deletions(-)
diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c index 3103b932b674..ee060e26f51d 100644 --- a/fs/exfat/dir.c +++ b/fs/exfat/dir.c @@ -996,6 +996,7 @@ int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei, struct exfat_hint_femp candi_empty; struct exfat_sb_info *sbi = EXFAT_SB(sb); int num_entries = exfat_calc_num_entries(p_uniname); + unsigned int clu_count = 0;
if (num_entries < 0) return num_entries; @@ -1133,6 +1134,10 @@ int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei, } else { if (exfat_get_next_cluster(sb, &clu.dir)) return -EIO; + + /* break if the cluster chain includes a loop */ + if (unlikely(++clu_count > EXFAT_DATA_CLUSTER_COUNT(sbi))) + goto not_found; } }
@@ -1195,6 +1200,7 @@ int exfat_count_dir_entries(struct super_block *sb, struct exfat_chain *p_dir) int i, count = 0; int dentries_per_clu; unsigned int entry_type; + unsigned int clu_count = 0; struct exfat_chain clu; struct exfat_dentry *ep; struct exfat_sb_info *sbi = EXFAT_SB(sb); @@ -1227,6 +1233,12 @@ int exfat_count_dir_entries(struct super_block *sb, struct exfat_chain *p_dir) } else { if (exfat_get_next_cluster(sb, &(clu.dir))) return -EIO; + + if (unlikely(++clu_count > sbi->used_clusters)) { + exfat_fs_error(sb, "FAT or bitmap is corrupted"); + return -EIO; + } + } }
diff --git a/fs/exfat/fatent.c b/fs/exfat/fatent.c index 23065f948ae7..232cc7f8ab92 100644 --- a/fs/exfat/fatent.c +++ b/fs/exfat/fatent.c @@ -490,5 +490,15 @@ int exfat_count_num_clusters(struct super_block *sb, }
*ret_count = count; + + /* + * since exfat_count_used_clusters() is not called, sbi->used_clusters + * cannot be used here. + */ + if (unlikely(i == sbi->num_clusters && clu != EXFAT_EOF_CLUSTER)) { + exfat_fs_error(sb, "The cluster chain has a loop"); + return -EIO; + } + return 0; } diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c index fede0283d6e2..f5f1c4e8a29f 100644 --- a/fs/exfat/namei.c +++ b/fs/exfat/namei.c @@ -890,6 +890,7 @@ static int exfat_check_dir_empty(struct super_block *sb, { int i, dentries_per_clu; unsigned int type; + unsigned int clu_count = 0; struct exfat_chain clu; struct exfat_dentry *ep; struct exfat_sb_info *sbi = EXFAT_SB(sb); @@ -926,6 +927,10 @@ static int exfat_check_dir_empty(struct super_block *sb, } else { if (exfat_get_next_cluster(sb, &(clu.dir))) return -EIO; + + /* break if the cluster chain includes a loop */ + if (unlikely(++clu_count > EXFAT_DATA_CLUSTER_COUNT(sbi))) + break; } }
diff --git a/fs/exfat/super.c b/fs/exfat/super.c index 7ed858937d45..3a9ec75ab452 100644 --- a/fs/exfat/super.c +++ b/fs/exfat/super.c @@ -341,13 +341,12 @@ static void exfat_hash_init(struct super_block *sb) INIT_HLIST_HEAD(&sbi->inode_hashtable[i]); }
-static int exfat_read_root(struct inode *inode) +static int exfat_read_root(struct inode *inode, struct exfat_chain *root_clu) { struct super_block *sb = inode->i_sb; struct exfat_sb_info *sbi = EXFAT_SB(sb); struct exfat_inode_info *ei = EXFAT_I(inode); - struct exfat_chain cdir; - int num_subdirs, num_clu = 0; + int num_subdirs;
exfat_chain_set(&ei->dir, sbi->root_dir, 0, ALLOC_FAT_CHAIN); ei->entry = -1; @@ -360,12 +359,9 @@ static int exfat_read_root(struct inode *inode) ei->hint_stat.clu = sbi->root_dir; ei->hint_femp.eidx = EXFAT_HINT_NONE;
- exfat_chain_set(&cdir, sbi->root_dir, 0, ALLOC_FAT_CHAIN); - if (exfat_count_num_clusters(sb, &cdir, &num_clu)) - return -EIO; - i_size_write(inode, num_clu << sbi->cluster_size_bits); + i_size_write(inode, EXFAT_CLU_TO_B(root_clu->size, sbi));
- num_subdirs = exfat_count_dir_entries(sb, &cdir); + num_subdirs = exfat_count_dir_entries(sb, root_clu); if (num_subdirs < 0) return -EIO; set_nlink(inode, num_subdirs + EXFAT_MIN_SUBDIR); @@ -578,7 +574,8 @@ static int exfat_verify_boot_region(struct super_block *sb) }
/* mount the file system volume */ -static int __exfat_fill_super(struct super_block *sb) +static int __exfat_fill_super(struct super_block *sb, + struct exfat_chain *root_clu) { int ret; struct exfat_sb_info *sbi = EXFAT_SB(sb); @@ -595,6 +592,18 @@ static int __exfat_fill_super(struct super_block *sb) goto free_bh; }
+ /* + * Call exfat_count_num_cluster() before searching for up-case and + * bitmap directory entries to avoid infinite loop if they are missing + * and the cluster chain includes a loop. + */ + exfat_chain_set(root_clu, sbi->root_dir, 0, ALLOC_FAT_CHAIN); + ret = exfat_count_num_clusters(sb, root_clu, &root_clu->size); + if (ret) { + exfat_err(sb, "failed to count the number of clusters in root"); + goto free_bh; + } + ret = exfat_create_upcase_table(sb); if (ret) { exfat_err(sb, "failed to load upcase table"); @@ -627,6 +636,7 @@ static int exfat_fill_super(struct super_block *sb, struct fs_context *fc) struct exfat_sb_info *sbi = sb->s_fs_info; struct exfat_mount_options *opts = &sbi->options; struct inode *root_inode; + struct exfat_chain root_clu; int err;
if (opts->allow_utime == (unsigned short)-1) @@ -645,7 +655,7 @@ static int exfat_fill_super(struct super_block *sb, struct fs_context *fc) sb->s_time_min = EXFAT_MIN_TIMESTAMP_SECS; sb->s_time_max = EXFAT_MAX_TIMESTAMP_SECS;
- err = __exfat_fill_super(sb); + err = __exfat_fill_super(sb, &root_clu); if (err) { exfat_err(sb, "failed to recognize exfat type"); goto check_nls_io; @@ -680,7 +690,7 @@ static int exfat_fill_super(struct super_block *sb, struct fs_context *fc)
root_inode->i_ino = EXFAT_ROOT_INO; inode_set_iversion(root_inode, 1); - err = exfat_read_root(root_inode); + err = exfat_read_root(root_inode, &root_clu); if (err) { exfat_err(sb, "failed to initialize root inode"); goto put_inode;
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit e23ab8028de0d92df5921a570f5212c0370db3b5 ]
Let's return errors caught by the generic checks. This fixes generic/494 where it expects to see EBUSY by setattr_prepare instead of EINVAL by f2fs for active swapfile.
Reviewed-by: Chao Yu chao@kernel.org
Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
## Bug Fix Analysis
1. **Fixes a real bug affecting users**: The commit fixes incorrect error code handling for swapfiles. When attempting to modify an active swapfile, f2fs was returning `-EINVAL` instead of the correct `-EBUSY` error code. This breaks userspace expectations and causes test failures in `generic/494`.
2. **Small and contained fix**: The change is minimal - it simply reorders the error checking sequence in `f2fs_setattr()` to call generic checks (`setattr_prepare`, `fscrypt_prepare_setattr`, `fsverity_prepare_setattr`) before f2fs-specific checks. The code movement involves only 12 lines being relocated within the same function.
## Technical Details
The commit moves the three generic preparation calls from after the f2fs-specific checks (the hunk at old line 1065) to before them (the hunk at line 1047), directly after the checkpoint-error check. This ensures that:

- `setattr_prepare()` gets called first, which contains the `IS_SWAPFILE()` check that returns `-ETXTBSY` (which gets translated to `-EBUSY`)
- The generic VFS layer error codes are returned consistently with other filesystems
- F2fs-specific validation (like compression, pinned file checks) only happens after generic validation passes
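Why the order of checks changes the error seen by userspace can be shown with a small user-space sketch. Everything here is hypothetical (`struct fake_inode`, `generic_prepare`, `fs_specific_check`); the `-EBUSY` value follows the wording of the commit message, and `-EINVAL` stands in for the f2fs-specific rejection. It is not the f2fs code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode with two independent reasons to reject setattr. */
struct fake_inode {
    bool is_swapfile;   /* rejected by the generic check */
    bool fs_rejects;    /* rejected by the filesystem-specific check */
};

/* Stand-in for the generic checks (assumed to report "busy"). */
int generic_prepare(const struct fake_inode *inode)
{
    assert(inode != NULL);
    return inode->is_swapfile ? -EBUSY : 0;
}

/* Stand-in for a filesystem-specific validation step. */
int fs_specific_check(const struct fake_inode *inode)
{
    assert(inode != NULL);
    return inode->fs_rejects ? -EINVAL : 0;
}

/* Old order: the filesystem-specific error masks the generic one. */
int setattr_old_order(const struct fake_inode *inode)
{
    int err = fs_specific_check(inode);

    if (err)
        return err;
    return generic_prepare(inode);
}

/* New order: the generic error is reported first. */
int setattr_new_order(const struct fake_inode *inode)
{
    int err = generic_prepare(inode);

    if (err)
        return err;
    return fs_specific_check(inode);
}
```

When both conditions hold, the first failing check decides the return value, so reordering alone is enough to change which error a test such as `generic/494` observes.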
## Risk Assessment
1. **Minimal regression risk**: The change only reorders existing checks without adding new logic or modifying the checks themselves. All the same validation still occurs, just in a different order.
2. **Follows stable tree rules**: This is a clear bugfix that:
   - Fixes incorrect error reporting to userspace
   - Makes f2fs behavior consistent with VFS expectations
   - Fixes a specific test case (`generic/494`) that validates correct swapfile handling
   - Has no feature additions or architectural changes
3. **Limited scope**: The change is confined to a single function in the f2fs subsystem and doesn't affect any other kernel components.
4. **Already reviewed**: The commit has been reviewed by a subsystem maintainer (Chao Yu) and merged by the f2fs maintainer (Jaegeuk Kim).
The incorrect error code could potentially confuse userspace applications that rely on specific error codes to determine why an operation failed. Returning `-EINVAL` (invalid argument) instead of `-EBUSY` (resource busy) for an active swapfile is semantically incorrect and breaks POSIX compliance expectations.
 fs/f2fs/file.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 696131e655ed..bb3fd6a8416f 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1047,6 +1047,18 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 	if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
 		return -EIO;
 
+	err = setattr_prepare(idmap, dentry, attr);
+	if (err)
+		return err;
+
+	err = fscrypt_prepare_setattr(dentry, attr);
+	if (err)
+		return err;
+
+	err = fsverity_prepare_setattr(dentry, attr);
+	if (err)
+		return err;
+
 	if (unlikely(IS_IMMUTABLE(inode)))
 		return -EPERM;
 
@@ -1065,18 +1077,6 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 			return -EINVAL;
 	}
 
-	err = setattr_prepare(idmap, dentry, attr);
-	if (err)
-		return err;
-
-	err = fscrypt_prepare_setattr(dentry, attr);
-	if (err)
-		return err;
-
-	err = fsverity_prepare_setattr(dentry, attr);
-	if (err)
-		return err;
-
 	if (is_quota_modification(idmap, inode, attr)) {
 		err = f2fs_dquot_initialize(inode);
 		if (err)
From: "fangzhong.zhou" myth5@myth5.com
[ Upstream commit 0b7c9528facdb5a73ad78fea86d2e95a6c48dbc4 ]
This patch fixes an issue where the touchpad cursor movement becomes slow on the Dell Precision 5560. Force the touchpad freq to 100khz as a workaround.
Tested on Dell Precision 5560 with 6.14 to 6.14.6. Cursor movement is now smooth and responsive.
Signed-off-by: fangzhong.zhou myth5@myth5.com
[wsa: kept sorting and removed unnecessary parts from commit msg]
Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
## Analysis Supporting Backport:
### 1. **Fixes a Real User-Visible Bug**
The commit fixes a specific hardware malfunction where the touchpad cursor movement becomes slow on Dell Precision 5560 laptops. This is a concrete usability issue that affects end users' ability to use their hardware properly.

### 2. **Minimal and Contained Change**
The change is extremely minimal - it only adds a single line `{ "DLL0945", 0 },` to an existing array `i2c_acpi_force_100khz_device_ids[]`. This is a purely additive change that:
- Doesn't modify any existing logic
- Only affects devices with the specific ACPI ID "DLL0945"
- Cannot cause regressions on other hardware

### 3. **Follows Established Pattern**
The code shows this is a well-established pattern for hardware quirks:
- The `i2c_acpi_force_100khz_device_ids[]` array already exists with the same purpose
- A nearly identical fix was just added for ELAN06FA touchpad (commit bfd74cd1fbc0)
- Similar quirk arrays exist for forcing 400KHz (i2c_acpi_force_400khz_device_ids)
- Previous similar fixes have been marked for stable (e.g., commit 7574c0db2e68 for Silead touchscreens included `Cc: stable@kernel.org`)

### 4. **Hardware-Specific Workaround**
This is a hardware-specific workaround that:
- Only triggers for Dell devices with the DLL0945 touchpad
- Forces I2C bus speed to 100KHz to work around a hardware/firmware issue
- Has been tested on the affected hardware (Dell Precision 5560 with kernels 6.14 to 6.14.6)

### 5. **No Architecture Changes**
The commit:
- Uses existing infrastructure (the quirk array mechanism)
- Doesn't introduce new features
- Doesn't change any APIs or interfaces
- Simply adds one more device ID to an existing workaround list

### 6. **Low Risk of Regression**
The change has minimal regression risk because:
- It only affects devices with the specific ACPI ID
- The mechanism is already proven with other devices
- The fix is isolated to I2C bus speed negotiation for one specific touchpad model
- If the device ID doesn't match, the code path is never executed

### 7. **Consistent with Stable Kernel Rules**
This fix aligns perfectly with stable kernel criteria:
- Fixes a real bug that bothers users (slow touchpad cursor)
- Is obviously correct and tested
- Is small (1 line addition)
- Doesn't add new features
- Fixes only one specific issue
The commit follows the exact same pattern as previous touchpad I2C frequency quirks that have been successfully backported to stable kernels, making it a clear candidate for stable tree inclusion.
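The quirk-table mechanism described above can be sketched in user space. This is a simplified illustration, not the kernel's ACPI matching code: `struct quirk_id`, `force_100khz_ids`, and `bus_speed_for` are hypothetical names, and matching is reduced to an exact string comparison against a sentinel-terminated table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one hardware ID per quirk. */
struct quirk_id {
    const char *hid;
};

/* Sentinel-terminated list of IDs that must run at 100 kHz. */
static const struct quirk_id force_100khz_ids[] = {
    { "DLL0945" },
    { "ELAN06FA" },
    { NULL }            /* end of table */
};

/*
 * Return the bus speed to use for a device: 100000 Hz if its ID is
 * listed in the quirk table, otherwise the caller's default.
 */
unsigned int bus_speed_for(const char *hid, unsigned int default_hz)
{
    const struct quirk_id *id;

    assert(hid != NULL);

    for (id = force_100khz_ids; id->hid != NULL; id++) {
        if (strcmp(id->hid, hid) == 0)
            return 100000u;
    }
    return default_hz;
}
```

Adding a device to such a table changes behaviour only for IDs that match exactly, which is why a one-line addition of this kind carries little risk for other hardware.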
 drivers/i2c/i2c-core-acpi.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c
index d2499f302b50..f43067f6797e 100644
--- a/drivers/i2c/i2c-core-acpi.c
+++ b/drivers/i2c/i2c-core-acpi.c
@@ -370,6 +370,7 @@ static const struct acpi_device_id i2c_acpi_force_100khz_device_ids[] = {
 	 * the device works without issues on Windows at what is expected to be
 	 * a 400KHz frequency. The root cause of the issue is not known.
 	 */
+	{ "DLL0945", 0 },
 	{ "ELAN06FA", 0 },
 	{}
 };
From: John Ogness john.ogness@linutronix.de
[ Upstream commit 571c1ea91a73db56bd94054fabecd0f070dc90db ]
If a console printer is interrupted during panic, it will never be able to reacquire ownership in order to perform any cleanup. That in itself is not a problem, since the non-panic CPU will simply quiesce in an endless loop within nbcon_reacquire_nobuf().
However, in this state, platforms that do not support a true NMI to interrupt the quiesced CPU will not be able to shutdown that CPU from within panic(). This then causes problems such as being unable to load and run a kdump kernel.
Fix this by allowing non-panic CPUs to reacquire ownership using a direct acquire. Then the non-panic CPUs can successfully exit the nbcon_reacquire_nobuf() loop and the console driver can perform any necessary cleanup. But more importantly, the CPU is no longer quiesced and is free to process any interrupts necessary for panic() to shutdown the CPU.
All other forms of acquire are still not allowed for non-panic CPUs since it is safer to have them avoid gaining console ownership that is not strictly necessary.
Reported-by: Michael Kelley mhklinux@outlook.com
Closes: https://lore.kernel.org/r/SN6PR02MB4157A4C5E8CB219A75263A17D46DA@SN6PR02MB41...
Signed-off-by: John Ogness john.ogness@linutronix.de
Reviewed-by: Petr Mladek pmladek@suse.com
Tested-by: Michael Kelley mhklinux@outlook.com
Link: https://patch.msgid.link/20250606185549.900611-1-john.ogness@linutronix.de
Signed-off-by: Petr Mladek pmladek@suse.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
## Bug Fix for Critical System Functionality
1. **Fixes a real bug affecting kdump functionality**: The commit addresses a specific problem where platforms without true NMI support cannot properly shutdown CPUs during panic, preventing kdump kernels from loading. This is a critical debugging and recovery feature that many production systems rely on.
2. **Clear regression/breakage scenario**: The commit message clearly describes how the current behavior causes non-panic CPUs to get stuck in an endless loop in `nbcon_reacquire_nobuf()`, preventing proper CPU shutdown during panic. This is a functional regression that affects system reliability.
## Safe and Contained Fix
3. **Minimal and targeted change**: The fix is confined to the nbcon (new console) subsystem, specifically modifying only the acquire logic to allow reacquire during panic. The diff shows only 41 insertions and 22 deletions, mostly adding a `is_reacquire` parameter to existing functions.
4. **No architectural changes**: The commit doesn't introduce new features or change the fundamental design. It merely adjusts the existing acquire logic to handle a specific edge case during panic.
5. **Conservative approach**: The fix maintains safety by:
   - Only allowing direct reacquire for non-panic CPUs (not all acquire types)
   - Preserving the check for `unsafe_takeover` state
   - Keeping all other panic-time restrictions in place
## Well-Tested and Reviewed
6. **Proper testing and review**: The commit has been:
   - Reported by Michael Kelley with a specific reproducer
   - Reviewed by Petr Mladek (printk maintainer)
   - Tested by the original reporter
   - Already included upstream (commit 571c1ea91a73db56bd94054fabecd0f070dc90db)
## Code Analysis
The key changes in `nbcon_context_try_acquire_direct()`:
```c
-static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
-					    struct nbcon_state *cur)
+static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
+					    struct nbcon_state *cur, bool is_reacquire)
```

And the critical logic change:
```c
-if (other_cpu_in_panic())
+if (other_cpu_in_panic() &&
+    (!is_reacquire || cur->unsafe_takeover)) {
 	return -EPERM;
+}
```
This allows reacquire during panic only when it's a genuine reacquire attempt and no unsafe takeover has occurred, which is a safe and necessary exception to handle the described bug.
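The new condition is easy to misread, so here is the predicate in isolation as a user-space sketch. The function name `direct_acquire_panic_gate` and its plain boolean parameters are hypothetical; they stand in for `other_cpu_in_panic()`, `is_reacquire`, and `cur->unsafe_takeover` from the diff. Only the panic gate is modeled, not the priority checks that follow it in the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the panic gate at the top of the direct-acquire path:
 * returns -EPERM if the acquire must be refused because another CPU
 * is in panic, 0 if the gate lets the attempt continue.
 */
int direct_acquire_panic_gate(bool other_cpu_in_panic, bool is_reacquire,
                              bool unsafe_takeover)
{
    if (other_cpu_in_panic && (!is_reacquire || unsafe_takeover))
        return -EPERM;
    return 0;
}

/* Enumerates the cases that matter while another CPU is in panic. */
void direct_acquire_panic_gate_examples(void)
{
    /* No panic elsewhere: the gate never blocks. */
    assert(direct_acquire_panic_gate(false, false, false) == 0);
    /* Panic elsewhere, ordinary acquire: refused, as before the patch. */
    assert(direct_acquire_panic_gate(true, false, false) == -EPERM);
    /* Panic elsewhere, reacquire, no unsafe takeover: now allowed. */
    assert(direct_acquire_panic_gate(true, true, false) == 0);
    /* Panic elsewhere, reacquire after an unsafe takeover: refused. */
    assert(direct_acquire_panic_gate(true, true, true) == -EPERM);
}
```

Of the combinations with another CPU in panic, only a reacquire without a prior unsafe takeover passes the gate; every other case still returns `-EPERM`.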
The commit follows stable kernel rules by fixing an important bug with minimal risk and without introducing new features.
 kernel/printk/nbcon.c | 63 ++++++++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 22 deletions(-)
diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c index fd12efcc4aed..e7a3af81b173 100644 --- a/kernel/printk/nbcon.c +++ b/kernel/printk/nbcon.c @@ -214,8 +214,9 @@ static void nbcon_seq_try_update(struct nbcon_context *ctxt, u64 new_seq)
/** * nbcon_context_try_acquire_direct - Try to acquire directly - * @ctxt: The context of the caller - * @cur: The current console state + * @ctxt: The context of the caller + * @cur: The current console state + * @is_reacquire: This acquire is a reacquire * * Acquire the console when it is released. Also acquire the console when * the current owner has a lower priority and the console is in a safe state. @@ -225,17 +226,17 @@ static void nbcon_seq_try_update(struct nbcon_context *ctxt, u64 new_seq) * * Errors: * - * -EPERM: A panic is in progress and this is not the panic CPU. - * Or the current owner or waiter has the same or higher - * priority. No acquire method can be successful in - * this case. + * -EPERM: A panic is in progress and this is neither the panic + * CPU nor is this a reacquire. Or the current owner or + * waiter has the same or higher priority. No acquire + * method can be successful in these cases. * * -EBUSY: The current owner has a lower priority but the console * in an unsafe state. The caller should try using * the handover acquire method. */ static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt, - struct nbcon_state *cur) + struct nbcon_state *cur, bool is_reacquire) { unsigned int cpu = smp_processor_id(); struct console *con = ctxt->console; @@ -243,14 +244,20 @@ static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
do { /* - * Panic does not imply that the console is owned. However, it - * is critical that non-panic CPUs during panic are unable to - * acquire ownership in order to satisfy the assumptions of - * nbcon_waiter_matches(). In particular, the assumption that - * lower priorities are ignored during panic. + * Panic does not imply that the console is owned. However, + * since all non-panic CPUs are stopped during panic(), it + * is safer to have them avoid gaining console ownership. + * + * If this acquire is a reacquire (and an unsafe takeover + * has not previously occurred) then it is allowed to attempt + * a direct acquire in panic. This gives console drivers an + * opportunity to perform any necessary cleanup if they were + * interrupted by the panic CPU while printing. */ - if (other_cpu_in_panic()) + if (other_cpu_in_panic() && + (!is_reacquire || cur->unsafe_takeover)) { return -EPERM; + }
if (ctxt->prio <= cur->prio || ctxt->prio <= cur->req_prio) return -EPERM; @@ -301,8 +308,9 @@ static bool nbcon_waiter_matches(struct nbcon_state *cur, int expected_prio) * Event #1 implies this context is EMERGENCY. * Event #2 implies the new context is PANIC. * Event #3 occurs when panic() has flushed the console. - * Events #4 and #5 are not possible due to the other_cpu_in_panic() - * check in nbcon_context_try_acquire_direct(). + * Event #4 occurs when a non-panic CPU reacquires. + * Event #5 is not possible due to the other_cpu_in_panic() check + * in nbcon_context_try_acquire_handover(). */
return (cur->req_prio == expected_prio); @@ -431,6 +439,16 @@ static int nbcon_context_try_acquire_handover(struct nbcon_context *ctxt, WARN_ON_ONCE(ctxt->prio <= cur->prio || ctxt->prio <= cur->req_prio); WARN_ON_ONCE(!cur->unsafe);
+ /* + * Panic does not imply that the console is owned. However, it + * is critical that non-panic CPUs during panic are unable to + * wait for a handover in order to satisfy the assumptions of + * nbcon_waiter_matches(). In particular, the assumption that + * lower priorities are ignored during panic. + */ + if (other_cpu_in_panic()) + return -EPERM; + /* Handover is not possible on the same CPU. */ if (cur->cpu == cpu) return -EBUSY; @@ -558,7 +576,8 @@ static struct printk_buffers panic_nbcon_pbufs;
/** * nbcon_context_try_acquire - Try to acquire nbcon console - * @ctxt: The context of the caller + * @ctxt: The context of the caller + * @is_reacquire: This acquire is a reacquire * * Context: Under @ctxt->con->device_lock() or local_irq_save(). * Return: True if the console was acquired. False otherwise. @@ -568,7 +587,7 @@ static struct printk_buffers panic_nbcon_pbufs; * in an unsafe state. Otherwise, on success the caller may assume * the console is not in an unsafe state. */ -static bool nbcon_context_try_acquire(struct nbcon_context *ctxt) +static bool nbcon_context_try_acquire(struct nbcon_context *ctxt, bool is_reacquire) { unsigned int cpu = smp_processor_id(); struct console *con = ctxt->console; @@ -577,7 +596,7 @@ static bool nbcon_context_try_acquire(struct nbcon_context *ctxt)
nbcon_state_read(con, &cur); try_again: - err = nbcon_context_try_acquire_direct(ctxt, &cur); + err = nbcon_context_try_acquire_direct(ctxt, &cur, is_reacquire); if (err != -EBUSY) goto out;
@@ -913,7 +932,7 @@ void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt) { struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
- while (!nbcon_context_try_acquire(ctxt)) + while (!nbcon_context_try_acquire(ctxt, true)) cpu_relax();
nbcon_write_context_set_buf(wctxt, NULL, 0); @@ -1101,7 +1120,7 @@ static bool nbcon_emit_one(struct nbcon_write_context *wctxt, bool use_atomic) cant_migrate(); }
- if (!nbcon_context_try_acquire(ctxt)) + if (!nbcon_context_try_acquire(ctxt, false)) goto out;
/* @@ -1486,7 +1505,7 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq, ctxt->prio = nbcon_get_default_prio(); ctxt->allow_unsafe_takeover = allow_unsafe_takeover;
- if (!nbcon_context_try_acquire(ctxt)) + if (!nbcon_context_try_acquire(ctxt, false)) return -EPERM;
while (nbcon_seq_read(con) < stop_seq) { @@ -1762,7 +1781,7 @@ bool nbcon_device_try_acquire(struct console *con) ctxt->console = con; ctxt->prio = NBCON_PRIO_NORMAL;
- if (!nbcon_context_try_acquire(ctxt)) + if (!nbcon_context_try_acquire(ctxt, false)) return false;
if (!nbcon_context_enter_unsafe(ctxt))
From: Chao Yu chao@kernel.org
[ Upstream commit 1773f63d108b1b9b9d053d8c95f8300c556f93b8 ]
F2FS-fs (dm-55): access invalid blkaddr:972878540
Call trace:
 dump_backtrace+0xec/0x128
 show_stack+0x18/0x28
 dump_stack_lvl+0x40/0x88
 dump_stack+0x18/0x24
 __f2fs_is_valid_blkaddr+0x360/0x3b4
 f2fs_is_valid_blkaddr+0x10/0x20
 f2fs_get_node_info+0x21c/0x60c
 __write_node_page+0x15c/0x734
 f2fs_sync_node_pages+0x4f8/0x700
 f2fs_write_checkpoint+0x4a8/0x99c
 __checkpoint_and_complete_reqs+0x7c/0x20c
 issue_checkpoint_thread+0x4c/0xd8
 kthread+0x11c/0x1b0
 ret_from_fork+0x10/0x20
If nat.blkaddr is corrupted, during checkpoint, f2fs_sync_node_pages() will loop to flush node page w/ corrupted nat.blkaddr.
Although, it tags SBI_NEED_FSCK, checkpoint can not persist it due to deadloop.
Let's call f2fs_handle_error(, ERROR_INCONSISTENT_NAT) to record such error into superblock, it expects fsck can detect the error and repair inconsistent nat.blkaddr after device reboot.
Note that, let's add sanity check in f2fs_get_node_info() to detect in-memory nat.blkaddr inconsistency, but only if CONFIG_F2FS_CHECK_FS is enabled.
Signed-off-by: Chao Yu chao@kernel.org
Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Critical Bug Fix**: This fixes a **deadloop/infinite loop condition** that can occur during checkpoint operations when NAT (Node Address Table) entries have corrupted block addresses. The commit message clearly shows a real-world stack trace where `f2fs_sync_node_pages()` enters an infinite loop trying to flush nodes with corrupted nat.blkaddr during checkpoint.
2. **Data Integrity and System Availability**: The deadloop prevents the checkpoint from completing, which means:
   - The filesystem cannot persist the `SBI_NEED_FSCK` flag that would trigger repair on next mount
   - The system may hang or become unresponsive during checkpoint operations
   - Filesystem metadata corruption cannot be properly handled

3. **Small and Contained Fix**: The patch is relatively small (21 insertions, 8 deletions) and focused on a single function `f2fs_get_node_info()`. The changes are:
   - Move the sanity check earlier in the code path
   - Add proper error handling with `f2fs_handle_error()` and `ERROR_INCONSISTENT_NAT`
   - Return `-EFSCORRUPTED` instead of `-EFAULT` for better error semantics
   - Add conditional caching logic to avoid caching corrupted entries

4. **Low Risk of Regression**:
   - The main sanity check logic already existed but was incomplete
   - The new error handling path only triggers when corruption is detected
   - The CONFIG_F2FS_CHECK_FS conditional check adds extra validation only when debugging is enabled
   - The change doesn't alter normal operation flow for valid NAT entries

5. **Follows Stable Rules**: This fix meets stable kernel criteria:
   - Fixes a real bug that users have hit (stack trace provided)
   - The fix is minimal and targeted
   - No new features are introduced
   - The risk of regression is low

6. **Corruption Handling**: The commit properly handles filesystem corruption by:
   - Setting the `SBI_NEED_FSCK` flag
   - Recording the error in the superblock via `f2fs_handle_error()`
   - Providing detailed error logging for debugging
   - Returning appropriate error codes to prevent further damage
The deadloop condition this fixes is particularly severe as it can make the system unresponsive and prevent proper error recovery, making this an important candidate for stable backporting.
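The "validate, record the error, then cache only if asked" flow can be sketched in user space as follows. This is a loose model, not the f2fs implementation: `struct toy_fs`, `toy_get_node_info`, and `ERR_CORRUPTED` are hypothetical, address 0 is assumed to mean "not allocated", and the valid range is a simple half-open interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error value standing in for -EFSCORRUPTED. */
#define ERR_CORRUPTED (-1000)

/* Hypothetical filesystem state for this sketch. */
struct toy_fs {
    uint32_t main_start;    /* first valid block address */
    uint32_t main_end;      /* one past the last valid block address */
    bool need_fsck;         /* set when corruption is detected */
    unsigned int cached;    /* number of entries added to the cache */
};

/*
 * Validate a block address before using or caching it. Address 0 is
 * treated as "not allocated" and skipped by the range check. On a bad
 * address the error is recorded and nothing is cached.
 */
int toy_get_node_info(struct toy_fs *fs, uint32_t blkaddr, bool need_cache)
{
    assert(fs != NULL);

    if (blkaddr != 0 &&
        (blkaddr < fs->main_start || blkaddr >= fs->main_end)) {
        fs->need_fsck = true;   /* remember that a repair is needed */
        return ERR_CORRUPTED;
    }

    if (need_cache)
        fs->cached++;           /* only valid entries reach the cache */
    return 0;
}
```

With a valid range of [100, 1000), the address from the trace above (972878540) is rejected and recorded, while an in-range address is cached only when `need_cache` is true.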
 fs/f2fs/node.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index bfe104db284e..2fd287f2bca4 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -555,8 +555,8 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, struct f2fs_nat_entry ne; struct nat_entry *e; pgoff_t index; - block_t blkaddr; int i; + bool need_cache = true;
ni->flag = 0; ni->nid = nid; @@ -569,6 +569,10 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, ni->blk_addr = nat_get_blkaddr(e); ni->version = nat_get_version(e); f2fs_up_read(&nm_i->nat_tree_lock); + if (IS_ENABLED(CONFIG_F2FS_CHECK_FS)) { + need_cache = false; + goto sanity_check; + } return 0; }
@@ -594,7 +598,7 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, up_read(&curseg->journal_rwsem); if (i >= 0) { f2fs_up_read(&nm_i->nat_tree_lock); - goto cache; + goto sanity_check; }
/* Fill node_info from nat page */ @@ -609,14 +613,23 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, ne = nat_blk->entries[nid - start_nid]; node_info_from_raw_nat(ni, &ne); f2fs_folio_put(folio, true); -cache: - blkaddr = le32_to_cpu(ne.block_addr); - if (__is_valid_data_blkaddr(blkaddr) && - !f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE)) - return -EFAULT; +sanity_check: + if (__is_valid_data_blkaddr(ni->blk_addr) && + !f2fs_is_valid_blkaddr(sbi, ni->blk_addr, + DATA_GENERIC_ENHANCE)) { + set_sbi_flag(sbi, SBI_NEED_FSCK); + f2fs_err_ratelimited(sbi, + "f2fs_get_node_info of %pS: inconsistent nat entry, " + "ino:%u, nid:%u, blkaddr:%u, ver:%u, flag:%u", + __builtin_return_address(0), + ni->ino, ni->nid, ni->blk_addr, ni->version, ni->flag); + f2fs_handle_error(sbi, ERROR_INCONSISTENT_NAT); + return -EFSCORRUPTED; + }
/* cache nat entry */ - cache_nat_entry(sbi, nid, &ne); + if (need_cache) + cache_nat_entry(sbi, nid, &ne); return 0; }
linux-stable-mirror@lists.linaro.org