Re: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code

27 Aug 2025


On Wed, Aug 27, 2025 at 12:34:44AM -0700, Christoph Hellwig wrote:
...
On Mon, Aug 25, 2025 at 08:34:14AM -0700, Darrick J. Wong wrote:
...
...

	case BLK_STS_NOSPC:
		return -ENOSPC;
	case BLK_STS_OFFLINE:
		return -ENODEV;
	default:
		return -EIO;


Well as I pointed out earlier, one interesting "quality" of the current
behavior is that online fsck captures the ENODATA and turns that into a
metadata corruption report.  I'd like to keep that behavior.

-EIO is just as much of a metadata corruption, so if you only catch
ENODATA you're missing most of them.

Hrmm, well an EIO (or an ENODATA) coming from the block layer causes the
scrub code to return to userspace with EIO, and xfs_scrub will complain
about the IO error and exit.

It doesn't explicitly mark the data structure as corrupt, but scrub
failing should be enough to conclude that the fs is corrupt.

I could patch the kernel to set the CORRUPT flag on the data structure
and keep going, since the likelihood of random bit errors causing media
errors is pretty high now that we have disks that store more than 1e15
bits.
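
For illustration only, a minimal user-space sketch of the "set the CORRUPT
flag and keep going" idea.  Every name here (sketch_scrub,
SKETCH_OFLAG_CORRUPT, sketch_scrub_process_error) is hypothetical and is
not the kernel's scrub code; it only shows the control flow, assuming media
errors arrive as -EIO or -ENODATA.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical output flag; stands in for a "structure is corrupt" bit. */
#define SKETCH_OFLAG_CORRUPT	(1u << 0)

/* Hypothetical scrub context that only carries the output flags. */
struct sketch_scrub {
	unsigned int	flags;
};

/*
 * Decide what to do with an error seen while reading metadata.
 * Returns true if the caller may keep scanning, false if it must stop
 * and hand *error back to userspace.
 */
static bool sketch_scrub_process_error(struct sketch_scrub *sc, int *error)
{
	switch (*error) {
	case 0:
		return true;
	case -EIO:
	case -ENODATA:
		/* Media error: record corruption instead of aborting. */
		sc->flags |= SKETCH_OFLAG_CORRUPT;
		*error = 0;
		return true;
	default:
		/* Anything else (e.g. -ENOMEM) is still a hard failure. */
		return false;
	}
}
```

With this shape a media error no longer ends the scan; the caller sees
error == 0 and finds the corruption flag set when it reports results.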
...
...
...
 	if (bio->bi_status)
-		xfs_buf_ioerror(bp, blk_status_to_errno(bio->bi_status));
+		xfs_buf_ioerror(bp, xfs_buf_bio_status(bio));
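
As a rough illustration of the mapping being discussed, here is a
self-contained user-space sketch.  The enum and the function name are
hypothetical stand-ins, not the kernel's blk_status_t or the helper from
the patch; the point is only that a media error, which the block layer
would otherwise report as -ENODATA, falls through to -EIO.

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for block layer status codes.  The numeric
 * values are arbitrary and only serve this sketch.
 */
enum sketch_blk_status {
	SKETCH_STS_OK = 0,
	SKETCH_STS_NOSPC,
	SKETCH_STS_MEDIUM,
	SKETCH_STS_OFFLINE,
	SKETCH_STS_IOERR,
};

/* Map a completion status to the errno stored on the buffer. */
static int sketch_buf_status_to_errno(enum sketch_blk_status status)
{
	switch (status) {
	case SKETCH_STS_OK:
		return 0;
	case SKETCH_STS_NOSPC:
		return -ENOSPC;
	case SKETCH_STS_OFFLINE:
		return -ENODEV;
	default:
		/* Media errors land here, so -ENODATA never escapes. */
		return -EIO;
	}
}
```

Callers that treat -ENODATA as "attribute not found" can then no longer
confuse a failed disk read with a missing xattr.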


I think you'd also want to wrap all the submit_bio_wait here too, right?

Hrm, only discard bios, log writes, and zonegc use that function.  Maybe
not?  I think a failed log write takes down the system no matter what
error code, nobody cares about failing discard, and I think zonegc write
failures just lead to the gc ... aborting?

Yes.  In Linux -EIO means an unrecoverable I/O error that the lower
layers gave up retrying. Not much we can do about that.

<nod>

--D
