In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org --- fs/f2fs/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 01b9e6f85f6b..66e407fcefd3 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,10 +719,10 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
- if (err == -ENOMEM) { + if (err == -ENOMEM || err == -EIO) { cond_resched(); goto retry; - } else if (err != -ENOENT) { + } else if (err != -ENOENT || f2fs_cp_error(sbi)) { f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_UPDATE_INODE); }
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org ---
Change log from v1: - fix a bug
fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
- if (err == -ENOMEM) { + if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry; } else if (err != -ENOENT) {
On 2023/1/11 9:20, Jaegeuk Kim wrote:
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
f2fs_handle_page_eio() only covers the case that EIO occurs on the same page, should we cover the case EIO occurs on different pages?
Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Change log from v1:
- fix a bug
fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
if (err == -ENOMEM) {
} else if (err != -ENOENT) {if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry;
On 01/11, Chao Yu wrote:
On 2023/1/11 9:20, Jaegeuk Kim wrote:
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
f2fs_handle_page_eio() only covers the case that EIO occurs on the same page, should we cover the case EIO occurs on different pages?
Which case are you looking at?
Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Change log from v1:
- fix a bug
fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
if (err == -ENOMEM) {
} else if (err != -ENOENT) {if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry;
On 2023/1/12 2:50, Jaegeuk Kim wrote:
On 01/11, Chao Yu wrote:
On 2023/1/11 9:20, Jaegeuk Kim wrote:
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
f2fs_handle_page_eio() only covers the case that EIO occurs on the same page, should we cover the case EIO occurs on different pages?
Which case are you looking at?
- __get_node_page(PageA) - __get_node_page(PageB) - f2fs_handle_page_eio - sbi->page_eio_ofs[type] = PageA->index - f2fs_handle_page_eio - sbi->page_eio_ofs[type] = PageB->index
In such race case, it may has low probability to set CP_ERROR_FLAG as we expect?
Thanks,
Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Change log from v1: - fix a bug
fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
if (err == -ENOMEM) {
if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry; } else if (err != -ENOENT) {
On 01/12, Chao Yu wrote:
On 2023/1/12 2:50, Jaegeuk Kim wrote:
On 01/11, Chao Yu wrote:
On 2023/1/11 9:20, Jaegeuk Kim wrote:
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
f2fs_handle_page_eio() only covers the case that EIO occurs on the same page, should we cover the case EIO occurs on different pages?
Which case are you looking at?
- __get_node_page(PageA) - __get_node_page(PageB)
- f2fs_handle_page_eio
- sbi->page_eio_ofs[type] = PageA->index - f2fs_handle_page_eio - sbi->page_eio_ofs[type] = PageB->index
In such race case, it may has low probability to set CP_ERROR_FLAG as we expect?
Do you see that case in products? I'm trying to avoid setting CP_ERROR_FLAG here.
Thanks,
Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Change log from v1: - fix a bug
fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
if (err == -ENOMEM) {
if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry; } else if (err != -ENOENT) {
On 2023/1/13 8:01, Jaegeuk Kim wrote:
On 01/12, Chao Yu wrote:
On 2023/1/12 2:50, Jaegeuk Kim wrote:
On 01/11, Chao Yu wrote:
On 2023/1/11 9:20, Jaegeuk Kim wrote:
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
f2fs_handle_page_eio() only covers the case that EIO occurs on the same page, should we cover the case EIO occurs on different pages?
Which case are you looking at?
- __get_node_page(PageA) - __get_node_page(PageB)
- f2fs_handle_page_eio
- sbi->page_eio_ofs[type] = PageA->index - f2fs_handle_page_eio - sbi->page_eio_ofs[type] = PageB->index
In such race case, it may has low probability to set CP_ERROR_FLAG as we expect?
Do you see that case in products?
Not yet.
I'm trying to avoid setting CP_ERROR_FLAG here.
Copied, how about using the same logic for node page as meta page, like we did in f2fs_get_meta_page_retry()?
Thanks,
Thanks,
Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Change log from v1: - fix a bug fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
if (err == -ENOMEM) {
if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry; } else if (err != -ENOENT) {
On 01/28, Chao Yu wrote:
On 2023/1/13 8:01, Jaegeuk Kim wrote:
On 01/12, Chao Yu wrote:
On 2023/1/12 2:50, Jaegeuk Kim wrote:
On 01/11, Chao Yu wrote:
On 2023/1/11 9:20, Jaegeuk Kim wrote:
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
f2fs_handle_page_eio() only covers the case that EIO occurs on the same page, should we cover the case EIO occurs on different pages?
Which case are you looking at?
- __get_node_page(PageA) - __get_node_page(PageB)
- f2fs_handle_page_eio
- sbi->page_eio_ofs[type] = PageA->index - f2fs_handle_page_eio - sbi->page_eio_ofs[type] = PageB->index
In such race case, it may has low probability to set CP_ERROR_FLAG as we expect?
Do you see that case in products?
Not yet.
I'm trying to avoid setting CP_ERROR_FLAG here.
Copied, how about using the same logic for node page as meta page, like we did in f2fs_get_meta_page_retry()?
Yeah.. let me send v3 and test a bit. Thanks~ :)
Thanks,
Thanks,
Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Change log from v1: - fix a bug fs/f2fs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
if (err == -ENOMEM) {
if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry; } else if (err != -ENOENT) {
In f2fs_update_inode_page, f2fs_get_node_page handles EIO along with f2fs_handle_page_eio that stops checkpoint, if the disk couldn't be recovered. As a result, we don't need to stop checkpoint right away given single EIO.
Cc: stable@vger.kernel.org Signed-off-by: Randall Huang huangrandall@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org ---
Change log from v2: - set EIO to cover the case of data corruption given by buggy UFS driver
fs/f2fs/inode.c | 2 +- fs/f2fs/node.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index ff6cf66ed46b..2ed7a621fdf1 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -719,7 +719,7 @@ void f2fs_update_inode_page(struct inode *inode) if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
- if (err == -ENOMEM) { + if (err == -ENOMEM || (err == -EIO && !f2fs_cp_error(sbi))) { cond_resched(); goto retry; } else if (err != -ENOENT) { diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 558b318f7316..1629dc300c61 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -1455,7 +1455,7 @@ static struct page *__get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid, ofs_of_node(page), cpver_of_node(page), next_blkaddr_of_node(page)); set_sbi_flag(sbi, SBI_NEED_FSCK); - err = -EINVAL; + err = -EIO; out_err: ClearPageUptodate(page); out_put_err:
If the storage gives a corrupted node block due to short power failure and reset, f2fs stops the entire operations by setting the checkpoint failure flag.
Let's give more chances to live by re-issuing IOs for a while in such critical path.
Cc: stable@vger.kernel.org Suggested-by: Randall Huang huangrandall@google.com Suggested-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org ---
Change log from v2: - adopting the retry logic
fs/f2fs/inode.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index e1d6e021e82b..7bf660d4cad9 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -724,18 +724,19 @@ void f2fs_update_inode_page(struct inode *inode) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); struct page *node_page; + int count = 0; retry: node_page = f2fs_get_node_page(sbi, inode->i_ino); if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
- if (err == -ENOMEM) { - cond_resched(); + /* The node block was truncated. */ + if (err == -ENOENT) + return; + + if (err == -ENOMEM || ++count <= DEFAULT_RETRY_IO_COUNT) goto retry; - } else if (err != -ENOENT) { - f2fs_stop_checkpoint(sbi, false, - STOP_CP_REASON_UPDATE_INODE); - } + f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_UPDATE_INODE); return; } f2fs_update_inode(inode, node_page);
On 2023/1/31 7:30, Jaegeuk Kim wrote:
If the storage gives a corrupted node block due to short power failure and reset, f2fs stops the entire operations by setting the checkpoint failure flag.
Let's give more chances to live by re-issuing IOs for a while in such critical path.
Cc: stable@vger.kernel.org Suggested-by: Randall Huang huangrandall@google.com Suggested-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org
Reviewed-by: Chao Yu chao@kernel.org
Thanks,
linux-stable-mirror@lists.linaro.org