We found an issue in the Android OTA scenario: many BIOs go through FEC even though the data under dm-verity is 100% complete and not corrupted.
Android OTA has many dm block layers. From upper to lower:
    dm-verity
    dm-snapshot
    dm-origin & dm-cow
    dm-linear
    ufs
The dm tables have to change twice during the Android OTA merge process. During a table change, dm-snapshot is suspended for a while. We found that during this interval the filesystem submits many readahead IOs to dm-verity. The kverity workers are then busy running FEC, which takes too long to finish the dm-verity IO and causes the system to get stuck.
We added some debug logging and found that each readahead IO needs around 10s to finish when this situation occurs, because there is an IO amplification:
dm-snapshot suspend
erofs_readahead     // 300+ io is submitted
    dm_submit_bio (dm_verity)
        dm_submit_bio (dm_snapshot)
        bio return EIO
        bio got nothing, it's empty
    verity_end_io
    verity_verify_io
    forloop range(0, io->n_blocks)    // each io->nblocks ~= 20
        verity_fec_decode
        fec_decode_rsb
        fec_read_bufs
        forloop range(0, v->fec->rsn) // v->fec->rsn = 253
            new_read
            submit_bio (dm_snapshot)
        end loop
    end loop
dm-snapshot resume
Readahead BIOs get nothing while dm-snapshot is suspended, so all of them go through FEC. Each readahead BIO has to verify io->n_blocks ~= 20 blocks. Each block goes through FEC, and every block needs v->fec->rsn = 253 reads. So during the suspend interval (~200ms), 300 readahead BIOs generate 300*20*253 = 1,518,000 IOs on dm-snapshot.
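The amplification can be reproduced with a small counting model. This is only an illustrative sketch: fec_reads_during_suspend() is a hypothetical helper, not kernel code, and its three parameters stand in for the number of failed readahead BIOs, io->n_blocks and v->fec->rsn from the trace above.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: counts the extra reads issued
 * when every block of every failed readahead bio goes through FEC.
 */
static uint64_t fec_reads_during_suspend(uint64_t failed_bios,
					 uint64_t blocks_per_bio,
					 uint64_t reads_per_block)
{
	uint64_t reads = 0;

	/* Outer loop: one pass per failed readahead bio. */
	for (uint64_t bio = 0; bio < failed_bios; bio++) {
		/* Middle loop: models the per-block loop in verity_verify_io. */
		for (uint64_t blk = 0; blk < blocks_per_bio; blk++) {
			/* Inner loop of fec_read_bufs: one read per RS block. */
			reads += reads_per_block;
		}
	}

	return reads;
}
```

With the figures from the debug log, fec_reads_during_suspend(300, 20, 253) returns 1518000, i.e. 20 * 253 = 5060 extra reads for each readahead BIO.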
Readahead IO is not required by user space, so to fix this issue I think it is better to skip FEC for a failed readahead BIO and pass the error to the upper layer to handle.
Cc: stable@vger.kernel.org
Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
Signed-off-by: Wu Bo <bo.wu@vivo.com>
---
 drivers/md/dm-verity-target.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index beec14b6b044..14e58ae70521 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -667,7 +667,9 @@ static void verity_end_io(struct bio *bio)
 	struct dm_verity_io *io = bio->bi_private;
 
 	if (bio->bi_status &&
-	    (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
+	    (!verity_fec_is_enabled(io->v) ||
+	     verity_is_system_shutting_down() ||
+	     (bio->bi_opf & REQ_RAHEAD))) {
 		verity_finish_io(io, bio->bi_status);
 		return;
 	}
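The condition changed by the hunk can be modelled in user space to check which failed BIOs now skip FEC. This is a rough sketch under stated assumptions: struct sketch_bio, SKETCH_REQ_RAHEAD and sketch_fail_without_fec() are hypothetical stand-ins for struct bio, REQ_RAHEAD and the test inside verity_end_io(); the flag value is arbitrary, and the two bool arguments replace the calls to verity_fec_is_enabled() and verity_is_system_shutting_down().

```c
#include <stdbool.h>

/* Stand-in for the kernel's REQ_RAHEAD bit; the value here is arbitrary. */
#define SKETCH_REQ_RAHEAD (1u << 0)

/* Minimal model of the two bio fields the condition looks at. */
struct sketch_bio {
	int status;		/* non-zero models bio->bi_status being set */
	unsigned int opf;	/* models bio->bi_opf */
};

/*
 * Returns true when a bio should be completed with its error right away
 * instead of being queued for verification and FEC.
 */
static bool sketch_fail_without_fec(const struct sketch_bio *bio,
				    bool fec_enabled, bool shutting_down)
{
	return bio->status &&
	       (!fec_enabled ||
		shutting_down ||
		(bio->opf & SKETCH_REQ_RAHEAD));
}
```

In this model, a failed BIO with SKETCH_REQ_RAHEAD set returns true even when FEC is enabled, while a failed regular read with FEC enabled still returns false and would go on to FEC as before. A BIO without an error always returns false.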
ping.
On 2023/11/22 11:51, Wu Bo wrote:
On Tue, 21 Nov 2023, Wu Bo wrote:
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
linux-stable-mirror@lists.linaro.org