Hi Andrew,
On Thu, Aug 02, 2018 at 02:13:04PM -0700, Andrew Morton wrote:
On Thu, 2 Aug 2018 14:11:12 +0900 Minchan Kim minchan@kernel.org wrote:
If zram supports writeback feature, it's no more syncrhonous device beause zram does synchronous IO opeation for incompressible page.
Do not pretend to be syncrhonous IO device. It makes system very sluggish as waiting IO completion from upper layer.
Furthermore, it makes user-after-free problem because swap think the opearion is done when the IO functions returns so it could free page by will(e.g., lock_page_or_retry and goto out_release in do_swap_page) but in fact, IO is asynchrnous so driver could access just freed page afterward.
This patch fixes the problem.
I fixed my faults from original description. Otherwise, ones you corrected looks good to me.
That changelog is rather hard to follow. Please review my edits:
: If zram supports writeback feature, it's no longer a BD_CAP_SYNCHRONOUS_IO : device beause zram does synchronous IO operations for incompressible pages.
asynchronous
: : Do not pretend to be synchronous IO device. It makes the system very : sluggish due to waiting for IO completion from upper layers. : : Furthermore, it causes a user-after-free problem because swap thinks the : opearion is done when the IO functions returns so it can free the page : (e.g., lock_page_or_retry and goto out_release in do_swap_page) but in : fact, IO is asynchrnous so the driver could access a just freed page
asynchronous
: afterward. : : This patch fixes the problem.
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 7436b2d27fa3..0b6eda1bd77a 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -298,7 +298,8 @@ static void reset_bdev(struct zram *zram) zram->backing_dev = NULL; zram->old_block_size = 0; zram->bdev = NULL;
- zram->disk->queue->backing_dev_info->capabilities |=
kvfree(zram->bitmap); zram->bitmap = NULL;BDI_CAP_SYNCHRONOUS_IO;
} @@ -400,6 +401,8 @@ static ssize_t backing_dev_store(struct device *dev, zram->backing_dev = backing_dev; zram->bitmap = bitmap; zram->nr_pages = nr_pages;
- zram->disk->queue->backing_dev_info->capabilities &=
up_write(&zram->init_lock);~BDI_CAP_SYNCHRONOUS_IO;
pr_info("setup backing device %s\n", file_name);
A reader looking at this would wonder "why the heck are we doing that". Adding a code comment would help them.
I will add
/* * With writeback feature, zram does a asynchronous IO so it's no longer * synchronous device. If it pretend to be, upper layer could wait IO * completion rather than (submit and return), which will cause system * sluggish. * Furthermore, when the IO function returns(e.g., swap_readpage), * uppler lay could guess IO was done so it could deallocate the page * freely but in fact, IO is going on and it finally could cause * use-after-free when the IO is really done. */
Is it legitimate to be altering the bdi capabilities at this level? Or is this hacky?
Most of device's bdi capability seems to be static but there are few drivers which can change capability. For example, BDI_CAP_STABLE_WRITES
drivers/scsi/iscsi_tcp.c drivers/md/raid5.c
I believe it's driver itself advertisement stuff so I hope it's not hack.
If "yes" then should reset_bdev() be unconditionally setting BDI_CAP_SYNCHRONOUS_IO? Shouldn't it be restoring that flag to its previous setting?
Yu, reset_bdev should set it unconditionally. Because zram's default mode is synchronous and it changed only if user set backing device.
I will respin the patch with revised comment and description.
Thanks.