Backported mainline commit 6ff38bd40230af35
("mm: cleancache: fix corruption on missed inode invalidation")
From: Pavel Tikhomirov <ptikhomirov(a)virtuozzo.com>
Date: Fri, 30 Nov 2018 14:09:00 -0800
Subject: [PATCH] mm: cleancache: fix corruption on missed inode invalidation
If all pages are deleted from the mapping by memory reclaim and also
moved to the cleancache:
__delete_from_page_cache
(no shadow case)
unaccount_page_cache_page
cleancache_put_page
page_cache_delete
mapping->nrpages -= nr
(nrpages becomes 0)
We don't clean the cleancache for an inode after final file truncation
(removal).
truncate_inode_pages_final
check (nrpages || nrexceptional) is false
no truncate_inode_pages
no cleancache_invalidate_inode(mapping)
These way when reading the new file created with same inode we may get
these trash leftover pages from cleancache and see wrong data instead of
the contents of the new file.
Fix it by always doing truncate_inode_pages which is already ready for
nrpages == 0 && nrexceptional == 0 case and just invalidates inode.
[akpm(a)linux-foundation.org: add comment, per Jan]
Link: http://lkml.kernel.org/r/20181112095734.17979-1-ptikhomirov@virtuozzo.com
Fixes: commit 91b0abe36a7b ("mm + fs: store shadow entries in page cache")
Signed-off-by: Pavel Tikhomirov <ptikhomirov(a)virtuozzo.com>
Reviewed-by: Vasily Averin <vvs(a)virtuozzo.com>
Reviewed-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com>
---
mm/truncate.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/truncate.c b/mm/truncate.c
index 2330223841fb..d3a2737cc188 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -471,9 +471,13 @@ void truncate_inode_pages_final(struct address_space *mapping)
*/
spin_lock_irq(&mapping->tree_lock);
spin_unlock_irq(&mapping->tree_lock);
-
- truncate_inode_pages(mapping, 0);
}
+
+ /*
+ * Cleancache needs notification even if there are no pages or shadow
+ * entries.
+ */
+ truncate_inode_pages(mapping, 0);
}
EXPORT_SYMBOL(truncate_inode_pages_final);
--
2.17.1
Hi Greg,
This has been applied to 4.19-stable, but should be needed in
4.14-stable also. I have attached the backported patch and have
also tested with one such DVD where the problem was seen.
--
Regards
Sudip
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f8397d69daef06d358430d3054662fb597e37c00 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <nborisov(a)suse.com>
Date: Tue, 6 Nov 2018 16:40:20 +0200
Subject: [PATCH] btrfs: Always try all copies when reading extent buffers
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable(a)vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3f0b6d1936e8..6d776717d8b3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
@@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f8397d69daef06d358430d3054662fb597e37c00 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <nborisov(a)suse.com>
Date: Tue, 6 Nov 2018 16:40:20 +0200
Subject: [PATCH] btrfs: Always try all copies when reading extent buffers
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable(a)vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3f0b6d1936e8..6d776717d8b3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
@@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f8397d69daef06d358430d3054662fb597e37c00 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <nborisov(a)suse.com>
Date: Tue, 6 Nov 2018 16:40:20 +0200
Subject: [PATCH] btrfs: Always try all copies when reading extent buffers
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable(a)vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3f0b6d1936e8..6d776717d8b3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
@@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ceff2f4dcd44abf35864d9a99f85ac619e89a01d Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 27 Aug 2018 10:21:46 +0200
Subject: [PATCH] drm/mediatek: fix OF sibling-node lookup
Use the new of_get_compatible_child() helper to lookup the sibling
instead of using of_find_compatible_node(), which searches the entire
tree from a given start node and thus can return an unrelated (i.e.
non-sibling) node.
This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the parent device node).
While at it, also fix the related cec-node reference leak.
Fixes: 8f83f26891e1 ("drm/mediatek: Add HDMI support")
Cc: stable <stable(a)vger.kernel.org> # 4.8
Cc: Junzhi Zhao <junzhi.zhao(a)mediatek.com>
Cc: Philipp Zabel <p.zabel(a)pengutronix.de>
Cc: CK Hu <ck.hu(a)mediatek.com>
Cc: David Airlie <airlied(a)linux.ie>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Rob Herring <robh(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi.c b/drivers/gpu/drm/mediatek/mtk_hdmi.c
index 2d45d1dd9554..643f5edd68fe 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi.c
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi.c
@@ -1446,8 +1446,7 @@ static int mtk_hdmi_dt_parse_pdata(struct mtk_hdmi *hdmi,
}
/* The CEC module handles HDMI hotplug detection */
- cec_np = of_find_compatible_node(np->parent, NULL,
- "mediatek,mt8173-cec");
+ cec_np = of_get_compatible_child(np->parent, "mediatek,mt8173-cec");
if (!cec_np) {
dev_err(dev, "Failed to find CEC node\n");
return -EINVAL;
@@ -1457,8 +1456,10 @@ static int mtk_hdmi_dt_parse_pdata(struct mtk_hdmi *hdmi,
if (!cec_pdev) {
dev_err(hdmi->dev, "Waiting for CEC device %pOF\n",
cec_np);
+ of_node_put(cec_np);
return -EPROBE_DEFER;
}
+ of_node_put(cec_np);
hdmi->cec_dev = &cec_pdev->dev;
/*
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f9a7082327e26f54067a49cac2316d31e0cc8ba7 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 27 Aug 2018 10:21:47 +0200
Subject: [PATCH] drm/msm: fix OF child-node lookup
Use the new of_get_compatible_child() helper to lookup the legacy
pwrlevels child node instead of using of_find_compatible_node(), which
searches the entire tree from a given start node and thus can return an
unrelated (i.e. non-child) node.
This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the probed device's node).
While at it, also fix the related child-node reference leak.
Fixes: e2af8b6b0ca1 ("drm/msm: gpu: Use OPP tables if we can")
Cc: stable <stable(a)vger.kernel.org> # 4.12
Cc: Jordan Crouse <jcrouse(a)codeaurora.org>
Cc: Rob Clark <robdclark(a)gmail.com>
Cc: David Airlie <airlied(a)linux.ie>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Rob Herring <robh(a)kernel.org>
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index da1363a0c54d..93d70f4a2154 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -633,8 +633,7 @@ static int adreno_get_legacy_pwrlevels(struct device *dev)
struct device_node *child, *node;
int ret;
- node = of_find_compatible_node(dev->of_node, NULL,
- "qcom,gpu-pwrlevels");
+ node = of_get_compatible_child(dev->of_node, "qcom,gpu-pwrlevels");
if (!node) {
dev_err(dev, "Could not find the GPU powerlevels\n");
return -ENXIO;
@@ -655,6 +654,8 @@ static int adreno_get_legacy_pwrlevels(struct device *dev)
dev_pm_opp_add(dev, val, 0);
}
+ of_node_put(node);
+
return 0;
}
From: Alek Du <alek.du(a)intel.com>
We observed some premature timeouts on a virtualization platform, the log
is like this:
case 1:
[159525.255629] mmc1: Internal clock never stabilised.
[159525.255818] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[159525.256049] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
...
[159525.257205] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa03
>From the clock control register dump, we are pretty sure the clock was
stablized.
case 2:
[ 914.550127] mmc1: Reset 0x2 never completed.
[ 914.550321] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 914.550608] mmc1: sdhci: Sys addr: 0x00000010 | Version: 0x00001002
After checking the sdhci code, we found the timeout check actually has a
little window that the CPU can be scheduled out and when it comes back,
the original time set or check is not valid.
Fixes: 5a436cc0af62 ("mmc: sdhci: Optimize delay loops")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Alek Du <alek.du(a)intel.com>
---
drivers/mmc/host/sdhci.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 99bdae53fa2e..451b08a818a9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -216,8 +216,12 @@ void sdhci_reset(struct sdhci_host *host, u8 mask)
timeout = ktime_add_ms(ktime_get(), 100);
/* hw clears the bit when it's done */
- while (sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask) {
- if (ktime_after(ktime_get(), timeout)) {
+ while (1) {
+ bool timedout = ktime_after(ktime_get(), timeout);
+
+ if (!(sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask))
+ break;
+ if (timedout) {
pr_err("%s: Reset 0x%x never completed.\n",
mmc_hostname(host->mmc), (int)mask);
sdhci_dumpregs(host);
@@ -1608,9 +1612,13 @@ void sdhci_enable_clk(struct sdhci_host *host, u16 clk)
/* Wait max 20 ms */
timeout = ktime_add_ms(ktime_get(), 20);
- while (!((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
- & SDHCI_CLOCK_INT_STABLE)) {
- if (ktime_after(ktime_get(), timeout)) {
+ while (1) {
+ bool timedout = ktime_after(ktime_get(), timeout);
+
+ clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+ if (clk & SDHCI_CLOCK_INT_STABLE)
+ break;
+ if (timedout) {
pr_err("%s: Internal clock never stabilised.\n",
mmc_hostname(host->mmc));
sdhci_dumpregs(host);
--
2.17.1
From: Markus Hofstaetter <markus.hofstaetter(a)ait.ac.at>
commit f16703360da7731a057df2ffa902306819c22398 upstream.
Some PWMs are disabled by default or the default pin setting
does not match the LED_OFF state (e.g., active-low leds).
Hence, the driver may end up reporting 0 brightness, but
the leds are actually on using full brightness, because
it never enforces its default configuration.
So enforce it by calling led_pwm_set() after successfully
registering the device.
Tested on a Phytec phyFLEX i.MX6Q board based on kernel
v3.19.5.
Signed-off-by: Markus Hofstaetter <markus.hofstaetter(a)ait.ac.at>
Tested-by: Markus Hofstaetter <markus.hofstaetter(a)ait.ac.at>
Signed-off-by: Jacek Anaszewski <j.anaszewski(a)samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
drivers/leds/leds-pwm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/leds/leds-pwm.c b/drivers/leds/leds-pwm.c
index 1d07e3e83d29..3149dbece146 100644
--- a/drivers/leds/leds-pwm.c
+++ b/drivers/leds/leds-pwm.c
@@ -132,6 +132,7 @@ static int led_pwm_add(struct device *dev, struct led_pwm_priv *priv,
ret = led_classdev_register(dev, &led_data->cdev);
if (ret == 0) {
priv->num_leds++;
+ led_pwm_set(&led_data->cdev, led_data->cdev.brightness);
} else {
dev_err(dev, "failed to register PWM led for %s: %d\n",
led->name, ret);
--
2.7.4