From: Henry Lin <henryl@nvidia.com>
This change fixes a data corruption issue observed on a USB hard disk
in the case where a bounce buffer is used while transferring data.
While copying data between the sg list and the bounce buffer, the
current implementation passes the mapped sg count (urb->num_mapped_sgs)
to sg_pcopy_from_buffer() and sg_pcopy_to_buffer(). This causes data not
to be copied if the target buffer is located in elements after the
mapped sg elements. This change passes the sg count of the full list to
fix the issue; see the sketch below.
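To illustrate the failure mode, here is a minimal, self-contained
userspace sketch (an editor's illustration, not part of the patch;
pcopy() only mimics how sg_pcopy_from_buffer() walks at most nents
entries): with the mapped entry count, an offset that lands past the
mapped entries is never reached and the copy silently returns 0.

  #include <stdio.h>
  #include <string.h>

  struct seg { char data[4]; };	/* stand-in for one sg entry */

  /* Copy buflen bytes from buf into the list, skipping the first
   * skip bytes, but walk at most nents entries (like sg_pcopy_*). */
  static size_t pcopy(struct seg *sgl, unsigned int nents,
                      const char *buf, size_t buflen, size_t skip)
  {
          size_t copied = 0;
          unsigned int i;
          size_t j;

          for (i = 0; i < nents && copied < buflen; i++) {
                  for (j = 0; j < sizeof(sgl[i].data) && copied < buflen; j++) {
                          if (skip) {	/* still seeking to the offset */
                                  skip--;
                                  continue;
                          }
                          sgl[i].data[j] = buf[copied++];
                  }
          }
          return copied;	/* bytes actually written into the list */
  }

  int main(void)
  {
          struct seg list[4];			/* full "sg list": 4 x 4 bytes */
          unsigned int num_sgs = 4;		/* like urb->num_sgs */
          unsigned int num_mapped_sgs = 2;	/* mapping coalesced to fewer entries */

          memset(list, 0, sizeof(list));

          /* Offset 10 lies in entry 2, beyond the first two entries. */
          printf("nents = num_mapped_sgs: %zu bytes copied (data lost)\n",
                 pcopy(list, num_mapped_sgs, "AB", 2, 10));
          printf("nents = num_sgs:        %zu bytes copied\n",
                 pcopy(list, num_sgs, "AB", 2, 10));
          return 0;
  }

Compiled and run, this prints 0 bytes copied for the mapped count and
2 bytes for the full count.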
In addition, when copying data from the bounce buffer, calling
dma_unmap_single() on the bounce buffer before copying the data to the
sg list avoids a cache coherency issue: on architectures with
non-coherent DMA, the unmap invalidates stale CPU cache lines, so the
CPU sees the data the device actually wrote.
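For context, this is the ordering the first hunk below adopts, shown
schematically (an editor's sketch with error checking omitted; dev,
bounce_buf, dst and len are placeholder identifiers, not taken from
this patch):

  dma_addr_t dma = dma_map_single(dev, bounce_buf, len, DMA_FROM_DEVICE);
  /* ... the device DMAs into bounce_buf ... */
  dma_unmap_single(dev, dma, len, DMA_FROM_DEVICE); /* CPU regains ownership */
  memcpy(dst, bounce_buf, len);                     /* only now safe to read */

Per the DMA-API rules the CPU must not read a streaming mapping between
map and unmap, so copying before the unmap can return stale cache
contents on non-coherent architectures.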
Fixes: f9c589e142d0 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Henry Lin <henryl@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index fed3385..ef7c869 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -656,6 +656,7 @@ static void xhci_unmap_td_bounce_buffer(struct xhci_hcd *xhci,
struct device *dev = xhci_to_hcd(xhci)->self.controller;
struct xhci_segment *seg = td->bounce_seg;
struct urb *urb = td->urb;
+ size_t len;
if (!ring || !seg || !urb)
return;
@@ -666,11 +667,14 @@ static void xhci_unmap_td_bounce_buffer(struct xhci_hcd *xhci,
return;
}
- /* for in tranfers we need to copy the data from bounce to sg */
- sg_pcopy_from_buffer(urb->sg, urb->num_mapped_sgs, seg->bounce_buf,
- seg->bounce_len, seg->bounce_offs);
dma_unmap_single(dev, seg->bounce_dma, ring->bounce_buf_len,
DMA_FROM_DEVICE);
+ /* for in transfers we need to copy the data from bounce to sg */
+ len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf,
+ seg->bounce_len, seg->bounce_offs);
+ if (len != seg->bounce_len)
+ xhci_warn(xhci, "WARN Wrong bounce buffer read length: %zu != %d\n",
+ len, seg->bounce_len);
seg->bounce_len = 0;
seg->bounce_offs = 0;
}
@@ -3127,6 +3131,7 @@ static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
unsigned int unalign;
unsigned int max_pkt;
u32 new_buff_len;
+ size_t len;
max_pkt = usb_endpoint_maxp(&urb->ep->desc);
unalign = (enqd_len + *trb_buff_len) % max_pkt;
@@ -3157,8 +3162,12 @@ static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
/* create a max max_pkt sized bounce buffer pointed to by last trb */
if (usb_urb_dir_out(urb)) {
- sg_pcopy_to_buffer(urb->sg, urb->num_mapped_sgs,
+ len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
seg->bounce_buf, new_buff_len, enqd_len);
+ if (len != new_buff_len)
+ xhci_warn(xhci,
+ "WARN Wrong bounce buffer write length: %zu != %d\n",
+ len, new_buff_len);
seg->bounce_dma = dma_map_single(dev, seg->bounce_buf,
max_pkt, DMA_TO_DEVICE);
} else {
--
2.7.4
We have a single-node system with node 0 disabled:
Scanning NUMA topology in Northbridge 24
Number of physical nodes 2
Skipping disabled node 0
Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]
This causes crashes in memcg when the system boots:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
#PF error: [normal kernel read fault]
...
RIP: 0010:list_lru_add+0x94/0x170
...
Call Trace:
d_lru_add+0x44/0x50
dput.part.34+0xfc/0x110
__fput+0x108/0x230
task_work_run+0x9f/0xc0
exit_to_usermode_loop+0xf5/0x100
It is reproducible as far back as 4.12; I did not try older kernels.
You need a new enough systemd, e.g. 241 (the reason is unknown and was
not investigated); it cannot be reproduced with systemd 234.
The system crashes because the size of the lru array is never updated
in memcg_update_all_list_lrus() and the reads go past the zero-sized
array, dereferencing random memory.
The root cause is the list_lru_memcg_aware() check in the list_lru
code. The test is broken: it assumes node 0 is always present, which is
not true on some systems, as can be seen above.
Fix this by avoiding the check on node 0 and instead remembering the
memcg awareness in a bool flag in struct list_lru, as sketched below.
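To make the broken invariant concrete, a minimal userspace sketch (an
editor's illustration, not kernel code; the names mirror the kernel
structures but are simplified):

  #include <stdbool.h>
  #include <stdio.h>
  #include <stdlib.h>

  #define MAX_NODES 2

  struct lru_node { void *memcg_lrus; };

  struct list_lru {
          struct lru_node node[MAX_NODES];
          bool memcg_aware;	/* the fix: record awareness explicitly */
  };

  /* Old test: infers awareness from node 0, which may be offline. */
  static bool memcg_aware_old(struct list_lru *lru)
  {
          return lru->node[0].memcg_lrus != NULL;
  }

  /* New test: reads the explicit flag. */
  static bool memcg_aware_new(struct list_lru *lru)
  {
          return lru->memcg_aware;
  }

  int main(void)
  {
          struct list_lru lru = { .memcg_aware = true };
          bool node_online[MAX_NODES] = { false, true };	/* node 0 disabled */
          int i;

          /* memcg_init_list_lru()-style setup touches online nodes only. */
          for (i = 0; i < MAX_NODES; i++)
                  if (node_online[i])
                          lru.node[i].memcg_lrus = malloc(16);

          printf("old check: memcg aware = %d (wrong, node 0 is offline)\n",
                 memcg_aware_old(&lru));
          printf("new check: memcg aware = %d\n", memcg_aware_new(&lru));
          return 0;
  }

With node 0 offline the old check reports 0 even though the lru is
memcg aware; the explicit flag is immune to sparse node ids.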
[v2] use the idea proposed by Vladimir -- the bool flag.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists")
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: <cgroups@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: <linux-mm@kvack.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---
This is only a resend of the patch; I did not send it via akpm's tree
previously.
include/linux/list_lru.h | 1 +
mm/list_lru.c | 8 +++-----
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index aa5efd9351eb..d5ceb2839a2d 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -54,6 +54,7 @@ struct list_lru {
#ifdef CONFIG_MEMCG_KMEM
struct list_head list;
int shrinker_id;
+ bool memcg_aware;
#endif
};
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 0730bf8ff39f..d3b538146efd 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -37,11 +37,7 @@ static int lru_shrinker_id(struct list_lru *lru)
static inline bool list_lru_memcg_aware(struct list_lru *lru)
{
- /*
- * This needs node 0 to be always present, even
- * in the systems supporting sparse numa ids.
- */
- return !!lru->node[0].memcg_lrus;
+ return lru->memcg_aware;
}
static inline struct list_lru_one *
@@ -451,6 +447,8 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
{
int i;
+ lru->memcg_aware = memcg_aware;
+
if (!memcg_aware)
return 0;
--
2.21.0