Fixes the below crash
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc000000000c3447c
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
...
NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0
LR [c000000000088354] vmemmap_free+0x144/0x320
Call Trace:
section_deactivate+0x220/0x240
__remove_pages+0x118/0x170
arch_remove_memory+0x3c/0x150
memunmap_pages+0x1cc/0x2f0
devm_action_release+0x30/0x50
release_nodes+0x2f8/0x3e0
device_release_driver_internal+0x168/0x270
unbind_store+0x130/0x170
drv_attr_store+0x44/0x60
sysfs_kf_write+0x68/0x80
kernfs_fop_write+0x100/0x290
__vfs_write+0x3c/0x70
vfs_write+0xcc/0x240
ksys_write+0x7c/0x140
system_call+0x5c/0x68
The crash is due to NULL dereference at
test_bit(idx, ms->usage->subsection_map); due to ms->usage = NULL; in pfn_section_valid()
With commit: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
section_mem_map is set to NULL after depopulate_section_mem(). This
was done so that pfn_page() can work correctly with kernel config that disables
SPARSEMEM_VMEMMAP. With that config pfn_to_page does
__section_mem_map_addr(__sec) + __pfn;
where
static inline struct page *__section_mem_map_addr(struct mem_section *section)
{
unsigned long map = section->section_mem_map;
map &= SECTION_MAP_MASK;
return (struct page *)map;
}
Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is used to
check the pfn validity (pfn_valid()). Since section_deactivate release
mem_section->usage if a section is fully deactivated, pfn_valid() check after
a subsection_deactivate cause a kernel crash.
static inline int pfn_valid(unsigned long pfn)
{
...
return early_section(ms) || pfn_section_valid(ms, pfn);
}
where
static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
{
int idx = subsection_map_index(pfn);
return test_bit(idx, ms->usage->subsection_map);
}
Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is freed.
For architectures like ppc64 where large pages are used for vmmemap mapping (16MB),
a specific vmemmap mapping can cover multiple sections. Hence before a vmemmap
mapping page can be freed, the kernel needs to make sure there are no valid sections
within that mapping. Clearing the section valid bit before
depopulate_section_memap enables this.
Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
Reported-by: Sachin Sant <sachinp(a)linux.vnet.ibm.com>
Tested-by: Sachin Sant <sachinp(a)linux.vnet.ibm.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Pankaj Gupta <pankaj.gupta.linux(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Wei Yang <richardw.yang(a)linux.intel.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
---
mm/sparse.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mm/sparse.c b/mm/sparse.c
index aadb7298dcef..65599e8bd636 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -781,6 +781,12 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
ms->usage = NULL;
}
memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
+ /*
+ * Mark the section invalid so that valid_section()
+ * return false. This prevents code from dereferencing
+ * ms->usage array.
+ */
+ ms->section_mem_map &= ~SECTION_HAS_MEM_MAP;
}
if (section_is_early && memmap)
--
2.25.1
This is a note to let you know that I've just added the patch titled
staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 1165dd73e811a07d947aee218510571f516081f6 Mon Sep 17 00:00:00 2001
From: Qiujun Huang <hqjagain(a)gmail.com>
Date: Thu, 26 Mar 2020 21:18:50 +0800
Subject: staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback
We can't handle the case length > WLAN_DATA_MAXLEN.
Because the size of rxfrm->data is WLAN_DATA_MAXLEN(2312), and we can't
read more than that.
Thanks-to: Hillf Danton <hdanton(a)sina.com>
Reported-and-tested-by: syzbot+7d42d68643a35f71ac8a(a)syzkaller.appspotmail.com
Signed-off-by: Qiujun Huang <hqjagain(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200326131850.17711-1-hqjagain@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/wlan-ng/hfa384x_usb.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/staging/wlan-ng/hfa384x_usb.c b/drivers/staging/wlan-ng/hfa384x_usb.c
index e38acb58cd5e..fa1bf8b069fd 100644
--- a/drivers/staging/wlan-ng/hfa384x_usb.c
+++ b/drivers/staging/wlan-ng/hfa384x_usb.c
@@ -3376,6 +3376,8 @@ static void hfa384x_int_rxmonitor(struct wlandevice *wlandev,
WLAN_HDR_A4_LEN + WLAN_DATA_MAXLEN + WLAN_CRC_LEN)) {
pr_debug("overlen frm: len=%zd\n",
skblen - sizeof(struct p80211_caphdr));
+
+ return;
}
skb = dev_alloc_skb(skblen);
--
2.26.0
This is a note to let you know that I've just added the patch titled
usb: gadget: f_fs: Fix use after free issue as part of queue failure
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From f63ec55ff904b2f2e126884fcad93175f16ab4bb Mon Sep 17 00:00:00 2001
From: Sriharsha Allenki <sallenki(a)codeaurora.org>
Date: Thu, 26 Mar 2020 17:26:20 +0530
Subject: usb: gadget: f_fs: Fix use after free issue as part of queue failure
In AIO case, the request is freed up if ep_queue fails.
However, io_data->req still has the reference to this freed
request. In the case of this failure if there is aio_cancel
call on this io_data it will lead to an invalid dequeue
operation and a potential use after free issue.
Fix this by setting the io_data->req to NULL when the request
is freed as part of queue failure.
Fixes: 2e4c7553cd6f ("usb: gadget: f_fs: add aio support")
Signed-off-by: Sriharsha Allenki <sallenki(a)codeaurora.org>
CC: stable <stable(a)vger.kernel.org>
Reviewed-by: Peter Chen <peter.chen(a)nxp.com>
Link: https://lore.kernel.org/r/20200326115620.12571-1-sallenki@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/gadget/function/f_fs.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 571917677d35..767f30b86645 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1120,6 +1120,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
if (unlikely(ret)) {
+ io_data->req = NULL;
usb_ep_free_request(ep->ep, req);
goto error_lock;
}
--
2.26.0
This is a note to let you know that I've just added the patch titled
USB: serial: io_edgeport: fix slab-out-of-bounds read in
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 57aa9f294b09463492f604feaa5cc719beaace32 Mon Sep 17 00:00:00 2001
From: Qiujun Huang <hqjagain(a)gmail.com>
Date: Wed, 25 Mar 2020 15:52:37 +0800
Subject: USB: serial: io_edgeport: fix slab-out-of-bounds read in
edge_interrupt_callback
Fix slab-out-of-bounds read in the interrupt-URB completion handler.
The boundary condition should be (length - 1) as we access
data[position + 1].
Reported-and-tested-by: syzbot+37ba33391ad5f3935bbd(a)syzkaller.appspotmail.com
Signed-off-by: Qiujun Huang <hqjagain(a)gmail.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/io_edgeport.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/serial/io_edgeport.c b/drivers/usb/serial/io_edgeport.c
index 5737add6a2a4..4cca0b836f43 100644
--- a/drivers/usb/serial/io_edgeport.c
+++ b/drivers/usb/serial/io_edgeport.c
@@ -710,7 +710,7 @@ static void edge_interrupt_callback(struct urb *urb)
/* grab the txcredits for the ports if available */
position = 2;
portNumber = 0;
- while ((position < length) &&
+ while ((position < length - 1) &&
(portNumber < edge_serial->serial->num_ports)) {
txCredits = data[position] | (data[position+1] << 8);
if (txCredits) {
--
2.26.0
On Thu, 2020-03-26 at 13:08 -0700, Andrew Morton wrote:
> > After our server is upgraded to a newer kernel, we found that it
> > continuesly print a warning in the kernel message. The warning is,
> > [832984.946322] netlink: 'irmas.lc': attribute type 1 has an invalid length.
> >
> > irmas.lc is one of our container monitor daemons, and it will use
> > CGROUPSTATS_CMD_GET to get the cgroupstats, that is similar with
> > tools/accounting/getdelays.c. We can also produce this warning with
> > getdelays. For example, after running bellow command
> > $ ./getdelays -C /sys/fs/cgroup/memory
> > then you can find a warning in dmesg,
> > [61607.229318] netlink: 'getdelays': attribute type 1 has an invalid length.
> >
> > This warning is introduced in commit 6e237d099fac ("netlink: Relax attr
> > validation for fixed length types"), which is used to check whether
> > attributes using types NLA_U* and NLA_S* have an exact length.
> >
> > Regarding this issue, the root cause is cgroupstats_cmd_get_policy defines
> > a wrong type as NLA_U32, while it should be NLA_NESTED an its minimal
> > length is NLA_HDRLEN. That is similar to taskstats_cmd_get_policy.
> >
> > As this behavior change really breaks our application, we'd better
> > cc stable as well.
Can you explain how it breaks the application? I mean, it's really only
printing a message to the kernel log in this case? At least that's what
you're describing.
I think you may be describing it wrong, because an NLA_NESTED is allowed
to be *empty* (but otherwise must have at least 4 bytes just like an
NLA_U32).
That said, I'm not even sure I agree that this fix is right? See below.
> Is it correct to say that although the code has always been incorrect,
> but only kernels after 6e237d099fac need this change? If so, I'll add
> Fixes:6e237d099fac to guide the -stable backporting.
That doesn't really seem right - 6e237d099fac *relaxed* the checks. If
anything then it ought to point to 28033ae4e0f5 which may have actually
returned an error; but again, need to understand better what really the
issue is.
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index e2ac0e3..b90a520 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -35,8 +35,8 @@
> > static struct genl_family family;
> >
> > static const struct nla_policy taskstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
> > - [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_U32 },
> > - [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_U32 },
> > + [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_NESTED },
> > + [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_NESTED },
I'm not sure where this is coming from - the kernel evidently uses them
as nested attributes in *outgoing* data (see mk_reply()), but as NLA_U32
in *incoming* data, (see cmd_attr_pid() and cmd_attr_tgid()).
I would generally recommend not doing such a thing as it's messy, but we
do have quite a few such instances cases. In all those cases must the
policy list the incoming policy since that's what the kernel uses to
validate the attributes.
IOW, this part of the change seems _wrong_.
> > * Make sure they are always aligned.
> > */
> > static const struct nla_policy cgroupstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
> > - [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_U32 },
> > + [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_NESTED },
> > };
And same here, actually.
johannes
On Thu, 2020-03-26 at 13:08 -0700, Andrew Morton wrote:
> (cc's added)
>
> On Wed, 25 Mar 2020 22:50:42 -0400 Yafang Shao <laoar.shao(a)gmail.com> wrote:
>
> > After our server is upgraded to a newer kernel, we found that it
> > continuesly print a warning in the kernel message. The warning is,
> > [832984.946322] netlink: 'irmas.lc': attribute type 1 has an invalid length.
> >
> > irmas.lc is one of our container monitor daemons, and it will use
> > CGROUPSTATS_CMD_GET to get the cgroupstats, that is similar with
> > tools/accounting/getdelays.c. We can also produce this warning with
> > getdelays. For example, after running bellow command
> > $ ./getdelays -C /sys/fs/cgroup/memory
> > then you can find a warning in dmesg,
> > [61607.229318] netlink: 'getdelays': attribute type 1 has an invalid length.
And looking at this ... well, that code is completely wrong?
E.g.
rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
cmd_type, &tid, sizeof(__u32));
(cmd_type is one of TASKSTATS_CMD_ATTR_TGID, TASKSTATS_CMD_ATTR_PID)
or it might do
rc = send_cmd(nl_sd, id, mypid, CGROUPSTATS_CMD_GET,
CGROUPSTATS_CMD_ATTR_FD, &cfd, sizeof(__u32));
so clearly it wants to produce a u32 attribute.
But then
static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
__u8 genl_cmd, __u16 nla_type,
void *nla_data, int nla_len)
{
...
na = (struct nlattr *) GENLMSG_DATA(&msg);
// this is still fine
na->nla_type = nla_type;
// this is also fine
na->nla_len = nla_len + 1 + NLA_HDRLEN;
// but this??? the nla_len of a netlink attribute should just be
// the len ... what's NLA_HDRLEN doing here? this isn't nested
// here we end up just reserving 1+NLA_HDRLEN too much space
memcpy(NLA_DATA(na), nla_data, nla_len);
// but then it anyway only fills the first nla_len bytes, which
// is just like a regular attribute.
msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
// note that this is also wrong - it should be
// += NLA_ALIGN(NLA_HDRLEN + nla_len)
So really I think what happened here is precisely what we wanted -
David's kernel patch caught the broken userspace tool.
johannes