Hi Sasha,
On Thu, Jan 9, 2020 at 9:38 PM Sasha Levin <sashal(a)kernel.org> wrote:
> This is a note to let you know that I've just added the patch titled
>
> ARM: shmobile: defconfig: Restore debugfs support
>
> to the 5.4-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> arm-shmobile-defconfig-restore-debugfs-support.patch
> and it can be found in the queue-5.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit b56e3b1467eba2ba55bca05e3c8697b23304f12b
> Author: Geert Uytterhoeven <geert+renesas(a)glider.be>
> Date: Mon Dec 9 11:13:27 2019 +0100
>
> ARM: shmobile: defconfig: Restore debugfs support
>
> [ Upstream commit fa2cdb1762d15f701b83efa60b04f0d04e71bf89 ]
>
> Since commit 0e4a459f56c32d3e ("tracing: Remove unnecessary DEBUG_FS
> dependency"), CONFIG_DEBUG_FS is no longer auto-enabled. This breaks
AFAIK, that commit is not present in v5.4, and hasn't been backported yet.
So I don't think there is a need to backport this and all other fixes restoring
debugfs support in post-v5.4 kernels.
BTW, I noticed you plan to backport this "fix" not just to v5.4, but also
to v4.19?
> booting Debian 9, as systemd needs debugfs:
>
> [FAILED] Failed to mount /sys/kernel/debug.
> See 'systemctl status sys-kernel-debug.mount' for details.
> [DEPEND] Dependency failed for Local File Systems.
> ...
> You are in emergGive root password for maintenance
> (or press Control-D to continue):
>
> Fix this by enabling CONFIG_DEBUG_FS explicitly.
>
> See also commit 18977008f44c66bd ("ARM: multi_v7_defconfig: Restore
> debugfs support").
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
> Link: https://lore.kernel.org/r/20191209101327.26571-1-geert+renesas@glider.be
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/arch/arm/configs/shmobile_defconfig b/arch/arm/configs/shmobile_defconfig
> index c6c70355141c..7e7b678ae153 100644
> --- a/arch/arm/configs/shmobile_defconfig
> +++ b/arch/arm/configs/shmobile_defconfig
> @@ -215,4 +215,5 @@ CONFIG_DMA_CMA=y
> CONFIG_CMA_SIZE_MBYTES=64
> CONFIG_PRINTK_TIME=y
> # CONFIG_ENABLE_MUST_CHECK is not set
> +CONFIG_DEBUG_FS=y
> CONFIG_DEBUG_KERNEL=y
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Right now in tcp_bpf_recvmsg, sock read data first from sk_receive_queue
if not empty than psock->ingress_msg otherwise. If a FIN packet arrives
and there's also some data in psock->ingress_msg, the data in
psock->ingress_msg will be purged. It is always happen when request to a
HTTP1.0 server like python SimpleHTTPServer since the server send FIN
packet after data is sent out.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Reported-by: Arika Chen <eaglesora(a)gmail.com>
Suggested-by: Arika Chen <eaglesora(a)gmail.com>
Signed-off-by: Lingpeng Chen <forrest0579(a)gmail.com>
Signed-off-by: John Fastabend <john.fastabend(a)gmail.com>
Cc: stable(a)vger.kernel.org # v4.20+
Acked-by: Song Liu <songliubraving(a)fb.com>
---
net/ipv4/tcp_bpf.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index e38705165ac9..e6b08b5a0895 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -121,14 +121,14 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
struct sk_psock *psock;
int copied, ret;
- if (unlikely(flags & MSG_ERRQUEUE))
- return inet_recv_error(sk, msg, len, addr_len);
- if (!skb_queue_empty(&sk->sk_receive_queue))
- return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
-
psock = sk_psock_get(sk);
if (unlikely(!psock))
return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
+ if (unlikely(flags & MSG_ERRQUEUE))
+ return inet_recv_error(sk, msg, len, addr_len);
+ if (!skb_queue_empty(&sk->sk_receive_queue) &&
+ sk_psock_queue_empty(psock))
+ return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
lock_sock(sk);
msg_bytes_ready:
copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags);
@@ -139,7 +139,7 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
timeo = sock_rcvtimeo(sk, nonblock);
data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err);
if (data) {
- if (skb_queue_empty(&sk->sk_receive_queue))
+ if (!sk_psock_queue_empty(psock))
goto msg_bytes_ready;
release_sock(sk);
sk_psock_put(sk, psock);
--
2.17.1
The patch titled
Subject: memcg: fix a crash in wb_workfn when a device disappears
has been added to the -mm tree. Its filename is
memcg-fix-a-crash-in-wb_workfn-when-a-device-disappears.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/memcg-fix-a-crash-in-wb_workfn-whe…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/memcg-fix-a-crash-in-wb_workfn-whe…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Theodore Ts'o" <tytso(a)mit.edu>
Subject: memcg: fix a crash in wb_workfn when a device disappears
Without memcg, there is a one-to-one mapping between the bdi and
bdi_writeback structures. In this world, things are fairly
straightforward; the first thing bdi_unregister() does is to shutdown the
bdi_writeback structure (or wb), and part of that writeback ensures that
no other work queued against the wb, and that the wb is fully drained.
With memcg, however, there is a one-to-many relationship between the bdi
and bdi_writeback structures; that is, there are multiple wb objects which
can all point to a single bdi. There is a refcount which prevents the bdi
object from being released (and hence, unregistered). So in theory, the
bdi_unregister() *should* only get called once its refcount goes to zero
(bdi_put will drop the refcount, and when it is zero, release_bdi gets
called, which calls bdi_unregister).
Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about
the Brave New memcg World, and calls bdi_unregister directly. It does
this without informing the file system, or the memcg code, or anything
else. This causes the root wb associated with the bdi to be unregistered,
but none of the memcg-specific wb's are shutdown. So when one of these
wb's are woken up to do delayed work, they try to dereference their
wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now
NULL, thanks to the bdi_unregister() called by del_gendisk(). As a
result, *boom*.
Fortunately, it looks like the rest of the writeback path is perfectly
happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to
create a bdi_dev_name() function which can handle bdi->dev being NULL.
This also allows us to bulletproof the writeback tracepoints to prevent
them from dereferencing a NULL pointer and crashing the kernel if one is
tracing with memcg's enabled, and an iSCSI device dies or a USB storage
stick is pulled.
The most common way of triggering this will be hotremoval of a device
while writeback with memcg enabled is going on. It was triggering several
times a day in a heavily loaded production environment.
Google Bug Id: 145475544
Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu
Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: Chris Mason <clm(a)fb.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/fs-writeback.c | 2 -
include/linux/backing-dev.h | 10 +++++++
include/trace/events/writeback.h | 37 +++++++++++++----------------
mm/backing-dev.c | 1
4 files changed, 29 insertions(+), 21 deletions(-)
--- a/fs/fs-writeback.c~memcg-fix-a-crash-in-wb_workfn-when-a-device-disappears
+++ a/fs/fs-writeback.c
@@ -2063,7 +2063,7 @@ void wb_workfn(struct work_struct *work)
struct bdi_writeback, dwork);
long pages_written;
- set_worker_desc("flush-%s", dev_name(wb->bdi->dev));
+ set_worker_desc("flush-%s", bdi_dev_name(wb->bdi));
current->flags |= PF_SWAPWRITE;
if (likely(!current_is_workqueue_rescuer() ||
--- a/include/linux/backing-dev.h~memcg-fix-a-crash-in-wb_workfn-when-a-device-disappears
+++ a/include/linux/backing-dev.h
@@ -13,6 +13,7 @@
#include <linux/fs.h>
#include <linux/sched.h>
#include <linux/blkdev.h>
+#include <linux/device.h>
#include <linux/writeback.h>
#include <linux/blk-cgroup.h>
#include <linux/backing-dev-defs.h>
@@ -504,4 +505,13 @@ static inline int bdi_rw_congested(struc
(1 << WB_async_congested));
}
+extern const char *bdi_unknown_name;
+
+static inline const char *bdi_dev_name(struct backing_dev_info *bdi)
+{
+ if (!bdi || !bdi->dev)
+ return bdi_unknown_name;
+ return dev_name(bdi->dev);
+}
+
#endif /* _LINUX_BACKING_DEV_H */
--- a/include/trace/events/writeback.h~memcg-fix-a-crash-in-wb_workfn-when-a-device-disappears
+++ a/include/trace/events/writeback.h
@@ -67,8 +67,8 @@ DECLARE_EVENT_CLASS(writeback_page_templ
TP_fast_assign(
strscpy_pad(__entry->name,
- mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)",
- 32);
+ bdi_dev_name(mapping ? inode_to_bdi(mapping->host) :
+ NULL), 32);
__entry->ino = mapping ? mapping->host->i_ino : 0;
__entry->index = page->index;
),
@@ -111,8 +111,7 @@ DECLARE_EVENT_CLASS(writeback_dirty_inod
struct backing_dev_info *bdi = inode_to_bdi(inode);
/* may be called for files on pseudo FSes w/ unregistered bdi */
- strscpy_pad(__entry->name,
- bdi->dev ? dev_name(bdi->dev) : "(unknown)", 32);
+ strscpy_pad(__entry->name, bdi_dev_name(bdi), 32);
__entry->ino = inode->i_ino;
__entry->state = inode->i_state;
__entry->flags = flags;
@@ -193,7 +192,7 @@ TRACE_EVENT(inode_foreign_history,
),
TP_fast_assign(
- strncpy(__entry->name, dev_name(inode_to_bdi(inode)->dev), 32);
+ strncpy(__entry->name, bdi_dev_name(inode_to_bdi(inode)), 32);
__entry->ino = inode->i_ino;
__entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc);
__entry->history = history;
@@ -222,7 +221,7 @@ TRACE_EVENT(inode_switch_wbs,
),
TP_fast_assign(
- strncpy(__entry->name, dev_name(old_wb->bdi->dev), 32);
+ strncpy(__entry->name, bdi_dev_name(old_wb->bdi), 32);
__entry->ino = inode->i_ino;
__entry->old_cgroup_ino = __trace_wb_assign_cgroup(old_wb);
__entry->new_cgroup_ino = __trace_wb_assign_cgroup(new_wb);
@@ -255,7 +254,7 @@ TRACE_EVENT(track_foreign_dirty,
struct address_space *mapping = page_mapping(page);
struct inode *inode = mapping ? mapping->host : NULL;
- strncpy(__entry->name, dev_name(wb->bdi->dev), 32);
+ strncpy(__entry->name, bdi_dev_name(wb->bdi), 32);
__entry->bdi_id = wb->bdi->id;
__entry->ino = inode ? inode->i_ino : 0;
__entry->memcg_id = wb->memcg_css->id;
@@ -288,7 +287,7 @@ TRACE_EVENT(flush_foreign,
),
TP_fast_assign(
- strncpy(__entry->name, dev_name(wb->bdi->dev), 32);
+ strncpy(__entry->name, bdi_dev_name(wb->bdi), 32);
__entry->cgroup_ino = __trace_wb_assign_cgroup(wb);
__entry->frn_bdi_id = frn_bdi_id;
__entry->frn_memcg_id = frn_memcg_id;
@@ -318,7 +317,7 @@ DECLARE_EVENT_CLASS(writeback_write_inod
TP_fast_assign(
strscpy_pad(__entry->name,
- dev_name(inode_to_bdi(inode)->dev), 32);
+ bdi_dev_name(inode_to_bdi(inode)), 32);
__entry->ino = inode->i_ino;
__entry->sync_mode = wbc->sync_mode;
__entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc);
@@ -361,9 +360,7 @@ DECLARE_EVENT_CLASS(writeback_work_class
__field(ino_t, cgroup_ino)
),
TP_fast_assign(
- strscpy_pad(__entry->name,
- wb->bdi->dev ? dev_name(wb->bdi->dev) :
- "(unknown)", 32);
+ strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32);
__entry->nr_pages = work->nr_pages;
__entry->sb_dev = work->sb ? work->sb->s_dev : 0;
__entry->sync_mode = work->sync_mode;
@@ -416,7 +413,7 @@ DECLARE_EVENT_CLASS(writeback_class,
__field(ino_t, cgroup_ino)
),
TP_fast_assign(
- strscpy_pad(__entry->name, dev_name(wb->bdi->dev), 32);
+ strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32);
__entry->cgroup_ino = __trace_wb_assign_cgroup(wb);
),
TP_printk("bdi %s: cgroup_ino=%lu",
@@ -438,7 +435,7 @@ TRACE_EVENT(writeback_bdi_register,
__array(char, name, 32)
),
TP_fast_assign(
- strscpy_pad(__entry->name, dev_name(bdi->dev), 32);
+ strscpy_pad(__entry->name, bdi_dev_name(bdi), 32);
),
TP_printk("bdi %s",
__entry->name
@@ -463,7 +460,7 @@ DECLARE_EVENT_CLASS(wbc_class,
),
TP_fast_assign(
- strscpy_pad(__entry->name, dev_name(bdi->dev), 32);
+ strscpy_pad(__entry->name, bdi_dev_name(bdi), 32);
__entry->nr_to_write = wbc->nr_to_write;
__entry->pages_skipped = wbc->pages_skipped;
__entry->sync_mode = wbc->sync_mode;
@@ -514,7 +511,7 @@ TRACE_EVENT(writeback_queue_io,
),
TP_fast_assign(
unsigned long *older_than_this = work->older_than_this;
- strscpy_pad(__entry->name, dev_name(wb->bdi->dev), 32);
+ strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32);
__entry->older = older_than_this ? *older_than_this : 0;
__entry->age = older_than_this ?
(jiffies - *older_than_this) * 1000 / HZ : -1;
@@ -600,7 +597,7 @@ TRACE_EVENT(bdi_dirty_ratelimit,
),
TP_fast_assign(
- strscpy_pad(__entry->bdi, dev_name(wb->bdi->dev), 32);
+ strscpy_pad(__entry->bdi, bdi_dev_name(wb->bdi), 32);
__entry->write_bw = KBps(wb->write_bandwidth);
__entry->avg_write_bw = KBps(wb->avg_write_bandwidth);
__entry->dirty_rate = KBps(dirty_rate);
@@ -665,7 +662,7 @@ TRACE_EVENT(balance_dirty_pages,
TP_fast_assign(
unsigned long freerun = (thresh + bg_thresh) / 2;
- strscpy_pad(__entry->bdi, dev_name(wb->bdi->dev), 32);
+ strscpy_pad(__entry->bdi, bdi_dev_name(wb->bdi), 32);
__entry->limit = global_wb_domain.dirty_limit;
__entry->setpoint = (global_wb_domain.dirty_limit +
@@ -726,7 +723,7 @@ TRACE_EVENT(writeback_sb_inodes_requeue,
TP_fast_assign(
strscpy_pad(__entry->name,
- dev_name(inode_to_bdi(inode)->dev), 32);
+ bdi_dev_name(inode_to_bdi(inode)), 32);
__entry->ino = inode->i_ino;
__entry->state = inode->i_state;
__entry->dirtied_when = inode->dirtied_when;
@@ -800,7 +797,7 @@ DECLARE_EVENT_CLASS(writeback_single_ino
TP_fast_assign(
strscpy_pad(__entry->name,
- dev_name(inode_to_bdi(inode)->dev), 32);
+ bdi_dev_name(inode_to_bdi(inode)), 32);
__entry->ino = inode->i_ino;
__entry->state = inode->i_state;
__entry->dirtied_when = inode->dirtied_when;
--- a/mm/backing-dev.c~memcg-fix-a-crash-in-wb_workfn-when-a-device-disappears
+++ a/mm/backing-dev.c
@@ -21,6 +21,7 @@ struct backing_dev_info noop_backing_dev
EXPORT_SYMBOL_GPL(noop_backing_dev_info);
static struct class *bdi_class;
+const char *bdi_unknown_name = "(unknown)";
/*
* bdi_lock protects bdi_tree and updates to bdi_list. bdi_list has RCU
_
Patches currently in -mm which might be from tytso(a)mit.edu are
memcg-fix-a-crash-in-wb_workfn-when-a-device-disappears.patch
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: c2277d07f243 - Linux 5.4.11-rc1
The results of these automated tests are provided below.
Overall result: FAILED (see details below)
Merge: OK
Compile: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/374903
We attempted to compile the kernel for multiple architectures, but the compile
failed on one or more architectures:
aarch64: FAILED (see build-aarch64.log.xz attachment)
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg