The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.
As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.
Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Reviewed-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index bc2c96b34de1..4ddccab02a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1043,6 +1043,7 @@ int mlx5e_safe_reopen_channels(struct mlx5e_priv *priv);
int mlx5e_safe_switch_channels(struct mlx5e_priv *priv,
struct mlx5e_channels *new_chs,
mlx5e_fp_preactivate preactivate);
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv);
void mlx5e_activate_priv_channels(struct mlx5e_priv *priv);
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68b520df07e4..ff7f5a931520 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -432,9 +432,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
*cur_params = new_channels.params;
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
+ mlx5e_num_channels_changed(priv);
goto out;
}
@@ -442,12 +440,8 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (arfs_enabled)
mlx5e_arfs_disable(priv);
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
-
/* Switch to new channels, set new parameters and close old ones */
- err = mlx5e_safe_switch_channels(priv, &new_channels, NULL);
+ err = mlx5e_safe_switch_channels(priv, &new_channels, mlx5e_num_channels_changed);
if (arfs_enabled) {
int err2 = mlx5e_arfs_enable(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 152aa5d7df79..bbe8c32fb423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2880,6 +2880,17 @@ static void mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
netif_set_real_num_rx_queues(netdev, num_rxqs);
}
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
+{
+ u16 count = priv->channels.params.num_channels;
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, count);
+
+ return 0;
+}
+
static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
{
int i, ch;
@@ -5288,9 +5299,10 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
max_nch = mlx5e_get_max_num_channels(priv->mdev);
if (priv->channels.params.num_channels > max_nch) {
mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ /* Reducing the number of channels - RXFH has to be reset. */
+ priv->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
priv->channels.params.num_channels = max_nch;
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, max_nch);
+ mlx5e_num_channels_changed(priv);
}
err = profile->init_tx(priv);
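The ordering problem the commit message describes can be sketched in isolation. The following is a minimal user-space illustration, not the driver code: `struct dev_state`, `safe_switch`, `rebuild_indir_table` and the `open_ok` flag are hypothetical stand-ins, and the real mlx5e_safe_switch_channels does considerably more. It only shows why updating dependent state from a preactivate hook, after every fallible step has succeeded, keeps the table and the channels consistent on failure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the driver state (not the mlx5e types). */
struct dev_state {
    int num_channels;    /* channels currently in use */
    int indir_channels;  /* channel count the indirection table was built for */
};

/* Hook type: called once nothing else in the switch can fail. */
typedef int (*preactivate_fn)(struct dev_state *st);

/* Rebuild the table for the current channel count. This plays the role that
 * mlx5e_num_channels_changed() plays in the patch. */
static int rebuild_indir_table(struct dev_state *st)
{
    st->indir_channels = st->num_channels;
    return 0;
}

/*
 * Switch to new_count channels. open_ok simulates whether opening the new
 * channels succeeded (an assumption made for the sketch). The hook runs only
 * after the fallible step, so a failure leaves both fields untouched.
 */
static int safe_switch(struct dev_state *st, int new_count, bool open_ok,
                       preactivate_fn preactivate)
{
    if (!open_ok)
        return -1;  /* old channels and old table are both still in place */

    st->num_channels = new_count;
    if (preactivate)
        return preactivate(st);
    return 0;
}
```

With the pre-patch ordering, the table rebuild would run before `safe_switch`, so a failed switch would leave `indir_channels` describing channels that were never activated.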
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
On Fri, Apr 17, 2020 at 9:41 PM Toralf Förster <toralf.foerster(a)gmx.de> wrote:
>
> On 4/17/20 8:52 PM, Rafael J. Wysocki wrote:
> > On Fri, Apr 17, 2020 at 6:36 PM Toralf Förster <toralf.foerster(a)gmx.de> wrote:
> >>
> >> On 4/17/20 5:53 PM, Rafael J. Wysocki wrote:
> >>> Does the patch below (untested) make any difference?
> >>>
> >>> ---
> >>> drivers/acpi/ec.c | 5 ++++-
> >>> 1 file changed, 4 insertions(+), 1 deletion(-)
> >>>
> >>> Index: linux-pm/drivers/acpi/ec.c
> >>> ===================================================================
> >>> --- linux-pm.orig/drivers/acpi/ec.c
> >>> +++ linux-pm/drivers/acpi/ec.c
> >>> @@ -2067,7 +2067,10 @@ static struct acpi_driver acpi_ec_driver
> >>> .add = acpi_ec_add,
> >>> .remove = acpi_ec_remove,
> >>> },
> >>> - .drv.pm = &acpi_ec_pm,
> >>> + .drv = {
> >>> + .probe_type = PROBE_FORCE_SYNCHRONOUS,
> >>> + .pm = &acpi_ec_pm,
> >>> + },
> >>> };
> >>>
> >>> static void acpi_ec_destroy_workqueues(void)
> >> I'd say no, but for completeness:
> >
> > OK, it looks like mainline commit
> >
> > 65a691f5f8f0 ("ACPI: EC: Do not clear boot_ec_is_ecdt in acpi_ec_add()")
> >
> > was backported into 5.6.5 by mistake.
> >
> > Can you please revert that patch and retest?
> >
> Yes, reverting that commit solved the issue.
OK, thanks!
Greg, I'm not sure why commit 65a691f5f8f0 from the mainline ended up in 5.6.5.
It has not been marked for -stable or otherwise requested to be
included AFAICS. Also it depends on other mainline commits that have
not been included into 5.6.5.
Can you please drop it?
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ef4a632ccc1c7d3fb71a5baae85b79af08b7f94b Mon Sep 17 00:00:00 2001
From: Murphy Zhou <jencce.kernel(a)gmail.com>
Date: Wed, 18 Mar 2020 20:43:38 +0800
Subject: [PATCH] CIFS: check new file size when extending file by fallocate
xfstests generic/228 checks whether fallocate respects RLIMIT_FSIZE.
Now that extending a file with fallocate mode 0 is enabled, we can hit
this failure. Fix this by checking the new file size with the VFS helper
and returning an error if the file size is larger than RLIMIT_FSIZE
(ulimit -f).
This patch has been tested with LTP/xfstests against Samba and
Windows Server.
Acked-by: Ronnie Sahlberg <lsahlber(a)redhat.com>
Signed-off-by: Murphy Zhou <jencce.kernel(a)gmail.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
CC: Stable <stable(a)vger.kernel.org>
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index b0759c8aa6f5..9c9258fc8756 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -3255,6 +3255,10 @@ static long smb3_simple_falloc(struct file *file, struct cifs_tcon *tcon,
* Extending the file
*/
if ((keep_size == false) && i_size_read(inode) < off + len) {
+ rc = inode_newsize_ok(inode, off + len);
+ if (rc)
+ goto out;
+
if ((cifsi->cifsAttrs & FILE_ATTRIBUTE_SPARSE_FILE) == 0)
smb2_set_sparse(xid, tcon, cfile, inode, false);
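The check the patch adds can be illustrated outside the kernel. This is a minimal sketch under stated assumptions: `newsize_ok`, `falloc_extend` and `NO_LIMIT` are hypothetical names, the limit is passed in as a plain number rather than read from the process, and returning -EFBIG is an assumption of the sketch rather than something the patch states. It only shows the shape of the fix: validate the target size before extending, and leave the file size alone on failure.

```c
#include <errno.h>
#include <stdint.h>

#define NO_LIMIT UINT64_MAX  /* hypothetical marker for "no file size limit" */

/* Return 0 if growing from cur_size to new_size stays within fsize_limit. */
static int newsize_ok(uint64_t cur_size, uint64_t new_size, uint64_t fsize_limit)
{
    if (new_size <= cur_size)
        return 0;                 /* not an extension, nothing to check */
    if (fsize_limit != NO_LIMIT && new_size > fsize_limit)
        return -EFBIG;            /* assumed error code for the sketch */
    return 0;
}

/* Extend *size to off + len only if the limit allows it. */
static int falloc_extend(uint64_t *size, uint64_t off, uint64_t len,
                         uint64_t fsize_limit)
{
    int rc;

    if (*size >= off + len)
        return 0;                 /* range already inside the file */

    rc = newsize_ok(*size, off + len, fsize_limit);
    if (rc)
        return rc;                /* bail out before touching the size */

    *size = off + len;
    return 0;
}
```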
A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
memory.events") changed the behavior of memcg events so that
memory.events now accounts for subtrees. But the oom_kill event is a
special case because it is used in both cgroup1 and cgroup2. In cgroup1,
it is displayed in memory.oom_control. The file memory.oom_control exists
in both the root memcg and non-root memcgs, unlike memory.events, which
exists only in non-root memcgs. That commit is fine for cgroup2, but not
for cgroup1, where it causes inconsistent behavior between the root memcg
and non-root memcgs.
Let's restore the original behavior for cgroup1.
Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Yafang Shao <laoar.shao(a)gmail.com>
---
include/linux/memcontrol.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8c340e6b347f..a0ae080a67d1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -798,7 +798,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
atomic_long_inc(&memcg->memory_events[event]);
cgroup_file_notify(&memcg->events_file);
- if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
+ if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS ||
+ !cgroup_subsys_on_dfl(memory_cgrp_subsys))
break;
} while ((memcg = parent_mem_cgroup(memcg)) &&
!mem_cgroup_is_root(memcg));
--
2.18.2
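The loop touched by this patch walks from a memcg up through its ancestors. A small user-space sketch of that walk may help; `struct group` and `record_event` are hypothetical names, and the single `local_only` flag stands in for the combined condition in the patch (the local-events mount flag being set, or the memory controller not being on the default hierarchy, i.e. cgroup1).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: a parent pointer and one event counter. */
struct group {
    struct group *parent;  /* NULL for the root group */
    long events;
};

/*
 * Count an event in g and, unless local_only is set, in every ancestor
 * except the root. Same do/while shape as the loop in the patch.
 */
static void record_event(struct group *g, bool local_only)
{
    do {
        g->events++;
        if (local_only)
            break;               /* cgroup1 or local-events mode: stop here */
        g = g->parent;
    } while (g && g->parent);    /* stop before counting in the root */
}
```

With `local_only` set, only the group that saw the event is updated, which is the behavior the patch restores for cgroup1.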
The patch titled
Subject: mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-watchdog-soft-lockups-during-set_zone_contiguous.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-watchdog-soft-lo…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-watchdog-soft-lo…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected,
e.g., while booting up.
[ 105.608900] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
[ 105.608933] Modules linked in:
[ 105.608933] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4
[ 105.608933] Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
[ 105.608933] RIP: 0010:__pageblock_pfn_to_page+0x134/0x1c0
[ 105.608933] Code: 85 c0 74 71 4a 8b 04 d0 48 85 c0 74 68 48 01 c1 74 63 f6 01 04 74 5e 48 c1 e7 06 4c 8b 05 cc 991
[ 105.608933] RSP: 0000:ffffb6d94000fe60 EFLAGS: 00010286 ORIG_RAX: ffffffffffffff13
[ 105.608933] RAX: fffff81953250000 RBX: 000000000a4c9600 RCX: ffff8fe9ff7c1990
[ 105.608933] RDX: ffff8fe9ff7dab80 RSI: 000000000a4c95ff RDI: 0000000293250000
[ 105.608933] RBP: ffff8fe9ff7dab80 R08: fffff816c0000000 R09: 0000000000000008
[ 105.608933] R10: 0000000000000014 R11: 0000000000000014 R12: 0000000000000000
[ 105.608933] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 105.608933] FS: 0000000000000000(0000) GS:ffff8fe1ff400000(0000) knlGS:0000000000000000
[ 105.608933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 105.608933] CR2: 000000000f613000 CR3: 00000088cf20a000 CR4: 00000000000006f0
[ 105.608933] Call Trace:
[ 105.608933] set_zone_contiguous+0x56/0x70
[ 105.608933] page_alloc_init_late+0x166/0x176
[ 105.608933] kernel_init_freeable+0xfa/0x255
[ 105.608933] ? rest_init+0xaa/0xaa
[ 105.608933] kernel_init+0xa/0x106
[ 105.608933] ret_from_fork+0x35/0x40
The issue becomes visible when having a lot of memory (e.g., 4TB) assigned
to a single NUMA node - a system that can easily be created using QEMU.
Inside VMs on a hypervisor with quite some memory overcommit, this is
fairly easy to trigger.
Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux(a)gmail.com>
Reviewed-by: Baoquan He <bhe(a)redhat.com>
Reviewed-by: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Alexander Duyck <alexander.duyck(a)gmail.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/page_alloc.c~mm-page_alloc-fix-watchdog-soft-lockups-during-set_zone_contiguous
+++ a/mm/page_alloc.c
@@ -1607,6 +1607,7 @@ void set_zone_contiguous(struct zone *zo
if (!__pageblock_pfn_to_page(block_start_pfn,
block_end_pfn, zone))
return;
+ cond_resched();
}
/* We confirm that there is no hole */
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-page_alloc-fix-watchdog-soft-lockups-during-set_zone_contiguous.patch
drivers-base-memoryc-cache-memory-blocks-in-xarray-to-accelerate-lookup-fix.patch
powerpc-pseries-hotplug-memory-stop-checking-is_mem_section_removable.patch
mm-memory_hotplug-remove-is_mem_section_removable.patch
On Fri, 13 Mar 2020 at 06:41, Sowjanya Komatineni
<skomatineni(a)nvidia.com> wrote:
>
> Tegra host supports HW busy detection and timeouts based on the
> count programmed in SDHCI_TIMEOUT_CONTROL register and max busy
> timeout it supports is 11s in finite busy wait mode.
>
> Some operations like SLEEP_AWAKE, ERASE and flush cache through
> SWITCH commands take longer than 11s and Tegra host supports
> infinite HW busy wait mode where HW waits forever till the card
> is busy without HW timeout.
>
> This patch implements Tegra specific set_timeout sdhci_ops to allow
> switching between finite and infinite HW busy detection wait modes
> based on the device command expected operation time.
>
> Signed-off-by: Sowjanya Komatineni <skomatineni(a)nvidia.com>
> ---
> drivers/mmc/host/sdhci-tegra.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index a25c3a4..fa8f6a4 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -45,6 +45,7 @@
> #define SDHCI_TEGRA_CAP_OVERRIDES_DQS_TRIM_SHIFT 8
>
> #define SDHCI_TEGRA_VENDOR_MISC_CTRL 0x120
> +#define SDHCI_MISC_CTRL_ERASE_TIMEOUT_LIMIT BIT(0)
> #define SDHCI_MISC_CTRL_ENABLE_SDR104 0x8
> #define SDHCI_MISC_CTRL_ENABLE_SDR50 0x10
> #define SDHCI_MISC_CTRL_ENABLE_SDHCI_SPEC_300 0x20
> @@ -1227,6 +1228,34 @@ static u32 sdhci_tegra_cqhci_irq(struct sdhci_host *host, u32 intmask)
> return 0;
> }
>
> +static void tegra_sdhci_set_timeout(struct sdhci_host *host,
> + struct mmc_command *cmd)
> +{
> + u32 val;
> +
> + /*
> + * HW busy detection timeout is based on programmed data timeout
> + * counter and maximum supported timeout is 11s which may not be
> + * enough for long operations like cache flush, sleep awake, erase.
> + *
> + * ERASE_TIMEOUT_LIMIT bit of VENDOR_MISC_CTRL register allows
> + * host controller to wait for busy state until the card is busy
> + * without HW timeout.
> + *
> + * So, use infinite busy wait mode for operations that may take
> + * more than maximum HW busy timeout of 11s otherwise use finite
> + * busy wait mode.
> + */
> + val = sdhci_readl(host, SDHCI_TEGRA_VENDOR_MISC_CTRL);
> + if (cmd && cmd->busy_timeout >= 11 * HZ)
> + val |= SDHCI_MISC_CTRL_ERASE_TIMEOUT_LIMIT;
> + else
> + val &= ~SDHCI_MISC_CTRL_ERASE_TIMEOUT_LIMIT;
> + sdhci_writel(host, val, SDHCI_TEGRA_VENDOR_MISC_CTRL);
> +
> + __sdhci_set_timeout(host, cmd);
The kernel build for the arm and arm64 architectures failed on stable-rc
4.19 (arm), 5.4 (arm64) and 5.5 (arm64):
drivers/mmc/host/sdhci-tegra.c: In function 'tegra_sdhci_set_timeout':
drivers/mmc/host/sdhci-tegra.c:1256:2: error: implicit declaration of function '__sdhci_set_timeout'; did you mean 'tegra_sdhci_set_timeout'? [-Werror=implicit-function-declaration]
  __sdhci_set_timeout(host, cmd);
  ^~~~~~~~~~~~~~~~~~~
  tegra_sdhci_set_timeout
Full build logs:
https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.5/D…
https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.4/D…
https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-4.19/…
- Naresh
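The mode selection in the quoted patch is a read-modify-write of a single register bit. The sketch below isolates that decision as a pure function so it can be checked without hardware. `select_busy_wait_mode` and `MAX_HW_BUSY_TIMEOUT_MS` are hypothetical names, and expressing the timeout in milliseconds is an assumption of this sketch (the patch itself compares against `11 * HZ`). The real handler also reads and writes the register and then calls `__sdhci_set_timeout()`, the function the failing builds above report as undeclared.

```c
#include <stdint.h>

#define ERASE_TIMEOUT_LIMIT     (1u << 0)  /* bit 0, BIT(0) in the patch */
#define MAX_HW_BUSY_TIMEOUT_MS  11000u     /* 11 s, per the patch description */

/*
 * Given the current register value and the expected busy time of a command,
 * return the register value with the busy-wait mode bit set (infinite wait)
 * or cleared (finite wait). All other bits are preserved.
 */
static uint32_t select_busy_wait_mode(uint32_t reg, uint32_t busy_timeout_ms)
{
    if (busy_timeout_ms >= MAX_HW_BUSY_TIMEOUT_MS)
        reg |= ERASE_TIMEOUT_LIMIT;    /* operation may exceed the HW maximum */
    else
        reg &= ~ERASE_TIMEOUT_LIMIT;   /* the finite HW timeout is sufficient */
    return reg;
}
```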
Hi Linus,
Linux 5.7-rc1 reboot/poweroff hangs on an AMD Ryzen 5 PRO 2400GE
system.
I isolated the problem to the commit below.
Reverting it fixes the problem.
commit 487eca11a321ef33bcf4ca5adb3c0c4954db1b58
Author: Prike Liang <Prike.Liang(a)amd.com>
Date: Tue Apr 7 20:21:26 2020 +0800
drm/amdgpu: fix gfx hang during suspend with video playback (v2)
The system hangs during S3 suspend because the SMU is left waiting
on a GC that does not respond to the CP_HQD_ACTIVE register access
request. The root cause is accessing the GC register while in
GFX CGPG, and it can be fixed by disabling GFX CGPG before performing
suspend.
v2: Disable GFX CGPG instead of using the RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang(a)amd.com>
Reviewed-by: Huang Rui <ray.huang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
I did the bisect on Linux 5.6.5-rc1
git bisect start
# good: [0a27a29496060843ae3a8fe78aaec0062cbd5dfa] Linux 5.6.4
git bisect good 0a27a29496060843ae3a8fe78aaec0062cbd5dfa
# bad: [576aa353744ce5f1279071363e4a55e97f486f39] Linux 5.6.5-rc1
git bisect bad 576aa353744ce5f1279071363e4a55e97f486f39
# good: [7509db5d111a5763a199902052eecc480e0ec724] x86/tsc_msr: Make MSR derived TSC frequency more accurate
git bisect good 7509db5d111a5763a199902052eecc480e0ec724
# good: [15f1ead7d7966d087352ba2cf81a1759b25ad163] scsi: lpfc: Fix broken Credit Recovery after driver load
git bisect good 15f1ead7d7966d087352ba2cf81a1759b25ad163
# good: [9e52b4ab5fadd803c8c2e617aa8c151720757fb1] s390/diag: fix display of diagnose call statistics
git bisect good 9e52b4ab5fadd803c8c2e617aa8c151720757fb1
# good: [3e1e6903924fd6c95db7e46c5baa41dfa0f46fdb] powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries
git bisect good 3e1e6903924fd6c95db7e46c5baa41dfa0f46fdb
# good: [0344e0fee2f904ebce1d58bec8ace3ee9cf5f777] drm/dp_mst: Fix clearing payload state on topology disable
git bisect good 0344e0fee2f904ebce1d58bec8ace3ee9cf5f777
# bad: [136881c0420bb52d5f21f75688f5aee1cf401737] perf/core: Unify {pinned,flexible}_sched_in()
git bisect bad 136881c0420bb52d5f21f75688f5aee1cf401737
# bad: [0d928b424b99cfe0e7806c530f6039a053d5082d] drm/i915/ggtt: do not set bits 1-11 in gen12 ptes
git bisect bad 0d928b424b99cfe0e7806c530f6039a053d5082d
# bad: [a655a99c9f6a1d819d695fca6d48b450449f45ee] drm/amdgpu: fix gfx hang during suspend with video playback (v2)
git bisect bad a655a99c9f6a1d819d695fca6d48b450449f45ee
# first bad commit: [a655a99c9f6a1d819d695fca6d48b450449f45ee] drm/amdgpu: fix gfx hang during suspend with video playback (v2)
thanks,
-- Shuah
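The bisect log above narrows a known-good and a known-bad endpoint down to one commit. The search idea can be sketched as a plain binary search, assuming a strictly linear history in which every commit after the first bad one is also bad. Real histories with merges need more care, and `first_bad`, `is_bad_fn` and `example_is_bad` are hypothetical names introduced only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical test hook: returns true if the commit at this index is bad. */
typedef bool (*is_bad_fn)(size_t index);

/*
 * Commits are numbered along a linear history. "good" is known good and
 * "bad" is known bad, with good < bad. Returns the index of the first bad
 * commit, halving the candidate range on each test.
 */
static size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid))
            bad = mid;    /* regression is at mid or earlier */
        else
            good = mid;   /* regression is after mid */
    }
    return bad;
}

/* Example predicate for illustration: the regression arrives at index 37. */
static bool example_is_bad(size_t index)
{
    return index >= 37;
}
```

Each loop iteration corresponds to one build-and-test step followed by a good or bad verdict, as in the log.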