It's true we can't resume the device from poll workers in
nouveau_connector_detect(). We can however, prevent the autosuspend
timer from elapsing immediately if it hasn't already without risking any
sort of deadlock with the runtime suspend/resume operations. So do that
instead of entirely avoiding grabbing a power reference.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_connector.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 2a45b4c2ceb0..010d6db14cba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -572,12 +572,16 @@ nouveau_connector_detect(struct drm_connector *connector, bool force)
nv_connector->edid = NULL;
}
- /* Outputs are only polled while runtime active, so acquiring a
- * runtime PM ref here is unnecessary (and would deadlock upon
- * runtime suspend because it waits for polling to finish).
+ /* Outputs are only polled while runtime active, so resuming the
+ * device here is unnecessary (and would deadlock upon runtime suspend
+ * because it waits for polling to finish). We do however, want to
+ * prevent the autosuspend timer from elapsing during this operation
+ * if possible.
*/
- if (!drm_kms_helper_is_poll_worker()) {
- ret = pm_runtime_get_sync(connector->dev->dev);
+ if (drm_kms_helper_is_poll_worker()) {
+ pm_runtime_get_noresume(dev->dev);
+ } else {
+ ret = pm_runtime_get_sync(dev->dev);
if (ret < 0 && ret != -EACCES)
return conn_status;
}
@@ -655,10 +659,8 @@ nouveau_connector_detect(struct drm_connector *connector, bool force)
out:
- if (!drm_kms_helper_is_poll_worker()) {
- pm_runtime_mark_last_busy(connector->dev->dev);
- pm_runtime_put_autosuspend(connector->dev->dev);
- }
+ pm_runtime_mark_last_busy(dev->dev);
+ pm_runtime_put_autosuspend(dev->dev);
return conn_status;
}
--
2.17.1
This removes the potential of deadlocking with fb_helper entirely by
preventing it from handling hotplugs during the runtime suspend process
as early as possible in the suspend process. If it turns out this is not
possible, due to some fb_helper action having been queued up before we
got a time to disable hotplugging, we simply return -EBUSY so that the
runtime PM core attempts autosuspending the device again once fb_helper
isn't doing anything.
This fixes one of the issues causing deadlocks on runtime suspend/resume
with nouveau on my P50.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 8 ++++++++
drivers/gpu/drm/nouveau/nouveau_fbcon.c | 1 +
2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ee2546db09c9..d47cb5b2af98 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -836,6 +836,14 @@ nouveau_pmops_runtime_suspend(struct device *dev)
return -EBUSY;
}
+ /* There's no way for us to stop fb_helper work in reaction to
+ * hotplugs later in the RPM process. First off: we don't want to,
+ * fb_helper should be able to keep the GPU awake. Second off: it is
+ * capable of grabbing basically any lock in existence.
+ */
+ if (!drm_fb_helper_suspend_hotplug(drm_dev->fb_helper))
+ return -EBUSY;
+
nouveau_switcheroo_optimus_dsm();
ret = nouveau_do_suspend(drm_dev, true);
pci_save_state(pdev);
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 85c1f10bc2b6..963ba630fd04 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -466,6 +466,7 @@ nouveau_fbcon_set_suspend_work(struct work_struct *work)
console_unlock();
if (state == FBINFO_STATE_RUNNING) {
+ drm_fb_helper_resume_hotplug(drm->dev->fb_helper);
pm_runtime_mark_last_busy(drm->dev->dev);
pm_runtime_put_sync(drm->dev->dev);
}
--
2.17.1
Turns out this part is my fault for not noticing when reviewing
9a2eba337cace ("drm/nouveau: Fix drm poll_helper handling"). Currently
we call drm_kms_helper_poll_enable() from nouveau_display_hpd_work().
This makes basically no sense however, because that means we're calling
drm_kms_helper_poll_enable() every time we schedule the hotplug
detection work. This is also against the advice mentioned in
drm_kms_helper_poll_enable()'s documentation:
Note that calls to enable and disable polling must be strictly ordered,
which is automatically the case when they're only call from
suspend/resume callbacks.
Of course, hotplugs can't really be ordered. They could even happen
immediately after we called drm_kms_helper_poll_disable() in
nouveau_display_fini(), which can lead to all sorts of issues.
Additionally; enabling polling /after/ we call
drm_helper_hpd_irq_event() could also mean that we'd miss a hotplug
event anyway, since drm_helper_hpd_irq_event() wouldn't bother trying to
probe connectors so long as polling is disabled.
So; simply move this back into nouveau_display_init() again. The race
condition that both of these patches attempted to work around has
already been fixed properly in
d61a5c106351 ("drm/nouveau: Fix deadlock on runtime suspend")
Fixes: 9a2eba337cace ("drm/nouveau: Fix drm poll_helper handling")
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Peter Ujfalusi <peter.ujfalusi(a)ti.com>
Cc: stable(a)vger.kernel.org
---
drivers/gpu/drm/nouveau/nouveau_display.c | 7 +++++--
drivers/gpu/drm/nouveau/nouveau_drm.c | 1 -
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index ec7861457b84..1d36ab5d4796 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -355,8 +355,6 @@ nouveau_display_hpd_work(struct work_struct *work)
pm_runtime_get_sync(drm->dev->dev);
drm_helper_hpd_irq_event(drm->dev);
- /* enable polling for external displays */
- drm_kms_helper_poll_enable(drm->dev);
pm_runtime_mark_last_busy(drm->dev->dev);
pm_runtime_put_sync(drm->dev->dev);
@@ -411,6 +409,11 @@ nouveau_display_init(struct drm_device *dev)
if (ret)
return ret;
+ /* enable connector detection and polling for connectors without HPD
+ * support
+ */
+ drm_kms_helper_poll_enable(dev);
+
/* enable hotplug interrupts */
drm_connector_list_iter_begin(dev, &conn_iter);
nouveau_for_each_non_mst_connector_iter(connector, &conn_iter) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index c7ec86d6c3c9..5fdc1fbe2ee5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -835,7 +835,6 @@ nouveau_pmops_runtime_suspend(struct device *dev)
return -EBUSY;
}
- drm_kms_helper_poll_disable(drm_dev);
nouveau_switcheroo_optimus_dsm();
ret = nouveau_do_suspend(drm_dev, true);
pci_save_state(pdev);
--
2.17.1
Hi Greg,
This was missing in 4.14-stable. Though marked for stable, but not sure
if it matches the stable rules. Please apply to your queue if it does.
--
Regards
Sudip
Hi!
Seems the .sign-files for 4.9.116 and 4.14.59 are missing in
https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/ - so my script
complained.
So long!
Rainer Fiebig
--
The truth always turns out to be simpler than you thought.
Richard Feynman
On Sat, Jul 28, 2018 at 1:11 AM <gregkh(a)linuxfoundation.org> wrote:
>
>
> This is a note to let you know that I've just added the patch titled
>
> kvm, mm: account shadow page tables to kmemcg
>
> to the 4.4-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> kvm-mm-account-shadow-page-tables-to-kmemcg.patch
> and it can be found in the queue-4.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
Hi Greg, this patch requires some more changes to be effective on 4.4
kernel as kmem charging is still not in the generic page allocator
code path in 4.4.
Shakeel
>
> From d97e5e6160c0e0a23963ec198c7cb1c69e6bf9e8 Mon Sep 17 00:00:00 2001
> From: Shakeel Butt <shakeelb(a)google.com>
> Date: Thu, 26 Jul 2018 16:37:45 -0700
> Subject: kvm, mm: account shadow page tables to kmemcg
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> From: Shakeel Butt <shakeelb(a)google.com>
>
> commit d97e5e6160c0e0a23963ec198c7cb1c69e6bf9e8 upstream.
>
> The size of kvm's shadow page tables corresponds to the size of the
> guest virtual machines on the system. Large VMs can spend a significant
> amount of memory as shadow page tables which can not be left as system
> memory overhead. So, account shadow page tables to the kmemcg.
>
> [shakeelb(a)google.com: replace (GFP_KERNEL|__GFP_ACCOUNT) with GFP_KERNEL_ACCOUNT]
> Link: http://lkml.kernel.org/r/20180629140224.205849-1-shakeelb@google.com
> Link: http://lkml.kernel.org/r/20180627181349.149778-1-shakeelb@google.com
> Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
> Cc: Michal Hocko <mhocko(a)kernel.org>
> Cc: Johannes Weiner <hannes(a)cmpxchg.org>
> Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
> Cc: Paolo Bonzini <pbonzini(a)redhat.com>
> Cc: Greg Thelen <gthelen(a)google.com>
> Cc: Radim Krčmář <rkrcmar(a)redhat.com>
> Cc: Peter Feiner <pfeiner(a)google.com>
> Cc: <stable(a)vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>
> ---
> arch/x86/kvm/mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -692,7 +692,7 @@ static int mmu_topup_memory_cache_page(s
> if (cache->nobjs >= min)
> return 0;
> while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
> - page = (void *)__get_free_page(GFP_KERNEL);
> + page = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);
> if (!page)
> return -ENOMEM;
> cache->objects[cache->nobjs++] = page;
>
>
> Patches currently in stable-queue which might be from shakeelb(a)google.com are
>
> queue-4.4/kvm-mm-account-shadow-page-tables-to-kmemcg.patch
Greg,
referring to 4.4 kernels which I build and use on the legacy Zaurus
handhelds, I still carry these fixes which I tested but sent too late
for the 4.4 merge window.
e5b7d71aa5b3 ASoC: pxa: Fix module autoload for platform drivers
I think this fix could be backported to 4.4-stable. Please review.
Thanks
Andrea