Disabling HPD polling from i915_hpd_poll_init_work() involves probing all display connectors explicitly to account for lost hotplug interrupts. On some platforms (mostly pre-ICL) with HDMI connectors the I2C EDID bit-banging using udelay() triggers in turn the
workqueue: i915_hpd_poll_init_work [i915] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
warning.
Fix the above by scheduling i915_hpd_poll_init_work() on a WQ_UNBOUND workqueue. It's ok to use a system WQ, since i915_hpd_poll_init_work() is properly flushed in intel_hpd_cancel_work().
The connector probing from drm_mode_config::output_poll_work resulting in the same warning is fixed by the next patch.
Cc: Tejun Heo tj@kernel.org Cc: Heiner Kallweit hkallweit1@gmail.com CC: stable@vger.kernel.org # 6.5 Suggested-by: Tejun Heo tj@kernel.org Suggested-by: Heiner Kallweit hkallweit1@gmail.com Reported-by: Heiner Kallweit hkallweit1@gmail.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9245 Link: https://lore.kernel.org/all/f7e21caa-e98d-e5b5-932a-fe12d27fde9b@gmail.com Signed-off-by: Imre Deak imre.deak@intel.com --- drivers/gpu/drm/i915/display/intel_hotplug.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c index e8562f6f8bb44..accc2fec562a0 100644 --- a/drivers/gpu/drm/i915/display/intel_hotplug.c +++ b/drivers/gpu/drm/i915/display/intel_hotplug.c @@ -774,7 +774,7 @@ void intel_hpd_poll_enable(struct drm_i915_private *dev_priv) * As well, there's no issue if we race here since we always reschedule * this worker anyway */ - queue_work(dev_priv->unordered_wq, + queue_work(system_unbound_wq, &dev_priv->display.hotplug.poll_init_work); }
@@ -803,7 +803,7 @@ void intel_hpd_poll_disable(struct drm_i915_private *dev_priv) return;
WRITE_ONCE(dev_priv->display.hotplug.poll_enabled, false); - queue_work(dev_priv->unordered_wq, + queue_work(system_unbound_wq, &dev_priv->display.hotplug.poll_init_work); }
On some i915 platforms at least the HPD poll work involves I2C bit-banging using udelay()s to probe for monitor EDIDs. This in turn may trigger the
workqueue: output_poll_execute [drm_kms_helper] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
warning. Fix this by scheduling drm_mode_config::output_poll_work on a WQ_UNBOUND workqueue.
Cc: Tejun Heo tj@kernel.org Cc: Heiner Kallweit hkallweit1@gmail.com CC: stable@vger.kernel.org # 6.5 Cc: dri-devel@lists.freedesktop.org Suggested-by: Tejun Heo tj@kernel.org Suggested-by: Heiner Kallweit hkallweit1@gmail.com Reported-by: Heiner Kallweit hkallweit1@gmail.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9245 Link: https://lore.kernel.org/all/f7e21caa-e98d-e5b5-932a-fe12d27fde9b@gmail.com Signed-off-by: Imre Deak imre.deak@intel.com --- drivers/gpu/drm/drm_probe_helper.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c index 3f479483d7d80..72eac0cd25e74 100644 --- a/drivers/gpu/drm/drm_probe_helper.c +++ b/drivers/gpu/drm/drm_probe_helper.c @@ -279,7 +279,8 @@ static void reschedule_output_poll_work(struct drm_device *dev) */ delay = HZ;
- schedule_delayed_work(&dev->mode_config.output_poll_work, delay); + queue_delayed_work(system_unbound_wq, + &dev->mode_config.output_poll_work, delay); }
/** @@ -614,7 +615,7 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector, */ dev->mode_config.delayed_event = true; if (dev->mode_config.poll_enabled) - mod_delayed_work(system_wq, + mod_delayed_work(system_unbound_wq, &dev->mode_config.output_poll_work, 0); } @@ -838,7 +839,8 @@ static void output_poll_execute(struct work_struct *work) drm_kms_helper_hotplug_event(dev);
if (repoll) - schedule_delayed_work(delayed_work, DRM_OUTPUT_POLL_PERIOD); + queue_delayed_work(system_unbound_wq, + delayed_work, DRM_OUTPUT_POLL_PERIOD); }
/**
+Jani
On Fri, Sep 01, 2023 at 05:04:02PM +0300, Imre Deak wrote:
Disabling HPD polling from i915_hpd_poll_init_work() involves probing all display connectors explicitly to account for lost hotplug interrupts. On some platforms (mostly pre-ICL) with HDMI connectors the I2C EDID bit-banging using udelay() triggers in turn the
workqueue: i915_hpd_poll_init_work [i915] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
warning.
Fix the above by scheduling i915_hpd_poll_init_work() on a WQ_UNBOUND workqueue. It's ok to use a system WQ, since i915_hpd_poll_init_work() is properly flushed in intel_hpd_cancel_work().
The connector probing from drm_mode_config::output_poll_work resulting in the same warning is fixed by the next patch.
Cc: Tejun Heo tj@kernel.org Cc: Heiner Kallweit hkallweit1@gmail.com CC: stable@vger.kernel.org # 6.5 Suggested-by: Tejun Heo tj@kernel.org Suggested-by: Heiner Kallweit hkallweit1@gmail.com Reported-by: Heiner Kallweit hkallweit1@gmail.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9245 Link: https://lore.kernel.org/all/f7e21caa-e98d-e5b5-932a-fe12d27fde9b@gmail.com Signed-off-by: Imre Deak imre.deak@intel.com
drivers/gpu/drm/i915/display/intel_hotplug.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c index e8562f6f8bb44..accc2fec562a0 100644 --- a/drivers/gpu/drm/i915/display/intel_hotplug.c +++ b/drivers/gpu/drm/i915/display/intel_hotplug.c @@ -774,7 +774,7 @@ void intel_hpd_poll_enable(struct drm_i915_private *dev_priv) * As well, there's no issue if we race here since we always reschedule * this worker anyway */
- queue_work(dev_priv->unordered_wq,
- queue_work(system_unbound_wq, &dev_priv->display.hotplug.poll_init_work);
}
@@ -803,7 +803,7 @@ void intel_hpd_poll_disable(struct drm_i915_private *dev_priv) return;
WRITE_ONCE(dev_priv->display.hotplug.poll_enabled, false);
- queue_work(dev_priv->unordered_wq,
- queue_work(system_unbound_wq,
This 1y+ patch doesn't apply anymore and now we also have xe to account for. I'm reviving this since we are unifying the kernel config in CI and now several machines testing i915 start to hit this warning.
Looking at the current code for xe we have:
drivers/gpu/drm/xe/xe_device_types.h:
/** @unordered_wq: used to serialize unordered work, mostly display */ struct workqueue_struct *unordered_wq;
... which is, actually, just display.
Jani, should we move this wq to display where it belongs, with the right flags, rather than queueing it on system_unbound_wq?
Lucas De Marchi
&dev_priv->display.hotplug.poll_init_work);
}
-- 2.37.2
On Mon, 28 Oct 2024, Lucas De Marchi lucas.demarchi@intel.com wrote:
+Jani
On Fri, Sep 01, 2023 at 05:04:02PM +0300, Imre Deak wrote:
Disabling HPD polling from i915_hpd_poll_init_work() involves probing all display connectors explicitly to account for lost hotplug interrupts. On some platforms (mostly pre-ICL) with HDMI connectors the I2C EDID bit-banging using udelay() triggers in turn the
workqueue: i915_hpd_poll_init_work [i915] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
warning.
Fix the above by scheduling i915_hpd_poll_init_work() on a WQ_UNBOUND workqueue. It's ok to use a system WQ, since i915_hpd_poll_init_work() is properly flushed in intel_hpd_cancel_work().
The connector probing from drm_mode_config::output_poll_work resulting in the same warning is fixed by the next patch.
Cc: Tejun Heo tj@kernel.org Cc: Heiner Kallweit hkallweit1@gmail.com CC: stable@vger.kernel.org # 6.5 Suggested-by: Tejun Heo tj@kernel.org Suggested-by: Heiner Kallweit hkallweit1@gmail.com Reported-by: Heiner Kallweit hkallweit1@gmail.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9245 Link: https://lore.kernel.org/all/f7e21caa-e98d-e5b5-932a-fe12d27fde9b@gmail.com Signed-off-by: Imre Deak imre.deak@intel.com
drivers/gpu/drm/i915/display/intel_hotplug.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c index e8562f6f8bb44..accc2fec562a0 100644 --- a/drivers/gpu/drm/i915/display/intel_hotplug.c +++ b/drivers/gpu/drm/i915/display/intel_hotplug.c @@ -774,7 +774,7 @@ void intel_hpd_poll_enable(struct drm_i915_private *dev_priv) * As well, there's no issue if we race here since we always reschedule * this worker anyway */
- queue_work(dev_priv->unordered_wq,
- queue_work(system_unbound_wq, &dev_priv->display.hotplug.poll_init_work);
}
@@ -803,7 +803,7 @@ void intel_hpd_poll_disable(struct drm_i915_private *dev_priv) return;
WRITE_ONCE(dev_priv->display.hotplug.poll_enabled, false);
- queue_work(dev_priv->unordered_wq,
- queue_work(system_unbound_wq,
This 1y+ patch doesn't apply anymore and now we also have xe to account for. I'm reviving this since we are unifying the kernel config in CI and now several machines testing i915 start to hit this warning.
Looking at the current code for xe we have:
drivers/gpu/drm/xe/xe_device_types.h:
/** @unordered_wq: used to serialize unordered work, mostly display */ struct workqueue_struct *unordered_wq;
... which is, actually, just display.
Jani, should we move this wq to display where it belongs, with the right flags, rather than queueing it on system_unbound_wq?
I think the general answer is:
1. Never use i915 or xe core workqueues in display code.
2. Use system workqueues where appropriate for display, and cancel the individual work where needed. While there are legitimate uses for the dedicated workqueues, I'm guessing a large portion of their use may be cargo-culting and/or the result of system wq flush going away, and not based on any real analysis.
3. For display stuff, add the minimal necessary workqueues in display code, and use them where appropriate. Also cancel the individual work where needed, and flush the entire workqueue only when really necessary. The entire workqueue flushing is also cargo-culting and/or the result of system wq flush going away.
4. Finally move the display wq init/cleanup to display code.
Now, I don't know what the answer in this particular case, or many of the other cases, is. And that's probably one of the reasons we have the status quo. :(
BR, Jani.
On Mon, Oct 28, 2024 at 04:49:21PM +0200, Jani Nikula wrote:
On Mon, 28 Oct 2024, Lucas De Marchi lucas.demarchi@intel.com wrote:
+Jani
On Fri, Sep 01, 2023 at 05:04:02PM +0300, Imre Deak wrote:
Disabling HPD polling from i915_hpd_poll_init_work() involves probing all display connectors explicitly to account for lost hotplug interrupts. On some platforms (mostly pre-ICL) with HDMI connectors the I2C EDID bit-banging using udelay() triggers in turn the
workqueue: i915_hpd_poll_init_work [i915] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
warning.
Fix the above by scheduling i915_hpd_poll_init_work() on a WQ_UNBOUND workqueue. It's ok to use a system WQ, since i915_hpd_poll_init_work() is properly flushed in intel_hpd_cancel_work().
The connector probing from drm_mode_config::output_poll_work resulting in the same warning is fixed by the next patch.
Cc: Tejun Heo tj@kernel.org Cc: Heiner Kallweit hkallweit1@gmail.com CC: stable@vger.kernel.org # 6.5 Suggested-by: Tejun Heo tj@kernel.org Suggested-by: Heiner Kallweit hkallweit1@gmail.com Reported-by: Heiner Kallweit hkallweit1@gmail.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9245 Link: https://lore.kernel.org/all/f7e21caa-e98d-e5b5-932a-fe12d27fde9b@gmail.com Signed-off-by: Imre Deak imre.deak@intel.com
drivers/gpu/drm/i915/display/intel_hotplug.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c index e8562f6f8bb44..accc2fec562a0 100644 --- a/drivers/gpu/drm/i915/display/intel_hotplug.c +++ b/drivers/gpu/drm/i915/display/intel_hotplug.c @@ -774,7 +774,7 @@ void intel_hpd_poll_enable(struct drm_i915_private *dev_priv) * As well, there's no issue if we race here since we always reschedule * this worker anyway */
- queue_work(dev_priv->unordered_wq,
- queue_work(system_unbound_wq, &dev_priv->display.hotplug.poll_init_work);
}
@@ -803,7 +803,7 @@ void intel_hpd_poll_disable(struct drm_i915_private *dev_priv) return;
WRITE_ONCE(dev_priv->display.hotplug.poll_enabled, false);
- queue_work(dev_priv->unordered_wq,
- queue_work(system_unbound_wq,
This 1y+ patch doesn't apply anymore and now we also have xe to account for. I'm reviving this since we are unifying the kernel config in CI and now several machines testing i915 start to hit this warning.
Looking at the current code for xe we have:
drivers/gpu/drm/xe/xe_device_types.h:
/** @unordered_wq: used to serialize unordered work, mostly display */ struct workqueue_struct *unordered_wq;
... which is, actually, just display.
Jani, should we move this wq to display where it belongs, with the right flags, rather than queueing it on system_unbound_wq?
I think the general answer is:
Never use i915 or xe core workqueues in display code.
Use system workqueues where appropriate for display, and cancel the individual work where needed. While there are legitimate uses for the dedicated workqueues, I'm guessing a large portion of their use may be cargo-culting and/or the result of system wq flush going away, and not based on any real analysis.
For display stuff, add the minimal necessary workqueues in display code, and use them where appropriate. Also cancel the individual work where needed, and flush the entire workqueue only when really necessary. The entire workqueue flushing is also cargo-culting and/or the result of system wq flush going away.
Finally move the display wq init/cleanup to display code.
Now, I don't know what the answer in this particular case, or many of the other cases, is. And that's probably one of the reasons we have the status quo. :(
The work scheduled for hotplug events (like the above poll_init_work) are flushed explicitly, so I think the system workqueues should be used for those (as opposed to driver specific workqueues).
BR, Jani.
-- Jani Nikula, Intel
linux-stable-mirror@lists.linaro.org