On Wed, 2019-04-24 at 13:59 -0500, Bjorn Helgaas wrote:
Not being a scheduled work expert, I was unsure if this experiment was equivalent to what I proposed.
I'm always suspicious of singleton solutions like this (using schedule_work() in runtime_resume()) because usually they seem to be solving a generic problem that should happen on many kinds of hardware. The 0b2fe6594fa2 ("drm/nouveau: Queue hpd_work on (runtime) resume") commit log says:
We need to call drm_helper_hpd_irq_event() on resume to properly detect monitor connection / disconnection on some laptops, use hpd_work for this to avoid deadlocks.
The situation of a monitor being connected or disconnected during suspend can happen to *any* GPU, but the commit only changes nouveau, which of course raises the question of how we deal with that in other drivers. If the Nvidia GPU has some unique behavior related to monitor connection, that would explain special-case code there, but the commit doesn't mention anything like that.
It should be simple to revert 0b2fe6594fa2 and see whether it changes the behavior at all (well, simple except for the fact that this problem isn't 100% reproducible in the first place).
It's not 100% reproducible, but it's at least 90% reproducible, so it's not difficult for me to test at all.
Reverting this commit makes no difference either. Note that while that commit only changed nouveau, schedule_work() is exactly how a number of other drivers (i915 for instance) handle reprobing like this as well. The reason is that we can't do full connector reprobing in our runtime resume thread: we could deadlock if someone else is holding a modesetting lock we need and is waiting on us to resume at the same time (there are a number of other bug fixes in nouveau for other issues caused by the same deadlock scenario).
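To make the deadlock scenario above concrete, here is a rough userspace sketch in Python. It is only an analogy, not nouveau or kernel code: modeset_lock, runtime_resume(), reprobe_connectors() and the queue are all hypothetical stand-ins, with a queue.Queue playing the role that schedule_work() plays in the driver.

```python
import queue
import threading


def run_demo():
    """Userspace analogy only: every name here is a hypothetical stand-in."""
    modeset_lock = threading.Lock()   # stands in for a modesetting lock
    lock_taken = threading.Event()    # lets the demo order its threads
    resumed = threading.Event()       # set once "runtime resume" is done
    work_queue = queue.Queue()        # stands in for the kernel workqueue
    log = []

    def reprobe_connectors():
        # Needs the modesetting lock, so it must not run inside resume.
        with modeset_lock:
            log.append("reprobe")

    def runtime_resume():
        # Taking modeset_lock here would deadlock: modeset_user() holds it
        # while waiting for us to finish. Queue the reprobe instead.
        work_queue.put(reprobe_connectors)
        log.append("resumed")
        resumed.set()

    def modeset_user():
        # Holds the lock and waits for the device to resume.
        with modeset_lock:
            lock_taken.set()
            resumed.wait()
            log.append("modeset")

    def worker():
        # Runs the deferred work outside the resume path.
        work_queue.get()()

    threads = [threading.Thread(target=modeset_user),
               threading.Thread(target=worker)]
    threads[0].start()
    lock_taken.wait()      # make sure the lock is held before resuming
    threads[1].start()
    runtime_resume()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    print(run_demo())
```

If runtime_resume() took modeset_lock directly instead of queueing the reprobe, it would block forever against modeset_user(), which is the situation the deferred work is meant to avoid.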
I'm confused here though: it sounds like you're working under the assumption that PCI devices like this aren't reset into a clean state during a system reboot. Is that correct?
Do we want to have this discussion on the bz btw, or is this email thread fine?
Email is fine.