Hello Johan,
On Mon, 6 Feb 2023 09:09:50 +0100 Johan Hovold johan@kernel.org wrote:
On Fri, Feb 03, 2023 at 05:01:21PM +0100, Luca Ceresoli wrote:
Hello Johan,
On Wed, 1 Feb 2023 11:15:40 +0100 Johan Hovold johan+linaro@kernel.org wrote:
The current interconnect provider registration interface is inherently racy as nodes are not added until after adding the provider. This can specifically cause racing DT lookups to fail.
Switch to using the new API where the provider is not registered until after it has been fully initialised.
Fixes: f0d8048525d7 ("interconnect: Add imx core driver")
Cc: stable@vger.kernel.org # 5.8
Cc: Leonard Crestez leonard.crestez@nxp.com
Cc: Alexandre Bailon abailon@baylibre.com
Signed-off-by: Johan Hovold johan+linaro@kernel.org
Georgi pointed me to this series after I reported a bug yesterday [0], that I found on iMX8MP. So I ran some tests with my original, failing tree, minus one patch with my debugging code to hunt for the bug, plus patches 1-4 of this series.
The original code was failing roughly 5-10% of the time. With your 4 patches applied it ran 139 times with zero errors, which looks great! I won't be able to do more testing until next Monday to be extra sure.
Thanks for testing.
It indeed looks like you're hitting the same race, and as the imx interconnect driver also initialises the provider data num_nodes count before adding the nodes, it results in that NULL-deref (where the qcom driver failed a bit more gracefully).
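To illustrate the ordering problem described above, here is a minimal userspace sketch in C. It is a simplified model and not the kernel code: the names `struct provider`, `lookup()`, `racy_setup()` and `safe_setup()` are hypothetical, and the "racing lookup" is simulated by calling `lookup()` at the point where a concurrent DT lookup could land.

```c
#include <stddef.h>

#define NUM_NODES 3

/* Hypothetical stand-ins for the provider and its nodes. */
struct node { int id; };

struct provider {
	struct node *nodes[NUM_NODES]; /* slots stay NULL until a node is added */
	size_t num_nodes;              /* count advertised to lookups */
	int registered;                /* nonzero once lookups can see the provider */
};

enum lookup_status { LOOKUP_OK, LOOKUP_DEFER, LOOKUP_NULL_NODE };

static struct node pool[NUM_NODES];

/* Models a lookup by index against a provider that may be half set up. */
static enum lookup_status lookup(const struct provider *p, size_t idx)
{
	if (!p->registered || idx >= p->num_nodes)
		return LOOKUP_DEFER;     /* provider not usable yet, caller can retry */
	if (!p->nodes[idx])
		return LOOKUP_NULL_NODE; /* a real caller would dereference NULL here */
	return LOOKUP_OK;
}

/* Old ordering: provider visible and count set before the nodes exist. */
static enum lookup_status racy_setup(struct provider *p)
{
	enum lookup_status seen;
	size_t i;

	p->registered = 1;
	p->num_nodes = NUM_NODES;
	seen = lookup(p, 0);             /* simulated racing lookup */
	for (i = 0; i < NUM_NODES; i++)
		p->nodes[i] = &pool[i];
	return seen;
}

/* New ordering: fully initialise first, make the provider visible last. */
static enum lookup_status safe_setup(struct provider *p)
{
	enum lookup_status seen;
	size_t i;

	for (i = 0; i < NUM_NODES; i++)
		p->nodes[i] = &pool[i];
	p->num_nodes = NUM_NODES;
	seen = lookup(p, 0);             /* simulated racing lookup */
	p->registered = 1;
	return seen;
}
```

In this model the racing lookup in `racy_setup()` sees a valid index with an empty slot, which corresponds to the NULL-deref case, while the same lookup in `safe_setup()` simply finds no provider yet and can be retried later.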
My v6.2-rc5 tree with patches 1 to 4 added has booted 590 times with 0 errors, which adds to the 139 times on Friday. This definitely deserves my:
Tested-by: Luca Ceresoli luca.ceresoli@bootlin.com