This patch series is to fix bugs for below APIs:
devm_phy_put() devm_of_phy_provider_unregister() devm_phy_destroy() phy_get() of_phy_get() devm_phy_get() devm_of_phy_get() devm_of_phy_get_by_index()
And simplify below API:
of_phy_simple_xlate().
Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com --- Changes in v2: - Correct title, commit message, and inline comments. - Link to v1: https://lore.kernel.org/r/20241020-phy_core_fix-v1-0-078062f7da71@quicinc.co...
--- Zijun Hu (6): phy: core: Fix that API devm_phy_put() fails to release the phy phy: core: Fix that API devm_of_phy_provider_unregister() fails to unregister the phy provider phy: core: Fix that API devm_phy_destroy() fails to destroy the phy phy: core: Fix an OF node refcount leakage in _of_phy_get() phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup() phy: core: Simplify API of_phy_simple_xlate() implementation
drivers/phy/phy-core.c | 39 +++++++++++++++++++-------------------- 1 file changed, 19 insertions(+), 20 deletions(-) --- base-commit: e70d2677ef4088d59158739d72b67ac36d1b132b change-id: 20241020-phy_core_fix-e3ad65db98f7
Best regards,
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_phy_put(), its comment says it needs to invoke phy_put() to release the phy, but it does not invoke the function actually since devres_destroy() will not call devm_phy_release() at all which will call the function, and the missing phy_put() call will cause:
- The phy fails to be released. - devm_phy_put() can not fully undo what API devm_phy_get() does. - Leak refcount of both the module and device for below typical usage:
devm_phy_get(); // or its variant ... err = do_something(); if (err) goto err_out; ... err_out: devm_phy_put();
The file(s) affected by this issue are shown below since they have such typical usage. drivers/pci/controller/cadence/pcie-cadence.c drivers/net/ethernet/ti/am65-cpsw-nuss.c
Fixed by using devres_release() instead of devres_destroy() within the API
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Cc: Lorenzo Pieralisi lpieralisi@kernel.org Cc: "Krzysztof Wilczyński" kw@linux.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com --- drivers/phy/phy-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index f053b525ccff..f190d7126613 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -737,7 +737,7 @@ void devm_phy_put(struct device *dev, struct phy *phy) if (!phy) return;
- r = devres_destroy(dev, devm_phy_release, devm_phy_match, phy); + r = devres_release(dev, devm_phy_release, devm_phy_match, phy); dev_WARN_ONCE(dev, r, "couldn't find PHY resource\n"); } EXPORT_SYMBOL_GPL(devm_phy_put);
On Thu, Oct 24, 2024 at 10:39:26PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_phy_put(), its comment says it needs to invoke phy_put() to release the phy, but it does not invoke the function actually since devres_destroy() will not call devm_phy_release() at all which will call the function, and the missing phy_put() call will cause:
Please split the above up in at least two sentences to make it easier to parse. Split it after devm_phy_release() and rephrase the latter part (e.g. by dropping "at all which will call the function").
The phy fails to be released.
devm_phy_put() can not fully undo what API devm_phy_get() does.
Leak refcount of both the module and device for below typical usage:
devm_phy_get(); // or its variant ... err = do_something(); if (err) goto err_out; ... err_out: devm_phy_put();
The file(s) affected by this issue are shown below since they have such typical usage. drivers/pci/controller/cadence/pcie-cadence.c drivers/net/ethernet/ti/am65-cpsw-nuss.c
Fixed by using devres_release() instead of devres_destroy() within the API
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Cc: Lorenzo Pieralisi lpieralisi@kernel.org Cc: "Krzysztof Wilczyński" kw@linux.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Diff itself looks good. Nice find.
Johan
On 2024/10/29 21:40, Johan Hovold wrote:
On Thu, Oct 24, 2024 at 10:39:26PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_phy_put(), its comment says it needs to invoke phy_put() to release the phy, but it does not invoke the function actually since devres_destroy() will not call devm_phy_release() at all which will call the function, and the missing phy_put() call will cause:
Please split the above up in at least two sentences to make it easier to parse. Split it after devm_phy_release() and rephrase the latter part (e.g. by dropping "at all which will call the function").
thank you for code review. will take your suggestions and send v2 (^^).
The phy fails to be released.
devm_phy_put() can not fully undo what API devm_phy_get() does.
Leak refcount of both the module and device for below typical usage:
devm_phy_get(); // or its variant ... err = do_something(); if (err) goto err_out; ... err_out: devm_phy_put();
The file(s) affected by this issue are shown below since they have such typical usage. drivers/pci/controller/cadence/pcie-cadence.c drivers/net/ethernet/ti/am65-cpsw-nuss.c
Fixed by using devres_release() instead of devres_destroy() within the API
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Cc: Lorenzo Pieralisi lpieralisi@kernel.org Cc: "Krzysztof Wilczyński" kw@linux.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Diff itself looks good. Nice find.
Johan
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_of_phy_provider_unregister(), its comment says it needs to invoke of_phy_provider_unregister() to unregister the phy provider, but it does not invoke the function actually since devres_destroy() will not call devm_phy_provider_release() at all which will call the function, and the missing of_phy_provider_unregister() call will case:
- The phy provider fails to be unregistered. - Leak both memory and the OF node refcount.
Fixed by using devres_release() instead of devres_destroy() within the API.
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com --- drivers/phy/phy-core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index f190d7126613..de07e1616b34 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -1259,12 +1259,12 @@ EXPORT_SYMBOL_GPL(of_phy_provider_unregister); * of_phy_provider_unregister to unregister the phy provider. */ void devm_of_phy_provider_unregister(struct device *dev, - struct phy_provider *phy_provider) + struct phy_provider *phy_provider) { int r;
- r = devres_destroy(dev, devm_phy_provider_release, devm_phy_match, - phy_provider); + r = devres_release(dev, devm_phy_provider_release, devm_phy_match, + phy_provider); dev_WARN_ONCE(dev, r, "couldn't find PHY provider device resource\n"); } EXPORT_SYMBOL_GPL(devm_of_phy_provider_unregister);
On Thu, Oct 24, 2024 at 10:39:27PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_of_phy_provider_unregister(), its comment says it needs to invoke of_phy_provider_unregister() to unregister the phy provider, but it does not invoke the function actually since devres_destroy() will not call devm_phy_provider_release() at all which will call the function, and the missing of_phy_provider_unregister() call will case:
Please split this up in two sentences as well.
- The phy provider fails to be unregistered.
- Leak both memory and the OF node refcount.
Perhaps a comment about there not being any in-tree users of this API is in place here?
And you could consider dropping the function altogether as well.
Fixed by using devres_release() instead of devres_destroy() within the API.
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Looks good otherwise.
Johan
On 2024/10/29 21:43, Johan Hovold wrote:
On Thu, Oct 24, 2024 at 10:39:27PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_of_phy_provider_unregister(), its comment says it needs to invoke of_phy_provider_unregister() to unregister the phy provider, but it does not invoke the function actually since devres_destroy() will not call devm_phy_provider_release() at all which will call the function, and the missing of_phy_provider_unregister() call will case:
Please split this up in two sentences as well.
good suggestions. will do it.
- The phy provider fails to be unregistered.
- Leak both memory and the OF node refcount.
Perhaps a comment about there not being any in-tree users of this API is in place here?
okay, will do it as you suggest.
And you could consider dropping the function altogether as well.
Remove the API devm_of_phy_provider_unregister()?
i prefer fixing it instead of removing it based on below considerations.
1) it is simper. just about one line change. 2) the API may be used in future. the similar API of [PATCH 1/6] have 2 usages.
Fixed by using devres_release() instead of devres_destroy() within the API.
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Looks good otherwise.
Johan
On Tue, Oct 29, 2024 at 11:35:48PM +0800, Zijun Hu wrote:
On 2024/10/29 21:43, Johan Hovold wrote:
And you could consider dropping the function altogether as well.
Remove the API devm_of_phy_provider_unregister()?
i prefer fixing it instead of removing it based on below considerations.
- it is simper. just about one line change.
- the API may be used in future. the similar API of [PATCH 1/6] have 2
usages.
We typically don't carry APIs that no one uses (or has ever used), but sure, that can be done separately later if anyone cares.
Johan
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_phy_destroy(), its comment says it needs to invoke phy_destroy() to destroy the phy, but it does not invoke the function actually since devres_destroy() will not call devm_phy_consume() at all which will call the function, and the missing phy_destroy() call will case that the phy fails to be destroyed.
Fixed by using devres_release() instead of devres_destroy() within the API.
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com --- drivers/phy/phy-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index de07e1616b34..52ca590a58b9 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -1121,7 +1121,7 @@ void devm_phy_destroy(struct device *dev, struct phy *phy) { int r;
- r = devres_destroy(dev, devm_phy_consume, devm_phy_match, phy); + r = devres_release(dev, devm_phy_consume, devm_phy_match, phy); dev_WARN_ONCE(dev, r, "couldn't find PHY resource\n"); } EXPORT_SYMBOL_GPL(devm_phy_destroy);
On Thu, Oct 24, 2024 at 10:39:28PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_phy_destroy(), its comment says it needs to invoke phy_destroy() to destroy the phy, but it does not invoke the function actually since devres_destroy() will not call devm_phy_consume() at all which will call the function, and the missing phy_destroy() call will case that the phy fails to be destroyed.
Here too, split in at least two sentences.
Fixed by using devres_release() instead of devres_destroy() within the API.
And add a comment about there not being any in-tree users of the interface.
And consider dropping it.
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Johan
On 2024/10/29 21:45, Johan Hovold wrote:
On Thu, Oct 24, 2024 at 10:39:28PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For devm_phy_destroy(), its comment says it needs to invoke phy_destroy() to destroy the phy, but it does not invoke the function actually since devres_destroy() will not call devm_phy_consume() at all which will call the function, and the missing phy_destroy() call will case that the phy fails to be destroyed.
Here too, split in at least two sentences.
okay.
Fixed by using devres_release() instead of devres_destroy() within the API.
And add a comment about there not being any in-tree users of the interface.
okay, will do it within v2.
And consider dropping it.
okay, will follow the same action as [PATCH 2/6]
Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Johan
From: Zijun Hu quic_zijuhu@quicinc.com
It will leak refcount of OF node @args.np for _of_phy_get() not to decrease refcount increased by previous of_parse_phandle_with_args() when returns due to of_device_is_compatible() error.
Fix by adding of_node_put() before the error return.
Fixes: b7563e2796f8 ("phy: work around 'phys' references to usb-nop-xceiv devices") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com --- drivers/phy/phy-core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index 52ca590a58b9..967878b78797 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -629,8 +629,11 @@ static struct phy *_of_phy_get(struct device_node *np, int index) return ERR_PTR(-ENODEV);
/* This phy type handled by the usb-phy subsystem for now */ - if (of_device_is_compatible(args.np, "usb-nop-xceiv")) + if (of_device_is_compatible(args.np, "usb-nop-xceiv")) { + /* Put refcount above of_parse_phandle_with_args() got */ + of_node_put(args.np); return ERR_PTR(-ENODEV); + }
mutex_lock(&phy_provider_mutex); phy_provider = of_phy_provider_lookup(args.np);
On Thu, Oct 24, 2024 at 10:39:29PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
It will leak refcount of OF node @args.np for _of_phy_get() not to decrease refcount increased by previous of_parse_phandle_with_args() when returns due to of_device_is_compatible() error.
Fix by adding of_node_put() before the error return.
Fixes: b7563e2796f8 ("phy: work around 'phys' references to usb-nop-xceiv devices") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
drivers/phy/phy-core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index 52ca590a58b9..967878b78797 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -629,8 +629,11 @@ static struct phy *_of_phy_get(struct device_node *np, int index) return ERR_PTR(-ENODEV); /* This phy type handled by the usb-phy subsystem for now */
- if (of_device_is_compatible(args.np, "usb-nop-xceiv"))
- if (of_device_is_compatible(args.np, "usb-nop-xceiv")) {
/* Put refcount above of_parse_phandle_with_args() got */
No need for a comment as this is already handled in the later paths.
of_node_put(args.np);
For that reason you should probably initialise ret and add a new label out_put_node that you jump to here instead.
return ERR_PTR(-ENODEV);
- }
mutex_lock(&phy_provider_mutex); phy_provider = of_phy_provider_lookup(args.np);
Johan
On 2024/10/29 21:47, Johan Hovold wrote:
On Thu, Oct 24, 2024 at 10:39:29PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
It will leak refcount of OF node @args.np for _of_phy_get() not to decrease refcount increased by previous of_parse_phandle_with_args() when returns due to of_device_is_compatible() error.
Fix by adding of_node_put() before the error return.
Fixes: b7563e2796f8 ("phy: work around 'phys' references to usb-nop-xceiv devices") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
drivers/phy/phy-core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index 52ca590a58b9..967878b78797 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -629,8 +629,11 @@ static struct phy *_of_phy_get(struct device_node *np, int index) return ERR_PTR(-ENODEV); /* This phy type handled by the usb-phy subsystem for now */
- if (of_device_is_compatible(args.np, "usb-nop-xceiv"))
- if (of_device_is_compatible(args.np, "usb-nop-xceiv")) {
/* Put refcount above of_parse_phandle_with_args() got */
No need for a comment as this is already handled in the later paths.
will remove it within v3
of_node_put(args.np);
For that reason you should probably initialise ret and add a new label out_put_node that you jump to here instead.
will try to do it within v3.
return ERR_PTR(-ENODEV);
- }
mutex_lock(&phy_provider_mutex); phy_provider = of_phy_provider_lookup(args.np);
Johan
From: Zijun Hu quic_zijuhu@quicinc.com
For macro for_each_child_of_node(parent, child), refcount of @child has been increased before entering its loop body, so normally needs to call of_node_put(@child) before returning from the loop body to avoid refcount leakage.
of_phy_provider_lookup() has such usage but does not call of_node_put() before returning, so cause leakage of the OF node refcount.
Fixed by simply calling of_node_put() before returning from the loop body.
The APIs affected by this issue are shown below since they indirectly invoke problematic of_phy_provider_lookup(). phy_get() of_phy_get() devm_phy_get() devm_of_phy_get() devm_of_phy_get_by_index()
Fixes: 2a4c37016ca9 ("phy: core: Fix of_phy_provider_lookup to return PHY provider for sub node") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
--- The following kernel mainline commit fixes a similar issue: Commit: b337cc3ce475 ("backlight: lm3509_bl: Fix early returns in for_each_child_of_node()") --- drivers/phy/phy-core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index 967878b78797..de0264dfc387 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -145,8 +145,10 @@ static struct phy_provider *of_phy_provider_lookup(struct device_node *node) return phy_provider;
for_each_child_of_node(phy_provider->children, child) - if (child == node) + if (child == node) { + of_node_put(child); return phy_provider; + } }
return ERR_PTR(-EPROBE_DEFER);
On Thu, Oct 24, 2024 at 10:39:30PM +0800, Zijun Hu wrote:
From: Zijun Hu quic_zijuhu@quicinc.com
For macro for_each_child_of_node(parent, child), refcount of @child has been increased before entering its loop body, so normally needs to call of_node_put(@child) before returning from the loop body to avoid refcount leakage.
of_phy_provider_lookup() has such usage but does not call of_node_put() before returning, so cause leakage of the OF node refcount.
Fixed by simply calling of_node_put() before returning from the loop body.
The APIs affected by this issue are shown below since they indirectly invoke problematic of_phy_provider_lookup(). phy_get() of_phy_get() devm_phy_get() devm_of_phy_get() devm_of_phy_get_by_index()
Fixes: 2a4c37016ca9 ("phy: core: Fix of_phy_provider_lookup to return PHY provider for sub node") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com
Looks good.
Reviewed-by: Johan Hovold johan+linaro@kernel.org
From: Zijun Hu quic_zijuhu@quicinc.com
Simplify of_phy_simple_xlate() implementation by API class_find_device_by_of_node() which is also safer since it subsys_get() the class's subsystem in advance of iterating over the class's devices.
Also correct comments to mark its parameter @dev as unused instead of @args in passing.
Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com --- drivers/phy/phy-core.c | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c index de0264dfc387..b83c6aa0287d 100644 --- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -749,8 +749,8 @@ EXPORT_SYMBOL_GPL(devm_phy_put);
/** * of_phy_simple_xlate() - returns the phy instance from phy provider - * @dev: the PHY provider device - * @args: of_phandle_args (not used here) + * @dev: the PHY provider device (not used here) + * @args: of_phandle_args * * Intended to be used by phy provider for the common case where #phy-cells is * 0. For other cases where #phy-cells is greater than '0', the phy provider @@ -760,20 +760,14 @@ EXPORT_SYMBOL_GPL(devm_phy_put); struct phy *of_phy_simple_xlate(struct device *dev, const struct of_phandle_args *args) { - struct phy *phy; - struct class_dev_iter iter; - - class_dev_iter_init(&iter, &phy_class, NULL, NULL); - while ((dev = class_dev_iter_next(&iter))) { - phy = to_phy(dev); - if (args->np != phy->dev.of_node) - continue; + struct device *target_dev;
- class_dev_iter_exit(&iter); - return phy; + target_dev = class_find_device_by_of_node(&phy_class, args->np); + if (target_dev) { + put_device(target_dev); + return to_phy(target_dev); }
- class_dev_iter_exit(&iter); return ERR_PTR(-ENODEV); } EXPORT_SYMBOL_GPL(of_phy_simple_xlate);
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree. Subject: [PATCH v2 6/6] phy: core: Simplify API of_phy_simple_xlate() implementation Link: https://lore.kernel.org/stable/20241024-phy_core_fix-v2-6-fc0c63dbfcf3%40qui...
linux-stable-mirror@lists.linaro.org