When of_find_net_device_by_node() successfully acquires a reference to a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
Cc: stable@vger.kernel.org Fixes: deff710703d8 ("net: dsa: Allow default tag protocol to be overridden from DT") Signed-off-by: Ma Ke make24@iscas.ac.cn --- Changes in v2: - simplified the patch as suggestions; - modified the Fixes tag as suggestions. --- net/dsa/dsa.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index a20efabe778f..31b409a47491 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -1247,6 +1247,7 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) struct device_node *ethernet = of_parse_phandle(dn, "ethernet", 0); const char *name = of_get_property(dn, "label", NULL); bool link = of_property_read_bool(dn, "link"); + int err = 0;
dp->dn = dn;
@@ -1260,7 +1261,11 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) return -EPROBE_DEFER;
user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL); - return dsa_port_parse_cpu(dp, conduit, user_protocol); + err = dsa_port_parse_cpu(dp, conduit, user_protocol); + if (err) + put_device(conduit); + + return err; }
if (link)
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Cc: stable@vger.kernel.org Fixes: deff710703d8 ("net: dsa: Allow default tag protocol to be overridden from DT") Signed-off-by: Ma Ke make24@iscas.ac.cn
Changes in v2:
- simplified the patch as suggestions;
- modified the Fixes tag as suggestions.
net/dsa/dsa.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index a20efabe778f..31b409a47491 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -1247,6 +1247,7 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) struct device_node *ethernet = of_parse_phandle(dn, "ethernet", 0); const char *name = of_get_property(dn, "label", NULL); bool link = of_property_read_bool(dn, "link");
- int err = 0;
dp->dn = dn; @@ -1260,7 +1261,11 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) return -EPROBE_DEFER; user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL);
return dsa_port_parse_cpu(dp, conduit, user_protocol);
err = dsa_port_parse_cpu(dp, conduit, user_protocol);if (err)put_device(conduit); }return err;if (link)
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Thank you for your thoughts on this topic. Indeed there is a problem, for which I managed to find a few hours today to investigate. I was going to just submit a patch directly and refer Ma Ke to it directly, but since you started looking into the situation as well, I just thought I'd reply "please standby". It's currently undergoing testing.
On Sun, Dec 14, 2025 at 5:10 PM Vladimir Oltean olteanv@gmail.com wrote:
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Thank you for your thoughts on this topic. Indeed there is a problem, for which I managed to find a few hours today to investigate. I was going to just submit a patch directly and refer Ma Ke to it directly, but since you started looking into the situation as well, I just thought I'd reply "please standby". It's currently undergoing testing.
A patch already, that's even better! I'll gladly stand by :)
Best regards, Jonas
On Sun, Dec 14, 2025 at 05:14:04PM +0100, Jonas Gorski wrote:
On Sun, Dec 14, 2025 at 5:10 PM Vladimir Oltean olteanv@gmail.com wrote:
Hi Jonas, Ma Ke,
On Sun, Dec 14, 2025 at 05:02:33PM +0100, Jonas Gorski wrote:
Hi,
On 12/14/25 14:12, Ma Ke wrote:
When of_find_net_device_by_node() successfully acquires a reference to
Your subject is missing the () of dsa_port_parse_of()
a network device but the subsequent call to dsa_port_parse_cpu() fails, dsa_port_parse_of() returns without releasing the reference count on the network device.
of_find_net_device_by_node() increments the reference count of the returned structure, which should be balanced with a corresponding put_device() when the reference is no longer needed.
Found by code review.
I agree with the reference not being properly released on failure, but I don't think this fix is complete.
I was trying to figure out where the put_device() would happen in the success case (or on removal), and I failed to find it.
Also if the (indirect) top caller of dsa_port_parse_of(), dsa_switch_probe(), fails at a later place the reference won't be released either.
The only explicit put_device() that happens is in dsa_dev_to_net_device(), which seems to convert a device reference to a netdev reference via dev_hold().
But the only caller of that, dsa_port_parse() immediately calls dev_put() on it, essentially dropping all references, and then continuing using it.
dsa_switch_shutdown() talks about dropping references taken via netdev_upper_dev_link(), but AFAICT this happens only after dsa_port_parse{,_of}() setup the conduit, so it looks like there could be a window without any reference held onto the conduit.
So AFAICT the current state is:
dsa_port_parse_of() keeps the device reference. dsa_port_parse() drops the device reference, and shortly has a dev_hold(), but it does not extend beyond the function.
Therefore if my analysis is correct (which it may very well not be), the correct fix(es) here could be:
dsa_port_parse{,_of}() should keep a reference via e.g. dev_hold() on success to the conduit.
Or maybe they should unconditionally drop if *after* calling dsa_port_parse_cpu(), and dsa_port_parse_cpu() should take one when assigning dsa_port::conduit.
Regardless, the end result should be that there is a reference on the conduit stored in dsa_port::conduit.
dsa_switch_release_ports() should drop the references, as this seems to be called in all error paths of dsa_port_parse{,of} as well by dsa_switch_remove().
And maybe dsa_switch_shutdown() then also needs to drop the reference? Though it may need to then retake the reference on resume, and I don't know where that exactly should happen. Maybe it should also lookup the conduit(s) again to be correct.
But here I'm more doing educated guesses then actually knowing what's correct.
The alternative/quick "fix" would be to just drop the reference unconditionally, which would align the behaviour to that of dsa_port_parse(). Not sure if it should mirror the dev_hold() / dev_put() spiel as well.
Not that I think this would be the correct behaviour though.
Sorry for the lengthy review/train of thought.
Best regards, Jonas
Thank you for your thoughts on this topic. Indeed there is a problem, for which I managed to find a few hours today to investigate. I was going to just submit a patch directly and refer Ma Ke to it directly, but since you started looking into the situation as well, I just thought I'd reply "please standby". It's currently undergoing testing.
A patch already, that's even better! I'll gladly stand by :)
Best regards, Jonas
Patch location (for tracking purposes): https://lore.kernel.org/netdev/20251214182449.3900190-1-vladimir.oltean@nxp....
Ma Ke's patch unfortunately does not compile even in v2, so:
pw-bot: cr
(although "changes requested" is a figure of speech here, in the sense that I don't expect a follow-up patch from Ma Ke)
linux-stable-mirror@lists.linaro.org