From: Emanuele Ghidoli emanuele.ghidoli@toradex.com
While the DP83867 PHYs report EEE capability through their feature registers, the actual hardware does not support EEE (see Links). When the connected MAC enables EEE, it causes link instability and communication failures.
The issue is reproducible with a iMX8MP and relevant stmmac ethernet port. Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Call phy_disable_eee during phy initialization to prevent EEE from being enabled on DP83867 PHYs.
Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/14452... Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/65863... Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy") Cc: stable@vger.kernel.org Signed-off-by: Emanuele Ghidoli emanuele.ghidoli@toradex.com --- drivers/net/phy/dp83867.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c index deeefb962566..36a0c1b7f59c 100644 --- a/drivers/net/phy/dp83867.c +++ b/drivers/net/phy/dp83867.c @@ -738,6 +738,12 @@ static int dp83867_config_init(struct phy_device *phydev) return ret; }
+ /* Although the DP83867 reports EEE capability through the + * MDIO_PCS_EEE_ABLE and MDIO_AN_EEE_ADV registers, the feature + * is not actually implemented in hardware. + */ + phy_disable_eee(phydev); + if (phy_interface_is_rgmii(phydev) || phydev->interface == PHY_INTERFACE_MODE_SGMII) { val = phy_read(phydev, MII_DP83867_PHYCTRL);
On Thu, Oct 23, 2025 at 04:48:53PM +0200, Emanuele Ghidoli wrote:
From: Emanuele Ghidoli emanuele.ghidoli@toradex.com
While the DP83867 PHYs report EEE capability through their feature registers, the actual hardware does not support EEE (see Links). When the connected MAC enables EEE, it causes link instability and communication failures.
The issue is reproducible with a iMX8MP and relevant stmmac ethernet port. Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Call phy_disable_eee during phy initialization to prevent EEE from being enabled on DP83867 PHYs.
Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/14452... Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/65863...
Interesting statement this last one. "None of our gigabit PHYs support EEE operation today."
I wounder if any of the other TI PHY drivers need this fix?
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy") Cc: stable@vger.kernel.org Signed-off-by: Emanuele Ghidoli emanuele.ghidoli@toradex.com
Reviewed-by: Andrew Lunn andrew@lunn.ch
Andrew
Hello:
This patch was applied to netdev/net.git (main) by Jakub Kicinski kuba@kernel.org:
On Thu, 23 Oct 2025 16:48:53 +0200 you wrote:
From: Emanuele Ghidoli emanuele.ghidoli@toradex.com
While the DP83867 PHYs report EEE capability through their feature registers, the actual hardware does not support EEE (see Links). When the connected MAC enables EEE, it causes link instability and communication failures.
[...]
Here is the summary with links: - [v1] net: phy: dp83867: Disable EEE support as not implemented https://git.kernel.org/netdev/net/c/84a905290cb4
You are awesome, thank you!
On Thu, Oct 23, 2025 at 04:48:53PM +0200, Emanuele Ghidoli wrote:
While the DP83867 PHYs report EEE capability through their feature registers, the actual hardware does not support EEE (see Links). When the connected MAC enables EEE, it causes link instability and communication failures.
The issue is reproducible with a iMX8MP and relevant stmmac ethernet port. Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Wasn't it enabled before? See commit 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
stmmac's mac_link_up() was:
- if (phy && priv->dma_cap.eee) { - phy_eee_rx_clock_stop(phy, !(priv->plat->flags & - STMMAC_FLAG_RX_CLK_RUNS_IN_LPI)); - priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer; - stmmac_eee_init(priv, phy->enable_tx_lpi); stmmac_set_eee_pls(priv, priv->hw, true); - }
So, if EEE is enabled in the core synthesis, then EEE will be configured depending on what phylib says.
In stmmac_init_phy(), there was this:
- if (priv->dma_cap.eee) - phy_support_eee(phydev); - ret = phylink_connect_phy(priv->phylink, phydev);
So phylib was told to enable EEE support on the PHY if the dwmac core supports EEE.
So, from what I can see, converting to phylink managed EEE didn't change this. So what really did change?
On Sat, Oct 25, 2025 at 09:44:31AM +0100, Russell King (Oracle) wrote:
On Thu, Oct 23, 2025 at 04:48:53PM +0200, Emanuele Ghidoli wrote:
While the DP83867 PHYs report EEE capability through their feature registers, the actual hardware does not support EEE (see Links). When the connected MAC enables EEE, it causes link instability and communication failures.
The issue is reproducible with a iMX8MP and relevant stmmac ethernet port. Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Wasn't it enabled before? See commit 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
stmmac's mac_link_up() was:
if (phy && priv->dma_cap.eee) {phy_eee_rx_clock_stop(phy, !(priv->plat->flags &STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer;stmmac_eee_init(priv, phy->enable_tx_lpi); stmmac_set_eee_pls(priv, priv->hw, true);}So, if EEE is enabled in the core synthesis, then EEE will be configured depending on what phylib says.
In stmmac_init_phy(), there was this:
if (priv->dma_cap.eee)phy_support_eee(phydev);ret = phylink_connect_phy(priv->phylink, phydev);So phylib was told to enable EEE support on the PHY if the dwmac core supports EEE.
So, from what I can see, converting to phylink managed EEE didn't change this. So what really did change?
I suspect it is a change in board designs. iMX8MP EVB variants are using Realtek PHYs with the SmartEEE variant. Therefore, the MAC is not able to control LPI behavior. Designs based on the EVB design (with the Realtek PHY) are not affected. I mean, any bug on the MAC or software side will stay invisible.
Some new designs with special requirements for TSN, for example low-latency TI PHYs, are a different story. They promise "Extra low latency TX < 90ns, RX < 290ns" and also announce EEE support. These two promises are not compatible with each other anyway, and at the same time, even if LPI does work, it will most probably fail with the FEC driver. I do not know about STMMAC.
On Sat, Oct 25, 2025 at 11:19:25AM +0200, Oleksij Rempel wrote:
On Sat, Oct 25, 2025 at 09:44:31AM +0100, Russell King (Oracle) wrote:
On Thu, Oct 23, 2025 at 04:48:53PM +0200, Emanuele Ghidoli wrote:
While the DP83867 PHYs report EEE capability through their feature registers, the actual hardware does not support EEE (see Links). When the connected MAC enables EEE, it causes link instability and communication failures.
The issue is reproducible with a iMX8MP and relevant stmmac ethernet port. Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Wasn't it enabled before? See commit 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
stmmac's mac_link_up() was:
if (phy && priv->dma_cap.eee) {phy_eee_rx_clock_stop(phy, !(priv->plat->flags &STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer;stmmac_eee_init(priv, phy->enable_tx_lpi); stmmac_set_eee_pls(priv, priv->hw, true);}So, if EEE is enabled in the core synthesis, then EEE will be configured depending on what phylib says.
In stmmac_init_phy(), there was this:
if (priv->dma_cap.eee)phy_support_eee(phydev);ret = phylink_connect_phy(priv->phylink, phydev);So phylib was told to enable EEE support on the PHY if the dwmac core supports EEE.
So, from what I can see, converting to phylink managed EEE didn't change this. So what really did change?
I suspect it is a change in board designs. iMX8MP EVB variants are using Realtek PHYs with the SmartEEE variant. Therefore, the MAC is not able to control LPI behavior. Designs based on the EVB design (with the Realtek PHY) are not affected. I mean, any bug on the MAC or software side will stay invisible.
Some new designs with special requirements for TSN, for example low-latency TI PHYs, are a different story. They promise "Extra low latency TX < 90ns, RX < 290ns" and also announce EEE support. These two promises are not compatible with each other anyway, and at the same time, even if LPI does work, it will most probably fail with the FEC driver. I do not know about STMMAC.
What's annoying me is this "we spotted a change in the driver, we're going to blame that for our problems" attitude that there seems to be towards phylink.
When I make changes such as when porting a driver to a new facility, I try to do it with _no_ behavioural change. Yet, people still blame phylink for their problems. In 99% of cases, it turns out to be incorrect blame.
This commit description is stating that the conversion of stmmac to phylink-managed EEE changed the behaviour to default to enabling EEE. I claim that the driver _already_ defaulted to enabling EEE. So the commit description is nonsense, and just pulling at straws to justify the probem.
What I'm asking for is people to _properly_ investigate their problems rather than just looking at the commit history, and pulling out some random commit to blame, which invariably seems to be phylink related.
Having one's hard efforts constantly slated in this way is unhelpful.
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Did you do a bisect to prove this?
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I hope you have seen Russell is not so happy you claim phylink is to blame here...
Andrew
On 27/10/2025 00:45, Andrew Lunn wrote:
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Did you do a bisect to prove this?
Yes, I have done a bisect and the commit that introduced the behavior on our board is 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I think that the phylink commit is just enabling by default the EEE support, and my commit is not really fixing that. It is why I didn't put a Fixes: tag pointing to that.
I’ve tried to trace the behavior, but it’s quite complex. From my testing, I can summarize the situation as follows:
- ethtool, after that patch, returns: ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full - before that patch returns, after boot: EEE settings for end0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: Not reported Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full - Enabling EEE manually using ethtool, triggers the problem too (and ethtool -show-eee report eee status enabled): ethtool --set-eee end0 eee on tx-lpi on ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
I understand Russell point of view but from my point of view EEE is now enabled by default, and before it wasn't, at least on my setup.
I hope you have seen Russell is not so happy you claim phylink is to blame here...
Andrew
Emanuele
On Mon, Oct 27, 2025 at 01:57:48PM +0100, Emanuele Ghidoli wrote:
On 27/10/2025 00:45, Andrew Lunn wrote:
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Did you do a bisect to prove this?
Yes, I have done a bisect and the commit that introduced the behavior on our board is 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I think that the phylink commit is just enabling by default the EEE support, and my commit is not really fixing that. It is why I didn't put a Fixes: tag pointing to that.
I’ve tried to trace the behavior, but it’s quite complex. From my testing, I can summarize the situation as follows:
- ethtool, after that patch, returns:
ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- before that patch returns, after boot:
EEE settings for end0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: Not reported Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- Enabling EEE manually using ethtool, triggers the problem too (and ethtool
-show-eee report eee status enabled): ethtool --set-eee end0 eee on tx-lpi on ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
I understand Russell point of view but from my point of view EEE is now enabled by default, and before it wasn't, at least on my setup.
We like to try to understand what is going on, and give accurate descriptions. You have given us important information here, which at minimum should go into the commit message, but more likely, it will help lead us to the correct fix.
So, two things here. You say:
I think that the phylink commit is just enabling by default the EEE support,
That needs confirming, because you are blaming the conversion to phylink, not that phylink now enabled EEE by default. Russell also tries to avoid behaviour change, which this clearly is. We want a better understanding what caused this behaviour change.
Also:
- Enabling EEE manually using ethtool, triggers the problem too (and ethtool
-show-eee report eee status enabled):
This indicates EEE has always been broken. This brokenness has been somewhat hidden in the past, and it is the change in behaviour in phylink which exposed this brokenness. A commit message using these words would be much more factually correct, and it would also fit with the Fixes: tag you used.
So, please work with Russell. I see two things which would be good to understand before a new version of the patch is submitted:
What cause the behaviour change such that EEE is now enabled? Was it deliberate? Should something be change to revert that behaviour change?
Given that EEE has always been broken, do we understand it sufficiently to say it is not fixable? Is there an errata? Are we sure it is the PHY and not the MAC which is broken?
Andrew
On Mon, Oct 27, 2025 at 02:25:12PM +0100, Andrew Lunn wrote:
On Mon, Oct 27, 2025 at 01:57:48PM +0100, Emanuele Ghidoli wrote:
On 27/10/2025 00:45, Andrew Lunn wrote:
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Did you do a bisect to prove this?
Yes, I have done a bisect and the commit that introduced the behavior on our board is 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I think that the phylink commit is just enabling by default the EEE support, and my commit is not really fixing that. It is why I didn't put a Fixes: tag pointing to that.
I’ve tried to trace the behavior, but it’s quite complex. From my testing, I can summarize the situation as follows:
- ethtool, after that patch, returns:
ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- before that patch returns, after boot:
EEE settings for end0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: Not reported Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- Enabling EEE manually using ethtool, triggers the problem too (and ethtool
-show-eee report eee status enabled): ethtool --set-eee end0 eee on tx-lpi on ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
I understand Russell point of view but from my point of view EEE is now enabled by default, and before it wasn't, at least on my setup.
We like to try to understand what is going on, and give accurate descriptions. You have given us important information here, which at minimum should go into the commit message, but more likely, it will help lead us to the correct fix.
So, two things here. You say:
I think that the phylink commit is just enabling by default the EEE support,
That needs confirming, because you are blaming the conversion to phylink, not that phylink now enabled EEE by default. Russell also tries to avoid behaviour change, which this clearly is. We want a better understanding what caused this behaviour change.
Also:
- Enabling EEE manually using ethtool, triggers the problem too (and ethtool
-show-eee report eee status enabled):
This indicates EEE has always been broken. This brokenness has been somewhat hidden in the past, and it is the change in behaviour in phylink which exposed this brokenness. A commit message using these words would be much more factually correct, and it would also fit with the Fixes: tag you used.
So, please work with Russell. I see two things which would be good to understand before a new version of the patch is submitted:
What cause the behaviour change such that EEE is now enabled? Was it deliberate? Should something be change to revert that behaviour change?
Given that EEE has always been broken, do we understand it sufficiently to say it is not fixable? Is there an errata?
None of following TI Gbit PHYs claim EEE support: dp83867cr/ir https://www.ti.com/de/lit/gpn/dp83867cr dp83867e/cs/is https://www.ti.com/de/lit/gpn/dp83867cs dp83869hm https://www.ti.com/lit/gpn/dp83869hm
For comparison, TI 100Mbit PHYs list EEE as supported: dp83826a* https://www.ti.com/de/lit/gpn/dp83826ae dp83826i https://www.ti.com/de/lit/gpn/dp83826i
If vendor do not see it as selling point, or it is just broken beyond repair, there is nothing we can do here. I guess it is ok to sync the driver with vendors claim.
Are we sure it is the PHY and not the MAC which is broken?
I personally still do not have suitable reference board for testing. There are some with Realtek or TI PHYs. It will be good to find board with iMX8MP + KSZ9131 on both MACs (FEC and STMMAC).
Hello Andrew and Russel,
On Mon, Oct 27, 2025 at 02:25:12PM +0100, Andrew Lunn wrote:
On Mon, Oct 27, 2025 at 01:57:48PM +0100, Emanuele Ghidoli wrote:
On 27/10/2025 00:45, Andrew Lunn wrote:
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY. Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I think that the phylink commit is just enabling by default the EEE support, and my commit is not really fixing that. It is why I didn't put a Fixes: tag pointing to that.
I’ve tried to trace the behavior, but it’s quite complex. From my testing, I can summarize the situation as follows:
- ethtool, after that patch, returns:
ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- before that patch returns, after boot:
EEE settings for end0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: Not reported Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- Enabling EEE manually using ethtool, triggers the problem too (and ethtool
-show-eee report eee status enabled): ethtool --set-eee end0 eee on tx-lpi on ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
I understand Russell point of view but from my point of view EEE is now enabled by default, and before it wasn't, at least on my setup.
We like to try to understand what is going on, and give accurate descriptions. You have given us important information here, which at minimum should go into the commit message, but more likely, it will help lead us to the correct fix.
So, two things here. You say:
I think that the phylink commit is just enabling by default the EEE support,
That needs confirming, because you are blaming the conversion to phylink, not that phylink now enabled EEE by default. Russell also tries to avoid behaviour change, which this clearly is. We want a better understanding what caused this behaviour change.
Also:
- Enabling EEE manually using ethtool, triggers the problem too (and ethtool
-show-eee report eee status enabled):
This indicates EEE has always been broken. This brokenness has been somewhat hidden in the past, and it is the change in behaviour in phylink which exposed this brokenness. A commit message using these words would be much more factually correct, and it would also fit with the Fixes: tag you used.
So, please work with Russell. I see two things which would be good to understand before a new version of the patch is submitted:
What cause the behaviour change such that EEE is now enabled? Was it deliberate? Should something be change to revert that behaviour change?
Given that EEE has always been broken, do we understand it sufficiently to say it is not fixable? Is there an errata? Are we sure it is the PHY and not the MAC which is broken?
I was talking together with Emanuele on this topic and we are confused on how to proceed.
From the various comments and tests in this thread, to me the actual code change is correct, the dp83867 does not support EEE and we have to explicitly disable it in the dp83867 driver.
As of now we do not have a clear shared understanding on what is going on in the stmmac driver. And the commit message is not correct on this regard.
This patch is already merged [1] in netdev tree, should we send a series reverting this commit and another commit with just the same change and a different commit message?
In parallel, unrelated to the dp83867 topic, Emanuele is trying to help figuring out why the actual behavior of the stmmac changed after Russell refactoring. And it's clear that this change in behavior is not expected.
[1] commit 84a905290cb4 ("net: phy: dp83867: Disable EEE support as not implemented")
Francesco
I was talking together with Emanuele on this topic and we are confused on how to proceed.
From the various comments and tests in this thread, to me the actual
code change is correct, the dp83867 does not support EEE and we have to explicitly disable it in the dp83867 driver.
As of now we do not have a clear shared understanding on what is going on in the stmmac driver. And the commit message is not correct on this regard.
This patch is already merged [1] in netdev tree, should we send a series reverting this commit and another commit with just the same change and a different commit message?
No, if its merged, its merged. What should come out of this is a learning, please be precise with commit messages, more detail rather than less, and provide as much supporting evidence as you can for any claims you make.
In parallel, unrelated to the dp83867 topic, Emanuele is trying to help figuring out why the actual behavior of the stmmac changed after Russell refactoring. And it's clear that this change in behavior is not expected.
Great.
Andrew
On Mon, Oct 27, 2025 at 01:57:48PM +0100, Emanuele Ghidoli wrote:
On 27/10/2025 00:45, Andrew Lunn wrote:
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Did you do a bisect to prove this?
Yes, I have done a bisect and the commit that introduced the behavior on our board is 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I think that the phylink commit is just enabling by default the EEE support, and my commit is not really fixing that. It is why I didn't put a Fixes: tag pointing to that.
I’ve tried to trace the behavior, but it’s quite complex. From my testing, I can summarize the situation as follows:
- ethtool, after that patch, returns:
ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- before that patch returns, after boot:
EEE settings for end0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: Not reported Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
Oh damn. I see why now:
/* Some DT bindings do not set-up the PHY handle. Let's try to * manually parse it */ if (!phy_fwnode || IS_ERR(phy_fwnode)) { int addr = priv->plat->phy_addr; ... if (priv->dma_cap.eee) phy_support_eee(phydev);
ret = phylink_connect_phy(priv->phylink, phydev); } else { fwnode_handle_put(phy_fwnode); ret = phylink_fwnode_phy_connect(priv->phylink, fwnode, 0); }
The driver only considers calling phy_support_eee() when DT fails to describe the PHY (because in the other path, it doesn't have access to the struct phy_device to make this call.)
My commit makes it apply even to DT described PHYs, so now (as has been shown when you enable EEE manually) it's uncovering latent problems.
So now we understand why the change has occurred - this is important. Now the question becomes, what to do about it.
For your issue, given that we have statements from TI that indicate none of their gigabit PHYs support EEE, we really should not be reporting to userspace that there is any EEE support. Therefore, "Supported EEE link modes" should be completely empty.
I think I understand why we're getting EEE modes supported. In the DP83867 manual, it states for the DEVAD field of the C45 indirect access registers:
"Device Address: In general, these bits [4:0] are the device address DEVAD that directs any accesses of ADDAR register (0x000E) to the appropriate MMD. Specifically, the DP83867 uses the vendor specific DEVAD [4:0] = 11111 for accesses. All accesses through registers REGCR and ADDAR can use this DEVAD. Transactions with other DEVAD are ignored."
Specifically, that last sentence, and the use of "ignored". If this means the PHY does not drive the MDIO data line when registers are read, they will return 0xffff, which is actually against the IEEE requirements for C45 registers (unimplemented C45 registers are supposed to be zero.)
So, this needs to be tested - please modify phylib's genphy_c45_read_eee_cap1() to print the value read from the register.
If it is 0xffff, that confirms that theory.
The correct solution here, to stop other MAC drivers running into this is for TI PHY drivers to implement the .get_features() phylib method, call genphy_read_abilities() or genphy_c45_pma_read_abilities() as appropriate, and then clear phydev->supported_eee so the PHY driver reports (correctly according to TI's statements) that EEE modes are not supported.
So, while my commit does have an unintended change of behaviour (thanks for helping to locate that), it would appear that it has revealed a latent bug in likely all TI PHY gigabit drivers that needs fixing independently of what we do about this.
On 27/10/2025 15:53, Russell King (Oracle) wrote:
On Mon, Oct 27, 2025 at 01:57:48PM +0100, Emanuele Ghidoli wrote:
On 27/10/2025 00:45, Andrew Lunn wrote:
Since the introduction of phylink-managed EEE support in the stmmac driver, EEE is now enabled by default, leading to issues on systems using the DP83867 PHY.
Did you do a bisect to prove this?
Yes, I have done a bisect and the commit that introduced the behavior on our board is 4218647d4556 ("net: stmmac: convert to phylink managed EEE support").
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
What has this Fixes: tag got to do with phylink?
I think that the phylink commit is just enabling by default the EEE support, and my commit is not really fixing that. It is why I didn't put a Fixes: tag pointing to that.
I’ve tried to trace the behavior, but it’s quite complex. From my testing, I can summarize the situation as follows:
- ethtool, after that patch, returns:
ethtool --show-eee end0 EEE settings for end0: EEE status: enabled - active Tx LPI: 1000000 (us) Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: 100baseT/Full 1000baseT/Full Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
- before that patch returns, after boot:
EEE settings for end0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full Advertised EEE link modes: Not reported Link partner advertised EEE link modes: 100baseT/Full 1000baseT/Full
Oh damn. I see why now:
/* Some DT bindings do not set-up the PHY handle. Let's try to * manually parse it */ if (!phy_fwnode || IS_ERR(phy_fwnode)) { int addr = priv->plat->phy_addr; ... if (priv->dma_cap.eee) phy_support_eee(phydev); ret = phylink_connect_phy(priv->phylink, phydev); } else { fwnode_handle_put(phy_fwnode); ret = phylink_fwnode_phy_connect(priv->phylink, fwnode, 0); }The driver only considers calling phy_support_eee() when DT fails to describe the PHY (because in the other path, it doesn't have access to the struct phy_device to make this call.)
My commit makes it apply even to DT described PHYs, so now (as has been shown when you enable EEE manually) it's uncovering latent problems.
So now we understand why the change has occurred - this is important.
Good. Thanks.> Now the question becomes, what to do about it.
For your issue, given that we have statements from TI that indicate none of their gigabit PHYs support EEE, we really should not be reporting to userspace that there is any EEE support. Therefore, "Supported EEE link modes" should be completely empty.
I think I understand why we're getting EEE modes supported. In the DP83867 manual, it states for the DEVAD field of the C45 indirect access registers:
"Device Address: In general, these bits [4:0] are the device address DEVAD that directs any accesses of ADDAR register (0x000E) to the appropriate MMD. Specifically, the DP83867 uses the vendor specific DEVAD [4:0] = 11111 for accesses. All accesses through registers REGCR and ADDAR can use this DEVAD. Transactions with other DEVAD are ignored."
Specifically, that last sentence, and the use of "ignored". If this means the PHY does not drive the MDIO data line when registers are read, they will return 0xffff, which is actually against the IEEE requirements for C45 registers (unimplemented C45 registers are supposed to be zero.)
So, this needs to be tested - please modify phylib's genphy_c45_read_eee_cap1() to print the value read from the register.
If it is 0xffff, that confirms that theory.
It’s not 0xffff; I verified that the value read is: TI DP83867 stmmac-0:02: Reading EEE capabilities from MDIO_PCS_EEE_ABLE: 0x0006
This indicates that EEE is reported as supported, according to: #define MDIO_AN_EEE_ADV_100TX 0x0002 /* Advertise 100TX EEE cap */ #define MDIO_AN_EEE_ADV_1000T 0x0004 /* Advertise 1000T EEE cap */
So the PHY simply reports the capability as present.
I verified this behaviour before submitting the patch, which is why I wrote: "While the DP83867 PHYs report EEE capability through their feature registers."
Anyway, if the value were 0xffff, the code already handles it as not supported.
Let me know if I should test anything else.
On Mon, Oct 27, 2025 at 04:34:54PM +0100, Emanuele Ghidoli wrote:
So, this needs to be tested - please modify phylib's genphy_c45_read_eee_cap1() to print the value read from the register.
If it is 0xffff, that confirms that theory.
It’s not 0xffff; I verified that the value read is: TI DP83867 stmmac-0:02: Reading EEE capabilities from MDIO_PCS_EEE_ABLE: 0x0006
Thanks for testing. So the published manual for this PHY is wrong. https://www.ti.com/lit/ds/symlink/dp83867ir.pdf page 64.
The comment I quoted from that page implies that the PCS and AN MMD registers shouldn't be implemented.
Given what we now know, I'd suggest TI PHYs are a mess. Stuff they say in the documentation that is ignored plainly isn't, and their PHYs report stuff as capable but their PHYs aren't capable.
I was suggesting to clear phydev->supported_eee, but that won't work if the MDIO_AN_EEE_ADV register is implemented even as far as exchanging EEE capabilities with the link partner. We use the supported_eee bitmap to know whether a register is implemented. Clearing ->supported_eee will mean we won't write to the advertisement register. That's risky. Given the brokenness so far, I wouldn't like to assume that the MDIO_AN_EEE_ADV register contains zero by default.
Calling phy_disable_eee() from .get_features() won't work, because after we call that method, of_set_phy_eee_broken() will then be called, which will clear phydev->eee_disabled_modes. I think that is a mistake. Is there any reason why we would want to clear the disabled modes? Isn't it already zero? (note that if OF_MDIO is disabled, or there's no DT node, we don't zero this.)
Your placement is the only possible location as the code currently stands, but I would like to suggest that of_set_phy_eee_broken() should _not_ be calling linkmode_zero(modes), and we should be able to set phydev->eee_disabled_modes in the .get_features() method.
Andrew, would you agree?
On Mon, Oct 27, 2025 at 04:44:33PM +0000, Russell King (Oracle) wrote:
On Mon, Oct 27, 2025 at 04:34:54PM +0100, Emanuele Ghidoli wrote:
So, this needs to be tested - please modify phylib's genphy_c45_read_eee_cap1() to print the value read from the register.
If it is 0xffff, that confirms that theory.
It’s not 0xffff; I verified that the value read is: TI DP83867 stmmac-0:02: Reading EEE capabilities from MDIO_PCS_EEE_ABLE: 0x0006
Thanks for testing. So the published manual for this PHY is wrong. https://www.ti.com/lit/ds/symlink/dp83867ir.pdf page 64.
The comment I quoted from that page implies that the PCS and AN MMD registers shouldn't be implemented.
Given what we now know, I'd suggest TI PHYs are a mess. Stuff they say in the documentation that is ignored plainly isn't, and their PHYs report stuff as capable but their PHYs aren't capable.
I was suggesting to clear phydev->supported_eee, but that won't work if the MDIO_AN_EEE_ADV register is implemented even as far as exchanging EEE capabilities with the link partner. We use the supported_eee bitmap to know whether a register is implemented. Clearing ->supported_eee will mean we won't write to the advertisement register. That's risky. Given the brokenness so far, I wouldn't like to assume that the MDIO_AN_EEE_ADV register contains zero by default.
Calling phy_disable_eee() from .get_features() won't work, because after we call that method, of_set_phy_eee_broken() will then be called, which will clear phydev->eee_disabled_modes. I think that is a mistake. Is there any reason why we would want to clear the disabled modes? Isn't it already zero? (note that if OF_MDIO is disabled, or there's no DT node, we don't zero this.)
Your placement is the only possible location as the code currently stands, but I would like to suggest that of_set_phy_eee_broken() should _not_ be calling linkmode_zero(modes), and we should be able to set phydev->eee_disabled_modes in the .get_features() method.
Andrew, would you agree?
What I'm thinking of is an overall change such as (against net-next):
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c index deeefb962566..f923f3a57b11 100644 --- a/drivers/net/phy/dp83867.c +++ b/drivers/net/phy/dp83867.c @@ -708,6 +708,21 @@ static int dp83867_probe(struct phy_device *phydev) return dp83867_of_init(phydev); }
+static int dp83867_get_features(struct phy_device *phydev) +{ + int err = genphy_read_abilities(phydev); + + /* TI Gigabit PHYs do not support EEE, even though they report support + * in their "ignored" Clause 45 indirect registers, appear to implement + * the advertisement registers and exchange the relevant AN page. Set + * all EEE link modes as disabled, so we still write to the C45 EEE + * advertisement register to ensure it is set to zero. + */ + linkmode_fill(phydev->eee_disabled_modes); + + return err; +} + static int dp83867_config_init(struct phy_device *phydev) { struct dp83867_private *dp83867 = phydev->priv; @@ -1118,6 +1133,7 @@ static struct phy_driver dp83867_driver[] = { /* PHY_GBIT_FEATURES */
.probe = dp83867_probe, + .get_features = dp83867_get_features, .config_init = dp83867_config_init, .soft_reset = dp83867_phy_reset,
diff --git a/drivers/net/phy/phy-core.c b/drivers/net/phy/phy-core.c index 605ca20ae192..43ccbd3a09f8 100644 --- a/drivers/net/phy/phy-core.c +++ b/drivers/net/phy/phy-core.c @@ -207,8 +207,6 @@ void of_set_phy_eee_broken(struct phy_device *phydev) if (!IS_ENABLED(CONFIG_OF_MDIO) || !node) return;
- linkmode_zero(modes); - if (of_property_read_bool(node, "eee-broken-100tx")) linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT, modes); if (of_property_read_bool(node, "eee-broken-1000t"))
Your placement is the only possible location as the code currently stands, but I would like to suggest that of_set_phy_eee_broken() should _not_ be calling linkmode_zero(modes), and we should be able to set phydev->eee_disabled_modes in the .get_features() method.
Andrew, would you agree?
I dug back through the git history. Originally, the code would read a value from DT which was literally used as a mask against the value in MDIO_MMD_AN. If the mask was not zero, it was applied. That later got converted to a collection of Boolean DT properties, one per link speed, and the mask was created as a collection of |= statements, with the initial value being 0, and the result assigned to phydev->eee_broken_modes. It would of been possible at that stage to do phydev->eee_broken_modes |=, but it guess a local variable was used to keep the lines shorter. The u32 then got converted to a linux bitmap, with the initial = 0; replaced with a linkmode_zero().
I don't see anything in any of the commit messages to indicate there was a reason to initialise phydev->eee_broken_modes to 0 before parsing the DT properties.
So i think it should be O.K. to remove the linkmode_zero(). For broken PHYs like this, the earlier we mask out the broken behaviour the better.
Andrew
linux-stable-mirror@lists.linaro.org