On 4/5/25 3:43 AM, Paul Menzel wrote:
Dear Greg,
Thank you for replying on a Saturday.
Am 05.04.25 um 09:29 schrieb Greg KH:
On Sat, Apr 05, 2025 at 08:32:13AM +0200, Paul Menzel wrote:
Am 29.03.25 um 15:57 schrieb Chuck Lever:
On 3/29/25 8:17 AM, Takashi Iwai wrote:
On Sun, 23 Feb 2025 09:53:10 +0100, Takashi Iwai wrote:
We received a bug report showing a regression in the 6.13.1 kernel relative to 6.13.0. The symptom is that Chrome and VSCode stopped working with Gnome Scaling, as reported on the openSUSE Tumbleweed bug tracker: https://bugzilla.suse.com/show_bug.cgi?id=1236943
Quoting from there: """ I use the latest TW on Gnome with a 4K display and 150% scaling. Everything has been working fine, but recently both Chrome and VSCode (installed from official non-openSUSE channels) stopped working with Scaling. .... I am using VSCode with: `--enable-features=UseOzonePlatform --enable-features=WaylandWindowDecorations --ozone-platform-hint=auto` and for Chrome, I select `Preferred Ozone platform` == `Wayland`. """
Surprisingly, the bisection pointed to the backport of the commit b9b588f22a0c049a14885399e27625635ae6ef91 ("libfs: Use d_children list to iterate simple_offset directories").
Indeed, reverting this patch on the latest 6.13.4 was confirmed to fix the issue. The reporter also verified that the latest 6.14-rc release is still affected.
For now I have no concrete idea how the patch could break the behavior of a graphical application like the above. Let us know if you need something for debugging. (Or, easiest of all, join the bugzilla entry and ask there; or open another bug report wherever you like.)
BTW, I'll be traveling tomorrow, so my reply will be delayed.
#regzbot introduced: b9b588f22a0c049a14885399e27625635ae6ef91
#regzbot monitor: https://bugzilla.suse.com/show_bug.cgi?id=1236943
After all, this seems to be a bug in Chrome and its variants, which was surfaced by the kernel commit above: because the commit changes the directory enumeration, it also changed the order of the list returned from libdrm drmGetDevices2(), and that broke an application that had only happened to work beforehand. That said, the bug itself was already present. The Chrome upstream tracker: https://issuetracker.google.com/issues/396434686
#regzbot invalid: problem has always existed on Chrome and related code
Thank you very much for your report and for chasing this to conclusion.
Doesn’t marking this as invalid contradict Linux’s no-regression policy of never breaking user space, so that users can always update the Linux kernel? Shouldn’t this commit still be reverted, and another way be found that keeps the old ordering?
Greg, Sasha, in stable/linux-6.13.y the two commits below would need to be reverted:
180c7e44a18bbd7db89dfd7e7b58d920c44db0ca
d9da7a68a24518e93686d7ae48937187a80944ea
For stable/linux-6.12.y:
176d0333aae43bd0b6d116b1ff4b91e9a15f88ef
639b40424d17d9eb1d826d047ab871fe37897e76
Unless the changes are also reverted in Linus's tree, we'll be keeping these in. Please work with the maintainers to resolve this in mainline and we will be glad to mirror that in the stable trees as well.
Commit b9b588f22a0c (libfs: Use d_children list to iterate simple_offset directories) does not have a Fixes: tag or Cc: stable@vger.kernel.org. I do not understand why it was applied to the stable series at all [1], or why it cannot be reverted when it breaks userspace.
I NACK'd the upstream revert because I expected an RCA before 6.14 final (that didn't happen), and because the Chrome issue was the only reported problem, specific to a particular hardware configuration and the /latest developer release/ of Chrome. Neither v6.14.0 nor a Chrome developer release is going to be put in front of users who do not expect to encounter issues.
Note that the libfs series addresses several issues. Commit b9b588f22a0c itself addresses CVE-2024-46701 [1] (in v6.6). I did not add a "Cc: stable" for commit b9b588f22a0c because it cannot be cherry-picked to v6.6 as is; it has to be manually adjusted to apply.
The final RCA reported in [2] shows that there is nothing incorrect about b9b588f22a0c.
In addition, the next Chrome release will carry a fix for the clearly incorrect library behavior -- applications cannot depend on the order of directory entry iteration, because that can change arbitrarily, and not just because of file system implementation quirks. You will note that even after sorting the directory entries, the library still had problems discovering the accelerated graphics device.
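As a minimal sketch of the point about iteration order (this is not the Chrome or libdrm code; the function names and the file-content "capability" check are hypothetical stand-ins), a caller can select directory entries by a property of each entry instead of by their position in the enumeration:

```python
import os


def find_matching_entries(directory, predicate):
    """Return the names of all entries for which predicate(path) is true.

    The result is sorted, so it does not depend on the order in which the
    file system happens to return directory entries.
    """
    matches = []
    with os.scandir(directory) as entries:  # iteration order is unspecified
        for entry in entries:
            if predicate(entry.path):
                matches.append(entry.name)
    return sorted(matches)


def contains_marker(path, marker="accel"):
    """Hypothetical capability check: the file's text contains `marker`."""
    with open(path, encoding="utf-8") as f:
        return marker in f.read()
```

Calling `find_matching_entries(some_dir, contains_marker)` gives the same answer whether the kernel returns entries in creation order, reverse order, or any other order, because nothing in the selection relies on "the first entry".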
Reverting now might follow the letter of the rule about "no regressions", but IMHO moving forward from here is the more constructive approach.