On Tue, Mar 05, 2024 at 05:39:32AM -0500, Eric Hagberg wrote:
> To add another datapoint to this - I've seen the same problem on Dell PowerEdge R6615 servers... but no others.
> The problem also crept into the 6.1.79 kernel with the commit mentioned earlier, and is fixed by reverting that commit. Adding nogbpages to the kernel command line can cause the failure to reproduce on that hardware as well.
Eric,
What Linux distribution are you running on that machine? My guess is that this is not distro-related; if you are running something quite different from what Pavin is running, that would confirm it.
I found an AMD-based system to try to reproduce this on. The 6.7.7 kernel doesn't seem to have a problem in the machine's existing RHEL environment. I think it's likely that this system's hardware doesn't have the characteristics that bring this problem to the surface, but I will try openSUSE Tumbleweed on it if I can.
Thanks,
--> Steve
linux-stable-mirror@lists.linaro.org