On Wed, Apr 09, 2025 at 10:29:43AM -0400, Rik van Riel wrote:
On Wed, 2025-04-02 at 20:07 +0200, Peter Zijlstra wrote:
Anyway, seeing how your min_vruntime is weird, let me ask you to try the below; it removes the old min_vruntime and instead tracks zero vruntime as the 'current' avg_vruntime. We don't need the monotonicity filter, all we really need is something 'near' all the other vruntimes in order to compute this relative key so we can preserve order across the wrap.
This *should* get us near-minimal-sized keys. If you can still reproduce, you should probably add something like that patch I sent you privately earlier, that checks the overflows.
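For illustration, a minimal userspace sketch of the relative-key idea described above. It is not the patch itself: rel_key(), key_before() and rel_key_selftest() are hypothetical names, and it assumes the usual two's-complement behaviour of the unsigned-to-signed cast as implemented by GCC and Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the key computation: the distance of a
 * vruntime from the reference point, taken modulo 2^64 and then read
 * as a signed value. As long as every vruntime is within 2^63 of the
 * reference, the signed key orders entities correctly across the wrap.
 */
static inline int64_t rel_key(uint64_t vruntime, uint64_t zero_vruntime)
{
	return (int64_t)(vruntime - zero_vruntime);
}

/* a sorts before b when its key against the same reference is smaller. */
static inline bool key_before(uint64_t a, uint64_t b, uint64_t zero_vruntime)
{
	return rel_key(a, zero_vruntime) < rel_key(b, zero_vruntime);
}

/* Small self-check around the 64-bit wrap point. */
void rel_key_selftest(void)
{
	uint64_t zero = UINT64_MAX - 5;  /* reference just below the wrap */
	uint64_t a = UINT64_MAX - 10;    /* slightly behind the reference */
	uint64_t b = 20;                 /* already wrapped, slightly ahead */

	assert(a > b);                   /* raw comparison gets the order wrong */
	assert(rel_key(a, zero) == -5);
	assert(rel_key(b, zero) == 26);
	assert(key_before(a, b, zero));  /* relative keys preserve the order */
}
```

The closer the reference sits to the bulk of the vruntimes, the smaller the keys, which is what leaves headroom once they are multiplied by a weight.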
Our trouble workload still makes the scheduler crash with this patch.
I'll go put the debugging patch on our kernel.
Should I try to get debugging data with this patch part of the mix, or with the debugging patch just on top of what's in 6.13 already?
Whatever is more convenient I suppose.
If you can dump the full tree that would be useful. Typically the se::{vruntime,weight} and cfs_rq::{zero_vruntime,avg_vruntime,avg_load} such that we can do full manual validation of the numbers.
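As a rough idea of what such a manual validation could look like in userspace, here is a sketch that recomputes the two sums from dumped per-entity values and flags signed overflow. It assumes avg_vruntime is the sum of key * weight and avg_load is the sum of weights over the dumped entities; which entities are included (for example the current one) and how weights are scaled need checking against the kernel in question. struct se_dump, struct rq_sums and recompute_sums() are hypothetical names, and the overflow builtins are GCC/Clang extensions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One dumped entity: the se::{vruntime,weight} pair. */
struct se_dump {
	uint64_t vruntime;
	uint64_t weight;
};

/* Recomputed counterparts of cfs_rq::{avg_vruntime,avg_load}. */
struct rq_sums {
	int64_t avg_vruntime;
	uint64_t avg_load;
};

/*
 * Recompute the weighted key sum and the load sum from a dump.
 * Returns false if any multiplication or addition overflows, in
 * which case *out is left untouched.
 */
bool recompute_sums(const struct se_dump *se, size_t n,
		    uint64_t zero_vruntime, struct rq_sums *out)
{
	int64_t sum = 0;
	uint64_t load = 0;

	for (size_t i = 0; i < n; i++) {
		/* Relative key, read as signed (assumed convention). */
		int64_t key = (int64_t)(se[i].vruntime - zero_vruntime);
		int64_t prod;

		if (se[i].weight > INT64_MAX)
			return false;
		if (__builtin_mul_overflow(key, (int64_t)se[i].weight, &prod))
			return false;
		if (__builtin_add_overflow(sum, prod, &sum))
			return false;
		if (__builtin_add_overflow(load, se[i].weight, &load))
			return false;
	}

	out->avg_vruntime = sum;
	out->avg_load = load;
	return true;
}
```

Comparing the recomputed values against the dumped cfs_rq fields would show whether the running sums drifted, and a false return points at a key that is too far from zero_vruntime for its weight.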