On Wed, 2025-04-02 at 20:07 +0200, Peter Zijlstra wrote:
> Anyway, seeing how your min_vruntime is weird, let me ask you to try the below; it removes the old min_vruntime and instead tracks zero vruntime as the 'current' avg_vruntime. We don't need the monotonicity filter, all we really need is something 'near' all the other vruntimes in order to compute this relative key so we can preserve order across the wrap.
> This *should* get us near-minimal sized keys. If you can still reproduce, you should probably add something like that patch I sent you privately earlier, which checks for overflows.
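
For illustration, the relative-key idea described above can be sketched roughly as follows. This is a minimal standalone sketch, not the patch under discussion: the helper names `relative_key()` and `key_before()` and the `zero` parameter are hypothetical, and it assumes the usual two's-complement conversion from u64 to s64 that the kernel relies on.

```c
#include <stdint.h>

/*
 * Hypothetical helper: express vruntime v as a signed offset from a
 * reference point 'zero'. The unsigned subtraction wraps modulo 2^64,
 * and the cast reinterprets the result as a signed distance. This is
 * only meaningful while v stays within 2^63 of zero.
 */
static int64_t relative_key(uint64_t v, uint64_t zero)
{
	return (int64_t)(v - zero);
}

/*
 * Hypothetical helper: order two vruntimes by their relative keys
 * rather than by their raw values, so the order survives a wrap of
 * the underlying u64 counter.
 */
static int key_before(uint64_t a, uint64_t b, uint64_t zero)
{
	return relative_key(a, zero) < relative_key(b, zero);
}
```

With a = UINT64_MAX - 5 (just before the wrap), b = 10 (just after it) and zero = UINT64_MAX - 2, the raw comparison a < b is false, while the keys are -3 and 13, so key_before(a, b, zero) reports a as earlier. The closer zero sits to the vruntimes being compared, the smaller the keys are.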
Our trouble workload still makes the scheduler crash with this patch.
I'll go put the debugging patch on our kernel.
Should I try to get debugging data with this patch part of the mix, or with the debugging patch just on top of what's in 6.13 already?
Digging through our kernel crash history, this particular crash seems to go back at least to 6.11. It just happens much more frequently on 6.13 for some (as yet unknown) reason.