On Mon, Mar 31 2025 at 16:53, Miroslav Lichvar wrote:
On Thu, Mar 27, 2025 at 04:42:49PM +0100, Miroslav Lichvar wrote:
Maybe I could simply patch the kernel to force a small clock multiplier to increase the rate at which the error accumulates.
I tried that and it indeed makes the issue clearly visible. The COARSE fix makes the clock less stable. It's barely visible with the normal multiplier, at least for the clocksource I tested, but a reduced multiplier forces a larger NTP error and raises it above the precision and instability of the system and reference clocks.
The test was done on a machine with a TSC clocksource (3GHz CPU with frequency scaling disabled; the normal multiplier is 5592407). I tried the multiplier reduced by factors of 4, 16, and 64, with this COARSE-fixing patch both not applied and applied. Each test ran for 1 minute and produced an average value of skew, i.e. the stability of the clock frequency as reported by chronyd in the tracking log, when synchronizing to a free-running PTP clock at 64, 16, and 4 updates per second. The skew is in parts per million (resolution in the chrony log is limited to 0.001 ppm).
Mult reduction  Updates/sec  Skew before  Skew after
      1              4          0.000        0.000
      1             16          0.001        0.002
      1             64          0.002        0.006
      4              4          0.001        0.001
      4             16          0.003        0.005
      4             64          0.005        0.015
     16              4          0.004        0.009
     16             16          0.011        0.069
     16             64          0.020        0.117
     64              4          0.013        0.012
     64             16          0.030        0.107
     64             64          0.058        0.879
Hrm.
Can you try the delta patch below?
Thanks,
tglx
---
 kernel/time/timekeeping.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2234,8 +2234,8 @@ static bool timekeeping_advance(enum tim
 				   tk->tkr_mono.cycle_last, tk->tkr_mono.mask,
 				   tk->tkr_mono.clock->max_raw_delta);
 
-	/* Check if there's really nothing to do */
-	if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK)
+	/* Check if there's really something to do */
+	if (offset < real_tk->cycle_interval)
 		return false;
 
 	offset = timekeeping_accumulate(tk, offset, mode, &clock_set);