On Wed, Jun 20, 2018 at 6:19 PM, Andi Kleen <ak@linux.intel.com> wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
>> To clarify: current_kernel_time() uses at most millisecond resolution rather than microsecond, as tkr_mono.xtime_nsec only gets updated during the timer tick.
> Ah you're right. I remember now: the motivation was to make sure there is basically no overhead. In some setups the full gtod can be rather slow, particularly if it falls back to some crappy timer.
> I think it would be ok if it falls back to jiffies if TSC or a similar fast timer doesn't work. But the function you're using likely doesn't do that?
My patch as posted just uses ktime_get_coarse_real_ts64(), which never accesses the hires clocksource; the change is purely cosmetic so far.
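To make that concrete, the posted patch essentially boils down to the following in fs/inode.c (a sketch of the idea, not the exact diff):

struct timespec64 current_time(struct inode *inode)
{
	struct timespec64 now;

	/* previously: now = current_kernel_time64(); same coarse value */
	ktime_get_coarse_real_ts64(&now);

	return timespec64_trunc(now, inode->i_sb->s_time_gran);
}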
The timekeeping and clocksource core code (maintainers added to Cc) doesn't yet export an API we could use to determine whether the clocksource is "fast" or not, but I would expect we could add one if needed.
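To sketch what such an API might look like (hypothetical helper, name made up, and it glosses over the seqcount protection a real implementation would need; the CLOCK_SOURCE_VALID_FOR_HRES flag is only a rough proxy for "fast"):

/* hypothetical, would live in kernel/time/timekeeping.c */
bool ktime_real_is_fast(void)
{
	struct clocksource *clock = tk_core.timekeeper.tkr_mono.clock;

	/* treat anything usable for hres timers as cheap to read */
	return clock->flags & CLOCK_SOURCE_VALID_FOR_HRES;
}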
This is also something that has definitely changed over the years since your patch was originally added. Back then, the x86 TSC probably wasn't reliable enough to depend on, but now I would guess that very few x86 machines still in production use are affected. On embedded systems, we used to have all kinds of clocksource drivers with varying characteristics, but nowadays the embedded market is dominated by ARMv7VE (Cortex-A7/A15/A17) or ARMv8, which are required to have a fast clocksource (drivers/clocksource/arm_arch_timer.c), and a lot of the others have one too (risc-v, modern mips, all ppc32, most ARM Cortex-A9, ...). The traditional non-x86 architectures (s390, powerpc, sparc) that are still being used have of course had low-latency clocksource access for a much longer time.
This means we're probably fine with a compile-time option that distros can enable depending on what classes of hardware they are targeting, like:
struct timespec64 current_time(struct inode *inode)
{
	struct timespec64 now;
	u64 gran = inode->i_sb->s_time_gran;

	if (IS_ENABLED(CONFIG_HIRES_INODE_TIMES) &&
	    gran <= NSEC_PER_JIFFY)
		ktime_get_real_ts64(&now);
	else
		ktime_get_coarse_real_ts64(&now);

	return timespec64_trunc(now, gran);
}
With that implementation, file systems could still opt into coarse timestamps by tuning s_time_gran in the superblock, which would result in nice round tv_nsec values that reflect the actual accuracy.
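This is the same knob file systems already set when filling the superblock, along the lines of the sketch below (the function is illustrative; ext4, for instance, uses 1 ns because its on-disk format stores nanoseconds):

static int example_fill_super(struct super_block *sb, void *data, int silent)
{
	/* full nanosecond resolution stored on disk */
	sb->s_time_gran = 1;

	/* a format with one-second timestamps would instead use:
	 * sb->s_time_gran = NSEC_PER_SEC;
	 */
	return 0;
}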
Obviously this still needs performance testing on various bits of real hardware, but I can imagine that the overhead is rather small on hardware from the past five years.
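A cheap first-order check is possible from user space, since CLOCK_REALTIME and CLOCK_REALTIME_COARSE map to the fine and coarse readouts discussed above. A quick-and-dirty benchmark (numbers will obviously vary by clocksource):

/* gcc -O2 -o clockbench clockbench.c */
#include <stdio.h>
#include <time.h>

static void bench(clockid_t id, const char *name)
{
	struct timespec start, end, ts;
	long long i, iters = 10000000, ns;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iters; i++)
		clock_gettime(id, &ts);
	clock_gettime(CLOCK_MONOTONIC, &end);

	ns = (end.tv_sec - start.tv_sec) * 1000000000LL +
	     (end.tv_nsec - start.tv_nsec);
	printf("%-22s %.1f ns/call\n", name, (double)ns / iters);
}

int main(void)
{
	bench(CLOCK_REALTIME, "CLOCK_REALTIME");
	bench(CLOCK_REALTIME_COARSE, "CLOCK_REALTIME_COARSE");
	return 0;
}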
Arnd