Re: [Y2038] [PATCH] vfs: replace current_kernel_time64 with ktime equivalent

20 Jun 2018


On Wed, Jun 20, 2018 at 5:40 PM, Andi Kleen <ak@linux.intel.com> wrote:
...
> Arnd Bergmann <arnd@arndb.de> writes:
...
>> I traced the original addition of the current_kernel_time() call to set
>> the nanosecond fields back to linux-2.5.48, where Andi Kleen added a
>> patch with subject "nanosecond stat timefields". This adds the original
>> call to current_kernel_time and the truncation to the resolution of the
>> file system, but makes no mention of the intended accuracy.  At the time,
>> we had a do_gettimeofday() interface that on some architectures could
>> return a microsecond-resolution timestamp, but there was no interface
>> for getting an accurate timestamp in nanosecond resolution, neither inside
>> the kernel nor from user space. This makes me suspect that the use of
>> coarse timestamps was never really a conscious decision but instead
>> a result of whatever API was available 16 years ago.
>
> Kind of. VFS/system calls are expensive enough that you need multiple us
> in and out so us resolution was considered good enough.
To clarify: current_kernel_time() uses at most millisecond resolution rather
than microsecond, as tkr_mono.xtime_nsec only gets updated during the
timer tick.
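
The tick-granular behaviour can also be observed from user space. The following is a minimal sketch, assuming Linux with glibc, that uses CLOCK_REALTIME_COARSE as the user-visible counterpart of the tick-updated kernel timestamp. The helper names clock_resolution_ns() and distinct_readings() are made up for illustration and are not part of the patch under discussion.

```c
#define _GNU_SOURCE
#include <time.h>

/* Advertised resolution of a clock in nanoseconds, or -1 on error. */
long clock_resolution_ns(clockid_t id)
{
    struct timespec res;

    if (clock_getres(id, &res) != 0)
        return -1;
    return (long)res.tv_sec * 1000000000L + res.tv_nsec;
}

/*
 * Read the clock n times back to back and count how many readings
 * differ from the one before. A tick-updated clock should yield far
 * fewer distinct values than a fine-grained one. Returns -1 on error.
 */
long distinct_readings(clockid_t id, long n)
{
    struct timespec prev = { 0, 0 }, now;
    long distinct = 0;

    for (long i = 0; i < n; i++) {
        if (clock_gettime(id, &now) != 0)
            return -1;
        if (i == 0 || now.tv_sec != prev.tv_sec ||
            now.tv_nsec != prev.tv_nsec)
            distinct++;
        prev = now;
    }
    return distinct;
}
```

Calling both helpers once with CLOCK_REALTIME_COARSE and once with CLOCK_REALTIME from a small main() should show the coarse clock reporting a resolution of one timer tick and repeating the same value across many consecutive reads.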
Has that time scale changed over the past 16 years as CPUs got faster
(and system call entry slowed down again with recent changes)?
I tried a simple test in the shell, on tmpfs here, and saw:
$ for i in `seq -w 100000` ; do > $i ; done
$ stat * | less | grep Modify | uniq -c | head
    601 Modify: 2018-06-20 18:04:48.794314629 +0200
    920 Modify: 2018-06-20 18:04:48.798314691 +0200
    936 Modify: 2018-06-20 18:04:48.802314753 +0200
    937 Modify: 2018-06-20 18:04:48.806314816 +0200
    901 Modify: 2018-06-20 18:04:48.810314878 +0200
    929 Modify: 2018-06-20 18:04:48.814314940 +0200
    931 Modify: 2018-06-20 18:04:48.818315002 +0200
    894 Modify: 2018-06-20 18:04:48.822315064 +0200
    952 Modify: 2018-06-20 18:04:48.826315128 +0200
    898 Modify: 2018-06-20 18:04:48.830315190 +0200
which indicates that the result of ktime_get_coarse_real_ts64()
gets updated every four milliseconds here (matching the
CONFIG_HZ_250 setting in my running kernel), and that
we can create around 900 files during that time that each
get the same timestamp (strace shows 10 system calls for
each new file). Trying the same on btrfs, I get around 260
files per jiffy.
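
As a back-of-the-envelope check of these numbers, the sketch below converts files per jiffy into average time per file and per system call. The helper names are hypothetical, and the results are wall-clock averages that include the time spent in the shell loop itself, not only kernel time.

```c
/* Timer tick period in nanoseconds for a given CONFIG_HZ value. */
long tick_period_ns(long hz)
{
    return 1000000000L / hz;
}

/* Average wall-clock time per created file, in nanoseconds. */
long ns_per_file(long hz, long files_per_tick)
{
    return tick_period_ns(hz) / files_per_tick;
}

/* Average wall-clock time per system call, in nanoseconds. */
long ns_per_syscall(long hz, long files_per_tick, long syscalls_per_file)
{
    return tick_period_ns(hz) / (files_per_tick * syscalls_per_file);
}
```

With HZ=250, ns_per_file(250, 900) gives 4444, so roughly 4.4 us per file on tmpfs, and ns_per_syscall(250, 900, 10) gives 444, so about 0.44 us per system call on average. For the btrfs run, ns_per_file(250, 260) gives 15384, or roughly 15 us per file.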
...
> Also if you do this change you really need to do some benchmarks,
> especially on setups without lazy atime. This might potentially
> cause a lot more inode flushes.
Good point. On the other hand, there may be some reasons to
do it even if there is a noticeable overhead, in cases where we
actually want hires timestamps, so perhaps this could be
a mount option.
Arnd
