The following patches introduce a debug feature, accurate timestamps on all kernel logs during boot.
At the moment until time is set up in the kernel, all logs are timestamped [0.000000] giving the impression these early boot activities have no duration. That's far from the case.
These patches give accurate, monotonic timestamps from the very first log entry allowing insight into where the time is going during the whole of the boot process.
It's a debug feature like DEBUG_LL, it does not cooperate with ARCH_MULTIPLATFORM (it works with such a kernel though) but can only be enabled for a single platform at compile-time.
To use it, you need an ARM SoC with a globaltimer, which is very common now. Even if the globaltimer is used by the kernel, we can use it in early boot and stop using it as soon as a normal time source is set up, so there's no conflict.
There are more details on how to use it and examples of the results in the second patch.
---
Andy Green (2): scheduler: time: allow arch-specific time offset function arm: time: add globaltimer-based arch_early_time
arch/arm/Kconfig | 4 +++ arch/arm/Kconfig.debug | 35 ++++++++++++++++++++++++++ arch/arm/kernel/head.S | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ arch/arm/kernel/time.c | 41 ++++++++++++++++++++++++++++++ include/linux/time.h | 6 ++++ kernel/sched/clock.c | 12 ++++----- 6 files changed, 157 insertions(+), 6 deletions(-)
This introduces an optional arch_early_time() which by default doesn't exist and doesn't cause any extra code to be generated.
If it is configured and declared though, it can be used in an architecture or platform-specific way to provide time during early boot.
After early boot when normal time is available, it can stop counting and just add its offset, so time in the boot is monotonic.
Signed-off-by: Andy Green andy.green@linaro.org --- include/linux/time.h | 6 ++++++ kernel/sched/clock.c | 12 ++++++------ 2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/include/linux/time.h b/include/linux/time.h index d5d229b..57dc58e 100644 --- a/include/linux/time.h +++ b/include/linux/time.h @@ -181,6 +181,12 @@ extern s32 timekeeping_get_tai_offset(void); extern void timekeeping_set_tai_offset(s32 tai_offset); extern void timekeeping_clocktai(struct timespec *ts);
+#ifdef CONFIG_HAS_ARCH_EARLY_TIME +extern u64 arch_early_time(u64 normal); +#else +#define arch_early_time(_x) ((u64)_x) +#endif + struct tms; extern void do_sys_times(struct tms *);
diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c index c3ae144..04b0490 100644 --- a/kernel/sched/clock.c +++ b/kernel/sched/clock.c @@ -245,10 +245,10 @@ u64 sched_clock_cpu(int cpu) WARN_ON_ONCE(!irqs_disabled());
if (sched_clock_stable) - return sched_clock(); + return arch_early_time(sched_clock());
if (unlikely(!sched_clock_running)) - return 0ull; + return arch_early_time(0);
scd = cpu_sdc(cpu);
@@ -257,7 +257,7 @@ u64 sched_clock_cpu(int cpu) else clock = sched_clock_local(scd);
- return clock; + return arch_early_time(clock); }
void sched_clock_tick(void) @@ -339,7 +339,7 @@ u64 local_clock(void) unsigned long flags;
local_irq_save(flags); - clock = sched_clock_cpu(smp_processor_id()); + clock = arch_early_time(sched_clock_cpu(smp_processor_id())); local_irq_restore(flags);
return clock; @@ -355,9 +355,9 @@ void sched_clock_init(void) u64 sched_clock_cpu(int cpu) { if (unlikely(!sched_clock_running)) - return 0; + return arch_early_time(0);
- return sched_clock(); + return arch_early_time(sched_clock()); }
u64 cpu_clock(int cpu)
This uses the generic arch_early_time support added in the pervious patch to define an ARM globaltimer-based implementation that can be configured through CONFIG_.
It uses the early time to provide detailed and consistent time information in your logs. For example, what used to look like this
[ 0.000000] Booting Linux on physical CPU 0x0 [ 0.000000] Linux version 3.11.0-linaro_fjslt-tracking-bsp2.0.0-rc1+ (agreen@build.warmcat.com) (gcc version 4.7.3 20130205 (prerelease) (crosstool-NG linaro-1.13.1-4.7-2013.02-01-20130221 - Linaro GCC 2013.02) ) #193 SMP PREEMPT Thu Aug 8 09:13:53 CST 2013 [ 0.000000] CPU: ARMv7 Processor [412fc092] revision 2 (ARMv7), cr=50c5387d [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache [ 0.000000] Machine: Fujitsu Semiconductor MB8AC0300-EVB, model: Fujitsu MB8AC0300 Evaluation Board [ 0.000000] early_init_dt_add_memory_arch: 40000000 10000000 [ 0.000000] cma: CMA: reserved 64 MiB at 4c000000 [ 0.000000] Memory policy: ECC disabled, Data cache writealloc [ 0.000000] On node 0 totalpages: 65536 [ 0.000000] Normal zone: 512 pages used for memmap [ 0.000000] Normal zone: 0 pages reserved [ 0.000000] Normal zone: 65536 pages, LIFO batch:15 [ 0.000000] PERCPU: Embedded 7 pages/cpu @804f4000 s7296 r8192 d13184 u32768 [ 0.000000] pcpu-alloc: s7296 r8192 d13184 u32768 alloc=8*4096 [ 0.000000] pcpu-alloc: [0] 0 [0] 1 [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 65024 [ 0.000000] Kernel command line: loglevel=4 mb8ac0300_ts.adjust_y=-40 console=tty0 console=ttyS0,115200 root=/dev/mmcblk0p2 rootfstype=ext4 rw rootwait [ 0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes) [ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) [ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) [ 0.000000] Memory: 191100K/262144K available (1734K kernel code, 115K rwdata, 716K rodata, 107K init, 236K bss, 71044K reserved, 0K highmem) [ 0.000000] Virtual kernel memory layout: [ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB) [ 0.000000] fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB) [ 0.000000] vmalloc : 0x90800000 - 0xff000000 (1768 MB) [ 0.000000] lowmem : 0x80000000 - 0x90000000 ( 256 MB) [ 0.000000] pkmap : 0x7fe00000 - 0x80000000 ( 2 MB) [ 0.000000] modules : 0x7f800000 - 0x7fe00000 ( 6 MB) [ 0.000000] .text : 0x80008000 - 0x8026cb28 (2451 kB) [ 0.000000] .init : 0x8026d000 - 0x80287c80 ( 108 kB) [ 0.000000] .data : 0x80288000 - 0x802a4ca0 ( 116 kB) [ 0.000000] .bss : 0x802a4ca0 - 0x802dff74 ( 237 kB) [ 0.000000] Preemptible hierarchical RCU implementation. [ 0.000000] Dump stacks of tasks blocking RCU-preempt GP. [ 0.000000] RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2. [ 0.000000] NR_IRQS:16 nr_irqs:16 16 [ 0.000000] sched_clock: 32 bits at 41MHz, resolution 24ns, wraps every 103727ms [ 0.000000] Console: colour dummy device 80x30 [ 0.000000] console [tty0] enabled [ 0.000734] Calibrating delay loop... 1318.91 BogoMIPS (lpj=6594560) ...
will look like this
[ 0.000418] Booting Linux on physical CPU 0x0 [ 0.000439] Linux version 3.11.0-linaro_fjslt-tracking-bsp2.0.0-rc1+ (agreen@build.warmcat.com) (gcc version 4.7.3 20130205 (prerelease) (crosstool-NG linaro-1.13.1-4.7-2013.02-01-20130221 - Linaro GCC 2013.02) ) #194 SMP PREEMPT Thu Aug 8 09:16:22 CST 2013 [ 0.000468] CPU: ARMv7 Processor [412fc092] revision 2 (ARMv7), cr=50c5387d [ 0.000481] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache [ 0.000504] Machine: Fujitsu Semiconductor MB8AC0300-EVB, model: Fujitsu MB8AC0300 Evaluation Board [ 0.000664] early_init_dt_add_memory_arch: 40000000 10000000 [ 0.000824] cma: CMA: reserved 64 MiB at 4c000000 [ 0.000838] Memory policy: ECC disabled, Data cache writealloc [ 0.010319] On node 0 totalpages: 65536 [ 0.010339] Normal zone: 512 pages used for memmap [ 0.010349] Normal zone: 0 pages reserved [ 0.010361] Normal zone: 65536 pages, LIFO batch:15 [ 0.032012] PERCPU: Embedded 7 pages/cpu @804f4000 s7296 r8192 d13184 u32768 [ 0.032042] pcpu-alloc: s7296 r8192 d13184 u32768 alloc=8*4096 [ 0.032053] pcpu-alloc: [0] 0 [0] 1 [ 0.032091] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 65024 [ 0.032106] Kernel command line: loglevel=4 mb8ac0300_ts.adjust_y=-40 console=tty0 console=ttyS0,115200 root=/dev/mmcblk0p2 rootfstype=ext4 rw rootwait [ 0.032292] PID hash table entries: 1024 (order: 0, 4096 bytes) [ 0.032444] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) [ 0.032708] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) [ 0.040695] Memory: 191100K/262144K available (1734K kernel code, 115K rwdata, 716K rodata, 107K init, 236K bss, 71044K reserved, 0K highmem) [ 0.040750] Virtual kernel memory layout: [ 0.040750] vector : 0xffff0000 - 0xffff1000 ( 4 kB) [ 0.040750] fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB) [ 0.040750] vmalloc : 0x90800000 - 0xff000000 (1768 MB) [ 0.040750] lowmem : 0x80000000 - 0x90000000 ( 256 MB) [ 0.040750] pkmap : 0x7fe00000 - 0x80000000 ( 2 MB) [ 0.040750] modules : 0x7f800000 - 0x7fe00000 ( 6 MB) [ 0.040750] .text : 0x80008000 - 0x8026cb28 (2451 kB) [ 0.040750] .init : 0x8026d000 - 0x80287c80 ( 108 kB) [ 0.040750] .data : 0x80288000 - 0x802a4ca0 ( 116 kB) [ 0.040750] .bss : 0x802a4ca0 - 0x802dff74 ( 237 kB) [ 0.040997] Preemptible hierarchical RCU implementation. [ 0.041010] Dump stacks of tasks blocking RCU-preempt GP. [ 0.041022] RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2. [ 0.041059] NR_IRQS:16 nr_irqs:16 16 [ 0.048966] sched_clock: 32 bits at 41MHz, resolution 24ns, wraps every 103727ms [ 0.049495] Console: colour dummy device 80x30 [ 0.049688] console [tty0] enabled [ 0.050483] Calibrating delay loop... 1318.91 BogoMIPS (lpj=6594560) [ 0.138981] pid_max: default: 32768 minimum: 301 ...
when normal time starts, it is offset by the corresponding early time, so the logs are monotonic and consistent in time.
Like DEBUG_LL it's not compatible with ARCH_MULTIPLATFORM and not designed to be on by default, it's a debugging aid for studying where the real time is going during early kernel boot.
If you have a globaltimer on your SoC, to use it set CONFIG_EARLY_TIME_ARM_GLOBALTIMER=y and ..._PHYS to the base address of the SoC region that contains the globaltimer at +0x200.
..._VIRT needs to be set to somewhere in the static virtual mapping region where it won't step on anything: if you already have a static virtual mapping for this area you can re-use the static mapping virtual address here.
..._FREQ needs to be the clock rate of the globaltimer clock in Hz.
Normally you want to enable ..._ZERO, however it's configurable because you may also start the globaltimer in your bootloader, which will let you measure the total time from that to events in the kernel boot flow.
For example:
CONFIG_EARLY_TIME_ARM_GLOBALTIMER=y CONFIG_EARLY_TIME_ARM_GLOBALTIMER_ZERO=y CONFIG_EARLY_TIME_ARM_GLOBALTIMER_PHYS=0xf8100000 CONFIG_EARLY_TIME_ARM_GLOBALTIMER_VIRT=0xfd100000 CONFIG_EARLY_TIME_ARM_GLOBALTIMER_FREQ=165500000
Signed-off-by: Andy Green andy.green@linaro.org --- arch/arm/Kconfig | 4 +++ arch/arm/Kconfig.debug | 35 ++++++++++++++++++++++++++ arch/arm/kernel/head.S | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ arch/arm/kernel/time.c | 41 ++++++++++++++++++++++++++++++ 4 files changed, 145 insertions(+)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index f3f4bcf..a3b1d75 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -2311,6 +2311,10 @@ config ARM_CPU_SUSPEND
endmenu
+config HAS_ARCH_EARLY_TIME + bool + default n + source "net/Kconfig"
source "drivers/Kconfig" diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug index de41088..e3a6570 100644 --- a/arch/arm/Kconfig.debug +++ b/arch/arm/Kconfig.debug @@ -900,4 +900,39 @@ config PID_IN_CONTEXTIDR additional instructions during context switch. Say Y here only if you are planning to use hardware trace tools with this kernel.
+config EARLY_TIME_ARM_GLOBALTIMER + bool "Enable early Arm globaltimer support" + default n + select HAS_ARCH_EARLY_TIME + help + If you have a spare globaltimer in your SoC, you can use it to + get accurate log timings starting from very early in boot and + before the very first log entry. + You'll need to speify some additional gnarly details if enabled. + +config EARLY_TIME_ARM_GLOBALTIMER_ZERO + bool "zeroing the globaltimer early in boot" + depends on EARLY_TIME_ARM_GLOBALTIMER + default y + help + If you started the globaltimer in your bootloader, you can disable + this to stop the kernel code zeroing the timer. Then your logs + will reflect time from the point it was zeroed in the bootloader. + But normally, you want to see time from very early in kernel boot. + +config EARLY_TIME_ARM_GLOBALTIMER_PHYS + hex "Physical address of your globaltimer" + depends on EARLY_TIME_ARM_GLOBALTIMER + +config EARLY_TIME_ARM_GLOBALTIMER_VIRT + hex "Virtual mapping address of your globaltimer" + depends on EARLY_TIME_ARM_GLOBALTIMER + help + This shoud match a static mapping so the timer can be used + seamlessly after the static mappings are set up + +config EARLY_TIME_ARM_GLOBALTIMER_FREQ + int "Frequency of your globaltimer in Hz" + depends on EARLY_TIME_ARM_GLOBALTIMER + endmenu diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S index 19aaa24..1725846 100644 --- a/arch/arm/kernel/head.S +++ b/arch/arm/kernel/head.S @@ -336,6 +336,71 @@ __create_page_tables: str r3, [r0] #endif #endif + +#ifdef CONFIG_EARLY_TIME_ARM_GLOBALTIMER + + /* allow cpu0 and cpu1 nonsec access to global timer */ + + ldr r3, =CONFIG_EARLY_TIME_ARM_GLOBALTIMER_PHYS /* no mmu yet*/ + orr r3, r3, #0x54 + ldr r7, [r3] + orr r7, r7, #0x300 /* cpu0 and 1 nonsec access */ + str r7, [r3] + +#ifdef CONFIG_EARLY_TIME_ARM_GLOBALTIMER_ZERO + + /* disable global timer */ + + ldr r3, =CONFIG_EARLY_TIME_ARM_GLOBALTIMER_PHYS + orr r3, r3, #0x208 + mov r7, #0 + str r7, [r3] + + /* zero global timer */ + + ldr r3, =CONFIG_EARLY_TIME_ARM_GLOBALTIMER_PHYS + orr r3, r3, #0x200 + mov r7, #0 + str r7, [r3] + add r3, r3, #4 + str r7, [r3] + + /* enable global timer */ + + add r3, r3, #4 + mov r7, #1 + str r7, [r3] + +#endif + + /* map it DEBUG_LL style */ + + ldr r7, =CONFIG_EARLY_TIME_ARM_GLOBALTIMER_PHYS + ldr r3, =CONFIG_EARLY_TIME_ARM_GLOBALTIMER_VIRT + + mov r3, r3, lsr #SECTION_SHIFT + mov r3, r3, lsl #PMD_ORDER + + add r0, r4, r3 + mov r3, r7, lsr #SECTION_SHIFT + ldr r7, [r10, #PROCINFO_IO_MMUFLAGS] @ io_mmuflags + orr r3, r7, r3, lsl #SECTION_SHIFT +#ifdef CONFIG_ARM_LPAE + mov r7, #1 << (54 - 32) @ XN +#ifdef CONFIG_CPU_ENDIAN_BE8 + str r7, [r0], #4 + str r3, [r0], #4 +#else + str r3, [r0], #4 + str r7, [r0], #4 +#endif +#else + orr r3, r3, #PMD_SECT_XN + str r3, [r0], #4 +#endif +#endif /* CONFIG_EARLY_TIME_ARM_GLOBALTIMER */ + + #ifdef CONFIG_ARM_LPAE sub r4, r4, #0x1000 @ point to the PGD table mov r4, r4, lsr #ARCH_PGD_SHIFT diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c index 98aee32..0f84f8e 100644 --- a/arch/arm/kernel/time.c +++ b/arch/arm/kernel/time.c @@ -121,3 +121,44 @@ void __init time_init(void) else clocksource_of_init(); } + +#ifdef CONFIG_EARLY_TIME_ARM_GLOBALTIMER + +u64 arch_early_time(u64 normal) +{ + u32 __iomem *p = (u32 __iomem *) + (CONFIG_EARLY_TIME_ARM_GLOBALTIMER_VIRT + 0x200); + u64 ts; + static u64 handover; + + /* if normal time had started already, just return the offset */ + + if (handover) + return handover + normal; + + ts = __raw_readl(p); + ts |= ((u64)__raw_readl(p + 1)) << 32; + + /* accumulate ps */ + + ts = (ts * (int)(1024000000000.0 * + (1.0 / (double)CONFIG_EARLY_TIME_ARM_GLOBALTIMER_FREQ))) >> 10; + + /* + * has some normal time appeared? Let's stop our early time and + * keep it for use as an offset, so we get monotonic logs + * Subtract the normal time to make the sample above a valid + * offset for 0 normal time. + */ + + if (!handover && normal) { + ts -= normal; + handover = ts; + } + + return ts + normal; +} +EXPORT_SYMBOL_GPL(arch_early_time); + +#endif +
On 8 August 2013 03:44, Andy Green andy.green@linaro.org wrote:
These patches give accurate, monotonic timestamps from the very first log entry allowing insight into where the time is going during the whole of the boot process.
It's a debug feature like DEBUG_LL, it does not cooperate with ARCH_MULTIPLATFORM (it works with such a kernel though) but can only be enabled for a single platform at compile-time.
I think this kind of only-one-platform compile-time-set feature is definitely going in the wrong direction, even for a debug feature. It's bad enough that DEBUG_LL works like that.
Can't you put the relevant information into the device tree so that it works on multiplatform kernels? That's the way the kernel's chosen to store its "this is the config for this platform" data, after all...
-- PMM
On 8 August 2013 17:35, Peter Maydell peter.maydell@linaro.org wrote:
On 8 August 2013 03:44, Andy Green andy.green@linaro.org wrote:
These patches give accurate, monotonic timestamps from the very first log entry allowing insight into where the time is going during the whole of the boot process.
It's a debug feature like DEBUG_LL, it does not cooperate with ARCH_MULTIPLATFORM (it works with such a kernel though) but can only be enabled for a single platform at compile-time.
I think this kind of only-one-platform compile-time-set feature is definitely going in the wrong direction, even for a debug feature. It's bad enough that DEBUG_LL works like that.
Yes it's doing the same thing as DEBUG_LL. DEBUG_LL hides things in a per-platform assembler file in addition to config.
Can't you put the relevant information into the device tree so that it works on multiplatform kernels? That's the way the kernel's chosen to store its "this is the config for this platform" data, after all...
I assumed if it could be done, DEBUG_LL would be doing it.
It's starting time in assembler before the MMU is switched on. I don't think there's any existing code for walking the dtb at that time.
Maybe it can start time a bit later after MMU and use the early dtb walking bits, I'll take a look but it's not a normal situation and may not have a normal solution.
-Andy
On 8 August 2013 11:23, Andy Green andy.green@linaro.org wrote:
On 8 August 2013 17:35, Peter Maydell peter.maydell@linaro.org wrote
Can't you put the relevant information into the device tree so that it works on multiplatform kernels? That's the way the kernel's chosen to store its "this is the config for this platform" data, after all...
I assumed if it could be done, DEBUG_LL would be doing it.
I think DEBUG_LL should also be doing it. The virtualization team is going to be looking into this, because we really don't want to have lots of people trying to start VMs and coming back with "I get no output" bug reports :-)
-- PMM
On 8 August 2013 18:50, Peter Maydell peter.maydell@linaro.org wrote:
On 8 August 2013 11:23, Andy Green andy.green@linaro.org wrote:
On 8 August 2013 17:35, Peter Maydell peter.maydell@linaro.org wrote
Can't you put the relevant information into the device tree so that it works on multiplatform kernels? That's the way the kernel's chosen to store its "this is the config for this platform" data, after all...
I assumed if it could be done, DEBUG_LL would be doing it.
I think DEBUG_LL should also be doing it. The virtualization team is going to be looking into this, because we really don't want to have lots of people trying to start VMs and coming back with "I get no output" bug reports :-)
Well good luck it looks like a bigger job.
I can see how to get it from the flat device tree by adding properties to chosen { } and riding on early_init_dt_scan_chosen() easily enough. From the results I have already I know on this platform that's ~500us after where we set it up at the moment, which is not really hurting. So I'll give that a go, thanks for the advice.
-Andy
On 8 August 2013 12:05, Andy Green andy.green@linaro.org wrote:
I can see how to get it from the flat device tree by adding properties to chosen { } and riding on early_init_dt_scan_chosen() easily enough. From the results I have already I know on this platform that's ~500us after where we set it up at the moment, which is not really hurting. So I'll give that a go, thanks for the advice.
Thanks for looking into the device-tree approach -- as a non-kernel dev I appreciate that it's easier for me to throw stones from the sidelines than it is for you to actually implement things ;-)
-- PMM