Been terrible about sending out status this month. My apologies!
=== Highlights ===
* I wouldn't call it a highlight, but the leap second debacle has taken up
the majority of my time this month. The good news is that the fixes went
upstream at the end of last week and backports to all seven -stable
branches were sent out yesterday. Nothing has blown up yet, so
this issue seems just about put to bed.
* Implemented a compatibility shim for earlysuspend so the released
Android userlands match up with upstream AOSP 3.4 and my forward-ported
3.5 Android trees. This has been merged into Andrey's tree.
* Got an invite to kernel summit. Plan to discuss timekeeping/leapsecond
issues as well as virtual memory issues connected to vmevent changes
needed for the userland-low-memory-killer work and the volatile range
ashmem alternative. Also likely will discuss general android upstreaming
progress.
* Queued CLOCK_TICK_RATE change from the aarch64 work. Will push for 3.6
* Ran two android-kernel-subteam calls
* Had meeting w/ Zach, Bero and Alexander on blueprints that are waiting
for AOSP userland updates for 3.4+ kernels.
* Reviewed Shawn's alarmtimer removal patch and found some issues.
* Re-pinged Google devs on mmc wakelock changes.
=== Plans ===
* Vacation! I'm out of the office 19-20.
* Finish digging myself out of my mail backlog
* Get back to LRU_VOLATILE work. Try to finish my rough draft that I
started and send it out to lkml.
* Try to get the ETM driver up and running on my panda board
=== Issues ===
* Fried brain
It seems that the 'ftrace_enabled' flag should not be used inside the tracer
functions. The ftrace core is using this flag for internal purposes, and
the flag wasn't meant to be used in tracers' runtime checks.
The stack tracer is the only tracer that abuses the flag. So stop it from
serving as a bad example.
Also, there is a local 'stack_trace_disabled' flag in the stack tracer,
which is never updated; so it can be removed as well.
Signed-off-by: Anton Vorontsov <anton.vorontsov(a)linaro.org>
---
kernel/trace/trace_stack.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index d4545f4..94c2c53 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -33,7 +33,6 @@ static unsigned long max_stack_size;
static arch_spinlock_t max_stack_lock =
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
-static int stack_trace_disabled __read_mostly;
static DEFINE_PER_CPU(int, trace_active);
static DEFINE_MUTEX(stack_sysctl_mutex);
@@ -115,9 +114,6 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip)
{
int cpu;
- if (unlikely(!ftrace_enabled || stack_trace_disabled))
- return;
-
preempt_disable_notrace();
cpu = raw_smp_processor_id();
--
1.7.10.4
== Highlights ==
* Implemented async knocking to the KDB FIQ debugger: the special $3#33
command is now used to enter the debugger. Plus prepared a new patch
that makes the alignment_trap asm macro self-contained; also reworked KGDB
FIQ callbacks to support multi-mach kernels. This is all included
in v2 (so far there have been no comments on this iteration; maybe
everyone is busy w/ their own patches on the merge window's eve? :-)
* Addressed Steven's comments on pstore/tracing patchset and resent it.
The patches are now in the linux-next tree. Some new minor
enhancements (as suggested by Steven) in the works;
* Fixed Dan Carpenter's comments on pstore's configurable ECC size
patchset (mostly documentation). The patchset is now merged into
-next;
* The merge window is approaching, so needed to review/apply a bunch
of battery-related patches.
== Plans ==
* Finally update lowmemory killer blueprint;
* Continue work on FIQ debugger: implement "kiosk" mode.
--
Anton Vorontsov
Email: cbouatmailru(a)gmail.com
write_buf() should be marked as notrace, otherwise it is prone to
recursion.
So far the issue has never been triggered in real life, because we run
inside the function tracer, where ftrace does its own recursion protection.
But it's still no good, plus soon we might switch to our own tracer ops,
and then the issue will be fatal. So, let's fix it.
Signed-off-by: Anton Vorontsov <anton.vorontsov(a)linaro.org>
---
fs/pstore/ram.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 0b311bc..b86b2b7 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -32,6 +32,7 @@
#include <linux/ioport.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
+#include <linux/compiler.h>
#include <linux/pstore_ram.h>
#define RAMOOPS_KERNMSG_HDR "===="
@@ -181,12 +182,11 @@ static size_t ramoops_write_kmsg_hdr(struct persistent_ram_zone *prz)
return len;
}
-
-static int ramoops_pstore_write_buf(enum pstore_type_id type,
- enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
- const char *buf, size_t size,
- struct pstore_info *psi)
+static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
+ enum kmsg_dump_reason reason,
+ u64 *id, unsigned int part,
+ const char *buf, size_t size,
+ struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
struct persistent_ram_zone *prz = cxt->przs[cxt->dump_write_cnt];
--
1.7.10.4
=== omarrmz ===
=== Highlights ===
* Setting up environment to test power management, specifically for
suspend/resume and OFF mode if available on 3.5-rc6.
* Digging mailing list for fixes related to suspend/resume not working
on Panda 4460
- Looks like there were some fixes based on 3.5-rc2 from Tero for an
existing bug only affecting 4460 that prevents the secondary CPU from
correctly booting up; rebased them to 3.5-rc6 to get suspend/resume
working.
- As a side issue while trying 4430, booting from MMC doesn't work;
spent a little time investigating that, but given that community reports
show that MMC on Panda 4430 is working, abandoned the investigation.
- Recent changes in MMC now force it to use DMA engine to work, minor
mail exchange about that (needed to boot over MMC).
* Started drafting changes for iommu and mailbox OFF mode support.
* Completing TI's HR courses.
* Rebased my patches floating in the mailing list to 3.5-rc6 (minor effort)
Regards,
Omar
We've had a discussion in the Linaro storage team (Saugata, Venkat and me,
with Luca joining in on the discussion) about swapping to flash based media
such as eMMC. This is a summary of what we found and what we think should
be done. If people agree that this is a good idea, we can start working
on it.
The basic problem is that Linux without swap is sort of crippled and some
things either don't work at all (hibernate) or are not as efficient as they
should be (e.g. tmpfs). At the same time, the swap code seems to be rather
inappropriate for the algorithms used in most flash media today, causing
system performance to suffer drastically, and wearing out the flash hardware
much faster than necessary. In order to change that, we would be
implementing the following changes:
1) Try to swap out multiple pages at once, in a single write request. My
reading of the current code is that we always send pages one by one to
the swap device, while most flash devices have an optimum write size of
32 or 64 KB and some require an alignment of more than a page. Ideally
we would try to write an aligned 64 KB block all the time. Writing aligned
64 KB chunks often gives us ten times the throughput of linear 4 KB writes,
and going beyond 64 KB usually does not give any better performance.
2) Make variable sized swap clusters. Right now, the swap space is
organized in clusters of 256 pages (1MB), which is less than the typical
erase block size of 4 or 8 MB. We should try to make the swap cluster
aligned to erase blocks and have the size match to avoid garbage collection
in the drive. The cluster size would typically be set by mkswap as a new
option and interpreted at swapon time.
3) As Luca points out, some eMMC media would benefit significantly from
having discard requests issued for every page that gets freed from
the swap cache, rather than at the time just before we reuse a swap
cluster. This would probably have to become a configurable option
as well, to avoid the overhead of sending the discard requests on
media that don't benefit from this.
Does this all sound appropriate for the Linux memory management people?
Also, does this sound useful to the Android developers? Would you
start using swap if we make it perform well and not destroy the drives?
Finally, does this plan match up with the capabilities of the
various eMMC devices? I know more about SD and USB devices and
I'm quite convinced that it would help there, but eMMC can be
more like an SSD in some ways, and the current code should be fine
for real SSDs.
Arnd
Hi all,
These patches introduce KGDB FIQ debugger support. The idea (and some
code, of course) comes from Google's FIQ debugger[1]. There are some
differences (mostly implementation details, feature-wise they're almost
equivalent, or can be made equivalent, if desired).
The FIQ debugger is a facility that can be used to debug situations
when the kernel is stuck in uninterruptible sections, e.g. the kernel
loops infinitely or is deadlocked in an interrupt or with interrupts
disabled. On some development boards there is even a special NMI
button, which is very useful for debugging weird kernel hangs.
And FIQ is basically an NMI, it has a higher priority than IRQs, and
upon IRQ exception FIQs are not disabled. It is still possible to
disable FIQs (as well as some "NMIs" on other architectures), but via
special means.
So, here FIQs and NMIs are synonyms, but in the code I use NMI term
for arch-independent code, and FIQs for ARM code.
A few years ago KDB wasn't yet ready for production, or wasn't even
well-known, so originally Google implemented its own FIQ debugger
that included its own shell, ring-buffer, commands, dumping,
backtracing logic and whatnot. This is very much like PowerPC's xmon
(arch/powerpc/xmon), except that xmon has been there for a decade, so it
even predates KDB.
Anyway, nowadays KGDB/KDB is the cross-platform debugger, and the
only feature that was missing is NMI handling. This is now fixed for
ARM.
There are a few differences compared to the original (Google's) FIQ
debugger:
- Doing stuff in FIQ context is dangerous, as there we are not allowed
to cause aborts or faults. In the original FIQ debugger there was a
"signal" software-induced interrupt, upon exit from FIQ it would fire,
and we would continue to execute "dangerous" commands from there.
In KGDB/KDB we don't use signal interrupts. We can do something simpler:
set up a breakpoint, continue, and you'll trap into KGDB again
in a safe context.
It works for most cases, but I can imagine cases when you can't
set up a breakpoint. For these cases we'd better introduce a
KDB command "exit_nmi", that will raise the SW IRQ, after which
we're allowed to do anything.
- KGDB/KDB FIQ debugger shell is synchronous. In Google's version
you could have a dedicated shell always running in the FIQ context,
so when you type something on a serial line, you won't actually cause
any debugging actions, FIQ would save the characters in its own
buffer and continue execution normally. But when you hit return key
after the command, then the command is executed.
In KGDB/KDB FIQ debugger it is different. When you start any activity
on the FIQ-enabled serial console, you'll enter KGDB and kernel will
stop until you instruct it to continue.
This might look like a drastic change, but it is not. There is actually
no difference whether you have sync or async shell, or at least I
couldn't find any use-case where this would matter at all. Anyway,
it is still possible to do async shell in KDB, I just don't see any
need for this.
- Original FIQ debugger used a custom FIQ vector handling code, w/
a lot of logic in it. In this approach I'm using the fact that
FIQs are basically IRQs, except that there are a few more
registers banked, and we can actually trap from the IRQ context.
But this all does not prevent us from using a simple jump-table
based approach as used in the generic ARM entry code. So, here
I just reuse the generic approach.
Note that I am testing the code on a modelled ARM machine (QEMU Versatile),
so there might be some issues on real HW, but it works in QEMU though. :-)
Assuming you have QEMU >= 1.1.0, you can easily play with the code
using ARM/versatile defconfig and command like this:
qemu-system-arm -nographic -machine versatilepb \
-kernel linux/arch/arm/boot/zImage \
-append "console=ttyAMA0 kgdboc=ttyAMA0 kgdb_fiq.enable=1"
TODO:
1. The alignment_trap macro uses a local label, so we have to put the label
into each file that uses the macro. We can get rid of the label;
2. Need per-machine kgdb_arch_enable_nmi(), probably will introduce
a pointer to a func;
3. Since the console interrupt is actually taken over by the NMI handler, we
should make serial/uart drivers stop using TX interrupts. It is my
homework to think about how to do this better. Currently, it is
better not to use console= and kgdboc= on the same tty (it still
works, but might cause trouble if you hit a TX interrupt);
4. Address any comments. :-)
Thanks!
--
arch/arm/Kconfig | 14 +++
arch/arm/common/vic.c | 28 +++++
arch/arm/include/asm/hardware/vic.h | 2 +
arch/arm/include/asm/kgdb.h | 8 ++
arch/arm/kernel/Makefile | 1 +
arch/arm/kernel/entry-armv.S | 167 +-------------------------
arch/arm/kernel/entry-header.S | 170 +++++++++++++++++++++++++++
arch/arm/kernel/kgdb_fiq.c | 78 ++++++++++++
arch/arm/kernel/kgdb_fiq_entry.S | 80 +++++++++++++
arch/arm/mach-versatile/Makefile | 1 +
arch/arm/mach-versatile/include/mach/irqs.h | 1 +
arch/arm/mach-versatile/kgdb_fiq.c | 40 +++++++
include/linux/kgdb.h | 9 ++
kernel/debug/debug_core.c | 12 +-
kernel/debug/kdb/kdb_debugger.c | 4 +
15 files changed, 448 insertions(+), 167 deletions(-)
p.s.
[1] Original Google's FIQ debugger, fiq_* files:
http://android.git.linaro.org/gitweb?p=kernel/common.git;a=tree;f=arch/arm/…
And board support as an example of using it:
http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commitdiff;h=461cb80c1…
pp.s. If anyone is curious, a typical NMI entry looks like this
(I also executed a few commands):
Entering kdb (current=0xc781bd60, pid 1) due to NonMaskable Interrupt @ 0xc01510d0
Pid: 1, comm: swapper
CPU: 0 Not tainted (3.5.0-rc4+ #214)
PC is at __delay+0x0/0xc
LR is at panic+0x180/0x1b0
pc : [<c01510d0>] lr : [<c0286b64>] psr: 20000013
sp : c7823f24 ip : c7823f24 fp : c7823f38
r10: c02f35c4 r9 : 00000000 r8 : c0377988
r7 : 00000320 r6 : 000002bc r5 : 00000040 r4 : 00000000
r3 : c0020f4c r2 : 000002ce r1 : ffffffff r0 : 0000e2e1
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 00093177 Table: 00004000 DAC: 00000017
Backtrace:
[<c00173a4>] (dump_backtrace+0x0/0x10c) from [<c02867f4>] (dump_stack+0x18/0x1c)
r6:0000000f r5:c0361d58 r4:c7823edc
[<c02867dc>] (dump_stack+0x0/0x1c) from [<c001506c>] (show_regs+0x44/0x50)
[<c0015028>] (show_regs+0x0/0x50) from [<c0287474>] (kdb_dumpregs+0x30/0x58)
r4:c0383330
[<c0287444>] (kdb_dumpregs+0x0/0x58) from [<c00606e4>] (kdb_local.isra.5+0x354/0x5ec)
r6:c0385534 r5:c7823edc r4:00000008
[<c0060390>] (kdb_local.isra.5+0x0/0x5ec) from [<c0060a28>] (kdb_main_loop+0xac/0x1bc)
[<c006097c>] (kdb_main_loop+0x0/0x1bc) from [<c0063020>] (kdb_stub+0x2e0/0x3e8)
r8:c0385820 r7:c0364004 r6:c0382cb4 r5:c03857c8 r4:c7823e7c
[<c0062d40>] (kdb_stub+0x0/0x3e8) from [<c0059868>] (kgdb_cpu_enter.constprop.9+0x13c/0x4f8)
[<c005972c>] (kgdb_cpu_enter.constprop.9+0x0/0x4f8) from [<c0059f2c>] (kgdb_handle_exception+0x8c/0xa0)
[<c0059ea0>] (kgdb_handle_exception+0x0/0xa0) from [<c0008614>] (kgdb_fiq_do_handle+0x58/0x7c)
r8:c0377988 r7:c7823f10 r6:ffffffff r5:c7823edc r4:c7822000
[<c00085bc>] (kgdb_fiq_do_handle+0x0/0x7c) from [<c0018df4>] (__fiq_svc+0x34/0x40)
Exception stack(0xc7823edc to 0xc7823f24)
3ec0: 0000e2e1
3ee0: ffffffff 000002ce c0020f4c 00000000 00000040 000002bc 00000320 c0377988
3f00: 00000000 c02f35c4 c7823f38 c7823f24 c7823f24 c0286b64 c01510d0 20000013
3f20: ffffffff
r5:20000013 r4:c01510d0
[<c02869e4>] (panic+0x0/0x1b0) from [<c0334d94>] (mount_block_root+0xe0/0x194)
r3:00000000 r2:00000000 r1:c7823f50 r0:c02f355c
r7:c789a000
[<c0334cb4>] (mount_block_root+0x0/0x194) from [<c0335030>] (mount_root+0xec/0x114)
[<c0334f44>] (mount_root+0x0/0x114) from [<c03351c0>] (prepare_namespace+0x168/0x1bc)
r7:00000013 r6:c0025c0c r5:c0351b24 r4:c0377440
[<c0335058>] (prepare_namespace+0x0/0x1bc) from [<c03349e4>] (kernel_init+0xd0/0xfc)
r5:c0351b24 r4:c0351b24
[<c0334914>] (kernel_init+0x0/0xfc) from [<c0025c0c>] (do_exit+0x0/0x2d8)
r5:c0334914 r4:00000000
more>
kdb> md c01510d0
0xc01510d0 e2500001 8afffffd e1a0f00e e254c001 ..P...........T.
0xc01510e0 9a000033 e11c0004 0a000028 e1510004 3.......(.....Q.
0xc01510f0 e3a03000 3a00000b e16f2f14 e16fcf11 .0.....:./o...o.
0xc0151100 e042200c e3a0c001 e1a0c21c e1a02214 . B.........."..
0xc0151110 e1510002 2183300c 20511002 11b0c0ac ..Q..0.!..Q ....
0xc0151120 e1a020a2 1afffff9 e3510000 e3a02000 . ........Q.. ..
0xc0151130 01500004 31a01000 31a0f00e e3a0c102 ..P....1...1....
0xc0151140 e1b00080 e0b11001 0a000005 31510004 ..............Q1
kdb> bp __delay
Instruction(i) BP #0 at 0xc01510d0 (__delay)
is enabled addr at 00000000c01510d0, hardtype=0 installed=0
kdb> go __delay
Entering kdb (current=0xc781bd60, pid 1) due to Breakpoint @ 0xc01510d0
kdb> bt
Stack traceback for pid 1
0xc781bd60 1 0 1 0 R 0xc781bf1c *swapper
Backtrace:
[<c00173a4>] (dump_backtrace+0x0/0x10c) from [<c0017804>] (show_stack+0x18/0x1c)
r6:0000000f r5:c0361d58 r4:c0383330
[<c00177ec>] (show_stack+0x0/0x1c) from [<c006202c>] (kdb_show_stack+0x78/0x88)
[<c0061fb4>] (kdb_show_stack+0x0/0x88) from [<c00620c0>] (kdb_bt1.isra.0+0x84/0xd8)
r8:00000032 r7:00000000 r6:00000000 r5:ffffffff r4:c781bd60
[<c006203c>] (kdb_bt1.isra.0+0x0/0xd8) from [<c00623b8>] (kdb_bt+0x2a4/0x348)
r7:00000001 r6:00000000 r5:c03857d0 r4:c03856fc
[<c0062114>] (kdb_bt+0x0/0x348) from [<c005fdbc>] (kdb_parse+0x2cc/0x4f4)
r8:00000032 r7:c03856fc r6:c02fa1f8 r5:c0383614 r4:00000009
[<c005faf0>] (kdb_parse+0x0/0x4f4) from [<c0060588>] (kdb_local.isra.5+0x1f8/0x5ec)
[<c0060390>] (kdb_local.isra.5+0x0/0x5ec) from [<c0060a28>] (kdb_main_loop+0xac/0x1bc)
[<c006097c>] (kdb_main_loop+0x0/0x1bc) from [<c0063020>] (kdb_stub+0x2e0/0x3e8)
r8:c0385820 r7:c0364004 r6:c0382cb4 r5:c03857c8 r4:c7823de0
[<c0062d40>] (kdb_stub+0x0/0x3e8) from [<c0059868>] (kgdb_cpu_enter.constprop.9+0x13c/0x4f8)
[<c005972c>] (kgdb_cpu_enter.constprop.9+0x0/0x4f8) from [<c0059f2c>] (kgdb_handle_exception+0x8c/0xa0)
[<c0059ea0>] (kgdb_handle_exception+0x0/0xa0) from [<c0018ae0>] (kgdb_brk_fn+0x20/0x28)
r8:c0377988 r7:00000000 r6:60000093 r5:c01510d0 r4:c7823edc
[<c0018ac0>] (kgdb_brk_fn+0x0/0x28) from [<c00084f0>] (do_undefinstr+0xdc/0x1a8)
[<c0008414>] (do_undefinstr+0x0/0x1a8) from [<c0013e1c>] (__und_svc+0x3c/0x60)
Exception stack(0xc7823edc to 0xc7823f24)
3ec0: 0000e2e1
3ee0: ffffffff 000002ce c0020f4c 00000000 00000040 000002bc 00000320 c0377988
3f00: 00000000 c02f35c4 c7823f38 c7823f24 c7823f24 c0286b64 c01510d0 20000013
3f20: ffffffff
r7:c7823f10 r6:ffffffff r5:20000013 r4:c01510d4
[<c02869e4>] (panic+0x0/0x1b0) from [<c0334d94>] (mount_block_root+0xe0/0x194)
r3:00000000 r2:00000000 r1:c7823f50 r0:c02f355c
r7:c789a000
[<c0334cb4>] (mount_block_root+0x0/0x194) from [<c0335030>] (mount_root+0xec/0x114)
[<c0334f44>] (mount_root+0x0/0x114) from [<c03351c0>] (prepare_namespace+0x168/0x1bc)
r7:00000013 r6:c0025c0c r5:c0351b24 r4:c0377440
[<c0335058>] (prepare_namespace+0x0/0x1bc) from [<c03349e4>] (kernel_init+0xd0/0xfc)
r5:c0351b24 r4:c0351b24
[<c0334914>] (kernel_init+0x0/0xfc) from [<c0025c0c>] (do_exit+0x0/0x2d8)
r5:c0334914 r4:00000000
kdb>
kdb> ps
15 sleeping system daemon (state M) processes suppressed,
use 'ps A' to see all.
Task Addr Pid Parent [*] cpu State Thread Command
0xc781bd60 1 0 1 0 R 0xc781bf1c *swapper
0xc781bd60 1 0 1 0 R 0xc781bf1c *swapper
0xc789dd60 13 2 0 0 R 0xc789df1c kworker/0:1
0xc789d580 16 2 0 0 R 0xc789d73c kworker/u:1
0xc796cd60 23 2 0 0 R 0xc796cf1c deferwq
--
Anton Vorontsov
Email: cbouatmailru(a)gmail.com
Hi all,
Here is v2 for the KGDB FIQ debugger, the changes include:
- Per Colin Cross' suggestion, we should not enter the debugger on any
received byte (this might be a problem when there is noise on the
serial line). So there is now an additional patch that implements
"knocking" to the KDB (either via the $3#33 command or the return key,
this is configurable);
- Reworked {enable,select}_fiq/is_fiq callbacks, now multi-mach kernels
should not be a problem;
- For versatile machines there are run-time checks for a proper UART port
(the kernel will scream aloud if an out-of-range port is specified);
- Added some __init annotations;
- Since not every architecture defines FIQ_START, we can't just blindly
select the CONFIG_FIQ symbol. So ARCH_MIGHT_HAVE_FIQ is introduced;
- Add !THUMB2_KERNEL dependency for KGDB_FIQ, we don't support Thumb2
kernels;
- New patch that is used to get rid of LCcralign label in alignment_trap
macro.
Rationale:
These patches introduce KGDB FIQ debugger support. The idea (and some
code, of course) comes from Google's FIQ debugger[1]. There are some
differences (mostly implementation details, feature-wise they're almost
equivalent, or can be made equivalent, if desired).
The FIQ debugger is a facility that can be used to debug situations
when the kernel is stuck in uninterruptible sections, e.g. the kernel
loops infinitely or is deadlocked in an interrupt or with interrupts
disabled. On some development boards there is even a special NMI
button, which is very useful for debugging weird kernel hangs.
And FIQ is basically an NMI, it has a higher priority than IRQs, and
upon IRQ exception FIQs are not disabled. It is still possible to
disable FIQs (as well as some "NMIs" on other architectures), but via
special means.
So, here FIQs and NMIs are synonyms, but in the code I use NMI term
for arch-independent code, and FIQs for ARM code.
A few years ago KDB wasn't yet ready for production, or wasn't even
well-known, so originally Google implemented its own FIQ debugger
that included its own shell, ring-buffer, commands, dumping,
backtracing logic and whatnot. This is very much like PowerPC's xmon
(arch/powerpc/xmon), except that xmon has been there for a decade, so it
even predates KDB.
Anyway, nowadays KGDB/KDB is the cross-platform debugger, and the
only feature that was missing is NMI handling. This is now fixed for
ARM.
There are a few differences compared to the original (Google's) FIQ
debugger:
- Doing stuff in FIQ context is dangerous, as there we are not allowed
to cause aborts or faults. In the original FIQ debugger there was a
"signal" software-induced interrupt, upon exit from FIQ it would fire,
and we would continue to execute "dangerous" commands from there.
In KGDB/KDB we don't use signal interrupts. We can do something simpler:
set up a breakpoint, continue, and you'll trap into KGDB again
in a safe context.
It works for most cases, but I can imagine cases when you can't
set up a breakpoint. For these cases we'd better introduce a
KDB command "exit_nmi", that will raise the SW IRQ, after which
we're allowed to do anything.
- KGDB/KDB FIQ debugger shell is synchronous. In Google's version
you could have a dedicated shell always running in the FIQ context,
so when you type something on a serial line, you won't actually cause
any debugging actions, FIQ would save the characters in its own
buffer and continue execution normally. But when you hit return key
after the command, then the command is executed.
In KGDB/KDB FIQ debugger it is different. Once you enter KGDB, the
kernel will stop until you instruct it to continue.
This might look like a drastic change, but it is not. There is actually
no difference whether you have sync or async shell, or at least I
couldn't find any use-case where this would matter at all. Anyway,
it is still possible to do async shell in KDB, I just don't see any
need for this.
- Original FIQ debugger used a custom FIQ vector handling code, w/
a lot of logic in it. In this approach I'm using the fact that
FIQs are basically IRQs, except that there are a few more
registers banked, and we can actually trap from the IRQ context.
But this all does not prevent us from using a simple jump-table
based approach as used in the generic ARM entry code. So, here
I just reuse the generic approach.
Note that I test the code on a modelled ARM machine (QEMU Versatile), so
there might be some issues on real HW, but it works in QEMU though. :-)
Assuming you have QEMU >= 1.1.0, you can easily play with the code
using ARM/versatile defconfig and command like this:
qemu-system-arm -nographic -machine versatilepb \
-kernel linux/arch/arm/boot/zImage \
-append "console=ttyAMA0 kgdboc=ttyAMA0 kgdb_fiq.enable=1"
Thanks!
--
arch/arm/Kconfig | 19 +++
arch/arm/common/vic.c | 28 +++++
arch/arm/include/asm/hardware/vic.h | 2 +
arch/arm/include/asm/kgdb.h | 8 ++
arch/arm/kernel/Makefile | 1 +
arch/arm/kernel/entry-armv.S | 169 +------------------------
arch/arm/kernel/entry-header.S | 176 ++++++++++++++++++++++++++-
arch/arm/kernel/kgdb_fiq.c | 141 +++++++++++++++++++++
arch/arm/kernel/kgdb_fiq_entry.S | 76 ++++++++++++
arch/arm/mach-versatile/Makefile | 1 +
arch/arm/mach-versatile/include/mach/irqs.h | 1 +
arch/arm/mach-versatile/kgdb_fiq.c | 31 +++++
include/linux/kgdb.h | 9 ++
kernel/debug/debug_core.c | 12 +-
kernel/debug/kdb/kdb_debugger.c | 4 +
15 files changed, 508 insertions(+), 170 deletions(-)
p.s.
[1] Original Google's FIQ debugger, fiq_* files:
http://android.git.linaro.org/gitweb?p=kernel/common.git;a=tree;f=arch/arm/…
And board support as an example of using it:
http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commitdiff;h=461cb80c1…
pp.s. If anyone is curious, a typical NMI entry looks like this
(I also executed a few commands):
Entering kdb (current=0xc781bd60, pid 1) due to NonMaskable Interrupt @ 0xc01510d0
Pid: 1, comm: swapper
CPU: 0 Not tainted (3.5.0-rc4+ #214)
PC is at __delay+0x0/0xc
LR is at panic+0x180/0x1b0
pc : [<c01510d0>] lr : [<c0286b64>] psr: 20000013
sp : c7823f24 ip : c7823f24 fp : c7823f38
r10: c02f35c4 r9 : 00000000 r8 : c0377988
r7 : 00000320 r6 : 000002bc r5 : 00000040 r4 : 00000000
r3 : c0020f4c r2 : 000002ce r1 : ffffffff r0 : 0000e2e1
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 00093177 Table: 00004000 DAC: 00000017
Backtrace:
[<c00173a4>] (dump_backtrace+0x0/0x10c) from [<c02867f4>] (dump_stack+0x18/0x1c)
r6:0000000f r5:c0361d58 r4:c7823edc
[<c02867dc>] (dump_stack+0x0/0x1c) from [<c001506c>] (show_regs+0x44/0x50)
[<c0015028>] (show_regs+0x0/0x50) from [<c0287474>] (kdb_dumpregs+0x30/0x58)
r4:c0383330
[<c0287444>] (kdb_dumpregs+0x0/0x58) from [<c00606e4>] (kdb_local.isra.5+0x354/0x5ec)
r6:c0385534 r5:c7823edc r4:00000008
[<c0060390>] (kdb_local.isra.5+0x0/0x5ec) from [<c0060a28>] (kdb_main_loop+0xac/0x1bc)
[<c006097c>] (kdb_main_loop+0x0/0x1bc) from [<c0063020>] (kdb_stub+0x2e0/0x3e8)
r8:c0385820 r7:c0364004 r6:c0382cb4 r5:c03857c8 r4:c7823e7c
[<c0062d40>] (kdb_stub+0x0/0x3e8) from [<c0059868>] (kgdb_cpu_enter.constprop.9+0x13c/0x4f8)
[<c005972c>] (kgdb_cpu_enter.constprop.9+0x0/0x4f8) from [<c0059f2c>] (kgdb_handle_exception+0x8c/0xa0)
[<c0059ea0>] (kgdb_handle_exception+0x0/0xa0) from [<c0008614>] (kgdb_fiq_do_handle+0x58/0x7c)
r8:c0377988 r7:c7823f10 r6:ffffffff r5:c7823edc r4:c7822000
[<c00085bc>] (kgdb_fiq_do_handle+0x0/0x7c) from [<c0018df4>] (__fiq_svc+0x34/0x40)
Exception stack(0xc7823edc to 0xc7823f24)
3ec0: 0000e2e1
3ee0: ffffffff 000002ce c0020f4c 00000000 00000040 000002bc 00000320 c0377988
3f00: 00000000 c02f35c4 c7823f38 c7823f24 c7823f24 c0286b64 c01510d0 20000013
3f20: ffffffff
r5:20000013 r4:c01510d0
[<c02869e4>] (panic+0x0/0x1b0) from [<c0334d94>] (mount_block_root+0xe0/0x194)
r3:00000000 r2:00000000 r1:c7823f50 r0:c02f355c
r7:c789a000
[<c0334cb4>] (mount_block_root+0x0/0x194) from [<c0335030>] (mount_root+0xec/0x114)
[<c0334f44>] (mount_root+0x0/0x114) from [<c03351c0>] (prepare_namespace+0x168/0x1bc)
r7:00000013 r6:c0025c0c r5:c0351b24 r4:c0377440
[<c0335058>] (prepare_namespace+0x0/0x1bc) from [<c03349e4>] (kernel_init+0xd0/0xfc)
r5:c0351b24 r4:c0351b24
[<c0334914>] (kernel_init+0x0/0xfc) from [<c0025c0c>] (do_exit+0x0/0x2d8)
r5:c0334914 r4:00000000
more>
kdb> md c01510d0
0xc01510d0 e2500001 8afffffd e1a0f00e e254c001 ..P...........T.
0xc01510e0 9a000033 e11c0004 0a000028 e1510004 3.......(.....Q.
0xc01510f0 e3a03000 3a00000b e16f2f14 e16fcf11 .0.....:./o...o.
0xc0151100 e042200c e3a0c001 e1a0c21c e1a02214 . B.........."..
0xc0151110 e1510002 2183300c 20511002 11b0c0ac ..Q..0.!..Q ....
0xc0151120 e1a020a2 1afffff9 e3510000 e3a02000 . ........Q.. ..
0xc0151130 01500004 31a01000 31a0f00e e3a0c102 ..P....1...1....
0xc0151140 e1b00080 e0b11001 0a000005 31510004 ..............Q1
kdb> bp __delay
Instruction(i) BP #0 at 0xc01510d0 (__delay)
is enabled addr at 00000000c01510d0, hardtype=0 installed=0
kdb> go __delay
Entering kdb (current=0xc781bd60, pid 1) due to Breakpoint @ 0xc01510d0
kdb> bt
Stack traceback for pid 1
0xc781bd60 1 0 1 0 R 0xc781bf1c *swapper
Backtrace:
[<c00173a4>] (dump_backtrace+0x0/0x10c) from [<c0017804>] (show_stack+0x18/0x1c)
r6:0000000f r5:c0361d58 r4:c0383330
[<c00177ec>] (show_stack+0x0/0x1c) from [<c006202c>] (kdb_show_stack+0x78/0x88)
[<c0061fb4>] (kdb_show_stack+0x0/0x88) from [<c00620c0>] (kdb_bt1.isra.0+0x84/0xd8)
r8:00000032 r7:00000000 r6:00000000 r5:ffffffff r4:c781bd60
[<c006203c>] (kdb_bt1.isra.0+0x0/0xd8) from [<c00623b8>] (kdb_bt+0x2a4/0x348)
r7:00000001 r6:00000000 r5:c03857d0 r4:c03856fc
[<c0062114>] (kdb_bt+0x0/0x348) from [<c005fdbc>] (kdb_parse+0x2cc/0x4f4)
r8:00000032 r7:c03856fc r6:c02fa1f8 r5:c0383614 r4:00000009
[<c005faf0>] (kdb_parse+0x0/0x4f4) from [<c0060588>] (kdb_local.isra.5+0x1f8/0x5ec)
[<c0060390>] (kdb_local.isra.5+0x0/0x5ec) from [<c0060a28>] (kdb_main_loop+0xac/0x1bc)
[<c006097c>] (kdb_main_loop+0x0/0x1bc) from [<c0063020>] (kdb_stub+0x2e0/0x3e8)
r8:c0385820 r7:c0364004 r6:c0382cb4 r5:c03857c8 r4:c7823de0
[<c0062d40>] (kdb_stub+0x0/0x3e8) from [<c0059868>] (kgdb_cpu_enter.constprop.9+0x13c/0x4f8)
[<c005972c>] (kgdb_cpu_enter.constprop.9+0x0/0x4f8) from [<c0059f2c>] (kgdb_handle_exception+0x8c/0xa0)
[<c0059ea0>] (kgdb_handle_exception+0x0/0xa0) from [<c0018ae0>] (kgdb_brk_fn+0x20/0x28)
r8:c0377988 r7:00000000 r6:60000093 r5:c01510d0 r4:c7823edc
[<c0018ac0>] (kgdb_brk_fn+0x0/0x28) from [<c00084f0>] (do_undefinstr+0xdc/0x1a8)
[<c0008414>] (do_undefinstr+0x0/0x1a8) from [<c0013e1c>] (__und_svc+0x3c/0x60)
Exception stack(0xc7823edc to 0xc7823f24)
3ec0: 0000e2e1
3ee0: ffffffff 000002ce c0020f4c 00000000 00000040 000002bc 00000320 c0377988
3f00: 00000000 c02f35c4 c7823f38 c7823f24 c7823f24 c0286b64 c01510d0 20000013
3f20: ffffffff
r7:c7823f10 r6:ffffffff r5:20000013 r4:c01510d4
[<c02869e4>] (panic+0x0/0x1b0) from [<c0334d94>] (mount_block_root+0xe0/0x194)
r3:00000000 r2:00000000 r1:c7823f50 r0:c02f355c
r7:c789a000
[<c0334cb4>] (mount_block_root+0x0/0x194) from [<c0335030>] (mount_root+0xec/0x114)
[<c0334f44>] (mount_root+0x0/0x114) from [<c03351c0>] (prepare_namespace+0x168/0x1bc)
r7:00000013 r6:c0025c0c r5:c0351b24 r4:c0377440
[<c0335058>] (prepare_namespace+0x0/0x1bc) from [<c03349e4>] (kernel_init+0xd0/0xfc)
r5:c0351b24 r4:c0351b24
[<c0334914>] (kernel_init+0x0/0xfc) from [<c0025c0c>] (do_exit+0x0/0x2d8)
r5:c0334914 r4:00000000
kdb>
kdb> ps
15 sleeping system daemon (state M) processes suppressed,
use 'ps A' to see all.
Task Addr Pid Parent [*] cpu State Thread Command
0xc781bd60 1 0 1 0 R 0xc781bf1c *swapper
0xc781bd60 1 0 1 0 R 0xc781bf1c *swapper
0xc789dd60 13 2 0 0 R 0xc789df1c kworker/0:1
0xc789d580 16 2 0 0 R 0xc789d73c kworker/u:1
0xc796cd60 23 2 0 0 R 0xc796cf1c deferwq
--
Anton Vorontsov
Email: cbouatmailru(a)gmail.com
Hi all,
Just a few patches left from the series that used to add configurable
ECC size for pstore/ram backend. Most patches were merged into -next,
and this is just a resend of the leftovers.
(Note that pstore/trace patches go on top of this series.)
Thanks,
---
fs/pstore/ram.c | 14 +++++++-------
fs/pstore/ram_core.c | 30 ++++++++++++++----------------
include/linux/pstore_ram.h | 7 ++-----
3 files changed, 23 insertions(+), 28 deletions(-)
--
Anton Vorontsov
Email: cbouatmailru(a)gmail.com
Hi all,
In v3:
- Make traces versioned, as suggested by Steven, Tony and Colin. (The
version tag is stored in the PRZ signature, see the last patch for
the implementation details).
- Add Steven's Ack on the first patch.
In v2:
- Do not introduce a separate 'persistent' tracer, but introduce an
option to the existing 'function' tracer.
Rationale for this patch set:
With this support, the kernel can save the function call chain log into
a persistent RAM buffer that can be decoded and dumped after reboot
through the pstore filesystem. It can be used to determine what function
was last called before a hang or an unexpected reset (caused by, for
example, a buggy driver that abuses HW).
Here's a "nano howto", to get the idea:
# mount -t debugfs debugfs /sys/kernel/debug/
# cd /sys/kernel/debug/tracing
# echo function > current_tracer
# echo 1 > options/func_pstore
# reboot -f
[...]
# mount -t pstore pstore /mnt/
# tail /mnt/ftrace-ramoops
0 ffffffff8101ea64 ffffffff8101bcda native_apic_mem_read <- disconnect_bsp_APIC+0x6a/0xc0
0 ffffffff8101ea44 ffffffff8101bcf6 native_apic_mem_write <- disconnect_bsp_APIC+0x86/0xc0
0 ffffffff81020084 ffffffff8101a4b5 hpet_disable <- native_machine_shutdown+0x75/0x90
0 ffffffff81005f94 ffffffff8101a4bb iommu_shutdown_noop <- native_machine_shutdown+0x7b/0x90
0 ffffffff8101a6a1 ffffffff8101a437 native_machine_emergency_restart <- native_machine_restart+0x37/0x40
0 ffffffff811f9876 ffffffff8101a73a acpi_reboot <- native_machine_emergency_restart+0xaa/0x1e0
0 ffffffff8101a514 ffffffff8101a772 mach_reboot_fixups <- native_machine_emergency_restart+0xe2/0x1e0
0 ffffffff811d9c54 ffffffff8101a7a0 __const_udelay <- native_machine_emergency_restart+0x110/0x1e0
0 ffffffff811d9c34 ffffffff811d9c80 __delay <- __const_udelay+0x30/0x40
0 ffffffff811d9d14 ffffffff811d9c3f delay_tsc <- __delay+0xf/0x20
Most of the code comes from the trace_persistent.c driver found in the
Android git tree, written by Colin Cross <ccross(a)android.com>
(according to sign-off history). I reworked the driver a little bit,
and ported it to the pstore subsystem.
--
Documentation/ramoops.txt | 25 +++++++++
fs/pstore/Kconfig | 13 +++++
fs/pstore/Makefile | 1 +
fs/pstore/ftrace.c | 35 +++++++++++++
fs/pstore/inode.c | 111 ++++++++++++++++++++++++++++++++++++++--
fs/pstore/internal.h | 43 ++++++++++++++++
fs/pstore/platform.c | 12 ++++-
fs/pstore/ram.c | 65 +++++++++++++++++------
fs/pstore/ram_core.c | 12 +++--
include/linux/pstore.h | 13 +++++
include/linux/pstore_ram.h | 3 +-
kernel/trace/trace.c | 7 +--
kernel/trace/trace_functions.c | 25 +++++++--
13 files changed, 330 insertions(+), 35 deletions(-)
--
Anton Vorontsov
Email: cbouatmailru(a)gmail.com