Hi Folks,
Just a few notes that might be useful to someone else.
I mentioned before that we'll want to see ACPI_INITRD_TABLE_OVERRIDE support on arm64 in due course for use during bringup of new systems. It is commonly used on x86 to provide handy updated ACPI tables via a prepended initrd cpio that is attached to the regular initrd image, which is infinitely better than respinning firmware each time you want to provide updated tables (yes yes, we all know the "devicetree" directive in GRUB does this with DT and that's great and wonderful). I know that someone is working on it, but I had a need for an interim solution over the weekend for some hardware I am bringing up here.
The problem is that there's no mapping for the pages containing the ramdisk data prior to paging_init. x86 does a bit of hoop jumping (and I think they have certain assumptions about linear mapping too) and splits the initrd setup into a couple of different phases, which we may well need to do also if we want to be able to call something like:
#if defined(CONFIG_ACPI) && defined(CONFIG_BLK_DEV_INITRD)
	acpi_initrd_override((void *)initrd_start, initrd_end - initrd_start);
#endif
prior to acpi_boot_table_init in arch/arm64/kernel/setup.c. That routine will correctly map memory for data it copies out of the cpio, but it assumes that the initrd is already mapped. For the very nasty hack I wanted over the weekend, I added some extra fixmaps and then did an early_memremap on the physical address of the first page of the initrd (the cpio is tiny and is always prepended, and this is very nasty). Viz:
[ 0.000000] JCM: using fixmap for first page of ramdisk...
[ 0.000000] JCM: first physical page of ramdisk: [mem REDACTED]
[ 0.000000] JCM: first virtual page of ramdisk: [mem REDACTED]
[ 0.000000] XXXX ACPI table found in initrd [kernel/firmware/acpi/xxxx.aml][0xxxx]
[ 0.000000] XXXX ACPI table found in initrd [kernel/firmware/acpi/xxxx.aml][0xxx]
[ 0.000000] XXXX ACPI table found in initrd [kernel/firmware/acpi/xxxx.aml][0xxxx]
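For readers unfamiliar with the prepended archive: the override cpio is in the "newc" format, whose 110-byte ASCII header sits at the very start of the first initrd page, which is why mapping a single page is enough for a tiny archive. Below is a minimal userspace sketch of reading that header. The names cpio_first_entry and parse_hex8 are hypothetical helpers for illustration, not kernel functions, and this is not the code used in the hack above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CPIO_NEWC_HDR_LEN 110	/* 6-byte magic + 13 fields of 8 hex digits */

/* Parse one 8-digit ASCII hex field; returns -1 on a non-hex character. */
static int parse_hex8(const char *p, uint32_t *out)
{
	uint32_t v = 0;

	for (int i = 0; i < 8; i++) {
		char c = p[i];
		uint32_t d;

		if (c >= '0' && c <= '9')
			d = (uint32_t)(c - '0');
		else if (c >= 'a' && c <= 'f')
			d = (uint32_t)(c - 'a' + 10);
		else if (c >= 'A' && c <= 'F')
			d = (uint32_t)(c - 'A' + 10);
		else
			return -1;
		v = (v << 4) | d;
	}
	*out = v;
	return 0;
}

/*
 * Hypothetical helper: look at the first newc entry in buf and report
 * where its name starts, the name size (including the trailing NUL)
 * and the file size. Returns 0 on success, -1 on a malformed header.
 */
int cpio_first_entry(const char *buf, size_t len, const char **name,
		     uint32_t *namesize, uint32_t *filesize)
{
	if (len < CPIO_NEWC_HDR_LEN || memcmp(buf, "070701", 6) != 0)
		return -1;
	/* Field 6 is c_filesize, field 11 is c_namesize. */
	if (parse_hex8(buf + 6 + 6 * 8, filesize) ||
	    parse_hex8(buf + 6 + 11 * 8, namesize))
		return -1;
	if (*namesize == 0 || *namesize > len - CPIO_NEWC_HDR_LEN)
		return -1;
	*name = buf + CPIO_NEWC_HDR_LEN;
	return 0;
}
```

A caller would compare the returned name against the kernel/firmware/acpi/ prefix seen in the log above before trusting the entry.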
So maybe I'll get what I want working, but a heads-up to whoever from Linaro ends up looking into the broader implementation: I think you'll need to rework initrd mapping as part of that. Unless of course I'm missing something. Which is more than possible.
Jon.
Hi,
On Sun, Dec 13, 2015 at 03:47:12AM -0500, Jon Masters wrote:
Hi Folks,
Just a few notes that might be useful to someone else.
I mentioned before that we'll want to see ACPI_INITRD_TABLE_OVERRIDE support on arm64 in due course for use during bringup of new systems. It is commonly used on x86 to provide handy updated ACPI tables via a prepended initrd cpio that is attached to the regular initrd image, which is infinitely better than respinning firmware each time you want to provide updated tables (yes yes, we all know the "devicetree" directive in GRUB does this with DT and that's great and wonderful). I know that someone is working on it, but I had a need for an interim solution over the weekend for some hardware I am bringing up here.
Which table(s) are you trying to override?
It might be possible to have an EFI application prior to the kernel which overrides the relevant table, or you may be able to override the tables in the EFI stub. From the PoV of Linux proper, things would then be the same as a normal boot -- nothing special to be done.
If you're able to do this in a separate application prior to the stub (or even prior to the main bootloader), that same override should work for any OS.
The problem is that there's no mapping for the pages containing the ramdisk data prior to paging_init.
Why can't we move paging_init earlier? Which information do we need prior to paging_init?
The only things I can think of are the SLIT and SRAT. Are you trying to override those?
x86 does a bit of hoop jumping (and I think they have certain assumptions about linear mapping too) and splits the initrd setup into a couple of different phases, which we may well need to do also if we want to be able to call something like:
#if defined(CONFIG_ACPI) && defined(CONFIG_BLK_DEV_INITRD)
	acpi_initrd_override((void *)initrd_start, initrd_end - initrd_start);
#endif
prior to acpi_boot_table_init in arch/arm64/kernel/setup.c. That routine will correctly map memory for data it copies out of the cpio, but it assumes that the initrd is already mapped. For the very nasty hack I wanted over the weekend, I added some extra fixmaps and then did an early_memremap on the physical address of the first page of the initrd (the cpio is tiny and is always prepended, and this is very nasty). Viz:
[ 0.000000] JCM: using fixmap for first page of ramdisk...
[ 0.000000] JCM: first physical page of ramdisk: [mem REDACTED]
[ 0.000000] JCM: first virtual page of ramdisk: [mem REDACTED]
[ 0.000000] XXXX ACPI table found in initrd [kernel/firmware/acpi/xxxx.aml][0xxxx]
[ 0.000000] XXXX ACPI table found in initrd [kernel/firmware/acpi/xxxx.aml][0xxx]
[ 0.000000] XXXX ACPI table found in initrd [kernel/firmware/acpi/xxxx.aml][0xxxx]
So maybe I'll get what I want working, but a heads-up to whoever from Linaro ends up looking into the broader implementation: I think you'll need to rework initrd mapping as part of that. Unless of course I'm missing something. Which is more than possible.
If we really need to map the initrd this early, one option would be to make it possible to perform arbitrarily large memremaps early on, following the approach I posted for the linear mapping [1]. That would require some more invasive rework, but we'd be able to reuse the logic for other things we want to map early on.
Thanks, Mark.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/392292.h...
On 7 December 2015 at 06:05, Mark Rutland mark.rutland@arm.com wrote:
It might be possible to have an EFI application prior to the kernel which override the relevant table, or you may be able to override the tables in the EFI stub. From the PoV of Linux proper, things would then be the same as a normal boot -- nothing special to be done.
If you're able to do this in a separate application prior to the stub (or even prior to the main bootloader), that same override should work for any OS.
grub2 already has an acpi module which should allow this.
Graeme
On Sun, Dec 13, 2015 at 03:47:12AM -0500, Jon Masters wrote:
I mentioned before that we'll want to see ACPI_INITRD_TABLE_OVERRIDE support on arm64 in due course for use during bringup of new systems. It is commonly used on x86 to provide handy updated ACPI tables via a prepended initrd cpio that is attached to the regular initrd image, which is infinitely better than respinning firmware each time you want to provide updated tables (yes yes, we all know the "devicetree" directive in GRUB does this with DT and that's great and wonderful). I know that someone is working on it, but I had a need for an interim solution over the weekend for some hardware I am bringing up here.
Why would we do this through an initrd? That sounds an awful lot like the per-kernel-image dtbs that are the whole thing we're trying to avoid.
Implementing this as a grub command (or kernel command line) for debug purposes would be trivial - and I'm pretty sure Graeme/Al already did this in the past.
/ Leif
Just thinking from a parity point of view - if you can do it on x86, it should be doable on ARM. But that GRUB module approach I quite like! Let's get it documented prominently in the kernel doc as the alternative to updating an initrd then :)
(In our case we would only turn this on in -debug RHEL kernels so as to keep it very clear this is never for production systems, can't trust folks not to hack things if you give them half a chance)
Jon.
On 12/13/2015 04:36 PM, Jon Masters wrote:
Just thinking from a parity point of view - if you can do it on x86, it should be doable on ARM. But that GRUB module approach I quite like!
Sorry for top post earlier. Was on my phone. In any case, it looks like the "acpi" command in GRUB currently does an all-or-nothing replace of all of the tables, not just a named table. We need to be able to override e.g. just a DSDT or SSDT with a replacement test one.
Jon.
On 13 December 2015 at 22:02, Jon Masters jcm@redhat.com wrote:
On 12/13/2015 04:36 PM, Jon Masters wrote:
Just thinking from a parity point of view - if you can do it on x86, it should be doable on ARM. But that GRUB module approach I quite like!
Sorry for top post earlier. Was on my phone. In any case, it looks like the "acpi" command in GRUB currently does an all-or-nothing replace of all of the tables, not just a named table. We need to be able to override e.g. just a DSDT or SSDT with a replacement test one.
I think that's just clumsily worded documentation. The code certainly looks like you can replace them individually.
I think it's trying to say that in the case where RSDP, XSDT and RSDT are in ROM, it will create new ones to point to the new tables you've just imported.
I had planned to try it this week anyway.
Graeme
On 12/13/2015 05:15 PM, G Gregory wrote:
I think that's just clumsily worded documentation. The code certainly looks like you can replace them individually.
Yea, it does. It's unfortunate that the code is more "obvious" than the docs, but hardly the first time ;)
I think it's trying to say that in the case where RSDP, XSDT and RSDT are in ROM, it will create new ones to point to the new tables you've just imported.
Yea. I think it actually ends up creating new tables in any case, then copies over the bits that change. It'll actually need to do that anyway because it'll want to update the pointers to new tables. I'll try it. But I also want to see the initrd approach working in the not too distant future - everything someone can do on an x86 system should translate 1:1 in terms of experience. It'll help with bringup of newer platforms such as the one I have that needs some NUMA related tweaks.
I had planned to try it this week anyway.
Great :)
Jon.
On 13 December 2015 at 22:19, Jon Masters jcm@redhat.com wrote:
I had planned to try it this week anyway.
Great :)
Jon.
Tried it, results were not great :-(
set root='hd0,gpt2'
insmod acpi
acpi /DSDT.aml
linux /Image-leg earlycon=pl011,0xe1010000 console=ttyAMA0 acpi=force root=/dev/sda2
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 4.4.0-rc1-00058-g5e47260-dirty (graeme@linaro-seattle.xora.org.uk) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #172 SMP PREEMPT Tue Dec 1 17:01:23 GMT 2015
[ 0.000000] Boot CPU: AArch64 Processor [411fd072]
[ 0.000000] earlycon: Early serial console at MMIO 0xe1010000 (options '')
[ 0.000000] bootconsole [uart0] enabled
[ 0.000000] Bad mode in Error handler detected, code 0xbf000000 -- SError
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-rc1-00058-g5e47260-dirty #172
[ 0.000000] Hardware name: AMD Seattle (RevB) Development Board (Overdrive) (DT)
[ 0.000000] task: fffffe0000dc7500 ti: fffffe0000d90000 task.ti: fffffe0000d90000
[ 0.000000] PC is at setup_arch+0xfc/0x564
[ 0.000000] LR is at setup_arch+0xf8/0x564
[ 0.000000] pc : [<fffffe0000cb35e8>] lr : [<fffffe0000cb35e4>] pstate: 000002c5
[ 0.000000] sp : fffffe0000d93f00
[ 0.000000] x29: fffffe0000d93f00 x28: 00000083f0fa22c0
[ 0.000000] x27: fffffe0000081198 x26: 0000008002040000
[ 0.000000] x25: 0000008002010000 x24: 0000008001000000
[ 0.000000] x23: fffffe0000dc0000 x22: 00000083f0f15228
[ 0.000000] x21: fffffe0000080000 x20: fffffe0000cffc00
[ 0.000000] x19: fffffdfffa800000 x18: fffffe00008d0bf0
[ 0.000000] x17: 000000000000000e x16: 0000000000000007
[ 0.000000] x15: 0000000000000001 x14: 0ffffffffffffffe
[ 0.000000] x13: 0000000000000020 x12: 0000000000000038
[ 0.000000] x11: 0000000000000007 x10: 0101010101010101
[ 0.000000] x9 : fffffffffffffffd x8 : 0000000000000008
[ 0.000000] x7 : 0000000000000005 x6 : 0000000000000080
[ 0.000000] x5 : 000000000000005f x4 : 0000000000000072
[ 0.000000] x3 : 0000000000000063 x2 : 0000000000000072
[ 0.000000] x1 : 0000000000000000 x0 : 0000000000000001
[ 0.000000]
[ 0.000000] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-rc1-00058-g5e47260-dirty #172
[ 0.000000] Hardware name: AMD Seattle (RevB) Development Board (Overdrive) (DT)
[ 0.000000] task: fffffe0000dc7500 ti: fffffe0000d90000 task.ti: fffffe0000d90000
[ 0.000000] PC is at setup_arch+0xfc/0x564
[ 0.000000] LR is at setup_arch+0xf8/0x564
[ 0.000000] pc : [<fffffe0000cb35e8>] lr : [<fffffe0000cb35e4>] pstate: 000002c5
[ 0.000000] sp : fffffe0000d93f00
[ 0.000000] x29: fffffe0000d93f00 x28: 00000083f0fa22c0
[ 0.000000] x27: fffffe0000081198 x26: 0000008002040000
[ 0.000000] x25: 0000008002010000 x24: 0000008001000000
[ 0.000000] x23: fffffe0000dc0000 x22: 00000083f0f15228
[ 0.000000] x21: fffffe0000080000 x20: fffffe0000cffc00
[ 0.000000] x19: fffffdfffa800000 x18: fffffe00008d0bf0
[ 0.000000] x17: 000000000000000e x16: 0000000000000007
[ 0.000000] x15: 0000000000000001 x14: 0ffffffffffffffe
[ 0.000000] x13: 0000000000000020 x12: 0000000000000038
[ 0.000000] x11: 0000000000000007 x10: 0101010101010101
[ 0.000000] x9 : fffffffffffffffd x8 : 0000000000000008
[ 0.000000] x7 : 0000000000000005 x6 : 0000000000000080
[ 0.000000] x5 : 000000000000005f x4 : 0000000000000072
[ 0.000000] x3 : 0000000000000063 x2 : 0000000000000072
[ 0.000000] x1 : 0000000000000000 x0 : 0000000000000001
[ 0.000000]
[ 0.000000] Process swapper (pid: 0, stack limit = 0xfffffe0000d90020)
[ 0.000000] Stack: (0xfffffe0000d93f00 to 0xfffffe0000d94000)
[ 0.000000] 3f00: fffffe0000d93fa0 fffffe0000cb06a0 0000000000000001 fffffe0000cffc00
[ 0.000000] 3f20: 000000801fe00000 00000083f0f15228 fffffe0000dc0000 0000008001000000
[ 0.000000] 3f40: 0000008002010000 0000008002040000 fffffe0000081198 00000000ffffffc8
[ 0.000000] 3f60: 00000083f0f155a0 fffffe0000860080 0000000000000001 000000801fe00000
[ 0.000000] 3f80: ffffffffffffffff 8080808080800000 0000808080808080 fefefefefeff736d
[ 0.000000] 3fa0: 0000000000000000 000000800184d000 00000083f0f155a0 0000000000000e12
[ 0.000000] 3fc0: 000000801fe00000 00000083f0f15228 00000083f0f1523d 0000008001000000
[ 0.000000] 3fe0: 0000000000000000 fffffe0000d00428 0000000000000000 0000000000000000
[ 0.000000] Call trace:
[ 0.000000] [<fffffe0000cb35e8>] setup_arch+0xfc/0x564
[ 0.000000] [<fffffe0000cb06a0>] start_kernel+0xd4/0x400
[ 0.000000] [<000000800184d000>] 0x800184d000
[ 0.000000] Code: 91318000 94002b79 97fff3e7 d50344ff (94000760)
[ 0.000000] ---[ end trace f24b6c88ae00fa9a ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task!
On 12/15/2015 06:19 AM, G Gregory wrote:
<below refers to the GRUB-based approach not the initrd override>
Tried it, results were not great :-(
set root='hd0,gpt2'
insmod acpi
acpi /DSDT.aml
linux /Image-leg earlycon=pl011,0xe1010000 console=ttyAMA0 acpi=force root=/dev/sda2
<snip>
[ 0.000000] PC is at setup_arch+0xfc/0x564
[ 0.000000] LR is at setup_arch+0xf8/0x564
<snip>
So we unmask SError in setup_arch, which now happens late enough that it'll come out of the UART where you can see it. However, there are still occasions on some of these early platforms where an unhandled SError can be pending as GRUB exits. I have seen that a number of times on Seattle if there's a pending error from one of the IO IP blocks on the SoC. You might need a firmware update, but can you also confirm that this happened reproducibly?
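As a cross-check of the above, the saved pstate in Graeme's oops (000002c5) can be decoded by hand. This is a small userspace sketch, not kernel code; the bit values match the AArch64 SPSR layout (D, A, I, F at bits 9 to 6, mode in the low four bits), while struct pstate_masks and decode_pstate are hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* AArch64 SPSR/PSTATE exception mask bits and mode field. */
#define PSR_F_BIT	0x040u	/* FIQ masked */
#define PSR_I_BIT	0x080u	/* IRQ masked */
#define PSR_A_BIT	0x100u	/* SError (asynchronous abort) masked */
#define PSR_D_BIT	0x200u	/* debug exceptions masked */
#define PSR_MODE_MASK	0x00fu
#define PSR_MODE_EL1h	0x005u

/* Hypothetical container for the decoded fields. */
struct pstate_masks {
	bool d, a, i, f;
	unsigned int mode;
};

/* Split a saved pstate value into its mask bits and mode. */
struct pstate_masks decode_pstate(uint64_t pstate)
{
	struct pstate_masks m;

	m.d = (pstate & PSR_D_BIT) != 0;
	m.a = (pstate & PSR_A_BIT) != 0;
	m.i = (pstate & PSR_I_BIT) != 0;
	m.f = (pstate & PSR_F_BIT) != 0;
	m.mode = (unsigned int)(pstate & PSR_MODE_MASK);
	return m;
}
```

For 0x2c5 this gives D, I and F masked, A clear, and mode EL1h, i.e. SError had just been unmasked when the exception was taken, which fits a pending SError being delivered at that point in setup_arch.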
Jon.
On 15 December 2015 at 15:36, Jon Masters jcm@redhat.com wrote:
So we unmask SError in setup_arch, which now happens late enough that it'll come out of the UART where you can see it. However, there are still occasions on some of these early platforms where an unhandled SError can be pending as GRUB exits. I have seen that a number of times on Seattle if there's a pending error from one of the IO IP blocks on the SoC. You might need a firmware update, but can you also confirm that this happened reproducibly?
It is Seattle RevB I am using and it is repeatable. I have pre-release firmware on there!
Graeme
On 15 December 2015 at 16:13, G Gregory graeme.gregory@linaro.org wrote:
It is Seattle RevB I am using and it is repeatable. I have pre-release firmware on there!
Repeatable on ROD0084E as well.
Graeme
On 12/15/2015 11:28 AM, G Gregory wrote:
It is Seattle RevB I am using and it is repeatable. I have pre-release firmware on there!
Repeatable on ROD0084E as well.
I'll try it on a couple of other platforms later this week. I've pondered before whether GRUB should unmask SError and report this prior to entering the kernel, because today Linux always gets the blame ;)
Jon.
On 15 December 2015 at 16:31, Jon Masters jcm@redhat.com wrote:
I'll try it on a couple of other platforms later this week. I've pondered before whether GRUB should unmask SError and report this prior to entering the kernel, because today Linux always gets the blame ;)
Well IMO it almost certainly should not pass control of a "broken" machine. But I do not know anything about SError.
Graeme
On 15 December 2015 at 17:08, G Gregory graeme.gregory@linaro.org wrote:
Well IMO it almost certainly should not pass control of a "broken" machine. But I do not know anything about SError.
Tested on QEMU and command works as expected
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x00000000B69F732A 000024 (v02 BOCHS )
[ 0.000000] ACPI: XSDT 0x00000000B69F72DE 00004C (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.000000] ACPI: SPCR 0x00000000B69F6FBA 000050 (v02 BOCHS BXPCSPCR 00000001 BXPC 00000001)
[ 0.000000] ACPI: MCFG 0x00000000B69F700A 00003C (v01 BOCHS BXPCMCFG 00000001 BXPC 00000001)
[ 0.000000] ACPI: GTDT 0x00000000B69F7046 000060 (v02 BOCHS BXPCGTDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: APIC 0x00000000B69F70A6 0000F4 (v03 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACP 0x00000000B69F719A 00010C (v05 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.000000] ACPI: DSDT 0x00000000B69F6000 000FBA (v02 XORAS XORAXORA 00000001 INTL 20140926)
Can see DSDT is not the one generated from QEMU
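The fields the kernel prints on each "ACPI:" line come straight out of the standard 36-byte table header, which is how the replaced DSDT is recognisable by its OEM ID and table ID. A rough userspace sketch of pulling those fields out of a raw table follows. The layout is the standard ACPI one, but struct acpi_hdr_info, get_le32 and acpi_decode_header are hypothetical names for illustration, not kernel or ACPICA API.

```c
#include <stdint.h>
#include <string.h>

#define ACPI_HDR_LEN 36	/* size of the common ACPI table header */

/* Hypothetical decoded view of the header, strings NUL-terminated. */
struct acpi_hdr_info {
	char signature[5];
	uint32_t length;
	uint8_t revision;
	char oem_id[7];
	char oem_table_id[9];
	uint32_t oem_revision;
	char creator_id[5];
	uint32_t creator_revision;
};

/* Read a little-endian 32-bit value one byte at a time. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the header found in the first ACPI_HDR_LEN bytes of t. */
void acpi_decode_header(const uint8_t *t, struct acpi_hdr_info *out)
{
	memset(out, 0, sizeof(*out));
	memcpy(out->signature, t, 4);		/* offset 0: signature */
	out->length = get_le32(t + 4);		/* offset 4: total length */
	out->revision = t[8];			/* offset 8: revision */
	/* offset 9 is the checksum byte, not decoded here */
	memcpy(out->oem_id, t + 10, 6);		/* offset 10: OEM ID */
	memcpy(out->oem_table_id, t + 16, 8);	/* offset 16: OEM table ID */
	out->oem_revision = get_le32(t + 24);
	memcpy(out->creator_id, t + 28, 4);	/* offset 28: creator ID */
	out->creator_revision = get_le32(t + 32);
}
```

Reading the integers byte by byte means the sketch does not care how the table happens to be aligned in memory.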
Graeme
On 12/13/2015 05:02 PM, Jon Masters wrote:
On 12/13/2015 04:36 PM, Jon Masters wrote:
Just thinking from a parity point of view - if you can do it on x86, it should be doable on ARM. But that GRUB module approach I quite like!
Sorry for top post earlier. Was on my phone. In any case, it looks like the "acpi" command in GRUB currently does an all-or-nothing replace of all of the tables, not just a named table. We need to be able to override e.g. just a DSDT or SSDT with a replacement test one.
I rescind that. Looking at the GRUB source clarifies things. It will copy all host tables, then update the copied version if there are additional tables with changes and recalculate checksums/pointers in the XSDT etc. I seem to recall having looked at this before a couple of years ago or something and it might have grown some of the 64-bit XSDT logic in that time. Either way, it does seem to do what I wanted. I will follow up. I'll probably still use the initrd approach for now.
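On the "recalculate checksums" part: an ACPI table is valid when all of its bytes sum to zero modulo 256, with the checksum byte at offset 9 of the header, so anything that rewrites pointers in a copied XSDT has to redo that byte. A minimal sketch of the arithmetic is below. It is not taken from the GRUB source, and acpi_byte_sum and acpi_fix_checksum are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define ACPI_HDR_LEN		36	/* common table header size */
#define ACPI_CHECKSUM_OFFSET	9	/* checksum byte within the header */

/* Sum every byte of the table modulo 256; a valid table yields 0. */
uint8_t acpi_byte_sum(const uint8_t *table, size_t len)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (uint8_t)(sum + table[i]);
	return sum;
}

/*
 * Recompute the checksum byte after the table contents were changed.
 * Returns 0 on success, -1 if the buffer is too short to hold a header.
 */
int acpi_fix_checksum(uint8_t *table, size_t len)
{
	if (len < ACPI_HDR_LEN)
		return -1;
	table[ACPI_CHECKSUM_OFFSET] = 0;
	table[ACPI_CHECKSUM_OFFSET] = (uint8_t)(0u - acpi_byte_sum(table, len));
	return 0;
}
```

After updating, say, an entry in a copied XSDT, calling acpi_fix_checksum on that table makes acpi_byte_sum return 0 again.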
Jon.
On 12/13/2015 05:15 PM, Jon Masters wrote:
I rescind that. Looking at the GRUB source clarifies things. It will copy all host tables, then update the copied version if there are additional tables with changes and recalculate checksums/pointers in the XSDT etc. I seem to recall having looked at this before a couple of years ago or something and it might have grown some of the 64-bit XSDT logic in that time. Either way, it does seem to do what I wanted. I will follow up. I'll probably still use the initrd approach for now.
Addendum. I was able to get the initrd based ACPI table override working...HOWEVER...
*** I had to implement a custom memcpy routine to do so ***
Per some internal suggestions, I tried moving map_mem earlier (prior to doing boot time ACPI table parsing, in order to avoid needing to abuse fixmaps to touch the initrd contents), which works for reading the ramdisk cpio content (the kernel code still uses a fixmap in the initrd override driver code to map the newly created tables in memory).
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses. The in-kernel memcpy routine only checks the alignment of the source pointer when it begins copying, and doesn't handle the case that the natural alignment differs between source and destination. Therefore, the kernel rolls over and plays dead unless I provide a hacked up jcm_memcpy that just does byte copies. Then everything "works".
Now sure, the official version of the initrd override code is going to work differently from my hack I'm never posting that is for bringup of a specific system, and it will all nicely be done by Linaro and so on, but per my other thread, I am also not convinced that the kernel's "optimized" memcpy library routine is behaving correctly in the case that alignment differs. Surely it should handle the case that the alignment is off on either side of the source and destination.
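To make the alignment point concrete: the jcm_memcpy patch itself is not reproduced in this thread, so what follows is only a sketch of the general byte-copy idea, with hypothetical names (bytewise_memcpy, same_alignment_offset). Copying one byte at a time through volatile pointers keeps every access naturally aligned whatever the relative alignment of source and destination, at an obvious cost in speed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True when a and b sit at the same offset within an align-byte unit,
 * i.e. a word-at-a-time copy could keep both sides aligned at once.
 */
bool same_alignment_offset(const void *a, const void *b, size_t align)
{
	return ((uintptr_t)a % align) == ((uintptr_t)b % align);
}

/*
 * Copy n bytes one at a time. The volatile qualifiers stop the
 * compiler from merging the accesses into wider loads and stores or
 * replacing the loop with a call to the optimised memcpy.
 */
void *bytewise_memcpy(void *dst, const void *src, size_t n)
{
	volatile uint8_t *d = dst;
	const volatile uint8_t *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}
```

When same_alignment_offset() is false for the two buffers, no amount of head alignment on one side will fix the other, which is the case where a byte (or otherwise narrower) copy is the simple way out for a Device mapping.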
Once I figured out what in the heck was going on with memcpy nonsense:
SRAT ACPI table found in initrd [kernel/firmware/acpi/srat.aml][REDACT]
SLIT ACPI table found in initrd [kernel/firmware/acpi/slit.aml][REDACT]
SSDT ACPI table found in initrd [kernel/firmware/acpi/ssdt1.aml][REDACT]
ACPI: Early table checksum verification disabled
ACPI: RSDP REDACTED_ADDRESS REDACT (REDACTED)
ACPI: XSDT REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: FACP REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: DSDT REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: APIC REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: SSDT REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: Override [SSDT- REDACTED], this is unsafe: tainting kernel
Needless to say rewriting the kernel memcpy library was slightly more than I planned on doing to get a simple initrd override to work...
Jon.
Obviously never for use ever. But just so you have the patch that I am using here tonight. It might help someone to ponder a real solution.
Jon.
On 12/14/2015 11:51 PM, Jon Masters wrote:
On 12/13/2015 05:15 PM, Jon Masters wrote:
On 12/13/2015 05:02 PM, Jon Masters wrote:
On 12/13/2015 04:36 PM, Jon Masters wrote:
Just thinking from a parity point of view - if you can do it on x86, it should be doable on ARM. But that GRUB module approach I quite like!
Sorry for top post earlier. Was on my phone. In any case, it looks like the "acpi" command in GRUB currently does an all-or-nothing replace of all of the tables, not just a named table. We need to be able to override e.g. just an DSDT or SSDT with a replacement test one.
I rescind that. Looking at the GRUB source clarifies things. It will copy all host tables, then update the copied version if there are additional tables with changes and recalculate checksums/pointers in the XSDT etc. I seem to recall having looked at this before a couple years ago or something and it might have grown some of the 64-bit XSDT logic in that time. Either way, it does seem to do what I wanted. I will followup. I'll probably still use the initrd approach for now.
Addendum. I was able to get the initrd based ACPI table override working...HOWEVER...
*** I had to implement a custom memcpy routine to do so ***
Per some internal suggestions, I tried moving map_mem earlier (prior to doing boot time ACPI table parsing, in order to avoid needing to abuse fixmaps to touch the initrd contents), which works. For reading the ramdisk cpio content (the kernel code still uses a fixmap in the initrd override driver code to map the newly created tables in memory).
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to missaligned accesses. The in-kernel memcpy routine only checks the alignment of the source pointed when it begins copying, and doesn't handle the case that the natural alignment differs between source and destination. Therefore, the kernel rolls over and plays dead unless I provide a hacked up jcm_memcpy that just does byte copies. Then everything "works".
Now sure, the official version of the initrd override code is going to work differently from my hack (which is for bringup of a specific system, and which I'm never posting), and it will all nicely be done by Linaro and so on, but per my other thread, I am also not convinced that the kernel's "optimized" memcpy library routine is behaving correctly in the case that alignment differs. Surely it should handle the case that the alignment is off on either the source or the destination side.
Once I figured out what in the heck was going on with memcpy nonsense:
SRAT ACPI table found in initrd [kernel/firmware/acpi/srat.aml][REDACT]
SLIT ACPI table found in initrd [kernel/firmware/acpi/slit.aml][REDACT]
SSDT ACPI table found in initrd [kernel/firmware/acpi/ssdt1.aml][REDACT]
ACPI: Early table checksum verification disabled
ACPI: RSDP REDACTED_ADDRESS REDACT (REDACTED)
ACPI: XSDT REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: FACP REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: DSDT REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: APIC REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: SSDT REDACTED_ADDRESS REDACT (REDACTED OEM)
ACPI: Override [SSDT- REDACTED], this is unsafe: tainting kernel
Needless to say rewriting the kernel memcpy library was slightly more than I planned on doing to get a simple initrd override to work...
Jon.
On Mon, Dec 14, 2015 at 11:51:09PM -0500, Jon Masters wrote:
Addendum. I was able to get the initrd based ACPI table override working...HOWEVER...
*** I had to implement a custom memcpy routine to do so ***
Per some internal suggestions, I tried moving map_mem earlier (prior to doing boot time ACPI table parsing, in order to avoid needing to abuse fixmaps to touch the initrd contents), which works for reading the ramdisk cpio content (the kernel code still uses a fixmap in the initrd override driver code to map the newly created tables in memory).
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses.
Why is it mapped as device memory?
If that is a legal thing, shouldn't drivers/acpi/osl.c be using memcpy_fromio(), which (on ARM*) does the alignment fixups if required, without going full-bytewise?
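As a rough userspace illustration of "alignment fixups without going full-bytewise": the sketch below copies single bytes only until the source is 8-byte aligned, then moves 8-byte chunks, then finishes the tail. It is not the kernel's memcpy_fromio() implementation; copy_src_aligned is a hypothetical name, and a real in-kernel version would go through the I/O accessors rather than plain loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace sketch only (hypothetical helper, not a kernel function):
 * byte copies for the unaligned head and tail, 8-byte chunks in between.
 */
void *copy_src_aligned(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(dst != NULL && src != NULL);

	/* Head: single bytes until the source pointer is 8-byte aligned. */
	while (n && ((uintptr_t)s & 7)) {
		*d++ = *s++;
		n--;
	}

	/* Body: one 8-byte chunk at a time; the source is aligned here. */
	while (n >= 8) {
		uint64_t chunk;

		memcpy(&chunk, s, sizeof(chunk));	/* read from aligned source */
		memcpy(d, &chunk, sizeof(chunk));	/* destination may be unaligned */
		s += 8;
		d += 8;
		n -= 8;
	}

	/* Tail: fewer than 8 bytes remain, copy them one by one. */
	while (n--)
		*d++ = *s++;

	return dst;
}
```

Note that only the source side is aligned here; the destination is written with whatever alignment it happens to have.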
/ Leif
On Mon, Dec 14, 2015 at 11:51:09PM -0500, Jon Masters wrote:
Addendum. I was able to get the initrd based ACPI table override working...HOWEVER...
*** I had to implement a custom memcpy routine to do so ***
Per some internal suggestions, I tried moving map_mem earlier (prior to doing boot time ACPI table parsing, in order to avoid needing to abuse fixmaps to touch the initrd contents), which works for reading the ramdisk cpio content (the kernel code still uses a fixmap in the initrd override driver code to map the newly created tables in memory).
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses. The in-kernel memcpy routine only checks the alignment of the source pointer when it begins copying, and doesn't handle the case that the natural alignment differs between source and destination. Therefore, the kernel rolls over and plays dead unless I provide a hacked up jcm_memcpy that just does byte copies. Then everything "works".
You don't need a custom memcpy. All you need to do is use early_memremap, as we do for relocate_initrd, to get a Normal Cacheable mapping. See:
setup_arch()
  relocate_initrd()
    copy_from_early_mem()
      early_memremap()
      memcpy()
      early_memunmap()
Mark.
Jon Masters <jcm@redhat.com> writes:
On 12/13/2015 05:15 PM, Jon Masters wrote:
On 12/13/2015 05:02 PM, Jon Masters wrote:
On 12/13/2015 04:36 PM, Jon Masters wrote:
Just thinking from a parity point of view - if you can do it on x86, it should be doable on ARM. But that GRUB module approach I quite like!
Sorry for top post earlier. Was on my phone. In any case, it looks like the "acpi" command in GRUB currently does an all-or-nothing replace of all of the tables, not just a named table. We need to be able to override e.g. just a DSDT or SSDT with a replacement test one.
I rescind that. Looking at the GRUB source clarifies things. It will copy all host tables, then update the copied version if there are additional tables with changes and recalculate checksums/pointers in the XSDT etc. I seem to recall having looked at this before a couple years ago or something and it might have grown some of the 64-bit XSDT logic in that time. Either way, it does seem to do what I wanted. I will follow up. I'll probably still use the initrd approach for now.
Addendum. I was able to get the initrd based ACPI table override working...HOWEVER...
Why do you need to override ACPI tables? Isn't ACPI supposed to be perfect?
*** I had to implement a custom memcpy routine to do so ***
Per some internal suggestions, I tried moving map_mem earlier (prior to doing boot time ACPI table parsing, in order to avoid needing to abuse fixmaps to touch the initrd contents), which works for reading the ramdisk cpio content (the kernel code still uses a fixmap in the initrd override driver code to map the newly created tables in memory).
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses.
As others have pointed out, your error is twofold: first using early_ioremap() when early_memremap() would do, then using regular memcpy() and not memcpy_fromio() on that. Seems like your precious ACPI code is to blame here.
The in-kernel memcpy routine only checks the alignment of the source pointer when it begins copying, and doesn't handle the case that the natural alignment differs between source and destination. Therefore, the kernel rolls over and plays dead unless I provide a hacked up jcm_memcpy that just does byte copies. Then everything "works".
That's the right thing to do for normal memory on ARMv6 and up which support unaligned accesses. Hardware unaligned accesses are slightly slower than aligned, so aligning the loads saves a few cycles. The manual shifts required to also align the destination are, however, more expensive than simply letting the hardware do it.
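To make the trade-off above concrete: once the source has been aligned, the destination ends up aligned as well only if both pointers started at the same offset within an aligned block. A small userspace check of that condition, where can_coalign is a hypothetical helper written for this note and not a kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if advancing both pointers by the
 * same number of bytes can make them 'align'-byte aligned together.
 * 'align' must be a power of two.
 */
bool can_coalign(const void *dst, const void *src, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	/*
	 * Both pointers reach an aligned address at the same step only if
	 * their offsets within an 'align'-sized block are equal, i.e. their
	 * low bits match.
	 */
	return (((uintptr_t)dst ^ (uintptr_t)src) & (align - 1)) == 0;
}
```

When this returns false, a copy routine must either shift data manually to align the destination or leave one side unaligned and let the hardware handle it, which is only an option for Normal memory.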
On 12/15/2015 06:43 AM, Måns Rullgård wrote:
Jon Masters <jcm@redhat.com> writes:
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses.
As others have pointed out, your error is twofold: first using early_ioremap() when early_memremap() would do, then using regular memcpy() and not memcpy_fromio() on that. Seems like your precious ACPI code is to blame here.
So I get that this could be changed. But I still want to understand whether memcpy is behaving correctly. Is it guaranteed /never/ to occur that a copy will involve Device memory? All other such occurrences in the kernel will be caught and fixed so this will never be an issue?
Jon.
On Tue, Dec 15, 2015 at 10:29:54AM -0500, Jon Masters wrote:
On 12/15/2015 06:43 AM, Måns Rullgård wrote:
Jon Masters <jcm@redhat.com> writes:
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses.
As others have pointed out, your error is twofold: first using early_ioremap() when early_memremap() would do, then using regular memcpy() and not memcpy_fromio() on that. Seems like your precious ACPI code is to blame here.
So I get that this could be changed. But I still want to understand whether memcpy is behaving correctly. Is it guaranteed /never/ to occur that a copy will involve Device memory?
It is guaranteed in the same sense that other problems resulting from misuses of kernel APIs are guaranteed never to occur -- so long as the contract between the API and the programmer is respected.
It is physically possible that a programmer may violate that contract. It is a programming error if that happens.
All other such occurrences in the kernel will be caught and fixed so this will never be an issue?
No-one has the prescience to be able to tell you whether a particular bug will or will not be introduced in future. If the programmer pays attention to their tools (e.g. sparse warnings about address space mismatches), this class of error is fairly easy to avoid.
If you have found another occurrence of this issue, please report it.
Mark.
Jon Masters <jcm@redhat.com> writes:
On 12/15/2015 06:43 AM, Måns Rullgård wrote:
Jon Masters <jcm@redhat.com> writes:
But the reading of that cpio content into the new table locations is done using the kernel memcpy routine to early_ioremap'd memory (Device memory), which is architecturally sensitive to misaligned accesses.
As others have pointed out, your error is twofold: first using early_ioremap() when early_memremap() would do, then using regular memcpy() and not memcpy_fromio() on that. Seems like your precious ACPI code is to blame here.
So I get that this could be changed. But I still want to understand whether memcpy is behaving correctly.
It is. I already explained why it's the correct way.