Single zImage at Linaro Connect
robherring2 at gmail.com
Thu Jul 28 15:43:14 UTC 2011
On 07/27/2011 09:58 PM, Nicolas Pitre wrote:
> To everyone, and especially to those who are expected to work on this
> topic next week, please find below a list of tasks that need to be
> investigated and/or accomplished. I'll coordinate the work and collect
> patches for the team.
> If you have comments on this, or if you know about some omissions,
> please feel free to provide them as a reply to this message.
> I'd like to know if people are particularly interested in one (or more :-))
> items they would like to work on. If so please say so as well.
> Without further ado, here it is:
> 0) The so called "single zImage" project
> We wish to provide the ability to build as many ARM platforms as
> possible into a single kernel binary image. This will greatly simplify
> the archive packaging and maintenance effort by having only one kernel
> that could be built and booted on multiple ARM targets. A side effect
> of this is also to enforce better source code architecture even if the
> resulting binaries are not always supporting multiple targets.
> This work started a while ago. Some initial description can be found
> Part of it has been implemented already, namely the runtime determined
> PHYS_OFFSET, the AUTO_ZRELADDR and some other items referenced below.
> But there is still a large amount of work remaining.
> 1) Removal of any dependencies on <mach/*.h> from generic header files
> To see the current culprits:
> $ git grep "#include <mach/.*.h>" arch/arm/include/
> arch/arm/include/asm/clkdev.h:#include <mach/clkdev.h>
> arch/arm/include/asm/dma.h:#include <mach/isa-dma.h>
> arch/arm/include/asm/floppy.h:#include <mach/floppy.h>
> arch/arm/include/asm/gpio.h:#include <mach/gpio.h>
> arch/arm/include/asm/hardware/dec21285.h:#include <mach/hardware.h>
> arch/arm/include/asm/hardware/iop3xx-adma.h:#include <mach/hardware.h>
> arch/arm/include/asm/hardware/iop3xx-gpio.h:#include <mach/hardware.h>
> arch/arm/include/asm/hardware/sa1111.h:#include <mach/bitfield.h>
> arch/arm/include/asm/io.h:#include <mach/io.h>
> arch/arm/include/asm/irq.h:#include <mach/irqs.h>
> arch/arm/include/asm/mc146818rtc.h:#include <mach/irqs.h>
> arch/arm/include/asm/memory.h:#include <mach/memory.h>
> arch/arm/include/asm/mtd-xip.h:#include <mach/mtd-xip.h>
> arch/arm/include/asm/pci.h:#include <mach/hardware.h> /* for PCIBIOS_MIN_* */
> arch/arm/include/asm/pgtable.h:#include <mach/vmalloc.h>
> arch/arm/include/asm/system.h:#include <mach/barriers.h>
> arch/arm/include/asm/timex.h:#include <mach/timex.h>
> arch/arm/include/asm/vga.h:#include <mach/hardware.h>
> 1.1) mach/memory.h
> This may contain the following defines:
> 1.1.1) ARM_DMA_ZONE_SIZE
> This can be eliminated by moving that value into struct machine_desc.
> The work is done already, but presented as an example for other tasks:
> And as of now this is merged in mainline already for v3.1-rc1.
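As a rough illustration of 1.1.1, the per-machine value can sit in the machine descriptor and be read at run time instead of coming from a compile-time define. This is only a user-space sketch, not the merged patch: struct mdesc_sketch, the two boards and dma_zone_limit() are hypothetical names, and only the idea of carrying the DMA zone size in the descriptor comes from the text above.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's machine descriptor. */
struct mdesc_sketch {
	const char *name;
	unsigned long dma_zone_size;	/* 0 means "no DMA zone restriction" */
};

/* Two made-up boards sharing one binary; only board_a restricts DMA. */
const struct mdesc_sketch board_a = {
	.name = "board-a",
	.dma_zone_size = 64UL << 20,
};

const struct mdesc_sketch board_b = {
	.name = "board-b",
};

/*
 * Return the size of the DMA-capable zone for the machine selected at
 * boot: the descriptor's limit if it sets one below the total, else all
 * of memory.
 */
unsigned long dma_zone_limit(const struct mdesc_sketch *mdesc,
			     unsigned long total_size)
{
	if (mdesc->dma_zone_size && mdesc->dma_zone_size < total_size)
		return mdesc->dma_zone_size;
	return total_size;
}
```

The same pattern (one more field in the descriptor, looked up once at boot) is what several of the later items propose as well.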
> 1.1.2) PLAT_PHYS_OFFSET
> Most occurrences can be eliminated. With CONFIG_ARM_PATCH_PHYS_VIRT, it
> is possible to determine PHYS_OFFSET at run time. Remains to remove the
> direct uses, mostly by mdesc->boot_params initializers. Changing
> boot_params into atag_offset has two effects: that makes it clearer that
> it is only about ATAGs and not DT, and a relative offset plays more
> nicely with a runtime determined PHYS_OFFSET.
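To make the point about relative offsets concrete: with an offset, the ATAG location is derived from whatever PHYS_OFFSET turns out to be at boot. The helper below is a hypothetical user-space sketch (atags_phys_addr() is not a kernel function, and the values used with it are made up).

```c
/*
 * Sketch: compute the physical address of the ATAG list from the
 * run-time PHYS_OFFSET and a per-machine relative offset, instead of
 * storing an absolute address in the machine descriptor.
 */
unsigned long atags_phys_addr(unsigned long phys_offset,
			      unsigned long atag_offset)
{
	return phys_offset + atag_offset;
}
```

A machine that used to say "boot_params = PLAT_PHYS_OFFSET + 0x100" would then only need to record the 0x100.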
> This work is done but not yet accepted:
> 1.1.3) FLUSH_BASE, FLUSH_BASE_PHYS, FLUSH_BASE_MINICACHE, UNCACHEABLE_ADDR
> Those are StrongARM related constants, and differ for each variant.
> Fixing this involves making the virtual addresses constant for all
> variants, and hiding the differences in the physical addresses during
> the actual mapping.
> The solution is here:
> 1.1.4) CONSISTENT_DMA_SIZE
> Maybe the CMA work will make this obsolete and the consistent DMA area
> could be dynamically adjusted. In the mean time, the easiest solution
> is probably to store this in the machine_desc structure just like with
> This has not been addressed yet.
> 1.1.5) Other weird things
> Some machines have non linear memory maps or bus address translations,
> sparsemem, etc. Examples of that are:
> I think solving this is out of scope for this round. Deleting
> arch/arm/mach-*/include/mach/memory.h can't be done universally. So a
> new Kconfig symbol (NO_MACH_MEMORY_H) is introduced to indicate which
> machine class has its legacy <mach/memory.h> file removed. The single
> zImage for multiple targets will be restricted, amongst other things, to
> those machines or SOCs with that symbol selected. Partial result here:
> 1.2) mach/io.h
> This contains things like IO_SPACE_LIMIT, __io(), __mem_pci(), and
> sometimes __arch_ioremap()/__arch_unmap(). But in most cases, the
> definitions here are pretty similar from one machine class to another.
> Arnd says:
> |I have a plan. When CONFIG_PCI is disabled (along with CONFIG_ISA and
> |CONFIG_PCMCIA), we should have neither of IO_SPACE_LIMIT, __io()
> |and get no inb/outb functions as a result.
> |When it is enabled, the 'common' platforms need only one I/O window
> |of 64KB, so we should find a common place in the virtual address space
> |for that and hardcode __io, while the platform specific PCI initialization
> |code (or map_io for that matter) ensures that the window is pointing
> |to the physical window.
> |__arch_ioremap()/__arch_unmap() are not really needed as far as I can
> |tell but are used as an optimization to redirect ioremap to the
> |hardcoded virtual address mapping. In the first step we can disable
> |this for combined kernels, later we can find a generic way so
> |__arch_ioremap walks the list of static mappings.
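The "one common 64KB window" idea can be sketched as a fixed virtual base plus the port number. The base address and all names below are assumptions made for illustration, not values proposed in this thread; the platform code would still be responsible for mapping that virtual window onto its own physical I/O window.

```c
#include <stdint.h>

/* Hypothetical fixed virtual base shared by all platforms. */
#define IO_VIRT_BASE_SKETCH	0xfee00000UL
/* One 64KB I/O window, as described above. */
#define IO_SPACE_LIMIT_SKETCH	0xffffUL

/*
 * Sketch of a hardcoded __io(): translate an I/O port number into a
 * virtual address inside the common window.
 */
uintptr_t io_port_to_virt(unsigned long port)
{
	return IO_VIRT_BASE_SKETCH + (port & IO_SPACE_LIMIT_SKETCH);
}
```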
> 1.3) mach/timex.h
> Most instances simply define a dummy CLOCK_TICK_RATE value. This can
> probably be removed altogether, or simply have a common value in
> arch/arm/include/asm/timex.h, as nothing seriously uses that anymore.
> Reference: http://lkml.org/lkml/2011/2/21/323
> 1.4) mach/vmalloc.h
> This universally contains only a definition for VMALLOC_END, but not a
> universal one. It would be nice to have VMALLOC_END dynamically
> determined from the static IO mappings, but the highmem threshold
> depends on the value of VMALLOC_END, and memory has to be initialized
> before the static IO mappings can be processed.
> Therefore the best solution so far appears to be to use another value in
> struct machine_desc for it so it can be set at run time. This is a
> mechanical conversion that has to be done.
> 1.5) mach/irqs.h
> The only information globally required from those files is the value of
> NR_IRQS. Yet there is already a nr_irqs member in the machine_desc
> structure for this, used by arch_probe_nr_irqs() in
> So the first step would be to add
> .nr_irqs = NR_IRQS,
> to all machine_desc instances, making sure that <mach/irqs.h> is
> included in those files. Then, <mach/irqs.h> should be removed from
> arch/arm/include/asm/irq.h, and adjust things so everything still
> 1.6) mach/gpio.h
> This is a tough one. This depends on CONFIG_GENERIC_GPIO which is
> selected by many machine types. They should all be converted to (or
> configurable with) CONFIG_GPIOLIB so each SOC's specific GPIO handling
> is made into runtime code instead of static inline functions. Care to
> preserve the ability to not use gpiolib might be desirable in some
> cases for performance reasons.
> Definitely in need of serious investigation.
> 1.7) mach/mtd-xip.h
> No need to care about those. This is for running the kernel XIP from
> ROM memory. An XIP kernel is already incompatible with the notion of a
> single kernel image since it obviously can't be modified at run time (as
> needed by CONFIG_ARM_PATCH_PHYS_VIRT).
> 1.8) mach/isa-dma.h, mach/floppy.h
> Those are used by old targets we might not care much about.
> 1.9) mach/entry-macro.S
> This one gets included directly from arch/arm/kernel/entry-armv.S.
> The only relevant macros still widely used are get_irqnr_preamble and
> get_irqnr_and_base. They can be overridden by CONFIG_MULTI_IRQ_HANDLER
> and the equivalent code hooked to the handle_irq member of the
> machine_desc structure.
> 1.10) mach/debug-macro.S
> This is used when CONFIG_DEBUG_LL is set. Supporting that option with a
> single kernel image might prove very difficult with a rapidly
> diminishing return on the investment.
> This code is in need of some refactoring already:
> To still benefit from the most likely needed debugging aid, we might
> consider still allowing the selection of one amongst the existing
> implementations when building a kernel with support for many SOCs.
> Obviously that would only work on the one hardware platform for which
> the selected printch implementation was designed, but that should be
> good enough for debugging purposes.
> 1.11) mach/system.h
> This is included from arch/arm/kernel/process.c and expected to provide
> the following static inline functions or equivalent:
> 1.11.1) arch_idle()
> Called when system is idle. Most of them just call cpu_do_idle().
> The call to cpu_do_idle() should be moved to default_idle() and the exception
> cases moved out of line where they can be hooked to the pm_idle callback.
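A minimal user-space sketch of the arrangement in 1.11.1: the default path calls the generic idle routine, and only the exceptions install a callback. All names ending in _sketch, the counters and board_x_idle() are hypothetical stand-ins; the real pm_idle and cpu_do_idle() are only modelled here.

```c
#include <stddef.h>

int generic_idle_calls;		/* counts calls to the generic routine */
int board_idle_calls;		/* counts calls to the board override */

/* Stand-in for cpu_do_idle(). */
static void cpu_do_idle_sketch(void)
{
	generic_idle_calls++;
}

/* Stand-in for the pm_idle callback; NULL means "use the default". */
void (*pm_idle_sketch)(void);

/* The common case: no <mach/system.h> needed, just the generic call. */
static void default_idle_sketch(void)
{
	cpu_do_idle_sketch();
}

/* An exception case, moved out of line into board code. */
void board_x_idle(void)
{
	board_idle_calls++;
}

/* One pass through the idle loop. */
void idle_once(void)
{
	if (pm_idle_sketch)
		pm_idle_sketch();
	else
		default_idle_sketch();
}
```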
> 1.11.2) arch_reset()
> Used to reset the system. This is far from being a hot path and doesn't
> justify a static inline function. An out-of-line version hooked to a
> global arch_reset function pointer would work just fine.
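The out-of-line reset hook from 1.11.2 could look roughly like this. It is a sketch under assumptions: the (mode, cmd) signature, the _sketch names and board_x_reset() are made up for illustration, and the board function only records what it was asked to do.

```c
#include <stddef.h>

char last_reset_mode;	/* what the made-up board was last asked to do */

/* Global hook, set by the selected machine during its init. */
void (*arch_reset_sketch)(char mode, const char *cmd);

/* Made-up board implementation; real code would poke a watchdog etc. */
void board_x_reset(char mode, const char *cmd)
{
	(void)cmd;
	last_reset_mode = mode;
}

/*
 * Generic restart path: call the hook if one was registered.
 * Returns 0 if a hook was called, -1 if none was set.
 */
int machine_restart_sketch(char mode, const char *cmd)
{
	if (!arch_reset_sketch)
		return -1;
	arch_reset_sketch(mode, cmd);
	return 0;
}
```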
> 1.12) mach/uncompress.h
> This is used to define per SOC methods to output some progress feedback
> from the kernel decompressor over a serial port. Once again, supporting
> this with a single kernel image might prove very difficult with a
> rapidly diminishing return on the investment. So it is probably best to
> simply use generic empty stubs whenever more than one SOC family is
> configured in a common kernel image.
> 2) Removal of any dependencies on <mach/*.h> from driver code
> A couple possibilities:
> a) We move the required header files next to the driver code. In many
> cases, having a .h file with only the defines relevant to the concerned
> driver is best. But this is a _lot_ of work.
> b) We change those <mach/foo.h> into something more absolute, such as
> <mach/omap2/foo.h>. This can be done on a per SOC basis, first by
> moving the header files one level deeper, and then fixing up all
> affected drivers.
> c) We change those <mach/foo.h> files into something more precise, e.g.
> <mach/omap2_foo.h> and fix concerned drivers.
> I think the best solution here is (b) which doesn't preclude (a)
> eventually or if it is trivial. But (c) is dangerous as files might be
> added easily without paying too much attention to the file prefix.
> 3) Changes to the build system
> We need to move towards the ability to actually build more than one SOC
> family at the same time.
> 3.1) Kconfig
> This involves changes to Kconfig where currently only one out of all the
> different architectures is selected through the big "ARM system type"
> choice prompt. We need to determine a good way to move some of them
> into simple bool prompts and keep track of which architecture can be
> built concurrently with which. We know for instance that it is unlikely
> that pre-ARMv6 and ARMv6/7 will ever be buildable together. Today we
> know that nothing can be built with anything else and therefore this
> should be the starting default. This needs investigating.
> 3.2) Makefile
> Currently the arch/arm/Makefile is organized so the lowest instruction
> set level and the highest optimization level are selected from all the
> configured options. So this part should already be fine.
> However the machine-$(*), plat-$(*), machdirs and platdirs variables
> must go. In (2) above we should have removed the need for adding to the
> global KBUILD_CPPFLAGS to add a path to some specific architecture
> includes already. Keeping them only for the code under each
> architecture subdirectory should be sufficient.
> For example, this might be all that is needed:
> obj-$(CONFIG_ARCH_MSM) += mach-msm/
> obj-$(CONFIG_ARCH_KIRKWOOD) += mach-kirkwood/ plat-orion/
> obj-$(CONFIG_ARCH_ORION5X) += mach-orion5x/ plat-orion/
> And within each of these directories, using the subdir-ccflags-y
> variable to include the locally needed architecture specific include
> files will do the trick.
> 3.3) defconfig
> We need a defconfig file adding as many architectures to it as possible
> for build coverage. Ideally the resulting binary should be boot tested
> on as many of the targets it supports as possible.
> 4) Picking up broken pieces
> Things will certainly break along the way. There are certainly issues
> that I didn't foresee. My experience so far tends to indicate that
> this is a somewhat recursive process where the tackling of one work item
> reveals a few more which are prerequisite to the first one, etc. So any
> estimate for this work needs to consider a large fudge factor.
There are also collisions with the platform SMP and localtimer code.
Things like platform_secondary_init, platform_smp_prepare_cpus, etc.
need to be converted over to something like smp_ops on PowerPC. There's
been some work on the local timer code by Marc Z.
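Something along these lines, as a user-space sketch only. The struct layout, the _sketch names and the board_x functions are hypothetical and not taken from PowerPC or from any existing patch; the point is just that the global platform_* symbols become per-platform function pointers chosen at run time.

```c
#include <stddef.h>

/* Hypothetical per-platform SMP operations table. */
struct smp_ops_sketch {
	void (*prepare_cpus)(unsigned int max_cpus);
	void (*secondary_init)(unsigned int cpu);
};

unsigned int prepared_cpus;	/* recorded by the made-up board code */
unsigned int last_secondary;

static void board_x_prepare_cpus(unsigned int max_cpus)
{
	prepared_cpus = max_cpus;
}

static void board_x_secondary_init(unsigned int cpu)
{
	last_secondary = cpu;
}

/* What a platform would register instead of defining global symbols. */
const struct smp_ops_sketch board_x_smp_ops = {
	.prepare_cpus	= board_x_prepare_cpus,
	.secondary_init	= board_x_secondary_init,
};

static const struct smp_ops_sketch *smp_ops;

/* Called once the machine has been identified at boot. */
void smp_set_ops_sketch(const struct smp_ops_sketch *ops)
{
	smp_ops = ops;
}

/* Generic code calls through the table; missing hooks are skipped. */
void smp_prepare_cpus_sketch(unsigned int max_cpus)
{
	if (smp_ops && smp_ops->prepare_cpus)
		smp_ops->prepare_cpus(max_cpus);
}

void secondary_start_sketch(unsigned int cpu)
{
	if (smp_ops && smp_ops->secondary_init)
		smp_ops->secondary_init(cpu);
}
```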