Single zImage at Linaro Connect
nicolas.pitre at linaro.org
Thu Jul 28 02:58:36 UTC 2011
To everyone, and especially to those who are expected to work on this
topic next week, please find below a list of tasks that needs to be
investigated and/or accomplished. I'll coordinate the work and collect
patches for the team.
If you have comments on this, or if you know about some omissions,
please feel free to provide them as a reply to this message.
I'd like to know if people are particularly interested in one (or more :-))
items they would like to work on. If so please say so as well.
Without further ado, here it is:
0) The so called "single zImage" project
We wish to provide the ability to build as many ARM platforms as
possible into a single kernel binary image. This will greatly simplify
the archive packaging and maintenance effort by having only one kernel
that could be built and booted on multiple ARM targets. A side effect
of this is also to enforce better source code architecture even if the
resulting binaries are not always supporting multiple targets.
This work started a while ago. Some initial description can be found
Part of it has been implemented already, namely the runtime determined
PHYS_OFFSET, the AUTO_ZRELADDR and some other items referenced below.
But there is still a large amount of work remaining.
1) Removal of any dependencies on <mach/*.h> from generic header files
To see the current culprits:
$ git grep "#include <mach/.*.h>" arch/arm/include/
arch/arm/include/asm/pci.h:#include <mach/hardware.h> /* for PCIBIOS_MIN_* */
This may contain the following defines:
This can be eliminated by moving that value into struct machine_desc.
The work is done already, but presented as an example for other tasks:
And as of now this is merged in mainline already for v3.1-rc1.
Most occurrences can be eliminated. With CONFIG_ARM_PATCH_PHYS_VIRT, it
is possible to determine PHYS_OFFSET at run time. Remains to remove the
direct uses, mostly by mdesc->boot_params initializers. Changing
boot_params into atag_offset has two effects: that makes it clearer that
it is only about ATAGs and not DT, and a relative offset plays more
nicely with a runtime determined PHYS_OFFSET.
This work is done but not yet accepted:
1.1.3) FLUSH_BASE, FLUSH_BASE_PHYS, FLUSH_BASE_MINICACHE, UNCACHEABLE_ADDR
Those are StrongARM related constants, and different for each variants.
Fixing this involves making the virtual addresses constant for all
variants, and hiding the differences in the physical addresses during
the actual mapping.
The solution is here:
Maybe the CMA work will make this obsolete and the consistent DMA area
could be dynamically adjusted. In the mean time, the easiest solution
is probably to store this in the machine_desc structure just like with
This has not been addressed yet.
1.1.5) Other weird things
Some machines have non linear memory maps or bus address translations,
sparsemem, etc. Examples of that are:
I think solving this is out of scope for this round. Deleting
arch/arm/mach-*/include/mach/memory.h can't be done universally. So a
new Kconfig symbol (NO_MACH_MEMORY_H) is introduced to indicate which
machine class has its legacy <mach/memory.h> file removed. The single
zImage for multiple targets will be restricted, amongst other things, to
those machines or SOCs with that symbol selected. Partial result here:
This contains things like IO_SPACE_LIMIT, __io(), __mem_pci(), and
sometimes __arch_ioremap()/__arch_unmap(). but in most cases, the
definitions here are pretty similar from one machine class to another.
|I have a plan. When CONFIG_PCI is disabled (along with CONFIG_ISA and
|CONFIG_PCMCIA), we should have neither of IO_SPACE_LIMIT, __io()
|and get no inb/outb functions as a result.
|When it is enabled, the 'common' platforms need only one I/O window
|of 64KB, so we should find a common place in the virtual address space
|for that and hardcode __io, while the platform specific PCI initialization
|code (or map_io for that matter) ensures that the window is pointing
|to the physical window.
|__arch_ioremap()/__arch_unmap() are not really needed as far as I can
|tell but are used as an optimization to redirect ioremap to the
|hardcoded virtal address mapping. In the first step we can disable
|this for combined kernels, later we can find a generic way so
|__arch_ioremap walks the list of static mappings.
Most instances simply define a dummy CLOCK_TICK_RATE value. This can
probably be removed altogether, or simply have a common value in
arch/arm/include/asm/timex.h, as nothing seriously uses that anymore.
This universally contains only a definition for VMALLOC_END, but not an
universal definition. Would be nice to have VMALLOC_eND dynamically
determined from the static IO mappings, but the highmem threshold
depends on the value of VMALLOC_END, and memory has to be initialized
before the static IO mappings can be processed.
Therefore the best solution so far appears to use another value in
struct machine_desc for it so it can be set at run time. this is a
mechanical conversion that has to be done.
The only information globally required from those files is the value of
NR_IRQS. Yet there is already a nr_irqs member in the machine_desc
structure for this, used by arch_probe_nr_irqs() in
So the first step would be to add
.nr_irqs = NR_IRQS,
to all machine_desc instances, making sure that <mach/irqs.h> is
included in those files. Then, <mach/irqs.h> should be removed from
arch/arm/include/asm/irq.h, and adjust things so everything still
This is a tough one. This depends on CONFIG_GENERIC_GPIO which is
selected by many machine types. They should all be converted to (or
configurable with) CONFIG_GPIOLIB so each SOC's specific GPIO handling
is made into runtime code instead of static inline functions. Care to
preserve the ability to not use gpiolib might be desireable in some
cases for performance reasons.
Definitely in need of serious investigation.
No need to care about those. This is for running the kernel XIP from
ROM memory. A XIP kernel is already incompatible with the notion of a
single kernel image since it obviously can't be modified at run time (as
needed by CONFIG_ARM_PATCH_PHYS_VIRT).
1.8) mach/isa-dma.h, mach/floppy.h
Those are used by old targets we might not care much about.
This one gets included directly from arch/arm/kernel/entry-armv.S.
The only relevant macro still widely used is get_irqnr_preamble and
get_irqnr_and_base. They can be overridden by CONFIG_MULTI_IRQ_HANDLER
and the equivalent code hooked to the handle_irq member of the
This is used when CONFIG_DEBUG_LL is set. Supporting that option with a
single kernel image might prove very difficult with a rapidly
diminishing return on the investment.
This code is in need of some refactoring already:
To still benefit from the most likely needed debugging aid, we might
consider the ability to still allow the selection of one amongst the
existing implementation when building a kernel with many SOC support.
Obviously that would only work on the one hardware platform for which the selected printch implementation was
designed, but that should be good enough for debugging purposes.
This is included from arch/arm/kernel/process.c and expected to provide
the following static inline functions or equivalent:
Called when system is idle. Most of them just call cpu_do_idle().
The call to cpu_do_idle() should be moved to default_idle() and the exception
cases moved out of line where they can be hooked to the pm_idle callback.
Used to reset the system. This is far from being a hot path and doesn't
justify a static inline function. An out-of-line version hooked to a
global arch_reset function pointer would work just fine.
This is used to define per SOC methods to output some progress feedback
from the kernel decompressor over a serial port. Once again, supporting
this with a single kernel image might prove very difficult with a
rapidly diminishing return on the investment. So it is probably best to
simply use generic empty stubs whenever more than one SOC family is
configured in a common kernel image.
2) Removal of any dependencies on <mach/*.h> from driver code
A couple possibilities:
a) We move the required header files next to the driver code. In many
cases, having a .h file with only the defines relevant to the concerned
driver is best. But this is a _lot_ of work.
b) We change those <mach/foo.h> into something more absolute, such as
<mach/omap2/foo.h>. This can be done on a per SOC basis, first by
moving the header files one level deeper, and then fixing up all
c) We change those <mach/foo.h> files into something more precise, e.g.
<mach/omap2_foo.h> and fix concerned drivers.
I think the best solution here is (b) which doesn't preclude (a)
eventually or if it is trivial. But (c) is dangerous as files might be
added easily without paying too much attention to the file prefix.
3) Change thes to the build system
We need to move towards the ability to actually build more than one SOC
family at the same time.
This involves changes to Kconfig where currently only one out of all the
different architectures is selected through the big "ARM system type"
choice prompt. We need to determine a good way to move some of them
into simply bool prompts and keep track of which architecture can be
built concurrently with which. We know for instance that it is unlikely
that pre-ARMv6 and ARMv6/7 will ever be buildable together. Today we
know that nothing can be built with anything else and therefore this
should be the starting default. This needs investigating.
Currently the arch/arm/Makefile is organized so the lowest instruction
set level and the highest optimization level are selected from all the
configured options. So this part should already be fine.
However the machine-$(*), plat-$(*), machdirs and platdirs variables
must go. In (2) above we should have removed the need for adding to the
global KBUILD_CPPFLAGS to add a path to some specific architecture
includes already. Keeping them only for the code under each
architecture subdirectory should be sufficient.
For example, this might be all that is needed:
obj-$(CONFIG_ARCH_MSM) += mach-msm/
obj-$(CONFIG_ARCH_KIRKWOOD) += mach-kirkwood/ plat-orion/
obj-$(CONFIG_ARCH_ORION5X) += mach-orion5x/ plat-orion/
And within each of these directories, using the subdir-ccflags-y
variable to include the locally needed architecture specific include
files will do the trick.
We need a defconfig file adding as many architectures to it as possible
for build coverage. Ideally the resulting binary should be boot tested
on as many targets it supports as possible.
4) Picking up broken pieces
Things will certainly break along the way. There are certainly issues
that I didn't foresee. My experience so far tend to indicate that
this is a somewhat recursive process where the tackling of one work item
reveals a few more which are prerequisite to the first one, etc. So any
estimate for this work needs to consider a large fudge factor.
More information about the linaro-kernel