Single zImage at Linaro Connect

Rob Herring robherring2 at gmail.com
Thu Jul 28 15:43:14 UTC 2011


On 07/27/2011 09:58 PM, Nicolas Pitre wrote:
> 
> To everyone, and especially to those who are expected to work on this 
> topic next week, please find below a list of tasks that need to be 
> investigated and/or accomplished.  I'll coordinate the work and collect 
> patches for the team.
> 
> If you have comments on this, or if you know about some omissions, 
> please feel free to provide them as a reply to this message.
> 
> I'd like to know if people are particularly interested in one (or more :-)) 
> items they would like to work on.  If so please say so as well.
> 
> Without further ado, here it is:
> 
> <><><><><>
> 
> 0) The so called "single zImage" project
> 
> We wish to provide the ability to build as many ARM platforms as 
> possible into a single kernel binary image. This will greatly simplify 
> the archive packaging and maintenance effort by having only one kernel 
> that could be built and booted on multiple ARM targets.  A side effect 
> of this is also to enforce better source code architecture even if the 
> resulting binaries are not always supporting multiple targets.
> 
> This work started a while ago.  Some initial description can be found 
> here:
> 
> https://wiki.ubuntu.com/Specs/ARMSingleKernel
> 
> Part of it has been implemented already, namely the runtime determined 
> PHYS_OFFSET, the AUTO_ZRELADDR and some other items referenced below.  
> But there is still a large amount of work remaining.
> 
> 1) Removal of any dependencies on <mach/*.h> from generic header files
> 
> To see the current culprits:
> 
> $ git grep "#include <mach/.*.h>" arch/arm/include/
> arch/arm/include/asm/clkdev.h:#include <mach/clkdev.h>
> arch/arm/include/asm/dma.h:#include <mach/isa-dma.h>
> arch/arm/include/asm/floppy.h:#include <mach/floppy.h>
> arch/arm/include/asm/gpio.h:#include <mach/gpio.h>
> arch/arm/include/asm/hardware/dec21285.h:#include <mach/hardware.h>
> arch/arm/include/asm/hardware/iop3xx-adma.h:#include <mach/hardware.h>
> arch/arm/include/asm/hardware/iop3xx-gpio.h:#include <mach/hardware.h>
> arch/arm/include/asm/hardware/sa1111.h:#include <mach/bitfield.h>
> arch/arm/include/asm/io.h:#include <mach/io.h>
> arch/arm/include/asm/irq.h:#include <mach/irqs.h>
> arch/arm/include/asm/mc146818rtc.h:#include <mach/irqs.h>
> arch/arm/include/asm/memory.h:#include <mach/memory.h>
> arch/arm/include/asm/mtd-xip.h:#include <mach/mtd-xip.h>
> arch/arm/include/asm/pci.h:#include <mach/hardware.h> /* for PCIBIOS_MIN_* */
> arch/arm/include/asm/pgtable.h:#include <mach/vmalloc.h>
> arch/arm/include/asm/system.h:#include <mach/barriers.h>
> arch/arm/include/asm/timex.h:#include <mach/timex.h>
> arch/arm/include/asm/vga.h:#include <mach/hardware.h>
> 
> 1.1) mach/memory.h
> 
> This may contain the following defines:
> 
> 1.1.1) ARM_DMA_ZONE_SIZE
> 
> This can be eliminated by moving that value into struct machine_desc.
> The work is done already, but presented as an example for other tasks: 
> http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/dma
> And as of now this is merged in mainline already for v3.1-rc1.
> 
> 1.1.2) PLAT_PHYS_OFFSET
> 
> Most occurrences can be eliminated.  With CONFIG_ARM_PATCH_PHYS_VIRT, it 
> is possible to determine PHYS_OFFSET at run time.  Remains to remove the 
> direct uses, mostly by mdesc->boot_params initializers.  Changing 
> boot_params into atag_offset has two effects: that makes it clearer that 
> it is only about ATAGs and not DT, and a relative offset plays more 
> nicely with a runtime determined PHYS_OFFSET.
> 
> This work is done but not yet accepted:
> http://news.gmane.org/group/gmane.linux.ports.arm.kernel/thread=123480
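
A minimal sketch of why a relative offset plays better with a runtime
PHYS_OFFSET: the ATAG address is only resolved once the start of RAM is
known.  The struct is a cut-down stand-in for the kernel's machine_desc,
and atags_phys_addr() is a hypothetical helper name, not an existing
kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct machine_desc. */
struct machine_desc {
	const char *name;
	uint32_t atag_offset;	/* ATAG list location, relative to start of RAM */
};

/*
 * Hypothetical helper: resolve the physical ATAG address from a
 * PHYS_OFFSET that was only determined at run time.
 */
static uint32_t atags_phys_addr(const struct machine_desc *mdesc,
				uint32_t phys_offset)
{
	assert(mdesc != NULL);
	return phys_offset + mdesc->atag_offset;
}
```

The same descriptor then works whether RAM starts at 0x00000000 or at
0x80000000, which an absolute boot_params value cannot do.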
> 
> 1.1.3) FLUSH_BASE, FLUSH_BASE_PHYS, FLUSH_BASE_MINICACHE, UNCACHEABLE_ADDR
> 
> Those are StrongARM related constants, and they differ for each variant.
> Fixing this involves making the virtual addresses constant for all 
> variants, and hiding the differences in the physical addresses during 
> the actual mapping.
> 
> The solution is here:
> http://news.gmane.org/group/gmane.linux.ports.arm.kernel/thread=123477/force_load=t/focus=124849
> 
> 1.1.4) CONSISTENT_DMA_SIZE
> 
> Maybe the CMA work will make this obsolete and the consistent DMA area 
> could be dynamically adjusted.  In the meantime, the easiest solution 
> is probably to store this in the machine_desc structure just like with 
> ARM_DMA_ZONE_SIZE.
> 
> This has not been addressed yet.
> 
> 1.1.5) Other weird things
> 
> Some machines have non linear memory maps or bus address translations, 
> sparsemem, etc. Examples of that are:
> 
> arch/arm/mach-realview/include/mach/memory.h
> arch/arm/mach-integrator/include/mach/memory.h
> 
> I think solving this is out of scope for this round.  Deleting 
> arch/arm/mach-*/include/mach/memory.h can't be done universally.  So a 
> new Kconfig symbol (NO_MACH_MEMORY_H) is introduced to indicate which 
> machine class has its legacy <mach/memory.h> file removed.  The single 
> zImage for multiple targets will be restricted, amongst other things, to 
> those machines or SOCs with that symbol selected.  Partial result here:
> http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/redux
> 
> 1.2) mach/io.h
> 
> This contains things like IO_SPACE_LIMIT, __io(), __mem_pci(), and 
> sometimes __arch_ioremap()/__arch_unmap().  But in most cases, the 
> definitions here are pretty similar from one machine class to another.
> 
> Arnd says: 
> 
> |I have a plan. When CONFIG_PCI is disabled (along with CONFIG_ISA and
> |CONFIG_PCMCIA), we should have neither IO_SPACE_LIMIT nor __io()
> |and get no inb/outb functions as a result.
> |
> |When it is enabled, the 'common' platforms need only one I/O window
> |of 64KB, so we should find a common place in the virtual address space
> |for that and hardcode __io, while the platform specific PCI initialization
> |code (or map_io for that matter) ensures that the window is pointing
> |to the physical window.
> |
> |__arch_ioremap()/__arch_unmap() are not really needed as far as I can
> |tell but are used as an optimization to redirect ioremap to the
> |hardcoded virtual address mapping. In the first step we can disable
> |this for combined kernels, later we can find a generic way so
> |__arch_ioremap walks the list of static mappings.
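
The "walk the list of static mappings" idea could look roughly like the
sketch below.  This is a userspace model only: struct static_map and
find_static_mapping() are made-up names for illustration, and the real
thing would have to hook into the existing static mapping bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one static (boot-time) I/O mapping. */
struct static_map {
	uint32_t phys;		/* physical base of the window */
	uintptr_t virt;		/* virtual base it was mapped to */
	uint32_t size;		/* length of the window in bytes */
	const struct static_map *next;
};

/*
 * Return the already-mapped virtual address covering [phys, phys + size),
 * or 0 if no static mapping contains the whole range and the caller has
 * to fall back to a real ioremap().
 */
static uintptr_t find_static_mapping(const struct static_map *list,
				     uint32_t phys, uint32_t size)
{
	assert(size > 0);
	for (const struct static_map *m = list; m; m = m->next) {
		if (phys < m->phys)
			continue;
		uint32_t off = phys - m->phys;
		/* Written this way so phys + size cannot overflow. */
		if (off < m->size && size <= m->size - off)
			return m->virt + off;
	}
	return 0;
}
```

A request that straddles the end of a window deliberately misses, so it
takes the slow path instead of getting a partially valid pointer.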
> 
> 1.3) mach/timex.h
> 
> Most instances simply define a dummy CLOCK_TICK_RATE value. This can 
> probably be removed altogether, or simply have a common value in 
> arch/arm/include/asm/timex.h, as nothing seriously uses that anymore. 
> 
> Reference: http://lkml.org/lkml/2011/2/21/323
> 
> 1.4) mach/vmalloc.h
> 
> This universally contains only a definition for VMALLOC_END, but not a 
> universal one.  It would be nice to have VMALLOC_END dynamically 
> determined from the static IO mappings, but the highmem threshold 
> depends on the value of VMALLOC_END, and memory has to be initialized 
> before the static IO mappings can be processed.
> 
> Therefore the best solution so far appears to use another value in
> struct machine_desc for it so it can be set at run time.  This is a 
> mechanical conversion that has to be done.
> 
> 1.5) mach/irqs.h
> 
> The only information globally required from those files is the value of 
> NR_IRQS.  Yet there is already an nr_irqs member in the machine_desc 
> structure for this (used by arch_probe_nr_irqs() in 
> arch/arm/kernel/irq.c).
> 
> So the first step would be to add
> 
> 	.nr_irqs	= NR_IRQS,
> 
> to all machine_desc instances, making sure that <mach/irqs.h> is 
> included in those files.  Then, <mach/irqs.h> should be removed from 
> arch/arm/include/asm/irq.h, and things adjusted so everything still 
> compiles.
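
To illustrate the end state, here is a small userspace model in which
generic code never sees a global NR_IRQS and only reads the per-machine
value.  The struct is a cut-down stand-in for machine_desc; the board
names, the IRQ counts and FALLBACK_NR_IRQS are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fallback for machines that leave .nr_irqs at zero. */
#define FALLBACK_NR_IRQS 16

/* Simplified stand-in for the kernel's struct machine_desc. */
struct machine_desc {
	const char *name;
	unsigned int nr_irqs;
};

/* Values that would otherwise come from each SOC's <mach/irqs.h>. */
static const struct machine_desc soc_a_board = {
	.name = "soc-a board", .nr_irqs = 96,
};
static const struct machine_desc soc_b_board = {
	.name = "soc-b board", .nr_irqs = 224,
};
static const struct machine_desc legacy_board = {
	.name = "legacy board",		/* .nr_irqs left unset */
};

/* Generic code: the IRQ count depends only on the booted machine. */
static unsigned int probe_nr_irqs(const struct machine_desc *mdesc)
{
	assert(mdesc != NULL);
	return mdesc->nr_irqs ? mdesc->nr_irqs : FALLBACK_NR_IRQS;
}
```

Two SOCs with different IRQ counts can then sit in the same image, since
the number is picked once the machine has been identified.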
> 
> 1.6) mach/gpio.h
> 
> This is a tough one.  This depends on CONFIG_GENERIC_GPIO which is 
> selected by many machine types.  They should all be converted to (or 
> configurable with) CONFIG_GPIOLIB so each SOC's specific GPIO handling 
> is made into runtime code instead of static inline functions.  Care to 
> preserve the ability to not use gpiolib might be desirable in some 
> cases for performance reasons.
> 
> Definitely in need of serious investigation.
> 
> 1.7) mach/mtd-xip.h
> 
> No need to care about those.  This is for running the kernel XIP from 
> ROM memory.  An XIP kernel is already incompatible with the notion of a 
> single kernel image since it obviously can't be modified at run time (as 
> needed by CONFIG_ARM_PATCH_PHYS_VIRT).
> 
> 1.8) mach/isa-dma.h, mach/floppy.h
> 
> Those are used by old targets we might not care much about.
> 
> 1.9) mach/entry-macro.S
> 
> This one gets included directly from arch/arm/kernel/entry-armv.S.
> The only relevant macros still widely used are get_irqnr_preamble and 
> get_irqnr_and_base.  They can be overridden by CONFIG_MULTI_IRQ_HANDLER
> and the equivalent code hooked to the handle_irq member of the 
> machine_desc structure.
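
A rough model of that hook, for illustration: the low-level entry path
makes one indirect call, and the pointer is filled in from the selected
machine_desc at setup time.  Everything here is a simplified userspace
stand-in; demo_handle_irq() and the counter are invented so the sketch
can be exercised.

```c
#include <assert.h>
#include <stddef.h>

struct pt_regs;		/* opaque in this sketch */

/* Simplified stand-in for the kernel's struct machine_desc. */
struct machine_desc {
	const char *name;
	void (*handle_irq)(struct pt_regs *regs);
};

/* Single global hook used by the generic entry code. */
static void (*handle_arch_irq)(struct pt_regs *regs);

/* Done once, after the machine has been identified. */
static void setup_irq_handler(const struct machine_desc *mdesc)
{
	assert(mdesc != NULL && mdesc->handle_irq != NULL);
	handle_arch_irq = mdesc->handle_irq;
}

/* Generic entry path: one indirect call replaces the per-SOC macros. */
static void irq_entry(struct pt_regs *regs)
{
	handle_arch_irq(regs);
}

/* Invented SOC handler that just counts invocations. */
static unsigned int demo_irq_count;

static void demo_handle_irq(struct pt_regs *regs)
{
	(void)regs;
	demo_irq_count++;
}
```

The cost is an indirect branch on every interrupt, which is the trade
being made for dropping the compile-time macros.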
> 
> 1.10) mach/debug-macro.S
> 
> This is used when CONFIG_DEBUG_LL is set.  Supporting that option with a 
> single kernel image might prove very difficult with a rapidly 
> diminishing return on the investment.
> 
> This code is in need of some refactoring already:
> http://article.gmane.org/gmane.linux.ports.arm.kernel/118525
> 
> To still benefit from the most likely needed debugging aid, we might
> consider the ability to still allow the selection of one amongst the
> existing implementations when building a kernel with support for many
> SOCs.  Obviously that would only work on the one hardware platform for
> which the selected printch implementation was designed, but that should
> be good enough for debugging purposes.
> 
> 1.11) mach/system.h
> 
> This is included from arch/arm/kernel/process.c and expected to provide 
> the following static inline functions or equivalent:
> 
> 1.11.1) arch_idle()
> 
> Called when system is idle.  Most of them just call cpu_do_idle().
> The call to cpu_do_idle() should be moved to default_idle() and the exception
> cases moved out of line where they can be hooked to the pm_idle callback.
> 
> 1.11.2) arch_reset()
> 
> Used to reset the system.  This is far from being a hot path and doesn't 
> justify a static inline function.  An out-of-line version hooked to a 
> global arch_reset function pointer would work just fine.
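
Something along these lines, as a userspace sketch only: the function
pointer keeps the arch_reset(mode, cmd) shape, while set_arch_reset(),
machine_restart_sketch() and the demo hook are hypothetical names used
to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Global hook replacing the per-SOC static inline arch_reset(). */
static void (*arch_reset)(char mode, const char *cmd);

/* Hypothetical registration helper called from SOC init code. */
static void set_arch_reset(void (*fn)(char mode, const char *cmd))
{
	assert(fn != NULL);
	arch_reset = fn;
}

/*
 * Generic restart path.  Returns 0 if a reset hook was invoked and -1
 * if the platform never registered one.
 */
static int machine_restart_sketch(char mode, const char *cmd)
{
	if (!arch_reset)
		return -1;
	arch_reset(mode, cmd);
	return 0;
}

/* Invented SOC hook; real hardware would poke a reset controller. */
static unsigned int demo_reset_calls;

static void demo_soc_reset(char mode, const char *cmd)
{
	(void)mode;
	(void)cmd;
	demo_reset_calls++;
}
```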
> 
> 1.12) mach/uncompress.h
> 
> This is used to define per SOC methods to output some progress feedback 
> from the kernel decompressor over a serial port.  Once again, supporting 
> this with a single kernel image might prove very difficult with a 
> rapidly diminishing return on the investment.  So it is probably best to 
> simply use generic empty stubs whenever more than one SOC family is 
> configured in a common kernel image.
> 
> 2) Removal of any dependencies on <mach/*.h> from driver code
> 
> A couple possibilities:
> 
> a) We move the required header files next to the driver code.  In many 
> cases, having a .h file with only the defines relevant to the concerned 
> driver is best.  But this is a _lot_ of work.
> 
> b) We change those <mach/foo.h> into something more absolute, such as 
> <mach/omap2/foo.h>.  This can be done on a per SOC basis, first by 
> moving the header files one level deeper, and then fixing up all 
> affected drivers.
> 
> c) We change those <mach/foo.h> files into something more precise, e.g. 
> <mach/omap2_foo.h> and fix concerned drivers.
> 
> I think the best solution here is (b) which doesn't preclude (a) 
> eventually or if it is trivial.  But (c) is dangerous as files might be 
> added easily without paying too much attention to the file prefix.
> 
> 3) Changes to the build system
> 
> We need to move towards the ability to actually build more than one SOC 
> family at the same time.
> 
> 3.1) Kconfig
> 
> This involves changes to Kconfig where currently only one out of all the 
> different architectures is selected through the big "ARM system type" 
> choice prompt.  We need to determine a good way to move some of them 
> into simple bool prompts and keep track of which architecture can be 
> built concurrently with which.  We know for instance that it is unlikely 
> that pre-ARMv6 and ARMv6/7 will ever be buildable together.  Today we 
> know that nothing can be built with anything else and therefore this 
> should be the starting default.  This needs investigating.
> 
> 3.2) Makefile
> 
> Currently the arch/arm/Makefile is organized so the lowest instruction 
> set level and the highest optimization level are selected from all the 
> configured options.  So this part should already be fine.
> 
> However the machine-$(*), plat-$(*), machdirs and platdirs variables 
> must go.  In (2) above we should have removed the need for adding to the 
> global KBUILD_CPPFLAGS to add a path to some specific architecture 
> includes already.  Keeping them only for the code under each 
> architecture subdirectory should be sufficient.
> 
> For example, this might be all that is needed:
> 
> obj-$(CONFIG_ARCH_MSM) += mach-msm/
> 
> or
> 
> obj-$(CONFIG_ARCH_KIRKWOOD) += mach-kirkwood/ plat-orion/
> obj-$(CONFIG_ARCH_ORION5X) += mach-orion5x/ plat-orion/
> 
> Etc.
> 
> And within each of these directories, using the subdir-ccflags-y 
> variable to include the locally needed architecture specific include 
> files will do the trick.
> 
> 3.3) defconfig
> 
> We need a defconfig file adding as many architectures to it as possible 
> for build coverage.  Ideally the resulting binary should be boot tested 
> on as many of the targets it supports as possible.
> 
> 4) Picking up broken pieces
> 
> Things will certainly break along the way.  There are certainly issues 
> that I didn't foresee.  My experience so far tends to indicate that 
> this is a somewhat recursive process where the tackling of one work item 
> reveals a few more which are prerequisite to the first one, etc.  So any 
> estimate for this work needs to consider a large fudge factor.
> 

There are also collisions with the platform SMP and localtimer code.
Things like platform_secondary_init, platform_smp_prepare_cpus, etc.
need to be converted over to something like smp_ops on PowerPC. There's
been some work on the local timer code by Marc Z.
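
As a rough sketch of what an ops-table approach might look like (this is
a userspace model; the struct layout, the function names and the two
SOCs are all made up for illustration and are not an existing ARM
interface):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-platform SMP operations table. */
struct smp_ops_sketch {
	void (*prepare_cpus)(unsigned int max_cpus);
	void (*secondary_init)(unsigned int cpu);
};

/* Table selected at boot for the machine actually running. */
static const struct smp_ops_sketch *smp_ops;

static void smp_set_ops(const struct smp_ops_sketch *ops)
{
	assert(ops != NULL);
	smp_ops = ops;
}

/* Generic code calls through the table; missing hooks are no-ops. */
static void smp_prepare_cpus_generic(unsigned int max_cpus)
{
	if (smp_ops && smp_ops->prepare_cpus)
		smp_ops->prepare_cpus(max_cpus);
}

static void secondary_init_generic(unsigned int cpu)
{
	if (smp_ops && smp_ops->secondary_init)
		smp_ops->secondary_init(cpu);
}

/* Two invented platforms whose hooks no longer collide by name. */
static unsigned int soc_a_prepared;
static unsigned int soc_b_last_cpu;

static void soc_a_prepare_cpus(unsigned int max_cpus)
{
	soc_a_prepared = max_cpus;
}

static void soc_b_secondary_init(unsigned int cpu)
{
	soc_b_last_cpu = cpu;
}

static const struct smp_ops_sketch soc_a_smp_ops = {
	.prepare_cpus = soc_a_prepare_cpus,
};
static const struct smp_ops_sketch soc_b_smp_ops = {
	.secondary_init = soc_b_secondary_init,
};
```

Each platform keeps its functions static, so several of them can be
linked into one image without symbol clashes.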

Rob



