On 07/27/2011 09:58 PM, Nicolas Pitre wrote:
To everyone, and especially to those who are expected to work on this topic next week, please find below a list of tasks that need to be investigated and/or accomplished. I'll coordinate the work and collect patches for the team.
If you have comments on this, or if you know about some omissions, please feel free to provide them as a reply to this message.
I'd like to know if people are particularly interested in one (or more :-)) items they would like to work on. If so please say so as well.
Without further ado, here it is:
<><><><><>
- The so called "single zImage" project
We wish to provide the ability to build as many ARM platforms as possible into a single kernel binary image. This will greatly simplify the archive packaging and maintenance effort by having only one kernel that can be built and booted on multiple ARM targets. A side effect is that this enforces a better source code architecture, even when the resulting binaries do not always support multiple targets.
This work started a while ago. Some initial description can be found here:
https://wiki.ubuntu.com/Specs/ARMSingleKernel
Part of it has been implemented already, namely the runtime determined PHYS_OFFSET, the AUTO_ZRELADDR and some other items referenced below. But there is still a large amount of work remaining.
1) Removal of any dependencies on <mach/*.h> from generic header files
To see the current culprits:
$ git grep "#include <mach/.*.h>" arch/arm/include/
arch/arm/include/asm/clkdev.h:#include <mach/clkdev.h>
arch/arm/include/asm/dma.h:#include <mach/isa-dma.h>
arch/arm/include/asm/floppy.h:#include <mach/floppy.h>
arch/arm/include/asm/gpio.h:#include <mach/gpio.h>
arch/arm/include/asm/hardware/dec21285.h:#include <mach/hardware.h>
arch/arm/include/asm/hardware/iop3xx-adma.h:#include <mach/hardware.h>
arch/arm/include/asm/hardware/iop3xx-gpio.h:#include <mach/hardware.h>
arch/arm/include/asm/hardware/sa1111.h:#include <mach/bitfield.h>
arch/arm/include/asm/io.h:#include <mach/io.h>
arch/arm/include/asm/irq.h:#include <mach/irqs.h>
arch/arm/include/asm/mc146818rtc.h:#include <mach/irqs.h>
arch/arm/include/asm/memory.h:#include <mach/memory.h>
arch/arm/include/asm/mtd-xip.h:#include <mach/mtd-xip.h>
arch/arm/include/asm/pci.h:#include <mach/hardware.h> /* for PCIBIOS_MIN_* */
arch/arm/include/asm/pgtable.h:#include <mach/vmalloc.h>
arch/arm/include/asm/system.h:#include <mach/barriers.h>
arch/arm/include/asm/timex.h:#include <mach/timex.h>
arch/arm/include/asm/vga.h:#include <mach/hardware.h>
1.1) mach/memory.h
This may contain the following defines:
1.1.1) ARM_DMA_ZONE_SIZE
This can be eliminated by moving that value into struct machine_desc. The work is done already, but presented as an example for other tasks: http://git.linaro.org/gitweb?p=people/nico/linux.git%3Ba=shortlog%3Bh=refs/h... And as of now this is merged in mainline already for v3.1-rc1.
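As an illustration of the pattern only, here is a minimal userspace sketch of a per-machine value replacing a compile-time macro. The struct is a simplified stand-in for the kernel's struct machine_desc, and the board names, sizes and the dma_zone_bytes() helper are hypothetical; this is not the merged kernel code.

```c
/* Simplified stand-in for the kernel's struct machine_desc (userspace mock). */
struct machine_desc {
    const char   *name;
    unsigned long dma_zone_size;   /* 0 means "no DMA zone restriction" */
};

/* Hypothetical boards: one limits DMA to the first 64 MB, one does not. */
static const struct machine_desc board_a = { "board-a", 64UL << 20 };
static const struct machine_desc board_b = { "board-b", 0 };

/*
 * Return how many bytes of a memory bank of bank_size bytes belong to the
 * DMA zone. The limit comes from the machine descriptor selected at run
 * time instead of a compile-time ARM_DMA_ZONE_SIZE macro.
 */
static unsigned long dma_zone_bytes(const struct machine_desc *mdesc,
                                    unsigned long bank_size)
{
    if (mdesc->dma_zone_size == 0 || mdesc->dma_zone_size > bank_size)
        return bank_size;
    return mdesc->dma_zone_size;
}
```

With this shape, two boards with different DMA limits can live in the same binary, since nothing depends on a header that only one of them can provide.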
1.1.2) PLAT_PHYS_OFFSET
Most occurrences can be eliminated. With CONFIG_ARM_PATCH_PHYS_VIRT, it is possible to determine PHYS_OFFSET at run time. What remains is to remove the direct uses, mostly in mdesc->boot_params initializers. Changing boot_params into atag_offset has two effects: it makes it clearer that this is only about ATAGs and not DT, and a relative offset plays more nicely with a runtime determined PHYS_OFFSET.
This work is done but not yet accepted: http://news.gmane.org/group/gmane.linux.ports.arm.kernel/thread=123480
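To show why a relative offset fits a runtime PHYS_OFFSET better than an absolute address, here is a small userspace sketch. The struct is a simplified mock of machine_desc, and atags_phys_addr() is a hypothetical helper, not a kernel function.

```c
/* Simplified stand-in for the kernel's struct machine_desc (userspace mock). */
struct machine_desc {
    const char   *name;
    unsigned long atag_offset;   /* offset of the ATAG list from start of RAM */
};

/*
 * Hypothetical helper: compute the physical address of the ATAG list from
 * a PHYS_OFFSET that is only known at run time. The descriptor no longer
 * has to embed an absolute address built from PLAT_PHYS_OFFSET.
 */
static unsigned long atags_phys_addr(const struct machine_desc *mdesc,
                                     unsigned long phys_offset)
{
    return phys_offset + mdesc->atag_offset;
}
```

The same descriptor then works whether RAM starts at 0x00000000 or 0x80000000, which an absolute boot_params value cannot do.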
1.1.3) FLUSH_BASE, FLUSH_BASE_PHYS, FLUSH_BASE_MINICACHE, UNCACHEABLE_ADDR
Those are StrongARM related constants, and they differ for each variant. Fixing this involves making the virtual addresses constant for all variants, and hiding the differences in the physical addresses during the actual mapping.
The solution is here: http://news.gmane.org/group/gmane.linux.ports.arm.kernel/thread=123477/force...
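The idea can be pictured with the following userspace sketch. All addresses, the variant names and the flush_mapping() helper are made up for illustration and are not taken from the referenced patches.

```c
/* Simplified mapping record (not the kernel's struct map_desc). */
struct flush_map {
    unsigned long virt;   /* virtual address, identical on every variant */
    unsigned long phys;   /* physical address, differs per variant */
};

#define FLUSH_BASE 0xf5000000UL   /* hypothetical common virtual address */

enum sa_variant { VARIANT_A, VARIANT_B };   /* hypothetical variants */

/*
 * Build the mapping for one variant. Generic code only ever sees
 * FLUSH_BASE; the variant-specific physical address stays private to the
 * mapping setup.
 */
static struct flush_map flush_mapping(enum sa_variant v)
{
    struct flush_map m = { FLUSH_BASE, 0 };

    /* Made-up physical addresses, one per variant. */
    m.phys = (v == VARIANT_A) ? 0xe0000000UL : 0x50000000UL;
    return m;
}
```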
1.1.4) CONSISTENT_DMA_SIZE
Maybe the CMA work will make this obsolete and the consistent DMA area could be dynamically adjusted. In the mean time, the easiest solution is probably to store this in the machine_desc structure just like with ARM_DMA_ZONE_SIZE.
This has not been addressed yet.
1.1.5) Other weird things
Some machines have non linear memory maps or bus address translations, sparsemem, etc. Examples of that are:
arch/arm/mach-realview/include/mach/memory.h
arch/arm/mach-integrator/include/mach/memory.h
I think solving this is out of scope for this round. Deleting arch/arm/mach-*/include/mach/memory.h can't be done universally. So a new Kconfig symbol (NO_MACH_MEMORY_H) is introduced to indicate which machine class has its legacy <mach/memory.h> file removed. The single zImage for multiple targets will be restricted, amongst other things, to those machines or SOCs with that symbol selected. Partial result here: http://git.linaro.org/gitweb?p=people/nico/linux.git%3Ba=shortlog%3Bh=refs/h...
1.2) mach/io.h
This contains things like IO_SPACE_LIMIT, __io(), __mem_pci(), and sometimes __arch_ioremap()/__arch_unmap(). But in most cases, the definitions here are pretty similar from one machine class to another.
Arnd says:
|I have a plan. When CONFIG_PCI is disabled (along with CONFIG_ISA and
|CONFIG_PCMCIA), we should have neither of IO_SPACE_LIMIT, __io()
|and get no inb/outb functions as a result.
|
|When it is enabled, the 'common' platforms need only one I/O window
|of 64KB, so we should find a common place in the virtual address space
|for that and hardcode __io, while the platform specific PCI initialization
|code (or map_io for that matter) ensures that the window is pointing
|to the physical window.
|
|__arch_ioremap()/__arch_unmap() are not really needed as far as I can
|tell but are used as an optimization to redirect ioremap to the
|hardcoded virtual address mapping. In the first step we can disable
|this for combined kernels, later we can find a generic way so
|__arch_ioremap walks the list of static mappings.
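A rough userspace sketch of the "one hardcoded 64KB window" idea follows. The base address and the helper names are assumptions chosen for illustration; the real __io() macro and the address eventually picked may well differ.

```c
#include <stdint.h>

/* Hypothetical fixed virtual base shared by all 'common' platforms. */
#define PCI_IO_VIRT_BASE 0xfee00000UL
/* One 64KB I/O window, so valid port numbers are 0..0xffff. */
#define IO_SPACE_LIMIT   0xffffUL

/* Return 1 if the port number fits in the single 64KB window. */
static int io_port_valid(unsigned long port)
{
    return port <= IO_SPACE_LIMIT;
}

/*
 * Hypothetical equivalent of a hardcoded __io(): translate a port number
 * into a virtual address inside the common window. Platform PCI setup
 * code would be responsible for pointing this window at the right
 * physical range.
 */
static uintptr_t io_port_to_virt(unsigned long port)
{
    return (uintptr_t)(PCI_IO_VIRT_BASE + port);
}
```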
1.3) mach/timex.h
Most instances simply define a dummy CLOCK_TICK_RATE value. This can probably be removed altogether, or simply have a common value in arch/arm/include/asm/timex.h, as nothing seriously uses that anymore.
Reference: http://lkml.org/lkml/2011/2/21/323
1.4) mach/vmalloc.h
This universally contains only a definition for VMALLOC_END, but not a universal definition. It would be nice to have VMALLOC_END dynamically determined from the static IO mappings, but the highmem threshold depends on the value of VMALLOC_END, and memory has to be initialized before the static IO mappings can be processed.
Therefore the best solution so far appears to be another value in struct machine_desc, so it can be set at run time. This is a mechanical conversion that has to be done.
1.5) mach/irqs.h
The only information globally required from those files is the value of NR_IRQS. Yet there is already a nr_irqs member in the machine_desc structure for this, used by arch_probe_nr_irqs() in arch/arm/kernel/irq.c.
So the first step would be to add
.nr_irqs = NR_IRQS,
to all machine_desc instances, making sure that <mach/irqs.h> is included in those files. Then <mach/irqs.h> should be removed from arch/arm/include/asm/irq.h, and things adjusted so everything still compiles.
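The run-time selection this enables can be sketched in userspace as below. The struct is a simplified mock of machine_desc, and DEFAULT_NR_IRQS and probe_nr_irqs() are hypothetical names standing in for the generic fallback and for arch_probe_nr_irqs().

```c
/* Hypothetical generic fallback used when a machine does not set nr_irqs. */
#define DEFAULT_NR_IRQS 128u

/* Simplified stand-in for the kernel's struct machine_desc (userspace mock). */
struct machine_desc {
    const char  *name;
    unsigned int nr_irqs;   /* 0 means "use the generic default" */
};

/*
 * Pick the number of IRQs from the machine descriptor chosen at boot,
 * falling back to the default when the machine did not provide one.
 */
static unsigned int probe_nr_irqs(const struct machine_desc *mdesc)
{
    return mdesc->nr_irqs ? mdesc->nr_irqs : DEFAULT_NR_IRQS;
}
```

Once every machine_desc carries its own value, generic headers no longer need the per-platform NR_IRQS definition.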
1.6) mach/gpio.h
This is a tough one. This depends on CONFIG_GENERIC_GPIO which is selected by many machine types. They should all be converted to (or configurable with) CONFIG_GPIOLIB so each SOC's specific GPIO handling is made into runtime code instead of static inline functions. Preserving the ability to not use gpiolib might be desirable in some cases for performance reasons.
Definitely in need of serious investigation.
1.7) mach/mtd-xip.h
No need to care about those. This is for running the kernel XIP from ROM memory. An XIP kernel is already incompatible with the notion of a single kernel image since it obviously can't be modified at run time (as needed by CONFIG_ARM_PATCH_PHYS_VIRT).
1.8) mach/isa-dma.h, mach/floppy.h
Those are used by old targets we might not care much about.
1.9) mach/entry-macro.S
This one gets included directly from arch/arm/kernel/entry-armv.S. The only relevant macros still widely used are get_irqnr_preamble and get_irqnr_and_base. They can be overridden by CONFIG_MULTI_IRQ_HANDLER and the equivalent code hooked to the handle_irq member of the machine_desc structure.
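A userspace sketch of that hook is shown below, assuming a simplified machine_desc mock. The SOC handler names, the last_handled marker and the helper functions are hypothetical and exist only to show the dispatch.

```c
#include <stddef.h>

struct pt_regs;   /* opaque in this sketch */

/* Simplified stand-in for the kernel's struct machine_desc (userspace mock). */
struct machine_desc {
    const char *name;
    void (*handle_irq)(struct pt_regs *regs);
};

/* Global hook filled in once the machine is known at boot. */
static void (*handle_arch_irq)(struct pt_regs *regs);

static int last_handled;   /* demo only: records which handler ran */

/* Hypothetical per-SOC handlers replacing the get_irqnr_* macros. */
static void soc_a_handle_irq(struct pt_regs *regs) { (void)regs; last_handled = 1; }
static void soc_b_handle_irq(struct pt_regs *regs) { (void)regs; last_handled = 2; }

static const struct machine_desc soc_a = { "soc-a", soc_a_handle_irq };
static const struct machine_desc soc_b = { "soc-b", soc_b_handle_irq };

/* Install the handler of the machine detected at run time. */
static void setup_irq_handler(const struct machine_desc *mdesc)
{
    handle_arch_irq = mdesc->handle_irq;
}

/* What the low-level IRQ entry path would do: one indirect call. */
static void irq_entry(struct pt_regs *regs)
{
    if (handle_arch_irq)
        handle_arch_irq(regs);
}
```

The cost is one indirect call on the IRQ path in exchange for removing the per-machine assembly include.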
1.10) mach/debug-macro.S
This is used when CONFIG_DEBUG_LL is set. Supporting that option with a single kernel image might prove very difficult with a rapidly diminishing return on the investment.
This code is in need of some refactoring already: http://article.gmane.org/gmane.linux.ports.arm.kernel/118525
To still benefit from the most likely needed debugging aid, we might consider allowing the selection of one amongst the existing implementations when building a kernel with support for many SOCs. Obviously that would only work on the one hardware platform for which the selected printch implementation was designed, but that should be good enough for debugging purposes.
1.11) mach/system.h
This is included from arch/arm/kernel/process.c and expected to provide the following static inline functions or equivalent:
1.11.1) arch_idle()
Called when system is idle. Most of them just call cpu_do_idle(). The call to cpu_do_idle() should be moved to default_idle() and the exception cases moved out of line where they can be hooked to the pm_idle callback.
1.11.2) arch_reset()
Used to reset the system. This is far from being a hot path and doesn't justify a static inline function. An out-of-line version hooked to a global arch_reset function pointer would work just fine.
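Both ideas amount to replacing static inline functions with function pointers. A minimal userspace sketch, in which the counters, quirky_soc_idle(), soc_reset(), cpu_idle_once() and machine_restart_sketch() are hypothetical names used for illustration:

```c
static int idle_calls;    /* demo only: counts cpu_do_idle() invocations */
static int quirk_calls;   /* demo only: counts the SOC-specific idle path */
static int reset_calls;   /* demo only: counts reset invocations */

/* Stand-in for the CPU-level idle primitive. */
static void cpu_do_idle(void) { idle_calls++; }

/* Common case: the default idle routine calls cpu_do_idle() directly. */
static void default_idle(void) { cpu_do_idle(); }

/* Exception cases override this pointer instead of providing arch_idle(). */
static void (*pm_idle)(void) = default_idle;

/* Hypothetical SOC that needs extra work around idle. */
static void quirky_soc_idle(void) { quirk_calls++; cpu_do_idle(); }

/* Out-of-line reset hook; NULL until a machine installs one. */
static void (*arch_reset)(char mode, const char *cmd);

/* Hypothetical SOC reset implementation. */
static void soc_reset(char mode, const char *cmd)
{
    (void)mode;
    (void)cmd;
    reset_calls++;
}

/* One iteration of the idle loop. */
static void cpu_idle_once(void) { pm_idle(); }

/* Restart path: reset is not a hot path, so an indirect call is fine. */
static void machine_restart_sketch(const char *cmd)
{
    if (arch_reset)
        arch_reset('h', cmd);
}
```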
1.12) mach/uncompress.h
This is used to define per SOC methods to output some progress feedback from the kernel decompressor over a serial port. Once again, supporting this with a single kernel image might prove very difficult with a rapidly diminishing return on the investment. So it is probably best to simply use generic empty stubs whenever more than one SOC family is configured in a common kernel image.
2) Removal of any dependencies on <mach/*.h> from driver code
A couple possibilities:
a) We move the required header files next to the driver code. In many cases, having a .h file with only the defines relevant to the concerned driver is best. But this is a _lot_ of work.
b) We change those <mach/foo.h> into something more absolute, such as <mach/omap2/foo.h>. This can be done on a per SOC basis, first by moving the header files one level deeper, and then fixing up all affected drivers.
c) We change those <mach/foo.h> files into something more precise, e.g. <mach/omap2_foo.h> and fix concerned drivers.
I think the best solution here is (b) which doesn't preclude (a) eventually or if it is trivial. But (c) is dangerous as files might be added easily without paying too much attention to the file prefix.
3) Changes to the build system
We need to move towards the ability to actually build more than one SOC family at the same time.
3.1) Kconfig
This involves changes to Kconfig where currently only one out of all the different architectures is selected through the big "ARM system type" choice prompt. We need to determine a good way to move some of them into simply bool prompts and keep track of which architecture can be built concurrently with which. We know for instance that it is unlikely that pre-ARMv6 and ARMv6/7 will ever be buildable together. Today we know that nothing can be built with anything else and therefore this should be the starting default. This needs investigating.
3.2) Makefile
Currently the arch/arm/Makefile is organized so the lowest instruction set level and the highest optimization level are selected from all the configured options. So this part should already be fine.
However the machine-$(*), plat-$(*), machdirs and platdirs variables must go. In (2) above we should have removed the need for adding to the global KBUILD_CPPFLAGS to add a path to some specific architecture includes already. Keeping them only for the code under each architecture subdirectory should be sufficient.
For example, this might be all that is needed:
obj-$(CONFIG_ARCH_MSM) += mach-msm/
or
obj-$(CONFIG_ARCH_KIRKWOOD) += mach-kirkwood/ plat-orion/
obj-$(CONFIG_ARCH_ORION5X) += mach-orion5x/ plat-orion/
Etc.
And within each of these directories, using the subdir-ccflags-y variable to include the locally needed architecture specific include files will do the trick.
3.3) defconfig
We need a defconfig file adding as many architectures to it as possible for build coverage. Ideally the resulting binary should be boot tested on as many of the targets it supports as possible.
4) Picking up broken pieces
Things will certainly break along the way. There are certainly issues that I didn't foresee. My experience so far tends to indicate that this is a somewhat recursive process where tackling one work item reveals a few more which are prerequisites to the first one, etc. So any estimate for this work needs to include a large fudge factor.
There are also collisions with the platform SMP and localtimer code. Things like platform_secondary_init, platform_smp_prepare_cpus, etc. need to be converted over to something like smp_ops on PowerPC. There's been some work on the local timer code by Marc Z.
Rob