This patchset is a first attempt at providing a consolidation of idle
code for the ARM processor architecture and a request for comment on
the provided methodology.
It relies on kernel features such as suspend/resume,
pm notifiers and common code for cpu_reset().
It integrates the latest patches from ALKML for cpu pm notifiers and a
cpu_reset function hook. Patches are included in the patchset for completeness.
The patchset depends on the following patches, whose links are provided
for completeness:
https://patchwork.kernel.org/patch/882892/
https://patchwork.kernel.org/patch/873692/
https://patchwork.kernel.org/patch/873682/
https://patchwork.kernel.org/patch/873672/
The idle framework defines a common entry point in [sr_entry.S]
cpu_enter_idle(cstate, rstate, flags)
where:
C-state [CPU state]:
0 - RUN MODE
1 - STANDBY
2 - DORMANT (not supported by this patch)
3 - SHUTDOWN
R-state [CLUSTER state]
0 - RUN
1 - STANDBY (not supported by this patch)
2 - L2 RAM retention
3 - SHUTDOWN
flags:
SR_SAVE_L2: L2 registers saved and restored on shutdown
SR_SAVE_SCU: SCU reset on cluster wake-up
The assembly entry point checks the targeted state and either executes wfi
to enter a shallow C-state, or calls into the sr framework to put the cpu
and cluster in a low-power state. If the target is a deep low-power state,
it saves the current stack pointer and registers on the stack for the resume
path.
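The dispatch described above can be modelled in plain C roughly as follows.
This is only a user-space sketch, not code from sr_entry.S: the enum names
and the helper sr_idle_path() are hypothetical, and treating C-state 0 and 1
as the wfi path is an assumption. Only the state numbering and the
"not supported" markers come from the list above.

```c
/* C-state and R-state numbering as listed in the cover letter. */
enum sr_cstate { C_RUN = 0, C_STANDBY = 1, C_DORMANT = 2, C_SHUTDOWN = 3 };
enum sr_rstate { R_RUN = 0, R_STANDBY = 1, R_L2_RETENTION = 2, R_SHUTDOWN = 3 };

/* Hypothetical result type: which path the entry point would take. */
enum sr_path { SR_PATH_INVALID, SR_PATH_WFI, SR_PATH_DEEP };

/* Hypothetical helper modelling the state check done on entry. */
enum sr_path sr_idle_path(unsigned int cstate, unsigned int rstate)
{
	/* Reject values outside the documented ranges. */
	if (cstate > C_SHUTDOWN || rstate > R_SHUTDOWN)
		return SR_PATH_INVALID;

	/* States the cover letter marks as not supported by this patch. */
	if (cstate == C_DORMANT || rstate == R_STANDBY)
		return SR_PATH_INVALID;

	/* Shallow C-state: plain wfi, no context save (assumption). */
	if (cstate <= C_STANDBY)
		return SR_PATH_WFI;

	/* Deep state: save sp/registers and call into the sr framework. */
	return SR_PATH_DEEP;
}
```

With this model, sr_idle_path(C_SHUTDOWN, R_SHUTDOWN) selects the deep path,
while sr_idle_path(C_STANDBY, R_RUN) stays on the wfi path.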
On entry to a deep low-power state, since the CPU is about to be powered
off, the code switches page tables (cloned from init_mm at boot) to cater for
a 1:1 mapping of kernel code, data, and uncached reserved memory pages
obtained from platform code through a hook:
platform_context_pointer(size)
Every platform using the framework should implement this hook to return
reserved memory pages that are going to be mapped as uncached and 1:1 mapped
to cater for the MMU off resume path.
This decision has been made in order to avoid fiddling with L2 when CPU
enters low-power (context should be flushed to L3 so that a CPU can fetch it
from memory when the MMU is off).
On the resume path the CPU loads a non-cacheable stack pointer to cater for
the MMU enabling path, and after switching page tables returns to the OS.
The non-cacheable stack simplifies the L2 management in that, since for single
CPU shutdown the L2 is still enabled, on MMU resume some stack used before
the MMU is turned on might still be present and valid in L2 leading to
corruption. After MMU is enabled a few bytes of the stack frame are copied
back to the Linux stack and execution resumes.
Generic subsystem save/restore is triggered by cpu pm notifiers, to
save/restore GIC, VFP, PMU state automatically.
The patchset introduces a new notifier chain which notifies listeners of
a required platform shutdown/wakeup. Platform code should register to the chain
and execute all actions required to put the system in low-power mode when
called. It is called within a virtual address space cloned from init_mm at
arch_initcall.
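For readers unfamiliar with the pattern, the idea of a chain that platform
code registers with can be sketched in user-space C as below. This is a
simplified model, not the kernel notifier API and not the code in this
patchset: the event names, struct sr_notifier and both functions are
hypothetical.

```c
#include <stddef.h>

/* Hypothetical events; the patchset defines its own. */
enum sr_event { SR_PLATFORM_SHUTDOWN, SR_PLATFORM_WAKEUP };

/* One listener on the chain. */
struct sr_notifier {
	int (*call)(enum sr_event event, void *data);
	struct sr_notifier *next;
};

static struct sr_notifier *sr_chain;

/* Add a listener at the head of the chain. */
void sr_notifier_register(struct sr_notifier *nb)
{
	nb->next = sr_chain;
	sr_chain = nb;
}

/* Call every listener; stop at, and return, the first non-zero result. */
int sr_notifier_call(enum sr_event event, void *data)
{
	struct sr_notifier *nb;

	for (nb = sr_chain; nb != NULL; nb = nb->next) {
		int ret = nb->call(event, data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Example platform listener: records the last event it was told about. */
int example_platform_listener(enum sr_event event, void *data)
{
	*(int *)data = (int)event;
	return 0;
}
```

A platform would register its listener once at init time and then perform
its power controller programming from inside the callback.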
On cluster shutdown the L2 cache memory should either be cleaned (complete
shutdown) or left untouched with only the L2 logic saved (L2 RAM retained).
This is a major issue since on power-down the stack points to cacheable
memory that must be cleaned from L2 before disabling the L2 controller.
Current code performing that action is a hack and provides ground for
discussions. The stack might be switched to non-cacheable on power down
but by doing this code relying on thread_info is broken unless that
struct is copied across the stack switch.
Atomicity of the code is provided by a strongly ordered locking algorithm
(Lamport's Bakery), since when CPUs are out of coherency and D$ look-ups
are disabled, normal spinlocks based on ldrex/strex are not functional.
Atomicity of the L2 clean/invalidate and of the SCU reset is fundamental
to guarantee system stability.
Lamport's bakery code is provided for completeness and it can be
ignored; please refer to the patch commit note.
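As a reminder of how the algorithm works, here is a minimal user-space
sketch of Lamport's Bakery in C11. It is not the code from lb_lock.c: the
names, the fixed LB_MAX_CPUS and the use of sequentially consistent C11
atomics (standing in for strongly ordered memory) are all assumptions made
for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LB_MAX_CPUS 4	/* assumption: small, fixed number of CPUs */

struct lb_lock {
	atomic_bool choosing[LB_MAX_CPUS];	/* CPU is picking a ticket */
	atomic_uint number[LB_MAX_CPUS];	/* ticket, 0 = not competing */
};

void lb_lock_acquire(struct lb_lock *l, unsigned int me)
{
	unsigned int i, max = 0;

	/* Doorway: take a ticket larger than any ticket currently held. */
	atomic_store(&l->choosing[me], true);
	for (i = 0; i < LB_MAX_CPUS; i++) {
		unsigned int n = atomic_load(&l->number[i]);

		if (n > max)
			max = n;
	}
	atomic_store(&l->number[me], max + 1);
	atomic_store(&l->choosing[me], false);

	for (i = 0; i < LB_MAX_CPUS; i++) {
		if (i == me)
			continue;

		/* Wait until CPU i has finished picking its ticket. */
		while (atomic_load(&l->choosing[i]))
			;

		/* Wait while CPU i is ahead of us; ties are broken by id. */
		for (;;) {
			unsigned int n = atomic_load(&l->number[i]);
			unsigned int mine = atomic_load(&l->number[me]);

			if (n == 0 || n > mine || (n == mine && i > me))
				break;
		}
	}
}

void lb_lock_release(struct lb_lock *l, unsigned int me)
{
	/* Ticket 0 means "not competing for the lock". */
	atomic_store(&l->number[me], 0);
}
```

The lock only needs plain loads and stores to memory every CPU observes in
the same order, which is why it remains usable when exclusive accesses are
not.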
Entry into low-power mode is performed by a function pointer (*sr_sleep), to
allow platforms to override the default behaviour (and possibly execute
from different memory spaces).
Tested on dual-core A9 Cluster through all system low-power states
supported by the patchset. A8, A5 support compile tested.
Colin Cross (3):
ARM: Add cpu power management notifiers
ARM: gic: Use cpu pm notifiers to save gic state
ARM: vfp: Use cpu pm notifiers to save vfp state
Lorenzo Pieralisi (13):
ARM: kernel: save/restore kernel IF
ARM: kernel: save/restore generic infrastructure
ARM: kernel: save/restore v7 assembly helpers
ARM: kernel: save/restore arch runtime support
ARM: kernel: v7 resets support
ARM: kernel: save/restore v7 infrastructure support
ARM: kernel: add support for Lamport's bakery locks
ARM: kernel: add SCU reset hook
ARM: mm: L2x0 save/restore support
ARM: kernel: save/restore 1:1 page tables
ARM: perf: use cpu pm notifiers to save pmu state
ARM: PM: enhance idle pm notifiers
ARM: kernel: save/restore build infrastructure
Will Deacon (1):
ARM: proc: add definition of cpu_reset for ARMv6 and ARMv7 cores
arch/arm/Kconfig | 18 ++
arch/arm/common/gic.c | 212 +++++++++++++++++++++++
arch/arm/include/asm/cpu_pm.h | 69 ++++++++
arch/arm/include/asm/lb_lock.h | 34 ++++
arch/arm/include/asm/outercache.h | 22 +++
arch/arm/include/asm/smp_scu.h | 3 +-
arch/arm/include/asm/sr_platform_api.h | 28 +++
arch/arm/kernel/Makefile | 5 +
arch/arm/kernel/cpu_pm.c | 265 ++++++++++++++++++++++++++++
arch/arm/kernel/lb_lock.c | 85 +++++++++
arch/arm/kernel/perf_event.c | 22 +++
arch/arm/kernel/reset_v7.S | 109 ++++++++++++
arch/arm/kernel/smp_scu.c | 33 ++++-
arch/arm/kernel/sr.h | 162 +++++++++++++++++
arch/arm/kernel/sr_api.c | 197 +++++++++++++++++++++
arch/arm/kernel/sr_arch.c | 74 ++++++++
arch/arm/kernel/sr_context.c | 23 +++
arch/arm/kernel/sr_entry.S | 213 +++++++++++++++++++++++
arch/arm/kernel/sr_helpers.h | 56 ++++++
arch/arm/kernel/sr_mapping.c | 78 +++++++++
arch/arm/kernel/sr_platform.c | 48 +++++
arch/arm/kernel/sr_power.c | 26 +++
arch/arm/kernel/sr_v7.c | 298 ++++++++++++++++++++++++++++++++
arch/arm/kernel/sr_v7_helpers.S | 47 +++++
arch/arm/mm/cache-l2x0.c | 63 +++++++
arch/arm/mm/proc-v6.S | 5 +
arch/arm/mm/proc-v7.S | 7 +
arch/arm/vfp/vfpmodule.c | 40 +++++
28 files changed, 2238 insertions(+), 4 deletions(-)
create mode 100644 arch/arm/include/asm/cpu_pm.h
create mode 100644 arch/arm/include/asm/lb_lock.h
create mode 100644 arch/arm/include/asm/sr_platform_api.h
create mode 100644 arch/arm/kernel/cpu_pm.c
create mode 100644 arch/arm/kernel/lb_lock.c
create mode 100644 arch/arm/kernel/reset_v7.S
create mode 100644 arch/arm/kernel/sr.h
create mode 100644 arch/arm/kernel/sr_api.c
create mode 100644 arch/arm/kernel/sr_arch.c
create mode 100644 arch/arm/kernel/sr_context.c
create mode 100644 arch/arm/kernel/sr_entry.S
create mode 100644 arch/arm/kernel/sr_helpers.h
create mode 100644 arch/arm/kernel/sr_mapping.c
create mode 100644 arch/arm/kernel/sr_platform.c
create mode 100644 arch/arm/kernel/sr_power.c
create mode 100644 arch/arm/kernel/sr_v7.c
create mode 100644 arch/arm/kernel/sr_v7_helpers.S
--
1.7.4.4
>
> Were you using Linaro-media-create to put things onto the SD card or
> some other route?
>
>
Yes.
created the master card using `linaro-media-create`
booted it once and added some secret sauce to it.
duplicated that card to 20 cards using `sfdisk` and `partimage`.
(Half of those cards did not boot)
used `gparted` to create a new partition table on one of the failed cards
(the new second master)
used `rsync` to copy the contents of the first master to a second master
reran the duplication process
repeated the step above for the remaining failed cards
ended up with 3 master cards with the exact same filesystem contents and
very nearly identical partition tables,
but different disk geometries (head, cyl, sector).
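One quick sanity check on those geometries is to multiply them out and
compare against the capacity the kernel reports. A small C sketch of that
arithmetic follows; the function names are made up for this mail, and the
512-byte sector size is an assumption.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumption: 512-byte logical sectors */

/* Total number of sectors addressed by a heads/sectors/cylinders geometry. */
uint64_t chs_total_sectors(uint32_t heads, uint32_t sectors_per_track,
			   uint32_t cylinders)
{
	return (uint64_t)heads * sectors_per_track * cylinders;
}

/* The same geometry expressed in bytes. */
uint64_t chs_total_bytes(uint32_t heads, uint32_t sectors_per_track,
			 uint32_t cylinders)
{
	return chs_total_sectors(heads, sectors_per_track, cylinders)
		* SECTOR_SIZE;
}
```

For the three geometries quoted below, 2/4/1957632 works out to 15661056
sectors, i.e. 8018460672 bytes, which is exactly the capacity in the kernel
log. The other two cover slightly fewer sectors (15647310 for 255/63/974 and
15650908 for 247/62/1022), presumably because they round down to a whole
number of cylinders.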
AJ ONeal
On Wed, Jun 29, 2011 at 12:48 PM, AJ ONeal <coolaj86(a)gmail.com> wrote:
> > I have a few inter-related issues:
> >
> > Why would one kernel boot a card that another kernel can't?
> > Why would a card's disk geometry matter for boot?
> > Who is a good manufacturer for getting hardware-identical cards in bulk?
> > How can I probe the actual "disk geometry" of an sd card?
> >
> > I bought 100 Transcend SD cards a little while ago and duplicated them with
> > an OpenEmbedded-based filesystem (linux-2.6.36).
> > There were a few "bad" cards that I threw out, but the success rate was
> > acceptable.
> >
> > In the next round of 40 SD cards I used a Linaro-based filesystem
> > (linux-2.6.39) and had about a 50% failure rate when testing that the cards
> > would boot, which is absurd.
> > The kernel reports: [ 1.003204] mmcblk0: unknown partition table
> > However, the cards would mount and show files just fine.
> > I reduplicated one of the non-booting cards with an OpenEmbedded filesystem
> > and then it booted. Weird!
> >
> > After some investigation I found that using `gparted` (instead of `fdisk`)
> > to create a new partition table and then `rsync`ing the contents of the
> > original filesystem resulted in a booting Linaro card.
> > Rinse and repeat and I ended up with 3 images which only vary by the disk
> > geometry as reported by `fdisk -l`:
> >
> > 50% -- 255 heads, 63 sectors/track, 974 cylinders
> > 40% -- 2 heads, 4 sectors/track, 1957632 cylinders
> > 10% -- 247 heads, 62 sectors/track, 1022 cylinders
> > 1 card still didn't boot
> >
> > I'm lost. Please advise.
> > AJ ONeal
> >
> >
> >
> > Non-booting kernel message
> > [ 0.923309] Waiting for root device /dev/mmcblk0p2...
> > [ 0.957885] mmc0: host does not support reading read-only switch.
> > assuming write-enable.
> > [ 0.982025] mmc0: new high speed SDHC card at address b368
> > [ 0.988494] mmcblk0: mmc0:b368 USD 7.46 GiB
> > [ 0.993957] mmcblk0: detected capacity change from 0 to 8018460672
> > [ 1.003204] mmcblk0: unknown partition table
> > [ 1.036926] VFS: Cannot open root device "mmcblk0p2" or
> > unknown-block(179,2)
> > [ 1.044433] Please append a correct "root=" boot option; here are the
> > available partitions:
> > [ 1.053344] b300 7830528 mmcblk0 driver: mmcblk
> > [ 1.058959] Kernel panic - not syncing: VFS: Unable to mount root fs on
> > unknown-block(179,2)
> >
> > Booting kernel message
> > [ 1.122070] mmc0: host does not support reading read-only switch.
> > assuming write-enable.
> > [ 1.146087] mmc0: new high speed SDHC card at address b368
> > [ 1.152557] mmcblk0: mmc0:b368 USD 7.46 GiB
> > [ 1.158020] mmcblk0: detected capacity change from 0 to 8018460672
> > [ 1.166351] mmcblk0: p1 p2 p3
> > [ 1.259674] EXT3-fs: barriers not enabled
> > [ 1.265411] kjournald starting. Commit interval 5 seconds
> > [ 1.271331] EXT3-fs (mmcblk0p2): mounted filesystem with ordered data
> > mode
> > [ 1.278686] VFS: Mounted root (ext3 filesystem) readonly on device 179:2.
> > _______________________________________________
> > linaro-dev mailing list
> > linaro-dev(a)lists.linaro.org
> > http://lists.linaro.org/mailman/listinfo/linaro-dev
> >
> >
>
>
>
> --
> Regards,
> Tom
>
> "We want great men who, when fortune frowns will not be discouraged."
> - Colonel Henry Knox
> Linaro.org │ Open source software for ARM SoCs
> w) tom.gall att linaro.org
> w) tom_gall att vnet.ibm.com
> h) tom_gall att mac.com
>
A quick poll of the ARM platforms that implement CPU Hotplug support
shows that every platform treats CPU 0 as a special case that cannot be
hotplugged. In fact every platform has identical code for
platform_cpu_die which returns -EPERM in the case of CPU 0.
The user-facing sysfs interfaces should reflect this by not populating
an 'online' entry for CPU 0 at all. This better reflects reality by
making it clear to users that CPU 0 cannot be hotplugged.
This patch prevents CPU 0 from being marked as hotpluggable on all ARM
platforms during CPU registration. This in turn prevents the creation
of an 'online' sysfs interface for that CPU.
Signed-off-by: Mike Turquette <mturquette(a)ti.com>
---
arch/arm/kernel/setup.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index ed11fb0..a5fc969 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -940,7 +940,8 @@ static int __init topology_init(void)
for_each_possible_cpu(cpu) {
struct cpuinfo_arm *cpuinfo = &per_cpu(cpu_data, cpu);
- cpuinfo->cpu.hotpluggable = 1;
+ if (cpu)
+ cpuinfo->cpu.hotpluggable = 1;
register_cpu(&cpuinfo->cpu, cpu);
}
--
1.7.4.1
To everyone, and especially to those who are expected to work on this
topic next week, please find below a list of tasks that needs to be
investigated and/or accomplished. I'll coordinate the work and collect
patches for the team.
If you have comments on this, or if you know about some omissions,
please feel free to provide them as a reply to this message.
I'd like to know if people are particularly interested in one (or more :-))
items they would like to work on. If so please say so as well.
Without further ado, here it is:
<><><><><>
0) The so called "single zImage" project
We wish to provide the ability to build as many ARM platforms as
possible into a single kernel binary image. This will greatly simplify
the archive packaging and maintenance effort by having only one kernel
that could be built and booted on multiple ARM targets. A side effect
of this is also to enforce better source code architecture even if the
resulting binaries are not always supporting multiple targets.
This work started a while ago. Some initial description can be found
here:
https://wiki.ubuntu.com/Specs/ARMSingleKernel
Part of it has been implemented already, namely the runtime determined
PHYS_OFFSET, the AUTO_ZRELADDR and some other items referenced below.
But there is still a large amount of work remaining.
1) Removal of any dependencies on <mach/*.h> from generic header files
To see the current culprits:
$ git grep "#include <mach/.*.h>" arch/arm/include/
arch/arm/include/asm/clkdev.h:#include <mach/clkdev.h>
arch/arm/include/asm/dma.h:#include <mach/isa-dma.h>
arch/arm/include/asm/floppy.h:#include <mach/floppy.h>
arch/arm/include/asm/gpio.h:#include <mach/gpio.h>
arch/arm/include/asm/hardware/dec21285.h:#include <mach/hardware.h>
arch/arm/include/asm/hardware/iop3xx-adma.h:#include <mach/hardware.h>
arch/arm/include/asm/hardware/iop3xx-gpio.h:#include <mach/hardware.h>
arch/arm/include/asm/hardware/sa1111.h:#include <mach/bitfield.h>
arch/arm/include/asm/io.h:#include <mach/io.h>
arch/arm/include/asm/irq.h:#include <mach/irqs.h>
arch/arm/include/asm/mc146818rtc.h:#include <mach/irqs.h>
arch/arm/include/asm/memory.h:#include <mach/memory.h>
arch/arm/include/asm/mtd-xip.h:#include <mach/mtd-xip.h>
arch/arm/include/asm/pci.h:#include <mach/hardware.h> /* for PCIBIOS_MIN_* */
arch/arm/include/asm/pgtable.h:#include <mach/vmalloc.h>
arch/arm/include/asm/system.h:#include <mach/barriers.h>
arch/arm/include/asm/timex.h:#include <mach/timex.h>
arch/arm/include/asm/vga.h:#include <mach/hardware.h>
1.1) mach/memory.h
This may contain the following defines:
1.1.1) ARM_DMA_ZONE_SIZE
This can be eliminated by moving that value into struct machine_desc.
The work is done already, but presented as an example for other tasks:
http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/head…
And as of now this is merged in mainline already for v3.1-rc1.
1.1.2) PLAT_PHYS_OFFSET
Most occurrences can be eliminated. With CONFIG_ARM_PATCH_PHYS_VIRT, it
is possible to determine PHYS_OFFSET at run time. What remains is to remove
the direct uses, mostly by mdesc->boot_params initializers. Changing
boot_params into atag_offset has two effects: that makes it clearer that
it is only about ATAGs and not DT, and a relative offset plays more
nicely with a runtime determined PHYS_OFFSET.
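To illustrate why a relative offset helps, here is a tiny stand-alone C
sketch. The structure and function names are hypothetical stand-ins (this is
not the real struct machine_desc), and the 0x100 offset used in the note
below is just a commonly seen value taken as an example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct machine_desc. */
struct mdesc_atag_sketch {
	uint32_t atag_offset;	/* offset of the ATAG list from start of RAM */
};

/*
 * With a relative offset, the physical address of the ATAGs can be
 * computed once PHYS_OFFSET is known at run time, instead of being
 * fixed at build time in a boot_params initializer.
 */
uint32_t atags_phys_addr(const struct mdesc_atag_sketch *mdesc,
			 uint32_t phys_offset)
{
	return phys_offset + mdesc->atag_offset;
}
```

The same machine description with atag_offset = 0x100 then yields 0x80000100
on a board whose RAM starts at 0x80000000 and 0x00000100 on one whose RAM
starts at 0.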
This work is done but not yet accepted:
http://news.gmane.org/group/gmane.linux.ports.arm.kernel/thread=123480
1.1.3) FLUSH_BASE, FLUSH_BASE_PHYS, FLUSH_BASE_MINICACHE, UNCACHEABLE_ADDR
Those are StrongARM-related constants, and they differ for each variant.
Fixing this involves making the virtual addresses constant for all
variants, and hiding the differences in the physical addresses during
the actual mapping.
The solution is here:
http://news.gmane.org/group/gmane.linux.ports.arm.kernel/thread=123477/forc…
1.1.4) CONSISTENT_DMA_SIZE
Maybe the CMA work will make this obsolete and the consistent DMA area
could be dynamically adjusted. In the mean time, the easiest solution
is probably to store this in the machine_desc structure just like with
ARM_DMA_ZONE_SIZE.
This has not been addressed yet.
1.1.5) Other weird things
Some machines have non linear memory maps or bus address translations,
sparsemem, etc. Examples of that are:
arch/arm/mach-realview/include/mach/memory.h
arch/arm/mach-integrator/include/mach/memory.h
I think solving this is out of scope for this round. Deleting
arch/arm/mach-*/include/mach/memory.h can't be done universally. So a
new Kconfig symbol (NO_MACH_MEMORY_H) is introduced to indicate which
machine class has its legacy <mach/memory.h> file removed. The single
zImage for multiple targets will be restricted, amongst other things, to
those machines or SOCs with that symbol selected. Partial result here:
http://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/head…
1.2) mach/io.h
This contains things like IO_SPACE_LIMIT, __io(), __mem_pci(), and
sometimes __arch_ioremap()/__arch_unmap(). But in most cases, the
definitions here are pretty similar from one machine class to another.
Arnd says:
|I have a plan. When CONFIG_PCI is disabled (along with CONFIG_ISA and
|CONFIG_PCMCIA), we should have neither of IO_SPACE_LIMIT, __io()
|and get no inb/outb functions as a result.
|
|When it is enabled, the 'common' platforms need only one I/O window
|of 64KB, so we should find a common place in the virtual address space
|for that and hardcode __io, while the platform specific PCI initialization
|code (or map_io for that matter) ensures that the window is pointing
|to the physical window.
|
|__arch_ioremap()/__arch_unmap() are not really needed as far as I can
|tell but are used as an optimization to redirect ioremap to the
|hardcoded virtual address mapping. In the first step we can disable
|this for combined kernels, later we can find a generic way so
|__arch_ioremap walks the list of static mappings.
1.3) mach/timex.h
Most instances simply define a dummy CLOCK_TICK_RATE value. This can
probably be removed altogether, or simply have a common value in
arch/arm/include/asm/timex.h, as nothing seriously uses that anymore.
Reference: http://lkml.org/lkml/2011/2/21/323
1.4) mach/vmalloc.h
This universally contains only a definition for VMALLOC_END, but not a
universal definition. It would be nice to have VMALLOC_END dynamically
determined from the static IO mappings, but the highmem threshold
depends on the value of VMALLOC_END, and memory has to be initialized
before the static IO mappings can be processed.
Therefore the best solution so far appears to be to use another value in
struct machine_desc for it so it can be set at run time. This is a
mechanical conversion that has to be done.
1.5) mach/irqs.h
The only information globally required from those files is the value of
NR_IRQS. Yet there is already a nr_irqs member in the machine_desc
structure for this, used by arch_probe_nr_irqs() in
arch/arm/kernel/irq.c.
So the first step would be to add
.nr_irqs = NR_IRQS,
to all machine_desc instances, making sure that <mach/irqs.h> is
included in those files. Then, <mach/irqs.h> should be removed from
arch/arm/include/asm/irq.h, and things adjusted so everything still
compiles.
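The intended end state can be sketched as follows. This is a stand-alone
model only loosely based on what arch_probe_nr_irqs() does: the structure,
the function name and the NR_IRQS value of 128 are placeholders chosen for
illustration.

```c
/* Placeholder for the per-platform value from <mach/irqs.h>. */
#define NR_IRQS 128

/* Hypothetical stand-in for the relevant part of struct machine_desc. */
struct mdesc_irq_sketch {
	unsigned int nr_irqs;	/* 0 means "not set by the machine" */
};

/*
 * Prefer the per-machine value and fall back to the build-time constant.
 * Once every machine_desc sets .nr_irqs, the fallback (and with it the
 * <mach/irqs.h> include in generic code) is no longer needed.
 */
unsigned int probe_nr_irqs(const struct mdesc_irq_sketch *mdesc)
{
	return mdesc->nr_irqs ? mdesc->nr_irqs : NR_IRQS;
}
```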
1.6) mach/gpio.h
This is a tough one. This depends on CONFIG_GENERIC_GPIO which is
selected by many machine types. They should all be converted to (or
configurable with) CONFIG_GPIOLIB so each SOC's specific GPIO handling
is made into runtime code instead of static inline functions. Preserving
the ability to not use gpiolib might be desirable in some cases for
performance reasons.
Definitely in need of serious investigation.
1.7) mach/mtd-xip.h
No need to care about those. This is for running the kernel XIP from
ROM memory. A XIP kernel is already incompatible with the notion of a
single kernel image since it obviously can't be modified at run time (as
needed by CONFIG_ARM_PATCH_PHYS_VIRT).
1.8) mach/isa-dma.h, mach/floppy.h
Those are used by old targets we might not care much about.
1.9) mach/entry-macro.S
This one gets included directly from arch/arm/kernel/entry-armv.S.
The only relevant macros still widely used are get_irqnr_preamble and
get_irqnr_and_base. They can be overridden by CONFIG_MULTI_IRQ_HANDLER
and the equivalent code hooked to the handle_irq member of the
machine_desc structure.
1.10) mach/debug-macro.S
This is used when CONFIG_DEBUG_LL is set. Supporting that option with a
single kernel image might prove very difficult with a rapidly
diminishing return on the investment.
This code is in need of some refactoring already:
http://article.gmane.org/gmane.linux.ports.arm.kernel/118525
To still benefit from the most likely needed debugging aid, we might
consider still allowing the selection of one of the existing
implementations when building a kernel with support for many SOCs.
Obviously that would only work on the one hardware platform for which the
selected printch implementation was designed, but that should be good
enough for debugging purposes.
1.11) mach/system.h
This is included from arch/arm/kernel/process.c and expected to provide
the following static inline functions or equivalent:
1.11.1) arch_idle()
Called when the system is idle. Most of them just call cpu_do_idle().
The call to cpu_do_idle() should be moved to default_idle() and the
exception cases moved out of line where they can be hooked to the pm_idle
callback.
1.11.2) arch_reset()
Used to reset the system. This is far from being a hot path and doesn't
justify a static inline function. An out-of-line version hooked to a
global arch_reset function pointer would work just fine.
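A minimal sketch of that arrangement, as stand-alone C rather than kernel
code: the pointer uses the (char mode, const char *cmd) arguments of the
existing arch_reset() inlines, while machine_restart_sketch(),
example_soc_reset() and the call counter are hypothetical names used only
for illustration.

```c
#include <stddef.h>

/* Global hook, set by the SOC code for the machine we booted on. */
void (*arch_reset)(char mode, const char *cmd);

/* Illustration only: counts how often the example reset routine ran. */
int example_reset_calls;

/* Example out-of-line SOC reset routine (hypothetical). */
void example_soc_reset(char mode, const char *cmd)
{
	(void)mode;
	(void)cmd;
	example_reset_calls++;
}

/* Generic caller: returns -1 when no SOC has installed a reset hook. */
int machine_restart_sketch(char mode, const char *cmd)
{
	if (arch_reset == NULL)
		return -1;
	arch_reset(mode, cmd);
	return 0;
}
```

The indirect call costs a little more than an inline function, which should
not matter on a path taken once per reboot.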
1.12) mach/uncompress.h
This is used to define per SOC methods to output some progress feedback
from the kernel decompressor over a serial port. Once again, supporting
this with a single kernel image might prove very difficult with a
rapidly diminishing return on the investment. So it is probably best to
simply use generic empty stubs whenever more than one SOC family is
configured in a common kernel image.
2) Removal of any dependencies on <mach/*.h> from driver code
A couple possibilities:
a) We move the required header files next to the driver code. In many
cases, having a .h file with only the defines relevant to the concerned
driver is best. But this is a _lot_ of work.
b) We change those <mach/foo.h> into something more absolute, such as
<mach/omap2/foo.h>. This can be done on a per SOC basis, first by
moving the header files one level deeper, and then fixing up all
affected drivers.
c) We change those <mach/foo.h> files into something more precise, e.g.
<mach/omap2_foo.h> and fix concerned drivers.
I think the best solution here is (b) which doesn't preclude (a)
eventually or if it is trivial. But (c) is dangerous as files might be
added easily without paying too much attention to the file prefix.
3) Changes to the build system
We need to move towards the ability to actually build more than one SOC
family at the same time.
3.1) Kconfig
This involves changes to Kconfig where currently only one out of all the
different architectures is selected through the big "ARM system type"
choice prompt. We need to determine a good way to move some of them
into simple bool prompts and keep track of which architecture can be
built concurrently with which. We know for instance that it is unlikely
that pre-ARMv6 and ARMv6/7 will ever be buildable together. Today we
know that nothing can be built with anything else and therefore this
should be the starting default. This needs investigating.
3.2) Makefile
Currently the arch/arm/Makefile is organized so the lowest instruction
set level and the highest optimization level are selected from all the
configured options. So this part should already be fine.
However the machine-$(*), plat-$(*), machdirs and platdirs variables
must go. In (2) above we should have removed the need for adding to the
global KBUILD_CPPFLAGS to add a path to some specific architecture
includes already. Keeping them only for the code under each
architecture subdirectory should be sufficient.
For example, this might be all that is needed:
obj-$(CONFIG_ARCH_MSM) += mach-msm/
or
obj-$(CONFIG_ARCH_KIRKWOOD) += mach-kirkwood/ plat-orion/
obj-$(CONFIG_ARCH_ORION5X) += mach-orion5x/ plat-orion/
Etc.
And within each of these directories, using the subdir-ccflags-y
variable to include the locally needed architecture specific include
files will do the trick.
3.3) defconfig
We need a defconfig file adding as many architectures to it as possible
for build coverage. Ideally the resulting binary should be boot tested
on as many of the targets it supports as possible.
4) Picking up broken pieces
Things will certainly break along the way. There are certainly issues
that I didn't foresee. My experience so far tends to indicate that
this is a somewhat recursive process where the tackling of one work item
reveals a few more which are prerequisite to the first one, etc. So any
estimate for this work needs to consider a large fudge factor.
Nicolas
The affinity between ARM processors is defined in the MPIDR register.
We can identify which processors are in the same cluster,
and which ones have performance interdependency. We can define the
cpu topology of an ARM platform, which is then used by sched_mc and sched_smt.
The default state of the sched_mc and sched_smt config options is disabled.
When enabled, the behavior of the scheduler can be modified with
sched_mc_power_savings and sched_smt_power_savings sysfs interfaces.
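As a worked example of the MPIDR decoding, the stand-alone C sketch below
applies the same bit masks as the patch that follows to a raw register
value. The struct and function names are hypothetical and exist only for
this illustration; the kernel code fills in cpu_topology[] instead.

```c
#include <stdint.h>

/* Decoded topology fields; -1 means "not applicable". */
struct mpidr_fields {
	int thread_id;
	int core_id;
	int socket_id;
};

struct mpidr_fields mpidr_decode(uint32_t mpidr)
{
	/* Default: uniprocessor system or old uniprocessor format. */
	struct mpidr_fields f = { -1, 0, -1 };

	/* Bits [31:30] == 0b10: multiprocessor format, MP system. */
	if ((mpidr & (0x3u << 30)) != (0x2u << 30))
		return f;

	if (mpidr & (0x1u << 24)) {
		/* MT bit set: cores with performance interdependency. */
		f.thread_id = (int)(mpidr & 0x3);
		f.core_id = (int)((mpidr >> 8) & 0xF);
		f.socket_id = (int)((mpidr >> 16) & 0xFF);
	} else {
		/* Largely independent cores. */
		f.thread_id = -1;
		f.core_id = (int)(mpidr & 0x3);
		f.socket_id = (int)((mpidr >> 8) & 0xF);
	}
	return f;
}
```

For instance, an MPIDR value of 0x80000001 decodes to core 1 of socket 0
with no thread id, and 0x81000102 decodes to thread 2 of core 1 in socket 0.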
Changes since v4 :
* Remove unnecessary parentheses and blank lines
Changes since v3 :
* Update the format of printk message
* Remove blank line
Changes since v2 :
* Update the commit message and some comments
Changes since v1 :
* Update the commit message
* Add read_cpuid_mpidr in arch/arm/include/asm/cputype.h
* Modify header of arch/arm/kernel/topology.c
* Modify tests and manipulation of MPIDR's bitfields
* Modify the place and dependency of the config
* Modify Noop functions
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria(a)linaro.org>
---
arch/arm/Kconfig | 25 +++++++
arch/arm/include/asm/cputype.h | 6 ++
arch/arm/include/asm/topology.h | 33 +++++++++
arch/arm/kernel/Makefile | 1 +
arch/arm/kernel/smp.c | 5 ++
arch/arm/kernel/topology.c | 148 +++++++++++++++++++++++++++++++++++++++
6 files changed, 218 insertions(+), 0 deletions(-)
create mode 100644 arch/arm/kernel/topology.c
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9adc278..f327e55 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1344,6 +1344,31 @@ config SMP_ON_UP
If you don't know what to do here, say Y.
+config ARM_CPU_TOPOLOGY
+ bool "Support cpu topology definition"
+ depends on SMP && CPU_V7
+ default y
+ help
+ Support ARM cpu topology definition. The MPIDR register defines
+ affinity between processors which is then used to describe the cpu
+ topology of an ARM System.
+
+config SCHED_MC
+ bool "Multi-core scheduler support"
+ depends on ARM_CPU_TOPOLOGY
+ help
+ Multi-core scheduler support improves the CPU scheduler's decision
+ making when dealing with multi-core CPU chips at a cost of slightly
+ increased overhead in some places. If unsure say N here.
+
+config SCHED_SMT
+ bool "SMT scheduler support"
+ depends on ARM_CPU_TOPOLOGY
+ help
+ Improves the CPU scheduler's decision making when dealing with
+ MultiThreading at a cost of slightly increased overhead in some
+ places. If unsure say N here.
+
config HAVE_ARM_SCU
bool
depends on SMP
diff --git a/arch/arm/include/asm/cputype.h b/arch/arm/include/asm/cputype.h
index cd4458f..cb47d28 100644
--- a/arch/arm/include/asm/cputype.h
+++ b/arch/arm/include/asm/cputype.h
@@ -8,6 +8,7 @@
#define CPUID_CACHETYPE 1
#define CPUID_TCM 2
#define CPUID_TLBTYPE 3
+#define CPUID_MPIDR 5
#define CPUID_EXT_PFR0 "c1, 0"
#define CPUID_EXT_PFR1 "c1, 1"
@@ -70,6 +71,11 @@ static inline unsigned int __attribute_const__ read_cpuid_tcmstatus(void)
return read_cpuid(CPUID_TCM);
}
+static inline unsigned int __attribute_const__ read_cpuid_mpidr(void)
+{
+ return read_cpuid(CPUID_MPIDR);
+}
+
/*
* Intel's XScale3 core supports some v6 features (supersections, L2)
* but advertises itself as v5 as it does not support the v6 ISA. For
diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index accbd7c..a7e457e 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -1,6 +1,39 @@
#ifndef _ASM_ARM_TOPOLOGY_H
#define _ASM_ARM_TOPOLOGY_H
+#ifdef CONFIG_ARM_CPU_TOPOLOGY
+
+#include <linux/cpumask.h>
+
+struct cputopo_arm {
+ int thread_id;
+ int core_id;
+ int socket_id;
+ cpumask_t thread_sibling;
+ cpumask_t core_sibling;
+};
+
+extern struct cputopo_arm cpu_topology[NR_CPUS];
+
+#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
+#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
+#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
+#define topology_thread_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
+
+#define mc_capable() (cpu_topology[0].socket_id != -1)
+#define smt_capable() (cpu_topology[0].thread_id != -1)
+
+void init_cpu_topology(void);
+void store_cpu_topology(unsigned int cpuid);
+const struct cpumask *cpu_coregroup_mask(unsigned int cpu);
+
+#else
+
+static inline void init_cpu_topology(void) { }
+static inline void store_cpu_topology(unsigned int cpuid) { }
+
+#endif
+
#include <asm-generic/topology.h>
#endif /* _ASM_ARM_TOPOLOGY_H */
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index a5b31af..816a481 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_IWMMXT) += iwmmxt.o
obj-$(CONFIG_CPU_HAS_PMU) += pmu.o
obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
AFLAGS_iwmmxt.o := -Wa,-mcpu=iwmmxt
+obj-$(CONFIG_ARM_CPU_TOPOLOGY) += topology.o
ifneq ($(CONFIG_ARCH_EBSA110),y)
obj-y += io.o
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 344e52b..051fd36 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -31,6 +31,7 @@
#include <asm/cacheflush.h>
#include <asm/cpu.h>
#include <asm/cputype.h>
+#include <asm/topology.h>
#include <asm/mmu_context.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -268,6 +269,8 @@ static void __cpuinit smp_store_cpu_info(unsigned int cpuid)
struct cpuinfo_arm *cpu_info = &per_cpu(cpu_data, cpuid);
cpu_info->loops_per_jiffy = loops_per_jiffy;
+
+ store_cpu_topology(cpuid);
}
/*
@@ -354,6 +357,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
{
unsigned int ncores = num_possible_cpus();
+ init_cpu_topology();
+
smp_store_cpu_info(smp_processor_id());
/*
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
new file mode 100644
index 0000000..1040c00
--- /dev/null
+++ b/arch/arm/kernel/topology.c
@@ -0,0 +1,148 @@
+/*
+ * arch/arm/kernel/topology.c
+ *
+ * Copyright (C) 2011 Linaro Limited.
+ * Written by: Vincent Guittot
+ *
+ * based on arch/sh/kernel/topology.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License. See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+#include <linux/node.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+
+#include <asm/cputype.h>
+#include <asm/topology.h>
+
+#define MPIDR_SMP_BITMASK (0x3 << 30)
+#define MPIDR_SMP_VALUE (0x2 << 30)
+
+#define MPIDR_MT_BITMASK (0x1 << 24)
+
+/*
+ * These masks reflect the current use of the affinity levels.
+ * The affinity level can be up to 16 bits according to ARM ARM
+ */
+
+#define MPIDR_LEVEL0_MASK 0x3
+#define MPIDR_LEVEL0_SHIFT 0
+
+#define MPIDR_LEVEL1_MASK 0xF
+#define MPIDR_LEVEL1_SHIFT 8
+
+#define MPIDR_LEVEL2_MASK 0xFF
+#define MPIDR_LEVEL2_SHIFT 16
+
+struct cputopo_arm cpu_topology[NR_CPUS];
+
+const struct cpumask *cpu_coregroup_mask(unsigned int cpu)
+{
+ return &cpu_topology[cpu].core_sibling;
+}
+
+/*
+ * store_cpu_topology is called at boot when only one cpu is running
+ * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
+ * which prevents simultaneous write access to cpu_topology array
+ */
+void store_cpu_topology(unsigned int cpuid)
+{
+ struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
+ unsigned int mpidr;
+ unsigned int cpu;
+
+ /* If the cpu topology has been already set, just return */
+ if (cpuid_topo->core_id != -1)
+ return;
+
+ mpidr = read_cpuid_mpidr();
+
+ /* create cpu topology mapping */
+ if ((mpidr & MPIDR_SMP_BITMASK) == MPIDR_SMP_VALUE) {
+ /*
+ * This is a multiprocessor system
+ * multiprocessor format & multiprocessor mode field are set
+ */
+
+ if (mpidr & MPIDR_MT_BITMASK) {
+ /* core performance interdependency */
+ cpuid_topo->thread_id = (mpidr >> MPIDR_LEVEL0_SHIFT)
+ & MPIDR_LEVEL0_MASK;
+ cpuid_topo->core_id = (mpidr >> MPIDR_LEVEL1_SHIFT)
+ & MPIDR_LEVEL1_MASK;
+ cpuid_topo->socket_id = (mpidr >> MPIDR_LEVEL2_SHIFT)
+ & MPIDR_LEVEL2_MASK;
+ } else {
+ /* largely independent cores */
+ cpuid_topo->thread_id = -1;
+ cpuid_topo->core_id = (mpidr >> MPIDR_LEVEL0_SHIFT)
+ & MPIDR_LEVEL0_MASK;
+ cpuid_topo->socket_id = (mpidr >> MPIDR_LEVEL1_SHIFT)
+ & MPIDR_LEVEL1_MASK;
+ }
+ } else {
+ /*
+ * This is a uniprocessor system
+ * we are in multiprocessor format but uniprocessor system
+ * or in the old uniprocessor format
+ */
+ cpuid_topo->thread_id = -1;
+ cpuid_topo->core_id = 0;
+ cpuid_topo->socket_id = -1;
+ }
+
+ /* update core and thread sibling masks */
+ for_each_possible_cpu(cpu) {
+ struct cputopo_arm *cpu_topo = &cpu_topology[cpu];
+
+ if (cpuid_topo->socket_id == cpu_topo->socket_id) {
+ cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+ if (cpu != cpuid)
+ cpumask_set_cpu(cpu,
+ &cpuid_topo->core_sibling);
+
+ if (cpuid_topo->core_id == cpu_topo->core_id) {
+ cpumask_set_cpu(cpuid,
+ &cpu_topo->thread_sibling);
+ if (cpu != cpuid)
+ cpumask_set_cpu(cpu,
+ &cpuid_topo->thread_sibling);
+ }
+ }
+ }
+ smp_wmb();
+
+ printk(KERN_INFO "CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
+ cpuid, cpu_topology[cpuid].thread_id,
+ cpu_topology[cpuid].core_id,
+ cpu_topology[cpuid].socket_id, mpidr);
+}
+
+/*
+ * init_cpu_topology is called at boot when only one cpu is running
+ * which prevents simultaneous write access to cpu_topology array
+ */
+void init_cpu_topology(void)
+{
+ unsigned int cpu;
+
+ /* init core mask */
+ for_each_possible_cpu(cpu) {
+ struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
+
+ cpu_topo->thread_id = -1;
+ cpu_topo->core_id = -1;
+ cpu_topo->socket_id = -1;
+ cpumask_clear(&cpu_topo->core_sibling);
+ cpumask_clear(&cpu_topo->thread_sibling);
+ }
+ smp_wmb();
+}
--
1.7.4.1