I'm trying to work out what compile-time checks I can do to determine which ARM architecture the kernel binary may be run on.
Is __LINUX_ARM_ARCH__ the architecture that the kernel is being built to support, or is it the instruction set being used by the compiler? (If these aren't always the same.)
Also, when a kernel is built for ARMv6 and ARMv7, I assume that __LINUX_ARM_ARCH__ == 6 ?
Finally, is it safe to assume that a single kernel binary will never support both v5 and v6 hardware? What about v4 and v5?
My ultimate goal is to correctly simulate ARM instructions which behave differently on different architectures, for this, there will also need to be some runtime checking. I can do this by running some test code at boot time, but does the kernel already have some CPU architecture or feature detection that I can make use of?
I was also looking at the Instruction Set Attribute Register in CP15, these give me the exact information I want, but I suspect that determining their availability will be as difficult as writing code to probe the features I'm interested in (ARM/Thumb interworking).
On Fri, Jul 01, 2011 at 03:38:48PM +0100, Tixy wrote:
I'm trying to work out what compile-time checks I can do to determine which ARM architecture the kernel binary may be run on.
Is __LINUX_ARM_ARCH__ the architecture that the kernel is being built to support, or is it the instruction set being used by the compiler? (If these aren't always the same.)
Also, when a kernel is built for ARMv6 and ARMv7, I assume that __LINUX_ARM_ARCH__ == 6 ?
I think yes -- it's the baseline architecture, even if some specific files get built for a newer architecture in a kernel supporting multiple CPUs. This define is set globally in arch/arm/Makefile.
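A minimal sketch of treating the define as a compile-time lower bound, assuming the "baseline architecture" reading above is right. The helper names are invented for illustration, and the fallback #define is there only so that the fragment builds outside a kernel tree:

```c
/*
 * __LINUX_ARM_ARCH__ is normally set by arch/arm/Makefile. The fallback
 * below is an assumption made so this fragment compiles standalone.
 */
#ifndef __LINUX_ARM_ARCH__
#define __LINUX_ARM_ARCH__ 6
#endif

/* Hypothetical helper: lowest architecture version this build targets. */
static inline int baseline_arch(void)
{
	return __LINUX_ARM_ARCH__;
}

/*
 * Hypothetical helper: a baseline below 7 does not rule out v7 hardware
 * (for example a combined v6+v7 kernel), so v7-specific behaviour would
 * still need a runtime check.
 */
static inline int needs_runtime_v7_check(void)
{
#if __LINUX_ARM_ARCH__ >= 7
	return 0;	/* only v7 or later hardware is possible */
#else
	return 1;	/* might be v7, might be older */
#endif
}
```

With the fallback value of 6, needs_runtime_v7_check() returns 1, which matches the v6+v7 case asked about.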
Finally, is it safe to assume that a single kernel binary will never support both v5 and v6 hardware? What about v4 and v5?
I think it's "too hard" to support v6 and pre-v6 in the same kernel, so such configurations are not really supported.
Other than that, I'm not too sure. Nico or someone may know.
A Thumb-2 kernel cannot, by definition, run on anything prior to ARMv6T2.
My ultimate goal is to correctly simulate ARM instructions which behave differently on different architectures, for this, there will also need to be some runtime checking. I can do this by running some test code at boot time, but does the kernel already have some CPU architecture or feature detection that I can make use of?
Most opcodes are either UNPREDICTABLE or UNDEFINED before they get defined to mean something in some version of the architecture.
There are a few exceptions to this, but perhaps it's not worth the effort of emulating them all(?)
I was also looking at the Instruction Set Attribute Register in CP15, these give me the exact information I want, but I suspect that determining their availability will be as difficult as writing code to probe the features I'm interested in (ARM/Thumb interworking).
v7 processors and a few earlier processors have these registers.
If it's the ARM/Thumb interworking behaviour you're interested in, note that the kernel is a non-interworking environment, and cannot be built for (or contain any) Thumb code on ARMv4T. So those instructions really shouldn't make a difference unless the kernel is buggy.
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Cheers ---Dave
On Friday 01 July 2011 20:26:54 Dave Martin wrote:
On Fri, Jul 01, 2011 at 03:38:48PM +0100, Tixy wrote:
Finally, is it safe to assume that a single kernel binary will never support both v5 and v6 hardware? What about v4 and v5?
I think it's "too hard" to support v6 and pre-v6 in the same kernel, so such configurations are not really supported.
Other than that, I'm not too sure. Nico or someone may know.
We can have common kernels for v6+v7, and we can have common kernels for v3+v4+v5, iirc.
I was also looking at the Instruction Set Attribute Register in CP15, these give me the exact information I want, but I suspect that determining their availability will be as difficult as writing code to probe the features I'm interested in (ARM/Thumb interworking).
v7 processors and a few earlier processors have these registers.
If it's the ARM/Thumb interworking behaviour you're interested in, note that the kernel is a non-interworking environment, and cannot be built for (or contain any) Thumb code on ARMv4T. So those instructions really shouldn't make a difference unless the kernel is buggy.
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Another variable would be endianness, if the goal is to support every single possibility.
Arnd
On Fri, 2011-07-01 at 19:26 +0100, Dave Martin wrote:
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Interworking is different on v7: ARM mode ALU instructions now interwork, e.g. "sub pc, pc, #3" will switch from ARM to Thumb.
When doing the kprobes bug fixes it was decided that we should avoid writes to PC which produce unpredictable results, even though such instructions aren't legal and won't occur in normal code. So I believe it follows that we should implement interworking correctly, even on kernels not built for Thumb.
Before I started my work, the kprobes code already had a partial simulation of interworking, so someone else must have thought it worth while. Though it would obviously be a lot simpler if I could just wrap all interworking stuff up in #ifdef CONFIG_THUMB2_KERNEL.
On Sat, Jul 02, 2011 at 12:28:01PM +0100, Tixy wrote:
On Fri, 2011-07-01 at 19:26 +0100, Dave Martin wrote:
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Interworking is different on v7: ARM mode ALU instructions now interwork, e.g. "sub pc, pc, #3" will switch from ARM to Thumb.
OK, I was just thinking about Thumb, but this is a fair distinction.
Alternatively, would it make sense to oops instead if we discover when simulating the branch that it would try to switch to ARM?
I believe that the CPU Main ID register is sufficient to confirm whether the CPUID registers are there, but I'm not sure whether there's any precise indication of whether you're running on v7 as such. If you've not already done so, you might want to look in detail at the CPUID feature bits documented in the ARM ARM, to see whether the necessary clues are there.
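As a rough sketch of that first check: the architecture field of the Main ID Register is bits [19:16], and the value 0xF there indicates that the CPUID registers are implemented. The function names are made up for this example, the register value is passed in rather than read from CP15 so the fragment can be compiled anywhere, and the different Main ID layouts of very old cores (ARM7 and earlier) are deliberately ignored:

```c
#include <stdint.h>

/* Architecture field of the Main ID Register, bits [19:16]. */
static inline unsigned int midr_arch_field(uint32_t midr)
{
	return (midr >> 16) & 0xf;
}

/*
 * Field value 0xF means the architecture is described by the CPUID
 * registers instead of by this field. Assumes a post-ARM7 MIDR layout.
 */
static inline int midr_has_cpuid_scheme(uint32_t midr)
{
	return midr_arch_field(midr) == 0xf;
}
```

A value of the form 0x410fc090 has 0xF in that field, whereas a value such as 0x41069265 has 0x6 and so would not be expected to have the CPUID registers. Whether the feature bits then distinguish v7 precisely is the open question above.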
When doing the kprobes bug fixes it was decided that we should avoid writes to PC which produce unpredictable results, even though such instructions aren't legal and won't occur in normal code. So I believe it follows that we should implement interworking correctly, even on kernels not built for Thumb.
I'm not sure I understand your argument here.
Before I started my work, the kprobes code already had a partial simulation of interworking, so someone else must have thought it worth while. Though it would obviously be a lot simpler if I could just wrap all interworking stuff up in #ifdef CONFIG_THUMB2_KERNEL.
Is it worth pinging the original committer to get his views on the rationale for this?
Cheers ---Dave
On Mon, 2011-07-04 at 17:16 +0100, Dave Martin wrote:
On Sat, Jul 02, 2011 at 12:28:01PM +0100, Tixy wrote:
On Fri, 2011-07-01 at 19:26 +0100, Dave Martin wrote:
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Interworking is different on v7: ARM mode ALU instructions now interwork, e.g. "sub pc, pc, #3" will switch from ARM to Thumb.
OK, I was just thinking about Thumb, but this is a fair distinction.
Alternatively, would it make sense to oops instead if we discover when simulating the branch that it would try to switch to ARM?
I'm not sure I understand what you mean here. In ARM mode the instruction "sub pc, pc, #3" does the following
- on ARMv7, switch to Thumb
- on ARMv6, stay in ARM mode
- on ARMv5 and earlier, UNPREDICTABLE
so are you saying OOPS in the last case? (I currently have it staying in ARM mode as I only distinguish between <=ARMv6 and >=ARMv7.)
Note, in Thumb mode, the ALU instructions don't interwork on any architecture. (It makes you wonder why they changed the ARM mode behaviour in v7).
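The cases above can be sketched as a small model of a data-processing instruction writing its result to PC. This follows the behaviour described in this message and my reading of the architecture manual's pseudocode; it is not the kprobes code, and the struct and function names are invented for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

struct branch_result {
	uint32_t pc;		/* address execution continues at */
	bool thumb;		/* instruction set state after the write */
	bool unpredictable;	/* architecture gives no guarantee */
};

/*
 * arch is the architecture version (4, 5, 6 or 7), thumb is the state
 * the instruction executes in, result is the value written to PC.
 */
static struct branch_result alu_write_pc(int arch, bool thumb, uint32_t result)
{
	struct branch_result r = { 0, thumb, false };

	if (thumb) {
		/* Thumb state: never interworks, bit 0 is dropped. */
		r.pc = result & ~1u;
	} else if (arch >= 7) {
		/* ARMv7, ARM state: behaves like BX. */
		if (result & 1) {
			r.thumb = true;
			r.pc = result & ~1u;
		} else {
			r.pc = result;
			r.unpredictable = (result & 2) != 0;
		}
	} else {
		/* ARMv6 and earlier, ARM state: stay in ARM. */
		r.pc = result & ~3u;
		r.unpredictable = arch < 6 && (result & 3) != 0;
	}
	return r;
}
```

For "sub pc, pc, #3" at address 0x8000, PC reads as 0x8008, so the result is 0x8005: the model switches to Thumb at 0x8004 for arch 7, stays in ARM at 0x8004 for arch 6, and flags the arch 5 case as unpredictable.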
[...]
When doing the kprobes bug fixes it was decided that we should avoid writes to PC which produce unpredictable results, even though such instructions aren't legal and won't occur in normal code. So I believe it follows that we should implement interworking correctly, even on kernels not built for Thumb.
I'm not sure I understand your argument here.
I was trying to say that if we already have code to handle cases of invalid instructions that the compiler/assembler would never generate, then we should have code to handle legal instructions that the compiler/assembler won't generate because we're not building for Thumb.
Before I started my work, the kprobes code already had a partial simulation of interworking, so someone else must have thought it worth while. Though it would obviously be a lot simpler if I could just wrap all interworking stuff up in #ifdef CONFIG_THUMB2_KERNEL.
Is it worth pinging the original committer to get his views on the rationale for this?
Over the weekend I implemented functions to check various interworking behaviour, and run these when kprobes is first initialised. So I now have things all simulated correctly and I think I may as well leave things in there now.
On Mon, 4 Jul 2011, Tixy wrote:
On Mon, 2011-07-04 at 17:16 +0100, Dave Martin wrote:
On Sat, Jul 02, 2011 at 12:28:01PM +0100, Tixy wrote:
On Fri, 2011-07-01 at 19:26 +0100, Dave Martin wrote:
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Interworking is different on v7: ARM mode ALU instructions now interwork, e.g. "sub pc, pc, #3" will switch from ARM to Thumb.
OK, I was just thinking about Thumb, but this is a fair distinction.
Alternatively, would it make sense to oops instead if we discover when simulating the branch that it would try to switch to ARM?
I would think this can be determined at probe installation time.
I'm not sure I understand what you mean here. In ARM mode the instruction "sub pc, pc, #3" does the following
- on ARMv7, switch to Thumb
- on ARMv6, stay in ARM mode
- on ARMv5 and earlier, UNPREDICTABLE
so are you saying OOPS in the last case? (I currently have it staying in ARM mode as I only distinguish between <=ARMv6 and >=ARMv7.)
Actually, is this something that may happen frequently? Given the behavior is not consistent, it is likely that no one will use such construct in practice and therefore we may consider simply refusing to install a probe on such instructions.
I was trying to say that if we already have code to handle cases of invalid instructions that the compiler/assembler would never generate, then we should have code to handle legal instructions that the compiler/assembler won't generate because we're not building for Thumb.
Sure. If by that you mean properly refusing to handle them then I agree.
Before I started my work, the kprobes code already had a partial simulation of interworking, so someone else must have thought it worth while. Though it would obviously be a lot simpler if I could just wrap all interworking stuff up in #ifdef CONFIG_THUMB2_KERNEL.
Is it worth pinging the original committer to get his views on the rationale for this?
Over the weekend I implemented functions to check various interworking behaviour, and run these when kprobes is first initialised. So I now have things all simulated correctly and I think I may as well leave things in there now.
As long as this doesn't look like too much bloat then I agree.
Nicolas
On Mon, 2011-07-04 at 14:44 -0400, Nicolas Pitre wrote:
Alternatively, would it make sense to oops instead if we discover when simulating the branch that it would try to switch to ARM?
I would think this can be determined at probe installation time.
It can't in the general case of data processing operations storing their result to PC because it will depend on register contents at the time the probe is hit. My example was bad because it only used PC and constants, think of something different like "add pc, pc, r0"
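To make the dependence on register contents concrete, here is a tiny sketch of the value that ARM-state "add pc, pc, r0" writes to PC. The function name is hypothetical; the only architectural fact relied on is that PC reads as the instruction address plus 8 in ARM state:

```c
#include <stdint.h>

/*
 * Value written to PC by ARM-state "add pc, pc, r0" located at
 * insn_addr, given the contents of r0 when the probe is hit.
 */
static inline uint32_t add_pc_pc_reg(uint32_t insn_addr, uint32_t r0)
{
	return insn_addr + 8 + r0;
}
```

For the same instruction at 0x8000, r0 = 0x10 gives 0x8018 (bit 0 clear) while r0 = 0x11 gives 0x8019 (bit 0 set), so whether the write would interwork on v7 is only known when the probe fires.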
I'm not sure I understand what you mean here. In ARM mode the instruction "sub pc, pc, #3" does the following
- on ARMv7, switch to Thumb
- on ARMv6, stay in ARM mode
- on ARMv5 and earlier, UNPREDICTABLE
so are you saying OOPS in the last case? (I currently have it staying in ARM mode as I only distinguish between <=ARMv6 and >=ARMv7.)
Actually, is this something that may happen frequently? Given the behavior is not consistent, it is likely that no one will use such construct in practice and therefore we may consider simply refusing to install a probe on such instructions.
It is consistent if the calculated address is always aligned to a word. And it is also very common with instructions like "mov pc, reg".
I was trying to say that if we already have code to handle cases of invalid instructions that the compiler/assembler would never generate, then we should have code to handle legal instructions that the compiler/assembler won't generate because we're not building for Thumb.
Sure. If by that you mean properly refusing to handle them then I agree.
Or by emulating them in exactly the same way as they normally execute on that hardware.
Before I started my work, the kprobes code already had a partial simulation of interworking, so someone else must have thought it worth while. Though it would obviously be a lot simpler if I could just wrap all interworking stuff up in #ifdef CONFIG_THUMB2_KERNEL.
Is it worth pinging the original committer to get his views on the rationale for this?
Over the weekend I implemented functions to check various interworking behaviour, and run these when kprobes is first initialised. So I now have things all simulated correctly and I think I may as well leave things in there now.
As long as this doesn't look like too much bloat then I agree.
74 lines, half of which are comments or blank. 48 bytes of code.
We've been talking about data-processing instructions but I've also done similar for "ldr pc, [...]" as Arnd suggested that we might have single kernel binaries that execute on both ARMv4 and v5 hardware.
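A comparable sketch for a load into PC executed in ARM state, assuming the usual rule that loads to PC interwork from ARMv5 onwards and not on ARMv4. The names are invented, and the model simply masks the low bits in the cases the architecture leaves unpredictable rather than reporting them:

```c
#include <stdbool.h>
#include <stdint.h>

struct load_pc_result {
	uint32_t pc;	/* address execution continues at */
	bool thumb;	/* instruction set state after the load */
};

/*
 * Model of ARM-state "ldr pc, [...]" where 'loaded' is the word read
 * from memory and arch is the architecture version (4, 5, 6 or 7).
 */
static struct load_pc_result load_write_pc(int arch, uint32_t loaded)
{
	struct load_pc_result r;

	if (arch >= 5 && (loaded & 1)) {
		/* ARMv5 and later: bit 0 selects Thumb, as with BX. */
		r.thumb = true;
		r.pc = loaded & ~1u;
	} else {
		/* ARMv4, or bit 0 clear: continue in ARM state. */
		r.thumb = false;
		r.pc = loaded & ~3u;
	}
	return r;
}
```

Loading 0x8001 stays in ARM state at 0x8000 for arch 4 but enters Thumb at 0x8000 for arch 5, which is the difference a combined v4+v5 binary would have to detect at run time.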
On Mon, 4 Jul 2011, Tixy wrote:
We've been talking about data-processing instructions but I've also done similar for "ldr pc, [...]" as Arnd suggested that we might have single kernel binaries that execute on both ARMv4 and v5 hardware.
Sure, and I think we already do with some configurations. But in that case nothing should ever use Thumb mode in the kernel.
Nicolas
On Mon, 2011-07-04 at 15:45 -0400, Nicolas Pitre wrote:
On Mon, 4 Jul 2011, Tixy wrote:
We've been talking about data-processing instructions but I've also done similar for "ldr pc, [...]" as Arnd suggested that we might have single kernel binaries that execute on both ARMv4 and v5 hardware.
Sure, and I think we already do with some configurations. But in that case nothing should ever use Thumb mode in the kernel.
Should an ARMv7 kernel have code running in Thumb mode if it wasn't configured with CONFIG_THUMB2_KERNEL?
On Mon, 4 Jul 2011, Tixy wrote:
On Mon, 2011-07-04 at 15:45 -0400, Nicolas Pitre wrote:
On Mon, 4 Jul 2011, Tixy wrote:
We've been talking about data-processing instructions but I've also done similar for "ldr pc, [...]" as Arnd suggested that we might have single kernel binaries that execute on both ARMv4 and v5 hardware.
Sure, and I think we already do with some configurations. But in that case nothing should ever use Thumb mode in the kernel.
Should an ARMv7 kernel have code running in Thumb mode if it wasn't configured with CONFIG_THUMB2_KERNEL?
I would say no.
Nicolas
On Mon, Jul 04, 2011 at 04:10:02PM -0400, Nicolas Pitre wrote:
On Mon, 4 Jul 2011, Tixy wrote:
On Mon, 2011-07-04 at 15:45 -0400, Nicolas Pitre wrote:
On Mon, 4 Jul 2011, Tixy wrote:
We've been talking about data-processing instructions but I've also done similar for "ldr pc, [...]" as Arnd suggested that we might have single kernel binaries that execute on both ARMv4 and v5 hardware.
Sure, and I think we already do with some configurations. But in that case nothing should ever use Thumb mode in the kernel.
Should an ARMv7 kernel have code running in Thumb mode if it wasn't configured with CONFIG_THUMB2_KERNEL?
I would say no.
Agreed.
FWIW, I would prefer if the kernel was a proper EABI environment, with correct interworking -- just from a general cleanliness point of view.
This is actually probably not too hard to achieve, but I think people are a bit scared of it (somewhat justifiably) and the benefits are not huge since all modules have to be built with a consistent configuration anyway.
The only real benefit is that fossilised binary blob drivers are more likely to work more smoothly -- that will be hard to sell to the kernel community.
All in all, I suspect this is unlikely to happen.
Note that Thumb kernels do contain very small amounts of ARM code. But this is only for one or two special cases, and it's probably not worth trying to support these with kprobes.
Cheers ---Dave
On Tue, 2011-07-05 at 09:50 +0100, Dave Martin wrote:
Note that Thumb kernels do contain very small amounts of ARM code. But this is only for one or two special cases, and it's probably not worth trying to support these with kprobes.
I currently have ARM probes working on Thumb kernels; do you think I should remove my changes?
On Tue, Jul 5, 2011 at 11:07 AM, Tixy tixy@yxit.co.uk wrote:
On Tue, 2011-07-05 at 09:50 +0100, Dave Martin wrote:
Note that Thumb kernels do contain very small amounts of ARM code. But this is only for one or two special cases, and it's probably not worth trying to support these with kprobes.
I currently have ARM probes working on Thumb kernels; do you think I should remove my changes?
If it's already there, I see no special reason not to keep it.
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
AFAIK, the kernel doesn't contain sufficient information to allow us to know that, because the ARM/Thumb mapping symbols are not included when the kernel gets linked.
Cheers ---Dave
On Tue, 2011-07-05 at 12:01 +0100, Dave Martin wrote:
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
I'm using the bottom bit of the probe address. The kprobes API lets you specify the probe location as a symbol
the_probe.symbol_name = "function_name";
or as an address
the_probe.addr = &function_name;
and both of these cases will work. If the address is obtained by another means which doesn't set bit zero to indicate thumb code, then it's going to go bang.
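The bit 0 convention described above can be sketched as two small helpers. These are illustrative names, not functions from the kprobes code, and addresses are plain integers here so the fragment is not tied to a kernel build:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the probe address marks Thumb code (bit 0 set). */
static inline bool probe_addr_is_thumb(uintptr_t addr)
{
	return (addr & 1) != 0;
}

/* Address of the instruction itself, with the Thumb marker removed. */
static inline uintptr_t probe_insn_addr(uintptr_t addr)
{
	return addr & ~(uintptr_t)1;
}
```

An address of 0x8001 would be treated as Thumb code at 0x8000, while 0x8000 would be treated as ARM code at 0x8000, which is why an address computed without the marker selects the wrong decoder.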
Do you think that we should assume all probes are Thumb on Thumb kernels and ARM on ARM kernels? And therefore configure out ARM instruction decoding and simulation on Thumb kernels?
On Tue, Jul 5, 2011 at 12:35 PM, Tixy tixy@yxit.co.uk wrote:
On Tue, 2011-07-05 at 12:01 +0100, Dave Martin wrote:
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
I'm using the bottom bit of the probe address. The kprobes API lets you specify the probe location as a symbol
the_probe.symbol_name = "function_name";
or as an address
the_probe.addr = &function_name;
and both of these cases will work. If the address is obtained by another means which doesn't set bit zero to indicate thumb code, then it's going to go bang.
The only code locations which exist from the point of view of ELF are function entry points, so the convention is clear for those.
The main other means I can think of is if people are setting ad-hoc probes in the middle of functions.
Now, we could make correct setting of the Thumb bit part of the semantics of the kprobes interface, but I think we have to document it explicitly in that case, and there's a risk it could interfere with some existing uses of kprobes.
If there is automatic infrastructure for creating probe points in the middle of functions, this would also have to be careful to set the bit correctly, since by default the noted locations may not have the bit set correctly. I forget exactly what such infrastructure may exist -- do you have ideas on this?
Do you think that we should assume all probes are Thumb on Thumb kernels and ARM on ARM kernels? And therefore configure out ARM instruction decoding and simulation on Thumb kernels?
I'm starting to feel that it might be safer to assume everything is Thumb -- since the overwhelming majority of the kernel is Thumb for a Thumb kernel. Literally just a handful of instructions will be ARM, pretty much all of them in places where it would be impossible/unsafe to set a kprobe anyway (such as the kernel entry point, low-level power management backend code and the vectors page).
It's a bit annoying, since we can quite reasonably simulate this part of the architecture in kprobes (as you do). But it could create problems which outweigh the usefulness. That's just my gut feeling though -- I don't have specific examples. If you don't think my concerns are likely to lead to actual problems, that's fair enough.
Cheers ---Dave
On Tue, 2011-07-05 at 14:46 +0100, Dave Martin wrote:
On Tue, Jul 5, 2011 at 12:35 PM, Tixy tixy@yxit.co.uk wrote:
On Tue, 2011-07-05 at 12:01 +0100, Dave Martin wrote:
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
I'm using the bottom bit of the probe address. The kprobes API lets you specify the probe location as a symbol
the_probe.symbol_name = "function_name";
or as an address
the_probe.addr = &function_name;
and both of these cases will work. If the address is obtained by another means which doesn't set bit zero to indicate thumb code, then it's going to go bang.
[...]
If there is automatic infrastructure for creating probe points in the middle of functions, this would also have to be careful to set the bit correctly, since by default the noted locations may not have the bit set correctly.
The kprobes struct has an 'offset' field which is used to specify an offset from the start of the specified function. If this is used for probes in the middle of functions then things will also be OK.
I forget exactly what such infrastructure may exist -- do you have ideas on this?
I haven't. It's a failing of mine that I haven't researched this enough to find these. Though from what I have come across, that seems like it would be a thankless task: there seem to be many trace/debug/instrumentation frameworks in various states of newness, flux or disrepair. At UDS I was in one meeting where someone mentioned that one trace scheme was effectively dead; at the same time another meeting was talking about integrating it into Ubuntu!
On Tue, 2011-07-05 at 14:46 +0100, Dave Martin wrote:
On Tue, Jul 5, 2011 at 12:35 PM, Tixy tixy@yxit.co.uk wrote:
On Tue, 2011-07-05 at 12:01 +0100, Dave Martin wrote:
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
I'm using the bottom bit of the probe address. The kprobes API lets you specify the probe location as a symbol
the_probe.symbol_name = "function_name";
or as an address
the_probe.addr = &function_name;
and both of these cases will work. If the address is obtained by another means which doesn't set bit zero to indicate thumb code, then it's going to go bang.
The only code locations which exist from the point of view of ELF are function entry points, so the convention is clear for those.
The main other means I can think of is if people are setting ad-hoc probes in the middle of functions.
Now, we could make correct setting of the Thumb bit part of the semantics of the kprobes interface, but I think we have to document it explicitly in that case, and there's a risk it could interfere with some existing uses of kprobes.
Now that we've decided we wouldn't support probing ARM code on Thumb kernels, I've been changing the code to ignore bit 0 of the probe address. However, one problem with this is that the address is used by the non-arch-specific framework code to identify probes.
This causes two problems.
1. In the ARM kprobe_handler I have to decide whether to call get_kprobe with an address which has bit 0 set or not, which I can't do without knowing how the probe was originally registered. And doing a second lookup if the first fails seems too horrible.
2. The generic kprobes code supports the case where two or more probes are placed at the same location, this will fail if bit 0 of the address differs.
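Both problems can be seen with a toy stand-in for the lookup. This is not the generic kprobes code; it is an assumed exact-match table with invented names, which is enough to show the mismatch when bit 0 differs:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PROBES 8

/* Toy registry keyed on the exact address the caller supplied. */
static uintptr_t probe_table[MAX_PROBES];
static size_t probe_count;

/* Returns the slot used, or -1 if the table is full. */
static int register_probe(uintptr_t addr)
{
	if (probe_count == MAX_PROBES)
		return -1;
	probe_table[probe_count] = addr;
	return (int)probe_count++;
}

/* Returns the slot whose address matches exactly, or -1 if none does. */
static int find_probe(uintptr_t addr)
{
	size_t i;

	for (i = 0; i < probe_count; i++)
		if (probe_table[i] == addr)
			return (int)i;
	return -1;
}
```

After register_probe(0x8001), find_probe(0x8001) succeeds but find_probe(0x8000) returns -1, so a handler that looks up the masked address misses a probe registered with bit 0 set, and two registrations of the same instruction with different bit 0 values are not recognised as the same location.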
I think therefore, that we should keep my original implementation where:
Probe addresses in thumb code must have bit 0 set. This will naturally be the case when using symbol lookup or "&function" to set a probe at the start of a function. It will also be the case for setting a probe in the middle of a function if the offset parameter is used for this. Other uses which calculate addresses by other means may need to be modified to set bit zero accordingly.
Does this sound reasonable?
On Wed, Jul 06, 2011 at 01:38:44PM +0100, Tixy wrote:
On Tue, 2011-07-05 at 14:46 +0100, Dave Martin wrote:
On Tue, Jul 5, 2011 at 12:35 PM, Tixy tixy@yxit.co.uk wrote:
On Tue, 2011-07-05 at 12:01 +0100, Dave Martin wrote:
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
I'm using the bottom bit of the probe address. The kprobes API lets you specify the probe location as a symbol
the_probe.symbol_name = "function_name";
or as an address
the_probe.addr = &function_name;
and both of these cases will work. If the address is obtained by another means which doesn't set bit zero to indicate thumb code, then it's going to go bang.
The only code locations which exist from the point of view of ELF are function entry points, so the convention is clear for those.
The main other means I can think of is if people are setting ad-hoc probes in the middle of functions.
Now, we could make correct setting of the Thumb bit part of the semantics of the kprobes interface, but I think we have to document it explicitly in that case, and there's a risk it could interfere with some existing uses of kprobes.
Now that we've decided we wouldn't support probing ARM code on Thumb kernels, I've been changing the code to ignore bit 0 of the probe address. However, one problem with this is that the address is used by the non-arch-specific framework code to identify probes.
This causes two problems.
- In the ARM kprobe_handler I have to decide whether to call get_kprobe with an address which has bit 0 set or not, which I can't do without knowing how the probe was originally registered. And doing a second lookup if the first fails seems too horrible.
You could maybe canonicalise all the addresses when registering probes -- i.e., clear bit 0. This avoids the problems which result from abusing an address bit to store metadata -- providing there is somewhere else we can store that arch-specific metadata for each registered probe.
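A sketch of what such canonicalisation might look like, with the Thumb marker moved out of the address into a separate field. This is purely hypothetical: the struct and function are invented here, and it presupposes an arch-specific place to keep the flag, which is exactly the open question:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-probe record: canonical address plus metadata. */
struct probe_key {
	uintptr_t addr;	/* instruction address, bit 0 always clear */
	bool thumb;	/* instruction set, kept separately */
};

/* Split a caller-supplied address into canonical address and flag. */
static struct probe_key canonical_probe_key(uintptr_t raw)
{
	struct probe_key key;

	key.addr = raw & ~(uintptr_t)1;
	key.thumb = (raw & 1) != 0;
	return key;
}
```

With this split, 0x8000 and 0x8001 both produce the address 0x8000, so a lookup keyed on the address alone would find the probe however it was registered.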
Or is the whole registering process handled by generic code?
If so, and if there is no arch-specific hook, that might be considered a deficiency in the generic code, though I'm not sure if this metadata requirement would affect many architectures.
Having a probe simultaneously on overlapping ARM and Thumb locations will not work in any case, so it may make little sense to worry about matching to the correct probe if such an ambiguity occurs. Really, we should avoid overlapping probes being registered in the first place, or just consider that to be user error. In practice, I don't expect this is likely to cause us any problems.
- The generic kprobes code supports the case where two or more probes are placed at the same location; this will fail if bit 0 of the address differs.
I think therefore, that we should keep my original implementation where:
Probe addresses in thumb code must have bit 0 set. This will naturally be the case when using symbol lookup or "&function" to set a probe at the start of a function. It will also be the case for setting a probe in the middle of a function if the offset parameter is used for this. Other uses which calculate addresses by other means may need to be modified to set bit zero accordingly.
Does this sound reasonable?
It seems reasonable for the interface to require that probes should be set in a consistent way -- so that all Thumb probes have to have bit 0 set (from which it follows that two Thumb probes set in the same place will match on the registered probe address).
I don't actually know whether there are any scenarios in which kprobes would not naturally get registered in this way. If a registered probe is based somehow off a link-time vmlinux or module symbol, bit 0 will naturally be set and there's no problem. "Other uses" would encompass registering probes on arbitrary absolute addresses, which seems a bit tenuous in any case.
Cheers ---Dave
On Wed, 2011-07-06 at 14:52 +0100, Dave Martin wrote:
On Wed, Jul 06, 2011 at 01:38:44PM +0100, Tixy wrote:
On Tue, 2011-07-05 at 14:46 +0100, Dave Martin wrote:
On Tue, Jul 5, 2011 at 12:35 PM, Tixy tixy@yxit.co.uk wrote:
On Tue, 2011-07-05 at 12:01 +0100, Dave Martin wrote:
One question though -- how do we know when setting a probe whether the target instruction is ARM or Thumb?
I'm using the bottom bit of the probe address. The kprobes API lets you specify the probe location as a symbol
the_probe.symbol_name = "function_name";
or as an address
the_probe.addr = &function_name;
and both of these cases will work. If the address is obtained by another means which doesn't set bit zero to indicate thumb code, then it's going to go bang.
The only code locations which exist from the point of view of ELF are function entry points, so the convention is clear for those.
The main other means I can think of is if people are setting ad-hoc probes in the middle of functions.
Now, we could make correct setting of the Thumb bit part of the semantics of the kprobes interface, but I think we have to document it explicitly in that case, and there's a risk it could interfere with some existing uses of kprobes.
Now that we've decided we wouldn't support probing ARM code on Thumb kernels, I've been changing the code to ignore bit 0 of the probe address. However, one problem with this is that the address is used by the non-arch-specific framework code to identify probes.
This causes two problems.
- In the ARM kprobe_handler I have to decide whether to call get_kprobe with an address which has bit 0 set or not, which I can't do without knowing how the probe was originally registered. And doing a second lookup if the first fails seems too horrible.
- The generic kprobes code supports the case where two or more probes are placed at the same location; this will fail if bit 0 of the address differs.
You could maybe canonicalise all the addresses when registering probes -- i.e., clear bit 0. This avoids the problems which result from abusing an address bit to store metadata -- providing there is somewhere else we can store that arch-specific metadata for each registered probe.
Or is the whole registering process handled by generic code?
It is handled by the generic code.
If so, and if there is no arch-specific hook, that might be considered a deficiency in the generic code, though I'm not sure if this metadata requirement would affect many architectures.
Having a probe simultaneously on overlapping ARM and Thumb locations will not work in any case, so it may make little sense to worry about matching to the correct probe if such an ambiguity occurs.
I'm not worried about overlapping probes, it's about probing the same Thumb location but where one user specifies bit 0 as set and the other doesn't. Not very likely, admittedly.
On Wed, 6 Jul 2011, Tixy wrote:
I'm not worried about overlapping probes, it's about probing the same Thumb location but where one user specifies bit 0 as set and the other doesn't. Not very likely, admittedly.
Right. Let's not overengineer this. Anyway, the first probe would have installed a faulting instruction in memory, and the second probe should simply see that and bail out already.
Nicolas
On Wed, Jul 06, 2011 at 10:44:14AM -0400, Nicolas Pitre wrote:
On Wed, 6 Jul 2011, Tixy wrote:
I'm not worried about overlapping probes, it's about probing the same Thumb location but where one user specifies bit 0 as set and the other doesn't. Not very likely, admittedly.
I think we can reasonably assert that that's wrong usage, at least for now...
Right. Let's not overengineer this. Anyway, the first probe would have installed a faulting instruction in memory, and the second probe should simply see that and bail out already.
Good point -- presumably kprobes already refuses to set a probe if the target location already contains the opcode used for a kprobe trap?
Cheers ---Dave
On Wed, 2011-07-06 at 18:09 +0100, Dave Martin wrote:
Good point -- presumably kprobes already refuses to set a probe if the target location already contains the opcode used for a kprobe trap?
It will do for ARM, because you're probing an undefined instruction. Other architectures can have optimised probes which use branches rather than breakpoints.
On Wed, 6 Jul 2011, Tixy wrote:
On Wed, 2011-07-06 at 18:09 +0100, Dave Martin wrote:
Good point -- presumably kprobes already refuses to set a probe if the target location already contains the opcode used for a kprobe trap?
It will do for ARM, because you're probing an undefined instruction. Other architectures can have optimised probes which use branches rather than breakpoints.
The point is that we don't have to care about this case even if the generic kprobes code might be fooled by the same address presented with and without the Thumb bit since the second attempt will be refused and nothing nasty will occur.
Nicolas
On Wed, 2011-07-06 at 13:21 -0400, Nicolas Pitre wrote:
On Wed, 6 Jul 2011, Tixy wrote:
On Wed, 2011-07-06 at 18:09 +0100, Dave Martin wrote:
Good point -- presumably kprobes already refuses to set a probe if the target location already contains the opcode used for a kprobe trap?
It will do for ARM, because you're probing an undefined instruction. Other architectures can have optimised probes which use branches rather than breakpoints.
The point is that we don't have to care about this case even if the generic kprobes code might be fooled by the same address presented with and without the Thumb bit since the second attempt will be refused and nothing nasty will occur.
That is true. So we've shot down point 2)
But we still have point 1): the ARM kprobe_handler has to decide whether to call get_kprobe with an address which has bit 0 set or not.
If we aren't going to reject Thumb probes that are specified with bit 0 clear then the only cleanish option I can see is to add some new hook into the generic register_kprobe so we can modify this bit. However, who 'owns' the address value in the struct? This is set by the user of the API and currently left unchanged. Would it be acceptable for us to start changing it?
On Wed, 6 Jul 2011, Tixy wrote:
On Wed, 2011-07-06 at 13:21 -0400, Nicolas Pitre wrote:
On Wed, 6 Jul 2011, Tixy wrote:
On Wed, 2011-07-06 at 18:09 +0100, Dave Martin wrote:
Good point -- presumably kprobes already refuses to set a probe if the target location already contains the opcode used for a kprobe trap?
It will do for ARM, because you're probing an undefined instruction. Other architectures can have optimised probes which use branches rather than breakpoints.
The point is that we don't have to care about this case even if the generic kprobes code might be fooled by the same address presented with and without the Thumb bit since the second attempt will be refused and nothing nasty will occur.
That is true. So we've shot down point 2)
But we still have point 1): the ARM kprobe_handler has to decide whether to call get_kprobe with an address which has bit 0 set or not.
Hmmm. Well, I think that simply trying it twice: first with the Thumb bit set if that is the most likely usage when they are recorded, and then without that bit, should be OK.
If we aren't going to reject Thumb probes that are specified with bit 0 clear then the only cleanish option I can see is to add some new hook into the generic register_kprobe so we can modify this bit. However, who 'owns' the address value in the struct? This is set by the user of the API and currently left unchanged. Would it be acceptable for us to start changing it?
I don't think it is right to change it, as this would prevent deregistration: the kprobe user is likely to reuse the same address. Either a new per-architecture macro is introduced to canonicalize addresses, or we try to get away with calling get_kprobe() twice. I don't have a problem with the latter either.
Nicolas
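The "call get_kprobe() twice" option could look roughly like the sketch below. It is a userspace model only: `struct probe` and `lookup_exact()` are hypothetical stand-ins for the kernel's `struct kprobe` and `get_kprobe()`, and a linear table replaces the real hash lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct kprobe: only the registered address matters here. */
struct probe {
    uintptr_t addr;
};

/* Hypothetical exact-match lookup standing in for get_kprobe(). */
static struct probe *lookup_exact(struct probe *tbl, size_t n, uintptr_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].addr == addr)
            return &tbl[i];
    return NULL;
}

/*
 * Try the address with the Thumb bit set first (assumed to be the common
 * way probes get registered), then fall back to the address with bit 0 clear.
 */
static struct probe *lookup_thumb_probe(struct probe *tbl, size_t n,
                                        uintptr_t pc)
{
    struct probe *p = lookup_exact(tbl, n, pc | 1u);

    if (!p)
        p = lookup_exact(tbl, n, pc & ~(uintptr_t)1);
    return p;
}
```

The second lookup only happens on a miss, so the extra cost falls on probes registered without the Thumb bit.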
On Tue, 2011-07-05 at 12:35 +0100, Tixy wrote:
Do you think that we should assume all probes are Thumb on Thumb kernels and ARM on ARM kernels? And therefore configure out ARM instruction decoding and simulation on Thumb kernels?
Irrespective of correct interworking behaviour and API issues, do we think that it's a good memory saving exercise to not support probing ARM code on Thumb kernels?
On Tue, 5 Jul 2011, Tixy wrote:
Irrespective of correct interworking behaviour and API issues, do we think that it's a good memory saving exercise to not support probing ARM code on Thumb kernels?
I would think so. And if this is really something people need then this could be revisited in the future. Remember that you gain no points by trying to submit everything on a zero-day.
By the way, the merge window may be opened at any moment given latest comments from Linus. If you have some patches that can be merged during this merge window I'd suggest posting them soon.
Nicolas
On Tue, 2011-07-05 at 11:38 -0400, Nicolas Pitre wrote:
On Tue, 5 Jul 2011, Tixy wrote:
Irrespective of correct interworking behaviour and API issues, do we think that it's a good memory saving exercise to not support probing ARM code on Thumb kernels?
I would think so. And if this is really something people need then this could be revisited in the future. Remember that you gain no points by trying to submit everything on a zero-day.
There isn't really much of a saving in lines of code or diffstat from not supporting ARM code on Thumb kernels. In fact, for cleanness reasons I would need to reorganise code. Currently we have
kprobes.c         Infrastructure
kprobes-decode.c  ARM instruction decoding and simulation
My changes at the moment have
kprobes.c         Infrastructure (700 lines)
kprobes-decode.c  ARM instruction decoding and simulation (1000 lines)
                  Common instruction decoding and simulation (400 lines)
                  Decoding table processing (300 lines)
kprobes-thumb.c   Thumb instruction decoding and simulation (1500 lines)
To avoid #ifdef the ARM instruction decoding should be in its own file and not built for Thumb kernels. So the minimal change would be to move
  Common instruction decoding and simulation
  Decoding table processing
into kprobes.c, or probably nicer into a new kprobes-common.c. (We could then rename kprobes-decode to kprobes-arm and have things looking neat.)
As this is reworking 70 patches, I would like confirmation that the approach is good before starting ;-)
On Tue, 5 Jul 2011, Tixy wrote:
On Tue, 2011-07-05 at 11:38 -0400, Nicolas Pitre wrote:
On Tue, 5 Jul 2011, Tixy wrote:
Irrespective of correct interworking behaviour and API issues, do we think that it's a good memory saving exercise to not support probing ARM code on Thumb kernels?
I would think so. And if this is really something people need then this could be revisited in the future. Remember that you gain no points by trying to submit everything on a zero-day.
There isn't really much of a saving in lines of code or diffstat from not supporting ARM code on Thumb kernels. In fact, for cleanness reasons I would need to reorganise code. Currently we have
kprobes.c         Infrastructure
kprobes-decode.c  ARM instruction decoding and simulation
My changes at the moment have
kprobes.c         Infrastructure (700 lines)
kprobes-decode.c  ARM instruction decoding and simulation (1000 lines)
                  Common instruction decoding and simulation (400 lines)
                  Decoding table processing (300 lines)
kprobes-thumb.c   Thumb instruction decoding and simulation (1500 lines)
To avoid #ifdef the ARM instruction decoding should be in its own file and not built for Thumb kernels. So the minimal change would be to move
  Common instruction decoding and simulation
  Decoding table processing
into kprobes.c, or probably nicer into a new kprobes-common.c. (We could then rename kprobes-decode to kprobes-arm and have things looking neat.)
As this is reworking 70 patches, I would like confirmation that the approach is good before starting ;-)
Sure, looks sensible. Better do the renaming first and only add new stuff to it with subsequent patches.
Keeping the kprobes interface separate from the actual table and simulation/emulation handling is certainly a good idea too. So you'd end up with kprobes.c, kprobes-common.c, kprobes-arm.c and kprobes-thumb.c.
Nicolas
On Tue, Jul 05, 2011 at 12:48:39PM -0400, Nicolas Pitre wrote:
On Tue, 5 Jul 2011, Tixy wrote:
On Tue, 2011-07-05 at 11:38 -0400, Nicolas Pitre wrote:
On Tue, 5 Jul 2011, Tixy wrote:
Irrespective of correct interworking behaviour and API issues, do we think that it's a good memory saving exercise to not support probing ARM code on Thumb kernels?
I would think so. And if this is really something people need then this could be revisited in the future. Remember that you gain no points by trying to submit everything on a zero-day.
There isn't really much of a saving in lines of code or diffstat from not supporting ARM code on Thumb kernels. In fact, for cleanness reasons I would need to reorganise code. Currently we have
kprobes.c         Infrastructure
kprobes-decode.c  ARM instruction decoding and simulation
My changes at the moment have
kprobes.c         Infrastructure (700 lines)
kprobes-decode.c  ARM instruction decoding and simulation (1000 lines)
                  Common instruction decoding and simulation (400 lines)
                  Decoding table processing (300 lines)
kprobes-thumb.c   Thumb instruction decoding and simulation (1500 lines)
To avoid #ifdef the ARM instruction decoding should be in its own file and not built for Thumb kernels. So the minimal change would be to move
  Common instruction decoding and simulation
  Decoding table processing
into kprobes.c, or probably nicer into a new kprobes-common.c. (We could then rename kprobes-decode to kprobes-arm and have things looking neat.)
As this is reworking 70 patches, I would like confirmation that the approach is good before starting ;-)
Sure, looks sensible. Better do the renaming first and only add new stuff to it with subsequent patches.
Keeping the kprobes interface separate from the actual table and simulation/emulation handling is certainly a good idea too. So you'd end up with kprobes.c, kprobes-common.c, kprobes-arm.c and kprobes-thumb.c.
That sounds good to me if it's straightforward to achieve.
Cheers ---Dave
On Mon, 2011-07-04 at 16:10 -0400, Nicolas Pitre wrote:
On Mon, 4 Jul 2011, Tixy wrote:
On Mon, 2011-07-04 at 15:45 -0400, Nicolas Pitre wrote:
On Mon, 4 Jul 2011, Tixy wrote:
We've been talking about data-processing instructions but I've also done similar for "ldr pc, [...]" as Arnd suggested that we might have single kernel binaries that execute on both ARMv4 and v5 hardware.
Sure, and I think we already do with some configurations. But in that case nothing should ever use Thumb mode in the kernel.
Nicolas, I thought you meant by this that for the ldr pc,[...] case we didn't have to worry about interworking. I.e. don't code for the possibility.
Should an ARMv7 kernel have code running in Thumb mode if it wasn't configured with CONFIG_THUMB2_KERNEL?
I would say no.
If I was right about your meaning for the ARMv4/v5 statement, then doesn't the same hold true for ARMv7 with no CONFIG_THUMB2_KERNEL? I.e. don't worry about interworking? Where "don't worry" means either
a) ignore bit 0 of PC, or
b) OOPs if bit 0 is set
rather than
c) switch to Thumb mode if the real instruction would have done that on the current hardware
I like the idea of b) and of throwing away my code which checks interworking behaviour of instructions.
On Mon, Jul 04, 2011 at 05:52:03PM +0100, Tixy wrote:
On Mon, 2011-07-04 at 17:16 +0100, Dave Martin wrote:
On Sat, Jul 02, 2011 at 12:28:01PM +0100, Tixy wrote:
On Fri, 2011-07-01 at 19:26 +0100, Dave Martin wrote:
The interworking behaviour is uniform for ARMv5(T) and above, but since kernels built in Thumb cannot run on pre-v7, and kernels built in ARM cannot (or certainly should not) contain any Thumb code, these niceties may not matter.
Interworking is different on v7: ARM mode ALU instructions now interwork, e.g. "sub pc, pc, #3" will switch from ARM to Thumb.
OK, I was just thinking about Thumb, but this is a fair distinction.
Alternatively, would it make sense to oops instead if we discover when simulating the branch that it would try to switch to ARM?
I'm not sure I understand what you mean here. In ARM mode the instruction "sub pc, pc, #3" does the following
  on ARMv7, switch to Thumb
  on ARMv6, stay in ARM mode
  on ARMv5 and earlier, UNPREDICTABLE
Hmmm, I wasn't aware of the distinction between ARMv5 and ARMv6 here.
so are you saying OOPS in the last case? (I currently have it staying ARM mode as I only distinguish between <=ARMv6 and >=ARMv7 )
No, I mean oops in the ARMv7 case if it would switch to Thumb. By definition we cannot detect many of these cases statically, and I suggest it may not be worth trying to catch any of them statically.
For ARMv6, we should implement the architecture correctly and stay in ARM.
For ARMv5, "UNPREDICTABLE" provides flexibility on how to interpret the architecture -- so we can do what's most convenient and mimic the ARMv6 behaviour. If we could detect the UNPREDICTABLE cases statically, it might be worth refusing the probe, but since the behaviour depends on the ALU result, we can't tell statically whether the instruction is an unpredictable usage or not. But if you don't feel comfortable with that, we could also Oops or something. That too is within the scope of what the architecture permits for UNPREDICTABLE behaviours.
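To make the three cases concrete, here is a small userspace sketch of how a simulated ARM-state ALU write to PC might be applied under the policy described above. `struct cpu_state`, `alu_write_pc()` and the plain integer `arch_version` are hypothetical names for illustration, not taken from the kprobes code; ARMv5 and earlier are simply treated like ARMv6, which is one permitted reading of UNPREDICTABLE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated state: just the PC and the Thumb state bit. */
struct cpu_state {
    uint32_t pc;
    bool     thumb;
};

/*
 * Apply the result of an ARM-state ALU instruction that writes PC.
 * arch_version is the ARM architecture version (5, 6, 7, ...).
 */
static void alu_write_pc(struct cpu_state *s, uint32_t value, int arch_version)
{
    if (arch_version >= 7 && (value & 1u)) {
        /* ARMv7: bit 0 set means interwork to Thumb state. */
        s->thumb = true;
        s->pc = value & ~1u;
    } else {
        /*
         * ARMv6 stays in ARM state and ignores the low two bits.
         * ARMv5 and earlier are UNPREDICTABLE for a misaligned result;
         * this sketch just mimics ARMv6. (An ARMv7 result with bit 0
         * clear but bit 1 set is also UNPREDICTABLE and ends up here.)
         */
        s->thumb = false;
        s->pc = value & ~3u;
    }
}
```

An alternative policy, also discussed in this thread, would replace the Thumb branch with an oops on kernels that are not expected to contain Thumb code.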
Note, in Thumb mode, the ALU instructions don't interwork on any architecture. (It makes you wonder why they changed the ARM mode behaviour in v7).
I think it's to make it easier to integrate legacy ARM code into a mixed ARM/Thumb environment.
Thumb-2 code is not legacy by definition, and Thumb-1 can't really be used for general-purpose stuff. Plus, ALU operations have never been the correct way to do procedure returns etc. since the introduction of EABI. So generally, Thumb code is not going to suffer from interworking problems to the same extent as ARM.
[...]
When doing the kprobes bug fixes it was decided that we should avoid writes to PC which produce unpredictable results, even though such instructions aren't legal and won't occur in normal code. So I believe it follows that we should implement interworking correctly, even on kernels not built for Thumb.
I'm not sure I understand your argument here.
I was trying to say that if we already have code to handle cases of invalid instructions that the compiler/assembler would never generate, then we should have code to handle legal instructions that the compiler/assembler won't generate because we're not building for Thumb.
OK, I think I get it -- you mean we should not just silently execute these cases and hope for the best.
I agree with that; as above, my conclusion is that since attempts to detect these unpredictable ALU operations statically will be somewhat haphazard* it may be cleaner not to attempt it at all, and check at runtime on the architectures where it makes a difference.
(* i.e., add pc, lr, #3 is not an UNPREDICTABLE case if lr contains an odd number at run-time, whereas subs pc, lr, #4 may be UNPREDICTABLE if lr is odd etc. -- we cannot tell just by decoding the instruction)
Before I started my work, the kprobes code already had a partial simulation of interworking, so someone else must have thought it worthwhile. Though it would obviously be a lot simpler if I could just wrap all interworking stuff up in #ifdef CONFIG_THUMB2_KERNEL.
Is it worth pinging the original committer to get his views on the rationale for this?
Over the weekend I implemented functions to check various interworking behaviour, and run these when kprobes is first initialised. So I now have things all simulated correctly and I think I may as well leave things in there now.
Do you mean you check the CPU ID etc?
I'm not sure that experimentally running instructions to see what happens is entirely safe, but that may not be what you're suggesting.
Cheers, ---Dave
On Tue, 2011-07-05 at 09:44 +0100, Dave Martin wrote:
On Mon, Jul 04, 2011 at 05:52:03PM +0100, Tixy wrote:
[...]
Over the weekend I implemented functions to check various interworking behaviour, and run these when kprobes is first initialised. So I now have things all simulated correctly and I think I may as well leave things in there now.
Do you mean you check the CPU ID etc?
I'm not sure that experimentally running instructions to see what happens is entirely safe, but that may not be what you're suggesting.
For "add pc, pc, #3"
#if __LINUX_ARM_ARCH__ <= 5
Assume the kernel can't be running on ARMv6 or later, so don't interwork.
#else
Execute "add pc, pc, #3" and see what happens.
#endif
For the case of "ldr pc, [...]", you are right, trying to execute this on ARMv4 is UNPREDICTABLE, so I need to think again. Possibly just try a CLZ instruction and see if we get an undef abort, that would tell us we're on ARMv5?
On Tue, 5 Jul 2011, Tixy wrote:
For the case of "ldr pc, [...]", you are right, trying to execute this on ARMv4 is UNPREDICTABLE, so I need to think again. Possibly just try a CLZ instruction and see if we get an undef abort, that would tell us we're on ARMv5?
Look at cpu_architecture() in arch/arm/kernel/setup.c which should tell you at run time the actual architecture being used.
Nicolas
On Tue, 2011-07-05 at 10:52 -0400, Nicolas Pitre wrote:
Look at cpu_architecture() in arch/arm/kernel/setup.c which should tell you at run time the actual architecture being used.
Just what I needed all along! Though what can I do if it returns CPU_ARCH_UNKNOWN? Assume that the ARCH is at least v7? Or resort to OOPsing at runtime if PC&1 is true?
On Tue, 5 Jul 2011, Tixy wrote:
On Tue, 2011-07-05 at 10:52 -0400, Nicolas Pitre wrote:
Look at cpu_architecture() in arch/arm/kernel/setup.c which should tell you at run time the actual architecture being used.
Just what I needed all along! Though what can I do if it returns CPU_ARCH_UNKNOWN?
Just call BUG() right away. This shouldn't happen.
Nicolas
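Putting the suggestions together, the boot-time check could be driven by the reported architecture rather than by executing test instructions. The following is a userspace sketch under stated assumptions: `enum arch_id` is a simplified stand-in for the kernel's CPU_ARCH_* values as returned by cpu_architecture(), `probe_interworking()` is a hypothetical helper, and `abort()` stands in for BUG().

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's CPU_ARCH_* identifiers. */
enum arch_id {
    ARCH_UNKNOWN,
    ARCH_V4,
    ARCH_V4T,
    ARCH_V5,
    ARCH_V5T,
    ARCH_V6,
    ARCH_V7
};

struct interwork_caps {
    bool load_write_pc;  /* "ldr pc, [...]" honours bit 0 of the loaded value */
    bool alu_write_pc;   /* ARM-state "add pc, ..." honours bit 0 of the result */
};

/* Derive interworking behaviour from the architecture reported at run time. */
static struct interwork_caps probe_interworking(enum arch_id arch)
{
    struct interwork_caps caps;

    if (arch == ARCH_UNKNOWN)
        abort();  /* userspace stand-in for BUG(): this shouldn't happen */

    caps.load_write_pc = arch >= ARCH_V5T;  /* loads to PC interwork from v5T */
    caps.alu_write_pc  = arch >= ARCH_V7;   /* ALU writes interwork from v7 */
    return caps;
}
```

Keying off the architecture value avoids executing instructions whose behaviour is UNPREDICTABLE on older cores, which was the concern raised about experimental probing.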
linaro-kernel@lists.linaro.org