Currently, the arm64 /proc/cpuinfo format differs from that of arm, in a manner which prevents some otherwise portable applications from functioning as expected. Specifically, the "Features" line describes the 64-bit hwcaps exclusive of the 32-bit hwcaps, which causes issues for certain applications which attempt to parse /proc/cpuinfo to detect features rather than directly using the hwcaps exposed via auxval.
Additionally, the arm64 /proc/cpuinfo format only provides identifying information for a single CPU (unlike 32-bit), which is problematic for systems with heterogeneous CPUs (i.e. big.LITTLE).
This patch attempts to solve both issues. I believe the contentious part is what to do with the Features line, and for that there are a number of possibilities:
[a] Only print the 64-bit hwcaps
This would match our current behaviour. However certain 32-bit applications will not detect CPU features correctly, and could fail to launch. The appropriate hwcaps are available in auxval, but this will not be of help to existing binaries.
[b] Append the 64-bit and 32-bit hwcaps
This would allow for a consistent format. However, some human-readable hwcap names have been reused for analogous instruction set features (e.g. "aes") despite 32-bit and 64-bit instruction set support being largely unrelated per the architecture. This could lead to applications mis-detecting instruction set support on some CPUs in future, and may be misleading to a casual reader.
[c] Print different hwcaps for compat tasks
This would allow for 32-bit and 64-bit applications to function correctly. Having the format differ depending on the instruction set of the application reading /proc/cpuinfo may be misleading in some cases (e.g. a human using a 32-bit cat to read /proc/cpuinfo on a 64-bit system).
[d] Print different hwcaps dependent on the personality.
This would allow for 32-bit and 64-bit applications to function correctly, but for some 32-bit applications the personality would need to be set explicitly by the user.
This patch takes approach d, aligning with what we do for COMPART_UTS_NAME and COMPAT_ELF_PLATFORM function. Below are sample output on a 32-bit platform and a 64-bit platform before and after this patch (with and without LINUX32 personality).
Does it sound reasonable for the /proc/cpuinfo format to vary depending on the task or personality?
Are there applications for which any of these strategies will not work?
Thanks, Mark.
[1] arm, v3.17, Versatile Express A15x2 A7x3 coretile ---->8---- processor : 0 model name : ARMv7 Processor rev 1 (v7l) Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x2 CPU part : 0xc0f CPU revision : 1
processor : 1 model name : ARMv7 Processor rev 1 (v7l) Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x2 CPU part : 0xc0f CPU revision : 1
processor : 2 model name : ARMv7 Processor rev 1 (v7l) Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xc07 CPU revision : 1
processor : 3 model name : ARMv7 Processor rev 1 (v7l) Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xc07 CPU revision : 1
processor : 4 model name : ARMv7 Processor rev 1 (v7l) Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x0 CPU part : 0xc07 CPU revision : 1
Hardware : ARM-Versatile Express Revision : 0000 Serial : 0000000000000000 ----8<----
[2] arm64, v3.17, Juno platform ---->8---- Processor : AArch64 Processor rev 0 (aarch64) processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: AArch64 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
Hardware : Juno ----8<----
[3] arm64, v3.17 + this patch, Juno platform ---->8---- processor : 0 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 1 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 2 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 3 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 4 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd07 CPU revision : 0
processor : 5 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd07 CPU revision : 0 ----8<----
[4] arm64, v3.17 + this patch, Juno platform, LINUX32 personality ---->8---- processor : 0 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 1 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 2 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 3 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 0
processor : 4 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd07 CPU revision : 0
processor : 5 Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd07 CPU revision : 0 ----8<----
Mark Rutland (1): arm64: Fix up /proc/cpuinfo
arch/arm64/kernel/setup.c | 96 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 72 insertions(+), 24 deletions(-)
Commit d7a49086f263164a (arm64: cpuinfo: print info for all CPUs) attempted to clean up /proc/cpuinfo, but due to concerns regarding further changes was reverted in commit 5e39977edf6500fd (Revert "arm64: cpuinfo: print info for all CPUs").
There are two major issues with the arm64 /proc/cpuinfo format currently:
* The "Features" line describes (only) the 64-bit hwcaps, which is problematic for some 32-bit applications which attempt to parse it. As the same names are used for analogous ISA features (e.g. aes) despite these generally being architecturally unrelated, it is not possible to simply append the 64-bit and 32-bit hwcaps in a manner that might not be misleading to some applications.
Various potential solutions have appeared in vendor kernels. Typically the format of the Features line varies depending on whether the task is 32-bit.
* Information is only printed regarding a single CPU. This does not match the ARM format, and does not provide sufficient information in big.LITTLE systems where CPUs are heterogeneous. The CPU information printed is queried from the current CPU's registers, which is racy w.r.t. cross-cpu migration.
This patch attempts to solve these issues. The following changes are made:
* When a task with a LINUX32 personality attempts to read /proc/cpuinfo, the "Features" line contains the decoded 32-bit hwcaps, as with the arm port. Otherwise, the decoded 64-bit hwcaps are shown. This aligns with the behaviour of COMPAT_UTS_MACHINE and COMPAT_ELF_PLATFORM. In the absense of compat support, the Features line is empty.
The set of hwcaps injected into a task's auxval are unaffected.
* Properties are printed per-cpu, as with the ARM port. The per-cpu information is queried from pre-recorded cpu information (as used by the sanity checks).
* As with the previous attempt at fixing up /proc/cpuinfo, the hardware field is removed. The only users so far are 32-bit applications tied to particular boards, so no portable applications should be affected, and this should prevent future tying to particular boards.
The following differences remain:
* No model_name is printed, as this cannot be queried from the hardware and cannot be provided in a stable fashion. Use of the CPU {implementor,variant,part,revision} fields is sufficient to identify a CPU and is portable across arm and arm64.
* The following system-wide properties are not provided, as they are not possible to provide generally. Programs relying on these are already tied to particular (32-bit only) boards: - Hardware - Revision - Serial
No software has yet been identified for which these remaining differences are problematic.
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Greg Hackmann ghackmann@google.com Cc: Ian Campbell ijc@hellion.org.uk Cc: Serban Constantinescu serban.constantinescu@arm.com Cc: Will Deacon will.deacon@arm.com Cc: cross-distro@lists.linaro.org Cc: linux-api@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org --- arch/arm64/kernel/setup.c | 96 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 72 insertions(+), 24 deletions(-)
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index edb146d..1834b33 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -43,6 +43,7 @@ #include <linux/of_fdt.h> #include <linux/of_platform.h> #include <linux/efi.h> +#include <linux/personality.h>
#include <asm/fixmap.h> #include <asm/cpu.h> @@ -78,7 +79,6 @@ unsigned int compat_elf_hwcap2 __read_mostly; #endif
static const char *cpu_name; -static const char *machine_name; phys_addr_t __fdt_pointer __initdata;
/* @@ -310,8 +310,6 @@ static void __init setup_machine_fdt(phys_addr_t dt_phys) while (true) cpu_relax(); } - - machine_name = of_flat_dt_get_machine_name(); }
/* @@ -446,14 +444,50 @@ static const char *hwcap_str[] = { NULL };
+#ifdef CONFIG_COMPAT +static const char *compat_hwcap_str[] = { + "swp", + "half", + "thumb", + "26bit", + "fastmult", + "fpa", + "vfp", + "edsp", + "java", + "iwmmxt", + "crunch", + "thumbee", + "neon", + "vfpv3", + "vfpv3d16", + "tls", + "vfpv4", + "idiva", + "idivt", + "vfpd32", + "lpae", + "evtstrm" +}; + +static const char *compat_hwcap2_str[] = { + "aes", + "pmull", + "sha1", + "sha2", + "crc32", + NULL +}; +#endif /* CONFIG_COMPAT */ + static int c_show(struct seq_file *m, void *v) { - int i; - - seq_printf(m, "Processor\t: %s rev %d (%s)\n", - cpu_name, read_cpuid_id() & 15, ELF_PLATFORM); + int i, j;
for_each_online_cpu(i) { + struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i); + u32 midr = cpuinfo->reg_midr; + /* * glibc reads /proc/cpuinfo to determine the number of * online processors, looking for lines beginning with @@ -462,24 +496,38 @@ static int c_show(struct seq_file *m, void *v) #ifdef CONFIG_SMP seq_printf(m, "processor\t: %d\n", i); #endif - } - - /* dump out the processor features */ - seq_puts(m, "Features\t: "); - - for (i = 0; hwcap_str[i]; i++) - if (elf_hwcap & (1 << i)) - seq_printf(m, "%s ", hwcap_str[i]); - - seq_printf(m, "\nCPU implementer\t: 0x%02x\n", read_cpuid_id() >> 24); - seq_printf(m, "CPU architecture: AArch64\n"); - seq_printf(m, "CPU variant\t: 0x%x\n", (read_cpuid_id() >> 20) & 15); - seq_printf(m, "CPU part\t: 0x%03x\n", (read_cpuid_id() >> 4) & 0xfff); - seq_printf(m, "CPU revision\t: %d\n", read_cpuid_id() & 15);
- seq_puts(m, "\n"); - - seq_printf(m, "Hardware\t: %s\n", machine_name); + /* + * Dump out the common processor features in a single line. + * Userspace should read the hwcaps with getauxval(AT_HWCAP) + * rather than attempting to parse this, but there's a body of + * software which does already (at least for 32-bit). + */ + seq_puts(m, "Features\t:"); + if (personality(current->personality) == PER_LINUX32) { +#ifdef CONFIG_COMPAT + for (j = 0; compat_hwcap_str[j]; j++) + if (compat_elf_hwcap & (1 << j)) + seq_printf(m, " %s", compat_hwcap_str[j]); + + for (j = 0; compat_hwcap2_str[j]; j++) + if (compat_elf_hwcap2 & (1 << j)) + seq_printf(m, " %s", compat_hwcap2_str[j]); +#endif /* CONFIG_COMPAT */ + } else { + for (j = 0; hwcap_str[j]; j++) + if (elf_hwcap & (1 << j)) + seq_printf(m, " %s", hwcap_str[j]); + } + seq_puts(m, "\n"); + + seq_printf(m, "CPU implementer\t: 0x%02x\n", + MIDR_IMPLEMENTOR(midr)); + seq_printf(m, "CPU architecture: 8\n"); + seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr)); + seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr)); + seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr)); + }
return 0; }
Hi Mark,
On Fri, Oct 24, 2014 at 02:56:40PM +0100, Mark Rutland wrote:
Commit d7a49086f263164a (arm64: cpuinfo: print info for all CPUs) attempted to clean up /proc/cpuinfo, but due to concerns regarding further changes was reverted in commit 5e39977edf6500fd (Revert "arm64: cpuinfo: print info for all CPUs").
I've applied this on our devel branch, with a view to queuing it for -next in a week or two. I think we should also CC stable on this, so could you see how far back it applies please? (note that I already had to resolve a conflict with a patch we've got queued from Ard).
Cheers,
Will
On Thu, 2014-10-30 at 17:15 +0000, Will Deacon wrote:
Hi Mark,
On Fri, Oct 24, 2014 at 02:56:40PM +0100, Mark Rutland wrote:
Commit d7a49086f263164a (arm64: cpuinfo: print info for all CPUs) attempted to clean up /proc/cpuinfo, but due to concerns regarding further changes was reverted in commit 5e39977edf6500fd (Revert "arm64: cpuinfo: print info for all CPUs").
I've applied this on our devel branch, with a view to queuing it for -next in a week or two. I think we should also CC stable on this, so could you see how far back it applies please? (note that I already had to resolve a conflict with a patch we've got queued from Ard).
<dons Debian hat>
Since we'll be shipping 3.16 in our next release I'd love to see it go back at least as far as that ;-)
(I'd likely apply it myself even if it doesn't appear via stable).
Ian.
On Fri, Oct 24, 2014 at 02:56:39PM +0100, Mark Rutland wrote:
Currently, the arm64 /proc/cpuinfo format differs from that of arm, in a manner which prevents some otherwise portable applications from functioning as expected. Specifically, the "Features" line describes the 64-bit hwcaps exclusive of the 32-bit hwcaps, which causes issues for certain applications which attempt to parse /proc/cpuinfo to detect features rather than directly using the hwcaps exposed via auxval.
Like it or not, but every file in procfs is a userspace API, and can be parsed by any program. If we change a procfs file and a userspace program then stops working, that's our fault, and our problem to fix (by restoring the information published there in a manner which userspace can parse.)
So, as you've found some programs which rely on this, ARM64 really does need to be compatible with ARM32 in this respect.
It's unfortunate that people have decided that parsing the ELF HWCAPs from /proc/cpuinfo is an acceptable way to go, rather than using the binary information passed, but procfs is a much more visible source of information than some binary interface which you need to read man pages to find.
That's the danger of publishing information in either procfs or sysfs. Once published, it becomes part of the userspace API, and it can become hard to remove it. This is why we should /always/ think very carefully about what we expose through these filesystems.
On Fri, Oct 24, 2014 at 03:19:36PM +0100, Russell King - ARM Linux wrote:
On Fri, Oct 24, 2014 at 02:56:39PM +0100, Mark Rutland wrote:
Currently, the arm64 /proc/cpuinfo format differs from that of arm, in a manner which prevents some otherwise portable applications from functioning as expected. Specifically, the "Features" line describes the 64-bit hwcaps exclusive of the 32-bit hwcaps, which causes issues for certain applications which attempt to parse /proc/cpuinfo to detect features rather than directly using the hwcaps exposed via auxval.
Like it or not, but every file in procfs is a userspace API, and can be parsed by any program. If we change a procfs file and a userspace program then stops working, that's our fault, and our problem to fix (by restoring the information published there in a manner which userspace can parse.)
So, as you've found some programs which rely on this, ARM64 really does need to be compatible with ARM32 in this respect.
I agree, hence this RFC.
The key question is how we fix the arm64 /proc/cpuinfo format to make those programs function, without potentially breaking other applications.
It's unfortunate that people have decided that parsing the ELF HWCAPs from /proc/cpuinfo is an acceptable way to go, rather than using the binary information passed, but procfs is a much more visible source of information than some binary interface which you need to read man pages to find.
That's the danger of publishing information in either procfs or sysfs. Once published, it becomes part of the userspace API, and it can become hard to remove it. This is why we should /always/ think very carefully about what we expose through these filesystems.
Yes. We made a mistake here with the arm64 format. Hopefully there's a way by which we can keep applications happy.
For future architectures, it's probably worth putting stronger guidelines in place to prevent precisely the issues we've hit here.
Mark.
On Fri, Oct 24, 2014 at 03:24:35PM +0100, Mark Rutland wrote:
On Fri, Oct 24, 2014 at 03:19:36PM +0100, Russell King - ARM Linux wrote:
It's unfortunate that people have decided that parsing the ELF HWCAPs from /proc/cpuinfo is an acceptable way to go, rather than using the binary information passed, but procfs is a much more visible source of information than some binary interface which you need to read man pages to find.
That's the danger of publishing information in either procfs or sysfs. Once published, it becomes part of the userspace API, and it can become hard to remove it. This is why we should /always/ think very carefully about what we expose through these filesystems.
Yes. We made a mistake here with the arm64 format. Hopefully there's a way by which we can keep applications happy.
For future architectures, it's probably worth putting stronger guidelines in place to prevent precisely the issues we've hit here.
You know, I saw something like this happening years ago.
I guess that if people had listened to me when ARM64 was in the process of being released about my concerns about the ARM64 code diverging from the ARM32 code, we wouldn't be having this problem right now, because we would have been aware of the API differences much earlier on.
As ARM64 wants to be compatible with ARM32 (in that it wants to be able to run ARM32 applications), ARM64 *has* to offer a compatible user API for everything that is a user API.
That means you have to generate an ARM32 compatible /proc/cpuinfo, ARM32 compatible hwcap information, ARM32 compatible signal structures, ARM32 compatible everything else. Which means you basically need to have a copy of the ARM32 core code in ARM64, even if you want a different native ARM64 user API.
This is _exactly_ the reason why architectures like X86 decided it was silly having separated 32 and 64-bit, and why they went through a process of merging the two together. A lot of the code was identical, and a lot of the 32-bit specific code was needed for 64-bit to provide the 32-bit API.
Right now, you're finding out this the hard way, and hitting these API problems in the process, and going "oh fsck" when you hit them - quite simply because you've backed yourselves into a corner over this. You have established a different ARM64 API because you didn't want the ARM32 legacy, but then you've found that you /do/ need the ARM32 legacy. Now you're in the position of having to change the ARM64 API, possibly breaking ARM64 applications in the process.
Welcome to the problems of being separate. :)
For example, some of the differences between ARM64 and ARM32 are really trivial, and would've been nice to have them reflected into the ARM32 code base. Here's an example from __die():
- printk(KERN_EMERG "Internal error: %s: %x [#%d]" S_PREEMPT S_SMP - S_ISA "\n", str, err, ++die_counter); + pr_emerg("Internal error: %s: %x [#%d]" S_PREEMPT S_SMP "\n", + str, err, ++die_counter);
/* trap and error numbers are mostly meaningless on ARM */ - ret = notify_die(DIE_OOPS, str, regs, err, tsk->thread.trap_no, SIGSEGV); + ret = notify_die(DIE_OOPS, str, regs, err, 0, SIGSEGV); if (ret == NOTIFY_STOP) - return 1; + return ret;
print_modules(); __show_regs(regs); - printk(KERN_EMERG "Process %.*s (pid: %d, stack limit = 0x%p)\n", - TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), end_of_stack(tsk)); + pr_emerg("Process %.*s (pid: %d, stack limit = 0x%p)\n", + TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), thread + 1);
if (!user_mode(regs) || in_interrupt()) { - dump_mem(KERN_EMERG, "Stack: ", regs->ARM_sp, + dump_mem(KERN_EMERG, "Stack: ", regs->sp, THREAD_SIZE + (unsigned long)task_stack_page(tsk)); dump_backtrace(regs, tsk); dump_instr(KERN_EMERG, regs); }
- return 0; + return ret; }
Having the printk(KERN_EMERG -> pr_emerg( would be nice. What's also revealed is that you've missed out on the end_of_stack() cleanup which Rabin Vincent submitted in 2012.
I'm sure that there's plenty more instances of stuff which should be aligned between the two copies of what is essentially the same core code.
On Fri, Oct 24, 2014 at 6:56 AM, Mark Rutland mark.rutland@arm.com wrote:
[c] Print different hwcaps for compat tasks
This would allow for 32-bit and 64-bit applications to function correctly. Having the format differ depending on the instruction set of the application reading /proc/cpuinfo may be misleading in some cases (e.g. a human using a 32-bit cat to read /proc/cpuinfo on a 64-bit system).
FWIW we have an Nvidia-contributed patch in the android-3.10 kernel tree that basically does this. It works well in almost all cases.
We've only found one situation so far where this approach falls apart. One specific Android NDK app detects NEON support by running "/system/bin/cat /proc/cpuinfo" in another process and searching for the string "neon" in the output. Even though the app itself is 32-bit, /system/bin/cat is 64-bit, so it reads the "wrong" /proc/cpuinfo and decides the CPU doesn't support NEON.
From Android's perspective, the only downside of using personalities
instead is that our runtime will need to ensure 32-bit apps have the right personality. Our runtime team has said they're fine with doing this, if that's the direction the upstream kernel wants to go.
Are there applications for which any of these strategies will not work?
We've found a handful of Android apps that expect the "CPU architecture" field to be "7" specifically and get confused by the existence of ARMv8. There's not much you can do on the kernel side to support those apps, other than lying about the CPU architecture to 32-bit apps (and we drew the line at doing that).
On Fri, Oct 24, 2014 at 02:56:39PM +0100, Mark Rutland wrote:
Currently, the arm64 /proc/cpuinfo format differs from that of arm, in a manner which prevents some otherwise portable applications from functioning as expected. Specifically, the "Features" line describes the 64-bit hwcaps exclusive of the 32-bit hwcaps, which causes issues for certain applications which attempt to parse /proc/cpuinfo to detect features rather than directly using the hwcaps exposed via auxval.
Additionally, the arm64 /proc/cpuinfo format only provides identifying information for a single CPU (unlike 32-bit), which is problematic for systems with heterogeneous CPUs (i.e. big.LITTLE).
This patch attempts to solve both issues.
I'm perfectly fine with the heterogeneous CPUs part.
I believe the contentious part is what to do with the Features line, and for that there are a number of possibilities:
[a] Only print the 64-bit hwcaps
This would match our current behaviour. However certain 32-bit applications will not detect CPU features correctly, and could fail to launch. The appropriate hwcaps are available in auxval, but this will not be of help to existing binaries.
[b] Append the 64-bit and 32-bit hwcaps
This would allow for a consistent format. However, some human-readable hwcap names have been reused for analogous instruction set features (e.g. "aes") despite 32-bit and 64-bit instruction set support being largely unrelated per the architecture. This could lead to applications mis-detecting instruction set support on some CPUs in future, and may be misleading to a casual reader.
The only overlap between 32 and 64-bit is aes, pmull, sha1, sha2, crc32. An ARMv8 CPU implementing both AArch32 and AArch64 will likely support these extensions in both modes. However, "likely" is not enough and we need to get some confirmation from the architecture people. If that's the case, point [b] is not too bad.
[c] Print different hwcaps for compat tasks
This would allow for 32-bit and 64-bit applications to function correctly. Having the format differ depending on the instruction set of the application reading /proc/cpuinfo may be misleading in some cases (e.g. a human using a 32-bit cat to read /proc/cpuinfo on a 64-bit system).
Long time ago we decided that "uname -m" would report "aarch64" independent of whether it is compat or not. That's the case for x86 as well, you need PER_LINUX32 to report the actual compat UTS name.
[d] Print different hwcaps dependent on the personality.
This would allow for 32-bit and 64-bit applications to function correctly, but for some 32-bit applications the personality would need to be set explicitly by the user.
Which makes this option actually in line with the uname -m behaviour. My vote goes for [d] with option [b] as a close alternative.
[1] arm, v3.17, Versatile Express A15x2 A7x3 coretile Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
[...]
[2] arm64, v3.17, Juno platform Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
As an exercise, I'm trying to see what option [b] would look like when CONFIG_COMPAT is enabled:
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
The duplicate strings would only be listed once (evtstrm, aes, pmull, sha1, sha2, crc32). New AArch64 features that we may expect to be optional on AArch32 could be prefixed with "a64". If they are missing entirely from AArch32, (like asimd), no need for the prefix.
The advantage is that we don't need to check the personality but we have to assume that scripts would not search for substrings (sane people shouldn't do this anyway as the Features string can always be extended).
On Thu, Nov 06, 2014 at 04:43:12PM +0000, Catalin Marinas wrote:
On Fri, Oct 24, 2014 at 02:56:39PM +0100, Mark Rutland wrote:
[d] Print different hwcaps dependent on the personality.
This would allow for 32-bit and 64-bit applications to function correctly, but for some 32-bit applications the personality would need to be set explicitly by the user.
Which makes this option actually in line with the uname -m behaviour. My vote goes for [d] with option [b] as a close alternative.
[1] arm, v3.17, Versatile Express A15x2 A7x3 coretile Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
[...]
[2] arm64, v3.17, Juno platform Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
As an exercise, I'm trying to see what option [b] would look like when CONFIG_COMPAT is enabled:
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
The duplicate strings would only be listed once (evtstrm, aes, pmull, sha1, sha2, crc32). New AArch64 features that we may expect to be optional on AArch32 could be prefixed with "a64". If they are missing entirely from AArch32, (like asimd), no need for the prefix.
The advantage is that we don't need to check the personality but we have to assume that scripts would not search for substrings (sane people shouldn't do this anyway as the Features string can always be extended).
And a big disadvantage is that I can imagine AArch64 applications checking for "neon" instead of "asimd", which will break if they're run under kernels without COMPAT support enabled.
So I'm inclined to stick with Mark's patch as it is.
Will
On Thu, Nov 06, 2014 at 04:54:31PM +0000, Will Deacon wrote:
On Thu, Nov 06, 2014 at 04:43:12PM +0000, Catalin Marinas wrote:
On Fri, Oct 24, 2014 at 02:56:39PM +0100, Mark Rutland wrote:
[d] Print different hwcaps dependent on the personality.
This would allow for 32-bit and 64-bit applications to function correctly, but for some 32-bit applications the personality would need to be set explicitly by the user.
Which makes this option actually in line with the uname -m behaviour. My vote goes for [d] with option [b] as a close alternative.
[1] arm, v3.17, Versatile Express A15x2 A7x3 coretile Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
[...]
[2] arm64, v3.17, Juno platform Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
As an exercise, I'm trying to see what option [b] would look like when CONFIG_COMPAT is enabled:
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
The duplicate strings would only be listed once (evtstrm, aes, pmull, sha1, sha2, crc32). New AArch64 features that we may expect to be optional on AArch32 could be prefixed with "a64". If they are missing entirely from AArch32, (like asimd), no need for the prefix.
The advantage is that we don't need to check the personality but we have to assume that scripts would not search for substrings (sane people shouldn't do this anyway as the Features string can always be extended).
And a big disadvantage is that I can imagine AArch64 applications checking for "neon" instead of "asimd", which will break if they're run under kernels without COMPAT support enabled.
That's because they do stupid things and I they probably deserve to break ;). But I agree, there is a risk.
So I'm inclined to stick with Mark's patch as it is.
If we don't hear otherwise, I propose sometime next week we queue Mark's patch for -next.
On Thu, Nov 06, 2014 at 05:05:48PM +0000, Catalin Marinas wrote:
On Thu, Nov 06, 2014 at 04:54:31PM +0000, Will Deacon wrote:
On Thu, Nov 06, 2014 at 04:43:12PM +0000, Catalin Marinas wrote:
On Fri, Oct 24, 2014 at 02:56:39PM +0100, Mark Rutland wrote:
[d] Print different hwcaps dependent on the personality.
This would allow for 32-bit and 64-bit applications to function correctly, but for some 32-bit applications the personality would need to be set explicitly by the user.
Which makes this option actually in line with the uname -m behaviour. My vote goes for [d] with option [b] as a close alternative.
[1] arm, v3.17, Versatile Express A15x2 A7x3 coretile Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
[...]
[2] arm64, v3.17, Juno platform Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
As an exercise, I'm trying to see what option [b] would look like when CONFIG_COMPAT is enabled:
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
The duplicate strings would only be listed once (evtstrm, aes, pmull, sha1, sha2, crc32). New AArch64 features that we may expect to be optional on AArch32 could be prefixed with "a64". If they are missing entirely from AArch32, (like asimd), no need for the prefix.
The advantage is that we don't need to check the personality but we have to assume that scripts would not search for substrings (sane people shouldn't do this anyway as the Features string can always be extended).
And a big disadvantage is that I can imagine AArch64 applications checking for "neon" instead of "asimd", which will break if they're run under kernels without COMPAT support enabled.
[...]
So I'm inclined to stick with Mark's patch as it is.
If we don't hear otherwise, I propose sometime next week we queue Mark's patch for -next.
As we haven't heard otherwise:
Acked-by: Catalin Marinas catalin.marinas@arm.com