The 2024 architecture release includes a number of data processing extensions, mostly SVE and SME additions with a few others. These are all very straightforward extensions which add new instructions but no architectural state, so they only need hwcaps and exposure of the ID registers to KVM guests and userspace.
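As a rough illustration of the userspace-visible effect (not part of the series itself), a program can check /proc/cpuinfo for one of the strings added later in the series, e.g. "fprcvt"; a minimal standalone sketch:

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/cpuinfo", "r");
	char line[1024];
	int found = 0;

	if (!f)
		return 1;

	/* Look for the "fprcvt" hwcap string on a Features line */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "Features", 8) && strstr(line, " fprcvt"))
			found = 1;

	fclose(f);
	printf("FEAT_FPRCVT %s\n", found ? "present" : "not reported");
	return 0;
}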
Signed-off-by: Mark Brown broonie@kernel.org --- Changes in v3: - Commit log update for the hwcap test. - Link to v2: https://lore.kernel.org/r/20241030-arm64-2024-dpisa-v2-0-b6601a15d2a5@kernel...
Changes in v2: - Filter KVM guest visible bitfields in ID_AA64ISAR3_EL1 to only those we make writeable. - Link to v1: https://lore.kernel.org/r/20241028-arm64-2024-dpisa-v1-0-a38d08b008a8@kernel...
---
Mark Brown (9):
      arm64/sysreg: Update ID_AA64PFR2_EL1 to DDI0601 2024-09
      arm64/sysreg: Update ID_AA64ISAR3_EL1 to DDI0601 2024-09
      arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09
      arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09
      arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-09
      arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09
      arm64/hwcap: Describe 2024 dpISA extensions to userspace
      KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1
      kselftest/arm64: Add 2024 dpISA extensions to hwcap test
 Documentation/arch/arm64/elf_hwcaps.rst   |  51 ++++++
 arch/arm64/include/asm/hwcap.h            |  17 ++
 arch/arm64/include/uapi/asm/hwcap.h       |  17 ++
 arch/arm64/kernel/cpufeature.c            |  35 ++++
 arch/arm64/kernel/cpuinfo.c               |  17 ++
 arch/arm64/kvm/sys_regs.c                 |   6 +-
 arch/arm64/tools/sysreg                   |  87 +++++++++-
 tools/testing/selftests/arm64/abi/hwcap.c | 273 +++++++++++++++++++++++++++++-
 8 files changed, 493 insertions(+), 10 deletions(-)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241008-arm64-2024-dpisa-8091074a7f48
Best regards,
DDI0601 2024-09 defines a new feature flag in ID_AA64PFR2_EL1 describing support for injecting UNDEF exceptions; update sysreg to include this.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/tools/sysreg | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index b081b54d6d227ed8300a6f129896647316f0b673..911f16c82ebd3ee98ffed965b02a5c6b153bc50c 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1010,7 +1010,12 @@ UnsignedEnum	35:32	FPMR
 	0b0000	NI
 	0b0001	IMP
 EndEnum
-Res0	31:12
+Res0	31:20
+UnsignedEnum	19:16	UINJ
+	0b0000	NI
+	0b0001	IMP
+EndEnum
+Res0	15:12
 UnsignedEnum	11:8	MTEFAR
 	0b0000	NI
 	0b0001	IMP
DDI0601 2024-09 defines several new feature flags in ID_AA64ISAR3_EL1, update our description in sysreg to reflect these.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/tools/sysreg | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 911f16c82ebd3ee98ffed965b02a5c6b153bc50c..c5af604eda6a721cedf5c9c68d6f7038156de651 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1566,7 +1566,23 @@ EndEnum
 EndSysreg

 Sysreg	ID_AA64ISAR3_EL1	3	0	0	6	3
-Res0	63:16
+Res0	63:32
+UnsignedEnum	31:28	FPRCVT
+	0b0000	NI
+	0b0010	IMP
+EndEnum
+UnsignedEnum	27:24	LSUI
+	0b0000	NI
+	0b0010	IMP
+EndEnum
+UnsignedEnum	23:20	OCCMO
+	0b0000	NI
+	0b0010	IMP
+EndEnum
+UnsignedEnum	19:16	LSFE
+	0b0000	NI
+	0b0010	IMP
+EndEnum
 UnsignedEnum	15:12	PACM
 	0b0000	NI
 	0b0001	TRIVIAL_IMP
On Tue, Dec 03, 2024 at 12:39:21PM +0000, Mark Brown wrote:
DDI0601 2024-09 defines several new feature flags in ID_AA64ISAR3_EL1, update our description in sysreg to reflect these.
Signed-off-by: Mark Brown broonie@kernel.org
arch/arm64/tools/sysreg | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 911f16c82ebd3ee98ffed965b02a5c6b153bc50c..c5af604eda6a721cedf5c9c68d6f7038156de651 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1566,7 +1566,23 @@ EndEnum
 EndSysreg

 Sysreg	ID_AA64ISAR3_EL1	3	0	0	6	3
-Res0	63:16
+Res0	63:32
+UnsignedEnum	31:28	FPRCVT
+	0b0000	NI
+	0b0010	IMP
+EndEnum
+UnsignedEnum	27:24	LSUI
+	0b0000	NI
+	0b0010	IMP
+EndEnum
+UnsignedEnum	23:20	OCCMO
+	0b0000	NI
+	0b0010	IMP
+EndEnum
+UnsignedEnum	19:16	LSFE
+	0b0000	NI
+	0b0010	IMP
These IMP encodings look wrong to me -- the document you reference in the commit message uses 0b0001 for the "implemented" cases.
Can we _please_ just generate this stuff. It feels like we've been making silly typos over and over again with the current approach so either it's hard or we're not very good at it. Either way, it should be automated.
Others have managed it [1], so it's clearly do-able.
Will
On Tue, Dec 10, 2024 at 05:09:55PM +0000, Will Deacon wrote:
Can we _please_ just generate this stuff. It feels like we've been making silly typos over and over again with the current approach so either it's hard or we're not very good at it. Either way, it should be automated.
Others have managed it [1], so it's clearly do-able.
Yes, the issues here are not technical ones. Though there are some complications - eg, IIRC the XML doesn't encode the signedness of fields like we do and there's areas where we've deliberately diverged. Given the amount of review I end up having to do of sysreg changes your reasoning is especially apparent to me. I've passed this feedback on (again).
On Tue, Dec 10, 2024 at 06:43:05PM +0000, Mark Brown wrote:
On Tue, Dec 10, 2024 at 05:09:55PM +0000, Will Deacon wrote:
Can we _please_ just generate this stuff. It feels like we've been making silly typos over and over again with the current approach so either it's hard or we're not very good at it. Either way, it should be automated.
Others have managed it [1], so it's clearly do-able.
Yes, the issues here are not technical ones. Though there are some complications - eg, IIRC the XML doesn't encode the signedness of fields like we do and there's areas where we've deliberately diverged. Given the amount of review I end up having to do of sysreg changes your reasoning is especially apparent to me. I've passed this feedback on (again).
One thing we _could_ do is have a tool (in-tree) that takes two copies of the sysreg file (i.e. before and after applying a diff) along with a copy of the XML and, for the new fields being added, shows how the XML represents those compared to the diff. It should then be relatively straightforward to flag the use of an unallocated encoding (like we had here) and also things like assigning a field name to a RES0 region.
So this wouldn't be generating the patches from the XML, but more like using the XML as an oracle in a linter.
Will
On Wed, Dec 11, 2024 at 10:40:15PM +0000, Will Deacon wrote:
On Tue, Dec 10, 2024 at 06:43:05PM +0000, Mark Brown wrote:
Yes, the issues here are not technical ones. Though there are some complications - eg, IIRC the XML doesn't encode the signedness of fields like we do and there's areas where we've deliberately diverged. Given the amount of review I end up having to do of sysreg changes your reasoning is especially apparent to me. I've passed this feedback on (again).
One thing we _could_ do is have a tool (in-tree) that takes two copies of the sysreg file (i.e. before and after applying a diff) along with a copy of the XML and, for the new fields being added, shows how the XML represents those compared to the diff. It should then be relatively straightforward to flag the use of an unallocated encoding (like we had here) and also things like assigning a field name to a RES0 region.
So this wouldn't be generating the patches from the XML, but more like using the XML as an oracle in a linter.
That'd be useful, yes - unfortunately I think that's still something I can't work on myself at the moment for the above mentioned non-technical reasons.
On Thu, Dec 12, 2024 at 11:33:05AM +0000, Mark Brown wrote:
On Wed, Dec 11, 2024 at 10:40:15PM +0000, Will Deacon wrote:
On Tue, Dec 10, 2024 at 06:43:05PM +0000, Mark Brown wrote:
Yes, the issues here are not technical ones. Though there are some complications - eg, IIRC the XML doesn't encode the signedness of fields like we do and there's areas where we've deliberately diverged. Given the amount of review I end up having to do of sysreg changes your reasoning is especially apparent to me. I've passed this feedback on (again).
One thing we _could_ do is have a tool (in-tree) that takes two copies of the sysreg file (i.e. before and after applying a diff) along with a copy of the XML and, for the new fields being added, shows how the XML represents those compared to the diff. It should then be relatively straightforward to flag the use of an unallocated encoding (like we had here) and also things like assigning a field name to a RES0 region.
So this wouldn't be generating the patches from the XML, but more like using the XML as an oracle in a linter.
That'd be useful, yes - unfortunately I think that's still something I can't work on myself at the moment for the above mentioned non-technical reasons.
Is anybody able to work on it? Without insight into the "non-technical reasons", I don't know what I'm supposed to do other than write the tool myself (which means finding some spare cycles...) or refusing to take wholesale sysreg definitions until it's been ironed out :/
Will
On Thu, Dec 19, 2024 at 03:55:48PM +0000, Will Deacon wrote:
On Thu, Dec 12, 2024 at 11:33:05AM +0000, Mark Brown wrote:
That'd be useful, yes - unfortunately I think that's still something I can't work on myself at the moment for the above mentioned non-technical reasons.
Is anybody able to work on it? Without insight into the "non-technical reasons", I don't know what I'm supposed to do other than write the tool myself (which means finding some spare cycles...) or refusing to take wholesale sysreg definitions until it's been ironed out :/
Similar issues will apply to anyone at Arm as things currently stand.
On Thu, Dec 19, 2024 at 04:39:11PM +0000, Mark Brown wrote:
On Thu, Dec 19, 2024 at 03:55:48PM +0000, Will Deacon wrote:
On Thu, Dec 12, 2024 at 11:33:05AM +0000, Mark Brown wrote:
That'd be useful, yes - unfortunately I think that's still something I can't work on myself at the moment for the above mentioned non-technical reasons.
Is anybody able to work on it? Without insight into the "non-technical reasons", I don't know what I'm supposed to do other than write the tool myself (which means finding some spare cycles...) or refusing to take wholesale sysreg definitions until it's been ironed out :/
Similar issues will apply to anyone at Arm as things currently stand.
Oh, actually - shortly after I sent this mail I got a notification that there's now an "Open Source Machine Readable Data" package at:
https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
as part of the 2024-12 release which should unblock this, just in time for Christmas. There's just the small matter of free time to resolve!
On Thu, Dec 19, 2024 at 04:49:05PM +0000, Mark Brown wrote:
On Thu, Dec 19, 2024 at 04:39:11PM +0000, Mark Brown wrote:
On Thu, Dec 19, 2024 at 03:55:48PM +0000, Will Deacon wrote:
On Thu, Dec 12, 2024 at 11:33:05AM +0000, Mark Brown wrote:
That'd be useful, yes - unfortunately I think that's still something I can't work on myself at the moment for the above mentioned non-technical reasons.
Is anybody able to work on it? Without insight into the "non-technical reasons", I don't know what I'm supposed to do other than write the tool myself (which means finding some spare cycles...) or refusing to take wholesale sysreg definitions until it's been ironed out :/
Similar issues will apply to anyone at Arm as things currently stand.
Oh, actually - shortly after I sent this mail I got a notification that there's now an "Open Source Machine Readable Data" package at:
https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
as part of the 2024-12 release which should unblock this, just in time for Christmas. There's just the small matter of free time to resolve!
<party.gif>
That's great! Hopefully we can knock up some basic linter tools in the new year.
Thanks,
Will
DDI0601 2024-09 defines two new feature flags in ID_AA64FPFR0_EL1 describing new FP8 operations, describe them in sysreg.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/tools/sysreg | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index c5af604eda6a721cedf5c9c68d6f7038156de651..b44ab511cf5d9d33efd7dca304d0e2f53ce47810 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1180,7 +1180,15 @@ UnsignedEnum	28	F8DP2
 	0b0	NI
 	0b1	IMP
 EndEnum
-Res0	27:2
+UnsignedEnum	27	F8MM8
+	0b0	NI
+	0b1	IMP
+EndEnum
+UnsignedEnum	26	F8MM4
+	0b0	NI
+	0b1	IMP
+EndEnum
+Res0	25:2
 UnsignedEnum	1	F8E4M3
 	0b0	NI
 	0b1	IMP
DDI0601 2024-09 introduces SVE 2.2 as well as a few new optional features, update sysreg to reflect the changes in ID_AA64ZFR0_EL1 enumerating them.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/tools/sysreg | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index b44ab511cf5d9d33efd7dca304d0e2f53ce47810..7e6b204e83270daabd0036c8109b2fdb0e9b700a 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1040,7 +1040,10 @@ UnsignedEnum	55:52	F32MM
 	0b0000	NI
 	0b0001	IMP
 EndEnum
-Res0	51:48
+UnsignedEnum	51:48	F16MM
+	0b0000	NI
+	0b0001	IMP
+EndEnum
 UnsignedEnum	47:44	I8MM
 	0b0000	NI
 	0b0001	IMP
@@ -1058,6 +1061,7 @@ Res0	31:28
 UnsignedEnum	27:24	B16B16
 	0b0000	NI
 	0b0001	IMP
+	0b0010	BFSCALE
 EndEnum
 UnsignedEnum	23:20	BF16
 	0b0000	NI
@@ -1068,16 +1072,22 @@ UnsignedEnum	19:16	BitPerm
 	0b0000	NI
 	0b0001	IMP
 EndEnum
-Res0	15:8
+UnsignedEnum	15:12	EltPerm
+	0b0000	NI
+	0b0001	IMP
+EndEnum
+Res0	11:8
 UnsignedEnum	7:4	AES
 	0b0000	NI
 	0b0001	IMP
 	0b0010	PMULL128
+	0b0011	AES2
 EndEnum
 UnsignedEnum	3:0	SVEver
 	0b0000	IMP
 	0b0001	SVE2
 	0b0010	SVE2p1
+	0b0011	SVE2p2
 EndEnum
 EndSysreg
DDI0601 2024-09 introduces SME 2.2 as well as a few new optional features, update sysreg to reflect the changes in ID_AA64SMFR0_EL1 enumerating them.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/tools/sysreg | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 7e6b204e83270daabd0036c8109b2fdb0e9b700a..0253d3847aeb2294da04b2b0b3f33f81f32c849f 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1105,6 +1105,7 @@ UnsignedEnum	59:56	SMEver
 	0b0000	SME
 	0b0001	SME2
 	0b0010	SME2p1
+	0b0011	SME2p2
 	0b0000	IMP
 EndEnum
 UnsignedEnum	55:52	I16I64
@@ -1169,7 +1170,36 @@ UnsignedEnum	28	SF8DP2
 	0b0	NI
 	0b1	IMP
 EndEnum
-Res0	27:0
+UnsignedEnum	27	SF8MM8
+	0b0	NI
+	0b1	IMP
+EndEnum
+UnsignedEnum	26	SF8MM4
+	0b0	NI
+	0b1	IMP
+EndEnum
+UnsignedEnum	25	SBitPerm
+	0b0	NI
+	0b1	IMP
+EndEnum
+UnsignedEnum	24	AES
+	0b0	NI
+	0b1	IMP
+EndEnum
+UnsignedEnum	23	SFEXPA
+	0b0	NI
+	0b1	IMP
+EndEnum
+Res0	22:17
+UnsignedEnum	16	STMOP
+	0b0	NI
+	0b1	IMP
+EndEnum
+Res0	15:1
+UnsignedEnum	0	SMOP4
+	0b0	NI
+	0b1	IMP
+EndEnum
 EndSysreg

 Sysreg	ID_AA64FPFR0_EL1	3	0	0	4	7
DDI0601 2024-09 introduces new features which are enumerated via ID_AA64ISAR2_EL1, update the sysreg file to reflect these updates.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/tools/sysreg | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 0253d3847aeb2294da04b2b0b3f33f81f32c849f..fe55c04624de74a6c1a5e0be45363b9c46ff1340 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1556,12 +1556,16 @@ EndEnum
 UnsignedEnum	55:52	CSSC
 	0b0000	NI
 	0b0001	IMP
+	0b0010	CMPBR
 EndEnum
 UnsignedEnum	51:48	RPRFM
 	0b0000	NI
 	0b0001	IMP
 EndEnum
-Res0	47:44
+UnsignedEnum	47:44	PCDPHINT
+	0b0000	NI
+	0b0001	IMP
+EndEnum
 UnsignedEnum	43:40	PRFMSLC
 	0b0000	NI
 	0b0001	IMP
The 2024 dpISA introduces a number of architecture features, all of which only add new instructions and so only require the addition of hwcaps and ID register visibility.
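As an illustration of the intended use (not part of the patch itself), userspace can test for one of the new hwcaps with getauxval(); HWCAP_FPRCVT below matches the value added to the uapi header in this patch and is defined locally only in case older toolchain headers are in use:

#include <stdio.h>
#include <sys/auxv.h>

#ifndef HWCAP_FPRCVT
#define HWCAP_FPRCVT	(1UL << 34)	/* as added to uapi/asm/hwcap.h below */
#endif

int main(void)
{
	unsigned long hwcaps = getauxval(AT_HWCAP);

	printf("FEAT_FPRCVT %s\n",
	       (hwcaps & HWCAP_FPRCVT) ? "supported" : "not supported");
	return 0;
}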
Signed-off-by: Mark Brown broonie@kernel.org
---
 Documentation/arch/arm64/elf_hwcaps.rst | 51 +++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/hwcap.h          | 17 +++++++++++
 arch/arm64/include/uapi/asm/hwcap.h     | 17 +++++++++++
 arch/arm64/kernel/cpufeature.c          | 35 ++++++++++++++++++++++
 arch/arm64/kernel/cpuinfo.c             | 17 +++++++++++
 5 files changed, 137 insertions(+)
diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst index 2ff922a406ad83d0dff8104a6e362ac6b02d0e1f..7c99894ca3e8f5433b1a0db6a4679395e5cd9ecc 100644 --- a/Documentation/arch/arm64/elf_hwcaps.rst +++ b/Documentation/arch/arm64/elf_hwcaps.rst @@ -174,6 +174,57 @@ HWCAP_GCS Functionality implied by ID_AA64PFR1_EL1.GCS == 0b1, as described by Documentation/arch/arm64/gcs.rst.
+HWCAP_CMPBR + Functionality implied by ID_AA64ISAR2_EL1.CSSC == 0b0010. + +HWCAP_FPRCVT + Functionality implied by ID_AA64ISAR3_EL1.FPRCVT == 0b0001. + +HWCAP_F8MM8 + Functionality implied by ID_AA64FPFR0_EL1.F8MM8 == 0b0001. + +HWCAP_F8MM4 + Functionality implied by ID_AA64FPFR0_EL1.F8MM4 == 0b0001. + +HWCAP_SVE_F16MM + Functionality implied by ID_AA64ZFR0_EL1.F16MM == 0b0001. + +HWCAP_SVE_ELTPERM + Functionality implied by ID_AA64ZFR0_EL1.ELTPERM == 0b0001. + +HWCAP_SVE_AES2 + Functionality implied by ID_AA64ZFR0_EL1.AES == 0b0011. + +HWCAP_SVE_BFSCALE + Functionality implied by ID_AA64ZFR0_EL1.B16B16 == 0b0010. + +HWCAP_SVE2P2 + Functionality implied by ID_AA64ZFR0_EL1.SVEver == 0b0011. + +HWCAP_SME2P2 + Functionality implied by ID_AA64SMFR0_EL1.SMEver == 0b0011. + +HWCAP_SME_SF8MM8 + Functionality implied by ID_AA64SMFR0_EL1.SF8MM8 == 0b1. + +HWCAP_SME_SF8MM4 + Functionality implied by ID_AA64SMFR0_EL1.SF8MM4 == 0b1. + +HWCAP_SME_SBITPERM + Functionality implied by ID_AA64SMFR0_EL1.SBitPerm == 0b1. + +HWCAP_SME_AES + Functionality implied by ID_AA64SMFR0_EL1.AES == 0b1. + +HWCAP_SME_SFEXPA + Functionality implied by ID_AA64SMFR0_EL1.SFEXPA == 0b1. + +HWCAP_SME_STMOP + Functionality implied by ID_AA64SMFR0_EL1.STMOP == 0b1. + +HWCAP_SME_SMOP4 + Functionality implied by ID_AA64SMFR0_EL1.SMOP4 == 0b1. + HWCAP2_DCPODP Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h index 2b6c61c608e2cd107503b09aba5aaeab639b759a..dbec921ee39c8c897f3e1e1c84d522b5b57130bb 100644 --- a/arch/arm64/include/asm/hwcap.h +++ b/arch/arm64/include/asm/hwcap.h @@ -93,6 +93,23 @@ #define KERNEL_HWCAP_PACA __khwcap_feature(PACA) #define KERNEL_HWCAP_PACG __khwcap_feature(PACG) #define KERNEL_HWCAP_GCS __khwcap_feature(GCS) +#define KERNEL_HWCAP_CMPBR __khwcap_feature(CMPBR) +#define KERNEL_HWCAP_FPRCVT __khwcap_feature(FPRCVT) +#define KERNEL_HWCAP_F8MM8 __khwcap_feature(F8MM8) +#define KERNEL_HWCAP_F8MM4 __khwcap_feature(F8MM4) +#define KERNEL_HWCAP_SVE_F16MM __khwcap_feature(SVE_F16MM) +#define KERNEL_HWCAP_SVE_ELTPERM __khwcap_feature(SVE_ELTPERM) +#define KERNEL_HWCAP_SVE_AES2 __khwcap_feature(SVE_AES2) +#define KERNEL_HWCAP_SVE_BFSCALE __khwcap_feature(SVE_BFSCALE) +#define KERNEL_HWCAP_SVE2P2 __khwcap_feature(SVE2P2) +#define KERNEL_HWCAP_SME2P2 __khwcap_feature(SME2P2) +#define KERNEL_HWCAP_SME_SF8MM8 __khwcap_feature(SME_SF8MM8) +#define KERNEL_HWCAP_SME_SF8MM4 __khwcap_feature(SME_SF8MM4) +#define KERNEL_HWCAP_SME_SBITPERM __khwcap_feature(SME_SBITPERM) +#define KERNEL_HWCAP_SME_AES __khwcap_feature(SME_AES) +#define KERNEL_HWCAP_SME_SFEXPA __khwcap_feature(SME_SFEXPA) +#define KERNEL_HWCAP_SME_STMOP __khwcap_feature(SME_STMOP) +#define KERNEL_HWCAP_SME_SMOP4 __khwcap_feature(SME_SMOP4)
#define __khwcap2_feature(x) (const_ilog2(HWCAP2_ ## x) + 64) #define KERNEL_HWCAP_DCPODP __khwcap2_feature(DCPODP) diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h index 48d46b768eaec4c307360cd3bee8b564687f4b88..61fbc88d2bfb81d0bad639ed533ac67440ae2fc4 100644 --- a/arch/arm64/include/uapi/asm/hwcap.h +++ b/arch/arm64/include/uapi/asm/hwcap.h @@ -56,6 +56,23 @@ #define HWCAP_PACA (1 << 30) #define HWCAP_PACG (1UL << 31) #define HWCAP_GCS (1UL << 32) +#define HWCAP_CMPBR (1UL << 33) +#define HWCAP_FPRCVT (1UL << 34) +#define HWCAP_F8MM8 (1UL << 35) +#define HWCAP_F8MM4 (1UL << 36) +#define HWCAP_SVE_F16MM (1UL << 37) +#define HWCAP_SVE_ELTPERM (1UL << 38) +#define HWCAP_SVE_AES2 (1UL << 39) +#define HWCAP_SVE_BFSCALE (1UL << 40) +#define HWCAP_SVE2P2 (1UL << 41) +#define HWCAP_SME2P2 (1UL << 42) +#define HWCAP_SME_SF8MM8 (1UL << 43) +#define HWCAP_SME_SF8MM4 (1UL << 44) +#define HWCAP_SME_SBITPERM (1UL << 45) +#define HWCAP_SME_AES (1UL << 46) +#define HWCAP_SME_SFEXPA (1UL << 47) +#define HWCAP_SME_STMOP (1UL << 48) +#define HWCAP_SME_SMOP4 (1UL << 49)
/* * HWCAP2 flags - for AT_HWCAP2 diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 6ce71f444ed84f9056196bb21bbfac61c9687e30..7ba73fdee6deb57cd745ff684eeb97f66d2ea85f 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -268,6 +268,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { };
static const struct arm64_ftr_bits ftr_id_aa64isar3[] = { + ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FPRCVT_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FAMINMAX_SHIFT, 4, 0), ARM64_FTR_END, }; @@ -317,6 +318,8 @@ static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = { FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F64MM_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F32MM_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F16MM_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_I8MM_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), @@ -329,6 +332,8 @@ static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = { FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_BF16_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_BitPerm_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_EltPerm_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_AES_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), @@ -373,6 +378,20 @@ static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = { FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8DP4_SHIFT, 1, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8DP2_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8MM8_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8MM4_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SBitPerm_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_AES_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SFEXPA_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_STMOP_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SMOP4_SHIFT, 1, 0), ARM64_FTR_END, };
@@ -381,6 +400,8 @@ static const struct arm64_ftr_bits ftr_id_aa64fpfr0[] = { ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8FMA_SHIFT, 1, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8DP4_SHIFT, 1, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8DP2_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8MM8_SHIFT, 1, 0), + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8MM4_SHIFT, 1, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8E4M3_SHIFT, 1, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8E5M2_SHIFT, 1, 0), ARM64_FTR_END, @@ -3092,12 +3113,15 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT), #ifdef CONFIG_ARM64_SVE HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE), + HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2p2, CAP_HWCAP, KERNEL_HWCAP_SVE2P2), HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1), HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2), HWCAP_CAP(ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES), HWCAP_CAP(ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL), + HWCAP_CAP(ID_AA64ZFR0_EL1, AES, AES2, CAP_HWCAP, KERNEL_HWCAP_SVE_AES2), HWCAP_CAP(ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM), HWCAP_CAP(ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16), + HWCAP_CAP(ID_AA64ZFR0_EL1, B16B16, BFSCALE, CAP_HWCAP, KERNEL_HWCAP_SVE_BFSCALE), HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16), HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16), HWCAP_CAP(ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3), @@ -3105,6 +3129,8 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM), HWCAP_CAP(ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM), HWCAP_CAP(ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM), + HWCAP_CAP(ID_AA64ZFR0_EL1, F16MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_F16MM), + HWCAP_CAP(ID_AA64ZFR0_EL1, EltPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_ELTPERM), #endif #ifdef CONFIG_ARM64_GCS HWCAP_CAP(ID_AA64PFR1_EL1, GCS, IMP, CAP_HWCAP, KERNEL_HWCAP_GCS), @@ -3124,6 +3150,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64MMFR0_EL1, ECV, IMP, CAP_HWCAP, KERNEL_HWCAP_ECV), HWCAP_CAP(ID_AA64MMFR1_EL1, AFP, IMP, CAP_HWCAP, KERNEL_HWCAP_AFP), HWCAP_CAP(ID_AA64ISAR2_EL1, CSSC, IMP, CAP_HWCAP, KERNEL_HWCAP_CSSC), + HWCAP_CAP(ID_AA64ISAR2_EL1, CSSC, CMPBR, CAP_HWCAP, KERNEL_HWCAP_CMPBR), HWCAP_CAP(ID_AA64ISAR2_EL1, RPRFM, IMP, CAP_HWCAP, KERNEL_HWCAP_RPRFM), HWCAP_CAP(ID_AA64ISAR2_EL1, RPRES, IMP, CAP_HWCAP, KERNEL_HWCAP_RPRES), HWCAP_CAP(ID_AA64ISAR2_EL1, WFxT, IMP, CAP_HWCAP, KERNEL_HWCAP_WFXT), @@ -3133,6 +3160,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64PFR1_EL1, SME, IMP, CAP_HWCAP, KERNEL_HWCAP_SME), HWCAP_CAP(ID_AA64SMFR0_EL1, FA64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64), HWCAP_CAP(ID_AA64SMFR0_EL1, LUTv2, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_LUTV2), + HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2p2, CAP_HWCAP, KERNEL_HWCAP_SME2P2), HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2p1, CAP_HWCAP, KERNEL_HWCAP_SME2P1), HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2, CAP_HWCAP, KERNEL_HWCAP_SME2), 
HWCAP_CAP(ID_AA64SMFR0_EL1, I16I64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64), @@ -3150,6 +3178,13 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64SMFR0_EL1, SF8FMA, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8FMA), HWCAP_CAP(ID_AA64SMFR0_EL1, SF8DP4, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8DP4), HWCAP_CAP(ID_AA64SMFR0_EL1, SF8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8DP2), + HWCAP_CAP(ID_AA64SMFR0_EL1, SF8MM8, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8MM8), + HWCAP_CAP(ID_AA64SMFR0_EL1, SF8MM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8MM4), + HWCAP_CAP(ID_AA64SMFR0_EL1, SBitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SBITPERM), + HWCAP_CAP(ID_AA64SMFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_AES), + HWCAP_CAP(ID_AA64SMFR0_EL1, SFEXPA, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SFEXPA), + HWCAP_CAP(ID_AA64SMFR0_EL1, STMOP, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_STMOP), + HWCAP_CAP(ID_AA64SMFR0_EL1, SMOP4, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SMOP4), #endif /* CONFIG_ARM64_SME */ HWCAP_CAP(ID_AA64FPFR0_EL1, F8CVT, IMP, CAP_HWCAP, KERNEL_HWCAP_F8CVT), HWCAP_CAP(ID_AA64FPFR0_EL1, F8FMA, IMP, CAP_HWCAP, KERNEL_HWCAP_F8FMA), diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index d79e88fccdfce427507e7a34c5959ce6309cbd12..9861291843d8fbcc5f8e68e2b9eaac65a0b37c22 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -145,6 +145,23 @@ static const char *const hwcap_str[] = { [KERNEL_HWCAP_SME_SF8DP4] = "smesf8dp4", [KERNEL_HWCAP_SME_SF8DP2] = "smesf8dp2", [KERNEL_HWCAP_POE] = "poe", + [KERNEL_HWCAP_CMPBR] = "cmpbr", + [KERNEL_HWCAP_FPRCVT] = "fprcvt", + [KERNEL_HWCAP_F8MM8] = "f8mm8", + [KERNEL_HWCAP_F8MM4] = "f8mm4", + [KERNEL_HWCAP_SVE_F16MM] = "svef16mm", + [KERNEL_HWCAP_SVE_ELTPERM] = "sveeltperm", + [KERNEL_HWCAP_SVE_AES2] = "sveaes2", + [KERNEL_HWCAP_SVE_BFSCALE] = "svebfscale", + [KERNEL_HWCAP_SVE2P2] = "sve2p2", + [KERNEL_HWCAP_SME2P2] = "sme2p2", + [KERNEL_HWCAP_SME_SF8MM8] = "smesf8mm8", + [KERNEL_HWCAP_SME_SF8MM4] = "smesf8mm4", + [KERNEL_HWCAP_SME_SBITPERM] = "smesbitperm", + [KERNEL_HWCAP_SME_AES] = "smeaes", + [KERNEL_HWCAP_SME_SFEXPA] = "smesfexpa", + [KERNEL_HWCAP_SME_STMOP] = "smestmop", + [KERNEL_HWCAP_SME_SMOP4] = "smesmop4", };
#ifdef CONFIG_COMPAT
ID_AA64ISAR3_EL1 is currently marked as unallocated in KVM but does have a number of bitfields defined in it. Expose FPRCVT and FAMINMAX, two simple instruction-only extensions, to guests.
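For reference only (not part of the patch), a VMM could read back the sanitised guest view of this register with KVM_GET_ONE_REG; a minimal sketch assuming an already created vcpu_fd:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>	/* pulls in asm/kvm.h for ARM64_SYS_REG() */

/* ID_AA64ISAR3_EL1 is op0=3, op1=0, CRn=0, CRm=6, op2=3 */
#define KVM_REG_ID_AA64ISAR3_EL1	ARM64_SYS_REG(3, 0, 0, 6, 3)

static int read_id_aa64isar3(int vcpu_fd, uint64_t *val)
{
	struct kvm_one_reg reg = {
		.id = KVM_REG_ID_AA64ISAR3_EL1,
		.addr = (uint64_t)(unsigned long)val,
	};

	/* With this patch only FPRCVT and FAMINMAX are visible to the guest */
	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}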
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/kvm/sys_regs.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 83c6b4a07ef56cf0ed9c8751ec80686f45dca6b2..6efbe3f4a579afd1874c4cf844c1c1249ae8b942 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1604,6 +1604,9 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
 		if (!cpus_have_final_cap(ARM64_HAS_WFXT))
 			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
 		break;
+	case SYS_ID_AA64ISAR3_EL1:
+		val &= ID_AA64ISAR3_EL1_FPRCVT | ID_AA64ISAR3_EL1_FAMINMAX;
+		break;
 	case SYS_ID_AA64MMFR2_EL1:
 		val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;
 		break;
@@ -2608,7 +2611,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	ID_WRITABLE(ID_AA64ISAR2_EL1, ~(ID_AA64ISAR2_EL1_RES0 |
 					ID_AA64ISAR2_EL1_APA3 |
 					ID_AA64ISAR2_EL1_GPA3)),
-	ID_UNALLOCATED(6,3),
+	ID_WRITABLE(ID_AA64ISAR3_EL1, (ID_AA64ISAR3_EL1_FPRCVT |
+				       ID_AA64ISAR3_EL1_FAMINMAX)),
 	ID_UNALLOCATED(6,4),
 	ID_UNALLOCATED(6,5),
 	ID_UNALLOCATED(6,6),
Add coverage of the hwcaps for the 2024 dpISA extensions to the hwcap test.
We don't actually test SIGILL generation for CMPBR since the need to branch makes it a pain to generate and the SIGILL detection would be unreliable anyway. Since this should be very unusual we provide a stub function rather than adding support for omitting the test.
The sigill functions aren't well sorted in the file so the ordering is a bit random.
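For background, the general idea behind these sigill functions is to execute one instruction from the new extension and see whether it traps; the in-tree test has its own SIGILL handling, but a minimal standalone sketch of the concept (reusing the FCVTAS encoding from fprcvt_sigill() below) looks roughly like this:

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf jb;

static void handle_sigill(int sig)
{
	siglongjmp(jb, 1);
}

int main(void)
{
	struct sigaction sa = {
		.sa_handler = handle_sigill,
	};

	sigaction(SIGILL, &sa, NULL);

	if (sigsetjmp(jb, 1)) {
		printf("FPRCVT instruction generated SIGILL\n");
		return 0;
	}

	/* FCVTAS S0, H0, as in fprcvt_sigill() below */
	asm volatile(".inst 0x1efa0000");

	printf("FPRCVT instruction executed\n");
	return 0;
}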
Signed-off-by: Mark Brown broonie@kernel.org --- tools/testing/selftests/arm64/abi/hwcap.c | 273 +++++++++++++++++++++++++++++- 1 file changed, 271 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c index 0029ed9c5c9aa4451f3d0573ee672eca993fb2f4..2a230cfa4cb4108580a16161e2df03a513710dbc 100644 --- a/tools/testing/selftests/arm64/abi/hwcap.c +++ b/tools/testing/selftests/arm64/abi/hwcap.c @@ -46,6 +46,12 @@ static void atomics_sigill(void) asm volatile(".inst 0xb82003ff" : : : ); }
+static void cmpbr_sigill(void) +{ + /* Not implemented, too complicated and unreliable anyway */ +} + + static void crc32_sigill(void) { /* CRC32W W0, W0, W1 */ @@ -82,6 +88,18 @@ static void f8fma_sigill(void) asm volatile(".inst 0xec0fc00"); }
+static void f8mm4_sigill(void) +{ + /* FMMLA V0.4SH, V0.16B, V0.16B */ + asm volatile(".inst 0x6e00ec00"); +} + +static void f8mm8_sigill(void) +{ + /* FMMLA V0.4S, V0.16B, V0.16B */ + asm volatile(".inst 0x6e80ec00"); +} + static void faminmax_sigill(void) { /* FAMIN V0.4H, V0.4H, V0.4H */ @@ -98,6 +116,12 @@ static void fpmr_sigill(void) asm volatile("mrs x0, S3_3_C4_C4_2" : : : "x0"); }
+static void fprcvt_sigill(void) +{ + /* FCVTAS S0, H0 */ + asm volatile(".inst 0x1efa0000"); +} + static void gcs_sigill(void) { unsigned long *gcspr; @@ -226,6 +250,42 @@ static void sme2p1_sigill(void) asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); }
+static void sme2p2_sigill(void) +{ + /* SMSTART SM */ + asm volatile("msr S0_3_C4_C3_3, xzr" : : : ); + + /* UXTB Z0.D, P0/Z, Z0.D */ + asm volatile(".inst 0x4c1a000" : : : ); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + +static void sme_aes_sigill(void) +{ + /* SMSTART SM */ + asm volatile("msr S0_3_C4_C3_3, xzr" : : : ); + + /* AESD z0.b, z0.b, z0.b */ + asm volatile(".inst 0x4522e400" : : : "z0"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + +static void sme_sbitperm_sigill(void) +{ + /* SMSTART SM */ + asm volatile("msr S0_3_C4_C3_3, xzr" : : : ); + + /* BDEP Z0.B, Z0.B, Z0.B */ + asm volatile(".inst 0x4500b400" : : : "z0"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + static void smei16i32_sigill(void) { /* SMSTART */ @@ -334,13 +394,73 @@ static void smesf8dp4_sigill(void) asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); }
+static void smesf8mm8_sigill(void) +{ + /* SMSTART */ + asm volatile("msr S0_3_C4_C7_3, xzr" : : : ); + + /* FMMLA V0.4S, V0.16B, V0.16B */ + asm volatile(".inst 0x6e80ec00"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + +static void smesf8mm4_sigill(void) +{ + /* SMSTART */ + asm volatile("msr S0_3_C4_C7_3, xzr" : : : ); + + /* FMMLA V0.4SH, V0.16B, V0.16B */ + asm volatile(".inst 0x6e00ec00"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + static void smesf8fma_sigill(void) { /* SMSTART */ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
- /* FMLALB V0.8H, V0.16B, V0.16B */ - asm volatile(".inst 0xec0fc00"); + /* FMLALB Z0.8H, Z0.B, Z0.B */ + asm volatile(".inst 0x64205000"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + +static void smesfexpa_sigill(void) +{ + /* SMSTART */ + asm volatile("msr S0_3_C4_C7_3, xzr" : : : ); + + /* FEXPA Z0.D, Z0.D */ + asm volatile(".inst 0x04e0b800"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + +static void smesmop4_sigill(void) +{ + /* SMSTART */ + asm volatile("msr S0_3_C4_C7_3, xzr" : : : ); + + /* SMOP4A ZA0.S, Z0.B, { Z0.B - Z1.B } */ + asm volatile(".inst 0x80108000"); + + /* SMSTOP */ + asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); +} + +static void smestmop_sigill(void) +{ + /* SMSTART */ + asm volatile("msr S0_3_C4_C7_3, xzr" : : : ); + + /* STMOPA ZA0.S, { Z0.H - Z1.H }, Z0.H, Z20[0] */ + asm volatile(".inst 0x80408008");
/* SMSTOP */ asm volatile("msr S0_3_C4_C6_3, xzr" : : : ); @@ -364,18 +484,42 @@ static void sve2p1_sigill(void) asm volatile(".inst 0x65000000" : : : "z0"); }
+static void sve2p2_sigill(void) +{ + /* NOT Z0.D, P0/Z, Z0.D */ + asm volatile(".inst 0x4cea000" : : : "z0"); +} + static void sveaes_sigill(void) { /* AESD z0.b, z0.b, z0.b */ asm volatile(".inst 0x4522e400" : : : "z0"); }
+static void sveaes2_sigill(void) +{ + /* AESD {Z0.B - Z1.B }, { Z0.B - Z1.B }, Z0.Q */ + asm volatile(".inst 0x4522ec00" : : : "z0"); +} + static void sveb16b16_sigill(void) { /* BFADD Z0.H, Z0.H, Z0.H */ asm volatile(".inst 0x65000000" : : : ); }
+static void svebfscale_sigill(void) +{ + /* BFSCALE Z0.H, P0/M, Z0.H, Z0.H */ + asm volatile(".inst 0x65098000" : : : "z0"); +} + +static void svef16mm_sigill(void) +{ + /* FMMLA Z0.S, Z0.H, Z0.H */ + asm volatile(".inst 0x6420e400"); +} + static void svepmull_sigill(void) { /* PMULLB Z0.Q, Z0.D, Z0.D */ @@ -394,6 +538,12 @@ static void svesha3_sigill(void) asm volatile(".inst 0x4203800" : : : "z0"); }
+static void sveeltperm_sigill(void) +{ + /* COMPACT Z0.B, P0, Z0.B */ + asm volatile(".inst 0x5218000" : : : "x0"); +} + static void svesm4_sigill(void) { /* SM4E Z0.S, Z0.S, Z0.S */ @@ -469,6 +619,13 @@ static const struct hwcap_data { .cpuinfo = "aes", .sigill_fn = aes_sigill, }, + { + .name = "CMPBR", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_CMPBR, + .cpuinfo = "cmpbr", + .sigill_fn = cmpbr_sigill, + }, { .name = "CRC32", .at_hwcap = AT_HWCAP, @@ -523,6 +680,20 @@ static const struct hwcap_data { .cpuinfo = "f8fma", .sigill_fn = f8fma_sigill, }, + { + .name = "F8MM8", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_F8MM8, + .cpuinfo = "f8mm8", + .sigill_fn = f8mm8_sigill, + }, + { + .name = "F8MM4", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_F8MM4, + .cpuinfo = "f8mm4", + .sigill_fn = f8mm4_sigill, + }, { .name = "FAMINMAX", .at_hwcap = AT_HWCAP2, @@ -545,6 +716,13 @@ static const struct hwcap_data { .sigill_fn = fpmr_sigill, .sigill_reliable = true, }, + { + .name = "FPRCVT", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_FPRCVT, + .cpuinfo = "fprcvt", + .sigill_fn = fprcvt_sigill, + }, { .name = "GCS", .at_hwcap = AT_HWCAP, @@ -691,6 +869,20 @@ static const struct hwcap_data { .cpuinfo = "sme2p1", .sigill_fn = sme2p1_sigill, }, + { + .name = "SME 2.2", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME2P2, + .cpuinfo = "sme2p2", + .sigill_fn = sme2p2_sigill, + }, + { + .name = "SME AES", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_AES, + .cpuinfo = "smeaes", + .sigill_fn = sme_aes_sigill, + }, { .name = "SME I16I32", .at_hwcap = AT_HWCAP2, @@ -740,6 +932,13 @@ static const struct hwcap_data { .cpuinfo = "smelutv2", .sigill_fn = smelutv2_sigill, }, + { + .name = "SME SBITPERM", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_SBITPERM, + .cpuinfo = "smesbitperm", + .sigill_fn = sme_sbitperm_sigill, + }, { .name = "SME SF8FMA", .at_hwcap = AT_HWCAP2, @@ -747,6 +946,20 @@ static const struct hwcap_data { .cpuinfo = "smesf8fma", .sigill_fn = smesf8fma_sigill, }, + { + .name = "SME SF8MM8", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_SF8MM8, + .cpuinfo = "smesf8mm8", + .sigill_fn = smesf8mm8_sigill, + }, + { + .name = "SME SF8MM4", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_SF8MM8, + .cpuinfo = "smesf8mm4", + .sigill_fn = smesf8mm4_sigill, + }, { .name = "SME SF8DP2", .at_hwcap = AT_HWCAP2, @@ -761,6 +974,27 @@ static const struct hwcap_data { .cpuinfo = "smesf8dp4", .sigill_fn = smesf8dp4_sigill, }, + { + .name = "SME SFEXPA", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_SFEXPA, + .cpuinfo = "smesfexpa", + .sigill_fn = smesfexpa_sigill, + }, + { + .name = "SME SMOP4", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_SMOP4, + .cpuinfo = "smesmop4", + .sigill_fn = smesmop4_sigill, + }, + { + .name = "SME STMOP", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SME_STMOP, + .cpuinfo = "smestmop", + .sigill_fn = smestmop_sigill, + }, { .name = "SVE", .at_hwcap = AT_HWCAP, @@ -783,6 +1017,13 @@ static const struct hwcap_data { .cpuinfo = "sve2p1", .sigill_fn = sve2p1_sigill, }, + { + .name = "SVE 2.2", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SVE2P2, + .cpuinfo = "sve2p2", + .sigill_fn = sve2p2_sigill, + }, { .name = "SVE AES", .at_hwcap = AT_HWCAP2, @@ -790,6 +1031,34 @@ static const struct hwcap_data { .cpuinfo = "sveaes", .sigill_fn = sveaes_sigill, }, + { + .name = "SVE AES2", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SVE_AES2, + .cpuinfo = "sveaes2", + .sigill_fn = sveaes2_sigill, + }, + { + .name = "SVE BFSCALE", + .at_hwcap = AT_HWCAP, + .hwcap_bit = 
HWCAP_SVE_BFSCALE, + .cpuinfo = "svebfscale", + .sigill_fn = svebfscale_sigill, + }, + { + .name = "SVE ELTPERM", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SVE_ELTPERM, + .cpuinfo = "sveeltperm", + .sigill_fn = sveeltperm_sigill, + }, + { + .name = "SVE F16MM", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SVE_F16MM, + .cpuinfo = "svef16mm", + .sigill_fn = svef16mm_sigill, + }, { .name = "SVE2 B16B16", .at_hwcap = AT_HWCAP2,