On platform where SVE is supported but there are less than 2 VLs available
the signal SVE change test should be skipped instead of failing.
Reported-by: Andre Przywara <andre.przywara(a)arm.com>
Tested-by: Andre Przywara <andre.przywara(a)arm.com>
Cc: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Cristian Marussi <cristian.marussi(a)arm.com>
---
.../arm64/signal/testcases/fake_sigreturn_sve_change_vl.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sve_change_vl.c b/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sve_change_vl.c
index bb50b5adbf10..915821375b0a 100644
--- a/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sve_change_vl.c
+++ b/tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sve_change_vl.c
@@ -6,6 +6,7 @@
* supported and is expected to segfault.
*/
+#include <kselftest.h>
#include <signal.h>
#include <ucontext.h>
#include <sys/prctl.h>
@@ -40,6 +41,7 @@ static bool sve_get_vls(struct tdescr *td)
/* We need at least two VLs */
if (nvls < 2) {
fprintf(stderr, "Only %d VL supported\n", nvls);
+ td->result = KSFT_SKIP;
return false;
}
--
2.36.1
V4: https://lore.kernel.org/lkml/cover.1649878359.git.reinette.chatre@intel.com/
Changes since V4 that directly impact user space:
- SGX_IOC_ENCLAVE_MODIFY_TYPES ioctl()'s struct was renamed
from struct sgx_enclave_modify_type to
struct sgx_enclave_modify_types. (Jarkko)
Details about changes since V4 that do not directly impact user space:
- Related function names were changed to match with the struct name
change:
sgx_ioc_enclave_modify_type() -> sgx_ioc_enclave_modify_types()
sgx_enclave_modify_type() -> sgx_enclave_modify_types()
- Revert a SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS parameter check that
requires read permission. The hardware does support restricting
enclave page permission to zero permissions. Replace with
permission check to ensure read permission is set when write permission
is set. This is verified early to prevent a later fault of the
instruction. (Vijay).
- Do not attempt direct reclaim if no EPC pages available during page
fault. mmap_lock is already held in page fault handler so attempting
to take it again while running sgx_reclaim_pages() has risk of
deadlock. This was discovered by lockdep during stress testing.
- Pick up Reviewed-by and Tested-by tags from Jarkko.
- Pick up Tested-by tags from Haitao after testing with Intel SGX SDK/PSW.
- Pick up Tested-by tags from Vijay after testing with Gramine.
V3: https://lore.kernel.org/lkml/cover.1648847675.git.reinette.chatre@intel.com/
Changes since V3 that directly impact user space:
- SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl()'s struct
sgx_enclave_restrict_permissions no longer provides entire secinfo,
just the new permissions in new "permissions" struct member. (Jarkko)
- Rename SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl() to
SGX_IOC_ENCLAVE_MODIFY_TYPES. (Jarkko)
- SGX_IOC_ENCLAVE_MODIFY_TYPES ioctl()'s struct sgx_enclave_modify_type
no longer provides entire secinfo, just the new page type in new
"page_type" struct member. (Jarkko)
Details about changes since V3 that do not directly impact user space:
- Add new patch to enable VA pages to be added without invoking reclaimer
directly if no EPC pages are available, failing instead. This enables
VA pages to be added with enclave's mutex held. Fixes an issue
encountered by Haitao. More details in new patch "x86/sgx: Support VA page
allocation without reclaiming".
- While refactoring, change existing code to consistently use
IS_ALIGNED(). (Jarkko)
- Many patches received a tag from Jarkko.
- Many smaller changes, please refer to individual patches.
V2: https://lore.kernel.org/lkml/cover.1644274683.git.reinette.chatre@intel.com/
Changes since V2 that directly impact user space:
- Maximum allowed permissions of dynamically added pages is RWX,
previously limited to RW. (Jarkko)
Dynamically added pages are initially created with architecturally
limited EPCM permissions of RW. mmap() and mprotect() of these pages
with RWX permissions would no longer be blocked by SGX driver. PROT_EXEC
on dynamically added pages will be possible after running ENCLU[EMODPE]
from within the enclave with appropriate VMA permissions.
- The kernel no longer attempts to track the EPCM runtime permissions. (Jarkko)
Consequences are:
- Kernel does not modify PTEs to follow EPCM permissions. User space
will receive #PF with SGX error code in cases where the V2
implementation would have resulted in regular (non-SGX) page fault
error code.
- SGX_IOC_ENCLAVE_RELAX_PERMISSIONS is removed. This ioctl() was used
to clear PTEs after permissions were modified from within the enclave
and ensure correct PTEs are installed. Since PTEs no longer track
EPCM permissions the changes in EPCM permissions would not impact PTEs.
As long as new permissions are within the maximum vetted permissions
(vm_max_prot_bits) only ENCLU[EMODPE] from within enclave is needed,
as accompanied by appropriate VMA permissions.
- struct sgx_enclave_restrict_perm renamed to
sgx_enclave_restrict_permissions (Jarkko)
- struct sgx_enclave_modt renamed to struct sgx_enclave_modify_type
to be consistent with the verbose naming of other SGX uapi structs.
Details about changes since V2 that do not directly impact user space:
- Kernel no longer tracks the runtime EPCM permissions with the aim of
installing accurate PTEs. (Jarkko)
- In support of this change the following patches were removed:
Documentation/x86: Document SGX permission details
x86/sgx: Support VMA permissions more relaxed than enclave permissions
x86/sgx: Add pfn_mkwrite() handler for present PTEs
x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes
x86/sgx: Support relaxing of enclave page permissions
- No more handling of scenarios where VMA permissions may be more
relaxed than what the EPCM allows. Enclaves are not prevented
from accessing such pages and the EPCM permissions are entrusted
to control access as supported by the SGX error code in page faults.
- No more explicit setting of protection bits in page fault handler.
Protection bits are inherited from VMA similar to SGX1 support.
- Selftest patches are moved to the end of the series. (Jarkko)
- New patch contributed by Jarkko to avoid duplicated code:
x86/sgx: Export sgx_encl_page_alloc()
- New patch separating changes from existing patch. (Jarkko)
x86/sgx: Export sgx_encl_{grow,shrink}()
- New patch to keep one required benefit from the (now removed) kernel
EPCM permission tracking:
x86/sgx: Support loading enclave page without VMA permissions check
- Updated cover letter to reflect architecture changes.
- Many smaller changes, please refer to individual patches.
V1: https://lore.kernel.org/linux-sgx/cover.1638381245.git.reinette.chatre@inte…
Changes since V1 that directly impact user space:
- SGX2 permission changes changed from a single ioctl() named
SGX_IOC_PAGE_MODP to two new ioctl()s:
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, supported by two different
parameter structures (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS does
not support a result output parameter) (Jarkko).
User space flow impact: After user space runs ENCLU[EMODPE] it
needs to call SGX_IOC_ENCLAVE_RELAX_PERMISSIONS to have PTEs
updated. Previously running SGX_IOC_PAGE_MODP in this scenario
resulted in EPCM.PR being set but calling
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS will not result in EPCM.PR
being set anymore and thus no need for an additional
ENCLU[EACCEPT].
- SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
obtain new permissions from secinfo as parameter instead of
the permissions directly (Jarkko).
- ioctl() supporting SGX2 page type change is renamed from
SGX_IOC_PAGE_MODT to SGX_IOC_ENCLAVE_MODIFY_TYPE (Jarkko).
- SGX_IOC_ENCLAVE_MODIFY_TYPE obtains new page type from secinfo
as parameter instead of the page type directly (Jarkko).
- ioctl() supporting SGX2 page removal is renamed from
SGX_IOC_PAGE_REMOVE to SGX_IOC_ENCLAVE_REMOVE_PAGES (Jarkko).
- All ioctl() parameter structures have been renamed as a result of the
ioctl() renaming:
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS => struct sgx_enclave_relax_perm
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS => struct sgx_enclave_restrict_perm
SGX_IOC_ENCLAVE_MODIFY_TYPE => struct sgx_enclave_modt
SGX_IOC_ENCLAVE_REMOVE_PAGES => struct sgx_enclave_remove_pages
Changes since V1 that do not directly impact user space:
- Number of patches in series increased from 25 to 32 primarily because
of splitting the original submission:
- Wrappers for the new SGX2 functions are introduced in three separate
patches replacing the original "x86/sgx: Add wrappers for SGX2
functions"
(Jarkko).
- Moving and renaming sgx_encl_ewb_cpumask() is done with two patches
replacing the original "x86/sgx: Use more generic name for enclave
cpumask function" (Jarkko).
- Support for SGX2 EPCM permission changes is split into two ioctls(),
one for relaxing and one for restricting permissions, each introduced
by a new patch replacing the original "x86/sgx: Support enclave page
permission changes" (Jarkko).
- Extracted code used by existing ioctls() for usage by new ioctl()s
into a new utility in new patch "x86/sgx: Create utility to validate
user provided offset and length" (Dave did not specifically ask for
this but it addresses his review feedback).
- Two new Documentation patches to support the SGX2 work
("Documentation/x86: Introduce enclave runtime management") and
a dedicated section on the enclave permission management
("Documentation/x86: Document SGX permission details") (Andy).
- Most patches were reworked to improve the language by:
* aiming to refer to exact item instead of English rephrasing (Jarkko).
* use ioctl() instead of ioctl throughout (Dave).
* Use "relaxed" instead of "exceed" when referring to permissions
(Dave).
- Improved documentation with several additions to
Documentation/x86/sgx.rst.
- Many smaller changes, please refer to individual patches.
Hi Everybody,
The current Linux kernel support for SGX includes support for SGX1 that
requires that an enclave be created with properties that accommodate all
usages over its (the enclave's) lifetime. This includes properties such
as permissions of enclave pages, the number of enclave pages, and the
number of threads supported by the enclave.
Consequences of this requirement to have the enclave be created to
accommodate all usages include:
* pages needing to support relocated code are required to have RWX
permissions for their entire lifetime,
* an enclave needs to be created with the maximum stack and heap
projected to be needed during the enclave's entire lifetime which
can be longer than the processes running within it,
* an enclave needs to be created with support for the maximum number
of threads projected to run in the enclave.
Since SGX1 a few more functions were introduced, collectively called
SGX2, that support modifications to an initialized enclave. Hardware
supporting these functions are already available as listed on
https://github.com/ayeks/SGX-hardware
This series adds support for SGX2, also referred to as Enclave Dynamic
Memory Management (EDMM). This includes:
* Support modifying EPCM permissions of regular enclave pages belonging
to an initialized enclave. Only permission restriction is supported
via a new ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS. Relaxing of
EPCM permissions can only be done from within the enclave with the
SGX instruction ENCLU[EMODPE].
* Support dynamic addition of regular enclave pages to an initialized
enclave. At creation new pages are architecturally limited to RW EPCM
permissions but will be accessible with PROT_EXEC after the enclave
runs ENCLU[EMODPE] to relax EPCM permissions to RWX.
Pages are dynamically added to an initialized enclave from the SGX
page fault handler.
* Support expanding an initialized enclave to accommodate more threads.
More threads can be accommodated by an enclave with the addition of
Thread Control Structure (TCS) pages that is done by changing the
type of regular enclave pages to TCS pages using a new ioctl()
SGX_IOC_ENCLAVE_MODIFY_TYPES.
* Support removing regular and TCS pages from an initialized enclave.
Removing pages is accomplished in two stages as supported by two new
ioctl()s SGX_IOC_ENCLAVE_MODIFY_TYPES (same ioctl() as mentioned in
previous bullet) and SGX_IOC_ENCLAVE_REMOVE_PAGES.
* Tests covering all the new flows, some edge cases, and one
comprehensive stress scenario.
No additional work is needed to support SGX2 in a virtualized
environment. All tests included in this series passed when run from
a guest as tested with the recent QEMU release based on 6.2.0
that supports SGX.
Patches 1 through 14 prepare the existing code for SGX2 support by
introducing the SGX2 functions, refactoring code, and tracking enclave
page types.
Patches 15 through 21 enable the SGX2 features and include a
Documentation patch.
Patches 22 through 31 test several scenarios of all the enabled
SGX2 features.
This series is based on v5.18-rc5 with recently submitted SGX shmem
fixes applied:
https://lore.kernel.org/linux-sgx/cover.1652131695.git.reinette.chatre@inte…
A repo with both series applied is available:
repo: https://github.com/rchatre/linux.git
branch: sgx/sgx2_submitted_v5_plus_rwx
This SGX2 series also applies directly to v5.18-rc5 if done with a 3-way merge
since it and the shmem fixes both make changes to arch/x86/kernel/cpu/sgx/encl.h
but do not have direct conflicts.
Your feedback will be greatly appreciated.
Regards,
Reinette
Jarkko Sakkinen (1):
x86/sgx: Export sgx_encl_page_alloc()
Reinette Chatre (30):
x86/sgx: Add short descriptions to ENCLS wrappers
x86/sgx: Add wrapper for SGX2 EMODPR function
x86/sgx: Add wrapper for SGX2 EMODT function
x86/sgx: Add wrapper for SGX2 EAUG function
x86/sgx: Support loading enclave page without VMA permissions check
x86/sgx: Export sgx_encl_ewb_cpumask()
x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask()
x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes()
x86/sgx: Make sgx_ipi_cb() available internally
x86/sgx: Create utility to validate user provided offset and length
x86/sgx: Keep record of SGX page type
x86/sgx: Export sgx_encl_{grow,shrink}()
x86/sgx: Support VA page allocation without reclaiming
x86/sgx: Support restricting of enclave page permissions
x86/sgx: Support adding of pages to an initialized enclave
x86/sgx: Tighten accessible memory range after enclave initialization
x86/sgx: Support modifying SGX page type
x86/sgx: Support complete page removal
x86/sgx: Free up EPC pages directly to support large page ranges
Documentation/x86: Introduce enclave runtime management section
selftests/sgx: Add test for EPCM permission changes
selftests/sgx: Add test for TCS page permission changes
selftests/sgx: Test two different SGX2 EAUG flows
selftests/sgx: Introduce dynamic entry point
selftests/sgx: Introduce TCS initialization enclave operation
selftests/sgx: Test complete changing of page type flow
selftests/sgx: Test faulty enclave behavior
selftests/sgx: Test invalid access to removed enclave page
selftests/sgx: Test reclaiming of untouched page
selftests/sgx: Page removal stress test
Documentation/x86/sgx.rst | 15 +
arch/x86/include/asm/sgx.h | 8 +
arch/x86/include/uapi/asm/sgx.h | 62 +
arch/x86/kernel/cpu/sgx/encl.c | 329 +++-
arch/x86/kernel/cpu/sgx/encl.h | 15 +-
arch/x86/kernel/cpu/sgx/encls.h | 33 +
arch/x86/kernel/cpu/sgx/ioctl.c | 641 +++++++-
arch/x86/kernel/cpu/sgx/main.c | 75 +-
arch/x86/kernel/cpu/sgx/sgx.h | 3 +
tools/testing/selftests/sgx/defines.h | 23 +
tools/testing/selftests/sgx/load.c | 41 +
tools/testing/selftests/sgx/main.c | 1435 +++++++++++++++++
tools/testing/selftests/sgx/main.h | 1 +
tools/testing/selftests/sgx/test_encl.c | 68 +
.../selftests/sgx/test_encl_bootstrap.S | 6 +
15 files changed, 2627 insertions(+), 128 deletions(-)
base-commit: 672c0c5173427e6b3e2a9bbb7be51ceeec78093a
prerequisite-patch-id: 1a738c00922b0ec865f2674c6f4f8be9ff9b1aab
prerequisite-patch-id: 792889ea9bdfae8c150b1be5c16da697bc404422
prerequisite-patch-id: 78ed2d6251ead724bcb96e0f058bb39dca9eba04
prerequisite-patch-id: cbb715e565631a146eb3cd902455ebaa5d489872
prerequisite-patch-id: 3e853bae87d94f8695a48c537ef32a516f415933
--
2.25.1
Follow the pattern used by other selftests like memfd and fall back on the
standard toolchain options to build with a system installed alsa-lib if
we don't get anything from pkg-config. This reduces our build dependencies
a bit in the common case while still allowing use of pkg-config in case
there is a need for it.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/alsa/Makefile | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/alsa/Makefile b/tools/testing/selftests/alsa/Makefile
index f64d9090426d..fd8ddce2b1a6 100644
--- a/tools/testing/selftests/alsa/Makefile
+++ b/tools/testing/selftests/alsa/Makefile
@@ -3,6 +3,9 @@
CFLAGS += $(shell pkg-config --cflags alsa)
LDLIBS += $(shell pkg-config --libs alsa)
+ifeq ($(LDLIBS),)
+LDLIBS += -lasound
+endif
TEST_GEN_PROGS := mixer-test
--
2.30.2
RFC 9131 changes default behaviour of handling RX of NA messages when the
corresponding entry is absent in the neighbour cache. The current
implementation is limited to accept just unsolicited NAs. However, the
RFC is more generic where it also accepts solicited NAs. Both types
should result in adding a STALE entry for this case.
Expand accept_untracked_na behaviour to also accept solicited NAs to
be compliant with the RFC and rename the sysctl knob to
accept_untracked_na.
Fixes: f9a2fb73318e ("net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131")
Signed-off-by: Arun Ajith S <aajith(a)arista.com>
---
This change updates the accept_unsolicited_na feature that merged to net-next
for v5.19 to be better compliant with the RFC. It also involves renaming the sysctl
knob to accept_untracked_na before shipping in a release.
Note that the behaviour table has been modifed in the code comments,
but dropped from the Documentation. This is because the table
documents behaviour that is not unique to the knob, and it is more
relevant to understanding the code. The documentation has been updated
to be unambiguous even without the table.
v2:
1. Changed commit message and subject as suggested.
2. Added Fixes tag.
3. Used en-uk spellings consistently.
4. Added a couple of missing comments.
5. Refactored patch to be smaller by avoiding early return.
6. Made the documentation more clearer.
v3:
1. Fixed build issue. (Verified make defconfig && make && make htmldocs SPHINXDIRS=networking)
Documentation/networking/ip-sysctl.rst | 23 ++++------
include/linux/ipv6.h | 2 +-
include/uapi/linux/ipv6.h | 2 +-
net/ipv6/addrconf.c | 6 +--
net/ipv6/ndisc.c | 42 +++++++++++--------
.../net/ndisc_unsolicited_na_test.sh | 23 +++++-----
6 files changed, 50 insertions(+), 48 deletions(-)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index b882d4238581..04216564a03c 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -2474,21 +2474,16 @@ drop_unsolicited_na - BOOLEAN
By default this is turned off.
-accept_unsolicited_na - BOOLEAN
- Add a new neighbour cache entry in STALE state for routers on receiving an
- unsolicited neighbour advertisement with target link-layer address option
- specified. This is as per router-side behavior documented in RFC9131.
- This has lower precedence than drop_unsolicited_na.
+accept_untracked_na - BOOLEAN
+ Add a new neighbour cache entry in STALE state for routers on receiving a
+ neighbour advertisement (either solicited or unsolicited) with target
+ link-layer address option specified if no neighbour entry is already
+ present for the advertised IPv6 address. Without this knob, NAs received
+ for untracked addresses (absent in neighbour cache) are silently ignored.
+
+ This is as per router-side behaviour documented in RFC9131.
- ==== ====== ====== ==============================================
- drop accept fwding behaviour
- ---- ------ ------ ----------------------------------------------
- 1 X X Drop NA packet and don't pass up the stack
- 0 0 X Pass NA packet up the stack, don't update NC
- 0 1 0 Pass NA packet up the stack, don't update NC
- 0 1 1 Pass NA packet up the stack, and add a STALE
- NC entry
- ==== ====== ====== ==============================================
+ This has lower precedence than drop_unsolicited_na.
This will optimize the return path for the initial off-link communication
that is initiated by a directly connected host, by ensuring that
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 38c8203d52cb..37dfdcfcdd54 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -61,7 +61,7 @@ struct ipv6_devconf {
__s32 suppress_frag_ndisc;
__s32 accept_ra_mtu;
__s32 drop_unsolicited_na;
- __s32 accept_unsolicited_na;
+ __s32 accept_untracked_na;
struct ipv6_stable_secret {
bool initialized;
struct in6_addr secret;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 549ddeaf788b..03cdbe798fe3 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -194,7 +194,7 @@ enum {
DEVCONF_IOAM6_ID,
DEVCONF_IOAM6_ID_WIDE,
DEVCONF_NDISC_EVICT_NOCARRIER,
- DEVCONF_ACCEPT_UNSOLICITED_NA,
+ DEVCONF_ACCEPT_UNTRACKED_NA,
DEVCONF_MAX
};
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ca0aa744593e..1b1932502e9e 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -5586,7 +5586,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
array[DEVCONF_IOAM6_ID] = cnf->ioam6_id;
array[DEVCONF_IOAM6_ID_WIDE] = cnf->ioam6_id_wide;
array[DEVCONF_NDISC_EVICT_NOCARRIER] = cnf->ndisc_evict_nocarrier;
- array[DEVCONF_ACCEPT_UNSOLICITED_NA] = cnf->accept_unsolicited_na;
+ array[DEVCONF_ACCEPT_UNTRACKED_NA] = cnf->accept_untracked_na;
}
static inline size_t inet6_ifla6_size(void)
@@ -7038,8 +7038,8 @@ static const struct ctl_table addrconf_sysctl[] = {
.extra2 = (void *)SYSCTL_ONE,
},
{
- .procname = "accept_unsolicited_na",
- .data = &ipv6_devconf.accept_unsolicited_na,
+ .procname = "accept_untracked_na",
+ .data = &ipv6_devconf.accept_untracked_na,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 254addad0dd3..b0dfe97ea4ee 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -979,7 +979,7 @@ static void ndisc_recv_na(struct sk_buff *skb)
struct inet6_dev *idev = __in6_dev_get(dev);
struct inet6_ifaddr *ifp;
struct neighbour *neigh;
- bool create_neigh;
+ u8 new_state;
if (skb->len < sizeof(struct nd_msg)) {
ND_PRINTK(2, warn, "NA: packet too short\n");
@@ -1000,7 +1000,7 @@ static void ndisc_recv_na(struct sk_buff *skb)
/* For some 802.11 wireless deployments (and possibly other networks),
* there will be a NA proxy and unsolicitd packets are attacks
* and thus should not be accepted.
- * drop_unsolicited_na takes precedence over accept_unsolicited_na
+ * drop_unsolicited_na takes precedence over accept_untracked_na
*/
if (!msg->icmph.icmp6_solicited && idev &&
idev->cnf.drop_unsolicited_na)
@@ -1041,25 +1041,33 @@ static void ndisc_recv_na(struct sk_buff *skb)
in6_ifa_put(ifp);
return;
}
+
+ neigh = neigh_lookup(&nd_tbl, &msg->target, dev);
+
/* RFC 9131 updates original Neighbour Discovery RFC 4861.
- * An unsolicited NA can now create a neighbour cache entry
- * on routers if it has Target LL Address option.
+ * NAs with Target LL Address option without a corresponding
+ * entry in the neighbour cache can now create a STALE neighbour
+ * cache entry on routers.
+ *
+ * entry accept fwding solicited behaviour
+ * ------- ------ ------ --------- ----------------------
+ * present X X 0 Set state to STALE
+ * present X X 1 Set state to REACHABLE
+ * absent 0 X X Do nothing
+ * absent 1 0 X Do nothing
+ * absent 1 1 X Add a new STALE entry
*
- * drop accept fwding behaviour
- * ---- ------ ------ ----------------------------------------------
- * 1 X X Drop NA packet and don't pass up the stack
- * 0 0 X Pass NA packet up the stack, don't update NC
- * 0 1 0 Pass NA packet up the stack, don't update NC
- * 0 1 1 Pass NA packet up the stack, and add a STALE
- * NC entry
* Note that we don't do a (daddr == all-routers-mcast) check.
*/
- create_neigh = !msg->icmph.icmp6_solicited && lladdr &&
- idev && idev->cnf.forwarding &&
- idev->cnf.accept_unsolicited_na;
- neigh = __neigh_lookup(&nd_tbl, &msg->target, dev, create_neigh);
+ new_state = msg->icmph.icmp6_solicited ? NUD_REACHABLE : NUD_STALE;
+ if (!neigh && lladdr &&
+ idev && idev->cnf.forwarding &&
+ idev->cnf.accept_untracked_na) {
+ neigh = neigh_create(&nd_tbl, &msg->target, dev);
+ new_state = NUD_STALE;
+ }
- if (neigh) {
+ if (neigh && !IS_ERR(neigh)) {
u8 old_flags = neigh->flags;
struct net *net = dev_net(dev);
@@ -1079,7 +1087,7 @@ static void ndisc_recv_na(struct sk_buff *skb)
}
ndisc_update(dev, neigh, lladdr,
- msg->icmph.icmp6_solicited ? NUD_REACHABLE : NUD_STALE,
+ new_state,
NEIGH_UPDATE_F_WEAK_OVERRIDE|
(msg->icmph.icmp6_override ? NEIGH_UPDATE_F_OVERRIDE : 0)|
NEIGH_UPDATE_F_OVERRIDE_ISROUTER|
diff --git a/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh b/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh
index f508657ee126..86e621b7b9c7 100755
--- a/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh
+++ b/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh
@@ -1,15 +1,14 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
-# This test is for the accept_unsolicited_na feature to
+# This test is for the accept_untracked_na feature to
# enable RFC9131 behaviour. The following is the test-matrix.
# drop accept fwding behaviour
# ---- ------ ------ ----------------------------------------------
-# 1 X X Drop NA packet and don't pass up the stack
-# 0 0 X Pass NA packet up the stack, don't update NC
-# 0 1 0 Pass NA packet up the stack, don't update NC
-# 0 1 1 Pass NA packet up the stack, and add a STALE
-# NC entry
+# 1 X X Don't update NC
+# 0 0 X Don't update NC
+# 0 1 0 Don't update NC
+# 0 1 1 Add a STALE NC entry
ret=0
# Kselftest framework requirement - SKIP code is 4.
@@ -72,7 +71,7 @@ setup()
set -e
local drop_unsolicited_na=$1
- local accept_unsolicited_na=$2
+ local accept_untracked_na=$2
local forwarding=$3
# Setup two namespaces and a veth tunnel across them.
@@ -93,7 +92,7 @@ setup()
${IP_ROUTER_EXEC} sysctl -qw \
${ROUTER_CONF}.drop_unsolicited_na=${drop_unsolicited_na}
${IP_ROUTER_EXEC} sysctl -qw \
- ${ROUTER_CONF}.accept_unsolicited_na=${accept_unsolicited_na}
+ ${ROUTER_CONF}.accept_untracked_na=${accept_untracked_na}
${IP_ROUTER_EXEC} sysctl -qw ${ROUTER_CONF}.disable_ipv6=0
${IP_ROUTER} addr add ${ROUTER_ADDR_WITH_MASK} dev ${ROUTER_INTF}
@@ -144,13 +143,13 @@ link_up() {
verify_ndisc() {
local drop_unsolicited_na=$1
- local accept_unsolicited_na=$2
+ local accept_untracked_na=$2
local forwarding=$3
neigh_show_output=$(${IP_ROUTER} neigh show \
to ${HOST_ADDR} dev ${ROUTER_INTF} nud stale)
if [ ${drop_unsolicited_na} -eq 0 ] && \
- [ ${accept_unsolicited_na} -eq 1 ] && \
+ [ ${accept_untracked_na} -eq 1 ] && \
[ ${forwarding} -eq 1 ]; then
# Neighbour entry expected to be present for 011 case
[[ ${neigh_show_output} ]]
@@ -179,14 +178,14 @@ test_unsolicited_na_combination() {
test_unsolicited_na_common $1 $2 $3
test_msg=("test_unsolicited_na: "
"drop_unsolicited_na=$1 "
- "accept_unsolicited_na=$2 "
+ "accept_untracked_na=$2 "
"forwarding=$3")
log_test $? 0 "${test_msg[*]}"
cleanup
}
test_unsolicited_na_combinations() {
- # Args: drop_unsolicited_na accept_unsolicited_na forwarding
+ # Args: drop_unsolicited_na accept_untracked_na forwarding
# Expect entry
test_unsolicited_na_combination 0 1 1
--
2.27.0
This patch series is motivated by Shuah's suggestion here:
https://lore.kernel.org/kvm/d576d8f7-980f-3bc6-87ad-5a6ae45609b8@linuxfound…
Many s390x KVM selftests do not output any information about which
tests have been run, so it's hard to say whether a test binary
contains a certain sub-test or not. To improve this situation let's
add some TAP output via the kselftest.h interface to these tests,
so that it easier to understand what has been executed or not.
v3:
- Added comments / fixed cosmetics according to Janosch's and
Janis' reviews of the v2 series
- Added Reviewed-by tags from the v2 series
v2:
- Reworked the extension checking in the first patch
- Make sure to always print the TAP 13 header in the second patch
- Reworked the SKIP printing in the third patch
Thomas Huth (4):
KVM: s390: selftests: Use TAP interface in the memop test
KVM: s390: selftests: Use TAP interface in the sync_regs test
KVM: s390: selftests: Use TAP interface in the tprot test
KVM: s390: selftests: Use TAP interface in the reset test
tools/testing/selftests/kvm/s390x/memop.c | 90 +++++++++++++++----
tools/testing/selftests/kvm/s390x/resets.c | 38 ++++++--
.../selftests/kvm/s390x/sync_regs_test.c | 87 +++++++++++++-----
tools/testing/selftests/kvm/s390x/tprot.c | 29 ++++--
4 files changed, 193 insertions(+), 51 deletions(-)
--
2.27.0
If a map is write-protected, for example by an eBPF program implementing
the bpf_map security hook, some read-like operations like show and dump
cannot be performed by bpftool even if bpftool has the right to do so.
The reason is that bpftool sets the open flags to zero, at the time it gets
a map file descriptor. The kernel interprets this as a request for full
access to the map (with read and write permissions).
The simple solution is to set only the necessary open flags for a requested
operation, so that only those operations requiring more privileges than the
ones granted by the enforcing eBPF programs are denied.
There are different ways to solve the problem. One would be to introduce a
new function to acquire a read-only file descriptor and use it from the
functions implementing read-like operations.
Or more simply, another is to attempt to get a read-only file descriptor in
the original function when the first request with full permissions failed.
This patch set implements the second solution in patch 1, and adds a
corresponding test in patch 2. Depending on the feedback, the first
solution can be implemented.
Roberto Sassu (2):
libbpf: Retry map access with read-only permission
selftests/bpf: Add test for retrying access to map with read-only perm
tools/lib/bpf/bpf.c | 5 ++
.../bpf/prog_tests/test_map_retry_access.c | 54 +++++++++++++++++++
.../selftests/bpf/progs/map_retry_access.c | 36 +++++++++++++
3 files changed, 95 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_map_retry_access.c
create mode 100644 tools/testing/selftests/bpf/progs/map_retry_access.c
--
2.25.1
Hi,
And here comes the v5 of the HID-BPF series.
I managed to achive the same functionalities than v3 this time.
Handling per-device BPF program was "interesting" to say the least,
but I don't know if we can have a generic BPF way of handling such
situation.
The interesting bits is that now the BPF core changes are rather small,
and I am mostly using existing facilities.
I didn't managed to write selftests for the RET_PTR_TO_MEM kfunc,
because I can not call kmalloc while in a SEC("tc") program to match
what the other kfunc tests are doing.
And AFAICT, the most interesting bits would be to implement verifier
selftests, which are way out of my league, given that they are
implemented as plain bytecode.
The logic is the following (see also the last patch for some more
documentation):
- hid-bpf first preloads a BPF program in the kernel that does a few
things:
* find out which attach_btf_id are associated with our trace points
* adds a bpf_tail_call() BPF program that I can use to "call" any
other BPF program stored into a jump table
* monitors the releases of struct bpf_prog, and when there are no
other users than us, detach the bpf progs from the HID devices
- users then declare their tracepoints and then call
hid_bpf_attach_prog() in a SEC("syscall") program
- hid-bpf then calls multiple time the bpf_tail_call() program with a
different index in the jump table whenever there is an event coming
from a matching HID device
Note that I am tempted to pin an "attach_hid_program" in the bpffs so
that users don't need to declare one, but I am afraid this will be one
more API to handle, so maybe not.
I am also wondering if I should not strip out hid_bpf_jmp_table of most
of its features and implement everything as a BPF program. This might
remove the need to add the kernel light skeleton implementations of map
modifications, and might also possibly be more re-usable for other
subsystems. But every plan I do in my head involves a lot of back and
forth between the kernel and BPF to achieve the same, which doesn't feel
right. The tricky part is the RCU list of programs that is stored in each
device and also the global state of the jump table.
Anyway, something to look for in a next version if there is a push for it.
FWIW, patch 1 is something I'd like to get merged sooner. With 2
colleagues, we are also working on supporting the "revoke" functionality
of a fd for USB and for hidraw. While hidraw can be emulated with the
current features, we need the syscall kfuncs for USB, because when we
revoke a USB access, we also need to kick out the user, and for that, we
need to actually execute code in the kernel from a userspace event.
Anyway, happy reviewing.
Cheers,
Benjamin
[Patch series based on commit 68084a136420 ("selftests/bpf: Fix building bpf selftests statically")
in the bpf-next tree]
Benjamin Tissoires (17):
bpf/btf: also allow kfunc in tracing and syscall programs
bpf/verifier: allow kfunc to return an allocated mem
bpf: prepare for more bpf syscall to be used from kernel and user
space.
libbpf: add map_get_fd_by_id and map_delete_elem in light skeleton
HID: core: store the unique system identifier in hid_device
HID: export hid_report_type to uapi
HID: initial BPF implementation
selftests/bpf: add tests for the HID-bpf initial implementation
HID: bpf: allocate data memory for device_event BPF programs
selftests/bpf/hid: add test to change the report size
HID: bpf: introduce hid_hw_request()
selftests/bpf: add tests for bpf_hid_hw_request
HID: bpf: allow to change the report descriptor
selftests/bpf: add report descriptor fixup tests
samples/bpf: add new hid_mouse example
selftests/bpf: Add a test for BPF_F_INSERT_HEAD
Documentation: add HID-BPF docs
Documentation/hid/hid-bpf.rst | 528 ++++++++++
Documentation/hid/index.rst | 1 +
drivers/hid/Kconfig | 2 +
drivers/hid/Makefile | 2 +
drivers/hid/bpf/Kconfig | 19 +
drivers/hid/bpf/Makefile | 11 +
drivers/hid/bpf/entrypoints/Makefile | 88 ++
drivers/hid/bpf/entrypoints/README | 4 +
drivers/hid/bpf/entrypoints/entrypoints.bpf.c | 78 ++
.../hid/bpf/entrypoints/entrypoints.lskel.h | 782 ++++++++++++++
drivers/hid/bpf/hid_bpf_dispatch.c | 565 ++++++++++
drivers/hid/bpf/hid_bpf_dispatch.h | 28 +
drivers/hid/bpf/hid_bpf_jmp_table.c | 587 +++++++++++
drivers/hid/hid-core.c | 43 +-
include/linux/btf.h | 7 +
include/linux/hid.h | 29 +-
include/linux/hid_bpf.h | 144 +++
include/uapi/linux/hid.h | 12 +
include/uapi/linux/hid_bpf.h | 25 +
kernel/bpf/btf.c | 47 +-
kernel/bpf/syscall.c | 10 +-
kernel/bpf/verifier.c | 72 +-
samples/bpf/.gitignore | 1 +
samples/bpf/Makefile | 23 +
samples/bpf/hid_mouse.bpf.c | 134 +++
samples/bpf/hid_mouse.c | 157 +++
tools/lib/bpf/skel_internal.h | 23 +
tools/testing/selftests/bpf/config | 3 +
tools/testing/selftests/bpf/prog_tests/hid.c | 990 ++++++++++++++++++
tools/testing/selftests/bpf/progs/hid.c | 222 ++++
30 files changed, 4593 insertions(+), 44 deletions(-)
create mode 100644 Documentation/hid/hid-bpf.rst
create mode 100644 drivers/hid/bpf/Kconfig
create mode 100644 drivers/hid/bpf/Makefile
create mode 100644 drivers/hid/bpf/entrypoints/Makefile
create mode 100644 drivers/hid/bpf/entrypoints/README
create mode 100644 drivers/hid/bpf/entrypoints/entrypoints.bpf.c
create mode 100644 drivers/hid/bpf/entrypoints/entrypoints.lskel.h
create mode 100644 drivers/hid/bpf/hid_bpf_dispatch.c
create mode 100644 drivers/hid/bpf/hid_bpf_dispatch.h
create mode 100644 drivers/hid/bpf/hid_bpf_jmp_table.c
create mode 100644 include/linux/hid_bpf.h
create mode 100644 include/uapi/linux/hid_bpf.h
create mode 100644 samples/bpf/hid_mouse.bpf.c
create mode 100644 samples/bpf/hid_mouse.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/hid.c
create mode 100644 tools/testing/selftests/bpf/progs/hid.c
--
2.36.1