From: Jeff Xu <jeffxu@google.com>
By default, memfd_create() creates a non-sealable MFD, unless the MFD_ALLOW_SEALING flag is set.
When the MFD_NOEXEC_SEAL flag was initially introduced, an MFD created with that flag was sealable, even though MFD_ALLOW_SEALING was not set. This patch changes MFD_NOEXEC_SEAL to be non-sealable by default, unless MFD_ALLOW_SEALING is explicitly set.
This change is not backward compatible. However, because MFD_NOEXEC_SEAL is new, we expect few applications to rely on MFD_NOEXEC_SEAL being sealable. In most cases, an application that needs a sealable MFD already sets MFD_ALLOW_SEALING.
Additionally, this enhances the usability of the pid namespace sysctl vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel adds MFD_NOEXEC_SEAL if memfd_create() specifies neither MFD_EXEC nor MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL makes the MFD sealable. This means that any application that does not want this behavior cannot use vm.memfd_noexec = 1 or 2 to migrate to or enforce non-executable MFDs. With this adjustment, applications can rely on sealability not being changed by vm.memfd_noexec.
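The flag defaulting described above can be sketched as a small userspace model. This is only an illustration of the paragraph, not the kernel's code: the helper name model_effective_flags() and the MODEL_* names are hypothetical, the numeric values are the ones from the UAPI header <linux/memfd.h>, and what the kernel does for vm.memfd_noexec = 0 is deliberately left out.

```c
#include <assert.h>

/* Values mirror <linux/memfd.h>; the MODEL_ prefix marks them as local
 * to this sketch so it builds even with older headers. */
#define MODEL_MFD_CLOEXEC     0x0001U
#define MODEL_MFD_NOEXEC_SEAL 0x0008U
#define MODEL_MFD_EXEC        0x0010U

/*
 * Hypothetical helper: returns the flags memfd_create() would act on for
 * a given vm.memfd_noexec value, following only the rule stated in the
 * commit message (sysctl 1 or 2 adds MFD_NOEXEC_SEAL when the caller
 * passed neither MFD_EXEC nor MFD_NOEXEC_SEAL).
 */
unsigned int model_effective_flags(int memfd_noexec, unsigned int flags)
{
	/* The sysctl only takes the values 0, 1 and 2. */
	assert(memfd_noexec >= 0 && memfd_noexec <= 2);

	if (memfd_noexec >= 1 &&
	    !(flags & (MODEL_MFD_EXEC | MODEL_MFD_NOEXEC_SEAL)))
		flags |= MODEL_MFD_NOEXEC_SEAL;

	/* Simplification: for sysctl 0 the flags are returned unchanged. */
	return flags;
}
```

Under this model, a caller that passes only MFD_CLOEXEC ends up with MFD_NOEXEC_SEAL once the sysctl is 1 or 2, which is why the sealability implied by MFD_NOEXEC_SEAL matters to applications that never asked for it.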
This patch was initially developed by Barnabás Pőcze, who used Debian Code Search and GitHub to look for potential breakages and could find only one. dbus-broker's memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING` behavior and tries to work around it [1]. This workaround will break. Luckily, this only affects the test suite; it does not affect the normal operation of dbus-broker. There is a PR with a fix [2]. In addition, David Rheinsberg raised a similar fix in [3].
[1]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8...
[2]: https://github.com/bus1/dbus-broker/pull/366
[3]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
Cc: stable@vger.kernel.org
Fixes: 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
Signed-off-by: Jeff Xu <jeffxu@google.com>
Reviewed-by: David Rheinsberg <david@readahead.eu>
---
 mm/memfd.c                                 |  9 ++++----
 tools/testing/selftests/memfd/memfd_test.c | 26 +++++++++++++++++++++-
 2 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
 
 		inode->i_mode &= ~0111;
 		file_seals = memfd_file_seals_ptr(file);
-		if (file_seals) {
-			*file_seals &= ~F_SEAL_SEAL;
+		if (file_seals)
 			*file_seals |= F_SEAL_EXEC;
-		}
-	} else if (flags & MFD_ALLOW_SEALING) {
-		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
+	}
+
+	if (flags & MFD_ALLOW_SEALING) {
 		file_seals = memfd_file_seals_ptr(file);
 		if (file_seals)
 			*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..8579a93d006b 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
 			    mfd_def_size,
 			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
 	mfd_assert_mode(fd, 0666);
-	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
 }
@@ -1169,6 +1169,14 @@ static void test_sysctl_sysctl0(void)
 	mfd_assert_has_seals(fd, 0);
 	mfd_assert_chmod(fd, 0644);
 	close(fd);
+
+	fd = mfd_assert_new("kern_memfd_sysctl_0_dfl",
+			    mfd_def_size,
+			    MFD_CLOEXEC);
+	mfd_assert_mode(fd, 0777);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL);
+	mfd_assert_chmod(fd, 0644);
+	close(fd);
 }
 
 static void test_sysctl_set_sysctl0(void)
@@ -1206,6 +1214,14 @@ static void test_sysctl_sysctl1(void)
 	mfd_assert_has_seals(fd, F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
+
+	fd = mfd_assert_new("kern_memfd_sysctl_1_noexec_nosealable",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC | F_SEAL_SEAL);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
 }
 
 static void test_sysctl_set_sysctl1(void)
@@ -1238,6 +1254,14 @@ static void test_sysctl_sysctl2(void)
 	mfd_assert_has_seals(fd, F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
+
+	fd = mfd_assert_new("kern_memfd_sysctl_2_noexec_notsealable",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC | F_SEAL_SEAL);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
 }
 
 static void test_sysctl_set_sysctl2(void)
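To make the effect of the mm/memfd.c hunk easier to follow, here is a minimal userspace model of how the initial seal mask is derived from the flags before and after the change. It is a sketch, not kernel code: the function names and MODEL_* constants are hypothetical, the numeric values mirror the UAPI headers, and the starting mask of F_SEAL_SEAL is an assumption inferred from the selftest above, which expects F_SEAL_SEAL for a memfd created with MFD_CLOEXEC alone.

```c
#include <assert.h>

/* Values mirror <linux/memfd.h> and <linux/fcntl.h>. */
#define MODEL_MFD_ALLOW_SEALING 0x0002U
#define MODEL_MFD_NOEXEC_SEAL   0x0008U
#define MODEL_F_SEAL_SEAL       0x0001U
#define MODEL_F_SEAL_EXEC       0x0020U

/* Old logic: MFD_NOEXEC_SEAL clears F_SEAL_SEAL, so the memfd is sealable. */
unsigned int model_seals_before(unsigned int flags)
{
	unsigned int seals = MODEL_F_SEAL_SEAL; /* assumed initial state */

	if (flags & MODEL_MFD_NOEXEC_SEAL) {
		seals &= ~MODEL_F_SEAL_SEAL;
		seals |= MODEL_F_SEAL_EXEC;
	} else if (flags & MODEL_MFD_ALLOW_SEALING) {
		seals &= ~MODEL_F_SEAL_SEAL;
	}
	return seals;
}

/* New logic: only MFD_ALLOW_SEALING clears F_SEAL_SEAL. */
unsigned int model_seals_after(unsigned int flags)
{
	unsigned int seals = MODEL_F_SEAL_SEAL; /* assumed initial state */

	if (flags & MODEL_MFD_NOEXEC_SEAL)
		seals |= MODEL_F_SEAL_EXEC;
	if (flags & MODEL_MFD_ALLOW_SEALING)
		seals &= ~MODEL_F_SEAL_SEAL;
	return seals;
}

/* Mirrors the expectation changed in test_noexec_seal(). */
void model_selfcheck(void)
{
	assert(model_seals_before(MODEL_MFD_NOEXEC_SEAL) == MODEL_F_SEAL_EXEC);
	assert(model_seals_after(MODEL_MFD_NOEXEC_SEAL) ==
	       (MODEL_F_SEAL_SEAL | MODEL_F_SEAL_EXEC));
}
```

With the new logic, a caller that still wants a sealable non-executable memfd passes MFD_NOEXEC_SEAL | MFD_ALLOW_SEALING, which in this model yields only F_SEAL_EXEC.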
Hi
On Fri, May 24, 2024, at 5:39 AM, jeffxu@chromium.org wrote:
> This change is not backward compatible. However, because MFD_NOEXEC_SEAL is new, we expect few applications to rely on MFD_NOEXEC_SEAL being sealable. In most cases, an application that needs a sealable MFD already sets MFD_ALLOW_SEALING.
This does not really reflect the effort that went into this. Shouldn't this be something along the lines of:
This is a non-backward compatible change. However, MFD_NOEXEC_SEAL was only recently introduced and a codesearch revealed no breaking users apart from dbus-broker unit-tests (which have a patch pending and explicitly support this change).
Looks good! Thanks! David
Hi David and Barnabás
On Fri, May 24, 2024 at 7:15 AM David Rheinsberg david@readahead.eu wrote:
> This does not really reflect the effort that went into this. Shouldn't this be something along the lines of:
> This is a non-backward compatible change. However, MFD_NOEXEC_SEAL was only recently introduced and a codesearch revealed no breaking users apart from dbus-broker unit-tests (which have a patch pending and explicitly support this change).
Actually, I think we might need to hold off on this change. With Debian Code Search, I found more code that already uses MFD_NOEXEC_SEAL without MFD_ALLOW_SEALING, e.g. systemd [1], [2], [3].
I'm not sure whether this will break more applications that have unknowingly started relying on MFD_NOEXEC_SEAL being sealable. The feature has been out for more than a year.
Would you consider my arguments in [4] for making MFDs sealable by default?
At this point, I'm willing to add documentation to clarify that MFD_NOEXEC_SEAL is sealable by default, and that an app that needs a non-sealable MFD can set F_SEAL_SEAL. Because both MFD_NOEXEC_SEAL and vm.memfd_noexec are new, I don't think it breaks the existing ABI, and vm.memfd_noexec=0 is there for backward compatibility. Besides, I honestly think there is little reason for an MFD to be non-sealable by default. There might be a few rare cases, but the majority of apps don't need that. On the flip side, having MFDs sealable by default is a nice bonus for an app: it makes the sealing feature easier to use.
What do you think?
Thanks -Jeff
[1] https://codesearch.debian.net/search?q=MFD_NOEXEC_SEAL
[2] https://codesearch.debian.net/show?file=systemd_256~rc3-5%2Fsrc%2Fhome%2Fhom...
[3] https://sources.debian.org/src/elogind/255.5-1debian1/src/shared/serialize.c...
[4] https://lore.kernel.org/lkml/CALmYWFuPBEM2DE97mQvB2eEgSO9Dvt=uO9OewMhGfhGCY6...
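The suggestion that an app can opt out of sealability by setting F_SEAL_SEAL itself could look roughly like the sketch below. The helper name create_unsealable_memfd() is hypothetical, and the sketch assumes a glibc that exposes memfd_create() and the F_*_SEALS commands (2.27 or later). It uses MFD_ALLOW_SEALING so that it behaves the same on kernels with and without MFD_NOEXEC_SEAL; the same fcntl() call would apply to a memfd that is sealable for any other reason.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a sealable memfd and immediately apply
 * F_SEAL_SEAL so that no further seals can be added by anyone holding
 * the descriptor. Returns the fd, or -1 with errno set.
 */
int create_unsealable_memfd(const char *name)
{
	int fd, saved;

	assert(name != NULL);

	fd = memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	if (fcntl(fd, F_ADD_SEALS, F_SEAL_SEAL) < 0) {
		saved = errno;	/* preserve the fcntl() error across close() */
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

After this call, fcntl(fd, F_GET_SEALS) reports F_SEAL_SEAL, and a later F_ADD_SEALS attempt is expected to fail with EPERM, as documented in fcntl(2).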
Hi
On Wednesday, May 29, 2024 at 23:30, Jeff Xu <jeffxu@google.com> wrote:
> Actually, I think we might need to hold off on this change. With Debian Code Search, I found more code that already uses MFD_NOEXEC_SEAL without MFD_ALLOW_SEALING, e.g. systemd [1], [2], [3].
Yes, I have looked at those as well, and as far as I could tell, they are not affected. Have I missed something?
Regards, Barnabás
On Wed, May 29, 2024 at 2:46 PM Barnabás Pőcze pobrn@protonmail.com wrote:
>> Actually, I think we might need to hold off on this change. With Debian Code Search, I found more code that already uses MFD_NOEXEC_SEAL without MFD_ALLOW_SEALING, e.g. systemd [1], [2], [3].
> Yes, I have looked at those as well, and as far as I could tell, they are not affected. Have I missed something?
In the examples, the MFD is created and then passed somewhere else (safe_fork_full, open_serialization_fd, etc.), so the scope and usage of the MFD isn't that clear to me; you might have checked all the use cases. In addition, MFD_NOEXEC_SEAL exists in libc and in Rust and Go libraries. I don't know whether Debian Code Search is sufficient to cover enough apps. There is a certain risk.
Fundamentally, I'm not convinced that making MFD default-non-sealable has meaningful benefit, especially when MFD_NOEXEC_SEAL is new.
On Thursday, May 30, 2024 at 0:24, Jeff Xu <jeffxu@google.com> wrote:
> In the examples, the MFD is created and then passed somewhere else (safe_fork_full, open_serialization_fd, etc.), so the scope and usage of the MFD isn't that clear to me; you might have checked all the use cases. In addition, MFD_NOEXEC_SEAL exists in libc and in Rust and Go libraries. I don't know whether Debian Code Search is sufficient to cover enough apps. There is a certain risk.
> Fundamentally, I'm not convinced that making MFD default-non-sealable has meaningful benefit, especially when MFD_NOEXEC_SEAL is new.
Certainly, there is always a risk, I did not mean to imply that there isn't. However, I believe this risk is low enough to at least warrant an attempt at eliminating this inconsistency. It can always be reverted if it turns out that the effects have been vastly underestimated by me.
So I would still like to see this change merged.
Regards, Barnabás Pőcze
Hi Barnabás
On Fri, May 31, 2024 at 11:56 AM Barnabás Pőcze pobrn@protonmail.com wrote:
> Certainly, there is always a risk, I did not mean to imply that there isn't. However, I believe this risk is low enough to at least warrant an attempt at eliminating this inconsistency. It can always be reverted if it turns out that the effects have been vastly underestimated by me.
> So I would still like to see this change merged.
MFD_NOEXEC_SEAL is a new flag, so technically the ABI is not broken. The sysctl vm.memfd_noexec=1 or 2 is meant to help with migration to and enforcement of MFD_NOEXEC_SEAL, so it will break applications if it is used prematurely; that is by design.
I think the main problem here is a lack of documentation rather than a code bug. An ABI change shouldn't be treated lightly; given the risk, I would like to keep the API the same and add the documentation instead. I think that is the best route forward.
Best Regards, -Jeff