Re: [PATCH] signal: restore the override_rlimit logic

1 Nov 2024

On Fri, Nov 01, 2024 at 03:58:07PM -0500, Eric W. Biederman wrote:
...
Roman Gushchin roman.gushchin@linux.dev writes:
...
On Fri, Nov 01, 2024 at 02:51:00PM -0500, Eric W. Biederman wrote:
...
Roman Gushchin roman.gushchin@linux.dev writes:
...
Prior to commit d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of
ucounts") UCOUNT_RLIMIT_SIGPENDING rlimit was not enforced for a class
of signals. However now it's enforced unconditionally, even if
override_rlimit is set.
Not true.
It added a limit on the number of siginfo structures that
a container may allocate.  Have you tried not limiting your
container?
...
This behavior change caused production issues.
...
For example, if the limit is reached and a process receives a SIGSEGV
signal, sigqueue_alloc fails to allocate the necessary resources for the
signal delivery, preventing the signal from being delivered with
siginfo. This prevents the process from correctly identifying the fault
address and handling the error. From the user-space perspective,
applications are unaware that the limit has been reached and that the
siginfo is effectively 'corrupted'. This can lead to unpredictable
behavior and crashes, as we observed with java applications.
Note.  There are always conditions when the allocation may fail.
The structure is allocated with __GFP_ATOMIC so it is much more likely
to fail than a typical kernel memory allocation.
But I agree it does look like there is a quality of implementation issue
here.
...
Fix this by passing override_rlimit into inc_rlimit_get_ucounts() and
skip the comparison to max there if override_rlimit is set. This
effectively restores the old behavior.
Instead please just give the container and unlimited number of siginfo
structures it can play with.
Well, personally I'd not use this limit too, but I don't think
"it's broken, userspace shouldn't use it" argument is valid.
I said if you don't want the limit don't use it.
A version of "Doctor it hurts when I do this". To which the doctor
replies "Don't do that then".
I was also asking that you test with the limit disabled (at user
namespace creation time) so that you can verify that is problem.
...
...
The maximum for rlimit(RLIM_SIGPENDING) is the rlimit(RLIM_SIGPENDING)
value when the user namespace is created.
Given that it took 3 and half years to report this.  I am going to
say this really looks like a userspace bug.
The trick here is another bug fixed by https://lkml.org/lkml/2024/10/31/185.
Basically it's a leak of the rlimit value.
If a limit is set and reached in the reality, all following signals
will not have a siginfo attached, causing applications which depend on
handling SIGSEGV to crash.
I will take a deeper look at the patch you are referring to.
...
...
Beyond that your patch is actually buggy, and should not be applied.
If we want to change the semantics and ignore the maximum number of
pending signals in a container (when override_rlimit is set) then
the code should change the computation of the max value (pegging it at
LONG_MAX) and not ignore it.
Hm, isn't the unconditional (new < 0) enough to capture the overflow?
Actually I'm not sure I understand how "long new" can be "> LONG_MAX"
anyway.
Agreed "new < 0" should catch that, but still splitting the logic
between the calculation of max and the test of max is quite confusing.
It makes much more sense to put the logic into the calculate of max.
You mean something like this?

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 046b3d57ebb4..49fcec41e5b4 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -317,11 +317,12 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type,
for (iter = ucounts; iter; iter = iter->ns->ucounts) {
                long new = atomic_long_add_return(1, &iter->rlimit[type]);
-               if (new < 0 || (!override_rlimit && (new > max)))
+               if (new < 0 || new > max)
                        goto unwind;
                if (iter == ucounts)
                        ret = new;
-               max = get_userns_rlimit_max(iter->ns, type);
+               if (!override_rlimit)
+                       max = get_userns_rlimit_max(iter->ns, type);
                /*
                 * Grab an extra ucount reference for the caller when
                 * the rlimit count was previously 0.
--
If you strongly prefer this version, I can send a v2. I like my original
version slightly better, but not a strong preference. Please, let me know.
Thanks!

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH] signal: restore the override_rlimit logic