Hi!
From: Michal Hocko mhocko@suse.com
commit b0f53dbc4bc4c371f38b14c391095a3bb8a0bb40 upstream.
Partially revert 16db3d3f1170 ("kernel/sysctl.c: threads-max observe limits") because the patch is causing a regression to any workload which needs to override the auto-tuning of the limit provided by kernel.
set_max_threads is implementing a boot time guesstimate to provide a sensible limit of the concurrently running threads so that runaways will not deplete all the memory. This is a good thing in general but there are workloads which might need to increase this limit for an application to run (reportedly WebSpher MQ is affected) and that is simply not possible after the mentioned change. It is also very dubious to override an admin decision by an estimation that doesn't have any direct relation to correctness of the kernel operation.
Fix this by dropping set_max_threads from sysctl_max_threads so any value is accepted as long as it fits into MAX_THREADS which is important to check because allowing more threads could break internal robust futex restriction. While at it, do not use MIN_THREADS as the lower boundary because it is also only a heuristic for automatic estimation and admin might have a good reason to stop new threads to be created even when below this limit.
Ok, why not, but I smell followup work could be done:
@@ -2635,7 +2635,7 @@ int sysctl_max_threads(struct ctl_table if (ret || !write) return ret;
- set_max_threads(threads);
- max_threads = threads;
AFAICT set_max_threads can now become __init.
Yes. Care to send a patch?
I'm not usually hacking in that area. Could you do that?
Plus, I don't see any locking here, should this be WRITE_ONCE() at minimum?
Why would that matter? Do you expect several root processes race to set the value?
Well, for example to warn humans that this code is accessing unlocked variable. Second, as is, code is not valid C and compilers are allowed to do strange stuff ("undefined behaviour"). Third, there are concurency checkers that will not like this one.
Best regards, Pavel