On 10/18/23 09:41, Waiman Long wrote:
On 10/18/23 05:24, Tejun Heo wrote:
Hello,
On Fri, Oct 13, 2023 at 02:11:19PM -0400, Waiman Long wrote:
When the "isolcpus" boot command line option is used to add a set of isolated CPUs, those CPUs will be excluded automatically from wq_unbound_cpumask to avoid running work functions from unbound workqueues.
Recently cpuset has been extended to allow the creation of partitions of isolated CPUs dynamically. To bring it closer to "isolcpus" in functionality, the CPUs in those isolated cpuset partitions should be excluded from wq_unbound_cpumask as well. This can currently be done by explicitly writing to the workqueue's cpumask sysfs file after creating the isolated partitions. However, this process can be error prone. Ideally, the cpuset code should be allowed to request the workqueue code to exclude those isolated CPUs from wq_unbound_cpumask so that this operation can be done automatically and the isolated CPUs will be returned to wq_unbound_cpumask after the destruction of the isolated cpuset partitions.
This patch adds a new workqueue_unbound_exclude_cpumask() to enable that. This new function will exclude the specified isolated CPUs from wq_unbound_cpumask. To be able to restore those isolated CPUs after the destruction of isolated cpuset partitions, a new wq_user_unbound_cpumask is added to store the user provided unbound cpumask, either from the boot command line options or from writing to the cpumask sysfs file. This new cpumask provides the basis for CPU exclusion.
The behavior around wq_unbound_cpumask is getting pretty inconsistent:
- Housekeeping excludes isolated CPUs on boot but allows the user to
  override it to include isolated CPUs afterwards.
- If an unbound wq's cpumask doesn't have any intersection with
  wq_unbound_cpumask, we ignore the per-wq cpumask and fall back to
  wq_unbound_cpumask.
- You're adding a masking layer on top with exclude which fails to
  set if the intersection is empty.
Can we do the following for consistency?
- User's requested_unbound_cpumask is stored separately (as in this
  patch).
- The effective wq_unbound_cpumask is determined by
  requested_unbound_cpumask & housekeeping_cpumask & cpuset_allowed_cpumask. The operation order matters. When an & operation yields an empty cpumask, the cpumask from the previous step is the effective one.
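The ordered intersection proposed above can be sketched in user space roughly as follows. This is only a toy model, not kernel code: toy_cpumask_t and toy_effective_mask() are hypothetical names, a 64-bit integer stands in for struct cpumask, and the sketch assumes the reading that evaluation stops at the first & that would yield an empty mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask_t;

/*
 * Intersect masks[0] & masks[1] & ... in order. If a step would yield an
 * empty mask, stop and keep the result of the previous step.
 * (Hypothetical helper; the caller passes the masks in precedence order,
 * e.g. requested, housekeeping, cpuset-allowed.)
 */
static toy_cpumask_t toy_effective_mask(const toy_cpumask_t *masks, size_t n)
{
	toy_cpumask_t eff;

	assert(masks != NULL && n > 0);
	eff = masks[0];
	for (size_t i = 1; i < n; i++) {
		toy_cpumask_t next = eff & masks[i];

		if (!next)
			break;	/* empty result: previous step stays effective */
		eff = next;
	}
	return eff;
}
```

With masks 0xFF, 0x0F, 0xF0 in that order, the first & leaves CPUs 0-3 and the second would leave nothing, so CPUs 0-3 remain the effective set. Swapping the order of the last two masks would leave CPUs 4-7 instead, which is why the operation order matters.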
Sure. I will do that.
I had second thoughts after taking a further look at that. First of all, cpuset_allowed_cpumask isn't relevant here and the mask can certainly contain offline CPUs. So cpu_possible_mask is the proper fallback.
With the current patch, wq_user_unbound_cpumask is set up initially as the (HK_TYPE_WQ ∩ HK_TYPE_DOMAIN) housekeeping mask and is rewritten by any subsequent write to the workqueue/cpumask sysfs file. So using wq_user_unbound_cpumask implies this precedence: the mask written by the user via sysfs, then the mask from the isolcpus or nohz_full command line option, then cpu_possible_mask. I think just falling back to wq_user_unbound_cpumask if the operation fails should be enough.
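A minimal user-space sketch of that simpler fallback might look like the following. It is a toy model under stated assumptions, not the patch itself: struct toy_wq_masks, toy_set_user_mask() and toy_exclude() are hypothetical names, a 64-bit integer stands in for struct cpumask, and a user write here simply ignores any exclusion that is currently active.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask_t;

struct toy_wq_masks {
	toy_cpumask_t user_unbound;	/* models wq_user_unbound_cpumask */
	toy_cpumask_t unbound;		/* models wq_unbound_cpumask */
};

/* Models the boot-time setup or a later sysfs write (simplified). */
static void toy_set_user_mask(struct toy_wq_masks *m, toy_cpumask_t mask)
{
	assert(m != NULL);
	m->user_unbound = mask;
	m->unbound = mask;
}

/*
 * Exclude the isolated CPUs from the user mask. If nothing would be left,
 * fall back to the user mask unchanged. Passing 0 restores all CPUs,
 * as when the isolated partitions are destroyed.
 */
static void toy_exclude(struct toy_wq_masks *m, toy_cpumask_t isolated)
{
	toy_cpumask_t remaining;

	assert(m != NULL);
	remaining = m->user_unbound & ~isolated;
	m->unbound = remaining ? remaining : m->user_unbound;
}
```

Because the exclusion is always computed from the stored user mask rather than from the current effective mask, removing an isolated partition restores the CPUs without any extra bookkeeping.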
Cheers, Longman