Hi,
On 29/10/25 20:08, Andrea Righi wrote:
During switching from sched_ext to fair tasks and vice-versa, we need support for intializing and removing the bandwidth contribution of either DL server.
My first and more general/design question is do we strictly need this automagic bandwidth management. We seem to agree [1] that we want to move towards explicit dl-server(s) and tasks bandwidth handling, so we might want to consider leaving the burden completely to whomever might be configuring the system.
Add support for handling these transitions.
Anyway, if we still want to do this :) ...
Moreover, remove references specific to the fair server, in preparation for adding the ext server.
v2: - wait for inactive_task_timer to fire before removing the bandwidth reservation (Juri Lelli) - add WARN_ON_ONCE(!cpus) sanity check in dl_server_apply_params() (Andrea Righi)
Co-developed-by: Joel Fernandes joelagnelf@nvidia.com Signed-off-by: Joel Fernandes joelagnelf@nvidia.com Signed-off-by: Andrea Righi arighi@nvidia.com
...
+/**
- dl_server_remove_params - Remove bandwidth reservation for a DL server
- @dl_se: The DL server entity to remove bandwidth for
- This function removes the bandwidth reservation for a DL server entity,
- cleaning up all bandwidth accounting and server state.
- Returns: 0 on success, negative error code on failure
- */
+int dl_server_remove_params(struct sched_dl_entity *dl_se,
struct rq *rq, struct rq_flags *rf)+{
- if (!dl_se->dl_server)
return 0; /* Already disabled */- /*
* First dequeue if still queued. It should not be queued since* we call this only after the last dl_server_stop().*/- if (WARN_ON_ONCE(on_dl_rq(dl_se)))
dequeue_dl_entity(dl_se, DEQUEUE_SLEEP);- if (hrtimer_try_to_cancel(&dl_se->inactive_timer) == -1) {
rq_unlock_irqrestore(rq, rf);
This seems racy. I fear the moment we release the rq lock something can slip in and the server(s) state might change?
hrtimer_cancel(&dl_se->inactive_timer);
I am not sure we actually need to force cancel the timer (but still contradicting myself every time I go back at staring at code :). The way I believe this should work 'in theory' is
- we remove a server (either automagic or user sets runtime to 0 - which is probably to fix/look at in current implementation as well btw) - current bandwidth is retained and only freed (and server reset) at 0-lag (when inactive_timer fires) - if server is activated back before 0-lag it will use it's current parameters - after 0-lag it's a new instance with new parameters
In inactive_timer() we have this behavior for simple tasks, but we skip __dl_sub() etc for servers (since we clear it up immediately).
In all this I essentially fear that if we clear parameters immediately one could be able to trick the system by quickly disabling/enabling a dl-server to let fair/scx tasks execute more than what requested (as each new enable will be seen as a new instance). But, again, I wasn't yet able to demonstrate this and I am still uncomfortably uncertain. Please Peter and others keep me honest.
Also, server parameters changes are root only, so maybe not a big deal? For scx automagic as well?
Thanks! Juri
1 - https://lore.kernel.org/lkml/aQiE1ULtInJS6X4R@jlelli-thinkpadt14gen4.remote....