Hi Tejun,
On 4 January 2013 20:39, Tejun Heo <tj@kernel.org> wrote:
> I don't know either. Changing behavior subtly like this is hard. I usually try to spot some problem cases and try to identify patterns there. Once you identify a few of them, understanding and detecting other problem cases get a lot easier. In this case, maybe there are too many places to audit and the problems are too subtle, and, if we *have* to do it, the only thing we can do is implementing a debug option to make such problems more visible - say, always schedule to a different cpu on queue_work().
> That said, at this point, the patchset doesn't seem all that convincing to me and if I'm comprehending responses from others correctly that seems to be the consensus. It might be a better approach to identify the specific offending workqueue users and make them handle the situation more intelligently than trying to impose the behavior on all workqueue users. At any rate, we need way more data showing this actually helps and if so why.
I understand your concerns and, believe me, I feel the same :) I had another idea that I wanted to share.
First, the motivation behind this patchset.
Some others at Linaro and I are working on ARM's future cores: big.LITTLE systems. Here we have a few very powerful, power-hungry cores (big, currently A15s) and a few very efficient ones (LITTLE, currently A7s).
The ultimate goal is to save as much power as possible without compromising much on performance. For that, we want most of the work to run on the LITTLE cores and only the performance-demanding work on the big cores. There are multiple efforts going on in this direction. We think the A15s, or big cores, shouldn't be used for running small tasks like timers and workqueues, and this patch is an attempt towards that goal.
On top of that, we can load-balance works across the cpus that are already awake, so that they get done quickly. Also, we shouldn't wake up an idle cpu just to process a work :)
I have another idea that we can try:
queue_work_on_any_cpu().
With this we would not break any existing code, and we can try to migrate old users to this new infrastructure (at least the ones which rearm works from their work handlers). What do you say?
To take care of the cache locality issue, we can pass an argument to this routine that provides either:
- the mask of cpus to schedule this work on, OR
- the sched domain level (SD_LEVEL) of the cpus to run it on.
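To make the mask variant a bit more concrete, here is a rough sketch of the selection policy such a routine might use. It is plain userspace C, not kernel code, and everything in it is hypothetical: the pick_cpu_for_work() name, the sketch_cpumask_t type, the NR_CPUS value of 8, and the assumption that the caller can hand in a mask of idle cpus. None of it is existing workqueue API.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8                  /* hypothetical: small fixed cpu count */

/* hypothetical stand-in for a cpu mask: bit n set means cpu n is in the set */
typedef uint32_t sketch_cpumask_t;

/*
 * Pick a cpu for a work item (hypothetical policy, not a kernel interface).
 *
 * allowed:  cpus the caller accepts, e.g. for cache locality
 * idle:     cpus that are currently idle
 * this_cpu: the cpu that is queueing the work
 */
int pick_cpu_for_work(sketch_cpumask_t allowed, sketch_cpumask_t idle,
                      int this_cpu)
{
	sketch_cpumask_t valid = ((sketch_cpumask_t)1 << NR_CPUS) - 1;
	sketch_cpumask_t busy;
	int cpu;

	assert(this_cpu >= 0 && this_cpu < NR_CPUS);

	/* The queueing cpu is awake by definition; keep the work local if allowed. */
	if (allowed & ((sketch_cpumask_t)1 << this_cpu))
		return this_cpu;

	/* Otherwise take the lowest-numbered allowed cpu that is not idle. */
	busy = allowed & ~idle & valid;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (busy & ((sketch_cpumask_t)1 << cpu))
			return cpu;

	/* Every allowed cpu is idle: don't wake one up, stay on this cpu. */
	return this_cpu;
}
```

For example, if cpus 0-3 are the LITTLE ones (allowed = 0x0F), cpus 0 and 1 are idle (idle = 0x03) and the work is queued from cpu 5, this picks cpu 2. If all four LITTLE cpus are idle, it falls back to cpu 5, which matches what queue_work() does today. Real load balancing between the busy cpus is left out of the sketch.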
Waiting for your view on it :)
-- viresh