On 20 August 2012 12:50, 함명주 myungjoo.ham@samsung.com wrote:
Prepare devfreq core framework to support devices which can idle. When device idleness is detected perhaps through runtime-pm, need some mechanism to suspend devfreq load monitoring and resume when device is back online. Present code continues monitoring unless device is removed from devfreq core.
This patch introduces following updates,
- move device load monitoring logic to ondemand governor as it is specific to ondemand.
We have this ondemand governor in the mainline devfreq. However, we have (not upstreamed yet) governors with specific purpose (for GPU or for a specific chip) with load monitoring logic. Although we don't have them upstreamed yet, why don't you keep the monitoring logic sharable by other governors. (From today, I'm not objecting to have individual workqueue, but I still don't want to let each active governor reimplement the same things.)
If more governors are planned based on load monitoring, possibly ondemand is not good enough even after adding additional configurable parameters. Could you please describe one such case which demand new governor?
Please find my next comments.
GPU devfreq driver requires behaviors similar to conservative though not exactly same.
In our internal development branch, it increases frequencies differently according to the current usage (# GPU cores used and % each core activated). If it's almost 100% and the queue is filling up, it goes to max. If it's (for example) 98%, it goes to the next faster step. Besides, GPU developers want to differ the poling intervals according to the current frequency.
Another case is with bus + memory-interface case. We are going to use a custom governor for them as each bus and memory-interface has its own DVFS capability while we want them to coordinate. The two drivers may be hooked by the custom governor. We don't have a specific detail with this as it is developed by another division. However, what I know is that they are going to use their own custon governor with ondemand-like behavior in principle.
Because devfreq is meant to be used by various types of devices, we cannot assume that the default governor is "mostly" fine. Device driver writers can embed governors in their own devfreq driver code ("custom governor").
- devfreq core interacts with governors via events to perform specific actions. These events include start/stop devfreq, and frequency limit changes outside devfreq. This sets ground for adding suspend/resume events.
event_handler with START/STOP/UPDATE seem fine.
However, init() and exit() should be different from START/STOP. We do not need to do init and exit every time when we do runtime suspend/resume.
I agree and its already taken care in this patch.
DEVFREQ_GOV_START : Event for initializing and starting of devfreq governor for a device. Happens only when device is added using devfreq_add_device().
DEVFREQ_GOV_STOP: Event for stopping and uninitialization of devfreq governor for a device. Happens only when device is removed using devfreq_remove_device().
Events introduced in next patch (2/3) - DEVFREQ_GOV_SUSPEND: Event for suspending devfreq of a device. I.e suspending of load monitoring of device in case of ondemand governor. Happens only when devfreq_suspend_device() is called.
DEVFREQ_GOV_RESUME: Event for resuming devfreq of a device. I.e continue load monitoring of already suspended device in case of ondemand governor. Happens only when devfreq_resume_device() is called.
Ah.. suspend/resume events are seperated. Please never mind this.
- use per device work instead of global work to monitor device load. This enables suspend/resume of device devfreq and reduces monitoring code complexity.
It's fine to have a delayed work struct per device.
However, please try to keep as many things/features in devfreq.c as possible; i.e., reduce features and code size of governors. Because, we will be supporting various types of devices with devfreq, there will be various governors and I don't want them to have shared things reimplemented. Dealing with the delayed work at devfreq.c and let governors choose to use it (at its struct) or not should be fine.
Delayed work is of only relevance to ondemand governor - justifies why struct delayed work moved from devfreq.c to governor_simpleondemand.c. When delayed work is optional for governors to use, then let governors handle them if needed instead of devfreq core.
With this change, we will not be forcing next upcoming devfreq governors to use delayed work for load monitoring(if required) but instead flexibility to have their own mechanisms. Its up to the governors to decide how it wants to do dvfs of a device. But imperative to respond the events.
As mentioned above, please do not assume that the default governors (drivers/devfreq/governor_*.c) will be only governors available and ondemand will be the only "active" governor for the future.
I think it'd be better for the driver writes to reuse code than to copy and paste the code.
Some governors may use some other polling/sampling mechanisms; however, still, providing delayed work at the core does not mean that governors must use the default delayed work struct. They may simply not use it as userspace, performance, and powersave do.
In this patchset, the size of ondemand governor has been enlarged too much for that purpose.
I said this because I do not want governors to act as if they are core and I do not want to make writing governors too difficult for device driver writers. I want them to focus on the "policy" only, not the mechanism.
- Force devfreq users to set min/max supported frequencies in device profile to help governors to predict target frequecy with in limits.
Is this really necessary?
Yes. Mainly for two reasons,
- Devfreq core provides sysfs interface for usespace to set min/max
operating frequency for a device. These values must be within device supported min/max frequency limits.
No. they do not need to be within device supported limits.
When I say "CPU should be running at least 50MHz" when the CPU supports 1000MHz ~ 3000MHz, there is no problem. Running at 1000MHz does not contradict with the condition.
When I say "CPU should be running at least 4000MHz" for the same CPU, It is fine to set the CPU at max available.
- Governors should know device supported min/max opps and use
them where ever needed. Never assume UINT_MAX as max frequency.
The core and governors do not need to know such information. They only need to provide "recommended" frequency. The corresponding driver is going to interpret it properly. Whenever we can, it's better to remove device specific information from governors and cores as long as it does not significantly increase the complexity of drivers.
UINT_MAX means "I recommend to run as fast as possible." Then, the driver only needs to provide the fastest frequency. The core and governor does not need to say the specific frequency.
Cheers! MyungJoo
The devfreq apis are not modified and are kept intact.
The ABIs are not.
You can no longer do "# echo 0 > ABI_path" in order to deactivate.
Only relevant for ondemand governor. Load monitoring can be deactivated when polling_ms is set to zero. Next version of patch will fix this. Thanks.
Cheers! MyungJoo
Signed-off-by: Rajagopal Venkat rajagopal.venkat@linaro.org
ps. please make the patch a bit more readable. (please don't shuffle the location of pre-existed functions)
-- Regards, Rajagopal
--
MyungJoo Ham (함명주), PHD
System S/W Lab, S/W Platform Team, Software Center Samsung Electronics Cell: +82-10-6714-2858