On 03/21/2014 11:37 PM, Catalin Marinas wrote:
On Fri, Mar 21, 2014 at 11:24:16AM +0000, Srivatsa S. Bhat wrote:
On 03/21/2014 04:35 PM, Catalin Marinas wrote:
On Fri, Mar 21, 2014 at 09:21:02AM +0000, Viresh Kumar wrote:
@Catalin: We have a problem here and need your expert advice. After changing CPU frequency we need to call this code:
	cpufreq_notify_post_transition();
	policy->transition_ongoing = false;
The two statements must execute in exactly this order. Is that guaranteed without any memory barriers? cpufreq_notify_post_transition() doesn't touch transition_ongoing at all.
The above sequence doesn't say much on its own. As rmk said, the compiler wouldn't reorder the transition_ongoing write before the function call. I think most architectures (not sure about Alpha) don't do speculative stores, so the hardware wouldn't reorder them either. However, other stores inside cpufreq_notify_post_transition() could be reordered after the transition_ongoing store. Likewise, memory accesses that follow the transition_ongoing update could be reordered before it.
So what we actually need to know is which other memory accesses require strict ordering with respect to transition_ongoing.
Hmm.. The thing is, _everything_ inside the post_transition() function should complete before writing to transition_ongoing. Because, setting the flag to 'false' indicates the end of the critical section, and the next contending task can enter the critical section.
smp_mb() is all about relative ordering. So if you want memory accesses in post_transition() to be visible to other observers before transition_ongoing = false, you also need to make sure that the readers of transition_ongoing have a barrier before subsequent memory accesses.
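To illustrate the pairing described above, here is a minimal userspace sketch using C11 atomics and pthreads rather than the kernel primitives. The `payload` variable, the `writer`/`reader` threads and `run_pairing_demo()` are all hypothetical names invented for this illustration; the release store stands in for "barrier before clearing the flag" and the acquire load for "barrier after reading the flag". It is not the cpufreq code.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical data written "inside" the post-transition step. */
static int payload;
static atomic_bool transition_ongoing = true;

static void *writer(void *arg)
{
    (void)arg;
    payload = 42; /* stands in for stores done by the notifiers */
    /* Release: every store above is ordered before the flag store. */
    atomic_store_explicit(&transition_ongoing, false, memory_order_release);
    return NULL;
}

static void *reader(void *arg)
{
    int *seen = arg;
    /* Acquire: pairs with the release store in writer(). Without this
     * half of the pair, the payload read below could be satisfied early. */
    while (atomic_load_explicit(&transition_ongoing, memory_order_acquire))
        ; /* spin until the flag is cleared */
    *seen = payload;
    return NULL;
}

/* Runs one writer and one reader; returns the payload the reader saw. */
int run_pairing_demo(void)
{
    pthread_t w, r;
    int seen = 0;

    payload = 0;
    atomic_store(&transition_ongoing, true);
    pthread_create(&r, NULL, reader, &seen);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

The point of the sketch is that ordering on the writer side alone buys nothing: the guarantee only exists because the reader's acquire load is matched against the writer's release store.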
The reader takes a spin-lock before reading the flag. Won't that suffice?
+wait:
+	wait_event(policy->transition_wait, !policy->transition_ongoing);
+
+	spin_lock(&policy->transition_lock);
+
+	if (unlikely(policy->transition_ongoing)) {
+		spin_unlock(&policy->transition_lock);
+		goto wait;
+	}
What I find strange in your patch is that cpufreq_freq_transition_begin() uses spinlocks around transition_ongoing update but cpufreq_freq_transition_end() doesn't.
The reason is that, by the time we drop the spinlock, we would have set the transition_ongoing flag to true, which prevents any other task from entering the critical section. Hence, when we call the _end() function, we are 100% sure that only one task is executing it. Hence locks are not necessary around that second update. In fact, that very update marks the end of the critical section (which acts much like a spin_unlock(&lock) in a "regular" critical section).
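The begin/end asymmetry described above can be sketched in userspace as follows. This is only an analogy under stated assumptions: a pthread mutex replaces the spinlock, polling with `sched_yield()` replaces `wait_event()`/`wake_up()`, and `struct policy`, `transition_begin()`, `transition_end()`, `worker()` and `run_transition_demo()` are hypothetical names, not the functions from the patch. The flag loads use acquire ordering and the clearing store uses release ordering so that the sketch is race-free under the C11 model.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of the policy. */
struct policy {
    pthread_mutex_t transition_lock;
    atomic_bool transition_ongoing;
    long transitions; /* plain data, protected only by the flag */
};

static struct policy pol = { PTHREAD_MUTEX_INITIALIZER, false, 0 };

static void transition_begin(struct policy *p)
{
    for (;;) {
        /* Stand-in for wait_event(): poll until the flag looks clear. */
        while (atomic_load_explicit(&p->transition_ongoing, memory_order_acquire))
            sched_yield();

        pthread_mutex_lock(&p->transition_lock);
        /* Re-check under the lock: another task may have won the race. */
        if (!atomic_load_explicit(&p->transition_ongoing, memory_order_acquire)) {
            atomic_store_explicit(&p->transition_ongoing, true, memory_order_relaxed);
            pthread_mutex_unlock(&p->transition_lock);
            return;
        }
        pthread_mutex_unlock(&p->transition_lock);
    }
}

static void transition_end(struct policy *p)
{
    /* Only the current owner can get here, so no lock is taken.
     * The release store plays the role of the "unlock". */
    atomic_store_explicit(&p->transition_ongoing, false, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        transition_begin(&pol);
        pol.transitions++; /* the "critical section" */
        transition_end(&pol);
    }
    return NULL;
}

/* Runs four contending workers; returns the total number of transitions. */
long run_transition_demo(void)
{
    pthread_t t[4];

    pol.transitions = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return pol.transitions;
}
```

If the unlocked counter ever lost an update, the total would come out below 4000, which is the userspace equivalent of a new transition starting before the previous one has fully finished.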
OK, I'm starting to get it. Is there a risk of missing a wake_up event? E.g. one thread waking up earlier, noticing that a transition is in progress and waiting indefinitely?
No, the only downside to having the CPU reorder the assignment to the flag is that a new transition can begin while the old one is still finishing up the frequency transition by calling the _post_transition() notifiers.
Regards, Srivatsa S. Bhat