On Fri, 11 Dec 2015, Daniel Lezcano wrote:
On 12/10/2015 11:31 PM, Nicolas Pitre wrote:
On Thu, 10 Dec 2015, Daniel Lezcano wrote:
On 12/10/2015 06:16 PM, Nicolas Pitre wrote:
On Thu, 10 Dec 2015, Daniel Lezcano wrote:
+u32 stats_variance(struct stats *s) +{
- s32 mean = stats_mean(s);
- return s->n ? (s->sq_sum / s->n) - (mean * mean) : 0;
Here sq_sum is a u64 value. On 32-bit architectures you are not allowed to do a straight division with 64-bit dividends in the kernel. You have to use do_div(). The do_div() is there to make the division much less costly than the plain division where gcc promotes both the dividend and the divisor to 64 bits. Still, do_div() is still a very expensive operation when the divisor isn't constant.
Ah, ok. Will fix this.
Note that this is the reason why in my version of the IRQ stat code, I avoided all non constant divisions by factoring out the n term. Here you have:
sq_sum / n - mean * mean
where mean is obtained by:
sum / n
So by calling stats_variance() you pay the cost of two expensive divisions. If you call stats_mean() separately from higher level code (as you do in patch #3) then you add another division to the mix.
One reason I didn't use a library abstraction in my own implementation is because I wanted to have n always be a constant for such divisions (plain or do_div() style) so that gcc could optimize them with reciprocal multiplications at compile time. And some other divisions were factored out and postponed to the end, etc.
Ok, if n is constant and I define it as a 2^x value, bits shifting will be much more efficient than an optimized division, right ?
Absolutely. And you don't have to shift values manually as gcc is smart enough to do it on its own.
Nicolas