On 12/10/2015 11:31 PM, Nicolas Pitre wrote:
On Thu, 10 Dec 2015, Daniel Lezcano wrote:
On 12/10/2015 06:16 PM, Nicolas Pitre wrote:
On Thu, 10 Dec 2015, Daniel Lezcano wrote:
+u32 stats_variance(struct stats *s) +{
- s32 mean = stats_mean(s);
- return s->n ? (s->sq_sum / s->n) - (mean * mean) : 0;
Here sq_sum is a u64 value. On 32-bit architectures you are not allowed to do a straight division with 64-bit dividends in the kernel. You have to use do_div(). The do_div() is there to make the division much less costly than the plain division where gcc promotes both the dividend and the divisor to 64 bits. Still, do_div() is still a very expensive operation when the divisor isn't constant.
Ah, ok. Will fix this.
Note that this is the reason why in my version of the IRQ stat code, I avoided all non constant divisions by factoring out the n term. Here you have:
sq_sum / n - mean * mean
where mean is obtained by:
sum / n
So by calling stats_variance() you pay the cost of two expensive divisions. If you call stats_mean() separately from higher level code (as you do in patch #3) then you add another division to the mix.
One reason I didn't use a library abstraction in my own implementation is because I wanted to have n always be a constant for such divisions (plain or do_div() style) so that gcc could optimize them with reciprocal multiplications at compile time. And some other divisions were factored out and postponed to the end, etc.
Ok, I understand the optimization. It is a fair argument.
Furthermore, given you allow for values with a 24-bit magnitude, it is possible for the mean to have that magnitude. With mean being a s32 variable, any mean value with a magnitude greater than 15 bits (or actually 46340 to be precise) will overflow the product above.
Thanks for spotting this.
I think the return type is wrong, it should be an u64.
Right.
But then how behaves an u64 with int_sqrt which is based on BITS_PER_LONG ?
The prototype has an unsigned long for argument. Therefore, on a 32-bit architecture, your u64 value will be truncated.
Mmh, ok. It would be simple then to remove the stddev function and let the caller to deal with the variance.
-- http://www.linaro.org/ Linaro.org │ Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro Facebook | http://twitter.com/#!/linaroorg Twitter | http://www.linaro.org/linaro-blog/ Blog