On Thu, Jun 7, 2018 at 9:23 AM Dan Williams dan.j.williams@intel.com wrote:
Mark notes that gcc optimization passes have the potential to elide necessary invocations of this instruction sequence, so mark the asm volatile.
Ack. I'm not entirely sure this matters much, but it certainly isn't wrong either.
The reason I'm not 100% convinced this matters is that gcc can *still* mess things up for us by simply adding conditionals elsewhere.
For example, let's say we write this:
if (idx < foo) { idx = array_idx_nospec(idx, foo); do_something(idx); } else { do_something_else(); }
then everything is obviously fine, right? With the volatile on the array_idx_nospec(), we're guaranteed to use the right reduced idx, and there's only one user, so we're all good.
Except maybe do_something(idx) looks something like this:
do_something(int idx) { do_something_else() access(idx); }
and gcc decides that "hey, I can combine the two do_something_else() cases", and then generates code that is basically
if (idx < foo) idx = array_idx_nospec(idx, foo); do_something_else(); if (idx < foo) access(idx);
instead. And now we're back to the "first branch can be predicted correctly, second branch can be mis-predicted".
Honestly, I don't really care, and I don't think the kernel _should_ care. I don't think this is a problem in practice. I'm just saying that adding a "volatile" on array_idx_nospec() doesn't really guarantee anything, since it's not a volatile over the whole relevant sequence, only over that small part.
So I think the volatile is fine, but I really think it doesn't matter either. We're not going to plug every theoretical hole, and I think the hole that the volatile plugs is theoretical, not practical.
Linus