On Mon, Sep 18, 2023 at 4:26 AM Luis Gerhorst gerhorst@amazon.de wrote:
It is true that this is not easily possible using the method most exploits use, at least to my knowledge (i.e., accessing the same address from another core). However, it is still possible to evict the cacheline with skb->data/data_end from the cache in between the loads by iterating over a large map using bpf_loop(). Then the load of skb->data_end would be slow while skb->data is readily available in a callee-saved register.
...
    call %[bpf_loop];               \
    gadget_%=:                      \
    r2 = *(u32 *)(r7 + 80);         \
    if r2 <= r9 goto exit_%=;       \
r9 is supposed to be available in the callee-saved register? :) I think you're missing that it is _callee_ saved. If that bpf_loop indeed managed to flush the L1 cache, the refill of r9 in the epilogue would be a cache miss. And r7 will be a cache miss as well. So there is no chance the CPU will execute 'if r2 <= r9' speculatively.
I have to agree that the above approach sounds plausible in theory, and I've never seen anyone propose to mispredict a branch this way. Which also means that no known speculation attack was crafted. I suspect that's a strong sign that the above approach is indeed a theory and it doesn't work in practice. Likely because the whole cache is flushed, the subsequent misses will spoil all speculation. For spec v1 to work you need only one slow load on one side of that branch. The other load and the speculative code after it should not miss. When they do, the spec code won't be able to leave the breadcrumbs of the leaked bit in the cache. 'Leak full byte' is also 'wishful thinking' imo. I haven't seen any attack that leaks a byte at a time.
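For reference, the shape of the spec v1 gadget under discussion can be sketched in C roughly as follows. This is only an illustration under stated assumptions: struct pkt, probe and read_checked are hypothetical names made up for the example, not kernel or BPF code, and the 64-byte stride assumes a 64-byte cache line. Architecturally the function never reads out of bounds; the comments mark which load would have to be slow and which would have to hit for a transient read to leave anything behind.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the packet pointers; NOT the kernel's struct sk_buff. */
struct pkt {
	const uint8_t *data;     /* first valid byte */
	const uint8_t *data_end; /* one past the last valid byte */
};

/* Hypothetical probe array: one (assumed 64-byte) cache line per byte value. */
static uint8_t probe[256 * 64];

/*
 * Returns 1 if 'off' is in bounds (and touches the probe array), 0 otherwise.
 * Architecturally an out-of-bounds 'off' is always rejected.
 */
static int read_checked(const struct pkt *p, size_t off)
{
	/* Bounds check: the data_end load is the one that has to be slow. */
	if (off >= (size_t)(p->data_end - p->data))
		return 0;

	/* Load of the byte itself: has to be fast for the window to matter. */
	uint8_t v = p->data[off];

	/* Dependent load: the "breadcrumb" that would be left in the cache. */
	volatile uint8_t sink = probe[(size_t)v * 64];
	(void)sink;
	return 1;
}
```

If every load here misses, as it would after flushing the whole cache, the byte load and the probe access cannot complete inside the window opened by the slow bounds check, which is the point made above.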
So I will insist on seeing a full working exploit before doing anything else here. It's good to discuss these CPU speculation concerns, but we have to stay practical. Even removing bpf from the picture, there is so much code in the network core that checks packet boundaries. One can find plenty of cases of 'read past skb->end' under speculation, and I'm arguing none of them are exploitable.