On May 13, 2019, at 4:27 AM, Peter Zijlstra peterz@infradead.org wrote:
On Mon, May 13, 2019 at 09:21:01AM +0000, Nadav Amit wrote:
On May 13, 2019, at 2:12 AM, Peter Zijlstra peterz@infradead.org wrote:
The other thing I was thinking of is trying to detect overlap through the page-tables themselves, but we have a distinct lack of storage there.
We might just use some state in the pmd; there are still two _pt_pad_[12] fields in struct page to 'use'. So we could come up with some TLB generation scheme that would detect conflicts.
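A minimal userspace sketch of what such a generation scheme could look like (all names here are hypothetical; the per-table counter stands in for state stashed in struct page's _pt_pad_ fields, and this models the idea rather than the kernel implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: each page table carries a flush generation,
 * standing in for a counter kept in struct page's unused padding. */
struct pt_model {
	uint64_t flush_gen;	/* bumped on every TLB flush of this table */
};

struct unmap_batch {
	struct pt_model *pt;	/* table the batched entries came from */
	uint64_t snap_gen;	/* generation observed when batching began */
};

/* Record the generation when we start batching unmaps from a table. */
static void batch_start(struct unmap_batch *b, struct pt_model *pt)
{
	b->pt = pt;
	b->snap_gen = pt->flush_gen;
}

/* A conflict exists if someone else flushed (and bumped the generation)
 * while our entries sat in the batch; the batcher must then fall back
 * to a full flush instead of a targeted one. */
static bool batch_conflict(const struct unmap_batch *b)
{
	return b->pt->flush_gen != b->snap_gen;
}

/* The flush itself publishes a new generation. */
static void flush_table(struct pt_model *pt)
{
	pt->flush_gen++;
}
```

The point of the snapshot/compare pair is that detection stays cheap on the common path: one load at batch start and one compare at flush time.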
It is rather easy to come up with a scheme (and I did similar things) if you flush the table while you hold the page-table lock. But if you batch across page tables, it becomes harder.
Yeah; finding that out now. I keep finding holes :/
You can use Uhlig’s dissertation [1] for inspiration (Section 4.4).
[1] https://www.researchgate.net/publication/36450482_Scalability_of_microkernel...
Thinking about it while typing, perhaps it is simpler than I think: if you need to flush a range that runs across more than a single table, you are very likely to flush a range of more than 33 entries, so you are likely to do a full TLB flush anyway.
We can't rely on the 33; that's x86-specific. Other architectures can have a different (or no) limit on that.
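For context, the 33 being discussed corresponds to x86's tunable tlb_single_page_flush_ceiling: beyond that many pages, flushing the whole TLB is considered cheaper than per-page INVLPG. A sketch of the heuristic, with the ceiling treated as a model parameter rather than a portable constant:

```c
#include <assert.h>
#include <stdbool.h>

/* Modeled after x86's tlb_single_page_flush_ceiling (default 33).
 * Other architectures may have a different limit, or none at all,
 * so this constant must not be relied on generically. */
#define FLUSH_CEILING 33UL

#define PAGE_SHIFT 12	/* 4 KiB pages assumed for this sketch */

/* Decide between per-page invalidation and a full TLB flush for
 * the virtual range [start, end). */
static bool wants_full_flush(unsigned long start, unsigned long end)
{
	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;

	return nr_pages > FLUSH_CEILING;
}
```

This is why a range spanning more than one last-level table (512 entries with 4 KiB pages) would, on x86, almost always cross the ceiling and go the full-flush path anyway.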
I wonder whether there are architectures that do not invalidate the TLB entry by entry and so experience this kind of overhead.
So perhaps just avoiding the batching if only entries from a single table are flushed would be enough.
That's close to what Will suggested initially: just flush the entire thing when there's a conflict.
One question is how you define a conflict. IIUC, Will suggests that the same mm marks a conflict. In addition, I suggest that if you only remove a single entry (or a few), you just do not batch and do the flushing while holding the page-table lock.
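Putting the two proposals side by side as a policy sketch (FEW_ENTRIES is an invented threshold; the thread only says "a single entry (or few ones)", and the same-mm rule is my reading of Will's suggestion):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented cutoff: below this, skip batching and flush immediately
 * while the page-table lock is still held. */
#define FEW_ENTRIES 4

struct mm_model { int id; };	/* stand-in for struct mm_struct */

/* Will's rule (as described in the thread): a pending batch for the
 * same mm marks a conflict and forces a full flush. */
static bool batch_conflicts(const struct mm_model *pending,
			    const struct mm_model *incoming)
{
	return pending != NULL && pending->id == incoming->id;
}

/* Nadav's addition: tiny unmaps are not worth batching at all; flush
 * them synchronously under the page-table lock instead. */
static bool should_batch(size_t nr_entries)
{
	return nr_entries > FEW_ENTRIES;
}
```

The trade-off the second rule encodes: for a handful of entries, the cost of per-entry invalidation under the lock is small, and skipping the batch sidesteps the conflict-detection problem entirely.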