On Mon, Oct 12, 2020 at 12:53:54PM -0700, Ira Weiny wrote:
On Mon, Oct 12, 2020 at 05:44:38PM +0100, Matthew Wilcox wrote:
On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote:
kmap_atomic() is always preferred over kmap()/kmap_thread(). kmap_atomic() is _much_ more lightweight since its TLB invalidation is always CPU-local and never broadcast.
So, basically, unless you *must* sleep while the mapping is in place, kmap_atomic() is preferred.
But kmap_atomic() disables preemption, so the _ideal_ interface would map it only locally, then on preemption make it global. I don't even know if that _can_ be done. But this email makes it seem like kmap_atomic() has no downsides.
And that is IIUC what Thomas was trying to solve.
Also, Linus brought up that kmap_atomic() has quirks in nesting.[1]
From what I can see, all of these discussions support the need to have something
between kmap() and kmap_atomic().
However, the reasons for converting call sites to kmap_thread() differ between Thomas' patch set and mine. Both require more kmap granularity. They get there for different reasons and with different underlying implementations, but with the _same_ resulting semantics: a thread-local mapping which is preemptable.[2] Therefore they each focus on changing different call sites.
While this patch set is huge, I think it serves a valuable purpose: identifying a large number of call sites which are candidates for this new semantic.
Yes, I agree. My problem with this patch-set is that it ties it to some Intel feature that almost nobody cares about. Maybe we should care about it, but you didn't try very hard to make anyone care about it in the cover letter.
For a future patch-set, I'd like to see you just introduce the new API. Then you can optimise the Intel implementation of it afterwards. Those patch-sets have entirely different reviewers.