On Fri, Dec 18 2020 at 11:42, Ira Weiny wrote:
On Fri, Dec 18, 2020 at 02:57:51PM +0100, Thomas Gleixner wrote:
- Modify kmap() so that it marks the to-be-mapped page as 'globally unprotected' instead of doing this global unprotect PKS dance. kunmap() undoes that. That obviously needs some thought vs. refcounting if there are concurrent users, but that's a solvable problem either as part of struct page itself or stored in some global hash.
How would this globally unprotected flag work? I suppose if kmap created a new PTE we could make that PTE non-PKS protected, and then we would not have to fiddle with the register... I think I like that idea.
No. Look at the highmem implementation of kmap(). It's a terrible idea, really. Don't even think about that.
There is _no_ global flag. The point is that the kmap is strictly bound to a particular struct page. So you can simply do:
    kmap(page)
        if (page_is_access_protected(page))
            atomic_inc(&page->unprotect);

    kunmap(page)
        if (page_is_access_protected(page))
            atomic_dec(&page->unprotect);
and in the #PF handler:
if (!page->unprotect) goto die;
The reason why I said "either in struct page itself or in a global hash" is that struct page is already packed and people are not really happy about increasing its size. But the principle is roughly the same.
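As an illustration of the per-page counter described above, here is a minimal userspace sketch. It is not kernel code: struct page_model, model_kmap(), model_kunmap() and model_fault_allowed() are hypothetical names made up for this example, and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct page_model {
	bool access_protected;	/* page is covered by a protection key */
	atomic_int unprotect;	/* number of outstanding kmap() users */
};

/* kmap() side: one more user needs the page to be accessible. */
static void model_kmap(struct page_model *page)
{
	if (page->access_protected)
		atomic_fetch_add(&page->unprotect, 1);
}

/* kunmap() side: drop the user taken by model_kmap(). */
static void model_kunmap(struct page_model *page)
{
	if (page->access_protected) {
		int old = atomic_fetch_sub(&page->unprotect, 1);

		assert(old > 0);	/* catches an unbalanced kunmap() */
	}
}

/* What the #PF handler asks: is at least one kmap() user outstanding? */
static bool model_fault_allowed(struct page_model *page)
{
	return atomic_load(&page->unprotect) != 0;
}
```

With two model_kmap() calls on the same page, the fault stays tolerable until the second model_kunmap() has run, which is the concurrent-user case the counter is meant to cover. The same interface could sit on top of a hash keyed by the page instead of a field in the page itself.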
Have a smart #PF mechanism which does:
    if (error_code & X86_PF_PK) {
        page = virt_to_page(address);

        if (!page || !page_is_globaly_unprotected(page))
            goto die;

        if (pks_mode == PKS_MODE_STRICT)
            goto die;

        WARN_ONCE(pks_mode == PKS_MODE_RELAXED, "Useful info ...");

        temporary_unprotect(page, regs);
        return;
    }
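The decision logic of that handler can be modelled in plain C to make the three outcomes explicit. This is only a sketch under assumptions: the model_* names are hypothetical, MODEL_PKS_SILENT is an invented name for "neither strict nor relaxed" (the mail only names the STRICT and RELAXED modes), and a counter stands in for the WARN_ONCE() output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* SILENT is a hypothetical third mode; the mail names only the first two. */
enum model_pks_mode {
	MODEL_PKS_STRICT,	/* every PKS fault is fatal */
	MODEL_PKS_RELAXED,	/* tolerate the fault, warn once */
	MODEL_PKS_SILENT,	/* tolerate the fault without warning */
};

enum model_pf_action {
	MODEL_PF_DIE,
	MODEL_PF_UNPROTECT,
};

/* Hypothetical per-page state: true while a kmap() user holds the page. */
struct model_page {
	bool globally_unprotected;
};

static unsigned int model_warn_count;	/* stands in for WARN_ONCE() */

static enum model_pf_action model_pks_fault(const struct model_page *page,
					    enum model_pks_mode mode)
{
	assert(mode <= MODEL_PKS_SILENT);

	/* No page, or nobody kmap()ed it: the access is a real violation. */
	if (page == NULL || !page->globally_unprotected)
		return MODEL_PF_DIE;

	if (mode == MODEL_PKS_STRICT)
		return MODEL_PF_DIE;

	/* Report only the first tolerated fault in relaxed mode. */
	if (mode == MODEL_PKS_RELAXED && model_warn_count == 0)
		model_warn_count++;

	return MODEL_PF_UNPROTECT;
}
```

The ordering matters: the per-page check comes first, so a fault on a page nobody mapped dies in every mode, and the mode only decides what happens to faults on pages that are legitimately mapped by some other context.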
I feel like this is very similar to what I had in the global patch you found in my git tree with the exception of the RELAXED mode. I simply had globally unprotected or die.
Your stuff depends on that global_pks_state, which is not maintainable, especially not the teardown side. This depends on per-page state, which is clearly way simpler and more focused.
Regardless I think unprotecting a global context is the easy part. The code you had a problem with (and I see is fully broken) was the restriction of access. A failure to update in that direction would only result in a wider window of access. I contemplated not doing a global update at all and just leave the access open until the next context switch. But the code as it stands tries to force an update for a couple of reasons:
kmap_local_page() removes most of the need for global pks. So I was thinking that global PKS could be a slow path.
kmap()s that are handed to other contexts are likely to be 'long term' and should not need to be updated 'too' often. I will admit that I don't know how often 'too often' is.
Even once in a while is not a justification for stopping the world for N milliseconds.
    temporary_unprotect(page, regs)
    {
        key = page_to_key(page);

        /* Return from #PF will establish this for the faulting context */
        extended_state(regs)->pks &= ~PKS_MASK(key);
    }
This temporary unprotect is undone when the context is left, so depending on the context (thread, interrupt, softirq) the unprotected section might be way wider than actually needed, but that's still orders of magnitude better than having this fully unrestricted global PKS mode which is completely scopeless.
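To make the bit manipulation in temporary_unprotect() concrete, here is a small userspace sketch. It assumes that PKS_MASK(key) covers both restriction bits of a key, following the architectural layout of two bits per key (access-disable and write-disable) for 16 keys; the mail does not show the macro, so that is an assumption. struct model_regs and the model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PKS_NUM_KEYS	16
#define MODEL_PKS_BITS_PER_KEY	2
#define MODEL_PKS_AD		0x1u	/* access disable */
#define MODEL_PKS_WD		0x2u	/* write disable */

/* Both restriction bits of one key, shifted to that key's position. */
static uint32_t model_pks_mask(unsigned int key)
{
	assert(key < MODEL_PKS_NUM_KEYS);
	return (MODEL_PKS_AD | MODEL_PKS_WD) << (key * MODEL_PKS_BITS_PER_KEY);
}

/* Hypothetical saved state of the context that took the fault. */
struct model_regs {
	uint32_t pks;	/* value re-established on return from the fault */
};

/* Lift the restrictions for one key in the saved state only. */
static void model_temporary_unprotect(struct model_regs *regs,
				      unsigned int key)
{
	regs->pks &= ~model_pks_mask(key);
}
```

Only the saved value of the faulting context changes. Any other context keeps its own value, which is why the relaxation disappears by itself once the faulting context is switched out or returns from the interrupt.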
I'm not sure I follow you. How would we know when the context is left?
The context goes away on its own. Either context switch or return from interrupt. As I said there is an extended window where the external context still might have unprotected access even if the initiating context has called kunmap() already. It's not pretty, but it's not the end of the world either.
That's why I suggested to have that WARN_ONCE() so we can actually see why and where that happens and think about solutions to make this go into local context, e.g. by changing the vaddr pointer to a struct page pointer for these particular use cases and then the other context can do kmap/unmap_local().
Then there is the DAX case, which you made "work" with dev_access_enable() and dev_access_disable(), i.e. with yet another lazy approach that avoids changing a handful of usage sites.
The use cases are strictly context local which means the global magic is not used at all. Why does it exist in the first place?
I'm not following. What is 'it'?
That global argument to dev_access_enable()/disable().
That leaves the question about the refcount. AFAICT, nothing nests in that use case for a given execution context. I'm surely missing something subtle here.
The refcount is needed for non-global pks as well as global. I've not resolved if anything needs to be done with the refcount on the global update since the following is legal.
    kmap()
    kmap_local_page()
    kunmap()
    kunmap_local()
Which would be a problem. But I don't think it is ever actually done.
If it does not exist why would we support it in the first place? We can have some warning there to catch that case.
Another problem would be if the kmap and kunmap happened in different contexts... :-/ I don't think that is done either but I don't know for certain.
Frankly, my main focus before any of this global support has been to get rid of as many kmaps as possible.[1] Once that is done I think more of these questions can be answered better.
I was expecting that you could answer these questions :)
Thanks,
tglx