On Thu, Apr 30, 2020 at 5:10 PM Linus Torvalds torvalds@linux-foundation.org wrote:
On Thu, Apr 30, 2020 at 4:52 PM Dan Williams dan.j.williams@intel.com wrote:
You had me until here. Up to this point I was grokking that Andy's "_fallible" suggestion does help explain better than "_safe", because the copy is doing extra safety checks. copy_to_user() and copy_to_user_fallible() mean *something* where copy_to_user_safe() does not.
It's a horrible word, btw. The word doesn't actually mean what Andy means it to mean. "fallible" means "can make mistakes", not "can fault".
So "fallible" is a horrible name.
But anyway, I don't hate something like "copy_to_user_fallible()" conceptually. The naming needs to be fixed, in that "user" can always take a fault, so it's the _source_ that can fault, not the "user" part.
It was the "copy_safe()" model that I find unacceptable, that uses _one_ name for what is at the very least *four* different operations:
copy from faulting memory to user
copy from faulting memory to kernel
copy from kernel to faulting memory
copy within faulting memory
No way can you do that with one single function. A kernel address and a user address may literally have the exact same bit representation. So the user vs kernel distinction _has_ to be in the name.
The "kernel vs faulting" doesn't necessarily have to be there from an implementation standpoint, but it *should* be there, because
it might affect implemmentation
but even if it DOESN'T affect implementation, it should be separate
just from the standpoint of being self-documenting code.
However you lose me on this "broken nvdimm semantics" contention. There is nothing nvdimm-hardware specific about the copy_safe() implementation, zero, nada, nothing new to the error model that DRAM did not also inflict on the Linux implementation.
Ok, so good. Let's kill this all, and just use memcpy(), and copy_to_user().
Just make sure that the nvdimm code doesn't use invalid kernel addresses or other broken poisoning.
Problem solved.
You can't have it both ways. Either memcpy just works, or it doesn't.
It doesn't, but copy_to_user() is frustratingly close and you can see in the patch that I went ahead and used copy_user_generic() to implement the backend of the default "fast" implementation.
However now I see that copy_user_generic() works for the wrong reason. It works because the exception on the source address due to poison looks no different than a write fault on the user address to the caller, it's still just a short copy. So it makes copy_to_user() work for the wrong reason relative to the name.
How about, following your suggestion, introduce copy_mc_to_user() (can just use copy_user_generic() internally) and copy_mc_to_kernel() for the other the helpers that the copy_to_iter() implementation needs? That makes it clear that no mmu-faults are expected on reads, only exceptions, and no protection-faults are expected at all for copy_mc_to_kernel() even if it happens to accidentally handle it. Following Jann's ex_handler_uaccess() example I could arrange for copy_mc_to_kernel() to use a new _ASM_EXTABLE_MC() to validate that the only type of exception meant to be handled is MC and warn otherwise?