On Thu, Jun 30, 2022 at 03:16:41PM +0200, Juergen Gross wrote:
On 30.06.22 13:34, Greg KH wrote:
On Mon, Jun 27, 2022 at 02:10:02PM -0400, Demi Marie Obenour wrote:
commit dbe97cff7dd9f0f75c524afdd55ad46be3d15295 upstream
unmap_grant_pages() currently waits for the pages to no longer be used. In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a deadlock against i915: i915 was waiting for gntdev's MMU notifier to finish, while gntdev was waiting for i915 to free its pages. I also believe this is responsible for various deadlocks I have experienced in the past.
Avoid these problems by making unmap_grant_pages async. This requires making it return void, as any errors will not be available when the function returns. Fortunately, the only use of the return value is a WARN_ON(), which can be replaced by a WARN_ON when the error is detected. Additionally, a failed call will not prevent further calls from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same handle. Instead, a separate bool array is allocated for this purpose. This wastes memory, but stuffing this information in padding bytes is too fragile. Furthermore, it is necessary to grab a reference to the map before making the asynchronous call, and release the reference when the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(), and to handle the case where userspace tries to map a mapping whose contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use") Cc: stable@vger.kernel.org Signed-off-by: Demi Marie Obenour demi@invisiblethingslab.com Reviewed-by: Juergen Gross jgross@suse.com Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com
drivers/xen/gntdev-common.h | 7 ++ drivers/xen/gntdev.c | 142 +++++++++++++++++++++++++----------- 2 files changed, 106 insertions(+), 43 deletions(-)
All now queued up, thanks.
Sorry, but I think at least the version for 5.10 is fishy, as it removes the tests for successful allocations of add->map_ops and add->unmap_ops.
That is definitely a bug; I will send another version (without your Reviewed-by).
I need to do a thorough review of the patches (the "Reviewed-by:" tag in the patches is the one for the upstream patch).
Yeah, that was my fault, sorry.
Greg, can you please wait for my explicit "okay" for the backports?
Confirming that these patches do need review before they can be applied. Juergen, would you mind making sure that the upstream patch was also correct for 5.15 and 5.18? It applied cleanly, but that is no guarantee of correctness.