In this context, a "doomed" object is an object whose refcount has reached zero, but that has not yet been freed.
To avoid mutual refcounting vmwgfx need to have a non-refcounted pointer to a dma-buf in a lookup structure. The pointer is removed in the dma-buf destructor. To allow lookup-structure private locks we need get_dma_buf_unless_doomed(). This common refcounting scenario is described with examples in detail in the kref documentaion. The solution with local locks is under kref_get_unless_zero(). See also kobject_get_unless_zero() and its commit message. Since dma-bufs are using the attached file for refcounting, get_dma_buf_unless_doomed maps directly to a get_file_unless_doomed.
Signed-off-by: Thomas Hellstrom thellstrom@vmware.com --- include/linux/dma-buf.h | 16 ++++++++++++++++ include/linux/fs.h | 15 +++++++++++++++ 2 files changed, 31 insertions(+)
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index dfac5ed..6e58144 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -162,6 +162,22 @@ static inline void get_dma_buf(struct dma_buf *dmabuf) get_file(dmabuf->file); }
+/** + * get_dma_buf_unless_doomed - convenience wrapper for get_file_unless_doomed + * + * @dmabuf: [in] pointer to dma_buf + * + * Obtain a dma-buf reference from a lookup structure that doesn't refcount + * the dma-buf, but synchronizes with its release method to make sure it has + * not been freed yet. See for example kref_get_unless_zero documentation. + * Returns true if refcounting succeeds, false otherwise. + */ +static inline bool __must_check +get_dma_buf_unless_doomed(struct dma_buf *dmabuf) +{ + return get_file_unless_doomed(dmabuf->file); +} + struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, struct device *dev); void dma_buf_detach(struct dma_buf *dmabuf, diff --git a/include/linux/fs.h b/include/linux/fs.h index 3f40547..a96c333 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -824,6 +824,21 @@ static inline struct file *get_file(struct file *f) atomic_long_inc(&f->f_count); return f; } + +/** + * get_file_unless_doomed - convenience wrapper for get_file_unless_doomed + * @file: [in] pointer to file + * + * Obtain a file reference from a lookup structure that doesn't refcount + * the file, but synchronizes with its release method to make sure it has + * not been freed yet. See for example kref_get_unless_zero documentation. + * Returns true if refcounting succeeds, false otherwise. + */ +static inline bool __must_check get_file_unless_doomed(struct file *f) +{ + return atomic_long_inc_not_zero(&f->f_count) != 0L; +} + #define fput_atomic(x) atomic_long_add_unless(&(x)->f_count, -1, 1) #define file_count(x) atomic_long_read(&(x)->f_count)
On Thu, Nov 07, 2013 at 11:18:54PM -0800, Thomas Hellstrom wrote:
I'm a bit confused ... why do we need this helper in common code? You can only synchronize the final release with your own locks for your own dma-bufs exported from vmwgfx. So I'm not sure at all whether we want this in common code.
Also the gem/prime stuff gets by without this (and I have a pretty evil set of tests for it). The only current bug is that multiple imports of foreign objects (e.g. using a 2nd gpu to render, then import+share to the compositor) can still lead to some fun. But that's simply something I haven't gotten around to yet ... -Daniel
On 11/08/2013 09:06 AM, Daniel Vetter wrote:
It's for the following scenario
thread a thread b dma_buf_put() refcount reaches zero lock_lookup obtain dma_buf_pointer get_dma_buf() -> False reference unlock_lookup lock_lookup remove dma_buf_pointer unlock lookup
There are two solutions to this. The bad one is to enclose also dma_buf_put() in lock_lookup which means we need to add an extra set of shared locks to dmabuf, and we'd have to take the locks around *all* dma_buf_puts(). The good one is using get_dma_buf_unless_doomed(), which also opens up for rcu locking of the lookup had struct_file been freed using RCU synchronization semantics. Both these scenarios are well documented in the kref docs, and this one should be pretty established by now, as seen by the kobj_get_unless_zero commit message from Linus. Nothing strange here.
I've looked closely at the gem/prime stuff, and AFAICT it's fully correct. The difference is that the lookup (gem::dma_buf) is referenced so it doesn't encounter this situation. The reference and lookup is removed with the gem name triggering dma_buf destruction and finally a free of the gem object.
This, however comes with two implications 1) Once a DMA-buf is added, it stays alive at least until someone removes the gem name of the exporting object, regardless whether there are any external users or not. I think this is OK, but unnecessary. 2) If someone decides to get a new handle from fd, and the gem name has already been removed, a new gem name is created for the exporting dma-buf by the requested client. This is why I can't do the same. Because of the relaxed RCU locking, I can't re-add a name to a TTM base object. Removing it is always part of the object destruction sequence.
/Thomas
On Fri, Nov 8, 2013 at 9:50 AM, Thomas Hellstrom thellstrom@vmware.com wrote:
Imo that's actually fairly nice guarantee, since if you have dumb userspace that always re-does the export/import dance accross a device the importer can check whether it has the same object already somewhere.
Without this guarantee we'll end up mapping the same underlying storage multiple times. btw this is the part where userspace can still trick the kernel. I have testcases for it, but thus far lacked the time to implement the fix. It needs a combination of nasty+dumb userspace though to be a real issue.
Yeah, we seem to have a bit a split in how gem handles userspace handles and the weak references they cause and how ttm does it. ttm uses kref_get_unless_zero for weak references. Atm gem objects themselves still need the big mutex, but the only blocker is the mmap code (actually the has table). My plan (somewhere on my todo list) is to do the same trick for that weak reference from the mmap offset lookup structure.
Anyway I just wanted to point out with my original mail that this problem can be solved in different ways. But I see that the weak ref approach with a possibly failing get call suits the current ttm design (and so I guess vmwgfx) a bit better.
Cheers, Daniel
linaro-mm-sig@lists.linaro.org