The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail.
Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry.
This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry.
To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in secs>.
Signed-off-by: Charan Teja Kalla quic_charante@quicinc.com --- drivers/dma-buf/dma-buf-sysfs-stats.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-buf/dma-buf-sysfs-stats.c index 2bba0ba..292cb31 100644 --- a/drivers/dma-buf/dma-buf-sysfs-stats.c +++ b/drivers/dma-buf/dma-buf-sysfs-stats.c @@ -192,7 +192,8 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf)
/* create the directory for buffer stats */ ret = kobject_init_and_add(&sysfs_entry->kobj, &dma_buf_ktype, NULL, - "%lu", file_inode(dmabuf->file)->i_ino); + "%lu-%lu", file_inode(dmabuf->file)->i_ino, + file_inode(dmabuf->file)->i_ctime.tv_sec); if (ret) goto err_sysfs_dmabuf;
On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail.
Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry.
Then maybe we should not fail the creation path of the kobject fails to be created? It's just for debugging, it should be fine if the creation of it isn't there.
This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry.
To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in secs>.
As you are changing the format here, shouldn't the Documentation/ABI/ entry for this also be changed?
And what's to keep the seconds field from also being the same?
thanks,
greg k-h
Am 10.05.22 um 13:00 schrieb Greg KH:
On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail.
Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry.
Then maybe we should not fail the creation path of the kobject fails to be created? It's just for debugging, it should be fine if the creation of it isn't there.
Well if it's just for debugging then it should be under debugfs and not sysfs.
This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry.
To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in secs>.
As you are changing the format here, shouldn't the Documentation/ABI/ entry for this also be changed?
As far as I can see that is even an UAPI break, not sure if we can allow that.
And what's to keep the seconds field from also being the same?
Well exporting two DMA-bufs with the same ino in the same nanosecond should be basically impossible, but I would rather opt for using a 64bit atomic in that function.
This should be 100% UAPI compatible and even if we manage to create one buffer ever ns we need ~500years to wrap around.
Regards, Christian.
thanks,
greg k-h
On Tue, May 10, 2022 at 01:35:41PM +0200, Christian König wrote:
Am 10.05.22 um 13:00 schrieb Greg KH:
On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail.
Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry.
Then maybe we should not fail the creation path of the kobject fails to be created? It's just for debugging, it should be fine if the creation of it isn't there.
Well if it's just for debugging then it should be under debugfs and not sysfs.
I'll note that the original patch series for this described why this was moved from debugfs to sysfs.
This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry.
To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in secs>.
As you are changing the format here, shouldn't the Documentation/ABI/ entry for this also be changed?
As far as I can see that is even an UAPI break, not sure if we can allow that.
Why? Device names change all the time and should never be static. A buffer name should just be a unique identifier in that directory, that's all. No rules on the formatting of it unless for some reason the name being the inode number was somehow being used in userspace for that number?
thanks,
greg k-h
Am 10.05.22 um 14:10 schrieb Greg KH:
On Tue, May 10, 2022 at 01:35:41PM +0200, Christian König wrote:
Am 10.05.22 um 13:00 schrieb Greg KH:
On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail.
Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry.
Then maybe we should not fail the creation path of the kobject fails to be created? It's just for debugging, it should be fine if the creation of it isn't there.
Well if it's just for debugging then it should be under debugfs and not sysfs.
I'll note that the original patch series for this described why this was moved from debugfs to sysfs.
This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry.
To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in secs>.
As you are changing the format here, shouldn't the Documentation/ABI/ entry for this also be changed?
As far as I can see that is even an UAPI break, not sure if we can allow that.
Why? Device names change all the time and should never be static. A buffer name should just be a unique identifier in that directory, that's all. No rules on the formatting of it unless for some reason the name being the inode number was somehow being used in userspace for that number?
My impression was that we documented that should have been a number, but I might be wrong on this. And if it's not documented to be a number, I think it should be.
The background is that you probably need to associate the DMA-buf with some userspace structure for accounting and that becomes easier when you can just put them into a radix.
Regards, Christian.
thanks,
greg k-h _______________________________________________ Linaro-mm-sig mailing list -- linaro-mm-sig@lists.linaro.org To unsubscribe send an email to linaro-mm-sig-leave@lists.linaro.org
Thanks Christian for the inputs!!
On 5/10/2022 5:05 PM, Christian König wrote:
And what's to keep the seconds field from also being the same?
Well exporting two DMA-bufs with the same ino in the same nanosecond should be basically impossible, but I would rather opt for using a 64bit atomic in that function.
This should be 100% UAPI compatible and even if we manage to create one buffer ever ns we need ~500years to wrap around.
I see that the inode->i_ctime->tv_sec is already defined as 64bit(time64_t tv_sec), hence used it. This way we don't need extra static atomic_t variable just to get a unique name.
Just pasting excerpt from the reply posted to Greg about why this secs will always be a unique: with secs field added, to get the same inode-secs string, the uint should overflow in the same second which is impossible.
Let me know If you still opt for atomic variable only.
Am 10.05.22 um 14:16 schrieb Charan Teja Kalla:
Thanks Christian for the inputs!!
On 5/10/2022 5:05 PM, Christian König wrote:
And what's to keep the seconds field from also being the same?
Well exporting two DMA-bufs with the same ino in the same nanosecond should be basically impossible, but I would rather opt for using a 64bit atomic in that function.
This should be 100% UAPI compatible and even if we manage to create one buffer ever ns we need ~500years to wrap around.
I see that the inode->i_ctime->tv_sec is already defined as 64bit(time64_t tv_sec), hence used it. This way we don't need extra static atomic_t variable just to get a unique name.
Just pasting excerpt from the reply posted to Greg about why this secs will always be a unique: with secs field added, to get the same inode-secs string, the uint should overflow in the same second which is impossible.
Let me know If you still opt for atomic variable only.
I think just using a static atomic variable here would be cleaner, that is 100% unique.
Your approach should probably work as well, but it looks quite constructed.
Regards, Christian.
Thanks Greg for the inputs!!
On 5/10/2022 4:30 PM, Greg KH wrote:
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail.
Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry.
Then maybe we should not fail the creation path of the kobject fails to be created? It's just for debugging, it should be fine if the creation of it isn't there.
Not creating the debug node under some special cases can make this interface not reliable if one wants to know info about the created dmabuf buffers. Please help in correcting me If my perspective is wrong here.
IIUC, except this -EEXIST condition, under the other conditions (-EINVAL and -ENOMEM) failure is fine. Since, we are going to fix the -EEXIST error in this patch, my opinion is failure in the kobject creation path is acceptable for the reasons: a) The user is expected to pass the valid dmabuf to create the stats node, b) The user can undefine the CONFIG_DMABUF_SYSFS_STATS if he don't want this stats.
This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry.
To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in secs>.
As you are changing the format here, shouldn't the Documentation/ABI/ entry for this also be changed?
And what's to keep the seconds field from also being the same?
get_next_ino() just increases the inode number monotonically and return to the caller and it is 'unsigned int' data type. Thus 2 successive calls always generate the different inode_number but can be the same secs value. With inode-secs format, this will be still be a unique string. Say it will be like ino1-sec1 and ino2-sec1.
Now after the inode number overflow and wraps, we may get the ino1 again from the get_next_ino() but then secs will be different i.e. say it may be like ino1-secn and ion2-secn. So, it always be a unique string.
IOW, with secs field added, to get the same inode-secs string, the uint should overflow in the same second which is impossible.
Thanks for pointing out the changes to be done in ABI document. Will do it in the next spin.
linaro-mm-sig@lists.linaro.org