On Thu, Apr 25, 2019 at 10:04:34AM +0000, David Laight wrote:
From: Kirill Smelkov
Sent: 24 April 2019 19:30
On Wed, Apr 24, 2019 at 10:26:55AM -0700, Linus Torvalds wrote:
On Wed, Apr 24, 2019 at 10:19 AM Sasha Levin sashal@kernel.org wrote:
Hm, I might be confusing something here but I see a bunch of patches that convert existing callers mentioned in this patch to use stream_open() which was introduced here.
The only use of stream_open() upstream right now is the xenbus conversion, and that isn't actually a bugfix, because xenbus used to manually do that
filp->f_mode &= ~FMODE_ATOMIC_POS; /* cdev-style semantics */
that stream_open() does.
So no, there isn't "a bunch of patches" anywhere.
There are *future* cleanups for 5.2 that will happen, and that might have hit linux-next. And there is at least one FUSE patch (again - pending, not upstream) that may get marked for stable.
But I see nothing right now that makes it stable material yet.
Linus, thanks for explaining. Sasha, Greg, there is a FUSE patch that should be stable material that will need this stream_open() thing. That patch has just entered fuse.git#for-next today:
https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/commit/?id...
and will hopefully enter 5.2 when merge window opens. I agree we should not blindly backport bulk stream_open conversions as performed by stream_open.cocci, at least unless there is a bug report indicating that it is actually required for a particular driver. On the other hand both Xen and FUSE deadlocks were hit for real which justifies stable propagation for their fixes.
You can read about the deadlock regression and the plan to fix it in original "fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock" patch (the 59/66 patch that was send in this thread), or here:
https://git.kernel.org/linus/10dce8af3422
Hope it clarifies things a bit,
I can also imagine drivers that expect accesses to be done using pread() and pwrite() - maybe only if the fd is shared. Provided accesses get the correct offset they can be concurrent. In fact they only need to update the offset in the file structure when they complete - they may do this already.
I know (I think) uclibc implementing pread() as lseek() + read() caused me grief - but that might just have been the extra system call overhead rather than any problems with the offset.
I'm not sure I understand your comment completely, but we convert to stream_open only drivers that actually do _not_ use position at all, and that were already using nonseekable_open, thus pread and pwrite were already returning -ESPIPE for them (nonseekable_open clears FMODE_{PREAD,PWRITE} and ksys_{pread,pwrite}64 check for that flag). We also convert only drivers that use no_llseek for .llseek, so lseek on those files is/was always returning -ESPIPE as well.
If a driver uses position in its read and write and has support for pread/pwrite (FMODE_PREAD and FMODE_PWRITE), pread and pwrite are already working _without_ file->f_pos locking - because those system calls do not semantically update file->f_pos at all and thus do not take file->f_pos_lock - i.e. pread/pwrite can be run simultaneously already.
If libc implements pread as lseek+read it will work for a single user case (single thread, or fd not shared between processes), but it will break because of lseek+read non-atomicity if multiple preads are simultaneously used from several threads. And also for such emulation for multiple users case there is a chance for pread vs pwrite deadlock, since those system calls are using read and write and read and write take file->f_pos_lock.
Kirill