Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
No per-vma change is checked prior to entering a per-vma modification loop today. This means that mseal() differs in behaviour in "up-front failure" vs "partial change failure" that exists in every other function.
I discussed this with Liam and Jeff a while ago (separate conversations).
A bunch of Linux m*() syscalls have weaker atomicity guarantees than those on the other systems I looked into.
Linux is an outlier here. Other systems do two passes over the "entries in the range" before committing to success or failure. When success is returned, it means the whole range has been changed. When an error is identified in the first pass, no changes are applied and an error is returned. I found no partial results in my limited reading of various VM systems.
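A minimal sketch of the difference, in C. This is not code from any real VM system: `struct entry`, `change_one_pass()` and `change_two_pass()` are made-up names, and "sealed" stands in for whatever condition makes an entry unmodifiable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in the range; not a real kernel type. */
struct entry {
	int  prot;	/* current protection bits */
	bool sealed;	/* true if this entry must not be modified */
};

/*
 * One pass: check and modify each entry in turn.  If entry i fails the
 * check, entries 0..i-1 have already been changed when the error returns.
 */
int
change_one_pass(struct entry *e, size_t n, int prot)
{
	for (size_t i = 0; i < n; i++) {
		if (e[i].sealed)
			return -1;	/* partial result left behind */
		e[i].prot = prot;
	}
	return 0;
}

/*
 * Two passes: validate every entry first, then commit.  An error means
 * nothing was changed; success means the whole range was changed.
 */
int
change_two_pass(struct entry *e, size_t n, int prot)
{
	for (size_t i = 0; i < n; i++)
		if (e[i].sealed)
			return -1;	/* nothing modified yet */
	for (size_t i = 0; i < n; i++)
		e[i].prot = prot;
	return 0;
}
```

With a sealed entry at the end of a three-entry range, both functions return -1, but only the one-pass version leaves the first two entries modified.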
Actually, the guarantee of having done nothing upon error is very common system call behaviour. POSIX and de facto standards don't seem to spell it out in specific wording, as far as I can see, but the majority of systems seem to do so because it matches expectations.
Considering all the system calls, I can't think of any examples that leave partial changes behind on error. There are a few specific ioctls which were designed wrong.
I suspect, for performance reasons, there will be little appetite to repair the m*() syscalls in Linux. (I would appreciate if they were brought up to standard, so I guess that starts the 20 year counter :)
I think we can all agree that having some up-front and some later without any reason will lead to a higher probability of things getting missed.
There is also the attack surface angle. I spent some time thinking about circumstances where this might help an attack.
The risk is that the mprotect() return value is very rarely checked, yet parts of objects will change. mprotect() is probably the least checked system call, since people assume it will always succeed entirely; that is not the case on Linux. It is even less the case once immutable memory ranges come into play, since failure is an even more likely error condition now.
I didn't find a particular piece of software (or an old attack) where the sloppy permission handling aspects would help an attack, but I only thought about it for a couple of days... there are people with more time on their hands.