On Wed, May 24, 2023 at 04:26:33PM +0500, Muhammad Usama Anjum wrote:
On 5/24/23 12:43 AM, Peter Xu wrote:
Hi, Muhammad,
On Mon, May 22, 2023 at 04:26:07PM +0500, Muhammad Usama Anjum wrote:
On 5/22/23 3:24 PM, Muhammad Usama Anjum wrote:
On 4/26/23 7:13 PM, Peter Xu wrote:
Hi, Muhammad,
On Wed, Apr 26, 2023 at 12:06:23PM +0500, Muhammad Usama Anjum wrote:
On 4/20/23 11:01 AM, Muhammad Usama Anjum wrote: > +/* Supported flags */ > +#define PM_SCAN_OP_GET (1 << 0) > +#define PM_SCAN_OP_WP (1 << 1) We have only these flag options available in PAGEMAP_SCAN IOCTL. PM_SCAN_OP_GET must always be specified for this IOCTL. PM_SCAN_OP_WP can be specified as need. But PM_SCAN_OP_WP cannot be specified without PM_SCAN_OP_GET. (This was removed after you had asked me to not duplicate functionality which can be achieved by UFFDIO_WRITEPROTECT.)
- PM_SCAN_OP_GET | PM_SCAN_OP_WP
vs 2) UFFDIO_WRITEPROTECT
After removing the usage of uffd_wp_range() from PAGEMAP_SCAN IOCTL, we are getting really good performance which is comparable just like we are depending on SOFT_DIRTY flags in the PTE. But when we want to perform wp, PM_SCAN_OP_GET | PM_SCAN_OP_WP is more desirable than UFFDIO_WRITEPROTECT performance and behavior wise.
I've got the results from someone else that UFFDIO_WRITEPROTECT block pagefaults somehow which PAGEMAP_IOCTL doesn't. I still need to verify this as I don't have tests comparing them one-to-one.
What are your thoughts about it? Have you thought about making UFFDIO_WRITEPROTECT perform better?
I'm sorry to mention the word "performance" here. Actually we want better performance to emulate Windows syscall. That is why we are adding this functionality. So either we need to see what can be improved in UFFDIO_WRITEPROTECT or can I please add only PM_SCAN_OP_WP back in pagemap_ioctl?
I'm fine if you want to add it back if it works for you. Though before that, could you remind me why there can be a difference on performance?
I've looked at the code again and I think I've found something. Lets look at exact performance numbers:
I've run 2 different tests. In first test UFFDIO_WRITEPROTECT is being used for engaging WP. In second test PM_SCAN_OP_WP is being used. I've measured the average write time to the same memory which is being WP-ed and total time of execution of these APIs:
What is the steps of the test? Is it as simple as "writeprotect", "unprotect", then write all pages in a single thread?
Is UFFDIO_WRITEPROTECT sent in one range covering all pages?
Maybe you can attach the test program here too.
I'd not attached the test earlier as I thought that you wouldn't be interested in running the test. I've attached it now. The test has multiple
Thanks. No plan to run it, just to make sure I understand why such a difference.
threads where one thread tries to get status of flags and reset them, while other threads write to that memory. In main(), we call the pagemap_scan ioctl to get status of flags and reset the memory area as well. While in N threads, the memory is written.
I usually run the test by following where memory area is of 100000 * pages: ./win2_linux 8 100000 1 1 0
I'm running tests on real hardware. The results are pretty consistent. I'm also testing only on x86_64. PM_SCAN_OP_WP wins every time as compared to UFFDIO_WRITEPROTECT.
If it's multi-threaded test especially when the ioctl runs together with the writers, then I'd assume it's caused by writers frequently need to flush tlb (when writes during UFFDIO_WRITEPROTECT), the flush target could potentially also include the core running the main thread who is also trying to reprotect because they run on the same mm.
This makes me think that your current test case probably is the worst case of Nadav's patch 6ce64428d6 because (1) the UFFDIO_WRITEPROTECT covers a super large range, and (2) there're a _lot_ of concurrent writers during the ioctl, so all of them will need to trigger a tlb flush, and that tlb flush will further slow down the ioctl sender.
While I think that's the optimal case sometimes, of having minimum tlb flush on the ioctl(UFFDIO_WRITEPROTECT), so maybe it makes sense somewhere else where concurrent writers are not that much. I'll need to rethink a bit on all these to find out whether we can have a good way for both..
For now, if your workload is mostly exactly like your test case, maybe you can have your pagemap version of WP-only op there, making sure tlb flush is within the pgtable lock critical section (so you should be safe even without Nadav's patch). If so, I'd appreciate you can add some comment somewhere about such difference of using pagemap WP-only and ioctl(UFFDIO_WRITEPROTECT), though. In short, functional-wise they should be the same, but trivial detail difference on performance as TBD (maybe one day we can have a good approach for all and make them aligned again, but maybe that also doesn't need to block your work).