On Tue, Dec 12, 2023 at 7:08 AM Jason Gunthorpe jgg@nvidia.com wrote:
On Tue, Dec 12, 2023 at 06:58:17AM -0800, Mina Almasry wrote:
Jason, we set the LSB on page_pool_iov pointers before casting it to struct page pointers. The resulting pointers are not useable as page pointers at all.
I understand that, the second ask is about maintainability of the mm by using correct types.
Perhaps you can simply avoid this by arranging for this driver to also exclusively use some special type to indicate the dual nature of the pointer and leave the other drivers as using the struct page version.
This is certainly possible, but it requires us to rename all the page pointers in the page_pool to the new type, and requires the driver adding devmem TCP support to rename all the page* pointer instances to the new type. It's possible but it introduces lots of code churn. Is the LSB + cast not a reasonable compromise here? I feel like the trick of setting the least significant bit on a pointer to indicate it's something else has a fair amount of precedent in the kernel.
Linus himself has complained about exactly this before, and written a cleanup:
https://lore.kernel.org/linux-mm/20221108194139.57604-1-torvalds@linux-found...
If you mangle a pointer *so it is no longer a pointer* then give it a proper opaque type so the compiler can check everything statically and require that the necessary converters are called in all cases.
You call it churn, I call it future maintainability. :(
No objection to using the LSB, just properly type a LSB mangled pointer so everyone knows what is going on and don't call it MM's struct page *.
I would say this is important here because it is a large driver facing API surface.
OK, I imagine this is not that hard to implement - it's really whether the change is acceptable to reviewers.
I figure I can start by implementing a no-op abstraction to page*:
typedef struct page netmem_t
and replace the page* in the following places with netmem_t*:
1. page_pool API (not internals) 2. drivers using the page_pool. 3. skb_frag_t.
I think that change needs to be a separate series by itself. Then the devmem patches would on top of that change netmem_t such that it can be a union between struct page and page_pool_iov and add the special handling of page_pool_iov. Does this sound reasonable?
-- Thanks, Mina