On Tue, Dec 12, 2023 at 6:39 AM Jason Gunthorpe jgg@nvidia.com wrote:
On Tue, Dec 12, 2023 at 06:26:51AM -0800, Mina Almasry wrote:
On Tue, Dec 12, 2023 at 4:25 AM Jason Gunthorpe jgg@nvidia.com wrote:
On Thu, Dec 07, 2023 at 04:52:39PM -0800, Mina Almasry wrote:
+static inline struct page_pool_iov *page_to_page_pool_iov(struct page *page) +{
if (page_is_page_pool_iov(page))
return (struct page_pool_iov *)((unsigned long)page & ~PP_IOV);
DEBUG_NET_WARN_ON_ONCE(true);
return NULL;
+}
We already asked not to do this, please do not allocate weird things can call them 'struct page' when they are not. It undermines the maintainability of the mm to have things mis-typed like this. Introduce a new type for your thing so the compiler can check it properly.
There is a new type introduced, it's the page_pool_iov. We set the LSB on page_pool_iov* and cast it to page* only to avoid the churn of renaming page* to page_pool_iov* in the page_pool and all the net drivers using it. Is that not a reasonable compromise in your opinion? Since the LSB is set on the resulting page pointers, they are not actually usuable as pages, and are never passed to mm APIs per your requirement.
There were two asks, the one you did was to never pass this non-struct page memory to the mm, which is great.
The other was to not mistype things, and don't type something as struct page when it is, in fact, not.
I fear what you've done is make it so only one driver calls these special functions and left the other drivers passing the struct page directly to the mm and sort of obfuscating why it is OK based on this netdev knowledge of not enabling/using the static branch in the other cases.
Jason, we set the LSB on page_pool_iov pointers before casting it to struct page pointers. The resulting pointers are not useable as page pointers at all.
In order to use the resulting pointers, the driver _must_ use the special functions that first clear the LSB. It is impossible for the driver to 'accidentally' use the resulting page pointers with the LSB set - the kernel would just crash trying to dereference such a pointer.
The way it works currently is that drivers that support devmem TCP will declare that support to the net stack, and use the special functions that clear the LSB and cast the struct back to page_pool_iov. The drivers that don't support devmem TCP will not declare support and will get pages allocated from the mm stack from the page_pool and use them as pages normally.
Perhaps you can simply avoid this by arranging for this driver to also exclusively use some special type to indicate the dual nature of the pointer and leave the other drivers as using the struct page version.
This is certainly possible, but it requires us to rename all the page pointers in the page_pool to the new type, and requires the driver adding devmem TCP support to rename all the page* pointer instances to the new type. It's possible but it introduces lots of code churn. Is the LSB + cast not a reasonable compromise here? I feel like the trick of setting the least significant bit on a pointer to indicate it's something else has a fair amount of precedent in the kernel.
-- Thanks, Mina