From: Matt Evans matt@ozlabs.org Sent: Wednesday, June 10, 2026 11:43 PM
+int vfio_pci_dma_buf_find_pfn(struct vfio_pci_dma_buf *priv,
struct vm_area_struct *vma,unsigned long address,unsigned int order,unsigned long *out_pfn)+{
- /*
* Given a VMA (start, end, pgoffs) and a fault address,* search the corresponding DMABUF's phys_vec[] to find the* range representing the address's offset into the VMA, and* its PFN.** The phys_vec[] ranges represent contiguous spans of VAs* upwards from the buffer offset 0; the actual PFNs might be* in any order, overlap/alias, etc. Calculate an offset of* the desired page given VMA start/pgoff and address, then* search upwards from 0 to find which span contains it.** On success, a valid PFN for a page sized by 'order' is* returned into out_pfn.** Failure occurs if:* - The page would cross the edge of the VMA* - The page isn't entirely contained within a range* - We find a range, but the final PFN isn't aligned to the* requested order.** (Upon failure, the caller is expected to try again with a* smaller order; the tests above will always succeed for* order=0 as the limit case.)** It's suboptimal if DMABUFs are created with neigbouring
s/neigbouring/neighboring/
* ranges that are physically contiguous, since hugepages* can't straddle range boundaries. (The construction of the* ranges vector should merge such ranges.)
though the field is called 'phys_vec', removing 'vector' in description is clearer here.
** Finally, vma_pgoff_adjust is used for a DMABUF representing* a VFIO BAR mmap, which is created from the start of the* offset region.
Elaborate it a little bit that the vm_pgoff is already counted in paddr of phys_vec so it should be skipped when finding the pfn.
*/- const unsigned long pagesize = PAGE_SIZE << order;
- unsigned long vma_off = ((vma->vm_pgoff - priv->vma_pgoff_adjust)
<<
PAGE_SHIFT) & VFIO_PCI_OFFSET_MASK;- unsigned long rounded_page_addr = ALIGN_DOWN(address,
pagesize);
- unsigned long rounded_page_end = rounded_page_addr + pagesize;
- unsigned long page_buf_offset;
- unsigned long page_buf_offset_end;
what about "fault_offset[_end]"? page_buf is a bit confusing.
- unsigned long range_buf_offset = 0;
could this be called 'range_start' then the 'range_start' in latter loop is renamed to 'phys_start'?
Not strong... just feel such naming helps me understand the logic easier
- unsigned int i;
- if (rounded_page_addr < vma->vm_start || rounded_page_end >
vma->vm_end) {
if (order > 0)return -EAGAIN;/* A fault address outside of the VMA is absurd. */WARN(1, "Fault addr 0x%lx outside VMA 0x%lx-0x%lx\n",address, vma->vm_start, vma->vm_end);return -EFAULT;- }
- /*
* page_buff_offset[_end] is the span of DMABUF offsets* corresponding to the faulting page:*/
if the naming is kept then s/page_buff_offset/page_buf_offset/
otherwise,
Reviewed-by: Kevin Tian kevin.tian@intel.com