From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Wednesday, October 26, 2022 2:12 AM
+int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd)
+{
- struct iommu_ioas_allow_iovas *cmd = ucmd->cmd;
- struct rb_root_cached allowed_iova = RB_ROOT_CACHED;
- struct interval_tree_node *node;
- struct iommufd_ioas *ioas;
- struct io_pagetable *iopt;
- int rc = 0;
- ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
- if (IS_ERR(ioas))
return PTR_ERR(ioas);
- iopt = &ioas->iopt;
Missed the check of the __reserved field.
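Something along these lines, following the pattern used by the other
commands (just a sketch; it assumes struct iommu_ioas_allow_iovas carries a
__reserved field, as the uAPI doc comment below implies):

        if (cmd->__reserved)
                return -EOPNOTSUPP;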
+int iommufd_ioas_copy(struct iommufd_ucmd *ucmd)
+{
- struct iommu_ioas_copy *cmd = ucmd->cmd;
- struct iommufd_ioas *src_ioas;
- struct iommufd_ioas *dst_ioas;
- unsigned int flags = 0;
- LIST_HEAD(pages_list);
- unsigned long iova;
- int rc;
- if ((cmd->flags &
      ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
        IOMMU_IOAS_MAP_READABLE)))
return -EOPNOTSUPP;
- if (cmd->length >= ULONG_MAX)
return -EOVERFLOW;
Overflow checks on cmd->dst_iova/src_iova are missed as well.
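i.e. something like this in addition (a sketch; field names per the uAPI
struct):

        if (cmd->src_iova >= ULONG_MAX || cmd->dst_iova >= ULONG_MAX)
                return -EOVERFLOW;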
- src_ioas = iommufd_get_ioas(ucmd, cmd->src_ioas_id);
- if (IS_ERR(src_ioas))
return PTR_ERR(src_ioas);
- rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
&pages_list);
- iommufd_put_object(&src_ioas->obj);
- if (rc)
goto out_pages;
This can be a direct return, given iopt_get_pages() already calls iopt_free_pages_list() upon error.
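i.e. (a sketch, reusing the quoted lines):

        rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
                            &pages_list);
        iommufd_put_object(&src_ioas->obj);
        if (rc)
                return rc;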
+int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
+{
- struct iommu_ioas_unmap *cmd = ucmd->cmd;
- struct iommufd_ioas *ioas;
- unsigned long unmapped = 0;
- int rc;
- ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
- if (IS_ERR(ioas))
return PTR_ERR(ioas);
- if (cmd->iova == 0 && cmd->length == U64_MAX) {
rc = iopt_unmap_all(&ioas->iopt, &unmapped);
if (rc)
goto out_put;
- } else {
if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX) {
rc = -EOVERFLOW;
goto out_put;
}
Above check can be moved before iommufd_get_ioas().
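A sketch of the suggested ordering; note the unmap-all special case
(iova == 0, length == U64_MAX) has to stay exempt from the overflow check:

        if (!(cmd->iova == 0 && cmd->length == U64_MAX) &&
            (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX))
                return -EOVERFLOW;

        ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
        if (IS_ERR(ioas))
                return PTR_ERR(ioas);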
+static int iommufd_option(struct iommufd_ucmd *ucmd)
+{
- struct iommu_option *cmd = ucmd->cmd;
- int rc;
Lacks the __reserved check as well.
static struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
- IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id),
- IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
struct iommu_ioas_alloc, out_ioas_id),
- IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
struct iommu_ioas_allow_iovas, allowed_iovas),
- IOCTL_OP(IOMMU_IOAS_COPY, iommufd_ioas_copy, struct iommu_ioas_copy,
          src_iova),
- IOCTL_OP(IOMMU_IOAS_IOVA_RANGES, iommufd_ioas_iova_ranges,
struct iommu_ioas_iova_ranges, out_iova_alignment),
- IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct iommu_ioas_map,
          __reserved),
- IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
          length),
- IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option,
val64),
};
Just a personal preference: it reads better to me if the order above (and the enum definition in iommufd.h) matched the order in which those commands are defined/explained in iommufd.h.
+/**
- struct iommu_ioas_iova_ranges - ioctl(IOMMU_IOAS_IOVA_RANGES)
- @size: sizeof(struct iommu_ioas_iova_ranges)
- @ioas_id: IOAS ID to read ranges from
- @num_iovas: Input/Output total number of ranges in the IOAS
- @__reserved: Must be 0
- @allowed_iovas: Pointer to the output array of struct iommu_iova_range
- @out_iova_alignment: Minimum alignment required for mapping IOVA
- Query an IOAS for ranges of allowed IOVAs. Mapping IOVA outside these ranges
- is not allowed. out_num_iovas will be set to the total number of iovas and
- the out_valid_iovas[] will be filled in as space permits.
out_num_iovas and out_valid_iovas[] are stale; presumably these should now read num_iovas and allowed_iovas[].
- The allowed ranges are dependent on the HW path the DMA operation takes, and
- can change during the lifetime of the IOAS. A fresh empty IOAS will have a
- full range, and each attached device will narrow the ranges based on that
- devices HW restrictions. Detatching a device can widen the ranges.
devices -> device's
+/**
- struct iommu_ioas_allow_iovas - ioctl(IOMMU_IOAS_ALLOW_IOVAS)
- @size: sizeof(struct iommu_ioas_allow_iovas)
- @ioas_id: IOAS ID to allow IOVAs from
The num_iovas and __reserved fields are missed here.
- @allowed_iovas: Pointer to array of struct iommu_iova_range
- Ensure a range of IOVAs are always available for allocation. If this call
- succeeds then IOMMU_IOAS_IOVA_RANGES will never return a list of IOVA ranges
- that are narrower than the ranges provided here. This call will fail if
- IOMMU_IOAS_IOVA_RANGES is currently narrower than the given ranges.
- When an IOAS is first created the IOVA_RANGES will be maximally sized, and as
- devices are attached the IOVA will narrow based on the device restrictions.
- When an allowed range is specified any narrowing will be refused, ie device
- attachment can fail if the device requires limiting within the allowed range.
- Automatic IOVA allocation is also impacted by this call. MAP will only
- allocate within the allowed IOVAs if they are present.
According to iopt_check_iova(), FIXED_IOVA can specify an iova which is not in the allowed list but is within the reported IOVA_RANGES. Is that correct, or would it make more sense to put FIXED_IOVA under the guard of the allowed list too (failing the map call upon violation)?
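That would mean a containment check against the allowed tree in the map
path, roughly like the below (only a sketch: it assumes the allowed list
lives in iopt->allowed_itree, that allowed ranges are kept merged so a
valid request is always covered by a single node, and the -EPERM errno is
illustrative):

        struct interval_tree_node *node;

        node = interval_tree_iter_first(&iopt->allowed_itree, iova,
                                        iova + length - 1);
        if (!node || node->start > iova || node->last < iova + length - 1)
                return -EPERM;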
+/**
- struct iommu_ioas_unmap - ioctl(IOMMU_IOAS_UNMAP)
- @size: sizeof(struct iommu_ioas_unmap)
- @ioas_id: IOAS ID to change the mapping of
- @iova: IOVA to start the unmapping at
- @length: Number of bytes to unmap, and return back the bytes unmapped
- Unmap an IOVA range. The iova/length must be a superset of a previously
- mapped range used with IOMMU_IOAS_PAGETABLE_MAP or COPY.
remove 'PAGETABLE'
+/**
- enum iommufd_option
- @IOMMU_OPTION_RLIMIT_MODE:
- Change how RLIMIT_MEMLOCK accounting works. The caller must have privilege
- to invoke this. Value 0 (default) is user based accouting, 1 uses process
- based accounting. Global option, object_id must be 0
- @IOMMU_OPTION_HUGE_PAGES:
- Value 1 (default) allows contiguous pages to be combined when generating
- iommu mappings. Value 0 disables combining, everything is mapped to
- PAGE_SIZE. This can be useful for benchmarking. This is a per-IOAS
- option, the object_id must be the IOAS ID.
What about an HWPT ID? Is there value in supporting HWPTs with different mapping sizes attached to the same IOAS?
+/**
- @size: sizeof(struct iommu_option)
- @option_id: One of enum iommufd_option
- @op: One of enum iommufd_option_ops
- @__reserved: Must be 0
- @object_id: ID of the object if required
- @val64: Option value to set or value returned on get
- Change a simple option value. This multiplexor allows controlling options
- on objects. IOMMU_OPTION_OP_SET will load an option and IOMMU_OPTION_OP_GET
- will return the current value.
- */
This is quite generic. Does it imply that future device capability reporting could also be implemented based on this cmd, i.e. an OP_GET on a device object?
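For illustration, the multiplexor can be driven from userspace like this (a
sketch; the ioctl, enum values, and struct fields come from this patch, the
wrapper function itself is made up):

        #include <sys/ioctl.h>
        #include <linux/iommufd.h>

        /* Disable page combining on one IOAS, e.g. for benchmarking */
        static int ioas_disable_huge_pages(int iommufd, __u32 ioas_id)
        {
                struct iommu_option cmd = {
                        .size = sizeof(cmd),
                        .option_id = IOMMU_OPTION_HUGE_PAGES,
                        .op = IOMMU_OPTION_OP_SET,
                        .object_id = ioas_id,
                        .val64 = 0,
                };

                return ioctl(iommufd, IOMMU_OPTION, &cmd);
        }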