Hello everyone,
Moving the discussion from google doc [1] to email (Sorry, it was becoming
impossible to follow and reply there).
FWIW, I am not the best person to answer all questions here, that would be
Armelle as she understand the requirements and the end goal much better than I
do. I can though try to answer from kernel's perspective, based on whatever
implementation we have right now.
AFAIK, the broad idea is to implement two virtio communication paths from pVM,
one to the Linux host (via virtio-pci) and another one to Trusty (via
virtio-msg-ffa). In order to not take performance hit at runtime (to map io
buffers), the idea is to map whatever amount of memory we can at the beginning
and then keep allocating from there.
Current setup:
What we have achieved until now is virtio-msg communication between host and
trusty. We have implemented FFA specific dma-hal [2] to perform FFA memory
sharing with trusty. With "reserved-mem" and "memory-region" DT entries (not
sure if that is the final solution), we are able to allocate memory the FFA
device (which represents bus for all the enumerated devices between
trusty/host). This memory is shared with trusty at probe time (from
virtio-msg-ffa layer) and the DMA hal later allocates memory from there for
coherent allocations and bounce buffers. This works just fine right now.
Now looking at "dynamic mapping" section in [1] we are not sure if that will
work fine for the end use case, pVM to trusty. It looks like the coco
implementation will always end up using dma encrypt/decrypt when a pVM is
running and share the memory with host, even when all we want to do is share
with trusty. Is that understanding correct ? We would also want to establish
virtio-pci (existing tech) based communication between pVM and host, which
should use mem decrypt path (?).
I am not sure if we need contiguous PA here, contiguous IPA should be
sufficient, Armelle?
We are also looking for further suggestions to improve the design, as my
understanding of memory mapping, dma hal etc. is limited, and maybe there are
better ways to do this.
--
Viresh
[1] https://docs.google.com/document/d/1KyxclKngQ0MShX8Q1YPstZQVIDIctkxfj6lQ0V3…
[2] https://web.git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git/tree/…
Bertrand,
I know you are on vacation but here is my full list. It took longer to
flesh out from my pencil notes than I thought.
While you are gone I will try to tackle all of these these in a PR w/ one
commit per item.
I will at least get to some of them.
I will create issues for things I think need discussion.
Thanks,
Bill
--------------
Global:
* Eliminate all use of "frontend" and "backend", can use device side and
driver side when just device or driver is not enough
* Use device number (dev_num) instead of Device ID to avoid confusion with
device type (called Device ID in spec)
* Address out of band notification as an option, any polling that is
required should be done by the bus implementation, not the common layer.
* for all EVENT_ messages change "does not require a response" to "does not
have a response". There is no optional response; a response is forbidden.
4.4.2.1 Bus level feature bits
* are these local to the bus implementation or communicated to the
virtio-msg transport layer? (I vote local)
* what would be an example of a bus feature that would need to be known at
the common layer that is NOT a device feature?
4.4.2.1.1 max message size definition.
Does this include or exclude headers added by the bus implementation? I
suggest it excludes it.
4.4.2.1.1 max message size of 64K
* This seems a bit excessive. It may cause people to invent new uses for
the message channel that we do not intend.
* Should we lower the max message size?
* Should we add a note that says no use of message sizes above 288 bytes is
envisioned?
(288 = 256 bytes of config data plus 32 bytes for message overhead.
Right now message overhead is 18 for SET_CONFIG so 32 is plenty but 16 is
too small.)
4.4.2.3 Config Generation Count
* (self note: Config gen count only needs to be incremented once per
*driver visible* change
A device can do a whole set of changes and MAY/SHOULD only increment the
gen count by one until the new gen count is published to the other side via
EVENT_CONFIG or GET/SET_CONFIG
However this is basically the same thing from the spec POV. )
* Should we say that Config gen count starts at 0 (SHOULD? MUST?)
* Does config gen count reset at device reset or not?
If not then maybe we don't say it starts at zero either and the driver
MUST query it
You do not need to know the config gen counter to GET_CONFIG, only to
SET_CONFIG
4.4.2.5.1 Error codes
* Are error codes aligned to admin queues yet?
4.4.2.6 Bus vs Transport messages
* Mention shared memory setup and out of band notification in bus message
overview
* Mention bus implementation specific messages in bus message overview
* Remove BUS_MSG_STATUS for now (or allow request/response and maybe add
EVENT_ version w/o response)
4.4.2.8 dev_id to dev_num or dev_number
4.4.3.3 Resetting the BUS
* We decided to drop this for now
* When we do it graceful and abortive reset should be handled
* For graceful device initiated reset, it should be handled by setting
state to NEEDS_RESET
4.4.4.2 Device Info
* add max number of virtqueues
(or define a null response for GET_VQ)
4.4.4.3.1 error if *index* is beyond
elsewhere it says pad with zeros
(why do we need this just return zeros)
4.4.4.5
we need max number of virtqueue in DEVICE_INFO OR return maximum size
of 0 for all unsupported indexes on GET and ERROR on set
4.4.4.6 Mention Device status change w/o config change as another possible
reason to get
4.4.4.7
Actually virtqueues and come and go during the life of the device. The
STATUS can (should?) be set to DRIVER OK before any virtqueues are set.
4.4.5.1
* what are NOTIFY_ON_AVAIL and NEGOCIATE_DATA (sp)? These appear no where
else in the spec.
* all existing transports have a way to notify driver -> device per virtqueue
4.4.5.1.1
* virtqueue id is always present
* If VIRTIO_F_NOTIFICATION_DATA has not been negotiated then next_off and
next_wrap should be 0 in message
(Bertrand did not like using msg_length to exclude them).
* We might want to update the text for a better description or just delete
it and leave it to the VIRTIO_F_NOTIFICATION_DATA feature description.
* Should we support VIRTIO_F_NOTIF_CONFIG_DATA? If so then virtqueue_index
can be vq_notif_config_data instead
4.4.5.2 Device Notifications
* Maybe here would be a place to mention OoB notifications and polling?
* I don't think this is a MAY from the transport level.
* The Bus MUST give these messages to the transport level. It can get them
from messages on the bus, from OoB notifications, or generating them
periodically to stimulate polling.
4.4.5.2.1 EVENT_CONFIG
* sent on config OR STATUS change
* we talked about a new version of this message with no data to use for OoB
and polling stimulation
4.4.5.2.2 EVENT_USED
* should we define virtqueue index of -1 as "all virtqueues?
* we could put the loop in the bus instead, this would handle the case of 2
virtqueues on MSIX #2 and 1 one virtqueue on MSIX #3
4.4.5.3 Configuration Changes During Operation
* Feature re-negotiation is not allowed by the spec. drop this part of the text
From 2.2 Feature Bits:
"""
Each virtio device offers all the features it understands. During device
initialization, the driver reads this and
tells the device the subset that it accepts. The only way to renegotiate is
to reset the device.
"""
4.4.5.4 Device Rest and Shutdown
* reword last sentence, it is valid to not re-init the device if your not
going to use it again
4.4.5.4.1 Device-Initated Rest
* match the other transports here. The device sends an async status change
with DEVICE_NEEDS_RESET set and waits for driver to reset the device via a
SET_STATUS of 0.
4.4.6.1.1 Bigger gap between request response messages and EVENT_ messages
* Not a lot of space to add new request/response messages
* Renumber to 0x80, 0x81, 0x82
4.4.6.2 VIRTIO_MSG_ERROR
* no way to match an error to multiple outstanding messages of the same type
** Right now we assume we only have one outstanding request message per
device.
This will probibly be true in Linux.
However a different OS could choose to do multiple SET_VQ in parrallel
(for example).
There is no way to coordinate which SET_VQ a VIRTIO_MSG_ERROR pairs with.
** Solutions
1) add a transaction id at the bus level and do all matching based on that
2) add a transaction id at the transport common header
3) add another field to the error message that would differenciate
(index for VQ). This makes matching logic per msg_id
4) assume one message per device serialization
5) #4 UNLESS #1
4.4.6.3 VIRTIO_MSG_GET_DEVICE_INFO
* add max num virtqueues
* add number of Shared memory Regions (section 2.3) (do this in same patch
that adds GET_SHM message)
4.4.6.4 VIRTIO_MSG_GET_FEATURES
* Most transports have device features and driver features and allow both
to be read.
device features always reflect the capabilities of the device.
driver features are written with the feastures the driver wants to use
and read back the negotiated features
It is clear that the SET_FEATURES writes the driver desired features and
gets back the negotiated features
What does GET_FEATURES read? I presume it reads device features before
negotiation and negotiated features after negotiation
* Should we add a param to specify DEVICE to DRIVER features to the
GET_FEATURES message for parity with other transports?
4.4.6.4 & .5
* make it clear that a feature block is a set of 32 feature bits
* feature bits beyond the number of max feature bits always read as zero
4.4.6.9 SET_DEVICE_STATUS
* return the new status. It is vald for the device to refuse to set the
FEATURES_OK bit. (3.1.1 number 6) A set status might also cause a
DEVICE_NEEDS_RESET.
4.4.6.11 SET_VQUEUE
* why is the info echoed back, can it ever change?
4.4.6.12 RESET_VQUEUE
* Does this need to be negotiated?
* VIRTIO_F_RING_RESET is the feature bit
4.4.6.13 VIRTIO_MSG_EVENT_CONFIG
* see above about adding no data version of this
4.4.6.14 VIRTIO_MSG_EVENT_AVAIL
* Next offset and next wrap should be 0 if not negotiated
* should we add virtqueue index == -1 means all virtqueues (this would be
use on device side between bus and transport for OoB or polling stimulation
4.4.6.15 VIRTIO_MSG_EVENT_USED
* should we add virtqueue index == -1 means all virtqueues
(this would be use on driver side bus and transport for OoB or polling
stimulation and gives parity with virtio-mmio and base level of virtio-pci)
4.4.7.2 BUS_MSG_GET_DEVICES
* needs better description of bit packing
* offset is in bits, bytes, 32 bit words??
* number of device_nums requested, this is number of bits or number of
packed data bytes / words. Must be multiple of
* better description on how bits are packed into array
4.4.7.4 BUS_MSG_DEVICE_REMOVED
* how do we handle a graceful eject vs an abortive one?
* add indicator
A) already removed, deal with it
B) removing now, don't start new data transfers, I will finish existing
ones
C) removing soon, write any dirty data and then remove or reset device
yourself. May escalate to B or A if driver does not take action
D) ask to eject, can be refused
Add BUS_MSG_DEVICE_REMOVE as driver to device MSG. Driver side say it is
done with this device.
4.4.7.6 BUS_MSG_STATUS
* lots of problems with this being a no response message
* remove from spec for now
4.4.8.* Conformance
(Initial thoughts. I did not review this section rigorously.)
* Most of this is "do the right thing" without any specifics.
* Other transports spread the requirements into the various sections. We
should probibly do the same.
One thing that needs to be specified is when to return ERROR and when to
return "benign data".
Ex: getting feature bits that don't exist, return zeros or return error
Ex: setting features bit that don't exist to zero; OK or error
Ex: setting feature bits that don't exist to 1; return negotiated 0 or
return error
The mmio transport will just return zeros for these cases. Why can't we?
* We need to thing about notifications and events. I don't think they are
optional. They are not optional in virtio-mmio and virtio-pci.
4.4.8.2.1 General Transport Requirements
I don't see a precedent for common requirements. If no precedent is found,
repeat the requirements in device and driver sections.
4.4.8.6 Versioning and Forward Compatibility
""
If a bus instance or device does not
support an advanced feature, it MUST reject or ignore those requests
cleanly using VIRTIO_MSG_ERROR,
rather than undefined behavior.
"""
Reject is via MSG_ERROR that is clear. If we negotiate a protocol version
do we expect to send unknown messages? The option to Ignore messages we
don't understand seems wrong as we don't know if they are important.
--
Bill Mills
Principal Technical Consultant, Linaro
+1-240-643-0836
TZ: US Eastern
Work Schedule: Tues/Wed/Thur
Hi Everyone,
I pushed a major change to the pull request for the spec here:
https://github.com/Linaro/virtio-msg-spec/pull/16
I reworked the chapters and the content to look a lot more like other virtio transports and follow the sentence convention (MUST, SHALL, SHOULD).
Some sentences have been reworded with the help of chatgpt so we need to have a careful review of this but i think it looks a lot better than before.
please review based on the final content and not using the tree of patches as the last one is more or less rewriting everything.
Some things to think of:
- message format description: should we switch to structures instead of tables ? I did that for the header and channel I/O is using structures which might look better
- global terminology: i struggle a bit between driver/device vs frontend/backend and we must have a check on the coherency
- compliance: this is the bare output of chatgpt more for discussing than keeping it but it could give us ideas if we want to have something (Warning the compliance chapter content might be wrong).
As always: any comments are welcome :-)
Cheers
Bertrand
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.