Hello,
As discussed yesterday during the call, I have prepared the list of instructions
to reproduce my virtio-msg loopback setup. Please give it a try and let me know if
something doesn't work.
This is based on the latest version of the spec (from Bill's latest pull request
today).
--
Viresh
[1] https://linaro.atlassian.net/wiki/spaces/HVAC/pages/30104092673
This series adds the virtio-msg transport layer.
The individuals and organizations involved in this effort have had difficulty in
using the existing virtio-transports in various situations and desire to add one
more transport that performs its transport layer operations by sending and
receiving messages.
Implementations of virtio-msg will normally be done in multiple layers:
* common / device level
* bus level
The common / device level defines the messages exchanged between the driver
and a device. This common part should lead to a common driver holding most
of the virtio specifics and can be shared by all virtio-msg bus implementations.
The kernel implementation in [3] shows this separation. As with other transport
layers, virtio-msg should not require modifications to existing virtio device
implementations (virtio-net, virtio-blk etc). The common / device level is the
main focus of this version of the patch series.
The virtio-msg bus level implements the normal things a bus defines
(enumeration, dma operations, etc) but also implements the message send and
receive operations. A number of bus implementations are envisioned,
some of which will be reusable and general purpose. Other bus implementations
might be unique to a given situation, for example only used by a PCIe card
and its driver.
How much of the bus level should be described in the virtio spec is one item
we wish to discuss. This draft takes a middle approach by describing the bus
level and defining some standard bus level messages that MAY be used by the bus.
It also describes a range of bus messages that are implementation dependent.
The standard bus messages are an effort to avoid different bus implementations
doing the same thing in different ways for no good reason. However, the
different environments will require different things. Instead of trying to
anticipate all needs and provide something very abstract, we think
implementation specific messages will be needed at the bus level. Over time,
if we see similar messages across multiple bus implementations, we will move to
standardize a bus level message for that.
We are working on two reusable bus implementations:
* virtio-msg-ffa based on Arm FF-A interface for use between:
  * normal world and secure world
  * host and VM or VM to VM
  * Can be used w/ or without a hypervisor
  * Any hypervisor that implements FF-A can be used
* virtio-msg-amp for use between heterogeneous systems
  * The main processor and its co-processors on an AMP SoC
  * Two or more systems connected via PCIe
  * Minimal requirements: bi-directional interrupts and
    at least one shared memory area
We also anticipate a few more:
* virtio-msg-xen specific to Xen
  * Usable on any Xen system (including x86 where FF-A does not exist)
  * Using Xen events and page grants
* virtio-msg-loopback for userspace implemented devices
  * Allows user space to provide devices to its own kernel
  * This is similar to fuse, cuse or loopback block devices but for virtio
  * Once developed this can provide a single kernel demo of virtio-msg
  * [Work has begun on this]
* virtio-msg over admin virtqueues
  * This allows any virtio-pci device that supports admin virtqueues to also
    support a virtio-msg bus that supports sub devices
  * [We are looking for collaborators for this work]
Changes since RFC1:
* reformatted document to better conform to the virtio spec style
  - created an introduction chapter
  - created a basic concept chapter
  - created bus operation and device initialization and operation chapters
  - reworked description of transport and bus messages
  - attempted a "compliance chapter"
  - reused spec macros
  - switched to MAY/MUST/SHALL/SHOULD wording
  - eliminated the terms front-end and back-end in favor of driver and device
* made the maximum message size variable per bus instance
* use "device number" for virtio-msg device instances on the bus instead of
  adding yet another meaning for "device ID"
* added the configuration generation count and its use
* described types of things that can be done with bus specific messages
  such as setup of bus level shared memory and out of band notifications
* removed wording of notification being optional at transport level and
  described bus level notifications of in-band, out-of-band, and polled
  from driver side bus
* removed the ERROR message from transport level. Errors should be handled at
  the bus level to better match virtio-pci and virtio-mmio
* removed bus level reset and status from standard bus messages
* replaced bus messages DEVICE_{ADDED,REMOVED} with EVENT_DEVICE
* changed names to GET_DEVICE_FEATURES and SET_DRIVER_FEATURES for clarity
* made SET_DEVICE_STATE return new state as it may not match
* Don't echo back the data in SET_VQUEUE (it cannot change)
* defined request/response vs event message id ranges
* match field size of next offset and wrap to virtio-{pci,mmio}
* added maximum number of virtqueues to DEVICE_INFO
* added admin virtqueue and SHM support
This series is a work in progress and we acknowledge at least the following
issues we need to work on:
* Better conformance documentation
* Publish an update to Arm FF-A spec that shows virtio-msg binding (work underway)
* Publish virtio-msg-amp data structures and messages somewhere
* Align implementations to this version and send PATCH v1 (non-rfc)
Background info and work in progress implementations:
* HVAC project page with intro slides [1]
* HVAC demo repo w/ instructions in README.md [2]
* Kernel w/ virtio-msg common level and ffa support [3]
* QEMU w/ support for one form of virtio-msg-amp [4]
* Portable RTOS library w/ one form of virtio-msg-amp [5]
In addition to the QEMU system based demos in the hvac-demo repo, we also have
two hardware systems running:
* AMD x86 + AMD Arm Versal connected via PCIe
* ST STM32MP157 A7 Linux using virtio-i2c provided by M4 Zephyr
Please note that although the demos work, they are not yet aligned with each
other nor this version of the spec.
[1] https://linaro.atlassian.net/wiki/spaces/HVAC/overview
[2] https://github.com/wmamills/hvac-demo
[3] https://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git/log/?h=vi…
[4] https://github.com/edgarigl/qemu/commits/edgar/virtio-msg-new
[5] https://github.com/arnopo/open-amp/commits/virtio-msg/
Bill Mills (1):
virtio-msg: Add virtio-msg, a message based virtio transport layer
content.tex | 1 +
transport-msg.tex | 1354 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 1355 insertions(+)
create mode 100644 transport-msg.tex
--
2.34.1
All,
The pull request referenced here covers the remaining items needed before
sending RFCv2. Please take a look and let me know if you see issues. You
can comment in the PR or send email to the list.
https://github.com/Linaro/virtio-msg-spec/pull/20
Thanks,
Bill
--
Bill Mills
Principal Technical Consultant, Linaro
+1-240-643-0836
TZ: US Eastern
Work Schedule: Tues/Wed/Thur
Hello everyone,
Moving the discussion from the Google doc [1] to email (sorry, it was becoming
impossible to follow and reply there).
FWIW, I am not the best person to answer all the questions here; that would be
Armelle, as she understands the requirements and the end goal much better than I
do. I can, though, try to answer from the kernel's perspective, based on the
implementation we have right now.
AFAIK, the broad idea is to implement two virtio communication paths from the pVM,
one to the Linux host (via virtio-pci) and another to Trusty (via
virtio-msg-ffa). To avoid a performance hit at runtime (from mapping I/O
buffers), the idea is to map as much memory as we can at the beginning
and then keep allocating from there.
Current setup:
What we have achieved so far is virtio-msg communication between the host and
Trusty. We have implemented an FFA-specific DMA HAL [2] to perform FFA memory
sharing with Trusty. With "reserved-mem" and "memory-region" DT entries (not
sure if that is the final solution), we are able to allocate memory for the FFA
device (which represents the bus for all the enumerated devices between
Trusty and the host). This memory is shared with Trusty at probe time (from
the virtio-msg-ffa layer) and the DMA HAL later allocates memory from there for
coherent allocations and bounce buffers. This works just fine right now.
Now, looking at the "dynamic mapping" section in [1], we are not sure if that will
work for the end use case, pVM to Trusty. It looks like the CoCo
implementation will always end up using DMA encrypt/decrypt when a pVM is
running and share the memory with the host, even when all we want to do is share
with Trusty. Is that understanding correct? We would also want to establish
virtio-pci (existing tech) based communication between the pVM and the host, which
should use the mem decrypt path (?).
I am not sure if we need contiguous PA here; contiguous IPA should be
sufficient. Armelle?
We are also looking for further suggestions to improve the design, as my
understanding of memory mapping, the DMA HAL, etc. is limited, and maybe there are
better ways to do this.
--
Viresh
[1] https://docs.google.com/document/d/1KyxclKngQ0MShX8Q1YPstZQVIDIctkxfj6lQ0V3…
[2] https://web.git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git/tree/…
Bertrand,
I know you are on vacation but here is my full list. It took longer to
flesh out from my pencil notes than I thought.
While you are gone I will try to tackle all of these in a PR w/ one
commit per item.
I will at least get to some of them.
I will create issues for things I think need discussion.
Thanks,
Bill
--------------
Global:
* Eliminate all use of "frontend" and "backend", can use device side and
driver side when just device or driver is not enough
* Use device number (dev_num) instead of Device ID to avoid confusion with
device type (called Device ID in spec)
* Address out of band notification as an option, any polling that is
required should be done by the bus implementation, not the common layer.
* for all EVENT_ messages change "does not require a response" to "does not
have a response". There is no optional response; a response is forbidden.
4.4.2.1 Bus level feature bits
* are these local to the bus implementation or communicated to the
virtio-msg transport layer? (I vote local)
* what would be an example of a bus feature that would need to be known at
the common layer that is NOT a device feature?
4.4.2.1.1 max message size definition.
Does this include or exclude headers added by the bus implementation? I
suggest it excludes it.
4.4.2.1.1 max message size of 64K
* This seems a bit excessive. It may cause people to invent new uses for
the message channel that we do not intend.
* Should we lower the max message size?
* Should we add a note that says no use of message sizes above 288 bytes is
envisioned?
(288 = 256 bytes of config data plus 32 bytes for message overhead.
Right now message overhead is 18 for SET_CONFIG so 32 is plenty but 16 is
too small.)
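For reference, the arithmetic behind the 288-byte figure as a small sketch. The constant and function names are made up for illustration; only the numbers come from the note above.

```python
CONFIG_DATA_MAX = 256       # bytes of config data carried in one message
OVERHEAD_BUDGET = 32        # bytes reserved for message overhead
SET_CONFIG_OVERHEAD = 18    # current SET_CONFIG overhead quoted above


def overhead_fits(budget, needed=SET_CONFIG_OVERHEAD):
    """True if a given overhead budget covers the largest known overhead."""
    return needed <= budget


suggested_max = CONFIG_DATA_MAX + OVERHEAD_BUDGET
print(suggested_max)          # 288
print(overhead_fits(32))      # True
print(overhead_fits(16))      # False
```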
4.4.2.3 Config Generation Count
* (self note: Config gen count only needs to be incremented once per
*driver visible* change
A device can do a whole set of changes and MAY/SHOULD only increment the
gen count by one until the new gen count is published to the other side via
EVENT_CONFIG or GET/SET_CONFIG
However this is basically the same thing from the spec POV. )
* Should we say that Config gen count starts at 0 (SHOULD? MUST?)
* Does config gen count reset at device reset or not?
If not then maybe we don't say it starts at zero either and the driver
MUST query it
You do not need to know the config gen counter to GET_CONFIG, only to
SET_CONFIG
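A toy model of the generation-count behaviour discussed above, to make the "once per driver-visible change" idea concrete. This is only a sketch: the class and method names are hypothetical, the counter starting at 0 is one of the open questions rather than a decision, and rejecting a SET_CONFIG that carries a stale generation is an assumption about how the count would be used.

```python
class ConfigSpace:
    """Toy device-side config space with a generation counter."""

    def __init__(self, size=8):
        self.data = bytearray(size)
        self.generation = 0     # assumption: starts at 0 (open question above)
        self._dirty = False     # changed since the generation was last published?

    def device_write(self, offset, payload):
        """Device-side change; bump the counter at most once per published value."""
        self.data[offset:offset + len(payload)] = payload
        if not self._dirty:
            self.generation += 1
            self._dirty = True

    def get_config(self, offset, length):
        """GET_CONFIG: needs no generation from the driver, publishes the current one."""
        self._dirty = False
        return self.generation, bytes(self.data[offset:offset + length])

    def set_config(self, driver_generation, offset, payload):
        """SET_CONFIG: refused (assumption) when the driver's generation is stale."""
        self._dirty = False     # the reply publishes the current generation
        if driver_generation != self.generation:
            return False, self.generation
        self.data[offset:offset + len(payload)] = payload
        return True, self.generation


cfg = ConfigSpace()
cfg.device_write(0, b"\x01")    # two device changes before the driver looks ...
cfg.device_write(1, b"\x02")
print(cfg.get_config(0, 2))     # ... still show up as a single generation step
```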
4.4.2.5.1 Error codes
* Are error codes aligned to admin queues yet?
4.4.2.6 Bus vs Transport messages
* Mention shared memory setup and out of band notification in bus message
overview
* Mention bus implementation specific messages in bus message overview
* Remove BUS_MSG_STATUS for now (or allow request/response and maybe add
EVENT_ version w/o response)
4.4.2.8 dev_id to dev_num or dev_number
4.4.3.3 Resetting the BUS
* We decided to drop this for now
* When we do it, graceful and abortive resets should both be handled
* For graceful device initiated reset, it should be handled by setting
state to NEEDS_RESET
4.4.4.2 Device Info
* add max number of virtqueues
(or define a null response for GET_VQ)
4.4.4.3.1 error if *index* is beyond
elsewhere it says pad with zeros
(why do we need this just return zeros)
4.4.4.5
we need max number of virtqueue in DEVICE_INFO OR return maximum size
of 0 for all unsupported indexes on GET and ERROR on set
4.4.4.6 Mention Device status change w/o config change as another possible
reason to get
4.4.4.7
Actually virtqueues can come and go during the life of the device. The
STATUS can (should?) be set to DRIVER OK before any virtqueues are set.
4.4.5.1
* what are NOTIFY_ON_AVAIL and NEGOCIATE_DATA (sp)? These appear nowhere
else in the spec.
* all existing transports have a way to notify driver -> device per virtqueue
4.4.5.1.1
* virtqueue id is always present
* If VIRTIO_F_NOTIFICATION_DATA has not been negotiated then next_off and
next_wrap should be 0 in message
(Bertrand did not like using msg_length to exclude them).
* We might want to update the text for a better description or just delete
it and leave it to the VIRTIO_F_NOTIFICATION_DATA feature description.
* Should we support VIRTIO_F_NOTIF_CONFIG_DATA? If so then virtqueue_index
can be vq_notif_config_data instead
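A short sketch of the "always present, zero when not negotiated" rule for next_off and next_wrap. The feature bit number (38) is the one the virtio spec assigns to VIRTIO_F_NOTIFICATION_DATA; the function name, the dict form of the message and the vq_index key are placeholders, not the wire format.

```python
VIRTIO_F_NOTIFICATION_DATA = 38   # feature bit number from the virtio spec


def build_event_avail(negotiated_features, vq_index, next_off=0, next_wrap=0):
    """Build the fields of an EVENT_AVAIL message (illustrative dict, not wire format)."""
    if not (negotiated_features >> VIRTIO_F_NOTIFICATION_DATA) & 1:
        # Feature not negotiated: the fields stay in the message but carry 0.
        next_off, next_wrap = 0, 0
    return {"vq_index": vq_index, "next_off": next_off, "next_wrap": next_wrap}


without = build_event_avail(0, 3, next_off=5, next_wrap=1)
with_data = build_event_avail(1 << VIRTIO_F_NOTIFICATION_DATA, 3, next_off=5, next_wrap=1)
print(without)
print(with_data)
```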
4.4.5.2 Device Notifications
* Maybe here would be a place to mention OoB notifications and polling?
* I don't think this is a MAY from the transport level.
* The Bus MUST give these messages to the transport level. It can get them
from messages on the bus, from OoB notifications, or generating them
periodically to stimulate polling.
4.4.5.2.1 EVENT_CONFIG
* sent on config OR STATUS change
* we talked about a new version of this message with no data to use for OoB
and polling stimulation
4.4.5.2.2 EVENT_USED
* should we define virtqueue index of -1 as "all virtqueues"?
* we could put the loop in the bus instead, this would handle the case of 2
virtqueues on MSIX #2 and 1 virtqueue on MSIX #3
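A minimal sketch of the "loop in the bus" option, assuming the bus keeps a table from interrupt vector to virtqueue indexes. The table, the function name and the tuple form of the event are hypothetical and not taken from the spec draft.

```python
def used_events_for_vector(vector_to_vqs, vector):
    """Expand one out-of-band interrupt into one EVENT_USED per virtqueue."""
    return [("EVENT_USED", vq) for vq in vector_to_vqs.get(vector, [])]


# Two virtqueues share MSI-X vector 2, one virtqueue uses vector 3.
vector_to_vqs = {2: [0, 1], 3: [2]}
print(used_events_for_vector(vector_to_vqs, 2))
print(used_events_for_vector(vector_to_vqs, 3))
```

With this approach the transport never needs an "all virtqueues" index; only the bus has to know which queues sit behind each vector.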
4.4.5.3 Configuration Changes During Operation
* Feature re-negotiation is not allowed by the spec. drop this part of the text
From 2.2 Feature Bits:
"""
Each virtio device offers all the features it understands. During device
initialization, the driver reads this and
tells the device the subset that it accepts. The only way to renegotiate is
to reset the device.
"""
4.4.5.4 Device Reset and Shutdown
* reword last sentence, it is valid to not re-init the device if you're not
going to use it again
4.4.5.4.1 Device-Initiated Reset
* match the other transports here. The device sends an async status change
with DEVICE_NEEDS_RESET set and waits for driver to reset the device via a
SET_STATUS of 0.
4.4.6.1.1 Bigger gap between request response messages and EVENT_ messages
* Not a lot of space to add new request/response messages
* Renumber to 0x80, 0x81, 0x82
4.4.6.2 VIRTIO_MSG_ERROR
* no way to match an error to multiple outstanding messages of the same type
** Right now we assume we only have one outstanding request message per
device.
This will probably be true in Linux.
However a different OS could choose to do multiple SET_VQ in parallel
(for example).
There is no way to coordinate which SET_VQ a VIRTIO_MSG_ERROR pairs with.
** Solutions
1) add a transaction id at the bus level and do all matching based on that
2) add a transaction id at the transport common header
3) add another field to the error message that would differentiate
(index for VQ). This makes matching logic per msg_id
4) assume one message per device serialization
5) #4 UNLESS #1
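Solution 1 or 2 could look roughly like the sketch below: every request carries a transaction id and the response or error echoes it. The class, its methods and the string message names are illustrative only; where the id would live (bus header or transport common header) is exactly the open question above.

```python
import itertools


class PendingRequests:
    """Driver-side table of outstanding requests keyed by transaction id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def send(self, msg_name, params):
        """Record a request and return the transaction id to put in the message."""
        txn = next(self._ids)
        self._pending[txn] = (msg_name, params)
        return txn

    def complete(self, txn):
        """Match a response or error to its request; raises KeyError if unknown."""
        return self._pending.pop(txn)


reqs = PendingRequests()
first = reqs.send("SET_VQUEUE", {"index": 0})
second = reqs.send("SET_VQUEUE", {"index": 1})
# An error that echoes `second` pairs with the SET_VQUEUE for index 1,
# even though both outstanding requests have the same message type.
print(reqs.complete(second))
```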
4.4.6.3 VIRTIO_MSG_GET_DEVICE_INFO
* add max num virtqueues
* add number of Shared memory Regions (section 2.3) (do this in same patch
that adds GET_SHM message)
4.4.6.4 VIRTIO_MSG_GET_FEATURES
* Most transports have device features and driver features and allow both
to be read.
Device features always reflect the capabilities of the device.
Driver features are written with the features the driver wants to use
and read back as the negotiated features.
It is clear that SET_FEATURES writes the driver's desired features and
gets back the negotiated features.
What does GET_FEATURES read? I presume it reads device features before
negotiation and negotiated features after negotiation.
* Should we add a param to specify DEVICE or DRIVER features to the
GET_FEATURES message for parity with other transports?
4.4.6.4 & .5
* make it clear that a feature block is a set of 32 feature bits
* feature bits beyond the number of max feature bits always read as zero
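To illustrate the two points above: a feature block is one 32-bit slice of the feature bitmap, and any block past the device's maximum simply reads as zero instead of producing an error. The function name and its parameters (first_block, num_blocks) are placeholders rather than the message fields from the draft.

```python
FEATURE_BLOCK_BITS = 32


def get_feature_blocks(features, max_bits, first_block, num_blocks):
    """Return num_blocks 32-bit feature blocks starting at first_block."""
    features &= (1 << max_bits) - 1          # nothing exists past max_bits
    blocks = []
    for block in range(first_block, first_block + num_blocks):
        shift = block * FEATURE_BLOCK_BITS
        blocks.append((features >> shift) & 0xFFFFFFFF)
    return blocks


# Bits 0, 32 and 38 are real features; bit 70 lies beyond the 64-bit maximum.
features = (1 << 0) | (1 << 32) | (1 << 38) | (1 << 70)
print([hex(b) for b in get_feature_blocks(features, 64, 0, 3)])
```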
4.4.6.9 SET_DEVICE_STATUS
* return the new status. It is valid for the device to refuse to set the
FEATURES_OK bit. (3.1.1 number 6) A set status might also cause a
DEVICE_NEEDS_RESET.
4.4.6.11 SET_VQUEUE
* why is the info echoed back, can it ever change?
4.4.6.12 RESET_VQUEUE
* Does this need to be negotiated?
* VIRTIO_F_RING_RESET is the feature bit
4.4.6.13 VIRTIO_MSG_EVENT_CONFIG
* see above about adding no data version of this
4.4.6.14 VIRTIO_MSG_EVENT_AVAIL
* Next offset and next wrap should be 0 if not negotiated
* should we add virtqueue index == -1 means all virtqueues (this would be
used on device side between bus and transport for OoB or polling stimulation)
4.4.6.15 VIRTIO_MSG_EVENT_USED
* should we add virtqueue index == -1 means all virtqueues
(this would be used on driver side between bus and transport for OoB or polling
stimulation and gives parity with virtio-mmio and base level of virtio-pci)
4.4.7.2 BUS_MSG_GET_DEVICES
* needs better description of bit packing
* offset is in bits, bytes, 32 bit words??
* number of device_nums requested, this is number of bits or number of
packed data bytes / words. Must be multiple of
* better description on how bits are packed into array
4.4.7.4 BUS_MSG_DEVICE_REMOVED
* how do we handle a graceful eject vs an abortive one?
* add indicator
A) already removed, deal with it
B) removing now, don't start new data transfers, I will finish existing
ones
C) removing soon, write any dirty data and then remove or reset device
yourself. May escalate to B or A if driver does not take action
D) ask to eject, can be refused
Add BUS_MSG_DEVICE_REMOVE as a driver-to-device message. The driver side says it is
done with this device.
4.4.7.6 BUS_MSG_STATUS
* lots of problems with this being a no response message
* remove from spec for now
4.4.8.* Conformance
(Initial thoughts. I did not review this section rigorously.)
* Most of this is "do the right thing" without any specifics.
* Other transports spread the requirements into the various sections. We
should probably do the same.
One thing that needs to be specified is when to return ERROR and when to
return "benign data".
Ex: getting feature bits that don't exist, return zeros or return error
Ex: setting features bit that don't exist to zero; OK or error
Ex: setting feature bits that don't exist to 1; return negotiated 0 or
return error
The mmio transport will just return zeros for these cases. Why can't we?
* We need to think about notifications and events. I don't think they are
optional. They are not optional in virtio-mmio and virtio-pci.
4.4.8.2.1 General Transport Requirements
I don't see a precedent for common requirements. If no precedent is found,
repeat the requirements in device and driver sections.
4.4.8.6 Versioning and Forward Compatibility
"""
If a bus instance or device does not
support an advanced feature, it MUST reject or ignore those requests
cleanly using VIRTIO_MSG_ERROR,
rather than undefined behavior.
"""
Reject is via MSG_ERROR; that is clear. If we negotiate a protocol version,
do we expect to send unknown messages? The option to ignore messages we
don't understand seems wrong, as we don't know if they are important.
Hi Everyone,
I pushed a major change to the pull request for the spec here:
https://github.com/Linaro/virtio-msg-spec/pull/16
I reworked the chapters and the content to look a lot more like other virtio transports and follow the sentence convention (MUST, SHALL, SHOULD).
Some sentences have been reworded with the help of ChatGPT, so we need a careful review of this, but I think it looks a lot better than before.
Please review based on the final content rather than the tree of patches, as the last one more or less rewrites everything.
Some things to think about:
- message format description: should we switch to structures instead of tables? I did that for the header, and Channel I/O uses structures, which might look better
- global terminology: I struggle a bit between driver/device and frontend/backend, and we must check for consistency
- compliance: this is the bare output of ChatGPT, more for discussion than for keeping, but it could give us ideas if we want to have something (warning: the compliance chapter content might be wrong).
As always: any comments are welcome :-)
Cheers
Bertrand
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi all,
I was wondering if the virtio-msg demos we have for FFA work on Xen?
I have some vague memory that they do.
I wonder if I can use the same lower-level virtio-msg-bus or ffa calls to move
messages between a DomU/Linux and a QEMU backend in Dom0?
Can someone point me to the code? Or documentation?
Thanks!
Best regards,
Edgar
Hi Everyone,
As you know, we have a pending topic around version handling that we need to solve
to get some kind of "stable" version of the protocol. I think we have
over-engineered this a bit so far, so I will try to explain my current train of
thought so that we can have a productive discussion on Thursday.
What are other transports doing and why?
- PCI: there is no real protocol version. Capabilities are used and there
is only one used for legacy devices.
- MMIO: one version register, which can be 1 or 2 and is read-only for the
driver. It informs the driver of the layout of the registers. There is no
negotiation.
- Channel I/O: a revision, a length and some way to set additional options
depending on the revision. Here the driver tells the device which
revision of the protocol it wants to use and the device just says no if it
does not support what the driver wants. Once a revision is selected, any change
must be rejected by the device, and the driver must begin by setting it. The
virtio spec somehow says that this is per device, as the only change was a status
message that did not exist, but it is clear that this must be handled at the
transport level and not in a specific driver.
What do we need?
I think we need something very close to what Channel I/O defines, and it
would be a good idea to reuse the same principles:
- the driver side sets the revision
- the device side just says "Error" if it does not support what is requested
- once the revision is set, only a reset allows changing it
- the driver side should start with the highest revision it supports and go
down until it finds a revision the device side accepts
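The principles above can be sketched as follows. This is only an illustration: the ToyDevice class, its method names and the use of None for "no common revision" are invented for the example, and refusing every further set until reset is one possible reading of the third rule.

```python
class ToyDevice:
    """Device side: accepts one supported revision, then refuses changes until reset."""

    def __init__(self, supported):
        self.supported = set(supported)
        self.revision = None

    def set_revision(self, rev):
        """Return True on success, False for the 'Error' reply."""
        if self.revision is not None or rev not in self.supported:
            return False
        self.revision = rev
        return True

    def reset(self):
        self.revision = None


def negotiate_revision(driver_revisions, try_set_revision):
    """Driver side: try the highest revision first and walk down."""
    for rev in sorted(driver_revisions, reverse=True):
        if try_set_revision(rev):
            return rev
    return None     # no revision in common


dev = ToyDevice(supported={1, 2})
print(negotiate_revision([1, 2, 3], dev.set_revision))   # 3 is refused, 2 is accepted
```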
Now I think that having this per device is not useful, because we want to
have a generic transport, and having different sets of messages per
device due to different revisions seems like complexity
we do not need.
So I think this is something that we should offload to the bus level and
define the following:
- A bus implementation must inform the virtio message generic
transport of the version of the protocol to be used for a specific bus
device instance (i.e. all devices on this instance will use the same
protocol version). This is to be done in an implementation-defined
way.
- A bus implementation must inform the virtio message generic
transport of the maximum message size it supports, in an
implementation-defined way. This can also be per bus device instance.
- It is the bus driver's responsibility to negotiate a version and maximum
message size with a bus device instance.
- It is the bus device's responsibility to know which versions are supported
by its own virtio message generic implementation.
Following those principles, I would propose the following changes to
the specification:
- remove the VIRTIO_MSG_VERSION message
- introduce a BUS_MSG_VERSION message with more or less the same
definition as the current VIRTIO_MSG_VERSION (only simplified by
saying that the driver sets a version and the device says yes or no; the size
would work as it does now)
Some questions for discussion:
- Do you think it is OK to move this to the bus, or should we keep it in
the generic layer to be more "coherent" with other transports?
- Should we provision something like the "data" part in Channel I/O to
have options on a specific revision?
- Anything else?
Cheers
Bertrand