Hello everyone,
Moving the discussion from google doc [1] to email (Sorry, it was becoming
impossible to follow and reply there).
FWIW, I am not the best person to answer all the questions here; that would be
Armelle, as she understands the requirements and the end goal much better than I
do. I can, though, try to answer from the kernel's perspective, based on whatever
implementation we have right now.
AFAIK, the broad idea is to implement two virtio communication paths from the pVM,
one to the Linux host (via virtio-pci) and another one to Trusty (via
virtio-msg-ffa). In order to avoid a performance hit at runtime (to map I/O
buffers), the idea is to map whatever amount of memory we can at the beginning
and then keep allocating from there.
Current setup:
What we have achieved until now is virtio-msg communication between the host and
Trusty. We have implemented an FFA-specific dma-hal [2] to perform FFA memory
sharing with Trusty. With "reserved-mem" and "memory-region" DT entries (not
sure if that is the final solution), we are able to allocate memory for the FFA
device (which represents the bus for all the enumerated devices between
Trusty and the host). This memory is shared with Trusty at probe time (from the
virtio-msg-ffa layer) and the DMA HAL later allocates memory from there for
coherent allocations and bounce buffers. This works just fine right now.
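
A minimal sketch of the "share once at probe time, then keep allocating from
there" idea, in Python for brevity. It is not the dma-hal code from [2]; the
class name, the 64-byte alignment and the first-fit policy are all assumptions
made for illustration.

```python
class SharedPool:
    """First-fit allocator over a region that was shared once, up front.

    Hypothetical model: 'base' and 'size' describe memory already shared
    with the peer, so alloc/release never trigger a new mapping.
    """

    def __init__(self, base, size, align=64):
        self.align = align
        self.free = [(base, size)]  # sorted list of (start, length) extents

    def _round_up(self, length):
        return -(-length // self.align) * self.align

    def alloc(self, length):
        """Return the start address of a block, or None if the pool is full."""
        length = self._round_up(length)
        for i, (start, avail) in enumerate(self.free):
            if avail >= length:
                if avail == length:
                    del self.free[i]
                else:
                    self.free[i] = (start + length, avail - length)
                return start
        return None

    def release(self, start, length):
        """Give a block back and merge it with adjacent free extents."""
        self.free.append((start, self._round_up(length)))
        self.free.sort()
        merged = []
        for s, l in self.free:
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + l)
            else:
                merged.append((s, l))
        self.free = merged


pool = SharedPool(base=0x1000, size=256)
first = pool.alloc(100)   # rounded up to 128 bytes
second = pool.alloc(64)
```

Both coherent allocations and bounce buffers would draw from the same pool in
this model; when it is exhausted, alloc() returns None rather than mapping more
memory at runtime.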
Now, looking at the "dynamic mapping" section in [1], we are not sure if that will
work fine for the end use case, pVM to Trusty. It looks like the coco
implementation will always end up using DMA encrypt/decrypt when a pVM is
running and share the memory with the host, even when all we want to do is share
with Trusty. Is that understanding correct? We would also want to establish
virtio-pci (existing tech) based communication between the pVM and the host, which
should use the mem decrypt path (?).
I am not sure if we need contiguous PA here, contiguous IPA should be
sufficient, Armelle?
We are also looking for further suggestions to improve the design, as my
understanding of memory mapping, dma hal etc. is limited, and maybe there are
better ways to do this.
--
Viresh
[1] https://docs.google.com/document/d/1KyxclKngQ0MShX8Q1YPstZQVIDIctkxfj6lQ0V3…
[2] https://web.git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git/tree/…
Bertrand,
I know you are on vacation but here is my full list. It took longer to
flesh out from my pencil notes than I thought.
While you are gone I will try to tackle all of these in a PR w/ one
commit per item.
I will at least get to some of them.
I will create issues for things I think need discussion.
Thanks,
Bill
--------------
Global:
* Eliminate all use of "frontend" and "backend", can use device side and
driver side when just device or driver is not enough
* Use device number (dev_num) instead of Device ID to avoid confusion with
device type (called Device ID in spec)
* Address out of band notification as an option, any polling that is
required should be done by the bus implementation, not the common layer.
* for all EVENT_ messages change "does not require a response" to "does not
have a response". There is no optional response; a response is forbidden.
4.4.2.1 Bus level feature bits
* are these local to the bus implementation or communicated to the
virtio-msg transport layer? (I vote local)
* what would be an example of a bus feature that would need to be known at
the common layer that is NOT a device feature?
4.4.2.1.1 max message size definition.
Does this include or exclude headers added by the bus implementation? I
suggest it excludes them.
4.4.2.1.1 max message size of 64K
* This seems a bit excessive. It may cause people to invent new uses for
the message channel that we do not intend.
* Should we lower the max message size?
* Should we add a note that says no use of message sizes above 288 bytes is
envisioned?
(288 = 256 bytes of config data plus 32 bytes for message overhead.
Right now message overhead is 18 for SET_CONFIG so 32 is plenty but 16 is
too small.)
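
The arithmetic behind the 288-byte figure can be checked with a small sketch.
The constant names and the helper are hypothetical; the numbers (256 bytes of
config data, 18 bytes of SET_CONFIG overhead, a 32-byte budget) come from the
note above.

```python
CONFIG_DATA_BYTES = 256      # largest config payload considered above
SET_CONFIG_OVERHEAD = 18     # current SET_CONFIG message overhead


def max_message_size(overhead_budget):
    """Return config data plus overhead budget, rejecting budgets that
    cannot hold the current SET_CONFIG overhead."""
    if overhead_budget < SET_CONFIG_OVERHEAD:
        raise ValueError("overhead budget smaller than SET_CONFIG overhead")
    return CONFIG_DATA_BYTES + overhead_budget


size_with_32 = max_message_size(32)
```

A budget of 16 raises ValueError here, which matches the remark that 16 is too
small.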
4.4.2.3 Config Generation Count
* (self note: Config gen count only needs to be incremented once per
*driver visible* change
A device can do a whole set of changes and MAY/SHOULD only increment the
gen count by one until the new gen count is published to the other side via
EVENT_CONFIG or GET/SET_CONFIG
However this is basically the same thing from the spec POV. )
* Should we say that Config gen count starts at 0 (SHOULD? MUST?)
* Does config gen count reset at device reset or not?
If not then maybe we don't say it starts at zero either and the driver
MUST query it
You do not need to know the config gen counter to GET_CONFIG, only to
SET_CONFIG
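
One possible reading of the generation count rules above, as a sketch. It
assumes the count starts at 0 and that a SET_CONFIG carrying a stale count is
refused; both are open questions in this list, not settled spec behaviour. The
class and method names are hypothetical.

```python
class DeviceConfig:
    """Toy model of a device-side config space with a generation count."""

    def __init__(self, size=8):
        self.data = bytearray(size)
        self.gen = 0               # assumption: count starts at 0
        self._published = True     # True once the driver has seen self.gen

    def device_change(self, offset, value):
        """Device-side change; bumps gen at most once until it is published."""
        self.data[offset] = value
        if self._published:
            self.gen += 1
            self._published = False

    def get_config(self):
        """GET_CONFIG needs no count from the driver; it publishes ours."""
        self._published = True
        return self.gen, bytes(self.data)

    def set_config(self, driver_gen, offset, value):
        """SET_CONFIG is refused if the driver's count is stale (assumed)."""
        self._published = True     # the response carries the current count
        if driver_gen != self.gen:
            return False, self.gen
        self.data[offset] = value
        return True, self.gen


dev = DeviceConfig()
dev.device_change(0, 1)
dev.device_change(1, 2)            # same unpublished batch, no second bump
gen_after_batch = dev.gen
```

The two device_change() calls form one driver-visible change, so the count
moves by one only.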
4.4.2.5.1 Error codes
* Are error codes aligned to admin queues yet?
4.4.2.6 Bus vs Transport messages
* Mention shared memory setup and out of band notification in bus message
overview
* Mention bus implementation specific messages in bus message overview
* Remove BUS_MSG_STATUS for now (or allow request/response and maybe add
EVENT_ version w/o response)
4.4.2.8 dev_id to dev_num or dev_number
4.4.3.3 Resetting the BUS
* We decided to drop this for now
* When we do it, both graceful and abortive reset should be handled
* For graceful device initiated reset, it should be handled by setting
state to NEEDS_RESET
4.4.4.2 Device Info
* add max number of virtqueues
(or define a null response for GET_VQ)
4.4.4.3.1 error if *index* is beyond
elsewhere it says pad with zeros
(why do we need this? just return zeros)
4.4.4.5
we need max number of virtqueue in DEVICE_INFO OR return maximum size
of 0 for all unsupported indexes on GET and ERROR on set
4.4.4.6 Mention Device status change w/o config change as another possible
reason to get
4.4.4.7
Actually virtqueues can come and go during the life of the device. The
STATUS can (should?) be set to DRIVER OK before any virtqueues are set.
4.4.5.1
* what are NOTIFY_ON_AVAIL and NEGOCIATE_DATA (sp)? These appear nowhere
else in the spec.
* all existing transports have a way to notify driver -> device per virtqueue
4.4.5.1.1
* virtqueue id is always present
* If VIRTIO_F_NOTIFICATION_DATA has not been negotiated then next_off and
next_wrap should be 0 in message
(Bertrand did not like using msg_length to exclude them).
* We might want to update the text for a better description or just delete
it and leave it to the VIRTIO_F_NOTIFICATION_DATA feature description.
* Should we support VIRTIO_F_NOTIF_CONFIG_DATA? If so then virtqueue_index
can be vq_notif_config_data instead
4.4.5.2 Device Notifications
* Maybe here would be a place to mention OoB notifications and polling?
* I don't think this is a MAY from the transport level.
* The Bus MUST give these messages to the transport level. It can get them
from messages on the bus, from OoB notifications, or generating them
periodically to stimulate polling.
4.4.5.2.1 EVENT_CONFIG
* sent on config OR STATUS change
* we talked about a new version of this message with no data to use for OoB
and polling stimulation
4.4.5.2.2 EVENT_USED
* should we define virtqueue index of -1 as "all virtqueues"?
* we could put the loop in the bus instead, this would handle the case of 2
virtqueues on MSIX #2 and 1 virtqueue on MSIX #3
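
A sketch of the "loop in the bus" alternative, where the bus expands one
out-of-band notification into one EVENT_USED per virtqueue instead of using an
index of -1. The function name and the vector-to-virtqueue table are
hypothetical.

```python
def events_for_vector(vector, vqs_by_vector):
    """Expand an out-of-band notification on 'vector' into per-virtqueue
    EVENT_USED messages for the transport layer."""
    return [("EVENT_USED", vq) for vq in vqs_by_vector.get(vector, [])]


# The case from the note above: 2 virtqueues on MSIX #2, 1 on MSIX #3.
vqs_by_vector = {2: [0, 1], 3: [2]}
events = events_for_vector(2, vqs_by_vector)
```

With this approach the transport layer never sees a wildcard index; the cost is
that the bus must know which virtqueues share a vector.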
4.4.5.3 Configuration Changes During Operation
* Feature re-negotiation is not allowed by the spec. drop this part of the text
From 2.2 Feature Bits:
"""
Each virtio device offers all the features it understands. During device
initialization, the driver reads this and
tells the device the subset that it accepts. The only way to renegotiate is
to reset the device.
"""
4.4.5.4 Device Reset and Shutdown
* reword last sentence, it is valid to not re-init the device if you're not
going to use it again
4.4.5.4.1 Device-Initiated Reset
* match the other transports here. The device sends an async status change
with DEVICE_NEEDS_RESET set and waits for driver to reset the device via a
SET_STATUS of 0.
4.4.6.1.1 Bigger gap between request response messages and EVENT_ messages
* Not a lot of space to add new request/response messages
* Renumber to 0x80, 0x81, 0x82
4.4.6.2 VIRTIO_MSG_ERROR
* no way to match an error to multiple outstanding messages of the same type
** Right now we assume we only have one outstanding request message per
device.
This will probably be true in Linux.
However a different OS could choose to do multiple SET_VQ in parallel
(for example).
There is no way to coordinate which SET_VQ a VIRTIO_MSG_ERROR pairs with.
** Solutions
1) add a transaction id at the bus level and do all matching based on that
2) add a transaction id at the transport common header
3) add another field to the error message that would differentiate
(index for VQ). This makes matching logic per msg_id
4) assume one message per device serialization
5) #4 UNLESS #1
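
Options 1 and 2 both come down to tagging each request with a transaction id
and matching responses or errors on that id. A minimal sketch, with a
hypothetical tracker class and message names used only as labels:

```python
import itertools


class RequestTracker:
    """Pairs responses (including errors) with outstanding requests by
    transaction id rather than by message type."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}

    def send(self, msg_id, payload):
        """Record an outgoing request and return its transaction id."""
        txn = next(self._next_id)
        self._pending[txn] = (msg_id, payload)
        return txn

    def complete(self, txn):
        """Return the request a response or error belongs to, or None if
        the id is unknown or already completed."""
        return self._pending.pop(txn, None)


tracker = RequestTracker()
txn_a = tracker.send("SET_VQ", {"index": 0})
txn_b = tracker.send("SET_VQ", {"index": 1})
failed = tracker.complete(txn_b)   # an error arrives carrying txn_b
```

Two SET_VQ requests are in flight at once, yet the error is attributed to the
second one without any per-message matching logic.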
4.4.6.3 VIRTIO_MSG_GET_DEVICE_INFO
* add max num virtqueues
* add number of Shared memory Regions (section 2.3) (do this in same patch
that adds GET_SHM message)
4.4.6.4 VIRTIO_MSG_GET_FEATURES
* Most transports have device features and driver features and allow both
to be read.
device features always reflect the capabilities of the device.
driver features are written with the features the driver wants to use
and read back the negotiated features
It is clear that the SET_FEATURES writes the driver desired features and
gets back the negotiated features
What does GET_FEATURES read? I presume it reads device features before
negotiation and negotiated features after negotiation
* Should we add a param to specify DEVICE or DRIVER features to the
GET_FEATURES message for parity with other transports?
4.4.6.4 & .5
* make it clear that a feature block is a set of 32 feature bits
* feature bits beyond the number of max feature bits always read as zero
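
The two points above can be sketched as follows: the feature set is treated as
one large integer, a block is a 32-bit slice of it, and anything past the
device's maximum feature bit reads as zero. The function name and parameters
are hypothetical.

```python
def get_feature_blocks(features, max_bits, index, count):
    """Return 'count' 32-bit feature blocks starting at block 'index'.

    Bits at or beyond 'max_bits' always read as zero, so asking for
    blocks past the end yields zeros instead of an error.
    """
    features &= (1 << max_bits) - 1
    return [(features >> (32 * (index + i))) & 0xFFFFFFFF
            for i in range(count)]


offered = (1 << 0) | (1 << 32) | (1 << 40)
blocks = get_feature_blocks(offered, max_bits=64, index=0, count=3)
```

The third block lies entirely beyond bit 63 and therefore reads as 0.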
4.4.6.9 SET_DEVICE_STATUS
* return the new status. It is valid for the device to refuse to set the
FEATURES_OK bit. (3.1.1 number 6) A set status might also cause a
DEVICE_NEEDS_RESET.
4.4.6.11 SET_VQUEUE
* why is the info echoed back, can it ever change?
4.4.6.12 RESET_VQUEUE
* Does this need to be negotiated?
* VIRTIO_F_RING_RESET is the feature bit
4.4.6.13 VIRTIO_MSG_EVENT_CONFIG
* see above about adding no data version of this
4.4.6.14 VIRTIO_MSG_EVENT_AVAIL
* Next offset and next wrap should be 0 if not negotiated
* should we add virtqueue index == -1 means all virtqueues (this would be
used on device side between bus and transport for OoB or polling stimulation)
4.4.6.15 VIRTIO_MSG_EVENT_USED
* should we add virtqueue index == -1 means all virtqueues
(this would be used on driver side between bus and transport for OoB or polling
stimulation and gives parity with virtio-mmio and base level of virtio-pci)
4.4.7.2 BUS_MSG_GET_DEVICES
* needs better description of bit packing
* offset is in bits, bytes, 32 bit words??
* number of device_nums requested, this is number of bits or number of
packed data bytes / words. Must be multiple of
* better description on how bits are packed into array
4.4.7.4 BUS_MSG_DEVICE_REMOVED
* how do we handle a graceful eject vs an abortive one?
* add indicator
A) already removed, deal with it
B) removing now, don't start new data transfers, I will finish existing
ones
C) removing soon, write any dirty data and then remove or reset device
yourself. May escalate to B or A if driver does not take action
D) ask to eject, can be refused
Add BUS_MSG_DEVICE_REMOVE as driver to device MSG. Driver side says it is
done with this device.
4.4.7.6 BUS_MSG_STATUS
* lots of problems with this being a no response message
* remove from spec for now
4.4.8.* Conformance
(Initial thoughts. I did not review this section rigorously.)
* Most of this is "do the right thing" without any specifics.
* Other transports spread the requirements into the various sections. We
should probably do the same.
One thing that needs to be specified is when to return ERROR and when to
return "benign data".
Ex: getting feature bits that don't exist, return zeros or return error
Ex: setting features bit that don't exist to zero; OK or error
Ex: setting feature bits that don't exist to 1; return negotiated 0 or
return error
The mmio transport will just return zeros for these cases. Why can't we?
* We need to think about notifications and events. I don't think they are
optional. They are not optional in virtio-mmio and virtio-pci.
4.4.8.2.1 General Transport Requirements
I don't see a precedent for common requirements. If no precedent is found,
repeat the requirements in device and driver sections.
4.4.8.6 Versioning and Forward Compatibility
"""
If a bus instance or device does not
support an advanced feature, it MUST reject or ignore those requests
cleanly using VIRTIO_MSG_ERROR,
rather than undefined behavior.
"""
Reject is via MSG_ERROR; that is clear. If we negotiate a protocol version,
do we expect to send unknown messages? The option to ignore messages we
don't understand seems wrong as we don't know if they are important.
--
Bill Mills
Principal Technical Consultant, Linaro
+1-240-643-0836
TZ: US Eastern
Work Schedule: Tues/Wed/Thur
Hi Everyone,
I pushed a major change to the pull request for the spec here:
https://github.com/Linaro/virtio-msg-spec/pull/16
I reworked the chapters and the content to look a lot more like other virtio transports and follow the sentence convention (MUST, SHALL, SHOULD).
Some sentences have been reworded with the help of ChatGPT so we need to review this carefully, but I think it looks a lot better than before.
Please review based on the final content and not using the tree of patches, as the last one is more or less rewriting everything.
Some things to think of:
- message format description: should we switch to structures instead of tables? I did that for the header, and Channel I/O is using structures, which might look better
- global terminology: I struggle a bit between driver/device vs frontend/backend and we must check for consistency
- compliance: this is the bare output of ChatGPT, more for discussion than for keeping, but it could give us ideas if we want to have something (Warning: the compliance chapter content might be wrong).
As always: any comments are welcome :-)
Cheers
Bertrand
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi all,
I was wondering if the virtio-msg demos we have for FFA work on Xen?
I have some vague memory that they do.
I wonder if I can use the same lower-level virtio-msg-bus or ffa calls to move
messages between a DomU/Linux and a QEMU backend in Dom0?
Can someone point me to the code? Or documentation?
Thanks!
Best regards,
Edgar
Hi Everyone,
As you know, we have a pending subject around version handling to solve
in order to have some kind of "stable" version of the protocol. I think we have
over-engineered this a bit so far, so I will try to explain my current train of
thought so that we can have a productive discussion on Thursday.
What are other transports doing and why?
- PCI: there is no real protocol version. Capabilities are used and there
is only one used for Legacy devices
- MMIO: One version register which can be 1 or 2, read-only from the driver,
which is used to inform a driver of the layout of the registers. There is no
negotiation
- Channel I/O: A revision, a length and some way to set additional options
depending on the revision. Here the driver tells the device which
revision of the protocol it wants to use and the device just says no if it
does not support what the driver wants. Once selected, any change must be
rejected by the device, and the driver must begin by setting it. Somehow
the virtio spec says that this is per device, as the only change was a status
message not existing, but it is clear that this must be handled at the transport
level and not in a specific driver.
What do we need?
I think we need something very close to what Channel I/O is defining and it
would be a good idea to reuse the same principles:
- the driver side sets the revision
- the device side just says "Error" if it does not support what is requested
- once the revision is set, only a reset allows it to be changed
- the driver side should start with the highest revision and go down until it
finds a revision the device side supports
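
These four principles can be sketched as follows. The class, the method names
and the boolean accept/reject result are hypothetical stand-ins for whatever
message ends up carrying the revision.

```python
class RevisionEndpoint:
    """Device side: accepts one revision and locks it until reset."""

    def __init__(self, supported):
        self.supported = set(supported)
        self.revision = None

    def set_revision(self, rev):
        """Return True on success, False for 'Error'."""
        if self.revision is not None:
            return False           # already set: only a reset can change it
        if rev not in self.supported:
            return False           # unsupported revision
        self.revision = rev
        return True

    def reset(self):
        self.revision = None


def negotiate(driver_revisions, device):
    """Driver side: try the highest revision first and walk down."""
    for rev in sorted(driver_revisions, reverse=True):
        if device.set_revision(rev):
            return rev
    return None


device = RevisionEndpoint(supported={1, 2})
chosen = negotiate([1, 2, 3], device)   # 3 is refused, 2 is accepted
```

After negotiation a further set_revision() call fails until reset() is called,
which mirrors the "only a reset" rule.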
Now I think that having this per device is not useful because we want to
have a generic transport, and having different sets of messages per
device due to different revisions seems like a complexity
we do not need.
So I think this is something that we should offload to the bus level and
define the following:
- A bus implementation must inform the virtio message generic
transport of the version of the protocol to be used for a specific bus
device instance (ie all devices on this instance will use the same
protocol version). This is to be done through an implementation
defined way.
- A bus implementation must inform the virtio message generic
transport of the maximum message size it supports through an
implementation defined way. This can also be per bus device instance.
- It is the bus driver's responsibility to negotiate a version and maximum
message size with a bus device instance.
- It is the bus device's responsibility to know which versions are supported
by its own virtio message generic implementation.
Following those principles, I would propose to make the following changes in
the specification:
- remove the VIRTIO_MSG_VERSION message
- introduce a BUS_MSG_VERSION message with more or less the same
definition as the current VIRTIO_MSG_VERSION (only simplifying by
saying that the driver sets a version and the device says yes or no; the size
would work as it does now)
Some questions for discussion:
- Do you think it is OK to move this to the bus, or should we keep it in
the generic layer to be more "coherent" with other transports?
- Should we provision something like the "data" part in Channel I/O to
have options on a specific revision?
- Anything else?
Cheers
Bertrand
Hi Everyone,
I pushed a new pull request to github on top of the current RFC status: https://github.com/Linaro/virtio-msg-spec/pull/13
This includes changes to handle some of the issues raised in github:
- have a version of the protocol
- have a variable message size
- have a configuration generation count
- give a hint that the bus implementation can be used for memory sharing or use of out of band notifications
Any comment is welcome :-)
Cheers
Bertrand
All,
The RFC went out a bit ago.
I updated it based on Bertrand's and Arnaud's suggestions (cover letter,
commit description, and s/request/requests/ only).
(I also fixed a s/massage/message/ in the cover letter.)
Going forward the new baseline for spec work will be
virtio-msg-rfc1. This branch is now the default at the linaro repo [1].
Please target all new pull requests to this branch.
The virtio-msg-alpha branch is now frozen but will be kept for reference.
Thanks,
Bill
[1] https://github.com/Linaro/virtio-msg-spec
All,
I have updated the hvac-demo repo [1] with a new version.
This version is tagged: v0.5
The following outlines the changes since Jan 27:
* The demos can now be run on an arm64 host
(I tested with an AWS EC2 m7g.2xlarge instance)
(x86_64 testing was done on my desktop and AWS EC2 m7i.2xlarge instances)
* Pre-setup container images (amd64 & arm64) are published at
docker.io/wmills/hvac-demo
* A convenience script ./container was added to make it easier to run a
container with a mounted directory
(This is an alternative to using the published container images and
suggested for building the demos)
* The ./container flow now requires the user to run ./setup themselves
(The pre-built container images already have ./setup done for both "run"
and "build")
* Fixes found from Dan's testing
* Building demo1
* Running demo3
* Running demo4
* Fixes found from Alex's testing (running on Debian-12 w/o container and
using tmux already)
* Handle tmux server already running
* Isolate from user's .tmux.conf and default server
* (These fixes are not needed to run in a container)
* I have tested with podman and docker
* Most of my testing was done on clean EC2 machines w/ Ubuntu 24.04, both
x86_64 and arm64
* I also used my x86_64 desktop w/ Ubuntu 22.04
* I tested running the docker.io/wmills/hvac-demo images and the
./container flow
* Updated README.MD with simplified instructions including above
Thanks,
Bill
[1] https://github.com/wmamills/hvac-demo.git
Hi,
I just tried to run demo1 from a docker setup.
I can run demo1 using docker but the demo does not work (some error 95 in the kernel transmitting FF-A messages; it scrolled too far so I could not copy and paste it).
Cheers
Bertrand