On Thu, Apr 30, 2026 at 04:45:05PM +0000, Bertrand Marquis wrote:
Hi Everyone,
Quick follow-up on one blocking point we discussed.
Manos pointed me to the ccw transport code which does sleep: https://github.com/torvalds/linux/blob/e75a43c7cec459a07d91ed17de4de13ede2b7...
And I checked and tested with a big print at the place where I thought I could not sleep and had to use a shadow config. In fact I can sleep. I went back to my original code from when I started to introduce this, and the test was wrong at the time: I was checking whether I was in an interrupt context, not whether I could sleep, and my RCU problem might have been caused by other issues.
So we might not have to solve this problem at all as get/set config are sleepable !!!
I will do some extra investigations and checks inside the kernel code but this could make things way easier :-)
Thanks a lot Manos for the pointer.
Fantastic!
Thanks Manos!
Cheers, Edgar
Cheers Bertrand
On 30 Apr 2026, at 15:23, Edgar E. Iglesias <edgar.iglesias@amd.com> wrote:
On Thu, Apr 23, 2026 at 01:40:40PM +0000, Bertrand Marquis wrote:
Hi Everyone,
I have a first PoC showing virtio-msg working with a loopback system between the Linux kernel and QEMU.
The patches, build and run instructions can be found here:
https://github.com/bertrand-marquis/virtio-msg-spec/tree/linux-poc/v0/linux-...
This is at a very early stage, but it shows a fully functional version with rng and block validated. I used ChatGPT's help to fix issues and write part of the code (or big parts for QEMU), and this is far from upstreamable, so please do not share it.
I will share a v1 in the coming weeks with FF-A support, but I still have some timing and DMA issues to solve.
Any comment on this is more than welcome !!
Cheers Bertrand
Nice work Bertrand!
Most of the QEMU parts look good! And I think this code highlights some issues with the current implementation. Thanks.
The new RAMBlock is a good idea. It makes sense to me for these bridge-published DMA windows, since you want them to work through the normal DMA and address space paths in QEMU.
The thing I would watch carefully is lifetime tracking of host pointer use. Once a published range is exposed that way, every path that gets a direct host pointer really needs to give it back again. Otherwise DEL_REQ teardown can end up waiting on references that never drop.
That feels like the main thing worth double checking in this part of the code. Just thought I'd mention it.
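A minimal sketch of the get/put discipline described above, as a standalone model rather than QEMU code. The `dma_window` structure and its helpers are hypothetical names. In QEMU the real pairing would be something like `address_space_map()` / `address_space_unmap()`, which this sketch does not use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one published DMA window. */
struct dma_window {
    void *host_base;      /* start of the host mapping */
    size_t len;           /* window length in bytes */
    unsigned int users;   /* outstanding direct host-pointer references */
    bool dying;           /* teardown (e.g. DEL_REQ) has been requested */
};

/* Hand out a host pointer; refuse once teardown started or if out of range. */
static void *dma_window_get(struct dma_window *w, size_t off)
{
    if (w->dying || off >= w->len)
        return NULL;
    w->users++;
    return (char *)w->host_base + off;
}

/* Every successful get must be balanced by exactly one put. */
static void dma_window_put(struct dma_window *w)
{
    assert(w->users > 0);
    w->users--;
}

/* Request teardown; returns true only when no reference is outstanding. */
static bool dma_window_teardown(struct dma_window *w)
{
    w->dying = true;
    return w->users == 0;
}
```

If any path forgets the `dma_window_put()`, `dma_window_teardown()` never returns true, which is the "waiting on references that never drop" case.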
On the new virtio-msg kernel interface, I was hoping we would not need to expose the rings to user-space. I do agree that the kernel probably needs an internal ring, queue or something. I would just like to understand better why we need to expose that part too. I made a few changes to Viresh's code at some point to internally use the AMP queues while still maintaining the read/write/select/poll interface towards user-space. A little similar to a UDP socket.
E.g. Driver to Device:
- Driver puts msg-1 on internal queue.
- Driver puts msg-2 on internal queue before device reads msg-1.
- At some point device wakes up, reads msg-1 and msg-2 in a loop before read() would block.
Device to driver:
- write() msg-1
- Before driver reads msg-1, device write() msg-2.
- write() blocks when the internal queue is full.
That was the direction I was hoping to go with Viresh's code, with a suitable queue depth.
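The queue behaviour in the two examples above can be sketched as a fixed-depth FIFO. This is a single-threaded model with hypothetical names, and the message size and depth are assumed values. A real kernel implementation would also need locking and a wait queue, and `-EAGAIN` here marks the point where a blocking read() or write() would sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MSG_SIZE    40   /* assumed fixed message size */
#define QUEUE_DEPTH 4    /* assumed queue depth */

struct msg_queue {
    unsigned char slots[QUEUE_DEPTH][MSG_SIZE];
    size_t head;    /* index of the oldest queued message */
    size_t count;   /* number of queued messages */
};

/* Enqueue one message; a full queue is where write() would block. */
static int msg_queue_put(struct msg_queue *q, const void *msg)
{
    assert(q && msg);
    if (q->count == QUEUE_DEPTH)
        return -EAGAIN;
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    memcpy(q->slots[tail], msg, MSG_SIZE);
    q->count++;
    return 0;
}

/* Dequeue the oldest message; an empty queue is where read() would block. */
static int msg_queue_get(struct msg_queue *q, void *msg)
{
    assert(q && msg);
    if (q->count == 0)
        return -EAGAIN;
    memcpy(msg, q->slots[q->head], MSG_SIZE);
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return 0;
}
```

A reader that wakes up late simply calls `msg_queue_get()` in a loop until it returns `-EAGAIN`, which drains msg-1 and msg-2 in order.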
Another problem I found in the AMP code when using Viresh's virtio-msg is what happens when multiple contexts try to send/receive messages at the same time, e.g. interrupt context vs normal thread context. IIRC, I worked around it in the AMP code by having separate rx buffers. The problem was, for example, that the driver would send msg-1 and wait for a response to that message. While it waits, it would process incoming messages and look for a match to msg-1. During this wait, interrupt context sends msg-2 (in my case a notification), and it would overwrite the buffer holding msg-1, causing a hang.
I don't know what the best way to solve it is, but in the AMP code, multiple buffers solved the problem for us.
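One way to picture the multiple-buffer workaround is a buffer per calling context, so that a send from interrupt context cannot clobber a message a thread is still waiting on. This is only an illustrative model: the names, the two-context split and the message size are assumptions, not the actual AMP code.

```c
#include <assert.h>
#include <string.h>

#define MSG_SIZE 40   /* assumed fixed message size */

/* Hypothetical set of contexts that may transfer concurrently. */
enum xfer_ctx { CTX_THREAD, CTX_IRQ, CTX_COUNT };

struct xfer_state {
    unsigned char buf[CTX_COUNT][MSG_SIZE];   /* one buffer per context */
};

/* Return the buffer owned by the given context. */
static unsigned char *xfer_buf(struct xfer_state *s, enum xfer_ctx ctx)
{
    assert((int)ctx >= 0 && (int)ctx < CTX_COUNT);
    return s->buf[ctx];
}

/* Stage a message in the caller's own buffer only. */
static void xfer_prepare(struct xfer_state *s, enum xfer_ctx ctx,
                         const void *msg)
{
    memcpy(xfer_buf(s, ctx), msg, MSG_SIZE);
}
```

With a single shared buffer, the `CTX_IRQ` call would overwrite msg-1 while the thread is still matching responses against it. With per-context buffers the two never alias.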
It would be good to discuss this a bit more before going too far down a substantial redesign, and first see if the problems we hit there can be solved within Viresh's design.
A few nitpicks:
I've tried to call virtio-msg-bus implementations virtio-msg-bus-something. In this case, virtio-msg-bus-linux-bridge.c I guess.
include/hw/virtio/virtio-msg-prot.h has some general code to print messages and convert some parts to strings. It's missing stuff but it looks like you could reuse some of it or extend it for your needs.
A couple more things I think are worth a look (assisted by codex):
Commit 6899bd4d48 `hw/virtio: add virtio-msg linux bridge transport parent`
High: `virtio_msg_linux_bridge_transport_unrealize()` destroys the bridge before it unrealizes the `VirtIOMSGProxy`. The bridge unrealize path destroys the bridge DMA address space, but child virtio devices under the proxy can still hold and use that address space through the earlier latched transport caps / `vdev->dma_as`. That looks like a use-after-free risk.
Commit 466fbe293e `virtio-msg: latch transport capabilities during pre-plug`
Medium: this introduces a one-time cached transport capability snapshot on the proxy, but there is no invalidation path once `latched_caps_valid` becomes true. If the backend later tears down or recreates its DMA address space, `virtio_msg_get_dma_as()` may keep returning a stale pointer.
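One possible shape for an invalidation path is a generation counter on the backend that the cached snapshot is checked against. This is a standalone sketch with hypothetical names (`backend`, `proxy_caps` and the helpers are not QEMU types), and it is only one option among several. An explicit invalidate callback would work too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical backend that owns the DMA address space. */
struct backend {
    void *dma_as;              /* stand-in for the DMA address space */
    unsigned int generation;   /* bumped on every teardown or re-creation */
};

/* Hypothetical cached snapshot held by the proxy. */
struct proxy_caps {
    void *dma_as;
    unsigned int generation;   /* backend generation at snapshot time */
    bool valid;
};

/* Backend side: any change of the address space bumps the generation. */
static void backend_set_dma_as(struct backend *b, void *as)
{
    b->dma_as = as;
    b->generation++;
}

/* Proxy side: refresh the snapshot whenever it is missing or stale. */
static void *proxy_get_dma_as(struct proxy_caps *c, const struct backend *b)
{
    assert(c && b);
    if (!c->valid || c->generation != b->generation) {
        c->dma_as = b->dma_as;
        c->generation = b->generation;
        c->valid = true;
    }
    return c->dma_as;
}
```

This only fixes the lookup. Child devices that copied the pointer earlier (for example into `vdev->dma_as`) would still need to be updated or torn down first.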
Best regards, Edgar