On Mon, Dec 15, 2025 at 03:11:22PM +0100, Stefano Garzarella wrote:
On Fri, Dec 12, 2025 at 07:26:15AM -0800, Bobby Eshleman wrote:
On Tue, Dec 02, 2025 at 02:01:04PM -0800, Bobby Eshleman wrote:
On Tue, Dec 02, 2025 at 09:47:19PM +0100, Paolo Abeni wrote:
On 12/2/25 6:56 PM, Bobby Eshleman wrote:
On Tue, Dec 02, 2025 at 11:18:14AM +0100, Paolo Abeni wrote:
On 11/27/25 8:47 AM, Bobby Eshleman wrote:
> @@ -674,6 +689,17 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
> 		goto out;
> 	}
>
> +	net = current->nsproxy->net_ns;
> +	vsock->net = get_net_track(net, &vsock->ns_tracker, GFP_KERNEL);
> +
> +	/* Store the mode of the namespace at the time of creation. If this
> +	 * namespace later changes from "global" to "local", we want this vsock
> +	 * to continue operating normally and not suddenly break. For that
> +	 * reason, we save the mode here and later use it when performing
> +	 * socket lookups with vsock_net_check_mode() (see vhost_vsock_get()).
> +	 */
> +	vsock->net_mode = vsock_net_mode(net);
I'm sorry for the very late feedback. I think that at the very least user-space needs a way to query if the given transport is in local or global mode, as AFAICS there is no way to tell that when socket creation races with a mode change.
Are you thinking something along the lines of sockopt?
I'd like to see a way for the user-space to query the socket 'namespace mode'.
sockopt could be an option; a possibly better one could be sock_diag. Or you could do both, dumping the info with a shared helper invoked by both code paths, like what TCP does.
Also I'm a bit uneasy with the model implemented here, as a 'local' socket may cross netns boundaries and connect to a 'local' socket in another netns (if I read patch 2/12 correctly). That in turn, AFAICS, breaks the netns isolation.
Local mode sockets are unable to communicate with sockets in other namespaces, whether those are in local or global mode. The key piece of code for that is vsock_net_check_mode(), where if either mode is local, the namespaces must be the same.
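For readers following along, the rule described above can be sketched as a small user-space model. This is not the code from the series: the enum, the struct, and the name model_net_check_mode() are hypothetical stand-ins, and only the rule itself ("if either mode is local, the namespaces must be the same") comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
enum model_ns_mode { MODEL_NS_GLOBAL, MODEL_NS_LOCAL };

struct model_netns {
	int id;
};

/*
 * Model of the check described in the thread: if either side is in
 * local mode, both sides must be in the same namespace. If both are
 * global, the namespace is not considered.
 */
bool model_net_check_mode(const struct model_netns *ns0,
			  enum model_ns_mode mode0,
			  const struct model_netns *ns1,
			  enum model_ns_mode mode1)
{
	assert(ns0 != NULL && ns1 != NULL);

	if (mode0 == MODEL_NS_LOCAL || mode1 == MODEL_NS_LOCAL)
		return ns0 == ns1;

	return true;
}
```

Under this model, a local socket in one namespace cannot reach a local or global socket in a different namespace, while two global sockets can reach each other regardless of namespace.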
Sorry, I likely misread the large comment in patch 2:
https://lore.kernel.org/netdev/20251126-vsock-vmtest-v12-2-257ee21cd5de@meta...
Have you considered instead a slightly different model, where the local/global mode is set in stone at netns creation time - like what /proc/sys/net/ipv4/tcp_child_ehash_entries does [1] - and inter-netns connectivity is explicitly granted by the admin (I guess you will need new transport operations for that)?
/P
[1] TCP allows using per-netns established socket lookup tables, as opposed to the default global lookup table (even if the match always takes the netns into account, obviously). The mentioned sysctl specifies such configuration for the children namespaces, if any.
I'll save this discussion if the above doesn't resolve your concerns.
I still have some concerns WRT the dynamic mode change after netns creation. I fear some 'unsolvable' (or very hard to solve) race I can't see now. A tcp_child_ehash_entries-like model would avoid the issue completely, but I understand it would be a significant change from the current status.
"Luckily" the merge window is on us and we have some time to discuss. Do you have a specific use-case for the ability to change the netns mode after creation?
/P
I don't think there is a hard requirement that the mode be change-able after creation. Though I'd love to avoid such a big change... or at least leave unchanged as much of what we've already reviewed as possible.
In the scheme of defining the mode at creation and following the tcp_child_ehash_entries-ish model, what I'm imagining is:
- /proc/sys/net/vsock/child_ns_mode can be set to "local" or "global"
- /proc/sys/net/vsock/child_ns_mode is not immutable, can change any number of times
- when a netns is created, the new netns mode is inherited from child_ns_mode, being assigned using something like:

      net->vsock.ns_mode = get_net_ns_by_pid(current->pid)->child_ns_mode

- /proc/sys/net/vsock/ns_mode queries the current mode, returning "local" or "global", returning value of net->vsock.ns_mode
- /proc/sys/net/vsock/ns_mode and net->vsock.ns_mode are immutable and reject writes
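The scheme above can be sketched as a small user-space model to make the intended semantics concrete. Everything here is hypothetical (the struct, the function names, the boolean return for a rejected write); it is not kernel code. One detail the list does not specify is the initial child_ns_mode of a newly created namespace; the sketch assumes it starts equal to the new namespace's own ns_mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, for illustration only. */
enum model_ns_mode { MODEL_NS_GLOBAL, MODEL_NS_LOCAL };

struct model_netns {
	enum model_ns_mode ns_mode;       /* fixed at creation */
	enum model_ns_mode child_ns_mode; /* may change any number of times */
};

/* Models a write to child_ns_mode: always accepted. */
void model_set_child_ns_mode(struct model_netns *ns, enum model_ns_mode mode)
{
	assert(ns != NULL);
	ns->child_ns_mode = mode;
}

/* Models a write to ns_mode: always rejected, the mode is left untouched. */
bool model_write_ns_mode(struct model_netns *ns, enum model_ns_mode requested)
{
	(void)ns;
	(void)requested;
	return false;
}

/* Models netns creation: the new mode is taken from the creator's child_ns_mode. */
struct model_netns model_netns_create(const struct model_netns *creator)
{
	assert(creator != NULL);

	struct model_netns child = {
		.ns_mode = creator->child_ns_mode,
		/* Assumption: not specified in the list above. */
		.child_ns_mode = creator->child_ns_mode,
	};
	return child;
}
```

In this model, flipping the creator's child_ns_mode after a child has been created does not affect that child; it only affects namespaces created afterwards.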
Does that align with what you have in mind?
Hey Paolo, I just wanted to sync up on this one. Does the above align with what you envision?
Hi Bobby, AFAIK Paolo was at LPC, so there could be some delay.
FYI I'll be off from Dec 25 to Jan 6, so if we want to do an RFC in the middle, I'll do my best to take a look before my time off.
Thanks, Stefano
Sounds like a plan, thanks!
Best, Bobby