On Fri, 2025-01-17 at 08:54 +0100, Steffen Klassert wrote:
Hi Jianbo,
I talked with Sabrina and it looks we can't simply do this. Because both xfrm_add_sa_expire() and xfrm_timer_handler() calling __xfrm_state_delete() under spin lock. If we move the xfrm_dev_state_delete() out of __xfrm_state_delete(), all the places need to be handled correctly.
At the same time xfrm_timer_handler() calling xfrm_dev_state_update_stats before __xfrm_state_delete(). Should we also take care of it to make sure the state change and delete are called at the same time?
Hi Steffen, do you have any comments?
Can't you just fix this in bonding? xfrm_timer_handler() can't sleep anyway, even if you remove the spinlock, it is a timer function.
I am not sure this can be fixed in bonding given that the xdo_dev_state_delete op could, in the general case, sleep while talking to the hardware. I don't think it's reasonable to expect devices to offload xfrm while the kernel holds a spinlock. Bonding just exposed this assumption mismatch because of the mutex that was added to replace a spinlock which exhibited the same problem we are talking about here.
Do the dev offload operations need to be synchronous? Couldn't __xfrm_state_delete instead schedule a wq to do the dev offload? I saw there's already an xfrm_state_gc_task that's invoked to call xfrm_dev_state_free, perhaps that could be used to do the delete as well?
Cosmin.