On Wed, Jan 20, 2021 at 09:34:10AM -0800, Dave Hansen wrote:
On 1/20/21 6:43 AM, Jarkko Sakkinen wrote:
So why do you need the synchronize_srcu() call when this process sees an empty mm_list already?
Thx.
The other process, i.e. some process using the enclave, calls list_del_rcu() (and synchronize_srcu()), which starts a new grace period. If we don't call synchronize_srcu() here, then cleanup_srcu_struct() will race with that grace period.
To me, this is only a partial explanation.
The goal of synchronize_srcu() is to wait for the completion of a *previous* grace period: one that might have observed the old state of the list.
Could you explain the *actual* effects of the misplaced synchronize_srcu()? If the race _occurs_, what is the side-effect?
As I haven't been able to reproduce this regression myself, I need to take a step back and try to reproduce it with Graphene.
A WARN_ON() triggers inside cleanup_srcu_struct(), which causes a memory leak since free_percpu() never gets called. If I recall correctly, it was the srcu_readers_active() check, but unfortunately I don't have a log available.
Perhaps Haitao could provide us one.
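For illustration only, the failure mode described above (a warning in cleanup_srcu_struct() followed by free_percpu() never being reached) can be sketched as a userspace toy model. This is not kernel code: struct toy_srcu, toy_init() and toy_cleanup() are hypothetical names, malloc()/free() stand in for the per-CPU allocation, and the two fields only approximate what the real checks look at.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct srcu_struct. */
struct toy_srcu {
	int readers_active;  /* models what srcu_readers_active() would report */
	bool gp_in_progress; /* models a grace period that has not completed */
	void *sda;           /* models the data that free_percpu() would release */
};

/* Allocate the stand-in per-CPU data; returns false on allocation failure. */
static bool toy_init(struct toy_srcu *ssp)
{
	ssp->readers_active = 0;
	ssp->gp_in_progress = false;
	ssp->sda = malloc(64);
	return ssp->sda != NULL;
}

/*
 * Models the early-return pattern: if the structure still looks busy,
 * warn and return without freeing, i.e. leak on purpose.
 * Returns true if the data was freed, false if it was leaked.
 */
static bool toy_cleanup(struct toy_srcu *ssp)
{
	if (ssp->readers_active || ssp->gp_in_progress) {
		fprintf(stderr, "WARN: toy_srcu still in use, leaking sda\n");
		return false;
	}
	free(ssp->sda);
	ssp->sda = NULL;
	return true;
}
```

In this model, calling toy_cleanup() while readers_active is non-zero (or a grace period is still pending) leaves sda allocated, which corresponds to the leak. Once the structure is idle, the same call frees it.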
/Jarkko