On Fri, Aug 18, 2023 at 10:45:56AM -0700, Yosry Ahmed wrote:
On Fri, Aug 18, 2023 at 10:35 AM Johannes Weiner hannes@cmpxchg.org wrote:
On Fri, Aug 18, 2023 at 07:56:37AM -0700, Yosry Ahmed wrote:
If this happens it seems possible for this to happen:
cpu #1 cpu#2 css_put() /* css_free_rwork_fn is queued */ rcu_read_lock() mem_cgroup_from_id() mem_cgroup_id_remove() /* access memcg */
I don't quite see how that'd possible. IDR uses rcu_assign_pointer() during deletion, which inserts the necessary barriering. My understanding is that this should always be safe:
rcu_read_lock() (writer serialization, in this case ref count == 0) foo = idr_find(x) idr_remove(x) if (foo) kfree_rcu(foo) LOAD(foo->bar) rcu_read_unlock()
How does a barrier inside IDR removal protect against the memcg being freed here though?
If css_put() is executed out-of-order before mem_cgroup_id_remove(), the memcg can be freed even before mem_cgroup_id_remove() is called, right?
css_put() can start earlier, but it's not allowed to reorder the rcu callback that frees past the rcu_assign_pointer() in idr_remove().
This is what RCU and its access primitives guarantees. It ensures that after "unpublishing" the pointer, all concurrent RCU-protected accesses to the object have finished, and the memory can be freed.