On Tue, Mar 11, 2025 at 08:57:54AM -0700, Shakeel Butt wrote:
On Tue, Mar 11, 2025 at 11:30:32AM -0400, Johannes Weiner wrote:
On Mon, Mar 10, 2025 at 04:09:34PM -0700, Shakeel Butt wrote:
Currently on cpu hotplug teardown, only memcg stock is drained but we need to drain the obj stock as well otherwise we will miss the stats accumulated on the target cpu as well as the nr_bytes cached. The stats include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In addition we are leaking reference to struct obj_cgroup object.
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Shakeel Butt shakeel.butt@linux.dev Cc: stable@vger.kernel.org
Wow, that's old. Good catch.
mm/memcontrol.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 4de6acb9b8ec..59dcaf6a3519 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg) static int memcg_hotplug_cpu_dead(unsigned int cpu) { struct memcg_stock_pcp *stock;
- struct obj_cgroup *old;
- unsigned long flags;
stock = &per_cpu(memcg_stock, cpu);
- /* drain_obj_stock requires stock_lock */
- local_lock_irqsave(&memcg_stock.stock_lock, flags);
- old = drain_obj_stock(stock);
- local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
- drain_stock(stock);
- obj_cgroup_put(old);
It might be better to call drain_local_stock() directly instead. That would prevent a bug of this type to reoccur in the future.
The issue is drain_local_stock() works on the local cpu stock while here we are working on a remote cpu cpu which is dead (memcg_hotplug_cpu_dead is in PREPARE section of hotplug teardown which runs after the cpu is dead).
We can safely call drain_stock() on remote cpu stock here but drain_obj_stock() is a bit tricky as it can __refill_stock() to local cpu stock and can call __mod_objcg_mlstate to flush stats. Both of these requires irq disable for NON-RT kernels and thus I added the local_lock here.
Anyways I wanted a simple fix for the backports and in parallel I am working on cleaning up all the stock functions as I plan to add multi memcg support.
Really curious to see patches/more details here, I have some ideas here as well (nothing materialized yet though).
Thanks!