Hello Enze,
First of all, thank you for sharing this patch!
On Tue, 2 Dec 2025 10:14:07 +0800 Enze Li lienze@kylinos.cn wrote:
The current implementation only supports repeated calls to a single damon_call_control request per context.
I understand "repeated calls to a single damon_call_control" means "single damon_call_control object having ->repeat set as true". Let me call it "repeat mode damon_call_control object".
This is not an intentionally designed limitation but a bug. damon_call() allows callers adding multiple repeat mode damon_call_control objects per context. Technically, it adds any requested damon_call_control object to the per-context linked list, regardless of the number of repeat mode objects on the list. But, the consumer of the damon_call_control objects list, kdamond_call(), moves the repeat mode objects from the per-context list to a temporal list (repeat_controls), and then move only the first repeat mode entry from the temporal list to the per-context list.
If there were multiple repeat mode objects in the per-context list, what happens to the remaining repeat mode damon_call_control objects on the temporal list? Nothing. As a result, the memory for the objects are leaked. Definitely this is a bug.
Luckily there is no such multiple repeat mode damon_call() requests, so no upstream kernel user is exposed to the memory leak bug in real. But the bug is a bug. We should fix this.
This limitation introduces inefficiencies for scenarios that require registering multiple deferred operations.
I'm not very convinced with the above reasoning because 1. it is not a matter of inefficiency but a clear memory leak bug. 2. there is no damon_call() callers that want to have multiple deferred operations with high efficiency, at the moment. In my opinion, the above sentence is better to be just dropped.
This patch modifies the implementation of kdamond_call() to support repeated calls to multiple damon_call_control requests.
This change is rquired for fixing the bug, though.
To demonstrate the effect of this change, I made minor modifications to samples/damon/prcl.c by adding a new request alongside the original damon_call_control request and performed comparative tests.
Before applying the patch, I observed,
[ 381.661821] damon_sample_prcl: start [ 381.668199] damon_sample_prcl: repeat_call_v2 [ 381.668208] damon_sample_prcl: repeat_call [ 381.668211] damon_sample_prcl: wss: 0 [ 381.675194] damon_sample_prcl: repeat_call [ 381.675202] damon_sample_prcl: wss: 0
after applying the patch, I saw,
[ 61.750723] damon_sample_prcl: start [ 61.757104] damon_sample_prcl: repeat_call_v2 [ 61.757106] damon_sample_prcl: repeat_call [ 61.757107] damon_sample_prcl: wss: 0 [ 61.763067] damon_sample_prcl: repeat_call_v2 [ 61.763069] damon_sample_prcl: repeat_call [ 61.763070] damon_sample_prcl: wss: 0
Signed-off-by: Enze Li lienze@kylinos.cn
Assuming we agree on the fact this is a fix of the bug, I think we should add below tags.
Fixes: 43df7676e550 ("mm/damon/core: introduce repeat mode damon_call()") Cc: stable@vger.kernel.org # 6.17.x
mm/damon/core.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c index 109b050c795a..66b5bae44f22 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -2526,13 +2526,19 @@ static void kdamond_call(struct damon_ctx *ctx, bool cancel) list_add(&control->list, &repeat_controls); } }
- control = list_first_entry_or_null(&repeat_controls,
struct damon_call_control, list);- if (!control || cancel)
return;- mutex_lock(&ctx->call_controls_lock);
- list_add_tail(&control->list, &ctx->call_controls);
- mutex_unlock(&ctx->call_controls_lock);
- while (true) {
control = list_first_entry_or_null(&repeat_controls,struct damon_call_control, list);if (!control)break;/* Unlink from the repeate_controls list. */list_del(&control->list);if (cancel)continue;mutex_lock(&ctx->call_controls_lock);list_add(&control->list, &ctx->call_controls);mutex_unlock(&ctx->call_controls_lock);- }
This looks good enough to fix the bug.
Could you please resend this patch after rewording the commit message as appripriate for the bug fix, adding points I listed above?
Again, thank you for letting us find this bug.
Thanks, SJ
[...]
Hi SJ,
Thanks for your review and quick reply!
On Mon, Dec 01 2025 at 09:29:55 PM -0800, SeongJae Park wrote:
<...>
On Tue, 2 Dec 2025 10:14:07 +0800 Enze Li lienze@kylinos.cn wrote:
The current implementation only supports repeated calls to a single damon_call_control request per context.
I understand "repeated calls to a single damon_call_control" means "single damon_call_control object having ->repeat set as true". Let me call it "repeat mode damon_call_control object".
This is not an intentionally designed limitation but a bug. damon_call() allows callers adding multiple repeat mode damon_call_control objects per context. Technically, it adds any requested damon_call_control object to the per-context linked list, regardless of the number of repeat mode objects on the list. But, the consumer of the damon_call_control objects list, kdamond_call(), moves the repeat mode objects from the per-context list to a temporal list (repeat_controls), and then move only the first repeat mode entry from the temporal list to the per-context list.
If there were multiple repeat mode objects in the per-context list, what happens to the remaining repeat mode damon_call_control objects on the temporal list? Nothing. As a result, the memory for the objects are leaked. Definitely this is a bug.
Thank you for the detailed explanation -- it really clarified the design for me.
Luckily there is no such multiple repeat mode damon_call() requests, so no upstream kernel user is exposed to the memory leak bug in real. But the bug is a bug. We should fix this.
This limitation introduces inefficiencies for scenarios that require registering multiple deferred operations.
I'm not very convinced with the above reasoning because 1. it is not a matter of inefficiency but a clear memory leak bug. 2. there is no damon_call() callers that want to have multiple deferred operations with high efficiency, at the moment. In my opinion, the above sentence is better to be just dropped.
Agreed. I will rework the patch description for the next revision.
This patch modifies the implementation of kdamond_call() to support repeated calls to multiple damon_call_control requests.
This change is rquired for fixing the bug, though.
<...>
Assuming we agree on the fact this is a fix of the bug, I think we should add below tags.
Fixes: 43df7676e550 ("mm/damon/core: introduce repeat mode damon_call()") Cc: stable@vger.kernel.org # 6.17.x
mm/damon/core.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c index 109b050c795a..66b5bae44f22 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -2526,13 +2526,19 @@ static void kdamond_call(struct damon_ctx *ctx, bool cancel) list_add(&control->list, &repeat_controls); } }
- control = list_first_entry_or_null(&repeat_controls,
struct damon_call_control, list);- if (!control || cancel)
return;- mutex_lock(&ctx->call_controls_lock);
- list_add_tail(&control->list, &ctx->call_controls);
- mutex_unlock(&ctx->call_controls_lock);
- while (true) {
control = list_first_entry_or_null(&repeat_controls,struct damon_call_control, list);if (!control)break;/* Unlink from the repeate_controls list. */list_del(&control->list);if (cancel)continue;mutex_lock(&ctx->call_controls_lock);list_add(&control->list, &ctx->call_controls);mutex_unlock(&ctx->call_controls_lock);- }
This looks good enough to fix the bug.
Could you please resend this patch after rewording the commit message as appripriate for the bug fix, adding points I listed above?
Okay, I'll resend this patch shortly.
Best Regards, Enze
<...>
linux-stable-mirror@lists.linaro.org