On Fri, Dec 27, 2024 at 9:28 AM Suren Baghdasaryan surenb@google.com wrote:
On Thu, Dec 26, 2024 at 11:59 PM Andrew Morton akpm@linux-foundation.org wrote:
On Thu, 26 Dec 2024 16:56:00 -0800 Suren Baghdasaryan surenb@google.com wrote:
On Thu, Dec 26, 2024 at 4:23 PM Andrew Morton akpm@linux-foundation.org wrote:
On Thu, 26 Dec 2024 15:07:39 -0800 Suren Baghdasaryan surenb@google.com wrote:
On Thu, Dec 26, 2024 at 3:01 PM Andrew Morton akpm@linux-foundation.org wrote:
On Thu, 26 Dec 2024 13:16:39 -0800 Suren Baghdasaryan surenb@google.com wrote:
> When memory allocation profiling is disabled, there is no need to swap
> allocation tags during migration. Skip it to avoid unnecessary overhead.
>
> Fixes: e0a955bf7f61 ("mm/codetag: add pgalloc_tag_copy()")
> Signed-off-by: Suren Baghdasaryan surenb@google.com
> Cc: stable@vger.kernel.org
Are these changes worth backporting? Some indication of how much difference the patches make would help people understand why we're proposing a backport.
The first patch ("alloc_tag: avoid current->alloc_tag manipulations when profiling is disabled") I think is worth backporting. It eliminates about half of the regression for slab allocations when profiling is disabled.
um, what regression? The changelog makes no mention of this. Please send along a suitable Reported-by: and Closes: and a summary of the benefits so that people can actually see what this patch does, and why.
Sorry, I should have used "overhead" instead of "regression". When one sets CONFIG_MEM_ALLOC_PROFILING=y, the code gets instrumented, and even if profiling is turned off it still has a small performance cost, which is minimized by the mem_alloc_profiling_key static key. I found a couple of places that were not protected by mem_alloc_profiling_key, which means that code is still executed even when profiling is turned off. Once I added these checks, the overhead in the mode where memory profiling is compiled in but turned off went down by about 50%.
Well, a 50% reduction in a 0.0000000001% overhead ain't much.
I wish the overhead was that low :)
I ran more comprehensive testing on Pixel 6 on Big, Medium and Little cores:
          Overhead before fixes      Overhead after fixes
          slab alloc  page alloc     slab alloc  page alloc
Big       6.21%       5.32%          3.31%       4.93%
Medium    4.51%       5.05%          3.79%       4.39%
Little    7.62%       1.82%          6.68%       1.02%
Note, this is an allocation microbenchmark doing allocations in a tight loop. It is not a realistic scenario and is useful only for making performance comparisons.
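To relate the table to the "about 50%" figure mentioned earlier, the relative reduction in overhead can be computed per core type and allocation kind. The following is only an illustrative sketch: the numbers are copied from the table above, while the names `overhead`, `relative_reduction` and `reductions` are made up for this example and are not part of the patch or the benchmark.

```python
# Overhead figures copied from the table above, in percent:
# (before fixes, after fixes). Keys are (core type, allocation kind).
overhead = {
    ("Big", "slab"): (6.21, 3.31),
    ("Big", "page"): (5.32, 4.93),
    ("Medium", "slab"): (4.51, 3.79),
    ("Medium", "page"): (5.05, 4.39),
    ("Little", "slab"): (7.62, 6.68),
    ("Little", "page"): (1.82, 1.02),
}


def relative_reduction(before, after):
    """Return the fraction of the original overhead removed by the fixes."""
    return (before - after) / before


# Hypothetical helper mapping, introduced only for this illustration.
reductions = {
    key: relative_reduction(before, after)
    for key, (before, after) in overhead.items()
}

for (core, kind), fraction in reductions.items():
    print(f"{core:<6} {kind} alloc: {fraction:.1%} of the overhead removed")
```

By this arithmetic, the reduction is close to half only for slab allocations on Big cores and page allocations on Little cores; the other combinations fall between roughly 7% and 16%. These are ratios of microbenchmark overheads, not a measure of how much faster a real workload runs.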
But I added the final sentence to the changelog.
It still doesn't tell us the very simple thing which we're all eager to know: how much faster did the kernel get??