Re: [PATCH v7 1/3] drm/buddy: Optimize free block management with RB tree

26 Sep 2025

On 23/09/2025 10:02, Arunpravin Paneer Selvam wrote:
...
Replace the freelist (O(n)) used for free block management with a
red-black tree, providing more efficient O(log n) search, insert,
and delete operations. This improves scalability and performance
when managing large numbers of free blocks per order (e.g., hundreds
or thousands).

In the VK-CTS memory stress subtest, the buddy manager merges
fragmented memory and inserts freed blocks into the freelist. Since
freelist insertion is O(n), this becomes a bottleneck as fragmentation
increases. Benchmarking shows list_insert_sorted() consumes ~52.69% CPU
with the freelist, compared to just 0.03% with the RB tree
(rbtree_insert.isra.0), despite performing the same sorted insert.

This also improves performance in heavily fragmented workloads,
such as games or graphics tests that stress memory.
As the buddy allocator evolves with new features such as clear-page
tracking, the resulting fragmentation and complexity have grown.
These RB-tree based design changes are introduced to address that
growth and ensure the allocator continues to perform efficiently
under fragmented conditions.

The RB tree implementation with separate clear/dirty trees provides:

- O(n log n) aggregate complexity for all operations instead of O(n^2)
- Elimination of soft lockups and system instability
- Improved code maintainability and clarity
- Better scalability for large memory systems
- Predictable performance under fragmentation

v3(Matthew):

- Remove RB_EMPTY_NODE check in force_merge function.
- Rename rb for loop macros to have less generic names and move to
  .c file.
- Make the rb node rb and link field as union.

v4(Jani Nikula):

- The kernel-doc comment should be "/**"
- Move all the rbtree macros to rbtree.h and add parens to ensure
  correct precedence.

v5:

- Remove the inline in a .c file (Jani Nikula).

v6(Peter Zijlstra):

- Add rb_add() function replacing the existing rbtree_insert() code.

v7:

- A full walk iteration in rbtree is slower than the list (Peter Zijlstra).
- The existing rbtree_postorder_for_each_entry_safe macro should be used
  in scenarios where traversal order is not a critical factor (Christian).

Cc: stable@vger.kernel.org
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
