Hi Christian,
On 9/9/2025 9:55 PM, Christian König wrote:
On 09.09.25 16:05, Peter Zijlstra wrote:
On Tue, Sep 09, 2025 at 02:04:30PM +0200, Christian König wrote:
Hi Arun,
On 09.09.25 11:56, Arunpravin Paneer Selvam wrote: [SNIP]
+/**
- rbtree_for_each_entry_safe - iterate in-order over rb_root safe against removal
- @pos: the 'type *' to use as a loop cursor
- @n: another 'type *' to use as temporary storage
- @root: 'rb_root *' of the rbtree
- @member: the name of the rb_node field within 'type'
- */
+#define rbtree_for_each_entry_safe(pos, n, root, member) \
- for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
(n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL; \
(pos); \
(pos) = (n), \
(n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL)
As far as I know exactly that operation does not work on an R/B tree.
See the _safe() variants of the for_each_ macros are usually used to iterate over a container while being able to remove entries.
But because of the potential re-balance storing just the next entry is not sufficient for an R/B tree to do that as far as I know.
Please explain how exactly you want to use this macro.
Thanks for the pointer, yes, this will not work on RB tree. We need a reverse safe variant for use in the force_merge() function similar to the list_for_each_entry_safe_reverse() macro in list.h. The reason is that in force_merge(), we remove the block from the free tree before invoking drm_buddy_free(), which merges and frees buddy blocks to form a larger block.
So I don't much like these iterators; I've said so before. Either we should introduce a properly threaded rb-tree (where the NULL child pointers encode a linked list), or simply keep a list_head next to the rb_node and use that.
I agree, something is clearly fishy here.
The rb_{next,prev}() things are O(ln n), in the worst case they do a full traversal up the tree and a full traversal down the other branch.
Yeah from the logic that is exactly what is supposed to happen in the __force_merge() function.
The question is rather why does that function exists in the first place? The operation doesn't look logical to me.
For drm_buddy_reset_clear() and drm_buddy_fini() we should use rbtree_postorder_for_each_entry_safe() instead.
And during normal allocation __force_merge() should never be used.
In normal allocation, the force_merge() function is used when no free blocks of the requested order are available. In such cases, smaller blocks must be merged on demand to satisfy the allocation. Mainly, this does not involve traversing the entire tree to merge all blocks, but only merging as needed. For example, if the requested order is 6, and the minimum order is 5, drm_buddy_alloc_blocks() will first attempt to allocate an order-6 block. If none are available, it will try to allocate two order-5 blocks. If those are also unavailable, it will invoke force_merge() to merge lower order blocks (4, 3, 2, 1, 0) in order to coalesce into a higher-order block of order 5.
In drm_buddy_fini(), force_merge() is called to ensure all blocks are merged before tearing down the allocator. This guarantees that all mm->roots are freed and not held by the driver at shutdown. If any blocks remain allocated, drm_buddy_fini() will issue a warning.
In drm_buddy_reset_clear(), which is invoked at device suspend/resume, it is an ideal place to call force_merge(). This ensures that all possible blocks are merged before resetting the clear state, thereby reducing fragmentation and improving allocation efficiency after resume.
I tried using this rbtree_postorder_for_each_entry_safe() macro in the force_merge() and it works, but we also a need a reverse variant since in normal allocation we dont want to disturb the lower addresses.
Thanks, Arun.
That said; given 'next' will remain an existing node, only the 'pos' node gets removed, rb_next() will still work correctly, even in the face of rebalance.
Good to know!
Regards, Christian.