- Linux-stable-mirror - lists.linaro.org

[PATCH v6 1/3] drm/buddy: Optimize free block management with RB tree

by Arunpravin Paneer Selvam

Replace the freelist (O(n)) used for free block management with a
red-black tree, providing more efficient O(log n) search, insert,
and delete operations. This improves scalability and performance
when managing large numbers of free blocks per order (e.g., hundreds
or thousands).

In the VK-CTS memory stress subtest, the buddy manager merges
fragmented memory and inserts freed blocks into the freelist. Since
freelist insertion is O(n), this becomes a bottleneck as fragmentation
increases. Benchmarking shows list_insert_sorted() consumes ~52.69% CPU
with the freelist, compared to just 0.03% with the RB tree
(rbtree_insert.isra.0), despite performing the same sorted insert.

This also improves performance in heavily fragmented workloads,
such as games or graphics tests that stress memory.

As the buddy allocator evolves with new features such as clear-page
tracking, the resulting fragmentation and complexity have grown.
These RB-tree based design changes are introduced to address that
growth and ensure the allocator continues to perform efficiently
under fragmented conditions.

The RB tree implementation with separate clear/dirty trees provides:
- O(n log n) aggregate complexity for all operations instead of O(n^2)
- Elimination of soft lockups and system instability
- Improved code maintainability and clarity
- Better scalability for large memory systems
- Predictable performance under fragmentation

v3(Matthew):
  - Remove RB_EMPTY_NODE check in force_merge function.
  - Rename rb for loop macros to have less generic names and move to
    .c file.
  - Make the rb node rb and link field as union.

v4(Jani Nikula):
  - The kernel-doc comment should be "/**"
  - Move all the rbtree macros to rbtree.h and add parens to ensure
    correct precedence.

v5:
  - Remove the inline in a .c file (Jani Nikula).

v6(Peter Zijlstra):
  - Add rb_add() function replacing the existing rbtree_insert() code.

Cc: stable(a)vger.kernel.org
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam(a)amd.com>
---
 drivers/gpu/drm/drm_buddy.c | 144 ++++++++++++++++++++++--------------
 include/drm/drm_buddy.h     |  11 ++-
 include/linux/rbtree.h      |  56 ++++++++++++++
 3 files changed, 153 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index a94061f373de..89f4b49ae3fb 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -14,6 +14,8 @@

 static struct kmem_cache *slab_blocks;

+#define rbtree_get_free_block(node) rb_entry((node), struct drm_buddy_block, rb)
+
 static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm,
 					       struct drm_buddy_block *parent,
 					       unsigned int order,
@@ -31,6 +33,8 @@ static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm,
 	block->header |= order;
 	block->parent = parent;

+	RB_CLEAR_NODE(&block->rb);
+
 	BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
 	return block;
 }
@@ -41,23 +45,49 @@ static void drm_block_free(struct drm_buddy *mm,
 	kmem_cache_free(slab_blocks, block);
 }

-static void list_insert_sorted(struct drm_buddy *mm,
-			       struct drm_buddy_block *block)
+static bool drm_buddy_block_offset_less(const struct drm_buddy_block *block,
+					const struct drm_buddy_block *node)
 {
-	struct drm_buddy_block *node;
-	struct list_head *head;
+	return drm_buddy_block_offset(block) < drm_buddy_block_offset(node);
+}

-	head = &mm->free_list[drm_buddy_block_order(block)];
-	if (list_empty(head)) {
-		list_add(&block->link, head);
-		return;
-	}
+static bool rbtree_block_offset_less(struct rb_node *block,
+				     const struct rb_node *node)
+{
+	return drm_buddy_block_offset_less(rbtree_get_free_block(block),
+					   rbtree_get_free_block(node));
+}

-	list_for_each_entry(node, head, link)
-		if (drm_buddy_block_offset(block) < drm_buddy_block_offset(node))
-			break;
+static void rbtree_insert(struct drm_buddy *mm,
+			  struct drm_buddy_block *block)
+{
+	rb_add(&block->rb,
+	       &mm->free_tree[drm_buddy_block_order(block)],
+	       rbtree_block_offset_less);
+}
+
+static void rbtree_remove(struct drm_buddy *mm,
+			  struct drm_buddy_block *block)
+{
+	struct rb_root *root;
+
+	root = &mm->free_tree[drm_buddy_block_order(block)];
+	rb_erase(&block->rb, root);

-	__list_add(&block->link, node->link.prev, &node->link);
+	RB_CLEAR_NODE(&block->rb);
+}
+
+static struct drm_buddy_block *
+rbtree_last_entry(struct drm_buddy *mm, unsigned int order)
+{
+	struct rb_node *node = rb_last(&mm->free_tree[order]);
+
+	return node ? rb_entry(node, struct drm_buddy_block, rb) : NULL;
+}
+
+static bool rbtree_is_empty(struct drm_buddy *mm, unsigned int order)
+{
+	return RB_EMPTY_ROOT(&mm->free_tree[order]);
 }

 static void clear_reset(struct drm_buddy_block *block)
@@ -70,12 +100,13 @@ static void mark_cleared(struct drm_buddy_block *block)
 	block->header |= DRM_BUDDY_HEADER_CLEAR;
 }

-static void mark_allocated(struct drm_buddy_block *block)
+static void mark_allocated(struct drm_buddy *mm,
+			   struct drm_buddy_block *block)
 {
 	block->header &= ~DRM_BUDDY_HEADER_STATE;
 	block->header |= DRM_BUDDY_ALLOCATED;

-	list_del(&block->link);
+	rbtree_remove(mm, block);
 }

 static void mark_free(struct drm_buddy *mm,
@@ -84,15 +115,16 @@ static void mark_free(struct drm_buddy *mm,
 	block->header &= ~DRM_BUDDY_HEADER_STATE;
 	block->header |= DRM_BUDDY_FREE;

-	list_insert_sorted(mm, block);
+	rbtree_insert(mm, block);
 }

-static void mark_split(struct drm_buddy_block *block)
+static void mark_split(struct drm_buddy *mm,
+		       struct drm_buddy_block *block)
 {
 	block->header &= ~DRM_BUDDY_HEADER_STATE;
 	block->header |= DRM_BUDDY_SPLIT;

-	list_del(&block->link);
+	rbtree_remove(mm, block);
 }

 static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
@@ -148,7 +180,7 @@ static unsigned int __drm_buddy_free(struct drm_buddy *mm,
 				mark_cleared(parent);
 		}

-		list_del(&buddy->link);
+		rbtree_remove(mm, buddy);
 		if (force_merge && drm_buddy_block_is_clear(buddy))
 			mm->clear_avail -= drm_buddy_block_size(mm, buddy);

@@ -179,9 +211,11 @@ static int __force_merge(struct drm_buddy *mm,
 		return -EINVAL;

 	for (i = min_order - 1; i >= 0; i--) {
-		struct drm_buddy_block *block, *prev;
+		struct drm_buddy_block *block, *prev_block, *first_block;
+
+		first_block = rb_entry(rb_first(&mm->free_tree[i]), struct drm_buddy_block, rb);

-		list_for_each_entry_safe_reverse(block, prev, &mm->free_list[i], link) {
+		rbtree_reverse_for_each_entry_safe(block, prev_block, &mm->free_tree[i], rb) {
 			struct drm_buddy_block *buddy;
 			u64 block_start, block_end;

@@ -206,10 +240,14 @@ static int __force_merge(struct drm_buddy *mm,
 			 * block in the next iteration as we would free the
 			 * buddy block as part of the free function.
 			 */
-			if (prev == buddy)
-				prev = list_prev_entry(prev, link);
+			if (prev_block && prev_block == buddy) {
+				if (prev_block != first_block)
+					prev_block = rb_entry(rb_prev(&prev_block->rb),
+							      struct drm_buddy_block,
+							      rb);
+			}

-			list_del(&block->link);
+			rbtree_remove(mm, block);
 			if (drm_buddy_block_is_clear(block))
 				mm->clear_avail -= drm_buddy_block_size(mm, block);

@@ -258,14 +296,14 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)

 	BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);

-	mm->free_list = kmalloc_array(mm->max_order + 1,
-				      sizeof(struct list_head),
+	mm->free_tree = kmalloc_array(mm->max_order + 1,
+				      sizeof(struct rb_root),
 				      GFP_KERNEL);
-	if (!mm->free_list)
+	if (!mm->free_tree)
 		return -ENOMEM;

 	for (i = 0; i <= mm->max_order; ++i)
-		INIT_LIST_HEAD(&mm->free_list[i]);
+		mm->free_tree[i] = RB_ROOT;

 	mm->n_roots = hweight64(size);

@@ -273,7 +311,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
 				  sizeof(struct drm_buddy_block *),
 				  GFP_KERNEL);
 	if (!mm->roots)
-		goto out_free_list;
+		goto out_free_tree;

 	offset = 0;
 	i = 0;
@@ -312,8 +350,8 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
 	while (i--)
 		drm_block_free(mm, mm->roots[i]);
 	kfree(mm->roots);
-out_free_list:
-	kfree(mm->free_list);
+out_free_tree:
+	kfree(mm->free_tree);
 	return -ENOMEM;
 }
 EXPORT_SYMBOL(drm_buddy_init);
@@ -323,7 +361,7 @@ EXPORT_SYMBOL(drm_buddy_init);
  *
  * @mm: DRM buddy manager to free
  *
- * Cleanup memory manager resources and the freelist
+ * Cleanup memory manager resources and the freetree
  */
 void drm_buddy_fini(struct drm_buddy *mm)
 {
@@ -350,7 +388,7 @@ void drm_buddy_fini(struct drm_buddy *mm)
 	WARN_ON(mm->avail != mm->size);

 	kfree(mm->roots);
-	kfree(mm->free_list);
+	kfree(mm->free_tree);
 }
 EXPORT_SYMBOL(drm_buddy_fini);

@@ -383,7 +421,7 @@ static int split_block(struct drm_buddy *mm,
 		clear_reset(block);
 	}

-	mark_split(block);
+	mark_split(mm, block);

 	return 0;
 }
@@ -412,7 +450,7 @@ EXPORT_SYMBOL(drm_get_buddy);
  * @is_clear: blocks clear state
  *
  * Reset the clear state based on @is_clear value for each block
- * in the freelist.
+ * in the freetree.
  */
 void drm_buddy_reset_clear(struct drm_buddy *mm, bool is_clear)
 {
@@ -433,7 +471,7 @@ void drm_buddy_reset_clear(struct drm_buddy *mm, bool is_clear)
 	for (i = 0; i <= mm->max_order; ++i) {
 		struct drm_buddy_block *block;

-		list_for_each_entry_reverse(block, &mm->free_list[i], link) {
+		rbtree_reverse_for_each_entry(block, &mm->free_tree[i], rb) {
 			if (is_clear != drm_buddy_block_is_clear(block)) {
 				if (is_clear) {
 					mark_cleared(block);
@@ -641,7 +679,7 @@ get_maxblock(struct drm_buddy *mm, unsigned int order,
 	for (i = order; i <= mm->max_order; ++i) {
 		struct drm_buddy_block *tmp_block;

-		list_for_each_entry_reverse(tmp_block, &mm->free_list[i], link) {
+		rbtree_reverse_for_each_entry(tmp_block, &mm->free_tree[i], rb) {
 			if (block_incompatible(tmp_block, flags))
 				continue;

@@ -667,7 +705,7 @@ get_maxblock(struct drm_buddy *mm, unsigned int order,
 }

 static struct drm_buddy_block *
-alloc_from_freelist(struct drm_buddy *mm,
+alloc_from_freetree(struct drm_buddy *mm,
 		    unsigned int order,
 		    unsigned long flags)
 {
@@ -684,7 +722,7 @@ alloc_from_freelist(struct drm_buddy *mm,
 		for (tmp = order; tmp <= mm->max_order; ++tmp) {
 			struct drm_buddy_block *tmp_block;

-			list_for_each_entry_reverse(tmp_block, &mm->free_list[tmp], link) {
+			rbtree_reverse_for_each_entry(tmp_block, &mm->free_tree[tmp], rb) {
 				if (block_incompatible(tmp_block, flags))
 					continue;

@@ -700,10 +738,8 @@ alloc_from_freelist(struct drm_buddy *mm,
 	if (!block) {
 		/* Fallback method */
 		for (tmp = order; tmp <= mm->max_order; ++tmp) {
-			if (!list_empty(&mm->free_list[tmp])) {
-				block = list_last_entry(&mm->free_list[tmp],
-							struct drm_buddy_block,
-							link);
+			if (!rbtree_is_empty(mm, tmp)) {
+				block = rbtree_last_entry(mm, tmp);
 				if (block)
 					break;
 			}
@@ -771,7 +807,7 @@ static int __alloc_range(struct drm_buddy *mm,

 		if (contains(start, end, block_start, block_end)) {
 			if (drm_buddy_block_is_free(block)) {
-				mark_allocated(block);
+				mark_allocated(mm, block);
 				total_allocated += drm_buddy_block_size(mm, block);
 				mm->avail -= drm_buddy_block_size(mm, block);
 				if (drm_buddy_block_is_clear(block))
@@ -849,7 +885,6 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
 {
 	u64 rhs_offset, lhs_offset, lhs_size, filled;
 	struct drm_buddy_block *block;
-	struct list_head *list;
 	LIST_HEAD(blocks_lhs);
 	unsigned long pages;
 	unsigned int order;
@@ -862,11 +897,10 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
 	if (order == 0)
 		return -ENOSPC;

-	list = &mm->free_list[order];
-	if (list_empty(list))
+	if (rbtree_is_empty(mm, order))
 		return -ENOSPC;

-	list_for_each_entry_reverse(block, list, link) {
+	rbtree_reverse_for_each_entry(block, &mm->free_tree[order], rb) {
 		/* Allocate blocks traversing RHS */
 		rhs_offset = drm_buddy_block_offset(block);
 		err = __drm_buddy_alloc_range(mm, rhs_offset, size,
@@ -976,7 +1010,7 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 	list_add(&block->tmp_link, &dfs);
 	err = __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL);
 	if (err) {
-		mark_allocated(block);
+		mark_allocated(mm, block);
 		mm->avail -= drm_buddy_block_size(mm, block);
 		if (drm_buddy_block_is_clear(block))
 			mm->clear_avail -= drm_buddy_block_size(mm, block);
@@ -999,8 +1033,8 @@ __drm_buddy_alloc_blocks(struct drm_buddy *mm,
 		return __drm_buddy_alloc_range_bias(mm, start, end,
 						    order, flags);
 	else
-		/* Allocate from freelist */
-		return alloc_from_freelist(mm, order, flags);
+		/* Allocate from freetree */
+		return alloc_from_freetree(mm, order, flags);
 }

 /**
@@ -1017,8 +1051,8 @@ __drm_buddy_alloc_blocks(struct drm_buddy *mm,
  * alloc_range_bias() called on range limitations, which traverses
  * the tree and returns the desired block.
  *
- * alloc_from_freelist() called when *no* range restrictions
- * are enforced, which picks the block from the freelist.
+ * alloc_from_freetree() called when *no* range restrictions
+ * are enforced, which picks the block from the freetree.
  *
  * Returns:
  * 0 on success, error code on failure.
@@ -1120,7 +1154,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 			}
 		} while (1);

-		mark_allocated(block);
+		mark_allocated(mm, block);
 		mm->avail -= drm_buddy_block_size(mm, block);
 		if (drm_buddy_block_is_clear(block))
 			mm->clear_avail -= drm_buddy_block_size(mm, block);
@@ -1204,7 +1238,7 @@ void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p)
 		struct drm_buddy_block *block;
 		u64 count = 0, free;

-		list_for_each_entry(block, &mm->free_list[order], link) {
+		rbtree_for_each_entry(block, &mm->free_tree[order], rb) {
 			BUG_ON(!drm_buddy_block_is_free(block));
 			count++;
 		}
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 513837632b7d..9ee105d4309f 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -10,6 +10,7 @@
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/rbtree.h>

 #include <drm/drm_print.h>

@@ -53,7 +54,11 @@ struct drm_buddy_block {
 	 * a list, if so desired. As soon as the block is freed with
 	 * drm_buddy_free* ownership is given back to the mm.
 	 */
-	struct list_head link;
+	union {
+		struct rb_node rb;
+		struct list_head link;
+	};
+
 	struct list_head tmp_link;
 };

@@ -68,7 +73,7 @@ struct drm_buddy_block {
  */
 struct drm_buddy {
 	/* Maintain a free list for each order. */
-	struct list_head *free_list;
+	struct rb_root *free_tree;

 	/*
 	 * Maintain explicit binary tree(s) to track the allocation of the
@@ -94,7 +99,7 @@ struct drm_buddy {
 };

 static inline u64
-drm_buddy_block_offset(struct drm_buddy_block *block)
+drm_buddy_block_offset(const struct drm_buddy_block *block)
 {
 	return block->header & DRM_BUDDY_HEADER_OFFSET;
 }
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 8d2ba3749866..17190bb4837c 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -79,6 +79,62 @@ static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent
 	   ____ptr ? rb_entry(____ptr, type, member) : NULL; \
 	})

+/**
+ * rbtree_for_each_entry - iterate in-order over rb_root of given type
+ *
+ * @pos: the 'type *' to use as a loop cursor.
+ * @root: 'rb_root *' of the rbtree.
+ * @member: the name of the rb_node field within 'type'.
+ */
+#define rbtree_for_each_entry(pos, root, member) \
+	for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member); \
+	     (pos); \
+	     (pos) = rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member))
+
+/**
+ * rbtree_reverse_for_each_entry - iterate in reverse in-order over rb_root
+ * of given type
+ *
+ * @pos: the 'type *' to use as a loop cursor.
+ * @root: 'rb_root *' of the rbtree.
+ * @member: the name of the rb_node field within 'type'.
+ */
+#define rbtree_reverse_for_each_entry(pos, root, member) \
+	for ((pos) = rb_entry_safe(rb_last(root), typeof(*(pos)), member); \
+	     (pos); \
+	     (pos) = rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member))
+
+/**
+ * rbtree_for_each_entry_safe - iterate in-order over rb_root safe against removal
+ *
+ * @pos: the 'type *' to use as a loop cursor
+ * @n: another 'type *' to use as temporary storage
+ * @root: 'rb_root *' of the rbtree
+ * @member: the name of the rb_node field within 'type'
+ */
+#define rbtree_for_each_entry_safe(pos, n, root, member) \
+	for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
+	     (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL; \
+	     (pos); \
+	     (pos) = (n), \
+	     (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL)
+
+/**
+ * rbtree_reverse_for_each_entry_safe - iterate in reverse in-order over rb_root
+ * safe against removal
+ *
+ * @pos: the struct type * to use as a loop cursor.
+ * @n: another struct type * to use as temporary storage.
+ * @root: pointer to struct rb_root to iterate.
+ * @member: name of the rb_node field within the struct.
+ */
+#define rbtree_reverse_for_each_entry_safe(pos, n, root, member) \
+	for ((pos) = rb_entry_safe(rb_last(root), typeof(*(pos)), member), \
+	     (n) = (pos) ? rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member) : NULL; \
+	     (pos); \
+	     (pos) = (n), \
+	     (n) = (pos) ? rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member) : NULL)
+
 /**
  * rbtree_postorder_for_each_entry_safe - iterate in post-order over rb_root of
  * given type allowing the backing memory of @pos to be invalidated

base-commit: 7156602d56e5ad689ae11e03680ab6326238b5e3
-- 
2.34.1
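
The insertion path in this patch delegates ordering to a less() callback and reads the highest-offset free block with rb_last(). The userspace sketch below mirrors that shape with a plain, unbalanced binary search tree. It is not the kernel rbtree: rebalancing is omitted and every demo_* name is hypothetical, so it only illustrates the comparator-driven insert and the reverse in-order walk that the rbtree_reverse_for_each_entry() macro performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a free block; not the kernel struct. */
struct demo_block {
	unsigned long long offset;
	struct demo_block *parent, *left, *right;
};

struct demo_root {
	struct demo_block *node;
};

/* Same idea as the patch's comparator: order blocks by offset. */
static bool demo_offset_less(const struct demo_block *a,
			     const struct demo_block *b)
{
	return a->offset < b->offset;
}

/*
 * Ordered insert driven by a less() callback, in the style of rb_add().
 * Rebalancing is deliberately left out, so this is a plain BST.
 */
static void demo_add(struct demo_block *blk, struct demo_root *root,
		     bool (*less)(const struct demo_block *,
				  const struct demo_block *))
{
	struct demo_block **link, *parent = NULL;

	assert(blk && root && less);

	link = &root->node;
	while (*link) {
		parent = *link;
		link = less(blk, parent) ? &parent->left : &parent->right;
	}
	blk->parent = parent;
	blk->left = blk->right = NULL;
	*link = blk;
}

/* Rightmost node, i.e. the highest offset (analogous to rb_last()). */
static struct demo_block *demo_last(const struct demo_root *root)
{
	struct demo_block *n = root->node;

	while (n && n->right)
		n = n->right;
	return n;
}

/* In-order predecessor (analogous to rb_prev()). */
static struct demo_block *demo_prev(struct demo_block *n)
{
	if (n->left) {
		n = n->left;
		while (n->right)
			n = n->right;
		return n;
	}
	while (n->parent && n == n->parent->left)
		n = n->parent;
	return n->parent;
}

/* Walk from highest to lowest offset, storing up to max offsets in out[]. */
static size_t demo_collect_reverse(const struct demo_root *root,
				   unsigned long long *out, size_t max)
{
	struct demo_block *pos;
	size_t n = 0;

	for (pos = demo_last(root); pos && n < max; pos = demo_prev(pos))
		out[n++] = pos->offset;
	return n;
}
```

Because the sketch never rebalances, inserting already-sorted offsets degrades it to a linked list; the O(log n) bound quoted in the commit message depends on the red-black rebalancing that rb_add() performs in the kernel.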

5 days, 19 hours


[PATCH net v2] rds: ib: Increment i_fastreg_wrs before bailing out

by Håkon Bugge

We need to increment i_fastreg_wrs before we bail out from
rds_ib_post_reg_frmr().

Fixes: 1659185fb4d0 ("RDS: IB: Support Fastreg MR (FRMR) memory registration mode")
Fixes: 3a2886cca703 ("net/rds: Keep track of and wait for FRWR segments in use upon shutdown")
Cc: stable(a)vger.kernel.org
Signed-off-by: Håkon Bugge <haakon.bugge(a)oracle.com>

---

v1 -> v2: Added Cc: stable(a)vger.kernel.org
---
 net/rds/ib_frmr.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/net/rds/ib_frmr.c b/net/rds/ib_frmr.c
index 28c1b00221780..7e3b04a83904d 100644
--- a/net/rds/ib_frmr.c
+++ b/net/rds/ib_frmr.c
@@ -133,12 +133,15 @@ static int rds_ib_post_reg_frmr(struct rds_ib_mr *ibmr)

 	ret = ib_map_mr_sg_zbva(frmr->mr, ibmr->sg, ibmr->sg_dma_len,
 				&off, PAGE_SIZE);
-	if (unlikely(ret != ibmr->sg_dma_len))
-		return ret < 0 ? ret : -EINVAL;
+	if (unlikely(ret != ibmr->sg_dma_len)) {
+		ret = ret < 0 ? ret : -EINVAL;
+		goto out_inc;
+	}

-	if (cmpxchg(&frmr->fr_state,
-		    FRMR_IS_FREE, FRMR_IS_INUSE) != FRMR_IS_FREE)
-		return -EBUSY;
+	if (cmpxchg(&frmr->fr_state, FRMR_IS_FREE, FRMR_IS_INUSE) != FRMR_IS_FREE) {
+		ret = -EBUSY;
+		goto out_inc;
+	}

 	atomic_inc(&ibmr->ic->i_fastreg_inuse_count);

@@ -178,9 +181,11 @@ static int rds_ib_post_reg_frmr(struct rds_ib_mr *ibmr)
 	 * being accessed while registration is still pending.
 	 */
 	wait_event(frmr->fr_reg_done, !frmr->fr_reg);
-
 out:
+	return ret;

+out_inc:
+	atomic_inc(&ibmr->ic->i_fastreg_wrs);
 	return ret;
 }

-- 
2.43.5
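
The fix routes both early exits through a single label that gives the work-request credit back. The sketch below shows that pattern in userspace with C11 atomics. It assumes, as the commit message implies but the quoted hunks do not show, that the function has already consumed one i_fastreg_wrs credit before the checks run. The struct, the function, and its map_ok/state_free inputs are hypothetical, and the sketch returns -EAGAIN where real code might wait for a credit.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the connection's work-request credit counter. */
struct demo_conn {
	atomic_int wr_credits;
};

/*
 * Take one credit, then run two setup steps that may fail. Every early
 * exit after the credit is taken goes through out_inc, so the credit is
 * handed back exactly once. map_ok and state_free are illustrative
 * inputs, not real RDS fields.
 */
static int demo_post_reg(struct demo_conn *c, int map_ok, int state_free)
{
	int ret = 0;

	assert(c);

	/* Reserve a credit; undo the reservation if none were available. */
	if (atomic_fetch_sub(&c->wr_credits, 1) <= 0) {
		atomic_fetch_add(&c->wr_credits, 1);
		return -EAGAIN;
	}

	if (!map_ok) {
		ret = -EINVAL;
		goto out_inc;
	}

	if (!state_free) {
		ret = -EBUSY;
		goto out_inc;
	}

	/* Success: the credit stays consumed until the request completes. */
	return ret;

out_inc:
	atomic_fetch_add(&c->wr_credits, 1);
	return ret;
}
```

Keeping the success return separate from out_inc is what prevents a double increment: only failure paths that still hold the credit reach the label.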

5 days, 20 hours


[PATCH v5 2/3] drm/buddy: Separate clear and dirty free block trees

by Arunpravin Paneer Selvam

Maintain two separate RB trees per order - one for clear (zeroed) blocks
and another for dirty (uncleared) blocks. This separation improves
code clarity and makes it more obvious which tree is being searched
during allocation. It also improves scalability and efficiency when
searching for a specific type of block, avoiding unnecessary checks
and making the allocator more predictable under fragmentation.

The changes have been validated using the existing drm_buddy_test
KUnit test cases, along with selected graphics workloads,
to ensure correctness and avoid regressions.

v2: Missed adding the suggested-by tag. Added it in v2.

v3(Matthew):
  - Remove the double underscores from the internal functions.
  - Rename the internal functions to have less generic names.
  - Fix the error handling code.
  - Pass tree argument for the tree macro.
  - Use the existing dirty/free bit instead of a new tree field.
  - Use free_trees[] instead of clear_tree and dirty_tree for a
    cleaner approach.

v4:
  - A bug was reported by Intel CI and was fixed by Matthew Auld.
  - Replace the get_root function with
    &mm->free_trees[tree][order] (Matthew)
  - Remove the unnecessary rbtree_is_empty() check (Matthew)
  - Remove the unnecessary get_tree_for_flags() function.
  - Rename get_tree_for_block() to get_block_tree() for more
    clarity.

v5(Jani Nikula):
  - Don't use static inline in .c files.
  - enum free_tree and enumerator names are quite generic for a header
    and usage and the whole enum should be an implementation detail.

Cc: stable(a)vger.kernel.org Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam(a)amd.com> Suggested-by: Matthew Auld <matthew.auld(a)intel.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4260 --- drivers/gpu/drm/drm_buddy.c | 336 +++++++++++++++++++++--------------- include/drm/drm_buddy.h | 2 +- 2 files changed, 201 insertions(+), 137 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 8b340f47f73d..4f96e2e17d08 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -12,8 +12,17 @@ #include <drm/drm_buddy.h> +enum drm_buddy_free_tree { + DRM_BUDDY_CLEAR_TREE = 0, + DRM_BUDDY_DIRTY_TREE, + DRM_BUDDY_MAX_FREE_TREES, +}; + static struct kmem_cache *slab_blocks; +#define for_each_free_tree(tree) \ + for ((tree) = 0; (tree) < DRM_BUDDY_MAX_FREE_TREES; (tree)++) + static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm, struct drm_buddy_block *parent, unsigned int order, @@ -43,22 +52,61 @@ static void drm_block_free(struct drm_buddy *mm, kmem_cache_free(slab_blocks, block); } +static enum drm_buddy_free_tree +get_block_tree(struct drm_buddy_block *block) +{ + return drm_buddy_block_is_clear(block) ? + DRM_BUDDY_CLEAR_TREE : DRM_BUDDY_DIRTY_TREE; +} + +static struct drm_buddy_block * +rbtree_get_free_block(struct rb_node *node) +{ + return node ? 
rb_entry(node, struct drm_buddy_block, rb) : NULL; +} + +static struct drm_buddy_block * +rbtree_prev_free_block(struct rb_node *node) +{ + return rbtree_get_free_block(rb_prev(node)); +} + +static struct drm_buddy_block * +rbtree_first_free_block(struct rb_root *root) +{ + return rbtree_get_free_block(rb_first(root)); +} + +static struct drm_buddy_block * +rbtree_last_free_block(struct rb_root *root) +{ + return rbtree_get_free_block(rb_last(root)); +} + +static bool rbtree_is_empty(struct rb_root *root) +{ + return RB_EMPTY_ROOT(root); +} + static void rbtree_insert(struct drm_buddy *mm, - struct drm_buddy_block *block) + struct drm_buddy_block *block, + enum drm_buddy_free_tree tree) { - struct rb_root *root = &mm->free_tree[drm_buddy_block_order(block)]; - struct rb_node **link = &root->rb_node; - struct rb_node *parent = NULL; + struct rb_node **link, *parent = NULL; struct drm_buddy_block *node; - u64 offset; + struct rb_root *root; + unsigned int order; - offset = drm_buddy_block_offset(block); + order = drm_buddy_block_order(block); + + root = &mm->free_trees[tree][order]; + link = &root->rb_node; while (*link) { parent = *link; - node = rb_entry(parent, struct drm_buddy_block, rb); + node = rbtree_get_free_block(parent); - if (offset < drm_buddy_block_offset(node)) + if (drm_buddy_block_offset(block) < drm_buddy_block_offset(node)) link = &parent->rb_left; else link = &parent->rb_right; @@ -71,27 +119,17 @@ static void rbtree_insert(struct drm_buddy *mm, static void rbtree_remove(struct drm_buddy *mm, struct drm_buddy_block *block) { + unsigned int order = drm_buddy_block_order(block); struct rb_root *root; + enum drm_buddy_free_tree tree; - root = &mm->free_tree[drm_buddy_block_order(block)]; - rb_erase(&block->rb, root); + tree = get_block_tree(block); + root = &mm->free_trees[tree][order]; + rb_erase(&block->rb, root); RB_CLEAR_NODE(&block->rb); } -static struct drm_buddy_block * -rbtree_last_entry(struct drm_buddy *mm, unsigned int order) -{ - struct 
rb_node *node = rb_last(&mm->free_tree[order]); - - return node ? rb_entry(node, struct drm_buddy_block, rb) : NULL; -} - -static bool rbtree_is_empty(struct drm_buddy *mm, unsigned int order) -{ - return RB_EMPTY_ROOT(&mm->free_tree[order]); -} - static void clear_reset(struct drm_buddy_block *block) { block->header &= ~DRM_BUDDY_HEADER_CLEAR; @@ -114,10 +152,13 @@ static void mark_allocated(struct drm_buddy *mm, static void mark_free(struct drm_buddy *mm, struct drm_buddy_block *block) { + enum drm_buddy_free_tree tree; + block->header &= ~DRM_BUDDY_HEADER_STATE; block->header |= DRM_BUDDY_FREE; - rbtree_insert(mm, block); + tree = get_block_tree(block); + rbtree_insert(mm, block, tree); } static void mark_split(struct drm_buddy *mm, @@ -203,6 +244,7 @@ static int __force_merge(struct drm_buddy *mm, u64 end, unsigned int min_order) { + enum drm_buddy_free_tree tree; unsigned int order; int i; @@ -212,50 +254,49 @@ static int __force_merge(struct drm_buddy *mm, if (min_order > mm->max_order) return -EINVAL; - for (i = min_order - 1; i >= 0; i--) { - struct drm_buddy_block *block, *prev_block, *first_block; - - first_block = rb_entry(rb_first(&mm->free_tree[i]), struct drm_buddy_block, rb); + for_each_free_tree(tree) { + for (i = min_order - 1; i >= 0; i--) { + struct rb_root *root = &mm->free_trees[tree][i]; + struct drm_buddy_block *block, *prev_block; - rbtree_reverse_for_each_entry_safe(block, prev_block, &mm->free_tree[i], rb) { - struct drm_buddy_block *buddy; - u64 block_start, block_end; + rbtree_reverse_for_each_entry_safe(block, prev_block, root, rb) { + struct drm_buddy_block *buddy; + u64 block_start, block_end; - if (!block->parent) - continue; + if (!block->parent) + continue; - block_start = drm_buddy_block_offset(block); - block_end = block_start + drm_buddy_block_size(mm, block) - 1; + block_start = drm_buddy_block_offset(block); + block_end = block_start + drm_buddy_block_size(mm, block) - 1; - if (!contains(start, end, block_start, block_end)) - 
continue; + if (!contains(start, end, block_start, block_end)) + continue; - buddy = __get_buddy(block); - if (!drm_buddy_block_is_free(buddy)) - continue; + buddy = __get_buddy(block); + if (!drm_buddy_block_is_free(buddy)) + continue; - WARN_ON(drm_buddy_block_is_clear(block) == - drm_buddy_block_is_clear(buddy)); + WARN_ON(drm_buddy_block_is_clear(block) == + drm_buddy_block_is_clear(buddy)); - /* - * If the prev block is same as buddy, don't access the - * block in the next iteration as we would free the - * buddy block as part of the free function. - */ - if (prev_block && prev_block == buddy) { - if (prev_block != first_block) - prev_block = rb_entry(rb_prev(&prev_block->rb), - struct drm_buddy_block, - rb); - } + /* + * If the prev block is same as buddy, don't access the + * block in the next iteration as we would free the + * buddy block as part of the free function. + */ + if (prev_block && prev_block == buddy) { + if (prev_block != rbtree_first_free_block(root)) + prev_block = rbtree_prev_free_block(&prev_block->rb); + } - rbtree_remove(mm, block); - if (drm_buddy_block_is_clear(block)) - mm->clear_avail -= drm_buddy_block_size(mm, block); + rbtree_remove(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail -= drm_buddy_block_size(mm, block); - order = __drm_buddy_free(mm, block, true); - if (order >= min_order) - return 0; + order = __drm_buddy_free(mm, block, true); + if (order >= min_order) + return 0; + } } } @@ -276,7 +317,7 @@ static int __force_merge(struct drm_buddy *mm, */ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size) { - unsigned int i; + unsigned int i, j; u64 offset; if (size < chunk_size) @@ -298,14 +339,22 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size) BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER); - mm->free_tree = kmalloc_array(mm->max_order + 1, - sizeof(struct rb_root), - GFP_KERNEL); - if (!mm->free_tree) + mm->free_trees = kmalloc_array(DRM_BUDDY_MAX_FREE_TREES, + 
sizeof(*mm->free_trees), + GFP_KERNEL); + if (!mm->free_trees) return -ENOMEM; - for (i = 0; i <= mm->max_order; ++i) - mm->free_tree[i] = RB_ROOT; + for (i = 0; i < DRM_BUDDY_MAX_FREE_TREES; i++) { + mm->free_trees[i] = kmalloc_array(mm->max_order + 1, + sizeof(struct rb_root), + GFP_KERNEL); + if (!mm->free_trees[i]) + goto out_free_tree; + + for (j = 0; j <= mm->max_order; ++j) + mm->free_trees[i][j] = RB_ROOT; + } mm->n_roots = hweight64(size); @@ -353,7 +402,9 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size) drm_block_free(mm, mm->roots[i]); kfree(mm->roots); out_free_tree: - kfree(mm->free_tree); + while (i--) + kfree(mm->free_trees[i]); + kfree(mm->free_trees); return -ENOMEM; } EXPORT_SYMBOL(drm_buddy_init); @@ -389,8 +440,9 @@ void drm_buddy_fini(struct drm_buddy *mm) WARN_ON(mm->avail != mm->size); + for (i = 0; i < DRM_BUDDY_MAX_FREE_TREES; i++) + kfree(mm->free_trees[i]); kfree(mm->roots); - kfree(mm->free_tree); } EXPORT_SYMBOL(drm_buddy_fini); @@ -414,8 +466,7 @@ static int split_block(struct drm_buddy *mm, return -ENOMEM; } - mark_free(mm, block->left); - mark_free(mm, block->right); + mark_split(mm, block); if (drm_buddy_block_is_clear(block)) { mark_cleared(block->left); @@ -423,7 +474,8 @@ static int split_block(struct drm_buddy *mm, clear_reset(block); } - mark_split(mm, block); + mark_free(mm, block->left); + mark_free(mm, block->right); return 0; } @@ -456,6 +508,7 @@ EXPORT_SYMBOL(drm_get_buddy); */ void drm_buddy_reset_clear(struct drm_buddy *mm, bool is_clear) { + enum drm_buddy_free_tree src_tree, dst_tree; u64 root_size, size, start; unsigned int order; int i; @@ -470,19 +523,24 @@ void drm_buddy_reset_clear(struct drm_buddy *mm, bool is_clear) size -= root_size; } + src_tree = is_clear ? DRM_BUDDY_DIRTY_TREE : DRM_BUDDY_CLEAR_TREE; + dst_tree = is_clear ? 
DRM_BUDDY_CLEAR_TREE : DRM_BUDDY_DIRTY_TREE; + for (i = 0; i <= mm->max_order; ++i) { + struct rb_root *root = &mm->free_trees[src_tree][i]; struct drm_buddy_block *block; - rbtree_reverse_for_each_entry(block, &mm->free_tree[i], rb) { - if (is_clear != drm_buddy_block_is_clear(block)) { - if (is_clear) { - mark_cleared(block); - mm->clear_avail += drm_buddy_block_size(mm, block); - } else { - clear_reset(block); - mm->clear_avail -= drm_buddy_block_size(mm, block); - } + rbtree_reverse_for_each_entry(block, root, rb) { + rbtree_remove(mm, block); + if (is_clear) { + mark_cleared(block); + mm->clear_avail += drm_buddy_block_size(mm, block); + } else { + clear_reset(block); + mm->clear_avail -= drm_buddy_block_size(mm, block); } + + rbtree_insert(mm, block, dst_tree); } } } @@ -672,23 +730,17 @@ __drm_buddy_alloc_range_bias(struct drm_buddy *mm, } static struct drm_buddy_block * -get_maxblock(struct drm_buddy *mm, unsigned int order, - unsigned long flags) +get_maxblock(struct drm_buddy *mm, + unsigned int order, + enum drm_buddy_free_tree tree) { struct drm_buddy_block *max_block = NULL, *block = NULL; + struct rb_root *root; unsigned int i; for (i = order; i <= mm->max_order; ++i) { - struct drm_buddy_block *tmp_block; - - rbtree_reverse_for_each_entry(tmp_block, &mm->free_tree[i], rb) { - if (block_incompatible(tmp_block, flags)) - continue; - - block = tmp_block; - break; - } - + root = &mm->free_trees[tree][i]; + block = rbtree_last_free_block(root); if (!block) continue; @@ -712,39 +764,39 @@ alloc_from_freetree(struct drm_buddy *mm, unsigned long flags) { struct drm_buddy_block *block = NULL; + struct rb_root *root; + enum drm_buddy_free_tree tree; unsigned int tmp; int err; + tree = (flags & DRM_BUDDY_CLEAR_ALLOCATION) ? 
+ DRM_BUDDY_CLEAR_TREE : DRM_BUDDY_DIRTY_TREE; + if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) { - block = get_maxblock(mm, order, flags); + block = get_maxblock(mm, order, tree); if (block) /* Store the obtained block order */ tmp = drm_buddy_block_order(block); } else { for (tmp = order; tmp <= mm->max_order; ++tmp) { - struct drm_buddy_block *tmp_block; - - rbtree_reverse_for_each_entry(tmp_block, &mm->free_tree[tmp], rb) { - if (block_incompatible(tmp_block, flags)) - continue; - - block = tmp_block; - break; - } - + /* Get RB tree root for this order and tree */ + root = &mm->free_trees[tree][tmp]; + block = rbtree_last_free_block(root); if (block) break; } } if (!block) { - /* Fallback method */ + /* Try allocating from the other tree */ + tree = (tree == DRM_BUDDY_CLEAR_TREE) ? + DRM_BUDDY_DIRTY_TREE : DRM_BUDDY_CLEAR_TREE; + for (tmp = order; tmp <= mm->max_order; ++tmp) { - if (!rbtree_is_empty(mm, tmp)) { - block = rbtree_last_entry(mm, tmp); - if (block) - break; - } + root = &mm->free_trees[tree][tmp]; + block = rbtree_last_free_block(root); + if (block) + break; } if (!block) @@ -888,6 +940,7 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, u64 rhs_offset, lhs_offset, lhs_size, filled; struct drm_buddy_block *block; LIST_HEAD(blocks_lhs); + enum drm_buddy_free_tree tree; unsigned long pages; unsigned int order; u64 modify_size; @@ -899,34 +952,39 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, if (order == 0) return -ENOSPC; - if (rbtree_is_empty(mm, order)) + if (rbtree_is_empty(&mm->free_trees[DRM_BUDDY_CLEAR_TREE][order]) && + rbtree_is_empty(&mm->free_trees[DRM_BUDDY_DIRTY_TREE][order])) return -ENOSPC; - rbtree_reverse_for_each_entry(block, &mm->free_tree[order], rb) { - /* Allocate blocks traversing RHS */ - rhs_offset = drm_buddy_block_offset(block); - err = __drm_buddy_alloc_range(mm, rhs_offset, size, - &filled, blocks); - if (!err || err != -ENOSPC) - return err; - - lhs_size = max((size - filled), min_block_size); - if 
(!IS_ALIGNED(lhs_size, min_block_size)) - lhs_size = round_up(lhs_size, min_block_size); - - /* Allocate blocks traversing LHS */ - lhs_offset = drm_buddy_block_offset(block) - lhs_size; - err = __drm_buddy_alloc_range(mm, lhs_offset, lhs_size, - NULL, &blocks_lhs); - if (!err) { - list_splice(&blocks_lhs, blocks); - return 0; - } else if (err != -ENOSPC) { + for_each_free_tree(tree) { + struct rb_root *root = &mm->free_trees[tree][order]; + + rbtree_reverse_for_each_entry(block, root, rb) { + /* Allocate blocks traversing RHS */ + rhs_offset = drm_buddy_block_offset(block); + err = __drm_buddy_alloc_range(mm, rhs_offset, size, + &filled, blocks); + if (!err || err != -ENOSPC) + return err; + + lhs_size = max((size - filled), min_block_size); + if (!IS_ALIGNED(lhs_size, min_block_size)) + lhs_size = round_up(lhs_size, min_block_size); + + /* Allocate blocks traversing LHS */ + lhs_offset = drm_buddy_block_offset(block) - lhs_size; + err = __drm_buddy_alloc_range(mm, lhs_offset, lhs_size, + NULL, &blocks_lhs); + if (!err) { + list_splice(&blocks_lhs, blocks); + return 0; + } else if (err != -ENOSPC) { + drm_buddy_free_list_internal(mm, blocks); + return err; + } + /* Free blocks for the next iteration */ drm_buddy_free_list_internal(mm, blocks); - return err; } - /* Free blocks for the next iteration */ - drm_buddy_free_list_internal(mm, blocks); } return -ENOSPC; @@ -1231,6 +1289,7 @@ EXPORT_SYMBOL(drm_buddy_block_print); */ void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p) { + enum drm_buddy_free_tree tree; int order; drm_printf(p, "chunk_size: %lluKiB, total: %lluMiB, free: %lluMiB, clear_free: %lluMiB\n", @@ -1238,11 +1297,16 @@ void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p) for (order = mm->max_order; order >= 0; order--) { struct drm_buddy_block *block; + struct rb_root *root; u64 count = 0, free; - rbtree_for_each_entry(block, &mm->free_tree[order], rb) { - BUG_ON(!drm_buddy_block_is_free(block)); - count++; + 
for_each_free_tree(tree) { + root = &mm->free_trees[tree][order]; + + rbtree_for_each_entry(block, root, rb) { + BUG_ON(!drm_buddy_block_is_free(block)); + count++; + } } drm_printf(p, "order-%2d ", order); diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index 091823592034..a2189a92bb7a 100644 --- a/include/drm/drm_buddy.h +++ b/include/drm/drm_buddy.h @@ -73,7 +73,7 @@ struct drm_buddy_block { */ struct drm_buddy { /* Maintain a free list for each order. */ - struct rb_root *free_tree; + struct rb_root **free_trees; /* * Maintain explicit binary tree(s) to track the allocation of the -- 2.34.1
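The fallback between the clear and dirty trees in the allocation hunk above can be modelled outside the kernel. The following is a minimal userspace sketch under stated assumptions, not the driver code: the names demo_mm, demo_scan_tree, demo_pick_order and DEMO_MAX_ORDER are hypothetical, and each per-order RB tree is reduced to a count of free blocks. It covers only the non-topdown path: scan the preferred tree from the requested order upward, then repeat the scan on the other tree.

```c
#include <stddef.h>

/* Hypothetical stand-ins for DRM_BUDDY_CLEAR_TREE / DRM_BUDDY_DIRTY_TREE. */
enum demo_tree { DEMO_CLEAR_TREE, DEMO_DIRTY_TREE, DEMO_MAX_TREES };

#define DEMO_MAX_ORDER 4

/* Each per-order RB tree is modelled as a plain count of free blocks. */
struct demo_mm {
	unsigned int nr_free[DEMO_MAX_TREES][DEMO_MAX_ORDER + 1];
};

/* Return the lowest order >= 'order' with a free block in 'tree', or -1. */
static int demo_scan_tree(const struct demo_mm *mm, enum demo_tree tree,
			  unsigned int order)
{
	unsigned int tmp;

	for (tmp = order; tmp <= DEMO_MAX_ORDER; ++tmp)
		if (mm->nr_free[tree][tmp])
			return (int)tmp;
	return -1;
}

/*
 * Try the preferred tree first, then the other one. On success, store the
 * tree that satisfied the request in *used and return the order found.
 */
static int demo_pick_order(const struct demo_mm *mm, int want_clear,
			   unsigned int order, enum demo_tree *used)
{
	enum demo_tree tree = want_clear ? DEMO_CLEAR_TREE : DEMO_DIRTY_TREE;
	int found = demo_scan_tree(mm, tree, order);

	if (found < 0) {
		/* Preferred tree is empty for all usable orders: switch. */
		tree = (tree == DEMO_CLEAR_TREE) ?
			DEMO_DIRTY_TREE : DEMO_CLEAR_TREE;
		found = demo_scan_tree(mm, tree, order);
	}
	if (found >= 0)
		*used = tree;
	return found;
}
```

A caller that asks for cleared memory while only the dirty tree holds an order-3 block would get order 3 back with *used set to the dirty tree, which is the case the "Try allocating from the other tree" comment describes.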

5 days, 21 hours


[PATCH net v5] selftests: net: add test for destination in broadcast packets

by Oscar Maes

Add test to check the broadcast ethernet destination field is set
correctly.

This test sends a broadcast ping, captures it using tcpdump and
ensures that all bits of the 6 octet ethernet destination address
are correctly set by examining the output capture file.

Co-developed-by: Brett A C Sheffield <bacs(a)librecast.net>
Signed-off-by: Brett A C Sheffield <bacs(a)librecast.net>
Signed-off-by: Oscar Maes <oscmaes92(a)gmail.com>
---
v4 -> v5:
 - Fixed Signed-off-by chain

v3 -> v4:
 - Added Brett as co-author
 - Wait for tcpdump to bind using slowwait

Links:
 - Discussion: https://lore.kernel.org/netdev/20250822165231.4353-4-bacs@librecast.net/
 - Previous version: https://lore.kernel.org/netdev/20250828114242.6433-1-oscmaes92@gmail.com/

Thanks to Brett Sheffield for co-developing this selftest!

 tools/testing/selftests/net/Makefile          |  1 +
 .../selftests/net/broadcast_ether_dst.sh      | 83 +++++++++++++++++++
 2 files changed, 84 insertions(+)
 create mode 100755 tools/testing/selftests/net/broadcast_ether_dst.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index b31a71f2b372..56ad10ea6628 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -115,6 +115,7 @@ TEST_PROGS += skf_net_off.sh
 TEST_GEN_FILES += skf_net_off
 TEST_GEN_FILES += tfo
 TEST_PROGS += tfo_passive.sh
+TEST_PROGS += broadcast_ether_dst.sh
 TEST_PROGS += broadcast_pmtu.sh
 TEST_PROGS += ipv6_force_forwarding.sh
 
diff --git a/tools/testing/selftests/net/broadcast_ether_dst.sh b/tools/testing/selftests/net/broadcast_ether_dst.sh
new file mode 100755
index 000000000000..334a7eca8a80
--- /dev/null
+++ b/tools/testing/selftests/net/broadcast_ether_dst.sh
@@ -0,0 +1,83 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Author: Brett A C Sheffield <bacs(a)librecast.net>
+# Author: Oscar Maes <oscmaes92(a)gmail.com>
+#
+# Ensure destination ethernet field is correctly set for
+# broadcast packets
+
+source lib.sh
+
+CLIENT_IP4="192.168.0.1"
+GW_IP4="192.168.0.2"
+
+setup() {
+	setup_ns CLIENT_NS SERVER_NS
+
+	ip -net "${SERVER_NS}" link add link1 type veth \
+		peer name link0 netns "${CLIENT_NS}"
+
+	ip -net "${CLIENT_NS}" link set link0 up
+	ip -net "${CLIENT_NS}" addr add "${CLIENT_IP4}"/24 dev link0
+
+	ip -net "${SERVER_NS}" link set link1 up
+
+	ip -net "${CLIENT_NS}" route add default via "${GW_IP4}"
+	ip netns exec "${CLIENT_NS}" arp -s "${GW_IP4}" 00:11:22:33:44:55
+}
+
+cleanup() {
+	rm -f "${CAPFILE}" "${OUTPUT}"
+	ip -net "${SERVER_NS}" link del link1
+	cleanup_ns "${CLIENT_NS}" "${SERVER_NS}"
+}
+
+test_broadcast_ether_dst() {
+	local rc=0
+	CAPFILE=$(mktemp -u cap.XXXXXXXXXX)
+	OUTPUT=$(mktemp -u out.XXXXXXXXXX)
+
+	echo "Testing ethernet broadcast destination"
+
+	# start tcpdump listening for icmp
+	# tcpdump will exit after receiving a single packet
+	# timeout will kill tcpdump if it is still running after 2s
+	timeout 2s ip netns exec "${CLIENT_NS}" \
+		tcpdump -i link0 -c 1 -w "${CAPFILE}" icmp &> "${OUTPUT}" &
+	pid=$!
+	slowwait 1 grep -qs "listening" "${OUTPUT}"
+
+	# send broadcast ping
+	ip netns exec "${CLIENT_NS}" \
+		ping -W0.01 -c1 -b 255.255.255.255 &> /dev/null
+
+	# wait for tcpdump for exit after receiving packet
+	wait "${pid}"
+
+	# compare ethernet destination field to ff:ff:ff:ff:ff:ff
+	ether_dst=$(tcpdump -r "${CAPFILE}" -tnne 2>/dev/null | \
+		awk '{sub(/,/,"",$3); print $3}')
+	if [[ "${ether_dst}" == "ff:ff:ff:ff:ff:ff" ]]; then
+		echo "[ OK ]"
+		rc="${ksft_pass}"
+	else
+		echo "[FAIL] expected dst ether addr to be ff:ff:ff:ff:ff:ff," \
+			"got ${ether_dst}"
+		rc="${ksft_fail}"
+	fi
+
+	return "${rc}"
+}
+
+if [ ! -x "$(command -v tcpdump)" ]; then
+	echo "SKIP: Could not run test without tcpdump tool"
+	exit "${ksft_skip}"
+fi
+
+trap cleanup EXIT
+
+setup
+test_broadcast_ether_dst
+
+exit $?
-- 
2.39.5
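The property the selftest checks with tcpdump and awk can also be stated directly on the frame bytes. The following is a small illustrative sketch in C, not part of the patch: demo_is_broadcast_ether_dst and DEMO_ETH_ALEN are hypothetical names. It relies only on the fact that the destination address is the first six octets of an Ethernet header, and it passes only if every one of them is 0xff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_ALEN 6 /* octets in an Ethernet address */

/* True only if every octet of the destination address is 0xff. */
static bool demo_is_broadcast_ether_dst(const uint8_t *frame, size_t len)
{
	size_t i;

	/* The destination address is the first field of an Ethernet header. */
	if (len < DEMO_ETH_ALEN)
		return false;

	for (i = 0; i < DEMO_ETH_ALEN; i++)
		if (frame[i] != 0xff)
			return false;

	return true;
}
```

A frame addressed to the static ARP entry that the script installs for the gateway (00:11:22:33:44:55) would fail this check, which is presumably the failure the test is meant to catch.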

5 days, 22 hours


[PATCH iwlwifi-fixes] wifi: iwlwifi: fix 130/1030 configs

by Miri Korenblit

From: Johannes Berg <johannes.berg(a)intel.com> The 130/1030 devices are really derivatives of 6030, with some small differences not pertaining to the MAC, so they must use the 6030 MAC config. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220472 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220517 Fixes: 35ac275ebe0c ("wifi: iwlwifi: cfg: finish config split") Cc: stable(a)vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg(a)intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com> --- drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 26 +++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c index f9e2095d6490..7e56e4ff7642 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c @@ -124,13 +124,13 @@ VISIBLE_IF_IWLWIFI_KUNIT const struct pci_device_id iwl_hw_card_ids[] = { {IWL_PCI_DEVICE(0x0082, 0x1304, iwl6005_mac_cfg)},/* low 5GHz active */ {IWL_PCI_DEVICE(0x0082, 0x1305, iwl6005_mac_cfg)},/* high 5GHz active */ -/* 6x30 Series */ - {IWL_PCI_DEVICE(0x008A, 0x5305, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x008A, 0x5307, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x008A, 0x5325, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x008A, 0x5327, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x008B, 0x5315, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x008B, 0x5317, iwl1000_mac_cfg)}, +/* 1030/6x30 Series */ + {IWL_PCI_DEVICE(0x008A, 0x5305, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x008A, 0x5307, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x008A, 0x5325, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x008A, 0x5327, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x008B, 0x5315, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x008B, 0x5317, iwl6030_mac_cfg)}, {IWL_PCI_DEVICE(0x0090, 0x5211, iwl6030_mac_cfg)}, {IWL_PCI_DEVICE(0x0090, 0x5215, iwl6030_mac_cfg)}, {IWL_PCI_DEVICE(0x0090, 0x5216, iwl6030_mac_cfg)}, @@ -181,12 +181,12 
@@ VISIBLE_IF_IWLWIFI_KUNIT const struct pci_device_id iwl_hw_card_ids[] = { {IWL_PCI_DEVICE(0x08AE, 0x1027, iwl1000_mac_cfg)}, /* 130 Series WiFi */ - {IWL_PCI_DEVICE(0x0896, 0x5005, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x0896, 0x5007, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x0897, 0x5015, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x0897, 0x5017, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x0896, 0x5025, iwl1000_mac_cfg)}, - {IWL_PCI_DEVICE(0x0896, 0x5027, iwl1000_mac_cfg)}, + {IWL_PCI_DEVICE(0x0896, 0x5005, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x0896, 0x5007, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x0897, 0x5015, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x0897, 0x5017, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x0896, 0x5025, iwl6030_mac_cfg)}, + {IWL_PCI_DEVICE(0x0896, 0x5027, iwl6030_mac_cfg)}, /* 2x00 Series */ {IWL_PCI_DEVICE(0x0890, 0x4022, iwl2000_mac_cfg)}, -- 2.34.1

5 days, 22 hours


[PATCH 0/3] Fix the NULL pointer deference issue in QMP USB drivers

by Kathiravan Thirumoorthy

In the suspend / resume callbacks, qmp->phy could be NULL because the PHY is
created after the PM ops are enabled, which leads to a NULL pointer
dereference.

Internally, the issue was reported on the qcom-qmp-usb driver. Since the fix
is also applicable to the legacy and usbc drivers, the fixes for those drivers
are incorporated as well.

The qcom-qmp-usb-legacy and qcom-qmp-usbc drivers were split out from the
qcom-qmp-usb driver in v6.6 and v6.9 respectively, so the changes are split
into 3 patches for ease of backporting.

Signed-off-by: Kathiravan Thirumoorthy <kathiravan.thirumoorthy(a)oss.qualcomm.com>
---
Poovendhan Selvaraj (3):
      phy: qcom-qmp-usb: fix NULL pointer dereference in PM callbacks
      phy: qcom-qmp-usb-legacy: fix NULL pointer dereference in PM callbacks
      phy: qcom-qmp-usbc: fix NULL pointer dereference in PM callbacks

 drivers/phy/qualcomm/phy-qcom-qmp-usb-legacy.c | 4 ++--
 drivers/phy/qualcomm/phy-qcom-qmp-usb.c        | 4 ++--
 drivers/phy/qualcomm/phy-qcom-qmp-usbc.c       | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)
---
base-commit: 0f4c93f7eb861acab537dbe94441817a270537bf
change-id: 20250825-qmp-null-deref-on-pm-fd98a91c775b

Best regards,
-- 
Kathiravan Thirumoorthy <kathiravan.thirumoorthy(a)oss.qualcomm.com>
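The cover letter does not include the diffs, so the exact change is not visible here. As a hedged illustration of the failure mode it describes, the sketch below assumes the fix is a NULL guard placed ahead of the first dereference of the PHY pointer. The types demo_phy and demo_qmp and the function demo_runtime_suspend are hypothetical, heavily reduced stand-ins and are not the driver's own definitions.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the driver structures. */
struct demo_phy {
	int init_count;
};

struct demo_qmp {
	struct demo_phy *phy;	/* NULL until the PHY has been created */
	int suspended;
};

/*
 * Runtime-suspend style callback. PM ops may already be enabled before the
 * PHY exists, so 'phy' is checked before it is dereferenced.
 */
static int demo_runtime_suspend(struct demo_qmp *qmp)
{
	if (!qmp->phy || !qmp->phy->init_count)
		return 0;	/* nothing to suspend yet */

	qmp->suspended = 1;
	return 0;
}
```

Without the first half of the condition, a callback that fires before the PHY is created would read init_count through a NULL pointer.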

5 days, 23 hours


[merged mm-hotfixes-stable] mm-damon-sysfs-fix-use-after-free-in-state_show.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/damon/sysfs: fix use-after-free in state_show()
has been removed from the -mm tree.  Its filename was
     mm-damon-sysfs-fix-use-after-free-in-state_show.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Stanislav Fort <stanislav.fort(a)aisle.com>
Subject: mm/damon/sysfs: fix use-after-free in state_show()
Date: Fri, 5 Sep 2025 13:10:46 +0300

state_show() reads kdamond->damon_ctx without holding damon_sysfs_lock.
This allows a use-after-free race:

CPU 0                         CPU 1
-----                         -----
state_show()                  damon_sysfs_turn_damon_on()
ctx = kdamond->damon_ctx;     mutex_lock(&damon_sysfs_lock);
                              damon_destroy_ctx(kdamond->damon_ctx);
                              kdamond->damon_ctx = NULL;
                              mutex_unlock(&damon_sysfs_lock);
damon_is_running(ctx);        /* ctx is freed */
mutex_lock(&ctx->kdamond_lock); /* UAF */

(The race can also occur with damon_sysfs_kdamonds_rm_dirs() and
damon_sysfs_kdamond_release(), which free or replace the context under
damon_sysfs_lock.)

Fix by taking damon_sysfs_lock before dereferencing the context, mirroring
the locking used in pid_show().

The bug has existed since state_show() first accessed kdamond->damon_ctx.
Link: https://lkml.kernel.org/r/20250905101046.2288-1-disclosure@aisle.com Fixes: a61ea561c871 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring") Signed-off-by: Stanislav Fort <disclosure(a)aisle.com> Reported-by: Stanislav Fort <disclosure(a)aisle.com> Reviewed-by: SeongJae Park <sj(a)kernel.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/damon/sysfs.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) --- a/mm/damon/sysfs.c~mm-damon-sysfs-fix-use-after-free-in-state_show +++ a/mm/damon/sysfs.c @@ -1260,14 +1260,18 @@ static ssize_t state_show(struct kobject { struct damon_sysfs_kdamond *kdamond = container_of(kobj, struct damon_sysfs_kdamond, kobj); - struct damon_ctx *ctx = kdamond->damon_ctx; - bool running; + struct damon_ctx *ctx; + bool running = false; - if (!ctx) - running = false; - else + if (!mutex_trylock(&damon_sysfs_lock)) + return -EBUSY; + + ctx = kdamond->damon_ctx; + if (ctx) running = damon_is_running(ctx); + mutex_unlock(&damon_sysfs_lock); + return sysfs_emit(buf, "%s\n", running ? damon_sysfs_cmd_strs[DAMON_SYSFS_CMD_ON] : damon_sysfs_cmd_strs[DAMON_SYSFS_CMD_OFF]); _ Patches currently in -mm which might be from stanislav.fort(a)aisle.com are mm-memcg-v1-account-event-registrations-and-drop-world-writable-cgroupevent_control.patch
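The shape of the fix (take the lock, then load and use the pointer, and report busy if the lock is contended) can be sketched in userspace. This is an assumption-laden model, not the DAMON code: demo_lock is a C11 atomic_flag used as a try-lock in place of damon_sysfs_lock, and demo_ctx, demo_shared_ctx and demo_state_show are hypothetical names.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_ctx {
	bool running;
};

/* Hypothetical stand-in for damon_sysfs_lock: a flag used as a try-lock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Shared pointer that a writer may free and clear while holding demo_lock. */
static struct demo_ctx *demo_shared_ctx;

static bool demo_trylock(void)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&demo_lock);
}

static void demo_unlock(void)
{
	atomic_flag_clear(&demo_lock);
}

/*
 * Report 1 (running), 0 (not running) or -EBUSY (lock contended).
 * The pointer is loaded and dereferenced only while the lock is held.
 */
static int demo_state_show(void)
{
	struct demo_ctx *ctx;
	bool running = false;

	if (!demo_trylock())
		return -EBUSY;

	ctx = demo_shared_ctx;
	if (ctx)
		running = ctx->running;

	demo_unlock();
	return running ? 1 : 0;
}
```

The point of the ordering is that no copy of the pointer taken outside the lock is ever used, so a concurrent destroy cannot invalidate it between the load and the dereference.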

6 days


[merged mm-hotfixes-stable] proc-fix-type-confusion-in-pde_set_flags.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: proc: fix type confusion in pde_set_flags()
has been removed from the -mm tree.  Its filename was
     proc-fix-type-confusion-in-pde_set_flags.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: wangzijie <wangzijie1(a)honor.com>
Subject: proc: fix type confusion in pde_set_flags()
Date: Thu, 4 Sep 2025 21:57:15 +0800

Commit 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc
files") missed a key part in the definition of proc_dir_entry:

union {
	const struct proc_ops *proc_ops;
	const struct file_operations *proc_dir_ops;
};

So dereferencing ->proc_ops on the assumption that it is a proc_ops
structure results in type confusion, and the NULL check on 'proc_ops' does
not work for proc directories.

Add a !S_ISDIR(dp->mode) test before calling pde_set_flags() to fix it.

Link: https://lkml.kernel.org/r/20250904135715.3972782-1-wangzijie1@honor.com
Fixes: 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc files")
Signed-off-by: wangzijie <wangzijie1(a)honor.com>
Reported-by: Brad Spengler <spender(a)grsecurity.net>
Closes: https://lore.kernel.org/all/20250903065758.3678537-1-wangzijie1@honor.com/
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Stefano Brivio <sbrivio(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/proc/generic.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/proc/generic.c~proc-fix-type-confusion-in-pde_set_flags
+++ a/fs/proc/generic.c
@@ -393,7 +393,8 @@ struct proc_dir_entry *proc_register(str
 	if (proc_alloc_inum(&dp->low_ino))
 		goto out_free_entry;
 
-	pde_set_flags(dp);
+	if (!S_ISDIR(dp->mode))
+		pde_set_flags(dp);
 
 	write_lock(&proc_subdir_lock);
 	dp->parent = dir;
_

Patches currently in -mm which might be from wangzijie1(a)honor.com are
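The type confusion comes from a union whose valid member depends on the entry's mode. The sketch below is a reduced userspace model with hypothetical names (demo_entry, demo_set_flags, demo_register and the DEMO_S_* macros); the structures are not the kernel's definitions. It shows why the mode has to be tested before the union is read as proc_ops.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_S_IFMT	0170000
#define DEMO_S_IFDIR	0040000
#define DEMO_S_ISDIR(m)	(((m) & DEMO_S_IFMT) == DEMO_S_IFDIR)

/* Reduced, hypothetical versions of the two operation tables. */
struct demo_proc_ops {
	unsigned int proc_flags;
};

struct demo_file_operations {
	void *owner;
};

struct demo_entry {
	unsigned int mode;
	unsigned int flags;
	/* Only one member is valid; 'mode' says which. */
	union {
		const struct demo_proc_ops *proc_ops;
		const struct demo_file_operations *proc_dir_ops;
	};
};

/* Copy flags out of proc_ops; valid only for non-directory entries. */
static void demo_set_flags(struct demo_entry *e)
{
	if (e->proc_ops)
		e->flags = e->proc_ops->proc_flags;
}

static void demo_register(struct demo_entry *e)
{
	/* For directories the union holds proc_dir_ops, so skip the call. */
	if (!DEMO_S_ISDIR(e->mode))
		demo_set_flags(e);
}
```

For a directory entry the pointer stored in the union is non-NULL but points at a different structure, so a NULL check alone cannot tell the two cases apart; only the mode can.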

6 days


[merged mm-hotfixes-stable] compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: compiler-clang.h: define __SANITIZE_*__ macros only when undefined has been removed from the -mm tree. Its filename was compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Nathan Chancellor <nathan(a)kernel.org> Subject: compiler-clang.h: define __SANITIZE_*__ macros only when undefined Date: Tue, 02 Sep 2025 15:49:26 -0700 Clang 22 recently added support for defining __SANITIZE__ macros similar to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e) with the existing defines that the kernel creates to emulate this behavior with existing clang versions. In file included from <built-in>:3: In file included from include/linux/compiler_types.h:171: include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined] 37 | #define __SANITIZE_THREAD__ | ^ <built-in>:352:9: note: previous definition is here 352 | #define __SANITIZE_THREAD__ 1 | ^ Refactor compiler-clang.h to only define the sanitizer macros when they are undefined and adjust the rest of the code to use these macros for checking if the sanitizers are enabled, clearing up the warnings and allowing the kernel to easily drop these defines when the minimum supported version of LLVM for building the kernel becomes 22.0.0 or newer. 
Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf370… Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da… [1] Signed-off-by: Nathan Chancellor <nathan(a)kernel.org> Reviewed-by: Justin Stitt <justinstitt(a)google.com> Cc: Alexander Potapenko <glider(a)google.com> Cc: Andrey Konovalov <andreyknvl(a)gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com> Cc: Bill Wendling <morbo(a)google.com> Cc: Dmitriy Vyukov <dvyukov(a)google.com> Cc: Marco Elver <elver(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/compiler-clang.h | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-) --- a/include/linux/compiler-clang.h~compiler-clangh-define-__sanitize___-macros-only-when-undefined +++ a/include/linux/compiler-clang.h @@ -18,23 +18,42 @@ #define KASAN_ABI_VERSION 5 /* + * Clang 22 added preprocessor macros to match GCC, in hopes of eventually + * dropping __has_feature support for sanitizers: + * https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da… + * Create these macros for older versions of clang so that it is easy to clean + * up once the minimum supported version of LLVM for building the kernel always + * creates these macros. + * * Note: Checking __has_feature(*_sanitizer) is only true if the feature is * enabled. Therefore it is not required to additionally check defined(CONFIG_*) * to avoid adding redundant attributes in other configurations. 
*/ +#if __has_feature(address_sanitizer) && !defined(__SANITIZE_ADDRESS__) +#define __SANITIZE_ADDRESS__ +#endif +#if __has_feature(hwaddress_sanitizer) && !defined(__SANITIZE_HWADDRESS__) +#define __SANITIZE_HWADDRESS__ +#endif +#if __has_feature(thread_sanitizer) && !defined(__SANITIZE_THREAD__) +#define __SANITIZE_THREAD__ +#endif -#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer) -/* Emulate GCC's __SANITIZE_ADDRESS__ flag */ +/* + * Treat __SANITIZE_HWADDRESS__ the same as __SANITIZE_ADDRESS__ in the kernel. + */ +#ifdef __SANITIZE_HWADDRESS__ #define __SANITIZE_ADDRESS__ +#endif + +#ifdef __SANITIZE_ADDRESS__ #define __no_sanitize_address \ __attribute__((no_sanitize("address", "hwaddress"))) #else #define __no_sanitize_address #endif -#if __has_feature(thread_sanitizer) -/* emulate gcc's __SANITIZE_THREAD__ flag */ -#define __SANITIZE_THREAD__ +#ifdef __SANITIZE_THREAD__ #define __no_sanitize_thread \ __attribute__((no_sanitize("thread"))) #else _ Patches currently in -mm which might be from nathan(a)kernel.org are nilfs2-fix-cfi-failure-when-accessing-sys-fs-nilfs2-features.patch mm-rmap-convert-enum-rmap_level-to-enum-pgtable_level-fix.patch

6 days


[merged mm-hotfixes-stable] mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc() has been removed from the -mm tree. Its filename was mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com> Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc() Date: Sun, 31 Aug 2025 14:10:58 +0200 kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and always allocate memory using the hardcoded GFP_KERNEL flag. This makes them inconsistent with vmalloc(), which was recently extended to support GFP_NOFS and GFP_NOIO allocations. Page table allocations performed during shadow population also ignore the external gfp_mask. To preserve the intended semantics of GFP_NOFS and GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate memalloc scope. xfs calls vmalloc with GFP_NOFS, so this bug could lead to deadlock. There was a report here https://lkml.kernel.org/r/686ea951.050a0220.385921.0016.GAE@google.com This patch: - Extends kasan_populate_vmalloc() and helpers to take gfp_mask; - Passes gfp_mask down to alloc_pages_bulk() and __get_free_page(); - Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore() around apply_to_page_range(); - Updates vmalloc.c and percpu allocator call sites accordingly. 
Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc") Signed-off-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com> Reported-by: syzbot+3470c9ffee63e4abafeb(a)syzkaller.appspotmail.com Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com> Cc: Baoquan He <bhe(a)redhat.com> Cc: Michal Hocko <mhocko(a)kernel.org> Cc: Alexander Potapenko <glider(a)google.com> Cc: Andrey Konovalov <andreyknvl(a)gmail.com> Cc: Dmitry Vyukov <dvyukov(a)google.com> Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/kasan.h | 6 +++--- mm/kasan/shadow.c | 31 ++++++++++++++++++++++++------- mm/vmalloc.c | 8 ++++---- 3 files changed, 31 insertions(+), 14 deletions(-) --- a/include/linux/kasan.h~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc +++ a/include/linux/kasan.h @@ -562,7 +562,7 @@ static inline void kasan_init_hw_tags(vo #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) void kasan_populate_early_vm_area_shadow(void *start, unsigned long size); -int kasan_populate_vmalloc(unsigned long addr, unsigned long size); +int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask); void kasan_release_vmalloc(unsigned long start, unsigned long end, unsigned long free_region_start, unsigned long free_region_end, @@ -574,7 +574,7 @@ static inline void kasan_populate_early_ unsigned long size) { } static inline int kasan_populate_vmalloc(unsigned long start, - unsigned long size) + unsigned long size, gfp_t gfp_mask) { return 0; } @@ -610,7 +610,7 @@ static __always_inline void kasan_poison static inline void kasan_populate_early_vm_area_shadow(void *start, unsigned long size) { } static inline int kasan_populate_vmalloc(unsigned long start, - unsigned long size) + unsigned long size, gfp_t gfp_mask) { return 0; } --- 
a/mm/kasan/shadow.c~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc +++ a/mm/kasan/shadow.c @@ -336,13 +336,13 @@ static void ___free_pages_bulk(struct pa } } -static int ___alloc_pages_bulk(struct page **pages, int nr_pages) +static int ___alloc_pages_bulk(struct page **pages, int nr_pages, gfp_t gfp_mask) { unsigned long nr_populated, nr_total = nr_pages; struct page **page_array = pages; while (nr_pages) { - nr_populated = alloc_pages_bulk(GFP_KERNEL, nr_pages, pages); + nr_populated = alloc_pages_bulk(gfp_mask, nr_pages, pages); if (!nr_populated) { ___free_pages_bulk(page_array, nr_total - nr_pages); return -ENOMEM; @@ -354,25 +354,42 @@ static int ___alloc_pages_bulk(struct pa return 0; } -static int __kasan_populate_vmalloc(unsigned long start, unsigned long end) +static int __kasan_populate_vmalloc(unsigned long start, unsigned long end, gfp_t gfp_mask) { unsigned long nr_pages, nr_total = PFN_UP(end - start); struct vmalloc_populate_data data; + unsigned int flags; int ret = 0; - data.pages = (struct page **)__get_free_page(GFP_KERNEL | __GFP_ZERO); + data.pages = (struct page **)__get_free_page(gfp_mask | __GFP_ZERO); if (!data.pages) return -ENOMEM; while (nr_total) { nr_pages = min(nr_total, PAGE_SIZE / sizeof(data.pages[0])); - ret = ___alloc_pages_bulk(data.pages, nr_pages); + ret = ___alloc_pages_bulk(data.pages, nr_pages, gfp_mask); if (ret) break; data.start = start; + + /* + * page tables allocations ignore external gfp mask, enforce it + * by the scope API + */ + if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) + flags = memalloc_nofs_save(); + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) + flags = memalloc_noio_save(); + ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE, kasan_populate_vmalloc_pte, &data); + + if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) + memalloc_nofs_restore(flags); + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) + memalloc_noio_restore(flags); + ___free_pages_bulk(data.pages, 
nr_pages); if (ret) break; @@ -386,7 +403,7 @@ static int __kasan_populate_vmalloc(unsi return ret; } -int kasan_populate_vmalloc(unsigned long addr, unsigned long size) +int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask) { unsigned long shadow_start, shadow_end; int ret; @@ -415,7 +432,7 @@ int kasan_populate_vmalloc(unsigned long shadow_start = PAGE_ALIGN_DOWN(shadow_start); shadow_end = PAGE_ALIGN(shadow_end); - ret = __kasan_populate_vmalloc(shadow_start, shadow_end); + ret = __kasan_populate_vmalloc(shadow_start, shadow_end, gfp_mask); if (ret) return ret; --- a/mm/vmalloc.c~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc +++ a/mm/vmalloc.c @@ -2026,6 +2026,8 @@ static struct vmap_area *alloc_vmap_area if (unlikely(!vmap_initialized)) return ERR_PTR(-EBUSY); + /* Only reclaim behaviour flags are relevant. */ + gfp_mask = gfp_mask & GFP_RECLAIM_MASK; might_sleep(); /* @@ -2038,8 +2040,6 @@ static struct vmap_area *alloc_vmap_area */ va = node_alloc(size, align, vstart, vend, &addr, &vn_id); if (!va) { - gfp_mask = gfp_mask & GFP_RECLAIM_MASK; - va = kmem_cache_alloc_node(vmap_area_cachep, gfp_mask, node); if (unlikely(!va)) return ERR_PTR(-ENOMEM); @@ -2089,7 +2089,7 @@ retry: BUG_ON(va->va_start < vstart); BUG_ON(va->va_end > vend); - ret = kasan_populate_vmalloc(addr, size); + ret = kasan_populate_vmalloc(addr, size, gfp_mask); if (ret) { free_vmap_area(va); return ERR_PTR(ret); @@ -4826,7 +4826,7 @@ retry: /* populate the kasan shadow space */ for (area = 0; area < nr_vms; area++) { - if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area])) + if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area], GFP_KERNEL)) goto err_free_shadow; } _ Patches currently in -mm which might be from urezki(a)gmail.com are
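The two comparisons added around apply_to_page_range() decide which memalloc scope to enter from the FS and IO bits of the caller's mask. The following sketch isolates that decision. It is illustrative only: DEMO_GFP_IO and DEMO_GFP_FS use made-up bit values (the kernel's real __GFP_* values differ), and demo_pick_scope is a hypothetical helper that returns the chosen scope instead of entering it.

```c
/* Hypothetical bit values; the kernel's real __GFP_* values differ. */
#define DEMO_GFP_IO 0x1u
#define DEMO_GFP_FS 0x2u

enum demo_scope { DEMO_SCOPE_NONE, DEMO_SCOPE_NOFS, DEMO_SCOPE_NOIO };

/*
 * Pick the memalloc scope that matches a caller's mask, following the two
 * comparisons in the hunk:
 *   IO set, FS clear   -> GFP_NOFS-like caller -> NOFS scope
 *   IO and FS clear    -> GFP_NOIO-like caller -> NOIO scope
 * Any other combination needs no scope.
 */
static enum demo_scope demo_pick_scope(unsigned int gfp_mask)
{
	unsigned int bits = gfp_mask & (DEMO_GFP_FS | DEMO_GFP_IO);

	if (bits == DEMO_GFP_IO)
		return DEMO_SCOPE_NOFS;
	if (bits == 0)
		return DEMO_SCOPE_NOIO;
	return DEMO_SCOPE_NONE;
}
```

A GFP_KERNEL-like mask has both bits set and therefore selects no scope, which matches the percpu call site in the patch that passes GFP_KERNEL.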

6 days


[merged mm-hotfixes-stable] ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: ocfs2: fix recursive semaphore deadlock in fiemap call
has been removed from the -mm tree.  Its filename was
     ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Mark Tinguely <mark.tinguely(a)oracle.com>
Subject: ocfs2: fix recursive semaphore deadlock in fiemap call
Date: Fri, 29 Aug 2025 10:18:15 -0500

syzbot detected an OCFS2 hang due to a recursive semaphore on a
FS_IOC_FIEMAP of the extent list on a specially crafted mmap file.

context_switch kernel/sched/core.c:5357 [inline]
   __schedule+0x1798/0x4cc0 kernel/sched/core.c:6961
   __schedule_loop kernel/sched/core.c:7043 [inline]
   schedule+0x165/0x360 kernel/sched/core.c:7058
   schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7115
   rwsem_down_write_slowpath+0x872/0xfe0 kernel/locking/rwsem.c:1185
   __down_write_common kernel/locking/rwsem.c:1317 [inline]
   __down_write kernel/locking/rwsem.c:1326 [inline]
   down_write+0x1ab/0x1f0 kernel/locking/rwsem.c:1591
   ocfs2_page_mkwrite+0x2ff/0xc40 fs/ocfs2/mmap.c:142
   do_page_mkwrite+0x14d/0x310 mm/memory.c:3361
   wp_page_shared mm/memory.c:3762 [inline]
   do_wp_page+0x268d/0x5800 mm/memory.c:3981
   handle_pte_fault mm/memory.c:6068 [inline]
   __handle_mm_fault+0x1033/0x5440 mm/memory.c:6195
   handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364
   do_user_addr_fault+0x764/0x1390 arch/x86/mm/fault.c:1387
   handle_page_fault arch/x86/mm/fault.c:1476 [inline]
   exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532
   asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline]
RIP: 0010:raw_copy_to_user arch/x86/include/asm/uaccess_64.h:147 [inline]
RIP: 0010:_inline_copy_to_user include/linux/uaccess.h:197 [inline]
RIP: 0010:_copy_to_user+0x85/0xb0 lib/usercopy.c:26
Code: e8 00 bc f7 fc 4d 39 fc 72 3d 4d 39 ec 77 38 e8 91 b9 f7 fc 4c 89 f7 89 de e8 47 25 5b fd 0f 01 cb 4c 89 ff 48 89 d9 4c 89 f6 <f3> a4 0f 1f 00 48 89 cb 0f 01 ca 48 89 d8 5b 41 5c 41 5d 41 5e 41
RSP: 0018:ffffc9000403f950 EFLAGS: 00050256
RAX: ffffffff84c7f101 RBX: 0000000000000038 RCX: 0000000000000038
RDX: 0000000000000000 RSI: ffffc9000403f9e0 RDI: 0000200000000060
RBP: ffffc9000403fa90 R08: ffffc9000403fa17 R09: 1ffff92000807f42
R10: dffffc0000000000 R11: fffff52000807f43 R12: 0000200000000098
R13: 00007ffffffff000 R14: ffffc9000403f9e0 R15: 0000200000000060
   copy_to_user include/linux/uaccess.h:225 [inline]
   fiemap_fill_next_extent+0x1c0/0x390 fs/ioctl.c:145
   ocfs2_fiemap+0x888/0xc90 fs/ocfs2/extent_map.c:806
   ioctl_fiemap fs/ioctl.c:220 [inline]
   do_vfs_ioctl+0x1173/0x1430 fs/ioctl.c:532
   __do_sys_ioctl fs/ioctl.c:596 [inline]
   __se_sys_ioctl+0x82/0x170 fs/ioctl.c:584
   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
   do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f13850fd9
RSP: 002b:00007ffe3b3518b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000200000000000 RCX: 00007f5f13850fd9
RDX: 0000200000000040 RSI: 00000000c020660b RDI: 0000000000000004
RBP: 6165627472616568 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3b3518f0
R13: 00007ffe3b351b18 R14: 431bde82d7b634db R15: 00007f5f1389a03b

ocfs2_fiemap() takes a read lock of the ip_alloc_sem semaphore (since
v2.6.22-527-g7307de80510a) and calls fiemap_fill_next_extent() to read the
extent list of this running mmap executable.  The user-supplied buffer that
holds the fiemap information page faults, calling ocfs2_page_mkwrite(),
which will take a write lock (since v2.6.27-38-g00dc417fa3e7) of the same
semaphore.  This recursive semaphore acquisition holds filesystem locks and
causes a hang of the filesystem.

The ip_alloc_sem protects the inode extent list and size.  Release the read
semaphore before calling fiemap_fill_next_extent() in ocfs2_fiemap() and
ocfs2_fiemap_inline().  This does an unnecessary semaphore lock/unlock on
the last extent but simplifies the error path.

Link: https://lkml.kernel.org/r/61d1a62b-2631-4f12-81e2-cd689914360b@oracle.com
Fixes: 00dc417fa3e7 ("ocfs2: fiemap support")
Signed-off-by: Mark Tinguely <mark.tinguely(a)oracle.com>
Reported-by: syzbot+541dcc6ee768f77103e7(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=541dcc6ee768f77103e7
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/ocfs2/extent_map.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/fs/ocfs2/extent_map.c~ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call +++ a/fs/ocfs2/extent_map.c @@ -706,6 +706,8 @@ out: * it not only handles the fiemap for inlined files, but also deals * with the fast symlink, cause they have no difference for extent * mapping per se. + * + * Must be called with ip_alloc_sem semaphore held.
*/ static int ocfs2_fiemap_inline(struct inode *inode, struct buffer_head *di_bh, struct fiemap_extent_info *fieinfo, @@ -717,6 +719,7 @@ static int ocfs2_fiemap_inline(struct in u64 phys; u32 flags = FIEMAP_EXTENT_DATA_INLINE|FIEMAP_EXTENT_LAST; struct ocfs2_inode_info *oi = OCFS2_I(inode); + lockdep_assert_held_read(&oi->ip_alloc_sem); di = (struct ocfs2_dinode *)di_bh->b_data; if (ocfs2_inode_is_fast_symlink(inode)) @@ -732,8 +735,11 @@ static int ocfs2_fiemap_inline(struct in phys += offsetof(struct ocfs2_dinode, id2.i_data.id_data); + /* Release the ip_alloc_sem to prevent deadlock on page fault */ + up_read(&OCFS2_I(inode)->ip_alloc_sem); ret = fiemap_fill_next_extent(fieinfo, 0, phys, id_count, flags); + down_read(&OCFS2_I(inode)->ip_alloc_sem); if (ret < 0) return ret; } @@ -802,9 +808,11 @@ int ocfs2_fiemap(struct inode *inode, st len_bytes = (u64)le16_to_cpu(rec.e_leaf_clusters) << osb->s_clustersize_bits; phys_bytes = le64_to_cpu(rec.e_blkno) << osb->sb->s_blocksize_bits; virt_bytes = (u64)le32_to_cpu(rec.e_cpos) << osb->s_clustersize_bits; - + /* Release the ip_alloc_sem to prevent deadlock on page fault */ + up_read(&OCFS2_I(inode)->ip_alloc_sem); ret = fiemap_fill_next_extent(fieinfo, virt_bytes, phys_bytes, len_bytes, fe_flags); + down_read(&OCFS2_I(inode)->ip_alloc_sem); if (ret) break; _ Patches currently in -mm which might be from mark.tinguely(a)oracle.com are
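
The locking pattern in the ocfs2 fix can be illustrated with a small userspace model. This is only a sketch, not kernel code: every toy_* name is hypothetical, the semaphore is a plain counter rather than a real rwsem, and a failed trylock stands in for what would be a permanent block in the kernel.

```c
#include <stdbool.h>

/* Toy reader/writer semaphore: counts readers, tracks one writer. */
struct toy_rwsem {
	int readers;
	bool writer;
};

static void toy_down_read(struct toy_rwsem *s) { s->readers++; }
static void toy_up_read(struct toy_rwsem *s)   { s->readers--; }

/* A writer only gets in when nobody holds the semaphore. */
static bool toy_down_write_trylock(struct toy_rwsem *s)
{
	if (s->readers > 0 || s->writer)
		return false;
	s->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *s) { s->writer = false; }

/* Stands in for the page-fault path, which needs the write side. */
static bool toy_fault_handler(struct toy_rwsem *s)
{
	if (!toy_down_write_trylock(s))
		return false;	/* the real kernel would block here forever */
	toy_up_write(s);
	return true;
}

/* Pre-fix shape: the faulting copy runs with the read side held. */
static bool toy_fill_extent_locked(struct toy_rwsem *s)
{
	bool ok;

	toy_down_read(s);
	ok = toy_fault_handler(s);
	toy_up_read(s);
	return ok;
}

/* Post-fix shape: drop the read side around the faulting copy. */
static bool toy_fill_extent_unlocked(struct toy_rwsem *s)
{
	bool ok;

	toy_down_read(s);
	/* ... read the extent record under the lock ... */
	toy_up_read(s);
	ok = toy_fault_handler(s);
	toy_down_read(s);
	/* ... continue walking extents ... */
	toy_up_read(s);
	return ok;
}
```

In this model the locked variant always fails in the fault handler, while the unlocked variant succeeds and leaves the semaphore balanced.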

6 days


[merged mm-hotfixes-stable] mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
has been removed from the -mm tree. Its filename was
     mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
Date: Thu, 28 Aug 2025 10:46:18 +0800

While running memory failure tests, the following panic occurred:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flag of an uninitialized page, so VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced with the following steps:

1. Offline a memory block:
   echo offline > /sys/devices/system/memory/memory12/state
2. Get the pfn of the offlined memory:
   page-types -b n -rlN
3. Write the pfn to unpoison-pfn:
   echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

This scenario can be identified by pfn_to_online_page() returning NULL.
ZONE_DEVICE pages are never expected here, so simply fail if
pfn_to_online_page() == NULL to fix the bug.
Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memory-failure.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/memory-failure.c~mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory
+++ a/mm/memory-failure.c
@@ -2568,10 +2568,9 @@ int unpoison_memory(unsigned long pfn)
 	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
 					DEFAULT_RATELIMIT_BURST);
 
-	if (!pfn_valid(pfn))
-		return -ENXIO;
-
-	p = pfn_to_page(pfn);
+	p = pfn_to_online_page(pfn);
+	if (!p)
+		return -EIO;
 	folio = page_folio(p);
 
 	mutex_lock(&mf_mutex);
_

Patches currently in -mm which might be from linmiaohe(a)huawei.com are

revert-hugetlb-make-hugetlb-depends-on-sysfs-or-sysctl.patch
mm-hwpoison-decouple-hwpoison_filter-from-mm-memory-failurec.patch
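
The difference between "the pfn is valid" and "the pfn has an online page" can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel implementation: the toy_* names, the 16-pfn memmap and the 4-pfn block size are all made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PFNS		16
#define TOY_PFNS_PER_BLOCK	4

struct toy_page {
	bool hwpoison;
};

static struct toy_page toy_memmap[TOY_NR_PFNS];
static bool toy_block_online[TOY_NR_PFNS / TOY_PFNS_PER_BLOCK];

/* "Valid" only means a memmap entry exists for the pfn. */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PFNS;
}

/* Returns the page only if its memory block is currently online. */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	if (!toy_pfn_valid(pfn))
		return NULL;
	if (!toy_block_online[pfn / TOY_PFNS_PER_BLOCK])
		return NULL;
	return &toy_memmap[pfn];
}

/* Returns 0 on success, -1 if the pfn has no online page to inspect. */
static int toy_unpoison(unsigned long pfn)
{
	struct toy_page *p = toy_pfn_to_online_page(pfn);

	if (!p)
		return -1;
	p->hwpoison = false;
	return 0;
}
```

A pfn in an offlined block passes the validity check but is rejected by the online lookup, which is the case the patch guards against.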

6 days


[PATCH v5 1/2] drm/buddy: Optimize free block management with RB tree

by Arunpravin Paneer Selvam

Replace the freelist (O(n)) used for free block management with a
red-black tree, providing more efficient O(log n) search, insert,
and delete operations. This improves scalability and performance
when managing large numbers of free blocks per order (e.g., hundreds
or thousands).

In the VK-CTS memory stress subtest, the buddy manager merges
fragmented memory and inserts freed blocks into the freelist. Since
freelist insertion is O(n), this becomes a bottleneck as fragmentation
increases. Benchmarking shows list_insert_sorted() consumes ~52.69% CPU
with the freelist, compared to just 0.03% with the RB tree
(rbtree_insert.isra.0), despite performing the same sorted insert.

This also improves performance in heavily fragmented workloads,
such as games or graphics tests that stress memory.

v3(Matthew):
  - Remove RB_EMPTY_NODE check in force_merge function.
  - Rename rb for loop macros to have less generic names and move to
    .c file.
  - Make the rb node rb and link field as union.

v4(Jani Nikula):
  - The kernel-doc comment should be "/**"
  - Move all the rbtree macros to rbtree.h and add parens to ensure
    correct precedence.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam(a)amd.com> --- drivers/gpu/drm/drm_buddy.c | 142 ++++++++++++++++++++++-------------- include/drm/drm_buddy.h | 9 ++- include/linux/rbtree.h | 56 ++++++++++++++ 3 files changed, 152 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index a94061f373de..978cabfbcf0f 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -31,6 +31,8 @@ static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm, block->header |= order; block->parent = parent; + RB_CLEAR_NODE(&block->rb); + BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED); return block; } @@ -41,23 +43,53 @@ static void drm_block_free(struct drm_buddy *mm, kmem_cache_free(slab_blocks, block); } -static void list_insert_sorted(struct drm_buddy *mm, - struct drm_buddy_block *block) +static void rbtree_insert(struct drm_buddy *mm, + struct drm_buddy_block *block) { + struct rb_root *root = &mm->free_tree[drm_buddy_block_order(block)]; + struct rb_node **link = &root->rb_node; + struct rb_node *parent = NULL; struct drm_buddy_block *node; - struct list_head *head; + u64 offset; + + offset = drm_buddy_block_offset(block); - head = &mm->free_list[drm_buddy_block_order(block)]; - if (list_empty(head)) { - list_add(&block->link, head); - return; + while (*link) { + parent = *link; + node = rb_entry(parent, struct drm_buddy_block, rb); + + if (offset < drm_buddy_block_offset(node)) + link = &parent->rb_left; + else + link = &parent->rb_right; } - list_for_each_entry(node, head, link) - if (drm_buddy_block_offset(block) < drm_buddy_block_offset(node)) - break; + rb_link_node(&block->rb, parent, link); + rb_insert_color(&block->rb, root); +} + +static void rbtree_remove(struct drm_buddy *mm, + struct drm_buddy_block *block) +{ + struct rb_root *root; + + root = &mm->free_tree[drm_buddy_block_order(block)]; + rb_erase(&block->rb, root); - __list_add(&block->link, node->link.prev, 
&node->link); + RB_CLEAR_NODE(&block->rb); +} + +static inline struct drm_buddy_block * +rbtree_last_entry(struct drm_buddy *mm, unsigned int order) +{ + struct rb_node *node = rb_last(&mm->free_tree[order]); + + return node ? rb_entry(node, struct drm_buddy_block, rb) : NULL; +} + +static bool rbtree_is_empty(struct drm_buddy *mm, unsigned int order) +{ + return RB_EMPTY_ROOT(&mm->free_tree[order]); } static void clear_reset(struct drm_buddy_block *block) @@ -70,12 +102,13 @@ static void mark_cleared(struct drm_buddy_block *block) block->header |= DRM_BUDDY_HEADER_CLEAR; } -static void mark_allocated(struct drm_buddy_block *block) +static void mark_allocated(struct drm_buddy *mm, + struct drm_buddy_block *block) { block->header &= ~DRM_BUDDY_HEADER_STATE; block->header |= DRM_BUDDY_ALLOCATED; - list_del(&block->link); + rbtree_remove(mm, block); } static void mark_free(struct drm_buddy *mm, @@ -84,15 +117,16 @@ static void mark_free(struct drm_buddy *mm, block->header &= ~DRM_BUDDY_HEADER_STATE; block->header |= DRM_BUDDY_FREE; - list_insert_sorted(mm, block); + rbtree_insert(mm, block); } -static void mark_split(struct drm_buddy_block *block) +static void mark_split(struct drm_buddy *mm, + struct drm_buddy_block *block) { block->header &= ~DRM_BUDDY_HEADER_STATE; block->header |= DRM_BUDDY_SPLIT; - list_del(&block->link); + rbtree_remove(mm, block); } static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2) @@ -148,7 +182,7 @@ static unsigned int __drm_buddy_free(struct drm_buddy *mm, mark_cleared(parent); } - list_del(&buddy->link); + rbtree_remove(mm, buddy); if (force_merge && drm_buddy_block_is_clear(buddy)) mm->clear_avail -= drm_buddy_block_size(mm, buddy); @@ -179,9 +213,11 @@ static int __force_merge(struct drm_buddy *mm, return -EINVAL; for (i = min_order - 1; i >= 0; i--) { - struct drm_buddy_block *block, *prev; + struct drm_buddy_block *block, *prev_block, *first_block; + + first_block = rb_entry(rb_first(&mm->free_tree[i]), struct drm_buddy_block, 
rb); - list_for_each_entry_safe_reverse(block, prev, &mm->free_list[i], link) { + rbtree_reverse_for_each_entry_safe(block, prev_block, &mm->free_tree[i], rb) { struct drm_buddy_block *buddy; u64 block_start, block_end; @@ -206,10 +242,14 @@ static int __force_merge(struct drm_buddy *mm, * block in the next iteration as we would free the * buddy block as part of the free function. */ - if (prev == buddy) - prev = list_prev_entry(prev, link); + if (prev_block && prev_block == buddy) { + if (prev_block != first_block) + prev_block = rb_entry(rb_prev(&prev_block->rb), + struct drm_buddy_block, + rb); + } - list_del(&block->link); + rbtree_remove(mm, block); if (drm_buddy_block_is_clear(block)) mm->clear_avail -= drm_buddy_block_size(mm, block); @@ -258,14 +298,14 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size) BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER); - mm->free_list = kmalloc_array(mm->max_order + 1, - sizeof(struct list_head), + mm->free_tree = kmalloc_array(mm->max_order + 1, + sizeof(struct rb_root), GFP_KERNEL); - if (!mm->free_list) + if (!mm->free_tree) return -ENOMEM; for (i = 0; i <= mm->max_order; ++i) - INIT_LIST_HEAD(&mm->free_list[i]); + mm->free_tree[i] = RB_ROOT; mm->n_roots = hweight64(size); @@ -273,7 +313,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size) sizeof(struct drm_buddy_block *), GFP_KERNEL); if (!mm->roots) - goto out_free_list; + goto out_free_tree; offset = 0; i = 0; @@ -312,8 +352,8 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size) while (i--) drm_block_free(mm, mm->roots[i]); kfree(mm->roots); -out_free_list: - kfree(mm->free_list); +out_free_tree: + kfree(mm->free_tree); return -ENOMEM; } EXPORT_SYMBOL(drm_buddy_init); @@ -323,7 +363,7 @@ EXPORT_SYMBOL(drm_buddy_init); * * @mm: DRM buddy manager to free * - * Cleanup memory manager resources and the freelist + * Cleanup memory manager resources and the freetree */ void drm_buddy_fini(struct drm_buddy *mm) { @@ -350,7 
+390,7 @@ void drm_buddy_fini(struct drm_buddy *mm) WARN_ON(mm->avail != mm->size); kfree(mm->roots); - kfree(mm->free_list); + kfree(mm->free_tree); } EXPORT_SYMBOL(drm_buddy_fini); @@ -383,7 +423,7 @@ static int split_block(struct drm_buddy *mm, clear_reset(block); } - mark_split(block); + mark_split(mm, block); return 0; } @@ -412,7 +452,7 @@ EXPORT_SYMBOL(drm_get_buddy); * @is_clear: blocks clear state * * Reset the clear state based on @is_clear value for each block - * in the freelist. + * in the freetree. */ void drm_buddy_reset_clear(struct drm_buddy *mm, bool is_clear) { @@ -433,7 +473,7 @@ void drm_buddy_reset_clear(struct drm_buddy *mm, bool is_clear) for (i = 0; i <= mm->max_order; ++i) { struct drm_buddy_block *block; - list_for_each_entry_reverse(block, &mm->free_list[i], link) { + rbtree_reverse_for_each_entry(block, &mm->free_tree[i], rb) { if (is_clear != drm_buddy_block_is_clear(block)) { if (is_clear) { mark_cleared(block); @@ -641,7 +681,7 @@ get_maxblock(struct drm_buddy *mm, unsigned int order, for (i = order; i <= mm->max_order; ++i) { struct drm_buddy_block *tmp_block; - list_for_each_entry_reverse(tmp_block, &mm->free_list[i], link) { + rbtree_reverse_for_each_entry(tmp_block, &mm->free_tree[i], rb) { if (block_incompatible(tmp_block, flags)) continue; @@ -667,7 +707,7 @@ get_maxblock(struct drm_buddy *mm, unsigned int order, } static struct drm_buddy_block * -alloc_from_freelist(struct drm_buddy *mm, +alloc_from_freetree(struct drm_buddy *mm, unsigned int order, unsigned long flags) { @@ -684,7 +724,7 @@ alloc_from_freelist(struct drm_buddy *mm, for (tmp = order; tmp <= mm->max_order; ++tmp) { struct drm_buddy_block *tmp_block; - list_for_each_entry_reverse(tmp_block, &mm->free_list[tmp], link) { + rbtree_reverse_for_each_entry(tmp_block, &mm->free_tree[tmp], rb) { if (block_incompatible(tmp_block, flags)) continue; @@ -700,10 +740,8 @@ alloc_from_freelist(struct drm_buddy *mm, if (!block) { /* Fallback method */ for (tmp = order; tmp <= 
mm->max_order; ++tmp) { - if (!list_empty(&mm->free_list[tmp])) { - block = list_last_entry(&mm->free_list[tmp], - struct drm_buddy_block, - link); + if (!rbtree_is_empty(mm, tmp)) { + block = rbtree_last_entry(mm, tmp); if (block) break; } @@ -771,7 +809,7 @@ static int __alloc_range(struct drm_buddy *mm, if (contains(start, end, block_start, block_end)) { if (drm_buddy_block_is_free(block)) { - mark_allocated(block); + mark_allocated(mm, block); total_allocated += drm_buddy_block_size(mm, block); mm->avail -= drm_buddy_block_size(mm, block); if (drm_buddy_block_is_clear(block)) @@ -849,7 +887,6 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, { u64 rhs_offset, lhs_offset, lhs_size, filled; struct drm_buddy_block *block; - struct list_head *list; LIST_HEAD(blocks_lhs); unsigned long pages; unsigned int order; @@ -862,11 +899,10 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, if (order == 0) return -ENOSPC; - list = &mm->free_list[order]; - if (list_empty(list)) + if (rbtree_is_empty(mm, order)) return -ENOSPC; - list_for_each_entry_reverse(block, list, link) { + rbtree_reverse_for_each_entry(block, &mm->free_tree[order], rb) { /* Allocate blocks traversing RHS */ rhs_offset = drm_buddy_block_offset(block); err = __drm_buddy_alloc_range(mm, rhs_offset, size, @@ -976,7 +1012,7 @@ int drm_buddy_block_trim(struct drm_buddy *mm, list_add(&block->tmp_link, &dfs); err = __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL); if (err) { - mark_allocated(block); + mark_allocated(mm, block); mm->avail -= drm_buddy_block_size(mm, block); if (drm_buddy_block_is_clear(block)) mm->clear_avail -= drm_buddy_block_size(mm, block); @@ -999,8 +1035,8 @@ __drm_buddy_alloc_blocks(struct drm_buddy *mm, return __drm_buddy_alloc_range_bias(mm, start, end, order, flags); else - /* Allocate from freelist */ - return alloc_from_freelist(mm, order, flags); + /* Allocate from freetree */ + return alloc_from_freetree(mm, order, flags); } /** @@ -1017,8 +1053,8 @@ 
__drm_buddy_alloc_blocks(struct drm_buddy *mm, * alloc_range_bias() called on range limitations, which traverses * the tree and returns the desired block. * - * alloc_from_freelist() called when *no* range restrictions - * are enforced, which picks the block from the freelist. + * alloc_from_freetree() called when *no* range restrictions + * are enforced, which picks the block from the freetree. * * Returns: * 0 on success, error code on failure. @@ -1120,7 +1156,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, } } while (1); - mark_allocated(block); + mark_allocated(mm, block); mm->avail -= drm_buddy_block_size(mm, block); if (drm_buddy_block_is_clear(block)) mm->clear_avail -= drm_buddy_block_size(mm, block); @@ -1204,7 +1240,7 @@ void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p) struct drm_buddy_block *block; u64 count = 0, free; - list_for_each_entry(block, &mm->free_list[order], link) { + rbtree_for_each_entry(block, &mm->free_tree[order], rb) { BUG_ON(!drm_buddy_block_is_free(block)); count++; } diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index 513837632b7d..091823592034 100644 --- a/include/drm/drm_buddy.h +++ b/include/drm/drm_buddy.h @@ -10,6 +10,7 @@ #include <linux/list.h> #include <linux/slab.h> #include <linux/sched.h> +#include <linux/rbtree.h> #include <drm/drm_print.h> @@ -53,7 +54,11 @@ struct drm_buddy_block { * a list, if so desired. As soon as the block is freed with * drm_buddy_free* ownership is given back to the mm. */ - struct list_head link; + union { + struct rb_node rb; + struct list_head link; + }; + struct list_head tmp_link; }; @@ -68,7 +73,7 @@ struct drm_buddy_block { */ struct drm_buddy { /* Maintain a free list for each order. 
*/ - struct list_head *free_list; + struct rb_root *free_tree; /* * Maintain explicit binary tree(s) to track the allocation of the diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h index 8d2ba3749866..17190bb4837c 100644 --- a/include/linux/rbtree.h +++ b/include/linux/rbtree.h @@ -79,6 +79,62 @@ static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent ____ptr ? rb_entry(____ptr, type, member) : NULL; \ }) +/** + * rbtree_for_each_entry - iterate in-order over rb_root of given type + * + * @pos: the 'type *' to use as a loop cursor. + * @root: 'rb_root *' of the rbtree. + * @member: the name of the rb_node field within 'type'. + */ +#define rbtree_for_each_entry(pos, root, member) \ + for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member); \ + (pos); \ + (pos) = rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member)) + +/** + * rbtree_reverse_for_each_entry - iterate in reverse in-order over rb_root + * of given type + * + * @pos: the 'type *' to use as a loop cursor. + * @root: 'rb_root *' of the rbtree. + * @member: the name of the rb_node field within 'type'. + */ +#define rbtree_reverse_for_each_entry(pos, root, member) \ + for ((pos) = rb_entry_safe(rb_last(root), typeof(*(pos)), member); \ + (pos); \ + (pos) = rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member)) + +/** + * rbtree_for_each_entry_safe - iterate in-order over rb_root safe against removal + * + * @pos: the 'type *' to use as a loop cursor + * @n: another 'type *' to use as temporary storage + * @root: 'rb_root *' of the rbtree + * @member: the name of the rb_node field within 'type' + */ +#define rbtree_for_each_entry_safe(pos, n, root, member) \ + for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \ + (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL; \ + (pos); \ + (pos) = (n), \ + (n) = (pos) ? 
rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL) + +/** + * rbtree_reverse_for_each_entry_safe - iterate in reverse in-order over rb_root + * safe against removal + * + * @pos: the struct type * to use as a loop cursor. + * @n: another struct type * to use as temporary storage. + * @root: pointer to struct rb_root to iterate. + * @member: name of the rb_node field within the struct. + */ +#define rbtree_reverse_for_each_entry_safe(pos, n, root, member) \ + for ((pos) = rb_entry_safe(rb_last(root), typeof(*(pos)), member), \ + (n) = (pos) ? rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member) : NULL; \ + (pos); \ + (pos) = (n), \ + (n) = (pos) ? rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member) : NULL) + /** * rbtree_postorder_for_each_entry_safe - iterate in post-order over rb_root of * given type allowing the backing memory of @pos to be invalidated base-commit: f4c75f975cf50fa2e1fd96c5aafe5aa62e55fbe4 -- 2.34.1
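
The descent loop in rbtree_insert() above can be sketched in plain userspace C. This is a deliberately simplified model, assuming an unbalanced binary search tree: it shows only the pointer-to-link descent keyed on block offset and omits the rebalancing that rb_insert_color() performs, so it does not have red-black guarantees. The toy_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_node {
	uint64_t offset;
	struct toy_node *left, *right;
};

/* Descend with a pointer-to-link, as rbtree_insert() does, then attach. */
static void toy_insert(struct toy_node **root, struct toy_node *block)
{
	struct toy_node **link = root;

	while (*link) {
		if (block->offset < (*link)->offset)
			link = &(*link)->left;
		else
			link = &(*link)->right;
	}

	block->left = block->right = NULL;
	*link = block;
	/* A red-black tree would rebalance and recolor at this point. */
}

/* Rightmost node, i.e. highest offset: the role rb_last() plays above. */
static struct toy_node *toy_last(struct toy_node *root)
{
	if (!root)
		return NULL;
	while (root->right)
		root = root->right;
	return root;
}
```

Each insert follows a single root-to-leaf path instead of scanning every free block, which is where the O(log n) bound comes from once the tree is kept balanced.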

6 days, 1 hour


[PATCH v2] media: staging/ipu7: fix isys device runtime PM usage in firmware closing

by bingbu.cao＠intel.com

From: Bingbu Cao <bingbu.cao(a)intel.com>

The PM usage counter of isys was bumped up when starting a camera stream
(opening firmware) but was not dropped after the stream stopped (closing
firmware). As a result, the system fails to suspend because of the wrong
PM state of ISYS. Drop the PM usage counter in firmware close to fix it.

Cc: Stable(a)vger.kernel.org
Fixes: a516d36bdc3d ("media: staging/ipu7: add IPU7 input system device driver")
Signed-off-by: Bingbu Cao <bingbu.cao(a)intel.com>
---
 drivers/staging/media/ipu7/ipu7-isys-video.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/media/ipu7/ipu7-isys-video.c b/drivers/staging/media/ipu7/ipu7-isys-video.c
index 8756da3a8fb0..173afd405d9b 100644
--- a/drivers/staging/media/ipu7/ipu7-isys-video.c
+++ b/drivers/staging/media/ipu7/ipu7-isys-video.c
@@ -946,6 +946,7 @@ void ipu7_isys_fw_close(struct ipu7_isys *isys)
 	ipu7_fw_isys_close(isys);
 
 	mutex_unlock(&isys->mutex);
+	pm_runtime_put(&isys->adev->auxdev.dev);
 }
 
 int ipu7_isys_setup_video(struct ipu7_isys_video *av,
-- 
2.34.1
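
The get/put imbalance described in this patch can be modeled with a bare counter. This is a userspace sketch only: the toy_* names are hypothetical and the real runtime PM core does far more than count, but the pairing rule is the same.

```c
struct toy_dev {
	int usage_count;
};

static void toy_pm_get(struct toy_dev *d) { d->usage_count++; }
static void toy_pm_put(struct toy_dev *d) { d->usage_count--; }

/* The device may only suspend once every get has been paired with a put. */
static int toy_can_suspend(const struct toy_dev *d)
{
	return d->usage_count == 0;
}

/* Opening firmware takes a usage reference. */
static void toy_fw_open(struct toy_dev *d) { toy_pm_get(d); }

/* Pre-fix shape: close forgets to drop the reference. */
static void toy_fw_close_buggy(struct toy_dev *d) { (void)d; }

/* Post-fix shape: close drops the reference taken at open. */
static void toy_fw_close_fixed(struct toy_dev *d) { toy_pm_put(d); }
```

After one open/close cycle the buggy variant leaves the counter at 1, so the suspend check never passes again; the fixed variant returns it to 0.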

6 days, 1 hour


[PATCH v2 0/2] media: az6007: overall refactor to fix bugs

by Jeongjun Park

This patch series refactors the az6007 driver to address the root causes
of bugs that have persisted for some time.

Jeongjun Park (2):
  media: az6007: fix out-of-bounds in az6007_i2c_xfer()
  media: az6007: refactor to properly use dvb-usb-v2

 drivers/media/usb/dvb-usb-v2/az6007.c | 211 ++++++++++----------
 1 file changed, 107 insertions(+), 104 deletions(-)

6 days, 1 hour


[PATCH] media: staging/ipu7: fix isys device runtime PM usage in firmware closing

by bingbu.cao＠intel.com

From: Bingbu Cao <bingbu.cao(a)intel.com>

The PM usage counter of isys was bumped up when starting a camera stream
(opening firmware) but was not dropped after the stream stopped (closing
firmware). As a result, the system fails to suspend because of the wrong
PM state of ISYS. Drop the PM usage counter in firmware close to fix it.

Cc: Stable(a)vger.kernel.org
Signed-off-by: Bingbu Cao <bingbu.cao(a)intel.com>
---
 drivers/staging/media/ipu7/ipu7-isys-video.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/media/ipu7/ipu7-isys-video.c b/drivers/staging/media/ipu7/ipu7-isys-video.c
index 8756da3a8fb0..173afd405d9b 100644
--- a/drivers/staging/media/ipu7/ipu7-isys-video.c
+++ b/drivers/staging/media/ipu7/ipu7-isys-video.c
@@ -946,6 +946,7 @@ void ipu7_isys_fw_close(struct ipu7_isys *isys)
 	ipu7_fw_isys_close(isys);
 
 	mutex_unlock(&isys->mutex);
+	pm_runtime_put(&isys->adev->auxdev.dev);
 }
 
 int ipu7_isys_setup_video(struct ipu7_isys_video *av,
-- 
2.34.1

6 days, 2 hours


[PATCH] zram: fix slot write race condition

by Sergey Senozhatsky

Concurrent writes to the same zram index result in leaked zsmalloc
handles. Schematically we can have something like this:

CPU0                              CPU1
zram_slot_lock()
zs_free(handle)
zram_slot_unlock()
                                  zram_slot_lock()
                                  zs_free(handle)
                                  zram_slot_unlock()

compress                          compress
handle = zs_malloc()              handle = zs_malloc()
zram_slot_lock()
zram_set_handle(handle)
zram_slot_unlock()
                                  zram_slot_lock()
                                  zram_set_handle(handle)
                                  zram_slot_unlock()

Either the CPU0 or the CPU1 zsmalloc handle will leak because zs_free()
is done too early. In fact, we need to reset the zram entry right before
we set its new handle, all under the same slot lock scope.

Cc: stable(a)vger.kernel.org
Reported-by: Changhui Zhong <czhong(a)redhat.com>
Closes: https://lore.kernel.org/all/CAGVVp+UtpGoW5WEdEU7uVTtsSCjPN=ksN6EcvyypAtFDOU…
Fixes: 71268035f5d73 ("zram: free slot memory early during write")
Signed-off-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
---
 drivers/block/zram/zram_drv.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9ac271b82780..dc4a1cdfaf98 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1788,6 +1788,7 @@ static int write_same_filled_page(struct zram *zram, unsigned long fill,
 				  u32 index)
 {
 	zram_slot_lock(zram, index);
+	zram_free_page(zram, index);
 	zram_set_flag(zram, index, ZRAM_SAME);
 	zram_set_handle(zram, index, fill);
 	zram_slot_unlock(zram, index);
@@ -1820,11 +1821,13 @@ static int write_incompressible_page(struct zram *zram, struct page *page,
 		return -ENOMEM;
 	}
 
+	zram_slot_lock(zram, index);
+	zram_free_page(zram, index);
+
 	src = kmap_local_page(page);
 	zs_obj_write(zram->mem_pool, handle, src, PAGE_SIZE);
 	kunmap_local(src);
 
-	zram_slot_lock(zram, index);
 	zram_set_flag(zram, index, ZRAM_HUGE);
 	zram_set_handle(zram, index, handle);
 	zram_set_obj_size(zram, index, PAGE_SIZE);
@@ -1848,11 +1851,6 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	unsigned long element;
 	bool same_filled;
 
-	/* First, free memory allocated to this slot (if any) */
-	zram_slot_lock(zram, index);
-	zram_free_page(zram, index);
-	zram_slot_unlock(zram, index);
-
 	mem = kmap_local_page(page);
 	same_filled = page_same_filled(mem, &element);
 	kunmap_local(mem);
@@ -1890,10 +1888,11 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 		return -ENOMEM;
 	}
 
+	zram_slot_lock(zram, index);
+	zram_free_page(zram, index);
 	zs_obj_write(zram->mem_pool, handle, zstrm->buffer, comp_len);
 	zcomp_stream_put(zstrm);
 
-	zram_slot_lock(zram, index);
 	zram_set_handle(zram, index, handle);
 	zram_set_obj_size(zram, index, comp_len);
 	zram_slot_unlock(zram, index);
-- 
2.51.0.384.g4c02a37b29-goog
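
The interleaving in the diagram can be replayed deterministically in a single thread. This is an illustrative sketch, not the driver: the toy_* names are hypothetical, a slot holds at most one object, and each comment marks which CPU's lock scope the call stands for.

```c
static int toy_live_objects;	/* objects allocated and not yet freed */

struct toy_slot {
	int has_handle;
};

static void toy_alloc(void) { toy_live_objects++; }

/* Free whatever the slot currently points at, if anything. */
static void toy_slot_free(struct toy_slot *s)
{
	if (s->has_handle) {
		s->has_handle = 0;
		toy_live_objects--;
	}
}

/* Point the slot at a new object; any previous pointer is overwritten. */
static void toy_slot_set(struct toy_slot *s) { s->has_handle = 1; }

/* Old ordering: free early, set later, in separate lock scopes. */
static int toy_race_old(void)
{
	struct toy_slot s = { 0 };

	toy_live_objects = 0;
	toy_slot_free(&s);	/* CPU0: lock, free, unlock */
	toy_slot_free(&s);	/* CPU1: lock, free, unlock */
	toy_alloc();		/* CPU0: allocate new object */
	toy_alloc();		/* CPU1: allocate new object */
	toy_slot_set(&s);	/* CPU0: lock, set, unlock */
	toy_slot_set(&s);	/* CPU1: overwrites CPU0's handle */
	return toy_live_objects - s.has_handle;	/* leaked objects */
}

/* New ordering: free and set inside the same lock scope. */
static int toy_race_new(void)
{
	struct toy_slot s = { 0 };

	toy_live_objects = 0;
	toy_alloc();		/* CPU0: allocate new object */
	toy_alloc();		/* CPU1: allocate new object */
	toy_slot_free(&s);	/* CPU0: lock, free old ... */
	toy_slot_set(&s);	/* ... set new, unlock */
	toy_slot_free(&s);	/* CPU1: lock, free CPU0's object ... */
	toy_slot_set(&s);	/* ... set new, unlock */
	return toy_live_objects - s.has_handle;	/* leaked objects */
}
```

With the old ordering one object is left with no slot pointing at it; with the new ordering the second writer frees the first writer's object before replacing it.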

6 days, 2 hours


[PATCH 0/3] samples/damon: fix boot time enable handling fixup merge mistakes

by SeongJae Park

The first three patches of the patch series "mm/damon: fix misc bugs in
DAMON modules" [1] tried to fix boot-time enabling issues in the DAMON
sample modules by avoiding starting DAMON before the module
initialization phase. However, probably because of a mistake during a
merge, only half of the change was merged, and the part that avoids
starting DAMON before the module is initialized is missing. So the
problem is not solved. Fix those.

Note that the broken commits were merged into 6.17-rc1 and also
backported to the relevant stable kernels. So this series also needs to
be merged into the stable kernels. Hence Cc-ing stable@.

[1] https://lore.kernel.org/20250706193207.39810-1-sj@kernel.org

SeongJae Park (3):
  samples/damon/wsse: avoid starting DAMON before initialization
  samples/damon/prcl: avoid starting DAMON before initialization
  samples/damon/mtier: avoid starting DAMON before initialization

 samples/damon/mtier.c | 3 +++
 samples/damon/prcl.c  | 3 +++
 samples/damon/wsse.c  | 3 +++
 3 files changed, 9 insertions(+)

base-commit: 186951910f4e44e20738d85c0421032634ddb298
-- 
2.39.5

6 days, 3 hours


+ nilfs2-fix-cfi-failure-when-accessing-sys-fs-nilfs2-features.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: nilfs2: fix CFI failure when accessing /sys/fs/nilfs2/features/*
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     nilfs2-fix-cfi-failure-when-accessing-sys-fs-nilfs2-features.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Nathan Chancellor <nathan(a)kernel.org>
Subject: nilfs2: fix CFI failure when accessing /sys/fs/nilfs2/features/*
Date: Sat, 6 Sep 2025 23:43:34 +0900

When accessing one of the files under /sys/fs/nilfs2/features when
CONFIG_CFI_CLANG is enabled, there is a CFI violation:

  CFI failure at kobj_attr_show+0x59/0x80 (target: nilfs_feature_revision_show+0x0/0x30; expected type: 0xfc392c4d)
  ...
  Call Trace:
   <TASK>
   sysfs_kf_seq_show+0x2a6/0x390
   ? __cfi_kobj_attr_show+0x10/0x10
   kernfs_seq_show+0x104/0x15b
   seq_read_iter+0x580/0xe2b
  ...

When the kobject of the kset for /sys/fs/nilfs2 is initialized, its
ktype is set to kset_ktype, which has a ->sysfs_ops of kobj_sysfs_ops.
When nilfs_feature_attr_group is added to that kobject via
sysfs_create_group(), the kernfs_ops of each files is
sysfs_file_kfops_rw, which will call sysfs_kf_seq_show() when
->seq_show() is called.  sysfs_kf_seq_show() in turn calls
kobj_attr_show() through ->sysfs_ops->show().  kobj_attr_show() casts
the provided attribute out to a 'struct kobj_attribute' via
container_of() and calls ->show(), resulting in the CFI violation since
neither nilfs_feature_revision_show() nor nilfs_feature_README_show()
match the prototype of ->show() in 'struct kobj_attribute'.

Resolve the CFI violation by adjusting the second parameter in
nilfs_feature_{revision,README}_show() from 'struct attribute' to
'struct kobj_attribute' to match the expected prototype.

Link: https://lkml.kernel.org/r/20250906144410.22511-1-konishi.ryusuke@gmail.com
Fixes: aebe17f68444 ("nilfs2: add /sys/fs/nilfs2/features group")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202509021646.bc78d9ef-lkp@intel.com/
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/nilfs2/sysfs.c |    4 ++--
 fs/nilfs2/sysfs.h |    8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

--- a/fs/nilfs2/sysfs.c~nilfs2-fix-cfi-failure-when-accessing-sys-fs-nilfs2-features
+++ a/fs/nilfs2/sysfs.c
@@ -1075,7 +1075,7 @@ void nilfs_sysfs_delete_device_group(str
  ************************************************************************/
 
 static ssize_t nilfs_feature_revision_show(struct kobject *kobj,
-					    struct attribute *attr, char *buf)
+					    struct kobj_attribute *attr, char *buf)
 {
 	return sysfs_emit(buf, "%d.%d\n",
 			NILFS_CURRENT_REV, NILFS_MINOR_REV);
@@ -1087,7 +1087,7 @@ static const char features_readme_str[]
 	"(1) revision\n\tshow current revision of NILFS file system driver.\n";
 
 static ssize_t nilfs_feature_README_show(struct kobject *kobj,
-					  struct attribute *attr,
+					  struct kobj_attribute *attr,
 					  char *buf)
 {
 	return sysfs_emit(buf, features_readme_str);
--- a/fs/nilfs2/sysfs.h~nilfs2-fix-cfi-failure-when-accessing-sys-fs-nilfs2-features
+++ a/fs/nilfs2/sysfs.h
@@ -50,16 +50,16 @@ struct nilfs_sysfs_dev_subgroups {
 	struct completion sg_segments_kobj_unregister;
 };
 
-#define NILFS_COMMON_ATTR_STRUCT(name) \
+#define NILFS_KOBJ_ATTR_STRUCT(name) \
 struct nilfs_##name##_attr { \
 	struct attribute attr; \
-	ssize_t (*show)(struct kobject *, struct attribute *, \
+	ssize_t (*show)(struct kobject *, struct kobj_attribute *, \
 			char *); \
-	ssize_t (*store)(struct kobject *, struct attribute *, \
+	ssize_t (*store)(struct kobject *, struct kobj_attribute *, \
 			 const char *, size_t); \
 }
 
-NILFS_COMMON_ATTR_STRUCT(feature);
+NILFS_KOBJ_ATTR_STRUCT(feature);
 
 #define NILFS_DEV_ATTR_STRUCT(name) \
 struct nilfs_##name##_attr { \
_

Patches currently in -mm which might be from nathan(a)kernel.org are

compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch
nilfs2-fix-cfi-failure-when-accessing-sys-fs-nilfs2-features.patch
mm-rmap-convert-enum-rmap_level-to-enum-pgtable_level-fix.patch
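The dispatch path described in the commit message can be modelled in userspace. The following is only a minimal sketch: the three structs are simplified stand-ins for the kernel types of the same name, `int` replaces `ssize_t`, and `demo_attr_show()`, `demo_revision_show()`, `demo_revision_attr` and the `DEMO_*` revision constants are hypothetical names chosen for illustration. It shows the shape the patch restores: a callback whose second parameter is `struct kobj_attribute *`, so that the indirect call made after `container_of()` goes through a pointer of the matching type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified userspace stand-ins; the real kernel structs carry more fields. */
struct kobject { const char *name; };
struct attribute { const char *name; };

struct kobj_attribute {
	struct attribute attr;
	int (*show)(struct kobject *kobj, struct kobj_attribute *attr, char *buf);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder revision numbers, assumed for this sketch only. */
#define DEMO_CURRENT_REV 2
#define DEMO_MINOR_REV   0

/* Callback whose prototype matches ->show() in struct kobj_attribute. */
static int demo_revision_show(struct kobject *kobj,
			      struct kobj_attribute *attr, char *buf)
{
	(void)kobj;
	(void)attr;
	return sprintf(buf, "%d.%d\n", DEMO_CURRENT_REV, DEMO_MINOR_REV);
}

/* Generic dispatcher: recover the wrapper from the embedded attribute,
 * then make the indirect call through the typed ->show() pointer. */
static int demo_attr_show(struct kobject *kobj, struct attribute *attr, char *buf)
{
	struct kobj_attribute *kattr;

	assert(attr != NULL);
	kattr = container_of(attr, struct kobj_attribute, attr);
	if (!kattr->show)
		return -1;
	return kattr->show(kobj, kattr, buf);
}

static struct kobj_attribute demo_revision_attr = {
	.attr = { .name = "revision" },
	.show = demo_revision_show,
};
```

Calling `demo_attr_show(&kobj, &demo_revision_attr.attr, buf)` reaches `demo_revision_show()` through a pointer of the same function type. A callback declared with `struct attribute *` as its second parameter would have a different function type, which is the kind of mismatch a CFI check on the indirect call is designed to catch.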

6 days, 3 hours


[PATCH] KVM: x86: Latch INITs only in specific CPU states in KVM_SET_VCPU_EVENTS

by Fei Li

Commit ff90afa75573 ("KVM: x86: Evaluate latched_init in
KVM_SET_VCPU_EVENTS when vCPU not in SMM") changed the
KVM_SET_VCPU_EVENTS handler to set the pending LAPIC INIT event
regardless of whether the vCPU is in SMM or not.

However, latching INIT without checking the CPU state introduces a race
condition that can lose the INIT event. This is fatal during VM startup
because some APs will then never switch to non-root mode. The race is
the one described in commit f4ef19108608 ("KVM: X86: Fix loss of pending
INIT due to race"):

      BSP                          AP
                     kvm_vcpu_ioctl_x86_get_vcpu_events
                       events->smi.latched_init = 0

                     kvm_vcpu_block
                       kvm_vcpu_check_block
                         schedule

send INIT to AP
                     kvm_vcpu_ioctl_x86_set_vcpu_events
                     (e.g. `info registers -a` when VM starts/reboots)
                       if (events->smi.latched_init == 0)
                         clear INIT in pending_events

                     kvm_apic_accept_events
                       test_bit(KVM_APIC_INIT, &pe) == false
                         vcpu->arch.mp_state maintains UNINITIALIZED

send SIPI to AP
                     kvm_apic_accept_events
                       test_bit(KVM_APIC_SIPI, &pe) == false
                         vcpu->arch.mp_state will never change to RUNNABLE
                         (defy: UNINITIALIZED => INIT_RECEIVED => RUNNABLE)
                           AP will never switch to non-root operation

When this race is hit, the VM hangs. E.g., the BSP loops in SeaBIOS's
SMPLock and the AP is never reset, and qemu hmp "info registers -a"
shows:

CPU#0
EAX=00000002 EBX=00000002 ECX=00000000 EDX=00020000
ESI=00000000 EDI=00000000 EBP=00000008 ESP=00006c6c
EIP=000ef570 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
......
CPU#1
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
......

Fix this by handling latched INITs only in specific CPU states (SMM,
VMX non-root mode, SVM with GIF=0) in KVM_SET_VCPU_EVENTS.

Cc: stable(a)vger.kernel.org
Fixes: ff90afa75573 ("KVM: x86: Evaluate latched_init in KVM_SET_VCPU_EVENTS when vCPU not in SMM")
Signed-off-by: Fei Li <lifei.shirley(a)bytedance.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1c49bc681c46..7001b2af00ed1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5556,7 +5556,7 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 			return -EINVAL;
 #endif
 
-		if (lapic_in_kernel(vcpu)) {
+		if (!kvm_apic_init_sipi_allowed(vcpu) && lapic_in_kernel(vcpu)) {
 			if (events->smi.latched_init)
 				set_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
 			else
-- 
2.39.2 (Apple Git-143)
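The interleaving in the commit message can be replayed with a small userspace model. This is a sketch under stated assumptions, not KVM code: `struct demo_vcpu`, `DEMO_APIC_INIT`, the `init_blocked` flag (standing in for SMM, VMX non-root mode, or SVM with GIF=0) and all `demo_*` functions are hypothetical. The `guarded` argument selects between the behaviour before the patch (always overwrite the pending INIT bit) and the behaviour after it (only touch the bit while INIT is blocked).

```c
#include <assert.h>
#include <stdbool.h>

enum demo_mp_state { DEMO_UNINITIALIZED, DEMO_INIT_RECEIVED };

#define DEMO_APIC_INIT (1u << 0)

struct demo_vcpu {
	unsigned int pending_events;
	enum demo_mp_state mp_state;
	bool init_blocked;	/* models SMM, VMX non-root, or SVM with GIF=0 */
};

/* Stand-in for the check the patch adds: can INIT be taken right now? */
static bool demo_init_sipi_allowed(const struct demo_vcpu *v)
{
	return !v->init_blocked;
}

/* Restore path. With guarded == true, the pending INIT bit is left
 * alone unless INIT is currently blocked, i.e. genuinely latched. */
static void demo_set_vcpu_events(struct demo_vcpu *v, bool latched_init,
				 bool guarded)
{
	if (guarded && demo_init_sipi_allowed(v))
		return;
	if (latched_init)
		v->pending_events |= DEMO_APIC_INIT;
	else
		v->pending_events &= ~DEMO_APIC_INIT;
}

/* Consume a pending INIT if the vCPU may take it. */
static void demo_accept_events(struct demo_vcpu *v)
{
	if (demo_init_sipi_allowed(v) && (v->pending_events & DEMO_APIC_INIT)) {
		v->pending_events &= ~DEMO_APIC_INIT;
		v->mp_state = DEMO_INIT_RECEIVED;
	}
}

/* Replays the race: a stale snapshot (latched_init == 0) is written
 * back after the BSP has already sent INIT. Returns the AP's state. */
static enum demo_mp_state demo_replay_race(bool guarded)
{
	struct demo_vcpu ap = {
		.pending_events = 0,
		.mp_state = DEMO_UNINITIALIZED,
		.init_blocked = false,
	};
	bool stale_latched_init = false;	/* read before the INIT arrived */

	ap.pending_events |= DEMO_APIC_INIT;	/* BSP sends INIT */
	assert(ap.pending_events & DEMO_APIC_INIT);
	demo_set_vcpu_events(&ap, stale_latched_init, guarded);
	demo_accept_events(&ap);
	return ap.mp_state;
}
```

In this model, `demo_replay_race(false)` leaves the AP in `DEMO_UNINITIALIZED` because the stale write clears the pending INIT, while `demo_replay_race(true)` reaches `DEMO_INIT_RECEIVED` because the restore path skips a vCPU that is able to accept INIT.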

6 days, 3 hours


[PATCH net v2] net: mana: Remove redundant netdev_lock_ops_to_full() calls

by Saurabh Sengar

NET_SHAPER is always selected for MANA driver. When NET_SHAPER is enabled,
netdev_lock_ops_to_full() reduces effectively to only an assert for lock,
which is always held in the path when NET_SHAPER is enabled.

Remove the redundant netdev_lock_ops_to_full() call.

Signed-off-by: Saurabh Sengar <ssengar(a)linux.microsoft.com>
---
[v2]
 - removed Fixes tag and stable CC

 drivers/net/ethernet/microsoft/mana/mana_en.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 550843e2164b..f0dbf4e82e0b 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2100,10 +2100,8 @@ static void mana_destroy_txq(struct mana_port_context *apc)
 		napi = &apc->tx_qp[i].tx_cq.napi;
 		if (apc->tx_qp[i].txq.napi_initialized) {
 			napi_synchronize(napi);
-			netdev_lock_ops_to_full(napi->dev);
 			napi_disable_locked(napi);
 			netif_napi_del_locked(napi);
-			netdev_unlock_full_to_ops(napi->dev);
 			apc->tx_qp[i].txq.napi_initialized = false;
 		}
 		mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
@@ -2256,10 +2254,8 @@ static int mana_create_txq(struct mana_port_context *apc,
 		mana_create_txq_debugfs(apc, i);
 
 		set_bit(NAPI_STATE_NO_BUSY_POLL, &cq->napi.state);
-		netdev_lock_ops_to_full(net);
 		netif_napi_add_locked(net, &cq->napi, mana_poll);
 		napi_enable_locked(&cq->napi);
-		netdev_unlock_full_to_ops(net);
 		txq->napi_initialized = true;
 
 		mana_gd_ring_cq(cq->gdma_cq, SET_ARM_BIT);
@@ -2295,10 +2291,8 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
 	if (napi_initialized) {
 		napi_synchronize(napi);
 
-		netdev_lock_ops_to_full(napi->dev);
 		napi_disable_locked(napi);
 		netif_napi_del_locked(napi);
-		netdev_unlock_full_to_ops(napi->dev);
 	}
 	xdp_rxq_info_unreg(&rxq->xdp_rxq);
 
@@ -2549,18 +2543,14 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc,
 
 	gc->cq_table[cq->gdma_id] = cq->gdma_cq;
 
-	netdev_lock_ops_to_full(ndev);
 	netif_napi_add_weight_locked(ndev, &cq->napi, mana_poll, 1);
-	netdev_unlock_full_to_ops(ndev);
 
 	WARN_ON(xdp_rxq_info_reg(&rxq->xdp_rxq, ndev, rxq_idx,
 				 cq->napi.napi_id));
 	WARN_ON(xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, MEM_TYPE_PAGE_POOL,
 					   rxq->page_pool));
 
-	netdev_lock_ops_to_full(ndev);
 	napi_enable_locked(&cq->napi);
-	netdev_unlock_full_to_ops(ndev);
 
 	mana_gd_ring_cq(cq->gdma_cq, SET_ARM_BIT);
 out:
-- 
2.43.0
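The claim that the lock/unlock pair "reduces effectively to only an assert" can be illustrated with a toy model. This is a hedged sketch of the behaviour the commit message describes, not the kernel's helpers: `struct demo_netdev`, its `need_ops_lock` flag (assumed here to be true whenever NET_SHAPER is selected for the driver) and the `demo_*` functions are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_netdev {
	bool need_ops_lock;	/* assumed true when NET_SHAPER is selected */
	int lock_depth;		/* 1 while the instance lock is held */
};

/* Model of "upgrade to full lock": if the ops lock is already the
 * instance lock, there is nothing to take, only something to check. */
static void demo_lock_ops_to_full(struct demo_netdev *dev)
{
	if (dev->need_ops_lock)
		assert(dev->lock_depth == 1);
	else
		dev->lock_depth++;
}

/* Mirror image of the helper above. */
static void demo_unlock_full_to_ops(struct demo_netdev *dev)
{
	if (dev->need_ops_lock)
		assert(dev->lock_depth == 1);
	else
		dev->lock_depth--;
}

/* Returns the lock depth observed between the two calls, i.e. what a
 * "_locked" NAPI helper would see inside the removed pair. */
static int demo_depth_inside_pair(struct demo_netdev *dev)
{
	int depth;

	demo_lock_ops_to_full(dev);
	depth = dev->lock_depth;
	demo_unlock_full_to_ops(dev);
	return depth;
}
```

With `need_ops_lock` set and the lock already held, the depth inside the pair equals the depth outside it, so dropping the pair changes nothing except the removed assertion. That is the sense in which the calls are redundant for a driver where this condition always holds.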

6 days, 3 hours


+ samples-damon-mtier-avoid-starting-damon-before-initialization.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: samples/damon/mtier: avoid starting DAMON before initialization
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     samples-damon-mtier-avoid-starting-damon-before-initialization.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: samples/damon/mtier: avoid starting DAMON before initialization
Date: Mon, 8 Sep 2025 19:22:38 -0700

Commit 964314344eab ("samples/damon/mtier: support boot time enable
setup") is somehow incompletely applying the origin patch [1].  It is
missing the part that avoids starting DAMON before module initialization.
Probably a mistake during a merge has happened.  Fix it by applying the
missed part again.

Link: https://lkml.kernel.org/r/20250909022238.2989-4-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-4-sj@kernel.org [1]
Fixes: 964314344eab ("samples/damon/mtier: support boot time enable setup")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 samples/damon/mtier.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/samples/damon/mtier.c~samples-damon-mtier-avoid-starting-damon-before-initialization
+++ a/samples/damon/mtier.c
@@ -208,6 +208,9 @@ static int damon_sample_mtier_enable_sto
 	if (enabled == is_enabled)
 		return 0;
 
+	if (!init_called)
+		return 0;
+
 	if (enabled) {
 		err = damon_sample_mtier_start();
 		if (err)
_

Patches currently in -mm which might be from sj(a)kernel.org are

mm-damon-core-introduce-damon_call_control-dealloc_on_cancel.patch
mm-damon-sysfs-use-dynamically-allocated-repeat-mode-damon_call_control.patch
samples-damon-wsse-avoid-starting-damon-before-initialization.patch
samples-damon-prcl-avoid-starting-damon-before-initialization.patch
samples-damon-mtier-avoid-starting-damon-before-initialization.patch
mm-zswap-store-page_size-compression-failed-page-as-is.patch
mm-zswap-store-page_size-compression-failed-page-as-is-fix.patch
mm-zswap-store-page_size-compression-failed-page-as-is-v5.patch
mm-zswap-store-page_size-compression-failed-page-as-is-fix-2.patch
mm-damon-core-add-damon_ctx-addr_unit.patch
mm-damon-paddr-support-addr_unit-for-access-monitoring.patch
mm-damon-paddr-support-addr_unit-for-damos_pageout.patch
mm-damon-paddr-support-addr_unit-for-damos_lru_prio.patch
mm-damon-paddr-support-addr_unit-for-migrate_hotcold.patch
mm-damon-paddr-support-addr_unit-for-damos_stat.patch
mm-damon-sysfs-implement-addr_unit-file-under-context-dir.patch
docs-mm-damon-design-document-address-unit-parameter.patch
docs-admin-guide-mm-damon-usage-document-addr_unit-file.patch
docs-abi-damon-document-addr_unit-file.patch
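The guard restored by this patch follows a common pattern: a parameter handler that can run before module initialization only records the request, and the init function acts on it later. A minimal userspace sketch of that pattern follows. It is not the sample module's code; `demo_init_called`, `demo_enabled`, `demo_start_count` and the `demo_*` functions are hypothetical names, and `demo_start()` merely counts calls instead of starting DAMON.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_init_called;	/* set once module init has run */
static bool demo_enabled;	/* value of the "enabled" parameter */
static int demo_start_count;	/* how many times the worker was started */

/* Stand-in for the real start routine; it must not run before init. */
static int demo_start(void)
{
	assert(demo_init_called);
	demo_start_count++;
	return 0;
}

static void demo_stop(void)
{
}

/* Parameter handler: may be invoked at boot, before demo_init(). */
static int demo_enable_store(bool new_value)
{
	bool was_enabled = demo_enabled;
	int err;

	demo_enabled = new_value;
	if (demo_enabled == was_enabled)
		return 0;

	/* Too early: only record the request, init will act on it. */
	if (!demo_init_called)
		return 0;

	if (demo_enabled) {
		err = demo_start();
		if (err)
			demo_enabled = false;
		return err;
	}
	demo_stop();
	return 0;
}

/* Module init: honours a request recorded before initialization. */
static int demo_init(void)
{
	int err = 0;

	demo_init_called = true;
	if (demo_enabled) {
		err = demo_start();
		if (err)
			demo_enabled = false;
	}
	return err;
}
```

Calling `demo_enable_store(true)` before `demo_init()` leaves `demo_start_count` at zero, and the subsequent `demo_init()` performs the single deferred start. Without the `demo_init_called` check the handler would call `demo_start()` too early, which in this sketch trips the assertion.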

6 days, 5 hours


+ samples-damon-prcl-avoid-starting-damon-before-initialization.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: samples/damon/prcl: avoid starting DAMON before initialization
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     samples-damon-prcl-avoid-starting-damon-before-initialization.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: samples/damon/prcl: avoid starting DAMON before initialization
Date: Mon, 8 Sep 2025 19:22:37 -0700

Commit 2780505ec2b4 ("samples/damon/prcl: fix boot time enable crash") is
somehow incompletely applying the origin patch [1].  It is missing the
part that avoids starting DAMON before module initialization.  Probably a
mistake during a merge has happened.  Fix it by applying the missed part
again.

Link: https://lkml.kernel.org/r/20250909022238.2989-3-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-3-sj@kernel.org [1]
Fixes: 2780505ec2b4 ("samples/damon/prcl: fix boot time enable crash")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 samples/damon/prcl.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/samples/damon/prcl.c~samples-damon-prcl-avoid-starting-damon-before-initialization
+++ a/samples/damon/prcl.c
@@ -137,6 +137,9 @@ static int damon_sample_prcl_enable_stor
 	if (enabled == is_enabled)
 		return 0;
 
+	if (!init_called)
+		return 0;
+
 	if (enabled) {
 		err = damon_sample_prcl_start();
 		if (err)
_

Patches currently in -mm which might be from sj(a)kernel.org are

mm-damon-core-introduce-damon_call_control-dealloc_on_cancel.patch
mm-damon-sysfs-use-dynamically-allocated-repeat-mode-damon_call_control.patch
samples-damon-wsse-avoid-starting-damon-before-initialization.patch
samples-damon-prcl-avoid-starting-damon-before-initialization.patch
samples-damon-mtier-avoid-starting-damon-before-initialization.patch
mm-zswap-store-page_size-compression-failed-page-as-is.patch
mm-zswap-store-page_size-compression-failed-page-as-is-fix.patch
mm-zswap-store-page_size-compression-failed-page-as-is-v5.patch
mm-zswap-store-page_size-compression-failed-page-as-is-fix-2.patch
mm-damon-core-add-damon_ctx-addr_unit.patch
mm-damon-paddr-support-addr_unit-for-access-monitoring.patch
mm-damon-paddr-support-addr_unit-for-damos_pageout.patch
mm-damon-paddr-support-addr_unit-for-damos_lru_prio.patch
mm-damon-paddr-support-addr_unit-for-migrate_hotcold.patch
mm-damon-paddr-support-addr_unit-for-damos_stat.patch
mm-damon-sysfs-implement-addr_unit-file-under-context-dir.patch
docs-mm-damon-design-document-address-unit-parameter.patch
docs-admin-guide-mm-damon-usage-document-addr_unit-file.patch
docs-abi-damon-document-addr_unit-file.patch

6 days, 5 hours


+ samples-damon-wsse-avoid-starting-damon-before-initialization.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: samples/damon/wsse: avoid starting DAMON before initialization
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     samples-damon-wsse-avoid-starting-damon-before-initialization.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: samples/damon/wsse: avoid starting DAMON before initialization
Date: Mon, 8 Sep 2025 19:22:36 -0700

Patch series "samples/damon: fix boot time enable handling fixup merge
mistakes".

This patch (of 3):

Commit 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling")
is somehow incompletely applying the origin patch [1].  It is missing the
part that avoids starting DAMON before module initialization.  Probably a
mistake during a merge has happened.  Fix it by applying the missed part
again.

Link: https://lkml.kernel.org/r/20250909022238.2989-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250909022238.2989-2-sj@kernel.org
Link: https://lore.kernel.org/20250706193207.39810-2-sj@kernel.org [1]
Fixes: 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 samples/damon/wsse.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/samples/damon/wsse.c~samples-damon-wsse-avoid-starting-damon-before-initialization
+++ a/samples/damon/wsse.c
@@ -118,6 +118,9 @@ static int damon_sample_wsse_enable_stor
 		return 0;
 
 	if (enabled) {
+		if (!init_called)
+			return 0;
+
 		err = damon_sample_wsse_start();
 		if (err)
 			enabled = false;
_

Patches currently in -mm which might be from sj(a)kernel.org are

mm-damon-core-introduce-damon_call_control-dealloc_on_cancel.patch
mm-damon-sysfs-use-dynamically-allocated-repeat-mode-damon_call_control.patch
samples-damon-wsse-avoid-starting-damon-before-initialization.patch
samples-damon-prcl-avoid-starting-damon-before-initialization.patch
samples-damon-mtier-avoid-starting-damon-before-initialization.patch
mm-zswap-store-page_size-compression-failed-page-as-is.patch
mm-zswap-store-page_size-compression-failed-page-as-is-fix.patch
mm-zswap-store-page_size-compression-failed-page-as-is-v5.patch
mm-zswap-store-page_size-compression-failed-page-as-is-fix-2.patch
mm-damon-core-add-damon_ctx-addr_unit.patch
mm-damon-paddr-support-addr_unit-for-access-monitoring.patch
mm-damon-paddr-support-addr_unit-for-damos_pageout.patch
mm-damon-paddr-support-addr_unit-for-damos_lru_prio.patch
mm-damon-paddr-support-addr_unit-for-migrate_hotcold.patch
mm-damon-paddr-support-addr_unit-for-damos_stat.patch
mm-damon-sysfs-implement-addr_unit-file-under-context-dir.patch
docs-mm-damon-design-document-address-unit-parameter.patch
docs-admin-guide-mm-damon-usage-document-addr_unit-file.patch
docs-abi-damon-document-addr_unit-file.patch

6 days, 5 hours

1
0
0 0

[PATCH net-next v4] rds: ib: Remove unused extern definition

by Håkon Bugge

In the old days, RDS used FMR (Fast Memory Registration) to register
IB MRs to be used by RDMA. A newer and better verbs based
registration/de-registration method called FRWR (Fast Registration
Work Request) was added to RDS by commit 1659185fb4d0 ("RDS: IB:
Support Fastreg MR (FRMR) memory registration mode") in 2016.

Detection and enablement of FRWR was done in commit 2cb2912d6563
("RDS: IB: add Fastreg MR (FRMR) detection support"). But said commit
added an extern bool prefer_frmr, which was not used by said commit -
nor used by later commits. Hence, remove it.

Signed-off-by: Håkon Bugge <haakon.bugge(a)oracle.com>
Reviewed-by: Allison Henderson <allison.henderson(a)oracle.com>

---

v3 -> v4:
* Added Allison's r-b
* Removed indentation for this section

v2 -> v3:
* As per Jakub's request, removed Cc: and Fixes: tags
* Subject to net-next (instead of net)

v1 -> v2:
* Added commit message
* Added Cc: stable(a)vger.kernel.org
---
 net/rds/ib_mr.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/rds/ib_mr.h b/net/rds/ib_mr.h
index ea5e9aee4959e..5884de8c6f45b 100644
--- a/net/rds/ib_mr.h
+++ b/net/rds/ib_mr.h
@@ -108,7 +108,6 @@ struct rds_ib_mr_pool {
 };
 
 extern struct workqueue_struct *rds_ib_mr_wq;
-extern bool prefer_frmr;
 
 struct rds_ib_mr_pool *rds_ib_create_mr_pool(struct rds_ib_device *rds_dev,
 					     int npages);
-- 
2.43.5

6 days, 6 hours

