The patch below does not apply to the 6.12-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y git checkout FETCH_HEAD git cherry-pick -x f0ed39830e6064d62f9c5393505677a26569bb56 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025010650-tuesday-motivate-5cbb@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f0ed39830e6064d62f9c5393505677a26569bb56 Mon Sep 17 00:00:00 2001 From: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Date: Fri, 20 Dec 2024 09:19:18 -0800 Subject: [PATCH] xe/oa: Fix query mode of operation for OAR/OAC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
This is a set of squashed commits to facilitate smooth applying to stable. Each commit message is retained for reference.
1) Allow a GGTT mapped batch to be submitted to user exec queue
For a OA use case, one of the HW registers needs to be modified by submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so that the register is modified in the user's hardware context. In order to do this a batch that is mapped in GGTT, needs to be submitted to the user exec queue. Since all user submissions use q->vm and hence PPGTT, add some plumbing to enable submission of batches mapped in GGTT.
v2: ggtt is zero-initialized, so no need to set it false (Matt Brost)
2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC
To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set. Setting this bit cause the context image size to change and if not done correct, can cause undesired hangs.
Current code uses a separate exec_queue to modify this bit and is error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to the target hardware context to modify the relevant bit.
In v2 version, an attempt to submit everything to the user-queue was made, but it failed the unprivileged-single-ctx-counters test. It appears that the OACTXCONTROL must be modified from a remote context.
In v3 version, all context specific register configurations were moved to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a cleaner way, since we can now submit all configuration to user exec_queue and the fence handling is simplified.
v2: (Matt) - set job->ggtt to true if create job is successful - unlock vm on job error
(Ashutosh) - don't wait on job submission - use kernel exec queue where possible
v3: (Ashutosh) - Fix checkpatch issues - Remove extra spaces/new-lines - Add Fixes: and Cc: tags - Reset context control bit when OA stream is closed - Submit all config via MI_LOAD_REGISTER_IMMEDIATE
(Umesh) - Update commit message for v3 experiment - Squash patches for easier port to stable
v4: (Ashutosh) - No need to pass q to xe_oa_submit_bb - Do not support exec queues with width > 1 - Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE
v5: (Ashutosh) - Drop reg_lri related comments - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 8dd55798ab31..5cc0f6f9bc11 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -74,12 +74,6 @@ struct xe_oa_config { struct rcu_head rcu; };
-struct flex { - struct xe_reg reg; - u32 offset; - u32 value; -}; - struct xe_oa_open_param { struct xe_file *xef; u32 oa_unit_id; @@ -596,19 +590,38 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait) return ret; }
+static void xe_oa_lock_vma(struct xe_exec_queue *q) +{ + if (q->vm) { + down_read(&q->vm->lock); + xe_vm_lock(q->vm, false); + } +} + +static void xe_oa_unlock_vma(struct xe_exec_queue *q) +{ + if (q->vm) { + xe_vm_unlock(q->vm); + up_read(&q->vm->lock); + } +} + static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, enum xe_oa_submit_deps deps, struct xe_bb *bb) { + struct xe_exec_queue *q = stream->exec_q ?: stream->k_exec_q; struct xe_sched_job *job; struct dma_fence *fence; int err = 0;
- /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */ - job = xe_bb_create_job(stream->k_exec_q, bb); + xe_oa_lock_vma(q); + + job = xe_bb_create_job(q, bb); if (IS_ERR(job)) { err = PTR_ERR(job); goto exit; } + job->ggtt = true;
if (deps == XE_OA_SUBMIT_ADD_DEPS) { for (int i = 0; i < stream->num_syncs && !err; i++) @@ -623,10 +636,13 @@ static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, enum xe_oa fence = dma_fence_get(&job->drm.s_fence->finished); xe_sched_job_push(job);
+ xe_oa_unlock_vma(q); + return fence; err_put_job: xe_sched_job_put(job); exit: + xe_oa_unlock_vma(q); return ERR_PTR(err); }
@@ -675,63 +691,19 @@ static void xe_oa_free_configs(struct xe_oa_stream *stream) dma_fence_put(stream->last_fence); }
-static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc, - struct xe_bb *bb, const struct flex *flex, u32 count) -{ - u32 offset = xe_bo_ggtt_addr(lrc->bo); - - do { - bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1); - bb->cs[bb->len++] = offset + flex->offset * sizeof(u32); - bb->cs[bb->len++] = 0; - bb->cs[bb->len++] = flex->value; - - } while (flex++, --count); -} - -static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc, - const struct flex *flex, u32 count) +static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri, u32 count) { struct dma_fence *fence; struct xe_bb *bb; int err;
- bb = xe_bb_new(stream->gt, 4 * count, false); + bb = xe_bb_new(stream->gt, 2 * count + 1, false); if (IS_ERR(bb)) { err = PTR_ERR(bb); goto exit; }
- xe_oa_store_flex(stream, lrc, bb, flex, count); - - fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb); - if (IS_ERR(fence)) { - err = PTR_ERR(fence); - goto free_bb; - } - xe_bb_free(bb, fence); - dma_fence_put(fence); - - return 0; -free_bb: - xe_bb_free(bb, NULL); -exit: - return err; -} - -static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri) -{ - struct dma_fence *fence; - struct xe_bb *bb; - int err; - - bb = xe_bb_new(stream->gt, 3, false); - if (IS_ERR(bb)) { - err = PTR_ERR(bb); - goto exit; - } - - write_cs_mi_lri(bb, reg_lri, 1); + write_cs_mi_lri(bb, reg_lri, count);
fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb); if (IS_ERR(fence)) { @@ -751,71 +723,55 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re static int xe_oa_configure_oar_context(struct xe_oa_stream *stream, bool enable) { const struct xe_oa_format *format = stream->oa_buffer.format; - struct xe_lrc *lrc = stream->exec_q->lrc[0]; - u32 regs_offset = xe_lrc_regs_offset(lrc) / sizeof(u32); u32 oacontrol = __format_to_oactrl(format, OAR_OACONTROL_COUNTER_SEL_MASK) | (enable ? OAR_OACONTROL_COUNTER_ENABLE : 0);
- struct flex regs_context[] = { + struct xe_oa_reg reg_lri[] = { { OACTXCONTROL(stream->hwe->mmio_base), - stream->oa->ctx_oactxctrl_offset[stream->hwe->class] + 1, enable ? OA_COUNTER_RESUME : 0, }, + { + OAR_OACONTROL, + oacontrol, + }, { RING_CONTEXT_CONTROL(stream->hwe->mmio_base), - regs_offset + CTX_CONTEXT_CONTROL, - _MASKED_BIT_ENABLE(CTX_CTRL_OAC_CONTEXT_ENABLE), + _MASKED_FIELD(CTX_CTRL_OAC_CONTEXT_ENABLE, + enable ? CTX_CTRL_OAC_CONTEXT_ENABLE : 0) }, }; - struct xe_oa_reg reg_lri = { OAR_OACONTROL, oacontrol }; - int err;
- /* Modify stream hwe context image with regs_context */ - err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0], - regs_context, ARRAY_SIZE(regs_context)); - if (err) - return err; - - /* Apply reg_lri using LRI */ - return xe_oa_load_with_lri(stream, ®_lri); + return xe_oa_load_with_lri(stream, reg_lri, ARRAY_SIZE(reg_lri)); }
static int xe_oa_configure_oac_context(struct xe_oa_stream *stream, bool enable) { const struct xe_oa_format *format = stream->oa_buffer.format; - struct xe_lrc *lrc = stream->exec_q->lrc[0]; - u32 regs_offset = xe_lrc_regs_offset(lrc) / sizeof(u32); u32 oacontrol = __format_to_oactrl(format, OAR_OACONTROL_COUNTER_SEL_MASK) | (enable ? OAR_OACONTROL_COUNTER_ENABLE : 0); - struct flex regs_context[] = { + struct xe_oa_reg reg_lri[] = { { OACTXCONTROL(stream->hwe->mmio_base), - stream->oa->ctx_oactxctrl_offset[stream->hwe->class] + 1, enable ? OA_COUNTER_RESUME : 0, }, + { + OAC_OACONTROL, + oacontrol + }, { RING_CONTEXT_CONTROL(stream->hwe->mmio_base), - regs_offset + CTX_CONTEXT_CONTROL, - _MASKED_BIT_ENABLE(CTX_CTRL_OAC_CONTEXT_ENABLE) | + _MASKED_FIELD(CTX_CTRL_OAC_CONTEXT_ENABLE, + enable ? CTX_CTRL_OAC_CONTEXT_ENABLE : 0) | _MASKED_FIELD(CTX_CTRL_RUN_ALONE, enable ? CTX_CTRL_RUN_ALONE : 0), }, }; - struct xe_oa_reg reg_lri = { OAC_OACONTROL, oacontrol }; - int err;
/* Set ccs select to enable programming of OAC_OACONTROL */ xe_mmio_write32(&stream->gt->mmio, __oa_regs(stream)->oa_ctrl, __oa_ccs_select(stream));
- /* Modify stream hwe context image with regs_context */ - err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0], - regs_context, ARRAY_SIZE(regs_context)); - if (err) - return err; - - /* Apply reg_lri using LRI */ - return xe_oa_load_with_lri(stream, ®_lri); + return xe_oa_load_with_lri(stream, reg_lri, ARRAY_SIZE(reg_lri)); }
static int xe_oa_configure_oa_context(struct xe_oa_stream *stream, bool enable) @@ -2066,8 +2022,8 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f if (XE_IOCTL_DBG(oa->xe, !param.exec_q)) return -ENOENT;
- if (param.exec_q->width > 1) - drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n"); + if (XE_IOCTL_DBG(oa->xe, param.exec_q->width > 1)) + return -EOPNOTSUPP; }
/* diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c index 0be4f489d3e1..9f327f27c072 100644 --- a/drivers/gpu/drm/xe/xe_ring_ops.c +++ b/drivers/gpu/drm/xe/xe_ring_ops.c @@ -221,7 +221,10 @@ static int emit_pipe_imm_ggtt(u32 addr, u32 value, bool stall_only, u32 *dw,
static u32 get_ppgtt_flag(struct xe_sched_job *job) { - return job->q->vm ? BIT(8) : 0; + if (job->q->vm && !job->ggtt) + return BIT(8); + + return 0; }
static int emit_copy_timestamp(struct xe_lrc *lrc, u32 *dw, int i) diff --git a/drivers/gpu/drm/xe/xe_sched_job_types.h b/drivers/gpu/drm/xe/xe_sched_job_types.h index f13f333f00be..d942b20a9f29 100644 --- a/drivers/gpu/drm/xe/xe_sched_job_types.h +++ b/drivers/gpu/drm/xe/xe_sched_job_types.h @@ -56,6 +56,8 @@ struct xe_sched_job { u32 migrate_flush_flags; /** @ring_ops_flush_tlb: The ring ops need to flush TLB before payload. */ bool ring_ops_flush_tlb; + /** @ggtt: mapped in ggtt. */ + bool ggtt; /** @ptrs: per instance pointers. */ struct xe_job_ptrs ptrs[]; };
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
This is a set of squashed commits to facilitate smooth applying to stable. Each commit message is retained for reference.
1) Allow a GGTT mapped batch to be submitted to user exec queue
For a OA use case, one of the HW registers needs to be modified by submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so that the register is modified in the user's hardware context. In order to do this a batch that is mapped in GGTT, needs to be submitted to the user exec queue. Since all user submissions use q->vm and hence PPGTT, add some plumbing to enable submission of batches mapped in GGTT.
v2: ggtt is zero-initialized, so no need to set it false (Matt Brost)
2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC
To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set. Setting this bit cause the context image size to change and if not done correct, can cause undesired hangs.
Current code uses a separate exec_queue to modify this bit and is error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to the target hardware context to modify the relevant bit.
In v2 version, an attempt to submit everything to the user-queue was made, but it failed the unprivileged-single-ctx-counters test. It appears that the OACTXCONTROL must be modified from a remote context.
In v3 version, all context specific register configurations were moved to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a cleaner way, since we can now submit all configuration to user exec_queue and the fence handling is simplified.
v2: (Matt) - set job->ggtt to true if create job is successful - unlock vm on job error
(Ashutosh) - don't wait on job submission - use kernel exec queue where possible
v3: (Ashutosh) - Fix checkpatch issues - Remove extra spaces/new-lines - Add Fixes: and Cc: tags - Reset context control bit when OA stream is closed - Submit all config via MI_LOAD_REGISTER_IMMEDIATE
(Umesh) - Update commit message for v3 experiment - Squash patches for easier port to stable
v4: (Ashutosh) - No need to pass q to xe_oa_submit_bb - Do not support exec queues with width > 1 - Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE
v5: (Ashutosh) - Drop reg_lri related comments - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri
v6: Backport to linux-6.12.y (Umesh)
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56) --- drivers/gpu/drm/xe/xe_oa.c | 124 +++++++++--------------- drivers/gpu/drm/xe/xe_ring_ops.c | 5 +- drivers/gpu/drm/xe/xe_sched_job_types.h | 2 + 3 files changed, 51 insertions(+), 80 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 78823f53d290..ddc3472a2506 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -63,12 +63,6 @@ struct xe_oa_config { struct rcu_head rcu; };
-struct flex { - struct xe_reg reg; - u32 offset; - u32 value; -}; - struct xe_oa_open_param { u32 oa_unit_id; bool sample; @@ -567,24 +561,46 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait) return ret; }
+static void xe_oa_lock_vma(struct xe_exec_queue *q) +{ + if (q->vm) { + down_read(&q->vm->lock); + xe_vm_lock(q->vm, false); + } +} + +static void xe_oa_unlock_vma(struct xe_exec_queue *q) +{ + if (q->vm) { + xe_vm_unlock(q->vm); + up_read(&q->vm->lock); + } +} + static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) { + struct xe_exec_queue *q = stream->exec_q ?: stream->k_exec_q; struct xe_sched_job *job; struct dma_fence *fence; long timeout; int err = 0;
- /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */ - job = xe_bb_create_job(stream->k_exec_q, bb); + xe_oa_lock_vma(q); + + job = xe_bb_create_job(q, bb); if (IS_ERR(job)) { err = PTR_ERR(job); + xe_oa_unlock_vma(q); goto exit; } + job->ggtt = true;
xe_sched_job_arm(job); fence = dma_fence_get(&job->drm.s_fence->finished); xe_sched_job_push(job);
+ xe_oa_unlock_vma(q); + timeout = dma_fence_wait_timeout(fence, false, HZ); dma_fence_put(fence); if (timeout < 0) @@ -639,52 +655,18 @@ static void xe_oa_free_configs(struct xe_oa_stream *stream) free_oa_config_bo(oa_bo); }
-static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc, - struct xe_bb *bb, const struct flex *flex, u32 count) -{ - u32 offset = xe_bo_ggtt_addr(lrc->bo); - - do { - bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1); - bb->cs[bb->len++] = offset + flex->offset * sizeof(u32); - bb->cs[bb->len++] = 0; - bb->cs[bb->len++] = flex->value; - - } while (flex++, --count); -} - -static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc, - const struct flex *flex, u32 count) -{ - struct xe_bb *bb; - int err; - - bb = xe_bb_new(stream->gt, 4 * count, false); - if (IS_ERR(bb)) { - err = PTR_ERR(bb); - goto exit; - } - - xe_oa_store_flex(stream, lrc, bb, flex, count); - - err = xe_oa_submit_bb(stream, bb); - xe_bb_free(bb, NULL); -exit: - return err; -} - -static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri) +static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri, u32 count) { struct xe_bb *bb; int err;
- bb = xe_bb_new(stream->gt, 3, false); + bb = xe_bb_new(stream->gt, 2 * count + 1, false); if (IS_ERR(bb)) { err = PTR_ERR(bb); goto exit; }
- write_cs_mi_lri(bb, reg_lri, 1); + write_cs_mi_lri(bb, reg_lri, count);
err = xe_oa_submit_bb(stream, bb); xe_bb_free(bb, NULL); @@ -695,70 +677,54 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re static int xe_oa_configure_oar_context(struct xe_oa_stream *stream, bool enable) { const struct xe_oa_format *format = stream->oa_buffer.format; - struct xe_lrc *lrc = stream->exec_q->lrc[0]; - u32 regs_offset = xe_lrc_regs_offset(lrc) / sizeof(u32); u32 oacontrol = __format_to_oactrl(format, OAR_OACONTROL_COUNTER_SEL_MASK) | (enable ? OAR_OACONTROL_COUNTER_ENABLE : 0);
- struct flex regs_context[] = { + struct xe_oa_reg reg_lri[] = { { OACTXCONTROL(stream->hwe->mmio_base), - stream->oa->ctx_oactxctrl_offset[stream->hwe->class] + 1, enable ? OA_COUNTER_RESUME : 0, }, + { + OAR_OACONTROL, + oacontrol, + }, { RING_CONTEXT_CONTROL(stream->hwe->mmio_base), - regs_offset + CTX_CONTEXT_CONTROL, - _MASKED_BIT_ENABLE(CTX_CTRL_OAC_CONTEXT_ENABLE), + _MASKED_FIELD(CTX_CTRL_OAC_CONTEXT_ENABLE, + enable ? CTX_CTRL_OAC_CONTEXT_ENABLE : 0) }, }; - struct xe_oa_reg reg_lri = { OAR_OACONTROL, oacontrol }; - int err; - - /* Modify stream hwe context image with regs_context */ - err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0], - regs_context, ARRAY_SIZE(regs_context)); - if (err) - return err;
- /* Apply reg_lri using LRI */ - return xe_oa_load_with_lri(stream, ®_lri); + return xe_oa_load_with_lri(stream, reg_lri, ARRAY_SIZE(reg_lri)); }
static int xe_oa_configure_oac_context(struct xe_oa_stream *stream, bool enable) { const struct xe_oa_format *format = stream->oa_buffer.format; - struct xe_lrc *lrc = stream->exec_q->lrc[0]; - u32 regs_offset = xe_lrc_regs_offset(lrc) / sizeof(u32); u32 oacontrol = __format_to_oactrl(format, OAR_OACONTROL_COUNTER_SEL_MASK) | (enable ? OAR_OACONTROL_COUNTER_ENABLE : 0); - struct flex regs_context[] = { + struct xe_oa_reg reg_lri[] = { { OACTXCONTROL(stream->hwe->mmio_base), - stream->oa->ctx_oactxctrl_offset[stream->hwe->class] + 1, enable ? OA_COUNTER_RESUME : 0, }, + { + OAC_OACONTROL, + oacontrol + }, { RING_CONTEXT_CONTROL(stream->hwe->mmio_base), - regs_offset + CTX_CONTEXT_CONTROL, - _MASKED_BIT_ENABLE(CTX_CTRL_OAC_CONTEXT_ENABLE) | + _MASKED_FIELD(CTX_CTRL_OAC_CONTEXT_ENABLE, + enable ? CTX_CTRL_OAC_CONTEXT_ENABLE : 0) | _MASKED_FIELD(CTX_CTRL_RUN_ALONE, enable ? CTX_CTRL_RUN_ALONE : 0), }, }; - struct xe_oa_reg reg_lri = { OAC_OACONTROL, oacontrol }; - int err;
/* Set ccs select to enable programming of OAC_OACONTROL */ xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_ctrl, __oa_ccs_select(stream));
- /* Modify stream hwe context image with regs_context */ - err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0], - regs_context, ARRAY_SIZE(regs_context)); - if (err) - return err; - - /* Apply reg_lri using LRI */ - return xe_oa_load_with_lri(stream, ®_lri); + return xe_oa_load_with_lri(stream, reg_lri, ARRAY_SIZE(reg_lri)); }
static int xe_oa_configure_oa_context(struct xe_oa_stream *stream, bool enable) @@ -1817,8 +1783,8 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f if (XE_IOCTL_DBG(oa->xe, !param.exec_q)) return -ENOENT;
- if (param.exec_q->width > 1) - drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n"); + if (XE_IOCTL_DBG(oa->xe, param.exec_q->width > 1)) + return -EOPNOTSUPP; }
/* diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c index 0be4f489d3e1..9f327f27c072 100644 --- a/drivers/gpu/drm/xe/xe_ring_ops.c +++ b/drivers/gpu/drm/xe/xe_ring_ops.c @@ -221,7 +221,10 @@ static int emit_pipe_imm_ggtt(u32 addr, u32 value, bool stall_only, u32 *dw,
static u32 get_ppgtt_flag(struct xe_sched_job *job) { - return job->q->vm ? BIT(8) : 0; + if (job->q->vm && !job->ggtt) + return BIT(8); + + return 0; }
static int emit_copy_timestamp(struct xe_lrc *lrc, u32 *dw, int i) diff --git a/drivers/gpu/drm/xe/xe_sched_job_types.h b/drivers/gpu/drm/xe/xe_sched_job_types.h index 0d3f76fb05ce..c207361bf43e 100644 --- a/drivers/gpu/drm/xe/xe_sched_job_types.h +++ b/drivers/gpu/drm/xe/xe_sched_job_types.h @@ -57,6 +57,8 @@ struct xe_sched_job { u32 migrate_flush_flags; /** @ring_ops_flush_tlb: The ring ops need to flush TLB before payload. */ bool ring_ops_flush_tlb; + /** @ggtt: mapped in ggtt. */ + bool ggtt; /** @ptrs: per instance pointers. */ struct xe_job_ptrs ptrs[]; };
[ Sasha's backport helper bot ]
Hi,
The claimed upstream commit SHA1 (55039832f98c7e05f1cf9e0d8c12b2490abd0f16) was not found. However, I found a matching commit: f0ed39830e6064d62f9c5393505677a26569bb56
Status in newer kernel trees: 6.12.y | Not found
Note: The patch differs from the upstream commit: --- 1: f0ed39830e60 ! 1: 14a6fe72119e xe/oa: Fix query mode of operation for OAR/OAC @@ Metadata ## Commit message ## xe/oa: Fix query mode of operation for OAR/OAC
+ commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream + This is a set of squashed commits to facilitate smooth applying to stable. Each commit message is retained for reference.
@@ Commit message - Drop reg_lri related comments - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri
+ v6: + Backport to linux-6.12.y (Umesh) + Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com - Cc: stable@vger.kernel.org + Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com + (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
## drivers/gpu/drm/xe/xe_oa.c ## @@ drivers/gpu/drm/xe/xe_oa.c: struct xe_oa_config { @@ drivers/gpu/drm/xe/xe_oa.c: struct xe_oa_config { -}; - struct xe_oa_open_param { - struct xe_file *xef; u32 oa_unit_id; + bool sample; @@ drivers/gpu/drm/xe/xe_oa.c: static __poll_t xe_oa_poll(struct file *file, poll_table *wait) return ret; } @@ drivers/gpu/drm/xe/xe_oa.c: static __poll_t xe_oa_poll(struct file *file, poll_t + } +} + - static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, enum xe_oa_submit_deps deps, - struct xe_bb *bb) + static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) { + struct xe_exec_queue *q = stream->exec_q ?: stream->k_exec_q; struct xe_sched_job *job; struct dma_fence *fence; + long timeout; int err = 0;
- /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */ @@ drivers/gpu/drm/xe/xe_oa.c: static __poll_t xe_oa_poll(struct file *file, poll_t + job = xe_bb_create_job(q, bb); if (IS_ERR(job)) { err = PTR_ERR(job); ++ xe_oa_unlock_vma(q); goto exit; } + job->ggtt = true;
- if (deps == XE_OA_SUBMIT_ADD_DEPS) { - for (int i = 0; i < stream->num_syncs && !err; i++) -@@ drivers/gpu/drm/xe/xe_oa.c: static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, enum xe_oa + xe_sched_job_arm(job); fence = dma_fence_get(&job->drm.s_fence->finished); xe_sched_job_push(job);
+ xe_oa_unlock_vma(q); + - return fence; - err_put_job: - xe_sched_job_put(job); - exit: -+ xe_oa_unlock_vma(q); - return ERR_PTR(err); - } - + timeout = dma_fence_wait_timeout(fence, false, HZ); + dma_fence_put(fence); + if (timeout < 0) @@ drivers/gpu/drm/xe/xe_oa.c: static void xe_oa_free_configs(struct xe_oa_stream *stream) - dma_fence_put(stream->last_fence); + free_oa_config_bo(oa_bo); }
-static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc, @@ drivers/gpu/drm/xe/xe_oa.c: static void xe_oa_free_configs(struct xe_oa_stream * -static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc, - const struct flex *flex, u32 count) -{ -- struct dma_fence *fence; - struct xe_bb *bb; - int err; - @@ drivers/gpu/drm/xe/xe_oa.c: static void xe_oa_free_configs(struct xe_oa_stream * - - xe_oa_store_flex(stream, lrc, bb, flex, count); - -- fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb); -- if (IS_ERR(fence)) { -- err = PTR_ERR(fence); -- goto free_bb; -- } -- xe_bb_free(bb, fence); -- dma_fence_put(fence); -- -- return 0; --free_bb: +- err = xe_oa_submit_bb(stream, bb); - xe_bb_free(bb, NULL); -exit: - return err; @@ drivers/gpu/drm/xe/xe_oa.c: static void xe_oa_free_configs(struct xe_oa_stream * -static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri) +static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri, u32 count) { - struct dma_fence *fence; struct xe_bb *bb; int err;
@@ drivers/gpu/drm/xe/xe_oa.c: static void xe_oa_free_configs(struct xe_oa_stream * - write_cs_mi_lri(bb, reg_lri, 1); + write_cs_mi_lri(bb, reg_lri, count);
- fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb); - if (IS_ERR(fence)) { + err = xe_oa_submit_bb(stream, bb); + xe_bb_free(bb, NULL); @@ drivers/gpu/drm/xe/xe_oa.c: static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re static int xe_oa_configure_oar_context(struct xe_oa_stream *stream, bool enable) { @@ drivers/gpu/drm/xe/xe_oa.c: static int xe_oa_load_with_lri(struct xe_oa_stream * - int err;
/* Set ccs select to enable programming of OAC_OACONTROL */ - xe_mmio_write32(&stream->gt->mmio, __oa_regs(stream)->oa_ctrl, - __oa_ccs_select(stream)); + xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_ctrl, __oa_ccs_select(stream));
- /* Modify stream hwe context image with regs_context */ - err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0], ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.12.y | Success | Success |
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
greg k-h
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
Dave.
On Mon, 13 Jan 2025 at 05:51, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
If you were to ignore stable tags on drm, could we then write a tool that creates a new for-stable tree that just strips out all the fixes/backports/commits and recreates it based on Linus commits to you, or would future duplicate commits then break tools?
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
Dave.
On Mon, Jan 13, 2025 at 06:01:51AM +1000, Dave Airlie wrote:
On Mon, 13 Jan 2025 at 05:51, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
If you were to ignore stable tags on drm, could we then write a tool that creates a new for-stable tree that just strips out all the fixes/backports/commits and recreates it based on Linus commits to you, or would future duplicate commits then break tools?
That would be great, just pick which commit id to reference (i.e. the one that shows up in Linus's tree first.)
But then, be careful with the "Fixes:" tags as well, those need to line up and match the correct ones.
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree. - for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
thanks,
greg k-h
On Mon, 13 Jan 2025 at 07:09, Greg KH gregkh@linuxfoundation.org wrote:
On Mon, Jan 13, 2025 at 06:01:51AM +1000, Dave Airlie wrote:
On Mon, 13 Jan 2025 at 05:51, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
If you were to ignore stable tags on drm, could we then write a tool that creates a new for-stable tree that just strips out all the fixes/backports/commits and recreates it based on Linus commits to you, or would future duplicate commits then break tools?
That would be great, just pick which commit id to reference (i.e. the one that shows up in Linus's tree first.)
But then, be careful with the "Fixes:" tags as well, those need to line up and match the correct ones.
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
When the initial commit enters during the next merge window, you look for that subject or commit id in the stable tree already, if it exists, dump the latest Linus patch on the floor, it's already in stable your job is done.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
I'm just not seeing what I'm missing here, fixes tags should work fine, but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Dave.
On Mon, Jan 13, 2025 at 10:44:41AM +1000, Dave Airlie wrote:
On Mon, 13 Jan 2025 at 07:09, Greg KH gregkh@linuxfoundation.org wrote:
On Mon, Jan 13, 2025 at 06:01:51AM +1000, Dave Airlie wrote:
On Mon, 13 Jan 2025 at 05:51, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
If you were to ignore stable tags on drm, could we then write a tool that creates a new for-stable tree that just strips out all the fixes/backports/commits and recreates it based on Linus commits to you, or would future duplicate commits then break tools?
That would be great, just pick which commit id to reference (i.e. the one that shows up in Linus's tree first.)
But then, be careful with the "Fixes:" tags as well, those need to line up and match the correct ones.
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
The problem is when it comes to Reverts and Fixes. Both of those now get out of sync because the DRM developers don't know which commit id to pick, and so they choose a random one. And of course, now 50% of the time they are wrong.
When the initial commit enters during the next merge window, you look for that subject or commit id in the stable tree already, if it exists, dump the latest Linus patch on the floor, it's already in stable your job is done.
This takes an extra step from any other subsystem, which is why DRM patches are the LAST to get merged into stable releases after the big -rc1 flood as it takes me a manual step for each and every commit. It's a pain and I curse it but hey, if I have to live with it I can.
But again, that's not what I am complaining about here. Again, it's the mismatch of Fixes and Revert ids being used.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
Yes, they sometimes are wrong. How am I supposed to know that if I see a DRM Fixes or Revert id, I need to walk the whole tree and search again for the ids in the changelog to see if they match or not.
That is the problem.
I'm just not seeing what I'm missing here, fixes tags should work fine,
They do not when you refer to the "wrong" commit id (i.e. the commit that ended up in -rc1, NOT the previous real release that came through the "fixes" branch you have.)
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
thanks,
greg k-h
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
Dave.
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain. -Sima
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
Or the following 3 commits:
0811b9e4530d ("drm/amd/display: Add HUBP surface flip interrupt handler") which has a stable tag, and no cherry-pick line.
4ded1ec8d1b3 ("drm/amd/display: Add HUBP surface flip interrupt handler") which is a different code change than the previous commit, and a completely different commit message, no stable tag, and no cherry-pick line.
7af87fc1ba13 ("drm/amd/display: Add HUBP surface flip interrupt handler") which has the same code change as above, and it has the same commit message as 4ded1ec8d1b3 but with an added stable tag, and again - no cherry-pick line.
On Tue, Jan 14, 2025 at 11:02 AM Sasha Levin sashal@kernel.org wrote:
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
> I just don't get what the ABI the tools expect is, and why everyone is > writing bespoke tools and getting it wrong, then blaming us for not > conforming. Fix the tools or write new ones when you realise the > situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
Or the following 3 commits:
0811b9e4530d ("drm/amd/display: Add HUBP surface flip interrupt handler") which has a stable tag, and no cherry-pick line.
4ded1ec8d1b3 ("drm/amd/display: Add HUBP surface flip interrupt handler") which is a different code change than the previous commit, and a completely different commit message, no stable tag, and no cherry-pick line.
7af87fc1ba13 ("drm/amd/display: Add HUBP surface flip interrupt handler") which has the same code change as above, and it has the same commit message as 4ded1ec8d1b3 but with an added stable tag, and again - no cherry-pick line.
In fairness, these pre-dated the amdgpu tree using cherry-pick -x. I had stopped doing that for a while because I kept getting yelled at for referencing commits that were only in -next. I've since started using -x when I need to cherry-pick a fix to -fixes.
Alex
On Tue, Jan 14, 2025 at 11:11:42AM -0500, Alex Deucher wrote:
On Tue, Jan 14, 2025 at 11:02 AM Sasha Levin sashal@kernel.org wrote:
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
> We create a "web" when we backport commits, and mark things for "Fixes:" > When we get those ids wrong because you all have duplicate commits for > the same thing, everything breaks. > > > I just don't get what the ABI the tools expect is, and why everyone is > > writing bespoke tools and getting it wrong, then blaming us for not > > conforming. Fix the tools or write new ones when you realise the > > situation is more complex than your initial ideas. > > All I want to see and care about is: > > - for a stable commit, the id that the commit is in Linus's tree. > - for a Fixes: tag, the id that matches the commit in Linus's tree AND > the commit that got backported to stable trees. > > That's it, that's the whole "ABI". The issue is that you all, for any > number of commits, have 2 unique ids for any single commit and how are > we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
Or the following 3 commits:
0811b9e4530d ("drm/amd/display: Add HUBP surface flip interrupt handler") which has a stable tag, and no cherry-pick line.
4ded1ec8d1b3 ("drm/amd/display: Add HUBP surface flip interrupt handler") which is a different code change than the previous commit, and a completely different commit message, no stable tag, and no cherry-pick line.
7af87fc1ba13 ("drm/amd/display: Add HUBP surface flip interrupt handler") which has the same code change as above, and it has the same commit message as 4ded1ec8d1b3 but with an added stable tag, and again - no cherry-pick line.
In fairness, these pre-dated the amdgpu tree using cherry-pick -x. I had stopped doing that for a while because I kept getting yelled at for referencing commits that were only in -next. I've since started using -x when I need to cherry-pick a fix to -fixes.
Ok, here's the most recent one that I tripped over, and the CVE community hit as well (a CVE consumer reported the problem to me): a6dd15981c03 ("drm/amdgpu: prevent NULL pointer dereference if ATIF is not supported") which showed up in the 6.12 release (6.12-rc6 to be specific), but yet has the line: Fixes: c9b7c809b89f ("drm/amd: Guard against bad data for ATIF ACPI method") while that commit was NOT in Linus's tree and did not get there until 6.13-rc1.
So here we have a commit that claims it fixes a commit that is not yet in the tree. This of course caused havoc with tools that assumed that if a commit says it fixes something, that fix is in the tree already (i.e. the CVE assignment scripts.) After much digging around, it turns out that the _REAL_ fixes commit id was: bf58f03931fd ("drm/amd: Guard against bad data for ATIF ACPI method") which really showed up in 6.12-rc5 (and was backported to stable commits as it too claimed to solve a problem in those kernels.)
So this is a real problem, and shows up and we have to then explain to external people, "no, what that commit says really isn't true, it's really fixing this other commit over there which is why we applied it and why we had to manually mark this CVE as THAT commit being the offending one, not this other one over there...)
So again, if you all wish to keep your current workflow, wonderful, but realize it has consequences. Those consequences are stable maintainers that are loath to touch these patches, CVE maintainers who cringe every time they see a DRM patch being assigned a CVE and hope that the tags are correct and they don't have to manually research it to determine if it is right or not (remember, we assign 8 CVEs a day, that workload does not lend itself to hand-holding at all) and also there are confused users when they try to determine where a fix should be applied to because they want to not follow the stable kernels and do their own thing (RHEL, SLES, ChromeOS, cloud-monstrosities, etc.)
I'm trying to point out "your workflow is causing problems." If you want to ignore that statement, fine. But that doesn't mean it's not a valid statement.
greg k-h
On Tue, Jan 14, 2025 at 10:51:54AM -0500, Sasha Levin wrote:
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
> I just don't get what the ABI the tools expect is, and why everyone is > writing bespoke tools and getting it wrong, then blaming us for not > conforming. Fix the tools or write new ones when you realise the > situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
This one also landed through Alex' tree, and before he switched over to cherry-pick -x and not trying to fix things up with rebasing. Because in theory rebasing bugfixes out of -next into -fixes avoids all that trouble, in practice it just causes a reliably even bigger mess.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
This one is from 2021. Iirc it's the case that motivated us to improve the commiter documentation and make it clear that only maintainers should do cherry-picks. Occasionally people don't know and screw up.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
This one was a maintainer action by Dave and me, where we went in and rebased an entire -next tree. Also from 2021, even more exceptional than the "committer cherry-picked themselves and screwed up".
I'm not saying that the cherry-pick model with committers is error free. Not at all. It's just in my experience substantially less error prone than anything else, it's simply the less shit option.
Roughly the options are:
- rebase trees to not have duplicated commits. Breaks the committer model, pretty much guarantees that you have commit references to absolutely nowhere at all in practice because people butcher rebases all the time. Also pisses off Linus with unecessary rebases that don't reflect actual development history.
Plus we'd insta run out of maintainers in drm if we do this.
I think this is also what Alex tried to do until very recently.
- cherry-pick, but pretend it didn't happen. This means either people perfectly fix up all tags (see above, doesn't happen in practice) or you need to do title based guessing games. Plus you need to do title-based guessing games with the duplicates anyway.
- cherry-pick -x. You can actually handle this one with scripts and no human shouting. Unless people forgot to use -x or screwed up something else (which is why we have a script and docs). Which does happene, but the two examples you've found for that flow are from 2021. There should also be some that are more recent.
- we give up on stable for drm.
Cheers, Sima
Or the following 3 commits:
0811b9e4530d ("drm/amd/display: Add HUBP surface flip interrupt handler") which has a stable tag, and no cherry-pick line.
4ded1ec8d1b3 ("drm/amd/display: Add HUBP surface flip interrupt handler") which is a different code change than the previous commit, and a completely different commit message, no stable tag, and no cherry-pick line.
7af87fc1ba13 ("drm/amd/display: Add HUBP surface flip interrupt handler") which has the same code change as above, and it has the same commit message as 4ded1ec8d1b3 but with an added stable tag, and again - no cherry-pick line.
-- Thanks, Sasha
On Tue, Jan 14, 2025 at 06:31:26PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 10:51:54AM -0500, Sasha Levin wrote:
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
> We create a "web" when we backport commits, and mark things for "Fixes:" > When we get those ids wrong because you all have duplicate commits for > the same thing, everything breaks. > > > I just don't get what the ABI the tools expect is, and why everyone is > > writing bespoke tools and getting it wrong, then blaming us for not > > conforming. Fix the tools or write new ones when you realise the > > situation is more complex than your initial ideas. > > All I want to see and care about is: > > - for a stable commit, the id that the commit is in Linus's tree. > - for a Fixes: tag, the id that matches the commit in Linus's tree AND > the commit that got backported to stable trees. > > That's it, that's the whole "ABI". The issue is that you all, for any > number of commits, have 2 unique ids for any single commit and how are > we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
This one also landed through Alex' tree, and before he switched over to cherry-pick -x and not trying to fix things up with rebasing. Because in theory rebasing bugfixes out of -next into -fixes avoids all that trouble, in practice it just causes a reliably even bigger mess.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
This one is from 2021. Iirc it's the case that motivated us to improve the commiter documentation and make it clear that only maintainers should do cherry-picks. Occasionally people don't know and screw up.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
This one was a maintainer action by Dave and me, where we went in and rebased an entire -next tree. Also from 2021, even more exceptional than the "committer cherry-picked themselves and screwed up".
I'm not saying that the cherry-pick model with committers is error free. Not at all. It's just in my experience substantially less error prone than anything else, it's simply the less shit option.
Roughly the options are:
rebase trees to not have duplicated commits. Breaks the committer model, pretty much guarantees that you have commit references to absolutely nowhere at all in practice because people butcher rebases all the time. Also pisses off Linus with unecessary rebases that don't reflect actual development history.
Plus we'd insta run out of maintainers in drm if we do this.
Bit more here, because this isn't hyperbole. drm isn't magic, we don't have more maintainer volunteers than any other subsystem. And if we'd run the show the same way as most others, we'd suffer like everyone else from overloaded and burnt out maintainers.
We fixed the "not enough maintainer volunteers" problem by radically changing the workflow, and radically reducing the amount of work maintainers have to do. But that has consequences, and that's why we cherry-pick so much.
If you center your flow around committers, then you also need to accept that for a committer the most important tree is their driver/subsystem tree, and everything else is downstream. And they don't care about downstream at that much. Exactly like how maintainers don't care that much about stable trees as their downstream, and you're trying to make it as easy as possible for them.
Roughly translating things:
- For you, stable is the downstream that cherry-picks from the main development branch. For drm committers, drm-fixes is their downstream that cherry-picks from the development tree (and everything else is even further downstream).
- For you Linus' git tree is the development branch you cherry-pick from. For drm committers the drm-foo-next branch is their development branch that we cherry-pick from.
- You asking us to not cherry-pick but instead do the classic maintainer approach of filtering out fixes into foo-fixes branches is the same as if you'd ask maintainers to send bugfixes for stable to you directly, rebase them out of their pr to Linus and then backmerge. This is total bullocks, because stable isn't the development branch.
The only difference is that for drm committers, Linus' tree is also not the development branch, it's the drm-next branches. And sure drm is like 10x smaller than the kernel overall, but that doesn't mean your request that we rebase our development branch because downstream (meaning drm-fixes, which feeds into Linus' git) can't cope with cherry-picks is 10x more reasonable. It's still fundamentally a busted workflow.
And if you ditch the notion that for committers drm-next is their development branch, then you cannot have a committer model. And if you ditch that drm is as starved for maintainer volunteer time as any other subsystem, because we're really not special.
Cheers, Sima
I think this is also what Alex tried to do until very recently.
cherry-pick, but pretend it didn't happen. This means either people perfectly fix up all tags (see above, doesn't happen in practice) or you need to do title based guessing games. Plus you need to do title-based guessing games with the duplicates anyway.
cherry-pick -x. You can actually handle this one with scripts and no human shouting. Unless people forgot to use -x or screwed up something else (which is why we have a script and docs). Which does happene, but the two examples you've found for that flow are from 2021. There should also be some that are more recent.
we give up on stable for drm.
Cheers, Sima
Or the following 3 commits:
0811b9e4530d ("drm/amd/display: Add HUBP surface flip interrupt handler") which has a stable tag, and no cherry-pick line.
4ded1ec8d1b3 ("drm/amd/display: Add HUBP surface flip interrupt handler") which is a different code change than the previous commit, and a completely different commit message, no stable tag, and no cherry-pick line.
7af87fc1ba13 ("drm/amd/display: Add HUBP surface flip interrupt handler") which has the same code change as above, and it has the same commit message as 4ded1ec8d1b3 but with an added stable tag, and again - no cherry-pick line.
-- Thanks, Sasha
-- Simona Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On Wed, Jan 15, 2025 at 10:07:47AM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 06:31:26PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 10:51:54AM -0500, Sasha Levin wrote:
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
> > We create a "web" when we backport commits, and mark things for "Fixes:" > > When we get those ids wrong because you all have duplicate commits for > > the same thing, everything breaks. > > > > > I just don't get what the ABI the tools expect is, and why everyone is > > > writing bespoke tools and getting it wrong, then blaming us for not > > > conforming. Fix the tools or write new ones when you realise the > > > situation is more complex than your initial ideas. > > > > All I want to see and care about is: > > > > - for a stable commit, the id that the commit is in Linus's tree. > > - for a Fixes: tag, the id that matches the commit in Linus's tree AND > > the commit that got backported to stable trees. > > > > That's it, that's the whole "ABI". The issue is that you all, for any > > number of commits, have 2 unique ids for any single commit and how are > > we supposed to figure that mess out... > > Pretty sure we've explained how a few times now, not sure we can do much more.
And the same for me.
> If you see a commit with a cherry-pick link in it and don't have any > sight on that commit in Linus's tree, ignore the cherry-pick link in > it, assume it's a future placeholder for that commit id. You could if > you wanted to store that info somewhere, but there shouldn't be a > need.
Ok, this is "fine", I can live with it. BUT that's not the real issue (and your own developers get confused by this, again, look at the original email that started this all, they used an invalid git id to send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
> When future tools are analysing things, they will see the patch from > the merge window, the cherry-picked patches in the fixes tree, and > stable will reference the fixes, and the fixes patch will reference > the merge window one?
> but I think when we cherry-pick patches from -next that fix > other patches from -next maybe the fixes lines should be reworked to > reference the previous Linus tree timeline not the future one. not > 100% sure this happens? Sima might know more.
Please fix this up, if you all can. That is the issue here. And again, same for reverts.
I think between the two, this is causing many fixes and reverts to go unresolved in the stable trees.
> Now previously I think we'd be requested to remove the cherry-picks > from the -fixes commits as they were referencing things not in Linus' > tree, we said it was a bad idea, I think we did it anyways, we got > shouted at, we put it back, we get shouted that we are referencing > commits that aren't in Linus tree. Either the link is useful > information and we just assume cherry-picks of something we can't see > are a future placeholder and ignore it until it shows up in our > timeline.
I still think it's lunacy to have a "cherry pick" commit refer to a commit that is NOT IN THE TREE YET and shows up in history as "IN THE FUTURE". But hey, that's just me.
Why do you have these markings at all? Who are they helping? Me? Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
> I think we could ask to not merge things into -next with stable cc'ed > but I think that will result in a loss of valuable fixes esp for > backporters.
Again, it's the Fixes and Reverts id referencing that is all messed up here. That needs to be resolved. If it takes you all the effort to make up a special "stable tree only" branch/series/whatever, I'm all for it, but as it is now, what you all are doing is NOT working for me at all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
This one also landed through Alex' tree, and before he switched over to cherry-pick -x and not trying to fix things up with rebasing. Because in theory rebasing bugfixes out of -next into -fixes avoids all that trouble, in practice it just causes a reliably even bigger mess.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
This one is from 2021. Iirc it's the case that motivated us to improve the commiter documentation and make it clear that only maintainers should do cherry-picks. Occasionally people don't know and screw up.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
This one was a maintainer action by Dave and me, where we went in and rebased an entire -next tree. Also from 2021, even more exceptional than the "committer cherry-picked themselves and screwed up".
I'm not saying that the cherry-pick model with committers is error free. Not at all. It's just in my experience substantially less error prone than anything else, it's simply the less shit option.
Roughly the options are:
rebase trees to not have duplicated commits. Breaks the committer model, pretty much guarantees that you have commit references to absolutely nowhere at all in practice because people butcher rebases all the time. Also pisses off Linus with unecessary rebases that don't reflect actual development history.
Plus we'd insta run out of maintainers in drm if we do this.
Bit more here, because this isn't hyperbole. drm isn't magic, we don't have more maintainer volunteers than any other subsystem. And if we'd run the show the same way as most others, we'd suffer like everyone else from overloaded and burnt out maintainers.
We fixed the "not enough maintainer volunteers" problem by radically changing the workflow, and radically reducing the amount of work maintainers have to do. But that has consequences, and that's why we cherry-pick so much.
If you center your flow around committers, then you also need to accept that for a committer the most important tree is their driver/subsystem tree, and everything else is downstream. And they don't care about downstream at that much. Exactly like how maintainers don't care that much about stable trees as their downstream, and you're trying to make it as easy as possible for them.
Roughly translating things:
For you, stable is the downstream that cherry-picks from the main development branch. For drm committers, drm-fixes is their downstream that cherry-picks from the development tree (and everything else is even further downstream).
For you Linus' git tree is the development branch you cherry-pick from. For drm committers the drm-foo-next branch is their development branch that we cherry-pick from.
You asking us to not cherry-pick but instead do the classic maintainer approach of filtering out fixes into foo-fixes branches is the same as if you'd ask maintainers to send bugfixes for stable to you directly, rebase them out of their pr to Linus and then backmerge. This is total bullocks, because stable isn't the development branch.
No, I can live with you all cherry-picking as that seems to make your life easier, what I am complaining about is when that cherry-picking causes massive confusion as the Fixes: and Revert ids end up showing going "back in time" and pointing to the wrong commit.
And note, the commit that caused this recent thread DID actually confuse your own developers, and they used the wrong git id as well, so it's not like your own developers don't get confused either.
It's your tree, you all run it like you want to, I'm just pointing out that the current way you all are running it IS causing problems for those of us who have to deal with the result of it.
thanks,
greg k-h
On Wed, Jan 15, 2025 at 10:38:34AM +0100, Greg KH wrote:
On Wed, Jan 15, 2025 at 10:07:47AM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 06:31:26PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 10:51:54AM -0500, Sasha Levin wrote:
On Tue, Jan 14, 2025 at 04:03:09PM +0100, Simona Vetter wrote:
On Tue, Jan 14, 2025 at 11:01:34AM +1000, Dave Airlie wrote:
> > > We create a "web" when we backport commits, and mark things for "Fixes:" > > > When we get those ids wrong because you all have duplicate commits for > > > the same thing, everything breaks. > > > > > > > I just don't get what the ABI the tools expect is, and why everyone is > > > > writing bespoke tools and getting it wrong, then blaming us for not > > > > conforming. Fix the tools or write new ones when you realise the > > > > situation is more complex than your initial ideas. > > > > > > All I want to see and care about is: > > > > > > - for a stable commit, the id that the commit is in Linus's tree. > > > - for a Fixes: tag, the id that matches the commit in Linus's tree AND > > > the commit that got backported to stable trees. > > > > > > That's it, that's the whole "ABI". The issue is that you all, for any > > > number of commits, have 2 unique ids for any single commit and how are > > > we supposed to figure that mess out... > > > > Pretty sure we've explained how a few times now, not sure we can do much more. > > And the same for me. > > > If you see a commit with a cherry-pick link in it and don't have any > > sight on that commit in Linus's tree, ignore the cherry-pick link in > > it, assume it's a future placeholder for that commit id. You could if > > you wanted to store that info somewhere, but there shouldn't be a > > need. > > Ok, this is "fine", I can live with it. BUT that's not the real issue > (and your own developers get confused by this, again, look at the > original email that started this all, they used an invalid git id to > send to us thinking that was the correct id to use.)
I'm going to go back and look at the one you pointed out as I'm missing the issue with it, I thought it was due to a future ID being used.
I think the issue is that with the cherry-picking we do, we don't update the Fixes: or Reverts: lines, so those still point at the og commit in -next, while the cherry-picked commit is in -fixes.
The fix for that (which our own cherry-pick scripts implement iirc) is to keep track of the cherry-picks (which is why we add that line) and treat them as aliases.
So if you have a Fixes: $sha1 pointing at -next, then if you do a full-text commit message search for (cherry picked from $sha1), you should be able to find it.
We could try to do that lookup with the cherry-pick scripts, but a lot of folks hand-roll these, so it's lossy at best. Plus you already have to keep track of aliases anyway since you're cherry-picking to stable, so I was assuming that this shouldn't cause additional issues.
The other part is that if you already have a cherry picked from $sha1 in your history, even if it wasn't done with stable cherry-pick, then you don't have to cherry-pick again. These should be easy to filter out.
But maybe I'm also not understanding what the issue is, I guess would need to look at a specific example.
> > When future tools are analysing things, they will see the patch from > > the merge window, the cherry-picked patches in the fixes tree, and > > stable will reference the fixes, and the fixes patch will reference > > the merge window one? > > > > but I think when we cherry-pick patches from -next that fix > > other patches from -next maybe the fixes lines should be reworked to > > reference the previous Linus tree timeline not the future one. not > > 100% sure this happens? Sima might know more. > > Please fix this up, if you all can. That is the issue here. And again, > same for reverts. > > I think between the two, this is causing many fixes and reverts to go > unresolved in the stable trees. > > > Now previously I think we'd be requested to remove the cherry-picks > > from the -fixes commits as they were referencing things not in Linus' > > tree, we said it was a bad idea, I think we did it anyways, we got > > shouted at, we put it back, we get shouted that we are referencing > > commits that aren't in Linus tree. Either the link is useful > > information and we just assume cherry-picks of something we can't see > > are a future placeholder and ignore it until it shows up in our > > timeline. > > I still think it's lunacy to have a "cherry pick" commit refer to a > commit that is NOT IN THE TREE YET and shows up in history as "IN THE > FUTURE". But hey, that's just me. > > Why do you have these markings at all? Who are they helping? Me? > Someone else?
They are for helping you. Again if the commit that goes into -next is immutable, there is no way for it to reference the commit that goes into -fixes ahead of it.
The commit in -fixes needs to add the link to the future commit in -next, that link is the cherry-pick statement.
When you get the future commit into the stable queue, you look for the commit id in stable history as a cherry-pick and drop it if it's already there.
I can't see any other way to do this, the future commit id is a placeholder in Linus/stable tree, the commit is immutable and 99.99% of the time it will arrive at some future point in time.
I'm open to how you would make this work that isn't lunacy, but I can't really see a way since git commits are immutable.
Yeah the (cherry picked from $sha1) with a sha1 that's in -next and almost always shows up in Linus' tree in the future shouldn't be an issue. That part really is required for driver teams to manage their flows.
> > I think we could ask to not merge things into -next with stable cc'ed > > but I think that will result in a loss of valuable fixes esp for > > backporters. > > Again, it's the Fixes and Reverts id referencing that is all messed up > here. That needs to be resolved. If it takes you all the effort to > make up a special "stable tree only" branch/series/whatever, I'm all for > it, but as it is now, what you all are doing is NOT working for me at > all.
I'll have to see if anyone is willing to consider pulling this sort of feat off, it's not a small task, and it would have to be 99% automated I think to be not too burdensome.
It's not that hard to script, dim cherry-pick already does it. It's the part where we need to guarantee that we never ever let one slip through didn't get this treatment of replacing the sha1.
The even more insideous one is when people rebase their -next or -fixes trees, since then the sha1 will really never ever show up. Which is why we're telling people to not mess with git history at all and instead cherry-pick. It's the lesser pain.
But this does happen with cherry picks... A few examples from what I saw with drivers/gpu/drm/ and -stable:
5a507b7d2be1 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") which landed as 8a0a7b98d4b6 ("drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2") rather than 4545614c1d8da.
This one also landed through Alex' tree, and before he switched over to cherry-pick -x and not trying to fix things up with rebasing. Because in theory rebasing bugfixes out of -next into -fixes avoids all that trouble, in practice it just causes a reliably even bigger mess.
e89afb51f97a ("drm/vmwgfx: Fix a 64bit regression on svga3") which landed as c2aaa37dc18f ("drm/vmwgfx: Fix a 64bit regression on svga3") rather than 873601687598.
This one is from 2021. Iirc it's the case that motivated us to improve the commiter documentation and make it clear that only maintainers should do cherry-picks. Occasionally people don't know and screw up.
a829f033e966 ("drm/i915: Wedge the GPU if command parser setup fails") which indicates it's a cherry-pick, but I couldn't find the equivalent commit landing at any point later on.
This one was a maintainer action by Dave and me, where we went in and rebased an entire -next tree. Also from 2021, even more exceptional than the "committer cherry-picked themselves and screwed up".
I'm not saying that the cherry-pick model with committers is error free. Not at all. It's just in my experience substantially less error prone than anything else, it's simply the less shit option.
Roughly the options are:
rebase trees to not have duplicated commits. Breaks the committer model, pretty much guarantees that you have commit references to absolutely nowhere at all in practice because people butcher rebases all the time. Also pisses off Linus with unecessary rebases that don't reflect actual development history.
Plus we'd insta run out of maintainers in drm if we do this.
Bit more here, because this isn't hyperbole. drm isn't magic, we don't have more maintainer volunteers than any other subsystem. And if we'd run the show the same way as most others, we'd suffer like everyone else from overloaded and burnt out maintainers.
We fixed the "not enough maintainer volunteers" problem by radically changing the workflow, and radically reducing the amount of work maintainers have to do. But that has consequences, and that's why we cherry-pick so much.
If you center your flow around committers, then you also need to accept that for a committer the most important tree is their driver/subsystem tree, and everything else is downstream. And they don't care about downstream at that much. Exactly like how maintainers don't care that much about stable trees as their downstream, and you're trying to make it as easy as possible for them.
Roughly translating things:
For you, stable is the downstream that cherry-picks from the main development branch. For drm committers, drm-fixes is their downstream that cherry-picks from the development tree (and everything else is even further downstream).
For you Linus' git tree is the development branch you cherry-pick from. For drm committers the drm-foo-next branch is their development branch that we cherry-pick from.
You asking us to not cherry-pick but instead do the classic maintainer approach of filtering out fixes into foo-fixes branches is the same as if you'd ask maintainers to send bugfixes for stable to you directly, rebase them out of their pr to Linus and then backmerge. This is total bullocks, because stable isn't the development branch.
No, I can live with you all cherry-picking as that seems to make your life easier, what I am complaining about is when that cherry-picking causes massive confusion as the Fixes: and Revert ids end up showing going "back in time" and pointing to the wrong commit.
And note, the commit that caused this recent thread DID actually confuse your own developers, and they used the wrong git id as well, so it's not like your own developers don't get confused either.
It's your tree, you all run it like you want to, I'm just pointing out that the current way you all are running it IS causing problems for those of us who have to deal with the result of it.
So my understanding is that you got confused by this:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
$ git log --grep="(cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)" --since="6 month ago" --pretty=oneline f0ed39830e6064d62f9c5393505677a26569bb56
And yes f0ed39830e6064d62f9c5393505677a26569bb56 is the commit you care about for stable backport and cve tracking purposes, because it's the one in v6.13-rc6.
And the thing is, Sasha's bot found that one too:
https://lore.kernel.org/all/20250110164811-61a12d6905bb8676@stable.kernel.or...
Except Sasha's bot plays guessing games, the above git log query is exact.
Like I tried to explain in my reply to Sasha somewhere else in this thread it really only takes two things: - drm maintainers consistently add cherry picked from lines anytime we cherry-pick - you adjust your script to go hunt for the cherry pick alias if you get a sha1 that makes no sense, so that you can put in the right sha1. And if you do that for any sha1 you find (whether upstream references, Fixes: or Reverts or stable candidate commits or whatever really), it will sort out all the things we've been shouting about for years now. Automatically, without human intervention, because it's just a git oneliner.
Of course people will still screw up, and Sasha found two examples from 2021 for that. And assuming you do add the above fallback to find the "right" sha1 to your scripts I'll happily promise that Dave&me will make damn sure it gets correctly added everywhere. We _want_ to scripts this all as much as possible too, and not cause you endless amounts of exceptions.
And we've been adding these cherry-picked lines intentionally from the start of the committer model exactly because we wanted to make sure you can handle the fallout with just scripts.
Note that most (maybe all?) the amd examples are because Alex has been trying to do what you're asking for here, and in theory that works, but in practice it's just so much worse.
Cheers, Sima
On Wed, Jan 15, 2025 at 12:15:46PM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 10:38:34AM +0100, Greg KH wrote: So my understanding is that you got confused by this:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
$ git log --grep="(cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)" --since="6 month ago" --pretty=oneline f0ed39830e6064d62f9c5393505677a26569bb56
And yes f0ed39830e6064d62f9c5393505677a26569bb56 is the commit you care about for stable backport and cve tracking purposes, because it's the one in v6.13-rc6.
And the thing is, Sasha's bot found that one too:
https://lore.kernel.org/all/20250110164811-61a12d6905bb8676@stable.kernel.or...
Except Sasha's bot plays guessing games, the above git log query is exact.
Cool, can we test it out? I'll try and pick a recent commit (2024).
Let's assume that I'm looking at the v6.10 git tree before 50aec9665e0b ("drm/xe: Use ordered WQ for G2H handler") made it upstream ("git checkout v6.10" will do the trick), and I get a backport request that says:
commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de upstream
I run my trusty script that says "50aec9665e0b isn't real, grep for cherry picked from line!". My trusty script runs the query you've provided:
$ git log --grep="(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)" --since="6 month ago" --pretty=oneline 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 drm/xe: Use ordered WQ for G2H handler c002bfe644a29ba600c571f2abba13a155a12dcd drm/xe: Use ordered WQ for G2H handler
Which commit do I pick? Note that they are slightly different from eachother, and c002bfe644 landed in v6.9 while 2d9c72f676 landed in v6.10.
Like I tried to explain in my reply to Sasha somewhere else in this thread it really only takes two things:
- drm maintainers consistently add cherry picked from lines anytime we
cherry-pick
- you adjust your script to go hunt for the cherry pick alias if you get a
sha1 that makes no sense, so that you can put in the right sha1. And if you do that for any sha1 you find (whether upstream references, Fixes: or Reverts or stable candidate commits or whatever really), it will sort out all the things we've been shouting about for years now.
We still have holes here... For example, this backport claims to:
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
But 8135f1c09dd2 is a cherry-pick:
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
In the future, if we get a new patch that says:
Fixes: 0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
By your logic, our scripts will look at it and say "0c8650b09a36 is a real commit, but it's not in linux-6.12.y so there's no need to backport the fix".
Which is the wrong thing to do, because we have 8135f1c09dd2 in linux-6.12.y.
So no, this isn't a simple trace-the-cherry-pick-tags exercise.
Automatically, without human intervention, because it's just a git oneliner.
So look at the backport in question which started this thread. The backporter ends up with:
""" [...]
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
[...]
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
"""
Where most of the git IDs in it are invalid right now :)
On Wed, Jan 15, 2025 at 12:18:13PM -0500, Sasha Levin wrote:
On Wed, Jan 15, 2025 at 12:15:46PM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 10:38:34AM +0100, Greg KH wrote: So my understanding is that you got confused by this:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
$ git log --grep="(cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)" --since="6 month ago" --pretty=oneline f0ed39830e6064d62f9c5393505677a26569bb56
And yes f0ed39830e6064d62f9c5393505677a26569bb56 is the commit you care about for stable backport and cve tracking purposes, because it's the one in v6.13-rc6.
And the thing is, Sasha's bot found that one too:
https://lore.kernel.org/all/20250110164811-61a12d6905bb8676@stable.kernel.or...
Except Sasha's bot plays guessing games, the above git log query is exact.
Cool, can we test it out? I'll try and pick a recent commit (2024).
Thanks a lot for specific examples, makes it much easier to walk through them and show how much cherry pick tracing is neeeded.
Let's assume that I'm looking at the v6.10 git tree before 50aec9665e0b ("drm/xe: Use ordered WQ for G2H handler") made it upstream ("git checkout v6.10" will do the trick), and I get a backport request that says:
commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de upstream
I run my trusty script that says "50aec9665e0b isn't real, grep for cherry picked from line!". My trusty script runs the query you've provided:
$ git log --grep="(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)" --since="6 month ago" --pretty=oneline 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 drm/xe: Use ordered WQ for G2H handler c002bfe644a29ba600c571f2abba13a155a12dcd drm/xe: Use ordered WQ for G2H handler
Which commit do I pick? Note that they are slightly different from eachother, and c002bfe644 landed in v6.9 while 2d9c72f676 landed in v6.10.
Ok let me first explain why this happens. drm subtrees feature-freeze around -rc6/7, to make sure that when the merge window is open we don't have a buggy chaotic mess but are ready. Which means for that short amount of time there's 3 trees:
- drm-intel-next, which aims at v6.11 at this point and contains 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de - drm-intel-next-fixes, which aims for v6.10-rc and contains the cherry-picked version 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 of the same commit - drm-intel-fixes, which aims at v6.9-rc and contains yet another cherry-picked version c002bfe644a29ba600c571f2abba13a155a12dcd of the same commit
Now we generally rotate maintainership among releases, so Thomas Hellström was taking care of anything needed for v6.10 and Lucas De Marchi for v6.9, and they both individually did the necessary cherry pick at pretty much the same time. And so we end up with two cherry-picks of the same commit. At other times you might end up with a chain of cherry-picks, it's all the same.
Now looking back this is it's very silly, but it's a lot less silly as stuff gets merged into Linus' tree.
First v6.9 gets tagged, which contains c002bfe644a29ba600c571f. You don't have that in v6.8.y yet, so you cherry-pick it over. Nothing special here. If you feel like you also assign a CVE for that upstream commit.
Then v6.10-rc1 is released, which contains 2d9c72f676e6f79a021b74c6. You don't have that, or a cherry pick of that commit in any of your stable trees, so might be tempted to try to backport it and then realize you seem to have a duplicate of this commit already.
But you're not yet done cherry-pick tracing, because 2d9c72f676e6f79a021b7 contains the following line in its commit message:
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
So you also need to check whether you have any cherry-picks of 50aec9665e0babd:
- v6.9.y contains c002bfe644a29ba600c, so doesn't need another backport of that. You skip this commit.
- v6.8.y and older stable trees contain a backport of c002bfe644a29ba6, and if you didn't delete any cherry-picked lines all those backports still contain
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
line, so you know you're good and don't need another copy of that commit.
Since this commit gets completely filtered you're also not tempted to assign a new CVE for this one. So no risk of duplicates.
Next up v6.11-rc1 is released, which contains 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de. You check all your stable release branches and notice they all contain a cherry pick already:
- v6.10.y has 2d9c72f676e6f79a021b7 - v6.9.y has c002bfe644a29ba600c571f - v6.8.y and older have a cherry pick of c002bfe644a29ba600c571f
Again this commit is completely filtered out with cherry-pick tracin checks.
Note that except for the first patch none of this mess should ever get past scripted filtering. Which means it should not confuse stable maintainers, and it also should not result in multiple CVEs getting assigned. Because only the first commit (c002bfe644a29ba600c in v6.9) will ever get past all the cherry pick tracing checks.
Like I tried to explain in my reply to Sasha somewhere else in this thread it really only takes two things:
- drm maintainers consistently add cherry picked from lines anytime we
cherry-pick
- you adjust your script to go hunt for the cherry pick alias if you get a
sha1 that makes no sense, so that you can put in the right sha1. And if you do that for any sha1 you find (whether upstream references, Fixes: or Reverts or stable candidate commits or whatever really), it will sort out all the things we've been shouting about for years now.
We still have holes here... For example, this backport claims to:
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
But 8135f1c09dd2 is a cherry-pick:
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
In the future, if we get a new patch that says:
Fixes: 0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
By your logic, our scripts will look at it and say "0c8650b09a36 is a real commit, but it's not in linux-6.12.y so there's no need to backport the fix".
Which is the wrong thing to do, because we have 8135f1c09dd2 in linux-6.12.y.
So no, this isn't a simple trace-the-cherry-pick-tags exercise.
So this is not quite what you do, because before you drop this patch you have to check whether you have a cherry-pick of 8135f1c09dd2 in your stable branch.
Now if we assume that only the stable team does cherry picks, you can limit your search to just the v6.12.y branch from the v6.12, and you'd wrongly conclude that you don't have a cherry pick there. But since the drm maintainers also do cherry-picks then limiting yourselv to just the patches you've applied to v6.12 will miss some, so you need to cherry-pick trace some more (I think 6 months into the past is generally enough, from your starting point commit).
And then you'll find that 8135f1c09dd2 is in v6.12 and a cherry-pick of 0c8650b09a36 so you need that bugfix.
Automatically, without human intervention, because it's just a git oneliner.
So look at the backport in question which started this thread. The backporter ends up with:
""" [...]
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
Cherry pick tracing says this is f0ed39830e6064d62f9c5393505677a26569bb56 which is in v6.13-rc1.
[...]
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
This one is in v6.12, but it's already a cherry-pick, so you need to be careful and look for possible other versions, because its commit message contains
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
But further cherry-pick tracing didn't show up any more commits that we didn't know of, so it's all done.
And 0c8650b09a365f4a31fca1d1d1e9d99c56071128 itself is in v6.13-rc1, so that resolves.
Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
This commit will show up in v6.14-rc1. Currently all you can use this for is as a lookup key to find other cherry-pick copies.
Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)o
This commit is in v6.13-rc6, because that -rc series also needed the cherry-pick since the broken commit was in v6.13-rc1. And this commit itself is correctly annotated as a cherry-pick of 55039832f98c7e05f1, so it all checks out.
"""
Where most of the git IDs in it are invalid right now :)
They all make sense, but sometimes you do have to do a bit more cherry-pick tracing than what you've done.
Sometimes you need to do multiple levels of tracing, sometimes you start at a cherry-pick and need to fish out the original sha1 first (even if that's not one that resolves for you, you can still use it to find more cherry-picks). And because the stable team isn't the only maintainer team doing cherry-picks, you need to broaden your query a bit and can't limit yourself to your own cherry-picks only, to make sure you get them all.
Happy to walk you through even more special-case ladden examples. I tried to think them all through when cooking up the committer model years ago, I might indeed have missed a case. But pretty sure that the answer will be "you need more cherry-pick tracing".
Cheers, Sima
On Wed, Jan 15, 2025 at 08:02:01PM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 12:18:13PM -0500, Sasha Levin wrote:
On Wed, Jan 15, 2025 at 12:15:46PM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 10:38:34AM +0100, Greg KH wrote: So my understanding is that you got confused by this:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
$ git log --grep="(cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)" --since="6 month ago" --pretty=oneline f0ed39830e6064d62f9c5393505677a26569bb56
And yes f0ed39830e6064d62f9c5393505677a26569bb56 is the commit you care about for stable backport and cve tracking purposes, because it's the one in v6.13-rc6.
And the thing is, Sasha's bot found that one too:
https://lore.kernel.org/all/20250110164811-61a12d6905bb8676@stable.kernel.or...
Except Sasha's bot plays guessing games, the above git log query is exact.
Cool, can we test it out? I'll try and pick a recent commit (2024).
Thanks a lot for specific examples, makes it much easier to walk through them and show how much cherry pick tracing is neeeded.
Let's assume that I'm looking at the v6.10 git tree before 50aec9665e0b ("drm/xe: Use ordered WQ for G2H handler") made it upstream ("git checkout v6.10" will do the trick), and I get a backport request that says:
commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de upstream
I run my trusty script that says "50aec9665e0b isn't real, grep for cherry picked from line!". My trusty script runs the query you've provided:
$ git log --grep="(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)" --since="6 month ago" --pretty=oneline 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 drm/xe: Use ordered WQ for G2H handler c002bfe644a29ba600c571f2abba13a155a12dcd drm/xe: Use ordered WQ for G2H handler
Which commit do I pick? Note that they are slightly different from eachother, and c002bfe644 landed in v6.9 while 2d9c72f676 landed in v6.10.
Ok let me first explain why this happens. drm subtrees feature-freeze around -rc6/7, to make sure that when the merge window is open we don't have a buggy chaotic mess but are ready. Which means for that short amount of time there's 3 trees:
- drm-intel-next, which aims at v6.11 at this point and contains 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de
- drm-intel-next-fixes, which aims for v6.10-rc and contains the cherry-picked version 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 of the same commit
- drm-intel-fixes, which aims at v6.9-rc and contains yet another cherry-picked version c002bfe644a29ba600c571f2abba13a155a12dcd of the same commit
Now we generally rotate maintainership among releases, so Thomas Hellström was taking care of anything needed for v6.10 and Lucas De Marchi for v6.9, and they both individually did the necessary cherry pick at pretty much the same time. And so we end up with two cherry-picks of the same commit. At other times you might end up with a chain of cherry-picks, it's all the same.
Now looking back this is it's very silly, but it's a lot less silly as stuff gets merged into Linus' tree.
First v6.9 gets tagged, which contains c002bfe644a29ba600c571f. You don't have that in v6.8.y yet, so you cherry-pick it over. Nothing special here. If you feel like you also assign a CVE for that upstream commit.
Then v6.10-rc1 is released, which contains 2d9c72f676e6f79a021b74c6. You don't have that, or a cherry pick of that commit in any of your stable trees, so might be tempted to try to backport it and then realize you seem to have a duplicate of this commit already.
But you're not yet done cherry-pick tracing, because 2d9c72f676e6f79a021b7 contains the following line in its commit message:
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
So you also need to check whether you have any cherry-picks of 50aec9665e0babd:
v6.9.y contains c002bfe644a29ba600c, so doesn't need another backport of that. You skip this commit.
v6.8.y and older stable trees contain a backport of c002bfe644a29ba6, and if you didn't delete any cherry-picked lines all those backports still contain
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
line, so you know you're good and don't need another copy of that commit.
Since this commit gets completely filtered you're also not tempted to assign a new CVE for this one. So no risk of duplicates.
Next up v6.11-rc1 is released, which contains 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de. You check all your stable release branches and notice they all contain a cherry pick already:
- v6.10.y has 2d9c72f676e6f79a021b7
- v6.9.y has c002bfe644a29ba600c571f
- v6.8.y and older have a cherry pick of c002bfe644a29ba600c571f
Again this commit is completely filtered out with cherry-pick tracin checks.
Note that except for the first patch none of this mess should ever get past scripted filtering. Which means it should not confuse stable maintainers, and it also should not result in multiple CVEs getting assigned. Because only the first commit (c002bfe644a29ba600c in v6.9) will ever get past all the cherry pick tracing checks.
Like I tried to explain in my reply to Sasha somewhere else in this thread it really only takes two things:
- drm maintainers consistently add cherry picked from lines anytime we
cherry-pick
- you adjust your script to go hunt for the cherry pick alias if you get a
sha1 that makes no sense, so that you can put in the right sha1. And if you do that for any sha1 you find (whether upstream references, Fixes: or Reverts or stable candidate commits or whatever really), it will sort out all the things we've been shouting about for years now.
We still have holes here... For example, this backport claims to:
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
But 8135f1c09dd2 is a cherry-pick:
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
In the future, if we get a new patch that says:
Fixes: 0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
By your logic, our scripts will look at it and say "0c8650b09a36 is a real commit, but it's not in linux-6.12.y so there's no need to backport the fix".
Which is the wrong thing to do, because we have 8135f1c09dd2 in linux-6.12.y.
So no, this isn't a simple trace-the-cherry-pick-tags exercise.
So this is not quite what you do, because before you drop this patch you have to check whether you have a cherry-pick of 8135f1c09dd2 in your stable branch.
Now if we assume that only the stable team does cherry picks, you can limit your search to just the v6.12.y branch from the v6.12, and you'd wrongly conclude that you don't have a cherry pick there. But since the drm maintainers also do cherry-picks then limiting yourselv to just the patches you've applied to v6.12 will miss some, so you need to cherry-pick trace some more (I think 6 months into the past is generally enough, from your starting point commit).
And then you'll find that 8135f1c09dd2 is in v6.12 and a cherry-pick of 0c8650b09a36 so you need that bugfix.
Automatically, without human intervention, because it's just a git oneliner.
So look at the backport in question which started this thread. The backporter ends up with:
""" [...]
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
Cherry pick tracing says this is f0ed39830e6064d62f9c5393505677a26569bb56 which is in v6.13-rc1.
[...]
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
This one is in v6.12, but it's already a cherry-pick, so you need to be careful and look for possible other versions, because its commit message contains
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
But further cherry-pick tracing didn't show up any more commits that we didn't know of, so it's all done.
And 0c8650b09a365f4a31fca1d1d1e9d99c56071128 itself is in v6.13-rc1, so that resolves.
Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
This commit will show up in v6.14-rc1. Currently all you can use this for is as a lookup key to find other cherry-pick copies.
Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)o
This commit is in v6.13-rc6, because that -rc series also needed the cherry-pick since the broken commit was in v6.13-rc1. And this commit itself is correctly annotated as a cherry-pick of 55039832f98c7e05f1, so it all checks out.
"""
Where most of the git IDs in it are invalid right now :)
They all make sense, but sometimes you do have to do a bit more cherry-pick tracing than what you've done.
Sometimes you need to do multiple levels of tracing, sometimes you start at a cherry-pick and need to fish out the original sha1 first (even if that's not one that resolves for you, you can still use it to find more cherry-picks). And because the stable team isn't the only maintainer team doing cherry-picks, you need to broaden your query a bit and can't limit yourself to your own cherry-picks only, to make sure you get them all.
Happy to walk you through even more special-case ladden examples. I tried to think them all through when cooking up the committer model years ago, I might indeed have missed a case. But pretty sure that the answer will be "you need more cherry-pick tracing".
Maybe also helps to go back from examples to the generic algorithm, which is two steps:
1. You first need to find the root sha1 that all the cherry picks originate from. If the sha1 you have doesn't resolve, you skip this. Otherwise look at the commit message, if it has a "(cherry picked from $sha)" line you pick the first line (they're ordered like sob lines), and that's the root sha1. It might not resolve, but it's the search key you need.
2. You go hunt for all cherry pick aliases for that root sha1. Strictly speaking you'd need to search the entire history, but in practice commits only travel back in time by 3-4 months at most (or a bit less thatn 2 full kernel releases). Half a year is the defensive assumption. So there's two subcases:
2a) If you want to find the right commit in upstream, you scan half a year of history in Linus' repo starting from the current tip.
2b) If you want to check for cherry picks in a stable branch, either to check whether you already have that backport, or whether you need to backport the bugfix for a buggy backport (due to Fixes/Revert: lines). You scan all the stable commits, plus half a year of history from the release tag Linus has done.
It's a bit madness, but more importantly, it's scriptable madness. And since this confuses drm maintainers too, we do have that partially implemented in our own scripts, to make sure we don't miss any bugfixes we need in drm-fixes. Partially because we never have unresolved sha1 in our own repos, because they also contain the -next branches where the root commit is.
Cheers, Sima
On Thu, Jan 16, 2025 at 10:48:45AM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 08:02:01PM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 12:18:13PM -0500, Sasha Levin wrote:
On Wed, Jan 15, 2025 at 12:15:46PM +0100, Simona Vetter wrote:
On Wed, Jan 15, 2025 at 10:38:34AM +0100, Greg KH wrote: So my understanding is that you got confused by this:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
$ git log --grep="(cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)" --since="6 month ago" --pretty=oneline f0ed39830e6064d62f9c5393505677a26569bb56
And yes f0ed39830e6064d62f9c5393505677a26569bb56 is the commit you care about for stable backport and cve tracking purposes, because it's the one in v6.13-rc6.
And the thing is, Sasha's bot found that one too:
https://lore.kernel.org/all/20250110164811-61a12d6905bb8676@stable.kernel.or...
Except Sasha's bot plays guessing games, the above git log query is exact.
Cool, can we test it out? I'll try and pick a recent commit (2024).
Thanks a lot for specific examples, makes it much easier to walk through them and show how much cherry pick tracing is neeeded.
Let's assume that I'm looking at the v6.10 git tree before 50aec9665e0b ("drm/xe: Use ordered WQ for G2H handler") made it upstream ("git checkout v6.10" will do the trick), and I get a backport request that says:
commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de upstream
I run my trusty script that says "50aec9665e0b isn't real, grep for cherry picked from line!". My trusty script runs the query you've provided:
$ git log --grep="(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)" --since="6 month ago" --pretty=oneline 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 drm/xe: Use ordered WQ for G2H handler c002bfe644a29ba600c571f2abba13a155a12dcd drm/xe: Use ordered WQ for G2H handler
Which commit do I pick? Note that they are slightly different from eachother, and c002bfe644 landed in v6.9 while 2d9c72f676 landed in v6.10.
Ok let me first explain why this happens. drm subtrees feature-freeze around -rc6/7, to make sure that when the merge window is open we don't have a buggy chaotic mess but are ready. Which means for that short amount of time there's 3 trees:
- drm-intel-next, which aims at v6.11 at this point and contains 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de
- drm-intel-next-fixes, which aims for v6.10-rc and contains the cherry-picked version 2d9c72f676e6f79a021b74c6c1c88235e7d5b722 of the same commit
- drm-intel-fixes, which aims at v6.9-rc and contains yet another cherry-picked version c002bfe644a29ba600c571f2abba13a155a12dcd of the same commit
Now we generally rotate maintainership among releases, so Thomas Hellström was taking care of anything needed for v6.10 and Lucas De Marchi for v6.9, and they both individually did the necessary cherry pick at pretty much the same time. And so we end up with two cherry-picks of the same commit. At other times you might end up with a chain of cherry-picks, it's all the same.
Now looking back this is it's very silly, but it's a lot less silly as stuff gets merged into Linus' tree.
First v6.9 gets tagged, which contains c002bfe644a29ba600c571f. You don't have that in v6.8.y yet, so you cherry-pick it over. Nothing special here. If you feel like you also assign a CVE for that upstream commit.
Then v6.10-rc1 is released, which contains 2d9c72f676e6f79a021b74c6. You don't have that, or a cherry pick of that commit in any of your stable trees, so might be tempted to try to backport it and then realize you seem to have a duplicate of this commit already.
But you're not yet done cherry-pick tracing, because 2d9c72f676e6f79a021b7 contains the following line in its commit message:
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
So you also need to check whether you have any cherry-picks of 50aec9665e0babd:
v6.9.y contains c002bfe644a29ba600c, so doesn't need another backport of that. You skip this commit.
v6.8.y and older stable trees contain a backport of c002bfe644a29ba6, and if you didn't delete any cherry-picked lines all those backports still contain
(cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
line, so you know you're good and don't need another copy of that commit.
Since this commit gets completely filtered you're also not tempted to assign a new CVE for this one. So no risk of duplicates.
Next up v6.11-rc1 is released, which contains 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de. You check all your stable release branches and notice they all contain a cherry pick already:
- v6.10.y has 2d9c72f676e6f79a021b7
- v6.9.y has c002bfe644a29ba600c571f
- v6.8.y and older have a cherry pick of c002bfe644a29ba600c571f
Again this commit is completely filtered out with cherry-pick tracin checks.
Note that except for the first patch none of this mess should ever get past scripted filtering. Which means it should not confuse stable maintainers, and it also should not result in multiple CVEs getting assigned. Because only the first commit (c002bfe644a29ba600c in v6.9) will ever get past all the cherry pick tracing checks.
Like I tried to explain in my reply to Sasha somewhere else in this thread it really only takes two things:
- drm maintainers consistently add cherry picked from lines anytime we
cherry-pick
- you adjust your script to go hunt for the cherry pick alias if you get a
sha1 that makes no sense, so that you can put in the right sha1. And if you do that for any sha1 you find (whether upstream references, Fixes: or Reverts or stable candidate commits or whatever really), it will sort out all the things we've been shouting about for years now.
We still have holes here... For example, this backport claims to:
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
But 8135f1c09dd2 is a cherry-pick:
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
In the future, if we get a new patch that says:
Fixes: 0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
By your logic, our scripts will look at it and say "0c8650b09a36 is a real commit, but it's not in linux-6.12.y so there's no need to backport the fix".
Which is the wrong thing to do, because we have 8135f1c09dd2 in linux-6.12.y.
So no, this isn't a simple trace-the-cherry-pick-tags exercise.
So this is not quite what you do, because before you drop this patch you have to check whether you have a cherry-pick of 8135f1c09dd2 in your stable branch.
Now if we assume that only the stable team does cherry picks, you can limit your search to just the v6.12.y branch from the v6.12, and you'd wrongly conclude that you don't have a cherry pick there. But since the drm maintainers also do cherry-picks then limiting yourselv to just the patches you've applied to v6.12 will miss some, so you need to cherry-pick trace some more (I think 6 months into the past is generally enough, from your starting point commit).
And then you'll find that 8135f1c09dd2 is in v6.12 and a cherry-pick of 0c8650b09a36 so you need that bugfix.
Automatically, without human intervention, because it's just a git oneliner.
So look at the backport in question which started this thread. The backporter ends up with:
""" [...]
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
Cherry pick tracing says this is f0ed39830e6064d62f9c5393505677a26569bb56 which is in v6.13-rc1.
[...]
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
This one is in v6.12, but it's already a cherry-pick, so you need to be careful and look for possible other versions, because its commit message contains
(cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
But further cherry-pick tracing didn't show up any more commits that we didn't know of, so it's all done.
And 0c8650b09a365f4a31fca1d1d1e9d99c56071128 itself is in v6.13-rc1, so that resolves.
Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
This commit will show up in v6.14-rc1. Currently all you can use this for is as a lookup key to find other cherry-pick copies.
Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)o
This commit is in v6.13-rc6, because that -rc series also needed the cherry-pick since the broken commit was in v6.13-rc1. And this commit itself is correctly annotated as a cherry-pick of 55039832f98c7e05f1, so it all checks out.
"""
Where most of the git IDs in it are invalid right now :)
They all make sense, but sometimes you do have to do a bit more cherry-pick tracing than what you've done.
Sometimes you need to do multiple levels of tracing, sometimes you start at a cherry-pick and need to fish out the original sha1 first (even if that's not one that resolves for you, you can still use it to find more cherry-picks). And because the stable team isn't the only maintainer team doing cherry-picks, you need to broaden your query a bit and can't limit yourself to your own cherry-picks only, to make sure you get them all.
Happy to walk you through even more special-case ladden examples. I tried to think them all through when cooking up the committer model years ago, I might indeed have missed a case. But pretty sure that the answer will be "you need more cherry-pick tracing".
Maybe also helps to go back from examples to the generic algorithm, which is two steps:
- You first need to find the root sha1 that all the cherry picks
originate from. If the sha1 you have doesn't resolve, you skip this. Otherwise look at the commit message, if it has a "(cherry picked from $sha)" line you pick the first line (they're ordered like sob lines), and that's the root sha1. It might not resolve, but it's the search key you need.
- You go hunt for all cherry pick aliases for that root sha1. Strictly
speaking you'd need to search the entire history, but in practice commits only travel back in time by 3-4 months at most (or a bit less thatn 2 full kernel releases). Half a year is the defensive assumption. So there's two subcases:
2a) If you want to find the right commit in upstream, you scan half a year of history in Linus' repo starting from the current tip.
2b) If you want to check for cherry picks in a stable branch, either to check whether you already have that backport, or whether you need to backport the bugfix for a buggy backport (due to Fixes/Revert: lines). You scan all the stable commits, plus half a year of history from the release tag Linus has done.
It's a bit madness, but more importantly, it's scriptable madness. And since this confuses drm maintainers too, we do have that partially implemented in our own scripts, to make sure we don't miss any bugfixes we need in drm-fixes. Partially because we never have unresolved sha1 in our own repos, because they also contain the -next branches where the root commit is.
Yes, this is total madness. Scanning the tree for every commit that we want to apply, or potentially apply, or figuring out if there is a missing fixes for that commit, is an exponential slowdown.
Right now we have tools that can go "where was this commit backported to" and give us that answer very quickly: $ /usr/bin/time ~/linux/vulns/tools/verhaal/id_found_in 79d67c499c3f886202a40c5cb27e747e4fa4d738 5.4.289 5.10.233 5.15.176 6.1.124 6.6.70 6.12.9 6.13-rc6 0.06user 0.03system 0:00.09elapsed 100%CPU (0avgtext+0avgdata 5856maxresident)k 0inputs+0outputs (0major+1953minor)pagefaults 0swaps as we pre-process the whole kernel tree ahead of time and populate a database we can query.
Code for this is here: https://git.sr.ht/~gregkh/verhaal and that's what we use now for the CVEs to determine when a vulnerability showed up, and when it was fixed, and in what branches of the tree.
our older tool took a bit longer: $ /usr/bin/time ~/linux/stable/commit_tree/id_found_in 79d67c499c3f886202a40c5cb27e747e4fa4d738 5.4.289 5.10.233 5.15.176 6.1.124 6.6.70 6.12.9 6.13-rc6 0.30user 0.90system 0:00.60elapsed 198%CPU (0avgtext+0avgdata 366820maxresident)k 0inputs+0outputs (0major+167451minor)pagefaults 0swaps as it abused the filesystem as a database with the output of some 'git log' results and relied on 'git grep' a lot to do regex matching. And that turned out to miss a lot of things and have false-positives, hence the rewrite above. I still personally use the old tool for some stable work as speed
Code for the old version is here: https://git.sr.ht/~gregkh/linux-stable_commit_tree
.6 seconds doesn't sound like a lot, but when you have to run other queries at the same time to walk branches and the like, and you're usually running it on a machine that is doing kernel builds at the same time, AND you are doing lookups for all commits in a series (100-500 at a time) or all CVEs issued so far (5000+), speed matters, hence the rewrite which also fixed some consistancy issues we had in our CVE entries.
So to add a whole new "now I need to walk back in time across ALL active kernel branches right now, AND maybe go dig elsewhere too" just to get things correct for one subsystem, you can see my disinterest and claims that "this sucks and is too much work and I'm going to give up".
To do this "right" what I feel I need to do is find ALL matching commits in the kernel tree (based on linux-next reports I know DRM isn't the only offender here, many commits flow through multiple trees at the same time, but only DRM seems to trigger the problems) and then work on commit ids as having "aliases" with the duplicate matches. I'm already starting to collect a bunch of "this id is invalid" stuff and maybe come up with a "rewrite" table to do queries off of as many times our Fixes and even Revert ids are wrong.
Ugh, time to dust off my old SQL book and figure out some table joins... I'll consider working on this during the merge window and see what I can come up with.
Any pointers to where the scripts you all use for your "catch the duplicates" logic you describe above are for the drm developers to use?
thanks,
greg k-h
On Thu, Jan 16, 2025 at 02:52:23PM +0100, Greg KH wrote:
On Thu, Jan 16, 2025 at 10:48:45AM +0100, Simona Vetter wrote:
Maybe also helps to go back from examples to the generic algorithm, which is two steps:
- You first need to find the root sha1 that all the cherry picks
originate from. If the sha1 you have doesn't resolve, you skip this. Otherwise look at the commit message, if it has a "(cherry picked from $sha)" line you pick the first line (they're ordered like sob lines), and that's the root sha1. It might not resolve, but it's the search key you need.
- You go hunt for all cherry pick aliases for that root sha1. Strictly
speaking you'd need to search the entire history, but in practice commits only travel back in time by 3-4 months at most (or a bit less thatn 2 full kernel releases). Half a year is the defensive assumption. So there's two subcases:
2a) If you want to find the right commit in upstream, you scan half a year of history in Linus' repo starting from the current tip.
2b) If you want to check for cherry picks in a stable branch, either to check whether you already have that backport, or whether you need to backport the bugfix for a buggy backport (due to Fixes/Revert: lines). You scan all the stable commits, plus half a year of history from the release tag Linus has done.
It's a bit madness, but more importantly, it's scriptable madness. And since this confuses drm maintainers too, we do have that partially implemented in our own scripts, to make sure we don't miss any bugfixes we need in drm-fixes. Partially because we never have unresolved sha1 in our own repos, because they also contain the -next branches where the root commit is.
Yes, this is total madness. Scanning the tree for every commit that we want to apply, or potentially apply, or figuring out if there is a missing fixes for that commit, is an exponential slowdown.
Right now we have tools that can go "where was this commit backported to" and give us that answer very quickly: $ /usr/bin/time ~/linux/vulns/tools/verhaal/id_found_in 79d67c499c3f886202a40c5cb27e747e4fa4d738 5.4.289 5.10.233 5.15.176 6.1.124 6.6.70 6.12.9 6.13-rc6 0.06user 0.03system 0:00.09elapsed 100%CPU (0avgtext+0avgdata 5856maxresident)k 0inputs+0outputs (0major+1953minor)pagefaults 0swaps as we pre-process the whole kernel tree ahead of time and populate a database we can query.
Code for this is here: https://git.sr.ht/~gregkh/verhaal and that's what we use now for the CVEs to determine when a vulnerability showed up, and when it was fixed, and in what branches of the tree.
our older tool took a bit longer: $ /usr/bin/time ~/linux/stable/commit_tree/id_found_in 79d67c499c3f886202a40c5cb27e747e4fa4d738 5.4.289 5.10.233 5.15.176 6.1.124 6.6.70 6.12.9 6.13-rc6 0.30user 0.90system 0:00.60elapsed 198%CPU (0avgtext+0avgdata 366820maxresident)k 0inputs+0outputs (0major+167451minor)pagefaults 0swaps as it abused the filesystem as a database with the output of some 'git log' results and relied on 'git grep' a lot to do regex matching. And that turned out to miss a lot of things and have false-positives, hence the rewrite above. I still personally use the old tool for some stable work as speed
Code for the old version is here: https://git.sr.ht/~gregkh/linux-stable_commit_tree
.6 seconds doesn't sound like a lot, but when you have to run other queries at the same time to walk branches and the like, and you're usually running it on a machine that is doing kernel builds at the same time, AND you are doing lookups for all commits in a series (100-500 at a time) or all CVEs issued so far (5000+), speed matters, hence the rewrite which also fixed some consistancy issues we had in our CVE entries.
So to add a whole new "now I need to walk back in time across ALL active kernel branches right now, AND maybe go dig elsewhere too" just to get things correct for one subsystem, you can see my disinterest and claims that "this sucks and is too much work and I'm going to give up".
Oh I know, but I'm honestly super happy that we've moved from "this is impossible madness" to "the script takes way too long". Which frankly I expected we'll get to, because the git grep stuff is ok for us, it's definitely not fast enough at the scale of stable backports and cve assignments.
To do this "right" what I feel I need to do is find ALL matching commits in the kernel tree (based on linux-next reports I know DRM isn't the only offender here, many commits flow through multiple trees at the same time, but only DRM seems to trigger the problems) and then work on commit ids as having "aliases" with the duplicate matches. I'm already starting to collect a bunch of "this id is invalid" stuff and maybe come up with a "rewrite" table to do queries off of as many times our Fixes and even Revert ids are wrong.
So for the annotated cherry-picks I think you need a table with 2 rows, first has the real commit sha1 (and if you scan linux-next you should never have a sha1 that doesn't resolve), the second the root sha1 of the cherry-pick chain, as annotated in the first "(cherry picked from $root_sha1)" in the commit message.
The full cherry-pick query would then still be the two step process, except you can skip at step 1 because if there's no entry, you know that this commit is not part of a cherry-pick group. Since at least with the drm flow you'll always see the cherry-picks first as new commits land in Linus' tree.
Then just fill that table with new entries as you scan for updates in Linus' tree (I wouldn't scan linux-next except for testing this, because you might pick up bogus entries).
That should be enough to do all the alias querying you need for properly annotated cherry-picks. But it's not enough for the case of where a patch is applied twice, without that annotation. So not sure what to do with those, imo teaching maintainers to annotate them is probably the best option, since guessing is error prone.
Ugh, time to dust off my old SQL book and figure out some table joins... I'll consider working on this during the merge window and see what I can come up with.
Any pointers to where the scripts you all use for your "catch the duplicates" logic you describe above are for the drm developers to use?
Probably not much use, because the actual logic is the git log greps from this thread, and we cheat with not handling a bunch of corner cases. But it's dim cherry-pick-branch and dim cherry-pick in our maintainer tools that helps with the cherry-picking:
https://gitlab.freedesktop.org/drm/maintainer-tools/-/blob/master/dim?ref_ty...
Comes with some docs too:
https://drm.pages.freedesktop.org/maintainer-tools/dim.html
Cheers, Sima
On Mon, Jan 13, 2025 at 10:44:41AM +1000, Dave Airlie wrote:
Pretty sure we've explained how a few times now, not sure we can do much more.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
When the initial commit enters during the next merge window, you look for that subject or commit id in the stable tree already, if it exists, dump the latest Linus patch on the floor, it's already in stable your job is done.
We can't rely too heavily on the subject line. Consider the following two very different commits that have the same subject line:
3119668c0e0a ("drm/amd/display: avoid disable otg when dig was disabled") 218784049f4b ("drm/amd/display: avoid disable otg when dig was disabled")
Now, if a new commit lands and it has the following "Fixes:" tag:
Fixes: abcdef12345 ("drm/amd/display: avoid disable otg when dig was disabled")
Does it refer to one of the older commits? Or a new commit that will show up during the merge window?
Or... What happens if a new commit with the very same subject line shows up, and it has a cherry-pick link that points to a completely different commit that is not in the tree yet? :)
But just in general, there are so many odd combinations of commits where trying to follow the suggestion you've made will just break...
Something like these two identical commits which are not tagged for stable:
21afc872fbc2 ("drm/amd/display: Add monitor patch for specific eDP") 3d71a8726e05 ("drm/amd/display: Add monitor patch for specific eDP")
And the following two identical ones which are tagged for stable:
b7cdccc6a849 ("drm/amd/display: Add monitor patch for specific eDP") 04a59c547575 ("drm/amd/display: Add monitor patch for specific eDP")
On Mon, Jan 13, 2025 at 04:48:17PM -0500, Sasha Levin wrote:
On Mon, Jan 13, 2025 at 10:44:41AM +1000, Dave Airlie wrote:
Pretty sure we've explained how a few times now, not sure we can do much more.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
When the initial commit enters during the next merge window, you look for that subject or commit id in the stable tree already, if it exists, dump the latest Linus patch on the floor, it's already in stable your job is done.
We can't rely too heavily on the subject line. Consider the following two very different commits that have the same subject line:
3119668c0e0a ("drm/amd/display: avoid disable otg when dig was disabled") 218784049f4b ("drm/amd/display: avoid disable otg when dig was disabled")
Now, if a new commit lands and it has the following "Fixes:" tag:
Fixes: abcdef12345 ("drm/amd/display: avoid disable otg when dig was disabled")
This is why we're asking people to include the cherry-picked from line, so you're scripts can handle this automatically.
Because then you get two cherry-picked from lines in your stable commits: - one from the drm cherry-pick - one from the stable cherry-pick
And instead of only checking the stable cherry-pick line you just check both if you want an answer to the "do I have this one already?" question. There's two cases:
- You have a backport candidate, but want to check if you have it already. When that happens with the 2nd commit your scripts will try to apply that patch (because it doesn't match the cherry-picked from you've added yourself), which will fail and result in an angry mail to dri-devel.
But if you instead check against both your and the drm cherry-pick lines, you'd know that you have this patch already and can drop it automatically.
- You get a Fixes: line like above, and want to know whether you need that patch. You already have to consult all the stable cherry-pick lines to make sure (because stable doesn't have that sha1 if the broken commit was itself cherry-picked). If you instead check against both the drm and stable cherry-pick lines then the tooling will do the right job.
Which is why Dave&me want these cherry-pick lines, but Alex has removed them again because of the last round of shouting about this. Because without cherry-pick lines you're down to guessing by title, which goes wrong.
So the only thing that's needed in the tooling is that instead of only looking at your own cherry-pick lines in stable commits to figure out whether you need a backport or have it already, or whether you need that bugfix or don't have the broken commit, is to look at all cherry-pick lines. And ask Alex to again add them.
Does it refer to one of the older commits? Or a new commit that will show up during the merge window?
Or... What happens if a new commit with the very same subject line shows up, and it has a cherry-pick link that points to a completely different commit that is not in the tree yet? :)
But just in general, there are so many odd combinations of commits where trying to follow the suggestion you've made will just break...
Something like these two identical commits which are not tagged for stable:
21afc872fbc2 ("drm/amd/display: Add monitor patch for specific eDP") 3d71a8726e05 ("drm/amd/display: Add monitor patch for specific eDP")
Yeah sometimes people forget to add cc: stable. It happens. I don't think anything else is going on here.
And the following two identical ones which are tagged for stable:
b7cdccc6a849 ("drm/amd/display: Add monitor patch for specific eDP") 04a59c547575 ("drm/amd/display: Add monitor patch for specific eDP")
Yeah this is just standard bugfix cherry-picking, except because you shouted about the cherry-pick lines last time around they're now gone, so you have no idea what's going on here.
Imo we should add the cherry-pick lines back and then this would be all clear. -Sima
On Mon, Jan 13, 2025 at 10:44:41AM +1000, Dave Airlie wrote:
On Mon, 13 Jan 2025 at 07:09, Greg KH gregkh@linuxfoundation.org wrote:
On Mon, Jan 13, 2025 at 06:01:51AM +1000, Dave Airlie wrote:
On Mon, 13 Jan 2025 at 05:51, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
If you were to ignore stable tags on drm, could we then write a tool that creates a new for-stable tree that just strips out all the fixes/backports/commits and recreates it based on Linus commits to you, or would future duplicate commits then break tools?
That would be great, just pick which commit id to reference (i.e. the one that shows up in Linus's tree first.)
But then, be careful with the "Fixes:" tags as well, those need to line up and match the correct ones.
We create a "web" when we backport commits, and mark things for "Fixes:" When we get those ids wrong because you all have duplicate commits for the same thing, everything breaks.
I just don't get what the ABI the tools expect is, and why everyone is writing bespoke tools and getting it wrong, then blaming us for not conforming. Fix the tools or write new ones when you realise the situation is more complex than your initial ideas.
All I want to see and care about is:
- for a stable commit, the id that the commit is in Linus's tree.
- for a Fixes: tag, the id that matches the commit in Linus's tree AND the commit that got backported to stable trees.
That's it, that's the whole "ABI". The issue is that you all, for any number of commits, have 2 unique ids for any single commit and how are we supposed to figure that mess out...
Pretty sure we've explained how a few times now, not sure we can do much more.
If you see a commit with a cherry-pick link in it and don't have any sight on that commit in Linus's tree, ignore the cherry-pick link in it, assume it's a future placeholder for that commit id. You could if you wanted to store that info somewhere, but there shouldn't be a need.
When the initial commit enters during the next merge window, you look for that subject or commit id in the stable tree already, if it exists, dump the latest Linus patch on the floor, it's already in stable your job is done.
When future tools are analysing things, they will see the patch from the merge window, the cherry-picked patches in the fixes tree, and stable will reference the fixes, and the fixes patch will reference the merge window one?
I'm just not seeing what I'm missing here, fixes tags should work fine, but I think when we cherry-pick patches from -next that fix other patches from -next maybe the fixes lines should be reworked to reference the previous Linus tree timeline not the future one. not 100% sure this happens? Sima might know more.
The issue with trying to fix up the Fixes/Reverts citations is that if you miss one, you're tooling is still broken. So what we do instead is track all the aliases of commits we've cherry-picked, because that's actually reliable.
And you need to do that already anyway for stable processing, because you want to automatically filter out commits that you've cherry picked already. The only difference is that aside from the actual commit sha1, you also add all the cherry-pick aliases that are listed in the commit messages. Which is why we add those, because our tooling uses them, so you don't have to guess by patch title or something imprecise like that.
So example: You cherry-pick commit A1 to stable, it now has commit sha1 A2 in stable. But it also has a cherry-picked form A0 note, so you also add A0 as a commit alias.
When you then check whether you've already cherry-picked a commit because of a Fixes or Reverts line, instead of only checking against A1 like you currently do in stable processing, you also check against A0. If the Fixes/Reverts line matches either, you know that you need that bugfix in your stable tree too.
And if you cherry-pick a commit, you also check against both A1 and A0, and if either matches, you've already cherry-picked that commit (or one of its duplicated cherry-picks) and can drop it.
And if someone has gone completely silly and cherry picked multiple times, you just add all the cherry-picked from sha1 as aliases to your "do I have this commit already" database/query.
Our script just launches a git grep over the history looking for all cherry-pick lines, not caring one bit whether there's more than one per commit. That's the entire magic we do. -Sima
Now previously I think we'd be requested to remove the cherry-picks from the -fixes commits as they were referencing things not in Linus' tree, we said it was a bad idea, I think we did it anyways, we got shouted at, we put it back, we get shouted that we are referencing commits that aren't in Linus tree. Either the link is useful information and we just assume cherry-picks of something we can't see are a future placeholder and ignore it until it shows up in our timeline.
I think we could ask to not merge things into -next with stable cc'ed but I think that will result in a loss of valuable fixes esp for backporters.
Dave.
On Mon, Jan 13, 2025 at 05:51:30AM +1000, Dave Airlie wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56)
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
All subsystems that grow to having large teams (more than 2-4 people) working on a single driver will eventually hit the scaling problem, just be glad we find things first so everyone else knows how to deal with it later.
That's fine, the issue is that you are the only ones with "duplicate" commits in the tree that are both tagged for stable, every release.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
None of what you are saying makes any sense here. Explain how patches are backported to stable releases before they end up in Linus's tree to me like I'm 5, because there should be no possible workflow where that can happen, stable pulls from patches in Linus' tree, not from my tree or drm-next or anywhere else. Now it might appear that way because tooling isn't prepared or people don't know what they are looking at, but I still don't see the actual problem.
Look at the above commit here that triggered this latest complaint.
It "claims" to be commit id 55039832f98c7e05f1cf9e0d8c12b2490abd0f16. But that id is NOT in Linus's tree, rather it is ONLY in the DRM tree at this point in time.
What _is_ in Linus's tree is the same exact thing, with a different commit id, f0ed39830e6064d62f9c5393505677a26569bb56.
That's the problem. You all check the same exact change into 2 different branches, which then flows to Linus at different points in time. Many times, after both are in Linus's tree, we get backport requests, or backports, or CVE reports, or "Fixes:" tags that refer to one, or the other, commit, causing confusion (which commit was backported to where, so where does the Fixes line up to, what branches are fixed / vulnerable, etc.)
I don't mind duplicate commits, but PLEASE don't mark both of them to be included into the stable trees. That's the issue here. We don't have a way to say "This commit is IDENTICAL to that commit, despite it landing in Linus's tree at different points in time/releases."
Again, think of the backports. And the Fixes:. And the resulting confusion.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
Fix the tools.
Please tell me how to properly determine this when given the above commit sent to us to backport. When one commit (of the identical pair) is backported to a stable tree, yet the Fixes tag refers to the other commit, how am I supposed to know what to do?
If you follow the "path in time" it looks like things travel backwards sometimes. We ran into that a few weeks ago with a commit that was marked as a CVE and we tried to figure out what releases it fixed. It was marked as Fixing a commit that was _after_ it showed up in Linus's tree.
So again, if you want to do duplicate commits in different branches, go ahead. But if so, pick one, or the other, to mark for Fixes, AND to mark for stable backports.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
How do you recommend we do that, edit the immutable git history to remove the stable tag from the original commit?
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
If you have to do, go do it. The thing is the workflow is there for a reason, once you have a large enough team, having every single team member intimately aware of the rc schedule to decide where they need to land patches doesn't scale. If you can't land patches to a central -next tree and then pick those patches out to a -fixes tree after a maintainer realises they need to be backported to stable. Now I suppose we could just ban stable tags on the next tree and only put them on the cherry-picks but then how does it deal with the case where something needs to be fixes in -next but not really urgent enough for -fixes immediately. Would that be good enough, no stable tags in -next trees, like we could make the tooling block it? But it seems like overkill, to avoid fixing some shitty scripts someone is probably selling as a security application.
If you all want to not mark anything for us to backport, and will make a tree/mbox/whatever for us to take only, I would _LOVE_ it. That way you all can make sure the git ids match up properly (for the backport id, AND for the Fixes: id).
Otherwise, again, stop it with the duplicate commit ids. They are a pain and every time I see a DRM patch tagged for stable I flinch and dread it.
thanks,
greg k-h
On Sun, Jan 12, 2025 at 10:06:42PM +0100, Greg KH wrote:
That's fine, the issue is that you are the only ones with "duplicate" commits in the tree that are both tagged for stable, every release.
Isn't a solution as easy as teaching your tooling not to create/accept commits on -next with Cc: stable? This way folks intending to push a change will notice it should go to the fixes branch. And if only afterwards you notice this is a critical fix that should get backported at least the commit that takes more time entering mainline doesn't have the stable tag.
Maybe additionally make sure that Fixes: and revert notices only point to commits that are an ancestor.
If I understand the problem correctly, this should make the stable maintainers happy.
Best regards Uwe
On Fri, Jan 17, 2025 at 12:01:01PM +0100, Uwe Kleine-König wrote:
On Sun, Jan 12, 2025 at 10:06:42PM +0100, Greg KH wrote:
That's fine, the issue is that you are the only ones with "duplicate" commits in the tree that are both tagged for stable, every release.
Isn't a solution as easy as teaching your tooling not to create/accept commits on -next with Cc: stable? This way folks intending to push a change will notice it should go to the fixes branch. And if only afterwards you notice this is a critical fix that should get backported at least the commit that takes more time entering mainline doesn't have the stable tag.
Maybe additionally make sure that Fixes: and revert notices only point to commits that are an ancestor.
The commit is always an ancestor, the "trick" is which one when the ancestor was cherry-picked previously? That's the real problem here..
gre k-h
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Let me try and work out what you think is the problem with this particular commit as I read your email and I don't get it.
This commit is in drm-next as 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 and says Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close)
It was pulled into drm-fixes a second time as a cherry-pick from next as f0ed39830e6064d62f9c5393505677a26569bb56 (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
Now the commit it Fixes: 8135f1c09dd2 is also at 0c8650b09a365f4a31fca1d1d1e9d99c56071128
Now the above thing you wrote is your cherry-picked commit for stable? since I don't see (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56) in my tree anywhere.
So this patch comes into stable previously as f0ed39830e6064d62f9c5393505677a26569bb56 ? and then when it comes in as 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 you didn't notice you already had it, (there is where I think the extra step of searching in git history for the patch (this seems easily automatable) should come in.
Or is the concern with the Fixes: line referencing the wrong thing?
Dave.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
greg k-h
On Tue, 14 Jan 2025, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Let me try and work out what you think is the problem with this particular commit as I read your email and I don't get it.
This commit is in drm-next as 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 and says Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close)
It was pulled into drm-fixes a second time as a cherry-pick from next as f0ed39830e6064d62f9c5393505677a26569bb56 (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
Now the commit it Fixes: 8135f1c09dd2 is also at 0c8650b09a365f4a31fca1d1d1e9d99c56071128
Now the above thing you wrote is your cherry-picked commit for stable? since I don't see (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56) in my tree anywhere.
The automatic cherry-pick for 6.12 stable failed, and Umesh provided the manually cherry-picked patch for it, apparently using -x in the process, adding the second cherry-pick annotation. The duplicate annotation hasn't been merged to any tree, it's not part of the process, it's just what happened with this manual stable backport. I think it would be wise to ignore that part of the whole discussion. It's really not that relevant.
BR, Jani.
So this patch comes into stable previously as f0ed39830e6064d62f9c5393505677a26569bb56 ? and then when it comes in as 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 you didn't notice you already had it, (there is where I think the extra step of searching in git history for the patch (this seems easily automatable) should come in.
Or is the concern with the Fixes: line referencing the wrong thing?
Dave.
Yes, I know that DRM is special and unique and running at a zillion times faster with more maintainers than any other subsystem and really, it's bigger than the rest of the kernel combined, but hey, we ALL are a common project here. If each different subsystem decided to have their own crazy workflows like this, we'd be in a world of hurt. Right now it's just you all that is causing this world of hurt, no one else, so I'll complain to you.
We have commits that end up looking like they go back in time that are backported to stable releases BEFORE they end up in Linus's tree and future releases. This causes major havoc and I get complaints from external people when they see this as obviously, it makes no sense at all.
And it easily breaks tools that tries to track where backports went and if they are needed elsewhere, which ends up missing things because of this crazy workflow. So in the end, it's really only hurting YOUR subsystem because of this.
And yes, there is a simple way to fix this, DO NOT TAG COMMITS THAT ARE DUPLICATES AS FOR STABLE. Don't know why you all don't do that, would save a world of hurt.
I'm tired of it, please, just stop. I am _this_ close to just ignoring ALL DRM patches for stable trees...
greg k-h
On Tue, Jan 14, 2025 at 11:22:26AM +0200, Jani Nikula wrote:
On Tue, 14 Jan 2025, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Let me try and work out what you think is the problem with this particular commit as I read your email and I don't get it.
This commit is in drm-next as 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 and says Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close)
It was pulled into drm-fixes a second time as a cherry-pick from next as f0ed39830e6064d62f9c5393505677a26569bb56 (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
Now the commit it Fixes: 8135f1c09dd2 is also at 0c8650b09a365f4a31fca1d1d1e9d99c56071128
Now the above thing you wrote is your cherry-picked commit for stable? since I don't see (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56) in my tree anywhere.
The automatic cherry-pick for 6.12 stable failed, and Umesh provided the manually cherry-picked patch for it, apparently using -x in the process, adding the second cherry-pick annotation. The duplicate annotation hasn't been merged to any tree, it's not part of the process, it's just what happened with this manual stable backport. I think it would be wise to ignore that part of the whole discussion. It's really not that relevant.
On the contrary, this commit shows the whole problem very well. It is trivial for people to get confused, the author submitted a backport of a commit that is NOT in Linus's tree because they got an email telling of a failure and didn't use the correct id because they looked in the drm-next branch, NOT in Linus's branch.
Which is why I flagged it, as the commit id used here was not a valid one at this point in time. Yes, after -rc1 it would be valid, but again, totally confusing.
You all are seeing confusion with this development model. That's the issue. Whether or not it's intentional, that's the result. And because of it, I am telling you all, the kernel is less secure as it allows us all to get confused and mis backports and drop fixes incorrectly.
So either you all change the process, or just live with it and the consequences of having total confusion at times and grumpy stable developers because of it, and less secure users due to missed bug and security fixes.
greg k-h
On Wed, Jan 15, 2025 at 10:11:00AM +0100, Greg KH wrote:
On Tue, Jan 14, 2025 at 11:22:26AM +0200, Jani Nikula wrote:
On Tue, 14 Jan 2025, Dave Airlie airlied@gmail.com wrote:
On Sun, 12 Jan 2025 at 22:19, Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jan 10, 2025 at 12:53:41PM -0800, Umesh Nerlige Ramappa wrote:
commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 upstream
<snip>
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com # commit 1 Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Cc: stable@vger.kernel.org # 6.12+ Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.... (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
Oh I see what you all did here.
I give up. You all need to stop it with the duplicated git commit ids all over the place. It's a major pain and hassle all the time and is something that NO OTHER subsystem does.
Let me try and work out what you think is the problem with this particular commit as I read your email and I don't get it.
This commit is in drm-next as 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 and says Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close)
It was pulled into drm-fixes a second time as a cherry-pick from next as f0ed39830e6064d62f9c5393505677a26569bb56 (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
Now the commit it Fixes: 8135f1c09dd2 is also at 0c8650b09a365f4a31fca1d1d1e9d99c56071128
Now the above thing you wrote is your cherry-picked commit for stable? since I don't see (cherry picked from commit f0ed39830e6064d62f9c5393505677a26569bb56) in my tree anywhere.
The automatic cherry-pick for 6.12 stable failed, and Umesh provided the manually cherry-picked patch for it, apparently using -x in the process, adding the second cherry-pick annotation. The duplicate annotation hasn't been merged to any tree, it's not part of the process, it's just what happened with this manual stable backport. I think it would be wise to ignore that part of the whole discussion. It's really not that relevant.
On the contrary, this commit shows the whole problem very well. It is trivial for people to get confused, the author submitted a backport of a commit that is NOT in Linus's tree because they got an email telling of a failure and didn't use the correct id because they looked in the drm-next branch, NOT in Linus's branch.
Which is why I flagged it, as the commit id used here was not a valid one at this point in time. Yes, after -rc1 it would be valid, but again, totally confusing.
You all are seeing confusion with this development model. That's the issue. Whether or not it's intentional, that's the result. And because of it, I am telling you all, the kernel is less secure as it allows us all to get confused and mis backports and drop fixes incorrectly.
So either you all change the process, or just live with it and the consequences of having total confusion at times and grumpy stable developers because of it, and less secure users due to missed bug and security fixes.
We won't change our process, because I couldn't find the maintainer volunteers to make that happen. And I don't think you can find them for me.
Full answer here:
https://lore.kernel.org/dri-devel/Z4d6406b82Pu1PRV@phenom.ffwll.local/
And all we need to sort out the fallout is that - drm maintainers consistently add cherry-picked from lines (which means you need to stop shouting about them) - downstream consumers look at cherry-picked from lines to figure out all the sha1 aliases a commit has, which with the dumbest git log script here takes less than a second here to check
I've tried to explain this here in a reply to Sasha:
https://lore.kernel.org/dri-devel/Z4aNwGys3epVzf7G@phenom.ffwll.local/
And yes I'm aware this breaks your workflow, we've had these discussions at regularly scheduled intervals for as long as we've been doing the committer model. And I've been trying for as long to explain what you need to adjust to cope, purely using scripts.
This shit is easy, except somehow here we are almost a decade later. -Sima
linux-stable-mirror@lists.linaro.org