December 2024 - Linux-stable-mirror

[PATCH 2/3] vfs: Fix implicit conversion problem when testing overflow case

by Julian Sun

The overflow check in generic_copy_file_checks() and generic_remap_checks() is now broken because the result of the addition is implicitly converted to an unsigned type, which disrupts the comparison with signed numbers. This caused the kernel to not return EOVERFLOW in copy_file_range() call with len is set to 0xffffffffa003e45bul. Use the check_add_overflow() macro to fix this issue. Reported-and-tested-by: syzbot+296b1c84b9cbf306e5a0(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=296b1c84b9cbf306e5a0 Fixes: 1383a7ed6749 ("vfs: check file ranges before cloning files") Fixes: 96e6e8f4a68d ("vfs: add missing checks to copy_file_range") Inspired-by: Dave Chinner <david(a)fromorbit.com> Reviewed-by: Jan Kara <jack(a)suse.cz> Signed-off-by: Julian Sun <sunjunchao2870(a)gmail.com> --- fs/read_write.c | 5 +++-- fs/remap_range.c | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 070a7c33b9dd..5211246edc2e 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1509,7 +1509,7 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, struct inode *inode_in = file_inode(file_in); struct inode *inode_out = file_inode(file_out); uint64_t count = *req_count; - loff_t size_in; + loff_t size_in, tmp; int ret; ret = generic_file_rw_checks(file_in, file_out); @@ -1544,7 +1544,8 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, return -ETXTBSY; /* Ensure offsets don't wrap. */ - if (pos_in + count < pos_in || pos_out + count < pos_out) + if (check_add_overflow(pos_in, count, &tmp) || + check_add_overflow(pos_out, count, &tmp)) return -EOVERFLOW; /* Shorten the copy to EOF */ diff --git a/fs/remap_range.c b/fs/remap_range.c index 28246dfc8485..6fdeb3c8cb70 100644 --- a/fs/remap_range.c +++ b/fs/remap_range.c @@ -36,7 +36,7 @@ static int generic_remap_checks(struct file *file_in, loff_t pos_in, struct inode *inode_out = file_out->f_mapping->host; uint64_t count = *req_count; uint64_t bcount; - loff_t size_in, size_out; + loff_t size_in, size_out, tmp; loff_t bs = inode_out->i_sb->s_blocksize; int ret; @@ -45,7 +45,8 @@ static int generic_remap_checks(struct file *file_in, loff_t pos_in, return -EINVAL; /* Ensure offsets don't wrap. */ - if (pos_in + count < pos_in || pos_out + count < pos_out) + if (check_add_overflow(pos_in, count, &tmp) || + check_add_overflow(pos_out, count, &tmp)) return -EINVAL; size_in = i_size_read(inode_in); -- 2.39.2

11 months, 3 weeks

5
7
0 0

[REGRESSION] amdgpu: thinkpad e495 backlight brightness resets after suspend

by Siva Mahadevan

#regzbot introduced: 99a02eab8 Observed behaviour: linux-stable v6.12.5 has a regression on my thinkpad e495 where suspend/resume of the laptop results in my backlight brightness settings to be reset to some very high value. After resume, I'm able to increase brightness further until max brightness, but I'm not able to decrease it anymore. Behaviour prior to regression: linux-stable v6.12.4 correctly maintains the same brightness setting on the backlight that was set prior to suspend/resume. Notes: I bisected this issue between v6.12.4 and v6.12.5 to commit 99a02eab8 titled "drm/amdgpu: rework resume handling for display (v2)". Hardware: * lenovo thinkpad e495 * AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx * VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso/Raven 2 [Radeon Vega Series / Radeon Vega Mobile Series] (rev c2)

11 months, 3 weeks

3
4
0 0

[PATCH stable 6.6] bpf: support non-r10 register spill/fill to/from stack in precision tracking

by Shung-Hsi Yu

From: Andrii Nakryiko <andrii(a)kernel.org> [ Upstream commit 41f6f64e6999a837048b1bd13a2f8742964eca6b ] Use instruction (jump) history to record instructions that performed register spill/fill to/from stack, regardless if this was done through read-only r10 register, or any other register after copying r10 into it *and* potentially adjusting offset. To make this work reliably, we push extra per-instruction flags into instruction history, encoding stack slot index (spi) and stack frame number in extra 10 bit flags we take away from prev_idx in instruction history. We don't touch idx field for maximum performance, as it's checked most frequently during backtracking. This change removes basically the last remaining practical limitation of precision backtracking logic in BPF verifier. It fixes known deficiencies, but also opens up new opportunities to reduce number of verified states, explored in the subsequent patches. There are only three differences in selftests' BPF object files according to veristat, all in the positive direction (less states). File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) -------------------------------------- ------------- --------- --------- ------------- ---------- ---------- ------------- test_cls_redirect_dynptr.bpf.linked3.o cls_redirect 2987 2864 -123 (-4.12%) 240 231 -9 (-3.75%) xdp_synproxy_kern.bpf.linked3.o syncookie_tc 82848 82661 -187 (-0.23%) 5107 5073 -34 (-0.67%) xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 85116 84964 -152 (-0.18%) 5162 5130 -32 (-0.62%) Note, I avoided renaming jmp_history to more generic insn_hist to minimize number of lines changed and potential merge conflicts between bpf and bpf-next trees. Notice also cur_hist_entry pointer reset to NULL at the beginning of instruction verification loop. This pointer avoids the problem of relying on last jump history entry's insn_idx to determine whether we already have entry for current instruction or not. It can happen that we added jump history entry because current instruction is_jmp_point(), but also we need to add instruction flags for stack access. In this case, we don't want to entries, so we need to reuse last added entry, if it is present. Relying on insn_idx comparison has the same ambiguity problem as the one that was fixed recently in [0], so we avoid that. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231110002638.4168352… Acked-by: Eduard Zingerman <eddyz87(a)gmail.com> Reported-by: Tao Lyu <tao.lyu(a)epfl.ch> Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast(a)kernel.org> Signed-off-by: Shung-Hsi Yu <shung-hsi.yu(a)suse.com> --- This is a backport to fix CVE-2023-52920[1] 1: https://lore.kernel.org/linux-cve-announce/2024110518-CVE-2023-52920-17f6@g… --- include/linux/bpf_verifier.h | 33 +++- kernel/bpf/verifier.c | 175 ++++++++++-------- .../bpf/progs/verifier_subprog_precision.c | 23 ++- .../testing/selftests/bpf/verifier/precise.c | 38 ++-- 4 files changed, 170 insertions(+), 99 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 92919d52f7e1..cb8e97665eaa 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -319,12 +319,34 @@ struct bpf_func_state { struct bpf_stack_state *stack; }; -struct bpf_idx_pair { - u32 prev_idx; - u32 idx; +#define MAX_CALL_FRAMES 8 + +/* instruction history flags, used in bpf_jmp_history_entry.flags field */ +enum { + /* instruction references stack slot through PTR_TO_STACK register; + * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) + * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, + * 8 bytes per slot, so slot index (spi) is [0, 63]) + */ + INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ + + INSN_F_SPI_MASK = 0x3f, /* 6 bits */ + INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ + + INSN_F_STACK_ACCESS = BIT(9), /* we need 10 bits total */ +}; + +static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES); +static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8); + +struct bpf_jmp_history_entry { + u32 idx; + /* insn idx can't be bigger than 1 million */ + u32 prev_idx : 22; + /* special flags, e.g., whether insn is doing register stack spill/load */ + u32 flags : 10; }; -#define MAX_CALL_FRAMES 8 /* Maximum number of register states that can exist at once */ #define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES) struct bpf_verifier_state { @@ -407,7 +429,7 @@ struct bpf_verifier_state { * For most states jmp_history_cnt is [0-3]. * For loops can go up to ~40. */ - struct bpf_idx_pair *jmp_history; + struct bpf_jmp_history_entry *jmp_history; u32 jmp_history_cnt; u32 dfs_depth; u32 callback_unroll_depth; @@ -640,6 +662,7 @@ struct bpf_verifier_env { int cur_stack; } cfg; struct backtrack_state bt; + struct bpf_jmp_history_entry *cur_hist_ent; u32 pass_cnt; /* number of times do_check() was called */ u32 subprog_cnt; /* number of instructions analyzed by the verifier */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 4f19a091571b..5ca02af3a872 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1762,8 +1762,8 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, int i, err; dst_state->jmp_history = copy_array(dst_state->jmp_history, src->jmp_history, - src->jmp_history_cnt, sizeof(struct bpf_idx_pair), - GFP_USER); + src->jmp_history_cnt, sizeof(*dst_state->jmp_history), + GFP_USER); if (!dst_state->jmp_history) return -ENOMEM; dst_state->jmp_history_cnt = src->jmp_history_cnt; @@ -3397,6 +3397,21 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, return __check_reg_arg(env, state->regs, regno, t); } +static int insn_stack_access_flags(int frameno, int spi) +{ + return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno; +} + +static int insn_stack_access_spi(int insn_flags) +{ + return (insn_flags >> INSN_F_SPI_SHIFT) & INSN_F_SPI_MASK; +} + +static int insn_stack_access_frameno(int insn_flags) +{ + return insn_flags & INSN_F_FRAMENO_MASK; +} + static void mark_jmp_point(struct bpf_verifier_env *env, int idx) { env->insn_aux_data[idx].jmp_point = true; @@ -3408,28 +3423,51 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx) } /* for any branch, call, exit record the history of jmps in the given state */ -static int push_jmp_history(struct bpf_verifier_env *env, - struct bpf_verifier_state *cur) +static int push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur, + int insn_flags) { u32 cnt = cur->jmp_history_cnt; - struct bpf_idx_pair *p; + struct bpf_jmp_history_entry *p; size_t alloc_size; - if (!is_jmp_point(env, env->insn_idx)) + /* combine instruction flags if we already recorded this instruction */ + if (env->cur_hist_ent) { + /* atomic instructions push insn_flags twice, for READ and + * WRITE sides, but they should agree on stack slot + */ + WARN_ONCE((env->cur_hist_ent->flags & insn_flags) && + (env->cur_hist_ent->flags & insn_flags) != insn_flags, + "verifier insn history bug: insn_idx %d cur flags %x new flags %x\n", + env->insn_idx, env->cur_hist_ent->flags, insn_flags); + env->cur_hist_ent->flags |= insn_flags; return 0; + } cnt++; alloc_size = kmalloc_size_roundup(size_mul(cnt, sizeof(*p))); p = krealloc(cur->jmp_history, alloc_size, GFP_USER); if (!p) return -ENOMEM; - p[cnt - 1].idx = env->insn_idx; - p[cnt - 1].prev_idx = env->prev_insn_idx; cur->jmp_history = p; + + p = &cur->jmp_history[cnt - 1]; + p->idx = env->insn_idx; + p->prev_idx = env->prev_insn_idx; + p->flags = insn_flags; cur->jmp_history_cnt = cnt; + env->cur_hist_ent = p; + return 0; } +static struct bpf_jmp_history_entry *get_jmp_hist_entry(struct bpf_verifier_state *st, + u32 hist_end, int insn_idx) +{ + if (hist_end > 0 && st->jmp_history[hist_end - 1].idx == insn_idx) + return &st->jmp_history[hist_end - 1]; + return NULL; +} + /* Backtrack one insn at a time. If idx is not at the top of recorded * history then previous instruction came from straight line execution. * Return -ENOENT if we exhausted all instructions within given state. @@ -3591,9 +3629,14 @@ static inline bool bt_is_reg_set(struct backtrack_state *bt, u32 reg) return bt->reg_masks[bt->frame] & (1 << reg); } +static inline bool bt_is_frame_slot_set(struct backtrack_state *bt, u32 frame, u32 slot) +{ + return bt->stack_masks[frame] & (1ull << slot); +} + static inline bool bt_is_slot_set(struct backtrack_state *bt, u32 slot) { - return bt->stack_masks[bt->frame] & (1ull << slot); + return bt_is_frame_slot_set(bt, bt->frame, slot); } /* format registers bitmask, e.g., "r0,r2,r4" for 0x15 mask */ @@ -3647,7 +3690,7 @@ static bool calls_callback(struct bpf_verifier_env *env, int insn_idx); * - *was* processed previously during backtracking. */ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, - struct backtrack_state *bt) + struct bpf_jmp_history_entry *hist, struct backtrack_state *bt) { const struct bpf_insn_cbs cbs = { .cb_call = disasm_kfunc_name, @@ -3660,7 +3703,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, u8 mode = BPF_MODE(insn->code); u32 dreg = insn->dst_reg; u32 sreg = insn->src_reg; - u32 spi, i; + u32 spi, i, fr; if (insn->code == 0) return 0; @@ -3723,20 +3766,15 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, * by 'precise' mark in corresponding register of this state. * No further tracking necessary. */ - if (insn->src_reg != BPF_REG_FP) + if (!hist || !(hist->flags & INSN_F_STACK_ACCESS)) return 0; - /* dreg = *(u64 *)[fp - off] was a fill from the stack. * that [fp - off] slot contains scalar that needs to be * tracked with precision */ - spi = (-insn->off - 1) / BPF_REG_SIZE; - if (spi >= 64) { - verbose(env, "BUG spi %d\n", spi); - WARN_ONCE(1, "verifier backtracking bug"); - return -EFAULT; - } - bt_set_slot(bt, spi); + spi = insn_stack_access_spi(hist->flags); + fr = insn_stack_access_frameno(hist->flags); + bt_set_frame_slot(bt, fr, spi); } else if (class == BPF_STX || class == BPF_ST) { if (bt_is_reg_set(bt, dreg)) /* stx & st shouldn't be using _scalar_ dst_reg @@ -3745,17 +3783,13 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, */ return -ENOTSUPP; /* scalars can only be spilled into stack */ - if (insn->dst_reg != BPF_REG_FP) + if (!hist || !(hist->flags & INSN_F_STACK_ACCESS)) return 0; - spi = (-insn->off - 1) / BPF_REG_SIZE; - if (spi >= 64) { - verbose(env, "BUG spi %d\n", spi); - WARN_ONCE(1, "verifier backtracking bug"); - return -EFAULT; - } - if (!bt_is_slot_set(bt, spi)) + spi = insn_stack_access_spi(hist->flags); + fr = insn_stack_access_frameno(hist->flags); + if (!bt_is_frame_slot_set(bt, fr, spi)) return 0; - bt_clear_slot(bt, spi); + bt_clear_frame_slot(bt, fr, spi); if (class == BPF_STX) bt_set_reg(bt, sreg); } else if (class == BPF_JMP || class == BPF_JMP32) { @@ -3799,10 +3833,14 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, WARN_ONCE(1, "verifier backtracking bug"); return -EFAULT; } - /* we don't track register spills perfectly, - * so fallback to force-precise instead of failing */ - if (bt_stack_mask(bt) != 0) - return -ENOTSUPP; + /* we are now tracking register spills correctly, + * so any instance of leftover slots is a bug + */ + if (bt_stack_mask(bt) != 0) { + verbose(env, "BUG stack slots %llx\n", bt_stack_mask(bt)); + WARN_ONCE(1, "verifier backtracking bug (subprog leftover stack slots)"); + return -EFAULT; + } /* propagate r1-r5 to the caller */ for (i = BPF_REG_1; i <= BPF_REG_5; i++) { if (bt_is_reg_set(bt, i)) { @@ -3827,8 +3865,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, WARN_ONCE(1, "verifier backtracking bug"); return -EFAULT; } - if (bt_stack_mask(bt) != 0) - return -ENOTSUPP; + if (bt_stack_mask(bt) != 0) { + verbose(env, "BUG stack slots %llx\n", bt_stack_mask(bt)); + WARN_ONCE(1, "verifier backtracking bug (callback leftover stack slots)"); + return -EFAULT; + } /* clear r1-r5 in callback subprog's mask */ for (i = BPF_REG_1; i <= BPF_REG_5; i++) bt_clear_reg(bt, i); @@ -4265,6 +4306,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno) for (;;) { DECLARE_BITMAP(mask, 64); u32 history = st->jmp_history_cnt; + struct bpf_jmp_history_entry *hist; if (env->log.level & BPF_LOG_LEVEL2) { verbose(env, "mark_precise: frame%d: last_idx %d first_idx %d subseq_idx %d \n", @@ -4328,7 +4370,8 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno) err = 0; skip_first = false; } else { - err = backtrack_insn(env, i, subseq_idx, bt); + hist = get_jmp_hist_entry(st, history, i); + err = backtrack_insn(env, i, subseq_idx, hist, bt); } if (err == -ENOTSUPP) { mark_all_scalars_precise(env, env->cur_state); @@ -4381,22 +4424,10 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno) bitmap_from_u64(mask, bt_frame_stack_mask(bt, fr)); for_each_set_bit(i, mask, 64) { if (i >= func->allocated_stack / BPF_REG_SIZE) { - /* the sequence of instructions: - * 2: (bf) r3 = r10 - * 3: (7b) *(u64 *)(r3 -8) = r0 - * 4: (79) r4 = *(u64 *)(r10 -8) - * doesn't contain jmps. It's backtracked - * as a single block. - * During backtracking insn 3 is not recognized as - * stack access, so at the end of backtracking - * stack slot fp-8 is still marked in stack_mask. - * However the parent state may not have accessed - * fp-8 and it's "unallocated" stack space. - * In such case fallback to conservative. - */ - mark_all_scalars_precise(env, env->cur_state); - bt_reset(bt); - return 0; + verbose(env, "BUG backtracking (stack slot %d, total slots %d)\n", + i, func->allocated_stack / BPF_REG_SIZE); + WARN_ONCE(1, "verifier backtracking bug (stack slot out of bounds)"); + return -EFAULT; } if (!is_spilled_scalar_reg(&func->stack[i])) { @@ -4561,7 +4592,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, int i, slot = -off - 1, spi = slot / BPF_REG_SIZE, err; struct bpf_insn *insn = &env->prog->insnsi[insn_idx]; struct bpf_reg_state *reg = NULL; - u32 dst_reg = insn->dst_reg; + int insn_flags = insn_stack_access_flags(state->frameno, spi); /* caller checked that off % size == 0 and -MAX_BPF_STACK <= off < 0, * so it's aligned access and [off, off + size) are within stack limits @@ -4599,17 +4630,6 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, mark_stack_slot_scratched(env, spi); if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && !register_is_null(reg) && env->bpf_capable) { - if (dst_reg != BPF_REG_FP) { - /* The backtracking logic can only recognize explicit - * stack slot address like [fp - 8]. Other spill of - * scalar via different register has to be conservative. - * Backtrack from here and mark all registers as precise - * that contributed into 'reg' being a constant. - */ - err = mark_chain_precision(env, value_regno); - if (err) - return err; - } save_register_state(state, spi, reg, size); /* Break the relation on a narrowing spill. */ if (fls64(reg->umax_value) > BITS_PER_BYTE * size) @@ -4621,6 +4641,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, __mark_reg_known(&fake_reg, insn->imm); fake_reg.type = SCALAR_VALUE; save_register_state(state, spi, &fake_reg, size); + insn_flags = 0; /* not a register spill */ } else if (reg && is_spillable_regtype(reg->type)) { /* register containing pointer is being spilled into stack */ if (size != BPF_REG_SIZE) { @@ -4666,9 +4687,12 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, /* Mark slots affected by this stack write. */ for (i = 0; i < size; i++) - state->stack[spi].slot_type[(slot - i) % BPF_REG_SIZE] = - type; + state->stack[spi].slot_type[(slot - i) % BPF_REG_SIZE] = type; + insn_flags = 0; /* not a register spill */ } + + if (insn_flags) + return push_jmp_history(env, env->cur_state, insn_flags); return 0; } @@ -4857,6 +4881,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env, int i, slot = -off - 1, spi = slot / BPF_REG_SIZE; struct bpf_reg_state *reg; u8 *stype, type; + int insn_flags = insn_stack_access_flags(reg_state->frameno, spi); stype = reg_state->stack[spi].slot_type; reg = &reg_state->stack[spi].spilled_ptr; @@ -4902,12 +4927,10 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env, return -EACCES; } mark_reg_unknown(env, state->regs, dst_regno); + insn_flags = 0; /* not restoring original register state */ } state->regs[dst_regno].live |= REG_LIVE_WRITTEN; - return 0; - } - - if (dst_regno >= 0) { + } else if (dst_regno >= 0) { /* restore register state from stack */ copy_register_state(&state->regs[dst_regno], reg); /* mark reg as written since spilled pointer state likely @@ -4943,7 +4966,10 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env, mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64); if (dst_regno >= 0) mark_reg_stack_read(env, reg_state, off, off + size, dst_regno); + insn_flags = 0; /* we are not restoring spilled register */ } + if (insn_flags) + return push_jmp_history(env, env->cur_state, insn_flags); return 0; } @@ -7027,7 +7053,6 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i BPF_SIZE(insn->code), BPF_WRITE, -1, true, false); if (err) return err; - return 0; } @@ -16773,7 +16798,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) * the precision needs to be propagated back in * the current state. */ - err = err ? : push_jmp_history(env, cur); + if (is_jmp_point(env, env->insn_idx)) + err = err ? : push_jmp_history(env, cur, 0); err = err ? : propagate_precision(env, &sl->state); if (err) return err; @@ -16997,6 +17023,9 @@ static int do_check(struct bpf_verifier_env *env) u8 class; int err; + /* reset current history entry on each new instruction */ + env->cur_hist_ent = NULL; + env->prev_insn_idx = prev_insn_idx; if (env->insn_idx >= insn_cnt) { verbose(env, "invalid insn idx %d insn_cnt %d\n", @@ -17036,7 +17065,7 @@ static int do_check(struct bpf_verifier_env *env) } if (is_jmp_point(env, env->insn_idx)) { - err = push_jmp_history(env, state); + err = push_jmp_history(env, state, 0); if (err) return err; } diff --git a/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c index f61d623b1ce8..f87365f7599b 100644 --- a/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c +++ b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c @@ -541,11 +541,24 @@ static __u64 subprog_spill_reg_precise(void) SEC("?raw_tp") __success __log_level(2) -/* precision backtracking can't currently handle stack access not through r10, - * so we won't be able to mark stack slot fp-8 as precise, and so will - * fallback to forcing all as precise - */ -__msg("mark_precise: frame0: falling back to forcing all scalars precise") +__msg("10: (0f) r1 += r7") +__msg("mark_precise: frame0: last_idx 10 first_idx 7 subseq_idx -1") +__msg("mark_precise: frame0: regs=r7 stack= before 9: (bf) r1 = r8") +__msg("mark_precise: frame0: regs=r7 stack= before 8: (27) r7 *= 4") +__msg("mark_precise: frame0: regs=r7 stack= before 7: (79) r7 = *(u64 *)(r10 -8)") +__msg("mark_precise: frame0: parent state regs= stack=-8: R0_w=2 R6_w=1 R8_rw=map_value(map=.data.vals,ks=4,vs=16) R10=fp0 fp-8_rw=P1") +__msg("mark_precise: frame0: last_idx 18 first_idx 0 subseq_idx 7") +__msg("mark_precise: frame0: regs= stack=-8 before 18: (95) exit") +__msg("mark_precise: frame1: regs= stack= before 17: (0f) r0 += r2") +__msg("mark_precise: frame1: regs= stack= before 16: (79) r2 = *(u64 *)(r1 +0)") +__msg("mark_precise: frame1: regs= stack= before 15: (79) r0 = *(u64 *)(r10 -16)") +__msg("mark_precise: frame1: regs= stack= before 14: (7b) *(u64 *)(r10 -16) = r2") +__msg("mark_precise: frame1: regs= stack= before 13: (7b) *(u64 *)(r1 +0) = r2") +__msg("mark_precise: frame1: regs=r2 stack= before 6: (85) call pc+6") +__msg("mark_precise: frame0: regs=r2 stack= before 5: (bf) r2 = r6") +__msg("mark_precise: frame0: regs=r6 stack= before 4: (07) r1 += -8") +__msg("mark_precise: frame0: regs=r6 stack= before 3: (bf) r1 = r10") +__msg("mark_precise: frame0: regs=r6 stack= before 2: (b7) r6 = 1") __naked int subprog_spill_into_parent_stack_slot_precise(void) { asm volatile ( diff --git a/tools/testing/selftests/bpf/verifier/precise.c b/tools/testing/selftests/bpf/verifier/precise.c index 0d84dd1f38b6..8a2ff81d8350 100644 --- a/tools/testing/selftests/bpf/verifier/precise.c +++ b/tools/testing/selftests/bpf/verifier/precise.c @@ -140,10 +140,11 @@ .result = REJECT, }, { - "precise: ST insn causing spi > allocated_stack", + "precise: ST zero to stack insn is supported", .insns = { BPF_MOV64_REG(BPF_REG_3, BPF_REG_10), BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 123, 0), + /* not a register spill, so we stop precision propagation for R4 here */ BPF_ST_MEM(BPF_DW, BPF_REG_3, -8, 0), BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -8), BPF_MOV64_IMM(BPF_REG_0, -1), @@ -157,11 +158,11 @@ mark_precise: frame0: last_idx 4 first_idx 2\ mark_precise: frame0: regs=r4 stack= before 4\ mark_precise: frame0: regs=r4 stack= before 3\ - mark_precise: frame0: regs= stack=-8 before 2\ - mark_precise: frame0: falling back to forcing all scalars precise\ - force_precise: frame0: forcing r0 to be precise\ mark_precise: frame0: last_idx 5 first_idx 5\ - mark_precise: frame0: parent state regs= stack=:", + mark_precise: frame0: parent state regs=r0 stack=:\ + mark_precise: frame0: last_idx 4 first_idx 2\ + mark_precise: frame0: regs=r0 stack= before 4\ + 5: R0=-1 R4=0", .result = VERBOSE_ACCEPT, .retval = -1, }, @@ -169,6 +170,8 @@ "precise: STX insn causing spi > allocated_stack", .insns = { BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32), + /* make later reg spill more interesting by having somewhat known scalar */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xff), BPF_MOV64_REG(BPF_REG_3, BPF_REG_10), BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 123, 0), BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, -8), @@ -179,18 +182,21 @@ }, .prog_type = BPF_PROG_TYPE_XDP, .flags = BPF_F_TEST_STATE_FREQ, - .errstr = "mark_precise: frame0: last_idx 6 first_idx 6\ + .errstr = "mark_precise: frame0: last_idx 7 first_idx 7\ mark_precise: frame0: parent state regs=r4 stack=:\ - mark_precise: frame0: last_idx 5 first_idx 3\ - mark_precise: frame0: regs=r4 stack= before 5\ - mark_precise: frame0: regs=r4 stack= before 4\ - mark_precise: frame0: regs= stack=-8 before 3\ - mark_precise: frame0: falling back to forcing all scalars precise\ - force_precise: frame0: forcing r0 to be precise\ - force_precise: frame0: forcing r0 to be precise\ - force_precise: frame0: forcing r0 to be precise\ - force_precise: frame0: forcing r0 to be precise\ - mark_precise: frame0: last_idx 6 first_idx 6\ + mark_precise: frame0: last_idx 6 first_idx 4\ + mark_precise: frame0: regs=r4 stack= before 6: (b7) r0 = -1\ + mark_precise: frame0: regs=r4 stack= before 5: (79) r4 = *(u64 *)(r10 -8)\ + mark_precise: frame0: regs= stack=-8 before 4: (7b) *(u64 *)(r3 -8) = r0\ + mark_precise: frame0: parent state regs=r0 stack=:\ + mark_precise: frame0: last_idx 3 first_idx 3\ + mark_precise: frame0: regs=r0 stack= before 3: (55) if r3 != 0x7b goto pc+0\ + mark_precise: frame0: regs=r0 stack= before 2: (bf) r3 = r10\ + mark_precise: frame0: regs=r0 stack= before 1: (57) r0 &= 255\ + mark_precise: frame0: parent state regs=r0 stack=:\ + mark_precise: frame0: last_idx 0 first_idx 0\ + mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7\ + mark_precise: frame0: last_idx 7 first_idx 7\ mark_precise: frame0: parent state regs= stack=:", .result = VERBOSE_ACCEPT, .retval = -1, -- 2.47.0

11 months, 3 weeks

3
5
0 0

[PATCH v2, 2/2] usb: typec: tcpm: fix the sender response time issue

by joswang

From: Jos Wang <joswang(a)lenovo.com> According to the USB PD3 CTS specification (https://usb.org/document-library/ usb-power-delivery-compliance-test-specification-0/ USB_PD3_CTS_Q4_2024_OR.zip), the requirements for tSenderResponse are different in PD2 and PD3 modes, see Table 19 Timing Table & Calculations. For PD2 mode, the tSenderResponse min 24ms and max 30ms; for PD3 mode, the tSenderResponse min 27ms and max 33ms. For the "TEST.PD.PROT.SRC.2 Get_Source_Cap No Request" test item, after receiving the Source_Capabilities Message sent by the UUT, the tester deliberately does not send a Request Message in order to force the SenderResponse timer on the Source UUT to timeout. The Tester checks that a Hard Reset is detected between tSenderResponse min and max，the delay is between the last bit of the GoodCRC Message EOP has been sent and the first bit of Hard Reset SOP has been received. The current code does not distinguish between PD2 and PD3 modes, and tSenderResponse defaults to 60ms. This will cause this test item and the following tests to fail: TEST.PD.PROT.SRC3.2 SenderResponseTimer Timeout TEST.PD.PROT.SNK.6 SenderResponseTimer Timeout Considering factors such as SOC performance, i2c rate, and the speed of PD chip sending data, "pd2-sender-response-time-ms" and "pd3-sender-response-time-ms" DT time properties are added to allow users to define platform timing. For values that have not been explicitly defined in DT using this property, a default value of 27ms for PD2 tSenderResponse and 30ms for PD3 tSenderResponse is set. Fixes: 2eadc33f40d4 ("typec: tcpm: Add core support for sink side PPS") Cc: stable(a)vger.kernel.org Signed-off-by: Jos Wang <joswang(a)lenovo.com> --- v1 -> v2: - modify the commit message - patch 1/2 and patch 2/2 are placed in the same thread drivers/usb/typec/tcpm/tcpm.c | 50 +++++++++++++++++++++++------------ include/linux/usb/pd.h | 3 ++- 2 files changed, 35 insertions(+), 18 deletions(-) diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c index 6021eeb903fe..3a159bfcf382 100644 --- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -314,12 +314,16 @@ struct pd_data { * @sink_wait_cap_time: Deadline (in ms) for tTypeCSinkWaitCap timer * @ps_src_wait_off_time: Deadline (in ms) for tPSSourceOff timer * @cc_debounce_time: Deadline (in ms) for tCCDebounce timer + * @pd2_sender_response_time: Deadline (in ms) for pd20 tSenderResponse timer + * @pd3_sender_response_time: Deadline (in ms) for pd30 tSenderResponse timer */ struct pd_timings { u32 sink_wait_cap_time; u32 ps_src_off_time; u32 cc_debounce_time; u32 snk_bc12_cmpletion_time; + u32 pd2_sender_response_time; + u32 pd3_sender_response_time; }; struct tcpm_port { @@ -3776,7 +3780,9 @@ static bool tcpm_send_queued_message(struct tcpm_port *port) } else if (port->pwr_role == TYPEC_SOURCE) { tcpm_ams_finish(port); tcpm_set_state(port, HARD_RESET_SEND, - PD_T_SENDER_RESPONSE); + port->negotiated_rev >= PD_REV30 ? + port->timings.pd3_sender_response_time : + port->timings.pd2_sender_response_time); } else { tcpm_ams_finish(port); } @@ -4619,6 +4625,9 @@ static void run_state_machine(struct tcpm_port *port) enum typec_pwr_opmode opmode; unsigned int msecs; enum tcpm_state upcoming_state; + u32 sender_response_time = port->negotiated_rev >= PD_REV30 ? + port->timings.pd3_sender_response_time : + port->timings.pd2_sender_response_time; if (port->tcpc->check_contaminant && port->state != CHECK_CONTAMINANT) port->potential_contaminant = ((port->enter_state == SRC_ATTACH_WAIT && @@ -5113,7 +5122,7 @@ static void run_state_machine(struct tcpm_port *port) tcpm_set_state(port, SNK_WAIT_CAPABILITIES, 0); } else { tcpm_set_state_cond(port, hard_reset_state(port), - PD_T_SENDER_RESPONSE); + sender_response_time); } break; case SNK_NEGOTIATE_PPS_CAPABILITIES: @@ -5135,7 +5144,7 @@ static void run_state_machine(struct tcpm_port *port) tcpm_set_state(port, SNK_READY, 0); } else { tcpm_set_state_cond(port, hard_reset_state(port), - PD_T_SENDER_RESPONSE); + sender_response_time); } break; case SNK_TRANSITION_SINK: @@ -5387,7 +5396,7 @@ static void run_state_machine(struct tcpm_port *port) port->message_id_prime = 0; port->rx_msgid_prime = -1; tcpm_pd_send_control(port, PD_CTRL_SOFT_RESET, TCPC_TX_SOP_PRIME); - tcpm_set_state_cond(port, ready_state(port), PD_T_SENDER_RESPONSE); + tcpm_set_state_cond(port, ready_state(port), sender_response_time); } else { port->message_id = 0; port->rx_msgid = -1; @@ -5398,7 +5407,7 @@ static void run_state_machine(struct tcpm_port *port) tcpm_set_state_cond(port, hard_reset_state(port), 0); else tcpm_set_state_cond(port, hard_reset_state(port), - PD_T_SENDER_RESPONSE); + sender_response_time); } break; @@ -5409,8 +5418,7 @@ static void run_state_machine(struct tcpm_port *port) port->send_discover = true; port->send_discover_prime = false; } - tcpm_set_state_cond(port, DR_SWAP_SEND_TIMEOUT, - PD_T_SENDER_RESPONSE); + tcpm_set_state_cond(port, DR_SWAP_SEND_TIMEOUT, sender_response_time); break; case DR_SWAP_ACCEPT: tcpm_pd_send_control(port, PD_CTRL_ACCEPT, TCPC_TX_SOP); @@ -5444,7 +5452,7 @@ static void run_state_machine(struct tcpm_port *port) tcpm_set_state(port, ERROR_RECOVERY, 0); break; } - tcpm_set_state_cond(port, FR_SWAP_SEND_TIMEOUT, PD_T_SENDER_RESPONSE); + tcpm_set_state_cond(port, FR_SWAP_SEND_TIMEOUT, sender_response_time); break; case FR_SWAP_SEND_TIMEOUT: tcpm_set_state(port, ERROR_RECOVERY, 0); @@ -5475,8 +5483,7 @@ static void run_state_machine(struct tcpm_port *port) break; case PR_SWAP_SEND: tcpm_pd_send_control(port, PD_CTRL_PR_SWAP, TCPC_TX_SOP); - tcpm_set_state_cond(port, PR_SWAP_SEND_TIMEOUT, - PD_T_SENDER_RESPONSE); + tcpm_set_state_cond(port, PR_SWAP_SEND_TIMEOUT, sender_response_time); break; case PR_SWAP_SEND_TIMEOUT: tcpm_swap_complete(port, -ETIMEDOUT); @@ -5574,8 +5581,7 @@ static void run_state_machine(struct tcpm_port *port) break; case VCONN_SWAP_SEND: tcpm_pd_send_control(port, PD_CTRL_VCONN_SWAP, TCPC_TX_SOP); - tcpm_set_state(port, VCONN_SWAP_SEND_TIMEOUT, - PD_T_SENDER_RESPONSE); + tcpm_set_state(port, VCONN_SWAP_SEND_TIMEOUT, sender_response_time); break; case VCONN_SWAP_SEND_TIMEOUT: tcpm_swap_complete(port, -ETIMEDOUT); @@ -5656,23 +5662,21 @@ static void run_state_machine(struct tcpm_port *port) break; case GET_STATUS_SEND: tcpm_pd_send_control(port, PD_CTRL_GET_STATUS, TCPC_TX_SOP); - tcpm_set_state(port, GET_STATUS_SEND_TIMEOUT, - PD_T_SENDER_RESPONSE); + tcpm_set_state(port, GET_STATUS_SEND_TIMEOUT, sender_response_time); break; case GET_STATUS_SEND_TIMEOUT: tcpm_set_state(port, ready_state(port), 0); break; case GET_PPS_STATUS_SEND: tcpm_pd_send_control(port, PD_CTRL_GET_PPS_STATUS, TCPC_TX_SOP); - tcpm_set_state(port, GET_PPS_STATUS_SEND_TIMEOUT, - PD_T_SENDER_RESPONSE); + tcpm_set_state(port, GET_PPS_STATUS_SEND_TIMEOUT, sender_response_time); break; case GET_PPS_STATUS_SEND_TIMEOUT: tcpm_set_state(port, ready_state(port), 0); break; case GET_SINK_CAP: tcpm_pd_send_control(port, PD_CTRL_GET_SINK_CAP, TCPC_TX_SOP); - tcpm_set_state(port, GET_SINK_CAP_TIMEOUT, PD_T_SENDER_RESPONSE); + tcpm_set_state(port, GET_SINK_CAP_TIMEOUT, sender_response_time); break; case GET_SINK_CAP_TIMEOUT: port->sink_cap_done = true; @@ -7109,6 +7113,18 @@ static void tcpm_fw_get_timings(struct tcpm_port *port, struct fwnode_handle *fw ret = fwnode_property_read_u32(fwnode, "sink-bc12-completion-time-ms", &val); if (!ret) port->timings.snk_bc12_cmpletion_time = val; + + ret = fwnode_property_read_u32(fwnode, "pd2-sender-response-time-ms", &val); + if (!ret) + port->timings.pd2_sender_response_time = val; + else + port->timings.pd2_sender_response_time = PD_T_PD2_SENDER_RESPONSE; + + ret = fwnode_property_read_u32(fwnode, "pd3-sender-response-time-ms", &val); + if (!ret) + port->timings.pd3_sender_response_time = val; + else + port->timings.pd3_sender_response_time = PD_T_PD3_SENDER_RESPONSE; } static int tcpm_fw_get_caps(struct tcpm_port *port, struct fwnode_handle *fwnode) diff --git a/include/linux/usb/pd.h b/include/linux/usb/pd.h index d50098fb16b5..9c599e851b9a 100644 --- a/include/linux/usb/pd.h +++ b/include/linux/usb/pd.h @@ -457,7 +457,6 @@ static inline unsigned int rdo_max_power(u32 rdo) #define PD_T_NO_RESPONSE 5000 /* 4.5 - 5.5 seconds */ #define PD_T_DB_DETECT 10000 /* 10 - 15 seconds */ #define PD_T_SEND_SOURCE_CAP 150 /* 100 - 200 ms */ -#define PD_T_SENDER_RESPONSE 60 /* 24 - 30 ms, relaxed */ #define PD_T_RECEIVER_RESPONSE 15 /* 15ms max */ #define PD_T_SOURCE_ACTIVITY 45 #define PD_T_SINK_ACTIVITY 135 @@ -491,6 +490,8 @@ static inline unsigned int rdo_max_power(u32 rdo) #define PD_T_CC_DEBOUNCE 200 /* 100 - 200 ms */ #define PD_T_PD_DEBOUNCE 20 /* 10 - 20 ms */ #define PD_T_TRY_CC_DEBOUNCE 15 /* 10 - 20 ms */ +#define PD_T_PD2_SENDER_RESPONSE 27 /* PD20 spec 24 - 30 ms */ +#define PD_T_PD3_SENDER_RESPONSE 30 /* PD30 spec 27 - 33 ms */ #define PD_N_CAPS_COUNT (PD_T_NO_RESPONSE / PD_T_SEND_SOURCE_CAP) #define PD_N_HARD_RESET_COUNT 2 -- 2.17.1

11 months, 3 weeks

3
7
0 0

[PATCH v2 1/2] HID: hid-plantronics: Add mic mute mapping and generalize quirks

by Wade Wang

From: Terry Junge <linuxhid(a)cosmicgizmosystems.com> Add mapping for headset mute key events. Remove PLT_QUIRK_DOUBLE_VOLUME_KEYS quirk and made it generic. The quirk logic did not keep track of the actual previous key so any key event occurring in less than or equal to 5ms was ignored. Remove PLT_QUIRK_FOLLOWED_OPPOSITE_VOLUME_KEYS quirk. It had the same logic issue as the double key quirk and was actually masking the as designed behavior of most of the headsets. It's occurrence should be minimized with the ALSA control naming quirk that is part of the patch set. Signed-off-by: Terry Junge <linuxhid(a)cosmicgizmosystems.com> Signed-off-by: Wade Wang <wade.wang(a)hp.com> Cc: stable(a)vger.kernel.org --- V1 -> V2: Optimize out 2 macros - no functional changes drivers/hid/hid-plantronics.c | 144 ++++++++++++++++------------------ 1 file changed, 67 insertions(+), 77 deletions(-) diff --git a/drivers/hid/hid-plantronics.c b/drivers/hid/hid-plantronics.c index 25cfd964dc25..acb9eb18f7cc 100644 --- a/drivers/hid/hid-plantronics.c +++ b/drivers/hid/hid-plantronics.c @@ -6,9 +6,6 @@ * Copyright (c) 2015-2018 Terry Junge <terry.junge(a)plantronics.com> */ -/* - */ - #include "hid-ids.h" #include <linux/hid.h> @@ -23,30 +20,28 @@ #define PLT_VOL_UP 0x00b1 #define PLT_VOL_DOWN 0x00b2 +#define PLT_MIC_MUTE 0x00b5 #define PLT1_VOL_UP (PLT_HID_1_0_PAGE | PLT_VOL_UP) #define PLT1_VOL_DOWN (PLT_HID_1_0_PAGE | PLT_VOL_DOWN) +#define PLT1_MIC_MUTE (PLT_HID_1_0_PAGE | PLT_MIC_MUTE) #define PLT2_VOL_UP (PLT_HID_2_0_PAGE | PLT_VOL_UP) #define PLT2_VOL_DOWN (PLT_HID_2_0_PAGE | PLT_VOL_DOWN) +#define PLT2_MIC_MUTE (PLT_HID_2_0_PAGE | PLT_MIC_MUTE) +#define HID_TELEPHONY_MUTE (HID_UP_TELEPHONY | 0x2f) +#define HID_CONSUMER_MUTE (HID_UP_CONSUMER | 0xe2) #define PLT_DA60 0xda60 #define PLT_BT300_MIN 0x0413 #define PLT_BT300_MAX 0x0418 - -#define PLT_ALLOW_CONSUMER (field->application == HID_CP_CONSUMERCONTROL && \ - (usage->hid & HID_USAGE_PAGE) == HID_UP_CONSUMER) - -#define PLT_QUIRK_DOUBLE_VOLUME_KEYS BIT(0) -#define PLT_QUIRK_FOLLOWED_OPPOSITE_VOLUME_KEYS BIT(1) - #define PLT_DOUBLE_KEY_TIMEOUT 5 /* ms */ -#define PLT_FOLLOWED_OPPOSITE_KEY_TIMEOUT 220 /* ms */ struct plt_drv_data { unsigned long device_type; - unsigned long last_volume_key_ts; - u32 quirks; + unsigned long last_key_ts; + unsigned long double_key_to; + __u16 last_key; }; static int plantronics_input_mapping(struct hid_device *hdev, @@ -58,34 +53,43 @@ static int plantronics_input_mapping(struct hid_device *hdev, unsigned short mapped_key; struct plt_drv_data *drv_data = hid_get_drvdata(hdev); unsigned long plt_type = drv_data->device_type; + int allow_mute = usage->hid == HID_TELEPHONY_MUTE; + int allow_consumer = field->application == HID_CP_CONSUMERCONTROL && + (usage->hid & HID_USAGE_PAGE) == HID_UP_CONSUMER && + usage->hid != HID_CONSUMER_MUTE; /* special case for PTT products */ if (field->application == HID_GD_JOYSTICK) goto defaulted; - /* handle volume up/down mapping */ /* non-standard types or multi-HID interfaces - plt_type is PID */ if (!(plt_type & HID_USAGE_PAGE)) { switch (plt_type) { case PLT_DA60: - if (PLT_ALLOW_CONSUMER) + if (allow_consumer) goto defaulted; - goto ignored; + if (usage->hid == HID_CONSUMER_MUTE) { + mapped_key = KEY_MICMUTE; + goto mapped; + } + break; default: - if (PLT_ALLOW_CONSUMER) + if (allow_consumer || allow_mute) goto defaulted; } + goto ignored; } - /* handle standard types - plt_type is 0xffa0uuuu or 0xffa2uuuu */ - /* 'basic telephony compliant' - allow default consumer page map */ - else if ((plt_type & HID_USAGE) >= PLT_BASIC_TELEPHONY && - (plt_type & HID_USAGE) != PLT_BASIC_EXCEPTION) { - if (PLT_ALLOW_CONSUMER) - goto defaulted; - } - /* not 'basic telephony' - apply legacy mapping */ - /* only map if the field is in the device's primary vendor page */ - else if (!((field->application ^ plt_type) & HID_USAGE_PAGE)) { + + /* handle standard consumer control mapping */ + /* and standard telephony mic mute mapping */ + if (allow_consumer || allow_mute) + goto defaulted; + + /* handle vendor unique types - plt_type is 0xffa0uuuu or 0xffa2uuuu */ + /* if not 'basic telephony compliant' - map vendor unique controls */ + if (!((plt_type & HID_USAGE) >= PLT_BASIC_TELEPHONY && + (plt_type & HID_USAGE) != PLT_BASIC_EXCEPTION) && + !((field->application ^ plt_type) & HID_USAGE_PAGE)) switch (usage->hid) { case PLT1_VOL_UP: case PLT2_VOL_UP: @@ -95,8 +99,11 @@ static int plantronics_input_mapping(struct hid_device *hdev, case PLT2_VOL_DOWN: mapped_key = KEY_VOLUMEDOWN; goto mapped; + case PLT1_MIC_MUTE: + case PLT2_MIC_MUTE: + mapped_key = KEY_MICMUTE; + goto mapped; } - } /* * Future mapping of call control or other usages, @@ -105,6 +112,8 @@ static int plantronics_input_mapping(struct hid_device *hdev, */ ignored: + hid_dbg(hdev, "usage: %08x (appl: %08x) - ignored\n", + usage->hid, field->application); return -1; defaulted: @@ -123,38 +132,26 @@ static int plantronics_event(struct hid_device *hdev, struct hid_field *field, struct hid_usage *usage, __s32 value) { struct plt_drv_data *drv_data = hid_get_drvdata(hdev); + unsigned long prev_tsto, cur_ts; + __u16 prev_key, cur_key; - if (drv_data->quirks & PLT_QUIRK_DOUBLE_VOLUME_KEYS) { - unsigned long prev_ts, cur_ts; + /* Usages are filtered in plantronics_usages. */ - /* Usages are filtered in plantronics_usages. */ + /* HZ too low for ms resolution - double key detection disabled */ + /* or it is a key release - handle key presses only. */ + if (!drv_data->double_key_to || !value) + return 0; - if (!value) /* Handle key presses only. */ - return 0; + prev_tsto = drv_data->last_key_ts + drv_data->double_key_to; + cur_ts = drv_data->last_key_ts = jiffies; + prev_key = drv_data->last_key; + cur_key = drv_data->last_key = usage->code; - prev_ts = drv_data->last_volume_key_ts; - cur_ts = jiffies; - if (jiffies_to_msecs(cur_ts - prev_ts) <= PLT_DOUBLE_KEY_TIMEOUT) - return 1; /* Ignore the repeated key. */ - - drv_data->last_volume_key_ts = cur_ts; + /* If the same key occurs in <= double_key_to -- ignore it */ + if (prev_key == cur_key && time_before_eq(cur_ts, prev_tsto)) { + hid_dbg(hdev, "double key %d ignored\n", cur_key); + return 1; /* Ignore the repeated key. */ } - if (drv_data->quirks & PLT_QUIRK_FOLLOWED_OPPOSITE_VOLUME_KEYS) { - unsigned long prev_ts, cur_ts; - - /* Usages are filtered in plantronics_usages. */ - - if (!value) /* Handle key presses only. */ - return 0; - - prev_ts = drv_data->last_volume_key_ts; - cur_ts = jiffies; - if (jiffies_to_msecs(cur_ts - prev_ts) <= PLT_FOLLOWED_OPPOSITE_KEY_TIMEOUT) - return 1; /* Ignore the followed opposite volume key. */ - - drv_data->last_volume_key_ts = cur_ts; - } - return 0; } @@ -196,12 +193,16 @@ static int plantronics_probe(struct hid_device *hdev, ret = hid_parse(hdev); if (ret) { hid_err(hdev, "parse failed\n"); - goto err; + return ret; } drv_data->device_type = plantronics_device_type(hdev); - drv_data->quirks = id->driver_data; - drv_data->last_volume_key_ts = jiffies - msecs_to_jiffies(PLT_DOUBLE_KEY_TIMEOUT); + drv_data->double_key_to = msecs_to_jiffies(PLT_DOUBLE_KEY_TIMEOUT); + drv_data->last_key_ts = jiffies - drv_data->double_key_to; + + /* if HZ does not allow ms resolution - disable double key detection */ + if (drv_data->double_key_to < PLT_DOUBLE_KEY_TIMEOUT) + drv_data->double_key_to = 0; hid_set_drvdata(hdev, drv_data); @@ -210,29 +211,10 @@ static int plantronics_probe(struct hid_device *hdev, if (ret) hid_err(hdev, "hw start failed\n"); -err: return ret; } static const struct hid_device_id plantronics_devices[] = { - { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, - USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3210_SERIES), - .driver_data = PLT_QUIRK_DOUBLE_VOLUME_KEYS }, - { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, - USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3220_SERIES), - .driver_data = PLT_QUIRK_DOUBLE_VOLUME_KEYS }, - { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, - USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3215_SERIES), - .driver_data = PLT_QUIRK_DOUBLE_VOLUME_KEYS }, - { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, - USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3225_SERIES), - .driver_data = PLT_QUIRK_DOUBLE_VOLUME_KEYS }, - { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, - USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3325_SERIES), - .driver_data = PLT_QUIRK_FOLLOWED_OPPOSITE_VOLUME_KEYS }, - { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, - USB_DEVICE_ID_PLANTRONICS_ENCOREPRO_500_SERIES), - .driver_data = PLT_QUIRK_FOLLOWED_OPPOSITE_VOLUME_KEYS }, { HID_USB_DEVICE(USB_VENDOR_ID_PLANTRONICS, HID_ANY_ID) }, { } }; @@ -241,6 +223,14 @@ MODULE_DEVICE_TABLE(hid, plantronics_devices); static const struct hid_usage_id plantronics_usages[] = { { HID_CP_VOLUMEUP, EV_KEY, HID_ANY_ID }, { HID_CP_VOLUMEDOWN, EV_KEY, HID_ANY_ID }, + { HID_TELEPHONY_MUTE, EV_KEY, HID_ANY_ID }, + { HID_CONSUMER_MUTE, EV_KEY, HID_ANY_ID }, + { PLT2_VOL_UP, EV_KEY, HID_ANY_ID }, + { PLT2_VOL_DOWN, EV_KEY, HID_ANY_ID }, + { PLT2_MIC_MUTE, EV_KEY, HID_ANY_ID }, + { PLT1_VOL_UP, EV_KEY, HID_ANY_ID }, + { PLT1_VOL_DOWN, EV_KEY, HID_ANY_ID }, + { PLT1_MIC_MUTE, EV_KEY, HID_ANY_ID }, { HID_TERMINATOR, HID_TERMINATOR, HID_TERMINATOR } }; -- 2.43.0

11 months, 3 weeks

3
5
0 0

[PATCH 1/7] mm/hugetlb: Fix avoid_reserve to allow taking folio from subpool

by Peter Xu

Since commit 04f2cbe35699 ("hugetlb: guarantee that COW faults for a process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed"), avoid_reserve was introduced for a special case of CoW on hugetlb private mappings, and only if the owner VMA is trying to allocate yet another hugetlb folio that is not reserved within the private vma reserved map. Later on, in commit d85f69b0b533 ("mm/hugetlb: alloc_huge_page handle areas hole punched by fallocate"), alloc_huge_page() enforced to not consume any global reservation as long as avoid_reserve=true. This operation doesn't look correct, because even if it will enforce the allocation to not use global reservation at all, it will still try to take one reservation from the spool (if the subpool existed). Then since the spool reserved pages take from global reservation, it'll also take one reservation globally. Logically it can cause global reservation to go wrong. I wrote a reproducer below, trigger this special path, and every run of such program will cause global reservation count to increment by one, until it hits the number of free pages: #define _GNU_SOURCE /* See feature_test_macros(7) */ #include <stdio.h> #include <fcntl.h> #include <errno.h> #include <unistd.h> #include <stdlib.h> #include <sys/mman.h> #define MSIZE (2UL << 20) int main(int argc, char *argv[]) { const char *path; int *buf; int fd, ret; pid_t child; if (argc < 2) { printf("usage: %s <hugetlb_file>\n", argv[0]); return -1; } path = argv[1]; fd = open(path, O_RDWR | O_CREAT, 0666); if (fd < 0) { perror("open failed"); return -1; } ret = fallocate(fd, 0, 0, MSIZE); if (ret != 0) { perror("fallocate"); return -1; } buf = mmap(NULL, MSIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); if (buf == MAP_FAILED) { perror("mmap() failed"); return -1; } /* Allocate a page */ *buf = 1; child = fork(); if (child == 0) { /* child doesn't need to do anything */ exit(0); } /* Trigger CoW from owner */ *buf = 2; munmap(buf, MSIZE); close(fd); unlink(path); return 0; } It can only reproduce with a sub-mount when there're reserved pages on the spool, like: # sysctl vm.nr_hugepages=128 # mkdir ./hugetlb-pool # mount -t hugetlbfs -o min_size=8M,pagesize=2M none ./hugetlb-pool Then run the reproducer on the mountpoint: # ./reproducer ./hugetlb-pool/test Fix it by taking the reservation from spool if available. In general, avoid_reserve is IMHO more about "avoid vma resv map", not spool's. I copied stable, however I have no intention for backporting if it's not a clean cherry-pick, because private hugetlb mapping, and then fork() on top is too rare to hit. Cc: linux-stable <stable(a)vger.kernel.org> Fixes: d85f69b0b533 ("mm/hugetlb: alloc_huge_page handle areas hole punched by fallocate") Signed-off-by: Peter Xu <peterx(a)redhat.com> --- mm/hugetlb.c | 22 +++------------------- 1 file changed, 3 insertions(+), 19 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index cec4b121193f..9ce69fd22a01 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1394,8 +1394,7 @@ static unsigned long available_huge_pages(struct hstate *h) static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h, struct vm_area_struct *vma, - unsigned long address, int avoid_reserve, - long chg) + unsigned long address, long chg) { struct folio *folio = NULL; struct mempolicy *mpol; @@ -1411,10 +1410,6 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h, if (!vma_has_reserves(vma, chg) && !available_huge_pages(h)) goto err; - /* If reserves cannot be used, ensure enough pages are in the pool */ - if (avoid_reserve && !available_huge_pages(h)) - goto err; - gfp_mask = htlb_alloc_mask(h); nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask); @@ -1430,7 +1425,7 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h, folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, nid, nodemask); - if (folio && !avoid_reserve && vma_has_reserves(vma, chg)) { + if (folio && vma_has_reserves(vma, chg)) { folio_set_hugetlb_restore_reserve(folio); h->resv_huge_pages--; } @@ -3007,17 +3002,6 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, gbl_chg = hugepage_subpool_get_pages(spool, 1); if (gbl_chg < 0) goto out_end_reservation; - - /* - * Even though there was no reservation in the region/reserve - * map, there could be reservations associated with the - * subpool that can be used. This would be indicated if the - * return value of hugepage_subpool_get_pages() is zero. - * However, if avoid_reserve is specified we still avoid even - * the subpool reservations. - */ - if (avoid_reserve) - gbl_chg = 1; } /* If this allocation is not consuming a reservation, charge it now. @@ -3040,7 +3024,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, * from the global free pool (global change). gbl_chg == 0 indicates * a reservation exists for the allocation. */ - folio = dequeue_hugetlb_folio_vma(h, vma, addr, avoid_reserve, gbl_chg); + folio = dequeue_hugetlb_folio_vma(h, vma, addr, gbl_chg); if (!folio) { spin_unlock_irq(&hugetlb_lock); folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr); -- 2.47.0

11 months, 3 weeks

2
3
0 0

[PATCH 1/2] dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2)

by Milan Broz

This patch fixes an issue that was fixed in the commit df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size") but later broken again in the commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") If the Reed-Solomon roots setting spans multiple blocks, the code does not use proper parity bytes and randomly fails to repair even trivial errors. This bug cannot happen if the sector size is multiple of RS roots setting (Android case with roots 2). The previous solution was to find a dm-bufio block size that is multiple of the device sector size and roots size. Unfortunately, the optimization in commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") is incorrect and uses data block size for some roots (for example, it uses 4096 block size for roots = 20). This patch uses a different approach: - It always uses a configured data block size for dm-bufio to avoid possible misaligned IOs. - and it caches the processed parity bytes, so it can join it if it spans two blocks. As the RS calculation is called only if an error is detected and the process is computationally intensive, copying a few more bytes should not introduce performance issues. The issue was reported to cryptsetup with trivial reproducer https://gitlab.com/cryptsetup/cryptsetup/-/issues/923 Reproducer (with roots=20): # create verity device with RS FEC dd if=/dev/urandom of=data.img bs=4096 count=8 status=none veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=20 | \ awk '/^Root hash/{ print $3 }' >roothash # create an erasure that should always be repairable with this roots setting dd if=/dev/zero of=data.img conv=notrunc bs=1 count=4 seek=4 status=none # try to read it through dm-verity veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=20 $(cat roothash) dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer Even now the log says it cannot repair it: : verity-fec: 7:1: FEC 0: failed to correct: -74 : device-mapper: verity: 7:1: data block 0 is corrupted ... With this fix, errors are properly repaired. : verity-fec: 7:1: FEC 0: corrected 4 errors Signed-off-by: Milan Broz <gmazyland(a)gmail.com> Fixes: 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") Cc: stable(a)vger.kernel.org --- drivers/md/dm-verity-fec.c | 40 +++++++++++++++++++++++++------------- 1 file changed, 26 insertions(+), 14 deletions(-) diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c index 62b1a44b8dd2..6bd9848518d4 100644 --- a/drivers/md/dm-verity-fec.c +++ b/drivers/md/dm-verity-fec.c @@ -60,15 +60,19 @@ static int fec_decode_rs8(struct dm_verity *v, struct dm_verity_fec_io *fio, * to the data block. Caller is responsible for releasing buf. */ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index, - unsigned int *offset, struct dm_buffer **buf, - unsigned short ioprio) + unsigned int *offset, unsigned int par_buf_offset, + struct dm_buffer **buf, unsigned short ioprio) { u64 position, block, rem; u8 *res; + /* We have already part of parity bytes read, skip to the next block */ + if (par_buf_offset) + index++; + position = (index + rsb) * v->fec->roots; block = div64_u64_rem(position, v->fec->io_size, &rem); - *offset = (unsigned int)rem; + *offset = par_buf_offset ? 0 : (unsigned int)rem; res = dm_bufio_read_with_ioprio(v->fec->bufio, block, buf, ioprio); if (IS_ERR(res)) { @@ -128,11 +132,12 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_io *io, { int r, corrected = 0, res; struct dm_buffer *buf; - unsigned int n, i, offset; - u8 *par, *block; + unsigned int n, i, offset, par_buf_offset = 0; + u8 *par, *block, par_buf[DM_VERITY_FEC_RSM - DM_VERITY_FEC_MIN_RSN]; struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size); - par = fec_read_parity(v, rsb, block_offset, &offset, &buf, bio_prio(bio)); + par = fec_read_parity(v, rsb, block_offset, &offset, + par_buf_offset, &buf, bio_prio(bio)); if (IS_ERR(par)) return PTR_ERR(par); @@ -142,7 +147,8 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_io *io, */ fec_for_each_buffer_rs_block(fio, n, i) { block = fec_buffer_rs_block(v, fio, n, i); - res = fec_decode_rs8(v, fio, block, &par[offset], neras); + memcpy(&par_buf[par_buf_offset], &par[offset], v->fec->roots - par_buf_offset); + res = fec_decode_rs8(v, fio, block, par_buf, neras); if (res < 0) { r = res; goto error; @@ -155,12 +161,21 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_io *io, if (block_offset >= 1 << v->data_dev_block_bits) goto done; - /* read the next block when we run out of parity bytes */ - offset += v->fec->roots; + /* Read the next block when we run out of parity bytes */ + offset += (v->fec->roots - par_buf_offset); + /* Check if parity bytes are split between blocks */ + if (offset < v->fec->io_size && (offset + v->fec->roots) > v->fec->io_size) { + par_buf_offset = v->fec->io_size - offset; + memcpy(par_buf, &par[offset], par_buf_offset); + offset += par_buf_offset; + } else + par_buf_offset = 0; + if (offset >= v->fec->io_size) { dm_bufio_release(buf); - par = fec_read_parity(v, rsb, block_offset, &offset, &buf, bio_prio(bio)); + par = fec_read_parity(v, rsb, block_offset, &offset, + par_buf_offset, &buf, bio_prio(bio)); if (IS_ERR(par)) return PTR_ERR(par); } @@ -724,10 +739,7 @@ int verity_fec_ctr(struct dm_verity *v) return -E2BIG; } - if ((f->roots << SECTOR_SHIFT) & ((1 << v->data_dev_block_bits) - 1)) - f->io_size = 1 << v->data_dev_block_bits; - else - f->io_size = v->fec->roots << SECTOR_SHIFT; + f->io_size = 1 << v->data_dev_block_bits; f->bufio = dm_bufio_client_create(f->dev->bdev, f->io_size, -- 2.45.2

11 months, 3 weeks

2
1
0 0

FAILED: patch "[PATCH] x86/hyperv: Fix hv tsc page based sched_clock for hibernation" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x bcc80dec91ee745b3d66f3e48f0ec2efdea97149 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024122324-twitter-clustered-891d@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From bcc80dec91ee745b3d66f3e48f0ec2efdea97149 Mon Sep 17 00:00:00 2001 From: Naman Jain <namjain(a)linux.microsoft.com> Date: Tue, 17 Sep 2024 11:09:17 +0530 Subject: [PATCH] x86/hyperv: Fix hv tsc page based sched_clock for hibernation read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds). Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible. Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state(). Cc: stable(a)vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui <decui(a)microsoft.com> Signed-off-by: Dexuan Cui <decui(a)microsoft.com> Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux(a)outlook.com> Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu(a)kernel.org> Message-ID: <20240917053917.76787-1-namjain(a)linux.microsoft.com> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index d18078834ded..dc12fe5ef3ca 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -223,6 +223,63 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs) hyperv_cleanup(); } #endif /* CONFIG_CRASH_DUMP */ + +static u64 hv_ref_counter_at_suspend; +static void (*old_save_sched_clock_state)(void); +static void (*old_restore_sched_clock_state)(void); + +/* + * Hyper-V clock counter resets during hibernation. Save and restore clock + * offset during suspend/resume, while also considering the time passed + * before suspend. This is to make sure that sched_clock using hv tsc page + * based clocksource, proceeds from where it left off during suspend and + * it shows correct time for the timestamps of kernel messages after resume. + */ +static void save_hv_clock_tsc_state(void) +{ + hv_ref_counter_at_suspend = hv_read_reference_counter(); +} + +static void restore_hv_clock_tsc_state(void) +{ + /* + * Adjust the offsets used by hv tsc clocksource to + * account for the time spent before hibernation. + * adjusted value = reference counter (time) at suspend + * - reference counter (time) now. + */ + hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter()); +} + +/* + * Functions to override save_sched_clock_state and restore_sched_clock_state + * functions of x86_platform. The Hyper-V clock counter is reset during + * suspend-resume and the offset used to measure time needs to be + * corrected, post resume. + */ +static void hv_save_sched_clock_state(void) +{ + old_save_sched_clock_state(); + save_hv_clock_tsc_state(); +} + +static void hv_restore_sched_clock_state(void) +{ + restore_hv_clock_tsc_state(); + old_restore_sched_clock_state(); +} + +static void __init x86_setup_ops_for_tsc_pg_clock(void) +{ + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) + return; + + old_save_sched_clock_state = x86_platform.save_sched_clock_state; + x86_platform.save_sched_clock_state = hv_save_sched_clock_state; + + old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; + x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; +} #endif /* CONFIG_HYPERV */ static uint32_t __init ms_hyperv_platform(void) @@ -579,6 +636,7 @@ static void __init ms_hyperv_init_platform(void) /* Register Hyper-V specific clocksource */ hv_init_clocksource(); + x86_setup_ops_for_tsc_pg_clock(); hv_vtl_init_platform(); #endif /* diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c index 99177835cade..b39dee7b93af 100644 --- a/drivers/clocksource/hyperv_timer.c +++ b/drivers/clocksource/hyperv_timer.c @@ -27,7 +27,8 @@ #include <asm/mshyperv.h> static struct clock_event_device __percpu *hv_clock_event; -static u64 hv_sched_clock_offset __ro_after_init; +/* Note: offset can hold negative values after hibernation. */ +static u64 hv_sched_clock_offset __read_mostly; /* * If false, we're using the old mechanism for stimer0 interrupts @@ -470,6 +471,17 @@ static void resume_hv_clock_tsc(struct clocksource *arg) hv_set_msr(HV_MSR_REFERENCE_TSC, tsc_msr.as_uint64); } +/* + * Called during resume from hibernation, from overridden + * x86_platform.restore_sched_clock_state routine. This is to adjust offsets + * used to calculate time for hv tsc page based sched_clock, to account for + * time spent before hibernation. + */ +void hv_adj_sched_clock_offset(u64 offset) +{ + hv_sched_clock_offset -= offset; +} + #ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK static int hv_cs_enable(struct clocksource *cs) { diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h index 6cdc873ac907..aa5233b1eba9 100644 --- a/include/clocksource/hyperv_timer.h +++ b/include/clocksource/hyperv_timer.h @@ -38,6 +38,8 @@ extern void hv_remap_tsc_clocksource(void); extern unsigned long hv_get_tsc_pfn(void); extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void); +extern void hv_adj_sched_clock_offset(u64 offset); + static __always_inline bool hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg, u64 *cur_tsc, u64 *time)

11 months, 3 weeks

3
2
0 0

FAILED: patch "[PATCH] x86/hyperv: Fix hv tsc page based sched_clock for hibernation" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x bcc80dec91ee745b3d66f3e48f0ec2efdea97149 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024122327-imaginary-gizzard-8e8b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From bcc80dec91ee745b3d66f3e48f0ec2efdea97149 Mon Sep 17 00:00:00 2001 From: Naman Jain <namjain(a)linux.microsoft.com> Date: Tue, 17 Sep 2024 11:09:17 +0530 Subject: [PATCH] x86/hyperv: Fix hv tsc page based sched_clock for hibernation read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds). Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible. Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state(). Cc: stable(a)vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui <decui(a)microsoft.com> Signed-off-by: Dexuan Cui <decui(a)microsoft.com> Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux(a)outlook.com> Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu(a)kernel.org> Message-ID: <20240917053917.76787-1-namjain(a)linux.microsoft.com> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index d18078834ded..dc12fe5ef3ca 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -223,6 +223,63 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs) hyperv_cleanup(); } #endif /* CONFIG_CRASH_DUMP */ + +static u64 hv_ref_counter_at_suspend; +static void (*old_save_sched_clock_state)(void); +static void (*old_restore_sched_clock_state)(void); + +/* + * Hyper-V clock counter resets during hibernation. Save and restore clock + * offset during suspend/resume, while also considering the time passed + * before suspend. This is to make sure that sched_clock using hv tsc page + * based clocksource, proceeds from where it left off during suspend and + * it shows correct time for the timestamps of kernel messages after resume. + */ +static void save_hv_clock_tsc_state(void) +{ + hv_ref_counter_at_suspend = hv_read_reference_counter(); +} + +static void restore_hv_clock_tsc_state(void) +{ + /* + * Adjust the offsets used by hv tsc clocksource to + * account for the time spent before hibernation. + * adjusted value = reference counter (time) at suspend + * - reference counter (time) now. + */ + hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter()); +} + +/* + * Functions to override save_sched_clock_state and restore_sched_clock_state + * functions of x86_platform. The Hyper-V clock counter is reset during + * suspend-resume and the offset used to measure time needs to be + * corrected, post resume. + */ +static void hv_save_sched_clock_state(void) +{ + old_save_sched_clock_state(); + save_hv_clock_tsc_state(); +} + +static void hv_restore_sched_clock_state(void) +{ + restore_hv_clock_tsc_state(); + old_restore_sched_clock_state(); +} + +static void __init x86_setup_ops_for_tsc_pg_clock(void) +{ + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) + return; + + old_save_sched_clock_state = x86_platform.save_sched_clock_state; + x86_platform.save_sched_clock_state = hv_save_sched_clock_state; + + old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; + x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; +} #endif /* CONFIG_HYPERV */ static uint32_t __init ms_hyperv_platform(void) @@ -579,6 +636,7 @@ static void __init ms_hyperv_init_platform(void) /* Register Hyper-V specific clocksource */ hv_init_clocksource(); + x86_setup_ops_for_tsc_pg_clock(); hv_vtl_init_platform(); #endif /* diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c index 99177835cade..b39dee7b93af 100644 --- a/drivers/clocksource/hyperv_timer.c +++ b/drivers/clocksource/hyperv_timer.c @@ -27,7 +27,8 @@ #include <asm/mshyperv.h> static struct clock_event_device __percpu *hv_clock_event; -static u64 hv_sched_clock_offset __ro_after_init; +/* Note: offset can hold negative values after hibernation. */ +static u64 hv_sched_clock_offset __read_mostly; /* * If false, we're using the old mechanism for stimer0 interrupts @@ -470,6 +471,17 @@ static void resume_hv_clock_tsc(struct clocksource *arg) hv_set_msr(HV_MSR_REFERENCE_TSC, tsc_msr.as_uint64); } +/* + * Called during resume from hibernation, from overridden + * x86_platform.restore_sched_clock_state routine. This is to adjust offsets + * used to calculate time for hv tsc page based sched_clock, to account for + * time spent before hibernation. + */ +void hv_adj_sched_clock_offset(u64 offset) +{ + hv_sched_clock_offset -= offset; +} + #ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK static int hv_cs_enable(struct clocksource *cs) { diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h index 6cdc873ac907..aa5233b1eba9 100644 --- a/include/clocksource/hyperv_timer.h +++ b/include/clocksource/hyperv_timer.h @@ -38,6 +38,8 @@ extern void hv_remap_tsc_clocksource(void); extern unsigned long hv_get_tsc_pfn(void); extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void); +extern void hv_adj_sched_clock_offset(u64 offset); + static __always_inline bool hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg, u64 *cur_tsc, u64 *time)

11 months, 3 weeks

3
2
0 0

FAILED: patch "[PATCH] x86/hyperv: Fix hv tsc page based sched_clock for hibernation" failed to apply to 6.1-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y git checkout FETCH_HEAD git cherry-pick -x bcc80dec91ee745b3d66f3e48f0ec2efdea97149 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024122325-flavoring-mute-cb7e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From bcc80dec91ee745b3d66f3e48f0ec2efdea97149 Mon Sep 17 00:00:00 2001 From: Naman Jain <namjain(a)linux.microsoft.com> Date: Tue, 17 Sep 2024 11:09:17 +0530 Subject: [PATCH] x86/hyperv: Fix hv tsc page based sched_clock for hibernation read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds). Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible. Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state(). Cc: stable(a)vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui <decui(a)microsoft.com> Signed-off-by: Dexuan Cui <decui(a)microsoft.com> Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux(a)outlook.com> Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu(a)kernel.org> Message-ID: <20240917053917.76787-1-namjain(a)linux.microsoft.com> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index d18078834ded..dc12fe5ef3ca 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -223,6 +223,63 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs) hyperv_cleanup(); } #endif /* CONFIG_CRASH_DUMP */ + +static u64 hv_ref_counter_at_suspend; +static void (*old_save_sched_clock_state)(void); +static void (*old_restore_sched_clock_state)(void); + +/* + * Hyper-V clock counter resets during hibernation. Save and restore clock + * offset during suspend/resume, while also considering the time passed + * before suspend. This is to make sure that sched_clock using hv tsc page + * based clocksource, proceeds from where it left off during suspend and + * it shows correct time for the timestamps of kernel messages after resume. + */ +static void save_hv_clock_tsc_state(void) +{ + hv_ref_counter_at_suspend = hv_read_reference_counter(); +} + +static void restore_hv_clock_tsc_state(void) +{ + /* + * Adjust the offsets used by hv tsc clocksource to + * account for the time spent before hibernation. + * adjusted value = reference counter (time) at suspend + * - reference counter (time) now. + */ + hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter()); +} + +/* + * Functions to override save_sched_clock_state and restore_sched_clock_state + * functions of x86_platform. The Hyper-V clock counter is reset during + * suspend-resume and the offset used to measure time needs to be + * corrected, post resume. + */ +static void hv_save_sched_clock_state(void) +{ + old_save_sched_clock_state(); + save_hv_clock_tsc_state(); +} + +static void hv_restore_sched_clock_state(void) +{ + restore_hv_clock_tsc_state(); + old_restore_sched_clock_state(); +} + +static void __init x86_setup_ops_for_tsc_pg_clock(void) +{ + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) + return; + + old_save_sched_clock_state = x86_platform.save_sched_clock_state; + x86_platform.save_sched_clock_state = hv_save_sched_clock_state; + + old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; + x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; +} #endif /* CONFIG_HYPERV */ static uint32_t __init ms_hyperv_platform(void) @@ -579,6 +636,7 @@ static void __init ms_hyperv_init_platform(void) /* Register Hyper-V specific clocksource */ hv_init_clocksource(); + x86_setup_ops_for_tsc_pg_clock(); hv_vtl_init_platform(); #endif /* diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c index 99177835cade..b39dee7b93af 100644 --- a/drivers/clocksource/hyperv_timer.c +++ b/drivers/clocksource/hyperv_timer.c @@ -27,7 +27,8 @@ #include <asm/mshyperv.h> static struct clock_event_device __percpu *hv_clock_event; -static u64 hv_sched_clock_offset __ro_after_init; +/* Note: offset can hold negative values after hibernation. */ +static u64 hv_sched_clock_offset __read_mostly; /* * If false, we're using the old mechanism for stimer0 interrupts @@ -470,6 +471,17 @@ static void resume_hv_clock_tsc(struct clocksource *arg) hv_set_msr(HV_MSR_REFERENCE_TSC, tsc_msr.as_uint64); } +/* + * Called during resume from hibernation, from overridden + * x86_platform.restore_sched_clock_state routine. This is to adjust offsets + * used to calculate time for hv tsc page based sched_clock, to account for + * time spent before hibernation. + */ +void hv_adj_sched_clock_offset(u64 offset) +{ + hv_sched_clock_offset -= offset; +} + #ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK static int hv_cs_enable(struct clocksource *cs) { diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h index 6cdc873ac907..aa5233b1eba9 100644 --- a/include/clocksource/hyperv_timer.h +++ b/include/clocksource/hyperv_timer.h @@ -38,6 +38,8 @@ extern void hv_remap_tsc_clocksource(void); extern unsigned long hv_get_tsc_pfn(void); extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void); +extern void hv_adj_sched_clock_offset(u64 offset); + static __always_inline bool hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg, u64 *cur_tsc, u64 *time)

11 months, 3 weeks

3
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror December 2024