When creating a new monitoring group, the RMID allocated for it may have
been used by a group which was previously removed. In this case, the
hardware counters will have non-zero values which should be deducted
from what is reported in the new group's counts.
resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
0, causing the initial count to be charged to the new group. Resurrect
__rmid_read() and use it to initialize prev_msr correctly.
Unlike before, __rmid_read() checks for error bits in the MSR read so
that callers don't need to.
Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
Signed-off-by: Peter Newman <peternewman(a)google.com>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org
---
v3:
- add changelog
- CC stable
v2:
- move error bit processing into __rmid_read()
v1: https://lore.kernel.org/lkml/20221207112924.3602960-1-peternewman@google.co…
v2: https://lore.kernel.org/lkml/20221214160856.2164207-1-peternewman@google.co…
---
arch/x86/kernel/cpu/resctrl/monitor.c | 49 ++++++++++++++++++---------
1 file changed, 33 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index efe0c30d3a12..77538abeb72a 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -146,6 +146,30 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid)
return entry;
}
+static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
+{
+ u64 msr_val;
+
+ /*
+ * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
+ * with a valid event code for supported resource type and the bits
+ * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
+ * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
+ * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
+ * are error bits.
+ */
+ wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
+ rdmsrl(MSR_IA32_QM_CTR, msr_val);
+
+ if (msr_val & RMID_VAL_ERROR)
+ return -EIO;
+ if (msr_val & RMID_VAL_UNAVAIL)
+ return -EINVAL;
+
+ *val = msr_val;
+ return 0;
+}
+
static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom,
u32 rmid,
enum resctrl_event_id eventid)
@@ -172,8 +196,12 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
struct arch_mbm_state *am;
am = get_arch_mbm_state(hw_dom, rmid, eventid);
- if (am)
+ if (am) {
memset(am, 0, sizeof(*am));
+
+ /* Record any initial, non-zero count value. */
+ __rmid_read(rmid, eventid, &am->prev_msr);
+ }
}
static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
@@ -191,25 +219,14 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct arch_mbm_state *am;
u64 msr_val, chunks;
+ int ret;
if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
return -EINVAL;
- /*
- * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
- * with a valid event code for supported resource type and the bits
- * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
- * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
- * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
- * are error bits.
- */
- wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
- rdmsrl(MSR_IA32_QM_CTR, msr_val);
-
- if (msr_val & RMID_VAL_ERROR)
- return -EIO;
- if (msr_val & RMID_VAL_UNAVAIL)
- return -EINVAL;
+ ret = __rmid_read(rmid, eventid, &msr_val);
+ if (ret)
+ return ret;
am = get_arch_mbm_state(hw_dom, rmid, eventid);
if (am) {
base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
--
2.39.0.314.g84b9a713c41-goog
From: Quentin Schulz <quentin.schulz(a)theobroma-systems.com>
clk_cifout is derived from clk_cifout_src through an integer divider
limited to 32. clk_cifout_src is a child of either cpll, gpll or npll
without any possibility of a divider of any sort. The default clock
parent is cpll.
Let's allow clk_cifout to ask its parent clk_cifout_src to reparent in
order to find the real closest possible rate for clk_cifout and not one
derived from cpll only.
Cc: stable(a)vger.kernel.org # 4.10+
Fixes: fd8bc829336a ("clk: rockchip: fix the rk3399 cifout clock")
Signed-off-by: Quentin Schulz <quentin.schulz(a)theobroma-systems.com>
---
clk: rockchip: rk3399: allow clk_cifout to force clk_cifout_src to reparent
This used to be correct before v4.10 but commit fd8bc829336a ("clk: rockchip:
fix the rk3399 cifout clock") incorrectly removed this ability while reworking
it.
Note: this has been tested on top of v6.0.2 only but no changes were made to
this driver since. As for older stable releases, the git context seems identical
and there does not seem to have been any logical change introduced since v4.10
so it should be pretty safe to apply.
To: Michael Turquette <mturquette(a)baylibre.com>
To: Stephen Boyd <sboyd(a)kernel.org>
To: Heiko Stuebner <heiko(a)sntech.de>
To: Xing Zheng <zhengxing(a)rock-chips.com>
Cc: linux-clk(a)vger.kernel.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-rockchip(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
---
drivers/clk/rockchip/clk-rk3399.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/rockchip/clk-rk3399.c b/drivers/clk/rockchip/clk-rk3399.c
index 306910a3a0d38..9ebd6c451b3db 100644
--- a/drivers/clk/rockchip/clk-rk3399.c
+++ b/drivers/clk/rockchip/clk-rk3399.c
@@ -1263,7 +1263,7 @@ static struct rockchip_clk_branch rk3399_clk_branches[] __initdata = {
RK3399_CLKSEL_CON(56), 6, 2, MFLAGS,
RK3399_CLKGATE_CON(10), 7, GFLAGS),
- COMPOSITE_NOGATE(SCLK_CIF_OUT, "clk_cifout", mux_clk_cif_p, 0,
+ COMPOSITE_NOGATE(SCLK_CIF_OUT, "clk_cifout", mux_clk_cif_p, CLK_SET_RATE_PARENT,
RK3399_CLKSEL_CON(56), 5, 1, MFLAGS, 0, 5, DFLAGS),
/* gic */
---
base-commit: cc675d22e422442f6d230654a55a5fc5682ea018
change-id: 20221117-rk3399-cifout-set-rate-parent-1fbf0173ef2d
Best regards,
--
Quentin Schulz <quentin.schulz(a)theobroma-systems.com>
Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
restricted on asymmetric systems"), the setting and clearing of
user_cpus_ptr are done under pi_lock for arm64 architecture. However,
dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
protection. Since sched_setaffinity() can be invoked from another
process, the process being modified may be undergoing fork() at
the same time. When racing with the clearing of user_cpus_ptr in
__set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
possibly double-free in arm64 kernel.
Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask") fixes this problem as user_cpus_ptr, once set, will never
be cleared in a task's lifetime. However, this bug was re-introduced
in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
do_set_cpus_allowed(). This time, it will affect all arches.
Fix this bug by always clearing the user_cpus_ptr of the newly
cloned/forked task before the copying process starts and check the
user_cpus_ptr state of the source task under pi_lock.
Note to stable, this patch won't be applicable to stable releases.
Just copy the new dup_user_cpus_ptr() function over.
Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable(a)vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3(a)xiaomi.com>
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/sched/core.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b6ee5f..b93d030b9fd5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2612,19 +2612,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
int node)
{
+ cpumask_t *user_mask;
unsigned long flags;
- if (!src->user_cpus_ptr)
+ /*
+ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+ * may differ by now due to racing.
+ */
+ dst->user_cpus_ptr = NULL;
+
+ /*
+ * This check is racy and losing the race is a valid situation.
+ * It is not worth the extra overhead of taking the pi_lock on
+ * every fork/clone.
+ */
+ if (data_race(!src->user_cpus_ptr))
return 0;
- dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
- if (!dst->user_cpus_ptr)
+ user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+ if (!user_mask)
return -ENOMEM;
- /* Use pi_lock to protect content of user_cpus_ptr */
+ /*
+ * Use pi_lock to protect content of user_cpus_ptr
+ *
+ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+ * do_set_cpus_allowed().
+ */
raw_spin_lock_irqsave(&src->pi_lock, flags);
- cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ if (src->user_cpus_ptr) {
+ swap(dst->user_cpus_ptr, user_mask);
+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ }
raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+ if (unlikely(user_mask))
+ kfree(user_mask);
+
return 0;
}
--
2.31.1
From: Takashi Iwai <tiwai(a)suse.de>
commit 8423f0b6d513b259fdab9c9bf4aaa6188d054c2d upstream.
There is a small race window at snd_pcm_oss_sync() that is called from
OSS PCM SNDCTL_DSP_SYNC ioctl; namely the function calls
snd_pcm_oss_make_ready() at first, then takes the params_lock mutex
for the rest. When the stream is set up again by another thread
between them, it leads to inconsistency, and may result in unexpected
results such as NULL dereference of OSS buffer as a fuzzer spotted
recently.
The fix is simply to cover snd_pcm_oss_make_ready() call into the same
params_lock mutex with snd_pcm_oss_make_ready_locked() variant.
Reported-and-tested-by: butt3rflyh4ck <butterflyhuangxx(a)gmail.com>
Reviewed-by: Jaroslav Kysela <perex(a)perex.cz>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/CAFcO6XN7JDM4xSXGhtusQfS2mSBcx50VJKwQpCq=WeLt57aa…
Link: https://lore.kernel.org/r/20220905060714.22549-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Zubin Mithra <zsm(a)google.com>
---
Note:
* 8423f0b6d513 is present in linux-5.15.y and linux-5.4.y; missing in
linux-5.10.y.
* Backport addresses conflict due to surrounding context.
* Tests run: build and boot.
sound/core/oss/pcm_oss.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index f88de74da1eb..de6f94bee50b 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -1662,13 +1662,14 @@ static int snd_pcm_oss_sync(struct snd_pcm_oss_file *pcm_oss_file)
runtime = substream->runtime;
if (atomic_read(&substream->mmap_count))
goto __direct;
- if ((err = snd_pcm_oss_make_ready(substream)) < 0)
- return err;
atomic_inc(&runtime->oss.rw_ref);
if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
atomic_dec(&runtime->oss.rw_ref);
return -ERESTARTSYS;
}
+ err = snd_pcm_oss_make_ready_locked(substream);
+ if (err < 0)
+ goto unlock;
format = snd_pcm_oss_format_from(runtime->oss.format);
width = snd_pcm_format_physical_width(format);
if (runtime->oss.buffer_used > 0) {
--
2.38.0.rc1.362.ged0d419d3c-goog
Hi community.
Previously our team reported a race condition in IMA relates to LSM
based rules which would case IMA to match files that should be filtered
out under normal condition. The issue was originally analyzed and fixed
on mainstream. The patch and the discussion could be found here:
https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
After that, we did a regression test on 4.19 LTS and the same issue
arises. Further analysis reveled that the issue is from a completely
different cause.
The cause is that selinux_audit_rule_init() would set the rule (which is
a second level pointer) to NULL immediately after called. The relevant
codes are as shown:
security/selinux/ss/services.c:
> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> {
> struct selinux_state *state = &selinux_state;
> struct policydb *policydb = &state->ss->policydb;
> struct selinux_audit_rule *tmprule;
> struct role_datum *roledatum;
> struct type_datum *typedatum;
> struct user_datum *userdatum;
> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> int rc = 0;
>
> *rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
>
> if (!state->initialized)
> return -EOPNOTSUPP;
...
> out:
> read_unlock(&state->ss->policy_rwlock);
>
> if (rc) {
> selinux_audit_rule_free(tmprule);
> tmprule = NULL;
> }
>
> *rule = tmprule;
rule is updated at the end of the function.
>
> return rc;
> }
security/integrity/ima/ima_policy.c:
> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> const struct cred *cred, u32 secid,
> enum ima_hooks func, int mask)
> {...
> for (i = 0; i < MAX_LSM_RULES; i++) {
> int rc = 0;
> u32 osid;
> int retried = 0;
>
> if (!rule->lsm[i].rule)
> continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
> retry:
> switch (i) {
To solve this issue, there are multiple approaches we might take and I
would like some input from the community.
The first proposed solution would be to change
selinux_audit_rule_init(). Remove the set to NULL bit and update the
rule pointer with cmpxchg.
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index a9f2bc8443bd..aa74b04ccaf7 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> struct type_datum *typedatum;
> struct user_datum *userdatum;
> struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> + struct selinux_audit_rule *orig = rule;
> int rc = 0;
>
> - *rule = NULL;
> -
> if (!state->initialized)
> return -EOPNOTSUPP;
>
> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> tmprule = NULL;
> }
>
> - *rule = tmprule;
> + if (cmpxchg(rule, orig, tmprule) != orig)
> + selinux_audit_rule_free(tmprule);
>
> return rc;
> }
This solution would be an easy fix, but might influence other modules
calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS,
only auditfilter and IMA it seems). And it might be worth returning an
error code such as -EAGAIN.
Or, we can access rules via RCU, similar to what we do on 5.10. This
could means more code change and testing.
Reported-by: Huaxin Lu <luhuaxin1(a)huawei.com>
--
Best
GUO Zihua