On 01/16/2018 01:59 PM, Thomas Gleixner wrote:
On Tue, 16 Jan 2018, Yu, Fenghua wrote:
From: Thomas Gleixner [mailto:tglx@linutronix.de]
Is this a Haswell specific issue?
I run the following test forever without issue on Broadwell and 4.15.0-rc6 with rdt mounted:

for ((;;)) do
    for ((i=1;i<88;i++)) do
        echo 0 >/sys/devices/system/cpu/cpu$i/online
    done
    echo "online cpus:"
    grep processor /proc/cpuinfo |wc
    for ((i=1;i<88;i++)) do
        echo 1 >/sys/devices/system/cpu/cpu$i/online
    done
    echo "online cpus:"
    grep processor /proc/cpuinfo|wc
done

I'm looking for a Haswell machine to reproduce the issue.
Come on. This is crystal clear from the KASAN trace. And the fix is simple enough.
You simply do not run into it because on your machine
is_llc_occupancy_enabled() is false...
Thanks,
tglx

8<--------------------
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 88dcf8479013..99442370de40 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -525,10 +525,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		 */
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
-		kfree(d->ctrl_val);
-		kfree(d->rmid_busy_llc);
-		kfree(d->mbm_total);
-		kfree(d->mbm_local);
 		list_del(&d->list);
 		if (is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
@@ -545,6 +541,10 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}
 
+		kfree(d->ctrl_val);
+		kfree(d->rmid_busy_llc);
+		kfree(d->mbm_total);
+		kfree(d->mbm_local);
 		kfree(d);
 		return;
 	}

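
For illustration only, the ordering the patch establishes can be modelled in plain userspace C: every teardown step that may still read the per-domain buffers runs first, and the buffers are freed last, right before the domain itself. This is a rough sketch, not kernel code. struct domain, busy_map, limbo_pending, domain_alloc(), domain_has_busy() and domain_teardown() are hypothetical stand-ins, and the boolean flag only imitates a queued delayed work item.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the domain structure; not the kernel's layout. */
struct domain {
	unsigned long *busy_map;	/* plays the role of d->rmid_busy_llc */
	bool limbo_pending;		/* imitates a queued delayed work item */
};

/* Allocate a domain with one busy bit set and limbo work pending. */
static struct domain *domain_alloc(void)
{
	struct domain *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->busy_map = calloc(1, sizeof(*d->busy_map));
	if (!d->busy_map) {
		free(d);
		return NULL;
	}
	d->busy_map[0] = 1UL;
	d->limbo_pending = true;
	return d;
}

/* A late teardown step that still reads the buffer. */
static bool domain_has_busy(const struct domain *d)
{
	return d->busy_map[0] != 0;
}

/*
 * Teardown in the patched order: finish every step that reads the
 * buffers, then free them, then free the domain itself.
 * Returns true if pending limbo work had to be cancelled.
 */
static bool domain_teardown(struct domain *d)
{
	bool cancelled = false;

	if (d->limbo_pending && domain_has_busy(d)) {
		d->limbo_pending = false;	/* imitates cancel_delayed_work() */
		cancelled = true;
	}
	free(d->busy_map);	/* safe: nothing reads it after this point */
	free(d);
	return cancelled;
}
```

Moving free(d->busy_map) above the domain_has_busy() check in this model would reproduce the kind of read-after-free that KASAN reports.
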
Thanks, Thomas. I'll build some test kernels and have your patch tested out.
Thanks,
Joe