- Linux-stable-mirror - lists.linaro.org

[PATCH 2/3] usb: typec: ucsi: acpi: Workaround for cache mode issue

by Heikki Krogerus

This fixes an issue where the driver fails with an error:

	ioremap error for 0x3f799000-0x3f79a000, requested 0x2, got 0x0

On some platforms the UCSI ACPI mailbox SystemMemory Operation Region
may be setup before the driver has been loaded. That will lead into the
driver failing to map the mailbox region, as it has been already marked
as write-back memory. acpi_os_ioremap() for x86 uses ioremap_cache()
unconditionally.

When the issue happens, the embedded controller has a pending query
event for the UCSI notification right after boot-up which causes the
operation region to be setup before UCSI driver has been loaded.

The fix is to notify acpi core that the driver is about to access memory
region which potentially overlaps with an operation region right before
mapping it. acpi_release_memory() will check if the memory has already
been setup (mapped) by acpi core, and deactivate it (unmap) if it has.
The driver is then able to map the memory with ioremap_nocache() and set
the memtype to uncached for the region.

Reported-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Fixes: 8243edf44152 ("usb: typec: ucsi: Add ACPI driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
---
 drivers/usb/typec/ucsi/ucsi_acpi.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c
index 44eb4e1ea817..a18112a83fae 100644
--- a/drivers/usb/typec/ucsi/ucsi_acpi.c
+++ b/drivers/usb/typec/ucsi/ucsi_acpi.c
@@ -79,6 +79,11 @@ static int ucsi_acpi_probe(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
+	/* This will make sure we can use ioremap_nocache() */
+	status = acpi_release_memory(ACPI_HANDLE(&pdev->dev), res, 1);
+	if (ACPI_FAILURE(status))
+		return -ENOMEM;
+
 	/*
 	 * NOTE: The memory region for the data structures is used also in an
 	 * operation region, which means ACPI has already reserved it. Therefore
-- 
2.17.1

7 years


[PATCH 1/3] acpi: Add helper for deactivating memory region

by Heikki Krogerus

Sometimes memory resource may be overlapping with SystemMemory Operation
Region by design, for example if the memory region is used as a mailbox
for communication with a firmware in the system. One occasion of such
mailboxes is USB Type-C Connector System Software Interface (UCSI).

With regions like that, it is important that the driver is able to map
the memory with the requirements it has. For example, the driver should
be allowed to map the memory as non-cached memory. However, if the
operation region has been accessed before the driver has mapped the
memory, the memory has been marked as write-back by the time the driver
is loaded. That means the driver will fail to map the memory if it
expects non-cached memory.

To work around the problem, introducing helper that the drivers can use
to temporarily deactivate (unmap) SystemMemory Operation Regions that
overlap with their IO memory.

Fixes: 8243edf44152 ("usb: typec: ucsi: Add ACPI driver")
Cc: stable(a)vger.kernel.org
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
---
 drivers/acpi/osl.c   | 72 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/acpi.h |  3 ++
 2 files changed, 75 insertions(+)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 7ca41bf023c9..8df9abfa947b 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -45,6 +45,8 @@
 #include <linux/uaccess.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
 
+#include "acpica/accommon.h"
+#include "acpica/acnamesp.h"
 #include "internal.h"
 
 #define _COMPONENT ACPI_OS_SERVICES
@@ -1490,6 +1492,76 @@ int acpi_check_region(resource_size_t start, resource_size_t n,
 }
 EXPORT_SYMBOL(acpi_check_region);
 
+static acpi_status acpi_deactivate_mem_region(acpi_handle handle, u32 level,
+					      void *_res, void **return_value)
+{
+	struct acpi_mem_space_context **mem_ctx;
+	union acpi_operand_object *handler_obj;
+	union acpi_operand_object *region_obj2;
+	union acpi_operand_object *region_obj;
+	struct resource *res = _res;
+	acpi_status status;
+
+	region_obj = acpi_ns_get_attached_object(handle);
+	if (!region_obj)
+		return AE_OK;
+
+	handler_obj = region_obj->region.handler;
+	if (!handler_obj)
+		return AE_OK;
+
+	if (region_obj->region.space_id != ACPI_ADR_SPACE_SYSTEM_MEMORY)
+		return AE_OK;
+
+	if (!(region_obj->region.flags & AOPOBJ_SETUP_COMPLETE))
+		return AE_OK;
+
+	region_obj2 = acpi_ns_get_secondary_object(region_obj);
+	if (!region_obj2)
+		return AE_OK;
+
+	mem_ctx = (void *)&region_obj2->extra.region_context;
+
+	if (!(mem_ctx[0]->address >= res->start &&
+	      mem_ctx[0]->address < res->end))
+		return AE_OK;
+
+	status = handler_obj->address_space.setup(region_obj,
+						  ACPI_REGION_DEACTIVATE,
+						  NULL, (void **)mem_ctx);
+	if (ACPI_SUCCESS(status))
+		region_obj->region.flags &= ~(AOPOBJ_SETUP_COMPLETE);
+
+	return status;
+}
+
+/**
+ * acpi_release_memory - Release any mappings done to a memory region
+ * @handle: Handle to namespace node
+ * @res: Memory resource
+ * @level: A level that terminates the search
+ *
+ * Walks through @handle and unmaps all SystemMemory Operation Regions that
+ * overlap with @res and that have already been activated (mapped).
+ *
+ * This is a helper that allows drivers to place special requirements on memory
+ * region that may overlap with operation regions, primarily allowing them to
+ * safely map the region as non-cached memory.
+ *
+ * The unmapped Operation Regions will be automatically remapped next time they
+ * are called, so the drivers do not need to do anything else.
+ */
+acpi_status acpi_release_memory(acpi_handle handle, struct resource *res,
+				u32 level)
+{
+	if (!(res->flags & IORESOURCE_MEM))
+		return AE_TYPE;
+
+	return acpi_walk_namespace(ACPI_TYPE_REGION, handle, level,
+				   acpi_deactivate_mem_region, NULL, res, NULL);
+}
+EXPORT_SYMBOL_GPL(acpi_release_memory);
+
 /*
  * Let drivers know whether the resource checks are effective
  */
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 4b35a66383f9..e54f40974eb0 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -443,6 +443,9 @@ int acpi_check_resource_conflict(const struct resource *res);
 int acpi_check_region(resource_size_t start, resource_size_t n,
 		      const char *name);
 
+acpi_status acpi_release_memory(acpi_handle handle, struct resource *res,
+				u32 level);
+
 int acpi_resources_are_enforced(void);
 
 #ifdef CONFIG_HIBERNATION
-- 
2.17.1
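The filtering that acpi_deactivate_mem_region() applies during the namespace walk can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct op_region`, `struct mem_range` and `release_memory()` are hypothetical stand-ins invented for illustration, not ACPICA or kernel types, and the real helper calls the region's setup handler with ACPI_REGION_DEACTIVATE rather than flipping a flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ACPI address-space IDs. */
enum space_id { SPACE_SYSTEM_MEMORY, SPACE_SYSTEM_IO };

/* Hypothetical model of an operation region known to the ACPI core. */
struct op_region {
	enum space_id space;	/* which address space the region lives in */
	uint64_t address;	/* base physical address of the region */
	bool mapped;		/* models AOPOBJ_SETUP_COMPLETE */
};

/* Hypothetical model of the driver's memory resource. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Visit every region and deactivate those that are SystemMemory, already
 * mapped, and whose base address falls inside the resource. The address
 * test is the same comparison the patch uses. Returns how many regions
 * were deactivated.
 */
size_t release_memory(struct op_region *regions, size_t n,
		      const struct mem_range *res)
{
	size_t released = 0;

	for (size_t i = 0; i < n; i++) {
		struct op_region *r = &regions[i];

		if (r->space != SPACE_SYSTEM_MEMORY)
			continue;
		if (!r->mapped)
			continue;
		if (!(r->address >= res->start && r->address < res->end))
			continue;

		r->mapped = false;	/* remapped lazily on next access */
		released++;
	}
	return released;
}
```

Note that, as in the patch, only the base address of each region is compared against the resource; the model makes no attempt to handle a region that starts below the resource and extends into it.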

7 years


[PATCH RESEND] softirq: reorder trace_softirqs_on to prevent lockdep splat

by Joel Fernandes

From: "Joel Fernandes (Google)" <joel(a)joelfernandes.org>

I'm able to reproduce a lockdep splat with config options:
CONFIG_PROVE_LOCKING=y,
CONFIG_DEBUG_LOCK_ALLOC=y and
CONFIG_PREEMPTIRQ_EVENTS=y

$ echo 1 > /d/tracing/events/preemptirq/preempt_enable/enable

[   26.112609] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[   26.112636] WARNING: CPU: 0 PID: 118 at kernel/locking/lockdep.c:3854
[...]
[   26.144229] Call Trace:
[   26.144926]  <IRQ>
[   26.145506]  lock_acquire+0x55/0x1b0
[   26.146499]  ? __do_softirq+0x46f/0x4d9
[   26.147571]  ? __do_softirq+0x46f/0x4d9
[   26.148646]  trace_preempt_on+0x8f/0x240
[   26.149744]  ? trace_preempt_on+0x4d/0x240
[   26.150862]  ? __do_softirq+0x46f/0x4d9
[   26.151930]  preempt_count_sub+0x18a/0x1a0
[   26.152985]  __do_softirq+0x46f/0x4d9
[   26.153937]  irq_exit+0x68/0xe0
[   26.154755]  smp_apic_timer_interrupt+0x271/0x280
[   26.156056]  apic_timer_interrupt+0xf/0x20
[   26.157105]  </IRQ>

The issue was this:

preempt_count = 1 << SOFTIRQ_SHIFT

	__local_bh_enable(cnt = 1 << SOFTIRQ_SHIFT) {
		if (softirq_count() == (cnt & SOFTIRQ_MASK)) {
			trace_softirqs_on() {
				current->softirqs_enabled = 1;
			}
		}
		preempt_count_sub(cnt) {
			trace_preempt_on() {
				tracepoint() {
					rcu_read_lock_sched() {
						// jumps into lockdep

Where preempt_count still has softirqs disabled, but
current->softirqs_enabled is true, and we get a splat.

Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Tom Zanussi <tom.zanussi(a)linux.intel.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: Paul McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Todd Kjos <tkjos(a)google.com>
Cc: Erick Reyes <erickreyes(a)google.com>
Cc: Julia Cartwright <julia(a)ni.com>
Cc: Byungchul Park <byungchul.park(a)lge.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Fixes: d59158162e032 ("tracing: Add support for preempt and irq enable/disable events")
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
 kernel/softirq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 177de3640c78..8a040bcaa033 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -139,9 +139,13 @@ static void __local_bh_enable(unsigned int cnt)
 {
 	lockdep_assert_irqs_disabled();
 
+	if (preempt_count() == cnt)
+		trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
+
 	if (softirq_count() == (cnt & SOFTIRQ_MASK))
 		trace_softirqs_on(_RET_IP_);
-	preempt_count_sub(cnt);
+
+	__preempt_count_sub(cnt);
 }
 
 /*
-- 
2.17.1.1185.g55be947832-goog
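The ordering problem described in this patch can be reproduced in a small userspace model. The sketch below is an illustration only: `struct task_model`, `tracepoint_hook()`, `bh_enable_old()` and `bh_enable_new()` are hypothetical names, the bit layout of the counter (softirq bits at shift 8) is an assumption of the model, and the hook merely records the inconsistency that lockdep would warn about.

```c
#include <stdbool.h>

/* Assumed layout for this model: softirq count lives in bits 8..15. */
#define SOFTIRQ_SHIFT	8
#define SOFTIRQ_OFFSET	(1U << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK	(0xffU << SOFTIRQ_SHIFT)

/* Hypothetical model of the per-task state involved in the splat. */
struct task_model {
	unsigned int preempt_count;	/* real disable state */
	bool softirqs_enabled;		/* lockdep's view of softirqs */
	bool splat;			/* set when the two views disagree */
};

/* Stand-in for the tracepoint path that ends up in lockdep. */
void tracepoint_hook(struct task_model *t)
{
	bool softirq_off = (t->preempt_count & SOFTIRQ_MASK) != 0;

	if (softirq_off && t->softirqs_enabled)
		t->splat = true;
}

/* Old order: mark softirqs on, then a tracing decrement of the count. */
void bh_enable_old(struct task_model *t, unsigned int cnt)
{
	if ((t->preempt_count & SOFTIRQ_MASK) == (cnt & SOFTIRQ_MASK))
		t->softirqs_enabled = true;	/* trace_softirqs_on() */

	/* preempt_count_sub(): traces before the count is decremented */
	if (t->preempt_count == cnt)
		tracepoint_hook(t);
	t->preempt_count -= cnt;
}

/* New order: trace first, then mark softirqs on, then a plain decrement. */
void bh_enable_new(struct task_model *t, unsigned int cnt)
{
	if (t->preempt_count == cnt)
		tracepoint_hook(t);		/* trace_preempt_on() */

	if ((t->preempt_count & SOFTIRQ_MASK) == (cnt & SOFTIRQ_MASK))
		t->softirqs_enabled = true;	/* trace_softirqs_on() */

	t->preempt_count -= cnt;		/* __preempt_count_sub() */
}
```

In the model, the old order lets the hook run while the counter still shows softirqs disabled but the lockdep flag already says enabled; the new order runs the hook before the flag changes, so the two views agree whenever it fires.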

7 years


[PATCH 2/5] xhci: Fix kernel oops in trace_xhci_free_virt_device

by Mathias Nyman

From: Zhengjun Xing <zhengjun.xing(a)linux.intel.com>

commit 44a182b9d177 ("xhci: Fix use-after-free in xhci_free_virt_device")
set dev->udev pointer to NULL in xhci_free_dev(), it will cause kernel
panic in trace_xhci_free_virt_device. This patch reimplements the trace
function trace_xhci_free_virt_device, removes the dev->udev dereference
and adds more useful parameters to show in the trace function, it also
makes sure dev is not NULL before calling trace_xhci_free_virt_device.
This issue happened when xhci-hcd trace is enabled and USB devices hot
plug test. Original use-after-free patch went to stable so this needs to
be applied there as well.

[ 1092.022457] usb 2-4: USB disconnect, device number 6
[ 1092.092772] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 1092.101694] PGD 0 P4D 0
[ 1092.104601] Oops: 0000 [#1] SMP
[ 1092.207734] Workqueue: usb_hub_wq hub_event
[ 1092.212507] RIP: 0010:trace_event_raw_event_xhci_log_virt_dev+0x6c/0xf0
[ 1092.220050] RSP: 0018:ffff8c252e883d28 EFLAGS: 00010086
[ 1092.226024] RAX: ffff8c24af86fa84 RBX: 0000000000000003 RCX: ffff8c25255c2a01
[ 1092.234130] RDX: 0000000000000000 RSI: 00000000aef55009 RDI: ffff8c252e883d28
[ 1092.242242] RBP: ffff8c252550e2c0 R08: ffff8c24af86fa84 R09: 0000000000000a70
[ 1092.250364] R10: 0000000000000a70 R11: 0000000000000000 R12: ffff8c251f21a000
[ 1092.258468] R13: 000000000000000c R14: ffff8c251f21a000 R15: ffff8c251f432f60
[ 1092.266572] FS: 0000000000000000(0000) GS:ffff8c252e880000(0000) knlGS:0000000000000000
[ 1092.275757] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1092.282281] CR2: 0000000000000000 CR3: 0000000154209001 CR4: 00000000003606e0
[ 1092.290384] Call Trace:
[ 1092.293156]  <IRQ>
[ 1092.295439]  xhci_free_virt_device.part.34+0x182/0x1a0
[ 1092.301288]  handle_cmd_completion+0x7ac/0xfa0
[ 1092.306336]  ? trace_event_raw_event_xhci_log_trb+0x6e/0xa0
[ 1092.312661]  xhci_irq+0x3e8/0x1f60
[ 1092.316524]  __handle_irq_event_percpu+0x75/0x180
[ 1092.321876]  handle_irq_event_percpu+0x20/0x50
[ 1092.326922]  handle_irq_event+0x36/0x60
[ 1092.331273]  handle_edge_irq+0x6d/0x180
[ 1092.335644]  handle_irq+0x16/0x20
[ 1092.339417]  do_IRQ+0x41/0xc0
[ 1092.342782]  common_interrupt+0xf/0xf
[ 1092.346955]  </IRQ>

Fixes: 44a182b9d177 ("xhci: Fix use-after-free in xhci_free_virt_device")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Zhengjun Xing <zhengjun.xing(a)linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
 drivers/usb/host/xhci-mem.c   |  4 ++--
 drivers/usb/host/xhci-trace.h | 36 +++++++++++++++++++++++++++++++-----
 2 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index acbd3d7..8a62eee 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -886,12 +886,12 @@ void xhci_free_virt_device(struct xhci_hcd *xhci, int slot_id)
 
 	dev = xhci->devs[slot_id];
 
-	trace_xhci_free_virt_device(dev);
-
 	xhci->dcbaa->dev_context_ptrs[slot_id] = 0;
 	if (!dev)
 		return;
 
+	trace_xhci_free_virt_device(dev);
+
 	if (dev->tt_info)
 		old_active_eps = dev->tt_info->active_eps;
 
diff --git a/drivers/usb/host/xhci-trace.h b/drivers/usb/host/xhci-trace.h
index 410544f..88b4274 100644
--- a/drivers/usb/host/xhci-trace.h
+++ b/drivers/usb/host/xhci-trace.h
@@ -171,6 +171,37 @@ DEFINE_EVENT(xhci_log_trb, xhci_dbc_gadget_ep_queue,
 	TP_ARGS(ring, trb)
 );
 
+DECLARE_EVENT_CLASS(xhci_log_free_virt_dev,
+	TP_PROTO(struct xhci_virt_device *vdev),
+	TP_ARGS(vdev),
+	TP_STRUCT__entry(
+		__field(void *, vdev)
+		__field(unsigned long long, out_ctx)
+		__field(unsigned long long, in_ctx)
+		__field(u8, fake_port)
+		__field(u8, real_port)
+		__field(u16, current_mel)
+
+	),
+	TP_fast_assign(
+		__entry->vdev = vdev;
+		__entry->in_ctx = (unsigned long long) vdev->in_ctx->dma;
+		__entry->out_ctx = (unsigned long long) vdev->out_ctx->dma;
+		__entry->fake_port = (u8) vdev->fake_port;
+		__entry->real_port = (u8) vdev->real_port;
+		__entry->current_mel = (u16) vdev->current_mel;
+	),
+	TP_printk("vdev %p ctx %llx | %llx fake_port %d real_port %d current_mel %d",
+		__entry->vdev, __entry->in_ctx, __entry->out_ctx,
+		__entry->fake_port, __entry->real_port, __entry->current_mel
+	)
+);
+
+DEFINE_EVENT(xhci_log_free_virt_dev, xhci_free_virt_device,
+	TP_PROTO(struct xhci_virt_device *vdev),
+	TP_ARGS(vdev)
+);
+
 DECLARE_EVENT_CLASS(xhci_log_virt_dev,
 	TP_PROTO(struct xhci_virt_device *vdev),
 	TP_ARGS(vdev),
@@ -208,11 +239,6 @@ DEFINE_EVENT(xhci_log_virt_dev, xhci_alloc_virt_device,
 	TP_ARGS(vdev)
 );
 
-DEFINE_EVENT(xhci_log_virt_dev, xhci_free_virt_device,
-	TP_PROTO(struct xhci_virt_device *vdev),
-	TP_ARGS(vdev)
-);
-
 DEFINE_EVENT(xhci_log_virt_dev, xhci_setup_device,
 	TP_PROTO(struct xhci_virt_device *vdev),
 	TP_ARGS(vdev)
-- 
2.7.4
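The two ideas in this patch (trace only after the NULL check, and log only fields that stay valid after udev has been cleared) can be sketched in plain C. This is a userspace illustration under stated assumptions: `struct ctx_model`, `struct vdev_model` and `format_free_event()` are hypothetical names, and snprintf() stands in for the tracepoint's TP_printk().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a container context with a DMA address. */
struct ctx_model {
	unsigned long long dma;
};

/* Hypothetical model of the fields the new event class records. */
struct vdev_model {
	struct ctx_model *in_ctx;
	struct ctx_model *out_ctx;
	void *udev;			/* may already be NULL when freeing */
	unsigned char fake_port;
	unsigned char real_port;
	unsigned short current_mel;
};

/*
 * Format the "free" event. Returns -1 without touching buf when vdev is
 * NULL (the caller bails out before tracing), otherwise the snprintf()
 * result. vdev->udev is deliberately never dereferenced.
 */
int format_free_event(char *buf, size_t len, const struct vdev_model *vdev)
{
	if (!vdev)
		return -1;

	return snprintf(buf, len,
			"vdev %p ctx %llx | %llx fake_port %d real_port %d current_mel %d",
			(void *)vdev, vdev->in_ctx->dma, vdev->out_ctx->dma,
			vdev->fake_port, vdev->real_port, vdev->current_mel);
}
```

A device whose `udev` is NULL formats without a fault, which is the case that crashed with the older event class.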

7 years


Re: BUG: jumbo frames broken after commit xen-netfront: Fix race between device setup and open

by Javier Martinez Canillas

Hi Andrew,

On Wed, Jun 6, 2018 at 6:29 PM, Andrew Jeddeloh <andrew.jeddeloh(a)redhat.com> wrote:
> Hi all,
>
> The patch "xen-netfront: Fix race between device setup and open" seems
> to have introduced a regression preventing setting MTU's larger than
> 1500. We experienced this downstream with Container Linux and
> confirmed with Fedora 28 as well.
>
> It's commit f599c64fdf7d9c108e8717fb04bc41c680120da4 in the linux-stable tree.
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/com…
>
> Downstream bugs:
> https://github.com/coreos/bugs/issues/2443
> https://bugzilla.redhat.com/show_bug.cgi?id=1584216
>
> We've confirmed that reverting that commit fixes the bug. It can be
> reliably reproduced on AWS with t2.micro instances (and
> presumably other systems using the same driver). Both using
> systemd-networkd to set the mtu and manual ip link commands cause the
> link to respond with "Invalid argument" when trying to set the MTU >
> 1500.
>
> I'm not sure why that commit introduced the regression.
>
> Please let me know if there's any more information that would be helpful.
>
> - Andrew

I'm adding some relevant people to the CC list to bring more attention
on this regression.

The get_maintainer.pl script is very useful to get some hints on who
should be copied, i.e:

$ ./scripts/get_maintainer.pl -f drivers/net/xen-netfront.c

Best regards,
Javier

7 years


[PATCH] x86/xen: add call of speculative_store_bypass_ht_init() to pv paths

by Juergen Gross

Commit 1f50ddb4f4189243c05926b842dc1a0332195f31 ("x86/speculation:
Handle HT correctly on AMD") added speculative_store_bypass_ht_init()
to the per-cpu initialization sequence.

speculative_store_bypass_ht_init() needs to be called on each cpu for
pv guests, too.

Reported-by: Brian Woods <brian.woods(a)amd.com>
Fixes: 1f50ddb4f4189243c05926b842dc1a0332195f31 ("x86/speculation: Handle HT correctly on AMD")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Tested-by: Brian Woods <brian.woods(a)amd.com>
---
 arch/x86/xen/smp_pv.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 2e20ae2fa2d6..e3b18ad49889 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -32,6 +32,7 @@
 #include <xen/interface/vcpu.h>
 #include <xen/interface/xenpmu.h>
 
+#include <asm/spec-ctrl.h>
 #include <asm/xen/interface.h>
 #include <asm/xen/hypercall.h>
 
@@ -70,6 +71,8 @@ static void cpu_bringup(void)
 	cpu_data(cpu).x86_max_cores = 1;
 	set_cpu_sibling_map(cpu);
 
+	speculative_store_bypass_ht_init();
+
 	xen_setup_cpu_clockevents();
 
 	notify_cpu_starting(cpu);
@@ -250,6 +253,8 @@ static void __init xen_pv_smp_prepare_cpus(unsigned int max_cpus)
 	}
 	set_cpu_sibling_map(0);
 
+	speculative_store_bypass_ht_init();
+
 	xen_pmu_init(0);
 
 	if (xen_smp_intr_init(0) || xen_smp_intr_init_pv(0))
-- 
2.13.7

7 years


Re: [PATCH v2 1/2] sched/fair: Fix bandwidth timer clock drift condition

by Xunlei Pang

On 6/21/18 1:01 AM, bsegall(a)google.com wrote:
> Xunlei Pang <xlpang(a)linux.alibaba.com> writes:
>
>> I noticed the group constantly got throttled even it consumed
>> low cpu usage, this caused some jitters on the response time
>> to some of our business containers enabling cpu quota.
>>
>> It's very simple to reproduce:
>> mkdir /sys/fs/cgroup/cpu/test
>> cd /sys/fs/cgroup/cpu/test
>> echo 100000 > cpu.cfs_quota_us
>> echo $$ > tasks
>> then repeat:
>> cat cpu.stat |grep nr_throttled // nr_throttled will increase
>>
>> After some analysis, we found that cfs_rq::runtime_remaining will
>> be cleared by expire_cfs_rq_runtime() due to two equal but stale
>> "cfs_{b|q}->runtime_expires" after period timer is re-armed.
>>
>> The current condition to judge clock drift in expire_cfs_rq_runtime()
>> is wrong, the two runtime_expires are actually the same when clock
>> drift happens, so this condition can never hit. The original design was
>> correctly done by commit a9cf55b28610 ("sched: Expire invalid runtime"),
>> but was changed to be the current one due to its locking issue.
>>
>> This patch introduces another way, it adds a new field in both structures
>> cfs_rq and cfs_bandwidth to record the expiration update sequence, and
>> use them to figure out if clock drift happens(true if they equal).
>>
>> Fixes: 51f2176d74ac ("sched/fair: Fix unlocked reads of some cfs_b->quota/period")
>> Cc: Ben Segall <bsegall(a)google.com>
>
> Reviewed-By: Ben Segall <bsegall(a)google.com>

Thanks Ben :-)

Hi Peter, could you please have a look at them?

Cc stable(a)vger.kernel.org

>
>> Signed-off-by: Xunlei Pang <xlpang(a)linux.alibaba.com>
>> ---
>>  kernel/sched/fair.c  | 14 ++++++++------
>>  kernel/sched/sched.h |  6 ++++--
>>  2 files changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index e497c05aab7f..e6bb68d52962 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4590,6 +4590,7 @@ void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b)
>>  	now = sched_clock_cpu(smp_processor_id());
>>  	cfs_b->runtime = cfs_b->quota;
>>  	cfs_b->runtime_expires = now + ktime_to_ns(cfs_b->period);
>> +	cfs_b->expires_seq++;
>>  }
>>
>>  static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
>> @@ -4612,6 +4613,7 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>  	struct task_group *tg = cfs_rq->tg;
>>  	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(tg);
>>  	u64 amount = 0, min_amount, expires;
>> +	int expires_seq;
>>
>>  	/* note: this is a positive sum as runtime_remaining <= 0 */
>>  	min_amount = sched_cfs_bandwidth_slice() - cfs_rq->runtime_remaining;
>> @@ -4628,6 +4630,7 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>  			cfs_b->idle = 0;
>>  		}
>>  	}
>> +	expires_seq = cfs_b->expires_seq;
>>  	expires = cfs_b->runtime_expires;
>>  	raw_spin_unlock(&cfs_b->lock);
>>
>> @@ -4637,8 +4640,10 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>  	 * spread between our sched_clock and the one on which runtime was
>>  	 * issued.
>>  	 */
>> -	if ((s64)(expires - cfs_rq->runtime_expires) > 0)
>> +	if (cfs_rq->expires_seq != expires_seq) {
>> +		cfs_rq->expires_seq = expires_seq;
>>  		cfs_rq->runtime_expires = expires;
>> +	}
>>
>>  	return cfs_rq->runtime_remaining > 0;
>>  }
>> @@ -4664,12 +4669,9 @@ static void expire_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>  	 * has not truly expired.
>>  	 *
>>  	 * Fortunately we can check determine whether this the case by checking
>> -	 * whether the global deadline has advanced. It is valid to compare
>> -	 * cfs_b->runtime_expires without any locks since we only care about
>> -	 * exact equality, so a partial write will still work.
>> +	 * whether the global deadline(cfs_b->expires_seq) has advanced.
>>  	 */
>> -
>> -	if (cfs_rq->runtime_expires != cfs_b->runtime_expires) {
>> +	if (cfs_rq->expires_seq == cfs_b->expires_seq) {
>>  		/* extend local deadline, drift is bounded above by 2 ticks */
>>  		cfs_rq->runtime_expires += TICK_NSEC;
>>  	} else {
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 6601baf2361c..e977e04f8daf 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -334,9 +334,10 @@ struct cfs_bandwidth {
>>  	u64 runtime;
>>  	s64 hierarchical_quota;
>>  	u64 runtime_expires;
>> +	int expires_seq;
>>
>> -	int idle;
>> -	int period_active;
>> +	short idle;
>> +	short period_active;
>>  	struct hrtimer period_timer;
>>  	struct hrtimer slack_timer;
>>  	struct list_head throttled_cfs_rq;
>> @@ -551,6 +552,7 @@ struct cfs_rq {
>>
>>  #ifdef CONFIG_CFS_BANDWIDTH
>>  	int runtime_enabled;
>> +	int expires_seq;
>>  	u64 runtime_expires;
>>  	s64 runtime_remaining;
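The sequence-number scheme from the quoted patch can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the scheduler code: `struct global_pool`, `struct local_rq`, `refill()`, `assign()` and `expire()` are hypothetical names, locking is omitted, and the value chosen for TICK_NSEC is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICK_NSEC 4000000ULL	/* arbitrary tick length for the model */

/* Hypothetical model of the group-wide bandwidth pool (cfs_bandwidth). */
struct global_pool {
	uint64_t runtime_expires;
	int expires_seq;	/* bumped on every refill */
};

/* Hypothetical model of a per-CPU runqueue's local view (cfs_rq). */
struct local_rq {
	uint64_t runtime_expires;
	int expires_seq;	/* sequence seen at the last assignment */
	int64_t runtime_remaining;
};

/* Period timer fired: new deadline, new sequence number. */
void refill(struct global_pool *g, uint64_t now, uint64_t period)
{
	g->runtime_expires = now + period;
	g->expires_seq++;
}

/* Hand runtime to the local queue and sync its deadline if it is stale. */
void assign(struct local_rq *l, const struct global_pool *g, int64_t amount)
{
	l->runtime_remaining += amount;
	if (l->expires_seq != g->expires_seq) {
		l->expires_seq = g->expires_seq;
		l->runtime_expires = g->runtime_expires;
	}
}

/* Returns true when the local runtime was really expired and discarded. */
bool expire(struct local_rq *l, const struct global_pool *g,
	    uint64_t local_clock)
{
	/* local deadline still ahead of our clock: nothing to do */
	if ((int64_t)(local_clock - l->runtime_expires) < 0)
		return false;
	if (l->runtime_remaining < 0)
		return false;

	if (l->expires_seq == g->expires_seq) {
		/* same period, so this is clock drift: extend the deadline */
		l->runtime_expires += TICK_NSEC;
		return false;
	}

	/* the pool has been refilled since: the local runtime is stale */
	l->runtime_remaining = 0;
	return true;
}
```

Comparing sequence numbers rather than timestamps is what lets the drift case (same period, local clock slightly ahead) be told apart from a real expiry (the pool has moved on to a new period).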

7 years


[tip:x86/urgent] x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths

by tip-bot for Juergen Gross

Commit-ID:  74899d92e66663dc7671a8017b3146dcd4735f3b
Gitweb:     https://git.kernel.org/tip/74899d92e66663dc7671a8017b3146dcd4735f3b
Author:     Juergen Gross <jgross(a)suse.com>
AuthorDate: Thu, 21 Jun 2018 10:43:31 +0200
Committer:  Ingo Molnar <mingo(a)kernel.org>
CommitDate: Thu, 21 Jun 2018 10:55:52 +0200

x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths

Commit:

  1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD")

... added speculative_store_bypass_ht_init() to the per-CPU initialization sequence.

speculative_store_bypass_ht_init() needs to be called on each CPU for
PV guests, too.

Reported-by: Brian Woods <brian.woods(a)amd.com>
Tested-by: Brian Woods <brian.woods(a)amd.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: boris.ostrovsky(a)oracle.com
Cc: xen-devel(a)lists.xenproject.org
Fixes: 1f50ddb4f4189243c05926b842dc1a0332195f31 ("x86/speculation: Handle HT correctly on AMD")
Link: https://lore.kernel.org/lkml/20180621084331.21228-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
---
 arch/x86/xen/smp_pv.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 2e20ae2fa2d6..e3b18ad49889 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -32,6 +32,7 @@
 #include <xen/interface/vcpu.h>
 #include <xen/interface/xenpmu.h>
 
+#include <asm/spec-ctrl.h>
 #include <asm/xen/interface.h>
 #include <asm/xen/hypercall.h>
 
@@ -70,6 +71,8 @@ static void cpu_bringup(void)
 	cpu_data(cpu).x86_max_cores = 1;
 	set_cpu_sibling_map(cpu);
 
+	speculative_store_bypass_ht_init();
+
 	xen_setup_cpu_clockevents();
 
 	notify_cpu_starting(cpu);
@@ -250,6 +253,8 @@ static void __init xen_pv_smp_prepare_cpus(unsigned int max_cpus)
 	}
 	set_cpu_sibling_map(0);
 
+	speculative_store_bypass_ht_init();
+
 	xen_pmu_init(0);
 
 	if (xen_smp_intr_init(0) || xen_smp_intr_init_pv(0))

7 years


+ slub-track-number-of-slabs-irrespective-of-config_slub_debug.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: slub: track number of slabs irrespective of CONFIG_SLUB_DEBUG
has been added to the -mm tree.  Its filename is
     slub-track-number-of-slabs-irrespective-of-config_slub_debug.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/slub-track-number-of-slabs-irrespe…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/slub-track-number-of-slabs-irrespe…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Shakeel Butt <shakeelb(a)google.com>
Subject: slub: track number of slabs irrespective of CONFIG_SLUB_DEBUG

For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
allocated per node for a kmem_cache. Thus, slabs_node() in
__kmem_cache_empty(), __kmem_cache_shrink() and __kmem_cache_destroy()
will always return 0 for such config. This is wrong and can cause issues
for all users of these functions.

In fact in [1] Jason has reported a system crash while using SLUB without
CONFIG_SLUB_DEBUG. The reason was the usage of slabs_node() by
__kmem_cache_empty().

The right solution is to make slabs_node() work even for
!CONFIG_SLUB_DEBUG. The commit 0f389ec63077 ("slub: No need for per node
slab counters if !SLUB_DEBUG") had put the per node slab counter under
CONFIG_SLUB_DEBUG because it was only read through sysfs API and the sysfs
API was disabled on !CONFIG_SLUB_DEBUG. However the users of the per node
slab counter assumed that it will work in the absence of
CONFIG_SLUB_DEBUG. So, make the counter work for !CONFIG_SLUB_DEBUG.

Please note that f9e13c0a5a33 ("slab, slub: skip unnecessary
kasan_cache_shutdown()") exposed this issue but it is present even before.

[1] http://lkml.kernel.org/r/CAHmME9rtoPwxUSnktxzKso14iuVCWT7BE_-_8PAC=pGw1iJnQ…

Link: http://lkml.kernel.org/r/20180620224147.23777-1-shakeelb@google.com
Fixes: f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()")
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Suggested-by: David Rientjes <rientjes(a)google.com>
Reported-by: Jason A . Donenfeld <Jason(a)zx2c4.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/slab.h |    2 -
 mm/slub.c |   80 ++++++++++++++++++++++++----------------------------
 2 files changed, 38 insertions(+), 44 deletions(-)

diff -puN mm/slab.h~slub-track-number-of-slabs-irrespective-of-config_slub_debug mm/slab.h
--- a/mm/slab.h~slub-track-number-of-slabs-irrespective-of-config_slub_debug
+++ a/mm/slab.h
@@ -473,8 +473,8 @@ struct kmem_cache_node {
 #ifdef CONFIG_SLUB
 	unsigned long nr_partial;
 	struct list_head partial;
-#ifdef CONFIG_SLUB_DEBUG
 	atomic_long_t nr_slabs;
+#ifdef CONFIG_SLUB_DEBUG
 	atomic_long_t total_objects;
 	struct list_head full;
 #endif
diff -puN mm/slub.c~slub-track-number-of-slabs-irrespective-of-config_slub_debug mm/slub.c
--- a/mm/slub.c~slub-track-number-of-slabs-irrespective-of-config_slub_debug
+++ a/mm/slub.c
@@ -1030,42 +1030,6 @@ static void remove_full(struct kmem_cach
 	list_del(&page->lru);
 }
 
-/* Tracking of the number of slabs for debugging purposes */
-static inline unsigned long slabs_node(struct kmem_cache *s, int node)
-{
-	struct kmem_cache_node *n = get_node(s, node);
-
-	return atomic_long_read(&n->nr_slabs);
-}
-
-static inline unsigned long node_nr_slabs(struct kmem_cache_node *n)
-{
-	return atomic_long_read(&n->nr_slabs);
-}
-
-static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
-{
-	struct kmem_cache_node *n = get_node(s, node);
-
-	/*
-	 * May be called early in order to allocate a slab for the
-	 * kmem_cache_node structure. Solve the chicken-egg
-	 * dilemma by deferring the increment of the count during
-	 * bootstrap (see early_kmem_cache_node_alloc).
-	 */
-	if (likely(n)) {
-		atomic_long_inc(&n->nr_slabs);
-		atomic_long_add(objects, &n->total_objects);
-	}
-}
-static inline void dec_slabs_node(struct kmem_cache *s, int node, int objects)
-{
-	struct kmem_cache_node *n = get_node(s, node);
-
-	atomic_long_dec(&n->nr_slabs);
-	atomic_long_sub(objects, &n->total_objects);
-}
-
 /* Object debug checks for alloc/free paths */
 static void setup_object_debug(struct kmem_cache *s, struct page *page,
 								void *object)
@@ -1321,16 +1285,46 @@ slab_flags_t kmem_cache_flags(unsigned i
 
 #define disable_higher_order_debug 0
 
+#endif /* CONFIG_SLUB_DEBUG */
+
 static inline unsigned long slabs_node(struct kmem_cache *s, int node)
-							{ return 0; }
+{
+	struct kmem_cache_node *n = get_node(s, node);
+
+	return atomic_long_read(&n->nr_slabs);
+}
+
 static inline unsigned long node_nr_slabs(struct kmem_cache_node *n)
-							{ return 0; }
-static inline void inc_slabs_node(struct kmem_cache *s, int node,
-							int objects) {}
-static inline void dec_slabs_node(struct kmem_cache *s, int node,
-							int objects) {}
+{
+	return atomic_long_read(&n->nr_slabs);
+}
 
-#endif /* CONFIG_SLUB_DEBUG */
+static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
+{
+	struct kmem_cache_node *n = get_node(s, node);
+
+	/*
+	 * May be called early in order to allocate a slab for the
+	 * kmem_cache_node structure. Solve the chicken-egg
+	 * dilemma by deferring the increment of the count during
+	 * bootstrap (see early_kmem_cache_node_alloc).
+	 */
+	if (likely(n)) {
+		atomic_long_inc(&n->nr_slabs);
+#ifdef CONFIG_SLUB_DEBUG
+		atomic_long_add(objects, &n->total_objects);
+#endif
+	}
+}
+static inline void dec_slabs_node(struct kmem_cache *s, int node, int objects)
+{
+	struct kmem_cache_node *n = get_node(s, node);
+
+	atomic_long_dec(&n->nr_slabs);
+#ifdef CONFIG_SLUB_DEBUG
+	atomic_long_sub(objects, &n->total_objects);
+#endif
+}
 
 /*
  * Hooks for other subsystems that check memory allocations. In a typical
_

Patches currently in -mm which might be from shakeelb(a)google.com are

slub-track-number-of-slabs-irrespective-of-config_slub_debug.patch
slub-fix-__kmem_cache_empty-for-config_slub_debug.patch
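Why a stubbed-out counter is dangerous for an emptiness check can be shown with a small userspace model. The sketch below rests on assumptions: `struct node_model`, `inc_slabs()`, `dec_slabs()`, `slabs_on_node()` and `cache_empty()` are hypothetical names, `MODEL_SLUB_DEBUG` is a made-up macro standing in for CONFIG_SLUB_DEBUG, and plain `long` counters replace the kernel's atomic_long_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node bookkeeping; nr_slabs exists in every config. */
struct node_model {
	long nr_slabs;
#ifdef MODEL_SLUB_DEBUG
	long total_objects;	/* debug-only statistic */
#endif
};

/* Count a newly allocated slab; n may be NULL during early bootstrap. */
void inc_slabs(struct node_model *n, int objects)
{
	if (n) {
		n->nr_slabs++;
#ifdef MODEL_SLUB_DEBUG
		n->total_objects += objects;
#else
		(void)objects;
#endif
	}
}

/* Count a freed slab. */
void dec_slabs(struct node_model *n, int objects)
{
	n->nr_slabs--;
#ifdef MODEL_SLUB_DEBUG
	n->total_objects -= objects;
#else
	(void)objects;
#endif
}

/* Always reads the real counter, never a stub that returns 0. */
long slabs_on_node(const struct node_model *n)
{
	return n->nr_slabs;
}

/* Emptiness check in the spirit of __kmem_cache_empty(). */
bool cache_empty(const struct node_model *nodes, size_t nr_nodes)
{
	for (size_t i = 0; i < nr_nodes; i++)
		if (slabs_on_node(&nodes[i]))
			return false;
	return true;
}
```

If `slabs_on_node()` were a stub returning 0, as the non-debug variant did before this patch, `cache_empty()` would report true even with live slabs, which is the failure mode the commit message describes.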

7 years


v4.14.51 build: 0 failures 0 warnings (v4.14.51)

by Build bot for Mark Brown

Tree/Branch: v4.14.51
Git describe: v4.14.51
Commit: 33445c07cd Linux 4.14.51

Build Time: 98 min 40 sec

Passed:   11 / 11   (100.00 %)
Failed:    0 / 11   (  0.00 %)

Errors: 0
Warnings: 0
Section Mismatches: 0

-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):

-------------------------------------------------------------------------------

===============================================================================
Detailed per-defconfig build reports below:

-------------------------------------------------------------------------------

Passed with no errors, warnings or mismatches:

arm64-allnoconfig
arm64-allmodconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
x86_64-defconfig
arm-allmodconfig
arm-allnoconfig
x86_64-allnoconfig
arm-multi_v4t_defconfig
x86_64-allmodconfig
arm64-defconfig
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr

7 years

