MSM sd host controller is capable of HW busy detection of device busy
signaling over DAT0 line. And it requires the R1B response for
commands that have this response associated with them.
So set the below two host capabilities for qcom SDHC.
- MMC_CAP_WAIT_WHILE_BUSY
- MMC_CAP_NEED_RSP_BUSY
Cc: <stable(a)vger.kernel.org> # v4.19+
Signed-off-by: Veerabhadrarao Badiganti <vbadigan(a)codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
---
drivers/mmc/host/sdhci-msm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 09ff731..d826e9b 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -2087,6 +2087,9 @@ static int sdhci_msm_probe(struct platform_device *pdev)
goto clk_disable;
}
+ msm_host->mmc->caps |= MMC_CAP_WAIT_WHILE_BUSY;
+ msm_host->mmc->caps |= MMC_CAP_NEED_RSP_BUSY;
+
pm_runtime_get_noresume(&pdev->dev);
pm_runtime_set_active(&pdev->dev);
pm_runtime_enable(&pdev->dev);
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Please apply the attached backport of commit
396d2e878f92ec108e4293f1c77ea3bc90b414ff to the 4.4, 4.9, 4.14, and
4.19 stable branches.
Ben.
--
Ben Hutchings
I say we take off; nuke the site from orbit.
It's the only way to be sure.
Please pick this fix for 4.4, 4.9, and 4.14 stable branches:
commit 7690e25302dc7d0cd42b349e746fe44b44a94f2b
Author: Goldwyn Rodrigues <rgoldwyn(a)suse.com>
Date: Sun Dec 3 21:14:12 2017 -0600
dm flakey: check for null arg_name in parse_features()
Ben.
--
Ben Hutchings
I say we take off; nuke the site from orbit.
It's the only way to be sure.
I've managed to get about everything wrong while digging these out of
OEM's board file.
Correct the bus numbers, the exact model of the NOR flash, polarity of
the chip selects and align the SPI frequency with the data sheet.
Tested that it works now, with a slight fix to the PXA SSP driver.
Signed-off-by: Lubomir Rintel <lkundrak(a)v3.sk>
Cc: <stable(a)vger.kernel.org>
---
arch/arm/boot/dts/mmp3-dell-ariel.dts | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm/boot/dts/mmp3-dell-ariel.dts b/arch/arm/boot/dts/mmp3-dell-ariel.dts
index 15449c72c042b..b0ec14c421641 100644
--- a/arch/arm/boot/dts/mmp3-dell-ariel.dts
+++ b/arch/arm/boot/dts/mmp3-dell-ariel.dts
@@ -98,19 +98,19 @@ &twsi4 {
status = "okay";
};
-&ssp3 {
+&ssp1 {
status = "okay";
- cs-gpios = <&gpio 46 GPIO_ACTIVE_HIGH>;
+ cs-gpios = <&gpio 46 GPIO_ACTIVE_LOW>;
firmware-flash@0 {
- compatible = "st,m25p80", "jedec,spi-nor";
+ compatible = "winbond,w25q32", "jedec,spi-nor";
reg = <0>;
- spi-max-frequency = <40000000>;
+ spi-max-frequency = <104000000>;
m25p,fast-read;
};
};
-&ssp4 {
- cs-gpios = <&gpio 56 GPIO_ACTIVE_HIGH>;
+&ssp2 {
+ cs-gpios = <&gpio 56 GPIO_ACTIVE_LOW>;
status = "okay";
};
--
2.26.0
Please apply the following commit:
Commit subject: usb: dwc3: gadget: Don't clear flags before transfer ended
Commit ID: a114c4ca64bd522aec1790c7e5c60c882f699d8f
Apply to: at least 4.19 stable, and 5.4 stable if possible. Note that all kernels from v4.18-rc1 up to 5.7-rc1 are affected.
Why apply it:
<https://github.com/torvalds/linux/commit/a114c4ca64bd522aec1790c7e5c60c882f…> fixes <https://github.com/torvalds/linux/commit/6d8a019614f3a7630e0a2c1be4bf1cfc23…>. Without this fix the built-in USB function source/sink test module fails to work with isochronous endpoints [1]. A side-effect of setting dep->flags = DWC3_EP_ENABLED; in dwc3_gadget_ep_cleanup_completed_requests() as part of disabling an ep is that a subsequent attempt to enable the endpoint will skip __dwc3_gadget_ep_enable() effectively leaving the ep disabled.
[1] Our gadget driver on TI AM5729 fails to work exactly like the built-in USB function source/sink test module when switching to alternate interface 1 because of the issue described above.
TI is currently using 4.19 and 5.4 stable kernels as the basis for their processor SDK kernels for their AM57x SoC and may switch to 5.4 stable at a later time. Thank you.
Kind Regards
—
Delio Brignoli
AudioScience, Inc.
USA Sales Toll Free 1-855-AUDIOSC
<www.audioscience.com> - <http://www.facebook.com/AudioScienceInc>
Dear Prike, dear Alex, dear Linux folks,
Am 13.04.20 um 10:44 schrieb Paul Menzel:
> A regression between causes a system with the AMD board MSI B350M MORTAR
> (MS-7A37) with an AMD Ryzen 3 2200G not to power off any more but just
> to halt.
>
> The regression is introduced in 9ebe5422ad6c..b032227c6293. I am in the
> process to bisect this, but maybe somebody already has an idea.
I found the Easter egg:
> commit 487eca11a321ef33bcf4ca5adb3c0c4954db1b58
> Author: Prike Liang <Prike.Liang(a)amd.com>
> Date: Tue Apr 7 20:21:26 2020 +0800
>
> drm/amdgpu: fix gfx hang during suspend with video playback (v2)
>
> The system will be hang up during S3 suspend because of SMU is pending
> for GC not respose the register CP_HQD_ACTIVE access request.This issue
> root cause of accessing the GC register under enter GFX CGGPG and can
> be fixed by disable GFX CGPG before perform suspend.
>
> v2: Use disable the GFX CGPG instead of RLC safe mode guard.
>
> Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
> Tested-by: Mengbing Wang <Mengbing.Wang(a)amd.com>
> Reviewed-by: Huang Rui <ray.huang(a)amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
> Cc: stable(a)vger.kernel.org
It reverts cleanly on top of 5.7-rc1, and this fixes the issue.
Greg, please do not apply this to the stable series. The commit message
doesn’t even reference a issue/bug report, and doesn’t give a detailed
problem description. What system is it?
Dave, Alex, how to proceed? Revert? I created issue 1094 [1].
Kind regards,
Paul
[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1094
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 55e8c8eb2c7b6bf30e99423ccfe7ca032f498f59 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Fri, 28 Feb 2020 11:11:06 -0600
Subject: [PATCH] posix-cpu-timers: Store a reference to a pid not a task
posix cpu timers do not handle the death of a process well.
This is most clearly seen when a multi-threaded process calls exec from a
thread that is not the leader of the thread group. The posix cpu timer code
continues to pin the old thread group leader and is unable to find the
siglock from there.
This results in posix_cpu_timer_del being unable to delete a timer,
posix_cpu_timer_set being unable to set a timer. Further to compensate for
the problems in posix_cpu_timer_del on a multi-threaded exec all timers
that point at the multi-threaded task are stopped.
The code for the timers fundamentally needs to check if the target
process/thread is alive. This needs an extra level of indirection. This
level of indirection is already available in struct pid.
So replace cpu.task with cpu.pid to get the needed extra layer of
indirection.
In addition to handling things more cleanly this reduces the amount of
memory a timer can pin when a process exits and then is reaped from
a task_struct to the vastly smaller struct pid.
Fixes: e0a70217107e ("posix-cpu-timers: workaround to suppress the problems with mt exec")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/87wo86tz6d.fsf@x220.int.ebiederm.org
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3d10c84a97a9..e3f0f8585da4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -69,7 +69,7 @@ static inline int clockid_to_fd(const clockid_t clk)
struct cpu_timer {
struct timerqueue_node node;
struct timerqueue_head *head;
- struct task_struct *task;
+ struct pid *pid;
struct list_head elist;
int firing;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index ef936c5a910b..6df468a622fe 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -118,6 +118,16 @@ static inline int validate_clock_permissions(const clockid_t clock)
return __get_task_for_clock(clock, false, false) ? 0 : -EINVAL;
}
+static inline enum pid_type cpu_timer_pid_type(struct k_itimer *timer)
+{
+ return CPUCLOCK_PERTHREAD(timer->it_clock) ? PIDTYPE_PID : PIDTYPE_TGID;
+}
+
+static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer)
+{
+ return pid_task(timer->it.cpu.pid, cpu_timer_pid_type(timer));
+}
+
/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
@@ -391,7 +401,12 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
new_timer->kclock = &clock_posix_cpu;
timerqueue_init(&new_timer->it.cpu.node);
- new_timer->it.cpu.task = p;
+ new_timer->it.cpu.pid = get_task_pid(p, cpu_timer_pid_type(new_timer));
+ /*
+ * get_task_for_clock() took a reference on @p. Drop it as the timer
+ * holds a reference on the pid of @p.
+ */
+ put_task_struct(p);
return 0;
}
@@ -404,13 +419,15 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
static int posix_cpu_timer_del(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Protect against sighand release/switch in exit/exec and process/
@@ -432,8 +449,10 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
unlock_task_sighand(p, &flags);
}
+out:
+ rcu_read_unlock();
if (!ret)
- put_task_struct(p);
+ put_pid(ctmr->pid);
return ret;
}
@@ -561,13 +580,21 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
u64 old_expires, new_expires, old_incr, val;
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p) {
+ /*
+ * If p has just been reaped, we can no
+ * longer get any information about it at all.
+ */
+ rcu_read_unlock();
+ return -ESRCH;
+ }
/*
* Use the to_ktime conversion because that clamps the maximum
@@ -584,8 +611,10 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
* If p has just been reaped, we can no
* longer get any information about it at all.
*/
- if (unlikely(sighand == NULL))
+ if (unlikely(sighand == NULL)) {
+ rcu_read_unlock();
return -ESRCH;
+ }
/*
* Disarm any old timer after extracting its expiry time.
@@ -690,6 +719,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = 0;
out:
+ rcu_read_unlock();
if (old)
old->it_interval = ns_to_timespec64(old_incr);
@@ -701,10 +731,12 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 now, expires = cpu_timer_getexpires(ctmr);
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Easy part: convert the reload time.
@@ -712,7 +744,7 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_interval = ktime_to_timespec64(timer->it_interval);
if (!expires)
- return;
+ goto out;
/*
* Sample the clock to take the difference with the expiry time.
@@ -732,6 +764,8 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_value.tv_nsec = 1;
itp->it_value.tv_sec = 0;
}
+out:
+ rcu_read_unlock();
}
#define MAX_COLLECTED 20
@@ -952,14 +986,15 @@ static void check_process_timers(struct task_struct *tsk,
static void posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
struct sighand_struct *sighand;
unsigned long flags;
u64 now;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -974,13 +1009,15 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- return;
+ goto out;
/*
* Now re-arm for the new expiry time.
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
+out:
+ rcu_read_unlock();
}
/**
The original memcpy_mcsafe() implementation satisfied two primary
concerns. It provided a copy routine that avoided known unrecoverable
scenarios (poison consumption via fast-string instructions / accesses
across cacheline boundaries), and it provided a fallback to fast plain
memcpy if the platform did not indicate recovery capability.
However, the platform indication for recovery capability is not
architectural so it was always going to require an ever growing list of
opt-in quirks. Also, ongoing hardware improvements to recoverability
mean that the cross-cacheline-boundary and fast-string poison
consumption scenarios are no longer concerns on newer platforms. The
introduction of memcpy_mcsafe_fast(), and resulting reworks, recovers
performance for these newer CPUs, but without the maintenance burden of
an opt-in whitelist.
With memcpy_mcsafe_fast() the existing opt-in becomes a blacklist for
CPUs that benefit from a more careful slower implementation. Every other
CPU gets a fast, but optionally recoverable implementation.
With this in place memcpy_mcsafe() never falls back to plain memcpy().
It can be used in any scenario where the caller needs guarantees that
source or destination access faults / exceptions will be handled.
Systems without recovery support continue to default to a fast
implementation while newer systems both default to fast, and support
recovery, without needing an explicit quirk.
Cc: x86(a)kernel.org
Cc: <stable(a)vger.kernel.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Acked-by: Tony Luck <tony.luck(a)intel.com>
Reported-by: Erwin Tsaur <erwin.tsaur(a)intel.com>
Tested-by: Erwin Tsaur <erwin.tsaur(a)intel.com>
Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Note that I marked this for stable and included a Fixes tag because it
is arguably a regression that old kernels stop recovering from poison
consumption on new hardware.
arch/x86/include/asm/string_64.h | 24 ++++++-------
arch/x86/include/asm/uaccess_64.h | 7 +---
arch/x86/kernel/cpu/mce/core.c | 6 ++-
arch/x86/kernel/quirks.c | 4 +-
arch/x86/lib/memcpy_64.S | 57 +++++++++++++++++++++++++++---
arch/x86/lib/usercopy_64.c | 6 +--
tools/objtool/check.c | 3 +-
tools/perf/bench/mem-memcpy-x86-64-lib.c | 8 +---
tools/testing/nvdimm/test/nfit.c | 2 +
9 files changed, 75 insertions(+), 42 deletions(-)
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 75314c3dbe47..07840fa3582a 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -83,21 +83,23 @@ int strcmp(const char *cs, const char *ct);
#endif
#define __HAVE_ARCH_MEMCPY_MCSAFE 1
-__must_check unsigned long __memcpy_mcsafe(void *dst, const void *src,
+__must_check unsigned long memcpy_mcsafe_slow(void *dst, const void *src,
size_t cnt);
-DECLARE_STATIC_KEY_FALSE(mcsafe_key);
+__must_check unsigned long memcpy_mcsafe_fast(void *dst, const void *src,
+ size_t cnt);
+DECLARE_STATIC_KEY_FALSE(mcsafe_slow_key);
/**
- * memcpy_mcsafe - copy memory with indication if a machine check happened
+ * memcpy_mcsafe - copy memory with indication if an exception / fault happened
*
* @dst: destination address
* @src: source address
* @cnt: number of bytes to copy
*
- * Low level memory copy function that catches machine checks
- * We only call into the "safe" function on systems that can
- * actually do machine check recovery. Everyone else can just
- * use memcpy().
+ * The slow version is opted into by platform quirks. The fast version
+ * is equivalent to memcpy() regardless of CPU machine-check-recovery
+ * capability, but may still fall back to the slow version if the CPU
+ * lacks fast-string instruction support.
*
* Return 0 for success, or number of bytes not copied if there was an
* exception.
@@ -106,12 +108,10 @@ static __always_inline __must_check unsigned long
memcpy_mcsafe(void *dst, const void *src, size_t cnt)
{
#ifdef CONFIG_X86_MCE
- if (static_branch_unlikely(&mcsafe_key))
- return __memcpy_mcsafe(dst, src, cnt);
- else
+ if (static_branch_unlikely(&mcsafe_slow_key))
+ return memcpy_mcsafe_slow(dst, src, cnt);
#endif
- memcpy(dst, src, cnt);
- return 0;
+ return memcpy_mcsafe_fast(dst, src, cnt);
}
#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 5cd1caa8bc65..f8c0d38c3f45 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -52,12 +52,7 @@ copy_to_user_mcsafe(void *to, const void *from, unsigned len)
unsigned long ret;
__uaccess_begin();
- /*
- * Note, __memcpy_mcsafe() is explicitly used since it can
- * handle exceptions / faults. memcpy_mcsafe() may fall back to
- * memcpy() which lacks this handling.
- */
- ret = __memcpy_mcsafe(to, from, len);
+ ret = memcpy_mcsafe(to, from, len);
__uaccess_end();
return ret;
}
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 2c4f949611e4..6bf94d39dc7f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -2579,13 +2579,13 @@ static void __init mcheck_debugfs_init(void)
static void __init mcheck_debugfs_init(void) { }
#endif
-DEFINE_STATIC_KEY_FALSE(mcsafe_key);
-EXPORT_SYMBOL_GPL(mcsafe_key);
+DEFINE_STATIC_KEY_FALSE(mcsafe_slow_key);
+EXPORT_SYMBOL_GPL(mcsafe_slow_key);
static int __init mcheck_late_init(void)
{
if (mca_cfg.recovery)
- static_branch_inc(&mcsafe_key);
+ static_branch_inc(&mcsafe_slow_key);
mcheck_debugfs_init();
cec_init();
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 896d74cb5081..89c88d9de5c4 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -636,7 +636,7 @@ static void quirk_intel_brickland_xeon_ras_cap(struct pci_dev *pdev)
pci_read_config_dword(pdev, 0x84, &capid0);
if (capid0 & 0x10)
- static_branch_inc(&mcsafe_key);
+ static_branch_inc(&mcsafe_slow_key);
}
/* Skylake */
@@ -653,7 +653,7 @@ static void quirk_intel_purley_xeon_ras_cap(struct pci_dev *pdev)
* enabled, so memory machine check recovery is also enabled.
*/
if ((capid0 & 0xc0) == 0xc0 || (capid5 & 0x1e0))
- static_branch_inc(&mcsafe_key);
+ static_branch_inc(&mcsafe_slow_key);
}
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x0ec3, quirk_intel_brickland_xeon_ras_cap);
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 56b243b14c3a..b5e4fc1cf99d 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -189,11 +189,18 @@ SYM_FUNC_END(memcpy_orig)
MCSAFE_TEST_CTL
/*
- * __memcpy_mcsafe - memory copy with machine check exception handling
- * Note that we only catch machine checks when reading the source addresses.
- * Writes to target are posted and don't generate machine checks.
+ * memcpy_mcsafe_slow() - memory copy with exception handling
+ *
+ * In contrast to memcpy_mcsafe_fast() this version is careful to
+ * never perform a read across a cacheline boundary, and not use
+ * fast-string instruction sequences which are known to be unrecoverable
+ * on CPUs identified by mcsafe_slow_key.
+ *
+ * Note that this only catches machine check exceptions when reading the
+ * source addresses. Writes to target are posted and don't generate
+ * machine checks. However this does handle protection faults on writes.
*/
-SYM_FUNC_START(__memcpy_mcsafe)
+SYM_FUNC_START(memcpy_mcsafe_slow)
cmpl $8, %edx
/* Less than 8 bytes? Go to byte copy loop */
jb .L_no_whole_words
@@ -260,8 +267,8 @@ SYM_FUNC_START(__memcpy_mcsafe)
xorl %eax, %eax
.L_done:
ret
-SYM_FUNC_END(__memcpy_mcsafe)
-EXPORT_SYMBOL_GPL(__memcpy_mcsafe)
+SYM_FUNC_END(memcpy_mcsafe_slow)
+EXPORT_SYMBOL_GPL(memcpy_mcsafe_slow)
.section .fixup, "ax"
/*
@@ -296,4 +303,42 @@ EXPORT_SYMBOL_GPL(__memcpy_mcsafe)
_ASM_EXTABLE(.L_write_leading_bytes, .E_leading_bytes)
_ASM_EXTABLE(.L_write_words, .E_write_words)
_ASM_EXTABLE(.L_write_trailing_bytes, .E_trailing_bytes)
+
+/*
+ * memcpy_mcsafe_fast - memory copy with exception handling
+ *
+ * Fast string copy + exception handling. If the CPU does support
+ * machine check exception recovery, but does not support recovering
+ * from fast-string exceptions then this CPU needs to be added to the
+ * mcsafe_slow_key set of quirks. Otherwise, absent any machine check
+ * recovery support this version should be no slower than standard
+ * memcpy.
+ */
+SYM_FUNC_START(memcpy_mcsafe_fast)
+ ALTERNATIVE "jmp memcpy_mcsafe_slow", "", X86_FEATURE_ERMS
+ movq %rdi, %rax
+ movq %rdx, %rcx
+.L_copy:
+ rep movsb
+ /* Copy successful. Return zero */
+ xorl %eax, %eax
+ ret
+SYM_FUNC_END(memcpy_mcsafe_fast)
+EXPORT_SYMBOL_GPL(memcpy_mcsafe_fast)
+
+ .section .fixup, "ax"
+.E_copy:
+ /*
+ * On fault %rcx is updated such that the copy instruction could
+ * optionally be restarted at the fault position, i.e. it
+ * contains 'bytes remaining'. A non-zero return indicates error
+ * to memcpy_mcsafe() users, or indicate short transfers to
+ * user-copy routines.
+ */
+ movq %rcx, %rax
+ ret
+
+ .previous
+
+ _ASM_EXTABLE_FAULT(.L_copy, .E_copy)
#endif
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index fff28c6f73a2..348c9331748e 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -64,11 +64,7 @@ __visible notrace unsigned long
mcsafe_handle_tail(char *to, char *from, unsigned len)
{
for (; len; --len, to++, from++) {
- /*
- * Call the assembly routine back directly since
- * memcpy_mcsafe() may silently fallback to memcpy.
- */
- unsigned long rem = __memcpy_mcsafe(to, from, 1);
+ unsigned long rem = memcpy_mcsafe(to, from, 1);
if (rem)
break;
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 4768d91c6d68..bae1f77aae90 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -485,7 +485,8 @@ static const char *uaccess_safe_builtin[] = {
"__ubsan_handle_shift_out_of_bounds",
/* misc */
"csum_partial_copy_generic",
- "__memcpy_mcsafe",
+ "memcpy_mcsafe_slow",
+ "memcpy_mcsafe_fast",
"mcsafe_handle_tail",
"ftrace_likely_update", /* CONFIG_TRACE_BRANCH_PROFILING */
NULL
diff --git a/tools/perf/bench/mem-memcpy-x86-64-lib.c b/tools/perf/bench/mem-memcpy-x86-64-lib.c
index 4130734dde84..23e1747b3a67 100644
--- a/tools/perf/bench/mem-memcpy-x86-64-lib.c
+++ b/tools/perf/bench/mem-memcpy-x86-64-lib.c
@@ -5,17 +5,13 @@
*/
#include <linux/types.h>
-unsigned long __memcpy_mcsafe(void *dst, const void *src, size_t cnt);
+unsigned long memcpy_mcsafe(void *dst, const void *src, size_t cnt);
unsigned long mcsafe_handle_tail(char *to, char *from, unsigned len);
unsigned long mcsafe_handle_tail(char *to, char *from, unsigned len)
{
for (; len; --len, to++, from++) {
- /*
- * Call the assembly routine back directly since
- * memcpy_mcsafe() may silently fallback to memcpy.
- */
- unsigned long rem = __memcpy_mcsafe(to, from, 1);
+ unsigned long rem = memcpy_mcsafe(to, from, 1);
if (rem)
break;
diff --git a/tools/testing/nvdimm/test/nfit.c b/tools/testing/nvdimm/test/nfit.c
index bf6422a6af7f..282722d96f8e 100644
--- a/tools/testing/nvdimm/test/nfit.c
+++ b/tools/testing/nvdimm/test/nfit.c
@@ -3136,7 +3136,7 @@ void mcsafe_test(void)
}
mcsafe_test_init(dst, src, 512);
- rem = __memcpy_mcsafe(dst, src, 512);
+ rem = memcpy_mcsafe_slow(dst, src, 512);
valid = mcsafe_test_validate(dst, src, 512, expect);
if (rem == expect && valid)
continue;
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.
As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.
Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Reviewed-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index bc2c96b34de1..4ddccab02a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1043,6 +1043,7 @@ int mlx5e_safe_reopen_channels(struct mlx5e_priv *priv);
int mlx5e_safe_switch_channels(struct mlx5e_priv *priv,
struct mlx5e_channels *new_chs,
mlx5e_fp_preactivate preactivate);
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv);
void mlx5e_activate_priv_channels(struct mlx5e_priv *priv);
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68b520df07e4..ff7f5a931520 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -432,9 +432,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
*cur_params = new_channels.params;
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
+ mlx5e_num_channels_changed(priv);
goto out;
}
@@ -442,12 +440,8 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (arfs_enabled)
mlx5e_arfs_disable(priv);
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
-
/* Switch to new channels, set new parameters and close old ones */
- err = mlx5e_safe_switch_channels(priv, &new_channels, NULL);
+ err = mlx5e_safe_switch_channels(priv, &new_channels, mlx5e_num_channels_changed);
if (arfs_enabled) {
int err2 = mlx5e_arfs_enable(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 152aa5d7df79..bbe8c32fb423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2880,6 +2880,17 @@ static void mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
netif_set_real_num_rx_queues(netdev, num_rxqs);
}
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
+{
+ u16 count = priv->channels.params.num_channels;
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, count);
+
+ return 0;
+}
+
static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
{
int i, ch;
@@ -5288,9 +5299,10 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
max_nch = mlx5e_get_max_num_channels(priv->mdev);
if (priv->channels.params.num_channels > max_nch) {
mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ /* Reducing the number of channels - RXFH has to be reset. */
+ priv->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
priv->channels.params.num_channels = max_nch;
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, max_nch);
+ mlx5e_num_channels_changed(priv);
}
err = profile->init_tx(priv);
From: Sultan Alsawaf <sultan(a)kerneltoast.com>
Hi,
This patchset contains fixes for two pesky i915 bugs that exist in 5.4.
The first bug, fixed by
"drm/i915/gt: Schedule request retirement when timeline idles" upstream, is very
high power consumption by i915 hardware due to an old commit that broke the RC6
power state for the iGPU on Skylake+. On my laptop, a Dell Precision 5540 with
an i9-9880H, the idle power consumption of my laptop with this commit goes down
from 10 W to 2 W, saving a massive 8 W of power. On other Skylake+ laptops I
tested, this commit reduced idle power consumption by at least a few watts. The
difference in power consumption can be observed by running `powertop` while
disconnected from the charger, or by using the intel-undervolt tool [1] and
running `intel-undervolt measure` which doesn't require being disconnected from
the charger. The psys measurement from intel-undervolt is the one of interest.
Backporting this commit was not trivial due to i915's high rate of churn, but
the backport didn't require changing any code from the original commit; it only
required moving code around and not altering some #includes.
The second bug causes severe graphical corruption and flickering on laptops
which support an esoteric power-saving mechanism called Panel Self Refresh
(PSR). Enabled by default in 5.0, PSR causes graphical corruption to the point
of unusability on many Dell laptops made within the past few years, since they
typically support PSR. A bug was filed with Intel a while ago for this with more
information [2]. I suspect most the community hasn't been affected by this bug
because ThinkPads and many other laptops I checked didn't support PSR. As of
writing, the issues I observed with PSR are fixed in Intel's drm-tip branch, but
i915 suffers from so much churn that backporting the individual PSR fixes is
infeasible (and an Intel engineer attested to this, saying that the PSR fixes in
drm-tip wouldn't be backported [3]). Disabling PSR by default brings us back to
pre-5.0 behavior, and from my tests with functional PSR in the drm-tip kernel,
I didn't observe any reduction in power consumption with PSR enabled, so there
isn't much lost from turning it off. Also, Ubuntu now ships with PSR disabled by
default in its affected kernels, so there is distro support behind this change.
Thanks,
Sultan
[1] https://github.com/kitsunyan/intel-undervolt
[2] https://gitlab.freedesktop.org/drm/intel/issues/425
[3] https://gitlab.freedesktop.org/drm/intel/issues/425#note_416130
Chris Wilson (1):
drm/i915/gt: Schedule request retirement when timeline idles
Sultan Alsawaf (1):
drm/i915: Disable Panel Self Refresh (PSR) by default
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +
drivers/gpu/drm/i915/gt/intel_engine_types.h | 8 ++
drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++
drivers/gpu/drm/i915/gt/intel_timeline.c | 1 +
.../gpu/drm/i915/gt/intel_timeline_types.h | 3 +
drivers/gpu/drm/i915/i915_params.c | 3 +-
drivers/gpu/drm/i915/i915_params.h | 2 +-
drivers/gpu/drm/i915/i915_request.c | 73 +++++++++++++++++++
drivers/gpu/drm/i915/i915_request.h | 4 +
9 files changed, 101 insertions(+), 3 deletions(-)
--
2.26.0
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a114c4ca64bd522aec1790c7e5c60c882f699d8f Mon Sep 17 00:00:00 2001
From: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Date: Thu, 5 Mar 2020 13:23:49 -0800
Subject: [PATCH] usb: dwc3: gadget: Don't clear flags before transfer ended
We track END_TRANSFER command completion. Don't clear transfer
started/ended flag prematurely. Otherwise, we'd run into the problem
with restarting transfer before END_TRANSFER command finishes.
Fixes: 6d8a019614f3 ("usb: dwc3: gadget: check for Missed Isoc from event status")
Signed-off-by: Thinh Nguyen <thinhn(a)synopsys.com>
Signed-off-by: Felipe Balbi <balbi(a)kernel.org>
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 1e00bf2d65a2..b032e62fff0f 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2570,10 +2570,8 @@ static void dwc3_gadget_endpoint_transfer_in_progress(struct dwc3_ep *dep,
dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
- if (stop) {
+ if (stop)
dwc3_stop_active_transfer(dep, true, true);
- dep->flags = DWC3_EP_ENABLED;
- }
/*
* WORKAROUND: This is the 2nd half of U1/U2 -> U0 workaround.
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 55e8c8eb2c7b6bf30e99423ccfe7ca032f498f59 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Fri, 28 Feb 2020 11:11:06 -0600
Subject: [PATCH] posix-cpu-timers: Store a reference to a pid not a task
posix cpu timers do not handle the death of a process well.
This is most clearly seen when a multi-threaded process calls exec from a
thread that is not the leader of the thread group. The posix cpu timer code
continues to pin the old thread group leader and is unable to find the
siglock from there.
This results in posix_cpu_timer_del being unable to delete a timer,
posix_cpu_timer_set being unable to set a timer. Further to compensate for
the problems in posix_cpu_timer_del on a multi-threaded exec all timers
that point at the multi-threaded task are stopped.
The code for the timers fundamentally needs to check if the target
process/thread is alive. This needs an extra level of indirection. This
level of indirection is already available in struct pid.
So replace cpu.task with cpu.pid to get the needed extra layer of
indirection.
In addition to handling things more cleanly this reduces the amount of
memory a timer can pin when a process exits and then is reaped from
a task_struct to the vastly smaller struct pid.
Fixes: e0a70217107e ("posix-cpu-timers: workaround to suppress the problems with mt exec")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/87wo86tz6d.fsf@x220.int.ebiederm.org
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3d10c84a97a9..e3f0f8585da4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -69,7 +69,7 @@ static inline int clockid_to_fd(const clockid_t clk)
struct cpu_timer {
struct timerqueue_node node;
struct timerqueue_head *head;
- struct task_struct *task;
+ struct pid *pid;
struct list_head elist;
int firing;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index ef936c5a910b..6df468a622fe 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -118,6 +118,16 @@ static inline int validate_clock_permissions(const clockid_t clock)
return __get_task_for_clock(clock, false, false) ? 0 : -EINVAL;
}
+static inline enum pid_type cpu_timer_pid_type(struct k_itimer *timer)
+{
+ return CPUCLOCK_PERTHREAD(timer->it_clock) ? PIDTYPE_PID : PIDTYPE_TGID;
+}
+
+static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer)
+{
+ return pid_task(timer->it.cpu.pid, cpu_timer_pid_type(timer));
+}
+
/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
@@ -391,7 +401,12 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
new_timer->kclock = &clock_posix_cpu;
timerqueue_init(&new_timer->it.cpu.node);
- new_timer->it.cpu.task = p;
+ new_timer->it.cpu.pid = get_task_pid(p, cpu_timer_pid_type(new_timer));
+ /*
+ * get_task_for_clock() took a reference on @p. Drop it as the timer
+ * holds a reference on the pid of @p.
+ */
+ put_task_struct(p);
return 0;
}
@@ -404,13 +419,15 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
static int posix_cpu_timer_del(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Protect against sighand release/switch in exit/exec and process/
@@ -432,8 +449,10 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
unlock_task_sighand(p, &flags);
}
+out:
+ rcu_read_unlock();
if (!ret)
- put_task_struct(p);
+ put_pid(ctmr->pid);
return ret;
}
@@ -561,13 +580,21 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
u64 old_expires, new_expires, old_incr, val;
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p) {
+ /*
+ * If p has just been reaped, we can no
+ * longer get any information about it at all.
+ */
+ rcu_read_unlock();
+ return -ESRCH;
+ }
/*
* Use the to_ktime conversion because that clamps the maximum
@@ -584,8 +611,10 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
* If p has just been reaped, we can no
* longer get any information about it at all.
*/
- if (unlikely(sighand == NULL))
+ if (unlikely(sighand == NULL)) {
+ rcu_read_unlock();
return -ESRCH;
+ }
/*
* Disarm any old timer after extracting its expiry time.
@@ -690,6 +719,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = 0;
out:
+ rcu_read_unlock();
if (old)
old->it_interval = ns_to_timespec64(old_incr);
@@ -701,10 +731,12 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 now, expires = cpu_timer_getexpires(ctmr);
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Easy part: convert the reload time.
@@ -712,7 +744,7 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_interval = ktime_to_timespec64(timer->it_interval);
if (!expires)
- return;
+ goto out;
/*
* Sample the clock to take the difference with the expiry time.
@@ -732,6 +764,8 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_value.tv_nsec = 1;
itp->it_value.tv_sec = 0;
}
+out:
+ rcu_read_unlock();
}
#define MAX_COLLECTED 20
@@ -952,14 +986,15 @@ static void check_process_timers(struct task_struct *tsk,
static void posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
struct sighand_struct *sighand;
unsigned long flags;
u64 now;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -974,13 +1009,15 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- return;
+ goto out;
/*
* Now re-arm for the new expiry time.
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
+out:
+ rcu_read_unlock();
}
/**
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 55e8c8eb2c7b6bf30e99423ccfe7ca032f498f59 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Fri, 28 Feb 2020 11:11:06 -0600
Subject: [PATCH] posix-cpu-timers: Store a reference to a pid not a task
posix cpu timers do not handle the death of a process well.
This is most clearly seen when a multi-threaded process calls exec from a
thread that is not the leader of the thread group. The posix cpu timer code
continues to pin the old thread group leader and is unable to find the
siglock from there.
This results in posix_cpu_timer_del being unable to delete a timer,
posix_cpu_timer_set being unable to set a timer. Further to compensate for
the problems in posix_cpu_timer_del on a multi-threaded exec all timers
that point at the multi-threaded task are stopped.
The code for the timers fundamentally needs to check if the target
process/thread is alive. This needs an extra level of indirection. This
level of indirection is already available in struct pid.
So replace cpu.task with cpu.pid to get the needed extra layer of
indirection.
In addition to handling things more cleanly this reduces the amount of
memory a timer can pin when a process exits and then is reaped from
a task_struct to the vastly smaller struct pid.
Fixes: e0a70217107e ("posix-cpu-timers: workaround to suppress the problems with mt exec")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/87wo86tz6d.fsf@x220.int.ebiederm.org
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3d10c84a97a9..e3f0f8585da4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -69,7 +69,7 @@ static inline int clockid_to_fd(const clockid_t clk)
struct cpu_timer {
struct timerqueue_node node;
struct timerqueue_head *head;
- struct task_struct *task;
+ struct pid *pid;
struct list_head elist;
int firing;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index ef936c5a910b..6df468a622fe 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -118,6 +118,16 @@ static inline int validate_clock_permissions(const clockid_t clock)
return __get_task_for_clock(clock, false, false) ? 0 : -EINVAL;
}
+static inline enum pid_type cpu_timer_pid_type(struct k_itimer *timer)
+{
+ return CPUCLOCK_PERTHREAD(timer->it_clock) ? PIDTYPE_PID : PIDTYPE_TGID;
+}
+
+static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer)
+{
+ return pid_task(timer->it.cpu.pid, cpu_timer_pid_type(timer));
+}
+
/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
@@ -391,7 +401,12 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
new_timer->kclock = &clock_posix_cpu;
timerqueue_init(&new_timer->it.cpu.node);
- new_timer->it.cpu.task = p;
+ new_timer->it.cpu.pid = get_task_pid(p, cpu_timer_pid_type(new_timer));
+ /*
+ * get_task_for_clock() took a reference on @p. Drop it as the timer
+ * holds a reference on the pid of @p.
+ */
+ put_task_struct(p);
return 0;
}
@@ -404,13 +419,15 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
static int posix_cpu_timer_del(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Protect against sighand release/switch in exit/exec and process/
@@ -432,8 +449,10 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
unlock_task_sighand(p, &flags);
}
+out:
+ rcu_read_unlock();
if (!ret)
- put_task_struct(p);
+ put_pid(ctmr->pid);
return ret;
}
@@ -561,13 +580,21 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
u64 old_expires, new_expires, old_incr, val;
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p) {
+ /*
+ * If p has just been reaped, we can no
+ * longer get any information about it at all.
+ */
+ rcu_read_unlock();
+ return -ESRCH;
+ }
/*
* Use the to_ktime conversion because that clamps the maximum
@@ -584,8 +611,10 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
* If p has just been reaped, we can no
* longer get any information about it at all.
*/
- if (unlikely(sighand == NULL))
+ if (unlikely(sighand == NULL)) {
+ rcu_read_unlock();
return -ESRCH;
+ }
/*
* Disarm any old timer after extracting its expiry time.
@@ -690,6 +719,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = 0;
out:
+ rcu_read_unlock();
if (old)
old->it_interval = ns_to_timespec64(old_incr);
@@ -701,10 +731,12 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 now, expires = cpu_timer_getexpires(ctmr);
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Easy part: convert the reload time.
@@ -712,7 +744,7 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_interval = ktime_to_timespec64(timer->it_interval);
if (!expires)
- return;
+ goto out;
/*
* Sample the clock to take the difference with the expiry time.
@@ -732,6 +764,8 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_value.tv_nsec = 1;
itp->it_value.tv_sec = 0;
}
+out:
+ rcu_read_unlock();
}
#define MAX_COLLECTED 20
@@ -952,14 +986,15 @@ static void check_process_timers(struct task_struct *tsk,
static void posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
struct sighand_struct *sighand;
unsigned long flags;
u64 now;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -974,13 +1009,15 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- return;
+ goto out;
/*
* Now re-arm for the new expiry time.
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
+out:
+ rcu_read_unlock();
}
/**
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 55e8c8eb2c7b6bf30e99423ccfe7ca032f498f59 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Fri, 28 Feb 2020 11:11:06 -0600
Subject: [PATCH] posix-cpu-timers: Store a reference to a pid not a task
posix cpu timers do not handle the death of a process well.
This is most clearly seen when a multi-threaded process calls exec from a
thread that is not the leader of the thread group. The posix cpu timer code
continues to pin the old thread group leader and is unable to find the
siglock from there.
This results in posix_cpu_timer_del being unable to delete a timer,
posix_cpu_timer_set being unable to set a timer. Further to compensate for
the problems in posix_cpu_timer_del on a multi-threaded exec all timers
that point at the multi-threaded task are stopped.
The code for the timers fundamentally needs to check if the target
process/thread is alive. This needs an extra level of indirection. This
level of indirection is already available in struct pid.
So replace cpu.task with cpu.pid to get the needed extra layer of
indirection.
In addition to handling things more cleanly this reduces the amount of
memory a timer can pin when a process exits and then is reaped from
a task_struct to the vastly smaller struct pid.
Fixes: e0a70217107e ("posix-cpu-timers: workaround to suppress the problems with mt exec")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/87wo86tz6d.fsf@x220.int.ebiederm.org
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3d10c84a97a9..e3f0f8585da4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -69,7 +69,7 @@ static inline int clockid_to_fd(const clockid_t clk)
struct cpu_timer {
struct timerqueue_node node;
struct timerqueue_head *head;
- struct task_struct *task;
+ struct pid *pid;
struct list_head elist;
int firing;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index ef936c5a910b..6df468a622fe 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -118,6 +118,16 @@ static inline int validate_clock_permissions(const clockid_t clock)
return __get_task_for_clock(clock, false, false) ? 0 : -EINVAL;
}
+static inline enum pid_type cpu_timer_pid_type(struct k_itimer *timer)
+{
+ return CPUCLOCK_PERTHREAD(timer->it_clock) ? PIDTYPE_PID : PIDTYPE_TGID;
+}
+
+static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer)
+{
+ return pid_task(timer->it.cpu.pid, cpu_timer_pid_type(timer));
+}
+
/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
@@ -391,7 +401,12 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
new_timer->kclock = &clock_posix_cpu;
timerqueue_init(&new_timer->it.cpu.node);
- new_timer->it.cpu.task = p;
+ new_timer->it.cpu.pid = get_task_pid(p, cpu_timer_pid_type(new_timer));
+ /*
+ * get_task_for_clock() took a reference on @p. Drop it as the timer
+ * holds a reference on the pid of @p.
+ */
+ put_task_struct(p);
return 0;
}
@@ -404,13 +419,15 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
static int posix_cpu_timer_del(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Protect against sighand release/switch in exit/exec and process/
@@ -432,8 +449,10 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
unlock_task_sighand(p, &flags);
}
+out:
+ rcu_read_unlock();
if (!ret)
- put_task_struct(p);
+ put_pid(ctmr->pid);
return ret;
}
@@ -561,13 +580,21 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
u64 old_expires, new_expires, old_incr, val;
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p) {
+ /*
+ * If p has just been reaped, we can no
+ * longer get any information about it at all.
+ */
+ rcu_read_unlock();
+ return -ESRCH;
+ }
/*
* Use the to_ktime conversion because that clamps the maximum
@@ -584,8 +611,10 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
* If p has just been reaped, we can no
* longer get any information about it at all.
*/
- if (unlikely(sighand == NULL))
+ if (unlikely(sighand == NULL)) {
+ rcu_read_unlock();
return -ESRCH;
+ }
/*
* Disarm any old timer after extracting its expiry time.
@@ -690,6 +719,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = 0;
out:
+ rcu_read_unlock();
if (old)
old->it_interval = ns_to_timespec64(old_incr);
@@ -701,10 +731,12 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 now, expires = cpu_timer_getexpires(ctmr);
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Easy part: convert the reload time.
@@ -712,7 +744,7 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_interval = ktime_to_timespec64(timer->it_interval);
if (!expires)
- return;
+ goto out;
/*
* Sample the clock to take the difference with the expiry time.
@@ -732,6 +764,8 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_value.tv_nsec = 1;
itp->it_value.tv_sec = 0;
}
+out:
+ rcu_read_unlock();
}
#define MAX_COLLECTED 20
@@ -952,14 +986,15 @@ static void check_process_timers(struct task_struct *tsk,
static void posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
struct sighand_struct *sighand;
unsigned long flags;
u64 now;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -974,13 +1009,15 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- return;
+ goto out;
/*
* Now re-arm for the new expiry time.
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
+out:
+ rcu_read_unlock();
}
/**
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 55e8c8eb2c7b6bf30e99423ccfe7ca032f498f59 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Fri, 28 Feb 2020 11:11:06 -0600
Subject: [PATCH] posix-cpu-timers: Store a reference to a pid not a task
posix cpu timers do not handle the death of a process well.
This is most clearly seen when a multi-threaded process calls exec from a
thread that is not the leader of the thread group. The posix cpu timer code
continues to pin the old thread group leader and is unable to find the
siglock from there.
This results in posix_cpu_timer_del being unable to delete a timer,
posix_cpu_timer_set being unable to set a timer. Further to compensate for
the problems in posix_cpu_timer_del on a multi-threaded exec all timers
that point at the multi-threaded task are stopped.
The code for the timers fundamentally needs to check if the target
process/thread is alive. This needs an extra level of indirection. This
level of indirection is already available in struct pid.
So replace cpu.task with cpu.pid to get the needed extra layer of
indirection.
In addition to handling things more cleanly this reduces the amount of
memory a timer can pin when a process exits and then is reaped from
a task_struct to the vastly smaller struct pid.
Fixes: e0a70217107e ("posix-cpu-timers: workaround to suppress the problems with mt exec")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/87wo86tz6d.fsf@x220.int.ebiederm.org
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3d10c84a97a9..e3f0f8585da4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -69,7 +69,7 @@ static inline int clockid_to_fd(const clockid_t clk)
struct cpu_timer {
struct timerqueue_node node;
struct timerqueue_head *head;
- struct task_struct *task;
+ struct pid *pid;
struct list_head elist;
int firing;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index ef936c5a910b..6df468a622fe 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -118,6 +118,16 @@ static inline int validate_clock_permissions(const clockid_t clock)
return __get_task_for_clock(clock, false, false) ? 0 : -EINVAL;
}
+static inline enum pid_type cpu_timer_pid_type(struct k_itimer *timer)
+{
+ return CPUCLOCK_PERTHREAD(timer->it_clock) ? PIDTYPE_PID : PIDTYPE_TGID;
+}
+
+static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer)
+{
+ return pid_task(timer->it.cpu.pid, cpu_timer_pid_type(timer));
+}
+
/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
@@ -391,7 +401,12 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
new_timer->kclock = &clock_posix_cpu;
timerqueue_init(&new_timer->it.cpu.node);
- new_timer->it.cpu.task = p;
+ new_timer->it.cpu.pid = get_task_pid(p, cpu_timer_pid_type(new_timer));
+ /*
+ * get_task_for_clock() took a reference on @p. Drop it as the timer
+ * holds a reference on the pid of @p.
+ */
+ put_task_struct(p);
return 0;
}
@@ -404,13 +419,15 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
static int posix_cpu_timer_del(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Protect against sighand release/switch in exit/exec and process/
@@ -432,8 +449,10 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
unlock_task_sighand(p, &flags);
}
+out:
+ rcu_read_unlock();
if (!ret)
- put_task_struct(p);
+ put_pid(ctmr->pid);
return ret;
}
@@ -561,13 +580,21 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
u64 old_expires, new_expires, old_incr, val;
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p) {
+ /*
+ * If p has just been reaped, we can no
+ * longer get any information about it at all.
+ */
+ rcu_read_unlock();
+ return -ESRCH;
+ }
/*
* Use the to_ktime conversion because that clamps the maximum
@@ -584,8 +611,10 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
* If p has just been reaped, we can no
* longer get any information about it at all.
*/
- if (unlikely(sighand == NULL))
+ if (unlikely(sighand == NULL)) {
+ rcu_read_unlock();
return -ESRCH;
+ }
/*
* Disarm any old timer after extracting its expiry time.
@@ -690,6 +719,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = 0;
out:
+ rcu_read_unlock();
if (old)
old->it_interval = ns_to_timespec64(old_incr);
@@ -701,10 +731,12 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 now, expires = cpu_timer_getexpires(ctmr);
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Easy part: convert the reload time.
@@ -712,7 +744,7 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_interval = ktime_to_timespec64(timer->it_interval);
if (!expires)
- return;
+ goto out;
/*
* Sample the clock to take the difference with the expiry time.
@@ -732,6 +764,8 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_value.tv_nsec = 1;
itp->it_value.tv_sec = 0;
}
+out:
+ rcu_read_unlock();
}
#define MAX_COLLECTED 20
@@ -952,14 +986,15 @@ static void check_process_timers(struct task_struct *tsk,
static void posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
struct sighand_struct *sighand;
unsigned long flags;
u64 now;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -974,13 +1009,15 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- return;
+ goto out;
/*
* Now re-arm for the new expiry time.
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
+out:
+ rcu_read_unlock();
}
/**
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 55e8c8eb2c7b6bf30e99423ccfe7ca032f498f59 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Fri, 28 Feb 2020 11:11:06 -0600
Subject: [PATCH] posix-cpu-timers: Store a reference to a pid not a task
posix cpu timers do not handle the death of a process well.
This is most clearly seen when a multi-threaded process calls exec from a
thread that is not the leader of the thread group. The posix cpu timer code
continues to pin the old thread group leader and is unable to find the
siglock from there.
This results in posix_cpu_timer_del being unable to delete a timer,
posix_cpu_timer_set being unable to set a timer. Further to compensate for
the problems in posix_cpu_timer_del on a multi-threaded exec all timers
that point at the multi-threaded task are stopped.
The code for the timers fundamentally needs to check if the target
process/thread is alive. This needs an extra level of indirection. This
level of indirection is already available in struct pid.
So replace cpu.task with cpu.pid to get the needed extra layer of
indirection.
In addition to handling things more cleanly this reduces the amount of
memory a timer can pin when a process exits and then is reaped from
a task_struct to the vastly smaller struct pid.
Fixes: e0a70217107e ("posix-cpu-timers: workaround to suppress the problems with mt exec")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/87wo86tz6d.fsf@x220.int.ebiederm.org
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3d10c84a97a9..e3f0f8585da4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -69,7 +69,7 @@ static inline int clockid_to_fd(const clockid_t clk)
struct cpu_timer {
struct timerqueue_node node;
struct timerqueue_head *head;
- struct task_struct *task;
+ struct pid *pid;
struct list_head elist;
int firing;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index ef936c5a910b..6df468a622fe 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -118,6 +118,16 @@ static inline int validate_clock_permissions(const clockid_t clock)
return __get_task_for_clock(clock, false, false) ? 0 : -EINVAL;
}
+static inline enum pid_type cpu_timer_pid_type(struct k_itimer *timer)
+{
+ return CPUCLOCK_PERTHREAD(timer->it_clock) ? PIDTYPE_PID : PIDTYPE_TGID;
+}
+
+static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer)
+{
+ return pid_task(timer->it.cpu.pid, cpu_timer_pid_type(timer));
+}
+
/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
@@ -391,7 +401,12 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
new_timer->kclock = &clock_posix_cpu;
timerqueue_init(&new_timer->it.cpu.node);
- new_timer->it.cpu.task = p;
+ new_timer->it.cpu.pid = get_task_pid(p, cpu_timer_pid_type(new_timer));
+ /*
+ * get_task_for_clock() took a reference on @p. Drop it as the timer
+ * holds a reference on the pid of @p.
+ */
+ put_task_struct(p);
return 0;
}
@@ -404,13 +419,15 @@ static int posix_cpu_timer_create(struct k_itimer *new_timer)
static int posix_cpu_timer_del(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Protect against sighand release/switch in exit/exec and process/
@@ -432,8 +449,10 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
unlock_task_sighand(p, &flags);
}
+out:
+ rcu_read_unlock();
if (!ret)
- put_task_struct(p);
+ put_pid(ctmr->pid);
return ret;
}
@@ -561,13 +580,21 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
u64 old_expires, new_expires, old_incr, val;
struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
int ret = 0;
- if (WARN_ON_ONCE(!p))
- return -EINVAL;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p) {
+ /*
+ * If p has just been reaped, we can no
+ * longer get any information about it at all.
+ */
+ rcu_read_unlock();
+ return -ESRCH;
+ }
/*
* Use the to_ktime conversion because that clamps the maximum
@@ -584,8 +611,10 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
* If p has just been reaped, we can no
* longer get any information about it at all.
*/
- if (unlikely(sighand == NULL))
+ if (unlikely(sighand == NULL)) {
+ rcu_read_unlock();
return -ESRCH;
+ }
/*
* Disarm any old timer after extracting its expiry time.
@@ -690,6 +719,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = 0;
out:
+ rcu_read_unlock();
if (old)
old->it_interval = ns_to_timespec64(old_incr);
@@ -701,10 +731,12 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 now, expires = cpu_timer_getexpires(ctmr);
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Easy part: convert the reload time.
@@ -712,7 +744,7 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_interval = ktime_to_timespec64(timer->it_interval);
if (!expires)
- return;
+ goto out;
/*
* Sample the clock to take the difference with the expiry time.
@@ -732,6 +764,8 @@ static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp
itp->it_value.tv_nsec = 1;
itp->it_value.tv_sec = 0;
}
+out:
+ rcu_read_unlock();
}
#define MAX_COLLECTED 20
@@ -952,14 +986,15 @@ static void check_process_timers(struct task_struct *tsk,
static void posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct cpu_timer *ctmr = &timer->it.cpu;
- struct task_struct *p = ctmr->task;
+ struct task_struct *p;
struct sighand_struct *sighand;
unsigned long flags;
u64 now;
- if (WARN_ON_ONCE(!p))
- return;
+ rcu_read_lock();
+ p = cpu_timer_task_rcu(timer);
+ if (!p)
+ goto out;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -974,13 +1009,15 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- return;
+ goto out;
/*
* Now re-arm for the new expiry time.
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
+out:
+ rcu_read_unlock();
}
/**
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.
As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.
Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Reviewed-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index bc2c96b34de1..4ddccab02a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1043,6 +1043,7 @@ int mlx5e_safe_reopen_channels(struct mlx5e_priv *priv);
int mlx5e_safe_switch_channels(struct mlx5e_priv *priv,
struct mlx5e_channels *new_chs,
mlx5e_fp_preactivate preactivate);
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv);
void mlx5e_activate_priv_channels(struct mlx5e_priv *priv);
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68b520df07e4..ff7f5a931520 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -432,9 +432,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
*cur_params = new_channels.params;
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
+ mlx5e_num_channels_changed(priv);
goto out;
}
@@ -442,12 +440,8 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (arfs_enabled)
mlx5e_arfs_disable(priv);
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
-
/* Switch to new channels, set new parameters and close old ones */
- err = mlx5e_safe_switch_channels(priv, &new_channels, NULL);
+ err = mlx5e_safe_switch_channels(priv, &new_channels, mlx5e_num_channels_changed);
if (arfs_enabled) {
int err2 = mlx5e_arfs_enable(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 152aa5d7df79..bbe8c32fb423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2880,6 +2880,17 @@ static void mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
netif_set_real_num_rx_queues(netdev, num_rxqs);
}
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
+{
+ u16 count = priv->channels.params.num_channels;
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, count);
+
+ return 0;
+}
+
static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
{
int i, ch;
@@ -5288,9 +5299,10 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
max_nch = mlx5e_get_max_num_channels(priv->mdev);
if (priv->channels.params.num_channels > max_nch) {
mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ /* Reducing the number of channels - RXFH has to be reset. */
+ priv->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
priv->channels.params.num_channels = max_nch;
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, max_nch);
+ mlx5e_num_channels_changed(priv);
}
err = profile->init_tx(priv);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.
As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.
Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Reviewed-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index bc2c96b34de1..4ddccab02a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1043,6 +1043,7 @@ int mlx5e_safe_reopen_channels(struct mlx5e_priv *priv);
int mlx5e_safe_switch_channels(struct mlx5e_priv *priv,
struct mlx5e_channels *new_chs,
mlx5e_fp_preactivate preactivate);
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv);
void mlx5e_activate_priv_channels(struct mlx5e_priv *priv);
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68b520df07e4..ff7f5a931520 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -432,9 +432,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
*cur_params = new_channels.params;
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
+ mlx5e_num_channels_changed(priv);
goto out;
}
@@ -442,12 +440,8 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (arfs_enabled)
mlx5e_arfs_disable(priv);
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
-
/* Switch to new channels, set new parameters and close old ones */
- err = mlx5e_safe_switch_channels(priv, &new_channels, NULL);
+ err = mlx5e_safe_switch_channels(priv, &new_channels, mlx5e_num_channels_changed);
if (arfs_enabled) {
int err2 = mlx5e_arfs_enable(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 152aa5d7df79..bbe8c32fb423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2880,6 +2880,17 @@ static void mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
netif_set_real_num_rx_queues(netdev, num_rxqs);
}
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
+{
+ u16 count = priv->channels.params.num_channels;
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, count);
+
+ return 0;
+}
+
static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
{
int i, ch;
@@ -5288,9 +5299,10 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
max_nch = mlx5e_get_max_num_channels(priv->mdev);
if (priv->channels.params.num_channels > max_nch) {
mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ /* Reducing the number of channels - RXFH has to be reset. */
+ priv->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
priv->channels.params.num_channels = max_nch;
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, max_nch);
+ mlx5e_num_channels_changed(priv);
}
err = profile->init_tx(priv);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.
As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.
Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Reviewed-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index bc2c96b34de1..4ddccab02a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1043,6 +1043,7 @@ int mlx5e_safe_reopen_channels(struct mlx5e_priv *priv);
int mlx5e_safe_switch_channels(struct mlx5e_priv *priv,
struct mlx5e_channels *new_chs,
mlx5e_fp_preactivate preactivate);
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv);
void mlx5e_activate_priv_channels(struct mlx5e_priv *priv);
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68b520df07e4..ff7f5a931520 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -432,9 +432,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
*cur_params = new_channels.params;
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
+ mlx5e_num_channels_changed(priv);
goto out;
}
@@ -442,12 +440,8 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (arfs_enabled)
mlx5e_arfs_disable(priv);
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
-
/* Switch to new channels, set new parameters and close old ones */
- err = mlx5e_safe_switch_channels(priv, &new_channels, NULL);
+ err = mlx5e_safe_switch_channels(priv, &new_channels, mlx5e_num_channels_changed);
if (arfs_enabled) {
int err2 = mlx5e_arfs_enable(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 152aa5d7df79..bbe8c32fb423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2880,6 +2880,17 @@ static void mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
netif_set_real_num_rx_queues(netdev, num_rxqs);
}
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
+{
+ u16 count = priv->channels.params.num_channels;
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, count);
+
+ return 0;
+}
+
static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
{
int i, ch;
@@ -5288,9 +5299,10 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
max_nch = mlx5e_get_max_num_channels(priv->mdev);
if (priv->channels.params.num_channels > max_nch) {
mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ /* Reducing the number of channels - RXFH has to be reset. */
+ priv->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
priv->channels.params.num_channels = max_nch;
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, max_nch);
+ mlx5e_num_channels_changed(priv);
}
err = profile->init_tx(priv);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From fe867cac9e1967c553e4ac2aece5fc8675258010 Mon Sep 17 00:00:00 2001
From: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Date: Mon, 4 Nov 2019 12:02:14 +0200
Subject: [PATCH] net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.
As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.
Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)mellanox.com>
Reviewed-by: Tariq Toukan <tariqt(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index bc2c96b34de1..4ddccab02a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1043,6 +1043,7 @@ int mlx5e_safe_reopen_channels(struct mlx5e_priv *priv);
int mlx5e_safe_switch_channels(struct mlx5e_priv *priv,
struct mlx5e_channels *new_chs,
mlx5e_fp_preactivate preactivate);
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv);
void mlx5e_activate_priv_channels(struct mlx5e_priv *priv);
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68b520df07e4..ff7f5a931520 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -432,9 +432,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
*cur_params = new_channels.params;
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
+ mlx5e_num_channels_changed(priv);
goto out;
}
@@ -442,12 +440,8 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
if (arfs_enabled)
mlx5e_arfs_disable(priv);
- if (!netif_is_rxfh_configured(priv->netdev))
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, count);
-
/* Switch to new channels, set new parameters and close old ones */
- err = mlx5e_safe_switch_channels(priv, &new_channels, NULL);
+ err = mlx5e_safe_switch_channels(priv, &new_channels, mlx5e_num_channels_changed);
if (arfs_enabled) {
int err2 = mlx5e_arfs_enable(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 152aa5d7df79..bbe8c32fb423 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2880,6 +2880,17 @@ static void mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
netif_set_real_num_rx_queues(netdev, num_rxqs);
}
+int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
+{
+ u16 count = priv->channels.params.num_channels;
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, count);
+
+ return 0;
+}
+
static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
{
int i, ch;
@@ -5288,9 +5299,10 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
max_nch = mlx5e_get_max_num_channels(priv->mdev);
if (priv->channels.params.num_channels > max_nch) {
mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ /* Reducing the number of channels - RXFH has to be reset. */
+ priv->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
priv->channels.params.num_channels = max_nch;
- mlx5e_build_default_indir_rqt(priv->rss_params.indirection_rqt,
- MLX5E_INDIR_RQT_SIZE, max_nch);
+ mlx5e_num_channels_changed(priv);
}
err = profile->init_tx(priv);
On Fri, Apr 17, 2020 at 9:41 PM Toralf Förster <toralf.foerster(a)gmx.de> wrote:
>
> On 4/17/20 8:52 PM, Rafael J. Wysocki wrote:
> > On Fri, Apr 17, 2020 at 6:36 PM Toralf Förster <toralf.foerster(a)gmx.de> wrote:
> >>
> >> On 4/17/20 5:53 PM, Rafael J. Wysocki wrote:
> >>> Does the patch below (untested) make any difference?
> >>>
> >>> ---
> >>> drivers/acpi/ec.c | 5 ++++-
> >>> 1 file changed, 4 insertions(+), 1 deletion(-)
> >>>
> >>> Index: linux-pm/drivers/acpi/ec.c
> >>> ===================================================================
> >>> --- linux-pm.orig/drivers/acpi/ec.c
> >>> +++ linux-pm/drivers/acpi/ec.c
> >>> @@ -2067,7 +2067,10 @@ static struct acpi_driver acpi_ec_driver
> >>> .add = acpi_ec_add,
> >>> .remove = acpi_ec_remove,
> >>> },
> >>> - .drv.pm = &acpi_ec_pm,
> >>> + .drv = {
> >>> + .probe_type = PROBE_FORCE_SYNCHRONOUS,
> >>> + .pm = &acpi_ec_pm,
> >>> + },
> >>> };
> >>>
> >>> static void acpi_ec_destroy_workqueues(void)
> >> I'd say no, but for completeness:
> >
> > OK, it looks like mainline commit
> >
> > 65a691f5f8f0 ("ACPI: EC: Do not clear boot_ec_is_ecdt in acpi_ec_add()")
> >
> > was backported into 5.6.5 by mistake.
> >
> > Can you please revert that patch and retest?
> >
> Yes, reverting that commit solved the issue.
OK, thanks!
Greg, I'm not sure why commit 65a691f5f8f0 from the mainline ended up in 5.6.5.
It has not been marked for -stable or otherwise requested to be
included AFAICS. Also it depends on other mainline commits that have
not been included into 5.6.5.
Can you please drop it?
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ef4a632ccc1c7d3fb71a5baae85b79af08b7f94b Mon Sep 17 00:00:00 2001
From: Murphy Zhou <jencce.kernel(a)gmail.com>
Date: Wed, 18 Mar 2020 20:43:38 +0800
Subject: [PATCH] CIFS: check new file size when extending file by fallocate
xfstests generic/228 checks if fallocate respect RLIMIT_FSIZE.
After fallocate mode 0 extending enabled, we can hit this failure.
Fix this by check the new file size with vfs helper, return
error if file size is larger then RLIMIT_FSIZE(ulimit -f).
This patch has been tested by LTP/xfstests aginst samba and
Windows server.
Acked-by: Ronnie Sahlberg <lsahlber(a)redhat.com>
Signed-off-by: Murphy Zhou <jencce.kernel(a)gmail.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
CC: Stable <stable(a)vger.kernel.org>
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index b0759c8aa6f5..9c9258fc8756 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -3255,6 +3255,10 @@ static long smb3_simple_falloc(struct file *file, struct cifs_tcon *tcon,
* Extending the file
*/
if ((keep_size == false) && i_size_read(inode) < off + len) {
+ rc = inode_newsize_ok(inode, off + len);
+ if (rc)
+ goto out;
+
if ((cifsi->cifsAttrs & FILE_ATTRIBUTE_SPARSE_FILE) == 0)
smb2_set_sparse(xid, tcon, cfile, inode, false);
A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
memory.events") changes the behavior of memcg events, which will
consider subtrees in memory.events. But oom_kill event is a special one
as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
in memory.oom_control. The file memory.oom_control is in both root memcg
and non root memcg, that is different with memory.event as it only in
non-root memcg. That commit is okay for cgroup2, but it is not okay for
cgroup1 as it will cause inconsistent behavior between root memcg and
non-root memcg.
Let's recover the original behavior for cgroup1.
Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Yafang Shao <laoar.shao(a)gmail.com>
---
include/linux/memcontrol.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8c340e6b347f..a0ae080a67d1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -798,7 +798,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
atomic_long_inc(&memcg->memory_events[event]);
cgroup_file_notify(&memcg->events_file);
- if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
+ if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS ||
+ !cgroup_subsys_on_dfl(memory_cgrp_subsys))
break;
} while ((memcg = parent_mem_cgroup(memcg)) &&
!mem_cgroup_is_root(memcg));
--
2.18.2
The patch titled
Subject: mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-watchdog-soft-lockups-during-set_zone_contiguous.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-watchdog-soft-lo…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-watchdog-soft-lo…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected,
e.g., while booting up.
[ 105.608900] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
[ 105.608933] Modules linked in:
[ 105.608933] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4
[ 105.608933] Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
[ 105.608933] RIP: 0010:__pageblock_pfn_to_page+0x134/0x1c0
[ 105.608933] Code: 85 c0 74 71 4a 8b 04 d0 48 85 c0 74 68 48 01 c1 74 63 f6 01 04 74 5e 48 c1 e7 06 4c 8b 05 cc 991
[ 105.608933] RSP: 0000:ffffb6d94000fe60 EFLAGS: 00010286 ORIG_RAX: ffffffffffffff13
[ 105.608933] RAX: fffff81953250000 RBX: 000000000a4c9600 RCX: ffff8fe9ff7c1990
[ 105.608933] RDX: ffff8fe9ff7dab80 RSI: 000000000a4c95ff RDI: 0000000293250000
[ 105.608933] RBP: ffff8fe9ff7dab80 R08: fffff816c0000000 R09: 0000000000000008
[ 105.608933] R10: 0000000000000014 R11: 0000000000000014 R12: 0000000000000000
[ 105.608933] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 105.608933] FS: 0000000000000000(0000) GS:ffff8fe1ff400000(0000) knlGS:0000000000000000
[ 105.608933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 105.608933] CR2: 000000000f613000 CR3: 00000088cf20a000 CR4: 00000000000006f0
[ 105.608933] Call Trace:
[ 105.608933] set_zone_contiguous+0x56/0x70
[ 105.608933] page_alloc_init_late+0x166/0x176
[ 105.608933] kernel_init_freeable+0xfa/0x255
[ 105.608933] ? rest_init+0xaa/0xaa
[ 105.608933] kernel_init+0xa/0x106
[ 105.608933] ret_from_fork+0x35/0x40
The issue becomes visible when having a lot of memory (e.g., 4TB) assigned
to a single NUMA node - a system that can easily be created using QEMU.
Inside VMs on a hypervisor with quite some memory overcommit, this is
fairly easy to trigger.
Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux(a)gmail.com>
Reviewed-by: Baoquan He <bhe(a)redhat.com>
Reviewed-by: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Alexander Duyck <alexander.duyck(a)gmail.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/page_alloc.c~mm-page_alloc-fix-watchdog-soft-lockups-during-set_zone_contiguous
+++ a/mm/page_alloc.c
@@ -1607,6 +1607,7 @@ void set_zone_contiguous(struct zone *zo
if (!__pageblock_pfn_to_page(block_start_pfn,
block_end_pfn, zone))
return;
+ cond_resched();
}
/* We confirm that there is no hole */
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-page_alloc-fix-watchdog-soft-lockups-during-set_zone_contiguous.patch
drivers-base-memoryc-cache-memory-blocks-in-xarray-to-accelerate-lookup-fix.patch
powerpc-pseries-hotplug-memory-stop-checking-is_mem_section_removable.patch
mm-memory_hotplug-remove-is_mem_section_removable.patch
On Fri, 13 Mar 2020 at 06:41, Sowjanya Komatineni
<skomatineni(a)nvidia.com> wrote:
>
> Tegra host supports HW busy detection and timeouts based on the
> count programmed in SDHCI_TIMEOUT_CONTROL register and max busy
> timeout it supports is 11s in finite busy wait mode.
>
> Some operations like SLEEP_AWAKE, ERASE and flush cache through
> SWITCH commands take longer than 11s and Tegra host supports
> infinite HW busy wait mode where HW waits forever till the card
> is busy without HW timeout.
>
> This patch implements Tegra specific set_timeout sdhci_ops to allow
> switching between finite and infinite HW busy detection wait modes
> based on the device command expected operation time.
>
> Signed-off-by: Sowjanya Komatineni <skomatineni(a)nvidia.com>
> ---
> drivers/mmc/host/sdhci-tegra.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index a25c3a4..fa8f6a4 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -45,6 +45,7 @@
> #define SDHCI_TEGRA_CAP_OVERRIDES_DQS_TRIM_SHIFT 8
>
> #define SDHCI_TEGRA_VENDOR_MISC_CTRL 0x120
> +#define SDHCI_MISC_CTRL_ERASE_TIMEOUT_LIMIT BIT(0)
> #define SDHCI_MISC_CTRL_ENABLE_SDR104 0x8
> #define SDHCI_MISC_CTRL_ENABLE_SDR50 0x10
> #define SDHCI_MISC_CTRL_ENABLE_SDHCI_SPEC_300 0x20
> @@ -1227,6 +1228,34 @@ static u32 sdhci_tegra_cqhci_irq(struct sdhci_host *host, u32 intmask)
> return 0;
> }
>
> +static void tegra_sdhci_set_timeout(struct sdhci_host *host,
> + struct mmc_command *cmd)
> +{
> + u32 val;
> +
> + /*
> + * HW busy detection timeout is based on programmed data timeout
> + * counter and maximum supported timeout is 11s which may not be
> + * enough for long operations like cache flush, sleep awake, erase.
> + *
> + * ERASE_TIMEOUT_LIMIT bit of VENDOR_MISC_CTRL register allows
> + * host controller to wait for busy state until the card is busy
> + * without HW timeout.
> + *
> + * So, use infinite busy wait mode for operations that may take
> + * more than maximum HW busy timeout of 11s otherwise use finite
> + * busy wait mode.
> + */
> + val = sdhci_readl(host, SDHCI_TEGRA_VENDOR_MISC_CTRL);
> + if (cmd && cmd->busy_timeout >= 11 * HZ)
> + val |= SDHCI_MISC_CTRL_ERASE_TIMEOUT_LIMIT;
> + else
> + val &= ~SDHCI_MISC_CTRL_ERASE_TIMEOUT_LIMIT;
> + sdhci_writel(host, val, SDHCI_TEGRA_VENDOR_MISC_CTRL);
> +
> + __sdhci_set_timeout(host, cmd);
kernel build on arm and arm64 architecture failed on stable-rc 4.19
(arm), 5.4 (arm64) and 5.5 (arm64)
drivers/mmc/host/sdhci-tegra.c: In function 'tegra_sdhci_set_timeout':
drivers/mmc/host/sdhci-tegra.c:1256:2: error: implicit declaration of
function '__sdhci_set_timeout'; did you mean
'tegra_sdhci_set_timeout'? [-Werror=implicit-function-declaration]
__sdhci_set_timeout(host, cmd);
^~~~~~~~~~~~~~~~~~~
tegra_sdhci_set_timeout
Full build log,
https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.5/D…https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.4/D…https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-4.19/…
- Naresh
Hi Linus,
Linux 5.7-rc1 reboot/powereoff hangs on AMD Ryzen 5 PRO 2400GE
system.
I isolated the commit to:
Revering the following commit fixes the problem.
commit 487eca11a321ef33bcf4ca5adb3c0c4954db1b58
Author: Prike Liang <Prike.Liang(a)amd.com>
Date: Tue Apr 7 20:21:26 2020 +0800
drm/amdgpu: fix gfx hang during suspend with video playback (v2)
The system will be hang up during S3 suspend because of SMU is
pending for GC not respose the register CP_HQD_ACTIVE access
request.This issue root cause of accessing the GC register under
enter GFX CGGPG and can be fixed by disable GFX CGPG before perform
suspend.
v2: Use disable the GFX CGPG instead of RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang(a)amd.com>
Reviewed-by: Huang Rui <ray.huang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
I did the bisect on Linux 5.6.5-rc1
git bisect start
# good: [0a27a29496060843ae3a8fe78aaec0062cbd5dfa] Linux 5.6.4
git bisect good 0a27a29496060843ae3a8fe78aaec0062cbd5dfa
# bad: [576aa353744ce5f1279071363e4a55e97f486f39] Linux 5.6.5-rc1
git bisect bad 576aa353744ce5f1279071363e4a55e97f486f39
# good: [7509db5d111a5763a199902052eecc480e0ec724] x86/tsc_msr: Make MSR
derived TSC frequency more accurate
git bisect good 7509db5d111a5763a199902052eecc480e0ec724
# good: [15f1ead7d7966d087352ba2cf81a1759b25ad163] scsi: lpfc: Fix
broken Credit Recovery after driver load
git bisect good 15f1ead7d7966d087352ba2cf81a1759b25ad163
# good: [9e52b4ab5fadd803c8c2e617aa8c151720757fb1] s390/diag: fix
display of diagnose call statistics
git bisect good 9e52b4ab5fadd803c8c2e617aa8c151720757fb1
# good: [3e1e6903924fd6c95db7e46c5baa41dfa0f46fdb]
powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap
PTE entries
git bisect good 3e1e6903924fd6c95db7e46c5baa41dfa0f46fdb
# good: [0344e0fee2f904ebce1d58bec8ace3ee9cf5f777] drm/dp_mst: Fix
clearing payload state on topology disable
git bisect good 0344e0fee2f904ebce1d58bec8ace3ee9cf5f777
# bad: [136881c0420bb52d5f21f75688f5aee1cf401737] perf/core: Unify
{pinned,flexible}_sched_in()
git bisect bad 136881c0420bb52d5f21f75688f5aee1cf401737
# bad: [0d928b424b99cfe0e7806c530f6039a053d5082d] drm/i915/ggtt: do not
set bits 1-11 in gen12 ptes
git bisect bad 0d928b424b99cfe0e7806c530f6039a053d5082d
# bad: [a655a99c9f6a1d819d695fca6d48b450449f45ee] drm/amdgpu: fix gfx
hang during suspend with video playback (v2)
git bisect bad a655a99c9f6a1d819d695fca6d48b450449f45ee
# first bad commit: [a655a99c9f6a1d819d695fca6d48b450449f45ee]
drm/amdgpu: fix gfx hang during suspend with video playback (v2)
thanks,
-- Shuah
From: Masahiro Yamada <masahiroy(a)kernel.org>
[ Upstream commit 63b903dfebdea92aa92ad337d8451a6fbfeabf9d ]
As far as I understood from the Kconfig help text, this build rule is
used to rebuild the driver firmware, which runs on an old m68k-based
chip. So, you need m68k tools for the firmware rebuild.
wanxl.c is a PCI driver, but CONFIG_M68K does not select CONFIG_HAVE_PCI.
So, you cannot enable CONFIG_WANXL_BUILD_FIRMWARE for ARCH=m68k. In other
words, ifeq ($(ARCH),m68k) is false here.
I am keeping the dead code for now, but rebuilding the firmware requires
'as68k' and 'ld68k', which I do not have in hand.
Instead, the kernel.org m68k GCC [1] successfully built it.
Allowing a user to pass in CROSS_COMPILE_M68K= is handier.
[1] https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/…
Suggested-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
Signed-off-by: Masahiro Yamada <masahiroy(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/wan/Kconfig | 2 +-
drivers/net/wan/Makefile | 12 ++++++------
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig
index dd1a147f29716..058d77d2e693d 100644
--- a/drivers/net/wan/Kconfig
+++ b/drivers/net/wan/Kconfig
@@ -200,7 +200,7 @@ config WANXL_BUILD_FIRMWARE
depends on WANXL && !PREVENT_FIRMWARE_BUILD
help
Allows you to rebuild firmware run by the QUICC processor.
- It requires as68k, ld68k and hexdump programs.
+ It requires m68k toolchains and hexdump programs.
You should never need this option, say N.
diff --git a/drivers/net/wan/Makefile b/drivers/net/wan/Makefile
index 701f5d2fe3b61..995277c657a1e 100644
--- a/drivers/net/wan/Makefile
+++ b/drivers/net/wan/Makefile
@@ -40,17 +40,17 @@ $(obj)/wanxl.o: $(obj)/wanxlfw.inc
ifeq ($(CONFIG_WANXL_BUILD_FIRMWARE),y)
ifeq ($(ARCH),m68k)
- AS68K = $(AS)
- LD68K = $(LD)
+ M68KAS = $(AS)
+ M68KLD = $(LD)
else
- AS68K = as68k
- LD68K = ld68k
+ M68KAS = $(CROSS_COMPILE_M68K)as
+ M68KLD = $(CROSS_COMPILE_M68K)ld
endif
quiet_cmd_build_wanxlfw = BLD FW $@
cmd_build_wanxlfw = \
- $(CPP) -D__ASSEMBLY__ -Wp,-MD,$(depfile) -I$(srctree)/include/uapi $< | $(AS68K) -m68360 -o $(obj)/wanxlfw.o; \
- $(LD68K) --oformat binary -Ttext 0x1000 $(obj)/wanxlfw.o -o $(obj)/wanxlfw.bin; \
+ $(CPP) -D__ASSEMBLY__ -Wp,-MD,$(depfile) -I$(srctree)/include/uapi $< | $(M68KAS) -m68360 -o $(obj)/wanxlfw.o; \
+ $(M68KLD) --oformat binary -Ttext 0x1000 $(obj)/wanxlfw.o -o $(obj)/wanxlfw.bin; \
hexdump -ve '"\n" 16/1 "0x%02X,"' $(obj)/wanxlfw.bin | sed 's/0x ,//g;1s/^/static const u8 firmware[]={/;$$s/,$$/\n};\n/' >$(obj)/wanxlfw.inc; \
rm -f $(obj)/wanxlfw.bin $(obj)/wanxlfw.o
--
2.20.1
The default resource group ("rdtgroup_default") is associated with the
root of the resctrl filesystem and should never be removed. New resource
groups can be created as subdirectories of the resctrl filesystem and
they can be removed from user space. There exists a safeguard in the
directory removal code (rdtgroup_rmdir()) that ensures that only
subdirectories can be removed by testing that the directory to be
removed has to be a child of the root directory.
A possible deadlock was recently fixed with commit 334b0f4e9b1b
("x86/resctrl: Fix a deadlock due to inaccurate reference"). This fix
involved associating the private data of the "mon_groups" and "mon_data"
directories to the resource group to which they belong instead of NULL
as before. A consequence of this change was that the original safeguard
code preventing removal of "mon_groups" and "mon_data" found in the root
directory failed resulting in attempts to remove the default resource
group that ends in a BUG:
kernel BUG at mm/slub.c:3969!
invalid opcode: 0000 [#1] SMP PTI
Call Trace:
rdtgroup_rmdir+0x16b/0x2c0
kernfs_iop_rmdir+0x5c/0x90
vfs_rmdir+0x7a/0x160
do_rmdir+0x17d/0x1e0
do_syscall_64+0x55/0x1d0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by improving the directory removal safeguard to ensure that
subdirectories of the resctrl root directory can only be removed if
they are a child of the resctrl filesystem's root _and_ not associated
with the default resource group.
Fixes: 334b0f4e9b1b ("x86/resctrl: Fix a deadlock due to inaccurate reference")
Cc: stable(a)vger.kernel.org
Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya(a)intel.com>
Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya(a)intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre(a)intel.com>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 064e9ef44cd6..9d4e73a9b5a9 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3072,7 +3072,8 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)
* If the rdtgroup is a mon group and parent directory
* is a valid "mon_groups" directory, remove the mon group.
*/
- if (rdtgrp->type == RDTCTRL_GROUP && parent_kn == rdtgroup_default.kn) {
+ if (rdtgrp->type == RDTCTRL_GROUP && parent_kn == rdtgroup_default.kn &&
+ rdtgrp != &rdtgroup_default) {
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP ||
rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
ret = rdtgroup_ctrl_remove(kn, rdtgrp);
--
2.21.0
Hi,
My name is felix. I have a quick and easy business offer that
will benefit us both immensely. The amount involved is over 9.4
Milli0n in US d0llars.
I await your immediate reply so that I can give full details.
Rgds
Felix
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 3688b0db5c331f4ec3fa5eb9f670a4b04f530700
Gitweb: https://git.kernel.org/tip/3688b0db5c331f4ec3fa5eb9f670a4b04f530700
Author: Grygorii Strashko <grygorii.strashko(a)ti.com>
AuthorDate: Wed, 08 Apr 2020 22:15:32 +03:00
Committer: Marc Zyngier <maz(a)kernel.org>
CommitterDate: Fri, 17 Apr 2020 08:59:28 +01:00
irqchip/ti-sci-inta: Fix processing of masked irqs
The ti_sci_inta_irq_handler() does not take into account INTA IRQs state
(masked/unmasked) as it uses INTA_STATUS_CLEAR_j register to get INTA IRQs
status, which provides raw status value.
This causes hard IRQ handlers to be called or threaded handlers to be
scheduled many times even if corresponding INTA IRQ is masked.
Above, first of all, affects the LEVEL interrupts processing and causes
unexpected behavior up the system stack or crash.
Fix it by using the Interrupt Masked Status INTA_STATUSM_j register which
provides masked INTA IRQs status.
Fixes: 9f1463b86c13 ("irqchip/ti-sci-inta: Add support for Interrupt Aggregator driver")
Signed-off-by: Grygorii Strashko <grygorii.strashko(a)ti.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Lokesh Vutla <lokeshvutla(a)ti.com>
Link: https://lore.kernel.org/r/20200408191532.31252-1-grygorii.strashko@ti.com
Cc: stable(a)vger.kernel.org
---
drivers/irqchip/irq-ti-sci-inta.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-ti-sci-inta.c b/drivers/irqchip/irq-ti-sci-inta.c
index 8f6e6b0..7e3ebf6 100644
--- a/drivers/irqchip/irq-ti-sci-inta.c
+++ b/drivers/irqchip/irq-ti-sci-inta.c
@@ -37,6 +37,7 @@
#define VINT_ENABLE_SET_OFFSET 0x0
#define VINT_ENABLE_CLR_OFFSET 0x8
#define VINT_STATUS_OFFSET 0x18
+#define VINT_STATUS_MASKED_OFFSET 0x20
/**
* struct ti_sci_inta_event_desc - Description of an event coming to
@@ -116,7 +117,7 @@ static void ti_sci_inta_irq_handler(struct irq_desc *desc)
chained_irq_enter(irq_desc_get_chip(desc), desc);
val = readq_relaxed(inta->base + vint_desc->vint_id * 0x1000 +
- VINT_STATUS_OFFSET);
+ VINT_STATUS_MASKED_OFFSET);
for_each_set_bit(bit, &val, MAX_EVENTS_PER_VINT) {
virq = irq_find_mapping(domain, vint_desc->events[bit].hwirq);
The function posix_acl_create() applies the umask only if the inode
has no ACL (= NULL) or if ACLs are not supported by the filesystem
driver (= -EOPNOTSUPP).
However, this happens only after after the IS_POSIXACL() check
succeeded. If the superblock doesn't enable ACL support, umask will
never be applied. A filesystem which has no ACL support will of
course not enable SB_POSIXACL, rendering the umask-applying code path
unreachable.
This fixes a bug which causes the umask to be ignored with O_TMPFILE
on tmpfs:
https://github.com/MusicPlayerDaemon/MPD/issues/558https://bugs.gentoo.org/show_bug.cgi?id=686142#c3https://bugzilla.kernel.org/show_bug.cgi?id=203625
Signed-off-by: Max Kellermann <mk(a)cm4all.com>
Reviewed-by: J. Bruce Fields <bfields(a)redhat.com>
Cc: stable(a)vger.kernel.org
---
fs/posix_acl.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 249672bf54fe..e5e7a2295b99 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -589,9 +589,14 @@ posix_acl_create(struct inode *dir, umode_t *mode,
*acl = NULL;
*default_acl = NULL;
- if (S_ISLNK(*mode) || !IS_POSIXACL(dir))
+ if (S_ISLNK(*mode))
return 0;
+ if (!IS_POSIXACL(dir)) {
+ *mode &= ~current_umask();
+ return 0;
+ }
+
p = get_acl(dir, ACL_TYPE_DEFAULT);
if (!p || p == ERR_PTR(-EOPNOTSUPP)) {
*mode &= ~current_umask();
--
2.20.1
A request may not be completed because not all the TRBs are prepared for
it. This happens when we run out of available TRBs. When some TRBs are
completed, the driver needs to prepare the rest of the TRBs for the
request. The check dwc3_gadget_ep_request_completed() shouldn't be
checking the amount of data received but rather the number of pending
TRBs. Revise this request completion check.
Cc: stable(a)vger.kernel.org
Fixes: e0c42ce590fe ("usb: dwc3: gadget: simplify IOC handling")
Signed-off-by: Thinh Nguyen <thinhn(a)synopsys.com>
---
Changes in v2:
- Add Cc: stable tag
drivers/usb/dwc3/gadget.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 1a4fc03742aa..c45853b14cff 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2550,14 +2550,7 @@ static int dwc3_gadget_ep_reclaim_trb_linear(struct dwc3_ep *dep,
static bool dwc3_gadget_ep_request_completed(struct dwc3_request *req)
{
- /*
- * For OUT direction, host may send less than the setup
- * length. Return true for all OUT requests.
- */
- if (!req->direction)
- return true;
-
- return req->request.actual == req->request.length;
+ return req->num_pending_sgs == 0;
}
static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
@@ -2581,8 +2574,7 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
req->request.actual = req->request.length - req->remaining;
- if (!dwc3_gadget_ep_request_completed(req) ||
- req->num_pending_sgs) {
+ if (!dwc3_gadget_ep_request_completed(req)) {
__dwc3_gadget_kick_transfer(dep);
goto out;
}
--
2.11.0
The patch titled
Subject: tools/vm: fix cross-compile build
has been added to the -mm tree. Its filename is
tools-vm-fix-cross-compile-build.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/tools-vm-fix-cross-compile-build.p…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/tools-vm-fix-cross-compile-build.p…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Lucas Stach <l.stach(a)pengutronix.de>
Subject: tools/vm: fix cross-compile build
7ed1c1901fe5 (tools: fix cross-compile var clobbering) moved the setup of
the CC variable to tools/scripts/Makefile.include to make the behavior
consistent across all the tools Makefiles. As the vm tools missed the
include we end up with the wrong CC in a cross-compiling evironment.
Link: http://lkml.kernel.org/r/20200416104748.25243-1-l.stach@pengutronix.de
Fixes: 7ed1c1901fe5 (tools: fix cross-compile var clobbering)
Signed-off-by: Lucas Stach <l.stach(a)pengutronix.de>
Cc: Martin Kelly <martin(a)martingkelly.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/vm/Makefile | 2 ++
1 file changed, 2 insertions(+)
--- a/tools/vm/Makefile~tools-vm-fix-cross-compile-build
+++ a/tools/vm/Makefile
@@ -1,6 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
# Makefile for vm tools
#
+include ../scripts/Makefile.include
+
TARGETS=page-types slabinfo page_owner_sort
LIB_DIR = ../lib/api
_
Patches currently in -mm which might be from l.stach(a)pengutronix.de are
tools-vm-fix-cross-compile-build.patch
The patch titled
Subject: coredump: fix null pointer dereference on coredump
has been added to the -mm tree. Its filename is
coredump-fix-null-pointer-dereference-on-coredump.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/coredump-fix-null-pointer-derefere…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/coredump-fix-null-pointer-derefere…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Sudip Mukherjee <sudipm.mukherjee(a)gmail.com>
Subject: coredump: fix null pointer dereference on coredump
If the core_pattern is set to "|" and any process segfaults then we get
a null pointer derefernce while trying to coredump. The call stack shows:
[ 108.212680] RIP: 0010:do_coredump+0x628/0x11c0
When the core_pattern has only "|" there is no use of trying the coredump
and we can check that while formating the corename and exit with an error.
After this change I get:
[ 48.453756] format_corename failed
[ 48.453758] Aborting core
Link: http://lkml.kernel.org/r/20200416194612.21418-1-sudipm.mukherjee@gmail.com
Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template")
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee(a)gmail.com>
Reported-by: Matthew Ruffell <matthew.ruffell(a)canonical.com>
Cc: Paul Wise <pabs3(a)bonedaddy.net>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Neil Horman <nhorman(a)tuxdriver.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/coredump.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/coredump.c~coredump-fix-null-pointer-dereference-on-coredump
+++ a/fs/coredump.c
@@ -211,6 +211,8 @@ static int format_corename(struct core_n
return -ENOMEM;
(*argv)[(*argc)++] = 0;
++pat_ptr;
+ if (!(*pat_ptr))
+ return -ENOMEM;
}
/* Repeat as long as we have more pattern to process and more output
_
Patches currently in -mm which might be from sudipm.mukherjee(a)gmail.com are
coredump-fix-null-pointer-dereference-on-coredump.patch
Commit 871f1f2bcb01 ("platform/x86: intel_int0002_vgpio: Only implement
irq_set_wake on Bay Trail") stopped passing irq_set_wake requests on to
the parents IRQ because this was breaking suspend (causing immediate
wakeups) on an Asus E202SA.
This workaround for this issue is mostly fine, on most Cherry Trail
devices where we need the INT0002 device for wakeups by e.g. USB kbds,
the parent IRQ is shared with the ACPI SCI and that is marked as wakeup
anyways.
But not on all devices, specifically on a Medion Akoya E1239T there is
no SCI at all, and because the irq_set_wake request is not passed on to
the parent IRQ, wake up by the builtin USB kbd does not work here.
So the workaround for the Asus E202SA immediate wake problem is causing
problems elsewhere; and in hindsight it is not the correct fix,
the Asus E202SA uses Airmont CPU cores, but this does not mean it is a
Cherry Trail based device, Brasswell uses Airmont CPU cores too and this
actually is a Braswell device.
Most (all?) Braswell devices use classic S3 mode suspend rather then
s2idle suspend and in this case directly dealing with PME events as
the INT0002 driver does likely is not the best idea, so that this is
causing issues is not surprising.
Replace the workaround of not passing irq_set_wake requests on to the
parents IRQ, by not binding to the INT0002 device when s2idle is not used.
This fixes USB kbd wakeups not working on some Cherry Trail devices,
while still avoiding mucking with the wakeup flags on the Asus E202SA
(and other Brasswell devices).
Cc: Maxim Mikityanskiy <maxtram95(a)gmail.com>
Cc: 5.3+ <stable(a)vger.kernel.org> # 5.3+
Fixes: 871f1f2bcb01 ("platform/x86: intel_int0002_vgpio: Only implement irq_set_wake on Bay Trail")
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/platform/x86/intel_int0002_vgpio.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/platform/x86/intel_int0002_vgpio.c b/drivers/platform/x86/intel_int0002_vgpio.c
index 55f088f535e2..e8bec72d3823 100644
--- a/drivers/platform/x86/intel_int0002_vgpio.c
+++ b/drivers/platform/x86/intel_int0002_vgpio.c
@@ -143,21 +143,9 @@ static struct irq_chip int0002_byt_irqchip = {
.irq_set_wake = int0002_irq_set_wake,
};
-static struct irq_chip int0002_cht_irqchip = {
- .name = DRV_NAME,
- .irq_ack = int0002_irq_ack,
- .irq_mask = int0002_irq_mask,
- .irq_unmask = int0002_irq_unmask,
- /*
- * No set_wake, on CHT the IRQ is typically shared with the ACPI SCI
- * and we don't want to mess with the ACPI SCI irq settings.
- */
- .flags = IRQCHIP_SKIP_SET_WAKE,
-};
-
static const struct x86_cpu_id int0002_cpu_ids[] = {
INTEL_CPU_FAM6(ATOM_SILVERMONT, int0002_byt_irqchip), /* Valleyview, Bay Trail */
- INTEL_CPU_FAM6(ATOM_AIRMONT, int0002_cht_irqchip), /* Braswell, Cherry Trail */
+ INTEL_CPU_FAM6(ATOM_AIRMONT, int0002_byt_irqchip), /* Braswell, Cherry Trail */
{}
};
@@ -181,6 +169,10 @@ static int int0002_probe(struct platform_device *pdev)
if (!cpu_id)
return -ENODEV;
+ /* We only need to directly deal with PMEs when using s2idle */
+ if (!pm_suspend_default_s2idle())
+ return -ENODEV;
+
irq = platform_get_irq(pdev, 0);
if (irq < 0)
return irq;
--
2.26.0
If the core_pattern is set to "|" and any process segfaults then we get
a null pointer derefernce while trying to coredump. The call stack shows:
[ 108.212680] RIP: 0010:do_coredump+0x628/0x11c0
When the core_pattern has only "|" there is no use of trying the
coredump and we can check that while formating the corename and exit
with an error.
After this change I get:
[ 48.453756] format_corename failed
[ 48.453758] Aborting core
Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template")
Reported-by: Matthew Ruffell <matthew.ruffell(a)canonical.com>
Cc: Paul Wise <pabs3(a)bonedaddy.net>
Cc: stable(a)vger.kernel.org
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee(a)gmail.com>
---
fs/coredump.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/coredump.c b/fs/coredump.c
index f8296a82d01d..408418e6aa13 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -211,6 +211,8 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
return -ENOMEM;
(*argv)[(*argc)++] = 0;
++pat_ptr;
+ if (!(*pat_ptr))
+ return -ENOMEM;
}
/* Repeat as long as we have more pattern to process and more output
--
2.11.0
From: Jason Gerecke <killertofu(a)gmail.com>
This reverts commit 15893fa40109f5e7c67eeb8da62267d0fdf0be9d.
The referenced commit broke pen and touch input for a variety of devices
such as the Cintiq Pro 32. Affected devices may appear to work normally
for a short amount of time, but eventually loose track of actual touch
state and can leave touch arbitration enabled which prevents the pen
from working. The commit is not itself required for any currently-available
Bluetooth device, and so we revert it to correct the behavior of broken
devices.
This breakage occurs due to a mismatch between the order of collections
and the order of usages on some devices. This commit tries to read the
contact count before processing events, but will fail if the contact
count does not occur prior to the first logical finger collection. This
is the case for devices like the Cintiq Pro 32 which place the contact
count at the very end of the report.
Without the contact count set, touches will only be partially processed.
The `wacom_wac_finger_slot` function will not open any slots since the
number of contacts seen is greater than the expectation of 0, but we will
still end up calling `input_mt_sync_frame` for each finger anyway. This
can cause problems for userspace separate from the issue currently taking
place in the kernel. Only once all of the individual finger collections
have been processed do we finally get to the enclosing collection which
contains the contact count. The value ends up being used for the *next*
report, however.
This delayed use of the contact count can cause the driver to loose track
of the actual touch state and believe that there are contacts down when
there aren't. This leaves touch arbitration enabled and prevents the pen
from working. It can also cause userspace to incorrectly treat single-
finger input as gestures.
Link: https://github.com/linuxwacom/input-wacom/issues/146
Signed-off-by: Jason Gerecke <jason.gerecke(a)wacom.com>
Reviewed-by: Aaron Armstrong Skomra <aaron.skomra(a)wacom.com>
Fixes: 15893fa40109 ("HID: wacom: generic: read the number of expected touches on a per collection basis")
Cc: stable(a)vger.kernel.org # 5.3+
---
drivers/hid/wacom_wac.c | 79 +++++++++--------------------------------
1 file changed, 16 insertions(+), 63 deletions(-)
diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
index d99a9d407671..96d00eba99c0 100644
--- a/drivers/hid/wacom_wac.c
+++ b/drivers/hid/wacom_wac.c
@@ -2637,9 +2637,25 @@ static void wacom_wac_finger_pre_report(struct hid_device *hdev,
case HID_DG_TIPSWITCH:
hid_data->last_slot_field = equivalent_usage;
break;
+ case HID_DG_CONTACTCOUNT:
+ hid_data->cc_report = report->id;
+ hid_data->cc_index = i;
+ hid_data->cc_value_index = j;
+ break;
}
}
}
+
+ if (hid_data->cc_report != 0 &&
+ hid_data->cc_index >= 0) {
+ struct hid_field *field = report->field[hid_data->cc_index];
+ int value = field->value[hid_data->cc_value_index];
+ if (value)
+ hid_data->num_expected = value;
+ }
+ else {
+ hid_data->num_expected = wacom_wac->features.touch_max;
+ }
}
static void wacom_wac_finger_report(struct hid_device *hdev,
@@ -2649,7 +2665,6 @@ static void wacom_wac_finger_report(struct hid_device *hdev,
struct wacom_wac *wacom_wac = &wacom->wacom_wac;
struct input_dev *input = wacom_wac->touch_input;
unsigned touch_max = wacom_wac->features.touch_max;
- struct hid_data *hid_data = &wacom_wac->hid_data;
/* If more packets of data are expected, give us a chance to
* process them rather than immediately syncing a partial
@@ -2663,7 +2678,6 @@ static void wacom_wac_finger_report(struct hid_device *hdev,
input_sync(input);
wacom_wac->hid_data.num_received = 0;
- hid_data->num_expected = 0;
/* keep touch state for pen event */
wacom_wac->shared->touch_down = wacom_wac_finger_count_touches(wacom_wac);
@@ -2738,73 +2752,12 @@ static void wacom_report_events(struct hid_device *hdev,
}
}
-static void wacom_set_num_expected(struct hid_device *hdev,
- struct hid_report *report,
- int collection_index,
- struct hid_field *field,
- int field_index)
-{
- struct wacom *wacom = hid_get_drvdata(hdev);
- struct wacom_wac *wacom_wac = &wacom->wacom_wac;
- struct hid_data *hid_data = &wacom_wac->hid_data;
- unsigned int original_collection_level =
- hdev->collection[collection_index].level;
- bool end_collection = false;
- int i;
-
- if (hid_data->num_expected)
- return;
-
- // find the contact count value for this segment
- for (i = field_index; i < report->maxfield && !end_collection; i++) {
- struct hid_field *field = report->field[i];
- unsigned int field_level =
- hdev->collection[field->usage[0].collection_index].level;
- unsigned int j;
-
- if (field_level != original_collection_level)
- continue;
-
- for (j = 0; j < field->maxusage; j++) {
- struct hid_usage *usage = &field->usage[j];
-
- if (usage->collection_index != collection_index) {
- end_collection = true;
- break;
- }
- if (wacom_equivalent_usage(usage->hid) == HID_DG_CONTACTCOUNT) {
- hid_data->cc_report = report->id;
- hid_data->cc_index = i;
- hid_data->cc_value_index = j;
-
- if (hid_data->cc_report != 0 &&
- hid_data->cc_index >= 0) {
-
- struct hid_field *field =
- report->field[hid_data->cc_index];
- int value =
- field->value[hid_data->cc_value_index];
-
- if (value)
- hid_data->num_expected = value;
- }
- }
- }
- }
-
- if (hid_data->cc_report == 0 || hid_data->cc_index < 0)
- hid_data->num_expected = wacom_wac->features.touch_max;
-}
-
static int wacom_wac_collection(struct hid_device *hdev, struct hid_report *report,
int collection_index, struct hid_field *field,
int field_index)
{
struct wacom *wacom = hid_get_drvdata(hdev);
- if (WACOM_FINGER_FIELD(field))
- wacom_set_num_expected(hdev, report, collection_index, field,
- field_index);
wacom_report_events(hdev, report, collection_index, field_index);
/*
--
2.26.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4b5142905d4ff58a4b93f7c8eaa7ba829c0a53c9 Mon Sep 17 00:00:00 2001
From: Nikos Tsironis <ntsironis(a)arrikto.com>
Date: Fri, 27 Mar 2020 16:01:08 +0200
Subject: [PATCH] dm clone: Fix handling of partial region discards
There is a bug in the way dm-clone handles discards, which can lead to
discarding the wrong blocks or trying to discard blocks beyond the end
of the device.
This could lead to data corruption, if the destination device indeed
discards the underlying blocks, i.e., if the discard operation results
in the original contents of a block to be lost.
The root of the problem is the code that calculates the range of regions
covered by a discard request and decides which regions to discard.
Since dm-clone handles the device in units of regions, we don't discard
parts of a region, only whole regions.
The range is calculated as:
rs = dm_sector_div_up(bio->bi_iter.bi_sector, clone->region_size);
re = bio_end_sector(bio) >> clone->region_shift;
, where 'rs' is the first region to discard and (re - rs) is the number
of regions to discard.
The bug manifests when we try to discard part of a single region, i.e.,
when we try to discard a block with size < region_size, and the discard
request both starts at an offset with respect to the beginning of that
region and ends before the end of the region.
The root cause is the following comparison:
if (rs == re)
// skip discard and complete original bio immediately
, which doesn't take into account that 'rs' might be greater than 're'.
Thus, we then issue a discard request for the wrong blocks, instead of
skipping the discard all together.
Fix the check to also take into account the above case, so we don't end
up discarding the wrong blocks.
Also, add some range checks to dm_clone_set_region_hydrated() and
dm_clone_cond_set_range(), which update dm-clone's region bitmap.
Note that the aforementioned bug doesn't cause invalid memory accesses,
because dm_clone_is_range_hydrated() returns True for this case, so the
checks are just precautionary.
Fixes: 7431b7835f55 ("dm: add clone target")
Cc: stable(a)vger.kernel.org # v5.4+
Signed-off-by: Nikos Tsironis <ntsironis(a)arrikto.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-clone-metadata.c b/drivers/md/dm-clone-metadata.c
index c05b12110456..199e7af00858 100644
--- a/drivers/md/dm-clone-metadata.c
+++ b/drivers/md/dm-clone-metadata.c
@@ -850,6 +850,12 @@ int dm_clone_set_region_hydrated(struct dm_clone_metadata *cmd, unsigned long re
struct dirty_map *dmap;
unsigned long word, flags;
+ if (unlikely(region_nr >= cmd->nr_regions)) {
+ DMERR("Region %lu out of range (total number of regions %lu)",
+ region_nr, cmd->nr_regions);
+ return -ERANGE;
+ }
+
word = region_nr / BITS_PER_LONG;
spin_lock_irqsave(&cmd->bitmap_lock, flags);
@@ -879,6 +885,13 @@ int dm_clone_cond_set_range(struct dm_clone_metadata *cmd, unsigned long start,
struct dirty_map *dmap;
unsigned long word, region_nr;
+ if (unlikely(start >= cmd->nr_regions || (start + nr_regions) < start ||
+ (start + nr_regions) > cmd->nr_regions)) {
+ DMERR("Invalid region range: start %lu, nr_regions %lu (total number of regions %lu)",
+ start, nr_regions, cmd->nr_regions);
+ return -ERANGE;
+ }
+
spin_lock_irq(&cmd->bitmap_lock);
if (cmd->read_only) {
diff --git a/drivers/md/dm-clone-target.c b/drivers/md/dm-clone-target.c
index d1e1b5b56b1b..022dddcad647 100644
--- a/drivers/md/dm-clone-target.c
+++ b/drivers/md/dm-clone-target.c
@@ -293,10 +293,17 @@ static inline unsigned long bio_to_region(struct clone *clone, struct bio *bio)
/* Get the region range covered by the bio */
static void bio_region_range(struct clone *clone, struct bio *bio,
- unsigned long *rs, unsigned long *re)
+ unsigned long *rs, unsigned long *nr_regions)
{
+ unsigned long end;
+
*rs = dm_sector_div_up(bio->bi_iter.bi_sector, clone->region_size);
- *re = bio_end_sector(bio) >> clone->region_shift;
+ end = bio_end_sector(bio) >> clone->region_shift;
+
+ if (*rs >= end)
+ *nr_regions = 0;
+ else
+ *nr_regions = end - *rs;
}
/* Check whether a bio overwrites a region */
@@ -454,7 +461,7 @@ static void trim_bio(struct bio *bio, sector_t sector, unsigned int len)
static void complete_discard_bio(struct clone *clone, struct bio *bio, bool success)
{
- unsigned long rs, re;
+ unsigned long rs, nr_regions;
/*
* If the destination device supports discards, remap and trim the
@@ -463,9 +470,9 @@ static void complete_discard_bio(struct clone *clone, struct bio *bio, bool succ
*/
if (test_bit(DM_CLONE_DISCARD_PASSDOWN, &clone->flags) && success) {
remap_to_dest(clone, bio);
- bio_region_range(clone, bio, &rs, &re);
+ bio_region_range(clone, bio, &rs, &nr_regions);
trim_bio(bio, rs << clone->region_shift,
- (re - rs) << clone->region_shift);
+ nr_regions << clone->region_shift);
generic_make_request(bio);
} else
bio_endio(bio);
@@ -473,12 +480,21 @@ static void complete_discard_bio(struct clone *clone, struct bio *bio, bool succ
static void process_discard_bio(struct clone *clone, struct bio *bio)
{
- unsigned long rs, re;
+ unsigned long rs, nr_regions;
- bio_region_range(clone, bio, &rs, &re);
- BUG_ON(re > clone->nr_regions);
+ bio_region_range(clone, bio, &rs, &nr_regions);
+ if (!nr_regions) {
+ bio_endio(bio);
+ return;
+ }
- if (unlikely(rs == re)) {
+ if (WARN_ON(rs >= clone->nr_regions || (rs + nr_regions) < rs ||
+ (rs + nr_regions) > clone->nr_regions)) {
+ DMERR("%s: Invalid range (%lu + %lu, total regions %lu) for discard (%llu + %u)",
+ clone_device_name(clone), rs, nr_regions,
+ clone->nr_regions,
+ (unsigned long long)bio->bi_iter.bi_sector,
+ bio_sectors(bio));
bio_endio(bio);
return;
}
@@ -487,7 +503,7 @@ static void process_discard_bio(struct clone *clone, struct bio *bio)
* The covered regions are already hydrated so we just need to pass
* down the discard.
*/
- if (dm_clone_is_range_hydrated(clone->cmd, rs, re - rs)) {
+ if (dm_clone_is_range_hydrated(clone->cmd, rs, nr_regions)) {
complete_discard_bio(clone, bio, true);
return;
}
@@ -1169,7 +1185,7 @@ static void process_deferred_discards(struct clone *clone)
int r = -EPERM;
struct bio *bio;
struct blk_plug plug;
- unsigned long rs, re;
+ unsigned long rs, nr_regions;
struct bio_list discards = BIO_EMPTY_LIST;
spin_lock_irq(&clone->lock);
@@ -1185,14 +1201,13 @@ static void process_deferred_discards(struct clone *clone)
/* Update the metadata */
bio_list_for_each(bio, &discards) {
- bio_region_range(clone, bio, &rs, &re);
+ bio_region_range(clone, bio, &rs, &nr_regions);
/*
* A discard request might cover regions that have been already
* hydrated. There is no need to update the metadata for these
* regions.
*/
- r = dm_clone_cond_set_range(clone->cmd, rs, re - rs);
-
+ r = dm_clone_cond_set_range(clone->cmd, rs, nr_regions);
if (unlikely(r))
break;
}
This is a note to let you know that I've just added the patch titled
serial: sh-sci: Make sure status register SCxSR is read in correct
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 3dc4db3662366306e54ddcbda4804acb1258e4ba Mon Sep 17 00:00:00 2001
From: Kazuhiro Fujita <kazuhiro.fujita.jg(a)renesas.com>
Date: Fri, 27 Mar 2020 18:17:28 +0000
Subject: serial: sh-sci: Make sure status register SCxSR is read in correct
sequence
For SCIF and HSCIF interfaces the SCxSR register holds the status of
data that is to be read next from SCxRDR register, But where as for
SCIFA and SCIFB interfaces SCxSR register holds status of data that is
previously read from SCxRDR register.
This patch makes sure the status register is read depending on the port
types so that errors are caught accordingly.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Kazuhiro Fujita <kazuhiro.fujita.jg(a)renesas.com>
Signed-off-by: Hao Bui <hao.bui.yg(a)renesas.com>
Signed-off-by: KAZUMI HARADA <kazumi.harada.rh(a)renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj(a)bp.renesas.com>
Tested-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Link: https://lore.kernel.org/r/1585333048-31828-1-git-send-email-kazuhiro.fujita…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/sh-sci.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index c073aa7001c4..e1179e74a2b8 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -870,9 +870,16 @@ static void sci_receive_chars(struct uart_port *port)
tty_insert_flip_char(tport, c, TTY_NORMAL);
} else {
for (i = 0; i < count; i++) {
- char c = serial_port_in(port, SCxRDR);
-
- status = serial_port_in(port, SCxSR);
+ char c;
+
+ if (port->type == PORT_SCIF ||
+ port->type == PORT_HSCIF) {
+ status = serial_port_in(port, SCxSR);
+ c = serial_port_in(port, SCxRDR);
+ } else {
+ c = serial_port_in(port, SCxRDR);
+ status = serial_port_in(port, SCxSR);
+ }
if (uart_handle_sysrq_char(port, c)) {
count--; i--;
continue;
--
2.26.1
This is a note to let you know that I've just added the patch titled
Revert "serial: uartps: Move Port ID to device data structure"
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 492cc08bc16c44e2e587362ada3f6269dee2be22 Mon Sep 17 00:00:00 2001
From: Michal Simek <michal.simek(a)xilinx.com>
Date: Fri, 3 Apr 2020 11:24:35 +0200
Subject: Revert "serial: uartps: Move Port ID to device data structure"
This reverts commit bed25ac0e2b6ab8f9aed2d20bc9c3a2037311800.
As Johan says, this driver needs a lot more work and these changes are
only going in the wrong direction:
https://lkml.kernel.org/r/20190523091839.GC568@localhost
Reported-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/eb0ec98fecdca9b79c1a3ac0c30c668b6973b193.15859058…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 58f0fa07ecdb..41d9c2f188f0 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -189,7 +189,6 @@ MODULE_PARM_DESC(rx_timeout, "Rx timeout, 1-255");
* @pclk: APB clock
* @cdns_uart_driver: Pointer to UART driver
* @baud: Current baud rate
- * @id: Port ID
* @clk_rate_change_nb: Notifier block for clock changes
* @quirks: Flags for RXBS support.
*/
@@ -199,7 +198,6 @@ struct cdns_uart {
struct clk *pclk;
struct uart_driver *cdns_uart_driver;
unsigned int baud;
- int id;
struct notifier_block clk_rate_change_nb;
u32 quirks;
bool cts_override;
@@ -1412,7 +1410,7 @@ MODULE_DEVICE_TABLE(of, cdns_uart_of_match);
*/
static int cdns_uart_probe(struct platform_device *pdev)
{
- int rc, irq;
+ int rc, id, irq;
struct uart_port *port;
struct resource *res;
struct cdns_uart *cdns_uart_data;
@@ -1438,18 +1436,18 @@ static int cdns_uart_probe(struct platform_device *pdev)
return -ENOMEM;
/* Look for a serialN alias */
- cdns_uart_data->id = of_alias_get_id(pdev->dev.of_node, "serial");
- if (cdns_uart_data->id < 0)
- cdns_uart_data->id = 0;
+ id = of_alias_get_id(pdev->dev.of_node, "serial");
+ if (id < 0)
+ id = 0;
- if (cdns_uart_data->id >= CDNS_UART_NR_PORTS) {
+ if (id >= CDNS_UART_NR_PORTS) {
dev_err(&pdev->dev, "Cannot get uart_port structure\n");
return -ENODEV;
}
/* There is a need to use unique driver name */
driver_name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s%d",
- CDNS_UART_NAME, cdns_uart_data->id);
+ CDNS_UART_NAME, id);
if (!driver_name)
return -ENOMEM;
@@ -1457,7 +1455,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
cdns_uart_uart_driver->driver_name = driver_name;
cdns_uart_uart_driver->dev_name = CDNS_UART_TTY_NAME;
cdns_uart_uart_driver->major = CDNS_UART_MAJOR;
- cdns_uart_uart_driver->minor = cdns_uart_data->id;
+ cdns_uart_uart_driver->minor = id;
cdns_uart_uart_driver->nr = 1;
#ifdef CONFIG_SERIAL_XILINX_PS_UART_CONSOLE
@@ -1468,7 +1466,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
strncpy(cdns_uart_console->name, CDNS_UART_TTY_NAME,
sizeof(cdns_uart_console->name));
- cdns_uart_console->index = cdns_uart_data->id;
+ cdns_uart_console->index = id;
cdns_uart_console->write = cdns_uart_console_write;
cdns_uart_console->device = uart_console_device;
cdns_uart_console->setup = cdns_uart_console_setup;
@@ -1490,7 +1488,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
* registration because tty_driver structure is not filled.
* name_base is 0 by default.
*/
- cdns_uart_uart_driver->tty_driver->name_base = cdns_uart_data->id;
+ cdns_uart_uart_driver->tty_driver->name_base = id;
match = of_match_node(cdns_uart_of_match, pdev->dev.of_node);
if (match && match->data) {
--
2.26.1
This is a note to let you know that I've just added the patch titled
Revert "serial: uartps: Change uart ID port allocation"
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 72d68197281e2ad313960504d10b0c41ff87fd55 Mon Sep 17 00:00:00 2001
From: Michal Simek <michal.simek(a)xilinx.com>
Date: Fri, 3 Apr 2020 11:24:34 +0200
Subject: Revert "serial: uartps: Change uart ID port allocation"
This reverts commit ae1cca3fa3478be92948dbbcd722390272032ade.
With setting up NR_PORTS to 16 to be able to use serial2 and higher
aliases and don't loose functionality which was intended by these changes.
As Johan says, this driver needs a lot more work and these changes are
only going in the wrong direction:
https://lkml.kernel.org/r/20190523091839.GC568@localhost
Reported-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/a94931b65ce0089f76fb1fe6b446a08731bff754.15859058…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 111 ++++-------------------------
1 file changed, 13 insertions(+), 98 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 9db3cd120057..58f0fa07ecdb 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -27,6 +27,7 @@
#define CDNS_UART_TTY_NAME "ttyPS"
#define CDNS_UART_NAME "xuartps"
#define CDNS_UART_MAJOR 0 /* use dynamic node allocation */
+#define CDNS_UART_NR_PORTS 16
#define CDNS_UART_FIFO_SIZE 64 /* FIFO size */
#define CDNS_UART_REGISTER_SPACE 0x1000
#define TX_TIMEOUT 500000
@@ -1403,90 +1404,6 @@ static const struct of_device_id cdns_uart_of_match[] = {
};
MODULE_DEVICE_TABLE(of, cdns_uart_of_match);
-/*
- * Maximum number of instances without alias IDs but if there is alias
- * which target "< MAX_UART_INSTANCES" range this ID can't be used.
- */
-#define MAX_UART_INSTANCES 32
-
-/* Stores static aliases list */
-static DECLARE_BITMAP(alias_bitmap, MAX_UART_INSTANCES);
-static int alias_bitmap_initialized;
-
-/* Stores actual bitmap of allocated IDs with alias IDs together */
-static DECLARE_BITMAP(bitmap, MAX_UART_INSTANCES);
-/* Protect bitmap operations to have unique IDs */
-static DEFINE_MUTEX(bitmap_lock);
-
-static int cdns_get_id(struct platform_device *pdev)
-{
- int id, ret;
-
- mutex_lock(&bitmap_lock);
-
- /* Alias list is stable that's why get alias bitmap only once */
- if (!alias_bitmap_initialized) {
- ret = of_alias_get_alias_list(cdns_uart_of_match, "serial",
- alias_bitmap, MAX_UART_INSTANCES);
- if (ret && ret != -EOVERFLOW) {
- mutex_unlock(&bitmap_lock);
- return ret;
- }
-
- alias_bitmap_initialized++;
- }
-
- /* Make sure that alias ID is not taken by instance without alias */
- bitmap_or(bitmap, bitmap, alias_bitmap, MAX_UART_INSTANCES);
-
- dev_dbg(&pdev->dev, "Alias bitmap: %*pb\n",
- MAX_UART_INSTANCES, bitmap);
-
- /* Look for a serialN alias */
- id = of_alias_get_id(pdev->dev.of_node, "serial");
- if (id < 0) {
- dev_warn(&pdev->dev,
- "No serial alias passed. Using the first free id\n");
-
- /*
- * Start with id 0 and check if there is no serial0 alias
- * which points to device which is compatible with this driver.
- * If alias exists then try next free position.
- */
- id = 0;
-
- for (;;) {
- dev_info(&pdev->dev, "Checking id %d\n", id);
- id = find_next_zero_bit(bitmap, MAX_UART_INSTANCES, id);
-
- /* No free empty instance */
- if (id == MAX_UART_INSTANCES) {
- dev_err(&pdev->dev, "No free ID\n");
- mutex_unlock(&bitmap_lock);
- return -EINVAL;
- }
-
- dev_dbg(&pdev->dev, "The empty id is %d\n", id);
- /* Check if ID is empty */
- if (!test_and_set_bit(id, bitmap)) {
- /* Break the loop if bit is taken */
- dev_dbg(&pdev->dev,
- "Selected ID %d allocation passed\n",
- id);
- break;
- }
- dev_dbg(&pdev->dev,
- "Selected ID %d allocation failed\n", id);
- /* if taking bit fails then try next one */
- id++;
- }
- }
-
- mutex_unlock(&bitmap_lock);
-
- return id;
-}
-
/**
* cdns_uart_probe - Platform driver probe
* @pdev: Pointer to the platform device structure
@@ -1520,17 +1437,21 @@ static int cdns_uart_probe(struct platform_device *pdev)
if (!cdns_uart_uart_driver)
return -ENOMEM;
- cdns_uart_data->id = cdns_get_id(pdev);
+ /* Look for a serialN alias */
+ cdns_uart_data->id = of_alias_get_id(pdev->dev.of_node, "serial");
if (cdns_uart_data->id < 0)
- return cdns_uart_data->id;
+ cdns_uart_data->id = 0;
+
+ if (cdns_uart_data->id >= CDNS_UART_NR_PORTS) {
+ dev_err(&pdev->dev, "Cannot get uart_port structure\n");
+ return -ENODEV;
+ }
/* There is a need to use unique driver name */
driver_name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s%d",
CDNS_UART_NAME, cdns_uart_data->id);
- if (!driver_name) {
- rc = -ENOMEM;
- goto err_out_id;
- }
+ if (!driver_name)
+ return -ENOMEM;
cdns_uart_uart_driver->owner = THIS_MODULE;
cdns_uart_uart_driver->driver_name = driver_name;
@@ -1559,7 +1480,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
rc = uart_register_driver(cdns_uart_uart_driver);
if (rc < 0) {
dev_err(&pdev->dev, "Failed to register driver\n");
- goto err_out_id;
+ return rc;
}
cdns_uart_data->cdns_uart_driver = cdns_uart_uart_driver;
@@ -1710,10 +1631,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
clk_disable_unprepare(cdns_uart_data->pclk);
err_out_unregister_driver:
uart_unregister_driver(cdns_uart_data->cdns_uart_driver);
-err_out_id:
- mutex_lock(&bitmap_lock);
- clear_bit(cdns_uart_data->id, bitmap);
- mutex_unlock(&bitmap_lock);
+
return rc;
}
@@ -1736,9 +1654,6 @@ static int cdns_uart_remove(struct platform_device *pdev)
#endif
rc = uart_remove_one_port(cdns_uart_data->cdns_uart_driver, port);
port->mapbase = 0;
- mutex_lock(&bitmap_lock);
- clear_bit(cdns_uart_data->id, bitmap);
- mutex_unlock(&bitmap_lock);
clk_disable_unprepare(cdns_uart_data->uartclk);
clk_disable_unprepare(cdns_uart_data->pclk);
pm_runtime_disable(&pdev->dev);
--
2.26.1
This is a note to let you know that I've just added the patch titled
Revert "serial: uartps: Fix error path when alloc failed"
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From b6fd2dbbd649b89a3998528994665ded1e3fbf7f Mon Sep 17 00:00:00 2001
From: Michal Simek <michal.simek(a)xilinx.com>
Date: Fri, 3 Apr 2020 11:24:32 +0200
Subject: Revert "serial: uartps: Fix error path when alloc failed"
This reverts commit 32cf21ac4edd6c0d5b9614368a83bcdc68acb031.
As Johan says, this driver needs a lot more work and these changes are
only going in the wrong direction:
https://lkml.kernel.org/r/20190523091839.GC568@localhost
Reported-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/46cd7f039db847c08baa6508edd7854f7c8ff80f.15859058…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 4e3fefa70b56..412bfc51f546 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -1542,10 +1542,8 @@ static int cdns_uart_probe(struct platform_device *pdev)
#ifdef CONFIG_SERIAL_XILINX_PS_UART_CONSOLE
cdns_uart_console = devm_kzalloc(&pdev->dev, sizeof(*cdns_uart_console),
GFP_KERNEL);
- if (!cdns_uart_console) {
- rc = -ENOMEM;
- goto err_out_id;
- }
+ if (!cdns_uart_console)
+ return -ENOMEM;
strncpy(cdns_uart_console->name, CDNS_UART_TTY_NAME,
sizeof(cdns_uart_console->name));
--
2.26.1
This is a note to let you know that I've just added the patch titled
Revert "serial: uartps: Use the same dynamic major number for all
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 8da1a3940da4b0e82848ec29b835486890bc9232 Mon Sep 17 00:00:00 2001
From: Michal Simek <michal.simek(a)xilinx.com>
Date: Fri, 3 Apr 2020 11:24:31 +0200
Subject: Revert "serial: uartps: Use the same dynamic major number for all
ports"
This reverts commit ab262666018de6f4e206b021386b93ed0c164316.
As Johan says, this driver needs a lot more work and these changes are
only going in the wrong direction:
https://lkml.kernel.org/r/20190523091839.GC568@localhost
Reported-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/14a565fc1e14a5ec6cc6a6710deb878ae8305f22.15859058…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index b858fb14833d..4e3fefa70b56 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -26,13 +26,13 @@
#define CDNS_UART_TTY_NAME "ttyPS"
#define CDNS_UART_NAME "xuartps"
+#define CDNS_UART_MAJOR 0 /* use dynamic node allocation */
#define CDNS_UART_FIFO_SIZE 64 /* FIFO size */
#define CDNS_UART_REGISTER_SPACE 0x1000
#define TX_TIMEOUT 500000
/* Rx Trigger level */
static int rx_trigger_level = 56;
-static int uartps_major;
module_param(rx_trigger_level, uint, 0444);
MODULE_PARM_DESC(rx_trigger_level, "Rx trigger level, 1-63 bytes");
@@ -1535,7 +1535,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
cdns_uart_uart_driver->owner = THIS_MODULE;
cdns_uart_uart_driver->driver_name = driver_name;
cdns_uart_uart_driver->dev_name = CDNS_UART_TTY_NAME;
- cdns_uart_uart_driver->major = uartps_major;
+ cdns_uart_uart_driver->major = CDNS_UART_MAJOR;
cdns_uart_uart_driver->minor = cdns_uart_data->id;
cdns_uart_uart_driver->nr = 1;
@@ -1564,7 +1564,6 @@ static int cdns_uart_probe(struct platform_device *pdev)
goto err_out_id;
}
- uartps_major = cdns_uart_uart_driver->tty_driver->major;
cdns_uart_data->cdns_uart_driver = cdns_uart_uart_driver;
/*
--
2.26.1
This is a note to let you know that I've just added the patch titled
Revert "serial: uartps: Do not allow use aliases >=
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 91c9dfa25c7f95b543c280e0edf1fd8de6e90985 Mon Sep 17 00:00:00 2001
From: Michal Simek <michal.simek(a)xilinx.com>
Date: Fri, 3 Apr 2020 11:24:33 +0200
Subject: Revert "serial: uartps: Do not allow use aliases >=
MAX_UART_INSTANCES"
This reverts commit 2088cfd882d0403609bdf426e9b24372fe1b8337.
As Johan says, this driver needs a lot more work and these changes are
only going in the wrong direction:
https://lkml.kernel.org/r/20190523091839.GC568@localhost
Reported-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/dac3898e3e32d963f357fb436ac9a7ac3cbcf933.15859058…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 412bfc51f546..9db3cd120057 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -1712,8 +1712,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
uart_unregister_driver(cdns_uart_data->cdns_uart_driver);
err_out_id:
mutex_lock(&bitmap_lock);
- if (cdns_uart_data->id < MAX_UART_INSTANCES)
- clear_bit(cdns_uart_data->id, bitmap);
+ clear_bit(cdns_uart_data->id, bitmap);
mutex_unlock(&bitmap_lock);
return rc;
}
@@ -1738,8 +1737,7 @@ static int cdns_uart_remove(struct platform_device *pdev)
rc = uart_remove_one_port(cdns_uart_data->cdns_uart_driver, port);
port->mapbase = 0;
mutex_lock(&bitmap_lock);
- if (cdns_uart_data->id < MAX_UART_INSTANCES)
- clear_bit(cdns_uart_data->id, bitmap);
+ clear_bit(cdns_uart_data->id, bitmap);
mutex_unlock(&bitmap_lock);
clk_disable_unprepare(cdns_uart_data->uartclk);
clk_disable_unprepare(cdns_uart_data->pclk);
--
2.26.1
This is a note to let you know that I've just added the patch titled
Revert "serial: uartps: Fix uartps_major handling"
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 2e01911b7cf7aa07a304a809eca1b11a4bd35859 Mon Sep 17 00:00:00 2001
From: Michal Simek <michal.simek(a)xilinx.com>
Date: Fri, 3 Apr 2020 11:24:30 +0200
Subject: Revert "serial: uartps: Fix uartps_major handling"
This reverts commit 5e9bd2d70ae7c00a95a22994abf1eef728649e64.
As Johan says, this driver needs a lot more work and these changes are
only going in the wrong direction:
https://lkml.kernel.org/r/20190523091839.GC568@localhost
Reported-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/310999ab5342f788a7bc1b0e68294d4f052cad07.15859058…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 6b26f767768e..b858fb14833d 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -1564,6 +1564,7 @@ static int cdns_uart_probe(struct platform_device *pdev)
goto err_out_id;
}
+ uartps_major = cdns_uart_uart_driver->tty_driver->major;
cdns_uart_data->cdns_uart_driver = cdns_uart_uart_driver;
/*
@@ -1694,7 +1695,6 @@ static int cdns_uart_probe(struct platform_device *pdev)
console_port = NULL;
#endif
- uartps_major = cdns_uart_uart_driver->tty_driver->major;
cdns_uart_data->cts_override = of_property_read_bool(pdev->dev.of_node,
"cts-override");
return 0;
@@ -1756,12 +1756,6 @@ static int cdns_uart_remove(struct platform_device *pdev)
console_port = NULL;
#endif
- /* If this is last instance major number should be initialized */
- mutex_lock(&bitmap_lock);
- if (bitmap_empty(bitmap, MAX_UART_INSTANCES))
- uartps_major = 0;
- mutex_unlock(&bitmap_lock);
-
uart_unregister_driver(cdns_uart_data->cdns_uart_driver);
return rc;
}
--
2.26.1
This is a note to let you know that I've just added the patch titled
USB: Add USB_QUIRK_DELAY_CTRL_MSG and USB_QUIRK_DELAY_INIT for
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From be34a5854b4606bd7a160ad3cb43415d623596c7 Mon Sep 17 00:00:00 2001
From: Jonathan Cox <jonathan(a)jdcox.net>
Date: Fri, 10 Apr 2020 14:24:27 -0700
Subject: USB: Add USB_QUIRK_DELAY_CTRL_MSG and USB_QUIRK_DELAY_INIT for
Corsair K70 RGB RAPIDFIRE
The Corsair K70 RGB RAPIDFIRE needs the USB_QUIRK_DELAY_INIT and
USB_QUIRK_DELAY_CTRL_MSG to function or it will randomly not
respond on boot, just like other Corsair keyboards
Signed-off-by: Jonathan Cox <jonathan(a)jdcox.net>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200410212427.2886-1-jonathan@jdcox.net
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index da30b5664ff3..3e8efe759c3e 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -430,6 +430,10 @@ static const struct usb_device_id usb_quirk_list[] = {
/* Corsair K70 LUX */
{ USB_DEVICE(0x1b1c, 0x1b36), .driver_info = USB_QUIRK_DELAY_INIT },
+ /* Corsair K70 RGB RAPDIFIRE */
+ { USB_DEVICE(0x1b1c, 0x1b38), .driver_info = USB_QUIRK_DELAY_INIT |
+ USB_QUIRK_DELAY_CTRL_MSG },
+
/* MIDI keyboard WORLDE MINI */
{ USB_DEVICE(0x1c75, 0x0204), .driver_info =
USB_QUIRK_CONFIG_INTF_STRINGS },
--
2.26.1
Call devm_free_irq() if we have to revert to polling in order not to
unnecessarily reserve the IRQ for the life-cycle of the driver.
Cc: stable(a)vger.kernel.org # 4.5.x
Reported-by: Hans de Goede <hdegoede(a)redhat.com>
Fixes: e3837e74a06d ("tpm_tis: Refactor the interrupt setup")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
drivers/char/tpm/tpm_tis_core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 27c6ca031e23..ae6868e7b696 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -1062,9 +1062,12 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
if (irq) {
tpm_tis_probe_irq_single(chip, intmask, IRQF_SHARED,
irq);
- if (!(chip->flags & TPM_CHIP_FLAG_IRQ))
+ if (!(chip->flags & TPM_CHIP_FLAG_IRQ)) {
dev_err(&chip->dev, FW_BUG
"TPM interrupt not working, polling instead\n");
+ devm_free_irq(chip->dev.parent, priv->irq,
+ chip);
+ }
} else {
tpm_tis_probe_irq(chip, intmask);
}
--
2.25.1
From: Kai Vehmanen <kai.vehmanen(a)linux.intel.com>
[ Upstream commit 43bcb1c0507858cdc95e425017dcc33f8105df39 ]
snd_hdac_ext_bus_link_get() does not work correctly in case
there are multiple codecs on the bus. It unconditionally
resets the bus->codec_mask value. As per documentation in
hdaudio.h and existing use in client code, this field should
be used to store bit flag of detected codecs on the bus.
By overwriting value of the codec_mask, information on all
detected codecs is lost. No current user of hdac is impacted,
but use of bus->codec_mask is planned in future patches
for SOF.
Signed-off-by: Kai Vehmanen <kai.vehmanen(a)linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan(a)linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
Reviewed-by: Takashi Iwai <tiwai(a)suse.de>
Link: https://lore.kernel.org/r/20200206200223.7715-1-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/hda/ext/hdac_ext_controller.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/sound/hda/ext/hdac_ext_controller.c b/sound/hda/ext/hdac_ext_controller.c
index cfab60d88c921..09ff209df4a30 100644
--- a/sound/hda/ext/hdac_ext_controller.c
+++ b/sound/hda/ext/hdac_ext_controller.c
@@ -254,6 +254,7 @@ EXPORT_SYMBOL_GPL(snd_hdac_ext_bus_link_power_down_all);
int snd_hdac_ext_bus_link_get(struct hdac_bus *bus,
struct hdac_ext_link *link)
{
+ unsigned long codec_mask;
int ret = 0;
mutex_lock(&bus->lock);
@@ -280,9 +281,11 @@ int snd_hdac_ext_bus_link_get(struct hdac_bus *bus,
* HDA spec section 4.3 - Codec Discovery
*/
udelay(521);
- bus->codec_mask = snd_hdac_chip_readw(bus, STATESTS);
- dev_dbg(bus->dev, "codec_mask = 0x%lx\n", bus->codec_mask);
- snd_hdac_chip_writew(bus, STATESTS, bus->codec_mask);
+ codec_mask = snd_hdac_chip_readw(bus, STATESTS);
+ dev_dbg(bus->dev, "codec_mask = 0x%lx\n", codec_mask);
+ snd_hdac_chip_writew(bus, STATESTS, codec_mask);
+ if (!bus->codec_mask)
+ bus->codec_mask = codec_mask;
}
mutex_unlock(&bus->lock);
--
2.20.1
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ab9b2c7b32e6be53cac2e23f5b2db66815a7d972 Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef(a)toxicpanda.com>
Date: Thu, 13 Feb 2020 10:47:30 -0500
Subject: [PATCH] btrfs: handle logged extent failure properly
If we're allocating a logged extent we attempt to insert an extent
record for the file extent directly. We increase
space_info->bytes_reserved, because the extent entry addition will call
btrfs_update_block_group(), which will convert the ->bytes_reserved to
->bytes_used. However if we fail at any point while inserting the
extent entry we will bail and leave space on ->bytes_reserved, which
will trigger a WARN_ON() on umount. Fix this by pinning the space if we
fail to insert, which is what happens in every other failure case that
involves adding the extent entry.
CC: stable(a)vger.kernel.org # 5.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 136fffb76428..7eef91d6c2b6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4422,7 +4422,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
offset, ins, 1);
if (ret)
- btrfs_pin_extent(fs_info, ins->objectid, ins->offset, 1);
+ btrfs_pin_extent(trans, ins->objectid, ins->offset, 1);
btrfs_put_block_group(block_group);
return ret;
}
This is a note to let you know that I've just added the patch titled
usb: typec: tcpm: Ignore CC and vbus changes in PORT_RESET change
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 901789745a053286e0ced37960d44fa60267b940 Mon Sep 17 00:00:00 2001
From: Badhri Jagan Sridharan <badhri(a)google.com>
Date: Thu, 2 Apr 2020 14:59:47 -0700
Subject: usb: typec: tcpm: Ignore CC and vbus changes in PORT_RESET change
After PORT_RESET, the port is set to the appropriate
default_state. Ignore processing CC changes here as this
could cause the port to be switched into sink states
by default.
echo source > /sys/class/typec/port0/port_type
Before:
[ 154.528547] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms
[ 154.528560] CC1: 0 -> 0, CC2: 3 -> 0 [state PORT_RESET, polarity 0, disconnected]
[ 154.528564] state change PORT_RESET -> SNK_UNATTACHED
After:
[ 151.068814] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [rev3 NONE_AMS]
[ 151.072440] CC1: 3 -> 0, CC2: 0 -> 0 [state PORT_RESET, polarity 0, disconnected]
[ 151.172117] state change PORT_RESET -> PORT_RESET_WAIT_OFF [delayed 100 ms]
[ 151.172136] pending state change PORT_RESET_WAIT_OFF -> SRC_UNATTACHED @ 870 ms [rev3 NONE_AMS]
[ 152.060106] state change PORT_RESET_WAIT_OFF -> SRC_UNATTACHED [delayed 870 ms]
[ 152.060118] Start toggling
Signed-off-by: Badhri Jagan Sridharan <badhri(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Reviewed-by: Guenter Roeck <linux(a)roeck-us.net>
Link: https://lore.kernel.org/r/20200402215947.176577-1-badhri@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/typec/tcpm/tcpm.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index de3576e6530a..82b19ebd7838 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -3794,6 +3794,14 @@ static void _tcpm_cc_change(struct tcpm_port *port, enum typec_cc_status cc1,
*/
break;
+ case PORT_RESET:
+ case PORT_RESET_WAIT_OFF:
+ /*
+ * State set back to default mode once the timer completes.
+ * Ignore CC changes here.
+ */
+ break;
+
default:
if (tcpm_port_is_disconnected(port))
tcpm_set_state(port, unattached_state(port), 0);
@@ -3855,6 +3863,15 @@ static void _tcpm_pd_vbus_on(struct tcpm_port *port)
case SRC_TRY_DEBOUNCE:
/* Do nothing, waiting for sink detection */
break;
+
+ case PORT_RESET:
+ case PORT_RESET_WAIT_OFF:
+ /*
+ * State set back to default mode once the timer completes.
+ * Ignore vbus changes here.
+ */
+ break;
+
default:
break;
}
@@ -3908,10 +3925,19 @@ static void _tcpm_pd_vbus_off(struct tcpm_port *port)
case PORT_RESET_WAIT_OFF:
tcpm_set_state(port, tcpm_default_state(port), 0);
break;
+
case SRC_TRY_WAIT:
case SRC_TRY_DEBOUNCE:
/* Do nothing, waiting for sink detection */
break;
+
+ case PORT_RESET:
+ /*
+ * State set back to default mode once the timer completes.
+ * Ignore vbus changes here.
+ */
+ break;
+
default:
if (port->pwr_role == TYPEC_SINK &&
port->attached)
--
2.26.1
This is a note to let you know that I've just added the patch titled
UAS: fix deadlock in error handling and PM flushing work
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From f6cc6093a729ede1ff5658b493237c42b82ba107 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Wed, 15 Apr 2020 16:17:50 +0200
Subject: UAS: fix deadlock in error handling and PM flushing work
A SCSI error handler and block runtime PM must not allocate
memory with GFP_KERNEL. Furthermore they must not wait for
tasks allocating memory with GFP_KERNEL.
That means that they cannot share a workqueue with arbitrary tasks.
Fix this for UAS using a private workqueue.
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Fixes: f9dc024a2da1f ("uas: pre_reset and suspend: Fix a few races")
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200415141750.811-2-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/storage/uas.c | 43 ++++++++++++++++++++++++++++++++++++---
1 file changed, 40 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 08503e3507bf..d592071119ba 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -81,6 +81,19 @@ static void uas_free_streams(struct uas_dev_info *devinfo);
static void uas_log_cmd_state(struct scsi_cmnd *cmnd, const char *prefix,
int status);
+/*
+ * This driver needs its own workqueue, as we need to control memory allocation.
+ *
+ * In the course of error handling and power management uas_wait_for_pending_cmnds()
+ * needs to flush pending work items. In these contexts we cannot allocate memory
+ * by doing block IO as we would deadlock. For the same reason we cannot wait
+ * for anything allocating memory not heeding these constraints.
+ *
+ * So we have to control all work items that can be on the workqueue we flush.
+ * Hence we cannot share a queue and need our own.
+ */
+static struct workqueue_struct *workqueue;
+
static void uas_do_work(struct work_struct *work)
{
struct uas_dev_info *devinfo =
@@ -109,7 +122,7 @@ static void uas_do_work(struct work_struct *work)
if (!err)
cmdinfo->state &= ~IS_IN_WORK_LIST;
else
- schedule_work(&devinfo->work);
+ queue_work(workqueue, &devinfo->work);
}
out:
spin_unlock_irqrestore(&devinfo->lock, flags);
@@ -134,7 +147,7 @@ static void uas_add_work(struct uas_cmd_info *cmdinfo)
lockdep_assert_held(&devinfo->lock);
cmdinfo->state |= IS_IN_WORK_LIST;
- schedule_work(&devinfo->work);
+ queue_work(workqueue, &devinfo->work);
}
static void uas_zap_pending(struct uas_dev_info *devinfo, int result)
@@ -1229,7 +1242,31 @@ static struct usb_driver uas_driver = {
.id_table = uas_usb_ids,
};
-module_usb_driver(uas_driver);
+static int __init uas_init(void)
+{
+ int rv;
+
+ workqueue = alloc_workqueue("uas", WQ_MEM_RECLAIM, 0);
+ if (!workqueue)
+ return -ENOMEM;
+
+ rv = usb_register(&uas_driver);
+ if (rv) {
+ destroy_workqueue(workqueue);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void __exit uas_exit(void)
+{
+ usb_deregister(&uas_driver);
+ destroy_workqueue(workqueue);
+}
+
+module_init(uas_init);
+module_exit(uas_exit);
MODULE_LICENSE("GPL");
MODULE_IMPORT_NS(USB_STORAGE);
--
2.26.1
This is a note to let you know that I've just added the patch titled
cdc-acm: close race betrween suspend() and acm_softint
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0afccd7601514c4b83d8cc58c740089cc447051d Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Wed, 15 Apr 2020 17:13:57 +0200
Subject: cdc-acm: close race betrween suspend() and acm_softint
Suspend increments a counter, then kills the URBs,
then kills the scheduled work. The scheduled work, however,
may reschedule the URBs. Fix this by having the work
check the counter.
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Cc: stable <stable(a)vger.kernel.org>
Tested-by: Jonas Karlsson <jonas.karlsson(a)actia.se>
Link: https://lore.kernel.org/r/20200415151358.32664-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/class/cdc-acm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 84d6f7df09a4..4ef68e6671aa 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -557,14 +557,14 @@ static void acm_softint(struct work_struct *work)
struct acm *acm = container_of(work, struct acm, work);
if (test_bit(EVENT_RX_STALL, &acm->flags)) {
- if (!(usb_autopm_get_interface(acm->data))) {
+ smp_mb(); /* against acm_suspend() */
+ if (!acm->susp_count) {
for (i = 0; i < acm->rx_buflimit; i++)
usb_kill_urb(acm->read_urbs[i]);
usb_clear_halt(acm->dev, acm->in);
acm_submit_read_urbs(acm, GFP_KERNEL);
- usb_autopm_put_interface(acm->data);
+ clear_bit(EVENT_RX_STALL, &acm->flags);
}
- clear_bit(EVENT_RX_STALL, &acm->flags);
}
if (test_and_clear_bit(EVENT_TTY_WAKEUP, &acm->flags))
--
2.26.1
This is a note to let you know that I've just added the patch titled
UAS: no use logging any details in case of ENODEV
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 5963dec98dc52d52476390485f07a29c30c6a582 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Wed, 15 Apr 2020 16:17:49 +0200
Subject: UAS: no use logging any details in case of ENODEV
Once a device is gone, the internal state does not matter anymore.
There is no need to spam the logs.
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Cc: stable <stable(a)vger.kernel.org>
Fixes: 326349f824619 ("uas: add dead request list")
Link: https://lore.kernel.org/r/20200415141750.811-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/storage/uas.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 3670fda02c34..08503e3507bf 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -190,6 +190,9 @@ static void uas_log_cmd_state(struct scsi_cmnd *cmnd, const char *prefix,
struct uas_cmd_info *ci = (void *)&cmnd->SCp;
struct uas_cmd_info *cmdinfo = (void *)&cmnd->SCp;
+ if (status == -ENODEV) /* too late */
+ return;
+
scmd_printk(KERN_INFO, cmnd,
"%s %d uas-tag %d inflight:%s%s%s%s%s%s%s%s%s%s%s%s ",
prefix, status, cmdinfo->uas_tag,
--
2.26.1
This is a note to let you know that I've just added the patch titled
USB: core: Fix free-while-in-use bug in the USB S-Glibrary
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 056ad39ee9253873522f6469c3364964a322912b Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Sat, 28 Mar 2020 16:18:11 -0400
Subject: USB: core: Fix free-while-in-use bug in the USB S-Glibrary
FuzzUSB (a variant of syzkaller) found a free-while-still-in-use bug
in the USB scatter-gather library:
BUG: KASAN: use-after-free in atomic_read
include/asm-generic/atomic-instrumented.h:26 [inline]
BUG: KASAN: use-after-free in usb_hcd_unlink_urb+0x5f/0x170
drivers/usb/core/hcd.c:1607
Read of size 4 at addr ffff888065379610 by task kworker/u4:1/27
CPU: 1 PID: 27 Comm: kworker/u4:1 Not tainted 5.5.11 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.10.2-1ubuntu1 04/01/2014
Workqueue: scsi_tmf_2 scmd_eh_abort_handler
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xce/0x128 lib/dump_stack.c:118
print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374
__kasan_report+0x153/0x1cb mm/kasan/report.c:506
kasan_report+0x12/0x20 mm/kasan/common.c:639
check_memory_region_inline mm/kasan/generic.c:185 [inline]
check_memory_region+0x152/0x1b0 mm/kasan/generic.c:192
__kasan_check_read+0x11/0x20 mm/kasan/common.c:95
atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
usb_hcd_unlink_urb+0x5f/0x170 drivers/usb/core/hcd.c:1607
usb_unlink_urb+0x72/0xb0 drivers/usb/core/urb.c:657
usb_sg_cancel+0x14e/0x290 drivers/usb/core/message.c:602
usb_stor_stop_transport+0x5e/0xa0 drivers/usb/storage/transport.c:937
This bug occurs when cancellation of the S-G transfer races with
transfer completion. When that happens, usb_sg_cancel() may continue
to access the transfer's URBs after usb_sg_wait() has freed them.
The bug is caused by the fact that usb_sg_cancel() does not take any
sort of reference to the transfer, and so there is nothing to prevent
the URBs from being deallocated while the routine is trying to use
them. The fix is to take such a reference by incrementing the
transfer's io->count field while the cancellation is in progres and
decrementing it afterward. The transfer's URBs are not deallocated
until io->complete is triggered, which happens when io->count reaches
zero.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-and-tested-by: Kyungtae Kim <kt0755(a)gmail.com>
CC: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.2003281615140.14837-100000@netrid…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/message.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index d5f834f16993..a48678a0c83a 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -589,12 +589,13 @@ void usb_sg_cancel(struct usb_sg_request *io)
int i, retval;
spin_lock_irqsave(&io->lock, flags);
- if (io->status) {
+ if (io->status || io->count == 0) {
spin_unlock_irqrestore(&io->lock, flags);
return;
}
/* shut everything down */
io->status = -ECONNRESET;
+ io->count++; /* Keep the request alive until we're done */
spin_unlock_irqrestore(&io->lock, flags);
for (i = io->entries - 1; i >= 0; --i) {
@@ -608,6 +609,12 @@ void usb_sg_cancel(struct usb_sg_request *io)
dev_warn(&io->dev->dev, "%s, unlink --> %d\n",
__func__, retval);
}
+
+ spin_lock_irqsave(&io->lock, flags);
+ io->count--;
+ if (!io->count)
+ complete(&io->complete);
+ spin_unlock_irqrestore(&io->lock, flags);
}
EXPORT_SYMBOL_GPL(usb_sg_cancel);
--
2.26.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 504e84abec7a635b861afd8d7f92ecd13eaa2b09 Mon Sep 17 00:00:00 2001
From: Gilad Ben-Yossef <gilad(a)benyossef.com>
Date: Wed, 29 Jan 2020 16:37:55 +0200
Subject: [PATCH] crypto: ccree - only try to map auth tag if needed
Make sure to only add the size of the auth tag to the source mapping
for encryption if it is an in-place operation. Failing to do this
previously caused us to try and map auth size len bytes from a NULL
mapping and crashing if both the cryptlen and assoclen are zero.
Reported-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Signed-off-by: Gilad Ben-Yossef <gilad(a)benyossef.com>
Cc: stable(a)vger.kernel.org # v4.19+
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/ccree/cc_buffer_mgr.c b/drivers/crypto/ccree/cc_buffer_mgr.c
index b938ceae7ae7..885347b5b372 100644
--- a/drivers/crypto/ccree/cc_buffer_mgr.c
+++ b/drivers/crypto/ccree/cc_buffer_mgr.c
@@ -1109,9 +1109,11 @@ int cc_map_aead_request(struct cc_drvdata *drvdata, struct aead_request *req)
}
size_to_map = req->cryptlen + areq_ctx->assoclen;
- if (areq_ctx->gen_ctx.op_type == DRV_CRYPTO_DIRECTION_ENCRYPT)
+ /* If we do in-place encryption, we also need the auth tag */
+ if ((areq_ctx->gen_ctx.op_type == DRV_CRYPTO_DIRECTION_ENCRYPT) &&
+ (req->src == req->dst)) {
size_to_map += authsize;
-
+ }
if (is_gcm4543)
size_to_map += crypto_aead_ivsize(tfm);
rc = cc_map_sg(dev, req->src, size_to_map, DMA_BIDIRECTIONAL,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8962c6d2c2b8ca51b0f188109015b15fc5f4da44 Mon Sep 17 00:00:00 2001
From: Gilad Ben-Yossef <gilad(a)benyossef.com>
Date: Sun, 2 Feb 2020 18:19:14 +0200
Subject: [PATCH] crypto: ccree - dec auth tag size from cryptlen map
Remove the auth tag size from cryptlen before mapping the destination
in out-of-place AEAD decryption thus resolving a crash with
extended testmgr tests.
Signed-off-by: Gilad Ben-Yossef <gilad(a)benyossef.com>
Reported-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Cc: stable(a)vger.kernel.org # v4.19+
Tested-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/ccree/cc_buffer_mgr.c b/drivers/crypto/ccree/cc_buffer_mgr.c
index 885347b5b372..954f14bddf1d 100644
--- a/drivers/crypto/ccree/cc_buffer_mgr.c
+++ b/drivers/crypto/ccree/cc_buffer_mgr.c
@@ -894,8 +894,12 @@ static int cc_aead_chain_data(struct cc_drvdata *drvdata,
if (req->src != req->dst) {
size_for_map = areq_ctx->assoclen + req->cryptlen;
- size_for_map += (direct == DRV_CRYPTO_DIRECTION_ENCRYPT) ?
- authsize : 0;
+
+ if (direct == DRV_CRYPTO_DIRECTION_ENCRYPT)
+ size_for_map += authsize;
+ else
+ size_for_map -= authsize;
+
if (is_gcm4543)
size_for_map += crypto_aead_ivsize(tfm);
This is a note to let you know that I've just added the patch titled
staging: vt6656: Power save stop wake_up_count wrap around.
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From ea81c3486442f4643fc9825a2bb1b430b829bccd Mon Sep 17 00:00:00 2001
From: Malcolm Priestley <tvboxspy(a)gmail.com>
Date: Tue, 14 Apr 2020 11:39:23 +0100
Subject: staging: vt6656: Power save stop wake_up_count wrap around.
conf.listen_interval can sometimes be zero causing wake_up_count
to wrap around up to many beacons too late causing
CTRL-EVENT-BEACON-LOSS as in.
wpa_supplicant[795]: message repeated 45 times: [..CTRL-EVENT-BEACON-LOSS ]
Fixes: 43c93d9bf5e2 ("staging: vt6656: implement power saving code.")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Malcolm Priestley <tvboxspy(a)gmail.com>
Link: https://lore.kernel.org/r/fce47bb5-7ca6-7671-5094-5c6107302f2b@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/vt6656/usbpipe.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/vt6656/usbpipe.c b/drivers/staging/vt6656/usbpipe.c
index eae211e5860f..91b62c3dff7b 100644
--- a/drivers/staging/vt6656/usbpipe.c
+++ b/drivers/staging/vt6656/usbpipe.c
@@ -207,7 +207,8 @@ static void vnt_int_process_data(struct vnt_private *priv)
priv->wake_up_count =
priv->hw->conf.listen_interval;
- --priv->wake_up_count;
+ if (priv->wake_up_count)
+ --priv->wake_up_count;
/* Turn on wake up to listen next beacon */
if (priv->wake_up_count == 1)
--
2.26.1
Compilers with branch protection support can be configured to enable it by
default, it is likely that distributions will do this as part of deploying
branch protection system wide. As well as the slight overhead from having
some extra NOPs for unused branch protection features this can cause more
serious problems when the kernel is providing pointer authentication to
userspace but not built for pointer authentication itself. In that case our
switching of keys for userspace can affect the kernel unexpectedly, causing
pointer authentication instructions in the kernel to corrupt addresses.
To ensure that we get consistent and reliable behaviour always explicitly
initialise the branch protection mode, ensuring that the kernel is built
the same way regardless of the compiler defaults.
[This is a reworked version of b8fdef311a0bd9223f1075 ("arm64: Always
force a branch protection mode when the compiler has one") for backport.
Kernels prior to 74afda4016a7 ("arm64: compile the kernel with ptrauth
return address signing") don't have any Makefile machinery for forcing
on pointer auth but still have issues if the compiler defaults it on so
need this reworked version. -- broonie]
Fixes: 7503197562567 (arm64: add basic pointer authentication support)
Reported-by: Szabolcs Nagy <szabolcs.nagy(a)arm.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
[catalin.marinas(a)arm.com: remove Kconfig option in favour of Makefile check]
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
arch/arm64/Makefile | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index dca1a97751ab..4e6ce2d9196e 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -65,6 +65,10 @@ stack_protector_prepare: prepare0
include/generated/asm-offsets.h))
endif
+# Ensure that if the compiler supports branch protection we default it
+# off.
+KBUILD_CFLAGS += $(call cc-option,-mbranch-protection=none)
+
ifeq ($(CONFIG_CPU_BIG_ENDIAN), y)
KBUILD_CPPFLAGS += -mbig-endian
CHECKFLAGS += -D__AARCH64EB__
--
2.20.1
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b8fdef311a0bd9223f10754f94fdcf1a594a3457 Mon Sep 17 00:00:00 2001
From: Mark Brown <broonie(a)kernel.org>
Date: Tue, 31 Mar 2020 20:44:59 +0100
Subject: [PATCH] arm64: Always force a branch protection mode when the
compiler has one
Compilers with branch protection support can be configured to enable it by
default, it is likely that distributions will do this as part of deploying
branch protection system wide. As well as the slight overhead from having
some extra NOPs for unused branch protection features this can cause more
serious problems when the kernel is providing pointer authentication to
userspace but not built for pointer authentication itself. In that case our
switching of keys for userspace can affect the kernel unexpectedly, causing
pointer authentication instructions in the kernel to corrupt addresses.
To ensure that we get consistent and reliable behaviour always explicitly
initialise the branch protection mode, ensuring that the kernel is built
the same way regardless of the compiler defaults.
Fixes: 7503197562567 (arm64: add basic pointer authentication support)
Reported-by: Szabolcs Nagy <szabolcs.nagy(a)arm.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
[catalin.marinas(a)arm.com: remove Kconfig option in favour of Makefile check]
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index f15f92ba53e6..85e4149cc5d5 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -65,6 +65,10 @@ stack_protector_prepare: prepare0
include/generated/asm-offsets.h))
endif
+# Ensure that if the compiler supports branch protection we default it
+# off, this will be overridden if we are using branch protection.
+branch-prot-flags-y += $(call cc-option,-mbranch-protection=none)
+
ifeq ($(CONFIG_ARM64_PTR_AUTH),y)
branch-prot-flags-$(CONFIG_CC_HAS_SIGN_RETURN_ADDRESS) := -msign-return-address=all
branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET) := -mbranch-protection=pac-ret+leaf
@@ -73,9 +77,10 @@ branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET) := -mbranch-protection=pa
# we pass it only to the assembler. This option is utilized only in case of non
# integrated assemblers.
branch-prot-flags-$(CONFIG_AS_HAS_PAC) += -Wa,-march=armv8.3-a
-KBUILD_CFLAGS += $(branch-prot-flags-y)
endif
+KBUILD_CFLAGS += $(branch-prot-flags-y)
+
ifeq ($(CONFIG_CPU_BIG_ENDIAN), y)
KBUILD_CPPFLAGS += -mbig-endian
CHECKFLAGS += -D__AARCH64EB__
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ea36ec8623f56791c6ff6738d0509b7920f85220 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Sun, 2 Feb 2020 17:16:31 +0000
Subject: [PATCH] drm: Remove PageReserved manipulation from drm_pci_alloc
drm_pci_alloc/drm_pci_free are very thin wrappers around the core dma
facilities, and we have no special reason within the drm layer to behave
differently. In particular, since
commit de09d31dd38a50fdce106c15abd68432eebbd014
Author: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Date: Fri Jan 15 16:51:42 2016 -0800
page-flags: define PG_reserved behavior on compound pages
As far as I can see there's no users of PG_reserved on compound pages.
Let's use PF_NO_COMPOUND here.
it has been illegal to combine GFP_COMP with SetPageReserved, so lets
stop doing both and leave the dma layer to its own devices.
Reported-by: Taketo Kabe
Bug: https://gitlab.freedesktop.org/drm/intel/issues/1027
Fixes: de09d31dd38a ("page-flags: define PG_reserved behavior on compound pages")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org> # v4.5+
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200202171635.4039044-1-chri…
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index f2e43d341980..d16dac4325f9 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -51,8 +51,6 @@
drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t align)
{
drm_dma_handle_t *dmah;
- unsigned long addr;
- size_t sz;
/* pci_alloc_consistent only guarantees alignment to the smallest
* PAGE_SIZE order which is greater than or equal to the requested size.
@@ -68,20 +66,13 @@ drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t ali
dmah->size = size;
dmah->vaddr = dma_alloc_coherent(&dev->pdev->dev, size,
&dmah->busaddr,
- GFP_KERNEL | __GFP_COMP);
+ GFP_KERNEL);
if (dmah->vaddr == NULL) {
kfree(dmah);
return NULL;
}
- /* XXX - Is virt_to_page() legal for consistent mem? */
- /* Reserve */
- for (addr = (unsigned long)dmah->vaddr, sz = size;
- sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
- SetPageReserved(virt_to_page((void *)addr));
- }
-
return dmah;
}
@@ -94,19 +85,9 @@ EXPORT_SYMBOL(drm_pci_alloc);
*/
void __drm_legacy_pci_free(struct drm_device * dev, drm_dma_handle_t * dmah)
{
- unsigned long addr;
- size_t sz;
-
- if (dmah->vaddr) {
- /* XXX - Is virt_to_page() legal for consistent mem? */
- /* Unreserve */
- for (addr = (unsigned long)dmah->vaddr, sz = dmah->size;
- sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
- ClearPageReserved(virt_to_page((void *)addr));
- }
+ if (dmah->vaddr)
dma_free_coherent(&dev->pdev->dev, dmah->size, dmah->vaddr,
dmah->busaddr);
- }
}
/**
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd40b17ca49d7d110adf456e647701ce74de2241 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Date: Fri, 31 Jan 2020 05:07:55 -0500
Subject: [PATCH] XArray: Fix xa_find_next for large multi-index entries
Coverity pointed out that xas_sibling() was shifting xa_offset without
promoting it to an unsigned long first, so the shift could cause an
overflow and we'd get the wrong answer. The fix is obvious, and the
new test-case provokes UBSAN to report an error:
runtime error: shift exponent 60 is too large for 32-bit type 'int'
Fixes: 19c30f4dd092 ("XArray: Fix xa_find_after with multi-index entries")
Reported-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reported-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: stable(a)vger.kernel.org
diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index 55c14e8c8859..8c7d7a8468b8 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -12,6 +12,9 @@
static unsigned int tests_run;
static unsigned int tests_passed;
+static const unsigned int order_limit =
+ IS_ENABLED(CONFIG_XARRAY_MULTI) ? BITS_PER_LONG : 1;
+
#ifndef XA_DEBUG
# ifdef __KERNEL__
void xa_dump(const struct xarray *xa) { }
@@ -959,6 +962,20 @@ static noinline void check_multi_find_2(struct xarray *xa)
}
}
+static noinline void check_multi_find_3(struct xarray *xa)
+{
+ unsigned int order;
+
+ for (order = 5; order < order_limit; order++) {
+ unsigned long index = 1UL << (order - 5);
+
+ XA_BUG_ON(xa, !xa_empty(xa));
+ xa_store_order(xa, 0, order - 4, xa_mk_index(0), GFP_KERNEL);
+ XA_BUG_ON(xa, xa_find_after(xa, &index, ULONG_MAX, XA_PRESENT));
+ xa_erase_index(xa, 0);
+ }
+}
+
static noinline void check_find_1(struct xarray *xa)
{
unsigned long i, j, k;
@@ -1081,6 +1098,7 @@ static noinline void check_find(struct xarray *xa)
for (i = 2; i < 10; i++)
check_multi_find_1(xa, i);
check_multi_find_2(xa);
+ check_multi_find_3(xa);
}
/* See find_swap_entry() in mm/shmem.c */
diff --git a/lib/xarray.c b/lib/xarray.c
index 1d9fab7db8da..acd1fad2e862 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1839,7 +1839,8 @@ static bool xas_sibling(struct xa_state *xas)
if (!node)
return false;
mask = (XA_CHUNK_SIZE << node->shift) - 1;
- return (xas->xa_index & mask) > (xas->xa_offset << node->shift);
+ return (xas->xa_index & mask) >
+ ((unsigned long)xas->xa_offset << node->shift);
}
/**
From: Eric Biggers <ebiggers(a)google.com>
Currently after any algorithm is registered and tested, there's an
unnecessary request_module("cryptomgr") even if it's already loaded.
Also, CRYPTO_MSG_ALG_LOADED is sent twice, and thus if the algorithm is
"crct10dif", lib/crc-t10dif.c replaces the tfm twice rather than once.
This occurs because CRYPTO_MSG_ALG_LOADED is sent using
crypto_probing_notify(), which tries to load "cryptomgr" if the
notification is not handled (NOTIFY_DONE). This doesn't make sense
because "cryptomgr" doesn't handle this notification.
Fix this by using crypto_notify() instead of crypto_probing_notify().
Fixes: dd8b083f9a5e ("crypto: api - Introduce notifier for new crypto algorithms")
Cc: <stable(a)vger.kernel.org> # v4.20+
Cc: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/algapi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/algapi.c b/crypto/algapi.c
index 69605e21af92..849254d7e627 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -403,7 +403,7 @@ static void crypto_wait_for_test(struct crypto_larval *larval)
err = wait_for_completion_killable(&larval->completion);
WARN_ON(err);
if (!err)
- crypto_probing_notify(CRYPTO_MSG_ALG_LOADED, larval);
+ crypto_notify(CRYPTO_MSG_ALG_LOADED, larval);
out:
crypto_larval_kill(&larval->alg);
--
2.26.0
Hi,
The folloewing build failures currently observed with stable release
candidates.
Thanks,
Guenter
---
v4.19.y:
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c:392:14: error: 'chipFeatures_PIPE_3D' undeclared here
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c:397:14: error: 'chipFeatures_PIPE_2D'
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c:402:14: error: 'chipFeatures_PIPE_VG' undeclared here
Culprit is 'drm/etnaviv: rework perfmon query infrastructure'. Applying
commit 15ff4a7b5841 ("etnaviv: perfmon: fix total and idle HI cyleces
readout") as well fixes the problem.
v5.4.y, v5.5.y:
drivers/mmc/host/sdhci-tegra.c: In function 'tegra_sdhci_set_timeout':
drivers/mmc/host/sdhci-tegra.c:1256:2: error:
implicit declaration of function '__sdhci_set_timeout'
Culprit is "sdhci: tegra: Implement Tegra specific set_timeout callback".
__sdhci_set_timeout() was introduced with commit 7d76ed77cfbd3 ("mmc:
sdhci: Refactor sdhci_set_timeout()"). Unfortunately, applying that patch
results in a conflict.
Guenter
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From af92bad615be75c6c0d1b1c5b48178360250a187 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)c-s.fr>
Date: Fri, 6 Mar 2020 15:09:40 +0000
Subject: [PATCH] powerpc/kasan: Fix kasan_remap_early_shadow_ro()
At the moment kasan_remap_early_shadow_ro() does nothing, because
k_end is 0 and k_cur < 0 is always true.
Change the test to k_cur != k_end, as done in
kasan_init_shadow_page_tables()
Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
Fixes: cbd18991e24f ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.15835073…
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index f19526e7d3dc..1a29cf469903 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -101,7 +101,7 @@ static void __init kasan_remap_early_shadow_ro(void)
kasan_populate_pte(kasan_early_shadow_pte, prot);
- for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_start & PAGE_MASK; k_cur != k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From aa4113340ae6c2811e046f08c2bc21011d20a072 Mon Sep 17 00:00:00 2001
From: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Date: Thu, 23 Jan 2020 11:19:25 +0000
Subject: [PATCH] powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
In the current implementation, the call to loadcam_multi() is wrapped
between switch_to_as1() and restore_to_as0() calls so, when it tries
to create its own temporary AS=1 TLB1 entry, it ends up duplicating
the existing one created by switch_to_as1(). Add a check to skip
creating the temporary entry if already running in AS=1.
Fixes: d9e1831a4202 ("powerpc/85xx: Load all early TLB entries at once")
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Acked-by: Scott Wood <oss(a)buserror.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index 2ca407cedbe7..eaeee402f96e 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -397,7 +397,7 @@ _GLOBAL(set_context)
* extern void loadcam_entry(unsigned int index)
*
* Load TLBCAM[index] entry in to the L2 CAM MMU
- * Must preserve r7, r8, r9, and r10
+ * Must preserve r7, r8, r9, r10 and r11
*/
_GLOBAL(loadcam_entry)
mflr r5
@@ -433,6 +433,10 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
*/
_GLOBAL(loadcam_multi)
mflr r8
+ /* Don't switch to AS=1 if already there */
+ mfmsr r11
+ andi. r11,r11,MSR_IS
+ bne 10f
/*
* Set up temporary TLB entry that is the same as what we're
@@ -458,6 +462,7 @@ _GLOBAL(loadcam_multi)
mtmsr r6
isync
+10:
mr r9,r3
add r10,r3,r4
2: bl loadcam_entry
@@ -466,6 +471,10 @@ _GLOBAL(loadcam_multi)
mr r3,r9
blt 2b
+ /* Don't return to AS=0 if we were in AS=1 at function start */
+ andi. r11,r11,MSR_IS
+ bne 3f
+
/* Return to AS=0 and clear the temporary entry */
mfmsr r6
rlwinm. r6,r6,0,~(MSR_IS|MSR_DS)
@@ -481,6 +490,7 @@ _GLOBAL(loadcam_multi)
tlbwe
isync
+3:
mtlr r8
blr
#endif
The patch titled
Subject: mm: call cond_resched() from deferred_init_memmap()
has been added to the -mm tree. Its filename is
mm-call-cond_resched-from-deferred_init_memmap.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-call-cond_resched-from-deferred…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-call-cond_resched-from-deferred…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm: call cond_resched() from deferred_init_memmap()
Now that deferred pages are initialized with interrupts enabled we can
replace touch_nmi_watchdog() with cond_resched(), as it was before
3a2d7fa8a3d5.
For now, we cannot do the same in deferred_grow_zone() as it is still
initializes pages with interrupts disabled.
This change fixes RCU problem described in
https://lkml.kernel.org/r/20200401104156.11564-2-david@redhat.com
[ 60.474005] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 60.475000] rcu: 1-...0: (0 ticks this GP) idle=02a/1/0x4000000000000000 softirq=1/1 fqs=15000
[ 60.475000] rcu: (detected by 0, t=60002 jiffies, g=-1199, q=1)
[ 60.475000] Sending NMI from CPU 0 to CPUs 1:
[ 1.760091] NMI backtrace for cpu 1
[ 1.760091] CPU: 1 PID: 20 Comm: pgdatinit0 Not tainted 4.18.0-147.9.1.el8_1.x86_64 #1
[ 1.760091] Hardware name: Red Hat KVM, BIOS 1.13.0-1.module+el8.2.0+5520+4e5817f3 04/01/2014
[ 1.760091] RIP: 0010:__init_single_page.isra.65+0x10/0x4f
[ 1.760091] Code: 48 83 cf 63 48 89 f8 0f 1f 40 00 48 89 c6 48 89 d7 e8 6b 18 80 ff 66 90 5b c3 31 c0 b9 10 00 00 00 49 89 f8 48 c1 e6 33 f3 ab <b8> 07 00 00 00 48 c1 e2 36 41 c7 40 34 01 00 00 00 48 c1 e0 33 41
[ 1.760091] RSP: 0000:ffffba783123be40 EFLAGS: 00000006
[ 1.760091] RAX: 0000000000000000 RBX: fffffad34405e300 RCX: 0000000000000000
[ 1.760091] RDX: 0000000000000000 RSI: 0010000000000000 RDI: fffffad34405e340
[ 1.760091] RBP: 0000000033f3177e R08: fffffad34405e300 R09: 0000000000000002
[ 1.760091] R10: 000000000000002b R11: ffff98afb691a500 R12: 0000000000000002
[ 1.760091] R13: 0000000000000000 R14: 000000003f03ea00 R15: 000000003e10178c
[ 1.760091] FS: 0000000000000000(0000) GS:ffff9c9ebeb00000(0000) knlGS:0000000000000000
[ 1.760091] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.760091] CR2: 00000000ffffffff CR3: 000000a1cf20a001 CR4: 00000000003606e0
[ 1.760091] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.760091] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.760091] Call Trace:
[ 1.760091] deferred_init_pages+0x8f/0xbf
[ 1.760091] deferred_init_memmap+0x184/0x29d
[ 1.760091] ? deferred_free_pages.isra.97+0xba/0xba
[ 1.760091] kthread+0x112/0x130
[ 1.760091] ? kthread_flush_work_fn+0x10/0x10
[ 1.760091] ret_from_fork+0x35/0x40
[ 89.123011] node 0 initialised, 1055935372 pages in 88650ms
Link: http://lkml.kernel.org/r/20200403140952.17177-4-pasha.tatashin@soleen.com
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Reported-by: Yiqian Wei <yiwei(a)redhat.com>
Tested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-call-cond_resched-from-deferred_init_memmap
+++ a/mm/page_alloc.c
@@ -1869,7 +1869,7 @@ static int __init deferred_init_memmap(v
*/
while (spfn < epfn) {
nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
- touch_nmi_watchdog();
+ cond_resched();
}
zone_empty:
/* Sanity check that the next zone really is unpopulated */
_
Patches currently in -mm which might be from pasha.tatashin(a)soleen.com are
mm-initialize-deferred-pages-with-interrupts-enabled.patch
mm-call-cond_resched-from-deferred_init_memmap.patch
The patch titled
Subject: mm: initialize deferred pages with interrupts enabled
has been added to the -mm tree. Its filename is
mm-initialize-deferred-pages-with-interrupts-enabled.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-initialize-deferred-pages-with-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-initialize-deferred-pages-with-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm: initialize deferred pages with interrupts enabled
Initializing struct pages is a long task and keeping interrupts disabled
for the duration of this operation introduces a number of problems.
1. jiffies are not updated for long period of time, and thus incorrect time
is reported. See proposed solution and discussion here:
lkml/20200311123848.118638-1-shile.zhang(a)linux.alibaba.com
2. It prevents farther improving deferred page initialization by allowing
intra-node multi-threading.
We are keeping interrupts disabled to solve a rather theoretical problem
that was never observed in real world (See 3a2d7fa8a3d5).
Let's keep interrupts enabled. In case we ever encounter a scenario where
an interrupt thread wants to allocate large amount of memory this early in
boot we can deal with that by growing zone (see deferred_grow_zone()) by
the needed amount before starting deferred_init_memmap() threads.
Before:
[ 1.232459] node 0 initialised, 12058412 pages in 1ms
After:
[ 1.632580] node 0 initialised, 12051227 pages in 436ms
Link: http://lkml.kernel.org/r/20200403140952.17177-3-pasha.tatashin@soleen.com
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Reported-by: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Yiqian Wei <yiwei(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mmzone.h | 2 ++
mm/page_alloc.c | 20 +++++++-------------
2 files changed, 9 insertions(+), 13 deletions(-)
--- a/include/linux/mmzone.h~mm-initialize-deferred-pages-with-interrupts-enabled
+++ a/include/linux/mmzone.h
@@ -678,6 +678,8 @@ typedef struct pglist_data {
/*
* Must be held any time you expect node_start_pfn,
* node_present_pages, node_spanned_pages or nr_zones to stay constant.
+ * Also synchronizes pgdat->first_deferred_pfn during deferred page
+ * init.
*
* pgdat_resize_lock() and pgdat_resize_unlock() are provided to
* manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
--- a/mm/page_alloc.c~mm-initialize-deferred-pages-with-interrupts-enabled
+++ a/mm/page_alloc.c
@@ -1843,6 +1843,13 @@ static int __init deferred_init_memmap(v
BUG_ON(pgdat->first_deferred_pfn > pgdat_end_pfn(pgdat));
pgdat->first_deferred_pfn = ULONG_MAX;
+ /*
+ * Once we unlock here, the zone cannot be grown anymore, thus if an
+ * interrupt thread must allocate this early in boot, zone must be
+ * pre-grown prior to start of deferred page initialization.
+ */
+ pgdat_resize_unlock(pgdat, &flags);
+
/* Only the highest zone is deferred so find it */
for (zid = 0; zid < MAX_NR_ZONES; zid++) {
zone = pgdat->node_zones + zid;
@@ -1865,8 +1872,6 @@ static int __init deferred_init_memmap(v
touch_nmi_watchdog();
}
zone_empty:
- pgdat_resize_unlock(pgdat, &flags);
-
/* Sanity check that the next zone really is unpopulated */
WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone));
@@ -1909,17 +1914,6 @@ deferred_grow_zone(struct zone *zone, un
pgdat_resize_lock(pgdat, &flags);
/*
- * If deferred pages have been initialized while we were waiting for
- * the lock, return true, as the zone was grown. The caller will retry
- * this zone. We won't return to this function since the caller also
- * has this static branch.
- */
- if (!static_branch_unlikely(&deferred_pages)) {
- pgdat_resize_unlock(pgdat, &flags);
- return true;
- }
-
- /*
* If someone grew this zone while we were waiting for spinlock, return
* true, as there might be enough pages already.
*/
_
Patches currently in -mm which might be from pasha.tatashin(a)soleen.com are
mm-initialize-deferred-pages-with-interrupts-enabled.patch
mm-call-cond_resched-from-deferred_init_memmap.patch
The patch titled
Subject: mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init
has been added to the -mm tree. Its filename is
mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-call-touch_nmi_watchdog-on-max-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-call-touch_nmi_watchdog-on-max-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Subject: mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init
Patch series "initialize deferred pages with interrupts enabled", v4.
Keep interrupts enabled during deferred page initialization in order to
make code more modular and allow jiffies to update.
Original approach, and discussion can be found here:
http://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.…
This patch (of 3):
deferred_init_memmap() disables interrupts the entire time, so it calls
touch_nmi_watchdog() periodically to avoid soft lockup splats. Soon it
will run with interrupts enabled, at which point cond_resched() should be
used instead.
deferred_grow_zone() makes the same watchdog calls through code shared
with deferred init but will continue to run with interrupts disabled, so
it can't call cond_resched().
Pull the watchdog calls up to these two places to allow the first to be
changed later, independently of the second. The frequency reduces from
twice per pageblock (init and free) to once per max order block.
Link: http://lkml.kernel.org/r/20200403140952.17177-2-pasha.tatashin@soleen.com
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Signed-off-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Shile Zhang <shile.zhang(a)linux.alibaba.com>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: Yiqian Wei <yiwei(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/mm/page_alloc.c~mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init
+++ a/mm/page_alloc.c
@@ -1692,7 +1692,6 @@ static void __init deferred_free_pages(u
} else if (!(pfn & nr_pgmask)) {
deferred_free_range(pfn - nr_free, nr_free);
nr_free = 1;
- touch_nmi_watchdog();
} else {
nr_free++;
}
@@ -1722,7 +1721,6 @@ static unsigned long __init deferred_in
continue;
} else if (!page || !(pfn & nr_pgmask)) {
page = pfn_to_page(pfn);
- touch_nmi_watchdog();
} else {
page++;
}
@@ -1862,8 +1860,10 @@ static int __init deferred_init_memmap(v
* that we can avoid introducing any issues with the buddy
* allocator.
*/
- while (spfn < epfn)
+ while (spfn < epfn) {
nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+ touch_nmi_watchdog();
+ }
zone_empty:
pgdat_resize_unlock(pgdat, &flags);
@@ -1947,6 +1947,7 @@ deferred_grow_zone(struct zone *zone, un
first_deferred_pfn = spfn;
nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+ touch_nmi_watchdog();
/* We should only stop along section boundaries */
if ((first_deferred_pfn ^ spfn) < PAGES_PER_SECTION)
_
Patches currently in -mm which might be from daniel.m.jordan(a)oracle.com are
mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6a13a0d7b4d1171ef9b80ad69abc37e1daa941b3 Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Tue, 24 Mar 2020 16:34:48 +0900
Subject: [PATCH] ftrace/kprobe: Show the maxactive number on kprobe_events
Show maxactive parameter on kprobe_events.
This allows user to save the current configuration and
restore it without losing maxactive parameter.
Link: http://lkml.kernel.org/r/4762764a-6df7-bc93-ed60-e336146dce1f@gmail.com
Link: http://lkml.kernel.org/r/158503528846.22706.5549974121212526020.stgit@devno…
Cc: stable(a)vger.kernel.org
Fixes: 696ced4fb1d76 ("tracing/kprobes: expose maxactive for kretprobe in kprobe_events")
Reported-by: Taeung Song <treeze.taeung(a)gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 362cca52f5de..d0568af4a0ef 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1078,6 +1078,8 @@ static int trace_kprobe_show(struct seq_file *m, struct dyn_event *ev)
int i;
seq_putc(m, trace_kprobe_is_return(tk) ? 'r' : 'p');
+ if (trace_kprobe_is_return(tk) && tk->rp.maxactive)
+ seq_printf(m, "%d", tk->rp.maxactive);
seq_printf(m, ":%s/%s", trace_probe_group_name(&tk->tp),
trace_probe_name(&tk->tp));
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 33238c50451596be86db1505ab65fee5172844d0 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Wed, 18 Mar 2020 20:33:37 +0100
Subject: [PATCH] perf/core: Fix event cgroup tracking
Song reports that installing cgroup events is broken since:
db0503e4f675 ("perf/core: Optimize perf_install_in_event()")
The problem being that cgroup events try to track cpuctx->cgrp even
for disabled events, which is pointless and actively harmful since the
above commit. Rework the code to have explicit enable/disable hooks
for cgroup events, such that we can limit cgroup tracking to active
events.
More specifically, since the above commit disabled events are no
longer added to their context from the 'right' CPU, and we can't
access things like the current cgroup for a remote CPU.
Cc: <stable(a)vger.kernel.org> # v5.5+
Fixes: db0503e4f675 ("perf/core: Optimize perf_install_in_event()")
Reported-by: Song Liu <songliubraving(a)fb.com>
Tested-by: Song Liu <songliubraving(a)fb.com>
Reviewed-by: Song Liu <songliubraving(a)fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Link: https://lkml.kernel.org/r/20200318193337.GB20760@hirez.programming.kicks-as…
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 55e44417f66d..7afd0b503406 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -983,16 +983,10 @@ perf_cgroup_set_shadow_time(struct perf_event *event, u64 now)
event->shadow_ctx_time = now - t->timestamp;
}
-/*
- * Update cpuctx->cgrp so that it is set when first cgroup event is added and
- * cleared when last cgroup event is removed.
- */
static inline void
-list_update_cgroup_event(struct perf_event *event,
- struct perf_event_context *ctx, bool add)
+perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ctx)
{
struct perf_cpu_context *cpuctx;
- struct list_head *cpuctx_entry;
if (!is_cgroup_event(event))
return;
@@ -1009,28 +1003,41 @@ list_update_cgroup_event(struct perf_event *event,
* because if the first would mismatch, the second would not try again
* and we would leave cpuctx->cgrp unset.
*/
- if (add && !cpuctx->cgrp) {
+ if (ctx->is_active && !cpuctx->cgrp) {
struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
cpuctx->cgrp = cgrp;
}
- if (add && ctx->nr_cgroups++)
+ if (ctx->nr_cgroups++)
return;
- else if (!add && --ctx->nr_cgroups)
+
+ list_add(&cpuctx->cgrp_cpuctx_entry,
+ per_cpu_ptr(&cgrp_cpuctx_list, event->cpu));
+}
+
+static inline void
+perf_cgroup_event_disable(struct perf_event *event, struct perf_event_context *ctx)
+{
+ struct perf_cpu_context *cpuctx;
+
+ if (!is_cgroup_event(event))
return;
- /* no cgroup running */
- if (!add)
+ /*
+ * Because cgroup events are always per-cpu events,
+ * @ctx == &cpuctx->ctx.
+ */
+ cpuctx = container_of(ctx, struct perf_cpu_context, ctx);
+
+ if (--ctx->nr_cgroups)
+ return;
+
+ if (ctx->is_active && cpuctx->cgrp)
cpuctx->cgrp = NULL;
- cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
- if (add)
- list_add(cpuctx_entry,
- per_cpu_ptr(&cgrp_cpuctx_list, event->cpu));
- else
- list_del(cpuctx_entry);
+ list_del(&cpuctx->cgrp_cpuctx_entry);
}
#else /* !CONFIG_CGROUP_PERF */
@@ -1096,11 +1103,14 @@ static inline u64 perf_cgroup_event_time(struct perf_event *event)
}
static inline void
-list_update_cgroup_event(struct perf_event *event,
- struct perf_event_context *ctx, bool add)
+perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ctx)
{
}
+static inline void
+perf_cgroup_event_disable(struct perf_event *event, struct perf_event_context *ctx)
+{
+}
#endif
/*
@@ -1791,13 +1801,14 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
add_event_to_groups(event, ctx);
}
- list_update_cgroup_event(event, ctx, true);
-
list_add_rcu(&event->event_entry, &ctx->event_list);
ctx->nr_events++;
if (event->attr.inherit_stat)
ctx->nr_stat++;
+ if (event->state > PERF_EVENT_STATE_OFF)
+ perf_cgroup_event_enable(event, ctx);
+
ctx->generation++;
}
@@ -1976,8 +1987,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
event->attach_state &= ~PERF_ATTACH_CONTEXT;
- list_update_cgroup_event(event, ctx, false);
-
ctx->nr_events--;
if (event->attr.inherit_stat)
ctx->nr_stat--;
@@ -1994,8 +2003,10 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
* of error state is by explicit re-enabling
* of the event
*/
- if (event->state > PERF_EVENT_STATE_OFF)
+ if (event->state > PERF_EVENT_STATE_OFF) {
+ perf_cgroup_event_disable(event, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_OFF);
+ }
ctx->generation++;
}
@@ -2226,6 +2237,7 @@ event_sched_out(struct perf_event *event,
if (READ_ONCE(event->pending_disable) >= 0) {
WRITE_ONCE(event->pending_disable, -1);
+ perf_cgroup_event_disable(event, ctx);
state = PERF_EVENT_STATE_OFF;
}
perf_event_set_state(event, state);
@@ -2363,6 +2375,7 @@ static void __perf_event_disable(struct perf_event *event,
event_sched_out(event, cpuctx, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_OFF);
+ perf_cgroup_event_disable(event, ctx);
}
/*
@@ -2746,7 +2759,7 @@ static int __perf_install_in_context(void *info)
}
#ifdef CONFIG_CGROUP_PERF
- if (is_cgroup_event(event)) {
+ if (event->state > PERF_EVENT_STATE_OFF && is_cgroup_event(event)) {
/*
* If the current cgroup doesn't match the event's
* cgroup, we should not try to schedule it.
@@ -2906,6 +2919,7 @@ static void __perf_event_enable(struct perf_event *event,
ctx_sched_out(ctx, cpuctx, EVENT_TIME);
perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
+ perf_cgroup_event_enable(event, ctx);
if (!ctx->is_active)
return;
@@ -3616,8 +3630,10 @@ static int merge_sched_in(struct perf_event *event, void *data)
}
if (event->state == PERF_EVENT_STATE_INACTIVE) {
- if (event->attr.pinned)
+ if (event->attr.pinned) {
+ perf_cgroup_event_disable(event, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+ }
*can_add_hw = 0;
ctx->rotate_necessary = 1;
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 0b72a251bf92ca2378530fa1f9b35a71830ab51c Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 31 Mar 2020 16:23:48 +0100
Subject: [PATCH] drm/i915/gt: Fill all the unused space in the GGTT
When we allocate space in the GGTT we may have to allocate a larger
region than will be populated by the object to accommodate fencing. Make
sure that this space beyond the end of the buffer points safely into
scratch space, in case the HW tries to access it anyway (e.g. fenced
access to the last tile row).
v2: Preemptively / conservatively guard gen6 ggtt as well.
Reported-by: Imre Deak <imre.deak(a)intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1554
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Imre Deak <imre.deak(a)intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Reviewed-by: Imre Deak <imre.deak(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331152348.26946-1-chris@…
(cherry picked from commit 4d6c18590870fbac1e65dde5e01e621c8e0ca096)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index aed498a0d032..4c5a209cb669 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -191,10 +191,11 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
enum i915_cache_level level,
u32 flags)
{
- struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
- struct sgt_iter sgt_iter;
- gen8_pte_t __iomem *gtt_entries;
const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, 0);
+ struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
+ gen8_pte_t __iomem *gte;
+ gen8_pte_t __iomem *end;
+ struct sgt_iter iter;
dma_addr_t addr;
/*
@@ -202,10 +203,17 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
* not to allow the user to override access to a read only page.
*/
- gtt_entries = (gen8_pte_t __iomem *)ggtt->gsm;
- gtt_entries += vma->node.start / I915_GTT_PAGE_SIZE;
- for_each_sgt_daddr(addr, sgt_iter, vma->pages)
- gen8_set_pte(gtt_entries++, pte_encode | addr);
+ gte = (gen8_pte_t __iomem *)ggtt->gsm;
+ gte += vma->node.start / I915_GTT_PAGE_SIZE;
+ end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
+
+ for_each_sgt_daddr(addr, iter, vma->pages)
+ gen8_set_pte(gte++, pte_encode | addr);
+ GEM_BUG_ON(gte > end);
+
+ /* Fill the allocated but "unused" space beyond the end of the buffer */
+ while (gte < end)
+ gen8_set_pte(gte++, vm->scratch[0].encode);
/*
* We want to flush the TLBs only after we're certain all the PTE
@@ -241,13 +249,22 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
- gen6_pte_t __iomem *entries = (gen6_pte_t __iomem *)ggtt->gsm;
- unsigned int i = vma->node.start / I915_GTT_PAGE_SIZE;
+ gen6_pte_t __iomem *gte;
+ gen6_pte_t __iomem *end;
struct sgt_iter iter;
dma_addr_t addr;
+ gte = (gen6_pte_t __iomem *)ggtt->gsm;
+ gte += vma->node.start / I915_GTT_PAGE_SIZE;
+ end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
+
for_each_sgt_daddr(addr, iter, vma->pages)
- iowrite32(vm->pte_encode(addr, level, flags), &entries[i++]);
+ iowrite32(vm->pte_encode(addr, level, flags), gte++);
+ GEM_BUG_ON(gte > end);
+
+ /* Fill the allocated but "unused" space beyond the end of the buffer */
+ while (gte < end)
+ iowrite32(vm->scratch[0].encode, gte++);
/*
* We want to flush the TLBs only after we're certain all the PTE
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6e8a36c13382b7165d23928caee8d91c1b301142 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Mon, 30 Mar 2020 18:22:44 +0300
Subject: [PATCH] drm/i915/icl+: Don't enable DDI IO power on a TypeC port in
TBT mode
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The DDI IO power well must not be enabled for a TypeC port in TBT mode,
ensure this during driver loading/system resume.
This gets rid of error messages like
[drm] *ERROR* power well DDI E TC2 IO state mismatch (refcount 1/enabled 0)
and avoids leaking the power ref when disabling the output.
Cc: <stable(a)vger.kernel.org> # v5.4+
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: José Roberto de Souza <jose.souza(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330152244.11316-1-imre.d…
(cherry picked from commit f77a2db27f26c3ccba0681f7e89fef083718f07f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 73d0f4648c06..5202fdec8e0a 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -1869,7 +1869,11 @@ static void intel_ddi_get_power_domains(struct intel_encoder *encoder,
return;
dig_port = enc_to_dig_port(encoder);
- intel_display_power_get(dev_priv, dig_port->ddi_io_power_domain);
+
+ if (!intel_phy_is_tc(dev_priv, phy) ||
+ dig_port->tc_mode != TC_PORT_TBT_ALT)
+ intel_display_power_get(dev_priv,
+ dig_port->ddi_io_power_domain);
/*
* AUX power is only needed for (e)DP mode, and for HDMI mode on TC
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 487eca11a321ef33bcf4ca5adb3c0c4954db1b58 Mon Sep 17 00:00:00 2001
From: Prike Liang <Prike.Liang(a)amd.com>
Date: Tue, 7 Apr 2020 20:21:26 +0800
Subject: [PATCH] drm/amdgpu: fix gfx hang during suspend with video playback
(v2)
The system will be hang up during S3 suspend because of SMU is pending
for GC not respose the register CP_HQD_ACTIVE access request.This issue
root cause of accessing the GC register under enter GFX CGGPG and can
be fixed by disable GFX CGPG before perform suspend.
v2: Use disable the GFX CGPG instead of RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang(a)amd.com>
Reviewed-by: Huang Rui <ray.huang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index faa3e7102156..559dc24ef436 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2340,8 +2340,6 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
{
int i, r;
- amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
- amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.valid)
@@ -3356,6 +3354,9 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
}
}
+ amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
+ amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
+
amdgpu_amdkfd_suspend(adev, !fbcon);
amdgpu_ras_suspend(adev);
The patch below does not apply to the 5.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8732fe46b20c951493bfc4dba0ad08efdf41de81 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Wed, 22 Jan 2020 14:43:20 -0500
Subject: [PATCH] drm/dp_mst: Fix clearing payload state on topology disable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The issues caused by:
commit 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology
mgr")
Prompted me to take a closer look at how we clear the payload state in
general when disabling the topology, and it turns out there's actually
two subtle issues here.
The first is that we're not grabbing &mgr.payload_lock when clearing the
payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
lock order is &mgr.payload_lock -> &mgr.lock (because we always want
&mgr.lock to be the inner-most lock so topology validation always
works), this makes perfect sense. It also means that -technically- there
could be racing between someone calling
drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
modeset occurring that's modifying the payload state at the same time.
The second is the more obvious issue that Wayne Lin discovered, that
we're not clearing proposed_payloads when disabling the topology.
I actually can't see any obvious places where the racing caused by the
first issue would break something, and it could be that some of our
higher-level locks already prevent this by happenstance, but better safe
then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
first grabs &mgr.payload_lock followed by &mgr.lock so that we never
race when modifying the payload state. Then, we also clear
proposed_payloads to fix the original issue of enabling a new topology
with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
structures, but those are getting destroyed along with the ports anyway.
Changes since v1:
* Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
vsyrjala
Cc: Sean Paul <sean(a)poorly.run>
Cc: Wayne Lin <Wayne.Lin(a)amd.com>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122194321.14953-1-lyude@…
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index e78c73e975ed..4104f15f4594 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3447,6 +3447,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
int ret = 0;
struct drm_dp_mst_branch *mstb = NULL;
+ mutex_lock(&mgr->payload_lock);
mutex_lock(&mgr->lock);
if (mst_state == mgr->mst_state)
goto out_unlock;
@@ -3505,7 +3506,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
/* this can fail if the device is gone */
drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
ret = 0;
- memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
+ memset(mgr->payloads, 0,
+ mgr->max_payloads * sizeof(mgr->payloads[0]));
+ memset(mgr->proposed_vcpis, 0,
+ mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
mgr->payload_mask = 0;
set_bit(0, &mgr->payload_mask);
mgr->vcpi_mask = 0;
@@ -3514,6 +3518,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
out_unlock:
mutex_unlock(&mgr->lock);
+ mutex_unlock(&mgr->payload_lock);
if (mstb)
drm_dp_mst_topology_put_mstb(mstb);
return ret;
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3e138a63d6674a4567a018a31e467567c40b14d5 Mon Sep 17 00:00:00 2001
From: Torsten Duwe <duwe(a)lst.de>
Date: Tue, 18 Feb 2020 16:57:44 +0100
Subject: [PATCH] drm/bridge: analogix-anx78xx: Fix drm_dp_link helper removal
drm_dp_link_rate_to_bw_code and ...bw_code_to_link_rate simply divide by
and multiply with 27000, respectively. Avoid an overflow in the u8 dpcd[0]
and the multiply+divide alltogether.
Signed-off-by: Torsten Duwe <duwe(a)lst.de>
Fixes: ff1e8fb68ea0 ("drm/bridge: analogix-anx78xx: Avoid drm_dp_link helpers")
Cc: Thierry Reding <treding(a)nvidia.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Andrzej Hajda <a.hajda(a)samsung.com>
Cc: Neil Armstrong <narmstrong(a)baylibre.com>
Cc: Laurent Pinchart <Laurent.pinchart(a)ideasonboard.com>
Cc: Jonas Karlman <jonas(a)kwiboo.se>
Cc: Jernej Skrabec <jernej.skrabec(a)siol.net>
Cc: <stable(a)vger.kernel.org> # v5.5+
Reviewed-by: Thierry Reding <treding(a)nvidia.com>
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200218155744.9675368BE1@ver…
diff --git a/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c b/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c
index 41867be03751..864423f59d66 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c
@@ -722,10 +722,9 @@ static int anx78xx_dp_link_training(struct anx78xx *anx78xx)
if (err)
return err;
- dpcd[0] = drm_dp_max_link_rate(anx78xx->dpcd);
- dpcd[0] = drm_dp_link_rate_to_bw_code(dpcd[0]);
err = regmap_write(anx78xx->map[I2C_IDX_TX_P0],
- SP_DP_MAIN_LINK_BW_SET_REG, dpcd[0]);
+ SP_DP_MAIN_LINK_BW_SET_REG,
+ anx78xx->dpcd[DP_MAX_LINK_RATE]);
if (err)
return err;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 835214f5d5f516a38069bc077c879c7da00d6108 Mon Sep 17 00:00:00 2001
From: James Smart <jsmart2021(a)gmail.com>
Date: Mon, 27 Jan 2020 16:23:03 -0800
Subject: [PATCH] scsi: lpfc: Fix broken Credit Recovery after driver load
When driver is set to enable bb credit recovery, the switch displayed the
setting as inactive. If the link bounces, it switches to Active.
During link up processing, the driver currently does a MBX_READ_SPARAM
followed by a MBX_CONFIG_LINK. These mbox commands are queued to be
executed, one at a time and the completion is processed by the worker
thread. Since the MBX_READ_SPARAM is done BEFORE the MBX_CONFIG_LINK, the
BB_SC_N bit is never set the the returned values. BB Credit recovery status
only gets set after the driver requests the feature in CONFIG_LINK, which
is done after the link up. Thus the ordering of READ_SPARAM needs to follow
the CONFIG_LINK.
Fix by reordering so that READ_SPARAM is done after CONFIG_LINK. Added a
HBA_DEFER_FLOGI flag so that any FLOGI handling waits until after the
READ_SPARAM is done so that the proper BB credit value is set in the FLOGI
payload.
Fixes: 6bfb16208298 ("scsi: lpfc: Fix configuration of BB credit recovery in service parameters")
Cc: <stable(a)vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200128002312.16346-4-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy(a)broadcom.com>
Signed-off-by: James Smart <jsmart2021(a)gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 04d73e2be373..3f2cb17c4574 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -749,6 +749,7 @@ struct lpfc_hba {
* capability
*/
#define HBA_FLOGI_ISSUED 0x100000 /* FLOGI was issued */
+#define HBA_DEFER_FLOGI 0x800000 /* Defer FLOGI till read_sparm cmpl */
uint32_t fcp_ring_in_use; /* When polling test if intr-hndlr active*/
struct lpfc_dmabuf slim2p;
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index dcc8999c6a68..6a2bdae0e52a 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -1163,13 +1163,16 @@ lpfc_mbx_cmpl_local_config_link(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
}
/* Start discovery by sending a FLOGI. port_state is identically
- * LPFC_FLOGI while waiting for FLOGI cmpl
+ * LPFC_FLOGI while waiting for FLOGI cmpl. Check if sending
+ * the FLOGI is being deferred till after MBX_READ_SPARAM completes.
*/
- if (vport->port_state != LPFC_FLOGI)
- lpfc_initial_flogi(vport);
- else if (vport->fc_flag & FC_PT2PT)
- lpfc_disc_start(vport);
-
+ if (vport->port_state != LPFC_FLOGI) {
+ if (!(phba->hba_flag & HBA_DEFER_FLOGI))
+ lpfc_initial_flogi(vport);
+ } else {
+ if (vport->fc_flag & FC_PT2PT)
+ lpfc_disc_start(vport);
+ }
return;
out:
@@ -3094,6 +3097,14 @@ lpfc_mbx_cmpl_read_sparam(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
lpfc_mbuf_free(phba, mp->virt, mp->phys);
kfree(mp);
mempool_free(pmb, phba->mbox_mem_pool);
+
+ /* Check if sending the FLOGI is being deferred to after we get
+ * up to date CSPs from MBX_READ_SPARAM.
+ */
+ if (phba->hba_flag & HBA_DEFER_FLOGI) {
+ lpfc_initial_flogi(vport);
+ phba->hba_flag &= ~HBA_DEFER_FLOGI;
+ }
return;
out:
@@ -3224,6 +3235,23 @@ lpfc_mbx_process_link_up(struct lpfc_hba *phba, struct lpfc_mbx_read_top *la)
}
lpfc_linkup(phba);
+ sparam_mbox = NULL;
+
+ if (!(phba->hba_flag & HBA_FCOE_MODE)) {
+ cfglink_mbox = mempool_alloc(phba->mbox_mem_pool, GFP_KERNEL);
+ if (!cfglink_mbox)
+ goto out;
+ vport->port_state = LPFC_LOCAL_CFG_LINK;
+ lpfc_config_link(phba, cfglink_mbox);
+ cfglink_mbox->vport = vport;
+ cfglink_mbox->mbox_cmpl = lpfc_mbx_cmpl_local_config_link;
+ rc = lpfc_sli_issue_mbox(phba, cfglink_mbox, MBX_NOWAIT);
+ if (rc == MBX_NOT_FINISHED) {
+ mempool_free(cfglink_mbox, phba->mbox_mem_pool);
+ goto out;
+ }
+ }
+
sparam_mbox = mempool_alloc(phba->mbox_mem_pool, GFP_KERNEL);
if (!sparam_mbox)
goto out;
@@ -3244,20 +3272,7 @@ lpfc_mbx_process_link_up(struct lpfc_hba *phba, struct lpfc_mbx_read_top *la)
goto out;
}
- if (!(phba->hba_flag & HBA_FCOE_MODE)) {
- cfglink_mbox = mempool_alloc(phba->mbox_mem_pool, GFP_KERNEL);
- if (!cfglink_mbox)
- goto out;
- vport->port_state = LPFC_LOCAL_CFG_LINK;
- lpfc_config_link(phba, cfglink_mbox);
- cfglink_mbox->vport = vport;
- cfglink_mbox->mbox_cmpl = lpfc_mbx_cmpl_local_config_link;
- rc = lpfc_sli_issue_mbox(phba, cfglink_mbox, MBX_NOWAIT);
- if (rc == MBX_NOT_FINISHED) {
- mempool_free(cfglink_mbox, phba->mbox_mem_pool);
- goto out;
- }
- } else {
+ if (phba->hba_flag & HBA_FCOE_MODE) {
vport->port_state = LPFC_VPORT_UNKNOWN;
/*
* Add the driver's default FCF record at FCF index 0 now. This
@@ -3314,6 +3329,10 @@ lpfc_mbx_process_link_up(struct lpfc_hba *phba, struct lpfc_mbx_read_top *la)
}
/* Reset FCF roundrobin bmask for new discovery */
lpfc_sli4_clear_fcf_rr_bmask(phba);
+ } else {
+ if (phba->bbcredit_support && phba->cfg_enable_bbcr &&
+ !(phba->link_flag & LS_LOOPBACK_MODE))
+ phba->hba_flag |= HBA_DEFER_FLOGI;
}
/* Prepare for LINK up registrations */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b8fdd090376a7a46d17db316638fe54b965c2fb0 Mon Sep 17 00:00:00 2001
From: Bob Liu <bob.liu(a)oracle.com>
Date: Tue, 24 Mar 2020 21:22:45 +0800
Subject: [PATCH] dm zoned: remove duplicate nr_rnd_zones increase in
dmz_init_zone()
zmd->nr_rnd_zones was increased twice by mistake. The other place it
is increased in dmz_init_zone() is the only one needed:
1131 zmd->nr_useable_zones++;
1132 if (dmz_is_rnd(zone)) {
1133 zmd->nr_rnd_zones++;
^^^
Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target")
Cc: stable(a)vger.kernel.org
Signed-off-by: Bob Liu <bob.liu(a)oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 516c7b671d25..369de15c4e80 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -1109,7 +1109,6 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
switch (blkz->type) {
case BLK_ZONE_TYPE_CONVENTIONAL:
set_bit(DMZ_RND, &zone->flags);
- zmd->nr_rnd_zones++;
break;
case BLK_ZONE_TYPE_SEQWRITE_REQ:
case BLK_ZONE_TYPE_SEQWRITE_PREF:
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f142b6a7b573bde6cff926f246da05652c61eb4 Mon Sep 17 00:00:00 2001
From: Andrei Botila <andrei.botila(a)nxp.com>
Date: Fri, 28 Feb 2020 12:46:48 +0200
Subject: [PATCH] crypto: caam - update xts sector size for large input length
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Since in the software implementation of XTS-AES there is
no notion of sector every input length is processed the same way.
CAAM implementation has the notion of sector which causes different
results between the software implementation and the one in CAAM
for input lengths bigger than 512 bytes.
Increase sector size to maximum value on 16 bits.
Fixes: c6415a6016bf ("crypto: caam - add support for acipher xts(aes)")
Cc: <stable(a)vger.kernel.org> # v4.12+
Signed-off-by: Andrei Botila <andrei.botila(a)nxp.com>
Reviewed-by: Horia Geantă <horia.geanta(a)nxp.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/caam/caamalg_desc.c b/drivers/crypto/caam/caamalg_desc.c
index 6171a8118b5a..d6c58184bb57 100644
--- a/drivers/crypto/caam/caamalg_desc.c
+++ b/drivers/crypto/caam/caamalg_desc.c
@@ -1524,7 +1524,13 @@ EXPORT_SYMBOL(cnstr_shdsc_skcipher_decap);
*/
void cnstr_shdsc_xts_skcipher_encap(u32 * const desc, struct alginfo *cdata)
{
- __be64 sector_size = cpu_to_be64(512);
+ /*
+ * Set sector size to a big value, practically disabling
+ * sector size segmentation in xts implementation. We cannot
+ * take full advantage of this HW feature with existing
+ * crypto API / dm-crypt SW architecture.
+ */
+ __be64 sector_size = cpu_to_be64(BIT(15));
u32 *key_jump_cmd;
init_sh_desc(desc, HDR_SHARE_SERIAL | HDR_SAVECTX);
@@ -1577,7 +1583,13 @@ EXPORT_SYMBOL(cnstr_shdsc_xts_skcipher_encap);
*/
void cnstr_shdsc_xts_skcipher_decap(u32 * const desc, struct alginfo *cdata)
{
- __be64 sector_size = cpu_to_be64(512);
+ /*
+ * Set sector size to a big value, practically disabling
+ * sector size segmentation in xts implementation. We cannot
+ * take full advantage of this HW feature with existing
+ * crypto API / dm-crypt SW architecture.
+ */
+ __be64 sector_size = cpu_to_be64(BIT(15));
u32 *key_jump_cmd;
init_sh_desc(desc, HDR_SHARE_SERIAL | HDR_SAVECTX);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9fc06ff56845cc5ccafec52f545fc2e08d22f849 Mon Sep 17 00:00:00 2001
From: Nikos Tsironis <ntsironis(a)arrikto.com>
Date: Fri, 27 Mar 2020 16:01:10 +0200
Subject: [PATCH] dm clone: Add missing casts to prevent overflows and data
corruption
Add missing casts when converting from regions to sectors.
In case BITS_PER_LONG == 32, the lack of the appropriate casts can lead
to overflows and miscalculation of the device sector.
As a result, we could end up discarding and/or copying the wrong parts
of the device, thus corrupting the device's data.
Fixes: 7431b7835f55 ("dm: add clone target")
Cc: stable(a)vger.kernel.org # v5.4+
Signed-off-by: Nikos Tsironis <ntsironis(a)arrikto.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-clone-target.c b/drivers/md/dm-clone-target.c
index 6ee85fb3388a..ca5020c58f7c 100644
--- a/drivers/md/dm-clone-target.c
+++ b/drivers/md/dm-clone-target.c
@@ -282,7 +282,7 @@ static bool bio_triggers_commit(struct clone *clone, struct bio *bio)
/* Get the address of the region in sectors */
static inline sector_t region_to_sector(struct clone *clone, unsigned long region_nr)
{
- return (region_nr << clone->region_shift);
+ return ((sector_t)region_nr << clone->region_shift);
}
/* Get the region number of the bio */
@@ -471,7 +471,7 @@ static void complete_discard_bio(struct clone *clone, struct bio *bio, bool succ
if (test_bit(DM_CLONE_DISCARD_PASSDOWN, &clone->flags) && success) {
remap_to_dest(clone, bio);
bio_region_range(clone, bio, &rs, &nr_regions);
- trim_bio(bio, rs << clone->region_shift,
+ trim_bio(bio, region_to_sector(clone, rs),
nr_regions << clone->region_shift);
generic_make_request(bio);
} else
@@ -804,11 +804,14 @@ static void hydration_copy(struct dm_clone_region_hydration *hd, unsigned int nr
struct dm_io_region from, to;
struct clone *clone = hd->clone;
+ if (WARN_ON(!nr_regions))
+ return;
+
region_size = clone->region_size;
region_start = hd->region_nr;
region_end = region_start + nr_regions - 1;
- total_size = (nr_regions - 1) << clone->region_shift;
+ total_size = region_to_sector(clone, nr_regions - 1);
if (region_end == clone->nr_regions - 1) {
/*
If we use a non-forcewaked write to PMINTRMSK, it does not take effect
until much later, if at all, causing a loss of RPS interrupts and no GPU
reclocking, leaving the GPU running at the wrong frequency for long
periods of time.
Reported-by: Francisco Jerez <currojerez(a)riseup.net>
Suggested-by: Francisco Jerez <currojerez(a)riseup.net>
Fixes: 35cc7f32c298 ("drm/i915/gt: Use non-forcewake writes for RPS")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Francisco Jerez <currojerez(a)riseup.net>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Andi Shyti <andi.shyti(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.6+
---
drivers/gpu/drm/i915/gt/intel_rps.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 86110458e2a7..6a3505467406 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -81,13 +81,14 @@ static void rps_enable_interrupts(struct intel_rps *rps)
events = (GEN6_PM_RP_UP_THRESHOLD |
GEN6_PM_RP_DOWN_THRESHOLD |
GEN6_PM_RP_DOWN_TIMEOUT);
-
WRITE_ONCE(rps->pm_events, events);
+
spin_lock_irq(>->irq_lock);
gen6_gt_pm_enable_irq(gt, rps->pm_events);
spin_unlock_irq(>->irq_lock);
- set(gt->uncore, GEN6_PMINTRMSK, rps_pm_mask(rps, rps->cur_freq));
+ intel_uncore_write(gt->uncore,
+ GEN6_PMINTRMSK, rps_pm_mask(rps, rps->last_freq));
}
static void gen6_rps_reset_interrupts(struct intel_rps *rps)
@@ -120,7 +121,9 @@ static void rps_disable_interrupts(struct intel_rps *rps)
struct intel_gt *gt = rps_to_gt(rps);
WRITE_ONCE(rps->pm_events, 0);
- set(gt->uncore, GEN6_PMINTRMSK, rps_pm_sanitize_mask(rps, ~0u));
+
+ intel_uncore_write(gt->uncore,
+ GEN6_PMINTRMSK, rps_pm_sanitize_mask(rps, ~0u));
spin_lock_irq(>->irq_lock);
gen6_gt_pm_disable_irq(gt, GEN6_PM_RPS_EVENTS);
--
2.20.1
The patch titled
Subject: mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
has been added to the -mm tree. Its filename is
mm-ksm-fix-null-pointer-dereference-when-ksm-zero-page-is-enabled.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-ksm-fix-null-pointer-dereferenc…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-ksm-fix-null-pointer-dereferenc…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
find_mergeable_vma can return NULL. In this case, it leads to crash when
we access vma->vm_mm(its offset is 0x40) later in write_protect_page. And
this case did happen on our server. The following calltrace is captured
in kernel 4.19 with the following patch applied and KSM zero page enabled
on our server.
commit e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
So add a vma check to fix it.
--------------------------------------------------------------------------
BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
Oops: 0000 [#1] SMP NOPTI
CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
RIP: 0010:try_to_merge_one_page+0xc7/0x760
Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
? follow_page_pte+0x36d/0x5e0
ksm_scan_thread+0x115e/0x1960
? remove_wait_queue+0x60/0x60
kthread+0xf5/0x130
? try_to_merge_with_ksm_page+0x90/0x90
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x1f/0x30
--------------------------------------------------------------------------
Link: http://lkml.kernel.org/r/20200414132905.83819-1-songmuchun@bytedance.com
Fixes: e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Co-developed-by: Xiongchun Duan <duanxiongchun(a)bytedance.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.vnet.ibm.com>
Cc: Markus Elfring <Markus.Elfring(a)web.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/ksm.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/ksm.c~mm-ksm-fix-null-pointer-dereference-when-ksm-zero-page-is-enabled
+++ a/mm/ksm.c
@@ -2112,8 +2112,11 @@ static void cmp_and_merge_page(struct pa
down_read(&mm->mmap_sem);
vma = find_mergeable_vma(mm, rmap_item->address);
- err = try_to_merge_one_page(vma, page,
- ZERO_PAGE(rmap_item->address));
+ if (vma)
+ err = try_to_merge_one_page(vma, page,
+ ZERO_PAGE(rmap_item->address));
+ else
+ err = -EFAULT;
up_read(&mm->mmap_sem);
/*
* In case of failure, the page was not really empty, so we
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-ksm-fix-null-pointer-dereference-when-ksm-zero-page-is-enabled.patch
From: "Paul E. McKenney" <paulmck(a)kernel.org>
The rcu_nmi_enter_common() function can be invoked both in interrupt
and NMI handlers. If it is invoked from process context (as opposed
to userspace or idle context) on a nohz_full CPU, it might acquire the
CPU's leaf rcu_node structure's ->lock. Because this lock is held only
with interrupts disabled, this is safe from an interrupt handler, but
doing so from an NMI handler can result in self-deadlock.
This commit therefore adds "irq" to the "if" condition so as to only
acquire the ->lock from irq handlers or process context, never from
an NMI handler.
Fixes: 5b14557b073c ("rcu: Avoid tick_dep_set_cpu() misordering")
Reported-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Cc: <stable(a)vger.kernel.org> # 5.5.x
---
kernel/rcu/tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7dd7a17..573fd78 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -876,7 +876,7 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
rcu_cleanup_after_idle();
incby = 1;
- } else if (tick_nohz_full_cpu(rdp->cpu) &&
+ } else if (irq && tick_nohz_full_cpu(rdp->cpu) &&
rdp->dynticks_nmi_nesting == DYNTICK_IRQ_NONIDLE &&
READ_ONCE(rdp->rcu_urgent_qs) &&
!READ_ONCE(rdp->rcu_forced_tick)) {
--
2.9.5
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c17eb4dca5a353a9dbbb8ad6934fe57af7165e91 Mon Sep 17 00:00:00 2001
From: Clement Courbet <courbet(a)google.com>
Date: Mon, 30 Mar 2020 10:03:56 +0200
Subject: [PATCH] powerpc: Make setjmp/longjmp signature standard
Declaring setjmp()/longjmp() as taking longs makes the signature
non-standard, and makes clang complain. In the past, this has been
worked around by adding -ffreestanding to the compile flags.
The implementation looks like it only ever propagates the value
(in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
with integer parameters.
This allows removing -ffreestanding from the compilation flags.
Fixes: c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp")
Cc: stable(a)vger.kernel.org # v4.14+
Signed-off-by: Clement Courbet <courbet(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200330080400.124803-1-courbet@google.com
diff --git a/arch/powerpc/include/asm/setjmp.h b/arch/powerpc/include/asm/setjmp.h
index e9f81bb3f83b..f798e80e4106 100644
--- a/arch/powerpc/include/asm/setjmp.h
+++ b/arch/powerpc/include/asm/setjmp.h
@@ -7,7 +7,9 @@
#define JMP_BUF_LEN 23
-extern long setjmp(long *) __attribute__((returns_twice));
-extern void longjmp(long *, long) __attribute__((noreturn));
+typedef long jmp_buf[JMP_BUF_LEN];
+
+extern int setjmp(jmp_buf env) __attribute__((returns_twice));
+extern void longjmp(jmp_buf env, int val) __attribute__((noreturn));
#endif /* _ASM_POWERPC_SETJMP_H */
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 378f6108a414..86380c69f5ce 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,9 +3,6 @@
# Makefile for the linux kernel.
#
-# Avoid clang warnings around longjmp/setjmp declarations
-CFLAGS_crash.o += -ffreestanding
-
obj-y += core.o crash.o core_$(BITS).o
obj-$(CONFIG_PPC32) += relocate_32.o
diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index c3842dbeb1b7..6f9cccea54f3 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -1,9 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
# Makefile for xmon
-# Avoid clang warnings around longjmp/setjmp declarations
-subdir-ccflags-y := -ffreestanding
-
GCOV_PROFILE := n
KCOV_INSTRUMENT := n
UBSAN_SANITIZE := n
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f0cc2cd70164efe8f75c5d99560f0f69969c72e4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Fri, 28 Feb 2020 13:04:36 +0000
Subject: [PATCH] Btrfs: fix crash during unmount due to race with delayed
inode workers
During unmount we can have a job from the delayed inode items work queue
still running, that can lead to at least two bad things:
1) A crash, because the worker can try to create a transaction just
after the fs roots were freed;
2) A transaction leak, because the worker can create a transaction
before the fs roots are freed and just after we committed the last
transaction and after we stopped the transaction kthread.
A stack trace example of the crash:
[79011.691214] kernel BUG at lib/radix-tree.c:982!
[79011.692056] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
[79011.693180] CPU: 3 PID: 1394 Comm: kworker/u8:2 Tainted: G W 5.6.0-rc2-btrfs-next-54 #2
(...)
[79011.696789] Workqueue: btrfs-delayed-meta btrfs_work_helper [btrfs]
[79011.697904] RIP: 0010:radix_tree_tag_set+0xe7/0x170
(...)
[79011.702014] RSP: 0018:ffffb3c84a317ca0 EFLAGS: 00010293
[79011.702949] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[79011.704202] RDX: ffffb3c84a317cb0 RSI: ffffb3c84a317ca8 RDI: ffff8db3931340a0
[79011.705463] RBP: 0000000000000005 R08: 0000000000000005 R09: ffffffff974629d0
[79011.706756] R10: ffffb3c84a317bc0 R11: 0000000000000001 R12: ffff8db393134000
[79011.708010] R13: ffff8db3931340a0 R14: ffff8db393134068 R15: 0000000000000001
[79011.709270] FS: 0000000000000000(0000) GS:ffff8db3b6a00000(0000) knlGS:0000000000000000
[79011.710699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[79011.711710] CR2: 00007f22c2a0a000 CR3: 0000000232ad4005 CR4: 00000000003606e0
[79011.712958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[79011.714205] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[79011.715448] Call Trace:
[79011.715925] record_root_in_trans+0x72/0xf0 [btrfs]
[79011.716819] btrfs_record_root_in_trans+0x4b/0x70 [btrfs]
[79011.717925] start_transaction+0xdd/0x5c0 [btrfs]
[79011.718829] btrfs_async_run_delayed_root+0x17e/0x2b0 [btrfs]
[79011.719915] btrfs_work_helper+0xaa/0x720 [btrfs]
[79011.720773] process_one_work+0x26d/0x6a0
[79011.721497] worker_thread+0x4f/0x3e0
[79011.722153] ? process_one_work+0x6a0/0x6a0
[79011.722901] kthread+0x103/0x140
[79011.723481] ? kthread_create_worker_on_cpu+0x70/0x70
[79011.724379] ret_from_fork+0x3a/0x50
(...)
The following diagram shows a sequence of steps that lead to the crash
during ummount of the filesystem:
CPU 1 CPU 2 CPU 3
btrfs_punch_hole()
btrfs_btree_balance_dirty()
btrfs_balance_delayed_items()
--> sees
fs_info->delayed_root->items
with value 200, which is greater
than
BTRFS_DELAYED_BACKGROUND (128)
and smaller than
BTRFS_DELAYED_WRITEBACK (512)
btrfs_wq_run_delayed_node()
--> queues a job for
fs_info->delayed_workers to run
btrfs_async_run_delayed_root()
btrfs_async_run_delayed_root()
--> job queued by CPU 1
--> starts picking and running
delayed nodes from the
prepare_list list
close_ctree()
btrfs_delete_unused_bgs()
btrfs_commit_super()
btrfs_join_transaction()
--> gets transaction N
btrfs_commit_transaction(N)
--> set transaction state
to TRANTS_STATE_COMMIT_START
btrfs_first_prepared_delayed_node()
--> picks delayed node X through
the prepared_list list
btrfs_run_delayed_items()
btrfs_first_delayed_node()
--> also picks delayed node X
but through the node_list
list
__btrfs_commit_inode_delayed_items()
--> runs all delayed items from
this node and drops the
node's item count to 0
through call to
btrfs_release_delayed_inode()
--> finishes running any remaining
delayed nodes
--> finishes transaction commit
--> stops cleaner and transaction threads
btrfs_free_fs_roots()
--> frees all roots and removes them
from the radix tree
fs_info->fs_roots_radix
btrfs_join_transaction()
start_transaction()
btrfs_record_root_in_trans()
record_root_in_trans()
radix_tree_tag_set()
--> crashes because
the root is not in
the radix tree
anymore
If the worker is able to call btrfs_join_transaction() before the unmount
task frees the fs roots, we end up leaking a transaction and all its
resources, since after the call to btrfs_commit_super() and stopping the
transaction kthread, we don't expect to have any transaction open anymore.
When this situation happens the worker has a delayed node that has no
more items to run, since the task calling btrfs_run_delayed_items(),
which is doing a transaction commit, picks the same node and runs all
its items first.
We can not wait for the worker to complete when running delayed items
through btrfs_run_delayed_items(), because we call that function in
several phases of a transaction commit, and that could cause a deadlock
because the worker calls btrfs_join_transaction() and the task doing the
transaction commit may have already set the transaction state to
TRANS_STATE_COMMIT_DOING.
Also it's not possible to get into a situation where only some of the
items of a delayed node are added to the fs/subvolume tree in the current
transaction and the remaining ones in the next transaction, because when
running the items of a delayed inode we lock its mutex, effectively
waiting for the worker if the worker is running the items of the delayed
node already.
Since this can only cause issues when unmounting a filesystem, fix it in
a simple way by waiting for any jobs on the delayed workers queue before
calling btrfs_commit_supper() at close_ctree(). This works because at this
point no one can call btrfs_btree_balance_dirty() or
btrfs_balance_delayed_items(), and if we end up waiting for any worker to
complete, btrfs_commit_super() will commit the transaction created by the
worker.
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 1d32a07bb2d1..309516e6a968 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -395,3 +395,11 @@ void btrfs_set_work_high_priority(struct btrfs_work *work)
{
set_bit(WORK_HIGH_PRIO_BIT, &work->flags);
}
+
+void btrfs_flush_workqueue(struct btrfs_workqueue *wq)
+{
+ if (wq->high)
+ flush_workqueue(wq->high->normal_wq);
+
+ flush_workqueue(wq->normal->normal_wq);
+}
diff --git a/fs/btrfs/async-thread.h b/fs/btrfs/async-thread.h
index a4434301d84d..3204daa51b95 100644
--- a/fs/btrfs/async-thread.h
+++ b/fs/btrfs/async-thread.h
@@ -44,5 +44,6 @@ void btrfs_set_work_high_priority(struct btrfs_work *work);
struct btrfs_fs_info * __pure btrfs_work_owner(const struct btrfs_work *work);
struct btrfs_fs_info * __pure btrfs_workqueue_owner(const struct __btrfs_workqueue *wq);
bool btrfs_workqueue_normal_congested(const struct btrfs_workqueue *wq);
+void btrfs_flush_workqueue(struct btrfs_workqueue *wq);
#endif
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f01f19c76cf5..483b54077d01 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4067,6 +4067,19 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
*/
btrfs_delete_unused_bgs(fs_info);
+ /*
+ * There might be existing delayed inode workers still running
+ * and holding an empty delayed inode item. We must wait for
+ * them to complete first because they can create a transaction.
+ * This happens when someone calls btrfs_balance_delayed_items()
+ * and then a transaction commit runs the same delayed nodes
+ * before any delayed worker has done something with the nodes.
+ * We must wait for any worker here and not at transaction
+ * commit time since that could cause a deadlock.
+ * This is a very rare case.
+ */
+ btrfs_flush_workqueue(fs_info->delayed_workers);
+
ret = btrfs_commit_super(fs_info);
if (ret)
btrfs_err(fs_info, "commit super ret %d", ret);
Hi,
Please consider applying the following patches to the listed stable releases.
The following patches were found to be missing in stable releases by the
Chrome OS missing patch robot. The patches meet the following criteria.
- The patch includes a Fixes: tag
- The patch referenced in the Fixes: tag has been applied to the listed
stable release(s)
- The patch has not been applied to the listed stable release(s)
All patches have been applied to the respective stable release(s) and to at
least one Chrome OS branch. Resulting images have been build- and runtime-
tested (where applicable) on real hardware and with virtual hardware on
kerneltests.org.
Thanks,
Guenter
---
Upstream commit e78c38f6bdd9 ("futex: futex_wake_op, do not fail on invalid op")
Fixes: 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour")
in linux-4.4.y: 177a981885cf
Applies to:
v4.4.y
Upstream commit 93f0750dcdae ("Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"")
Fixes: 2c1f6951a8a8 ("videobuf2-v4l2: Verify planes array in buffer dequeueing")
in linux-4.4.y: 19a4e46b4513
Applies to:
v4.4.y
Upstream commit 2e356101e72a ("KEYS: reaching the keys quotas correctly")
Fixes: a08bf91ce28e ("KEYS: allow reaching the keys quotas exactly")
in linux-4.4.y: 1e73c0aeb3ee
in linux-4.9.y: 6704b9d8a075
in linux-4.14.y: fe303ba7ab93
in linux-4.19.y: f812bec554d0
Applies to:
v4.4.y, v4.9.y, v4.14.y, v4.19.y
Upstream commit 538d92912d31 ("xen-netfront: Rework the fix for Rx stall during OOM and network stress")
Fixes: 90c311b0eeea ("xen-netfront: Fix Rx stall during network stress and OOM")
in linux-4.4.y: 230fe9c7d814
Applies to:
v4.4.y
Upstream commit 183ab39eb0ea ("ALSA: hda: Initialize power_state field properly")
Fixes: 98081ca62cba ("ALSA: hda - Record the current power state before suspend/resume calls")
in linux-4.4.y: 2569eed24d93
in linux-4.9.y: 5ee86945565e
in linux-4.14.y: 886e8316b599
Applies to:
v4.4.y, v4.9.y, v4.14.y
Upstream commit 24e52b11e0ca ("Btrfs: incremental send, fix invalid memory access")
Fixes: e1cbfd7bf6da ("Btrfs: send, fix file hole not being preserved due to inline extent")
in linux-4.4.y: 266bbc907c3f
Applies to:
v4.4.y
Upstream commit 1f80bd6a6cc8 ("IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush")
Fixes: b4b678b06f6e ("IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop")
in linux-4.4.y: 26c66554d7bf
Applies to:
v4.4.y
Upstream commit 56a49d704870 ("net: rtnl_configure_link: fix dev flags changes arg to __dev_notify_flags")
Fixes: 5025f7f7d506 ("rtnetlink: add rtnl_link_state check in rtnl_configure_link")
in linux-4.14.y: 23557c5d34b9
Applies to:
v4.14.y
Upstream commit 11dd34f3eae5 ("powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init()")
Fixes: c6c26fb55e8e ("powerpc/pseries: Export raw per-CPU VPA data via debugfs")
in linux-4.14.y: 27b1ef75f579
in linux-4.19.y: ee35e01b0f08
Applies to:
v4.14.y, v4.19.y
Upstream commit 34d66caf251d ("x86/speculation: Remove redundant arch_smt_update() invocation")
Fixes: a74cfffb03b7 ("x86/speculation: Rework SMT state change")
in linux-4.14.y: 36a4c5fc9285
in linux-4.19.y: f55e301ec4d5
Applies to:
v4.14.y, v4.19.y
Upstream commit cc41f11a21a5 ("scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug")
Fixes: c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
in linux-4.14.y: 3748694f1b91
Applies to:
v4.14.y
Upstream commit 82f04bfe2aff ("tools: gpio: Fix out-of-tree build regression")
Fixes: 0161a94e2d1c ("tools: gpio: Correctly add make dependencies for gpio_utils")
in linux-4.14.y: f71e52cb3270
in linux-4.19.y: 036588ec6888
Applies to:
v4.14.y, v4.19.y
Upstream commit 3e487d2e4aa4 ("PCI: pciehp: Fix indefinite wait on sysfs requests")
Fixes: 157c1062fcd8 ("PCI: pciehp: Avoid returning prematurely from sysfs requests")
in linux-4.19.y: 248e65f3220e
in linux-5.4.y: 9bd9d123399b
Applies to:
v4.19.y, v5.4.y
Upstream commit 8644772637de ("mm: Use fixed constant in page_frag_alloc instead of size + 1")
Fixes: 2c2ade81741c ("mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs")
in linux-4.14.y: a977209627ca
in linux-4.19.y: 33e83ea302c0
Applies to:
v4.14.y, v4.19.y
Upstream commit 2abb5792387e ("net: qualcomm: rmnet: Allow configuration updates to existing devices")
Fixes: 1dc49e9d164c ("net: rmnet: do not allow to change mux id if mux id is duplicated")
in linux-4.19.y: 48c5bfbbcec1
in linux-5.4.y: 835bbd892683
Applies to:
v4.19.y, v5.4.y
Upstream commit 36eb7dc1bd42 ("cpufreq: imx6q: Fixes unwanted cpu overclocking on i.MX6ULL")
Fixes: 2733fb0d0699 ("cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull")
in linux-4.19.y: 4ef576e99d29
Applies to:
v4.19.y
Upstream commit 4c7eeb9af3e4 ("arm64: dts: allwinner: h6: Fix PMU compatible")
Fixes: 7aa9b9eb7d6a ("arm64: dts: allwinner: H6: Add PMU mode")
in linux-4.19.y: 8f1046b33f1b
in linux-5.4.y: 02dfae36b03f
Applies to:
v4.19.y, v5.4.y
Upstream commit ae769d355664 ("ALSA: pcm: oss: Fix regression by buffer overflow fix")
Fixes: f2ecf903ef06 ("ALSA: pcm: oss: Avoid plugin buffer overflow")
in linux-4.14.y: 5ac3462e1921
in linux-4.19.y: 8c5bd5520334
in linux-5.4.y: 07ec940ceda5
Applies to:
v4.14.y, v4.19.y, v5.4.y
Upstream commit 82e0516ce3a1 ("sched/core: Remove duplicate assignment in sched_tick_remote()")
Fixes: ebc0f83c78a2 ("timers/nohz: Update NOHZ load in remote tick")
in linux-5.4.y: 166d6008fa2a
Applies to:
v5.4.y
Upstream commit 4ae7a3c3d7d3 ("arm64: dts: allwinner: h5: Fix PMU compatible")
Fixes: c35a516a4618 ("arm64: dts: allwinner: H5: Add PMU node")
in linux-5.4.y: 5a241d7bf1e6
Applies to:
v5.4.y
Upstream commit 9b8b17541f13 ("mm, memcg: do not high throttle allocators based on wraparound")
Fixes: e26733e0d0ec ("mm, memcg: throttle allocators based on ancestral memory.high")
in linux-5.4.y: 61cfbcce9e09
Applies to:
v5.4.y
Upstream commit 8c5c66052920 ("nvme-fc: Revert "add module to ops template to allow module references"")
Fixes: 863fbae929c7 ("nvme_fc: add module to ops template to allow module references")
in linux-4.14.y: a123233fc320
in linux-4.19.y: 6c786e656cd9
in linux-5.4.y: 6b49a5a9eb46
Applies to:
v4.14.y, v4.19.y, v5.4.y
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c17eb4dca5a353a9dbbb8ad6934fe57af7165e91 Mon Sep 17 00:00:00 2001
From: Clement Courbet <courbet(a)google.com>
Date: Mon, 30 Mar 2020 10:03:56 +0200
Subject: [PATCH] powerpc: Make setjmp/longjmp signature standard
Declaring setjmp()/longjmp() as taking longs makes the signature
non-standard, and makes clang complain. In the past, this has been
worked around by adding -ffreestanding to the compile flags.
The implementation looks like it only ever propagates the value
(in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
with integer parameters.
This allows removing -ffreestanding from the compilation flags.
Fixes: c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp")
Cc: stable(a)vger.kernel.org # v4.14+
Signed-off-by: Clement Courbet <courbet(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200330080400.124803-1-courbet@google.com
diff --git a/arch/powerpc/include/asm/setjmp.h b/arch/powerpc/include/asm/setjmp.h
index e9f81bb3f83b..f798e80e4106 100644
--- a/arch/powerpc/include/asm/setjmp.h
+++ b/arch/powerpc/include/asm/setjmp.h
@@ -7,7 +7,9 @@
#define JMP_BUF_LEN 23
-extern long setjmp(long *) __attribute__((returns_twice));
-extern void longjmp(long *, long) __attribute__((noreturn));
+typedef long jmp_buf[JMP_BUF_LEN];
+
+extern int setjmp(jmp_buf env) __attribute__((returns_twice));
+extern void longjmp(jmp_buf env, int val) __attribute__((noreturn));
#endif /* _ASM_POWERPC_SETJMP_H */
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 378f6108a414..86380c69f5ce 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,9 +3,6 @@
# Makefile for the linux kernel.
#
-# Avoid clang warnings around longjmp/setjmp declarations
-CFLAGS_crash.o += -ffreestanding
-
obj-y += core.o crash.o core_$(BITS).o
obj-$(CONFIG_PPC32) += relocate_32.o
diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index c3842dbeb1b7..6f9cccea54f3 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -1,9 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
# Makefile for xmon
-# Avoid clang warnings around longjmp/setjmp declarations
-subdir-ccflags-y := -ffreestanding
-
GCOV_PROFILE := n
KCOV_INSTRUMENT := n
UBSAN_SANITIZE := n
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c17eb4dca5a353a9dbbb8ad6934fe57af7165e91 Mon Sep 17 00:00:00 2001
From: Clement Courbet <courbet(a)google.com>
Date: Mon, 30 Mar 2020 10:03:56 +0200
Subject: [PATCH] powerpc: Make setjmp/longjmp signature standard
Declaring setjmp()/longjmp() as taking longs makes the signature
non-standard, and makes clang complain. In the past, this has been
worked around by adding -ffreestanding to the compile flags.
The implementation looks like it only ever propagates the value
(in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
with integer parameters.
This allows removing -ffreestanding from the compilation flags.
Fixes: c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp")
Cc: stable(a)vger.kernel.org # v4.14+
Signed-off-by: Clement Courbet <courbet(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200330080400.124803-1-courbet@google.com
diff --git a/arch/powerpc/include/asm/setjmp.h b/arch/powerpc/include/asm/setjmp.h
index e9f81bb3f83b..f798e80e4106 100644
--- a/arch/powerpc/include/asm/setjmp.h
+++ b/arch/powerpc/include/asm/setjmp.h
@@ -7,7 +7,9 @@
#define JMP_BUF_LEN 23
-extern long setjmp(long *) __attribute__((returns_twice));
-extern void longjmp(long *, long) __attribute__((noreturn));
+typedef long jmp_buf[JMP_BUF_LEN];
+
+extern int setjmp(jmp_buf env) __attribute__((returns_twice));
+extern void longjmp(jmp_buf env, int val) __attribute__((noreturn));
#endif /* _ASM_POWERPC_SETJMP_H */
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 378f6108a414..86380c69f5ce 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,9 +3,6 @@
# Makefile for the linux kernel.
#
-# Avoid clang warnings around longjmp/setjmp declarations
-CFLAGS_crash.o += -ffreestanding
-
obj-y += core.o crash.o core_$(BITS).o
obj-$(CONFIG_PPC32) += relocate_32.o
diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index c3842dbeb1b7..6f9cccea54f3 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -1,9 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
# Makefile for xmon
-# Avoid clang warnings around longjmp/setjmp declarations
-subdir-ccflags-y := -ffreestanding
-
GCOV_PROFILE := n
KCOV_INSTRUMENT := n
UBSAN_SANITIZE := n
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From af92bad615be75c6c0d1b1c5b48178360250a187 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)c-s.fr>
Date: Fri, 6 Mar 2020 15:09:40 +0000
Subject: [PATCH] powerpc/kasan: Fix kasan_remap_early_shadow_ro()
At the moment kasan_remap_early_shadow_ro() does nothing, because
k_end is 0 and k_cur < 0 is always true.
Change the test to k_cur != k_end, as done in
kasan_init_shadow_page_tables()
Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
Fixes: cbd18991e24f ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.15835073…
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index f19526e7d3dc..1a29cf469903 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -101,7 +101,7 @@ static void __init kasan_remap_early_shadow_ro(void)
kasan_populate_pte(kasan_early_shadow_pte, prot);
- for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_start & PAGE_MASK; k_cur != k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From af92bad615be75c6c0d1b1c5b48178360250a187 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)c-s.fr>
Date: Fri, 6 Mar 2020 15:09:40 +0000
Subject: [PATCH] powerpc/kasan: Fix kasan_remap_early_shadow_ro()
At the moment kasan_remap_early_shadow_ro() does nothing, because
k_end is 0 and k_cur < 0 is always true.
Change the test to k_cur != k_end, as done in
kasan_init_shadow_page_tables()
Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
Fixes: cbd18991e24f ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.15835073…
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index f19526e7d3dc..1a29cf469903 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -101,7 +101,7 @@ static void __init kasan_remap_early_shadow_ro(void)
kasan_populate_pte(kasan_early_shadow_pte, prot);
- for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
+ for (k_cur = k_start & PAGE_MASK; k_cur != k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From aa4113340ae6c2811e046f08c2bc21011d20a072 Mon Sep 17 00:00:00 2001
From: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Date: Thu, 23 Jan 2020 11:19:25 +0000
Subject: [PATCH] powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
In the current implementation, the call to loadcam_multi() is wrapped
between switch_to_as1() and restore_to_as0() calls so, when it tries
to create its own temporary AS=1 TLB1 entry, it ends up duplicating
the existing one created by switch_to_as1(). Add a check to skip
creating the temporary entry if already running in AS=1.
Fixes: d9e1831a4202 ("powerpc/85xx: Load all early TLB entries at once")
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Acked-by: Scott Wood <oss(a)buserror.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index 2ca407cedbe7..eaeee402f96e 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -397,7 +397,7 @@ _GLOBAL(set_context)
* extern void loadcam_entry(unsigned int index)
*
* Load TLBCAM[index] entry in to the L2 CAM MMU
- * Must preserve r7, r8, r9, and r10
+ * Must preserve r7, r8, r9, r10 and r11
*/
_GLOBAL(loadcam_entry)
mflr r5
@@ -433,6 +433,10 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
*/
_GLOBAL(loadcam_multi)
mflr r8
+ /* Don't switch to AS=1 if already there */
+ mfmsr r11
+ andi. r11,r11,MSR_IS
+ bne 10f
/*
* Set up temporary TLB entry that is the same as what we're
@@ -458,6 +462,7 @@ _GLOBAL(loadcam_multi)
mtmsr r6
isync
+10:
mr r9,r3
add r10,r3,r4
2: bl loadcam_entry
@@ -466,6 +471,10 @@ _GLOBAL(loadcam_multi)
mr r3,r9
blt 2b
+ /* Don't return to AS=0 if we were in AS=1 at function start */
+ andi. r11,r11,MSR_IS
+ bne 3f
+
/* Return to AS=0 and clear the temporary entry */
mfmsr r6
rlwinm. r6,r6,0,~(MSR_IS|MSR_DS)
@@ -481,6 +490,7 @@ _GLOBAL(loadcam_multi)
tlbwe
isync
+3:
mtlr r8
blr
#endif
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From aa4113340ae6c2811e046f08c2bc21011d20a072 Mon Sep 17 00:00:00 2001
From: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Date: Thu, 23 Jan 2020 11:19:25 +0000
Subject: [PATCH] powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
In the current implementation, the call to loadcam_multi() is wrapped
between switch_to_as1() and restore_to_as0() calls so, when it tries
to create its own temporary AS=1 TLB1 entry, it ends up duplicating
the existing one created by switch_to_as1(). Add a check to skip
creating the temporary entry if already running in AS=1.
Fixes: d9e1831a4202 ("powerpc/85xx: Load all early TLB entries at once")
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Acked-by: Scott Wood <oss(a)buserror.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index 2ca407cedbe7..eaeee402f96e 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -397,7 +397,7 @@ _GLOBAL(set_context)
* extern void loadcam_entry(unsigned int index)
*
* Load TLBCAM[index] entry in to the L2 CAM MMU
- * Must preserve r7, r8, r9, and r10
+ * Must preserve r7, r8, r9, r10 and r11
*/
_GLOBAL(loadcam_entry)
mflr r5
@@ -433,6 +433,10 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
*/
_GLOBAL(loadcam_multi)
mflr r8
+ /* Don't switch to AS=1 if already there */
+ mfmsr r11
+ andi. r11,r11,MSR_IS
+ bne 10f
/*
* Set up temporary TLB entry that is the same as what we're
@@ -458,6 +462,7 @@ _GLOBAL(loadcam_multi)
mtmsr r6
isync
+10:
mr r9,r3
add r10,r3,r4
2: bl loadcam_entry
@@ -466,6 +471,10 @@ _GLOBAL(loadcam_multi)
mr r3,r9
blt 2b
+ /* Don't return to AS=0 if we were in AS=1 at function start */
+ andi. r11,r11,MSR_IS
+ bne 3f
+
/* Return to AS=0 and clear the temporary entry */
mfmsr r6
rlwinm. r6,r6,0,~(MSR_IS|MSR_DS)
@@ -481,6 +490,7 @@ _GLOBAL(loadcam_multi)
tlbwe
isync
+3:
mtlr r8
blr
#endif
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From aa4113340ae6c2811e046f08c2bc21011d20a072 Mon Sep 17 00:00:00 2001
From: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Date: Thu, 23 Jan 2020 11:19:25 +0000
Subject: [PATCH] powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
In the current implementation, the call to loadcam_multi() is wrapped
between switch_to_as1() and restore_to_as0() calls so, when it tries
to create its own temporary AS=1 TLB1 entry, it ends up duplicating
the existing one created by switch_to_as1(). Add a check to skip
creating the temporary entry if already running in AS=1.
Fixes: d9e1831a4202 ("powerpc/85xx: Load all early TLB entries at once")
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Laurentiu Tudor <laurentiu.tudor(a)nxp.com>
Acked-by: Scott Wood <oss(a)buserror.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index 2ca407cedbe7..eaeee402f96e 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -397,7 +397,7 @@ _GLOBAL(set_context)
* extern void loadcam_entry(unsigned int index)
*
* Load TLBCAM[index] entry in to the L2 CAM MMU
- * Must preserve r7, r8, r9, and r10
+ * Must preserve r7, r8, r9, r10 and r11
*/
_GLOBAL(loadcam_entry)
mflr r5
@@ -433,6 +433,10 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
*/
_GLOBAL(loadcam_multi)
mflr r8
+ /* Don't switch to AS=1 if already there */
+ mfmsr r11
+ andi. r11,r11,MSR_IS
+ bne 10f
/*
* Set up temporary TLB entry that is the same as what we're
@@ -458,6 +462,7 @@ _GLOBAL(loadcam_multi)
mtmsr r6
isync
+10:
mr r9,r3
add r10,r3,r4
2: bl loadcam_entry
@@ -466,6 +471,10 @@ _GLOBAL(loadcam_multi)
mr r3,r9
blt 2b
+ /* Don't return to AS=0 if we were in AS=1 at function start */
+ andi. r11,r11,MSR_IS
+ bne 3f
+
/* Return to AS=0 and clear the temporary entry */
mfmsr r6
rlwinm. r6,r6,0,~(MSR_IS|MSR_DS)
@@ -481,6 +490,7 @@ _GLOBAL(loadcam_multi)
tlbwe
isync
+3:
mtlr r8
blr
#endif
The aarch32_vdso_pages[] array never has entries allocated in the C_VVAR
or C_VDSO slots, and as the array is zero initialized these contain
NULL.
However in __aarch32_alloc_vdso_pages() when
aarch32_alloc_kuser_vdso_page() fails we attempt to free the page whose
struct page is at NULL, which is obviously nonsensical.
This patch removes the erroneous page freeing.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
arch/arm64/kernel/vdso.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 354b11e27c07..033a48f30dbb 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -260,18 +260,7 @@ static int __aarch32_alloc_vdso_pages(void)
if (ret)
return ret;
- ret = aarch32_alloc_kuser_vdso_page();
- if (ret) {
- unsigned long c_vvar =
- (unsigned long)page_to_virt(aarch32_vdso_pages[C_VVAR]);
- unsigned long c_vdso =
- (unsigned long)page_to_virt(aarch32_vdso_pages[C_VDSO]);
-
- free_page(c_vvar);
- free_page(c_vdso);
- }
-
- return ret;
+ return aarch32_alloc_kuser_vdso_page();
}
#else
static int __aarch32_alloc_vdso_pages(void)
--
2.11.0
From: Claudiu Beznea <claudiu.beznea(a)microchip.com>
[ Upstream commit b0ecf1c6c6e82da4847900fad0272abfd014666d ]
clk_hw_round_rate() may call round rate function of its parents. In case
of SAM9X60 two of USB parrents are PLLA and UPLL. These clocks are
controlled by clk-sam9x60-pll.c driver. The round rate function for this
driver is sam9x60_pll_round_rate() which call in turn
sam9x60_pll_get_best_div_mul(). In case the requested rate is not in the
proper range (rate < characteristics->output[0].min &&
rate > characteristics->output[0].max) the sam9x60_pll_round_rate() will
return a negative number to its caller (called by
clk_core_round_rate_nolock()). clk_hw_round_rate() will return zero in
case a negative number is returned by clk_core_round_rate_nolock(). With
this, the USB clock will continue its rate computation even caller of
clk_hw_round_rate() returned an error. With this, the USB clock on SAM9X60
may not chose the best parent. I detected this after a suspend/resume
cycle on SAM9X60.
Signed-off-by: Claudiu Beznea <claudiu.beznea(a)microchip.com>
Link: https://lkml.kernel.org/r/1579261009-4573-2-git-send-email-claudiu.beznea@m…
Signed-off-by: Stephen Boyd <sboyd(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/clk/at91/clk-usb.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/clk/at91/clk-usb.c b/drivers/clk/at91/clk-usb.c
index 8ab8502778a28..55e09641b491b 100644
--- a/drivers/clk/at91/clk-usb.c
+++ b/drivers/clk/at91/clk-usb.c
@@ -79,6 +79,9 @@ static int at91sam9x5_clk_usb_determine_rate(struct clk_hw *hw,
tmp_parent_rate = req->rate * div;
tmp_parent_rate = clk_hw_round_rate(parent,
tmp_parent_rate);
+ if (!tmp_parent_rate)
+ continue;
+
tmp_rate = DIV_ROUND_CLOSEST(tmp_parent_rate, div);
if (tmp_rate < req->rate)
tmp_diff = req->rate - tmp_rate;
--
2.20.1
From: Claudiu Beznea <claudiu.beznea(a)microchip.com>
[ Upstream commit b0ecf1c6c6e82da4847900fad0272abfd014666d ]
clk_hw_round_rate() may call round rate function of its parents. In case
of SAM9X60 two of USB parrents are PLLA and UPLL. These clocks are
controlled by clk-sam9x60-pll.c driver. The round rate function for this
driver is sam9x60_pll_round_rate() which call in turn
sam9x60_pll_get_best_div_mul(). In case the requested rate is not in the
proper range (rate < characteristics->output[0].min &&
rate > characteristics->output[0].max) the sam9x60_pll_round_rate() will
return a negative number to its caller (called by
clk_core_round_rate_nolock()). clk_hw_round_rate() will return zero in
case a negative number is returned by clk_core_round_rate_nolock(). With
this, the USB clock will continue its rate computation even caller of
clk_hw_round_rate() returned an error. With this, the USB clock on SAM9X60
may not chose the best parent. I detected this after a suspend/resume
cycle on SAM9X60.
Signed-off-by: Claudiu Beznea <claudiu.beznea(a)microchip.com>
Link: https://lkml.kernel.org/r/1579261009-4573-2-git-send-email-claudiu.beznea@m…
Signed-off-by: Stephen Boyd <sboyd(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/clk/at91/clk-usb.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/clk/at91/clk-usb.c b/drivers/clk/at91/clk-usb.c
index 791770a563fcc..6fac6383d024e 100644
--- a/drivers/clk/at91/clk-usb.c
+++ b/drivers/clk/at91/clk-usb.c
@@ -78,6 +78,9 @@ static int at91sam9x5_clk_usb_determine_rate(struct clk_hw *hw,
tmp_parent_rate = req->rate * div;
tmp_parent_rate = clk_hw_round_rate(parent,
tmp_parent_rate);
+ if (!tmp_parent_rate)
+ continue;
+
tmp_rate = DIV_ROUND_CLOSEST(tmp_parent_rate, div);
if (tmp_rate < req->rate)
tmp_diff = req->rate - tmp_rate;
--
2.20.1
From: Claudiu Beznea <claudiu.beznea(a)microchip.com>
[ Upstream commit b0ecf1c6c6e82da4847900fad0272abfd014666d ]
clk_hw_round_rate() may call round rate function of its parents. In case
of SAM9X60 two of USB parrents are PLLA and UPLL. These clocks are
controlled by clk-sam9x60-pll.c driver. The round rate function for this
driver is sam9x60_pll_round_rate() which call in turn
sam9x60_pll_get_best_div_mul(). In case the requested rate is not in the
proper range (rate < characteristics->output[0].min &&
rate > characteristics->output[0].max) the sam9x60_pll_round_rate() will
return a negative number to its caller (called by
clk_core_round_rate_nolock()). clk_hw_round_rate() will return zero in
case a negative number is returned by clk_core_round_rate_nolock(). With
this, the USB clock will continue its rate computation even caller of
clk_hw_round_rate() returned an error. With this, the USB clock on SAM9X60
may not chose the best parent. I detected this after a suspend/resume
cycle on SAM9X60.
Signed-off-by: Claudiu Beznea <claudiu.beznea(a)microchip.com>
Link: https://lkml.kernel.org/r/1579261009-4573-2-git-send-email-claudiu.beznea@m…
Signed-off-by: Stephen Boyd <sboyd(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/clk/at91/clk-usb.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/clk/at91/clk-usb.c b/drivers/clk/at91/clk-usb.c
index 791770a563fcc..6fac6383d024e 100644
--- a/drivers/clk/at91/clk-usb.c
+++ b/drivers/clk/at91/clk-usb.c
@@ -78,6 +78,9 @@ static int at91sam9x5_clk_usb_determine_rate(struct clk_hw *hw,
tmp_parent_rate = req->rate * div;
tmp_parent_rate = clk_hw_round_rate(parent,
tmp_parent_rate);
+ if (!tmp_parent_rate)
+ continue;
+
tmp_rate = DIV_ROUND_CLOSEST(tmp_parent_rate, div);
if (tmp_rate < req->rate)
tmp_diff = req->rate - tmp_rate;
--
2.20.1
From: Claudiu Beznea <claudiu.beznea(a)microchip.com>
[ Upstream commit b0ecf1c6c6e82da4847900fad0272abfd014666d ]
clk_hw_round_rate() may call round rate function of its parents. In case
of SAM9X60 two of USB parrents are PLLA and UPLL. These clocks are
controlled by clk-sam9x60-pll.c driver. The round rate function for this
driver is sam9x60_pll_round_rate() which call in turn
sam9x60_pll_get_best_div_mul(). In case the requested rate is not in the
proper range (rate < characteristics->output[0].min &&
rate > characteristics->output[0].max) the sam9x60_pll_round_rate() will
return a negative number to its caller (called by
clk_core_round_rate_nolock()). clk_hw_round_rate() will return zero in
case a negative number is returned by clk_core_round_rate_nolock(). With
this, the USB clock will continue its rate computation even caller of
clk_hw_round_rate() returned an error. With this, the USB clock on SAM9X60
may not chose the best parent. I detected this after a suspend/resume
cycle on SAM9X60.
Signed-off-by: Claudiu Beznea <claudiu.beznea(a)microchip.com>
Link: https://lkml.kernel.org/r/1579261009-4573-2-git-send-email-claudiu.beznea@m…
Signed-off-by: Stephen Boyd <sboyd(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/clk/at91/clk-usb.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/clk/at91/clk-usb.c b/drivers/clk/at91/clk-usb.c
index 791770a563fcc..6fac6383d024e 100644
--- a/drivers/clk/at91/clk-usb.c
+++ b/drivers/clk/at91/clk-usb.c
@@ -78,6 +78,9 @@ static int at91sam9x5_clk_usb_determine_rate(struct clk_hw *hw,
tmp_parent_rate = req->rate * div;
tmp_parent_rate = clk_hw_round_rate(parent,
tmp_parent_rate);
+ if (!tmp_parent_rate)
+ continue;
+
tmp_rate = DIV_ROUND_CLOSEST(tmp_parent_rate, div);
if (tmp_rate < req->rate)
tmp_diff = req->rate - tmp_rate;
--
2.20.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6a13a0d7b4d1171ef9b80ad69abc37e1daa941b3 Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Tue, 24 Mar 2020 16:34:48 +0900
Subject: [PATCH] ftrace/kprobe: Show the maxactive number on kprobe_events
Show maxactive parameter on kprobe_events.
This allows user to save the current configuration and
restore it without losing maxactive parameter.
Link: http://lkml.kernel.org/r/4762764a-6df7-bc93-ed60-e336146dce1f@gmail.com
Link: http://lkml.kernel.org/r/158503528846.22706.5549974121212526020.stgit@devno…
Cc: stable(a)vger.kernel.org
Fixes: 696ced4fb1d76 ("tracing/kprobes: expose maxactive for kretprobe in kprobe_events")
Reported-by: Taeung Song <treeze.taeung(a)gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 362cca52f5de..d0568af4a0ef 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1078,6 +1078,8 @@ static int trace_kprobe_show(struct seq_file *m, struct dyn_event *ev)
int i;
seq_putc(m, trace_kprobe_is_return(tk) ? 'r' : 'p');
+ if (trace_kprobe_is_return(tk) && tk->rp.maxactive)
+ seq_printf(m, "%d", tk->rp.maxactive);
seq_printf(m, ":%s/%s", trace_probe_group_name(&tk->tp),
trace_probe_name(&tk->tp));
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 33238c50451596be86db1505ab65fee5172844d0 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Wed, 18 Mar 2020 20:33:37 +0100
Subject: [PATCH] perf/core: Fix event cgroup tracking
Song reports that installing cgroup events is broken since:
db0503e4f675 ("perf/core: Optimize perf_install_in_event()")
The problem being that cgroup events try to track cpuctx->cgrp even
for disabled events, which is pointless and actively harmful since the
above commit. Rework the code to have explicit enable/disable hooks
for cgroup events, such that we can limit cgroup tracking to active
events.
More specifically, since the above commit disabled events are no
longer added to their context from the 'right' CPU, and we can't
access things like the current cgroup for a remote CPU.
Cc: <stable(a)vger.kernel.org> # v5.5+
Fixes: db0503e4f675 ("perf/core: Optimize perf_install_in_event()")
Reported-by: Song Liu <songliubraving(a)fb.com>
Tested-by: Song Liu <songliubraving(a)fb.com>
Reviewed-by: Song Liu <songliubraving(a)fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Link: https://lkml.kernel.org/r/20200318193337.GB20760@hirez.programming.kicks-as…
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 55e44417f66d..7afd0b503406 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -983,16 +983,10 @@ perf_cgroup_set_shadow_time(struct perf_event *event, u64 now)
event->shadow_ctx_time = now - t->timestamp;
}
-/*
- * Update cpuctx->cgrp so that it is set when first cgroup event is added and
- * cleared when last cgroup event is removed.
- */
static inline void
-list_update_cgroup_event(struct perf_event *event,
- struct perf_event_context *ctx, bool add)
+perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ctx)
{
struct perf_cpu_context *cpuctx;
- struct list_head *cpuctx_entry;
if (!is_cgroup_event(event))
return;
@@ -1009,28 +1003,41 @@ list_update_cgroup_event(struct perf_event *event,
* because if the first would mismatch, the second would not try again
* and we would leave cpuctx->cgrp unset.
*/
- if (add && !cpuctx->cgrp) {
+ if (ctx->is_active && !cpuctx->cgrp) {
struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
cpuctx->cgrp = cgrp;
}
- if (add && ctx->nr_cgroups++)
+ if (ctx->nr_cgroups++)
return;
- else if (!add && --ctx->nr_cgroups)
+
+ list_add(&cpuctx->cgrp_cpuctx_entry,
+ per_cpu_ptr(&cgrp_cpuctx_list, event->cpu));
+}
+
+static inline void
+perf_cgroup_event_disable(struct perf_event *event, struct perf_event_context *ctx)
+{
+ struct perf_cpu_context *cpuctx;
+
+ if (!is_cgroup_event(event))
return;
- /* no cgroup running */
- if (!add)
+ /*
+ * Because cgroup events are always per-cpu events,
+ * @ctx == &cpuctx->ctx.
+ */
+ cpuctx = container_of(ctx, struct perf_cpu_context, ctx);
+
+ if (--ctx->nr_cgroups)
+ return;
+
+ if (ctx->is_active && cpuctx->cgrp)
cpuctx->cgrp = NULL;
- cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
- if (add)
- list_add(cpuctx_entry,
- per_cpu_ptr(&cgrp_cpuctx_list, event->cpu));
- else
- list_del(cpuctx_entry);
+ list_del(&cpuctx->cgrp_cpuctx_entry);
}
#else /* !CONFIG_CGROUP_PERF */
@@ -1096,11 +1103,14 @@ static inline u64 perf_cgroup_event_time(struct perf_event *event)
}
static inline void
-list_update_cgroup_event(struct perf_event *event,
- struct perf_event_context *ctx, bool add)
+perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ctx)
{
}
+static inline void
+perf_cgroup_event_disable(struct perf_event *event, struct perf_event_context *ctx)
+{
+}
#endif
/*
@@ -1791,13 +1801,14 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
add_event_to_groups(event, ctx);
}
- list_update_cgroup_event(event, ctx, true);
-
list_add_rcu(&event->event_entry, &ctx->event_list);
ctx->nr_events++;
if (event->attr.inherit_stat)
ctx->nr_stat++;
+ if (event->state > PERF_EVENT_STATE_OFF)
+ perf_cgroup_event_enable(event, ctx);
+
ctx->generation++;
}
@@ -1976,8 +1987,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
event->attach_state &= ~PERF_ATTACH_CONTEXT;
- list_update_cgroup_event(event, ctx, false);
-
ctx->nr_events--;
if (event->attr.inherit_stat)
ctx->nr_stat--;
@@ -1994,8 +2003,10 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
* of error state is by explicit re-enabling
* of the event
*/
- if (event->state > PERF_EVENT_STATE_OFF)
+ if (event->state > PERF_EVENT_STATE_OFF) {
+ perf_cgroup_event_disable(event, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_OFF);
+ }
ctx->generation++;
}
@@ -2226,6 +2237,7 @@ event_sched_out(struct perf_event *event,
if (READ_ONCE(event->pending_disable) >= 0) {
WRITE_ONCE(event->pending_disable, -1);
+ perf_cgroup_event_disable(event, ctx);
state = PERF_EVENT_STATE_OFF;
}
perf_event_set_state(event, state);
@@ -2363,6 +2375,7 @@ static void __perf_event_disable(struct perf_event *event,
event_sched_out(event, cpuctx, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_OFF);
+ perf_cgroup_event_disable(event, ctx);
}
/*
@@ -2746,7 +2759,7 @@ static int __perf_install_in_context(void *info)
}
#ifdef CONFIG_CGROUP_PERF
- if (is_cgroup_event(event)) {
+ if (event->state > PERF_EVENT_STATE_OFF && is_cgroup_event(event)) {
/*
* If the current cgroup doesn't match the event's
* cgroup, we should not try to schedule it.
@@ -2906,6 +2919,7 @@ static void __perf_event_enable(struct perf_event *event,
ctx_sched_out(ctx, cpuctx, EVENT_TIME);
perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
+ perf_cgroup_event_enable(event, ctx);
if (!ctx->is_active)
return;
@@ -3616,8 +3630,10 @@ static int merge_sched_in(struct perf_event *event, void *data)
}
if (event->state == PERF_EVENT_STATE_INACTIVE) {
- if (event->attr.pinned)
+ if (event->attr.pinned) {
+ perf_cgroup_event_disable(event, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+ }
*can_add_hw = 0;
ctx->rotate_necessary = 1;
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 75da98586af75eb80664714a67a9895bf0a5517e Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Thu, 2 Apr 2020 10:34:36 -0400
Subject: [PATCH] NFS: finish_automount() requires us to hold 2 refs to the
mount record
We must not return from nfs_d_automount() without holding 2 references
to the mount record. Doing so, will trigger the BUG() in finish_automount().
Also ensure that we don't try to reschedule the automount timer with
a negative or zero timeout value.
Fixes: 22a1ae9a93fb ("NFS: If nfs_mountpoint_expiry_timeout < 0, do not expire submounts")
Cc: stable(a)vger.kernel.org # v5.5+
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index da67820462f2..fe19ae280262 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -145,6 +145,7 @@ struct vfsmount *nfs_d_automount(struct path *path)
struct vfsmount *mnt = ERR_PTR(-ENOMEM);
struct nfs_server *server = NFS_SERVER(d_inode(path->dentry));
struct nfs_client *client = server->nfs_client;
+ int timeout = READ_ONCE(nfs_mountpoint_expiry_timeout);
int ret;
if (IS_ROOT(path->dentry))
@@ -190,12 +191,12 @@ struct vfsmount *nfs_d_automount(struct path *path)
if (IS_ERR(mnt))
goto out_fc;
- if (nfs_mountpoint_expiry_timeout < 0)
+ mntget(mnt); /* prevent immediate expiration */
+ if (timeout <= 0)
goto out_fc;
- mntget(mnt); /* prevent immediate expiration */
mnt_set_expiry(mnt, &nfs_automount_list);
- schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout);
+ schedule_delayed_work(&nfs_automount_task, timeout);
out_fc:
put_fs_context(fc);
@@ -233,10 +234,11 @@ const struct inode_operations nfs_referral_inode_operations = {
static void nfs_expire_automounts(struct work_struct *work)
{
struct list_head *list = &nfs_automount_list;
+ int timeout = READ_ONCE(nfs_mountpoint_expiry_timeout);
mark_mounts_for_expiry(list);
- if (!list_empty(list))
- schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout);
+ if (!list_empty(list) && timeout > 0)
+ schedule_delayed_work(&nfs_automount_task, timeout);
}
void nfs_release_automount_timer(void)
Try to make RPS dramatically more responsive by shrinking the evaluation
intervales by a factor of 100! The issue is as we now park the GPU
rapidly upon idling, a short or bursty workload such as the composited
desktop never sustains enough work to fill and complete an evaluation
window. As such, the frequency we program remains stuck. This was first
reported as once boosted, we never relinquished the boost [see commit
21abf0bf168d ("drm/i915/gt: Treat idling as a RPS downclock event")] but
it equally applies in the order direction for bursty workloads that
*need* low latency, like desktop animations.
What we could try is preserve the incomplete EI history across idling,
it is not clear whether that would be effective, nor whether the
presumption of continuous workloads is accurate. A clearer path seems to
treat it as symptomatic that we fail to handle bursty workload with the
current EI, and seek to address that by shrinking the EI so the
evaluations are run much more often.
This will likely entail more frequent interrupts, and by the time we
process the interrupt in the bottom half [from inside a worker], the
workload on the GPU has changed. To address the changeable nature, in
the previous patch we compared the previous complete EI with the
interrupt request and only up/down clock if both agree. The impact of
asking for, and presumably, receiving more interrupts is still to be
determined and mitigations sought. The first idea is to differentiate
between up/down responsivity and make upclocking more responsive than
downlocking. This should both help thwart jitter on bursty workloads by
making it easier to increase than it is to decrease frequencies, and
reduce the number of interrupts we would need to process.
Fixes: 21abf0bf168d ("drm/i915/gt: Treat idling as a RPS downclock event")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Andi Shyti <andi.shyti(a)intel.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Francisco Jerez <currojerez(a)riseup.net>
Cc: <stable(a)vger.kernel.org> # v5.5+
---
drivers/gpu/drm/i915/gt/intel_rps.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 367132092bed..47ddb25edc97 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -542,37 +542,38 @@ static void rps_set_power(struct intel_rps *rps, int new_power)
/* Note the units here are not exactly 1us, but 1280ns. */
switch (new_power) {
case LOW_POWER:
- /* Upclock if more than 95% busy over 16ms */
- ei_up = 16000;
+ /* Upclock if more than 95% busy over 160us */
+ ei_up = 160;
threshold_up = 95;
- /* Downclock if less than 85% busy over 32ms */
- ei_down = 32000;
+ /* Downclock if less than 85% busy over 1600us */
+ ei_down = 1600;
threshold_down = 85;
break;
case BETWEEN:
- /* Upclock if more than 90% busy over 13ms */
- ei_up = 13000;
+ /* Upclock if more than 90% busy over 160us */
+ ei_up = 160;
threshold_up = 90;
- /* Downclock if less than 75% busy over 32ms */
- ei_down = 32000;
+ /* Downclock if less than 75% busy over 1600us */
+ ei_down = 1600;
threshold_down = 75;
break;
case HIGH_POWER:
- /* Upclock if more than 85% busy over 10ms */
- ei_up = 10000;
+ /* Upclock if more than 85% busy over 160us */
+ ei_up = 160;
threshold_up = 85;
- /* Downclock if less than 60% busy over 32ms */
- ei_down = 32000;
+ /* Downclock if less than 60% busy over 1600us */
+ ei_down = 1600;
threshold_down = 60;
break;
}
- /* When byt can survive without system hang with dynamic
+ /*
+ * When byt can survive without system hang with dynamic
* sw freq adjustments, this restriction can be lifted.
*/
if (IS_VALLEYVIEW(i915))
--
2.20.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6e8a36c13382b7165d23928caee8d91c1b301142 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Mon, 30 Mar 2020 18:22:44 +0300
Subject: [PATCH] drm/i915/icl+: Don't enable DDI IO power on a TypeC port in
TBT mode
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The DDI IO power well must not be enabled for a TypeC port in TBT mode,
ensure this during driver loading/system resume.
This gets rid of error messages like
[drm] *ERROR* power well DDI E TC2 IO state mismatch (refcount 1/enabled 0)
and avoids leaking the power ref when disabling the output.
Cc: <stable(a)vger.kernel.org> # v5.4+
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: José Roberto de Souza <jose.souza(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330152244.11316-1-imre.d…
(cherry picked from commit f77a2db27f26c3ccba0681f7e89fef083718f07f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 73d0f4648c06..5202fdec8e0a 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -1869,7 +1869,11 @@ static void intel_ddi_get_power_domains(struct intel_encoder *encoder,
return;
dig_port = enc_to_dig_port(encoder);
- intel_display_power_get(dev_priv, dig_port->ddi_io_power_domain);
+
+ if (!intel_phy_is_tc(dev_priv, phy) ||
+ dig_port->tc_mode != TC_PORT_TBT_ALT)
+ intel_display_power_get(dev_priv,
+ dig_port->ddi_io_power_domain);
/*
* AUX power is only needed for (e)DP mode, and for HDMI mode on TC
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 487eca11a321ef33bcf4ca5adb3c0c4954db1b58 Mon Sep 17 00:00:00 2001
From: Prike Liang <Prike.Liang(a)amd.com>
Date: Tue, 7 Apr 2020 20:21:26 +0800
Subject: [PATCH] drm/amdgpu: fix gfx hang during suspend with video playback
(v2)
The system will be hang up during S3 suspend because of SMU is pending
for GC not respose the register CP_HQD_ACTIVE access request.This issue
root cause of accessing the GC register under enter GFX CGGPG and can
be fixed by disable GFX CGPG before perform suspend.
v2: Use disable the GFX CGPG instead of RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang(a)amd.com>
Reviewed-by: Huang Rui <ray.huang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index faa3e7102156..559dc24ef436 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2340,8 +2340,6 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
{
int i, r;
- amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
- amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.valid)
@@ -3356,6 +3354,9 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
}
}
+ amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
+ amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
+
amdgpu_amdkfd_suspend(adev, !fbcon);
amdgpu_ras_suspend(adev);
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 487eca11a321ef33bcf4ca5adb3c0c4954db1b58 Mon Sep 17 00:00:00 2001
From: Prike Liang <Prike.Liang(a)amd.com>
Date: Tue, 7 Apr 2020 20:21:26 +0800
Subject: [PATCH] drm/amdgpu: fix gfx hang during suspend with video playback
(v2)
The system will be hang up during S3 suspend because of SMU is pending
for GC not respose the register CP_HQD_ACTIVE access request.This issue
root cause of accessing the GC register under enter GFX CGGPG and can
be fixed by disable GFX CGPG before perform suspend.
v2: Use disable the GFX CGPG instead of RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang(a)amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang(a)amd.com>
Reviewed-by: Huang Rui <ray.huang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index faa3e7102156..559dc24ef436 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2340,8 +2340,6 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
{
int i, r;
- amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
- amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.valid)
@@ -3356,6 +3354,9 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
}
}
+ amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
+ amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
+
amdgpu_amdkfd_suspend(adev, !fbcon);
amdgpu_ras_suspend(adev);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ea36ec8623f56791c6ff6738d0509b7920f85220 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Sun, 2 Feb 2020 17:16:31 +0000
Subject: [PATCH] drm: Remove PageReserved manipulation from drm_pci_alloc
drm_pci_alloc/drm_pci_free are very thin wrappers around the core dma
facilities, and we have no special reason within the drm layer to behave
differently. In particular, since
commit de09d31dd38a50fdce106c15abd68432eebbd014
Author: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Date: Fri Jan 15 16:51:42 2016 -0800
page-flags: define PG_reserved behavior on compound pages
As far as I can see there's no users of PG_reserved on compound pages.
Let's use PF_NO_COMPOUND here.
it has been illegal to combine GFP_COMP with SetPageReserved, so lets
stop doing both and leave the dma layer to its own devices.
Reported-by: Taketo Kabe
Bug: https://gitlab.freedesktop.org/drm/intel/issues/1027
Fixes: de09d31dd38a ("page-flags: define PG_reserved behavior on compound pages")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org> # v4.5+
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200202171635.4039044-1-chri…
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index f2e43d341980..d16dac4325f9 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -51,8 +51,6 @@
drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t align)
{
drm_dma_handle_t *dmah;
- unsigned long addr;
- size_t sz;
/* pci_alloc_consistent only guarantees alignment to the smallest
* PAGE_SIZE order which is greater than or equal to the requested size.
@@ -68,20 +66,13 @@ drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t ali
dmah->size = size;
dmah->vaddr = dma_alloc_coherent(&dev->pdev->dev, size,
&dmah->busaddr,
- GFP_KERNEL | __GFP_COMP);
+ GFP_KERNEL);
if (dmah->vaddr == NULL) {
kfree(dmah);
return NULL;
}
- /* XXX - Is virt_to_page() legal for consistent mem? */
- /* Reserve */
- for (addr = (unsigned long)dmah->vaddr, sz = size;
- sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
- SetPageReserved(virt_to_page((void *)addr));
- }
-
return dmah;
}
@@ -94,19 +85,9 @@ EXPORT_SYMBOL(drm_pci_alloc);
*/
void __drm_legacy_pci_free(struct drm_device * dev, drm_dma_handle_t * dmah)
{
- unsigned long addr;
- size_t sz;
-
- if (dmah->vaddr) {
- /* XXX - Is virt_to_page() legal for consistent mem? */
- /* Unreserve */
- for (addr = (unsigned long)dmah->vaddr, sz = dmah->size;
- sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
- ClearPageReserved(virt_to_page((void *)addr));
- }
+ if (dmah->vaddr)
dma_free_coherent(&dev->pdev->dev, dmah->size, dmah->vaddr,
dmah->busaddr);
- }
}
/**
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ea36ec8623f56791c6ff6738d0509b7920f85220 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Sun, 2 Feb 2020 17:16:31 +0000
Subject: [PATCH] drm: Remove PageReserved manipulation from drm_pci_alloc
drm_pci_alloc/drm_pci_free are very thin wrappers around the core dma
facilities, and we have no special reason within the drm layer to behave
differently. In particular, since
commit de09d31dd38a50fdce106c15abd68432eebbd014
Author: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Date: Fri Jan 15 16:51:42 2016 -0800
page-flags: define PG_reserved behavior on compound pages
As far as I can see there's no users of PG_reserved on compound pages.
Let's use PF_NO_COMPOUND here.
it has been illegal to combine GFP_COMP with SetPageReserved, so lets
stop doing both and leave the dma layer to its own devices.
Reported-by: Taketo Kabe
Bug: https://gitlab.freedesktop.org/drm/intel/issues/1027
Fixes: de09d31dd38a ("page-flags: define PG_reserved behavior on compound pages")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org> # v4.5+
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200202171635.4039044-1-chri…
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index f2e43d341980..d16dac4325f9 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -51,8 +51,6 @@
drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t align)
{
drm_dma_handle_t *dmah;
- unsigned long addr;
- size_t sz;
/* pci_alloc_consistent only guarantees alignment to the smallest
* PAGE_SIZE order which is greater than or equal to the requested size.
@@ -68,20 +66,13 @@ drm_dma_handle_t *drm_pci_alloc(struct drm_device * dev, size_t size, size_t ali
dmah->size = size;
dmah->vaddr = dma_alloc_coherent(&dev->pdev->dev, size,
&dmah->busaddr,
- GFP_KERNEL | __GFP_COMP);
+ GFP_KERNEL);
if (dmah->vaddr == NULL) {
kfree(dmah);
return NULL;
}
- /* XXX - Is virt_to_page() legal for consistent mem? */
- /* Reserve */
- for (addr = (unsigned long)dmah->vaddr, sz = size;
- sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
- SetPageReserved(virt_to_page((void *)addr));
- }
-
return dmah;
}
@@ -94,19 +85,9 @@ EXPORT_SYMBOL(drm_pci_alloc);
*/
void __drm_legacy_pci_free(struct drm_device * dev, drm_dma_handle_t * dmah)
{
- unsigned long addr;
- size_t sz;
-
- if (dmah->vaddr) {
- /* XXX - Is virt_to_page() legal for consistent mem? */
- /* Unreserve */
- for (addr = (unsigned long)dmah->vaddr, sz = dmah->size;
- sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
- ClearPageReserved(virt_to_page((void *)addr));
- }
+ if (dmah->vaddr)
dma_free_coherent(&dev->pdev->dev, dmah->size, dmah->vaddr,
dmah->busaddr);
- }
}
/**
The patch below does not apply to the 5.5-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8732fe46b20c951493bfc4dba0ad08efdf41de81 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Wed, 22 Jan 2020 14:43:20 -0500
Subject: [PATCH] drm/dp_mst: Fix clearing payload state on topology disable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The issues caused by:
commit 64e62bdf04ab ("drm/dp_mst: Remove VCPI while disabling topology
mgr")
Prompted me to take a closer look at how we clear the payload state in
general when disabling the topology, and it turns out there's actually
two subtle issues here.
The first is that we're not grabbing &mgr.payload_lock when clearing the
payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical
lock order is &mgr.payload_lock -> &mgr.lock (because we always want
&mgr.lock to be the inner-most lock so topology validation always
works), this makes perfect sense. It also means that -technically- there
could be racing between someone calling
drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a
modeset occurring that's modifying the payload state at the same time.
The second is the more obvious issue that Wayne Lin discovered, that
we're not clearing proposed_payloads when disabling the topology.
I actually can't see any obvious places where the racing caused by the
first issue would break something, and it could be that some of our
higher-level locks already prevent this by happenstance, but better safe
then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst()
first grabs &mgr.payload_lock followed by &mgr.lock so that we never
race when modifying the payload state. Then, we also clear
proposed_payloads to fix the original issue of enabling a new topology
with a dirty payload state. This doesn't clear any of the drm_dp_vcpi
structures, but those are getting destroyed along with the ports anyway.
Changes since v1:
* Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead -
vsyrjala
Cc: Sean Paul <sean(a)poorly.run>
Cc: Wayne Lin <Wayne.Lin(a)amd.com>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v4.4+
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122194321.14953-1-lyude@…
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index e78c73e975ed..4104f15f4594 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3447,6 +3447,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
int ret = 0;
struct drm_dp_mst_branch *mstb = NULL;
+ mutex_lock(&mgr->payload_lock);
mutex_lock(&mgr->lock);
if (mst_state == mgr->mst_state)
goto out_unlock;
@@ -3505,7 +3506,10 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
/* this can fail if the device is gone */
drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, 0);
ret = 0;
- memset(mgr->payloads, 0, mgr->max_payloads * sizeof(struct drm_dp_payload));
+ memset(mgr->payloads, 0,
+ mgr->max_payloads * sizeof(mgr->payloads[0]));
+ memset(mgr->proposed_vcpis, 0,
+ mgr->max_payloads * sizeof(mgr->proposed_vcpis[0]));
mgr->payload_mask = 0;
set_bit(0, &mgr->payload_mask);
mgr->vcpi_mask = 0;
@@ -3514,6 +3518,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
out_unlock:
mutex_unlock(&mgr->lock);
+ mutex_unlock(&mgr->payload_lock);
if (mstb)
drm_dp_mst_topology_put_mstb(mstb);
return ret;