 
            From: Zhenzhong Duan zhenzhong.duan@gmail.com
[ Upstream commit 5d7f7d1d5e01c22894dee7c9c9266500478dca99 ]
The original code is a nop as i_mce.status is or'ed with part of itself, fix it.
Fixes: a1300e505297 ("x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts") Signed-off-by: Zhenzhong Duan zhenzhong.duan@gmail.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Yazen Ghannam yazen.ghannam@amd.com Link: https://lkml.kernel.org/r/20200611023238.3830-1-zhenzhong.duan@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/mce/inject.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c index 1f30117b24ba7..eb2d41c1816d6 100644 --- a/arch/x86/kernel/cpu/mce/inject.c +++ b/arch/x86/kernel/cpu/mce/inject.c @@ -511,7 +511,7 @@ static void do_inject(void) */ if (inj_type == DFR_INT_INJ) { i_mce.status |= MCI_STATUS_DEFERRED; - i_mce.status |= (i_mce.status & ~MCI_STATUS_UC); + i_mce.status &= ~MCI_STATUS_UC; }
/*
 
            From: Vincent Guittot vincent.guittot@linaro.org
[ Upstream commit 3ea2f097b17e13a8280f1f9386c331b326a3dbef ]
With commit: 'b7031a02ec75 ("sched/fair: Add NOHZ_STATS_KICK")' rebalance_domains of the local cfs_rq happens before others idle cpus have updated nohz.next_balance and its value is overwritten.
Move the update of nohz.next_balance for other idles cpus before balancing and updating the next_balance of local cfs_rq.
Also, the nohz.next_balance is now updated only if all idle cpus got a chance to rebalance their domains and the idle balance has not been aborted because of new activities on the CPU. In case of need_resched, the idle load balance will be kick the next jiffie in order to address remaining ilb.
Fixes: b7031a02ec75 ("sched/fair: Add NOHZ_STATS_KICK") Reported-by: Peng Liu iwtbavbm@gmail.com Signed-off-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Acked-by: Mel Gorman mgorman@suse.de Link: https://lkml.kernel.org/r/20200609123748.18636-1-vincent.guittot@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9b16080093be1..20bf1f66733ac 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9385,7 +9385,12 @@ static void kick_ilb(unsigned int flags) { int ilb_cpu;
- nohz.next_balance++; + /* + * Increase nohz.next_balance only when if full ilb is triggered but + * not if we only update stats. + */ + if (flags & NOHZ_BALANCE_KICK) + nohz.next_balance = jiffies+1;
ilb_cpu = find_new_ilb();
@@ -9703,6 +9708,14 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, } }
+ /* + * next_balance will be updated only when there is a need. + * When the CPU is attached to null domain for ex, it will not be + * updated. + */ + if (likely(update_next_balance)) + nohz.next_balance = next_balance; + /* Newly idle CPU doesn't need an update */ if (idle != CPU_NEWLY_IDLE) { update_blocked_averages(this_cpu); @@ -9723,14 +9736,6 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, if (has_blocked_load) WRITE_ONCE(nohz.has_blocked, 1);
- /* - * next_balance will be updated only when there is a need. - * When the CPU is attached to null domain for ex, it will not be - * updated. - */ - if (likely(update_next_balance)) - nohz.next_balance = next_balance; - return ret; }
 
            From: Peng Liu iwtbavbm@gmail.com
[ Upstream commit 9b1b234bb86bcdcdb142e900d39b599185465dbb ]
During sched domain init, we check whether non-topological SD_flags are returned by tl->sd_flags(), if found, fire a waning and correct the violation, but the code failed to correct the violation. Correct this.
Fixes: 143e1e28cb40 ("sched: Rework sched_domain topology definition") Signed-off-by: Peng Liu iwtbavbm@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20200609150936.GA13060@iZj6chx1xj0e0buvshuecpZ Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index 1fa1e13a59446..ffaa97a8d4051 100644 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -1333,7 +1333,7 @@ sd_init(struct sched_domain_topology_level *tl, sd_flags = (*tl->sd_flags)(); if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS, "wrong sd_flags in topology description\n")) - sd_flags &= ~TOPOLOGY_SD_FLAGS; + sd_flags &= TOPOLOGY_SD_FLAGS;
/* Apply detected topology flags */ sd_flags |= dflags;
 
            From: Heiko Stuebner heiko.stuebner@theobroma-systems.com
[ Upstream commit 2300e6dab473e93181cf76e4fe6671aa3d24c57b ]
The lion gmac node currently uses opposite active-values for the gmac phy reset pin. The gpio-declaration uses active-high while the separate snps,reset-active-low property marks the pin as active low.
While on the kernel side this works ok, other DT users may get confused - as seen with uboot right now.
So bring this in line and make both properties match, similar to the other Rockchip board.
Fixes: d99a02bcfa81 ("arm64: dts: rockchip: add RK3368-uQ7 (Lion) SoM") Signed-off-by: Heiko Stuebner heiko.stuebner@theobroma-systems.com Link: https://lore.kernel.org/r/20200607212909.920575-1-heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3368-lion.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3368-lion.dtsi b/arch/arm64/boot/dts/rockchip/rk3368-lion.dtsi index e17311e090826..216aafd90e7f1 100644 --- a/arch/arm64/boot/dts/rockchip/rk3368-lion.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3368-lion.dtsi @@ -156,7 +156,7 @@ &gmac { pinctrl-0 = <&rgmii_pins>; snps,reset-active-low; snps,reset-delays-us = <0 10000 50000>; - snps,reset-gpio = <&gpio3 RK_PB3 GPIO_ACTIVE_HIGH>; + snps,reset-gpio = <&gpio3 RK_PB3 GPIO_ACTIVE_LOW>; tx_delay = <0x10>; rx_delay = <0x10>; status = "okay";
 
            From: Heiko Stuebner heiko.stuebner@theobroma-systems.com
[ Upstream commit 7a7184f6cfa9279f1a1c10a1845d247d7fad54ff ]
The puma vcc5v0_host regulator node currently uses opposite active-values for the enable pin. The gpio-declaration uses active-high while the separate enable-active-low property marks the pin as active low.
While on the kernel side this works ok, other DT users may get confused - as seen with uboot right now.
So bring this in line and make both properties match, similar to the gmac fix.
Fixes: 2c66fc34e945 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM") Signed-off-by: Heiko Stuebner heiko.stuebner@theobroma-systems.com Link: https://lore.kernel.org/r/20200604091239.424318-1-heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi index 62ea288a1a70b..fb47e4046f4e4 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi @@ -101,7 +101,7 @@ vcc3v3_sys: vcc3v3-sys {
vcc5v0_host: vcc5v0-host-regulator { compatible = "regulator-fixed"; - gpio = <&gpio4 RK_PA3 GPIO_ACTIVE_HIGH>; + gpio = <&gpio4 RK_PA3 GPIO_ACTIVE_LOW>; enable-active-low; pinctrl-names = "default"; pinctrl-0 = <&vcc5v0_host_en>;
 
            From: Heiko Stuebner heiko.stuebner@theobroma-systems.com
[ Upstream commit 8a445086f8af0b7b9bd8d1901d6f306bb154f70d ]
The puma gmac node currently uses opposite active-values for the gmac phy reset pin. The gpio-declaration uses active-high while the separate snps,reset-active-low property marks the pin as active low.
While on the kernel side this works ok, other DT users may get confused - as seen with uboot right now.
So bring this in line and make both properties match, similar to the other Rockchip board.
Fixes: 2c66fc34e945 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM") Signed-off-by: Heiko Stuebner heiko.stuebner@theobroma-systems.com Link: https://lore.kernel.org/r/20200603132836.362519-1-heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi index fb47e4046f4e4..45b86933c6ea0 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi @@ -157,7 +157,7 @@ &gmac { phy-mode = "rgmii"; pinctrl-names = "default"; pinctrl-0 = <&rgmii_pins>; - snps,reset-gpio = <&gpio3 RK_PC0 GPIO_ACTIVE_HIGH>; + snps,reset-gpio = <&gpio3 RK_PC0 GPIO_ACTIVE_LOW>; snps,reset-active-low; snps,reset-delays-us = <0 10000 50000>; tx_delay = <0x10>;
 
            From: Qiushi Wu wu000273@umn.edu
[ Upstream commit 17ed808ad243192fb923e4e653c1338d3ba06207 ]
When kobject_init_and_add() returns an error, it should be handled because kobject_init_and_add() takes a reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object.
Therefore, replace calling kfree() and call kobject_put() and add a missing kobject_put() in the edac_device_register_sysfs_main_kobj() error path.
[ bp: Massage and merge into a single patch. ]
Fixes: b2ed215a3338 ("Kobject: change drivers/edac to use kobject_init_and_add") Signed-off-by: Qiushi Wu wu000273@umn.edu Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20200528202238.18078-1-wu000273@umn.edu Link: https://lkml.kernel.org/r/20200528203526.20908-1-wu000273@umn.edu Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/edac/edac_device_sysfs.c | 1 + drivers/edac/edac_pci_sysfs.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c index 0e7ea3591b781..5e75937537997 100644 --- a/drivers/edac/edac_device_sysfs.c +++ b/drivers/edac/edac_device_sysfs.c @@ -275,6 +275,7 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
/* Error exit stack */ err_kobj_reg: + kobject_put(&edac_dev->kobj); module_put(edac_dev->owner);
err_out: diff --git a/drivers/edac/edac_pci_sysfs.c b/drivers/edac/edac_pci_sysfs.c index 72c9eb9fdffbe..53042af7262e2 100644 --- a/drivers/edac/edac_pci_sysfs.c +++ b/drivers/edac/edac_pci_sysfs.c @@ -386,7 +386,7 @@ static int edac_pci_main_kobj_setup(void)
/* Error unwind statck */ kobject_init_and_add_fail: - kfree(edac_pci_top_main_kobj); + kobject_put(edac_pci_top_main_kobj);
kzalloc_fail: module_put(THIS_MODULE);
 
            From: Herbert Xu herbert@gondor.apana.org.au
[ Upstream commit 3906f640224dbe7714b52b66d7d68c0812808e19 ]
The crypto notify call occurs with a read mutex held so you must not do any substantial work directly. In particular, you cannot call crypto_alloc_* as they may trigger further notifications which may dead-lock in the presence of another writer.
This patch fixes this by postponing the work into a work queue and taking the same lock in the module init function.
While we're at it this patch also ensures that all RCU accesses are marked appropriately (tested with sparse).
Finally this also reveals a race condition in module param show function as it may be called prior to the module init function. It's fixed by testing whether crct10dif_tfm is NULL (this is true iff the init function has not completed assuming fallback is false).
Fixes: 11dcb1037f40 ("crc-t10dif: Allow current transform to be...") Fixes: b76377543b73 ("crc-t10dif: Pick better transform if one...") Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Reviewed-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- lib/crc-t10dif.c | 54 +++++++++++++++++++++++++++++++++++++----------- 1 file changed, 42 insertions(+), 12 deletions(-)
diff --git a/lib/crc-t10dif.c b/lib/crc-t10dif.c index 8cc01a6034165..c9acf1c12cfcb 100644 --- a/lib/crc-t10dif.c +++ b/lib/crc-t10dif.c @@ -19,39 +19,46 @@ static struct crypto_shash __rcu *crct10dif_tfm; static struct static_key crct10dif_fallback __read_mostly; static DEFINE_MUTEX(crc_t10dif_mutex); +static struct work_struct crct10dif_rehash_work;
-static int crc_t10dif_rehash(struct notifier_block *self, unsigned long val, void *data) +static int crc_t10dif_notify(struct notifier_block *self, unsigned long val, void *data) { struct crypto_alg *alg = data; - struct crypto_shash *new, *old;
if (val != CRYPTO_MSG_ALG_LOADED || static_key_false(&crct10dif_fallback) || strncmp(alg->cra_name, CRC_T10DIF_STRING, strlen(CRC_T10DIF_STRING))) return 0;
+ schedule_work(&crct10dif_rehash_work); + return 0; +} + +static void crc_t10dif_rehash(struct work_struct *work) +{ + struct crypto_shash *new, *old; + mutex_lock(&crc_t10dif_mutex); old = rcu_dereference_protected(crct10dif_tfm, lockdep_is_held(&crc_t10dif_mutex)); if (!old) { mutex_unlock(&crc_t10dif_mutex); - return 0; + return; } new = crypto_alloc_shash("crct10dif", 0, 0); if (IS_ERR(new)) { mutex_unlock(&crc_t10dif_mutex); - return 0; + return; } rcu_assign_pointer(crct10dif_tfm, new); mutex_unlock(&crc_t10dif_mutex);
synchronize_rcu(); crypto_free_shash(old); - return 0; }
static struct notifier_block crc_t10dif_nb = { - .notifier_call = crc_t10dif_rehash, + .notifier_call = crc_t10dif_notify, };
__u16 crc_t10dif_update(__u16 crc, const unsigned char *buffer, size_t len) @@ -86,19 +93,26 @@ EXPORT_SYMBOL(crc_t10dif);
static int __init crc_t10dif_mod_init(void) { + struct crypto_shash *tfm; + + INIT_WORK(&crct10dif_rehash_work, crc_t10dif_rehash); crypto_register_notifier(&crc_t10dif_nb); - crct10dif_tfm = crypto_alloc_shash("crct10dif", 0, 0); - if (IS_ERR(crct10dif_tfm)) { + mutex_lock(&crc_t10dif_mutex); + tfm = crypto_alloc_shash("crct10dif", 0, 0); + if (IS_ERR(tfm)) { static_key_slow_inc(&crct10dif_fallback); - crct10dif_tfm = NULL; + tfm = NULL; } + RCU_INIT_POINTER(crct10dif_tfm, tfm); + mutex_unlock(&crc_t10dif_mutex); return 0; }
static void __exit crc_t10dif_mod_fini(void) { crypto_unregister_notifier(&crc_t10dif_nb); - crypto_free_shash(crct10dif_tfm); + cancel_work_sync(&crct10dif_rehash_work); + crypto_free_shash(rcu_dereference_protected(crct10dif_tfm, 1)); }
module_init(crc_t10dif_mod_init); @@ -106,11 +120,27 @@ module_exit(crc_t10dif_mod_fini);
static int crc_t10dif_transform_show(char *buffer, const struct kernel_param *kp) { + struct crypto_shash *tfm; + const char *name; + int len; + if (static_key_false(&crct10dif_fallback)) return sprintf(buffer, "fallback\n");
- return sprintf(buffer, "%s\n", - crypto_tfm_alg_driver_name(crypto_shash_tfm(crct10dif_tfm))); + rcu_read_lock(); + tfm = rcu_dereference(crct10dif_tfm); + if (!tfm) { + len = sprintf(buffer, "init\n"); + goto unlock; + } + + name = crypto_tfm_alg_driver_name(crypto_shash_tfm(tfm)); + len = sprintf(buffer, "%s\n", name); + +unlock: + rcu_read_unlock(); + + return len; }
module_param_call(transform, NULL, crc_t10dif_transform_show, NULL, 0644);
 
            From: Stephan Gerhold stephan@gerhold.net
[ Upstream commit 1b6a1a162defe649c5599d661b58ac64bb6f31b6 ]
msm8916-pins.dtsi specifies "bias-pull-none" for most of the audio pin configurations. This was likely copied from the qcom kernel fork where the same property was used for these audio pins.
However, "bias-pull-none" actually does not exist at all - not in mainline and not in downstream. I can only guess that the original intention was to configure "no pull", i.e. bias-disable.
Change it to that instead.
Fixes: 143bb9ad85b7 ("arm64: dts: qcom: add audio pinctrls") Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Stephan Gerhold stephan@gerhold.net Link: https://lore.kernel.org/r/20200605185916.318494-2-stephan@gerhold.net Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/msm8916-pins.dtsi | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi b/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi index 242aaea688040..1235830ffd0b7 100644 --- a/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi @@ -508,7 +508,7 @@ pinconf { pins = "gpio63", "gpio64", "gpio65", "gpio66", "gpio67", "gpio68"; drive-strength = <8>; - bias-pull-none; + bias-disable; }; }; cdc_pdm_lines_sus: pdm_lines_off { @@ -537,7 +537,7 @@ pinconf { pins = "gpio113", "gpio114", "gpio115", "gpio116"; drive-strength = <8>; - bias-pull-none; + bias-disable; }; };
@@ -565,7 +565,7 @@ pinmux { pinconf { pins = "gpio110"; drive-strength = <8>; - bias-pull-none; + bias-disable; }; };
@@ -591,7 +591,7 @@ pinmux { pinconf { pins = "gpio116"; drive-strength = <8>; - bias-pull-none; + bias-disable; }; }; ext_mclk_tlmm_lines_sus: mclk_lines_off { @@ -619,7 +619,7 @@ pinconf { pins = "gpio112", "gpio117", "gpio118", "gpio119"; drive-strength = <8>; - bias-pull-none; + bias-disable; }; }; ext_sec_tlmm_lines_sus: tlmm_lines_off {
 
            From: Luis Chamberlain mcgrof@kernel.org
[ Upstream commit bad8e64fb19d3a0de5e564d9a7271c31bd684369 ]
On commit 6ac93117ab00 ("blktrace: use existing disk debugfs directory") merged on v4.12 Omar fixed the original blktrace code for request-based drivers (multiqueue). This however left in place a possible crash, if you happen to abuse blktrace while racing to remove / add a device.
We used to use asynchronous removal of the request_queue, and with that the issue was easier to reproduce. Now that we have reverted to synchronous removal of the request_queue, the issue is still possible to reproduce, its however just a bit more difficult.
We essentially run two instances of break-blktrace which add/remove a loop device, and setup a blktrace and just never tear the blktrace down. We do this twice in parallel. This is easily reproduced with the script run_0004.sh from break-blktrace [0].
We can end up with two types of panics each reflecting where we race, one a failed blktrace setup:
[ 252.426751] debugfs: Directory 'loop0' with parent 'block' already present! [ 252.432265] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 252.436592] #PF: supervisor write access in kernel mode [ 252.439822] #PF: error_code(0x0002) - not-present page [ 252.442967] PGD 0 P4D 0 [ 252.444656] Oops: 0002 [#1] SMP NOPTI [ 252.446972] CPU: 10 PID: 1153 Comm: break-blktrace Tainted: G E 5.7.0-rc2-next-20200420+ #164 [ 252.452673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 252.456343] RIP: 0010:down_write+0x15/0x40 [ 252.458146] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 55 00 75 0f 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d [ 252.463638] RSP: 0018:ffffa626415abcc8 EFLAGS: 00010246 [ 252.464950] RAX: 0000000000000000 RBX: ffff958c25f0f5c0 RCX: ffffff8100000000 [ 252.466727] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0 [ 252.468482] RBP: 00000000000000a0 R08: 0000000000000000 R09: 0000000000000001 [ 252.470014] R10: 0000000000000000 R11: ffff958d1f9227ff R12: 0000000000000000 [ 252.471473] R13: ffff958c25ea5380 R14: ffffffff8cce15f1 R15: 00000000000000a0 [ 252.473346] FS: 00007f2e69dee540(0000) GS:ffff958c2fc80000(0000) knlGS:0000000000000000 [ 252.475225] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 252.476267] CR2: 00000000000000a0 CR3: 0000000427d10004 CR4: 0000000000360ee0 [ 252.477526] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 252.478776] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 252.479866] Call Trace: [ 252.480322] simple_recursive_removal+0x4e/0x2e0 [ 252.481078] ? debugfs_remove+0x60/0x60 [ 252.481725] ? relay_destroy_buf+0x77/0xb0 [ 252.482662] debugfs_remove+0x40/0x60 [ 252.483518] blk_remove_buf_file_callback+0x5/0x10 [ 252.484328] relay_close_buf+0x2e/0x60 [ 252.484930] relay_open+0x1ce/0x2c0 [ 252.485520] do_blk_trace_setup+0x14f/0x2b0 [ 252.486187] __blk_trace_setup+0x54/0xb0 [ 252.486803] blk_trace_ioctl+0x90/0x140 [ 252.487423] ? do_sys_openat2+0x1ab/0x2d0 [ 252.488053] blkdev_ioctl+0x4d/0x260 [ 252.488636] block_ioctl+0x39/0x40 [ 252.489139] ksys_ioctl+0x87/0xc0 [ 252.489675] __x64_sys_ioctl+0x16/0x20 [ 252.490380] do_syscall_64+0x52/0x180 [ 252.491032] entry_SYSCALL_64_after_hwframe+0x44/0xa9
And the other on the device removal:
[ 128.528940] debugfs: Directory 'loop0' with parent 'block' already present! [ 128.615325] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 128.619537] #PF: supervisor write access in kernel mode [ 128.622700] #PF: error_code(0x0002) - not-present page [ 128.625842] PGD 0 P4D 0 [ 128.627585] Oops: 0002 [#1] SMP NOPTI [ 128.629871] CPU: 12 PID: 544 Comm: break-blktrace Tainted: G E 5.7.0-rc2-next-20200420+ #164 [ 128.635595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 128.640471] RIP: 0010:down_write+0x15/0x40 [ 128.643041] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 55 00 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d [ 128.650180] RSP: 0018:ffffa9c3c05ebd78 EFLAGS: 00010246 [ 128.651820] RAX: 0000000000000000 RBX: ffff8ae9a6370240 RCX: ffffff8100000000 [ 128.653942] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0 [ 128.655720] RBP: 00000000000000a0 R08: 0000000000000002 R09: ffff8ae9afd2d3d0 [ 128.657400] R10: 0000000000000056 R11: 0000000000000000 R12: 0000000000000000 [ 128.659099] R13: 0000000000000000 R14: 0000000000000003 R15: 00000000000000a0 [ 128.660500] FS: 00007febfd995540(0000) GS:ffff8ae9afd00000(0000) knlGS:0000000000000000 [ 128.662204] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 128.663426] CR2: 00000000000000a0 CR3: 0000000420042003 CR4: 0000000000360ee0 [ 128.664776] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 128.666022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 128.667282] Call Trace: [ 128.667801] simple_recursive_removal+0x4e/0x2e0 [ 128.668663] ? debugfs_remove+0x60/0x60 [ 128.669368] debugfs_remove+0x40/0x60 [ 128.669985] blk_trace_free+0xd/0x50 [ 128.670593] __blk_trace_remove+0x27/0x40 [ 128.671274] blk_trace_shutdown+0x30/0x40 [ 128.671935] blk_release_queue+0x95/0xf0 [ 128.672589] kobject_put+0xa5/0x1b0 [ 128.673188] disk_release+0xa2/0xc0 [ 128.673786] device_release+0x28/0x80 [ 128.674376] kobject_put+0xa5/0x1b0 [ 128.674915] loop_remove+0x39/0x50 [loop] [ 128.675511] loop_control_ioctl+0x113/0x130 [loop] [ 128.676199] ksys_ioctl+0x87/0xc0 [ 128.676708] __x64_sys_ioctl+0x16/0x20 [ 128.677274] do_syscall_64+0x52/0x180 [ 128.677823] entry_SYSCALL_64_after_hwframe+0x44/0xa9
The common theme here is:
debugfs: Directory 'loop0' with parent 'block' already present
This crash happens because of how blktrace uses the debugfs directory where it places its files. Upon init we always create the same directory which would be needed by blktrace but we only do this for make_request drivers (multiqueue) block drivers. When you race a removal of these devices with a blktrace setup you end up in a situation where the make_request recursive debugfs removal will sweep away the blktrace files and then later blktrace will also try to remove individual dentries which are already NULL. The inverse is also possible and hence the two types of use after frees.
We don't create the block debugfs directory on init for these types of block devices:
* request-based block driver block devices * every possible partition * scsi-generic
And so, this race should in theory only be possible with make_request drivers.
We can fix the UAF by simply re-using the debugfs directory for make_request drivers (multiqueue) and only creating the ephemeral directory for the other type of block devices. The new clarifications on relying on the q->blk_trace_mutex *and* also checking for q->blk_trace *prior* to processing a blktrace ensures the debugfs directories are only created if no possible directory name clashes are possible.
This goes tested with:
o nvme partitions o ISCSI with tgt, and blktracing against scsi-generic with: o block o tape o cdrom o media changer o blktests
This patch is part of the work which disputes the severity of CVE-2019-19770 which shows this issue is not a core debugfs issue, but a misuse of debugfs within blktace.
Fixes: 6ac93117ab00 ("blktrace: use existing disk debugfs directory") Reported-by: syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com Signed-off-by: Luis Chamberlain mcgrof@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Cc: Bart Van Assche bvanassche@acm.org Cc: Omar Sandoval osandov@fb.com Cc: Hannes Reinecke hare@suse.com Cc: Nicolai Stange nstange@suse.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Michal Hocko mhocko@kernel.org Cc: "Martin K. Petersen" martin.petersen@oracle.com Cc: "James E.J. Bottomley" jejb@linux.ibm.com Cc: yu kuai yukuai3@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/blktrace.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index eaee960153e1e..a4c8f9d9522e4 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -521,10 +521,18 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, if (!bt->msg_data) goto err;
- ret = -ENOENT; - - dir = debugfs_lookup(buts->name, blk_debugfs_root); - if (!dir) +#ifdef CONFIG_BLK_DEBUG_FS + /* + * When tracing whole make_request drivers (multiqueue) block devices, + * reuse the existing debugfs directory created by the block layer on + * init. For request-based block devices, all partitions block devices, + * and scsi-generic block devices we create a temporary new debugfs + * directory that will be removed once the trace ends. + */ + if (queue_is_mq(q) && bdev && bdev == bdev->bd_contains) + dir = q->debugfs_dir; + else +#endif bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root);
bt->dev = dev; @@ -565,8 +573,6 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
ret = 0; err: - if (dir && !bt->dir) - dput(dir); if (ret) blk_trace_free(bt); return ret;
 
            From: Gilad Ben-Yossef gilad@benyossef.com
[ Upstream commit 9bc6165d608d676f05d8bf156a2c9923ee38d05b ]
Fix a small resource leak on the error path of cipher processing.
Signed-off-by: Gilad Ben-Yossef gilad@benyossef.com Fixes: 63ee04c8b491e ("crypto: ccree - add skcipher support") Cc: Markus Elfring Markus.Elfring@web.de Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/ccree/cc_cipher.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/crypto/ccree/cc_cipher.c b/drivers/crypto/ccree/cc_cipher.c index cd9c60268bf8d..9bf0cce578f02 100644 --- a/drivers/crypto/ccree/cc_cipher.c +++ b/drivers/crypto/ccree/cc_cipher.c @@ -163,7 +163,6 @@ static int cc_cipher_init(struct crypto_tfm *tfm) skcipher_alg.base); struct device *dev = drvdata_to_dev(cc_alg->drvdata); unsigned int max_key_buf_size = cc_alg->skcipher_alg.max_keysize; - int rc = 0;
dev_dbg(dev, "Initializing context @%p for %s\n", ctx_p, crypto_tfm_alg_name(tfm)); @@ -175,10 +174,19 @@ static int cc_cipher_init(struct crypto_tfm *tfm) ctx_p->flow_mode = cc_alg->flow_mode; ctx_p->drvdata = cc_alg->drvdata;
+ if (ctx_p->cipher_mode == DRV_CIPHER_ESSIV) { + /* Alloc hash tfm for essiv */ + ctx_p->shash_tfm = crypto_alloc_shash("sha256-generic", 0, 0); + if (IS_ERR(ctx_p->shash_tfm)) { + dev_err(dev, "Error allocating hash tfm for ESSIV.\n"); + return PTR_ERR(ctx_p->shash_tfm); + } + } + /* Allocate key buffer, cache line aligned */ ctx_p->user.key = kmalloc(max_key_buf_size, GFP_KERNEL); if (!ctx_p->user.key) - return -ENOMEM; + goto free_shash;
dev_dbg(dev, "Allocated key buffer in context. key=@%p\n", ctx_p->user.key); @@ -190,21 +198,19 @@ static int cc_cipher_init(struct crypto_tfm *tfm) if (dma_mapping_error(dev, ctx_p->user.key_dma_addr)) { dev_err(dev, "Mapping Key %u B at va=%pK for DMA failed\n", max_key_buf_size, ctx_p->user.key); - return -ENOMEM; + goto free_key; } dev_dbg(dev, "Mapped key %u B at va=%pK to dma=%pad\n", max_key_buf_size, ctx_p->user.key, &ctx_p->user.key_dma_addr);
- if (ctx_p->cipher_mode == DRV_CIPHER_ESSIV) { - /* Alloc hash tfm for essiv */ - ctx_p->shash_tfm = crypto_alloc_shash("sha256-generic", 0, 0); - if (IS_ERR(ctx_p->shash_tfm)) { - dev_err(dev, "Error allocating hash tfm for ESSIV.\n"); - return PTR_ERR(ctx_p->shash_tfm); - } - } + return 0;
- return rc; +free_key: + kfree(ctx_p->user.key); +free_shash: + crypto_free_shash(ctx_p->shash_tfm); + + return -ENOMEM; }
static void cc_cipher_exit(struct crypto_tfm *tfm)
 
            From: Marek Szyprowski m.szyprowski@samsung.com
[ Upstream commit ea9dd8f61c8a890843f68e8dc0062ce78365aab8 ]
Call exynos_cpu_power_up(cpunr) unconditionally. This is needed by the big.LITTLE cpuidle driver and has no side-effects on other code paths.
The additional soft-reset call during little core power up has been added to properly boot all cores on the Exynos5422-based boards with secure firmware (like Odroid XU3/XU4 family). This however broke big.LITTLE CPUidle driver, which worked only on boards without secure firmware (like Peach-Pit/Pi Chromebooks). Apply the workaround only when board is running under secure firmware.
Fixes: 833b5794e330 ("ARM: EXYNOS: reset Little cores when cpu is up") Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Reviewed-by: Lukasz Luba lukasz.luba@arm.com Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-exynos/mcpm-exynos.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-exynos/mcpm-exynos.c b/arch/arm/mach-exynos/mcpm-exynos.c index 9a681b421ae11..cd861c57d5adf 100644 --- a/arch/arm/mach-exynos/mcpm-exynos.c +++ b/arch/arm/mach-exynos/mcpm-exynos.c @@ -26,6 +26,7 @@ #define EXYNOS5420_USE_L2_COMMON_UP_STATE BIT(30)
static void __iomem *ns_sram_base_addr __ro_after_init; +static bool secure_firmware __ro_after_init;
/* * The common v7_exit_coherency_flush API could not be used because of the @@ -58,15 +59,16 @@ static void __iomem *ns_sram_base_addr __ro_after_init; static int exynos_cpu_powerup(unsigned int cpu, unsigned int cluster) { unsigned int cpunr = cpu + (cluster * EXYNOS5420_CPUS_PER_CLUSTER); + bool state;
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster); if (cpu >= EXYNOS5420_CPUS_PER_CLUSTER || cluster >= EXYNOS5420_NR_CLUSTERS) return -EINVAL;
- if (!exynos_cpu_power_state(cpunr)) { - exynos_cpu_power_up(cpunr); - + state = exynos_cpu_power_state(cpunr); + exynos_cpu_power_up(cpunr); + if (!state && secure_firmware) { /* * This assumes the cluster number of the big cores(Cortex A15) * is 0 and the Little cores(Cortex A7) is 1. @@ -258,6 +260,8 @@ static int __init exynos_mcpm_init(void) return -ENOMEM; }
+ secure_firmware = exynos_secure_firmware_available(); + /* * To increase the stability of KFC reset we need to program * the PMU SPARE3 register
 
            From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit e0f1a30cf184821499eeb67daedd7a3f21bbcb0b ]
When, at probe time, an SCMI communication failure inhibits the capacity to query power domains states, such domains should be skipped.
Registering partially initialized SCMI power domains with genpd will causes kernel panic.
arm-scmi timed out in resp(caller: scmi_power_state_get+0xa4/0xd0) scmi-power-domain scmi_dev.2: failed to get state for domain 9 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x96000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=00000009f3691000 [0000000000000000] pgd=00000009f1ca0003, p4d=00000009f1ca0003, pud=00000009f35ea003, pmd=0000000000000000 Internal error: Oops: 96000006 [#1] PREEMPT SMP CPU: 2 PID: 381 Comm: bash Not tainted 5.8.0-rc1-00011-gebd118c2cca8 #2 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 3 2020 Internal error: Oops: 96000006 [#1] PREEMPT SMP pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--) pc : of_genpd_add_provider_onecell+0x98/0x1f8 lr : of_genpd_add_provider_onecell+0x48/0x1f8 Call trace: of_genpd_add_provider_onecell+0x98/0x1f8 scmi_pm_domain_probe+0x174/0x1e8 scmi_dev_probe+0x90/0xe0 really_probe+0xe4/0x448 driver_probe_device+0xfc/0x168 device_driver_attach+0x7c/0x88 bind_store+0xe8/0x128 drv_attr_store+0x2c/0x40 sysfs_kf_write+0x4c/0x60 kernfs_fop_write+0x114/0x230 __vfs_write+0x24/0x50 vfs_write+0xbc/0x1e0 ksys_write+0x70/0xf8 __arm64_sys_write+0x24/0x30 el0_svc_common.constprop.3+0x94/0x160 do_el0_svc+0x2c/0x98 el0_sync_handler+0x148/0x1a8 el0_sync+0x158/0x180
Do not register any power domain that failed to be queried with genpd.
Fixes: 898216c97ed2 ("firmware: arm_scmi: add device power domain support using genpd") Link: https://lore.kernel.org/r/20200619220330.12217-1-cristian.marussi@arm.com Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/scmi_pm_domain.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/firmware/arm_scmi/scmi_pm_domain.c b/drivers/firmware/arm_scmi/scmi_pm_domain.c index 87f737e01473c..041f8152272bf 100644 --- a/drivers/firmware/arm_scmi/scmi_pm_domain.c +++ b/drivers/firmware/arm_scmi/scmi_pm_domain.c @@ -85,7 +85,10 @@ static int scmi_pm_domain_probe(struct scmi_device *sdev) for (i = 0; i < num_domains; i++, scmi_pd++) { u32 state;
- domains[i] = &scmi_pd->genpd; + if (handle->power_ops->state_get(handle, i, &state)) { + dev_warn(dev, "failed to get state for domain %d\n", i); + continue; + }
scmi_pd->domain = i; scmi_pd->handle = handle; @@ -94,13 +97,10 @@ static int scmi_pm_domain_probe(struct scmi_device *sdev) scmi_pd->genpd.power_off = scmi_pd_power_off; scmi_pd->genpd.power_on = scmi_pd_power_on;
- if (handle->power_ops->state_get(handle, i, &state)) { - dev_warn(dev, "failed to get state for domain %d\n", i); - continue; - } - pm_genpd_init(&scmi_pd->genpd, NULL, state == SCMI_POWER_STATE_GENERIC_OFF); + + domains[i] = &scmi_pd->genpd; }
scmi_pd_data->domains = domains;
 
            From: Alim Akhtar alim.akhtar@samsung.com
[ Upstream commit b072714bfc0e42c984b8fd6e069f3ca17de8137a ]
Once regulators are disabled after kernel boot, on Espresso board silent hang observed because of LDO7 being disabled. LDO7 actually provide power to CPU cores and non-cpu blocks circuitries. Keep this regulator always-on to fix this hang.
Fixes: 9589f7721e16 ("arm64: dts: Add S2MPS15 PMIC node on exynos7-espresso") Signed-off-by: Alim Akhtar alim.akhtar@samsung.com Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/exynos/exynos7-espresso.dts | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/exynos/exynos7-espresso.dts b/arch/arm64/boot/dts/exynos/exynos7-espresso.dts index 080e0f56e108f..61ee7b6a31594 100644 --- a/arch/arm64/boot/dts/exynos/exynos7-espresso.dts +++ b/arch/arm64/boot/dts/exynos/exynos7-espresso.dts @@ -157,6 +157,7 @@ ldo7_reg: LDO7 { regulator-min-microvolt = <700000>; regulator-max-microvolt = <1150000>; regulator-enable-ramp-delay = <125>; + regulator-always-on; };
ldo8_reg: LDO8 {
 
            From: Qais Yousef qais.yousef@arm.com
[ Upstream commit d81ae8aac85ca2e307d273f6dc7863a721bf054e ]
struct uclamp_rq was zeroed out entirely in assumption that in the first call to uclamp_rq_inc() they'd be initialized correctly in accordance to default settings.
But when next patch introduces a static key to skip uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil will fail to perform any frequency changes because the rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such. Which means all rqs are capped to 0 by default.
Fix it by making sure we do proper initialization at init without relying on uclamp_rq_inc() doing it later.
Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Tested-by: Lukasz Luba lukasz.luba@arm.com Link: https://lkml.kernel.org/r/20200630112123.12076-2-qais.yousef@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/core.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 38ae3cf9d173e..b34b5c6e25248 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1238,6 +1238,20 @@ static void uclamp_fork(struct task_struct *p) } }
+static void __init init_uclamp_rq(struct rq *rq) +{ + enum uclamp_id clamp_id; + struct uclamp_rq *uc_rq = rq->uclamp; + + for_each_clamp_id(clamp_id) { + uc_rq[clamp_id] = (struct uclamp_rq) { + .value = uclamp_none(clamp_id) + }; + } + + rq->uclamp_flags = 0; +} + static void __init init_uclamp(void) { struct uclamp_se uc_max = {}; @@ -1246,11 +1260,8 @@ static void __init init_uclamp(void)
mutex_init(&uclamp_mutex);
- for_each_possible_cpu(cpu) { - memset(&cpu_rq(cpu)->uclamp, 0, - sizeof(struct uclamp_rq)*UCLAMP_CNT); - cpu_rq(cpu)->uclamp_flags = 0; - } + for_each_possible_cpu(cpu) + init_uclamp_rq(cpu_rq(cpu));
for_each_clamp_id(clamp_id) { uclamp_se_set(&init_task.uclamp_req[clamp_id],
 
            From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit fcd2e0deae50bce48450f14c8fc5611b08d7438c ]
Currently we are not initializing the scmi clock with discrete rates correctly. We fetch the min_rate and max_rate value only for clocks with ranges and ignore the ones with discrete rates. This will lead to wrong initialization of rate range when clock supports discrete rate.
Fix this by using the first and the last rate in the sorted list of the discrete clock rates while registering the clock.
Link: https://lore.kernel.org/r/20200709081705.46084-2-sudeep.holla@arm.com Fixes: 6d6a1d82eaef7 ("clk: add support for clocks provided by SCMI") Reviewed-by: Stephen Boyd sboyd@kernel.org Reported-and-tested-by: Dien Pham dien.pham.ry@renesas.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/clk-scmi.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c index 886f7c5df51a9..e3cdb4a282fea 100644 --- a/drivers/clk/clk-scmi.c +++ b/drivers/clk/clk-scmi.c @@ -103,6 +103,8 @@ static const struct clk_ops scmi_clk_ops = { static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk) { int ret; + unsigned long min_rate, max_rate; + struct clk_init_data init = { .flags = CLK_GET_RATE_NOCACHE, .num_parents = 0, @@ -112,9 +114,23 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
sclk->hw.init = &init; ret = devm_clk_hw_register(dev, &sclk->hw); - if (!ret) - clk_hw_set_rate_range(&sclk->hw, sclk->info->range.min_rate, - sclk->info->range.max_rate); + if (ret) + return ret; + + if (sclk->info->rate_discrete) { + int num_rates = sclk->info->list.num_rates; + + if (num_rates <= 0) + return -EINVAL; + + min_rate = sclk->info->list.rates[0]; + max_rate = sclk->info->list.rates[num_rates - 1]; + } else { + min_rate = sclk->info->range.min_rate; + max_rate = sclk->info->range.max_rate; + } + + clk_hw_set_rate_range(&sclk->hw, min_rate, max_rate); return ret; }
 
            From: Finn Thain fthain@telegraphics.com.au
[ Upstream commit aeb445bf2194d83e12e85bf5c65baaf1f093bd8f ]
In the following sequence of calls, iop_do_send() gets called when the "send" channel is not in the IOP_MSG_IDLE state:
iop_ism_irq() iop_handle_send() (msg->handler)() iop_send_message() iop_do_send()
Avoid this by testing the channel state before calling iop_do_send().
When sending, and iop_send_queue is empty, call iop_do_send() because the channel is idle. If iop_send_queue is not empty, iop_do_send() will get called later by iop_handle_send().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain fthain@telegraphics.com.au Tested-by: Stan Johnson userm57@yahoo.com Cc: Joshua Thompson funaho@jurai.org Link: https://lore.kernel.org/r/6d667c39e53865661fa5a48f16829d18ed8abe54.159088033... Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/m68k/mac/iop.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/m68k/mac/iop.c b/arch/m68k/mac/iop.c index 9bfa170157688..d8f2282978f9c 100644 --- a/arch/m68k/mac/iop.c +++ b/arch/m68k/mac/iop.c @@ -416,7 +416,8 @@ static void iop_handle_send(uint iop_num, uint chan) msg->status = IOP_MSGSTATUS_UNUSED; msg = msg->next; iop_send_queue[iop_num][chan] = msg; - if (msg) iop_do_send(msg); + if (msg && iop_readb(iop, IOP_ADDR_SEND_STATE + chan) == IOP_MSG_IDLE) + iop_do_send(msg); }
/* @@ -490,16 +491,12 @@ int iop_send_message(uint iop_num, uint chan, void *privdata,
if (!(q = iop_send_queue[iop_num][chan])) { iop_send_queue[iop_num][chan] = msg; + iop_do_send(msg); } else { while (q->next) q = q->next; q->next = msg; }
- if (iop_readb(iop_base[iop_num], - IOP_ADDR_SEND_STATE + chan) == IOP_MSG_IDLE) { - iop_do_send(msg); - } - return 0; }
 
            From: Finn Thain fthain@telegraphics.com.au
[ Upstream commit 931fc82a6aaf4e2e4a5490addaa6a090d78c24a7 ]
When writing values to the IOP status/control register make sure those values do not have any extraneous bits that will clear interrupt flags.
To place the SCC IOP into bypass mode would be desirable but this is not achieved by writing IOP_DMAINACTIVE | IOP_RUN | IOP_AUTOINC | IOP_BYPASS to the control register. Drop this ineffective register write.
Remove the flawed and unused iop_bypass() function. Make use of the unused iop_stop() function.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain fthain@telegraphics.com.au Tested-by: Stan Johnson userm57@yahoo.com Cc: Joshua Thompson funaho@jurai.org Link: https://lore.kernel.org/r/09bcb7359a1719a18b551ee515da3c4c3cf709e6.159088033... Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/m68k/mac/iop.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/arch/m68k/mac/iop.c b/arch/m68k/mac/iop.c index d8f2282978f9c..c432bfafe63e2 100644 --- a/arch/m68k/mac/iop.c +++ b/arch/m68k/mac/iop.c @@ -183,7 +183,7 @@ static __inline__ void iop_writeb(volatile struct mac_iop *iop, __u16 addr, __u8
static __inline__ void iop_stop(volatile struct mac_iop *iop) { - iop->status_ctrl &= ~IOP_RUN; + iop->status_ctrl = IOP_AUTOINC; }
static __inline__ void iop_start(volatile struct mac_iop *iop) @@ -191,14 +191,9 @@ static __inline__ void iop_start(volatile struct mac_iop *iop) iop->status_ctrl = IOP_RUN | IOP_AUTOINC; }
-static __inline__ void iop_bypass(volatile struct mac_iop *iop) -{ - iop->status_ctrl |= IOP_BYPASS; -} - static __inline__ void iop_interrupt(volatile struct mac_iop *iop) { - iop->status_ctrl |= IOP_IRQ; + iop->status_ctrl = IOP_IRQ | IOP_RUN | IOP_AUTOINC; }
static int iop_alive(volatile struct mac_iop *iop) @@ -244,7 +239,6 @@ void __init iop_preinit(void) } else { iop_base[IOP_NUM_SCC] = (struct mac_iop *) SCC_IOP_BASE_QUADRA; } - iop_base[IOP_NUM_SCC]->status_ctrl = 0x87; iop_scc_present = 1; } else { iop_base[IOP_NUM_SCC] = NULL; @@ -256,7 +250,7 @@ void __init iop_preinit(void) } else { iop_base[IOP_NUM_ISM] = (struct mac_iop *) ISM_IOP_BASE_QUADRA; } - iop_base[IOP_NUM_ISM]->status_ctrl = 0; + iop_stop(iop_base[IOP_NUM_ISM]); iop_ism_present = 1; } else { iop_base[IOP_NUM_ISM] = NULL;
 
            From: Lu Wei luwei32@huawei.com
[ Upstream commit 71fbe886ce6dd0be17f20aded9c63fe58edd2806 ]
In the function check_acpi_dev(), if it fails to create platform device, the return value is ERR_PTR() or NULL. Thus it must use IS_ERR_OR_NULL() to check return value.
Fixes: ecc83e52b28c ("intel-hid: new hid event driver for hotkeys") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Lu Wei luwei32@huawei.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/intel-hid.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/intel-hid.c b/drivers/platform/x86/intel-hid.c index 7a506c1d01134..ad1399dcb21f5 100644 --- a/drivers/platform/x86/intel-hid.c +++ b/drivers/platform/x86/intel-hid.c @@ -570,7 +570,7 @@ check_acpi_dev(acpi_handle handle, u32 lvl, void *context, void **rv) return AE_OK;
if (acpi_match_device_ids(dev, ids) == 0) - if (acpi_create_platform_device(dev, NULL)) + if (!IS_ERR_OR_NULL(acpi_create_platform_device(dev, NULL))) dev_info(&dev->dev, "intel-hid: created platform device\n");
 
            From: Lu Wei luwei32@huawei.com
[ Upstream commit 64dd4a5a7d214a07e3d9f40227ec30ac8ba8796e ]
In the function check_acpi_dev(), if it fails to create platform device, the return value is ERR_PTR() or NULL. Thus it must use IS_ERR_OR_NULL() to check return value.
Fixes: 332e081225fc ("intel-vbtn: new driver for Intel Virtual Button") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Lu Wei luwei32@huawei.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/intel-vbtn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/intel-vbtn.c b/drivers/platform/x86/intel-vbtn.c index cb2a80fdd8f46..3393ee95077f6 100644 --- a/drivers/platform/x86/intel-vbtn.c +++ b/drivers/platform/x86/intel-vbtn.c @@ -286,7 +286,7 @@ check_acpi_dev(acpi_handle handle, u32 lvl, void *context, void **rv) return AE_OK;
if (acpi_match_device_ids(dev, ids) == 0) - if (acpi_create_platform_device(dev, NULL)) + if (!IS_ERR_OR_NULL(acpi_create_platform_device(dev, NULL))) dev_info(&dev->dev, "intel-vbtn: created platform device\n");
 
            From: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se
[ Upstream commit d344234abde938ae1062edb6c05852b0bafb4a03 ]
When adding the adv7180 device node the ports node was misspelled as port, fix this.
Fixes: 8cae359049a88b75 ("ARM: dts: gose: add composite video input") Signed-off-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Link: https://lore.kernel.org/r/20200704155856.3037010-2-niklas.soderlund+renesas@... Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/r8a7793-gose.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/r8a7793-gose.dts b/arch/arm/boot/dts/r8a7793-gose.dts index 42f3313e6988a..dc435ac95d23a 100644 --- a/arch/arm/boot/dts/r8a7793-gose.dts +++ b/arch/arm/boot/dts/r8a7793-gose.dts @@ -339,7 +339,7 @@ composite-in@20 { reg = <0x20>; remote = <&vin1>;
- port { + ports { #address-cells = <1>; #size-cells = <0>;
 
            From: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se
[ Upstream commit 59692ac5a7bb8c97ff440fc8917828083fbc38d6 ]
When adding the adv7612 device node the ports node was misspelled as port, fix this.
Fixes: bc63cd87f3ce924f ("ARM: dts: gose: add HDMI input") Signed-off-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Link: https://lore.kernel.org/r/20200713111016.523189-1-niklas.soderlund+renesas@r... Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/r8a7793-gose.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/r8a7793-gose.dts b/arch/arm/boot/dts/r8a7793-gose.dts index dc435ac95d23a..9f507393c3752 100644 --- a/arch/arm/boot/dts/r8a7793-gose.dts +++ b/arch/arm/boot/dts/r8a7793-gose.dts @@ -399,7 +399,7 @@ hdmi-in@4c { interrupts = <2 IRQ_TYPE_LEVEL_LOW>; default-input = <0>;
- port { + ports { #address-cells = <1>; #size-cells = <0>;
 
            From: yu kuai yukuai3@huawei.com
[ Upstream commit f87a4f022c44e5b87e842a9f3e644fba87e8385f ]
if of_find_device_by_node() succeed, at91_pm_sram_init() doesn't have a corresponding put_device(). Thus add a jump target to fix the exception handling for this function implementation.
Fixes: d2e467905596 ("ARM: at91: pm: use the mmio-sram pool to access SRAM") Signed-off-by: yu kuai yukuai3@huawei.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Link: https://lore.kernel.org/r/20200604123301.3905837-1-yukuai3@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-at91/pm.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-at91/pm.c b/arch/arm/mach-at91/pm.c index 52665f30d236d..6bc3000deb86e 100644 --- a/arch/arm/mach-at91/pm.c +++ b/arch/arm/mach-at91/pm.c @@ -592,13 +592,13 @@ static void __init at91_pm_sram_init(void) sram_pool = gen_pool_get(&pdev->dev, NULL); if (!sram_pool) { pr_warn("%s: sram pool unavailable!\n", __func__); - return; + goto out_put_device; }
sram_base = gen_pool_alloc(sram_pool, at91_pm_suspend_in_sram_sz); if (!sram_base) { pr_warn("%s: unable to alloc sram!\n", __func__); - return; + goto out_put_device; }
sram_pbase = gen_pool_virt_to_phys(sram_pool, sram_base); @@ -606,12 +606,17 @@ static void __init at91_pm_sram_init(void) at91_pm_suspend_in_sram_sz, false); if (!at91_suspend_sram_fn) { pr_warn("SRAM: Could not map\n"); - return; + goto out_put_device; }
/* Copy the pm suspend handler to SRAM */ at91_suspend_sram_fn = fncpy(at91_suspend_sram_fn, &at91_pm_suspend_in_sram, at91_pm_suspend_in_sram_sz); + return; + +out_put_device: + put_device(&pdev->dev); + return; }
static bool __init at91_is_pm_mode_active(int pm_mode)
 
            From: Chen-Yu Tsai wens@csie.org
[ Upstream commit 55b271af765b0e03d1ff29502f81644b1a3c87fd ]
The device tree currently only assigns the a supply for the first CPU core, when in reality the regulator supply is shared by all four cores. This might cause an issue if the implementation does not realize the sharing of the supply.
Assign the same regulator supply to the remaining CPU cores to address this.
Fixes: 6eeb4180d4b9 ("ARM: dts: sunxi: h3-h5: Add Bananapi M2+ v1.2 device trees") Signed-off-by: Chen-Yu Tsai wens@csie.org Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://lore.kernel.org/r/20200717160053.31191-3-wens@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi b/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi index 22466afd38a3a..a628b5ee72b65 100644 --- a/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi +++ b/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi @@ -28,3 +28,15 @@ reg_vdd_cpux: vdd-cpux { &cpu0 { cpu-supply = <®_vdd_cpux>; }; + +&cpu1 { + cpu-supply = <®_vdd_cpux>; +}; + +&cpu2 { + cpu-supply = <®_vdd_cpux>; +}; + +&cpu3 { + cpu-supply = <®_vdd_cpux>; +};
 
            From: Chen-Yu Tsai wens@csie.org
[ Upstream commit e4dae01bf08b754de79072441c357737220b873f ]
The Bananapi M2+ uses a GPIO line to change the effective resistance of the CPU supply regulator's feedback resistor network. The voltages described in the device tree were given directly by the vendor. This turns out to be slightly off compared to the real values.
The updated voltages are based on calculations of the feedback resistor network, and verified down to three decimal places with a multi-meter.
Fixes: 6eeb4180d4b9 ("ARM: dts: sunxi: h3-h5: Add Bananapi M2+ v1.2 device trees") Signed-off-by: Chen-Yu Tsai wens@csie.org Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://lore.kernel.org/r/20200717160053.31191-4-wens@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi b/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi index a628b5ee72b65..235994a4a2ebb 100644 --- a/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi +++ b/arch/arm/boot/dts/sunxi-bananapi-m2-plus-v1.2.dtsi @@ -16,12 +16,12 @@ reg_vdd_cpux: vdd-cpux { regulator-type = "voltage"; regulator-boot-on; regulator-always-on; - regulator-min-microvolt = <1100000>; - regulator-max-microvolt = <1300000>; + regulator-min-microvolt = <1108475>; + regulator-max-microvolt = <1308475>; regulator-ramp-delay = <50>; /* 4ms */ gpios = <&r_pio 0 1 GPIO_ACTIVE_HIGH>; /* PL1 */ gpios-states = <0x1>; - states = <1100000 0>, <1300000 1>; + states = <1108475 0>, <1308475 1>; }; };
 
            From: Dilip Kota eswara.kota@linux.intel.com
[ Upstream commit 661ccf2b3f1360be50242726f7c26ced6a9e7d52 ]
In full duplex mode, rx overflow error is observed. To overcome the error, wait until the complete data got received and proceed further.
Fixes: 17f84b793c01 ("spi: lantiq-ssc: add support for Lantiq SSC SPI controller") Signed-off-by: Dilip Kota eswara.kota@linux.intel.com Link: https://lore.kernel.org/r/efb650b0faa49a00788c4e0ca8ef7196bdba851d.159495701... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-lantiq-ssc.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/spi/spi-lantiq-ssc.c b/drivers/spi/spi-lantiq-ssc.c index 9dfe8b04e6880..55394bdbc5a30 100644 --- a/drivers/spi/spi-lantiq-ssc.c +++ b/drivers/spi/spi-lantiq-ssc.c @@ -184,6 +184,7 @@ struct lantiq_ssc_spi { unsigned int tx_fifo_size; unsigned int rx_fifo_size; unsigned int base_cs; + unsigned int fdx_tx_level; };
static u32 lantiq_ssc_readl(const struct lantiq_ssc_spi *spi, u32 reg) @@ -481,6 +482,7 @@ static void tx_fifo_write(struct lantiq_ssc_spi *spi) u32 data; unsigned int tx_free = tx_fifo_free(spi);
+ spi->fdx_tx_level = 0; while (spi->tx_todo && tx_free) { switch (spi->bits_per_word) { case 2 ... 8: @@ -509,6 +511,7 @@ static void tx_fifo_write(struct lantiq_ssc_spi *spi)
lantiq_ssc_writel(spi, data, LTQ_SPI_TB); tx_free--; + spi->fdx_tx_level++; } }
@@ -520,6 +523,13 @@ static void rx_fifo_read_full_duplex(struct lantiq_ssc_spi *spi) u32 data; unsigned int rx_fill = rx_fifo_level(spi);
+ /* + * Wait until all expected data to be shifted in. + * Otherwise, rx overrun may occur. + */ + while (rx_fill != spi->fdx_tx_level) + rx_fill = rx_fifo_level(spi); + while (rx_fill) { data = lantiq_ssc_readl(spi, LTQ_SPI_RB);
 
            From: Tyler Hicks tyhicks@linux.microsoft.com
[ Upstream commit 7f3d176f5f7e3f0477bf82df0f600fcddcdcc4e4 ]
Require that the TCG_PCR_EVENT2.digests.count value strictly matches the value of TCG_EfiSpecIdEvent.numberOfAlgorithms in the event field of the TCG_PCClientPCREvent event log header. Also require that TCG_EfiSpecIdEvent.numberOfAlgorithms is non-zero.
The TCG PC Client Platform Firmware Profile Specification section 9.1 (Family "2.0", Level 00 Revision 1.04) states:
For each Hash algorithm enumerated in the TCG_PCClientPCREvent entry, there SHALL be a corresponding digest in all TCG_PCR_EVENT2 structures. Note: This includes EV_NO_ACTION events which do not extend the PCR.
Section 9.4.5.1 provides this description of TCG_EfiSpecIdEvent.numberOfAlgorithms:
The number of Hash algorithms in the digestSizes field. This field MUST be set to a value of 0x01 or greater.
Enforce these restrictions, as required by the above specification, in order to better identify and ignore invalid sequences of bytes at the end of an otherwise valid TPM2 event log. Firmware doesn't always have the means necessary to inform the kernel of the actual event log size so the kernel's event log parsing code should be stringent when parsing the event log for resiliency against firmware bugs. This is true, for example, when firmware passes the event log to the kernel via a reserved memory region described in device tree.
POWER and some ARM systems use the "linux,sml-base" and "linux,sml-size" device tree properties to describe the memory region used to pass the event log from firmware to the kernel. Unfortunately, the "linux,sml-size" property describes the size of the entire reserved memory region rather than the size of the event long within the memory region and the event log format does not include information describing the size of the event log.
tpm_read_log_of(), in drivers/char/tpm/eventlog/of.c, is where the "linux,sml-size" property is used. At the end of that function, log->bios_event_log_end is pointing at the end of the reserved memory region. That's typically 0x10000 bytes offset from "linux,sml-base", depending on what's defined in the device tree source.
The firmware event log only fills a portion of those 0x10000 bytes and the rest of the memory region should be zeroed out by firmware. Even in the case of a properly zeroed bytes in the remainder of the memory region, the only thing allowing the kernel's event log parser to detect the end of the event log is the following conditional in __calc_tpm2_event_size():
if (event_type == 0 && event_field->event_size == 0) size = 0;
If that wasn't there, __calc_tpm2_event_size() would think that a 16 byte sequence of zeroes, following an otherwise valid event log, was a valid event.
However, problems can occur if a single bit is set in the offset corresponding to either the TCG_PCR_EVENT2.eventType or TCG_PCR_EVENT2.eventSize fields, after the last valid event log entry. This could confuse the parser into thinking that an additional entry is present in the event log and exposing this invalid entry to userspace in the /sys/kernel/security/tpm0/binary_bios_measurements file. Such problems have been seen if firmware does not fully zero the memory region upon a warm reboot.
This patch significantly raises the bar on how difficult it is for stale/invalid memory to confuse the kernel's event log parser but there's still, ultimately, a reliance on firmware to properly initialize the remainder of the memory region reserved for the event log as the parser cannot be expected to detect a stale but otherwise properly formatted firmware event log entry.
Fixes: fd5c78694f3f ("tpm: fix handling of the TPM 2.0 event logs") Signed-off-by: Tyler Hicks tyhicks@linux.microsoft.com Reviewed-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/tpm_eventlog.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h index eccfd3a4e4c85..f3caeeb7a0d03 100644 --- a/include/linux/tpm_eventlog.h +++ b/include/linux/tpm_eventlog.h @@ -211,9 +211,16 @@ static inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
efispecid = (struct tcg_efi_specid_event_head *)event_header->event;
- /* Check if event is malformed. */ + /* + * Perform validation of the event in order to identify malformed + * events. This function may be asked to parse arbitrary byte sequences + * immediately following a valid event log. The caller expects this + * function to recognize that the byte sequence is not a valid event + * and to return an event size of 0. + */ if (memcmp(efispecid->signature, TCG_SPECID_SIG, - sizeof(TCG_SPECID_SIG)) || count > efispecid->num_algs) { + sizeof(TCG_SPECID_SIG)) || + !efispecid->num_algs || count != efispecid->num_algs) { size = 0; goto out; }
 
            From: Gregory Herrero gregory.herrero@oracle.com
[ Upstream commit ea0eada45632f4807b2f49de951072283e2d781c ]
Currently, if a section has a relocation to '_mcount' symbol, a new __mcount_loc entry will be added whatever the relocation type is. This is problematic when a relocation to '_mcount' is in the middle of a section and is not a call for ftrace use.
Such relocation could be generated with below code for example: bool is_mcount(unsigned long addr) { return (target == (unsigned long) &_mcount); }
With this snippet of code, ftrace will try to patch the mcount location generated by this code on module load and fail with:
Call trace: ftrace_bug+0xa0/0x28c ftrace_process_locs+0x2f4/0x430 ftrace_module_init+0x30/0x38 load_module+0x14f0/0x1e78 __do_sys_finit_module+0x100/0x11c __arm64_sys_finit_module+0x28/0x34 el0_svc_common+0x88/0x194 el0_svc_handler+0x38/0x8c el0_svc+0x8/0xc ---[ end trace d828d06b36ad9d59 ]--- ftrace failed to modify [<ffffa2dbf3a3a41c>] 0xffffa2dbf3a3a41c actual: 66:a9:3c:90 Initializing ftrace call sites ftrace record flags: 2000000 (0) expected tramp: ffffa2dc6cf66724
So Limit the relocation type to R_AARCH64_CALL26 as in perl version of recordmcount.
Fixes: af64d2aa872a ("ftrace: Add arm64 support to recordmcount") Signed-off-by: Gregory Herrero gregory.herrero@oracle.com Acked-by: Steven Rostedt (VMware) rostedt@goodmis.org Link: https://lore.kernel.org/r/20200717143338.19302-1-gregory.herrero@oracle.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/recordmcount.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c index 7225107a9aafe..e59022b3f1254 100644 --- a/scripts/recordmcount.c +++ b/scripts/recordmcount.c @@ -434,6 +434,11 @@ static int arm_is_fake_mcount(Elf32_Rel const *rp) return 1; }
+static int arm64_is_fake_mcount(Elf64_Rel const *rp) +{ + return ELF64_R_TYPE(w(rp->r_info)) != R_AARCH64_CALL26; +} + /* 64-bit EM_MIPS has weird ELF64_Rela.r_info. * http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.p... * We interpret Table 29 Relocation Operation (Elf64_Rel, Elf64_Rela) [p.40] @@ -547,6 +552,7 @@ static int do_file(char const *const fname) make_nop = make_nop_arm64; rel_type_nop = R_AARCH64_NONE; ideal_nop = ideal_nop4_arm64; + is_fake_mcount64 = arm64_is_fake_mcount; break; case EM_IA_64: reltype = R_IA64_IMM64; break; case EM_MIPS: /* reltype: e_class */ break;
 
            From: Vladimir Zapolskiy vz@mleia.com
[ Upstream commit 9177514ce34902b3adb2abd490b6ad05d1cfcb43 ]
The change corrects registration and deregistration on error path of a regulator, the problem was manifested by a reported memory leak on deferred probe:
as3722-regulator as3722-regulator: regulator 13 register failed -517
# cat /sys/kernel/debug/kmemleak unreferenced object 0xecc43740 (size 64): comm "swapper/0", pid 1, jiffies 4294937640 (age 712.880s) hex dump (first 32 bytes): 72 65 67 75 6c 61 74 6f 72 2e 32 34 00 5a 5a 5a regulator.24.ZZZ 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ backtrace: [<0c4c3d1c>] __kmalloc_track_caller+0x15c/0x2c0 [<40c0ad48>] kvasprintf+0x64/0xd4 [<109abd29>] kvasprintf_const+0x70/0x84 [<c4215946>] kobject_set_name_vargs+0x34/0xa8 [<62282ea2>] dev_set_name+0x40/0x64 [<a39b6757>] regulator_register+0x3a4/0x1344 [<16a9543f>] devm_regulator_register+0x4c/0x84 [<51a4c6a1>] as3722_regulator_probe+0x294/0x754 ...
The memory leak problem was introduced as a side ef another fix in regulator_register() error path, I believe that the proper fix is to decouple device_register() function into its two compounds and initialize a struct device before assigning any values to its fields and then using it before actual registration of a device happens.
This lets to call put_device() safely after initialization, and, since now a release callback is called, kfree(rdev->constraints) shall be removed to exclude a double free condition.
Fixes: a3cde9534ebd ("regulator: core: fix regulator_register() error paths to properly release rdev") Signed-off-by: Vladimir Zapolskiy vz@mleia.com Cc: Wen Yang wenyang@linux.alibaba.com Link: https://lore.kernel.org/r/20200724005013.23278-1-vz@mleia.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/core.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 0011bdc15afbb..a17aebe0aa7a7 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -4994,7 +4994,6 @@ regulator_register(const struct regulator_desc *regulator_desc, struct regulator_dev *rdev; bool dangling_cfg_gpiod = false; bool dangling_of_gpiod = false; - bool reg_device_fail = false; struct device *dev; int ret, i;
@@ -5123,10 +5122,12 @@ regulator_register(const struct regulator_desc *regulator_desc, }
/* register with sysfs */ + device_initialize(&rdev->dev); rdev->dev.class = ®ulator_class; rdev->dev.parent = dev; dev_set_name(&rdev->dev, "regulator.%lu", (unsigned long) atomic_inc_return(®ulator_no)); + dev_set_drvdata(&rdev->dev, rdev);
/* set regulator constraints */ if (init_data) @@ -5177,12 +5178,9 @@ regulator_register(const struct regulator_desc *regulator_desc, !rdev->desc->fixed_uV) rdev->is_switch = true;
- dev_set_drvdata(&rdev->dev, rdev); - ret = device_register(&rdev->dev); - if (ret != 0) { - reg_device_fail = true; + ret = device_add(&rdev->dev); + if (ret != 0) goto unset_supplies; - }
rdev_init_debugfs(rdev);
@@ -5204,17 +5202,15 @@ regulator_register(const struct regulator_desc *regulator_desc, mutex_unlock(®ulator_list_mutex); wash: kfree(rdev->coupling_desc.coupled_rdevs); - kfree(rdev->constraints); mutex_lock(®ulator_list_mutex); regulator_ena_gpio_free(rdev); mutex_unlock(®ulator_list_mutex); + put_device(&rdev->dev); + rdev = NULL; clean: if (dangling_of_gpiod) gpiod_put(config->ena_gpiod); - if (reg_device_fail) - put_device(&rdev->dev); - else - kfree(rdev); + kfree(rdev); kfree(config); rinse: if (dangling_cfg_gpiod)
 
            From: Dmitry Vyukov dvyukov@google.com
[ Upstream commit b36200f543ff07a1cb346aa582349141df2c8068 ]
rings_size() sets sq_offset to the total size of the rings (the returned value which is used for memory allocation). This is wrong: sq array should be located within the rings, not after them. Set sq_offset to where it should be.
Fixes: 75b28affdd6a ("io_uring: allocate the two rings together") Signed-off-by: Dmitry Vyukov dvyukov@google.com Acked-by: Hristo Venev hristo@venev.name Cc: io-uring@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index e0200406765c3..265981ca3109c 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3400,6 +3400,9 @@ static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries, return SIZE_MAX; #endif
+ if (sq_offset) + *sq_offset = off; + sq_array_size = array_size(sizeof(u32), sq_entries); if (sq_array_size == SIZE_MAX) return SIZE_MAX; @@ -3407,9 +3410,6 @@ static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries, if (check_add_overflow(off, sq_array_size, &off)) return SIZE_MAX;
- if (sq_offset) - *sq_offset = off; - return off; }
 
            From: Jon Lin jon.lin@rock-chips.com
[ Upstream commit 4294e4accf8d695ea5605f6b189008b692e3e82c ]
The RXFLR is possible larger than rx_left in Rockchip SPI, fix it.
Fixes: 01b59ce5dac8 ("spi: rockchip: use irq rather than polling") Signed-off-by: Jon Lin jon.lin@rock-chips.com Tested-by: Emil Renner Berthing kernel@esmil.dk Reviewed-by: Heiko Stuebner heiko@sntech.de Reviewed-by: Emil Renner Berthing kernel@esmil.dk Link: https://lore.kernel.org/r/20200723004356.6390-3-jon.lin@rock-chips.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-rockchip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c index 2cc6d9951b52e..008b64f4e031a 100644 --- a/drivers/spi/spi-rockchip.c +++ b/drivers/spi/spi-rockchip.c @@ -286,7 +286,7 @@ static void rockchip_spi_pio_writer(struct rockchip_spi *rs) static void rockchip_spi_pio_reader(struct rockchip_spi *rs) { u32 words = readl_relaxed(rs->regs + ROCKCHIP_SPI_RXFLR); - u32 rx_left = rs->rx_left - words; + u32 rx_left = (rs->rx_left > words) ? rs->rx_left - words : 0;
/* the hardware doesn't allow us to change fifo threshold * level while spi is enabled, so instead make sure to leave
 
            From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 3ad7b4e8f89d6bcc9887ca701cf2745a6aedb1a0 ]
if of_find_device_by_node() succeed, socfpga_setup_ocram_self_refresh doesn't have a corresponding put_device(). Thus add a jump target to fix the exception handling for this function implementation.
Fixes: 44fd8c7d4005 ("ARM: socfpga: support suspend to ram") Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Dinh Nguyen dinguyen@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-socfpga/pm.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-socfpga/pm.c b/arch/arm/mach-socfpga/pm.c index 6ed887cf8dc9a..365c0428b21b6 100644 --- a/arch/arm/mach-socfpga/pm.c +++ b/arch/arm/mach-socfpga/pm.c @@ -49,14 +49,14 @@ static int socfpga_setup_ocram_self_refresh(void) if (!ocram_pool) { pr_warn("%s: ocram pool unavailable!\n", __func__); ret = -ENODEV; - goto put_node; + goto put_device; }
ocram_base = gen_pool_alloc(ocram_pool, socfpga_sdram_self_refresh_sz); if (!ocram_base) { pr_warn("%s: unable to alloc ocram!\n", __func__); ret = -ENOMEM; - goto put_node; + goto put_device; }
ocram_pbase = gen_pool_virt_to_phys(ocram_pool, ocram_base); @@ -67,7 +67,7 @@ static int socfpga_setup_ocram_self_refresh(void) if (!suspend_ocram_base) { pr_warn("%s: __arm_ioremap_exec failed!\n", __func__); ret = -ENOMEM; - goto put_node; + goto put_device; }
/* Copy the code that puts DDR in self refresh to ocram */ @@ -81,6 +81,8 @@ static int socfpga_setup_ocram_self_refresh(void) if (!socfpga_sdram_self_refresh_in_ocram) ret = -EFAULT;
+put_device: + put_device(&pdev->dev); put_node: of_node_put(np);
 
            From: Chengming Zhou zhouchengming@bytedance.com
[ Upstream commit d9012a59db54442d5b2fcfdfcded35cf566397d3 ]
We shouldn't skip iocg when its abs_vdebt is not zero.
Fixes: 0b80f9866e6b ("iocost: protect iocg->abs_vdebt with iocg->waitq.lock") Signed-off-by: Chengming Zhou zhouchengming@bytedance.com Acked-by: Tejun Heo tj@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-iocost.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 4d2bda812d9b4..dcc6685d5becc 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -1377,7 +1377,7 @@ static void ioc_timer_fn(struct timer_list *timer) * should have woken up in the last period and expire idle iocgs. */ list_for_each_entry_safe(iocg, tiocg, &ioc->active_iocgs, active_list) { - if (!waitqueue_active(&iocg->waitq) && iocg->abs_vdebt && + if (!waitqueue_active(&iocg->waitq) && !iocg->abs_vdebt && !iocg_is_idle(iocg)) continue;
 
            From: Tiezhu Yang yangtiezhu@loongson.cn
[ Upstream commit 4b127a14cb1385dd355c7673d975258d5d668922 ]
When call function devm_ioremap_resource(), we should use IS_ERR() to check the return value and return PTR_ERR() if failed.
Fixes: 9f1463b86c13 ("irqchip/ti-sci-inta: Add support for Interrupt Aggregator driver") Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Signed-off-by: Marc Zyngier maz@kernel.org Reviewed-by: Grygorii Strashko grygorii.strashko@ti.com Link: https://lore.kernel.org/r/1591437017-5295-2-git-send-email-yangtiezhu@loongs... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-ti-sci-inta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-ti-sci-inta.c b/drivers/irqchip/irq-ti-sci-inta.c index fa7488863bd0a..0a35499c46728 100644 --- a/drivers/irqchip/irq-ti-sci-inta.c +++ b/drivers/irqchip/irq-ti-sci-inta.c @@ -571,7 +571,7 @@ static int ti_sci_inta_irq_domain_probe(struct platform_device *pdev) res = platform_get_resource(pdev, IORESOURCE_MEM, 0); inta->base = devm_ioremap_resource(dev, res); if (IS_ERR(inta->base)) - return -ENODEV; + return PTR_ERR(inta->base);
domain = irq_domain_add_linear(dev_of_node(dev), ti_sci_get_num_resources(inta->vint),
 
            From: Kees Cook keescook@chromium.org
[ Upstream commit 47e33c05f9f07cac3de833e531bcac9ae052c7ca ]
When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced it had the wrong direction flag set. While this isn't a big deal as nothing currently enforces these bits in the kernel, it should be defined correctly. Fix the define and provide support for the old command until it is no longer needed for backward compatibility.
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace") Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/linux/seccomp.h | 3 ++- kernel/seccomp.c | 9 +++++++++ tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- 3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h index 90734aa5aa363..b5f901af79f0b 100644 --- a/include/uapi/linux/seccomp.h +++ b/include/uapi/linux/seccomp.h @@ -93,5 +93,6 @@ struct seccomp_notif_resp { #define SECCOMP_IOCTL_NOTIF_RECV SECCOMP_IOWR(0, struct seccomp_notif) #define SECCOMP_IOCTL_NOTIF_SEND SECCOMP_IOWR(1, \ struct seccomp_notif_resp) -#define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOR(2, __u64) +#define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOW(2, __u64) + #endif /* _UAPI_LINUX_SECCOMP_H */ diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 2c697ce7be21f..e0fd972356539 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -42,6 +42,14 @@ #include <linux/uaccess.h> #include <linux/anon_inodes.h>
+/* + * When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced, it had the + * wrong direction flag in the ioctl number. This is the broken one, + * which the kernel needs to keep supporting until all userspaces stop + * using the wrong command number. + */ +#define SECCOMP_IOCTL_NOTIF_ID_VALID_WRONG_DIR SECCOMP_IOR(2, __u64) + enum notify_state { SECCOMP_NOTIFY_INIT, SECCOMP_NOTIFY_SENT, @@ -1168,6 +1176,7 @@ static long seccomp_notify_ioctl(struct file *file, unsigned int cmd, return seccomp_notify_recv(filter, buf); case SECCOMP_IOCTL_NOTIF_SEND: return seccomp_notify_send(filter, buf); + case SECCOMP_IOCTL_NOTIF_ID_VALID_WRONG_DIR: case SECCOMP_IOCTL_NOTIF_ID_VALID: return seccomp_notify_id_valid(filter, buf); default: diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index 96bbda4f10fc6..19c7351eeb74b 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -177,7 +177,7 @@ struct seccomp_metadata { #define SECCOMP_IOCTL_NOTIF_RECV SECCOMP_IOWR(0, struct seccomp_notif) #define SECCOMP_IOCTL_NOTIF_SEND SECCOMP_IOWR(1, \ struct seccomp_notif_resp) -#define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOR(2, __u64) +#define SECCOMP_IOCTL_NOTIF_ID_VALID SECCOMP_IOW(2, __u64)
struct seccomp_notif { __u64 id;
 
            From: Colin Ian King colin.king@canonical.com
[ Upstream commit 9a5a85972c073f720d81a7ebd08bfe278e6e16db ]
Pointer mddev is being dereferenced with a test_bit call before mddev is being null checked, this may cause a null pointer dereference. Fix this by moving the null pointer checks to sanity check mddev before it is dereferenced.
Addresses-Coverity: ("Dereference before null check") Fixes: 62f7b1989c02 ("md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone") Signed-off-by: Colin Ian King colin.king@canonical.com Reviewed-by: Guilherme G. Piccoli gpiccoli@canonical.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/md.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 5a378a453a2d4..acef01e519d06 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -376,17 +376,18 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio) struct mddev *mddev = q->queuedata; unsigned int sectors;
- if (unlikely(test_bit(MD_BROKEN, &mddev->flags)) && (rw == WRITE)) { + if (mddev == NULL || mddev->pers == NULL) { bio_io_error(bio); return BLK_QC_T_NONE; }
- blk_queue_split(q, &bio); - - if (mddev == NULL || mddev->pers == NULL) { + if (unlikely(test_bit(MD_BROKEN, &mddev->flags)) && (rw == WRITE)) { bio_io_error(bio); return BLK_QC_T_NONE; } + + blk_queue_split(q, &bio); + if (mddev->ro == 1 && unlikely(rw == WRITE)) { if (bio_sectors(bio) != 0) bio->bi_status = BLK_STS_IOERR;
 
            From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 2875b0aecabe2f081a8432e2bc85b85df0529490 ]
commit fe35ec58f0d3 ("block: update hctx map when use multiple maps") exposed an issue where we may hang trying to wait for queue freeze during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple queue maps (which we have now for default/read/poll) is attempting to freeze the queue. However we never started queue freeze when starting the reset, which means that we have inflight pending requests that entered the queue that we will not complete once the queue is quiesced.
So start a freeze before we quiesce the queue, and unfreeze the queue after we successfully connected the I/O queues (and make sure to call blk_mq_update_nr_hw_queues only after we are sure that the queue was already frozen).
This follows to how the pci driver handles resets.
Fixes: fe35ec58f0d3 ("block: update hctx map when use multiple maps") Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 53e113a18a549..0166ff0e4738e 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1684,15 +1684,20 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new) ret = PTR_ERR(ctrl->connect_q); goto out_free_tag_set; } - } else { - blk_mq_update_nr_hw_queues(ctrl->tagset, - ctrl->queue_count - 1); }
ret = nvme_tcp_start_io_queues(ctrl); if (ret) goto out_cleanup_connect_q;
+ if (!new) { + nvme_start_queues(ctrl); + nvme_wait_freeze(ctrl); + blk_mq_update_nr_hw_queues(ctrl->tagset, + ctrl->queue_count - 1); + nvme_unfreeze(ctrl); + } + return 0;
out_cleanup_connect_q: @@ -1797,6 +1802,7 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl, { if (ctrl->queue_count <= 1) return; + nvme_start_freeze(ctrl); nvme_stop_queues(ctrl); nvme_tcp_stop_io_queues(ctrl); if (ctrl->tagset) {
 
            From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 9f98772ba307dd89a3d17dc2589f213d3972fc64 ]
commit fe35ec58f0d3 ("block: update hctx map when use multiple maps") exposed an issue where we may hang trying to wait for queue freeze during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple queue maps (which we have now for default/read/poll) is attempting to freeze the queue. However we never started queue freeze when starting the reset, which means that we have inflight pending requests that entered the queue that we will not complete once the queue is quiesced.
So start a freeze before we quiesce the queue, and unfreeze the queue after we successfully connected the I/O queues (and make sure to call blk_mq_update_nr_hw_queues only after we are sure that the queue was already frozen).
This follows to how the pci driver handles resets.
Fixes: fe35ec58f0d3 ("block: update hctx map when use multiple maps") Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/rdma.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index cd0d499781908..d0336545e1fe0 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -890,15 +890,20 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new) ret = PTR_ERR(ctrl->ctrl.connect_q); goto out_free_tag_set; } - } else { - blk_mq_update_nr_hw_queues(&ctrl->tag_set, - ctrl->ctrl.queue_count - 1); }
ret = nvme_rdma_start_io_queues(ctrl); if (ret) goto out_cleanup_connect_q;
+ if (!new) { + nvme_start_queues(&ctrl->ctrl); + nvme_wait_freeze(&ctrl->ctrl); + blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset, + ctrl->ctrl.queue_count - 1); + nvme_unfreeze(&ctrl->ctrl); + } + return 0;
out_cleanup_connect_q: @@ -931,6 +936,7 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl, bool remove) { if (ctrl->ctrl.queue_count > 1) { + nvme_start_freeze(&ctrl->ctrl); nvme_stop_queues(&ctrl->ctrl); nvme_rdma_stop_io_queues(ctrl); if (ctrl->ctrl.tagset) {
 
            From: Martin Wilck mwilck@suse.com
[ Upstream commit 3f6e3246db0e6f92e784965d9d0edb8abe6c6b74 ]
Handle the special case where we have exactly one optimized path, which we should keep using in this case.
Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy") Signed off-by: Martin Wilck mwilck@suse.com Signed-off-by: Hannes Reinecke hare@suse.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/multipath.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 5433aa2f76017..38d25d7c6bca3 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -249,6 +249,12 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head, fallback = ns; }
+ /* No optimized path found, re-check the current path */ + if (!nvme_path_is_disabled(old) && + old->ana_state == NVME_ANA_OPTIMIZED) { + found = old; + goto out; + } if (!fallback) return NULL; found = fallback;
 
            From: Hannes Reinecke hare@suse.de
[ Upstream commit fbd6a42d8932e172921c7de10468a2e12c34846b ]
When nvme_round_robin_path() finds a valid namespace we should be using it; falling back to __nvme_find_path() for non-optimized paths will cause the result from nvme_round_robin_path() to be ignored for non-optimized paths.
Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy") Signed-off-by: Martin Wilck mwilck@suse.com Signed-off-by: Hannes Reinecke hare@suse.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/multipath.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 38d25d7c6bca3..484aad0d0c9c6 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -275,10 +275,13 @@ inline struct nvme_ns *nvme_find_path(struct nvme_ns_head *head) struct nvme_ns *ns;
ns = srcu_dereference(head->current_path[node], &head->srcu); - if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_RR && ns) - ns = nvme_round_robin_path(head, node, ns); - if (unlikely(!ns || !nvme_path_is_optimized(ns))) - ns = __nvme_find_path(head, node); + if (unlikely(!ns)) + return __nvme_find_path(head, node); + + if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_RR) + return nvme_round_robin_path(head, node, ns); + if (unlikely(!nvme_path_is_optimized(ns))) + return __nvme_find_path(head, node); return ns; }
linux-stable-mirror@lists.linaro.org
