This is the start of the stable review cycle for the 4.14.153 release. There are 62 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun 10 Nov 2019 05:42:11 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.153-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.153-rc1
Desnes A. Nunes do Rosario desnesn@linux.ibm.com selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com selftests/powerpc: Add test case for tlbie vs mtpidr ordering issue
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com powerpc/mm: Fixup tlbie vs mtpidr/mtlpidr ordering issue on POWER9
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flag
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions
Aneesh Kumar K.V aneesh.kumar@linux.vnet.ibm.com powerpc/mm: Fixup tlbie vs store ordering issue on POWER9
Fabrice Gasnier fabrice.gasnier@st.com iio: adc: stm32-adc: fix a race when using several adcs with dma and irq
Fabrice Gasnier fabrice.gasnier@st.com iio: adc: stm32-adc: move registers definitions
Jan Kiszka jan.kiszka@siemens.com platform/x86: pmc_atom: Add Siemens SIMATIC IPC227E to critclk_systems DMI table
Seth Forshee seth.forshee@canonical.com kbuild: add -fcf-protection=none when using retpoline flags
Masahiro Yamada yamada.masahiro@socionext.com kbuild: use -fmacro-prefix-map to make __FILE__ a relative path
Peter Zijlstra peterz@infradead.org sched/wake_q: Fix wakeup ordering for wake_q
Jeffrey Hugo jeffrey.l.hugo@gmail.com dmaengine: qcom: bam_dma: Fix resource leak
Eric Dumazet edumazet@google.com net/flow_dissector: switch to siphash
Eric Dumazet edumazet@google.com inet: stop leaking jiffies on the wire
Xin Long lucien.xin@gmail.com erspan: fix the tun_info options_len check for erspan
Xin Long lucien.xin@gmail.com vxlan: check tun_info options_len properly
Eric Dumazet edumazet@google.com net: use skb_queue_empty_lockless() in busy poll contexts
Eric Dumazet edumazet@google.com net: use skb_queue_empty_lockless() in poll() handlers
Eric Dumazet edumazet@google.com udp: use skb_queue_empty_lockless()
Eric Dumazet edumazet@google.com net: add skb_queue_empty_lockless()
Doug Berger opendmb@gmail.com net: bcmgenet: reset 40nm EPHY on energy detect
Vivien Didelot vivien.didelot@gmail.com net: dsa: fix switch tree list
Kazutoshi Noguchi noguchi.kazutosi@gmail.com r8152: add device id for Lenovo ThinkPad USB-C Dock Gen 2
Andrew Lunn andrew@lunn.ch net: usb: lan78xx: Connect PHY before registering MAC
Florian Fainelli f.fainelli@gmail.com net: dsa: b53: Do not clear existing mirrored port mask
Maxim Mikityanskiy maximmi@mellanox.com net/mlx5e: Fix handling of compressed CQEs in case of low NAPI budget
Eric Dumazet edumazet@google.com net: add READ_ONCE() annotation in __skb_wait_for_more_packets()
Eric Dumazet edumazet@google.com udp: fix data-race in udp_set_dev_scratch()
Wei Wang weiwan@google.com selftests: net: reuseport_dualstack: fix uninitalized parameter
zhanglin zhang.lin16@zte.com.cn net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol()
Eran Ben Elisha eranbe@mellanox.com net/mlx4_core: Dynamically set guaranteed amount of counters per VF
Jiangfeng Xiao xiaojiangfeng@huawei.com net: hisilicon: Fix ping latency when deal with high throughput
Tejun Heo tj@kernel.org net: fix sk_page_frag() recursion from memory reclaim
Benjamin Herrenschmidt benh@kernel.crashing.org net: ethernet: ftgmac100: Fix DMA coherency issue with SW checksum
Florian Fainelli f.fainelli@gmail.com net: dsa: bcm_sf2: Fix IMP setup for port different than 8
Eric Dumazet edumazet@google.com net: annotate lockless accesses to sk->sk_napi_id
Eric Dumazet edumazet@google.com net: annotate accesses to sk->sk_incoming_cpu
Eric Dumazet edumazet@google.com dccp: do not leak jiffies on the wire
Vishal Kulkarni vishal@chelsio.com cxgb4: fix panic when attaching to ULD fail
Josef Bacik josef@toxicpanda.com nbd: handle racing with error'ed out commands
Dave Wysochanski dwysocha@redhat.com cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
Alain Volmat alain.volmat@st.com i2c: stm32f7: remove warning when compiling with W=1
Jonas Gorski jonas.gorski@gmail.com MIPS: bmips: mark exception vectors as char arrays
Navid Emamdoost navid.emamdoost@gmail.com of: unittest: fix memory leak in unittest_data_add
afzal mohammed afzal.mohd.ma@gmail.com ARM: 8926/1: v7m: remove register save to stack before svc
Bodo Stroesser bstroesser@ts.fujitsu.com scsi: target: core: Do not overwrite CDB byte 1
Peter Ujfalusi peter.ujfalusi@ti.com ARM: davinci: dm365: Fix McBSP dma_slave_map entry
Yunfeng Ye yeyunfeng@huawei.com perf kmem: Fix memory leak in compact_gfp_flags()
Yunfeng Ye yeyunfeng@huawei.com perf c2c: Fix memory leak in build_cl_output()
Anson Huang Anson.Huang@nxp.com ARM: dts: imx7s: Correct GPT's ipg clock source
Thomas Bogendoerfer tbogendoerfer@suse.de scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
Thomas Bogendoerfer tbogendoerfer@suse.de scsi: sni_53c710: fix compilation error
Hannes Reinecke hare@suse.com scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
Russell King rmk+kernel@armlinux.org.uk ARM: mm: fix alignment handler faults under memory pressure
Dan Carpenter dan.carpenter@oracle.com pinctrl: ns2: Fix off by one bugs in ns2_pinmux_enable()
Adam Ford aford173@gmail.com ARM: dts: logicpd-torpedo-som: Remove twl_keypad
Robin Murphy robin.murphy@arm.com ASoc: rockchip: i2s: Fix RPM imbalance
Stuart Henderson stuarth@opensource.cirrus.com ASoC: wm_adsp: Don't generate kcontrols without READ flags
Yizhuo yzhai003@ucr.edu regulator: pfuze100-regulator: Variable "val" in pfuze100_regulator_probe() could be uninitialized
Axel Lin axel.lin@ingics.com regulator: ti-abb: Fix timeout in ti_abb_wait_txdone/ti_abb_clear_all_txdone
Rayagonda Kokatanur rayagonda.kokatanur@broadcom.com arm64: dts: Fix gpio to pinmux mapping
-------------
Diffstat:
Makefile | 13 +- arch/arm/boot/dts/imx7s.dtsi | 8 +- arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 4 + arch/arm/mach-davinci/dm365.c | 4 +- arch/arm/mm/alignment.c | 44 +- arch/arm/mm/proc-v7m.S | 1 - .../dts/broadcom/stingray/stingray-pinctrl.dtsi | 5 +- .../arm64/boot/dts/broadcom/stingray/stingray.dtsi | 3 +- arch/mips/bcm63xx/prom.c | 2 +- arch/mips/include/asm/bmips.h | 10 +- arch/mips/kernel/smp-bmips.c | 8 +- arch/powerpc/include/asm/cputable.h | 5 +- arch/powerpc/kernel/dt_cpu_ftrs.c | 32 +- arch/powerpc/kvm/book3s_64_mmu_radix.c | 3 + arch/powerpc/kvm/book3s_hv_rm_mmu.c | 33 + arch/powerpc/mm/hash_native_64.c | 38 +- arch/powerpc/mm/pgtable_64.c | 1 + arch/powerpc/mm/tlb-radix.c | 94 ++- drivers/block/nbd.c | 6 + drivers/dma/qcom/bam_dma.c | 14 + drivers/i2c/busses/i2c-stm32f7.c | 2 +- drivers/iio/adc/stm32-adc-core.c | 70 +- drivers/iio/adc/stm32-adc-core.h | 135 ++++ drivers/iio/adc/stm32-adc.c | 107 --- drivers/isdn/capi/capi.c | 2 +- drivers/net/dsa/b53/b53_common.c | 1 - drivers/net/dsa/bcm_sf2.c | 36 +- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 9 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c | 29 +- drivers/net/ethernet/faraday/ftgmac100.c | 25 +- drivers/net/ethernet/hisilicon/hip04_eth.c | 15 +- .../net/ethernet/mellanox/mlx4/resource_tracker.c | 42 +- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 5 +- drivers/net/usb/cdc_ether.c | 7 + drivers/net/usb/lan78xx.c | 12 +- drivers/net/usb/r8152.c | 1 + drivers/net/vxlan.c | 5 +- drivers/of/unittest.c | 1 + drivers/pinctrl/bcm/pinctrl-ns2-mux.c | 4 +- drivers/platform/x86/pmc_atom.c | 7 + drivers/regulator/pfuze100-regulator.c | 8 +- drivers/regulator/ti-abb-regulator.c | 26 +- drivers/scsi/Kconfig | 2 +- drivers/scsi/device_handler/scsi_dh_alua.c | 21 +- drivers/scsi/sni_53c710.c | 4 +- drivers/target/target_core_device.c | 21 - fs/cifs/cifsglob.h | 5 + fs/cifs/cifsproto.h | 1 + fs/cifs/file.c | 23 +- fs/cifs/smb2file.c | 2 +- include/linux/gfp.h | 23 + include/linux/skbuff.h | 36 +- include/net/busy_poll.h | 6 +- include/net/flow_dissector.h | 3 +- include/net/fq.h | 2 +- include/net/fq_impl.h | 4 +- include/net/sock.h | 15 +- kernel/sched/core.c | 7 +- net/atm/common.c | 2 +- net/bluetooth/af_bluetooth.c | 4 +- net/caif/caif_socket.c | 2 +- net/core/datagram.c | 8 +- net/core/ethtool.c | 4 +- net/core/flow_dissector.c | 48 +- net/core/sock.c | 6 +- net/dccp/ipv4.c | 4 +- net/dsa/dsa2.c | 2 +- net/ipv4/datagram.c | 2 +- net/ipv4/inet_hashtables.c | 2 +- net/ipv4/ip_gre.c | 3 + net/ipv4/tcp.c | 4 +- net/ipv4/tcp_ipv4.c | 4 +- net/ipv4/udp.c | 29 +- net/ipv6/inet6_hashtables.c | 2 +- net/ipv6/udp.c | 2 +- net/nfc/llcp_sock.c | 4 +- net/phonet/socket.c | 4 +- net/sched/sch_hhf.c | 8 +- net/sched/sch_sfb.c | 13 +- net/sched/sch_sfq.c | 14 +- net/sctp/socket.c | 8 +- net/tipc/socket.c | 4 +- net/unix/af_unix.c | 6 +- net/vmw_vsock/af_vsock.c | 2 +- sound/soc/codecs/wm_adsp.c | 3 +- sound/soc/rockchip/rockchip_i2s.c | 2 +- tools/perf/builtin-c2c.c | 14 +- tools/perf/builtin-kmem.c | 1 + tools/testing/selftests/net/reuseport_dualstack.c | 3 +- tools/testing/selftests/powerpc/mm/Makefile | 2 + tools/testing/selftests/powerpc/mm/tlbie_test.c | 734 +++++++++++++++++++++ 91 files changed, 1557 insertions(+), 445 deletions(-)
From: Rayagonda Kokatanur rayagonda.kokatanur@broadcom.com
[ Upstream commit 965f6603e3335a953f4f876792074cb36bf65f7f ]
There are total of 151 non-secure gpio (0-150) and four pins of pinmux (91, 92, 93 and 94) are not mapped to any gpio pin, hence update same in DT.
Fixes: 8aa428cc1e2e ("arm64: dts: Add pinctrl DT nodes for Stingray SOC") Signed-off-by: Rayagonda Kokatanur rayagonda.kokatanur@broadcom.com Reviewed-by: Ray Jui ray.jui@broadcom.com Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/broadcom/stingray/stingray-pinctrl.dtsi | 5 +++-- arch/arm64/boot/dts/broadcom/stingray/stingray.dtsi | 3 +-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/boot/dts/broadcom/stingray/stingray-pinctrl.dtsi b/arch/arm64/boot/dts/broadcom/stingray/stingray-pinctrl.dtsi index 15214d05fec1c..8c20d4a0cb4ed 100644 --- a/arch/arm64/boot/dts/broadcom/stingray/stingray-pinctrl.dtsi +++ b/arch/arm64/boot/dts/broadcom/stingray/stingray-pinctrl.dtsi @@ -42,13 +42,14 @@
pinmux: pinmux@0014029c { compatible = "pinctrl-single"; - reg = <0x0014029c 0x250>; + reg = <0x0014029c 0x26c>; #address-cells = <1>; #size-cells = <1>; pinctrl-single,register-width = <32>; pinctrl-single,function-mask = <0xf>; pinctrl-single,gpio-range = < - &range 0 154 MODE_GPIO + &range 0 91 MODE_GPIO + &range 95 60 MODE_GPIO >; range: gpio-range { #pinctrl-single,gpio-range-cells = <3>; diff --git a/arch/arm64/boot/dts/broadcom/stingray/stingray.dtsi b/arch/arm64/boot/dts/broadcom/stingray/stingray.dtsi index 2b76293b51c83..3d2921ef29351 100644 --- a/arch/arm64/boot/dts/broadcom/stingray/stingray.dtsi +++ b/arch/arm64/boot/dts/broadcom/stingray/stingray.dtsi @@ -444,8 +444,7 @@ <&pinmux 108 16 27>, <&pinmux 135 77 6>, <&pinmux 141 67 4>, - <&pinmux 145 149 6>, - <&pinmux 151 91 4>; + <&pinmux 145 149 6>; };
i2c1: i2c@000e0000 {
From: Axel Lin axel.lin@ingics.com
[ Upstream commit f64db548799e0330897c3203680c2ee795ade518 ]
ti_abb_wait_txdone() may return -ETIMEDOUT when ti_abb_check_txdone() returns true in the latest iteration of the while loop because the timeout value is abb->settling_time + 1. Similarly, ti_abb_clear_all_txdone() may return -ETIMEDOUT when ti_abb_check_txdone() returns false in the latest iteration of the while loop. Fix it.
Signed-off-by: Axel Lin axel.lin@ingics.com Acked-by: Nishanth Menon nm@ti.com Link: https://lore.kernel.org/r/20190929095848.21960-1-axel.lin@ingics.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/ti-abb-regulator.c | 26 ++++++++------------------ 1 file changed, 8 insertions(+), 18 deletions(-)
diff --git a/drivers/regulator/ti-abb-regulator.c b/drivers/regulator/ti-abb-regulator.c index d2f9942987535..6d17357b3a248 100644 --- a/drivers/regulator/ti-abb-regulator.c +++ b/drivers/regulator/ti-abb-regulator.c @@ -173,19 +173,14 @@ static int ti_abb_wait_txdone(struct device *dev, struct ti_abb *abb) while (timeout++ <= abb->settling_time) { status = ti_abb_check_txdone(abb); if (status) - break; + return 0;
udelay(1); }
- if (timeout > abb->settling_time) { - dev_warn_ratelimited(dev, - "%s:TRANXDONE timeout(%duS) int=0x%08x\n", - __func__, timeout, readl(abb->int_base)); - return -ETIMEDOUT; - } - - return 0; + dev_warn_ratelimited(dev, "%s:TRANXDONE timeout(%duS) int=0x%08x\n", + __func__, timeout, readl(abb->int_base)); + return -ETIMEDOUT; }
/** @@ -205,19 +200,14 @@ static int ti_abb_clear_all_txdone(struct device *dev, const struct ti_abb *abb)
status = ti_abb_check_txdone(abb); if (!status) - break; + return 0;
udelay(1); }
- if (timeout > abb->settling_time) { - dev_warn_ratelimited(dev, - "%s:TRANXDONE timeout(%duS) int=0x%08x\n", - __func__, timeout, readl(abb->int_base)); - return -ETIMEDOUT; - } - - return 0; + dev_warn_ratelimited(dev, "%s:TRANXDONE timeout(%duS) int=0x%08x\n", + __func__, timeout, readl(abb->int_base)); + return -ETIMEDOUT; }
/**
From: Yizhuo yzhai003@ucr.edu
[ Upstream commit 1252b283141f03c3dffd139292c862cae10e174d ]
In function pfuze100_regulator_probe(), variable "val" could be initialized if regmap_read() fails. However, "val" is used to decide the control flow later in the if statement, which is potentially unsafe.
Signed-off-by: Yizhuo yzhai003@ucr.edu Link: https://lore.kernel.org/r/20190929170957.14775-1-yzhai003@ucr.edu Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/pfuze100-regulator.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/regulator/pfuze100-regulator.c b/drivers/regulator/pfuze100-regulator.c index 659e516455bee..4f205366d8aea 100644 --- a/drivers/regulator/pfuze100-regulator.c +++ b/drivers/regulator/pfuze100-regulator.c @@ -632,7 +632,13 @@ static int pfuze100_regulator_probe(struct i2c_client *client,
/* SW2~SW4 high bit check and modify the voltage value table */ if (i >= sw_check_start && i <= sw_check_end) { - regmap_read(pfuze_chip->regmap, desc->vsel_reg, &val); + ret = regmap_read(pfuze_chip->regmap, + desc->vsel_reg, &val); + if (ret) { + dev_err(&client->dev, "Fails to read from the register.\n"); + return ret; + } + if (val & sw_hi) { if (pfuze_chip->chip_id == PFUZE3000) { desc->volt_table = pfuze3000_sw2hi;
From: Stuart Henderson stuarth@opensource.cirrus.com
[ Upstream commit 3ae7359c0e39f42a96284d6798fc669acff38140 ]
User space always expects to be able to read ALSA controls, so ensure no kcontrols are generated without an appropriate READ flag. In the case of a read of such a control zeros will be returned.
Signed-off-by: Stuart Henderson stuarth@opensource.cirrus.com Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/20191002084240.21589-1-ckeepax@opensource.cirrus.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wm_adsp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c index d632a0511d62a..158ce68bc9bf3 100644 --- a/sound/soc/codecs/wm_adsp.c +++ b/sound/soc/codecs/wm_adsp.c @@ -1169,8 +1169,7 @@ static unsigned int wmfw_convert_flags(unsigned int in, unsigned int len) }
if (in) { - if (in & WMFW_CTL_FLAG_READABLE) - out |= rd; + out |= rd; if (in & WMFW_CTL_FLAG_WRITEABLE) out |= wr; if (in & WMFW_CTL_FLAG_VOLATILE)
From: Robin Murphy robin.murphy@arm.com
[ Upstream commit b1e620e7d32f5aad5353cc3cfc13ed99fea65d3a ]
If rockchip_pcm_platform_register() fails, e.g. upon deferring to wait for an absent DMA channel, we return without disabling RPM, which makes subsequent re-probe attempts scream with errors about the unbalanced enable. Don't do that.
Fixes: ebb75c0bdba2 ("ASoC: rockchip: i2s: Adjust devm usage") Signed-off-by: Robin Murphy robin.murphy@arm.com Link: https://lore.kernel.org/r/bcb12a849a05437fb18372bc7536c649b94bdf07.157002986... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/rockchip/rockchip_i2s.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/rockchip/rockchip_i2s.c b/sound/soc/rockchip/rockchip_i2s.c index 66fc13a2396a0..0e07e3dea7de4 100644 --- a/sound/soc/rockchip/rockchip_i2s.c +++ b/sound/soc/rockchip/rockchip_i2s.c @@ -676,7 +676,7 @@ static int rockchip_i2s_probe(struct platform_device *pdev) ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0); if (ret) { dev_err(&pdev->dev, "Could not register PCM\n"); - return ret; + goto err_suspend; }
return 0;
From: Adam Ford aford173@gmail.com
[ Upstream commit 6b512b0ee091edcb8e46218894e4c917d919d3dc ]
The TWL4030 used on the Logit PD Torpedo SOM does not have the keypad pins routed. This patch disables the twl_keypad driver to remove some splat during boot:
twl4030_keypad 48070000.i2c:twl@48:keypad: missing or malformed property linux,keymap: -22 twl4030_keypad 48070000.i2c:twl@48:keypad: Failed to build keymap twl4030_keypad: probe of 48070000.i2c:twl@48:keypad failed with error -22
Signed-off-by: Adam Ford aford173@gmail.com [tony@atomide.com: removed error time stamps] Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/arm/boot/dts/logicpd-torpedo-som.dtsi b/arch/arm/boot/dts/logicpd-torpedo-som.dtsi index fe4cbdc72359e..7265d7072b5cb 100644 --- a/arch/arm/boot/dts/logicpd-torpedo-som.dtsi +++ b/arch/arm/boot/dts/logicpd-torpedo-som.dtsi @@ -270,3 +270,7 @@ &twl_gpio { ti,use-leds; }; + +&twl_keypad { + status = "disabled"; +};
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 39b65fbb813089e366b376bd8acc300b6fd646dc ]
The pinctrl->functions[] array has pinctrl->num_functions elements and the pinctrl->groups[] array is the same way. These are set in ns2_pinmux_probe(). So the > comparisons should be >= so that we don't read one element beyond the end of the array.
Fixes: b5aa1006e4a9 ("pinctrl: ns2: add pinmux driver support for Broadcom NS2 SoC") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/20190926081426.GB2332@mwanda Acked-by: Scott Branden scott.branden@broadcom.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/bcm/pinctrl-ns2-mux.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pinctrl/bcm/pinctrl-ns2-mux.c b/drivers/pinctrl/bcm/pinctrl-ns2-mux.c index 4b5cf0e0f16e2..951090faa6a91 100644 --- a/drivers/pinctrl/bcm/pinctrl-ns2-mux.c +++ b/drivers/pinctrl/bcm/pinctrl-ns2-mux.c @@ -640,8 +640,8 @@ static int ns2_pinmux_enable(struct pinctrl_dev *pctrl_dev, const struct ns2_pin_function *func; const struct ns2_pin_group *grp;
- if (grp_select > pinctrl->num_groups || - func_select > pinctrl->num_functions) + if (grp_select >= pinctrl->num_groups || + func_select >= pinctrl->num_functions) return -EINVAL;
func = &pinctrl->functions[func_select];
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 67e15fa5b487adb9b78a92789eeff2d6ec8f5cee ]
When the system has high memory pressure, the page containing the instruction may be paged out. Using probe_kernel_address() means that if the page is swapped out, the resulting page fault will not be handled because page faults are disabled by this function.
Use get_user() to read the instruction instead.
Reported-by: Jing Xiangfeng jingxiangfeng@huawei.com Fixes: b255188f90e2 ("ARM: fix scheduling while atomic warning in alignment handling code") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mm/alignment.c | 44 +++++++++++++++++++++++++++++++++-------- 1 file changed, 36 insertions(+), 8 deletions(-)
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c index 2c96190e018bd..96b17a870b91d 100644 --- a/arch/arm/mm/alignment.c +++ b/arch/arm/mm/alignment.c @@ -768,6 +768,36 @@ do_alignment_t32_to_handler(unsigned long *pinstr, struct pt_regs *regs, return NULL; }
+static int alignment_get_arm(struct pt_regs *regs, u32 *ip, unsigned long *inst) +{ + u32 instr = 0; + int fault; + + if (user_mode(regs)) + fault = get_user(instr, ip); + else + fault = probe_kernel_address(ip, instr); + + *inst = __mem_to_opcode_arm(instr); + + return fault; +} + +static int alignment_get_thumb(struct pt_regs *regs, u16 *ip, u16 *inst) +{ + u16 instr = 0; + int fault; + + if (user_mode(regs)) + fault = get_user(instr, ip); + else + fault = probe_kernel_address(ip, instr); + + *inst = __mem_to_opcode_thumb16(instr); + + return fault; +} + static int do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { @@ -775,10 +805,10 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs) unsigned long instr = 0, instrptr; int (*handler)(unsigned long addr, unsigned long instr, struct pt_regs *regs); unsigned int type; - unsigned int fault; u16 tinstr = 0; int isize = 4; int thumb2_32b = 0; + int fault;
if (interrupts_enabled(regs)) local_irq_enable(); @@ -787,15 +817,14 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
if (thumb_mode(regs)) { u16 *ptr = (u16 *)(instrptr & ~1); - fault = probe_kernel_address(ptr, tinstr); - tinstr = __mem_to_opcode_thumb16(tinstr); + + fault = alignment_get_thumb(regs, ptr, &tinstr); if (!fault) { if (cpu_architecture() >= CPU_ARCH_ARMv7 && IS_T32(tinstr)) { /* Thumb-2 32-bit */ - u16 tinst2 = 0; - fault = probe_kernel_address(ptr + 1, tinst2); - tinst2 = __mem_to_opcode_thumb16(tinst2); + u16 tinst2; + fault = alignment_get_thumb(regs, ptr + 1, &tinst2); instr = __opcode_thumb32_compose(tinstr, tinst2); thumb2_32b = 1; } else { @@ -804,8 +833,7 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs) } } } else { - fault = probe_kernel_address((void *)instrptr, instr); - instr = __mem_to_opcode_arm(instr); + fault = alignment_get_arm(regs, (void *)instrptr, &instr); }
if (fault) {
From: Hannes Reinecke hare@suse.com
[ Upstream commit b6ce6fb121a655aefe41dccc077141c102145a37 ]
Some arrays are not capable of returning RTPG data during state transitioning, but rather return an 'LUN not accessible, asymmetric access state transition' sense code. In these cases we can set the state to 'transitioning' directly and don't need to evaluate the RTPG data (which we won't have anyway).
Link: https://lore.kernel.org/r/20191007135701.32389-1-hare@suse.de Reviewed-by: Laurence Oberman loberman@redhat.com Reviewed-by: Ewan D. Milne emilne@redhat.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Hannes Reinecke hare@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/device_handler/scsi_dh_alua.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index 41f5f64101630..135376ee2cbf0 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -523,6 +523,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) unsigned int tpg_desc_tbl_off; unsigned char orig_transition_tmo; unsigned long flags; + bool transitioning_sense = false;
if (!pg->expiry) { unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ; @@ -567,13 +568,19 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) goto retry; } /* - * Retry on ALUA state transition or if any - * UNIT ATTENTION occurred. + * If the array returns with 'ALUA state transition' + * sense code here it cannot return RTPG data during + * transition. So set the state to 'transitioning' directly. */ if (sense_hdr.sense_key == NOT_READY && - sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) - err = SCSI_DH_RETRY; - else if (sense_hdr.sense_key == UNIT_ATTENTION) + sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) { + transitioning_sense = true; + goto skip_rtpg; + } + /* + * Retry on any other UNIT ATTENTION occurred. + */ + if (sense_hdr.sense_key == UNIT_ATTENTION) err = SCSI_DH_RETRY; if (err == SCSI_DH_RETRY && pg->expiry != 0 && time_before(jiffies, pg->expiry)) { @@ -661,7 +668,11 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) off = 8 + (desc[7] * 4); }
+ skip_rtpg: spin_lock_irqsave(&pg->lock, flags); + if (transitioning_sense) + pg->state = SCSI_ACCESS_STATE_TRANSITIONING; + sdev_printk(KERN_INFO, sdev, "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n", ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit 0ee6211408a8e939428f662833c7301394125b80 ]
Drop out memory dev_printk() with wrong device pointer argument.
[mkp: typo]
Link: https://lore.kernel.org/r/20191009151118.32350-1-tbogendoerfer@suse.de Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/sni_53c710.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/scsi/sni_53c710.c b/drivers/scsi/sni_53c710.c index 1f9a087daf69f..3102a75984d3b 100644 --- a/drivers/scsi/sni_53c710.c +++ b/drivers/scsi/sni_53c710.c @@ -78,10 +78,8 @@ static int snirm710_probe(struct platform_device *dev)
base = res->start; hostdata = kzalloc(sizeof(*hostdata), GFP_KERNEL); - if (!hostdata) { - dev_printk(KERN_ERR, dev, "Failed to allocate host data\n"); + if (!hostdata) return -ENOMEM; - }
hostdata->dev = &dev->dev; dma_set_mask(&dev->dev, DMA_BIT_MASK(32));
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit 8cbf0c173aa096dda526d1ccd66fc751c31da346 ]
When building a kernel with SCSI_SNI_53C710 enabled, Kconfig warns:
WARNING: unmet direct dependencies detected for 53C700_LE_ON_BE Depends on [n]: SCSI_LOWLEVEL [=y] && SCSI [=y] && SCSI_LASI700 [=n] Selected by [y]: - SCSI_SNI_53C710 [=y] && SCSI_LOWLEVEL [=y] && SNI_RM [=y] && SCSI [=y]
Add the missing depends SCSI_SNI_53C710 to 53C700_LE_ON_BE to fix it.
Link: https://lore.kernel.org/r/20191009151128.32411-1-tbogendoerfer@suse.de Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 41366339b9501..881906dc33b83 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -966,7 +966,7 @@ config SCSI_SNI_53C710
config 53C700_LE_ON_BE bool - depends on SCSI_LASI700 + depends on SCSI_LASI700 || SCSI_SNI_53C710 default y
config SCSI_STEX
From: Anson Huang Anson.Huang@nxp.com
[ Upstream commit 252b9e21bcf46b0d16f733f2e42b21fdc60addee ]
i.MX7S/D's GPT ipg clock should be from GPT clock root and controlled by CCM's GPT CCGR, using correct clock source for GPT ipg clock instead of IMX7D_CLK_DUMMY.
Fixes: 3ef79ca6bd1d ("ARM: dts: imx7d: use imx7s.dtsi as base device tree") Signed-off-by: Anson Huang Anson.Huang@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/imx7s.dtsi | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi index bf15efbe8a710..836550f2297ac 100644 --- a/arch/arm/boot/dts/imx7s.dtsi +++ b/arch/arm/boot/dts/imx7s.dtsi @@ -450,7 +450,7 @@ compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt"; reg = <0x302d0000 0x10000>; interrupts = <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&clks IMX7D_CLK_DUMMY>, + clocks = <&clks IMX7D_GPT1_ROOT_CLK>, <&clks IMX7D_GPT1_ROOT_CLK>; clock-names = "ipg", "per"; }; @@ -459,7 +459,7 @@ compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt"; reg = <0x302e0000 0x10000>; interrupts = <GIC_SPI 54 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&clks IMX7D_CLK_DUMMY>, + clocks = <&clks IMX7D_GPT2_ROOT_CLK>, <&clks IMX7D_GPT2_ROOT_CLK>; clock-names = "ipg", "per"; status = "disabled"; @@ -469,7 +469,7 @@ compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt"; reg = <0x302f0000 0x10000>; interrupts = <GIC_SPI 53 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&clks IMX7D_CLK_DUMMY>, + clocks = <&clks IMX7D_GPT3_ROOT_CLK>, <&clks IMX7D_GPT3_ROOT_CLK>; clock-names = "ipg", "per"; status = "disabled"; @@ -479,7 +479,7 @@ compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt"; reg = <0x30300000 0x10000>; interrupts = <GIC_SPI 52 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&clks IMX7D_CLK_DUMMY>, + clocks = <&clks IMX7D_GPT4_ROOT_CLK>, <&clks IMX7D_GPT4_ROOT_CLK>; clock-names = "ipg", "per"; status = "disabled";
From: Yunfeng Ye yeyunfeng@huawei.com
[ Upstream commit ae199c580da1754a2b051321eeb76d6dacd8707b ]
There is a memory leak problem in the failure paths of build_cl_output(), so fix it.
Signed-off-by: Yunfeng Ye yeyunfeng@huawei.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Feilong Lin linfeilong@huawei.com Cc: Hu Shiyuan hushiyuan@huawei.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/4d3c0178-5482-c313-98e1-f82090d2d456@huawei.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-c2c.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c index 32e64a8a6443f..bec7a2f1fb4dc 100644 --- a/tools/perf/builtin-c2c.c +++ b/tools/perf/builtin-c2c.c @@ -2454,6 +2454,7 @@ static int build_cl_output(char *cl_sort, bool no_source) bool add_sym = false; bool add_dso = false; bool add_src = false; + int ret = 0;
if (!buf) return -ENOMEM; @@ -2472,7 +2473,8 @@ static int build_cl_output(char *cl_sort, bool no_source) add_dso = true; } else if (strcmp(tok, "offset")) { pr_err("unrecognized sort token: %s\n", tok); - return -EINVAL; + ret = -EINVAL; + goto err; } }
@@ -2495,13 +2497,15 @@ static int build_cl_output(char *cl_sort, bool no_source) add_sym ? "symbol," : "", add_dso ? "dso," : "", add_src ? "cl_srcline," : "", - "node") < 0) - return -ENOMEM; + "node") < 0) { + ret = -ENOMEM; + goto err; + }
c2c.show_src = add_src; - +err: free(buf); - return 0; + return ret; }
static int setup_coalesce(const char *coalesce, bool no_source)
From: Yunfeng Ye yeyunfeng@huawei.com
[ Upstream commit 1abecfcaa7bba21c9985e0136fa49836164dd8fd ]
The memory @orig_flags is allocated by strdup(), it is freed on the normal path, but leak to free on the error path.
Fix this by adding free(orig_flags) on the error path.
Fixes: 0e11115644b3 ("perf kmem: Print gfp flags in human readable string") Signed-off-by: Yunfeng Ye yeyunfeng@huawei.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Feilong Lin linfeilong@huawei.com Cc: Hu Shiyuan hushiyuan@huawei.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/f9e9f458-96f3-4a97-a1d5-9feec2420e07@huawei.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-kmem.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c index 9e693ce4b73b0..ce786f363476e 100644 --- a/tools/perf/builtin-kmem.c +++ b/tools/perf/builtin-kmem.c @@ -687,6 +687,7 @@ static char *compact_gfp_flags(char *gfp_flags) new = realloc(new_flags, len + strlen(cpt) + 2); if (new == NULL) { free(new_flags); + free(orig_flags); return NULL; }
From: Peter Ujfalusi peter.ujfalusi@ti.com
[ Upstream commit 564b6bb9d42d31fc80c006658cf38940a9b99616 ]
dm365 have only single McBSP, so the device name is without .0
Fixes: 0c750e1fe481d ("ARM: davinci: dm365: Add dma_slave_map to edma") Signed-off-by: Peter Ujfalusi peter.ujfalusi@ti.com Signed-off-by: Sekhar Nori nsekhar@ti.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-davinci/dm365.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mach-davinci/dm365.c b/arch/arm/mach-davinci/dm365.c index 8be04ec95adf5..d80b2290ac2e0 100644 --- a/arch/arm/mach-davinci/dm365.c +++ b/arch/arm/mach-davinci/dm365.c @@ -856,8 +856,8 @@ static s8 dm365_queue_priority_mapping[][2] = { };
static const struct dma_slave_map dm365_edma_map[] = { - { "davinci-mcbsp.0", "tx", EDMA_FILTER_PARAM(0, 2) }, - { "davinci-mcbsp.0", "rx", EDMA_FILTER_PARAM(0, 3) }, + { "davinci-mcbsp", "tx", EDMA_FILTER_PARAM(0, 2) }, + { "davinci-mcbsp", "rx", EDMA_FILTER_PARAM(0, 3) }, { "davinci_voicecodec", "tx", EDMA_FILTER_PARAM(0, 2) }, { "davinci_voicecodec", "rx", EDMA_FILTER_PARAM(0, 3) }, { "spi_davinci.2", "tx", EDMA_FILTER_PARAM(0, 10) },
From: Bodo Stroesser bstroesser@ts.fujitsu.com
[ Upstream commit 27e84243cb63601a10e366afe3e2d05bb03c1cb5 ]
passthrough_parse_cdb() - used by TCMU and PSCSI - attepts to reset the LUN field of SCSI-2 CDBs (bits 5,6,7 of byte 1). The current code is wrong as for newer commands not having the LUN field it overwrites relevant command bits (e.g. for SECURITY PROTOCOL IN / OUT). We think this code was unnecessary from the beginning or at least it is no longer useful. So we remove it entirely.
Link: https://lore.kernel.org/r/12498eab-76fd-eaad-1316-c2827badb76a@ts.fujitsu.co... Signed-off-by: Bodo Stroesser bstroesser@ts.fujitsu.com Reviewed-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Hannes Reinecke hare@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_device.c | 21 --------------------- 1 file changed, 21 deletions(-)
diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c index 84742125f7730..92b52d2314b53 100644 --- a/drivers/target/target_core_device.c +++ b/drivers/target/target_core_device.c @@ -1151,27 +1151,6 @@ passthrough_parse_cdb(struct se_cmd *cmd, struct se_device *dev = cmd->se_dev; unsigned int size;
- /* - * Clear a lun set in the cdb if the initiator talking to use spoke - * and old standards version, as we can't assume the underlying device - * won't choke up on it. - */ - switch (cdb[0]) { - case READ_10: /* SBC - RDProtect */ - case READ_12: /* SBC - RDProtect */ - case READ_16: /* SBC - RDProtect */ - case SEND_DIAGNOSTIC: /* SPC - SELF-TEST Code */ - case VERIFY: /* SBC - VRProtect */ - case VERIFY_16: /* SBC - VRProtect */ - case WRITE_VERIFY: /* SBC - VRProtect */ - case WRITE_VERIFY_12: /* SBC - VRProtect */ - case MAINTENANCE_IN: /* SPC - Parameter Data Format for SA RTPG */ - break; - default: - cdb[1] &= 0x1f; /* clear logical unit number */ - break; - } - /* * For REPORT LUNS we always need to emulate the response, for everything * else, pass it up.
From: afzal mohammed afzal.mohd.ma@gmail.com
[ Upstream commit 2ecb287998a47cc0a766f6071f63bc185f338540 ]
r0-r3 & r12 registers are saved & restored, before & after svc respectively. Intention was to preserve those registers across thread to handler mode switch.
On v7-M, hardware saves the register context upon exception in AAPCS complaint way. Restoring r0-r3 & r12 is done from stack location where hardware saves it, not from the location on stack where these registers were saved.
To clarify, on stm32f429 discovery board:
1. before svc, sp - 0x90009ff8 2. r0-r3,r12 saved to 0x90009ff8 - 0x9000a00b 3. upon svc, h/w decrements sp by 32 & pushes registers onto stack 4. after svc, sp - 0x90009fd8 5. r0-r3,r12 restored from 0x90009fd8 - 0x90009feb
Above means r0-r3,r12 is not restored from the location where they are saved, but since hardware pushes the registers onto stack, the registers are restored correctly.
Note that during register saving to stack (step 2), it goes past 0x9000a000. And it seems, based on objdump, there are global symbols residing there, and it perhaps can cause issues on a non-XIP Kernel (on XIP, data section is setup later).
Based on the analysis above, manually saving registers onto stack is at best no-op and at worst can cause data section corruption. Hence remove storing of registers onto stack before svc.
Fixes: b70cd406d7fe ("ARM: 8671/1: V7M: Preserve registers across switch from Thread to Handler mode") Signed-off-by: afzal mohammed afzal.mohd.ma@gmail.com Acked-by: Vladimir Murzin vladimir.murzin@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mm/proc-v7m.S | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S index 92e84181933ad..c68408d51c4bc 100644 --- a/arch/arm/mm/proc-v7m.S +++ b/arch/arm/mm/proc-v7m.S @@ -135,7 +135,6 @@ __v7m_setup_cont: dsb mov r6, lr @ save LR ldr sp, =init_thread_union + THREAD_START_SP - stmia sp, {r0-r3, r12} cpsie i svc #0 1: cpsid i
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit e13de8fe0d6a51341671bbe384826d527afe8d44 ]
In unittest_data_add, a copy buffer is created via kmemdup. This buffer is leaked if of_fdt_unflatten_tree fails. The release for the unittest_data buffer is added.
Fixes: b951f9dc7f25 ("Enabling OF selftest to run without machine's devicetree") Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Reviewed-by: Frank Rowand frowand.list@gmail.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/unittest.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c index 7c6aff7618009..87650d42682fc 100644 --- a/drivers/of/unittest.c +++ b/drivers/of/unittest.c @@ -1002,6 +1002,7 @@ static int __init unittest_data_add(void) of_fdt_unflatten_tree(unittest_data, NULL, &unittest_data_node); if (!unittest_data_node) { pr_warn("%s: No tree to attach; not running tests\n", __func__); + kfree(unittest_data); return -ENODATA; } of_node_set_flag(unittest_data_node, OF_DETACHED);
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit e4f5cb1a9b27c0f94ef4f5a0178a3fde2d3d0e9e ]
The vectors span more than one byte, so mark them as arrays.
Fixes the following build error when building when using GCC 8.3:
In file included from ./include/linux/string.h:19, from ./include/linux/bitmap.h:9, from ./include/linux/cpumask.h:12, from ./arch/mips/include/asm/processor.h:15, from ./arch/mips/include/asm/thread_info.h:16, from ./include/linux/thread_info.h:38, from ./include/asm-generic/preempt.h:5, from ./arch/mips/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:81, from ./include/linux/spinlock.h:51, from ./include/linux/mmzone.h:8, from ./include/linux/bootmem.h:8, from arch/mips/bcm63xx/prom.c:10: arch/mips/bcm63xx/prom.c: In function 'prom_init': ./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds] __ret = __builtin_memcpy((dst), (src), __len); \ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy' memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20); ^~~~~~ In file included from arch/mips/bcm63xx/prom.c:14: ./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here extern char bmips_smp_movevec;
Fixes: 18a1eef92dcd ("MIPS: BMIPS: Introduce bmips.h") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Paul Burton paulburton@kernel.org Cc: linux-mips@vger.kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/bcm63xx/prom.c | 2 +- arch/mips/include/asm/bmips.h | 10 +++++----- arch/mips/kernel/smp-bmips.c | 8 ++++---- 3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/mips/bcm63xx/prom.c b/arch/mips/bcm63xx/prom.c index 7019e2967009e..bbbf8057565b2 100644 --- a/arch/mips/bcm63xx/prom.c +++ b/arch/mips/bcm63xx/prom.c @@ -84,7 +84,7 @@ void __init prom_init(void) * Here we will start up CPU1 in the background and ask it to * reconfigure itself then go back to sleep. */ - memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20); + memcpy((void *)0xa0000200, bmips_smp_movevec, 0x20); __sync(); set_c0_cause(C_SW0); cpumask_set_cpu(1, &bmips_booted_mask); diff --git a/arch/mips/include/asm/bmips.h b/arch/mips/include/asm/bmips.h index b3e2975f83d36..a564915fddc40 100644 --- a/arch/mips/include/asm/bmips.h +++ b/arch/mips/include/asm/bmips.h @@ -75,11 +75,11 @@ static inline int register_bmips_smp_ops(void) #endif }
-extern char bmips_reset_nmi_vec; -extern char bmips_reset_nmi_vec_end; -extern char bmips_smp_movevec; -extern char bmips_smp_int_vec; -extern char bmips_smp_int_vec_end; +extern char bmips_reset_nmi_vec[]; +extern char bmips_reset_nmi_vec_end[]; +extern char bmips_smp_movevec[]; +extern char bmips_smp_int_vec[]; +extern char bmips_smp_int_vec_end[];
extern int bmips_smp_enabled; extern int bmips_cpu_offset; diff --git a/arch/mips/kernel/smp-bmips.c b/arch/mips/kernel/smp-bmips.c index 382d12eb88f0f..45fbcbbf2504e 100644 --- a/arch/mips/kernel/smp-bmips.c +++ b/arch/mips/kernel/smp-bmips.c @@ -457,10 +457,10 @@ static void bmips_wr_vec(unsigned long dst, char *start, char *end)
static inline void bmips_nmi_handler_setup(void) { - bmips_wr_vec(BMIPS_NMI_RESET_VEC, &bmips_reset_nmi_vec, - &bmips_reset_nmi_vec_end); - bmips_wr_vec(BMIPS_WARM_RESTART_VEC, &bmips_smp_int_vec, - &bmips_smp_int_vec_end); + bmips_wr_vec(BMIPS_NMI_RESET_VEC, bmips_reset_nmi_vec, + bmips_reset_nmi_vec_end); + bmips_wr_vec(BMIPS_WARM_RESTART_VEC, bmips_smp_int_vec, + bmips_smp_int_vec_end); }
struct reset_vec_info {
From: Alain Volmat alain.volmat@st.com
[ Upstream commit 348e46fbb4cdb2aead79aee1fd8bb25ec5fd25db ]
Remove the following warning:
drivers/i2c/busses/i2c-stm32f7.c:315: warning: cannot understand function prototype: 'struct stm32f7_i2c_spec i2c_specs[] =
Replace a comment starting with /** by simply /* to avoid having it interpreted as a kernel-doc comment.
Fixes: aeb068c57214 ("i2c: i2c-stm32f7: add driver") Signed-off-by: Alain Volmat alain.volmat@st.com Reviewed-by: Pierre-Yves MORDRET pierre-yves.mordret@st.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-stm32f7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-stm32f7.c b/drivers/i2c/busses/i2c-stm32f7.c index d8cbe149925b5..14f60751729e7 100644 --- a/drivers/i2c/busses/i2c-stm32f7.c +++ b/drivers/i2c/busses/i2c-stm32f7.c @@ -219,7 +219,7 @@ struct stm32f7_i2c_dev { struct stm32f7_i2c_timings timing; };
-/** +/* * All these values are coming from I2C Specification, Version 6.0, 4th of * April 2014. *
From: Dave Wysochanski dwysocha@redhat.com
[ Upstream commit d46b0da7a33dd8c99d969834f682267a45444ab3 ]
There's a deadlock that is possible and can easily be seen with a test where multiple readers open/read/close of the same file and a disruption occurs causing reconnect. The deadlock is due a reader thread inside cifs_strict_readv calling down_read and obtaining lock_sem, and then after reconnect inside cifs_reopen_file calling down_read a second time. If in between the two down_read calls, a down_write comes from another process, deadlock occurs.
CPU0 CPU1 ---- ---- cifs_strict_readv() down_read(&cifsi->lock_sem); _cifsFileInfo_put OR cifs_new_fileinfo down_write(&cifsi->lock_sem); cifs_reopen_file() down_read(&cifsi->lock_sem);
Fix the above by changing all down_write(lock_sem) calls to down_write_trylock(lock_sem)/msleep() loop, which in turn makes the second down_read call benign since it will never block behind the writer while holding lock_sem.
Signed-off-by: Dave Wysochanski dwysocha@redhat.com Suggested-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed--by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifsglob.h | 5 +++++ fs/cifs/cifsproto.h | 1 + fs/cifs/file.c | 23 +++++++++++++++-------- fs/cifs/smb2file.c | 2 +- 4 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index 7b7ab10a9db18..600bb838c15b8 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1210,6 +1210,11 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file); struct cifsInodeInfo { bool can_cache_brlcks; struct list_head llist; /* locks helb by this inode */ + /* + * NOTE: Some code paths call down_read(lock_sem) twice, so + * we must always use use cifs_down_write() instead of down_write() + * for this semaphore to avoid deadlocks. + */ struct rw_semaphore lock_sem; /* protect the fields above */ /* BB add in lists for dirty pages i.e. write caching info for oplock */ struct list_head openFileList; diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index ccdb42f71b2e8..3a7fb8e750e97 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -149,6 +149,7 @@ extern int cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock, const unsigned int xid); extern int cifs_push_mandatory_locks(struct cifsFileInfo *cfile);
+extern void cifs_down_write(struct rw_semaphore *sem); extern struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid, struct file *file, struct tcon_link *tlink, diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 71a960da7cce1..40f22932343c6 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -280,6 +280,13 @@ cifs_has_mand_locks(struct cifsInodeInfo *cinode) return has_locks; }
+void +cifs_down_write(struct rw_semaphore *sem) +{ + while (!down_write_trylock(sem)) + msleep(10); +} + struct cifsFileInfo * cifs_new_fileinfo(struct cifs_fid *fid, struct file *file, struct tcon_link *tlink, __u32 oplock) @@ -305,7 +312,7 @@ cifs_new_fileinfo(struct cifs_fid *fid, struct file *file, INIT_LIST_HEAD(&fdlocks->locks); fdlocks->cfile = cfile; cfile->llist = fdlocks; - down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); list_add(&fdlocks->llist, &cinode->llist); up_write(&cinode->lock_sem);
@@ -457,7 +464,7 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler) * Delete any outstanding lock records. We'll lose them when the file * is closed anyway. */ - down_write(&cifsi->lock_sem); + cifs_down_write(&cifsi->lock_sem); list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) { list_del(&li->llist); cifs_del_lock_waiters(li); @@ -1011,7 +1018,7 @@ static void cifs_lock_add(struct cifsFileInfo *cfile, struct cifsLockInfo *lock) { struct cifsInodeInfo *cinode = CIFS_I(d_inode(cfile->dentry)); - down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); list_add_tail(&lock->llist, &cfile->llist->locks); up_write(&cinode->lock_sem); } @@ -1033,7 +1040,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
try_again: exist = false; - down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem);
exist = cifs_find_lock_conflict(cfile, lock->offset, lock->length, lock->type, &conf_lock, CIFS_LOCK_OP); @@ -1055,7 +1062,7 @@ cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock, (lock->blist.next == &lock->blist)); if (!rc) goto try_again; - down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); list_del_init(&lock->blist); }
@@ -1108,7 +1115,7 @@ cifs_posix_lock_set(struct file *file, struct file_lock *flock) return rc;
try_again: - down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); if (!cinode->can_cache_brlcks) { up_write(&cinode->lock_sem); return rc; @@ -1314,7 +1321,7 @@ cifs_push_locks(struct cifsFileInfo *cfile) int rc = 0;
/* we are going to update can_cache_brlcks here - need a write access */ - down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); if (!cinode->can_cache_brlcks) { up_write(&cinode->lock_sem); return rc; @@ -1505,7 +1512,7 @@ cifs_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock, if (!buf) return -ENOMEM;
- down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); for (i = 0; i < 2; i++) { cur = buf; num = 0; diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c index 1add404618f06..2c809233084bb 100644 --- a/fs/cifs/smb2file.c +++ b/fs/cifs/smb2file.c @@ -139,7 +139,7 @@ smb2_unlock_range(struct cifsFileInfo *cfile, struct file_lock *flock,
cur = buf;
- down_write(&cinode->lock_sem); + cifs_down_write(&cinode->lock_sem); list_for_each_entry_safe(li, tmp, &cfile->llist->locks, llist) { if (flock->fl_start > li->offset || (flock->fl_start + length) <
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 7ce23e8e0a9cd38338fc8316ac5772666b565ca9 ]
We hit the following warning in production
print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700 ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60 Workqueue: knbd-recv recv_work [nbd] RIP: 0010:refcount_sub_and_test_checked+0x53/0x60 Call Trace: blk_mq_free_request+0xb7/0xf0 blk_mq_complete_request+0x62/0xf0 recv_work+0x29/0xa1 [nbd] process_one_work+0x1f5/0x3f0 worker_thread+0x2d/0x3d0 ? rescuer_thread+0x340/0x340 kthread+0x111/0x130 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x1f/0x30 ---[ end trace b079c3c67f98bb7c ]---
This was preceded by us timing out everything and shutting down the sockets for the device. The problem is we had a request in the queue at the same time, so we completed the request twice. This can actually happen in a lot of cases, we fail to get a ref on our config, we only have one connection and just error out the command, etc.
Fix this by checking cmd->status in nbd_read_stat. We only change this under the cmd->lock, so we are safe to check this here and see if we've already error'ed this command out, which would indicate that we've completed it as well.
Reviewed-by: Mike Christie mchristi@redhat.com Signed-off-by: Josef Bacik josef@toxicpanda.com
Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/nbd.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index f3d0bc9a99058..34dfadd4dcd41 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -648,6 +648,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) ret = -ENOENT; goto out; } + if (cmd->status != BLK_STS_OK) { + dev_err(disk_to_dev(nbd->disk), "Command already handled %p\n", + req); + ret = -ENOENT; + goto out; + } if (test_bit(NBD_CMD_REQUEUED, &cmd->flags)) { dev_err(disk_to_dev(nbd->disk), "Raced with timeout on req %p\n", req);
From: Vishal Kulkarni vishal@chelsio.com
[ Upstream commit fc89cc358fb64e2429aeae0f37906126636507ec ]
Release resources when attaching to ULD fail. Otherwise, data mismatch is seen between LLD and ULD later on, which lead to kernel panic when accessing resources that should not even exist in the first place.
Fixes: 94cdb8bb993a ("cxgb4: Add support for dynamic allocation of resources for ULD") Signed-off-by: Shahjada Abul Husain shahjada@chelsio.com Signed-off-by: Vishal Kulkarni vishal@chelsio.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c | 29 ++++++++++++++----------- 1 file changed, 17 insertions(+), 12 deletions(-)
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c @@ -670,10 +670,10 @@ static void uld_init(struct adapter *ada lld->fr_nsmr_tpte_wr_support = adap->params.fr_nsmr_tpte_wr_support; }
-static void uld_attach(struct adapter *adap, unsigned int uld) +static int uld_attach(struct adapter *adap, unsigned int uld) { - void *handle; struct cxgb4_lld_info lli; + void *handle;
uld_init(adap, &lli); uld_queue_init(adap, uld, &lli); @@ -683,7 +683,7 @@ static void uld_attach(struct adapter *a dev_warn(adap->pdev_dev, "could not attach to the %s driver, error %ld\n", adap->uld[uld].name, PTR_ERR(handle)); - return; + return PTR_ERR(handle); }
adap->uld[uld].handle = handle; @@ -691,23 +691,24 @@ static void uld_attach(struct adapter *a
if (adap->flags & FULL_INIT_DONE) adap->uld[uld].state_change(handle, CXGB4_STATE_UP); + + return 0; }
-/** - * cxgb4_register_uld - register an upper-layer driver - * @type: the ULD type - * @p: the ULD methods +/* cxgb4_register_uld - register an upper-layer driver + * @type: the ULD type + * @p: the ULD methods * - * Registers an upper-layer driver with this driver and notifies the ULD - * about any presently available devices that support its type. Returns - * %-EBUSY if a ULD of the same type is already registered. + * Registers an upper-layer driver with this driver and notifies the ULD + * about any presently available devices that support its type. Returns + * %-EBUSY if a ULD of the same type is already registered. */ int cxgb4_register_uld(enum cxgb4_uld type, const struct cxgb4_uld_info *p) { - int ret = 0; unsigned int adap_idx = 0; struct adapter *adap; + int ret = 0;
if (type >= CXGB4_ULD_MAX) return -EINVAL; @@ -741,12 +742,16 @@ int cxgb4_register_uld(enum cxgb4_uld ty if (ret) goto free_irq; adap->uld[type] = *p; - uld_attach(adap, type); + ret = uld_attach(adap, type); + if (ret) + goto free_txq; adap_idx++; } mutex_unlock(&uld_mutex); return 0;
+free_txq: + release_sge_txq_uld(adap, type); free_irq: if (adap->flags & FULL_INIT_DONE) quiesce_rx_uld(adap, type);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3d1e5039f5f87a8731202ceca08764ee7cb010d3 ]
For some reason I missed the case of DCCP passive flows in my previous patch.
Fixes: a904a0693c18 ("inet: stop leaking jiffies on the wire") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Thiemo Nagel tnagel@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dccp/ipv4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -417,7 +417,7 @@ struct sock *dccp_v4_request_recv_sock(c RCU_INIT_POINTER(newinet->inet_opt, rcu_dereference(ireq->ireq_opt)); newinet->mc_index = inet_iif(skb); newinet->mc_ttl = ip_hdr(skb)->ttl; - newinet->inet_id = jiffies; + newinet->inet_id = prandom_u32();
if (dst == NULL && (dst = inet_csk_route_child_sock(sk, newsk, req)) == NULL) goto put_and_exit;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7170a977743b72cf3eb46ef6ef89885dc7ad3621 ]
This socket field can be read and written by concurrent cpus.
Use READ_ONCE() and WRITE_ONCE() annotations to document this, and avoid some compiler 'optimizations'.
KCSAN reported :
BUG: KCSAN: data-race in tcp_v4_rcv / tcp_v4_rcv
write to 0xffff88812220763c of 4 bytes by interrupt on cpu 0: sk_incoming_cpu_update include/net/sock.h:953 [inline] tcp_v4_rcv+0x1b3c/0x1bb0 net/ipv4/tcp_ipv4.c:1934 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 napi_poll net/core/dev.c:6392 [inline] net_rx_action+0x3ae/0xa90 net/core/dev.c:6460 __do_softirq+0x115/0x33f kernel/softirq.c:292 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082 do_softirq.part.0+0x6b/0x80 kernel/softirq.c:337 do_softirq kernel/softirq.c:329 [inline] __local_bh_enable_ip+0x76/0x80 kernel/softirq.c:189
read to 0xffff88812220763c of 4 bytes by interrupt on cpu 1: sk_incoming_cpu_update include/net/sock.h:952 [inline] tcp_v4_rcv+0x181a/0x1bb0 net/ipv4/tcp_ipv4.c:1934 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 napi_poll net/core/dev.c:6392 [inline] net_rx_action+0x3ae/0xa90 net/core/dev.c:6460 __do_softirq+0x115/0x33f kernel/softirq.c:292 run_ksoftirqd+0x46/0x60 kernel/softirq.c:603 smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sock.h | 4 ++-- net/core/sock.c | 4 ++-- net/ipv4/inet_hashtables.c | 2 +- net/ipv4/udp.c | 2 +- net/ipv6/inet6_hashtables.c | 2 +- net/ipv6/udp.c | 2 +- 6 files changed, 8 insertions(+), 8 deletions(-)
--- a/include/net/sock.h +++ b/include/net/sock.h @@ -916,8 +916,8 @@ static inline void sk_incoming_cpu_updat { int cpu = raw_smp_processor_id();
- if (unlikely(sk->sk_incoming_cpu != cpu)) - sk->sk_incoming_cpu = cpu; + if (unlikely(READ_ONCE(sk->sk_incoming_cpu) != cpu)) + WRITE_ONCE(sk->sk_incoming_cpu, cpu); }
static inline void sock_rps_record_flow_hash(__u32 hash) --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1039,7 +1039,7 @@ set_rcvbuf: break;
case SO_INCOMING_CPU: - sk->sk_incoming_cpu = val; + WRITE_ONCE(sk->sk_incoming_cpu, val); break;
case SO_CNX_ADVICE: @@ -1351,7 +1351,7 @@ int sock_getsockopt(struct socket *sock, break;
case SO_INCOMING_CPU: - v.val = sk->sk_incoming_cpu; + v.val = READ_ONCE(sk->sk_incoming_cpu); break;
case SO_MEMINFO: --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -193,7 +193,7 @@ static inline int compute_score(struct s if (sk->sk_bound_dev_if) score += 4; } - if (sk->sk_incoming_cpu == raw_smp_processor_id()) + if (READ_ONCE(sk->sk_incoming_cpu) == raw_smp_processor_id()) score++; } return score; --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -419,7 +419,7 @@ static int compute_score(struct sock *sk score += 4; }
- if (sk->sk_incoming_cpu == raw_smp_processor_id()) + if (READ_ONCE(sk->sk_incoming_cpu) == raw_smp_processor_id()) score++; return score; } --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -118,7 +118,7 @@ static inline int compute_score(struct s if (sk->sk_bound_dev_if) score++; } - if (sk->sk_incoming_cpu == raw_smp_processor_id()) + if (READ_ONCE(sk->sk_incoming_cpu) == raw_smp_processor_id()) score++; } return score; --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -170,7 +170,7 @@ static int compute_score(struct sock *sk score++; }
- if (sk->sk_incoming_cpu == raw_smp_processor_id()) + if (READ_ONCE(sk->sk_incoming_cpu) == raw_smp_processor_id()) score++;
return score;
From: Eric Dumazet edumazet@google.com
[ Upstream commit ee8d153d46a3b98c064ee15c0c0a3bbf1450e5a1 ]
We already annotated most accesses to sk->sk_napi_id
We missed sk_mark_napi_id() and sk_mark_napi_id_once() which might be called without socket lock held in UDP stack.
KCSAN reported : BUG: KCSAN: data-race in udpv6_queue_rcv_one_skb / udpv6_queue_rcv_one_skb
write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 0: sk_mark_napi_id include/net/busy_poll.h:125 [inline] __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline] udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672 udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689 udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832 __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913 udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015 ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409 ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459 dst_input include/net/dst.h:442 [inline] ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 napi_poll net/core/dev.c:6392 [inline] net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 1: sk_mark_napi_id include/net/busy_poll.h:125 [inline] __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline] udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672 udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689 udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832 __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913 udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015 ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409 ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459 dst_input include/net/dst.h:442 [inline] ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 10890 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: e68b6e50fa35 ("udp: enable busy polling for all sockets") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/busy_poll.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -134,7 +134,7 @@ static inline void skb_mark_napi_id(stru static inline void sk_mark_napi_id(struct sock *sk, const struct sk_buff *skb) { #ifdef CONFIG_NET_RX_BUSY_POLL - sk->sk_napi_id = skb->napi_id; + WRITE_ONCE(sk->sk_napi_id, skb->napi_id); #endif }
@@ -143,8 +143,8 @@ static inline void sk_mark_napi_id_once( const struct sk_buff *skb) { #ifdef CONFIG_NET_RX_BUSY_POLL - if (!sk->sk_napi_id) - sk->sk_napi_id = skb->napi_id; + if (!READ_ONCE(sk->sk_napi_id)) + WRITE_ONCE(sk->sk_napi_id, skb->napi_id); #endif }
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 5fc0f21246e50afdf318b5a3a941f7f4f57b8947 ]
Since it became possible for the DSA core to use a CPU port different than 8, our bcm_sf2_imp_setup() function was broken because it assumes that registers are applicable to port 8. In particular, the port's MAC is going to stay disabled, so make sure we clear the RX_DIS and TX_DIS bits if we are not configured for port 8.
Fixes: 9f91484f6fcc ("net: dsa: make "label" property optional for dsa2") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/bcm_sf2.c | 36 +++++++++++++++++++++--------------- 1 file changed, 21 insertions(+), 15 deletions(-)
--- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -106,22 +106,11 @@ static void bcm_sf2_imp_setup(struct dsa unsigned int i; u32 reg, offset;
- if (priv->type == BCM7445_DEVICE_ID) - offset = CORE_STS_OVERRIDE_IMP; - else - offset = CORE_STS_OVERRIDE_IMP2; - /* Enable the port memories */ reg = core_readl(priv, CORE_MEM_PSM_VDD_CTRL); reg &= ~P_TXQ_PSM_VDD(port); core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);
- /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */ - reg = core_readl(priv, CORE_IMP_CTL); - reg |= (RX_BCST_EN | RX_MCST_EN | RX_UCST_EN); - reg &= ~(RX_DIS | TX_DIS); - core_writel(priv, reg, CORE_IMP_CTL); - /* Enable forwarding */ core_writel(priv, SW_FWDG_EN, CORE_SWMODE);
@@ -140,10 +129,27 @@ static void bcm_sf2_imp_setup(struct dsa
bcm_sf2_brcm_hdr_setup(priv, port);
- /* Force link status for IMP port */ - reg = core_readl(priv, offset); - reg |= (MII_SW_OR | LINK_STS); - core_writel(priv, reg, offset); + if (port == 8) { + if (priv->type == BCM7445_DEVICE_ID) + offset = CORE_STS_OVERRIDE_IMP; + else + offset = CORE_STS_OVERRIDE_IMP2; + + /* Force link status for IMP port */ + reg = core_readl(priv, offset); + reg |= (MII_SW_OR | LINK_STS); + core_writel(priv, reg, offset); + + /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */ + reg = core_readl(priv, CORE_IMP_CTL); + reg |= (RX_BCST_EN | RX_MCST_EN | RX_UCST_EN); + reg &= ~(RX_DIS | TX_DIS); + core_writel(priv, reg, CORE_IMP_CTL); + } else { + reg = core_readl(priv, CORE_G_PCTL_PORT(port)); + reg &= ~(RX_DIS | TX_DIS); + core_writel(priv, reg, CORE_G_PCTL_PORT(port)); + } }
static void bcm_sf2_eee_enable_set(struct dsa_switch *ds, int port, bool enable)
From: Benjamin Herrenschmidt benh@kernel.crashing.org
[ Upstream commit 88824e3bf29a2fcacfd9ebbfe03063649f0f3254 ]
We are calling the checksum helper after the dma_map_single() call to map the packet. This is incorrect as the checksumming code will touch the packet from the CPU. This means the cache won't be properly flushes (or the bounce buffering will leave us with the unmodified packet to DMA).
This moves the calculation of the checksum & vlan tags to before the DMA mapping.
This also has the side effect of fixing another bug: If the checksum helper fails, we goto "drop" to drop the packet, which will not unmap the DMA mapping.
Signed-off-by: Benjamin Herrenschmidt benh@kernel.crashing.org Fixes: 05690d633f30 ("ftgmac100: Upgrade to NETIF_F_HW_CSUM") Reviewed-by: Vijay Khemka vijaykhemka@fb.com Tested-by: Vijay Khemka vijaykhemka@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/faraday/ftgmac100.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-)
--- a/drivers/net/ethernet/faraday/ftgmac100.c +++ b/drivers/net/ethernet/faraday/ftgmac100.c @@ -734,6 +734,18 @@ static int ftgmac100_hard_start_xmit(str */ nfrags = skb_shinfo(skb)->nr_frags;
+ /* Setup HW checksumming */ + csum_vlan = 0; + if (skb->ip_summed == CHECKSUM_PARTIAL && + !ftgmac100_prep_tx_csum(skb, &csum_vlan)) + goto drop; + + /* Add VLAN tag */ + if (skb_vlan_tag_present(skb)) { + csum_vlan |= FTGMAC100_TXDES1_INS_VLANTAG; + csum_vlan |= skb_vlan_tag_get(skb) & 0xffff; + } + /* Get header len */ len = skb_headlen(skb);
@@ -760,19 +772,6 @@ static int ftgmac100_hard_start_xmit(str if (nfrags == 0) f_ctl_stat |= FTGMAC100_TXDES0_LTS; txdes->txdes3 = cpu_to_le32(map); - - /* Setup HW checksumming */ - csum_vlan = 0; - if (skb->ip_summed == CHECKSUM_PARTIAL && - !ftgmac100_prep_tx_csum(skb, &csum_vlan)) - goto drop; - - /* Add VLAN tag */ - if (skb_vlan_tag_present(skb)) { - csum_vlan |= FTGMAC100_TXDES1_INS_VLANTAG; - csum_vlan |= skb_vlan_tag_get(skb) & 0xffff; - } - txdes->txdes1 = cpu_to_le32(csum_vlan);
/* Next descriptor */
From: Tejun Heo tj@kernel.org
[ Upstream commit 20eb4f29b60286e0d6dc01d9c260b4bd383c58fb ]
sk_page_frag() optimizes skb_frag allocations by using per-task skb_frag cache when it knows it's the only user. The condition is determined by seeing whether the socket allocation mask allows blocking - if the allocation may block, it obviously owns the task's context and ergo exclusively owns current->task_frag.
Unfortunately, this misses recursion through memory reclaim path. Please take a look at the following backtrace.
[2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10 ... tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 sock_xmit.isra.24+0xa1/0x170 [nbd] nbd_send_cmd+0x1d2/0x690 [nbd] nbd_queue_rq+0x1b5/0x3b0 [nbd] __blk_mq_try_issue_directly+0x108/0x1b0 blk_mq_request_issue_directly+0xbd/0xe0 blk_mq_try_issue_list_directly+0x41/0xb0 blk_mq_sched_insert_requests+0xa2/0xe0 blk_mq_flush_plug_list+0x205/0x2a0 blk_flush_plug_list+0xc3/0xf0 [1] blk_finish_plug+0x21/0x2e _xfs_buf_ioapply+0x313/0x460 __xfs_buf_submit+0x67/0x220 xfs_buf_read_map+0x113/0x1a0 xfs_trans_read_buf_map+0xbf/0x330 xfs_btree_read_buf_block.constprop.42+0x95/0xd0 xfs_btree_lookup_get_block+0x95/0x170 xfs_btree_lookup+0xcc/0x470 xfs_bmap_del_extent_real+0x254/0x9a0 __xfs_bunmapi+0x45c/0xab0 xfs_bunmapi+0x15/0x30 xfs_itruncate_extents_flags+0xca/0x250 xfs_free_eofblocks+0x181/0x1e0 xfs_fs_destroy_inode+0xa8/0x1b0 destroy_inode+0x38/0x70 dispose_list+0x35/0x50 prune_icache_sb+0x52/0x70 super_cache_scan+0x120/0x1a0 do_shrink_slab+0x120/0x290 shrink_slab+0x216/0x2b0 shrink_node+0x1b6/0x4a0 do_try_to_free_pages+0xc6/0x370 try_to_free_mem_cgroup_pages+0xe3/0x1e0 try_charge+0x29e/0x790 mem_cgroup_charge_skmem+0x6a/0x100 __sk_mem_raise_allocated+0x18e/0x390 __sk_mem_schedule+0x2a/0x40 [0] tcp_sendmsg_locked+0x8eb/0xe10 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x26d/0x2b0 __sys_sendmsg+0x57/0xa0 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9
In [0], tcp_send_msg_locked() was using current->page_frag when it called sk_wmem_schedule(). It already calculated how many bytes can be fit into current->page_frag. Due to memory pressure, sk_wmem_schedule() called into memory reclaim path which called into xfs and then IO issue path. Because the filesystem in question is backed by nbd, the control goes back into the tcp layer - back into tcp_sendmsg_locked().
nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes sense - it's in the process of freeing memory and wants to be able to, e.g., drop clean pages to make forward progress. However, this confused sk_page_frag() called from [2]. Because it only tests whether the allocation allows blocking which it does, it now thinks current->page_frag can be used again although it already was being used in [0].
After [2] used current->page_frag, the offset would be increased by the used amount. When the control returns to [0], current->page_frag's offset is increased and the previously calculated number of bytes now may overrun the end of allocated memory leading to silent memory corruptions.
Fix it by adding gfpflags_normal_context() which tests sleepable && !reclaim and use it to determine whether to use current->task_frag.
v2: Eric didn't like gfp flags being tested twice. Introduce a new helper gfpflags_normal_context() and combine the two tests.
Signed-off-by: Tejun Heo tj@kernel.org Cc: Josef Bacik josef@toxicpanda.com Cc: Eric Dumazet eric.dumazet@gmail.com Cc: stable@vger.kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/gfp.h | 23 +++++++++++++++++++++++ include/net/sock.h | 11 ++++++++--- 2 files changed, 31 insertions(+), 3 deletions(-)
--- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -313,6 +313,29 @@ static inline bool gfpflags_allow_blocki return !!(gfp_flags & __GFP_DIRECT_RECLAIM); }
+/** + * gfpflags_normal_context - is gfp_flags a normal sleepable context? + * @gfp_flags: gfp_flags to test + * + * Test whether @gfp_flags indicates that the allocation is from the + * %current context and allowed to sleep. + * + * An allocation being allowed to block doesn't mean it owns the %current + * context. When direct reclaim path tries to allocate memory, the + * allocation context is nested inside whatever %current was doing at the + * time of the original allocation. The nested allocation may be allowed + * to block but modifying anything %current owns can corrupt the outer + * context's expectations. + * + * %true result from this function indicates that the allocation context + * can sleep and use anything that's associated with %current. + */ +static inline bool gfpflags_normal_context(const gfp_t gfp_flags) +{ + return (gfp_flags & (__GFP_DIRECT_RECLAIM | __GFP_MEMALLOC)) == + __GFP_DIRECT_RECLAIM; +} + #ifdef CONFIG_HIGHMEM #define OPT_ZONE_HIGHMEM ZONE_HIGHMEM #else --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2131,12 +2131,17 @@ struct sk_buff *sk_stream_alloc_skb(stru * sk_page_frag - return an appropriate page_frag * @sk: socket * - * If socket allocation mode allows current thread to sleep, it means its - * safe to use the per task page_frag instead of the per socket one. + * Use the per task page_frag instead of the per socket one for + * optimization when we know that we're in the normal context and owns + * everything that's associated with %current. + * + * gfpflags_allow_blocking() isn't enough here as direct reclaim may nest + * inside other socket operations and end up recursing into sk_page_frag() + * while it's already in use. */ static inline struct page_frag *sk_page_frag(struct sock *sk) { - if (gfpflags_allow_blocking(sk->sk_allocation)) + if (gfpflags_normal_context(sk->sk_allocation)) return ¤t->task_frag;
return &sk->sk_frag;
From: Jiangfeng Xiao xiaojiangfeng@huawei.com
[ Upstream commit e56bd641ca61beb92b135298d5046905f920b734 ]
This is due to error in over budget processing. When dealing with high throughput, the used buffers that exceeds the budget is not cleaned up. In addition, it takes a lot of cycles to clean up the used buffer, and then the buffer where the valid data is located can take effect.
Signed-off-by: Jiangfeng Xiao xiaojiangfeng@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hip04_eth.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -174,6 +174,7 @@ struct hip04_priv { dma_addr_t rx_phys[RX_DESC_NUM]; unsigned int rx_head; unsigned int rx_buf_size; + unsigned int rx_cnt_remaining;
struct device_node *phy_node; struct phy_device *phy; @@ -487,7 +488,6 @@ static int hip04_rx_poll(struct napi_str struct hip04_priv *priv = container_of(napi, struct hip04_priv, napi); struct net_device *ndev = priv->ndev; struct net_device_stats *stats = &ndev->stats; - unsigned int cnt = hip04_recv_cnt(priv); struct rx_desc *desc; struct sk_buff *skb; unsigned char *buf; @@ -500,8 +500,8 @@ static int hip04_rx_poll(struct napi_str
/* clean up tx descriptors */ tx_remaining = hip04_tx_reclaim(ndev, false); - - while (cnt && !last) { + priv->rx_cnt_remaining += hip04_recv_cnt(priv); + while (priv->rx_cnt_remaining && !last) { buf = priv->rx_buf[priv->rx_head]; skb = build_skb(buf, priv->rx_buf_size); if (unlikely(!skb)) { @@ -547,11 +547,13 @@ refill: hip04_set_recv_desc(priv, phys);
priv->rx_head = RX_NEXT(priv->rx_head); - if (rx >= budget) + if (rx >= budget) { + --priv->rx_cnt_remaining; goto done; + }
- if (--cnt == 0) - cnt = hip04_recv_cnt(priv); + if (--priv->rx_cnt_remaining == 0) + priv->rx_cnt_remaining += hip04_recv_cnt(priv); }
if (!(priv->reg_inten & RCV_INT)) { @@ -636,6 +638,7 @@ static int hip04_mac_open(struct net_dev int i;
priv->rx_head = 0; + priv->rx_cnt_remaining = 0; priv->tx_head = 0; priv->tx_tail = 0; hip04_reset_ppe(priv);
From: Eran Ben Elisha eranbe@mellanox.com
[ Upstream commit e19868efea0c103f23b4b7e986fd0a703822111f ]
Prior to this patch, the amount of counters guaranteed per VF in the resource tracker was MLX4_VF_COUNTERS_PER_PORT * MLX4_MAX_PORTS. It was set regardless if the VF was single or dual port. This caused several VFs to have no guaranteed counters although the system could satisfy their request.
The fix is to dynamically guarantee counters, based on each VF specification.
Fixes: 9de92c60beaa ("net/mlx4_core: Adjust counter grant policy in the resource tracker") Signed-off-by: Eran Ben Elisha eranbe@mellanox.com Signed-off-by: Jack Morgenstein jackm@dev.mellanox.co.il Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 42 +++++++++++------- 1 file changed, 26 insertions(+), 16 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c @@ -471,12 +471,31 @@ void mlx4_init_quotas(struct mlx4_dev *d priv->mfunc.master.res_tracker.res_alloc[RES_MPT].quota[pf]; }
-static int get_max_gauranteed_vfs_counter(struct mlx4_dev *dev) +static int +mlx4_calc_res_counter_guaranteed(struct mlx4_dev *dev, + struct resource_allocator *res_alloc, + int vf) { - /* reduce the sink counter */ - return (dev->caps.max_counters - 1 - - (MLX4_PF_COUNTERS_PER_PORT * MLX4_MAX_PORTS)) - / MLX4_MAX_PORTS; + struct mlx4_active_ports actv_ports; + int ports, counters_guaranteed; + + /* For master, only allocate according to the number of phys ports */ + if (vf == mlx4_master_func_num(dev)) + return MLX4_PF_COUNTERS_PER_PORT * dev->caps.num_ports; + + /* calculate real number of ports for the VF */ + actv_ports = mlx4_get_active_ports(dev, vf); + ports = bitmap_weight(actv_ports.ports, dev->caps.num_ports); + counters_guaranteed = ports * MLX4_VF_COUNTERS_PER_PORT; + + /* If we do not have enough counters for this VF, do not + * allocate any for it. '-1' to reduce the sink counter. + */ + if ((res_alloc->res_reserved + counters_guaranteed) > + (dev->caps.max_counters - 1)) + return 0; + + return counters_guaranteed; }
int mlx4_init_resource_tracker(struct mlx4_dev *dev) @@ -484,7 +503,6 @@ int mlx4_init_resource_tracker(struct ml struct mlx4_priv *priv = mlx4_priv(dev); int i, j; int t; - int max_vfs_guarantee_counter = get_max_gauranteed_vfs_counter(dev);
priv->mfunc.master.res_tracker.slave_list = kzalloc(dev->num_slaves * sizeof(struct slave_list), @@ -601,16 +619,8 @@ int mlx4_init_resource_tracker(struct ml break; case RES_COUNTER: res_alloc->quota[t] = dev->caps.max_counters; - if (t == mlx4_master_func_num(dev)) - res_alloc->guaranteed[t] = - MLX4_PF_COUNTERS_PER_PORT * - MLX4_MAX_PORTS; - else if (t <= max_vfs_guarantee_counter) - res_alloc->guaranteed[t] = - MLX4_VF_COUNTERS_PER_PORT * - MLX4_MAX_PORTS; - else - res_alloc->guaranteed[t] = 0; + res_alloc->guaranteed[t] = + mlx4_calc_res_counter_guaranteed(dev, res_alloc, t); break; default: break;
From: zhanglin zhang.lin16@zte.com.cn
[ Upstream commit 5ff223e86f5addbfae26419cbb5d61d98f6fbf7d ]
memset() the structure ethtool_wolinfo that has padded bytes but the padded bytes have not been zeroed out.
Signed-off-by: zhanglin zhang.lin16@zte.com.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/ethtool.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -1450,11 +1450,13 @@ static int ethtool_reset(struct net_devi
static int ethtool_get_wol(struct net_device *dev, char __user *useraddr) { - struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL }; + struct ethtool_wolinfo wol;
if (!dev->ethtool_ops->get_wol) return -EOPNOTSUPP;
+ memset(&wol, 0, sizeof(struct ethtool_wolinfo)); + wol.cmd = ETHTOOL_GWOL; dev->ethtool_ops->get_wol(dev, &wol);
if (copy_to_user(useraddr, &wol, sizeof(wol)))
From: Wei Wang weiwan@google.com
[ Upstream commit d64479a3e3f9924074ca7b50bd72fa5211dca9c1 ]
This test reports EINVAL for getsockopt(SOL_SOCKET, SO_DOMAIN) occasionally due to the uninitialized length parameter. Initialize it to fix this, and also use int for "test_family" to comply with the API standard.
Fixes: d6a61f80b871 ("soreuseport: test mixed v4/v6 sockets") Reported-by: Maciej Żenczykowski maze@google.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: Wei Wang weiwan@google.com Cc: Craig Gallek cgallek@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/reuseport_dualstack.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/net/reuseport_dualstack.c +++ b/tools/testing/selftests/net/reuseport_dualstack.c @@ -129,7 +129,7 @@ static void test(int *rcv_fds, int count { struct epoll_event ev; int epfd, i, test_fd; - uint16_t test_family; + int test_family; socklen_t len;
epfd = epoll_create(1); @@ -146,6 +146,7 @@ static void test(int *rcv_fds, int count send_from_v4(proto);
test_fd = receive_once(epfd, proto); + len = sizeof(test_family); if (getsockopt(test_fd, SOL_SOCKET, SO_DOMAIN, &test_family, &len)) error(1, errno, "failed to read socket domain"); if (test_family != AF_INET)
From: Eric Dumazet edumazet@google.com
[ Upstream commit a793183caa9afae907a0d7ddd2ffd57329369bf5 ]
KCSAN reported a data-race in udp_set_dev_scratch() [1]
The issue here is that we must not write over skb fields if skb is shared. A similar issue has been fixed in commit 89c22d8c3b27 ("net: Fix skb csum races when peeking")
While we are at it, use a helper only dealing with udp_skb_scratch(skb)->csum_unnecessary, as this allows udp_set_dev_scratch() to be called once and thus inlined.
[1] BUG: KCSAN: data-race in udp_set_dev_scratch / udpv6_recvmsg
write to 0xffff888120278317 of 1 bytes by task 10411 on cpu 1: udp_set_dev_scratch+0xea/0x200 net/ipv4/udp.c:1308 __first_packet_length+0x147/0x420 net/ipv4/udp.c:1556 first_packet_length+0x68/0x2a0 net/ipv4/udp.c:1579 udp_poll+0xea/0x110 net/ipv4/udp.c:2720 sock_poll+0xed/0x250 net/socket.c:1256 vfs_poll include/linux/poll.h:90 [inline] do_select+0x7d0/0x1020 fs/select.c:534 core_sys_select+0x381/0x550 fs/select.c:677 do_pselect.constprop.0+0x11d/0x160 fs/select.c:759 __do_sys_pselect6 fs/select.c:784 [inline] __se_sys_pselect6 fs/select.c:769 [inline] __x64_sys_pselect6+0x12e/0x170 fs/select.c:769 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9
read to 0xffff888120278317 of 1 bytes by task 10413 on cpu 0: udp_skb_csum_unnecessary include/net/udp.h:358 [inline] udpv6_recvmsg+0x43e/0xe90 net/ipv6/udp.c:310 inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10413 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Paolo Abeni pabeni@redhat.com Reviewed-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1195,6 +1195,20 @@ static void udp_set_dev_scratch(struct s scratch->_tsize_state |= UDP_SKB_IS_STATELESS; }
+static void udp_skb_csum_unnecessary_set(struct sk_buff *skb) +{ + /* We come here after udp_lib_checksum_complete() returned 0. + * This means that __skb_checksum_complete() might have + * set skb->csum_valid to 1. + * On 64bit platforms, we can set csum_unnecessary + * to true, but only if the skb is not shared. + */ +#if BITS_PER_LONG == 64 + if (!skb_shared(skb)) + udp_skb_scratch(skb)->csum_unnecessary = true; +#endif +} + static int udp_skb_truesize(struct sk_buff *skb) { return udp_skb_scratch(skb)->_tsize_state & ~UDP_SKB_IS_STATELESS; @@ -1430,10 +1444,7 @@ static struct sk_buff *__first_packet_le *total += skb->truesize; kfree_skb(skb); } else { - /* the csum related bits could be changed, refresh - * the scratch area - */ - udp_set_dev_scratch(skb); + udp_skb_csum_unnecessary_set(skb); break; } }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7c422d0ce97552dde4a97e6290de70ec6efb0fc6 ]
__skb_wait_for_more_packets() can be called while other cpus can feed packets to the socket receive queue.
KCSAN reported :
BUG: KCSAN: data-race in __skb_wait_for_more_packets / __udp_enqueue_schedule_skb
write to 0xffff888102e40b58 of 8 bytes by interrupt on cpu 0: __skb_insert include/linux/skbuff.h:1852 [inline] __skb_queue_before include/linux/skbuff.h:1958 [inline] __skb_queue_tail include/linux/skbuff.h:1991 [inline] __udp_enqueue_schedule_skb+0x2d7/0x410 net/ipv4/udp.c:1470 __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline] udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057 udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074 udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233 __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300 udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955
read to 0xffff888102e40b58 of 8 bytes by task 13035 on cpu 1: __skb_wait_for_more_packets+0xfa/0x320 net/core/datagram.c:100 __skb_recv_udp+0x374/0x500 net/ipv4/udp.c:1683 udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 13035 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/datagram.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -97,7 +97,7 @@ int __skb_wait_for_more_packets(struct s if (error) goto out_err;
- if (sk->sk_receive_queue.prev != skb) + if (READ_ONCE(sk->sk_receive_queue.prev) != skb) goto out;
/* Socket shut down? */
From: Maxim Mikityanskiy maximmi@mellanox.com
[ Upstream commit 9df86bdb6746d7fcfc2fda715f7a7c3d0ddb2654 ]
When CQE compression is enabled, compressed CQEs use the following structure: a title is followed by one or many blocks, each containing 8 mini CQEs (except the last, which may contain fewer mini CQEs).
Due to NAPI budget restriction, a complete structure is not always parsed in one NAPI run, and some blocks with mini CQEs may be deferred to the next NAPI poll call - we have the mlx5e_decompress_cqes_cont call in the beginning of mlx5e_poll_rx_cq. However, if the budget is extremely low, some blocks may be left even after that, but the code that follows the mlx5e_decompress_cqes_cont call doesn't check it and assumes that a new CQE begins, which may not be the case. In such cases, random memory corruptions occur.
An extremely low NAPI budget of 8 is used when busy_poll or busy_read is active.
This commit adds a check to make sure that the previous compressed CQE has been completely parsed after mlx5e_decompress_cqes_cont, otherwise it prevents a new CQE from being fetched in the middle of a compressed CQE.
This commit fixes random crashes in __build_skb, __page_pool_put_page and other not-related-directly places, that used to happen when both CQE compression and busy_poll/busy_read were enabled.
Fixes: 7219ab34f184 ("net/mlx5e: CQE compression") Signed-off-by: Maxim Mikityanskiy maximmi@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -1093,8 +1093,11 @@ int mlx5e_poll_rx_cq(struct mlx5e_cq *cq if (unlikely(!MLX5E_TEST_BIT(rq->state, MLX5E_RQ_STATE_ENABLED))) return 0;
- if (cq->decmprs_left) + if (cq->decmprs_left) { work_done += mlx5e_decompress_cqes_cont(rq, cq, 0, budget); + if (cq->decmprs_left || work_done >= budget) + goto out; + }
cqe = mlx5_cqwq_get_cqe(&cq->wq); if (!cqe) {
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit c763ac436b668d7417f0979430ec0312ede4093d ]
Clearing the existing bitmask of mirrored ports essentially prevents us from capturing more than one port at any given time. This is clearly wrong, do not clear the bitmask prior to setting up the new port.
Reported-by: Hubert Feurstein h.feurstein@gmail.com Fixes: ed3af5fd08eb ("net: dsa: b53: Add support for port mirroring") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Vivien Didelot vivien.didelot@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/b53/b53_common.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1431,7 +1431,6 @@ int b53_mirror_add(struct dsa_switch *ds loc = B53_EG_MIR_CTL;
b53_read16(dev, B53_MGMT_PAGE, loc, ®); - reg &= ~MIRROR_MASK; reg |= BIT(port); b53_write16(dev, B53_MGMT_PAGE, loc, reg);
From: Andrew Lunn andrew@lunn.ch
[ Upstream commit 38b4fe320119859c11b1dc06f6b4987a16344fa1 ]
As soon as the netdev is registers, the kernel can start using the interface. If the driver connects the MAC to the PHY after the netdev is registered, there is a race condition where the interface can be opened without having the PHY connected.
Change the order to close this race condition.
Fixes: 92571a1aae40 ("lan78xx: Connect phy early") Reported-by: Daniel Wagner dwagner@suse.de Signed-off-by: Andrew Lunn andrew@lunn.ch Tested-by: Daniel Wagner dwagner@suse.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/lan78xx.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -3642,10 +3642,14 @@ static int lan78xx_probe(struct usb_inte /* driver requires remote-wakeup capability during autosuspend. */ intf->needs_remote_wakeup = 1;
+ ret = lan78xx_phy_init(dev); + if (ret < 0) + goto out4; + ret = register_netdev(netdev); if (ret != 0) { netif_err(dev, probe, netdev, "couldn't register the device\n"); - goto out4; + goto out5; }
usb_set_intfdata(intf, dev); @@ -3658,14 +3662,10 @@ static int lan78xx_probe(struct usb_inte pm_runtime_set_autosuspend_delay(&udev->dev, DEFAULT_AUTOSUSPEND_DELAY);
- ret = lan78xx_phy_init(dev); - if (ret < 0) - goto out5; - return 0;
out5: - unregister_netdev(netdev); + phy_disconnect(netdev->phydev); out4: usb_free_urb(dev->urb_intr); out3:
From: Kazutoshi Noguchi noguchi.kazutosi@gmail.com
[ Upstream commit b3060531979422d5bb18d80226f978910284dc70 ]
This device is sold as 'ThinkPad USB-C Dock Gen 2 (40AS)'. Chipset is RTL8153 and works with r8152. Without this, the generic cdc_ether grabs the device, and the device jam connected networks up when the machine suspends.
Signed-off-by: Kazutoshi Noguchi noguchi.kazutosi@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/cdc_ether.c | 7 +++++++ drivers/net/usb/r8152.c | 1 + 2 files changed, 8 insertions(+)
--- a/drivers/net/usb/cdc_ether.c +++ b/drivers/net/usb/cdc_ether.c @@ -800,6 +800,13 @@ static const struct usb_device_id produc .driver_info = 0, },
+/* ThinkPad USB-C Dock Gen 2 (based on Realtek RTL8153) */ +{ + USB_DEVICE_AND_INTERFACE_INFO(LENOVO_VENDOR_ID, 0xa387, USB_CLASS_COMM, + USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), + .driver_info = 0, +}, + /* NVIDIA Tegra USB 3.0 Ethernet Adapters (based on Realtek RTL8153) */ { USB_DEVICE_AND_INTERFACE_INFO(NVIDIA_VENDOR_ID, 0x09ff, USB_CLASS_COMM, --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -5324,6 +5324,7 @@ static const struct usb_device_id rtl815 {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x7205)}, {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x720c)}, {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x7214)}, + {REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0xa387)}, {REALTEK_USB_DEVICE(VENDOR_ID_LINKSYS, 0x0041)}, {REALTEK_USB_DEVICE(VENDOR_ID_NVIDIA, 0x09ff)}, {REALTEK_USB_DEVICE(VENDOR_ID_TPLINK, 0x0601)},
From: Vivien Didelot vivien.didelot@gmail.com
[ Upstream commit 50c7d2ba9de20f60a2d527ad6928209ef67e4cdd ]
If there are multiple switch trees on the device, only the last one will be listed, because the arguments of list_add_tail are swapped.
Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation") Signed-off-by: Vivien Didelot vivien.didelot@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/dsa2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -62,7 +62,7 @@ static struct dsa_switch_tree *dsa_add_d return NULL; dst->tree = tree; INIT_LIST_HEAD(&dst->list); - list_add_tail(&dsa_switch_trees, &dst->list); + list_add_tail(&dst->list, &dsa_switch_trees); kref_init(&dst->refcount);
return dst;
From: Doug Berger opendmb@gmail.com
[ Upstream commit 25382b991d252aed961cd434176240f9de6bb15f ]
The EPHY integrated into the 40nm Set-Top Box devices can falsely detect energy when connected to a disabled peer interface. When the peer interface is enabled the EPHY will detect and report the link as active, but on occasion may get into a state where it is not able to exchange data with the connected GENET MAC. This issue has not been observed when the link parameters are auto-negotiated; however, it has been observed with a manually configured link.
It has been empirically determined that issuing a soft reset to the EPHY when energy is detected prevents it from getting into this bad state.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger opendmb@gmail.com Acked-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1985,6 +1985,8 @@ static void bcmgenet_link_intr_enable(st */ if (priv->internal_phy) { int0_enable |= UMAC_IRQ_LINK_EVENT; + if (GENET_IS_V1(priv) || GENET_IS_V2(priv) || GENET_IS_V3(priv)) + int0_enable |= UMAC_IRQ_PHY_DET_R; } else if (priv->ext_phy) { int0_enable |= UMAC_IRQ_LINK_EVENT; } else if (priv->phy_interface == PHY_INTERFACE_MODE_MOCA) { @@ -2608,6 +2610,10 @@ static void bcmgenet_irq_task(struct wor bcmgenet_power_up(priv, GENET_POWER_WOL_MAGIC); }
+ if (status & UMAC_IRQ_PHY_DET_R && + priv->dev->phydev->autoneg != AUTONEG_ENABLE) + phy_init_hw(priv->dev->phydev); + /* Link UP/DOWN event */ if (status & UMAC_IRQ_LINK_EVENT) phy_mac_interrupt(priv->phydev, @@ -2713,8 +2719,7 @@ static irqreturn_t bcmgenet_isr0(int irq }
/* all other interested interrupts handled in bottom half */ - status &= (UMAC_IRQ_LINK_EVENT | - UMAC_IRQ_MPD_R); + status &= (UMAC_IRQ_LINK_EVENT | UMAC_IRQ_MPD_R | UMAC_IRQ_PHY_DET_R); if (status) { /* Save irq status for bottom-half processing. */ spin_lock_irqsave(&priv->lock, flags);
From: Eric Dumazet edumazet@google.com
[ Upstream commit d7d16a89350ab263484c0aa2b523dd3a234e4a80 ]
Some paths call skb_queue_empty() without holding the queue lock. We must use a barrier in order to not let the compiler do strange things, and avoid KCSAN splats.
Adding a barrier in skb_queue_empty() might be overkill, I prefer adding a new helper to clearly identify points where the callers might be lockless. This might help us finding real bugs.
The corresponding WRITE_ONCE() should add zero cost for current compilers.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/skbuff.h | 33 ++++++++++++++++++++++++--------- 1 file changed, 24 insertions(+), 9 deletions(-)
--- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1346,6 +1346,19 @@ static inline int skb_queue_empty(const }
/** + * skb_queue_empty_lockless - check if a queue is empty + * @list: queue head + * + * Returns true if the queue is empty, false otherwise. + * This variant can be used in lockless contexts. + */ +static inline bool skb_queue_empty_lockless(const struct sk_buff_head *list) +{ + return READ_ONCE(list->next) == (const struct sk_buff *) list; +} + + +/** * skb_queue_is_last - check if skb is the last entry in the queue * @list: queue head * @skb: buffer @@ -1709,9 +1722,11 @@ static inline void __skb_insert(struct s struct sk_buff *prev, struct sk_buff *next, struct sk_buff_head *list) { - newsk->next = next; - newsk->prev = prev; - next->prev = prev->next = newsk; + /* see skb_queue_empty_lockless() for the opposite READ_ONCE() */ + WRITE_ONCE(newsk->next, next); + WRITE_ONCE(newsk->prev, prev); + WRITE_ONCE(next->prev, newsk); + WRITE_ONCE(prev->next, newsk); list->qlen++; }
@@ -1722,11 +1737,11 @@ static inline void __skb_queue_splice(co struct sk_buff *first = list->next; struct sk_buff *last = list->prev;
- first->prev = prev; - prev->next = first; + WRITE_ONCE(first->prev, prev); + WRITE_ONCE(prev->next, first);
- last->next = next; - next->prev = last; + WRITE_ONCE(last->next, next); + WRITE_ONCE(next->prev, last); }
/** @@ -1867,8 +1882,8 @@ static inline void __skb_unlink(struct s next = skb->next; prev = skb->prev; skb->next = skb->prev = NULL; - next->prev = prev; - prev->next = next; + WRITE_ONCE(next->prev, prev); + WRITE_ONCE(prev->next, next); }
/**
From: Eric Dumazet edumazet@google.com
[ Upstream commit 137a0dbe3426fd7bcfe3f8117b36a87b3590e4eb ]
syzbot reported a data-race [1].
We should use skb_queue_empty_lockless() to document that we are not ensuring a mutual exclusion and silence KCSAN.
[1] BUG: KCSAN: data-race in __skb_recv_udp / __udp_enqueue_schedule_skb
write to 0xffff888122474b50 of 8 bytes by interrupt on cpu 0: __skb_insert include/linux/skbuff.h:1852 [inline] __skb_queue_before include/linux/skbuff.h:1958 [inline] __skb_queue_tail include/linux/skbuff.h:1991 [inline] __udp_enqueue_schedule_skb+0x2c1/0x410 net/ipv4/udp.c:1470 __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline] udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057 udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074 udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233 __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300 udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955
read to 0xffff888122474b50 of 8 bytes by task 8921 on cpu 1: skb_queue_empty include/linux/skbuff.h:1494 [inline] __skb_recv_udp+0x18d/0x500 net/ipv4/udp.c:1653 udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 8921 Comm: syz-executor.4 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1468,7 +1468,7 @@ static int first_packet_length(struct so
spin_lock_bh(&rcvq->lock); skb = __first_packet_length(sk, rcvq, &total); - if (!skb && !skb_queue_empty(sk_queue)) { + if (!skb && !skb_queue_empty_lockless(sk_queue)) { spin_lock(&sk_queue->lock); skb_queue_splice_tail_init(sk_queue, rcvq); spin_unlock(&sk_queue->lock); @@ -1543,7 +1543,7 @@ struct sk_buff *__skb_recv_udp(struct so return skb; }
- if (skb_queue_empty(sk_queue)) { + if (skb_queue_empty_lockless(sk_queue)) { spin_unlock_bh(&queue->lock); goto busy_check; } @@ -1570,7 +1570,7 @@ busy_check: break;
sk_busy_loop(sk, flags & MSG_DONTWAIT); - } while (!skb_queue_empty(sk_queue)); + } while (!skb_queue_empty_lockless(sk_queue));
/* sk_queue is empty, reader_queue may contain peeked packets */ } while (timeo &&
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3ef7cf57c72f32f61e97f8fa401bc39ea1f1a5d4 ]
Many poll() handlers are lockless. Using skb_queue_empty_lockless() instead of skb_queue_empty() is more appropriate.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/isdn/capi/capi.c | 2 +- net/atm/common.c | 2 +- net/bluetooth/af_bluetooth.c | 4 ++-- net/caif/caif_socket.c | 2 +- net/core/datagram.c | 4 ++-- net/ipv4/tcp.c | 2 +- net/ipv4/udp.c | 2 +- net/nfc/llcp_sock.c | 4 ++-- net/phonet/socket.c | 4 ++-- net/sctp/socket.c | 4 ++-- net/tipc/socket.c | 4 ++-- net/unix/af_unix.c | 6 +++--- net/vmw_vsock/af_vsock.c | 2 +- 13 files changed, 21 insertions(+), 21 deletions(-)
--- a/drivers/isdn/capi/capi.c +++ b/drivers/isdn/capi/capi.c @@ -743,7 +743,7 @@ capi_poll(struct file *file, poll_table
poll_wait(file, &(cdev->recvwait), wait); mask = POLLOUT | POLLWRNORM; - if (!skb_queue_empty(&cdev->recvqueue)) + if (!skb_queue_empty_lockless(&cdev->recvqueue)) mask |= POLLIN | POLLRDNORM; return mask; } --- a/net/atm/common.c +++ b/net/atm/common.c @@ -667,7 +667,7 @@ unsigned int vcc_poll(struct file *file, mask |= POLLHUP;
/* readable? */ - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
/* writable? */ --- a/net/bluetooth/af_bluetooth.c +++ b/net/bluetooth/af_bluetooth.c @@ -460,7 +460,7 @@ unsigned int bt_sock_poll(struct file *f if (sk->sk_state == BT_LISTEN) return bt_accept_poll(sk);
- if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue)) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= POLLERR | (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0);
@@ -470,7 +470,7 @@ unsigned int bt_sock_poll(struct file *f if (sk->sk_shutdown == SHUTDOWN_MASK) mask |= POLLHUP;
- if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
if (sk->sk_state == BT_CLOSED) --- a/net/caif/caif_socket.c +++ b/net/caif/caif_socket.c @@ -953,7 +953,7 @@ static unsigned int caif_poll(struct fil mask |= POLLRDHUP;
/* readable? */ - if (!skb_queue_empty(&sk->sk_receive_queue) || + if (!skb_queue_empty_lockless(&sk->sk_receive_queue) || (sk->sk_shutdown & RCV_SHUTDOWN)) mask |= POLLIN | POLLRDNORM;
--- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -844,7 +844,7 @@ unsigned int datagram_poll(struct file * mask = 0;
/* exceptional events? */ - if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue)) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= POLLERR | (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0);
@@ -854,7 +854,7 @@ unsigned int datagram_poll(struct file * mask |= POLLHUP;
/* readable? */ - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
/* Connection-based need to check for termination and startup */ --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -574,7 +574,7 @@ unsigned int tcp_poll(struct file *file, } /* This barrier is coupled with smp_wmb() in tcp_reset() */ smp_rmb(); - if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue)) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= POLLERR;
return mask; --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2550,7 +2550,7 @@ unsigned int udp_poll(struct file *file, unsigned int mask = datagram_poll(file, sock, wait); struct sock *sk = sock->sk;
- if (!skb_queue_empty(&udp_sk(sk)->reader_queue)) + if (!skb_queue_empty_lockless(&udp_sk(sk)->reader_queue)) mask |= POLLIN | POLLRDNORM;
sock_rps_record_flow(sk); --- a/net/nfc/llcp_sock.c +++ b/net/nfc/llcp_sock.c @@ -567,11 +567,11 @@ static unsigned int llcp_sock_poll(struc if (sk->sk_state == LLCP_LISTEN) return llcp_accept_poll(sk);
- if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue)) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= POLLERR | (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0);
- if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
if (sk->sk_state == LLCP_CLOSED) --- a/net/phonet/socket.c +++ b/net/phonet/socket.c @@ -352,9 +352,9 @@ static unsigned int pn_socket_poll(struc
if (sk->sk_state == TCP_CLOSE) return POLLERR; - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM; - if (!skb_queue_empty(&pn->ctrlreq_queue)) + if (!skb_queue_empty_lockless(&pn->ctrlreq_queue)) mask |= POLLPRI; if (!mask && sk->sk_state == TCP_CLOSE_WAIT) return POLLHUP; --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -7371,7 +7371,7 @@ unsigned int sctp_poll(struct file *file mask = 0;
/* Is there any exceptional events? */ - if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue)) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= POLLERR | (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0); if (sk->sk_shutdown & RCV_SHUTDOWN) @@ -7380,7 +7380,7 @@ unsigned int sctp_poll(struct file *file mask |= POLLHUP;
/* Is it readable? Reconsider this code with TCP-style support. */ - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
/* The association is either gone or not ready. */ --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -714,14 +714,14 @@ static unsigned int tipc_poll(struct fil /* fall thru' */ case TIPC_LISTEN: case TIPC_CONNECTING: - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= (POLLIN | POLLRDNORM); break; case TIPC_OPEN: if (!tsk->cong_link_cnt) mask |= POLLOUT; if (tipc_sk_type_connectionless(sk) && - (!skb_queue_empty(&sk->sk_receive_queue))) + (!skb_queue_empty_lockless(&sk->sk_receive_queue))) mask |= (POLLIN | POLLRDNORM); break; case TIPC_DISCONNECTING: --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2665,7 +2665,7 @@ static unsigned int unix_poll(struct fil mask |= POLLRDHUP | POLLIN | POLLRDNORM;
/* readable? */ - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
/* Connection-based need to check for termination and startup */ @@ -2693,7 +2693,7 @@ static unsigned int unix_dgram_poll(stru mask = 0;
/* exceptional events? */ - if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue)) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= POLLERR | (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0);
@@ -2703,7 +2703,7 @@ static unsigned int unix_dgram_poll(stru mask |= POLLHUP;
/* readable? */ - if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM;
/* Connection-based need to check for termination and startup */ --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -880,7 +880,7 @@ static unsigned int vsock_poll(struct fi * the queue and write as long as the socket isn't shutdown for * sending. */ - if (!skb_queue_empty(&sk->sk_receive_queue) || + if (!skb_queue_empty_lockless(&sk->sk_receive_queue) || (sk->sk_shutdown & RCV_SHUTDOWN)) { mask |= POLLIN | POLLRDNORM; }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3f926af3f4d688e2e11e7f8ed04e277a14d4d4a4 ]
Busy polling usually runs without locks. Let's use skb_queue_empty_lockless() instead of skb_queue_empty()
Also uses READ_ONCE() in __skb_try_recv_datagram() to address a similar potential problem.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/datagram.c | 2 +- net/core/sock.c | 2 +- net/ipv4/tcp.c | 2 +- net/sctp/socket.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-)
--- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -281,7 +281,7 @@ struct sk_buff *__skb_try_recv_datagram( break;
sk_busy_loop(sk, flags & MSG_DONTWAIT); - } while (sk->sk_receive_queue.prev != *last); + } while (READ_ONCE(sk->sk_receive_queue.prev) != *last);
error = -EAGAIN;
--- a/net/core/sock.c +++ b/net/core/sock.c @@ -3381,7 +3381,7 @@ bool sk_busy_loop_end(void *p, unsigned { struct sock *sk = p;
- return !skb_queue_empty(&sk->sk_receive_queue) || + return !skb_queue_empty_lockless(&sk->sk_receive_queue) || sk_busy_loop_timeout(sk, start_time); } EXPORT_SYMBOL(sk_busy_loop_end); --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1787,7 +1787,7 @@ int tcp_recvmsg(struct sock *sk, struct if (unlikely(flags & MSG_ERRQUEUE)) return inet_recv_error(sk, msg, len, addr_len);
- if (sk_can_busy_loop(sk) && skb_queue_empty(&sk->sk_receive_queue) && + if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue) && (sk->sk_state == TCP_ESTABLISHED)) sk_busy_loop(sk, nonblock);
--- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -7716,7 +7716,7 @@ struct sk_buff *sctp_skb_recv_datagram(s if (sk_can_busy_loop(sk)) { sk_busy_loop(sk, noblock);
- if (!skb_queue_empty(&sk->sk_receive_queue)) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) continue; }
From: Xin Long lucien.xin@gmail.com
[ Upstream commit eadf52cf1852196a1363044dcda22fa5d7f296f7 ]
This patch is to improve the tun_info options_len by dropping the skb when TUNNEL_VXLAN_OPT is set but options_len is less than vxlan_metadata. This can void a potential out-of-bounds access on ip_tun_info.
Fixes: ee122c79d422 ("vxlan: Flow based tunneling") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/vxlan.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -2169,8 +2169,11 @@ static void vxlan_xmit_one(struct sk_buf vni = tunnel_id_to_key32(info->key.tun_id); ifindex = 0; dst_cache = &info->dst_cache; - if (info->options_len) + if (info->options_len) { + if (info->options_len < sizeof(*md)) + goto drop; md = ip_tunnel_info_opts(info); + } ttl = info->key.ttl; tos = info->key.tos; label = info->key.label;
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 2eb8d6d2910cfe3dc67dc056f26f3dd9c63d47cd ]
The check for !md doens't really work for ip_tunnel_info_opts(info) which only does info + 1. Also to avoid out-of-bounds access on info, it should ensure options_len is not less than erspan_metadata in both erspan_xmit() and ip6erspan_tunnel_xmit().
Fixes: 1a66a836da ("gre: add collect_md mode to ERSPAN tunnel") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_gre.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -592,6 +592,9 @@ static void erspan_fb_xmit(struct sk_buf truncate = true; }
+ if (tun_info->options_len < sizeof(*md)) + goto err_free_rt; + md = ip_tunnel_info_opts(tun_info); if (!md) goto err_free_rt;
From: Eric Dumazet edumazet@google.com
[ Upstream commit a904a0693c189691eeee64f6c6b188bd7dc244e9 ]
Historically linux tried to stick to RFC 791, 1122, 2003 for IPv4 ID field generation.
RFC 6864 made clear that no matter how hard we try, we can not ensure unicity of IP ID within maximum lifetime for all datagrams with a given source address/destination address/protocol tuple.
Linux uses a per socket inet generator (inet_id), initialized at connection startup with a XOR of 'jiffies' and other fields that appear clear on the wire.
Thiemo Nagel pointed that this strategy is a privacy concern as this provides 16 bits of entropy to fingerprint devices.
Let's switch to a random starting point, this is just as good as far as RFC 6864 is concerned and does not leak anything critical.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Thiemo Nagel tnagel@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dccp/ipv4.c | 2 +- net/ipv4/datagram.c | 2 +- net/ipv4/tcp_ipv4.c | 4 ++-- net/sctp/socket.c | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-)
--- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -121,7 +121,7 @@ int dccp_v4_connect(struct sock *sk, str inet->inet_daddr, inet->inet_sport, inet->inet_dport); - inet->inet_id = dp->dccps_iss ^ jiffies; + inet->inet_id = prandom_u32();
err = dccp_connect(sk); rt = NULL; --- a/net/ipv4/datagram.c +++ b/net/ipv4/datagram.c @@ -75,7 +75,7 @@ int __ip4_datagram_connect(struct sock * inet->inet_dport = usin->sin_port; sk->sk_state = TCP_ESTABLISHED; sk_set_txhash(sk); - inet->inet_id = jiffies; + inet->inet_id = prandom_u32();
sk_dst_set(sk, &rt->dst); err = 0; --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -245,7 +245,7 @@ int tcp_v4_connect(struct sock *sk, stru inet->inet_daddr); }
- inet->inet_id = tp->write_seq ^ jiffies; + inet->inet_id = prandom_u32();
if (tcp_fastopen_defer_connect(sk, &err)) return err; @@ -1368,7 +1368,7 @@ struct sock *tcp_v4_syn_recv_sock(const inet_csk(newsk)->icsk_ext_hdr_len = 0; if (inet_opt) inet_csk(newsk)->icsk_ext_hdr_len = inet_opt->opt.optlen; - newinet->inet_id = newtp->write_seq ^ jiffies; + newinet->inet_id = prandom_u32();
if (!dst) { dst = inet_csk_route_child_sock(sk, newsk, req); --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -8136,7 +8136,7 @@ void sctp_copy_sock(struct sock *newsk, newinet->inet_rcv_saddr = inet->inet_rcv_saddr; newinet->inet_dport = htons(asoc->peer.port); newinet->pmtudisc = inet->pmtudisc; - newinet->inet_id = asoc->next_tsn ^ jiffies; + newinet->inet_id = prandom_u32();
newinet->uc_ttl = inet->uc_ttl; newinet->mc_loop = 1;
From: Eric Dumazet edumazet@google.com
commit 55667441c84fa5e0911a0aac44fb059c15ba6da2 upstream.
UDP IPv6 packets auto flowlabels are using a 32bit secret (static u32 hashrnd in net/core/flow_dissector.c) and apply jhash() over fields known by the receivers.
Attackers can easily infer the 32bit secret and use this information to identify a device and/or user, since this 32bit secret is only set at boot time.
Really, using jhash() to generate cookies sent on the wire is a serious security concern.
Trying to change the rol32(hash, 16) in ip6_make_flowlabel() would be a dead end. Trying to periodically change the secret (like in sch_sfq.c) could change paths taken in the network for long lived flows.
Let's switch to siphash, as we did in commit df453700e8d8 ("inet: switch IP ID generator to siphash")
Using a cryptographically strong pseudo random function will solve this privacy issue and more generally remove other weak points in the stack.
Packet schedulers using skb_get_hash_perturb() benefit from this change.
Fixes: b56774163f99 ("ipv6: Enable auto flow labels by default") Fixes: 42240901f7c4 ("ipv6: Implement different admin modes for automatic flow labels") Fixes: 67800f9b1f4e ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel") Fixes: cb1ce2ef387b ("ipv6: Implement automatic flow label generation on transmit") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Jonathan Berger jonathann1@walla.com Reported-by: Amit Klein aksecurity@gmail.com Reported-by: Benny Pinkas benny@pinkas.net Cc: Tom Herbert tom@herbertland.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/skbuff.h | 3 +- include/net/flow_dissector.h | 3 +- include/net/fq.h | 2 - include/net/fq_impl.h | 4 +-- net/core/flow_dissector.c | 48 +++++++++++++++++-------------------------- net/sched/sch_hhf.c | 8 +++---- net/sched/sch_sfb.c | 13 ++++++----- net/sched/sch_sfq.c | 14 +++++++----- 8 files changed, 46 insertions(+), 49 deletions(-)
--- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1228,7 +1228,8 @@ static inline __u32 skb_get_hash_flowi6( return skb->hash; }
-__u32 skb_get_hash_perturb(const struct sk_buff *skb, u32 perturb); +__u32 skb_get_hash_perturb(const struct sk_buff *skb, + const siphash_key_t *perturb);
static inline __u32 skb_get_hash_raw(const struct sk_buff *skb) { --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -4,6 +4,7 @@
#include <linux/types.h> #include <linux/in6.h> +#include <linux/siphash.h> #include <uapi/linux/if_ether.h>
/** @@ -229,7 +230,7 @@ struct flow_dissector { struct flow_keys { struct flow_dissector_key_control control; #define FLOW_KEYS_HASH_START_FIELD basic - struct flow_dissector_key_basic basic; + struct flow_dissector_key_basic basic __aligned(SIPHASH_ALIGNMENT); struct flow_dissector_key_tags tags; struct flow_dissector_key_vlan vlan; struct flow_dissector_key_keyid keyid; --- a/include/net/fq.h +++ b/include/net/fq.h @@ -70,7 +70,7 @@ struct fq { struct list_head backlogs; spinlock_t lock; u32 flows_cnt; - u32 perturbation; + siphash_key_t perturbation; u32 limit; u32 memory_limit; u32 memory_usage; --- a/include/net/fq_impl.h +++ b/include/net/fq_impl.h @@ -105,7 +105,7 @@ static struct fq_flow *fq_flow_classify(
lockdep_assert_held(&fq->lock);
- hash = skb_get_hash_perturb(skb, fq->perturbation); + hash = skb_get_hash_perturb(skb, &fq->perturbation); idx = reciprocal_scale(hash, fq->flows_cnt); flow = &fq->flows[idx];
@@ -255,7 +255,7 @@ static int fq_init(struct fq *fq, int fl INIT_LIST_HEAD(&fq->backlogs); spin_lock_init(&fq->lock); fq->flows_cnt = max_t(u32, flows_cnt, 1); - fq->perturbation = prandom_u32(); + get_random_bytes(&fq->perturbation, sizeof(fq->perturbation)); fq->quantum = 300; fq->limit = 8192; fq->memory_limit = 16 << 20; /* 16 MBytes */ --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -889,45 +889,34 @@ out_bad: } EXPORT_SYMBOL(__skb_flow_dissect);
-static u32 hashrnd __read_mostly; +static siphash_key_t hashrnd __read_mostly; static __always_inline void __flow_hash_secret_init(void) { net_get_random_once(&hashrnd, sizeof(hashrnd)); }
-static __always_inline u32 __flow_hash_words(const u32 *words, u32 length, - u32 keyval) +static const void *flow_keys_hash_start(const struct flow_keys *flow) { - return jhash2(words, length, keyval); -} - -static inline const u32 *flow_keys_hash_start(const struct flow_keys *flow) -{ - const void *p = flow; - - BUILD_BUG_ON(FLOW_KEYS_HASH_OFFSET % sizeof(u32)); - return (const u32 *)(p + FLOW_KEYS_HASH_OFFSET); + BUILD_BUG_ON(FLOW_KEYS_HASH_OFFSET % SIPHASH_ALIGNMENT); + return &flow->FLOW_KEYS_HASH_START_FIELD; }
static inline size_t flow_keys_hash_length(const struct flow_keys *flow) { - size_t diff = FLOW_KEYS_HASH_OFFSET + sizeof(flow->addrs); - BUILD_BUG_ON((sizeof(*flow) - FLOW_KEYS_HASH_OFFSET) % sizeof(u32)); - BUILD_BUG_ON(offsetof(typeof(*flow), addrs) != - sizeof(*flow) - sizeof(flow->addrs)); + size_t len = offsetof(typeof(*flow), addrs) - FLOW_KEYS_HASH_OFFSET;
switch (flow->control.addr_type) { case FLOW_DISSECTOR_KEY_IPV4_ADDRS: - diff -= sizeof(flow->addrs.v4addrs); + len += sizeof(flow->addrs.v4addrs); break; case FLOW_DISSECTOR_KEY_IPV6_ADDRS: - diff -= sizeof(flow->addrs.v6addrs); + len += sizeof(flow->addrs.v6addrs); break; case FLOW_DISSECTOR_KEY_TIPC_ADDRS: - diff -= sizeof(flow->addrs.tipcaddrs); + len += sizeof(flow->addrs.tipcaddrs); break; } - return (sizeof(*flow) - diff) / sizeof(u32); + return len; }
__be32 flow_get_u32_src(const struct flow_keys *flow) @@ -993,14 +982,15 @@ static inline void __flow_hash_consisten } }
-static inline u32 __flow_hash_from_keys(struct flow_keys *keys, u32 keyval) +static inline u32 __flow_hash_from_keys(struct flow_keys *keys, + const siphash_key_t *keyval) { u32 hash;
__flow_hash_consistentify(keys);
- hash = __flow_hash_words(flow_keys_hash_start(keys), - flow_keys_hash_length(keys), keyval); + hash = siphash(flow_keys_hash_start(keys), + flow_keys_hash_length(keys), keyval); if (!hash) hash = 1;
@@ -1010,12 +1000,13 @@ static inline u32 __flow_hash_from_keys( u32 flow_hash_from_keys(struct flow_keys *keys) { __flow_hash_secret_init(); - return __flow_hash_from_keys(keys, hashrnd); + return __flow_hash_from_keys(keys, &hashrnd); } EXPORT_SYMBOL(flow_hash_from_keys);
static inline u32 ___skb_get_hash(const struct sk_buff *skb, - struct flow_keys *keys, u32 keyval) + struct flow_keys *keys, + const siphash_key_t *keyval) { skb_flow_dissect_flow_keys(skb, keys, FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL); @@ -1063,7 +1054,7 @@ u32 __skb_get_hash_symmetric(const struc NULL, 0, 0, 0, FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
- return __flow_hash_from_keys(&keys, hashrnd); + return __flow_hash_from_keys(&keys, &hashrnd); } EXPORT_SYMBOL_GPL(__skb_get_hash_symmetric);
@@ -1083,13 +1074,14 @@ void __skb_get_hash(struct sk_buff *skb)
__flow_hash_secret_init();
- hash = ___skb_get_hash(skb, &keys, hashrnd); + hash = ___skb_get_hash(skb, &keys, &hashrnd);
__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys)); } EXPORT_SYMBOL(__skb_get_hash);
-__u32 skb_get_hash_perturb(const struct sk_buff *skb, u32 perturb) +__u32 skb_get_hash_perturb(const struct sk_buff *skb, + const siphash_key_t *perturb) { struct flow_keys keys;
--- a/net/sched/sch_hhf.c +++ b/net/sched/sch_hhf.c @@ -4,11 +4,11 @@ * Copyright (C) 2013 Nandita Dukkipati nanditad@google.com */
-#include <linux/jhash.h> #include <linux/jiffies.h> #include <linux/module.h> #include <linux/skbuff.h> #include <linux/vmalloc.h> +#include <linux/siphash.h> #include <net/pkt_sched.h> #include <net/sock.h>
@@ -125,7 +125,7 @@ struct wdrr_bucket {
struct hhf_sched_data { struct wdrr_bucket buckets[WDRR_BUCKET_CNT]; - u32 perturbation; /* hash perturbation */ + siphash_key_t perturbation; /* hash perturbation */ u32 quantum; /* psched_mtu(qdisc_dev(sch)); */ u32 drop_overlimit; /* number of times max qdisc packet * limit was hit @@ -263,7 +263,7 @@ static enum wdrr_bucket_idx hhf_classify }
/* Get hashed flow-id of the skb. */ - hash = skb_get_hash_perturb(skb, q->perturbation); + hash = skb_get_hash_perturb(skb, &q->perturbation);
/* Check if this packet belongs to an already established HH flow. */ flow_pos = hash & HHF_BIT_MASK; @@ -578,7 +578,7 @@ static int hhf_init(struct Qdisc *sch, s
sch->limit = 1000; q->quantum = psched_mtu(qdisc_dev(sch)); - q->perturbation = prandom_u32(); + get_random_bytes(&q->perturbation, sizeof(q->perturbation)); INIT_LIST_HEAD(&q->new_buckets); INIT_LIST_HEAD(&q->old_buckets);
--- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -22,7 +22,7 @@ #include <linux/errno.h> #include <linux/skbuff.h> #include <linux/random.h> -#include <linux/jhash.h> +#include <linux/siphash.h> #include <net/ip.h> #include <net/pkt_sched.h> #include <net/pkt_cls.h> @@ -49,7 +49,7 @@ struct sfb_bucket { * (Section 4.4 of SFB reference : moving hash functions) */ struct sfb_bins { - u32 perturbation; /* jhash perturbation */ + siphash_key_t perturbation; /* siphash key */ struct sfb_bucket bins[SFB_LEVELS][SFB_NUMBUCKETS]; };
@@ -221,7 +221,8 @@ static u32 sfb_compute_qlen(u32 *prob_r,
static void sfb_init_perturbation(u32 slot, struct sfb_sched_data *q) { - q->bins[slot].perturbation = prandom_u32(); + get_random_bytes(&q->bins[slot].perturbation, + sizeof(q->bins[slot].perturbation)); }
static void sfb_swap_slot(struct sfb_sched_data *q) @@ -317,9 +318,9 @@ static int sfb_enqueue(struct sk_buff *s /* If using external classifiers, get result and record it. */ if (!sfb_classify(skb, fl, &ret, &salt)) goto other_drop; - sfbhash = jhash_1word(salt, q->bins[slot].perturbation); + sfbhash = siphash_1u32(salt, &q->bins[slot].perturbation); } else { - sfbhash = skb_get_hash_perturb(skb, q->bins[slot].perturbation); + sfbhash = skb_get_hash_perturb(skb, &q->bins[slot].perturbation); }
@@ -355,7 +356,7 @@ static int sfb_enqueue(struct sk_buff *s /* Inelastic flow */ if (q->double_buffering) { sfbhash = skb_get_hash_perturb(skb, - q->bins[slot].perturbation); + &q->bins[slot].perturbation); if (!sfbhash) sfbhash = 1; sfb_skb_cb(skb)->hashes[slot] = sfbhash; --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -18,7 +18,7 @@ #include <linux/errno.h> #include <linux/init.h> #include <linux/skbuff.h> -#include <linux/jhash.h> +#include <linux/siphash.h> #include <linux/slab.h> #include <linux/vmalloc.h> #include <net/netlink.h> @@ -121,7 +121,7 @@ struct sfq_sched_data { u8 headdrop; u8 maxdepth; /* limit of packets per flow */
- u32 perturbation; + siphash_key_t perturbation; u8 cur_depth; /* depth of longest slot */ u8 flags; unsigned short scaled_quantum; /* SFQ_ALLOT_SIZE(quantum) */ @@ -160,7 +160,7 @@ static inline struct sfq_head *sfq_dep_h static unsigned int sfq_hash(const struct sfq_sched_data *q, const struct sk_buff *skb) { - return skb_get_hash_perturb(skb, q->perturbation) & (q->divisor - 1); + return skb_get_hash_perturb(skb, &q->perturbation) & (q->divisor - 1); }
static unsigned int sfq_classify(struct sk_buff *skb, struct Qdisc *sch, @@ -609,9 +609,11 @@ static void sfq_perturbation(unsigned lo struct Qdisc *sch = (struct Qdisc *)arg; struct sfq_sched_data *q = qdisc_priv(sch); spinlock_t *root_lock = qdisc_lock(qdisc_root_sleeping(sch)); + siphash_key_t nkey;
+ get_random_bytes(&nkey, sizeof(nkey)); spin_lock(root_lock); - q->perturbation = prandom_u32(); + q->perturbation = nkey; if (!q->filter_list && q->tail) sfq_rehash(sch); spin_unlock(root_lock); @@ -690,7 +692,7 @@ static int sfq_change(struct Qdisc *sch, del_timer(&q->perturb_timer); if (q->perturb_period) { mod_timer(&q->perturb_timer, jiffies + q->perturb_period); - q->perturbation = prandom_u32(); + get_random_bytes(&q->perturbation, sizeof(q->perturbation)); } sch_tree_unlock(sch); kfree(p); @@ -746,7 +748,7 @@ static int sfq_init(struct Qdisc *sch, s q->quantum = psched_mtu(qdisc_dev(sch)); q->scaled_quantum = SFQ_ALLOT_SIZE(q->quantum); q->perturb_period = 0; - q->perturbation = prandom_u32(); + get_random_bytes(&q->perturbation, sizeof(q->perturbation));
if (opt) { int err = sfq_change(sch, opt);
From: Jeffrey Hugo jeffrey.l.hugo@gmail.com
commit 7667819385457b4aeb5fac94f67f52ab52cc10d5 upstream.
bam_dma_terminate_all() will leak resources if any of the transactions are committed to the hardware (present in the desc fifo), and not complete. Since bam_dma_terminate_all() does not cause the hardware to be updated, the hardware will still operate on any previously committed transactions. This can cause memory corruption if the memory for the transaction has been reassigned, and will cause a sync issue between the BAM and its client(s).
Fix this by properly updating the hardware in bam_dma_terminate_all().
Fixes: e7c0fe2a5c84 ("dmaengine: add Qualcomm BAM dma driver") Signed-off-by: Jeffrey Hugo jeffrey.l.hugo@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191017152606.34120-1-jeffrey.l.hugo@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/dma/qcom/bam_dma.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/drivers/dma/qcom/bam_dma.c +++ b/drivers/dma/qcom/bam_dma.c @@ -690,7 +690,21 @@ static int bam_dma_terminate_all(struct
/* remove all transactions, including active transaction */ spin_lock_irqsave(&bchan->vc.lock, flag); + /* + * If we have transactions queued, then some might be committed to the + * hardware in the desc fifo. The only way to reset the desc fifo is + * to do a hardware reset (either by pipe or the entire block). + * bam_chan_init_hw() will trigger a pipe reset, and also reinit the + * pipe. If the pipe is left disabled (default state after pipe reset) + * and is accessed by a connected hardware engine, a fatal error in + * the BAM will occur. There is a small window where this could happen + * with bam_chan_init_hw(), but it is assumed that the caller has + * stopped activity on any attached hardware engine. Make sure to do + * this first so that the BAM hardware doesn't cause memory corruption + * by accessing freed resources. + */ if (bchan->curr_txd) { + bam_chan_init_hw(bchan, bchan->curr_txd->dir); list_add(&bchan->curr_txd->vd.node, &bchan->vc.desc_issued); bchan->curr_txd = NULL; }
From: Peter Zijlstra peterz@infradead.org
commit 4c4e3731564c8945ac5ac90fc2a1e1f21cb79c92 upstream.
Notable cmpxchg() does not provide ordering when it fails, however wake_q_add() requires ordering in this specific case too. Without this it would be possible for the concurrent wakeup to not observe our prior state.
Andrea Parri provided:
C wake_up_q-wake_q_add
{ int next = 0; int y = 0; }
P0(int *next, int *y) { int r0;
/* in wake_up_q() */
WRITE_ONCE(*next, 1); /* node->next = NULL */ smp_mb(); /* implied by wake_up_process() */ r0 = READ_ONCE(*y); }
P1(int *next, int *y) { int r1;
/* in wake_q_add() */
WRITE_ONCE(*y, 1); /* wake_cond = true */ smp_mb__before_atomic(); r1 = cmpxchg_relaxed(next, 1, 2); }
exists (0:r0=0 /\ 1:r1=0)
This "exists" clause cannot be satisfied according to the LKMM:
Test wake_up_q-wake_q_add Allowed States 3 0:r0=0; 1:r1=1; 0:r0=1; 1:r1=0; 0:r0=1; 1:r1=1; No Witnesses Positive: 0 Negative: 3 Condition exists (0:r0=0 /\ 1:r1=0) Observation wake_up_q-wake_q_add Never 0 3
Reported-by: Yongji Xie elohimes@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Davidlohr Bueso dave@stgolabs.net Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Waiman Long longman@redhat.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Jisheng Zhang Jisheng.Zhang@synaptics.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/sched/core.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -432,10 +432,11 @@ void wake_q_add(struct wake_q_head *head * its already queued (either by us or someone else) and will get the * wakeup due to that. * - * This cmpxchg() implies a full barrier, which pairs with the write - * barrier implied by the wakeup in wake_up_q(). + * In order to ensure that a pending wakeup will observe our pending + * state, even in the failed case, an explicit smp_mb() must be used. */ - if (cmpxchg(&node->next, NULL, WAKE_Q_TAIL)) + smp_mb__before_atomic(); + if (cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)) return;
get_task_struct(task);
From: Masahiro Yamada yamada.masahiro@socionext.com
[ Upstream commit a73619a845d5625079cc1b3b820f44c899618388 ]
The __FILE__ macro is used everywhere in the kernel to locate the file printing the log message, such as WARN_ON(), etc. If the kernel is built out of tree, this can be a long absolute path, like this:
WARNING: CPU: 1 PID: 1 at /path/to/build/directory/arch/arm64/kernel/foo.c:...
This is because Kbuild runs in the objtree instead of the srctree, then __FILE__ is expanded to a file path prefixed with $(srctree)/.
Commit 9da0763bdd82 ("kbuild: Use relative path when building in a subdir of the source tree") improved this to some extent; $(srctree) becomes ".." if the objtree is a child of the srctree.
For other cases of out-of-tree build, __FILE__ is still the absolute path. It also means the kernel image depends on where it was built.
A brand-new option from GCC, -fmacro-prefix-map, solves this problem. If your compiler supports it, __FILE__ is the relative path from the srctree regardless of O= option. This provides more readable log and more reproducible builds.
Please note __FILE__ is always an absolute path for external modules.
Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/Makefile b/Makefile index 1d7f47334ca2b..61660387eb34b 100644 --- a/Makefile +++ b/Makefile @@ -840,6 +840,9 @@ KBUILD_CFLAGS += $(call cc-option,-Werror=incompatible-pointer-types) # Require designated initializers for all marked structures KBUILD_CFLAGS += $(call cc-option,-Werror=designated-init)
+# change __FILE__ to the relative path from the srctree +KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=) + # use the deterministic mode of AR if available KBUILD_ARFLAGS := $(call ar-option,D)
From: Seth Forshee seth.forshee@canonical.com
[ Upstream commit 29be86d7f9cb18df4123f309ac7857570513e8bc ]
The gcc -fcf-protection=branch option is not compatible with -mindirect-branch=thunk-extern. The latter is used when CONFIG_RETPOLINE is selected, and this will fail to build with a gcc which has -fcf-protection=branch enabled by default. Adding -fcf-protection=none when building with retpoline enabled prevents such build failures.
Signed-off-by: Seth Forshee seth.forshee@canonical.com Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/Makefile b/Makefile index 61660387eb34b..52aaa6150099e 100644 --- a/Makefile +++ b/Makefile @@ -843,6 +843,12 @@ KBUILD_CFLAGS += $(call cc-option,-Werror=designated-init) # change __FILE__ to the relative path from the srctree KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
+# ensure -fcf-protection is disabled when using retpoline as it is +# incompatible with -mindirect-branch=thunk-extern +ifdef CONFIG_RETPOLINE +KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) +endif + # use the deterministic mode of AR if available KBUILD_ARFLAGS := $(call ar-option,D)
From: Jan Kiszka jan.kiszka@siemens.com
commit ad0d315b4d4e7138f43acf03308192ec00e9614d upstream.
The SIMATIC IPC227E uses the PMC clock for on-board components and gets stuck during boot if the clock is disabled. Therefore, add this device to the critical systems list.
Fixes: 648e921888ad ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") Signed-off-by: Jan Kiszka jan.kiszka@siemens.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/platform/x86/pmc_atom.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/platform/x86/pmc_atom.c +++ b/drivers/platform/x86/pmc_atom.c @@ -475,6 +475,13 @@ static const struct dmi_system_id critcl DMI_MATCH(DMI_BOARD_NAME, "CB6363"), }, }, + { + .ident = "SIMATIC IPC227E", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "SIEMENS AG"), + DMI_MATCH(DMI_PRODUCT_VERSION, "6ES7647-8B"), + }, + }, { /*sentinel*/ } };
From: Fabrice Gasnier fabrice.gasnier@st.com
commit 31922f62bb527d749b99dbc776e514bcba29b7fe upstream.
Move STM32 ADC registers definitions to common header. This is precursor patch to: - iio: adc: stm32-adc: fix a race when using several adcs with dma and irq
It keeps registers definitions as a whole block, to ease readability and allow simple access path to EOC bits (readl) in stm32-adc-core driver.
Fixes: 2763ea0585c9 ("iio: adc: stm32: add optional dma support")
Signed-off-by: Fabrice Gasnier fabrice.gasnier@st.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/adc/stm32-adc-core.c | 27 ------- drivers/iio/adc/stm32-adc-core.h | 134 +++++++++++++++++++++++++++++++++++++++ drivers/iio/adc/stm32-adc.c | 107 ------------------------------- 3 files changed, 134 insertions(+), 134 deletions(-)
--- a/drivers/iio/adc/stm32-adc-core.c +++ b/drivers/iio/adc/stm32-adc-core.c @@ -33,36 +33,9 @@
#include "stm32-adc-core.h"
-/* STM32F4 - common registers for all ADC instances: 1, 2 & 3 */ -#define STM32F4_ADC_CSR (STM32_ADCX_COMN_OFFSET + 0x00) -#define STM32F4_ADC_CCR (STM32_ADCX_COMN_OFFSET + 0x04) - -/* STM32F4_ADC_CSR - bit fields */ -#define STM32F4_EOC3 BIT(17) -#define STM32F4_EOC2 BIT(9) -#define STM32F4_EOC1 BIT(1) - -/* STM32F4_ADC_CCR - bit fields */ -#define STM32F4_ADC_ADCPRE_SHIFT 16 -#define STM32F4_ADC_ADCPRE_MASK GENMASK(17, 16) - /* STM32 F4 maximum analog clock rate (from datasheet) */ #define STM32F4_ADC_MAX_CLK_RATE 36000000
-/* STM32H7 - common registers for all ADC instances */ -#define STM32H7_ADC_CSR (STM32_ADCX_COMN_OFFSET + 0x00) -#define STM32H7_ADC_CCR (STM32_ADCX_COMN_OFFSET + 0x08) - -/* STM32H7_ADC_CSR - bit fields */ -#define STM32H7_EOC_SLV BIT(18) -#define STM32H7_EOC_MST BIT(2) - -/* STM32H7_ADC_CCR - bit fields */ -#define STM32H7_PRESC_SHIFT 18 -#define STM32H7_PRESC_MASK GENMASK(21, 18) -#define STM32H7_CKMODE_SHIFT 16 -#define STM32H7_CKMODE_MASK GENMASK(17, 16) - /* STM32 H7 maximum analog clock rate (from datasheet) */ #define STM32H7_ADC_MAX_CLK_RATE 36000000
--- a/drivers/iio/adc/stm32-adc-core.h +++ b/drivers/iio/adc/stm32-adc-core.h @@ -39,6 +39,140 @@ #define STM32_ADC_MAX_ADCS 3 #define STM32_ADCX_COMN_OFFSET 0x300
+/* STM32F4 - Registers for each ADC instance */ +#define STM32F4_ADC_SR 0x00 +#define STM32F4_ADC_CR1 0x04 +#define STM32F4_ADC_CR2 0x08 +#define STM32F4_ADC_SMPR1 0x0C +#define STM32F4_ADC_SMPR2 0x10 +#define STM32F4_ADC_HTR 0x24 +#define STM32F4_ADC_LTR 0x28 +#define STM32F4_ADC_SQR1 0x2C +#define STM32F4_ADC_SQR2 0x30 +#define STM32F4_ADC_SQR3 0x34 +#define STM32F4_ADC_JSQR 0x38 +#define STM32F4_ADC_JDR1 0x3C +#define STM32F4_ADC_JDR2 0x40 +#define STM32F4_ADC_JDR3 0x44 +#define STM32F4_ADC_JDR4 0x48 +#define STM32F4_ADC_DR 0x4C + +/* STM32F4 - common registers for all ADC instances: 1, 2 & 3 */ +#define STM32F4_ADC_CSR (STM32_ADCX_COMN_OFFSET + 0x00) +#define STM32F4_ADC_CCR (STM32_ADCX_COMN_OFFSET + 0x04) + +/* STM32F4_ADC_SR - bit fields */ +#define STM32F4_STRT BIT(4) +#define STM32F4_EOC BIT(1) + +/* STM32F4_ADC_CR1 - bit fields */ +#define STM32F4_RES_SHIFT 24 +#define STM32F4_RES_MASK GENMASK(25, 24) +#define STM32F4_SCAN BIT(8) +#define STM32F4_EOCIE BIT(5) + +/* STM32F4_ADC_CR2 - bit fields */ +#define STM32F4_SWSTART BIT(30) +#define STM32F4_EXTEN_SHIFT 28 +#define STM32F4_EXTEN_MASK GENMASK(29, 28) +#define STM32F4_EXTSEL_SHIFT 24 +#define STM32F4_EXTSEL_MASK GENMASK(27, 24) +#define STM32F4_EOCS BIT(10) +#define STM32F4_DDS BIT(9) +#define STM32F4_DMA BIT(8) +#define STM32F4_ADON BIT(0) + +/* STM32F4_ADC_CSR - bit fields */ +#define STM32F4_EOC3 BIT(17) +#define STM32F4_EOC2 BIT(9) +#define STM32F4_EOC1 BIT(1) + +/* STM32F4_ADC_CCR - bit fields */ +#define STM32F4_ADC_ADCPRE_SHIFT 16 +#define STM32F4_ADC_ADCPRE_MASK GENMASK(17, 16) + +/* STM32H7 - Registers for each ADC instance */ +#define STM32H7_ADC_ISR 0x00 +#define STM32H7_ADC_IER 0x04 +#define STM32H7_ADC_CR 0x08 +#define STM32H7_ADC_CFGR 0x0C +#define STM32H7_ADC_SMPR1 0x14 +#define STM32H7_ADC_SMPR2 0x18 +#define STM32H7_ADC_PCSEL 0x1C +#define STM32H7_ADC_SQR1 0x30 +#define STM32H7_ADC_SQR2 0x34 +#define STM32H7_ADC_SQR3 0x38 +#define STM32H7_ADC_SQR4 0x3C +#define STM32H7_ADC_DR 0x40 +#define STM32H7_ADC_CALFACT 0xC4 +#define STM32H7_ADC_CALFACT2 0xC8 + +/* STM32H7 - common registers for all ADC instances */ +#define STM32H7_ADC_CSR (STM32_ADCX_COMN_OFFSET + 0x00) +#define STM32H7_ADC_CCR (STM32_ADCX_COMN_OFFSET + 0x08) + +/* STM32H7_ADC_ISR - bit fields */ +#define STM32H7_EOC BIT(2) +#define STM32H7_ADRDY BIT(0) + +/* STM32H7_ADC_IER - bit fields */ +#define STM32H7_EOCIE STM32H7_EOC + +/* STM32H7_ADC_CR - bit fields */ +#define STM32H7_ADCAL BIT(31) +#define STM32H7_ADCALDIF BIT(30) +#define STM32H7_DEEPPWD BIT(29) +#define STM32H7_ADVREGEN BIT(28) +#define STM32H7_LINCALRDYW6 BIT(27) +#define STM32H7_LINCALRDYW5 BIT(26) +#define STM32H7_LINCALRDYW4 BIT(25) +#define STM32H7_LINCALRDYW3 BIT(24) +#define STM32H7_LINCALRDYW2 BIT(23) +#define STM32H7_LINCALRDYW1 BIT(22) +#define STM32H7_ADCALLIN BIT(16) +#define STM32H7_BOOST BIT(8) +#define STM32H7_ADSTP BIT(4) +#define STM32H7_ADSTART BIT(2) +#define STM32H7_ADDIS BIT(1) +#define STM32H7_ADEN BIT(0) + +/* STM32H7_ADC_CFGR bit fields */ +#define STM32H7_EXTEN_SHIFT 10 +#define STM32H7_EXTEN_MASK GENMASK(11, 10) +#define STM32H7_EXTSEL_SHIFT 5 +#define STM32H7_EXTSEL_MASK GENMASK(9, 5) +#define STM32H7_RES_SHIFT 2 +#define STM32H7_RES_MASK GENMASK(4, 2) +#define STM32H7_DMNGT_SHIFT 0 +#define STM32H7_DMNGT_MASK GENMASK(1, 0) + +enum stm32h7_adc_dmngt { + STM32H7_DMNGT_DR_ONLY, /* Regular data in DR only */ + STM32H7_DMNGT_DMA_ONESHOT, /* DMA one shot mode */ + STM32H7_DMNGT_DFSDM, /* DFSDM mode */ + STM32H7_DMNGT_DMA_CIRC, /* DMA circular mode */ +}; + +/* STM32H7_ADC_CALFACT - bit fields */ +#define STM32H7_CALFACT_D_SHIFT 16 +#define STM32H7_CALFACT_D_MASK GENMASK(26, 16) +#define STM32H7_CALFACT_S_SHIFT 0 +#define STM32H7_CALFACT_S_MASK GENMASK(10, 0) + +/* STM32H7_ADC_CALFACT2 - bit fields */ +#define STM32H7_LINCALFACT_SHIFT 0 +#define STM32H7_LINCALFACT_MASK GENMASK(29, 0) + +/* STM32H7_ADC_CSR - bit fields */ +#define STM32H7_EOC_SLV BIT(18) +#define STM32H7_EOC_MST BIT(2) + +/* STM32H7_ADC_CCR - bit fields */ +#define STM32H7_PRESC_SHIFT 18 +#define STM32H7_PRESC_MASK GENMASK(21, 18) +#define STM32H7_CKMODE_SHIFT 16 +#define STM32H7_CKMODE_MASK GENMASK(17, 16) + /** * struct stm32_adc_common - stm32 ADC driver common data (for all instances) * @base: control registers base cpu addr --- a/drivers/iio/adc/stm32-adc.c +++ b/drivers/iio/adc/stm32-adc.c @@ -40,113 +40,6 @@
#include "stm32-adc-core.h"
-/* STM32F4 - Registers for each ADC instance */ -#define STM32F4_ADC_SR 0x00 -#define STM32F4_ADC_CR1 0x04 -#define STM32F4_ADC_CR2 0x08 -#define STM32F4_ADC_SMPR1 0x0C -#define STM32F4_ADC_SMPR2 0x10 -#define STM32F4_ADC_HTR 0x24 -#define STM32F4_ADC_LTR 0x28 -#define STM32F4_ADC_SQR1 0x2C -#define STM32F4_ADC_SQR2 0x30 -#define STM32F4_ADC_SQR3 0x34 -#define STM32F4_ADC_JSQR 0x38 -#define STM32F4_ADC_JDR1 0x3C -#define STM32F4_ADC_JDR2 0x40 -#define STM32F4_ADC_JDR3 0x44 -#define STM32F4_ADC_JDR4 0x48 -#define STM32F4_ADC_DR 0x4C - -/* STM32F4_ADC_SR - bit fields */ -#define STM32F4_STRT BIT(4) -#define STM32F4_EOC BIT(1) - -/* STM32F4_ADC_CR1 - bit fields */ -#define STM32F4_RES_SHIFT 24 -#define STM32F4_RES_MASK GENMASK(25, 24) -#define STM32F4_SCAN BIT(8) -#define STM32F4_EOCIE BIT(5) - -/* STM32F4_ADC_CR2 - bit fields */ -#define STM32F4_SWSTART BIT(30) -#define STM32F4_EXTEN_SHIFT 28 -#define STM32F4_EXTEN_MASK GENMASK(29, 28) -#define STM32F4_EXTSEL_SHIFT 24 -#define STM32F4_EXTSEL_MASK GENMASK(27, 24) -#define STM32F4_EOCS BIT(10) -#define STM32F4_DDS BIT(9) -#define STM32F4_DMA BIT(8) -#define STM32F4_ADON BIT(0) - -/* STM32H7 - Registers for each ADC instance */ -#define STM32H7_ADC_ISR 0x00 -#define STM32H7_ADC_IER 0x04 -#define STM32H7_ADC_CR 0x08 -#define STM32H7_ADC_CFGR 0x0C -#define STM32H7_ADC_SMPR1 0x14 -#define STM32H7_ADC_SMPR2 0x18 -#define STM32H7_ADC_PCSEL 0x1C -#define STM32H7_ADC_SQR1 0x30 -#define STM32H7_ADC_SQR2 0x34 -#define STM32H7_ADC_SQR3 0x38 -#define STM32H7_ADC_SQR4 0x3C -#define STM32H7_ADC_DR 0x40 -#define STM32H7_ADC_CALFACT 0xC4 -#define STM32H7_ADC_CALFACT2 0xC8 - -/* STM32H7_ADC_ISR - bit fields */ -#define STM32H7_EOC BIT(2) -#define STM32H7_ADRDY BIT(0) - -/* STM32H7_ADC_IER - bit fields */ -#define STM32H7_EOCIE STM32H7_EOC - -/* STM32H7_ADC_CR - bit fields */ -#define STM32H7_ADCAL BIT(31) -#define STM32H7_ADCALDIF BIT(30) -#define STM32H7_DEEPPWD BIT(29) -#define STM32H7_ADVREGEN BIT(28) -#define STM32H7_LINCALRDYW6 BIT(27) -#define STM32H7_LINCALRDYW5 BIT(26) -#define STM32H7_LINCALRDYW4 BIT(25) -#define STM32H7_LINCALRDYW3 BIT(24) -#define STM32H7_LINCALRDYW2 BIT(23) -#define STM32H7_LINCALRDYW1 BIT(22) -#define STM32H7_ADCALLIN BIT(16) -#define STM32H7_BOOST BIT(8) -#define STM32H7_ADSTP BIT(4) -#define STM32H7_ADSTART BIT(2) -#define STM32H7_ADDIS BIT(1) -#define STM32H7_ADEN BIT(0) - -/* STM32H7_ADC_CFGR bit fields */ -#define STM32H7_EXTEN_SHIFT 10 -#define STM32H7_EXTEN_MASK GENMASK(11, 10) -#define STM32H7_EXTSEL_SHIFT 5 -#define STM32H7_EXTSEL_MASK GENMASK(9, 5) -#define STM32H7_RES_SHIFT 2 -#define STM32H7_RES_MASK GENMASK(4, 2) -#define STM32H7_DMNGT_SHIFT 0 -#define STM32H7_DMNGT_MASK GENMASK(1, 0) - -enum stm32h7_adc_dmngt { - STM32H7_DMNGT_DR_ONLY, /* Regular data in DR only */ - STM32H7_DMNGT_DMA_ONESHOT, /* DMA one shot mode */ - STM32H7_DMNGT_DFSDM, /* DFSDM mode */ - STM32H7_DMNGT_DMA_CIRC, /* DMA circular mode */ -}; - -/* STM32H7_ADC_CALFACT - bit fields */ -#define STM32H7_CALFACT_D_SHIFT 16 -#define STM32H7_CALFACT_D_MASK GENMASK(26, 16) -#define STM32H7_CALFACT_S_SHIFT 0 -#define STM32H7_CALFACT_S_MASK GENMASK(10, 0) - -/* STM32H7_ADC_CALFACT2 - bit fields */ -#define STM32H7_LINCALFACT_SHIFT 0 -#define STM32H7_LINCALFACT_MASK GENMASK(29, 0) - /* Number of linear calibration shadow registers / LINCALRDYW control bits */ #define STM32H7_LINCALFACT_NUM 6
From: Fabrice Gasnier fabrice.gasnier@st.com
commit dcb10920179ab74caf88a6f2afadecfc2743b910 upstream.
End of conversion may be handled by using IRQ or DMA. There may be a race when two conversions complete at the same time on several ADCs. EOC can be read as 'set' for several ADCs, with: - an ADC configured to use IRQs. EOCIE bit is set. The handler is normally called in this case. - an ADC configured to use DMA. EOCIE bit isn't set. EOC triggers the DMA request instead. It's then automatically cleared by DMA read. But the handler gets called due to status bit is temporarily set (IRQ triggered by the other ADC). So both EOC status bit in CSR and EOCIE control bit must be checked before invoking the interrupt handler (e.g. call ISR only for IRQ-enabled ADCs).
Fixes: 2763ea0585c9 ("iio: adc: stm32: add optional dma support")
Signed-off-by: Fabrice Gasnier fabrice.gasnier@st.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/adc/stm32-adc-core.c | 43 ++++++++++++++++++++++++++++++++++++--- drivers/iio/adc/stm32-adc-core.h | 1 2 files changed, 41 insertions(+), 3 deletions(-)
--- a/drivers/iio/adc/stm32-adc-core.c +++ b/drivers/iio/adc/stm32-adc-core.c @@ -45,12 +45,16 @@ * @eoc1: adc1 end of conversion flag in @csr * @eoc2: adc2 end of conversion flag in @csr * @eoc3: adc3 end of conversion flag in @csr + * @ier: interrupt enable register offset for each adc + * @eocie_msk: end of conversion interrupt enable mask in @ier */ struct stm32_adc_common_regs { u32 csr; u32 eoc1_msk; u32 eoc2_msk; u32 eoc3_msk; + u32 ier; + u32 eocie_msk; };
struct stm32_adc_priv; @@ -244,6 +248,8 @@ static const struct stm32_adc_common_reg .eoc1_msk = STM32F4_EOC1, .eoc2_msk = STM32F4_EOC2, .eoc3_msk = STM32F4_EOC3, + .ier = STM32F4_ADC_CR1, + .eocie_msk = STM32F4_EOCIE, };
/* STM32H7 common registers definitions */ @@ -251,8 +257,24 @@ static const struct stm32_adc_common_reg .csr = STM32H7_ADC_CSR, .eoc1_msk = STM32H7_EOC_MST, .eoc2_msk = STM32H7_EOC_SLV, + .ier = STM32H7_ADC_IER, + .eocie_msk = STM32H7_EOCIE, };
+static const unsigned int stm32_adc_offset[STM32_ADC_MAX_ADCS] = { + 0, STM32_ADC_OFFSET, STM32_ADC_OFFSET * 2, +}; + +static unsigned int stm32_adc_eoc_enabled(struct stm32_adc_priv *priv, + unsigned int adc) +{ + u32 ier, offset = stm32_adc_offset[adc]; + + ier = readl_relaxed(priv->common.base + offset + priv->cfg->regs->ier); + + return ier & priv->cfg->regs->eocie_msk; +} + /* ADC common interrupt for all instances */ static void stm32_adc_irq_handler(struct irq_desc *desc) { @@ -263,13 +285,28 @@ static void stm32_adc_irq_handler(struct chained_irq_enter(chip, desc); status = readl_relaxed(priv->common.base + priv->cfg->regs->csr);
- if (status & priv->cfg->regs->eoc1_msk) + /* + * End of conversion may be handled by using IRQ or DMA. There may be a + * race here when two conversions complete at the same time on several + * ADCs. EOC may be read 'set' for several ADCs, with: + * - an ADC configured to use DMA (EOC triggers the DMA request, and + * is then automatically cleared by DR read in hardware) + * - an ADC configured to use IRQs (EOCIE bit is set. The handler must + * be called in this case) + * So both EOC status bit in CSR and EOCIE control bit must be checked + * before invoking the interrupt handler (e.g. call ISR only for + * IRQ-enabled ADCs). + */ + if (status & priv->cfg->regs->eoc1_msk && + stm32_adc_eoc_enabled(priv, 0)) generic_handle_irq(irq_find_mapping(priv->domain, 0));
- if (status & priv->cfg->regs->eoc2_msk) + if (status & priv->cfg->regs->eoc2_msk && + stm32_adc_eoc_enabled(priv, 1)) generic_handle_irq(irq_find_mapping(priv->domain, 1));
- if (status & priv->cfg->regs->eoc3_msk) + if (status & priv->cfg->regs->eoc3_msk && + stm32_adc_eoc_enabled(priv, 2)) generic_handle_irq(irq_find_mapping(priv->domain, 2));
chained_irq_exit(chip, desc); --- a/drivers/iio/adc/stm32-adc-core.h +++ b/drivers/iio/adc/stm32-adc-core.h @@ -37,6 +37,7 @@ * -------------------------------------------------------- */ #define STM32_ADC_MAX_ADCS 3 +#define STM32_ADC_OFFSET 0x100 #define STM32_ADCX_COMN_OFFSET 0x300
/* STM32F4 - Registers for each ADC instance */
From: "Aneesh Kumar K.V" aneesh.kumar@linux.vnet.ibm.com
commit a5d4b5891c2f1f865a2def1eb0030f534e77ff86 upstream.
On POWER9, under some circumstances, a broadcast TLB invalidation might complete before all previous stores have drained, potentially allowing stale stores from becoming visible after the invalidation. This works around it by doubling up those TLB invalidations which was verified by HW to be sufficient to close the risk window.
This will be documented in a yet-to-be-published errata.
Cc: stable@vger.kernel.org # v4.14 Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.vnet.ibm.com [mpe: Enable the feature in the DT CPU features code for all Power9, rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20180323045627.16800-3-aneesh.kumar@linux.vnet.ibm... [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das sandipan@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/cputable.h | 4 ++- arch/powerpc/kernel/dt_cpu_ftrs.c | 3 ++ arch/powerpc/kvm/book3s_64_mmu_radix.c | 3 ++ arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++ arch/powerpc/mm/hash_native_64.c | 16 ++++++++++++ arch/powerpc/mm/pgtable_64.c | 1 arch/powerpc/mm/tlb-radix.c | 41 ++++++++++++++++++++++++--------- 7 files changed, 66 insertions(+), 13 deletions(-)
--- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -215,6 +215,7 @@ enum { #define CPU_FTR_DAWR LONG_ASM_CONST(0x0400000000000000) #define CPU_FTR_DABRX LONG_ASM_CONST(0x0800000000000000) #define CPU_FTR_PMAO_BUG LONG_ASM_CONST(0x1000000000000000) +#define CPU_FTR_P9_TLBIE_BUG LONG_ASM_CONST(0x2000000000000000) #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000000000000000)
#ifndef __ASSEMBLY__ @@ -475,7 +476,8 @@ enum { CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \ CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \ CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \ - CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300) + CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \ + CPU_FTR_P9_TLBIE_BUG) #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \ (~CPU_FTR_SAO)) #define CPU_FTRS_CELL (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \ --- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -742,6 +742,9 @@ static __init void cpufeatures_cpu_quirk */ if ((version & 0xffffff00) == 0x004e0100) cur_cpu_spec->cpu_features |= CPU_FTR_POWER9_DD1; + + if ((version & 0xffff0000) == 0x004e0000) + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; }
static void __init cpufeatures_setup_finished(void) --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -160,6 +160,9 @@ static void kvmppc_radix_tlbie_page(stru asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIE_5(%0, %1, 0, 0, 1) : : "r" (addr), "r" (kvm->arch.lpid) : "memory"); + if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) + asm volatile(PPC_TLBIE_5(%0, %1, 0, 0, 1) + : : "r" (addr), "r" (kvm->arch.lpid) : "memory"); asm volatile("ptesync": : :"memory"); }
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -448,6 +448,17 @@ static void do_tlbies(struct kvm *kvm, u asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : : "r" (rbvalues[i]), "r" (kvm->arch.lpid)); } + + if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { + /* + * Need the extra ptesync to make sure we don't + * re-order the tlbie + */ + asm volatile("ptesync": : :"memory"); + asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : : + "r" (rbvalues[0]), "r" (kvm->arch.lpid)); + } + asm volatile("eieio; tlbsync; ptesync" : : : "memory"); kvm->arch.tlbie_lock = 0; } else { --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -104,6 +104,15 @@ static inline unsigned long ___tlbie(un return va; }
+static inline void fixup_tlbie(unsigned long vpn, int psize, int apsize, int ssize) +{ + if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { + /* Need the extra ptesync to ensure we don't reorder tlbie*/ + asm volatile("ptesync": : :"memory"); + ___tlbie(vpn, psize, apsize, ssize); + } +} + static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize) { unsigned long rb; @@ -181,6 +190,7 @@ static inline void tlbie(unsigned long v asm volatile("ptesync": : :"memory"); } else { __tlbie(vpn, psize, apsize, ssize); + fixup_tlbie(vpn, psize, apsize, ssize); asm volatile("eieio; tlbsync; ptesync": : :"memory"); } if (lock_tlbie && !use_local) @@ -674,7 +684,7 @@ static void native_hpte_clear(void) */ static void native_flush_hash_range(unsigned long number, int local) { - unsigned long vpn; + unsigned long vpn = 0; unsigned long hash, index, hidx, shift, slot; struct hash_pte *hptep; unsigned long hpte_v; @@ -746,6 +756,10 @@ static void native_flush_hash_range(unsi __tlbie(vpn, psize, psize, ssize); } pte_iterate_hashed_end(); } + /* + * Just do one more with the last used values. + */ + fixup_tlbie(vpn, psize, psize, ssize); asm volatile("eieio; tlbsync; ptesync":::"memory");
if (lock_tlbie) --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -491,6 +491,7 @@ void mmu_partition_table_set_entry(unsig "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0); } + /* do we need fixup here ?*/ asm volatile("eieio; tlbsync; ptesync" : : : "memory"); } EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry); --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -23,6 +23,33 @@ #define RIC_FLUSH_PWC 1 #define RIC_FLUSH_ALL 2
+static inline void __tlbie_va(unsigned long va, unsigned long pid, + unsigned long ap, unsigned long ric) +{ + unsigned long rb,rs,prs,r; + + rb = va & ~(PPC_BITMASK(52, 63)); + rb |= ap << PPC_BITLSHIFT(58); + rs = pid << PPC_BITLSHIFT(31); + prs = 1; /* process scoped */ + r = 1; /* raidx format */ + + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) + : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); + trace_tlbie(0, 0, rb, rs, ric, prs, r); +} + +static inline void fixup_tlbie(void) +{ + unsigned long pid = 0; + unsigned long va = ((1UL << 52) - 1); + + if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { + asm volatile("ptesync": : :"memory"); + __tlbie_va(va, pid, mmu_get_ap(MMU_PAGE_64K), RIC_FLUSH_TLB); + } +} + static inline void __tlbiel_pid(unsigned long pid, int set, unsigned long ric) { @@ -80,6 +107,7 @@ static inline void _tlbie_pid(unsigned l asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); + fixup_tlbie(); asm volatile("eieio; tlbsync; ptesync": : :"memory"); trace_tlbie(0, 0, rb, rs, ric, prs, r); } @@ -105,19 +133,10 @@ static inline void _tlbiel_va(unsigned l static inline void _tlbie_va(unsigned long va, unsigned long pid, unsigned long ap, unsigned long ric) { - unsigned long rb,rs,prs,r; - - rb = va & ~(PPC_BITMASK(52, 63)); - rb |= ap << PPC_BITLSHIFT(58); - rs = pid << PPC_BITLSHIFT(31); - prs = 1; /* process scoped */ - r = 1; /* raidx format */ - asm volatile("ptesync": : :"memory"); - asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) - : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); + __tlbie_va(va, pid, ap, ric); + fixup_tlbie(); asm volatile("eieio; tlbsync; ptesync": : :"memory"); - trace_tlbie(0, 0, rb, rs, ric, prs, r); }
/*
From: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com
commit 677733e296b5c7a37c47da391fc70a43dc40bd67 upstream.
The store ordering vs tlbie issue mentioned in commit a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") is fixed for Nimbus 2.3 and Cumulus 1.3 revisions. We don't need to apply the fixup if we are running on them
We can only do this on PowerNV. On pseries guest with kvm we still don't support redoing the feature fixup after migration. So we should be enabling all the workarounds needed, because whe can possibly migrate between DD 2.3 and DD 2.2
Cc: stable@vger.kernel.org # v4.14 Fixes: a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20190924035254.24612-1-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das sandipan@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/dt_cpu_ftrs.c | 31 ++++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-)
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -733,9 +733,35 @@ static bool __init cpufeatures_process_f return true; }
+/* + * Handle POWER9 broadcast tlbie invalidation issue using + * cpu feature flag. + */ +static __init void update_tlbie_feature_flag(unsigned long pvr) +{ + if (PVR_VER(pvr) == PVR_POWER9) { + /* + * Set the tlbie feature flag for anything below + * Nimbus DD 2.3 and Cumulus DD 1.3 + */ + if ((pvr & 0xe000) == 0) { + /* Nimbus */ + if ((pvr & 0xfff) < 0x203) + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + } else if ((pvr & 0xc000) == 0) { + /* Cumulus */ + if ((pvr & 0xfff) < 0x103) + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + } else { + WARN_ONCE(1, "Unknown PVR"); + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + } + } +} + static __init void cpufeatures_cpu_quirks(void) { - int version = mfspr(SPRN_PVR); + unsigned long version = mfspr(SPRN_PVR);
/* * Not all quirks can be derived from the cpufeatures device tree. @@ -743,8 +769,7 @@ static __init void cpufeatures_cpu_quirk if ((version & 0xffffff00) == 0x004e0100) cur_cpu_spec->cpu_features |= CPU_FTR_POWER9_DD1;
- if ((version & 0xffff0000) == 0x004e0000) - cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + update_tlbie_feature_flag(version); }
static void __init cpufeatures_setup_finished(void)
From: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com
commit 09ce98cacd51fcd0fa0af2f79d1e1d3192f4cbb0 upstream.
Rename the #define to indicate this is related to store vs tlbie ordering issue. In the next patch, we will be adding another feature flag that is used to handles ERAT flush vs tlbie ordering issue.
Cc: stable@vger.kernel.org # v4.14 Fixes: a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20190924035254.24612-2-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das sandipan@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/cputable.h | 4 ++-- arch/powerpc/kernel/dt_cpu_ftrs.c | 6 +++--- arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 +- arch/powerpc/mm/hash_native_64.c | 2 +- arch/powerpc/mm/tlb-radix.c | 2 +- 6 files changed, 9 insertions(+), 9 deletions(-)
--- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -215,7 +215,7 @@ enum { #define CPU_FTR_DAWR LONG_ASM_CONST(0x0400000000000000) #define CPU_FTR_DABRX LONG_ASM_CONST(0x0800000000000000) #define CPU_FTR_PMAO_BUG LONG_ASM_CONST(0x1000000000000000) -#define CPU_FTR_P9_TLBIE_BUG LONG_ASM_CONST(0x2000000000000000) +#define CPU_FTR_P9_TLBIE_STQ_BUG LONG_ASM_CONST(0x0000400000000000) #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000000000000000)
#ifndef __ASSEMBLY__ @@ -477,7 +477,7 @@ enum { CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \ CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \ CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \ - CPU_FTR_P9_TLBIE_BUG) + CPU_FTR_P9_TLBIE_STQ_BUG) #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \ (~CPU_FTR_SAO)) #define CPU_FTRS_CELL (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \ --- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -747,14 +747,14 @@ static __init void update_tlbie_feature_ if ((pvr & 0xe000) == 0) { /* Nimbus */ if ((pvr & 0xfff) < 0x203) - cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_STQ_BUG; } else if ((pvr & 0xc000) == 0) { /* Cumulus */ if ((pvr & 0xfff) < 0x103) - cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_STQ_BUG; } else { WARN_ONCE(1, "Unknown PVR"); - cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_STQ_BUG; } } } --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -160,7 +160,7 @@ static void kvmppc_radix_tlbie_page(stru asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIE_5(%0, %1, 0, 0, 1) : : "r" (addr), "r" (kvm->arch.lpid) : "memory"); - if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) asm volatile(PPC_TLBIE_5(%0, %1, 0, 0, 1) : : "r" (addr), "r" (kvm->arch.lpid) : "memory"); asm volatile("ptesync": : :"memory"); --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -449,7 +449,7 @@ static void do_tlbies(struct kvm *kvm, u "r" (rbvalues[i]), "r" (kvm->arch.lpid)); }
- if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { /* * Need the extra ptesync to make sure we don't * re-order the tlbie --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -106,7 +106,7 @@ static inline unsigned long ___tlbie(un
static inline void fixup_tlbie(unsigned long vpn, int psize, int apsize, int ssize) { - if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { /* Need the extra ptesync to ensure we don't reorder tlbie*/ asm volatile("ptesync": : :"memory"); ___tlbie(vpn, psize, apsize, ssize); --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -44,7 +44,7 @@ static inline void fixup_tlbie(void) unsigned long pid = 0; unsigned long va = ((1UL << 52) - 1);
- if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { asm volatile("ptesync": : :"memory"); __tlbie_va(va, pid, mmu_get_ap(MMU_PAGE_64K), RIC_FLUSH_TLB); }
From: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com
commit 047e6575aec71d75b765c22111820c4776cd1c43 upstream.
On POWER9, under some circumstances, a broadcast TLB invalidation will fail to invalidate the ERAT cache on some threads when there are parallel mtpidr/mtlpidr happening on other threads of the same core. This can cause stores to continue to go to a page after it's unmapped.
The workaround is to force an ERAT flush using PID=0 or LPID=0 tlbie flush. This additional TLB flush will cause the ERAT cache invalidation. Since we are using PID=0 or LPID=0, we don't get filtered out by the TLB snoop filtering logic.
We need to still follow this up with another tlbie to take care of store vs tlbie ordering issue explained in commit: a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9"). The presence of ERAT cache implies we can still get new stores and they may miss store queue marking flush.
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20190924035254.24612-3-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das sandipan@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/cputable.h | 3 + arch/powerpc/kernel/dt_cpu_ftrs.c | 2 + arch/powerpc/kvm/book3s_hv_rm_mmu.c | 42 +++++++++++++++++------ arch/powerpc/mm/hash_native_64.c | 28 +++++++++++++-- arch/powerpc/mm/tlb-radix.c | 65 ++++++++++++++++++++++++++++++------ 5 files changed, 116 insertions(+), 24 deletions(-)
--- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -217,6 +217,7 @@ enum { #define CPU_FTR_PMAO_BUG LONG_ASM_CONST(0x1000000000000000) #define CPU_FTR_P9_TLBIE_STQ_BUG LONG_ASM_CONST(0x0000400000000000) #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000000000000000) +#define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
#ifndef __ASSEMBLY__
@@ -477,7 +478,7 @@ enum { CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \ CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \ CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \ - CPU_FTR_P9_TLBIE_STQ_BUG) + CPU_FTR_P9_TLBIE_STQ_BUG | CPU_FTR_P9_TLBIE_ERAT_BUG) #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \ (~CPU_FTR_SAO)) #define CPU_FTRS_CELL (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \ --- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -756,6 +756,8 @@ static __init void update_tlbie_feature_ WARN_ONCE(1, "Unknown PVR"); cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_STQ_BUG; } + + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_ERAT_BUG; } }
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -429,6 +429,37 @@ static inline int try_lock_tlbie(unsigne return old == 0; }
+static inline void fixup_tlbie_lpid(unsigned long rb_value, unsigned long lpid) +{ + + if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) { + /* Radix flush for a hash guest */ + + unsigned long rb,rs,prs,r,ric; + + rb = PPC_BIT(52); /* IS = 2 */ + rs = 0; /* lpid = 0 */ + prs = 0; /* partition scoped */ + r = 1; /* radix format */ + ric = 0; /* RIC_FLSUH_TLB */ + + /* + * Need the extra ptesync to make sure we don't + * re-order the tlbie + */ + asm volatile("ptesync": : :"memory"); + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) + : : "r"(rb), "i"(r), "i"(prs), + "i"(ric), "r"(rs) : "memory"); + } + + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { + asm volatile("ptesync": : :"memory"); + asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : : + "r" (rb_value), "r" (lpid)); + } +} + static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, long npages, int global, bool need_sync) { @@ -449,16 +480,7 @@ static void do_tlbies(struct kvm *kvm, u "r" (rbvalues[i]), "r" (kvm->arch.lpid)); }
- if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { - /* - * Need the extra ptesync to make sure we don't - * re-order the tlbie - */ - asm volatile("ptesync": : :"memory"); - asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : : - "r" (rbvalues[0]), "r" (kvm->arch.lpid)); - } - + fixup_tlbie_lpid(rbvalues[i - 1], kvm->arch.lpid); asm volatile("eieio; tlbsync; ptesync" : : : "memory"); kvm->arch.tlbie_lock = 0; } else { --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -104,8 +104,30 @@ static inline unsigned long ___tlbie(un return va; }
-static inline void fixup_tlbie(unsigned long vpn, int psize, int apsize, int ssize) +static inline void fixup_tlbie_vpn(unsigned long vpn, int psize, + int apsize, int ssize) { + if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) { + /* Radix flush for a hash guest */ + + unsigned long rb,rs,prs,r,ric; + + rb = PPC_BIT(52); /* IS = 2 */ + rs = 0; /* lpid = 0 */ + prs = 0; /* partition scoped */ + r = 1; /* radix format */ + ric = 0; /* RIC_FLSUH_TLB */ + + /* + * Need the extra ptesync to make sure we don't + * re-order the tlbie + */ + asm volatile("ptesync": : :"memory"); + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) + : : "r"(rb), "i"(r), "i"(prs), + "i"(ric), "r"(rs) : "memory"); + } + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { /* Need the extra ptesync to ensure we don't reorder tlbie*/ asm volatile("ptesync": : :"memory"); @@ -190,7 +212,7 @@ static inline void tlbie(unsigned long v asm volatile("ptesync": : :"memory"); } else { __tlbie(vpn, psize, apsize, ssize); - fixup_tlbie(vpn, psize, apsize, ssize); + fixup_tlbie_vpn(vpn, psize, apsize, ssize); asm volatile("eieio; tlbsync; ptesync": : :"memory"); } if (lock_tlbie && !use_local) @@ -759,7 +781,7 @@ static void native_flush_hash_range(unsi /* * Just do one more with the last used values. */ - fixup_tlbie(vpn, psize, psize, ssize); + fixup_tlbie_vpn(vpn, psize, psize, ssize); asm volatile("eieio; tlbsync; ptesync":::"memory");
if (lock_tlbie) --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -39,14 +39,18 @@ static inline void __tlbie_va(unsigned l trace_tlbie(0, 0, rb, rs, ric, prs, r); }
-static inline void fixup_tlbie(void) + +static inline void fixup_tlbie_va(unsigned long va, unsigned long pid, + unsigned long ap) { - unsigned long pid = 0; - unsigned long va = ((1UL << 52) - 1); + if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) { + asm volatile("ptesync": : :"memory"); + __tlbie_va(va, 0, ap, RIC_FLUSH_TLB); + }
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { asm volatile("ptesync": : :"memory"); - __tlbie_va(va, pid, mmu_get_ap(MMU_PAGE_64K), RIC_FLUSH_TLB); + __tlbie_va(va, pid, ap, RIC_FLUSH_TLB); } }
@@ -95,23 +99,64 @@ static inline void _tlbiel_pid(unsigned asm volatile(PPC_INVALIDATE_ERAT "; isync" : : :"memory"); }
-static inline void _tlbie_pid(unsigned long pid, unsigned long ric) +static inline void __tlbie_pid(unsigned long pid, unsigned long ric) { unsigned long rb,rs,prs,r;
rb = PPC_BIT(53); /* IS = 1 */ rs = pid << PPC_BITLSHIFT(31); prs = 1; /* process scoped */ - r = 1; /* raidx format */ + r = 1; /* radix format */
- asm volatile("ptesync": : :"memory"); asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1) : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory"); - fixup_tlbie(); - asm volatile("eieio; tlbsync; ptesync": : :"memory"); trace_tlbie(0, 0, rb, rs, ric, prs, r); }
+static inline void fixup_tlbie_pid(unsigned long pid) +{ + /* + * We can use any address for the invalidation, pick one which is + * probably unused as an optimisation. + */ + unsigned long va = ((1UL << 52) - 1); + + if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) { + asm volatile("ptesync": : :"memory"); + __tlbie_pid(0, RIC_FLUSH_TLB); + } + + if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) { + asm volatile("ptesync": : :"memory"); + __tlbie_va(va, pid, mmu_get_ap(MMU_PAGE_64K), RIC_FLUSH_TLB); + } +} + +static inline void _tlbie_pid(unsigned long pid, unsigned long ric) +{ + asm volatile("ptesync": : :"memory"); + + /* + * Workaround the fact that the "ric" argument to __tlbie_pid + * must be a compile-time contraint to match the "i" constraint + * in the asm statement. + */ + switch (ric) { + case RIC_FLUSH_TLB: + __tlbie_pid(pid, RIC_FLUSH_TLB); + fixup_tlbie_pid(pid); + break; + case RIC_FLUSH_PWC: + __tlbie_pid(pid, RIC_FLUSH_PWC); + break; + case RIC_FLUSH_ALL: + default: + __tlbie_pid(pid, RIC_FLUSH_ALL); + fixup_tlbie_pid(pid); + } + asm volatile("eieio; tlbsync; ptesync": : :"memory"); +} + static inline void _tlbiel_va(unsigned long va, unsigned long pid, unsigned long ap, unsigned long ric) { @@ -135,7 +180,7 @@ static inline void _tlbie_va(unsigned lo { asm volatile("ptesync": : :"memory"); __tlbie_va(va, pid, ap, ric); - fixup_tlbie(); + fixup_tlbie_va(va, pid, ap); asm volatile("eieio; tlbsync; ptesync": : :"memory"); }
From: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com
commit 93cad5f789951eaa27c3392b15294b4e51253944 upstream.
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com [mpe: Some minor fixes to make it build] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20190924035254.24612-4-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das sandipan@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/powerpc/mm/Makefile | 2 tools/testing/selftests/powerpc/mm/tlbie_test.c | 734 ++++++++++++++++++++++++ 2 files changed, 736 insertions(+) create mode 100644 tools/testing/selftests/powerpc/mm/tlbie_test.c
--- a/tools/testing/selftests/powerpc/mm/Makefile +++ b/tools/testing/selftests/powerpc/mm/Makefile @@ -3,6 +3,7 @@ noarg: $(MAKE) -C ../
TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao +TEST_GEN_PROGS_EXTENDED := tlbie_test TEST_GEN_FILES := tempfile
include ../../lib.mk @@ -14,3 +15,4 @@ $(OUTPUT)/prot_sao: ../utils.c $(OUTPUT)/tempfile: dd if=/dev/zero of=$@ bs=64k count=1
+$(OUTPUT)/tlbie_test: LDLIBS += -lpthread --- /dev/null +++ b/tools/testing/selftests/powerpc/mm/tlbie_test.c @@ -0,0 +1,734 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright 2019, Nick Piggin, Gautham R. Shenoy, Aneesh Kumar K.V, IBM Corp. + */ + +/* + * + * Test tlbie/mtpidr race. We have 4 threads doing flush/load/compare/store + * sequence in a loop. The same threads also rung a context switch task + * that does sched_yield() in loop. + * + * The snapshot thread mark the mmap area PROT_READ in between, make a copy + * and copy it back to the original area. This helps us to detect if any + * store continued to happen after we marked the memory PROT_READ. + */ + +#define _GNU_SOURCE +#include <stdio.h> +#include <sys/mman.h> +#include <sys/types.h> +#include <sys/wait.h> +#include <sys/ipc.h> +#include <sys/shm.h> +#include <sys/stat.h> +#include <sys/time.h> +#include <linux/futex.h> +#include <unistd.h> +#include <asm/unistd.h> +#include <string.h> +#include <stdlib.h> +#include <fcntl.h> +#include <sched.h> +#include <time.h> +#include <stdarg.h> +#include <sched.h> +#include <pthread.h> +#include <signal.h> +#include <sys/prctl.h> + +static inline void dcbf(volatile unsigned int *addr) +{ + __asm__ __volatile__ ("dcbf %y0; sync" : : "Z"(*(unsigned char *)addr) : "memory"); +} + +static void err_msg(char *msg) +{ + + time_t now; + time(&now); + printf("=================================\n"); + printf(" Error: %s\n", msg); + printf(" %s", ctime(&now)); + printf("=================================\n"); + exit(1); +} + +static char *map1; +static char *map2; +static pid_t rim_process_pid; + +/* + * A "rim-sequence" is defined to be the sequence of the following + * operations performed on a memory word: + * 1) FLUSH the contents of that word. + * 2) LOAD the contents of that word. + * 3) COMPARE the contents of that word with the content that was + * previously stored at that word + * 4) STORE new content into that word. + * + * The threads in this test that perform the rim-sequence are termed + * as rim_threads. + */ + +/* + * A "corruption" is defined to be the failed COMPARE operation in a + * rim-sequence. + * + * A rim_thread that detects a corruption informs about it to all the + * other rim_threads, and the mem_snapshot thread. + */ +static volatile unsigned int corruption_found; + +/* + * This defines the maximum number of rim_threads in this test. + * + * The THREAD_ID_BITS denote the number of bits required + * to represent the thread_ids [0..MAX_THREADS - 1]. + * We are being a bit paranoid here and set it to 8 bits, + * though 6 bits suffice. + * + */ +#define MAX_THREADS 64 +#define THREAD_ID_BITS 8 +#define THREAD_ID_MASK ((1 << THREAD_ID_BITS) - 1) +static unsigned int rim_thread_ids[MAX_THREADS]; +static pthread_t rim_threads[MAX_THREADS]; + + +/* + * Each rim_thread works on an exclusive "chunk" of size + * RIM_CHUNK_SIZE. + * + * The ith rim_thread works on the ith chunk. + * + * The ith chunk begins at + * map1 + (i * RIM_CHUNK_SIZE) + */ +#define RIM_CHUNK_SIZE 1024 +#define BITS_PER_BYTE 8 +#define WORD_SIZE (sizeof(unsigned int)) +#define WORD_BITS (WORD_SIZE * BITS_PER_BYTE) +#define WORDS_PER_CHUNK (RIM_CHUNK_SIZE/WORD_SIZE) + +static inline char *compute_chunk_start_addr(unsigned int thread_id) +{ + char *chunk_start; + + chunk_start = (char *)((unsigned long)map1 + + (thread_id * RIM_CHUNK_SIZE)); + + return chunk_start; +} + +/* + * The "word-offset" of a word-aligned address inside a chunk, is + * defined to be the number of words that precede the address in that + * chunk. + * + * WORD_OFFSET_BITS denote the number of bits required to represent + * the word-offsets of all the word-aligned addresses of a chunk. + */ +#define WORD_OFFSET_BITS (__builtin_ctz(WORDS_PER_CHUNK)) +#define WORD_OFFSET_MASK ((1 << WORD_OFFSET_BITS) - 1) + +static inline unsigned int compute_word_offset(char *start, unsigned int *addr) +{ + unsigned int delta_bytes, ret; + delta_bytes = (unsigned long)addr - (unsigned long)start; + + ret = delta_bytes/WORD_SIZE; + + return ret; +} + +/* + * A "sweep" is defined to be the sequential execution of the + * rim-sequence by a rim_thread on its chunk one word at a time, + * starting from the first word of its chunk and ending with the last + * word of its chunk. + * + * Each sweep of a rim_thread is uniquely identified by a sweep_id. + * SWEEP_ID_BITS denote the number of bits required to represent + * the sweep_ids of rim_threads. + * + * As to why SWEEP_ID_BITS are computed as a function of THREAD_ID_BITS, + * WORD_OFFSET_BITS, and WORD_BITS, see the "store-pattern" below. + */ +#define SWEEP_ID_BITS (WORD_BITS - (THREAD_ID_BITS + WORD_OFFSET_BITS)) +#define SWEEP_ID_MASK ((1 << SWEEP_ID_BITS) - 1) + +/* + * A "store-pattern" is the word-pattern that is stored into a word + * location in the 4)STORE step of the rim-sequence. + * + * In the store-pattern, we shall encode: + * + * - The thread-id of the rim_thread performing the store + * (The most significant THREAD_ID_BITS) + * + * - The word-offset of the address into which the store is being + * performed (The next WORD_OFFSET_BITS) + * + * - The sweep_id of the current sweep in which the store is + * being performed. (The lower SWEEP_ID_BITS) + * + * Store Pattern: 32 bits + * |------------------|--------------------|---------------------------------| + * | Thread id | Word offset | sweep_id | + * |------------------|--------------------|---------------------------------| + * THREAD_ID_BITS WORD_OFFSET_BITS SWEEP_ID_BITS + * + * In the store pattern, the (Thread-id + Word-offset) uniquely identify the + * address to which the store is being performed i.e, + * address == map1 + + * (Thread-id * RIM_CHUNK_SIZE) + (Word-offset * WORD_SIZE) + * + * And the sweep_id in the store pattern identifies the time when the + * store was performed by the rim_thread. + * + * We shall use this property in the 3)COMPARE step of the + * rim-sequence. + */ +#define SWEEP_ID_SHIFT 0 +#define WORD_OFFSET_SHIFT (SWEEP_ID_BITS) +#define THREAD_ID_SHIFT (WORD_OFFSET_BITS + SWEEP_ID_BITS) + +/* + * Compute the store pattern for a given thread with id @tid, at + * location @addr in the sweep identified by @sweep_id + */ +static inline unsigned int compute_store_pattern(unsigned int tid, + unsigned int *addr, + unsigned int sweep_id) +{ + unsigned int ret = 0; + char *start = compute_chunk_start_addr(tid); + unsigned int word_offset = compute_word_offset(start, addr); + + ret += (tid & THREAD_ID_MASK) << THREAD_ID_SHIFT; + ret += (word_offset & WORD_OFFSET_MASK) << WORD_OFFSET_SHIFT; + ret += (sweep_id & SWEEP_ID_MASK) << SWEEP_ID_SHIFT; + return ret; +} + +/* Extract the thread-id from the given store-pattern */ +static inline unsigned int extract_tid(unsigned int pattern) +{ + unsigned int ret; + + ret = (pattern >> THREAD_ID_SHIFT) & THREAD_ID_MASK; + return ret; +} + +/* Extract the word-offset from the given store-pattern */ +static inline unsigned int extract_word_offset(unsigned int pattern) +{ + unsigned int ret; + + ret = (pattern >> WORD_OFFSET_SHIFT) & WORD_OFFSET_MASK; + + return ret; +} + +/* Extract the sweep-id from the given store-pattern */ +static inline unsigned int extract_sweep_id(unsigned int pattern) + +{ + unsigned int ret; + + ret = (pattern >> SWEEP_ID_SHIFT) & SWEEP_ID_MASK; + + return ret; +} + +/************************************************************ + * * + * Logging the output of the verification * + * * + ************************************************************/ +#define LOGDIR_NAME_SIZE 100 +static char logdir[LOGDIR_NAME_SIZE]; + +static FILE *fp[MAX_THREADS]; +static const char logfilename[] ="Thread-%02d-Chunk"; + +static inline void start_verification_log(unsigned int tid, + unsigned int *addr, + unsigned int cur_sweep_id, + unsigned int prev_sweep_id) +{ + FILE *f; + char logfile[30]; + char path[LOGDIR_NAME_SIZE + 30]; + char separator[2] = "/"; + char *chunk_start = compute_chunk_start_addr(tid); + unsigned int size = RIM_CHUNK_SIZE; + + sprintf(logfile, logfilename, tid); + strcpy(path, logdir); + strcat(path, separator); + strcat(path, logfile); + f = fopen(path, "w"); + + if (!f) { + err_msg("Unable to create logfile\n"); + } + + fp[tid] = f; + + fprintf(f, "----------------------------------------------------------\n"); + fprintf(f, "PID = %d\n", rim_process_pid); + fprintf(f, "Thread id = %02d\n", tid); + fprintf(f, "Chunk Start Addr = 0x%016lx\n", (unsigned long)chunk_start); + fprintf(f, "Chunk Size = %d\n", size); + fprintf(f, "Next Store Addr = 0x%016lx\n", (unsigned long)addr); + fprintf(f, "Current sweep-id = 0x%08x\n", cur_sweep_id); + fprintf(f, "Previous sweep-id = 0x%08x\n", prev_sweep_id); + fprintf(f, "----------------------------------------------------------\n"); +} + +static inline void log_anamoly(unsigned int tid, unsigned int *addr, + unsigned int expected, unsigned int observed) +{ + FILE *f = fp[tid]; + + fprintf(f, "Thread %02d: Addr 0x%lx: Expected 0x%x, Observed 0x%x\n", + tid, (unsigned long)addr, expected, observed); + fprintf(f, "Thread %02d: Expected Thread id = %02d\n", tid, extract_tid(expected)); + fprintf(f, "Thread %02d: Observed Thread id = %02d\n", tid, extract_tid(observed)); + fprintf(f, "Thread %02d: Expected Word offset = %03d\n", tid, extract_word_offset(expected)); + fprintf(f, "Thread %02d: Observed Word offset = %03d\n", tid, extract_word_offset(observed)); + fprintf(f, "Thread %02d: Expected sweep-id = 0x%x\n", tid, extract_sweep_id(expected)); + fprintf(f, "Thread %02d: Observed sweep-id = 0x%x\n", tid, extract_sweep_id(observed)); + fprintf(f, "----------------------------------------------------------\n"); +} + +static inline void end_verification_log(unsigned int tid, unsigned nr_anamolies) +{ + FILE *f = fp[tid]; + char logfile[30]; + char path[LOGDIR_NAME_SIZE + 30]; + char separator[] = "/"; + + fclose(f); + + if (nr_anamolies == 0) { + remove(path); + return; + } + + sprintf(logfile, logfilename, tid); + strcpy(path, logdir); + strcat(path, separator); + strcat(path, logfile); + + printf("Thread %02d chunk has %d corrupted words. For details check %s\n", + tid, nr_anamolies, path); +} + +/* + * When a COMPARE step of a rim-sequence fails, the rim_thread informs + * everyone else via the shared_memory pointed to by + * corruption_found variable. On seeing this, every thread verifies the + * content of its chunk as follows. + * + * Suppose a thread identified with @tid was about to store (but not + * yet stored) to @next_store_addr in its current sweep identified + * @cur_sweep_id. Let @prev_sweep_id indicate the previous sweep_id. + * + * This implies that for all the addresses @addr < @next_store_addr, + * Thread @tid has already performed a store as part of its current + * sweep. Hence we expect the content of such @addr to be: + * |-------------------------------------------------| + * | tid | word_offset(addr) | cur_sweep_id | + * |-------------------------------------------------| + * + * Since Thread @tid is yet to perform stores on address + * @next_store_addr and above, we expect the content of such an + * address @addr to be: + * |-------------------------------------------------| + * | tid | word_offset(addr) | prev_sweep_id | + * |-------------------------------------------------| + * + * The verifier function @verify_chunk does this verification and logs + * any anamolies that it finds. + */ +static void verify_chunk(unsigned int tid, unsigned int *next_store_addr, + unsigned int cur_sweep_id, + unsigned int prev_sweep_id) +{ + unsigned int *iter_ptr; + unsigned int size = RIM_CHUNK_SIZE; + unsigned int expected; + unsigned int observed; + char *chunk_start = compute_chunk_start_addr(tid); + + int nr_anamolies = 0; + + start_verification_log(tid, next_store_addr, + cur_sweep_id, prev_sweep_id); + + for (iter_ptr = (unsigned int *)chunk_start; + (unsigned long)iter_ptr < (unsigned long)chunk_start + size; + iter_ptr++) { + unsigned int expected_sweep_id; + + if (iter_ptr < next_store_addr) { + expected_sweep_id = cur_sweep_id; + } else { + expected_sweep_id = prev_sweep_id; + } + + expected = compute_store_pattern(tid, iter_ptr, expected_sweep_id); + + dcbf((volatile unsigned int*)iter_ptr); //Flush before reading + observed = *iter_ptr; + + if (observed != expected) { + nr_anamolies++; + log_anamoly(tid, iter_ptr, expected, observed); + } + } + + end_verification_log(tid, nr_anamolies); +} + +static void set_pthread_cpu(pthread_t th, int cpu) +{ + cpu_set_t run_cpu_mask; + struct sched_param param; + + CPU_ZERO(&run_cpu_mask); + CPU_SET(cpu, &run_cpu_mask); + pthread_setaffinity_np(th, sizeof(cpu_set_t), &run_cpu_mask); + + param.sched_priority = 1; + if (0 && sched_setscheduler(0, SCHED_FIFO, ¶m) == -1) { + /* haven't reproduced with this setting, it kills random preemption which may be a factor */ + fprintf(stderr, "could not set SCHED_FIFO, run as root?\n"); + } +} + +static void set_mycpu(int cpu) +{ + cpu_set_t run_cpu_mask; + struct sched_param param; + + CPU_ZERO(&run_cpu_mask); + CPU_SET(cpu, &run_cpu_mask); + sched_setaffinity(0, sizeof(cpu_set_t), &run_cpu_mask); + + param.sched_priority = 1; + if (0 && sched_setscheduler(0, SCHED_FIFO, ¶m) == -1) { + fprintf(stderr, "could not set SCHED_FIFO, run as root?\n"); + } +} + +static volatile int segv_wait; + +static void segv_handler(int signo, siginfo_t *info, void *extra) +{ + while (segv_wait) { + sched_yield(); + } + +} + +static void set_segv_handler(void) +{ + struct sigaction sa; + + sa.sa_flags = SA_SIGINFO; + sa.sa_sigaction = segv_handler; + + if (sigaction(SIGSEGV, &sa, NULL) == -1) { + perror("sigaction"); + exit(EXIT_FAILURE); + } +} + +int timeout = 0; +/* + * This function is executed by every rim_thread. + * + * This function performs sweeps over the exclusive chunks of the + * rim_threads executing the rim-sequence one word at a time. + */ +static void *rim_fn(void *arg) +{ + unsigned int tid = *((unsigned int *)arg); + + int size = RIM_CHUNK_SIZE; + char *chunk_start = compute_chunk_start_addr(tid); + + unsigned int prev_sweep_id; + unsigned int cur_sweep_id = 0; + + /* word access */ + unsigned int pattern = cur_sweep_id; + unsigned int *pattern_ptr = &pattern; + unsigned int *w_ptr, read_data; + + set_segv_handler(); + + /* + * Let us initialize the chunk: + * + * Each word-aligned address addr in the chunk, + * is initialized to : + * |-------------------------------------------------| + * | tid | word_offset(addr) | 0 | + * |-------------------------------------------------| + */ + for (w_ptr = (unsigned int *)chunk_start; + (unsigned long)w_ptr < (unsigned long)(chunk_start) + size; + w_ptr++) { + + *pattern_ptr = compute_store_pattern(tid, w_ptr, cur_sweep_id); + *w_ptr = *pattern_ptr; + } + + while (!corruption_found && !timeout) { + prev_sweep_id = cur_sweep_id; + cur_sweep_id = cur_sweep_id + 1; + + for (w_ptr = (unsigned int *)chunk_start; + (unsigned long)w_ptr < (unsigned long)(chunk_start) + size; + w_ptr++) { + unsigned int old_pattern; + + /* + * Compute the pattern that we would have + * stored at this location in the previous + * sweep. + */ + old_pattern = compute_store_pattern(tid, w_ptr, prev_sweep_id); + + /* + * FLUSH:Ensure that we flush the contents of + * the cache before loading + */ + dcbf((volatile unsigned int*)w_ptr); //Flush + + /* LOAD: Read the value */ + read_data = *w_ptr; //Load + + /* + * COMPARE: Is it the same as what we had stored + * in the previous sweep ? It better be! + */ + if (read_data != old_pattern) { + /* No it isn't! Tell everyone */ + corruption_found = 1; + } + + /* + * Before performing a store, let us check if + * any rim_thread has found a corruption. + */ + if (corruption_found || timeout) { + /* + * Yes. Someone (including us!) has found + * a corruption :( + * + * Let us verify that our chunk is + * correct. + */ + /* But first, let us allow the dust to settle down! */ + verify_chunk(tid, w_ptr, cur_sweep_id, prev_sweep_id); + + return 0; + } + + /* + * Compute the new pattern that we are going + * to write to this location + */ + *pattern_ptr = compute_store_pattern(tid, w_ptr, cur_sweep_id); + + /* + * STORE: Now let us write this pattern into + * the location + */ + *w_ptr = *pattern_ptr; + } + } + + return NULL; +} + + +static unsigned long start_cpu = 0; +static unsigned long nrthreads = 4; + +static pthread_t mem_snapshot_thread; + +static void *mem_snapshot_fn(void *arg) +{ + int page_size = getpagesize(); + size_t size = page_size; + void *tmp = malloc(size); + + while (!corruption_found && !timeout) { + /* Stop memory migration once corruption is found */ + segv_wait = 1; + + mprotect(map1, size, PROT_READ); + + /* + * Load from the working alias (map1). Loading from map2 + * also fails. + */ + memcpy(tmp, map1, size); + + /* + * Stores must go via map2 which has write permissions, but + * the corrupted data tends to be seen in the snapshot buffer, + * so corruption does not appear to be introduced at the + * copy-back via map2 alias here. + */ + memcpy(map2, tmp, size); + /* + * Before releasing other threads, must ensure the copy + * back to + */ + asm volatile("sync" ::: "memory"); + mprotect(map1, size, PROT_READ|PROT_WRITE); + asm volatile("sync" ::: "memory"); + segv_wait = 0; + + usleep(1); /* This value makes a big difference */ + } + + return 0; +} + +void alrm_sighandler(int sig) +{ + timeout = 1; +} + +int main(int argc, char *argv[]) +{ + int c; + int page_size = getpagesize(); + time_t now; + int i, dir_error; + pthread_attr_t attr; + key_t shm_key = (key_t) getpid(); + int shmid, run_time = 20 * 60; + struct sigaction sa_alrm; + + snprintf(logdir, LOGDIR_NAME_SIZE, + "/tmp/logdir-%u", (unsigned int)getpid()); + while ((c = getopt(argc, argv, "r:hn:l:t:")) != -1) { + switch(c) { + case 'r': + start_cpu = strtoul(optarg, NULL, 10); + break; + case 'h': + printf("%s [-r <start_cpu>] [-n <nrthreads>] [-l <logdir>] [-t <timeout>]\n", argv[0]); + exit(0); + break; + case 'n': + nrthreads = strtoul(optarg, NULL, 10); + break; + case 'l': + strncpy(logdir, optarg, LOGDIR_NAME_SIZE); + break; + case 't': + run_time = strtoul(optarg, NULL, 10); + break; + default: + printf("invalid option\n"); + exit(0); + break; + } + } + + if (nrthreads > MAX_THREADS) + nrthreads = MAX_THREADS; + + shmid = shmget(shm_key, page_size, IPC_CREAT|0666); + if (shmid < 0) { + err_msg("Failed shmget\n"); + } + + map1 = shmat(shmid, NULL, 0); + if (map1 == (void *) -1) { + err_msg("Failed shmat"); + } + + map2 = shmat(shmid, NULL, 0); + if (map2 == (void *) -1) { + err_msg("Failed shmat"); + } + + dir_error = mkdir(logdir, 0755); + + if (dir_error) { + err_msg("Failed mkdir"); + } + + printf("start_cpu list:%lu\n", start_cpu); + printf("number of worker threads:%lu + 1 snapshot thread\n", nrthreads); + printf("Allocated address:0x%016lx + secondary map:0x%016lx\n", (unsigned long)map1, (unsigned long)map2); + printf("logdir at : %s\n", logdir); + printf("Timeout: %d seconds\n", run_time); + + time(&now); + printf("=================================\n"); + printf(" Starting Test\n"); + printf(" %s", ctime(&now)); + printf("=================================\n"); + + for (i = 0; i < nrthreads; i++) { + if (1 && !fork()) { + prctl(PR_SET_PDEATHSIG, SIGKILL); + set_mycpu(start_cpu + i); + for (;;) + sched_yield(); + exit(0); + } + } + + + sa_alrm.sa_handler = &alrm_sighandler; + sigemptyset(&sa_alrm.sa_mask); + sa_alrm.sa_flags = 0; + + if (sigaction(SIGALRM, &sa_alrm, 0) == -1) { + err_msg("Failed signal handler registration\n"); + } + + alarm(run_time); + + pthread_attr_init(&attr); + for (i = 0; i < nrthreads; i++) { + rim_thread_ids[i] = i; + pthread_create(&rim_threads[i], &attr, rim_fn, &rim_thread_ids[i]); + set_pthread_cpu(rim_threads[i], start_cpu + i); + } + + pthread_create(&mem_snapshot_thread, &attr, mem_snapshot_fn, map1); + set_pthread_cpu(mem_snapshot_thread, start_cpu + i); + + + pthread_join(mem_snapshot_thread, NULL); + for (i = 0; i < nrthreads; i++) { + pthread_join(rim_threads[i], NULL); + } + + if (!timeout) { + time(&now); + printf("=================================\n"); + printf(" Data Corruption Detected\n"); + printf(" %s", ctime(&now)); + printf(" See logfiles in %s\n", logdir); + printf("=================================\n"); + return 1; + } + return 0; +}
From: Desnes A. Nunes do Rosario desnesn@linux.ibm.com
commit 5b216ea1c40cf06eead15054c70e238c9bd4729e upstream.
Newer versions of GCC (>= 9) demand that the size of the string to be copied must be explicitly smaller than the size of the destination. Thus, the NULL char has to be taken into account on strncpy.
This will avoid the following compiling error:
tlbie_test.c: In function 'main': tlbie_test.c:639:4: error: 'strncpy' specified bound 100 equals destination size strncpy(logdir, optarg, LOGDIR_NAME_SIZE); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: Desnes A. Nunes do Rosario desnesn@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20191003211010.9711-1-desnesn@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das sandipan@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/powerpc/mm/tlbie_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/powerpc/mm/tlbie_test.c +++ b/tools/testing/selftests/powerpc/mm/tlbie_test.c @@ -636,7 +636,7 @@ int main(int argc, char *argv[]) nrthreads = strtoul(optarg, NULL, 10); break; case 'l': - strncpy(logdir, optarg, LOGDIR_NAME_SIZE); + strncpy(logdir, optarg, LOGDIR_NAME_SIZE - 1); break; case 't': run_time = strtoul(optarg, NULL, 10);
stable-rc/linux-4.14.y boot: 110 boots: 0 failed, 103 passed with 7 offline (v4.14.152-63-g2cfe0b7bdeef)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.14... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.152-63...
Tree: stable-rc Branch: linux-4.14.y Git Describe: v4.14.152-63-g2cfe0b7bdeef Git Commit: 2cfe0b7bdeef09a0ffe2895928288ebca332b8be Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 61 unique boards, 21 SoC families, 13 builds out of 201
Offline Platforms:
arm:
sunxi_defconfig: gcc-8 sun5i-r8-chip: 1 offline lab sun7i-a20-bananapi: 1 offline lab
multi_v7_defconfig: gcc-8 qcom-apq8064-cm-qs600: 1 offline lab sun5i-r8-chip: 1 offline lab sun7i-a20-bananapi: 1 offline lab
davinci_all_defconfig: gcc-8 dm365evm,legacy: 1 offline lab
qcom_defconfig: gcc-8 qcom-apq8064-cm-qs600: 1 offline lab
--- For more info write to info@kernelci.org
On Sat, 9 Nov 2019 at 00:27, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.14.153 release. There are 62 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun 10 Nov 2019 05:42:11 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.153-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.14.153-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.14.y git commit: 2cfe0b7bdeef09a0ffe2895928288ebca332b8be git describe: v4.14.152-63-g2cfe0b7bdeef Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.152-6...
No regressions (compared to build v4.14.152)
No fixes (compared to build v4.14.152)
Ran 24212 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * ltp-dio-tests * ltp-io-tests * network-basic-tests * ltp-open-posix-tests * kvm-unit-tests * ssuite * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On 11/8/19 10:49 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.153 release. There are 62 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun 10 Nov 2019 05:42:11 PM UTC. Anything received after that time might be too late.
Build results: total: 172 pass: 172 fail: 0 Qemu test results: total: 372 pass: 372 fail: 0
Guenter
linux-stable-mirror@lists.linaro.org