From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit a7e75016a0753c24d6c995bc02501ae35368e333 ]
Add a test that validates that timer value is not overwritten when doing a copy_map_value call in the kernel. Without the prior fix, this test triggers a crash.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20220209070324.1093182-3-memxor@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/bpf/prog_tests/timer_crash.c | 32 +++++++++++ .../testing/selftests/bpf/progs/timer_crash.c | 54 +++++++++++++++++++ 2 files changed, 86 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/timer_crash.c create mode 100644 tools/testing/selftests/bpf/progs/timer_crash.c
diff --git a/tools/testing/selftests/bpf/prog_tests/timer_crash.c b/tools/testing/selftests/bpf/prog_tests/timer_crash.c new file mode 100644 index 0000000000000..f74b82305da8c --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/timer_crash.c @@ -0,0 +1,32 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> +#include "timer_crash.skel.h" + +enum { + MODE_ARRAY, + MODE_HASH, +}; + +static void test_timer_crash_mode(int mode) +{ + struct timer_crash *skel; + + skel = timer_crash__open_and_load(); + if (!ASSERT_OK_PTR(skel, "timer_crash__open_and_load")) + return; + skel->bss->pid = getpid(); + skel->bss->crash_map = mode; + if (!ASSERT_OK(timer_crash__attach(skel), "timer_crash__attach")) + goto end; + usleep(1); +end: + timer_crash__destroy(skel); +} + +void test_timer_crash(void) +{ + if (test__start_subtest("array")) + test_timer_crash_mode(MODE_ARRAY); + if (test__start_subtest("hash")) + test_timer_crash_mode(MODE_HASH); +} diff --git a/tools/testing/selftests/bpf/progs/timer_crash.c b/tools/testing/selftests/bpf/progs/timer_crash.c new file mode 100644 index 0000000000000..f8f7944e70dae --- /dev/null +++ b/tools/testing/selftests/bpf/progs/timer_crash.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> + +struct map_elem { + struct bpf_timer timer; + struct bpf_spin_lock lock; +}; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, int); + __type(value, struct map_elem); +} amap SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 1); + __type(key, int); + __type(value, struct map_elem); +} hmap SEC(".maps"); + +int pid = 0; +int crash_map = 0; /* 0 for amap, 1 for hmap */ + +SEC("fentry/do_nanosleep") +int sys_enter(void *ctx) +{ + struct map_elem *e, value = {}; + void *map = crash_map ? (void *)&hmap : (void *)&amap; + + if (bpf_get_current_task_btf()->tgid != pid) + return 0; + + *(void **)&value = (void *)0xdeadcaf3; + + bpf_map_update_elem(map, &(int){0}, &value, 0); + /* For array map, doing bpf_map_update_elem will do a + * check_and_free_timer_in_array, which will trigger the crash if timer + * pointer was overwritten, for hmap we need to use bpf_timer_cancel. + */ + if (crash_map == 1) { + e = bpf_map_lookup_elem(map, &(int){0}); + if (!e) + return 0; + bpf_timer_cancel(&e->timer); + } + return 0; +} + +char _license[] SEC("license") = "GPL";
From: Halil Pasic pasic@linux.ibm.com
[ Upstream commit ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e ]
The problem I'm addressing was discovered by the LTP test covering cve-2018-1000204.
A short description of what happens follows: 1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV and a corresponding dxferp. The peculiar thing about this is that TUR is not reading from the device. 2) In sg_start_req() the invocation of blk_rq_map_user() effectively bounces the user-space buffer. As if the device was to transfer into it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()") we make sure this first bounce buffer is allocated with GFP_ZERO. 3) For the rest of the story we keep ignoring that we have a TUR, so the device won't touch the buffer we prepare as if the we had a DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device and the buffer allocated by SG is mapped by the function virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here scatter-gather and not scsi generics). This mapping involves bouncing via the swiotlb (we need swiotlb to do virtio in protected guest like s390 Secure Execution, or AMD SEV). 4) When the SCSI TUR is done, we first copy back the content of the second (that is swiotlb) bounce buffer (which most likely contains some previous IO data), to the first bounce buffer, which contains all zeros. Then we copy back the content of the first bounce buffer to the user-space buffer. 5) The test case detects that the buffer, which it zero-initialized, ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb we leak all zeros, and the swiotlb should be transparent in a sense that it does not affect the outcome (if all other participants are well behaved).
Copying the content of the original buffer into the swiotlb buffer is the only way I can think of to make swiotlb transparent in such scenarios. So let's do just that if in doubt, but allow the driver to tell us that the whole mapped buffer is going to be overwritten, in which case we can preserve the old behavior and avoid the performance impact of the extra bounce.
Signed-off-by: Halil Pasic pasic@linux.ibm.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/core-api/dma-attributes.rst | 8 ++++++++ include/linux/dma-mapping.h | 8 ++++++++ kernel/dma/swiotlb.c | 3 ++- 3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst index 1887d92e8e926..17706dc91ec9f 100644 --- a/Documentation/core-api/dma-attributes.rst +++ b/Documentation/core-api/dma-attributes.rst @@ -130,3 +130,11 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged subsystem that the buffer is fully accessible at the elevated privilege level (and ideally inaccessible or at least read-only at the lesser-privileged levels). + +DMA_ATTR_OVERWRITE +------------------ + +This is a hint to the DMA-mapping subsystem that the device is expected to +overwrite the entire mapped size, thus the caller does not require any of the +previous buffer contents to be preserved. This allows bounce-buffering +implementations to optimise DMA_FROM_DEVICE transfers. diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index dca2b1355bb13..6150d11a607e1 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -61,6 +61,14 @@ */ #define DMA_ATTR_PRIVILEGED (1UL << 9)
+/* + * This is a hint to the DMA-mapping subsystem that the device is expected + * to overwrite the entire mapped size, thus the caller does not require any + * of the previous buffer contents to be preserved. This allows + * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers. + */ +#define DMA_ATTR_OVERWRITE (1UL << 10) + /* * A dma_addr_t can hold any valid DMA or bus address for the platform. It can * be given to a device to use as a DMA source or target. It is specific to a diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 87c40517e8227..aca0690550e2f 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -579,7 +579,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, mem->slots[index + i].orig_addr = slot_addr(orig_addr, i); tlb_addr = slot_addr(mem->start, index) + offset; if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && - (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) + (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE || + dir == DMA_BIDIRECTIONAL)) swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE); return tlb_addr; }
From: Heikki Krogerus heikki.krogerus@linux.intel.com
[ Upstream commit 038438a25c45d5ac996e95a22fa9e76ff3d1f8c7 ]
This patch adds the necessary PCI ID for Intel Raptor Lake-S devices.
Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20220214141948.18637-1-heikki.krogerus@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc3/dwc3-pci.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c index 7ff8fc8f79a9b..4e69a9d829f23 100644 --- a/drivers/usb/dwc3/dwc3-pci.c +++ b/drivers/usb/dwc3/dwc3-pci.c @@ -43,6 +43,7 @@ #define PCI_DEVICE_ID_INTEL_ADLP 0x51ee #define PCI_DEVICE_ID_INTEL_ADLM 0x54ee #define PCI_DEVICE_ID_INTEL_ADLS 0x7ae1 +#define PCI_DEVICE_ID_INTEL_RPLS 0x7a61 #define PCI_DEVICE_ID_INTEL_TGL 0x9a15 #define PCI_DEVICE_ID_AMD_MR 0x163a
@@ -409,6 +410,9 @@ static const struct pci_device_id dwc3_pci_id_table[] = { { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ADLS), (kernel_ulong_t) &dwc3_pci_intel_swnode, },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_RPLS), + (kernel_ulong_t) &dwc3_pci_intel_swnode, }, + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_TGL), (kernel_ulong_t) &dwc3_pci_intel_swnode, },
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 6f66db29e2415cbe8759c48584f9cae19b3c2651 ]
It appears that last minute change moved ACPI ID of Alder Lake-M to the INTC1055, which is already in the driver.
This ID on the other hand will be used elsewhere.
This reverts commit 258435a1c8187f559549e515d2f77fa0b57bcd27.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/intel/pinctrl-tigerlake.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/pinctrl/intel/pinctrl-tigerlake.c b/drivers/pinctrl/intel/pinctrl-tigerlake.c index 0bcd19597e4ad..3ddaeffc04150 100644 --- a/drivers/pinctrl/intel/pinctrl-tigerlake.c +++ b/drivers/pinctrl/intel/pinctrl-tigerlake.c @@ -749,7 +749,6 @@ static const struct acpi_device_id tgl_pinctrl_acpi_match[] = { { "INT34C5", (kernel_ulong_t)&tgllp_soc_data }, { "INT34C6", (kernel_ulong_t)&tglh_soc_data }, { "INTC1055", (kernel_ulong_t)&tgllp_soc_data }, - { "INTC1057", (kernel_ulong_t)&tgllp_soc_data }, { } }; MODULE_DEVICE_TABLE(acpi, tgl_pinctrl_acpi_match);
From: Wanpeng Li wanpengli@tencent.com
[ Upstream commit 4cb9a998b1ce25fad74a82f5a5c45a4ef40de337 ]
I saw the below splatting after the host suspended and resumed.
WARNING: CPU: 0 PID: 2943 at kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:5531 kvm_resume+0x2c/0x30 [kvm] CPU: 0 PID: 2943 Comm: step_after_susp Tainted: G W IOE 5.17.0-rc3+ #4 RIP: 0010:kvm_resume+0x2c/0x30 [kvm] Call Trace: <TASK> syscore_resume+0x90/0x340 suspend_devices_and_enter+0xaee/0xe90 pm_suspend.cold+0x36b/0x3c2 state_store+0x82/0xf0 kernfs_fop_write_iter+0x1b6/0x260 new_sync_write+0x258/0x370 vfs_write+0x33f/0x510 ksys_write+0xc9/0x160 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae
lockdep_is_held() can return -1 when lockdep is disabled which triggers this warning. Let's use lockdep_assert_not_held() which can detect incorrect calls while holding a lock and it also avoids false negatives when lockdep is disabled.
Signed-off-by: Wanpeng Li wanpengli@tencent.com Message-Id: 1644920142-81249-1-git-send-email-wanpengli@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/kvm_main.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 13aff136e6eef..5309b3eaa0470 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5416,9 +5416,7 @@ static int kvm_suspend(void) static void kvm_resume(void) { if (kvm_usage_count) { -#ifdef CONFIG_LOCKDEP - WARN_ON(lockdep_is_held(&kvm_count_lock)); -#endif + lockdep_assert_not_held(&kvm_count_lock); hardware_enable_nolock(NULL); } }
On 3/1/22 21:16, Sasha Levin wrote:
From: Wanpeng Li wanpengli@tencent.com
[ Upstream commit 4cb9a998b1ce25fad74a82f5a5c45a4ef40de337 ]
I saw the below splatting after the host suspended and resumed.
WARNING: CPU: 0 PID: 2943 at kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:5531 kvm_resume+0x2c/0x30 [kvm] CPU: 0 PID: 2943 Comm: step_after_susp Tainted: G W IOE 5.17.0-rc3+ #4 RIP: 0010:kvm_resume+0x2c/0x30 [kvm] Call Trace: <TASK> syscore_resume+0x90/0x340 suspend_devices_and_enter+0xaee/0xe90 pm_suspend.cold+0x36b/0x3c2 state_store+0x82/0xf0 kernfs_fop_write_iter+0x1b6/0x260 new_sync_write+0x258/0x370 vfs_write+0x33f/0x510 ksys_write+0xc9/0x160 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae
lockdep_is_held() can return -1 when lockdep is disabled which triggers this warning. Let's use lockdep_assert_not_held() which can detect incorrect calls while holding a lock and it also avoids false negatives when lockdep is disabled.
Signed-off-by: Wanpeng Li wanpengli@tencent.com Message-Id: 1644920142-81249-1-git-send-email-wanpengli@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
virt/kvm/kvm_main.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 13aff136e6eef..5309b3eaa0470 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5416,9 +5416,7 @@ static int kvm_suspend(void) static void kvm_resume(void) { if (kvm_usage_count) { -#ifdef CONFIG_LOCKDEP
WARN_ON(lockdep_is_held(&kvm_count_lock));
-#endif
hardware_enable_nolock(NULL); } }lockdep_assert_not_held(&kvm_count_lock);
Acked-by: Paolo Bonzini pbonzini@redhat.com
From: Anton Romanov romanton@google.com
[ Upstream commit 3a55f729240a686aa8af00af436306c0cd532522 ]
If vcpu has tsc_always_catchup set each request updates pvclock data. KVM_HC_CLOCK_PAIRING consumers such as ptp_kvm_x86 rely on tsc read on host's side and do hypercall inside pvclock_read_retry loop leading to infinite loop in such situation.
v3: Removed warn Changed return code to KVM_EFAULT v2: Added warn
Signed-off-by: Anton Romanov romanton@google.com Message-Id: 20220216182653.506850-1-romanton@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/x86.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 33cb065181248..c804122bb6e3a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8657,6 +8657,13 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr, if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK) return -KVM_EOPNOTSUPP;
+ /* + * When tsc is in permanent catchup mode guests won't be able to use + * pvclock_read_retry loop to get consistent view of pvclock + */ + if (vcpu->arch.tsc_always_catchup) + return -KVM_EOPNOTSUPP; + if (!kvm_get_walltime_and_clockread(&ts, &cycle)) return -KVM_EOPNOTSUPP;
On 3/1/22 21:16, Sasha Levin wrote:
From: Anton Romanov romanton@google.com
[ Upstream commit 3a55f729240a686aa8af00af436306c0cd532522 ]
If vcpu has tsc_always_catchup set each request updates pvclock data. KVM_HC_CLOCK_PAIRING consumers such as ptp_kvm_x86 rely on tsc read on host's side and do hypercall inside pvclock_read_retry loop leading to infinite loop in such situation.
v3: Removed warn Changed return code to KVM_EFAULT v2: Added warn
Signed-off-by: Anton Romanov romanton@google.com Message-Id: 20220216182653.506850-1-romanton@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
arch/x86/kvm/x86.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 33cb065181248..c804122bb6e3a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8657,6 +8657,13 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr, if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK) return -KVM_EOPNOTSUPP;
- /*
* When tsc is in permanent catchup mode guests won't be able to use
* pvclock_read_retry loop to get consistent view of pvclock
*/
- if (vcpu->arch.tsc_always_catchup)
return -KVM_EOPNOTSUPP;
- if (!kvm_get_walltime_and_clockread(&ts, &cycle)) return -KVM_EOPNOTSUPP;
Acked-by: Paolo Bonzini pbonzini@redhat.com
From: Jon Lin jon.lin@rock-chips.com
[ Upstream commit 9382df0a98aad5bbcd4d634790305a1d786ad224 ]
Get num-cs u32 from dts of_node property rather than u16.
Signed-off-by: Jon Lin jon.lin@rock-chips.com Link: https://lore.kernel.org/r/20220216014028.8123-2-jon.lin@rock-chips.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-rockchip.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c index 553b6b9d02222..4f65ba3dd19c2 100644 --- a/drivers/spi/spi-rockchip.c +++ b/drivers/spi/spi-rockchip.c @@ -654,7 +654,7 @@ static int rockchip_spi_probe(struct platform_device *pdev) struct spi_controller *ctlr; struct resource *mem; struct device_node *np = pdev->dev.of_node; - u32 rsd_nsecs; + u32 rsd_nsecs, num_cs; bool slave_mode;
slave_mode = of_property_read_bool(np, "spi-slave"); @@ -764,8 +764,9 @@ static int rockchip_spi_probe(struct platform_device *pdev) * rk spi0 has two native cs, spi1..5 one cs only * if num-cs is missing in the dts, default to 1 */ - if (of_property_read_u16(np, "num-cs", &ctlr->num_chipselect)) - ctlr->num_chipselect = 1; + if (of_property_read_u32(np, "num-cs", &num_cs)) + num_cs = 1; + ctlr->num_chipselect = num_cs; ctlr->use_gpio_descriptors = true; } ctlr->dev.of_node = pdev->dev.of_node;
From: Jon Lin jon.lin@rock-chips.com
[ Upstream commit 80808768e41324d2e23de89972b5406c1020e6e4 ]
After slave abort, all DMA should be stopped, or it will affect the next transmission and maybe abort again.
Signed-off-by: Jon Lin jon.lin@rock-chips.com Link: https://lore.kernel.org/r/20220216014028.8123-3-jon.lin@rock-chips.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-rockchip.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c index 4f65ba3dd19c2..c6a1bb09be056 100644 --- a/drivers/spi/spi-rockchip.c +++ b/drivers/spi/spi-rockchip.c @@ -585,6 +585,12 @@ static int rockchip_spi_slave_abort(struct spi_controller *ctlr) { struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
+ if (atomic_read(&rs->state) & RXDMA) + dmaengine_terminate_sync(ctlr->dma_rx); + if (atomic_read(&rs->state) & TXDMA) + dmaengine_terminate_sync(ctlr->dma_tx); + atomic_set(&rs->state, 0); + spi_enable_chip(rs, false); rs->slave_abort = true; spi_finalize_current_transfer(ctlr);
From: Maxime Ripard maxime@cerno.tech
[ Upstream commit e40945ab7c7f966d0c37b7bd7b0596497dfe228d ]
On bind we will register the HDMI codec device but we don't unregister it on unbind, leading to a device leakage. Unregister our device at unbind.
Signed-off-by: Maxime Ripard maxime@cerno.tech Reviewed-by: Javier Martinez Canillas javierm@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20220127111452.222002-1-maxime... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vc4/vc4_hdmi.c | 8 ++++++++ drivers/gpu/drm/vc4/vc4_hdmi.h | 1 + 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index 9170d948b4483..07887cbfd9cb6 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -1522,6 +1522,7 @@ static int vc4_hdmi_audio_init(struct vc4_hdmi *vc4_hdmi) dev_err(dev, "Couldn't register the HDMI codec: %ld\n", PTR_ERR(codec_pdev)); return PTR_ERR(codec_pdev); } + vc4_hdmi->audio.codec_pdev = codec_pdev;
dai_link->cpus = &vc4_hdmi->audio.cpu; dai_link->codecs = &vc4_hdmi->audio.codec; @@ -1561,6 +1562,12 @@ static int vc4_hdmi_audio_init(struct vc4_hdmi *vc4_hdmi)
}
+static void vc4_hdmi_audio_exit(struct vc4_hdmi *vc4_hdmi) +{ + platform_device_unregister(vc4_hdmi->audio.codec_pdev); + vc4_hdmi->audio.codec_pdev = NULL; +} + static irqreturn_t vc4_hdmi_hpd_irq_thread(int irq, void *priv) { struct vc4_hdmi *vc4_hdmi = priv; @@ -2298,6 +2305,7 @@ static void vc4_hdmi_unbind(struct device *dev, struct device *master, kfree(vc4_hdmi->hdmi_regset.regs); kfree(vc4_hdmi->hd_regset.regs);
+ vc4_hdmi_audio_exit(vc4_hdmi); vc4_hdmi_cec_exit(vc4_hdmi); vc4_hdmi_hotplug_exit(vc4_hdmi); vc4_hdmi_connector_destroy(&vc4_hdmi->connector); diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h index 33e9f665ab8e4..c0492da736833 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.h +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h @@ -113,6 +113,7 @@ struct vc4_hdmi_audio { struct snd_soc_dai_link_component platform; struct snd_dmaengine_dai_dma_data dma_data; struct hdmi_audio_infoframe infoframe; + struct platform_device *codec_pdev; bool streaming; };
From: Wanpeng Li wanpengli@tencent.com
[ Upstream commit ec756e40e271866f951d77c5e923d8deb6002b15 ]
Inspired by commit 3553ae5690a (x86/kvm: Don't use pvqspinlock code if only 1 vCPU), on a VM with only 1 vCPU, there is no need to enable pv tlb/ipi/sched_yield and we can save the memory for __pv_cpu_mask.
Signed-off-by: Wanpeng Li wanpengli@tencent.com Message-Id: 1645171838-2855-1-git-send-email-wanpengli@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/kvm.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index b656456c3a944..811c7aaf23aac 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -457,19 +457,22 @@ static bool pv_tlb_flush_supported(void) { return (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) && !kvm_para_has_hint(KVM_HINTS_REALTIME) && - kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)); + kvm_para_has_feature(KVM_FEATURE_STEAL_TIME) && + (num_possible_cpus() != 1)); }
static bool pv_ipi_supported(void) { - return kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI); + return (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI) && + (num_possible_cpus() != 1)); }
static bool pv_sched_yield_supported(void) { return (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) && !kvm_para_has_hint(KVM_HINTS_REALTIME) && - kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)); + kvm_para_has_feature(KVM_FEATURE_STEAL_TIME) && + (num_possible_cpus() != 1)); }
#define KVM_IPI_CLUSTER_SIZE (2 * BITS_PER_LONG)
On 3/1/22 21:16, Sasha Levin wrote:
From: Wanpeng Li wanpengli@tencent.com
[ Upstream commit ec756e40e271866f951d77c5e923d8deb6002b15 ]
Inspired by commit 3553ae5690a (x86/kvm: Don't use pvqspinlock code if only 1 vCPU), on a VM with only 1 vCPU, there is no need to enable pv tlb/ipi/sched_yield and we can save the memory for __pv_cpu_mask.
Signed-off-by: Wanpeng Li wanpengli@tencent.com Message-Id: 1645171838-2855-1-git-send-email-wanpengli@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
arch/x86/kernel/kvm.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index b656456c3a944..811c7aaf23aac 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -457,19 +457,22 @@ static bool pv_tlb_flush_supported(void) { return (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) && !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
kvm_para_has_feature(KVM_FEATURE_STEAL_TIME));
kvm_para_has_feature(KVM_FEATURE_STEAL_TIME) &&
}(num_possible_cpus() != 1));
static bool pv_ipi_supported(void) {
- return kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI);
- return (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI) &&
}(num_possible_cpus() != 1));
static bool pv_sched_yield_supported(void) { return (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) && !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
kvm_para_has_feature(KVM_FEATURE_STEAL_TIME));
kvm_para_has_feature(KVM_FEATURE_STEAL_TIME) &&
}(num_possible_cpus() != 1));
#define KVM_IPI_CLUSTER_SIZE (2 * BITS_PER_LONG)
NACK
Not really necessary.
Paolo
From: suresh kumar suresh2514@gmail.com
[ Upstream commit 4224cfd7fb6523f7a9d1c8bb91bb5df1e38eb624 ]
When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed.
[ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280
crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92
crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER)
To prevent this scenario, we also make sure that the netdevice is present.
Signed-off-by: suresh kumar suresh2514@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/net-sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index a4ae652633844..9e7a473548e3d 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -213,7 +213,7 @@ static ssize_t speed_show(struct device *dev, if (!rtnl_trylock()) return restart_syscall();
- if (netif_running(netdev)) { + if (netif_running(netdev) && netif_device_present(netdev)) { struct ethtool_link_ksettings cmd;
if (!__ethtool_get_link_ksettings(netdev, &cmd))
From: Oliver Neukum oneukum@suse.com
[ Upstream commit e9da0b56fe27206b49f39805f7dcda8a89379062 ]
A malicious device can leak heap data to user space providing bogus frame lengths. Introduce a sanity check.
Signed-off-by: Oliver Neukum oneukum@suse.com Reviewed-by: Grant Grundler grundler@chromium.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/sr9700.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/usb/sr9700.c b/drivers/net/usb/sr9700.c index 6516a37893e27..0c50f24671da3 100644 --- a/drivers/net/usb/sr9700.c +++ b/drivers/net/usb/sr9700.c @@ -410,7 +410,7 @@ static int sr9700_rx_fixup(struct usbnet *dev, struct sk_buff *skb) /* ignore the CRC length */ len = (skb->data[1] | (skb->data[2] << 8)) - 4;
- if (len > ETH_FRAME_LEN) + if (len > ETH_FRAME_LEN || len > skb->len) return 0;
/* the last packet of current skb */
From: Vikash Chandola vikash.chandola@linux.intel.com
[ Upstream commit 35f165f08950a876f1b95a61d79c93678fba2fd6 ]
Almost all fault/warning bits in pmbus status registers remain set even after fault/warning condition are removed. As per pmbus specification these faults must be cleared by user. Modify hwmon behavior to clear fault/warning bit after fetching data if fault/warning bit was set. This allows to get fresh data in next read.
Signed-off-by: Vikash Chandola vikash.chandola@linux.intel.com Link: https://lore.kernel.org/r/20220222131253.2426834-1-vikash.chandola@linux.int... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/pmbus/pmbus_core.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c index 776ee2237be20..ac2fbee1ba9c0 100644 --- a/drivers/hwmon/pmbus/pmbus_core.c +++ b/drivers/hwmon/pmbus/pmbus_core.c @@ -911,6 +911,11 @@ static int pmbus_get_boolean(struct i2c_client *client, struct pmbus_boolean *b, pmbus_update_sensor_data(client, s2);
regval = status & mask; + if (regval) { + ret = pmbus_write_byte_data(client, page, reg, regval); + if (ret) + goto unlock; + } if (s1 && s2) { s64 v1, v2;
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 3f1271b54edcc692da5a3663f2aa2a64781f9bc3 ]
There are enough VBIOS escapes without the proper workaround that some users still hit this. Microsoft never productized ATS on Windows so OEM platforms that were Windows-only didn't always validate ATS.
The advantages of ATS are not worth it compared to the potential instabilities on harvested boards. Disable ATS on all Navi10 and Navi14 boards.
Symptoms include:
amdgpu 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xffffc02000 flags=0x0000] AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 address=0xffffc02000 flags=0x0000] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=6047, emitted seq=6049 amdgpu 0000:07:00.0: amdgpu: GPU reset begin! amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110 amdgpu 0000:07:00.0: amdgpu: GPU reset(1) failed
Related commits:
e8946a53e2a6 ("PCI: Mark AMD Navi14 GPU ATS as broken") a2da5d8cc0b0 ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms") 45beb31d3afb ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken") 5e89cd303e3a ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken") d28ca864c493 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken") 9b44b0b09dec ("PCI: Mark AMD Stoney GPU ATS as broken")
[bhelgaas: add symptoms and related commits] Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1760 Link: https://lore.kernel.org/r/20220222160801.841643-1-alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Acked-by: Christian König christian.koenig@amd.com Acked-by: Guchun Chen guchun.chen@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 0663762ea69db..e7cd8b5045358 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5344,11 +5344,6 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0422, quirk_no_ext_tags); */ static void quirk_amd_harvest_no_ats(struct pci_dev *pdev) { - if ((pdev->device == 0x7312 && pdev->revision != 0x00) || - (pdev->device == 0x7340 && pdev->revision != 0xc5) || - (pdev->device == 0x7341 && pdev->revision != 0x00)) - return; - if (pdev->device == 0x15d8) { if (pdev->revision == 0xcf && pdev->subsystem_vendor == 0xea50 && @@ -5370,10 +5365,19 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_amd_harvest_no_ats); /* AMD Iceland dGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_amd_harvest_no_ats); /* AMD Navi10 dGPU */ +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7310, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7312, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7318, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7319, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731a, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731b, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731e, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731f, quirk_amd_harvest_no_ats); /* AMD Navi14 dGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7340, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7341, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7347, quirk_amd_harvest_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x734f, quirk_amd_harvest_no_ats); /* AMD Raven platform iGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x15d8, quirk_amd_harvest_no_ats); #endif /* CONFIG_PCI_ATS */
From: Shreeya Patel shreeya.patel@collabora.com
[ Upstream commit ae42f9288846353982e2eab181fb41e7fd8bf60f ]
We are racing the registering of .to_irq when probing the i2c driver. This results in random failure of touchscreen devices.
Following explains the race condition better.
[gpio driver] gpio driver registers gpio chip [gpio consumer] gpio is acquired [gpio consumer] gpiod_to_irq() fails with -ENXIO [gpio driver] gpio driver registers irqchip gpiod_to_irq works at this point, but -ENXIO is fatal
We could see the following errors in dmesg logs when gc->to_irq is NULL
[2.101857] i2c_hid i2c-FTS3528:00: HID over i2c has not been provided an Int IRQ [2.101953] i2c_hid: probe of i2c-FTS3528:00 failed with error -22
To avoid this situation, defer probing until to_irq is registered. Returning -EPROBE_DEFER would be the first step towards avoiding the failure of devices due to the race in registration of .to_irq. Final solution to this issue would be to avoid using gc irq members until they are fully initialized.
This issue has been reported many times in past and people have been using workarounds like changing the pinctrl_amd to built-in instead of loading it as a module or by adding a softdep for pinctrl_amd into the config file.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209413 Reviewed-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reported-by: kernel test robot lkp@intel.com Signed-off-by: Shreeya Patel shreeya.patel@collabora.com Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index d1b9b721218f2..08525988e64f2 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -3106,6 +3106,16 @@ int gpiod_to_irq(const struct gpio_desc *desc)
return retirq; } +#ifdef CONFIG_GPIOLIB_IRQCHIP + if (gc->irq.chip) { + /* + * Avoid race condition with other code, which tries to lookup + * an IRQ before the irqchip has been properly registered, + * i.e. while gpiochip is still being brought up. + */ + return -EPROBE_DEFER; + } +#endif return -ENXIO; } EXPORT_SYMBOL_GPL(gpiod_to_irq);
From: Guchun Chen guchun.chen@amd.com
[ Upstream commit e2b993302f40c4eb714ecf896dd9e1c5be7d4cd7 ]
vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled.
This can fix below calltrace:
amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu]
v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well.
Signed-off-by: Leslie Shi Yuliang.Shi@amd.com Signed-off-by: Guchun Chen guchun.chen@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index dc50c05f23fc2..5c08047adb594 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -1145,7 +1145,7 @@ int amdgpu_display_framebuffer_init(struct drm_device *dev, if (ret) return ret;
- if (!dev->mode_config.allow_fb_modifiers) { + if (!dev->mode_config.allow_fb_modifiers && !adev->enable_virtual_display) { drm_WARN_ONCE(dev, adev->family >= AMDGPU_FAMILY_AI, "GFX9+ requires FB check based on format modifier\n"); ret = check_tiling_flags_gfx6(rfb);
From: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com
[ Upstream commit 0f4558ae91870692ce7f509c31c9d6ee721d8cdc ]
This reverts commit 1f2565780e9b7218cf92c7630130e82dcc0fe9c2.
The 'hotplug-status' node should not be removed as long as the vif device remains configured. Otherwise the xen-netback would wait for re-running the network script even if it was already called (in case of the frontent re-connecting). But also, it _should_ be removed when the vif device is destroyed (for example when unbinding the driver) - otherwise hotplug script would not configure the device whenever it re-appear.
Moving removal of the 'hotplug-status' node was a workaround for nothing calling network script after xen-netback module is reloaded. But when vif interface is re-created (on xen-netback unbind/bind for example), the script should be called, regardless of who does that - currently this case is not handled by the toolstack, and requires manual script call. Keeping hotplug-status=connected to skip the call is wrong and leads to not configured interface.
More discussion at https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe....
Signed-off-by: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com Reviewed-by: Paul Durrant paul@xen.org Link: https://lore.kernel.org/r/20220222001817.2264967-1-marmarek@invisiblethingsl... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/xen-netback/xenbus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index d24b7a7993aa0..3fad58d22155b 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -256,6 +256,7 @@ static void backend_disconnect(struct backend_info *be) unsigned int queue_index;
xen_unregister_watchers(vif); + xenbus_rm(XBT_NIL, be->dev->nodename, "hotplug-status"); #ifdef CONFIG_DEBUG_FS xenvif_debugfs_delif(vif); #endif /* CONFIG_DEBUG_FS */ @@ -675,7 +676,6 @@ static void hotplug_status_changed(struct xenbus_watch *watch,
/* Not interested in this watch anymore. */ unregister_hotplug_status_watch(be); - xenbus_rm(XBT_NIL, be->dev->nodename, "hotplug-status"); } kfree(str); }
From: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com
[ Upstream commit e8240addd0a3919e0fd7436416afe9aa6429c484 ]
This reverts commit 2afeec08ab5c86ae21952151f726bfe184f6b23d.
The reasoning in the commit was wrong - the code expected to setup the watch even if 'hotplug-status' didn't exist. In fact, it relied on the watch being fired the first time - to check if maybe 'hotplug-status' is already set to 'connected'. Not registering a watch for non-existing path (which is the case if hotplug script hasn't been executed yet), made the backend not waiting for the hotplug script to execute. This in turns, made the netfront think the interface is fully operational, while in fact it was not (the vif interface on xen-netback side might not be configured yet).
This was a workaround for 'hotplug-status' erroneously being removed. But since that is reverted now, the workaround is not necessary either.
More discussion at https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe....
Signed-off-by: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com Reviewed-by: Paul Durrant paul@xen.org Reviewed-by: Michael Brown mbrown@fensystems.co.uk Link: https://lore.kernel.org/r/20220222001817.2264967-2-marmarek@invisiblethingsl... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/xen-netback/xenbus.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 3fad58d22155b..990360d75cb64 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -824,15 +824,11 @@ static void connect(struct backend_info *be) xenvif_carrier_on(be->vif);
unregister_hotplug_status_watch(be); - if (xenbus_exists(XBT_NIL, dev->nodename, "hotplug-status")) { - err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, - NULL, hotplug_status_changed, - "%s/%s", dev->nodename, - "hotplug-status"); - if (err) - goto err; + err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, NULL, + hotplug_status_changed, + "%s/%s", dev->nodename, "hotplug-status"); + if (!err) be->have_hotplug_status_watch = 1; - }
netif_tx_wake_all_queues(be->vif->dev);
From: Niels Dossche dossche.niels@gmail.com
[ Upstream commit 6c0d8833a605e195ae219b5042577ce52bf71fff ]
valid_lft, prefered_lft and tstamp are always accessed under the lock "lock" in other places. Reading these without taking the lock may result in inconsistencies regarding the calculation of the valid and preferred variables since decisions are taken on these fields for those variables.
Signed-off-by: Niels Dossche dossche.niels@gmail.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: Niels Dossche niels.dossche@ugent.be Link: https://lore.kernel.org/r/20220223131954.6570-1-niels.dossche@ugent.be Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/addrconf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index c6e1989ab2ed9..01cbfb4321eec 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -4996,6 +4996,7 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, nla_put_s32(skb, IFA_TARGET_NETNSID, args->netnsid)) goto error;
+ spin_lock_bh(&ifa->lock); if (!((ifa->flags&IFA_F_PERMANENT) && (ifa->prefered_lft == INFINITY_LIFE_TIME))) { preferred = ifa->prefered_lft; @@ -5017,6 +5018,7 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, preferred = INFINITY_LIFE_TIME; valid = INFINITY_LIFE_TIME; } + spin_unlock_bh(&ifa->lock);
if (!ipv6_addr_any(&ifa->peer_addr)) { if (nla_put_in6_addr(skb, IFA_LOCAL, &ifa->addr) < 0 ||
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 7acf3a127bb7c65ff39099afd78960e77b2ca5de ]
Booting the kernel with 'trace_buf_size=1' give a warning at boot during the ftrace selftests:
[ 0.892809] Running postponed tracer tests: [ 0.892893] Testing tracer function: [ 0.901899] Callback from call_rcu_tasks_trace() invoked. [ 0.983829] Callback from call_rcu_tasks_rude() invoked. [ 1.072003] .. bad ring buffer .. corrupted trace buffer .. [ 1.091944] Callback from call_rcu_tasks() invoked. [ 1.097695] PASSED [ 1.097701] Testing dynamic ftrace: .. filter failed count=0 ..FAILED! [ 1.353474] ------------[ cut here ]------------ [ 1.353478] WARNING: CPU: 0 PID: 1 at kernel/trace/trace.c:1951 run_tracer_selftest+0x13c/0x1b0
Therefore enforce a minimum of 4096 bytes to make the selftest pass.
Link: https://lkml.kernel.org/r/20220214134456.1751749-1-svens@linux.ibm.com
Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 618c20ce2479d..f64d46e22b47c 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1496,10 +1496,12 @@ static int __init set_buf_size(char *str) if (!str) return 0; buf_size = memparse(str, &str); - /* nr_entries can not be zero */ - if (buf_size == 0) - return 0; - trace_buf_size = buf_size; + /* + * nr_entries can not be zero and the startup + * tests require some buffer space. Therefore + * ensure we have at least 4096 bytes of buffer. + */ + trace_buf_size = max(4096UL, buf_size); return 1; } __setup("trace_buf_size=", set_buf_size);
From: Daniel Bristot de Oliveira bristot@kernel.org
[ Upstream commit dd990352f01ee9a6c6eee152e5d11c021caccfe4 ]
osnoise's runtime and period are in the microseconds scale, but it is currently sleeping in the millisecond's scale. This behavior roots in the usage of hwlat as the skeleton for osnoise.
Make osnoise to sleep in the microseconds scale. Also, move the sleep to a specialized function.
Link: https://lkml.kernel.org/r/302aa6c7bdf2d131719b22901905e9da122a11b2.164519733...
Cc: Ingo Molnar mingo@redhat.com Signed-off-by: Daniel Bristot de Oliveira bristot@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_osnoise.c | 53 ++++++++++++++++++++++-------------- 1 file changed, 32 insertions(+), 21 deletions(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c index 65a518649997b..fc491d0aee5a0 100644 --- a/kernel/trace/trace_osnoise.c +++ b/kernel/trace/trace_osnoise.c @@ -1249,6 +1249,37 @@ static int run_osnoise(void) static struct cpumask osnoise_cpumask; static struct cpumask save_cpumask;
+/* + * osnoise_sleep - sleep until the next period + */ +static void osnoise_sleep(void) +{ + u64 interval; + ktime_t wake_time; + + mutex_lock(&interface_lock); + interval = osnoise_data.sample_period - osnoise_data.sample_runtime; + mutex_unlock(&interface_lock); + + /* + * differently from hwlat_detector, the osnoise tracer can run + * without a pause because preemption is on. + */ + if (!interval) { + /* Let synchronize_rcu_tasks() make progress */ + cond_resched_tasks_rcu_qs(); + return; + } + + wake_time = ktime_add_us(ktime_get(), interval); + __set_current_state(TASK_INTERRUPTIBLE); + + while (schedule_hrtimeout_range(&wake_time, 0, HRTIMER_MODE_ABS)) { + if (kthread_should_stop()) + break; + } +} + /* * osnoise_main - The osnoise detection kernel thread * @@ -1257,30 +1288,10 @@ static struct cpumask save_cpumask; */ static int osnoise_main(void *data) { - u64 interval;
while (!kthread_should_stop()) { - run_osnoise(); - - mutex_lock(&interface_lock); - interval = osnoise_data.sample_period - osnoise_data.sample_runtime; - mutex_unlock(&interface_lock); - - do_div(interval, USEC_PER_MSEC); - - /* - * differently from hwlat_detector, the osnoise tracer can run - * without a pause because preemption is on. - */ - if (interval < 1) { - /* Let synchronize_rcu_tasks() make progress */ - cond_resched_tasks_rcu_qs(); - continue; - } - - if (msleep_interruptible(interval)) - break; + osnoise_sleep(); }
return 0;
From: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com
[ Upstream commit f39c58008dee7ab5fc94c3f1995a21e886801df0 ]
On the latest RHEL the test fails due to executable mapped at 256MB address
# ./map_fixed_noreplace mmap() @ 0x10000000-0x10050000 p=0xffffffffffffffff result=File exists 10000000-10010000 r-xp 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10010000-10020000 r--p 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10020000-10030000 rw-p 00010000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10029b90000-10029bc0000 rw-p 00000000 00:00 0 [heap] 7fffbb510000-7fffbb750000 r-xp 00000000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb750000-7fffbb760000 r--p 00230000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb760000-7fffbb770000 rw-p 00240000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb780000-7fffbb7a0000 r--p 00000000 00:00 0 [vvar] 7fffbb7a0000-7fffbb7b0000 r-xp 00000000 00:00 0 [vdso] 7fffbb7b0000-7fffbb800000 r-xp 00000000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb800000-7fffbb810000 r--p 00040000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb810000-7fffbb820000 rw-p 00050000 fd:04 24514 /usr/lib64/ld64.so.2 7fffd93f0000-7fffd9420000 rw-p 00000000 00:00 0 [stack] Error: couldn't map the space we need for the test
Fix this by finding a free address using mmap instead of hardcoding BASE_ADDRESS.
Link: https://lkml.kernel.org/r/20220217083417.373823-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Jann Horn jannh@google.com Cc: Shuah Khan shuah@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/vm/map_fixed_noreplace.c | 49 ++++++++++++++----- 1 file changed, 37 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/vm/map_fixed_noreplace.c b/tools/testing/selftests/vm/map_fixed_noreplace.c index d91bde5112686..eed44322d1a63 100644 --- a/tools/testing/selftests/vm/map_fixed_noreplace.c +++ b/tools/testing/selftests/vm/map_fixed_noreplace.c @@ -17,9 +17,6 @@ #define MAP_FIXED_NOREPLACE 0x100000 #endif
-#define BASE_ADDRESS (256ul * 1024 * 1024) - - static void dump_maps(void) { char cmd[32]; @@ -28,18 +25,46 @@ static void dump_maps(void) system(cmd); }
+static unsigned long find_base_addr(unsigned long size) +{ + void *addr; + unsigned long flags; + + flags = MAP_PRIVATE | MAP_ANONYMOUS; + addr = mmap(NULL, size, PROT_NONE, flags, -1, 0); + if (addr == MAP_FAILED) { + printf("Error: couldn't map the space we need for the test\n"); + return 0; + } + + if (munmap(addr, size) != 0) { + printf("Error: couldn't map the space we need for the test\n"); + return 0; + } + return (unsigned long)addr; +} + int main(void) { + unsigned long base_addr; unsigned long flags, addr, size, page_size; char *p;
page_size = sysconf(_SC_PAGE_SIZE);
+ //let's find a base addr that is free before we start the tests + size = 5 * page_size; + base_addr = find_base_addr(size); + if (!base_addr) { + printf("Error: couldn't map the space we need for the test\n"); + return 1; + } + flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE;
// Check we can map all the areas we need below errno = 0; - addr = BASE_ADDRESS; + addr = base_addr; size = 5 * page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0);
@@ -60,7 +85,7 @@ int main(void) printf("unmap() successful\n");
errno = 0; - addr = BASE_ADDRESS + page_size; + addr = base_addr + page_size; size = 3 * page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -80,7 +105,7 @@ int main(void) * +4 | free | new */ errno = 0; - addr = BASE_ADDRESS; + addr = base_addr; size = 5 * page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -101,7 +126,7 @@ int main(void) * +4 | free | */ errno = 0; - addr = BASE_ADDRESS + (2 * page_size); + addr = base_addr + (2 * page_size); size = page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -121,7 +146,7 @@ int main(void) * +4 | free | new */ errno = 0; - addr = BASE_ADDRESS + (3 * page_size); + addr = base_addr + (3 * page_size); size = 2 * page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -141,7 +166,7 @@ int main(void) * +4 | free | */ errno = 0; - addr = BASE_ADDRESS; + addr = base_addr; size = 2 * page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -161,7 +186,7 @@ int main(void) * +4 | free | */ errno = 0; - addr = BASE_ADDRESS; + addr = base_addr; size = page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -181,7 +206,7 @@ int main(void) * +4 | free | new */ errno = 0; - addr = BASE_ADDRESS + (4 * page_size); + addr = base_addr + (4 * page_size); size = page_size; p = mmap((void *)addr, size, PROT_NONE, flags, -1, 0); printf("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p); @@ -192,7 +217,7 @@ int main(void) return 1; }
- addr = BASE_ADDRESS; + addr = base_addr; size = 5 * page_size; if (munmap((void *)addr, size) != 0) { dump_maps();
From: Mike Kravetz mike.kravetz@oracle.com
[ Upstream commit fda153c89af344d21df281009a9d046cf587ea0f ]
Running the memfd script ./run_hugetlbfs_test.sh will often end in error as follows:
memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE memfd-hugetlb: SEAL-SHRINK fallocate(ALLOC) failed: No space left on device ./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs opening: ./mnt/memfd fuse: DONE
If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE test the mfd_fail_write routine maps the file, but does not unmap. As a result, two hugetlb pages remain reserved for the mapping. When the fallocate call in the SEAL-SHRINK test attempts allocate all hugetlb pages, it is short by the two reserved pages.
Fix by making sure to unmap in mfd_fail_write.
Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz mike.kravetz@oracle.com Cc: Joel Fernandes joel@joelfernandes.org Cc: Shuah Khan shuah@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/memfd/memfd_test.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c index 192a2899bae8f..94df2692e6e4a 100644 --- a/tools/testing/selftests/memfd/memfd_test.c +++ b/tools/testing/selftests/memfd/memfd_test.c @@ -455,6 +455,7 @@ static void mfd_fail_write(int fd) printf("mmap()+mprotect() didn't fail as expected\n"); abort(); } + munmap(p, mfd_def_size); }
/* verify PUNCH_HOLE fails */
linux-stable-mirror@lists.linaro.org