From: Quan Zhou <quan.zhou(a)mediatek.com>
In some cases, such as those below, the chip may be in an unpredictable
state when the driver's probe() runs:
* The system reboot flow does not complete properly (for example, a kernel
oops occurs while rebooting), so the driver never returns the device to
its default state.
* Similarly, if the device was enabled in BIOS or UEFI, the system may
switch to Linux without the driver having been fully shut down.
To avoid the problem, force the device back to its default state in probe():
* mt7921e_mcu_fw_pmctrl() : return control privilege to the chip side.
* mt7921_wfsys_reset() : clean up chip config before resource init.
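
The ordering described above can be illustrated with a small user-space toy model. This is only a sketch: `struct fake_chip` and every `fake_*` function are hypothetical stand-ins, not the real mt76 API, and the rule that firmware load fails on leftover state is an assumption used to mimic the "Failed to get patch semaphore" symptom.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical chip state; not the real struct mt7921_dev. */
struct fake_chip {
	bool driver_owns;	/* host driver currently holds control */
	bool stale_config;	/* a previous owner left configuration behind */
};

/* Stand-in for mt7921e_mcu_fw_pmctrl(): hand control back to the chip. */
static int fake_fw_pmctrl(struct fake_chip *c)
{
	c->driver_owns = false;
	return 0;
}

/* Stand-in for __mt7921e_mcu_drv_pmctrl(): take control for the driver. */
static int fake_drv_pmctrl(struct fake_chip *c)
{
	c->driver_owns = true;
	return 0;
}

/* Stand-in for mt7921_wfsys_reset(): wipe leftover configuration.
 * Requiring driver ownership here is an assumption of this model. */
static int fake_wfsys_reset(struct fake_chip *c)
{
	if (!c->driver_owns)
		return -EBUSY;
	c->stale_config = false;
	return 0;
}

/* Assumed failure mode: firmware load times out on leftover state. */
int fake_load_firmware(const struct fake_chip *c)
{
	return c->stale_config ? -EIO : 0;
}

/* Probe order from the patch: release, re-acquire, reset, then init. */
int fake_probe(struct fake_chip *c)
{
	int ret;

	ret = fake_fw_pmctrl(c);
	if (ret)
		return ret;

	ret = fake_drv_pmctrl(c);
	if (ret)
		return ret;

	ret = fake_wfsys_reset(c);
	if (ret)
		return ret;

	return fake_load_firmware(c);
}
```

In this model a chip left "dirty" by a previous owner fails `fake_load_firmware()` on its own, but succeeds once `fake_probe()` has forced it back to a known state first.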
Error log
[59007.600714] mt7921e 0000:02:00.0: ASIC revision: 79220010
[59010.889773] mt7921e 0000:02:00.0: Message 00000010 (seq 1) timeout
[59010.889786] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59014.217839] mt7921e 0000:02:00.0: Message 00000010 (seq 2) timeout
[59014.217852] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59017.545880] mt7921e 0000:02:00.0: Message 00000010 (seq 3) timeout
[59017.545893] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59020.874086] mt7921e 0000:02:00.0: Message 00000010 (seq 4) timeout
[59020.874099] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59024.202019] mt7921e 0000:02:00.0: Message 00000010 (seq 5) timeout
[59024.202033] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59027.530082] mt7921e 0000:02:00.0: Message 00000010 (seq 6) timeout
[59027.530096] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59030.857888] mt7921e 0000:02:00.0: Message 00000010 (seq 7) timeout
[59030.857904] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59034.185946] mt7921e 0000:02:00.0: Message 00000010 (seq 8) timeout
[59034.185961] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59037.514249] mt7921e 0000:02:00.0: Message 00000010 (seq 9) timeout
[59037.514262] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59040.842362] mt7921e 0000:02:00.0: Message 00000010 (seq 10) timeout
[59040.842375] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59040.923845] mt7921e 0000:02:00.0: hardware init failed
Cc: stable(a)vger.kernel.org
Fixes: 5c14a5f944b9 ("mt76: mt7921: introduce mt7921e support")
Tested-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Tested-by: Juan Martinez <juan.martinez(a)amd.com>
Co-developed-by: Leon Yen <leon.yen(a)mediatek.com>
Signed-off-by: Leon Yen <leon.yen(a)mediatek.com>
Signed-off-by: Quan Zhou <quan.zhou(a)mediatek.com>
Signed-off-by: Deren Wu <deren.wu(a)mediatek.com>
---
v2: The v1 patch has been accepted in wireless patchwork. However,
this patch is very important for existing systems, so we add the
cc stable tag in the hope that it can be pulled into the stable branch earlier.
---
drivers/net/wireless/mediatek/mt76/mt7921/dma.c | 4 ----
drivers/net/wireless/mediatek/mt76/mt7921/mcu.c | 8 --------
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 8 ++++++++
3 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/dma.c b/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
index f0a80c2b476a..4153cd6c2a01 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
@@ -231,10 +231,6 @@ int mt7921_dma_init(struct mt7921_dev *dev)
 	if (ret)
 		return ret;
 
-	ret = mt7921_wfsys_reset(dev);
-	if (ret)
-		return ret;
-
 	/* init tx queue */
 	ret = mt76_connac_init_tx_queues(dev->phy.mt76, MT7921_TXQ_BAND0,
 					 MT7921_TX_RING_SIZE,
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
index c69ce6df4956..f55caa00ac69 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
@@ -476,12 +476,6 @@ static int mt7921_load_firmware(struct mt7921_dev *dev)
 {
 	int ret;
 
-	ret = mt76_get_field(dev, MT_CONN_ON_MISC, MT_TOP_MISC2_FW_N9_RDY);
-	if (ret && mt76_is_mmio(&dev->mt76)) {
-		dev_dbg(dev->mt76.dev, "Firmware is already download\n");
-		goto fw_loaded;
-	}
-
 	ret = mt76_connac2_load_patch(&dev->mt76, mt7921_patch_name(dev));
 	if (ret)
 		return ret;
@@ -504,8 +498,6 @@ static int mt7921_load_firmware(struct mt7921_dev *dev)
 		return -EIO;
 	}
 
-fw_loaded:
-
 #ifdef CONFIG_PM
 	dev->mt76.hw->wiphy->wowlan = &mt76_connac_wowlan_support;
 #endif /* CONFIG_PM */
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 1c727870bbdb..6c512bc75685 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -325,6 +325,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
 	bus_ops->rmw = mt7921_rmw;
 	dev->mt76.bus = bus_ops;
 
+	ret = mt7921e_mcu_fw_pmctrl(dev);
+	if (ret)
+		goto err_free_dev;
+
 	ret = __mt7921e_mcu_drv_pmctrl(dev);
 	if (ret)
 		goto err_free_dev;
@@ -333,6 +337,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
 		    (mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
 	dev_info(mdev->dev, "ASIC revision: %04x\n", mdev->rev);
 
+	ret = mt7921_wfsys_reset(dev);
+	if (ret)
+		goto err_free_dev;
+
 	mt76_wr(dev, MT_WFDMA0_HOST_INT_ENA, 0);
 	mt76_wr(dev, MT_PCIE_MAC_INT_ENABLE, 0xff);
 
--
2.18.0
From: Long Li <longli(a)microsoft.com>
It's inefficient to ring the doorbell page every time a WQE is posted to
the receive queue. Excessive MMIO writes result in the CPU spending more
time waiting on LOCK instructions (atomic operations), resulting in
poor scaling performance.
Move the code for ringing the doorbell page so that it runs after all
WQEs have been posted to the receive queue during a callback from napi_poll().
With this change, tests showed an improvement from 120G/s to 160G/s on a
200G physical link, with 16 or 32 hardware queues.
Tests showed no regression in network latency benchmarks on a single
connection.
While we are making changes in this code path, change the code for
ringing the doorbell to set WQE_COUNT to 0 for the Receive Queue. The
hardware specification says that it should be set to 0. Although the
hardware does not currently enforce the check, future releases may
do so.
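
The effect of batching can be sketched with a user-space toy model that counts simulated doorbell writes. All names here (`struct fake_rq`, `fake_poll_ring_each()`, `fake_poll_ring_once()`) are hypothetical and only mimic the shape of the change; they are not the MANA driver's API.

```c
#include <assert.h>

/* Hypothetical receive-queue model; field names are illustrative only. */
struct fake_rq {
	unsigned int head;		/* WQEs posted so far */
	unsigned int hw_visible_head;	/* head last published to "hardware" */
	unsigned int doorbell_writes;	/* simulated MMIO doorbell writes */
};

/* Post one WQE without notifying the hardware. */
static void fake_post_wqe(struct fake_rq *rq)
{
	rq->head++;
}

/* Publish the current head; stands in for one MMIO doorbell write. */
static void fake_ring_doorbell(struct fake_rq *rq)
{
	rq->hw_visible_head = rq->head;
	rq->doorbell_writes++;
}

/* Old scheme: one doorbell write per refilled receive buffer. */
void fake_poll_ring_each(struct fake_rq *rq, int comp_read)
{
	for (int i = 0; i < comp_read; i++) {
		fake_post_wqe(rq);
		fake_ring_doorbell(rq);
	}
}

/* New scheme: post everything, then ring once per poll.
 * comp_read may be negative on a completion error, hence the > 0 check. */
void fake_poll_ring_once(struct fake_rq *rq, int comp_read)
{
	for (int i = 0; i < comp_read; i++)
		fake_post_wqe(rq);

	if (comp_read > 0)
		fake_ring_doorbell(rq);
}
```

Both schemes publish the same final head value; the batched one simply issues one simulated write per poll instead of one per WQE.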
Cc: stable(a)vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz(a)microsoft.com>
Reviewed-by: Dexuan Cui <decui(a)microsoft.com>
Signed-off-by: Long Li <longli(a)microsoft.com>
---
Change log:
v2:
Check for comp_read > 0 as it might be negative on completion error.
Set rq.wqe_cnt to 0 according to BNIC spec.
v3:
Add details in the commit on the reason of performance increase and test numbers.
Add details in the commit on why rq.wqe_cnt should be set to 0 according to hardware spec.
Add "Reviewed-by" from Haiyang and Dexuan.
drivers/net/ethernet/microsoft/mana/gdma_main.c | 5 ++++-
drivers/net/ethernet/microsoft/mana/mana_en.c | 10 ++++++++--
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 8f3f78b68592..3765d3389a9a 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -300,8 +300,11 @@ static void mana_gd_ring_doorbell(struct gdma_context *gc, u32 db_index,
 
 void mana_gd_wq_ring_doorbell(struct gdma_context *gc, struct gdma_queue *queue)
 {
+	/* Hardware Spec specifies that software client should set 0 for
+	 * wqe_cnt for Receive Queues. This value is not used in Send Queues.
+	 */
 	mana_gd_ring_doorbell(gc, queue->gdma_dev->doorbell, queue->type,
-			      queue->id, queue->head * GDMA_WQE_BU_SIZE, 1);
+			      queue->id, queue->head * GDMA_WQE_BU_SIZE, 0);
 }
 
 void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index cd4d5ceb9f2d..1d8abe63fcb8 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1383,8 +1383,8 @@ static void mana_post_pkt_rxq(struct mana_rxq *rxq)
 
 	recv_buf_oob = &rxq->rx_oobs[curr_index];
 
-	err = mana_gd_post_and_ring(rxq->gdma_rq, &recv_buf_oob->wqe_req,
-				    &recv_buf_oob->wqe_inf);
+	err = mana_gd_post_work_request(rxq->gdma_rq, &recv_buf_oob->wqe_req,
+					&recv_buf_oob->wqe_inf);
 	if (WARN_ON_ONCE(err))
 		return;
 
@@ -1654,6 +1654,12 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
 		mana_process_rx_cqe(rxq, cq, &comp[i]);
 	}
 
+	if (comp_read > 0) {
+		struct gdma_context *gc = rxq->gdma_rq->gdma_dev->gdma_context;
+
+		mana_gd_wq_ring_doorbell(gc, rxq->gdma_rq);
+	}
+
 	if (rxq->xdp_flush)
 		xdp_do_flush();
 }
--
2.34.1
A crash was reported in amd-sfh related to hid core initialization
before SFH initialization has run.
```
amdtp_hid_request+0x36/0x50 [amd_sfh 2e3095779aada9fdb1764f08ca578ccb14e41fe4]
sensor_hub_get_feature+0xad/0x170 [hid_sensor_hub d6157999c9d260a1bfa6f27d4a0dc2c3e2c5654e]
hid_sensor_parse_common_attributes+0x217/0x310 [hid_sensor_iio_common 07a7935272aa9c7a28193b574580b3e953a64ec4]
hid_gyro_3d_probe+0x7f/0x2e0 [hid_sensor_gyro_3d 9f2eb51294a1f0c0315b365f335617cbaef01eab]
platform_probe+0x44/0xa0
really_probe+0x19e/0x3e0
```
Ensure that sensors have been set up before calling into
amd_sfh_get_report() or amd_sfh_set_report().
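
The guard can be pictured with a user-space sketch. `struct fake_cl_data` and `fake_get_report()` are hypothetical stand-ins that only borrow the `is_any_sensor_enabled` idea from the patch; the per-sensor array and the `-EINVAL` bounds check are assumptions added so the example has something to protect.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client data; not the real struct amdtp_cl_data. */
struct fake_cl_data {
	bool is_any_sensor_enabled;	/* set only after sensor init has run */
	int num_hid_devices;
	const int *sensor_state;	/* NULL until init has run */
};

/* Returns the stored state for sensor idx, or a negative errno. */
int fake_get_report(const struct fake_cl_data *cli, int idx)
{
	/* Bail out early if initialization never completed, instead of
	 * dereferencing state that has not been set up yet. */
	if (!cli->is_any_sensor_enabled)
		return -ENODEV;

	if (idx < 0 || idx >= cli->num_hid_devices)
		return -EINVAL;

	return cli->sensor_state[idx];
}
```

A caller that arrives before initialization gets `-ENODEV` rather than touching the NULL `sensor_state` pointer.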
Cc: stable(a)vger.kernel.org
Cc: Linux regression tracking (Thorsten Leemhuis) <regressions(a)leemhuis.info>
Fixes: 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors are enabled, clean up")
Reported-by: Haochen Tong <linux(a)hexchain.org>
Link: https://lore.kernel.org/all/3250319.ancTxkQ2z5@zen/T/
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/hid/amd-sfh-hid/amd_sfh_client.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
index d9b7b01900b5..88f3d913eaa1 100644
--- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
@@ -25,6 +25,9 @@ void amd_sfh_set_report(struct hid_device *hid, int report_id,
 	struct amdtp_cl_data *cli_data = hid_data->cli_data;
 	int i;
 
+	if (!cli_data->is_any_sensor_enabled)
+		return;
+
 	for (i = 0; i < cli_data->num_hid_devices; i++) {
 		if (cli_data->hid_sensor_hubs[i] == hid) {
 			cli_data->cur_hid_dev = i;
@@ -41,6 +44,9 @@ int amd_sfh_get_report(struct hid_device *hid, int report_id, int report_type)
 	struct request_list *req_list = &cli_data->req_list;
 	int i;
 
+	if (!cli_data->is_any_sensor_enabled)
+		return -ENODEV;
+
 	for (i = 0; i < cli_data->num_hid_devices; i++) {
 		if (cli_data->hid_sensor_hubs[i] == hid) {
 			struct request_list *new = kzalloc(sizeof(*new), GFP_KERNEL);
--
2.34.1
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86. Based on the reproducer
provided in [1] we suspect this is caused by the lack of VMA locking
while forking a child process.
Patch 1/2 in the series implements proper VMA locking during fork.
I tested the fix locally using the reproducer and was unable to reproduce
the memory corruption problem.
This fix can potentially regress some fork-heavy workloads. A kernel build
on a 56-core machine showed no noticeable regression, while a stress test
mapping 10000 VMAs and forking 5000 times in a tight loop showed a ~7%
regression. If such a fork-time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore the previous performance. Further
optimizations are possible if this regression proves to be problematic.
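
A fork-heavy stress test of the kind mentioned above could look roughly like the following sketch. It is not the test that produced the quoted numbers. The function names are made up for illustration, and alternating page protections is an assumption intended to keep neighbouring anonymous mappings from merging into a single VMA. Timing is left to the caller.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map nr_vmas single-page anonymous regions and return how many succeeded.
 * The mappings are deliberately left in place so later forks must copy them. */
int map_many_vmas(int nr_vmas)
{
	long page = sysconf(_SC_PAGESIZE);
	int mapped = 0;

	for (int i = 0; i < nr_vmas; i++) {
		/* Alternate protections so adjacent mappings differ. */
		int prot = (i & 1) ? PROT_READ : (PROT_READ | PROT_WRITE);
		void *p = mmap(NULL, (size_t)page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			break;
		mapped++;
	}
	return mapped;
}

/* Fork nr_forks children that exit immediately, reaping each one.
 * Returns the number of children that exited cleanly. */
int fork_loop(int nr_forks)
{
	int ok = 0;

	for (int i = 0; i < nr_forks; i++) {
		int status;
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0)
			_exit(0);	/* child: nothing to do */

		if (waitpid(pid, &status, 0) == pid &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			ok++;
	}
	return ok;
}
```

Calling `map_many_vmas(10000)` followed by `fork_loop(5000)` from a small `main()` and timing the run with and without CONFIG_PER_VMA_LOCK would approximate the described scenario.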
Patch 2/2 disables per-VMA locks until the fix is tested and verified.
Both patches apply cleanly over Linus' ToT and stable 6.4.y branch.
Changes from v3 posted at [3]:
- Replace vma_iter_init with vma_iter_set, per Liam R. Howlett
- Update the regression number caused by additional VMA tree walk
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
[3] https://lore.kernel.org/all/20230705171213.2843068-1-surenb@google.com
Suren Baghdasaryan (2):
fork: lock VMAs of the parent process when forking
mm: disable CONFIG_PER_VMA_LOCK until its fixed
kernel/fork.c | 6 ++++++
mm/Kconfig | 3 ++-
2 files changed, 8 insertions(+), 1 deletion(-)
--
2.41.0.255.g8b1d071c50-goog
This is the start of the stable review cycle for the 6.4.2 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 06 Jul 2023 08:46:01 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.2-rc2.…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.4.2-rc2
Linus Torvalds <torvalds(a)linux-foundation.org>
gup: avoid stack expansion warning for known-good case
SeongJae Park <sj(a)kernel.org>
arch/arm64/mm/fault: Fix undeclared variable error in do_page_fault()
Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
drm/amdgpu: Validate VM ioctl flags.
Demi Marie Obenour <demi(a)invisiblethingslab.com>
dm ioctl: Avoid double-fetch of version
Ahmed S. Darwish <darwi(a)linutronix.de>
docs: Set minimal gtags / GNU GLOBAL version to 6.6.5
Ahmed S. Darwish <darwi(a)linutronix.de>
scripts/tags.sh: Resolve gtags empty index generation
Mike Kravetz <mike.kravetz(a)oracle.com>
hugetlb: revert use of page_cache_next_miss()
Finn Thain <fthain(a)linux-m68k.org>
nubus: Partially revert proc_create_single_data() conversion
Dan Williams <dan.j.williams(a)intel.com>
Revert "cxl/port: Enable the HDM decoder capability for switch ports"
Jeff Layton <jlayton(a)kernel.org>
nfs: don't report STATX_BTIME in ->getattr
Linus Torvalds <torvalds(a)linux-foundation.org>
execve: always mark stack as growing down during early stack setup
Mario Limonciello <mario.limonciello(a)amd.com>
PCI/ACPI: Call _REG when transitioning D-states
Bjorn Helgaas <bhelgaas(a)google.com>
PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Thomas Weißschuh <linux(a)weissschuh.net>
tools/nolibc: x86_64: disable stack protector for _start
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix lock_mm_and_find_vma in case VMA not found
-------------
Diffstat:
Documentation/process/changes.rst | 7 +++++
Makefile | 4 +--
arch/arm64/mm/fault.c | 2 --
drivers/cxl/core/pci.c | 27 +++--------------
drivers/cxl/cxl.h | 1 -
drivers/cxl/port.c | 14 ++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++
drivers/md/dm-ioctl.c | 33 +++++++++++++--------
drivers/nubus/proc.c | 22 ++++++++++----
drivers/pci/pci-acpi.c | 53 +++++++++++++++++++++++++---------
fs/hugetlbfs/inode.c | 8 ++---
fs/nfs/inode.c | 2 +-
include/linux/mm.h | 4 ++-
mm/hugetlb.c | 12 ++++----
mm/memory.c | 4 +++
mm/nommu.c | 7 ++++-
scripts/tags.sh | 9 +++++-
tools/include/nolibc/arch-x86_64.h | 2 +-
tools/testing/cxl/Kbuild | 1 -
tools/testing/cxl/test/mock.c | 15 ----------
20 files changed, 132 insertions(+), 99 deletions(-)