Hello,
I've always had trouble with this driver in my Asus Zephyrus laptop, but I was able to use it eventually, that's until 5.16.3 landed.
This version completely broke it. I'm unable to bring the interface up, no matter what I try.
Before, sometimes I was able to make the chip work by suspending the laptop, but in 5.16.3 the machine doesn't wake up (which is probably another issue).
Reverting back to 5.16.2 makes it work.
Let me know if you need more information, or if you would like me to bisect the issue.
Cheers.
On Sat, Jan 29, 2022 at 1:12 PM James <bjlockie@lockie.ca> wrote:
> Does dmesg show anything?
It's hard to tell because it seems there are multiple conflating issues. I booted into 5.16.3 again, and this time I experienced a different problem, so far I've seen these two:
1. The device appears, but I'm unable to bring it up
2. The device doesn't even appear
For issue #2 I see this interesting error:
[ 0.325945] Freeing initrd memory: 8768K
[ 0.331968] ------------[ cut here ]------------
[ 0.331969] WARNING: CPU: 4 PID: 1 at drivers/iommu/amd/init.c:839 amd_iommu_enable_interrupts+0x352/0x430
[ 0.331975] Modules linked in:
[ 0.331977] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.16.3-arch1-1 #1 ca51a3fe35922d501638d513dc9548a2c4fed987
[ 0.331980] Hardware name: ASUSTeK COMPUTER INC. ROG Zephyrus G14 GA401QM_GA401QM/GA401QM, BIOS GA401QM.410 12/13/2021
[ 0.331980] RIP: 0010:amd_iommu_enable_interrupts+0x352/0x430
[ 0.331982] Code: ff ff 48 8b 7b 18 89 04 24 e8 2a 3a ed ff 8b 04 24 e9 45 fd ff ff 0f 0b 48 8b 1b 48 81 fb 70 09 b6 99 0f 85 00 fd ff ff eb 96 <0f> 0b 48 8b 1b 48 81 fb 70 09 b6 99 0f 85 ec fc ff ff eb 82 31 f6
[ 0.331983] RSP: 0018:ffffa17a00087db8 EFLAGS: 00010246
[ 0.331985] RAX: 0000000000000018 RBX: ffff89af0004b000 RCX: ffffa17a00100000
[ 0.331986] RDX: 0000000000000000 RSI: ffffa17a00100000 RDI: 0000000000000000
[ 0.331986] RBP: 0000000080000000 R08: 0000000000000000 R09: 0000000000000000
[ 0.331987] R10: 0000000000000000 R11: 0000000000000000 R12: 000ffffffffffff8
[ 0.331988] R13: 0800000000000000 R14: ffffa17a00087dc0 R15: ffff89af013323c0
[ 0.331988] FS: 0000000000000000(0000) GS:ffff89b1de700000(0000) knlGS:0000000000000000
[ 0.331989] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.331990] CR2: 0000000000000000 CR3: 000000020a410000 CR4: 0000000000750ee0
[ 0.331991] PKRU: 55555554
[ 0.331991] Call Trace:
[ 0.331992] <TASK>
[ 0.331995] iommu_go_to_state+0x1164/0x1458
[ 0.331999] ? e820__memblock_setup+0x7d/0x7d
[ 0.332002] amd_iommu_init+0xf/0x29
[ 0.332003] pci_iommu_init+0x16/0x3f
[ 0.332005] do_one_initcall+0x57/0x220
[ 0.332008] kernel_init_freeable+0x1e8/0x242
[ 0.332010] ? rest_init+0xd0/0xd0
[ 0.332013] kernel_init+0x16/0x130
[ 0.332014] ret_from_fork+0x22/0x30
[ 0.332016] </TASK>
[ 0.332018] ---[ end trace 99de2ba3e793f5cf ]---
[ 0.332018] software IO TLB: tearing down default memory pool
Even more interesting is that I rebooted into 5.16.2 and the same warning appeared, and the same issue happened: I didn't see the driver. I turned off the laptop (as opposed to rebooting), and then turned it on, and now the wireless works fine (in 5.16.2).
The reason I turned off the laptop is that I read in some forums that turning off the computer and waiting 10 seconds makes the chip work again (although that was for yet another issue, which I've not experienced lately and which happened even in Windows).
Here's the whole dmesg: https://dpaste.org/0sj3
I'll try to disable the proprietary nvidia driver to see if there's any difference.
On Sat, Jan 29, 2022 at 1:50 PM Felipe Contreras <felipe.contreras@gmail.com> wrote:
> On Sat, Jan 29, 2022 at 1:12 PM James <bjlockie@lockie.ca> wrote:
> > Does dmesg show anything?
> It's hard to tell because it seems there are multiple conflating issues. I booted into 5.16.3 again, and this time I experienced a different problem, so far I've seen these two:
> 1. The device appears, but I'm unable to bring it up
> 2. The device doesn't even appear
I removed the nvidia driver and I was still able to reproduce issue #1.
Here are the interesting bits:
[ 2.295614] mt7921e 0000:02:00.0: enabling device (0000 -> 0002)
[ 2.295810] mt7921e 0000:02:00.0: ASIC revision: 79610010
[ 2.377578] mt7921e 0000:02:00.0: HW/SW Version: 0x8a108a10, Build Time: 20220110230855a
[ 2.846987] mt7921e 0000:02:00.0: WM Firmware Version: ____010000, Build Time: 20220110230951
[ 2.874395] mt7921e 0000:02:00.0: Firmware init done
[ 7.374118] mt7921e 0000:02:00.0: Message 00020001 (seq 4) timeout
[ 7.374180] mt7921e 0000:02:00.0: chip reset
[ 13.773763] mt7921e 0000:02:00.0: Message 000046ed (seq 5) timeout
[ 13.887279] mt7921e 0000:02:00.0: HW/SW Version: 0x8a108a10, Build Time: 20220110230855a
[ 13.958763] mt7921e 0000:02:00.0: WM Firmware Version: ____010000, Build Time: 20220110230951
[ 13.989292] mt7921e 0000:02:00.0: Firmware init done
[ 54.093979] mt7921e 0000:02:00.0: Message 00020001 (seq 10) timeout
[ 54.094010] mt7921e 0000:02:00.0: chip reset
[ 60.493981] mt7921e 0000:02:00.0: Message 000046ed (seq 11) timeout
[ 60.600757] mt7921e 0000:02:00.0: HW/SW Version: 0x8a108a10, Build Time: 20220110230855a
[ 60.672805] mt7921e 0000:02:00.0: WM Firmware Version: ____010000, Build Time: 20220110230951
[ 60.704784] mt7921e 0000:02:00.0: Firmware init done
The last "Firmware init done" happened after I did "ip link set wlan0 up" which failed.
Here's the full dmesg: https://dpaste.org/PVTE
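The pattern in the excerpt above (a message timeout, then a chip reset, then a second timeout) can be pulled out of a longer log mechanically. The following is only a rough sketch: it assumes every driver line has the shape "[ seconds] mt7921e <pci-address>: <message>" as in the excerpt, and the function name `summarize` is made up for this illustration.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   "[ 7.374118] mt7921e 0000:02:00.0: Message 00020001 (seq 4) timeout"
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+mt7921e\s+(\S+):\s+(.*)")
TIMEOUT_RE = re.compile(r"Message (\w+) \(seq (\d+)\) timeout")


def summarize(lines):
    """Collect timeouts as (time, message id, seq) and chip resets as timestamps."""
    timeouts, resets = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not an mt7921e line
        ts, msg = float(m.group(1)), m.group(3)
        t = TIMEOUT_RE.search(msg)
        if t:
            timeouts.append((ts, t.group(1), int(t.group(2))))
        elif msg == "chip reset":
            resets.append(ts)
    return timeouts, resets


# A few lines copied from the excerpt above.
sample = [
    "[ 2.874395] mt7921e 0000:02:00.0: Firmware init done",
    "[ 7.374118] mt7921e 0000:02:00.0: Message 00020001 (seq 4) timeout",
    "[ 7.374180] mt7921e 0000:02:00.0: chip reset",
    "[ 13.773763] mt7921e 0000:02:00.0: Message 000046ed (seq 5) timeout",
]
timeouts, resets = summarize(sample)
for ts, msg_id, seq in timeouts:
    print(f"{ts:10.6f}  timeout  message={msg_id} seq={seq}")
for ts in resets:
    print(f"{ts:10.6f}  chip reset")
```

Feeding it the full `dmesg` output instead of `sample` should list every timeout and reset in one place, which makes it easier to compare a working boot against a failing one.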
Yet another issue (#3) is that the kernel sometimes crashes when starting up the system. I've mostly ignored this issue, but looking at the log when that happens, it seems to be related to the mt7921 driver:
Jan 29 14:21:58 chronos kernel: mt7921e 0000:02:00.0: Timeout for driver own
Jan 29 14:21:58 chronos kernel: BUG: Bad page state in process systemd-udevd pfn:103328
Jan 29 14:21:58 chronos kernel: page:00000000128101f9 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x103328
Jan 29 14:21:58 chronos kernel: flags: 0x2ffff0000000000(node=0|zone=2|lastcpupid=0xffff)
Jan 29 14:21:58 chronos kernel: raw: 02ffff0000000000 dead000000000100 dead000000000122 0000000000000000
Jan 29 14:21:58 chronos kernel: raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
Jan 29 14:21:58 chronos kernel: page dumped because: nonzero _refcount
Jan 29 14:21:58 chronos kernel: Modules linked in: bnep btusb btrtl btbcm ccm algif_aead cbc btintel des_generic libdes ecb bluetooth iptable_nat ecdh_generic hid_asus nf_nat algif_skcipher nf_conntrack cmac nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c md4 algif_hash iptable_mangle iptable_filter af_alg intel_rapl_msr intel_rapl_common joydev mousedev mt7921e(+) mt7921_common snd_hda_codec_realtek mt76_connac_lib snd_hda_codec_generic edac_mce_amd ledtrig_audio mt76 snd_hda_codec_hdmi hid_multitouch snd_hda_intel mac80211 kvm_amd snd_intel_dspcfg vfat asus_nb_wmi snd_intel_sdw_acpi libarc4 fat amdgpu kvm irqbypass snd_hda_codec asus_wmi snd_pci_acp6x crct10dif_pclmul cfg80211 snd_hda_core sparse_keymap crc32_pclmul snd_hwdep ghash_clmulni_intel platform_profile snd_pcm aesni_intel i8042 snd_timer crypto_simd gpu_sched snd_pci_acp5x cryptd serio ucsi_acpi sp5100_tco snd drm_ttm_helper snd_rn_pci_acp3x rapl typec_ucsi pcspkr wmi_bmof rfkill ttm soundcore ccp snd_pci_acp3x i2c_piix4 k10temp tpm_crb typec tpm_tis
Jan 29 14:21:58 chronos kernel: roles mac_hid tpm_tis_core i2c_hid_acpi tpm i2c_hid amd_pmc rng_core acpi_cpufreq asus_wireless pinctrl_amd pkcs8_key_parser crypto_user fuse bpf_preload ip_tables x_tables usbhid ext4 crc32c_generic crc16 mbcache jbd2 xhci_pci crc32c_intel xhci_pci_renesas wmi video
Jan 29 14:21:58 chronos kernel: CPU: 12 PID: 396 Comm: systemd-udevd Tainted: G W 5.16.3-arch1-1 #1 ca51a3fe35922d501638d513dc9548a2c4fed987
Jan 29 14:21:58 chronos kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Zephyrus G14 GA401QM_GA401QM/GA401QM, BIOS GA401QM.410 12/13/2021
Jan 29 14:21:58 chronos kernel: Call Trace:
Jan 29 14:21:58 chronos kernel: <TASK>
Jan 29 14:21:58 chronos kernel: dump_stack_lvl+0x48/0x66
Jan 29 14:21:58 chronos kernel: bad_page.cold+0x63/0x94
Jan 29 14:21:58 chronos kernel: free_pcppages_bulk+0x1f2/0x380
Jan 29 14:21:58 chronos kernel: free_unref_page+0xbd/0x140
Jan 29 14:21:58 chronos kernel: mt76_dma_rx_cleanup+0x94/0x120 [mt76 d94b4c9690089b7441d9b3262ec58606565d1b82]
Jan 29 14:21:58 chronos kernel: mt7921_wpdma_reset+0xbc/0x1c0 [mt7921e 7e95012acfae7cc199e541d3b3dbe15de0128110]
Jan 29 14:21:58 chronos kernel: mt7921_register_device+0x32b/0x5e0 [mt7921_common 19fe4291bf468cdc820d57b91bfc4be907d53377]
Jan 29 14:21:58 chronos kernel: mt7921_pci_probe+0x1f1/0x230 [mt7921e 7e95012acfae7cc199e541d3b3dbe15de0128110]
Jan 29 14:21:58 chronos kernel: ? __pm_runtime_resume+0x58/0x80
Jan 29 14:21:58 chronos kernel: local_pci_probe+0x45/0x90
Jan 29 14:21:58 chronos kernel: ? pci_match_device+0xdf/0x140
Jan 29 14:21:58 chronos kernel: pci_device_probe+0xcf/0x1c0
Jan 29 14:21:58 chronos kernel: really_probe+0x203/0x400
Jan 29 14:21:58 chronos kernel: __driver_probe_device+0x112/0x190
Jan 29 14:21:58 chronos kernel: driver_probe_device+0x1e/0x90
Jan 29 14:21:58 chronos kernel: __driver_attach+0xc8/0x1e0
Jan 29 14:21:58 chronos kernel: ? __device_attach_driver+0xf0/0xf0
Jan 29 14:21:58 chronos kernel: ? __device_attach_driver+0xf0/0xf0
Jan 29 14:21:58 chronos kernel: bus_for_each_dev+0x8d/0xe0
Jan 29 14:21:58 chronos kernel: bus_add_driver+0x154/0x200
Jan 29 14:21:58 chronos kernel: driver_register+0x8f/0xf0
Jan 29 14:21:58 chronos kernel: ? 0xffffffffc0753000
Jan 29 14:21:58 chronos kernel: do_one_initcall+0x57/0x220
Jan 29 14:21:58 chronos kernel: do_init_module+0x5c/0x270
Jan 29 14:21:58 chronos kernel: load_module+0x25d7/0x27a0
Jan 29 14:21:58 chronos kernel: ? __alloc_pages_bulk+0x5e7/0x740
Jan 29 14:21:58 chronos kernel: ? __do_sys_init_module+0x12e/0x1b0
Jan 29 14:21:58 chronos kernel: __do_sys_init_module+0x12e/0x1b0
Jan 29 14:21:58 chronos kernel: do_syscall_64+0x5c/0x90
Jan 29 14:21:58 chronos kernel: ? ksys_read+0x67/0xf0
Jan 29 14:21:58 chronos kernel: ? exc_page_fault+0x72/0x180
Jan 29 14:21:58 chronos kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Jan 29 14:21:58 chronos kernel: RIP: 0033:0x7f261332632e
Jan 29 14:21:58 chronos kernel: Code: 48 8b 0d 45 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 12 0b 0c 00 f7 d8 64 89 01 48
Jan 29 14:21:58 chronos kernel: RSP: 002b:00007ffd2b6d7fd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Jan 29 14:21:58 chronos kernel: RAX: ffffffffffffffda RBX: 000056487d08e8b0 RCX: 00007f261332632e
Jan 29 14:21:58 chronos kernel: RDX: 00007f261347aa9d RSI: 000000000002b3af RDI: 000056487d408720
Jan 29 14:21:58 chronos kernel: RBP: 000056487d408720 R08: 27d4eb2f165667c5 R09: 0000000000000000
Jan 29 14:21:58 chronos kernel: R10: 000056487d0c3830 R11: 0000000000000246 R12: 00007f261347aa9d
Jan 29 14:21:58 chronos kernel: R13: 0000000000000001 R14: 000056487d0e5bd0 R15: 000056487d08e8b0
Jan 29 14:21:58 chronos kernel: </TASK>
Jan 29 14:21:58 chronos kernel: Disabling lock debugging due to kernel taint
This is not a regression though, as it happens in 5.16.2 too.
Here's a full dmesg of the crash: http://dpaste.org/xtt5
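One quick way to see which drivers take part in a trace like the one above is to keep only the frames that carry a module name in square brackets. This is a rough sketch that assumes the frame shape "symbol+0xoffset/0xsize [module hash]" seen in this log; the helper name `module_frames` is made up for the illustration.

```python
import re

# Assumed frame shape, as printed in the trace above:
#   "mt7921_wpdma_reset+0xbc/0x1c0 [mt7921e 7e95012a...]"
# Frames without a "[module ...]" suffix belong to the core kernel and are skipped.
FRAME_RE = re.compile(r"([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)")


def module_frames(lines):
    """Return (module, symbol) pairs for every frame attributed to a loadable module."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group(2), m.group(1)))
    return frames


# A few frames copied from the trace above.
sample = [
    "Jan 29 14:21:58 chronos kernel: free_unref_page+0xbd/0x140",
    "Jan 29 14:21:58 chronos kernel: mt76_dma_rx_cleanup+0x94/0x120 [mt76 d94b4c9690089b7441d9b3262ec58606565d1b82]",
    "Jan 29 14:21:58 chronos kernel: mt7921_wpdma_reset+0xbc/0x1c0 [mt7921e 7e95012acfae7cc199e541d3b3dbe15de0128110]",
    "Jan 29 14:21:58 chronos kernel: local_pci_probe+0x45/0x90",
]
frames = module_frames(sample)
for module, symbol in frames:
    print(f"{module}: {symbol}")
```

In this trace every module-attributed frame comes from mt76, mt7921e, or mt7921_common, which is consistent with the crash happening inside the driver's probe path.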
On Sat, Jan 29, 2022 at 01:05:50PM -0600, Felipe Contreras wrote:
> Hello,
> I've always had trouble with this driver in my Asus Zephyrus laptop, but I was able to use it eventually, that's until 5.16.3 landed.
> This version completely broke it. I'm unable to bring the interface up, no matter what I try.
> Before, sometimes I was able to make the chip work by suspending the laptop, but in 5.16.3 the machine doesn't wake up (which is probably another issue).
> Reverting back to 5.16.2 makes it work.
> Let me know if you need more information, or if you would like me to bisect the issue.
Using 'git bisect' would be best, so we know what commit exactly causes the problems.
thanks,
greg k-h
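For readers unfamiliar with the suggestion above: bisecting is a binary search over the commits between a known-good and a known-bad version, so the number of kernels to build and test grows with the logarithm of the range rather than its length. The following is a minimal sketch of that idea only, not of git's implementation. It assumes a linear history and a breakage that persists once introduced, and the commit names `c1` to `c5` are placeholders.

```python
def bisect_first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and the breakage persists once introduced.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # breakage is at mid or earlier
        else:
            good = mid  # breakage is after mid
    return commits[bad]


# Hypothetical linear history between the two releases; c1..c5 are placeholders.
history = ["v5.16.2", "c1", "c2", "c3", "c4", "c5", "v5.16.3"]
broken_since = history.index("c3")
tested = []


def is_bad(commit):
    # Stands in for "build this kernel, boot it, try to bring the interface up".
    tested.append(commit)
    return history.index(commit) >= broken_since


first_bad = bisect_first_bad(history, is_bad)
print(first_bad, "found after testing", tested)
```

With git itself the same search is driven by `git bisect start`, `git bisect good <rev>` and `git bisect bad <rev>`, with git choosing the next commit to test at each step.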
On Sun, Jan 30, 2022 at 1:28 AM Greg KH <greg@kroah.com> wrote:
> On Sat, Jan 29, 2022 at 01:05:50PM -0600, Felipe Contreras wrote:
> > Hello,
> > I've always had trouble with this driver in my Asus Zephyrus laptop, but I was able to use it eventually, that's until 5.16.3 landed.
> > This version completely broke it. I'm unable to bring the interface up, no matter what I try.
> > Before, sometimes I was able to make the chip work by suspending the laptop, but in 5.16.3 the machine doesn't wake up (which is probably another issue).
> > Reverting back to 5.16.2 makes it work.
> > Let me know if you need more information, or if you would like me to bisect the issue.
> Using 'git bisect' would be best, so we know what commit exactly causes the problems.
I know, but it has been a while since I've created a decent config file to build a kernel.
Either way, I pushed forward and the commit is a38b94c43943.
Upstream commit 547224024579 introduced a regression that was fixed by the next commit 680a2ead741a, but the second commit was never merged to stable.
I've sent the second commit to fix the regression.
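The situation described above, where a commit is backported to stable but its follow-up fix is not, can be checked for mechanically when the relationship between commits is known. This is only a sketch under stated assumptions: the two upstream IDs are the ones quoted in this message, but the explicit "fixes" link is written by hand here (the thread does not say whether the second commit carries a `Fixes:` tag), and `missing_fixes` is a made-up helper, not an existing tool.

```python
def missing_fixes(upstream, backported):
    """List upstream commits that fix a backported commit but were not backported.

    upstream: list of (commit_id, fixes_id_or_None) pairs in upstream order.
    backported: set of upstream commit IDs already present in the stable tree.
    """
    return [
        commit
        for commit, fixes in upstream
        if fixes in backported and commit not in backported
    ]


# IDs as quoted in the thread; the "fixes" relationship is an assumption
# entered by hand for this illustration.
upstream = [
    ("547224024579", None),            # introduced the regression
    ("680a2ead741a", "547224024579"),  # the follow-up fix
]
backported = {"547224024579"}          # only the first commit reached stable

result = missing_fixes(upstream, backported)
print(result)
```

Once the second commit is added to `backported`, the function returns an empty list.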
On Sun, Jan 30, 2022 at 02:07:32AM -0600, Felipe Contreras wrote:
> On Sun, Jan 30, 2022 at 1:28 AM Greg KH <greg@kroah.com> wrote:
> > On Sat, Jan 29, 2022 at 01:05:50PM -0600, Felipe Contreras wrote:
> > > Hello,
> > > I've always had trouble with this driver in my Asus Zephyrus laptop, but I was able to use it eventually, that's until 5.16.3 landed.
> > > This version completely broke it. I'm unable to bring the interface up, no matter what I try.
> > > Before, sometimes I was able to make the chip work by suspending the laptop, but in 5.16.3 the machine doesn't wake up (which is probably another issue).
> > > Reverting back to 5.16.2 makes it work.
> > > Let me know if you need more information, or if you would like me to bisect the issue.
> > Using 'git bisect' would be best, so we know what commit exactly causes the problems.
> I know, but it has been a while since I've created a decent config file to build a kernel.
> Either way, I pushed forward and the commit is a38b94c43943.
> Upstream commit 547224024579 introduced a regression that was fixed by the next commit 680a2ead741a, but the second commit was never merged to stable.
> I've sent the second commit to fix the regression.
Wonderful, thanks for figuring this out and sending the fix.
greg k-h
I can confirm this regression. I have an Asus TUF laptop.
I've also tried "resetting" the chip by holding down the power button for 60 seconds. This usually helped in previous versions.
If it helps, this issue does not exist in Windows (I have a dualboot).
This is the dmesg log when trying to bring the interface up: https://dpaste.org/Wouy
Happy to help diagnose this further.
Thanks,
Abhijeet
On 30/01/22 00:35, Felipe Contreras wrote:
> Hello,
> I've always had trouble with this driver in my Asus Zephyrus laptop, but I was able to use it eventually, that's until 5.16.3 landed.
> This version completely broke it. I'm unable to bring the interface up, no matter what I try.
> Before, sometimes I was able to make the chip work by suspending the laptop, but in 5.16.3 the machine doesn't wake up (which is probably another issue).
> Reverting back to 5.16.2 makes it work.
> Let me know if you need more information, or if you would like me to bisect the issue.
> Cheers.
linux-stable-mirror@lists.linaro.org