When I converted rk808 to device managed resources I converted the rk808
specific pm_power_off handler to devm_register_sys_off_handler() using
SYS_OFF_MODE_POWER_OFF_PREPARE, which is allowed to sleep. I did this
because the driver's poweroff function makes use of regmap and the backend
of that might sleep.
But the PMIC poweroff function will kill off the board power and the
kernel does some extra steps after the prepare handler. Thus the prepare
handler should not be used for the PMIC's poweroff routine. Instead the
normal SYS_OFF_MODE_POWER_OFF phase should be used. The old pm_power_off
method is also being called from there, so this would have been a
cleaner conversion anyways.
But it still makes sense to investigate the sleep handling and check
if there are any issues. Apparently the Rockchip and Meson I2C drivers
(the only platforms using the PMICs handled by this driver) both have
support for atomic transfers and thus may be called from the proper
poweroff context.
Things are different on the SPI side. That is so far only used by rk806
and that one is only used by Rockchip RK3588. Unfortunately the Rockchip
SPI driver does not support atomic transfers. That means using the
normal POWER_OFF handler would introduce the following error splash
during shutdown on all RK3588 boards currently supported upstream:
[ 13.761353] ------------[ cut here ]------------
[ 13.761764] Voluntary context switch within RCU read-side critical section!
[ 13.761776] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree_plugin.h:330 rcu_note_context_switch+0x3ac/0x404
[ 13.763219] Modules linked in:
[ 13.763498] CPU: 0 UID: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.10.0-12284-g2818a9a19514 #1499
[ 13.764297] Hardware name: Rockchip RK3588 EVB1 V10 Board (DT)
[ 13.764812] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 13.765427] pc : rcu_note_context_switch+0x3ac/0x404
[ 13.765871] lr : rcu_note_context_switch+0x3ac/0x404
[ 13.766314] sp : ffff800084f4b5b0
[ 13.766609] x29: ffff800084f4b5b0 x28: ffff00040139b800 x27: 00007dfb4439ae80
[ 13.767245] x26: ffff00040139bc80 x25: 0000000000000000 x24: ffff800082118470
[ 13.767880] x23: 0000000000000000 x22: ffff000400300000 x21: ffff000400300000
[ 13.768515] x20: ffff800083a9d600 x19: ffff0004fee48600 x18: fffffffffffed448
[ 13.769151] x17: 000000040044ffff x16: 005000f2b5503510 x15: 0000000000000048
[ 13.769787] x14: fffffffffffed490 x13: ffff80008473b3c0 x12: 0000000000000900
[ 13.770421] x11: 0000000000000300 x10: ffff800084797bc0 x9 : ffff80008473b3c0
[ 13.771057] x8 : 00000000ffffefff x7 : ffff8000847933c0 x6 : 0000000000000300
[ 13.771692] x5 : 0000000000000301 x4 : 40000000fffff300 x3 : 0000000000000000
[ 13.772328] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000400300000
[ 13.772964] Call trace:
[ 13.773184] rcu_note_context_switch+0x3ac/0x404
[ 13.773598] __schedule+0x94/0xb0c
[ 13.773907] schedule+0x34/0x104
[ 13.774198] schedule_timeout+0x84/0xfc
[ 13.774544] wait_for_completion_timeout+0x78/0x14c
[ 13.774980] spi_transfer_one_message+0x588/0x690
[ 13.775403] __spi_pump_transfer_message+0x19c/0x4ec
[ 13.775846] __spi_sync+0x2a8/0x3c4
[ 13.776161] spi_write_then_read+0x120/0x208
[ 13.776543] rk806_spi_bus_read+0x54/0x88
[ 13.776905] _regmap_raw_read+0xec/0x16c
[ 13.777257] _regmap_bus_read+0x44/0x7c
[ 13.777601] _regmap_read+0x60/0xd8
[ 13.777915] _regmap_update_bits+0xf4/0x13c
[ 13.778289] regmap_update_bits_base+0x64/0x98
[ 13.778686] rk808_power_off+0x70/0xfc
[ 13.779024] sys_off_notify+0x40/0x6c
[ 13.779356] atomic_notifier_call_chain+0x60/0x90
[ 13.779776] do_kernel_power_off+0x54/0x6c
[ 13.780146] machine_power_off+0x18/0x24
[ 13.780499] kernel_power_off+0x70/0x7c
[ 13.780845] __do_sys_reboot+0x210/0x270
[ 13.781198] __arm64_sys_reboot+0x24/0x30
[ 13.781558] invoke_syscall+0x48/0x10c
[ 13.781897] el0_svc_common+0x3c/0xe8
[ 13.782228] do_el0_svc+0x20/0x2c
[ 13.782528] el0_svc+0x34/0xd8
[ 13.782806] el0t_64_sync_handler+0x120/0x12c
[ 13.783197] el0t_64_sync+0x190/0x194
[ 13.783527] ---[ end trace 0000000000000000 ]---
To avoid this we keep the SYS_OFF_MODE_POWER_OFF_PREPARE handler for the
SPI backend. This is not great, but at least avoids regressions and the
fix should be small enough to allow backporting.
As a side-effect this also works around a shutdown problem on the Asus
C201. For reasons unknown that skips calling the prepare handler and
directly calls the final shutdown handler.
Fixes: 4fec8a5a85c49 ("mfd: rk808: Convert to device managed resources")
Cc: stable(a)vger.kernel.org
Reported-by: Urja <urja(a)urja.dev>
Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.com>
---
drivers/mfd/rk8xx-core.c | 15 +++++++++++++--
drivers/mfd/rk8xx-i2c.c | 2 +-
drivers/mfd/rk8xx-spi.c | 2 +-
include/linux/mfd/rk808.h | 2 +-
4 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/mfd/rk8xx-core.c b/drivers/mfd/rk8xx-core.c
index 5eda3c0dbbdf..757ef8181328 100644
--- a/drivers/mfd/rk8xx-core.c
+++ b/drivers/mfd/rk8xx-core.c
@@ -692,10 +692,11 @@ void rk8xx_shutdown(struct device *dev)
}
EXPORT_SYMBOL_GPL(rk8xx_shutdown);
-int rk8xx_probe(struct device *dev, int variant, unsigned int irq, struct regmap *regmap)
+int rk8xx_probe(struct device *dev, int variant, unsigned int irq, struct regmap *regmap, bool is_spi)
{
struct rk808 *rk808;
const struct rk808_reg_data *pre_init_reg;
+ enum sys_off_mode pwr_off_mode = SYS_OFF_MODE_POWER_OFF;
const struct mfd_cell *cells;
int dual_support = 0;
int nr_pre_init_regs;
@@ -785,10 +786,20 @@ int rk8xx_probe(struct device *dev, int variant, unsigned int irq, struct regmap
if (ret)
return dev_err_probe(dev, ret, "failed to add MFD devices\n");
+ /*
+ * Currently the Rockchip SPI driver always sleeps when doing SPI
+ * transfers. This is not allowed in the SYS_OFF_MODE_POWER_OFF
+ * handler, so we are using the prepare handler as a workaround.
+ * This should be removed once the Rockchip SPI driver has been
+ * adapted.
+ */
+ if (is_spi)
+ pwr_off_mode = SYS_OFF_MODE_POWER_OFF_PREPARE;
+
if (device_property_read_bool(dev, "rockchip,system-power-controller") ||
device_property_read_bool(dev, "system-power-controller")) {
ret = devm_register_sys_off_handler(dev,
- SYS_OFF_MODE_POWER_OFF_PREPARE, SYS_OFF_PRIO_HIGH,
+ pwr_off_mode, SYS_OFF_PRIO_HIGH,
&rk808_power_off, rk808);
if (ret)
return dev_err_probe(dev, ret,
diff --git a/drivers/mfd/rk8xx-i2c.c b/drivers/mfd/rk8xx-i2c.c
index 69a6b297d723..a2029decd654 100644
--- a/drivers/mfd/rk8xx-i2c.c
+++ b/drivers/mfd/rk8xx-i2c.c
@@ -189,7 +189,7 @@ static int rk8xx_i2c_probe(struct i2c_client *client)
return dev_err_probe(&client->dev, PTR_ERR(regmap),
"regmap initialization failed\n");
- return rk8xx_probe(&client->dev, data->variant, client->irq, regmap);
+ return rk8xx_probe(&client->dev, data->variant, client->irq, regmap, false);
}
static void rk8xx_i2c_shutdown(struct i2c_client *client)
diff --git a/drivers/mfd/rk8xx-spi.c b/drivers/mfd/rk8xx-spi.c
index 3405fb82ff9f..20f9428f94bb 100644
--- a/drivers/mfd/rk8xx-spi.c
+++ b/drivers/mfd/rk8xx-spi.c
@@ -94,7 +94,7 @@ static int rk8xx_spi_probe(struct spi_device *spi)
return dev_err_probe(&spi->dev, PTR_ERR(regmap),
"Failed to init regmap\n");
- return rk8xx_probe(&spi->dev, RK806_ID, spi->irq, regmap);
+ return rk8xx_probe(&spi->dev, RK806_ID, spi->irq, regmap, true);
}
static const struct of_device_id rk8xx_spi_of_match[] = {
diff --git a/include/linux/mfd/rk808.h b/include/linux/mfd/rk808.h
index 69cbea78b430..be15b84cff9e 100644
--- a/include/linux/mfd/rk808.h
+++ b/include/linux/mfd/rk808.h
@@ -1349,7 +1349,7 @@ struct rk808 {
};
void rk8xx_shutdown(struct device *dev);
-int rk8xx_probe(struct device *dev, int variant, unsigned int irq, struct regmap *regmap);
+int rk8xx_probe(struct device *dev, int variant, unsigned int irq, struct regmap *regmap, bool is_spi);
int rk8xx_suspend(struct device *dev);
int rk8xx_resume(struct device *dev);
--
2.43.0
Commit 60e3318e3e900 ("cifs: use fs_context for automounts") was
released in v6.1.54 and broke the failover when one of the servers
inside DFS becomes unavailable. We reproduced the problem on the EC2
instances of different types. Reverting aforementioned commint on top of
the latest stable verison v6.1.94 helps to resolve the problem.
Earliest working version is v6.2-rc1. There were two big merges of CIFS fixes:
[1] and [2]. We would like to ask for the help to investigate this problem and
if some of those patches need to be backported. Also, is it safe to just revert
problematic commit until proper fixes/backports will be available?
We will help to do testing and confirm if fix works, but let me also list the
steps we used to reproduce the problem if it will help to identify the problem:
1. Create Active Directory domain eg. 'corp.fsxtest.local' in AWS Directory
Service with:
- three AWS FSX file systems filesystem1..filesystem3
- three Windows servers; They have DFS installed as per
https://learn.microsoft.com/en-us/windows-server/storage/dfs-namespaces/dfs…:
- dfs-srv1: EC2AMAZ-2EGTM59
- dfs-srv2: EC2AMAZ-1N36PRD
- dfs-srv3: EC2AMAZ-0PAUH2U
2. Create DFS namespace eg. 'dfs-namespace' in Windows server 2008 mode
and three folders targets in it:
- referral-a mapped to filesystem1.corp.local
- referral-b mapped to filesystem2.corp.local
- referral-c mapped to filesystem3.corp.local
- local folders dfs-srv1..dfs-srv3 in C:\DFSRoots\dfs-namespace of every
Windows server. This helps to quickly define underlying server when
DFS is mounted.
3. Enabled cifs debug logs:
```
echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
echo 7 > /proc/fs/cifs/cifsFYI
```
4. Mount DFS namespace on Amazon Linux 2023 instance running any vanilla
kernel v6.1.54+:
```
dmesg -c &>/dev/null
cd /mnt
mount -t cifs -o cred=/mnt/creds,echo_interval=5 \
//corp.fsxtest.local/dfs-namespace \
./dfs-namespace
```
5. List DFS root, it's also required to avoid recursive mounts that happen
during regular 'ls' run:
```
sh -c 'ls dfs-namespace'
dfs-srv2 referral-a referral-b
```
The DFS server is EC2AMAZ-1N36PRD, it's also listed in mount:
```
[root@ip-172-31-2-82 mnt]# mount | grep dfs
//corp.fsxtest.local/dfs-namespace on /mnt/dfs-namespace type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.11.26,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
//EC2AMAZ-1N36PRD.corp.fsxtest.local/dfs-namespace/referral-a on /mnt/dfs-namespace/referral-a type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.12.80,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
```
List files in first folder:
```
sh -c 'ls dfs-namespace/referral-a'
filea.txt.txt
```
6. Shutdown DFS server-2.
List DFS root again, server changed from dfs-srv2 to dfs-srv1 EC2AMAZ-2EGTM59:
```
sh -c 'ls dfs-namespace'
dfs-srv1 referral-a referral-b
```
7. Try to list files in another folder, this causes ls to fail with error:
```
sh -c 'ls dfs-namespace/referral-b'
ls: cannot access 'dfs-namespace/referral-b': No route to host```
Sometimes it's also 'Operation now in progress' error.
mount shows the same output:
```
//corp.fsxtest.local/dfs-namespace on /mnt/dfs-namespace type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.11.26,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
//EC2AMAZ-1N36PRD.corp.fsxtest.local/dfs-namespace/referral-a on /mnt/dfs-namespace/referral-a type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.12.80,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
```
I also attached kernel debug logs from this test.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Reported-by: Andrei Paniakin <apanyaki(a)amazon.com>
Bisected-by: Simba Bonga <simbarb(a)amazon.com>
---
#regzbot introduced: v6.1.54..v6.2-rc1
From: Johannes Berg <johannes.berg(a)intel.com>
[ Upstream commit 23daf1b4c91db9b26f8425cc7039cf96d22ccbfe ]
Setting the AP channel width is meant for use with the normal
20/40/... MHz channel width progression, and switching around
in S1G or narrow channels isn't supported. Disallow that.
Reported-by: syzbot+bc0f5b92cc7091f45fb6(a)syzkaller.appspotmail.com
Link: https://msgid.link/20240515141600.d4a9590bfe32.I19a32d60097e81b527eafe6b092…
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
net/wireless/nl80211.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 72c7bf5585816..81d5bf186180f 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -3419,6 +3419,33 @@ static int __nl80211_set_channel(struct cfg80211_registered_device *rdev,
if (chandef.chan != cur_chan)
return -EBUSY;
+ /* only allow this for regular channel widths */
+ switch (wdev->links[link_id].ap.chandef.width) {
+ case NL80211_CHAN_WIDTH_20_NOHT:
+ case NL80211_CHAN_WIDTH_20:
+ case NL80211_CHAN_WIDTH_40:
+ case NL80211_CHAN_WIDTH_80:
+ case NL80211_CHAN_WIDTH_80P80:
+ case NL80211_CHAN_WIDTH_160:
+ case NL80211_CHAN_WIDTH_320:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ switch (chandef.width) {
+ case NL80211_CHAN_WIDTH_20_NOHT:
+ case NL80211_CHAN_WIDTH_20:
+ case NL80211_CHAN_WIDTH_40:
+ case NL80211_CHAN_WIDTH_80:
+ case NL80211_CHAN_WIDTH_80P80:
+ case NL80211_CHAN_WIDTH_160:
+ case NL80211_CHAN_WIDTH_320:
+ break;
+ default:
+ return -EINVAL;
+ }
+
result = rdev_set_ap_chanwidth(rdev, dev, link_id,
&chandef);
if (result)
--
2.43.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x ccc45cb4e7271c74dbb27776ae8f73d84557f5c6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023061111-tracing-shakiness-9054@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ccc45cb4e7271c74dbb27776ae8f73d84557f5c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jan=20H=C3=B6ppner?= <hoeppner(a)linux.ibm.com>
Date: Fri, 9 Jun 2023 17:37:50 +0200
Subject: [PATCH] s390/dasd: Use correct lock while counting channel queue
length
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The lock around counting the channel queue length in the BIODASDINFO
ioctl was incorrectly changed to the dasd_block->queue_lock with commit
583d6535cb9d ("dasd: remove dead code"). This can lead to endless list
iterations and a subsequent crash.
The queue_lock is supposed to be used only for queue lists belonging to
dasd_block. For dasd_device related queue lists the ccwdev lock must be
used.
Fix the mentioned issues by correctly using the ccwdev lock instead of
the queue lock.
Fixes: 583d6535cb9d ("dasd: remove dead code")
Cc: stable(a)vger.kernel.org # v5.0+
Signed-off-by: Jan Höppner <hoeppner(a)linux.ibm.com>
Reviewed-by: Stefan Haberland <sth(a)linux.ibm.com>
Signed-off-by: Stefan Haberland <sth(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20230609153750.1258763-2-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 9327dcdd6e5e..8fca725b3dae 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -552,10 +552,10 @@ static int __dasd_ioctl_information(struct dasd_block *block,
memcpy(dasd_info->type, base->discipline->name, 4);
- spin_lock_irqsave(&block->queue_lock, flags);
+ spin_lock_irqsave(get_ccwdev_lock(base->cdev), flags);
list_for_each(l, &base->ccw_queue)
dasd_info->chanq_len++;
- spin_unlock_irqrestore(&block->queue_lock, flags);
+ spin_unlock_irqrestore(get_ccwdev_lock(base->cdev), flags);
return 0;
}
This reverts commit 484fd6c1de13b336806a967908a927cc0356e312. The
commit caused a regression because now the umask was applied to
symlinks and the fix is unnecessary because the umask/O_TMPFILE bug
has been fixed somewhere else already.
Fixes: https://lore.kernel.org/lkml/28DSITL9912E1.2LSZUVTGTO52Q@mforney.org/
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
fs/ext4/acl.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/fs/ext4/acl.h b/fs/ext4/acl.h
index ef4c19e5f570..0c5a79c3b5d4 100644
--- a/fs/ext4/acl.h
+++ b/fs/ext4/acl.h
@@ -68,11 +68,6 @@ extern int ext4_init_acl(handle_t *, struct inode *, struct inode *);
static inline int
ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir)
{
- /* usually, the umask is applied by posix_acl_create(), but if
- ext4 ACL support is disabled at compile time, we need to do
- it here, because posix_acl_create() will never be called */
- inode->i_mode &= ~current_umask();
-
return 0;
}
#endif /* CONFIG_EXT4_FS_POSIX_ACL */
--
2.39.2
Hi
wpa_supplicant 2.11 broke Wi-Fi on T2 Macs as well, but this patch doesn't seem to be fixing Wi-Fi. Instead, it's breaking it even on older 2.10 wpa_supplicant. Tested by a user on bcm4364b2 wifi chip with a WPA2-PSK [AES] network. dmesg output:
However dmesg outputs more info
[ 5.852978] usbcore: registered new interface driver brcmfmac
[ 5.853114] brcmfmac 0000:01:00.0: enabling device (0000 -> 0002)
[ 5.992212] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac4364b2-pcie for chip BCM4364/3
[ 5.993923] brcmfmac 0000:01:00.0: Direct firmware load for brcm/brcmfmac4364b2-pcie.apple,maui-HRPN-u-7.5-X0.bin failed with error -2
[ 5.993968] brcmfmac 0000:01:00.0: Direct firmware load for brcm/brcmfmac4364b2-pcie.apple,maui-HRPN-u-7.5.bin failed with error -2
[ 5.994004] brcmfmac 0000:01:00.0: Direct firmware load for brcm/brcmfmac4364b2-pcie.apple,maui-HRPN-u.bin failed with error -2
[ 5.994041] brcmfmac 0000:01:00.0: Direct firmware load for brcm/brcmfmac4364b2-pcie.apple,maui-HRPN.bin failed with error -2
[ 5.994076] brcmfmac 0000:01:00.0: Direct firmware load for brcm/brcmfmac4364b2-pcie.apple,maui-X0.bin failed with error -2
[ 6.162830] Bluetooth: hci0: BCM: 'brcm/BCM.hcd'
[ 6.796637] brcmfmac: brcmf_c_process_txcap_blob: TxCap blob found, loading
[ 6.798396] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4364/3 wl0: Jul 10 2023 12:30:19 version 9.30.503.0.32.5.92 FWID 01-88a8883
[ 6.885876] brcmfmac 0000:01:00.0 wlp1s0: renamed from wlan0
[ 8.195243] ieee80211 phy0: brcmf_p2p_set_firmware: failed to update device address ret -52
[ 8.196584] ieee80211 phy0: brcmf_p2p_create_p2pdev: set p2p_disc error
[ 8.196588] ieee80211 phy0: brcmf_cfg80211_add_iface: add iface p2p-dev-wlp1s0 type 10 failed: err=-52
It is observed sometimes when tethering is used over NCM with Windows 11
as host, at some instances, the gadget_giveback has one byte appended at
the end of a proper NTB. When the NTB is parsed, unwrap call looks for
any leftover bytes in SKB provided by u_ether and if there are any pending
bytes, it treats them as a separate NTB and parses it. But in case the
second NTB (as per unwrap call) is faulty/corrupt, all the datagrams that
were parsed properly in the first NTB and saved in rx_list are dropped.
Adding a few custom traces showed the following:
[002] d..1 7828.532866: dwc3_gadget_giveback: ep1out:
req 000000003868811a length 1025/16384 zsI ==> 0
[002] d..1 7828.532867: ncm_unwrap_ntb: K: ncm_unwrap_ntb toprocess: 1025
[002] d..1 7828.532867: ncm_unwrap_ntb: K: ncm_unwrap_ntb nth: 1751999342
[002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb seq: 0xce67
[002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb blk_len: 0x400
[002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb ndp_len: 0x10
[002] d..1 7828.532869: ncm_unwrap_ntb: K: Parsed NTB with 1 frames
In this case, the giveback is of 1025 bytes and block length is 1024.
The rest 1 byte (which is 0x00) won't be parsed resulting in drop of
all datagrams in rx_list.
Same is case with packets of size 2048:
[002] d..1 7828.557948: dwc3_gadget_giveback: ep1out:
req 0000000011dfd96e length 2049/16384 zsI ==> 0
[002] d..1 7828.557949: ncm_unwrap_ntb: K: ncm_unwrap_ntb nth: 1751999342
[002] d..1 7828.557950: ncm_unwrap_ntb: K: ncm_unwrap_ntb blk_len: 0x800
Lecroy shows one byte coming in extra confirming that the byte is coming
in from PC:
Transfer 2959 - Bytes Transferred(1025) Timestamp((18.524 843 590)
- Transaction 8391 - Data(1025 bytes) Timestamp(18.524 843 590)
--- Packet 4063861
Data(1024 bytes)
Duration(2.117us) Idle(14.700ns) Timestamp(18.524 843 590)
--- Packet 4063863
Data(1 byte)
Duration(66.160ns) Time(282.000ns) Timestamp(18.524 845 722)
According to Windows driver, no ZLP is needed if wBlockLength is non-zero,
because the non-zero wBlockLength has already told the function side the
size of transfer to be expected. However, there are in-market NCM devices
that rely on ZLP as long as the wBlockLength is multiple of wMaxPacketSize.
To deal with such devices, it pads an extra 0 at end so the transfer is no
longer multiple of wMaxPacketSize.
Cc: <stable(a)vger.kernel.org>
Fixes: 9f6ce4240a2b ("usb: gadget: f_ncm.c added")
Signed-off-by: Krishna Kurapati <quic_kriskura(a)quicinc.com>
---
Link to v2:
https://lore.kernel.org/all/20240131150332.1326523-1-quic_kriskura@quicinc.…
Changes in v2:
Added check to see if the padded byte is 0x00.
Changes in v3:
Removed wMaxPacketSize check from v2.
drivers/usb/gadget/function/f_ncm.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index ca5d5f564998..e2a059cfda2c 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -1338,7 +1338,15 @@ static int ncm_unwrap_ntb(struct gether *port,
"Parsed NTB with %d frames\n", dgram_counter);
to_process -= block_len;
- if (to_process != 0) {
+
+ /*
+ * Windows NCM driver avoids USB ZLPs by adding a 1-byte
+ * zero pad as needed.
+ */
+ if (to_process == 1 &&
+ (*(unsigned char *)(ntb_ptr + block_len) == 0x00)) {
+ to_process--;
+ } else if (to_process > 0) {
ntb_ptr = (unsigned char *)(ntb_ptr + block_len);
goto parse_ntb;
}
--
2.34.1
From: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions
are not possible between the individual functions. So it is ok to treat
them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broa…
Signed-off-by: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4d4267105cd2b..842e8fecf0a9a 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4971,6 +4971,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs },
/* Amazon Annapurna Labs */
{ PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs },
--
2.43.0