Since: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") we wait for all the workloads to be completed during shutdown. This was done to avoid a stall once the device is started again.
Unfortunately this has the side effect of stalling kexec(), if the userspace is frozen. Let's handle that case.
To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com To: Liam Girdwood lgirdwood@gmail.com To: Peter Ujfalusi peter.ujfalusi@linux.intel.com To: Bard Liao yung-chuan.liao@linux.intel.com To: Ranjani Sridharan ranjani.sridharan@linux.intel.com To: Kai Vehmanen kai.vehmanen@linux.intel.com To: Daniel Baluta daniel.baluta@nxp.com To: Mark Brown broonie@kernel.org To: Jaroslav Kysela perex@perex.cz To: Takashi Iwai tiwai@suse.com To: Eric Biederman ebiederm@xmission.com To: Chromeos Kdump chromeos-kdump@google.com To: Steven Rostedt rostedt@goodmis.org Cc: stable@vger.kernel.org Cc: sound-open-firmware@alsa-project.org Cc: alsa-devel@alsa-project.org Cc: linux-kernel@vger.kernel.org Cc: kexec@lists.infradead.org Signed-off-by: Ricardo Ribalda ribalda@chromium.org --- Changes in v7: - Fix commit message (Thanks Pierre-Louis). - Link to v6: https://lore.kernel.org/r/20221127-snd-freeze-v6-0-3e90553f64a5@chromium.org
Changes in v6: - Check if we are in kexec with the userspace frozen. - Link to v5: https://lore.kernel.org/r/20221127-snd-freeze-v5-0-4ededeb08ba0@chromium.org
Changes in v5: - Edit subject prefix. - Link to v4: https://lore.kernel.org/r/20221127-snd-freeze-v4-0-51ca64b7f2ab@chromium.org
Changes in v4: - Do not call snd_sof_machine_unregister from shutdown. - Link to v3: https://lore.kernel.org/r/20221127-snd-freeze-v3-0-a2eda731ca14@chromium.org
Changes in v3: - Wrap pm_freezing in a function. - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
Changes in v2: - Only use pm_freezing if CONFIG_FREEZER . - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
--- Ricardo Ribalda (2): kexec: Introduce kexec_with_frozen_processes ASoC: SOF: Fix deadlock when shutdown a frozen userspace
include/linux/kexec.h | 3 +++ kernel/kexec_core.c | 5 +++++ sound/soc/sof/core.c | 4 +++- 3 files changed, 11 insertions(+), 1 deletion(-) --- base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 change-id: 20221127-snd-freeze-1ee143228326
Best regards,
Drivers running .shutdown() might want to wait for userspace to complete before exiting.
If userspace is frozen and we are running kexec they will stall the computer.
Add a way for them to figure out if they should just skip waiting for userspace.
Cc: stable@vger.kernel.org Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") Signed-off-by: Ricardo Ribalda ribalda@chromium.org --- include/linux/kexec.h | 3 +++ kernel/kexec_core.c | 5 +++++ 2 files changed, 8 insertions(+)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 41a686996aaa..c22711e0f7b5 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -426,6 +426,8 @@ extern int kexec_load_disabled; /* flag to track if kexec reboot is in progress */ extern bool kexec_in_progress;
+bool kexec_with_frozen_processes(void); + int crash_shrink_memory(unsigned long new_size); ssize_t crash_get_memory_size(void);
@@ -507,6 +509,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { } static inline void crash_kexec(struct pt_regs *regs) { } static inline int kexec_should_crash(struct task_struct *p) { return 0; } static inline int kexec_crash_loaded(void) { return 0; } +static inline bool kexec_with_frozen_processes(void) { return false; } #define kexec_in_progress false #endif /* CONFIG_KEXEC_CORE */
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index ca2743f9c634..8bc8257ee7ca 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -54,6 +54,11 @@ note_buf_t __percpu *crash_notes; /* Flag to indicate we are going to kexec a new kernel */ bool kexec_in_progress = false;
+bool kexec_with_frozen_processes(void) +{ + return kexec_in_progress && pm_freezing; +} +EXPORT_SYMBOL(kexec_with_frozen_processes);
/* Location of the reserved area for the crash kernel */ struct resource crashk_res = {
Hi Ricardo,
I love your patch! Yet something to improve:
[auto build test ERROR on 4312098baf37ee17a8350725e6e0d0e8590252d4]
url: https://github.com/intel-lab-lkp/linux/commits/Ricardo-Ribalda/ASoC-SOF-Fix-... base: 4312098baf37ee17a8350725e6e0d0e8590252d4 patch link: https://lore.kernel.org/r/20221127-snd-freeze-v7-1-127c582f1ca4%40chromium.o... patch subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes config: ia64-randconfig-r022-20221128 compiler: ia64-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/ddb7195380bb431ff9f4505d5fcfbe... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214 git checkout ddb7195380bb431ff9f4505d5fcfbef756b80a3a # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
kernel/kexec_core.c: In function 'kexec_with_frozen_processes':
kernel/kexec_core.c:59:37: error: 'pm_freezing' undeclared (first use in this function); did you mean 'freezing'?
59 | return kexec_in_progress && pm_freezing; | ^~~~~~~~~~~ | freezing kernel/kexec_core.c:59:37: note: each undeclared identifier is reported only once for each function it appears in kernel/kexec_core.c:60:1: error: control reaches end of non-void function [-Werror=return-type] 60 | } | ^ cc1: some warnings being treated as errors
vim +59 kernel/kexec_core.c
56 57 bool kexec_with_frozen_processes(void) 58 {
59 return kexec_in_progress && pm_freezing;
60 } 61 EXPORT_SYMBOL(kexec_with_frozen_processes); 62
Hi Ricardo,
I love your patch! Yet something to improve:
[auto build test ERROR on 4312098baf37ee17a8350725e6e0d0e8590252d4]
url: https://github.com/intel-lab-lkp/linux/commits/Ricardo-Ribalda/ASoC-SOF-Fix-... base: 4312098baf37ee17a8350725e6e0d0e8590252d4 patch link: https://lore.kernel.org/r/20221127-snd-freeze-v7-1-127c582f1ca4%40chromium.o... patch subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes config: s390-randconfig-r024-20221128 compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 6e4cea55f0d1104408b26ac574566a0e4de48036) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install s390 cross compiling tool for clang build # apt-get install binutils-s390x-linux-gnu # https://github.com/intel-lab-lkp/linux/commit/ddb7195380bb431ff9f4505d5fcfbe... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214 git checkout ddb7195380bb431ff9f4505d5fcfbef756b80a3a # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot lkp@intel.com
All error/warnings (new ones prefixed by >>):
In file included from kernel/kexec_core.c:14: In file included from include/linux/kexec.h:19: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __raw_readb(PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu' #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) ^ include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16' #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) ^ In file included from kernel/kexec_core.c:14: In file included from include/linux/kexec.h:19: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu' #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x)) ^ include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32' #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x)) ^ In file included from kernel/kexec_core.c:14: In file included from include/linux/kexec.h:19: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writeb(value, PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^
kernel/kexec_core.c:59:30: error: use of undeclared identifier 'pm_freezing'; did you mean 'freezing'?
return kexec_in_progress && pm_freezing; ^~~~~~~~~~~ freezing include/linux/freezer.h:75:20: note: 'freezing' declared here static inline bool freezing(struct task_struct *p) { return false; } ^
kernel/kexec_core.c:59:30: warning: address of function 'freezing' will always evaluate to 'true' [-Wpointer-bool-conversion]
return kexec_in_progress && pm_freezing; ~~ ^~~~~~~~~~~ kernel/kexec_core.c:59:30: note: prefix with the address-of operator to silence this warning return kexec_in_progress && pm_freezing; ^ & 13 warnings and 1 error generated.
vim +59 kernel/kexec_core.c
56 57 bool kexec_with_frozen_processes(void) 58 {
59 return kexec_in_progress && pm_freezing;
60 } 61 EXPORT_SYMBOL(kexec_with_frozen_processes); 62
During kexec(), the userspace might frozen. Therefore we cannot wait for it to complete.
During a kexec with frozen processe do not unregister the clients.
This fixes:
[ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done. [ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds. [ 246.819035] Call Trace: [ 246.821782] <TASK> [ 246.824186] __schedule+0x5f9/0x1263 [ 246.828231] schedule+0x87/0xc5 [ 246.831779] snd_card_disconnect_sync+0xb5/0x127 ... [ 246.889249] snd_sof_device_shutdown+0xb4/0x150 [ 246.899317] pci_device_shutdown+0x37/0x61 [ 246.903990] device_shutdown+0x14c/0x1d6 [ 246.908391] kernel_kexec+0x45/0xb9
And:
[ 246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds. [ 246.927709] Call Trace: [ 246.930461] <TASK> [ 246.932819] __schedule+0x5f9/0x1263 [ 246.936855] ? fsnotify_grab_connector+0x5c/0x70 [ 246.942045] schedule+0x87/0xc5 [ 246.945567] schedule_timeout+0x49/0xf3 [ 246.949877] wait_for_completion+0x86/0xe8 [ 246.954463] snd_card_free+0x68/0x89 ... [ 247.001080] platform_device_unregister+0x12/0x35
Cc: stable@vger.kernel.org Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") Signed-off-by: Ricardo Ribalda ribalda@chromium.org --- sound/soc/sof/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index 3e6141d03770..4301f347bb90 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -9,6 +9,7 @@ //
#include <linux/firmware.h> +#include <linux/kexec.h> #include <linux/module.h> #include <sound/soc.h> #include <sound/sof.h> @@ -484,7 +485,8 @@ int snd_sof_device_shutdown(struct device *dev) * make sure clients and machine driver(s) are unregistered to force * all userspace devices to be closed prior to the DSP shutdown sequence */ - sof_unregister_clients(sdev); + if (!kexec_with_frozen_processes()) + sof_unregister_clients(sdev);
snd_sof_machine_unregister(sdev, pdata);
Hi,
On Wed, 30 Nov 2022, Ricardo Ribalda wrote:
During kexec(), the userspace might frozen. Therefore we cannot wait for it to complete.
[...]
--- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -9,6 +9,7 @@ // #include <linux/firmware.h> +#include <linux/kexec.h> #include <linux/module.h> #include <sound/soc.h> #include <sound/sof.h> @@ -484,7 +485,8 @@ int snd_sof_device_shutdown(struct device *dev) * make sure clients and machine driver(s) are unregistered to force * all userspace devices to be closed prior to the DSP shutdown sequence */
- sof_unregister_clients(sdev);
- if (!kexec_with_frozen_processes())
sof_unregister_clients(sdev);
snd_sof_machine_unregister(sdev, pdata);
I think the case you hit was specifically snd_card_disconnect_sync() that gets called via snd_sof_machine_unregister(), right, so you'd have to skip both sof_unregister_clients() and the machine_unregister().
Skipping ok might be an ok solution here. There's clearly a problem and we cannot just drop these calls in the general case (when we are going to S5), but in the specific case of kexec, this is probably safe. And I agree one way or another this needs to be fixed. Pierre and others what do you think?
Br, Kai
linux-stable-mirror@lists.linaro.org