For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support") Cc: Stable@vger.kernel.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org --- sound/soc/qcom/qdsp6/q6apm-dai.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/qcom/qdsp6/q6apm-dai.c b/sound/soc/qcom/qdsp6/q6apm-dai.c index 5eb0b864c740..c90db6daabbd 100644 --- a/sound/soc/qcom/qdsp6/q6apm-dai.c +++ b/sound/soc/qcom/qdsp6/q6apm-dai.c @@ -840,6 +840,7 @@ static const struct snd_soc_component_driver q6apm_fe_dai_component = { .pointer = q6apm_dai_pointer, .trigger = q6apm_dai_trigger, .compress_ops = &q6apm_dai_compress_ops, + .use_dai_pcm_id = true, };
static int q6apm_dai_probe(struct platform_device *pdev)
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
Why is this a problem?
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support") Cc: Stable@vger.kernel.org
Won't this be an ABI change? That seems like it'd disrupt things in stable.
On Thu, Jun 29, 2023 at 04:43:57PM +0100, Mark Brown wrote:
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
Why is this a problem?
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support") Cc: Stable@vger.kernel.org
Won't this be an ABI change? That seems like it'd disrupt things in stable.
ABI changes should disrupt things just the same in Linus's tree, why is stable any different?
thanks,
greg k-h
On Thu, Jun 29, 2023 at 06:06:05PM +0200, Greg KH wrote:
On Thu, Jun 29, 2023 at 04:43:57PM +0100, Mark Brown wrote:
Won't this be an ABI change? That seems like it'd disrupt things in stable.
ABI changes should disrupt things just the same in Linus's tree, why is stable any different?
This is a numbering resulting from enumeration thing so it gets to be like the issues we've had with the order in which block and ethernet devices appear, it's on the edge the extent to which people might be relying on it. If it's causing some problem as is and there's a reason to do something (see the first half of my reply...) but the case gets even harder to make with stable.
On Thu, Jun 29, 2023 at 05:16:44PM +0100, Mark Brown wrote:
On Thu, Jun 29, 2023 at 06:06:05PM +0200, Greg KH wrote:
On Thu, Jun 29, 2023 at 04:43:57PM +0100, Mark Brown wrote:
Won't this be an ABI change? That seems like it'd disrupt things in stable.
ABI changes should disrupt things just the same in Linus's tree, why is stable any different?
This is a numbering resulting from enumeration thing so it gets to be like the issues we've had with the order in which block and ethernet devices appear, it's on the edge the extent to which people might be relying on it. If it's causing some problem as is and there's a reason to do something (see the first half of my reply...) but the case gets even harder to make with stable.
It shouldn't matter for stable or not, if the change is acceptable in Linus's tree, with the userspace visable change, then it should be acceptable in any active stable branch as well. There is no difference here for userspace api/abi rules.
thanks,
greg k-h
On Thu, Jun 29, 2023 at 07:22:51PM +0200, Greg KH wrote:
It shouldn't matter for stable or not, if the change is acceptable in Linus's tree, with the userspace visable change, then it should be acceptable in any active stable branch as well. There is no difference here for userspace api/abi rules.
As discussed before your tolerance for risk in stable is *far* higher than mine, if there's any value in doing this at all it's probably within what would get taken but that doesn't mean that it's something that it's sensible to highlight as an important fix like tagging for stable does. It's extremely unclear that it fits the severity criteria that are supposed to be being applied to stable, though obviously the documentation doesn't fit the actual practice these days.
On Thu, Jun 29, 2023 at 06:38:38PM +0100, Mark Brown wrote:
On Thu, Jun 29, 2023 at 07:22:51PM +0200, Greg KH wrote:
It shouldn't matter for stable or not, if the change is acceptable in Linus's tree, with the userspace visable change, then it should be acceptable in any active stable branch as well. There is no difference here for userspace api/abi rules.
As discussed before your tolerance for risk in stable is *far* higher than mine, if there's any value in doing this at all it's probably within what would get taken but that doesn't mean that it's something that it's sensible to highlight as an important fix like tagging for stable does. It's extremely unclear that it fits the severity criteria that are supposed to be being applied to stable, though obviously the documentation doesn't fit the actual practice these days.
It's not a matter of "tolerance for risk", it's a "if this change is good enough for future releases, why isn't it good enough for older releases as well?"
As you know, we don't break user interfaces, so either this is a break or it isn't, stable trees have nothing to do with it as a normal user would "hit" this when updating to run Linus's tree, just as easily as they would "hit" it updating their stable kernel version.
thanks,
greg k-h
On Thu, Jun 29, 2023 at 08:48:42PM +0200, Greg KH wrote:
On Thu, Jun 29, 2023 at 06:38:38PM +0100, Mark Brown wrote:
As discussed before your tolerance for risk in stable is *far* higher than mine, if there's any value in doing this at all it's probably within what would get taken but that doesn't mean that it's something that it's sensible to highlight as an important fix like tagging for stable does. It's extremely unclear that it fits the severity criteria that are supposed to be being applied to stable, though obviously the documentation doesn't fit the actual practice these days.
It's not a matter of "tolerance for risk", it's a "if this change is good enough for future releases, why isn't it good enough for older releases as well?"
As you know, we don't break user interfaces, so either this is a break or it isn't, stable trees have nothing to do with it as a normal user would "hit" this when updating to run Linus's tree, just as easily as they would "hit" it updating their stable kernel version.
You know as well as I do that we have a bunch of interfaces where things end up getting dynamically numbered as they appear, and provided to userspace together with identifying information that allows userspace to figure out what's what in a stable fashion even though the numbers might change. Like I said earlier in the thread this is one of them, better hardware support also has some risk of disturbing things (and some of the numbering is going to be hotplug dependent, though this patch isn't likely to run into that particular bit of things).
ABI stability is a continuum, from for example things relying on race conditions or other timing things that were lucky they ever worked to changes in interfaces that break clear and documented guarantees. Reliance on stability is similar, and how much of an issue it is when something does change and someone notices is going to vary depending on what changed and why. While the risk here seems low if the reasoning is just to make things neater then it's even harder to justify for a stable kernel than it is for mainline.
Note also that the patch is still under discussion for mainline...
On 29/06/2023 16:43, Mark Brown wrote:
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
Why is this a problem?
In existing Qualcomm setup the backend pcm are added first, which results in frontend pcms getting pcm numbers after this.
For example: with 3 backend dailinks in DT we have frontend pcm start at 3. Now if we add new backend dai-link in DT we now have frontend pcm start at 4.
This is a bug in qualcomm driver.
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support") Cc: Stable@vger.kernel.org
Won't this be an ABI change? That seems like it'd disrupt things in stable.
Yes, but this is a real bug. without fixing this also results in abi(pcm number) change when we add new backend dai-link. I have also sent fix for UCM to handle this.
--srini
On Thu, Jun 29, 2023 at 06:33:09PM +0100, Srinivas Kandagatla wrote:
On 29/06/2023 16:43, Mark Brown wrote:
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
Why is this a problem?
In existing Qualcomm setup the backend pcm are added first, which results in frontend pcms getting pcm numbers after this.
For example: with 3 backend dailinks in DT we have frontend pcm start at 3. Now if we add new backend dai-link in DT we now have frontend pcm start at 4.
This is a bug in qualcomm driver.
Why is this an actual problem rather than just being a bit ugly? What is the negative consequence of having a PCM with this number?
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support") Cc: Stable@vger.kernel.org
Won't this be an ABI change? That seems like it'd disrupt things in stable.
Yes, but this is a real bug. without fixing this also results in abi(pcm number) change when we add new backend dai-link. I have also sent fix for UCM to handle this.
I'm still not clear why you believe this to be a bug.
On 29/06/2023 18:42, Mark Brown wrote:
On Thu, Jun 29, 2023 at 06:33:09PM +0100, Srinivas Kandagatla wrote:
On 29/06/2023 16:43, Mark Brown wrote:
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
Why is this a problem?
In existing Qualcomm setup the backend pcm are added first, which results in frontend pcms getting pcm numbers after this.
For example: with 3 backend dailinks in DT we have frontend pcm start at 3. Now if we add new backend dai-link in DT we now have frontend pcm start at 4.
This is a bug in qualcomm driver.
Why is this an actual problem rather than just being a bit ugly? What is the negative consequence of having a PCM with this number?
Yes, it is ugly but also breaks the existing UCM as the pcm device numbers keep changing. Which is why I refereed it as bug in the driver.
--srini
On Wed, 28 Jun 2023 10:24:04 +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
[...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/1] ASoC: qdsp6: q6apm: use dai link pcm id as pcm device number commit: ac192c1a54f9562efe6bac910e6e7aae7b5fbea3
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
This appears to fix the intermittent probe breakage that I see every five boots or so:
[ 11.843320] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.867467] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.867574] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.867725] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.867933] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.868379] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.873645] snd-sc8280xp: probe of sound failed with error -22
and which I've reported here:
https://lore.kernel.org/lkml/ZIHMMFtuDtvdpFAZ@hovoldconsulting.com/
as unrelated changes in timings resulting from that series made the problem much harder (but not impossible) to hit.
With this fix, I've rebooted 20+ times without hitting the issue once.
I'm guessing that you found this issue while investigated that probe race, Srini? It does look related, and it does seem to make the problem go away, but I'm not comfortable claiming that the intermittent probe breakage has been resolved without some analysis to back that up.
Fixes: 9b4fe0f1cd79 ("ASoC: qdsp6: audioreach: add q6apm-dai support") Cc: Stable@vger.kernel.org
I noticed that Mark dropped this to avoid regressions in stable, but if this indeed fixes the probe race then we may want to consider backporting it after all.
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org
sound/soc/qcom/qdsp6/q6apm-dai.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/qcom/qdsp6/q6apm-dai.c b/sound/soc/qcom/qdsp6/q6apm-dai.c index 5eb0b864c740..c90db6daabbd 100644 --- a/sound/soc/qcom/qdsp6/q6apm-dai.c +++ b/sound/soc/qcom/qdsp6/q6apm-dai.c @@ -840,6 +840,7 @@ static const struct snd_soc_component_driver q6apm_fe_dai_component = { .pointer = q6apm_dai_pointer, .trigger = q6apm_dai_trigger, .compress_ops = &q6apm_dai_compress_ops,
- .use_dai_pcm_id = true,
}; static int q6apm_dai_probe(struct platform_device *pdev)
Johan
On Mon, Jul 03, 2023 at 09:48:34AM +0200, Johan Hovold wrote:
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
This appears to fix the intermittent probe breakage that I see every five boots or so:
[ 11.843320] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.867467] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.867574] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.867725] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.867933] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.868379] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.873645] snd-sc8280xp: probe of sound failed with error -22
and which I've reported here:
https://lore.kernel.org/lkml/ZIHMMFtuDtvdpFAZ@hovoldconsulting.com/
as unrelated changes in timings resulting from that series made the problem much harder (but not impossible) to hit.
With this fix, I've rebooted 20+ times without hitting the issue once.
I'm guessing that you found this issue while investigated that probe race, Srini? It does look related, and it does seem to make the problem go away, but I'm not comfortable claiming that the intermittent probe breakage has been resolved without some analysis to back that up.
Ok, scratch that. I just hit the race again also with this patch applied:
[ 11.815028] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.838667] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.838774] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.838916] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.838996] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.839430] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.844801] snd-sc8280xp: probe of sound failed with error -22
Sorry about the noise.
Johan
On Mon, 03 Jul 2023 10:03:52 +0200, Johan Hovold wrote:
On Mon, Jul 03, 2023 at 09:48:34AM +0200, Johan Hovold wrote:
On Wed, Jun 28, 2023 at 10:24:04AM +0100, Srinivas Kandagatla wrote:
For some reason we ended up with a setup without this flag. This resulted in inconsistent sound card devices numbers which are also not starting as expected at dai_link->id. (Ex: MultiMedia1 pcm ended up with device number 4 instead of 0)
With this patch patch now the MultiMedia1 PCM ends up with device number 0 as expected.
This appears to fix the intermittent probe breakage that I see every five boots or so:
[ 11.843320] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.867467] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.867574] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.867725] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.867933] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.868379] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.873645] snd-sc8280xp: probe of sound failed with error -22
and which I've reported here:
https://lore.kernel.org/lkml/ZIHMMFtuDtvdpFAZ@hovoldconsulting.com/
as unrelated changes in timings resulting from that series made the problem much harder (but not impossible) to hit.
With this fix, I've rebooted 20+ times without hitting the issue once.
I'm guessing that you found this issue while investigated that probe race, Srini? It does look related, and it does seem to make the problem go away, but I'm not comfortable claiming that the intermittent probe breakage has been resolved without some analysis to back that up.
Ok, scratch that. I just hit the race again also with this patch applied:
[ 11.815028] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.838667] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.838774] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.838916] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.838996] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.839430] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.844801] snd-sc8280xp: probe of sound failed with error -22
Isn't it rather an issue about the error code passing in qcom driver? How about the change like below?
Takashi
--- a/sound/soc/qcom/qdsp6/topology.c +++ b/sound/soc/qcom/qdsp6/topology.c @@ -1276,10 +1276,8 @@ int audioreach_tplg_init(struct snd_soc_component *component) }
ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw); - if (ret < 0) { - dev_err(dev, "tplg component load failed%d\n", ret); - ret = -EINVAL; - } + if (ret < 0) + dev_err_probe(dev, ret, "tplg component load failed %d\n", ret);
release_firmware(fw); err:
On Mon, Jul 03, 2023 at 10:19:29AM +0200, Takashi Iwai wrote:
On Mon, 03 Jul 2023 10:03:52 +0200, Johan Hovold wrote:
Ok, scratch that. I just hit the race again also with this patch applied:
[ 11.815028] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.838667] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.838774] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.838916] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.838996] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.839430] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.844801] snd-sc8280xp: probe of sound failed with error -22
Isn't it rather an issue about the error code passing in qcom driver? How about the change like below?
Indeed, and I tested a change like that here:
https://lore.kernel.org/lkml/ZIHSGf18Aw7htb9o8@hovoldconsulting.com/
but that only seems to make the problem worse currently.
This should probably still be fixed, but I was just hoping that the DAI numbering could have been the cause for the probe deferral (which then triggers the other bugs).
Johan
On Mon, Jul 03, 2023 at 10:48:51AM +0200, Johan Hovold wrote:
On Mon, Jul 03, 2023 at 10:19:29AM +0200, Takashi Iwai wrote:
On Mon, 03 Jul 2023 10:03:52 +0200, Johan Hovold wrote:
Ok, scratch that. I just hit the race again also with this patch applied:
[ 11.815028] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.838667] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.838774] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.838916] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.838996] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.839430] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.844801] snd-sc8280xp: probe of sound failed with error -22
Isn't it rather an issue about the error code passing in qcom driver? How about the change like below?
Indeed, and I tested a change like that here:
https://lore.kernel.org/lkml/ZIHSGf18Aw7htb9o8@hovoldconsulting.com/
That link was missing was apparently broken, should have been:
https://lore.kernel.org/all/ZIHSGf18w7htb9o8@hovoldconsulting.com/
but that only seems to make the problem worse currently.
This should probably still be fixed, but I was just hoping that the DAI numbering could have been the cause for the probe deferral (which then triggers the other bugs).
Johan
On Mon, Jul 03, 2023 at 10:19:29AM +0200, Takashi Iwai wrote:
Isn't it rather an issue about the error code passing in qcom driver? How about the change like below?
Takashi
--- a/sound/soc/qcom/qdsp6/topology.c +++ b/sound/soc/qcom/qdsp6/topology.c @@ -1276,10 +1276,8 @@ int audioreach_tplg_init(struct snd_soc_component *component) } ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
- if (ret < 0) {
dev_err(dev, "tplg component load failed%d\n", ret);
ret = -EINVAL;
- }
- if (ret < 0)
dev_err_probe(dev, ret, "tplg component load failed %d\n", ret);
That looks like a sensible change in general anyway.
linux-stable-mirror@lists.linaro.org