On Thu, Jan 19, 2023 at 04:58:53PM -0800, Luis Chamberlain wrote:
On Thu, Jan 19, 2023 at 04:51:27PM -0800, Luis Chamberlain wrote:
On Thu, Jan 19, 2023 at 04:47:05PM +0100, Petr Mladek wrote:
Yes, the -EINVAL error is strange. It is also returned in a few places in kernel/module/main.c, but none of them looks like a good candidate.
OK I updated to next-20230119 and I don't see the issue now. Odd. It could have been an issue with next-20221207 which I was on before.
I'll run some more tests and if nothing fails I'll send the fix to Linus for rc5.
Jeesh, it just occurred to me what the difference is, which I'll have to test next: for next-20221207 I had enabled module compression on kdevops with zstd.
You can see the issues in the kdevops git log with that... and I finally disabled it and the kmod test issue is gone. So it could be that, but I am just ending my day so I will check tomorrow if that was it. If someone else beats me to it, great.
With kdevops it should be a matter of just enabling zstd as I just bumped support for next-20230119 and that has module decompression disabled.
So indeed, my suspicions were correct. There is one bug with compression on Debian:
- gzip compressed modules don't end up in the initramfs
There is a generic upstream kmod bug:
- modprobe --show-depends won't grok compressed modules, so initramfs tools that rely on it, as Debian's do, are likely not getting module dependencies installed in their initramfs
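
For anyone who wants to check their own setup, here is a minimal sketch (Python, not part of kmod or any initramfs tool) that groups module file names by compression suffix. The function name and the sample file names are made up for illustration; the only thing assumed from the kernel side is the usual .ko / .ko.gz / .ko.xz / .ko.zst naming.

```python
from collections import defaultdict

# Suffixes used for plain and compressed kernel modules.
SUFFIXES = {
    ".ko": "none",
    ".ko.gz": "gzip",
    ".ko.xz": "xz",
    ".ko.zst": "zstd",
}


def group_by_compression(paths):
    """Group module file names by compression type; other files are skipped."""
    groups = defaultdict(list)
    for path in paths:
        for suffix, kind in SUFFIXES.items():
            if path.endswith(suffix):
                groups[kind].append(path)
                break
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file list; in practice feed it the initramfs contents.
    sample = ["ext4.ko.zst", "loop.ko.gz", "xfs.ko.xz", "dummy.ko", "modules.dep"]
    for kind, names in sorted(group_by_compression(sample).items()):
        print(kind, names)
```

Comparing the result for the initramfs file list against the one for the installed modules directory would show whether a whole compression type is missing from the initramfs.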
But using xz compression reveals that 4 GiB of memory is not enough for kmod.sh test 0004: the -EINVAL is due to an OOM hit on modprobe, so the request fails. That's a test bug.
But increasing memory (8 GiB seems to do it) still reveals that kmod.sh test 0009 does fail, though not every time, which is why the test runs 150 times if you run it once. The failure is not deterministic, but it reliably fails for me at least once out of the 150 runs. Test 0009 tries to trigger running kmod_concurrent over max_modprobes for get_fs_type().
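
To give a feel for why repeating the test 150 times catches an intermittent failure, here is a rough sketch of the arithmetic. It assumes each run fails independently with the same probability, which nothing in this thread establishes, and the 2% figure is a made-up illustration, not a measurement.

```python
def chance_of_at_least_one_failure(p_fail, runs=150):
    """Probability that at least one of `runs` independent runs fails.

    Assumes every run fails with the same probability p_fail,
    independently of the others (an assumption, not a measured fact).
    """
    return 1.0 - (1.0 - p_fail) ** runs


if __name__ == "__main__":
    # Hypothetical per-run failure rate of 2%.
    print(round(chance_of_at_least_one_failure(0.02), 3))
```

Under those assumptions even a 2% per-run failure rate shows up in roughly 95% of 150-run batches, which fits with seeing at least one failure every time.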
I'm trying to see if this failure can trigger without module compression, but I don't see the failure yet.
Reverting the patch on this thread on linux-next does not fix that issue, so this has perhaps been broken for a much longer time. This patch therefore still remains a candidate fix.
Luis