On Wed, Oct 22, 2025 at 10:01 PM Jakub Kicinski <kuba@kernel.org> wrote:
On Wed, 22 Oct 2025 10:39:56 -0700 Gustavo Luiz Duarte wrote:
This series fixes a race condition in netconsole's userdata handling where concurrent message transmission could read partially updated userdata fields, resulting in corrupted netconsole output.
The first patch adds a selftest that reproduces the race condition by continuously sending messages while rapidly changing userdata values, detecting any torn reads in the output.
The second patch fixes the issue by ensuring update_userdata() holds the target_list_lock while updating both extradata_complete and userdata_length, preventing readers from seeing inconsistent state.
This targets net tree as it fixes a bug introduced in commit df03f830d099 ("net: netconsole: cache userdata formatted string in netconsole_target").
This test is skipping on debug kernel builds in netdev CI.
TAP version 13
1..1
# overriding timeout to 360
# selftests: drivers/net: netcons_race_userdata.sh
# socat died before we could check 10000 messages. Skipping test.
ok 1 selftests: drivers/net: netcons_race_userdata.sh # SKIP
We can't have skips for SW tests.
I think Breno was fighting with a similar problem in the past. Not sure what he ended up doing. Maybe just leave it at the print? Don't actually mark the test as skipped?
A slightly more advanced option is to only do that if KSFT_MACHINE_SLOW is set, per: https://github.com/linux-netdev/nipa/wiki/How-to-run-netdev-selftests-CI-sty...
There are two reasons for hitting this skip:
1. The hardcoded 2s timeout in listen_port_and_save_to() expired
2. socat died or failed to start for mysterious reasons
#1 should probably be a success (we ran the test for this long and no corruption was found), and for #2 we can try to return whatever exit code socat gives us. Retrieving socat's return code is a bit tricky because we are running it in a subshell, but we can save it in a temp file.
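A rough sketch of the temp-file idea, not the actual patch. run_and_save_rc() and listener_result() are hypothetical helpers that do not exist in the selftest library, and "sh -c 'exit 3'" stands in for the socat listener. Treating status 124 as a pass assumes the listener is wrapped in GNU timeout(1), which returns 124 when the time limit expires.

```shell
# Hypothetical helper: run a command in a background subshell and record
# its exit status in a file, so the parent shell can read it later.
run_and_save_rc() {
	local rc_file="$1"
	shift
	(
		rc=0
		"$@" || rc=$?
		echo "${rc}" > "${rc_file}"
	) &
}

# Hypothetical helper: map the saved status to a test result.
# Assumption: 124 means timeout(1) expired, i.e. the test ran for the
# whole window without the listener dying, which counts as a pass.
listener_result() {
	local rc="$1"
	if [ "${rc}" -eq 124 ]; then
		echo 0
	else
		echo "${rc}"
	fi
}

RC_FILE="$(mktemp)"

# Stand-in for the socat listener; here it simply fails with status 3.
run_and_save_rc "${RC_FILE}" sh -c 'exit 3'
wait

LISTENER_RC="$(cat "${RC_FILE}")"
rm -f "${RC_FILE}"
echo "listener exited with ${LISTENER_RC}, test result $(listener_result "${LISTENER_RC}")"
```

The status is written from inside the subshell because the parent only sees the subshell itself, not the command it ran; the file is the simplest way to pass the value back.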
I can also send a follow-up patch to use a longer timeout in listen_port_and_save_to() when KSFT_MACHINE_SLOW is set.
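Something along these lines could pick the timeout. This is only a sketch: listener_timeout() is a hypothetical helper, the 10s value for slow machines is an arbitrary placeholder, and checking KSFT_MACHINE_SLOW for any non-empty value is an assumption about how the variable should be tested. Only the 2s default comes from the current code.

```shell
# Hypothetical helper: choose the listener timeout in seconds.
# 2 is the current hardcoded value; 10 is a made-up value for slow
# (e.g. debug kernel) machines and would need tuning.
listener_timeout() {
	if [ -n "${KSFT_MACHINE_SLOW:-}" ]; then
		echo 10
	else
		echo 2
	fi
}

echo "listener timeout: $(listener_timeout)s"
```

Keeping the choice in one small function means listen_port_and_save_to() would only need to call it instead of hardcoding the value.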
-- pw-bot: cr