The current implementation of netconsole sends log messages from all CPUs in parallel, which can leave the output interleaved on the receiving side. This makes it hard to demultiplex the messages and attribute them to their originating CPUs.
As a result, users and developers often struggle to effectively analyze and debug the parallel log output received through netconsole.
Example of a message received from production hosts:

    ------------[ cut here ]------------
    ------------[ cut here ]------------
    refcount_t: saturated; leaking memory.
    WARNING: CPU: 2 PID: 1613668 at lib/refcount.c:22 refcount_warn_saturate+0x5e/0xe0
    refcount_t: addition on 0; use-after-free.
    WARNING: CPU: 26 PID: 4139916 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0xe0
    Modules linked in: bpf_preload(E) vhost_net(E) tun(E) vhost(E)

This series of patches introduces a new feature to the netconsole subsystem that allows the automatic population of the CPU number in the userdata field for each log message. This enhancement provides several benefits:
* Improved demultiplexing of parallel log output: When multiple CPUs are sending messages concurrently, the added CPU number in the userdata makes it easier to differentiate and attribute the messages to their originating CPUs.
* Better visibility into message sources: The CPU number information gives users and developers more insight into which specific CPU a particular log message came from, which can be valuable for debugging and analysis.
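
As a rough illustration of the demultiplexing described above, a receiver could group incoming messages by their cpu= entry. This is only a sketch of one possible receiving-side script, not part of this series. It assumes each received payload carries the log text followed by a line of the form cpu=<n>; the helper name demux_by_cpu and the sample payloads (including their cpu= lines) are made up for the example.

```python
import re
from collections import defaultdict

# Assumed userdata entry: a line of the form "cpu=<number>", possibly indented.
CPU_ENTRY = re.compile(r"^[ \t]*cpu=(\d+)[ \t]*$", re.MULTILINE)


def demux_by_cpu(payloads):
    """Group message bodies by the CPU number found in their userdata."""
    by_cpu = defaultdict(list)
    for payload in payloads:
        match = CPU_ENTRY.search(payload)
        # Messages without a cpu= entry are collected under the key None.
        cpu = int(match.group(1)) if match else None
        # Drop the cpu= line so only the log text remains.
        body = CPU_ENTRY.sub("", payload).strip()
        by_cpu[cpu].append(body)
    return dict(by_cpu)


# Hypothetical payloads, in the order a receiver might see them.
received = [
    "refcount_t: saturated; leaking memory.\n cpu=2\n",
    "refcount_t: addition on 0; use-after-free.\n cpu=26\n",
    "WARNING: CPU: 2 PID: 1613668 at lib/refcount.c:22 "
    "refcount_warn_saturate+0x5e/0xe0\n cpu=2\n",
]
streams = demux_by_cpu(received)
```

With these inputs, streams[2] holds the two messages tagged cpu=2 in arrival order, and streams[26] holds the single message tagged cpu=26.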
The changes in this series are as follows:
Patch "netconsole: Ensure dynamic_netconsole_mutex is held during userdata update"
  * Adds a lockdep assert to make sure dynamic_netconsole_mutex is held when calling update_userdata().
Patch "netconsole: Add option to auto-populate CPU number in userdata"
  * Adds a new option to enable automatic CPU number population in the netconsole userdata.
  * Provides a new "populate_cpu_nr" configfs attribute to control this feature.
Patch "netconsole: selftest: test CPU number auto-population"
  * Expands the existing netconsole selftest to verify the CPU number auto-population functionality.
  * Ensures the received netconsole messages contain the expected "cpu=" entry in the userdata.
Patch "netconsole: docs: Add documentation for CPU number auto-population"
  * Updates the netconsole documentation to explain the new CPU number auto-population feature.
  * Provides instructions on how to enable and use the feature.
I believe these changes will be a valuable addition to the netconsole subsystem, enhancing its usefulness for kernel developers and users.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Breno Leitao (4):
      netconsole: Ensure dynamic_netconsole_mutex is held during userdata update
      netconsole: Add option to auto-populate CPU number in userdata
      netconsole: docs: Add documentation for CPU number auto-population
      netconsole: selftest: Validate CPU number auto-population in userdata

 Documentation/networking/netconsole.rst            | 44 +++++++++++++++
 drivers/net/netconsole.c                           | 63 ++++++++++++++++++++++
 .../testing/selftests/drivers/net/netcons_basic.sh | 18 +++++++
 3 files changed, 125 insertions(+)
---
base-commit: a58f00ed24b849d449f7134fd5d86f07090fe2f5
change-id: 20241108-netcon_cpu-ce3917e88f4b

Best regards,