Re: [PATCH net-next 0/4] (no cover subject)

8 Dec 2025

      On Fri 2025-12-05 02:21:08, Breno Leitao wrote:
...
On Thu, Dec 04, 2025 at 11:51:58AM +0100, Petr Mladek wrote:
...
perhaps it should be configured to only log messages at a high level?
Chris is actually working on per-console log levels to solve exactly
this problem, so we could filter serial console messages while keeping
everything in other consoles (aka netconsole):
https://lore.kernel.org/all/cover.1764272407.git.chris@chrisdown.name/
Excellent! Unless I'm missing more context, Chris does seem to be
attacking the problem at a more suitable layer.
This would help to bypass slow serial consoles. But the extra messages
would still get stored into the kernel ring buffer and passed back
to user space logs, for example journalctl.
It might actually make sense for the "workload enters or leaves" messages.
But I am not sure about the "ping" messages.
Agree. Let me back up and explain my "ping" messages better, which
I think might add more information about this topic.
Meta has millions of servers, and all of them must have netconsole
running 100% of the time.
Of course, this is not the reality, and problems happen for different
reasons. The ones that interest me here are:

Top of the rack switch MAC address changes (mostly associated with
replacement of network hardware: top of the rack switches and gateways)
a) Keep in mind that a netconsole target has the destination MAC as
   part of its configuration.

Netconsole got associated with the wrong network port, which comes in
two different flavors.
a) The machine got provisioned wrongly since day one (Most common
   case)
b) The machine NIC changed and:
   i) The target doesn't bind correctly anymore (if the netconsole
      target is bound by MAC address)
      * This is easier to detect, given the target will never be
        enabled.

Netconsd (the daemon that listens for netconsole packets) is buggy or
dead

Network failures across the route

Possible Solutions
In order to detect the issues above, I think the best (or only) way is
to send messages from the host, and check if they got received. If not,
raise an alarm (in the common distributed way).
This could be done in very different ways, though, such as:

Have a binary in each machine:
a) This binary reads the netconsole target that is configured,
   and mimics a "ping" UDP netconsole packet.
Pro:
     * It doesn't need any kernel change
Cons:
     * It needs to reimplement the netconsole logic in userspace
     * It also needs a binary widely distributed on all
       machines
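
To make this first option concrete, here is a minimal sketch of what such a
binary could look like. It assumes the target is a dynamic one configured
through configfs (the remote_ip and remote_port attributes are the documented
netconsole ones). The function names and the "ping from <hostname>" payload
are hypothetical; this thread does not define a ping format.

```python
import socket
from pathlib import Path

# Usual configfs location of dynamic netconsole targets.
CONFIGFS_ROOT = Path("/sys/kernel/config/netconsole")


def read_target(name, root=CONFIGFS_ROOT):
    """Read the destination of one netconsole target from configfs."""
    target = Path(root) / name
    return {
        "remote_ip": (target / "remote_ip").read_text().strip(),
        "remote_port": int((target / "remote_port").read_text()),
    }


def build_ping(hostname):
    """Hypothetical payload; not a format defined by netconsole."""
    return f"ping from {hostname}\n".encode()


def send_ping(cfg, payload):
    """Send one UDP datagram to the target's remote address."""
    family = socket.AF_INET6 if ":" in cfg["remote_ip"] else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (cfg["remote_ip"], cfg["remote_port"]))
```

Note that a datagram sent this way goes through the regular network stack
(routing and neighbor resolution), not through the device and destination MAC
stored in the target. So it would likely not catch a stale remote_mac, which
is part of what "reimplementing the netconsole logic" would have to cover.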

Send a ping directly to the console
a) Something like 'echo "ping from $hostname" > /dev/kmsg'
Pro:
     * No kernel changes
Cons:
     * These debug messages will be sent to journalctl and to
       the console, polluting both
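
For reference, the /dev/kmsg variant can be sketched as below. /dev/kmsg
accepts an optional "<N>" syslog priority prefix, and each write() becomes one
log record. The helper names and the choice of the debug level (7) are
assumptions for illustration only.

```python
import os

LOG_DEBUG = 7  # syslog severity "debug"


def format_kmsg(message, level=LOG_DEBUG):
    """Prefix the message with a syslog priority, as /dev/kmsg accepts."""
    return f"<{level}>{message}\n".encode()


def write_kmsg(record, path="/dev/kmsg"):
    """Inject one record; a single write() call is a single log record."""
    # Writing to /dev/kmsg normally requires root.
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, record)
    finally:
        os.close(fd)
```

Usage would be write_kmsg(format_kmsg("ping from " + hostname)). The record
still lands in the kernel ring buffer and in every console whose loglevel
lets it through, which is exactly the pollution listed in the cons.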

Use the per-console loglevel patchset.
a) Same as above, but setting the netconsole loglevel to DEBUG, while
   all other consoles stay at INFO.
Pro:
     * No changes on netconsole
     * Netconsole "buffers" continue to be synchronized with
       kernel buffers. Everything is on the same page, but
       netconsole data has one loglevel higher.
     * Sending a netconsole-only message is not
       special at all. It uses the same workflow we have
       today, through `/dev/kmsg'
Cons:
     * Needs to change printk/console code (Chris' patch)
       that has been under review for years now. Will it ever
       get accepted?
     * These "ping" messages will be in kernel buffers and
       journalctl, and are useless in there (!?)
     * It is not possible to send a message to a single
       netconsole target.

JFYI, I am going to review the latest version of the per-console
loglevel patchset later this week. IMHO, we are very close to
getting it merged.
BTW: How often do you ping the netconsole, please?
IMHO, adding a short message once-per-hour might be bearable,
once-per-minute might be questionable for the kernel buffer
but still fine for journalctl.
Also it depends on the size of the kernel buffer and whether
you use a crash dump. I mean that it might be handy to have
some useful messages in the kernel buffer when the crash dump
is generated and used for debugging. Otherwise, the only
important thing is whether they get stored externally either
via console or journalctl.
...

Send messages only to netconsole (this patchset)
Pro:
     * It is easy to test netconsole connectivity (problem above),
       without kernel buffers/journal pollution
     * It doesn't depend on the per-console loglevel patchset
     * Adds flexibility to netconsole targets: only certain
       netconsole targets receive certain messages
Cons:
     * Messages sent to netconsole are a superset of messages in the
       kernel buffer. In other words, "dmesg" and machine
       logs/journal will not be able to see messages that
       were sent directly to netconsole.
     * It might be seen as a back channel (!?)
     * Different netconsole targets may receive different
       messages. Too much flexibility might be bad (!?)
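
As a rough illustration of how userspace might drive this last option, the
sketch below writes a message to each target's send_msg attribute under
/sys/kernel/config/netconsole/<target>/. That attribute is the one proposed by
this patchset, not an existing mainline interface, and its exact semantics
(for example whether a trailing newline is expected) are assumptions here.
The function names are hypothetical.

```python
from pathlib import Path

CONFIGFS_ROOT = Path("/sys/kernel/config/netconsole")


def send_to_target(target, message, root=CONFIGFS_ROOT):
    """Write one message to a single target's proposed send_msg attribute."""
    path = Path(root) / target / "send_msg"
    with open(path, "w") as f:
        # The trailing newline is an assumption, not taken from the patchset.
        f.write(message + "\n")


def ping_all(message, root=CONFIGFS_ROOT):
    """Try every target directory; return the names whose write failed."""
    failed = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue
        try:
            send_to_target(entry.name, message, root)
        except OSError:
            failed.append(entry.name)
    return failed
```

A successful write only tells that the kernel accepted the message. Whether it
was delivered still has to be confirmed on the receiving side.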

I do not have a strong opinion about this.
That said, the location /sys/kernel/config/netconsole/<target>/send_msg
looks a bit weird to me. I would rather expect /dev/netconsole_msg
or so. But I do not have a strong opinion. It might be overkill.
How important is it to trigger the ping from userspace, please?
It might be sent by an existing watchdog.
Best Regards,
Petr
