On Mon Sep 15, 2025 at 10:39 PM UTC, Jonathan Corbet wrote:
Gabriele Paoloni gpaoloni@redhat.com writes:
This patch proposes initial kernel-doc documentation for memory_open() and most of the functions in the mem_fops structure. The format used for the specifications follows the guidelines defined in Documentation/doc-guide/code-specifications.rst
I'll repeat my obnoxious question from the first patch: what does that buy for us?
Fair question, and definitely not obnoxious.
It might help to reframe this a bit. The idea is to take an engineering technique from one domain and apply it with modifications to another. The relevant terms of art are "forward engineering" and "reverse engineering".
My kneejerk first reaction is: you are repeating the code of the function in a different language.
No disagreement on that perception. We have more work to do when it comes to communicating the idea, as well as developing a better implementation.
The design of the Linux kernel is emergent and, in the present state, all forms of testing are an (educated) guess at the intended design. We can demonstrate this by picking a random bit of code from the kernel and assigning ourselves the task of writing a test for it.
Are you certain that your test accurately reflects the true design intent? You can read the code and test what you see. But that does not mean that your test is valid against the intent in someone else's head.
Music instructors see this whenever their students play the right notes but clearly do not yet "feel" the music. The difference is noticeable even by casual listeners.
If we are not convinced that the code is correct, how can we be more confident that this set of specifications is correct?
We have no reason to be independently convinced of either. When we describe this as importing a technique into a new domain, your question is an example of some of the concessions that have to be made.
The Linux kernel is not a forward engineered system. Therefore it is not possible to develop code and test from the same seed. Our only option is to reverse engineer that seed to the best of our abilities.
At that point we have a few options.
Ideally, the original developer can weigh in and validate that our interpretation is correct. This has the effect of "simulating" a forward engineering scenario, because a test can be created from the validated seed (I am trying valiantly to avoid using the word kernel).
Absent the original developer's validation, we have the option of simply asserting the specification. This is equivalent to the way testing is done today, except a test can be equally opaque with respect to what design it is attempting to validate.
In either case, if a test is developed against the specification, even an initially incorrect specification, we have the ability to bring code, specification, and test into alignment over time.
And again, what will consume this text?
Humans are the consumer. But to be clear - a machine-readable template is going to be required in the long run to ensure that code and specification remain aligned. Our intent was to avoid confusing things with templates, and introduce them once we have made headway on the points you have brought up.
It is probably also worth mentioning that we have already had an "a-ha" moment from one kernel maintainer. I believe the words were something to the effect of, "this is great, I used to have to relearn that code every time I touch it".
How does going through this effort get us to a better kernel?
I am hoping some of the above planted the seed to answer this one. Code must be correct in two ways: it must be valid and it must be verified.
Valid means - the code is doing the right thing. Verified means - the code is doing the thing right.
If code and test accurately reflect the same idea, then we can relieve maintainers of a large portion of the verification burden. Validation is in the "hearts and minds" of the users, so that burden never goes away.
Despite having been to a couple of your talks, I'm not fully understanding how this comes together; people who haven't been to the talks are not going to have an easier time getting the full picture.
I agree. And thank you very much for attending those talks and engaging with us. It truly means a lot.
I have submitted a refereed talk to this year's Plumbers conference that is intended to go over these points in detail. My colleague (not on this thread) has also submitted a refereed talk on best practices for developing these specifications. His name is Matthew Whitehead and he is a recognized expert in that area.
..Ch:W..