+ damon@lists.linux.dev
I haven't thoroughly read any version of this patch series due to my laziness, sorry. So I may be saying something completely wrong. My apologies in advance, and please correct me if that is the case.
On Tue, Nov 26, 2024 at 06:57:19PM -0800, Yuanchu Xie wrote:
This patch series provides workingset reporting of user pages in lruvecs, of which coldness can be tracked by accessed bits and fd references.
DAMON provides data access patterns of user pages. It is not exactly named workingset, but it is a superset of that information. Users can therefore get the workingset from DAMON-provided raw data. So I feel I have to ask whether DAMON can be used for, or help achieve, the purpose of this patch series.
Depending on the detailed definition of workingset, of course, the workingset we can get from DAMON might not be technically the same as what this patch series aims to provide, and the difference could be something that makes DAMON unable to be used or to help here. But I cannot tell whether this is the case from this cover letter alone.
However, the concept of workingset applies generically to all types of memory, which could be kernel slab caches, discardable userspace caches (databases), or CXL.mem. Therefore, data sources might come from slab shrinkers, device drivers, or the userspace. Another interesting idea might be hugepage workingset, so that we can measure the proportion of hugepages backing cold memory. However, with architectures like arm, there may be too many hugepage sizes leading to a combinatorial explosion when exporting stats to the userspace. Nonetheless, the kernel should provide a set of workingset interfaces that is generic enough to accommodate the various use cases, and extensible to potential future use cases.
This again sounds similar to what DAMON aims to provide, to me. DAMON is designed to be easy to extend for various use cases and internal mechanisms. Specifically, it separates access check mechanisms and core logic into different layers, and provides an interface for extending DAMON with new mechanisms. Indeed, DAMON's two access check mechanisms, for virtual address spaces and the physical address space, are implemented using that interface. There were also RFC patch series extending DAMON for NUMA-specific and write-only access monitoring, using NUMA hinting faults and soft-dirty PTEs as the internal mechanisms.
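To illustrate the layering described above, here is a minimal userspace sketch in Python. It is not DAMON's kernel code: the names `AccessCheckOps`, `FakeAccessedBitOps`, `Monitor` and `run_demo` are hypothetical and made up for this example, and the "mechanism" is just a scripted trace rather than real accessed-bit checking. It only shows the idea that the core logic can stay unaware of how accesses are detected.

```python
from abc import ABC, abstractmethod


class AccessCheckOps(ABC):
    """Hypothetical mechanism layer: decides HOW accesses are detected."""

    @abstractmethod
    def prepare_access_checks(self, regions):
        """Arm the mechanism for the next sampling interval."""

    @abstractmethod
    def check_accesses(self, regions):
        """Return the set of regions accessed since the last prepare."""


class FakeAccessedBitOps(AccessCheckOps):
    """Stand-in mechanism driven by a scripted trace (an assumption,
    used here instead of real page table accessed bits)."""

    def __init__(self, trace):
        self.trace = iter(trace)  # one set of accessed regions per interval
        self.current = set()

    def prepare_access_checks(self, regions):
        # Advance to the next interval; empty set once the trace runs out.
        self.current = next(self.trace, set())

    def check_accesses(self, regions):
        return {r for r in regions if r in self.current}


class Monitor:
    """Hypothetical core layer: mechanism-agnostic bookkeeping."""

    def __init__(self, ops, regions):
        self.ops = ops
        self.nr_accesses = {r: 0 for r in regions}

    def sample(self):
        regions = list(self.nr_accesses)
        self.ops.prepare_access_checks(regions)
        for r in self.ops.check_accesses(regions):
            self.nr_accesses[r] += 1


def run_demo():
    trace = [{"heap"}, {"heap", "stack"}, {"heap"}]
    mon = Monitor(FakeAccessedBitOps(trace), ["heap", "stack", "mmap"])
    for _ in range(3):
        mon.sample()
    return mon.nr_accesses
```

In this sketch, a mechanism backed by a different data source would be another subclass of `AccessCheckOps`, and `Monitor` would not need to change.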
My humble understanding is that the major difference between DAMON and workingset reporting is the internal mechanism. Workingset reporting uses MGLRU as the access check mechanism, while DAMON's current access check mechanisms mainly rely on checking page table accessed bits. I think DAMON can be extended to use MGLRU as another internal access check mechanism, but I understand that there could be many things that I am overlooking.
Yuanchu, I think it would help me and other reviewers better understand this patch series if you could share that. I will also be more than happy to help you and others better understand what DAMON can and cannot do through this discussion.
Doesn't DAMON already provide this information?
CCing SJ.
Thank you for adding me, Johannes :)
[...]
It does provide more detailed insight into userspace memory behavior, which could be helpful when trying to make sense of applications that sit on a rich layer of libraries and complicated runtimes. But here a comparison to DAMON would be helpful.
100% agree.
Thanks, SJ
[...]