On 8/8/19 4:13 PM, Mina Almasry wrote:
For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives in the resv_map entries, in file_region->reservation_counter.
When a file_region entry is added to the resv_map via region_add, we also charge the appropriate hugetlb_cgroup and put the pointer to that in file_region->reservation_counter. This is slightly delicate since we need to not modify the resv_map until we know that charging the reservation has succeeded. If charging doesn't succeed, we report the error to the caller, so that the kernel fails the reservation.
I wish we did not need to modify these region_() routines as they are already difficult to understand. However, I see no other way with the desired semantics.
On region_del, which is when the hugetlb memory is unreserved, we delete the file_region entry in the resv_map, but also uncharge the file_region->reservation_counter.
mm/hugetlb.c | 208 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 170 insertions(+), 38 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 235996aef6618..d76e3137110ab 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -242,8 +242,72 @@ struct file_region { struct list_head link; long from; long to; +#ifdef CONFIG_CGROUP_HUGETLB
- /*
* On shared mappings, each reserved region appears as a struct
* file_region in resv_map. These fields hold the info needed to
* uncharge each reservation.
*/
- struct page_counter *reservation_counter;
- unsigned long pages_per_hpage;
+#endif };
+/* Must be called with resv->lock held. Calling this with dry_run == true will
- count the number of pages added but will not modify the linked list.
- */
+static long consume_regions_we_overlap_with(struct file_region *rg,
struct list_head *head, long f, long *t,
struct hugetlb_cgroup *h_cg,
struct hstate *h,
bool dry_run)
+{
- long add = 0;
- struct file_region *trg = NULL, *nrg = NULL;
- /* Consume any regions we now overlap with. */
- nrg = rg;
- list_for_each_entry_safe(rg, trg, rg->link.prev, link) {
if (&rg->link == head)
break;
if (rg->from > *t)
break;
/* If this area reaches higher then extend our area to
* include it completely. If this is not the first area
* which we intend to reuse, free it.
*/
if (rg->to > *t)
*t = rg->to;
if (rg != nrg) {
/* Decrement return value by the deleted range.
* Another range will span this area so that by
* end of routine add will be >= zero
*/
add -= (rg->to - rg->from);
if (!dry_run) {
list_del(&rg->link);
kfree(rg);
Is it possible that the region struct we are deleting pointed to a reservation_counter? Perhaps even for another cgroup? Just concerned with the way regions are coalesced that we may be deleting counters.