When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */ - if (likely(!__folio_test_movable(src))) + if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
Gregory Price gourry@gourry.net writes:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
- if (likely(!__folio_test_movable(src)))
- if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
Good catch! Thanks for doing this! Feel free to add
Reviewed-by: "Huang, Ying" ying.huang@intel.com
in the future versions.
On Fri, Oct 25, 2024 at 10:17:24AM GMT, Gregory Price wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Reviewed-by: Shakeel Butt shakeel.butt@linux.dev
This patch looks good for stable backports. For future I wonder if instead of migrate_pages(), the caller providing the isolated folios, manages the isolated stats (increments and decrements) similar to how reclaim does it.
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
- if (likely(!__folio_test_movable(src)))
- if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
2.43.0
On Sun, Oct 27, 2024 at 10:24:10PM -0700, Shakeel Butt wrote:
On Fri, Oct 25, 2024 at 10:17:24AM GMT, Gregory Price wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Reviewed-by: Shakeel Butt shakeel.butt@linux.dev
This patch looks good for stable backports. For future I wonder if instead of migrate_pages(), the caller providing the isolated folios, manages the isolated stats (increments and decrements) similar to how reclaim does it.
Note that even if you provided the folios, you'd likely still end up in migrate_pages_batch/migrate_folio_move and subsequently the same accounting path. Probably there's some refactoring we can do to make the accounting more obvious - it is very subtle here.
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
- if (likely(!__folio_test_movable(src)))
- if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
2.43.0
Gregory Price gourry@gourry.net writes:
On Sun, Oct 27, 2024 at 10:24:10PM -0700, Shakeel Butt wrote:
On Fri, Oct 25, 2024 at 10:17:24AM GMT, Gregory Price wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Reviewed-by: Shakeel Butt shakeel.butt@linux.dev
This patch looks good for stable backports. For future I wonder if instead of migrate_pages(), the caller providing the isolated folios, manages the isolated stats (increments and decrements) similar to how reclaim does it.
Note that even if you provided the folios, you'd likely still end up in migrate_pages_batch/migrate_folio_move and subsequently the same accounting path. Probably there's some refactoring we can do to make the accounting more obvious - it is very subtle here.
I agree with Shakeel here. It's better for the caller who isolates the folios to increase and decrease the isolation counter. And yes, some refactoring is required.
-- Best Regards, Huang, Ying
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
- if (likely(!__folio_test_movable(src)))
- if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
2.43.0
On Fri, 25 Oct 2024, Gregory Price wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Reviewed-by: Davidlohr Bueso dave@stgolabs.net
On Fri, Oct 25, 2024 at 7:17 AM Gregory Price gourry@gourry.net wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
AFAIK, MGLRU doesn't dec/inc this counter, so it is not double-decrement for MGLRU. Maybe "imbalance update" is better? Anyway, it is just a nit. I'd suggest capturing the MGLRU case in the commit log too.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Thanks for catching this. Reviewed-by: Yang Shi shy828301@gmail.com
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
if (likely(!__folio_test_movable(src)))
if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
-- 2.43.0
On Mon, Oct 28, 2024 at 01:45:48PM -0700, Yang Shi wrote:
On Fri, Oct 25, 2024 at 7:17 AM Gregory Price gourry@gourry.net wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
AFAIK, MGLRU doesn't dec/inc this counter, so it is not double-decrement for MGLRU. Maybe "imbalance update" is better? Anyway, it is just a nit. I'd suggest capturing the MGLRU case in the commit log too.
Gotcha, so yeah saying it's an imbalance fix is more accurate.
So more accurate changelog is:
[PATCH] vmscan,migrate: fix page count imbalance on node stats when demoting pages
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. migrate_pages will decrement the the node's isolated page count, leading to an imbalanced count when invoked from (MG)LRU code.
The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The following path produces the decrement:
shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Thanks for catching this. Reviewed-by: Yang Shi shy828301@gmail.com
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
if (likely(!__folio_test_movable(src)))
if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
-- 2.43.0
Gregory Price gourry@gourry.net writes:
On Mon, Oct 28, 2024 at 01:45:48PM -0700, Yang Shi wrote:
On Fri, Oct 25, 2024 at 7:17 AM Gregory Price gourry@gourry.net wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
AFAIK, MGLRU doesn't dec/inc this counter, so it is not double-decrement for MGLRU. Maybe "imbalance update" is better? Anyway, it is just a nit. I'd suggest capturing the MGLRU case in the commit log too.
Gotcha, so yeah saying it's an imbalance fix is more accurate.
So more accurate changelog is:
[PATCH] vmscan,migrate: fix page count imbalance on node stats when demoting pages
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. migrate_pages will decrement the the node's isolated page count, leading to an imbalanced count when invoked from (MG)LRU code.
The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The following path produces the decrement:
shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- decrement
I think that it may be better to mention the different behavior of LRU and MGLRU. But that's not a big deal, change it again only if you think it's necessary.
-- Best Regards, Huang, Ying
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Thanks for catching this. Reviewed-by: Yang Shi shy828301@gmail.com
mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 923ea80ba744..e3aac274cf16 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1099,7 +1099,7 @@ static void migrate_folio_done(struct folio *src, * not accounted to NR_ISOLATED_*. They can be recognized * as __folio_test_movable */
if (likely(!__folio_test_movable(src)))
if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION) mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src));
-- 2.43.0
On Tue, Oct 29, 2024 at 08:34:34AM +0800, Huang, Ying wrote:
Gregory Price gourry@gourry.net writes:
On Mon, Oct 28, 2024 at 01:45:48PM -0700, Yang Shi wrote:
On Fri, Oct 25, 2024 at 7:17 AM Gregory Price gourry@gourry.net wrote:
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
AFAIK, MGLRU doesn't dec/inc this counter, so it is not double-decrement for MGLRU. Maybe "imbalance update" is better? Anyway, it is just a nit. I'd suggest capturing the MGLRU case in the commit log too.
Gotcha, so yeah saying it's an imbalance fix is more accurate.
So more accurate changelog is:
...
I think that it may be better to mention the different behavior of LRU and MGLRU. But that's not a big deal, change it again only if you think it's necessary.
The behavior isn't really different. It's either way migrate_pages decrements when it shouldn't going through the shink code - and both LRU and MGLRU go through the same code. That LRU does an inc/dec pair is irrelevant - neither should do the decrement in the migrate path.
~Gregory
On Fri, Oct 25, 2024 at 10:17:24AM -0400, Gregory Price wrote:
When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:
$ cat /proc/sys/vm/stat_refresh
[77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642
This negative value may impact compaction and reclaim throttling.
The double-decrement occurs in the migrate_pages path:
caller to shrink_folio_list decrements the count shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- second decrement
This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.
When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.
Signed-off-by: Gregory Price gourry@gourry.net Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim") Cc: stable@vger.kernel.org
Reviewed-by: Oscar Salvador osalvador@suse.de
linux-stable-mirror@lists.linaro.org