memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
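To make the arithmetic concrete, here is a tiny userspace sketch using the same macro shapes as include/linux/pfn.h (illustration only, not part of the patch):

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

int main(void)
{
	unsigned long base = 0x1234;

	/* PFN_DOWN(0x1234) == 1: the page merely containing the base. */
	/* PFN_UP(0x1234)   == 2: the first page fully inside the region. */
	printf("PFN_DOWN(%#lx) = %lu, PFN_UP(%#lx) = %lu\n",
	       base, PFN_DOWN(base), base, PFN_UP(base));
	return 0;
}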
This patch fixes it by using PFN_UP().
The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: <stable@vger.kernel.org> # 3.12+
---
* add He Jia to cc
* fix the mm mailing list address
* Cc: 3.12+
---
 mm/memblock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memblock.c b/mm/memblock.c
index b6ba6b7adadc..de768307696d 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1673,7 +1673,7 @@ int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
 	if (mid == -1)
 		return -1;
 
-	*start_pfn = PFN_DOWN(type->regions[mid].base);
+	*start_pfn = PFN_UP(type->regions[mid].base);
 	*end_pfn = PFN_DOWN(type->regions[mid].base + type->regions[mid].size);
 
 	return type->regions[mid].nid;
On Fri, 30 Mar 2018 11:30:55 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
> memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
> For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
Why is this not correct? The caller might want the pfn of the page which covers the base?
> This patch fixes it by using PFN_UP().
> The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Cc: <stable@vger.kernel.org> # 3.12+
Please fully describe the runtime effects of a bug when fixing that bug. This description doesn't give enough justification for merging the patch into mainline, let alone -stable.
On Fri, Mar 30, 2018 at 01:57:27PM -0700, Andrew Morton wrote:
> On Fri, 30 Mar 2018 11:30:55 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>> memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
>> For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
> Why is this not correct? The caller might want the pfn of the page which covers the base?
Hmm... the only caller of memblock_search_pfn_nid() is __early_pfn_to_nid(), which returns the nid of a pfn and saves the [start_pfn, end_pfn] range within the same memory region to a cache. So it doesn't look like good practice to store an inexact pfn in the cache.
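For reference, the caller looks roughly like this (abridged from mm/page_alloc.c of this era; the cached fields are the point here):

int __meminit __early_pfn_to_nid(unsigned long pfn,
				 struct mminit_pfnnid_cache *state)
{
	unsigned long start_pfn, end_pfn;
	int nid;

	/* Fast path: the last looked-up region is cached. */
	if (state->last_start <= pfn && pfn < state->last_end)
		return state->last_nid;

	nid = memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn);
	if (nid != -1) {
		/* An inexact start_pfn ends up cached here. */
		state->last_start = start_pfn;
		state->last_end = end_pfn;
		state->last_nid = nid;
	}

	return nid;
}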
>> This patch fixes it by using PFN_UP().
>> The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Cc: <stable@vger.kernel.org> # 3.12+
> Please fully describe the runtime effects of a bug when fixing that bug. This description doesn't give enough justification for merging the patch into mainline, let alone -stable.
Since PFN_UP() and PFN_DOWN() differ when the address is not page aligned, in theory we may have the two situations below.
Case 1: non-contiguous memory regions

    [0x1000, 0x2fff] [0x4123, 0x5fff]

Case 2: contiguous memory regions

    [0x1000, 0x4ff2] [0x4ff3, 0x5fff]
Since memblock_search_pfn_nid() is only used by __early_pfn_to_nid() to look up the node id of a pfn and cache the range, there would be two potential issues at runtime, respectively:
Case 1: return a node id for an invalid pfn
Case 2: return an incorrect node id
Neither of them does real damage to the system. At most, it affects the performance of reads/writes on the page the pfn points to.
For Case 1, pfn 0x4 would be thought to be on the second memory region's node, while it is not a valid pfn: the page [0x4000, 0x4fff] is only partially backed by memory.
For Case 2, pfn 0x4 would be thought to be on the second memory region's node, while its node is not well defined in this case, since the page spans both regions.
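To make the two cases concrete, this standalone sketch (again using the pfn.h macro shapes) prints the range the current PFN_DOWN()-based code would cache for the second region of each layout:

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

int main(void)
{
	/* Case 1: second region [0x4123, 0x5fff], so base + size = 0x6000. */
	printf("case 1: caches [%lu, %lu) now, [%lu, %lu) with the fix\n",
	       PFN_DOWN(0x4123UL), PFN_DOWN(0x6000UL),
	       PFN_UP(0x4123UL), PFN_DOWN(0x6000UL));
	/* Prints [4, 6) now: pfn 4 is only partially backed, yet cached. */

	/* Case 2: second region [0x4ff3, 0x5fff], so base + size = 0x6000. */
	printf("case 2: caches [%lu, %lu) now, [%lu, %lu) with the fix\n",
	       PFN_DOWN(0x4ff3UL), PFN_DOWN(0x6000UL),
	       PFN_UP(0x4ff3UL), PFN_DOWN(0x6000UL));
	/* Prints [4, 6) now: pfn 4 mostly belongs to the first region. */
	return 0;
}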
But these two cases would not happen in reality even if we had memory layouts like these, since 0x4 is not a valid pfn at all and would never be passed in.
Last but not least, the more important impact of this code is that it misleads readers. They would think the pfn range of a memory region is

    [PFN_DOWN(base), PFN_DOWN(end))

instead of

    [PFN_UP(base), PFN_DOWN(end))

even though in reality the two are usually the same.
For example, in this discussion thread, https://lkml.org/lkml/2018/3/29/143, the author uses this range to search for the pfn, which is not exact.
On Mon 02-04-18 09:50:26, Wei Yang wrote:
> On Fri, Mar 30, 2018 at 01:57:27PM -0700, Andrew Morton wrote:
>> On Fri, 30 Mar 2018 11:30:55 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>>> memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
>>> For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
>> Why is this not correct? The caller might want the pfn of the page which covers the base?
> Hmm... the only caller of memblock_search_pfn_nid() is __early_pfn_to_nid(), which returns the nid of a pfn and saves the [start_pfn, end_pfn] range within the same memory region to a cache. So it doesn't look like good practice to store an inexact pfn in the cache.
>>> This patch fixes it by using PFN_UP().
>>> The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Cc: <stable@vger.kernel.org> # 3.12+
>> Please fully describe the runtime effects of a bug when fixing that bug. This description doesn't give enough justification for merging the patch into mainline, let alone -stable.
> Since PFN_UP() and PFN_DOWN() differ when the address is not page aligned, in theory we may have the two situations below.
Have you ever seen a HW that would report page unaligned memory ranges? Is this even possible?
On Tue, Apr 03, 2018 at 01:30:41PM +0200, Michal Hocko wrote:
> On Mon 02-04-18 09:50:26, Wei Yang wrote:
>> On Fri, Mar 30, 2018 at 01:57:27PM -0700, Andrew Morton wrote:
>>> On Fri, 30 Mar 2018 11:30:55 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>>>> memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
>>>> For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
>>> Why is this not correct? The caller might want the pfn of the page which covers the base?
>> Hmm... the only caller of memblock_search_pfn_nid() is __early_pfn_to_nid(), which returns the nid of a pfn and saves the [start_pfn, end_pfn] range within the same memory region to a cache. So it doesn't look like good practice to store an inexact pfn in the cache.
>>>> This patch fixes it by using PFN_UP().
>>>> The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> Cc: <stable@vger.kernel.org> # 3.12+
>>> Please fully describe the runtime effects of a bug when fixing that bug. This description doesn't give enough justification for merging the patch into mainline, let alone -stable.
>> Since PFN_UP() and PFN_DOWN() differ when the address is not page aligned, in theory we may have the two situations below.
> Have you ever seen a HW that would report page unaligned memory ranges? Is this even possible?
No, so we don't need to handle this case?
> --
> Michal Hocko
> SUSE Labs
On Wed 04-04-18 09:33:57, Wei Yang wrote:
> On Tue, Apr 03, 2018 at 01:30:41PM +0200, Michal Hocko wrote:
>> On Mon 02-04-18 09:50:26, Wei Yang wrote:
>>> On Fri, Mar 30, 2018 at 01:57:27PM -0700, Andrew Morton wrote:
>>>> On Fri, 30 Mar 2018 11:30:55 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>>>>> memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
>>>>> For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
>>>> Why is this not correct? The caller might want the pfn of the page which covers the base?
>>> Hmm... the only caller of memblock_search_pfn_nid() is __early_pfn_to_nid(), which returns the nid of a pfn and saves the [start_pfn, end_pfn] range within the same memory region to a cache. So it doesn't look like good practice to store an inexact pfn in the cache.
>>>>> This patch fixes it by using PFN_UP().
>>>>> The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>> Cc: <stable@vger.kernel.org> # 3.12+
>>>> Please fully describe the runtime effects of a bug when fixing that bug. This description doesn't give enough justification for merging the patch into mainline, let alone -stable.
>>> Since PFN_UP() and PFN_DOWN() differ when the address is not page aligned, in theory we may have the two situations below.
>> Have you ever seen a HW that would report page unaligned memory ranges? Is this even possible?
> No, so we don't need to handle this case?
Memblock code is subtle enough that we shouldn't touch it for something that doesn't really exist. We do have alignment assumptions all over, so if we had weird configurations with page-unaligned regions of memory then we should do the rounding when this is detected rather than spreading it all over random places.
On Wed, Apr 04, 2018 at 07:45:53AM +0200, Michal Hocko wrote:
> On Wed 04-04-18 09:33:57, Wei Yang wrote:
>> On Tue, Apr 03, 2018 at 01:30:41PM +0200, Michal Hocko wrote:
>>> On Mon 02-04-18 09:50:26, Wei Yang wrote:
>>>> On Fri, Mar 30, 2018 at 01:57:27PM -0700, Andrew Morton wrote:
>>>>> On Fri, 30 Mar 2018 11:30:55 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>>>>>> memblock_search_pfn_nid() returns the nid and the [start|end]_pfn of the memory region in which pfn sits. However, the calculation of start_pfn has a potential issue when the region's base is not page aligned.
>>>>>> For example, assume PAGE_SHIFT is 12 and base is 0x1234. The current implementation would return a start_pfn of 1, which is not correct: page 1 starts at 0x1000, below the region's base, so the first page fully covered by the region is pfn 2.
>>>>> Why is this not correct? The caller might want the pfn of the page which covers the base?
>>>> Hmm... the only caller of memblock_search_pfn_nid() is __early_pfn_to_nid(), which returns the nid of a pfn and saves the [start_pfn, end_pfn] range within the same memory region to a cache. So it doesn't look like good practice to store an inexact pfn in the cache.
>>>>>> This patch fixes it by using PFN_UP().
>>>>>> The code was introduced by commit e76b63f80d93 ("memblock, numa: binary search node id"), merged in v3.12.
>>>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>>>> Cc: <stable@vger.kernel.org> # 3.12+
>>>>> Please fully describe the runtime effects of a bug when fixing that bug. This description doesn't give enough justification for merging the patch into mainline, let alone -stable.
>>>> Since PFN_UP() and PFN_DOWN() differ when the address is not page aligned, in theory we may have the two situations below.
>>> Have you ever seen a HW that would report page unaligned memory ranges? Is this even possible?
>> No, so we don't need to handle this case?
> Memblock code is subtle enough that we shouldn't touch it for something that doesn't really exist. We do have alignment assumptions all over, so if we had weird configurations with page-unaligned regions of memory then we should do the rounding when this is detected rather than spreading it all over random places.
Well, I get your concern, and I agree with having some assumptions to keep the code easy to read and write.
But my point is that the current code doesn't keep this assumption consistently. For example, __next_mem_pfn_range() and memblock_search_pfn_nid() treat the pfn range differently. The inconsistency may confuse readers and harm the kernel in the long run.
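For instance, __next_mem_pfn_range() already clips each region with PFN_UP()/PFN_DOWN(), roughly like this (abridged from mm/memblock.c):

void __init_memblock __next_mem_pfn_range(int *idx, int nid,
					  unsigned long *out_start_pfn,
					  unsigned long *out_end_pfn,
					  int *out_nid)
{
	struct memblock_type *type = &memblock.memory;
	struct memblock_region *r;

	while (++*idx < type->cnt) {
		r = &type->regions[*idx];

		/* Regions without a fully covered page are skipped... */
		if (PFN_UP(r->base) >= PFN_DOWN(r->base + r->size))
			continue;
		if (nid == MAX_NUMNODES || nid == r->nid)
			break;
	}
	if (*idx >= type->cnt) {
		*idx = -1;
		return;
	}

	/* ...and the range is clipped with PFN_UP()/PFN_DOWN(). */
	if (out_start_pfn)
		*out_start_pfn = PFN_UP(r->base);
	if (out_end_pfn)
		*out_end_pfn = PFN_DOWN(r->base + r->size);
	if (out_nid)
		*out_nid = r->nid;
}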
To avoid spreading this calculation over random places, how about introducing a helper function, memblock_pfn_range(), to consolidate it in one place and help readers in the future?
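Something along these lines, say (just a sketch of the idea; the name and exact shape are of course up for discussion):

/*
 * Hypothetical helper: the pfn range of the pages fully covered by a
 * memblock region. Both memblock_search_pfn_nid() and
 * __next_mem_pfn_range() could use it instead of open-coding the clipping.
 */
static inline void memblock_pfn_range(const struct memblock_region *r,
				      unsigned long *start_pfn,
				      unsigned long *end_pfn)
{
	*start_pfn = PFN_UP(r->base);
	*end_pfn = PFN_DOWN(r->base + r->size);
}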
For example, https://patchwork.kernel.org/patch/10323845/ tries to optimize page initialization, which relies on the memblock pfn range. If we could consolidate the calculation in one place, wouldn't it be more convenient to maintain the kernel?
> --
> Michal Hocko
> SUSE Labs