The calculation of bridge window head alignment is done by calculate_mem_align() [*]. With the default bridge window alignment, it is used for both head and tail alignment.
The selected head alignment does not always result in tight-fitting resources (gap at d4f00000-d4ffffff):
    d4800000-dbffffff : PCI Bus 0000:06
      d4800000-d48fffff : PCI Bus 0000:07
        d4800000-d4803fff : 0000:07:00.0
          d4800000-d4803fff : nvme
      d4900000-d49fffff : PCI Bus 0000:0a
        d4900000-d490ffff : 0000:0a:00.0
          d4900000-d490ffff : r8169
        d4910000-d4913fff : 0000:0a:00.0
      d4a00000-d4cfffff : PCI Bus 0000:0b
        d4a00000-d4bfffff : 0000:0b:00.0
          d4a00000-d4bfffff : 0000:0b:00.0
        d4c00000-d4c07fff : 0000:0b:00.0
      d4d00000-d4dfffff : PCI Bus 0000:15
        d4d00000-d4d07fff : 0000:15:00.0
          d4d00000-d4d07fff : xhci-hcd
      d4e00000-d4efffff : PCI Bus 0000:16
        d4e00000-d4e7ffff : 0000:16:00.0
        d4e80000-d4e803ff : 0000:16:00.0
          d4e80000-d4e803ff : ahci
      d5000000-dbffffff : PCI Bus 0000:0c
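As a quick cross-check of the gap noted above, the following userspace sketch walks the child bridge windows of PCI Bus 0000:06 (ranges copied from the listing) and reports the first hole between consecutive windows. The struct win type and the first_gap() helper are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: an inclusive [start, end] address range. */
struct win {
	uint64_t start;
	uint64_t end;
};

/* Child bridge windows of PCI Bus 0000:06, sorted by start address. */
static const struct win bus06_children[] = {
	{ 0xd4800000, 0xd48fffff },	/* PCI Bus 0000:07 */
	{ 0xd4900000, 0xd49fffff },	/* PCI Bus 0000:0a */
	{ 0xd4a00000, 0xd4cfffff },	/* PCI Bus 0000:0b */
	{ 0xd4d00000, 0xd4dfffff },	/* PCI Bus 0000:15 */
	{ 0xd4e00000, 0xd4efffff },	/* PCI Bus 0000:16 */
	{ 0xd5000000, 0xdbffffff },	/* PCI Bus 0000:0c */
};

/*
 * Find the first hole between the parent window start and the sorted
 * child windows. Returns 1 and fills *gap if a hole exists, 0 otherwise.
 */
static int first_gap(uint64_t parent_start, const struct win *w, size_t n,
		     struct win *gap)
{
	uint64_t next = parent_start;
	size_t i;

	for (i = 0; i < n; i++) {
		if (w[i].start > next) {
			gap->start = next;
			gap->end = w[i].start - 1;
			return 1;
		}
		next = w[i].end + 1;
	}
	return 0;
}
```

Calling first_gap(0xd4800000, bus06_children, 6, &gap) on this data yields d4f00000-d4ffffff, the 1 MB hole in front of PCI Bus 0000:0c.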
This has not caused problems (for years) with the default bridge window tail alignment, which grossly overestimates the required tail alignment and leaves more tail room than necessary. With the introduction of relaxed tail alignment that leaves no extra tail room whatsoever, any gaps will immediately turn into assignment failures.
Introduce head alignment calculation that ensures no gaps are left and apply the new approach when using relaxed alignment.
([*] I don't understand the algorithm in calculate_mem_align().)
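To make the new calculation easier to follow, here is a userspace sketch of the same loop that can be compiled and run outside the kernel. It assumes aligns[order] holds the summed alignments of resources aligned to (1MB << order), as the patch accumulates them. MB_SHIFT replaces __ffs(SZ_1M), and the resource_size_t typedef and the head_align_sketch() name are local stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M		0x100000ULL
#define MB_SHIFT	20	/* stand-in for __ffs(SZ_1M) */

typedef uint64_t resource_size_t;	/* userspace stand-in */

/*
 * Start from the largest alignment and halve it for as long as the
 * smaller-aligned resources seen so far can fill the head room that
 * the reduced alignment may open up in front of the larger resources.
 */
static resource_size_t head_align_sketch(const resource_size_t *aligns,
					 int max_order)
{
	resource_size_t head_align = (resource_size_t)1 << (max_order + MB_SHIFT);
	resource_size_t remainder = 0;
	int order;

	for (order = max_order - 1; order >= 0; order--) {
		resource_size_t align1 = (resource_size_t)1 << (order + MB_SHIFT);

		/* Smaller resources available as head fill. */
		remainder += aligns[order];

		/* Each halving consumes head fill equal to the new alignment. */
		while (head_align > align1 && remainder >= head_align / 2) {
			head_align /= 2;
			remainder -= head_align;
		}
	}

	return head_align;
}
```

With one 8 MB, one 4 MB, one 2 MB and two 1 MB resources (aligns[3] = 8 MB, aligns[2] = 4 MB, aligns[1] = 2 MB, aligns[0] = 2 MB, max_order = 3), the sketch returns 1 MB: whatever 1 MB-aligned address the window starts at, the smaller resources can fill the space in front of the larger ones. With only one 8 MB and one 1 MB resource there is not enough head fill, and the result stays at 8 MB.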
Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220775
Reported-by: Malte Schröder <malte+lkml@tnxip.de>
Tested-by: Malte Schröder <malte+lkml@tnxip.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org
---
Somewhat annoyingly, the contents of the aligns array differ between the legacy alignment approach (which I dare not touch as I really don't understand what the algorithm tries to do) and this new head alignment algorithm, so two arrays are needed, both consuming stack space. Once the follow-up patch makes the new approach the only one available, only one array remains (however, that follow-up change is also somewhat riskier when it comes to regressions).
That being said, the new head alignment could work with the same aligns array as the legacy approach; it just won't necessarily produce an optimal (the smallest possible) head alignment when the if (r_size <= align) condition is used. Just let me know if that approach is preferred (to save some stack space).

---
 drivers/pci/setup-bus.c | 53 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 70d021ffb486..93f6b0750174 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1224,6 +1224,45 @@ static inline resource_size_t calculate_mem_align(resource_size_t *aligns,
 	return min_align;
 }
 
+/*
+ * Calculate bridge window head alignment that leaves no gaps in between
+ * resources.
+ */
+static resource_size_t calculate_head_align(resource_size_t *aligns,
+					    int max_order)
+{
+	resource_size_t head_align = 1;
+	resource_size_t remainder = 0;
+	int order;
+
+	/* Take the largest alignment as the starting point. */
+	head_align <<= max_order + __ffs(SZ_1M);
+
+	for (order = max_order - 1; order >= 0; order--) {
+		resource_size_t align1 = 1;
+
+		align1 <<= order + __ffs(SZ_1M);
+
+		/*
+		 * Account smaller resources with alignment < max_order that
+		 * could be used to fill head room if alignment less than
+		 * max_order is used.
+		 */
+		remainder += aligns[order];
+
+		/*
+		 * Test if head fill is enough to satisfy the alignment of
+		 * the larger resources after reducing the alignment.
+		 */
+		while ((head_align > align1) && (remainder >= head_align / 2)) {
+			head_align /= 2;
+			remainder -= head_align;
+		}
+	}
+
+	return head_align;
+}
+
 /**
  * pbus_upstream_space_available - Check no upstream resource limits allocation
  * @bus: The bus
@@ -1311,13 +1350,13 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 {
 	struct pci_dev *dev;
 	resource_size_t min_align, win_align, align, size, size0, size1 = 0;
-	resource_size_t aligns[28]; /* Alignments from 1MB to 128TB */
+	resource_size_t aligns[28] = {}; /* Alignments from 1MB to 128TB */
+	resource_size_t aligns2[28] = {};/* Alignments from 1MB to 128TB */
 	int order, max_order;
 	struct resource *b_res = pbus_select_window_for_type(bus, type);
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
 	resource_size_t add_align = 0;
-	resource_size_t relaxed_align;
 	resource_size_t old_size;
 
 	if (!b_res)
@@ -1327,7 +1366,6 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 	if (b_res->parent)
 		return;
 
-	memset(aligns, 0, sizeof(aligns));
 	max_order = 0;
 	size = 0;
 
@@ -1378,6 +1416,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 			 */
 			if (r_size <= align)
 				aligns[order] += align;
+			aligns2[order] += align;
 			if (order > max_order)
 				max_order = order;
 
@@ -1402,9 +1441,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 
 	if (bus->self && size0 &&
 	    !pbus_upstream_space_available(bus, b_res, size0, min_align)) {
-		relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
-		relaxed_align = max(relaxed_align, win_align);
-		min_align = min(min_align, relaxed_align);
+		min_align = calculate_head_align(aligns2, max_order);
 		size0 = calculate_memsize(size, min_size, 0, 0, old_size, win_align);
 		resource_set_range(b_res, min_align, size0);
 		pci_info(bus->self, "bridge window %pR to %pR requires relaxed alignment rules\n",
@@ -1418,9 +1455,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 
 	if (bus->self && size1 &&
 	    !pbus_upstream_space_available(bus, b_res, size1, add_align)) {
-		relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
-		relaxed_align = max(relaxed_align, win_align);
-		min_align = min(min_align, relaxed_align);
+		min_align = calculate_head_align(aligns2, max_order);
 		size1 = calculate_memsize(size, min_size, add_size, children_add_size,
 					  old_size, win_align);
 		pci_info(bus->self,