The patch titled
Subject: mm: zero remaining unavailable struct pages
has been removed from the -mm tree. Its filename was
mm-zero-remaining-unavailable-struct-pages.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Subject: mm: zero remaining unavailable struct pages
There is a kernel panic that is triggered when reading /proc/kpageflags on
a kernel booted with the kernel parameter 'memmap=nn[KMG]!ss[KMG]':
BUG: unable to handle kernel paging request at fffffffffffffffe
PGD 9b20e067 P4D 9b20e067 PUD 9b210067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 2 PID: 1728 Comm: page-types Not tainted 4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
RIP: 0010:stable_page_flags+0x27/0x3c0
Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7
RSP: 0018:ffffbbd44111fde0 EFLAGS: 00010202
RAX: fffffffffffffffe RBX: 00007fffffffeff9 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffffed1182fff5c0
RBP: ffffffffffffffff R08: 0000000000000001 R09: 0000000000000001
R10: ffffbbd44111fed8 R11: 0000000000000000 R12: ffffed1182fff5c0
R13: 00000000000bffd7 R14: 0000000002fff5c0 R15: ffffbbd44111ff10
FS: 00007efc4335a500(0000) GS:ffff93a5bfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffffffffffffe CR3: 00000000b2a58000 CR4: 00000000001406e0
Call Trace:
kpageflags_read+0xc7/0x120
proc_reg_read+0x3c/0x60
__vfs_read+0x36/0x170
vfs_read+0x89/0x130
ksys_pread64+0x71/0x90
do_syscall_64+0x5b/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7efc42e75e23
Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24
According to kernel bisection, this problem became visible due to commit
f7f99100d8d9 which changes how struct pages are initialized.
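The register dump above is consistent with stable_page_flags() reading a struct page whose words are all ones: RBP holds ffffffffffffffff and the faulting address in RAX/CR2 is fffffffffffffffe. The following is a minimal user-space sketch of that lookup, assuming it behaves like the kernel's compound_head() (bit 0 of the second word marks a tail page, the remaining bits point at the head page). The names model_page and head_address are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-word model of the start of struct page. */
struct model_page {
	uint64_t flags;         /* first word */
	uint64_t compound_head; /* second word; bit 0 set means "tail page" */
};

/*
 * Return the address that would be dereferenced next: the head page
 * if bit 0 of compound_head is set, otherwise the page itself.
 */
uint64_t head_address(const struct model_page *page, uint64_t page_addr)
{
	assert(page != NULL);
	if (page->compound_head & 1)
		return page->compound_head - 1;
	return page_addr;
}
```

With every bit set, the lookup yields fffffffffffffffe, which matches CR2 in the oops. With a zeroed struct page it returns the page's own address, which is why zeroing the remaining struct pages avoids the fault.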
Memblock layout affects the pfn ranges covered by node/zone. Consider
that we have a VM with 2 NUMA nodes and each node has 4GB memory, and
the default (no memmap= given) memblock layout is like below:
MEMBLOCK configuration:
memory size = 0x00000001fff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x4
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000100000000-0x000000013fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0
memory[0x3] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
If you give memmap=1G!4G (so it just covers memory[0x2]),
the range [0x100000000-0x13fffffff] is gone:
MEMBLOCK configuration:
memory size = 0x00000001bff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x3
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
This shrinks node 0's pfn range, because it is calculated from the
address range of memblock.memory. As a result, some of the struct pages
in the gap are left uninitialized.
We have a function zero_resv_unavail() which zeroes the struct
pages outside memblock.memory, but currently it covers only the reserved
unavailable range (i.e. memblock.reserved && !memblock.memory).
This patch extends it to cover all unavailable ranges, which fixes
the reported issue.
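The gap walk can be sketched in user space as follows. This is only a model under stated assumptions: 4 KiB pages, sorted non-overlapping ranges, and no pfn_valid() filtering (the kernel loop skips pfns without a valid memmap). The names model_range and count_unavailable_pfns are hypothetical, and max_pfn is supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumes 4 KiB pages, as on x86-64 */

/* Half-open physical range [start, end), like a memblock.memory region. */
struct model_range {
	uint64_t start;
	uint64_t end;
};

static uint64_t pfn_down(uint64_t addr)
{
	return addr >> MODEL_PAGE_SHIFT;
}

static uint64_t pfn_up(uint64_t addr)
{
	return (addr + (1ULL << MODEL_PAGE_SHIFT) - 1) >> MODEL_PAGE_SHIFT;
}

/*
 * Count the pfns that lie outside every range, up to max_pfn.
 * Ranges must be sorted and non-overlapping.
 */
uint64_t count_unavailable_pfns(const struct model_range *mem, size_t cnt,
				uint64_t max_pfn)
{
	uint64_t next = 0, pgcnt = 0;
	size_t i;

	for (i = 0; i < cnt; i++) {
		assert(mem[i].start <= mem[i].end && next <= mem[i].start);
		/* Gap between the previous range and this one. */
		if (next < mem[i].start)
			pgcnt += pfn_up(mem[i].start) - pfn_down(next);
		next = mem[i].end;
	}
	/* Tail between the last range and max_pfn. */
	if (pfn_down(next) < max_pfn)
		pgcnt += max_pfn - pfn_down(next);
	return pgcnt;
}
```

Fed with the two layouts from the report and a max_pfn of 0x240000 (an assumption taken from the end of the last range), the model counts 262144 more pfns for the memmap=1G!4G layout than for the default one, i.e. exactly the 1 GiB that dropped out of memblock.memory.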
Link: http://lkml.kernel.org/r/20180613054107.GA5329@hori1.linux.bs1.fc.nec.co.jp
Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Tested-by: Oscar Salvador <osalvador(a)suse.de>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Steven Sistare <steven.sistare(a)oracle.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memblock.h | 16 ----------------
mm/page_alloc.c | 33 ++++++++++++++++++++++++---------
2 files changed, 24 insertions(+), 25 deletions(-)
diff -puN include/linux/memblock.h~mm-zero-remaining-unavailable-struct-pages include/linux/memblock.h
--- a/include/linux/memblock.h~mm-zero-remaining-unavailable-struct-pages
+++ a/include/linux/memblock.h
@@ -236,22 +236,6 @@ void __next_mem_pfn_range(int *idx, int
for_each_mem_range_rev(i, &memblock.memory, &memblock.reserved, \
nid, flags, p_start, p_end, p_nid)
-/**
- * for_each_resv_unavail_range - iterate through reserved and unavailable memory
- * @i: u64 used as loop variable
- * @flags: pick from blocks based on memory attributes
- * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
- * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
- *
- * Walks over unavailable but reserved (reserved && !memory) areas of memblock.
- * Available as soon as memblock is initialized.
- * Note: because this memory does not belong to any physical node, flags and
- * nid arguments do not make sense and thus not exported as arguments.
- */
-#define for_each_resv_unavail_range(i, p_start, p_end) \
- for_each_mem_range(i, &memblock.reserved, &memblock.memory, \
- NUMA_NO_NODE, MEMBLOCK_NONE, p_start, p_end, NULL)
-
static inline void memblock_set_region_flags(struct memblock_region *r,
unsigned long flags)
{
diff -puN mm/page_alloc.c~mm-zero-remaining-unavailable-struct-pages mm/page_alloc.c
--- a/mm/page_alloc.c~mm-zero-remaining-unavailable-struct-pages
+++ a/mm/page_alloc.c
@@ -6390,25 +6390,40 @@ void __paginginit free_area_init_node(in
* struct pages which are reserved in memblock allocator and their fields
* may be accessed (for example page_to_pfn() on some configuration accesses
* flags). We must explicitly zero those struct pages.
+ *
+ * This function also addresses a similar issue where struct pages are left
+ * uninitialized because the physical address range is not covered by
+ * memblock.memory or memblock.reserved. That could happen when memblock
+ * layout is manually configured via memmap=.
*/
void __paginginit zero_resv_unavail(void)
{
phys_addr_t start, end;
unsigned long pfn;
u64 i, pgcnt;
+ phys_addr_t next = 0;
/*
- * Loop through ranges that are reserved, but do not have reported
- * physical memory backing.
+ * Loop through unavailable ranges not covered by memblock.memory.
*/
pgcnt = 0;
- for_each_resv_unavail_range(i, &start, &end) {
- for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
- if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages)))
- continue;
- mm_zero_struct_page(pfn_to_page(pfn));
- pgcnt++;
+ for_each_mem_range(i, &memblock.memory, NULL,
+ NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, NULL) {
+ if (next < start) {
+ for (pfn = PFN_DOWN(next); pfn < PFN_UP(start); pfn++) {
+ if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages)))
+ continue;
+ mm_zero_struct_page(pfn_to_page(pfn));
+ pgcnt++;
+ }
}
+ next = end;
+ }
+ for (pfn = PFN_DOWN(next); pfn < max_pfn; pfn++) {
+ if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages)))
+ continue;
+ mm_zero_struct_page(pfn_to_page(pfn));
+ pgcnt++;
}
/*
@@ -6419,7 +6434,7 @@ void __paginginit zero_resv_unavail(void
* this code can be removed.
*/
if (pgcnt)
- pr_info("Reserved but unavailable: %lld pages", pgcnt);
+ pr_info("Zeroed struct page in unavailable ranges: %lld pages", pgcnt);
}
#endif /* CONFIG_HAVE_MEMBLOCK */
_
Patches currently in -mm which might be from n-horiguchi(a)ah.jp.nec.com are
Hi,
The patch is now in mainline, and I have verified that the commit can be cherry-picked
cleanly to the 3 stable branches mentioned in the subject.
When the patch was originally submitted, I did add a "Cc: stable(a)vger.kernel.org" tag:
https://lkml.org/lkml/2018/6/6/766 .
It looks like the tag was somehow left out.
Thanks,
-- Dexuan
From: Martin Kelly <mkelly(a)xevo.com>
commit c043ec1ca5baae63726aae32abbe003192bc6eec upstream.
Currently, we use int for buffer length and bytes_per_datum. However,
kfifo uses unsigned int for length and size_t for element size. We need
to make sure these match or we will have bugs related to overflow (in
the range between INT_MAX and UINT_MAX for length, for example).
In addition, set_bytes_per_datum uses size_t while bytes_per_datum is an
int, which would cause bugs for large values of bytes_per_datum.
Change buffer length to use unsigned int and bytes_per_datum to use
size_t.
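The mismatch can be illustrated with a small user-space sketch. The helper names below are hypothetical and are not part of the IIO API. They only show which values no longer fit once an unsigned int length or a size_t element size is stored in an int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* True if an unsigned kfifo-style length survives storage in an int. */
int length_fits_int(unsigned int length)
{
	return length <= (unsigned int)INT_MAX;
}

/* True if a size_t element size survives storage in an int. */
int datum_size_fits_int(size_t bytes_per_datum)
{
	return bytes_per_datum <= (size_t)INT_MAX;
}

/* Checked narrowing: aborts instead of silently changing the value. */
int length_to_int(unsigned int length)
{
	assert(length_fits_int(length));
	return (int)length;
}
```

Any length above INT_MAX fails the first check, which is the range the commit message refers to. Using unsigned int and size_t in struct iio_buffer removes the need for such checks altogether.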
Signed-off-by: Martin Kelly <mkelly(a)xevo.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
[bwh: Backported to 4.4:
- Drop change to iio_dma_buffer_set_length()
- Adjust filename, context]
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
---
drivers/iio/buffer/kfifo_buf.c | 4 ++--
include/linux/iio/buffer.h | 6 +++---
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/buffer/kfifo_buf.c b/drivers/iio/buffer/kfifo_buf.c
index 7ef9b13262a8..e44181f9eb36 100644
--- a/drivers/iio/buffer/kfifo_buf.c
+++ b/drivers/iio/buffer/kfifo_buf.c
@@ -19,7 +19,7 @@ struct iio_kfifo {
#define iio_to_kfifo(r) container_of(r, struct iio_kfifo, buffer)
static inline int __iio_allocate_kfifo(struct iio_kfifo *buf,
- int bytes_per_datum, int length)
+ size_t bytes_per_datum, unsigned int length)
{
if ((length == 0) || (bytes_per_datum == 0))
return -EINVAL;
@@ -71,7 +71,7 @@ static int iio_set_bytes_per_datum_kfifo(struct iio_buffer *r, size_t bpd)
return 0;
}
-static int iio_set_length_kfifo(struct iio_buffer *r, int length)
+static int iio_set_length_kfifo(struct iio_buffer *r, unsigned int length)
{
/* Avoid an invalid state */
if (length < 2)
diff --git a/include/linux/iio/buffer.h b/include/linux/iio/buffer.h
index 1600c55828e0..93a774ce4922 100644
--- a/include/linux/iio/buffer.h
+++ b/include/linux/iio/buffer.h
@@ -49,7 +49,7 @@ struct iio_buffer_access_funcs {
int (*request_update)(struct iio_buffer *buffer);
int (*set_bytes_per_datum)(struct iio_buffer *buffer, size_t bpd);
- int (*set_length)(struct iio_buffer *buffer, int length);
+ int (*set_length)(struct iio_buffer *buffer, unsigned int length);
void (*release)(struct iio_buffer *buffer);
@@ -78,8 +78,8 @@ struct iio_buffer_access_funcs {
* @watermark: [INTERN] number of datums to wait for poll/read.
*/
struct iio_buffer {
- int length;
- int bytes_per_datum;
+ unsigned int length;
+ size_t bytes_per_datum;
struct attribute_group *scan_el_attrs;
long *scan_mask;
bool scan_timestamp;
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
The patch
spi: cadence: Change usleep_range() to udelay(), for atomic context
has been applied to the spi tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
From 931c4e9a72ae91d59c5332ffb6812911a749da8e Mon Sep 17 00:00:00 2001
From: Janek Kotas <jank(a)cadence.com>
Date: Mon, 4 Jun 2018 11:24:44 +0000
Subject: [PATCH] spi: cadence: Change usleep_range() to udelay(), for atomic
context
The patch "spi: cadence: Add usleep_range() for
cdns_spi_fill_tx_fifo()" added a usleep_range() function call,
which cannot be used in atomic context.
However, the cdns_spi_fill_tx_fifo() function can be called from
an interrupt handler, which may result in a kernel panic:
BUG: scheduling while atomic: grep/561/0x00010002
Modules linked in:
Preemption disabled at:
[<ffffff800858ea28>] wait_for_common+0x48/0x178
CPU: 0 PID: 561 Comm: grep Not tainted 4.17.0 #1
Hardware name: Cadence CSP (DT)
Call trace:
dump_backtrace+0x0/0x198
show_stack+0x14/0x20
dump_stack+0x8c/0xac
__schedule_bug+0x6c/0xb8
__schedule+0x570/0x5d8
schedule+0x34/0x98
schedule_hrtimeout_range_clock+0x98/0x110
schedule_hrtimeout_range+0x10/0x18
usleep_range+0x64/0x98
cdns_spi_fill_tx_fifo+0x70/0xb0
cdns_spi_irq+0xd0/0xe0
__handle_irq_event_percpu+0x9c/0x128
handle_irq_event_percpu+0x34/0x88
handle_irq_event+0x48/0x78
handle_fasteoi_irq+0xbc/0x1b0
generic_handle_irq+0x24/0x38
__handle_domain_irq+0x84/0xf8
gic_handle_irq+0xc4/0x180
This patch replaces the function call with udelay() which can be
used in an atomic context, like an interrupt.
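The last field of the "scheduling while atomic" line (0x00010002) is the preempt count at the time of the bug. Below is a minimal user-space sketch that decodes it, assuming the field widths from include/linux/preempt.h of that era (8 bits of preempt depth, 8 bits of softirq count, 4 bits of hardirq count). The MODEL_* names and both functions are hypothetical helpers, not kernel API.

```c
#include <stdint.h>

/* Assumed field widths, following include/linux/preempt.h of that era. */
#define MODEL_PREEMPT_BITS 8
#define MODEL_SOFTIRQ_BITS 8
#define MODEL_HARDIRQ_BITS 4

struct model_preempt_count {
	unsigned int preempt; /* preempt_disable() nesting depth */
	unsigned int softirq; /* softirq field */
	unsigned int hardirq; /* hard interrupt nesting depth */
};

/* Split a raw preempt count into its three fields. */
struct model_preempt_count decode_preempt_count(uint32_t raw)
{
	struct model_preempt_count pc;

	pc.preempt = raw & ((1u << MODEL_PREEMPT_BITS) - 1);
	raw >>= MODEL_PREEMPT_BITS;
	pc.softirq = raw & ((1u << MODEL_SOFTIRQ_BITS) - 1);
	raw >>= MODEL_SOFTIRQ_BITS;
	pc.hardirq = raw & ((1u << MODEL_HARDIRQ_BITS) - 1);
	return pc;
}

/* In this model, sleeping is only allowed when every field is zero. */
int may_sleep(uint32_t raw)
{
	struct model_preempt_count pc = decode_preempt_count(raw);

	return pc.preempt == 0 && pc.softirq == 0 && pc.hardirq == 0;
}
```

Under these assumptions 0x00010002 decodes to a hardirq depth of 1 and a preempt depth of 2, so the task was inside an interrupt handler when usleep_range() tried to schedule. udelay() busy-waits instead of scheduling, so it does not trip this check.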
Signed-off-by: Jan Kotas <jank(a)cadence.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/spi/spi-cadence.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence.c b/drivers/spi/spi-cadence.c
index f3dad6fcdc35..a568f35522f9 100644
--- a/drivers/spi/spi-cadence.c
+++ b/drivers/spi/spi-cadence.c
@@ -319,7 +319,7 @@ static void cdns_spi_fill_tx_fifo(struct cdns_spi *xspi)
*/
if (cdns_spi_read(xspi, CDNS_SPI_ISR) &
CDNS_SPI_IXR_TXFULL)
- usleep_range(10, 20);
+ udelay(10);
if (xspi->txbuf)
cdns_spi_write(xspi, CDNS_SPI_TXD, *xspi->txbuf++);
--
2.17.1
From: Sandy Huang <hjc(a)rock-chips.com>
The vop irq is shared between vop and iommu and irq probing in the
iommu driver moved to the probe function recently. This can in some
cases lead to a stall if the irq is triggered while the vop driver
still has it disabled, but the vop irq handler gets called.
But there is no real need to disable the irq, as the vop can simply
also track its enabled state and ignore irqs in that case.
For this we can simply check the power-domain state of the vop,
similar to how the iommu driver does it.
So remove the enable/disable handling and add an appropriate condition
to the irq handler.
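The intended handler flow can be sketched in user space as below. This is a simplified model, not the driver code: struct model_vop and its counters are hypothetical stand-ins for the runtime-PM usage count and the core clock enables, and the error path for a failed clock enable is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum model_irqreturn { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

/* Hypothetical stand-in for the vop device state. */
struct model_vop {
	bool in_use;      /* runtime-PM active with users */
	int pm_refs;      /* references held by the handler */
	int clk_enables;  /* core clock enable count */
	uint32_t pending; /* interrupt status bits raised by the vop */
};

enum model_irqreturn model_vop_isr(struct model_vop *vop)
{
	enum model_irqreturn ret = MODEL_IRQ_NONE;

	/* Device not in use: the shared irq must belong to the iommu. */
	if (!vop->in_use)
		return MODEL_IRQ_NONE;
	vop->pm_refs++;

	vop->clk_enables++;
	if (vop->pending) {
		vop->pending = 0; /* acknowledge and handle */
		ret = MODEL_IRQ_HANDLED;
	}
	vop->clk_enables--;

	/* Drop the reference on every path that took one. */
	vop->pm_refs--;
	assert(vop->pm_refs >= 0 && vop->clk_enables >= 0);
	return ret;
}
```

The point of the sketch is that the handler never touches the device unless it holds a runtime-PM reference, and that every reference and clock enable it takes is released before returning, whether or not the interrupt turned out to be for the vop.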
changes in v2:
- move to just check the power-domain state
- add clock handling
changes in v3:
- clarify comment to speak of runtime-pm not power-domain
Fixes: d0b912bd4c23 ("iommu/rockchip: Request irqs in rk_iommu_probe()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sandy Huang <hjc(a)rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Tested-by: Ezequiel Garcia <ezequiel(a)collabora.com>
---
drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 28 ++++++++++++++-------
1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index 9a1f272e41c7..ae8a69793aed 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -573,8 +573,6 @@ static int vop_enable(struct drm_crtc *crtc)
spin_unlock(&vop->reg_lock);
- enable_irq(vop->irq);
-
drm_crtc_vblank_on(crtc);
return 0;
@@ -618,8 +616,6 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc,
vop_dsp_hold_valid_irq_disable(vop);
- disable_irq(vop->irq);
-
vop->is_enabled = false;
/*
@@ -1195,6 +1191,16 @@ static irqreturn_t vop_isr(int irq, void *data)
uint32_t active_irqs;
int ret = IRQ_NONE;
+ /*
+ * The irq is shared with the iommu. If the runtime-pm state of the
+ * vop-device is disabled the irq has to be targetted at the iommu.
+ */
+ if (!pm_runtime_get_if_in_use(vop->dev))
+ return IRQ_NONE;
+
+ if (WARN_ON(vop_core_clks_enable(vop)))
+ goto out;
+
/*
* interrupt register has interrupt status, enable and clear bits, we
* must hold irq_lock to avoid a race with enable/disable_vblank().
@@ -1209,8 +1215,11 @@ static irqreturn_t vop_isr(int irq, void *data)
spin_unlock(&vop->irq_lock);
/* This is expected for vop iommu irqs, since the irq is shared */
- if (!active_irqs)
- return IRQ_NONE;
+ if (!active_irqs) {
+ ret = IRQ_NONE;
+ vop_core_clks_disable(vop);
+ goto out;
+ }
if (active_irqs & DSP_HOLD_VALID_INTR) {
complete(&vop->dsp_hold_completion);
@@ -1236,6 +1245,10 @@ static irqreturn_t vop_isr(int irq, void *data)
DRM_DEV_ERROR(vop->dev, "Unknown VOP IRQs: %#02x\n",
active_irqs);
+ vop_core_clks_disable(vop);
+
+out:
+ pm_runtime_put(vop->dev);
return ret;
}
@@ -1614,9 +1627,6 @@ static int vop_bind(struct device *dev, struct device *master, void *data)
if (ret)
goto err_disable_pm_runtime;
- /* IRQ is initially disabled; it gets enabled in power_on */
- disable_irq(vop->irq);
-
return 0;
err_disable_pm_runtime:
--
2.17.0
This reverts commit 2c17a4368aad2b88b68e4390c819e226cf320f70.
The offending commit triggers a run-time fault when accessing the panel
element of the sun4i_tcon structure when no such panel is attached.
It was apparently assumed in said commit that a panel is always used with
the TCON. Although it is often the case, this is not always true.
For instance a bridge might be used instead of a panel.
This issue was discovered using an A13-OLinuXino, that uses the TCON
in RGB mode for a simple DAC-based VGA bridge.
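For illustration only, the fault can be modelled in user space as below. The revert itself does not add such a guard, it removes the code. The structures are heavily reduced, hypothetical stand-ins for the DRM ones, and the sketch assumes that the panel pointer is NULL when a bridge is used instead of a panel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the DRM structures. */
struct model_connector {
	uint32_t bus_flags;
};

struct model_panel {
	struct model_connector *connector;
};

struct model_tcon {
	struct model_panel *panel; /* assumed NULL when a bridge is used */
};

/*
 * Return the panel's bus flags, or 0 when there is no panel (or no
 * connector) to ask. The reverted code followed the same chain of
 * pointers without these checks.
 */
uint32_t model_bus_flags(const struct model_tcon *tcon)
{
	if (tcon->panel == NULL || tcon->panel->connector == NULL)
		return 0;
	return tcon->panel->connector->bus_flags;
}
```

Without the NULL checks, the first dereference of tcon->panel faults on a board such as the one described above, where only a bridge is attached.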
Cc: stable(a)vger.kernel.org
Signed-off-by: Paul Kocialkowski <paul.kocialkowski(a)bootlin.com>
---
drivers/gpu/drm/sun4i/sun4i_tcon.c | 25 -------------------------
1 file changed, 25 deletions(-)
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c b/drivers/gpu/drm/sun4i/sun4i_tcon.c
index c3d92d537240..8045871335b5 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
@@ -17,7 +17,6 @@
#include <drm/drm_encoder.h>
#include <drm/drm_modes.h>
#include <drm/drm_of.h>
-#include <drm/drm_panel.h>
#include <uapi/drm/drm_mode.h>
@@ -350,9 +349,6 @@ static void sun4i_tcon0_mode_set_lvds(struct sun4i_tcon *tcon,
static void sun4i_tcon0_mode_set_rgb(struct sun4i_tcon *tcon,
const struct drm_display_mode *mode)
{
- struct drm_panel *panel = tcon->panel;
- struct drm_connector *connector = panel->connector;
- struct drm_display_info display_info = connector->display_info;
unsigned int bp, hsync, vsync;
u8 clk_delay;
u32 val = 0;
@@ -410,27 +406,6 @@ static void sun4i_tcon0_mode_set_rgb(struct sun4i_tcon *tcon,
if (mode->flags & DRM_MODE_FLAG_PVSYNC)
val |= SUN4I_TCON0_IO_POL_VSYNC_POSITIVE;
- /*
- * On A20 and similar SoCs, the only way to achieve Positive Edge
- * (Rising Edge), is setting dclk clock phase to 2/3(240°).
- * By default TCON works in Negative Edge(Falling Edge),
- * this is why phase is set to 0 in that case.
- * Unfortunately there's no way to logically invert dclk through
- * IO_POL register.
- * The only acceptable way to work, triple checked with scope,
- * is using clock phase set to 0° for Negative Edge and set to 240°
- * for Positive Edge.
- * On A33 and similar SoCs there would be a 90° phase option,
- * but it divides also dclk by 2.
- * Following code is a way to avoid quirks all around TCON
- * and DOTCLOCK drivers.
- */
- if (display_info.bus_flags & DRM_BUS_FLAG_PIXDATA_POSEDGE)
- clk_set_phase(tcon->dclk, 240);
-
- if (display_info.bus_flags & DRM_BUS_FLAG_PIXDATA_NEGEDGE)
- clk_set_phase(tcon->dclk, 0);
-
regmap_update_bits(tcon->regs, SUN4I_TCON0_IO_POL_REG,
SUN4I_TCON0_IO_POL_HSYNC_POSITIVE | SUN4I_TCON0_IO_POL_VSYNC_POSITIVE,
val);
--
2.17.0
From: Eric Dumazet <edumazet(a)google.com>
commit 02db55718d53f9d426cee504c27fb768e9ed4ffe upstream.
While rcvbuf is properly clamped by tcp_rmem[2], rcvwin
is left to a potentially too big value.
It has no serious effect, since:
1) tcp_grow_window() has very strict checks.
2) window_clamp can be mangled by user space to any value anyway.
tcp_init_buffer_space() and companions use tcp_full_space();
we use tcp_win_from_space() to avoid reloading sk->sk_rcvbuf.
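A minimal user-space sketch of the buffer-to-window conversion, assuming tcp_win_from_space() in this kernel series follows the usual tcp_adv_win_scale formula. The function name model_win_from_space is hypothetical, and the scale is passed as a parameter here instead of being read from the sysctl.

```c
#include <assert.h>

/*
 * Model of the buffer-to-window conversion. adv_win_scale stands in for
 * the tcp_adv_win_scale sysctl and is passed explicitly.
 */
int model_win_from_space(int space, int adv_win_scale)
{
	assert(space >= 0);
	assert(adv_win_scale >= -31 && adv_win_scale <= 31);
	if (adv_win_scale <= 0)
		return space >> -adv_win_scale;
	return space - (space >> adv_win_scale);
}
```

For example, with a scale of 1 a receive buffer of 6291456 bytes gives a window clamp of 3145728 bytes. Deriving window_clamp from the already clamped rcvbuf this way keeps the two values consistent, which the unclamped rcvwin did not guarantee.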
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Acked-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Acked-by: Wei Wang <weiwan(a)google.com>
Acked-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
---
net/ipv4/tcp_input.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 125b49c166a4..f0caff3139ed 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -647,7 +647,7 @@ void tcp_rcv_space_adjust(struct sock *sk)
sk->sk_rcvbuf = rcvbuf;
/* Make the window clamp follow along. */
- tp->window_clamp = rcvwin;
+ tp->window_clamp = tcp_win_from_space(rcvbuf);
}
}
tp->rcvq_space.space = copied;
--
2.7.4
Hi,
4.14.48 can cause abnormally small TCP receive windows when the sender is
faster than the receiver; see https://github.com/coreos/bugs/issues/2457 for
details. Reverting "tcp: avoid integer overflows in tcp_rcv_space_adjust()"
fixes it. Backporting 02db55718d53 ("tcp: do not overshoot window_clamp in
tcp_rcv_space_adjust()"), which is its parent commit upstream, also fixes
it.
--Benjamin Gilbert