This patch changes drm_mode_legacy_fb_format() to only return formats
which are supported by the current drm-device. The motivation for this
change is to fix a regression introduced by commit
c91acda3a380 ("drm/gem: Check for valid formats") which stops the Xorg
modesetting driver from working on the Beagleboard Black (it uses the
tilcdc kernel driver).
When the Xorg modesetting driver starts up, it tries to determine the
default bpp for the device. It does this by allocating a dumb 32bpp
frame buffer (using DRM_IOCTL_MODE_CREATE_DUMB) and then calling
drmModeAddFB() with that frame buffer asking for a 24-bit depth and 32
bpp. As the modesetting driver uses drmModeAddFB() which doesn't
supply a format, the kernel's drm_mode_legacy_fb_format() is called to
provide a format matching the requested depth and bpp. If the
drmModeAddFB() call fails, it forces both depth and bpp to 24. If
drmModeAddFB() succeeds, depth is assumed to be 24 and bpp 32. The
dummy frame buffer is then removed (using drmModeRmFB()).
If the modesetting driver finds that both the default bpp and depth
are 24, it forces the use of a 32bpp shadow buffer and a 24bpp front
buffer. Following this, the driver reads the user-specified color
depth option and tries to create a framebuffer of that depth, but if
the use of a shadow buffer has been forced, the bpp and depth of it
overrides the user-supplied option.
The Xorg modesetting driver on top of the tilcdc kernel driver used to
work on the Beagleboard Black if a 16 bit color depth was
configured. The hardware in the Beagleboard Black supports the RG16,
BG24, and XB24 formats. When drm_mode_legacy_fb_format() was called to
request a format for a 24-bit depth and 32 bpp, it would return the
unsupported RG24 format which drmModeAddFB() would happily accept (as
there was no check for a valid format). As a shadow buffer wasn't
forced, the modesetting driver would try the user specified 16 bit
color depth and drm_mode_legacy_fb_format() would return RG16 which is
supported by the hardware. Color depths of 24 bits were not supported,
as the unsupported RG24 would be detected when drmModeSetCrtc() was
called.
Following commit c91acda3a380 ("drm/gem: Check for valid formats"),
which adds a check for a valid (supported by the hardware) format to
the code path for the kernel part of drmModeAddFB(), the modesetting
driver fails to configure and add a frame buffer. This is because the
call to create a 24-bit depth and 32 bpp framebuffer during detection
of the default bpp will now fail and a 24-bit depth and 24 bpp front
buffer will be forced. As drm_mode_legacy_fb_format() will return RG24
which isn't supported, the creation of that framebuffer will also
fail.
To fix the regression, this patch extends drm_mode_legacy_fb_format()
to list all formats with a particular bpp and color depth known to the
kernel, and have it probe the current drm-device for a supported
format. This fixes the regression and, as a bonus, a color depth of 24
bits on the Beagleboard Black is now working.
As this patch changes drm_mode_legacy_fb_format() which is used by
other drivers, it has, in addition to the Beagleboard Black, also been
tested with the nouveau and modesetting drivers on a NVIDIA NV96, and
with the intel and modesetting drivers on an intel HD Graphics 4000
chipset.
Signed-off-by: Frej Drejhammar <frej.drejhammar(a)gmail.com>
Fixes: c91acda3a380 ("drm/gem: Check for valid formats")
Cc: stable(a)vger.kernel.org
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: David Airlie <airlied(a)gmail.com>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson(a)gmail.com>
Cc: Rob Clark <robdclark(a)gmail.com>
Cc: Abhinav Kumar <quic_abhinavk(a)quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Cc: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Cc: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: "Maíra Canal" <mcanal(a)igalia.com>
Cc: "Ville Syrjälä" <ville.syrjala(a)linux.intel.com>
Cc: dri-devel(a)lists.freedesktop.org
---
drivers/gpu/drm/armada/armada_fbdev.c | 2 +-
drivers/gpu/drm/drm_fb_helper.c | 2 +-
drivers/gpu/drm/drm_fbdev_dma.c | 3 +-
drivers/gpu/drm/drm_fbdev_generic.c | 3 +-
drivers/gpu/drm/drm_fourcc.c | 91 ++++++++++++++++---
drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 3 +-
drivers/gpu/drm/gma500/fbdev.c | 2 +-
drivers/gpu/drm/i915/display/intel_fbdev_fb.c | 3 +-
drivers/gpu/drm/msm/msm_fbdev.c | 3 +-
drivers/gpu/drm/omapdrm/omap_fbdev.c | 2 +-
drivers/gpu/drm/radeon/radeon_fbdev.c | 3 +-
drivers/gpu/drm/tegra/fbdev.c | 3 +-
drivers/gpu/drm/tiny/ofdrm.c | 6 +-
drivers/gpu/drm/xe/display/intel_fbdev_fb.c | 3 +-
include/drm/drm_fourcc.h | 3 +-
15 files changed, 104 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/armada/armada_fbdev.c b/drivers/gpu/drm/armada/armada_fbdev.c
index d223176912b6..82f312f76980 100644
--- a/drivers/gpu/drm/armada/armada_fbdev.c
+++ b/drivers/gpu/drm/armada/armada_fbdev.c
@@ -54,7 +54,7 @@ static int armada_fbdev_create(struct drm_fb_helper *fbh,
mode.width = sizes->surface_width;
mode.height = sizes->surface_height;
mode.pitches[0] = armada_pitch(mode.width, sizes->surface_bpp);
- mode.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ mode.pixel_format = drm_mode_legacy_fb_format(dev, sizes->surface_bpp,
sizes->surface_depth);
size = mode.pitches[0] * mode.height;
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index d612133e2cf7..62f81a14fb2e 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1453,7 +1453,7 @@ static uint32_t drm_fb_helper_find_format(struct drm_fb_helper *fb_helper, const
* the framebuffer emulation can only deal with such
* formats, specifically RGB/BGA formats.
*/
- format = drm_mode_legacy_fb_format(bpp, depth);
+ format = drm_mode_legacy_fb_format(dev, bpp, depth);
if (!format)
goto err;
diff --git a/drivers/gpu/drm/drm_fbdev_dma.c b/drivers/gpu/drm/drm_fbdev_dma.c
index 6c9427bb4053..cdb315c6d110 100644
--- a/drivers/gpu/drm/drm_fbdev_dma.c
+++ b/drivers/gpu/drm/drm_fbdev_dma.c
@@ -90,7 +90,8 @@ static int drm_fbdev_dma_helper_fb_probe(struct drm_fb_helper *fb_helper,
sizes->surface_width, sizes->surface_height,
sizes->surface_bpp);
- format = drm_mode_legacy_fb_format(sizes->surface_bpp, sizes->surface_depth);
+ format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp, sizes->surface_depth);
buffer = drm_client_framebuffer_create(client, sizes->surface_width,
sizes->surface_height, format);
if (IS_ERR(buffer))
diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c
index d647d89764cb..aba8c272560c 100644
--- a/drivers/gpu/drm/drm_fbdev_generic.c
+++ b/drivers/gpu/drm/drm_fbdev_generic.c
@@ -84,7 +84,8 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper,
sizes->surface_width, sizes->surface_height,
sizes->surface_bpp);
- format = drm_mode_legacy_fb_format(sizes->surface_bpp, sizes->surface_depth);
+ format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp, sizes->surface_depth);
buffer = drm_client_framebuffer_create(client, sizes->surface_width,
sizes->surface_height, format);
if (IS_ERR(buffer))
diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
index 193cf8ed7912..034f2087af9a 100644
--- a/drivers/gpu/drm/drm_fourcc.c
+++ b/drivers/gpu/drm/drm_fourcc.c
@@ -29,47 +29,97 @@
#include <drm/drm_device.h>
#include <drm/drm_fourcc.h>
+#include <drm/drm_plane.h>
+#include <drm/drm_print.h>
+
+/*
+ * Internal helper to find a valid format among a list of potentially
+ * valid formats.
+ *
+ * Traverses the variadic arguments until a format supported by @dev
+ * or an DRM_FORMAT_INVALID argument is found. If a supported format
+ * is found it is returned, otherwise DRM_FORMAT_INVALID is returned.
+ */
+static uint32_t select_valid_format(struct drm_device *dev, ...)
+{
+ va_list va;
+ uint32_t fmt = DRM_FORMAT_INVALID;
+ uint32_t to_try;
+
+ va_start(va, dev);
+
+ for (to_try = va_arg(va, uint32_t);
+ to_try != DRM_FORMAT_INVALID;
+ to_try = va_arg(va, uint32_t)) {
+ if (drm_any_plane_has_format(dev, to_try, 0)) {
+ fmt = to_try;
+ break;
+ }
+ }
+
+ va_end(va);
+
+ return fmt;
+}
/**
* drm_mode_legacy_fb_format - compute drm fourcc code from legacy description
+ * @dev: DRM device
* @bpp: bits per pixels
* @depth: bit depth per pixel
*
* Computes a drm fourcc pixel format code for the given @bpp/@depth values.
* Useful in fbdev emulation code, since that deals in those values.
*/
-uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth)
+uint32_t drm_mode_legacy_fb_format(struct drm_device *dev,
+ uint32_t bpp, uint32_t depth)
{
uint32_t fmt = DRM_FORMAT_INVALID;
switch (bpp) {
case 1:
if (depth == 1)
- fmt = DRM_FORMAT_C1;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_C1,
+ DRM_FORMAT_INVALID);
break;
case 2:
if (depth == 2)
- fmt = DRM_FORMAT_C2;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_C2,
+ DRM_FORMAT_INVALID);
break;
case 4:
if (depth == 4)
- fmt = DRM_FORMAT_C4;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_C4,
+ DRM_FORMAT_INVALID);
break;
case 8:
if (depth == 8)
- fmt = DRM_FORMAT_C8;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_C8,
+ DRM_FORMAT_INVALID);
break;
case 16:
switch (depth) {
case 15:
- fmt = DRM_FORMAT_XRGB1555;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_XRGB1555,
+ DRM_FORMAT_XBGR1555,
+ DRM_FORMAT_RGBX5551,
+ DRM_FORMAT_BGRX5551,
+ DRM_FORMAT_INVALID);
break;
case 16:
- fmt = DRM_FORMAT_RGB565;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_RGB565,
+ DRM_FORMAT_BGR565,
+ DRM_FORMAT_INVALID);
break;
default:
break;
@@ -78,19 +128,36 @@ uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth)
case 24:
if (depth == 24)
- fmt = DRM_FORMAT_RGB888;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_RGB888,
+ DRM_FORMAT_BGR888);
break;
case 32:
switch (depth) {
case 24:
- fmt = DRM_FORMAT_XRGB8888;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_XRGB8888,
+ DRM_FORMAT_XBGR8888,
+ DRM_FORMAT_RGBX8888,
+ DRM_FORMAT_BGRX8888,
+ DRM_FORMAT_INVALID);
break;
case 30:
- fmt = DRM_FORMAT_XRGB2101010;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_XRGB2101010,
+ DRM_FORMAT_XBGR2101010,
+ DRM_FORMAT_RGBX1010102,
+ DRM_FORMAT_BGRX1010102,
+ DRM_FORMAT_INVALID);
break;
case 32:
- fmt = DRM_FORMAT_ARGB8888;
+ fmt = select_valid_format(dev,
+ DRM_FORMAT_ARGB8888,
+ DRM_FORMAT_ABGR8888,
+ DRM_FORMAT_RGBA8888,
+ DRM_FORMAT_BGRA8888,
+ DRM_FORMAT_INVALID);
break;
default:
break;
@@ -119,7 +186,7 @@ EXPORT_SYMBOL(drm_mode_legacy_fb_format);
uint32_t drm_driver_legacy_fb_format(struct drm_device *dev,
uint32_t bpp, uint32_t depth)
{
- uint32_t fmt = drm_mode_legacy_fb_format(bpp, depth);
+ uint32_t fmt = drm_mode_legacy_fb_format(dev, bpp, depth);
if (dev->mode_config.quirk_addfb_prefer_host_byte_order) {
if (fmt == DRM_FORMAT_XRGB8888)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
index a379c8ca435a..e114ebd44169 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
@@ -104,7 +104,8 @@ static int exynos_drm_fbdev_create(struct drm_fb_helper *helper,
mode_cmd.width = sizes->surface_width;
mode_cmd.height = sizes->surface_height;
mode_cmd.pitches[0] = sizes->surface_width * (sizes->surface_bpp >> 3);
- mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ mode_cmd.pixel_format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp,
sizes->surface_depth);
size = mode_cmd.pitches[0] * mode_cmd.height;
diff --git a/drivers/gpu/drm/gma500/fbdev.c b/drivers/gpu/drm/gma500/fbdev.c
index 98b44974d42d..811ae5cccf2c 100644
--- a/drivers/gpu/drm/gma500/fbdev.c
+++ b/drivers/gpu/drm/gma500/fbdev.c
@@ -189,7 +189,7 @@ static int psb_fbdev_fb_probe(struct drm_fb_helper *fb_helper,
mode_cmd.width = sizes->surface_width;
mode_cmd.height = sizes->surface_height;
mode_cmd.pitches[0] = ALIGN(mode_cmd.width * DIV_ROUND_UP(bpp, 8), 64);
- mode_cmd.pixel_format = drm_mode_legacy_fb_format(bpp, depth);
+ mode_cmd.pixel_format = drm_mode_legacy_fb_format(dev, bpp, depth);
size = mode_cmd.pitches[0] * mode_cmd.height;
size = ALIGN(size, PAGE_SIZE);
diff --git a/drivers/gpu/drm/i915/display/intel_fbdev_fb.c b/drivers/gpu/drm/i915/display/intel_fbdev_fb.c
index 0665f943f65f..cb32fcff8fb5 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev_fb.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev_fb.c
@@ -30,7 +30,8 @@ struct drm_framebuffer *intel_fbdev_fb_alloc(struct drm_fb_helper *helper,
mode_cmd.pitches[0] = ALIGN(mode_cmd.width *
DIV_ROUND_UP(sizes->surface_bpp, 8), 64);
- mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ mode_cmd.pixel_format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp,
sizes->surface_depth);
size = mode_cmd.pitches[0] * mode_cmd.height;
diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
index 030bedac632d..8748610299b4 100644
--- a/drivers/gpu/drm/msm/msm_fbdev.c
+++ b/drivers/gpu/drm/msm/msm_fbdev.c
@@ -77,7 +77,8 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
uint32_t format;
int ret, pitch;
- format = drm_mode_legacy_fb_format(sizes->surface_bpp, sizes->surface_depth);
+ format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp, sizes->surface_depth);
DBG("create fbdev: %dx%d@%d (%dx%d)", sizes->surface_width,
sizes->surface_height, sizes->surface_bpp,
diff --git a/drivers/gpu/drm/omapdrm/omap_fbdev.c b/drivers/gpu/drm/omapdrm/omap_fbdev.c
index 6b08b137af1a..98f01d80abd8 100644
--- a/drivers/gpu/drm/omapdrm/omap_fbdev.c
+++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c
@@ -139,7 +139,7 @@ static int omap_fbdev_create(struct drm_fb_helper *helper,
sizes->surface_height, sizes->surface_bpp,
sizes->fb_width, sizes->fb_height);
- mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ mode_cmd.pixel_format = drm_mode_legacy_fb_format(dev, sizes->surface_bpp,
sizes->surface_depth);
mode_cmd.width = sizes->surface_width;
diff --git a/drivers/gpu/drm/radeon/radeon_fbdev.c b/drivers/gpu/drm/radeon/radeon_fbdev.c
index 02bf25759059..bf1843529c7c 100644
--- a/drivers/gpu/drm/radeon/radeon_fbdev.c
+++ b/drivers/gpu/drm/radeon/radeon_fbdev.c
@@ -221,7 +221,8 @@ static int radeon_fbdev_fb_helper_fb_probe(struct drm_fb_helper *fb_helper,
if ((sizes->surface_bpp == 24) && ASIC_IS_AVIVO(rdev))
sizes->surface_bpp = 32;
- mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ mode_cmd.pixel_format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp,
sizes->surface_depth);
ret = radeon_fbdev_create_pinned_object(fb_helper, &mode_cmd, &gobj);
diff --git a/drivers/gpu/drm/tegra/fbdev.c b/drivers/gpu/drm/tegra/fbdev.c
index db6eaac3d30e..290e8c426b0c 100644
--- a/drivers/gpu/drm/tegra/fbdev.c
+++ b/drivers/gpu/drm/tegra/fbdev.c
@@ -87,7 +87,8 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
cmd.pitches[0] = round_up(sizes->surface_width * bytes_per_pixel,
tegra->pitch_align);
- cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ cmd.pixel_format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp,
sizes->surface_depth);
size = cmd.pitches[0] * cmd.height;
diff --git a/drivers/gpu/drm/tiny/ofdrm.c b/drivers/gpu/drm/tiny/ofdrm.c
index ab89b7fc7bf6..ded868601aea 100644
--- a/drivers/gpu/drm/tiny/ofdrm.c
+++ b/drivers/gpu/drm/tiny/ofdrm.c
@@ -100,14 +100,14 @@ static const struct drm_format_info *display_get_validated_format(struct drm_dev
switch (depth) {
case 8:
- format = drm_mode_legacy_fb_format(8, 8);
+ format = drm_mode_legacy_fb_format(dev, 8, 8);
break;
case 15:
case 16:
- format = drm_mode_legacy_fb_format(16, depth);
+ format = drm_mode_legacy_fb_format(dev, 16, depth);
break;
case 32:
- format = drm_mode_legacy_fb_format(32, 24);
+ format = drm_mode_legacy_fb_format(dev, 32, 24);
break;
default:
drm_err(dev, "unsupported framebuffer depth %u\n", depth);
diff --git a/drivers/gpu/drm/xe/display/intel_fbdev_fb.c b/drivers/gpu/drm/xe/display/intel_fbdev_fb.c
index 51ae3561fd0d..a38a8143d632 100644
--- a/drivers/gpu/drm/xe/display/intel_fbdev_fb.c
+++ b/drivers/gpu/drm/xe/display/intel_fbdev_fb.c
@@ -32,7 +32,8 @@ struct drm_framebuffer *intel_fbdev_fb_alloc(struct drm_fb_helper *helper,
mode_cmd.pitches[0] = ALIGN(mode_cmd.width *
DIV_ROUND_UP(sizes->surface_bpp, 8), XE_PAGE_SIZE);
- mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+ mode_cmd.pixel_format = drm_mode_legacy_fb_format(dev,
+ sizes->surface_bpp,
sizes->surface_depth);
size = mode_cmd.pitches[0] * mode_cmd.height;
diff --git a/include/drm/drm_fourcc.h b/include/drm/drm_fourcc.h
index ccf91daa4307..75d06393a564 100644
--- a/include/drm/drm_fourcc.h
+++ b/include/drm/drm_fourcc.h
@@ -310,7 +310,8 @@ const struct drm_format_info *drm_format_info(u32 format);
const struct drm_format_info *
drm_get_format_info(struct drm_device *dev,
const struct drm_mode_fb_cmd2 *mode_cmd);
-uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth);
+uint32_t drm_mode_legacy_fb_format(struct drm_device *dev,
+ uint32_t bpp, uint32_t depth);
uint32_t drm_driver_legacy_fb_format(struct drm_device *dev,
uint32_t bpp, uint32_t depth);
unsigned int drm_format_info_block_width(const struct drm_format_info *info,
base-commit: b9511c6d277c31b13d4f3128eba46f4e0733d734
--
2.44.0
As a matter of fact, continuous reads require additional handling at the
operation level in order for them to work properly. The core helpers do
have this additional logic now, but any time a controller implements its
own page helper, this extra logic is "lost". This means we need another
level of per-controller driver checks to ensure they can leverage
continuous reads. This is for now unsupported, so in order to ensure
continuous reads are enabled only when fully using the core page
helpers, we need to add more initial checks.
Also, as performance is not relevant during raw accesses, we also
prevent these from enabling the feature.
This should solve the issue seen with controllers such as the STM32 FMC2
when in sequencer mode. In this case, the continuous read feature would
be enabled but not leveraged, and most importantly not disabled, leading
to further operations to fail.
Reported-by: Christophe Kerello <christophe.kerello(a)foss.st.com>
Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
drivers/mtd/nand/raw/nand_base.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 4d5a663e4e05..2479fa98f991 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -3594,7 +3594,8 @@ static int nand_do_read_ops(struct nand_chip *chip, loff_t from,
oob = ops->oobbuf;
oob_required = oob ? 1 : 0;
- rawnand_enable_cont_reads(chip, page, readlen, col);
+ if (likely(ops->mode != MTD_OPS_RAW))
+ rawnand_enable_cont_reads(chip, page, readlen, col);
while (1) {
struct mtd_ecc_stats ecc_stats = mtd->ecc_stats;
@@ -5212,6 +5213,15 @@ static void rawnand_late_check_supported_ops(struct nand_chip *chip)
if (!nand_has_exec_op(chip))
return;
+ /*
+ * For now, continuous reads can only be used with the core page helpers.
+ * This can be extended later.
+ */
+ if (!(chip->ecc.read_page == nand_read_page_hwecc ||
+ chip->ecc.read_page == nand_read_page_syndrome ||
+ chip->ecc.read_page == nand_read_page_swecc))
+ return;
+
rawnand_check_cont_read_support(chip);
}
--
2.40.1
None of the callers of drm_panel_get_modes() expect it to return
negative error codes. Either they propagate the return value in their
struct drm_connector_helper_funcs .get_modes() hook (which is also not
supposed to return negative codes), or add it to other counts leading to
bogus values.
On the other hand, many of the struct drm_panel_funcs .get_modes() hooks
do return negative error codes, so handle them gracefully instead of
propagating further.
Return 0 for no modes, whatever the reason.
Cc: Neil Armstrong <neil.armstrong(a)linaro.org>
Cc: Jessica Zhang <quic_jesszhan(a)quicinc.com>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
---
drivers/gpu/drm/drm_panel.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/drm_panel.c b/drivers/gpu/drm/drm_panel.c
index e814020bbcd3..cfbe020de54e 100644
--- a/drivers/gpu/drm/drm_panel.c
+++ b/drivers/gpu/drm/drm_panel.c
@@ -274,19 +274,24 @@ EXPORT_SYMBOL(drm_panel_disable);
* The modes probed from the panel are automatically added to the connector
* that the panel is attached to.
*
- * Return: The number of modes available from the panel on success or a
- * negative error code on failure.
+ * Return: The number of modes available from the panel on success, or 0 on
+ * failure (no modes).
*/
int drm_panel_get_modes(struct drm_panel *panel,
struct drm_connector *connector)
{
if (!panel)
- return -EINVAL;
+ return 0;
- if (panel->funcs && panel->funcs->get_modes)
- return panel->funcs->get_modes(panel, connector);
+ if (panel->funcs && panel->funcs->get_modes) {
+ int num;
- return -EOPNOTSUPP;
+ num = panel->funcs->get_modes(panel, connector);
+ if (num > 0)
+ return num;
+ }
+
+ return 0;
}
EXPORT_SYMBOL(drm_panel_get_modes);
--
2.39.2
Hi Sasha,
On 10/03/2024 03:33, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> selftests: mptcp: simult flows: format subtests results in TAP
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> selftests-mptcp-simult-flows-format-subtests-results.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Thank you for having backported this commit 675d99338e7a ("selftests:
mptcp: simult flows: format subtests results in TAP") -- as well as
commit 4d8e0dde0403 ("selftests: mptcp: simult flows: fix some subtest
names"), a fix for it -- as a "dependence" for commit 5e2f3c65af47
("selftests: mptcp: decrease BW in simult flows"), but I think it is
better not to include 675d99338e7a (and 4d8e0dde0403): they are not
dependences, just modifying the lines around, and they depend on other
commits to have this feature to work.
In other words, commit 675d99338e7a ("selftests: mptcp: simult flows:
format subtests results in TAP") -- and 4d8e0dde0403 ("selftests: mptcp:
simult flows: fix some subtest names") -- is now causing the MPTCP
simult flows selftest to fail. Could it be possible to remove them from
6.1 and 5.15 queues please?
> commit 4eeef0aaffa567f812390612c30f800de02edd73
> Author: Matthieu Baerts <matttbe(a)kernel.org>
> Date: Mon Jul 17 15:21:31 2023 +0200
>
> selftests: mptcp: simult flows: format subtests results in TAP
>
> [ Upstream commit 675d99338e7a6cd925d61d7dbf8c26612f7f08a9 ]
>
> The current selftests infrastructure formats the results in TAP 13. This
> version doesn't support subtests and only the end result of each
> selftest is taken into account. It means that a single issue in a
> subtest of a selftest containing multiple subtests forces the whole
> selftest to be marked as failed. It also means that subtests results are
> not tracked by CIs executing selftests.
>
> MPTCP selftests run hundreds of various subtests. It is then important
> to track each of them and not one result per selftest.
>
> It is particularly interesting to do that when validating stable kernels
> with the last version of the test suite: tests might fail because a
> feature is not supported but the test didn't skip that part. In this
> case, if subtests are not tracked, the whole selftest will be marked as
> failed making the other subtests useless because their results are
> ignored.
>
> This patch formats subtests results in TAP in simult_flows.sh selftest.
>
> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
> Acked-by: Paolo Abeni <pabeni(a)redhat.com>
> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Stable-dep-of: 5e2f3c65af47 ("selftests: mptcp: decrease BW in simult flows")
If needed, I can help to resolve the conflicts to have commit
5e2f3c65af47 ("selftests: mptcp: decrease BW in simult flows")
backported to 6.1 and 5.15.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
On Sun, Mar 10, 2024 at 03:43:38PM -0400, Kent Overstreet wrote:
> The following changes since commit 2e7cdd29fc42c410eab52fffe5710bf656619222:
>
> Linux 6.7.9 (2024-03-06 14:54:01 +0000)
>
> are available in the Git repository at:
>
> https://evilpiepirate.org/git/bcachefs.git tags/bcachefs-for-v6.7-stable-20240310
>
> for you to fetch changes up to 560ceb6a4d9e3bea57c29f5f3a7a1d671dfc7983:
>
> bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree (2024-03-10 14:36:57 -0400)
>
> ----------------------------------------------------------------
> bcachefs fixes for 6.7 stable
>
> "bcachefs: fix simulateously upgrading & downgrading" is the important
> one here. This fixes a really nasty bug where in a rare situation we
> wouldn't downgrade; we'd write a superblock where the version number is
> higher than the currently supported version.
>
> This caused total failure to mount multi device filesystems with the
> splitbrain checking in 6.8, since now we wouldn't be updating the member
> sequence numbers used for splitbrain checking, but the version number
> said we would be - and newer versions would attempt to kick every device
> out of the fs.
>
> ----------------------------------------------------------------
> Helge Deller (1):
> bcachefs: Fix build on parisc by avoiding __multi3()
>
> Kent Overstreet (3):
> bcachefs: check for failure to downgrade
> bcachefs: fix simulateously upgrading & downgrading
> bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
>
> Mathias Krause (1):
> bcachefs: install fd later to avoid race with close
>
> fs/bcachefs/btree_iter.c | 4 +++-
> fs/bcachefs/chardev.c | 3 +--
> fs/bcachefs/errcode.h | 1 +
> fs/bcachefs/mean_and_variance.h | 2 +-
> fs/bcachefs/super-io.c | 27 ++++++++++++++++++++++++---
> 5 files changed, 30 insertions(+), 7 deletions(-)
Qcom SoCs making use of ARM SMMU require BDF to SID translation table in
the driver to properly map the SID for the PCIe devices based on their BDF
identifier. This is currently achieved with the help of
qcom_pcie_config_sid_1_9_0() function for SoCs supporting the 1_9_0 config.
But With newer Qcom SoCs starting from SM8450, BDF to SID translation is
set to bypass mode by default in hardware. Due to this, the translation
table that is set in the qcom_pcie_config_sid_1_9_0() is essentially
unused and the default SID is used for all endpoints in SoCs starting from
SM8450.
This is a security concern and also warrants swapping the DeviceID in DT
while using the GIC ITS to handle MSIs from endpoints. The swapping is
currently done like below in DT when using GIC ITS:
/*
* MSIs for BDF (1:0.0) only works with Device ID 0x5980.
* Hence, the IDs are swapped.
*/
msi-map = <0x0 &gic_its 0x5981 0x1>,
<0x100 &gic_its 0x5980 0x1>;
Here, swapping of the DeviceIDs ensure that the endpoint with BDF (1:0.0)
gets the DeviceID 0x5980 which is associated with the default SID as per
the iommu mapping in DT. So MSIs were delivered with IDs swapped so far.
But this also means the Root Port (0:0.0) won't receive any MSIs (for PME,
AER etc...)
So let's fix these issues by clearing the BDF to SID bypass mode for all
SoCs making use of the 1_9_0 config. This allows the PCIe devices to use
the correct SID, thus avoiding the DeviceID swapping hack in DT and also
achieving the isolation between devices.
Cc: <stable(a)vger.kernel.org> # 5.11
Fixes: 4c9398822106 ("PCI: qcom: Add support for configuring BDF to SID mapping for SM8250")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
I will send the DT patches to fix the msi-map entries once this patch gets
merged.
---
drivers/pci/controller/dwc/pcie-qcom.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 10f2d0bb86be..84e47c6f95fe 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -53,6 +53,7 @@
#define PARF_SLV_ADDR_SPACE_SIZE 0x358
#define PARF_DEVICE_TYPE 0x1000
#define PARF_BDF_TO_SID_TABLE_N 0x2000
+#define PARF_BDF_TO_SID_CFG 0x2c00
/* ELBI registers */
#define ELBI_SYS_CTRL 0x04
@@ -120,6 +121,9 @@
/* PARF_DEVICE_TYPE register fields */
#define DEVICE_TYPE_RC 0x4
+/* PARF_BDF_TO_SID_CFG fields */
+#define BDF_TO_SID_BYPASS BIT(0)
+
/* ELBI_SYS_CTRL register fields */
#define ELBI_SYS_CTRL_LT_ENABLE BIT(0)
@@ -1008,11 +1012,17 @@ static int qcom_pcie_config_sid_1_9_0(struct qcom_pcie *pcie)
u8 qcom_pcie_crc8_table[CRC8_TABLE_SIZE];
int i, nr_map, size = 0;
u32 smmu_sid_base;
+ u32 val;
of_get_property(dev->of_node, "iommu-map", &size);
if (!size)
return 0;
+ /* Enable BDF to SID translation by disabling bypass mode (default) */
+ val = readl(pcie->parf + PARF_BDF_TO_SID_CFG);
+ val &= ~BDF_TO_SID_BYPASS;
+ writel(val, pcie->parf + PARF_BDF_TO_SID_CFG);
+
map = kzalloc(size, GFP_KERNEL);
if (!map)
return -ENOMEM;
---
base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
change-id: 20240307-pci-bdf-sid-fix-c9cd8c0023d0
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Clang enables -Wenum-enum-conversion and -Wenum-compare-conditional
under -Wenum-conversion. A recent change in Clang strengthened these
warnings and they appear frequently in common builds, primarily due to
several instances in common headers but there are quite a few drivers
that have individual instances as well.
include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
508 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
| ~~~~~~~~~~~~~~~~~~~~~ ^
509 | item];
| ~~~~
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c:955:24: warning: conditional expression between different enumeration types ('enum iwl_mac_beacon_flags' and 'enum iwl_mac_beacon_flags_v1') [-Wenum-compare-conditional]
955 | flags |= is_new_rate ? IWL_MAC_BEACON_CCK
| ^ ~~~~~~~~~~~~~~~~~~
956 | : IWL_MAC_BEACON_CCK_V1;
| ~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c:1120:21: warning: conditional expression between different enumeration types ('enum iwl_mac_beacon_flags' and 'enum iwl_mac_beacon_flags_v1') [-Wenum-compare-conditional]
1120 | 0) > 10 ?
| ^
1121 | IWL_MAC_BEACON_FILS :
| ~~~~~~~~~~~~~~~~~~~
1122 | IWL_MAC_BEACON_FILS_V1;
| ~~~~~~~~~~~~~~~~~~~~~~
Doing arithmetic between or returning two different types of enums could
be a bug, so each of the instance of the warning needs to be evaluated.
Unfortunately, as mentioned above, there are many instances of this
warning in many different configurations, which can break the build when
CONFIG_WERROR is enabled.
To avoid introducing new instances of the warnings while cleaning up the
disruption for the majority of users, disable these warnings for the
default build while leaving them on for W=1 builds.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2002
Link: https://github.com/llvm/llvm-project/commit/8c2ae42b3e1c6aa7c18f873edcebff7…
Acked-by: Yonghong Song <yonghong.song(a)linux.dev>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Changes in v2:
- Only disable the warning for the default build, leave it on for W=1 (Arnd)
- Add Yonghong's ack, as the warning is still disabled for the default
build.
- Link to v1: https://lore.kernel.org/r/20240305-disable-extra-clang-enum-warnings-v1-1-6…
---
scripts/Makefile.extrawarn | 2 ++
1 file changed, 2 insertions(+)
diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
index a9e552a1e910..2f25a1de129d 100644
--- a/scripts/Makefile.extrawarn
+++ b/scripts/Makefile.extrawarn
@@ -132,6 +132,8 @@ KBUILD_CFLAGS += $(call cc-disable-warning, pointer-to-enum-cast)
KBUILD_CFLAGS += -Wno-tautological-constant-out-of-range-compare
KBUILD_CFLAGS += $(call cc-disable-warning, unaligned-access)
KBUILD_CFLAGS += $(call cc-disable-warning, cast-function-type-strict)
+KBUILD_CFLAGS += -Wno-enum-compare-conditional
+KBUILD_CFLAGS += -Wno-enum-enum-conversion
endif
endif
---
base-commit: 90d35da658da8cff0d4ecbb5113f5fac9d00eb72
change-id: 20240304-disable-extra-clang-enum-warnings-bf574c7c99fd
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The .release() function does not get called until all readers of a file
descriptor are finished.
If a thread is blocked on reading a file descriptor in ring_buffer_wait(),
and another thread closes the file descriptor, it will not wake up the
other thread as ring_buffer_wake_waiters() is called by .release(), and
that will not get called until the .read() is finished.
The issue originally showed up in trace-cmd, but the readers are actually
other processes with their own file descriptors. So calling close() would wake
up the other tasks because they are blocked on another descriptor then the
one that was closed(). But there's other wake ups that solve that issue.
When a thread is blocked on a read, it can still hang even when another
thread closed its descriptor.
This is what the .flush() callback is for. Have the .flush() wake up the
readers.
Link: https://lore.kernel.org/linux-trace-kernel/20240308202432.107909457@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: linke li <lilinke99(a)qq.com>
Cc: Rabin Vincent <rabin(a)rab.in>
Fixes: f3ddb74ad0790 ("tracing: Wake up ring buffer waiters on closing of the file")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d16b95ca58a7..c9c898307348 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -8393,6 +8393,20 @@ tracing_buffers_read(struct file *filp, char __user *ubuf,
return size;
}
+static int tracing_buffers_flush(struct file *file, fl_owner_t id)
+{
+ struct ftrace_buffer_info *info = file->private_data;
+ struct trace_iterator *iter = &info->iter;
+
+ iter->wait_index++;
+ /* Make sure the waiters see the new wait_index */
+ smp_wmb();
+
+ ring_buffer_wake_waiters(iter->array_buffer->buffer, iter->cpu_file);
+
+ return 0;
+}
+
static int tracing_buffers_release(struct inode *inode, struct file *file)
{
struct ftrace_buffer_info *info = file->private_data;
@@ -8404,12 +8418,6 @@ static int tracing_buffers_release(struct inode *inode, struct file *file)
__trace_array_put(iter->tr);
- iter->wait_index++;
- /* Make sure the waiters see the new wait_index */
- smp_wmb();
-
- ring_buffer_wake_waiters(iter->array_buffer->buffer, iter->cpu_file);
-
if (info->spare)
ring_buffer_free_read_page(iter->array_buffer->buffer,
info->spare_cpu, info->spare);
@@ -8625,6 +8633,7 @@ static const struct file_operations tracing_buffers_fops = {
.read = tracing_buffers_read,
.poll = tracing_buffers_poll,
.release = tracing_buffers_release,
+ .flush = tracing_buffers_flush,
.splice_read = tracing_buffers_splice_read,
.unlocked_ioctl = tracing_buffers_ioctl,
.llseek = no_llseek,
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The "shortest_full" variable is used to keep track of the waiter that is
waiting for the smallest amount on the ring buffer before being woken up.
When a tasks waits on the ring buffer, it passes in a "full" value that is
a percentage. 0 means wake up on any data. 1-100 means wake up from 1% to
100% full buffer.
As all waiters are on the same wait queue, the wake up happens for the
waiter with the smallest percentage.
The problem is that the smallest_full on the cpu_buffer that stores the
smallest amount doesn't get reset when all the waiters are woken up. It
does get reset when the ring buffer is reset (echo > /sys/kernel/tracing/trace).
This means that tasks may be woken up more often then when they want to
be. Instead, have the shortest_full field get reset just before waking up
all the tasks. If the tasks wait again, they will update the shortest_full
before sleeping.
Also add locking around setting of shortest_full in the poll logic, and
change "work" to "rbwork" to match the variable name for rb_irq_work
structures that are used in other places.
Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.948914369@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: linke li <lilinke99(a)qq.com>
Cc: Rabin Vincent <rabin(a)rab.in>
Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 30 +++++++++++++++++++++++-------
1 file changed, 23 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 3400f11286e3..aa332ace108b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -755,8 +755,19 @@ static void rb_wake_up_waiters(struct irq_work *work)
wake_up_all(&rbwork->waiters);
if (rbwork->full_waiters_pending || rbwork->wakeup_full) {
+ /* Only cpu_buffer sets the above flags */
+ struct ring_buffer_per_cpu *cpu_buffer =
+ container_of(rbwork, struct ring_buffer_per_cpu, irq_work);
+
+ /* Called from interrupt context */
+ raw_spin_lock(&cpu_buffer->reader_lock);
rbwork->wakeup_full = false;
rbwork->full_waiters_pending = false;
+
+ /* Waking up all waiters, they will reset the shortest full */
+ cpu_buffer->shortest_full = 0;
+ raw_spin_unlock(&cpu_buffer->reader_lock);
+
wake_up_all(&rbwork->full_waiters);
}
}
@@ -934,28 +945,33 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
struct file *filp, poll_table *poll_table, int full)
{
struct ring_buffer_per_cpu *cpu_buffer;
- struct rb_irq_work *work;
+ struct rb_irq_work *rbwork;
if (cpu == RING_BUFFER_ALL_CPUS) {
- work = &buffer->irq_work;
+ rbwork = &buffer->irq_work;
full = 0;
} else {
if (!cpumask_test_cpu(cpu, buffer->cpumask))
return EPOLLERR;
cpu_buffer = buffer->buffers[cpu];
- work = &cpu_buffer->irq_work;
+ rbwork = &cpu_buffer->irq_work;
}
if (full) {
- poll_wait(filp, &work->full_waiters, poll_table);
- work->full_waiters_pending = true;
+ unsigned long flags;
+
+ poll_wait(filp, &rbwork->full_waiters, poll_table);
+
+ raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+ rbwork->full_waiters_pending = true;
if (!cpu_buffer->shortest_full ||
cpu_buffer->shortest_full > full)
cpu_buffer->shortest_full = full;
+ raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
} else {
- poll_wait(filp, &work->waiters, poll_table);
- work->waiters_pending = true;
+ poll_wait(filp, &rbwork->waiters, poll_table);
+ rbwork->waiters_pending = true;
}
/*
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
A task can wait on a ring buffer for when it fills up to a specific
watermark. The writer will check the minimum watermark that waiters are
waiting for and if the ring buffer is past that, it will wake up all the
waiters.
The waiters are in a wait loop, and will first check if a signal is
pending and then check if the ring buffer is at the desired level where it
should break out of the loop.
If a file that uses a ring buffer closes, and there's threads waiting on
the ring buffer, it needs to wake up those threads. To do this, a
"wait_index" was used.
Before entering the wait loop, the waiter will read the wait_index. On
wakeup, it will check if the wait_index is different than when it entered
the loop, and will exit the loop if it is. The waker will only need to
update the wait_index before waking up the waiters.
This had a couple of bugs. One trivial one and one broken by design.
The trivial bug was that the waiter checked the wait_index after the
schedule() call. It had to be checked between the prepare_to_wait() and
the schedule() which it was not.
The main bug is that the first check to set the default wait_index will
always be outside the prepare_to_wait() and the schedule(). That's because
the ring_buffer_wait() doesn't have enough context to know if it should
break out of the loop.
The loop itself is not needed, because all the callers to the
ring_buffer_wait() also has their own loop, as the callers have a better
sense of what the context is to decide whether to break out of the loop
or not.
Just have the ring_buffer_wait() block once, and if it gets woken up, exit
the function and let the callers decide what to do next.
Link: https://lore.kernel.org/all/CAHk-=whs5MdtNjzFkTyaUy=vHi=qwWgPi0JgTe6OYUYMNS…
Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.792933613@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: linke li <lilinke99(a)qq.com>
Cc: Rabin Vincent <rabin(a)rab.in>
Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 139 ++++++++++++++++++-------------------
1 file changed, 68 insertions(+), 71 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 0699027b4f4c..3400f11286e3 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -384,7 +384,6 @@ struct rb_irq_work {
struct irq_work work;
wait_queue_head_t waiters;
wait_queue_head_t full_waiters;
- long wait_index;
bool waiters_pending;
bool full_waiters_pending;
bool wakeup_full;
@@ -798,14 +797,40 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu)
rbwork = &cpu_buffer->irq_work;
}
- rbwork->wait_index++;
- /* make sure the waiters see the new index */
- smp_wmb();
-
/* This can be called in any context */
irq_work_queue(&rbwork->work);
}
+static bool rb_watermark_hit(struct trace_buffer *buffer, int cpu, int full)
+{
+ struct ring_buffer_per_cpu *cpu_buffer;
+ bool ret = false;
+
+ /* Reads of all CPUs always waits for any data */
+ if (cpu == RING_BUFFER_ALL_CPUS)
+ return !ring_buffer_empty(buffer);
+
+ cpu_buffer = buffer->buffers[cpu];
+
+ if (!ring_buffer_empty_cpu(buffer, cpu)) {
+ unsigned long flags;
+ bool pagebusy;
+
+ if (!full)
+ return true;
+
+ raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+ pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page;
+ ret = !pagebusy && full_hit(buffer, cpu, full);
+
+ if (!cpu_buffer->shortest_full ||
+ cpu_buffer->shortest_full > full)
+ cpu_buffer->shortest_full = full;
+ raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+ }
+ return ret;
+}
+
/**
* ring_buffer_wait - wait for input to the ring buffer
* @buffer: buffer to wait on
@@ -821,7 +846,6 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full)
struct ring_buffer_per_cpu *cpu_buffer;
DEFINE_WAIT(wait);
struct rb_irq_work *work;
- long wait_index;
int ret = 0;
/*
@@ -840,81 +864,54 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full)
work = &cpu_buffer->irq_work;
}
- wait_index = READ_ONCE(work->wait_index);
-
- while (true) {
- if (full)
- prepare_to_wait(&work->full_waiters, &wait, TASK_INTERRUPTIBLE);
- else
- prepare_to_wait(&work->waiters, &wait, TASK_INTERRUPTIBLE);
-
- /*
- * The events can happen in critical sections where
- * checking a work queue can cause deadlocks.
- * After adding a task to the queue, this flag is set
- * only to notify events to try to wake up the queue
- * using irq_work.
- *
- * We don't clear it even if the buffer is no longer
- * empty. The flag only causes the next event to run
- * irq_work to do the work queue wake up. The worse
- * that can happen if we race with !trace_empty() is that
- * an event will cause an irq_work to try to wake up
- * an empty queue.
- *
- * There's no reason to protect this flag either, as
- * the work queue and irq_work logic will do the necessary
- * synchronization for the wake ups. The only thing
- * that is necessary is that the wake up happens after
- * a task has been queued. It's OK for spurious wake ups.
- */
- if (full)
- work->full_waiters_pending = true;
- else
- work->waiters_pending = true;
-
- if (signal_pending(current)) {
- ret = -EINTR;
- break;
- }
-
- if (cpu == RING_BUFFER_ALL_CPUS && !ring_buffer_empty(buffer))
- break;
-
- if (cpu != RING_BUFFER_ALL_CPUS &&
- !ring_buffer_empty_cpu(buffer, cpu)) {
- unsigned long flags;
- bool pagebusy;
- bool done;
-
- if (!full)
- break;
-
- raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
- pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page;
- done = !pagebusy && full_hit(buffer, cpu, full);
+ if (full)
+ prepare_to_wait(&work->full_waiters, &wait, TASK_INTERRUPTIBLE);
+ else
+ prepare_to_wait(&work->waiters, &wait, TASK_INTERRUPTIBLE);
- if (!cpu_buffer->shortest_full ||
- cpu_buffer->shortest_full > full)
- cpu_buffer->shortest_full = full;
- raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
- if (done)
- break;
- }
+ /*
+ * The events can happen in critical sections where
+ * checking a work queue can cause deadlocks.
+ * After adding a task to the queue, this flag is set
+ * only to notify events to try to wake up the queue
+ * using irq_work.
+ *
+ * We don't clear it even if the buffer is no longer
+ * empty. The flag only causes the next event to run
+ * irq_work to do the work queue wake up. The worse
+ * that can happen if we race with !trace_empty() is that
+ * an event will cause an irq_work to try to wake up
+ * an empty queue.
+ *
+ * There's no reason to protect this flag either, as
+ * the work queue and irq_work logic will do the necessary
+ * synchronization for the wake ups. The only thing
+ * that is necessary is that the wake up happens after
+ * a task has been queued. It's OK for spurious wake ups.
+ */
+ if (full)
+ work->full_waiters_pending = true;
+ else
+ work->waiters_pending = true;
- schedule();
+ if (rb_watermark_hit(buffer, cpu, full))
+ goto out;
- /* Make sure to see the new wait index */
- smp_rmb();
- if (wait_index != work->wait_index)
- break;
+ if (signal_pending(current)) {
+ ret = -EINTR;
+ goto out;
}
+ schedule();
+ out:
if (full)
finish_wait(&work->full_waiters, &wait);
else
finish_wait(&work->waiters, &wait);
+ if (!ret && !rb_watermark_hit(buffer, cpu, full) && signal_pending(current))
+ ret = -EINTR;
+
return ret;
}
--
2.43.0