On Wed, Jan 17, 2018 at 12:03:15PM -0800, Eric Anholt wrote:
> Boris Brezillon <boris.brezillon(a)free-electrons.com> writes:
>
> > When saving BOs in the hang state we skip one entry of the
> > kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL
> > pointer dereference when, later in this function, we iterate over all
> > BOs to check their ->madv state.
> >
> > Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
> > Cc: <stable(a)vger.kernel.org>
> > Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
> > ---
> > drivers/gpu/drm/vc4/vc4_gem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> > index 6c32c89a83a9..19ac7fe0e5db 100644
> > --- a/drivers/gpu/drm/vc4/vc4_gem.c
> > +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> > @@ -208,7 +208,7 @@ vc4_save_hang_state(struct drm_device *dev)
> > kernel_state->bo[j + prev_idx] = &bo->base.base;
> > j++;
> > }
> > - prev_idx = j + 1;
> > + prev_idx = j;
>
> Could we replace the whole "[j + prev_idx]" with a "[k++]" and maybe a
> WARN_ON_ONCE(k != state->bo_count) at the end?
>
> I really need to figure out if I can come up with a way to make IGT
> cases for GPU hangs on vc4, despite the validation. I found a bug in
> GPU reset due to BCL hangs when doing vc5, but I don't have a testcase.
> Maybe some submit flags that overwrite the BCL or RCL to do an infinite
> loop?
What we currently do for i915 is an endless chain of batches (since no
command parser we can get away with that). Previously we did a special
debugfs mode which blocked out updating the ring head (but left all the
other command submission handling in place). Except for the very minor
change nothing needed to be adjusted in the kernel, and from the kernel's
pov it very much looked like the gpu simply died.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Wed, 17 Jan 2018 12:03:15 -0800
Eric Anholt <eric(a)anholt.net> wrote:
> Boris Brezillon <boris.brezillon(a)free-electrons.com> writes:
>
> > When saving BOs in the hang state we skip one entry of the
> > kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL
> > pointer dereference when, later in this function, we iterate over all
> > BOs to check their ->madv state.
> >
> > Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
> > Cc: <stable(a)vger.kernel.org>
> > Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
> > ---
> > drivers/gpu/drm/vc4/vc4_gem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> > index 6c32c89a83a9..19ac7fe0e5db 100644
> > --- a/drivers/gpu/drm/vc4/vc4_gem.c
> > +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> > @@ -208,7 +208,7 @@ vc4_save_hang_state(struct drm_device *dev)
> > kernel_state->bo[j + prev_idx] = &bo->base.base;
> > j++;
> > }
> > - prev_idx = j + 1;
> > + prev_idx = j;
>
> Could we replace the whole "[j + prev_idx]" with a "[k++]" and maybe a
> WARN_ON_ONCE(k != state->bo_count) at the end?
Sure.
>
> I really need to figure out if I can come up with a way to make IGT
> cases for GPU hangs on vc4, despite the validation.
I managed to trigger the NULL pointer dereference while debugging the
perfmon stuff, but it's fixed now, so I don't have a way to easily
force a reset.
> I found a bug in
> GPU reset due to BCL hangs when doing vc5, but I don't have a testcase.
> Maybe some submit flags that overwrite the BCL or RCL to do an infinite
> loop?
usbip host lists devices attached to vhci_hcd on the same server
when user does attach over localhost or specifies the server as the
remote.
usbip attach -r localhost -b busid
or
usbip attach -r servername (or server IP)
Fix it to check and not list devices that are attached to vhci_hcd.
Cc: stable(a)vger.kernel.org
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
---
tools/usb/usbip/src/usbip_list.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/usb/usbip/src/usbip_list.c b/tools/usb/usbip/src/usbip_list.c
index f1b38e866dd7..d65a9f444174 100644
--- a/tools/usb/usbip/src/usbip_list.c
+++ b/tools/usb/usbip/src/usbip_list.c
@@ -187,6 +187,7 @@ static int list_devices(bool parsable)
const char *busid;
char product_name[128];
int ret = -1;
+ const char *devpath;
/* Create libudev context. */
udev = udev_new();
@@ -209,6 +210,14 @@ static int list_devices(bool parsable)
path = udev_list_entry_get_name(dev_list_entry);
dev = udev_device_new_from_syspath(udev, path);
+ /* Ignore devices attached to vhci_hcd */
+ devpath = udev_device_get_devpath(dev);
+ if (strstr(devpath, USBIP_VHCI_DRV_NAME)) {
+ dbg("Skip the device %s already attached to %s\n",
+ devpath, USBIP_VHCI_DRV_NAME);
+ continue;
+ }
+
/* Get device information. */
idVendor = udev_device_get_sysattr_value(dev, "idVendor");
idProduct = udev_device_get_sysattr_value(dev, "idProduct");
--
2.14.1
usbip host binds to devices attached to vhci_hcd on the same server
when user does attach over localhost or specifies the server as the
remote.
usbip attach -r localhost -b busid
or
usbip attach -r servername (or server IP)
Unbind followed by bind works, however device is left in a bad state with
accesses via the attached busid result in errors and system hangs during
shutdown.
Fix it to check and bail out if the device is already attached to vhci_hcd.
Cc: stable(a)vger.kernel.org
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
---
tools/usb/usbip/src/usbip_bind.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/usb/usbip/src/usbip_bind.c b/tools/usb/usbip/src/usbip_bind.c
index fa46141ae68b..e121cfb1746a 100644
--- a/tools/usb/usbip/src/usbip_bind.c
+++ b/tools/usb/usbip/src/usbip_bind.c
@@ -144,6 +144,7 @@ static int bind_device(char *busid)
int rc;
struct udev *udev;
struct udev_device *dev;
+ const char *devpath;
/* Check whether the device with this bus ID exists. */
udev = udev_new();
@@ -152,8 +153,16 @@ static int bind_device(char *busid)
err("device with the specified bus ID does not exist");
return -1;
}
+ devpath = udev_device_get_devpath(dev);
udev_unref(udev);
+ /* If the device is already attached to vhci_hcd - bail out */
+ if (strstr(devpath, USBIP_VHCI_DRV_NAME)) {
+ err("bind loop detected: device: %s is attached to %s\n",
+ devpath, USBIP_VHCI_DRV_NAME);
+ return -1;
+ }
+
rc = unbind_other(busid);
if (rc == UNBIND_ST_FAILED) {
err("could not unbind driver from device on busid %s", busid);
--
2.14.1
If ubifs_wbuf_sync() fails we must not write a master node with the
dirty marker cleared.
Otherwise it is possible that in case of an IO error while syncing we
mark the filesystem as clean and UBIFS refuses to recover upon next
mount.
Cc: <stable(a)vger.kernel.org>
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
fs/ubifs/super.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 0beb285b143d..468bca452d0a 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -1739,8 +1739,11 @@ static void ubifs_remount_ro(struct ubifs_info *c)
dbg_save_space_info(c);
- for (i = 0; i < c->jhead_cnt; i++)
- ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ for (i = 0; i < c->jhead_cnt; i++) {
+ err = ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ if (err)
+ ubifs_ro_mode(c, err);
+ }
c->mst_node->flags &= ~cpu_to_le32(UBIFS_MST_DIRTY);
c->mst_node->flags |= cpu_to_le32(UBIFS_MST_NO_ORPHS);
@@ -1806,8 +1809,11 @@ static void ubifs_put_super(struct super_block *sb)
int err;
/* Synchronize write-buffers */
- for (i = 0; i < c->jhead_cnt; i++)
- ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ for (i = 0; i < c->jhead_cnt; i++) {
+ err = ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ if (err)
+ ubifs_ro_mode(c, err);
+ }
/*
* We are being cleanly unmounted which means the
--
2.13.6
Hi Greg,
Could you please pick up commit 1b5c7ef3d0d0 ("drm/nouveau/disp/gf119:
add missing drive vfunc ptr") for the 4.14 series? It fixes
https://bugs.freedesktop.org/show_bug.cgi?id=103421 which seems to break
nouveau for everyone with a GF119 card.
This problem has also been reported twice in Debian[1,2], and the Debian
kernel team has already applied the patch.
Cheers,
Sven
1. https://bugs.debian.org/880660
2. https://bugs.debian.org/886727
This is a note to let you know that I've just added the patch titled
drm/nouveau/disp/gf119: add missing drive vfunc ptr
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-nouveau-disp-gf119-add-missing-drive-vfunc-ptr.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1b5c7ef3d0d0610bda9b63263f7c5b7178d11015 Mon Sep 17 00:00:00 2001
From: Rob Clark <robdclark(a)gmail.com>
Date: Sat, 6 Jan 2018 10:59:41 -0500
Subject: drm/nouveau/disp/gf119: add missing drive vfunc ptr
From: Rob Clark <robdclark(a)gmail.com>
commit 1b5c7ef3d0d0610bda9b63263f7c5b7178d11015 upstream.
Fixes broken dp on GF119:
Call Trace:
? nvkm_dp_train_drive+0x183/0x2c0 [nouveau]
nvkm_dp_acquire+0x4f3/0xcd0 [nouveau]
nv50_disp_super_2_2+0x5d/0x470 [nouveau]
? nvkm_devinit_pll_set+0xf/0x20 [nouveau]
gf119_disp_super+0x19c/0x2f0 [nouveau]
process_one_work+0x193/0x3c0
worker_thread+0x35/0x3b0
kthread+0x125/0x140
? process_one_work+0x3c0/0x3c0
? kthread_park+0x60/0x60
ret_from_fork+0x25/0x30
Code: Bad RIP value.
RIP: (null) RSP: ffffb1e243e4bc38
CR2: 0000000000000000
Fixes: af85389c614a drm/nouveau/disp: shuffle functions around
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103421
Signed-off-by: Rob Clark <robdclark(a)gmail.com>
Signed-off-by: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Sven Joachim <svenjoac(a)gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
@@ -174,6 +174,7 @@ gf119_sor = {
.links = gf119_sor_dp_links,
.power = g94_sor_dp_power,
.pattern = gf119_sor_dp_pattern,
+ .drive = gf119_sor_dp_drive,
.vcpi = gf119_sor_dp_vcpi,
.audio = gf119_sor_dp_audio,
.audio_sym = gf119_sor_dp_audio_sym,
Patches currently in stable-queue which might be from robdclark(a)gmail.com are
queue-4.14/drm-nouveau-disp-gf119-add-missing-drive-vfunc-ptr.patch