Hi,
This series of two patches fixes the issue introduced in
cf586021642d80 ("drm/i915/gt: Pipelined page migration") where,
as reported by Matt, in a chain of requests an error is reported
only if happens in the last request.
However Chris noticed that without ensuring exclusivity in the
locking we might end up in some deadlock. That's why patch 1
throttles for the ringspace in order to make sure that no one is
holding it.
Version 1 of this patch has been reviewed by matt and this
version is adding Chris exclusive locking.
Thanks Chris for this work.
Andi
Changelog
=========
v3 -> v4
- In v3 the timeline was being locked, but I forgot that also
request_create() and request_add() are locking the timeline
as well. The former does the locking, the latter does the
unlocking. In order to avoid this extra lock/unlock, we need
the "_locked" version of the said functions.
v2 -> v3
- Really lock the timeline before generating all the requests
until the last.
v1 -> v2
- Add patch 1 for ensuring exclusive locking of the timeline
- Reword git commit of patch 2.
Andi Shyti (4):
drm/i915/gt: Add intel_context_timeline_is_locked helper
drm/i915: Create the locked version of the request create
drm/i915: Create the locked version of the request add
drm/i915/gt: Make sure that errors are propagated through request
chains
Chris Wilson (1):
drm/i915: Throttle for ringspace prior to taking the timeline mutex
drivers/gpu/drm/i915/gt/intel_context.c | 41 +++++++++++++++++++
drivers/gpu/drm/i915/gt/intel_context.h | 8 ++++
drivers/gpu/drm/i915/gt/intel_migrate.c | 41 ++++++++++++++-----
drivers/gpu/drm/i915/i915_request.c | 54 ++++++++++++++++++-------
drivers/gpu/drm/i915/i915_request.h | 3 ++
5 files changed, 122 insertions(+), 25 deletions(-)
--
2.39.2
As per the RZ/G2L users hardware manual (Rev.1.20 Sep, 2022), section
23.3.7 Serial Data Transmission (Asynchronous Mode) it is mentioned
that the TE (transmit enable) must be set after setting TIE (transmit
interrupt enable) or these 2 bits are set to 1 simultaneously by a
single instruction. So set these 2 bits in single instruction.
Fixes: 93bcefd4c6ba ("serial: sh-sci: Fix setting SCSCR_TIE while transferring data")
Cc: stable(a)vger.kernel.org
Signed-off-by: Biju Das <biju.das.jz(a)bp.renesas.com>
---
v3:
* New patch moved here from Renesas SCI fixes patch series
* Updated commit description
* Moved handling of clearing TE bit as separate patch#5.
---
drivers/tty/serial/sh-sci.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index 15743c2f3d3d..32f5c1f7d697 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -601,6 +601,15 @@ static void sci_start_tx(struct uart_port *port)
port->type == PORT_SCIFA || port->type == PORT_SCIFB) {
/* Set TIE (Transmit Interrupt Enable) bit in SCSCR */
ctrl = serial_port_in(port, SCSCR);
+
+ /*
+ * For SCI, TE (transmit enable) must be set after setting TIE
+ * (transmit interrupt enable) or in the same instruction to start
+ * the transmit process.
+ */
+ if (port->type == PORT_SCI)
+ ctrl |= SCSCR_TE;
+
serial_port_out(port, SCSCR, ctrl | SCSCR_TIE);
}
}
@@ -2600,8 +2609,14 @@ static void sci_set_termios(struct uart_port *port, struct ktermios *termios,
sci_set_mctrl(port, port->mctrl);
}
- scr_val |= SCSCR_RE | SCSCR_TE |
- (s->cfg->scscr & ~(SCSCR_CKE1 | SCSCR_CKE0));
+ /*
+ * For SCI, TE (transmit enable) must be set after setting TIE
+ * (transmit interrupt enable) or in the same instruction to
+ * start the transmitting process. So skip setting TE here for SCI.
+ */
+ if (port->type != PORT_SCI)
+ scr_val |= SCSCR_TE;
+ scr_val |= SCSCR_RE | (s->cfg->scscr & ~(SCSCR_CKE1 | SCSCR_CKE0));
serial_port_out(port, SCSCR, scr_val | s->hscif_tot);
if ((srr + 1 == 5) &&
(port->type == PORT_SCIFA || port->type == PORT_SCIFB)) {
--
2.25.1
From: "Liam R. Howlett" <Liam.Howlett(a)Oracle.com>
Use the maple tree in RCU mode for VMA tracking.
The maple tree tracks the stack and is able to update the pivot
(lower/upper boundary) in-place to allow the page fault handler to write
to the tree while holding just the mmap read lock. This is safe as the
writes to the stack have a guard VMA which ensures there will always be
a NULL in the direction of the growth and thus will only update a pivot.
It is possible, but not recommended, to have VMAs that grow up/down
without guard VMAs. syzbot has constructed a testcase which sets up a
VMA to grow and consume the empty space. Overwriting the entire NULL
entry causes the tree to be altered in a way that is not safe for
concurrent readers; the readers may see a node being rewritten or one
that does not match the maple state they are using.
Enabling RCU mode allows the concurrent readers to see a stable node and
will return the expected result.
Link: https://lkml.kernel.org/r/20230227173632.3292573-9-surenb@google.com
Link: https://lore.kernel.org/linux-mm/000000000000b0a65805f663ace6@google.com/
Cc: stable(a)vger.kernel.org
Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: syzbot+8d95422d3537159ca390(a)syzkaller.appspotmail.com
---
include/linux/mm_types.h | 3 ++-
kernel/fork.c | 3 +++
mm/mmap.c | 3 ++-
3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 0722859c3647..a57e6ae78e65 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -774,7 +774,8 @@ struct mm_struct {
unsigned long cpu_bitmap[];
};
-#define MM_MT_FLAGS (MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN)
+#define MM_MT_FLAGS (MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN | \
+ MT_FLAGS_USE_RCU)
extern struct mm_struct init_mm;
/* Pointer magic because the dynamic array size confuses some compilers. */
diff --git a/kernel/fork.c b/kernel/fork.c
index d8cda4c6de6c..1bf31ba07e85 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -617,6 +617,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
if (retval)
goto out;
+ mt_clear_in_rcu(vmi.mas.tree);
for_each_vma(old_vmi, mpnt) {
struct file *file;
@@ -700,6 +701,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
retval = arch_dup_mmap(oldmm, mm);
loop_out:
vma_iter_free(&vmi);
+ if (!retval)
+ mt_set_in_rcu(vmi.mas.tree);
out:
mmap_write_unlock(mm);
flush_tlb_mm(oldmm);
diff --git a/mm/mmap.c b/mm/mmap.c
index 740b54be3ed4..16cbb83b3ec6 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2277,7 +2277,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
int count = 0;
int error = -ENOMEM;
MA_STATE(mas_detach, &mt_detach, 0, 0);
- mt_init_flags(&mt_detach, MT_FLAGS_LOCK_EXTERN);
+ mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
mt_set_external_lock(&mt_detach, &mm->mmap_lock);
/*
@@ -3042,6 +3042,7 @@ void exit_mmap(struct mm_struct *mm)
*/
set_bit(MMF_OOM_SKIP, &mm->flags);
mmap_write_lock(mm);
+ mt_clear_in_rcu(&mm->mm_mt);
free_pgtables(&tlb, &mm->mm_mt, vma, FIRST_USER_ADDRESS,
USER_PGTABLES_CEILING);
tlb_finish_mmu(&tlb);
--
2.39.2
Since 32ef9e5054ec, -Wa,-gdwarf-2 is no longer used in KBUILD_AFLAGS.
Instead, it includes -g, the appropriate -gdwarf-* flag, and also the
-Wa versions of both of those if building with Clang and GNU as. As a
result, debug info was being generated for the purgatory objects, even
though the intention was that it not be.
Fixes: 32ef9e5054ec ("Makefile.debug: re-enable debug info for .S files")
Signed-off-by: Alyssa Ross <hi(a)alyssa.is>
Cc: stable(a)vger.kernel.org
---
Difference from v2: replace each AFLAGS_REMOVE_* assignment with a
single aflags-remove-y line, and use foreach to add the -Wa versions,
as suggested by Masahiro Yamada.
arch/riscv/purgatory/Makefile | 7 +------
arch/x86/purgatory/Makefile | 3 +--
2 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
index d16bf715a586..5730797a6b40 100644
--- a/arch/riscv/purgatory/Makefile
+++ b/arch/riscv/purgatory/Makefile
@@ -84,12 +84,7 @@ CFLAGS_string.o += $(PURGATORY_CFLAGS)
CFLAGS_REMOVE_ctype.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_ctype.o += $(PURGATORY_CFLAGS)
-AFLAGS_REMOVE_entry.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_memcpy.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_memset.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_strcmp.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_strlen.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_strncmp.o += -Wa,-gdwarf-2
+asflags-remove-y += $(foreach x, -g -gdwarf-4 -gdwarf-5, $(x) -Wa,$(x))
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 17f09dc26381..82fec66d46d2 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -69,8 +69,7 @@ CFLAGS_sha256.o += $(PURGATORY_CFLAGS)
CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_string.o += $(PURGATORY_CFLAGS)
-AFLAGS_REMOVE_setup-x86_$(BITS).o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_entry64.o += -Wa,-gdwarf-2
+asflags-remove-y += $(foreach x, -g -gdwarf-4 -gdwarf-5, $(x) -Wa,$(x))
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
base-commit: da8e7da11e4ba758caf4c149cc8d8cd555aefe5f
--
2.37.1
By default the indirect state sampler data (border colors) are stored
in the same heap as the SAMPLER_STATE structure. For userspace drivers
that can be 2 different heaps (dynamic state heap & bindless sampler
state heap). This means that border colors have to copied in 2
different places so that the same SAMPLER_STATE structure find the
right data.
This change is forcing the indirect state sampler data to only be in
the dynamic state pool (more convinient for userspace drivers, they
only have to have one copy of the border colors). This is reproducing
the behavior of the Windows drivers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin(a)intel.com>
Cc: stable(a)vger.kernel.org
---
drivers/gpu/drm/i915/gt/intel_gt_regs.h | 1 +
drivers/gpu/drm/i915/gt/intel_workarounds.c | 17 +++++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 08d76aa06974c..1aaa471d08c56 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1141,6 +1141,7 @@
#define ENABLE_SMALLPL REG_BIT(15)
#define SC_DISABLE_POWER_OPTIMIZATION_EBB REG_BIT(9)
#define GEN11_SAMPLER_ENABLE_HEADLESS_MSG REG_BIT(5)
+#define GEN11_INDIRECT_STATE_BASE_ADDR_OVERRIDE REG_BIT(0)
#define GEN9_HALF_SLICE_CHICKEN7 MCR_REG(0xe194)
#define DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA REG_BIT(15)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 32aa1647721ae..734b64e714647 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2542,6 +2542,23 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
ENABLE_SMALLPL);
}
+ if (GRAPHICS_VER(i915) >= 11) {
+ /* This is not a Wa (although referred to as
+ * WaSetInidrectStateOverride in places), this allows
+ * applications that reference sampler states through
+ * the BindlessSamplerStateBaseAddress to have their
+ * border color relative to DynamicStateBaseAddress
+ * rather than BindlessSamplerStateBaseAddress.
+ *
+ * Otherwise SAMPLER_STATE border colors have to be
+ * copied in multiple heaps (DynamicStateBaseAddress &
+ * BindlessSamplerStateBaseAddress)
+ */
+ wa_mcr_masked_en(wal,
+ GEN10_SAMPLER_MODE,
+ GEN11_INDIRECT_STATE_BASE_ADDR_OVERRIDE);
+ }
+
if (GRAPHICS_VER(i915) == 11) {
/* This is not an Wa. Enable for better image quality */
wa_masked_en(wal,
--
2.34.1
The rust_fmt_argument function is called from printk() to handle the %pA
format specifier.
Since it's called from C, we should mark it extern "C" to make sure it's
ABI compatible.
Cc: stable(a)vger.kernel.org
Fixes: 247b365dc8dc ("rust: add `kernel` crate")
Signed-off-by: David Gow <davidgow(a)google.com>
---
See discussion in:
https://github.com/Rust-for-Linux/linux/pull/967
and
https://github.com/Rust-for-Linux/linux/pull/966
---
rust/kernel/print.rs | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/rust/kernel/print.rs b/rust/kernel/print.rs
index 30103325696d..ec457f0952fe 100644
--- a/rust/kernel/print.rs
+++ b/rust/kernel/print.rs
@@ -18,7 +18,7 @@ use crate::bindings;
// Called from `vsprintf` with format specifier `%pA`.
#[no_mangle]
-unsafe fn rust_fmt_argument(buf: *mut c_char, end: *mut c_char, ptr: *const c_void) -> *mut c_char {
+unsafe extern "C" fn rust_fmt_argument(buf: *mut c_char, end: *mut c_char, ptr: *const c_void) -> *mut c_char {
use fmt::Write;
// SAFETY: The C contract guarantees that `buf` is valid if it's less than `end`.
let mut w = unsafe { RawFormatter::from_ptrs(buf.cast(), end.cast()) };
--
2.39.1.581.gbfd45094c4-goog
The PSP IRQ is edge-triggered (MSI or MSI-X) in all cases supported by
the psp module so clear the interrupt status register early in the
handler to prevent missed interrupts. sev_irq_handler() calls wake_up()
on a wait queue, which can result in a new command being submitted from
a different CPU. This then races with the clearing of isr and can result
in missed interrupts. A missed interrupt results in a command waiting
until it times out, which results in the psp being declared dead.
This is unlikely on bare metal, but has been observed when running
virtualized. In the cases where this is observed, sev->cmdresp_reg has
PSP_CMDRESP_RESP set which indicates that the command was processed
correctly but no interrupt was asserted.
The full sequence of events looks like this:
CPU 1: submits SEV cmd #1
CPU 1: calls wait_event_timeout()
CPU 0: enters psp_irq_handler()
CPU 0: calls sev_handler()->wake_up()
CPU 1: wakes up; finishes processing cmd #1
CPU 1: submits SEV cmd #2
CPU 1: calls wait_event_timeout()
PSP: finishes processing cmd #2; interrupt status is still set; no interrupt
CPU 0: clears intsts
CPU 0: exits psp_irq_handler()
CPU 1: wait_event_timeout() times out; psp_dead=true
Fixes: 200664d5237f ("crypto: ccp: Add Secure Encrypted Virtualization (SEV) command support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeremi Piotrowski <jpiotrowski(a)linux.microsoft.com>
---
drivers/crypto/ccp/psp-dev.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c
index c9c741ac8442..949a3fa0b94a 100644
--- a/drivers/crypto/ccp/psp-dev.c
+++ b/drivers/crypto/ccp/psp-dev.c
@@ -42,6 +42,9 @@ static irqreturn_t psp_irq_handler(int irq, void *data)
/* Read the interrupt status: */
status = ioread32(psp->io_regs + psp->vdata->intsts_reg);
+ /* Clear the interrupt status by writing the same value we read. */
+ iowrite32(status, psp->io_regs + psp->vdata->intsts_reg);
+
/* invoke subdevice interrupt handlers */
if (status) {
if (psp->sev_irq_handler)
@@ -51,9 +54,6 @@ static irqreturn_t psp_irq_handler(int irq, void *data)
psp->tee_irq_handler(irq, psp->tee_irq_data, status);
}
- /* Clear the interrupt status by writing the same value we read. */
- iowrite32(status, psp->io_regs + psp->vdata->intsts_reg);
-
return IRQ_HANDLED;
}
--
2.34.1