VIA VT6306/6307/6308 provides PCI interface compliant to 1394 OHCI. When
the hardware is combined with Asmedia ASM1083/1085 PCIe-to-PCI bus bridge,
it appears that accesses to its 'Isochronous Cycle Timer' register (offset
0xf0 on PCI I/O space) often causes unexpected system reboot in any type
of AMD Ryzen machine (both 0x17 and 0x19 families). It does not appears in
the other type of machine (AMD pre-Ryzen machine, Intel machine, at least),
or in the other OHCI 1394 hardware (e.g. Texas Instruments).
The issue explicitly appears at a commit dcadfd7f7c74 ("firewire: core:
use union for callback of transaction completion") added to v6.5 kernel.
It changed 1394 OHCI driver to access to the register every time to
dispatch local asynchronous transaction. However, the issue exists in
older version of kernel as long as it runs in AMD Ryzen machine, since
the access to the register is required to maintain bus time. It is not
hard to imagine that users experience the unexpected system reboot when
generating bus reset by plugging any devices in, or reading the register
by time-aware application programs; e.g. audio sample processing.
Well, this commit suppresses the system reboot in the combination of
hardware. It avoids the access itself. As a result, the software stack can
not provide the hardware time anymore to unit drivers, userspace
applications, and nodes in the same IEEE 1394 bus. It brings apparent
disadvantage since time-aware application programs require it, while
time-unaware applications are available again; e.g. sbp2.
Cc: stable(a)vger.kernel.org
Reported-by: Jiri Slaby <jirislaby(a)kernel.org>
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1215436
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217994
Reported-by: Tobias Gruetzmacher <tobias-lists(a)23.gs>
Closes: https://sourceforge.net/p/linux1394/mailman/message/58711901/
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2240973
Closes: https://bugs.launchpad.net/linux/+bug/2043905
Signed-off-by: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
---
drivers/firewire/ohci.c | 49 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 7e88fd489741..62af3fa39a70 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -279,6 +279,8 @@ static char ohci_driver_name[] = KBUILD_MODNAME;
#define QUIRK_TI_SLLZ059 0x20
#define QUIRK_IR_WAKE 0x40
+#define QUIRK_REBOOT_BY_CYCLE_TIMER_READ 0x80000000
+
/* In case of multiple matches in ohci_quirks[], only the first one is used. */
static const struct {
unsigned short vendor, device, revision, flags;
@@ -1724,6 +1726,11 @@ static u32 get_cycle_time(struct fw_ohci *ohci)
s32 diff01, diff12;
int i;
+#if IS_ENABLED(CONFIG_X86)
+ if (ohci->quirks & QUIRK_REBOOT_BY_CYCLE_TIMER_READ)
+ return 0;
+#endif
+
c2 = reg_read(ohci, OHCI1394_IsochronousCycleTimer);
if (ohci->quirks & QUIRK_CYCLE_TIMER) {
@@ -3527,6 +3534,45 @@ static const struct fw_card_driver ohci_driver = {
.stop_iso = ohci_stop_iso,
};
+// On PCI Express Root Complex in any type of AMD Ryzen machine, VIA VT6306/6307/6308 with Asmedia
+// ASM1083/1085 brings an inconvenience that read accesses to 'Isochronous Cycle Timer' register
+// (at offset 0xf0 in PCI I/O space) often causes unexpected system reboot. The mechanism is not
+// clear, since the read access to the other registers is enough safe; e.g. 'Node ID' register,
+// while it is probable due to detection of any type of PCIe error.
+#if IS_ENABLED(CONFIG_X86)
+
+#define PCI_DEVICE_ID_ASMEDIA_ASM108X 0x1080
+
+static bool detect_vt630x_with_asm1083_on_amd_ryzen_machine(const struct pci_dev *pdev,
+ struct fw_ohci *ohci)
+{
+ const struct pci_dev *pcie_to_pci_bridge;
+ const struct cpuinfo_x86 *cinfo = &cpu_data(0);
+
+ // Detect any type of AMD Ryzen machine.
+ if (cinfo->x86_vendor != X86_VENDOR_AMD || cinfo->x86 < 0x17)
+ return false;
+
+ // Detect VIA VT6306/6307/6308.
+ if (pdev->vendor != PCI_VENDOR_ID_VIA)
+ return false;
+ if (pdev->device != PCI_DEVICE_ID_VIA_VT630X)
+ return false;
+
+ // Detect Asmedia ASM1083/1085.
+ pcie_to_pci_bridge = pdev->bus->self;
+ if (pcie_to_pci_bridge->vendor != PCI_VENDOR_ID_ASMEDIA)
+ return false;
+ if (pcie_to_pci_bridge->device != PCI_DEVICE_ID_ASMEDIA_ASM108X)
+ return false;
+
+ return true;
+}
+
+#else
+#define detect_vt630x_with_asm1083_on_amd_ryzen_machine(pdev) false
+#endif
+
#ifdef CONFIG_PPC_PMAC
static void pmac_ohci_on(struct pci_dev *dev)
{
@@ -3630,6 +3676,9 @@ static int pci_probe(struct pci_dev *dev,
if (param_quirks)
ohci->quirks = param_quirks;
+ if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
+ ohci->quirks |= QUIRK_REBOOT_BY_CYCLE_TIMER_READ;
+
/*
* Because dma_alloc_coherent() allocates at least one page,
* we save space by using a common buffer for the AR request/
--
2.39.2
Hi,
Memory corruption may occur if the location of the HFI memory buffer is not
restored when resuming from hibernation or suspend-to-memory.
During a normal boot, the kernel allocates a memory buffer and gives it to
the hardware for reporting updates in the HFI table. The same allocation
process is done by a restore kernel when resuming from suspend or
hibernation.
The location of the memory that the restore kernel allocates may differ
from that allocated by the image kernel. To prevent memory corruption (the
hardware keeps using the memory buffer from the restore kernel), it is
necessary to disable HFI before transferring control to the image kernel.
Once running, the image kernel must restore the location of the HFI memory
and enable HFI.
The patchset addresses the described bug on systems with one or more HFI
instances (i.e., packages) using CPU hotplug callbacks and a suspend
notifier.
I tested this patchset on Meteor Lake and Sapphire Rapids. The systems
completed 3500 (in two separate tests of 1500 and 2000 repeats) and
1000 hibernate-resume cycles, respectively. I tested it using Rafael's
testing branch as on 20th December 2023.
Thanks and BR,
Ricardo
Ricardo Neri (4):
thermal: intel: hfi: Refactor enabling code into helper functions
thermal: intel: hfi: Enable an HFI instance from its first online CPU
thermal: intel: hfi: Disable an HFI instance when all its CPUs go
offline
thermal: intel: hfi: Add a suspend notifier
drivers/thermal/intel/intel_hfi.c | 142 ++++++++++++++++++++++++------
1 file changed, 116 insertions(+), 26 deletions(-)
--
2.25.1
This patchset backport ksmbd patches between 6.6 and 6.7-rc5 kernel.
Bug fixes were not applied well due to clean-up and new feautre patches.
To facilitate backport, This patch-set included clean-up patches and
new feature patches of ksmbd for stable 6.6 kernel.
--
2.25.1