A particular RX 5600 device requires a hack in the rebar logic, but the
current branch is too general and catches other devices too, breaking
them. This patch changes the branch to be more selective on the
particular revision.
This patch fixes intermittent freezes on other RX 5600 devices where the
hack is unnecessary. Credit to all contributors in the linked issue on
the AMD bug tracker.
See also: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Cc: stable(a)vger.kernel.org # v5.12+
Signed-off-by: Robin McCorkell <robin(a)mccorkell.me.uk>
Reported-by: Simon May <@Socob on gitlab.freedesktop.com>
Tested-by: Kain Centeno <@kaincenteno on gitlab.freedesktop.com>
Tested-by: Tobias Jakobi <@tobiasjakobi on gitlab.freedesktop.com>
Suggested-by: lijo lazar <@lijo on gitlab.freedesktop.com>
---
drivers/pci/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce2ab62b64cf..1fe75243019e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3647,7 +3647,7 @@ u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar)
/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */
if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f &&
- bar == 0 && cap == 0x7000)
+ pdev->revision == 0xC1 && bar == 0 && cap == 0x7000)
cap = 0x3f000;
return cap >> 4;
--
2.31.1
From: Laurent Vivier <lvivier(a)redhat.com>
Commit 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest
context before enabling irqs") moved guest_exit() into the interrupt
protected area to avoid wrong context warning (or worse). The problem is
that tick-based time accounting has not yet been updated at this point
(because it depends on the timer interrupt firing), so the guest time
gets incorrectly accounted to system time.
To fix the problem, follow the x86 fix in commit 160457140187 ("Defer
vtime accounting 'til after IRQ handling"), and allow host IRQs to run
before accounting the guest exit time.
In the case vtime accounting is enabled, this is not required because TB
is used directly for accounting.
Before this patch, with CONFIG_TICK_CPU_ACCOUNTING=y in the host and a
guest running a kernel compile, the 'guest' fields of /proc/stat are
stuck at zero. With the patch they can be observed increasing roughly as
expected.
Fixes: e233d54d4d97 ("KVM: booke: use __kvm_guest_exit")
Fixes: 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs")
Cc: <stable(a)vger.kernel.org> # 5.12
Signed-off-by: Laurent Vivier <lvivier(a)redhat.com>
[np: only required for tick accounting, add Book3E fix, tweak changelog]
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
---
Since v2:
- I took over the patch with Laurent's blessing.
- Changed to avoid processing IRQs if we do have vtime accounting
enabled.
- Changed so in either case the accounting is called with irqs disabled.
- Added similar Book3E fix.
- Rebased on upstream, tested, observed bug and confirmed fix.
arch/powerpc/kvm/book3s_hv.c | 30 ++++++++++++++++++++++++++++--
arch/powerpc/kvm/booke.c | 16 +++++++++++++++-
2 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2acb1c96cfaf..7b74fc0a986b 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3726,7 +3726,20 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
kvmppc_set_host_core(pcpu);
- guest_exit_irqoff();
+ context_tracking_guest_exit();
+ if (!vtime_accounting_enabled_this_cpu()) {
+ local_irq_enable();
+ /*
+ * Service IRQs here before vtime_account_guest_exit() so any
+ * ticks that occurred while running the guest are accounted to
+ * the guest. If vtime accounting is enabled, accounting uses
+ * TB rather than ticks, so it can be done without enabling
+ * interrupts here, which has the problem that it accounts
+ * interrupt processing overhead to the host.
+ */
+ local_irq_disable();
+ }
+ vtime_account_guest_exit();
local_irq_enable();
@@ -4510,7 +4523,20 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
kvmppc_set_host_core(pcpu);
- guest_exit_irqoff();
+ context_tracking_guest_exit();
+ if (!vtime_accounting_enabled_this_cpu()) {
+ local_irq_enable();
+ /*
+ * Service IRQs here before vtime_account_guest_exit() so any
+ * ticks that occurred while running the guest are accounted to
+ * the guest. If vtime accounting is enabled, accounting uses
+ * TB rather than ticks, so it can be done without enabling
+ * interrupts here, which has the problem that it accounts
+ * interrupt processing overhead to the host.
+ */
+ local_irq_disable();
+ }
+ vtime_account_guest_exit();
local_irq_enable();
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 977801c83aff..8c15c90dd3a9 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1042,7 +1042,21 @@ int kvmppc_handle_exit(struct kvm_vcpu *vcpu, unsigned int exit_nr)
}
trace_kvm_exit(exit_nr, vcpu);
- guest_exit_irqoff();
+
+ context_tracking_guest_exit();
+ if (!vtime_accounting_enabled_this_cpu()) {
+ local_irq_enable();
+ /*
+ * Service IRQs here before vtime_account_guest_exit() so any
+ * ticks that occurred while running the guest are accounted to
+ * the guest. If vtime accounting is enabled, accounting uses
+ * TB rather than ticks, so it can be done without enabling
+ * interrupts here, which has the problem that it accounts
+ * interrupt processing overhead to the host.
+ */
+ local_irq_disable();
+ }
+ vtime_account_guest_exit();
local_irq_enable();
--
2.23.0
ocfs2_truncate_file() did unmap invalidate page cache pages before
zeroing partial tail cluster and setting i_size. Thus some pages could
be left (and likely have left if the cluster zeroing happened) in the
page cache beyond i_size after truncate finished letting user possibly
see stale data once the file was extended again. Also the tail cluster
zeroing was not guaranteed to finish before truncate finished causing
possible stale data exposure. The problem started to be particularly
easy to hit after commit 6dbf7bb55598 "fs: Don't invalidate page buffers
in block_write_full_page()" stopped invalidation of pages beyond i_size
from page writeback path.
Fix these problems by unmapping and invalidating pages in the page cache
after the i_size is reduced and tail cluster is zeroed out.
CC: stable(a)vger.kernel.org
Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/ocfs2/file.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 54d7843c0211..fc5f780fa235 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -476,10 +476,11 @@ int ocfs2_truncate_file(struct inode *inode,
* greater than page size, so we have to truncate them
* anyway.
*/
- unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1);
- truncate_inode_pages(inode->i_mapping, new_i_size);
if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) {
+ unmap_mapping_range(inode->i_mapping,
+ new_i_size + PAGE_SIZE - 1, 0, 1);
+ truncate_inode_pages(inode->i_mapping, new_i_size);
status = ocfs2_truncate_inline(inode, di_bh, new_i_size,
i_size_read(inode), 1);
if (status)
@@ -498,6 +499,9 @@ int ocfs2_truncate_file(struct inode *inode,
goto bail_unlock_sem;
}
+ unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1);
+ truncate_inode_pages(inode->i_mapping, new_i_size);
+
status = ocfs2_commit_truncate(osb, inode, di_bh);
if (status < 0) {
mlog_errno(status);
--
2.26.2
Turns out some xHC controllers require all 64 bits in the CRCR register
to be written to execute a command abort.
The lower 32 bits containing the command abort bit is written first.
In case the command ring stops before we write the upper 32 bits then
hardware may use these upper bits to set the commnd ring dequeue pointer.
Solve this by making sure the upper 32 bits contain a valid command
ring dequeue pointer.
The original patch that only wrote the first 32 to stop the ring went
to stable, so this fix should go there as well.
Fixes: ff0e50d3564f ("xhci: Fix command ring pointer corruption while aborting a command")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 311597bba80e..eaa49aef2935 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -366,7 +366,9 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci,
/* Must be called with xhci->lock held, releases and aquires lock back */
static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long flags)
{
- u32 temp_32;
+ struct xhci_segment *new_seg = xhci->cmd_ring->deq_seg;
+ union xhci_trb *new_deq = xhci->cmd_ring->dequeue;
+ u64 crcr;
int ret;
xhci_dbg(xhci, "Abort command ring\n");
@@ -375,13 +377,18 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long flags)
/*
* The control bits like command stop, abort are located in lower
- * dword of the command ring control register. Limit the write
- * to the lower dword to avoid corrupting the command ring pointer
- * in case if the command ring is stopped by the time upper dword
- * is written.
+ * dword of the command ring control register.
+ * Some controllers require all 64 bits to be written to abort the ring.
+ * Make sure the upper dword is valid, pointing to the next command,
+ * avoiding corrupting the command ring pointer in case the command ring
+ * is stopped by the time the upper dword is written.
*/
- temp_32 = readl(&xhci->op_regs->cmd_ring);
- writel(temp_32 | CMD_RING_ABORT, &xhci->op_regs->cmd_ring);
+ next_trb(xhci, NULL, &new_seg, &new_deq);
+ if (trb_is_link(new_deq))
+ next_trb(xhci, NULL, &new_seg, &new_deq);
+
+ crcr = xhci_trb_virt_to_dma(new_seg, new_deq);
+ xhci_write_64(xhci, crcr | CMD_RING_ABORT, &xhci->op_regs->cmd_ring);
/* Section 4.6.1.2 of xHCI 1.0 spec says software should also time the
* completion of the Command Abort operation. If CRR is not negated in 5
--
2.25.1
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Looks like our VBIOS/GOP generally fail to turn the DP dual mode adater
TMDS output buffers back on after a reboot. This leads to a black screen
after reboot if we turned the TMDS output buffers off prior to reboot.
And if i915 decides to do a fastboot the black screen will persist even
after i915 takes over.
Apparently this has been a problem ever since commit b2ccb822d376 ("drm/i915:
Enable/disable TMDS output buffers in DP++ adaptor as needed") if one
rebooted while the display was turned off. And things became worse with
commit fe0f1e3bfdfe ("drm/i915: Shut down displays gracefully on reboot")
since now we always turn the display off before a reboot.
This was reported on a RKL, but I confirmed the same behaviour on my
SNB as well. So looks pretty universal.
Let's fix this by explicitly turning the TMDS output buffers back on
in the encoder->shutdown() hook. Note that this gets called after irqs
have been disabled, so the i2c communication with the DP dual mode
adapter has to be performed via polling (which the gmbus code is
perfectly happy to do for us).
We also need a bit of care in handling DDI encoders which may or may
not be set up for HDMI output. Specifically ddc_pin will not be
populated for a DP only DDI encoder, in which case we don't want to
call intel_gmbus_get_adapter(). We can handle that by simply doing
the dual mode adapter type check before calling
intel_gmbus_get_adapter().
Cc: stable(a)vger.kernel.org
Fixes: b2ccb822d376 ("drm/i915: Enable/disable TMDS output buffers in DP++ adaptor as needed")
Fixes: fe0f1e3bfdfe ("drm/i915: Shut down displays gracefully on reboot")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4371
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/g4x_hdmi.c | 1 +
drivers/gpu/drm/i915/display/intel_ddi.c | 1 +
drivers/gpu/drm/i915/display/intel_hdmi.c | 16 ++++++++++++++--
drivers/gpu/drm/i915/display/intel_hdmi.h | 1 +
4 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/g4x_hdmi.c b/drivers/gpu/drm/i915/display/g4x_hdmi.c
index 88c427f3c346..f5b4dd5b4275 100644
--- a/drivers/gpu/drm/i915/display/g4x_hdmi.c
+++ b/drivers/gpu/drm/i915/display/g4x_hdmi.c
@@ -584,6 +584,7 @@ void g4x_hdmi_init(struct drm_i915_private *dev_priv,
else
intel_encoder->enable = g4x_enable_hdmi;
}
+ intel_encoder->shutdown = intel_hdmi_encoder_shutdown;
intel_encoder->type = INTEL_OUTPUT_HDMI;
intel_encoder->power_domain = intel_port_to_power_domain(port);
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 9fb99b09fff8..5ef2882727e1 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4312,6 +4312,7 @@ static void intel_ddi_encoder_shutdown(struct intel_encoder *encoder)
enum phy phy = intel_port_to_phy(i915, encoder->port);
intel_dp_encoder_shutdown(encoder);
+ intel_hdmi_encoder_shutdown(encoder);
if (!intel_phy_is_tc(i915, phy))
return;
diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c b/drivers/gpu/drm/i915/display/intel_hdmi.c
index 7e6af959bf83..3b5b9e7b05b7 100644
--- a/drivers/gpu/drm/i915/display/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
@@ -1246,12 +1246,13 @@ static void hsw_set_infoframes(struct intel_encoder *encoder,
void intel_dp_dual_mode_set_tmds_output(struct intel_hdmi *hdmi, bool enable)
{
struct drm_i915_private *dev_priv = intel_hdmi_to_i915(hdmi);
- struct i2c_adapter *adapter =
- intel_gmbus_get_adapter(dev_priv, hdmi->ddc_bus);
+ struct i2c_adapter *adapter;
if (hdmi->dp_dual_mode.type < DRM_DP_DUAL_MODE_TYPE2_DVI)
return;
+ adapter = intel_gmbus_get_adapter(dev_priv, hdmi->ddc_bus);
+
drm_dbg_kms(&dev_priv->drm, "%s DP dual mode adaptor TMDS output\n",
enable ? "Enabling" : "Disabling");
@@ -2285,6 +2286,17 @@ int intel_hdmi_compute_config(struct intel_encoder *encoder,
return 0;
}
+void intel_hdmi_encoder_shutdown(struct intel_encoder *encoder)
+{
+ struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder);
+
+ /*
+ * Give a hand to buggy BIOSen which forget to turn
+ * the TMDS output buffers back on after a reboot.
+ */
+ intel_dp_dual_mode_set_tmds_output(intel_hdmi, true);
+}
+
static void
intel_hdmi_unset_edid(struct drm_connector *connector)
{
diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.h b/drivers/gpu/drm/i915/display/intel_hdmi.h
index b43a180d007e..2bf440eb400a 100644
--- a/drivers/gpu/drm/i915/display/intel_hdmi.h
+++ b/drivers/gpu/drm/i915/display/intel_hdmi.h
@@ -28,6 +28,7 @@ void intel_hdmi_init_connector(struct intel_digital_port *dig_port,
int intel_hdmi_compute_config(struct intel_encoder *encoder,
struct intel_crtc_state *pipe_config,
struct drm_connector_state *conn_state);
+void intel_hdmi_encoder_shutdown(struct intel_encoder *encoder);
bool intel_hdmi_handle_sink_scrambling(struct intel_encoder *encoder,
struct drm_connector *connector,
bool high_tmds_clock_ratio,
--
2.32.0
When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.
In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.
In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.
Warn on console if initial ballooning fails, panic() after stalling
for more than 3 minutes per default. Add a module parameter for
changing this timeout.
Cc: <stable(a)vger.kernel.org>
Reported-by: Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
V2:
- add warning and panic() when stalling (Marek Marczykowski-Górecki)
- don't wait if credit > 0
V3:
- issue warning only after ballooning failed (Marek Marczykowski-Górecki)
- make panic() timeout configurable via parameter
---
.../stable/sysfs-devices-system-xen_memory | 10 +++
drivers/xen/balloon.c | 63 +++++++++++++++----
drivers/xen/xen-balloon.c | 2 +
include/xen/balloon.h | 1 +
4 files changed, 63 insertions(+), 13 deletions(-)
diff --git a/Documentation/ABI/stable/sysfs-devices-system-xen_memory b/Documentation/ABI/stable/sysfs-devices-system-xen_memory
index 6d83f95a8a8e..2da062e2c94a 100644
--- a/Documentation/ABI/stable/sysfs-devices-system-xen_memory
+++ b/Documentation/ABI/stable/sysfs-devices-system-xen_memory
@@ -84,3 +84,13 @@ Description:
Control scrubbing pages before returning them to Xen for others domains
use. Can be set with xen_scrub_pages cmdline
parameter. Default value controlled with CONFIG_XEN_SCRUB_PAGES_DEFAULT.
+
+What: /sys/devices/system/xen_memory/xen_memory0/boot_timeout
+Date: November 2021
+KernelVersion: 5.16
+Contact: xen-devel(a)lists.xenproject.org
+Description:
+ The time (in seconds) to wait before giving up to boot in case
+ initial ballooning fails to free enough memory. Applies only
+ when running as HVM or PVH guest and started with less memory
+ configured than allowed at max.
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 3a50f097ed3e..98fae43d4cec 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -125,12 +125,12 @@ static struct ctl_table xen_root[] = {
* BP_ECANCELED: error, balloon operation canceled.
*/
-enum bp_state {
+static enum bp_state {
BP_DONE,
BP_WAIT,
BP_EAGAIN,
BP_ECANCELED
-};
+} balloon_state = BP_DONE;
/* Main waiting point for xen-balloon thread. */
static DECLARE_WAIT_QUEUE_HEAD(balloon_thread_wq);
@@ -494,9 +494,9 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
* Stop waiting if either state is BP_DONE and ballooning action is
* needed, or if the credit has changed while state is not BP_DONE.
*/
-static bool balloon_thread_cond(enum bp_state state, long credit)
+static bool balloon_thread_cond(long credit)
{
- if (state == BP_DONE)
+ if (balloon_state == BP_DONE)
credit = 0;
return current_credit() != credit || kthread_should_stop();
@@ -510,13 +510,12 @@ static bool balloon_thread_cond(enum bp_state state, long credit)
*/
static int balloon_thread(void *unused)
{
- enum bp_state state = BP_DONE;
long credit;
unsigned long timeout;
set_freezable();
for (;;) {
- switch (state) {
+ switch (balloon_state) {
case BP_DONE:
case BP_ECANCELED:
timeout = 3600 * HZ;
@@ -532,7 +531,7 @@ static int balloon_thread(void *unused)
credit = current_credit();
wait_event_freezable_timeout(balloon_thread_wq,
- balloon_thread_cond(state, credit), timeout);
+ balloon_thread_cond(credit), timeout);
if (kthread_should_stop())
return 0;
@@ -543,22 +542,23 @@ static int balloon_thread(void *unused)
if (credit > 0) {
if (balloon_is_inflated())
- state = increase_reservation(credit);
+ balloon_state = increase_reservation(credit);
else
- state = reserve_additional_memory();
+ balloon_state = reserve_additional_memory();
}
if (credit < 0) {
long n_pages;
n_pages = min(-credit, si_mem_available());
- state = decrease_reservation(n_pages, GFP_BALLOON);
- if (state == BP_DONE && n_pages != -credit &&
+ balloon_state = decrease_reservation(n_pages,
+ GFP_BALLOON);
+ if (balloon_state == BP_DONE && n_pages != -credit &&
n_pages < totalreserve_pages)
- state = BP_EAGAIN;
+ balloon_state = BP_EAGAIN;
}
- state = update_schedule(state);
+ balloon_state = update_schedule(balloon_state);
mutex_unlock(&balloon_mutex);
@@ -731,6 +731,7 @@ static int __init balloon_init(void)
balloon_stats.max_schedule_delay = 32;
balloon_stats.retry_count = 1;
balloon_stats.max_retry_count = 4;
+ balloon_stats.boot_timeout = 180;
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
set_online_page_callback(&xen_online_page);
@@ -765,3 +766,39 @@ static int __init balloon_init(void)
return 0;
}
subsys_initcall(balloon_init);
+
+static int __init balloon_wait_finish(void)
+{
+ long credit, last_credit = 0;
+ unsigned long last_changed;
+
+ if (!xen_domain())
+ return -ENODEV;
+
+ /* PV guests don't need to wait. */
+ if (xen_pv_domain() || !current_credit())
+ return 0;
+
+ pr_info("Waiting for initial ballooning down having finished.\n");
+
+ while ((credit = current_credit()) < 0) {
+ if (credit != last_credit) {
+ last_changed = jiffies;
+ last_credit = credit;
+ }
+ if (balloon_state == BP_ECANCELED) {
+ pr_warn_once("Initial ballooning failed, %ld pages need to be freed.\n",
+ -credit);
+ if (jiffies - last_changed >=
+ HZ * balloon_stats.boot_timeout)
+ panic("Initial ballooning failed!\n");
+ }
+
+ schedule_timeout_interruptible(HZ / 10);
+ }
+
+ pr_info("Initial ballooning down finished.\n");
+
+ return 0;
+}
+late_initcall_sync(balloon_wait_finish);
diff --git a/drivers/xen/xen-balloon.c b/drivers/xen/xen-balloon.c
index 8cd583db20b1..6e5db50ede0f 100644
--- a/drivers/xen/xen-balloon.c
+++ b/drivers/xen/xen-balloon.c
@@ -150,6 +150,7 @@ static DEVICE_ULONG_ATTR(schedule_delay, 0444, balloon_stats.schedule_delay);
static DEVICE_ULONG_ATTR(max_schedule_delay, 0644, balloon_stats.max_schedule_delay);
static DEVICE_ULONG_ATTR(retry_count, 0444, balloon_stats.retry_count);
static DEVICE_ULONG_ATTR(max_retry_count, 0644, balloon_stats.max_retry_count);
+static DEVICE_ULONG_ATTR(boot_timeout, 0644, balloon_stats.boot_timeout);
static DEVICE_BOOL_ATTR(scrub_pages, 0644, xen_scrub_pages);
static ssize_t target_kb_show(struct device *dev, struct device_attribute *attr,
@@ -211,6 +212,7 @@ static struct attribute *balloon_attrs[] = {
&dev_attr_max_schedule_delay.attr.attr,
&dev_attr_retry_count.attr.attr,
&dev_attr_max_retry_count.attr.attr,
+ &dev_attr_boot_timeout.attr.attr,
&dev_attr_scrub_pages.attr.attr,
NULL
};
diff --git a/include/xen/balloon.h b/include/xen/balloon.h
index 6dbdb0b3fd03..95a4187f263b 100644
--- a/include/xen/balloon.h
+++ b/include/xen/balloon.h
@@ -20,6 +20,7 @@ struct balloon_stats {
unsigned long max_schedule_delay;
unsigned long retry_count;
unsigned long max_retry_count;
+ unsigned long boot_timeout;
};
extern struct balloon_stats balloon_stats;
--
2.26.2
This is a note to let you know that I've just added the patch titled
coresight: trbe: Defer the probe on offline CPUs
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From a08025b3fe56185290a1ea476581f03ca733f967 Mon Sep 17 00:00:00 2001
From: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Date: Thu, 14 Oct 2021 15:22:38 +0100
Subject: coresight: trbe: Defer the probe on offline CPUs
If a CPU is offline during the driver init, we could end up causing
a kernel crash trying to register the coresight device for the TRBE
instance. The trbe_cpudata for the TRBE instance is initialized only
when it is probed. Otherwise, we could end up dereferencing a NULL
cpudata->drvdata.
e.g:
[ 0.149999] coresight ete0: CPU0: ete v1.1 initialized
[ 0.149999] coresight-etm4x ete_1: ETM arch init failed
[ 0.149999] coresight-etm4x: probe of ete_1 failed with error -22
[ 0.150085] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
[ 0.150085] Mem abort info:
[ 0.150085] ESR = 0x96000005
[ 0.150085] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.150085] SET = 0, FnV = 0
[ 0.150085] EA = 0, S1PTW = 0
[ 0.150085] Data abort info:
[ 0.150085] ISV = 0, ISS = 0x00000005
[ 0.150085] CM = 0, WnR = 0
[ 0.150085] [0000000000000050] user address but active_mm is swapper
[ 0.150085] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 0.150085] Modules linked in:
[ 0.150085] Hardware name: FVP Base RevC (DT)
[ 0.150085] pstate: 00800009 (nzcv daif -PAN +UAO -TCO BTYPE=--)
[ 0.150155] pc : arm_trbe_register_coresight_cpu+0x74/0x144
[ 0.150155] lr : arm_trbe_register_coresight_cpu+0x48/0x144
...
[ 0.150237] Call trace:
[ 0.150237] arm_trbe_register_coresight_cpu+0x74/0x144
[ 0.150237] arm_trbe_device_probe+0x1c0/0x2d8
[ 0.150259] platform_drv_probe+0x94/0xbc
[ 0.150259] really_probe+0x1bc/0x4a8
[ 0.150266] driver_probe_device+0x7c/0xb8
[ 0.150266] device_driver_attach+0x6c/0xac
[ 0.150266] __driver_attach+0xc4/0x148
[ 0.150266] bus_for_each_dev+0x7c/0xc8
[ 0.150266] driver_attach+0x24/0x30
[ 0.150266] bus_add_driver+0x100/0x1e0
[ 0.150266] driver_register+0x78/0x110
[ 0.150266] __platform_driver_register+0x44/0x50
[ 0.150266] arm_trbe_init+0x28/0x84
[ 0.150266] do_one_initcall+0x94/0x2bc
[ 0.150266] do_initcall_level+0xa4/0x158
[ 0.150266] do_initcalls+0x54/0x94
[ 0.150319] do_basic_setup+0x24/0x30
[ 0.150319] kernel_init_freeable+0xe8/0x14c
[ 0.150319] kernel_init+0x14/0x18c
[ 0.150319] ret_from_fork+0x10/0x30
[ 0.150319] Code: f94012c8 b0004ce2 9134a442 52819801 (f9402917)
[ 0.150319] ---[ end trace d23e0cfe5098535e ]---
[ 0.150346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Fix this by skipping the step, if we are unable to probe the CPU.
Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Reported-by: Bransilav Rankov <branislav.rankov(a)arm.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Leo Yan <leo.yan(a)linaro.org>
Cc: stable <stable(a)vger.kernel.org>
Tested-by: Branislav Rankov <branislav.rankov(a)arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Link: https://lore.kernel.org/r/20211014142238.2221248-1-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
drivers/hwtracing/coresight/coresight-trbe.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 2825ccb0cf39..5d77baba8b0f 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -893,6 +893,10 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
if (WARN_ON(trbe_csdev))
return;
+ /* If the TRBE was not probed on the CPU, we shouldn't be here */
+ if (WARN_ON(!cpudata->drvdata))
+ return;
+
dev = &cpudata->drvdata->pdev->dev;
desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu);
if (!desc.name)
@@ -974,7 +978,9 @@ static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata)
return -ENOMEM;
for_each_cpu(cpu, &drvdata->supported_cpus) {
- smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1);
+ /* If we fail to probe the CPU, let us defer it to hotplug callbacks */
+ if (smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1))
+ continue;
if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
arm_trbe_register_coresight_cpu(drvdata, cpu);
if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
--
2.33.1
This is a note to let you know that I've just added the patch titled
coresight: trbe: Fix incorrect access of the sink specific data
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From bb5293e334af51b19b62d8bef1852ea13e935e9b Mon Sep 17 00:00:00 2001
From: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Date: Tue, 21 Sep 2021 14:41:05 +0100
Subject: coresight: trbe: Fix incorrect access of the sink specific data
The TRBE driver wrongly treats the aux private data as the TRBE driver
specific buffer for a given perf handle, while it is the ETM PMU's
event specific data. Fix this by correcting the instance to use
appropriate helper.
Cc: stable <stable(a)vger.kernel.org>
Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Link: https://lore.kernel.org/r/20210921134121.2423546-2-suzuki.poulose@arm.com
[Fixed 13 character SHA down to 12]
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
drivers/hwtracing/coresight/coresight-trbe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index a53ee98f312f..2825ccb0cf39 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -382,7 +382,7 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle)
static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
{
- struct trbe_buf *buf = perf_get_aux(handle);
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
u64 limit = __trbe_normal_offset(handle);
u64 head = PERF_IDX2OFF(handle->head, buf);
--
2.33.1
This is a note to let you know that I've just added the patch titled
coresight: cti: Correct the parameter for pm_runtime_put
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 692c9a499b286ea478f41b23a91fe3873b9e1326 Mon Sep 17 00:00:00 2001
From: Tao Zhang <quic_taozha(a)quicinc.com>
Date: Thu, 19 Aug 2021 17:29:37 +0800
Subject: coresight: cti: Correct the parameter for pm_runtime_put
The input parameter of the function pm_runtime_put should be the
same in the function cti_enable_hw and cti_disable_hw. The correct
parameter to use here should be dev->parent.
Signed-off-by: Tao Zhang <quic_taozha(a)quicinc.com>
Reviewed-by: Leo Yan <leo.yan(a)linaro.org>
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver")
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/1629365377-5937-1-git-send-email-quic_taozha@quic…
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
drivers/hwtracing/coresight/coresight-cti-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c
index e2a3620cbf48..8988b2ed2ea6 100644
--- a/drivers/hwtracing/coresight/coresight-cti-core.c
+++ b/drivers/hwtracing/coresight/coresight-cti-core.c
@@ -175,7 +175,7 @@ static int cti_disable_hw(struct cti_drvdata *drvdata)
coresight_disclaim_device_unlocked(csdev);
CS_LOCK(drvdata->base);
spin_unlock(&drvdata->spinlock);
- pm_runtime_put(dev);
+ pm_runtime_put(dev->parent);
return 0;
/* not disabled this call */
--
2.33.1