The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: b63f20a778c88b6a04458ed6ffc69da953d3a109
Gitweb: https://git.kernel.org/tip/b63f20a778c88b6a04458ed6ffc69da953d3a109
Author: Sean Christopherson <sean.j.christopherson(a)intel.com>
AuthorDate: Thu, 22 Aug 2019 14:11:22 -07:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 23 Aug 2019 17:38:13 +02:00
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
Use 'lea' instead of 'add' when adjusting %esp in CALL_NOSPEC so as to
avoid clobbering flags.
KVM's emulator makes indirect calls into a jump table of sorts, where
the destination of the CALL_NOSPEC is a small blob of code that performs
fast emulation by executing the target instruction with fixed operands.
adcb_al_dl:
0x000339f8 <+0>: adc %dl,%al
0x000339fa <+2>: ret
A major motivation for doing fast emulation is to leverage the CPU to
handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
both an input and output to the target of CALL_NOSPEC. Clobbering flags
results in all sorts of incorrect emulation, e.g. Jcc instructions often
take the wrong path. Sans the nops...
asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
0x0003595a <+58>: mov 0xc0(%ebx),%eax
0x00035960 <+64>: mov 0x60(%ebx),%edx
0x00035963 <+67>: mov 0x90(%ebx),%ecx
0x00035969 <+73>: push %edi
0x0003596a <+74>: popf
0x0003596b <+75>: call *%esi
0x000359a0 <+128>: pushf
0x000359a1 <+129>: pop %edi
0x000359a2 <+130>: mov %eax,0xc0(%ebx)
0x000359b1 <+145>: mov %edx,0x60(%ebx)
ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
0x000359a8 <+136>: mov -0x10(%ebp),%eax
0x000359ab <+139>: and $0x8d5,%edi
0x000359b4 <+148>: and $0xfffff72a,%eax
0x000359b9 <+153>: or %eax,%edi
0x000359bd <+157>: mov %edi,0x4(%ebx)
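To make the failure mode concrete, here is a minimal user-space sketch of the
flag handling described above. It is not KVM code: model_adcb(),
model_add_esp() and merge_eflags() are hypothetical helpers written for
illustration, and the model tracks only CF and ZF. The flag bit positions are
the architectural ones, and EFLAGS_MASK is built so that it equals the 0x8d5
immediate seen in the disassembly.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic flag bits of EFLAGS (architectural bit positions). */
#define X86_CF 0x001u
#define X86_PF 0x004u
#define X86_AF 0x010u
#define X86_ZF 0x040u
#define X86_SF 0x080u
#define X86_OF 0x800u

/* Same value as the 0x8d5 immediate in the disassembly. */
#define EFLAGS_MASK (X86_OF | X86_SF | X86_ZF | X86_AF | X86_PF | X86_CF)

/* Hypothetical model of 'adc %dl,%al'; only CF and ZF are tracked. */
uint8_t model_adcb(uint8_t al, uint8_t dl, uint32_t *flags)
{
	unsigned int sum = al + dl + ((*flags & X86_CF) ? 1u : 0u);

	*flags &= ~(X86_CF | X86_ZF);
	if (sum > 0xffu)
		*flags |= X86_CF;
	if ((sum & 0xffu) == 0)
		*flags |= X86_ZF;
	return (uint8_t)sum;
}

/*
 * Hypothetical model of 'addl $4, %esp' on a sane stack pointer: the
 * addition neither carries nor yields zero, so CF and ZF end up clear.
 * 'lea 4(%esp), %esp' would leave *flags untouched instead.
 */
void model_add_esp(uint32_t *flags)
{
	*flags &= ~(X86_CF | X86_ZF);
}

/* The merge performed after the call: only arithmetic flags are taken. */
uint32_t merge_eflags(uint32_t guest_eflags, uint32_t flags)
{
	return (guest_eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
}

/* Self-check: the same ADC with and without the clobbering 'add'. */
void model_demo(void)
{
	uint32_t flags = X86_CF;		/* guest had CF set */
	uint8_t good = model_adcb(1, 1, &flags);	/* carry-in honored */

	flags = X86_CF;
	model_add_esp(&flags);			/* thunk clobbers CF first */
	uint8_t bad = model_adcb(1, 1, &flags);	/* carry-in lost */

	assert(good == 3 && bad == 2);
}
```

In this model the emulated ADC returns 3 when the guest's carry survives and
2 when the stack adjustment has cleared it, which is the kind of silent
miscalculation described above.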
For the most part this has gone unnoticed as emulation of guest code
that can trigger fast emulation is effectively limited to MMIO when
running on modern hardware, and MMIO is rarely, if ever, accessed by
instructions that affect or consume flags.
Breakage is almost instantaneous when running with unrestricted guest
disabled, in which case KVM must emulate all instructions when the guest
has invalid state, e.g. when the guest is in Big Real Mode during early
BIOS.
Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
Fixes: 1a29b5b7f347a ("KVM: x86: Make indirect calls in emulator speculation safe")
Signed-off-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@inte…
---
arch/x86/include/asm/nospec-branch.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 109f974..80bc209 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -192,7 +192,7 @@
" lfence;\n" \
" jmp 902b;\n" \
" .align 16\n" \
- "903: addl $4, %%esp;\n" \
+ "903: lea 4(%%esp), %%esp;\n" \
" pushl %[thunk_target];\n" \
" ret;\n" \
" .align 16\n" \
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: aad39e30fb9e - Linux 5.2.9
The results of these automated tests are provided below.
Overall result: FAILED (see details below)
Merge: OK
Compile: OK
Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/116984
One or more kernel tests failed:
aarch64:
❌ LTP lite
❌ Loopdev Sanity
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: aad39e30fb9e - Linux 5.2.9
We grabbed the 8a2474fee8e4 commit of the stable queue repository.
We then merged the patchset with `git am`:
keys-trusted-allow-module-init-if-tpm-is-inactive-or-deactivated.patch
sh-kernel-hw_breakpoint-fix-missing-break-in-switch-statement.patch
seq_file-fix-problem-when-seeking-mid-record.patch
mm-hmm-fix-bad-subpage-pointer-in-try_to_unmap_one.patch
mm-mempolicy-make-the-behavior-consistent-when-mpol_mf_move-and-mpol_mf_strict-were-specified.patch
mm-mempolicy-handle-vma-with-unmovable-pages-mapped-correctly-in-mbind.patch
mm-z3fold.c-fix-z3fold_destroy_pool-ordering.patch
mm-z3fold.c-fix-z3fold_destroy_pool-race-condition.patch
mm-memcontrol.c-fix-use-after-free-in-mem_cgroup_iter.patch
mm-usercopy-use-memory-range-to-be-accessed-for-wraparound-check.patch
mm-vmscan-do-not-special-case-slab-reclaim-when-watermarks-are-boosted.patch
cpufreq-schedutil-don-t-skip-freq-update-when-limits-change.patch
drm-amdgpu-fix-gfx9-soft-recovery.patch
drm-nouveau-only-recalculate-pbn-vcpi-on-mode-connector-changes.patch
xtensa-add-missing-isync-to-the-cpu_reset-tlb-code.patch
arm64-ftrace-ensure-module-ftrace-trampoline-is-coherent-with-i-side.patch
alsa-hda-realtek-add-quirk-for-hp-envy-x360.patch
alsa-usb-audio-fix-a-stack-buffer-overflow-bug-in-check_input_term.patch
alsa-usb-audio-fix-an-oob-bug-in-parse_audio_mixer_unit.patch
alsa-hda-apply-workaround-for-another-amd-chip-1022-1487.patch
alsa-hda-fix-a-memory-leak-bug.patch
alsa-hda-add-a-generic-reboot_notify.patch
alsa-hda-let-all-conexant-codec-enter-d3-when-rebooting.patch
hid-holtek-test-for-sanity-of-intfdata.patch
hid-hiddev-avoid-opening-a-disconnected-device.patch
hid-hiddev-do-cleanup-in-failure-of-opening-a-device.patch
input-kbtab-sanity-check-for-endpoint-type.patch
input-iforce-add-sanity-checks.patch
net-usb-pegasus-fix-improper-read-if-get_registers-fail.patch
bpf-fix-access-to-skb_shared_info-gso_segs.patch
netfilter-ebtables-also-count-base-chain-policies.patch
riscv-correct-the-initialized-flow-of-fp-register.patch
riscv-make-__fstate_clean-work-correctly.patch
revert-i2c-imx-improve-the-error-handling-in-i2c_imx_dma_request.patch
blk-mq-move-cancel-of-requeue_work-to-the-front-of-blk_exit_queue.patch
io_uring-fix-manual-setup-of-iov_iter-for-fixed-buffers.patch
rdma-hns-fix-sg-offset-non-zero-issue.patch
ib-mlx5-replace-kfree-with-kvfree.patch
clk-at91-generated-truncate-divisor-to-generated_max.patch
clk-sprd-select-regmap_mmio-to-avoid-compile-errors.patch
clk-renesas-cpg-mssr-fix-reset-control-race-conditio.patch
dma-mapping-check-pfn-validity-in-dma_common_-mmap-g.patch
platform-x86-pcengines-apuv2-fix-softdep-statement.patch
platform-x86-intel_pmc_core-add-icl-nnpi-support-to-.patch
mm-hmm-always-return-ebusy-for-invalid-ranges-in-hmm.patch
xen-pciback-remove-set-but-not-used-variable-old_sta.patch
irqchip-gic-v3-its-free-unused-vpt_page-when-alloc-v.patch
irqchip-irq-imx-gpcv2-forward-irq-type-to-parent.patch
f2fs-fix-to-read-source-block-before-invalidating-it.patch
tools-perf-beauty-fix-usbdevfs_ioctl-table-generator.patch
perf-header-fix-divide-by-zero-error-if-f_header.att.patch
perf-header-fix-use-of-unitialized-value-warning.patch
rdma-qedr-fix-the-hca_type-and-hca_rev-returned-in-d.patch
alsa-pcm-fix-lost-wakeup-event-scenarios-in-snd_pcm_.patch
libata-zpodd-fix-small-read-overflow-in-zpodd_get_me.patch
powerpc-nvdimm-pick-nearby-online-node-if-the-device.patch
drm-bridge-lvds-encoder-fix-build-error-while-config.patch
drm-bridge-tc358764-fix-build-error.patch
btrfs-fix-deadlock-between-fiemap-and-transaction-co.patch
scsi-hpsa-correct-scsi-command-status-issue-after-re.patch
scsi-qla2xxx-fix-possible-fcport-null-pointer-derefe.patch
tracing-fix-header-include-guards-in-trace-event-hea.patch
drm-amdkfd-fix-byte-align-on-vegam.patch
drm-amd-powerplay-fix-null-pointer-dereference-aroun.patch
drm-amdgpu-fix-error-handling-in-amdgpu_cs_process_f.patch
drm-amdgpu-fix-a-potential-information-leaking-bug.patch
ata-libahci-do-not-complain-in-case-of-deferred-prob.patch
kbuild-modpost-handle-kbuild_extra_symbols-only-for-.patch
kbuild-check-for-unknown-options-with-cc-option-usag.patch
arm64-efi-fix-variable-si-set-but-not-used.patch
riscv-fix-perf-record-without-libelf-support.patch
arm64-lower-priority-mask-for-gic_prio_irqon.patch
arm64-unwind-prohibit-probing-on-return_address.patch
arm64-mm-fix-variable-pud-set-but-not-used.patch
arm64-mm-fix-variable-tag-set-but-not-used.patch
ib-core-add-mitigation-for-spectre-v1.patch
ib-mlx5-fix-mr-registration-flow-to-use-umr-properly.patch
rdma-restrack-track-driver-qp-types-in-resource-trac.patch
ib-mad-fix-use-after-free-in-ib-mad-completion-handl.patch
rdma-mlx5-release-locks-during-notifier-unregister.patch
drm-msm-fix-add_gpu_components.patch
rdma-hns-fix-error-return-code-in-hns_roce_v1_rsv_lp.patch
drm-exynos-fix-missing-decrement-of-retry-counter.patch
arm64-kprobes-recover-pstate.d-in-single-step-except.patch
arm64-make-debug-exception-handlers-visible-from-rcu.patch
revert-kmemleak-allow-to-coexist-with-fault-injectio.patch
ocfs2-remove-set-but-not-used-variable-last_hash.patch
page-flags-prioritize-kasan-bits-over-last-cpuid.patch
asm-generic-fix-wtype-limits-compiler-warnings.patch
tpm-tpm_ibm_vtpm-fix-unallocated-banks.patch
arm64-kvm-regmap-fix-unexpected-switch-fall-through.patch
staging-comedi-dt3000-fix-signed-integer-overflow-divider-base.patch
staging-comedi-dt3000-fix-rounding-up-of-timer-divisor.patch
iio-adc-max9611-fix-temperature-reading-in-probe.patch
usb-core-fix-races-in-character-device-registration-and-deregistraion.patch
usb-gadget-udc-renesas_usb3-fix-sysfs-interface-of-role.patch
usb-cdc-acm-make-sure-a-refcount-is-taken-early-enough.patch
usb-cdc-fix-sanity-checks-in-cdc-union-parser.patch
usb-serial-option-add-d-link-dwm-222-device-id.patch
usb-serial-option-add-support-for-zte-mf871a.patch
usb-serial-option-add-the-broadmobi-bm818-card.patch
usb-serial-option-add-motorola-modem-uarts.patch
usb-setup-authorized_default-attributes-using-usb_bus_notify.patch
netfilter-conntrack-use-consistent-ct-id-hash-calculation.patch
iwlwifi-add-support-for-sar-south-korea-limitation.patch
input-psmouse-fix-build-error-of-multiple-definition.patch
bnx2x-fix-vf-s-vlan-reconfiguration-in-reload.patch
bonding-add-vlan-tx-offload-to-hw_enc_features.patch
net-dsa-check-existence-of-.port_mdb_add-callback-before-calling-it.patch
net-mlx4_en-fix-a-memory-leak-bug.patch
net-packet-fix-race-in-tpacket_snd.patch
net-sched-sch_taprio-fix-memleak-in-error-path-for-sched-list-parse.patch
sctp-fix-memleak-in-sctp_send_reset_streams.patch
sctp-fix-the-transport-error_count-check.patch
team-add-vlan-tx-offload-to-hw_enc_features.patch
tipc-initialise-addr_trail_end-when-setting-node-addresses.patch
xen-netback-reset-nr_frags-before-freeing-skb.patch
net-mlx5e-only-support-tx-rx-pause-setting-for-port-owner.patch
bnxt_en-fix-vnic-clearing-logic-for-57500-chips.patch
bnxt_en-improve-rx-doorbell-sequence.patch
bnxt_en-fix-handling-frag_err-when-nvm_install_update-cmd-fails.patch
bnxt_en-suppress-hwrm-errors-for-hwrm_nvm_get_variable-command.patch
bnxt_en-use-correct-src_fid-to-determine-direction-of-the-flow.patch
bnxt_en-fix-to-include-flow-direction-in-l2-key.patch
net-sched-update-skbedit-action-for-batched-events-operations.patch
tc-testing-updated-skbedit-action-tests-with-batch-create-delete.patch
netdevsim-restore-per-network-namespace-accounting-for-fib-entries.patch
net-mlx5e-ethtool-avoid-setting-speed-to-56gbase-when-autoneg-off.patch
net-mlx5e-fix-false-negative-indication-on-tx-reporter-cqe-recovery.patch
net-mlx5e-remove-redundant-check-in-cqe-recovery-flow-of-tx-reporter.patch
net-mlx5e-use-flow-keys-dissector-to-parse-packets-for-arfs.patch
net-tls-prevent-skb_orphan-from-leaking-tls-plain-text-with-offload.patch
net-phy-consider-an_restart-status-when-reading-link-status.patch
netlink-fix-nlmsg_parse-as-a-wrapper-for-strict-message-parsing.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
❌ LTP lite [7]
❌ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ Networking sctp-auth: sockopts test [14]
✅ Networking TCP: keepalive test [15]
✅ audit: audit testsuite test [16]
✅ httpd: mod_ssl smoke sanity [17]
✅ iotop: sanity [18]
✅ tuned: tune-processes-through-perf [19]
✅ Usex - version 1.9-29 [20]
✅ storage: SCSI VPD [21]
✅ stress: stress-ng [22]
ppc64le:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
✅ LTP lite [7]
✅ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ Networking sctp-auth: sockopts test [14]
✅ Networking TCP: keepalive test [15]
✅ audit: audit testsuite test [16]
✅ httpd: mod_ssl smoke sanity [17]
✅ iotop: sanity [18]
✅ tuned: tune-processes-through-perf [19]
✅ Usex - version 1.9-29 [20]
x86_64:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
✅ LTP lite [7]
✅ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ Networking sctp-auth: sockopts test [14]
✅ Networking TCP: keepalive test [15]
✅ audit: audit testsuite test [16]
✅ httpd: mod_ssl smoke sanity [17]
✅ iotop: sanity [18]
✅ tuned: tune-processes-through-perf [19]
✅ pciutils: sanity smoke test [23]
✅ Usex - version 1.9-29 [20]
✅ storage: SCSI VPD [21]
✅ stress: stress-ng [22]
Test source:
💚 Pull requests are welcome for new tests or improvements to existing tests!
[0]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[1]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/filesystems…
[2]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/packages/se…
[3]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/lvm/…
[4]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/swra…
[5]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/blk
[6]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/container/p…
[7]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[8]: https://github.com/CKI-project/tests-beaker/archive/master.zip#filesystems/…
[9]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/jvm
[10]: https://github.com/CKI-project/tests-beaker/archive/master.zip#misc/amtu
[11]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[12]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[13]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[14]: https://github.com/CKI-project/tests-beaker/archive/master.zip#networking/s…
[15]: https://github.com/CKI-project/tests-beaker/archive/master.zip#networking/t…
[16]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/aud…
[17]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[18]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/iot…
[19]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/tun…
[20]: https://github.com/CKI-project/tests-beaker/archive/master.zip#standards/us…
[21]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/scsi…
[22]: https://github.com/CKI-project/tests-beaker/archive/master.zip#stress/stres…
[23]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/san…
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Turns out a cfs_rq->runtime_remaining can become positive in
assign_cfs_rq_runtime(), but this codepath has no call to
unthrottle_cfs_rq().
This can leave us in a situation where we have a throttled cfs_rq with
positive ->runtime_remaining, which breaks the math in
distribute_cfs_runtime(): this function expects a negative value so that
it may safely negate it into a positive value.
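The arithmetic problem can be sketched in user space as follows. This is a
simplified model, not the scheduler code: struct model_cfs_rq and both
functions are hypothetical stand-ins, with int64_t and uint64_t playing the
roles of the kernel's s64 and u64.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the fields involved. */
struct model_cfs_rq {
	int64_t runtime_remaining;	/* s64 in the kernel */
	int throttled;
};

/* Mirrors the per-cfs_rq computation in distribute_cfs_runtime(). */
uint64_t model_runtime_to_hand_out(const struct model_cfs_rq *cfs_rq,
				   uint64_t remaining)
{
	/* Intended: smallest amount that makes runtime_remaining positive. */
	uint64_t runtime = (uint64_t)(-cfs_rq->runtime_remaining + 1);

	if (runtime > remaining)
		runtime = remaining;
	return runtime;
}

/* Models the tail of assign_cfs_rq_runtime() with the added unthrottle. */
int model_assign_runtime(struct model_cfs_rq *cfs_rq, int64_t amount)
{
	cfs_rq->runtime_remaining += amount;

	if (cfs_rq->runtime_remaining > 0 && cfs_rq->throttled)
		cfs_rq->throttled = 0;	/* stands in for unthrottle_cfs_rq() */

	return cfs_rq->runtime_remaining > 0;
}
```

With runtime_remaining at -5 the model hands out 6 units, as intended. With
runtime_remaining at +5 the negation yields -4, which wraps to a huge
unsigned value and is then clamped to the whole remaining pool, so one
throttled cfs_rq can absorb runtime meant for the others.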
Add the missing unthrottle_cfs_rq(). While at it, add a WARN_ON where
we expect negative values, and pull in a comment from the mailing list
that didn't make it in [1].
[1]: https://lkml.kernel.org/r/BANLkTi=NmCxKX6EbDQcJYDJ5kKyG2N1ssw@mail.gmail.com
Cc: <stable(a)vger.kernel.org>
Fixes: ec12cb7f31e2 ("sched: Accumulate per-cfs_rq cpu usage and charge against bandwidth")
Reported-by: Liangyan <liangyan.peng(a)linux.alibaba.com>
Signed-off-by: Valentin Schneider <valentin.schneider(a)arm.com>
---
kernel/sched/fair.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1054d2cf6aaa..219ff3f328e5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4385,6 +4385,11 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
}
+static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
+{
+ return cfs_bandwidth_used() && cfs_rq->throttled;
+}
+
/* returns 0 on failure to allocate runtime */
static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
{
@@ -4411,6 +4416,9 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
cfs_rq->runtime_remaining += amount;
+ if (cfs_rq->runtime_remaining > 0 && cfs_rq_throttled(cfs_rq))
+ unthrottle_cfs_rq(cfs_rq);
+
return cfs_rq->runtime_remaining > 0;
}
@@ -4439,11 +4447,6 @@ void account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
__account_cfs_rq_runtime(cfs_rq, delta_exec);
}
-static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
-{
- return cfs_bandwidth_used() && cfs_rq->throttled;
-}
-
/* check whether cfs_rq, or any parent, is throttled */
static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
{
@@ -4628,6 +4631,10 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, u64 remaining)
if (!cfs_rq_throttled(cfs_rq))
goto next;
+ /* By the above check, this should never be true */
+ WARN_ON(cfs_rq->runtime_remaining > 0);
+
+ /* Pick the minimum amount to return to a positive quota state */
runtime = -cfs_rq->runtime_remaining + 1;
if (runtime > remaining)
runtime = remaining;
--
2.22.0
The vGPU PPGTT notification is split into two steps: the first step
updates PVINFO's pdp registers, and the second writes the action code to
PVINFO's g2v_notify register to trigger the PPGTT notification on the GVT
side.
Currently these two steps are not atomic because nothing protects them,
so it is easy to hit a race condition during MTBF, stress and IGT tests,
which causes a GPU hang.
The solution is to add a lock so that the vGPU PPGTT notification becomes
an atomic operation.
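The race can be illustrated with a deterministic, single-threaded model in
which the interleaving is written out by hand. Nothing here is i915 code:
struct model_pvinfo and the model_* functions are hypothetical, and the
"host" is reduced to latching whatever pdp value is present at notify time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the shared PVINFO page. */
struct model_pvinfo {
	uint64_t pdp;		/* step 1: guest writes the pdp value */
	uint64_t seen_by_host;	/* value the host latched on notify */
};

/* Step 1: publish the page-directory pointer. */
void model_write_pdp(struct model_pvinfo *pv, uint64_t pdp)
{
	pv->pdp = pdp;
}

/* Step 2: notify; the host reads whatever pdp is current right now. */
uint64_t model_notify(struct model_pvinfo *pv)
{
	pv->seen_by_host = pv->pdp;
	return pv->seen_by_host;
}

/* Unprotected: caller B's step 1 lands between caller A's two steps. */
uint64_t model_racy_notify_a(struct model_pvinfo *pv,
			     uint64_t pdp_a, uint64_t pdp_b)
{
	model_write_pdp(pv, pdp_a);	/* A, step 1 */
	model_write_pdp(pv, pdp_b);	/* B, step 1 (interleaved) */
	return model_notify(pv);	/* A, step 2: host sees B's pdp */
}

/* Serialized: with a lock held, A's two steps run back to back. */
uint64_t model_locked_notify_a(struct model_pvinfo *pv,
			       uint64_t pdp_a, uint64_t pdp_b)
{
	uint64_t seen;

	model_write_pdp(pv, pdp_a);	/* A, step 1 */
	seen = model_notify(pv);	/* A, step 2 */
	model_write_pdp(pv, pdp_b);	/* B runs only after A is done */
	return seen;
}
```

In the unprotected ordering, A's notification is processed against B's
pdp value; once the two steps are serialized, the host sees the value A
wrote.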
Cc: stable(a)vger.kernel.org
Signed-off-by: Xiaolin Zhang <xiaolin.zhang(a)intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++++
drivers/gpu/drm/i915/i915_vgpu.c | 1 +
3 files changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eb31c16..32e17c4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -961,6 +961,7 @@ struct i915_frontbuffer_tracking {
};
struct i915_virtual_gpu {
+ struct mutex lock;
bool active;
u32 caps;
};
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2cd2dab..1bb93b7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -833,6 +833,8 @@ static int gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
enum vgt_g2v_type msg;
int i;
+ mutex_lock(&dev_priv->vgpu.lock);
+
if (create)
atomic_inc(px_used(ppgtt->pd)); /* never remove */
else
@@ -860,6 +862,8 @@ static int gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
I915_WRITE(vgtif_reg(g2v_notify), msg);
+ mutex_unlock(&dev_priv->vgpu.lock);
+
return 0;
}
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index bf2b837..7493544 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -94,6 +94,7 @@ void i915_detect_vgpu(struct drm_i915_private *dev_priv)
dev_priv->vgpu.caps = readl(shared_area + vgtif_offset(vgt_caps));
dev_priv->vgpu.active = true;
+ mutex_init(&dev_priv->vgpu.lock);
DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
out:
--
2.7.4