This is the start of the stable review cycle for the 4.9.97 release.
There are 74 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Apr 29 13:56:52 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.97-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.97-rc1
Hans de Goede <hdegoede(a)redhat.com>
ACPI / video: Only default only_lcd to true on Win8-ready _desktops_
Heiko Carstens <heiko.carstens(a)de.ibm.com>
s390/uprobes: implement arch_uretprobe_is_alive()
Stefan Haberland <sth(a)linux.vnet.ibm.com>
s390/dasd: fix IO error for newly defined devices
Sebastian Ott <sebott(a)linux.ibm.com>
s390/cio: update chpid descriptor after resource accessibility event
Dan Carpenter <dan.carpenter(a)oracle.com>
cdrom: information leak in cdrom_ioctl_media_changed()
Martin K. Petersen <martin.petersen(a)oracle.com>
scsi: mptsas: Disable WRITE SAME
Doron Roberts-Kedes <doronrk(a)fb.com>
strparser: Fix incorrect strp->need_bytes value.
Eric Dumazet <edumazet(a)google.com>
ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
Eric Dumazet <edumazet(a)google.com>
net: af_packet: fix race in PACKET_{R|T}X_RING
Eric Dumazet <edumazet(a)google.com>
tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets
Wolfgang Bumiller <w.bumiller(a)proxmox.com>
net: fix deadlock while clearing neighbor proxy table
Ivan Khoronzhuk <ivan.khoronzhuk(a)linaro.org>
net: ethernet: ti: cpsw: fix tx vlan priority mapping
Cong Wang <xiyou.wangcong(a)gmail.com>
llc: fix NULL pointer deref for SOCK_ZAPPED
Cong Wang <xiyou.wangcong(a)gmail.com>
llc: hold llc_sap before release_sock()
Alexander Aring <aring(a)mojatatu.com>
net: sched: ife: signal not finding metaid
Xin Long <lucien.xin(a)gmail.com>
sctp: do not check port in sctp_inet6_cmp_addr
Toshiaki Makita <makita.toshiaki(a)lab.ntt.co.jp>
vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multi
Guillaume Nault <g.nault(a)alphalink.fr>
pppoe: check sockaddr length in pppoe_connect()
Eric Dumazet <edumazet(a)google.com>
tipc: add policy for TIPC_NLA_NET_ADDR
Willem de Bruijn <willemb(a)google.com>
packet: fix bitfield update race
Xin Long <lucien.xin(a)gmail.com>
team: fix netconsole setup over team
Paolo Abeni <pabeni(a)redhat.com>
team: avoid adding twice the same option to the event list
Jann Horn <jannh(a)google.com>
tcp: don't read out-of-bounds opsize
Cong Wang <xiyou.wangcong(a)gmail.com>
llc: delete timers synchronously in llc_sk_free()
Eric Dumazet <edumazet(a)google.com>
net: validate attribute sizes in neigh_dump_table()
Guillaume Nault <g.nault(a)alphalink.fr>
l2tp: check sockaddr length in pppol2tp_connect()
Eric Biggers <ebiggers(a)google.com>
KEYS: DNS: limit the length of option strings
Xin Long <lucien.xin(a)gmail.com>
bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: correct module section names for expoline code revert
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: correct nospec auto detection init order
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: add sysfs attributes for spectre
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: report spectre mitigation via syslog
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: add automatic detection of the spectre defense
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: move nobp parameter functions to nospec-branch.c
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390/entry.S: fix spurious zeroing of r0
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: do not bypass BPENTER for interrupt system calls
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
KVM: s390: force bp isolation for VSIE
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: introduce execute-trampolines for branches
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: run user space and KVM guests with modified branch prediction
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: add options to change branch prediction behaviour for the kernel
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390/alternative: use a copy of the facility bit mask
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: add optimized array_index_mask_nospec
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: scrub registers on kernel entry and KVM exit
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
KVM: s390: wire up bpb feature
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: enable CPU alternatives unconditionally
Martin Schwidefsky <schwidefsky(a)de.ibm.com>
s390: introduce CPU alternatives
Sinan Kaya <okaya(a)codeaurora.org>
PCI: Wait up to 60 seconds for device to become ready after FLR
Karthikeyan Periyasamy <periyasa(a)codeaurora.org>
Revert "ath10k: send (re)assoc peer command when NSS changed"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "pinctrl: intel: Initialize GPIO properly when used through irqchip"
Grant Grundler <grundler(a)chromium.org>
r8152: add Linksys USB3GIGV1 id
Benjamin Beichler <benjamin.beichler(a)uni-rostock.de>
mac80211_hwsim: fix use-after-free bug in hwsim_exit_net
Imre Deak <imre.deak(a)intel.com>
drm/i915/bxt, glk: Increase PCODE timeouts during CDCLK freq changing
Leon Romanovsky <leonro(a)mellanox.com>
RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs
Jiri Olsa <jolsa(a)kernel.org>
perf: Return proper values for user stack errors
Jiri Olsa <jolsa(a)kernel.org>
perf: Fix sample_max_stack maximum check
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "perf tools: Decompress kernel module when reading DSO data"
Sahitya Tummala <stummala(a)codeaurora.org>
jbd2: fix use after free in kjournald2()
Felix Fietkau <nbd(a)nbd.name>
ath9k_hw: check if the chip failed to wake up
Paul Burton <paul.burton(a)imgtec.com>
OF: Prevent unaligned access in of_alias_scan()
Dan Carpenter <dan.carpenter(a)oracle.com>
stk-webcam: fix an endian bug in stk_camera_read_reg()
Colin Ian King <colin.king(a)canonical.com>
power: supply: bq2415x: check for NULL acpi_id to avoid null pointer dereference
Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Input: drv260x - fix initializing overdrive voltage
Matt Redfearn <matt.redfearn(a)imgtec.com>
MIPS: Generic: Fix big endian CPUs on generic machine
Merlijn Wajer <merlijn(a)wizzup.org>
usb: musb: Fix external abort in musb_remove on omap2430
Merlijn Wajer <merlijn(a)wizzup.org>
usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
Andreas Kemnade <andreas(a)kemnade.info>
usb: musb: fix enumeration after resume
Jean Delvare <jdelvare(a)suse.de>
i2c: i801: Restore configuration at shutdown
Jean Delvare <jdelvare(a)suse.de>
i2c: i801: Save register SMBSLVCMD value only once
Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
i2c: i801: store and restore the SLVCMD register at load and unload
Imre Deak <imre.deak(a)intel.com>
drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state
Daniel J Blueman <daniel(a)quora.org>
drm/vc4: Fix memory leak during BO teardown
Xiaoming Gao <gxm.linux.kernel(a)gmail.com>
x86/tsc: Prevent 32bit truncation in calc_hpet_ref()
Steve French <smfrench(a)gmail.com>
cifs: do not allow creating sockets except with SMB1 posix exensions
-------------
Diffstat:
Documentation/kernel-parameters.txt | 3 +
Makefile | 4 +-
arch/mips/Kconfig | 1 +
arch/s390/Kconfig | 47 ++++++
arch/s390/Makefile | 10 ++
arch/s390/include/asm/alternative.h | 149 +++++++++++++++++
arch/s390/include/asm/barrier.h | 24 +++
arch/s390/include/asm/facility.h | 18 +++
arch/s390/include/asm/kvm_host.h | 3 +-
arch/s390/include/asm/lowcore.h | 7 +-
arch/s390/include/asm/nospec-branch.h | 17 ++
arch/s390/include/asm/processor.h | 4 +
arch/s390/include/asm/thread_info.h | 4 +
arch/s390/include/uapi/asm/kvm.h | 5 +-
arch/s390/kernel/Makefile | 6 +-
arch/s390/kernel/alternative.c | 112 +++++++++++++
arch/s390/kernel/early.c | 5 +
arch/s390/kernel/entry.S | 250 ++++++++++++++++++++++++++---
arch/s390/kernel/ipl.c | 1 +
arch/s390/kernel/module.c | 65 +++++++-
arch/s390/kernel/nospec-branch.c | 169 +++++++++++++++++++
arch/s390/kernel/processor.c | 18 +++
arch/s390/kernel/setup.c | 14 +-
arch/s390/kernel/smp.c | 7 +-
arch/s390/kernel/uprobes.c | 9 ++
arch/s390/kernel/vmlinux.lds.S | 37 +++++
arch/s390/kvm/kvm-s390.c | 13 +-
arch/s390/kvm/vsie.c | 30 ++++
arch/x86/kernel/tsc.c | 2 +-
drivers/acpi/acpi_video.c | 27 +++-
drivers/cdrom/cdrom.c | 2 +-
drivers/gpu/drm/drm_dp_dual_mode_helper.c | 39 ++++-
drivers/gpu/drm/i915/i915_drv.h | 6 +-
drivers/gpu/drm/i915/intel_display.c | 9 +-
drivers/gpu/drm/i915/intel_pm.c | 6 +-
drivers/gpu/drm/vc4/vc4_bo.c | 2 +
drivers/gpu/drm/vc4/vc4_validate_shaders.c | 1 +
drivers/i2c/busses/i2c-i801.c | 29 +++-
drivers/infiniband/hw/mlx5/qp.c | 3 +-
drivers/input/misc/drv260x.c | 2 +-
drivers/media/usb/stkwebcam/stk-sensor.c | 6 +-
drivers/media/usb/stkwebcam/stk-webcam.c | 11 +-
drivers/media/usb/stkwebcam/stk-webcam.h | 2 +-
drivers/message/fusion/mptsas.c | 1 +
drivers/net/bonding/bond_main.c | 3 +-
drivers/net/ethernet/ti/cpsw.c | 2 +-
drivers/net/ppp/pppoe.c | 4 +
drivers/net/team/team.c | 38 ++++-
drivers/net/usb/cdc_ether.c | 10 ++
drivers/net/usb/r8152.c | 2 +
drivers/net/wireless/ath/ath10k/mac.c | 5 +-
drivers/net/wireless/ath/ath9k/hw.c | 4 +
drivers/net/wireless/mac80211_hwsim.c | 7 +-
drivers/of/base.c | 2 +-
drivers/pci/pci.c | 52 ++++--
drivers/pinctrl/intel/pinctrl-intel.c | 23 +--
drivers/power/supply/bq2415x_charger.c | 5 +
drivers/s390/block/dasd_alias.c | 13 +-
drivers/s390/char/Makefile | 2 +
drivers/s390/cio/chsc.c | 14 +-
drivers/usb/musb/musb_core.c | 8 +-
fs/cifs/dir.c | 9 +-
fs/jbd2/journal.c | 2 +-
include/linux/if_vlan.h | 7 +-
include/net/llc_conn.h | 1 +
include/uapi/linux/kvm.h | 1 +
kernel/events/callchain.c | 21 +--
kernel/events/core.c | 4 +-
net/core/dev.c | 2 +-
net/core/neighbour.c | 40 +++--
net/dns_resolver/dns_key.c | 13 +-
net/ipv4/tcp.c | 6 +-
net/ipv4/tcp_input.c | 7 +-
net/ipv6/route.c | 2 +
net/l2tp/l2tp_ppp.c | 7 +
net/llc/af_llc.c | 14 +-
net/llc/llc_c_ac.c | 9 +-
net/llc/llc_conn.c | 22 ++-
net/packet/af_packet.c | 82 +++++++---
net/packet/internal.h | 10 +-
net/sched/act_ife.c | 2 +-
net/sctp/ipv6.c | 60 +++----
net/strparser/strparser.c | 7 +-
net/tipc/netlink.c | 3 +-
tools/perf/util/dso.c | 16 --
85 files changed, 1459 insertions(+), 262 deletions(-)
This is the start of the stable review cycle for the 3.18.107 release.
There are 24 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.107-rc1
Dan Carpenter <dan.carpenter(a)oracle.com>
cdrom: information leak in cdrom_ioctl_media_changed()
Martin K. Petersen <martin.petersen(a)oracle.com>
scsi: mptsas: Disable WRITE SAME
Eric Dumazet <edumazet(a)google.com>
ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
Cong Wang <xiyou.wangcong(a)gmail.com>
llc: delete timers synchronously in llc_sk_free()
Eric Dumazet <edumazet(a)google.com>
net: af_packet: fix race in PACKET_{R|T}X_RING
Eric Dumazet <edumazet(a)google.com>
tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets
Willem de Bruijn <willemb(a)google.com>
packet: fix bitfield update race
Cong Wang <xiyou.wangcong(a)gmail.com>
llc: fix NULL pointer deref for SOCK_ZAPPED
Cong Wang <xiyou.wangcong(a)gmail.com>
llc: hold llc_sap before release_sock()
Guillaume Nault <g.nault(a)alphalink.fr>
pppoe: check sockaddr length in pppoe_connect()
Xin Long <lucien.xin(a)gmail.com>
team: fix netconsole setup over team
Paolo Abeni <pabeni(a)redhat.com>
team: avoid adding twice the same option to the event list
Jann Horn <jannh(a)google.com>
tcp: don't read out-of-bounds opsize
Guillaume Nault <g.nault(a)alphalink.fr>
l2tp: check sockaddr length in pppol2tp_connect()
Eric Biggers <ebiggers(a)google.com>
KEYS: DNS: limit the length of option strings
Xin Long <lucien.xin(a)gmail.com>
bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
Sahitya Tummala <stummala(a)codeaurora.org>
jbd2: fix use after free in kjournald2()
Matthew Wilcox <mawilcox(a)microsoft.com>
mm/filemap.c: fix NULL pointer in page_cache_tree_insert()
Jiri Olsa <jolsa(a)kernel.org>
perf: Return proper values for user stack errors
Theodore Ts'o <tytso(a)mit.edu>
ext4: don't update checksum of new initialized bitmaps
wangguang <wang.guang55(a)zte.com.cn>
ext4: bugfix for mmaped pages in mpage_release_unused_pages()
Theodore Ts'o <tytso(a)mit.edu>
ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
Xiaoming Gao <gxm.linux.kernel(a)gmail.com>
x86/tsc: Prevent 32bit truncation in calc_hpet_ref()
Steve French <smfrench(a)gmail.com>
cifs: do not allow creating sockets except with SMB1 posix exensions
-------------
Diffstat:
Makefile | 4 +-
arch/x86/kernel/tsc.c | 2 +-
drivers/cdrom/cdrom.c | 2 +-
drivers/message/fusion/mptsas.c | 1 +
drivers/net/bonding/bond_main.c | 3 +-
drivers/net/ppp/pppoe.c | 4 ++
drivers/net/team/team.c | 38 ++++++++++++++----
fs/cifs/dir.c | 9 +++--
fs/ext4/balloc.c | 3 +-
fs/ext4/ialloc.c | 43 ++------------------
fs/ext4/inline.c | 66 ++++++++++++++-----------------
fs/ext4/inode.c | 2 +
fs/ext4/xattr.c | 30 ++++++--------
fs/ext4/xattr.h | 32 +++++++++++++++
fs/jbd2/journal.c | 2 +-
include/net/llc_conn.h | 1 +
kernel/events/core.c | 4 +-
mm/filemap.c | 4 +-
net/dns_resolver/dns_key.c | 13 +++---
net/ipv4/tcp.c | 6 ++-
net/ipv4/tcp_input.c | 7 +---
net/ipv6/route.c | 2 +
net/l2tp/l2tp_ppp.c | 7 ++++
net/llc/af_llc.c | 14 ++++++-
net/llc/llc_c_ac.c | 9 +----
net/llc/llc_conn.c | 22 ++++++++++-
net/packet/af_packet.c | 88 +++++++++++++++++++++++++++++------------
net/packet/internal.h | 10 ++---
28 files changed, 254 insertions(+), 174 deletions(-)
Hi,
This 4th version of the series which fixes %p uses in kprobes.
Some by replacing with %pS, some by replacing with %px but
masking with kallsyms_show_value().
I've read the thread about %pK and if I understand correctly
we shouldn't print kernel addresses. However, kprobes debugfs
interface can not stop to show the actual probe address because
it should be compared with addresses in kallsyms for debugging.
So, it depends on that kallsyms_show_value() allows to show
address to user, because if it returns true, anyway that user
can dump /proc/kallsyms.
Other error messages are replaced it with %pS or just removed.
This series also including fixes for arch ports too.
Changes in this version;
[1/7] Fix "list" file's mode too.
[2/7] Do not use local variables and fix comment.
[4/7] Use WARN_ONCE() for single bug.
[5/7] Just remove %p.
Thank you,
---
Masami Hiramatsu (7):
kprobes: Make list and blacklist root user read only
kprobes: Show blacklist addresses as same as kallsyms does
kprobes: Show address of kprobes if kallsyms does
kprobes: Replace %p with other pointer types
kprobes/x86: Fix %p uses in error messages
kprobes/arm: Fix %p uses in error messages
kprobes/arm64: Fix %p uses in error messages
arch/arm/probes/kprobes/core.c | 10 +++----
arch/arm/probes/kprobes/test-core.c | 1 -
arch/arm64/kernel/probes/kprobes.c | 4 +--
arch/x86/kernel/kprobes/core.c | 13 +++------
kernel/kprobes.c | 52 +++++++++++++++++++++--------------
5 files changed, 42 insertions(+), 38 deletions(-)
--
Masami Hiramatsu (Linaro) <mhiramat(a)kernel.org>
gpstate_timer_handler() uses synchronous smp_call to set the pstate
on the requested core. This causes the below hard lockup:
[c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
[c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
[c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
[c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
[c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
[c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
[c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
[c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
[c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
[c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
-- interrupt: 901 at doorbell_global_ipi+0x34/0x50
LR = arch_send_call_function_ipi_mask+0x120/0x130
[c000003fe566ba50] [c00000000004876c]
arch_send_call_function_ipi_mask+0x4c/0x130
[c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
[c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
[c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
[c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
[c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
[c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
[c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
One way to avoid this is removing the smp-call. We can ensure that the timer
always runs on one of the policy-cpus. If the timer gets migrated to a
cpu outside the policy then re-queue it back on the policy->cpus. This way
we can get rid of the smp-call which was being used to set the pstate
on the policy->cpus.
Fixes: 7bc54b652f13 (timers, cpufreq/powernv: Initialize the gpstate timer as pinned)
Cc: <stable(a)vger.kernel.org> [4.8+]
Reported-by: Nicholas Piggin <npiggin(a)gmail.com>
Reported-by: Pridhiviraj Paidipeddi <ppaidipe(a)linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat(a)linux.vnet.ibm.com>
---
Changes from V2:
- Remove the check for active policy while requeing the migrated timer
Changes from V1:
- Remove smp_call in the pstate handler.
drivers/cpufreq/powernv-cpufreq.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 71f8682..e368e1f 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -679,6 +679,16 @@ void gpstate_timer_handler(struct timer_list *t)
if (!spin_trylock(&gpstates->gpstate_lock))
return;
+ /*
+ * If the timer has migrated to the different cpu then bring
+ * it back to one of the policy->cpus
+ */
+ if (!cpumask_test_cpu(raw_smp_processor_id(), policy->cpus)) {
+ gpstates->timer.expires = jiffies + msecs_to_jiffies(1);
+ add_timer_on(&gpstates->timer, cpumask_first(policy->cpus));
+ spin_unlock(&gpstates->gpstate_lock);
+ return;
+ }
/*
* If PMCR was last updated was using fast_swtich then
@@ -718,10 +728,8 @@ void gpstate_timer_handler(struct timer_list *t)
if (gpstate_idx != gpstates->last_lpstate_idx)
queue_gpstate_timer(gpstates);
+ set_pstate(&freq_data);
spin_unlock(&gpstates->gpstate_lock);
-
- /* Timer may get migrated to a different cpu on cpu hot unplug */
- smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
}
/*
--
1.8.3.1
This fixes the compile error "multiple definition of `dev_attr_modalias'"
by adding the static modifier to DEVICE_ATTR_RO(modalias).
This change was made in the mainline kernel in 2460942f51f1 ("serdev: do
not generate modaliases for controllers") along with some other changes.
Fixes: 4fe99816a1ab ("tty: serdev: use dev_groups and not dev_attrs for bus_type")
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Johan Hovold <johan(a)kernel.org>
Cc: Sebastian Reichel <sebastian.reichel(a)collabora.co.uk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org> # 4.14.x
Signed-off-by: David Lechner <david(a)lechnology.com>
---
Should we pick up the patch 2460942f51f1 ("serdev: do not generate modaliases
for controllers") for stable or is this patch good enough?
drivers/tty/serdev/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serdev/core.c b/drivers/tty/serdev/core.c
index 97db76afced2..25298b7b2419 100644
--- a/drivers/tty/serdev/core.c
+++ b/drivers/tty/serdev/core.c
@@ -276,7 +276,7 @@ static ssize_t modalias_show(struct device *dev,
{
return of_device_modalias(dev, buf, PAGE_SIZE);
}
-DEVICE_ATTR_RO(modalias);
+static DEVICE_ATTR_RO(modalias);
static struct attribute *serdev_device_attrs[] = {
&dev_attr_modalias.attr,
--
2.17.0
Hi,
This 3rd version of the series which fixes %p uses in kprobes.
Some by replacing with %pS, some by replacing with %px but
masking with kallsyms_show_value().
I've read the thread about %pK and if I understand correctly
we shouldn't print kernel addresses. However, kprobes debugfs
interface can not stop to show the actual probe address because
it should be compared with addresses in kallsyms for debugging.
So, it depends on that kallsyms_show_value() allows to show
address to user, because if it returns true, anyway that user
can dump /proc/kallsyms.
Other error messages are replaced it with %pS or just removed.
This series also including some fixes for arch ports too.
Changes in this version;
- [2/7]: Updated for the latest linus tree.
- [4/7][5/7]: Do not use %px.
Thank you,
---
Masami Hiramatsu (7):
kprobes: Make blacklist root user read only
kprobes: Show blacklist addresses as same as kallsyms does
kprobes: Show address of kprobes if kallsyms does
kprobes: Replace %p with other pointer types
kprobes/x86: Fix %p uses in error messages
kprobes/arm: Fix %p uses in error messages
kprobes/arm64: Fix %p uses in error messages
arch/arm/probes/kprobes/core.c | 10 ++++---
arch/arm/probes/kprobes/test-core.c | 1 -
arch/arm64/kernel/probes/kprobes.c | 4 +--
arch/x86/kernel/kprobes/core.c | 12 +++------
kernel/kprobes.c | 48 ++++++++++++++++++++++-------------
5 files changed, 41 insertions(+), 34 deletions(-)
--
Masami Hiramatsu (Linaro) <mhiramat(a)kernel.org>
KEXEC needs the new kernel's load address to be aligned on a page
boundary (see sanity_check_segment_list()), but on MIPS the default
vmlinuz load address is only explicitly aligned to 16 bytes.
Since the largest PAGE_SIZE supported by MIPS kernels is 64KB, increase
the alignment calculated by calc_vmlinuz_load_addr to 64KB.
Cc: <stable(a)vger.kernel.org> # 2.6.36+
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/mips/boot/compressed/calc_vmlinuz_load_addr.c b/arch/mips/boot/compressed/calc_vmlinuz_load_addr.c
index 37fe58c..542c3ed 100644
--- a/arch/mips/boot/compressed/calc_vmlinuz_load_addr.c
+++ b/arch/mips/boot/compressed/calc_vmlinuz_load_addr.c
@@ -13,6 +13,7 @@
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
+#include "../../../../include/linux/sizes.h"
int main(int argc, char *argv[])
{
@@ -45,11 +46,11 @@ int main(int argc, char *argv[])
vmlinuz_load_addr = vmlinux_load_addr + vmlinux_size;
/*
- * Align with 16 bytes: "greater than that used for any standard data
- * types by a MIPS compiler." -- See MIPS Run Linux (Second Edition).
+ * Align with 64KB: KEXEC needs load sections to be aligned to PAGE_SIZE,
+ * which may be as large as 64KB depending on the kernel configuration.
*/
- vmlinuz_load_addr += (16 - vmlinux_size % 16);
+ vmlinuz_load_addr += (SZ_64K - vmlinux_size % SZ_64K);
printf("0x%llx\n", vmlinuz_load_addr);
--
2.7.0
On Tue, Apr 24, 2018 at 3:34 PM, Will Deacon <will.deacon(a)arm.com> wrote:
> I've not run into any build issues here -- is this specifically with some
> out-of-tree module?
I received a bug report email about this. I'm not sure which specific
module, and I assumed from the email that it was actually a result of
in-tree configuration options rather than an out-of-tree module, but
I'm not sure exactly. Either way, I was able to reproduce the problem
by coding up a little PoC out-of-tree module, so it is certainly a
real problem.
> It would be better not to introduce a new header file just for this, I
> think. How about compiler.h instead?
I could, but actually after I wrote this email I noticed that this is
a widespread convention:
zx2c4@thinkpad ~/Projects/linux $ subfind asm-prototypes
./arch/s390/include/asm/asm-prototypes.h
./arch/alpha/include/asm/asm-prototypes.h
./arch/powerpc/include/asm/asm-prototypes.h
./arch/m68k/include/asm/asm-prototypes.h
./arch/mips/include/asm/asm-prototypes.h
./arch/x86/include/asm/asm-prototypes.h
./arch/sparc/include/asm/asm-prototypes.h
./arch/ia64/include/asm/asm-prototypes.h
./arch/um/include/asm/asm-prototypes.h
./include/asm-generic/asm-prototypes.h
>
> We normally export asm symbols via arm64ksyms.c. In fact, would doing that
> remove the need for the explicit declarations completely?
I'm pretty sure it still needs the declaration; otherwise the module
hashing will get confused. Also, the EXPORT_SYMBOL macro is a
different one when called from assembly versus from C, though not sure
that makes a substantive difference. It seems like this is what other
architectures are doing:
zx2c4@thinkpad ~/Projects/linux $ rg 'EXPORT_SYMBOL.*__.*[std]i[0-9]' -g '*.S'
arch/m68k/lib/modsi3.S
111: EXPORT_SYMBOL(__modsi3)
arch/m68k/lib/umodsi3.S
108: EXPORT_SYMBOL(__umodsi3)
arch/m68k/lib/udivsi3.S
157: EXPORT_SYMBOL(__udivsi3)
arch/m68k/lib/divsi3.S
123: EXPORT_SYMBOL(__divsi3)
arch/m68k/lib/mulsi3.S
105: EXPORT_SYMBOL(__mulsi3)
arch/powerpc/kernel/misc_32.S
529:EXPORT_SYMBOL(__ashrdi3)
541:EXPORT_SYMBOL(__ashldi3)
553:EXPORT_SYMBOL(__lshrdi3)
569:EXPORT_SYMBOL(__cmpdi2)
584:EXPORT_SYMBOL(__ucmpdi2)
596:EXPORT_SYMBOL(__bswapdi2)
arch/powerpc/kernel/misc_64.S
211:EXPORT_SYMBOL(__bswapdi2)
arch/sparc/lib/lshrdi3.S
30:EXPORT_SYMBOL(__lshrdi3)
arch/sparc/lib/muldi3.S
78:EXPORT_SYMBOL(__muldi3)
arch/sparc/lib/divdi3.S
283:EXPORT_SYMBOL(__divdi3)
arch/sparc/lib/ashrdi3.S
40:EXPORT_SYMBOL(__ashrdi3)
arch/sparc/lib/multi3.S
36:EXPORT_SYMBOL(__multi3)
arch/sparc/lib/ashldi3.S
38:EXPORT_SYMBOL(__ashldi3)
The patch titled
Subject: mm/filemap.c: fix NULL pointer in page_cache_tree_insert()
has been removed from the -mm tree. Its filename was
fix-null-pointer-in-page_cache_tree_insert.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Matthew Wilcox <mawilcox(a)microsoft.com>
Subject: mm/filemap.c: fix NULL pointer in page_cache_tree_insert()
f2fs specifies the __GFP_ZERO flag for allocating some of its pages.
Unfortunately, the page cache also uses the mapping's GFP flags for
allocating radix tree nodes. It always masked off the __GFP_HIGHMEM
flag, and masks off __GFP_ZERO in some paths, but not all. That causes
radix tree nodes to be allocated with a NULL list_head, which causes
backtraces like:
[<ffffff80086f4de0>] __list_del_entry+0x30/0xd0
[<ffffff8008362018>] list_lru_del+0xac/0x1ac
[<ffffff800830f04c>] page_cache_tree_insert+0xd8/0x110
The __GFP_DMA and __GFP_DMA32 flags would also be able to sneak through if
they are ever used. Fix them all by using GFP_RECLAIM_MASK at the
innermost location, and remove it from earlier in the callchain.
Link: http://lkml.kernel.org/r/20180411060320.14458-2-willy@infradead.org
Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Signed-off-by: Matthew Wilcox <mawilcox(a)microsoft.com>
Reported-by: Chris Fries <cfries(a)google.com>
Debugged-by: Minchan Kim <minchan(a)kernel.org>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff -puN mm/filemap.c~fix-null-pointer-in-page_cache_tree_insert mm/filemap.c
--- a/mm/filemap.c~fix-null-pointer-in-page_cache_tree_insert
+++ a/mm/filemap.c
@@ -786,7 +786,7 @@ int replace_page_cache_page(struct page
VM_BUG_ON_PAGE(!PageLocked(new), new);
VM_BUG_ON_PAGE(new->mapping, new);
- error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+ error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
if (!error) {
struct address_space *mapping = old->mapping;
void (*freepage)(struct page *);
@@ -842,7 +842,7 @@ static int __add_to_page_cache_locked(st
return error;
}
- error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+ error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
if (error) {
if (!huge)
mem_cgroup_cancel_charge(page, memcg, false);
@@ -1585,8 +1585,7 @@ no_page:
if (fgp_flags & FGP_ACCESSED)
__SetPageReferenced(page);
- err = add_to_page_cache_lru(page, mapping, offset,
- gfp_mask & GFP_RECLAIM_MASK);
+ err = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
if (unlikely(err)) {
put_page(page);
page = NULL;
@@ -2387,7 +2386,7 @@ static int page_cache_read(struct file *
if (!page)
return -ENOMEM;
- ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask & GFP_KERNEL);
+ ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
if (ret == 0)
ret = mapping->a_ops->readpage(file, page);
else if (ret == -EEXIST)
_
Patches currently in -mm which might be from mawilcox(a)microsoft.com are
slab-__gfp_zero-is-incompatible-with-a-constructor.patch
ida-remove-simple_ida_lock.patch
The patch titled
Subject: autofs: mount point create should honour passed in mode
has been removed from the -mm tree. Its filename was
autofs-mount-point-create-should-honour-passed-in-mode.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ian Kent <raven(a)themaw.net>
Subject: autofs: mount point create should honour passed in mode
The autofs file system mkdir inode operation blindly sets the created
directory mode to S_IFDIR | 0555, ingoring the passed in mode, which can
cause selinux dac_override denials.
But the function also checks if the caller is the daemon (as no-one else
should be able to do anything here) so there's no point in not honouring
the passed in mode, allowing the daemon to set appropriate mode when
required.
Link: http://lkml.kernel.org/r/152361593601.8051.14014139124905996173.stgit@pluto…
Signed-off-by: Ian Kent <raven(a)themaw.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/autofs4/root.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN fs/autofs4/root.c~autofs-mount-point-create-should-honour-passed-in-mode fs/autofs4/root.c
--- a/fs/autofs4/root.c~autofs-mount-point-create-should-honour-passed-in-mode
+++ a/fs/autofs4/root.c
@@ -749,7 +749,7 @@ static int autofs4_dir_mkdir(struct inod
autofs4_del_active(dentry);
- inode = autofs4_get_inode(dir->i_sb, S_IFDIR | 0555);
+ inode = autofs4_get_inode(dir->i_sb, S_IFDIR | mode);
if (!inode)
return -ENOMEM;
d_add(dentry, inode);
_
Patches currently in -mm which might be from raven(a)themaw.net are
The patch titled
Subject: rapidio: fix rio_dma_transfer error handling
has been removed from the -mm tree. Its filename was
rapidio-fix-rio_dma_transfer-error-handling.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ioan Nicu <ioan.nicu.ext(a)nokia.com>
Subject: rapidio: fix rio_dma_transfer error handling
Some of the mport_dma_req structure members were initialized late
inside the do_dma_request() function, just before submitting the
request to the dma engine. But we have some error branches before
that. In case of such an error, the code would return on the error
path and trigger the calling of dma_req_free() with a req structure
which is not completely initialized. This causes a NULL pointer
dereference in dma_req_free().
This patch fixes these error branches by making sure that all
necessary mport_dma_req structure members are initialized in
rio_dma_transfer() immediately after the request structure gets
allocated.
Link: http://lkml.kernel.org/r/20180412150605.GA31409@nokia.com
Fixes: bbd876adb8c72 ("rapidio: use a reference count for struct mport_dma_req")
Signed-off-by: Ioan Nicu <ioan.nicu.ext(a)nokia.com>
Tested-by: Alexander Sverdlin <alexander.sverdlin(a)nokia.com>
Acked-by: Alexandre Bounine <alex.bou9(a)gmail.com>
Cc: Barry Wood <barry.wood(a)idt.com>
Cc: Matt Porter <mporter(a)kernel.crashing.org>
Cc: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: Frank Kunz <frank.kunz(a)nokia.com>
Cc: <stable(a)vger.kernel.org> [4.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/rapidio/devices/rio_mport_cdev.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff -puN drivers/rapidio/devices/rio_mport_cdev.c~rapidio-fix-rio_dma_transfer-error-handling drivers/rapidio/devices/rio_mport_cdev.c
--- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-fix-rio_dma_transfer-error-handling
+++ a/drivers/rapidio/devices/rio_mport_cdev.c
@@ -740,10 +740,7 @@ static int do_dma_request(struct mport_d
tx->callback = dma_xfer_callback;
tx->callback_param = req;
- req->dmach = chan;
- req->sync = sync;
req->status = DMA_IN_PROGRESS;
- init_completion(&req->req_comp);
kref_get(&req->refcount);
cookie = dmaengine_submit(tx);
@@ -831,13 +828,20 @@ rio_dma_transfer(struct file *filp, u32
if (!req)
return -ENOMEM;
- kref_init(&req->refcount);
-
ret = get_dma_channel(priv);
if (ret) {
kfree(req);
return ret;
}
+ chan = priv->dmach;
+
+ kref_init(&req->refcount);
+ init_completion(&req->req_comp);
+ req->dir = dir;
+ req->filp = filp;
+ req->priv = priv;
+ req->dmach = chan;
+ req->sync = sync;
/*
* If parameter loc_addr != NULL, we are transferring data from/to
@@ -925,11 +929,6 @@ rio_dma_transfer(struct file *filp, u32
xfer->offset, xfer->length);
}
- req->dir = dir;
- req->filp = filp;
- req->priv = priv;
- chan = priv->dmach;
-
nents = dma_map_sg(chan->device->dev,
req->sgt.sgl, req->sgt.nents, dir);
if (nents == 0) {
_
Patches currently in -mm which might be from ioan.nicu.ext(a)nokia.com are
The patch titled
Subject: writeback: safer lock nesting
has been removed from the -mm tree. Its filename was
writeback-safer-lock-nesting.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Greg Thelen <gthelen(a)google.com>
Subject: writeback: safer lock nesting
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
the page's memcg is undergoing move accounting, which occurs when a
process leaves its memcg for a new one that has
memory.move_charge_at_immigrate set.
unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if
the given inode is switching writeback domains. Switches occur when
enough writes are issued from a new domain.
This existing pattern is thus suspicious:
lock_page_memcg(page);
unlocked_inode_to_wb_begin(inode, &locked);
...
unlocked_inode_to_wb_end(inode, locked);
unlock_page_memcg(page);
If both inode switch and process memcg migration are both in-flight then
unlocked_inode_to_wb_end() will unconditionally enable interrupts while
still holding the lock_page_memcg() irq spinlock. This suggests the
possibility of deadlock if an interrupt occurs before unlock_page_memcg().
truncate
__cancel_dirty_page
lock_page_memcg
unlocked_inode_to_wb_begin
unlocked_inode_to_wb_end
<interrupts mistakenly enabled>
<interrupt>
end_page_writeback
test_clear_page_writeback
lock_page_memcg
<deadlock>
unlock_page_memcg
Due to configuration limitations this deadlock is not currently possible
because we don't mix cgroup writeback (a cgroupv2 feature) and
memory.move_charge_at_immigrate (a cgroupv1 feature).
If the kernel is hacked to always claim inode switching and memcg
moving_account, then this script triggers lockup in less than a minute:
cd /mnt/cgroup/memory
mkdir a b
echo 1 > a/memory.move_charge_at_immigrate
echo 1 > b/memory.move_charge_at_immigrate
(
echo $BASHPID > a/cgroup.procs
while true; do
dd if=/dev/zero of=/mnt/big bs=1M count=256
done
) &
while true; do
sync
done &
sleep 1h &
SLEEP=$!
while true; do
echo $SLEEP > a/cgroup.procs
echo $SLEEP > b/cgroup.procs
done
The deadlock does not seem possible, so it's debatable if there's any
reason to modify the kernel. I suggest we should to prevent future
surprises. And Wang Long said "this deadlock occurs three times in our
environment", so there's more reason to apply this, even to stable.
Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch
see "[PATCH for-4.4] writeback: safer lock nesting"
https://lkml.org/lkml/2018/4/11/146
Wang Long said "this deadlock occurs three times in our environment"
[gthelen(a)google.com: v4]
Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com
[akpm(a)linux-foundation.org: comment tweaks, struct initialization simplification]
Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613
Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com
Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates")
Signed-off-by: Greg Thelen <gthelen(a)google.com>
Reported-by: Wang Long <wanglong19(a)meituan.com>
Acked-by: Wang Long <wanglong19(a)meituan.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [v4.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/fs-writeback.c | 7 +++---
include/linux/backing-dev-defs.h | 5 ++++
include/linux/backing-dev.h | 30 +++++++++++++++--------------
mm/page-writeback.c | 18 ++++++++---------
4 files changed, 34 insertions(+), 26 deletions(-)
diff -puN fs/fs-writeback.c~writeback-safer-lock-nesting fs/fs-writeback.c
--- a/fs/fs-writeback.c~writeback-safer-lock-nesting
+++ a/fs/fs-writeback.c
@@ -745,11 +745,12 @@ int inode_congested(struct inode *inode,
*/
if (inode && inode_to_wb_is_valid(inode)) {
struct bdi_writeback *wb;
- bool locked, congested;
+ struct wb_lock_cookie lock_cookie = {};
+ bool congested;
- wb = unlocked_inode_to_wb_begin(inode, &locked);
+ wb = unlocked_inode_to_wb_begin(inode, &lock_cookie);
congested = wb_congested(wb, cong_bits);
- unlocked_inode_to_wb_end(inode, locked);
+ unlocked_inode_to_wb_end(inode, &lock_cookie);
return congested;
}
diff -puN include/linux/backing-dev-defs.h~writeback-safer-lock-nesting include/linux/backing-dev-defs.h
--- a/include/linux/backing-dev-defs.h~writeback-safer-lock-nesting
+++ a/include/linux/backing-dev-defs.h
@@ -223,6 +223,11 @@ static inline void set_bdi_congested(str
set_wb_congested(bdi->wb.congested, sync);
}
+struct wb_lock_cookie {
+ bool locked;
+ unsigned long flags;
+};
+
#ifdef CONFIG_CGROUP_WRITEBACK
/**
diff -puN include/linux/backing-dev.h~writeback-safer-lock-nesting include/linux/backing-dev.h
--- a/include/linux/backing-dev.h~writeback-safer-lock-nesting
+++ a/include/linux/backing-dev.h
@@ -347,7 +347,7 @@ static inline struct bdi_writeback *inod
/**
* unlocked_inode_to_wb_begin - begin unlocked inode wb access transaction
* @inode: target inode
- * @lockedp: temp bool output param, to be passed to the end function
+ * @cookie: output param, to be passed to the end function
*
* The caller wants to access the wb associated with @inode but isn't
* holding inode->i_lock, the i_pages lock or wb->list_lock. This
@@ -355,12 +355,12 @@ static inline struct bdi_writeback *inod
* association doesn't change until the transaction is finished with
* unlocked_inode_to_wb_end().
*
- * The caller must call unlocked_inode_to_wb_end() with *@lockdep
- * afterwards and can't sleep during transaction. IRQ may or may not be
- * disabled on return.
+ * The caller must call unlocked_inode_to_wb_end() with *@cookie afterwards and
+ * can't sleep during the transaction. IRQs may or may not be disabled on
+ * return.
*/
static inline struct bdi_writeback *
-unlocked_inode_to_wb_begin(struct inode *inode, bool *lockedp)
+unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie)
{
rcu_read_lock();
@@ -368,10 +368,10 @@ unlocked_inode_to_wb_begin(struct inode
* Paired with store_release in inode_switch_wb_work_fn() and
* ensures that we see the new wb if we see cleared I_WB_SWITCH.
*/
- *lockedp = smp_load_acquire(&inode->i_state) & I_WB_SWITCH;
+ cookie->locked = smp_load_acquire(&inode->i_state) & I_WB_SWITCH;
- if (unlikely(*lockedp))
- xa_lock_irq(&inode->i_mapping->i_pages);
+ if (unlikely(cookie->locked))
+ xa_lock_irqsave(&inode->i_mapping->i_pages, cookie->flags);
/*
* Protected by either !I_WB_SWITCH + rcu_read_lock() or the i_pages
@@ -383,12 +383,13 @@ unlocked_inode_to_wb_begin(struct inode
/**
* unlocked_inode_to_wb_end - end inode wb access transaction
* @inode: target inode
- * @locked: *@lockedp from unlocked_inode_to_wb_begin()
+ * @cookie: @cookie from unlocked_inode_to_wb_begin()
*/
-static inline void unlocked_inode_to_wb_end(struct inode *inode, bool locked)
+static inline void unlocked_inode_to_wb_end(struct inode *inode,
+ struct wb_lock_cookie *cookie)
{
- if (unlikely(locked))
- xa_unlock_irq(&inode->i_mapping->i_pages);
+ if (unlikely(cookie->locked))
+ xa_unlock_irqrestore(&inode->i_mapping->i_pages, cookie->flags);
rcu_read_unlock();
}
@@ -435,12 +436,13 @@ static inline struct bdi_writeback *inod
}
static inline struct bdi_writeback *
-unlocked_inode_to_wb_begin(struct inode *inode, bool *lockedp)
+unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie)
{
return inode_to_wb(inode);
}
-static inline void unlocked_inode_to_wb_end(struct inode *inode, bool locked)
+static inline void unlocked_inode_to_wb_end(struct inode *inode,
+ struct wb_lock_cookie *cookie)
{
}
diff -puN mm/page-writeback.c~writeback-safer-lock-nesting mm/page-writeback.c
--- a/mm/page-writeback.c~writeback-safer-lock-nesting
+++ a/mm/page-writeback.c
@@ -2502,13 +2502,13 @@ void account_page_redirty(struct page *p
if (mapping && mapping_cap_account_dirty(mapping)) {
struct inode *inode = mapping->host;
struct bdi_writeback *wb;
- bool locked;
+ struct wb_lock_cookie cookie = {};
- wb = unlocked_inode_to_wb_begin(inode, &locked);
+ wb = unlocked_inode_to_wb_begin(inode, &cookie);
current->nr_dirtied--;
dec_node_page_state(page, NR_DIRTIED);
dec_wb_stat(wb, WB_DIRTIED);
- unlocked_inode_to_wb_end(inode, locked);
+ unlocked_inode_to_wb_end(inode, &cookie);
}
}
EXPORT_SYMBOL(account_page_redirty);
@@ -2614,15 +2614,15 @@ void __cancel_dirty_page(struct page *pa
if (mapping_cap_account_dirty(mapping)) {
struct inode *inode = mapping->host;
struct bdi_writeback *wb;
- bool locked;
+ struct wb_lock_cookie cookie = {};
lock_page_memcg(page);
- wb = unlocked_inode_to_wb_begin(inode, &locked);
+ wb = unlocked_inode_to_wb_begin(inode, &cookie);
if (TestClearPageDirty(page))
account_page_cleaned(page, mapping, wb);
- unlocked_inode_to_wb_end(inode, locked);
+ unlocked_inode_to_wb_end(inode, &cookie);
unlock_page_memcg(page);
} else {
ClearPageDirty(page);
@@ -2654,7 +2654,7 @@ int clear_page_dirty_for_io(struct page
if (mapping && mapping_cap_account_dirty(mapping)) {
struct inode *inode = mapping->host;
struct bdi_writeback *wb;
- bool locked;
+ struct wb_lock_cookie cookie = {};
/*
* Yes, Virginia, this is indeed insane.
@@ -2691,14 +2691,14 @@ int clear_page_dirty_for_io(struct page
* always locked coming in here, so we get the desired
* exclusion.
*/
- wb = unlocked_inode_to_wb_begin(inode, &locked);
+ wb = unlocked_inode_to_wb_begin(inode, &cookie);
if (TestClearPageDirty(page)) {
dec_lruvec_page_state(page, NR_FILE_DIRTY);
dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
dec_wb_stat(wb, WB_RECLAIMABLE);
ret = 1;
}
- unlocked_inode_to_wb_end(inode, locked);
+ unlocked_inode_to_wb_end(inode, &cookie);
return ret;
}
return TestClearPageDirty(page);
_
Patches currently in -mm which might be from gthelen(a)google.com are
From: Joerg Roedel <jroedel(a)suse.de>
This reverts commit 28ee90fe6048fa7b7ceaeb8831c0e4e454a4cf89.
This commit is broken for x86, as it unmaps the PTE and PMD
pages and immediatly frees them without doing a TLB flush.
Further this lacks synchronization with other page-tables in
the system when the PMD pages are not shared between
mm_structs.
On x86-32 with PAE and PTI patches on-top this patch
triggers the BUG_ON in vmalloc_sync_one() because the kernel
and the process page-table were not synchronized.
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
arch/x86/mm/pgtable.c | 28 ++--------------------------
1 file changed, 2 insertions(+), 26 deletions(-)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index ae98d4c5e32a..fd02a537a80f 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -787,22 +787,7 @@ int pmd_clear_huge(pmd_t *pmd)
*/
int pud_free_pmd_page(pud_t *pud)
{
- pmd_t *pmd;
- int i;
-
- if (pud_none(*pud))
- return 1;
-
- pmd = (pmd_t *)pud_page_vaddr(*pud);
-
- for (i = 0; i < PTRS_PER_PMD; i++)
- if (!pmd_free_pte_page(&pmd[i]))
- return 0;
-
- pud_clear(pud);
- free_page((unsigned long)pmd);
-
- return 1;
+ return pud_none(*pud);
}
/**
@@ -814,15 +799,6 @@ int pud_free_pmd_page(pud_t *pud)
*/
int pmd_free_pte_page(pmd_t *pmd)
{
- pte_t *pte;
-
- if (pmd_none(*pmd))
- return 1;
-
- pte = (pte_t *)pmd_page_vaddr(*pmd);
- pmd_clear(pmd);
- free_page((unsigned long)pte);
-
- return 1;
+ return pmd_none(*pmd);
}
#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
--
2.13.6
From: Dave Hansen <dave.hansen(a)linux.intel.com>
I got a bug report that the following code (roughly) was
causing a SIGSEGV:
mprotect(ptr, size, PROT_EXEC);
mprotect(ptr, size, PROT_NONE);
mprotect(ptr, size, PROT_READ);
*ptr = 100;
The problem is hit when the mprotect(PROT_EXEC)
is implicitly assigned a protection key to the VMA, and made
that key ACCESS_DENY|WRITE_DENY. The PROT_NONE mprotect()
failed to remove the protection key, and the PROT_NONE->
PROT_READ left the PTE usable, but the pkey still in place
and left the memory inaccessible.
To fix this, we ensure that we always "override" the pkee
at mprotect() if the VMA does not have execute-only
permissions, but the VMA has the execute-only pkey.
We had a check for PROT_READ/WRITE, but it did not work
for PROT_NONE. This entirely removes the PROT_* checks,
which ensures that PROT_NONE now works.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Fixes: 62b5f7d013f ("mm/core, x86/mm/pkeys: Add execute-only protection keys support")
Reported-by: Shakeel Butt <shakeelb(a)google.com>
Cc: stable(a)vger.kernel.org
Cc: Ram Pai <linuxram(a)us.ibm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Michael Ellermen <mpe(a)ellerman.id.au>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Shuah Khan <shuah(a)kernel.org>
---
b/arch/x86/include/asm/pkeys.h | 12 +++++++++++-
b/arch/x86/mm/pkeys.c | 21 +++++++++++----------
2 files changed, 22 insertions(+), 11 deletions(-)
diff -puN arch/x86/include/asm/pkeys.h~pkeys-abandon-exec-only-pkey-more-aggressively arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~pkeys-abandon-exec-only-pkey-more-aggressively 2018-04-26 10:42:18.971487371 -0700
+++ b/arch/x86/include/asm/pkeys.h 2018-04-26 10:42:18.977487371 -0700
@@ -2,6 +2,8 @@
#ifndef _ASM_X86_PKEYS_H
#define _ASM_X86_PKEYS_H
+#define ARCH_DEFAULT_PKEY 0
+
#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
@@ -15,7 +17,7 @@ extern int __execute_only_pkey(struct mm
static inline int execute_only_pkey(struct mm_struct *mm)
{
if (!boot_cpu_has(X86_FEATURE_OSPKE))
- return 0;
+ return ARCH_DEFAULT_PKEY;
return __execute_only_pkey(mm);
}
@@ -56,6 +58,14 @@ bool mm_pkey_is_allocated(struct mm_stru
return false;
if (pkey >= arch_max_pkey())
return false;
+ /*
+ * The exec-only pkey is set in the allocation map, but
+ * is not available to any of the user interfaces like
+ * mprotect_pkey().
+ */
+ if (pkey == mm->context.execute_only_pkey)
+ return false;
+
return mm_pkey_allocation_map(mm) & (1U << pkey);
}
diff -puN arch/x86/mm/pkeys.c~pkeys-abandon-exec-only-pkey-more-aggressively arch/x86/mm/pkeys.c
--- a/arch/x86/mm/pkeys.c~pkeys-abandon-exec-only-pkey-more-aggressively 2018-04-26 10:42:18.973487371 -0700
+++ b/arch/x86/mm/pkeys.c 2018-04-26 10:47:34.806486584 -0700
@@ -94,26 +94,27 @@ int __arch_override_mprotect_pkey(struct
*/
if (pkey != -1)
return pkey;
- /*
- * Look for a protection-key-drive execute-only mapping
- * which is now being given permissions that are not
- * execute-only. Move it back to the default pkey.
- */
- if (vma_is_pkey_exec_only(vma) &&
- (prot & (PROT_READ|PROT_WRITE))) {
- return 0;
- }
+
/*
* The mapping is execute-only. Go try to get the
* execute-only protection key. If we fail to do that,
* fall through as if we do not have execute-only
- * support.
+ * support in this mm.
*/
if (prot == PROT_EXEC) {
pkey = execute_only_pkey(vma->vm_mm);
if (pkey > 0)
return pkey;
+ } else if (vma_is_pkey_exec_only(vma)) {
+ /*
+ * Protections are *not* PROT_EXEC, but the mapping
+ * is using the exec-only pkey. This mapping was
+ * PROT_EXEC and will no longer be. Move back to
+ * the default pkey.
+ */
+ return ARCH_DEFAULT_PKEY;
}
+
/*
* This is a vanilla, non-pkey mprotect (or we failed to
* setup execute-only), inherit the pkey from the VMA we
_
From: Dave Hansen <dave.hansen(a)linux.intel.com>
mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
inconsistent with the manpages, and also inconsistent with
mm->context.pkey_allocation_map. Stop special casing it and only
disallow values that are actually bad (< 0).
The end-user visible effect of this is that you can now use
mprotect_pkey() to set pkey=0.
This is a bit nicer than what Ram proposed because it is simpler
and removes special-casing for pkey 0. On the other hand, it does
allow applciations to pkey_free() pkey-0, but that's just a silly
thing to do, so we are not going to protect against it.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Fixes: 58ab9a088dda ("x86/pkeys: Check against max pkey to avoid overflows")
Cc: stable(a)kernel.org
Cc: Ram Pai <linuxram(a)us.ibm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Michael Ellermen <mpe(a)ellerman.id.au>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>p
Cc: Shuah Khan <shuah(a)kernel.org>
---
b/arch/x86/include/asm/mmu_context.h | 2 +-
b/arch/x86/include/asm/pkeys.h | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff -puN arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated 2018-03-26 10:22:33.742170197 -0700
+++ b/arch/x86/include/asm/mmu_context.h 2018-03-26 10:22:33.747170197 -0700
@@ -192,7 +192,7 @@ static inline int init_new_context(struc
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
- /* pkey 0 is the default and always allocated */
+ /* pkey 0 is the default and allocated implicitly */
mm->context.pkey_allocation_map = 0x1;
/* -1 means unallocated or invalid */
mm->context.execute_only_pkey = -1;
diff -puN arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated 2018-03-26 10:22:33.744170197 -0700
+++ b/arch/x86/include/asm/pkeys.h 2018-03-26 10:22:33.747170197 -0700
@@ -49,10 +49,10 @@ bool mm_pkey_is_allocated(struct mm_stru
{
/*
* "Allocated" pkeys are those that have been returned
- * from pkey_alloc(). pkey 0 is special, and never
- * returned from pkey_alloc().
+ * from pkey_alloc() or pkey 0 which is allocated
+ * implicitly when the mm is created.
*/
- if (pkey <= 0)
+ if (pkey < 0)
return false;
if (pkey >= arch_max_pkey())
return false;
_
Greetings,
this series is the backport of 18 upstream patches to add the
current s390 spectre mitigation to kernel version 4.14. One
less patch than 4.14 and 4.9 as there is no nested KVM guest
support (aka vsie) in 4.4.
It follows the x86 approach with array_index_nospec for the v1
spectre attack and retpoline/expoline for v2. As a fallback
there is the ppa-12/ppa-13 based defense which requires an
micro-code update.
Christian Borntraeger (2):
KVM: s390: wire up bpb feature
s390/entry.S: fix spurious zeroing of r0
Eugeniu Rosca (1):
s390: Replace IS_ENABLED(EXPOLINE_*) with
IS_ENABLED(CONFIG_EXPOLINE_*)
Heiko Carstens (1):
s390: enable CPU alternatives unconditionally
Martin Schwidefsky (13):
s390: scrub registers on kernel entry and KVM exit
s390: add optimized array_index_mask_nospec
s390/alternative: use a copy of the facility bit mask
s390: add options to change branch prediction behaviour for the kernel
s390: run user space and KVM guests with modified branch prediction
s390: introduce execute-trampolines for branches
s390: do not bypass BPENTER for interrupt system calls
s390: move nobp parameter functions to nospec-branch.c
s390: add automatic detection of the spectre defense
s390: report spectre mitigation via syslog
s390: add sysfs attributes for spectre
s390: correct nospec auto detection init order
s390: correct module section names for expoline code revert
Vasily Gorbik (1):
s390: introduce CPU alternatives
Documentation/kernel-parameters.txt | 3 +
arch/s390/Kconfig | 47 +++++++
arch/s390/Makefile | 10 ++
arch/s390/include/asm/alternative.h | 149 ++++++++++++++++++++
arch/s390/include/asm/barrier.h | 24 ++++
arch/s390/include/asm/facility.h | 18 +++
arch/s390/include/asm/kvm_host.h | 3 +-
arch/s390/include/asm/lowcore.h | 7 +-
arch/s390/include/asm/nospec-branch.h | 17 +++
arch/s390/include/asm/processor.h | 4 +
arch/s390/include/asm/thread_info.h | 4 +
arch/s390/include/uapi/asm/kvm.h | 3 +
arch/s390/kernel/Makefile | 5 +-
arch/s390/kernel/alternative.c | 112 +++++++++++++++
arch/s390/kernel/early.c | 5 +
arch/s390/kernel/entry.S | 250 ++++++++++++++++++++++++++++++----
arch/s390/kernel/ipl.c | 1 +
arch/s390/kernel/module.c | 65 ++++++++-
arch/s390/kernel/nospec-branch.c | 169 +++++++++++++++++++++++
arch/s390/kernel/processor.c | 18 +++
arch/s390/kernel/setup.c | 14 +-
arch/s390/kernel/smp.c | 7 +-
arch/s390/kernel/vmlinux.lds.S | 37 +++++
arch/s390/kvm/kvm-s390.c | 13 +-
drivers/s390/char/Makefile | 2 +
include/uapi/linux/kvm.h | 1 +
26 files changed, 952 insertions(+), 36 deletions(-)
create mode 100644 arch/s390/include/asm/alternative.h
create mode 100644 arch/s390/include/asm/nospec-branch.h
create mode 100644 arch/s390/kernel/alternative.c
create mode 100644 arch/s390/kernel/nospec-branch.c
--
2.13.5