 
            This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.4.1-rc1
Michael Ellerman mpe@ellerman.id.au KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel
Michael Ellerman mpe@ellerman.id.au powerpc/book3s64: Fix link stack flush on context switch
Bernd Porr mail@berndporr.me.uk staging: comedi: usbduxfast: usbduxfast_ai_cmdtest rounding error
Aleksander Morgado aleksander@aleksander.es USB: serial: option: add support for Foxconn T77W968 LTE modules
Aleksander Morgado aleksander@aleksander.es USB: serial: option: add support for DW5821e with eSIM support
Johan Hovold johan@kernel.org USB: serial: mos7840: fix remote wakeup
Johan Hovold johan@kernel.org USB: serial: mos7720: fix remote wakeup
Pavel Löbl pavel@loebl.cz USB: serial: mos7840: add USB ID to support Moxa UPort 2210
Oliver Neukum oneukum@suse.com appledisplay: fix error handling in the scheduled work
Oliver Neukum oneukum@suse.com USB: chaoskey: fix error case of a timeout
Greg Kroah-Hartman gregkh@linuxfoundation.org usb-serial: cp201x: support Mark-10 digital force gauge
Suwan Kim suwan.kim027@gmail.com usbip: Fix uninitialized symbol 'nents' in stub_recv_cmd_submit()
Hewenliang hewenliang4@huawei.com usbip: tools: fix fd leakage in the function of read_attr_usbip_status
Oliver Neukum oneukum@suse.com USBIP: add config dependency for SGL_ALLOC
Takashi Iwai tiwai@suse.de ALSA: hda - Disable audio component for legacy Nvidia HDMI codecs
A Sun as1033x@comcast.net media: mceusb: fix out of bounds read in MCE receiver buffer
Sean Young sean@mess.org media: imon: invalid dereference in imon_touch_event
Vito Caputo vcaputo@pengaru.com media: cxusb: detect cxusb_ctrl_msg error in query
Oliver Neukum oneukum@suse.com media: b2c2-flexcop-usb: add sanity checking
Laurent Pinchart laurent.pinchart@ideasonboard.com media: uvcvideo: Fix error path in control parsing failure
Thomas Gleixner tglx@linutronix.de futex: Prevent exit livelock
Thomas Gleixner tglx@linutronix.de futex: Provide distinct return value when owner is exiting
Thomas Gleixner tglx@linutronix.de futex: Add mutex around futex exit
Thomas Gleixner tglx@linutronix.de futex: Provide state handling for exec() as well
Thomas Gleixner tglx@linutronix.de futex: Sanitize exit state handling
Thomas Gleixner tglx@linutronix.de futex: Mark the begin of futex exit explicitly
Thomas Gleixner tglx@linutronix.de futex: Set task::futex_state to DEAD right after handling futex exit
Thomas Gleixner tglx@linutronix.de futex: Split futex_mm_release() for exit/exec
Thomas Gleixner tglx@linutronix.de exit/exec: Seperate mm_release()
Thomas Gleixner tglx@linutronix.de futex: Replace PF_EXITPIDONE with a state
Thomas Gleixner tglx@linutronix.de futex: Move futex exit handling into futex code
Kai Shen shenkai8@huawei.com cpufreq: Add NULL checks to show() and store() methods of cpufreq
Alan Stern stern@rowland.harvard.edu media: usbvision: Fix races among open, close, and disconnect
Alan Stern stern@rowland.harvard.edu media: usbvision: Fix invalid accesses after device disconnect
Alexander Popov alex.popov@linux.com media: vivid: Fix wrong locking that causes race conditions on streaming stop
Vandana BN bnvandana@gmail.com media: vivid: Set vid_cap_streaming and vid_out_streaming to true
Geoffrey D. Bennett g@b4.vu ALSA: usb-audio: Fix Scarlett 6i6 Gen 2 port data
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Fix NULL dereference at parsing BADD
Yang Tao yang.tao172@zte.com.cn futex: Prevent robust futex exit race
Andy Lutomirski luto@kernel.org x86/entry/32: Fix FIXUP_ESPFIX_STACK with user CR3
Ingo Molnar mingo@kernel.org x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise
Andy Lutomirski luto@kernel.org selftests/x86/sigreturn/32: Invalidate DS and ES when abusing the kernel
Andy Lutomirski luto@kernel.org selftests/x86/mov_ss_trap: Fix the SYSENTER test
Peter Zijlstra peterz@infradead.org x86/entry/32: Fix NMI vs ESPFIX
Andy Lutomirski luto@kernel.org x86/entry/32: Unwind the ESPFIX stack earlier on exception entry
Andy Lutomirski luto@kernel.org x86/entry/32: Move FIXUP_FRAME after pushing %fs in SAVE_ALL
Andy Lutomirski luto@kernel.org x86/entry/32: Use %ss segment where required
Peter Zijlstra peterz@infradead.org x86/entry/32: Fix IRET exception
Thomas Gleixner tglx@linutronix.de x86/cpu_entry_area: Add guard page for entry stack on 32bit
Thomas Gleixner tglx@linutronix.de x86/pti/32: Size initial_page_table correctly
Andy Lutomirski luto@kernel.org x86/doublefault/32: Fix stack canaries in the double fault handler
Jan Beulich jbeulich@suse.com x86/xen/32: Simplify ring check in xen_iret_crit_fixup()
Jan Beulich jbeulich@suse.com x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout
Jan Beulich jbeulich@suse.com x86/stackframe/32: Repair 32-bit Xen PV
Navid Emamdoost navid.emamdoost@gmail.com nbd: prevent memory leak
Waiman Long longman@redhat.com x86/speculation: Fix redundant MDS mitigation message
Waiman Long longman@redhat.com x86/speculation: Fix incorrect MDS/TAA mitigation status
Alexander Kapshuk alexander.kapshuk@gmail.com x86/insn: Fix awk regexp warnings
John Pittman jpittman@redhat.com md/raid10: prevent access of uninitialized resync_pages offset
Mike Snitzer snitzer@redhat.com Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues"
Adam Ford aford173@gmail.com Revert "Bluetooth: hci_ll: set operational frequency earlier"
Christian Lamparter chunkeey@gmail.com ath10k: restore QCA9880-AR1A (v1) detection
Bjorn Andersson bjorn.andersson@linaro.org ath10k: Fix HOST capability QMI incompatibility
Hui Peng benquike@gmail.com ath10k: Fix a NULL-ptr-deref bug in ath10k_usb_alloc_urb_from_pipe
Denis Efremov efremov@linux.com ath9k_hw: fix uninitialized variable data
Tomas Bortoli tomasbortoli@gmail.com Bluetooth: Fix invalid-free in bcsp_close()
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/mds.rst | 7 +- .../admin-guide/hw-vuln/tsx_async_abort.rst | 5 +- Documentation/admin-guide/kernel-parameters.txt | 11 + .../bindings/net/wireless/qcom,ath10k.txt | 6 + Makefile | 4 +- arch/powerpc/include/asm/asm-prototypes.h | 3 + arch/powerpc/include/asm/security_features.h | 3 + arch/powerpc/kernel/entry_64.S | 6 + arch/powerpc/kernel/security.c | 57 +++- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 30 ++ arch/x86/entry/entry_32.S | 211 +++++++++----- arch/x86/include/asm/cpu_entry_area.h | 18 +- arch/x86/include/asm/pgtable_32_types.h | 8 +- arch/x86/include/asm/segment.h | 12 + arch/x86/kernel/cpu/bugs.c | 30 +- arch/x86/kernel/doublefault.c | 3 + arch/x86/kernel/head_32.S | 10 + arch/x86/mm/cpu_entry_area.c | 4 +- arch/x86/tools/gen-insn-attr-x86.awk | 4 +- arch/x86/xen/xen-asm_32.S | 75 ++--- drivers/block/nbd.c | 5 +- drivers/bluetooth/hci_bcsp.c | 3 + drivers/bluetooth/hci_ll.c | 39 ++- drivers/cpufreq/cpufreq.c | 6 + drivers/md/dm-crypt.c | 9 +- drivers/md/raid10.c | 2 +- drivers/media/platform/vivid/vivid-kthread-cap.c | 8 +- drivers/media/platform/vivid/vivid-kthread-out.c | 8 +- drivers/media/platform/vivid/vivid-sdr-cap.c | 8 +- drivers/media/platform/vivid/vivid-vid-cap.c | 3 - drivers/media/platform/vivid/vivid-vid-out.c | 3 - drivers/media/rc/imon.c | 3 +- drivers/media/rc/mceusb.c | 141 ++++++--- drivers/media/usb/b2c2/flexcop-usb.c | 3 + drivers/media/usb/dvb-usb/cxusb.c | 3 +- drivers/media/usb/usbvision/usbvision-video.c | 29 +- drivers/media/usb/uvc/uvc_driver.c | 28 +- drivers/net/wireless/ath/ath10k/pci.c | 36 ++- drivers/net/wireless/ath/ath10k/qmi.c | 13 +- drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c | 22 ++ drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h | 1 + drivers/net/wireless/ath/ath10k/snoc.c | 11 + drivers/net/wireless/ath/ath10k/snoc.h | 1 + drivers/net/wireless/ath/ath10k/usb.c | 8 + drivers/net/wireless/ath/ath9k/ar9003_eeprom.c | 2 +- drivers/staging/comedi/drivers/usbduxfast.c | 21 +- drivers/usb/misc/appledisplay.c | 8 +- drivers/usb/misc/chaoskey.c | 24 +- drivers/usb/serial/cp210x.c | 1 + drivers/usb/serial/mos7720.c | 4 - drivers/usb/serial/mos7840.c | 16 +- drivers/usb/serial/option.c | 7 + drivers/usb/usbip/Kconfig | 1 + drivers/usb/usbip/stub_rx.c | 50 ++-- fs/exec.c | 2 +- include/linux/compat.h | 2 - include/linux/futex.h | 40 ++- include/linux/sched.h | 3 +- include/linux/sched/mm.h | 6 +- kernel/exit.c | 30 +- kernel/fork.c | 40 +-- kernel/futex.c | 324 ++++++++++++++++++--- sound/pci/hda/patch_hdmi.c | 22 -- sound/usb/mixer.c | 3 + sound/usb/mixer_scarlett_gen2.c | 36 +-- tools/arch/x86/tools/gen-insn-attr-x86.awk | 4 +- tools/testing/selftests/x86/mov_ss_trap.c | 3 +- tools/testing/selftests/x86/sigreturn.c | 13 + tools/usb/usbip/libsrc/usbip_host_common.c | 2 +- 69 files changed, 1091 insertions(+), 473 deletions(-)
 
            From: Tomas Bortoli tomasbortoli@gmail.com
commit cf94da6f502d8caecabd56b194541c873c8a7a3c upstream.
Syzbot reported an invalid-free that I introduced fixing a memleak.
bcsp_recv() also frees bcsp->rx_skb but never nullifies its value. Nullify bcsp->rx_skb every time it is freed.
Signed-off-by: Tomas Bortoli tomasbortoli@gmail.com Reported-by: syzbot+a0d209a4676664613e76@syzkaller.appspotmail.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Alexander Potapenko glider@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/bluetooth/hci_bcsp.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/bluetooth/hci_bcsp.c +++ b/drivers/bluetooth/hci_bcsp.c @@ -591,6 +591,7 @@ static int bcsp_recv(struct hci_uart *hu if (*ptr == 0xc0) { BT_ERR("Short BCSP packet"); kfree_skb(bcsp->rx_skb); + bcsp->rx_skb = NULL; bcsp->rx_state = BCSP_W4_PKT_START; bcsp->rx_count = 0; } else @@ -606,6 +607,7 @@ static int bcsp_recv(struct hci_uart *hu bcsp->rx_skb->data[2])) != bcsp->rx_skb->data[3]) { BT_ERR("Error in BCSP hdr checksum"); kfree_skb(bcsp->rx_skb); + bcsp->rx_skb = NULL; bcsp->rx_state = BCSP_W4_PKT_DELIMITER; bcsp->rx_count = 0; continue; @@ -630,6 +632,7 @@ static int bcsp_recv(struct hci_uart *hu bscp_get_crc(bcsp));
kfree_skb(bcsp->rx_skb); + bcsp->rx_skb = NULL; bcsp->rx_state = BCSP_W4_PKT_DELIMITER; bcsp->rx_count = 0; continue;
 
            From: Denis Efremov efremov@linux.com
commit 80e84f36412e0c5172447b6947068dca0d04ee82 upstream.
Currently, data variable in ar9003_hw_thermo_cal_apply() could be uninitialized if ar9300_otp_read_word() will fail to read the value. Initialize data variable with 0 to prevent an undefined behavior. This will be enough to handle error case when ar9300_otp_read_word() fails.
Fixes: 80fe43f2bbd5 ("ath9k_hw: Read and configure thermocal for AR9462") Cc: Rajkumar Manoharan rmanohar@qca.qualcomm.com Cc: John W. Linville linville@tuxdriver.com Cc: Kalle Valo kvalo@codeaurora.org Cc: "David S. Miller" davem@davemloft.net Cc: stable@vger.kernel.org Signed-off-by: Denis Efremov efremov@linux.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ath/ath9k/ar9003_eeprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath9k/ar9003_eeprom.c +++ b/drivers/net/wireless/ath/ath9k/ar9003_eeprom.c @@ -4183,7 +4183,7 @@ static void ar9003_hw_thermometer_apply(
static void ar9003_hw_thermo_cal_apply(struct ath_hw *ah) { - u32 data, ko, kg; + u32 data = 0, ko, kg;
if (!AR_SREV_9462_20_OR_LATER(ah)) return;
 
            From: Hui Peng benquike@gmail.com
commit bfd6e6e6c5d2ee43a3d9902b36e01fc7527ebb27 upstream.
The `ar_usb` field of `ath10k_usb_pipe_usb_pipe` objects are initialized to point to the containing `ath10k_usb` object according to endpoint descriptors read from the device side, as shown below in `ath10k_usb_setup_pipe_resources`:
for (i = 0; i < iface_desc->desc.bNumEndpoints; ++i) { endpoint = &iface_desc->endpoint[i].desc;
// get the address from endpoint descriptor pipe_num = ath10k_usb_get_logical_pipe_num(ar_usb, endpoint->bEndpointAddress, &urbcount); ...... // select the pipe object pipe = &ar_usb->pipes[pipe_num];
// initialize the ar_usb field pipe->ar_usb = ar_usb; }
The driver assumes that the addresses reported in endpoint descriptors from device side to be complete. If a device is malicious and does not report complete addresses, it may trigger NULL-ptr-deref `ath10k_usb_alloc_urb_from_pipe` and `ath10k_usb_free_urb_to_pipe`.
This patch fixes the bug by preventing potential NULL-ptr-deref.
Signed-off-by: Hui Peng benquike@gmail.com Reported-by: Hui Peng benquike@gmail.com Reported-by: Mathias Payer mathias.payer@nebelwelt.net Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [groeck: Add driver tag to subject, fix build warning] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ath/ath10k/usb.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/net/wireless/ath/ath10k/usb.c +++ b/drivers/net/wireless/ath/ath10k/usb.c @@ -38,6 +38,10 @@ ath10k_usb_alloc_urb_from_pipe(struct at struct ath10k_urb_context *urb_context = NULL; unsigned long flags;
+ /* bail if this pipe is not initialized */ + if (!pipe->ar_usb) + return NULL; + spin_lock_irqsave(&pipe->ar_usb->cs_lock, flags); if (!list_empty(&pipe->urb_list_head)) { urb_context = list_first_entry(&pipe->urb_list_head, @@ -55,6 +59,10 @@ static void ath10k_usb_free_urb_to_pipe( { unsigned long flags;
+ /* bail if this pipe is not initialized */ + if (!pipe->ar_usb) + return; + spin_lock_irqsave(&pipe->ar_usb->cs_lock, flags);
pipe->urb_cnt++;
 
            From: Bjorn Andersson bjorn.andersson@linaro.org
commit 7165ef890a4c44cf16db66b82fd78448f4bde6ba upstream.
The introduction of 768ec4c012ac ("ath10k: update HOST capability QMI message") served the purpose of supporting the new and extended HOST capability QMI message.
But while the new message adds a slew of optional members it changes the data type of the "daemon_support" member, which means that older versions of the firmware will fail to decode the incoming request message.
There is no way to detect this breakage from Linux and there's no way to recover from sending the wrong message (i.e. we can't just try one format and then fallback to the other), so a quirk is introduced in DeviceTree to indicate to the driver that the firmware requires the 8bit version of this message.
Cc: stable@vger.kernel.org Fixes: 768ec4c012ac ("ath10k: update HOST capability qmi message") Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Acked-by: Rob Herring robh@kernel.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt | 6 ++ drivers/net/wireless/ath/ath10k/qmi.c | 13 ++++- drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c | 22 ++++++++++ drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h | 1 drivers/net/wireless/ath/ath10k/snoc.c | 11 +++++ drivers/net/wireless/ath/ath10k/snoc.h | 1 6 files changed, 51 insertions(+), 3 deletions(-)
--- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt @@ -81,6 +81,12 @@ Optional properties: Definition: Name of external front end module used. Some valid FEM names for example: "microsemi-lx5586", "sky85703-11" and "sky85803" etc. +- qcom,snoc-host-cap-8bit-quirk: + Usage: Optional + Value type: <empty> + Definition: Quirk specifying that the firmware expects the 8bit version + of the host capability QMI request +
Example (to supply PCI based wifi block details):
--- a/drivers/net/wireless/ath/ath10k/qmi.c +++ b/drivers/net/wireless/ath/ath10k/qmi.c @@ -581,22 +581,29 @@ static int ath10k_qmi_host_cap_send_sync { struct wlfw_host_cap_resp_msg_v01 resp = {}; struct wlfw_host_cap_req_msg_v01 req = {}; + struct qmi_elem_info *req_ei; struct ath10k *ar = qmi->ar; + struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); struct qmi_txn txn; int ret;
req.daemon_support_valid = 1; req.daemon_support = 0;
- ret = qmi_txn_init(&qmi->qmi_hdl, &txn, - wlfw_host_cap_resp_msg_v01_ei, &resp); + ret = qmi_txn_init(&qmi->qmi_hdl, &txn, wlfw_host_cap_resp_msg_v01_ei, + &resp); if (ret < 0) goto out;
+ if (test_bit(ATH10K_SNOC_FLAG_8BIT_HOST_CAP_QUIRK, &ar_snoc->flags)) + req_ei = wlfw_host_cap_8bit_req_msg_v01_ei; + else + req_ei = wlfw_host_cap_req_msg_v01_ei; + ret = qmi_send_request(&qmi->qmi_hdl, NULL, &txn, QMI_WLFW_HOST_CAP_REQ_V01, WLFW_HOST_CAP_REQ_MSG_V01_MAX_MSG_LEN, - wlfw_host_cap_req_msg_v01_ei, &req); + req_ei, &req); if (ret < 0) { qmi_txn_cancel(&txn); ath10k_err(ar, "failed to send host capability request: %d\n", ret); --- a/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c +++ b/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.c @@ -1988,6 +1988,28 @@ struct qmi_elem_info wlfw_host_cap_req_m {} };
+struct qmi_elem_info wlfw_host_cap_8bit_req_msg_v01_ei[] = { + { + .data_type = QMI_OPT_FLAG, + .elem_len = 1, + .elem_size = sizeof(u8), + .array_type = NO_ARRAY, + .tlv_type = 0x10, + .offset = offsetof(struct wlfw_host_cap_req_msg_v01, + daemon_support_valid), + }, + { + .data_type = QMI_UNSIGNED_1_BYTE, + .elem_len = 1, + .elem_size = sizeof(u8), + .array_type = NO_ARRAY, + .tlv_type = 0x10, + .offset = offsetof(struct wlfw_host_cap_req_msg_v01, + daemon_support), + }, + {} +}; + struct qmi_elem_info wlfw_host_cap_resp_msg_v01_ei[] = { { .data_type = QMI_STRUCT, --- a/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h +++ b/drivers/net/wireless/ath/ath10k/qmi_wlfw_v01.h @@ -575,6 +575,7 @@ struct wlfw_host_cap_req_msg_v01 {
#define WLFW_HOST_CAP_REQ_MSG_V01_MAX_MSG_LEN 189 extern struct qmi_elem_info wlfw_host_cap_req_msg_v01_ei[]; +extern struct qmi_elem_info wlfw_host_cap_8bit_req_msg_v01_ei[];
struct wlfw_host_cap_resp_msg_v01 { struct qmi_response_type_v01 resp; --- a/drivers/net/wireless/ath/ath10k/snoc.c +++ b/drivers/net/wireless/ath/ath10k/snoc.c @@ -1261,6 +1261,15 @@ out: return ret; }
+static void ath10k_snoc_quirks_init(struct ath10k *ar) +{ + struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); + struct device *dev = &ar_snoc->dev->dev; + + if (of_property_read_bool(dev->of_node, "qcom,snoc-host-cap-8bit-quirk")) + set_bit(ATH10K_SNOC_FLAG_8BIT_HOST_CAP_QUIRK, &ar_snoc->flags); +} + int ath10k_snoc_fw_indication(struct ath10k *ar, u64 type) { struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); @@ -1678,6 +1687,8 @@ static int ath10k_snoc_probe(struct plat ar->ce_priv = &ar_snoc->ce; msa_size = drv_data->msa_size;
+ ath10k_snoc_quirks_init(ar); + ret = ath10k_snoc_resource_init(ar); if (ret) { ath10k_warn(ar, "failed to initialize resource: %d\n", ret); --- a/drivers/net/wireless/ath/ath10k/snoc.h +++ b/drivers/net/wireless/ath/ath10k/snoc.h @@ -63,6 +63,7 @@ enum ath10k_snoc_flags { ATH10K_SNOC_FLAG_REGISTERED, ATH10K_SNOC_FLAG_UNREGISTERING, ATH10K_SNOC_FLAG_RECOVERY, + ATH10K_SNOC_FLAG_8BIT_HOST_CAP_QUIRK, };
struct ath10k_snoc {
 
            From: Christian Lamparter chunkeey@gmail.com
commit f8914a14623a79b73f72b2b1ee4cd9b2cb91b735 upstream.
This patch restores the old behavior that read the chip_id on the QCA988x before resetting the chip. This needs to be done in this order since the unsupported QCA988x AR1A chips fall off the bus when resetted. Otherwise the next MMIO Op after the reset causes a BUS ERROR and panic.
Cc: stable@vger.kernel.org Fixes: 1a7fecb766c8 ("ath10k: reset chip before reading chip_id in probe") Signed-off-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ath/ath10k/pci.c | 36 +++++++++++++++++++++++----------- 1 file changed, 25 insertions(+), 11 deletions(-)
--- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -3490,7 +3490,7 @@ static int ath10k_pci_probe(struct pci_d struct ath10k_pci *ar_pci; enum ath10k_hw_rev hw_rev; struct ath10k_bus_params bus_params = {}; - bool pci_ps; + bool pci_ps, is_qca988x = false; int (*pci_soft_reset)(struct ath10k *ar); int (*pci_hard_reset)(struct ath10k *ar); u32 (*targ_cpu_to_ce_addr)(struct ath10k *ar, u32 addr); @@ -3500,6 +3500,7 @@ static int ath10k_pci_probe(struct pci_d case QCA988X_2_0_DEVICE_ID: hw_rev = ATH10K_HW_QCA988X; pci_ps = false; + is_qca988x = true; pci_soft_reset = ath10k_pci_warm_reset; pci_hard_reset = ath10k_pci_qca988x_chip_reset; targ_cpu_to_ce_addr = ath10k_pci_qca988x_targ_cpu_to_ce_addr; @@ -3619,25 +3620,34 @@ static int ath10k_pci_probe(struct pci_d goto err_deinit_irq; }
+ bus_params.dev_type = ATH10K_DEV_TYPE_LL; + bus_params.link_can_suspend = true; + /* Read CHIP_ID before reset to catch QCA9880-AR1A v1 devices that + * fall off the bus during chip_reset. These chips have the same pci + * device id as the QCA9880 BR4A or 2R4E. So that's why the check. + */ + if (is_qca988x) { + bus_params.chip_id = + ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS); + if (bus_params.chip_id != 0xffffffff) { + if (!ath10k_pci_chip_is_supported(pdev->device, + bus_params.chip_id)) + goto err_unsupported; + } + } + ret = ath10k_pci_chip_reset(ar); if (ret) { ath10k_err(ar, "failed to reset chip: %d\n", ret); goto err_free_irq; }
- bus_params.dev_type = ATH10K_DEV_TYPE_LL; - bus_params.link_can_suspend = true; bus_params.chip_id = ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS); - if (bus_params.chip_id == 0xffffffff) { - ath10k_err(ar, "failed to get chip id\n"); - goto err_free_irq; - } + if (bus_params.chip_id == 0xffffffff) + goto err_unsupported;
- if (!ath10k_pci_chip_is_supported(pdev->device, bus_params.chip_id)) { - ath10k_err(ar, "device %04x with chip_id %08x isn't supported\n", - pdev->device, bus_params.chip_id); + if (!ath10k_pci_chip_is_supported(pdev->device, bus_params.chip_id)) goto err_free_irq; - }
ret = ath10k_core_register(ar, &bus_params); if (ret) { @@ -3647,6 +3657,10 @@ static int ath10k_pci_probe(struct pci_d
return 0;
+err_unsupported: + ath10k_err(ar, "device %04x with chip_id %08x isn't supported\n", + pdev->device, bus_params.chip_id); + err_free_irq: ath10k_pci_free_irq(ar); ath10k_pci_rx_retry_sync(ar);
 
            From: Adam Ford aford173@gmail.com
commit cef456cd354ef485f12d57000c455e83e416a2b6 upstream.
As nice as it would be to update firmware faster, that patch broke at least two different boards, an OMAP4+WL1285 based Motorola Droid 4, as reported by Sebasian Reichel and the Logic PD i.MX6Q + WL1837MOD.
This reverts commit a2e02f38eff84f199c8e32359eb213f81f270047.
Signed-off-by: Adam Ford aford173@gmail.com Acked-by: Sebastian Reichel sebastian.reichel@collabora.com Cc: stable@vger.kernel.org Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/bluetooth/hci_ll.c | 39 ++++++++++++++++++--------------------- 1 file changed, 18 insertions(+), 21 deletions(-)
--- a/drivers/bluetooth/hci_ll.c +++ b/drivers/bluetooth/hci_ll.c @@ -621,13 +621,6 @@ static int ll_setup(struct hci_uart *hu)
serdev_device_set_flow_control(serdev, true);
- if (hu->oper_speed) - speed = hu->oper_speed; - else if (hu->proto->oper_speed) - speed = hu->proto->oper_speed; - else - speed = 0; - do { /* Reset the Bluetooth device */ gpiod_set_value_cansleep(lldev->enable_gpio, 0); @@ -639,20 +632,6 @@ static int ll_setup(struct hci_uart *hu) return err; }
- if (speed) { - __le32 speed_le = cpu_to_le32(speed); - struct sk_buff *skb; - - skb = __hci_cmd_sync(hu->hdev, - HCI_VS_UPDATE_UART_HCI_BAUDRATE, - sizeof(speed_le), &speed_le, - HCI_INIT_TIMEOUT); - if (!IS_ERR(skb)) { - kfree_skb(skb); - serdev_device_set_baudrate(serdev, speed); - } - } - err = download_firmware(lldev); if (!err) break; @@ -677,7 +656,25 @@ static int ll_setup(struct hci_uart *hu) }
/* Operational speed if any */ + if (hu->oper_speed) + speed = hu->oper_speed; + else if (hu->proto->oper_speed) + speed = hu->proto->oper_speed; + else + speed = 0;
+ if (speed) { + __le32 speed_le = cpu_to_le32(speed); + struct sk_buff *skb; + + skb = __hci_cmd_sync(hu->hdev, HCI_VS_UPDATE_UART_HCI_BAUDRATE, + sizeof(speed_le), &speed_le, + HCI_INIT_TIMEOUT); + if (!IS_ERR(skb)) { + kfree_skb(skb); + serdev_device_set_baudrate(serdev, speed); + } + }
return 0; }
 
            From: Mike Snitzer snitzer@redhat.com
commit f612b2132db529feac4f965f28a1b9258ea7c22b upstream.
This reverts commit a1b89132dc4f61071bdeaab92ea958e0953380a1.
Revert required hand-patching due to subsequent changes that were applied since commit a1b89132dc4f61071bdeaab92ea958e0953380a1.
Requires: ed0302e83098d ("dm crypt: make workqueue names device-specific") Cc: stable@vger.kernel.org Bug: https://bugzilla.kernel.org/show_bug.cgi?id=199857 Reported-by: Vito Caputo vcaputo@pengaru.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-crypt.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -2700,21 +2700,18 @@ static int crypt_ctr(struct dm_target *t }
ret = -ENOMEM; - cc->io_queue = alloc_workqueue("kcryptd_io/%s", - WQ_HIGHPRI | WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, - 1, devname); + cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM, 1, devname); if (!cc->io_queue) { ti->error = "Couldn't create kcryptd io queue"; goto bad; }
if (test_bit(DM_CRYPT_SAME_CPU, &cc->flags)) - cc->crypt_queue = alloc_workqueue("kcryptd/%s", - WQ_HIGHPRI | WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, + cc->crypt_queue = alloc_workqueue("kcryptd/%s", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, 1, devname); else cc->crypt_queue = alloc_workqueue("kcryptd/%s", - WQ_HIGHPRI | WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND, + WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND, num_online_cpus(), devname); if (!cc->crypt_queue) { ti->error = "Couldn't create kcryptd queue";
 
            From: John Pittman jpittman@redhat.com
commit 45422b704db392a6d79d07ee3e3670b11048bd53 upstream.
Due to unneeded multiplication in the out_free_pages portion of r10buf_pool_alloc(), when using a 3-copy raid10 layout, it is possible to access a resync_pages offset that has not been initialized. This access translates into a crash of the system within resync_free_pages() while passing a bad pointer to put_page(). Remove the multiplication, preventing access to the uninitialized area.
Fixes: f0250618361db ("md: raid10: don't use bio's vec table to manage resync pages") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: John Pittman jpittman@redhat.com Suggested-by: David Jeffery djeffery@redhat.com Reviewed-by: Laurence Oberman loberman@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/raid10.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -191,7 +191,7 @@ static void * r10buf_pool_alloc(gfp_t gf
out_free_pages: while (--j >= 0) - resync_free_pages(&rps[j * 2]); + resync_free_pages(&rps[j]);
j = 0; out_free_bio:
 
            From: Alexander Kapshuk alexander.kapshuk@gmail.com
commit 700c1018b86d0d4b3f1f2d459708c0cdf42b521d upstream.
gawk 5.0.1 generates the following regexp warnings:
GEN /home/sasha/torvalds/tools/objtool/arch/x86/lib/inat-tables.c awk: ../arch/x86/tools/gen-insn-attr-x86.awk:260: warning: regexp escape sequence `:' is not a known regexp operator awk: ../arch/x86/tools/gen-insn-attr-x86.awk:350: (FILENAME=../arch/x86/lib/x86-opcode-map.txt FNR=41) warning: regexp escape sequence `&' is not a known regexp operator
Ealier versions of gawk are not known to generate these warnings. The gawk manual referenced below does not list characters ':' and '&' as needing escaping, so 'unescape' them. See
https://www.gnu.org/software/gawk/manual/html_node/Escape-Sequences.html
for more info.
Running diff on the output generated by the script before and after applying the patch reported no differences.
[ bp: Massage commit message. ]
[ Caught the respective tools header discrepancy. ] Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Alexander Kapshuk alexander.kapshuk@gmail.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Masami Hiramatsu mhiramat@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: "Peter Zijlstra (Intel)" peterz@infradead.org Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Ingo Molnar mingo@redhat.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/20190924044659.3785-1-alexander.kapshuk@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/tools/gen-insn-attr-x86.awk | 4 ++-- tools/arch/x86/tools/gen-insn-attr-x86.awk | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/tools/gen-insn-attr-x86.awk +++ b/arch/x86/tools/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN {
lprefix1_expr = "\((66|!F3)\)" lprefix2_expr = "\(F3\)" - lprefix3_expr = "\((F2|!F3|66\&F2)\)" + lprefix3_expr = "\((F2|!F3|66&F2)\)" lprefix_expr = "\((66|F2|F3)\)" max_lprefix = 4
@@ -257,7 +257,7 @@ function convert_operands(count,opnd, return add_flags(imm, mod) }
-/^[0-9a-f]+:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index --- a/tools/arch/x86/tools/gen-insn-attr-x86.awk +++ b/tools/arch/x86/tools/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN {
lprefix1_expr = "\((66|!F3)\)" lprefix2_expr = "\(F3\)" - lprefix3_expr = "\((F2|!F3|66\&F2)\)" + lprefix3_expr = "\((F2|!F3|66&F2)\)" lprefix_expr = "\((66|F2|F3)\)" max_lprefix = 4
@@ -257,7 +257,7 @@ function convert_operands(count,opnd, return add_flags(imm, mod) }
-/^[0-9a-f]+:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index
 
            From: Waiman Long longman@redhat.com
commit 64870ed1b12e235cfca3f6c6da75b542c973ff78 upstream.
For MDS vulnerable processors with TSX support, enabling either MDS or TAA mitigations will enable the use of VERW to flush internal processor buffers at the right code path. IOW, they are either both mitigated or both not. However, if the command line options are inconsistent, the vulnerabilites sysfs files may not report the mitigation status correctly.
For example, with only the "mds=off" option:
vulnerabilities/mds:Vulnerable; SMT vulnerable vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable
The mds vulnerabilities file has wrong status in this case. Similarly, the taa vulnerability file will be wrong with mds mitigation on, but taa off.
Change taa_select_mitigation() to sync up the two mitigation status and have them turned off if both "mds=off" and "tsx_async_abort=off" are present.
Update documentation to emphasize the fact that both "mds=off" and "tsx_async_abort=off" have to be specified together for processors that are affected by both TAA and MDS to be effective.
[ bp: Massage and add kernel-parameters.txt change too. ]
Fixes: 1b42f017415b ("x86/speculation/taa: Add mitigation for TSX Async Abort") Signed-off-by: Waiman Long longman@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Kosina jkosina@suse.cz Cc: Jonathan Corbet corbet@lwn.net Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: linux-doc@vger.kernel.org Cc: Mark Gross mgross@linux.intel.com Cc: stable@vger.kernel.org Cc: Pawan Gupta pawan.kumar.gupta@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Tony Luck tony.luck@intel.com Cc: Tyler Hicks tyhicks@canonical.com Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/admin-guide/hw-vuln/mds.rst | 7 +++++-- Documentation/admin-guide/hw-vuln/tsx_async_abort.rst | 5 ++++- Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++++ arch/x86/kernel/cpu/bugs.c | 17 +++++++++++++++-- 4 files changed, 35 insertions(+), 5 deletions(-)
--- a/Documentation/admin-guide/hw-vuln/mds.rst +++ b/Documentation/admin-guide/hw-vuln/mds.rst @@ -265,8 +265,11 @@ time with the option "mds=". The valid a
============ =============================================================
-Not specifying this option is equivalent to "mds=full". - +Not specifying this option is equivalent to "mds=full". For processors +that are affected by both TAA (TSX Asynchronous Abort) and MDS, +specifying just "mds=off" without an accompanying "tsx_async_abort=off" +will have no effect as the same mitigation is used for both +vulnerabilities.
Mitigation selection guide -------------------------- --- a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst +++ b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst @@ -174,7 +174,10 @@ the option "tsx_async_abort=". The valid CPU is not vulnerable to cross-thread TAA attacks. ============ =============================================================
-Not specifying this option is equivalent to "tsx_async_abort=full". +Not specifying this option is equivalent to "tsx_async_abort=full". For +processors that are affected by both TAA and MDS, specifying just +"tsx_async_abort=off" without an accompanying "mds=off" will have no +effect as the same mitigation is used for both vulnerabilities.
The kernel command line also allows to control the TSX feature using the parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2473,6 +2473,12 @@ SMT on vulnerable CPUs off - Unconditionally disable MDS mitigation
+ On TAA-affected machines, mds=off can be prevented by + an active TAA mitigation as both vulnerabilities are + mitigated with the same mechanism so in order to disable + this mitigation, you need to specify tsx_async_abort=off + too. + Not specifying this option is equivalent to mds=full.
@@ -4931,6 +4937,11 @@ vulnerable to cross-thread TAA attacks. off - Unconditionally disable TAA mitigation
+ On MDS-affected machines, tsx_async_abort=off can be + prevented by an active MDS mitigation as both vulnerabilities + are mitigated with the same mechanism so in order to disable + this mitigation, you need to specify mds=off too. + Not specifying this option is equivalent to tsx_async_abort=full. On CPUs which are MDS affected and deploy MDS mitigation, TAA mitigation is not --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -304,8 +304,12 @@ static void __init taa_select_mitigation return; }
- /* TAA mitigation is turned off on the cmdline (tsx_async_abort=off) */ - if (taa_mitigation == TAA_MITIGATION_OFF) + /* + * TAA mitigation via VERW is turned off if both + * tsx_async_abort=off and mds=off are specified. + */ + if (taa_mitigation == TAA_MITIGATION_OFF && + mds_mitigation == MDS_MITIGATION_OFF) goto out;
if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) @@ -339,6 +343,15 @@ static void __init taa_select_mitigation if (taa_nosmt || cpu_mitigations_auto_nosmt()) cpu_smt_disable(false);
+ /* + * Update MDS mitigation, if necessary, as the mds_user_clear is + * now enabled for TAA mitigation. + */ + if (mds_mitigation == MDS_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_MDS)) { + mds_mitigation = MDS_MITIGATION_FULL; + mds_select_mitigation(); + } out: pr_info("%s\n", taa_strings[taa_mitigation]); }
 
            From: Waiman Long longman@redhat.com
commit cd5a2aa89e847bdda7b62029d94e95488d73f6b2 upstream.
Since MDS and TAA mitigations are inter-related for processors that are affected by both vulnerabilities, the followiing confusing messages can be printed in the kernel log:
MDS: Vulnerable MDS: Mitigation: Clear CPU buffers
To avoid the first incorrect message, defer the printing of MDS mitigation after the TAA mitigation selection has been done. However, that has the side effect of printing TAA mitigation first before MDS mitigation.
[ bp: Check box is affected/mitigations are disabled first before printing and massage. ]
Suggested-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Waiman Long longman@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Mark Gross mgross@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Tony Luck tony.luck@intel.com Cc: Tyler Hicks tyhicks@canonical.com Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/20191115161445.30809-3-longman@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/bugs.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -39,6 +39,7 @@ static void __init spectre_v2_select_mit static void __init ssb_select_mitigation(void); static void __init l1tf_select_mitigation(void); static void __init mds_select_mitigation(void); +static void __init mds_print_mitigation(void); static void __init taa_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR that always has to be preserved. */ @@ -108,6 +109,12 @@ void __init check_bugs(void) mds_select_mitigation(); taa_select_mitigation();
+ /* + * As MDS and TAA mitigations are inter-related, print MDS + * mitigation until after TAA mitigation selection is done. + */ + mds_print_mitigation(); + arch_smt_update();
#ifdef CONFIG_X86_32 @@ -245,6 +252,12 @@ static void __init mds_select_mitigation (mds_nosmt || cpu_mitigations_auto_nosmt())) cpu_smt_disable(false); } +} + +static void __init mds_print_mitigation(void) +{ + if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) + return;
pr_info("%s\n", mds_strings[mds_mitigation]); }
 
            From: Navid Emamdoost navid.emamdoost@gmail.com
commit 03bf73c315edca28f47451913177e14cd040a216 upstream.
In nbd_add_socket when krealloc succeeds, if nsock's allocation fail the reallocted memory is leak. The correct behaviour should be assigning the reallocted memory to config->socks right after success.
Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/nbd.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1032,14 +1032,15 @@ static int nbd_add_socket(struct nbd_dev sockfd_put(sock); return -ENOMEM; } + + config->socks = socks; + nsock = kzalloc(sizeof(struct nbd_sock), GFP_KERNEL); if (!nsock) { sockfd_put(sock); return -ENOMEM; }
- config->socks = socks; - nsock->fallback_index = -1; nsock->dead = false; mutex_init(&nsock->tx_lock);
 
            From: Jan Beulich jbeulich@suse.com
commit 81ff2c37f9e5d77593928df0536d86443195fd64 upstream.
Once again RPL checks have been introduced which don't account for a 32-bit kernel living in ring 1 when running in a PV Xen domain. The case in FIXUP_FRAME has been preventing boot.
Adjust BUG_IF_WRONG_CR3 as well to guard against future uses of the macro on a code path reachable when running in PV mode under Xen; I have to admit that I stopped at a certain point trying to figure out whether there are present ones.
Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs") Signed-off-by: Jan Beulich jbeulich@suse.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Stable Team stable@vger.kernel.org Link: https://lore.kernel.org/r/0fad341f-b7f5-f859-d55d-f0084ee7087e@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 4 ++-- arch/x86/include/asm/segment.h | 12 ++++++++++++ 2 files changed, 14 insertions(+), 2 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -172,7 +172,7 @@ ALTERNATIVE "jmp .Lend_@", "", X86_FEATURE_PTI .if \no_user_check == 0 /* coming from usermode? */ - testl $SEGMENT_RPL_MASK, PT_CS(%esp) + testl $USER_SEGMENT_RPL_MASK, PT_CS(%esp) jz .Lend_@ .endif /* On user-cr3? */ @@ -217,7 +217,7 @@ testl $X86_EFLAGS_VM, 4*4(%esp) jnz .Lfrom_usermode_no_fixup_@ #endif - testl $SEGMENT_RPL_MASK, 3*4(%esp) + testl $USER_SEGMENT_RPL_MASK, 3*4(%esp) jnz .Lfrom_usermode_no_fixup_@
orl $CS_FROM_KERNEL, 3*4(%esp) --- a/arch/x86/include/asm/segment.h +++ b/arch/x86/include/asm/segment.h @@ -31,6 +31,18 @@ */ #define SEGMENT_RPL_MASK 0x3
+/* + * When running on Xen PV, the actual privilege level of the kernel is 1, + * not 0. Testing the Requested Privilege Level in a segment selector to + * determine whether the context is user mode or kernel mode with + * SEGMENT_RPL_MASK is wrong because the PV kernel's privilege level + * matches the 0x3 mask. + * + * Testing with USER_SEGMENT_RPL_MASK is valid for both native and Xen PV + * kernels because privilege level 2 is never used. + */ +#define USER_SEGMENT_RPL_MASK 0x2 + /* User mode is privilege level 3: */ #define USER_RPL 0x3
 
            From: Jan Beulich jbeulich@suse.com
commit 29b810f5a5ec127d3143770098e05981baa3eb77 upstream.
Now that SS:ESP always get saved by SAVE_ALL, this also needs to be accounted for in xen_iret_crit_fixup(). Otherwise the old_ax value gets interpreted as EFLAGS, and hence VM86 mode appears to be active all the time, leading to random "vm86_32: no user_vm86: BAD" log messages alongside processes randomly crashing.
Since following the previous model (sitting after SAVE_ALL) would further complicate the code _and_ retain the dependency of xen_iret_crit_fixup() on frame manipulations done by entry_32.S, switch things around and do the adjustment ahead of SAVE_ALL.
Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs") Signed-off-by: Jan Beulich jbeulich@suse.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Cc: Stable Team stable@vger.kernel.org Link: https://lkml.kernel.org/r/32d8713d-25a7-84ab-b74b-aa3e88abce6b@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 22 +++++--------- arch/x86/xen/xen-asm_32.S | 70 +++++++++++++++++----------------------------- 2 files changed, 35 insertions(+), 57 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -1341,11 +1341,6 @@ END(spurious_interrupt_bug)
#ifdef CONFIG_XEN_PV ENTRY(xen_hypervisor_callback) - pushl $-1 /* orig_ax = -1 => not a system call */ - SAVE_ALL - ENCODE_FRAME_POINTER - TRACE_IRQS_OFF - /* * Check to see if we got the event in the critical * region in xen_iret_direct, after we've reenabled @@ -1353,16 +1348,17 @@ ENTRY(xen_hypervisor_callback) * iret instruction's behaviour where it delivers a * pending interrupt when enabling interrupts: */ - movl PT_EIP(%esp), %eax - cmpl $xen_iret_start_crit, %eax + cmpl $xen_iret_start_crit, (%esp) jb 1f - cmpl $xen_iret_end_crit, %eax + cmpl $xen_iret_end_crit, (%esp) jae 1f - - jmp xen_iret_crit_fixup - -ENTRY(xen_do_upcall) -1: mov %esp, %eax + call xen_iret_crit_fixup +1: + pushl $-1 /* orig_ax = -1 => not a system call */ + SAVE_ALL + ENCODE_FRAME_POINTER + TRACE_IRQS_OFF + mov %esp, %eax call xen_evtchn_do_upcall #ifndef CONFIG_PREEMPTION call xen_maybe_preempt_hcall --- a/arch/x86/xen/xen-asm_32.S +++ b/arch/x86/xen/xen-asm_32.S @@ -126,10 +126,9 @@ hyper_iret: .globl xen_iret_start_crit, xen_iret_end_crit
/* - * This is called by xen_hypervisor_callback in entry.S when it sees + * This is called by xen_hypervisor_callback in entry_32.S when it sees * that the EIP at the time of interrupt was between - * xen_iret_start_crit and xen_iret_end_crit. We're passed the EIP in - * %eax so we can do a more refined determination of what to do. + * xen_iret_start_crit and xen_iret_end_crit. * * The stack format at this point is: * ---------------- @@ -138,34 +137,23 @@ hyper_iret: * eflags } outer exception info * cs } * eip } - * ---------------- <- edi (copy dest) - * eax : outer eax if it hasn't been restored * ---------------- - * eflags } nested exception info - * cs } (no ss/esp because we're nested - * eip } from the same ring) - * orig_eax }<- esi (copy src) - * - - - - - - - - - * fs } - * es } - * ds } SAVE_ALL state - * eax } - * : : - * ebx }<- esp + * eax : outer eax if it hasn't been restored * ---------------- + * eflags } + * cs } nested exception info + * eip } + * return address : (into xen_hypervisor_callback) * - * In order to deliver the nested exception properly, we need to shift - * everything from the return addr up to the error code so it sits - * just under the outer exception info. This means that when we - * handle the exception, we do it in the context of the outer - * exception rather than starting a new one. + * In order to deliver the nested exception properly, we need to discard the + * nested exception frame such that when we handle the exception, we do it + * in the context of the outer exception rather than starting a new one. * - * The only caveat is that if the outer eax hasn't been restored yet - * (ie, it's still on stack), we need to insert its value into the - * SAVE_ALL state before going on, since it's usermode state which we - * eventually need to restore. + * The only caveat is that if the outer eax hasn't been restored yet (i.e. + * it's still on stack), we need to restore its value here. */ ENTRY(xen_iret_crit_fixup) + pushl %ecx /* * Paranoia: Make sure we're really coming from kernel space. * One could imagine a case where userspace jumps into the @@ -176,32 +164,26 @@ ENTRY(xen_iret_crit_fixup) * jump instruction itself, not the destination, but some * virtual environments get this wrong. */ - movl PT_CS(%esp), %ecx + movl 3*4(%esp), %ecx /* nested CS */ andl $SEGMENT_RPL_MASK, %ecx cmpl $USER_RPL, %ecx + popl %ecx je 2f
- lea PT_ORIG_EAX(%esp), %esi - lea PT_EFLAGS(%esp), %edi - /* * If eip is before iret_restore_end then stack * hasn't been restored yet. */ - cmp $iret_restore_end, %eax + cmpl $iret_restore_end, 1*4(%esp) jae 1f
- movl 0+4(%edi), %eax /* copy EAX (just above top of frame) */ - movl %eax, PT_EAX(%esp) - - lea ESP_OFFSET(%edi), %edi /* move dest up over saved regs */ - - /* set up the copy */ -1: std - mov $PT_EIP / 4, %ecx /* saved regs up to orig_eax */ - rep movsl - cld - - lea 4(%edi), %esp /* point esp to new frame */ -2: jmp xen_do_upcall - + movl 4*4(%esp), %eax /* load outer EAX */ + ret $4*4 /* discard nested EIP, CS, and EFLAGS as + * well as the just restored EAX */ + +1: + ret $3*4 /* discard nested EIP, CS, and EFLAGS */ + +2: + ret +END(xen_iret_crit_fixup)
 
            From: Jan Beulich jbeulich@suse.com
commit 922eea2ce5c799228d9ff1be9890e6873ce8fff6 upstream.
This can be had with two instead of six insns, by just checking the high CS.RPL bit.
Also adjust the comment - there would be no #GP in the mentioned cases, as there's no segment limit violation or alike. Instead there'd be #PF, but that one reports the target EIP of said branch, not the address of the branch insn itself.
Signed-off-by: Jan Beulich jbeulich@suse.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Link: https://lkml.kernel.org/r/a5986837-01eb-7bf8-bf42-4d3084d6a1f5@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/xen/xen-asm_32.S | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-)
--- a/arch/x86/xen/xen-asm_32.S +++ b/arch/x86/xen/xen-asm_32.S @@ -153,22 +153,15 @@ hyper_iret: * it's still on stack), we need to restore its value here. */ ENTRY(xen_iret_crit_fixup) - pushl %ecx /* * Paranoia: Make sure we're really coming from kernel space. * One could imagine a case where userspace jumps into the * critical range address, but just before the CPU delivers a - * GP, it decides to deliver an interrupt instead. Unlikely? - * Definitely. Easy to avoid? Yes. The Intel documents - * explicitly say that the reported EIP for a bad jump is the - * jump instruction itself, not the destination, but some - * virtual environments get this wrong. + * PF, it decides to deliver an interrupt instead. Unlikely? + * Definitely. Easy to avoid? Yes. */ - movl 3*4(%esp), %ecx /* nested CS */ - andl $SEGMENT_RPL_MASK, %ecx - cmpl $USER_RPL, %ecx - popl %ecx - je 2f + testb $2, 2*4(%esp) /* nested CS */ + jnz 2f
/* * If eip is before iret_restore_end then stack
 
            From: Andy Lutomirski luto@kernel.org
commit 3580d0b29cab08483f84a16ce6a1151a1013695f upstream.
The double fault TSS was missing GS setup, which is needed for stack canaries to work.
Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/doublefault.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/x86/kernel/doublefault.c +++ b/arch/x86/kernel/doublefault.c @@ -65,6 +65,9 @@ struct x86_hw_tss doublefault_tss __cach .ss = __KERNEL_DS, .ds = __USER_DS, .fs = __KERNEL_PERCPU, +#ifndef CONFIG_X86_32_LAZY_GS + .gs = __KERNEL_STACK_CANARY, +#endif
.__cr3 = __pa_nodebug(swapper_pg_dir), };
 
            From: Thomas Gleixner tglx@linutronix.de
commit f490e07c53d66045d9d739e134145ec9b38653d3 upstream.
Commit 945fd17ab6ba ("x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table") introduced the sync for the initial page table for 32bit.
sync_initial_page_table() uses clone_pgd_range() which does the update for the kernel page table. If PTI is enabled it also updates the user space page table counterpart, which is assumed to be in the next page after the target PGD.
At this point in time 32-bit did not have PTI support, so the user space page table update was not taking place.
The support for PTI on 32-bit which was introduced later on, did not take that into account and missed to add the user space counter part for the initial page table.
As a consequence sync_initial_page_table() overwrites any data which is located in the page behing initial_page_table causing random failures, e.g. by corrupting doublefault_tss and wreckaging the doublefault handler on 32bit.
Fix it by adding a "user" page table right after initial_page_table.
Fixes: 7757d607c6b3 ("x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32") Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Joerg Roedel jroedel@suse.de Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/head_32.S | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/arch/x86/kernel/head_32.S +++ b/arch/x86/kernel/head_32.S @@ -571,6 +571,16 @@ ENTRY(initial_page_table) # error "Kernel PMDs should be 1, 2 or 3" # endif .align PAGE_SIZE /* needs to be page-sized too */ + +#ifdef CONFIG_PAGE_TABLE_ISOLATION + /* + * PTI needs another page so sync_initial_pagetable() works correctly + * and does not scribble over the data which is placed behind the + * actual initial_page_table. See clone_pgd_range(). + */ + .fill 1024, 4, 0 +#endif + #endif
.data
 
            From: Thomas Gleixner tglx@linutronix.de
commit 880a98c339961eaa074393e3a2117cbe9125b8bb upstream.
The entry stack in the cpu entry area is protected against overflow by the readonly GDT on 64-bit, but on 32-bit the GDT needs to be writeable and therefore does not trigger a fault on stack overflow.
Add a guard page.
Fixes: c482feefe1ae ("x86/entry/64: Make cpu_entry_area.tss read-only") Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/cpu_entry_area.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/cpu_entry_area.h +++ b/arch/x86/include/asm/cpu_entry_area.h @@ -78,8 +78,12 @@ struct cpu_entry_area {
/* * The GDT is just below entry_stack and thus serves (on x86_64) as - * a a read-only guard page. + * a read-only guard page. On 32-bit the GDT must be writeable, so + * it needs an extra guard page. */ +#ifdef CONFIG_X86_32 + char guard_entry_stack[PAGE_SIZE]; +#endif struct entry_stack_page entry_stack_page;
/*
 
            From: Peter Zijlstra peterz@infradead.org
commit 40ad2199580e248dce2a2ebb722854180c334b9e upstream.
As reported by Lai, the commit 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs") wrecked the IRET EXTABLE entry by making .Lirq_return not point at IRET.
Fix this by placing IRET_FRAME in RESTORE_REGS, to mirror how FIXUP_FRAME is part of SAVE_ALL.
Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs") Reported-by: Lai Jiangshan laijs@linux.alibaba.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Andy Lutomirski luto@kernel.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -357,6 +357,7 @@ 2: popl %es 3: popl %fs POP_GS \pop + IRET_FRAME .pushsection .fixup, "ax" 4: movl $0, (%esp) jmp 1b @@ -1075,7 +1076,6 @@ restore_all: /* Restore user state */ RESTORE_REGS pop=4 # skip orig_eax/error_code .Lirq_return: - IRET_FRAME /* * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization * when returning from IPI handler and when returning from
 
            From: Andy Lutomirski luto@kernel.org
commit 4c4fd55d3d59a41ddfa6ecba7e76928921759f43 upstream.
When re-building the IRET frame we use %eax as an destination %esp, make sure to then also match the segment for when there is a nonzero SS base (ESPFIX).
[peterz: Changelog and minor edits] Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs") Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -210,6 +210,8 @@ /* * The high bits of the CS dword (__csh) are used for CS_FROM_*. * Clear them in case hardware didn't do this for us. + * + * Be careful: we may have nonzero SS base due to ESPFIX. */ andl $0x0000ffff, 3*4(%esp)
@@ -263,6 +265,13 @@ .endm
.macro IRET_FRAME + /* + * We're called with %ds, %es, %fs, and %gs from the interrupted + * frame, so we shouldn't use them. Also, we may be in ESPFIX + * mode and therefore have a nonzero SS base and an offset ESP, + * so any attempt to access the stack needs to use SS. (except for + * accesses through %esp, which automatically use SS.) + */ testl $CS_FROM_KERNEL, 1*4(%esp) jz .Lfinished_frame_@
@@ -276,20 +285,20 @@ movl 5*4(%esp), %eax # (modified) regs->sp
movl 4*4(%esp), %ecx # flags - movl %ecx, -4(%eax) + movl %ecx, %ss:-1*4(%eax)
movl 3*4(%esp), %ecx # cs andl $0x0000ffff, %ecx - movl %ecx, -8(%eax) + movl %ecx, %ss:-2*4(%eax)
movl 2*4(%esp), %ecx # ip - movl %ecx, -12(%eax) + movl %ecx, %ss:-3*4(%eax)
movl 1*4(%esp), %ecx # eax - movl %ecx, -16(%eax) + movl %ecx, %ss:-4*4(%eax)
popl %ecx - lea -16(%eax), %esp + lea -4*4(%eax), %esp popl %eax .Lfinished_frame_@: .endm
 
            From: Andy Lutomirski luto@kernel.org
commit 82cb8a0b1d8d07817b5d59f7fa1438e1fceafab2 upstream.
This will allow us to get percpu access working before FIXUP_FRAME, which will allow us to unwind ESPFIX earlier.
Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 66 ++++++++++++++++++++++++---------------------- 1 file changed, 35 insertions(+), 31 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -213,54 +213,58 @@ * * Be careful: we may have nonzero SS base due to ESPFIX. */ - andl $0x0000ffff, 3*4(%esp) + andl $0x0000ffff, 4*4(%esp)
#ifdef CONFIG_VM86 - testl $X86_EFLAGS_VM, 4*4(%esp) + testl $X86_EFLAGS_VM, 5*4(%esp) jnz .Lfrom_usermode_no_fixup_@ #endif - testl $USER_SEGMENT_RPL_MASK, 3*4(%esp) + testl $USER_SEGMENT_RPL_MASK, 4*4(%esp) jnz .Lfrom_usermode_no_fixup_@
- orl $CS_FROM_KERNEL, 3*4(%esp) + orl $CS_FROM_KERNEL, 4*4(%esp)
/* * When we're here from kernel mode; the (exception) stack looks like: * - * 5*4(%esp) - <previous context> - * 4*4(%esp) - flags - * 3*4(%esp) - cs - * 2*4(%esp) - ip - * 1*4(%esp) - orig_eax - * 0*4(%esp) - gs / function + * 6*4(%esp) - <previous context> + * 5*4(%esp) - flags + * 4*4(%esp) - cs + * 3*4(%esp) - ip + * 2*4(%esp) - orig_eax + * 1*4(%esp) - gs / function + * 0*4(%esp) - fs * * Lets build a 5 entry IRET frame after that, such that struct pt_regs * is complete and in particular regs->sp is correct. This gives us - * the original 5 enties as gap: + * the original 6 enties as gap: * - * 12*4(%esp) - <previous context> - * 11*4(%esp) - gap / flags - * 10*4(%esp) - gap / cs - * 9*4(%esp) - gap / ip - * 8*4(%esp) - gap / orig_eax - * 7*4(%esp) - gap / gs / function - * 6*4(%esp) - ss - * 5*4(%esp) - sp - * 4*4(%esp) - flags - * 3*4(%esp) - cs - * 2*4(%esp) - ip - * 1*4(%esp) - orig_eax - * 0*4(%esp) - gs / function + * 14*4(%esp) - <previous context> + * 13*4(%esp) - gap / flags + * 12*4(%esp) - gap / cs + * 11*4(%esp) - gap / ip + * 10*4(%esp) - gap / orig_eax + * 9*4(%esp) - gap / gs / function + * 8*4(%esp) - gap / fs + * 7*4(%esp) - ss + * 6*4(%esp) - sp + * 5*4(%esp) - flags + * 4*4(%esp) - cs + * 3*4(%esp) - ip + * 2*4(%esp) - orig_eax + * 1*4(%esp) - gs / function + * 0*4(%esp) - fs */
pushl %ss # ss pushl %esp # sp (points at ss) - addl $6*4, (%esp) # point sp back at the previous context - pushl 6*4(%esp) # flags - pushl 6*4(%esp) # cs - pushl 6*4(%esp) # ip - pushl 6*4(%esp) # orig_eax - pushl 6*4(%esp) # gs / function + addl $7*4, (%esp) # point sp back at the previous context + pushl 7*4(%esp) # flags + pushl 7*4(%esp) # cs + pushl 7*4(%esp) # ip + pushl 7*4(%esp) # orig_eax + pushl 7*4(%esp) # gs / function + pushl 7*4(%esp) # fs .Lfrom_usermode_no_fixup_@: .endm
@@ -308,8 +312,8 @@ .if \skip_gs == 0 PUSH_GS .endif - FIXUP_FRAME pushl %fs + FIXUP_FRAME pushl %es pushl %ds pushl \pt_regs_ax
 
            From: Andy Lutomirski luto@kernel.org
commit a1a338e5b6fe9e0a39c57c232dc96c198bb53e47 upstream.
Right now, we do some fancy parts of the exception entry path while SS might have a nonzero base: we fill in regs->ss and regs->sp, and we consider switching to the kernel stack. This results in regs->ss and regs->sp referring to a non-flat stack and it may result in overflowing the entry stack. The former issue means that we can try to call iret_exc on a non-flat stack, which doesn't work.
Tested with selftests/x86/sigreturn_32.
Fixes: 45d7b255747c ("x86/entry/32: Enter the kernel via trampoline stack") Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 30 ++++++++++++++++-------------- 1 file changed, 16 insertions(+), 14 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -210,8 +210,6 @@ /* * The high bits of the CS dword (__csh) are used for CS_FROM_*. * Clear them in case hardware didn't do this for us. - * - * Be careful: we may have nonzero SS base due to ESPFIX. */ andl $0x0000ffff, 4*4(%esp)
@@ -307,12 +305,21 @@ .Lfinished_frame_@: .endm
-.macro SAVE_ALL pt_regs_ax=%eax switch_stacks=0 skip_gs=0 +.macro SAVE_ALL pt_regs_ax=%eax switch_stacks=0 skip_gs=0 unwind_espfix=0 cld .if \skip_gs == 0 PUSH_GS .endif pushl %fs + + pushl %eax + movl $(__KERNEL_PERCPU), %eax + movl %eax, %fs +.if \unwind_espfix > 0 + UNWIND_ESPFIX_STACK +.endif + popl %eax + FIXUP_FRAME pushl %es pushl %ds @@ -326,8 +333,6 @@ movl $(__USER_DS), %edx movl %edx, %ds movl %edx, %es - movl $(__KERNEL_PERCPU), %edx - movl %edx, %fs .if \skip_gs == 0 SET_KERNEL_GS %edx .endif @@ -1153,18 +1158,17 @@ ENDPROC(entry_INT80_32) lss (%esp), %esp /* switch to the normal stack segment */ #endif .endm + .macro UNWIND_ESPFIX_STACK + /* It's safe to clobber %eax, all other regs need to be preserved */ #ifdef CONFIG_X86_ESPFIX32 movl %ss, %eax /* see if on espfix stack */ cmpw $__ESPFIX_SS, %ax - jne 27f - movl $__KERNEL_DS, %eax - movl %eax, %ds - movl %eax, %es + jne .Lno_fixup_@ /* switch to normal stack */ FIXUP_ESPFIX_STACK -27: +.Lno_fixup_@: #endif .endm
@@ -1458,10 +1462,9 @@ END(page_fault)
common_exception_read_cr2: /* the function address is in %gs's slot on the stack */ - SAVE_ALL switch_stacks=1 skip_gs=1 + SAVE_ALL switch_stacks=1 skip_gs=1 unwind_espfix=1
ENCODE_FRAME_POINTER - UNWIND_ESPFIX_STACK
/* fixup %gs */ GS_TO_REG %ecx @@ -1483,9 +1486,8 @@ END(common_exception_read_cr2)
common_exception: /* the function address is in %gs's slot on the stack */ - SAVE_ALL switch_stacks=1 skip_gs=1 + SAVE_ALL switch_stacks=1 skip_gs=1 unwind_espfix=1 ENCODE_FRAME_POINTER - UNWIND_ESPFIX_STACK
/* fixup %gs */ GS_TO_REG %ecx
 
            From: Peter Zijlstra peterz@infradead.org
commit 895429076512e9d1cf5428181076299c90713159 upstream.
When the NMI lands on an ESPFIX_SS, we are on the entry stack and must swizzle, otherwise we'll run do_nmi() on the entry stack, which is BAD.
Also, similar to the normal exception path, we need to correct the ESPFIX magic before leaving the entry stack, otherwise pt_regs will present a non-flat stack pointer.
Tested by running sigreturn_32 concurrent with perf-record.
Fixes: e5862d0515ad ("x86/entry/32: Leave the kernel via trampoline stack") Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Andy Lutomirski luto@kernel.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 53 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 41 insertions(+), 12 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -205,6 +205,7 @@ #define CS_FROM_ENTRY_STACK (1 << 31) #define CS_FROM_USER_CR3 (1 << 30) #define CS_FROM_KERNEL (1 << 29) +#define CS_FROM_ESPFIX (1 << 28)
.macro FIXUP_FRAME /* @@ -342,8 +343,8 @@ .endif .endm
-.macro SAVE_ALL_NMI cr3_reg:req - SAVE_ALL +.macro SAVE_ALL_NMI cr3_reg:req unwind_espfix=0 + SAVE_ALL unwind_espfix=\unwind_espfix
BUG_IF_WRONG_CR3
@@ -1526,6 +1527,10 @@ ENTRY(nmi) ASM_CLAC
#ifdef CONFIG_X86_ESPFIX32 + /* + * ESPFIX_SS is only ever set on the return to user path + * after we've switched to the entry stack. + */ pushl %eax movl %ss, %eax cmpw $__ESPFIX_SS, %ax @@ -1561,6 +1566,11 @@ ENTRY(nmi) movl %ebx, %esp
.Lnmi_return: +#ifdef CONFIG_X86_ESPFIX32 + testl $CS_FROM_ESPFIX, PT_CS(%esp) + jnz .Lnmi_from_espfix +#endif + CHECK_AND_APPLY_ESPFIX RESTORE_ALL_NMI cr3_reg=%edi pop=4 jmp .Lirq_return @@ -1568,23 +1578,42 @@ ENTRY(nmi) #ifdef CONFIG_X86_ESPFIX32 .Lnmi_espfix_stack: /* - * create the pointer to lss back + * Create the pointer to LSS back */ pushl %ss pushl %esp addl $4, (%esp) - /* copy the iret frame of 12 bytes */ - .rept 3 - pushl 16(%esp) - .endr - pushl %eax - SAVE_ALL_NMI cr3_reg=%edi + + /* Copy the (short) IRET frame */ + pushl 4*4(%esp) # flags + pushl 4*4(%esp) # cs + pushl 4*4(%esp) # ip + + pushl %eax # orig_ax + + SAVE_ALL_NMI cr3_reg=%edi unwind_espfix=1 ENCODE_FRAME_POINTER - FIXUP_ESPFIX_STACK # %eax == %esp + + /* clear CS_FROM_KERNEL, set CS_FROM_ESPFIX */ + xorl $(CS_FROM_ESPFIX | CS_FROM_KERNEL), PT_CS(%esp) + xorl %edx, %edx # zero error code - call do_nmi + movl %esp, %eax # pt_regs pointer + jmp .Lnmi_from_sysenter_stack + +.Lnmi_from_espfix: RESTORE_ALL_NMI cr3_reg=%edi - lss 12+4(%esp), %esp # back to espfix stack + /* + * Because we cleared CS_FROM_KERNEL, IRET_FRAME 'forgot' to + * fix up the gap and long frame: + * + * 3 - original frame (exception) + * 2 - ESPFIX block (above) + * 6 - gap (FIXUP_FRAME) + * 5 - long frame (FIXUP_FRAME) + * 1 - orig_ax + */ + lss (1+5+6)*4(%esp), %esp # back to espfix stack jmp .Lirq_return #endif END(nmi)
 
            From: Andy Lutomirski luto@kernel.org
commit 8caa016bfc129f2c925d52da43022171d1d1de91 upstream.
For reasons that I haven't quite fully diagnosed, running mov_ss_trap_32 on a 32-bit kernel results in an infinite loop in userspace. This appears to be because the hacky SYSENTER test doesn't segfault as desired; instead it corrupts the program state such that it infinite loops.
Fix it by explicitly clearing EBP before doing SYSENTER. This will give a more reliable segfault.
Fixes: 59c2a7226fc5 ("x86/selftests: Add mov_to_ss test") Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/testing/selftests/x86/mov_ss_trap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/x86/mov_ss_trap.c +++ b/tools/testing/selftests/x86/mov_ss_trap.c @@ -257,7 +257,8 @@ int main() err(1, "sigaltstack"); sethandler(SIGSEGV, handle_and_longjmp, SA_RESETHAND | SA_ONSTACK); nr = SYS_getpid; - asm volatile ("mov %[ss], %%ss; SYSENTER" : "+a" (nr) + /* Clear EBP first to make sure we segfault cleanly. */ + asm volatile ("xorl %%ebp, %%ebp; mov %[ss], %%ss; SYSENTER" : "+a" (nr) : [ss] "m" (ss) : "flags", "rcx" #ifdef __x86_64__ , "r11"
 
            From: Andy Lutomirski luto@kernel.org
commit 4d2fa82d98d2d296043a04eb517d7dbade5b13b8 upstream.
If the kernel accidentally uses DS or ES while the user values are loaded, it will work fine for sane userspace. In the interest of simulating maximally insane userspace, make sigreturn_32 zero out DS and ES for the nasty parts so that inadvertent use of these segments will crash.
Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/testing/selftests/x86/sigreturn.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/tools/testing/selftests/x86/sigreturn.c +++ b/tools/testing/selftests/x86/sigreturn.c @@ -451,6 +451,19 @@ static void sigusr1(int sig, siginfo_t * ctx->uc_mcontext.gregs[REG_SP] = (unsigned long)0x8badf00d5aadc0deULL; ctx->uc_mcontext.gregs[REG_CX] = 0;
+#ifdef __i386__ + /* + * Make sure the kernel doesn't inadvertently use DS or ES-relative + * accesses in a region where user DS or ES is loaded. + * + * Skip this for 64-bit builds because long mode doesn't care about + * DS and ES and skipping it increases test coverage a little bit, + * since 64-bit kernels can still run the 32-bit build. + */ + ctx->uc_mcontext.gregs[REG_DS] = 0; + ctx->uc_mcontext.gregs[REG_ES] = 0; +#endif + memcpy(&requested_regs, &ctx->uc_mcontext.gregs, sizeof(gregset_t)); requested_regs[REG_CX] = *ssptr(ctx); /* The asm code does this. */
 
            From: Ingo Molnar mingo@kernel.org
commit 05b042a1944322844eaae7ea596d5f154166d68a upstream.
When two recent commits that increased the size of the 'struct cpu_entry_area' were merged in -tip, the 32-bit defconfig build started failing on the following build time assert:
./include/linux/compiler.h:391:38: error: call to ‘__compiletime_assert_189’ declared with attribute error: BUILD_BUG_ON failed: CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE arch/x86/mm/cpu_entry_area.c:189:2: note: in expansion of macro ‘BUILD_BUG_ON’ In function ‘setup_cpu_entry_area_ptes’,
Which corresponds to the following build time assert:
BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE);
The purpose of this assert is to sanity check the fixed-value definition of CPU_ENTRY_AREA_PAGES arch/x86/include/asm/pgtable_32_types.h:
#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 41)
The '41' is supposed to match sizeof(struct cpu_entry_area)/PAGE_SIZE, which value we didn't want to define in such a low level header, because it would cause dependency hell.
Every time the size of cpu_entry_area is changed, we have to adjust CPU_ENTRY_AREA_PAGES accordingly - and this assert is checking that constraint.
But the assert is both imprecise and buggy, primarily because it doesn't include the single readonly IDT page that is mapped at CPU_ENTRY_AREA_BASE (which begins at a PMD boundary).
This bug was hidden by the fact that by accident CPU_ENTRY_AREA_PAGES is defined too large upstream (v5.4-rc8):
#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40)
While 'struct cpu_entry_area' is 155648 bytes, or 38 pages. So we had two extra pages, which hid the bug.
The following commit (not yet upstream) increased the size to 40 pages:
x86/iopl: ("Restrict iopl() permission scope")
... but increased CPU_ENTRY_AREA_PAGES only 41 - i.e. shortening the gap to just 1 extra page.
Then another not-yet-upstream commit changed the size again:
880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit")
Which increased the cpu_entry_area size from 38 to 39 pages, but didn't change CPU_ENTRY_AREA_PAGES (kept it at 40). This worked fine, because we still had a page left from the accidental 'reserve'.
But when these two commits were merged into the same tree, the combined size of cpu_entry_area grew from 38 to 40 pages, while CPU_ENTRY_AREA_PAGES finally caught up to 40 as well.
Which is fine in terms of functionality, but the assert broke:
BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE);
because CPU_ENTRY_AREA_MAP_SIZE is the total size of the area, which is 1 page larger due to the IDT page.
To fix all this, change the assert to two precise asserts:
BUILD_BUG_ON((CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE); BUILD_BUG_ON(CPU_ENTRY_AREA_TOTAL_SIZE != CPU_ENTRY_AREA_MAP_SIZE);
This takes the IDT page into account, and also connects the size-based define of CPU_ENTRY_AREA_TOTAL_SIZE with the address-subtraction based define of CPU_ENTRY_AREA_MAP_SIZE.
Also clean up some of the names which made it rather confusing:
- 'CPU_ENTRY_AREA_TOT_SIZE' wasn't actually the 'total' size of the cpu-entry-area, but the per-cpu array size, so rename this to CPU_ENTRY_AREA_ARRAY_SIZE.
- Introduce CPU_ENTRY_AREA_TOTAL_SIZE that _is_ the total mapping size, with the IDT included.
- Add comments where '+1' denotes the IDT mapping - it wasn't obvious and took me about 3 hours to decode...
Finally, because this particular commit is actually applied after this patch:
880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit")
Fix the CPU_ENTRY_AREA_PAGES value from 40 pages to the correct 39 pages.
All future commits that change cpu_entry_area will have to adjust this value precisely.
As a side note, we should probably attempt to remove CPU_ENTRY_AREA_PAGES and derive its value directly from the structure, without causing header hell - but that is an adventure for another day! :-)
Fixes: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Cc: Thomas Gleixner tglx@linutronix.de Cc: Borislav Petkov bp@alien8.de Cc: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Andy Lutomirski luto@kernel.org Cc: stable@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/cpu_entry_area.h | 12 +++++++----- arch/x86/include/asm/pgtable_32_types.h | 8 ++++---- arch/x86/mm/cpu_entry_area.c | 4 +++- 3 files changed, 14 insertions(+), 10 deletions(-)
--- a/arch/x86/include/asm/cpu_entry_area.h +++ b/arch/x86/include/asm/cpu_entry_area.h @@ -98,7 +98,6 @@ struct cpu_entry_area { */ struct cea_exception_stacks estacks; #endif -#ifdef CONFIG_CPU_SUP_INTEL /* * Per CPU debug store for Intel performance monitoring. Wastes a * full page at the moment. @@ -109,11 +108,13 @@ struct cpu_entry_area { * Reserve enough fixmap PTEs. */ struct debug_store_buffers cpu_debug_buffers; -#endif };
-#define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) -#define CPU_ENTRY_AREA_TOT_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS) +#define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) +#define CPU_ENTRY_AREA_ARRAY_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS) + +/* Total size includes the readonly IDT mapping page as well: */ +#define CPU_ENTRY_AREA_TOTAL_SIZE (CPU_ENTRY_AREA_ARRAY_SIZE + PAGE_SIZE)
DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area); DECLARE_PER_CPU(struct cea_exception_stacks *, cea_exception_stacks); @@ -121,13 +122,14 @@ DECLARE_PER_CPU(struct cea_exception_sta extern void setup_cpu_entry_areas(void); extern void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags);
+/* Single page reserved for the readonly IDT mapping: */ #define CPU_ENTRY_AREA_RO_IDT CPU_ENTRY_AREA_BASE #define CPU_ENTRY_AREA_PER_CPU (CPU_ENTRY_AREA_RO_IDT + PAGE_SIZE)
#define CPU_ENTRY_AREA_RO_IDT_VADDR ((void *)CPU_ENTRY_AREA_RO_IDT)
#define CPU_ENTRY_AREA_MAP_SIZE \ - (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_TOT_SIZE - CPU_ENTRY_AREA_BASE) + (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_ARRAY_SIZE - CPU_ENTRY_AREA_BASE)
extern struct cpu_entry_area *get_cpu_entry_area(int cpu);
--- a/arch/x86/include/asm/pgtable_32_types.h +++ b/arch/x86/include/asm/pgtable_32_types.h @@ -44,11 +44,11 @@ extern bool __vmalloc_start_set; /* set * Define this here and validate with BUILD_BUG_ON() in pgtable_32.c * to avoid include recursion hell */ -#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40) +#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 39)
-#define CPU_ENTRY_AREA_BASE \ - ((FIXADDR_TOT_START - PAGE_SIZE * (CPU_ENTRY_AREA_PAGES + 1)) \ - & PMD_MASK) +/* The +1 is for the readonly IDT page: */ +#define CPU_ENTRY_AREA_BASE \ + ((FIXADDR_TOT_START - PAGE_SIZE*(CPU_ENTRY_AREA_PAGES+1)) & PMD_MASK)
#define LDT_BASE_ADDR \ ((CPU_ENTRY_AREA_BASE - PAGE_SIZE) & PMD_MASK) --- a/arch/x86/mm/cpu_entry_area.c +++ b/arch/x86/mm/cpu_entry_area.c @@ -178,7 +178,9 @@ static __init void setup_cpu_entry_area_ #ifdef CONFIG_X86_32 unsigned long start, end;
- BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); + /* The +1 is for the readonly IDT: */ + BUILD_BUG_ON((CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE); + BUILD_BUG_ON(CPU_ENTRY_AREA_TOTAL_SIZE != CPU_ENTRY_AREA_MAP_SIZE); BUG_ON(CPU_ENTRY_AREA_BASE & ~PMD_MASK);
start = CPU_ENTRY_AREA_BASE;
 
            From: Andy Lutomirski luto@kernel.org
commit 4a13b0e3e10996b9aa0b45a764ecfe49f6fcd360 upstream.
UNWIND_ESPFIX_STACK needs to read the GDT, and the GDT mapping that can be accessed via %fs is not mapped in the user pagetables. Use SGDT to find the cpu_entry_area mapping and read the espfix offset from that instead.
Reported-and-tested-by: Borislav Petkov bp@alien8.de Signed-off-by: Andy Lutomirski luto@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Linus Torvalds torvalds@linux-foundation.org Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_32.S | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -415,7 +415,8 @@
.macro CHECK_AND_APPLY_ESPFIX #ifdef CONFIG_X86_ESPFIX32 -#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + (GDT_ENTRY_ESPFIX_SS * 8) +#define GDT_ESPFIX_OFFSET (GDT_ENTRY_ESPFIX_SS * 8) +#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + GDT_ESPFIX_OFFSET
ALTERNATIVE "jmp .Lend_@", "", X86_BUG_ESPFIX
@@ -1147,12 +1148,26 @@ ENDPROC(entry_INT80_32) * We can't call C functions using the ESPFIX stack. This code reads * the high word of the segment base from the GDT and swiches to the * normal stack and adjusts ESP with the matching offset. + * + * We might be on user CR3 here, so percpu data is not mapped and we can't + * access the GDT through the percpu segment. Instead, use SGDT to find + * the cpu_entry_area alias of the GDT. */ #ifdef CONFIG_X86_ESPFIX32 /* fixup the stack */ - mov GDT_ESPFIX_SS + 4, %al /* bits 16..23 */ - mov GDT_ESPFIX_SS + 7, %ah /* bits 24..31 */ + pushl %ecx + subl $2*4, %esp + sgdt (%esp) + movl 2(%esp), %ecx /* GDT address */ + /* + * Careful: ECX is a linear pointer, so we need to force base + * zero. %cs is the only known-linear segment we have right now. + */ + mov %cs:GDT_ESPFIX_OFFSET + 4(%ecx), %al /* bits 16..23 */ + mov %cs:GDT_ESPFIX_OFFSET + 7(%ecx), %ah /* bits 24..31 */ shl $16, %eax + addl $2*4, %esp + popl %ecx addl %esp, %eax /* the adjusted stack pointer */ pushl $__KERNEL_DS pushl %eax
 
            From: Yang Tao yang.tao172@zte.com.cn
commit ca16d5bee59807bf04deaab0a8eccecd5061528c upstream.
Robust futexes utilize the robust_list mechanism to allow the kernel to release futexes which are held when a task exits. The exit can be voluntary or caused by a signal or fault. This prevents that waiters block forever.
The futex operations in user space store a pointer to the futex they are either locking or unlocking in the op_pending member of the per task robust list.
After a lock operation has succeeded the futex is queued in the robust list linked list and the op_pending pointer is cleared.
After an unlock operation has succeeded the futex is removed from the robust list linked list and the op_pending pointer is cleared.
The robust list exit code checks for the pending operation and any futex which is queued in the linked list. It carefully checks whether the futex value is the TID of the exiting task. If so, it sets the OWNER_DIED bit and tries to wake up a potential waiter.
This is race free for the lock operation but unlock has two race scenarios where waiters might not be woken up. These issues can be observed with regular robust pthread mutexes. PI aware pthread mutexes are not affected.
(1) Unlocking task is killed after unlocking the futex value in user space before being able to wake a waiter.
pthread_mutex_unlock() | V atomic_exchange_rel (&mutex->__data.__lock, 0) <------------------------killed lll_futex_wake () | | |(__lock = 0) |(enter kernel) | V do_exit() exit_mm() mm_release() exit_robust_list() handle_futex_death() | |(__lock = 0) |(uval = 0) | V if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr)) return 0;
The sanity check which ensures that the user space futex is owned by the exiting task prevents the wakeup of waiters which in consequence block infinitely.
(2) Waiting task is killed after a wakeup and before it can acquire the futex in user space.
OWNER WAITER futex_wait() pthread_mutex_unlock() | | | |(__lock = 0) | | | V | futex_wake() ------------> wakeup() | |(return to userspace) |(__lock = 0) | V oldval = mutex->__data.__lock <-----------------killed atomic_compare_and_exchange_val_acq (&mutex->__data.__lock, | id | assume_other_futex_waiters, 0) | | | (enter kernel)| | V do_exit() | | V handle_futex_death() | |(__lock = 0) |(uval = 0) | V if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr)) return 0;
The sanity check which ensures that the user space futex is owned by the exiting task prevents the wakeup of waiters, which seems to be correct as the exiting task does not own the futex value, but the consequence is that other waiters wont be woken up and block infinitely.
In both scenarios the following conditions are true:
- task->robust_list->list_op_pending != NULL - user space futex value == 0 - Regular futex (not PI)
If these conditions are met then it is reasonably safe to wake up a potential waiter in order to prevent the above problems.
As this might be a false positive it can cause spurious wakeups, but the waiter side has to handle other types of unrelated wakeups, e.g. signals gracefully anyway. So such a spurious wakeup will not affect the correctness of these operations.
This workaround must not touch the user space futex value and cannot set the OWNER_DIED bit because the lock value is 0, i.e. uncontended. Setting OWNER_DIED in this case would result in inconsistent state and subsequently in malfunction of the owner died handling in user space.
The rest of the user space state is still consistent as no other task can observe the list_op_pending entry in the exiting tasks robust list.
The eventually woken up waiter will observe the uncontended lock value and take it over.
[ tglx: Massaged changelog and comment. Made the return explicit and not depend on the subsequent check and added constants to hand into handle_futex_death() instead of plain numbers. Fixed a few coding style issues. ]
Fixes: 0771dfefc9e5 ("[PATCH] lightweight robust futexes: core") Signed-off-by: Yang Tao yang.tao172@zte.com.cn Signed-off-by: Yi Wang wang.yi59@zte.com.cn Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1573010582-35297-1-git-send-email-wang.yi59@zte.co... Link: https://lkml.kernel.org/r/20191106224555.943191378@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 51 insertions(+), 7 deletions(-)
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -3452,11 +3452,16 @@ err_unlock: return ret; }
+/* Constants for the pending_op argument of handle_futex_death */ +#define HANDLE_DEATH_PENDING true +#define HANDLE_DEATH_LIST false + /* * Process a futex-list entry, check whether it's owned by the * dying task, and do notification if so: */ -static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr, int pi) +static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr, + bool pi, bool pending_op) { u32 uval, uninitialized_var(nval), mval; int err; @@ -3469,6 +3474,42 @@ retry: if (get_user(uval, uaddr)) return -1;
+ /* + * Special case for regular (non PI) futexes. The unlock path in + * user space has two race scenarios: + * + * 1. The unlock path releases the user space futex value and + * before it can execute the futex() syscall to wake up + * waiters it is killed. + * + * 2. A woken up waiter is killed before it can acquire the + * futex in user space. + * + * In both cases the TID validation below prevents a wakeup of + * potential waiters which can cause these waiters to block + * forever. + * + * In both cases the following conditions are met: + * + * 1) task->robust_list->list_op_pending != NULL + * @pending_op == true + * 2) User space futex value == 0 + * 3) Regular futex: @pi == false + * + * If these conditions are met, it is safe to attempt waking up a + * potential waiter without touching the user space futex value and + * trying to set the OWNER_DIED bit. The user space futex value is + * uncontended and the rest of the user space mutex state is + * consistent, so a woken waiter will just take over the + * uncontended futex. Setting the OWNER_DIED bit would create + * inconsistent state and malfunction of the user space owner died + * handling. + */ + if (pending_op && !pi && !uval) { + futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY); + return 0; + } + if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr)) return 0;
@@ -3588,10 +3629,11 @@ void exit_robust_list(struct task_struct * A pending lock might already be on the list, so * don't process it twice: */ - if (entry != pending) + if (entry != pending) { if (handle_futex_death((void __user *)entry + futex_offset, - curr, pi)) + curr, pi, HANDLE_DEATH_LIST)) return; + } if (rc) return; entry = next_entry; @@ -3605,9 +3647,10 @@ void exit_robust_list(struct task_struct cond_resched(); }
- if (pending) + if (pending) { handle_futex_death((void __user *)pending + futex_offset, - curr, pip); + curr, pip, HANDLE_DEATH_PENDING); + } }
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, @@ -3784,7 +3827,8 @@ void compat_exit_robust_list(struct task if (entry != pending) { void __user *uaddr = futex_uaddr(entry, futex_offset);
- if (handle_futex_death(uaddr, curr, pi)) + if (handle_futex_death(uaddr, curr, pi, + HANDLE_DEATH_LIST)) return; } if (rc) @@ -3803,7 +3847,7 @@ void compat_exit_robust_list(struct task if (pending) { void __user *uaddr = futex_uaddr(pending, futex_offset);
- handle_futex_death(uaddr, curr, pip); + handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING); } }
 
            From: Takashi Iwai tiwai@suse.de
commit 9435f2bb66874a0c4dd25e7c978957a7ca2c93b1 upstream.
snd_usb_mixer_controls_badd() that parses UAC3 BADD profiles misses a NULL check for the given interfaces. When a malformed USB descriptor is passed, this may lead to an Oops, as spotted by syzkaller. Skip the iteration if the interface doesn't exist for avoiding the crash.
Fixes: 17156f23e93c ("ALSA: usb: add UAC3 BADD profiles support") Reported-by: syzbot+a36ab65c6653d7ccdd62@syzkaller.appspotmail.com Suggested-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191122112840.24797-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/usb/mixer.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/sound/usb/mixer.c +++ b/sound/usb/mixer.c @@ -2930,6 +2930,9 @@ static int snd_usb_mixer_controls_badd(s continue;
iface = usb_ifnum_to_if(dev, intf); + if (!iface) + continue; + num = iface->num_altsetting;
if (num < 2)
 
            From: Geoffrey D. Bennett g@b4.vu
commit ce3cba788a1b7b8aed9380c3035d9e850884bd2d upstream.
The s6i6_gen2_info.ports[] array had the Mixer and PCM port type entries in the wrong place. Use designators to explicitly specify the array elements being set.
Fixes: 9e4d5c1be21f ("ALSA: usb-audio: Scarlett Gen 2 mixer interface") Signed-off-by: Geoffrey D. Bennett g@b4.vu Tested-by: Alex Fellows alex.fellows@gmail.com Tested-by: Markus Schroetter project.m.schroetter@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191110134356.GA31589@b4.vu Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/usb/mixer_scarlett_gen2.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
--- a/sound/usb/mixer_scarlett_gen2.c +++ b/sound/usb/mixer_scarlett_gen2.c @@ -261,34 +261,34 @@ static const struct scarlett2_device_inf },
.ports = { - { + [SCARLETT2_PORT_TYPE_NONE] = { .id = 0x000, .num = { 1, 0, 8, 8, 8 }, .src_descr = "Off", .src_num_offset = 0, }, - { + [SCARLETT2_PORT_TYPE_ANALOGUE] = { .id = 0x080, .num = { 4, 4, 4, 4, 4 }, .src_descr = "Analogue %d", .src_num_offset = 1, .dst_descr = "Analogue Output %02d Playback" }, - { + [SCARLETT2_PORT_TYPE_SPDIF] = { .id = 0x180, .num = { 2, 2, 2, 2, 2 }, .src_descr = "S/PDIF %d", .src_num_offset = 1, .dst_descr = "S/PDIF Output %d Playback" }, - { + [SCARLETT2_PORT_TYPE_MIX] = { .id = 0x300, .num = { 10, 18, 18, 18, 18 }, .src_descr = "Mix %c", .src_num_offset = 65, .dst_descr = "Mixer Input %02d Capture" }, - { + [SCARLETT2_PORT_TYPE_PCM] = { .id = 0x600, .num = { 6, 6, 6, 6, 6 }, .src_descr = "PCM %d", @@ -317,44 +317,44 @@ static const struct scarlett2_device_inf },
.ports = { - { + [SCARLETT2_PORT_TYPE_NONE] = { .id = 0x000, .num = { 1, 0, 8, 8, 4 }, .src_descr = "Off", .src_num_offset = 0, }, - { + [SCARLETT2_PORT_TYPE_ANALOGUE] = { .id = 0x080, .num = { 8, 6, 6, 6, 6 }, .src_descr = "Analogue %d", .src_num_offset = 1, .dst_descr = "Analogue Output %02d Playback" }, - { + [SCARLETT2_PORT_TYPE_SPDIF] = { + .id = 0x180, /* S/PDIF outputs aren't available at 192KHz * but are included in the USB mux I/O * assignment message anyway */ - .id = 0x180, .num = { 2, 2, 2, 2, 2 }, .src_descr = "S/PDIF %d", .src_num_offset = 1, .dst_descr = "S/PDIF Output %d Playback" }, - { + [SCARLETT2_PORT_TYPE_ADAT] = { .id = 0x200, .num = { 8, 0, 0, 0, 0 }, .src_descr = "ADAT %d", .src_num_offset = 1, }, - { + [SCARLETT2_PORT_TYPE_MIX] = { .id = 0x300, .num = { 10, 18, 18, 18, 18 }, .src_descr = "Mix %c", .src_num_offset = 65, .dst_descr = "Mixer Input %02d Capture" }, - { + [SCARLETT2_PORT_TYPE_PCM] = { .id = 0x600, .num = { 20, 18, 18, 14, 10 }, .src_descr = "PCM %d", @@ -387,20 +387,20 @@ static const struct scarlett2_device_inf },
.ports = { - { + [SCARLETT2_PORT_TYPE_NONE] = { .id = 0x000, .num = { 1, 0, 8, 8, 6 }, .src_descr = "Off", .src_num_offset = 0, }, - { + [SCARLETT2_PORT_TYPE_ANALOGUE] = { .id = 0x080, .num = { 8, 10, 10, 10, 10 }, .src_descr = "Analogue %d", .src_num_offset = 1, .dst_descr = "Analogue Output %02d Playback" }, - { + [SCARLETT2_PORT_TYPE_SPDIF] = { /* S/PDIF outputs aren't available at 192KHz * but are included in the USB mux I/O * assignment message anyway @@ -411,21 +411,21 @@ static const struct scarlett2_device_inf .src_num_offset = 1, .dst_descr = "S/PDIF Output %d Playback" }, - { + [SCARLETT2_PORT_TYPE_ADAT] = { .id = 0x200, .num = { 8, 8, 8, 4, 0 }, .src_descr = "ADAT %d", .src_num_offset = 1, .dst_descr = "ADAT Output %d Playback" }, - { + [SCARLETT2_PORT_TYPE_MIX] = { .id = 0x300, .num = { 10, 18, 18, 18, 18 }, .src_descr = "Mix %c", .src_num_offset = 65, .dst_descr = "Mixer Input %02d Capture" }, - { + [SCARLETT2_PORT_TYPE_PCM] = { .id = 0x600, .num = { 20, 18, 18, 14, 10 }, .src_descr = "PCM %d",
 
            From: Vandana BN bnvandana@gmail.com
commit b4add02d2236fd5f568db141cfd8eb4290972eb3 upstream.
When vbi stream is started, followed by video streaming, the vid_cap_streaming and vid_out_streaming were not being set to true, which would cause the video stream to stop when vbi stream is stopped. This patch allows to set vid_cap_streaming and vid_out_streaming to true. According to Hans Verkuil it appears that these 'if (dev->kthread_vid_cap)' checks are a left-over from the original vivid development and should never have been there.
Signed-off-by: Vandana BN bnvandana@gmail.com Cc: stable@vger.kernel.org # for v3.18 and up Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vivid/vivid-vid-cap.c | 3 --- drivers/media/platform/vivid/vivid-vid-out.c | 3 --- 2 files changed, 6 deletions(-)
--- a/drivers/media/platform/vivid/vivid-vid-cap.c +++ b/drivers/media/platform/vivid/vivid-vid-cap.c @@ -223,9 +223,6 @@ static int vid_cap_start_streaming(struc if (vb2_is_streaming(&dev->vb_vid_out_q)) dev->can_loop_video = vivid_vid_can_loop(dev);
- if (dev->kthread_vid_cap) - return 0; - dev->vid_cap_seq_count = 0; dprintk(dev, 1, "%s\n", __func__); for (i = 0; i < VIDEO_MAX_FRAME; i++) --- a/drivers/media/platform/vivid/vivid-vid-out.c +++ b/drivers/media/platform/vivid/vivid-vid-out.c @@ -161,9 +161,6 @@ static int vid_out_start_streaming(struc if (vb2_is_streaming(&dev->vb_vid_cap_q)) dev->can_loop_video = vivid_vid_can_loop(dev);
- if (dev->kthread_vid_out) - return 0; - dev->vid_out_seq_count = 0; dprintk(dev, 1, "%s\n", __func__); if (dev->start_streaming_error) {
 
            From: Alexander Popov alex.popov@linux.com
commit 6dcd5d7a7a29c1e4b8016a06aed78cd650cd8c27 upstream.
There is the same incorrect approach to locking implemented in vivid_stop_generating_vid_cap(), vivid_stop_generating_vid_out() and sdr_cap_stop_streaming().
These functions are called during streaming stopping with vivid_dev.mutex locked. And they all do the same mistake while stopping their kthreads, which need to lock this mutex as well. See the example from vivid_stop_generating_vid_cap(): /* shutdown control thread */ vivid_grab_controls(dev, false); mutex_unlock(&dev->mutex); kthread_stop(dev->kthread_vid_cap); dev->kthread_vid_cap = NULL; mutex_lock(&dev->mutex);
But when this mutex is unlocked, another vb2_fop_read() can lock it instead of vivid_thread_vid_cap() and manipulate the buffer queue. That causes a use-after-free access later.
To fix those issues let's: 1. avoid unlocking the mutex in vivid_stop_generating_vid_cap(), vivid_stop_generating_vid_out() and sdr_cap_stop_streaming(); 2. use mutex_trylock() with schedule_timeout_uninterruptible() in the loops of the vivid kthread handlers.
Signed-off-by: Alexander Popov alex.popov@linux.com Acked-by: Linus Torvalds torvalds@linux-foundation.org Tested-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Cc: stable@vger.kernel.org # for v3.18 and up Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vivid/vivid-kthread-cap.c | 8 +++++--- drivers/media/platform/vivid/vivid-kthread-out.c | 8 +++++--- drivers/media/platform/vivid/vivid-sdr-cap.c | 8 +++++--- 3 files changed, 15 insertions(+), 9 deletions(-)
--- a/drivers/media/platform/vivid/vivid-kthread-cap.c +++ b/drivers/media/platform/vivid/vivid-kthread-cap.c @@ -796,7 +796,11 @@ static int vivid_thread_vid_cap(void *da if (kthread_should_stop()) break;
- mutex_lock(&dev->mutex); + if (!mutex_trylock(&dev->mutex)) { + schedule_timeout_uninterruptible(1); + continue; + } + cur_jiffies = jiffies; if (dev->cap_seq_resync) { dev->jiffies_vid_cap = cur_jiffies; @@ -956,8 +960,6 @@ void vivid_stop_generating_vid_cap(struc
/* shutdown control thread */ vivid_grab_controls(dev, false); - mutex_unlock(&dev->mutex); kthread_stop(dev->kthread_vid_cap); dev->kthread_vid_cap = NULL; - mutex_lock(&dev->mutex); } --- a/drivers/media/platform/vivid/vivid-kthread-out.c +++ b/drivers/media/platform/vivid/vivid-kthread-out.c @@ -143,7 +143,11 @@ static int vivid_thread_vid_out(void *da if (kthread_should_stop()) break;
- mutex_lock(&dev->mutex); + if (!mutex_trylock(&dev->mutex)) { + schedule_timeout_uninterruptible(1); + continue; + } + cur_jiffies = jiffies; if (dev->out_seq_resync) { dev->jiffies_vid_out = cur_jiffies; @@ -301,8 +305,6 @@ void vivid_stop_generating_vid_out(struc
/* shutdown control thread */ vivid_grab_controls(dev, false); - mutex_unlock(&dev->mutex); kthread_stop(dev->kthread_vid_out); dev->kthread_vid_out = NULL; - mutex_lock(&dev->mutex); } --- a/drivers/media/platform/vivid/vivid-sdr-cap.c +++ b/drivers/media/platform/vivid/vivid-sdr-cap.c @@ -141,7 +141,11 @@ static int vivid_thread_sdr_cap(void *da if (kthread_should_stop()) break;
- mutex_lock(&dev->mutex); + if (!mutex_trylock(&dev->mutex)) { + schedule_timeout_uninterruptible(1); + continue; + } + cur_jiffies = jiffies; if (dev->sdr_cap_seq_resync) { dev->jiffies_sdr_cap = cur_jiffies; @@ -303,10 +307,8 @@ static void sdr_cap_stop_streaming(struc }
/* shutdown control thread */ - mutex_unlock(&dev->mutex); kthread_stop(dev->kthread_sdr_cap); dev->kthread_sdr_cap = NULL; - mutex_lock(&dev->mutex); }
static void sdr_cap_buf_request_complete(struct vb2_buffer *vb)
 
            From: Alan Stern stern@rowland.harvard.edu
commit c7a191464078262bf799136317c95824e26a222b upstream.
The syzbot fuzzer found two invalid-access bugs in the usbvision driver. These bugs occur when userspace keeps the device file open after the device has been disconnected and usbvision_disconnect() has set usbvision->dev to NULL:
When the device file is closed, usbvision_radio_close() tries to issue a usb_set_interface() call, passing the NULL pointer as its first argument.
If userspace performs a querycap ioctl call, vidioc_querycap() calls usb_make_path() with the same NULL pointer.
This patch fixes the problems by making the appropriate tests beforehand. Note that vidioc_querycap() is protected by usbvision->v4l2_lock, acquired in a higher layer of the V4L2 subsystem.
Reported-and-tested-by: syzbot+7fa38a608b1075dfd634@syzkaller.appspotmail.com
Signed-off-by: Alan Stern stern@rowland.harvard.edu CC: stable@vger.kernel.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/usbvision/usbvision-video.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/media/usb/usbvision/usbvision-video.c +++ b/drivers/media/usb/usbvision/usbvision-video.c @@ -453,6 +453,9 @@ static int vidioc_querycap(struct file * { struct usb_usbvision *usbvision = video_drvdata(file);
+ if (!usbvision->dev) + return -ENODEV; + strscpy(vc->driver, "USBVision", sizeof(vc->driver)); strscpy(vc->card, usbvision_device_data[usbvision->dev_model].model_string, @@ -1099,8 +1102,9 @@ static int usbvision_radio_close(struct mutex_lock(&usbvision->v4l2_lock); /* Set packet size to 0 */ usbvision->iface_alt = 0; - usb_set_interface(usbvision->dev, usbvision->iface, - usbvision->iface_alt); + if (usbvision->dev) + usb_set_interface(usbvision->dev, usbvision->iface, + usbvision->iface_alt);
usbvision_audio_off(usbvision); usbvision->radio = 0;
 
            From: Alan Stern stern@rowland.harvard.edu
commit 9e08117c9d4efc1e1bc6fce83dab856d9fd284b6 upstream.
Visual inspection of the usbvision driver shows that it suffers from three races between its open, close, and disconnect handlers. In particular, the driver is careful to update its usbvision->user and usbvision->remove_pending flags while holding the private mutex, but:
usbvision_v4l2_close() and usbvision_radio_close() don't hold the mutex while they check the value of usbvision->remove_pending;
usbvision_disconnect() doesn't hold the mutex while checking the value of usbvision->user; and
also, usbvision_v4l2_open() and usbvision_radio_open() don't check whether the device has been unplugged before allowing the user to open the device files.
Each of these can potentially lead to usbvision_release() being called twice and use-after-free errors.
This patch fixes the races by reading the flags while the mutex is still held and checking for pending removes before allowing an open to succeed.
Signed-off-by: Alan Stern stern@rowland.harvard.edu CC: stable@vger.kernel.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/usbvision/usbvision-video.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
--- a/drivers/media/usb/usbvision/usbvision-video.c +++ b/drivers/media/usb/usbvision/usbvision-video.c @@ -314,6 +314,10 @@ static int usbvision_v4l2_open(struct fi if (mutex_lock_interruptible(&usbvision->v4l2_lock)) return -ERESTARTSYS;
+ if (usbvision->remove_pending) { + err_code = -ENODEV; + goto unlock; + } if (usbvision->user) { err_code = -EBUSY; } else { @@ -377,6 +381,7 @@ unlock: static int usbvision_v4l2_close(struct file *file) { struct usb_usbvision *usbvision = video_drvdata(file); + int r;
PDEBUG(DBG_IO, "close");
@@ -391,9 +396,10 @@ static int usbvision_v4l2_close(struct f usbvision_scratch_free(usbvision);
usbvision->user--; + r = usbvision->remove_pending; mutex_unlock(&usbvision->v4l2_lock);
- if (usbvision->remove_pending) { + if (r) { printk(KERN_INFO "%s: Final disconnect\n", __func__); usbvision_release(usbvision); return 0; @@ -1064,6 +1070,11 @@ static int usbvision_radio_open(struct f
if (mutex_lock_interruptible(&usbvision->v4l2_lock)) return -ERESTARTSYS; + + if (usbvision->remove_pending) { + err_code = -ENODEV; + goto out; + } err_code = v4l2_fh_open(file); if (err_code) goto out; @@ -1096,6 +1107,7 @@ out: static int usbvision_radio_close(struct file *file) { struct usb_usbvision *usbvision = video_drvdata(file); + int r;
PDEBUG(DBG_IO, "");
@@ -1109,9 +1121,10 @@ static int usbvision_radio_close(struct usbvision_audio_off(usbvision); usbvision->radio = 0; usbvision->user--; + r = usbvision->remove_pending; mutex_unlock(&usbvision->v4l2_lock);
- if (usbvision->remove_pending) { + if (r) { printk(KERN_INFO "%s: Final disconnect\n", __func__); v4l2_fh_release(file); usbvision_release(usbvision); @@ -1543,6 +1556,7 @@ err_usb: static void usbvision_disconnect(struct usb_interface *intf) { struct usb_usbvision *usbvision = to_usbvision(usb_get_intfdata(intf)); + int u;
PDEBUG(DBG_PROBE, "");
@@ -1559,13 +1573,14 @@ static void usbvision_disconnect(struct v4l2_device_disconnect(&usbvision->v4l2_dev); usbvision_i2c_unregister(usbvision); usbvision->remove_pending = 1; /* Now all ISO data will be ignored */ + u = usbvision->user;
usb_put_dev(usbvision->dev); usbvision->dev = NULL; /* USB device is no more */
mutex_unlock(&usbvision->v4l2_lock);
- if (usbvision->user) { + if (u) { printk(KERN_INFO "%s: In use, disconnect pending\n", __func__); wake_up_interruptible(&usbvision->wait_frame);
 
            From: Kai Shen shenkai8@huawei.com
commit e6e8df07268c1f75dd9215536e2ce4587b70f977 upstream.
Add NULL checks to show() and store() in cpufreq.c to avoid attempts to invoke a NULL callback.
Though some interfaces of cpufreq are set as read-only, users can still get write permission using chmod which can lead to a kernel crash, as follows:
chmod +w /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq echo 1 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
This bug was found in linux 4.19.
Signed-off-by: Kai Shen shenkai8@huawei.com Reported-by: Feilong Lin linfeilong@huawei.com Reviewed-by: Feilong Lin linfeilong@huawei.com Acked-by: Viresh Kumar viresh.kumar@linaro.org [ rjw: Subject & changelog ] Cc: All applicable stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/cpufreq.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -933,6 +933,9 @@ static ssize_t show(struct kobject *kobj struct freq_attr *fattr = to_attr(attr); ssize_t ret;
+ if (!fattr->show) + return -EIO; + down_read(&policy->rwsem); ret = fattr->show(policy, buf); up_read(&policy->rwsem); @@ -947,6 +950,9 @@ static ssize_t store(struct kobject *kob struct freq_attr *fattr = to_attr(attr); ssize_t ret = -EINVAL;
+ if (!fattr->store) + return -EIO; + /* * cpus_read_trylock() is used here to work around a circular lock * dependency problem with respect to the cpufreq_register_driver().
 
            From: Thomas Gleixner tglx@linutronix.de
commit ba31c1a48538992316cc71ce94fa9cd3e7b427c0 upstream.
The futex exit handling is #ifdeffed into mm_release() which is not pretty to begin with. But upcoming changes to address futex exit races need to add more functionality to this exit code.
Split it out into a function, move it into futex code and make the various futex exit functions static.
Preparatory only and no functional change.
Folded build fix from Borislav.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.049705556@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/compat.h | 2 -- include/linux/futex.h | 29 ++++++++++++++++------------- kernel/fork.c | 25 +++---------------------- kernel/futex.c | 33 +++++++++++++++++++++++++++++---- 4 files changed, 48 insertions(+), 41 deletions(-)
--- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -410,8 +410,6 @@ struct compat_kexec_segment; struct compat_mq_attr; struct compat_msgbuf;
-extern void compat_exit_robust_list(struct task_struct *curr); - #define BITS_PER_COMPAT_LONG (8*sizeof(compat_long_t))
#define BITS_TO_COMPAT_LONGS(bits) DIV_ROUND_UP(bits, BITS_PER_COMPAT_LONG) --- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -2,7 +2,9 @@ #ifndef _LINUX_FUTEX_H #define _LINUX_FUTEX_H
+#include <linux/sched.h> #include <linux/ktime.h> + #include <uapi/linux/futex.h>
struct inode; @@ -48,15 +50,24 @@ union futex_key { #define FUTEX_KEY_INIT (union futex_key) { .both = { .ptr = NULL } }
#ifdef CONFIG_FUTEX -extern void exit_robust_list(struct task_struct *curr);
-long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, - u32 __user *uaddr2, u32 val2, u32 val3); -#else -static inline void exit_robust_list(struct task_struct *curr) +static inline void futex_init_task(struct task_struct *tsk) { + tsk->robust_list = NULL; +#ifdef CONFIG_COMPAT + tsk->compat_robust_list = NULL; +#endif + INIT_LIST_HEAD(&tsk->pi_state_list); + tsk->pi_state_cache = NULL; }
+void futex_mm_release(struct task_struct *tsk); + +long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, + u32 __user *uaddr2, u32 val2, u32 val3); +#else +static inline void futex_init_task(struct task_struct *tsk) { } +static inline void futex_mm_release(struct task_struct *tsk) { } static inline long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3) @@ -65,12 +76,4 @@ static inline long do_futex(u32 __user * } #endif
-#ifdef CONFIG_FUTEX_PI -extern void exit_pi_state_list(struct task_struct *curr); -#else -static inline void exit_pi_state_list(struct task_struct *curr) -{ -} -#endif - #endif --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1286,20 +1286,7 @@ static int wait_for_vfork_done(struct ta void mm_release(struct task_struct *tsk, struct mm_struct *mm) { /* Get rid of any futexes when releasing the mm */ -#ifdef CONFIG_FUTEX - if (unlikely(tsk->robust_list)) { - exit_robust_list(tsk); - tsk->robust_list = NULL; - } -#ifdef CONFIG_COMPAT - if (unlikely(tsk->compat_robust_list)) { - compat_exit_robust_list(tsk); - tsk->compat_robust_list = NULL; - } -#endif - if (unlikely(!list_empty(&tsk->pi_state_list))) - exit_pi_state_list(tsk); -#endif + futex_mm_release(tsk);
uprobe_free_utask(tsk);
@@ -2062,14 +2049,8 @@ static __latent_entropy struct task_stru #ifdef CONFIG_BLOCK p->plug = NULL; #endif -#ifdef CONFIG_FUTEX - p->robust_list = NULL; -#ifdef CONFIG_COMPAT - p->compat_robust_list = NULL; -#endif - INIT_LIST_HEAD(&p->pi_state_list); - p->pi_state_cache = NULL; -#endif + futex_init_task(p); + /* * sigaltstack should be cleared when sharing the same VM */ --- a/kernel/futex.c +++ b/kernel/futex.c @@ -325,6 +325,12 @@ static inline bool should_fail_futex(boo } #endif /* CONFIG_FAIL_FUTEX */
+#ifdef CONFIG_COMPAT +static void compat_exit_robust_list(struct task_struct *curr); +#else +static inline void compat_exit_robust_list(struct task_struct *curr) { } +#endif + static inline void futex_get_mm(union futex_key *key) { mmgrab(key->private.mm); @@ -890,7 +896,7 @@ static void put_pi_state(struct futex_pi * Kernel cleans up PI-state, but userspace is likely hosed. * (Robust-futex cleanup is separate and might save the day for userspace.) */ -void exit_pi_state_list(struct task_struct *curr) +static void exit_pi_state_list(struct task_struct *curr) { struct list_head *next, *head = &curr->pi_state_list; struct futex_pi_state *pi_state; @@ -960,7 +966,8 @@ void exit_pi_state_list(struct task_stru } raw_spin_unlock_irq(&curr->pi_lock); } - +#else +static inline void exit_pi_state_list(struct task_struct *curr) { } #endif
/* @@ -3588,7 +3595,7 @@ static inline int fetch_robust_entry(str * * We silently return on any sign of list-walking problem. */ -void exit_robust_list(struct task_struct *curr) +static void exit_robust_list(struct task_struct *curr) { struct robust_list_head __user *head = curr->robust_list; struct robust_list __user *entry, *next_entry, *pending; @@ -3653,6 +3660,24 @@ void exit_robust_list(struct task_struct } }
+void futex_mm_release(struct task_struct *tsk) +{ + if (unlikely(tsk->robust_list)) { + exit_robust_list(tsk); + tsk->robust_list = NULL; + } + +#ifdef CONFIG_COMPAT + if (unlikely(tsk->compat_robust_list)) { + compat_exit_robust_list(tsk); + tsk->compat_robust_list = NULL; + } +#endif + + if (unlikely(!list_empty(&tsk->pi_state_list))) + exit_pi_state_list(tsk); +} + long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3) { @@ -3780,7 +3805,7 @@ static void __user *futex_uaddr(struct r * * We silently return on any sign of list-walking problem. */ -void compat_exit_robust_list(struct task_struct *curr) +static void compat_exit_robust_list(struct task_struct *curr) { struct compat_robust_list_head __user *head = curr->compat_robust_list; struct robust_list __user *entry, *next_entry, *pending;
 
            From: Thomas Gleixner tglx@linutronix.de
commit 3d4775df0a89240f671861c6ab6e8d59af8e9e41 upstream.
The futex exit handling relies on PF_ flags. That's suboptimal as it requires a smp_mb() and an ugly lock/unlock of the exiting tasks pi_lock in the middle of do_exit() to enforce the observability of PF_EXITING in the futex code.
Add a futex_state member to task_struct and convert the PF_EXITPIDONE logic over to the new state. The PF_EXITING dependency will be cleaned up in a later step.
This prepares for handling various futex exit issues later.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.149449274@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/futex.h | 33 +++++++++++++++++++++++++++++++++ include/linux/sched.h | 2 +- kernel/exit.c | 18 ++---------------- kernel/futex.c | 25 +++++++++++++------------ 4 files changed, 49 insertions(+), 29 deletions(-)
--- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -50,6 +50,10 @@ union futex_key { #define FUTEX_KEY_INIT (union futex_key) { .both = { .ptr = NULL } }
#ifdef CONFIG_FUTEX +enum { + FUTEX_STATE_OK, + FUTEX_STATE_DEAD, +};
static inline void futex_init_task(struct task_struct *tsk) { @@ -59,6 +63,34 @@ static inline void futex_init_task(struc #endif INIT_LIST_HEAD(&tsk->pi_state_list); tsk->pi_state_cache = NULL; + tsk->futex_state = FUTEX_STATE_OK; +} + +/** + * futex_exit_done - Sets the tasks futex state to FUTEX_STATE_DEAD + * @tsk: task to set the state on + * + * Set the futex exit state of the task lockless. The futex waiter code + * observes that state when a task is exiting and loops until the task has + * actually finished the futex cleanup. The worst case for this is that the + * waiter runs through the wait loop until the state becomes visible. + * + * This has two callers: + * + * - futex_mm_release() after the futex exit cleanup has been done + * + * - do_exit() from the recursive fault handling path. + * + * In case of a recursive fault this is best effort. Either the futex exit + * code has run already or not. If the OWNER_DIED bit has been set on the + * futex then the waiter can take it over. If not, the problem is pushed + * back to user space. If the futex exit code did not run yet, then an + * already queued waiter might block forever, but there is nothing which + * can be done about that. + */ +static inline void futex_exit_done(struct task_struct *tsk) +{ + tsk->futex_state = FUTEX_STATE_DEAD; }
void futex_mm_release(struct task_struct *tsk); @@ -68,6 +100,7 @@ long do_futex(u32 __user *uaddr, int op, #else static inline void futex_init_task(struct task_struct *tsk) { } static inline void futex_mm_release(struct task_struct *tsk) { } +static inline void futex_exit_done(struct task_struct *tsk) { } static inline long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3) --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1054,6 +1054,7 @@ struct task_struct { #endif struct list_head pi_state_list; struct futex_pi_state *pi_state_cache; + unsigned int futex_state; #endif #ifdef CONFIG_PERF_EVENTS struct perf_event_context *perf_event_ctxp[perf_nr_task_contexts]; @@ -1442,7 +1443,6 @@ extern struct pid *cad_pid; */ #define PF_IDLE 0x00000002 /* I am an IDLE thread */ #define PF_EXITING 0x00000004 /* Getting shut down */ -#define PF_EXITPIDONE 0x00000008 /* PI exit done on shut down */ #define PF_VCPU 0x00000010 /* I'm a virtual CPU */ #define PF_WQ_WORKER 0x00000020 /* I'm a workqueue worker */ #define PF_FORKNOEXEC 0x00000040 /* Forked but didn't exec */ --- a/kernel/exit.c +++ b/kernel/exit.c @@ -746,16 +746,7 @@ void __noreturn do_exit(long code) */ if (unlikely(tsk->flags & PF_EXITING)) { pr_alert("Fixing recursive fault but reboot is needed!\n"); - /* - * We can do this unlocked here. The futex code uses - * this flag just to verify whether the pi state - * cleanup has been done or not. In the worst case it - * loops once more. We pretend that the cleanup was - * done as there is no way to return. Either the - * OWNER_DIED bit is set by now or we push the blocked - * task into the wait for ever nirwana as well. - */ - tsk->flags |= PF_EXITPIDONE; + futex_exit_done(tsk); set_current_state(TASK_UNINTERRUPTIBLE); schedule(); } @@ -846,12 +837,7 @@ void __noreturn do_exit(long code) * Make sure we are holding no locks: */ debug_check_no_locks_held(); - /* - * We can do this unlocked here. The futex code uses this flag - * just to verify whether the pi state cleanup has been done - * or not. In the worst case it loops once more. - */ - tsk->flags |= PF_EXITPIDONE; + futex_exit_done(tsk);
if (tsk->io_context) exit_io_context(tsk); --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1182,9 +1182,10 @@ static int handle_exit_race(u32 __user * u32 uval2;
/* - * If PF_EXITPIDONE is not yet set, then try again. + * If the futex exit state is not yet FUTEX_STATE_DEAD, wait + * for it to finish. */ - if (tsk && !(tsk->flags & PF_EXITPIDONE)) + if (tsk && tsk->futex_state != FUTEX_STATE_DEAD) return -EAGAIN;
/* @@ -1203,8 +1204,9 @@ static int handle_exit_race(u32 __user * * *uaddr = 0xC0000000; tsk = get_task(PID); * } if (!tsk->flags & PF_EXITING) { * ... attach(); - * tsk->flags |= PF_EXITPIDONE; } else { - * if (!(tsk->flags & PF_EXITPIDONE)) + * tsk->futex_state = } else { + * FUTEX_STATE_DEAD; if (tsk->futex_state != + * FUTEX_STATE_DEAD) * return -EAGAIN; * return -ESRCH; <--- FAIL * } @@ -1260,17 +1262,16 @@ static int attach_to_pi_owner(u32 __user }
/* - * We need to look at the task state flags to figure out, - * whether the task is exiting. To protect against the do_exit - * change of the task flags, we do this protected by - * p->pi_lock: + * We need to look at the task state to figure out, whether the + * task is exiting. To protect against the change of the task state + * in futex_exit_release(), we do this protected by p->pi_lock: */ raw_spin_lock_irq(&p->pi_lock); - if (unlikely(p->flags & PF_EXITING)) { + if (unlikely(p->futex_state != FUTEX_STATE_OK)) { /* - * The task is on the way out. When PF_EXITPIDONE is - * set, we know that the task has finished the - * cleanup: + * The task is on the way out. When the futex state is + * FUTEX_STATE_DEAD, we know that the task has finished + * the cleanup: */ int ret = handle_exit_race(uaddr, uval, p);
 
            From: Thomas Gleixner tglx@linutronix.de
commit 4610ba7ad877fafc0a25a30c6c82015304120426 upstream.
mm_release() contains the futex exit handling. mm_release() is called from do_exit()->exit_mm() and from exec()->exec_mm().
In the exit_mm() case PF_EXITING and the futex state is updated. In the exec_mm() case these states are not touched.
As the futex exit code needs further protections against exit races, this needs to be split into two functions.
Preparatory only, no functional change.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.240518241@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/exec.c | 2 +- include/linux/sched/mm.h | 6 ++++-- kernel/exit.c | 2 +- kernel/fork.c | 12 +++++++++++- 4 files changed, 17 insertions(+), 5 deletions(-)
--- a/fs/exec.c +++ b/fs/exec.c @@ -1015,7 +1015,7 @@ static int exec_mmap(struct mm_struct *m /* Notify parent that we're no longer interested in the old VM */ tsk = current; old_mm = current->mm; - mm_release(tsk, old_mm); + exec_mm_release(tsk, old_mm);
if (old_mm) { sync_mm_rss(old_mm); --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -117,8 +117,10 @@ extern struct mm_struct *get_task_mm(str * succeeds. */ extern struct mm_struct *mm_access(struct task_struct *task, unsigned int mode); -/* Remove the current tasks stale references to the old mm_struct */ -extern void mm_release(struct task_struct *, struct mm_struct *); +/* Remove the current tasks stale references to the old mm_struct on exit() */ +extern void exit_mm_release(struct task_struct *, struct mm_struct *); +/* Remove the current tasks stale references to the old mm_struct on exec() */ +extern void exec_mm_release(struct task_struct *, struct mm_struct *);
#ifdef CONFIG_MEMCG extern void mm_update_next_owner(struct mm_struct *mm); --- a/kernel/exit.c +++ b/kernel/exit.c @@ -437,7 +437,7 @@ static void exit_mm(void) struct mm_struct *mm = current->mm; struct core_state *core_state;
- mm_release(current, mm); + exit_mm_release(current, mm); if (!mm) return; sync_mm_rss(mm); --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1283,7 +1283,7 @@ static int wait_for_vfork_done(struct ta * restoring the old one. . . * Eric Biederman 10 January 1998 */ -void mm_release(struct task_struct *tsk, struct mm_struct *mm) +static void mm_release(struct task_struct *tsk, struct mm_struct *mm) { /* Get rid of any futexes when releasing the mm */ futex_mm_release(tsk); @@ -1320,6 +1320,16 @@ void mm_release(struct task_struct *tsk, complete_vfork_done(tsk); }
+void exit_mm_release(struct task_struct *tsk, struct mm_struct *mm) +{ + mm_release(tsk, mm); +} + +void exec_mm_release(struct task_struct *tsk, struct mm_struct *mm) +{ + mm_release(tsk, mm); +} + /** * dup_mm() - duplicates an existing mm structure * @tsk: the task_struct with which the new mm will be associated.
 
            From: Thomas Gleixner tglx@linutronix.de
commit 150d71584b12809144b8145b817e83b81158ae5f upstream.
To allow separate handling of the futex exit state in the futex exit code for exit and exec, split futex_mm_release() into two functions and invoke them from the corresponding exit/exec_mm_release() callsites.
Preparatory only, no functional change.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.332094221@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/futex.h | 6 ++++-- kernel/fork.c | 5 ++--- kernel/futex.c | 7 ++++++- 3 files changed, 12 insertions(+), 6 deletions(-)
--- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -93,14 +93,16 @@ static inline void futex_exit_done(struc tsk->futex_state = FUTEX_STATE_DEAD; }
-void futex_mm_release(struct task_struct *tsk); +void futex_exit_release(struct task_struct *tsk); +void futex_exec_release(struct task_struct *tsk);
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3); #else static inline void futex_init_task(struct task_struct *tsk) { } -static inline void futex_mm_release(struct task_struct *tsk) { } static inline void futex_exit_done(struct task_struct *tsk) { } +static inline void futex_exit_release(struct task_struct *tsk) { } +static inline void futex_exec_release(struct task_struct *tsk) { } static inline long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3) --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1285,9 +1285,6 @@ static int wait_for_vfork_done(struct ta */ static void mm_release(struct task_struct *tsk, struct mm_struct *mm) { - /* Get rid of any futexes when releasing the mm */ - futex_mm_release(tsk); - uprobe_free_utask(tsk);
/* Get rid of any cached register state */ @@ -1322,11 +1319,13 @@ static void mm_release(struct task_struc
void exit_mm_release(struct task_struct *tsk, struct mm_struct *mm) { + futex_exit_release(tsk); mm_release(tsk, mm); }
void exec_mm_release(struct task_struct *tsk, struct mm_struct *mm) { + futex_exec_release(tsk); mm_release(tsk, mm); }
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -3661,7 +3661,7 @@ static void exit_robust_list(struct task } }
-void futex_mm_release(struct task_struct *tsk) +void futex_exec_release(struct task_struct *tsk) { if (unlikely(tsk->robust_list)) { exit_robust_list(tsk); @@ -3679,6 +3679,11 @@ void futex_mm_release(struct task_struct exit_pi_state_list(tsk); }
+void futex_exit_release(struct task_struct *tsk) +{ + futex_exec_release(tsk); +} + long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3) {
 
            From: Thomas Gleixner tglx@linutronix.de
commit f24f22435dcc11389acc87e5586239c1819d217c upstream.
Setting task::futex_state in do_exit() is rather arbitrarily placed for no reason. Move it into the futex code.
Note, this is only done for the exit cleanup as the exec cleanup cannot set the state to FUTEX_STATE_DEAD because the task struct is still in active use.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.439511191@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/exit.c | 1 - kernel/futex.c | 1 + 2 files changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/exit.c +++ b/kernel/exit.c @@ -837,7 +837,6 @@ void __noreturn do_exit(long code) * Make sure we are holding no locks: */ debug_check_no_locks_held(); - futex_exit_done(tsk);
if (tsk->io_context) exit_io_context(tsk); --- a/kernel/futex.c +++ b/kernel/futex.c @@ -3682,6 +3682,7 @@ void futex_exec_release(struct task_stru void futex_exit_release(struct task_struct *tsk) { futex_exec_release(tsk); + futex_exit_done(tsk); }
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 
            From: Thomas Gleixner tglx@linutronix.de
commit 18f694385c4fd77a09851fd301236746ca83f3cb upstream.
Instead of relying on PF_EXITING use an explicit state for the futex exit and set it in the futex exit function. This moves the smp barrier and the lock/unlock serialization into the futex code.
As with the DEAD state this is restricted to the exit path as exec continues to use the same task struct.
This allows to simplify that logic in a next step.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.539409004@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/futex.h | 31 +++---------------------------- kernel/exit.c | 13 +------------ kernel/futex.c | 37 ++++++++++++++++++++++++++++++++++++- 3 files changed, 40 insertions(+), 41 deletions(-)
--- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -52,6 +52,7 @@ union futex_key { #ifdef CONFIG_FUTEX enum { FUTEX_STATE_OK, + FUTEX_STATE_EXITING, FUTEX_STATE_DEAD, };
@@ -66,33 +67,7 @@ static inline void futex_init_task(struc tsk->futex_state = FUTEX_STATE_OK; }
-/** - * futex_exit_done - Sets the tasks futex state to FUTEX_STATE_DEAD - * @tsk: task to set the state on - * - * Set the futex exit state of the task lockless. The futex waiter code - * observes that state when a task is exiting and loops until the task has - * actually finished the futex cleanup. The worst case for this is that the - * waiter runs through the wait loop until the state becomes visible. - * - * This has two callers: - * - * - futex_mm_release() after the futex exit cleanup has been done - * - * - do_exit() from the recursive fault handling path. - * - * In case of a recursive fault this is best effort. Either the futex exit - * code has run already or not. If the OWNER_DIED bit has been set on the - * futex then the waiter can take it over. If not, the problem is pushed - * back to user space. If the futex exit code did not run yet, then an - * already queued waiter might block forever, but there is nothing which - * can be done about that. - */ -static inline void futex_exit_done(struct task_struct *tsk) -{ - tsk->futex_state = FUTEX_STATE_DEAD; -} - +void futex_exit_recursive(struct task_struct *tsk); void futex_exit_release(struct task_struct *tsk); void futex_exec_release(struct task_struct *tsk);
@@ -100,7 +75,7 @@ long do_futex(u32 __user *uaddr, int op, u32 __user *uaddr2, u32 val2, u32 val3); #else static inline void futex_init_task(struct task_struct *tsk) { } -static inline void futex_exit_done(struct task_struct *tsk) { } +static inline void futex_exit_recursive(struct task_struct *tsk) { } static inline void futex_exit_release(struct task_struct *tsk) { } static inline void futex_exec_release(struct task_struct *tsk) { } static inline long do_futex(u32 __user *uaddr, int op, u32 val, --- a/kernel/exit.c +++ b/kernel/exit.c @@ -746,23 +746,12 @@ void __noreturn do_exit(long code) */ if (unlikely(tsk->flags & PF_EXITING)) { pr_alert("Fixing recursive fault but reboot is needed!\n"); - futex_exit_done(tsk); + futex_exit_recursive(tsk); set_current_state(TASK_UNINTERRUPTIBLE); schedule(); }
exit_signals(tsk); /* sets PF_EXITING */ - /* - * Ensure that all new tsk->pi_lock acquisitions must observe - * PF_EXITING. Serializes against futex.c:attach_to_pi_owner(). - */ - smp_mb(); - /* - * Ensure that we must observe the pi_state in exit_mm() -> - * mm_release() -> exit_pi_state_list(). - */ - raw_spin_lock_irq(&tsk->pi_lock); - raw_spin_unlock_irq(&tsk->pi_lock);
if (unlikely(in_atomic())) { pr_info("note: %s[%d] exited with preempt_count %d\n", --- a/kernel/futex.c +++ b/kernel/futex.c @@ -3679,10 +3679,45 @@ void futex_exec_release(struct task_stru exit_pi_state_list(tsk); }
+/** + * futex_exit_recursive - Set the tasks futex state to FUTEX_STATE_DEAD + * @tsk: task to set the state on + * + * Set the futex exit state of the task lockless. The futex waiter code + * observes that state when a task is exiting and loops until the task has + * actually finished the futex cleanup. The worst case for this is that the + * waiter runs through the wait loop until the state becomes visible. + * + * This is called from the recursive fault handling path in do_exit(). + * + * This is best effort. Either the futex exit code has run already or + * not. If the OWNER_DIED bit has been set on the futex then the waiter can + * take it over. If not, the problem is pushed back to user space. If the + * futex exit code did not run yet, then an already queued waiter might + * block forever, but there is nothing which can be done about that. + */ +void futex_exit_recursive(struct task_struct *tsk) +{ + tsk->futex_state = FUTEX_STATE_DEAD; +} + void futex_exit_release(struct task_struct *tsk) { + tsk->futex_state = FUTEX_STATE_EXITING; + /* + * Ensure that all new tsk->pi_lock acquisitions must observe + * FUTEX_STATE_EXITING. Serializes against attach_to_pi_owner(). + */ + smp_mb(); + /* + * Ensure that we must observe the pi_state in exit_pi_state_list(). + */ + raw_spin_lock_irq(&tsk->pi_lock); + raw_spin_unlock_irq(&tsk->pi_lock); + futex_exec_release(tsk); - futex_exit_done(tsk); + + tsk->futex_state = FUTEX_STATE_DEAD; }
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 
            From: Thomas Gleixner tglx@linutronix.de
commit 4a8e991b91aca9e20705d434677ac013974e0e30 upstream.
Instead of having a smp_mb() and an empty lock/unlock of task::pi_lock move the state setting into to the lock section.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.645603214@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -3703,16 +3703,19 @@ void futex_exit_recursive(struct task_st
void futex_exit_release(struct task_struct *tsk) { - tsk->futex_state = FUTEX_STATE_EXITING; - /* - * Ensure that all new tsk->pi_lock acquisitions must observe - * FUTEX_STATE_EXITING. Serializes against attach_to_pi_owner(). - */ - smp_mb(); /* - * Ensure that we must observe the pi_state in exit_pi_state_list(). + * Switch the state to FUTEX_STATE_EXITING under tsk->pi_lock. + * + * This ensures that all subsequent checks of tsk->futex_state in + * attach_to_pi_owner() must observe FUTEX_STATE_EXITING with + * tsk->pi_lock held. + * + * It guarantees also that a pi_state which was queued right before + * the state change under tsk->pi_lock by a concurrent waiter must + * be observed in exit_pi_state_list(). */ raw_spin_lock_irq(&tsk->pi_lock); + tsk->futex_state = FUTEX_STATE_EXITING; raw_spin_unlock_irq(&tsk->pi_lock);
futex_exec_release(tsk);
 
            From: Thomas Gleixner tglx@linutronix.de
commit af8cbda2cfcaa5515d61ec500498d46e9a8247e2 upstream.
exec() attempts to handle potentially held futexes gracefully by running the futex exit handling code like exit() does.
The current implementation has no protection against concurrent incoming waiters. The reason is that the futex state cannot be set to FUTEX_STATE_DEAD after the cleanup because the task struct is still active and just about to execute the new binary.
While its arguably buggy when a task holds a futex over exec(), for consistency sake the state handling can at least cover the actual futex exit cleanup section. This provides state consistency protection accross the cleanup. As the futex state of the task becomes FUTEX_STATE_OK after the cleanup has been finished, this cannot prevent subsequent attempts to attach to the task in case that the cleanup was not successfull in mopping up all leftovers.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.753355618@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 38 ++++++++++++++++++++++++++++++++++---- 1 file changed, 34 insertions(+), 4 deletions(-)
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -3661,7 +3661,7 @@ static void exit_robust_list(struct task } }
-void futex_exec_release(struct task_struct *tsk) +static void futex_cleanup(struct task_struct *tsk) { if (unlikely(tsk->robust_list)) { exit_robust_list(tsk); @@ -3701,7 +3701,7 @@ void futex_exit_recursive(struct task_st tsk->futex_state = FUTEX_STATE_DEAD; }
-void futex_exit_release(struct task_struct *tsk) +static void futex_cleanup_begin(struct task_struct *tsk) { /* * Switch the state to FUTEX_STATE_EXITING under tsk->pi_lock. @@ -3717,10 +3717,40 @@ void futex_exit_release(struct task_stru raw_spin_lock_irq(&tsk->pi_lock); tsk->futex_state = FUTEX_STATE_EXITING; raw_spin_unlock_irq(&tsk->pi_lock); +}
- futex_exec_release(tsk); +static void futex_cleanup_end(struct task_struct *tsk, int state) +{ + /* + * Lockless store. The only side effect is that an observer might + * take another loop until it becomes visible. + */ + tsk->futex_state = state; +}
- tsk->futex_state = FUTEX_STATE_DEAD; +void futex_exec_release(struct task_struct *tsk) +{ + /* + * The state handling is done for consistency, but in the case of + * exec() there is no way to prevent futher damage as the PID stays + * the same. But for the unlikely and arguably buggy case that a + * futex is held on exec(), this provides at least as much state + * consistency protection which is possible. + */ + futex_cleanup_begin(tsk); + futex_cleanup(tsk); + /* + * Reset the state to FUTEX_STATE_OK. The task is alive and about + * exec a new binary. + */ + futex_cleanup_end(tsk, FUTEX_STATE_OK); +} + +void futex_exit_release(struct task_struct *tsk) +{ + futex_cleanup_begin(tsk); + futex_cleanup(tsk); + futex_cleanup_end(tsk, FUTEX_STATE_DEAD); }
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 
            From: Thomas Gleixner tglx@linutronix.de
commit 3f186d974826847a07bc7964d79ec4eded475ad9 upstream.
The mutex will be used in subsequent changes to replace the busy looping of a waiter when the futex owner is currently executing the exit cleanup to prevent a potential live lock.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.845798895@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/futex.h | 1 + include/linux/sched.h | 1 + kernel/futex.c | 16 ++++++++++++++++ 3 files changed, 18 insertions(+)
--- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -65,6 +65,7 @@ static inline void futex_init_task(struc INIT_LIST_HEAD(&tsk->pi_state_list); tsk->pi_state_cache = NULL; tsk->futex_state = FUTEX_STATE_OK; + mutex_init(&tsk->futex_exit_mutex); }
void futex_exit_recursive(struct task_struct *tsk); --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1054,6 +1054,7 @@ struct task_struct { #endif struct list_head pi_state_list; struct futex_pi_state *pi_state_cache; + struct mutex futex_exit_mutex; unsigned int futex_state; #endif #ifdef CONFIG_PERF_EVENTS --- a/kernel/futex.c +++ b/kernel/futex.c @@ -3698,12 +3698,23 @@ static void futex_cleanup(struct task_st */ void futex_exit_recursive(struct task_struct *tsk) { + /* If the state is FUTEX_STATE_EXITING then futex_exit_mutex is held */ + if (tsk->futex_state == FUTEX_STATE_EXITING) + mutex_unlock(&tsk->futex_exit_mutex); tsk->futex_state = FUTEX_STATE_DEAD; }
static void futex_cleanup_begin(struct task_struct *tsk) { /* + * Prevent various race issues against a concurrent incoming waiter + * including live locks by forcing the waiter to block on + * tsk->futex_exit_mutex when it observes FUTEX_STATE_EXITING in + * attach_to_pi_owner(). + */ + mutex_lock(&tsk->futex_exit_mutex); + + /* * Switch the state to FUTEX_STATE_EXITING under tsk->pi_lock. * * This ensures that all subsequent checks of tsk->futex_state in @@ -3726,6 +3737,11 @@ static void futex_cleanup_end(struct tas * take another loop until it becomes visible. */ tsk->futex_state = state; + /* + * Drop the exit protection. This unblocks waiters which observed + * FUTEX_STATE_EXITING to reevaluate the state. + */ + mutex_unlock(&tsk->futex_exit_mutex); }
void futex_exec_release(struct task_struct *tsk)
 
            From: Thomas Gleixner tglx@linutronix.de
commit ac31c7ff8624409ba3c4901df9237a616c187a5d upstream.
attach_to_pi_owner() returns -EAGAIN for various cases:
- Owner task is exiting - Futex value has changed
The caller drops the held locks (hash bucket, mmap_sem) and retries the operation. In case of the owner task exiting this can result in a live lock.
As a preparatory step for seperating those cases, provide a distinct return value (EBUSY) for the owner exiting case.
No functional change.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20191106224556.935606117@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -1182,11 +1182,11 @@ static int handle_exit_race(u32 __user * u32 uval2;
/* - * If the futex exit state is not yet FUTEX_STATE_DEAD, wait - * for it to finish. + * If the futex exit state is not yet FUTEX_STATE_DEAD, tell the + * caller that the alleged owner is busy. */ if (tsk && tsk->futex_state != FUTEX_STATE_DEAD) - return -EAGAIN; + return -EBUSY;
/* * Reread the user space value to handle the following situation: @@ -2092,12 +2092,13 @@ retry_private: if (!ret) goto retry; goto out; + case -EBUSY: case -EAGAIN: /* * Two reasons for this: - * - Owner is exiting and we just wait for the + * - EBUSY: Owner is exiting and we just wait for the * exit to complete. - * - The user space value changed. + * - EAGAIN: The user space value changed. */ double_unlock_hb(hb1, hb2); hb_waiters_dec(hb2); @@ -2843,12 +2844,13 @@ retry_private: goto out_unlock_put_key; case -EFAULT: goto uaddr_faulted; + case -EBUSY: case -EAGAIN: /* * Two reasons for this: - * - Task is exiting and we just wait for the + * - EBUSY: Task is exiting and we just wait for the * exit to complete. - * - The user space value changed. + * - EAGAIN: The user space value changed. */ queue_unlock(hb); put_futex_key(&q.key);
 
            From: Thomas Gleixner tglx@linutronix.de
commit 3ef240eaff36b8119ac9e2ea17cbf41179c930ba upstream.
Oleg provided the following test case:
int main(void) { struct sched_param sp = {};
sp.sched_priority = 2; assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0);
int lock = vfork(); if (!lock) { sp.sched_priority = 1; assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0); _exit(0); }
syscall(__NR_futex, &lock, FUTEX_LOCK_PI, 0,0,0); return 0; }
This creates an unkillable RT process spinning in futex_lock_pi() on a UP machine or if the process is affine to a single CPU. The reason is:
parent child
set FIFO prio 2
vfork() -> set FIFO prio 1 implies wait_for_child() sched_setscheduler(...) exit() do_exit() .... mm_release() tsk->futex_state = FUTEX_STATE_EXITING; exit_futex(); (NOOP in this case) complete() --> wakes parent sys_futex() loop infinite because tsk->futex_state == FUTEX_STATE_EXITING
The same problem can happen just by regular preemption as well:
task holds futex ... do_exit() tsk->futex_state = FUTEX_STATE_EXITING;
--> preemption (unrelated wakeup of some other higher prio task, e.g. timer)
switch_to(other_task)
return to user sys_futex() loop infinite as above
Just for the fun of it the futex exit cleanup could trigger the wakeup itself before the task sets its futex state to DEAD.
To cure this, the handling of the exiting owner is changed so:
- A refcount is held on the task
- The task pointer is stored in a caller visible location
- The caller drops all locks (hash bucket, mmap_sem) and blocks on task::futex_exit_mutex. When the mutex is acquired then the exiting task has completed the cleanup and the state is consistent and can be reevaluated.
This is not a pretty solution, but there is no choice other than returning an error code to user space, which would break the state consistency guarantee and open another can of problems including regressions.
For stable backports the preparatory commits ac31c7ff8624 .. ba31c1a48538 are required as well, but for anything older than 5.3.y the backports are going to be provided when this hits mainline as the other dependencies for those kernels are definitely not stable material.
Fixes: 778e9a9c3e71 ("pi-futex: fix exit races and locking problems") Reported-by: Oleg Nesterov oleg@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Stable Team stable@vger.kernel.org Link: https://lkml.kernel.org/r/20191106224557.041676471@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 91 insertions(+), 15 deletions(-)
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -1176,6 +1176,36 @@ out_error: return ret; }
+/** + * wait_for_owner_exiting - Block until the owner has exited + * @exiting: Pointer to the exiting task + * + * Caller must hold a refcount on @exiting. + */ +static void wait_for_owner_exiting(int ret, struct task_struct *exiting) +{ + if (ret != -EBUSY) { + WARN_ON_ONCE(exiting); + return; + } + + if (WARN_ON_ONCE(ret == -EBUSY && !exiting)) + return; + + mutex_lock(&exiting->futex_exit_mutex); + /* + * No point in doing state checking here. If the waiter got here + * while the task was in exec()->exec_futex_release() then it can + * have any FUTEX_STATE_* value when the waiter has acquired the + * mutex. OK, if running, EXITING or DEAD if it reached exit() + * already. Highly unlikely and not a problem. Just one more round + * through the futex maze. + */ + mutex_unlock(&exiting->futex_exit_mutex); + + put_task_struct(exiting); +} + static int handle_exit_race(u32 __user *uaddr, u32 uval, struct task_struct *tsk) { @@ -1237,7 +1267,8 @@ static int handle_exit_race(u32 __user * * it after doing proper sanity checks. */ static int attach_to_pi_owner(u32 __user *uaddr, u32 uval, union futex_key *key, - struct futex_pi_state **ps) + struct futex_pi_state **ps, + struct task_struct **exiting) { pid_t pid = uval & FUTEX_TID_MASK; struct futex_pi_state *pi_state; @@ -1276,7 +1307,19 @@ static int attach_to_pi_owner(u32 __user int ret = handle_exit_race(uaddr, uval, p);
raw_spin_unlock_irq(&p->pi_lock); - put_task_struct(p); + /* + * If the owner task is between FUTEX_STATE_EXITING and + * FUTEX_STATE_DEAD then store the task pointer and keep + * the reference on the task struct. The calling code will + * drop all locks, wait for the task to reach + * FUTEX_STATE_DEAD and then drop the refcount. This is + * required to prevent a live lock when the current task + * preempted the exiting task between the two states. + */ + if (ret == -EBUSY) + *exiting = p; + else + put_task_struct(p); return ret; }
@@ -1315,7 +1358,8 @@ static int attach_to_pi_owner(u32 __user
static int lookup_pi_state(u32 __user *uaddr, u32 uval, struct futex_hash_bucket *hb, - union futex_key *key, struct futex_pi_state **ps) + union futex_key *key, struct futex_pi_state **ps, + struct task_struct **exiting) { struct futex_q *top_waiter = futex_top_waiter(hb, key);
@@ -1330,7 +1374,7 @@ static int lookup_pi_state(u32 __user *u * We are the first waiter - try to look up the owner based on * @uval and attach to it. */ - return attach_to_pi_owner(uaddr, uval, key, ps); + return attach_to_pi_owner(uaddr, uval, key, ps, exiting); }
static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval) @@ -1358,6 +1402,8 @@ static int lock_pi_update_atomic(u32 __u * lookup * @task: the task to perform the atomic lock work for. This will * be "current" except in the case of requeue pi. + * @exiting: Pointer to store the task pointer of the owner task + * which is in the middle of exiting * @set_waiters: force setting the FUTEX_WAITERS bit (1) or not (0) * * Return: @@ -1366,11 +1412,17 @@ static int lock_pi_update_atomic(u32 __u * - <0 - error * * The hb->lock and futex_key refs shall be held by the caller. + * + * @exiting is only set when the return value is -EBUSY. If so, this holds + * a refcount on the exiting task on return and the caller needs to drop it + * after waiting for the exit to complete. */ static int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucket *hb, union futex_key *key, struct futex_pi_state **ps, - struct task_struct *task, int set_waiters) + struct task_struct *task, + struct task_struct **exiting, + int set_waiters) { u32 uval, newval, vpid = task_pid_vnr(task); struct futex_q *top_waiter; @@ -1440,7 +1492,7 @@ static int futex_lock_pi_atomic(u32 __us * attach to the owner. If that fails, no harm done, we only * set the FUTEX_WAITERS bit in the user space variable. */ - return attach_to_pi_owner(uaddr, newval, key, ps); + return attach_to_pi_owner(uaddr, newval, key, ps, exiting); }
/** @@ -1858,6 +1910,8 @@ void requeue_pi_wake_futex(struct futex_ * @key1: the from futex key * @key2: the to futex key * @ps: address to store the pi_state pointer + * @exiting: Pointer to store the task pointer of the owner task + * which is in the middle of exiting * @set_waiters: force setting the FUTEX_WAITERS bit (1) or not (0) * * Try and get the lock on behalf of the top waiter if we can do it atomically. @@ -1865,16 +1919,20 @@ void requeue_pi_wake_futex(struct futex_ * then direct futex_lock_pi_atomic() to force setting the FUTEX_WAITERS bit. * hb1 and hb2 must be held by the caller. * + * @exiting is only set when the return value is -EBUSY. If so, this holds + * a refcount on the exiting task on return and the caller needs to drop it + * after waiting for the exit to complete. + * * Return: * - 0 - failed to acquire the lock atomically; * - >0 - acquired the lock, return value is vpid of the top_waiter * - <0 - error */ -static int futex_proxy_trylock_atomic(u32 __user *pifutex, - struct futex_hash_bucket *hb1, - struct futex_hash_bucket *hb2, - union futex_key *key1, union futex_key *key2, - struct futex_pi_state **ps, int set_waiters) +static int +futex_proxy_trylock_atomic(u32 __user *pifutex, struct futex_hash_bucket *hb1, + struct futex_hash_bucket *hb2, union futex_key *key1, + union futex_key *key2, struct futex_pi_state **ps, + struct task_struct **exiting, int set_waiters) { struct futex_q *top_waiter = NULL; u32 curval; @@ -1911,7 +1969,7 @@ static int futex_proxy_trylock_atomic(u3 */ vpid = task_pid_vnr(top_waiter->task); ret = futex_lock_pi_atomic(pifutex, hb2, key2, ps, top_waiter->task, - set_waiters); + exiting, set_waiters); if (ret == 1) { requeue_pi_wake_futex(top_waiter, key2, hb2); return vpid; @@ -2040,6 +2098,8 @@ retry_private: }
if (requeue_pi && (task_count - nr_wake < nr_requeue)) { + struct task_struct *exiting = NULL; + /* * Attempt to acquire uaddr2 and wake the top waiter. If we * intend to requeue waiters, force setting the FUTEX_WAITERS @@ -2047,7 +2107,8 @@ retry_private: * faults rather in the requeue loop below. */ ret = futex_proxy_trylock_atomic(uaddr2, hb1, hb2, &key1, - &key2, &pi_state, nr_requeue); + &key2, &pi_state, + &exiting, nr_requeue);
/* * At this point the top_waiter has either taken uaddr2 or is @@ -2074,7 +2135,8 @@ retry_private: * If that call succeeds then we have pi_state and an * initial refcount on it. */ - ret = lookup_pi_state(uaddr2, ret, hb2, &key2, &pi_state); + ret = lookup_pi_state(uaddr2, ret, hb2, &key2, + &pi_state, &exiting); }
switch (ret) { @@ -2104,6 +2166,12 @@ retry_private: hb_waiters_dec(hb2); put_futex_key(&key2); put_futex_key(&key1); + /* + * Handle the case where the owner is in the middle of + * exiting. Wait for the exit to complete otherwise + * this task might loop forever, aka. live lock. + */ + wait_for_owner_exiting(ret, exiting); cond_resched(); goto retry; default: @@ -2810,6 +2878,7 @@ static int futex_lock_pi(u32 __user *uad { struct hrtimer_sleeper timeout, *to; struct futex_pi_state *pi_state = NULL; + struct task_struct *exiting = NULL; struct rt_mutex_waiter rt_waiter; struct futex_hash_bucket *hb; struct futex_q q = futex_q_init; @@ -2831,7 +2900,8 @@ retry: retry_private: hb = queue_lock(&q);
- ret = futex_lock_pi_atomic(uaddr, hb, &q.key, &q.pi_state, current, 0); + ret = futex_lock_pi_atomic(uaddr, hb, &q.key, &q.pi_state, current, + &exiting, 0); if (unlikely(ret)) { /* * Atomic work succeeded and we got the lock, @@ -2854,6 +2924,12 @@ retry_private: */ queue_unlock(hb); put_futex_key(&q.key); + /* + * Handle the case where the owner is in the middle of + * exiting. Wait for the exit to complete otherwise + * this task might loop forever, aka. live lock. + */ + wait_for_owner_exiting(ret, exiting); cond_resched(); goto retry; default:
 
            From: Laurent Pinchart laurent.pinchart@ideasonboard.com
commit 8c279e9394cade640ed86ec6c6645a0e7df5e0b6 upstream.
When parsing the UVC control descriptors fails, the error path tries to cleanup a media device that hasn't been initialised, potentially resulting in a crash. Fix this by initialising the media device before the error handling path can be reached.
Fixes: 5a254d751e52 ("[media] uvcvideo: Register a v4l2_device") Reported-by: syzbot+c86454eb3af9e8a4da20@syzkaller.appspotmail.com Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/uvc/uvc_driver.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
--- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -2151,6 +2151,20 @@ static int uvc_probe(struct usb_interfac sizeof(dev->name) - len); }
+ /* Initialize the media device. */ +#ifdef CONFIG_MEDIA_CONTROLLER + dev->mdev.dev = &intf->dev; + strscpy(dev->mdev.model, dev->name, sizeof(dev->mdev.model)); + if (udev->serial) + strscpy(dev->mdev.serial, udev->serial, + sizeof(dev->mdev.serial)); + usb_make_path(udev, dev->mdev.bus_info, sizeof(dev->mdev.bus_info)); + dev->mdev.hw_revision = le16_to_cpu(udev->descriptor.bcdDevice); + media_device_init(&dev->mdev); + + dev->vdev.mdev = &dev->mdev; +#endif + /* Parse the Video Class control descriptor. */ if (uvc_parse_control(dev) < 0) { uvc_trace(UVC_TRACE_PROBE, "Unable to parse UVC " @@ -2171,19 +2185,7 @@ static int uvc_probe(struct usb_interfac "linux-uvc-devel mailing list.\n"); }
- /* Initialize the media device and register the V4L2 device. */ -#ifdef CONFIG_MEDIA_CONTROLLER - dev->mdev.dev = &intf->dev; - strscpy(dev->mdev.model, dev->name, sizeof(dev->mdev.model)); - if (udev->serial) - strscpy(dev->mdev.serial, udev->serial, - sizeof(dev->mdev.serial)); - usb_make_path(udev, dev->mdev.bus_info, sizeof(dev->mdev.bus_info)); - dev->mdev.hw_revision = le16_to_cpu(udev->descriptor.bcdDevice); - media_device_init(&dev->mdev); - - dev->vdev.mdev = &dev->mdev; -#endif + /* Register the V4L2 device. */ if (v4l2_device_register(&intf->dev, &dev->vdev) < 0) goto error;
 
            From: Oliver Neukum oneukum@suse.com
commit 1b976fc6d684e3282914cdbe7a8d68fdce19095c upstream.
The driver needs an isochronous endpoint to be present. It will oops in its absence. Add checking for it.
Reported-by: syzbot+d93dff37e6a89431c158@syzkaller.appspotmail.com Signed-off-by: Oliver Neukum oneukum@suse.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/b2c2/flexcop-usb.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/media/usb/b2c2/flexcop-usb.c +++ b/drivers/media/usb/b2c2/flexcop-usb.c @@ -538,6 +538,9 @@ static int flexcop_usb_probe(struct usb_ struct flexcop_device *fc = NULL; int ret;
+ if (intf->cur_altsetting->desc.bNumEndpoints < 1) + return -ENODEV; + if ((fc = flexcop_device_kmalloc(sizeof(struct flexcop_usb))) == NULL) { err("out of memory\n"); return -ENOMEM;
 
            From: Vito Caputo vcaputo@pengaru.com
commit ca8f245f284eeffa56f3b7a5eb6fc503159ee028 upstream.
Don't use uninitialized ircode[] in cxusb_rc_query() when cxusb_ctrl_msg() fails to populate its contents.
syzbot reported:
dvb-usb: bulk message failed: -22 (1/-30591) ===================================================== BUG: KMSAN: uninit-value in ir_lookup_by_scancode drivers/media/rc/rc-main.c:494 [inline] BUG: KMSAN: uninit-value in rc_g_keycode_from_table drivers/media/rc/rc-main.c:582 [inline] BUG: KMSAN: uninit-value in rc_keydown+0x1a6/0x6f0 drivers/media/rc/rc-main.c:816 CPU: 1 PID: 11436 Comm: kworker/1:2 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events dvb_usb_read_remote_control Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x191/0x1f0 lib/dump_stack.c:113 kmsan_report+0x13a/0x2b0 mm/kmsan/kmsan_report.c:108 __msan_warning+0x73/0xe0 mm/kmsan/kmsan_instr.c:250 bsearch+0x1dd/0x250 lib/bsearch.c:41 ir_lookup_by_scancode drivers/media/rc/rc-main.c:494 [inline] rc_g_keycode_from_table drivers/media/rc/rc-main.c:582 [inline] rc_keydown+0x1a6/0x6f0 drivers/media/rc/rc-main.c:816 cxusb_rc_query+0x2e1/0x360 drivers/media/usb/dvb-usb/cxusb.c:548 dvb_usb_read_remote_control+0xf9/0x290 drivers/media/usb/dvb-usb/dvb-usb-remote.c:261 process_one_work+0x1572/0x1ef0 kernel/workqueue.c:2269 worker_thread+0x111b/0x2460 kernel/workqueue.c:2415 kthread+0x4b5/0x4f0 kernel/kthread.c:256 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355
Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:150 [inline] kmsan_internal_chain_origin+0xd2/0x170 mm/kmsan/kmsan.c:314 __msan_chain_origin+0x6b/0xe0 mm/kmsan/kmsan_instr.c:184 rc_g_keycode_from_table drivers/media/rc/rc-main.c:583 [inline] rc_keydown+0x2c4/0x6f0 drivers/media/rc/rc-main.c:816 cxusb_rc_query+0x2e1/0x360 drivers/media/usb/dvb-usb/cxusb.c:548 dvb_usb_read_remote_control+0xf9/0x290 drivers/media/usb/dvb-usb/dvb-usb-remote.c:261 process_one_work+0x1572/0x1ef0 kernel/workqueue.c:2269 worker_thread+0x111b/0x2460 kernel/workqueue.c:2415 kthread+0x4b5/0x4f0 kernel/kthread.c:256 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355
Local variable description: ----ircode@cxusb_rc_query Variable was created at: cxusb_rc_query+0x4d/0x360 drivers/media/usb/dvb-usb/cxusb.c:543 dvb_usb_read_remote_control+0xf9/0x290 drivers/media/usb/dvb-usb/dvb-usb-remote.c:261
Signed-off-by: Vito Caputo vcaputo@pengaru.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/dvb-usb/cxusb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/media/usb/dvb-usb/cxusb.c +++ b/drivers/media/usb/dvb-usb/cxusb.c @@ -521,7 +521,8 @@ static int cxusb_rc_query(struct dvb_usb { u8 ircode[4];
- cxusb_ctrl_msg(d, CMD_GET_IR_CODE, NULL, 0, ircode, 4); + if (cxusb_ctrl_msg(d, CMD_GET_IR_CODE, NULL, 0, ircode, 4) < 0) + return 0;
if (ircode[2] || ircode[3]) rc_keydown(d->rc_dev, RC_PROTO_NEC,
 
            From: Sean Young sean@mess.org
commit f3f5ba42c58d56d50f539854d8cc188944e96087 upstream.
The touch timer is set up in intf1. If the second interface does not exist, the timer and touch input device are not setup and we get the following error, when touch events are reported via intf0.
kernel BUG at kernel/time/timer.c:956! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc1+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__mod_timer kernel/time/timer.c:956 [inline] RIP: 0010:__mod_timer kernel/time/timer.c:949 [inline] RIP: 0010:mod_timer+0x5a2/0xb50 kernel/time/timer.c:1100 Code: 45 10 c7 44 24 14 ff ff ff ff 48 89 44 24 08 48 8d 45 20 48 c7 44 24 18 00 00 00 00 48 89 04 24 e9 5a fc ff ff e8 ae ce 0e 00 <0f> 0b e8 a7 ce 0e 00 4c 89 74 24 20 e9 37 fe ff ff e8 98 ce 0e 00 RSP: 0018:ffff8881db209930 EFLAGS: 00010006 RAX: ffffffff86c2b200 RBX: 00000000ffffa688 RCX: ffffffff83efc583 RDX: 0000000000000100 RSI: ffffffff812f4d82 RDI: ffff8881d2356200 RBP: ffff8881d23561e8 R08: ffffffff86c2b200 R09: ffffed103a46abeb R10: ffffed103a46abea R11: ffff8881d2355f53 R12: dffffc0000000000 R13: 1ffff1103b64132d R14: ffff8881d2355f50 R15: 0000000000000006 FS: 0000000000000000(0000) GS:ffff8881db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f75e2799000 CR3: 00000001d3b07000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> imon_touch_event drivers/media/rc/imon.c:1348 [inline] imon_incoming_packet.isra.0+0x2546/0x2f10 drivers/media/rc/imon.c:1603 usb_rx_callback_intf0+0x151/0x1e0 drivers/media/rc/imon.c:1734 __usb_hcd_giveback_urb+0x1f2/0x470 drivers/usb/core/hcd.c:1654 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1719 dummy_timer+0x120f/0x2fa2 drivers/usb/gadget/udc/dummy_hcd.c:1965 call_timer_fn+0x179/0x650 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x5e3/0x1490 kernel/time/timer.c:1786 __do_softirq+0x221/0x912 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x178/0x1a0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x12f/0x500 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 </IRQ> RIP: 0010:default_idle+0x28/0x2e0 arch/x86/kernel/process.c:581 Code: 90 90 41 56 41 55 65 44 8b 2d 44 3a 8f 7a 41 54 55 53 0f 1f 44 00 00 e8 36 ee d0 fb e9 07 00 00 00 0f 00 2d fa dd 4f 00 fb f4 <65> 44 8b 2d 20 3a 8f 7a 0f 1f 44 00 00 5b 5d 41 5c 41 5d 41 5e c3 RSP: 0018:ffffffff86c07da8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000007 RBX: ffffffff86c2b200 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffffff86c2ba4c RBP: fffffbfff0d85640 R08: ffffffff86c2b200 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x3b6/0x500 kernel/sched/idle.c:263 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:355 start_kernel+0x82a/0x864 init/main.c:784 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 Modules linked in:
Reported-by: syzbot+f49d12d34f2321cf4df2@syzkaller.appspotmail.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/rc/imon.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/media/rc/imon.c +++ b/drivers/media/rc/imon.c @@ -1598,8 +1598,7 @@ static void imon_incoming_packet(struct spin_unlock_irqrestore(&ictx->kc_lock, flags);
/* send touchscreen events through input subsystem if touchpad data */ - if (ictx->display_type == IMON_DISPLAY_TYPE_VGA && len == 8 && - buf[7] == 0x86) { + if (ictx->touch && len == 8 && buf[7] == 0x86) { imon_touch_event(ictx, buf); return;
 
            From: A Sun as1033x@comcast.net
commit e43148645d18efc3072b1ba45afaa3f385299e55 upstream.
Fix multiple cases of out of bounds (OOB) read associated with MCE device receive/input data handling.
In reference for the OOB cases below, the incoming/read (byte) data format when the MCE device responds to a command is: { cmd_prefix, subcmd, data0, data1, ... } where cmd_prefix are: MCE_CMD_PORT_SYS MCE_CMD_PORT_IR and subcmd examples are: MCE_RSP_GETPORTSTATUS MCE_RSP_EQIRNUMPORTS ... Response size dynamically depends on cmd_prefix and subcmd. So data0, data1, ... may or may not be present on input. Multiple responses may return in a single receiver buffer.
The trigger condition for OOB read is typically random or corrupt input data that fills the mceusb receiver buffer.
Case 1:
mceusb_handle_command() reads data0 (var hi) and data1 (var lo) regardless of whether the response includes such data. If { cmd_prefix, subcmd } is at the end of the receiver buffer, read past end of buffer occurs.
This case was reported by KASAN: slab-out-of-bounds Read in mceusb_dev_recv https://syzkaller.appspot.com/bug?extid=c7fdb6cb36e65f2fe8c9
Fix: In mceusb_handle_command(), change variable hi and lo to pointers, and dereference only when required.
Case 2:
If response with data is truncated at end of buffer after { cmd_prefix, subcmd }, mceusb_handle_command() reads past end of buffer for data0, data1, ...
Fix: In mceusb_process_ir_data(), check response size with remaining buffer size before invoking mceusb_handle_command(). + if (i + ir->rem < buf_len) mceusb_handle_command(ir, &ir->buf_in[i - 1]);
Case 3:
mceusb_handle_command() handles invalid/bad response such as { 0x??, MCE_RSP_GETPORTSTATUS } of length 2 as a response { MCE_CMD_PORT_SYS, MCE_RSP_GETPORTSTATUS, data0, ... } of length 7. Read OOB occurs for non-existent data0, data1, ... Cause is mceusb_handle_command() does not check cmd_prefix value.
Fix: mceusb_handle_command() must test both cmd_prefix and subcmd.
Case 4:
mceusb_process_ir_data() receiver parser state SUBCMD is possible at start (i=0) of receiver buffer resulting in buffer offset=-1 passed to mceusb_dev_printdata(). Bad offset results in OOB read before start of buffer.
[1214218.580308] mceusb 1-1.3:1.0: rx data[0]: 00 80 (length=2) [1214218.580323] mceusb 1-1.3:1.0: Unknown command 0x00 0x80 ... [1214218.580406] mceusb 1-1.3:1.0: rx data[14]: 7f 7f (length=2) [1214218.679311] mceusb 1-1.3:1.0: rx data[-1]: 80 90 (length=2) [1214218.679325] mceusb 1-1.3:1.0: End of raw IR data [1214218.679340] mceusb 1-1.3:1.0: rx data[1]: 7f 7f (length=2)
Fix: If parser_state is SUBCMD after processing receiver buffer, reset parser_state to CMD_HEADER. In effect, discard cmd_prefix at end of receiver buffer. In mceusb_dev_printdata(), abort if buffer offset is out of bounds.
Case 5:
If response with data is truncated at end of buffer after { cmd_prefix, subcmd }, mceusb_dev_printdata() reads past end of buffer for data0, data1, ... while decoding the response to print out.
Fix: In mceusb_dev_printdata(), remove unneeded buffer offset adjustments (var start and var skip) associated with MCE gen1 header. Test for truncated MCE cmd response (compare offset+len with buf_len) and skip decoding of incomplete response. Move IR data tracing to execute before the truncation test.
Signed-off-by: A Sun as1033x@comcast.net Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/rc/mceusb.c | 141 +++++++++++++++++++++++++++++++--------------- 1 file changed, 98 insertions(+), 43 deletions(-)
--- a/drivers/media/rc/mceusb.c +++ b/drivers/media/rc/mceusb.c @@ -564,7 +564,7 @@ static int mceusb_cmd_datasize(u8 cmd, u datasize = 4; break; case MCE_CMD_G_REVISION: - datasize = 2; + datasize = 4; break; case MCE_RSP_EQWAKESUPPORT: case MCE_RSP_GETWAKESOURCE: @@ -600,14 +600,9 @@ static void mceusb_dev_printdata(struct char *inout; u8 cmd, subcmd, *data; struct device *dev = ir->dev; - int start, skip = 0; u32 carrier, period;
- /* skip meaningless 0xb1 0x60 header bytes on orig receiver */ - if (ir->flags.microsoft_gen1 && !out && !offset) - skip = 2; - - if (len <= skip) + if (offset < 0 || offset >= buf_len) return;
dev_dbg(dev, "%cx data[%d]: %*ph (len=%d sz=%d)", @@ -616,11 +611,32 @@ static void mceusb_dev_printdata(struct
inout = out ? "Request" : "Got";
- start = offset + skip; - cmd = buf[start] & 0xff; - subcmd = buf[start + 1] & 0xff; - data = buf + start + 2; + cmd = buf[offset]; + subcmd = (offset + 1 < buf_len) ? buf[offset + 1] : 0; + data = &buf[offset] + 2; + + /* Trace meaningless 0xb1 0x60 header bytes on original receiver */ + if (ir->flags.microsoft_gen1 && !out && !offset) { + dev_dbg(dev, "MCE gen 1 header"); + return; + }
+ /* Trace IR data header or trailer */ + if (cmd != MCE_CMD_PORT_IR && + (cmd & MCE_PORT_MASK) == MCE_COMMAND_IRDATA) { + if (cmd == MCE_IRDATA_TRAILER) + dev_dbg(dev, "End of raw IR data"); + else + dev_dbg(dev, "Raw IR data, %d pulse/space samples", + cmd & MCE_PACKET_LENGTH_MASK); + return; + } + + /* Unexpected end of buffer? */ + if (offset + len > buf_len) + return; + + /* Decode MCE command/response */ switch (cmd) { case MCE_CMD_NULL: if (subcmd == MCE_CMD_NULL) @@ -644,7 +660,7 @@ static void mceusb_dev_printdata(struct dev_dbg(dev, "Get hw/sw rev?"); else dev_dbg(dev, "hw/sw rev %*ph", - 4, &buf[start + 2]); + 4, &buf[offset + 2]); break; case MCE_CMD_RESUME: dev_dbg(dev, "Device resume requested"); @@ -746,13 +762,6 @@ static void mceusb_dev_printdata(struct default: break; } - - if (cmd == MCE_IRDATA_TRAILER) - dev_dbg(dev, "End of raw IR data"); - else if ((cmd != MCE_CMD_PORT_IR) && - ((cmd & MCE_PORT_MASK) == MCE_COMMAND_IRDATA)) - dev_dbg(dev, "Raw IR data, %d pulse/space samples", - cmd & MCE_PACKET_LENGTH_MASK); #endif }
@@ -1136,32 +1145,62 @@ static int mceusb_set_rx_carrier_report( }
/* + * Handle PORT_SYS/IR command response received from the MCE device. + * + * Assumes single response with all its data (not truncated) + * in buf_in[]. The response itself determines its total length + * (mceusb_cmd_datasize() + 2) and hence the minimum size of buf_in[]. + * * We don't do anything but print debug spew for many of the command bits * we receive from the hardware, but some of them are useful information * we want to store so that we can use them. */ -static void mceusb_handle_command(struct mceusb_dev *ir, int index) +static void mceusb_handle_command(struct mceusb_dev *ir, u8 *buf_in) { + u8 cmd = buf_in[0]; + u8 subcmd = buf_in[1]; + u8 *hi = &buf_in[2]; /* read only when required */ + u8 *lo = &buf_in[3]; /* read only when required */ struct ir_raw_event rawir = {}; - u8 hi = ir->buf_in[index + 1] & 0xff; - u8 lo = ir->buf_in[index + 2] & 0xff; u32 carrier_cycles; u32 cycles_fix;
- switch (ir->buf_in[index]) { - /* the one and only 5-byte return value command */ - case MCE_RSP_GETPORTSTATUS: - if ((ir->buf_in[index + 4] & 0xff) == 0x00) - ir->txports_cabled |= 1 << hi; - break; + if (cmd == MCE_CMD_PORT_SYS) { + switch (subcmd) { + /* the one and only 5-byte return value command */ + case MCE_RSP_GETPORTSTATUS: + if (buf_in[5] == 0) + ir->txports_cabled |= 1 << *hi; + break;
+ /* 1-byte return value commands */ + case MCE_RSP_EQEMVER: + ir->emver = *hi; + break; + + /* No return value commands */ + case MCE_RSP_CMD_ILLEGAL: + ir->need_reset = true; + break; + + default: + break; + } + + return; + } + + if (cmd != MCE_CMD_PORT_IR) + return; + + switch (subcmd) { /* 2-byte return value commands */ case MCE_RSP_EQIRTIMEOUT: - ir->rc->timeout = US_TO_NS((hi << 8 | lo) * MCE_TIME_UNIT); + ir->rc->timeout = US_TO_NS((*hi << 8 | *lo) * MCE_TIME_UNIT); break; case MCE_RSP_EQIRNUMPORTS: - ir->num_txports = hi; - ir->num_rxports = lo; + ir->num_txports = *hi; + ir->num_rxports = *lo; break; case MCE_RSP_EQIRRXCFCNT: /* @@ -1174,7 +1213,7 @@ static void mceusb_handle_command(struct */ if (ir->carrier_report_enabled && ir->learning_active && ir->pulse_tunit > 0) { - carrier_cycles = (hi << 8 | lo); + carrier_cycles = (*hi << 8 | *lo); /* * Adjust carrier cycle count by adding * 1 missed count per pulse "on" @@ -1192,24 +1231,24 @@ static void mceusb_handle_command(struct break;
/* 1-byte return value commands */ - case MCE_RSP_EQEMVER: - ir->emver = hi; - break; case MCE_RSP_EQIRTXPORTS: - ir->tx_mask = hi; + ir->tx_mask = *hi; break; case MCE_RSP_EQIRRXPORTEN: - ir->learning_active = ((hi & 0x02) == 0x02); - if (ir->rxports_active != hi) { + ir->learning_active = ((*hi & 0x02) == 0x02); + if (ir->rxports_active != *hi) { dev_info(ir->dev, "%s-range (0x%x) receiver active", - ir->learning_active ? "short" : "long", hi); - ir->rxports_active = hi; + ir->learning_active ? "short" : "long", *hi); + ir->rxports_active = *hi; } break; + + /* No return value commands */ case MCE_RSP_CMD_ILLEGAL: case MCE_RSP_TX_TIMEOUT: ir->need_reset = true; break; + default: break; } @@ -1235,7 +1274,8 @@ static void mceusb_process_ir_data(struc ir->rem = mceusb_cmd_datasize(ir->cmd, ir->buf_in[i]); mceusb_dev_printdata(ir, ir->buf_in, buf_len, i - 1, ir->rem + 2, false); - mceusb_handle_command(ir, i); + if (i + ir->rem < buf_len) + mceusb_handle_command(ir, &ir->buf_in[i - 1]); ir->parser_state = CMD_DATA; break; case PARSE_IRDATA: @@ -1264,15 +1304,22 @@ static void mceusb_process_ir_data(struc ir->rem--; break; case CMD_HEADER: - /* decode mce packets of the form (84),AA,BB,CC,DD */ - /* IR data packets can span USB messages - rem */ ir->cmd = ir->buf_in[i]; if ((ir->cmd == MCE_CMD_PORT_IR) || ((ir->cmd & MCE_PORT_MASK) != MCE_COMMAND_IRDATA)) { + /* + * got PORT_SYS, PORT_IR, or unknown + * command response prefix + */ ir->parser_state = SUBCMD; continue; } + /* + * got IR data prefix (0x80 + num_bytes) + * decode MCE packets of the form {0x83, AA, BB, CC} + * IR data packets can span USB messages + */ ir->rem = (ir->cmd & MCE_PACKET_LENGTH_MASK); mceusb_dev_printdata(ir, ir->buf_in, buf_len, i, ir->rem + 1, false); @@ -1296,6 +1343,14 @@ static void mceusb_process_ir_data(struc if (ir->parser_state != CMD_HEADER && !ir->rem) ir->parser_state = CMD_HEADER; } + + /* + * Accept IR data spanning multiple rx buffers. + * Reject MCE command response spanning multiple rx buffers. + */ + if (ir->parser_state != PARSE_IRDATA || !ir->rem) + ir->parser_state = CMD_HEADER; + if (event) { dev_dbg(ir->dev, "processed IR data"); ir_raw_event_handle(ir->rc);
 
            From: Takashi Iwai tiwai@suse.de
commit 5a858e79c911330678b5a9be91a24830e94a0dc9 upstream.
The old Nvidia chips have multiple HD-audio codecs on the same HD-audio controller, and this doesn't work as expected with the current audio component binding that is implemented under the one-codec-per- controller assumption; at the probe time, the driver leads to several kernel WARNING messages.
For the proper support, we may change the pin2port and port2pin to traverse the codec list per the given pin number, but this needs more development and testing.
As a quick workaround, instead, this patch drops the binding in the audio side for these legacy chips since the audio component support in nouveau graphics driver is still not merged (hence it's basically unused).
[ Unlike the original commit, this patch actually disables the audio component binding for all Nvidia chips, not only for legacy chips. It doesn't matter much, though: nouveau gfx driver still doesn't provide the audio component binding on 5.4.y, so it's only a placeholder for now. Also, another difference from the original commit is that this removes the nvhdmi_audio_ops and other definitions completely in order to avoid a compile warning due to unused stuff. -- tiwai ]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205625 Fixes: ade49db337a9 ("ALSA: hda/hdmi - Allow audio component for AMD/ATI and Nvidia HDMI") Link: https://lore.kernel.org/r/20191122132000.4460-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_hdmi.c | 22 ---------------------- 1 file changed, 22 deletions(-)
--- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -3454,26 +3454,6 @@ static int nvhdmi_chmap_validate(struct return 0; }
-/* map from pin NID to port; port is 0-based */ -/* for Nvidia: assume widget NID starting from 4, with step 1 (4, 5, 6, ...) */ -static int nvhdmi_pin2port(void *audio_ptr, int pin_nid) -{ - return pin_nid - 4; -} - -/* reverse-map from port to pin NID: see above */ -static int nvhdmi_port2pin(struct hda_codec *codec, int port) -{ - return port + 4; -} - -static const struct drm_audio_component_audio_ops nvhdmi_audio_ops = { - .pin2port = nvhdmi_pin2port, - .pin_eld_notify = generic_acomp_pin_eld_notify, - .master_bind = generic_acomp_master_bind, - .master_unbind = generic_acomp_master_unbind, -}; - static int patch_nvhdmi(struct hda_codec *codec) { struct hdmi_spec *spec; @@ -3492,8 +3472,6 @@ static int patch_nvhdmi(struct hda_codec
codec->link_down_at_suspend = 1;
- generic_acomp_init(codec, &nvhdmi_audio_ops, nvhdmi_port2pin); - return 0; }
 
            From: Oliver Neukum oneukum@suse.com
commit 1ec13abac58b6f24e32f0d3081ef4e7456e62ed8 upstream.
USBIP uses lib/scatterlist.h Hence it needs to set CONFIG_SGL_ALLOC
Signed-off-by: Oliver Neukum oneukum@suse.com Cc: stable stable@vger.kernel.org Acked-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/20191112154939.21217-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/usbip/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/usbip/Kconfig +++ b/drivers/usb/usbip/Kconfig @@ -4,6 +4,7 @@ config USBIP_CORE tristate "USB/IP support" depends on NET select USB_COMMON + select SGL_ALLOC ---help--- This enables pushing USB packets over IP to allow remote machines direct access to USB devices. It provides the
 
            From: Hewenliang hewenliang4@huawei.com
commit 26a4d4c00f85cb844dd11dd35e848b079c2f5e8f upstream.
We should close the fd before the return of read_attr_usbip_status.
Fixes: 3391ba0e2792 ("usbip: tools: Extract generic code to be shared with vudc backend") Signed-off-by: Hewenliang hewenliang4@huawei.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20191025043515.20053-1-hewenliang4@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/usb/usbip/libsrc/usbip_host_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/usb/usbip/libsrc/usbip_host_common.c +++ b/tools/usb/usbip/libsrc/usbip_host_common.c @@ -57,7 +57,7 @@ static int32_t read_attr_usbip_status(st }
value = atoi(status); - + close(fd); return value; }
 
            From: Suwan Kim suwan.kim027@gmail.com
commit 2a9125317b247f2cf35c196f968906dcf062ae2d upstream.
Smatch reported that nents is not initialized and used in stub_recv_cmd_submit(). nents is currently initialized by sgl_alloc() and used to allocate multiple URBs when host controller doesn't support scatter-gather DMA. The use of uninitialized nents means that buf_len is zero and use_sg is true. But buffer length should not be zero when an URB uses scatter-gather DMA.
To prevent this situation, add the conditional that checks buf_len and use_sg. And move the use of nents right after the sgl_alloc() to avoid the use of uninitialized nents.
If the error occurs, it adds SDEV_EVENT_ERROR_MALLOC and stub_priv will be released by stub event handler and connection will be shut down.
Fixes: ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver") Reported-by: kbuild test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Suwan Kim suwan.kim027@gmail.com Acked-by: Shuah Khan skhan@linuxfoundation.org Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20191111141035.27788-1-suwan.kim027@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/usbip/stub_rx.c | 50 ++++++++++++++++++++++++++++---------------- 1 file changed, 32 insertions(+), 18 deletions(-)
--- a/drivers/usb/usbip/stub_rx.c +++ b/drivers/usb/usbip/stub_rx.c @@ -470,18 +470,50 @@ static void stub_recv_cmd_submit(struct if (pipe == -1) return;
+ /* + * Smatch reported the error case where use_sg is true and buf_len is 0. + * In this case, It adds SDEV_EVENT_ERROR_MALLOC and stub_priv will be + * released by stub event handler and connection will be shut down. + */ priv = stub_priv_alloc(sdev, pdu); if (!priv) return;
buf_len = (unsigned long long)pdu->u.cmd_submit.transfer_buffer_length;
+ if (use_sg && !buf_len) { + dev_err(&udev->dev, "sg buffer with zero length\n"); + goto err_malloc; + } + /* allocate urb transfer buffer, if needed */ if (buf_len) { if (use_sg) { sgl = sgl_alloc(buf_len, GFP_KERNEL, &nents); if (!sgl) goto err_malloc; + + /* Check if the server's HCD supports SG */ + if (!udev->bus->sg_tablesize) { + /* + * If the server's HCD doesn't support SG, break + * a single SG request into several URBs and map + * each SG list entry to corresponding URB + * buffer. The previously allocated SG list is + * stored in priv->sgl (If the server's HCD + * support SG, SG list is stored only in + * urb->sg) and it is used as an indicator that + * the server split single SG request into + * several URBs. Later, priv->sgl is used by + * stub_complete() and stub_send_ret_submit() to + * reassemble the divied URBs. + */ + support_sg = 0; + num_urbs = nents; + priv->completed_urbs = 0; + pdu->u.cmd_submit.transfer_flags &= + ~URB_DMA_MAP_SG; + } } else { buffer = kzalloc(buf_len, GFP_KERNEL); if (!buffer) @@ -489,24 +521,6 @@ static void stub_recv_cmd_submit(struct } }
- /* Check if the server's HCD supports SG */ - if (use_sg && !udev->bus->sg_tablesize) { - /* - * If the server's HCD doesn't support SG, break a single SG - * request into several URBs and map each SG list entry to - * corresponding URB buffer. The previously allocated SG - * list is stored in priv->sgl (If the server's HCD support SG, - * SG list is stored only in urb->sg) and it is used as an - * indicator that the server split single SG request into - * several URBs. Later, priv->sgl is used by stub_complete() and - * stub_send_ret_submit() to reassemble the divied URBs. - */ - support_sg = 0; - num_urbs = nents; - priv->completed_urbs = 0; - pdu->u.cmd_submit.transfer_flags &= ~URB_DMA_MAP_SG; - } - /* allocate urb array */ priv->num_urbs = num_urbs; priv->urbs = kmalloc_array(num_urbs, sizeof(*priv->urbs), GFP_KERNEL);
 
            From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 347bc8cb26388791c5881a3775cb14a3f765a674 upstream.
Add support for the Mark-10 digital force gauge device to the cp201x driver.
Based on a report and a larger patch from Joel Jennings
Reported-by: Joel Jennings joel.jennings@makeitlabs.com Cc: stable stable@vger.kernel.org Acked-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20191118092119.GA153852@kroah.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -125,6 +125,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x10C4, 0x8341) }, /* Siemens MC35PU GPRS Modem */ { USB_DEVICE(0x10C4, 0x8382) }, /* Cygnal Integrated Products, Inc. */ { USB_DEVICE(0x10C4, 0x83A8) }, /* Amber Wireless AMB2560 */ + { USB_DEVICE(0x10C4, 0x83AA) }, /* Mark-10 Digital Force Gauge */ { USB_DEVICE(0x10C4, 0x83D8) }, /* DekTec DTA Plus VHF/UHF Booster/Attenuator */ { USB_DEVICE(0x10C4, 0x8411) }, /* Kyocera GPS Module */ { USB_DEVICE(0x10C4, 0x8418) }, /* IRZ Automation Teleport SG-10 GSM/GPRS Modem */
 
            From: Oliver Neukum oneukum@suse.com
commit 92aa5986f4f7b5a8bf282ca0f50967f4326559f5 upstream.
In case of a timeout or if a signal aborts a read communication with the device needs to be ended lest we overwrite an active URB the next time we do IO to the device, as the URB may still be active.
Signed-off-by: Oliver Neukum oneukum@suse.de Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20191107142856.16774-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/chaoskey.c | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-)
--- a/drivers/usb/misc/chaoskey.c +++ b/drivers/usb/misc/chaoskey.c @@ -384,13 +384,17 @@ static int _chaoskey_fill(struct chaoske !dev->reading, (started ? NAK_TIMEOUT : ALEA_FIRST_TIMEOUT) );
- if (result < 0) + if (result < 0) { + usb_kill_urb(dev->urb); goto out; + }
- if (result == 0) + if (result == 0) { result = -ETIMEDOUT; - else + usb_kill_urb(dev->urb); + } else { result = dev->valid; + } out: /* Let the device go back to sleep eventually */ usb_autopm_put_interface(dev->interface); @@ -526,7 +530,21 @@ static int chaoskey_suspend(struct usb_i
static int chaoskey_resume(struct usb_interface *interface) { + struct chaoskey *dev; + struct usb_device *udev = interface_to_usbdev(interface); + usb_dbg(interface, "resume"); + dev = usb_get_intfdata(interface); + + /* + * We may have lost power. + * In that case the device that needs a long time + * for the first requests needs an extended timeout + * again + */ + if (le16_to_cpu(udev->descriptor.idVendor) == ALEA_VENDOR_ID) + dev->reads_started = false; + return 0; } #else
 
            From: Oliver Neukum oneukum@suse.com
commit 91feb01596e5efc0cc922cc73f5583114dccf4d2 upstream.
The work item can operate on
1. stale memory left over from the last transfer the actual length of the data transfered needs to be checked 2. memory already freed the error handling in appledisplay_probe() needs to cancel the work in that case
Reported-and-tested-by: syzbot+495dab1f175edc9c2f13@syzkaller.appspotmail.com Signed-off-by: Oliver Neukum oneukum@suse.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20191106124902.7765-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/appledisplay.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/usb/misc/appledisplay.c +++ b/drivers/usb/misc/appledisplay.c @@ -164,7 +164,12 @@ static int appledisplay_bl_get_brightnes 0, pdata->msgdata, 2, ACD_USB_TIMEOUT); - brightness = pdata->msgdata[1]; + if (retval < 2) { + if (retval >= 0) + retval = -EMSGSIZE; + } else { + brightness = pdata->msgdata[1]; + } mutex_unlock(&pdata->sysfslock);
if (retval < 0) @@ -299,6 +304,7 @@ error: if (pdata) { if (pdata->urb) { usb_kill_urb(pdata->urb); + cancel_delayed_work_sync(&pdata->work); if (pdata->urbdata) usb_free_coherent(pdata->udev, ACD_URB_BUFFER_LEN, pdata->urbdata, pdata->urb->transfer_dma);
 
            From: Pavel Löbl pavel@loebl.cz
commit e696d00e65e81d46e911f24b12e441037bf11b38 upstream.
Add USB ID for MOXA UPort 2210. This device contains mos7820 but it passes GPIO0 check implemented by driver and it's detected as mos7840. Hence product id check is added to force mos7820 mode.
Signed-off-by: Pavel Löbl pavel@loebl.cz Cc: stable stable@vger.kernel.org [ johan: rename id defines and add vendor-id check ] Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/mos7840.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/usb/serial/mos7840.c +++ b/drivers/usb/serial/mos7840.c @@ -119,11 +119,15 @@ /* This driver also supports * ATEN UC2324 device using Moschip MCS7840 * ATEN UC2322 device using Moschip MCS7820 + * MOXA UPort 2210 device using Moschip MCS7820 */ #define USB_VENDOR_ID_ATENINTL 0x0557 #define ATENINTL_DEVICE_ID_UC2324 0x2011 #define ATENINTL_DEVICE_ID_UC2322 0x7820
+#define USB_VENDOR_ID_MOXA 0x110a +#define MOXA_DEVICE_ID_2210 0x2210 + /* Interrupt Routine Defines */
#define SERIAL_IIR_RLS 0x06 @@ -195,6 +199,7 @@ static const struct usb_device_id id_tab {USB_DEVICE(USB_VENDOR_ID_BANDB, BANDB_DEVICE_ID_USOPTL2_4)}, {USB_DEVICE(USB_VENDOR_ID_ATENINTL, ATENINTL_DEVICE_ID_UC2324)}, {USB_DEVICE(USB_VENDOR_ID_ATENINTL, ATENINTL_DEVICE_ID_UC2322)}, + {USB_DEVICE(USB_VENDOR_ID_MOXA, MOXA_DEVICE_ID_2210)}, {} /* terminating entry */ }; MODULE_DEVICE_TABLE(usb, id_table); @@ -2020,6 +2025,7 @@ static int mos7840_probe(struct usb_seri const struct usb_device_id *id) { u16 product = le16_to_cpu(serial->dev->descriptor.idProduct); + u16 vid = le16_to_cpu(serial->dev->descriptor.idVendor); u8 *buf; int device_type;
@@ -2030,6 +2036,11 @@ static int mos7840_probe(struct usb_seri goto out; }
+ if (vid == USB_VENDOR_ID_MOXA && product == MOXA_DEVICE_ID_2210) { + device_type = MOSCHIP_DEVICE_ID_7820; + goto out; + } + buf = kzalloc(VENDOR_READ_LENGTH, GFP_KERNEL); if (!buf) return -ENOMEM;
 
            From: Johan Hovold johan@kernel.org
commit ea422312a462696093b5db59d294439796cba4ad upstream.
The driver was setting the device remote-wakeup feature during probe in violation of the USB specification (which says it should only be set just prior to suspending the device). This could potentially waste power during suspend as well as lead to spurious wakeups.
Note that USB core would clear the remote-wakeup feature at first resume.
Fixes: 0f64478cbc7a ("USB: add USB serial mos7720 driver") Cc: stable stable@vger.kernel.org # 2.6.19 Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/mos7720.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/usb/serial/mos7720.c +++ b/drivers/usb/serial/mos7720.c @@ -1833,10 +1833,6 @@ static int mos7720_startup(struct usb_se product = le16_to_cpu(serial->dev->descriptor.idProduct); dev = serial->dev;
- /* setting configuration feature to one */ - usb_control_msg(serial->dev, usb_sndctrlpipe(serial->dev, 0), - (__u8)0x03, 0x00, 0x01, 0x00, NULL, 0x00, 5000); - if (product == MOSCHIP_DEVICE_ID_7715) { struct urb *urb = serial->port[0]->interrupt_in_urb;
 
            From: Johan Hovold johan@kernel.org
commit 92fe35fb9c70a00d8fbbf5bd6172c921dd9c7815 upstream.
The driver was setting the device remote-wakeup feature during probe in violation of the USB specification (which says it should only be set just prior to suspending the device). This could potentially waste power during suspend as well as lead to spurious wakeups.
Note that USB core would clear the remote-wakeup feature at first resume.
Fixes: 3f5429746d91 ("USB: Moschip 7840 USB-Serial Driver") Cc: stable stable@vger.kernel.org # 2.6.19 Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/mos7840.c | 5 ----- 1 file changed, 5 deletions(-)
--- a/drivers/usb/serial/mos7840.c +++ b/drivers/usb/serial/mos7840.c @@ -2290,11 +2290,6 @@ out: goto error; } else dev_dbg(&port->dev, "ZLP_REG5 Writing success status%d\n", status); - - /* setting configuration feature to one */ - usb_control_msg(serial->dev, usb_sndctrlpipe(serial->dev, 0), - 0x03, 0x00, 0x01, 0x00, NULL, 0x00, - MOS_WDR_TIMEOUT); } return 0; error:
 
            From: Aleksander Morgado aleksander@aleksander.es
commit 957c31ea082e3fe5196f46d5b04018b10de47400 upstream.
The device exposes AT, NMEA and DIAG ports in both USB configurations. Exactly same layout as the default DW5821e module, just a different vid/pid.
P: Vendor=413c ProdID=81e0 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5821e-eSIM Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
P: Vendor=413c ProdID=81e0 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5821e-eSIM Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 7 Cfg#= 2 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim I: If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#=0x6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
Signed-off-by: Aleksander Morgado aleksander@aleksander.es Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/option.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -197,6 +197,7 @@ static void option_instat_callback(struc #define DELL_PRODUCT_5804_MINICARD_ATT 0x819b /* Novatel E371 */
#define DELL_PRODUCT_5821E 0x81d7 +#define DELL_PRODUCT_5821E_ESIM 0x81e0
#define KYOCERA_VENDOR_ID 0x0c88 #define KYOCERA_PRODUCT_KPC650 0x17da @@ -1044,6 +1045,8 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, DELL_PRODUCT_5804_MINICARD_ATT, 0xff, 0xff, 0xff) }, { USB_DEVICE(DELL_VENDOR_ID, DELL_PRODUCT_5821E), .driver_info = RSVD(0) | RSVD(1) | RSVD(6) }, + { USB_DEVICE(DELL_VENDOR_ID, DELL_PRODUCT_5821E_ESIM), + .driver_info = RSVD(0) | RSVD(1) | RSVD(6) }, { USB_DEVICE(ANYDATA_VENDOR_ID, ANYDATA_PRODUCT_ADU_E100A) }, /* ADU-E100, ADU-310 */ { USB_DEVICE(ANYDATA_VENDOR_ID, ANYDATA_PRODUCT_ADU_500A) }, { USB_DEVICE(ANYDATA_VENDOR_ID, ANYDATA_PRODUCT_ADU_620UW) },
 
            From: Aleksander Morgado aleksander@aleksander.es
commit f0797095423e6ea3b4be61134ee353c7f504d440 upstream.
These are the Foxconn-branded variants of the Dell DW5821e modules, same USB layout as those. The device exposes AT, NMEA and DIAG ports in both USB configurations.
P: Vendor=0489 ProdID=e0b4 Rev=03.18 S: Manufacturer=FII S: Product=T77W968 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
P: Vendor=0489 ProdID=e0b4 Rev=03.18 S: Manufacturer=FII S: Product=T77W968 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 7 Cfg#= 2 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim I: If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#=0x6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
Signed-off-by: Aleksander Morgado aleksander@aleksander.es [ johan: drop id defines ] Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1993,6 +1993,10 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(0x03f0, 0xa31d, 0xff, 0x06, 0x13) }, { USB_DEVICE_AND_INTERFACE_INFO(0x03f0, 0xa31d, 0xff, 0x06, 0x14) }, { USB_DEVICE_AND_INTERFACE_INFO(0x03f0, 0xa31d, 0xff, 0x06, 0x1b) }, + { USB_DEVICE(0x0489, 0xe0b4), /* Foxconn T77W968 */ + .driver_info = RSVD(0) | RSVD(1) | RSVD(6) }, + { USB_DEVICE(0x0489, 0xe0b5), /* Foxconn T77W968 ESIM */ + .driver_info = RSVD(0) | RSVD(1) | RSVD(6) }, { USB_DEVICE(0x1508, 0x1001), /* Fibocom NL668 */ .driver_info = RSVD(4) | RSVD(5) | RSVD(6) }, { USB_DEVICE(0x2cb7, 0x0104), /* Fibocom NL678 series */
 
            From: Bernd Porr mail@berndporr.me.uk
commit 5618332e5b955b4bff06d0b88146b971c8dd7b32 upstream.
The userspace comedilib function 'get_cmd_generic_timed' fills the cmd structure with an informed guess and then calls the function 'usbduxfast_ai_cmdtest' in this driver repeatedly while 'usbduxfast_ai_cmdtest' is modifying the cmd struct until it no longer changes. However, because of rounding errors this never converged because 'steps = (cmd->convert_arg * 30) / 1000' and then back to 'cmd->convert_arg = (steps * 1000) / 30' won't be the same because of rounding errors. 'Steps' should only be converted back to the 'convert_arg' if 'steps' has actually been modified. In addition the case of steps being 0 wasn't checked which is also now done.
Signed-off-by: Bernd Porr mail@berndporr.me.uk Cc: stable@vger.kernel.org # 4.4+ Reviewed-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20191118230759.1727-1-mail@berndporr.me.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/comedi/drivers/usbduxfast.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
--- a/drivers/staging/comedi/drivers/usbduxfast.c +++ b/drivers/staging/comedi/drivers/usbduxfast.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0+ /* - * Copyright (C) 2004-2014 Bernd Porr, mail@berndporr.me.uk + * Copyright (C) 2004-2019 Bernd Porr, mail@berndporr.me.uk */
/* @@ -8,7 +8,7 @@ * Description: University of Stirling USB DAQ & INCITE Technology Limited * Devices: [ITL] USB-DUX-FAST (usbduxfast) * Author: Bernd Porr mail@berndporr.me.uk - * Updated: 10 Oct 2014 + * Updated: 16 Nov 2019 * Status: stable */
@@ -22,6 +22,7 @@ * * * Revision history: + * 1.0: Fixed a rounding error in usbduxfast_ai_cmdtest * 0.9: Dropping the first data packet which seems to be from the last transfer. * Buffer overflows in the FX2 are handed over to comedi. * 0.92: Dropping now 4 packets. The quad buffer has to be emptied. @@ -350,6 +351,7 @@ static int usbduxfast_ai_cmdtest(struct struct comedi_cmd *cmd) { int err = 0; + int err2 = 0; unsigned int steps; unsigned int arg;
@@ -399,11 +401,16 @@ static int usbduxfast_ai_cmdtest(struct */ steps = (cmd->convert_arg * 30) / 1000; if (cmd->chanlist_len != 1) - err |= comedi_check_trigger_arg_min(&steps, - MIN_SAMPLING_PERIOD); - err |= comedi_check_trigger_arg_max(&steps, MAX_SAMPLING_PERIOD); - arg = (steps * 1000) / 30; - err |= comedi_check_trigger_arg_is(&cmd->convert_arg, arg); + err2 |= comedi_check_trigger_arg_min(&steps, + MIN_SAMPLING_PERIOD); + else + err2 |= comedi_check_trigger_arg_min(&steps, 1); + err2 |= comedi_check_trigger_arg_max(&steps, MAX_SAMPLING_PERIOD); + if (err2) { + err |= err2; + arg = (steps * 1000) / 30; + err |= comedi_check_trigger_arg_is(&cmd->convert_arg, arg); + }
if (cmd->stop_src == TRIG_COUNT) err |= comedi_check_trigger_arg_min(&cmd->stop_arg, 1);
 
            From: Michael Ellerman mpe@ellerman.id.au
commit 39e72bf96f5847ba87cc5bd7a3ce0fed813dc9ad upstream.
In commit ee13cb249fab ("powerpc/64s: Add support for software count cache flush"), I added support for software to flush the count cache (indirect branch cache) on context switch if firmware told us that was the required mitigation for Spectre v2.
As part of that code we also added a software flush of the link stack (return address stack), which protects against Spectre-RSB between user processes.
That is all correct for CPUs that activate that mitigation, which is currently Power9 Nimbus DD2.3.
What I got wrong is that on older CPUs, where firmware has disabled the count cache, we also need to flush the link stack on context switch.
To fix it we create a new feature bit which is not set by firmware, which tells us we need to flush the link stack. We set that when firmware tells us that either of the existing Spectre v2 mitigations are enabled.
Then we adjust the patching code so that if we see that feature bit we enable the link stack flush. If we're also told to flush the count cache in software then we fall through and do that also.
On the older CPUs we don't need to do do the software count cache flush, firmware has disabled it, so in that case we patch in an early return after the link stack flush.
The naming of some of the functions is awkward after this patch, because they're called "count cache" but they also do link stack. But we'll fix that up in a later commit to ease backporting.
This is the fix for CVE-2019-18660.
Reported-by: Anthony Steinhauser asteinhauser@google.com Fixes: ee13cb249fab ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/asm-prototypes.h | 1 arch/powerpc/include/asm/security_features.h | 3 + arch/powerpc/kernel/entry_64.S | 6 +++ arch/powerpc/kernel/security.c | 48 ++++++++++++++++++++++++--- 4 files changed, 54 insertions(+), 4 deletions(-)
--- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -152,6 +152,7 @@ void _kvmppc_save_tm_pr(struct kvm_vcpu /* Patch sites */ extern s32 patch__call_flush_count_cache; extern s32 patch__flush_count_cache_return; +extern s32 patch__flush_link_stack_return; extern s32 patch__memset_nocache, patch__memcpy_nocache;
extern long flush_count_cache; --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -81,6 +81,9 @@ static inline bool security_ftr_enabled( // Software required to flush count cache on context switch #define SEC_FTR_FLUSH_COUNT_CACHE 0x0000000000000400ull
+// Software required to flush link stack on context switch +#define SEC_FTR_FLUSH_LINK_STACK 0x0000000000001000ull +
// Features enabled by default #define SEC_FTR_DEFAULT \ --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -537,6 +537,7 @@ flush_count_cache: /* Save LR into r9 */ mflr r9
+ // Flush the link stack .rept 64 bl .+4 .endr @@ -546,6 +547,11 @@ flush_count_cache: .balign 32 /* Restore LR */ 1: mtlr r9 + + // If we're just flushing the link stack, return here +3: nop + patch_site 3b patch__flush_link_stack_return + li r9,0x7fff mtctr r9
--- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -24,6 +24,7 @@ enum count_cache_flush_type { COUNT_CACHE_FLUSH_HW = 0x4, }; static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; +static bool link_stack_flush_enabled;
bool barrier_nospec_enabled; static bool no_nospec; @@ -212,11 +213,19 @@ ssize_t cpu_show_spectre_v2(struct devic
if (ccd) seq_buf_printf(&s, "Indirect branch cache disabled"); + + if (link_stack_flush_enabled) + seq_buf_printf(&s, ", Software link stack flush"); + } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) { seq_buf_printf(&s, "Mitigation: Software count cache flush");
if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); + + if (link_stack_flush_enabled) + seq_buf_printf(&s, ", Software link stack flush"); + } else if (btb_flush_enabled) { seq_buf_printf(&s, "Mitigation: Branch predictor state flush"); } else { @@ -377,18 +386,40 @@ static __init int stf_barrier_debugfs_in device_initcall(stf_barrier_debugfs_init); #endif /* CONFIG_DEBUG_FS */
+static void no_count_cache_flush(void) +{ + count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; + pr_info("count-cache-flush: software flush disabled.\n"); +} + static void toggle_count_cache_flush(bool enable) { - if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { + if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE) && + !security_ftr_enabled(SEC_FTR_FLUSH_LINK_STACK)) + enable = false; + + if (!enable) { patch_instruction_site(&patch__call_flush_count_cache, PPC_INST_NOP); - count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; - pr_info("count-cache-flush: software flush disabled.\n"); + pr_info("link-stack-flush: software flush disabled.\n"); + link_stack_flush_enabled = false; + no_count_cache_flush(); return; }
+ // This enables the branch from _switch to flush_count_cache patch_branch_site(&patch__call_flush_count_cache, (u64)&flush_count_cache, BRANCH_SET_LINK);
+ pr_info("link-stack-flush: software flush enabled.\n"); + link_stack_flush_enabled = true; + + // If we just need to flush the link stack, patch an early return + if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { + patch_instruction_site(&patch__flush_link_stack_return, PPC_INST_BLR); + no_count_cache_flush(); + return; + } + if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) { count_cache_flush_type = COUNT_CACHE_FLUSH_SW; pr_info("count-cache-flush: full software flush sequence enabled.\n"); @@ -407,11 +438,20 @@ void setup_count_cache_flush(void) if (no_spectrev2 || cpu_mitigations_off()) { if (security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED) || security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED)) - pr_warn("Spectre v2 mitigations not under software control, can't disable\n"); + pr_warn("Spectre v2 mitigations not fully under software control, can't disable\n");
enable = false; }
+ /* + * There's no firmware feature flag/hypervisor bit to tell us we need to + * flush the link stack on context switch. So we set it here if we see + * either of the Spectre v2 mitigations that aim to protect userspace. + */ + if (security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED) || + security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) + security_ftr_set(SEC_FTR_FLUSH_LINK_STACK); + toggle_count_cache_flush(enable); }
 
            From: Michael Ellerman mpe@ellerman.id.au
commit af2e8c68b9c5403f77096969c516f742f5bb29e0 upstream.
On some systems that are vulnerable to Spectre v2, it is up to software to flush the link stack (return address stack), in order to protect against Spectre-RSB.
When exiting from a guest we do some house keeping and then potentially exit to C code which is several stack frames deep in the host kernel. We will then execute a series of returns without preceeding calls, opening up the possiblity that the guest could have poisoned the link stack, and direct speculative execution of the host to a gadget of some sort.
To prevent this we add a flush of the link stack on exit from a guest.
Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/asm-prototypes.h | 2 ++ arch/powerpc/kernel/security.c | 9 +++++++++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 30 ++++++++++++++++++++++++++++++ 3 files changed, 41 insertions(+)
--- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -153,9 +153,11 @@ void _kvmppc_save_tm_pr(struct kvm_vcpu extern s32 patch__call_flush_count_cache; extern s32 patch__flush_count_cache_return; extern s32 patch__flush_link_stack_return; +extern s32 patch__call_kvm_flush_link_stack; extern s32 patch__memset_nocache, patch__memcpy_nocache;
extern long flush_count_cache; +extern long kvm_flush_link_stack;
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM void kvmppc_save_tm_hv(struct kvm_vcpu *vcpu, u64 msr, bool preserve_nv); --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -400,6 +400,9 @@ static void toggle_count_cache_flush(boo
if (!enable) { patch_instruction_site(&patch__call_flush_count_cache, PPC_INST_NOP); +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + patch_instruction_site(&patch__call_kvm_flush_link_stack, PPC_INST_NOP); +#endif pr_info("link-stack-flush: software flush disabled.\n"); link_stack_flush_enabled = false; no_count_cache_flush(); @@ -410,6 +413,12 @@ static void toggle_count_cache_flush(boo patch_branch_site(&patch__call_flush_count_cache, (u64)&flush_count_cache, BRANCH_SET_LINK);
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + // This enables the branch from guest_exit_cont to kvm_flush_link_stack + patch_branch_site(&patch__call_kvm_flush_link_stack, + (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); +#endif + pr_info("link-stack-flush: software flush enabled.\n"); link_stack_flush_enabled = true;
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -11,6 +11,7 @@ */
#include <asm/ppc_asm.h> +#include <asm/code-patching-asm.h> #include <asm/kvm_asm.h> #include <asm/reg.h> #include <asm/mmu.h> @@ -1487,6 +1488,13 @@ guest_exit_cont: /* r9 = vcpu, r12 = tr 1: #endif /* CONFIG_KVM_XICS */
+ /* + * Possibly flush the link stack here, before we do a blr in + * guest_exit_short_path. + */ +1: nop + patch_site 1b patch__call_kvm_flush_link_stack + /* If we came in through the P9 short path, go back out to C now */ lwz r0, STACK_SLOT_SHORT_PATH(r1) cmpwi r0, 0 @@ -1963,6 +1971,28 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) mtlr r0 blr
+.balign 32 +.global kvm_flush_link_stack +kvm_flush_link_stack: + /* Save LR into r0 */ + mflr r0 + + /* Flush the link stack. On Power8 it's up to 32 entries in size. */ + .rept 32 + bl .+4 + .endr + + /* And on Power9 it's up to 64. */ +BEGIN_FTR_SECTION + .rept 32 + bl .+4 + .endr +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) + + /* Restore LR */ + mtlr r0 + blr + kvmppc_guest_external: /* External interrupt, first check for host_ipi. If this is * set, we know the host wants us out so let's do it now
 
            On Thu, 28 Nov 2019 at 02:47, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 5.4.1-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.4.y git commit: d6453d6b0c5737ef7e24b4216b86ddc013bfc158 git describe: v5.4-67-gd6453d6b0c57 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.4-oe/build/v5.4-67-gd64...
No regressions (compared to build v5.4)
No fixes (compared to build v5.4)
Ran 26389 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libgpiod * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * network-basic-tests * ltp-open-posix-tests * kvm-unit-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none * ssuite
 
            On Thu, Nov 28, 2019 at 02:26:27PM +0530, Naresh Kamboju wrote:
On Thu, 28 Nov 2019 at 02:47, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Great, thanks for testing and letting me know.
greg k-h
 
            On 27/11/2019 20:31, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Here are the test results for Tegra ...
Test results for stable-v5.4: 13 builds: 13 pass, 0 fail 22 boots: 22 pass, 0 fail 38 tests: 37 pass, 1 fail
Linux version: 5.4.1-rc1-gd6453d6b0c57 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
We are seeing 1 failure for Tegra194, but this is not a new failure this is present in v5.4 and it is a kernel warnings failure that has been fixed for v5.5 by the following commits.
This one has been merged for v5.5 ...
commit c745da8d4320c49e54662c0a8f7cb6b8204f44c4 Author: Jon Hunter jonathanh@nvidia.com Date: Fri Oct 11 09:34:59 2019 +0100
mailbox: tegra: Fix superfluous IRQ error message
This one has not been merged for v5.5 yet ...
commit d440538e5f219900a9fc9d96fd10727b4d2b3c48 Author: Jon Hunter jonathanh@nvidia.com Date: Wed Sep 25 15:12:28 2019 +0100
arm64: tegra: Fix 'active-low' warning for Jetson Xavier regulator
If you like I can let you know once the above is merged so we can merge for linux-5.4.y.
Cheers Jon
 
            On Thu, Nov 28, 2019 at 10:42:00AM +0000, Jon Hunter wrote:
On 27/11/2019 20:31, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Here are the test results for Tegra ...
Test results for stable-v5.4: 13 builds: 13 pass, 0 fail 22 boots: 22 pass, 0 fail 38 tests: 37 pass, 1 fail
Linux version: 5.4.1-rc1-gd6453d6b0c57 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
We are seeing 1 failure for Tegra194, but this is not a new failure this is present in v5.4 and it is a kernel warnings failure that has been fixed for v5.5 by the following commits.
This one has been merged for v5.5 ...
commit c745da8d4320c49e54662c0a8f7cb6b8204f44c4 Author: Jon Hunter jonathanh@nvidia.com Date: Fri Oct 11 09:34:59 2019 +0100
mailbox: tegra: Fix superfluous IRQ error messageThis one has not been merged for v5.5 yet ...
I'll queue that up next.
commit d440538e5f219900a9fc9d96fd10727b4d2b3c48 Author: Jon Hunter jonathanh@nvidia.com Date: Wed Sep 25 15:12:28 2019 +0100
arm64: tegra: Fix 'active-low' warning for Jetson Xavier regulatorIf you like I can let you know once the above is merged so we can merge for linux-5.4.y.
Please do, that would be great.
thanks for testing all of these and letting me know.
greg k-h
 
            On 11/27/19 1:31 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
 
            On Thu, Nov 28, 2019 at 08:40:04AM -0700, shuah wrote:
On 11/27/19 1:31 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.1-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Great, thanks for testing these and letting me know.
greg k-h
 
            On 11/27/19 12:31 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
Build results: total: 158 pass: 157 fail: 1 Failed builds: mips:allmodconfig Qemu test results: total: 394 pass: 394 fail: 0
The mips build failure is well known and inherited from mainline.
Guenter
 
            On Thu, Nov 28, 2019 at 10:47:54AM -0800, Guenter Roeck wrote:
On 11/27/19 12:31 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.1 release. There are 66 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 29 Nov 2019 20:18:09 +0000. Anything received after that time might be too late.
Build results: total: 158 pass: 157 fail: 1 Failed builds: mips:allmodconfig Qemu test results: total: 394 pass: 394 fail: 0
The mips build failure is well known and inherited from mainline.
Thanks for testing all of these and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org




