This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 3.18.107-rc1
Dan Carpenter dan.carpenter@oracle.com cdrom: information leak in cdrom_ioctl_media_changed()
Martin K. Petersen martin.petersen@oracle.com scsi: mptsas: Disable WRITE SAME
Eric Dumazet edumazet@google.com ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
Cong Wang xiyou.wangcong@gmail.com llc: delete timers synchronously in llc_sk_free()
Eric Dumazet edumazet@google.com net: af_packet: fix race in PACKET_{R|T}X_RING
Eric Dumazet edumazet@google.com tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets
Willem de Bruijn willemb@google.com packet: fix bitfield update race
Cong Wang xiyou.wangcong@gmail.com llc: fix NULL pointer deref for SOCK_ZAPPED
Cong Wang xiyou.wangcong@gmail.com llc: hold llc_sap before release_sock()
Guillaume Nault g.nault@alphalink.fr pppoe: check sockaddr length in pppoe_connect()
Xin Long lucien.xin@gmail.com team: fix netconsole setup over team
Paolo Abeni pabeni@redhat.com team: avoid adding twice the same option to the event list
Jann Horn jannh@google.com tcp: don't read out-of-bounds opsize
Guillaume Nault g.nault@alphalink.fr l2tp: check sockaddr length in pppol2tp_connect()
Eric Biggers ebiggers@google.com KEYS: DNS: limit the length of option strings
Xin Long lucien.xin@gmail.com bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
Sahitya Tummala stummala@codeaurora.org jbd2: fix use after free in kjournald2()
Matthew Wilcox mawilcox@microsoft.com mm/filemap.c: fix NULL pointer in page_cache_tree_insert()
Jiri Olsa jolsa@kernel.org perf: Return proper values for user stack errors
Theodore Ts'o tytso@mit.edu ext4: don't update checksum of new initialized bitmaps
wangguang wang.guang55@zte.com.cn ext4: bugfix for mmaped pages in mpage_release_unused_pages()
Theodore Ts'o tytso@mit.edu ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
Xiaoming Gao gxm.linux.kernel@gmail.com x86/tsc: Prevent 32bit truncation in calc_hpet_ref()
Steve French smfrench@gmail.com cifs: do not allow creating sockets except with SMB1 posix exensions
-------------
Diffstat:
Makefile | 4 +- arch/x86/kernel/tsc.c | 2 +- drivers/cdrom/cdrom.c | 2 +- drivers/message/fusion/mptsas.c | 1 + drivers/net/bonding/bond_main.c | 3 +- drivers/net/ppp/pppoe.c | 4 ++ drivers/net/team/team.c | 38 ++++++++++++++---- fs/cifs/dir.c | 9 +++-- fs/ext4/balloc.c | 3 +- fs/ext4/ialloc.c | 43 ++------------------ fs/ext4/inline.c | 66 ++++++++++++++----------------- fs/ext4/inode.c | 2 + fs/ext4/xattr.c | 30 ++++++-------- fs/ext4/xattr.h | 32 +++++++++++++++ fs/jbd2/journal.c | 2 +- include/net/llc_conn.h | 1 + kernel/events/core.c | 4 +- mm/filemap.c | 4 +- net/dns_resolver/dns_key.c | 13 +++--- net/ipv4/tcp.c | 6 ++- net/ipv4/tcp_input.c | 7 +--- net/ipv6/route.c | 2 + net/l2tp/l2tp_ppp.c | 7 ++++ net/llc/af_llc.c | 14 ++++++- net/llc/llc_c_ac.c | 9 +---- net/llc/llc_conn.c | 22 ++++++++++- net/packet/af_packet.c | 88 +++++++++++++++++++++++++++++------------ net/packet/internal.h | 10 ++--- 28 files changed, 254 insertions(+), 174 deletions(-)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French smfrench@gmail.com
commit 1d0cffa674cfa7d185a302c8c6850fc50b893bed upstream.
RHBZ: 1453123
Since at least the 3.10 kernel and likely a lot earlier we have not been able to create unix domain sockets in a cifs share when mounted using the SFU mount option (except when mounted with the cifs unix extensions to Samba e.g.) Trying to create a socket, for example using the af_unix command from xfstests will cause : BUG: unable to handle kernel NULL pointer dereference at 00000000 00000040
Since no one uses or depends on being able to create unix domains sockets on a cifs share the easiest fix to stop this vulnerability is to simply not allow creation of any other special files than char or block devices when sfu is used.
Added update to Ronnie's patch to handle a tcon link leak, and to address a buf leak noticed by Gustavo and Colin.
Acked-by: Gustavo A. R. Silva gustavo@embeddedor.com CC: Colin Ian King colin.king@canonical.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Reported-by: Eryu Guan eguan@redhat.com Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French smfrench@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/dir.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -674,6 +674,9 @@ int cifs_mknod(struct inode *inode, stru goto mknod_out; }
+ if (!S_ISCHR(mode) && !S_ISBLK(mode)) + goto mknod_out; + if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_UNX_EMUL)) goto mknod_out;
@@ -682,10 +685,8 @@ int cifs_mknod(struct inode *inode, stru
buf = kmalloc(sizeof(FILE_ALL_INFO), GFP_KERNEL); if (buf == NULL) { - kfree(full_path); rc = -ENOMEM; - free_xid(xid); - return rc; + goto mknod_out; }
if (backup_cred(cifs_sb)) @@ -732,7 +733,7 @@ int cifs_mknod(struct inode *inode, stru pdev->minor = cpu_to_le64(MINOR(device_number)); rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms, &bytes_written, iov, 1); - } /* else if (S_ISFIFO) */ + } tcon->ses->server->ops->close(xid, tcon, &fid); d_drop(direntry);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaoming Gao gxm.linux.kernel@gmail.com
commit d3878e164dcd3925a237a20e879432400e369172 upstream.
The TSC calibration code uses HPET as reference. The conversion normalizes the delta of two HPET timestamps:
hpetref = ((tshpet1 - tshpet2) * HPET_PERIOD) / 1e6
and then divides the normalized delta of the corresponding TSC timestamps by the result to calulate the TSC frequency.
tscfreq = ((tstsc1 - tstsc2 ) * 1e6) / hpetref
This uses do_div() which takes an u32 as the divisor, which worked so far because the HPET frequency was low enough that 'hpetref' never exceeded 32bit.
On Skylake machines the HPET frequency increased so 'hpetref' can exceed 32bit. do_div() truncates the divisor, which causes the calibration to fail.
Use div64_u64() to avoid the problem.
[ tglx: Fixes whitespace mangled patch and rewrote changelog ]
Signed-off-by: Xiaoming Gao newtongao@tencent.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Cc: peterz@infradead.org Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/38894564-4fc9-b8ec-353f-de702839e44e@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/tsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -399,7 +399,7 @@ static unsigned long calc_hpet_ref(u64 d hpet2 -= hpet1; tmp = ((u64)hpet2 * hpet_readl(HPET_PERIOD)); do_div(tmp, 1000000); - do_div(deltatsc, tmp); + deltatsc = div64_u64(deltatsc, tmp);
return (unsigned long) deltatsc; }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o tytso@mit.edu
commit c755e251357a0cee0679081f08c3f4ba797a8009 upstream.
The xattr_sem deadlock problems fixed in commit 2e81a4eeedca: "ext4: avoid deadlock when expanding inode size" didn't include the use of xattr_sem in fs/ext4/inline.c. With the addition of project quota which added a new extra inode field, this exposed deadlocks in the inline_data code similar to the ones fixed by 2e81a4eeedca.
The deadlock can be reproduced via:
dmesg -n 7 mke2fs -t ext4 -O inline_data -Fq -I 256 /dev/vdc 32768 mount -t ext4 -o debug_want_extra_isize=24 /dev/vdc /vdc mkdir /vdc/a umount /vdc mount -t ext4 /dev/vdc /vdc echo foo > /vdc/a/foo
and looks like this:
[ 11.158815] [ 11.160276] ============================================= [ 11.161960] [ INFO: possible recursive locking detected ] [ 11.161960] 4.10.0-rc3-00015-g011b30a8a3cf #160 Tainted: G W [ 11.161960] --------------------------------------------- [ 11.161960] bash/2519 is trying to acquire lock: [ 11.161960] (&ei->xattr_sem){++++..}, at: [<c1225a4b>] ext4_expand_extra_isize_ea+0x3d/0x4cd [ 11.161960] [ 11.161960] but task is already holding lock: [ 11.161960] (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152 [ 11.161960] [ 11.161960] other info that might help us debug this: [ 11.161960] Possible unsafe locking scenario: [ 11.161960] [ 11.161960] CPU0 [ 11.161960] ---- [ 11.161960] lock(&ei->xattr_sem); [ 11.161960] lock(&ei->xattr_sem); [ 11.161960] [ 11.161960] *** DEADLOCK *** [ 11.161960] [ 11.161960] May be due to missing lock nesting notation [ 11.161960] [ 11.161960] 4 locks held by bash/2519: [ 11.161960] #0: (sb_writers#3){.+.+.+}, at: [<c11a2414>] mnt_want_write+0x1e/0x3e [ 11.161960] #1: (&type->i_mutex_dir_key){++++++}, at: [<c119508b>] path_openat+0x338/0x67a [ 11.161960] #2: (jbd2_handle){++++..}, at: [<c123314a>] start_this_handle+0x582/0x622 [ 11.161960] #3: (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152 [ 11.161960] [ 11.161960] stack backtrace: [ 11.161960] CPU: 0 PID: 2519 Comm: bash Tainted: G W 4.10.0-rc3-00015-g011b30a8a3cf #160 [ 11.161960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014 [ 11.161960] Call Trace: [ 11.161960] dump_stack+0x72/0xa3 [ 11.161960] __lock_acquire+0xb7c/0xcb9 [ 11.161960] ? kvm_clock_read+0x1f/0x29 [ 11.161960] ? __lock_is_held+0x36/0x66 [ 11.161960] ? __lock_is_held+0x36/0x66 [ 11.161960] lock_acquire+0x106/0x18a [ 11.161960] ? ext4_expand_extra_isize_ea+0x3d/0x4cd [ 11.161960] down_write+0x39/0x72 [ 11.161960] ? ext4_expand_extra_isize_ea+0x3d/0x4cd [ 11.161960] ext4_expand_extra_isize_ea+0x3d/0x4cd [ 11.161960] ? _raw_read_unlock+0x22/0x2c [ 11.161960] ? jbd2_journal_extend+0x1e2/0x262 [ 11.161960] ? __ext4_journal_get_write_access+0x3d/0x60 [ 11.161960] ext4_mark_inode_dirty+0x17d/0x26d [ 11.161960] ? ext4_add_dirent_to_inline.isra.12+0xa5/0xb2 [ 11.161960] ext4_add_dirent_to_inline.isra.12+0xa5/0xb2 [ 11.161960] ext4_try_add_inline_entry+0x69/0x152 [ 11.161960] ext4_add_entry+0xa3/0x848 [ 11.161960] ? __brelse+0x14/0x2f [ 11.161960] ? _raw_spin_unlock_irqrestore+0x44/0x4f [ 11.161960] ext4_add_nondir+0x17/0x5b [ 11.161960] ext4_create+0xcf/0x133 [ 11.161960] ? ext4_mknod+0x12f/0x12f [ 11.161960] lookup_open+0x39e/0x3fb [ 11.161960] ? __wake_up+0x1a/0x40 [ 11.161960] ? lock_acquire+0x11e/0x18a [ 11.161960] path_openat+0x35c/0x67a [ 11.161960] ? sched_clock_cpu+0xd7/0xf2 [ 11.161960] do_filp_open+0x36/0x7c [ 11.161960] ? _raw_spin_unlock+0x22/0x2c [ 11.161960] ? __alloc_fd+0x169/0x173 [ 11.161960] do_sys_open+0x59/0xcc [ 11.161960] SyS_open+0x1d/0x1f [ 11.161960] do_int80_syscall_32+0x4f/0x61 [ 11.161960] entry_INT80_32+0x2f/0x2f [ 11.161960] EIP: 0xb76ad469 [ 11.161960] EFLAGS: 00000286 CPU: 0 [ 11.161960] EAX: ffffffda EBX: 08168ac8 ECX: 00008241 EDX: 000001b6 [ 11.161960] ESI: b75e46bc EDI: b7755000 EBP: bfbdb108 ESP: bfbdafc0 [ 11.161960] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
Cc: stable@vger.kernel.org # 3.10 (requires 2e81a4eeedca as a prereq) Reported-by: George Spelvin linux@sciencehorizons.net Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Harsh Shandilya harsh@prjkt.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inline.c | 66 +++++++++++++++++++++++++------------------------------ fs/ext4/xattr.c | 30 ++++++++++--------------- fs/ext4/xattr.h | 32 ++++++++++++++++++++++++++ 3 files changed, 74 insertions(+), 54 deletions(-)
--- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -374,7 +374,7 @@ out: static int ext4_prepare_inline_data(handle_t *handle, struct inode *inode, unsigned int len) { - int ret, size; + int ret, size, no_expand; struct ext4_inode_info *ei = EXT4_I(inode);
if (!ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) @@ -384,15 +384,14 @@ static int ext4_prepare_inline_data(hand if (size < len) return -ENOSPC;
- down_write(&EXT4_I(inode)->xattr_sem); + ext4_write_lock_xattr(inode, &no_expand);
if (ei->i_inline_off) ret = ext4_update_inline_data(handle, inode, len); else ret = ext4_create_inline_data(handle, inode, len);
- up_write(&EXT4_I(inode)->xattr_sem); - + ext4_write_unlock_xattr(inode, &no_expand); return ret; }
@@ -522,7 +521,7 @@ static int ext4_convert_inline_data_to_e struct inode *inode, unsigned flags) { - int ret, needed_blocks; + int ret, needed_blocks, no_expand; handle_t *handle = NULL; int retries = 0, sem_held = 0; struct page *page = NULL; @@ -562,7 +561,7 @@ retry: goto out; }
- down_write(&EXT4_I(inode)->xattr_sem); + ext4_write_lock_xattr(inode, &no_expand); sem_held = 1; /* If some one has already done this for us, just exit. */ if (!ext4_has_inline_data(inode)) { @@ -598,7 +597,7 @@ retry: page_cache_release(page); page = NULL; ext4_orphan_add(handle, inode); - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand); sem_held = 0; ext4_journal_stop(handle); handle = NULL; @@ -624,7 +623,7 @@ out: page_cache_release(page); } if (sem_held) - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand); if (handle) ext4_journal_stop(handle); brelse(iloc.bh); @@ -717,7 +716,7 @@ convert: int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len, unsigned copied, struct page *page) { - int ret; + int ret, no_expand; void *kaddr; struct ext4_iloc iloc;
@@ -735,7 +734,7 @@ int ext4_write_inline_data_end(struct in goto out; }
- down_write(&EXT4_I(inode)->xattr_sem); + ext4_write_lock_xattr(inode, &no_expand); BUG_ON(!ext4_has_inline_data(inode));
kaddr = kmap_atomic(page); @@ -745,7 +744,7 @@ int ext4_write_inline_data_end(struct in /* clear page dirty so that writepages wouldn't work for us. */ ClearPageDirty(page);
- up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand); brelse(iloc.bh); out: return copied; @@ -756,7 +755,7 @@ ext4_journalled_write_inline_data(struct unsigned len, struct page *page) { - int ret; + int ret, no_expand; void *kaddr; struct ext4_iloc iloc;
@@ -766,11 +765,11 @@ ext4_journalled_write_inline_data(struct return NULL; }
- down_write(&EXT4_I(inode)->xattr_sem); + ext4_write_lock_xattr(inode, &no_expand); kaddr = kmap_atomic(page); ext4_write_inline_data(inode, &iloc, kaddr, 0, len); kunmap_atomic(kaddr); - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand);
return iloc.bh; } @@ -1245,7 +1244,7 @@ out: int ext4_try_add_inline_entry(handle_t *handle, struct dentry *dentry, struct inode *inode) { - int ret, inline_size; + int ret, inline_size, no_expand; void *inline_start; struct ext4_iloc iloc; struct inode *dir = dentry->d_parent->d_inode; @@ -1254,7 +1253,7 @@ int ext4_try_add_inline_entry(handle_t * if (ret) return ret;
- down_write(&EXT4_I(dir)->xattr_sem); + ext4_write_lock_xattr(dir, &no_expand); if (!ext4_has_inline_data(dir)) goto out;
@@ -1299,7 +1298,7 @@ int ext4_try_add_inline_entry(handle_t *
out: ext4_mark_inode_dirty(handle, dir); - up_write(&EXT4_I(dir)->xattr_sem); + ext4_write_unlock_xattr(dir, &no_expand); brelse(iloc.bh); return ret; } @@ -1655,7 +1654,7 @@ int ext4_delete_inline_entry(handle_t *h struct buffer_head *bh, int *has_inline_data) { - int err, inline_size; + int err, inline_size, no_expand; struct ext4_iloc iloc; void *inline_start;
@@ -1663,7 +1662,7 @@ int ext4_delete_inline_entry(handle_t *h if (err) return err;
- down_write(&EXT4_I(dir)->xattr_sem); + ext4_write_lock_xattr(dir, &no_expand); if (!ext4_has_inline_data(dir)) { *has_inline_data = 0; goto out; @@ -1698,7 +1697,7 @@ int ext4_delete_inline_entry(handle_t *h
ext4_show_inline_dir(dir, iloc.bh, inline_start, inline_size); out: - up_write(&EXT4_I(dir)->xattr_sem); + ext4_write_unlock_xattr(dir, &no_expand); brelse(iloc.bh); if (err != -ENOENT) ext4_std_error(dir->i_sb, err); @@ -1797,11 +1796,11 @@ out:
int ext4_destroy_inline_data(handle_t *handle, struct inode *inode) { - int ret; + int ret, no_expand;
- down_write(&EXT4_I(inode)->xattr_sem); + ext4_write_lock_xattr(inode, &no_expand); ret = ext4_destroy_inline_data_nolock(handle, inode); - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand);
return ret; } @@ -1879,7 +1878,7 @@ out: void ext4_inline_data_truncate(struct inode *inode, int *has_inline) { handle_t *handle; - int inline_size, value_len, needed_blocks; + int inline_size, value_len, needed_blocks, no_expand; size_t i_size; void *value = NULL; struct ext4_xattr_ibody_find is = { @@ -1896,7 +1895,7 @@ void ext4_inline_data_truncate(struct in if (IS_ERR(handle)) return;
- down_write(&EXT4_I(inode)->xattr_sem); + ext4_write_lock_xattr(inode, &no_expand); if (!ext4_has_inline_data(inode)) { *has_inline = 0; ext4_journal_stop(handle); @@ -1954,7 +1953,7 @@ out_error: up_write(&EXT4_I(inode)->i_data_sem); out: brelse(is.iloc.bh); - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand); kfree(value); if (inode->i_nlink) ext4_orphan_del(handle, inode); @@ -1970,7 +1969,7 @@ out:
int ext4_convert_inline_data(struct inode *inode) { - int error, needed_blocks; + int error, needed_blocks, no_expand; handle_t *handle; struct ext4_iloc iloc;
@@ -1992,15 +1991,10 @@ int ext4_convert_inline_data(struct inod goto out_free; }
- down_write(&EXT4_I(inode)->xattr_sem); - if (!ext4_has_inline_data(inode)) { - up_write(&EXT4_I(inode)->xattr_sem); - goto out; - } - - error = ext4_convert_inline_data_nolock(handle, inode, &iloc); - up_write(&EXT4_I(inode)->xattr_sem); -out: + ext4_write_lock_xattr(inode, &no_expand); + if (ext4_has_inline_data(inode)) + error = ext4_convert_inline_data_nolock(handle, inode, &iloc); + ext4_write_unlock_xattr(inode, &no_expand); ext4_journal_stop(handle); out_free: brelse(iloc.bh); --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1120,16 +1120,14 @@ ext4_xattr_set_handle(handle_t *handle, struct ext4_xattr_block_find bs = { .s = { .not_found = -ENODATA, }, }; - unsigned long no_expand; + int no_expand; int error;
if (!name) return -EINVAL; if (strlen(name) > 255) return -ERANGE; - down_write(&EXT4_I(inode)->xattr_sem); - no_expand = ext4_test_inode_state(inode, EXT4_STATE_NO_EXPAND); - ext4_set_inode_state(inode, EXT4_STATE_NO_EXPAND); + ext4_write_lock_xattr(inode, &no_expand);
error = ext4_reserve_inode_write(handle, inode, &is.iloc); if (error) @@ -1190,7 +1188,7 @@ ext4_xattr_set_handle(handle_t *handle, ext4_xattr_update_super_block(handle, inode->i_sb); inode->i_ctime = ext4_current_time(inode); if (!value) - ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND); + no_expand = 0; error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); /* * The bh is consumed by ext4_mark_iloc_dirty, even with @@ -1204,9 +1202,7 @@ ext4_xattr_set_handle(handle_t *handle, cleanup: brelse(is.iloc.bh); brelse(bs.bh); - if (no_expand == 0) - ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND); - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand); return error; }
@@ -1289,12 +1285,11 @@ int ext4_expand_extra_isize_ea(struct in void *base, *start, *end; int extra_isize = 0, error = 0, tried_min_extra_isize = 0; int s_min_extra_isize = le16_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_min_extra_isize); + int no_expand; + + if (ext4_write_trylock_xattr(inode, &no_expand) == 0) + return 0;
- down_write(&EXT4_I(inode)->xattr_sem); - /* - * Set EXT4_STATE_NO_EXPAND to avoid recursion when marking inode dirty - */ - ext4_set_inode_state(inode, EXT4_STATE_NO_EXPAND); retry: if (EXT4_I(inode)->i_extra_isize >= new_extra_isize) goto out; @@ -1487,8 +1482,7 @@ retry: } brelse(bh); out: - ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND); - up_write(&EXT4_I(inode)->xattr_sem); + ext4_write_unlock_xattr(inode, &no_expand); return 0;
cleanup: @@ -1500,10 +1494,10 @@ cleanup: kfree(bs); brelse(bh); /* - * We deliberately leave EXT4_STATE_NO_EXPAND set here since inode - * size expansion failed. + * Inode size expansion failed; don't try again */ - up_write(&EXT4_I(inode)->xattr_sem); + no_expand = 1; + ext4_write_unlock_xattr(inode, &no_expand); return error; }
--- a/fs/ext4/xattr.h +++ b/fs/ext4/xattr.h @@ -98,6 +98,38 @@ extern const struct xattr_handler ext4_x extern const struct xattr_handler ext4_xattr_trusted_handler; extern const struct xattr_handler ext4_xattr_security_handler;
+/* + * The EXT4_STATE_NO_EXPAND is overloaded and used for two purposes. + * The first is to signal that there the inline xattrs and data are + * taking up so much space that we might as well not keep trying to + * expand it. The second is that xattr_sem is taken for writing, so + * we shouldn't try to recurse into the inode expansion. For this + * second case, we need to make sure that we take save and restore the + * NO_EXPAND state flag appropriately. + */ +static inline void ext4_write_lock_xattr(struct inode *inode, int *save) +{ + down_write(&EXT4_I(inode)->xattr_sem); + *save = ext4_test_inode_state(inode, EXT4_STATE_NO_EXPAND); + ext4_set_inode_state(inode, EXT4_STATE_NO_EXPAND); +} + +static inline int ext4_write_trylock_xattr(struct inode *inode, int *save) +{ + if (down_write_trylock(&EXT4_I(inode)->xattr_sem) == 0) + return 0; + *save = ext4_test_inode_state(inode, EXT4_STATE_NO_EXPAND); + ext4_set_inode_state(inode, EXT4_STATE_NO_EXPAND); + return 1; +} + +static inline void ext4_write_unlock_xattr(struct inode *inode, int *save) +{ + if (*save == 0) + ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND); + up_write(&EXT4_I(inode)->xattr_sem); +} + extern ssize_t ext4_listxattr(struct dentry *, char *, size_t);
extern int ext4_xattr_get(struct inode *, int, const char *, void *, size_t);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o tytso@mit.edu
commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream.
When reading the inode or block allocation bitmap, if the bitmap needs to be initialized, do not update the checksum in the block group descriptor. That's because we're not set up to journal those changes. Instead, just set the verified bit on the bitmap block, so that it's not necessary to validate the checksum.
When a block or inode allocation actually happens, at that point the checksum will be calculated, and update of the bg descriptor block will be properly journalled.
Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Harsh Shandilya harsh@prjkt.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/balloc.c | 3 +-- fs/ext4/ialloc.c | 43 +++---------------------------------------- 2 files changed, 4 insertions(+), 42 deletions(-)
--- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -243,8 +243,6 @@ static int ext4_init_block_bitmap(struct */ ext4_mark_bitmap_end(num_clusters_in_group(sb, block_group), sb->s_blocksize * 8, bh->b_data); - ext4_block_bitmap_csum_set(sb, block_group, gdp, bh); - ext4_group_desc_csum_set(sb, block_group, gdp); return 0; }
@@ -446,6 +444,7 @@ ext4_read_block_bitmap_nowait(struct sup err = ext4_init_block_bitmap(sb, bh, block_group, desc); set_bitmap_uptodate(bh); set_buffer_uptodate(bh); + set_buffer_verified(bh); ext4_unlock_group(sb, block_group); unlock_buffer(bh); if (err) --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -64,45 +64,6 @@ void ext4_mark_bitmap_end(int start_bit, memset(bitmap + (i >> 3), 0xff, (end_bit - i) >> 3); }
-/* Initializes an uninitialized inode bitmap */ -static unsigned ext4_init_inode_bitmap(struct super_block *sb, - struct buffer_head *bh, - ext4_group_t block_group, - struct ext4_group_desc *gdp) -{ - struct ext4_group_info *grp; - struct ext4_sb_info *sbi = EXT4_SB(sb); - J_ASSERT_BH(bh, buffer_locked(bh)); - - /* If checksum is bad mark all blocks and inodes use to prevent - * allocation, essentially implementing a per-group read-only flag. */ - if (!ext4_group_desc_csum_verify(sb, block_group, gdp)) { - ext4_error(sb, "Checksum bad for group %u", block_group); - grp = ext4_get_group_info(sb, block_group); - if (!EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) - percpu_counter_sub(&sbi->s_freeclusters_counter, - grp->bb_free); - set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT, &grp->bb_state); - if (!EXT4_MB_GRP_IBITMAP_CORRUPT(grp)) { - int count; - count = ext4_free_inodes_count(sb, gdp); - percpu_counter_sub(&sbi->s_freeinodes_counter, - count); - } - set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT, &grp->bb_state); - return 0; - } - - memset(bh->b_data, 0, (EXT4_INODES_PER_GROUP(sb) + 7) / 8); - ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb), sb->s_blocksize * 8, - bh->b_data); - ext4_inode_bitmap_csum_set(sb, block_group, gdp, bh, - EXT4_INODES_PER_GROUP(sb) / 8); - ext4_group_desc_csum_set(sb, block_group, gdp); - - return EXT4_INODES_PER_GROUP(sb); -} - void ext4_end_bitmap_read(struct buffer_head *bh, int uptodate) { if (uptodate) { @@ -151,7 +112,9 @@ ext4_read_inode_bitmap(struct super_bloc
ext4_lock_group(sb, block_group); if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) { - ext4_init_inode_bitmap(sb, bh, block_group, desc); + memset(bh->b_data, 0, (EXT4_INODES_PER_GROUP(sb) + 7) / 8); + ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb), + sb->s_blocksize * 8, bh->b_data); set_bitmap_uptodate(bh); set_buffer_uptodate(bh); set_buffer_verified(bh);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa jolsa@kernel.org
commit 78b562fbfa2cf0a9fcb23c3154756b690f4905c1 upstream.
Return immediately when we find issue in the user stack checks. The error value could get overwritten by following check for PERF_SAMPLE_REGS_INTR.
Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andi Kleen andi@firstfloor.org Cc: H. Peter Anvin hpa@zytor.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 60e2364e60e8 ("perf: Add ability to sample machine state on interrupt") Link: http://lkml.kernel.org/r/20180415092352.12403-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/events/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7348,9 +7348,9 @@ static int perf_copy_attr(struct perf_ev * __u16 sample size limit. */ if (attr->sample_stack_user >= USHRT_MAX) - ret = -EINVAL; + return -EINVAL; else if (!IS_ALIGNED(attr->sample_stack_user, sizeof(u64))) - ret = -EINVAL; + return -EINVAL; }
out:
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox mawilcox@microsoft.com
commit abc1be13fd113ddef5e2d807a466286b864caed3 upstream.
f2fs specifies the __GFP_ZERO flag for allocating some of its pages. Unfortunately, the page cache also uses the mapping's GFP flags for allocating radix tree nodes. It always masked off the __GFP_HIGHMEM flag, and masks off __GFP_ZERO in some paths, but not all. That causes radix tree nodes to be allocated with a NULL list_head, which causes backtraces like:
__list_del_entry+0x30/0xd0 list_lru_del+0xac/0x1ac page_cache_tree_insert+0xd8/0x110
The __GFP_DMA and __GFP_DMA32 flags would also be able to sneak through if they are ever used. Fix them all by using GFP_RECLAIM_MASK at the innermost location, and remove it from earlier in the callchain.
Link: http://lkml.kernel.org/r/20180411060320.14458-2-willy@infradead.org Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check") Signed-off-by: Matthew Wilcox mawilcox@microsoft.com Reported-by: Chris Fries cfries@google.com Debugged-by: Minchan Kim minchan@kernel.org Acked-by: Johannes Weiner hannes@cmpxchg.org Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Jan Kara jack@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Harsh Shandilya harsh@prjkt.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/filemap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/filemap.c +++ b/mm/filemap.c @@ -468,7 +468,7 @@ int replace_page_cache_page(struct page VM_BUG_ON_PAGE(!PageLocked(new), new); VM_BUG_ON_PAGE(new->mapping, new);
- error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM); + error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK); if (!error) { struct address_space *mapping = old->mapping; void (*freepage)(struct page *); @@ -561,7 +561,7 @@ static int __add_to_page_cache_locked(st return error; }
- error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM); + error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK); if (error) { if (!huge) mem_cgroup_cancel_charge(page, memcg);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sahitya Tummala stummala@codeaurora.org
commit dbfcef6b0f4012c57bc0b6e0e660d5ed12a5eaed upstream.
Below is the synchronization issue between unmount and kjournald2 contexts, which results into use after free issue in kjournald2(). Fix this issue by using journal->j_state_lock to synchronize the wait_event() done in journal_kill_thread() and the wake_up() done in kjournald2().
TASK 1: umount cmd: |--jbd2_journal_destroy() { |--journal_kill_thread() { write_lock(&journal->j_state_lock); journal->j_flags |= JBD2_UNMOUNT; ... write_unlock(&journal->j_state_lock); wake_up(&journal->j_wait_commit); TASK 2 wakes up here: kjournald2() { ... checks JBD2_UNMOUNT flag and calls goto end-loop; ... end_loop: write_unlock(&journal->j_state_lock); journal->j_task = NULL; --> If this thread gets pre-empted here, then TASK 1 wait_event will exit even before this thread is completely done. wait_event(journal->j_wait_done_commit, journal->j_task == NULL); ... write_lock(&journal->j_state_lock); write_unlock(&journal->j_state_lock); } |--kfree(journal); } } wake_up(&journal->j_wait_done_commit); --> this step now results into use after free issue. }
Signed-off-by: Sahitya Tummala stummala@codeaurora.org Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/jbd2/journal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -275,11 +275,11 @@ loop: goto loop;
end_loop: - write_unlock(&journal->j_state_lock); del_timer_sync(&journal->j_commit_timer); journal->j_task = NULL; wake_up(&journal->j_wait_done_commit); jbd_debug(1, "Journal thread exiting.\n"); + write_unlock(&journal->j_state_lock); return 0; }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit ddea788c63094f7c483783265563dd5b50052e28 ]
After Commit 8a8efa22f51b ("bonding: sync netpoll code with bridge"), it would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev if bond->dev->npinfo was set.
However now slave_dev npinfo is set with bond->dev->npinfo before calling slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup(). It causes that the lower dev of this slave dev can't set its npinfo.
One way to reproduce it:
# modprobe bonding # brctl addbr br0 # brctl addif br0 eth1 # ifconfig bond0 192.168.122.1/24 up # ifenslave bond0 eth2 # systemctl restart netconsole # ifenslave bond0 br0 # ifconfig eth2 down # systemctl restart netconsole
The netpoll won't really work.
This patch is to remove that slave_dev npinfo setting in bond_enslave().
Fixes: 8a8efa22f51b ("bonding: sync netpoll code with bridge") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1505,8 +1505,7 @@ int bond_enslave(struct net_device *bond } /* switch(bond_mode) */
#ifdef CONFIG_NET_POLL_CONTROLLER - slave_dev->npinfo = bond->dev->npinfo; - if (slave_dev->npinfo) { + if (bond->dev->npinfo) { if (slave_enable_netpoll(new_slave)) { netdev_info(bond_dev, "master_dev is using netpoll, but new slave device does not support netpoll\n"); res = -EBUSY;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
[ Upstream commit 9c438d7a3a52dcc2b9ed095cb87d3a5e83cf7e60 ]
Adding a dns_resolver key whose payload contains a very long option name resulted in that string being printed in full. This hit the WARN_ONCE() in set_precision() during the printk(), because printk() only supports a precision of up to 32767 bytes:
precision 1000000 too large WARNING: CPU: 0 PID: 752 at lib/vsprintf.c:2189 vsnprintf+0x4bc/0x5b0
Fix it by limiting option strings (combined name + value) to a much more reasonable 128 bytes. The exact limit is arbitrary, but currently the only recognized option is formatted as "dnserror=%lu" which fits well within this limit.
Also ratelimit the printks.
Reproducer:
perl -e 'print "#", "A" x 1000000, "\x00"' | keyctl padd dns_resolver desc @s
This bug was found using syzkaller.
Reported-by: Mark Rutland mark.rutland@arm.com Fixes: 4a2d789267e0 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]") Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dns_resolver/dns_key.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/net/dns_resolver/dns_key.c +++ b/net/dns_resolver/dns_key.c @@ -25,6 +25,7 @@ #include <linux/moduleparam.h> #include <linux/slab.h> #include <linux/string.h> +#include <linux/ratelimit.h> #include <linux/kernel.h> #include <linux/keyctl.h> #include <linux/err.h> @@ -91,9 +92,9 @@ dns_resolver_preparse(struct key_prepars
next_opt = memchr(opt, '#', end - opt) ?: end; opt_len = next_opt - opt; - if (!opt_len) { - printk(KERN_WARNING - "Empty option to dns_resolver key\n"); + if (opt_len <= 0 || opt_len > 128) { + pr_warn_ratelimited("Invalid option length (%d) for dns_resolver key\n", + opt_len); return -EINVAL; }
@@ -127,10 +128,8 @@ dns_resolver_preparse(struct key_prepars }
bad_option_value: - printk(KERN_WARNING - "Option '%*.*s' to dns_resolver key:" - " bad/missing value\n", - opt_nlen, opt_nlen, opt); + pr_warn_ratelimited("Option '%*.*s' to dns_resolver key: bad/missing value\n", + opt_nlen, opt_nlen, opt); return -EINVAL; } while (opt = next_opt + 1, opt < end); }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Nault g.nault@alphalink.fr
[ Upstream commit eb1c28c05894a4b1f6b56c5bf072205e64cfa280 ]
Check sockaddr_len before dereferencing sp->sa_protocol, to ensure that it actually points to valid data.
Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Reported-by: syzbot+a70ac890b23b1bf29f5c@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_ppp.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/l2tp/l2tp_ppp.c +++ b/net/l2tp/l2tp_ppp.c @@ -608,6 +608,13 @@ static int pppol2tp_connect(struct socke lock_sock(sk);
error = -EINVAL; + + if (sockaddr_len != sizeof(struct sockaddr_pppol2tp) && + sockaddr_len != sizeof(struct sockaddr_pppol2tpv3) && + sockaddr_len != sizeof(struct sockaddr_pppol2tpin6) && + sockaddr_len != sizeof(struct sockaddr_pppol2tpv3in6)) + goto end; + if (sp->sa_protocol != PX_PROTO_OL2TP) goto end;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
[ Upstream commit 7e5a206ab686f098367b61aca989f5cdfa8114a3 ]
The old code reads the "opsize" variable from out-of-bounds memory (first byte behind the segment) if a broken TCP segment ends directly after an opcode that is neither EOL nor NOP.
The result of the read isn't used for anything, so the worst thing that could theoretically happen is a pagefault; and since the physmap is usually mostly contiguous, even that seems pretty unlikely.
The following C reproducer triggers the uninitialized read - however, you can't actually see anything happen unless you put something like a pr_warn() in tcp_parse_md5sig_option() to print the opsize.
==================================== #define _GNU_SOURCE #include <arpa/inet.h> #include <stdlib.h> #include <errno.h> #include <stdarg.h> #include <net/if.h> #include <linux/if.h> #include <linux/ip.h> #include <linux/tcp.h> #include <linux/in.h> #include <linux/if_tun.h> #include <err.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> #include <stdio.h> #include <unistd.h> #include <sys/ioctl.h> #include <assert.h>
void systemf(const char *command, ...) { char *full_command; va_list ap; va_start(ap, command); if (vasprintf(&full_command, command, ap) == -1) err(1, "vasprintf"); va_end(ap); printf("systemf: <<<%s>>>\n", full_command); system(full_command); }
char *devname;
int tun_alloc(char *name) { int fd = open("/dev/net/tun", O_RDWR); if (fd == -1) err(1, "open tun dev"); static struct ifreq req = { .ifr_flags = IFF_TUN|IFF_NO_PI }; strcpy(req.ifr_name, name); if (ioctl(fd, TUNSETIFF, &req)) err(1, "TUNSETIFF"); devname = req.ifr_name; printf("device name: %s\n", devname); return fd; }
#define IPADDR(a,b,c,d) (((a)<<0)+((b)<<8)+((c)<<16)+((d)<<24))
void sum_accumulate(unsigned int *sum, void *data, int len) { assert((len&2)==0); for (int i=0; i<len/2; i++) { *sum += ntohs(((unsigned short *)data)[i]); } }
unsigned short sum_final(unsigned int sum) { sum = (sum >> 16) + (sum & 0xffff); sum = (sum >> 16) + (sum & 0xffff); return htons(~sum); }
void fix_ip_sum(struct iphdr *ip) { unsigned int sum = 0; sum_accumulate(&sum, ip, sizeof(*ip)); ip->check = sum_final(sum); }
void fix_tcp_sum(struct iphdr *ip, struct tcphdr *tcp) { unsigned int sum = 0; struct { unsigned int saddr; unsigned int daddr; unsigned char pad; unsigned char proto_num; unsigned short tcp_len; } fakehdr = { .saddr = ip->saddr, .daddr = ip->daddr, .proto_num = ip->protocol, .tcp_len = htons(ntohs(ip->tot_len) - ip->ihl*4) }; sum_accumulate(&sum, &fakehdr, sizeof(fakehdr)); sum_accumulate(&sum, tcp, tcp->doff*4); tcp->check = sum_final(sum); }
int main(void) { int tun_fd = tun_alloc("inject_dev%d"); systemf("ip link set %s up", devname); systemf("ip addr add 192.168.42.1/24 dev %s", devname);
struct { struct iphdr ip; struct tcphdr tcp; unsigned char tcp_opts[20]; } __attribute__((packed)) syn_packet = { .ip = { .ihl = sizeof(struct iphdr)/4, .version = 4, .tot_len = htons(sizeof(syn_packet)), .ttl = 30, .protocol = IPPROTO_TCP, /* FIXUP check */ .saddr = IPADDR(192,168,42,2), .daddr = IPADDR(192,168,42,1) }, .tcp = { .source = htons(1), .dest = htons(1337), .seq = 0x12345678, .doff = (sizeof(syn_packet.tcp)+sizeof(syn_packet.tcp_opts))/4, .syn = 1, .window = htons(64), .check = 0 /*FIXUP*/ }, .tcp_opts = { /* INVALID: trailing MD5SIG opcode after NOPs */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 19 } }; fix_ip_sum(&syn_packet.ip); fix_tcp_sum(&syn_packet.ip, &syn_packet.tcp); while (1) { int write_res = write(tun_fd, &syn_packet, sizeof(syn_packet)); if (write_res != sizeof(syn_packet)) err(1, "packet write failed"); } } ====================================
Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.") Signed-off-by: Jann Horn jannh@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_input.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
--- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3731,11 +3731,8 @@ const u8 *tcp_parse_md5sig_option(const int length = (th->doff << 2) - sizeof(*th); const u8 *ptr = (const u8 *)(th + 1);
- /* If the TCP option is too short, we can short cut */ - if (length < TCPOLEN_MD5SIG) - return NULL; - - while (length > 0) { + /* If not enough data remaining, we can short cut */ + while (length >= TCPOLEN_MD5SIG) { int opcode = *ptr++; int opsize;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 4fb0534fb7bbc2346ba7d3a072b538007f4135a5 ]
When parsing the options provided by the user space, team_nl_cmd_options_set() insert them in a temporary list to send multiple events with a single message. While each option's attribute is correctly validated, the code does not check for duplicate entries before inserting into the event list.
Exploiting the above, the syzbot was able to trigger the following splat:
kernel BUG at lib/list_debug.c:31! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 4466 Comm: syzkaller556835 Not tainted 4.16.0+ #17 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__list_add_valid+0xaa/0xb0 lib/list_debug.c:29 RSP: 0018:ffff8801b04bf248 EFLAGS: 00010286 RAX: 0000000000000058 RBX: ffff8801c8fc7a90 RCX: 0000000000000000 RDX: 0000000000000058 RSI: ffffffff815fbf41 RDI: ffffed0036097e3f RBP: ffff8801b04bf260 R08: ffff8801b0b2a700 R09: ffffed003b604f90 R10: ffffed003b604f90 R11: ffff8801db027c87 R12: ffff8801c8fc7a90 R13: ffff8801c8fc7a90 R14: dffffc0000000000 R15: 0000000000000000 FS: 0000000000b98880(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000043fc30 CR3: 00000001afe8e000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __list_add include/linux/list.h:60 [inline] list_add include/linux/list.h:79 [inline] team_nl_cmd_options_set+0x9ff/0x12b0 drivers/net/team/team.c:2571 genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599 genl_rcv_msg+0xc6/0x170 net/netlink/genetlink.c:624 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448 genl_rcv+0x28/0x40 net/netlink/genetlink.c:635 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:639 ___sys_sendmsg+0x805/0x940 net/socket.c:2117 __sys_sendmsg+0x115/0x270 net/socket.c:2155 SYSC_sendmsg net/socket.c:2164 [inline] SyS_sendmsg+0x29/0x30 net/socket.c:2162 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x4458b9 RSP: 002b:00007ffd1d4a7278 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000000001b RCX: 00000000004458b9 RDX: 0000000000000010 RSI: 0000000020000d00 RDI: 0000000000000004 RBP: 00000000004a74ed R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000213 R12: 00007ffd1d4a7348 R13: 0000000000402a60 R14: 0000000000000000 R15: 0000000000000000 Code: 75 e8 eb a9 48 89 f7 48 89 75 e8 e8 d1 85 7b fe 48 8b 75 e8 eb bb 48 89 f2 48 89 d9 4c 89 e6 48 c7 c7 a0 84 d8 87 e8 ea 67 28 fe <0f> 0b 0f 1f 40 00 48 b8 00 00 00 00 00 fc ff df 55 48 89 e5 41 RIP: __list_add_valid+0xaa/0xb0 lib/list_debug.c:29 RSP: ffff8801b04bf248
This changeset addresses the avoiding list_add() if the current option is already present in the event list.
Reported-and-tested-by: syzbot+4d4af685432dc0e56c91@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Fixes: 2fcdb2c9e659 ("team: allow to send multiple set events in one message") Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/team/team.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
--- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -253,6 +253,17 @@ static void __team_option_inst_mark_remo } }
+static bool __team_option_inst_tmp_find(const struct list_head *opts, + const struct team_option_inst *needle) +{ + struct team_option_inst *opt_inst; + + list_for_each_entry(opt_inst, opts, tmp_list) + if (opt_inst == needle) + return true; + return false; +} + static int __team_options_register(struct team *team, const struct team_option *option, size_t option_count) @@ -2532,6 +2543,14 @@ static int team_nl_cmd_options_set(struc if (err) goto team_put; opt_inst->changed = true; + + /* dumb/evil user-space can send us duplicate opt, + * keep only the last one + */ + if (__team_option_inst_tmp_find(&opt_inst_list, + opt_inst)) + continue; + list_add(&opt_inst->tmp_list, &opt_inst_list); } if (!opt_found) {
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Nault g.nault@alphalink.fr
[ Upstream commit a49e2f5d5fb141884452ddb428f551b123d436b5 ]
We must validate sockaddr_len, otherwise userspace can pass fewer data than we expect and we end up accessing invalid data.
Fixes: 224cf5ad14c0 ("ppp: Move the PPP drivers") Reported-by: syzbot+4f03bdf92fdf9ef5ddab@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ppp/pppoe.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/net/ppp/pppoe.c +++ b/drivers/net/ppp/pppoe.c @@ -613,6 +613,10 @@ static int pppoe_connect(struct socket * lock_sock(sk);
error = -EINVAL; + + if (sockaddr_len != sizeof(struct sockaddr_pppox)) + goto end; + if (sp->sa_protocol != PX_PROTO_OE) goto end;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit f7e43672683b097bb074a8fe7af9bc600a23f231 ]
syzbot reported we still access llc->sap in llc_backlog_rcv() after it is freed in llc_sap_remove_socket():
Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430 llc_conn_ac_send_sabme_cmd_p_set_x+0x3a8/0x460 net/llc/llc_c_ac.c:785 llc_exec_conn_trans_actions net/llc/llc_conn.c:475 [inline] llc_conn_service net/llc/llc_conn.c:400 [inline] llc_conn_state_process+0x4e1/0x13a0 net/llc/llc_conn.c:75 llc_backlog_rcv+0x195/0x1e0 net/llc/llc_conn.c:891 sk_backlog_rcv include/net/sock.h:909 [inline] __release_sock+0x12f/0x3a0 net/core/sock.c:2335 release_sock+0xa4/0x2b0 net/core/sock.c:2850 llc_ui_release+0xc8/0x220 net/llc/af_llc.c:204
llc->sap is refcount'ed and llc_sap_remove_socket() is paired with llc_sap_add_socket(). This can be amended by holding its refcount before llc_sap_remove_socket() and releasing it after release_sock().
Reported-by: syzbot+6e181fc95081c2cf9051@syzkaller.appspotmail.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/llc/af_llc.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -187,6 +187,7 @@ static int llc_ui_release(struct socket { struct sock *sk = sock->sk; struct llc_sock *llc; + struct llc_sap *sap;
if (unlikely(sk == NULL)) goto out; @@ -197,9 +198,15 @@ static int llc_ui_release(struct socket llc->laddr.lsap, llc->daddr.lsap); if (!llc_send_disc(sk)) llc_ui_wait_for_disc(sk, sk->sk_rcvtimeo); + sap = llc->sap; + /* Hold this for release_sock(), so that llc_backlog_rcv() could still + * use it. + */ + llc_sap_hold(sap); if (!sock_flag(sk, SOCK_ZAPPED)) llc_sap_remove_socket(llc->sap, sk); release_sock(sk); + llc_sap_put(sap); if (llc->dev) dev_put(llc->dev); sock_put(sk);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit 3a04ce7130a7e5dad4e78d45d50313747f8c830f ]
For SOCK_ZAPPED socket, we don't need to care about llc->sap, so we should just skip these refcount functions in this case.
Fixes: f7e43672683b ("llc: hold llc_sap before release_sock()") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/llc/af_llc.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
--- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -187,7 +187,6 @@ static int llc_ui_release(struct socket { struct sock *sk = sock->sk; struct llc_sock *llc; - struct llc_sap *sap;
if (unlikely(sk == NULL)) goto out; @@ -198,15 +197,19 @@ static int llc_ui_release(struct socket llc->laddr.lsap, llc->daddr.lsap); if (!llc_send_disc(sk)) llc_ui_wait_for_disc(sk, sk->sk_rcvtimeo); - sap = llc->sap; - /* Hold this for release_sock(), so that llc_backlog_rcv() could still - * use it. - */ - llc_sap_hold(sap); - if (!sock_flag(sk, SOCK_ZAPPED)) + if (!sock_flag(sk, SOCK_ZAPPED)) { + struct llc_sap *sap = llc->sap; + + /* Hold this for release_sock(), so that llc_backlog_rcv() + * could still use it. + */ + llc_sap_hold(sap); llc_sap_remove_socket(llc->sap, sk); - release_sock(sk); - llc_sap_put(sap); + release_sock(sk); + llc_sap_put(sap); + } else { + release_sock(sk); + } if (llc->dev) dev_put(llc->dev); sock_put(sk);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit a6361f0ca4b25460f2cdf3235ebe8115f622901e ]
Updates to the bitfields in struct packet_sock are not atomic. Serialize these read-modify-write cycles.
Move po->running into a separate variable. Its writes are protected by po->bind_lock (except for one startup case at packet_create). Also replace a textual precondition warning with lockdep annotation.
All others are set only in packet_setsockopt. Serialize these updates by holding the socket lock. Analogous to other field updates, also hold the lock when testing whether a ring is active (pg_vec).
Fixes: 8dc419447415 ("[PACKET]: Add optional checksum computation for recvmsg") Reported-by: DaeRyong Jeong threeearcat@gmail.com Reported-by: Byoungyoung Lee byoungyoung@purdue.edu Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/packet/af_packet.c | 60 +++++++++++++++++++++++++++++++++++-------------- net/packet/internal.h | 10 ++++---- 2 files changed, 49 insertions(+), 21 deletions(-)
--- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -323,11 +323,11 @@ static void packet_pick_tx_queue(struct skb_set_queue_mapping(skb, queue_index); }
-/* register_prot_hook must be invoked with the po->bind_lock held, +/* __register_prot_hook must be invoked through register_prot_hook * or from a context in which asynchronous accesses to the packet * socket is not possible (packet_create()). */ -static void register_prot_hook(struct sock *sk) +static void __register_prot_hook(struct sock *sk) { struct packet_sock *po = pkt_sk(sk);
@@ -342,8 +342,13 @@ static void register_prot_hook(struct so } }
-/* {,__}unregister_prot_hook() must be invoked with the po->bind_lock - * held. If the sync parameter is true, we will temporarily drop +static void register_prot_hook(struct sock *sk) +{ + lockdep_assert_held_once(&pkt_sk(sk)->bind_lock); + __register_prot_hook(sk); +} + +/* If the sync parameter is true, we will temporarily drop * the po->bind_lock and do a synchronize_net to make sure no * asynchronous packet processing paths still refer to the elements * of po->prot_hook. If the sync parameter is false, it is the @@ -353,6 +358,8 @@ static void __unregister_prot_hook(struc { struct packet_sock *po = pkt_sk(sk);
+ lockdep_assert_held_once(&po->bind_lock); + po->running = 0;
if (po->fanout) @@ -2861,7 +2868,7 @@ static int packet_create(struct net *net
if (proto) { po->prot_hook.type = proto; - register_prot_hook(sk); + __register_prot_hook(sk); }
mutex_lock(&net->packet.sklist_lock); @@ -3352,12 +3359,18 @@ packet_setsockopt(struct socket *sock, i
if (optlen != sizeof(val)) return -EINVAL; - if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) - return -EBUSY; if (copy_from_user(&val, optval, sizeof(val))) return -EFAULT; - po->tp_loss = !!val; - return 0; + + lock_sock(sk); + if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) { + ret = -EBUSY; + } else { + po->tp_loss = !!val; + ret = 0; + } + release_sock(sk); + return ret; } case PACKET_AUXDATA: { @@ -3368,7 +3381,9 @@ packet_setsockopt(struct socket *sock, i if (copy_from_user(&val, optval, sizeof(val))) return -EFAULT;
+ lock_sock(sk); po->auxdata = !!val; + release_sock(sk); return 0; } case PACKET_ORIGDEV: @@ -3380,7 +3395,9 @@ packet_setsockopt(struct socket *sock, i if (copy_from_user(&val, optval, sizeof(val))) return -EFAULT;
+ lock_sock(sk); po->origdev = !!val; + release_sock(sk); return 0; } case PACKET_VNET_HDR: @@ -3389,15 +3406,20 @@ packet_setsockopt(struct socket *sock, i
if (sock->type != SOCK_RAW) return -EINVAL; - if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) - return -EBUSY; if (optlen < sizeof(val)) return -EINVAL; if (copy_from_user(&val, optval, sizeof(val))) return -EFAULT;
- po->has_vnet_hdr = !!val; - return 0; + lock_sock(sk); + if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) { + ret = -EBUSY; + } else { + po->has_vnet_hdr = !!val; + ret = 0; + } + release_sock(sk); + return ret; } case PACKET_TIMESTAMP: { @@ -3428,11 +3450,17 @@ packet_setsockopt(struct socket *sock, i
if (optlen != sizeof(val)) return -EINVAL; - if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) - return -EBUSY; if (copy_from_user(&val, optval, sizeof(val))) return -EFAULT; - po->tp_tx_has_off = !!val; + + lock_sock(sk); + if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) { + ret = -EBUSY; + } else { + po->tp_tx_has_off = !!val; + ret = 0; + } + release_sock(sk); return 0; } case PACKET_QDISC_BYPASS: --- a/net/packet/internal.h +++ b/net/packet/internal.h @@ -100,10 +100,12 @@ struct packet_sock { int copy_thresh; spinlock_t bind_lock; struct mutex pg_vec_lock; - unsigned int running:1, /* prot_hook is attached*/ - auxdata:1, + unsigned int running; /* bind_lock must be held */ + unsigned int auxdata:1, /* writer must hold sock lock */ origdev:1, - has_vnet_hdr:1; + has_vnet_hdr:1, + tp_loss:1, + tp_tx_has_off:1; int ifindex; /* bound device */ __be16 num; struct packet_mclist *mclist; @@ -111,8 +113,6 @@ struct packet_sock { enum tpacket_versions tp_version; unsigned int tp_hdrlen; unsigned int tp_reserve; - unsigned int tp_loss:1; - unsigned int tp_tx_has_off:1; unsigned int tp_tstamp; struct net_device __rcu *cached_dev; int (*xmit)(struct sk_buff *skb);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7212303268918b9a203aebeacfdbd83b5e87b20d ]
syzbot/KMSAN reported an uninit-value in tcp_parse_options() [1]
I believe this was caused by a TCP_MD5SIG being set on live flow.
This is highly unexpected, since TCP option space is limited.
For instance, presence of TCP MD5 option automatically disables TCP TimeStamp option at SYN/SYNACK time, which we can not do once flow has been established.
Really, adding/deleting an MD5 key only makes sense on sockets in CLOSE or LISTEN state.
[1] BUG: KMSAN: uninit-value in tcp_parse_options+0xd74/0x1a30 net/ipv4/tcp_input.c:3720 CPU: 1 PID: 6177 Comm: syzkaller192004 Not tainted 4.16.0+ #83 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 tcp_parse_options+0xd74/0x1a30 net/ipv4/tcp_input.c:3720 tcp_fast_parse_options net/ipv4/tcp_input.c:3858 [inline] tcp_validate_incoming+0x4f1/0x2790 net/ipv4/tcp_input.c:5184 tcp_rcv_established+0xf60/0x2bb0 net/ipv4/tcp_input.c:5453 tcp_v4_do_rcv+0x6cd/0xd90 net/ipv4/tcp_ipv4.c:1469 sk_backlog_rcv include/net/sock.h:908 [inline] __release_sock+0x2d6/0x680 net/core/sock.c:2271 release_sock+0x97/0x2a0 net/core/sock.c:2786 tcp_sendmsg+0xd6/0x100 net/ipv4/tcp.c:1464 inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] SYSC_sendto+0x6c3/0x7e0 net/socket.c:1747 SyS_sendto+0x8a/0xb0 net/socket.c:1715 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x448fe9 RSP: 002b:00007fd472c64d38 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000006e5a30 RCX: 0000000000448fe9 RDX: 000000000000029f RSI: 0000000020a88f88 RDI: 0000000000000004 RBP: 00000000006e5a34 R08: 0000000020e68000 R09: 0000000000000010 R10: 00000000200007fd R11: 0000000000000216 R12: 0000000000000000 R13: 00007fff074899ef R14: 00007fd472c659c0 R15: 0000000000000009
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] tcp_send_ack+0x18c/0x910 net/ipv4/tcp_output.c:3624 __tcp_ack_snd_check net/ipv4/tcp_input.c:5040 [inline] tcp_ack_snd_check net/ipv4/tcp_input.c:5053 [inline] tcp_rcv_established+0x2103/0x2bb0 net/ipv4/tcp_input.c:5469 tcp_v4_do_rcv+0x6cd/0xd90 net/ipv4/tcp_ipv4.c:1469 sk_backlog_rcv include/net/sock.h:908 [inline] __release_sock+0x2d6/0x680 net/core/sock.c:2271 release_sock+0x97/0x2a0 net/core/sock.c:2786 tcp_sendmsg+0xd6/0x100 net/ipv4/tcp.c:1464 inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] SYSC_sendto+0x6c3/0x7e0 net/socket.c:1747 SyS_sendto+0x8a/0xb0 net/socket.c:1715 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Acked-by: Yuchung Cheng ycheng@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2593,8 +2593,10 @@ static int do_tcp_setsockopt(struct sock
#ifdef CONFIG_TCP_MD5SIG case TCP_MD5SIG: - /* Read the IP->Key mappings from userspace */ - err = tp->af_specific->md5_parse(sk, optval, optlen); + if ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) + err = tp->af_specific->md5_parse(sk, optval, optlen); + else + err = -EINVAL; break; #endif case TCP_USER_TIMEOUT:
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 5171b37d959641bbc619781caf62e61f7b940871 ]
In order to remove the race caught by syzbot [1], we need to lock the socket before using po->tp_version as this could change under us otherwise.
This means lock_sock() and release_sock() must be done by packet_set_ring() callers.
[1] : BUG: KMSAN: uninit-value in packet_set_ring+0x1254/0x3870 net/packet/af_packet.c:4249 CPU: 0 PID: 20195 Comm: syzkaller707632 Not tainted 4.16.0+ #83 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 packet_set_ring+0x1254/0x3870 net/packet/af_packet.c:4249 packet_setsockopt+0x12c6/0x5a90 net/packet/af_packet.c:3662 SYSC_setsockopt+0x4b8/0x570 net/socket.c:1849 SyS_setsockopt+0x76/0xa0 net/socket.c:1828 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x449099 RSP: 002b:00007f42b5307ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 000000000070003c RCX: 0000000000449099 RDX: 0000000000000005 RSI: 0000000000000107 RDI: 0000000000000003 RBP: 0000000000700038 R08: 000000000000001c R09: 0000000000000000 R10: 00000000200000c0 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000080eecf R14: 00007f42b53089c0 R15: 0000000000000001
Local variable description: ----req_u@packet_setsockopt Variable was created at: packet_setsockopt+0x13f/0x5a90 net/packet/af_packet.c:3612 SYSC_setsockopt+0x4b8/0x570 net/socket.c:1849
Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/packet/af_packet.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2633,6 +2633,7 @@ static int packet_release(struct socket
packet_flush_mclist(sk);
+ lock_sock(sk); if (po->rx_ring.pg_vec) { memset(&req_u, 0, sizeof(req_u)); packet_set_ring(sk, &req_u, 1, 0); @@ -2642,6 +2643,7 @@ static int packet_release(struct socket memset(&req_u, 0, sizeof(req_u)); packet_set_ring(sk, &req_u, 1, 1); } + release_sock(sk);
fanout_release(sk);
@@ -3276,6 +3278,7 @@ packet_setsockopt(struct socket *sock, i union tpacket_req_u req_u; int len;
+ lock_sock(sk); switch (po->tp_version) { case TPACKET_V1: case TPACKET_V2: @@ -3286,14 +3289,21 @@ packet_setsockopt(struct socket *sock, i len = sizeof(req_u.req3); break; } - if (optlen < len) - return -EINVAL; - if (pkt_sk(sk)->has_vnet_hdr) - return -EINVAL; - if (copy_from_user(&req_u.req, optval, len)) - return -EFAULT; - return packet_set_ring(sk, &req_u, 0, - optname == PACKET_TX_RING); + if (optlen < len) { + ret = -EINVAL; + } else { + if (pkt_sk(sk)->has_vnet_hdr) { + ret = -EINVAL; + } else { + if (copy_from_user(&req_u.req, optval, len)) + ret = -EFAULT; + else + ret = packet_set_ring(sk, &req_u, 0, + optname == PACKET_TX_RING); + } + } + release_sock(sk); + return ret; } case PACKET_COPY_THRESH: { @@ -3825,7 +3835,6 @@ static int packet_set_ring(struct sock * /* Added to avoid minimal code churn */ struct tpacket_req *req = &req_u->req;
- lock_sock(sk); /* Opening a Tx-ring is NOT supported in TPACKET_V3 */ if (!closing && tx_ring && (po->tp_version > TPACKET_V2)) { WARN(1, "Tx-ring is not supported.\n"); @@ -3961,7 +3970,6 @@ static int packet_set_ring(struct sock * if (pg_vec) free_pg_vec(pg_vec, order, req->tp_block_nr); out: - release_sock(sk); return err; }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit b905ef9ab90115d001c1658259af4b1c65088779 ]
The connection timers of an llc sock could be still flying after we delete them in llc_sk_free(), and even possibly after we free the sock. We could just wait synchronously here in case of troubles.
Note, I leave other call paths as they are, since they may not have to wait, at least we can change them to synchronously when needed.
Also, move the code to net/llc/llc_conn.c, which is apparently a better place.
Reported-by: syzbot+f922284c18ea23a8e457@syzkaller.appspotmail.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/llc_conn.h | 1 + net/llc/llc_c_ac.c | 9 +-------- net/llc/llc_conn.c | 22 +++++++++++++++++++++- 3 files changed, 23 insertions(+), 9 deletions(-)
--- a/include/net/llc_conn.h +++ b/include/net/llc_conn.h @@ -97,6 +97,7 @@ static __inline__ char llc_backlog_type(
struct sock *llc_sk_alloc(struct net *net, int family, gfp_t priority, struct proto *prot); +void llc_sk_stop_all_timers(struct sock *sk, bool sync); void llc_sk_free(struct sock *sk);
void llc_sk_reset(struct sock *sk); --- a/net/llc/llc_c_ac.c +++ b/net/llc/llc_c_ac.c @@ -1096,14 +1096,7 @@ int llc_conn_ac_inc_tx_win_size(struct s
int llc_conn_ac_stop_all_timers(struct sock *sk, struct sk_buff *skb) { - struct llc_sock *llc = llc_sk(sk); - - del_timer(&llc->pf_cycle_timer.timer); - del_timer(&llc->ack_timer.timer); - del_timer(&llc->rej_sent_timer.timer); - del_timer(&llc->busy_state_timer.timer); - llc->ack_must_be_send = 0; - llc->ack_pf = 0; + llc_sk_stop_all_timers(sk, false); return 0; }
--- a/net/llc/llc_conn.c +++ b/net/llc/llc_conn.c @@ -951,6 +951,26 @@ out: return sk; }
+void llc_sk_stop_all_timers(struct sock *sk, bool sync) +{ + struct llc_sock *llc = llc_sk(sk); + + if (sync) { + del_timer_sync(&llc->pf_cycle_timer.timer); + del_timer_sync(&llc->ack_timer.timer); + del_timer_sync(&llc->rej_sent_timer.timer); + del_timer_sync(&llc->busy_state_timer.timer); + } else { + del_timer(&llc->pf_cycle_timer.timer); + del_timer(&llc->ack_timer.timer); + del_timer(&llc->rej_sent_timer.timer); + del_timer(&llc->busy_state_timer.timer); + } + + llc->ack_must_be_send = 0; + llc->ack_pf = 0; +} + /** * llc_sk_free - Frees a LLC socket * @sk - socket to free @@ -963,7 +983,7 @@ void llc_sk_free(struct sock *sk)
llc->state = LLC_CONN_OUT_OF_SVC; /* Stop all (possibly) running timers */ - llc_conn_ac_stop_all_timers(sk, NULL); + llc_sk_stop_all_timers(sk, true); #ifdef DEBUG_LLC_CONN_ALLOC printk(KERN_INFO "%s: unackq=%d, txq=%d\n", __func__, skb_queue_len(&llc->pdu_unack_q),
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit aa8f8778493c85fff480cdf8b349b1e1dcb5f243 ]
KMSAN reported use of uninit-value that I tracked to lack of proper size check on RTA_TABLE attribute.
I also believe RTA_PREFSRC lacks a similar check.
Fixes: 86872cb57925 ("[IPv6] route: FIB6 configuration using struct fib6_config") Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Acked-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2333,11 +2333,13 @@ void rt6_mtu_change(struct net_device *d
static const struct nla_policy rtm_ipv6_policy[RTA_MAX+1] = { [RTA_GATEWAY] = { .len = sizeof(struct in6_addr) }, + [RTA_PREFSRC] = { .len = sizeof(struct in6_addr) }, [RTA_OIF] = { .type = NLA_U32 }, [RTA_IIF] = { .type = NLA_U32 }, [RTA_PRIORITY] = { .type = NLA_U32 }, [RTA_METRICS] = { .type = NLA_NESTED }, [RTA_MULTIPATH] = { .len = sizeof(struct rtnexthop) }, + [RTA_TABLE] = { .type = NLA_U32 }, };
static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin K. Petersen martin.petersen@oracle.com
commit 94e5395d2403c8bc2504a7cbe4c4caaacb7b8b84 upstream.
First generation MPT Fusion controllers can not translate WRITE SAME when the attached device is a SATA drive. Disable WRITE SAME support.
Reported-by: Nikola Ciprich nikola.ciprich@linuxbox.cz Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/message/fusion/mptsas.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/message/fusion/mptsas.c +++ b/drivers/message/fusion/mptsas.c @@ -1994,6 +1994,7 @@ static struct scsi_host_template mptsas_ .cmd_per_lun = 7, .use_clustering = ENABLE_CLUSTERING, .shost_attrs = mptscsih_host_attrs, + .no_write_same = 1, };
static int mptsas_get_linkerrors(struct sas_phy *phy)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
commit 9de4ee40547fd315d4a0ed1dd15a2fa3559ad707 upstream.
This cast is wrong. "cdi->capacity" is an int and "arg" is an unsigned long. The way the check is written now, if one of the high 32 bits is set then we could read outside the info->slots[] array.
This bug is pretty old and it predates git.
Reviewed-by: Christoph Hellwig hch@lst.de Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cdrom/cdrom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/cdrom/cdrom.c +++ b/drivers/cdrom/cdrom.c @@ -2357,7 +2357,7 @@ static int cdrom_ioctl_media_changed(str if (!CDROM_CAN(CDC_SELECT_DISC) || arg == CDSL_CURRENT) return media_changed(cdi, 1);
- if ((unsigned int)arg >= cdi->capacity) + if (arg >= cdi->capacity) return -EINVAL;
info = kmalloc(sizeof(*info), GFP_KERNEL);
On 27/04/18 20:57, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
thanks,
greg k-h
build and tested on rn4, No dmesg regressions.
thanks,
On Fri, Apr 27, 2018 at 11:00:41PM +0700, Dede Dindin Qudsy wrote:
On 27/04/18 20:57, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
thanks,
greg k-h
build and tested on rn4, No dmesg regressions.
What is a "rn4"? Anyway, thanks for testing and letting me know.
greg k-h
On 28 April 2018 11:21:20 AM IST, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Fri, Apr 27, 2018 at 11:00:41PM +0700, Dede Dindin Qudsy wrote:
On 27/04/18 20:57, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107
release.
There are 24 patches in this series, all will be posted as a
response
to this one. If anyone has any issues with these being applied,
please
let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc...
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
build and tested on rn4, No dmesg regressions.
What is a "rn4"? Anyway, thanks for testing and letting me know.
greg k-h
Xiaomi Redmi Note 4, most GPL adhering Android OEM ever :P
On 04/27/2018 07:57 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Fri, Apr 27, 2018 at 12:12:32PM -0600, Shuah Khan wrote:
On 04/27/2018 07:57 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Thanks for testing all of these and letting me know.
greg k-h
On Fri, Apr 27, 2018 at 03:57:35PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release.
The kernel.org page currently lists 3.18 as EOL. I assume we released an update for Spectre/Meltdown after we declared it end of life, but I was surprised to see that there was going to be a 3.18.107.
Should we change how 3.18 is listed on kernel.org? Or should we just keep peeople wondering whether will be updates after 3.18.107? :-)
- Ted
On Fri, Apr 27, 2018 at 03:41:28PM -0400, Theodore Y. Ts'o wrote:
On Fri, Apr 27, 2018 at 03:57:35PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release.
The kernel.org page currently lists 3.18 as EOL. I assume we released an update for Spectre/Meltdown after we declared it end of life, but I was surprised to see that there was going to be a 3.18.107.
There are no patches for Spectre/Meltdown in the 3.18.y kernel tree that I know of.
Should we change how 3.18 is listed on kernel.org? Or should we just keep peeople wondering whether will be updates after 3.18.107? :-)
Keep people wondering.
As I have stated before, I'm keeping it alive as there are still a few tens of millions of devices out there using this kernel, including one of my personal phones. I might get tired of keeping it semi-up-to-date and then will stop releasing updates, and reserve the right to do so at any point in time :)
thanks,
greg k-h
On 27 April 2018 7:27:35 PM IST, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
No issues on the OnePlus 3, usage seems fine so far. Thanks!
On Sat, Apr 28, 2018 at 03:03:04AM +0530, Harsh Shandilya wrote:
On 27 April 2018 7:27:35 PM IST, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.107-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
No issues on the OnePlus 3, usage seems fine so far. Thanks!
Wonderful, thanks for testing and letting me know.
greg k-h
On 04/27/2018 06:57 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 3.18.107 release. There are 24 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Apr 29 13:56:20 UTC 2018. Anything received after that time might be too late.
Build results: total: 136 pass: 136 fail: 0 Qemu test results: total: 119 pass: 119 fail: 0
Details are available at http://kerneltests.org/builders/.
Guenter
linux-stable-mirror@lists.linaro.org