Hi,
Commit bca9ca[0] causes a build failure while building for a G4 system
since 5.10.8:
arch/powerpc/kernel/head_book3s_32.S: Assembler messages:
arch/powerpc/kernel/head_book3s_32.S:296: Error: attempt to move .org backwards
make[2]: *** [scripts/Makefile.build:360:
arch/powerpc/kernel/head_book3s_32.o] Error 1
Reverting the commit allows it to build. I've uploaded the config[1],
but let me know if you need other information.
Thanks.
David
[0] https://github.com/gregkh/linux/commit/bca9ca5a603f6c5586a7dfd35e06abe6d5fc…
[1] https://dpaste.com/7SZMWCU89.txt
Patch f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated") introduced a change that results in a circular
locking dependency when a Secure Execution guest that is configured with
crypto devices is started. The problem arises because the patch moved
the setting of the guest's AP masks under the protection of the
matrix_dev->lock when the vfio_ap driver is notified that the KVM
pointer has been set. Since it is not critical that the guest's AP masks
be set or cleared under that lock when the driver is notified, the masks
will no longer be updated under the matrix_dev->lock. The lock is still
necessary for setting/unsetting the KVM pointer, however, so that will
remain in place.
The dependency chain for the circular lockdep resolved by this patch
is:
#2: vfio_ap_mdev_group_notifier: kvm->lock
matrix_dev->lock
#1: handle_pqap: matrix_dev->lock
kvm_vcpu_ioctl: vcpu->mutex
#0: kvm_s390_cpus_to_pv: vcpu->mutex
kvm_vm_ioctl: kvm->lock
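The cycle in the chain above can be illustrated with a small user-space
sketch. It is not lockdep and not the driver code; it only models the three
lock classes as a directed graph and checks for a cycle. The reading of the
chain (each entry names the lock being acquired, with the lock already held
listed beneath it) is an assumption, and all identifiers below are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the dependency chain (model only). */
enum lock_class { KVM_LOCK, MATRIX_DEV_LOCK, VCPU_MUTEX, NR_LOCKS };

/* edge[a][b] == true means "b is acquired while a is held". */
struct lock_graph {
    bool edge[NR_LOCKS][NR_LOCKS];
};

/* Depth-first search; returns true if a lock on the current path is reached again. */
static bool visit(const struct lock_graph *g, int node, bool *on_path, bool *done)
{
    if (on_path[node])
        return true;
    if (done[node])
        return false;
    on_path[node] = true;
    for (int next = 0; next < NR_LOCKS; next++)
        if (g->edge[node][next] && visit(g, next, on_path, done))
            return true;
    on_path[node] = false;
    done[node] = true;
    return false;
}

static bool has_cycle(const struct lock_graph *g)
{
    bool on_path[NR_LOCKS] = { false };
    bool done[NR_LOCKS] = { false };

    for (int n = 0; n < NR_LOCKS; n++)
        if (visit(g, n, on_path, done))
            return true;
    return false;
}

/*
 * Build the assumed ordering graph. masks_under_matrix_lock == true models
 * the state after f21916ec4826, where the AP masks (which need kvm->lock)
 * are set while matrix_dev->lock is held.
 */
static struct lock_graph build_graph(bool masks_under_matrix_lock)
{
    struct lock_graph g;

    memset(&g, 0, sizeof(g));
    g.edge[KVM_LOCK][VCPU_MUTEX] = true;        /* #0: kvm_s390_cpus_to_pv */
    g.edge[VCPU_MUTEX][MATRIX_DEV_LOCK] = true; /* #1: handle_pqap */
    if (masks_under_matrix_lock)
        g.edge[MATRIX_DEV_LOCK][KVM_LOCK] = true; /* #2: group notifier */
    return g;
}
```

In this model, dropping the #2 edge (updating the masks outside
matrix_dev->lock) is enough to break the cycle, which matches the approach
described above.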
Tony Krowiak (1):
s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++-------------
1 file changed, 45 insertions(+), 30 deletions(-)
--
2.21.1
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t. With
CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, whereas passing "inode64" in the
mount options will fail. This leads to erroneous behaviours such as this:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.
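For illustration only, the resulting dependency can be read as a boolean
predicate. The sketch below is plain user-space C, not Kconfig; the struct
and function names are hypothetical and exist only to show which
combinations keep the option selectable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant Kconfig symbols. */
struct arch_config {
    bool tmpfs;     /* TMPFS */
    bool is_64bit;  /* 64BIT */
    bool s390;      /* S390 */
    bool alpha;     /* ALPHA */
};

/* Mirrors "depends on TMPFS && 64BIT && !(S390 || ALPHA)". */
static bool tmpfs_inode64_selectable(const struct arch_config *c)
{
    return c->tmpfs && c->is_64bit && !(c->s390 || c->alpha);
}
```

With this expression, a 64-bit architecture other than s390 and alpha can
still select the option, while alpha and s390 cannot.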
Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Richard Henderson <rth(a)twiddle.net>
Cc: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru>
Cc: Matt Turner <mattst88(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-alpha
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT && !S390
+ depends on TMPFS && 64BIT && !(S390 || ALPHA)
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
The patch titled
Subject: nilfs2: make splice write available again
has been removed from the -mm tree. Its filename was
nilfs2-make-splice-write-available-again.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Joachim Henke <joachim.henke(a)t-systems.com>
Subject: nilfs2: make splice write available again
Since 5.10, splice() or sendfile() to NILFS2 return EINVAL. This was
caused by commit 36e2c7421f02 ("fs: don't allow splice read/write without
explicit ops").
This patch initializes the splice_write field in file_operations, like
most file systems do, to restore the functionality.
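A simplified user-space model of the behaviour, assuming that since
36e2c7421f02 the splice path returns -EINVAL when the target file has no
splice_write operation. The types and function names below are
hypothetical and do not match the kernel's signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct file_operations. */
struct model_file_ops {
    long (*splice_write)(size_t len);
};

/* Stand-in for a generic splice_write helper: pretends all bytes were written. */
static long model_iter_splice_write(size_t len)
{
    return (long)len;
}

/* Model of the dispatch: no explicit op means the request is rejected. */
static long model_splice_to_file(const struct model_file_ops *ops, size_t len)
{
    if (!ops->splice_write)
        return -EINVAL;
    return ops->splice_write(len);
}

/* Before the patch: field left unset. After: field initialized. */
static const struct model_file_ops model_nilfs_ops_before = {
    .splice_write = NULL,
};
static const struct model_file_ops model_nilfs_ops_after = {
    .splice_write = model_iter_splice_write,
};
```

In the model, the "before" table rejects every request with -EINVAL, and
the "after" table passes the request through to the helper.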
Link: https://lkml.kernel.org/r/1612784101-14353-1-git-send-email-konishi.ryusuke…
Signed-off-by: Joachim Henke <joachim.henke(a)t-systems.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/file.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/nilfs2/file.c~nilfs2-make-splice-write-available-again
+++ a/fs/nilfs2/file.c
@@ -141,6 +141,7 @@ const struct file_operations nilfs_file_
/* .release = nilfs_release_file, */
.fsync = nilfs_sync_file,
.splice_read = generic_file_splice_read,
+ .splice_write = iter_file_splice_write,
};
const struct inode_operations nilfs_file_inode_operations = {
_
Patches currently in -mm which might be from joachim.henke(a)t-systems.com are
The patch titled
Subject: mm, slub: better heuristic for number of cpus when calculating slab order
has been removed from the -mm tree. Its filename was
mm-slub-better-heuristic-for-number-of-cpus-when-calculating-slab-order.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, slub: better heuristic for number of cpus when calculating slab order
When creating a new kmem cache, SLUB determines how large the slab pages will
be based on a number of inputs, including the number of CPUs in the system. Larger
slab pages mean that more objects can be allocated/freed from per-cpu slabs
before accessing shared structures, but also potentially more memory can be
wasted due to low slab usage and fragmentation.
The rough idea of using number of CPUs is that larger systems will be more
likely to benefit from reduced contention, and also should have enough memory
to spare.
The number of CPUs used to be determined as nr_cpu_ids, which is the number of
possible cpus, but on some systems many will never be onlined, thus commit 045ab8c9487b
("mm/slub: let number of online CPUs determine the slub page order") changed it
to num_online_cpus(). However, for kmem caches created early before CPUs are
onlined, this may lead to permanently low slab page sizes.
Vincent reports a regression [1] of hackbench on arm64 systems:
> I'm facing significant performances regression on a large arm64 server
> system (224 CPUs). Regressions is also present on small arm64 system
> (8 CPUs) but in a far smaller order of magnitude
> On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
> v5.11-rc4 : 9.135sec (+/- 0.45%)
> v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
> v5.10: 3.136sec (+/- 0.40%)
Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
page allocator contention:
> i.e. the patch incurs a 7% to 32% performance penalty. This bisected
> cleanly yesterday when I was looking for the regression and then found
> the thread.
> Numerous caches change size. For example, kmalloc-512 goes from order-0
> (vanilla) to order-2 with the revert.
> So mostly this is down to the number of times SLUB calls into the page
> allocator which only caches order-0 pages on a per-cpu basis.
Clearly num_online_cpus() doesn't work too early in bootup. We could change
the order dynamically in a memory hotplug callback, but runtime order changing
for existing kmem caches has been already shown as dangerous, and removed in
32a6f409b693 ("mm, slub: remove runtime allocation order changes"). It could be
resurrected in a safe manner with some effort, but to fix the regression we
need something simpler.
We could use num_present_cpus() that should be the number of physically
present CPUs even before they are onlined. That would work for PowerPC
[3], which triggered the original commit, but that still doesn't work on
arm64 [4] as explained in [5].
So this patch tries to determine the best available value without specific
arch knowledge.
- num_present_cpus() if the number is larger than 1, as that means the
arch is likely setting it properly
- nr_cpu_ids otherwise
This should fix the reported regressions while also keeping the effect of
045ab8c9487b for PowerPC systems. It's possible there are configurations
where num_present_cpus() is 1 during boot while nr_cpu_ids is at the same
time bloated, so these (if they exist) would keep the large orders based
on nr_cpu_ids as was before 045ab8c9487b.
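The heuristic can be tried out in user space with the sketch below. It is a
model, not the kernel code: model_fls() is a hypothetical reimplementation
of fls() (1-based index of the highest set bit, 0 for 0), and the CPU
counts are passed in as plain parameters.

```c
#include <assert.h>

/* 1-based position of the most significant set bit; 0 if x == 0. */
static unsigned int model_fls(unsigned int x)
{
    unsigned int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/*
 * Pick the CPU count as described above: trust the present count only if
 * it is larger than 1, otherwise fall back to the possible count, then
 * derive min_objects the same way calculate_order() does.
 */
static unsigned int model_min_objects(unsigned int present_cpus,
                                      unsigned int possible_cpus)
{
    unsigned int nr_cpus = present_cpus;

    if (nr_cpus <= 1)
        nr_cpus = possible_cpus;
    return 4 * (model_fls(nr_cpus) + 1);
}
```

For a 224-CPU system this gives 4 * (8 + 1) = 36 objects, whereas a count
of 1 (as num_online_cpus() can report early in boot) would give
4 * (1 + 1) = 8.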
[1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj…
[2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
[3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03Ys…
[5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
Link: https://lkml.kernel.org/r/20210208134108.22286-1-vbabka@suse.cz
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reported-by: Mel Gorman <mgorman(a)techsingularity.net>
Tested-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Bharata B Rao <bharata(a)linux.ibm.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
--- a/mm/slub.c~mm-slub-better-heuristic-for-number-of-cpus-when-calculating-slab-order
+++ a/mm/slub.c
@@ -3423,6 +3423,7 @@ static inline int calculate_order(unsign
unsigned int order;
unsigned int min_objects;
unsigned int max_objects;
+ unsigned int nr_cpus;
/*
* Attempt to find best configuration for a slab. This
@@ -3433,8 +3434,21 @@ static inline int calculate_order(unsign
* we reduce the minimum objects required in a slab.
*/
min_objects = slub_min_objects;
- if (!min_objects)
- min_objects = 4 * (fls(num_online_cpus()) + 1);
+ if (!min_objects) {
+ /*
+ * Some architectures will only update present cpus when
+ * onlining them, so don't trust the number if it's just 1. But
+ * we also don't want to use nr_cpu_ids always, as on some other
+ * architectures, there can be many possible cpus, but never
+ * onlined. Here we compromise between trying to avoid too high
+ * order on systems that appear larger than they are, and too
+ * low order on systems that appear smaller than they are.
+ */
+ nr_cpus = num_present_cpus();
+ if (nr_cpus <= 1)
+ nr_cpus = nr_cpu_ids;
+ min_objects = 4 * (fls(nr_cpus) + 1);
+ }
max_objects = order_objects(slub_max_order, size);
min_objects = min(min_objects, max_objects);
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-slub-stop-freeing-kmem_cache_node-structures-on-node-offline.patch
mm-slab-slub-stop-taking-memory-hotplug-lock.patch
mm-slab-slub-stop-taking-cpu-hotplug-lock.patch
mm-slub-splice-cpu-and-page-freelists-in-deactivate_slab.patch
mm-slub-remove-slub_memcg_sysfs-boot-param-and-config_slub_memcg_sysfs_on.patch
The patch titled
Subject: Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
has been removed from the -mm tree. Its filename was
revert-mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
This reverts commit 536d3bf261a2fc3b05b3e91e7eef7383443015cf, as it can
cause writers to memory.high to get stuck in the kernel forever,
performing page reclaim and consuming excessive amounts of CPU cycles.
Before the patch, a write to memory.high would first put the new limit in
place for the workload, and then reclaim the requested delta. After the
patch, the kernel tries to reclaim the delta before putting the new limit
into place, in order to not overwhelm the workload with a sudden, large
excess over the limit. However, if reclaim is actively racing with new
allocations from the uncurbed workload, it can keep the write() working
inside the kernel indefinitely.
This is causing problems in Facebook production. A privileged
system-level daemon that adjusts memory.high for various workloads running
on a host can get unexpectedly stuck in the kernel and essentially turn
into a sort of involuntary kswapd for one of the workloads. We've
observed that daemon busy-spin in a write() for minutes at a time,
neglecting its other duties on the system, and expending privileged system
resources on behalf of a workload.
To remedy this, we have first considered changing the reclaim logic to
break out after a couple of loops - whether the workload has converged to
the new limit or not - and bound the write() call this way. However, the
root cause that inspired the sequence change in the first place has been
fixed through other means, and so a revert back to the proven
limit-setting sequence, also used by memory.max, is preferable.
The sequence was changed to avoid extreme latencies in the workload when
the limit was lowered: the sudden, large excess created by the limit
lowering would erroneously trigger the penalty sleeping code that is meant
to throttle excessive growth from below. Allocating threads could end up
sleeping long after the write() had already reclaimed the delta for which
they were being punished.
However, erroneous throttling also caused problems in other scenarios at
around the same time. This resulted in commit b3ff92916af3 ("mm, memcg:
reclaim more aggressively before high allocator throttling"), included in
the same release as the offending commit. When allocating threads now
encounter large excess caused by a racing write() to memory.high, instead
of entering punitive sleeps, they will simply be tasked with helping
reclaim down the excess, and will be held no longer than it takes to
accomplish that. This is in line with regular limit enforcement - i.e.
if the workload allocates up against or over an otherwise unchanged limit
from below.
With the patch breaking userspace, and the root cause addressed by other
means already, revert it again.
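The difference between the two orderings can be shown with a toy user-space
model. It is a deliberate simplification under stated assumptions: the
workload allocates a fixed number of pages per step, allocations that would
exceed the current limit are simply held back, and reclaim frees a fixed
number of pages per loop. None of the names below exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_memcg {
    unsigned long usage; /* pages currently charged */
    unsigned long high;  /* current memory.high, in pages */
};

/* Workload step: allocate only if that stays within the current limit. */
static void toy_workload_step(struct toy_memcg *cg, unsigned long alloc)
{
    if (cg->usage + alloc <= cg->high)
        cg->usage += alloc;
}

/* Reclaim up to nr pages. */
static void toy_reclaim(struct toy_memcg *cg, unsigned long nr)
{
    if (nr > cg->usage)
        nr = cg->usage;
    cg->usage -= nr;
}

/*
 * Model of a write to memory.high. Returns the number of reclaim loops the
 * writer needed, or -1 if it had not converged after max_iters loops.
 */
static long toy_write_high(struct toy_memcg *cg, unsigned long new_high,
                           bool set_limit_first, unsigned long alloc_rate,
                           unsigned long reclaim_rate, long max_iters)
{
    long iters;

    if (set_limit_first)
        cg->high = new_high;          /* ordering restored by the revert */

    for (iters = 0; iters < max_iters; iters++) {
        if (cg->usage <= new_high) {
            cg->high = new_high;      /* reclaim-first ordering sets it here */
            return iters;
        }
        toy_reclaim(cg, reclaim_rate);
        toy_workload_step(cg, alloc_rate); /* racing allocations */
    }
    cg->high = new_high;
    return -1;
}
```

In this model, when the workload allocates as fast as the writer reclaims,
the reclaim-first ordering never converges, while setting the limit first
lets the writer finish after a bounded number of loops.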
Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Michal Koutný <mkoutny(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/memcontrol.c~revert-mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh
+++ a/mm/memcontrol.c
@@ -6271,6 +6271,8 @@ static ssize_t memory_high_write(struct
if (err)
return err;
+ page_counter_set_high(&memcg->memory, high);
+
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
unsigned long reclaimed;
@@ -6294,10 +6296,7 @@ static ssize_t memory_high_write(struct
break;
}
- page_counter_set_high(&memcg->memory, high);
-
memcg_wb_domain_size_changed(memcg);
-
return nbytes;
}
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
fs-buffer-use-raw-page_memcg-on-locked-page.patch
mm-vmstat-fix-nohz-wakeups-for-node-stat-changes.patch
mm-vmstat-add-some-comments-on-internal-storage-of-byte-items.patch
The patch titled
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
has been removed from the -mm tree. Its filename was
tmpfs-disallow-config_tmpfs_inode64-on-alpha.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t. With
CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, whereas passing "inode64" in the
mount options will fail. This leads to erroneous behaviours such as this:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.
Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Richard Henderson <rth(a)twiddle.net>
Cc: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru>
Cc: Matt Turner <mattst88(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-alpha
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT && !S390
+ depends on TMPFS && 64BIT && !(S390 || ALPHA)
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
Patches currently in -mm which might be from seth.forshee(a)canonical.com are
The patch titled
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
has been removed from the -mm tree. Its filename was
tmpfs-disallow-config_tmpfs_inode64-on-s390.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
Currently there is an assumption in tmpfs that 64-bit architectures also
have a 64-bit ino_t. This is not true on s390 which has a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, but passing the "inode64" mount
option will fail. This leads to the following behavior:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
As mount sees "inode64" in the mount options and thus passes it in the
options for the remount.
So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-s390
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT
+ depends on TMPFS && 64BIT && !S390
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
Patches currently in -mm which might be from seth.forshee(a)canonical.com are
The patch titled
Subject: squashfs: add more sanity checks in xattr id lookup
has been removed from the -mm tree. Its filename was
squashfs-add-more-sanity-checks-in-xattr-id-lookup.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in xattr id lookup
Syzbot has reported a warning where a kmalloc() attempt exceeds the
maximum limit. This has been identified as corruption of the xattr_ids
count when reading the xattr id lookup table.
This patch adds a number of additional sanity checks to detect this
corruption and others.
1. It checks for a corrupted xattr index read from the inode. This could
be because the metadata block is uncompressed, or because the
"compression" bit has been corrupted (turning a compressed block
into an uncompressed block). This would cause an out of bounds read.
2. It checks against corruption of the xattr_ids count. This can either
lead to the above kmalloc failure, or a smaller than expected
table to be read.
3. It checks the contents of the index table for corruption.
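A user-space sketch of the index table content check described in item 3.
It follows the rules spelled out in the patch comment (entries strictly
increasing, gaps no larger than one metadata block, last entry below the
table start, xattr_table_start below the first entry). The constant 8192
for SQUASHFS_METADATA_SIZE and all function names are assumptions of this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_METADATA_SIZE 8192u /* assumed value of SQUASHFS_METADATA_SIZE */

/*
 * Returns true if the (already byte-swapped) index table looks sane.
 * table[] holds the on-disk locations of the metadata blocks.
 */
static bool model_xattr_index_ok(const uint64_t *table, unsigned int indexes,
                                 uint64_t table_start,
                                 uint64_t xattr_table_start)
{
    unsigned int n;
    uint64_t last;

    if (indexes == 0)
        return false;

    /* Each entry must be below the next, at most one metadata block apart. */
    for (n = 0; n + 1 < indexes; n++) {
        uint64_t start = table[n];
        uint64_t end = table[n + 1];

        if (start >= end || end - start > MODEL_METADATA_SIZE)
            return false;
    }

    /* The last block must end before the index table itself starts. */
    last = table[indexes - 1];
    if (last >= table_start || table_start - last > MODEL_METADATA_SIZE)
        return false;

    /* The xattr data must precede the first id block. */
    return xattr_table_start < table[0];
}
```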
[phillip(a)squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co…
Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+2ccea6339d368360800d(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/xattr_id.c | 66 +++++++++++++++++++++++++++++++++------
1 file changed, 57 insertions(+), 9 deletions(-)
--- a/fs/squashfs/xattr_id.c~squashfs-add-more-sanity-checks-in-xattr-id-lookup
+++ a/fs/squashfs/xattr_id.c
@@ -31,10 +31,15 @@ int squashfs_xattr_lookup(struct super_b
struct squashfs_sb_info *msblk = sb->s_fs_info;
int block = SQUASHFS_XATTR_BLOCK(index);
int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index);
- u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+ u64 start_block;
struct squashfs_xattr_id id;
int err;
+ if (index >= msblk->xattr_ids)
+ return -EINVAL;
+
+ start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+
err = squashfs_read_metadata(sb, &id, &start_block, &offset,
sizeof(id));
if (err < 0)
@@ -50,13 +55,17 @@ int squashfs_xattr_lookup(struct super_b
/*
* Read uncompressed xattr id lookup table indexes from disk into memory
*/
-__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start,
+__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
u64 *xattr_table_start, int *xattr_ids)
{
- unsigned int len;
+ struct squashfs_sb_info *msblk = sb->s_fs_info;
+ unsigned int len, indexes;
struct squashfs_xattr_id_table *id_table;
+ __le64 *table;
+ u64 start, end;
+ int n;
- id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+ id_table = squashfs_read_table(sb, table_start, sizeof(*id_table));
if (IS_ERR(id_table))
return (__le64 *) id_table;
@@ -70,13 +79,52 @@ __le64 *squashfs_read_xattr_id_table(str
if (*xattr_ids == 0)
return ERR_PTR(-EINVAL);
- /* xattr_table should be less than start */
- if (*xattr_table_start >= start)
+ len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+ indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
+
+ /*
+ * The computed size of the index table (len bytes) should exactly
+ * match the table start and end points
+ */
+ start = table_start + sizeof(*id_table);
+ end = msblk->bytes_used;
+
+ if (len != (end - start))
return ERR_PTR(-EINVAL);
- len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+ table = squashfs_read_table(sb, start, len);
+ if (IS_ERR(table))
+ return table;
+
+ /* table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed xattr id blocks. Each entry should be less than
+ * the next (i.e. table[0] < table[1]), and the difference between them
+ * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1]
+ * should be less than table_start, and again the difference
+ * should be SQUASHFS_METADATA_SIZE or less.
+ *
+ * Finally xattr_table_start should be less than table[0].
+ */
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
- TRACE("In read_xattr_index_table, length %d\n", len);
+ if (*xattr_table_start >= le64_to_cpu(table[0])) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
- return squashfs_read_table(sb, start + sizeof(*id_table), len);
+ return table;
}
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
The patch titled
Subject: squashfs: add more sanity checks in inode lookup
has been removed from the -mm tree. Its filename was
squashfs-add-more-sanity-checks-in-inode-lookup.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in inode lookup
Syzbot has reported a "slab-out-of-bounds read" error which has been
identified as being caused by a corrupted "ino_num" value read from the
inode. This could be because the metadata block is uncompressed, or
because the "compression" bit has been corrupted (turning a compressed
block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the following
corruption.
1. It checks against corruption of the inodes count. This can either
lead to a larger table to be read, or a smaller than expected
table to be read.
In the case of a too large inodes count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
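The point about "too small" counts in item 1 can be illustrated with a
user-space sketch comparing the previous range check with the exact-length
check. The size computation is an assumption of this sketch (one 8-byte
entry per inode, 8192-byte metadata blocks, one 8-byte index per block),
and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_METADATA_SIZE 8192u /* assumed value of SQUASHFS_METADATA_SIZE */

/* Bytes occupied by the index table for a given inode count (assumed layout). */
static unsigned int model_lookup_block_bytes(unsigned int inodes)
{
    uint64_t bytes = (uint64_t)inodes * sizeof(uint64_t);
    unsigned int blocks =
        (unsigned int)((bytes + MODEL_METADATA_SIZE - 1) / MODEL_METADATA_SIZE);

    return blocks * (unsigned int)sizeof(uint64_t);
}

/* Previous check: the table must merely not run into the next table. */
static bool model_old_length_ok(uint64_t start, uint64_t next,
                                unsigned int inodes)
{
    return start + model_lookup_block_bytes(inodes) <= next;
}

/* New check: the computed length must match the on-disk span exactly. */
static bool model_new_length_ok(uint64_t start, uint64_t next,
                                unsigned int inodes)
{
    return model_lookup_block_bytes(inodes) == next - start;
}
```

With a table that spans 16 bytes on disk, a corrupted count of 10 inodes
yields a computed length of 8: the previous check accepts it, the exact
check rejects it.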
[phillip(a)squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co…
Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+04419e3ff19d2970ea28(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/export.c | 41 +++++++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 8 deletions(-)
--- a/fs/squashfs/export.c~squashfs-add-more-sanity-checks-in-inode-lookup
+++ a/fs/squashfs/export.c
@@ -41,12 +41,17 @@ static long long squashfs_inode_lookup(s
struct squashfs_sb_info *msblk = sb->s_fs_info;
int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1);
int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1);
- u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+ u64 start;
__le64 ino;
int err;
TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num);
+ if (ino_num == 0 || (ino_num - 1) >= msblk->inodes)
+ return -EINVAL;
+
+ start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+
err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino));
if (err < 0)
return err;
@@ -111,7 +116,10 @@ __le64 *squashfs_read_inode_lookup_table
u64 lookup_table_start, u64 next_table, unsigned int inodes)
{
unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes);
+ unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes);
+ int n;
__le64 *table;
+ u64 start, end;
TRACE("In read_inode_lookup_table, length %d\n", length);
@@ -121,20 +129,37 @@ __le64 *squashfs_read_inode_lookup_table
if (inodes == 0)
return ERR_PTR(-EINVAL);
- /* length bytes should not extend into the next table - this check
- * also traps instances where lookup_table_start is incorrectly larger
- * than the next table start
+ /*
+ * The computed size of the lookup table (length bytes) should exactly
+ * match the table start and end points
*/
- if (lookup_table_start + length > next_table)
+ if (length != (next_table - lookup_table_start))
return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, lookup_table_start, length);
+ if (IS_ERR(table))
+ return table;
/*
- * table[0] points to the first inode lookup table metadata block,
- * this should be less than lookup_table_start
+ * table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed inode lookup blocks. Each entry should be
+ * less than the next (i.e. table[0] < table[1]), and the difference
+ * between them should be SQUASHFS_METADATA_SIZE or less.
+ * table[indexes - 1] should be less than lookup_table_start, and
+ * again the difference should be SQUASHFS_METADATA_SIZE or less
*/
- if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) {
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
kfree(table);
return ERR_PTR(-EINVAL);
}
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
The patch titled
Subject: squashfs: add more sanity checks in id lookup
has been removed from the -mm tree. Its filename was
squashfs-add-more-sanity-checks-in-id-lookup.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in id lookup
Syzbot has reported a number of "slab-out-of-bounds reads" and
"use-after-free read" errors which have been identified as being caused by
a corrupted index value read from the inode. This could be because the
metadata block is uncompressed, or because the "compression" bit has been
corrupted (turning a compressed block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the
following corruption.
1. It checks against corruption of the ids count. This can either
lead to a larger table to be read, or a smaller than expected
table to be read.
In the case of a too large ids count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
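The out-of-bounds read itself comes from using an unchecked index to select
an entry in the in-memory index table. A user-space sketch of the guarded
lookup follows; the struct, the 4-byte id size and the 8192-byte metadata
block size are assumptions of this sketch, and the names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_METADATA_SIZE 8192u /* assumed value of SQUASHFS_METADATA_SIZE */
#define MODEL_ID_SIZE 4u          /* assumed on-disk size of one id */

/* Hypothetical, reduced stand-in for struct squashfs_sb_info. */
struct model_sb_info {
    const uint64_t *id_table; /* one entry per metadata block of ids */
    unsigned int ids;         /* number of ids in the filesystem */
};

/*
 * Look up the metadata block holding id number `index`. The bounds check
 * runs before id_table[] is touched, so a corrupted index cannot read
 * past the end of the table.
 */
static int model_get_id_block(const struct model_sb_info *msblk,
                              unsigned int index, uint64_t *start_block)
{
    unsigned int block = (index * MODEL_ID_SIZE) / MODEL_METADATA_SIZE;

    if (index >= msblk->ids)
        return -EINVAL;

    *start_block = msblk->id_table[block];
    return 0;
}
```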
Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+b06d57ba83f604522af2(a)syzkaller.appspotmail.com
Reported-by: syzbot+c021ba012da41ee9807c(a)syzkaller.appspotmail.com
Reported-by: syzbot+5024636e8b5fd19f0f19(a)syzkaller.appspotmail.com
Reported-by: syzbot+bcbc661df46657d0fa4f(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/id.c | 40 ++++++++++++++++++++++++++-------
fs/squashfs/squashfs_fs_sb.h | 1
fs/squashfs/super.c | 6 ++--
fs/squashfs/xattr.h | 10 +++++++-
4 files changed, 45 insertions(+), 12 deletions(-)
--- a/fs/squashfs/id.c~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/id.c
@@ -35,10 +35,15 @@ int squashfs_get_id(struct super_block *
struct squashfs_sb_info *msblk = sb->s_fs_info;
int block = SQUASHFS_ID_BLOCK(index);
int offset = SQUASHFS_ID_BLOCK_OFFSET(index);
- u64 start_block = le64_to_cpu(msblk->id_table[block]);
+ u64 start_block;
__le32 disk_id;
int err;
+ if (index >= msblk->ids)
+ return -EINVAL;
+
+ start_block = le64_to_cpu(msblk->id_table[block]);
+
err = squashfs_read_metadata(sb, &disk_id, &start_block, &offset,
sizeof(disk_id));
if (err < 0)
@@ -56,7 +61,10 @@ __le64 *squashfs_read_id_index_table(str
u64 id_table_start, u64 next_table, unsigned short no_ids)
{
unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids);
+ unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids);
+ int n;
__le64 *table;
+ u64 start, end;
TRACE("In read_id_index_table, length %d\n", length);
@@ -67,20 +75,36 @@ __le64 *squashfs_read_id_index_table(str
return ERR_PTR(-EINVAL);
/*
- * length bytes should not extend into the next table - this check
- * also traps instances where id_table_start is incorrectly larger
- * than the next table start
+ * The computed size of the index table (length bytes) should exactly
+ * match the table start and end points
*/
- if (id_table_start + length > next_table)
+ if (length != (next_table - id_table_start))
return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, id_table_start, length);
+ if (IS_ERR(table))
+ return table;
/*
- * table[0] points to the first id lookup table metadata block, this
- * should be less than id_table_start
+ * table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed id blocks. Each entry should be less than
+ * the next (i.e. table[0] < table[1]), and the difference between them
+ * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1]
+ * should be less than id_table_start, and again the difference
+ * should be SQUASHFS_METADATA_SIZE or less
*/
- if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) {
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) {
kfree(table);
return ERR_PTR(-EINVAL);
}
--- a/fs/squashfs/squashfs_fs_sb.h~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/squashfs_fs_sb.h
@@ -64,5 +64,6 @@ struct squashfs_sb_info {
unsigned int inodes;
unsigned int fragments;
int xattr_ids;
+ unsigned int ids;
};
#endif
--- a/fs/squashfs/super.c~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/super.c
@@ -166,6 +166,7 @@ static int squashfs_fill_super(struct su
msblk->directory_table = le64_to_cpu(sblk->directory_table_start);
msblk->inodes = le32_to_cpu(sblk->inodes);
msblk->fragments = le32_to_cpu(sblk->fragments);
+ msblk->ids = le16_to_cpu(sblk->no_ids);
flags = le16_to_cpu(sblk->flags);
TRACE("Found valid superblock on %pg\n", sb->s_bdev);
@@ -177,7 +178,7 @@ static int squashfs_fill_super(struct su
TRACE("Block size %d\n", msblk->block_size);
TRACE("Number of inodes %d\n", msblk->inodes);
TRACE("Number of fragments %d\n", msblk->fragments);
- TRACE("Number of ids %d\n", le16_to_cpu(sblk->no_ids));
+ TRACE("Number of ids %d\n", msblk->ids);
TRACE("sblk->inode_table_start %llx\n", msblk->inode_table);
TRACE("sblk->directory_table_start %llx\n", msblk->directory_table);
TRACE("sblk->fragment_table_start %llx\n",
@@ -236,8 +237,7 @@ static int squashfs_fill_super(struct su
allocate_id_index_table:
/* Allocate and read id index table */
msblk->id_table = squashfs_read_id_index_table(sb,
- le64_to_cpu(sblk->id_table_start), next_table,
- le16_to_cpu(sblk->no_ids));
+ le64_to_cpu(sblk->id_table_start), next_table, msblk->ids);
if (IS_ERR(msblk->id_table)) {
errorf(fc, "unable to read id index table");
err = PTR_ERR(msblk->id_table);
--- a/fs/squashfs/xattr.h~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/xattr.h
@@ -17,8 +17,16 @@ extern int squashfs_xattr_lookup(struct
static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb,
u64 start, u64 *xattr_table_start, int *xattr_ids)
{
+ struct squashfs_xattr_id_table *id_table;
+
+ id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+ if (IS_ERR(id_table))
+ return (__le64 *) id_table;
+
+ *xattr_table_start = le64_to_cpu(id_table->xattr_table_start);
+ kfree(id_table);
+
ERROR("Xattrs in filesystem, these will be ignored\n");
- *xattr_table_start = start;
return ERR_PTR(-ENOTSUPP);
}
_
The patch titled
Subject: squashfs: avoid out of bounds writes in decompressors
has been removed from the -mm tree. Its filename was
squashfs-avoid-out-of-bounds-writes-in-decompressors.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: avoid out of bounds writes in decompressors
Patch series "Squashfs: fix BIO migration regression and add sanity checks".
Patch [1/4] fixes a regression introduced by the "migrate from ll_rw_block
usage to BIO" patch, which has produced a number of syzbot/syzkaller
reports.
Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption
issues which have produced syzbot reports in the id, inode and xattr
lookup code.
Each patch has been tested against the syzbot reproducers using the given
kernel configuration. They have the appropriate "Reported-by:" lines
added.
Additionally, all of the reproducer filesystems are indirectly fixed by
patch [4/4] due to the fact they all have xattr corruption which is now
detected there.
Additional testing with other configurations and architectures (32bit, big
endian), and normal filesystems has also been done to trap any inadvertent
regressions caused by the additional sanity checks.
This patch (of 4):
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".
syzbot/syzkaller has reported a number of "out of bounds writes" and
"unable to handle kernel paging request in squashfs_decompress" errors
which have been identified as a regression introduced by the above patch.
Specifically, the patch removed the following sanity check
if (length < 0 || length > output->length ||
(index + length) > msblk->bytes_used)
This check did two things:
1. It ensured any reads were not beyond the end of the filesystem
2. It ensured that the "length" field read from the filesystem
was within the expected maximum length. Without this any
corrupted values can over-run allocated buffers.
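As a rough userspace illustration of what the restored check guards against, the sketch below mirrors the three conditions. The function name read_length_ok() and the parameter names are hypothetical; the real check operates on msblk->bytes_used and output->length inside squashfs_read_data().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the restored sanity check.
 * length         - block length read from the (possibly corrupted) filesystem
 * output_length  - size of the buffer the block will be read into
 * index          - byte offset of the block within the filesystem
 * bytes_used     - total size of the filesystem in bytes
 */
static bool read_length_ok(int length, int output_length,
                           uint64_t index, uint64_t bytes_used)
{
    assert(output_length >= 0);

    /* A negative or oversized length would over-run the output buffer. */
    if (length < 0 || length > output_length)
        return false;

    /* The read must not extend beyond the end of the filesystem. */
    return index + (uint64_t)length <= bytes_used;
}
```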
Link: https://lkml.kernel.org/r/20210204130249.4495-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-2-phillip@squashfs.org.uk
Fixes: 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
Reported-by: syzbot+6fba78f99b9afd4b5634(a)syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Cc: Philippe Liard <pliard(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/block.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/squashfs/block.c~squashfs-avoid-out-of-bounds-writes-in-decompressors
+++ a/fs/squashfs/block.c
@@ -196,9 +196,15 @@ int squashfs_read_data(struct super_bloc
length = SQUASHFS_COMPRESSED_SIZE(length);
index += 2;
- TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
+ TRACE("Block @ 0x%llx, %scompressed size %d\n", index - 2,
compressed ? "" : "un", length);
}
+ if (length < 0 || length > output->length ||
+ (index + length) > msblk->bytes_used) {
+ res = -EIO;
+ goto out;
+ }
+
if (next_index)
*next_index = index + length;
_
This is a note to let you know that I've just added the patch titled
staging: gdm724x: Fix DMA from stack
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 7c3a0635cd008eaca9a734dc802709ee0b81cac5 Mon Sep 17 00:00:00 2001
From: Amey Narkhede <ameynarkhede03(a)gmail.com>
Date: Thu, 11 Feb 2021 11:08:19 +0530
Subject: staging: gdm724x: Fix DMA from stack
Stack allocated buffers cannot be used for DMA
on all architectures so allocate hci_packet buffer
using kmalloc.
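The allocation pattern can be sketched in userspace C, with malloc() standing in for kmalloc() and a manual size computation standing in for struct_size(). The struct layout and the names packet, packet_size() and packet_alloc() are assumptions made for illustration only; DMA itself is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout for illustration: a small header plus a variable payload. */
struct packet {
    uint16_t cmd_evt;
    uint16_t len;
    uint8_t data[];     /* flexible array member */
};

/* Header size plus payload size, saturating at SIZE_MAX on overflow. */
static size_t packet_size(size_t payload_len)
{
    if (payload_len > SIZE_MAX - sizeof(struct packet))
        return SIZE_MAX;
    return sizeof(struct packet) + payload_len;
}

/*
 * Allocate the packet on the heap instead of overlaying it on a stack
 * buffer. The caller owns the result and must free() it.
 */
static struct packet *packet_alloc(uint16_t cmd, const uint8_t *payload,
                                   uint16_t payload_len)
{
    struct packet *p;

    assert(payload != NULL || payload_len == 0);

    p = malloc(packet_size(payload_len));
    if (!p)
        return NULL;

    p->cmd_evt = cmd;
    p->len = payload_len;
    if (payload_len)
        memcpy(p->data, payload, payload_len);
    return p;
}
```

Sizing the allocation from the struct itself keeps the header and payload lengths in one place, which is the role struct_size() plays in the patch.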
Reviewed-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Amey Narkhede <ameynarkhede03(a)gmail.com>
Link: https://lore.kernel.org/r/20210211053819.34858-1-ameynarkhede03@gmail.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/gdm724x/gdm_usb.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/gdm724x/gdm_usb.c b/drivers/staging/gdm724x/gdm_usb.c
index dc4da66c3695..54bdb64f52e8 100644
--- a/drivers/staging/gdm724x/gdm_usb.c
+++ b/drivers/staging/gdm724x/gdm_usb.c
@@ -56,20 +56,24 @@ static int gdm_usb_recv(void *priv_dev,
static int request_mac_address(struct lte_udev *udev)
{
- u8 buf[16] = {0,};
- struct hci_packet *hci = (struct hci_packet *)buf;
+ struct hci_packet *hci;
struct usb_device *usbdev = udev->usbdev;
int actual;
int ret = -1;
+ hci = kmalloc(struct_size(hci, data, 1), GFP_KERNEL);
+ if (!hci)
+ return -ENOMEM;
+
hci->cmd_evt = gdm_cpu_to_dev16(udev->gdm_ed, LTE_GET_INFORMATION);
hci->len = gdm_cpu_to_dev16(udev->gdm_ed, 1);
hci->data[0] = MAC_ADDRESS;
- ret = usb_bulk_msg(usbdev, usb_sndbulkpipe(usbdev, 2), buf, 5,
+ ret = usb_bulk_msg(usbdev, usb_sndbulkpipe(usbdev, 2), hci, 5,
&actual, 1000);
udev->request_mac_addr = 1;
+ kfree(hci);
return ret;
}
--
2.30.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eaf5bfe37db871031232d2bf2535b6ca92afbad8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala(a)linux.intel.com>
Date: Thu, 28 Jan 2021 17:59:44 +0200
Subject: [PATCH] drm/i915: Skip vswing programming for TBT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In thunderbolt mode the PHY is owned by the thunderbolt controller.
We are not supposed to touch it. So skip the vswing programming
as well (we already skipped the other steps not applicable to TBT).
Touching this stuff could supposedly interfere with the PHY
programming done by the thunderbolt controller.
Cc: stable(a)vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128155948.13678-1-ville.…
Reviewed-by: Imre Deak <imre.deak(a)intel.com>
(cherry picked from commit f8c6b615b921d8a1bcd74870f9105e62b0bceff3)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bf17365857ca..e1e3ac12f979 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2754,6 +2754,9 @@ static void icl_mg_phy_ddi_vswing_sequence(struct intel_encoder *encoder,
int n_entries, ln;
u32 val;
+ if (enc_to_dig_port(encoder)->tc_mode == TC_PORT_TBT_ALT)
+ return;
+
ddi_translations = icl_get_mg_buf_trans(encoder, crtc_state, &n_entries);
if (level >= n_entries) {
drm_dbg_kms(&dev_priv->drm,
@@ -2890,6 +2893,9 @@ tgl_dkl_phy_ddi_vswing_sequence(struct intel_encoder *encoder,
u32 val, dpcnt_mask, dpcnt_val;
int n_entries, ln;
+ if (enc_to_dig_port(encoder)->tc_mode == TC_PORT_TBT_ALT)
+ return;
+
ddi_translations = tgl_get_dkl_buf_trans(encoder, crtc_state, &n_entries);
if (level >= n_entries)
Extend kvm_s390_shadow_fault to return the pointer to the valid leaf
DAT table entry, or to the invalid entry.
Also return some flags in the lower bits of the address:
DAT_PROT: indicates that DAT protection applies because of the
protection bit in the segment (or, if EDAT, region) tables
NOT_PTE: indicates that the address of the DAT table entry returned
does not refer to a PTE, but to a segment or region table.
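A minimal sketch of the "flags in the lower bits" encoding follows. It uses the DAT_PROT and NOT_PTE values from this patch and assumes that table entry addresses are 8-byte aligned, so the low three bits are free. The helper names pack_datptr() and datptr_addr() are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define DAT_PROT  2UL   /* value taken from the patch */
#define NOT_PTE   4UL   /* value taken from the patch */
#define FLAG_MASK (DAT_PROT | NOT_PTE)

/* Combine an entry address with the flags (hypothetical helper). */
static unsigned long pack_datptr(unsigned long entry_addr,
                                 bool dat_protection, bool not_pte)
{
    /* Assumption: entries are 8 bytes, so the flag bits are unused. */
    assert((entry_addr & FLAG_MASK) == 0);

    if (dat_protection)
        entry_addr |= DAT_PROT;
    if (not_pte)
        entry_addr |= NOT_PTE;
    return entry_addr;
}

/* Strip the flags to recover the entry address (hypothetical helper). */
static unsigned long datptr_addr(unsigned long datptr)
{
    return datptr & ~FLAG_MASK;
}
```

A caller would test `datptr & NOT_PTE` to learn whether the returned address points at a segment or region table entry rather than a PTE.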
Signed-off-by: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: stable(a)vger.kernel.org
---
arch/s390/kvm/gaccess.c | 30 +++++++++++++++++++++++++-----
arch/s390/kvm/gaccess.h | 5 ++++-
arch/s390/kvm/vsie.c | 8 ++++----
3 files changed, 33 insertions(+), 10 deletions(-)
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index 6d6b57059493..e0ab83f051d2 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -976,7 +976,9 @@ int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra)
* kvm_s390_shadow_tables - walk the guest page table and create shadow tables
* @sg: pointer to the shadow guest address space structure
* @saddr: faulting address in the shadow gmap
- * @pgt: pointer to the page table address result
+ * @pgt: pointer to the beginning of the page table for the given address if
+ * successful (return value 0), or to the first invalid DAT entry in
+ * case of exceptions (return value > 0)
* @fake: pgt references contiguous guest memory block, not a pgtable
*/
static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
@@ -1034,6 +1036,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
rfte.val = ptr;
goto shadow_r2t;
}
+ *pgt = ptr + vaddr.rfx * 8;
rc = gmap_read_table(parent, ptr + vaddr.rfx * 8, &rfte.val);
if (rc)
return rc;
@@ -1060,6 +1063,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
rste.val = ptr;
goto shadow_r3t;
}
+ *pgt = ptr + vaddr.rsx * 8;
rc = gmap_read_table(parent, ptr + vaddr.rsx * 8, &rste.val);
if (rc)
return rc;
@@ -1087,6 +1091,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
rtte.val = ptr;
goto shadow_sgt;
}
+ *pgt = ptr + vaddr.rtx * 8;
rc = gmap_read_table(parent, ptr + vaddr.rtx * 8, &rtte.val);
if (rc)
return rc;
@@ -1123,6 +1128,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
ste.val = ptr;
goto shadow_pgt;
}
+ *pgt = ptr + vaddr.sx * 8;
rc = gmap_read_table(parent, ptr + vaddr.sx * 8, &ste.val);
if (rc)
return rc;
@@ -1157,6 +1163,8 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
* @vcpu: virtual cpu
* @sg: pointer to the shadow guest address space structure
* @saddr: faulting address in the shadow gmap
+ * @datptr: will contain the address of the faulting DAT table entry, or of
+ * the valid leaf, plus some flags
*
* Returns: - 0 if the shadow fault was successfully resolved
* - > 0 (pgm exception code) on exceptions while faulting
@@ -1165,11 +1173,11 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
* - -ENOMEM if out of memory
*/
int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
- unsigned long saddr)
+ unsigned long saddr, unsigned long *datptr)
{
union vaddress vaddr;
union page_table_entry pte;
- unsigned long pgt;
+ unsigned long pgt = 0;
int dat_protection, fake;
int rc;
@@ -1191,8 +1199,20 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
pte.val = pgt + vaddr.px * PAGE_SIZE;
goto shadow_page;
}
- if (!rc)
- rc = gmap_read_table(sg->parent, pgt + vaddr.px * 8, &pte.val);
+
+ switch (rc) {
+ case PGM_SEGMENT_TRANSLATION:
+ case PGM_REGION_THIRD_TRANS:
+ case PGM_REGION_SECOND_TRANS:
+ case PGM_REGION_FIRST_TRANS:
+ pgt |= NOT_PTE;
+ break;
+ case 0:
+ pgt += vaddr.px * 8;
+ rc = gmap_read_table(sg->parent, pgt, &pte.val);
+ }
+ if (datptr)
+ *datptr = pgt | dat_protection * DAT_PROT;
if (!rc && pte.i)
rc = PGM_PAGE_TRANSLATION;
if (!rc && pte.z)
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index f4c51756c462..fec26bbb17ba 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -359,7 +359,10 @@ void ipte_unlock(struct kvm_vcpu *vcpu);
int ipte_lock_held(struct kvm_vcpu *vcpu);
int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra);
+#define DAT_PROT 2
+#define NOT_PTE 4
+
int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *shadow,
- unsigned long saddr);
+ unsigned long saddr, unsigned long *datptr);
#endif /* __KVM_S390_GACCESS_H */
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index c5d0a58b2c29..7db022141db3 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -619,10 +619,10 @@ static int map_prefix(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
/* with mso/msl, the prefix lies at offset *mso* */
prefix += scb_s->mso;
- rc = kvm_s390_shadow_fault(vcpu, vsie_page->gmap, prefix);
+ rc = kvm_s390_shadow_fault(vcpu, vsie_page->gmap, prefix, NULL);
if (!rc && (scb_s->ecb & ECB_TE))
rc = kvm_s390_shadow_fault(vcpu, vsie_page->gmap,
- prefix + PAGE_SIZE);
+ prefix + PAGE_SIZE, NULL);
/*
* We don't have to mprotect, we will be called for all unshadows.
* SIE will detect if protection applies and trigger a validity.
@@ -913,7 +913,7 @@ static int handle_fault(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
current->thread.gmap_addr, 1);
rc = kvm_s390_shadow_fault(vcpu, vsie_page->gmap,
- current->thread.gmap_addr);
+ current->thread.gmap_addr, NULL);
if (rc > 0) {
rc = inject_fault(vcpu, rc,
current->thread.gmap_addr,
@@ -935,7 +935,7 @@ static void handle_last_fault(struct kvm_vcpu *vcpu,
{
if (vsie_page->fault_addr)
kvm_s390_shadow_fault(vcpu, vsie_page->gmap,
- vsie_page->fault_addr);
+ vsie_page->fault_addr, NULL);
vsie_page->fault_addr = 0;
}
--
2.26.2
I'm announcing the release of the 4.9.256 kernel.
This and the 4.4.256 release are a little bit "different" than normal.
This contains only 1 patch, just the version bump from .255 to .256 which ends
up causing the userspace-visible LINUX_VERSION_CODE to behave a bit differently
than normal due to the "overflow".
With this release, KERNEL_VERSION(4, 9, 256) is the same as KERNEL_VERSION(4, 10, 0).
Nothing in the kernel build itself breaks with this change, but given that this
is a userspace visible change, and some crazy tools (like glibc and gcc) have
logic that checks the kernel version for different reasons, I wanted to do this
release as an "empty" release to ensure that everything still works properly.
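The "overflow" can be reproduced in a few lines of C, assuming the classic KERNEL_VERSION() definition from the generated linux/version.h, which packs the sublevel into the low 8 bits. The version_*() helpers are hypothetical and only decode the packed value the way a userspace tool might.

```c
#include <assert.h>

/* Classic definition (assumed here): 8 bits each for minor and sublevel. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* A sublevel of 256 carries into the minor field. */
static_assert(KERNEL_VERSION(4, 9, 256) == KERNEL_VERSION(4, 10, 0),
              "sublevel 256 spills into the minor version");

/* Hypothetical decoders for a packed version code. */
static unsigned int version_major(unsigned int code)
{
    return code >> 16;
}

static unsigned int version_minor(unsigned int code)
{
    return (code >> 8) & 0xff;
}

static unsigned int version_sublevel(unsigned int code)
{
    return code & 0xff;
}
```

Decoding KERNEL_VERSION(4, 9, 256) with these helpers gives 4, 10 and 0, which is why tools that compare version codes may see this release as 4.10.0.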
So, this is a YOU MUST UPGRADE requirement of a release. If you rely on the
4.9.y kernel, please throw this release into your test builds and rebuild the
world and let us know if anything breaks, or if all is well.
Go forth and do full system rebuilds! Yocto and Gentoo are great for this, as
will systems that use buildroot.
I'll try to hold off on doing a "real" 4.9.y release for a week to give
everyone a chance to test this out and get back to me. The pending patches in
the 4.9.y queue are pretty serious, so I am loath to wait longer than that,
consider yourself warned...
The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Greg Kroah-Hartman (1):
Linux 4.9.256
This is a note to let you know that I've just added the patch titled
usb: quirks: add quirk to start video capture on ELMO L-12F document
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 1ebe718bb48278105816ba03a0408ecc2d6cf47f Mon Sep 17 00:00:00 2001
From: Stefan Ursella <stefan.ursella(a)wolfvision.net>
Date: Wed, 10 Feb 2021 15:07:11 +0100
Subject: usb: quirks: add quirk to start video capture on ELMO L-12F document
camera reliable
Without this quirk starting a video capture from the device often fails with
kernel: uvcvideo: Failed to set UVC probe control : -110 (exp. 34).
Signed-off-by: Stefan Ursella <stefan.ursella(a)wolfvision.net>
Link: https://lore.kernel.org/r/20210210140713.18711-1-stefan.ursella@wolfvision.…
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/quirks.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 66a0dc618dfc..6ade3daf7858 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -391,6 +391,9 @@ static const struct usb_device_id usb_quirk_list[] = {
/* X-Rite/Gretag-Macbeth Eye-One Pro display colorimeter */
{ USB_DEVICE(0x0971, 0x2000), .driver_info = USB_QUIRK_NO_SET_INTF },
+ /* ELMO L-12F document camera */
+ { USB_DEVICE(0x09a1, 0x0028), .driver_info = USB_QUIRK_DELAY_CTRL_MSG },
+
/* Broadcom BCM92035DGROM BT dongle */
{ USB_DEVICE(0x0a5c, 0x2021), .driver_info = USB_QUIRK_RESET_RESUME },
--
2.30.1
I repeated the same tests as for the upstream code on top of v5.10.14 and
v5.9.16, on powerpc64 and powerpc64le, with a glibc build and by
running the affected glibc testcase[2]. I verified that glibc's
backtrace() now gives the correct result and that gdb backtrace keeps
working as before.
I believe this should be backported to releases 5.9 and 5.10 as
userspace is affected in these releases. I hope I have tagged this
correctly in the patch.
The commit message below is cherry-picked from the upstream commit. I
am not sure what I should do with the footer, so I left it as-is and just
added a [rff: Backported] at the end.
---- 8< ----
commit 24321ac668e452a4942598533d267805f291fdc9 upstream.
This backport differs from the upstream patch in how the sigtramp
offsets are set: after 5.10, VDSO symbol offsets are retrieved at
build time, whereas before that, as in this patch, the runtime-generated
offsets logic is used.
Commit 0138ba5783ae ("powerpc/64/signal: Balance return predictor
stack in signal trampoline") changed __kernel_sigtramp_rt64() VDSO and
trampoline code, and introduced a regression in the way glibc's
backtrace()[1] detects the signal-handler stack frame. Apart from the
practical implications, __kernel_sigtramp_rt64() was a VDSO function
with the semantics that it is a function you can call from userspace
to end signal handling. These semantics are no longer valid.
I believe the aforementioned change affects all releases since 5.9.
This patch tries to fix both the semantics and practical aspect of
__kernel_sigtramp_rt64() returning it to the previous code, whilst
keeping the intended behaviour of 0138ba5783ae by adding a new symbol
to serve as the jump target from the kernel to the trampoline. Now the
trampoline has two parts, a new entry point and the old return point.
[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-January/223194.html
Fixes: 0138ba5783ae ("powerpc/64/signal: Balance return predictor stack in signal trampoline")
Cc: stable(a)vger.kernel.org # v5.9+
Signed-off-by: Raoni Fassina Firmino <raoni(a)linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin(a)gmail.com>
[mpe: Minor tweaks to change log formatting, add stable tag]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20210201200505.iz46ubcizipnkcxe@work-tp
[rff: Backported]
---
arch/powerpc/kernel/vdso.c | 2 +-
arch/powerpc/kernel/vdso64/sigtramp.S | 11 ++++++++++-
arch/powerpc/kernel/vdso64/vdso64.lds.S | 1 +
3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 8dad44262e75..495ffc9cf5e2 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -475,7 +475,7 @@ static __init void vdso_setup_trampolines(struct lib32_elfinfo *v32,
*/
#ifdef CONFIG_PPC64
- vdso64_rt_sigtramp = find_function64(v64, "__kernel_sigtramp_rt64");
+ vdso64_rt_sigtramp = find_function64(v64, "__kernel_start_sigtramp_rt64");
#endif
vdso32_sigtramp = find_function32(v32, "__kernel_sigtramp32");
vdso32_rt_sigtramp = find_function32(v32, "__kernel_sigtramp_rt32");
diff --git a/arch/powerpc/kernel/vdso64/sigtramp.S b/arch/powerpc/kernel/vdso64/sigtramp.S
index bbf68cd01088..2d4067561293 100644
--- a/arch/powerpc/kernel/vdso64/sigtramp.S
+++ b/arch/powerpc/kernel/vdso64/sigtramp.S
@@ -15,11 +15,20 @@
.text
+/*
+ * __kernel_start_sigtramp_rt64 and __kernel_sigtramp_rt64 together
+ * are one function split in two parts. The kernel jumps to the former
+ * and the signal handler indirectly (by blr) returns to the latter.
+ * __kernel_sigtramp_rt64 needs to point to the return address so
+ * glibc can correctly identify the trampoline stack frame.
+ */
.balign 8
.balign IFETCH_ALIGN_BYTES
-V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
+V_FUNCTION_BEGIN(__kernel_start_sigtramp_rt64)
.Lsigrt_start:
bctrl /* call the handler */
+V_FUNCTION_END(__kernel_start_sigtramp_rt64)
+V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
addi r1, r1, __SIGNAL_FRAMESIZE
li r0,__NR_rt_sigreturn
sc
diff --git a/arch/powerpc/kernel/vdso64/vdso64.lds.S b/arch/powerpc/kernel/vdso64/vdso64.lds.S
index 256fb9720298..bd120f590b9e 100644
--- a/arch/powerpc/kernel/vdso64/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso64/vdso64.lds.S
@@ -150,6 +150,7 @@ VERSION
__kernel_get_tbfreq;
__kernel_sync_dicache;
__kernel_sync_dicache_p5;
+ __kernel_start_sigtramp_rt64;
__kernel_sigtramp_rt64;
__kernel_getcpu;
__kernel_time;
base-commit: b0c8835fc649454c33371f4617111cb5d60463e1
--
2.26.2
This is a note to let you know that I've just added the patch titled
staging: gdm724x: Fix DMA from stack
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From 7c3a0635cd008eaca9a734dc802709ee0b81cac5 Mon Sep 17 00:00:00 2001
From: Amey Narkhede <ameynarkhede03(a)gmail.com>
Date: Thu, 11 Feb 2021 11:08:19 +0530
Subject: staging: gdm724x: Fix DMA from stack
Stack allocated buffers cannot be used for DMA
on all architectures so allocate hci_packet buffer
using kmalloc.
Reviewed-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Amey Narkhede <ameynarkhede03(a)gmail.com>
Link: https://lore.kernel.org/r/20210211053819.34858-1-ameynarkhede03@gmail.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/gdm724x/gdm_usb.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/gdm724x/gdm_usb.c b/drivers/staging/gdm724x/gdm_usb.c
index dc4da66c3695..54bdb64f52e8 100644
--- a/drivers/staging/gdm724x/gdm_usb.c
+++ b/drivers/staging/gdm724x/gdm_usb.c
@@ -56,20 +56,24 @@ static int gdm_usb_recv(void *priv_dev,
static int request_mac_address(struct lte_udev *udev)
{
- u8 buf[16] = {0,};
- struct hci_packet *hci = (struct hci_packet *)buf;
+ struct hci_packet *hci;
struct usb_device *usbdev = udev->usbdev;
int actual;
int ret = -1;
+ hci = kmalloc(struct_size(hci, data, 1), GFP_KERNEL);
+ if (!hci)
+ return -ENOMEM;
+
hci->cmd_evt = gdm_cpu_to_dev16(udev->gdm_ed, LTE_GET_INFORMATION);
hci->len = gdm_cpu_to_dev16(udev->gdm_ed, 1);
hci->data[0] = MAC_ADDRESS;
- ret = usb_bulk_msg(usbdev, usb_sndbulkpipe(usbdev, 2), buf, 5,
+ ret = usb_bulk_msg(usbdev, usb_sndbulkpipe(usbdev, 2), hci, 5,
&actual, 1000);
udev->request_mac_addr = 1;
+ kfree(hci);
return ret;
}
--
2.30.1
This is a note to let you know that I've just added the patch titled
phy: lantiq: rcu-usb2: wait after clock enable
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 36acd5e24e3000691fb8d1ee31cf959cb1582d35 Mon Sep 17 00:00:00 2001
From: Mathias Kresin <dev(a)kresin.me>
Date: Thu, 7 Jan 2021 23:49:01 +0100
Subject: phy: lantiq: rcu-usb2: wait after clock enable
Commit 65dc2e725286 ("usb: dwc2: Update Core Reset programming flow.")
revealed that the phy isn't ready immediately after enabling its
clocks. The dwc2_check_core_version() fails and the dwc2 usb driver
errors out.
Add a short delay to let the phy get up and running. There isn't any
documentation on how much time is required; the value was chosen based on
tests.
Signed-off-by: Mathias Kresin <dev(a)kresin.me>
Acked-by: Hauke Mehrtens <hauke(a)hauke-m.de>
Acked-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
Link: https://lore.kernel.org/r/20210107224901.2102479-1-dev@kresin.me
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
---
drivers/phy/lantiq/phy-lantiq-rcu-usb2.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c b/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c
index a7d126192cf1..29d246ea24b4 100644
--- a/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c
+++ b/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c
@@ -124,8 +124,16 @@ static int ltq_rcu_usb2_phy_power_on(struct phy *phy)
reset_control_deassert(priv->phy_reset);
ret = clk_prepare_enable(priv->phy_gate_clk);
- if (ret)
+ if (ret) {
dev_err(dev, "failed to enable PHY gate\n");
+ return ret;
+ }
+
+ /*
+ * at least the xrx200 usb2 phy requires some extra time to be
+ * operational after enabling the clock
+ */
+ usleep_range(100, 200);
return ret;
}
--
2.30.1
From: Gao Xiang <hsiangkao(a)redhat.com>
Currently, set_bit() & test_bit() pairs are used as a fast path
for initialized configurations. However, these atomic ops are
actually relaxed forms. Instead, load-acquire & store-release forms are
needed to make sure uninitialized fields won't be observed in advance
here (there are no such corresponding bitops yet, so use full barriers instead.)
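The ordering the patch is after can be shown in userspace with C11 atomics, where acquire and release operations do exist. This is only a sketch of the publish/observe pattern, not the erofs code: struct lazy_cfg and its helpers are hypothetical names, and the kernel patch itself uses smp_mb() around test_bit()/set_bit() instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lazily initialized configuration. */
struct lazy_cfg {
    int field_a;
    int field_b;
    atomic_bool inited;     /* plays the role of the *_INITED_BIT flag */
};

/* Writer: fill in the fields first, then publish with a release store. */
static void lazy_cfg_publish(struct lazy_cfg *c, int a, int b)
{
    assert(c != NULL);
    c->field_a = a;
    c->field_b = b;
    atomic_store_explicit(&c->inited, true, memory_order_release);
}

/*
 * Reader fast path: the acquire load pairs with the release store, so a
 * reader that sees inited == true also sees the initialized fields.
 */
static bool lazy_cfg_read(struct lazy_cfg *c, int *a, int *b)
{
    assert(c != NULL && a != NULL && b != NULL);
    if (!atomic_load_explicit(&c->inited, memory_order_acquire))
        return false;
    *a = c->field_a;
    *b = c->field_b;
    return true;
}
```

With relaxed operations in place of acquire and release, a reader on a weakly ordered CPU could observe the flag as set while still reading stale field values, which is the race described above.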
Fixes: 62dc45979f3f ("staging: erofs: fix race of initializing xattrs of a inode at the same time")
Fixes: 152a333a5895 ("staging: erofs: add compacted compression indexes support")
Cc: <stable(a)vger.kernel.org> # 5.3+
Reported-by: Huang Jianan <huangjianan(a)oppo.com>
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
---
fs/erofs/xattr.c | 10 +++++++++-
fs/erofs/zmap.c | 10 +++++++++-
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/fs/erofs/xattr.c b/fs/erofs/xattr.c
index 5bde77d70852..47314a26767a 100644
--- a/fs/erofs/xattr.c
+++ b/fs/erofs/xattr.c
@@ -48,8 +48,14 @@ static int init_inode_xattrs(struct inode *inode)
int ret = 0;
/* the most case is that xattrs of this inode are initialized. */
- if (test_bit(EROFS_I_EA_INITED_BIT, &vi->flags))
+ if (test_bit(EROFS_I_EA_INITED_BIT, &vi->flags)) {
+ /*
+ * paired with smp_mb() at the end of the function to ensure
+ * fields will only be observed after the bit is set.
+ */
+ smp_mb();
return 0;
+ }
if (wait_on_bit_lock(&vi->flags, EROFS_I_BL_XATTR_BIT, TASK_KILLABLE))
return -ERESTARTSYS;
@@ -137,6 +143,8 @@ static int init_inode_xattrs(struct inode *inode)
}
xattr_iter_end(&it, atomic_map);
+ /* paired with smp_mb() at the beginning of the function. */
+ smp_mb();
set_bit(EROFS_I_EA_INITED_BIT, &vi->flags);
out_unlock:
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index ae325541884e..14d2de35110c 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -36,8 +36,14 @@ static int z_erofs_fill_inode_lazy(struct inode *inode)
void *kaddr;
struct z_erofs_map_header *h;
- if (test_bit(EROFS_I_Z_INITED_BIT, &vi->flags))
+ if (test_bit(EROFS_I_Z_INITED_BIT, &vi->flags)) {
+ /*
+ * paired with smp_mb() at the end of the function to ensure
+ * fields will only be observed after the bit is set.
+ */
+ smp_mb();
return 0;
+ }
if (wait_on_bit_lock(&vi->flags, EROFS_I_BL_Z_BIT, TASK_KILLABLE))
return -ERESTARTSYS;
@@ -83,6 +89,8 @@ static int z_erofs_fill_inode_lazy(struct inode *inode)
vi->z_physical_clusterbits[1] = vi->z_logical_clusterbits +
((h->h_clusterbits >> 5) & 7);
+ /* paired with smp_mb() at the beginning of the function */
+ smp_mb();
set_bit(EROFS_I_Z_INITED_BIT, &vi->flags);
unmap_done:
kunmap_atomic(kaddr);
--
2.24.0
From: Tony Lindgren <tony(a)atomide.com>
[ Upstream commit 7078a5ba7a58e5db07583b176f8a03e0b8714731 ]
We have rst_map_012 used for various accelerators like dsp, ipu and iva.
For these use cases, we have rstctrl bit 2 control the subsystem module
reset, and have bits 0 and 1 control the accelerator specific
features.
If the bootloader, or kexec boot, has left any accelerator specific
reset bits deasserted, deasserting bit 2 reset will potentially enable
an accelerator with unconfigured MMU and no firmware. And we may get
spammed with a lot of warnings on boot with "Data Access in User mode
during Functional access", or depending on the accelerator, the system
can also just hang.
This issue can be quite easily reproduced by setting a rst_map_012 type
rstctrl register to 0 or 4 in the bootloader, and booting the system.
Let's just assert all reset bits for rst_map_012 type resets. So far
it looks like the other rstctrl types don't need this. If it turns out
that the other type rstctrl bits also need reset on init, we need to
add an instance specific reset mask for the bits to avoid resetting
unwanted bits.
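The check added by the patch can be sketched in plain C as below. This is only a model: assert_all_resets() is a hypothetical helper that works on a register value passed in by the caller, and the mask value 0x7 used in the usage note is an assumption based on "bits 0, 1 and 2" above, not something taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the value that should end up in a rstctrl-style register.
 * A set bit means the reset is asserted. If any bit covered by mask is
 * deasserted, assert all masked bits, as the patch does with
 * writel_relaxed(reset->mask, ...).
 */
uint32_t assert_all_resets(uint32_t regval, uint32_t mask)
{
    assert(mask != 0);
    if ((regval & mask) != mask)
        return mask;
    return regval;   /* everything already asserted, leave it alone */
}
```

For the two reproducer values mentioned above, assert_all_resets(0, 0x7) and assert_all_resets(4, 0x7) both yield 0x7, while a register already at 0x7 is left untouched.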
Reported-by: Carl Philipp Klemm <philipp(a)uvos.xyz>
Cc: Philipp Zabel <p.zabel(a)pengutronix.de>
Cc: Santosh Shilimkar <ssantosh(a)kernel.org>
Cc: Suman Anna <s-anna(a)ti.com>
Cc: Tero Kristo <t-kristo(a)ti.com>
Tested-by: Carl Philipp Klemm <philipp(a)uvos.xyz>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/soc/ti/omap_prm.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/soc/ti/omap_prm.c b/drivers/soc/ti/omap_prm.c
index 4d41dc3cdce1f..c8b14b3a171f7 100644
--- a/drivers/soc/ti/omap_prm.c
+++ b/drivers/soc/ti/omap_prm.c
@@ -552,6 +552,7 @@ static int omap_prm_reset_init(struct platform_device *pdev,
const struct omap_rst_map *map;
struct ti_prm_platform_data *pdata = dev_get_platdata(&pdev->dev);
char buf[32];
+ u32 v;
/*
* Check if we have controllable resets. If either rstctrl is non-zero
@@ -599,6 +600,16 @@ static int omap_prm_reset_init(struct platform_device *pdev,
map++;
}
+ /* Quirk handling to assert rst_map_012 bits on reset and avoid errors */
+ if (prm->data->rstmap == rst_map_012) {
+ v = readl_relaxed(reset->prm->base + reset->prm->data->rstctrl);
+ if ((v & reset->mask) != reset->mask) {
+ dev_dbg(&pdev->dev, "Asserting all resets: %08x\n", v);
+ writel_relaxed(reset->mask, reset->prm->base +
+ reset->prm->data->rstctrl);
+ }
+ }
+
return devm_reset_controller_register(&pdev->dev, &reset->rcdev);
}
--
2.27.0
During allocation the allocator will try to allocate an extent using
cluster policy. Once the current cluster is exhausted it will remove
its entry under btrfs_free_cluster::lock and subsequently acquire
btrfs_free_space_ctl::tree_lock to dispose of the already-deleted
entry and adjust btrfs_free_space_ctl::total_bitmaps. This poses a
problem because there exists a race condition between removing the
entry under one lock and doing the necessary accounting holding a
different lock since extent freeing only uses the 2nd lock. This can
result in the following situation:
T1:                                    T2:
btrfs_alloc_from_cluster               insert_into_bitmap <holds tree_lock>
 if (entry->bytes == 0)                 if (block_group && !list_empty(&block_group->cluster_list)) {
    rb_erase(entry)
 spin_unlock(&cluster->lock);
   (total_bitmaps is still 4)           spin_lock(&cluster->lock);
                                        <doesn't find entry in cluster->root>
 spin_lock(&ctl->tree_lock);            <goes to new_bitmap label, adds
 <blocked since T2 holds tree_lock>     <a new entry and calls add_new_bitmap>
                                        recalculate_thresholds <crashes,
                                          due to total_bitmaps
                                          becoming 5 and triggering
                                          an ASSERT>
To fix this, ensure that once depleted, the cluster entry is deleted while
both the cluster lock and the tree lock are held in the allocator (T1). This
ensures that even if there is a race with a concurrent
insert_into_bitmap call, it will correctly find the entry in the cluster
and add the new space to it.
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Cc: <stable(a)vger.kernel.org>
---
fs/btrfs/free-space-cache.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 0d6dcb5ff963..e386c468feaf 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3031,8 +3031,6 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group *block_group,
entry->bytes -= bytes;
}
- if (entry->bytes == 0)
- rb_erase(&entry->offset_index, &cluster->root);
break;
}
out:
@@ -3049,7 +3047,9 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group *block_group,
ctl->free_space -= bytes;
if (!entry->bitmap && !btrfs_free_space_trimmed(entry))
ctl->discardable_bytes[BTRFS_STAT_CURR] -= bytes;
+ spin_lock(&cluster->lock);
if (entry->bytes == 0) {
+ rb_erase(&entry->offset_index, &cluster->root);
ctl->free_extents--;
if (entry->bitmap) {
kmem_cache_free(btrfs_free_space_bitmap_cachep,
@@ -3062,6 +3062,7 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group *block_group,
kmem_cache_free(btrfs_free_space_cachep, entry);
}
+ spin_unlock(&cluster->lock);
spin_unlock(&ctl->tree_lock);
return ret;
--
2.25.1
On Wed, 2021-02-10 at 17:06 +0000, Julien Grall wrote:
> From: Julien Grall <jgrall(a)amazon.com>
>
> After Commit 3499ba8198cad ("xen: Fix event channel callback via
> INTX/GSI"), xenbus_probe() will be called too early on Arm. This will
> result in a guest hang during boot.
>
> If the hang wasn't there, we would have ended up calling
> xenbus_probe() twice (the second time is in xenbus_probe_initcall()).
>
> We don't need to initialize xenbus_probe() early for Arm guest.
> Therefore, the call in xen_guest_init() is now removed.
>
> After this change, there is no more external caller for xenbus_probe().
> So the function is turned into a static one. Interestingly there were two
> prototypes for it.
>
> Fixes: 3499ba8198cad ("xen: Fix event channel callback via INTX/GSI")
> Reported-by: Ian Jackson <iwj(a)xenproject.org>
> Signed-off-by: Julien Grall <jgrall(a)amazon.com>
Reviewed-by: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: stable(a)vger.kernel.org
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 70245f86c109e0eafb92ea9653184c0e44b4b35c
Gitweb: https://git.kernel.org/tip/70245f86c109e0eafb92ea9653184c0e44b4b35c
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Wed, 10 Feb 2021 16:27:41 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 10 Feb 2021 22:06:47 +01:00
x86/pci: Create PCI/MSI irqdomain after x86_init.pci.arch_init()
Invoking x86_init.irqs.create_pci_msi_domain() before
x86_init.pci.arch_init() breaks XEN PV.
The XEN_PV specific pci.arch_init() function overrides the default
create_pci_msi_domain() which is obviously too late.
As a consequence the XEN PV PCI/MSI allocation goes through the native
path which runs out of vectors and causes malfunction.
Invoke it after x86_init.pci.arch_init().
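The ordering problem can be shown with a small hook-table model. Everything below (struct init_hooks, the enum, the two init_* functions) is a hypothetical simplification and not the real x86_init layout; it only demonstrates why a hook that is overridden by arch_init must be called after arch_init.

```c
#include <assert.h>
#include <stddef.h>

enum msi_domain { MSI_DOMAIN_NATIVE, MSI_DOMAIN_XEN };

/* Hypothetical, simplified stand-in for a table of init hooks. */
struct init_hooks {
    int (*arch_init)(struct init_hooks *hooks);     /* may replace other hooks */
    enum msi_domain (*create_msi_domain)(void);
};

enum msi_domain native_create(void) { return MSI_DOMAIN_NATIVE; }
enum msi_domain xen_create(void)    { return MSI_DOMAIN_XEN; }

/* Platform init that installs its own MSI domain constructor. */
int xen_arch_init(struct init_hooks *hooks)
{
    hooks->create_msi_domain = xen_create;
    return 0;
}

/* Old ordering: the domain is created before the override is installed. */
enum msi_domain init_domain_first(struct init_hooks *hooks)
{
    assert(hooks != NULL && hooks->create_msi_domain != NULL);
    enum msi_domain domain = hooks->create_msi_domain();
    if (hooks->arch_init)
        hooks->arch_init(hooks);
    return domain;
}

/* Fixed ordering: run arch_init first, then create the domain. */
enum msi_domain init_arch_first(struct init_hooks *hooks)
{
    assert(hooks != NULL && hooks->create_msi_domain != NULL);
    if (hooks->arch_init)
        hooks->arch_init(hooks);
    return hooks->create_msi_domain();
}
```

With xen_arch_init installed, init_domain_first() still ends up on the native path, while init_arch_first() picks up the override.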
Fixes: 6b15ffa07dc3 ("x86/irq: Initialize PCI/MSI domain at PCI init time")
Reported-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Juergen Gross <jgross(a)suse.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/87pn18djte.fsf@nanos.tec.linutronix.de
---
arch/x86/pci/init.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/x86/pci/init.c b/arch/x86/pci/init.c
index 00bfa1e..0bb3b8b 100644
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -9,16 +9,23 @@
in the right sequence from here. */
static __init int pci_arch_init(void)
{
- int type;
-
- x86_create_pci_msi_domain();
+ int type, pcbios = 1;
type = pci_direct_probe();
if (!(pci_probe & PCI_PROBE_NOEARLY))
pci_mmcfg_early_init();
- if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
+ if (x86_init.pci.arch_init)
+ pcbios = x86_init.pci.arch_init();
+
+ /*
+ * Must happen after x86_init.pci.arch_init(). Xen sets up the
+ * x86_init.irqs.create_pci_msi_domain there.
+ */
+ x86_create_pci_msi_domain();
+
+ if (!pcbios)
return 0;
pci_pcbios_init();
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, slub: better heuristic for number of cpus when calculating slab order
When creating a new kmem cache, SLUB determines how large the slab pages will
be, based on a number of inputs, including the number of CPUs in the system. Larger
slab pages mean that more objects can be allocated/freed from per-cpu slabs
before accessing shared structures, but also potentially more memory can be
wasted due to low slab usage and fragmentation.
The rough idea of using number of CPUs is that larger systems will be more
likely to benefit from reduced contention, and also should have enough memory
to spare.
Number of CPUs used to be determined as nr_cpu_ids, which is number of possible
cpus, but on some systems many will never be onlined, thus commit 045ab8c9487b
("mm/slub: let number of online CPUs determine the slub page order") changed it
to num_online_cpus(). However, for kmem caches created early before CPUs are
onlined, this may lead to permanently low slab page sizes.
Vincent reports a regression [1] of hackbench on arm64 systems:
> I'm facing significant performances regression on a large arm64 server
> system (224 CPUs). Regressions is also present on small arm64 system
> (8 CPUs) but in a far smaller order of magnitude
> On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
> v5.11-rc4 : 9.135sec (+/- 0.45%)
> v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
> v5.10: 3.136sec (+/- 0.40%)
Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
page allocator contention:
> i.e. the patch incurs a 7% to 32% performance penalty. This bisected
> cleanly yesterday when I was looking for the regression and then found
> the thread.
> Numerous caches change size. For example, kmalloc-512 goes from order-0
> (vanilla) to order-2 with the revert.
> So mostly this is down to the number of times SLUB calls into the page
> allocator which only caches order-0 pages on a per-cpu basis.
Clearly num_online_cpus() doesn't work too early in bootup. We could change
the order dynamically in a memory hotplug callback, but runtime order changing
for existing kmem caches has been already shown as dangerous, and removed in
32a6f409b693 ("mm, slub: remove runtime allocation order changes"). It could be
resurrected in a safe manner with some effort, but to fix the regression we
need something simpler.
We could use num_present_cpus() that should be the number of physically
present CPUs even before they are onlined. That would work for PowerPC
[3], which triggered the original commit, but that still doesn't work on
arm64 [4] as explained in [5].
So this patch tries to determine the best available value without specific
arch knowledge.
- num_present_cpus() if the number is larger than 1, as that means the
arch is likely setting it properly
- nr_cpu_ids otherwise
This should fix the reported regressions while also keeping the effect of
045ab8c9487b for PowerPC systems. It's possible there are configurations
where num_present_cpus() is 1 during boot while nr_cpu_ids is at the same
time bloated, so these (if they exist) would keep the large orders based
on nr_cpu_ids as was before 045ab8c9487b.
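A standalone sketch of the heuristic, for playing with the numbers outside the kernel. fls_u() is a local reimplementation of the kernel's fls(), and the two parameters stand in for num_present_cpus() and nr_cpu_ids; the function names are made up for this example.

```c
#include <assert.h>

/* Find last (most significant) set bit, 1-based; returns 0 for x == 0. */
unsigned int fls_u(unsigned int x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Trust the present count only if it is larger than 1. */
unsigned int pick_nr_cpus(unsigned int present, unsigned int possible)
{
    return present > 1 ? present : possible;
}

/* Default min_objects when slub_min_objects is not set. */
unsigned int default_min_objects(unsigned int present, unsigned int possible)
{
    unsigned int nr_cpus = pick_nr_cpus(present, possible);

    assert(nr_cpus >= 1);
    return 4 * (fls_u(nr_cpus) + 1);
}
```

For the 224-CPU machine from the report, default_min_objects(224, 224) gives 4 * (8 + 1) = 36, and the same result is obtained when the arch reports only 1 present CPU early in boot but 224 possible ones. With the num_online_cpus() approach and a single online CPU at cache creation time, the value would have been 4 * (1 + 1) = 8.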
[1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj…
[2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
[3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03Ys…
[5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
Link: https://lkml.kernel.org/r/20210208134108.22286-1-vbabka@suse.cz
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reported-by: Mel Gorman <mgorman(a)techsingularity.net>
Tested-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Bharata B Rao <bharata(a)linux.ibm.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
--- a/mm/slub.c~mm-slub-better-heuristic-for-number-of-cpus-when-calculating-slab-order
+++ a/mm/slub.c
@@ -3423,6 +3423,7 @@ static inline int calculate_order(unsign
unsigned int order;
unsigned int min_objects;
unsigned int max_objects;
+ unsigned int nr_cpus;
/*
* Attempt to find best configuration for a slab. This
@@ -3433,8 +3434,21 @@ static inline int calculate_order(unsign
* we reduce the minimum objects required in a slab.
*/
min_objects = slub_min_objects;
- if (!min_objects)
- min_objects = 4 * (fls(num_online_cpus()) + 1);
+ if (!min_objects) {
+ /*
+ * Some architectures will only update present cpus when
+ * onlining them, so don't trust the number if it's just 1. But
+ * we also don't want to use nr_cpu_ids always, as on some other
+ * architectures, there can be many possible cpus, but never
+ * onlined. Here we compromise between trying to avoid too high
+ * order on systems that appear larger than they are, and too
+ * low order on systems that appear smaller than they are.
+ */
+ nr_cpus = num_present_cpus();
+ if (nr_cpus <= 1)
+ nr_cpus = nr_cpu_ids;
+ min_objects = 4 * (fls(nr_cpus) + 1);
+ }
max_objects = order_objects(slub_max_order, size);
min_objects = min(min_objects, max_objects);
_
Hi,
While reconciling the lttng-modules writeback instrumentation with its counterpart
within the upstream Linux kernel, I notice that the following commit introduced in
5.6 is present in stable branches 5.4 and 5.5, but is missing from LTS stable branches
for 4.4, 4.9, 4.14, 4.19:
commit 68f23b89067fdf187763e75a56087550624fdbee
("memcg: fix a crash in wb_workfn when a device disappears")
Considering that this fix was CC'd to the stable mailing list, is there any
reason why it has not been integrated into those LTS branches?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
The recent rework of probe_kernel_address() and its conversion to
get_kernel_nofault() inadvertently broke is_prefetch(). Before this change,
probe_kernel_address() was used as a sloppy "read user or kernel memory"
helper, but it doesn't do that any more. The new get_kernel_nofault()
reads *kernel* memory only, which completely broke is_prefetch() for user
access.
Adjust the code to use the correct accessor based on access mode. The
manual address bounds check is no longer necessary, since the accessor
helpers (get_user() / get_kernel_nofault()) do the right thing all by
themselves.
Fixes: eab0c6089b68 ("maccess: unify the probe kernel arch hooks")
Cc: stable(a)vger.kernel.org
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
arch/x86/mm/fault.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index f1f1b5a0956a..441c3e9b8971 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -54,7 +54,7 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr)
* 32-bit mode:
*
* Sometimes AMD Athlon/Opteron CPUs report invalid exceptions on prefetch.
- * Check that here and ignore it.
+ * Check that here and ignore it. This is AMD erratum #91.
*
* 64-bit mode:
*
@@ -83,11 +83,7 @@ check_prefetch_opcode(struct pt_regs *regs, unsigned char *instr,
#ifdef CONFIG_X86_64
case 0x40:
/*
- * In AMD64 long mode 0x40..0x4F are valid REX prefixes
- * Need to figure out under what instruction mode the
- * instruction was issued. Could check the LDT for lm,
- * but for now it's good enough to assume that long
- * mode only uses well known segments or kernel.
+ * In 64-bit mode 0x40..0x4F are valid REX prefixes
*/
return (!user_mode(regs) || user_64bit_mode(regs));
#endif
@@ -127,20 +123,31 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
instr = (void *)convert_ip_to_linear(current, regs);
max_instr = instr + 15;
- if (user_mode(regs) && instr >= (unsigned char *)TASK_SIZE_MAX)
- return 0;
+ /*
+ * This code has historically always bailed out if IP points to a
+ * not-present page (e.g. due to a race). No one has ever
+ * complained about this.
+ */
+ pagefault_disable();
while (instr < max_instr) {
unsigned char opcode;
- if (get_kernel_nofault(opcode, instr))
- break;
+ if (user_mode(regs)) {
+ if (get_user(opcode, instr))
+ break;
+ } else {
+ if (get_kernel_nofault(opcode, instr))
+ break;
+ }
instr++;
if (!check_prefetch_opcode(regs, instr, opcode, &prefetch))
break;
}
+
+ pagefault_enable();
return prefetch;
}
--
2.29.2
printk_safe_flush_on_panic() caused the following deadlock on our
server:
CPU0:                                         CPU1:
panic                                         rcu_dump_cpu_stacks
  kdump_nmi_shootdown_cpus                      nmi_trigger_cpumask_backtrace
    register_nmi_handler(crash_nmi_callback)      printk_safe_flush
                                                    __printk_safe_flush
                                                      raw_spin_lock_irqsave(&read_lock)
    // send NMI to other processors
    apic_send_IPI_allbutself(NMI_VECTOR)
                                                        // NMI interrupt, dead loop
                                                        crash_nmi_callback
  printk_safe_flush_on_panic
    printk_safe_flush
      __printk_safe_flush
        // deadlock
        raw_spin_lock_irqsave(&read_lock)
DEADLOCK: read_lock is taken on CPU1 and will never get released.
It happens when panic() stops a CPU by NMI while it has been in
the middle of printk_safe_flush().
Handle the lock the same way as logbuf_lock. The printk_safe buffers
are flushed only when both locks can be safely taken. It can avoid
the deadlock _in this particular case_ at the expense of losing the
contents of printk_safe buffers.
Note: It would actually be safe to re-init the locks when all CPUs were
stopped by NMI. But it would require passing this information
from arch-specific code. It is not worth the complexity.
Especially because logbuf_lock and printk_safe buffers have been
obsoleted by the lockless ring buffer.
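The decision made in the panic path can be modelled in userspace as below. This is only a sketch: struct panic_lock and panic_lock_prepare() are invented for illustration, an atomic_bool stands in for the raw spinlock state, and online_cpus stands in for num_online_cpus().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a lock that a stopped CPU may have left held. */
struct panic_lock {
    atomic_bool locked;
};

/*
 * Decide whether the panic path may flush. Returns true if the lock can
 * be taken safely afterwards, false if the flush has to be skipped.
 */
bool panic_lock_prepare(struct panic_lock *l, int online_cpus)
{
    assert(online_cpus >= 1);
    if (atomic_load(&l->locked)) {
        /* Another CPU might still own it: give up rather than spin forever. */
        if (online_cpus > 1)
            return false;
        /* Only this CPU is left, so the owner can never unlock: reset it. */
        atomic_store(&l->locked, false);
    }
    return true;
}
```

The cost is the one named above: when the function returns false, whatever is still sitting in the buffers is not printed.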
Fixes: cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Cc: <stable(a)vger.kernel.org>
---
kernel/printk/printk_safe.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index a0e6f746de6c..2e9e3ed7d63e 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -45,6 +45,8 @@ struct printk_safe_seq_buf {
static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
static DEFINE_PER_CPU(int, printk_context);
+static DEFINE_RAW_SPINLOCK(safe_read_lock);
+
#ifdef CONFIG_PRINTK_NMI
static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
#endif
@@ -180,8 +182,6 @@ static void report_message_lost(struct printk_safe_seq_buf *s)
*/
static void __printk_safe_flush(struct irq_work *work)
{
- static raw_spinlock_t read_lock =
- __RAW_SPIN_LOCK_INITIALIZER(read_lock);
struct printk_safe_seq_buf *s =
container_of(work, struct printk_safe_seq_buf, work);
unsigned long flags;
@@ -195,7 +195,7 @@ static void __printk_safe_flush(struct irq_work *work)
* different CPUs. This is especially important when printing
* a backtrace.
*/
- raw_spin_lock_irqsave(&read_lock, flags);
+ raw_spin_lock_irqsave(&safe_read_lock, flags);
i = 0;
more:
@@ -232,7 +232,7 @@ static void __printk_safe_flush(struct irq_work *work)
out:
report_message_lost(s);
- raw_spin_unlock_irqrestore(&read_lock, flags);
+ raw_spin_unlock_irqrestore(&safe_read_lock, flags);
}
/**
@@ -278,6 +278,14 @@ void printk_safe_flush_on_panic(void)
raw_spin_lock_init(&logbuf_lock);
}
+ if (raw_spin_is_locked(&safe_read_lock)) {
+ if (num_online_cpus() > 1)
+ return;
+
+ debug_locks_off();
+ raw_spin_lock_init(&safe_read_lock);
+ }
+
printk_safe_flush();
}
--
2.11.0
This is a note to let you know that I've just added the patch titled
usb: quirks: add quirk to start video capture on ELMO L-12F document
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From 1ebe718bb48278105816ba03a0408ecc2d6cf47f Mon Sep 17 00:00:00 2001
From: Stefan Ursella <stefan.ursella(a)wolfvision.net>
Date: Wed, 10 Feb 2021 15:07:11 +0100
Subject: usb: quirks: add quirk to start video capture on ELMO L-12F document
camera reliable
Without this quirk starting a video capture from the device often fails with
kernel: uvcvideo: Failed to set UVC probe control : -110 (exp. 34).
Signed-off-by: Stefan Ursella <stefan.ursella(a)wolfvision.net>
Link: https://lore.kernel.org/r/20210210140713.18711-1-stefan.ursella@wolfvision.…
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/quirks.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 66a0dc618dfc..6ade3daf7858 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -391,6 +391,9 @@ static const struct usb_device_id usb_quirk_list[] = {
/* X-Rite/Gretag-Macbeth Eye-One Pro display colorimeter */
{ USB_DEVICE(0x0971, 0x2000), .driver_info = USB_QUIRK_NO_SET_INTF },
+ /* ELMO L-12F document camera */
+ { USB_DEVICE(0x09a1, 0x0028), .driver_info = USB_QUIRK_DELAY_CTRL_MSG },
+
/* Broadcom BCM92035DGROM BT dongle */
{ USB_DEVICE(0x0a5c, 0x2021), .driver_info = USB_QUIRK_RESET_RESUME },
--
2.30.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e013f455d95add874f310dc47c608e8c70692ae5 Mon Sep 17 00:00:00 2001
From: Sibi Sankar <sibis(a)codeaurora.org>
Date: Thu, 23 Jul 2020 01:40:45 +0530
Subject: [PATCH] remoteproc: qcom_q6v5_mss: Validate MBA firmware size before
load
The following mem abort is observed when the mba firmware size exceeds
the allocated mba region. MBA firmware size is restricted to a maximum
size of 1M and remaining memory region is used by modem debug policy
firmware when available. Hence verify whether the MBA firmware size lies
within the allocated memory region and is not greater than 1M before
loading.
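In isolation, the added validation amounts to a bounds check in front of the copy. The sketch below is a userspace model, not the driver code: load_mba() and MBA_MAX_SIZE are illustrative names, with the 1M cap taken from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MBA_MAX_SIZE (1024u * 1024u)   /* 1M cap from the commit message */

/*
 * Copy a firmware image into a preallocated region, refusing images that
 * exceed either the region or the 1M cap.
 * Returns 0 on success, -EINVAL if the image is too large.
 */
int load_mba(void *region, size_t region_size, const void *fw, size_t fw_size)
{
    assert(region != NULL && fw != NULL);
    if (fw_size > region_size || fw_size > MBA_MAX_SIZE)
        return -EINVAL;
    memcpy(region, fw, fw_size);
    return 0;
}
```

Without the check, an oversized image makes memcpy() write past the end of the region, which matches the __memcpy frame at the top of the call trace in the logs.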
Err Logs:
Unable to handle kernel paging request at virtual address
Mem abort info:
...
Call trace:
__memcpy+0x110/0x180
rproc_start+0x40/0x218
rproc_boot+0x5b4/0x608
state_store+0x54/0xf8
dev_attr_store+0x44/0x60
sysfs_kf_write+0x58/0x80
kernfs_fop_write+0x140/0x230
vfs_write+0xc4/0x208
ksys_write+0x74/0xf8
__arm64_sys_write+0x24/0x30
...
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Fixes: 051fb70fd4ea4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sibi Sankar <sibis(a)codeaurora.org>
Link: https://lore.kernel.org/r/20200722201047.12975-2-sibis@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
diff --git a/drivers/remoteproc/qcom_q6v5_mss.c b/drivers/remoteproc/qcom_q6v5_mss.c
index 03d7f3d702b3..7826f229957d 100644
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -411,6 +411,12 @@ static int q6v5_load(struct rproc *rproc, const struct firmware *fw)
{
struct q6v5 *qproc = rproc->priv;
+ /* MBA is restricted to a maximum size of 1M */
+ if (fw->size > qproc->mba_size || fw->size > SZ_1M) {
+ dev_err(qproc->dev, "MBA firmware load failed\n");
+ return -EINVAL;
+ }
+
memcpy(qproc->mba_region, fw->data, fw->size);
return 0;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 29 Jan 2021 10:13:53 -0500
Subject: [PATCH] fgraph: Initialize tracing_graph_pause at task creation
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there are some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
   CPU 1                                CPU 2
   -----                                -----
  do_idle()
    cpu_suspend()
      pause_graph_tracing()
          task_struct->tracing_graph_pause++ (0 -> 1)
                                        start_graph_tracing()
                                          for_each_online_cpu(cpu) {
                                            ftrace_graph_init_idle_task(cpu)
                                              task_struct->tracing_graph_pause = 0 (1 -> 0)
      unpause_graph_tracing()
          task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
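The interleaving above can be replayed deterministically in a few lines of C. This is a single-threaded model with a plain int instead of atomic_t, and struct task, pause_graph(), unpause_graph() and replay_suspend() are names made up for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-task pause counter. */
struct task {
    int tracing_graph_pause;
};

void pause_graph(struct task *t)   { t->tracing_graph_pause++; }
void unpause_graph(struct task *t) { t->tracing_graph_pause--; }

/*
 * Replay the sequence: the idle task pauses tracing, tracer start-up
 * optionally resets the counter, then the idle task unpauses.
 * Returns the final counter value; 0 means tracing is enabled again.
 */
int replay_suspend(int reset_on_tracer_start)
{
    struct task idle = { 0 };             /* initialized once, at creation */

    pause_graph(&idle);                   /* 0 -> 1 */
    if (reset_on_tracer_start)
        idle.tracing_graph_pause = 0;     /* old behavior: 1 -> 0 */
    unpause_graph(&idle);
    assert(idle.tracing_graph_pause >= -1);
    return idle.tracing_graph_pause;
}
```

replay_suspend(1) ends at -1, the stuck state described above, while replay_suspend(0), where the counter is only initialized at task creation, returns to 0.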
Cc: stable(a)vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois(a)arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/init/init_task.c b/init/init_task.c
index 8a992d73e6fb..3711cdaafed2 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -198,7 +198,8 @@ struct task_struct init_task
.lockdep_recursion = 0,
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- .ret_stack = NULL,
+ .ret_stack = NULL,
+ .tracing_graph_pause = ATOMIC_INIT(0),
#endif
#if defined(CONFIG_TRACING) && defined(CONFIG_PREEMPTION)
.trace_recursion = 0,
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 73edb9e4f354..29a6ebeebc9e 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -394,7 +394,6 @@ static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
}
if (t->ret_stack == NULL) {
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->curr_ret_stack = -1;
t->curr_ret_depth = -1;
@@ -489,7 +488,6 @@ static DEFINE_PER_CPU(struct ftrace_ret_stack *, idle_ret_stack);
static void
graph_init_task(struct task_struct *t, struct ftrace_ret_stack *ret_stack)
{
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->ftrace_timestamp = 0;
/* make curr_ret_stack visible before we add the ret_stack */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 03a58ea5905fdbd93ff9e52e670d802600ba38cd Mon Sep 17 00:00:00 2001
From: Kent Gibson <warthog618(a)gmail.com>
Date: Thu, 21 Jan 2021 22:10:38 +0800
Subject: [PATCH] gpiolib: cdev: clear debounce period if line set to output
When set_config changes a line from input to output, debounce is
implicitly disabled, as debounce makes no sense for outputs, but the
debounce period is not being cleared and is still reported in the
line info.
So clear the debounce period when the debouncer is stopped in
edge_detector_stop().
Fixes: 65cff7046406 ("gpiolib: cdev: support setting debounce")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kent Gibson <warthog618(a)gmail.com>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski(a)baylibre.com>
diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
index 1a7b51163528..1631727bf0da 100644
--- a/drivers/gpio/gpiolib-cdev.c
+++ b/drivers/gpio/gpiolib-cdev.c
@@ -776,6 +776,8 @@ static void edge_detector_stop(struct line *line)
cancel_delayed_work_sync(&line->work);
WRITE_ONCE(line->sw_debounced, 0);
WRITE_ONCE(line->eflags, 0);
+ if (line->desc)
+ WRITE_ONCE(line->desc->debounce_period_us, 0);
/* do not change line->level - see comment in debounced_value() */
}
This is a bit more than expected: apart from the 9 failed-to-apply patches
there are lots of dependencies for them, but for the most part they
merged automatically.
Hao Xu (1):
io_uring: fix flush cqring overflow list while TASK_INTERRUPTIBLE
Jens Axboe (2):
io_uring: account io_uring internal files as REQ_F_INFLIGHT
io_uring: if we see flush on exit, cancel related tasks
Pavel Begunkov (13):
io_uring: simplify io_task_match()
io_uring: add a {task,files} pair matching helper
io_uring: don't iterate io_uring_cancel_files()
io_uring: pass files into kill timeouts/poll
io_uring: always batch cancel in *cancel_files()
io_uring: fix files cancellation
io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
io_uring: replace inflight_wait with tctx->wait
io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
io_uring: fix list corruption for splice file_get
io_uring: fix sqo ownership false positive warning
io_uring: reinforce cancel on flush during exit
io_uring: drop mm/files between task_work_submit
fs/io-wq.c | 10 --
fs/io-wq.h | 1 -
fs/io_uring.c | 360 ++++++++++++++++++++------------------------------
3 files changed, 141 insertions(+), 230 deletions(-)
--
2.24.0
When creating a new kmem cache, SLUB determines how large the slab pages will
be, based on a number of inputs, including the number of CPUs in the system. Larger
slab pages mean that more objects can be allocated/freed from per-cpu slabs
before accessing shared structures, but also potentially more memory can be
wasted due to low slab usage and fragmentation.
The rough idea of using number of CPUs is that larger systems will be more
likely to benefit from reduced contention, and also should have enough memory
to spare.
Number of CPUs used to be determined as nr_cpu_ids, which is number of possible
cpus, but on some systems many will never be onlined, thus commit 045ab8c9487b
("mm/slub: let number of online CPUs determine the slub page order") changed it
to num_online_cpus(). However, for kmem caches created early before CPUs are
onlined, this may lead to permanently low slab page sizes.
Vincent reports a regression [1] of hackbench on arm64 systems:
> I'm facing significant performances regression on a large arm64 server
> system (224 CPUs). Regressions is also present on small arm64 system
> (8 CPUs) but in a far smaller order of magnitude
> On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
> v5.11-rc4 : 9.135sec (+/- 0.45%)
> v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
> v5.10: 3.136sec (+/- 0.40%)
Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
page allocator contention:
> i.e. the patch incurs a 7% to 32% performance penalty. This bisected
> cleanly yesterday when I was looking for the regression and then found
> the thread.
> Numerous caches change size. For example, kmalloc-512 goes from order-0
> (vanilla) to order-2 with the revert.
> So mostly this is down to the number of times SLUB calls into the page
> allocator which only caches order-0 pages on a per-cpu basis.
Clearly num_online_cpus() doesn't work too early in bootup. We could change
the order dynamically in a memory hotplug callback, but runtime order changing
for existing kmem caches has been already shown as dangerous, and removed in
32a6f409b693 ("mm, slub: remove runtime allocation order changes"). It could be
resurrected in a safe manner with some effort, but to fix the regression we
need something simpler.
We could use num_present_cpus() that should be the number of physically present
CPUs even before they are onlined. That would work for PowerPC [3], which
triggered the original commit, but that still doesn't work on arm64 [4] as
explained in [5].
So this patch tries to determine the best available value without specific arch
knowledge.
- num_present_cpus() if the number is larger than 1, as that means the arch is
likely setting it properly
- nr_cpu_ids otherwise
This should fix the reported regressions while also keeping the effect of
045ab8c9487b for PowerPC systems. It's possible there are configurations where
num_present_cpus() is 1 during boot while nr_cpu_ids is at the same time
bloated, so these (if they exist) would keep the large orders based on
nr_cpu_ids as was before 045ab8c9487b.
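
For illustration only, the selection rule above can be sketched in userspace C.
This is not the kernel code: fls_u(), pick_nr_cpus() and min_objects_for() are
hypothetical names, fls_u() merely imitates the kernel's fls(), and the
"present" and "possible" counts are passed in as plain numbers instead of being
read from num_present_cpus() and nr_cpu_ids.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, 0 when x is 0. Hypothetical helper for this sketch.
 */
static unsigned int fls_u(unsigned int x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/*
 * Trust the present count only when it is larger than 1; otherwise
 * fall back to the possible count (nr_cpu_ids in the kernel).
 */
static unsigned int pick_nr_cpus(unsigned int present, unsigned int possible)
{
	assert(possible >= 1);	/* a system always has at least one possible CPU */
	return present <= 1 ? possible : present;
}

/* Same formula as in calculate_order(): 4 * (fls(nr_cpus) + 1). */
static unsigned int min_objects_for(unsigned int present, unsigned int possible)
{
	return 4 * (fls_u(pick_nr_cpus(present, possible)) + 1);
}
```

With these assumptions, a 224-CPU machine that reports a single present CPU
during early boot gets min_objects_for(1, 224) == 36, the same value as once all
CPUs are present, whereas a count of 1 on its own would give 4 * (fls(1) + 1) == 8.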
[1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj…
[2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
[3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03Ys…
[5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
Reported-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reported-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
OK, this is a 5.11 regression, so we should try to fix it by 5.12. I've also
Cc'd stable for that reason although it's not a crash fix.
We can still try later to replace this with a safe order update in hotplug
callbacks, but that's infeasible for 5.12.
mm/slub.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 176b1cb0d006..8fc9190e6cb3 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3454,6 +3454,7 @@ static inline int calculate_order(unsigned int size)
unsigned int order;
unsigned int min_objects;
unsigned int max_objects;
+ unsigned int nr_cpus;
/*
* Attempt to find best configuration for a slab. This
@@ -3464,8 +3465,21 @@ static inline int calculate_order(unsigned int size)
* we reduce the minimum objects required in a slab.
*/
min_objects = slub_min_objects;
- if (!min_objects)
- min_objects = 4 * (fls(num_online_cpus()) + 1);
+ if (!min_objects) {
+ /*
+ * Some architectures will only update present cpus when
+ * onlining them, so don't trust the number if it's just 1. But
+ * we also don't want to use nr_cpu_ids always, as on some other
+ * architectures, there can be many possible cpus, but never
+ * onlined. Here we compromise between trying to avoid too high
+ * order on systems that appear larger than they are, and too
+ * low order on systems that appear smaller than they are.
+ */
+ nr_cpus = num_present_cpus();
+ if (nr_cpus <= 1)
+ nr_cpus = nr_cpu_ids;
+ min_objects = 4 * (fls(nr_cpus) + 1);
+ }
max_objects = order_objects(slub_max_order, size);
min_objects = min(min_objects, max_objects);
--
2.30.0
From: Johannes Weiner <hannes(a)cmpxchg.org>
commit 739f79fc9db1b38f96b5a5109b247a650fbebf6d upstream
Jaegeuk and Brad report a NULL pointer crash when writeback ending tries
to update the memcg stats:
BUG: unable to handle kernel NULL pointer dereference at 00000000000003b0
IP: test_clear_page_writeback+0x12e/0x2c0
[...]
RIP: 0010:test_clear_page_writeback+0x12e/0x2c0
Call Trace:
<IRQ>
end_page_writeback+0x47/0x70
f2fs_write_end_io+0x76/0x180 [f2fs]
bio_endio+0x9f/0x120
blk_update_request+0xa8/0x2f0
scsi_end_request+0x39/0x1d0
scsi_io_completion+0x211/0x690
scsi_finish_command+0xd9/0x120
scsi_softirq_done+0x127/0x150
__blk_mq_complete_request_remote+0x13/0x20
flush_smp_call_function_queue+0x56/0x110
generic_smp_call_function_single_interrupt+0x13/0x30
smp_call_function_single_interrupt+0x27/0x40
call_function_single_interrupt+0x89/0x90
RIP: 0010:native_safe_halt+0x6/0x10
(gdb) l *(test_clear_page_writeback+0x12e)
0xffffffff811bae3e is in test_clear_page_writeback (./include/linux/memcontrol.h:619).
614 mod_node_page_state(page_pgdat(page), idx, val);
615 if (mem_cgroup_disabled() || !page->mem_cgroup)
616 return;
617 mod_memcg_state(page->mem_cgroup, idx, val);
618 pn = page->mem_cgroup->nodeinfo[page_to_nid(page)];
619 this_cpu_add(pn->lruvec_stat->count[idx], val);
620 }
621
622 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
623 gfp_t gfp_mask,
The issue is that writeback doesn't hold a page reference and the page
might get freed after PG_writeback is cleared (and the mapping is
unlocked) in test_clear_page_writeback(). The stat functions looking up
the page's node or zone are safe, as those attributes are static across
allocation and free cycles. But page->mem_cgroup is not, and it will
get cleared if we race with truncation or migration.
It appears this race window has been around for a while, but was less likely
to trigger when the memcg stats were updated first thing after
PG_writeback is cleared. Recent changes reshuffled this code to update
the global node stats before the memcg ones, though, stretching the race
window out to an extent where people can reproduce the problem.
Update test_clear_page_writeback() to look up and pin page->mem_cgroup
before clearing PG_writeback, then not use that pointer afterward. It
is a partial revert of 62cccb8c8e7a ("mm: simplify lock_page_memcg()")
but leaves the pageref-holding callsites that aren't affected alone.
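
As a rough single-threaded model of the ordering described above (not the
kernel implementation), the idea is to read the memcg pointer before clearing
the writeback flag and to use only the saved copy afterward. The struct
layouts, racing_free() and end_writeback() are hypothetical stand-ins; the
sketch does not model RCU, which is what keeps the memcg itself alive in the
real patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct mem_cgroup and struct page. */
struct memcg {
	long writeback;		/* per-memcg writeback counter */
};

struct page {
	struct memcg *memcg;	/* models page->mem_cgroup */
	int writeback;		/* models PG_writeback */
};

/*
 * Models truncation or migration: once the writeback flag is clear,
 * nothing stops the page's memcg binding from being torn down.
 */
static void racing_free(struct page *page)
{
	if (!page->writeback)
		page->memcg = NULL;
}

/* Returns 1 if the writeback flag was set and has now been cleared. */
static int end_writeback(struct page *page)
{
	struct memcg *memcg = page->memcg;	/* look up before clearing */
	int was_set = page->writeback;

	page->writeback = 0;
	racing_free(page);	/* worst case: the binding is gone right here */

	/* From here on only the saved pointer is used, never page->memcg. */
	if (was_set && memcg)
		memcg->writeback--;
	return was_set;
}
```

Reading page->memcg after the flag is cleared, as the pre-fix code effectively
did, would see NULL in this model; the saved pointer still allows the counter
to be updated.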
Link: http://lkml.kernel.org/r/20170809183825.GA26387@cmpxchg.org
Fixes: 62cccb8c8e7a ("mm: simplify lock_page_memcg()")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
Tested-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
Reported-by: Bradley Bolen <bradleybolen(a)gmail.com>
Tested-by: Brad Bolen <bradleybolen(a)gmail.com>
Cc: Vladimir Davydov <vdavydov(a)virtuozzo.com>
Cc: Michal Hocko <mhocko(a)suse.cz>
Cc: <stable(a)vger.kernel.org> [4.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[guptap(a)codeaurora.org: Resolved merge conflicts]
Signed-off-by: Prakash Gupta <guptap(a)codeaurora.org>
Signed-off-by: Florian Fainelli <f.fainelli(a)gmail.com>
---
This patch is present in a downstream Android tree:
https://source.mcwhirter.io/craige/bluecross/commit/d4a742865c6b69ef9316947…
and I happened to have stumbled across the same problem too.
Johannes can you review it for correctness with respect to the 4.9
kernel? Thanks!
include/linux/memcontrol.h | 33 ++++++++++++++++++++++++-----
mm/memcontrol.c | 43 +++++++++++++++++++++++++++-----------
mm/page-writeback.c | 14 ++++++++++---
3 files changed, 70 insertions(+), 20 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8b35bdbdc214..fd77f8303ab9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -490,9 +490,21 @@ bool mem_cgroup_oom_synchronize(bool wait);
extern int do_swap_account;
#endif
-void lock_page_memcg(struct page *page);
+struct mem_cgroup *lock_page_memcg(struct page *page);
+void __unlock_page_memcg(struct mem_cgroup *memcg);
void unlock_page_memcg(struct page *page);
+static inline void __mem_cgroup_update_page_stat(struct page *page,
+ struct mem_cgroup *memcg,
+ enum mem_cgroup_stat_index idx,
+ int val)
+{
+ VM_BUG_ON(!(rcu_read_lock_held() || PageLocked(page)));
+
+ if (memcg && memcg->stat)
+ this_cpu_add(memcg->stat->count[idx], val);
+}
+
/**
* mem_cgroup_update_page_stat - update page state statistics
* @page: the page
@@ -508,13 +520,12 @@ void unlock_page_memcg(struct page *page);
* mem_cgroup_update_page_stat(page, state, -1);
* unlock_page(page) or unlock_page_memcg(page)
*/
+
static inline void mem_cgroup_update_page_stat(struct page *page,
enum mem_cgroup_stat_index idx, int val)
{
- VM_BUG_ON(!(rcu_read_lock_held() || PageLocked(page)));
- if (page->mem_cgroup)
- this_cpu_add(page->mem_cgroup->stat->count[idx], val);
+ __mem_cgroup_update_page_stat(page, page->mem_cgroup, idx, val);
}
static inline void mem_cgroup_inc_page_stat(struct page *page,
@@ -709,7 +720,12 @@ mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
{
}
-static inline void lock_page_memcg(struct page *page)
+static inline struct mem_cgroup *lock_page_memcg(struct page *page)
+{
+ return NULL;
+}
+
+static inline void __unlock_page_memcg(struct mem_cgroup *memcg)
{
}
@@ -745,6 +761,13 @@ static inline void mem_cgroup_update_page_stat(struct page *page,
{
}
+static inline void __mem_cgroup_update_page_stat(struct page *page,
+ struct mem_cgroup *memcg,
+ enum mem_cgroup_stat_index idx,
+ int nr)
+{
+}
+
static inline void mem_cgroup_inc_page_stat(struct page *page,
enum mem_cgroup_stat_index idx)
{
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d4232744c59f..27b0b4f03fcd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1638,9 +1638,13 @@ bool mem_cgroup_oom_synchronize(bool handle)
* @page: the page
*
* This function protects unlocked LRU pages from being moved to
- * another cgroup and stabilizes their page->mem_cgroup binding.
+ * another cgroup.
+ *
+ * It ensures lifetime of the returned memcg. Caller is responsible
+ * for the lifetime of the page; __unlock_page_memcg() is available
+ * when @page might get freed inside the locked section.
*/
-void lock_page_memcg(struct page *page)
+struct mem_cgroup *lock_page_memcg(struct page *page)
{
struct mem_cgroup *memcg;
unsigned long flags;
@@ -1649,18 +1653,24 @@ void lock_page_memcg(struct page *page)
* The RCU lock is held throughout the transaction. The fast
* path can get away without acquiring the memcg->move_lock
* because page moving starts with an RCU grace period.
- */
+ *
+ * The RCU lock also protects the memcg from being freed when
+ * the page state that is going to change is the only thing
+ * preventing the page itself from being freed. E.g. writeback
+ * doesn't hold a page reference and relies on PG_writeback to
+ * keep off truncation, migration and so forth.
+ */
rcu_read_lock();
if (mem_cgroup_disabled())
- return;
+ return NULL;
again:
memcg = page->mem_cgroup;
if (unlikely(!memcg))
- return;
+ return NULL;
if (atomic_read(&memcg->moving_account) <= 0)
- return;
+ return memcg;
spin_lock_irqsave(&memcg->move_lock, flags);
if (memcg != page->mem_cgroup) {
@@ -1676,18 +1686,18 @@ void lock_page_memcg(struct page *page)
memcg->move_lock_task = current;
memcg->move_lock_flags = flags;
- return;
+ return memcg;
}
EXPORT_SYMBOL(lock_page_memcg);
/**
- * unlock_page_memcg - unlock a page->mem_cgroup binding
- * @page: the page
+ * __unlock_page_memcg - unlock and unpin a memcg
+ * @memcg: the memcg
+ *
+ * Unlock and unpin a memcg returned by lock_page_memcg().
*/
-void unlock_page_memcg(struct page *page)
+void __unlock_page_memcg(struct mem_cgroup *memcg)
{
- struct mem_cgroup *memcg = page->mem_cgroup;
-
if (memcg && memcg->move_lock_task == current) {
unsigned long flags = memcg->move_lock_flags;
@@ -1699,6 +1709,15 @@ void unlock_page_memcg(struct page *page)
rcu_read_unlock();
}
+
+/**
+ * unlock_page_memcg - unlock a page->mem_cgroup binding
+ * @page: the page
+ */
+void unlock_page_memcg(struct page *page)
+{
+ __unlock_page_memcg(page->mem_cgroup);
+}
EXPORT_SYMBOL(unlock_page_memcg);
/*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 462c778b9fb5..498c924f2fcd 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2717,9 +2717,10 @@ EXPORT_SYMBOL(clear_page_dirty_for_io);
int test_clear_page_writeback(struct page *page)
{
struct address_space *mapping = page_mapping(page);
+ struct mem_cgroup *memcg;
int ret;
- lock_page_memcg(page);
+ memcg = lock_page_memcg(page);
if (mapping && mapping_use_writeback_tags(mapping)) {
struct inode *inode = mapping->host;
struct backing_dev_info *bdi = inode_to_bdi(inode);
@@ -2747,13 +2748,20 @@ int test_clear_page_writeback(struct page *page)
} else {
ret = TestClearPageWriteback(page);
}
+ /*
+ * NOTE: Page might be free now! Writeback doesn't hold a page
+ * reference on its own, it relies on truncation to wait for
+ * the clearing of PG_writeback. The below can only access
+ * page state that is static across allocation cycles.
+ */
if (ret) {
- mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
+ __mem_cgroup_update_page_stat(page, memcg,
+ MEM_CGROUP_STAT_WRITEBACK, -1);
dec_node_page_state(page, NR_WRITEBACK);
dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
inc_node_page_state(page, NR_WRITTEN);
}
- unlock_page_memcg(page);
+ __unlock_page_memcg(memcg);
return ret;
}
--
2.25.1
commit 97c753e62e6c31a404183898d950d8c08d752dbd upstream.
Fix kprobe_on_func_entry() to return an error code instead of false so that
register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects the kprobe registration to return -ENOENT
when the target symbol is not found, and it checks whether the target
module is unloaded or not. If the target module doesn't exist, it
defers probing the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fails to put the kretprobe event on unloaded
modules. e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL
or 0 (succeeded, given address is on the entry).
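
A minimal userspace sketch of the return-code convention described above,
under stated assumptions: known_syms[], sym_known(), on_func_entry() and
classify() are hypothetical names, the symbol table stands in for kallsyms,
and every -ENOENT is treated as "defer", whereas the real caller also checks
whether the target module is actually unloaded.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table standing in for kallsyms. */
static const char *const known_syms[] = { "vfs_read", "vfs_write" };

static int sym_known(const char *sym)
{
	size_t i;

	for (i = 0; i < sizeof(known_syms) / sizeof(known_syms[0]); i++)
		if (strcmp(known_syms[i], sym) == 0)
			return 1;
	return 0;
}

/*
 * Mirrors the documented contract: 0 if sym+offset is a function
 * entry, -ENOENT if the symbol lookup fails, -EINVAL otherwise.
 */
static int on_func_entry(const char *sym, unsigned long offset)
{
	if (!sym)
		return -EINVAL;
	if (!sym_known(sym))
		return -ENOENT;
	if (offset != 0)	/* generic rule: entry means offset 0 */
		return -EINVAL;
	return 0;
}

enum probe_action { PROBE_REGISTER, PROBE_DEFER, PROBE_REJECT };

/* A caller can now tell "symbol not there yet" from "bad location". */
static enum probe_action classify(const char *sym, unsigned long offset)
{
	int ret = on_func_entry(sym, offset);

	if (ret == 0)
		return PROBE_REGISTER;
	if (ret == -ENOENT)
		return PROBE_DEFER;
	return PROBE_REJECT;
}
```

With a boolean result, the second and third cases were indistinguishable, which
is why the kretprobe event on an unloaded module was rejected outright instead
of being deferred.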
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@de…
Cc: stable(a)vger.kernel.org
Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv(a)arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
include/linux/kprobes.h | 2 +-
kernel/kprobes.c | 34 +++++++++++++++++++++++++---------
kernel/trace/trace_kprobe.c | 4 ++--
3 files changed, 28 insertions(+), 12 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 9f22652d69bb..c28204e22b54 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -245,7 +245,7 @@ extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
extern int arch_populate_kprobe_blacklist(void);
extern bool arch_kprobe_on_func_entry(unsigned long offset);
-extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
+extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
extern bool within_kprobe_blacklist(unsigned long addr);
extern int kprobe_add_ksym_blacklist(unsigned long entry);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 2161f519d481..ebbd4320143d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1921,29 +1921,45 @@ bool __weak arch_kprobe_on_func_entry(unsigned long offset)
return !offset;
}
-bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
+/**
+ * kprobe_on_func_entry() -- check whether given address is function entry
+ * @addr: Target address
+ * @sym: Target symbol name
+ * @offset: The offset from the symbol or the address
+ *
+ * This checks whether the given @addr+@offset or @sym+@offset is on the
+ * function entry address or not.
+ * This returns 0 if it is the function entry, or -EINVAL if it is not.
+ * And also it returns -ENOENT if it fails the symbol or address lookup.
+ * Caller must pass @addr or @sym (either one must be NULL), or this
+ * returns -EINVAL.
+ */
+int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
{
kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
if (IS_ERR(kp_addr))
- return false;
+ return PTR_ERR(kp_addr);
- if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) ||
- !arch_kprobe_on_func_entry(offset))
- return false;
+ if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
+ return -ENOENT;
- return true;
+ if (!arch_kprobe_on_func_entry(offset))
+ return -EINVAL;
+
+ return 0;
}
int register_kretprobe(struct kretprobe *rp)
{
- int ret = 0;
+ int ret;
struct kretprobe_instance *inst;
int i;
void *addr;
- if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset))
- return -EINVAL;
+ ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset);
+ if (ret)
+ return ret;
if (kretprobe_blacklist_size) {
addr = kprobe_addr(&rp->kp);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5c17f70c7f2d..61eff45653f5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -112,9 +112,9 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
- return kprobe_on_func_entry(tk->rp.kp.addr,
+ return (kprobe_on_func_entry(tk->rp.kp.addr,
tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
- tk->rp.kp.addr ? 0 : tk->rp.kp.offset);
+ tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0);
}
bool trace_kprobe_error_injectable(struct trace_event_call *call)
This is the start of the stable review cycle for the 5.10.15 release.
There are 120 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.15-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.15-rc1
Alexander Ovechkin <ovov(a)yandex-team.ru>
net: sched: replaced invalid qdisc tree flush helper in qdisc_replace
DENG Qingfang <dqfext(a)gmail.com>
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Dongseok Yi <dseok.yi(a)samsung.com>
udp: ipv4: manipulate network header of NATed UDP GRO fraglist
Vadim Fedorenko <vfedorenko(a)novek.ru>
net: ip_tunnel: fix mtu calculation
Chinmay Agarwal <chinagar(a)codeaurora.org>
neighbour: Prevent a dead entry from updating gc_list
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
igc: Report speed and duplex as unknown when device is runtime suspended
Xiao Ni <xni(a)redhat.com>
md: Set prev_flush_start and flush_bio in an atomic way
Marek Vasut <marex(a)denx.de>
Input: ili210x - implement pressure reporting for ILI251x
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
AngeloGioacchino Del Regno <angelogioacchino.delregno(a)somainline.org>
Input: goodix - add support for Goodix GT9286 chip
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Lai Jiangshan <laijs(a)linux.alibaba.com>
x86/debug: Prevent data breakpoints on cpu_dr7
Lai Jiangshan <laijs(a)linux.alibaba.com>
x86/debug: Prevent data breakpoints on __per_cpu_offset
Peter Zijlstra <peterz(a)infradead.org>
x86/debug: Fix DR6 handling
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Waiman Long <longman(a)redhat.com>
mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
Hugh Dickins <hughd(a)google.com>
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Rick Edgecombe <rick.p.edgecombe(a)intel.com>
mm/vmalloc: separate put pages and flush VM flags
Rokudo Yan <wu-yan(a)tcl.com>
mm, compaction: move high_pfn to the for loop scope
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between freeing and dissolving the page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Dmitry Osipenko <digetx(a)gmail.com>
ARM: 9043/1: tegra: Fix misplaced tegra_uart_config in decompressor
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
H. Nikolaus Schaller <hns(a)goldelico.com>
ARM: dts; gta04: SPI panel chip select is active low
H. Nikolaus Schaller <hns(a)goldelico.com>
DTS: ARM: gta04: remove legacy spi-cs-high to make display work again
Sean Christopherson <seanjc(a)google.com>
KVM: x86: Set so called 'reserved CR3 bits in LM mask' at vCPU reset
Sean Christopherson <seanjc(a)google.com>
KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode
Michael Roth <michael.roth(a)amd.com>
KVM: x86: fix CPUID entries returned by KVM_GET_CPUID2 ioctl
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off
Ben Gardon <bgardon(a)google.com>
KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEs
Sean Christopherson <seanjc(a)google.com>
KVM: SVM: Treat SVM as unsupported when running as an SEV guest
Thorsten Leemhuis <linux(a)leemhuis.info>
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Xiaoguang Wang <xiaoguang.wang(a)linux.alibaba.com>
io_uring: don't modify identity's files uncess identity is cowed
Stylon Wang <stylon.wang(a)amd.com>
drm/amd/display: Revert "Fix EDID parsing after resume from suspend"
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Power up combo PHY lanes for for HDMI as well
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Extract intel_ddi_power_up_lanes()
Andres Calderon Jaramillo <andrescj(a)chromium.org>
drm/i915/display: Prevent double YUV range correction on HDR planes
Chris Wilson <chris(a)chris-wilson.co.uk>
drm/i915/gt: Close race between enable_breadcrumbs and cancel_breadcrumbs
Chris Wilson <chris(a)chris-wilson.co.uk>
drm/i915/gem: Drop lru bumping on display unpinning
Imre Deak <imre.deak(a)intel.com>
drm/i915: Fix the MST PBN divider calculation
Imre Deak <imre.deak(a)intel.com>
drm/dp/mst: Export drm_dp_get_vc_payload_bw()
Peter Gonda <pgonda(a)google.com>
Fix unsynchronized access to sev members through svm_register_enc_region
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Ulf Hansson <ulf.hansson(a)linaro.org>
mmc: sdhci-pltfm: Fix linking err for sdhci-brcmstb
Pavel Shilovsky <pshilov(a)microsoft.com>
smb3: fix crediting for compounding when only one request in flight
Gustavo A. R. Silva <gustavoars(a)kernel.org>
smb3: Fix out-of-bounds bug in SMB2_negotiate()
Joerg Roedel <jroedel(a)suse.de>
iommu: Check dev->iommu in dev_iommu_priv_get() before dereferencing it
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Atish Patra <atish.patra(a)wdc.com>
RISC-V: Define MAXPHYSMEM_1GB only for RV32
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: fix bounce buffer usage for non-sg list case
Rolf Eike Beer <eb(a)emlix.com>
scripts: use pkg-config to locate libcrypto
Marc Zyngier <maz(a)kernel.org>
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
Hans de Goede <hdegoede(a)redhat.com>
genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
Dan Williams <dan.j.williams(a)intel.com>
libnvdimm/dimm: Avoid race between probe and available_slots_show()
Dan Williams <dan.j.williams(a)intel.com>
libnvdimm/namespace: Fix visibility of namespace resource attribute
Alexey Kardashevskiy <aik(a)ozlabs.ru>
tracepoint: Fix race between tracing and removing tracepoint
Viktor Rosendahl <Viktor.Rosendahl(a)bmw.de>
tracing: Use pause-on-trace with the latency tracers
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Masami Hiramatsu <mhiramat(a)kernel.org>
tracing/kprobe: Fix to support kretprobe events on unloaded modules
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
fgraph: Initialize tracing_graph_pause at task creation
Quanyang Wang <quanyang.wang(a)windriver.com>
gpiolib: free device name on error path to fix kmemleak
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Sargun Dhillon <sargun(a)sargun.me>
ovl: implement volatile-specific fsync error behaviour
Miklos Szeredi <mszeredi(a)redhat.com>
ovl: avoid deadlock on directory ioctl
Liangyan <liangyan.peng(a)linux.alibaba.com>
ovl: fix dentry leak in ovl_get_redirect
Mario Limonciello <mario.limonciello(a)dell.com>
thunderbolt: Fix possible NULL pointer dereference in tb_acpi_add_link()
Masahiro Yamada <masahiroy(a)kernel.org>
kbuild: fix duplicated flags in DEBUG_CFLAGS
Roman Gushchin <guro(a)fb.com>
memblock: do not start bottom-up allocations with kernel_end
Eli Cohen <elic(a)nvidia.com>
vdpa/mlx5: Restore the hardware used index after change map
Sagi Grimberg <sagi(a)grimberg.me>
nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
Hermann Lauer <Hermann.Lauer(a)uni-heidelberg.de>
ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode
Dan Carpenter <dan.carpenter(a)oracle.com>
net: ipa: pass correct dma_handle to dma_free_coherent()
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
Stefan Chulski <stefanc(a)marvell.com>
net: mvpp2: TCAM entry enable should be written after SRAM data
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Maor Dickman <maord(a)nvidia.com>
net/mlx5e: Release skb in case of failure in tc update skb
Maxim Mikityanskiy <maximmi(a)mellanox.com>
net/mlx5e: Update max_opened_tc also when channels are closed
Maor Gottlieb <maorg(a)nvidia.com>
net/mlx5: Fix leak upon failure of rule creation
Daniel Jurgens <danielj(a)nvidia.com>
net/mlx5: Fix function calculation for page trees
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: device remove has higher precedence over reset
Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"
Kevin Lo <kevlo(a)kevlo.org>
igc: check return value of ret_val in igc_config_fc_after_link_up
Kevin Lo <kevlo(a)kevlo.org>
igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr
Chuck Lever <chuck.lever(a)oracle.com>
SUNRPC: Fix NFS READs that start at non-page-aligned offsets
Zyta Szpak <zr(a)semihalf.com>
arm64: dts: ls1046a: fix dcfg address range
David Howells <dhowells(a)redhat.com>
rxrpc: Fix deadlock around release of dst cached on udp tunnel
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: work around RTL8125 UDP hw bug
Marek Szyprowski <m.szyprowski(a)samsung.com>
arm64: dts: meson: switch TFLASH_VDD_EN pin to open drain on Odroid-C4
Quentin Monnet <quentin(a)isovalent.com>
bpf, preload: Fix build when $(O) points to a relative path
Johannes Berg <johannes.berg(a)intel.com>
um: virtio: free vu_dev only with the contained struct device
Pan Bian <bianpan2016(a)163.com>
bpf, inode_storage: Put file handler if no storage was found
Loris Reiff <loris.reiff(a)liblor.ch>
bpf, cgroup: Fix problematic bounds check
Loris Reiff <loris.reiff(a)liblor.ch>
bpf, cgroup: Fix optlen WARN_ON_ONCE toctou
Eli Cohen <elic(a)nvidia.com>
vdpa/mlx5: Fix memory key MTT population
Marek Vasut <marex(a)denx.de>
ARM: dts: stm32: Fix GPIO hog flags on DHCOM DRC02
Marek Vasut <marex(a)denx.de>
ARM: dts: stm32: Disable optional TSC2004 on DRC02 board
Marek Vasut <marex(a)denx.de>
ARM: dts: stm32: Disable WP on DHCOM uSD slot
Marek Vasut <marex(a)denx.de>
ARM: dts: stm32: Connect card-detect signal on DHCOM
Marek Vasut <marex(a)denx.de>
ARM: dts: stm32: Fix polarity of the DH DRC02 uSD card detect
Simon South <simon(a)simonsouth.net>
arm64: dts: rockchip: Use only supported PCIe link speed on Pinebook Pro
Sandy Huang <hjc(a)rock-chips.com>
arm64: dts: rockchip: fix vopl iommu irq on px30
Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
arm64: dts: amlogic: meson-g12: Set FL-adj property value
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Shawn Guo <shawn.guo(a)linaro.org>
arm64: dts: qcom: c630: keep both touchpad devices enabled
Linus Walleij <linus.walleij(a)linaro.org>
ARM: OMAP1: OSK: fix ohci-omap breakage
Chunfeng Yun <chunfeng.yun(a)mediatek.com>
usb: xhci-mtk: break loop when find the endpoint to drop
Chunfeng Yun <chunfeng.yun(a)mediatek.com>
usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints
Ikjoon Jang <ikjn(a)chromium.org>
usb: xhci-mtk: fix unreleased bandwidth data
Gary Bisson <gary.bisson(a)boundarydevices.com>
usb: dwc3: fix clock issue during resume in OTG mode
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
kernel test robot <lkp(a)intel.com>
usb: gadget: aspeed: add missing of_node_put
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Pali Rohár <pali(a)kernel.org>
usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
-------------
Diffstat:
Documentation/filesystems/overlayfs.rst | 8 ++
Makefile | 14 +--
arch/arm/boot/dts/omap3-gta04.dtsi | 3 +-
arch/arm/boot/dts/stm32mp15xx-dhcom-drc02.dtsi | 12 +-
arch/arm/boot/dts/stm32mp15xx-dhcom-som.dtsi | 3 +-
arch/arm/boot/dts/sun7i-a20-bananapro.dts | 2 +-
arch/arm/include/debug/tegra.S | 54 ++++-----
arch/arm/mach-footbridge/dc21285.c | 12 +-
arch/arm/mach-omap1/board-osk.c | 2 +
arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 2 +-
.../arm64/boot/dts/amlogic/meson-sm1-odroid-c4.dts | 2 +-
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +-
.../boot/dts/qcom/sdm850-lenovo-yoga-c630.dts | 10 +-
arch/arm64/boot/dts/rockchip/px30.dtsi | 2 +-
.../boot/dts/rockchip/rk3399-pinebook-pro.dts | 1 -
arch/riscv/Kconfig | 2 +
arch/um/drivers/virtio_uml.c | 3 +-
arch/x86/Makefile | 3 +
arch/x86/include/asm/apic.h | 10 --
arch/x86/include/asm/barrier.h | 18 +++
arch/x86/kernel/apic/apic.c | 4 +
arch/x86/kernel/apic/x2apic_cluster.c | 6 +-
arch/x86/kernel/apic/x2apic_phys.c | 9 +-
arch/x86/kernel/hw_breakpoint.c | 61 ++++++----
arch/x86/kvm/cpuid.c | 2 +-
arch/x86/kvm/emulate.c | 2 +
arch/x86/kvm/mmu/tdp_mmu.c | 6 +-
arch/x86/kvm/svm/sev.c | 17 +--
arch/x86/kvm/svm/svm.c | 5 +
arch/x86/kvm/vmx/vmx.c | 17 ++-
arch/x86/kvm/x86.c | 27 +++--
arch/x86/mm/mem_encrypt.c | 1 +
drivers/gpio/gpiolib.c | 10 +-
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 -
drivers/gpu/drm/drm_dp_mst_topology.c | 24 +++-
drivers/gpu/drm/i915/display/intel_ddi.c | 37 +++---
drivers/gpu/drm/i915/display/intel_display.c | 9 +-
drivers/gpu/drm/i915/display/intel_dp_mst.c | 4 +-
drivers/gpu/drm/i915/display/intel_overlay.c | 4 +-
drivers/gpu/drm/i915/display/intel_sprite.c | 65 ++---------
drivers/gpu/drm/i915/gem/i915_gem_domain.c | 45 -------
drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 -
drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 6 +-
drivers/input/joystick/xpad.c | 17 ++-
drivers/input/serio/i8042-x86ia64io.h | 2 +
drivers/input/touchscreen/goodix.c | 2 +
drivers/input/touchscreen/ili210x.c | 26 +++--
drivers/md/md.c | 2 +
drivers/mmc/core/sdio_cis.c | 6 +
drivers/mmc/host/sdhci-pltfm.h | 7 +-
drivers/net/dsa/mv88e6xxx/chip.c | 6 +-
drivers/net/ethernet/ibm/ibmvnic.c | 5 -
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 13 +--
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h | 1 -
drivers/net/ethernet/intel/igc/igc_ethtool.c | 3 +-
drivers/net/ethernet/intel/igc/igc_i225.c | 3 +-
drivers/net/ethernet/intel/igc/igc_mac.c | 2 +-
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 +-
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 +-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 16 ++-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 5 +
.../net/ethernet/mellanox/mlx5/core/pagealloc.c | 2 +-
drivers/net/ethernet/realtek/r8169_main.c | 75 ++++++++++--
drivers/net/ipa/gsi.c | 2 +-
drivers/nvdimm/dimm_devs.c | 18 ++-
drivers/nvdimm/namespace_devs.c | 10 +-
drivers/nvme/host/pci.c | 2 +
drivers/nvme/target/tcp.c | 3 +-
drivers/thunderbolt/acpi.c | 2 +-
drivers/usb/class/usblp.c | 19 +--
drivers/usb/dwc2/gadget.c | 8 +-
drivers/usb/dwc3/core.c | 2 +-
drivers/usb/gadget/legacy/ether.c | 4 +-
drivers/usb/gadget/udc/aspeed-vhub/hub.c | 4 +-
drivers/usb/host/xhci-mtk-sch.c | 130 +++++++++++++++------
drivers/usb/host/xhci-mtk.c | 2 +
drivers/usb/host/xhci-mtk.h | 15 +++
drivers/usb/host/xhci-mvebu.c | 42 +++++++
drivers/usb/host/xhci-mvebu.h | 6 +
drivers/usb/host/xhci-plat.c | 20 +++-
drivers/usb/host/xhci-plat.h | 1 +
drivers/usb/host/xhci-ring.c | 31 +++--
drivers/usb/host/xhci.c | 8 +-
drivers/usb/host/xhci.h | 4 +
drivers/usb/renesas_usbhs/fifo.c | 1 +
drivers/usb/serial/cp210x.c | 2 +
drivers/usb/serial/option.c | 6 +
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 +
drivers/vdpa/mlx5/core/mr.c | 28 ++---
drivers/vdpa/mlx5/net/mlx5_vnet.c | 18 +++
fs/afs/main.c | 6 +-
fs/cifs/dir.c | 22 +++-
fs/cifs/smb2pdu.h | 2 +-
fs/cifs/transport.c | 18 ++-
fs/hugetlbfs/inode.c | 3 +-
fs/io_uring.c | 6 -
fs/overlayfs/dir.c | 2 +-
fs/overlayfs/file.c | 5 +-
fs/overlayfs/overlayfs.h | 1 +
fs/overlayfs/ovl_entry.h | 2 +
fs/overlayfs/readdir.c | 28 ++---
fs/overlayfs/super.c | 34 ++++--
fs/overlayfs/util.c | 27 +++++
include/drm/drm_dp_mst_helper.h | 1 +
include/linux/hugetlb.h | 2 +
include/linux/iommu.h | 5 +-
include/linux/irq.h | 4 +-
include/linux/kprobes.h | 2 +-
include/linux/msi.h | 6 +
include/linux/tracepoint.h | 12 +-
include/linux/vmalloc.h | 9 +-
include/net/sch_generic.h | 2 +-
include/net/udp.h | 2 +-
init/init_task.c | 3 +-
kernel/bpf/bpf_inode_storage.c | 6 +-
kernel/bpf/cgroup.c | 7 +-
kernel/bpf/preload/Makefile | 5 +-
kernel/irq/msi.c | 44 ++++---
kernel/kprobes.c | 36 ++++--
kernel/trace/fgraph.c | 2 -
kernel/trace/trace_irqsoff.c | 4 +
kernel/trace/trace_kprobe.c | 10 +-
mm/compaction.c | 3 +-
mm/filemap.c | 4 +
mm/huge_memory.c | 37 +++---
mm/hugetlb.c | 48 +++++++-
mm/memblock.c | 49 +-------
net/core/neighbour.c | 7 +-
net/ipv4/ip_tunnel.c | 16 ++-
net/ipv4/udp_offload.c | 69 ++++++++++-
net/ipv6/udp_offload.c | 2 +-
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 +-
net/mac80211/rate.c | 3 +-
net/rxrpc/af_rxrpc.c | 6 +-
net/sunrpc/svcsock.c | 7 +-
scripts/Makefile | 8 +-
137 files changed, 1106 insertions(+), 606 deletions(-)
We park the SQPOLL task before going into io_uring_cancel_files(), so the
task won't run task_works, including those that might be important for
the cancellation passes. In this case it's io_poll_remove_one(), which
frees requests via io_put_req_deferred().
Unpark it while waiting. That is OK because we disable submissions
beforehand, so no new requests will be generated.
INFO: task syz-executor893:8493 blocked for more than 143 seconds.
Call Trace:
context_switch kernel/sched/core.c:4327 [inline]
__schedule+0x90c/0x21a0 kernel/sched/core.c:5078
schedule+0xcf/0x270 kernel/sched/core.c:5157
io_uring_cancel_files fs/io_uring.c:8912 [inline]
io_uring_cancel_task_requests+0xe70/0x11a0 fs/io_uring.c:8979
__io_uring_files_cancel+0x110/0x1b0 fs/io_uring.c:9067
io_uring_files_cancel include/linux/io_uring.h:51 [inline]
do_exit+0x2fe/0x2ae0 kernel/exit.c:780
do_group_exit+0x125/0x310 kernel/exit.c:922
__do_sys_exit_group kernel/exit.c:933 [inline]
__se_sys_exit_group kernel/exit.c:931 [inline]
__x64_sys_exit_group+0x3a/0x50 kernel/exit.c:931
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Cc: stable(a)vger.kernel.org # 5.5+
Reported-by: syzbot+695b03d82fa8e4901b06(a)syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
---
fs/io_uring.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6b73e38aa1a9..1e803a9afc8e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9056,11 +9056,16 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
break;
io_uring_try_cancel_requests(ctx, task, files);
+
+ if (ctx->sq_data)
+ io_sq_thread_unpark(ctx->sq_data);
prepare_to_wait(&task->io_uring->wait, &wait,
TASK_UNINTERRUPTIBLE);
if (inflight == io_uring_count_inflight(ctx, task, files))
schedule();
finish_wait(&task->io_uring->wait, &wait);
+ if (ctx->sq_data)
+ io_sq_thread_park(ctx->sq_data);
}
}
--
2.24.0
Fix kprobe_on_func_entry() to return an error code instead of false so that
register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects kprobe registration to return -ENOENT
when the target symbol is not found, and it checks whether the target
module is unloaded or not. If the target module doesn't exist, it
defers probing the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fails to put the kretprobe event on unloaded
modules. e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL
or 0 (succeeded, given address is on the entry).
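The three-way return convention described above can be sketched in plain
userspace C. This is only an illustration under stated assumptions: the
symbol table, on_func_entry() and register_or_defer() are hypothetical
names, not kernel APIs, and the lookup is a simple string compare rather
than a kallsyms query.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the set of currently loaded symbols. */
static const char *const loaded_syms[] = { "vfs_read", "do_exit" };

/*
 * Mirrors the return convention from the commit message:
 *   0       - symbol found and offset is the function entry
 *   -EINVAL - symbol found, but offset is not the function entry
 *   -ENOENT - symbol lookup failed (e.g. its module is not loaded)
 */
int on_func_entry(const char *sym, unsigned long offset)
{
	size_t i;

	for (i = 0; i < sizeof(loaded_syms) / sizeof(loaded_syms[0]); i++) {
		if (strcmp(loaded_syms[i], sym) == 0)
			return offset == 0 ? 0 : -EINVAL;
	}
	return -ENOENT;
}

/*
 * Hypothetical caller policy: return 1 when registration should be
 * deferred until the module shows up, 0 on success, or a negative
 * errno for a hard failure.
 */
int register_or_defer(const char *sym, unsigned long offset)
{
	int ret = on_func_entry(sym, offset);

	if (ret == -ENOENT)
		return 1;	/* defer: the symbol may appear later */
	return ret;
}
```

With a bool return, the caller could not tell "not the entry" apart from
"symbol not found", so it had no way to choose the deferral path.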
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@de…
Cc: stable(a)vger.kernel.org
Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv(a)arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
include/linux/kprobes.h | 2 +-
kernel/kprobes.c | 34 +++++++++++++++++++++++++---------
kernel/trace/trace_kprobe.c | 4 ++--
3 files changed, 28 insertions(+), 12 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 9f22652d69bb..c28204e22b54 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -245,7 +245,7 @@ extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
extern int arch_populate_kprobe_blacklist(void);
extern bool arch_kprobe_on_func_entry(unsigned long offset);
-extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
+extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
extern bool within_kprobe_blacklist(unsigned long addr);
extern int kprobe_add_ksym_blacklist(unsigned long entry);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 2161f519d481..ebbd4320143d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1921,29 +1921,45 @@ bool __weak arch_kprobe_on_func_entry(unsigned long offset)
return !offset;
}
-bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
+/**
+ * kprobe_on_func_entry() -- check whether given address is function entry
+ * @addr: Target address
+ * @sym: Target symbol name
+ * @offset: The offset from the symbol or the address
+ *
+ * This checks whether the given @addr+@offset or @sym+@offset is on the
+ * function entry address or not.
+ * This returns 0 if it is the function entry, or -EINVAL if it is not.
+ * And also it returns -ENOENT if it fails the symbol or address lookup.
+ * Caller must pass @addr or @sym (either one must be NULL), or this
+ * returns -EINVAL.
+ */
+int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
{
kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
if (IS_ERR(kp_addr))
- return false;
+ return PTR_ERR(kp_addr);
- if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) ||
- !arch_kprobe_on_func_entry(offset))
- return false;
+ if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
+ return -ENOENT;
- return true;
+ if (!arch_kprobe_on_func_entry(offset))
+ return -EINVAL;
+
+ return 0;
}
int register_kretprobe(struct kretprobe *rp)
{
- int ret = 0;
+ int ret;
struct kretprobe_instance *inst;
int i;
void *addr;
- if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset))
- return -EINVAL;
+ ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset);
+ if (ret)
+ return ret;
if (kretprobe_blacklist_size) {
addr = kprobe_addr(&rp->kp);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5c17f70c7f2d..61eff45653f5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -112,9 +112,9 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
- return kprobe_on_func_entry(tk->rp.kp.addr,
+ return (kprobe_on_func_entry(tk->rp.kp.addr,
tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
- tk->rp.kp.addr ? 0 : tk->rp.kp.offset);
+ tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0);
}
bool trace_kprobe_error_injectable(struct trace_event_call *call)
commit 97c753e62e6c31a404183898d950d8c08d752dbd upstream.
Fix kprobe_on_func_entry() to return an error code instead of false so that
register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects kprobe registration to return -ENOENT
when the target symbol is not found, and it checks whether the target
module is unloaded or not. If the target module doesn't exist, it
defers probing the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fails to put the kretprobe event on unloaded
modules. e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL
or 0 (succeeded, given address is on the entry).
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@de…
Cc: stable(a)vger.kernel.org
Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv(a)arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
include/linux/kprobes.h | 2 +-
kernel/kprobes.c | 34 +++++++++++++++++++++++++---------
kernel/trace/trace_kprobe.c | 10 ++++++----
3 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index a60488867dd0..a121fd8e7c3a 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -232,7 +232,7 @@ extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
extern int arch_populate_kprobe_blacklist(void);
extern bool arch_kprobe_on_func_entry(unsigned long offset);
-extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
+extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
extern bool within_kprobe_blacklist(unsigned long addr);
extern int kprobe_add_ksym_blacklist(unsigned long entry);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 283c8b01ce78..8f9fbc74021d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1948,29 +1948,45 @@ bool __weak arch_kprobe_on_func_entry(unsigned long offset)
return !offset;
}
-bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
+/**
+ * kprobe_on_func_entry() -- check whether given address is function entry
+ * @addr: Target address
+ * @sym: Target symbol name
+ * @offset: The offset from the symbol or the address
+ *
+ * This checks whether the given @addr+@offset or @sym+@offset is on the
+ * function entry address or not.
+ * This returns 0 if it is the function entry, or -EINVAL if it is not.
+ * And also it returns -ENOENT if it fails the symbol or address lookup.
+ * Caller must pass @addr or @sym (either one must be NULL), or this
+ * returns -EINVAL.
+ */
+int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
{
kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
if (IS_ERR(kp_addr))
- return false;
+ return PTR_ERR(kp_addr);
- if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) ||
- !arch_kprobe_on_func_entry(offset))
- return false;
+ if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
+ return -ENOENT;
- return true;
+ if (!arch_kprobe_on_func_entry(offset))
+ return -EINVAL;
+
+ return 0;
}
int register_kretprobe(struct kretprobe *rp)
{
- int ret = 0;
+ int ret;
struct kretprobe_instance *inst;
int i;
void *addr;
- if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset))
- return -EINVAL;
+ ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset);
+ if (ret)
+ return ret;
if (kretprobe_blacklist_size) {
addr = kprobe_addr(&rp->kp);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 1074a69beff3..233322c77b76 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -220,9 +220,9 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
- return tk ? kprobe_on_func_entry(tk->rp.kp.addr,
+ return tk ? (kprobe_on_func_entry(tk->rp.kp.addr,
tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
- tk->rp.kp.addr ? 0 : tk->rp.kp.offset) : false;
+ tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0) : false;
}
bool trace_kprobe_error_injectable(struct trace_event_call *call)
@@ -811,9 +811,11 @@ static int trace_kprobe_create(int argc, const char *argv[])
trace_probe_log_err(0, BAD_PROBE_ADDR);
goto parse_error;
}
- if (kprobe_on_func_entry(NULL, symbol, offset))
+ ret = kprobe_on_func_entry(NULL, symbol, offset);
+ if (ret == 0)
flags |= TPARG_FL_FENTRY;
- if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+ /* Defer the ENOENT case until register kprobe */
+ if (ret == -EINVAL && is_return) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
From: Dave Hansen <dave.hansen(a)linux.intel.com>
I went to go add a new RECLAIM_* mode for the zone_reclaim_mode
sysctl. Like a good kernel developer, I also went to go update the
documentation. I noticed that the bits in the documentation didn't
match the bits in the #defines.
The VM never explicitly checks the RECLAIM_ZONE bit. The bit is,
however, implicitly checked when checking 'node_reclaim_mode==0'.
The RECLAIM_ZONE #define was removed in a cleanup. That, by itself
is fine.
But, when the bit was removed (bit 0) the _other_ bit locations also
got changed. That's not OK because the bit values are documented to
mean one specific thing and users surely rely on them meaning that one
thing and not changing from kernel to kernel. The end result is that
if someone had a script that did:
sysctl vm.zone_reclaim_mode=1
This script would have gone from enabling node reclaim for clean
unmapped pages to writing out pages during node reclaim after the
commit in question. That's not great.
Put the bits back the way they were and add a comment so something
like this is a bit harder to do again. Update the documentation to
make it clear that the first bit is ignored.
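As a rough illustration of the ABI shift described above, the sketch
below decodes the same sysctl value under both bit layouts. The
RECLAIM_* values follow the patch. The SHIFTED_* macros and the two
helper functions are hypothetical names introduced only for this
example.

```c
/* Layout documented for vm.zone_reclaim_mode and restored by the patch. */
#define RECLAIM_ZONE	(1 << 0)	/* reclaim clean unmapped pages */
#define RECLAIM_WRITE	(1 << 1)	/* write out pages during reclaim */
#define RECLAIM_UNMAP	(1 << 2)	/* unmap pages during reclaim */

/* Layout after the cleanup removed bit 0 (hypothetical names). */
#define SHIFTED_RECLAIM_WRITE	(1 << 0)
#define SHIFTED_RECLAIM_UNMAP	(1 << 1)

/* Nonzero if @mode requests writeout under the documented layout. */
int wants_write_documented(int mode)
{
	return (mode & RECLAIM_WRITE) != 0;
}

/* Nonzero if @mode requests writeout under the shifted layout. */
int wants_write_shifted(int mode)
{
	return (mode & SHIFTED_RECLAIM_WRITE) != 0;
}
```

For a script that sets the value to 1, wants_write_documented(1) is 0
while wants_write_shifted(1) is 1, which is the behavior change the
patch undoes.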
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Fixes: 648b5cf368e0 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE")
Reviewed-by: Ben Widawsky <ben.widawsky(a)intel.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Daniel Wagner <dwagner(a)suse.de>
Cc: "Tobin C. Harding" <tobin(a)kernel.org>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Daniel Wagner <dwagner(a)suse.de>
Cc: osalvador <osalvador(a)suse.de>
Cc: stable(a)vger.kernel.org
--
Changes from v2:
* Update description to indicate that bit0 was used for clean
unmapped page node reclaim.
---
b/Documentation/admin-guide/sysctl/vm.rst | 10 +++++-----
b/mm/vmscan.c | 9 +++++++--
2 files changed, 12 insertions(+), 7 deletions(-)
diff -puN Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-old-zone_reclaim_mode-abi Documentation/admin-guide/sysctl/vm.rst
--- a/Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-old-zone_reclaim_mode-abi 2021-01-25 16:23:06.048866718 -0800
+++ b/Documentation/admin-guide/sysctl/vm.rst 2021-01-25 16:23:06.056866718 -0800
@@ -978,11 +978,11 @@ that benefit from having their data cach
left disabled as the caching effect is likely to be more important than
data locality.
-zone_reclaim may be enabled if it's known that the workload is partitioned
-such that each partition fits within a NUMA node and that accessing remote
-memory would cause a measurable performance reduction. The page allocator
-will then reclaim easily reusable pages (those page cache pages that are
-currently not used) before allocating off node pages.
+Consider enabling one or more zone_reclaim mode bits if it's known that the
+workload is partitioned such that each partition fits within a NUMA node
+and that accessing remote memory would cause a measurable performance
+reduction. The page allocator will take additional actions before
+allocating off node pages.
Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
diff -puN mm/vmscan.c~mm-vmscan-restore-old-zone_reclaim_mode-abi mm/vmscan.c
--- a/mm/vmscan.c~mm-vmscan-restore-old-zone_reclaim_mode-abi 2021-01-25 16:23:06.052866718 -0800
+++ b/mm/vmscan.c 2021-01-25 16:23:06.057866718 -0800
@@ -4086,8 +4086,13 @@ module_init(kswapd_init)
*/
int node_reclaim_mode __read_mostly;
-#define RECLAIM_WRITE (1<<0) /* Writeout pages during reclaim */
-#define RECLAIM_UNMAP (1<<1) /* Unmap pages during reclaim */
+/*
+ * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
+ * ABI. New bits are OK, but existing bits can never change.
+ */
+#define RECLAIM_ZONE (1<<0) /* Run shrink_inactive_list on the zone */
+#define RECLAIM_WRITE (1<<1) /* Writeout pages during reclaim */
+#define RECLAIM_UNMAP (1<<2) /* Unmap pages during reclaim */
/*
* Priority for NODE_RECLAIM. This determines the fraction of pages
_
This is a note to let you know that I've just added the patch titled
phy: lantiq: rcu-usb2: wait after clock enable
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From 36acd5e24e3000691fb8d1ee31cf959cb1582d35 Mon Sep 17 00:00:00 2001
From: Mathias Kresin <dev(a)kresin.me>
Date: Thu, 7 Jan 2021 23:49:01 +0100
Subject: phy: lantiq: rcu-usb2: wait after clock enable
Commit 65dc2e725286 ("usb: dwc2: Update Core Reset programming flow.")
revealed that the phy isn't ready immediately after enabling its
clocks. The dwc2_check_core_version() fails and the dwc2 usb driver
errors out.
Add a short delay to let the phy get up and running. There isn't any
documentation on how much time is required, so the value was chosen
based on tests.
Signed-off-by: Mathias Kresin <dev(a)kresin.me>
Acked-by: Hauke Mehrtens <hauke(a)hauke-m.de>
Acked-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
Link: https://lore.kernel.org/r/20210107224901.2102479-1-dev@kresin.me
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
---
drivers/phy/lantiq/phy-lantiq-rcu-usb2.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c b/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c
index a7d126192cf1..29d246ea24b4 100644
--- a/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c
+++ b/drivers/phy/lantiq/phy-lantiq-rcu-usb2.c
@@ -124,8 +124,16 @@ static int ltq_rcu_usb2_phy_power_on(struct phy *phy)
reset_control_deassert(priv->phy_reset);
ret = clk_prepare_enable(priv->phy_gate_clk);
- if (ret)
+ if (ret) {
dev_err(dev, "failed to enable PHY gate\n");
+ return ret;
+ }
+
+ /*
+ * at least the xrx200 usb2 phy requires some extra time to be
+ * operational after enabling the clock
+ */
+ usleep_range(100, 200);
return ret;
}
--
2.30.1
ftw() has been obsolete for about 12 years now.
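For readers unfamiliar with the replacement interface, here is a minimal
standalone sketch of an nftw() walk. It is not the perf code:
count_dirs() and count_directories() are hypothetical helpers, and the
descriptor limit of 20 is borrowed from the call in the patch. The
point to note is the callback's fourth argument, struct FTW *, which
ftw() callbacks do not take.

```c
#define _XOPEN_SOURCE 500	/* exposes nftw() and struct FTW in <ftw.h> */
#include <ftw.h>
#include <sys/stat.h>

/* Running count of directories seen by the current walk. */
static int dir_count;

/* nftw() callback: note the extra struct FTW * parameter. */
int count_dirs(const char *fpath, const struct stat *sb,
	       int typeflag, struct FTW *ftwbuf)
{
	(void)fpath;
	(void)sb;
	(void)ftwbuf;

	if (typeflag == FTW_D)
		dir_count++;
	return 0;	/* zero tells nftw() to keep walking */
}

/* Walk @root; return the number of directories visited, or -1 on error. */
int count_directories(const char *root)
{
	dir_count = 0;
	/* At most 20 open descriptors, no special flags. */
	if (nftw(root, count_dirs, 20, 0) < 0)
		return -1;
	return dir_count;
}
```

Calling count_directories(".") counts the starting directory itself plus
every subdirectory beneath it.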
Fixes: bb1c15b60b98 ("perf stat: Support regex pattern in --for-each-cgroup")
CC: stable(a)vger.kernel.org
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
---
Notes:
NOTE: Not runtime-tested, I have no idea what I need to do in perf
to test this. But at least it compiles now with my uClibc-based
toolchain.
tools/perf/util/cgroup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index 5dff7e489921..f24ab4585553 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -161,7 +161,7 @@ void evlist__set_default_cgroup(struct evlist *evlist, struct cgroup *cgroup)
/* helper function for ftw() in match_cgroups and list_cgroups */
static int add_cgroup_name(const char *fpath, const struct stat *sb __maybe_unused,
- int typeflag)
+ int typeflag, struct FTW *ftwbuf __maybe_unused)
{
struct cgroup_name *cn;
@@ -209,12 +209,12 @@ static int list_cgroups(const char *str)
if (!s)
return -1;
/* pretend if it's added by ftw() */
- ret = add_cgroup_name(s, NULL, FTW_D);
+ ret = add_cgroup_name(s, NULL, FTW_D, NULL);
free(s);
if (ret)
return -1;
} else {
- if (add_cgroup_name("", NULL, FTW_D) < 0)
+ if (add_cgroup_name("", NULL, FTW_D, NULL) < 0)
return -1;
}
@@ -247,7 +247,7 @@ static int match_cgroups(const char *str)
prefix_len = strlen(mnt);
/* collect all cgroups in the cgroup_list */
- if (ftw(mnt, add_cgroup_name, 20) < 0)
+ if (nftw(mnt, add_cgroup_name, 20, 0) < 0)
return -1;
for (;;) {
--
2.30.0
I'm announcing the release of the 5.4.97 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 8 -
arch/arm/boot/dts/sun7i-a20-bananapro.dts | 2
arch/arm/mach-footbridge/dc21285.c | 12 -
arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 2
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2
arch/arm64/boot/dts/qcom/sdm850-lenovo-yoga-c630.dts | 10 -
arch/arm64/boot/dts/rockchip/px30.dtsi | 2
arch/um/drivers/virtio_uml.c | 3
arch/x86/Makefile | 3
arch/x86/include/asm/apic.h | 10 -
arch/x86/include/asm/barrier.h | 18 ++
arch/x86/kernel/apic/apic.c | 4
arch/x86/kernel/apic/x2apic_cluster.c | 6
arch/x86/kernel/apic/x2apic_phys.c | 9 -
arch/x86/kvm/emulate.c | 2
arch/x86/kvm/svm.c | 5
arch/x86/mm/mem_encrypt.c | 1
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2
drivers/input/joystick/xpad.c | 17 ++
drivers/input/serio/i8042-x86ia64io.h | 2
drivers/iommu/intel-iommu.c | 6
drivers/md/md.c | 2
drivers/mmc/core/sdio_cis.c | 6
drivers/net/dsa/mv88e6xxx/chip.c | 6
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 13 -
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h | 1
drivers/net/ethernet/intel/igc/igc_ethtool.c | 3
drivers/net/ethernet/intel/igc/igc_i225.c | 3
drivers/net/ethernet/intel/igc/igc_mac.c | 2
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 -
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 5
drivers/net/ethernet/realtek/r8169_main.c | 4
drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 9 +
drivers/nvdimm/dimm_devs.c | 18 ++
drivers/nvme/host/pci.c | 2
drivers/nvme/target/tcp.c | 3
drivers/usb/class/usblp.c | 19 +-
drivers/usb/dwc2/gadget.c | 8 -
drivers/usb/dwc3/core.c | 2
drivers/usb/gadget/legacy/ether.c | 4
drivers/usb/host/xhci-mtk-sch.c | 130 +++++++++++++------
drivers/usb/host/xhci-mtk.c | 2
drivers/usb/host/xhci-mtk.h | 15 ++
drivers/usb/host/xhci-mvebu.c | 42 ++++++
drivers/usb/host/xhci-mvebu.h | 6
drivers/usb/host/xhci-plat.c | 26 +++
drivers/usb/host/xhci-plat.h | 1
drivers/usb/host/xhci-ring.c | 31 ++--
drivers/usb/host/xhci.c | 8 -
drivers/usb/host/xhci.h | 5
drivers/usb/renesas_usbhs/fifo.c | 1
drivers/usb/serial/cp210x.c | 2
drivers/usb/serial/option.c | 6
fs/afs/main.c | 6
fs/cifs/dir.c | 22 ++-
fs/cifs/smb2pdu.h | 2
fs/cifs/transport.c | 18 ++
fs/hugetlbfs/inode.c | 3
fs/overlayfs/dir.c | 2
include/linux/hugetlb.h | 2
include/linux/msi.h | 6
include/net/sch_generic.h | 2
init/init_task.c | 3
kernel/bpf/cgroup.c | 7 -
kernel/irq/msi.c | 44 ++----
kernel/kprobes.c | 4
kernel/trace/fgraph.c | 2
mm/compaction.c | 3
mm/huge_memory.c | 37 +++--
mm/hugetlb.c | 48 ++++++-
mm/memblock.c | 49 -------
net/core/neighbour.c | 7 -
net/ipv4/ip_tunnel.c | 16 +-
net/lapb/lapb_out.c | 3
net/mac80211/driver-ops.c | 5
net/mac80211/rate.c | 3
net/rxrpc/af_rxrpc.c | 6
77 files changed, 557 insertions(+), 264 deletions(-)
Aleksandr Loktionov (1):
i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"
Alexander Ovechkin (1):
net: sched: replaced invalid qdisc tree flush helper in qdisc_replace
Alexey Dobriyan (1):
Input: i8042 - unbreak Pegatron C15B
Aurelien Aptel (1):
cifs: report error instead of invalid when revalidating a dentry fails
Benjamin Valentin (1):
Input: xpad - sync supported devices with fork on GitHub
Chenxin Jin (1):
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Chinmay Agarwal (1):
neighbour: Prevent a dead entry from updating gc_list
Christoph Schemmel (1):
USB: serial: option: Adding support for Cinterion MV31
Chunfeng Yun (2):
usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints
usb: xhci-mtk: break loop when find the endpoint to drop
DENG Qingfang (1):
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Dan Carpenter (1):
USB: gadget: legacy: fix an error code in eth_bind()
Dan Williams (1):
libnvdimm/dimm: Avoid race between probe and available_slots_show()
Dave Hansen (1):
x86/apic: Add extra serialization for non-serializing MSRs
David Howells (1):
rxrpc: Fix deadlock around release of dst cached on udp tunnel
Felix Fietkau (1):
mac80211: fix station rate table updates on assoc
Fengnan Chang (1):
mmc: core: Limit retries when analyse of SDIO tuples fails
Gary Bisson (1):
usb: dwc3: fix clock issue during resume in OTG mode
Greg Kroah-Hartman (1):
Linux 5.4.97
Gustavo A. R. Silva (1):
smb3: Fix out-of-bounds bug in SMB2_negotiate()
Heiko Stuebner (1):
usb: dwc2: Fix endpoint direction check in ep_from_windex
Heiner Kallweit (1):
r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
Hermann Lauer (1):
ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode
Hugh Dickins (1):
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Ikjoon Jang (1):
usb: xhci-mtk: fix unreleased bandwidth data
Jeremy Figgins (1):
USB: usblp: don't call usb_set_interface if there's a single alt
Johannes Berg (1):
um: virtio: free vu_dev only with the contained struct device
Josh Poimboeuf (1):
x86/build: Disable CET instrumentation in the kernel
Kai-Heng Feng (1):
igc: Report speed and duplex as unknown when device is runtime suspended
Kevin Lo (2):
igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr
igc: check return value of ret_val in igc_config_fc_after_link_up
Liangyan (1):
ovl: fix dentry leak in ovl_get_redirect
Loris Reiff (2):
bpf, cgroup: Fix optlen WARN_ON_ONCE toctou
bpf, cgroup: Fix problematic bounds check
Luca Coelho (1):
iwlwifi: mvm: don't send RFH_QUEUE_CONFIG_CMD with no queues
Maor Gottlieb (1):
net/mlx5: Fix leak upon failure of rule creation
Marc Zyngier (1):
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
Mathias Nyman (1):
xhci: fix bounce buffer usage for non-sg list case
Muchun Song (4):
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
mm: hugetlb: fix a race between freeing and dissolving the page
mm: hugetlb: fix a race between isolating and freeing page
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Nadav Amit (1):
iommu/vt-d: Do not use flush-queue when caching-mode is on
Pali Rohár (1):
usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
Pavel Shilovsky (1):
smb3: fix crediting for compounding when only one request in flight
Peter Chen (1):
usb: host: xhci-plat: add priv quirk for skip PHY initialization
Pho Tran (1):
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Rokudo Yan (1):
mm, compaction: move high_pfn to the for loop scope
Roman Gushchin (1):
memblock: do not start bottom-up allocations with kernel_end
Russell King (1):
ARM: footbridge: fix dc21285 PCI configuration accessors
Sagi Grimberg (1):
nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
Sandy Huang (1):
arm64: dts: rockchip: fix vopl iommu irq on px30
Sean Christopherson (2):
KVM: SVM: Treat SVM as unsupported when running as an SEV guest
KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode
Serge Semin (1):
arm64: dts: amlogic: meson-g12: Set FL-adj property value
Shawn Guo (1):
arm64: dts: qcom: c630: keep both touchpad devices enabled
Stefan Chulski (1):
net: mvpp2: TCAM entry enable should be written after SRAM data
Steven Rostedt (VMware) (1):
fgraph: Initialize tracing_graph_pause at task creation
Stylon Wang (1):
drm/amd/display: Revert "Fix EDID parsing after resume from suspend"
Thorsten Leemhuis (1):
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Vadim Fedorenko (1):
net: ip_tunnel: fix mtu calculation
Wang ShaoBo (1):
kretprobe: Avoid re-registration of the same kretprobe earlier
Xiao Ni (1):
md: Set prev_flush_start and flush_bio in an atomic way
Xie He (1):
net: lapb: Copy the skb before sending a packet
Yoshihiro Shimoda (1):
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
Zyta Szpak (1):
arm64: dts: ls1046a: fix dcfg address range
I'm announcing the release of the 4.19.175 kernel.
All users of the 4.19 kernel series must upgrade.
The updated 4.19.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.19.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 8 ----
arch/arm/mach-footbridge/dc21285.c | 12 +++---
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 -
arch/x86/Makefile | 3 +
arch/x86/include/asm/apic.h | 10 -----
arch/x86/include/asm/barrier.h | 18 +++++++++
arch/x86/kernel/apic/apic.c | 4 ++
arch/x86/kernel/apic/x2apic_cluster.c | 6 ++-
arch/x86/kernel/apic/x2apic_phys.c | 6 ++-
arch/x86/kvm/svm.c | 5 ++
drivers/input/joystick/xpad.c | 17 ++++++++
drivers/input/serio/i8042-x86ia64io.h | 2 +
drivers/iommu/intel-iommu.c | 6 +++
drivers/md/md.c | 2 +
drivers/mmc/core/sdio_cis.c | 6 +++
drivers/net/dsa/mv88e6xxx/chip.c | 6 ++-
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 ++---
drivers/nvme/host/pci.c | 2 +
drivers/usb/class/usblp.c | 19 +++++----
drivers/usb/dwc2/gadget.c | 8 ----
drivers/usb/dwc3/core.c | 2 -
drivers/usb/gadget/legacy/ether.c | 4 +-
drivers/usb/host/xhci-ring.c | 31 ++++++++++-----
drivers/usb/renesas_usbhs/fifo.c | 1
drivers/usb/serial/cp210x.c | 2 +
drivers/usb/serial/option.c | 6 +++
fs/afs/main.c | 6 +--
fs/cifs/dir.c | 22 ++++++++++-
fs/cifs/smb2pdu.h | 2 -
fs/hugetlbfs/inode.c | 3 +
fs/overlayfs/dir.c | 2 -
include/linux/elfcore.h | 22 +++++++++++
include/linux/hugetlb.h | 3 +
include/linux/msi.h | 6 +++
kernel/Makefile | 1
kernel/elfcore.c | 26 -------------
kernel/irq/msi.c | 44 ++++++++++------------
kernel/kprobes.c | 4 ++
mm/huge_memory.c | 37 +++++++++++-------
mm/hugetlb.c | 48 +++++++++++++++++++++---
mm/memblock.c | 49 +++----------------------
net/ipv4/ip_tunnel.c | 16 +++-----
net/lapb/lapb_out.c | 3 +
net/mac80211/driver-ops.c | 5 ++
net/mac80211/rate.c | 3 +
net/rxrpc/af_rxrpc.c | 6 +--
46 files changed, 307 insertions(+), 199 deletions(-)
Alexey Dobriyan (1):
Input: i8042 - unbreak Pegatron C15B
Arnd Bergmann (1):
elfcore: fix building with clang
Aurelien Aptel (1):
cifs: report error instead of invalid when revalidating a dentry fails
Benjamin Valentin (1):
Input: xpad - sync supported devices with fork on GitHub
Chenxin Jin (1):
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Christoph Schemmel (1):
USB: serial: option: Adding support for Cinterion MV31
DENG Qingfang (1):
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Dan Carpenter (1):
USB: gadget: legacy: fix an error code in eth_bind()
Dave Hansen (1):
x86/apic: Add extra serialization for non-serializing MSRs
David Howells (1):
rxrpc: Fix deadlock around release of dst cached on udp tunnel
Felix Fietkau (1):
mac80211: fix station rate table updates on assoc
Fengnan Chang (1):
mmc: core: Limit retries when analyse of SDIO tuples fails
Gary Bisson (1):
usb: dwc3: fix clock issue during resume in OTG mode
Greg Kroah-Hartman (1):
Linux 4.19.175
Gustavo A. R. Silva (1):
smb3: Fix out-of-bounds bug in SMB2_negotiate()
Heiko Stuebner (1):
usb: dwc2: Fix endpoint direction check in ep_from_windex
Hugh Dickins (1):
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Jeremy Figgins (1):
USB: usblp: don't call usb_set_interface if there's a single alt
Josh Poimboeuf (1):
x86/build: Disable CET instrumentation in the kernel
Liangyan (1):
ovl: fix dentry leak in ovl_get_redirect
Marc Zyngier (1):
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
Mathias Nyman (1):
xhci: fix bounce buffer usage for non-sg list case
Muchun Song (4):
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
mm: hugetlb: fix a race between freeing and dissolving the page
mm: hugetlb: fix a race between isolating and freeing page
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Nadav Amit (1):
iommu/vt-d: Do not use flush-queue when caching-mode is on
Pho Tran (1):
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Roman Gushchin (1):
memblock: do not start bottom-up allocations with kernel_end
Russell King (1):
ARM: footbridge: fix dc21285 PCI configuration accessors
Sean Christopherson (1):
KVM: SVM: Treat SVM as unsupported when running as an SEV guest
Stefan Chulski (1):
net: mvpp2: TCAM entry enable should be written after SRAM data
Thorsten Leemhuis (1):
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Vadim Fedorenko (1):
net: ip_tunnel: fix mtu calculation
Wang ShaoBo (1):
kretprobe: Avoid re-registration of the same kretprobe earlier
Xiao Ni (1):
md: Set prev_flush_start and flush_bio in an atomic way
Xie He (1):
net: lapb: Copy the skb before sending a packet
Yoshihiro Shimoda (1):
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
Zyta Szpak (1):
arm64: dts: ls1046a: fix dcfg address range
When starting an iomap write, gfs2_quota_lock_check -> gfs2_quota_lock
-> gfs2_quota_hold is called from gfs2_iomap_begin. At the end of the
write, before unlocking the quotas, punch_hole -> gfs2_quota_hold can be
called again in gfs2_iomap_end, which is incorrect and leads to a failed
assertion. Instead, move the call to gfs2_quota_unlock before the call
to punch_hole to fix that.
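The ordering problem can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical and not part of gfs2, and the model assumes that taking a quota hold while one is already active is what trips the assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the ordering; not the gfs2 API. */
struct model_inode {
	bool quota_held;	/* true between quota hold and quota unlock */
};

/* Assumption: a second hold while one is active is the failed assertion. */
bool model_quota_hold(struct model_inode *ip)
{
	if (ip->quota_held)
		return false;
	ip->quota_held = true;
	return true;
}

void model_quota_unlock(struct model_inode *ip)
{
	ip->quota_held = false;
}

/* Stands in for punch_hole: takes and drops its own quota hold. */
bool model_punch_hole(struct model_inode *ip)
{
	if (!model_quota_hold(ip))
		return false;
	model_quota_unlock(ip);
	return true;
}

/* Stands in for the start of an iomap write: takes the quota hold. */
bool model_iomap_begin(struct model_inode *ip)
{
	return model_quota_hold(ip);
}

/*
 * Stands in for the end of an iomap write. With unlock_first set, the
 * quota is released before punch_hole runs (the order after this patch);
 * otherwise it is released afterwards (the order before this patch).
 */
bool model_iomap_end(struct model_inode *ip, bool short_write,
		     bool unlock_first)
{
	bool ok = true;

	if (unlock_first)
		model_quota_unlock(ip);
	if (short_write)
		ok = model_punch_hole(ip);
	if (!unlock_first)
		model_quota_unlock(ip);
	return ok;
}

/* Usage: the old order fails on a short write, the patched order does not. */
void model_demo(void)
{
	struct model_inode ip = { .quota_held = false };
	bool began, ended;

	began = model_iomap_begin(&ip);
	ended = model_iomap_end(&ip, true, false);	/* old order */
	assert(began && !ended);

	began = model_iomap_begin(&ip);
	ended = model_iomap_end(&ip, true, true);	/* patched order */
	assert(began && ended);
}
```

In the model, only a short write reaches the punch_hole stand-in, so the old order fails only on that path, while the patched order passes in both cases.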
Fixes: 64bc06bb32ee ("gfs2: iomap buffered write support")
Cc: stable(a)vger.kernel.org # v4.19+
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
---
fs/gfs2/bmap.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index cf6ccdd00587..7a358ae05185 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1230,6 +1230,9 @@ static int gfs2_iomap_end(struct inode *inode, loff_t pos, loff_t length,
gfs2_inplace_release(ip);
+ if (ip->i_qadata && ip->i_qadata->qa_qd_num)
+ gfs2_quota_unlock(ip);
+
if (length != written && (iomap->flags & IOMAP_F_NEW)) {
/* Deallocate blocks that were just allocated. */
loff_t blockmask = i_blocksize(inode) - 1;
@@ -1242,9 +1245,6 @@ static int gfs2_iomap_end(struct inode *inode, loff_t pos, loff_t length,
}
}
- if (ip->i_qadata && ip->i_qadata->qa_qd_num)
- gfs2_quota_unlock(ip);
-
if (unlikely(!written))
goto out_unlock;
--
2.26.2
I'm announcing the release of the 4.9.257 kernel.
All users of the 4.9 kernel series must upgrade.
The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 10 -
arch/arm/mach-footbridge/dc21285.c | 12 -
arch/x86/Makefile | 3
arch/x86/include/asm/apic.h | 10 -
arch/x86/include/asm/barrier.h | 18 ++
arch/x86/kernel/apic/apic.c | 4
arch/x86/kernel/apic/x2apic_cluster.c | 6
arch/x86/kernel/apic/x2apic_phys.c | 6
drivers/acpi/thermal.c | 55 ++++--
drivers/input/joystick/xpad.c | 17 +-
drivers/input/serio/i8042-x86ia64io.h | 2
drivers/iommu/intel-iommu.c | 6
drivers/mmc/core/sdio_cis.c | 6
drivers/net/dsa/bcm_sf2.c | 8
drivers/net/ethernet/ibm/ibmvnic.c | 6
drivers/scsi/ibmvscsi/ibmvfc.c | 4
drivers/scsi/libfc/fc_exch.c | 16 +
drivers/usb/class/usblp.c | 19 +-
drivers/usb/dwc2/gadget.c | 8
drivers/usb/gadget/legacy/ether.c | 4
drivers/usb/host/xhci-ring.c | 31 ++-
drivers/usb/serial/cp210x.c | 2
drivers/usb/serial/option.c | 6
fs/cifs/dir.c | 22 ++
fs/hugetlbfs/inode.c | 3
include/linux/elfcore.h | 22 ++
include/linux/hugetlb.h | 3
kernel/Makefile | 1
kernel/elfcore.c | 25 ---
kernel/futex.c | 276 +++++++++++++++++++---------------
kernel/kprobes.c | 4
kernel/locking/rtmutex-debug.c | 9 -
kernel/locking/rtmutex-debug.h | 3
kernel/locking/rtmutex.c | 127 +++++++++------
kernel/locking/rtmutex.h | 2
kernel/locking/rtmutex_common.h | 12 -
mm/huge_memory.c | 37 ++--
mm/hugetlb.c | 9 -
net/lapb/lapb_out.c | 3
net/mac80211/driver-ops.c | 5
net/mac80211/rate.c | 3
net/mac80211/rx.c | 2
net/sched/sch_api.c | 3
sound/pci/hda/patch_realtek.c | 2
tools/objtool/elf.c | 7
45 files changed, 520 insertions(+), 319 deletions(-)
Alexey Dobriyan (1):
Input: i8042 - unbreak Pegatron C15B
Arnd Bergmann (1):
elfcore: fix building with clang
Aurelien Aptel (1):
cifs: report error instead of invalid when revalidating a dentry fails
Benjamin Valentin (1):
Input: xpad - sync supported devices with fork on GitHub
Brian King (1):
scsi: ibmvfc: Set default timeout to avoid crash during migration
Chenxin Jin (1):
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Christoph Schemmel (1):
USB: serial: option: Adding support for Cinterion MV31
Dan Carpenter (1):
USB: gadget: legacy: fix an error code in eth_bind()
Dave Hansen (1):
x86/apic: Add extra serialization for non-serializing MSRs
Eric Dumazet (1):
net_sched: reject silly cell_log in qdisc_get_rtab()
Felix Fietkau (2):
mac80211: fix fast-rx encryption check
mac80211: fix station rate table updates on assoc
Fengnan Chang (1):
mmc: core: Limit retries when analyse of SDIO tuples fails
Greg Kroah-Hartman (1):
Linux 4.9.257
Heiko Stuebner (1):
usb: dwc2: Fix endpoint direction check in ep_from_windex
Hugh Dickins (1):
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Javed Hasan (1):
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Jeremy Figgins (1):
USB: usblp: don't call usb_set_interface if there's a single alt
Josh Poimboeuf (2):
objtool: Don't fail on missing symbol table
x86/build: Disable CET instrumentation in the kernel
Lijun Pan (1):
ibmvnic: Ensure that CRQ entry read are correctly ordered
Mathias Nyman (1):
xhci: fix bounce buffer usage for non-sg list case
Muchun Song (3):
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
mm: hugetlb: fix a race between isolating and freeing page
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Nadav Amit (1):
iommu/vt-d: Do not use flush-queue when caching-mode is on
Pan Bian (1):
net: dsa: bcm_sf2: put device node before return
Peter Zijlstra (4):
futex,rt_mutex: Provide futex specific rt_mutex API
futex: Remove rt_mutex_deadlock_account_*()
futex: Rework inconsistent rt_mutex/futex_q state
futex: Avoid violating the 10th rule of futex
Pho Tran (1):
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Rafael J. Wysocki (1):
ACPI: thermal: Do not call acpi_thermal_check() directly
Russell King (1):
ARM: footbridge: fix dc21285 PCI configuration accessors
Sasha Levin (1):
stable: clamp SUBLEVEL in 4.4 and 4.9
Shih-Yuan Lee (FourDollars) (1):
ALSA: hda/realtek - Fix typo of pincfg for Dell quirk
Thomas Gleixner (6):
futex: Replace pointless printk in fixup_owner()
futex: Provide and use pi_state_update_owner()
rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
futex: Use pi_state_update_owner() in put_pi_state()
futex: Simplify fixup_pi_state_owner()
futex: Handle faults correctly for PI futexes
Wang ShaoBo (1):
kretprobe: Avoid re-registration of the same kretprobe earlier
Xie He (1):
net: lapb: Copy the skb before sending a packet
I'm announcing the release of the 4.4.257 kernel.
All users of the 4.4 kernel series must upgrade.
The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 10 -
arch/arm/mach-footbridge/dc21285.c | 12 -
arch/mips/Kconfig | 1
arch/x86/Makefile | 3
arch/x86/include/asm/apic.h | 10 -
arch/x86/include/asm/barrier.h | 18 ++
arch/x86/kernel/apic/apic.c | 4
arch/x86/kernel/apic/x2apic_cluster.c | 3
arch/x86/kernel/apic/x2apic_phys.c | 3
drivers/acpi/thermal.c | 54 ++++--
drivers/input/joystick/xpad.c | 17 +-
drivers/input/serio/i8042-x86ia64io.h | 2
drivers/mmc/core/sdio_cis.c | 6
drivers/scsi/ibmvscsi/ibmvfc.c | 4
drivers/scsi/libfc/fc_exch.c | 16 +
drivers/usb/class/usblp.c | 19 +-
drivers/usb/dwc2/gadget.c | 8
drivers/usb/gadget/legacy/ether.c | 4
drivers/usb/gadget/udc/udc-core.c | 13 +
drivers/usb/serial/cp210x.c | 2
drivers/usb/serial/option.c | 6
fs/Kconfig.binfmt | 8
fs/cifs/dir.c | 22 ++
fs/hugetlbfs/inode.c | 3
include/linux/elfcore.h | 22 ++
include/linux/hugetlb.h | 3
kernel/Makefile | 3
kernel/elfcore.c | 25 ---
kernel/futex.c | 278 +++++++++++++++++++---------------
kernel/kprobes.c | 4
kernel/locking/rtmutex-debug.c | 9 -
kernel/locking/rtmutex-debug.h | 3
kernel/locking/rtmutex.c | 127 +++++++++------
kernel/locking/rtmutex.h | 2
kernel/locking/rtmutex_common.h | 12 -
mm/hugetlb.c | 9 -
net/lapb/lapb_out.c | 3
net/mac80211/driver-ops.c | 5
net/mac80211/rate.c | 3
net/sched/sch_api.c | 3
sound/pci/hda/patch_realtek.c | 2
41 files changed, 468 insertions(+), 293 deletions(-)
Alexey Dobriyan (1):
Input: i8042 - unbreak Pegatron C15B
Arnd Bergmann (1):
elfcore: fix building with clang
Aurelien Aptel (1):
cifs: report error instead of invalid when revalidating a dentry fails
Benjamin Valentin (1):
Input: xpad - sync supported devices with fork on GitHub
Brian King (1):
scsi: ibmvfc: Set default timeout to avoid crash during migration
Chenxin Jin (1):
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Christoph Schemmel (1):
USB: serial: option: Adding support for Cinterion MV31
Dan Carpenter (1):
USB: gadget: legacy: fix an error code in eth_bind()
Dave Hansen (1):
x86/apic: Add extra serialization for non-serializing MSRs
Eric Dumazet (1):
net_sched: reject silly cell_log in qdisc_get_rtab()
Felix Fietkau (1):
mac80211: fix station rate table updates on assoc
Fengnan Chang (1):
mmc: core: Limit retries when analyse of SDIO tuples fails
Greg Kroah-Hartman (1):
Linux 4.4.257
Heiko Stuebner (1):
usb: dwc2: Fix endpoint direction check in ep_from_windex
Javed Hasan (1):
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Jeremy Figgins (1):
USB: usblp: don't call usb_set_interface if there's a single alt
Josh Poimboeuf (1):
x86/build: Disable CET instrumentation in the kernel
Lee Jones (10):
futex,rt_mutex: Provide futex specific rt_mutex API
futex: Remove rt_mutex_deadlock_account_*()
futex: Rework inconsistent rt_mutex/futex_q state
futex: Avoid violating the 10th rule of futex
futex: Replace pointless printk in fixup_owner()
futex: Provide and use pi_state_update_owner()
rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
futex: Use pi_state_update_owner() in put_pi_state()
futex: Simplify fixup_pi_state_owner()
futex: Handle faults correctly for PI futexes
Muchun Song (3):
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
mm: hugetlb: fix a race between isolating and freeing page
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Pho Tran (1):
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Rafael J. Wysocki (1):
ACPI: thermal: Do not call acpi_thermal_check() directly
Ralf Baechle (1):
ELF/MIPS build fix
Russell King (1):
ARM: footbridge: fix dc21285 PCI configuration accessors
Sasha Levin (1):
stable: clamp SUBLEVEL in 4.4 and 4.9
Shih-Yuan Lee (FourDollars) (1):
ALSA: hda/realtek - Fix typo of pincfg for Dell quirk
Thinh Nguyen (1):
usb: udc: core: Use lock when write to soft_connect
Wang ShaoBo (1):
kretprobe: Avoid re-registration of the same kretprobe earlier
Xie He (1):
net: lapb: Copy the skb before sending a packet
This is the start of the stable review cycle for the 5.4.97 release.
There are 65 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.97-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.97-rc1
Pali Rohár <pali(a)kernel.org>
usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
Alexander Ovechkin <ovov(a)yandex-team.ru>
net: sched: replaced invalid qdisc tree flush helper in qdisc_replace
DENG Qingfang <dqfext(a)gmail.com>
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Vadim Fedorenko <vfedorenko(a)novek.ru>
net: ip_tunnel: fix mtu calculation
Chinmay Agarwal <chinagar(a)codeaurora.org>
neighbour: Prevent a dead entry from updating gc_list
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
igc: Report speed and duplex as unknown when device is runtime suspended
Xiao Ni <xni(a)redhat.com>
md: Set prev_flush_start and flush_bio in an atomic way
Nadav Amit <namit(a)vmware.com>
iommu/vt-d: Do not use flush-queue when caching-mode is on
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
Luca Coelho <luciano.coelho(a)intel.com>
iwlwifi: mvm: don't send RFH_QUEUE_CONFIG_CMD with no queues
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Hugh Dickins <hughd(a)google.com>
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Rokudo Yan <wu-yan(a)tcl.com>
mm, compaction: move high_pfn to the for loop scope
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between freeing and dissolving the page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
Sean Christopherson <seanjc(a)google.com>
KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode
Sean Christopherson <seanjc(a)google.com>
KVM: SVM: Treat SVM as unsupported when running as an SEV guest
Thorsten Leemhuis <linux(a)leemhuis.info>
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Stylon Wang <stylon.wang(a)amd.com>
drm/amd/display: Revert "Fix EDID parsing after resume from suspend"
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Pavel Shilovsky <pshilov(a)microsoft.com>
smb3: fix crediting for compounding when only one request in flight
Gustavo A. R. Silva <gustavoars(a)kernel.org>
smb3: Fix out-of-bounds bug in SMB2_negotiate()
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: fix bounce buffer usage for non-sg list case
Marc Zyngier <maz(a)kernel.org>
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
Dan Williams <dan.j.williams(a)intel.com>
libnvdimm/dimm: Avoid race between probe and available_slots_show()
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
fgraph: Initialize tracing_graph_pause at task creation
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Liangyan <liangyan.peng(a)linux.alibaba.com>
ovl: fix dentry leak in ovl_get_redirect
Peter Chen <peter.chen(a)nxp.com>
usb: host: xhci-plat: add priv quirk for skip PHY initialization
Chunfeng Yun <chunfeng.yun(a)mediatek.com>
usb: xhci-mtk: break loop when find the endpoint to drop
Chunfeng Yun <chunfeng.yun(a)mediatek.com>
usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints
Ikjoon Jang <ikjn(a)chromium.org>
usb: xhci-mtk: fix unreleased bandwidth data
Gary Bisson <gary.bisson(a)boundarydevices.com>
usb: dwc3: fix clock issue during resume in OTG mode
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Roman Gushchin <guro(a)fb.com>
memblock: do not start bottom-up allocations with kernel_end
Sagi Grimberg <sagi(a)grimberg.me>
nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
Hermann Lauer <Hermann.Lauer(a)uni-heidelberg.de>
ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
Stefan Chulski <stefanc(a)marvell.com>
net: mvpp2: TCAM entry enable should be written after SRAM data
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Maor Gottlieb <maorg(a)nvidia.com>
net/mlx5: Fix leak upon failure of rule creation
Aleksandr Loktionov <aleksandr.loktionov(a)intel.com>
i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"
Kevin Lo <kevlo(a)kevlo.org>
igc: check return value of ret_val in igc_config_fc_after_link_up
Kevin Lo <kevlo(a)kevlo.org>
igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr
Zyta Szpak <zr(a)semihalf.com>
arm64: dts: ls1046a: fix dcfg address range
David Howells <dhowells(a)redhat.com>
rxrpc: Fix deadlock around release of dst cached on udp tunnel
Johannes Berg <johannes.berg(a)intel.com>
um: virtio: free vu_dev only with the contained struct device
Loris Reiff <loris.reiff(a)liblor.ch>
bpf, cgroup: Fix problematic bounds check
Loris Reiff <loris.reiff(a)liblor.ch>
bpf, cgroup: Fix optlen WARN_ON_ONCE toctou
Sandy Huang <hjc(a)rock-chips.com>
arm64: dts: rockchip: fix vopl iommu irq on px30
Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
arm64: dts: amlogic: meson-g12: Set FL-adj property value
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Shawn Guo <shawn.guo(a)linaro.org>
arm64: dts: qcom: c630: keep both touchpad devices enabled
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
-------------
Diffstat:
Makefile | 10 +-
arch/arm/boot/dts/sun7i-a20-bananapro.dts | 2 +-
arch/arm/mach-footbridge/dc21285.c | 12 +-
arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 2 +-
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +-
.../boot/dts/qcom/sdm850-lenovo-yoga-c630.dts | 10 +-
arch/arm64/boot/dts/rockchip/px30.dtsi | 2 +-
arch/um/drivers/virtio_uml.c | 3 +-
arch/x86/Makefile | 3 +
arch/x86/include/asm/apic.h | 10 --
arch/x86/include/asm/barrier.h | 18 +++
arch/x86/kernel/apic/apic.c | 4 +
arch/x86/kernel/apic/x2apic_cluster.c | 6 +-
arch/x86/kernel/apic/x2apic_phys.c | 9 +-
arch/x86/kvm/emulate.c | 2 +
arch/x86/kvm/svm.c | 5 +
arch/x86/mm/mem_encrypt.c | 1 +
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 -
drivers/input/joystick/xpad.c | 17 ++-
drivers/input/serio/i8042-x86ia64io.h | 2 +
drivers/iommu/intel-iommu.c | 6 +
drivers/md/md.c | 2 +
drivers/mmc/core/sdio_cis.c | 6 +
drivers/net/dsa/mv88e6xxx/chip.c | 6 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 13 +--
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h | 1 -
drivers/net/ethernet/intel/igc/igc_ethtool.c | 3 +-
drivers/net/ethernet/intel/igc/igc_i225.c | 3 +-
drivers/net/ethernet/intel/igc/igc_mac.c | 2 +-
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 5 +
drivers/net/ethernet/realtek/r8169_main.c | 4 +-
drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 9 +-
drivers/nvdimm/dimm_devs.c | 18 ++-
drivers/nvme/host/pci.c | 2 +
drivers/nvme/target/tcp.c | 3 +-
drivers/usb/class/usblp.c | 19 +--
drivers/usb/dwc2/gadget.c | 8 +-
drivers/usb/dwc3/core.c | 2 +-
drivers/usb/gadget/legacy/ether.c | 4 +-
drivers/usb/host/xhci-mtk-sch.c | 130 +++++++++++++++------
drivers/usb/host/xhci-mtk.c | 2 +
drivers/usb/host/xhci-mtk.h | 15 +++
drivers/usb/host/xhci-mvebu.c | 42 +++++++
drivers/usb/host/xhci-mvebu.h | 6 +
drivers/usb/host/xhci-plat.c | 26 ++++-
drivers/usb/host/xhci-plat.h | 1 +
drivers/usb/host/xhci-ring.c | 31 +++--
drivers/usb/host/xhci.c | 8 +-
drivers/usb/host/xhci.h | 5 +
drivers/usb/renesas_usbhs/fifo.c | 1 +
drivers/usb/serial/cp210x.c | 2 +
drivers/usb/serial/option.c | 6 +
fs/afs/main.c | 6 +-
fs/cifs/dir.c | 22 +++-
fs/cifs/smb2pdu.h | 2 +-
fs/cifs/transport.c | 18 ++-
fs/hugetlbfs/inode.c | 3 +-
fs/overlayfs/dir.c | 2 +-
include/linux/hugetlb.h | 2 +
include/linux/msi.h | 6 +
include/net/sch_generic.h | 2 +-
init/init_task.c | 3 +-
kernel/bpf/cgroup.c | 7 +-
kernel/irq/msi.c | 44 ++++---
kernel/kprobes.c | 4 +
kernel/trace/fgraph.c | 2 -
mm/compaction.c | 3 +-
mm/huge_memory.c | 37 +++---
mm/hugetlb.c | 48 +++++++-
mm/memblock.c | 49 +-------
net/core/neighbour.c | 7 +-
net/ipv4/ip_tunnel.c | 16 ++-
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 +-
net/mac80211/rate.c | 3 +-
net/rxrpc/af_rxrpc.c | 6 +-
77 files changed, 558 insertions(+), 265 deletions(-)
From: Kai Krakow <kai(a)kaishome.de>
This is potentially long running and not latency sensitive, let's get
it out of the way of other latency sensitive events.
As observed in the previous commit, `system_wq` easily becomes
congested by bcache, and this fixes a few more stalls I was observing
every once in a while.
Let's not make this `WQ_MEM_RECLAIM`, as that was shown to reduce the
performance of boot and file system operations in my tests. Also, without
`WQ_MEM_RECLAIM`, I no longer see desktop stalls. This matches the
previous behavior, as `system_wq` also does no memory reclaim:
> // workqueue.c:
> system_wq = alloc_workqueue("events", 0, 0);
Cc: Coly Li <colyli(a)suse.de>
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Kai Krakow <kai(a)kaishome.de>
Signed-off-by: Coly Li <colyli(a)suse.de>
---
drivers/md/bcache/bcache.h | 1 +
drivers/md/bcache/journal.c | 4 ++--
drivers/md/bcache/super.c | 16 ++++++++++++++++
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 2b8c7dd2cfae..848dd4db1659 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -1005,6 +1005,7 @@ void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
extern struct workqueue_struct *bcache_wq;
extern struct workqueue_struct *bch_journal_wq;
+extern struct workqueue_struct *bch_flush_wq;
extern struct mutex bch_register_lock;
extern struct list_head bch_cache_sets;
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index aefbdb7e003b..c6613e817333 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -932,8 +932,8 @@ atomic_t *bch_journal(struct cache_set *c,
journal_try_write(c);
} else if (!w->dirty) {
w->dirty = true;
- schedule_delayed_work(&c->journal.work,
- msecs_to_jiffies(c->journal_delay_ms));
+ queue_delayed_work(bch_flush_wq, &c->journal.work,
+ msecs_to_jiffies(c->journal_delay_ms));
spin_unlock(&c->journal.lock);
} else {
spin_unlock(&c->journal.lock);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 85a44a0cffe0..0228ccb293fc 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -49,6 +49,7 @@ static int bcache_major;
static DEFINE_IDA(bcache_device_idx);
static wait_queue_head_t unregister_wait;
struct workqueue_struct *bcache_wq;
+struct workqueue_struct *bch_flush_wq;
struct workqueue_struct *bch_journal_wq;
@@ -2821,6 +2822,8 @@ static void bcache_exit(void)
destroy_workqueue(bcache_wq);
if (bch_journal_wq)
destroy_workqueue(bch_journal_wq);
+ if (bch_flush_wq)
+ destroy_workqueue(bch_flush_wq);
bch_btree_exit();
if (bcache_major)
@@ -2884,6 +2887,19 @@ static int __init bcache_init(void)
if (!bcache_wq)
goto err;
+ /*
+ * Let's not make this `WQ_MEM_RECLAIM` for the following reasons:
+ *
+ * 1. It used `system_wq` before which also does no memory reclaim.
+ * 2. With `WQ_MEM_RECLAIM` desktop stalls, increased boot times, and
+ * reduced throughput can be observed.
+ *
+ * We still want to use our own queue to not congest the `system_wq`.
+ */
+ bch_flush_wq = alloc_workqueue("bch_flush", 0, 0);
+ if (!bch_flush_wq)
+ goto err;
+
bch_journal_wq = alloc_workqueue("bch_journal", WQ_MEM_RECLAIM, 0);
if (!bch_journal_wq)
goto err;
--
2.26.2
From: Kai Krakow <kai(a)kaishome.de>
Before killing `btree_io_wq`, the queue was allocated using
`create_singlethread_workqueue()` which has `WQ_MEM_RECLAIM`. After
killing it, it no longer had this property but `system_wq` is not
single threaded.
Let's combine both worlds and make it multi threaded but able to
reclaim memory.
Cc: Coly Li <colyli(a)suse.de>
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Kai Krakow <kai(a)kaishome.de>
Signed-off-by: Coly Li <colyli(a)suse.de>
---
drivers/md/bcache/btree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 952f022db5a5..fe6dce125aba 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2775,7 +2775,7 @@ void bch_btree_exit(void)
int __init bch_btree_init(void)
{
- btree_io_wq = create_singlethread_workqueue("bch_btree_io");
+ btree_io_wq = alloc_workqueue("bch_btree_io", WQ_MEM_RECLAIM, 0);
if (!btree_io_wq)
return -ENOMEM;
--
2.26.2
From: Kai Krakow <kai(a)kaishome.de>
This reverts commit 56b30770b27d54d68ad51eccc6d888282b568cee.
With the btree using the `system_wq`, I seem to see a lot more desktop
latency than I should.
After some more investigation, it looks like the original assumption
of 56b3077 no longer is true, and bcache has a very high potential of
congesting the `system_wq`. In turn, this introduces laggy desktop
performance, IO stalls (at least with btrfs), and input events may be
delayed.
So let's revert this. Note that, to keep the same assumptions that held
while `system_wq` was in use, `btree_io_wq` should be created before and
destroyed after the other bcache wqs.
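The create-first, destroy-last constraint can be sketched as a userspace model. This is an illustration only: the `model_*` names and the event log are hypothetical, and the queue names are plain labels, not real workqueues.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_EVENTS 16

/* Hypothetical event log; records the order of create/destroy calls. */
static const char *model_log[MODEL_MAX_EVENTS];
static size_t model_nr_events;

static void model_record(const char *event)
{
	if (model_nr_events < MODEL_MAX_EVENTS)
		model_log[model_nr_events++] = event;
}

/* Init: the btree I/O queue comes up before the other queues. */
void model_init(void)
{
	model_nr_events = 0;
	model_record("create btree_io");
	model_record("create bcache");
	model_record("create journal");
}

/* Exit: the other queues go away first, the btree I/O queue last. */
void model_exit(void)
{
	model_record("destroy bcache");
	model_record("destroy journal");
	model_record("destroy btree_io");
}

/* Returns the index of an event in the log, or -1 if it was not recorded. */
int model_index_of(const char *event)
{
	for (size_t i = 0; i < model_nr_events; i++)
		if (strcmp(model_log[i], event) == 0)
			return (int)i;
	return -1;
}
```

Calling `model_init()` followed by `model_exit()` leaves "create btree_io" as the first logged event and "destroy btree_io" as the last, which is the ordering described above.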
Cc: Coly Li <colyli(a)suse.de>
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Kai Krakow <kai(a)kaishome.de>
Signed-off-by: Coly Li <colyli(a)suse.de>
---
drivers/md/bcache/bcache.h | 2 ++
drivers/md/bcache/btree.c | 21 +++++++++++++++++++--
drivers/md/bcache/super.c | 4 ++++
3 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d7a84327b7f1..2b8c7dd2cfae 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -1046,5 +1046,7 @@ void bch_debug_exit(void);
void bch_debug_init(void);
void bch_request_exit(void);
int bch_request_init(void);
+void bch_btree_exit(void);
+int bch_btree_init(void);
#endif /* _BCACHE_H */
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 910df242c83d..952f022db5a5 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -99,6 +99,8 @@
#define PTR_HASH(c, k) \
(((k)->ptr[0] >> c->bucket_bits) | PTR_GEN(k, 0))
+static struct workqueue_struct *btree_io_wq;
+
#define insert_lock(s, b) ((b)->level <= (s)->lock)
@@ -308,7 +310,7 @@ static void __btree_node_write_done(struct closure *cl)
btree_complete_write(b, w);
if (btree_node_dirty(b))
- schedule_delayed_work(&b->work, 30 * HZ);
+ queue_delayed_work(btree_io_wq, &b->work, 30 * HZ);
closure_return_with_destructor(cl, btree_node_write_unlock);
}
@@ -481,7 +483,7 @@ static void bch_btree_leaf_dirty(struct btree *b, atomic_t *journal_ref)
BUG_ON(!i->keys);
if (!btree_node_dirty(b))
- schedule_delayed_work(&b->work, 30 * HZ);
+ queue_delayed_work(btree_io_wq, &b->work, 30 * HZ);
set_btree_node_dirty(b);
@@ -2764,3 +2766,18 @@ void bch_keybuf_init(struct keybuf *buf)
spin_lock_init(&buf->lock);
array_allocator_init(&buf->freelist);
}
+
+void bch_btree_exit(void)
+{
+ if (btree_io_wq)
+ destroy_workqueue(btree_io_wq);
+}
+
+int __init bch_btree_init(void)
+{
+ btree_io_wq = create_singlethread_workqueue("bch_btree_io");
+ if (!btree_io_wq)
+ return -ENOMEM;
+
+ return 0;
+}
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index e7d1b52c5cc8..85a44a0cffe0 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2821,6 +2821,7 @@ static void bcache_exit(void)
destroy_workqueue(bcache_wq);
if (bch_journal_wq)
destroy_workqueue(bch_journal_wq);
+ bch_btree_exit();
if (bcache_major)
unregister_blkdev(bcache_major, "bcache");
@@ -2876,6 +2877,9 @@ static int __init bcache_init(void)
return bcache_major;
}
+ if (bch_btree_init())
+ goto err;
+
bcache_wq = alloc_workqueue("bcache", WQ_MEM_RECLAIM, 0);
if (!bcache_wq)
goto err;
--
2.26.2
This is the start of the stable review cycle for the 4.4.257 release.
There are 38 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.257-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.257-rc1
Shih-Yuan Lee (FourDollars) <sylee(a)canonical.com>
ALSA: hda/realtek - Fix typo of pincfg for Dell quirk
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Arnd Bergmann <arnd(a)arndb.de>
elfcore: fix building with clang
Ralf Baechle <ralf(a)linux-mips.org>
ELF/MIPS build fix
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Sasha Levin <sashal(a)kernel.org>
stable: clamp SUBLEVEL in 4.4 and 4.9
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
usb: udc: core: Use lock when write to soft_connect
Lee Jones <lee.jones(a)linaro.org>
futex: Handle faults correctly for PI futexes
Lee Jones <lee.jones(a)linaro.org>
futex: Simplify fixup_pi_state_owner()
Lee Jones <lee.jones(a)linaro.org>
futex: Use pi_state_update_owner() in put_pi_state()
Lee Jones <lee.jones(a)linaro.org>
rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
Lee Jones <lee.jones(a)linaro.org>
futex: Provide and use pi_state_update_owner()
Lee Jones <lee.jones(a)linaro.org>
futex: Replace pointless printk in fixup_owner()
Lee Jones <lee.jones(a)linaro.org>
futex: Avoid violating the 10th rule of futex
Lee Jones <lee.jones(a)linaro.org>
futex: Rework inconsistent rt_mutex/futex_q state
Lee Jones <lee.jones(a)linaro.org>
futex: Remove rt_mutex_deadlock_account_*()
Lee Jones <lee.jones(a)linaro.org>
futex,rt_mutex: Provide futex specific rt_mutex API
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
-------------
Diffstat:
Makefile | 12 +-
arch/arm/mach-footbridge/dc21285.c | 12 +-
arch/mips/Kconfig | 1 +
arch/x86/Makefile | 3 +
arch/x86/include/asm/apic.h | 10 --
arch/x86/include/asm/barrier.h | 18 +++
arch/x86/kernel/apic/apic.c | 4 +
arch/x86/kernel/apic/x2apic_cluster.c | 3 +-
arch/x86/kernel/apic/x2apic_phys.c | 3 +-
drivers/acpi/thermal.c | 54 +++++--
drivers/input/joystick/xpad.c | 17 ++-
drivers/input/serio/i8042-x86ia64io.h | 2 +
drivers/mmc/core/sdio_cis.c | 6 +
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 +-
drivers/usb/class/usblp.c | 19 ++-
drivers/usb/dwc2/gadget.c | 8 +-
drivers/usb/gadget/legacy/ether.c | 4 +-
drivers/usb/gadget/udc/udc-core.c | 13 +-
drivers/usb/serial/cp210x.c | 2 +
drivers/usb/serial/option.c | 6 +
fs/Kconfig.binfmt | 8 +
fs/cifs/dir.c | 22 ++-
fs/hugetlbfs/inode.c | 3 +-
include/linux/elfcore.h | 22 +++
include/linux/hugetlb.h | 3 +
kernel/Makefile | 3 -
kernel/elfcore.c | 25 ---
kernel/futex.c | 278 +++++++++++++++++++---------------
kernel/kprobes.c | 4 +
kernel/locking/rtmutex-debug.c | 9 --
kernel/locking/rtmutex-debug.h | 3 -
kernel/locking/rtmutex.c | 127 ++++++++++------
kernel/locking/rtmutex.h | 2 -
kernel/locking/rtmutex_common.h | 12 +-
mm/hugetlb.c | 9 +-
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 +-
net/mac80211/rate.c | 3 +-
net/sched/sch_api.c | 3 +-
sound/pci/hda/patch_realtek.c | 2 +-
41 files changed, 469 insertions(+), 294 deletions(-)
This is the start of the stable review cycle for the 4.19.175 release.
There are 38 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.175-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.175-rc1
DENG Qingfang <dqfext(a)gmail.com>
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Vadim Fedorenko <vfedorenko(a)novek.ru>
net: ip_tunnel: fix mtu calculation
Xiao Ni <xni(a)redhat.com>
md: Set prev_flush_start and flush_bio in an atomic way
Nadav Amit <namit(a)vmware.com>
iommu/vt-d: Do not use flush-queue when caching-mode is on
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Hugh Dickins <hughd(a)google.com>
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between freeing and dissolving the page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
Sean Christopherson <seanjc(a)google.com>
KVM: SVM: Treat SVM as unsupported when running as an SEV guest
Thorsten Leemhuis <linux(a)leemhuis.info>
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Gustavo A. R. Silva <gustavoars(a)kernel.org>
smb3: Fix out-of-bounds bug in SMB2_negotiate()
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: fix bounce buffer usage for non-sg list case
Marc Zyngier <maz(a)kernel.org>
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Liangyan <liangyan.peng(a)linux.alibaba.com>
ovl: fix dentry leak in ovl_get_redirect
Gary Bisson <gary.bisson(a)boundarydevices.com>
usb: dwc3: fix clock issue during resume in OTG mode
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Roman Gushchin <guro(a)fb.com>
memblock: do not start bottom-up allocations with kernel_end
Stefan Chulski <stefanc(a)marvell.com>
net: mvpp2: TCAM entry enable should be written after SRAM data
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Zyta Szpak <zr(a)semihalf.com>
arm64: dts: ls1046a: fix dcfg address range
David Howells <dhowells(a)redhat.com>
rxrpc: Fix deadlock around release of dst cached on udp tunnel
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Arnd Bergmann <arnd(a)arndb.de>
elfcore: fix building with clang
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
-------------
Diffstat:
Makefile | 10 ++----
arch/arm/mach-footbridge/dc21285.c | 12 +++----
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +-
arch/x86/Makefile | 3 ++
arch/x86/include/asm/apic.h | 10 ------
arch/x86/include/asm/barrier.h | 18 ++++++++++
arch/x86/kernel/apic/apic.c | 4 +++
arch/x86/kernel/apic/x2apic_cluster.c | 6 ++--
arch/x86/kernel/apic/x2apic_phys.c | 6 ++--
arch/x86/kvm/svm.c | 5 +++
drivers/input/joystick/xpad.c | 17 ++++++++-
drivers/input/serio/i8042-x86ia64io.h | 2 ++
drivers/iommu/intel-iommu.c | 6 ++++
drivers/md/md.c | 2 ++
drivers/mmc/core/sdio_cis.c | 6 ++++
drivers/net/dsa/mv88e6xxx/chip.c | 6 +++-
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 +++---
drivers/nvme/host/pci.c | 2 ++
drivers/usb/class/usblp.c | 19 +++++-----
drivers/usb/dwc2/gadget.c | 8 +----
drivers/usb/dwc3/core.c | 2 +-
drivers/usb/gadget/legacy/ether.c | 4 ++-
drivers/usb/host/xhci-ring.c | 31 ++++++++++------
drivers/usb/renesas_usbhs/fifo.c | 1 +
drivers/usb/serial/cp210x.c | 2 ++
drivers/usb/serial/option.c | 6 ++++
fs/afs/main.c | 6 ++--
fs/cifs/dir.c | 22 ++++++++++--
fs/cifs/smb2pdu.h | 2 +-
fs/hugetlbfs/inode.c | 3 +-
fs/overlayfs/dir.c | 2 +-
include/linux/elfcore.h | 22 ++++++++++++
include/linux/hugetlb.h | 3 ++
include/linux/msi.h | 6 ++++
kernel/Makefile | 1 -
kernel/elfcore.c | 26 --------------
kernel/irq/msi.c | 44 +++++++++++------------
kernel/kprobes.c | 4 +++
mm/huge_memory.c | 37 +++++++++++--------
mm/hugetlb.c | 48 ++++++++++++++++++++++---
mm/memblock.c | 49 ++++----------------------
net/ipv4/ip_tunnel.c | 16 ++++-----
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 ++-
net/mac80211/rate.c | 3 +-
net/rxrpc/af_rxrpc.c | 6 ++--
46 files changed, 308 insertions(+), 200 deletions(-)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97c753e62e6c31a404183898d950d8c08d752dbd Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Thu, 28 Jan 2021 00:37:51 +0900
Subject: [PATCH] tracing/kprobe: Fix to support kretprobe events on unloaded
modules
Fix kprobe_on_func_entry() to return an error code instead of false so
that register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects kprobe registration to return -ENOENT
when the target symbol is not found, and then checks whether the target
module is unloaded. If the target module doesn't exist, it defers
probing the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fails to put the kretprobe event on unloaded
modules, e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL,
or 0 on success (the given address is on the function entry).
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@de…
Cc: stable(a)vger.kernel.org
Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv(a)arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index b3a36b0cfc81..1883a4a9f16a 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -266,7 +266,7 @@ extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
extern int arch_populate_kprobe_blacklist(void);
extern bool arch_kprobe_on_func_entry(unsigned long offset);
-extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
+extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
extern bool within_kprobe_blacklist(unsigned long addr);
extern int kprobe_add_ksym_blacklist(unsigned long entry);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f7fb5d135930..1a5bc321e0a5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1954,29 +1954,45 @@ bool __weak arch_kprobe_on_func_entry(unsigned long offset)
return !offset;
}
-bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
+/**
+ * kprobe_on_func_entry() -- check whether given address is function entry
+ * @addr: Target address
+ * @sym: Target symbol name
+ * @offset: The offset from the symbol or the address
+ *
+ * This checks whether the given @addr+@offset or @sym+@offset is on the
+ * function entry address or not.
+ * This returns 0 if it is the function entry, or -EINVAL if it is not.
+ * And also it returns -ENOENT if it fails the symbol or address lookup.
+ * Caller must pass @addr or @sym (either one must be NULL), or this
+ * returns -EINVAL.
+ */
+int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
{
kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
if (IS_ERR(kp_addr))
- return false;
+ return PTR_ERR(kp_addr);
- if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) ||
- !arch_kprobe_on_func_entry(offset))
- return false;
+ if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
+ return -ENOENT;
- return true;
+ if (!arch_kprobe_on_func_entry(offset))
+ return -EINVAL;
+
+ return 0;
}
int register_kretprobe(struct kretprobe *rp)
{
- int ret = 0;
+ int ret;
struct kretprobe_instance *inst;
int i;
void *addr;
- if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset))
- return -EINVAL;
+ ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset);
+ if (ret)
+ return ret;
if (kretprobe_blacklist_size) {
addr = kprobe_addr(&rp->kp);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e6fba1798771..56c7fbff7bd7 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -221,9 +221,9 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
- return tk ? kprobe_on_func_entry(tk->rp.kp.addr,
+ return tk ? (kprobe_on_func_entry(tk->rp.kp.addr,
tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
- tk->rp.kp.addr ? 0 : tk->rp.kp.offset) : false;
+ tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0) : false;
}
bool trace_kprobe_error_injectable(struct trace_event_call *call)
@@ -828,9 +828,11 @@ static int trace_kprobe_create(int argc, const char *argv[])
}
if (is_return)
flags |= TPARG_FL_RETURN;
- if (kprobe_on_func_entry(NULL, symbol, offset))
+ ret = kprobe_on_func_entry(NULL, symbol, offset);
+ if (ret == 0)
flags |= TPARG_FL_FENTRY;
- if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+ /* Defer the ENOENT case until register kprobe */
+ if (ret == -EINVAL && is_return) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
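The three-way return contract described in the kprobe patch above can be sketched in user space as follows. This is only an illustrative model, not the kernel implementation: `demo_symbols`, `demo_on_func_entry()` and `demo_reject_retprobe()` are hypothetical names, and the symbol table is a stand-in for the kernel's symbol lookup. It assumes, as the default arch hook in the diff does, that only offset 0 counts as the function entry.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the set of currently loaded symbols. */
static const char *const demo_symbols[] = { "vfs_read", "vfs_write" };

/*
 * Model of the three-way result described in the patch:
 *   0        the symbol exists and offset points at its entry
 *   -EINVAL  the symbol exists but offset is not the entry
 *   -ENOENT  the symbol lookup failed (e.g. its module is not loaded yet)
 */
int demo_on_func_entry(const char *sym, unsigned long offset)
{
	size_t i;

	if (!sym)
		return -EINVAL;

	for (i = 0; i < sizeof(demo_symbols) / sizeof(demo_symbols[0]); i++) {
		if (strcmp(demo_symbols[i], sym) == 0)
			return offset == 0 ? 0 : -EINVAL;
	}
	return -ENOENT;
}

/*
 * Model of the caller's decision: only a definite -EINVAL rejects a
 * return probe up front; -ENOENT is deferred until registration time.
 * Returns 1 if the probe definition should be rejected immediately.
 */
int demo_reject_retprobe(const char *sym, unsigned long offset)
{
	return demo_on_func_entry(sym, offset) == -EINVAL;
}
```

With a boolean result the caller could not tell "not at the entry" apart from "symbol not found yet". Keeping the two error codes separate is what allows the second case to be deferred.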
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 135b9e8d1cd8ba5ac9ad9bcf24b464b7b052e5b8 Mon Sep 17 00:00:00 2001
From: Sibi Sankar <sibis(a)codeaurora.org>
Date: Thu, 23 Jul 2020 01:40:46 +0530
Subject: [PATCH] remoteproc: qcom_q6v5_mss: Validate modem blob firmware size
before load
The following mem abort is observed when the size of one of the modem
firmware blobs exceeds the allocated mpss region. Fix this by restricting
the copy size to the segment size, using request_firmware_into_buf()
before load.
Err Logs:
Unable to handle kernel paging request at virtual address
Mem abort info:
...
Call trace:
__memcpy+0x110/0x180
rproc_start+0xd0/0x190
rproc_boot+0x404/0x550
state_store+0x54/0xf8
dev_attr_store+0x44/0x60
sysfs_kf_write+0x58/0x80
kernfs_fop_write+0x140/0x230
vfs_write+0xc4/0x208
ksys_write+0x74/0xf8
...
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Fixes: 051fb70fd4ea4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sibi Sankar <sibis(a)codeaurora.org>
Link: https://lore.kernel.org/r/20200722201047.12975-3-sibis@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
diff --git a/drivers/remoteproc/qcom_q6v5_mss.c b/drivers/remoteproc/qcom_q6v5_mss.c
index 7826f229957d..8199d9f59209 100644
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -1173,15 +1173,14 @@ static int q6v5_mpss_load(struct q6v5 *qproc)
} else if (phdr->p_filesz) {
/* Replace "xxx.xxx" with "xxx.bxx" */
sprintf(fw_name + fw_name_len - 3, "b%02d", i);
- ret = request_firmware(&seg_fw, fw_name, qproc->dev);
+ ret = request_firmware_into_buf(&seg_fw, fw_name, qproc->dev,
+ ptr, phdr->p_filesz);
if (ret) {
dev_err(qproc->dev, "failed to load %s\n", fw_name);
iounmap(ptr);
goto release_firmware;
}
- memcpy(ptr, seg_fw->data, seg_fw->size);
-
release_firmware(seg_fw);
}
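The idea behind the remoteproc fix above, bounding the copy by the size of the destination rather than trusting the size of the blob, can be sketched in plain C. This is a minimal user-space model under stated assumptions: `demo_load_segment()` is a hypothetical helper, not a kernel function, and returning -EFBIG for an oversized blob is an arbitrary choice made for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a firmware segment into a fixed-size destination region.
 * Returns 0 on success, -EINVAL on a NULL argument, or -EFBIG if the
 * blob does not fit, in which case the destination is left untouched.
 */
int demo_load_segment(void *dst, size_t dst_size,
		      const void *blob, size_t blob_size)
{
	if (!dst || !blob)
		return -EINVAL;

	/* Check the size before copying, never after. */
	if (blob_size > dst_size)
		return -EFBIG;

	memcpy(dst, blob, blob_size);
	return 0;
}
```

The removed `memcpy(ptr, seg_fw->data, seg_fw->size)` used the file's own size as the copy length. The sketch shows the opposite policy, where the region size is the limit.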
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
This reverts commit 536d3bf261a2fc3b05b3e91e7eef7383443015cf, as it can
cause writers to memory.high to get stuck in the kernel forever,
performing page reclaim and consuming excessive amounts of CPU cycles.
Before the patch, a write to memory.high would first put the new limit in
place for the workload, and then reclaim the requested delta. After the
patch, the kernel tries to reclaim the delta before putting the new limit
into place, in order to not overwhelm the workload with a sudden, large
excess over the limit. However, if reclaim is actively racing with new
allocations from the uncurbed workload, it can keep the write() working
inside the kernel indefinitely.
This is causing problems in Facebook production. A privileged
system-level daemon that adjusts memory.high for various workloads running
on a host can get unexpectedly stuck in the kernel and essentially turn
into a sort of involuntary kswapd for one of the workloads. We've
observed that daemon busy-spin in a write() for minutes at a time,
neglecting its other duties on the system, and expending privileged system
resources on behalf of a workload.
To remedy this, we first considered changing the reclaim logic to
break out after a couple of loops - whether the workload has converged to
the new limit or not - and bound the write() call this way. However, the
root cause that inspired the sequence change in the first place has been
fixed through other means, and so a revert back to the proven
limit-setting sequence, also used by memory.max, is preferable.
The sequence was changed to avoid extreme latencies in the workload when
the limit was lowered: the sudden, large excess created by the limit
lowering would erroneously trigger the penalty sleeping code that is meant
to throttle excessive growth from below. Allocating threads could end up
sleeping long after the write() had already reclaimed the delta for which
they were being punished.
However, erroneous throttling also caused problems in other scenarios at
around the same time. This resulted in commit b3ff92916af3 ("mm, memcg:
reclaim more aggressively before high allocator throttling"), included in
the same release as the offending commit. When allocating threads now
encounter large excess caused by a racing write() to memory.high, instead
of entering punitive sleeps, they will simply be tasked with helping
reclaim down the excess, and will be held no longer than it takes to
accomplish that. This is in line with regular limit enforcement - i.e.
if the workload allocates up against or over an otherwise unchanged limit
from below.
With the patch breaking userspace, and the root cause addressed by other
means already, revert it again.
Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Michal Koutný <mkoutny(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/memcontrol.c~revert-mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh
+++ a/mm/memcontrol.c
@@ -6271,6 +6271,8 @@ static ssize_t memory_high_write(struct
if (err)
return err;
+ page_counter_set_high(&memcg->memory, high);
+
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
unsigned long reclaimed;
@@ -6294,10 +6296,7 @@ static ssize_t memory_high_write(struct
break;
}
- page_counter_set_high(&memcg->memory, high);
-
memcg_wb_domain_size_changed(memcg);
-
return nbytes;
}
_
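The ordering restored by the memory.high revert above (publish the new limit first, then reclaim the excess) can be sketched with a toy model. Everything here is hypothetical and heavily simplified: `struct demo_group`, `demo_reclaim()` and `demo_set_high()` are illustration-only names, the reclaim step frees at most 32 pages per pass, and the retry bound stands in for the kernel's own exit conditions.

```c
#include <stddef.h>

struct demo_group {
	unsigned long usage;	/* pages currently charged */
	unsigned long high;	/* current high limit, in pages */
};

/* Hypothetical reclaim step: frees up to `want` pages, at most 32 per call. */
unsigned long demo_reclaim(struct demo_group *g, unsigned long want)
{
	unsigned long freed = want < 32 ? want : 32;

	if (freed > g->usage)
		freed = g->usage;
	g->usage -= freed;
	return freed;
}

/*
 * Publish the new limit first so that allocators see it immediately,
 * then reclaim the excess with a bounded number of passes.
 * Returns the number of reclaim passes performed.
 */
int demo_set_high(struct demo_group *g, unsigned long new_high, int max_passes)
{
	int passes = 0;

	g->high = new_high;	/* limit is in place before any reclaim */

	while (g->usage > g->high && passes < max_passes) {
		if (demo_reclaim(g, g->usage - g->high) == 0)
			break;	/* no progress, give up */
		passes++;
	}
	return passes;
}
```

Because the limit is already set when the loop starts, the writer can stop after a bounded amount of work even if usage is still above the limit. The workload's own allocations are then held to the new limit.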
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
Currently there is an assumption in tmpfs that 64-bit architectures also
have a 64-bit ino_t. This is not true on s390 which has a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, but passing the "inode64" mount
option will fail. This leads to the following behavior:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
This happens because mount sees "inode64" in the mount options and thus
passes it in the options for the remount.
So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-s390
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT
+ depends on TMPFS && 64BIT && !S390
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
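The mismatch described in the tmpfs patch above can be modelled in a few lines of C. This is a sketch under an assumption taken from the commit message: the "inode64" option is rejected whenever ino_t is narrower than 64 bits. `demo_parse_inode_opt()` is a hypothetical function, and its `ino_size` parameter stands in for `sizeof(ino_t)` on the target architecture.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of the option check: "inode64" is only acceptable when the
 * inode number type is at least 8 bytes wide.
 * Returns 0 if the option is accepted, -EINVAL otherwise.
 */
int demo_parse_inode_opt(const char *opt, size_t ino_size)
{
	if (strcmp(opt, "inode32") == 0)
		return 0;
	if (strcmp(opt, "inode64") == 0)
		return ino_size < 8 ? -EINVAL : 0;
	return -EINVAL;	/* unknown option */
}
```

On a target with a 4-byte ino_t, a mount that advertises "inode64" by default produces an option string that the same parser refuses on remount. The Kconfig change avoids that inconsistency.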
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in xattr id lookup
Syzbot has reported a warning where a kmalloc() attempt exceeds the
maximum limit. This has been identified as corruption of the xattr_ids
count when reading the xattr id lookup table.
This patch adds a number of additional sanity checks to detect this
corruption and others.
1. It checks for a corrupted xattr index read from the inode. This could
be because the metadata block is uncompressed, or because the
"compression" bit has been corrupted (turning a compressed block
into an uncompressed block). This would cause an out of bounds read.
2. It checks against corruption of the xattr_ids count. This can lead
either to the above kmalloc failure, or to a smaller than expected
table being read.
3. It checks the contents of the index table for corruption.
[phillip(a)squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co…
Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+2ccea6339d368360800d(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/xattr_id.c | 66 +++++++++++++++++++++++++++++++++------
1 file changed, 57 insertions(+), 9 deletions(-)
--- a/fs/squashfs/xattr_id.c~squashfs-add-more-sanity-checks-in-xattr-id-lookup
+++ a/fs/squashfs/xattr_id.c
@@ -31,10 +31,15 @@ int squashfs_xattr_lookup(struct super_b
struct squashfs_sb_info *msblk = sb->s_fs_info;
int block = SQUASHFS_XATTR_BLOCK(index);
int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index);
- u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+ u64 start_block;
struct squashfs_xattr_id id;
int err;
+ if (index >= msblk->xattr_ids)
+ return -EINVAL;
+
+ start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+
err = squashfs_read_metadata(sb, &id, &start_block, &offset,
sizeof(id));
if (err < 0)
@@ -50,13 +55,17 @@ int squashfs_xattr_lookup(struct super_b
/*
* Read uncompressed xattr id lookup table indexes from disk into memory
*/
-__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start,
+__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
u64 *xattr_table_start, int *xattr_ids)
{
- unsigned int len;
+ struct squashfs_sb_info *msblk = sb->s_fs_info;
+ unsigned int len, indexes;
struct squashfs_xattr_id_table *id_table;
+ __le64 *table;
+ u64 start, end;
+ int n;
- id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+ id_table = squashfs_read_table(sb, table_start, sizeof(*id_table));
if (IS_ERR(id_table))
return (__le64 *) id_table;
@@ -70,13 +79,52 @@ __le64 *squashfs_read_xattr_id_table(str
if (*xattr_ids == 0)
return ERR_PTR(-EINVAL);
- /* xattr_table should be less than start */
- if (*xattr_table_start >= start)
+ len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+ indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
+
+ /*
+ * The computed size of the index table (len bytes) should exactly
+ * match the table start and end points
+ */
+ start = table_start + sizeof(*id_table);
+ end = msblk->bytes_used;
+
+ if (len != (end - start))
return ERR_PTR(-EINVAL);
- len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+ table = squashfs_read_table(sb, start, len);
+ if (IS_ERR(table))
+ return table;
+
+ /* table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed xattr id blocks. Each entry should be less than
+ * the next (i.e. table[0] < table[1]), and the difference between them
+ * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1]
+ * should be less than table_start, and again the difference
+ * should be SQUASHFS_METADATA_SIZE or less.
+ *
+ * Finally xattr_table_start should be less than table[0].
+ */
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
- TRACE("In read_xattr_index_table, length %d\n", len);
+ if (*xattr_table_start >= le64_to_cpu(table[0])) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
- return squashfs_read_table(sb, start + sizeof(*id_table), len);
+ return table;
}
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in inode lookup
Syzbot has reported a "slab-out-of-bounds read" error which has been
identified as being caused by a corrupted "ino_num" value read from the
inode. This could be because the metadata block is uncompressed, or
because the "compression" bit has been corrupted (turning a compressed
block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the following
corruption.
1. It checks against corruption of the inodes count. This can lead
to either a larger or a smaller table than expected being
read.
In the case of a too large inodes count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
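The two checks listed above can be sketched in userspace C as follows. This is only an illustration, not the kernel code: check_index_table(), METADATA_SIZE and the host-endian uint64_t table are assumptions made for the example (the kernel works on __le64 entries and uses SQUASHFS_METADATA_SIZE).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for SQUASHFS_METADATA_SIZE. */
#define METADATA_SIZE 8192u

/*
 * Hypothetical helper: returns 0 if the index table looks sane, -1 if not.
 * table[] holds the start offsets of the metadata blocks, table_start is
 * the offset of the index table itself, and next_table is the offset at
 * which the following table begins.
 */
static int check_index_table(const uint64_t *table, size_t indexes,
                             uint64_t table_start, uint64_t next_table)
{
    size_t n;

    assert(table != NULL);

    /* Check 1: the computed size must match the on-disk extent exactly. */
    if (indexes == 0 || next_table < table_start ||
        (next_table - table_start) != indexes * sizeof(uint64_t))
        return -1;

    /* Check 2: entries strictly increase, at most one block apart. */
    for (n = 0; n + 1 < indexes; n++) {
        if (table[n] >= table[n + 1] ||
            (table[n + 1] - table[n]) > METADATA_SIZE)
            return -1;
    }

    /* The last block must start before, and close to, the index table. */
    if (table[indexes - 1] >= table_start ||
        (table_start - table[indexes - 1]) > METADATA_SIZE)
        return -1;

    return 0;
}
```

An exact size match rejects counts that are too small as well as too large, which a "does not overlap the next table" test cannot do.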
[phillip(a)squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co…
Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+04419e3ff19d2970ea28(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/export.c | 41 +++++++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 8 deletions(-)
--- a/fs/squashfs/export.c~squashfs-add-more-sanity-checks-in-inode-lookup
+++ a/fs/squashfs/export.c
@@ -41,12 +41,17 @@ static long long squashfs_inode_lookup(s
struct squashfs_sb_info *msblk = sb->s_fs_info;
int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1);
int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1);
- u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+ u64 start;
__le64 ino;
int err;
TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num);
+ if (ino_num == 0 || (ino_num - 1) >= msblk->inodes)
+ return -EINVAL;
+
+ start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+
err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino));
if (err < 0)
return err;
@@ -111,7 +116,10 @@ __le64 *squashfs_read_inode_lookup_table
u64 lookup_table_start, u64 next_table, unsigned int inodes)
{
unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes);
+ unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes);
+ int n;
__le64 *table;
+ u64 start, end;
TRACE("In read_inode_lookup_table, length %d\n", length);
@@ -121,20 +129,37 @@ __le64 *squashfs_read_inode_lookup_table
if (inodes == 0)
return ERR_PTR(-EINVAL);
- /* length bytes should not extend into the next table - this check
- * also traps instances where lookup_table_start is incorrectly larger
- * than the next table start
+ /*
+ * The computed size of the lookup table (length bytes) should exactly
+ * match the table start and end points
*/
- if (lookup_table_start + length > next_table)
+ if (length != (next_table - lookup_table_start))
return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, lookup_table_start, length);
+ if (IS_ERR(table))
+ return table;
/*
- * table[0] points to the first inode lookup table metadata block,
- * this should be less than lookup_table_start
+ * table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed inode lookup blocks. Each entry should be
+ * less than the next (i.e. table[0] < table[1]), and the difference
+ * between them should be SQUASHFS_METADATA_SIZE or less.
+ * table[indexes - 1] should be less than lookup_table_start, and
+ * again the difference should be SQUASHFS_METADATA_SIZE or less
*/
- if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) {
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
kfree(table);
return ERR_PTR(-EINVAL);
}
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in id lookup
Syzbot has reported a number of "slab-out-of-bounds reads" and
"use-after-free read" errors which have been identified as being caused by
a corrupted index value read from the inode. This could be because the
metadata block is uncompressed, or because the "compression" bit has been
corrupted (turning a compressed block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the
following corruption.
1. It checks against corruption of the ids count. This can lead
to either a larger or a smaller table than expected being
read.
In the case of a too large ids count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
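The reason a corrupted index value is dangerous can be shown with a small userspace sketch: the index selects a slot in the in-memory id table, so it has to be validated against the ids count before the table is touched. The names id_info, id_block_start() and IDS_PER_BLOCK are hypothetical and chosen for illustration; IDS_PER_BLOCK assumes 4-byte ids packed into 8192-byte metadata blocks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: 8192-byte metadata block / 4-byte id. */
#define IDS_PER_BLOCK 2048u

/* Hypothetical, trimmed-down stand-in for the superblock info. */
struct id_info {
    unsigned int ids;          /* number of ids recorded at mount time */
    const uint64_t *id_table;  /* start offset of each id metadata block */
};

/*
 * Look up the metadata block holding id number 'index'.
 * Returns 0 and stores the block start, or -EINVAL for a bad index.
 */
static int id_block_start(const struct id_info *info, unsigned int index,
                          uint64_t *start)
{
    assert(info != NULL && start != NULL);

    /* Reject the index before it is used to address the table. */
    if (index >= info->ids)
        return -EINVAL;

    *start = info->id_table[index / IDS_PER_BLOCK];
    return 0;
}
```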
Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+b06d57ba83f604522af2(a)syzkaller.appspotmail.com
Reported-by: syzbot+c021ba012da41ee9807c(a)syzkaller.appspotmail.com
Reported-by: syzbot+5024636e8b5fd19f0f19(a)syzkaller.appspotmail.com
Reported-by: syzbot+bcbc661df46657d0fa4f(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/id.c | 40 ++++++++++++++++++++++++++-------
fs/squashfs/squashfs_fs_sb.h | 1
fs/squashfs/super.c | 6 ++--
fs/squashfs/xattr.h | 10 +++++++-
4 files changed, 45 insertions(+), 12 deletions(-)
--- a/fs/squashfs/id.c~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/id.c
@@ -35,10 +35,15 @@ int squashfs_get_id(struct super_block *
struct squashfs_sb_info *msblk = sb->s_fs_info;
int block = SQUASHFS_ID_BLOCK(index);
int offset = SQUASHFS_ID_BLOCK_OFFSET(index);
- u64 start_block = le64_to_cpu(msblk->id_table[block]);
+ u64 start_block;
__le32 disk_id;
int err;
+ if (index >= msblk->ids)
+ return -EINVAL;
+
+ start_block = le64_to_cpu(msblk->id_table[block]);
+
err = squashfs_read_metadata(sb, &disk_id, &start_block, &offset,
sizeof(disk_id));
if (err < 0)
@@ -56,7 +61,10 @@ __le64 *squashfs_read_id_index_table(str
u64 id_table_start, u64 next_table, unsigned short no_ids)
{
unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids);
+ unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids);
+ int n;
__le64 *table;
+ u64 start, end;
TRACE("In read_id_index_table, length %d\n", length);
@@ -67,20 +75,36 @@ __le64 *squashfs_read_id_index_table(str
return ERR_PTR(-EINVAL);
/*
- * length bytes should not extend into the next table - this check
- * also traps instances where id_table_start is incorrectly larger
- * than the next table start
+ * The computed size of the index table (length bytes) should exactly
+ * match the table start and end points
*/
- if (id_table_start + length > next_table)
+ if (length != (next_table - id_table_start))
return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, id_table_start, length);
+ if (IS_ERR(table))
+ return table;
/*
- * table[0] points to the first id lookup table metadata block, this
- * should be less than id_table_start
+ * table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed id blocks. Each entry should be less than
+ * the next (i.e. table[0] < table[1]), and the difference between them
+ * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1]
+ * should be less than id_table_start, and again the difference
+ * should be SQUASHFS_METADATA_SIZE or less
*/
- if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) {
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) {
kfree(table);
return ERR_PTR(-EINVAL);
}
--- a/fs/squashfs/squashfs_fs_sb.h~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/squashfs_fs_sb.h
@@ -64,5 +64,6 @@ struct squashfs_sb_info {
unsigned int inodes;
unsigned int fragments;
int xattr_ids;
+ unsigned int ids;
};
#endif
--- a/fs/squashfs/super.c~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/super.c
@@ -166,6 +166,7 @@ static int squashfs_fill_super(struct su
msblk->directory_table = le64_to_cpu(sblk->directory_table_start);
msblk->inodes = le32_to_cpu(sblk->inodes);
msblk->fragments = le32_to_cpu(sblk->fragments);
+ msblk->ids = le16_to_cpu(sblk->no_ids);
flags = le16_to_cpu(sblk->flags);
TRACE("Found valid superblock on %pg\n", sb->s_bdev);
@@ -177,7 +178,7 @@ static int squashfs_fill_super(struct su
TRACE("Block size %d\n", msblk->block_size);
TRACE("Number of inodes %d\n", msblk->inodes);
TRACE("Number of fragments %d\n", msblk->fragments);
- TRACE("Number of ids %d\n", le16_to_cpu(sblk->no_ids));
+ TRACE("Number of ids %d\n", msblk->ids);
TRACE("sblk->inode_table_start %llx\n", msblk->inode_table);
TRACE("sblk->directory_table_start %llx\n", msblk->directory_table);
TRACE("sblk->fragment_table_start %llx\n",
@@ -236,8 +237,7 @@ static int squashfs_fill_super(struct su
allocate_id_index_table:
/* Allocate and read id index table */
msblk->id_table = squashfs_read_id_index_table(sb,
- le64_to_cpu(sblk->id_table_start), next_table,
- le16_to_cpu(sblk->no_ids));
+ le64_to_cpu(sblk->id_table_start), next_table, msblk->ids);
if (IS_ERR(msblk->id_table)) {
errorf(fc, "unable to read id index table");
err = PTR_ERR(msblk->id_table);
--- a/fs/squashfs/xattr.h~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/xattr.h
@@ -17,8 +17,16 @@ extern int squashfs_xattr_lookup(struct
static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb,
u64 start, u64 *xattr_table_start, int *xattr_ids)
{
+ struct squashfs_xattr_id_table *id_table;
+
+ id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+ if (IS_ERR(id_table))
+ return (__le64 *) id_table;
+
+ *xattr_table_start = le64_to_cpu(id_table->xattr_table_start);
+ kfree(id_table);
+
ERROR("Xattrs in filesystem, these will be ignored\n");
- *xattr_table_start = start;
return ERR_PTR(-ENOTSUPP);
}
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: avoid out of bounds writes in decompressors
Patch series "Squashfs: fix BIO migration regression and add sanity checks".
Patch [1/4] fixes a regression introduced by the "migrate from ll_rw_block
usage to BIO" patch, which has produced a number of Syzbot/Syzkaller
reports.
Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption
issues which have produced Syzbot reports in the id, inode and xattr
lookup code.
Each patch has been tested against the Syzbot reproducers using the given
kernel configuration. They have the appropriate "Reported-by:" lines
added.
Additionally, all of the reproducer filesystems are indirectly fixed by
patch [4/4] due to the fact they all have xattr corruption which is now
detected there.
Additional testing with other configurations and architectures (32bit, big
endian), and normal filesystems has also been done to trap any inadvertent
regressions caused by the additional sanity checks.
This patch (of 4):
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".
Syzbot/Syzkaller has reported a number of "out of bounds writes" and
"unable to handle kernel paging request in squashfs_decompress" errors
which have been identified as a regression introduced by the above patch.
Specifically, the patch removed the following sanity check
if (length < 0 || length > output->length ||
(index + length) > msblk->bytes_used)
This check did two things:
1. It ensured any reads were not beyond the end of the filesystem
2. It ensured that the "length" field read from the filesystem
was within the expected maximum length. Without this any
corrupted values can over-run allocated buffers.
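As a rough userspace illustration of the two properties described above, the check can be written as a standalone predicate. check_block_extent() and its parameter names are hypothetical; the logic mirrors the condition quoted in this message and returns -EIO as the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the restored sanity check.
 *   length        - block length as read from the filesystem
 *   output_length - capacity of the destination buffer
 *   index         - offset of the block data within the filesystem
 *   bytes_used    - total size of the filesystem
 */
static int check_block_extent(int length, int output_length,
                              uint64_t index, uint64_t bytes_used)
{
    assert(output_length >= 0);

    /* A corrupted length must not over-run the allocated buffer. */
    if (length < 0 || length > output_length)
        return -EIO;

    /* The read must not extend beyond the end of the filesystem. */
    if (index + (uint64_t)length > bytes_used)
        return -EIO;

    return 0;
}
```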
Link: https://lkml.kernel.org/r/20210204130249.4495-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-2-phillip@squashfs.org.uk
Fixes: 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
Reported-by: syzbot+6fba78f99b9afd4b5634(a)syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Cc: Philippe Liard <pliard(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/block.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/squashfs/block.c~squashfs-avoid-out-of-bounds-writes-in-decompressors
+++ a/fs/squashfs/block.c
@@ -196,9 +196,15 @@ int squashfs_read_data(struct super_bloc
length = SQUASHFS_COMPRESSED_SIZE(length);
index += 2;
- TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
+ TRACE("Block @ 0x%llx, %scompressed size %d\n", index - 2,
compressed ? "" : "un", length);
}
+ if (length < 0 || length > output->length ||
+ (index + length) > msblk->bytes_used) {
+ res = -EIO;
+ goto out;
+ }
+
if (next_index)
*next_index = index + length;
_
This is the start of the stable review cycle for the 4.9.257 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.257-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.257-rc1
Shih-Yuan Lee (FourDollars) <sylee(a)canonical.com>
ALSA: hda/realtek - Fix typo of pincfg for Dell quirk
Nadav Amit <namit(a)vmware.com>
iommu/vt-d: Do not use flush-queue when caching-mode is on
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Hugh Dickins <hughd(a)google.com>
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: fix bounce buffer usage for non-sg list case
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Arnd Bergmann <arnd(a)arndb.de>
elfcore: fix building with clang
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Sasha Levin <sashal(a)kernel.org>
stable: clamp SUBLEVEL in 4.4 and 4.9
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail on missing symbol table
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Thomas Gleixner <tglx(a)linutronix.de>
futex: Handle faults correctly for PI futexes
Thomas Gleixner <tglx(a)linutronix.de>
futex: Simplify fixup_pi_state_owner()
Thomas Gleixner <tglx(a)linutronix.de>
futex: Use pi_state_update_owner() in put_pi_state()
Thomas Gleixner <tglx(a)linutronix.de>
rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
Thomas Gleixner <tglx(a)linutronix.de>
futex: Provide and use pi_state_update_owner()
Thomas Gleixner <tglx(a)linutronix.de>
futex: Replace pointless printk in fixup_owner()
Peter Zijlstra <peterz(a)infradead.org>
futex: Avoid violating the 10th rule of futex
Peter Zijlstra <peterz(a)infradead.org>
futex: Rework inconsistent rt_mutex/futex_q state
Peter Zijlstra <peterz(a)infradead.org>
futex: Remove rt_mutex_deadlock_account_*()
Peter Zijlstra <peterz(a)infradead.org>
futex,rt_mutex: Provide futex specific rt_mutex API
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
-------------
Diffstat:
Makefile | 12 +-
arch/arm/mach-footbridge/dc21285.c | 12 +-
arch/x86/Makefile | 3 +
arch/x86/include/asm/apic.h | 10 --
arch/x86/include/asm/barrier.h | 18 +++
arch/x86/kernel/apic/apic.c | 4 +
arch/x86/kernel/apic/x2apic_cluster.c | 6 +-
arch/x86/kernel/apic/x2apic_phys.c | 6 +-
drivers/acpi/thermal.c | 55 ++++---
drivers/input/joystick/xpad.c | 17 ++-
drivers/input/serio/i8042-x86ia64io.h | 2 +
drivers/iommu/intel-iommu.c | 6 +
drivers/mmc/core/sdio_cis.c | 6 +
drivers/net/dsa/bcm_sf2.c | 8 +-
drivers/net/ethernet/ibm/ibmvnic.c | 6 +
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 +-
drivers/usb/class/usblp.c | 19 ++-
drivers/usb/dwc2/gadget.c | 8 +-
drivers/usb/gadget/legacy/ether.c | 4 +-
drivers/usb/host/xhci-ring.c | 31 ++--
drivers/usb/serial/cp210x.c | 2 +
drivers/usb/serial/option.c | 6 +
fs/cifs/dir.c | 22 ++-
fs/hugetlbfs/inode.c | 3 +-
include/linux/elfcore.h | 22 +++
include/linux/hugetlb.h | 3 +
kernel/Makefile | 1 -
kernel/elfcore.c | 25 ---
kernel/futex.c | 276 +++++++++++++++++++---------------
kernel/kprobes.c | 4 +
kernel/locking/rtmutex-debug.c | 9 --
kernel/locking/rtmutex-debug.h | 3 -
kernel/locking/rtmutex.c | 127 ++++++++++------
kernel/locking/rtmutex.h | 2 -
kernel/locking/rtmutex_common.h | 12 +-
mm/huge_memory.c | 37 +++--
mm/hugetlb.c | 9 +-
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 +-
net/mac80211/rate.c | 3 +-
net/mac80211/rx.c | 2 +
net/sched/sch_api.c | 3 +-
sound/pci/hda/patch_realtek.c | 2 +-
tools/objtool/elf.c | 7 +-
45 files changed, 521 insertions(+), 320 deletions(-)
This is the start of the stable review cycle for the 4.14.220 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 07 Feb 2021 14:06:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.220-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.220-rc1
Peter Zijlstra <peterz(a)infradead.org>
kthread: Extract KTHREAD_IS_PER_CPU
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail on missing symbol table
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Martin Wilck <mwilck(a)suse.com>
scsi: scsi_transport_srp: Don't block target in failfast state
Peter Zijlstra <peterz(a)infradead.org>
x86: __always_inline __{rd,wr}msr()
Tony Lindgren <tony(a)atomide.com>
phy: cpcap-usb: Fix warning for missing regulator_disable
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core: Extend device_is_dependent()
Benjamin Gaignard <benjamin.gaignard(a)linaro.org>
base: core: Remove WARN_ON from link dependencies check
Eric Dumazet <edumazet(a)google.com>
net_sched: gen_estimator: support large ewma log
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
-------------
Diffstat:
Makefile | 4 +--
arch/x86/include/asm/msr.h | 4 +--
drivers/acpi/thermal.c | 55 +++++++++++++++++++++++++-----------
drivers/base/core.c | 19 +++++++++++--
drivers/net/dsa/bcm_sf2.c | 8 ++++--
drivers/net/ethernet/ibm/ibmvnic.c | 6 ++++
drivers/phy/motorola/phy-cpcap-usb.c | 19 +++++++++----
drivers/scsi/ibmvscsi/ibmvfc.c | 4 ++-
drivers/scsi/libfc/fc_exch.c | 16 +++++++++--
drivers/scsi/scsi_transport_srp.c | 9 +++++-
include/linux/kthread.h | 3 ++
kernel/kthread.c | 27 +++++++++++++++++-
kernel/smpboot.c | 1 +
net/core/gen_estimator.c | 11 +++++---
net/mac80211/rx.c | 2 ++
net/sched/sch_api.c | 3 +-
tools/objtool/elf.c | 7 +++--
17 files changed, 155 insertions(+), 43 deletions(-)
This is the start of the stable review cycle for the 4.14.221 release.
There are 30 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.221-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.221-rc1
DENG Qingfang <dqfext(a)gmail.com>
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Nadav Amit <namit(a)vmware.com>
iommu/vt-d: Do not use flush-queue when caching-mode is on
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Hugh Dickins <hughd(a)google.com>
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
Thorsten Leemhuis <linux(a)leemhuis.info>
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Gustavo A. R. Silva <gustavoars(a)kernel.org>
smb3: Fix out-of-bounds bug in SMB2_negotiate()
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: fix bounce buffer usage for non-sg list case
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Liangyan <liangyan.peng(a)linux.alibaba.com>
ovl: fix dentry leak in ovl_get_redirect
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Wei Wang <weiwan(a)google.com>
ipv4: fix race condition between route lookup and invalidation
Arnd Bergmann <arnd(a)arndb.de>
elfcore: fix building with clang
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Support Clang non-section symbols in ORC generation
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Zyta Szpak <zr(a)semihalf.com>
arm64: dts: ls1046a: fix dcfg address range
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
-------------
Diffstat:
Makefile | 10 ++-----
arch/arm/mach-footbridge/dc21285.c | 12 ++++----
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +-
arch/x86/Makefile | 3 ++
arch/x86/include/asm/apic.h | 10 -------
arch/x86/include/asm/barrier.h | 18 ++++++++++++
arch/x86/kernel/apic/apic.c | 4 +++
arch/x86/kernel/apic/x2apic_cluster.c | 6 ++--
arch/x86/kernel/apic/x2apic_phys.c | 6 ++--
drivers/input/joystick/xpad.c | 17 +++++++++++-
drivers/input/serio/i8042-x86ia64io.h | 2 ++
drivers/iommu/intel-iommu.c | 6 ++++
drivers/mmc/core/sdio_cis.c | 6 ++++
drivers/net/dsa/mv88e6xxx/chip.c | 6 +++-
drivers/nvme/host/pci.c | 2 ++
drivers/usb/class/usblp.c | 19 +++++++------
drivers/usb/dwc2/gadget.c | 8 +-----
drivers/usb/gadget/legacy/ether.c | 4 ++-
drivers/usb/host/xhci-ring.c | 31 +++++++++++++--------
drivers/usb/serial/cp210x.c | 2 ++
drivers/usb/serial/option.c | 6 ++++
fs/cifs/dir.c | 22 +++++++++++++--
fs/cifs/smb2pdu.h | 2 +-
fs/hugetlbfs/inode.c | 3 +-
fs/overlayfs/dir.c | 2 +-
include/linux/elfcore.h | 22 +++++++++++++++
include/linux/hugetlb.h | 3 ++
kernel/Makefile | 1 -
kernel/elfcore.c | 26 ------------------
kernel/kprobes.c | 4 +++
mm/huge_memory.c | 37 +++++++++++++++----------
mm/hugetlb.c | 9 +++---
net/ipv4/route.c | 38 +++++++++++++-------------
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 +++-
net/mac80211/rate.c | 3 +-
tools/objtool/orc_gen.c | 33 +++++++++++++++++-----
37 files changed, 255 insertions(+), 138 deletions(-)
The patch titled
Subject: net: fix iteration for sctp transport seq_files
has been removed from the -mm tree. Its filename was
net-fix-iteration-for-sctp-transport-seq_files.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: NeilBrown <neilb(a)suse.de>
Subject: net: fix iteration for sctp transport seq_files
The sctp transport seq_file iterators take a reference to the transport in
the ->start and ->next functions and release the reference in the ->show
function. The preferred handling for such resources is to release them in
the subsequent ->next or ->stop function call.
Since Commit 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration
code and interface") there is no guarantee that ->show will be called
after ->next, so this function can now leak references.
So move the sctp_transport_put() call to ->next and ->stop.
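The pattern can be modelled in plain C without any kernel APIs. In the sketch below every name (item, iter, iter_start() and so on) is hypothetical, and SEQ_START_TOKEN is not modelled. The point it illustrates is that when the reference taken for an entry is dropped by the following next or stop call, the counts balance even if show is never invoked.

```c
#include <assert.h>
#include <stddef.h>

struct item { int refs; };

struct iter {
    struct item *items;
    size_t count;
    size_t pos;
};

/* Take a reference on the current entry, or return NULL at the end. */
static struct item *iter_get(struct iter *it)
{
    if (it->pos >= it->count)
        return NULL;
    it->items[it->pos].refs++;
    return &it->items[it->pos];
}

static struct item *iter_start(struct iter *it)
{
    it->pos = 0;
    return iter_get(it);
}

/* Drop the reference on the previous entry, then advance. */
static struct item *iter_next(struct iter *it, struct item *v)
{
    if (v)
        v->refs--;
    it->pos++;
    return iter_get(it);
}

/* Drop the reference on whatever entry the walk stopped at. */
static void iter_stop(struct item *v)
{
    if (v)
        v->refs--;
}

/*
 * Walk at most 'limit' entries without ever calling a show function,
 * then return the sum of the reference counts left behind.
 */
static int walk_and_count_leaks(struct item *items, size_t count,
                                size_t limit)
{
    struct iter it = { items, count, 0 };
    struct item *v = iter_start(&it);
    size_t seen = 0, i;
    int leaked = 0;

    assert(items != NULL);

    while (v && ++seen < limit)
        v = iter_next(&it, v);
    iter_stop(v);

    for (i = 0; i < count; i++)
        leaked += items[i].refs;
    return leaked;
}
```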
Link: https://lkml.kernel.org/r/161248539022.21478.17038123892954492263.stgit@nob…
Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface")
Signed-off-by: NeilBrown <neilb(a)suse.de>
Reported-by: Xin Long <lucien.xin(a)gmail.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Vlad Yasevich <vyasevich(a)gmail.com>
Cc: Neil Horman <nhorman(a)tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
net/sctp/proc.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
--- a/net/sctp/proc.c~net-fix-iteration-for-sctp-transport-seq_files
+++ a/net/sctp/proc.c
@@ -215,6 +215,12 @@ static void sctp_transport_seq_stop(stru
{
struct sctp_ht_iter *iter = seq->private;
+ if (v && v != SEQ_START_TOKEN) {
+ struct sctp_transport *transport = v;
+
+ sctp_transport_put(transport);
+ }
+
sctp_transport_walk_stop(&iter->hti);
}
@@ -222,6 +228,12 @@ static void *sctp_transport_seq_next(str
{
struct sctp_ht_iter *iter = seq->private;
+ if (v && v != SEQ_START_TOKEN) {
+ struct sctp_transport *transport = v;
+
+ sctp_transport_put(transport);
+ }
+
++*pos;
return sctp_transport_get_next(seq_file_net(seq), &iter->hti);
@@ -277,8 +289,6 @@ static int sctp_assocs_seq_show(struct s
sk->sk_rcvbuf);
seq_printf(seq, "\n");
- sctp_transport_put(transport);
-
return 0;
}
@@ -354,8 +364,6 @@ static int sctp_remaddr_seq_show(struct
seq_printf(seq, "\n");
}
- sctp_transport_put(transport);
-
return 0;
}
_
Patches currently in -mm which might be from neilb(a)suse.de are
seq_file-document-how-per-entry-resources-are-managed.patch
x86-fix-seq_file-iteration-for-pat-memtypec.patch
This is a note to let you know that I've just added the patch titled
usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 4b049f55ed95cd889bcdb3034fd75e1f01852b38 Mon Sep 17 00:00:00 2001
From: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Date: Mon, 8 Feb 2021 13:53:16 -0800
Subject: usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
The dep->interval captures the number of frames/microframes per interval
from bInterval. Fullspeed interrupt endpoint bInterval is the number of
frames per interval and not 2^(bInterval - 1). So fix it here. This
change is only for debugging purposes and should not affect the interrupt
endpoint operation.
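The distinction can be made concrete with a small standalone function. ep_interval() and its boolean parameters are hypothetical names for this sketch; the upper bound of 16 on the exponent form is an assumption taken from the USB 2.0 bInterval range for high-speed and isochronous endpoints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of frames/microframes per interval.
 * Full-speed interrupt endpoints give the frame count directly;
 * other periodic endpoints encode it as 2^(bInterval - 1).
 */
static unsigned int ep_interval(uint8_t bInterval, bool is_interrupt,
                                bool is_fullspeed)
{
    assert(bInterval >= 1);

    if (is_interrupt && is_fullspeed)
        return bInterval;

    /* Assumed valid range for the exponent encoding: 1..16. */
    assert(bInterval <= 16);
    return 1u << (bInterval - 1);
}
```

For example, bInterval = 8 means 8 frames on a full-speed interrupt endpoint but 128 microframes on a high-speed one.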
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/1263b563dedc4ab8b0fb854fba06ce4bc56bd495.16128209…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc3/gadget.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index d0f8d3ec855f..aebcf8ec0716 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -615,8 +615,13 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
if (dwc->gadget->speed == USB_SPEED_FULL)
bInterval_m1 = 0;
+ if (usb_endpoint_type(desc) == USB_ENDPOINT_XFER_INT &&
+ dwc->gadget->speed == USB_SPEED_FULL)
+ dep->interval = desc->bInterval;
+ else
+ dep->interval = 1 << (desc->bInterval - 1);
+
params.param1 |= DWC3_DEPCFG_BINTERVAL_M1(bInterval_m1);
- dep->interval = 1 << (desc->bInterval - 1);
}
return dwc3_send_gadget_ep_cmd(dep, DWC3_DEPCMD_SETEPCONFIG, &params);
--
2.30.0
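For readers following the arithmetic, the interval selection in the patch above can be modelled in user space. This is only a sketch: `enum model_speed`, `enum model_xfer` and `model_ep_interval()` are hypothetical stand-ins, not dwc3 symbols; only the two formulas are taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the speed and transfer-type checks. */
enum model_speed { MODEL_SPEED_FULL, MODEL_SPEED_HIGH };
enum model_xfer { MODEL_XFER_ISOC, MODEL_XFER_INT };

/*
 * Model of the dep->interval selection: a full-speed interrupt endpoint
 * uses bInterval directly (a frame count); every other case uses
 * 2^(bInterval - 1). The shift path assumes 1 <= bInterval <= 16,
 * and bInterval is never 0 here, mirroring the driver's
 * "if (desc->bInterval)" guard.
 */
static uint32_t model_ep_interval(enum model_speed speed,
				  enum model_xfer type, uint8_t bInterval)
{
	if (type == MODEL_XFER_INT && speed == MODEL_SPEED_FULL)
		return bInterval;
	return 1u << (bInterval - 1);
}
```

With bInterval = 10, the full-speed interrupt case yields 10 frames, where the old formula would have reported 512.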
This is a note to let you know that I've just added the patch titled
usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From a1679af85b2ae35a2b78ad04c18bb069c37330cc Mon Sep 17 00:00:00 2001
From: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Date: Mon, 8 Feb 2021 13:53:10 -0800
Subject: usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
Valid range for DEPCFG.bInterval_m1 is from 0 to 13, and it must be set
to 0 when the controller operates in full-speed. See the programming
guide for DEPCFG command section 3.2.2.1 (v3.30a).
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/3f57026f993c0ce71498dbb06e49b3a47c4d0265.16128209…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc3/gadget.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 97d707b4f384..d0f8d3ec855f 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -605,7 +605,17 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
params.param0 |= DWC3_DEPCFG_FIFO_NUMBER(dep->number >> 1);
if (desc->bInterval) {
- params.param1 |= DWC3_DEPCFG_BINTERVAL_M1(desc->bInterval - 1);
+ u8 bInterval_m1;
+
+ /*
+ * Valid range for DEPCFG.bInterval_m1 is from 0 to 13, and it
+ * must be set to 0 when the controller operates in full-speed.
+ */
+ bInterval_m1 = min_t(u8, desc->bInterval - 1, 13);
+ if (dwc->gadget->speed == USB_SPEED_FULL)
+ bInterval_m1 = 0;
+
+ params.param1 |= DWC3_DEPCFG_BINTERVAL_M1(bInterval_m1);
dep->interval = 1 << (desc->bInterval - 1);
}
--
2.30.0
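The clamping rule from the patch above can be checked in isolation with a small user-space model. This is a sketch, not driver code: `model_binterval_m1()` and its `full_speed` flag are hypothetical names; the 0..13 range and the full-speed rule come from the changelog.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the DEPCFG.bInterval_m1 computation: clamp bInterval - 1 to
 * the 0..13 range and force 0 when operating in full-speed.
 * bInterval must be non-zero, as in the driver's guard.
 */
static uint8_t model_binterval_m1(uint8_t bInterval, bool full_speed)
{
	uint8_t m1 = (uint8_t)(bInterval - 1);

	if (m1 > 13)
		m1 = 13;
	if (full_speed)
		m1 = 0;
	return m1;
}
```

A descriptor with bInterval = 16 would previously have programmed 15 into the field; the model returns 13.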
This is a note to let you know that I've just added the patch titled
drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 2fd10bcf0310b9525b2af9e1f7aa9ddd87c3772e Mon Sep 17 00:00:00 2001
From: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Date: Tue, 9 Feb 2021 16:26:12 +0600
Subject: drivers/misc/vmw_vmci: restrict too big queue size in
qp_host_alloc_queue
syzbot found a WARNING in qp_broker_alloc [1], triggered in qp_host_alloc_queue()
when num_pages is 0x100001, giving queue_size + queue_page_size
bigger than KMALLOC_MAX_SIZE for kzalloc(), resulting in an order >= MAX_ORDER
condition.
queue_size + queue_page_size=0x8000d8, where KMALLOC_MAX_SIZE=0x400000.
[1]
Call Trace:
alloc_pages include/linux/gfp.h:547 [inline]
kmalloc_order+0x40/0x130 mm/slab_common.c:837
kmalloc_order_trace+0x15/0x70 mm/slab_common.c:853
kmalloc_large include/linux/slab.h:481 [inline]
__kmalloc+0x257/0x330 mm/slub.c:3959
kmalloc include/linux/slab.h:557 [inline]
kzalloc include/linux/slab.h:682 [inline]
qp_host_alloc_queue drivers/misc/vmw_vmci/vmci_queue_pair.c:540 [inline]
qp_broker_create drivers/misc/vmw_vmci/vmci_queue_pair.c:1351 [inline]
qp_broker_alloc+0x936/0x2740 drivers/misc/vmw_vmci/vmci_queue_pair.c:1739
Reported-by: syzbot+15ec7391f3d6a1a7cc7d(a)syzkaller.appspotmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Link: https://lore.kernel.org/r/20210209102612.2112247-1-snovitoll@gmail.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/vmw_vmci/vmci_queue_pair.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index d787ddecee77..880c33ab9f47 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -539,6 +539,9 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
queue_page_size = num_pages * sizeof(*queue->kernel_if->u.h.page);
+ if (queue_size + queue_page_size > KMALLOC_MAX_SIZE)
+ return NULL;
+
queue = kzalloc(queue_size + queue_page_size, GFP_KERNEL);
if (queue) {
queue->q_header = NULL;
--
2.30.0
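The numbers in the report above can be reproduced with a small user-space model of the added guard. This is a sketch under stated assumptions: `model_alloc_size_ok()` is a hypothetical helper, the limit is the value quoted in the changelog (the real KMALLOC_MAX_SIZE depends on the kernel configuration), and the 0xd0 header size used in the usage note is derived from the reported total assuming 8-byte page pointers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit quoted in the changelog; configuration dependent in a real kernel. */
#define MODEL_KMALLOC_MAX_SIZE 0x400000ULL

/*
 * Model of the added check: reject the request when the queue header plus
 * the page-pointer array would exceed the largest kmalloc() allocation.
 * queue_size and elem_size are parameters because the real values come
 * from driver structures.
 */
static bool model_alloc_size_ok(uint64_t queue_size, uint64_t num_pages,
				uint64_t elem_size)
{
	uint64_t queue_page_size = num_pages * elem_size;

	return queue_size + queue_page_size <= MODEL_KMALLOC_MAX_SIZE;
}
```

For the reported case, 0x100001 pages at 8 bytes each give 0x800008 bytes, and adding 0xd0 yields the 0x8000d8 total from the changelog, which the guard rejects.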
This is a note to let you know that I've just added the patch titled
mei: bus: block send with vtag on non-conformat FW
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From b398d53cd421454d64850f8b1f6d609ede9042d9 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Mon, 8 Feb 2021 17:06:48 +0200
Subject: mei: bus: block send with vtag on non-conformat FW
Block data send with vtag if either the transport layer or
the FW client does not support vtags.
Cc: <stable(a)vger.kernel.org> # v5.10+
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20210208150649.141358-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/mei/bus.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 580074e32599..935acc6bbf3c 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -61,6 +61,13 @@ ssize_t __mei_cl_send(struct mei_cl *cl, u8 *buf, size_t length, u8 vtag,
goto out;
}
+ if (vtag) {
+ /* Check if vtag is supported by client */
+ rets = mei_cl_vt_support_check(cl);
+ if (rets)
+ goto out;
+ }
+
if (length > mei_cl_mtu(cl)) {
rets = -EFBIG;
goto out;
--
2.30.0
Hi Christoph, Greg,
Currently we are observing an incorrect address translation
in the DMA direct mapping methods on the 5.4 stable kernel while
sharing a dmabuf from one device to another where both devices have
their own coherent DMA memory pools.
I was able to root-cause this issue: it is caused by an incorrect virt
to phys translation for addresses belonging to vmalloc space using
virt_to_page(). Looking at the mainline kernel, patch [1] changes the
address translation from virt->to->phys to dma->to->phys, which fixes
the issue observed on the 5.4 stable kernel as well (minimal fix [2]).
So I would like to ask for your suggestion on the backport to stable
kernels (5.4 or earlier): should we backport the complete mainline
commit [1], or should we just apply the minimal fix [2]?
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
[2] minimal fix required for 5.4 stable kernel:
commit bb0b3ff6e54d78370b6b0c04426f0d9192f31795
Author: Sumit Garg <sumit.garg(a)linaro.org>
Date: Wed Feb 3 13:08:37 2021 +0530
dma-mapping: Fix common get_sgtable and mmap methods
Currently the common get_sgtable and mmap methods can only handle normal
kernel addresses, leading to incorrect handling of vmalloc addresses, which
are a common means of DMA coherent memory mapping.
So instead of cpu_addr, directly decode the physical address from dma_addr
and hence decode the corresponding page and pfn values. In this way we can
handle normal kernel addresses as well as vmalloc addresses.
This fix is inspired by the following mainline commit:
34dc0ea6bc96 ("dma-direct: provide mmap and get_sgtable method overrides")
This fixes an issue observed during dmabuf sharing from one device to
another where both devices have their own coherent DMA memory pools.
Signed-off-by: Sumit Garg <sumit.garg(a)linaro.org>
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 8682a53..034bbae 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -127,7 +127,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
return -ENXIO;
page = pfn_to_page(pfn);
} else {
- page = virt_to_page(cpu_addr);
+ page = pfn_to_page(PHYS_PFN(dma_to_phys(dev, dma_addr)));
}
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
@@ -214,7 +214,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
if (!pfn_valid(pfn))
return -ENXIO;
} else {
- pfn = page_to_pfn(virt_to_page(cpu_addr));
+ pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
}
return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: 0fcc7c20d2e2a65fb5b80d42841084e8509d085d
Gitweb: https://git.kernel.org/tip/0fcc7c20d2e2a65fb5b80d42841084e8509d085d
Author: Mikael Beckius <mikael.beckius(a)windriver.com>
AuthorDate: Thu, 28 Jan 2021 15:02:08 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Tue, 09 Feb 2021 16:18:42 +01:00
hrtimer: Update softirq_expires_next correctly in hrtimer_force_reprogram()
hrtimer_force_reprogram() invokes __hrtimer_get_next_event() to find the
earliest expiry time of all hrtimer bases. __hrtimer_get_next_event() does
not update cpu_base::[softirq_]expires_next. That needs to be done at the
callsites.
hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context.
That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.
cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse is that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.
Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases separately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields.
[ tglx: Modified it to avoid the double evaluation ]
Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers")
Signed-off-by: Mikael Beckius <mikael.beckius(a)windriver.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210128140208.30264-1-mikael.beckius@windriver.c…
---
kernel/time/hrtimer.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 743c852..88a0145 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -626,24 +626,30 @@ static inline int hrtimer_hres_active(void)
static void
hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
{
- ktime_t expires_next;
+ ktime_t expires_next, soft = KTIME_MAX;
/*
- * Find the current next expiration time.
+ * If the soft interrupt has already been activated, ignore the
+ * soft bases. They will be handled in the already raised soft
+ * interrupt.
*/
- expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
-
- if (cpu_base->next_timer && cpu_base->next_timer->is_soft) {
+ if (!cpu_base->softirq_activated) {
+ soft = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT);
/*
- * When the softirq is activated, hrtimer has to be
- * programmed with the first hard hrtimer because soft
- * timer interrupt could occur too late.
+ * Update the soft expiry time. clock_settime() might have
+ * affected it.
*/
- if (cpu_base->softirq_activated)
- expires_next = __hrtimer_get_next_event(cpu_base,
- HRTIMER_ACTIVE_HARD);
- else
- cpu_base->softirq_expires_next = expires_next;
+ cpu_base->softirq_expires_next = soft;
+ }
+
+ expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_HARD);
+ /*
+ * If a softirq timer is expiring first, update cpu_base->next_timer
+ * and program the hardware with the soft expiry time.
+ */
+ if (expires_next > soft) {
+ cpu_base->next_timer = cpu_base->softirq_next_timer;
+ expires_next = soft;
}
if (skip_equal && expires_next == cpu_base->expires_next)
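The reworked selection logic described in the changelog above can be modelled in user space as follows. This is a simplified sketch, not the kernel implementation: `struct model_cpu_base`, `model_force_reprogram()` and the `next_is_soft` flag are hypothetical, and the two expiry times are passed in as arguments instead of being computed by __hrtimer_get_next_event().

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_KTIME_MAX INT64_MAX

/* Hypothetical reduction of the per-CPU state touched by the patch. */
struct model_cpu_base {
	bool softirq_activated;
	int64_t softirq_expires_next;
	bool next_is_soft;	/* stands in for cpu_base->next_timer */
};

/*
 * Evaluate soft and hard bases separately. The cached soft expiry is
 * refreshed whenever the softirq is not already raised, even if a hard
 * timer expires first, so it cannot go stale.
 */
static int64_t model_force_reprogram(struct model_cpu_base *b,
				     int64_t next_soft, int64_t next_hard)
{
	int64_t soft = MODEL_KTIME_MAX;
	int64_t expires_next = next_hard;

	if (!b->softirq_activated) {
		soft = next_soft;
		b->softirq_expires_next = soft;
	}

	b->next_is_soft = false;
	if (expires_next > soft) {
		/* A soft timer expires first: program the hardware for it. */
		b->next_is_soft = true;
		expires_next = soft;
	}
	return expires_next;
}
```

The second case in the changelog, where a hard timer expires first, is the one that previously left softirq_expires_next stale; in the model the cached value is updated on that path as well.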
Commit 1e35918ad9d1 ("MIPS: Enable Undefined Behavior Sanitizer
UBSAN") added a possibility to build the entire kernel with UBSAN
instrumentation for MIPS, with the exception for VDSO.
However, the self-extracting head wasn't added to the exceptions, so
this occurs:
mips-alpine-linux-musl-ld: arch/mips/boot/compressed/decompress.o:
in function `FSE_buildDTable_wksp':
decompress.c:(.text.FSE_buildDTable_wksp+0x278): undefined reference
to `__ubsan_handle_shift_out_of_bounds'
mips-alpine-linux-musl-ld: decompress.c:(.text.FSE_buildDTable_wksp+0x2a8):
undefined reference to `__ubsan_handle_shift_out_of_bounds'
mips-alpine-linux-musl-ld: decompress.c:(.text.FSE_buildDTable_wksp+0x2c4):
undefined reference to `__ubsan_handle_shift_out_of_bounds'
mips-alpine-linux-musl-ld: arch/mips/boot/compressed/decompress.o:
decompress.c:(.text.FSE_buildDTable_raw+0x9c): more undefined references
to `__ubsan_handle_shift_out_of_bounds' follow
Add UBSAN_SANITIZE := n to mips/boot/compressed/Makefile to exclude
it from instrumentation scope and fix this issue.
Fixes: 1e35918ad9d1 ("MIPS: Enable Undefined Behavior Sanitizer UBSAN")
Cc: stable(a)vger.kernel.org # 5.0+
Signed-off-by: Alexander Lobakin <alobakin(a)pm.me>
---
arch/mips/boot/compressed/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/mips/boot/compressed/Makefile b/arch/mips/boot/compressed/Makefile
index 47cd9dc7454a..f93f72bcba97 100644
--- a/arch/mips/boot/compressed/Makefile
+++ b/arch/mips/boot/compressed/Makefile
@@ -37,6 +37,7 @@ KBUILD_AFLAGS := $(KBUILD_AFLAGS) -D__ASSEMBLY__ \
# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
KCOV_INSTRUMENT := n
GCOV_PROFILE := n
+UBSAN_SANITIZE := n
# decompressor objects (linked with vmlinuz)
vmlinuzobjs-y := $(obj)/head.o $(obj)/decompress.o $(obj)/string.o
--
2.30.0
This is a note to let you know that I've just added the patch titled
drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 2fd10bcf0310b9525b2af9e1f7aa9ddd87c3772e Mon Sep 17 00:00:00 2001
From: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Date: Tue, 9 Feb 2021 16:26:12 +0600
Subject: drivers/misc/vmw_vmci: restrict too big queue size in
qp_host_alloc_queue
syzbot found a WARNING in qp_broker_alloc [1], triggered in qp_host_alloc_queue()
when num_pages is 0x100001, giving queue_size + queue_page_size
bigger than KMALLOC_MAX_SIZE for kzalloc(), resulting in an order >= MAX_ORDER
condition.
queue_size + queue_page_size=0x8000d8, where KMALLOC_MAX_SIZE=0x400000.
[1]
Call Trace:
alloc_pages include/linux/gfp.h:547 [inline]
kmalloc_order+0x40/0x130 mm/slab_common.c:837
kmalloc_order_trace+0x15/0x70 mm/slab_common.c:853
kmalloc_large include/linux/slab.h:481 [inline]
__kmalloc+0x257/0x330 mm/slub.c:3959
kmalloc include/linux/slab.h:557 [inline]
kzalloc include/linux/slab.h:682 [inline]
qp_host_alloc_queue drivers/misc/vmw_vmci/vmci_queue_pair.c:540 [inline]
qp_broker_create drivers/misc/vmw_vmci/vmci_queue_pair.c:1351 [inline]
qp_broker_alloc+0x936/0x2740 drivers/misc/vmw_vmci/vmci_queue_pair.c:1739
Reported-by: syzbot+15ec7391f3d6a1a7cc7d(a)syzkaller.appspotmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll(a)gmail.com>
Link: https://lore.kernel.org/r/20210209102612.2112247-1-snovitoll@gmail.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/vmw_vmci/vmci_queue_pair.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index d787ddecee77..880c33ab9f47 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -539,6 +539,9 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
queue_page_size = num_pages * sizeof(*queue->kernel_if->u.h.page);
+ if (queue_size + queue_page_size > KMALLOC_MAX_SIZE)
+ return NULL;
+
queue = kzalloc(queue_size + queue_page_size, GFP_KERNEL);
if (queue) {
queue->q_header = NULL;
--
2.30.0
From: Richard Gong <richard.gong(a)intel.com>
Clean up COMMAND_RECONFIG_FLAG_PARTIAL flag by resetting it to 0, which
aligns with the firmware settings.
Cc: <stable(a)vger.kernel.org> # 5.9+
Fixes: 36847f9e3e56 ("firmware: correct reconfig flag and timeout values")
Signed-off-by: Richard Gong <richard.gong(a)intel.com>
---
v2: add tag Cc: <stable(a)vger.kernel.org> # 5.9+
add 'Fixes: ... ' line in the comment
---
include/linux/firmware/intel/stratix10-svc-client.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/firmware/intel/stratix10-svc-client.h b/include/linux/firmware/intel/stratix10-svc-client.h
index a93d859..f843c6a 100644
--- a/include/linux/firmware/intel/stratix10-svc-client.h
+++ b/include/linux/firmware/intel/stratix10-svc-client.h
@@ -56,7 +56,7 @@
* COMMAND_RECONFIG_FLAG_PARTIAL:
* Set to FPGA configuration type (full or partial).
*/
-#define COMMAND_RECONFIG_FLAG_PARTIAL 1
+#define COMMAND_RECONFIG_FLAG_PARTIAL 0
/**
* Timeout settings for service clients:
--
2.7.4
Currently we use bitmap_alloc() to allocate the MSI bitmap, which should be
initialized with zero. This is obviously wrong, but it works because MSI
can fall back to legacy interrupt mode. So use bitmap_zalloc() instead.
Fixes: 632dcc2c75ef6de3272aa ("irqchip: Add Loongson PCH MSI controller")
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
drivers/irqchip/irq-loongson-pch-msi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-loongson-pch-msi.c b/drivers/irqchip/irq-loongson-pch-msi.c
index 12aeeab43289..32562b7e681b 100644
--- a/drivers/irqchip/irq-loongson-pch-msi.c
+++ b/drivers/irqchip/irq-loongson-pch-msi.c
@@ -225,7 +225,7 @@ static int pch_msi_init(struct device_node *node,
goto err_priv;
}
- priv->msi_map = bitmap_alloc(priv->num_irqs, GFP_KERNEL);
+ priv->msi_map = bitmap_zalloc(priv->num_irqs, GFP_KERNEL);
if (!priv->msi_map) {
ret = -ENOMEM;
goto err_priv;
--
2.27.0
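The difference the patch above relies on can be illustrated with a user-space analogue. This is a sketch only: `model_bitmap_zalloc()` and `model_test_bit()` are hypothetical helpers built on calloc(), which zero-fills memory, whereas malloc() (the analogue of bitmap_alloc()) returns memory with indeterminate contents.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Allocate a bitmap of nbits bits with every bit cleared. calloc()
 * guarantees zero-filled memory, so no bit appears allocated by accident.
 * Returns NULL on allocation failure; the caller frees with free().
 */
static unsigned long *model_bitmap_zalloc(size_t nbits)
{
	size_t nlongs = (nbits + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;

	return calloc(nlongs, sizeof(unsigned long));
}

/* Return 1 if the given bit is set in the bitmap, 0 otherwise. */
static int model_test_bit(const unsigned long *map, size_t bit)
{
	return (int)((map[bit / MODEL_BITS_PER_LONG] >>
		      (bit % MODEL_BITS_PER_LONG)) & 1UL);
}
```

An allocator that searches such a map for free bits only behaves correctly if it starts from the all-clear state.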
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
We don't have a persistent fb holding a reference to the frontbuffer
object, so every time we do the get+put we throw the frontbuffer object
immediately away. And so the next time around we get a pristine
frontbuffer object with bits==0 even for the old vma. This confuses
the frontbuffer tracking code which understandably expects the old
frontbuffer to have the overlay's bit set.
Fix this by hanging on to the frontbuffer reference until the next
flip. And just to make this a bit more clear let's track the frontbuffer
explicitly instead of just grabbing it via the old vma.
Cc: stable(a)vger.kernel.org
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1136
Fixes: da42104f589d ("drm/i915: Hold reference to intel_frontbuffer as we track activity")
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_overlay.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c
index 9c0113f15b58..ef8f44f5e751 100644
--- a/drivers/gpu/drm/i915/display/intel_overlay.c
+++ b/drivers/gpu/drm/i915/display/intel_overlay.c
@@ -183,6 +183,7 @@ struct intel_overlay {
struct intel_crtc *crtc;
struct i915_vma *vma;
struct i915_vma *old_vma;
+ struct intel_frontbuffer *frontbuffer;
bool active;
bool pfit_active;
u32 pfit_vscale_ratio; /* shifted-point number, (1<<12) == 1.0 */
@@ -283,21 +284,19 @@ static void intel_overlay_flip_prepare(struct intel_overlay *overlay,
struct i915_vma *vma)
{
enum pipe pipe = overlay->crtc->pipe;
- struct intel_frontbuffer *from = NULL, *to = NULL;
+ struct intel_frontbuffer *frontbuffer = NULL;
drm_WARN_ON(&overlay->i915->drm, overlay->old_vma);
- if (overlay->vma)
- from = intel_frontbuffer_get(overlay->vma->obj);
if (vma)
- to = intel_frontbuffer_get(vma->obj);
+ frontbuffer = intel_frontbuffer_get(vma->obj);
- intel_frontbuffer_track(from, to, INTEL_FRONTBUFFER_OVERLAY(pipe));
+ intel_frontbuffer_track(overlay->frontbuffer, frontbuffer,
+ INTEL_FRONTBUFFER_OVERLAY(pipe));
- if (to)
- intel_frontbuffer_put(to);
- if (from)
- intel_frontbuffer_put(from);
+ if (overlay->frontbuffer)
+ intel_frontbuffer_put(overlay->frontbuffer);
+ overlay->frontbuffer = frontbuffer;
intel_frontbuffer_flip_prepare(overlay->i915,
INTEL_FRONTBUFFER_OVERLAY(pipe));
--
2.26.2
This is a note to let you know that I've just added the patch titled
usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From a1679af85b2ae35a2b78ad04c18bb069c37330cc Mon Sep 17 00:00:00 2001
From: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Date: Mon, 8 Feb 2021 13:53:10 -0800
Subject: usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
Valid range for DEPCFG.bInterval_m1 is from 0 to 13, and it must be set
to 0 when the controller operates in full-speed. See the programming
guide for DEPCFG command section 3.2.2.1 (v3.30a).
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/3f57026f993c0ce71498dbb06e49b3a47c4d0265.16128209…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc3/gadget.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 97d707b4f384..d0f8d3ec855f 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -605,7 +605,17 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
params.param0 |= DWC3_DEPCFG_FIFO_NUMBER(dep->number >> 1);
if (desc->bInterval) {
- params.param1 |= DWC3_DEPCFG_BINTERVAL_M1(desc->bInterval - 1);
+ u8 bInterval_m1;
+
+ /*
+ * Valid range for DEPCFG.bInterval_m1 is from 0 to 13, and it
+ * must be set to 0 when the controller operates in full-speed.
+ */
+ bInterval_m1 = min_t(u8, desc->bInterval - 1, 13);
+ if (dwc->gadget->speed == USB_SPEED_FULL)
+ bInterval_m1 = 0;
+
+ params.param1 |= DWC3_DEPCFG_BINTERVAL_M1(bInterval_m1);
dep->interval = 1 << (desc->bInterval - 1);
}
--
2.30.0
This is a note to let you know that I've just added the patch titled
usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 4b049f55ed95cd889bcdb3034fd75e1f01852b38 Mon Sep 17 00:00:00 2001
From: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Date: Mon, 8 Feb 2021 13:53:16 -0800
Subject: usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
The dep->interval captures the number of frames/microframes per interval
from bInterval. Fullspeed interrupt endpoint bInterval is the number of
frames per interval and not 2^(bInterval - 1). So fix it here. This
change is only for debugging purposes and should not affect the interrupt
endpoint operation.
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/1263b563dedc4ab8b0fb854fba06ce4bc56bd495.16128209…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc3/gadget.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index d0f8d3ec855f..aebcf8ec0716 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -615,8 +615,13 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
if (dwc->gadget->speed == USB_SPEED_FULL)
bInterval_m1 = 0;
+ if (usb_endpoint_type(desc) == USB_ENDPOINT_XFER_INT &&
+ dwc->gadget->speed == USB_SPEED_FULL)
+ dep->interval = desc->bInterval;
+ else
+ dep->interval = 1 << (desc->bInterval - 1);
+
params.param1 |= DWC3_DEPCFG_BINTERVAL_M1(bInterval_m1);
- dep->interval = 1 << (desc->bInterval - 1);
}
return dwc3_send_gadget_ep_cmd(dep, DWC3_DEPCMD_SETEPCONFIG, &params);
--
2.30.0
This is a note to let you know that I've just added the patch titled
mei: bus: block send with vtag on non-conformat FW
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From b398d53cd421454d64850f8b1f6d609ede9042d9 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Mon, 8 Feb 2021 17:06:48 +0200
Subject: mei: bus: block send with vtag on non-conformat FW
Block data send with vtag if either the transport layer or
the FW client does not support vtags.
Cc: <stable(a)vger.kernel.org> # v5.10+
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20210208150649.141358-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/mei/bus.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 580074e32599..935acc6bbf3c 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -61,6 +61,13 @@ ssize_t __mei_cl_send(struct mei_cl *cl, u8 *buf, size_t length, u8 vtag,
goto out;
}
+ if (vtag) {
+ /* Check if vtag is supported by client */
+ rets = mei_cl_vt_support_check(cl);
+ if (rets)
+ goto out;
+ }
+
if (length > mei_cl_mtu(cl)) {
rets = -EFBIG;
goto out;
--
2.30.0
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode
numbers and display "inode64" in the mount options, whereas
passing "inode64" in the mount options will fail. This leads to
erroneous behaviours such as this:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Cc: stable(a)vger.kernel.org # v5.9+
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/Kconfig b/fs/Kconfig
index 3347ec7bd837..da524c4d7b7e 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT && !S390
+ depends on TMPFS && 64BIT && !(S390 || ALPHA)
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
--
2.29.2
The patch titled
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
has been added to the -mm tree. Its filename is
tmpfs-disallow-config_tmpfs_inode64-on-alpha.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/tmpfs-disallow-config_tmpfs_inode…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/tmpfs-disallow-config_tmpfs_inode…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t. With
CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, whereas passing "inode64" in the
mount options will fail. This leads to erroneous behaviours such as this:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.
Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Richard Henderson <rth(a)twiddle.net>
Cc: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru>
Cc: Matt Turner <mattst88(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-alpha
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT && !S390
+ depends on TMPFS && 64BIT && !(S390 || ALPHA)
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
Patches currently in -mm which might be from seth.forshee(a)canonical.com are
tmpfs-disallow-config_tmpfs_inode64-on-s390.patch
tmpfs-disallow-config_tmpfs_inode64-on-alpha.patch
Currently, when handling the SPMI summary interrupt, the hw_irq
number is calculated based on SID, Peripheral ID, IRQ index and
APID. This is then passed to irq_find_mapping() to see if a
mapping exists for this hw_irq and if available, invoke the
interrupt handler. Since the IRQ index uses an "int" type, the
computed value is a signed int. When SID has its MSB set to 1, that
value is negative, and converting it to hw_irq (an unsigned long)
sign-extends it into a very large number. Because of this,
irq_find_mapping() returns 0 as there is no mapping
for this hw_irq. This ends up invoking cleanup_irq() as if
the interrupt is spurious whereas it is actually a valid
interrupt. Fix this by using the proper data type (u32) for id.
Cc: stable(a)vger.kernel.org
Signed-off-by: Subbaraman Narayanamurthy <subbaram(a)codeaurora.org>
---
drivers/spmi/spmi-pmic-arb.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/spmi/spmi-pmic-arb.c b/drivers/spmi/spmi-pmic-arb.c
index de844b4..bbbd311 100644
--- a/drivers/spmi/spmi-pmic-arb.c
+++ b/drivers/spmi/spmi-pmic-arb.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
- * Copyright (c) 2012-2015, 2017, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2012-2015, 2017, 2021, The Linux Foundation. All rights reserved.
*/
#include <linux/bitmap.h>
#include <linux/delay.h>
@@ -505,8 +505,7 @@ static void cleanup_irq(struct spmi_pmic_arb *pmic_arb, u16 apid, int id)
static void periph_interrupt(struct spmi_pmic_arb *pmic_arb, u16 apid)
{
unsigned int irq;
- u32 status;
- int id;
+ u32 status, id;
u8 sid = (pmic_arb->apid_data[apid].ppid >> 8) & 0xF;
u8 per = pmic_arb->apid_data[apid].ppid & 0xFF;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
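The type-conversion problem described in the SPMI message above can be reproduced in user space. The following is a minimal sketch, not the driver's code: the bit layout in pack_hwirq() and all function names are assumptions made for illustration. It only shows how the same 32 bits widen differently depending on signedness.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical bit layout (an assumption, not copied from the driver):
 * SID in bits 31:28, peripheral in 27:20, IRQ index in 18:16, APID in 9:0.
 */
static uint32_t pack_hwirq(uint8_t sid, uint8_t per, uint32_t id, uint16_t apid)
{
    return ((uint32_t)(sid & 0xF) << 28) |
           ((uint32_t)per << 20) |
           ((id & 0x7) << 16) |
           ((uint32_t)apid & 0x3FF);
}

/* Fixed behaviour: the packed value stays unsigned and is zero-extended. */
static uint64_t widen_unsigned(uint32_t packed)
{
    return (uint64_t)packed;
}

/* Buggy behaviour: the same bits treated as a signed int are sign-extended. */
static uint64_t widen_signed(uint32_t packed)
{
    int32_t as_int;

    /* Portable reinterpretation of the bit pattern as two's complement. */
    if (packed & 0x80000000u)
        as_int = (int32_t)(packed - 0x80000000u) + INT32_MIN;
    else
        as_int = (int32_t)packed;
    return (uint64_t)(int64_t)as_int;
}

/* Self-check that a caller's main() could run. */
void spmi_hwirq_demo(void)
{
    uint32_t packed = pack_hwirq(0x8, 0x01, 2, 5); /* SID with MSB set */

    assert(widen_unsigned(packed) == 0x80120005ULL);
    assert(widen_signed(packed) == 0xFFFFFFFF80120005ULL);
}
```

With SID values below 0x8 both helpers return the same number, which may be why the problem only shows up for some slave IDs.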
The dwc3 driver did not account for operating in fullspeed when setting
DEPCFG.bInterval_m1. This series fixes it.
Note that for some bInterval, some IP versions may not exhibit invalid behavior
from the invalid DEPCFG.bInterval_m1 setting, which may mask this issue.
Thinh Nguyen (2):
usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
drivers/usb/dwc3/gadget.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
base-commit: d8c849037d9398abe6a5f5d065eafc777eb3bdaf
--
2.28.0
The patch titled
Subject: nilfs2: make splice write available again
has been added to the -mm tree. Its filename is
nilfs2-make-splice-write-available-again.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/nilfs2-make-splice-write-availabl…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/nilfs2-make-splice-write-availabl…
------------------------------------------------------
From: Joachim Henke <joachim.henke(a)t-systems.com>
Subject: nilfs2: make splice write available again
Since 5.10, splice() or sendfile() to NILFS2 return EINVAL. This was
caused by commit 36e2c7421f02 ("fs: don't allow splice read/write without
explicit ops").
This patch initializes the splice_write field in file_operations, like
most file systems do, to restore the functionality.
Link: https://lkml.kernel.org/r/1612784101-14353-1-git-send-email-konishi.ryusuke…
Signed-off-by: Joachim Henke <joachim.henke(a)t-systems.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/file.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/nilfs2/file.c~nilfs2-make-splice-write-available-again
+++ a/fs/nilfs2/file.c
@@ -141,6 +141,7 @@ const struct file_operations nilfs_file_
/* .release = nilfs_release_file, */
.fsync = nilfs_sync_file,
.splice_read = generic_file_splice_read,
+ .splice_write = iter_file_splice_write,
};
const struct inode_operations nilfs_file_inode_operations = {
_
Patches currently in -mm which might be from joachim.henke(a)t-systems.com are
nilfs2-make-splice-write-available-again.patch
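The effect of a missing splice_write hook, as described in the nilfs2 message above, can be modelled in plain C. This is a user-space sketch under stated assumptions: struct file_ops_model, splice_write_stub() and do_splice_model() are hypothetical names, and the dispatch logic is a simplification of "no explicit op means EINVAL", not the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the splice_write slot of file_operations. */
struct file_ops_model {
    long (*splice_write)(const char *buf, size_t len);
};

/* Stub writer: pretends every byte was written. */
static long splice_write_stub(const char *buf, size_t len)
{
    (void)buf;
    return (long)len;
}

/* Without an explicit op the request is rejected; there is no fallback. */
static long do_splice_model(const struct file_ops_model *ops,
                            const char *buf, size_t len)
{
    if (!ops->splice_write)
        return -EINVAL;
    return ops->splice_write(buf, len);
}

/* Self-check that a caller's main() could run. */
void splice_model_demo(void)
{
    struct file_ops_model before = { NULL };              /* op left unset */
    struct file_ops_model after = { splice_write_stub };  /* op initialized */

    assert(do_splice_model(&before, "data", 4) == -EINVAL);
    assert(do_splice_model(&after, "data", 4) == 4);
}
```

The one-line patch corresponds to moving from the "before" table to the "after" table.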
The patch titled
Subject: mm, slub: better heuristic for number of cpus when calculating slab order
has been added to the -mm tree. Its filename is
mm-slub-better-heuristic-for-number-of-cpus-when-calculating-slab-order.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-slub-better-heuristic-for-numb…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-slub-better-heuristic-for-numb…
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, slub: better heuristic for number of cpus when calculating slab order
When creating a new kmem cache, SLUB determines how large the slab pages will
be based on a number of inputs, including the number of CPUs in the system. Larger
slab pages mean that more objects can be allocated/freed from per-cpu slabs
before accessing shared structures, but also potentially more memory can be
wasted due to low slab usage and fragmentation.
The rough idea of using number of CPUs is that larger systems will be more
likely to benefit from reduced contention, and also should have enough memory
to spare.
Number of CPUs used to be determined as nr_cpu_ids, which is number of possible
cpus, but on some systems many will never be onlined, thus commit 045ab8c9487b
("mm/slub: let number of online CPUs determine the slub page order") changed it
to num_online_cpus(). However, for kmem caches created early before CPUs are
onlined, this may lead to permanently low slab page sizes.
Vincent reports a regression [1] of hackbench on arm64 systems:
> I'm facing significant performances regression on a large arm64 server
> system (224 CPUs). Regressions is also present on small arm64 system
> (8 CPUs) but in a far smaller order of magnitude
> On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
> v5.11-rc4 : 9.135sec (+/- 0.45%)
> v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
> v5.10: 3.136sec (+/- 0.40%)
Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
page allocator contention:
> i.e. the patch incurs a 7% to 32% performance penalty. This bisected
> cleanly yesterday when I was looking for the regression and then found
> the thread.
> Numerous caches change size. For example, kmalloc-512 goes from order-0
> (vanilla) to order-2 with the revert.
> So mostly this is down to the number of times SLUB calls into the page
> allocator which only caches order-0 pages on a per-cpu basis.
Clearly num_online_cpus() doesn't work too early in bootup. We could change
the order dynamically in a memory hotplug callback, but runtime order changing
for existing kmem caches has been already shown as dangerous, and removed in
32a6f409b693 ("mm, slub: remove runtime allocation order changes"). It could be
resurrected in a safe manner with some effort, but to fix the regression we
need something simpler.
We could use num_present_cpus() that should be the number of physically
present CPUs even before they are onlined. That would work for PowerPC
[3], which triggered the original commit, but that still doesn't work on
arm64 [4] as explained in [5].
So this patch tries to determine the best available value without specific
arch knowledge.
- num_present_cpus() if the number is larger than 1, as that means the
arch is likely setting it properly
- nr_cpu_ids otherwise
This should fix the reported regressions while also keeping the effect of
045ab8c9487b for PowerPC systems. It's possible there are configurations
where num_present_cpus() is 1 during boot while nr_cpu_ids is at the same
time bloated, so these (if they exist) would keep the large orders based
on nr_cpu_ids as was before 045ab8c9487b.
[1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj…
[2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
[3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03Ys…
[5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
Link: https://lkml.kernel.org/r/20210208134108.22286-1-vbabka@suse.cz
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reported-by: Mel Gorman <mgorman(a)techsingularity.net>
Tested-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Bharata B Rao <bharata(a)linux.ibm.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
--- a/mm/slub.c~mm-slub-better-heuristic-for-number-of-cpus-when-calculating-slab-order
+++ a/mm/slub.c
@@ -3423,6 +3423,7 @@ static inline int calculate_order(unsign
unsigned int order;
unsigned int min_objects;
unsigned int max_objects;
+ unsigned int nr_cpus;
/*
* Attempt to find best configuration for a slab. This
@@ -3433,8 +3434,21 @@ static inline int calculate_order(unsign
* we reduce the minimum objects required in a slab.
*/
min_objects = slub_min_objects;
- if (!min_objects)
- min_objects = 4 * (fls(num_online_cpus()) + 1);
+ if (!min_objects) {
+ /*
+ * Some architectures will only update present cpus when
+ * onlining them, so don't trust the number if it's just 1. But
+ * we also don't want to use nr_cpu_ids always, as on some other
+ * architectures, there can be many possible cpus, but never
+ * onlined. Here we compromise between trying to avoid too high
+ * order on systems that appear larger than they are, and too
+ * low order on systems that appear smaller than they are.
+ */
+ nr_cpus = num_present_cpus();
+ if (nr_cpus <= 1)
+ nr_cpus = nr_cpu_ids;
+ min_objects = 4 * (fls(nr_cpus) + 1);
+ }
max_objects = order_objects(slub_max_order, size);
min_objects = min(min_objects, max_objects);
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-slub-better-heuristic-for-number-of-cpus-when-calculating-slab-order.patch
mm-slub-stop-freeing-kmem_cache_node-structures-on-node-offline.patch
mm-slab-slub-stop-taking-memory-hotplug-lock.patch
mm-slab-slub-stop-taking-cpu-hotplug-lock.patch
mm-slub-splice-cpu-and-page-freelists-in-deactivate_slab.patch
mm-slub-remove-slub_memcg_sysfs-boot-param-and-config_slub_memcg_sysfs_on.patch
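The CPU-count heuristic from the SLUB message above can be worked through with a small user-space sketch. The formula 4 * (fls(n) + 1) and the "present CPUs unless that is just 1" rule come from the patch; fls_u32(), pick_nr_cpus() and default_min_objects() are hypothetical names, and fls_u32() is a portable stand-in for the kernel's fls().

```c
#include <assert.h>

/* Position of the highest set bit, counting from 1; returns 0 for x == 0. */
static unsigned int fls_u32(unsigned int x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Trust the present-CPU count only when it is larger than 1. */
static unsigned int pick_nr_cpus(unsigned int present, unsigned int possible)
{
    return present > 1 ? present : possible;
}

/* Default minimum number of objects per slab when none is configured. */
static unsigned int default_min_objects(unsigned int present,
                                        unsigned int possible)
{
    return 4 * (fls_u32(pick_nr_cpus(present, possible)) + 1);
}

/* Self-check that a caller's main() could run. */
void slub_heuristic_demo(void)
{
    /* A CPU count of 1, as seen early in boot, gives 4 * (1 + 1). */
    assert(4 * (fls_u32(1) + 1) == 8);
    /* 224 CPUs: fls(224) is 8, so the result is 4 * (8 + 1). */
    assert(default_min_objects(224, 224) == 36);
    /* Present count not yet populated: fall back to the possible count. */
    assert(default_min_objects(1, 224) == 36);
}
```

On the 224-CPU system from the report, a count of 1 at cache-creation time yields 8 minimum objects where the full count yields 36, which is consistent with the smaller slab orders mentioned in the quoted reports.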
From: Marc Zyngier <maz(a)kernel.org>
[ Upstream commit 43f20b1c6140896916f4e91aacc166830a7ba849 ]
It recently became apparent that the lack of a 'device_type = "pci"'
in the PCIe root complex node for rk3399 is a violation of the PCI
binding, as documented in IEEE Std 1275-1994. Changes to the kernel's
parsing of the DT made such violation fatal, as drivers cannot
probe the controller anymore.
Adding the missing property makes the PCIe node compliant. While we
are at it, drop the pointless linux,pci-domain property, which only
makes sense when there are multiple host bridges.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20200815125112.462652-3-maz@kernel.org
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 82747048381fa..721f4b6b262f1 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -231,6 +231,7 @@ pcie0: pcie@f8000000 {
reg = <0x0 0xf8000000 0x0 0x2000000>,
<0x0 0xfd000000 0x0 0x1000000>;
reg-names = "axi-base", "apb-base";
+ device_type = "pci";
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
@@ -249,7 +250,6 @@ pcie0: pcie@f8000000 {
<0 0 0 2 &pcie0_intc 1>,
<0 0 0 3 &pcie0_intc 2>,
<0 0 0 4 &pcie0_intc 3>;
- linux,pci-domain = <0>;
max-link-speed = <1>;
msi-map = <0x0 &its 0x0 0x1000>;
phys = <&pcie_phy 0>, <&pcie_phy 1>,
--
2.27.0
From: Marc Zyngier <maz(a)kernel.org>
[ Upstream commit 43f20b1c6140896916f4e91aacc166830a7ba849 ]
It recently became apparent that the lack of a 'device_type = "pci"'
in the PCIe root complex node for rk3399 is a violation of the PCI
binding, as documented in IEEE Std 1275-1994. Changes to the kernel's
parsing of the DT made such violation fatal, as drivers cannot
probe the controller anymore.
Adding the missing property makes the PCIe node compliant. While we
are at it, drop the pointless linux,pci-domain property, which only
makes sense when there are multiple host bridges.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20200815125112.462652-3-maz@kernel.org
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index f4ee7c4f83b8b..b1c1a88a1c20c 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -198,6 +198,7 @@ pcie0: pcie@f8000000 {
reg = <0x0 0xf8000000 0x0 0x2000000>,
<0x0 0xfd000000 0x0 0x1000000>;
reg-names = "axi-base", "apb-base";
+ device_type = "pci";
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
@@ -216,7 +217,6 @@ pcie0: pcie@f8000000 {
<0 0 0 2 &pcie0_intc 1>,
<0 0 0 3 &pcie0_intc 2>,
<0 0 0 4 &pcie0_intc 3>;
- linux,pci-domain = <0>;
max-link-speed = <1>;
msi-map = <0x0 &its 0x0 0x1000>;
phys = <&pcie_phy 0>, <&pcie_phy 1>,
--
2.27.0
From: Marc Zyngier <maz(a)kernel.org>
[ Upstream commit 43f20b1c6140896916f4e91aacc166830a7ba849 ]
It recently became apparent that the lack of a 'device_type = "pci"'
in the PCIe root complex node for rk3399 is a violation of the PCI
binding, as documented in IEEE Std 1275-1994. Changes to the kernel's
parsing of the DT made such violation fatal, as drivers cannot
probe the controller anymore.
Adding the missing property makes the PCIe node compliant. While we
are at it, drop the pointless linux,pci-domain property, which only
makes sense when there are multiple host bridges.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20200815125112.462652-3-maz@kernel.org
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index bb7d0aac6b9db..9d6ed8cda2c86 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -232,6 +232,7 @@ pcie0: pcie@f8000000 {
reg = <0x0 0xf8000000 0x0 0x2000000>,
<0x0 0xfd000000 0x0 0x1000000>;
reg-names = "axi-base", "apb-base";
+ device_type = "pci";
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
@@ -250,7 +251,6 @@ pcie0: pcie@f8000000 {
<0 0 0 2 &pcie0_intc 1>,
<0 0 0 3 &pcie0_intc 2>,
<0 0 0 4 &pcie0_intc 3>;
- linux,pci-domain = <0>;
max-link-speed = <1>;
msi-map = <0x0 &its 0x0 0x1000>;
phys = <&pcie_phy 0>, <&pcie_phy 1>,
--
2.27.0
The patch titled
Subject: mm: hugetlb: fix missing put_page in gather_surplus_pages()
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-missing-put_page-in-gather_surplus_pages.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: fix missing put_page in gather_surplus_pages()
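When !CONFIG_DEBUG_VM, VM_BUG_ON_PAGE() generates no code at all, even if
its condition expression has side effects. The put_page_testzero() call
placed inside it is therefore never executed, and the reference is not
dropped.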
Link: https://lkml.kernel.org/r/20210126031009.96266-1-songmuchun@bytedance.com
Fixes: e5dfacebe4a4 ("mm/hugetlb.c: just use put_page_testzero() instead of page_count()")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-missing-put_page-in-gather_surplus_pages
+++ a/mm/hugetlb.c
@@ -2047,13 +2047,16 @@ retry:
/* Free the needed pages to the hugetlb pool */
list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ int zeroed;
+
if ((--needed) < 0)
break;
/*
* This page is now managed by the hugetlb allocator and has
* no users -- drop the buddy allocator's reference.
*/
- VM_BUG_ON_PAGE(!put_page_testzero(page), page);
+ zeroed = put_page_testzero(page);
+ VM_BUG_ON_PAGE(!zeroed, page);
enqueue_huge_page(h, page);
}
free:
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-memcontrol-replace-the-loop-with-a-list_for_each_entry.patch
hugetlb-convert-page_huge_active-hpagemigratable-flag-fix.patch
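The pitfall fixed by the hugetlb patch above, a side effect hidden inside a debug-only check, can be shown with a user-space sketch. CHECK_ON(), struct fake_page and the helper names are hypothetical; the disabled variant uses sizeof so that the expression is type-checked but never evaluated, which is an assumption about how such macros are commonly written rather than a copy of the kernel macro.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef DEBUG_CHECKS
#define CHECK_ON(cond) do { if (cond) abort(); } while (0)
#else
/* Disabled: the operand of sizeof is not evaluated, so no code is emitted. */
#define CHECK_ON(cond) ((void)sizeof((long)(cond)))
#endif

struct fake_page {
    int refcount;
};

/* Drops one reference and reports whether the count reached zero. */
static int put_testzero(struct fake_page *p)
{
    return --p->refcount == 0;
}

/* Buggy pattern: the decrement lives inside the check and vanishes with it. */
static int drop_ref_buggy(struct fake_page *p)
{
    CHECK_ON(!put_testzero(p));
    return p->refcount;
}

/* Fixed pattern: do the side effect first, then check the saved result. */
static int drop_ref_fixed(struct fake_page *p)
{
    int zeroed = put_testzero(p);

    CHECK_ON(!zeroed);
    return p->refcount;
}

/* Self-check for a build without DEBUG_CHECKS. */
void check_macro_demo(void)
{
    struct fake_page a = { 1 };
    struct fake_page b = { 1 };

    assert(drop_ref_buggy(&a) == 1); /* reference was never dropped */
    assert(drop_ref_fixed(&b) == 0); /* reference dropped as intended */
}
```

Building the same file with -DDEBUG_CHECKS makes both variants behave identically, which is why this class of bug tends to go unnoticed on debug kernels.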
The patch titled
Subject: memblock: do not start bottom-up allocations with kernel_end
has been removed from the -mm tree. Its filename was
memblock-do-not-start-bottom-up-allocations-with-kernel_end.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Roman Gushchin <guro(a)fb.com>
Subject: memblock: do not start bottom-up allocations with kernel_end
With kaslr the kernel image is placed at a random place, so starting the
bottom-up allocation with the kernel_end can result in an allocation
failure and a warning like this one:
[ 0.002920] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
[ 0.002921] ------------[ cut here ]------------
[ 0.002922] memblock: bottom-up allocation failed, memory hotremove may be affected
[ 0.002937] WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
[ 0.002937] Modules linked in:
[ 0.002939] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
[ 0.002940] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
[ 0.002942] RIP: 0010:memblock_find_in_range_node+0x178/0x25a
[ 0.002944] Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
[ 0.002945] RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
[ 0.002947] RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
[ 0.002948] RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
[ 0.002948] RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
[ 0.002949] R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
[ 0.002950] R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
[ 0.002952] FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
[ 0.002953] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.002954] CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
[ 0.002956] Call Trace:
[ 0.002961] ? memblock_alloc_range_nid+0x8d/0x11e
[ 0.002963] ? cma_declare_contiguous_nid+0x2c4/0x38c
[ 0.002964] ? hugetlb_cma_reserve+0xdc/0x128
[ 0.002968] ? flush_tlb_one_kernel+0xc/0x20
[ 0.002969] ? native_set_fixmap+0x82/0xd0
[ 0.002971] ? flat_get_apic_id+0x5/0x10
[ 0.002973] ? register_lapic_address+0x8e/0x97
[ 0.002975] ? setup_arch+0x8a5/0xc3f
[ 0.002978] ? start_kernel+0x66/0x547
[ 0.002980] ? load_ucode_bsp+0x4c/0xcd
[ 0.002982] ? secondary_startup_64_no_verify+0xb0/0xbb
[ 0.002986] random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
[ 0.002988] ---[ end trace f151227d0b39be70 ]---
At the same time, the kernel image is protected with memblock_reserve(),
so we can just start searching at PAGE_SIZE. In this case the bottom-up
allocation has the same chance of success as a top-down allocation, so
there is no reason to fall back in the case of a failure. Altogether,
this simplifies the logic.
Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Wonhyuk Yang <vvghjk1234(a)gmail.com>
Cc: Thiago Jung Bauermann <bauerman(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memblock.c | 49 +++++-------------------------------------------
1 file changed, 6 insertions(+), 43 deletions(-)
--- a/mm/memblock.c~memblock-do-not-start-bottom-up-allocations-with-kernel_end
+++ a/mm/memblock.c
@@ -275,14 +275,6 @@ __memblock_find_range_top_down(phys_addr
*
* Find @size free area aligned to @align in the specified range and node.
*
- * When allocation direction is bottom-up, the @start should be greater
- * than the end of the kernel image. Otherwise, it will be trimmed. The
- * reason is that we want the bottom-up allocation just near the kernel
- * image so it is highly likely that the allocated memory and the kernel
- * will reside in the same node.
- *
- * If bottom-up allocation failed, will try to allocate memory top-down.
- *
* Return:
* Found address on success, 0 on failure.
*/
@@ -291,8 +283,6 @@ static phys_addr_t __init_memblock membl
phys_addr_t end, int nid,
enum memblock_flags flags)
{
- phys_addr_t kernel_end, ret;
-
/* pump up @end */
if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
end == MEMBLOCK_ALLOC_KASAN)
@@ -301,40 +291,13 @@ static phys_addr_t __init_memblock membl
/* avoid allocating the first page */
start = max_t(phys_addr_t, start, PAGE_SIZE);
end = max(start, end);
- kernel_end = __pa_symbol(_end);
-
- /*
- * try bottom-up allocation only when bottom-up mode
- * is set and @end is above the kernel image.
- */
- if (memblock_bottom_up() && end > kernel_end) {
- phys_addr_t bottom_up_start;
-
- /* make sure we will allocate above the kernel */
- bottom_up_start = max(start, kernel_end);
-
- /* ok, try bottom-up allocation first */
- ret = __memblock_find_range_bottom_up(bottom_up_start, end,
- size, align, nid, flags);
- if (ret)
- return ret;
-
- /*
- * we always limit bottom-up allocation above the kernel,
- * but top-down allocation doesn't have the limit, so
- * retrying top-down allocation may succeed when bottom-up
- * allocation failed.
- *
- * bottom-up allocation is expected to be fail very rarely,
- * so we use WARN_ONCE() here to see the stack trace if
- * fail happens.
- */
- WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
- "memblock: bottom-up allocation failed, memory hotremove may be affected\n");
- }
- return __memblock_find_range_top_down(start, end, size, align, nid,
- flags);
+ if (memblock_bottom_up())
+ return __memblock_find_range_bottom_up(start, end, size, align,
+ nid, flags);
+ else
+ return __memblock_find_range_top_down(start, end, size, align,
+ nid, flags);
}
/**
_
Patches currently in -mm which might be from guro(a)fb.com are
mm-memcg-slab-pre-allocate-obj_cgroups-for-slab-caches-with-slab_account.patch
mm-kmem-make-__memcg_kmem_uncharge-static.patch
mm-cma-allocate-cma-areas-bottom-up.patch
mm-cma-allocate-cma-areas-bottom-up-fix.patch
mm-cma-allocate-cma-areas-bottom-up-fix-2.patch
mm-cma-allocate-cma-areas-bottom-up-fix-3.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch
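The reasoning in the memblock message above can be illustrated with a toy model. This is a sketch under heavy simplification: one free range, a single reserved region standing in for the kernel image, no alignment, and no top-down fallback. All names (struct reserved, find_bottom_up(), find_old(), find_new()) and all addresses are made up for illustration.

```c
#include <assert.h>

typedef unsigned long long phys_t;

#define PAGE_SZ 0x1000ULL

/* One reserved range [base, end), standing in for the kernel image. */
struct reserved {
    phys_t base;
    phys_t end;
};

static phys_t max_phys(phys_t a, phys_t b)
{
    return a > b ? a : b;
}

/* Lowest address in [start, end) where size bytes fit outside rsv; 0 on failure. */
static phys_t find_bottom_up(phys_t start, phys_t end, phys_t size,
                             struct reserved rsv)
{
    phys_t cand = start;

    /* If the candidate would overlap the reserved range, skip past it. */
    if (cand < rsv.end && cand + size > rsv.base)
        cand = rsv.end;
    if (cand >= end || size > end - cand)
        return 0;
    return cand;
}

/* Old policy as described: never search below the end of the kernel image. */
static phys_t find_old(phys_t start, phys_t end, phys_t size,
                       struct reserved kernel)
{
    return find_bottom_up(max_phys(start, kernel.end), end, size, kernel);
}

/* New policy: skip only the first page and rely on the reservation. */
static phys_t find_new(phys_t start, phys_t end, phys_t size,
                       struct reserved kernel)
{
    return find_bottom_up(max_phys(start, PAGE_SZ), end, size, kernel);
}

/* Self-check: 4 GiB of memory, 2 GiB request, kernel randomized to 3 GiB. */
void memblock_demo(void)
{
    struct reserved kernel = { 3ULL << 30, (3ULL << 30) + (64ULL << 20) };

    assert(find_old(0, 4ULL << 30, 2ULL << 30, kernel) == 0);
    assert(find_new(0, 4ULL << 30, 2ULL << 30, kernel) == PAGE_SZ);
}
```

In this model the old policy fails only because the search window shrinks to the space above the kernel image; the memory below it is free but never considered.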