Hi Greg,
Can you please consider taking the patch "KVM: x86: hyper-v: fix type
of valid_bank_mask" into stable?
Commit ea8c66fe8d8f4f93df941e52120a3512d7bf5128 upstream.
Thanks,
Yury
The patch titled
Subject: mm/page_alloc: always attempt to allocate at least one page during bulk allocation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-page_alloc-always-attempt-to-allocate-at-least-one-page-during-bulk-allocation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mel Gorman <mgorman(a)techsingularity.net>
Subject: mm/page_alloc: always attempt to allocate at least one page during bulk allocation
Date: Thu, 26 May 2022 10:12:10 +0100
Peter Pavlisko reported the following problem on kernel bugzilla 216007.
When I try to extract an uncompressed tar archive (2.6 million
files, 760.3 GiB in size) on newly created (empty) XFS file system,
after first low tens of gigabytes extracted the process hangs in
iowait indefinitely. One CPU core is 100% occupied with iowait,
the other CPU core is idle (on 2-core Intel Celeron G1610T).
It was bisected to c9fa563072e1 ("xfs: use alloc_pages_bulk_array() for
buffers") but XFS is only the messenger. The problem is that nothing is
waking kswapd to reclaim some pages at a time when the PCP lists cannot be
refilled until some reclaim happens. The bulk allocator checks that there
are some pages in the array and the original intent was that a bulk
allocator did not necessarily need all the requested pages and it was best
to return as quickly as possible.
This was fine for the first user of the API but both NFS and XFS require
the requested number of pages be available before making progress. Both
could be adjusted to call the page allocator directly if a bulk allocation
fails but it puts a burden on users of the API. Adjust the semantics to
attempt at least one allocation via __alloc_pages() before returning so
kswapd is woken if necessary.
It was reported via bugzilla that the patch addressed the problem and that
the tar extraction completed successfully. This may also address bug
215975 but has yet to be confirmed.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216007
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215975
Link: https://lkml.kernel.org/r/20220526091210.GC3441@techsingularity.net
Fixes: 387ba26fb1cb ("mm/page_alloc: add a bulk page allocator")
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: "Darrick J. Wong" <djwong(a)kernel.org>
Cc: Dave Chinner <dchinner(a)redhat.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Jesper Dangaard Brouer <brouer(a)redhat.com>
Cc: Chuck Lever <chuck.lever(a)oracle.com>
Cc: <stable(a)vger.kernel.org> # v5.13+
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-always-attempt-to-allocate-at-least-one-page-during-bulk-allocation
+++ a/mm/page_alloc.c
@@ -5324,8 +5324,8 @@ unsigned long __alloc_pages_bulk(gfp_t g
page = __rmqueue_pcplist(zone, 0, ac.migratetype, alloc_flags,
pcp, pcp_list);
if (unlikely(!page)) {
- /* Try and get at least one page */
- if (!nr_populated)
+ /* Try and allocate at least one page */
+ if (!nr_account)
goto failed_irq;
break;
}
_
Patches currently in -mm which might be from mgorman(a)techsingularity.net are
mm-page_alloc-always-attempt-to-allocate-at-least-one-page-during-bulk-allocation.patch
mm-page_alloc-add-page-buddy_list-and-page-pcp_list.patch
mm-page_alloc-use-only-one-pcp-list-for-thp-sized-allocations.patch
mm-page_alloc-split-out-buddy-removal-code-from-rmqueue-into-separate-helper.patch
mm-page_alloc-protect-pcp-lists-with-a-spinlock.patch
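The semantic change described in the bulk-allocation patch above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `pcp_model`, `pcp_take`, `slow_alloc_one` and `bulk_alloc_model` are hypothetical names, `malloc` stands in for the page allocator, and the slow path merely counts how often it is reached (in the kernel that is the point where kswapd can be woken). The detail it tries to show is the check on `nr_account` (pages allocated by this call) rather than `nr_populated` (filled slots, including ones left over from an earlier call).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-CPU free list: just a counter. */
struct pcp_model {
	size_t free_pages;       /* pages available on the fast path */
	size_t slow_path_calls;  /* times we fell back to the slow path */
};

/* Fast path: fails once the list is empty. */
static void *pcp_take(struct pcp_model *pcp)
{
	if (pcp->free_pages == 0)
		return NULL;
	pcp->free_pages--;
	return malloc(1);
}

/* Slow path: models the single-page allocation that can wake reclaim. */
static void *slow_alloc_one(struct pcp_model *pcp)
{
	pcp->slow_path_calls++;
	return malloc(1);
}

/* Returns how many leading slots of array are populated when the call stops. */
size_t bulk_alloc_model(struct pcp_model *pcp, size_t nr, void **array)
{
	size_t nr_populated = 0;  /* filled slots, including pre-existing ones */
	size_t nr_account = 0;    /* pages allocated by this call only */
	void *page;

	assert(pcp != NULL && array != NULL);

	while (nr_populated < nr) {
		/* Skip slots that an earlier call already filled. */
		if (array[nr_populated]) {
			nr_populated++;
			continue;
		}

		page = pcp_take(pcp);
		if (!page) {
			/*
			 * Nothing was allocated by this call, so try one
			 * slow-path allocation before giving up. Testing
			 * nr_populated here would skip the slow path whenever
			 * the array was already partially filled.
			 */
			if (!nr_account) {
				page = slow_alloc_one(pcp);
				if (page)
					array[nr_populated++] = page;
			}
			break;
		}

		nr_account++;
		array[nr_populated++] = page;
	}

	return nr_populated;
}
```

In this model, a caller that retries with a partially filled array and an empty fast-path list still makes progress by one page per call, which is the behaviour NFS and XFS are described as depending on.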
hey Greg,
I'm proposing the following commits for stable, as they fix an
oops we're seeing in our testing[*]:
This is needed for 4.14.y -> 5.18.y:
1bbc21785b73 ACPI: sysfs: Fix BERT error region memory mapping
A dependency of the above, needed for 4.14.y -> 5.10.y
bdd56d7d8931 ACPI: sysfs: Make sparse happy about address space in use
-dann
[*] https://launchpad.net/bugs/1973153
Upon nfsd shutdown any pending DRC cache is freed. DRC cache use is
tracked via a percpu counter. In the current code the percpu counter
is destroyed before the cache is freed. If any pending cache entry is
still present, percpu_counter_add is called with a NULL percpu counter.
This causes a kernel crash.
The solution is to destroy the percpu counter after the cache is freed.
Fixes: e567b98ce9a4b ("nfsd: protect concurrent access to nfsd stats counters")
Signed-off-by: Julian Schroeder <jumaco(a)amazon.com>
---
fs/nfsd/nfscache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 0b3f12aa37ff..7da88bdc0d6c 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -206,7 +206,6 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn)
struct svc_cacherep *rp;
unsigned int i;
- nfsd_reply_cache_stats_destroy(nn);
unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
for (i = 0; i < nn->drc_hashsize; i++) {
@@ -217,6 +216,7 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn)
rp, nn);
}
}
+ nfsd_reply_cache_stats_destroy(nn);
kvfree(nn->drc_hashtbl);
nn->drc_hashtbl = NULL;
--
2.32.0
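The ordering issue in the nfsd patch above can be shown with a minimal userspace model. All names here (`stats_model`, `cache_model`, `cache_add_entry`, `cache_shutdown`, and so on) are hypothetical, and a heap-allocated `long` stands in for the percpu counter. The sketch only demonstrates the rule the patch applies: free every entry first, because freeing updates the counter, and destroy the counter last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a percpu counter. */
struct stats_model {
	long *counter;
};

struct cache_model {
	size_t nr_entries;
	struct stats_model stats;
};

int stats_init(struct stats_model *s)
{
	s->counter = calloc(1, sizeof(*s->counter));
	return s->counter ? 0 : -1;
}

void stats_destroy(struct stats_model *s)
{
	free(s->counter);
	s->counter = NULL;
}

void cache_add_entry(struct cache_model *c)
{
	assert(c->stats.counter != NULL);
	*c->stats.counter += 1;
	c->nr_entries++;
}

/* Freeing an entry touches the counter, so the counter must still exist. */
void cache_free_entry(struct cache_model *c)
{
	assert(c->stats.counter != NULL);
	*c->stats.counter -= 1;
	c->nr_entries--;
}

/* Shutdown in the order the patch establishes: entries first, stats last. */
void cache_shutdown(struct cache_model *c)
{
	while (c->nr_entries > 0)
		cache_free_entry(c);
	stats_destroy(&c->stats);
}
```

Swapping the two steps in `cache_shutdown` would trip the assertion in `cache_free_entry` whenever entries are still pending, which is the model's analogue of the NULL counter crash.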
From: Eric Dumazet <edumazet(a)google.com>
commit 190cc82489f46f9d88e73c81a47e14f80a791e1a upstream
RFC 6056 (Recommendations for Transport-Protocol Port Randomization)
provides a good summary of why source selection needs extra care.
David Dworken reminded us that Linux implements Algorithm 3
as described in RFC 6056 3.3.3
Quoting David :
In the context of the web, this creates an interesting info leak where
websites can count how many TCP connections a user's computer is
establishing over time. For example, this allows a website to count
exactly how many subresources a third party website loaded.
This also allows:
- Distinguishing between different users behind a VPN based on
distinct source port ranges.
- Tracking users over time across multiple networks.
- Covert communication channels between different browsers/browser
profiles running on the same computer
- Tracking what applications are running on a computer based on
the pattern of how fast source ports are getting incremented.
Section 3.3.4 describes an enhancement that reduces an
attacker's ability to use the basic information currently
stored in the shared 'u32 hint'.
This change also decreases collision rate when
multiple applications need to connect() to
different destinations.
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: David Dworken <ddworken(a)google.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
[SG: Adjusted context]
Signed-off-by: Stefan Ghinea <stefan.ghinea(a)windriver.com>
---
net/ipv4/inet_hashtables.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 1346e45cf8d1..0bc6549c38b1 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -587,6 +587,17 @@ void inet_unhash(struct sock *sk)
}
EXPORT_SYMBOL_GPL(inet_unhash);
+/* RFC 6056 3.3.4. Algorithm 4: Double-Hash Port Selection Algorithm
+ * Note that we use 32bit integers (vs RFC 'short integers')
+ * because 2^16 is not a multiple of num_ephemeral and this
+ * property might be used by clever attacker.
+ * RFC claims using TABLE_LENGTH=10 buckets gives an improvement,
+ * we use 256 instead to really give more isolation and
+ * privacy, this only consumes 1 KB of kernel memory.
+ */
+#define INET_TABLE_PERTURB_SHIFT 8
+static u32 table_perturb[1 << INET_TABLE_PERTURB_SHIFT];
+
int __inet_hash_connect(struct inet_timewait_death_row *death_row,
struct sock *sk, u32 port_offset,
int (*check_established)(struct inet_timewait_death_row *,
@@ -600,7 +611,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
struct inet_bind_bucket *tb;
u32 remaining, offset;
int ret, i, low, high;
- static u32 hint;
+ u32 index;
if (port) {
head = &hinfo->bhash[inet_bhashfn(net, port,
@@ -625,7 +636,10 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
if (likely(remaining > 1))
remaining &= ~1U;
- offset = (hint + port_offset) % remaining;
+ net_get_random_once(table_perturb, sizeof(table_perturb));
+ index = hash_32(port_offset, INET_TABLE_PERTURB_SHIFT);
+
+ offset = (READ_ONCE(table_perturb[index]) + port_offset) % remaining;
/* In first pass we try ports of @low parity.
* inet_csk_get_port() does the opposite choice.
*/
@@ -678,7 +692,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
return -EADDRNOTAVAIL;
ok:
- hint += i + 2;
+ WRITE_ONCE(table_perturb[index], READ_ONCE(table_perturb[index]) + i + 2);
/* Head lock still held and bh's disabled */
inet_bind_hash(sk, tb, port);
--
2.36.1
From: Magnus Karlsson <magnus.karlsson(a)intel.com>
Fix a crash in the zero-copy driver that occurs when it fails to
allocate buffers from user-space. This crash can easily be triggered
by a malicious program that does not provide any buffers in the fill
ring for the kernel to use.
Note that this bug does not exist in upstream since the batched buffer
allocation interface got introduced in 5.16 and replaced this code.
Reported-by: Jeff Shaw <jeffrey.b.shaw(a)intel.com>
Tested-by: Jeff Shaw <jeffrey.b.shaw(a)intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson(a)intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 2b1873061912..5581747947e5 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -378,7 +378,7 @@ bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count)
do {
*xdp = xsk_buff_alloc(rx_ring->xsk_pool);
- if (!xdp) {
+ if (!*xdp) {
ok = false;
break;
}
base-commit: 9f43e3ac7e662f352f829077723fa0b92ccaded1
--
2.34.1
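The one-character fix in the ice patch above comes down to testing the slot's contents rather than the slot's address. A minimal userspace sketch, with hypothetical names (`buf_model`, `pool_model`, `pool_alloc`, `fill_slots`) standing in for the driver's buffer pool and ring:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct buf_model {
	int id;
};

/* Hypothetical pool: hands out buffers until it runs dry, then NULL. */
struct pool_model {
	struct buf_model *bufs;
	size_t avail;
};

static struct buf_model *pool_alloc(struct pool_model *pool)
{
	if (pool->avail == 0)
		return NULL;
	pool->avail--;
	return &pool->bufs[pool->avail];
}

/* Fill up to count slots; returns false if the pool ran out early. */
bool fill_slots(struct pool_model *pool, struct buf_model **slots,
		size_t count, size_t *filled)
{
	struct buf_model **slot = slots;
	bool ok = true;

	assert(pool != NULL && slots != NULL && filled != NULL);
	*filled = 0;

	while (count--) {
		*slot = pool_alloc(pool);
		/*
		 * slot points into the slots array and is never NULL here;
		 * the allocation result lives in *slot, so that is what
		 * must be tested.
		 */
		if (!*slot) {
			ok = false;
			break;
		}
		slot++;
		(*filled)++;
	}

	return ok;
}
```

Testing `!slot` instead would never detect an empty pool, and the loop would carry on with a NULL buffer pointer stored in the ring.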
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9f46c187e2e680ecd9de7983e4d081c3391acc76 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Fri, 20 May 2022 13:48:11 -0400
Subject: [PATCH] KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
With shadow paging enabled, the INVPCID instruction results in a call
to kvm_mmu_invpcid_gva. If INVPCID is executed with CR0.PG=0, the
invlpg callback is not set and the result is a NULL pointer dereference.
Fix it trivially by checking for mmu->invlpg before every call.
There are other possibilities:
- check for CR0.PG, because KVM (like all Intel processors after P5)
flushes guest TLB on CR0.PG changes so that INVPCID/INVLPG are a
nop with paging disabled
- check for EFER.LMA, because KVM syncs and flushes when switching
MMU contexts outside of 64-bit mode
All of these are tricky, go for the simple solution. This is CVE-2022-1789.
Reported-by: Yongkang Jia <kangel(a)zju.edu.cn>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 56ebc4fb7f91..45e1573f8f1d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5470,14 +5470,16 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
uint i;
if (pcid == kvm_get_active_pcid(vcpu)) {
- mmu->invlpg(vcpu, gva, mmu->root.hpa);
+ if (mmu->invlpg)
+ mmu->invlpg(vcpu, gva, mmu->root.hpa);
tlb_flush = true;
}
for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
if (VALID_PAGE(mmu->prev_roots[i].hpa) &&
pcid == kvm_get_pcid(vcpu, mmu->prev_roots[i].pgd)) {
- mmu->invlpg(vcpu, gva, mmu->prev_roots[i].hpa);
+ if (mmu->invlpg)
+ mmu->invlpg(vcpu, gva, mmu->prev_roots[i].hpa);
tlb_flush = true;
}
}
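The pattern used by the KVM fix above, guarding an optional callback before each call while still requesting the flush, can be modelled in a few lines. The names below (`mmu_model`, `record_invlpg`, `invalidate_gva`) are hypothetical and the structure is heavily simplified; this is a sketch of the guard, not of the KVM MMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mmu_model {
	/* Optional callback; may be left NULL, as with paging disabled. */
	void (*invlpg)(struct mmu_model *mmu, unsigned long gva);
	unsigned long last_gva;
	unsigned int calls;
};

/* Example callback that just records what it was asked to invalidate. */
void record_invlpg(struct mmu_model *mmu, unsigned long gva)
{
	mmu->last_gva = gva;
	mmu->calls++;
}

/* Returns true when a TLB flush should follow, callback or not. */
bool invalidate_gva(struct mmu_model *mmu, unsigned long gva)
{
	assert(mmu != NULL);

	/* Only call through the pointer when it has been set. */
	if (mmu->invlpg)
		mmu->invlpg(mmu, gva);

	return true;
}
```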
Hi,
I would like to kindly request the inclusion of commit 24ef83f6e31d
("Input: goodix - fix spurious key release events") to the 5.4 stable
tree.
It fixes the spurious touches reported on an imx6dl board with Goodix
GT911 running kernel 5.4.
Thanks,
Fabio Estevam
[Public]
Hi,
The firmware on some OEM laptops with AMD SoCs advertises that they have sensors connected to AMD SFH when they don't
physically have them. In 5.19 a commit has gone in that detects this case and prevents the driver from advertising such sensors
to userspace. Having sensors advertised that aren't really there might not seem like a big deal, but AMD has observed
that, specifically for orientation sensors, the associated random garbage data can cause userspace to interpret a screen rotation
during resume from suspend.
As GNOME has a daemon running that interprets these events, I've seen first hand that it can cause the display to go upside down
without a lot of recourse other than command line tools or rebooting.
Can you please backport this commit to 5.15.y and later to fix this:
commit b5d7f43e97dabfa04a4be5ff027ce7da119332be ("HID: amd_sfh: Add support for sensor discovery")
Thanks,
The routine huge_pmd_unshare is passed a pointer to an address
associated with an area which may be unshared. If unshare is successful
this address is updated to 'optimize' callers iterating over huge page
addresses. For the optimization to work correctly, address should be
updated to the last huge page in the unmapped/unshared area. However,
in the common case where the passed address is PUD_SIZE aligned, the
address is incorrectly updated to the address of the preceding huge
page. That wastes CPU cycles as the unmapped/unshared range is scanned
twice.
Cc: <stable(a)vger.kernel.org>
Fixes: 39dde65c9940 ("shared page table for hugetlb page")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
mm/hugetlb.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 01f0e2e5ab48..7c468ac1d069 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6755,7 +6755,14 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
pud_clear(pud);
put_page(virt_to_page(ptep));
mm_dec_nr_pmds(mm);
- *addr = ALIGN(*addr, HPAGE_SIZE * PTRS_PER_PTE) - HPAGE_SIZE;
+ /*
+ * This update of passed address optimizes loops sequentially
+ * processing addresses in increments of huge page size (PMD_SIZE
+ * in this case). By clearing the pud, a PUD_SIZE area is unmapped.
+ * Update address to the 'last page' in the cleared area so that
+ * calling loop can move to first page past this area.
+ */
+ *addr |= PUD_SIZE - PMD_SIZE;
return 1;
}
--
2.35.3
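The arithmetic behind the hugetlb fix above is easy to check with concrete numbers. The sketch below assumes x86-64 sizes (2 MiB for PMD_SIZE, 1 GiB for PUD_SIZE, so that HPAGE_SIZE * PTRS_PER_PTE equals PUD_SIZE); the `_MODEL` macros and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PMD_SIZE_MODEL (UINT64_C(1) << 21)  /* 2 MiB, assumed */
#define PUD_SIZE_MODEL (UINT64_C(1) << 30)  /* 1 GiB, assumed */

/* Round x up to a multiple of a, where a is a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Old update: ALIGN(addr, PUD_SIZE) - PMD_SIZE. */
uint64_t old_update(uint64_t addr)
{
	return align_up(addr, PUD_SIZE_MODEL) - PMD_SIZE_MODEL;
}

/* New update: set the offset bits of the last huge page in the PUD area. */
uint64_t new_update(uint64_t addr)
{
	return addr | (PUD_SIZE_MODEL - PMD_SIZE_MODEL);
}
```

For a PUD_SIZE aligned address such as 0x40000000, `old_update` yields 0x3fe00000, the huge page before the area, while `new_update` yields 0x7fe00000, the last huge page inside it. For an unaligned address such as 0x40200000 both give 0x7fe00000, which is why the problem only shows up in the aligned case.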
Resending this because stable(a)vger.kernel.org was in the wrong header field.
Apologies for duplicates.
On Thu, 19 May 2022 09:52:07 -0400
Sasha Levin <sashal(a)kernel.org> wrote:
> This is a note to let you know that I've just added the patch titled
>
> um: port_user: Improve error handling when port-helper is not found
>
> to the 5.17-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> um-port_user-improve-error-handling-when-port-helper.patch
> and it can be found in the queue-5.17 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
First, I should say that I'm not familiar with the process so I'm
likely to be wrong on any number of things. Second I'm the author of
this patch and I would like to see this included in the stable trees.
However, it appears to me that there is a problem in including just this
patch, as it depends on a previous patch which does not appear to be
applied[1].
> commit efc324ad7e7e1c92a8862bd71b2f5f8f15513304
> Author: Glenn Washburn <development(a)efficientek.com>
> Date: Thu Mar 3 01:53:32 2022 -0600
>
> um: port_user: Improve error handling when port-helper is not found
>
> [ Upstream commit 3cb5a7f167c620a8b0e38b0446df2e024d2243dc ]
>
> Check if port-helper exists and is executable. If not, write an error
> message to the kernel log with information to help the user diagnose the
> issue and exit with an error. If UML_PORT_HELPER was not set, write a
> message suggesting that the user set it. This makes it easier to understand
> why telneting to the UML instance is failing and what can be done to fix it.
>
> Signed-off-by: Glenn Washburn <development(a)efficientek.com>
> Signed-off-by: Richard Weinberger <richard(a)nod.at>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/arch/um/drivers/port_user.c b/arch/um/drivers/port_user.c
> index 5b5b64cb1071..133ca7bf2d91 100644
> --- a/arch/um/drivers/port_user.c
> +++ b/arch/um/drivers/port_user.c
> @@ -5,6 +5,7 @@
>
> #include <stdio.h>
> #include <stdlib.h>
> +#include <string.h>
> #include <errno.h>
> #include <termios.h>
> #include <unistd.h>
> @@ -175,6 +176,17 @@ int port_connection(int fd, int *socket, int *pid_out)
> if (new < 0)
> return -errno;
>
> + err = os_access(argv[2], X_OK);
> + if (err < 0) {
> + printk(UM_KERN_ERR "port_connection : error accessing port-helper "
> + "executable at %s: %s\n", argv[2], strerror(-err));
> + if (env == NULL)
In the aforementioned patch that this patch depends on, "env" is
declared and set. Without it, I'd expect this to fail to compile. As
such, I may be wrong that the dependent patch was not already
included, because I'd expect there to have been a compile test before
this patch got to this phase.
My suspicion is that the stable trees try not to include new
functionality, which the missing patch may have been considered to
add, and thus it was not included. If it's deemed undesirable to include
the missing patch, this "if" block can be removed. However, I think
the missing patch is valuable enough to include.
The above goes for all the stable branches that this patch is set to be
included in.
Glenn
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
> + printk(UM_KERN_ERR "Set UML_PORT_HELPER environment "
> + "variable to path to uml-utilities port-helper "
> + "binary\n");
> + goto out_close;
> + }
> +
> err = os_pipe(socket, 0, 0);
> if (err < 0)
> goto out_close;
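The check that the quoted port_user patch adds can be approximated in plain userspace C with access(2). This is a sketch under assumptions: `check_port_helper` is a hypothetical name, messages go to stderr rather than the kernel log, and the environment variable is queried directly with `getenv()` instead of through the `env` variable that the dependent patch introduces.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 0 if path is executable, or a negative errno value otherwise. */
int check_port_helper(const char *path)
{
	int err;

	assert(path != NULL);

	if (access(path, X_OK) == 0)
		return 0;

	/* Save errno before any call that might overwrite it. */
	err = -errno;
	fprintf(stderr, "error accessing port-helper executable at %s: %s\n",
		path, strerror(-err));

	/* Only suggest the variable when the user has not set it. */
	if (getenv("UML_PORT_HELPER") == NULL)
		fprintf(stderr, "Set UML_PORT_HELPER environment variable to "
			"path to uml-utilities port-helper binary\n");

	return err;
}
```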
On 5/20/2022 10:06 AM, Jan Beulich wrote:
> On 20.05.2022 15:33, Chuck Zmudzinski wrote:
>> On 5/20/2022 5:41 AM, Jan Beulich wrote:
>>> On 20.05.2022 10:30, Chuck Zmudzinski wrote:
>>>> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>>>>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>>>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>>>>
>>>>>>>>> ... these uses there are several more. You say nothing on why
>>>>>>>>> those want
>>>>>>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>>>>>>> and came to the conclusion that these all would also better
>>>>>>>>> observe the
>>>>>>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>>>>>>> only predicate). In fact, as said in the description of my earlier
>>>>>>>>> patch, in
>>>>>>>>> my debugging I did find the use in i915_gem_object_pin_map() to be
>>>>>>>>> the
>>>>>>>>> problematic one, which you leave alone.
>>>>>>>> Oh, I missed that one, sorry.
>>>>>>> That is why your patch would not fix my Haswell unless
>>>>>>> it also touches i915_gem_object_pin_map() in
>>>>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>>>>
>>>>>>>> I wanted to be rather defensive in my changes, but I agree at least
>>>>>>>> the
>>>>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>>>>> I think your approach needs to be more aggressive so it will fix
>>>>>>> all the known false negatives introduced by bdd8b6c98239
>>>>>>> such as the one in i915_gem_object_pin_map().
>>>>>>>
>>>>>>> I looked at Jan's approach and I think it would fix the issue
>>>>>>> with my Haswell as long as I don't use the nopat option. I
>>>>>>> really don't have a strong opinion on that question, but I
>>>>>>> think the nopat option as a Linux kernel option, as opposed
>>>>>>> to a hypervisor option, should only affect the kernel, and
>>>>>>> if the hypervisor provides the pat feature, then the kernel
>>>>>>> should not override that,
>>>>>> Hmm, why would the kernel not be allowed to override that? Such
>>>>>> an override would affect only the single domain where the
>>>>>> kernel runs; other domains could take their own decisions.
>>>>>>
>>>>>> Also, for the sake of completeness: "nopat" used when running on
>>>>>> bare metal has the same bad effect on system boot, so there
>>>>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>>>>> that's orthogonal, and I expect the maintainers may not even care
>>>>>> (but tell us "don't do that then").
>>>> Actually I just did a test with the last official Debian kernel
>>>> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
>>>> applied. In fact, the nopat option does *not* break the i915 driver
>>>> in 5.16. That is, with the nopat option, the i915 driver loads
>>>> normally on both the bare metal and on the Xen hypervisor.
>>>> That means your presumption (and the presumption of
>>>> the author of bdd8b6c98239) that the "nopat" option was
>>>> being observed by the i915 driver is incorrect. Setting "nopat"
>>>> had no effect on my system with Linux 5.16. So after doing these
>>>> tests, I am against the aggressive approach of breaking the i915
>>>> driver with the "nopat" option because prior to bdd8b6c98239,
>>>> nopat did not break the i915 driver. Why break it now?
>>> Because that, in my understanding, is the purpose of "nopat"
>>> (not breaking the driver of course - that's a driver bug -, but
>>> having an effect on the driver).
>> I wouldn't call it a driver bug, but an incorrect configuration of the
>> kernel by the user. I presume X86_FEATURE_PAT is required by the
>> i915 driver
> The driver ought to work fine without PAT (and hence without being
> able to make WC mappings). It would use UC instead and be slow, but
> it ought to work.
>
>> and therefore the driver should refuse to disable
>> it if the user requests to disable it and instead warn the user that
>> the driver did not disable the feature, contrary to what the user
>> requested with the nopat option.
>>
>> In any case, my test did not verify that when nopat is set in Linux 5.16,
>> the thread takes the same code path as when nopat is not set,
>> so I am not totally sure that the reason nopat does not break the
>> i915 driver in 5.16 is that static_cpu_has(X86_FEATURE_PAT)
>> returns true even when nopat is set. I could test it with a custom
>> log message in 5.16 if that is necessary.
>>
>> Are you saying it was wrong for static_cpu_has(X86_FEATURE_PAT)
>> to return true in 5.16 when the user requests nopat?
> No, I'm not saying that. It was wrong for this construct to be used
> in the driver, which was fixed for 5.17 (and which had caused the
> regression I did observe, leading to the patch as a hopefully least
> bad option).
>
>> I think that is
>> just permitting a bad configuration to break the driver that a
>> well-written operating system should not allow. The i915 driver
>> was, in my opinion, correctly ignoring the nopat option in 5.16
>> because that option is not compatible with the hardware the
>> i915 driver is trying to initialize and setup at boot time. At least
>> that is my understanding now, but I will need to test it on 5.16
>> to be sure I understand it correctly.
>>
>> Also, AFAICT, your patch would break the driver when the nopat
>> option is set and only fix the regression introduced by bdd8b6c98239
>> when nopat is not set on my box, so your patch would
>> introduce a regression relative to Linux 5.16 and earlier for the
>> case when nopat is set on my box. I think your point would
>> be that it is not a regression if it is an incorrect user configuration.
> Again no - my view is that there's a separate, pre-existing issue
> in the driver which was uncovered by the change. This may be a
> perceived regression, but is imo different from a real one.
>
> Jan
Since it is a regression, I think for now bdd8b6c98239 should
be reverted and the fix backported to Linux 5.17 stable until
the underlying memory subsystem can provide the i915 driver
with an updated test for the PAT feature that also meets the
requirements of the author of bdd8b6c98239 without breaking
the i915 driver. The i915 driver relies on the memory subsystem
to provide it with an accurate test for the existence of
X86_FEATURE_PAT. I think your patch provides that more accurate
test so that bdd8b6c98239 could be re-applied when your patch is
committed. Juergen's patch would have to touch bdd8b6c98239
with new functions that probably have unknown and unintended
consequences, so I think your approach is also better in that regard.
As regards your patch, there is just a disagreement about how the
i915 driver should behave if nopat is set. I agree the i915 driver
could do a better job handling that case, at least with better error
logs.
Chuck
>
>> I respond by saying a well-written driver should refuse to honor
>> the incorrect configuration requested by the user and instead
>> warn the user that it did not honor the incorrect kernel option.
>>
>> I am only presuming what your patch would do on my box based
>> on what I learned about this problem from my debugging. I can
>> also test your patch on my box to verify that my understanding of
>> it is correct.
>>
>> I also have not yet verified Juergen's patch will not fix it, but
>> I am almost certain it will not unless it is expanded so it also
>> touches i915_gem_object_pin_map() with the fix. I plan to test
>> his patch, but expanded so it touches that function also.
>>
>> I also plan to test your patch with and without nopat and report the
>> results in the thread where you posted your patch. Hopefully
>> by tomorrow I will have the results.
>>
>> Chuck
>>
commit eadb2f47a3ced5c64b23b90fd2a3463f63726066 upstream.
KGDB and KDB allow read and write access to kernel memory, and thus
should be restricted during lockdown. An attacker with access to a
serial port (for example, via a hypervisor console, which some cloud
vendors provide over the network) could trigger the debugger so it is
important that the debugger respect the lockdown mode when/if it is
triggered.
Fix this by integrating lockdown into kdb's existing permissions
mechanism. Unfortunately kgdb does not have any permissions mechanism
(although it certainly could be added later) so, for now, kgdb is simply
and brutally disabled by immediately exiting the gdb stub without taking
any action.
For lockdowns established early in the boot (i.e. the normal case)
this should be fine, but on systems where kgdb has set breakpoints before
the lockdown is enacted then "bad things" will happen.
CVE: CVE-2022-21499
Co-developed-by: Stephen Brennan <stephen.s.brennan(a)oracle.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan(a)oracle.com>
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
Notes:
Original patch did not backport cleanly. This backport is fixed up,
compile tested (on arm64) and side-by-side compared against the
original.
include/linux/security.h | 2 ++
kernel/debug/debug_core.c | 24 ++++++++++++++
kernel/debug/kdb/kdb_main.c | 62 ++++++++++++++++++++++++++++++++++--
security/lockdown/lockdown.c | 2 ++
4 files changed, 87 insertions(+), 3 deletions(-)
diff --git a/include/linux/security.h b/include/linux/security.h
index 3f6b8195ae9e..aa5c7141c8d1 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -118,10 +118,12 @@ enum lockdown_reason {
LOCKDOWN_MMIOTRACE,
LOCKDOWN_DEBUGFS,
LOCKDOWN_XMON_WR,
+ LOCKDOWN_DBG_WRITE_KERNEL,
LOCKDOWN_INTEGRITY_MAX,
LOCKDOWN_KCORE,
LOCKDOWN_KPROBES,
LOCKDOWN_BPF_READ,
+ LOCKDOWN_DBG_READ_KERNEL,
LOCKDOWN_PERF,
LOCKDOWN_TRACEFS,
LOCKDOWN_XMON_RW,
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 097ab02989f9..565987557ad8 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -56,6 +56,7 @@
#include <linux/vmacache.h>
#include <linux/rcupdate.h>
#include <linux/irq.h>
+#include <linux/security.h>
#include <asm/cacheflush.h>
#include <asm/byteorder.h>
@@ -685,6 +686,29 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
continue;
kgdb_connected = 0;
} else {
+ /*
+ * This is a brutal way to interfere with the debugger
+ * and prevent gdb being used to poke at kernel memory.
+ * This could cause trouble if lockdown is applied when
+ * there is already an active gdb session. For now the
+ * answer is simply "don't do that". Typically lockdown
+ * *will* be applied before the debug core gets started
+ * so only developers using kgdb for fairly advanced
+ * early kernel debug can be biten by this. Hopefully
+ * they are sophisticated enough to take care of
+ * themselves, especially with help from the lockdown
+ * message printed on the console!
+ */
+ if (security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL)) {
+ if (IS_ENABLED(CONFIG_KGDB_KDB)) {
+ /* Switch back to kdb if possible... */
+ dbg_kdb_mode = 1;
+ continue;
+ } else {
+ /* ... otherwise just bail */
+ break;
+ }
+ }
error = gdb_serial_stub(ks);
}
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 4567fe998c30..7c96bf9a6c2c 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -45,6 +45,7 @@
#include <linux/proc_fs.h>
#include <linux/uaccess.h>
#include <linux/slab.h>
+#include <linux/security.h>
#include "kdb_private.h"
#undef MODULE_PARAM_PREFIX
@@ -198,10 +199,62 @@ struct task_struct *kdb_curr_task(int cpu)
}
/*
- * Check whether the flags of the current command and the permissions
- * of the kdb console has allow a command to be run.
+ * Update the permissions flags (kdb_cmd_enabled) to match the
+ * current lockdown state.
+ *
+ * Within this function the calls to security_locked_down() are "lazy". We
+ * avoid calling them if the current value of kdb_cmd_enabled already excludes
+ * flags that might be subject to lockdown. Additionally we deliberately check
+ * the lockdown flags independently (even though read lockdown implies write
+ * lockdown) since that results in both simpler code and clearer messages to
+ * the user on first-time debugger entry.
+ *
+ * The permission masks during a read+write lockdown permits the following
+ * flags: INSPECT, SIGNAL, REBOOT (and ALWAYS_SAFE).
+ *
+ * The INSPECT commands are not blocked during lockdown because they are
+ * not arbitrary memory reads. INSPECT covers the backtrace family (sometimes
+ * forcing them to have no arguments) and lsmod. These commands do expose
+ * some kernel state but do not allow the developer seated at the console to
+ * choose what state is reported. SIGNAL and REBOOT should not be controversial,
+ * given these are allowed for root during lockdown already.
+ */
+static void kdb_check_for_lockdown(void)
+{
+ const int write_flags = KDB_ENABLE_MEM_WRITE |
+ KDB_ENABLE_REG_WRITE |
+ KDB_ENABLE_FLOW_CTRL;
+ const int read_flags = KDB_ENABLE_MEM_READ |
+ KDB_ENABLE_REG_READ;
+
+ bool need_to_lockdown_write = false;
+ bool need_to_lockdown_read = false;
+
+ if (kdb_cmd_enabled & (KDB_ENABLE_ALL | write_flags))
+ need_to_lockdown_write =
+ security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL);
+
+ if (kdb_cmd_enabled & (KDB_ENABLE_ALL | read_flags))
+ need_to_lockdown_read =
+ security_locked_down(LOCKDOWN_DBG_READ_KERNEL);
+
+ /* De-compose KDB_ENABLE_ALL if required */
+ if (need_to_lockdown_write || need_to_lockdown_read)
+ if (kdb_cmd_enabled & KDB_ENABLE_ALL)
+ kdb_cmd_enabled = KDB_ENABLE_MASK & ~KDB_ENABLE_ALL;
+
+ if (need_to_lockdown_write)
+ kdb_cmd_enabled &= ~write_flags;
+
+ if (need_to_lockdown_read)
+ kdb_cmd_enabled &= ~read_flags;
+}
+
+/*
+ * Check whether the flags of the current command, the permissions of the kdb
+ * console and the lockdown state allow a command to be run.
*/
-static inline bool kdb_check_flags(kdb_cmdflags_t flags, int permissions,
+static bool kdb_check_flags(kdb_cmdflags_t flags, int permissions,
bool no_args)
{
/* permissions comes from userspace so needs massaging slightly */
@@ -1188,6 +1241,9 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
kdb_curr_task(raw_smp_processor_id());
KDB_DEBUG_STATE("kdb_local 1", reason);
+
+ kdb_check_for_lockdown();
+
kdb_go_count = 0;
if (reason == KDB_REASON_DEBUG) {
/* special case below */
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
index 3f38583bed06..655a6edb5d7f 100644
--- a/security/lockdown/lockdown.c
+++ b/security/lockdown/lockdown.c
@@ -33,10 +33,12 @@ static const char *const lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
[LOCKDOWN_MMIOTRACE] = "unsafe mmio",
[LOCKDOWN_DEBUGFS] = "debugfs access",
[LOCKDOWN_XMON_WR] = "xmon write access",
+ [LOCKDOWN_DBG_WRITE_KERNEL] = "use of kgdb/kdb to write kernel RAM",
[LOCKDOWN_INTEGRITY_MAX] = "integrity",
[LOCKDOWN_KCORE] = "/proc/kcore access",
[LOCKDOWN_KPROBES] = "use of kprobes",
[LOCKDOWN_BPF_READ] = "use of bpf to read kernel RAM",
+ [LOCKDOWN_DBG_READ_KERNEL] = "use of kgdb/kdb to read kernel RAM",
[LOCKDOWN_PERF] = "unsafe use of perf",
[LOCKDOWN_TRACEFS] = "use of tracefs",
[LOCKDOWN_XMON_RW] = "xmon read and write access",
base-commit: 04b092e4a01a3488e762897e2d29f85eda2c6a60
--
2.35.1
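
The comment added to kgdb_cpu_enter() above describes a three-way outcome
when the gdb stub is about to run. As a rough userspace sketch of that
decision only (not the kernel code): the enum, the function name and the two
boolean parameters below are hypothetical stand-ins for
security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL) and
IS_ENABLED(CONFIG_KGDB_KDB).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, used for illustration only. */
enum demo_action {
	DEMO_RUN_GDB_STUB,	/* no lockdown: the gdb stub runs as before */
	DEMO_SWITCH_TO_KDB,	/* lockdown, kdb built in: fall back to kdb */
	DEMO_EXIT_DEBUGGER,	/* lockdown, no kdb: leave the debugger */
};

/*
 * write_locked_down stands in for the result of the lockdown check,
 * kdb_built_in for the compile-time kdb option.
 */
static enum demo_action demo_gdb_entry(bool write_locked_down,
				       bool kdb_built_in)
{
	if (!write_locked_down)
		return DEMO_RUN_GDB_STUB;

	return kdb_built_in ? DEMO_SWITCH_TO_KDB : DEMO_EXIT_DEBUGGER;
}

/* Small self-check showing the three possible outcomes. */
static void demo_gdb_entry_example(void)
{
	assert(demo_gdb_entry(false, true) == DEMO_RUN_GDB_STUB);
	assert(demo_gdb_entry(true, true) == DEMO_SWITCH_TO_KDB);
	assert(demo_gdb_entry(true, false) == DEMO_EXIT_DEBUGGER);
}
```

In the patch itself the kdb case is handled by setting dbg_kdb_mode and
restarting the loop, and the other case by breaking out of it.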
commit eadb2f47a3ced5c64b23b90fd2a3463f63726066 upstream.
KGDB and KDB allow read and write access to kernel memory, and thus
should be restricted during lockdown. An attacker with access to a
serial port (for example, via a hypervisor console, which some cloud
vendors provide over the network) could trigger the debugger so it is
important that the debugger respect the lockdown mode when/if it is
triggered.
Fix this by integrating lockdown into kdb's existing permissions
mechanism. Unfortunately kgdb does not have any permissions mechanism
(although it certainly could be added later) so, for now, kgdb is simply
and brutally disabled by immediately exiting the gdb stub without taking
any action.
For lockdowns established early in the boot (i.e. the normal case) this
should be fine, but on systems where kgdb has set breakpoints before the
lockdown is enacted "bad things" will happen.
CVE: CVE-2022-21499
Co-developed-by: Stephen Brennan <stephen.s.brennan(a)oracle.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan(a)oracle.com>
Reviewed-by: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
Notes:
Original patch did not backport cleanly. This backport is fixed up,
compile tested (on arm64) and side-by-side compared against the
original.
include/linux/security.h | 2 ++
kernel/debug/debug_core.c | 24 ++++++++++++++
kernel/debug/kdb/kdb_main.c | 62 +++++++++++++++++++++++++++++++++++--
security/security.c | 2 ++
4 files changed, 87 insertions(+), 3 deletions(-)
diff --git a/include/linux/security.h b/include/linux/security.h
index 35355429648e..330029ef7e89 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -121,10 +121,12 @@ enum lockdown_reason {
LOCKDOWN_DEBUGFS,
LOCKDOWN_XMON_WR,
LOCKDOWN_BPF_WRITE_USER,
+ LOCKDOWN_DBG_WRITE_KERNEL,
LOCKDOWN_INTEGRITY_MAX,
LOCKDOWN_KCORE,
LOCKDOWN_KPROBES,
LOCKDOWN_BPF_READ,
+ LOCKDOWN_DBG_READ_KERNEL,
LOCKDOWN_PERF,
LOCKDOWN_TRACEFS,
LOCKDOWN_XMON_RW,
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 8661eb2b1771..0f31b22abe8d 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -56,6 +56,7 @@
#include <linux/vmacache.h>
#include <linux/rcupdate.h>
#include <linux/irq.h>
+#include <linux/security.h>
#include <asm/cacheflush.h>
#include <asm/byteorder.h>
@@ -756,6 +757,29 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
continue;
kgdb_connected = 0;
} else {
+ /*
+ * This is a brutal way to interfere with the debugger
+ * and prevent gdb being used to poke at kernel memory.
+ * This could cause trouble if lockdown is applied when
+ * there is already an active gdb session. For now the
+ * answer is simply "don't do that". Typically lockdown
+ * *will* be applied before the debug core gets started
+ * so only developers using kgdb for fairly advanced
+ * early kernel debug can be biten by this. Hopefully
+ * they are sophisticated enough to take care of
+ * themselves, especially with help from the lockdown
+ * message printed on the console!
+ */
+ if (security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL)) {
+ if (IS_ENABLED(CONFIG_KGDB_KDB)) {
+ /* Switch back to kdb if possible... */
+ dbg_kdb_mode = 1;
+ continue;
+ } else {
+ /* ... otherwise just bail */
+ break;
+ }
+ }
error = gdb_serial_stub(ks);
}
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 930ac1b25ec7..4e09fab52faf 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -45,6 +45,7 @@
#include <linux/proc_fs.h>
#include <linux/uaccess.h>
#include <linux/slab.h>
+#include <linux/security.h>
#include "kdb_private.h"
#undef MODULE_PARAM_PREFIX
@@ -197,10 +198,62 @@ struct task_struct *kdb_curr_task(int cpu)
}
/*
- * Check whether the flags of the current command and the permissions
- * of the kdb console has allow a command to be run.
+ * Update the permissions flags (kdb_cmd_enabled) to match the
+ * current lockdown state.
+ *
+ * Within this function the calls to security_locked_down() are "lazy". We
+ * avoid calling them if the current value of kdb_cmd_enabled already excludes
+ * flags that might be subject to lockdown. Additionally we deliberately check
+ * the lockdown flags independently (even though read lockdown implies write
+ * lockdown) since that results in both simpler code and clearer messages to
+ * the user on first-time debugger entry.
+ *
+ * The permission masks during a read+write lockdown permits the following
+ * flags: INSPECT, SIGNAL, REBOOT (and ALWAYS_SAFE).
+ *
+ * The INSPECT commands are not blocked during lockdown because they are
+ * not arbitrary memory reads. INSPECT covers the backtrace family (sometimes
+ * forcing them to have no arguments) and lsmod. These commands do expose
+ * some kernel state but do not allow the developer seated at the console to
+ * choose what state is reported. SIGNAL and REBOOT should not be controversial,
+ * given these are allowed for root during lockdown already.
+ */
+static void kdb_check_for_lockdown(void)
+{
+ const int write_flags = KDB_ENABLE_MEM_WRITE |
+ KDB_ENABLE_REG_WRITE |
+ KDB_ENABLE_FLOW_CTRL;
+ const int read_flags = KDB_ENABLE_MEM_READ |
+ KDB_ENABLE_REG_READ;
+
+ bool need_to_lockdown_write = false;
+ bool need_to_lockdown_read = false;
+
+ if (kdb_cmd_enabled & (KDB_ENABLE_ALL | write_flags))
+ need_to_lockdown_write =
+ security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL);
+
+ if (kdb_cmd_enabled & (KDB_ENABLE_ALL | read_flags))
+ need_to_lockdown_read =
+ security_locked_down(LOCKDOWN_DBG_READ_KERNEL);
+
+ /* De-compose KDB_ENABLE_ALL if required */
+ if (need_to_lockdown_write || need_to_lockdown_read)
+ if (kdb_cmd_enabled & KDB_ENABLE_ALL)
+ kdb_cmd_enabled = KDB_ENABLE_MASK & ~KDB_ENABLE_ALL;
+
+ if (need_to_lockdown_write)
+ kdb_cmd_enabled &= ~write_flags;
+
+ if (need_to_lockdown_read)
+ kdb_cmd_enabled &= ~read_flags;
+}
+
+/*
+ * Check whether the flags of the current command, the permissions of the kdb
+ * console and the lockdown state allow a command to be run.
*/
-static inline bool kdb_check_flags(kdb_cmdflags_t flags, int permissions,
+static bool kdb_check_flags(kdb_cmdflags_t flags, int permissions,
bool no_args)
{
/* permissions comes from userspace so needs massaging slightly */
@@ -1194,6 +1247,9 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
kdb_curr_task(raw_smp_processor_id());
KDB_DEBUG_STATE("kdb_local 1", reason);
+
+ kdb_check_for_lockdown();
+
kdb_go_count = 0;
if (reason == KDB_REASON_DEBUG) {
/* special case below */
diff --git a/security/security.c b/security/security.c
index d9d42d64f89f..360706cdabab 100644
--- a/security/security.c
+++ b/security/security.c
@@ -59,10 +59,12 @@ const char *const lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
[LOCKDOWN_DEBUGFS] = "debugfs access",
[LOCKDOWN_XMON_WR] = "xmon write access",
[LOCKDOWN_BPF_WRITE_USER] = "use of bpf to write user RAM",
+ [LOCKDOWN_DBG_WRITE_KERNEL] = "use of kgdb/kdb to write kernel RAM",
[LOCKDOWN_INTEGRITY_MAX] = "integrity",
[LOCKDOWN_KCORE] = "/proc/kcore access",
[LOCKDOWN_KPROBES] = "use of kprobes",
[LOCKDOWN_BPF_READ] = "use of bpf to read kernel RAM",
+ [LOCKDOWN_DBG_READ_KERNEL] = "use of kgdb/kdb to read kernel RAM",
[LOCKDOWN_PERF] = "unsafe use of perf",
[LOCKDOWN_TRACEFS] = "use of tracefs",
[LOCKDOWN_XMON_RW] = "xmon read and write access",
base-commit: c204ee3350ebbc4e2ab108cbce7afc0cac1c407d
--
2.35.1
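
The masking done by kdb_check_for_lockdown() in the patch above can be tried
out in userspace. This is a minimal sketch, not the kernel implementation:
the DEMO_ names and their bit values are assumptions made for illustration
(the real flags are defined in the kernel headers), and the lockdown state is
passed in as two booleans instead of being queried with
security_locked_down().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; assumed, not taken from the patch. */
enum {
	DEMO_ENABLE_ALL		= 1 << 0,
	DEMO_ENABLE_MEM_READ	= 1 << 1,
	DEMO_ENABLE_MEM_WRITE	= 1 << 2,
	DEMO_ENABLE_REG_READ	= 1 << 3,
	DEMO_ENABLE_REG_WRITE	= 1 << 4,
	DEMO_ENABLE_INSPECT	= 1 << 5,
	DEMO_ENABLE_FLOW_CTRL	= 1 << 6,
	DEMO_ENABLE_SIGNAL	= 1 << 7,
	DEMO_ENABLE_REBOOT	= 1 << 8,
	DEMO_ENABLE_ALWAYS_SAFE	= 1 << 9,
	DEMO_ENABLE_MASK	= (1 << 10) - 1,
};

/* Return the permission mask that remains after applying lockdown. */
static int demo_apply_lockdown(int enabled, bool write_locked,
			       bool read_locked)
{
	const int write_flags = DEMO_ENABLE_MEM_WRITE |
				DEMO_ENABLE_REG_WRITE |
				DEMO_ENABLE_FLOW_CTRL;
	const int read_flags = DEMO_ENABLE_MEM_READ |
			       DEMO_ENABLE_REG_READ;
	bool lock_write = false;
	bool lock_read = false;

	/* "Lazy" checks: only consult lockdown if a relevant bit is set. */
	if (enabled & (DEMO_ENABLE_ALL | write_flags))
		lock_write = write_locked;
	if (enabled & (DEMO_ENABLE_ALL | read_flags))
		lock_read = read_locked;

	/* Expand ALL into individual bits so some of them can be cleared. */
	if ((lock_write || lock_read) && (enabled & DEMO_ENABLE_ALL))
		enabled = DEMO_ENABLE_MASK & ~DEMO_ENABLE_ALL;

	if (lock_write)
		enabled &= ~write_flags;
	if (lock_read)
		enabled &= ~read_flags;

	return enabled;
}

/* Self-check: a read+write lockdown leaves only the "safe" groups. */
static void demo_apply_lockdown_example(void)
{
	int left = demo_apply_lockdown(DEMO_ENABLE_ALL, true, true);

	assert(left == (DEMO_ENABLE_INSPECT | DEMO_ENABLE_SIGNAL |
			DEMO_ENABLE_REBOOT | DEMO_ENABLE_ALWAYS_SAFE));
}
```

With these assumed values, starting from the "all enabled" setting and
applying both lockdowns leaves INSPECT, SIGNAL, REBOOT and ALWAYS_SAFE, which
matches the set listed in the function's comment. A mask that already
excludes the read and write groups is returned unchanged.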
[ Upstream commit ae66fb2ba6c3dcaf8b9612b65aa949a1a4bed150 ]
RFC 8684 section 3.7 describes several opportunities for a MPTCP
connection to "fall back" to regular TCP early in the connection
process, before it has been confirmed that MPTCP options can be
successfully propagated on all SYN, SYN/ACK, and data packets. If a peer
acknowledges the first received data packet with a regular TCP header
(no MPTCP options), fallback is allowed.
If the recipient of that first data packet finds a MPTCP DSS checksum
error, this provides an opportunity to fail gracefully with a TCP
fallback rather than resetting the connection (as might happen if a
checksum failure were detected later).
This commit modifies the checksum failure code to attempt fallback on
the initial subflow of a MPTCP connection, only if it's a failure in the
first data mapping. In cases where the peer initiates the connection,
requests checksums, is the first to send data, and sends incorrect
checksums (see
https://github.com/multipath-tcp/mptcp_net-next/issues/275), this allows
the connection to proceed as TCP rather than reset.
Cc: <stable(a)vger.kernel.org> # 5.17.x
Cc: <stable(a)vger.kernel.org> # 5.15.x
Fixes: dd8bcd1768ff ("mptcp: validate the data checksum")
Acked-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
[mathew.j.martineau: backport: Resolved bitfield conflict in protocol.h]
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
This patch is already in 5.17.10-rc1 and 5.15.42-rc1, but involves a
context dependency on upstream commit 4cf86ae84c71 which I have
requested to be dropped from the stable queues.
I'm posting this backport without the protocol.h conflict to
(hopefully?) make it easier for the stable maintainers to drop
4cf86ae84c71.
For context see https://lore.kernel.org/stable/fa953ec-288f-7715-c6fb-47a222e85270@linux.in…
Thanks,
Mat
---
net/mptcp/protocol.h | 3 ++-
net/mptcp/subflow.c | 21 ++++++++++++++++++---
2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index aec767ee047a..46b343a0b17e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -442,7 +442,8 @@ struct mptcp_subflow_context {
rx_eof : 1,
can_ack : 1, /* only after processing the remote a key */
disposable : 1, /* ctx can be free at ulp release time */
- stale : 1; /* unable to snd/rcv data, do not use for xmit */
+ stale : 1, /* unable to snd/rcv data, do not use for xmit */
+ valid_csum_seen : 1; /* at least one csum validated */
enum mptcp_data_avail data_avail;
u32 remote_nonce;
u64 thmac;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 651f01d13191..8d5ddf8e3ef7 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -913,11 +913,14 @@ static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *
subflow->map_data_csum);
if (unlikely(csum)) {
MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DATACSUMERR);
- subflow->send_mp_fail = 1;
- MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPFAILTX);
+ if (subflow->mp_join || subflow->valid_csum_seen) {
+ subflow->send_mp_fail = 1;
+ MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPFAILTX);
+ }
return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
}
+ subflow->valid_csum_seen = 1;
return MAPPING_OK;
}
@@ -1099,6 +1102,18 @@ static void subflow_sched_work_if_closed(struct mptcp_sock *msk, struct sock *ss
}
}
+static bool subflow_can_fallback(struct mptcp_subflow_context *subflow)
+{
+ struct mptcp_sock *msk = mptcp_sk(subflow->conn);
+
+ if (subflow->mp_join)
+ return false;
+ else if (READ_ONCE(msk->csum_enabled))
+ return !subflow->valid_csum_seen;
+ else
+ return !subflow->fully_established;
+}
+
static bool subflow_check_data_avail(struct sock *ssk)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
@@ -1176,7 +1191,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
return true;
}
- if (subflow->mp_join || subflow->fully_established) {
+ if (!subflow_can_fallback(subflow)) {
/* fatal protocol error, close the socket.
* subflow_error_report() will introduce the appropriate barriers
*/
--
2.36.1
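
The fallback rule introduced by the patch above can be exercised outside the
kernel. The following is a sketch under stated assumptions, not the kernel
code: struct demo_subflow and the demo_ functions are hypothetical, keeping
only the three subflow bits the patch consults, and csum_enabled is passed as
a plain parameter instead of being read from the MPTCP socket.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cut-down view of the subflow state used by the patch. */
struct demo_subflow {
	bool mp_join;		/* additional subflow, not the initial one */
	bool fully_established;
	bool valid_csum_seen;	/* at least one DSS checksum validated */
};

/* Mirrors the decision made by subflow_can_fallback() in the patch. */
static bool demo_can_fallback(const struct demo_subflow *sf,
			      bool csum_enabled)
{
	if (sf->mp_join)
		return false;
	if (csum_enabled)
		return !sf->valid_csum_seen;
	return !sf->fully_established;
}

/* On a checksum error, MP_FAIL is only sent once fallback is ruled out. */
static bool demo_send_mp_fail(const struct demo_subflow *sf)
{
	return sf->mp_join || sf->valid_csum_seen;
}

/* Self-check: a bad checksum on the very first mapping allows fallback. */
static void demo_fallback_example(void)
{
	struct demo_subflow first = { false, true, false };

	assert(demo_can_fallback(&first, true));
	assert(!demo_send_mp_fail(&first));
}
```

The case the commit message targets is the first one checked here: an initial
subflow with checksums enabled and no valid checksum seen yet may still fall
back to TCP, even if it is already fully established.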
I'm announcing the release of the 5.17.11 kernel.
All users of the 5.17 kernel series must upgrade.
The updated 5.17.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.17.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
net/mptcp/protocol.h | 3 ++-
net/mptcp/subflow.c | 21 ++++++++++++++++++---
3 files changed, 21 insertions(+), 5 deletions(-)
Greg Kroah-Hartman (1):
Linux 5.17.11
Mat Martineau (1):
mptcp: Do TCP fallback on early DSS checksum failure
I'm announcing the release of the 5.15.43 kernel.
All users of the 5.15 kernel series must upgrade.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
net/mptcp/protocol.h | 3 ++-
net/mptcp/subflow.c | 21 ++++++++++++++++++---
3 files changed, 21 insertions(+), 5 deletions(-)
Greg Kroah-Hartman (1):
Linux 5.15.43
Mat Martineau (1):
mptcp: Do TCP fallback on early DSS checksum failure