This is a note to let you know that I've just added the patch titled
x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-oprofile-ppro-do-not-use-__this_cpu-in-preemptible-context.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From a743bbeef27b9176987ec0cb7f906ab0ab52d1da Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 7 Nov 2017 18:53:07 +0100
Subject: x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
From: Borislav Petkov <bp(a)suse.de>
commit a743bbeef27b9176987ec0cb7f906ab0ab52d1da upstream.
The warning below says it all:
BUG: using __this_cpu_read() in preemptible [00000000] code: swapper/0/1
caller is __this_cpu_preempt_check
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc8 #4
Call Trace:
dump_stack
check_preemption_disabled
? do_early_param
__this_cpu_preempt_check
arch_perfmon_init
op_nmi_init
? alloc_pci_root_info
oprofile_arch_init
oprofile_init
do_one_initcall
...
These accessors should not have been used in the first place: it is PPro so
no mixed silicon revisions and thus it can simply use boot_cpu_data.
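For context, a minimal sketch (not part of the patch) of why the accessor warns here and what the preemption-safe alternatives look like; example_cpu_checks() is a hypothetical helper used only for illustration:

#include <linux/percpu.h>
#include <linux/preempt.h>
#include <asm/processor.h>

static void example_cpu_checks(void)
{
	unsigned int fam;

	/* Fine in preemptible context: boot_cpu_data is a plain global. */
	fam = boot_cpu_data.x86;

	/*
	 * If a per-CPU value really were needed, preemption would have to be
	 * disabled around the __this_cpu_*() access so the task cannot migrate.
	 */
	preempt_disable();
	fam = __this_cpu_read(cpu_info.x86);
	preempt_enable();

	(void)fam;
}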
Reported-by: Fengguang Wu <fengguang.wu(a)intel.com>
Tested-by: Fengguang Wu <fengguang.wu(a)intel.com>
Fix-creation-mandated-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Robert Richter <rric(a)kernel.org>
Cc: x86(a)kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/oprofile/op_model_ppro.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -212,8 +212,8 @@ static void arch_perfmon_setup_counters(
eax.full = cpuid_eax(0xa);
/* Workaround for BIOS bugs in 6/15. Taken from perfmon2 */
- if (eax.split.version_id == 0 && __this_cpu_read(cpu_info.x86) == 6 &&
- __this_cpu_read(cpu_info.x86_model) == 15) {
+ if (eax.split.version_id == 0 && boot_cpu_data.x86 == 6 &&
+ boot_cpu_data.x86_model == 15) {
eax.split.version_id = 2;
eax.split.num_counters = 2;
eax.split.bit_width = 40;
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.13/x86-oprofile-ppro-do-not-use-__this_cpu-in-preemptible-context.patch
This is a note to let you know that I've just added the patch titled
x86/debug: Handle warnings before the notifier chain, to fix KGDB crash
to the 4.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-debug-handle-warnings-before-the-notifier-chain-to-fix-kgdb-crash.patch
and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From b8347c2196492f4e1cccde3d92fda1cc2cc7de7e Mon Sep 17 00:00:00 2001
From: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Date: Mon, 24 Jul 2017 13:04:28 +0300
Subject: x86/debug: Handle warnings before the notifier chain, to fix KGDB crash
From: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
commit b8347c2196492f4e1cccde3d92fda1cc2cc7de7e upstream.
Commit:
9a93848fe787 ("x86/debug: Implement __WARN() using UD0")
turned warnings into UD0, but the fixup code only runs after the
notify_die() chain. This is a problem, in particular, with kgdb,
which kicks in as if it was a BUG().
Fix this by running the fixup code before the notifier chain in
the invalid op handler path.
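For illustration only (not part of the patch): after 9a93848fe787, a WARN*() taken in kernel mode executes a ud0 instruction and traps into the invalid-opcode path handled by do_error_trap(), which is why the ordering against notify_die() matters for kgdb. A hedged sketch of a trigger; example_driver_init() is a hypothetical function:

#include <linux/bug.h>
#include <linux/init.h>

static int __init example_driver_init(void)
{
	/*
	 * With __WARN() built on UD0 this executes a ud0 trap rather than a
	 * call into a warn slowpath.  Before this fix, notify_die() ran first
	 * and kgdb treated the trap as a fatal BUG(); with fixup_bug() moved
	 * ahead of the notifier chain, the warning is reported and execution
	 * continues past the WARN_ON().
	 */
	WARN_ON(1);
	return 0;
}
device_initcall(example_driver_init);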
Signed-off-by: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Tested-by: Ilya Dryomov <idryomov(a)gmail.com>
Acked-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Jason Wessel <jason.wessel(a)windriver.com>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Richard Weinberger <richard.weinberger(a)gmail.com>
Link: http://lkml.kernel.org/r/20170724100428.19173-1-alexander.shishkin@linux.in…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/traps.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -221,9 +221,6 @@ do_trap_no_signal(struct task_struct *ts
if (fixup_exception(regs, trapnr))
return 0;
- if (fixup_bug(regs, trapnr))
- return 0;
-
tsk->thread.error_code = error_code;
tsk->thread.trap_nr = trapnr;
die(str, regs, error_code);
@@ -304,6 +301,13 @@ static void do_error_trap(struct pt_regs
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+ /*
+ * WARN*()s end up here; fix them up before we call the
+ * notifier chain.
+ */
+ if (!user_mode(regs) && fixup_bug(regs, trapnr))
+ return;
+
if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
NOTIFY_STOP) {
cond_local_irq_enable(regs);
Patches currently in stable-queue which might be from alexander.shishkin(a)linux.intel.com are
queue-4.13/x86-debug-handle-warnings-before-the-notifier-chain-to-fix-kgdb-crash.patch
Please cherry-pick this fix for 4.4 and later stable branches:
commit 06aae592425701851e02bb850cb9f4997f0ae163
Author: Colin Ian King <colin.king(a)canonical.com>
Date: Sat Feb 27 12:45:26 2016 +0000
PKCS#7: fix unitialized boolean 'want'
Ben.
--
Ben Hutchings
Software Developer, Codethink Ltd.
This is a note to let you know that I've just added the patch titled
x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-oprofile-ppro-do-not-use-__this_cpu-in-preemptible-context.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From a743bbeef27b9176987ec0cb7f906ab0ab52d1da Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 7 Nov 2017 18:53:07 +0100
Subject: x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
From: Borislav Petkov <bp(a)suse.de>
commit a743bbeef27b9176987ec0cb7f906ab0ab52d1da upstream.
The warning below says it all:
BUG: using __this_cpu_read() in preemptible [00000000] code: swapper/0/1
caller is __this_cpu_preempt_check
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc8 #4
Call Trace:
dump_stack
check_preemption_disabled
? do_early_param
__this_cpu_preempt_check
arch_perfmon_init
op_nmi_init
? alloc_pci_root_info
oprofile_arch_init
oprofile_init
do_one_initcall
...
These accessors should not have been used in the first place: it is PPro so
no mixed silicon revisions and thus it can simply use boot_cpu_data.
Reported-by: Fengguang Wu <fengguang.wu(a)intel.com>
Tested-by: Fengguang Wu <fengguang.wu(a)intel.com>
Fix-creation-mandated-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Robert Richter <rric(a)kernel.org>
Cc: x86(a)kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/oprofile/op_model_ppro.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/oprofile/op_model_ppro.c
+++ b/arch/x86/oprofile/op_model_ppro.c
@@ -212,8 +212,8 @@ static void arch_perfmon_setup_counters(
eax.full = cpuid_eax(0xa);
/* Workaround for BIOS bugs in 6/15. Taken from perfmon2 */
- if (eax.split.version_id == 0 && __this_cpu_read(cpu_info.x86) == 6 &&
- __this_cpu_read(cpu_info.x86_model) == 15) {
+ if (eax.split.version_id == 0 && boot_cpu_data.x86 == 6 &&
+ boot_cpu_data.x86_model == 15) {
eax.split.version_id = 2;
eax.split.num_counters = 2;
eax.split.bit_width = 40;
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-3.18/x86-oprofile-ppro-do-not-use-__this_cpu-in-preemptible-context.patch
This is a note to let you know that I've just added the patch titled
netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable"
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nat-revert-netfilter-nat-convert-nat-bysrc-hash-to-rhashtable.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From e1bf1687740ce1a3598a1c5e452b852ff2190682 Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw(a)strlen.de>
Date: Wed, 6 Sep 2017 14:39:51 +0200
Subject: netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable"
From: Florian Westphal <fw(a)strlen.de>
commit e1bf1687740ce1a3598a1c5e452b852ff2190682 upstream.
This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1.
It was not a good idea. The custom hash table was a much better
fit for this purpose.
A fast lookup is not essential; in fact, in most cases there is no lookup
at all because the original tuple is not taken and can be used as-is.
What needs to be fast is insertion and deletion.
rhlist removal, however, requires a walk of the rhlist.
We can have thousands of entries in such a list if source ports/addresses
are reused for multiple flows; when this happens, removal requests become so
expensive that deleting a few thousand flows can take several
seconds(!).
The advantages that we got from rhashtable are:
1) table auto-sizing
2) multiple locks
1) would be nice to have, but it is not essential, as we have at
most one lookup per new flow, so even a million flows in the bysource
table are not a problem compared to the current deletion cost.
2) is easy to add to a custom hash table.
I tried to add hlist_node to rhlist to speed up rhltable_remove but this
isn't doable without changing semantics. rhltable_remove_fast will
check that the to-be-deleted object is part of the table and that
requires a list walk that we want to avoid.
Furthermore, using hlist_node increases size of struct rhlist_head, which
in turn increases nf_conn size.
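To make the deletion-cost argument concrete, here is a small sketch (illustrative only; struct example_entry and example_unhash() are hypothetical) of the O(1) unlink that the revert below restores under nf_nat_lock: an hlist_node carries a pprev back-pointer, so removal needs no bucket walk, unlike rhltable_remove():

#include <linux/list.h>
#include <linux/rculist.h>
#include <linux/spinlock.h>

struct example_entry {
	struct hlist_node node;		/* analogous to ct->nat_bysource */
};

static DEFINE_SPINLOCK(example_lock);	/* analogous to nf_nat_lock */

static void example_unhash(struct example_entry *e)
{
	spin_lock_bh(&example_lock);
	hlist_del_rcu(&e->node);	/* O(1): uses node.pprev, no list walk */
	spin_unlock_bh(&example_lock);
}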
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821
Reported-by: Ivan Babrou <ibobrik(a)gmail.com>
Signed-off-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/netfilter/nf_conntrack.h | 3
include/net/netfilter/nf_nat.h | 1
net/netfilter/nf_nat_core.c | 131 ++++++++++++++---------------------
3 files changed, 56 insertions(+), 79 deletions(-)
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -17,7 +17,6 @@
#include <linux/bitops.h>
#include <linux/compiler.h>
#include <linux/atomic.h>
-#include <linux/rhashtable.h>
#include <linux/netfilter/nf_conntrack_tcp.h>
#include <linux/netfilter/nf_conntrack_dccp.h>
@@ -101,7 +100,7 @@ struct nf_conn {
possible_net_t ct_net;
#if IS_ENABLED(CONFIG_NF_NAT)
- struct rhlist_head nat_bysource;
+ struct hlist_node nat_bysource;
#endif
/* all members below initialized via memset */
u8 __nfct_init_offset[0];
--- a/include/net/netfilter/nf_nat.h
+++ b/include/net/netfilter/nf_nat.h
@@ -1,6 +1,5 @@
#ifndef _NF_NAT_H
#define _NF_NAT_H
-#include <linux/rhashtable.h>
#include <linux/netfilter_ipv4.h>
#include <linux/netfilter/nf_nat.h>
#include <net/netfilter/nf_conntrack_tuple.h>
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -30,19 +30,17 @@
#include <net/netfilter/nf_conntrack_zones.h>
#include <linux/netfilter/nf_nat.h>
+static DEFINE_SPINLOCK(nf_nat_lock);
+
static DEFINE_MUTEX(nf_nat_proto_mutex);
static const struct nf_nat_l3proto __rcu *nf_nat_l3protos[NFPROTO_NUMPROTO]
__read_mostly;
static const struct nf_nat_l4proto __rcu **nf_nat_l4protos[NFPROTO_NUMPROTO]
__read_mostly;
-struct nf_nat_conn_key {
- const struct net *net;
- const struct nf_conntrack_tuple *tuple;
- const struct nf_conntrack_zone *zone;
-};
-
-static struct rhltable nf_nat_bysource_table;
+static struct hlist_head *nf_nat_bysource __read_mostly;
+static unsigned int nf_nat_htable_size __read_mostly;
+static unsigned int nf_nat_hash_rnd __read_mostly;
inline const struct nf_nat_l3proto *
__nf_nat_l3proto_find(u8 family)
@@ -121,17 +119,19 @@ int nf_xfrm_me_harder(struct net *net, s
EXPORT_SYMBOL(nf_xfrm_me_harder);
#endif /* CONFIG_XFRM */
-static u32 nf_nat_bysource_hash(const void *data, u32 len, u32 seed)
+/* We keep an extra hash for each conntrack, for fast searching. */
+static inline unsigned int
+hash_by_src(const struct net *n, const struct nf_conntrack_tuple *tuple)
{
- const struct nf_conntrack_tuple *t;
- const struct nf_conn *ct = data;
+ unsigned int hash;
+
+ get_random_once(&nf_nat_hash_rnd, sizeof(nf_nat_hash_rnd));
- t = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
/* Original src, to ensure we map it consistently if poss. */
+ hash = jhash2((u32 *)&tuple->src, sizeof(tuple->src) / sizeof(u32),
+ tuple->dst.protonum ^ nf_nat_hash_rnd ^ net_hash_mix(n));
- seed ^= net_hash_mix(nf_ct_net(ct));
- return jhash2((const u32 *)&t->src, sizeof(t->src) / sizeof(u32),
- t->dst.protonum ^ seed);
+ return reciprocal_scale(hash, nf_nat_htable_size);
}
/* Is this tuple already taken? (not by us) */
@@ -187,28 +187,6 @@ same_src(const struct nf_conn *ct,
t->src.u.all == tuple->src.u.all);
}
-static int nf_nat_bysource_cmp(struct rhashtable_compare_arg *arg,
- const void *obj)
-{
- const struct nf_nat_conn_key *key = arg->key;
- const struct nf_conn *ct = obj;
-
- if (!same_src(ct, key->tuple) ||
- !net_eq(nf_ct_net(ct), key->net) ||
- !nf_ct_zone_equal(ct, key->zone, IP_CT_DIR_ORIGINAL))
- return 1;
-
- return 0;
-}
-
-static struct rhashtable_params nf_nat_bysource_params = {
- .head_offset = offsetof(struct nf_conn, nat_bysource),
- .obj_hashfn = nf_nat_bysource_hash,
- .obj_cmpfn = nf_nat_bysource_cmp,
- .nelem_hint = 256,
- .min_size = 1024,
-};
-
/* Only called for SRC manip */
static int
find_appropriate_src(struct net *net,
@@ -219,26 +197,22 @@ find_appropriate_src(struct net *net,
struct nf_conntrack_tuple *result,
const struct nf_nat_range *range)
{
+ unsigned int h = hash_by_src(net, tuple);
const struct nf_conn *ct;
- struct nf_nat_conn_key key = {
- .net = net,
- .tuple = tuple,
- .zone = zone
- };
- struct rhlist_head *hl, *h;
-
- hl = rhltable_lookup(&nf_nat_bysource_table, &key,
- nf_nat_bysource_params);
- rhl_for_each_entry_rcu(ct, h, hl, nat_bysource) {
- nf_ct_invert_tuplepr(result,
- &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
- result->dst = tuple->dst;
+ hlist_for_each_entry_rcu(ct, &nf_nat_bysource[h], nat_bysource) {
+ if (same_src(ct, tuple) &&
+ net_eq(net, nf_ct_net(ct)) &&
+ nf_ct_zone_equal(ct, zone, IP_CT_DIR_ORIGINAL)) {
+ /* Copy source part from reply tuple. */
+ nf_ct_invert_tuplepr(result,
+ &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
+ result->dst = tuple->dst;
- if (in_range(l3proto, l4proto, result, range))
- return 1;
+ if (in_range(l3proto, l4proto, result, range))
+ return 1;
+ }
}
-
return 0;
}
@@ -411,6 +385,7 @@ nf_nat_setup_info(struct nf_conn *ct,
const struct nf_nat_range *range,
enum nf_nat_manip_type maniptype)
{
+ struct net *net = nf_ct_net(ct);
struct nf_conntrack_tuple curr_tuple, new_tuple;
struct nf_conn_nat *nat;
@@ -452,19 +427,16 @@ nf_nat_setup_info(struct nf_conn *ct,
}
if (maniptype == NF_NAT_MANIP_SRC) {
- struct nf_nat_conn_key key = {
- .net = nf_ct_net(ct),
- .tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
- .zone = nf_ct_zone(ct),
- };
- int err;
-
- err = rhltable_insert_key(&nf_nat_bysource_table,
- &key,
- &ct->nat_bysource,
- nf_nat_bysource_params);
- if (err)
- return NF_DROP;
+ unsigned int srchash;
+
+ srchash = hash_by_src(net,
+ &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
+ spin_lock_bh(&nf_nat_lock);
+ /* nf_conntrack_alter_reply might re-allocate extension aera */
+ nat = nfct_nat(ct);
+ hlist_add_head_rcu(&ct->nat_bysource,
+ &nf_nat_bysource[srchash]);
+ spin_unlock_bh(&nf_nat_lock);
}
/* It's done. */
@@ -578,9 +550,10 @@ static int nf_nat_proto_clean(struct nf_
* Else, when the conntrack is destoyed, nf_nat_cleanup_conntrack()
* will delete entry from already-freed table.
*/
+ spin_lock_bh(&nf_nat_lock);
+ hlist_del_rcu(&ct->nat_bysource);
ct->status &= ~IPS_NAT_DONE_MASK;
- rhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,
- nf_nat_bysource_params);
+ spin_unlock_bh(&nf_nat_lock);
/* don't delete conntrack. Although that would make things a lot
* simpler, we'd end up flushing all conntracks on nat rmmod.
@@ -710,8 +683,11 @@ static void nf_nat_cleanup_conntrack(str
if (!nat)
return;
- rhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,
- nf_nat_bysource_params);
+ NF_CT_ASSERT(ct->status & IPS_SRC_NAT_DONE);
+
+ spin_lock_bh(&nf_nat_lock);
+ hlist_del_rcu(&ct->nat_bysource);
+ spin_unlock_bh(&nf_nat_lock);
}
static struct nf_ct_ext_type nat_extend __read_mostly = {
@@ -846,13 +822,16 @@ static int __init nf_nat_init(void)
{
int ret;
- ret = rhltable_init(&nf_nat_bysource_table, &nf_nat_bysource_params);
- if (ret)
- return ret;
+ /* Leave them the same for the moment. */
+ nf_nat_htable_size = nf_conntrack_htable_size;
+
+ nf_nat_bysource = nf_ct_alloc_hashtable(&nf_nat_htable_size, 0);
+ if (!nf_nat_bysource)
+ return -ENOMEM;
ret = nf_ct_extend_register(&nat_extend);
if (ret < 0) {
- rhltable_destroy(&nf_nat_bysource_table);
+ nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
printk(KERN_ERR "nf_nat_core: Unable to register extension\n");
return ret;
}
@@ -876,7 +855,7 @@ static int __init nf_nat_init(void)
return 0;
cleanup_extend:
- rhltable_destroy(&nf_nat_bysource_table);
+ nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
nf_ct_extend_unregister(&nat_extend);
return ret;
}
@@ -896,8 +875,8 @@ static void __exit nf_nat_cleanup(void)
for (i = 0; i < NFPROTO_NUMPROTO; i++)
kfree(nf_nat_l4protos[i]);
-
- rhltable_destroy(&nf_nat_bysource_table);
+ synchronize_net();
+ nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
}
MODULE_LICENSE("GPL");
Patches currently in stable-queue which might be from fw(a)strlen.de are
queue-4.9/netfilter-nat-revert-netfilter-nat-convert-nat-bysrc-hash-to-rhashtable.patch
Sometimes, a processor might execute an instruction while another
processor is updating the page tables for that instruction's code page,
but before the TLB shootdown completes. The interesting case happens
if the page is in the TLB.
In general, the processor will succeed in executing the instruction and
nothing bad happens. However, what if the instruction is an MMIO access?
If *that* happens, KVM invokes the emulator, and the emulator gets the
updated page tables. If the update side had marked the code page as
non-present, the page table walk will then fail, and so will x86_decode_insn.
Unfortunately, even though kvm_fetch_guest_virt is correctly returning
X86EMUL_PROPAGATE_FAULT, x86_decode_insn's caller treats the failure as
a fatal error if the instruction cannot simply be reexecuted (as is the
case for MMIO). And this in fact happened sometimes when rebooting
Windows 2012r2 guests. Just checking ctxt->have_exception and injecting
the exception if true is enough to fix the case.
Thanks to Eduardo Habkost for helping in the debugging of this issue.
Reported-by: Yanan Fu <yfu(a)redhat.com>
Cc: Eduardo Habkost <ehabkost(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
arch/x86/kvm/x86.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 34c85aa2e2d1..6dbed9022797 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5722,6 +5722,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
emulation_type))
return EMULATE_DONE;
+ if (ctxt->have_exception && inject_emulated_exception(vcpu))
+ return EMULATE_DONE;
if (emulation_type & EMULTYPE_SKIP)
return EMULATE_FAIL;
return handle_emulation_failure(vcpu);
--
1.8.3.1
Hi Ben,
On Tue, Nov 07, 2017 at 01:42:35PM +0000, Ben Hutchings wrote:
> That function didn't exist in 3.16 (at least not under that name).
Ah, you're right, back then the netlink interface did not
exist in batman-adv yet, only the debugfs one.
batadv_tt_global_print_entry would be the equivalent function
for debugfs. But not worth the effort now, in my opinion.
I'm fine with this proposed patch for 3.16 now. Thanks for the
clarification! And I'm happy to see this patch backported.
Regards, Linus
From: Jérôme Glisse <jglisse(a)redhat.com>
(Result of code inspection only; I do not have a bug report, nor do I know
of a bug that would be explained by this issue. Is there a kernel trace
database one can query for that?)
When a filesystem is mounted with nobh, temporary buffer_heads are created
during write and are only associated with the page when a filesystem
error happens. When that happens, nobh_write_begin() or nobh_write_end()
calls attach_nobh_buffers(), which expects the provided buffer_head
list to be circular (i.e. the tail entry points to the head entry). If that
is not the case, it dereferences the last pointer in the list, which is
NULL (the last item in the list points to NULL, see alloc_page_buffers()),
and thus hits a NULL pointer dereference.
Hence nobh_write_begin() must make the buffer_head list circular, as
alloc_page_buffers() is not responsible for doing that.
(This might be worth including in 4.14, as I don't think it can regress
anything, but I am not a filesystem expert.)
Note that I did not make the list circular inside attach_nobh_buffers(),
as a patchset I am working on also expects the list to be circular
no matter what. But if people are more comfortable with me doing
that in my patchset, the fix can be moved to attach_nobh_buffers().
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
---
fs/buffer.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 170df856bdb9..6bc47c11d6ac 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2598,7 +2598,7 @@ int nobh_write_begin(struct address_space *mapping,
struct inode *inode = mapping->host;
const unsigned blkbits = inode->i_blkbits;
const unsigned blocksize = 1 << blkbits;
- struct buffer_head *head, *bh;
+ struct buffer_head *head, *bh, *tail;
struct page *page;
pgoff_t index;
unsigned from, to;
@@ -2643,6 +2643,13 @@ int nobh_write_begin(struct address_space *mapping,
ret = -ENOMEM;
goto out_release;
}
+ /* We need to make buffer_head list circular to avoid NULL SEGFAULT */
+ bh = head;
+ do {
+ tail = bh;
+ bh = bh->b_this_page;
+ } while (bh);
+ tail->b_this_page = head;
block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
--
2.13.6