On 02/11/2018 08:26, Jan Beulich wrote:
>>>> On 01.11.18 at 17:27, <jgross(a)suse.com> wrote:
>> On 01/11/2018 16:50, Jan Beulich wrote:
>>>>>> Juergen Gross <jgross(a)suse.com> 11/01/18 3:23 PM >>>
>>>> On 01/11/2018 15:18, Jan Beulich wrote:
>>>>>>>> Juergen Gross <jgross(a)suse.com> 11/01/18 1:34 PM >>>
>>>>>> Currently the size of hypercall buffers allocated via
>>>>>> /dev/xen/hypercall is limited to a default of 64 memory pages. For live
>>>>>> migration of guests this might be too small as the page dirty bitmask
>>>>>> needs to be sized according to the size of the guest. This means
>>>>>> migrating a 8GB sized guest is already exhausting the default buffer
>>>>>> size for the dirty bitmap.
>>>>>>
>>>>>> There is no sensible way to set a sane limit, so just remove it
>>>>>> completely. The device node's usage is limited to root anyway, so there
>>>>>> is no additional DOS scenario added by allowing unlimited buffers.
>>>>>
>>>>> But is this setting of permissions what we want long term? What about a
>>>>> de-privileged qemu, which still needs to be able to issue at least dm-op
>>>>> hypercalls?
>>>>
>>>> Wouldn't that qemu have opened the node while still being privileged?
>>>
>>> Possibly, but how does this help? As soon as it's unprivileged it must not
>>> be able to hog resources anymore.
>>>
>>> Anyway, with Andrew's reply my point may be irrelevant, but I have to
>>> admit I'm not entirely sure.
>>
>> I guess we want Xen tools to close /dev/xen/hypercall (or more precisely:
>> don't dup2() it) when qemu is de-privileging itself. This will make it
>> very clear that it can't hog memory via mmap().
>>
>> If you are fine with that I'll send a Xen patch for this.
>
> If that doesn't prevent the process from making the hypercalls it
> is permitted to do (I have to admit I don't recall if there are any
> still needed besides the dmop ones), sure.
Turns out that is already done: the restrict_all callback of libxencall
will associate /dev/null with the file descriptor of /dev/xen/hypercall.
Juergen
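
For readers unfamiliar with the trick mentioned above, here is a minimal sketch of associating /dev/null with an already-open file descriptor. It is not the libxencall code: the helper name redirect_fd_to_devnull() is hypothetical, and the real restrict_all callback may differ in detail. The sketch only shows the general dup2() pattern.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the libxencall implementation: make an
 * already-open descriptor refer to /dev/null. The descriptor number
 * stays valid for code that still holds it, but it no longer refers
 * to the original device node.
 * Returns 0 on success, -1 on failure with errno set.
 */
int redirect_fd_to_devnull(int fd)
{
    int nullfd = open("/dev/null", O_RDWR);

    if (nullfd < 0)
        return -1;

    /* Only possible if fd was not open on entry; it now is /dev/null. */
    if (nullfd == fd)
        return 0;

    /* dup2() atomically closes fd and makes it a copy of nullfd. */
    if (dup2(nullfd, fd) < 0) {
        int saved = errno;

        close(nullfd);
        errno = saved;
        return -1;
    }

    close(nullfd);
    return 0;
}
```

After such a call, an mmap() on the descriptor can no longer reach the hypercall device, because the descriptor refers to /dev/null.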
On Fri, Nov 02, 2018 at 06:13:17AM +0100, gregkh(a)linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> bridge: do not add port to router list when receives query with source 0.0.0.0
>
> to the 4.18-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> bridge-do-not-add-port-to-router-list-when-receives-query-with-source-0.0.0.0.patch
> and it can be found in the queue-4.18 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Hi,
Please also include patch
commit 0fe5119e267f3e3d8ac206895f5922195ec55a8a
Author: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Sat Oct 27 12:07:47 2018 +0300
net: bridge: remove ipv6 zero address check in mcast queries
which fixes this patch.
Thanks
Hangbin
>
>
> From foo@baz Fri Nov 2 06:12:44 CET 2018
> From: Hangbin Liu <liuhangbin(a)gmail.com>
> Date: Fri, 26 Oct 2018 10:28:43 +0800
> Subject: bridge: do not add port to router list when receives query with source 0.0.0.0
>
> From: Hangbin Liu <liuhangbin(a)gmail.com>
>
> [ Upstream commit 5a2de63fd1a59c30c02526d427bc014b98adf508 ]
>
> Based on RFC 4541, 2.1.1. IGMP Forwarding Rules
>
> The switch supporting IGMP snooping must maintain a list of
> multicast routers and the ports on which they are attached. This
> list can be constructed in any combination of the following ways:
>
> a) This list should be built by the snooping switch sending
> Multicast Router Solicitation messages as described in IGMP
> Multicast Router Discovery [MRDISC]. It may also snoop
> Multicast Router Advertisement messages sent by and to other
> nodes.
>
> b) The arrival port for IGMP Queries (sent by multicast routers)
> where the source address is not 0.0.0.0.
>
> We should not add the port to router list when receives query with source
> 0.0.0.0.
>
> Reported-by: Ying Xu <yinxu(a)redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
> Acked-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
> Acked-by: Roopa Prabhu <roopa(a)cumulusnetworks.com>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
> net/bridge/br_multicast.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -1420,7 +1420,15 @@ static void br_multicast_query_received(
> return;
>
> br_multicast_update_query_timer(br, query, max_delay);
> - br_multicast_mark_router(br, port);
> +
> + /* Based on RFC4541, section 2.1.1 IGMP Forwarding Rules,
> + * the arrival port for IGMP Queries where the source address
> + * is 0.0.0.0 should not be added to router port list.
> + */
> + if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) ||
> + (saddr->proto == htons(ETH_P_IPV6) &&
> + !ipv6_addr_any(&saddr->u.ip6)))
> + br_multicast_mark_router(br, port);
> }
>
> static int br_ip4_multicast_query(struct net_bridge *br,
>
>
> Patches currently in stable-queue which might be from liuhangbin(a)gmail.com are
>
> queue-4.18/bridge-do-not-add-port-to-router-list-when-receives-query-with-source-0.0.0.0.patch
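
The condition added by the quoted patch can be sketched in user space as follows. This is an illustration only: struct query_source and both function names are hypothetical stand-ins, not the kernel's struct br_ip or its helpers. Note that the follow-up commit named in the reply above is titled "remove ipv6 zero address check in mcast queries", so the IPv6 branch here reflects the quoted patch alone, not the later state of the code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct br_ip; not the real layout. */
enum addr_family { ADDR_IPV4, ADDR_IPV6 };

struct query_source {
    enum addr_family family;
    union {
        uint32_t ip4;      /* 0 means 0.0.0.0 in any byte order */
        uint8_t  ip6[16];  /* all zero means :: */
    } u;
};

/* True if the IPv6 address is the unspecified address (::). */
bool ip6_is_unspecified(const uint8_t addr[16])
{
    static const uint8_t zero[16];

    return memcmp(addr, zero, sizeof(zero)) == 0;
}

/*
 * Mirrors the condition in the quoted patch: the arrival port of a
 * query is treated as a router port only if the query's source
 * address is not the all-zero address.
 */
bool query_marks_router_port(const struct query_source *src)
{
    if (src->family == ADDR_IPV4)
        return src->u.ip4 != 0;

    return !ip6_is_unspecified(src->u.ip6);
}
```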
On 01/11/2018 16:50, Jan Beulich wrote:
>>>> Juergen Gross <jgross(a)suse.com> 11/01/18 3:23 PM >>>
>> On 01/11/2018 15:18, Jan Beulich wrote:
>>>>>> Juergen Gross <jgross(a)suse.com> 11/01/18 1:34 PM >>>
>>>> Currently the size of hypercall buffers allocated via
>>>> /dev/xen/hypercall is limited to a default of 64 memory pages. For live
>>>> migration of guests this might be too small as the page dirty bitmask
>>>> needs to be sized according to the size of the guest. This means
>>>> migrating a 8GB sized guest is already exhausting the default buffer
>>>> size for the dirty bitmap.
>>>>
>>>> There is no sensible way to set a sane limit, so just remove it
>>>> completely. The device node's usage is limited to root anyway, so there
>>>> is no additional DOS scenario added by allowing unlimited buffers.
>>>
>>> But is this setting of permissions what we want long term? What about a
>>> de-privileged qemu, which still needs to be able to issue at least dm-op
>>> hypercalls?
>>
>> Wouldn't that qemu have opened the node while still being privileged?
>
> Possibly, but how does this help? As soon as it's unprivileged it must not
> be able to hog resources anymore.
>
> Anyway, with Andrew's reply my point may be irrelevant, but I have to
> admit I'm not entirely sure.
I guess we want Xen tools to close /dev/xen/hypercall (or more precisely:
don't dup2() it) when qemu is de-privileging itself. This will make it
very clear that it can't hog memory via mmap().
If you are fine with that I'll send a Xen patch for this.
Juergen
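
The sizing argument in the quoted patch description can be checked with a little arithmetic. The sketch below assumes 4 KiB pages and one dirty bit per guest page; the function names and the PAGE_SIZE_BYTES macro are made up for this illustration, and the real toolstack may round the bitmap differently.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL   /* assumption: 4 KiB pages */

/* Bytes needed for a bitmap with one dirty bit per guest page. */
uint64_t dirty_bitmap_bytes(uint64_t guest_bytes)
{
    uint64_t pages = (guest_bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;

    return (pages + 7) / 8;
}

/* Number of whole pages needed to hold that bitmap. */
uint64_t dirty_bitmap_pages(uint64_t guest_bytes)
{
    uint64_t bytes = dirty_bitmap_bytes(guest_bytes);

    return (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```

Under these assumptions an 8 GiB guest has 2,097,152 pages, so its bitmap takes 262,144 bytes, which is exactly 64 pages. That matches the statement that an 8GB guest already uses up the default limit.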
On Fri, Nov 02, 2018 at 06:16:13AM +0100, gregkh(a)linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> bridge: do not add port to router list when receives query with source 0.0.0.0
>
> to the 4.19-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> bridge-do-not-add-port-to-router-list-when-receives-query-with-source-0.0.0.0.patch
> and it can be found in the queue-4.19 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Hi,
Patch
commit 0fe5119e267f3e3d8ac206895f5922195ec55a8a
Author: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Sat Oct 27 12:07:47 2018 +0300
net: bridge: remove ipv6 zero address check in mcast queries
is also needed to fix this patch.
Thanks
Hangbin
>
>
> From foo@baz Fri Nov 2 06:12:28 CET 2018
> From: Hangbin Liu <liuhangbin(a)gmail.com>
> Date: Fri, 26 Oct 2018 10:28:43 +0800
> Subject: bridge: do not add port to router list when receives query with source 0.0.0.0
>
> From: Hangbin Liu <liuhangbin(a)gmail.com>
>
> [ Upstream commit 5a2de63fd1a59c30c02526d427bc014b98adf508 ]
>
> Based on RFC 4541, 2.1.1. IGMP Forwarding Rules
>
> The switch supporting IGMP snooping must maintain a list of
> multicast routers and the ports on which they are attached. This
> list can be constructed in any combination of the following ways:
>
> a) This list should be built by the snooping switch sending
> Multicast Router Solicitation messages as described in IGMP
> Multicast Router Discovery [MRDISC]. It may also snoop
> Multicast Router Advertisement messages sent by and to other
> nodes.
>
> b) The arrival port for IGMP Queries (sent by multicast routers)
> where the source address is not 0.0.0.0.
>
> We should not add the port to router list when receives query with source
> 0.0.0.0.
>
> Reported-by: Ying Xu <yinxu(a)redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
> Acked-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
> Acked-by: Roopa Prabhu <roopa(a)cumulusnetworks.com>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
> net/bridge/br_multicast.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -1420,7 +1420,15 @@ static void br_multicast_query_received(
> return;
>
> br_multicast_update_query_timer(br, query, max_delay);
> - br_multicast_mark_router(br, port);
> +
> + /* Based on RFC4541, section 2.1.1 IGMP Forwarding Rules,
> + * the arrival port for IGMP Queries where the source address
> + * is 0.0.0.0 should not be added to router port list.
> + */
> + if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) ||
> + (saddr->proto == htons(ETH_P_IPV6) &&
> + !ipv6_addr_any(&saddr->u.ip6)))
> + br_multicast_mark_router(br, port);
> }
>
> static void br_ip4_multicast_query(struct net_bridge *br,
>
>
> Patches currently in stable-queue which might be from liuhangbin(a)gmail.com are
>
> queue-4.19/bridge-do-not-add-port-to-router-list-when-receives-query-with-source-0.0.0.0.patch
commit 0962590e553331db2cc0aef2dc35c57f6300dbbe upstream.
ALU operations on pointers such as scalar_reg += map_value_ptr are
handled in adjust_ptr_min_max_vals(). Problem is however that map_ptr
and range in the register state share a union, so transferring state
through dst_reg->range = ptr_reg->range is just buggy as any new
map_ptr in the dst_reg is then truncated (or null) for subsequent
checks. Fix this by adding a raw member and use it for copying state
over to dst_reg.
Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Edward Cree <ecree(a)solarflare.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Edward Cree <ecree(a)solarflare.com>
---
include/linux/bpf_verifier.h | 3 +++
kernel/bpf/verifier.c | 10 ++++++----
2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 73bec75..a333300 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -50,6 +50,9 @@ struct bpf_reg_state {
* PTR_TO_MAP_VALUE_OR_NULL
*/
struct bpf_map *map_ptr;
+
+ /* Max size from any of the above. */
+ unsigned long raw;
};
/* Fixed part of pointer offset, pointer types only */
s32 off;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a0ffc62..013b0cd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1935,7 +1935,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
dst_reg->umax_value = umax_ptr;
dst_reg->var_off = ptr_reg->var_off;
dst_reg->off = ptr_reg->off + smin_val;
- dst_reg->range = ptr_reg->range;
+ dst_reg->raw = ptr_reg->raw;
break;
}
/* A new variable offset is created. Note that off_reg->off
@@ -1965,10 +1965,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
}
dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
dst_reg->off = ptr_reg->off;
+ dst_reg->raw = ptr_reg->raw;
if (ptr_reg->type == PTR_TO_PACKET) {
dst_reg->id = ++env->id_gen;
/* something was added to pkt_ptr, set range to zero */
- dst_reg->range = 0;
+ dst_reg->raw = 0;
}
break;
case BPF_SUB:
@@ -1999,7 +2000,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
dst_reg->var_off = ptr_reg->var_off;
dst_reg->id = ptr_reg->id;
dst_reg->off = ptr_reg->off - smin_val;
- dst_reg->range = ptr_reg->range;
+ dst_reg->raw = ptr_reg->raw;
break;
}
/* A new variable offset is created. If the subtrahend is known
@@ -2025,11 +2026,12 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
}
dst_reg->var_off = tnum_sub(ptr_reg->var_off, off_reg->var_off);
dst_reg->off = ptr_reg->off;
+ dst_reg->raw = ptr_reg->raw;
if (ptr_reg->type == PTR_TO_PACKET) {
dst_reg->id = ++env->id_gen;
/* something was added to pkt_ptr, set range to zero */
if (smin_val < 0)
- dst_reg->range = 0;
+ dst_reg->raw = 0;
}
break;
case BPF_AND:
--
2.9.5
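
To illustrate why copying state through the narrower union member loses information, here is a small user-space sketch. The struct and function names are hypothetical stand-ins, not the verifier's definitions, and the 16-bit width of range is an assumption made for the example. The sketch also assumes unsigned long is at least as wide as a pointer, as on Linux.

```c
#include <stdint.h>

struct map { int id; };   /* stand-in for struct bpf_map */

/* Stand-in for the union inside struct bpf_reg_state. */
struct reg_state {
    union {
        uint16_t range;       /* meaningful for packet pointers */
        struct map *map_ptr;  /* meaningful for map value pointers */
        unsigned long raw;    /* wide enough to cover either member */
    };
};

/* Copies only the bytes of 'range'; a stored map_ptr is truncated. */
void copy_via_range(struct reg_state *dst, const struct reg_state *src)
{
    dst->range = src->range;
}

/* Copies the full width of the union; a stored map_ptr survives. */
void copy_via_raw(struct reg_state *dst, const struct reg_state *src)
{
    dst->raw = src->raw;
}
```

If src holds a map pointer, copy_via_range() transfers only the bytes that overlap range, so dst.map_ptr does not equal the original pointer, while copy_via_raw() preserves it. This is the behaviour the commit message describes as "truncated (or null)".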