From: SeongJae Park sjpark@amazon.de
SeongJae Park (5): xen/xenbus: Allow watches discard events before queueing xen/xenbus: Add 'will_handle' callback support in xenbus_watch_path() xen/xenbus/xen_bus_type: Support will_handle watch callback xen/xenbus: Count pending messages for each watch xenbus/xenbus_backend: Disallow pending watch messages
drivers/block/xen-blkback/xenbus.c | 3 +- drivers/net/xen-netback/xenbus.c | 4 ++- drivers/xen/xen-pciback/xenbus.c | 2 +- drivers/xen/xenbus/xenbus_client.c | 8 ++++- drivers/xen/xenbus/xenbus_probe.c | 1 + drivers/xen/xenbus/xenbus_probe.h | 2 ++ drivers/xen/xenbus/xenbus_probe_backend.c | 7 +++++ drivers/xen/xenbus/xenbus_xs.c | 38 +++++++++++++++-------- include/xen/xenbus.h | 15 ++++++++- 9 files changed, 62 insertions(+), 18 deletions(-)
From: SeongJae Park sjpark@amazon.de
If handling logics of watch events are slower than the events enqueue logic and the events can be created from the guests, the guests could trigger memory pressure by intensively inducing the events, because it will create a huge number of pending events that exhausting the memory. This is known as XSA-349.
Fortunately, some watch events could be ignored, depending on its handler callback. For example, if the callback has interest in only one single path, the watch wouldn't want multiple pending events. Or, some watches could ignore events to same path.
To let such watches to volutarily help avoiding the memory pressure situation, this commit introduces new watch callback, 'will_handle'. If it is not NULL, it will be called for each new event just before enqueuing it. Then, if the callback returns false, the event will be discarded. No watch is using the callback for now, though.
This is part of XSA-349
This is upstream commit fed1755b118147721f2c87b37b9d66e62c39b668
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com --- drivers/net/xen-netback/xenbus.c | 2 ++ drivers/xen/xenbus/xenbus_client.c | 1 + drivers/xen/xenbus/xenbus_xs.c | 7 ++++++- include/xen/xenbus.h | 7 +++++++ 4 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 56ebd8267386..23f03af0a2d4 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -697,12 +697,14 @@ static int xen_register_watchers(struct xenbus_device *dev, struct xenvif *vif) return -ENOMEM; snprintf(node, maxlen, "%s/rate", dev->nodename); vif->credit_watch.node = node; + vif->credit_watch.will_handle = NULL; vif->credit_watch.callback = xen_net_rate_changed; err = register_xenbus_watch(&vif->credit_watch); if (err) { pr_err("Failed to set watcher %s\n", vif->credit_watch.node); kfree(node); vif->credit_watch.node = NULL; + vif->credit_watch.will_handle = NULL; vif->credit_watch.callback = NULL; } return err; diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c index 266f446ba331..d02d25f784c9 100644 --- a/drivers/xen/xenbus/xenbus_client.c +++ b/drivers/xen/xenbus/xenbus_client.c @@ -120,6 +120,7 @@ int xenbus_watch_path(struct xenbus_device *dev, const char *path, int err;
watch->node = path; + watch->will_handle = NULL; watch->callback = callback;
err = register_xenbus_watch(watch); diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index ce65591b4168..0ea1c259f2f1 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -903,7 +903,12 @@ static int process_msg(void) spin_lock(&watches_lock); msg->u.watch.handle = find_watch( msg->u.watch.vec[XS_WATCH_TOKEN]); - if (msg->u.watch.handle != NULL) { + if (msg->u.watch.handle != NULL && + (!msg->u.watch.handle->will_handle || + msg->u.watch.handle->will_handle( + msg->u.watch.handle, + (const char **)msg->u.watch.vec, + msg->u.watch.vec_size))) { spin_lock(&watch_events_lock); list_add_tail(&msg->list, &watch_events); wake_up(&watch_events_waitq); diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index 32b944b7cebd..11697aa023b5 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -58,6 +58,13 @@ struct xenbus_watch /* Path being watched. */ const char *node;
+ /* + * Called just before enqueing new event while a spinlock is held. + * The event will be discarded if this callback returns false. + */ + bool (*will_handle)(struct xenbus_watch *, + const char **vec, unsigned int len); + /* Callback (executed in a process context with no locks held). */ void (*callback)(struct xenbus_watch *, const char **vec, unsigned int len);
From: SeongJae Park sjpark@amazon.de
Some code does not directly make 'xenbus_watch' object and call 'register_xenbus_watch()' but use 'xenbus_watch_path()' instead. This commit adds support of 'will_handle' callback in the 'xenbus_watch_path()' and it's wrapper, 'xenbus_watch_pathfmt()'.
This is part of XSA-349
This is upstream commit 2e85d32b1c865bec703ce0c962221a5e955c52c2
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Signed-off-by: Author Redacted security@xen.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com --- drivers/block/xen-blkback/xenbus.c | 3 ++- drivers/net/xen-netback/xenbus.c | 2 +- drivers/xen/xen-pciback/xenbus.c | 2 +- drivers/xen/xenbus/xenbus_client.c | 9 +++++++-- drivers/xen/xenbus/xenbus_probe.c | 2 +- include/xen/xenbus.h | 6 +++++- 6 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 0ec257e69e95..823f3480ebd1 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -553,7 +553,8 @@ static int xen_blkbk_probe(struct xenbus_device *dev, /* setup back pointer */ be->blkif->be = be;
- err = xenbus_watch_pathfmt(dev, &be->backend_watch, backend_changed, + err = xenbus_watch_pathfmt(dev, &be->backend_watch, NULL, + backend_changed, "%s/%s", dev->nodename, "physical-device"); if (err) goto fail; diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 23f03af0a2d4..21c8e2720b40 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -849,7 +849,7 @@ static void connect(struct backend_info *be) xenvif_carrier_on(be->vif);
unregister_hotplug_status_watch(be); - err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, + err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, NULL, hotplug_status_changed, "%s/%s", dev->nodename, "hotplug-status"); if (!err) diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c index 48196347f2f9..12497a2140c2 100644 --- a/drivers/xen/xen-pciback/xenbus.c +++ b/drivers/xen/xen-pciback/xenbus.c @@ -691,7 +691,7 @@ static int xen_pcibk_xenbus_probe(struct xenbus_device *dev,
/* watch the backend node for backend configuration information */ err = xenbus_watch_path(dev, dev->nodename, &pdev->be_watch, - xen_pcibk_be_watch); + NULL, xen_pcibk_be_watch); if (err) goto out;
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c index d02d25f784c9..8bbd887ca422 100644 --- a/drivers/xen/xenbus/xenbus_client.c +++ b/drivers/xen/xenbus/xenbus_client.c @@ -114,19 +114,22 @@ EXPORT_SYMBOL_GPL(xenbus_strstate); */ int xenbus_watch_path(struct xenbus_device *dev, const char *path, struct xenbus_watch *watch, + bool (*will_handle)(struct xenbus_watch *, + const char **, unsigned int), void (*callback)(struct xenbus_watch *, const char **, unsigned int)) { int err;
watch->node = path; - watch->will_handle = NULL; + watch->will_handle = will_handle; watch->callback = callback;
err = register_xenbus_watch(watch);
if (err) { watch->node = NULL; + watch->will_handle = NULL; watch->callback = NULL; xenbus_dev_fatal(dev, err, "adding watch on %s", path); } @@ -153,6 +156,8 @@ EXPORT_SYMBOL_GPL(xenbus_watch_path); */ int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch, + bool (*will_handle)(struct xenbus_watch *, + const char **, unsigned int), void (*callback)(struct xenbus_watch *, const char **, unsigned int), const char *pathfmt, ...) @@ -169,7 +174,7 @@ int xenbus_watch_pathfmt(struct xenbus_device *dev, xenbus_dev_fatal(dev, -ENOMEM, "allocating path for watch"); return -ENOMEM; } - err = xenbus_watch_path(dev, path, watch, callback); + err = xenbus_watch_path(dev, path, watch, will_handle, callback);
if (err) kfree(path); diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c index c2d447687e33..c560c1b8489a 100644 --- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -137,7 +137,7 @@ static int watch_otherend(struct xenbus_device *dev) container_of(dev->dev.bus, struct xen_bus_type, bus);
return xenbus_watch_pathfmt(dev, &dev->otherend_watch, - bus->otherend_changed, + NULL, bus->otherend_changed, "%s/%s", dev->otherend, "state"); }
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index 11697aa023b5..1772507dc2c9 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -201,10 +201,14 @@ void xenbus_suspend_cancel(void);
int xenbus_watch_path(struct xenbus_device *dev, const char *path, struct xenbus_watch *watch, + bool (*will_handle)(struct xenbus_watch *, + const char **, unsigned int), void (*callback)(struct xenbus_watch *, const char **, unsigned int)); -__printf(4, 5) +__printf(5, 6) int xenbus_watch_pathfmt(struct xenbus_device *dev, struct xenbus_watch *watch, + bool (*will_handle)(struct xenbus_watch *, + const char **, unsigned int), void (*callback)(struct xenbus_watch *, const char **, unsigned int), const char *pathfmt, ...);
On Thu, 17 Dec 2020 09:17:24 +0100 SeongJae Park sjpark@amazon.com wrote:
From: SeongJae Park sjpark@amazon.de
Some code does not directly make 'xenbus_watch' object and call 'register_xenbus_watch()' but use 'xenbus_watch_path()' instead. This commit adds support of 'will_handle' callback in the 'xenbus_watch_path()' and it's wrapper, 'xenbus_watch_pathfmt()'.
This is part of XSA-349
This is upstream commit 2e85d32b1c865bec703ce0c962221a5e955c52c2
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Signed-off-by: Author Redacted security@xen.org
Juergen found this wrong line (thank you, Juergen!). I will remove the lines from the patches and send 2nd version soon.
Thanks, SeongJae Park
From: SeongJae Park sjpark@amazon.de
This commit adds support of the 'will_handle' watch callback for 'xen_bus_type' users.
This is part of XSA-349
This is upstream commit be987200fbaceaef340872841d4f7af2c5ee8dc3
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Signed-off-by: Author Redacted security@xen.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com --- drivers/xen/xenbus/xenbus_probe.c | 3 ++- drivers/xen/xenbus/xenbus_probe.h | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c index c560c1b8489a..ba7590d75985 100644 --- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -137,7 +137,8 @@ static int watch_otherend(struct xenbus_device *dev) container_of(dev->dev.bus, struct xen_bus_type, bus);
return xenbus_watch_pathfmt(dev, &dev->otherend_watch, - NULL, bus->otherend_changed, + bus->otherend_will_handle, + bus->otherend_changed, "%s/%s", dev->otherend, "state"); }
diff --git a/drivers/xen/xenbus/xenbus_probe.h b/drivers/xen/xenbus/xenbus_probe.h index c9ec7ca1f7ab..2c394c6ba605 100644 --- a/drivers/xen/xenbus/xenbus_probe.h +++ b/drivers/xen/xenbus/xenbus_probe.h @@ -42,6 +42,8 @@ struct xen_bus_type { int (*get_bus_id)(char bus_id[XEN_BUS_ID_SIZE], const char *nodename); int (*probe)(struct xen_bus_type *bus, const char *type, const char *dir); + bool (*otherend_will_handle)(struct xenbus_watch *watch, + const char **vec, unsigned int len); void (*otherend_changed)(struct xenbus_watch *watch, const char **vec, unsigned int len); struct bus_type bus;
From: SeongJae Park sjpark@amazon.de
This commit adds a counter of pending messages for each watch in the struct. It is used to skip unnecessary pending messages lookup in 'unregister_xenbus_watch()'. It could also be used in 'will_handle' callback.
This is part of XSA-349
This is upstream commit 3dc86ca6b4c8cfcba9da7996189d1b5a358a94fc
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Signed-off-by: Author Redacted security@xen.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com --- drivers/xen/xenbus/xenbus_xs.c | 30 ++++++++++++++++++------------ include/xen/xenbus.h | 2 ++ 2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index 0ea1c259f2f1..420d478e1708 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -701,6 +701,8 @@ int register_xenbus_watch(struct xenbus_watch *watch)
sprintf(token, "%lX", (long)watch);
+ watch->nr_pending = 0; + down_read(&xs_state.watch_mutex);
spin_lock(&watches_lock); @@ -750,12 +752,15 @@ void unregister_xenbus_watch(struct xenbus_watch *watch)
/* Cancel pending watch events. */ spin_lock(&watch_events_lock); - list_for_each_entry_safe(msg, tmp, &watch_events, list) { - if (msg->u.watch.handle != watch) - continue; - list_del(&msg->list); - kfree(msg->u.watch.vec); - kfree(msg); + if (watch->nr_pending) { + list_for_each_entry_safe(msg, tmp, &watch_events, list) { + if (msg->u.watch.handle != watch) + continue; + list_del(&msg->list); + kfree(msg->u.watch.vec); + kfree(msg); + } + watch->nr_pending = 0; } spin_unlock(&watch_events_lock);
@@ -802,7 +807,6 @@ void xs_suspend_cancel(void)
static int xenwatch_thread(void *unused) { - struct list_head *ent; struct xs_stored_msg *msg;
for (;;) { @@ -815,13 +819,15 @@ static int xenwatch_thread(void *unused) mutex_lock(&xenwatch_mutex);
spin_lock(&watch_events_lock); - ent = watch_events.next; - if (ent != &watch_events) - list_del(ent); + msg = list_first_entry_or_null(&watch_events, + struct xs_stored_msg, list); + if (msg) { + list_del(&msg->list); + msg->u.watch.handle->nr_pending--; + } spin_unlock(&watch_events_lock);
- if (ent != &watch_events) { - msg = list_entry(ent, struct xs_stored_msg, list); + if (msg) { msg->u.watch.handle->callback( msg->u.watch.handle, (const char **)msg->u.watch.vec, diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index 1772507dc2c9..ed9e7e3307b7 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -58,6 +58,8 @@ struct xenbus_watch /* Path being watched. */ const char *node;
+ unsigned int nr_pending; + /* * Called just before enqueing new event while a spinlock is held. * The event will be discarded if this callback returns false.
On 17.12.20 09:17, SeongJae Park wrote:
From: SeongJae Park sjpark@amazon.de
This commit adds a counter of pending messages for each watch in the struct. It is used to skip unnecessary pending messages lookup in 'unregister_xenbus_watch()'. It could also be used in 'will_handle' callback.
This is part of XSA-349
This is upstream commit 3dc86ca6b4c8cfcba9da7996189d1b5a358a94fc
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Signed-off-by: Author Redacted security@xen.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com
drivers/xen/xenbus/xenbus_xs.c | 30 ++++++++++++++++++------------ include/xen/xenbus.h | 2 ++ 2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index 0ea1c259f2f1..420d478e1708 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -701,6 +701,8 @@ int register_xenbus_watch(struct xenbus_watch *watch) sprintf(token, "%lX", (long)watch);
- watch->nr_pending = 0;
I'm missing the incrementing of nr_pending, which was present in the upstream patch.
Juergen
down_read(&xs_state.watch_mutex); spin_lock(&watches_lock); @@ -750,12 +752,15 @@ void unregister_xenbus_watch(struct xenbus_watch *watch) /* Cancel pending watch events. */ spin_lock(&watch_events_lock);
- list_for_each_entry_safe(msg, tmp, &watch_events, list) {
if (msg->u.watch.handle != watch)
continue;
list_del(&msg->list);
kfree(msg->u.watch.vec);
kfree(msg);
- if (watch->nr_pending) {
list_for_each_entry_safe(msg, tmp, &watch_events, list) {
if (msg->u.watch.handle != watch)
continue;
list_del(&msg->list);
kfree(msg->u.watch.vec);
kfree(msg);
}
} spin_unlock(&watch_events_lock);watch->nr_pending = 0;
@@ -802,7 +807,6 @@ void xs_suspend_cancel(void) static int xenwatch_thread(void *unused) {
- struct list_head *ent; struct xs_stored_msg *msg;
for (;;) { @@ -815,13 +819,15 @@ static int xenwatch_thread(void *unused) mutex_lock(&xenwatch_mutex); spin_lock(&watch_events_lock);
ent = watch_events.next;
if (ent != &watch_events)
list_del(ent);
msg = list_first_entry_or_null(&watch_events,
struct xs_stored_msg, list);
if (msg) {
list_del(&msg->list);
msg->u.watch.handle->nr_pending--;
spin_unlock(&watch_events_lock);}
if (ent != &watch_events) {
msg = list_entry(ent, struct xs_stored_msg, list);
if (msg) { msg->u.watch.handle->callback( msg->u.watch.handle, (const char **)msg->u.watch.vec,
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index 1772507dc2c9..ed9e7e3307b7 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -58,6 +58,8 @@ struct xenbus_watch /* Path being watched. */ const char *node;
- unsigned int nr_pending;
- /*
- Called just before enqueing new event while a spinlock is held.
- The event will be discarded if this callback returns false.
On Thu, 17 Dec 2020 15:50:34 +0100 "Jürgen Groß" jgross@suse.com wrote:
[-- Attachment #1.1.1: Type: text/plain, Size: 3509 bytes --]
On 17.12.20 09:17, SeongJae Park wrote:
From: SeongJae Park sjpark@amazon.de
This commit adds a counter of pending messages for each watch in the struct. It is used to skip unnecessary pending messages lookup in 'unregister_xenbus_watch()'. It could also be used in 'will_handle' callback.
This is part of XSA-349
This is upstream commit 3dc86ca6b4c8cfcba9da7996189d1b5a358a94fc
Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park sjpark@amazon.de Reported-by: Michael Kurth mku@amazon.de Reported-by: Pawel Wieczorkiewicz wipawel@amazon.de Signed-off-by: Author Redacted security@xen.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com
drivers/xen/xenbus/xenbus_xs.c | 30 ++++++++++++++++++------------ include/xen/xenbus.h | 2 ++ 2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index 0ea1c259f2f1..420d478e1708 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -701,6 +701,8 @@ int register_xenbus_watch(struct xenbus_watch *watch) sprintf(token, "%lX", (long)watch);
- watch->nr_pending = 0;
I'm missing the incrementing of nr_pending, which was present in the upstream patch.
Oops, it should be in this patch, but I mistakenly put it in the fifth patch.
67 --- a/drivers/xen/xenbus/xenbus_xs.c 68 +++ b/drivers/xen/xenbus/xenbus_xs.c 69 @@ -917,6 +917,7 @@ static int process_msg(void) 70 msg->u.watch.vec_size))) { 71 spin_lock(&watch_events_lock); 72 list_add_tail(&msg->list, &watch_events); 73 + msg->u.watch.handle->nr_pending++; 74 wake_up(&watch_events_waitq); 75 spin_unlock(&watch_events_lock); 76 } else { 77 --
And I just realized I even didn't post the fifth patch.
I will fix this and post new version (v3) soon.
Thank you for catching this, Juergen.
Thanks, SeongJae Park
linux-stable-mirror@lists.linaro.org