Hi,
Here is the v6 patch series to support polling on the event 'hist' file. The previous version is here:
https://lore.kernel.org/all/172398710447.295714.4489282566285719918.stgit@de...
This version is rebased on the ftrace/for-next branch of the linux-trace tree, and uses a global irq_work and wait queue instead of per-event ones.
Background
----------
There has been interest in allowing user programs to monitor kernel events in real time. Ftrace provides the `trace_pipe` interface to wait on events in the ring buffer, but a reader has to wait until a page of the ring buffer fills up with events. We can also peek at the `trace` file periodically, but that is an inefficient way to monitor an event that happens at random times.

Overview
--------
This patch set allows the user to `poll` (or `select`, `epoll`) on the event histogram interface. Each event has its own `hist` file, which shows the histograms generated by trigger actions. So the user can set a new hist trigger on any event they want to monitor, and poll on the `hist` file until it is updated.
Two poll events are supported, POLLIN and POLLPRI. POLLIN means that there is a readable update on the `hist` file, and this event is flushed only when you call read(). So it is useful if you want to read the histogram periodically. The other event, POLLPRI, is for monitoring a trace event. Like POLLIN, it is returned when the histogram is updated, but you don't need to read() the file before calling poll() again.

Note that this waits for a histogram update (not event arrival), so you must first set a histogram on the event.
Usage
-----
Here is an example usage:

----
TRACEFS=/sys/kernel/tracing
EVENT=$TRACEFS/events/sched/sched_process_free

# setup histogram trigger and enable event
echo "hist:key=comm" >> $EVENT/trigger
echo 1 > $EVENT/enable

# Wait for update
poll pri $EVENT/hist

# Event arrived.
echo "process free event is coming"
tail $TRACEFS/trace
----
The 'poll' command is in the selftest patch.
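For readers without the selftest helper at hand, the wait step can be sketched in C roughly as follows. This is only an illustration built on plain poll(2): `wait_for_events()` and `wait_for_hist_update()` are hypothetical names, not part of this series, and the sketch assumes the hist file path is passed in by the caller.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until `fd` reports any of `events` (e.g. POLLPRI or POLLIN).
 * Returns 1 if a requested event arrived, 0 on timeout, -1 on error.
 * timeout_ms < 0 waits forever, as with poll(2).
 */
static int wait_for_events(int fd, short events, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = events };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	/* Only POLLERR/POLLHUP/POLLNVAL were reported: treat as an error. */
	return (pfd.revents & events) ? 1 : -1;
}

/*
 * Hypothetical wrapper: open a hist file by path and wait for one
 * histogram update via POLLPRI, so that no read() is needed.
 */
static int wait_for_hist_update(const char *hist_path, int timeout_ms)
{
	int fd = open(hist_path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = wait_for_events(fd, POLLPRI, timeout_ms);
	close(fd);
	return ret;
}
```

A caller would set the hist trigger first, then call something like `wait_for_hist_update("/sys/kernel/tracing/events/sched/sched_process_free/hist", -1)` and inspect the `trace` file once it returns 1.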
You can also take this series from here:
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=to...
Thank you,

---

Masami Hiramatsu (Google) (3):
      tracing/hist: Add poll(POLLIN) support on hist file
      tracing/hist: Support POLLPRI event for poll on histogram
      selftests/tracing: Add hist poll() support test


 include/linux/trace_events.h                       |   14 +++
 kernel/trace/trace_events.c                        |   14 +++
 kernel/trace/trace_events_hist.c                   |  100 +++++++++++++++++++-
 tools/testing/selftests/ftrace/Makefile            |    2 
 tools/testing/selftests/ftrace/poll.c              |   74 +++++++++++++++
 .../ftrace/test.d/trigger/trigger-hist-poll.tc     |   74 +++++++++++++++
 6 files changed, 275 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/poll.c
 create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add poll syscall support on the `hist` file. A waiter is woken up with POLLIN when the histogram is updated.

Currently, there is no way to wait for a specific event in userspace. So user needs to peek the `trace` periodically, or wait on `trace_pipe`. But that is not good idea to peek the `trace` for the event randomly happens. And `trace_pipe` is not coming back until a page is filled with events.
This allows user to wait for a specific events on `hist` file. User can set a histogram trigger on the event which they want to monitor. And poll() on its `hist` file. Since this poll() returns POLLIN, the next poll() will return soon unless you do read() on hist file.
NOTE: To read the hist file again, you must set the file offset to 0, but just for monitoring the event, you may not need to read the histogram.
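A minimal userspace sketch of that rewind-then-read step, assuming an already opened hist file descriptor. `reread_from_start()` is a hypothetical helper name, not part of this series.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Rewind `fd` to offset 0 and read up to `cap - 1` bytes into `buf`,
 * NUL-terminating the result. Returns the byte count, or -1 on error.
 */
static ssize_t reread_from_start(int fd, char *buf, size_t cap)
{
	size_t total = 0;
	ssize_t n;

	/* Without this seek, a second read() continues from the old offset. */
	if (cap == 0 || lseek(fd, 0, SEEK_SET) < 0)
		return -1;

	while (total < cap - 1) {
		n = read(fd, buf + total, cap - 1 - total);
		if (n < 0)
			return -1;
		if (n == 0)	/* EOF: the whole file has been consumed */
			break;
		total += (size_t)n;
	}
	buf[total] = '\0';
	return (ssize_t)total;
}
```

Calling the helper twice on the same descriptor returns the full contents both times, which is the behavior a poll()/read() loop relies on.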
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
---
 Changes in v6:
  - Use a global poll irq_work and wait_queue.
---
 include/linux/trace_events.h     |   14 +++++++
 kernel/trace/trace_events.c      |   14 +++++++
 kernel/trace/trace_events_hist.c |   75 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 100 insertions(+), 3 deletions(-)
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 42bedcddd511..46c771a61f2a 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -685,6 +685,20 @@ struct trace_event_file {
 	atomic_t		tm_ref;	/* trigger-mode reference counter */
 };
 
+#ifdef CONFIG_HIST_TRIGGERS
+extern struct irq_work hist_poll_work;
+extern wait_queue_head_t hist_poll_wq;
+
+static inline void hist_poll_wakeup(void)
+{
+	if (wq_has_sleeper(&hist_poll_wq))
+		irq_work_queue(&hist_poll_work);
+}
+
+#define hist_poll_wait(file, wait)	\
+	poll_wait(file, &hist_poll_wq, wait)
+#endif
+
 #define __TRACE_EVENT_FLAGS(name, value) \
 	static int __init trace_init_flags_##name(void) \
 	{ \
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 7266ec2a4eea..0a7cb30043ef 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2972,6 +2972,20 @@ static bool event_in_systems(struct trace_event_call *call,
 	return !*p || isspace(*p) || *p == ',';
 }
 
+#ifdef CONFIG_HIST_TRIGGERS
+/*
+ * Wake up waiter on the hist_poll_wq from irq_work because the hist trigger
+ * may happen in any context.
+ */
+static void hist_poll_event_irq_work(struct irq_work *work)
+{
+	wake_up_all(&hist_poll_wq);
+}
+
+DEFINE_IRQ_WORK(hist_poll_work, hist_poll_event_irq_work);
+DECLARE_WAIT_QUEUE_HEAD(hist_poll_wq);
+#endif
+
 static struct trace_event_file *
 trace_create_new_event(struct trace_event_call *call,
 		       struct trace_array *tr)
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 5f9119eb7c67..107eaa0f40f1 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -5314,6 +5314,8 @@ static void event_hist_trigger(struct event_trigger_data *data,
 
 	if (resolve_var_refs(hist_data, key, var_ref_vals, true))
 		hist_trigger_actions(hist_data, elt, buffer, rec, rbe, key, var_ref_vals);
+
+	hist_poll_wakeup();
 }
 
 static void hist_trigger_stacktrace_print(struct seq_file *m,
@@ -5593,15 +5595,36 @@ static void hist_trigger_show(struct seq_file *m,
 		   n_entries, (u64)atomic64_read(&hist_data->map->drops));
 }
 
+struct hist_file_data {
+	struct file *file;
+	u64 last_read;
+};
+
+static u64 get_hist_hit_count(struct trace_event_file *event_file)
+{
+	struct hist_trigger_data *hist_data;
+	struct event_trigger_data *data;
+	u64 ret = 0;
+
+	list_for_each_entry(data, &event_file->triggers, list) {
+		if (data->cmd_ops->trigger_type == ETT_EVENT_HIST) {
+			hist_data = data->private_data;
+			ret += atomic64_read(&hist_data->map->hits);
+		}
+	}
+	return ret;
+}
+
 static int hist_show(struct seq_file *m, void *v)
 {
+	struct hist_file_data *hist_file = m->private;
 	struct event_trigger_data *data;
 	struct trace_event_file *event_file;
 	int n = 0, ret = 0;
 
 	mutex_lock(&event_mutex);
 
-	event_file = event_file_file(m->private);
+	event_file = event_file_file(hist_file->file);
 	if (unlikely(!event_file)) {
 		ret = -ENODEV;
 		goto out_unlock;
@@ -5611,6 +5634,7 @@ static int hist_show(struct seq_file *m, void *v)
 		if (data->cmd_ops->trigger_type == ETT_EVENT_HIST)
 			hist_trigger_show(m, data, n++);
 	}
+	hist_file->last_read = get_hist_hit_count(event_file);
 
 out_unlock:
 	mutex_unlock(&event_mutex);
@@ -5618,24 +5642,69 @@ static int hist_show(struct seq_file *m, void *v)
 	return ret;
 }
 
+static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wait)
+{
+	struct trace_event_file *event_file;
+	struct seq_file *m = file->private_data;
+	struct hist_file_data *hist_file = m->private;
+	__poll_t ret = 0;
+
+	mutex_lock(&event_mutex);
+
+	event_file = event_file_data(file);
+	if (!event_file) {
+		ret = EPOLLERR;
+		goto out_unlock;
+	}
+
+	hist_poll_wait(file, wait);
+
+	if (hist_file->last_read != get_hist_hit_count(event_file))
+		ret = EPOLLIN | EPOLLRDNORM;
+
+out_unlock:
+	mutex_unlock(&event_mutex);
+
+	return ret;
+}
+
+static int event_hist_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *m = file->private_data;
+	struct hist_file_data *hist_file = m->private;
+
+	kfree(hist_file);
+	return tracing_single_release_file_tr(inode, file);
+}
+
 static int event_hist_open(struct inode *inode, struct file *file)
 {
+	struct hist_file_data *hist_file;
 	int ret;
 
 	ret = tracing_open_file_tr(inode, file);
 	if (ret)
 		return ret;
 
+	hist_file = kzalloc(sizeof(*hist_file), GFP_KERNEL);
+	if (!hist_file)
+		return -ENOMEM;
+	hist_file->file = file;
+
 	/* Clear private_data to avoid warning in single_open() */
 	file->private_data = NULL;
-	return single_open(file, hist_show, file);
+	ret = single_open(file, hist_show, hist_file);
+	if (ret)
+		kfree(hist_file);
+	return ret;
 }
 
 const struct file_operations event_hist_fops = {
 	.open = event_hist_open,
 	.read = seq_read,
 	.llseek = seq_lseek,
-	.release = tracing_single_release_file_tr,
+	.release = event_hist_release,
+	.poll = event_hist_poll,
 };
 
 #ifdef CONFIG_HIST_TRIGGERS_DEBUG
#ifdef CONFIG_HIST_TRIGGERS_DEBUG
On Wed, 16 Oct 2024 19:49:24 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add poll syscall support on the `hist` file. A waiter is woken up with POLLIN when the histogram is updated.
>
> Currently, there is no way to wait for a specific event in userspace. So user needs to peek the `trace` periodically, or wait on `trace_pipe`.
> But that is not good idea to peek the `trace` for the event randomly happens. And `trace_pipe` is not coming back until a page is filled with events.
I would reword the above to:
But it is not a good idea to peek at the `trace` for an event that randomly happens. And `trace_pipe` is not coming back until a page is filled with events.
> This allows user to wait for a specific events on `hist` file. User can set a histogram trigger on the event which they want to monitor. And poll() on its `hist` file. Since this poll() returns POLLIN, the next poll() will return soon unless you do read() on hist file.
And that to:
This allows a user to wait for a specific event on the `hist` file. User can set a histogram trigger on the event which they want to monitor and poll() on its `hist` file. Since this poll() returns POLLIN, the next poll() will return soon unless a read() happens on that hist file.
> NOTE: To read the hist file again, you must set the file offset to 0, but just for monitoring the event, you may not need to read the histogram.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Reviewed-by: Tom Zanussi <zanussi@kernel.org>
> ---
>  Changes in v6:
>   - Use a global poll irq_work and wait_queue.
> ---
>  include/linux/trace_events.h     |   14 +++++++
>  kernel/trace/trace_events.c      |   14 +++++++
>  kernel/trace/trace_events_hist.c |   75 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 100 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> index 42bedcddd511..46c771a61f2a 100644
> --- a/include/linux/trace_events.h
> +++ b/include/linux/trace_events.h
> @@ -685,6 +685,20 @@ struct trace_event_file {
>  	atomic_t		tm_ref;	/* trigger-mode reference counter */
>  };
>  
> +#ifdef CONFIG_HIST_TRIGGERS
> +extern struct irq_work hist_poll_work;
> +extern wait_queue_head_t hist_poll_wq;
> +
> +static inline void hist_poll_wakeup(void)
> +{
> +	if (wq_has_sleeper(&hist_poll_wq))
> +		irq_work_queue(&hist_poll_work);
> +}
> +
> +#define hist_poll_wait(file, wait)	\
> +	poll_wait(file, &hist_poll_wq, wait)
> +#endif
> +
>  #define __TRACE_EVENT_FLAGS(name, value) \
>  	static int __init trace_init_flags_##name(void) \
>  	{ \
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 7266ec2a4eea..0a7cb30043ef 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2972,6 +2972,20 @@ static bool event_in_systems(struct trace_event_call *call,
>  	return !*p || isspace(*p) || *p == ',';
>  }
>  
> +#ifdef CONFIG_HIST_TRIGGERS
> +/*
> + * Wake up waiter on the hist_poll_wq from irq_work because the hist trigger
> + * may happen in any context.
> + */
> +static void hist_poll_event_irq_work(struct irq_work *work)
> +{
> +	wake_up_all(&hist_poll_wq);
> +}
> +
> +DEFINE_IRQ_WORK(hist_poll_work, hist_poll_event_irq_work);
> +DECLARE_WAIT_QUEUE_HEAD(hist_poll_wq);
> +#endif
> +
>  static struct trace_event_file *
>  trace_create_new_event(struct trace_event_call *call,
>  		       struct trace_array *tr)
> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 5f9119eb7c67..107eaa0f40f1 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -5314,6 +5314,8 @@ static void event_hist_trigger(struct event_trigger_data *data,
>  
>  	if (resolve_var_refs(hist_data, key, var_ref_vals, true))
>  		hist_trigger_actions(hist_data, elt, buffer, rec, rbe, key, var_ref_vals);
> +
> +	hist_poll_wakeup();
>  }
>  
>  static void hist_trigger_stacktrace_print(struct seq_file *m,
> @@ -5593,15 +5595,36 @@ static void hist_trigger_show(struct seq_file *m,
>  		   n_entries, (u64)atomic64_read(&hist_data->map->drops));
>  }
>  
> +struct hist_file_data {
> +	struct file *file;
> +	u64 last_read;
> +};
> +
> +static u64 get_hist_hit_count(struct trace_event_file *event_file)
> +{
> +	struct hist_trigger_data *hist_data;
> +	struct event_trigger_data *data;
> +	u64 ret = 0;
> +
> +	list_for_each_entry(data, &event_file->triggers, list) {
> +		if (data->cmd_ops->trigger_type == ETT_EVENT_HIST) {
> +			hist_data = data->private_data;
> +			ret += atomic64_read(&hist_data->map->hits);
> +		}
> +	}
> +	return ret;
> +}
> +
>  static int hist_show(struct seq_file *m, void *v)
>  {
> +	struct hist_file_data *hist_file = m->private;
>  	struct event_trigger_data *data;
>  	struct trace_event_file *event_file;
>  	int n = 0, ret = 0;
>  
>  	mutex_lock(&event_mutex);
>  
> -	event_file = event_file_file(m->private);
> +	event_file = event_file_file(hist_file->file);
>  	if (unlikely(!event_file)) {
>  		ret = -ENODEV;
>  		goto out_unlock;
> @@ -5611,6 +5634,7 @@ static int hist_show(struct seq_file *m, void *v)
>  		if (data->cmd_ops->trigger_type == ETT_EVENT_HIST)
>  			hist_trigger_show(m, data, n++);
>  	}
> +	hist_file->last_read = get_hist_hit_count(event_file);
>  
>  out_unlock:
>  	mutex_unlock(&event_mutex);
> @@ -5618,24 +5642,69 @@ static int hist_show(struct seq_file *m, void *v)
>  	return ret;
>  }
>  
> +static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wait)
> +{
> +	struct trace_event_file *event_file;
> +	struct seq_file *m = file->private_data;
> +	struct hist_file_data *hist_file = m->private;
> +	__poll_t ret = 0;
> +
> +	mutex_lock(&event_mutex);
Let's start using guard(mutex)(&event_mutex);
I'm working on changing the other locations in this file with a separate patch. I don't want to add new ones.
-- Steve
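As a rough userspace illustration of what a scope-based guard buys in a function like this, the sketch below uses the GCC/Clang `cleanup` variable attribute, which is the compiler feature such guards build on. It is not the kernel's implementation: `toy_mutex`, `TOY_GUARD` and `poll_like` are names made up for this example, and the toy lock is a plain flag rather than a real mutex.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy lock standing in for a kernel mutex; not thread-safe. */
struct toy_mutex {
	bool locked;
};

static void toy_lock(struct toy_mutex *m)   { m->locked = true; }
static void toy_unlock(struct toy_mutex *m) { m->locked = false; }

/* Cleanup handler: receives a pointer to the guarded variable. */
static void toy_guard_release(struct toy_mutex **mp)
{
	toy_unlock(*mp);
}

/*
 * Take the lock now and release it automatically when the enclosing
 * scope ends. Relies on the GCC/Clang `cleanup` variable attribute.
 */
#define TOY_GUARD(m)							\
	struct toy_mutex *toy_guard_var					\
		__attribute__((cleanup(toy_guard_release), unused)) =	\
		(toy_lock(m), (m))

/* Early returns need no "goto out_unlock": the guard unlocks on every path. */
static int poll_like(struct toy_mutex *m, const int *event_file,
		     int last_read, int hit_count)
{
	TOY_GUARD(m);

	if (!event_file)
		return -1;	/* lock released here */
	if (last_read != hit_count)
		return 1;	/* and here */
	return 0;
}
```

The point of the design is that the error path and the normal path share no explicit unlock call, so a new early return cannot forget to drop the lock.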
> +
> +	event_file = event_file_data(file);
> +	if (!event_file) {
> +		ret = EPOLLERR;
> +		goto out_unlock;
> +	}
> +
> +	hist_poll_wait(file, wait);
> +
> +	if (hist_file->last_read != get_hist_hit_count(event_file))
> +		ret = EPOLLIN | EPOLLRDNORM;
> +
> +out_unlock:
> +	mutex_unlock(&event_mutex);
> +
> +	return ret;
> +}
> +
> +static int event_hist_release(struct inode *inode, struct file *file)
> +{
> +	struct seq_file *m = file->private_data;
> +	struct hist_file_data *hist_file = m->private;
> +
> +	kfree(hist_file);
> +	return tracing_single_release_file_tr(inode, file);
> +}
> +
>  static int event_hist_open(struct inode *inode, struct file *file)
>  {
> +	struct hist_file_data *hist_file;
>  	int ret;
>  
>  	ret = tracing_open_file_tr(inode, file);
>  	if (ret)
>  		return ret;
>  
> +	hist_file = kzalloc(sizeof(*hist_file), GFP_KERNEL);
> +	if (!hist_file)
> +		return -ENOMEM;
> +	hist_file->file = file;
> +
>  	/* Clear private_data to avoid warning in single_open() */
>  	file->private_data = NULL;
> -	return single_open(file, hist_show, file);
> +	ret = single_open(file, hist_show, hist_file);
> +	if (ret)
> +		kfree(hist_file);
> +	return ret;
>  }
>  
>  const struct file_operations event_hist_fops = {
>  	.open = event_hist_open,
>  	.read = seq_read,
>  	.llseek = seq_lseek,
> -	.release = tracing_single_release_file_tr,
> +	.release = event_hist_release,
> +	.poll = event_hist_poll,
>  };
>  
>  #ifdef CONFIG_HIST_TRIGGERS_DEBUG
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Since POLLIN will not be flushed until read the hist file, user needs to repeat read() and poll() on hist for monitoring the event continuously. But the read() is somewhat redundant only for monitoring events.
This add POLLPRI poll event on hist, this event returns when a histogram is updated after open(), poll() or read(). Thus it is possible to wait next event without read().
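To make the difference between the two events concrete, here is a small userspace model of the bookkeeping this patch adds. It is only a sketch under stated assumptions: `struct hist_model`, the `model_*` helpers and the `MODEL_POLL*` constants are invented for illustration, while the real logic lives in `hist_show()`, `event_hist_poll()` and `event_hist_open()`.

```c
#include <stdint.h>

#define MODEL_POLLIN  0x1u
#define MODEL_POLLPRI 0x2u

/* Userspace model of the per-open state kept for a hist file. */
struct hist_model {
	uint64_t hits;		/* total histogram hit count */
	uint64_t last_read;	/* hit count seen by the last read() */
	uint64_t last_act;	/* hit count seen by the last open/poll/read */
};

/* open() remembers the current count so old hits do not raise POLLPRI. */
static void model_open(struct hist_model *h)  { h->last_act = h->hits; }

/* One traced event updates the histogram. */
static void model_event(struct hist_model *h) { h->hits++; }

/* read() flushes POLLIN and also resets the POLLPRI reference point. */
static void model_read(struct hist_model *h)
{
	h->last_read = h->hits;
	h->last_act = h->last_read;
}

static unsigned int model_poll(struct hist_model *h)
{
	unsigned int ret = 0;

	if (h->last_read != h->hits)
		ret |= MODEL_POLLIN;	/* stays set until read() */
	if (h->last_act != h->hits) {
		h->last_act = h->hits;	/* POLLPRI is consumed by poll() itself */
		ret |= MODEL_POLLPRI;
	}
	return ret;
}
```

In this model, two back-to-back polls after a single event report POLLPRI only once, whereas POLLIN keeps being reported until a read happens.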
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
---
 kernel/trace/trace_events_hist.c |   29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 107eaa0f40f1..8819a8cc4d53 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -5598,6 +5598,7 @@ static void hist_trigger_show(struct seq_file *m,
 struct hist_file_data {
 	struct file *file;
 	u64 last_read;
+	u64 last_act;
 };
 
 static u64 get_hist_hit_count(struct trace_event_file *event_file)
@@ -5635,6 +5636,11 @@ static int hist_show(struct seq_file *m, void *v)
 			hist_trigger_show(m, data, n++);
 	}
 	hist_file->last_read = get_hist_hit_count(event_file);
+	/*
+	 * Update last_act too so that poll()/POLLPRI can wait for the next
+	 * event after any syscall on hist file.
+	 */
+	hist_file->last_act = hist_file->last_read;
 
 out_unlock:
 	mutex_unlock(&event_mutex);
@@ -5648,6 +5654,7 @@ static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wai
 	struct seq_file *m = file->private_data;
 	struct hist_file_data *hist_file = m->private;
 	__poll_t ret = 0;
+	u64 cnt;
 
 	mutex_lock(&event_mutex);
 
@@ -5659,8 +5666,13 @@ static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wai
 
 	hist_poll_wait(file, wait);
 
-	if (hist_file->last_read != get_hist_hit_count(event_file))
-		ret = EPOLLIN | EPOLLRDNORM;
+	cnt = get_hist_hit_count(event_file);
+	if (hist_file->last_read != cnt)
+		ret |= EPOLLIN | EPOLLRDNORM;
+	if (hist_file->last_act != cnt) {
+		hist_file->last_act = cnt;
+		ret |= EPOLLPRI;
+	}
 
 out_unlock:
 	mutex_unlock(&event_mutex);
@@ -5679,6 +5691,7 @@ static int event_hist_release(struct inode *inode, struct file *file)
 
 static int event_hist_open(struct inode *inode, struct file *file)
 {
+	struct trace_event_file *event_file;
 	struct hist_file_data *hist_file;
 	int ret;
 
@@ -5689,13 +5702,25 @@ static int event_hist_open(struct inode *inode, struct file *file)
 	hist_file = kzalloc(sizeof(*hist_file), GFP_KERNEL);
 	if (!hist_file)
 		return -ENOMEM;
+
+	mutex_lock(&event_mutex);
+	event_file = event_file_data(file);
+	if (!event_file) {
+		ret = -ENODEV;
+		goto out_unlock;
+	}
+
 	hist_file->file = file;
+	hist_file->last_act = get_hist_hit_count(event_file);
 
 	/* Clear private_data to avoid warning in single_open() */
 	file->private_data = NULL;
 	ret = single_open(file, hist_show, hist_file);
+
+out_unlock:
 	if (ret)
 		kfree(hist_file);
+	mutex_unlock(&event_mutex);
 	return ret;
 }
On Wed, 16 Oct 2024 19:49:33 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Since POLLIN will not be flushed until read the hist file, user needs to repeat read() and poll() on hist for monitoring the event continuously. But the read() is somewhat redundant only for monitoring events.
>
> This add POLLPRI poll event on hist, this event returns when a histogram is updated after open(), poll() or read(). Thus it is possible to wait next event without read().
I would reword the above to:
Since POLLIN will not be flushed until the hist file is read, the user needs to repeatedly read() and poll() on the hist file for monitoring the event continuously. But the read() is somewhat redundant when the user is only monitoring for event updates.
Add POLLPRI poll event on the hist file so the event returns when a histogram is updated after open(), poll() or read(). Thus it is possible to wait for the next event without having to issue a read().
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Reviewed-by: Tom Zanussi <zanussi@kernel.org>
> ---
>  kernel/trace/trace_events_hist.c |   29 +++++++++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 107eaa0f40f1..8819a8cc4d53 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -5598,6 +5598,7 @@ static void hist_trigger_show(struct seq_file *m,
>  struct hist_file_data {
>  	struct file *file;
>  	u64 last_read;
> +	u64 last_act;
>  };
>  
>  static u64 get_hist_hit_count(struct trace_event_file *event_file)
> @@ -5635,6 +5636,11 @@ static int hist_show(struct seq_file *m, void *v)
>  			hist_trigger_show(m, data, n++);
>  	}
>  	hist_file->last_read = get_hist_hit_count(event_file);
> +	/*
> +	 * Update last_act too so that poll()/POLLPRI can wait for the next
> +	 * event after any syscall on hist file.
> +	 */
> +	hist_file->last_act = hist_file->last_read;
>  
>  out_unlock:
>  	mutex_unlock(&event_mutex);
> @@ -5648,6 +5654,7 @@ static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wai
>  	struct seq_file *m = file->private_data;
>  	struct hist_file_data *hist_file = m->private;
>  	__poll_t ret = 0;
> +	u64 cnt;
>  
>  	mutex_lock(&event_mutex);
>  
> @@ -5659,8 +5666,13 @@ static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wai
>  
>  	hist_poll_wait(file, wait);
>  
> -	if (hist_file->last_read != get_hist_hit_count(event_file))
> -		ret = EPOLLIN | EPOLLRDNORM;
> +	cnt = get_hist_hit_count(event_file);
> +	if (hist_file->last_read != cnt)
> +		ret |= EPOLLIN | EPOLLRDNORM;
> +	if (hist_file->last_act != cnt) {
> +		hist_file->last_act = cnt;
> +		ret |= EPOLLPRI;
> +	}
>  
>  out_unlock:
>  	mutex_unlock(&event_mutex);
> @@ -5679,6 +5691,7 @@ static int event_hist_release(struct inode *inode, struct file *file)
>  
>  static int event_hist_open(struct inode *inode, struct file *file)
>  {
> +	struct trace_event_file *event_file;
>  	struct hist_file_data *hist_file;
>  	int ret;
>  
> @@ -5689,13 +5702,25 @@ static int event_hist_open(struct inode *inode, struct file *file)
>  	hist_file = kzalloc(sizeof(*hist_file), GFP_KERNEL);
>  	if (!hist_file)
>  		return -ENOMEM;
> +
> +	mutex_lock(&event_mutex);
And switch this over to guard() as well.
Thanks,
-- Steve
> +	event_file = event_file_data(file);
> +	if (!event_file) {
> +		ret = -ENODEV;
> +		goto out_unlock;
> +	}
> +
>  	hist_file->file = file;
> +	hist_file->last_act = get_hist_hit_count(event_file);
>  
>  	/* Clear private_data to avoid warning in single_open() */
>  	file->private_data = NULL;
>  	ret = single_open(file, hist_show, hist_file);
> +
> +out_unlock:
>  	if (ret)
>  		kfree(hist_file);
> +	mutex_unlock(&event_mutex);
>  	return ret;
>  }
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Add a testcase for poll() on hist file. This introduces a helper binary to the ftracetest, because there is no good way to reliably execute poll() on hist file.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
---
 tools/testing/selftests/ftrace/Makefile            |    2 +
 tools/testing/selftests/ftrace/poll.c              |   74 ++++++++++++++++++++
 .../ftrace/test.d/trigger/trigger-hist-poll.tc     |   74 ++++++++++++++++++++
 3 files changed, 150 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/poll.c
 create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index a1e955d2de4c..49d96bb16355 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -6,4 +6,6 @@ TEST_PROGS := ftracetest-ktap
 TEST_FILES := test.d settings
 EXTRA_CLEAN := $(OUTPUT)/logs/*
 
+TEST_GEN_PROGS = poll
+
 include ../lib.mk
diff --git a/tools/testing/selftests/ftrace/poll.c b/tools/testing/selftests/ftrace/poll.c
new file mode 100644
index 000000000000..53258f7515e7
--- /dev/null
+++ b/tools/testing/selftests/ftrace/poll.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple poll on a file.
+ *
+ * Copyright (c) 2024 Google LLC.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#define BUFSIZE 4096
+
+/*
+ * Usage:
+ *  poll [-I|-P] [-t timeout] FILE
+ */
+int main(int argc, char *argv[])
+{
+	struct pollfd pfd = {.events = POLLIN};
+	char buf[BUFSIZE];
+	int timeout = -1;
+	int ret, opt;
+
+	while ((opt = getopt(argc, argv, "IPt:")) != -1) {
+		switch (opt) {
+		case 'I':
+			pfd.events = POLLIN;
+			break;
+		case 'P':
+			pfd.events = POLLPRI;
+			break;
+		case 't':
+			timeout = atoi(optarg);
+			break;
+		default:
+			fprintf(stderr, "Usage: %s [-I|-P] [-t timeout] FILE\n",
+				argv[0]);
+			return -1;
+		}
+	}
+	if (optind >= argc) {
+		fprintf(stderr, "Error: Polling file is not specified\n");
+		return -1;
+	}
+
+	pfd.fd = open(argv[optind], O_RDONLY);
+	if (pfd.fd < 0) {
+		fprintf(stderr, "failed to open %s", argv[optind]);
+		perror("open");
+		return -1;
+	}
+
+	/* Reset poll by read if POLLIN is specified. */
+	if (pfd.events & POLLIN)
+		do {} while (read(pfd.fd, buf, BUFSIZE) == BUFSIZE);
+
+	ret = poll(&pfd, 1, timeout);
+	if (ret < 0 && errno != EINTR) {
+		perror("poll");
+		return -1;
+	}
+	close(pfd.fd);
+
+	/* If timeout happened (ret == 0), exit code is 1 */
+	if (ret == 0)
+		return 1;
+
+	return 0;
+}
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
new file mode 100644
index 000000000000..cbd01a71ecad
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
@@ -0,0 +1,74 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: event trigger - test poll wait on histogram
+# requires: set_event events/sched/sched_process_free/trigger events/sched/sched_process_free/hist
+# flags: instance
+
+POLL=${FTRACETEST_ROOT}/poll
+
+if [ ! -x ${POLL} ]; then
+	echo "poll program is not compiled!"
+	exit_unresolved
+fi
+
+EVENT=events/sched/sched_process_free/
+
+# Check poll ops is supported. Before implementing poll on hist file, it
+# returns soon with POLLIN | POLLOUT, but not POLLPRI.
+
+# This must wait >1 sec and return 1 (timeout).
+set +e
+${POLL} -I -t 1000 ${EVENT}/hist
+ret=$?
+set -e
+if [ ${ret} != 1 ]; then
+	echo "poll on hist file is not supported"
+	exit_unsupported
+fi
+
+# Test POLLIN
+echo > trace
+echo "hist:key=comm" > ${EVENT}/trigger
+echo 1 > ${EVENT}/enable
+
+# This sleep command will exit after 2 seconds.
+sleep 2 &
+BGPID=$!
+# if timeout happens, poll returns 1.
+${POLL} -I -t 4000 ${EVENT}/hist
+echo 0 > tracing_on
+
+if [ -d /proc/${BGPID} ]; then
+	echo "poll exits too soon"
+	kill -KILL ${BGPID} ||:
+	exit_fail
+fi
+
+if ! grep -qw "sleep" trace; then
+	echo "poll exits before event happens"
+	exit_fail
+fi
+
+# Test POLLPRI
+echo > trace
+echo 1 > tracing_on
+
+# This sleep command will exit after 2 seconds.
+sleep 2 &
+BGPID=$!
+# if timeout happens, poll returns 1.
+${POLL} -P -t 4000 ${EVENT}/hist
+echo 0 > tracing_on
+
+if [ -d /proc/${BGPID} ]; then
+	echo "poll exits too soon"
+	kill -KILL ${BGPID} ||:
+	exit_fail
+fi
+
+if ! grep -qw "sleep" trace; then
+	echo "poll exits before event happens"
+	exit_fail
+fi
+
+exit_pass
On Wed, 16 Oct 2024 19:49:41 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> --- /dev/null
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
> @@ -0,0 +1,74 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +# description: event trigger - test poll wait on histogram
> +# requires: set_event events/sched/sched_process_free/trigger events/sched/sched_process_free/hist
> +# flags: instance
> +
> +POLL=${FTRACETEST_ROOT}/poll
> +
> +if [ ! -x ${POLL} ]; then
> +	echo "poll program is not compiled!"
> +	exit_unresolved
> +fi
> +
> +EVENT=events/sched/sched_process_free/
> +
> +# Check poll ops is supported. Before implementing poll on hist file, it
> +# returns soon with POLLIN | POLLOUT, but not POLLPRI.
> +
> +# This must wait >1 sec and return 1 (timeout).
> +set +e
> +${POLL} -I -t 1000 ${EVENT}/hist
> +ret=$?
> +set -e
> +if [ ${ret} != 1 ]; then
> +	echo "poll on hist file is not supported"
> +	exit_unsupported
> +fi
> +
> +# Test POLLIN
> +echo > trace
> +echo "hist:key=comm" > ${EVENT}/trigger
> +echo 1 > ${EVENT}/enable
> +
> +# This sleep command will exit after 2 seconds.
> +sleep 2 &
> +BGPID=$!
> +# if timeout happens, poll returns 1.
> +${POLL} -I -t 4000 ${EVENT}/hist
> +echo 0 > tracing_on
> +
> +if [ -d /proc/${BGPID} ]; then
> +	echo "poll exits too soon"
> +	kill -KILL ${BGPID} ||:
> +	exit_fail
> +fi
> +
> +if ! grep -qw "sleep" trace; then
> +	echo "poll exits before event happens"
I ran this and it failed here. But it wasn't because the poll failed, it's because the test is wrong. If something else exits during the test, then the poll function will exit early.
What the check should do is simply read the hist file, get the hist count, and make sure it's updated after the poll is run, or at least put a filter on it:
echo 'hist:keys=comm if comm =="sleep"' > /sys/kernel/tracing/events/sched/sched_process_free/trigger
Which would work as long as no other "sleep" exits during the test.
-- Steve
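The first suggestion (compare the hit count before and after the poll) can be sketched in C as below. This is an illustration only: `parse_hist_hits()` and `hits_advanced()` are hypothetical helpers, and the sketch assumes the hist file text contains a "Hits: N" line in its totals section and that only the first such line matters.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract the value of the first "Hits:" line from hist file text.
 * Returns -1 if no such line is found or no number follows it.
 */
static long long parse_hist_hits(const char *text)
{
	const char *p = strstr(text, "Hits:");
	char *end;
	long long val;

	if (!p)
		return -1;
	p += strlen("Hits:");
	val = strtoll(p, &end, 10);	/* skips the blanks before the number */
	if (end == p)
		return -1;
	return val;
}

/* The histogram was updated during the wait if the count grew. */
static int hits_advanced(const char *before, const char *after)
{
	long long b = parse_hist_hits(before);
	long long a = parse_hist_hits(after);

	return b >= 0 && a > b;
}
```

A test would capture the hist text before starting the background command, run the poll, capture the text again, and fail only if `hits_advanced()` is false. Unlike grepping the `trace` file for "sleep", this does not care which task caused the update.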
> +	exit_fail
> +fi
> +
> +# Test POLLPRI
> +echo > trace
> +echo 1 > tracing_on
> +
> +# This sleep command will exit after 2 seconds.
> +sleep 2 &
> +BGPID=$!
> +# if timeout happens, poll returns 1.
> +${POLL} -P -t 4000 ${EVENT}/hist
> +echo 0 > tracing_on
> +
> +if [ -d /proc/${BGPID} ]; then
> +	echo "poll exits too soon"
> +	kill -KILL ${BGPID} ||:
> +	exit_fail
> +fi
> +
> +if ! grep -qw "sleep" trace; then
> +	echo "poll exits before event happens"
> +	exit_fail
> +fi
> +
> +exit_pass
On Wed, 16 Oct 2024 19:49:15 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> Overview
> --------
> This patch set allows the user to `poll` (or `select`, `epoll`) on the event histogram interface. Each event has its own `hist` file, which shows the histograms generated by trigger actions. So the user can set a new hist trigger on any event they want to monitor, and poll on the `hist` file until it is updated.
Note: The `hist` file is not disabled by the tracing_on interface, because that interface only disables recording. Thus, to monitor events via this interface, the user must ensure that tracing_on is enabled and also set the same "filter" on the hist action.
Thank you,
> Two poll events are supported, POLLIN and POLLPRI. POLLIN means that there is a readable update on the `hist` file, and this event is flushed only when you call read(). So it is useful if you want to read the histogram periodically. The other event, POLLPRI, is for monitoring a trace event. Like POLLIN, it is returned when the histogram is updated, but you don't need to read() the file before calling poll() again.
>
> Note that this waits for a histogram update (not event arrival), so you must first set a histogram on the event.
>
> Usage
> -----
> Here is an example usage:
>
> ----
> TRACEFS=/sys/kernel/tracing
> EVENT=$TRACEFS/events/sched/sched_process_free
>
> # setup histogram trigger and enable event
> echo "hist:key=comm" >> $EVENT/trigger
> echo 1 > $EVENT/enable
>
> # Wait for update
> poll pri $EVENT/hist
>
> # Event arrived.
> echo "process free event is coming"
> tail $TRACEFS/trace
> ----
>
> The 'poll' command is in the selftest patch.
>
> You can also take this series from here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=to...
>
> Thank you,
>
> ---
>
> Masami Hiramatsu (Google) (3):
>       tracing/hist: Add poll(POLLIN) support on hist file
>       tracing/hist: Support POLLPRI event for poll on histogram
>       selftests/tracing: Add hist poll() support test
>
>
>  include/linux/trace_events.h                       |   14 +++
>  kernel/trace/trace_events.c                        |   14 +++
>  kernel/trace/trace_events_hist.c                   |  100 +++++++++++++++++++-
>  tools/testing/selftests/ftrace/Makefile            |    2 
>  tools/testing/selftests/ftrace/poll.c              |   74 +++++++++++++++
>  .../ftrace/test.d/trigger/trigger-hist-poll.tc     |   74 +++++++++++++++
>  6 files changed, 275 insertions(+), 3 deletions(-)
>  create mode 100644 tools/testing/selftests/ftrace/poll.c
>  create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>