Hi Reinette and Shuah,
On 11/2/2022 2:41 AM, Shuah Khan wrote:
On 11/1/22 03:43, Shaopeng Tan wrote:
After creating a child process with fork() in CAT test, if there is an error occurs or such as a SIGINT signal is received, the parent process will be terminated immediately, but the child process will not be killed and also umount_resctrlfs() will not be called.
Add a signal handler like other tests to kill child process, umount resctrlfs, cleanup result files, etc. when an error occurs.
Signed-off-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com
tools/testing/selftests/resctrl/cat_test.c | 28 +++++++++++++++------- 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c index 6a8306b0a109..5f81817f4366 100644 --- a/tools/testing/selftests/resctrl/cat_test.c +++ b/tools/testing/selftests/resctrl/cat_test.c @@ -98,12 +98,21 @@ void cat_test_cleanup(void) remove(RESULT_FILE_NAME2); } +static void ctrl_handler(int signo) +{ + kill(bm_pid, SIGKILL); + umount_resctrlfs(); + tests_cleanup(); + ksft_print_msg("Ending\n\n");
Is there a reason to print this message? Remove it unless it serves a purpose.
This function appears to be a duplicate of existing resctrl_val.c:ctrlc_handler(). Could the duplication be avoided instead of refining the copy?
Yes, it's a duplicate of existing resctrl_val.c:ctrlc_handler(). I will try to use resctrl_val.c:ctrlc_handler() in next version patch series.
+ exit(EXIT_SUCCESS); +}
int cat_perf_miss_val(int cpu_no, int n, char *cache_type) { unsigned long l_mask, l_mask_1; int ret, pipefd[2], sibling_cpu_no; char pipe_message; - pid_t bm_pid;
Odd. bm_pid is used below - why remove it here?
There is a global bm_pid in resctrl_val.c that is made available via extern in resctrl.h. This is what causes this code to still compile but I would also like to learn why moving to that is desired as a change here. I expect such a big change to get a mention in the commit message.
Yes. I used global bm_pid, since bm_pid is used in ctrl_handler() before this function cat_perf_miss_val(). I will add a description to commit message.
cache_size = 0; @@ -181,17 +190,19 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) strcpy(param.filename, RESULT_FILE_NAME1); param.num_of_runs = 0; param.cpu_no = sibling_cpu_no; + } else { + /* set up ctrl-c handler */ + if (signal(SIGINT, ctrl_handler) == SIG_ERR || + signal(SIGHUP, ctrl_handler) == SIG_ERR || + signal(SIGTERM, ctrl_handler) == SIG_ERR) + printf("Failed to catch SIGNAL!\n");
Is perror() more appropriate here?
Should we be using signal() at all? "man signal" reads: "WARNING: the behavior of signal() varies across UNIX versions, and has also varied historically across different versions of Linux. Avoid its use: use sigaction(2) instead."
"Failed to catch SIGNAL" also seems unclear to me. This is where a signal handler is set up, the signal for which the handler is installed has not arrived.
I will use sigaction(2) and perror().
} remove(param.filename); ret = cat_val(¶m); - if (ret) - return ret;
- ret = check_results(¶m); - if (ret) - return ret; + if (ret == 0) + ret = check_results(¶m);
Why not use a goto in error case to do umount_resctrlfs() instead of changing the conditionals?
My understanding is the code that follows is needed to synchronize the exits between the parent and child. It is the parent that will run umount_resctrlfs() and it should only do so after the child is done. A goto by the parent may thus cause umount_resctrlfs() to be run while the child still relies on it while a goto by the child may cause the parent not to receive the message that the child is complete.
Yes, the code that follows is needed to synchronize the exits between the parent and child.
@@ -201,7 +212,6 @@ int cat_perf_miss_val(int cpu_no, int n, char
*cache_type)
sizeof(pipe_message)) { close(pipefd[1]); perror("# failed signaling parent process");
return errno;
It looks like pipefd[1] will be closed twice if the write() failed.
This close(pipefd[1]); should also be removed.
It does look strange to let the child continue to its infinite loop after the write() failed. I assume that it is because the parent will also be stuck and the new ctrl_handler() is expected to unblock both?
If a SIGNAL is received, ctrl_handler() will be called to kill the child process and exit parent process. If no SIGNAL is received, the parent process will kill the child process. (by else{kill(bm_pid, SIGKILL);})
Could you please add a comment to the code to clarify this flow?
I will add comments here.
Best regards, Shaopeng Tan