During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
---
 kernel/torture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/torture.c b/kernel/torture.c
index 29afc62f2bfe..d024f3b7181f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ/20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);
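To make the handshake concrete, the sketch below shows the pattern the changelog describes, using hypothetical function names rather than the exact rcutorture code: each stopping kthread spins on kthread_should_stop() with a short sleep, while the cleanup path only calls kthread_stop() on it later. With a 1-jiffy sleep, every waiting kthread re-arms a timer on every tick, which is what keeps the timer softirq busy; HZ/20 reduces that to 20 re-arms per second per kthread.

#include <linux/kthread.h>
#include <linux/sched.h>

/* Stopping side (sketch): each torture kthread ends up spinning here
 * once fullstop is set, waiting for the cleanup path to stop it. */
static void example_kthread_stopping(void)
{
	while (!kthread_should_stop()) {
		/* A 1-jiffy timeout re-arms a timer every tick; HZ/20
		 * wakes up only 20 times per second, easing the
		 * pressure on the timer softirq. */
		schedule_timeout_uninterruptible(HZ / 20);
	}
}

/* Cleanup side (sketch): kthread_stop() wakes the task and makes
 * kthread_should_stop() return true, ending the loop above.  The race
 * described in the changelog is the window before this call is made. */
static void example_cleanup(struct task_struct *tp)
{
	if (tp)
		kthread_stop(tp);
}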
On Sun, Jan 1, 2023 at 1:16 AM Joel Fernandes (Google) joel@joelfernandes.org wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Question for stable maintainers: This patch is for mainline and 6.0 stable. However, it should also go to 6.1 stable. How do we tag it to do that? I did not know how to tag two stable versions. I guess the above implies > 6.0?
Thanks,
- Joel
On Sun, Jan 01, 2023 at 01:20:01AM -0500, Joel Fernandes wrote:
On Sun, Jan 1, 2023 at 1:16 AM Joel Fernandes (Google)
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Question for stable maintainers: This patch is for mainline and 6.0 stable. However, it should also go to 6.1 stable. How do we tag it to do that? I did not know how to tag two stable versions. I guess the above implies > 6.0?
The above implies 6.0 and newer, which already includes 6.1, so all is good.
On Jan 1, 2023, at 3:48 AM, Greg KH gregkh@linuxfoundation.org wrote:
On Sun, Jan 01, 2023 at 01:20:01AM -0500, Joel Fernandes wrote:
On Sun, Jan 1, 2023 at 1:16 AM Joel Fernandes (Google)
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Question for stable maintainers: This patch is for mainline and 6.0 stable. However, it should also go to 6.1 stable. How do we tag it to do that? I did not know how to tag two stable versions. I guess the above implies > 6.0?
The above implies 6.0 and newer, which already includes 6.1, so all is good.
Thanks a lot for clarification,
- Joel
On Sun, Jan 1, 2023 at 2:16 PM Joel Fernandes (Google) joel@joelfernandes.org wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
On my Dell PowerEdge R720 with two Intel(R) Xeon(R) CPU E5-2660 and 128G memory:
1) before this patch: 3 of 80 rounds failed with "rcu: INFO: rcu_sched detected stalls on CPUs/tasks" [1]
2) after this patch: all 80 rounds passed
Tested-by: Zhouyi Zhou zhouzhouyi@gmail.com
Thanks Zhouyi
[1] http://154.220.3.115/logs/20230101/console.log
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
 kernel/torture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/torture.c b/kernel/torture.c
index 29afc62f2bfe..d024f3b7181f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ/20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);
--
2.39.0.314.g84b9a713c41-goog
On Jan 1, 2023, at 8:02 AM, Zhouyi Zhou zhouzhouyi@gmail.com wrote:
On Sun, Jan 1, 2023 at 2:16 PM Joel Fernandes (Google) joel@joelfernandes.org wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
On my Dell PowerEdge R720 with two Intel(R) Xeon(R) CPU E5-2660 and 128G memory:
1) before this patch: 3 of 80 rounds failed with "rcu: INFO: rcu_sched detected stalls on CPUs/tasks" [1]
2) after this patch: all 80 rounds passed
Tested-by: Zhouyi Zhou zhouzhouyi@gmail.com
Thanks! Glad to see your tests look good now.
- Joel
Thanks Zhouyi
[1] http://154.220.3.115/logs/20230101/console.log
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
 kernel/torture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/torture.c b/kernel/torture.c
index 29afc62f2bfe..d024f3b7181f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ/20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);
--
2.39.0.314.g84b9a713c41-goog
On Sun, 01 Jan 2023, Joel Fernandes (Google) wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
Reviewed-by: Davidlohr Bueso dave@stgolabs.net
On Mon, Jan 02, 2023 at 08:43:10AM -0800, Davidlohr Bueso wrote:
On Sun, 01 Jan 2023, Joel Fernandes (Google) wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
Reviewed-by: Davidlohr Bueso dave@stgolabs.net
Queued for further review and testing, thank you all!
One thing still puzzles me. Assuming HZ=1000 and given 16 CPUs, each timer handler must have consumed many tens of microseconds in order to keep the system busy, which seems a bit longer than it should be. Or am I underestimating the number of tasks involved?
Thanx, Paul
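A rough back-of-the-envelope reading of that estimate (a sketch assuming HZ=1000 and roughly 16 rcutorture kthreads each re-arming a 1-jiffy timer whose callbacks all land on the same CPU):

    1 jiffy at HZ=1000                    = 1000 us
    timer callbacks per jiffy on that CPU ~ 16 (one per waiting kthread)
    average callback cost needed to keep
    the softirq busy for the whole jiffy  >= 1000 us / 16 ~ 62 us

So each process_timeout() invocation, together with the wakeup it triggers, would indeed need to cost many tens of microseconds; the roughly 118 us gap measured in the trace further down the thread is in that range.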
On Tue, Jan 03, 2023 at 10:04:04AM -0800, Paul E. McKenney wrote:
On Mon, Jan 02, 2023 at 08:43:10AM -0800, Davidlohr Bueso wrote:
On Sun, 01 Jan 2023, Joel Fernandes (Google) wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
Reviewed-by: Davidlohr Bueso dave@stgolabs.net
Queued for further review and testing, thank you all!
One thing still puzzles me. Assuming HZ=1000 and given 16 CPUs, each timer handler must have consumed many tens of microseconds in order to keep the system busy, which seems a bit longer than it should be. Or am I underestimating the number of tasks involved?
Here are the traces between successive calls to process_timeout() which is the timer callback handler:
[ 1320.444210] <idle>-0 0dNs.. 314229620us : __run_timers: Calling timerfn 5: process_timeout
[ 1320.444215] <idle>-0 0dNs.. 314229620us : sched_waking: comm=rcu_torture_fak pid=145 prio=139 target_cpu=008
[ 1320.463393] <idle>-0 7d.... 314229655us : sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_torture_wri next_pid=144 next_prio=120
[ 1320.478870] rcu_tort-144 7d.... 314229658us : sched_switch: prev_comm=rcu_torture_wri prev_pid=144 prev_prio=120 prev_state=D ==> next_comm=swapper/7 next_pid=0 next_prio=120
[ 1320.494324] <idle>-0 0dNs.. 314229738us : __run_timers: Calling timerfn 6: process_timeout
It appears the time delta in the above occurrence is 118 microseconds between two timer callbacks. It does appear to be doing a cross-CPU wakeup; maybe that adds to the long time?
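(The 118 us figure follows directly from the two __run_timers timestamps in the trace above: 314229738 us - 314229620 us = 118 us.)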
Here are the full logs with traces (in case it helps, search for "=D" for the D-state sched_switch event before the "panic now" trace happens): http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.0.y/26/a...
thanks,
- Joel
On Tue, Jan 03, 2023 at 06:24:23PM +0000, Joel Fernandes wrote:
On Tue, Jan 03, 2023 at 10:04:04AM -0800, Paul E. McKenney wrote:
On Mon, Jan 02, 2023 at 08:43:10AM -0800, Davidlohr Bueso wrote:
On Sun, 01 Jan 2023, Joel Fernandes (Google) wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
Reviewed-by: Davidlohr Bueso dave@stgolabs.net
Queued for further review and testing, thank you all!
One thing still puzzles me. Assuming HZ=1000 and given 16 CPUs, each timer handler must have consumed many tens of microseconds in order to keep the system busy, which seems a bit longer than it should be. Or am I underestimating the number of tasks involved?
Here are the traces between successive calls to process_timeout() which is the timer callback handler:
[ 1320.444210] <idle>-0 0dNs.. 314229620us : __run_timers: Calling timerfn 5: process_timeout
[ 1320.444215] <idle>-0 0dNs.. 314229620us : sched_waking: comm=rcu_torture_fak pid=145 prio=139 target_cpu=008
[ 1320.463393] <idle>-0 7d.... 314229655us : sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_torture_wri next_pid=144 next_prio=120
[ 1320.478870] rcu_tort-144 7d.... 314229658us : sched_switch: prev_comm=rcu_torture_wri prev_pid=144 prev_prio=120 prev_state=D ==> next_comm=swapper/7 next_pid=0 next_prio=120
[ 1320.494324] <idle>-0 0dNs.. 314229738us : __run_timers: Calling timerfn 6: process_timeout
It appears the time delta in the above occurrence is 118 microseconds between two timer callbacks. It does appear to be doing a cross-CPU wakeup; maybe that adds to the long time?
Here are the full logs with traces (in case it helps, search for "=D" for the D-state sched_switch event before the "panic now" trace happens): http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.0.y/26/a...
118 microseconds would do it!
Still seems excessive to me, but what do I know?
Thanx, Paul
On Tue, Jan 3, 2023 at 7:00 PM Paul E. McKenney paulmck@kernel.org wrote:
On Tue, Jan 03, 2023 at 06:24:23PM +0000, Joel Fernandes wrote:
On Tue, Jan 03, 2023 at 10:04:04AM -0800, Paul E. McKenney wrote:
On Mon, Jan 02, 2023 at 08:43:10AM -0800, Davidlohr Bueso wrote:
On Sun, 01 Jan 2023, Joel Fernandes (Google) wrote:
During shutdown of rcutorture, the shutdown thread in rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and fakewriter threads to break out of their main while loop and start shutting down.
Once out of their main loop, they call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads; it does that a bit later. Before it gets a chance to do so, torture_kthread_stopping() calls schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirmed that this makes the timer softirq constantly execute timer callbacks without ever returning to the softirq exit path, leaving it essentially "locked up". If the softirq preempts the shutdown thread, kthread_stop() may never be called.
This commit improves the situation dramatically by increasing the timeout passed to schedule_timeout_uninterruptible() to 1/20th of a second. With this change, the timer softirq no longer locks up a CPU and everything works fine. Testing has shown 100 runs of TREE07 passing reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney paulmck@kernel.org
Cc: Frederic Weisbecker fweisbec@gmail.com
Cc: Zhouyi Zhou zhouzhouyi@gmail.com
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org
Reviewed-by: Davidlohr Bueso dave@stgolabs.net
Queued for further review and testing, thank you all!
One thing still puzzles me. Assuming HZ=1000 and given 16 CPUs, each timer handler must have consumed many tens of microseconds in order to keep the system busy, which seems a bit longer than it should be. Or am I underestimating the number of tasks involved?
Here are the traces between successive calls to process_timeout() which is the timer callback handler:
[ 1320.444210] <idle>-0 0dNs.. 314229620us : __run_timers: Calling timerfn 5: process_timeout
[ 1320.444215] <idle>-0 0dNs.. 314229620us : sched_waking: comm=rcu_torture_fak pid=145 prio=139 target_cpu=008
[ 1320.463393] <idle>-0 7d.... 314229655us : sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_torture_wri next_pid=144 next_prio=120
[ 1320.478870] rcu_tort-144 7d.... 314229658us : sched_switch: prev_comm=rcu_torture_wri prev_pid=144 prev_prio=120 prev_state=D ==> next_comm=swapper/7 next_pid=0 next_prio=120
[ 1320.494324] <idle>-0 0dNs.. 314229738us : __run_timers: Calling timerfn 6: process_timeout
It appears the time delta in the above occurrence is 118 microseconds between two timer callbacks. It does appear to be doing a cross-CPU wakeup; maybe that adds to the long time?
Here are the full logs with traces (in case it helps, search for "=D" for the D-state sched_switch event before the "panic now" trace happens): http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.0.y/26/a...
118 microseconds would do it!
Still seems excessive to me
True, and my machine is a bit slow, so maybe it took more CPU cycles than usual. Still, if it happened to me and Zhouyi, it could happen to anyone ;-)
thanks,
- Joel