The standard printk() tries to flush the message to the console immediately. It tries to take the console lock. If the lock is already taken then the current owner is responsible for flushing even the new message.
There is a small race window between checking whether a new message is available and releasing the console lock. It is solved by re-checking the state after releasing the console lock. If the check is positive then console_unlock() tries to take the lock again and process the new message as well.
The commit 996e966640ddea7b535c ("printk: remove logbuf_lock") causes that console_seq is not longer read atomically. As a result, the re-check might be done with an inconsistent 64-bit index.
Solve it by using the last sequence number that has been checked under the console lock. In the worst case, it will take the lock again only to realized that the new message has already been proceed. But it was possible even before.
Fixes: commit 996e966640ddea7b535c ("printk: remove logbuf_lock") Cc: stable@vger.kernel.org # 5.13 Signed-off-by: Petr Mladek pmladek@suse.com --- kernel/printk/printk.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 142a58d124d9..87411084075e 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2545,6 +2545,7 @@ void console_unlock(void) bool do_cond_resched, retry; struct printk_info info; struct printk_record r; + u64 next_seq;
if (console_suspended) { up_console_sem(); @@ -2654,8 +2655,10 @@ void console_unlock(void) cond_resched(); }
- console_locked = 0; + /* Get consistent value of the next-to-be-used sequence number. */ + next_seq = console_seq;
+ console_locked = 0; up_console_sem();
/* @@ -2664,7 +2667,7 @@ void console_unlock(void) * there's a new owner and the console_unlock() from them will do the * flush, no worries. */ - retry = prb_read_valid(prb, console_seq, NULL); + retry = prb_read_valid(prb, next_seq, NULL); printk_safe_exit_irqrestore(flags);
if (retry && console_trylock())
On 2021-06-29, Petr Mladek pmladek@suse.com wrote:
The standard printk() tries to flush the message to the console immediately. It tries to take the console lock. If the lock is already taken then the current owner is responsible for flushing even the new message.
There is a small race window between checking whether a new message is available and releasing the console lock. It is solved by re-checking the state after releasing the console lock. If the check is positive then console_unlock() tries to take the lock again and process the new message as well.
The commit 996e966640ddea7b535c ("printk: remove logbuf_lock") causes that console_seq is not longer read atomically. As a result, the re-check might be done with an inconsistent 64-bit index.
Solve it by using the last sequence number that has been checked under the console lock. In the worst case, it will take the lock again only to realized that the new message has already been proceed. But it was possible even before.
Fixes: commit 996e966640ddea7b535c ("printk: remove logbuf_lock") Cc: stable@vger.kernel.org # 5.13 Signed-off-by: Petr Mladek pmladek@suse.com
Reviewed-by: John Ogness john.ogness@linutronix.de
Good catch.
John
kernel/printk/printk.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 142a58d124d9..87411084075e 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2545,6 +2545,7 @@ void console_unlock(void) bool do_cond_resched, retry; struct printk_info info; struct printk_record r;
- u64 next_seq;
if (console_suspended) { up_console_sem(); @@ -2654,8 +2655,10 @@ void console_unlock(void) cond_resched(); }
- console_locked = 0;
- /* Get consistent value of the next-to-be-used sequence number. */
- next_seq = console_seq;
- console_locked = 0; up_console_sem();
/* @@ -2664,7 +2667,7 @@ void console_unlock(void) * there's a new owner and the console_unlock() from them will do the * flush, no worries. */
- retry = prb_read_valid(prb, console_seq, NULL);
- retry = prb_read_valid(prb, next_seq, NULL); printk_safe_exit_irqrestore(flags);
if (retry && console_trylock()) -- 2.26.2
On (21/06/29 16:33), Petr Mladek wrote:
The standard printk() tries to flush the message to the console immediately. It tries to take the console lock. If the lock is already taken then the current owner is responsible for flushing even the new message.
There is a small race window between checking whether a new message is available and releasing the console lock. It is solved by re-checking the state after releasing the console lock. If the check is positive then console_unlock() tries to take the lock again and process the new message as well.
[..]
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 142a58d124d9..87411084075e 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2545,6 +2545,7 @@ void console_unlock(void) bool do_cond_resched, retry; struct printk_info info; struct printk_record r;
- u64 next_seq;
if (console_suspended) { up_console_sem(); @@ -2654,8 +2655,10 @@ void console_unlock(void) cond_resched(); }
- console_locked = 0;
- /* Get consistent value of the next-to-be-used sequence number. */
- next_seq = console_seq;
- console_locked = 0; up_console_sem();
/* @@ -2664,7 +2667,7 @@ void console_unlock(void) * there's a new owner and the console_unlock() from them will do the * flush, no worries. */
- retry = prb_read_valid(prb, console_seq, NULL);
- retry = prb_read_valid(prb, next_seq, NULL); printk_safe_exit_irqrestore(flags);
if (retry && console_trylock())
Maybe it's too late here in my time zone, but what are the consequences of this race?
`retry` can be falsely set, console_trylock() does not spin on owner, so the context that just released the lock can grab it again only if it's unlocked. For the context that just has released the console_sem and then acquired it again, because of the race, - console_seq will be valid after it acquires the lock, then it'll jump to `retry` and re-validated the console_seq - prb_read_valid(). If it's valid, it'll print the message; and should another CPU printk that CPU will spin on owner and then the current console_sem owner will yield to it via console_lock_spinning branch.
One way or the other, good catch and nice to have it fixed.
Acked-by: Sergey Senozhatsky senozhatsky@chromium.org
On Wed 2021-06-30 00:42:13, Sergey Senozhatsky wrote:
On (21/06/29 16:33), Petr Mladek wrote:
The standard printk() tries to flush the message to the console immediately. It tries to take the console lock. If the lock is already taken then the current owner is responsible for flushing even the new message.
There is a small race window between checking whether a new message is available and releasing the console lock. It is solved by re-checking the state after releasing the console lock. If the check is positive then console_unlock() tries to take the lock again and process the new message as well.
[..]
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 142a58d124d9..87411084075e 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2545,6 +2545,7 @@ void console_unlock(void) bool do_cond_resched, retry; struct printk_info info; struct printk_record r;
- u64 next_seq;
if (console_suspended) { up_console_sem(); @@ -2654,8 +2655,10 @@ void console_unlock(void) cond_resched(); }
- console_locked = 0;
- /* Get consistent value of the next-to-be-used sequence number. */
- next_seq = console_seq;
- console_locked = 0; up_console_sem();
/* @@ -2664,7 +2667,7 @@ void console_unlock(void) * there's a new owner and the console_unlock() from them will do the * flush, no worries. */
- retry = prb_read_valid(prb, console_seq, NULL);
- retry = prb_read_valid(prb, next_seq, NULL); printk_safe_exit_irqrestore(flags);
if (retry && console_trylock())
Maybe it's too late here in my time zone, but what are the consequences of this race?
`retry` can be falsely set, console_trylock() does not spin on owner, so the context that just released the lock can grab it again only if it's unlocked. For the context that just has released the console_sem and then acquired it again, because of the race, - console_seq will be valid after it acquires the lock, then it'll jump to `retry` and re-validated the console_seq - prb_read_valid(). If it's valid, it'll print the message; and should another CPU printk that CPU will spin on owner and then the current console_sem owner will yield to it via console_lock_spinning branch.
I am not sure that I follow it correctly. IMHO, there are two possible races. I believe that you are talking about the 2nd scenario:
1st scenario: console_unlock() retries but the message has been proceed in the meantime:
CPU0 CPU1
console_unlock()
// process all pending messages next_seq = 100; up_console_sem();
printk() vprintk_store() // storing message with seq = 100;
console_trylock_spinning() //succees console_unlock() // show message with seq == 100 console_seq++; (101)
next_seq = 101; up_consle_sem()
retry = prb_read_valid(prb, next_seq, NULL); // true because next_seq == 100 goto retry;
if (!prb_read_valid(prb, console_seq, &r)) break; // breaks because console_seq == 101
Result: CPU0 retired just to realize that the message has already been procceed.
2nd scenario: printk() caller spins when other process is already processing it's message
CPU0 CPU1
printk() vprintk_store() // storing message with seq == 100
console_trylock_spinning() // succees console_unlock() // show message with seq == 100 console_seq++; (=> 101)
next_seq = 101; up_console_sem();
printk() vprintk_store() // storing message with seq == 101
retry = prb_read_valid(prb, next_seq, NULL); // true because next_seq == 101 goto retry
if (!prb_read_valid(prb, console_seq, &r)) // read messages with seq == 101
console_seq++; (=> 102)
console_lock_spinning_enable(); call_console_drivers();
console_trylock_spinning() // spinning
if (console_lock_spinning_disable_and_check()) return;
// returns because there is a waiter
// got the console lock console_unlock() if (!prb_read_valid(prb, console_seq, &r)) break; // breaks because console_seq == 102
Result: CPU1 was spinning just to realize that the message has already been proceed.
It is not ideal. But the result is always correct.
The races have been there already before. Only the race window in 1st scenario was a bit smaller.
Anyway, thanks for the review.
Best Regards, Petr
On (21/06/30 09:58), Petr Mladek wrote: [..]
`retry` can be falsely set, console_trylock() does not spin on owner, so the context that just released the lock can grab it again only if it's unlocked. For the context that just has released the console_sem and then acquired it again, because of the race, - console_seq will be valid after it acquires the lock, then it'll jump to `retry` and re-validated the console_seq - prb_read_valid(). If it's valid, it'll print the message; and should another CPU printk that CPU will spin on owner and then the current console_sem owner will yield to it via console_lock_spinning branch.
I am not sure that I follow it correctly. IMHO, there are two possible races. I believe that you are talking about the 2nd scenario:
I guess I was thinking about two scenarios simultaneously, but you certainly did a much better job describing them.
Thanks a lot for spending time on this!
1st scenario: console_unlock() retries but the message has been proceed in the meantime:
[..]
Result: CPU0 retired just to realize that the message has already been procceed.
Ack.
2nd scenario: printk() caller spins when other process is already processing it's message
[..]
Result: CPU1 was spinning just to realize that the message has already been proceed.
Ack.
It is not ideal. But the result is always correct.
The races have been there already before. Only the race window in 1st scenario was a bit smaller.
Yeah, this was my assertion as well, but I wanted to double check.
Hi Petr,
I love your patch! Perhaps something to improve:
[auto build test WARNING on linus/master] [also build test WARNING on v5.13 next-20210629] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Petr-Mladek/printk-console-Check-co... base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git c54b245d011855ea91c5beff07f1db74143ce614 config: x86_64-buildonly-randconfig-r006-20210629 (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/24c9c4e757be2efe5bb225e443b3502338ab... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Petr-Mladek/printk-console-Check-consistent-sequence-number-when-handling-race-in-console_unlock/20210629-223408 git checkout 24c9c4e757be2efe5bb225e443b3502338abf64c # save the attached .config to linux build tree make W=1 ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All warnings (new ones prefixed by >>):
kernel/printk/printk.c:175:5: warning: no previous prototype for 'devkmsg_sysctl_set_loglvl' [-Wmissing-prototypes] 175 | int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, | ^~~~~~~~~~~~~~~~~~~~~~~~~ kernel/printk/printk.c: In function 'console_unlock':
kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
2548 | u64 next_seq; | ^~~~~~~~
vim +/next_seq +2548 kernel/printk/printk.c
2525 2526 /** 2527 * console_unlock - unlock the console system 2528 * 2529 * Releases the console_lock which the caller holds on the console system 2530 * and the console driver list. 2531 * 2532 * While the console_lock was held, console output may have been buffered 2533 * by printk(). If this is the case, console_unlock(); emits 2534 * the output prior to releasing the lock. 2535 * 2536 * If there is output waiting, we wake /dev/kmsg and syslog() users. 2537 * 2538 * console_unlock(); may be called from any context. 2539 */ 2540 void console_unlock(void) 2541 { 2542 static char ext_text[CONSOLE_EXT_LOG_MAX]; 2543 static char text[CONSOLE_LOG_MAX]; 2544 unsigned long flags; 2545 bool do_cond_resched, retry; 2546 struct printk_info info; 2547 struct printk_record r;
2548 u64 next_seq;
2549 2550 if (console_suspended) { 2551 up_console_sem(); 2552 return; 2553 } 2554 2555 prb_rec_init_rd(&r, &info, text, sizeof(text)); 2556 2557 /* 2558 * Console drivers are called with interrupts disabled, so 2559 * @console_may_schedule should be cleared before; however, we may 2560 * end up dumping a lot of lines, for example, if called from 2561 * console registration path, and should invoke cond_resched() 2562 * between lines if allowable. Not doing so can cause a very long 2563 * scheduling stall on a slow console leading to RCU stall and 2564 * softlockup warnings which exacerbate the issue with more 2565 * messages practically incapacitating the system. 2566 * 2567 * console_trylock() is not able to detect the preemptive 2568 * context reliably. Therefore the value must be stored before 2569 * and cleared after the "again" goto label. 2570 */ 2571 do_cond_resched = console_may_schedule; 2572 again: 2573 console_may_schedule = 0; 2574 2575 /* 2576 * We released the console_sem lock, so we need to recheck if 2577 * cpu is online and (if not) is there at least one CON_ANYTIME 2578 * console. 2579 */ 2580 if (!can_use_console()) { 2581 console_locked = 0; 2582 up_console_sem(); 2583 return; 2584 } 2585 2586 for (;;) { 2587 size_t ext_len = 0; 2588 size_t len; 2589 2590 printk_safe_enter_irqsave(flags); 2591 skip: 2592 if (!prb_read_valid(prb, console_seq, &r)) 2593 break; 2594 2595 if (console_seq != r.info->seq) { 2596 console_dropped += r.info->seq - console_seq; 2597 console_seq = r.info->seq; 2598 } 2599 2600 if (suppress_message_printing(r.info->level)) { 2601 /* 2602 * Skip record we have buffered and already printed 2603 * directly to the console when we received it, and 2604 * record that has level above the console loglevel. 2605 */ 2606 console_seq++; 2607 goto skip; 2608 } 2609 2610 /* Output to all consoles once old messages replayed. */ 2611 if (unlikely(exclusive_console && 2612 console_seq >= exclusive_console_stop_seq)) { 2613 exclusive_console = NULL; 2614 } 2615 2616 /* 2617 * Handle extended console text first because later 2618 * record_print_text() will modify the record buffer in-place. 2619 */ 2620 if (nr_ext_console_drivers) { 2621 ext_len = info_print_ext_header(ext_text, 2622 sizeof(ext_text), 2623 r.info); 2624 ext_len += msg_print_ext_body(ext_text + ext_len, 2625 sizeof(ext_text) - ext_len, 2626 &r.text_buf[0], 2627 r.info->text_len, 2628 &r.info->dev_info); 2629 } 2630 len = record_print_text(&r, 2631 console_msg_format & MSG_FORMAT_SYSLOG, 2632 printk_time); 2633 console_seq++; 2634 2635 /* 2636 * While actively printing out messages, if another printk() 2637 * were to occur on another CPU, it may wait for this one to 2638 * finish. This task can not be preempted if there is a 2639 * waiter waiting to take over. 2640 */ 2641 console_lock_spinning_enable(); 2642 2643 stop_critical_timings(); /* don't trace print latency */ 2644 call_console_drivers(ext_text, ext_len, text, len); 2645 start_critical_timings(); 2646 2647 if (console_lock_spinning_disable_and_check()) { 2648 printk_safe_exit_irqrestore(flags); 2649 return; 2650 } 2651 2652 printk_safe_exit_irqrestore(flags); 2653 2654 if (do_cond_resched) 2655 cond_resched(); 2656 } 2657 2658 /* Get consistent value of the next-to-be-used sequence number. */ 2659 next_seq = console_seq; 2660 2661 console_locked = 0; 2662 up_console_sem(); 2663 2664 /* 2665 * Someone could have filled up the buffer again, so re-check if there's 2666 * something to flush. In case we cannot trylock the console_sem again, 2667 * there's a new owner and the console_unlock() from them will do the 2668 * flush, no worries. 2669 */ 2670 retry = prb_read_valid(prb, next_seq, NULL); 2671 printk_safe_exit_irqrestore(flags); 2672 2673 if (retry && console_trylock()) 2674 goto again; 2675 } 2676 EXPORT_SYMBOL(console_unlock); 2677
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On 2021-06-30, kernel test robot lkp@intel.com wrote:
kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
I suppose the correct fix for this warning would be to change the NOP macros. Currently they are:
#define prb_read_valid(rb, seq, r) false #define prb_first_valid_seq(rb) 0
They should probably be something like (untested):
#define prb_read_valid(rb, seq, r) \ ({ \ (void)(rb); \ (void)(seq); \ (void)(r); \ false; \ })
#define prb_first_valid_seq(rb) \ ({ \ (void)(rb); \ 0; \ })
John Ogness
On Tue 2021-06-29 22:59:57, John Ogness wrote:
On 2021-06-30, kernel test robot lkp@intel.com wrote:
kernel/printk/printk.c:2548:6: warning: variable 'next_seq' set but not used [-Wunused-but-set-variable]
I suppose the correct fix for this warning would be to change the NOP macros. Currently they are:
#define prb_read_valid(rb, seq, r) false #define prb_first_valid_seq(rb) 0
They should probably be something like (untested):
#define prb_read_valid(rb, seq, r) \ ({ \ (void)(rb); \ (void)(seq); \ (void)(r); \ false; \ })
This did not work:
kernel/printk/printk.c: In function ‘console_unlock’: kernel/printk/printk.c:2600:23: error: ‘prb’ undeclared (first use in this function) if (!prb_read_valid(prb, console_seq, &r)) ^ kernel/printk/printk.c:2230:16: note: in definition of macro ‘prb_read_valid’ (void)(rb); \ ^~ kernel/printk/printk.c:2600:23: note: each undeclared identifier is reported only once for each function it appears in if (!prb_read_valid(prb, console_seq, &r)) ^ kernel/printk/printk.c:2230:16: note: in definition of macro ‘prb_read_valid’ (void)(rb); \ ^~
Instead, it might be solved by declaring next_seq as:
u64 __maybe_unused next_seq;
Any better idea is welcome. Well, it is not worth any big complexity.
Best Regards, Petr
linux-stable-mirror@lists.linaro.org