In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time. I took a look on the qemu/KVM options and decided to update the options: - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
- Drop the "noapic" option, enable TSC deadline timer. There is no history why the APIC was disabled, I see no reason for it. The deadline timer is probably "nicer".
- Additional config options. It ensures that the kernel knowns that it runs as a kvm guest and can use virt devices like the kvm-clock as clocksource. The kvm-clock was the main motivation here.
- I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de --- tools/testing/selftests/rcutorture/bin/functions.sh | 13 ++++++++++++- .../selftests/rcutorture/configs/rcu/CFcommon | 4 ++++ 2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh index 6bcb8b5b2ff22..be3c5c73d7e79 100644 --- a/tools/testing/selftests/rcutorture/bin/functions.sh +++ b/tools/testing/selftests/rcutorture/bin/functions.sh @@ -172,7 +172,7 @@ identify_qemu_append () { local console=ttyS0 case "$1" in qemu-system-x86_64|qemu-system-i386) - echo noapic selinux=0 initcall_debug debug + echo selinux=0 initcall_debug debug ;; qemu-system-aarch64) console=ttyAMA0 @@ -191,8 +191,19 @@ identify_qemu_append () { # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC # and TORTURE_QEMU_INTERACTIVE environment variables. identify_qemu_args () { + local KVM_CPU="" + case "$1" in + qemu-system-x86_64) + KVM_CPU=kvm64 + ;; + qemu-system-i386) + KVM_CPU=kvm32 + ;; + esac case "$1" in qemu-system-x86_64|qemu-system-i386) + echo -machine q35,accel=kvm + echo -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on ;; qemu-system-aarch64) echo -machine virt,gic-version=host -cpu host diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon index d2d2a86139db1..322d5d40443cd 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon @@ -1,2 +1,6 @@ CONFIG_RCU_TORTURE_TEST=y CONFIG_PRINTK_TIME=y +CONFIG_HYPERVISOR_GUEST=y +CONFIG_PARAVIRT=y +CONFIG_PARAVIRT_SPINLOCKS=y +CONFIG_KVM_GUEST=y
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time. I took a look on the qemu/KVM options and decided to update the options:
Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
Drop the "noapic" option, enable TSC deadline timer. There is no history why the APIC was disabled, I see no reason for it. The deadline timer is probably "nicer".
Additional config options. It ensures that the kernel knowns that it runs as a kvm guest and can use virt devices like the kvm-clock as clocksource. The kvm-clock was the main motivation here.
I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
Thank you, Sebastian! Queued for review and testing.
Thanx, Paul
tools/testing/selftests/rcutorture/bin/functions.sh | 13 ++++++++++++- .../selftests/rcutorture/configs/rcu/CFcommon | 4 ++++ 2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh index 6bcb8b5b2ff22..be3c5c73d7e79 100644 --- a/tools/testing/selftests/rcutorture/bin/functions.sh +++ b/tools/testing/selftests/rcutorture/bin/functions.sh @@ -172,7 +172,7 @@ identify_qemu_append () { local console=ttyS0 case "$1" in qemu-system-x86_64|qemu-system-i386)
echo noapic selinux=0 initcall_debug debug
;; qemu-system-aarch64) console=ttyAMA0echo selinux=0 initcall_debug debug
@@ -191,8 +191,19 @@ identify_qemu_append () { # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC # and TORTURE_QEMU_INTERACTIVE environment variables. identify_qemu_args () {
- local KVM_CPU=""
- case "$1" in
- qemu-system-x86_64)
KVM_CPU=kvm64
;;
- qemu-system-i386)
KVM_CPU=kvm32
;;
- esac case "$1" in qemu-system-x86_64|qemu-system-i386)
echo -machine q35,accel=kvm
;; qemu-system-aarch64) echo -machine virt,gic-version=host -cpu hostecho -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon index d2d2a86139db1..322d5d40443cd 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon @@ -1,2 +1,6 @@ CONFIG_RCU_TORTURE_TEST=y CONFIG_PRINTK_TIME=y +CONFIG_HYPERVISOR_GUEST=y +CONFIG_PARAVIRT=y +CONFIG_PARAVIRT_SPINLOCKS=y
+CONFIG_KVM_GUEST=y
2.20.1
On Wed, Apr 24, 2019 at 03:38:09AM -0700, Paul E. McKenney wrote:
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time. I took a look on the qemu/KVM options and decided to update the options:
Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
Drop the "noapic" option, enable TSC deadline timer. There is no history why the APIC was disabled, I see no reason for it. The deadline timer is probably "nicer".
Additional config options. It ensures that the kernel knowns that it runs as a kvm guest and can use virt devices like the kvm-clock as clocksource. The kvm-clock was the main motivation here.
I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
Thank you, Sebastian! Queued for review and testing.
And it doesn't like my (admittedly ancient) QEMU, complaining about not knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on". If I remove these, it works. I will be upgrading soon (famous last words), so what I am going to do is queue the following separate not-for-upstream patch that makes it work on my setup.
Thanx, Paul
------------------------------------------------------------------------
commit d33dffe7d1f05adfc798836a9d30b9f6bf69635d Author: Paul E. McKenney paulmck@linux.ibm.com Date: Wed Apr 24 05:20:07 2019 -0700
EXP torture: Hack patch to support ancient QEMU
To be removed before upstreaming or when Paul gets his QEMU act together, whichever comes first. ;-)
Signed-off-by: Paul E. McKenney paulmck@linux.ibm.com
diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh index be3c5c73d7e7..c3a49fb4d6f6 100644 --- a/tools/testing/selftests/rcutorture/bin/functions.sh +++ b/tools/testing/selftests/rcutorture/bin/functions.sh @@ -203,7 +203,7 @@ identify_qemu_args () { case "$1" in qemu-system-x86_64|qemu-system-i386) echo -machine q35,accel=kvm - echo -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on + echo -cpu ${KVM_CPU} ;; qemu-system-aarch64) echo -machine virt,gic-version=host -cpu host
On Wed, Apr 24, 2019 at 11:30:39AM -0700, Paul E. McKenney wrote:
On Wed, Apr 24, 2019 at 03:38:09AM -0700, Paul E. McKenney wrote:
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time. I took a look on the qemu/KVM options and decided to update the options:
Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
Drop the "noapic" option, enable TSC deadline timer. There is no history why the APIC was disabled, I see no reason for it. The deadline timer is probably "nicer".
Additional config options. It ensures that the kernel knowns that it runs as a kvm guest and can use virt devices like the kvm-clock as clocksource. The kvm-clock was the main motivation here.
I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
Thank you, Sebastian! Queued for review and testing.
And it doesn't like my (admittedly ancient) QEMU, complaining about not knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on". If I remove these, it works. I will be upgrading soon (famous last words), so what I am going to do is queue the following separate not-for-upstream patch that makes it work on my setup.
Also, the !SMP scenarios get this:
:CONFIG_PARAVIRT_SPINLOCKS=y: improperly set
Would it make sense to only set this on CONFIG_SMP=y runs? The easy way to do this is to move it from CFcommon to the scenario files not having CONFIG_SMP=n. Or would something else work better?
Or am I doing something wrong?
Thanx, Paul
On 2019-04-25 12:46:38 [-0700], Paul E. McKenney wrote:
And it doesn't like my (admittedly ancient) QEMU, complaining about not knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on". If I remove these, it works. I will be upgrading soon (famous last words), so what I am going to do is queue the following separate not-for-upstream patch that makes it work on my setup.
Also, the !SMP scenarios get this:
:CONFIG_PARAVIRT_SPINLOCKS=y: improperly set
I ignored that (because there is a CHECK option to ensure that certain options are set).
Would it make sense to only set this on CONFIG_SMP=y runs? The easy way to do this is to move it from CFcommon to the scenario files not having CONFIG_SMP=n. Or would something else work better?
Or am I doing something wrong?
You should also get that warning on ARM for instance since those PARAVIRT SPINLOCKS are only implemented on x86. I have no numbers to compare the performance with and without that option. So we could drop that option or you tell me where to look after a run with/without that option so I can tell if it is worth the effort.
Thanx, Paul
Sebastian
On Fri, Apr 26, 2019 at 12:54:14PM +0200, Sebastian Andrzej Siewior wrote:
On 2019-04-25 12:46:38 [-0700], Paul E. McKenney wrote:
And it doesn't like my (admittedly ancient) QEMU, complaining about not knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on". If I remove these, it works. I will be upgrading soon (famous last words), so what I am going to do is queue the following separate not-for-upstream patch that makes it work on my setup.
Also, the !SMP scenarios get this:
:CONFIG_PARAVIRT_SPINLOCKS=y: improperly set
I ignored that (because there is a CHECK option to ensure that certain options are set).
True enough, but these things are a bit like compiler warnings in that it is good to have a clean run.
Would it make sense to only set this on CONFIG_SMP=y runs? The easy way to do this is to move it from CFcommon to the scenario files not having CONFIG_SMP=n.
(Which is how I am currently running, I should have added.)
having CONFIG_SMP=n. Or would something else work better?
Or am I doing something wrong?
You should also get that warning on ARM for instance since those PARAVIRT SPINLOCKS are only implemented on x86.
OK, so we would want this to only be added for x86 runs.
I have no numbers to compare the performance with and without that option. So we could drop that option or you tell me where to look after a run with/without that option so I can tell if it is worth the effort.
One place to look is in the summary output:
TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
The "58.4667/s" is the number of grace periods per second. I would be surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in grace-period rate (given the natural variation), but you never know.
Thanx, Paul
On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
One place to look is in the summary output:
TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
The "58.4667/s" is the number of grace periods per second. I would be surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in grace-period rate (given the natural variation), but you never know.
I did four runs of the different parts of the patch: - 5.1-rc6 - #1 + kvm64 CPU + some config options - #2 + tsc-deadline=on and so on (the whole line) - #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
the test command was tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
and the results: | HEAD is now at 085b7755808a... Linux 5.1-rc6 | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16 | 27.738775 | | HEAD is now at 36a12aa9761a... tune #1 | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16 | 27.6368375 | | HEAD is now at af5cd7196436... tune #2 | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16 | 27.74371875 | | HEAD is now at 3701f64943f5... tune #3 | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16 | 27.8187
This 28.… is the number of GP/s. Based on the results in looks like noise to me. Also I have no idea why you have more than twice as many GP/s as I do. Different box doing: tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "42*TREE01" --memory 4G and got this:
| HEAD is now at 085b7755808aa... Linux 5.1-rc6 | (15.2878 + 15.8664 + 15.6369 + 15.6714 + 15.3667 + 16.4706 + 15.7844 + | 16.2119 + 15.6108 + 15.84 + 16.0003 + 16.2886 + 15.8728 + 15.5347 + | 15.6753 + 15.6628 + 15.8628 + 15.8158 + 15.8419 + 16.0053 + 15.7878 + | 16.465 + 16.2267 + 16.6881 + 16.3186 + 16.1778 + 15.7069 + 16.0178 + | 15.7156 + 16.0083 + 15.7181 + 15.8961 + 15.5253 + 16.1569 + 15.7692 + | 16.2622 + 16.2931 + 15.9531 + 15.6697 + 15.4539 + 15.6478 + 15.8047) / | 42 | ~15.89452142857142857143 | | HEAD is now at 3701f64943f5a... tune #3 | ; (15.8461 + 15.8653 + 16.0392 + 15.8906 + 15.7606 + 15.6169 + 16.1425 + | 15.9089 + 16.2169 + 16.1694 + 16.2122 + 15.6417 + 15.8022 + 16.1178 + | 15.1689 + 16.1303 + 16.0181 + 16.3797 + 16.0614 + 16.2839 + 15.4583 + | 15.9178 + 16.0589 + 16.3428 + 15.5486 + 16.0839 + 15.9931 + 15.8417 + | 16.0981 + 15.8075 + 15.9925 + 15.7311 + 15.9172 + 16.1164 + 15.6303 + | 15.9383 + 16.0714 + 16.2786 + 15.8394 + 15.9597 + 16.0175 + 15.3908) / | 42 | ~15.93586904761904761905
Noise.
So from RCUtorture point of view there is no improvement right? In that case I would suggest to drop the problematic parts…
Thanx, Paul
Sebastian
On Mon, Apr 29, 2019 at 10:19:44AM +0200, Sebastian Andrzej Siewior wrote:
On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
One place to look is in the summary output:
TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
The "58.4667/s" is the number of grace periods per second. I would be surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in grace-period rate (given the natural variation), but you never know.
I did four runs of the different parts of the patch:
- 5.1-rc6
- #1 + kvm64 CPU + some config options
- #2 + tsc-deadline=on and so on (the whole line)
- #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
the test command was tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
and the results: | HEAD is now at 085b7755808a... Linux 5.1-rc6 | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16 | 27.738775 | | HEAD is now at 36a12aa9761a... tune #1 | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16 | 27.6368375 | | HEAD is now at af5cd7196436... tune #2 | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16 | 27.74371875 | | HEAD is now at 3701f64943f5... tune #3 | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16 | 27.8187
This 28.… is the number of GP/s. Based on the results in looks like noise to me. Also I have no idea why you have more than twice as many GP/s as I do.
My guess is that because you have more CPUs, the for_each_online_cpu() loop takes longer on your system.
Different box doing: tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "42*TREE01" --memory 4G and got this:
| HEAD is now at 085b7755808aa... Linux 5.1-rc6 | (15.2878 + 15.8664 + 15.6369 + 15.6714 + 15.3667 + 16.4706 + 15.7844 + | 16.2119 + 15.6108 + 15.84 + 16.0003 + 16.2886 + 15.8728 + 15.5347 + | 15.6753 + 15.6628 + 15.8628 + 15.8158 + 15.8419 + 16.0053 + 15.7878 + | 16.465 + 16.2267 + 16.6881 + 16.3186 + 16.1778 + 15.7069 + 16.0178 + | 15.7156 + 16.0083 + 15.7181 + 15.8961 + 15.5253 + 16.1569 + 15.7692 + | 16.2622 + 16.2931 + 15.9531 + 15.6697 + 15.4539 + 15.6478 + 15.8047) / | 42 | ~15.89452142857142857143 | | HEAD is now at 3701f64943f5a... tune #3 | ; (15.8461 + 15.8653 + 16.0392 + 15.8906 + 15.7606 + 15.6169 + 16.1425 + | 15.9089 + 16.2169 + 16.1694 + 16.2122 + 15.6417 + 15.8022 + 16.1178 + | 15.1689 + 16.1303 + 16.0181 + 16.3797 + 16.0614 + 16.2839 + 15.4583 + | 15.9178 + 16.0589 + 16.3428 + 15.5486 + 16.0839 + 15.9931 + 15.8417 + | 16.0981 + 15.8075 + 15.9925 + 15.7311 + 15.9172 + 16.1164 + 15.6303 + | 15.9383 + 16.0714 + 16.2786 + 15.8394 + 15.9597 + 16.0175 + 15.3908) / | 42 | ~15.93586904761904761905
Noise.
So from RCUtorture point of view there is no improvement right? In that case I would suggest to drop the problematic parts…
Thank you for testing this! I will adjust accordingly.
Thanx, Paul
On Mon, Apr 29, 2019 at 07:49:24AM -0700, Paul E. McKenney wrote:
On Mon, Apr 29, 2019 at 10:19:44AM +0200, Sebastian Andrzej Siewior wrote:
On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
One place to look is in the summary output:
TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
The "58.4667/s" is the number of grace periods per second. I would be surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in grace-period rate (given the natural variation), but you never know.
I did four runs of the different parts of the patch:
- 5.1-rc6
- #1 + kvm64 CPU + some config options
- #2 + tsc-deadline=on and so on (the whole line)
- #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
the test command was tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
and the results: | HEAD is now at 085b7755808a... Linux 5.1-rc6 | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16 | 27.738775 | | HEAD is now at 36a12aa9761a... tune #1 | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16 | 27.6368375 | | HEAD is now at af5cd7196436... tune #2 | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16 | 27.74371875 | | HEAD is now at 3701f64943f5... tune #3 | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16 | 27.8187
This 28.… is the number of GP/s. Based on the results in looks like noise to me. Also I have no idea why you have more than twice as many GP/s as I do.
My guess is that because you have more CPUs, the for_each_online_cpu() loop takes longer on your system.
OK, that is rather oversimplified, to say the least. A better way to put this is that the probability of some CPU holding things up is larger the more CPUs you have. RCU doe take explicit steps to slow down grace periods, but that doesn't start kicking in until 256 CPUs.
Thanx, Paul
Different box doing: tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "42*TREE01" --memory 4G and got this:
| HEAD is now at 085b7755808aa... Linux 5.1-rc6 | (15.2878 + 15.8664 + 15.6369 + 15.6714 + 15.3667 + 16.4706 + 15.7844 + | 16.2119 + 15.6108 + 15.84 + 16.0003 + 16.2886 + 15.8728 + 15.5347 + | 15.6753 + 15.6628 + 15.8628 + 15.8158 + 15.8419 + 16.0053 + 15.7878 + | 16.465 + 16.2267 + 16.6881 + 16.3186 + 16.1778 + 15.7069 + 16.0178 + | 15.7156 + 16.0083 + 15.7181 + 15.8961 + 15.5253 + 16.1569 + 15.7692 + | 16.2622 + 16.2931 + 15.9531 + 15.6697 + 15.4539 + 15.6478 + 15.8047) / | 42 | ~15.89452142857142857143 | | HEAD is now at 3701f64943f5a... tune #3 | ; (15.8461 + 15.8653 + 16.0392 + 15.8906 + 15.7606 + 15.6169 + 16.1425 + | 15.9089 + 16.2169 + 16.1694 + 16.2122 + 15.6417 + 15.8022 + 16.1178 + | 15.1689 + 16.1303 + 16.0181 + 16.3797 + 16.0614 + 16.2839 + 15.4583 + | 15.9178 + 16.0589 + 16.3428 + 15.5486 + 16.0839 + 15.9931 + 15.8417 + | 16.0981 + 15.8075 + 15.9925 + 15.7311 + 15.9172 + 16.1164 + 15.6303 + | 15.9383 + 16.0714 + 16.2786 + 15.8394 + 15.9597 + 16.0175 + 15.3908) / | 42 | ~15.93586904761904761905
Noise.
So from RCUtorture point of view there is no improvement right? In that case I would suggest to drop the problematic parts…
Thank you for testing this! I will adjust accordingly.
Thanx, Paul
On Mon, Apr 29, 2019 at 08:06:00AM -0700, Paul E. McKenney wrote:
On Mon, Apr 29, 2019 at 07:49:24AM -0700, Paul E. McKenney wrote:
On Mon, Apr 29, 2019 at 10:19:44AM +0200, Sebastian Andrzej Siewior wrote:
On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
One place to look is in the summary output:
TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
The "58.4667/s" is the number of grace periods per second. I would be surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in grace-period rate (given the natural variation), but you never know.
I did four runs of the different parts of the patch:
- 5.1-rc6
- #1 + kvm64 CPU + some config options
- #2 + tsc-deadline=on and so on (the whole line)
- #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
the test command was tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
and the results: | HEAD is now at 085b7755808a... Linux 5.1-rc6 | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16 | 27.738775 | | HEAD is now at 36a12aa9761a... tune #1 | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16 | 27.6368375 | | HEAD is now at af5cd7196436... tune #2 | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16 | 27.74371875 | | HEAD is now at 3701f64943f5... tune #3 | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16 | 27.8187
This 28.… is the number of GP/s. Based on the results in looks like noise to me. Also I have no idea why you have more than twice as many GP/s as I do.
My guess is that because you have more CPUs, the for_each_online_cpu() loop takes longer on your system.
OK, that is rather oversimplified, to say the least. A better way to put this is that the probability of some CPU holding things up is larger the more CPUs you have. RCU doe take explicit steps to slow down grace periods, but that doesn't start kicking in until 256 CPUs.
And I ended up with the following variant of your patch. If I don't hear otherwise from you, I will assume that you are OK with it. So if something bothers you about it, please don't suffer in silence!
Thanx, Paul
------------------------------------------------------------------------
commit 4a04229cf73ac9bc1ae15357beb32a0b37be1480 Author: Sebastian Andrzej Siewior bigeasy@linutronix.de Date: Wed Apr 24 09:34:46 2019 +0200
rcutorture: Tweak kvm options
In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time.
I took a look on the qemu/KVM options and decided to update the options:
- Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
- Drop the "noapic" option. There is no history why the APIC was disabled, I see no reason for it. Once old qemu versions fade away, we can add "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on".
- Additional config options. It ensures that the kernel knowns that it runs as a kvm guest and can use virt devices like the kvm-clock as clocksource. The kvm-clock was the main motivation here.
- I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de [ paulmck: The world is not quite ready for CONFIG_PARAVIRT_SPINLOCKS=y and x2apic, so they are omitted for the time being. ] Signed-off-by: Paul E. McKenney paulmck@linux.ibm.com
diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh index 6bcb8b5b2ff2..c3a49fb4d6f6 100644 --- a/tools/testing/selftests/rcutorture/bin/functions.sh +++ b/tools/testing/selftests/rcutorture/bin/functions.sh @@ -172,7 +172,7 @@ identify_qemu_append () { local console=ttyS0 case "$1" in qemu-system-x86_64|qemu-system-i386) - echo noapic selinux=0 initcall_debug debug + echo selinux=0 initcall_debug debug ;; qemu-system-aarch64) console=ttyAMA0 @@ -191,8 +191,19 @@ identify_qemu_append () { # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC # and TORTURE_QEMU_INTERACTIVE environment variables. identify_qemu_args () { + local KVM_CPU="" + case "$1" in + qemu-system-x86_64) + KVM_CPU=kvm64 + ;; + qemu-system-i386) + KVM_CPU=kvm32 + ;; + esac case "$1" in qemu-system-x86_64|qemu-system-i386) + echo -machine q35,accel=kvm + echo -cpu ${KVM_CPU} ;; qemu-system-aarch64) echo -machine virt,gic-version=host -cpu host diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon index d2d2a86139db..e19a444a0684 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon @@ -1,2 +1,5 @@ CONFIG_RCU_TORTURE_TEST=y CONFIG_PRINTK_TIME=y +CONFIG_HYPERVISOR_GUEST=y +CONFIG_PARAVIRT=y +CONFIG_KVM_GUEST=y
On 2019-05-02 11:41:16 [-0700], Paul E. McKenney wrote:
And I ended up with the following variant of your patch. If I don't hear otherwise from you, I will assume that you are OK with it. So if something bothers you about it, please don't suffer in silence!
perfect, thank you!
Thanx, Paul
Sebastian
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time. I took a look on the qemu/KVM options and decided to update the options:
Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
Drop the "noapic" option, enable TSC deadline timer. There is no history why the APIC was disabled, I see no reason for it. The deadline timer is probably "nicer".
I was wondering why the tsc deadline timer can't just be the default in the kernel if it is "nicer" / "better" , and why does it need to be an option.
Additional config options. It ensures that the kernel knowns that it runs as a kvm guest and can use virt devices like the kvm-clock as clocksource. The kvm-clock was the main motivation here.
I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Didn't follow this point about "random HW device". It looks like there is no part of the patch that matches this comment. There was no "random HW device" needed for this change of the clocksource, so could you clarify what this means?
Otherwise lgtm, thanks!
Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org
- Joel
Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
tools/testing/selftests/rcutorture/bin/functions.sh | 13 ++++++++++++- .../selftests/rcutorture/configs/rcu/CFcommon | 4 ++++ 2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh index 6bcb8b5b2ff22..be3c5c73d7e79 100644 --- a/tools/testing/selftests/rcutorture/bin/functions.sh +++ b/tools/testing/selftests/rcutorture/bin/functions.sh @@ -172,7 +172,7 @@ identify_qemu_append () { local console=ttyS0 case "$1" in qemu-system-x86_64|qemu-system-i386)
echo noapic selinux=0 initcall_debug debug
;; qemu-system-aarch64) console=ttyAMA0echo selinux=0 initcall_debug debug
@@ -191,8 +191,19 @@ identify_qemu_append () { # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC # and TORTURE_QEMU_INTERACTIVE environment variables. identify_qemu_args () {
- local KVM_CPU=""
- case "$1" in
- qemu-system-x86_64)
KVM_CPU=kvm64
;;
- qemu-system-i386)
KVM_CPU=kvm32
;;
- esac case "$1" in qemu-system-x86_64|qemu-system-i386)
echo -machine q35,accel=kvm
;; qemu-system-aarch64) echo -machine virt,gic-version=host -cpu hostecho -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon index d2d2a86139db1..322d5d40443cd 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon @@ -1,2 +1,6 @@ CONFIG_RCU_TORTURE_TEST=y CONFIG_PRINTK_TIME=y +CONFIG_HYPERVISOR_GUEST=y +CONFIG_PARAVIRT=y +CONFIG_PARAVIRT_SPINLOCKS=y
+CONFIG_KVM_GUEST=y
2.20.1
On 2019-04-25 12:45:58 [-0400], Joel Fernandes wrote:
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
In one of my rcutorture tests the TSC clocksource got marked unstable due to a large difference in the TSC value. I'm not sure if the guest run for a long time with disabled interrupts or if the host was very busy and didn't schedule the guest for some time. I took a look on the qemu/KVM options and decided to update the options:
Use kvm{32|64} as CPU. We could probably use `host' (like ARM does) for maximum available features but since we don't run any userland I'm not sure if it makes any difference.
Drop the "noapic" option, enable TSC deadline timer. There is no history why the APIC was disabled, I see no reason for it. The deadline timer is probably "nicer".
I was wondering why the tsc deadline timer can't just be the default in the kernel if it is "nicer" / "better" , and why does it need to be an option.
The tsc-deadline=on part tells qemu to expose it. Otherwise the kernel can't use HW that isn't there. I added q35 as the machine which should pass enough sane default options. If this tsc-deadline timer is a problem we could probably drop it. The local-apic should work.
- I didn't add a random HW device. It would make the random device ready earlier (not it doesn't complete the initialisation at all) but I doubt that there is any need for this.
Didn't follow this point about "random HW device". It looks like there is no part of the patch that matches this comment. There was no "random HW device" needed for this change of the clocksource, so could you clarify what this means?
I wanted to "upgrade" the kvm options and as part of it also add: -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0
With that change you would see |random: crng init done
during boot. Now you should end up with "only" |random: fast init done
I mentioned it because I didn't see a reason why to do so. If someone has an idea why it would make sense for rcutorture to use it, I can add it.
Otherwise lgtm, thanks!
Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org
- Joel
Sebastian
linux-kselftest-mirror@lists.linaro.org