Dear Meng Li and team,
thank you so much for working on finally bringing AMD preferred core
scheduling to mainline Linux!
> The initial core rankings are set up by AMD Pstate when the
> system boots.
I tested this patch on our Ryzen 7950x and 5950x systems and could
unfortunatlely not find any performance differences. I therefore took
a closer look and as far as I can tell the conditional for the initial
preferred performance priorities appears to be reversed. I marked them
down below. I also attached a patch for the fix. With that fixed I can
measure a 0.7% improvement compiling Firefox on 7950x. I wonder
slightly how this ever past testing before, ...
I think it would be a good idea to always expose the hw perf values in
sysfs to help users debugging hardware issues or BIOS settings even
with percore not enabled and therefore not using the unused 166 or 255
values anyway.
With that fixed, however, Linux is still not always scheduling to
preferred cores, but that appears to be an independant limitation of
the current linux scheduler not strictly using the priority for
scheduling, yet. With manual taskset guidance I could further improve
the Firefox build time by some more seconds to over 1% overall
performance improvement, if the linux scheudler would more reliably
schedule minute long running rust lto link tasks to the preferred
cores and not some mediocre ones.
> - highest_perf = amd_get_highest_perf();
> - if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
> - highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> -
> - WRITE_ONCE(cpudata->highest_perf, highest_perf);
> + if (prefcore)
> + WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> + else
> + WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
Conditional reversed, assigns THRESHOLD if enabled!
> WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
> WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
> @@ -318,17 +322,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
> static int cppc_init_perf(struct amd_cpudata *cpudata)
> {
> struct cppc_perf_caps cppc_perf;
> - u32 highest_perf;
>
> int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> if (ret)
> return ret;
>
> - highest_perf = amd_get_highest_perf();
> - if (highest_perf > cppc_perf.highest_perf)
> - highest_perf = cppc_perf.highest_perf;
> -
> - WRITE_ONCE(cpudata->highest_perf, highest_perf);
> + if (prefcore)
> + WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> + else
> + WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
Same here. Not using highest_perf if enabled, ...
Signed-off-by: René Rebe <rene(a)exactcode.de>
--- linux-6.4/drivers/cpufreq/amd-pstate.c.vanilla 2023-08-25 22:34:25.254995690 +0200
+++ linux-6.4/drivers/cpufreq/amd-pstate.c 2023-08-25 22:35:49.194991446 +0200
@@ -282,9 +282,9 @@
* the default max perf.
*/
if (prefcore)
- WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
- else
WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
+ else
+ WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
@@ -303,9 +303,9 @@
return ret;
if (prefcore)
- WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
- else
WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
+ else
+ WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
WRITE_ONCE(cpudata->lowest_nonlinear_perf,
--
René Rebe, ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin
https://exactcode.com | https://t2sde.org | https://rene.rebe.de
Write_schemata() uses fprintf() to write a bitmask into a schemata file
inside resctrl FS. It checks fprintf() return value but it doesn't check
fclose() return value. Error codes from fprintf() such as write errors,
are flushed back to the user only after fclose() is executed which means
any invalid bitmask can be written into the schemata file.
Rewrite write_schemata() to use syscalls instead of stdio functions to
interact with the schemata file.
Change sprintf() to snprintf() in write_schemata().
In case of write() returning an error pass the string acquired with
strerror() to the "reason" buffer.
Extend "reason" buffer by a factor of two so it can hold longer error
messages.
The resctrlfs.c file defines functions that interact with the resctrl FS
while resctrl_val.c file defines functions that perform measurements on
the cache. Run_benchmark() fits logically into the second file before
resctrl_val() function that uses it.
Move run_benchmark() from resctrlfs.c to resctrl_val.c just before
resctrl_val() function definition.
Series is based on kselftest next branch.
Changelog v2:
- Change sprintf() to snprintf() in write_schemata().
- Redo write_schemata() with syscalls instead of stdio functions.
- Fix typos and missing dots in patch messages.
- Branch printf attribute patch to a separate series.
[v1] https://lore.kernel.org/all/cover.1692880423.git.maciej.wieczor-retman@inte…
Wieczor-Retman, Maciej (2):
selftests/resctrl: Fix schemata write error check
selftests/resctrl: Move run_benchmark() to a more fitting file
tools/testing/selftests/resctrl/resctrl_val.c | 50 ++++++++++++
tools/testing/selftests/resctrl/resctrlfs.c | 76 ++++---------------
2 files changed, 63 insertions(+), 63 deletions(-)
base-commit: 13eb52f6293dbda02890698d92f3d9913d8d5aeb
--
2.42.0
The benchmark command handling (-b) in resctrl selftests is overly
complicated code. This series turns the benchmark command immutable to
preserve it for all selftests and improves benchmark command related
error handling.
This series also ends up removing the strcpy() calls which were pointed
out earlier.
v3:
- Removed DEFAULT_SPAN_STR for real and the duplicated copy of defines
that made to v2 likely due to my incorrect conflict resolutions
v2:
- Added argument length check into patch 1/7
- Updated also -b line in help message.
- Document -b argument related "algorithm"
- Use asprintf() to convert defined constant int to string
- Improved changelog texts
- Added \n to ksft_exit_fail_msg() call messages.
- Print DEFAULT_SPAN with %u instead of %zu to avoid need to cast it
Ilpo Järvinen (7):
selftests/resctrl: Ensure the benchmark commands fits to its array
selftests/resctrl: Correct benchmark command help
selftests/resctrl: Remove bw_report and bm_type from main()
selftests/resctrl: Simplify span lifetime
selftests/resctrl: Make benchmark command const and build it with
pointers
selftests/resctrl: Remove ben_count variable
selftests/resctrl: Cleanup benchmark argument parsing
tools/testing/selftests/resctrl/cache.c | 5 +-
tools/testing/selftests/resctrl/cat_test.c | 13 +-
tools/testing/selftests/resctrl/cmt_test.c | 34 +++--
tools/testing/selftests/resctrl/mba_test.c | 4 +-
tools/testing/selftests/resctrl/mbm_test.c | 7 +-
tools/testing/selftests/resctrl/resctrl.h | 17 ++-
.../testing/selftests/resctrl/resctrl_tests.c | 120 +++++++++---------
tools/testing/selftests/resctrl/resctrl_val.c | 10 +-
8 files changed, 120 insertions(+), 90 deletions(-)
--
2.30.2
This was prompted by the discussion about output directory support with
O=.
It seems sometimes we were pulling in system headers making testing
annoying and unreliable.
Willy:
I did not implement the '#ifdef va_start` guard that we discussed
before. In my understanding the latest agreement does not need it
anymore. Please let me know if this is incorrect.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v2:
- Adapt comment in nolibc.h
- <stdarg.h> -> "stdarg.h"
- Link to v1: https://lore.kernel.org/r/20230827-nolibc-nostdinc-v1-0-995d1811f1f3@weisss…
---
Thomas Weißschuh (2):
tools/nolibc: add stdarg.h header
selftests/nolibc: use -nostdinc for nolibc-test
tools/include/nolibc/Makefile | 1 +
tools/include/nolibc/nolibc.h | 4 ++--
tools/include/nolibc/stdarg.h | 16 ++++++++++++++++
tools/include/nolibc/stdio.h | 3 +--
tools/include/nolibc/sys.h | 2 +-
tools/testing/selftests/nolibc/Makefile | 2 +-
6 files changed, 22 insertions(+), 6 deletions(-)
---
base-commit: 556fb7131e03b0283672fb40f6dc2d151752aaa7
change-id: 20230827-nolibc-nostdinc-203908130d67
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi all:
The core frequency is subjected to the process variation in semiconductors.
Not all cores are able to reach the maximum frequency respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part. This means that a fraction of cores can reach
maximum frequency. To find the best process scheduling policy for a given
scenario, OS needs to know the core ordering informed by the platform through
highest performance capability register of the CPPC interface.
Earlier implementations of AMD Pstate Preferred Core only support a static
core ranking and targeted performance. Now it has the ability to dynamically
change the preferred core based on the workload and platform conditions and
accounting for thermals and aging.
AMD Pstate driver utilizes the functions and data structures provided by
the ITMT architecture to enable the scheduler to favor scheduling on cores
which can be get a higher frequency with lower voltage.
We call it AMD Pstate Preferrred Core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
AMD Pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.
AMD Pstate driver will provide an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler to make sure that OS is choosing the cores
with highest performance firstly for scheduling the process. When AMD Pstate
driver receives a message with the highest performance change, it will
update the core ranking.
Changes form V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate:
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V1->V2:
- acpi: cppc:
- - Add reference link.
- cpufreq:
- - Moidfy link error.
- cpufreq: amd-pstate:
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of Preferred Core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of Preferred Core.
Meng Li (7):
x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
acpi: cppc: Add get the highest performance cppc control
cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
cpufreq: Add a notification message that the highest perf has changed
cpufreq: amd-pstate: Update AMD Pstate Preferred Core ranking
dynamically
Documentation: amd-pstate: introduce AMD Pstate Preferred Core
Documentation: introduce AMD Pstate Preferrd Core mode kernel command
line options
.../admin-guide/kernel-parameters.txt | 5 +
Documentation/admin-guide/pm/amd-pstate.rst | 54 +++++++
arch/x86/Kconfig | 5 +-
drivers/acpi/cppc_acpi.c | 13 ++
drivers/acpi/processor_driver.c | 6 +
drivers/cpufreq/amd-pstate.c | 152 ++++++++++++++++--
drivers/cpufreq/cpufreq.c | 13 ++
include/acpi/cppc_acpi.h | 5 +
include/linux/amd-pstate.h | 1 +
include/linux/cpufreq.h | 4 +
10 files changed, 240 insertions(+), 18 deletions(-)
--
2.34.1
All packets in the same flow (L3/L4 depending on multipath hash policy)
should be directed to the same target, but after [0]/[1] we see stray
packets directed towards other targets. This, for instance, causes RST
to be sent on TCP connections.
The first two patches solve the problem by ignoring route hints for
destinations that are part of multipath group, by using new SKB flags
for IPv4 and IPv6. The third patch is a selftest that tests the
scenario.
Thanks to Ido, for reviewing and suggesting a way forward in [2] and
also suggesting how to write a selftest for this.
v2->v3:
- Add NULL check for skb in fib6_select_path (Ido Schimmel)
- Use fib_tests.sh for selftest instead of the forwarding suite (Ido
Schimmel)
v1->v2:
- Update to commit messages describing the solution (Ido Schimmel)
- Use perf stat to count fib table lookups in selftest (Ido Schimmel)
Sriram Yagnaraman (3):
ipv4: ignore dst hint for multipath routes
ipv6: ignore dst hint for multipath routes
selftests: fib_tests: Add multipath list receive tests
include/linux/ipv6.h | 1 +
include/net/ip.h | 1 +
net/ipv4/ip_input.c | 3 +-
net/ipv4/route.c | 1 +
net/ipv6/ip6_input.c | 3 +-
net/ipv6/route.c | 3 +
tools/testing/selftests/net/fib_tests.sh | 150 ++++++++++++++++++++++-
7 files changed, 159 insertions(+), 3 deletions(-)
--
2.34.1
Use "static_keys" (name of the test itself) consistently instead of mixing
"static_key" and "static_keys" at the beginning of the messages in the
test_static_keys script.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
This single-patch series fixes a minor inconsistency in the
'test_static_keys' script from the static_keys selftest. As a general
rule, the selftest name is provided at the beginning of every log
message.
Apply the selftest name for all log messages consequently.
---
tools/testing/selftests/static_keys/test_static_keys.sh | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/static_keys/test_static_keys.sh b/tools/testing/selftests/static_keys/test_static_keys.sh
index fc9f8cde7d42..3b0f17b81ac2 100755
--- a/tools/testing/selftests/static_keys/test_static_keys.sh
+++ b/tools/testing/selftests/static_keys/test_static_keys.sh
@@ -6,18 +6,18 @@
ksft_skip=4
if ! /sbin/modprobe -q -n test_static_key_base; then
- echo "static_key: module test_static_key_base is not found [SKIP]"
+ echo "static_keys: module test_static_key_base is not found [SKIP]"
exit $ksft_skip
fi
if ! /sbin/modprobe -q -n test_static_keys; then
- echo "static_key: module test_static_keys is not found [SKIP]"
+ echo "static_keys: module test_static_keys is not found [SKIP]"
exit $ksft_skip
fi
if /sbin/modprobe -q test_static_key_base; then
if /sbin/modprobe -q test_static_keys; then
- echo "static_key: ok"
+ echo "static_keys: ok"
/sbin/modprobe -q -r test_static_keys
/sbin/modprobe -q -r test_static_key_base
else
@@ -25,6 +25,6 @@ if /sbin/modprobe -q test_static_key_base; then
/sbin/modprobe -q -r test_static_key_base
fi
else
- echo "static_key: [FAIL]"
+ echo "static_keys: [FAIL]"
exit 1
fi
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230830-feature-static_keys_selftest_messages-d43d67db7974
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>