- Fix negative values seen while running lpar

- Fix lparstat error with mixed SMT state
Resolves: #2225135
This commit is contained in:
Than Ngo 2023-07-27 15:01:49 +02:00
parent 38490c794f
commit 56d2670ce4
4 changed files with 236 additions and 1 deletions

View File

@ -0,0 +1,84 @@
commit 73ba26c1240a25e7699449e82cfc09dad10fed80
Author: Sathvika Vasireddy <sv@linux.ibm.com>
Date: Fri Dec 9 15:26:46 2022 +0530
lparstat: Fix negative values seen while running lparstat with -E option
Negative values are seen while running lparstat with -E option.
This is because delta_purr value is less than delta_idle_purr.
Given that these values are read from different sources, a
small variation in the values is possible. So, in such cases,
round down delta_idle_purr to delta_purr.
Without this patch:
=====
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=240 mem=67033290112 kB cpus=0
ent=240.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
-0.03 100.02 3.93GHz[111%] 0.01 110.97
0.00 100.00 3.93GHz[111%] 0.01 110.99
-0.04 100.03 3.93GHz[111%] 0.01 110.98
0.06 99.95 3.93GHz[111%] 0.01 110.99
0.02 99.98 3.93GHz[111%] 0.01 110.99
=====
With this patch:
=====
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=240 mem=67033290112 kB cpus=0
ent=240.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
0.03 99.96 3.93GHz[111%] 0.01 110.98
0.00 100.00 3.93GHz[111%] 0.01 110.99
0.03 99.97 3.93GHz[111%] 0.01 110.99
0.00 100.00 3.93GHz[111%] 0.01 110.99
0.09 99.90 3.93GHz[111%] 0.01 110.99
=====
Reported-by: Shirisha Ganta <shirisha.ganta1@ibm.com>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
diff --git a/src/lparstat.c b/src/lparstat.c
index 31a4ee8..eebba1f 100644
--- a/src/lparstat.c
+++ b/src/lparstat.c
@@ -492,6 +492,15 @@ void get_cpu_util_purr(struct sysentry *unused_se, char *buf)
delta_purr = get_delta_value("purr");
delta_idle_purr = get_delta_value("idle_purr");
+ /*
+ * Given that these values are read from different
+ * sources (purr from lparcfg and idle_purr from sysfs),
+ * a small variation in the values is possible.
+ * In such cases, round down delta_idle_purr to delta_purr.
+ */
+ if (delta_idle_purr > delta_purr)
+ delta_idle_purr = delta_purr;
+
physc = (delta_purr - delta_idle_purr) / delta_tb;
physc *= 100.00;
@@ -507,6 +516,15 @@ void get_cpu_idle_purr(struct sysentry *unused_se, char *buf)
delta_purr = get_delta_value("purr");
delta_idle_purr = get_delta_value("idle_purr");
+ /*
+ * Given that these values are read from different
+ * sources (purr from lparcfg and idle_purr from sysfs),
+ * a small variation in the values is possible.
+ * In such cases, round down delta_idle_purr to delta_purr.
+ */
+ if (delta_idle_purr > delta_purr)
+ delta_idle_purr = delta_purr;
+
physc = (delta_purr - delta_idle_purr) / delta_tb;
idle = (delta_purr / delta_tb) - physc;
idle *= 100.00;

View File

@ -0,0 +1,46 @@
commit dee15756bcb287ccf39a904be07c90107b13844b
Author: Laurent Dufour <ldufour@linux.ibm.com>
Date: Wed May 3 10:50:15 2023 +0200
lparstat: Fix offline threads uninitialized entries
When some threads are offline, lparstat -E is failing like that:
$ ppc64_cpu --info # CPU 20 is offline
Core 0: 0* 1* 2* 3* 4* 5* 6* 7*
Core 1: 8* 9* 10* 11* 12* 13* 14* 15*
Core 2: 16* 17* 18* 19* 20 21* 22* 23*
Core 3: 24* 25* 26* 27* 28* 29* 30* 31*
Core 4: 32* 33* 34* 35* 36* 37* 38* 39*
Core 5: 40* 41* 42* 43* 44* 45* 46* 47*
$ lparstat -E
Failed to read /sys/devices/system/cpu/cpu0/spurr
The message is complaining about CPU0 but the real issue is that in
parse_sysfs_values() the test cpu_sysfs_fds[i].spurr >= 0 is valid even if
the entry has not been initialized (cpu_sysfs_fds is alloc cleared). So
if the number of threads online seen in assign_cpu_sysfs_fds is lower than
threads_in_system, the loop in parse_sysfs_values() will read uninitialized
entry, where .cpu=0.
To prevent that, unset entries in the cpu_sysfs_fds should have the spurr
fd set to -1.
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
diff --git a/src/lparstat.c b/src/lparstat.c
index a9e7bce..d2fdb3f 100644
--- a/src/lparstat.c
+++ b/src/lparstat.c
@@ -163,6 +163,10 @@ static int assign_cpu_sysfs_fds(int threads_in_system)
cpu_idx++;
}
+ /* Mark extra slots for offline threads unset, see parse_sysfs_values */
+ for (; cpu_idx < threads_in_system; cpu_idx++)
+ cpu_sysfs_fds[cpu_idx].spurr = -1;
+
return 0;
error:
fprintf(stderr, "Failed to open %s: %s\n",

View File

@ -0,0 +1,93 @@
commit b2672fa3d462217ccd057a2cd307af2448e78757
Author: Laurent Dufour <ldufour@linux.ibm.com>
Date: Wed May 3 10:50:14 2023 +0200
lparstat: report mixed SMT state
when SMT state is mixed like this one (CPU 4 is offline):
$ ppc64_cpu --info
Core 0: 0* 1* 2* 3* 4 5* 6* 7*
Core 1: 8* 9* 10* 11* 12* 13* 14* 15*
Core 2: 16* 17* 18* 19* 20* 21* 22* 23*
Core 3: 24* 25* 26* 27* 28* 29* 30* 31*
Core 4: 32* 33* 34* 35* 36* 37* 38* 39*
Core 5: 40* 41* 42* 43* 44* 45* 46* 47*
$ ppc64_cpu --smt
SMT=7: 0
SMT=8: 1-5
ppc64_cpu --smt is handling that nicely but lparstat failed reporting the
SMT state:
$ /usr/sbin/lparstat
Failed to get smt state
System Configuration
type=Dedicated mode=Capped smt=Capped lcpu=6 mem=65969728 kB cpus=0 ent=6.00
%user %sys %wait %idle physc %entc lbusy app vcsw phint
----- ----- ----- ----- ----- ----- ----- ----- ----- -----
0.02 0.01 0.00 99.97 3.41 56.83 0.02 0.00 4061778 156
Makes lparstat reporting "smt=mixed" in that case.
__do_smt is now returning 0 when the SMT state is mixed instead of -1 which
is also reported when an error is detected.
This doesn't change the call made by ppc64_cpu which is using
print_smt_state=true and so is expecting a returned value equal to 0 or -1.
With that patch applied, lparstat print that in the above case:
$lparstat
System Configuration
type=Dedicated mode=Capped smt=Mixed lcpu=6 mem=65969728 kB cpus=0 ent=6.00
%user %sys %wait %idle physc %entc lbusy app vcsw phint
----- ----- ----- ----- ----- ----- ----- ----- ----- -----
0.01 0.01 0.00 99.97 3.43 57.17 0.02 0.00 4105654 156
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
diff --git a/src/common/cpu_info_helpers.c b/src/common/cpu_info_helpers.c
index 925f220..c05d96d 100644
--- a/src/common/cpu_info_helpers.c
+++ b/src/common/cpu_info_helpers.c
@@ -245,7 +245,7 @@ int __do_smt(bool numeric, int cpus_in_system, int threads_per_cpu,
if (smt_state == 0)
smt_state = thread + 1;
else if (smt_state > 0)
- smt_state = -1; /* mix of SMT modes */
+ smt_state = 0; /* mix of SMT modes */
}
}
@@ -257,7 +257,7 @@ int __do_smt(bool numeric, int cpus_in_system, int threads_per_cpu,
printf("SMT=1\n");
else
printf("SMT is off\n");
- } else if (smt_state == -1) {
+ } else if (smt_state == 0) {
for (thread = 0; thread < threads_per_cpu; thread++) {
if (CPU_COUNT_S(cpu_state_size,
cpu_states[thread])) {
diff --git a/src/lparstat.c b/src/lparstat.c
index eebba1f..a9e7bce 100644
--- a/src/lparstat.c
+++ b/src/lparstat.c
@@ -884,13 +884,15 @@ void get_smt_mode(struct sysentry *se, char *buf)
}
smt_state = parse_smt_state();
- if (smt_state < 0) {
+ if (smt_state == -1) {
fprintf(stderr, "Failed to get smt state\n");
return;
}
if (smt_state == 1)
sprintf(buf, "Off");
+ else if (smt_state == 0)
+ sprintf(buf, "Mixed");
else
sprintf(buf, "%d", smt_state);
}

View File

@ -1,6 +1,6 @@
Name: powerpc-utils Name: powerpc-utils
Version: 1.3.10 Version: 1.3.10
Release: 5%{?dist} Release: 6%{?dist}
Summary: PERL-based scripts for maintaining and servicing PowerPC systems Summary: PERL-based scripts for maintaining and servicing PowerPC systems
Group: System Environment/Base Group: System Environment/Base
@ -16,6 +16,13 @@ Patch4: powerpc-utils-e1f1de-lmb_address_in_hexadecimal.patch
Patch5: powerpc-utils-fix_setting_primary_slave_across_reboots.patch Patch5: powerpc-utils-fix_setting_primary_slave_across_reboots.patch
Patch6: powerpc-utils-f4c2b0-fix_display_of_mode_for_dedicated_donating_partition.patch Patch6: powerpc-utils-f4c2b0-fix_display_of_mode_for_dedicated_donating_partition.patch
# lparstat: Fix-negative-values-seen-while-running-lpar
Patch10: powerpc-utils-1.3.10-lparstat-Fix-negative-values-seen-while-running-lpar.patch
# report-mixed-SMT-state
Patch11: powerpc-utils-1.3.10-lparstat-report-mixed-SMT-state.patch
# Fix-offline-threads-uninitialized-entries
Patch12: powerpc-utils-1.3.10-lparstat-Fix-offline-threads-uninitialized-entries.patch
ExclusiveArch: ppc %{power64} ExclusiveArch: ppc %{power64}
BuildRequires: gcc BuildRequires: gcc
@ -197,6 +204,11 @@ systemctl enable hcn-init.service >/dev/null 2>&1 || :
%{_mandir}/man8/lparnumascore.8* %{_mandir}/man8/lparnumascore.8*
%changelog %changelog
* Wed Jul 26 2023 Than Ngo <than@redhat.com> - 1.3.10-6
- Fix negative values seen while running lpar
- Fix lparstat error with mixed SMT state
Resolves: #2225135
* Sat Jun 17 2023 Than Ngo <than@redhat.com> - 1.3.10-5 * Sat Jun 17 2023 Than Ngo <than@redhat.com> - 1.3.10-5
- Resolves: #2207649, Add udev rule for the nx-gzip in to the core subpackage - Resolves: #2207649, Add udev rule for the nx-gzip in to the core subpackage