import glibc-2.28-164.el8

CentOS Sources 2021-10-06 11:09:51 -04:00 committed by Stepan Oksanichenko
parent ada533335c
commit 49f27b7834
40 changed files with 6407 additions and 4 deletions

@ -0,0 +1,20 @@
commit 0798b8ecc8da8667362496c1217d18635106c609
Author: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Date: Wed Apr 8 19:56:12 2020 -0700
ARC: Update syscall-names.list for ARC specific syscalls
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 314a653938..21a62a06f4 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -41,6 +41,9 @@ adjtimex
afs_syscall
alarm
alloc_hugepages
+arc_gettls
+arc_settls
+arc_usr_cmpxchg
arch_prctl
arm_fadvise64_64
arm_sync_file_range

@ -0,0 +1,26 @@
commit b67339d0bbc07911859ca8c488e1923441cd3c33
Author: Joseph Myers <joseph@codesourcery.com>
Date: Mon Jun 15 22:58:22 2020 +0000
Update syscall-names.list for Linux 5.7.
Linux 5.7 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 5.7.
Tested with build-many-glibcs.py.
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 21a62a06f4..15dec5b98f 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -21,8 +21,8 @@
# This file can list all potential system calls. The names are only
# used if the installed kernel headers also provide them.
-# The list of system calls is current as of Linux 5.6.
-kernel 5.6
+# The list of system calls is current as of Linux 5.7.
+kernel 5.7
FAST_atomic_update
FAST_cmpxchg

@ -0,0 +1,37 @@
commit 1cfb4715288845ebc55ad664421b48b32de9599c
Author: Joseph Myers <joseph@codesourcery.com>
Date: Fri Aug 7 14:38:43 2020 +0000
Update syscall lists for Linux 5.8.
Linux 5.8 has one new syscall, faccessat2. Update syscall-names.list
and regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.
Tested with build-many-glibcs.py.
Modified to only update syscall-names.list for RHEL 8.5.0.
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 15dec5b98f..a462318ecf 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -21,8 +21,8 @@
# This file can list all potential system calls. The names are only
# used if the installed kernel headers also provide them.
-# The list of system calls is current as of Linux 5.7.
-kernel 5.7
+# The list of system calls is current as of Linux 5.8.
+kernel 5.8
FAST_atomic_update
FAST_cmpxchg
@@ -105,6 +105,7 @@ execveat
exit
exit_group
faccessat
+faccessat2
fadvise64
fadvise64_64
fallocate

@ -0,0 +1,37 @@
commit dac8713629c8736a60aebec2f01657e46baa4c73
Author: Joseph Myers <joseph@codesourcery.com>
Date: Fri Oct 23 16:31:11 2020 +0000
Update syscall lists for Linux 5.9.
Linux 5.9 has one new syscall, close_range. Update syscall-names.list
and regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.
Tested with build-many-glibcs.py.
Modified to only update syscall-names.list for RHEL 8.5.0.
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index a462318ecf..2d42aaf803 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -21,8 +21,8 @@
# This file can list all potential system calls. The names are only
# used if the installed kernel headers also provide them.
-# The list of system calls is current as of Linux 5.8.
-kernel 5.8
+# The list of system calls is current as of Linux 5.9.
+kernel 5.9
FAST_atomic_update
FAST_cmpxchg
@@ -79,6 +79,7 @@ clone
clone2
clone3
close
+close_range
cmpxchg_badaddr
connect
copy_file_range

@ -0,0 +1,37 @@
commit bcf47eb0fba4c6278aadd6a377d6b7b3f673e17c
Author: Joseph Myers <joseph@codesourcery.com>
Date: Wed Dec 16 02:08:52 2020 +0000
Update syscall lists for Linux 5.10.
Linux 5.10 has one new syscall, process_madvise. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Modified to only update syscall-names.list for RHEL 8.5.0.
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 2d42aaf803..4bd42be2b9 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -21,8 +21,8 @@
# This file can list all potential system calls. The names are only
# used if the installed kernel headers also provide them.
-# The list of system calls is current as of Linux 5.9.
-kernel 5.9
+# The list of system calls is current as of Linux 5.10.
+kernel 5.10
FAST_atomic_update
FAST_cmpxchg
@@ -433,6 +433,7 @@ pread64
preadv
preadv2
prlimit64
+process_madvise
process_vm_readv
process_vm_writev
prof

@ -0,0 +1,42 @@
commit 2b778ceb4010c28d70de9b8eab20e8d88eed586b
Author: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat Jan 2 11:32:25 2021 -0800
Update copyright dates with scripts/update-copyrights
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Modified to only update copyright for syscall-names.list for RHEL 8.5.0. Also
update licenses link to use https.
diff -Nrup a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
--- a/sysdeps/unix/sysv/linux/syscall-names.list 2021-03-16 14:12:27.828571456 -0400
+++ b/sysdeps/unix/sysv/linux/syscall-names.list 2021-03-16 14:13:23.950145631 -0400
@@ -1,5 +1,5 @@
# List of all known Linux system calls.
-# Copyright (C) 2017-2018 Free Software Foundation, Inc.
+# Copyright (C) 2017-2021 Free Software Foundation, Inc.
# This file is part of the GNU C Library.
#
# The GNU C Library is free software; you can redistribute it and/or
@@ -14,7 +14,7 @@
#
# You should have received a copy of the GNU Lesser General Public
# License along with the GNU C Library; if not, see
-# <http://www.gnu.org/licenses/>.
+# <https://www.gnu.org/licenses/>.
# This file contains the list of system call names. It has to remain in
# alphabetical order. Lines which start with # are treated as comments.

@ -0,0 +1,37 @@
commit 83908b3a1ea51e3aa7ff422275940e56dbba989f
Author: Joseph Myers <joseph@codesourcery.com>
Date: Fri Feb 19 21:16:27 2021 +0000
Update syscall lists for Linux 5.11.
Linux 5.11 has one new syscall, epoll_pwait2. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
Modified to only update syscall-names.list for RHEL 8.5.0.
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 4df7eeab96..f6cb34089d 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -21,8 +21,8 @@
# This file can list all potential system calls. The names are only
# used if the installed kernel headers also provide them.
-# The list of system calls is current as of Linux 5.10.
-kernel 5.10
+# The list of system calls is current as of Linux 5.11.
+kernel 5.11
FAST_atomic_update
FAST_cmpxchg
@@ -95,6 +95,7 @@ epoll_create1
epoll_ctl
epoll_ctl_old
epoll_pwait
+epoll_pwait2
epoll_wait
epoll_wait_old
eventfd
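
As the header comment of syscall-names.list notes, a listed name is only used if the installed kernel headers also provide it. A minimal sketch of how a program might make the same check at compile time for the syscalls added in the patches above, assuming a Linux system with <sys/syscall.h>. The struct and function names are illustrative and are not part of glibc:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>

/* Hypothetical record: a syscall name and whether the installed
   headers define the matching SYS_* constant.  */
struct syscall_avail
{
  const char *name;
  int available;
};

/* Fill OUT (capacity N) with the syscalls added to syscall-names.list
   for Linux 5.8 through 5.11 and return how many entries were written.  */
static size_t
list_new_syscalls (struct syscall_avail *out, size_t n)
{
  assert (out != NULL);
  static const struct syscall_avail table[] = {
#ifdef SYS_faccessat2
    { "faccessat2", 1 },
#else
    { "faccessat2", 0 },
#endif
#ifdef SYS_close_range
    { "close_range", 1 },
#else
    { "close_range", 0 },
#endif
#ifdef SYS_process_madvise
    { "process_madvise", 1 },
#else
    { "process_madvise", 0 },
#endif
#ifdef SYS_epoll_pwait2
    { "epoll_pwait2", 1 },
#else
    { "epoll_pwait2", 0 },
#endif
  };
  size_t count = sizeof table / sizeof table[0];
  if (count > n)
    count = n;
  for (size_t i = 0; i < count; i++)
    out[i] = table[i];
  return count;
}
```

Which entries report 1 depends on the headers the program is built against, not on the running kernel; a runtime call can still fail with ENOSYS.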

@ -0,0 +1,266 @@
Conflicts in sysdeps/unix/sysv/linux/semctl.c were due to 64-bit time_t
and RHEL8 has a simpler implementation.
Conflicts in sysdeps/unix/sysv/linux/Makefile were due to the usual test
case conflicts.
commit 574500a108be1d2a6a0dc97a075c9e0a98371aba
Author: Dmitry V. Levin <ldv@altlinux.org>
Date: Tue Sep 29 14:10:20 2020 -0300
sysvipc: Fix SEM_STAT_ANY kernel argument pass [BZ #26637]
Handle SEM_STAT_ANY the same way as SEM_STAT so that the buffer argument
of SEM_STAT_ANY is properly passed to the kernel and back.
The regression testcase checks the Linux-specific SysV IPC semaphore
control extensions. For IPC_INFO/SEM_INFO it tries to match the values
against the tunable /proc values, and for SEM_STAT/SEM_STAT_ANY it
checks whether the created semaphore is within the global list returned
by the kernel.
Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
Linux v4.15).
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
# Conflicts:
# sysdeps/unix/sysv/linux/Makefile
# sysdeps/unix/sysv/linux/semctl.c
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index fb4ccd63ddec7eca..c6907796152eb09d 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -45,7 +45,8 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \
test-errno-linux tst-memfd_create tst-mlock2 tst-pkey \
- tst-rlimit-infinity tst-ofdlocks
+ tst-rlimit-infinity tst-ofdlocks \
+ tst-sysvsem-linux
tests-internal += tst-ofdlocks-compat
diff --git a/sysdeps/unix/sysv/linux/semctl.c b/sysdeps/unix/sysv/linux/semctl.c
index e2925447eba2ee94..bdf31ca7747fe5a4 100644
--- a/sysdeps/unix/sysv/linux/semctl.c
+++ b/sysdeps/unix/sysv/linux/semctl.c
@@ -51,6 +51,7 @@ __new_semctl (int semid, int semnum, int cmd, ...)
case IPC_STAT: /* arg.buf */
case IPC_SET:
case SEM_STAT:
+ case SEM_STAT_ANY:
case IPC_INFO: /* arg.__buf */
case SEM_INFO:
va_start (ap, cmd);
@@ -90,6 +91,7 @@ __old_semctl (int semid, int semnum, int cmd, ...)
case IPC_STAT: /* arg.buf */
case IPC_SET:
case SEM_STAT:
+ case SEM_STAT_ANY:
case IPC_INFO: /* arg.__buf */
case SEM_INFO:
va_start (ap, cmd);
diff --git a/sysdeps/unix/sysv/linux/tst-sysvsem-linux.c b/sysdeps/unix/sysv/linux/tst-sysvsem-linux.c
new file mode 100644
index 0000000000000000..45f19e2d37ed194a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-sysvsem-linux.c
@@ -0,0 +1,184 @@
+/* Basic tests for Linux SYSV semaphore extensions.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sys/ipc.h>
+#include <sys/sem.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <support/check.h>
+#include <support/temp_file.h>
+
+/* These are for the temporary file we generate. */
+static char *name;
+static int semid;
+
+static void
+remove_sem (void)
+{
+ /* Enforce semaphore removal in case of early test failure.
+ Ignore errors since the semaphore may already have been removed. */
+ semctl (semid, 0, IPC_RMID, 0);
+}
+
+static void
+do_prepare (int argc, char *argv[])
+{
+ TEST_VERIFY_EXIT (create_temp_file ("tst-sysvsem.", &name) != -1);
+}
+
+#define PREPARE do_prepare
+
+#define SEM_MODE 0644
+
+union semun
+{
+ int val;
+ struct semid_ds *buf;
+ unsigned short *array;
+ struct seminfo *__buf;
+};
+
+struct test_seminfo
+{
+ int semmsl;
+ int semmns;
+ int semopm;
+ int semmni;
+};
+
+/* Try to obtain some system-wide SysV semaphore information from /proc
+ to check against IPC_INFO/SEM_INFO. /proc only exposes the tunable
+ values of SEMMSL, SEMMNS, SEMOPM, and SEMMNI.
+
+ The kernel also returns constant values for SEMVMX, SEMMNU, SEMMAP, SEMUME,
+ and also SEMUSZ and SEMAEM (for IPC_INFO). They are not checked because
+ they might change across kernel releases. */
+
+static void
+read_sem_stat (struct test_seminfo *tseminfo)
+{
+ FILE *f = fopen ("/proc/sys/kernel/sem", "r");
+ if (f == NULL)
+ FAIL_UNSUPPORTED ("/proc is not mounted or /proc/sys/kernel/sem is not "
+ "available");
+
+ int r = fscanf (f, "%d %d %d %d",
+ &tseminfo->semmsl, &tseminfo->semmns, &tseminfo->semopm,
+ &tseminfo->semmni);
+ TEST_VERIFY_EXIT (r == 4);
+
+ fclose (f);
+}
+
+
+/* Check if the semaphore with IDX (index into the kernel's internal array)
+ matches the one with KEY. The CMD is either SEM_STAT or SEM_STAT_ANY. */
+
+static bool
+check_seminfo (int idx, key_t key, int cmd)
+{
+ struct semid_ds seminfo;
+ int sid = semctl (idx, 0, cmd, (union semun) { .buf = &seminfo });
+ /* Ignore unused array slot returned by the kernel or information from
+ unknown semaphores. */
+ if ((sid == -1 && errno == EINVAL) || sid != semid)
+ return false;
+
+ if (sid == -1)
+ FAIL_EXIT1 ("semctl with SEM_STAT failed (errno=%d)", errno);
+
+ TEST_COMPARE (seminfo.sem_perm.__key, key);
+ TEST_COMPARE (seminfo.sem_perm.mode, SEM_MODE);
+ TEST_COMPARE (seminfo.sem_nsems, 1);
+
+ return true;
+}
+
+static int
+do_test (void)
+{
+ atexit (remove_sem);
+
+ key_t key = ftok (name, 'G');
+ if (key == -1)
+ FAIL_EXIT1 ("ftok failed: %m");
+
+ semid = semget (key, 1, IPC_CREAT | IPC_EXCL | SEM_MODE);
+ if (semid == -1)
+ FAIL_EXIT1 ("semget failed: %m");
+
+ struct test_seminfo tipcinfo;
+ read_sem_stat (&tipcinfo);
+
+ int semidx;
+
+ {
+ struct seminfo ipcinfo;
+ semidx = semctl (semid, 0, IPC_INFO, (union semun) { .__buf = &ipcinfo });
+ if (semidx == -1)
+ FAIL_EXIT1 ("semctl with IPC_INFO failed: %m");
+
+ TEST_COMPARE (ipcinfo.semmsl, tipcinfo.semmsl);
+ TEST_COMPARE (ipcinfo.semmns, tipcinfo.semmns);
+ TEST_COMPARE (ipcinfo.semopm, tipcinfo.semopm);
+ TEST_COMPARE (ipcinfo.semmni, tipcinfo.semmni);
+ }
+
+ /* Same as before but with SEM_INFO. */
+ {
+ struct seminfo ipcinfo;
+ semidx = semctl (semid, 0, SEM_INFO, (union semun) { .__buf = &ipcinfo });
+ if (semidx == -1)
+ FAIL_EXIT1 ("semctl with SEM_INFO failed: %m");
+
+ TEST_COMPARE (ipcinfo.semmsl, tipcinfo.semmsl);
+ TEST_COMPARE (ipcinfo.semmns, tipcinfo.semmns);
+ TEST_COMPARE (ipcinfo.semopm, tipcinfo.semopm);
+ TEST_COMPARE (ipcinfo.semmni, tipcinfo.semmni);
+ }
+
+ /* We check if the created semaphore shows in the system-wide status. */
+ bool found = false;
+ for (int i = 0; i <= semidx; i++)
+ {
+ /* We can't tell apart if SEM_STAT_ANY is not supported (kernel older
+ than 4.17) or if the index used is invalid. So just check whether the
+ value returned from a valid call matches the created semaphore. */
+ check_seminfo (i, key, SEM_STAT_ANY);
+
+ if (check_seminfo (i, key, SEM_STAT))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found)
+ FAIL_EXIT1 ("semctl with SEM_STAT/SEM_STAT_ANY could not find the "
+ "created semaphore");
+
+ if (semctl (semid, 0, IPC_RMID, 0) == -1)
+ FAIL_EXIT1 ("semctl failed: %m");
+
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysvipc/test-sysvsem.c b/sysvipc/test-sysvsem.c
index a8e9bff000949ff8..d197772917a7579d 100644
--- a/sysvipc/test-sysvsem.c
+++ b/sysvipc/test-sysvsem.c
@@ -20,6 +20,7 @@
#include <stdlib.h>
#include <errno.h>
#include <string.h>
+#include <stdbool.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
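
The new test compares the IPC_INFO/SEM_INFO results against the four values in /proc/sys/kernel/sem (SEMMSL, SEMMNS, SEMOPM, SEMMNI, in that order). A minimal sketch of that parsing step on its own, operating on a string so it can be exercised without /proc. The names sem_limits and parse_sem_line are hypothetical and do not appear in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Mirrors the fields read by read_sem_stat in the test above.  */
struct sem_limits
{
  int semmsl;
  int semmns;
  int semopm;
  int semmni;
};

/* Parse one line in the /proc/sys/kernel/sem format into OUT.
   Return true only if all four values were read.  */
static bool
parse_sem_line (const char *line, struct sem_limits *out)
{
  assert (line != NULL && out != NULL);
  return sscanf (line, "%d %d %d %d",
                 &out->semmsl, &out->semmns, &out->semopm,
                 &out->semmni) == 4;
}
```

Returning a boolean instead of aborting lets a caller decide whether a short or malformed line means "unsupported" or "failed", which the test itself handles with TEST_VERIFY_EXIT.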

@ -0,0 +1,151 @@
Rewrite of the following commit but adjusted pre-64-bit time_t
conversion. We want to follow the same upstream behaviour and return
EINVAL for unknown commands rather than to attempt the command with an
argument of {0} which has likely never been tested upstream.
commit a16d2abd496bd974a88207d5599265aae5ae4880
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date: Tue Sep 29 14:29:48 2020 -0300
sysvipc: Return EINVAL for invalid semctl commands
It avoids regressions on possible future commands that might require
additional libc support. The downside is new commands added by newer
kernels will need further glibc support.
Checked on x86_64-linux-gnu and i686-linux-gnu (Linux v4.15 and v5.4).
diff --git a/sysdeps/unix/sysv/linux/semctl.c b/sysdeps/unix/sysv/linux/semctl.c
index bdf31ca7747fe5a4..03c56c69a5412c82 100644
--- a/sysdeps/unix/sysv/linux/semctl.c
+++ b/sysdeps/unix/sysv/linux/semctl.c
@@ -58,6 +58,15 @@ __new_semctl (int semid, int semnum, int cmd, ...)
arg = va_arg (ap, union semun);
va_end (ap);
break;
+ case IPC_RMID: /* arg ignored. */
+ case GETNCNT:
+ case GETPID:
+ case GETVAL:
+ case GETZCNT:
+ break;
+ default:
+ __set_errno (EINVAL);
+ return -1;
}
#ifdef __ASSUME_DIRECT_SYSVIPC_SYSCALLS
diff --git a/sysvipc/test-sysvipc.h b/sysvipc/test-sysvipc.h
new file mode 100644
index 0000000000000000..d7ed496511c10afb
--- /dev/null
+++ b/sysvipc/test-sysvipc.h
@@ -0,0 +1,85 @@
+/* Basic definition for Sysv IPC test functions.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#ifndef _TEST_SYSV_H
+#define _TEST_SYSV_H
+
+#include <sys/ipc.h>
+#include <sys/sem.h>
+#include <sys/msg.h>
+#include <sys/shm.h>
+#include <include/array_length.h>
+
+/* Return the first invalid command SysV IPC command from common shared
+ between message queue, shared memory, and semaphore. */
+static inline int
+first_common_invalid_cmd (void)
+{
+ const int common_cmds[] = {
+ IPC_RMID,
+ IPC_SET,
+ IPC_STAT,
+ IPC_INFO,
+ };
+
+ int invalid = 0;
+ for (int i = 0; i < array_length (common_cmds); i++)
+ {
+ if (invalid == common_cmds[i])
+ {
+ invalid++;
+ i = 0;
+ }
+ }
+
+ return invalid;
+}
+
+/* Return the first invalid command SysV IPC command for semaphore. */
+static inline int
+first_sem_invalid_cmd (void)
+{
+ const int sem_cmds[] = {
+ GETPID,
+ GETVAL,
+ GETALL,
+ GETNCNT,
+ GETZCNT,
+ SETVAL,
+ SETALL,
+ SEM_STAT,
+ SEM_INFO,
+#ifdef SEM_STAT_ANY
+ SEM_STAT_ANY,
+#endif
+ };
+
+ int invalid = first_common_invalid_cmd ();
+ for (int i = 0; i < array_length (sem_cmds); i++)
+ {
+ if (invalid == sem_cmds[i])
+ {
+ invalid++;
+ i = 0;
+ }
+ }
+
+ return invalid;
+}
+
+#endif /* _TEST_SYSV_H */
diff --git a/sysvipc/test-sysvsem.c b/sysvipc/test-sysvsem.c
index d197772917a7579d..43a1460ec2b9308f 100644
--- a/sysvipc/test-sysvsem.c
+++ b/sysvipc/test-sysvsem.c
@@ -25,6 +25,8 @@
#include <sys/ipc.h>
#include <sys/sem.h>
+#include <test-sysvipc.h>
+
#include <support/support.h>
#include <support/check.h>
#include <support/temp_file.h>
@@ -80,6 +82,9 @@ do_test (void)
FAIL_EXIT1 ("semget failed (errno=%d)", errno);
}
+ TEST_COMPARE (semctl (semid, 0, first_sem_invalid_cmd (), NULL), -1);
+ TEST_COMPARE (errno, EINVAL);
+
/* Get semaphore kernel information and do some sanity checks. */
struct semid_ds seminfo;
if (semctl (semid, 0, IPC_STAT, (union semun) { .buf = &seminfo }) == -1)
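
The helpers in test-sysvipc.h search for the smallest command number that is not in a list of known commands. Their loop sets i = 0 after a match, and the for-increment then resumes at index 1, so index 0 is not rechecked. That appears harmless for the lists in this header, because IPC_RMID (0 on Linux) is the first entry and is matched first. A sketch of the same search without that dependence on ordering, using hypothetical names that are not part of glibc:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if CMD appears in CMDS[0..N-1].  */
static bool
cmd_is_listed (const int *cmds, size_t n, int cmd)
{
  for (size_t i = 0; i < n; i++)
    if (cmds[i] == cmd)
      return true;
  return false;
}

/* Return the smallest value >= START that is not listed in CMDS.
   Every candidate is checked against the whole list, so the result
   does not depend on the order of the entries.  */
static int
first_unlisted_cmd (const int *cmds, size_t n, int start)
{
  assert (cmds != NULL || n == 0);
  int candidate = start;
  while (cmd_is_listed (cmds, n, candidate))
    candidate++;
  return candidate;
}
```

A per-object helper could then start from the result for the common commands, in the same way first_sem_invalid_cmd starts from first_common_invalid_cmd.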

@ -0,0 +1,224 @@
Backport only the test case:
* sysdeps/unix/sysv/linux/tst-sysvmsg-linux.c
This improves coverage for IPC_INFO and MSG_INFO.
We don't need the actual fix in the bug because we don't have the 64-bit
time_t handling backported.
commit 20a00dbefca5695cccaa44846a482db8ccdd85ab
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date: Tue Sep 29 14:39:56 2020 -0300
sysvipc: Fix IPC_INFO and MSG_INFO handling [BZ #26639]
Both commands are Linux extensions where the third argument is a
'struct msginfo' instead of 'struct msqid_ds' and its information
does not contain any time-related fields (so there is no need for
extra conversion for __IPC_TIME64).
The regression testcase checks the Linux-specific SysV IPC message
control extensions. For IPC_INFO/MSG_INFO it tries to match the values
against the tunable /proc values, and for MSG_STAT/MSG_STAT_ANY it
checks whether the created message queue is within the global list returned
by the kernel.
Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
Linux v4.15).
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 7d04e3313c56c15d..688cf9fa9dea23a6 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -46,7 +46,7 @@ tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \
test-errno-linux tst-memfd_create tst-mlock2 tst-pkey \
tst-rlimit-infinity tst-ofdlocks \
- tst-sysvsem-linux
+ tst-sysvsem-linux tst-sysvmsg-linux
tests-internal += tst-ofdlocks-compat
diff --git a/sysdeps/unix/sysv/linux/tst-sysvmsg-linux.c b/sysdeps/unix/sysv/linux/tst-sysvmsg-linux.c
new file mode 100644
index 0000000000000000..1857fab8c1fdf041
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-sysvmsg-linux.c
@@ -0,0 +1,177 @@
+/* Basic tests for Linux SYSV message queue extensions.
+ Copyright (C) 2020-2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sys/ipc.h>
+#include <sys/msg.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <support/check.h>
+#include <support/temp_file.h>
+
+#define MSGQ_MODE 0644
+
+/* These are for the temporary file we generate. */
+static char *name;
+static int msqid;
+
+static void
+remove_msq (void)
+{
+ /* Enforce message queue removal in case of early test failure.
+ Ignore errors since the queue may already have been removed. */
+ msgctl (msqid, IPC_RMID, NULL);
+}
+
+static void
+do_prepare (int argc, char *argv[])
+{
+ TEST_VERIFY_EXIT (create_temp_file ("tst-sysvmsg.", &name) != -1);
+}
+
+#define PREPARE do_prepare
+
+struct test_msginfo
+{
+ int msgmax;
+ int msgmnb;
+ int msgmni;
+};
+
+/* Try to obtain some system-wide SysV message queue information from
+ /proc to check against IPC_INFO/MSG_INFO. /proc only exposes the
+ tunable values of MSGMAX, MSGMNB, and MSGMNI.
+
+ The kernel also returns constant values for MSGSSZ, MSGSEG and also MSGMAP,
+ MSGPOOL, and MSGTQL (for IPC_INFO). They are not checked because they
+ might change across kernel releases. */
+
+static int
+read_proc_file (const char *file)
+{
+ FILE *f = fopen (file, "r");
+ if (f == NULL)
+ FAIL_UNSUPPORTED ("/proc is not mounted or %s is not available", file);
+
+ int v;
+ int r = fscanf (f, "%d", & v);
+ TEST_VERIFY_EXIT (r == 1);
+
+ fclose (f);
+ return v;
+}
+
+
+/* Check if the message queue with IDX (index into the kernel's internal
+ array) matches the one with KEY. The CMD is either MSG_STAT or
+ MSG_STAT_ANY. */
+
+static bool
+check_msginfo (int idx, key_t key, int cmd)
+{
+ struct msqid_ds msginfo;
+ int mid = msgctl (idx, cmd, &msginfo);
+ /* Ignore unused array slot returned by the kernel or information from
+ unknown message queue. */
+ if ((mid == -1 && errno == EINVAL) || mid != msqid)
+ return false;
+
+ if (mid == -1)
+ FAIL_EXIT1 ("msgctl with %s failed: %m",
+ cmd == MSG_STAT ? "MSG_STAT" : "MSG_STAT_ANY");
+
+ TEST_COMPARE (msginfo.msg_perm.__key, key);
+ TEST_COMPARE (msginfo.msg_perm.mode, MSGQ_MODE);
+ TEST_COMPARE (msginfo.msg_qnum, 0);
+
+ return true;
+}
+
+static int
+do_test (void)
+{
+ atexit (remove_msq);
+
+ key_t key = ftok (name, 'G');
+ if (key == -1)
+ FAIL_EXIT1 ("ftok failed: %m");
+
+ msqid = msgget (key, MSGQ_MODE | IPC_CREAT);
+ if (msqid == -1)
+ FAIL_EXIT1 ("msgget failed: %m");
+
+ struct test_msginfo tipcinfo;
+ tipcinfo.msgmax = read_proc_file ("/proc/sys/kernel/msgmax");
+ tipcinfo.msgmnb = read_proc_file ("/proc/sys/kernel/msgmnb");
+ tipcinfo.msgmni = read_proc_file ("/proc/sys/kernel/msgmni");
+
+ int msqidx;
+
+ {
+ struct msginfo ipcinfo;
+ msqidx = msgctl (msqid, IPC_INFO, (struct msqid_ds *) &ipcinfo);
+ if (msqidx == -1)
+ FAIL_EXIT1 ("msgctl with IPC_INFO failed: %m");
+
+ TEST_COMPARE (ipcinfo.msgmax, tipcinfo.msgmax);
+ TEST_COMPARE (ipcinfo.msgmnb, tipcinfo.msgmnb);
+ TEST_COMPARE (ipcinfo.msgmni, tipcinfo.msgmni);
+ }
+
+ /* Same as before but with MSG_INFO. */
+ {
+ struct msginfo ipcinfo;
+ msqidx = msgctl (msqid, MSG_INFO, (struct msqid_ds *) &ipcinfo);
+ if (msqidx == -1)
+ FAIL_EXIT1 ("msgctl with MSG_INFO failed: %m");
+
+ TEST_COMPARE (ipcinfo.msgmax, tipcinfo.msgmax);
+ TEST_COMPARE (ipcinfo.msgmnb, tipcinfo.msgmnb);
+ TEST_COMPARE (ipcinfo.msgmni, tipcinfo.msgmni);
+ }
+
+ /* We check if the created message queue shows in global list. */
+ bool found = false;
+ for (int i = 0; i <= msqidx; i++)
+ {
+ /* We can't tell apart if MSG_STAT_ANY is not supported (kernel older
+ than 4.17) or if the index used is invalid. So just check whether the
+ value returned from a valid call matches the created message
+ queue. */
+ check_msginfo (i, key, MSG_STAT_ANY);
+
+ if (check_msginfo (i, key, MSG_STAT))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found)
+ FAIL_EXIT1 ("msgctl with MSG_STAT/MSG_STAT_ANY could not find the "
+ "created message queue");
+
+ if (msgctl (msqid, IPC_RMID, NULL) == -1)
+ FAIL_EXIT1 ("msgctl failed");
+
+ return 0;
+}
+
+#include <support/test-driver.c>

@ -0,0 +1,98 @@
This is a rewrite of the commit for the pre-64-bit time_t version of
the msgctl handling. Similar to semctl we want the RHEL8 handling of
the unknown commands to be the same as upstream.
commit be9b0b9a012780a403a266c90878efffb9a5f3ca
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date: Tue Sep 29 14:45:09 2020 -0300
sysvipc: Return EINVAL for invalid msgctl commands
It avoids regressions on possible future commands that might require
additional libc support. The downside is new commands added by newer
kernels will need further glibc support.
Checked on x86_64-linux-gnu and i686-linux-gnu (Linux v4.15 and v5.4).
diff --git a/sysdeps/unix/sysv/linux/msgctl.c b/sysdeps/unix/sysv/linux/msgctl.c
index 7280cba31a8815a2..6a2c79d188b875b9 100644
--- a/sysdeps/unix/sysv/linux/msgctl.c
+++ b/sysdeps/unix/sysv/linux/msgctl.c
@@ -29,6 +29,20 @@
int
__new_msgctl (int msqid, int cmd, struct msqid_ds *buf)
{
+ switch (cmd)
+ {
+ case IPC_RMID:
+ case IPC_SET:
+ case IPC_STAT:
+ case MSG_STAT:
+ case MSG_STAT_ANY:
+ case IPC_INFO:
+ case MSG_INFO:
+ break;
+ default:
+ __set_errno (EINVAL);
+ return -1;
+ }
#ifdef __ASSUME_DIRECT_SYSVIPC_SYSCALLS
return INLINE_SYSCALL_CALL (msgctl, msqid, cmd | __IPC_64, buf);
#else
diff --git a/sysvipc/test-sysvipc.h b/sysvipc/test-sysvipc.h
index ed0057b7871e505c..133fb71c6113a2b5 100644
--- a/sysvipc/test-sysvipc.h
+++ b/sysvipc/test-sysvipc.h
@@ -134,4 +134,29 @@ first_shm_invalid_cmd (void)
return invalid;
}
+/* Return the first invalid command SysV IPC command for message queue. */
+static inline int
+first_msg_invalid_cmd (void)
+{
+ const int msg_cmds[] = {
+ MSG_STAT,
+ MSG_INFO,
+#ifdef MSG_STAT_ANY
+ MSG_STAT_ANY,
+#endif
+ };
+
+ int invalid = first_common_invalid_cmd ();
+ for (int i = 0; i < array_length (msg_cmds); i++)
+ {
+ if (invalid == msg_cmds[i])
+ {
+ invalid++;
+ i = 0;
+ }
+ }
+
+ return invalid;
+}
+
#endif /* _TEST_SYSV_H */
diff --git a/sysvipc/test-sysvmsg.c b/sysvipc/test-sysvmsg.c
index 1e0471807cd26da1..74a907ad39ee114e 100644
--- a/sysvipc/test-sysvmsg.c
+++ b/sysvipc/test-sysvmsg.c
@@ -24,6 +24,8 @@
#include <sys/ipc.h>
#include <sys/msg.h>
+#include <test-sysvipc.h>
+
#include <support/support.h>
#include <support/check.h>
#include <support/temp_file.h>
@@ -86,6 +88,9 @@ do_test (void)
FAIL_EXIT1 ("msgget failed (errno=%d)", errno);
}
+ TEST_COMPARE (msgctl (msqid, first_msg_invalid_cmd (), NULL), -1);
+ TEST_COMPARE (errno, EINVAL);
+
/* Get message queue kernel information and do some sanity checks. */
struct msqid_ds msginfo;
if (msgctl (msqid, IPC_STAT, &msginfo) == -1)
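
The msgctl change above rejects any command outside a fixed list before the syscall is made. A user-space sketch of the same idea, assuming a Linux system. The names msgctl_cmd_is_known, checked_msgctl and demo_invalid_cmd are hypothetical, and the #ifdef guards are an assumption to cope with headers that hide the Linux extensions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Return 1 if CMD is one of the commands accepted by the patched
   __new_msgctl, 0 otherwise.  */
static int
msgctl_cmd_is_known (int cmd)
{
  switch (cmd)
    {
    case IPC_RMID:
    case IPC_SET:
    case IPC_STAT:
#ifdef IPC_INFO
    case IPC_INFO:
#endif
#ifdef MSG_STAT
    case MSG_STAT:
#endif
#ifdef MSG_INFO
    case MSG_INFO:
#endif
#ifdef MSG_STAT_ANY
    case MSG_STAT_ANY:
#endif
      return 1;
    default:
      return 0;
    }
}

/* Hypothetical wrapper: fail with EINVAL for unknown commands before
   the kernel is involved, otherwise forward to msgctl.  */
static int
checked_msgctl (int msqid, int cmd, struct msqid_ds *buf)
{
  if (!msgctl_cmd_is_known (cmd))
    {
      errno = EINVAL;
      return -1;
    }
  return msgctl (msqid, cmd, buf);
}

/* Usage sketch: an unknown command fails without reaching the kernel.  */
static void
demo_invalid_cmd (void)
{
  errno = 0;
  int r = checked_msgctl (-1, 0x7fffffff, NULL);
  assert (r == -1 && errno == EINVAL);
}
```

As the commit message says, the trade-off is that a command added by a newer kernel is rejected until the list is extended.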

@ -0,0 +1,128 @@
Rewrite of the following commit to support returning EINVAL for unknown
commands and therefore match upstream behaviour.
commit 9ebaabeaac1a96b0d91f52902ce1dbf4f5a562dd
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date: Tue Sep 29 14:55:02 2020 -0300
sysvipc: Return EINVAL for invalid shmctl commands
It avoids regressions on possible future commands that might require
additional libc support. The downside is new commands added by newer
kernels will need further glibc support.
Checked on x86_64-linux-gnu and i686-linux-gnu (Linux v4.15 and v5.4).
diff --git a/sysdeps/unix/sysv/linux/shmctl.c b/sysdeps/unix/sysv/linux/shmctl.c
index 25c5152944a6fcf3..00768bc47614f9aa 100644
--- a/sysdeps/unix/sysv/linux/shmctl.c
+++ b/sysdeps/unix/sysv/linux/shmctl.c
@@ -33,6 +33,22 @@
int
__new_shmctl (int shmid, int cmd, struct shmid_ds *buf)
{
+ switch (cmd)
+ {
+ case IPC_RMID:
+ case SHM_LOCK:
+ case SHM_UNLOCK:
+ case IPC_SET:
+ case IPC_STAT:
+ case SHM_STAT:
+ case SHM_STAT_ANY:
+ case IPC_INFO:
+ case SHM_INFO:
+ break;
+ default:
+ __set_errno (EINVAL);
+ return -1;
+ }
#ifdef __ASSUME_DIRECT_SYSVIPC_SYSCALLS
return INLINE_SYSCALL_CALL (shmctl, shmid, cmd | __IPC_64, buf);
#else
diff --git a/sysvipc/test-sysvipc.h b/sysvipc/test-sysvipc.h
index 21ef6c656581519e..d1c8349b45b5ce49 100644
--- a/sysvipc/test-sysvipc.h
+++ b/sysvipc/test-sysvipc.h
@@ -25,7 +25,7 @@
#include <sys/shm.h>
#include <include/array_length.h>
-/* Return the first invalid command SysV IPC command from common shared
+/* Return the first invalid SysV IPC command from common shared
between message queue, shared memory, and semaphore. */
static inline int
first_common_invalid_cmd (void)
@@ -50,7 +50,7 @@ first_common_invalid_cmd (void)
return invalid;
}
-/* Return the first invalid command SysV IPC command for semaphore. */
+/* Return the first invalid SysV IPC command for semaphore. */
static inline int
first_sem_invalid_cmd (void)
{
@@ -82,7 +82,7 @@ first_sem_invalid_cmd (void)
return invalid;
}
-/* Return the first invalid command SysV IPC command for message queue. */
+/* Return the first invalid SysV IPC command for message queue. */
static inline int
first_msg_invalid_cmd (void)
{
@@ -107,4 +107,31 @@ first_msg_invalid_cmd (void)
return invalid;
}
+/* Return the first invalid SysV IPC command for shared memory. */
+static inline int
+first_shm_invalid_cmd (void)
+{
+ const int shm_cmds[] = {
+ SHM_STAT,
+ SHM_INFO,
+#ifdef SHM_STAT_ANY
+ SHM_STAT_ANY,
+#endif
+ SHM_LOCK,
+ SHM_UNLOCK
+ };
+
+ int invalid = first_common_invalid_cmd ();
+ for (int i = 0; i < array_length (shm_cmds); i++)
+ {
+ if (invalid == shm_cmds[i])
+ {
+ invalid++;
+ i = 0;
+ }
+ }
+
+ return invalid;
+}
+
#endif /* _TEST_SYSV_H */
diff --git a/sysvipc/test-sysvshm.c b/sysvipc/test-sysvshm.c
index a7c2e0bd4065dbcd..0fdfddf8550413e4 100644
--- a/sysvipc/test-sysvshm.c
+++ b/sysvipc/test-sysvshm.c
@@ -25,6 +25,8 @@
#include <sys/ipc.h>
#include <sys/shm.h>
+#include <test-sysvipc.h>
+
#include <support/support.h>
#include <support/check.h>
#include <support/temp_file.h>
@@ -81,6 +83,9 @@ do_test (void)
FAIL_EXIT1 ("shmget failed (errno=%d)", errno);
}
+ TEST_COMPARE (shmctl (shmid, first_shm_invalid_cmd (), NULL), -1);
+ TEST_COMPARE (errno, EINVAL);
+
/* Get shared memory kernel information and do some sanity checks. */
struct shmid_ds shminfo;
if (shmctl (shmid, IPC_STAT, &shminfo) == -1)
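The validation added to __new_shmctl can be sketched outside glibc as a small user-space wrapper. This is only an illustration under stated assumptions: shm_cmd_is_known and checked_shmctl are hypothetical names, not glibc interfaces, and the wrapper sets errno directly where glibc uses __set_errno. It rejects an unknown command with EINVAL before any system call is made.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* Exposes SHM_INFO, SHM_STAT and friends.  */
#endif
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: true if CMD is one of the commands listed in the
   switch statement of the patch.  */
static bool
shm_cmd_is_known (int cmd)
{
  switch (cmd)
    {
    case IPC_RMID:
    case SHM_LOCK:
    case SHM_UNLOCK:
    case IPC_SET:
    case IPC_STAT:
    case SHM_STAT:
#ifdef SHM_STAT_ANY             /* Not present in older headers.  */
    case SHM_STAT_ANY:
#endif
    case IPC_INFO:
    case SHM_INFO:
      return true;
    default:
      return false;
    }
}

/* Hypothetical wrapper: fail early with EINVAL for unknown commands,
   otherwise forward to the real shmctl.  */
static int
checked_shmctl (int shmid, int cmd, struct shmid_ds *buf)
{
  if (!shm_cmd_is_known (cmd))
    {
      errno = EINVAL;
      return -1;
    }
  return shmctl (shmid, cmd, buf);
}
```

With this sketch, a check such as assert (checked_shmctl (0, -1, NULL) == -1 && errno == EINVAL) holds without the kernel ever seeing the request, which is the behaviour the new test case in test-sysvshm.c looks for.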

@ -0,0 +1,453 @@
From b9d83bf3eb57e1cf8ef785f1a58e13ddf162b6f3 Mon Sep 17 00:00:00 2001
From: Raphael M Zinsly <rzinsly@linux.ibm.com>
Date: Thu, 12 Nov 2020 13:12:24 -0300
Subject: powerpc: Add optimized strncpy for POWER9
Similar to the strcpy P9 optimization, this version uses VSX to improve
performance.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
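For readers who do not want to trace the assembly, the behaviour it has to preserve can be sketched in portable C. This is a reference sketch, not the POWER9 code: ref_strncpy and ref_strncpy_matches_libc are hypothetical names, and the only detail borrowed from the patch is the threshold of 255 bytes above which padding is handed to memset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Padding sizes at or above this go through memset; the value mirrors the
   threshold named in the patch's zero_padding comment.  */
#define PAD_MEMSET_THRESHOLD 255

/* Hypothetical reference version of strncpy, for illustration only.  */
static char *
ref_strncpy (char *dest, const char *src, size_t n)
{
  /* Length of the source string, capped at n.  */
  size_t len = 0;
  while (len < n && src[len] != '\0')
    len++;

  memcpy (dest, src, len);

  /* Zero-fill whatever remains of the n-byte destination.  */
  size_t pad = n - len;
  if (pad >= PAD_MEMSET_THRESHOLD)
    memset (dest + len, 0, pad);
  else
    for (size_t i = 0; i < pad; i++)
      dest[len + i] = '\0';

  return dest;
}

/* Compare ref_strncpy against the C library's strncpy for one input.  */
static bool
ref_strncpy_matches_libc (const char *src, size_t n)
{
  char expected[512];
  char actual[512];
  assert (n <= sizeof expected);

  /* Poison both buffers so a write past n bytes would show up.  */
  memset (expected, 0x55, sizeof expected);
  memset (actual, 0x55, sizeof actual);

  strncpy (expected, src, n);
  ref_strncpy (actual, src, n);
  return memcmp (expected, actual, sizeof expected) == 0;
}
```

The comparison helper is useful for the boundary cases the assembly treats specially: n equal to zero, n shorter than the string, n exactly the string length, and n large enough to take the memset path.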
diff --git a/sysdeps/powerpc/powerpc64/le/power9/strncpy.S b/sysdeps/powerpc/powerpc64/le/power9/strncpy.S
new file mode 100644
index 0000000000..cbfc37bda3
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power9/strncpy.S
@@ -0,0 +1,344 @@
+/* Optimized strncpy implementation for POWER9 LE.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+
+# ifndef STRNCPY
+# define FUNC_NAME strncpy
+# else
+# define FUNC_NAME STRNCPY
+# endif
+
+#ifndef MEMSET
+/* For builds without IFUNC support, local calls should be made to internal
+ GLIBC symbol (created by libc_hidden_builtin_def). */
+# ifdef SHARED
+# define MEMSET_is_local
+# define MEMSET __GI_memset
+# else
+# define MEMSET memset
+# endif
+#endif
+
+#define FRAMESIZE (FRAME_MIN_SIZE+8)
+
+/* Implements the function
+
+ char * [r3] strncpy (char *dest [r3], const char *src [r4], size_t n [r5])
+
+ The implementation can load bytes past a null terminator, but only
+ up to the next 16-byte aligned address, so it never crosses a page. */
+
+.machine power9
+#ifdef MEMSET_is_local
+ENTRY_TOCLESS (FUNC_NAME, 4)
+#else
+ENTRY (FUNC_NAME, 4)
+#endif
+ CALL_MCOUNT 2
+
+ /* NULL string optimizations */
+ cmpdi r5, 0
+ beqlr
+
+ lbz r0,0(r4)
+ stb r0,0(r3)
+ addi r11,r3,1
+ addi r5,r5,-1
+ vspltisb v18,0 /* Zeroes in v18 */
+ cmpdi r0,0
+ beq L(zero_padding)
+
+ /* Empty/1-byte string optimization */
+ cmpdi r5,0
+ beqlr
+
+ addi r4,r4,1
+ neg r7,r4
+ rldicl r9,r7,0,60 /* How many bytes to get source 16B aligned? */
+
+ /* Get source 16B aligned */
+ lvx v0,0,r4
+ lvsr v1,0,r4
+ vperm v0,v18,v0,v1
+
+ vcmpequb v6,v0,v18 /* 0xff if byte is NULL, 0x00 otherwise */
+ vctzlsbb r7,v6 /* Number of trailing zeroes */
+ addi r8,r7,1 /* Add null terminator */
+
+ /* r8 = bytes including null
+ r9 = bytes to get source 16B aligned
+ if r8 > r9
+ no null, copy r9 bytes
+ else
+ there is a null, copy r8 bytes and return. */
+ cmpld r8,r9
+ bgt L(no_null)
+
+ cmpld cr6,r8,r5 /* r8 <= n? */
+ ble cr6,L(null)
+
+ sldi r10,r5,56 /* stxvl wants size in top 8 bits */
+ stxvl 32+v0,r11,r10 /* Partial store */
+
+ blr
+
+L(null):
+ sldi r10,r8,56 /* stxvl wants size in top 8 bits */
+ stxvl 32+v0,r11,r10 /* Partial store */
+
+ add r11,r11,r8
+ sub r5,r5,r8
+ b L(zero_padding)
+
+L(no_null):
+ cmpld r9,r5 /* Check if length was reached. */
+ bge L(n_tail1)
+
+ sldi r10,r9,56 /* stxvl wants size in top 8 bits */
+ stxvl 32+v0,r11,r10 /* Partial store */
+
+ add r4,r4,r9
+ add r11,r11,r9
+ sub r5,r5,r9
+
+L(loop):
+ cmpldi cr6,r5,64 /* Check if length was reached. */
+ ble cr6,L(final_loop)
+
+ lxv 32+v0,0(r4)
+ vcmpequb. v6,v0,v18 /* Any zero bytes? */
+ bne cr6,L(prep_tail1)
+
+ lxv 32+v1,16(r4)
+ vcmpequb. v6,v1,v18 /* Any zero bytes? */
+ bne cr6,L(prep_tail2)
+
+ lxv 32+v2,32(r4)
+ vcmpequb. v6,v2,v18 /* Any zero bytes? */
+ bne cr6,L(prep_tail3)
+
+ lxv 32+v3,48(r4)
+ vcmpequb. v6,v3,v18 /* Any zero bytes? */
+ bne cr6,L(prep_tail4)
+
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ stxv 32+v2,32(r11)
+ stxv 32+v3,48(r11)
+
+ addi r4,r4,64
+ addi r11,r11,64
+ addi r5,r5,-64
+
+ b L(loop)
+
+L(final_loop):
+ cmpldi cr5,r5,16
+ lxv 32+v0,0(r4)
+ vcmpequb. v6,v0,v18 /* Any zero bytes? */
+ ble cr5,L(prep_n_tail1)
+ bne cr6,L(count_tail1)
+ addi r5,r5,-16
+
+ cmpldi cr5,r5,16
+ lxv 32+v1,16(r4)
+ vcmpequb. v6,v1,v18 /* Any zero bytes? */
+ ble cr5,L(prep_n_tail2)
+ bne cr6,L(count_tail2)
+ addi r5,r5,-16
+
+ cmpldi cr5,r5,16
+ lxv 32+v2,32(r4)
+ vcmpequb. v6,v2,v18 /* Any zero bytes? */
+ ble cr5,L(prep_n_tail3)
+ bne cr6,L(count_tail3)
+ addi r5,r5,-16
+
+ lxv 32+v3,48(r4)
+ vcmpequb. v6,v3,v18 /* Any zero bytes? */
+ beq cr6,L(n_tail4)
+
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+ cmpld r8,r5 /* r8 < n? */
+ blt L(tail4)
+
+L(n_tail4):
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ stxv 32+v2,32(r11)
+ sldi r10,r5,56 /* stxvl wants size in top 8 bits */
+ addi r11,r11,48 /* Offset */
+ stxvl 32+v3,r11,r10 /* Partial store */
+ blr
+
+L(prep_n_tail1):
+ beq cr6,L(n_tail1) /* Any zero bytes? */
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+ cmpld r8,r5 /* r8 < n? */
+ blt L(tail1)
+
+L(n_tail1):
+ sldi r10,r5,56 /* stxvl wants size in top 8 bits */
+ stxvl 32+v0,r11,r10 /* Partial store */
+ blr
+
+L(prep_n_tail2):
+ beq cr6,L(n_tail2) /* Any zero bytes? */
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+ cmpld r8,r5 /* r8 < n? */
+ blt L(tail2)
+
+L(n_tail2):
+ stxv 32+v0,0(r11)
+ sldi r10,r5,56 /* stxvl wants size in top 8 bits */
+ addi r11,r11,16 /* offset */
+ stxvl 32+v1,r11,r10 /* Partial store */
+ blr
+
+L(prep_n_tail3):
+ beq cr6,L(n_tail3) /* Any zero bytes? */
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+ cmpld r8,r5 /* r8 < n? */
+ blt L(tail3)
+
+L(n_tail3):
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ sldi r10,r5,56 /* stxvl wants size in top 8 bits */
+ addi r11,r11,32 /* Offset */
+ stxvl 32+v2,r11,r10 /* Partial store */
+ blr
+
+L(prep_tail1):
+L(count_tail1):
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+L(tail1):
+ addi r9,r8,1 /* Add null terminator */
+ sldi r10,r9,56 /* stxvl wants size in top 8 bits */
+ stxvl 32+v0,r11,r10 /* Partial store */
+ add r11,r11,r9
+ sub r5,r5,r9
+ b L(zero_padding)
+
+L(prep_tail2):
+ addi r5,r5,-16
+L(count_tail2):
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+L(tail2):
+ addi r9,r8,1 /* Add null terminator */
+ stxv 32+v0,0(r11)
+ sldi r10,r9,56 /* stxvl wants size in top 8 bits */
+ addi r11,r11,16 /* offset */
+ stxvl 32+v1,r11,r10 /* Partial store */
+ add r11,r11,r9
+ sub r5,r5,r9
+ b L(zero_padding)
+
+L(prep_tail3):
+ addi r5,r5,-32
+L(count_tail3):
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+L(tail3):
+ addi r9,r8,1 /* Add null terminator */
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ sldi r10,r9,56 /* stxvl wants size in top 8 bits */
+ addi r11,r11,32 /* offset */
+ stxvl 32+v2,r11,r10 /* Partial store */
+ add r11,r11,r9
+ sub r5,r5,r9
+ b L(zero_padding)
+
+L(prep_tail4):
+ addi r5,r5,-48
+ vctzlsbb r8,v6 /* Number of trailing zeroes */
+L(tail4):
+ addi r9,r8,1 /* Add null terminator */
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ stxv 32+v2,32(r11)
+ sldi r10,r9,56 /* stxvl wants size in top 8 bits */
+ addi r11,r11,48 /* offset */
+ stxvl 32+v3,r11,r10 /* Partial store */
+ add r11,r11,r9
+ sub r5,r5,r9
+
+/* This code pads the remainder of dest with NULL bytes. For large numbers
+ memset gives a better performance, 255 was chosen through experimentation.
+ */
+L(zero_padding):
+ cmpldi r5,255
+ bge L(zero_padding_memset)
+
+L(zero_padding_loop):
+ cmpldi cr6,r5,16 /* Check if length was reached. */
+ ble cr6,L(zero_padding_end)
+
+ stxv v18,0(r11)
+ addi r11,r11,16
+ addi r5,r5,-16
+
+ b L(zero_padding_loop)
+
+L(zero_padding_end):
+ sldi r10,r5,56 /* stxvl wants size in top 8 bits */
+ stxvl v18,r11,r10 /* Partial store */
+ blr
+
+ .align 4
+L(zero_padding_memset):
+ std r30,-8(r1) /* Save r30 on the stack. */
+ cfi_offset(r30, -8)
+ mr r30,r3 /* Save the return value of strncpy. */
+ /* Prepare the call to memset. */
+ mr r3,r11 /* Pointer to the area to be zero-filled. */
+ li r4,0 /* Byte to be written (zero). */
+
+ /* We delayed the creation of the stack frame, as well as the saving of
+ the link register, because only at this point, we are sure that
+ doing so is actually needed. */
+
+ /* Save the link register. */
+ mflr r0
+ std r0,16(r1)
+
+ /* Create the stack frame. */
+ stdu r1,-FRAMESIZE(r1)
+ cfi_adjust_cfa_offset(FRAMESIZE)
+ cfi_offset(lr, 16)
+
+ bl MEMSET
+#ifndef MEMSET_is_local
+ nop
+#endif
+
+ ld r0,FRAMESIZE+16(r1)
+
+ mr r3,r30 /* Restore the return value of strncpy, i.e.:
+ dest. */
+ ld r30,FRAMESIZE-8(r1) /* Restore r30. */
+ /* Restore the stack frame. */
+ addi r1,r1,FRAMESIZE
+ cfi_adjust_cfa_offset(-FRAMESIZE)
+ /* Restore the link register. */
+ mtlr r0
+ cfi_restore(lr)
+ blr
+
+END (FUNC_NAME)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 19acb6c64a..cd2b47b403 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -33,7 +33,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
ifneq (,$(filter %le,$(config-machine)))
sysdep_routines += strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
- rawmemchr-power9 strlen-power9
+ rawmemchr-power9 strlen-power9 strncpy-power9
endif
CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index dd54e7d6bb..135326c97a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -301,6 +301,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/strncpy.c. */
IFUNC_IMPL (i, name, strncpy,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, strncpy,
+ (hwcap2 & PPC_FEATURE2_ARCH_3_00)
+ && (hwcap & PPC_FEATURE_HAS_VSX),
+ __strncpy_power9)
+#endif
IFUNC_IMPL_ADD (array, i, strncpy,
hwcap2 & PPC_FEATURE2_ARCH_2_07,
__strncpy_power8)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/strncpy-power9.S b/sysdeps/powerpc/powerpc64/multiarch/strncpy-power9.S
new file mode 100644
index 0000000000..2b57c190f5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/strncpy-power9.S
@@ -0,0 +1,32 @@
+/* Optimized strncpy implementation for POWER9 LE.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#if defined __LITTLE_ENDIAN__ && IS_IN (libc)
+# define STRNCPY __strncpy_power9
+
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(name)
+
+/* memset is used to pad the end of the string. */
+# define MEMSET __memset_power8
+# ifdef SHARED
+# define MEMSET_is_local
+# endif
+
+# include <sysdeps/powerpc/powerpc64/le/power9/strncpy.S>
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/strncpy.c b/sysdeps/powerpc/powerpc64/multiarch/strncpy.c
index 7bacf28aca..af8b6cdd9c 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/strncpy.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/strncpy.c
@@ -28,11 +28,19 @@
extern __typeof (strncpy) __strncpy_ppc attribute_hidden;
extern __typeof (strncpy) __strncpy_power7 attribute_hidden;
extern __typeof (strncpy) __strncpy_power8 attribute_hidden;
+# ifdef __LITTLE_ENDIAN__
+extern __typeof (strncpy) __strncpy_power9 attribute_hidden;
+# endif
# undef strncpy
/* Avoid DWARF definition DIE on ifunc symbol so that GDB can handle
ifunc symbol properly. */
libc_ifunc_redirected (__redirect_strncpy, strncpy,
+# ifdef __LITTLE_ENDIAN__
+ (hwcap2 & PPC_FEATURE2_ARCH_3_00) &&
+ (hwcap & PPC_FEATURE_HAS_VSX)
+ ? __strncpy_power9 :
+# endif
(hwcap2 & PPC_FEATURE2_ARCH_2_07)
? __strncpy_power8
: (hwcap & PPC_FEATURE_HAS_VSX)
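The priority order encoded by the ifunc selector can be sketched as an ordinary function. The bit masks below are placeholders invented for this sketch (real code takes PPC_FEATURE* values from the system headers), the enum and function names are hypothetical, and the last two branches are an assumption based on the __strncpy_power7 and __strncpy_ppc declarations, since the hunk is cut off before them.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit masks for this sketch only.  */
#define DEMO_HAS_VSX    UINT64_C (0x1)
#define DEMO_ARCH_2_07  UINT64_C (0x2)
#define DEMO_ARCH_3_00  UINT64_C (0x4)

enum strncpy_variant
{
  VARIANT_PPC,
  VARIANT_POWER7,
  VARIANT_POWER8,
  VARIANT_POWER9
};

/* Hypothetical selector: the most specific implementation wins, and the
   POWER9 variant is only considered on little-endian builds.  */
static enum strncpy_variant
select_strncpy_variant (uint64_t hwcap, uint64_t hwcap2, int little_endian)
{
  if (little_endian
      && (hwcap2 & DEMO_ARCH_3_00) != 0
      && (hwcap & DEMO_HAS_VSX) != 0)
    return VARIANT_POWER9;
  if ((hwcap2 & DEMO_ARCH_2_07) != 0)
    return VARIANT_POWER8;
  if ((hwcap & DEMO_HAS_VSX) != 0)      /* Assumed POWER7 branch.  */
    return VARIANT_POWER7;
  return VARIANT_PPC;                   /* Assumed generic fallback.  */
}
```

Note that the POWER9 branch requires both the ISA 3.0 bit and VSX, so a system reporting ISA 3.0 without VSX falls through to the older variants.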

@ -0,0 +1,307 @@
From 7beee7b39adeda657f45989b0635033dae25a1fd Mon Sep 17 00:00:00 2001
From: Raphael M Zinsly <rzinsly@linux.ibm.com>
Date: Thu, 12 Nov 2020 13:12:24 -0300
Subject: powerpc: Add optimized stpncpy for POWER9
Add stpncpy support into the POWER9 strncpy.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
diff --git a/sysdeps/powerpc/powerpc64/le/power9/stpncpy.S b/sysdeps/powerpc/powerpc64/le/power9/stpncpy.S
new file mode 100644
index 0000000000..81d9673d8b
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power9/stpncpy.S
@@ -0,0 +1,24 @@
+/* Optimized stpncpy implementation for POWER9 LE.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#define USE_AS_STPNCPY
+#include <sysdeps/powerpc/powerpc64/le/power9/strncpy.S>
+
+weak_alias (__stpncpy, stpncpy)
+libc_hidden_def (__stpncpy)
+libc_hidden_builtin_def (stpncpy)
diff --git a/sysdeps/powerpc/powerpc64/le/power9/strncpy.S b/sysdeps/powerpc/powerpc64/le/power9/strncpy.S
index cbfc37bda3..b4ba428662 100644
--- a/sysdeps/powerpc/powerpc64/le/power9/strncpy.S
+++ b/sysdeps/powerpc/powerpc64/le/power9/strncpy.S
@@ -18,11 +18,19 @@
#include <sysdep.h>
+#ifdef USE_AS_STPNCPY
+# ifndef STPNCPY
+# define FUNC_NAME __stpncpy
+# else
+# define FUNC_NAME STPNCPY
+# endif
+#else
# ifndef STRNCPY
# define FUNC_NAME strncpy
# else
# define FUNC_NAME STRNCPY
# endif
+#endif /* !USE_AS_STPNCPY */
#ifndef MEMSET
/* For builds without IFUNC support, local calls should be made to internal
@@ -41,6 +49,12 @@
char * [r3] strncpy (char *dest [r3], const char *src [r4], size_t n [r5])
+ or
+
+ char * [r3] stpncpy (char *dest [r3], const char *src [r4], size_t n [r5])
+
+ if USE_AS_STPNCPY is defined.
+
The implementation can load bytes past a null terminator, but only
up to the next 16-byte aligned address, so it never crosses a page. */
@@ -66,7 +80,15 @@ ENTRY (FUNC_NAME, 4)
/* Empty/1-byte string optimization */
cmpdi r5,0
+#ifdef USE_AS_STPNCPY
+ bgt L(cont)
+ /* Compute pointer to last byte copied into dest. */
+ addi r3,r3,1
+ blr
+L(cont):
+#else
beqlr
+#endif
addi r4,r4,1
neg r7,r4
@@ -96,12 +118,20 @@ ENTRY (FUNC_NAME, 4)
sldi r10,r5,56 /* stxvl wants size in top 8 bits */
stxvl 32+v0,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r5
+#endif
blr
L(null):
sldi r10,r8,56 /* stxvl wants size in top 8 bits */
stxvl 32+v0,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r7
+#endif
add r11,r11,r8
sub r5,r5,r8
b L(zero_padding)
@@ -185,6 +215,10 @@ L(n_tail4):
sldi r10,r5,56 /* stxvl wants size in top 8 bits */
addi r11,r11,48 /* Offset */
stxvl 32+v3,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r5
+#endif
blr
L(prep_n_tail1):
@@ -196,6 +230,10 @@ L(prep_n_tail1):
L(n_tail1):
sldi r10,r5,56 /* stxvl wants size in top 8 bits */
stxvl 32+v0,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r5
+#endif
blr
L(prep_n_tail2):
@@ -209,6 +247,10 @@ L(n_tail2):
sldi r10,r5,56 /* stxvl wants size in top 8 bits */
addi r11,r11,16 /* offset */
stxvl 32+v1,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r5
+#endif
blr
L(prep_n_tail3):
@@ -223,6 +265,10 @@ L(n_tail3):
sldi r10,r5,56 /* stxvl wants size in top 8 bits */
addi r11,r11,32 /* Offset */
stxvl 32+v2,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r5
+#endif
blr
L(prep_tail1):
@@ -232,6 +278,10 @@ L(tail1):
addi r9,r8,1 /* Add null terminator */
sldi r10,r9,56 /* stxvl wants size in top 8 bits */
stxvl 32+v0,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r8
+#endif
add r11,r11,r9
sub r5,r5,r9
b L(zero_padding)
@@ -246,6 +296,10 @@ L(tail2):
sldi r10,r9,56 /* stxvl wants size in top 8 bits */
addi r11,r11,16 /* offset */
stxvl 32+v1,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r8
+#endif
add r11,r11,r9
sub r5,r5,r9
b L(zero_padding)
@@ -261,6 +315,10 @@ L(tail3):
sldi r10,r9,56 /* stxvl wants size in top 8 bits */
addi r11,r11,32 /* offset */
stxvl 32+v2,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r8
+#endif
add r11,r11,r9
sub r5,r5,r9
b L(zero_padding)
@@ -276,6 +334,10 @@ L(tail4):
sldi r10,r9,56 /* stxvl wants size in top 8 bits */
addi r11,r11,48 /* offset */
stxvl 32+v3,r11,r10 /* Partial store */
+#ifdef USE_AS_STPNCPY
+ /* Compute pointer to last byte copied into dest. */
+ add r3,r11,r8
+#endif
add r11,r11,r9
sub r5,r5,r9
@@ -331,7 +393,8 @@ L(zero_padding_memset):
ld r0,FRAMESIZE+16(r1)
mr r3,r30 /* Restore the return value of strncpy, i.e.:
- dest. */
+ dest. For stpncpy, the return value is the
+ same as return value of memset. */
ld r30,FRAMESIZE-8(r1) /* Restore r30. */
/* Restore the stack frame. */
addi r1,r1,FRAMESIZE
@@ -342,3 +405,6 @@ L(zero_padding_memset):
blr
END (FUNC_NAME)
+#ifndef USE_AS_STPNCPY
+libc_hidden_builtin_def (strncpy)
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index cd2b47b403..f46bf50732 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -33,7 +33,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
ifneq (,$(filter %le,$(config-machine)))
sysdep_routines += strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
- rawmemchr-power9 strlen-power9 strncpy-power9
+ rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9
endif
CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 135326c97a..8e19ebbf09 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -318,6 +318,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/stpncpy.c. */
IFUNC_IMPL (i, name, stpncpy,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, stpncpy,
+ (hwcap2 & PPC_FEATURE2_ARCH_3_00)
+ && (hwcap & PPC_FEATURE_HAS_VSX),
+ __stpncpy_power9)
+#endif
IFUNC_IMPL_ADD (array, i, stpncpy,
hwcap2 & PPC_FEATURE2_ARCH_2_07,
__stpncpy_power8)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/stpncpy-power9.S b/sysdeps/powerpc/powerpc64/multiarch/stpncpy-power9.S
new file mode 100644
index 0000000000..1188bd0894
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/stpncpy-power9.S
@@ -0,0 +1,29 @@
+/* Optimized stpncpy implementation for POWER9 LE.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#define STPNCPY __stpncpy_power9
+
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+#define MEMSET __memset_power8
+#ifdef SHARED
+# define MEMSET_is_local
+#endif
+
+#include <sysdeps/powerpc/powerpc64/le/power9/stpncpy.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/stpncpy.c b/sysdeps/powerpc/powerpc64/multiarch/stpncpy.c
index 17df886431..3758f29ad1 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/stpncpy.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/stpncpy.c
@@ -26,10 +26,18 @@
extern __typeof (__stpncpy) __stpncpy_ppc attribute_hidden;
extern __typeof (__stpncpy) __stpncpy_power7 attribute_hidden;
extern __typeof (__stpncpy) __stpncpy_power8 attribute_hidden;
+# ifdef __LITTLE_ENDIAN__
+extern __typeof (__stpncpy) __stpncpy_power9 attribute_hidden;
+# endif
# undef stpncpy
# undef __stpncpy
libc_ifunc_redirected (__redirect___stpncpy, __stpncpy,
+# ifdef __LITTLE_ENDIAN__
+ (hwcap2 & PPC_FEATURE2_ARCH_3_00) &&
+ (hwcap & PPC_FEATURE_HAS_VSX)
+ ? __stpncpy_power9 :
+# endif
(hwcap2 & PPC_FEATURE2_ARCH_2_07)
? __stpncpy_power8
: (hwcap & PPC_FEATURE_HAS_VSX)
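The only observable difference between the two entry points is the return value, which the USE_AS_STPNCPY blocks compute in r3. A portable sketch of that contract follows; ref_stpncpy and ref_stpncpy_end_offset are hypothetical names used for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference version of stpncpy: copy at most n bytes,
   zero-fill the rest, and return a pointer to the first NUL written to
   dest, or dest + n when the source did not fit.  */
static char *
ref_stpncpy (char *dest, const char *src, size_t n)
{
  size_t len = 0;
  while (len < n && src[len] != '\0')
    len++;

  memcpy (dest, src, len);
  memset (dest + len, 0, n - len);
  return dest + len;
}

/* Offset of the returned pointer from dest, using a small scratch
   buffer so callers only need to supply the inputs.  */
static size_t
ref_stpncpy_end_offset (const char *src, size_t n)
{
  char buf[64];
  assert (n <= sizeof buf);
  return (size_t) (ref_stpncpy (buf, src, n) - buf);
}
```

For example, copying "hello" with n = 3 yields an offset of 3 (no terminator fits), while n = 10 yields 5, the position of the first padding byte.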

@ -0,0 +1,30 @@
From 3322ecbfe29a16e74c4f584d661b0b8018bb4031 Mon Sep 17 00:00:00 2001
From: Raphael Moreira Zinsly <rzinsly@linux.ibm.com>
Date: Mon, 14 Sep 2020 11:59:24 -0300
Subject: [PATCH] powerpc: Protect dl_powerpc_cpu_features on INIT_ARCH() [BZ
#26615]
dl_powerpc_cpu_features also needs to be protected by __GLRO to check
for the _rtld_global_ro relocation before accessing it.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
---
sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
index 17ddfcf528..c8fa07fadc 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
@@ -38,7 +38,7 @@
unsigned long int hwcap = __GLRO(dl_hwcap); \
unsigned long int __attribute__((unused)) hwcap2 = __GLRO(dl_hwcap2); \
bool __attribute__((unused)) use_cached_memopt = \
- GLRO(dl_powerpc_cpu_features).use_cached_memopt; \
+ __GLRO(dl_powerpc_cpu_features.use_cached_memopt); \
if (hwcap & PPC_FEATURE_ARCH_2_06) \
hwcap |= PPC_FEATURE_ARCH_2_05 | \
PPC_FEATURE_POWER5_PLUS | \
--
2.27.0

@ -0,0 +1,39 @@
commit dca565886b5e8bd7966e15f0ca42ee5cff686673
Author: DJ Delorie <dj@redhat.com>
Date: Thu Feb 25 16:08:21 2021 -0500
nscd: Fix double free in netgroupcache [BZ #27462]
In commit 745664bd798ec8fd50438605948eea594179fba1 a use-after-free
was fixed, but this led to an occasional double-free. This patch
tracks the "live" allocation better.
Tested manually by a third party.
Related: RHBZ 1927877
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
diff --git a/nscd/netgroupcache.c b/nscd/netgroupcache.c
index f521df824102bbca..5ee4413ef9384ec9 100644
--- a/nscd/netgroupcache.c
+++ b/nscd/netgroupcache.c
@@ -248,7 +248,7 @@ addgetnetgrentX (struct database_dyn *db, int fd, request_header *req,
: NULL);
ndomain = (ndomain ? newbuf + ndomaindiff
: NULL);
- buffer = newbuf;
+ *tofreep = buffer = newbuf;
}
nhost = memcpy (buffer + bufused,
@@ -319,7 +319,7 @@ addgetnetgrentX (struct database_dyn *db, int fd, request_header *req,
else if (status == NSS_STATUS_TRYAGAIN && e == ERANGE)
{
buflen *= 2;
- buffer = xrealloc (buffer, buflen);
+ *tofreep = buffer = xrealloc (buffer, buflen);
}
else if (status == NSS_STATUS_RETURN
|| status == NSS_STATUS_NOTFOUND
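The pattern behind the fix is that the pointer the caller will eventually free must be updated every time the working buffer is reallocated. A minimal sketch, assuming a simplified worker that is unrelated to the real addgetnetgrentX (fill_buffer and demo_single_free are hypothetical names):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: grow a heap buffer by doubling until it can hold
   NEEDED bytes.  *TOFREEP always names the live allocation, so the
   caller can free it exactly once whatever path is taken.  */
static bool
fill_buffer (size_t needed, char **tofreep)
{
  size_t buflen = 16;
  char *buffer = malloc (buflen);
  if (buffer == NULL)
    return false;
  *tofreep = buffer;

  while (buflen < needed)
    {
      buflen *= 2;
      char *newbuf = realloc (buffer, buflen);
      if (newbuf == NULL)
        return false;           /* *tofreep still names the old block.  */
      /* Keep the caller's pointer in sync, as the patch does.  */
      *tofreep = buffer = newbuf;
    }

  memset (buffer, 'x', needed);
  return true;
}

/* Hypothetical caller: a single free of the tracked pointer.  */
static bool
demo_single_free (size_t needed)
{
  char *tofree = NULL;
  bool ok = fill_buffer (needed, &tofree);
  free (tofree);
  return ok;
}
```

If the assignment to *tofreep inside the loop were dropped, the caller would free a block that realloc may already have released, which is the class of error the commit message describes.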

@ -0,0 +1,25 @@
commit dc91a19e6f71e1523f4ac179191a29b2131d74bb
Author: Joseph Myers <joseph@codesourcery.com>
Date: Mon Jun 3 11:16:02 2019 +0000
Add INADDR_ALLSNOOPERS_GROUP from Linux 5.1 to netinet/in.h.
This patch adds INADDR_ALLSNOOPERS_GROUP from Linux 5.1 to
netinet/in.h.
Tested for x86_64.
* inet/netinet/in.h (INADDR_ALLSNOOPERS_GROUP): New macro.
diff --git a/inet/netinet/in.h b/inet/netinet/in.h
index 03a31b634c8bfbed..c2d12a04aab6c022 100644
--- a/inet/netinet/in.h
+++ b/inet/netinet/in.h
@@ -204,6 +204,7 @@ enum
#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */
#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */
#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */
+#define INADDR_ALLSNOOPERS_GROUP ((in_addr_t) 0xe000006a) /* 224.0.0.106 */
#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */
#if !__USE_KERNEL_IPV6_DEFS
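Like the neighbouring INADDR_* macros, the new constant is in host byte order, so it needs htonl before being stored in a struct in_addr. The sketch below formats it as a dotted quad; format_host_order_ipv4 and allsnoopers_is are hypothetical helpers, and the #ifndef fallback only exists so the example builds against headers that predate the macro.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Fallback for C libraries that predate the macro; value from the patch.  */
#ifndef INADDR_ALLSNOOPERS_GROUP
# define INADDR_ALLSNOOPERS_GROUP ((in_addr_t) 0xe000006a)
#endif

/* Hypothetical helper: format a host-byte-order IPv4 address as a dotted
   quad.  Returns true on success.  */
static bool
format_host_order_ipv4 (in_addr_t host_order, char *buf, socklen_t buflen)
{
  struct in_addr addr;
  addr.s_addr = htonl (host_order);     /* Network byte order.  */
  return inet_ntop (AF_INET, &addr, buf, buflen) != NULL;
}

/* Hypothetical check: does the macro format as DOTTED?  */
static bool
allsnoopers_is (const char *dotted)
{
  char buf[INET_ADDRSTRLEN];
  return format_host_order_ipv4 (INADDR_ALLSNOOPERS_GROUP, buf, sizeof buf)
         && strcmp (buf, dotted) == 0;
}
```

The value 0xe000006a breaks down as 0xe0 = 224, 0x00, 0x00, 0x6a = 106, matching the 224.0.0.106 comment in the header.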

@ -0,0 +1,28 @@
commit f9ac84f92f151e07586c55e14ed628d493a5929d
Author: Joseph Myers <joseph@codesourcery.com>
Date: Fri Apr 3 18:08:28 2020 +0000
Add IPPROTO_ETHERNET and IPPROTO_MPTCP from Linux 5.6 to netinet/in.h.
This patch adds the IPPROTO_ETHERNET and IPPROTO_MPTCP constants from
Linux 5.6 to glibc's netinet/in.h.
Tested for x86_64.
diff --git a/inet/netinet/in.h b/inet/netinet/in.h
index c2d12a04aab6c022..5880e909ff3e06fb 100644
--- a/inet/netinet/in.h
+++ b/inet/netinet/in.h
@@ -87,8 +87,12 @@ enum
#define IPPROTO_UDPLITE IPPROTO_UDPLITE
IPPROTO_MPLS = 137, /* MPLS in IP. */
#define IPPROTO_MPLS IPPROTO_MPLS
+ IPPROTO_ETHERNET = 143, /* Ethernet-within-IPv6 Encapsulation. */
+#define IPPROTO_ETHERNET IPPROTO_ETHERNET
IPPROTO_RAW = 255, /* Raw IP packets. */
#define IPPROTO_RAW IPPROTO_RAW
+ IPPROTO_MPTCP = 262, /* Multipath TCP connection. */
+#define IPPROTO_MPTCP IPPROTO_MPTCP
IPPROTO_MAX
};
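An application can use the new IPPROTO_MPTCP constant as the protocol argument of socket and fall back to plain TCP when the running kernel does not support it. This is a sketch under assumptions: open_stream_socket_prefer_mptcp is a hypothetical name, the fallback is taken on any failure rather than on specific errno values, and the #ifndef block is only there for headers that predate the patch.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Fallback for headers that predate the constant; value from the patch.  */
#ifndef IPPROTO_MPTCP
# define IPPROTO_MPTCP 262
#endif

/* Hypothetical helper: try to open an MPTCP stream socket and fall back
   to TCP if that fails for any reason.  Stores nonzero in *USED_MPTCP
   when the MPTCP attempt succeeded.  Returns a descriptor, or -1 if both
   attempts failed.  */
static int
open_stream_socket_prefer_mptcp (int *used_mptcp)
{
  int fd = socket (AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
  *used_mptcp = fd >= 0;
  if (fd < 0)
    fd = socket (AF_INET, SOCK_STREAM, IPPROTO_TCP);
  return fd;
}
```

The caller owns the returned descriptor and should close it when done. Because the value 262 lies above IPPROTO_RAW (255), it cannot be carried in the 8-bit protocol field of an IP header and is only meaningful as a socket argument.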

@ -0,0 +1,177 @@
This is a custom downstream RHEL 8 patch which rebuilds three
GLIBC_PRIVATE interfaces locally for use by libnss_files.so.2
and libnss_compat.so.2.
The shared objects need the following 3 functions:
__nss_readline
__nss_parse_line_result
__nss_files_fopen (only requirement for libnss_compat.so.2)
They are implemented in:
nss/nss_parse_line_result.c
nss/nss_readline.c
nss/nss_files_fopen.c
We create wrappers for those functions, recompile, and link directly
into the shared objects:
nss/nss_parse_line_result_int.c
nss/nss_readline_int.c
nss/nss_files_fopen_int.c
After building the new shared objects there are no longer any undefined
global function references to __nss_readline@GLIBC_PRIVATE,
__nss_parse_line_result@GLIBC_PRIVATE or
__nss_files_fopen@GLIBC_PRIVATE.
Instead we see local function definitions in the shared object e.g.
Symbol table '.symtab' contains 628 entries:
...
486: 0000000000008ce0 92 FUNC LOCAL DEFAULT 15 __nss_parse_line_result
...
494: 0000000000008b70 72 FUNC LOCAL DEFAULT 15 __nss_readline_seek
...
497: 0000000000008bc0 279 FUNC LOCAL DEFAULT 15 __nss_readline
...
510: 0000000000008ce0 82 FUNC LOCAL DEFAULT 15 __nss_files_fopen
The remaining GLIBC_PRIVATE references in the shared objects are all
pre-existing and do not impact upgrade scenarios.
For reference, the pre-existing GLIBC_PRIVATE interfaces that remain in use are:
__libc_alloc_buffer_alloc_array@@GLIBC_PRIVATE
__libc_alloc_buffer_copy_string@@GLIBC_PRIVATE
__libc_alloc_buffer_create_failure@@GLIBC_PRIVATE
__libc_dynarray_emplace_enlarge@@GLIBC_PRIVATE
__libc_scratch_buffer_grow@@GLIBC_PRIVATE
__resp@@GLIBC_PRIVATE
_nss_files_parse_grent@@GLIBC_PRIVATE
_nss_files_parse_pwent@@GLIBC_PRIVATE
_nss_files_parse_sgent@@GLIBC_PRIVATE
_nss_files_parse_spent@@GLIBC_PRIVATE
errno@@GLIBC_PRIVATE
__nss_database_lookup2@GLIBC_PRIVATE
__nss_lookup_function@GLIBC_PRIVATE
Each was checked for existence in libc.so.6.
A small reproducer was used in testing this patch, included here:
cat >> tst-rhbz1927040.c <<EOF
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <errno.h>
#include <pwd.h>
#include <string.h>
int
main (void)
{
struct passwd *res;
/* Only lookup via files. */
printf ("INFO: Upgrade glibc, then press ENTER to see if libnss_files.so.2 loads.");
getchar ();
/* Try to get one entry. */
printf ("INFO: Looking up first password entry.\n");
setpwent ();
errno = 0;
res = getpwent ();
if (res == NULL && errno != 0)
{
printf ("FAIL: Could not get entry (%s).\n", strerror (errno));
exit (1);
}
if (res == NULL)
printf ("INFO: Result was NULL.\n");
else
printf ("INFO: First entry passwd.pw_name = \"%s\"\n", res->pw_name);
printf ("PASS: Call to getpwent succeeded.\n");
endpwent ();
exit (0);
}
EOF
Testing RHEL upgrade
from: glibc-2.28-127.el8_3.2
to: glibc-2.28-148.el8
./tst-rhbz1927040
INFO: Upgrade glibc, then press ENTER to see if libnss_files.so.2 loads.
INFO: Looking up first password entry.
INFO: Result was NULL.
PASS: Call to getpwent succeeded.
With LD_DEBUG=all you can observe:
22697: /lib64/libnss_files.so.2: error: symbol lookup error: undefined symbol: __nss_files_fopen, version GLIBC_PRIVATE (fatal)
This indicates that the upgrade caused the transient IdM lookup failure.
Running again succeeds:
INFO: Upgrade glibc, then press ENTER to see if libnss_files.so.2 loads.
INFO: Looking up first password entry.
INFO: First entry passwd.pw_name = "root"
PASS: Call to getpwent succeeded.
diff --git a/nss/Makefile b/nss/Makefile
index 7359da38feb65618..d5c28a6b5ed3661c 100644
--- a/nss/Makefile
+++ b/nss/Makefile
@@ -92,9 +92,19 @@ extra-libs-others = $(extra-libs)
subdir-dirs = $(services:%=nss_%)
vpath %.c $(subdir-dirs) ../locale/programs ../intl
-
+# In RHEL we add nss_readline, nss_parse_line_result, and
+# nss_files_fopen to the libnss_files-routines in order to avoid the
+# case where a long-running process (having never used NSS) attempts to
+# load an NSS module for the first time and that NSS module needs a
+# newer GLIBC_PRIVATE interface. In effect we must make the NSS modules
+# self-sufficient and not rely on a GLIBC_PRIVATE interface.
+# See: https://bugzilla.redhat.com/show_bug.cgi?id=1927040
+# Note: We must recompile the objects to get the correct global symbol
+# references, which is why we have the *_int.c wrappers.
libnss_files-routines := $(addprefix files-,$(databases)) \
- files-initgroups files-init
+ files-initgroups files-init \
+ nss_readline_int nss_parse_line_result_int \
+ nss_files_fopen_int
libnss_db-dbs := $(addprefix db-,\
$(filter-out hosts network key alias,\
@@ -104,8 +114,10 @@ libnss_db-routines := $(libnss_db-dbs) db-open db-init hash-string
generated += $(filter-out db-alias.c db-netgrp.c, \
$(addsuffix .c,$(libnss_db-dbs)))
+# See note above regarding nss_files_fopen.
libnss_compat-routines := $(addprefix compat-,grp pwd spwd initgroups) \
- nisdomain
+ nisdomain \
+ nss_files_fopen_int
install-others += $(inst_vardbdir)/Makefile
diff --git a/nss/nss_files_fopen_int.c b/nss/nss_files_fopen_int.c
new file mode 100644
index 0000000000000000..fa518084fd609b52
--- /dev/null
+++ b/nss/nss_files_fopen_int.c
@@ -0,0 +1,3 @@
+/* Include a local internal copy of __nss_files_fopen to make the NSS
+ module self-contained. */
+#include <nss_files_fopen.c>
diff --git a/nss/nss_parse_line_result_int.c b/nss/nss_parse_line_result_int.c
new file mode 100644
index 0000000000000000..bc0ee7a251743c9a
--- /dev/null
+++ b/nss/nss_parse_line_result_int.c
@@ -0,0 +1,3 @@
+/* Include a local internal copy of __nss_parse_line_result to make the
+ NSS module self-contained. */
+#include <nss_parse_line_result.c>
diff --git a/nss/nss_readline_int.c b/nss/nss_readline_int.c
new file mode 100644
index 0000000000000000..0e7bd259733673c9
--- /dev/null
+++ b/nss/nss_readline_int.c
@@ -0,0 +1,3 @@
+/* Include a local internal copy of __nss_readline and
+ __nss_readline_seek to make the NSS module self-contained. */
+#include <nss_readline.c>

@ -0,0 +1,20 @@
support: Pass environ to child process
Pass environ to posix_spawn so that the child process can inherit the
environment of the test.
(cherry picked from commit e958490f8c74e660bd93c128b3bea746e268f3f6)
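The effect of the environment argument can be sketched outside the test harness. The following is a minimal illustration, not the support library code: spawn_shell_with_env and child_sees_variable are hypothetical helpers, DEMO_VAR is an invented variable name, and /bin/sh is assumed to exist. An explicit empty environment stands in for the old NULL argument.

```c
#include <spawn.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not part of glibc's support library): run
   "/bin/sh -c COMMAND" with environment ENVP and return the raw
   waitpid status, or -1 if spawning or waiting fails.  */
static int
spawn_shell_with_env (const char *command, char *const envp[])
{
  char *const argv[] = { (char *) "sh", (char *) "-c",
                         (char *) command, NULL };
  pid_t pid;
  if (posix_spawn (&pid, "/bin/sh", NULL, NULL, argv, envp) != 0)
    return -1;
  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return status;
}

/* Return 1 if a child spawned with ENVP sees DEMO_VAR=inherited,
   0 otherwise.  DEMO_VAR is an illustrative name only.  */
static int
child_sees_variable (char *const envp[])
{
  int status = spawn_shell_with_env ("test \"$DEMO_VAR\" = inherited", envp);
  return status != -1 && WIFEXITED (status) && WEXITSTATUS (status) == 0;
}
```

After setenv ("DEMO_VAR", "inherited", 1) in the parent, child_sees_variable (environ) should report that the child sees the variable, while a child started with an empty environment should not.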
diff --git a/support/support_subprocess.c b/support/support_subprocess.c
index 12c79ff6b0859877..4573350d775ac4c8 100644
--- a/support/support_subprocess.c
+++ b/support/support_subprocess.c
@@ -84,7 +84,7 @@ support_subprogram (const char *file, char *const argv[])
xposix_spawn_file_actions_addclose (&fa, result.stdout_pipe[1]);
xposix_spawn_file_actions_addclose (&fa, result.stderr_pipe[1]);
- result.pid = xposix_spawn (file, &fa, NULL, argv, NULL);
+ result.pid = xposix_spawn (file, &fa, NULL, argv, environ);
xclose (result.stdout_pipe[1]);
xclose (result.stderr_pipe[1]);

@ -0,0 +1,51 @@
support: Typo and formatting fixes
- Add a newline to the end of error messages in transfer().
- Fixed the name of support_subprocess_init().
(cherry picked from commit 95c68080a3ded882789b1629f872c3ad531efda0)
diff --git a/support/support_capture_subprocess.c b/support/support_capture_subprocess.c
index c13b3e59ece0842e..c475e2004da3183e 100644
--- a/support/support_capture_subprocess.c
+++ b/support/support_capture_subprocess.c
@@ -36,7 +36,7 @@ transfer (const char *what, struct pollfd *pfd, struct xmemstream *stream)
if (ret < 0)
{
support_record_failure ();
- printf ("error: reading from subprocess %s: %m", what);
+ printf ("error: reading from subprocess %s: %m\n", what);
pfd->events = 0;
pfd->revents = 0;
}
diff --git a/support/support_subprocess.c b/support/support_subprocess.c
index 4573350d775ac4c8..af01827cac81d80c 100644
--- a/support/support_subprocess.c
+++ b/support/support_subprocess.c
@@ -27,7 +27,7 @@
#include <support/subprocess.h>
static struct support_subprocess
-support_suprocess_init (void)
+support_subprocess_init (void)
{
struct support_subprocess result;
@@ -48,7 +48,7 @@ support_suprocess_init (void)
struct support_subprocess
support_subprocess (void (*callback) (void *), void *closure)
{
- struct support_subprocess result = support_suprocess_init ();
+ struct support_subprocess result = support_subprocess_init ();
result.pid = xfork ();
if (result.pid == 0)
@@ -71,7 +71,7 @@ support_subprocess (void (*callback) (void *), void *closure)
struct support_subprocess
support_subprogram (const char *file, char *const argv[])
{
- struct support_subprocess result = support_suprocess_init ();
+ struct support_subprocess result = support_subprocess_init ();
posix_spawn_file_actions_t fa;
/* posix_spawn_file_actions_init does not fail. */

@ -0,0 +1,447 @@
support: Add capability to fork an sgid child
Add a new function support_capture_subprogram_self_sgid that spawns an
sgid child of the running program with its own image and returns the
exit code of the child process. This functionality is used by at
least three tests in the testsuite at the moment, so it makes sense to
consolidate.
There is also a new function support_subprogram_wait which should
provide simple system()-like functionality that does not set up file
actions. This is useful in cases where only the return code of the
spawned subprocess is interesting.
This patch also ports tst-secure-getenv to this new function. A
subsequent patch will port other tests. This also brings an important
change to tst-secure-getenv behaviour. Now instead of succeeding, the
test fails as UNSUPPORTED if it is unable to spawn a setgid child,
which is how it should have been in the first place.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 716a3bdc41b2b4b864dc64475015ba51e35e1273)
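The GID selection that precedes the sgid spawn can be sketched on its own. This is a simplified illustration under stated assumptions: pick_other_gid and choose_supplementary_gid are hypothetical names, and unlike the patch, which reports UNSUPPORTED through the test framework, this sketch returns 0 when getgroups fails or no suitable group exists. It keeps the patch's convention of using 0 as the "not found" value.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the first entry of GROUPS[0..COUNT-1]
   that differs from CURRENT, or 0 if there is none.  */
static gid_t
pick_other_gid (const gid_t *groups, int count, gid_t current)
{
  for (int i = 0; i < count; ++i)
    if (groups[i] != current)
      return groups[i];
  return 0;
}

/* Hypothetical wrapper: query the supplementary group list of the
   calling process and pick a GID other than the current real GID.
   Returns 0 on failure or if no such group exists.  */
static gid_t
choose_supplementary_gid (void)
{
  gid_t groups[64];
  int count = getgroups (64, groups);
  if (count < 0)
    return 0;
  return pick_other_gid (groups, count, getgid ());
}
```

A caller would treat a zero result as "cannot run the sgid test here" and skip, which corresponds to the UNSUPPORTED outcome described above.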
diff --git a/stdlib/tst-secure-getenv.c b/stdlib/tst-secure-getenv.c
index a682b7493e41f200..156c92fea216729f 100644
--- a/stdlib/tst-secure-getenv.c
+++ b/stdlib/tst-secure-getenv.c
@@ -30,156 +30,12 @@
#include <sys/wait.h>
#include <unistd.h>
+#include <support/check.h>
#include <support/support.h>
+#include <support/capture_subprocess.h>
#include <support/test-driver.h>
static char MAGIC_ARGUMENT[] = "run-actual-test";
-#define MAGIC_STATUS 19
-
-/* Return a GID which is not our current GID, but is present in the
- supplementary group list. */
-static gid_t
-choose_gid (void)
-{
- const int count = 64;
- gid_t groups[count];
- int ret = getgroups (count, groups);
- if (ret < 0)
- {
- printf ("getgroups: %m\n");
- exit (1);
- }
- gid_t current = getgid ();
- for (int i = 0; i < ret; ++i)
- {
- if (groups[i] != current)
- return groups[i];
- }
- return 0;
-}
-
-
-/* Copies the executable into a restricted directory, so that we can
- safely make it SGID with the TARGET group ID. Then runs the
- executable. */
-static int
-run_executable_sgid (gid_t target)
-{
- char *dirname = xasprintf ("%s/secure-getenv.%jd",
- test_dir, (intmax_t) getpid ());
- char *execname = xasprintf ("%s/bin", dirname);
- int infd = -1;
- int outfd = -1;
- int ret = -1;
- if (mkdir (dirname, 0700) < 0)
- {
- printf ("mkdir: %m\n");
- goto err;
- }
- infd = open ("/proc/self/exe", O_RDONLY);
- if (infd < 0)
- {
- printf ("open (/proc/self/exe): %m\n");
- goto err;
- }
- outfd = open (execname, O_WRONLY | O_CREAT | O_EXCL, 0700);
- if (outfd < 0)
- {
- printf ("open (%s): %m\n", execname);
- goto err;
- }
- char buf[4096];
- for (;;)
- {
- ssize_t rdcount = read (infd, buf, sizeof (buf));
- if (rdcount < 0)
- {
- printf ("read: %m\n");
- goto err;
- }
- if (rdcount == 0)
- break;
- char *p = buf;
- char *end = buf + rdcount;
- while (p != end)
- {
- ssize_t wrcount = write (outfd, buf, end - p);
- if (wrcount == 0)
- errno = ENOSPC;
- if (wrcount <= 0)
- {
- printf ("write: %m\n");
- goto err;
- }
- p += wrcount;
- }
- }
- if (fchown (outfd, getuid (), target) < 0)
- {
- printf ("fchown (%s): %m\n", execname);
- goto err;
- }
- if (fchmod (outfd, 02750) < 0)
- {
- printf ("fchmod (%s): %m\n", execname);
- goto err;
- }
- if (close (outfd) < 0)
- {
- printf ("close (outfd): %m\n");
- goto err;
- }
- if (close (infd) < 0)
- {
- printf ("close (infd): %m\n");
- goto err;
- }
-
- int kid = fork ();
- if (kid < 0)
- {
- printf ("fork: %m\n");
- goto err;
- }
- if (kid == 0)
- {
- /* Child process. */
- char *args[] = { execname, MAGIC_ARGUMENT, NULL };
- execve (execname, args, environ);
- printf ("execve (%s): %m\n", execname);
- _exit (1);
- }
- int status;
- if (waitpid (kid, &status, 0) < 0)
- {
- printf ("waitpid: %m\n");
- goto err;
- }
- if (!WIFEXITED (status) || WEXITSTATUS (status) != MAGIC_STATUS)
- {
- printf ("Unexpected exit status %d from child process\n",
- status);
- goto err;
- }
- ret = 0;
-
-err:
- if (outfd >= 0)
- close (outfd);
- if (infd >= 0)
- close (infd);
- if (execname)
- {
- unlink (execname);
- free (execname);
- }
- if (dirname)
- {
- rmdir (dirname);
- free (dirname);
- }
- return ret;
-}
static int
do_test (void)
@@ -201,15 +57,15 @@ do_test (void)
exit (1);
}
- gid_t target = choose_gid ();
- if (target == 0)
- {
- fprintf (stderr,
- "Could not find a suitable GID for user %jd, skipping test\n",
- (intmax_t) getuid ());
- exit (0);
- }
- return run_executable_sgid (target);
+ int status = support_capture_subprogram_self_sgid (MAGIC_ARGUMENT);
+
+ if (WEXITSTATUS (status) == EXIT_UNSUPPORTED)
+ return EXIT_UNSUPPORTED;
+
+ if (!WIFEXITED (status))
+ FAIL_EXIT1 ("Unexpected exit status %d from child process\n", status);
+
+ return 0;
}
static void
@@ -218,23 +74,15 @@ alternative_main (int argc, char **argv)
if (argc == 2 && strcmp (argv[1], MAGIC_ARGUMENT) == 0)
{
if (getgid () == getegid ())
- {
- /* This can happen if the file system is mounted nosuid. */
- fprintf (stderr, "SGID failed: GID and EGID match (%jd)\n",
- (intmax_t) getgid ());
- exit (MAGIC_STATUS);
- }
+ /* This can happen if the file system is mounted nosuid. */
+ FAIL_UNSUPPORTED ("SGID failed: GID and EGID match (%jd)\n",
+ (intmax_t) getgid ());
if (getenv ("PATH") == NULL)
- {
- printf ("PATH variable not present\n");
- exit (3);
- }
+ FAIL_EXIT (3, "PATH variable not present\n");
if (secure_getenv ("PATH") != NULL)
- {
- printf ("PATH variable not filtered out\n");
- exit (4);
- }
- exit (MAGIC_STATUS);
+ FAIL_EXIT (4, "PATH variable not filtered out\n");
+
+ exit (EXIT_SUCCESS);
}
}
diff --git a/support/capture_subprocess.h b/support/capture_subprocess.h
index 2d2384e73df0d2d0..72fb30504684a84e 100644
--- a/support/capture_subprocess.h
+++ b/support/capture_subprocess.h
@@ -41,6 +41,12 @@ struct support_capture_subprocess support_capture_subprocess
struct support_capture_subprocess support_capture_subprogram
(const char *file, char *const argv[]);
+/* Copy the running program into a setgid binary and run it with CHILD_ID
+ argument. If execution is successful, return the exit status of the child
+ program, otherwise return a non-zero failure exit code. */
+int support_capture_subprogram_self_sgid
+ (char *child_id);
+
/* Deallocate the subprocess data captured by
support_capture_subprocess. */
void support_capture_subprocess_free (struct support_capture_subprocess *);
diff --git a/support/subprocess.h b/support/subprocess.h
index c031878d94c70c71..a19335ee5dbfcf98 100644
--- a/support/subprocess.h
+++ b/support/subprocess.h
@@ -38,6 +38,11 @@ struct support_subprocess support_subprocess
struct support_subprocess support_subprogram
(const char *file, char *const argv[]);
+/* Invoke program FILE with ARGV arguments by using posix_spawn and wait for it
+ to complete. Return program exit status. */
+int support_subprogram_wait
+ (const char *file, char *const argv[]);
+
/* Wait for the subprocess indicated by PROC::PID. Return the status
indicate by waitpid call. */
int support_process_wait (struct support_subprocess *proc);
diff --git a/support/support_capture_subprocess.c b/support/support_capture_subprocess.c
index c475e2004da3183e..eec5371d5602aa29 100644
--- a/support/support_capture_subprocess.c
+++ b/support/support_capture_subprocess.c
@@ -20,11 +20,14 @@
#include <support/capture_subprocess.h>
#include <errno.h>
+#include <fcntl.h>
#include <stdlib.h>
#include <support/check.h>
#include <support/xunistd.h>
#include <support/xsocket.h>
#include <support/xspawn.h>
+#include <support/support.h>
+#include <support/test-driver.h>
static void
transfer (const char *what, struct pollfd *pfd, struct xmemstream *stream)
@@ -101,6 +104,129 @@ support_capture_subprogram (const char *file, char *const argv[])
return result;
}
+/* Copies the executable into a restricted directory, so that we can
+ safely make it SGID with the TARGET group ID. Then runs the
+ executable. */
+static int
+copy_and_spawn_sgid (char *child_id, gid_t gid)
+{
+ char *dirname = xasprintf ("%s/tst-tunables-setuid.%jd",
+ test_dir, (intmax_t) getpid ());
+ char *execname = xasprintf ("%s/bin", dirname);
+ int infd = -1;
+ int outfd = -1;
+ int ret = 1, status = 1;
+
+ TEST_VERIFY (mkdir (dirname, 0700) == 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+
+ infd = open ("/proc/self/exe", O_RDONLY);
+ if (infd < 0)
+ FAIL_UNSUPPORTED ("unsupported: Cannot read binary from procfs\n");
+
+ outfd = open (execname, O_WRONLY | O_CREAT | O_EXCL, 0700);
+ TEST_VERIFY (outfd >= 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+
+ char buf[4096];
+ for (;;)
+ {
+ ssize_t rdcount = read (infd, buf, sizeof (buf));
+ TEST_VERIFY (rdcount >= 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+ if (rdcount == 0)
+ break;
+ char *p = buf;
+ char *end = buf + rdcount;
+ while (p != end)
+ {
+ ssize_t wrcount = write (outfd, p, end - p);
+ if (wrcount == 0)
+ errno = ENOSPC;
+ TEST_VERIFY (wrcount > 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+ p += wrcount;
+ }
+ }
+ TEST_VERIFY (fchown (outfd, getuid (), gid) == 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+ TEST_VERIFY (fchmod (outfd, 02750) == 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+ TEST_VERIFY (close (outfd) == 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+ TEST_VERIFY (close (infd) == 0);
+ if (support_record_failure_is_failed ())
+ goto err;
+
+ /* We have the binary, now spawn the subprocess. Avoid using
+ support_subprogram because we only want the program exit status, not the
+ contents. */
+ ret = 0;
+
+ char * const args[] = {execname, child_id, NULL};
+
+ status = support_subprogram_wait (args[0], args);
+
+err:
+ if (outfd >= 0)
+ close (outfd);
+ if (infd >= 0)
+ close (infd);
+ if (execname != NULL)
+ {
+ unlink (execname);
+ free (execname);
+ }
+ if (dirname != NULL)
+ {
+ rmdir (dirname);
+ free (dirname);
+ }
+
+ if (ret != 0)
+ FAIL_EXIT1("Failed to make sgid executable for test\n");
+
+ return status;
+}
+
+int
+support_capture_subprogram_self_sgid (char *child_id)
+{
+ gid_t target = 0;
+ const int count = 64;
+ gid_t groups[count];
+
+ /* Get a GID which is not our current GID, but is present in the
+ supplementary group list. */
+ int ret = getgroups (count, groups);
+ if (ret < 0)
+ FAIL_UNSUPPORTED("Could not get group list for user %jd\n",
+ (intmax_t) getuid ());
+
+ gid_t current = getgid ();
+ for (int i = 0; i < ret; ++i)
+ {
+ if (groups[i] != current)
+ {
+ target = groups[i];
+ break;
+ }
+ }
+
+ if (target == 0)
+ FAIL_UNSUPPORTED("Could not find a suitable GID for user %jd\n",
+ (intmax_t) getuid ());
+
+ return copy_and_spawn_sgid (child_id, target);
+}
+
void
support_capture_subprocess_free (struct support_capture_subprocess *p)
{
diff --git a/support/support_subprocess.c b/support/support_subprocess.c
index af01827cac81d80c..f7ee28af2531eda8 100644
--- a/support/support_subprocess.c
+++ b/support/support_subprocess.c
@@ -92,6 +92,19 @@ support_subprogram (const char *file, char *const argv[])
return result;
}
+int
+support_subprogram_wait (const char *file, char *const argv[])
+{
+ posix_spawn_file_actions_t fa;
+
+ posix_spawn_file_actions_init (&fa);
+ struct support_subprocess res = support_subprocess_init ();
+
+ res.pid = xposix_spawn (file, &fa, NULL, argv, environ);
+
+ return support_process_wait (&res);
+}
+
int
support_process_wait (struct support_subprocess *proc)
{

@ -0,0 +1,240 @@
tst-env-setuid: Use support_capture_subprogram_self_sgid
Use the support_capture_subprogram_self_sgid to spawn an sgid child.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit ca335281068a1ed549a75ee64f90a8310755956f)
diff --git a/elf/tst-env-setuid.c b/elf/tst-env-setuid.c
index 183a6dd133cfa16e..eda87f9dda293e79 100644
--- a/elf/tst-env-setuid.c
+++ b/elf/tst-env-setuid.c
@@ -29,173 +29,12 @@
#include <sys/wait.h>
#include <unistd.h>
+#include <support/check.h>
#include <support/support.h>
#include <support/test-driver.h>
+#include <support/capture_subprocess.h>
static char SETGID_CHILD[] = "setgid-child";
-#define CHILD_STATUS 42
-
-/* Return a GID which is not our current GID, but is present in the
- supplementary group list. */
-static gid_t
-choose_gid (void)
-{
- const int count = 64;
- gid_t groups[count];
- int ret = getgroups (count, groups);
- if (ret < 0)
- {
- printf ("getgroups: %m\n");
- exit (1);
- }
- gid_t current = getgid ();
- for (int i = 0; i < ret; ++i)
- {
- if (groups[i] != current)
- return groups[i];
- }
- return 0;
-}
-
-/* Spawn and execute a program and verify that it returns the CHILD_STATUS. */
-static pid_t
-do_execve (char **args)
-{
- pid_t kid = vfork ();
-
- if (kid < 0)
- {
- printf ("vfork: %m\n");
- return -1;
- }
-
- if (kid == 0)
- {
- /* Child process. */
- execve (args[0], args, environ);
- _exit (-errno);
- }
-
- if (kid < 0)
- return 1;
-
- int status;
-
- if (waitpid (kid, &status, 0) < 0)
- {
- printf ("waitpid: %m\n");
- return 1;
- }
-
- if (WEXITSTATUS (status) == EXIT_UNSUPPORTED)
- return EXIT_UNSUPPORTED;
-
- if (!WIFEXITED (status) || WEXITSTATUS (status) != CHILD_STATUS)
- {
- printf ("Unexpected exit status %d from child process\n",
- WEXITSTATUS (status));
- return 1;
- }
- return 0;
-}
-
-/* Copies the executable into a restricted directory, so that we can
- safely make it SGID with the TARGET group ID. Then runs the
- executable. */
-static int
-run_executable_sgid (gid_t target)
-{
- char *dirname = xasprintf ("%s/tst-tunables-setuid.%jd",
- test_dir, (intmax_t) getpid ());
- char *execname = xasprintf ("%s/bin", dirname);
- int infd = -1;
- int outfd = -1;
- int ret = 0;
- if (mkdir (dirname, 0700) < 0)
- {
- printf ("mkdir: %m\n");
- goto err;
- }
- infd = open ("/proc/self/exe", O_RDONLY);
- if (infd < 0)
- {
- printf ("open (/proc/self/exe): %m\n");
- goto err;
- }
- outfd = open (execname, O_WRONLY | O_CREAT | O_EXCL, 0700);
- if (outfd < 0)
- {
- printf ("open (%s): %m\n", execname);
- goto err;
- }
- char buf[4096];
- for (;;)
- {
- ssize_t rdcount = read (infd, buf, sizeof (buf));
- if (rdcount < 0)
- {
- printf ("read: %m\n");
- goto err;
- }
- if (rdcount == 0)
- break;
- char *p = buf;
- char *end = buf + rdcount;
- while (p != end)
- {
- ssize_t wrcount = write (outfd, buf, end - p);
- if (wrcount == 0)
- errno = ENOSPC;
- if (wrcount <= 0)
- {
- printf ("write: %m\n");
- goto err;
- }
- p += wrcount;
- }
- }
- if (fchown (outfd, getuid (), target) < 0)
- {
- printf ("fchown (%s): %m\n", execname);
- goto err;
- }
- if (fchmod (outfd, 02750) < 0)
- {
- printf ("fchmod (%s): %m\n", execname);
- goto err;
- }
- if (close (outfd) < 0)
- {
- printf ("close (outfd): %m\n");
- goto err;
- }
- if (close (infd) < 0)
- {
- printf ("close (infd): %m\n");
- goto err;
- }
-
- char *args[] = {execname, SETGID_CHILD, NULL};
-
- ret = do_execve (args);
-
-err:
- if (outfd >= 0)
- close (outfd);
- if (infd >= 0)
- close (infd);
- if (execname)
- {
- unlink (execname);
- free (execname);
- }
- if (dirname)
- {
- rmdir (dirname);
- free (dirname);
- }
- return ret;
-}
#ifndef test_child
static int
@@ -256,40 +95,32 @@ do_test (int argc, char **argv)
if (argc == 2 && strcmp (argv[1], SETGID_CHILD) == 0)
{
if (getgid () == getegid ())
- {
- /* This can happen if the file system is mounted nosuid. */
- fprintf (stderr, "SGID failed: GID and EGID match (%jd)\n",
- (intmax_t) getgid ());
- exit (EXIT_UNSUPPORTED);
- }
+ /* This can happen if the file system is mounted nosuid. */
+ FAIL_UNSUPPORTED ("SGID failed: GID and EGID match (%jd)\n",
+ (intmax_t) getgid ());
int ret = test_child ();
if (ret != 0)
exit (1);
- exit (CHILD_STATUS);
+ exit (EXIT_SUCCESS);
}
else
{
if (test_parent () != 0)
exit (1);
- /* Try running a setgid program. */
- gid_t target = choose_gid ();
- if (target == 0)
- {
- fprintf (stderr,
- "Could not find a suitable GID for user %jd, skipping test\n",
- (intmax_t) getuid ());
- exit (0);
- }
+ int status = support_capture_subprogram_self_sgid (SETGID_CHILD);
- return run_executable_sgid (target);
- }
+ if (WEXITSTATUS (status) == EXIT_UNSUPPORTED)
+ return EXIT_UNSUPPORTED;
+
+ if (!WIFEXITED (status))
+ FAIL_EXIT1 ("Unexpected exit status %d from child process\n", status);
- /* Something went wrong and our argv was corrupted. */
- _exit (1);
+ return 0;
+ }
}
#define TEST_FUNCTION_ARGV do_test

@ -0,0 +1,613 @@
Enhance setuid-tunables test
Instead of passing GLIBC_TUNABLES via the environment, pass the
environment variable from parent to child. This allows us to test
multiple variables to ensure better coverage.
The test list currently only includes the case that's already being
tested. More tests will be added later.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 061fe3f8add46a89b7453e87eabb9c4695005ced)
Also add intprops.h from 2.29 from commit 8e6fd2bdb21efe2cc1ae7571ff8fb2599db6a05a
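The parent passes only a test index to the child, and the child uses it to find the value it expects to see. A minimal sketch of that lookup follows. The name expected_for_arg is hypothetical, and the sketch parses with strtol and rejects malformed input, whereas the patch itself uses atoi.

```c
#include <stddef.h>
#include <stdlib.h>

/* Expected GLIBC_TUNABLES values in the child, indexed by test number.
   The single entry mirrors the one case listed in the patch.  */
static const char *const expected_results[] =
{
  "glibc.malloc.mmap_threshold=4096",
};

/* Hypothetical helper: map the child's argv[1] (a decimal index,
   optionally followed by a newline) to the expected value.  Returns
   NULL if ARG is not a valid index into expected_results.  */
static const char *
expected_for_arg (const char *arg)
{
  char *end;
  long idx = strtol (arg, &end, 10);
  if (end == arg || (*end != '\0' && *end != '\n'))
    return NULL;
  size_t count = sizeof (expected_results) / sizeof (expected_results[0]);
  if (idx < 0 || (size_t) idx >= count)
    return NULL;
  return expected_results[idx];
}
```

Accepting a trailing newline matters here because the parent formats the index with "%d\n" before handing it to the child.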
diff --git a/elf/Makefile b/elf/Makefile
index fc9c685b9d23bb6c..2093cefa7e73349e 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1597,8 +1597,6 @@ $(objpfx)tst-nodelete-dlclose.out: $(objpfx)tst-nodelete-dlclose-dso.so \
tst-env-setuid-ENV = MALLOC_CHECK_=2 MALLOC_MMAP_THRESHOLD_=4096 \
LD_HWCAP_MASK=0x1
-tst-env-setuid-tunables-ENV = \
- GLIBC_TUNABLES=glibc.malloc.check=2:glibc.malloc.mmap_threshold=4096
$(objpfx)tst-debug1: $(libdl)
$(objpfx)tst-debug1.out: $(objpfx)tst-debug1mod1.so
diff --git a/elf/tst-env-setuid-tunables.c b/elf/tst-env-setuid-tunables.c
index d7c4f0d5742cd526..a48281b175af6920 100644
--- a/elf/tst-env-setuid-tunables.c
+++ b/elf/tst-env-setuid-tunables.c
@@ -25,35 +25,50 @@
#include "config.h"
#undef _LIBC
-#define test_parent test_parent_tunables
-#define test_child test_child_tunables
-
-static int test_child_tunables (void);
-static int test_parent_tunables (void);
-
-#include "tst-env-setuid.c"
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/wait.h>
+#include <unistd.h>
+#include <intprops.h>
+#include <array_length.h>
+
+#include <support/check.h>
+#include <support/support.h>
+#include <support/test-driver.h>
+#include <support/capture_subprocess.h>
+
+const char *teststrings[] =
+{
+ "glibc.malloc.check=2:glibc.malloc.mmap_threshold=4096",
+};
-#define CHILD_VALSTRING_VALUE "glibc.malloc.mmap_threshold=4096"
-#define PARENT_VALSTRING_VALUE \
- "glibc.malloc.check=2:glibc.malloc.mmap_threshold=4096"
+const char *resultstrings[] =
+{
+ "glibc.malloc.mmap_threshold=4096",
+};
static int
-test_child_tunables (void)
+test_child (int off)
{
const char *val = getenv ("GLIBC_TUNABLES");
#if HAVE_TUNABLES
- if (val != NULL && strcmp (val, CHILD_VALSTRING_VALUE) == 0)
+ if (val != NULL && strcmp (val, resultstrings[off]) == 0)
return 0;
if (val != NULL)
- printf ("Unexpected GLIBC_TUNABLES VALUE %s\n", val);
+ printf ("[%d] Unexpected GLIBC_TUNABLES VALUE %s\n", off, val);
return 1;
#else
if (val != NULL)
{
- printf ("GLIBC_TUNABLES not cleared\n");
+ printf ("[%d] GLIBC_TUNABLES not cleared\n", off);
return 1;
}
return 0;
@@ -61,15 +76,48 @@ test_child_tunables (void)
}
static int
-test_parent_tunables (void)
+do_test (int argc, char **argv)
{
- const char *val = getenv ("GLIBC_TUNABLES");
+ /* Setgid child process. */
+ if (argc == 2)
+ {
+ if (getgid () == getegid ())
+ /* This can happen if the file system is mounted nosuid. */
+ FAIL_UNSUPPORTED ("SGID failed: GID and EGID match (%jd)\n",
+ (intmax_t) getgid ());
- if (val != NULL && strcmp (val, PARENT_VALSTRING_VALUE) == 0)
- return 0;
+ int ret = test_child (atoi (argv[1]));
- if (val != NULL)
- printf ("Unexpected GLIBC_TUNABLES VALUE %s\n", val);
+ if (ret != 0)
+ exit (1);
- return 1;
+ exit (EXIT_SUCCESS);
+ }
+ else
+ {
+ int ret = 0;
+
+ /* Spawn tests. */
+ for (int i = 0; i < array_length (teststrings); i++)
+ {
+ char buf[INT_BUFSIZE_BOUND (int)];
+
+ printf ("Spawned test for %s (%d)\n", teststrings[i], i);
+ snprintf (buf, sizeof (buf), "%d\n", i);
+ if (setenv ("GLIBC_TUNABLES", teststrings[i], 1) != 0)
+ exit (1);
+
+ int status = support_capture_subprogram_self_sgid (buf);
+
+ /* Bail out early if unsupported. */
+ if (WEXITSTATUS (status) == EXIT_UNSUPPORTED)
+ return EXIT_UNSUPPORTED;
+
+ ret |= status;
+ }
+ return ret;
+ }
}
+
+#define TEST_FUNCTION_ARGV do_test
+#include <support/test-driver.c>
diff --git a/include/intprops.h b/include/intprops.h
new file mode 100644
index 0000000000000000..9702aec4c6e3c80a
--- /dev/null
+++ b/include/intprops.h
@@ -0,0 +1,455 @@
+/* intprops.h -- properties of integer types
+
+ Copyright (C) 2001-2018 Free Software Foundation, Inc.
+
+ This program is free software: you can redistribute it and/or modify it
+ under the terms of the GNU Lesser General Public License as published
+ by the Free Software Foundation; either version 2.1 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>. */
+
+/* Written by Paul Eggert. */
+
+#ifndef _GL_INTPROPS_H
+#define _GL_INTPROPS_H
+
+#include <limits.h>
+
+/* Return a value with the common real type of E and V and the value of V.
+ Do not evaluate E. */
+#define _GL_INT_CONVERT(e, v) ((1 ? 0 : (e)) + (v))
+
+/* Act like _GL_INT_CONVERT (E, -V) but work around a bug in IRIX 6.5 cc; see
+ <https://lists.gnu.org/r/bug-gnulib/2011-05/msg00406.html>. */
+#define _GL_INT_NEGATE_CONVERT(e, v) ((1 ? 0 : (e)) - (v))
+
+/* The extra casts in the following macros work around compiler bugs,
+ e.g., in Cray C 5.0.3.0. */
+
+/* True if the arithmetic type T is an integer type. bool counts as
+ an integer. */
+#define TYPE_IS_INTEGER(t) ((t) 1.5 == 1)
+
+/* True if the real type T is signed. */
+#define TYPE_SIGNED(t) (! ((t) 0 < (t) -1))
+
+/* Return 1 if the real expression E, after promotion, has a
+ signed or floating type. Do not evaluate E. */
+#define EXPR_SIGNED(e) (_GL_INT_NEGATE_CONVERT (e, 1) < 0)
+
+
+/* Minimum and maximum values for integer types and expressions. */
+
+/* The width in bits of the integer type or expression T.
+ Do not evaluate T.
+ Padding bits are not supported; this is checked at compile-time below. */
+#define TYPE_WIDTH(t) (sizeof (t) * CHAR_BIT)
+
+/* The maximum and minimum values for the integer type T. */
+#define TYPE_MINIMUM(t) ((t) ~ TYPE_MAXIMUM (t))
+#define TYPE_MAXIMUM(t) \
+ ((t) (! TYPE_SIGNED (t) \
+ ? (t) -1 \
+ : ((((t) 1 << (TYPE_WIDTH (t) - 2)) - 1) * 2 + 1)))
+
+/* The maximum and minimum values for the type of the expression E,
+ after integer promotion. E is not evaluated. */
+#define _GL_INT_MINIMUM(e) \
+ (EXPR_SIGNED (e) \
+ ? ~ _GL_SIGNED_INT_MAXIMUM (e) \
+ : _GL_INT_CONVERT (e, 0))
+#define _GL_INT_MAXIMUM(e) \
+ (EXPR_SIGNED (e) \
+ ? _GL_SIGNED_INT_MAXIMUM (e) \
+ : _GL_INT_NEGATE_CONVERT (e, 1))
+#define _GL_SIGNED_INT_MAXIMUM(e) \
+ (((_GL_INT_CONVERT (e, 1) << (TYPE_WIDTH ((e) + 0) - 2)) - 1) * 2 + 1)
+
+/* Work around OpenVMS incompatibility with C99. */
+#if !defined LLONG_MAX && defined __INT64_MAX
+# define LLONG_MAX __INT64_MAX
+# define LLONG_MIN __INT64_MIN
+#endif
+
+/* This include file assumes that signed types are two's complement without
+ padding bits; the above macros have undefined behavior otherwise.
+ If this is a problem for you, please let us know how to fix it for your host.
+ This assumption is tested by the intprops-tests module. */
+
+/* Does the __typeof__ keyword work? This could be done by
+ 'configure', but for now it's easier to do it by hand. */
+#if (2 <= __GNUC__ \
+ || (1210 <= __IBMC__ && defined __IBM__TYPEOF__) \
+ || (0x5110 <= __SUNPRO_C && !__STDC__))
+# define _GL_HAVE___TYPEOF__ 1
+#else
+# define _GL_HAVE___TYPEOF__ 0
+#endif
+
+/* Return 1 if the integer type or expression T might be signed. Return 0
+ if it is definitely unsigned. This macro does not evaluate its argument,
+ and expands to an integer constant expression. */
+#if _GL_HAVE___TYPEOF__
+# define _GL_SIGNED_TYPE_OR_EXPR(t) TYPE_SIGNED (__typeof__ (t))
+#else
+# define _GL_SIGNED_TYPE_OR_EXPR(t) 1
+#endif
+
+/* Bound on length of the string representing an unsigned integer
+ value representable in B bits. log10 (2.0) < 146/485. The
+ smallest value of B where this bound is not tight is 2621. */
+#define INT_BITS_STRLEN_BOUND(b) (((b) * 146 + 484) / 485)
+
+/* Bound on length of the string representing an integer type or expression T.
+ Subtract 1 for the sign bit if T is signed, and then add 1 more for
+ a minus sign if needed.
+
+ Because _GL_SIGNED_TYPE_OR_EXPR sometimes returns 0 when its argument is
+ signed, this macro may overestimate the true bound by one byte when
+ applied to unsigned types of size 2, 4, 16, ... bytes. */
+#define INT_STRLEN_BOUND(t) \
+ (INT_BITS_STRLEN_BOUND (TYPE_WIDTH (t) - _GL_SIGNED_TYPE_OR_EXPR (t)) \
+ + _GL_SIGNED_TYPE_OR_EXPR (t))
+
+/* Bound on buffer size needed to represent an integer type or expression T,
+ including the terminating null. */
+#define INT_BUFSIZE_BOUND(t) (INT_STRLEN_BOUND (t) + 1)
+
+
+/* Range overflow checks.
+
+ The INT_<op>_RANGE_OVERFLOW macros return 1 if the corresponding C
+ operators might not yield numerically correct answers due to
+ arithmetic overflow. They do not rely on undefined or
+ implementation-defined behavior. Their implementations are simple
+ and straightforward, but they are a bit harder to use than the
+ INT_<op>_OVERFLOW macros described below.
+
+ Example usage:
+
+ long int i = ...;
+ long int j = ...;
+ if (INT_MULTIPLY_RANGE_OVERFLOW (i, j, LONG_MIN, LONG_MAX))
+ printf ("multiply would overflow");
+ else
+ printf ("product is %ld", i * j);
+
+ Restrictions on *_RANGE_OVERFLOW macros:
+
+ These macros do not check for all possible numerical problems or
+ undefined or unspecified behavior: they do not check for division
+ by zero, for bad shift counts, or for shifting negative numbers.
+
+ These macros may evaluate their arguments zero or multiple times,
+ so the arguments should not have side effects. The arithmetic
+ arguments (including the MIN and MAX arguments) must be of the same
+ integer type after the usual arithmetic conversions, and the type
+ must have minimum value MIN and maximum MAX. Unsigned types should
+ use a zero MIN of the proper type.
+
+ These macros are tuned for constant MIN and MAX. For commutative
+ operations such as A + B, they are also tuned for constant B. */
+
+/* Return 1 if A + B would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. */
+#define INT_ADD_RANGE_OVERFLOW(a, b, min, max) \
+ ((b) < 0 \
+ ? (a) < (min) - (b) \
+ : (max) - (b) < (a))
+
+/* Return 1 if A - B would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. */
+#define INT_SUBTRACT_RANGE_OVERFLOW(a, b, min, max) \
+ ((b) < 0 \
+ ? (max) + (b) < (a) \
+ : (a) < (min) + (b))
+
+/* Return 1 if - A would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. */
+#define INT_NEGATE_RANGE_OVERFLOW(a, min, max) \
+ ((min) < 0 \
+ ? (a) < - (max) \
+ : 0 < (a))
+
+/* Return 1 if A * B would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. Avoid && and || as they tickle
+ bugs in Sun C 5.11 2010/08/13 and other compilers; see
+ <https://lists.gnu.org/r/bug-gnulib/2011-05/msg00401.html>. */
+#define INT_MULTIPLY_RANGE_OVERFLOW(a, b, min, max) \
+ ((b) < 0 \
+ ? ((a) < 0 \
+ ? (a) < (max) / (b) \
+ : (b) == -1 \
+ ? 0 \
+ : (min) / (b) < (a)) \
+ : (b) == 0 \
+ ? 0 \
+ : ((a) < 0 \
+ ? (a) < (min) / (b) \
+ : (max) / (b) < (a)))
+
+/* Return 1 if A / B would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. Do not check for division by zero. */
+#define INT_DIVIDE_RANGE_OVERFLOW(a, b, min, max) \
+ ((min) < 0 && (b) == -1 && (a) < - (max))
+
+/* Return 1 if A % B would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. Do not check for division by zero.
+ Mathematically, % should never overflow, but on x86-like hosts
+ INT_MIN % -1 traps, and the C standard permits this, so treat this
+ as an overflow too. */
+#define INT_REMAINDER_RANGE_OVERFLOW(a, b, min, max) \
+ INT_DIVIDE_RANGE_OVERFLOW (a, b, min, max)
+
+/* Return 1 if A << B would overflow in [MIN,MAX] arithmetic.
+ See above for restrictions. Here, MIN and MAX are for A only, and B need
+ not be of the same type as the other arguments. The C standard says that
+ behavior is undefined for shifts unless 0 <= B < wordwidth, and that when
+ A is negative then A << B has undefined behavior and A >> B has
+ implementation-defined behavior, but do not check these other
+ restrictions. */
+#define INT_LEFT_SHIFT_RANGE_OVERFLOW(a, b, min, max) \
+ ((a) < 0 \
+ ? (a) < (min) >> (b) \
+ : (max) >> (b) < (a))
+
+/* True if __builtin_add_overflow (A, B, P) works when P is non-null. */
+#if 5 <= __GNUC__ && !defined __ICC
+# define _GL_HAS_BUILTIN_OVERFLOW 1
+#else
+# define _GL_HAS_BUILTIN_OVERFLOW 0
+#endif
+
+/* True if __builtin_add_overflow_p (A, B, C) works. */
+#define _GL_HAS_BUILTIN_OVERFLOW_P (7 <= __GNUC__)
+
+/* The _GL*_OVERFLOW macros have the same restrictions as the
+ *_RANGE_OVERFLOW macros, except that they do not assume that operands
+ (e.g., A and B) have the same type as MIN and MAX. Instead, they assume
+ that the result (e.g., A + B) has that type. */
+#if _GL_HAS_BUILTIN_OVERFLOW_P
+# define _GL_ADD_OVERFLOW(a, b, min, max) \
+ __builtin_add_overflow_p (a, b, (__typeof__ ((a) + (b))) 0)
+# define _GL_SUBTRACT_OVERFLOW(a, b, min, max) \
+ __builtin_sub_overflow_p (a, b, (__typeof__ ((a) - (b))) 0)
+# define _GL_MULTIPLY_OVERFLOW(a, b, min, max) \
+ __builtin_mul_overflow_p (a, b, (__typeof__ ((a) * (b))) 0)
+#else
+# define _GL_ADD_OVERFLOW(a, b, min, max) \
+ ((min) < 0 ? INT_ADD_RANGE_OVERFLOW (a, b, min, max) \
+ : (a) < 0 ? (b) <= (a) + (b) \
+ : (b) < 0 ? (a) <= (a) + (b) \
+ : (a) + (b) < (b))
+# define _GL_SUBTRACT_OVERFLOW(a, b, min, max) \
+ ((min) < 0 ? INT_SUBTRACT_RANGE_OVERFLOW (a, b, min, max) \
+ : (a) < 0 ? 1 \
+ : (b) < 0 ? (a) - (b) <= (a) \
+ : (a) < (b))
+# define _GL_MULTIPLY_OVERFLOW(a, b, min, max) \
+ (((min) == 0 && (((a) < 0 && 0 < (b)) || ((b) < 0 && 0 < (a)))) \
+ || INT_MULTIPLY_RANGE_OVERFLOW (a, b, min, max))
+#endif
+#define _GL_DIVIDE_OVERFLOW(a, b, min, max) \
+ ((min) < 0 ? (b) == _GL_INT_NEGATE_CONVERT (min, 1) && (a) < - (max) \
+ : (a) < 0 ? (b) <= (a) + (b) - 1 \
+ : (b) < 0 && (a) + (b) <= (a))
+#define _GL_REMAINDER_OVERFLOW(a, b, min, max) \
+ ((min) < 0 ? (b) == _GL_INT_NEGATE_CONVERT (min, 1) && (a) < - (max) \
+ : (a) < 0 ? (a) % (b) != ((max) - (b) + 1) % (b) \
+ : (b) < 0 && ! _GL_UNSIGNED_NEG_MULTIPLE (a, b, max))
+
+/* Return a nonzero value if A is a mathematical multiple of B, where
+ A is unsigned, B is negative, and MAX is the maximum value of A's
+ type. A's type must be the same as (A % B)'s type. Normally (A %
+ -B == 0) suffices, but things get tricky if -B would overflow. */
+#define _GL_UNSIGNED_NEG_MULTIPLE(a, b, max) \
+ (((b) < -_GL_SIGNED_INT_MAXIMUM (b) \
+ ? (_GL_SIGNED_INT_MAXIMUM (b) == (max) \
+ ? (a) \
+ : (a) % (_GL_INT_CONVERT (a, _GL_SIGNED_INT_MAXIMUM (b)) + 1)) \
+ : (a) % - (b)) \
+ == 0)
+
+/* Check for integer overflow, and report low order bits of answer.
+
+ The INT_<op>_OVERFLOW macros return 1 if the corresponding C operators
+ might not yield numerically correct answers due to arithmetic overflow.
+ The INT_<op>_WRAPV macros also store the low-order bits of the answer.
+ These macros work correctly on all known practical hosts, and do not rely
+ on undefined behavior due to signed arithmetic overflow.
+
+ Example usage, assuming A and B are long int:
+
+ if (INT_MULTIPLY_OVERFLOW (a, b))
+ printf ("result would overflow\n");
+ else
+ printf ("result is %ld (no overflow)\n", a * b);
+
+ Example usage with WRAPV flavor:
+
+ long int result;
+ bool overflow = INT_MULTIPLY_WRAPV (a, b, &result);
+ printf ("result is %ld (%s)\n", result,
+ overflow ? "after overflow" : "no overflow");
+
+ Restrictions on these macros:
+
+ These macros do not check for all possible numerical problems or
+ undefined or unspecified behavior: they do not check for division
+ by zero, for bad shift counts, or for shifting negative numbers.
+
+ These macros may evaluate their arguments zero or multiple times, so the
+ arguments should not have side effects.
+
+ The WRAPV macros are not constant expressions. They support only
+ +, binary -, and *. The result type must be signed.
+
+ These macros are tuned for their last argument being a constant.
+
+ Return 1 if the integer expressions A + B, A - B, -A, A * B, A / B,
+ A % B, and A << B would overflow, respectively. */
+
+#define INT_ADD_OVERFLOW(a, b) \
+ _GL_BINARY_OP_OVERFLOW (a, b, _GL_ADD_OVERFLOW)
+#define INT_SUBTRACT_OVERFLOW(a, b) \
+ _GL_BINARY_OP_OVERFLOW (a, b, _GL_SUBTRACT_OVERFLOW)
+#if _GL_HAS_BUILTIN_OVERFLOW_P
+# define INT_NEGATE_OVERFLOW(a) INT_SUBTRACT_OVERFLOW (0, a)
+#else
+# define INT_NEGATE_OVERFLOW(a) \
+ INT_NEGATE_RANGE_OVERFLOW (a, _GL_INT_MINIMUM (a), _GL_INT_MAXIMUM (a))
+#endif
+#define INT_MULTIPLY_OVERFLOW(a, b) \
+ _GL_BINARY_OP_OVERFLOW (a, b, _GL_MULTIPLY_OVERFLOW)
+#define INT_DIVIDE_OVERFLOW(a, b) \
+ _GL_BINARY_OP_OVERFLOW (a, b, _GL_DIVIDE_OVERFLOW)
+#define INT_REMAINDER_OVERFLOW(a, b) \
+ _GL_BINARY_OP_OVERFLOW (a, b, _GL_REMAINDER_OVERFLOW)
+#define INT_LEFT_SHIFT_OVERFLOW(a, b) \
+ INT_LEFT_SHIFT_RANGE_OVERFLOW (a, b, \
+ _GL_INT_MINIMUM (a), _GL_INT_MAXIMUM (a))
+
+/* Return 1 if the expression A <op> B would overflow,
+ where OP_RESULT_OVERFLOW (A, B, MIN, MAX) does the actual test,
+ assuming MIN and MAX are the minimum and maximum for the result type.
+ Arguments should be free of side effects. */
+#define _GL_BINARY_OP_OVERFLOW(a, b, op_result_overflow) \
+ op_result_overflow (a, b, \
+ _GL_INT_MINIMUM (_GL_INT_CONVERT (a, b)), \
+ _GL_INT_MAXIMUM (_GL_INT_CONVERT (a, b)))
+
+/* Store the low-order bits of A + B, A - B, A * B, respectively, into *R.
+ Return 1 if the result overflows. See above for restrictions. */
+#define INT_ADD_WRAPV(a, b, r) \
+ _GL_INT_OP_WRAPV (a, b, r, +, __builtin_add_overflow, INT_ADD_OVERFLOW)
+#define INT_SUBTRACT_WRAPV(a, b, r) \
+ _GL_INT_OP_WRAPV (a, b, r, -, __builtin_sub_overflow, INT_SUBTRACT_OVERFLOW)
+#define INT_MULTIPLY_WRAPV(a, b, r) \
+ _GL_INT_OP_WRAPV (a, b, r, *, __builtin_mul_overflow, INT_MULTIPLY_OVERFLOW)
+
+/* Nonzero if this compiler has GCC bug 68193 or Clang bug 25390. See:
+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68193
+ https://llvm.org/bugs/show_bug.cgi?id=25390
+ For now, assume all versions of GCC-like compilers generate bogus
+ warnings for _Generic. This matters only for older compilers that
+ lack __builtin_add_overflow. */
+#if __GNUC__
+# define _GL__GENERIC_BOGUS 1
+#else
+# define _GL__GENERIC_BOGUS 0
+#endif
+
+/* Store the low-order bits of A <op> B into *R, where OP specifies
+ the operation. BUILTIN is the builtin operation, and OVERFLOW the
+ overflow predicate. Return 1 if the result overflows. See above
+ for restrictions. */
+#if _GL_HAS_BUILTIN_OVERFLOW
+# define _GL_INT_OP_WRAPV(a, b, r, op, builtin, overflow) builtin (a, b, r)
+#elif 201112 <= __STDC_VERSION__ && !_GL__GENERIC_BOGUS
+# define _GL_INT_OP_WRAPV(a, b, r, op, builtin, overflow) \
+ (_Generic \
+ (*(r), \
+ signed char: \
+ _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned int, \
+ signed char, SCHAR_MIN, SCHAR_MAX), \
+ short int: \
+ _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned int, \
+ short int, SHRT_MIN, SHRT_MAX), \
+ int: \
+ _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned int, \
+ int, INT_MIN, INT_MAX), \
+ long int: \
+ _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned long int, \
+ long int, LONG_MIN, LONG_MAX), \
+ long long int: \
+ _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned long long int, \
+ long long int, LLONG_MIN, LLONG_MAX)))
+#else
+# define _GL_INT_OP_WRAPV(a, b, r, op, builtin, overflow) \
+ (sizeof *(r) == sizeof (signed char) \
+ ? _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned int, \
+ signed char, SCHAR_MIN, SCHAR_MAX) \
+ : sizeof *(r) == sizeof (short int) \
+ ? _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned int, \
+ short int, SHRT_MIN, SHRT_MAX) \
+ : sizeof *(r) == sizeof (int) \
+ ? _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned int, \
+ int, INT_MIN, INT_MAX) \
+ : _GL_INT_OP_WRAPV_LONGISH(a, b, r, op, overflow))
+# ifdef LLONG_MAX
+# define _GL_INT_OP_WRAPV_LONGISH(a, b, r, op, overflow) \
+ (sizeof *(r) == sizeof (long int) \
+ ? _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned long int, \
+ long int, LONG_MIN, LONG_MAX) \
+ : _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned long long int, \
+ long long int, LLONG_MIN, LLONG_MAX))
+# else
+# define _GL_INT_OP_WRAPV_LONGISH(a, b, r, op, overflow) \
+ _GL_INT_OP_CALC (a, b, r, op, overflow, unsigned long int, \
+ long int, LONG_MIN, LONG_MAX)
+# endif
+#endif
+
+/* Store the low-order bits of A <op> B into *R, where the operation
+ is given by OP. Use the unsigned type UT for calculation to avoid
+ overflow problems. *R's type is T, with extrema TMIN and TMAX.
+ T must be a signed integer type. Return 1 if the result overflows. */
+#define _GL_INT_OP_CALC(a, b, r, op, overflow, ut, t, tmin, tmax) \
+ (sizeof ((a) op (b)) < sizeof (t) \
+ ? _GL_INT_OP_CALC1 ((t) (a), (t) (b), r, op, overflow, ut, t, tmin, tmax) \
+ : _GL_INT_OP_CALC1 (a, b, r, op, overflow, ut, t, tmin, tmax))
+#define _GL_INT_OP_CALC1(a, b, r, op, overflow, ut, t, tmin, tmax) \
+ ((overflow (a, b) \
+ || (EXPR_SIGNED ((a) op (b)) && ((a) op (b)) < (tmin)) \
+ || (tmax) < ((a) op (b))) \
+ ? (*(r) = _GL_INT_OP_WRAPV_VIA_UNSIGNED (a, b, op, ut, t), 1) \
+ : (*(r) = _GL_INT_OP_WRAPV_VIA_UNSIGNED (a, b, op, ut, t), 0))
+
+/* Return the low-order bits of A <op> B, where the operation is given
+ by OP. Use the unsigned type UT for calculation to avoid undefined
+ behavior on signed integer overflow, and convert the result to type T.
+ UT is at least as wide as T and is no narrower than unsigned int,
+ T is two's complement, and there is no padding or trap representations.
+ Assume that converting UT to T yields the low-order bits, as is
+ done in all known two's-complement C compilers. E.g., see:
+ https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html
+
+ According to the C standard, converting UT to T yields an
+ implementation-defined result or signal for values outside T's
+ range. However, code that works around this theoretical problem
+ runs afoul of a compiler bug in Oracle Studio 12.3 x86. See:
+ https://lists.gnu.org/r/bug-gnulib/2017-04/msg00049.html
+ As the compiler bug is real, don't try to work around the
+ theoretical problem. */
+
+#define _GL_INT_OP_WRAPV_VIA_UNSIGNED(a, b, op, ut, t) \
+ ((t) ((ut) (a) op (ut) (b)))
+
+#endif /* _GL_INTPROPS_H */
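
The range tests in the header above can be read as ordinary functions. The following is a minimal sketch, assuming plain int operands; the demo_* names are hypothetical and are not part of intprops.h. It mirrors the branch structure of INT_ADD_RANGE_OVERFLOW and INT_MULTIPLY_RANGE_OVERFLOW with MIN = INT_MIN and MAX = INT_MAX, so every intermediate value stays representable and the test itself cannot overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical function form of
   INT_ADD_RANGE_OVERFLOW (a, b, INT_MIN, INT_MAX).  */
static bool
demo_add_overflows (int a, int b)
{
  return b < 0 ? a < INT_MIN - b : INT_MAX - b < a;
}

/* Hypothetical function form of
   INT_MULTIPLY_RANGE_OVERFLOW (a, b, INT_MIN, INT_MAX).
   The b == -1 and b == 0 cases are split out so that the
   divisions below never trap.  */
static bool
demo_mul_overflows (int a, int b)
{
  if (b < 0)
    {
      if (a < 0)
        return a < INT_MAX / b;
      if (b == -1)
        return false;
      return INT_MIN / b < a;
    }
  if (b == 0)
    return false;
  return a < 0 ? a < INT_MIN / b : INT_MAX / b < a;
}

/* Store A + B in *R and return false, or return true and leave *R
   untouched if the sum would overflow.  Unlike INT_ADD_WRAPV, this
   sketch does not store the wrapped low-order bits on overflow.  */
static bool
demo_checked_add (int a, int b, int *r)
{
  assert (r != NULL);
  if (demo_add_overflows (a, b))
    return true;
  *r = a + b;
  return false;
}
```

Testing before performing the operation is the design point: the signed addition or multiplication is executed only once it is known to be in range.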


@ -0,0 +1,160 @@
Fix SXID_ERASE behavior in setuid programs (BZ #27471)
When parse_tunables tries to erase a tunable marked as SXID_ERASE for
setuid programs, it ends up setting the envvar string iterator
incorrectly, because of which it may parse the next tunable
incorrectly. Given that currently the implementation allows malformed
and unrecognized tunables to pass through, it may even allow SXID_ERASE
tunables to go through.
This change revamps the SXID_ERASE implementation so that:
- Only valid tunables are written back to the tunestr string, because
of which children of SXID programs will only inherit a clean list of
identified tunables that are not SXID_ERASE.
- Unrecognized tunables get scrubbed off from the environment and
subsequently from the child environment.
- This has the side effect that a tunable that is not identified by
the setxid binary will not be passed on to a non-setxid child even
if the child could have identified that tunable. This may break
applications that expect this behaviour but expecting such tunables
to cross the SXID boundary is wrong.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 2ed18c5b534d9e92fc006202a5af0df6b72e7aca)
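
The write-back strategy described above can be sketched in isolation. This is a simplified model, not the glibc implementation: the table, the demo_* names and the two-level erase flag are assumptions made for illustration, and unlike parse_tunables it only filters the string and never applies any value. Entries are copied toward the front of the same buffer, so the write offset never passes the read position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the tunable list.  */
struct demo_tunable
{
  const char *name;
  bool sxid_erase;  /* True: must not survive in a setxid process.  */
};

static const struct demo_tunable demo_tunables[] =
{
  { "glibc.malloc.check", true },
  { "glibc.malloc.mmap_threshold", false },
  { "glibc.malloc.perturb", false },
};

static const struct demo_tunable *
demo_lookup (const char *name, size_t len)
{
  size_t n = sizeof demo_tunables / sizeof demo_tunables[0];
  for (size_t i = 0; i < n; i++)
    if (strlen (demo_tunables[i].name) == len
        && memcmp (demo_tunables[i].name, name, len) == 0)
      return &demo_tunables[i];
  return NULL;
}

/* Rewrite STR in place so that only recognized "name=value" entries
   that are not marked sxid_erase remain, separated by ':'.  */
static void
demo_filter_tunables (char *str)
{
  assert (str != NULL);
  size_t off = 0;
  char *p = str;

  while (*p != '\0')
    {
      size_t entry_len = strcspn (p, ":");
      bool last = p[entry_len] == '\0';
      char *eq = memchr (p, '=', entry_len);

      if (eq != NULL)
        {
          const struct demo_tunable *t = demo_lookup (p, (size_t) (eq - p));
          if (t != NULL && !t->sxid_erase)
            {
              if (off > 0)
                str[off++] = ':';
              /* Source and destination may overlap.  */
              memmove (str + off, p, entry_len);
              off += entry_len;
            }
        }

      p += entry_len;
      if (!last)
        p++;
    }
  str[off] = '\0';
}
```

Because everything that is not explicitly kept is dropped, malformed and unrecognized entries disappear without needing a removal path of their own.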
diff --git a/elf/dl-tunables.c b/elf/dl-tunables.c
index 4c9d36e3980758b9..bbc3679e3564a766 100644
--- a/elf/dl-tunables.c
+++ b/elf/dl-tunables.c
@@ -178,6 +178,7 @@ parse_tunables (char *tunestr, char *valstring)
return;
char *p = tunestr;
+ size_t off = 0;
while (true)
{
@@ -191,7 +192,11 @@ parse_tunables (char *tunestr, char *valstring)
/* If we reach the end of the string before getting a valid name-value
pair, bail out. */
if (p[len] == '\0')
- return;
+ {
+ if (__libc_enable_secure)
+ tunestr[off] = '\0';
+ return;
+ }
/* We did not find a valid name-value pair before encountering the
colon. */
@@ -217,35 +222,28 @@ parse_tunables (char *tunestr, char *valstring)
if (tunable_is_name (cur->name, name))
{
- /* If we are in a secure context (AT_SECURE) then ignore the tunable
- unless it is explicitly marked as secure. Tunable values take
- precendence over their envvar aliases. */
+ /* If we are in a secure context (AT_SECURE) then ignore the
+ tunable unless it is explicitly marked as secure. Tunable
+ values take precedence over their envvar aliases. We write
+ the tunables that are not SXID_ERASE back to TUNESTR, thus
+ dropping all SXID_ERASE tunables and any invalid or
+ unrecognized tunables. */
if (__libc_enable_secure)
{
- if (cur->security_level == TUNABLE_SECLEVEL_SXID_ERASE)
+ if (cur->security_level != TUNABLE_SECLEVEL_SXID_ERASE)
{
- if (p[len] == '\0')
- {
- /* Last tunable in the valstring. Null-terminate and
- return. */
- *name = '\0';
- return;
- }
- else
- {
- /* Remove the current tunable from the string. We do
- this by overwriting the string starting from NAME
- (which is where the current tunable begins) with
- the remainder of the string. We then have P point
- to NAME so that we continue in the correct
- position in the valstring. */
- char *q = &p[len + 1];
- p = name;
- while (*q != '\0')
- *name++ = *q++;
- name[0] = '\0';
- len = 0;
- }
+ if (off > 0)
+ tunestr[off++] = ':';
+
+ const char *n = cur->name;
+
+ while (*n != '\0')
+ tunestr[off++] = *n++;
+
+ tunestr[off++] = '=';
+
+ for (size_t j = 0; j < len; j++)
+ tunestr[off++] = value[j];
}
if (cur->security_level != TUNABLE_SECLEVEL_NONE)
@@ -258,9 +256,7 @@ parse_tunables (char *tunestr, char *valstring)
}
}
- if (p[len] == '\0')
- return;
- else
+ if (p[len] != '\0')
p += len + 1;
}
}
diff --git a/elf/tst-env-setuid-tunables.c b/elf/tst-env-setuid-tunables.c
index a48281b175af6920..0b9b075c40598c6f 100644
--- a/elf/tst-env-setuid-tunables.c
+++ b/elf/tst-env-setuid-tunables.c
@@ -45,11 +45,37 @@
const char *teststrings[] =
{
"glibc.malloc.check=2:glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.check=2:glibc.malloc.check=2:glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.check=2:glibc.malloc.mmap_threshold=4096:glibc.malloc.check=2",
+ "glibc.malloc.perturb=0x800",
+ "glibc.malloc.perturb=0x800:glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.perturb=0x800:not_valid.malloc.check=2:glibc.malloc.mmap_threshold=4096",
+ "glibc.not_valid.check=2:glibc.malloc.mmap_threshold=4096",
+ "not_valid.malloc.check=2:glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.garbage=2:glibc.maoc.mmap_threshold=4096:glibc.malloc.check=2",
+ "glibc.malloc.check=4:glibc.malloc.garbage=2:glibc.maoc.mmap_threshold=4096",
+ ":glibc.malloc.garbage=2:glibc.malloc.check=1",
+ "glibc.malloc.check=1:glibc.malloc.check=2",
+ "not_valid.malloc.check=2",
+ "glibc.not_valid.check=2",
};
const char *resultstrings[] =
{
"glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.perturb=0x800",
+ "glibc.malloc.perturb=0x800:glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.perturb=0x800:glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.mmap_threshold=4096",
+ "glibc.malloc.mmap_threshold=4096",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
};
static int


@ -0,0 +1,28 @@
commit 583dd860d5b833037175247230a328f0050dbfe9
Author: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon Jan 21 11:08:13 2019 -0800
regex: fix read overrun [BZ #24114]
Problem found by AddressSanitizer, reported by Hongxu Chen in:
https://debbugs.gnu.org/34140
* posix/regexec.c (proceed_next_node):
Do not read past end of input buffer.
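
The shape of the fix is a length check that runs before memcmp. A minimal sketch follows, assuming a flat buffer with a known valid length; backref_matches and its parameters are hypothetical names, not the regexec.c interface, and the precondition asserts are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the LEN bytes at BUF + SO equal the LEN bytes at
   BUF + POS.  Only VALID_LEN bytes of BUF may be read, so reject the
   comparison when fewer than LEN bytes remain at POS.  */
static bool
backref_matches (const char *buf, size_t valid_len,
                 size_t so, size_t pos, size_t len)
{
  /* Assumed preconditions: POS lies inside the buffer and the
     earlier match at SO was itself fully inside it.  */
  assert (pos <= valid_len);
  assert (so <= valid_len && len <= valid_len - so);

  if (valid_len - pos < len)
    return false;
  return memcmp (buf + so, buf + pos, len) == 0;
}
```

Writing the test as valid_len - pos < len rather than pos + len > valid_len keeps the arithmetic from wrapping when the operands are unsigned.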
diff --git a/posix/regexec.c b/posix/regexec.c
index 73644c2341336e66..06b8487c3e3eab0e 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -1289,8 +1289,10 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
else if (naccepted)
{
char *buf = (char *) re_string_get_buffer (&mctx->input);
- if (memcmp (buf + regs[subexp_idx].rm_so, buf + *pidx,
- naccepted) != 0)
+ if (mctx->input.valid_len - *pidx < naccepted
+ || (memcmp (buf + regs[subexp_idx].rm_so, buf + *pidx,
+ naccepted)
+ != 0))
return -1;
}
}


@ -0,0 +1,100 @@
commit 56c81132ccc6f468fa4fc29c536db060e18e9d87
Author: Raphael Moreira Zinsly <rzinsly@linux.ibm.com>
Date: Tue Feb 23 14:14:37 2021 -0300
powerpc: Add optimized ilogb* for POWER9
The instructions xsxexpdp and xsxexpqp introduced on POWER9 extract
the exponent from a double-precision and a quad-precision floating-point
value, respectively, so they can be used to improve ilogb, ilogbf and ilogbf128.
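
For readers without POWER9 hardware, the computation behind the double-precision fast path can be sketched in portable C. This assumes double is IEEE 754 binary64 with the same byte order as uint64_t; demo_ilogb_normal is a hypothetical name, and it handles only normal numbers, since the patch sends exceptional inputs to the generic implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the unbiased exponent of a normal binary64 value by reading
   the 11-bit exponent field directly, then subtracting the bias 0x3ff
   as the __builtin_ilogb macro in the patch does.  */
static int
demo_ilogb_normal (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);

  int biased = (int) ((bits >> 52) & 0x7ff);

  /* An all-zero or all-one exponent field is not a normal number;
     this sketch does not handle those inputs.  */
  assert (biased != 0 && biased != 0x7ff);
  return biased - 0x3ff;
}
```

The quad-precision variant in the patch follows the same pattern with the bias 0x3fff.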
diff --git a/sysdeps/powerpc/fpu/math_private.h b/sysdeps/powerpc/fpu/math_private.h
index e642d6c8237578ea..5bbc468829062a48 100644
--- a/sysdeps/powerpc/fpu/math_private.h
+++ b/sysdeps/powerpc/fpu/math_private.h
@@ -26,7 +26,28 @@
#include_next <math_private.h>
-#if defined _ARCH_PWR9 && __HAVE_DISTINCT_FLOAT128
+#ifdef _ARCH_PWR9
+
+#if __GNUC_PREREQ (8, 0)
+# define _GL_HAS_BUILTIN_ILOGB 1
+#elif defined __has_builtin
+# define _GL_HAS_BUILTIN_ILOGB __has_builtin (__builtin_vsx_scalar_extract_exp)
+#else
+# define _GL_HAS_BUILTIN_ILOGB 0
+#endif
+
+#define __builtin_test_dc_ilogbf __builtin_test_dc_ilogb
+#define __builtin_ilogbf __builtin_ilogb
+
+#define __builtin_test_dc_ilogb(x, y) \
+ __builtin_vsx_scalar_test_data_class_dp(x, y)
+#define __builtin_ilogb(x) __builtin_vsx_scalar_extract_exp(x) - 0x3ff
+
+#define __builtin_test_dc_ilogbf128(x, y) \
+ __builtin_vsx_scalar_test_data_class_qp(x, y)
+#define __builtin_ilogbf128(x) __builtin_vsx_scalar_extract_expq(x) - 0x3fff
+
+#if __HAVE_DISTINCT_FLOAT128
extern __always_inline _Float128
__ieee754_sqrtf128 (_Float128 __x)
{
@@ -35,6 +56,9 @@ __ieee754_sqrtf128 (_Float128 __x)
return __z;
}
#endif
+#else /* !_ARCH_PWR9 */
+#define _GL_HAS_BUILTIN_ILOGB 0
+#endif
#if defined _ARCH_PWR5X
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/w_ilogb_template.c b/sysdeps/powerpc/powerpc64/le/fpu/w_ilogb_template.c
new file mode 100644
index 0000000000000000..b5c1c0aa9db86f3d
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/w_ilogb_template.c
@@ -0,0 +1,30 @@
+#include <math.h>
+#include <errno.h>
+#include <limits.h>
+#include <math_private.h>
+#include <fenv.h>
+
+#if _GL_HAS_BUILTIN_ILOGB
+int
+M_DECL_FUNC (__ilogb) (FLOAT x)
+{
+ int r;
+ /* Check for exceptional cases. */
+ if (! M_SUF(__builtin_test_dc_ilogb) (x, 0x7f))
+ r = M_SUF (__builtin_ilogb) (x);
+ else
+ /* Fall back to the generic ilogb if x is NaN, Inf or subnormal. */
+ r = M_SUF (__ieee754_ilogb) (x);
+ if (__builtin_expect (r == FP_ILOGB0, 0)
+ || __builtin_expect (r == FP_ILOGBNAN, 0)
+ || __builtin_expect (r == INT_MAX, 0))
+ {
+ __set_errno (EDOM);
+ __feraiseexcept (FE_INVALID);
+ }
+ return r;
+}
+declare_mgen_alias (__ilogb, ilogb)
+#else
+#include <math/w_ilogb_template.c>
+#endif
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/w_ilogbl.c b/sysdeps/powerpc/powerpc64/le/fpu/w_ilogbl.c
new file mode 100644
index 0000000000000000..205f154f0089a269
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/w_ilogbl.c
@@ -0,0 +1,4 @@
+/* Skip the optimization for long double as ibm128 does not provide an
+ optimized builtin. */
+#include <math-type-macros-ldouble.h>
+#include <math/w_ilogb_template.c>


@ -0,0 +1,64 @@
commit a7d88506c260e7a0e4268803e76fc19e38ed041f
Author: Raphael Moreira Zinsly <rzinsly@linux.ibm.com>
Date: Thu Feb 25 09:58:52 2021 -0300
powerpc: Add optimized llogb* for POWER9
The POWER9 builtins used to improve the ilogb* functions can be
used in the llogb* functions as well.
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/w_llogb_template.c b/sysdeps/powerpc/powerpc64/le/fpu/w_llogb_template.c
new file mode 100644
index 0000000000000000..d00b71d2a34e28da
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/w_llogb_template.c
@@ -0,0 +1,39 @@
+#include <math.h>
+#include <errno.h>
+#include <limits.h>
+#include <math_private.h>
+#include <fenv.h>
+
+#if _GL_HAS_BUILTIN_ILOGB
+long int
+M_DECL_FUNC (__llogb) (FLOAT x)
+{
+ int r;
+ /* Check for exceptional cases. */
+ if (! M_SUF(__builtin_test_dc_ilogb) (x, 0x7f))
+ r = M_SUF (__builtin_ilogb) (x);
+ else
+ /* Fall back to the generic ilogb if x is NaN, Inf or subnormal. */
+ r = M_SUF (__ieee754_ilogb) (x);
+ long int lr = r;
+ if (__glibc_unlikely (r == FP_ILOGB0)
+ || __glibc_unlikely (r == FP_ILOGBNAN)
+ || __glibc_unlikely (r == INT_MAX))
+ {
+#if LONG_MAX != INT_MAX
+ if (r == FP_ILOGB0)
+ lr = FP_LLOGB0;
+ else if (r == FP_ILOGBNAN)
+ lr = FP_LLOGBNAN;
+ else
+ lr = LONG_MAX;
+#endif
+ __set_errno (EDOM);
+ __feraiseexcept (FE_INVALID);
+ }
+ return lr;
+}
+declare_mgen_alias (__llogb, llogb)
+#else
+#include <math/w_llogb_template.c>
+#endif
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/w_llogbl.c b/sysdeps/powerpc/powerpc64/le/fpu/w_llogbl.c
new file mode 100644
index 0000000000000000..69477a37ae82c476
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/w_llogbl.c
@@ -0,0 +1,4 @@
+/* Skip the optimization for long double as ibm128 does not provide an
+ optimized builtin. */
+#include <math-type-macros-ldouble.h>
+#include <math/w_llogb_template.c>


@ -0,0 +1,334 @@
commit 10624a97e8e47004985740cbb04060a84cfada76
Author: Matheus Castanho <msc@linux.ibm.com>
Date: Tue Sep 29 15:40:08 2020 -0300
powerpc: Add optimized strlen for POWER10
Improvements compared to POWER9 version:
1. Take into account first 16B comparison for aligned strings
The previous version compares the first 16B and increments r4 by the number
of bytes until the address is 16B-aligned, then starts doing aligned loads at
that address. For aligned strings, this causes the first 16B to be compared
twice, because the increment is 0. Here we calculate the next 16B-aligned
address differently, which avoids that issue.
2. Use simple comparisons for the first ~192 bytes
The main loop is good for big strings, but comparing 16B each time is better
for smaller strings. So after aligning the address to 16 Bytes, we check
176 more bytes in 16B chunks. There may be some overlaps with the main loop for
unaligned strings, but we avoid using the more aggressive strategy too soon,
and also allow the loop to start at a 64B-aligned address. This greatly
benefits smaller strings and avoids overlapping checks if the string is
already aligned at a 64B boundary.
3. Reduce dependencies between load blocks caused by address calculation on loop
Precise time tracing of the code showed that many loads in the loop were
stalled waiting for updates to r4 from previous code blocks. This
implementation avoids that as much as possible by using 2 registers (r4 and
r5) to hold addresses to be used by different parts of the code.
Also, the previous code aligned the address to 16B, then to 64B by doing a
few 48B loops (if needed) until the address was aligned. The main loop could
not start until that 48B loop had finished and r4 was updated with the
current address. Here we calculate the address used by the loop very early,
so it can start sooner.
The main loop now uses 2 pointers 128B apart to make pointer updates less
frequent, and also unrolls 1 iteration to guarantee there is enough time
between iterations to update the pointers, reducing stalled cycles.
4. Use new P10 instructions
lxvp is used to load 32B with a single instruction, reducing contention in
the load queue.
vextractbm allows simplifying the tail code for the loop, replacing
vbpermq and avoiding having to generate a permute control vector.
Reviewed-by: Paul E Murphy <murphyp@linux.ibm.com>
Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
Reviewed-by: Lucas A. M. Magalhaes <lamm@linux.ibm.com>
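
The two address calculations described in points 1 and 2 can be checked with integer arithmetic. The sketch below is a C model of the addi/clrrdi pairs that set r5 and r4 in the assembly; the function names are hypothetical. It shows that a 16B-aligned input advances by a full 16 bytes instead of 0, and that the 64B-aligned loop start never lies beyond the end of the 176 bytes covered by the 16B checks.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "addi r5,r3,16; clrrdi r5,r5,4": the first 16B-aligned
   address strictly greater than S.  */
static uintptr_t
demo_next_16b_aligned (uintptr_t s)
{
  assert (s <= UINTPTR_MAX - 192);  /* Keep the sketch free of wraparound.  */
  return (s + 16) & ~(uintptr_t) 15;
}

/* Model of "addi r4,r3,192; clrrdi r4,r4,6": the 64B-aligned address
   at which the main loop starts.  */
static uintptr_t
demo_loop_start (uintptr_t s)
{
  assert (s <= UINTPTR_MAX - 192);
  return (s + 192) & ~(uintptr_t) 63;
}
```

For any start address s, demo_loop_start (s) is at most demo_next_16b_aligned (s) + 176, so the loop may re-check some bytes but cannot skip any.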
diff --git a/sysdeps/powerpc/powerpc64/le/power10/strlen.S b/sysdeps/powerpc/powerpc64/le/power10/strlen.S
new file mode 100644
index 0000000000000000..ca7e9eb3d84c9b00
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power10/strlen.S
@@ -0,0 +1,221 @@
+/* Optimized strlen implementation for POWER10 LE.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+
+#ifndef STRLEN
+# define STRLEN __strlen
+# define DEFINE_STRLEN_HIDDEN_DEF 1
+#endif
+
+/* TODO: Replace macros by the actual instructions when minimum binutils becomes
+ >= 2.35. This is used to keep compatibility with older versions. */
+#define VEXTRACTBM(rt,vrb) \
+ .long(((4)<<(32-6)) \
+ | ((rt)<<(32-11)) \
+ | ((8)<<(32-16)) \
+ | ((vrb)<<(32-21)) \
+ | 1602)
+
+#define LXVP(xtp,dq,ra) \
+ .long(((6)<<(32-6)) \
+ | ((((xtp)-32)>>1)<<(32-10)) \
+ | ((1)<<(32-11)) \
+ | ((ra)<<(32-16)) \
+ | dq)
+
+#define CHECK16(vreg,offset,addr,label) \
+ lxv vreg+32,offset(addr); \
+ vcmpequb. vreg,vreg,v18; \
+ bne cr6,L(label);
+
+/* Load 4 quadwords, merge into one VR for speed and check for null bytes. r6 has #
+ of bytes already checked. */
+#define CHECK64(offset,addr,label) \
+ li r6,offset; \
+ LXVP(v4+32,offset,addr); \
+ LXVP(v6+32,offset+32,addr); \
+ vminub v14,v4,v5; \
+ vminub v15,v6,v7; \
+ vminub v16,v14,v15; \
+ vcmpequb. v0,v16,v18; \
+ bne cr6,L(label)
+
+#define TAIL(vreg,increment) \
+ vctzlsbb r4,vreg; \
+ subf r3,r3,r5; \
+ addi r4,r4,increment; \
+ add r3,r3,r4; \
+ blr
+
+/* Implements the function
+
+ size_t [r3] strlen (const char *s [r3])
+
+ The implementation can load bytes past a matching byte, but only
+ up to the next 64B boundary, so it never crosses a page. */
+
+.machine power9
+
+ENTRY_TOCLESS (STRLEN, 4)
+ CALL_MCOUNT 1
+
+ vspltisb v18,0
+ vspltisb v19,-1
+
+ /* Next 16B-aligned address. Prepare address for L(aligned). */
+ addi r5,r3,16
+ clrrdi r5,r5,4
+
+ /* Align data and fill bytes not loaded with non matching char. */
+ lvx v0,0,r3
+ lvsr v1,0,r3
+ vperm v0,v19,v0,v1
+
+ vcmpequb. v6,v0,v18
+ beq cr6,L(aligned)
+
+ vctzlsbb r3,v6
+ blr
+
+ /* Test next 176B, 16B at a time. The main loop is optimized for longer
+ strings, so checking the first bytes in 16B chunks greatly benefits
+ small strings. */
+ .p2align 5
+L(aligned):
+ /* Prepare address for the loop. */
+ addi r4,r3,192
+ clrrdi r4,r4,6
+
+ CHECK16(v0,0,r5,tail1)
+ CHECK16(v1,16,r5,tail2)
+ CHECK16(v2,32,r5,tail3)
+ CHECK16(v3,48,r5,tail4)
+ CHECK16(v4,64,r5,tail5)
+ CHECK16(v5,80,r5,tail6)
+ CHECK16(v6,96,r5,tail7)
+ CHECK16(v7,112,r5,tail8)
+ CHECK16(v8,128,r5,tail9)
+ CHECK16(v9,144,r5,tail10)
+ CHECK16(v10,160,r5,tail11)
+
+ addi r5,r4,128
+
+ /* Switch to a more aggressive approach checking 64B each time. Use 2
+ pointers 128B apart and unroll the loop once to make the pointer
+ updates and usages separated enough to avoid stalls waiting for
+ address calculation. */
+ .p2align 5
+L(loop):
+ CHECK64(0,r4,pre_tail_64b)
+ CHECK64(64,r4,pre_tail_64b)
+ addi r4,r4,256
+
+ CHECK64(0,r5,tail_64b)
+ CHECK64(64,r5,tail_64b)
+ addi r5,r5,256
+
+ b L(loop)
+
+ .p2align 5
+L(pre_tail_64b):
+ mr r5,r4
+L(tail_64b):
+ /* OK, we found a null byte. Let's look for it in the current 64-byte
+ block and mark it in its corresponding VR. lxvp vx,0(ry) puts the
+ low 16 bytes into vx+1, and the high into vx, so the order here is
+ v5, v4, v7, v6. */
+ vcmpequb v1,v5,v18
+ vcmpequb v2,v4,v18
+ vcmpequb v3,v7,v18
+ vcmpequb v4,v6,v18
+
+ /* Take into account the other 64B blocks we had already checked. */
+ add r5,r5,r6
+
+ /* Extract first bit of each byte. */
+ VEXTRACTBM(r7,v1)
+ VEXTRACTBM(r8,v2)
+ VEXTRACTBM(r9,v3)
+ VEXTRACTBM(r10,v4)
+
+ /* Shift each value into their corresponding position. */
+ sldi r8,r8,16
+ sldi r9,r9,32
+ sldi r10,r10,48
+
+ /* Merge the results. */
+ or r7,r7,r8
+ or r8,r9,r10
+ or r10,r8,r7
+
+ cnttzd r0,r10 /* Count trailing zeros before the match. */
+ subf r5,r3,r5
+ add r3,r5,r0 /* Compute final length. */
+ blr
+
+ .p2align 5
+L(tail1):
+ TAIL(v0,0)
+
+ .p2align 5
+L(tail2):
+ TAIL(v1,16)
+
+ .p2align 5
+L(tail3):
+ TAIL(v2,32)
+
+ .p2align 5
+L(tail4):
+ TAIL(v3,48)
+
+ .p2align 5
+L(tail5):
+ TAIL(v4,64)
+
+ .p2align 5
+L(tail6):
+ TAIL(v5,80)
+
+ .p2align 5
+L(tail7):
+ TAIL(v6,96)
+
+ .p2align 5
+L(tail8):
+ TAIL(v7,112)
+
+ .p2align 5
+L(tail9):
+ TAIL(v8,128)
+
+ .p2align 5
+L(tail10):
+ TAIL(v9,144)
+
+ .p2align 5
+L(tail11):
+ TAIL(v10,160)
+
+END (STRLEN)
+
+#ifdef DEFINE_STRLEN_HIDDEN_DEF
+weak_alias (__strlen, strlen)
+libc_hidden_builtin_def (strlen)
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index a9e13e05e90601cd..61652b65dd223018 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -33,7 +33,8 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
ifneq (,$(filter %le,$(config-machine)))
sysdep_routines += strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
- rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9
+ rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9 \
+ strlen-power10
endif
CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index b30bc53930fc0e36..46d5956adda72b86 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -112,6 +112,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/strlen.c. */
IFUNC_IMPL (i, name, strlen,
#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, strlen, hwcap2 & PPC_FEATURE2_ARCH_3_1,
+ __strlen_power10)
IFUNC_IMPL_ADD (array, i, strlen, hwcap2 & PPC_FEATURE2_ARCH_3_00,
__strlen_power9)
#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/strlen-power10.S b/sysdeps/powerpc/powerpc64/multiarch/strlen-power10.S
new file mode 100644
index 0000000000000000..6a774fad58c77179
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/strlen-power10.S
@@ -0,0 +1,2 @@
+#define STRLEN __strlen_power10
+#include <sysdeps/powerpc/powerpc64/le/power10/strlen.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/strlen.c b/sysdeps/powerpc/powerpc64/multiarch/strlen.c
index b7f0fbb13fb97783..11bdb96de2d2aa66 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/strlen.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/strlen.c
@@ -31,9 +31,12 @@ extern __typeof (__redirect_strlen) __strlen_ppc attribute_hidden;
extern __typeof (__redirect_strlen) __strlen_power7 attribute_hidden;
extern __typeof (__redirect_strlen) __strlen_power8 attribute_hidden;
extern __typeof (__redirect_strlen) __strlen_power9 attribute_hidden;
+extern __typeof (__redirect_strlen) __strlen_power10 attribute_hidden;
libc_ifunc (__libc_strlen,
# ifdef __LITTLE_ENDIAN__
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1)
+ ? __strlen_power10 :
(hwcap2 & PPC_FEATURE2_ARCH_3_00)
? __strlen_power9 :
# endif


@ -0,0 +1,527 @@
commit dd59655e9371af86043b97e38953f43bd9496699
Author: Lucas A. M. Magalhaes <lamm@linux.ibm.com>
Date: Fri Apr 30 18:12:08 2021 -0300
powerpc64le: Optimized memmove for POWER10
This patch was initially based on the __memmove_power7 with some ideas
from strncpy implementation for Power 9.
Improvements from __memmove_power7:
1. Use lxvl/stxvl for alignment code.
The code for Power 7 uses branches when the input is not naturally
aligned to the width of a vector. The new implementation uses
lxvl/stxvl instead which reduces pressure on GPRs. It also allows
the removal of branch instructions, implicitly removing branch stalls
and mispredictions.
2. The lxv/stxv and lxvl/stxvl pairs are safe to use on Cache Inhibited
memory.
On Power 10 vector load and stores are safe to use on CI memory for
addresses unaligned to 16B. This code takes advantage of this to
do unaligned loads.
The unaligned loads don't have a significant performance impact by
themselves. However, doing so decreases register pressure on GPRs
and interdependence stalls on load/store pairs. This also improves
readability as there are now fewer code paths for different alignments.
Finally, this reduces the overall code size.
3. Improved performance.
This version runs on average about 30% better than memmove_power7
for lengths larger than 8KB. For input lengths shorter than 8KB
the improvement is smaller: on average about 17% better
performance.
This version has a degradation of about 50% for input lengths
in the 0 to 31 bytes range when dest is unaligned.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
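For illustration only (not part of the patch): the direction check at the
top of the POWER10 memmove (subf/cmpld/blt) can be modelled in C roughly as
below. The names needs_backward_copy and model_memmove are hypothetical, and
the byte loops merely stand in for the vector code; this is a sketch of the
decision, not of the implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when dest lies inside [src, src + len).
   The unsigned difference dest - src wraps to a very large value when
   dest is below src, so a single unsigned comparison covers both the
   "no overlap" and the "src ahead of dest" cases. */
static int needs_backward_copy(const void *dest, const void *src, size_t len)
{
    uintptr_t diff = (uintptr_t)dest - (uintptr_t)src;
    return diff < len;
}

/* Byte-wise reference model of the direction choice only. */
static void *model_memmove(void *dest, const void *src, size_t len)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    assert(len == 0 || (dest != NULL && src != NULL));

    if (needs_backward_copy(dest, src, len)) {
        /* Copy from the end so unread source bytes are never clobbered. */
        while (len--)
            d[len] = s[len];
    } else {
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    }
    return dest;
}
```

Moving 8 bytes from buf to buf + 2 takes the backward branch in this model,
while moving from buf + 2 to buf takes the forward one.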
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memmove.S b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
new file mode 100644
index 0000000000000000..7dfd57edeb37e8e4
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
@@ -0,0 +1,320 @@
+/* Optimized memmove implementation for POWER10.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+
+
+/* void* [r3] memmove (void *dest [r3], const void *src [r4], size_t len [r5])
+
+ This optimization checks if 'src' and 'dest' overlap. If they do not
+ or 'src' is ahead of 'dest' then it copies forward.
+ Otherwise, an optimized backward copy is used. */
+
+#ifndef MEMMOVE
+# define MEMMOVE memmove
+#endif
+ .machine power9
+ENTRY_TOCLESS (MEMMOVE, 5)
+ CALL_MCOUNT 3
+
+L(_memmove):
+ .p2align 5
+ /* Check if there is overlap; if so, branch to the backward copy. */
+ subf r9,r4,r3
+ cmpld cr7,r9,r5
+ blt cr7,L(memmove_bwd)
+
+ /* Fast path for length shorter than 16 bytes. */
+ sldi r7,r5,56
+ lxvl 32+v2,r4,r7
+ stxvl 32+v2,r3,r7
+ subic. r8,r5,16
+ blelr
+
+ /* For shorter lengths aligning the dest address to 16 bytes either
+ decreases performance or is irrelevant. This comparison is used to
+ skip the alignment when len < 256. */
+ cmpldi cr6,r5,256
+ bge cr6,L(ge_256)
+ /* Account for the first 16-byte copy. */
+ addi r4,r4,16
+ addi r11,r3,16 /* use r11 to keep dest address on r3. */
+ subi r5,r5,16
+ b L(loop_head)
+
+ .p2align 5
+L(ge_256):
+ /* Account for the first copy <= 16 bytes. This is necessary for
+ memmove because at this point the src address can be in front of the
+ dest address. */
+ clrldi r9,r5,56
+ li r8,16
+ cmpldi r9,16
+ iselgt r9,r8,r9
+ add r4,r4,r9
+ add r11,r3,r9 /* use r11 to keep dest address on r3. */
+ sub r5,r5,r9
+
+ /* Align dest to 16 bytes. */
+ neg r7,r3
+ clrldi. r9,r7,60
+ beq L(loop_head)
+
+ .p2align 5
+ sldi r6,r9,56
+ lxvl 32+v0,r4,r6
+ stxvl 32+v0,r11,r6
+ sub r5,r5,r9
+ add r4,r4,r9
+ add r11,r11,r9
+
+L(loop_head):
+ cmpldi r5,63
+ ble L(final_64)
+
+ srdi. r7,r5,7
+ beq L(loop_tail)
+
+ mtctr r7
+
+/* Main loop that copies 128 bytes each iteration. */
+ .p2align 5
+L(loop):
+ addi r9,r4,64
+ addi r10,r11,64
+
+ lxv 32+v0,0(r4)
+ lxv 32+v1,16(r4)
+ lxv 32+v2,32(r4)
+ lxv 32+v3,48(r4)
+
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ stxv 32+v2,32(r11)
+ stxv 32+v3,48(r11)
+
+ addi r4,r4,128
+ addi r11,r11,128
+
+ lxv 32+v4,0(r9)
+ lxv 32+v5,16(r9)
+ lxv 32+v6,32(r9)
+ lxv 32+v7,48(r9)
+
+ stxv 32+v4,0(r10)
+ stxv 32+v5,16(r10)
+ stxv 32+v6,32(r10)
+ stxv 32+v7,48(r10)
+
+ bdnz L(loop)
+ clrldi. r5,r5,57
+ beqlr
+
+/* Copy 64 bytes. */
+ .p2align 5
+L(loop_tail):
+ cmpldi cr5,r5,63
+ ble cr5,L(final_64)
+
+ lxv 32+v0,0(r4)
+ lxv 32+v1,16(r4)
+ lxv 32+v2,32(r4)
+ lxv 32+v3,48(r4)
+
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ stxv 32+v2,32(r11)
+ stxv 32+v3,48(r11)
+
+ addi r4,r4,64
+ addi r11,r11,64
+ subi r5,r5,64
+
+/* Copies the last 1-63 bytes. */
+ .p2align 5
+L(final_64):
+ /* r8 holds the number of bytes that will be copied with lxv/stxv. */
+ clrrdi. r8,r5,4
+ beq L(tail1)
+
+ cmpldi cr5,r5,32
+ lxv 32+v0,0(r4)
+ blt cr5,L(tail2)
+
+ cmpldi cr6,r5,48
+ lxv 32+v1,16(r4)
+ blt cr6,L(tail3)
+
+ .p2align 5
+ lxv 32+v2,32(r4)
+ stxv 32+v2,32(r11)
+L(tail3):
+ stxv 32+v1,16(r11)
+L(tail2):
+ stxv 32+v0,0(r11)
+ sub r5,r5,r8
+ add r4,r4,r8
+ add r11,r11,r8
+ .p2align 5
+L(tail1):
+ sldi r6,r5,56
+ lxvl v4,r4,r6
+ stxvl v4,r11,r6
+ blr
+
+/* If dest and src overlap, we should copy backwards. */
+L(memmove_bwd):
+ add r11,r3,r5
+ add r4,r4,r5
+
+ /* Optimization for length smaller than 16 bytes. */
+ cmpldi cr5,r5,15
+ ble cr5,L(tail1_bwd)
+
+ /* For shorter lengths the alignment either slows down or is irrelevant.
+ The forward copy reuses a comparison with 256 that it already needs.
+ Here 128 is used as it reduces code and improves readability. */
+ cmpldi cr7,r5,128
+ blt cr7,L(bwd_loop_tail)
+
+ /* Align dest address to 16 bytes. */
+ .p2align 5
+ clrldi. r9,r11,60
+ beq L(bwd_loop_head)
+ sub r4,r4,r9
+ sub r11,r11,r9
+ lxv 32+v0,0(r4)
+ sldi r6,r9,56
+ stxvl 32+v0,r11,r6
+ sub r5,r5,r9
+
+L(bwd_loop_head):
+ srdi. r7,r5,7
+ beq L(bwd_loop_tail)
+
+ mtctr r7
+
+/* Main loop that copies 128 bytes every iteration. */
+ .p2align 5
+L(bwd_loop):
+ addi r9,r4,-64
+ addi r10,r11,-64
+
+ lxv 32+v0,-16(r4)
+ lxv 32+v1,-32(r4)
+ lxv 32+v2,-48(r4)
+ lxv 32+v3,-64(r4)
+
+ stxv 32+v0,-16(r11)
+ stxv 32+v1,-32(r11)
+ stxv 32+v2,-48(r11)
+ stxv 32+v3,-64(r11)
+
+ addi r4,r4,-128
+ addi r11,r11,-128
+
+ lxv 32+v0,-16(r9)
+ lxv 32+v1,-32(r9)
+ lxv 32+v2,-48(r9)
+ lxv 32+v3,-64(r9)
+
+ stxv 32+v0,-16(r10)
+ stxv 32+v1,-32(r10)
+ stxv 32+v2,-48(r10)
+ stxv 32+v3,-64(r10)
+
+ bdnz L(bwd_loop)
+ clrldi. r5,r5,57
+ beqlr
+
+/* Copy 64 bytes. */
+ .p2align 5
+L(bwd_loop_tail):
+ cmpldi cr5,r5,63
+ ble cr5,L(bwd_final_64)
+
+ addi r4,r4,-64
+ addi r11,r11,-64
+
+ lxv 32+v0,0(r4)
+ lxv 32+v1,16(r4)
+ lxv 32+v2,32(r4)
+ lxv 32+v3,48(r4)
+
+ stxv 32+v0,0(r11)
+ stxv 32+v1,16(r11)
+ stxv 32+v2,32(r11)
+ stxv 32+v3,48(r11)
+
+ subi r5,r5,64
+
+/* Copies the last 1-63 bytes. */
+ .p2align 5
+L(bwd_final_64):
+ /* r8 holds the number of bytes that will be copied with lxv/stxv. */
+ clrrdi. r8,r5,4
+ beq L(tail1_bwd)
+
+ cmpldi cr5,r5,32
+ lxv 32+v2,-16(r4)
+ blt cr5,L(tail2_bwd)
+
+ cmpldi cr6,r5,48
+ lxv 32+v1,-32(r4)
+ blt cr6,L(tail3_bwd)
+
+ .p2align 5
+ lxv 32+v0,-48(r4)
+ stxv 32+v0,-48(r11)
+L(tail3_bwd):
+ stxv 32+v1,-32(r11)
+L(tail2_bwd):
+ stxv 32+v2,-16(r11)
+ sub r4,r4,r5
+ sub r11,r11,r5
+ sub r5,r5,r8
+ sldi r6,r5,56
+ lxvl v4,r4,r6
+ stxvl v4,r11,r6
+ blr
+
+/* Copy the last 0-15 bytes. */
+ .p2align 5
+L(tail1_bwd):
+ sub r4,r4,r5
+ sub r11,r11,r5
+ sldi r6,r5,56
+ lxvl v4,r4,r6
+ stxvl v4,r11,r6
+ blr
+
+END_GEN_TB (MEMMOVE,TB_TOCLESS)
+libc_hidden_builtin_def (memmove)
+
+/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
+ Implemented in this file to prevent the linker from creating a stub
+ function call in the branch to '_memmove'. */
+ENTRY_TOCLESS (__bcopy)
+ mr r6,r3
+ mr r3,r4
+ mr r4,r6
+ b L(_memmove)
+END (__bcopy)
+#ifndef __bcopy
+weak_alias (__bcopy, bcopy)
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 61652b65dd223018..66f8c6ace9824d4a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -32,7 +32,8 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
strncase-power8
ifneq (,$(filter %le,$(config-machine)))
-sysdep_routines += strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
+sysdep_routines += memmove-power10 \
+ strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9 \
strlen-power10
endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
index 1c4a229b1fc5654a..705fef33d4e57557 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
@@ -22,8 +22,17 @@
extern __typeof (bcopy) __bcopy_ppc attribute_hidden;
/* __bcopy_power7 symbol is implemented at memmove-power7.S */
extern __typeof (bcopy) __bcopy_power7 attribute_hidden;
+#ifdef __LITTLE_ENDIAN__
+extern __typeof (bcopy) __bcopy_power10 attribute_hidden;
+#endif
libc_ifunc (bcopy,
+#ifdef __LITTLE_ENDIAN__
+ hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
+ PPC_FEATURE2_HAS_ISEL)
+ && (hwcap & PPC_FEATURE_HAS_VSX)
+ ? __bcopy_power10 :
+#endif
(hwcap & PPC_FEATURE_HAS_VSX)
? __bcopy_power7
: __bcopy_ppc);
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 46d5956adda72b86..4ce04bc51574cca1 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -67,6 +67,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/memmove.c. */
IFUNC_IMPL (i, name, memmove,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, memmove,
+ hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
+ PPC_FEATURE2_HAS_ISEL)
+ && (hwcap & PPC_FEATURE_HAS_VSX),
+ __memmove_power10)
+#endif
IFUNC_IMPL_ADD (array, i, memmove, hwcap & PPC_FEATURE_HAS_VSX,
__memmove_power7)
IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_ppc))
@@ -186,6 +193,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/bcopy.c. */
IFUNC_IMPL (i, name, bcopy,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, bcopy,
+ hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
+ PPC_FEATURE2_HAS_ISEL)
+ && (hwcap & PPC_FEATURE_HAS_VSX),
+ __bcopy_power10)
+#endif
IFUNC_IMPL_ADD (array, i, bcopy, hwcap & PPC_FEATURE_HAS_VSX,
__bcopy_power7)
IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ppc))
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
new file mode 100644
index 0000000000000000..171b32921a0a4d47
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
@@ -0,0 +1,27 @@
+/* Optimized memmove implementation for POWER10.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#define MEMMOVE __memmove_power10
+
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+#undef __bcopy
+#define __bcopy __bcopy_power10
+
+#include <sysdeps/powerpc/powerpc64/le/power10/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
index 0b251d0f5f087874..fb5261ecda64d061 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
@@ -21,7 +21,7 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef bcopy
-#define bcopy __bcopy_power7
+#undef __bcopy
+#define __bcopy __bcopy_power7
#include <sysdeps/powerpc/powerpc64/power7/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove.c b/sysdeps/powerpc/powerpc64/multiarch/memmove.c
index 39987155cc7d3624..2fd7b6d309e4bedd 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove.c
@@ -28,14 +28,22 @@
# include "init-arch.h"
extern __typeof (__redirect_memmove) __libc_memmove;
-
extern __typeof (__redirect_memmove) __memmove_ppc attribute_hidden;
extern __typeof (__redirect_memmove) __memmove_power7 attribute_hidden;
+#ifdef __LITTLE_ENDIAN__
+extern __typeof (__redirect_memmove) __memmove_power10 attribute_hidden;
+#endif
libc_ifunc (__libc_memmove,
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __memmove_power7
- : __memmove_ppc);
+#ifdef __LITTLE_ENDIAN__
+ hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
+ PPC_FEATURE2_HAS_ISEL)
+ && (hwcap & PPC_FEATURE_HAS_VSX)
+ ? __memmove_power10 :
+#endif
+ (hwcap & PPC_FEATURE_HAS_VSX)
+ ? __memmove_power7
+ : __memmove_ppc);
#undef memmove
strong_alias (__libc_memmove, memmove);
diff --git a/sysdeps/powerpc/powerpc64/power7/memmove.S b/sysdeps/powerpc/powerpc64/power7/memmove.S
index b7f3dc28d1a8eac3..9e4cabb07ef9b732 100644
--- a/sysdeps/powerpc/powerpc64/power7/memmove.S
+++ b/sysdeps/powerpc/powerpc64/power7/memmove.S
@@ -832,4 +832,6 @@ ENTRY_TOCLESS (__bcopy)
mr r4,r6
b L(_memmove)
END (__bcopy)
+#ifndef __bcopy
weak_alias (__bcopy, bcopy)
+#endif


@ -0,0 +1,308 @@
commit e941e0ae80626b7661c1db8953a673cafd3b8b19
Author: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Date: Fri Apr 30 18:12:08 2021 -0300
powerpc64le: Optimize memcpy for POWER10
This implementation is based on __memcpy_power8_cached and integrates
suggestions from Anton Blanchard.
It benefits from loads and stores with length for short lengths and for
tail code, simplifying the code.
All unaligned memory accesses use instructions that do not generate
alignment interrupts on POWER10, making it safe to use on
caching-inhibited memory.
The main loop has also been modified in order to increase instruction
throughput by reducing the dependency on updates from previous iterations.
On average, this implementation provides around 30% improvement when
compared to __memcpy_power7 and 10% improvement in comparison to
__memcpy_power8_cached.
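For illustration only (not part of the patch): a simplified C model of the
control flow described above, where a length-limited copy handles both the
head and the tail. The names copy_with_length and model_memcpy are
hypothetical. The model assumes an lxvl/stxvl pair moves at most 16 bytes
and does nothing for a zero length, and it deliberately omits the
destination alignment step and the unrolled 32-byte loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_BYTES ((size_t)16)

/* Hypothetical stand-in for an lxvl/stxvl pair: copies min(n, 16) bytes
   and returns how many bytes were moved. */
static size_t copy_with_length(unsigned char *dst, const unsigned char *src,
                               size_t n)
{
    size_t k = n < VEC_BYTES ? n : VEC_BYTES;
    if (k != 0)
        memcpy(dst, src, k);
    return k;
}

/* Control-flow model only; the real code uses vector registers. */
static void *model_memcpy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(len == 0 || (dst != NULL && src != NULL));

    /* Head: one length-limited copy, then return if len <= 16. */
    size_t done = copy_with_length(d, s, len);
    if (len <= VEC_BYTES)
        return dst;
    d += done;
    s += done;
    len -= done;

    /* Main loop: 128 bytes per iteration. */
    while (len >= 128) {
        memcpy(d, s, 128);
        d += 128;
        s += 128;
        len -= 128;
    }

    /* Remaining full vectors, leaving 16 bytes or fewer for the tail. */
    while (len > VEC_BYTES) {
        memcpy(d, s, VEC_BYTES);
        d += VEC_BYTES;
        s += VEC_BYTES;
        len -= VEC_BYTES;
    }

    /* Tail: a second length-limited copy, a no-op when len is 0. */
    copy_with_length(d, s, len);
    return dst;
}
```

The point of the sketch is that neither the head nor the tail needs a
per-size branch ladder; the length operand absorbs the short cases.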
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memcpy.S b/sysdeps/powerpc/powerpc64/le/power10/memcpy.S
new file mode 100644
index 0000000000000000..ad1414db4a3a8b9f
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power10/memcpy.S
@@ -0,0 +1,198 @@
+/* Optimized memcpy implementation for POWER10.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+
+
+#ifndef MEMCPY
+# define MEMCPY memcpy
+#endif
+
+/* __ptr_t [r3] memcpy (__ptr_t dst [r3], __ptr_t src [r4], size_t len [r5]);
+ Returns 'dst'. */
+
+ .machine power9
+ENTRY_TOCLESS (MEMCPY, 5)
+ CALL_MCOUNT 3
+
+ /* Copy up to 16 bytes. */
+ sldi r6,r5,56 /* Prepare [l|st]xvl counter. */
+ lxvl v10,r4,r6
+ stxvl v10,r3,r6
+ subic. r6,r5,16 /* Return if len <= 16. */
+ blelr
+
+ /* If len >= 256, assume nothing got copied before and copy
+ again. This might cause issues with overlapping memory, but memcpy
+ is not expected to handle overlapping memory. */
+ cmpdi r5,256
+ bge L(copy_ge_256)
+ /* 16 < len < 256 and the first 16 bytes have already been copied. */
+ addi r10,r3,16 /* Keep r3 intact as return value. */
+ addi r4,r4,16
+ subi r5,r5,16
+ b L(copy_lt_256) /* Avoid the main loop if len < 256. */
+
+ .p2align 5
+L(copy_ge_256):
+ mr r10,r3 /* Keep r3 intact as return value. */
+ /* Align dst to 16 bytes. */
+ andi. r9,r10,0xf
+ beq L(dst_is_align_16)
+ lxv v10,0(r4)
+ subfic r12,r9,16
+ subf r5,r12,r5
+ add r4,r4,r12
+ stxv v10,0(r3)
+ add r10,r3,r12
+
+L(dst_is_align_16):
+ srdi r9,r5,7 /* Divide by 128. */
+ mtctr r9
+ addi r6,r4,64
+ addi r7,r10,64
+
+
+ /* Main loop, copy 128 bytes per iteration.
+ Use r6=src+64 and r7=dest+64 in order to reduce the dependency on
+ r4 and r10. */
+ .p2align 5
+L(copy_128):
+
+ lxv v10, 0(r4)
+ lxv v11, 16(r4)
+ lxv v12, 32(r4)
+ lxv v13, 48(r4)
+
+ addi r4,r4,128
+
+ stxv v10, 0(r10)
+ stxv v11, 16(r10)
+ stxv v12, 32(r10)
+ stxv v13, 48(r10)
+
+ addi r10,r10,128
+
+ lxv v10, 0(r6)
+ lxv v11, 16(r6)
+ lxv v12, 32(r6)
+ lxv v13, 48(r6)
+
+ addi r6,r6,128
+
+ stxv v10, 0(r7)
+ stxv v11, 16(r7)
+ stxv v12, 32(r7)
+ stxv v13, 48(r7)
+
+ addi r7,r7,128
+
+ bdnz L(copy_128)
+
+ clrldi. r5,r5,64-7 /* Have we copied everything? */
+ beqlr
+
+ .p2align 5
+L(copy_lt_256):
+ cmpdi r5,16
+ ble L(copy_le_16)
+ srdi. r9,r5,5 /* Divide by 32. */
+ beq L(copy_lt_32)
+ mtctr r9
+ /* Use r6=src+32, r7=dest+32, r8=src+64, r9=dest+64 in order to reduce
+ the dependency on r4 and r10. */
+ addi r6,r4,32
+ addi r7,r10,32
+ addi r8,r4,64
+ addi r9,r10,64
+
+ .p2align 5
+ /* Copy 32 bytes at a time, unaligned.
+ The loop is unrolled 3 times in order to reduce the dependency on
+ r4 and r10, copying up to 96 bytes per iteration. */
+L(copy_32):
+ lxv v10, 0(r4)
+ lxv v11, 16(r4)
+ stxv v10, 0(r10)
+ stxv v11, 16(r10)
+ bdz L(end_copy_32a)
+ addi r4,r4,96
+ addi r10,r10,96
+
+ lxv v10, 0(r6)
+ lxv v11, 16(r6)
+ addi r6,r6,96
+ stxv v10, 0(r7)
+ stxv v11, 16(r7)
+ bdz L(end_copy_32b)
+ addi r7,r7,96
+
+ lxv v12, 0(r8)
+ lxv v13, 16(r8)
+ addi r8,r8,96
+ stxv v12, 0(r9)
+ stxv v13, 16(r9)
+ addi r9,r9,96
+ bdnz L(copy_32)
+
+ clrldi. r5,r5,64-5 /* Have we copied everything? */
+ beqlr
+ cmpdi r5,16
+ ble L(copy_le_16)
+ b L(copy_lt_32)
+
+ .p2align 5
+L(end_copy_32a):
+ clrldi. r5,r5,64-5 /* Have we copied everything? */
+ beqlr
+ /* 32 bytes have been copied since the last update of r4 and r10. */
+ addi r4,r4,32
+ addi r10,r10,32
+ cmpdi r5,16
+ ble L(copy_le_16)
+ b L(copy_lt_32)
+
+ .p2align 5
+L(end_copy_32b):
+ clrldi. r5,r5,64-5 /* Have we copied everything? */
+ beqlr
+ /* The last iteration of the loop copied 64 bytes. Update r4 and r10
+ accordingly. */
+ addi r4,r4,-32
+ addi r10,r10,-32
+ cmpdi r5,16
+ ble L(copy_le_16)
+
+ .p2align 5
+L(copy_lt_32):
+ lxv v10, 0(r4)
+ stxv v10, 0(r10)
+ addi r4,r4,16
+ addi r10,r10,16
+ subi r5,r5,16
+
+ .p2align 5
+L(copy_le_16):
+ sldi r6,r5,56
+ lxvl v10,r4,r6
+ stxvl v10,r10,r6
+ blr
+
+
+END_GEN_TB (MEMCPY,TB_TOCLESS)
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 66f8c6ace9824d4a..2e3c8f2e8a81cda4 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -32,7 +32,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
strncase-power8
ifneq (,$(filter %le,$(config-machine)))
-sysdep_routines += memmove-power10 \
+sysdep_routines += memcpy-power10 memmove-power10 \
strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9 \
strlen-power10
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 4ce04bc51574cca1..9d5a14e480c02171 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -51,6 +51,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
#ifdef SHARED
/* Support sysdeps/powerpc/powerpc64/multiarch/memcpy.c. */
IFUNC_IMPL (i, name, memcpy,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, memcpy,
+ hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap & PPC_FEATURE_HAS_VSX,
+ __memcpy_power10)
+#endif
IFUNC_IMPL_ADD (array, i, memcpy, hwcap2 & PPC_FEATURE2_ARCH_2_07,
__memcpy_power8_cached)
IFUNC_IMPL_ADD (array, i, memcpy, hwcap & PPC_FEATURE_HAS_VSX,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memcpy-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memcpy-power10.S
new file mode 100644
index 0000000000000000..70e0fc3ed610cdc3
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/memcpy-power10.S
@@ -0,0 +1,26 @@
+/* Optimized memcpy implementation for POWER10.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#if defined __LITTLE_ENDIAN__ && IS_IN (libc)
+#define MEMCPY __memcpy_power10
+
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+#include <sysdeps/powerpc/powerpc64/le/power10/memcpy.S>
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memcpy.c b/sysdeps/powerpc/powerpc64/multiarch/memcpy.c
index 44dea594f3770673..be0e47f32dde2ccf 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memcpy.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memcpy.c
@@ -36,8 +36,15 @@ extern __typeof (__redirect_memcpy) __memcpy_power6 attribute_hidden;
extern __typeof (__redirect_memcpy) __memcpy_a2 attribute_hidden;
extern __typeof (__redirect_memcpy) __memcpy_power7 attribute_hidden;
extern __typeof (__redirect_memcpy) __memcpy_power8_cached attribute_hidden;
+# if defined __LITTLE_ENDIAN__
+extern __typeof (__redirect_memcpy) __memcpy_power10 attribute_hidden;
+# endif
libc_ifunc (__libc_memcpy,
+# if defined __LITTLE_ENDIAN__
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1 && hwcap & PPC_FEATURE_HAS_VSX)
+ ? __memcpy_power10 :
+# endif
((hwcap2 & PPC_FEATURE2_ARCH_2_07) && use_cached_memopt)
? __memcpy_power8_cached :
(hwcap & PPC_FEATURE_HAS_VSX)


@ -0,0 +1,420 @@
commit 23fdf8178cce3c2ec320dd5eca8b544245bcaef0
Author: Raoni Fassina Firmino <raoni@linux.ibm.com>
Date: Fri Apr 30 18:12:08 2021 -0300
powerpc64le: Optimize memset for POWER10
This implementation is based on __memset_power8 and integrates a lot
of suggestions from Anton Blanchard.
The biggest difference is that it makes extensive use of stxvl in the
alignment and tail code to avoid branches and small stores. It has
three main execution paths:
a) "Short lengths" for lengths up to 64 bytes, avoiding as many
branches as possible.
b) "General case" for larger lengths, with an alignment section
using stxvl to avoid branches, a 128-byte loop and then tail
code, again using stxvl with few branches.
c) "Zeroing cache blocks" for lengths from 256 bytes upwards when the
set value is zero. It is mostly the __memset_power8 code, but the
alignment phase was simplified because, at this point, the address is
already 16-byte aligned, and it was changed to use vector stores.
The tail code was also simplified to reuse the general case tail.
All unaligned stores use stxvl instructions that do not generate
alignment interrupts on POWER10, making it safe to use on
caching-inhibited memory.
On average, this implementation provides an improvement of around 30%
when compared to __memset_power8.
Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
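For illustration only (not part of the patch): path (a) above can be
modelled in C as four length-limited stores whose lengths are obtained by
saturating subtraction, so that no per-size branch is needed. The names
store_with_length, sat_sub16 and model_memset_short are hypothetical; the
model assumes, as the short path relies on, that a store with length writes
at most 16 bytes and does nothing for a zero length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for stxvl: writes min(n, 16) copies of c starting
   at base[off]; a zero length writes nothing. */
static void store_with_length(unsigned char *base, size_t off,
                              unsigned char c, size_t n)
{
    size_t k = n < 16 ? n : 16;
    for (size_t i = 0; i < k; i++)
        base[off + i] = c;
}

/* Saturating "n - 16", mirroring the sub./isellt pairs. */
static size_t sat_sub16(size_t n)
{
    return n > 16 ? n - 16 : 0;
}

/* Model of the short path, valid for n <= 64 only. */
static void *model_memset_short(void *s, int c, size_t n)
{
    unsigned char *p = s;
    unsigned char b = (unsigned char)c;

    assert(n <= 64);

    size_t n1 = sat_sub16(n);   /* bytes left after the first vector  */
    size_t n2 = sat_sub16(n1);  /* bytes left after the second vector */
    size_t n3 = sat_sub16(n2);  /* bytes left after the third vector  */

    store_with_length(p, 0, b, n);
    store_with_length(p, 16, b, n1);
    store_with_length(p, 32, b, n2);
    store_with_length(p, 48, b, n3);
    return s;
}
```

For n = 20, for example, the four stores in this model write 16, 4, 0 and 0
bytes respectively.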
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memset.S b/sysdeps/powerpc/powerpc64/le/power10/memset.S
new file mode 100644
index 0000000000000000..6b8e2cfdaf25fd30
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power10/memset.S
@@ -0,0 +1,256 @@
+/* Optimized memset implementation for POWER10 LE.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+
+/* void * [r3] memset (void *s [r3], int c [r4], size_t n [r5]);
+ Returns 's'. */
+
+#ifndef MEMSET
+# define MEMSET memset
+#endif
+
+ .machine power9
+ENTRY_TOCLESS (MEMSET, 5)
+ CALL_MCOUNT 3
+
+L(_memset):
+ /* Assume memset of zero length is uncommon, and just let it go
+ through the small path below. */
+ cmpldi r5,64
+
+ /* Replicate byte to quad word. */
+ mtvsrd v0+32,r4
+ vspltb v0,v0,7
+
+ li r7,16
+ sldi r8,r7,56
+
+ bgt L(large)
+
+ /* For short lengths we want to avoid as many branches as possible.
+ We use store VSX vector with length instructions to do this.
+ It takes advantage of the fact that if the length passed to stxvl
+ is zero nothing is done, effectively a no-op. */
+ sldi r5,r5,56
+
+ addi r10,r3,16
+
+ sub. r11,r5,r8
+ isellt r11,0,r11 /* Saturate the subtraction to zero. */
+
+ stxvl v0+32,r3,r5
+ stxvl v0+32,r10,r11
+
+ addi r9,r3,32
+ addi r10,r3,48
+
+ sub. r11,r11,r8
+ isellt r11,0,r11
+
+ sub. r5,r11,r8
+ isellt r5,0,r5
+
+ stxvl v0+32,r9,r11
+ stxvl v0+32,r10,r5
+
+ blr
+
+ .balign 16
+L(large):
+ mr r6,r3 /* Don't modify r3 since we need to return it. */
+
+ /* Get dest 16B aligned. */
+ neg r0,r3
+ clrldi. r7,r0,(64-4)
+ beq L(aligned)
+ rldic r9,r0,56,4 /* (~X & 0xf)<<56 "clrlsldi r9,r0,64-4,56". */
+
+ stxvl v0+32,r6,r9 /* Store up to 15B until aligned address. */
+
+ add r6,r6,r7
+ sub r5,r5,r7
+
+ /* Go to tail if there is less than 64B left after alignment. */
+ cmpldi r5,64
+ blt L(tail_64)
+
+ .balign 16
+L(aligned):
+ /* Go to tail if there is less than 128B left after alignment. */
+ srdi. r0,r5,7
+ beq L(tail_128)
+
+ /* If c == 0 && n >= 256 use dcbz to zero out full cache blocks. */
+ cmpldi cr5,r5,255
+ cmpldi cr6,r4,0
+ crand 27,26,21
+ bt 27,L(dcbz)
+
+ mtctr r0
+
+ .balign 32
+L(loop):
+ stxv v0+32,0(r6)
+ stxv v0+32,16(r6)
+ stxv v0+32,32(r6)
+ stxv v0+32,48(r6)
+ stxv v0+32,64(r6)
+ stxv v0+32,80(r6)
+ stxv v0+32,96(r6)
+ stxv v0+32,112(r6)
+ addi r6,r6,128
+ bdnz L(loop)
+
+ .balign 16
+L(tail):
+ /* 127B or less left, finish the tail or return. */
+ andi. r5,r5,127
+ beqlr
+
+ cmpldi r5,64
+ blt L(tail_64)
+
+ .balign 16
+L(tail_128):
+ /* Stores a minimum of 64B and up to 128B and return. */
+ stxv v0+32,0(r6)
+ stxv v0+32,16(r6)
+ stxv v0+32,32(r6)
+ stxv v0+32,48(r6)
+ addi r6,r6,64
+ andi. r5,r5,63
+ beqlr
+
+ .balign 16
+L(tail_64):
+ /* Stores up to 64B and return. */
+ sldi r5,r5,56
+
+ addi r10,r6,16
+
+ sub. r11,r5,r8
+ isellt r11,0,r11
+
+ stxvl v0+32,r6,r5
+ stxvl v0+32,r10,r11
+
+ sub. r11,r11,r8
+ blelr
+
+ addi r9,r6,32
+ addi r10,r6,48
+
+ isellt r11,0,r11
+
+ sub. r5,r11,r8
+ isellt r5,0,r5
+
+ stxvl v0+32,r9,r11
+ stxvl v0+32,r10,r5
+
+ blr
+
+ .balign 16
+L(dcbz):
+ /* Special case when value is 0 and we have a long length to deal
+ with. Use dcbz to zero out a full cacheline of 128 bytes at a time.
+ Before using dcbz though, we need to get the destination 128-byte
+ aligned. */
+ neg r0,r6
+ clrldi. r0,r0,(64-7)
+ beq L(dcbz_aligned)
+
+ sub r5,r5,r0
+ mtocrf 0x2,r0 /* copying bits 57..59 to cr6. The ones for sizes 64,
+ 32 and 16 which need to be checked. */
+
+ /* Write 16-128 bytes until DST is aligned to 128 bytes. */
+64: bf 25,32f
+ stxv v0+32,0(r6)
+ stxv v0+32,16(r6)
+ stxv v0+32,32(r6)
+ stxv v0+32,48(r6)
+ addi r6,r6,64
+
+32: bf 26,16f
+ stxv v0+32,0(r6)
+ stxv v0+32,16(r6)
+ addi r6,r6,32
+
+16: bf 27,L(dcbz_aligned)
+ stxv v0+32,0(r6)
+ addi r6,r6,16
+
+ .balign 16
+L(dcbz_aligned):
+ /* Setup dcbz unroll offsets and count numbers. */
+ srdi. r0,r5,9
+ li r9,128
+ beq L(bcdz_tail)
+ li r10,256
+ li r11,384
+ mtctr r0
+
+ .balign 16
+L(dcbz_loop):
+ /* Sets 512 bytes to zero in each iteration, the loop unrolling shows
+ a throughput boost for large sizes (2048 bytes or higher). */
+ dcbz 0,r6
+ dcbz r9,r6
+ dcbz r10,r6
+ dcbz r11,r6
+ addi r6,r6,512
+ bdnz L(dcbz_loop)
+
+ andi. r5,r5,511
+ beqlr
+
+ .balign 16
+L(bcdz_tail):
+ /* We have 1-511 bytes remaining. */
+ srdi. r0,r5,7
+ beq L(tail)
+
+ mtocrf 0x1,r0
+
+256: bf 30,128f
+ dcbz 0,r6
+ dcbz r9,r6
+ addi r6,r6,256
+
+128: bf 31,L(tail)
+ dcbz 0,r6
+ addi r6,r6,128
+
+ b L(tail)
+
+END_GEN_TB (MEMSET,TB_TOCLESS)
+libc_hidden_builtin_def (memset)
+
+/* Copied from bzero.S to prevent the linker from inserting a stub
+ between bzero and memset. */
+ENTRY_TOCLESS (__bzero)
+ CALL_MCOUNT 2
+ mr r5,r4
+ li r4,0
+ b L(_memset)
+END (__bzero)
+#ifndef __bzero
+weak_alias (__bzero, bzero)
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 2e3c8f2e8a81cda4..1d517698429e1230 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -32,7 +32,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
strncase-power8
ifneq (,$(filter %le,$(config-machine)))
-sysdep_routines += memcpy-power10 memmove-power10 \
+sysdep_routines += memcpy-power10 memmove-power10 memset-power10 \
strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9 \
strlen-power10
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bzero.c b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
index f8cb05bea8a3505b..4ce98e324d12a31e 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/bzero.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
@@ -27,8 +27,16 @@ extern __typeof (bzero) __bzero_power4 attribute_hidden;
extern __typeof (bzero) __bzero_power6 attribute_hidden;
extern __typeof (bzero) __bzero_power7 attribute_hidden;
extern __typeof (bzero) __bzero_power8 attribute_hidden;
+# ifdef __LITTLE_ENDIAN__
+extern __typeof (bzero) __bzero_power10 attribute_hidden;
+# endif
libc_ifunc (__bzero,
+# ifdef __LITTLE_ENDIAN__
+ (hwcap2 & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_HAS_ISEL)
+ && hwcap & PPC_FEATURE_HAS_VSX)
+ ? __bzero_power10 :
+# endif
(hwcap2 & PPC_FEATURE2_ARCH_2_07)
? __bzero_power8 :
(hwcap & PPC_FEATURE_HAS_VSX)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 9d5a14e480c02171..11532f77d4d03b2a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -86,6 +86,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/memset.c. */
IFUNC_IMPL (i, name, memset,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, memset,
+ hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
+ PPC_FEATURE2_HAS_ISEL)
+ && hwcap & PPC_FEATURE_HAS_VSX,
+ __memset_power10)
+#endif
IFUNC_IMPL_ADD (array, i, memset, hwcap2 & PPC_FEATURE2_ARCH_2_07,
__memset_power8)
IFUNC_IMPL_ADD (array, i, memset, hwcap & PPC_FEATURE_HAS_VSX,
@@ -187,6 +194,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/bzero.c. */
IFUNC_IMPL (i, name, bzero,
+#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, bzero,
+ hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
+ PPC_FEATURE2_HAS_ISEL)
+ && hwcap & PPC_FEATURE_HAS_VSX,
+ __bzero_power10)
+#endif
IFUNC_IMPL_ADD (array, i, bzero, hwcap2 & PPC_FEATURE2_ARCH_2_07,
__bzero_power8)
IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
new file mode 100644
index 0000000000000000..548e99789735296c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
@@ -0,0 +1,27 @@
+/* Optimized memset implementation for POWER10 LE.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#define MEMSET __memset_power10
+
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+#undef __bzero
+#define __bzero __bzero_power10
+
+#include <sysdeps/powerpc/powerpc64/le/power10/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset.c b/sysdeps/powerpc/powerpc64/multiarch/memset.c
index 1a7c46fecf78ab1f..4c97622c7d7eb8aa 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset.c
@@ -33,10 +33,18 @@ extern __typeof (__redirect_memset) __memset_power4 attribute_hidden;
extern __typeof (__redirect_memset) __memset_power6 attribute_hidden;
extern __typeof (__redirect_memset) __memset_power7 attribute_hidden;
extern __typeof (__redirect_memset) __memset_power8 attribute_hidden;
+# ifdef __LITTLE_ENDIAN__
+extern __typeof (__redirect_memset) __memset_power10 attribute_hidden;
+# endif
/* Avoid DWARF definition DIE on ifunc symbol so that GDB can handle
ifunc symbol properly. */
libc_ifunc (__libc_memset,
+# ifdef __LITTLE_ENDIAN__
+ (hwcap2 & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_HAS_ISEL)
+ && hwcap & PPC_FEATURE_HAS_VSX)
+ ? __memset_power10 :
+# endif
(hwcap2 & PPC_FEATURE2_ARCH_2_07)
? __memset_power8 :
(hwcap & PPC_FEATURE_HAS_VSX)

@ -0,0 +1,131 @@
commit 17a73a6d8b4c46f3e87fc53c7c25fa7cec01d707
Author: Raoni Fassina Firmino <raoni@linux.ibm.com>
Date: Mon May 3 16:59:35 2021 -0300

powerpc64le: Fix ifunc selection for memset, memmove, bzero and bcopy

The hwcap2 check for the aforementioned functions should require both
PPC_FEATURE2_ARCH_3_1 and PPC_FEATURE2_HAS_ISEL, but it mistakenly
accepted either one of them, enabling the ISA 3.1 versions of the
functions on incompatible processors, like POWER8.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
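The difference between the two tests can be reproduced in isolation. The
sketch below is only an illustration: the bit values are placeholders (the
real PPC_FEATURE2_* constants come from the system headers and are not
reproduced here), and both function names are hypothetical.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; NOT the real constants.  */
#define FEATURE2_ARCH_3_1  0x1UL
#define FEATURE2_HAS_ISEL  0x2UL

/* Old test: true when at least one of the two bits is set.  */
bool
selects_power10_old (unsigned long hwcap2)
{
  return (hwcap2 & (FEATURE2_ARCH_3_1 | FEATURE2_HAS_ISEL)) != 0;
}

/* Fixed test: true only when both bits are set.  */
bool
selects_power10_fixed (unsigned long hwcap2)
{
  return (hwcap2 & FEATURE2_ARCH_3_1) != 0
         && (hwcap2 & FEATURE2_HAS_ISEL) != 0;
}
```

A processor that reports only the ISEL bit passes the old test but is
rejected by the fixed one, which is the situation the commit message
describes for POWER8.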
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
index 705fef33d4e57557..3c6528e5dbccfdbd 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
@@ -28,10 +28,10 @@ extern __typeof (bcopy) __bcopy_power10 attribute_hidden;
libc_ifunc (bcopy,
#ifdef __LITTLE_ENDIAN__
- hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
- PPC_FEATURE2_HAS_ISEL)
- && (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bcopy_power10 :
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
+ && hwcap & PPC_FEATURE_HAS_VSX)
+ ? __bcopy_power10 :
#endif
(hwcap & PPC_FEATURE_HAS_VSX)
? __bcopy_power7
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bzero.c b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
index 4ce98e324d12a31e..b08b381b4a3999f1 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/bzero.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
@@ -33,7 +33,8 @@ extern __typeof (bzero) __bzero_power10 attribute_hidden;
libc_ifunc (__bzero,
# ifdef __LITTLE_ENDIAN__
- (hwcap2 & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_HAS_ISEL)
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
&& hwcap & PPC_FEATURE_HAS_VSX)
? __bzero_power10 :
# endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 11532f77d4d03b2a..6e36659d1903448a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -75,9 +75,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
IFUNC_IMPL (i, name, memmove,
#ifdef __LITTLE_ENDIAN__
IFUNC_IMPL_ADD (array, i, memmove,
- hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
- PPC_FEATURE2_HAS_ISEL)
- && (hwcap & PPC_FEATURE_HAS_VSX),
+ hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
+ && hwcap & PPC_FEATURE_HAS_VSX,
__memmove_power10)
#endif
IFUNC_IMPL_ADD (array, i, memmove, hwcap & PPC_FEATURE_HAS_VSX,
@@ -88,8 +88,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
IFUNC_IMPL (i, name, memset,
#ifdef __LITTLE_ENDIAN__
IFUNC_IMPL_ADD (array, i, memset,
- hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
- PPC_FEATURE2_HAS_ISEL)
+ hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
&& hwcap & PPC_FEATURE_HAS_VSX,
__memset_power10)
#endif
@@ -196,8 +196,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
IFUNC_IMPL (i, name, bzero,
#ifdef __LITTLE_ENDIAN__
IFUNC_IMPL_ADD (array, i, bzero,
- hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
- PPC_FEATURE2_HAS_ISEL)
+ hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
&& hwcap & PPC_FEATURE_HAS_VSX,
__bzero_power10)
#endif
@@ -215,9 +215,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
IFUNC_IMPL (i, name, bcopy,
#ifdef __LITTLE_ENDIAN__
IFUNC_IMPL_ADD (array, i, bcopy,
- hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
- PPC_FEATURE2_HAS_ISEL)
- && (hwcap & PPC_FEATURE_HAS_VSX),
+ hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
+ && hwcap & PPC_FEATURE_HAS_VSX,
__bcopy_power10)
#endif
IFUNC_IMPL_ADD (array, i, bcopy, hwcap & PPC_FEATURE_HAS_VSX,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove.c b/sysdeps/powerpc/powerpc64/multiarch/memmove.c
index 2fd7b6d309e4bedd..27895faad0cab40e 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove.c
@@ -36,10 +36,10 @@ extern __typeof (__redirect_memmove) __memmove_power10 attribute_hidden;
libc_ifunc (__libc_memmove,
#ifdef __LITTLE_ENDIAN__
- hwcap2 & (PPC_FEATURE2_ARCH_3_1 |
- PPC_FEATURE2_HAS_ISEL)
- && (hwcap & PPC_FEATURE_HAS_VSX)
- ? __memmove_power10 :
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
+ && hwcap & PPC_FEATURE_HAS_VSX)
+ ? __memmove_power10 :
#endif
(hwcap & PPC_FEATURE_HAS_VSX)
? __memmove_power7
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset.c b/sysdeps/powerpc/powerpc64/multiarch/memset.c
index 4c97622c7d7eb8aa..685623ae870a0725 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset.c
@@ -41,7 +41,8 @@ extern __typeof (__redirect_memset) __memset_power10 attribute_hidden;
ifunc symbol properly. */
libc_ifunc (__libc_memset,
# ifdef __LITTLE_ENDIAN__
- (hwcap2 & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_HAS_ISEL)
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1
+ && hwcap2 & PPC_FEATURE2_HAS_ISEL
&& hwcap & PPC_FEATURE_HAS_VSX)
? __memset_power10 :
# endif

@ -0,0 +1,387 @@
commit 1a594aa986ffe28657a03baa5c53c0a0e7dc2ecd
Author: Matheus Castanho <msc@linux.ibm.com>
Date: Tue May 11 17:53:07 2021 -0300

powerpc: Add optimized rawmemchr for POWER10

Reuse code for optimized strlen to implement a faster version of rawmemchr.
This takes advantage of the same benefits provided by the strlen implementation,
but needs some extra steps. __strlen_power10 code should be unchanged after this
change.

rawmemchr returns a pointer to the char found, while strlen returns only the
length, so we have to take that into account when preparing the return value.

To quickly check 64B, the loop on __strlen_power10 merges the whole block into
16B by using unsigned minimum vector operations (vminub) and checks if there are
any \0 in the resulting vector. The same code is used by rawmemchr if the char c
is 0. However, this approach does not work when c != 0. We first need to
subtract c from each byte, so that the value we are looking for is converted to
a 0; then taking the minimum and checking for nulls works again.

The new code branches after it has compared ~256 bytes and chooses which of the
two strategies above will be used in the main loop, based on the char c. This
extra branch adds some overhead (~5%) for length ~256, but is quickly amortized
by the faster loop for larger sizes.

Compared to __rawmemchr_power9, this version is ~20% faster for length < 256.
Because of the optimized main loop, the improvement becomes ~35% for c != 0
and ~50% for c = 0 for strings longer than 256.

Reviewed-by: Lucas A. M. Magalhaes <lamm@linux.ibm.com>
Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
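The subtract-then-minimum idea can be modelled in plain C. This is a scalar
sketch for illustration, not the vector code from the patch: the function
name block64_contains is hypothetical, and each loop iteration stands in for
one byte lane of the four 16B vectors.

```c
#include <stddef.h>

/* Scalar model of one 64-byte check: returns nonzero if byte C occurs
   anywhere in BLOCK[0..63].  */
int
block64_contains (const unsigned char *block, unsigned char c)
{
  unsigned char merged[16];

  for (size_t i = 0; i < 16; i++)
    {
      /* Modular subtraction (the role of vsububm): a byte equal to C
         becomes 0, every other byte becomes nonzero.  */
      unsigned char a = (unsigned char) (block[i] - c);
      unsigned char b = (unsigned char) (block[i + 16] - c);
      unsigned char d = (unsigned char) (block[i + 32] - c);
      unsigned char e = (unsigned char) (block[i + 48] - c);

      /* Unsigned minimum (the role of vminub): a 0 in any of the four
         lanes survives the merge.  */
      unsigned char m1 = a < b ? a : b;
      unsigned char m2 = d < e ? d : e;
      merged[i] = m1 < m2 ? m1 : m2;
    }

  /* One 16-byte zero test now answers the question for all 64 bytes.  */
  for (size_t i = 0; i < 16; i++)
    if (merged[i] == 0)
      return 1;
  return 0;
}
```

When c is 0 the subtraction changes nothing, which is why the strlen loop
can be reused unmodified for that case.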
diff --git a/sysdeps/powerpc/powerpc64/le/power10/rawmemchr.S b/sysdeps/powerpc/powerpc64/le/power10/rawmemchr.S
new file mode 100644
index 0000000000000000..5351c2634f6086bf
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power10/rawmemchr.S
@@ -0,0 +1,22 @@
+/* Optimized rawmemchr implementation for POWER10 LE.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+
+#define USE_AS_RAWMEMCHR 1
+#include <sysdeps/powerpc/powerpc64/le/power10/strlen.S>
diff --git a/sysdeps/powerpc/powerpc64/le/power10/strlen.S b/sysdeps/powerpc/powerpc64/le/power10/strlen.S
index ca7e9eb3d84c9b00..dda5282f1b9a07cf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/strlen.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/strlen.S
@@ -18,10 +18,50 @@
#include <sysdep.h>
-#ifndef STRLEN
-# define STRLEN __strlen
-# define DEFINE_STRLEN_HIDDEN_DEF 1
-#endif
+/* To reuse the code for rawmemchr, we have some extra steps compared to the
+ strlen implementation:
+ - Sum the initial value of r3 with the position at which the char was
+ found, to guarantee we return a pointer and not the length.
+ - In the main loop, subtract each byte by the char we are looking for,
+ so we can keep using vminub to quickly check 64B at once. */
+#ifdef USE_AS_RAWMEMCHR
+# ifndef RAWMEMCHR
+# define FUNCNAME __rawmemchr
+# else
+# define FUNCNAME RAWMEMCHR
+# endif
+# define MCOUNT_NARGS 2
+# define VREG_ZERO v20
+# define OFF_START_LOOP 256
+# define RAWMEMCHR_SUBTRACT_VECTORS \
+ vsububm v4,v4,v18; \
+ vsububm v5,v5,v18; \
+ vsububm v6,v6,v18; \
+ vsububm v7,v7,v18;
+# define TAIL(vreg,increment) \
+ vctzlsbb r4,vreg; \
+ addi r4,r4,increment; \
+ add r3,r5,r4; \
+ blr
+
+#else /* strlen */
+
+# ifndef STRLEN
+# define FUNCNAME __strlen
+# define DEFINE_STRLEN_HIDDEN_DEF 1
+# else
+# define FUNCNAME STRLEN
+# endif
+# define MCOUNT_NARGS 1
+# define VREG_ZERO v18
+# define OFF_START_LOOP 192
+# define TAIL(vreg,increment) \
+ vctzlsbb r4,vreg; \
+ subf r3,r3,r5; \
+ addi r4,r4,increment; \
+ add r3,r3,r4; \
+ blr
+#endif /* USE_AS_RAWMEMCHR */
/* TODO: Replace macros by the actual instructions when minimum binutils becomes
>= 2.35. This is used to keep compatibility with older versions. */
@@ -50,33 +90,41 @@
li r6,offset; \
LXVP(v4+32,offset,addr); \
LXVP(v6+32,offset+32,addr); \
+ RAWMEMCHR_SUBTRACT_VECTORS; \
vminub v14,v4,v5; \
vminub v15,v6,v7; \
vminub v16,v14,v15; \
- vcmpequb. v0,v16,v18; \
+ vcmpequb. v0,v16,VREG_ZERO; \
bne cr6,L(label)
-#define TAIL(vreg,increment) \
- vctzlsbb r4,vreg; \
- subf r3,r3,r5; \
- addi r4,r4,increment; \
- add r3,r3,r4; \
- blr
-
/* Implements the function
int [r3] strlen (const void *s [r3])
+ but when USE_AS_RAWMEMCHR is set, implements the function
+
+ void* [r3] rawmemchr (const void *s [r3], int c [r4])
+
The implementation can load bytes past a matching byte, but only
up to the next 64B boundary, so it never crosses a page. */
.machine power9
-ENTRY_TOCLESS (STRLEN, 4)
- CALL_MCOUNT 1
+ENTRY_TOCLESS (FUNCNAME, 4)
+ CALL_MCOUNT MCOUNT_NARGS
- vspltisb v18,0
+#ifdef USE_AS_RAWMEMCHR
+ xori r5,r4,0xff
+
+ mtvsrd v18+32,r4 /* matching char in v18 */
+ mtvsrd v19+32,r5 /* non matching char in v19 */
+
+ vspltb v18,v18,7 /* replicate */
+ vspltb v19,v19,7 /* replicate */
+#else
vspltisb v19,-1
+#endif
+ vspltisb VREG_ZERO,0
/* Next 16B-aligned address. Prepare address for L(aligned). */
addi r5,r3,16
@@ -90,16 +138,25 @@ ENTRY_TOCLESS (STRLEN, 4)
vcmpequb. v6,v0,v18
beq cr6,L(aligned)
+#ifdef USE_AS_RAWMEMCHR
+ vctzlsbb r6,v6
+ add r3,r3,r6
+#else
vctzlsbb r3,v6
+#endif
blr
- /* Test next 176B, 16B at a time. The main loop is optimized for longer
- strings, so checking the first bytes in 16B chunks benefits a lot
- small strings. */
+ /* Test up to OFF_START_LOOP-16 bytes in 16B chunks. The main loop is
+ optimized for longer strings, so checking the first bytes in 16B
+ chunks benefits small strings a lot. */
.p2align 5
L(aligned):
+#ifdef USE_AS_RAWMEMCHR
+ cmpdi cr5,r4,0 /* Check if c == 0. This will be useful to
+ choose how we will perform the main loop. */
+#endif
/* Prepare address for the loop. */
- addi r4,r3,192
+ addi r4,r3,OFF_START_LOOP
clrrdi r4,r4,6
CHECK16(v0,0,r5,tail1)
@@ -113,15 +170,43 @@ L(aligned):
CHECK16(v8,128,r5,tail9)
CHECK16(v9,144,r5,tail10)
CHECK16(v10,160,r5,tail11)
+#ifdef USE_AS_RAWMEMCHR
+ CHECK16(v0,176,r5,tail12)
+ CHECK16(v1,192,r5,tail13)
+ CHECK16(v2,208,r5,tail14)
+ CHECK16(v3,224,r5,tail15)
+#endif
addi r5,r4,128
+#ifdef USE_AS_RAWMEMCHR
+ /* If c == 0, use the same loop as strlen, without the vsububm. */
+ beq cr5,L(loop)
+
+ /* This is very similar to the block after L(loop), the difference is
+ that here RAWMEMCHR_SUBTRACT_VECTORS is not empty, and we subtract
+ each byte loaded by the char we are looking for, this way we can keep
+ using vminub to merge the results and checking for nulls. */
+ .p2align 5
+L(rawmemchr_loop):
+ CHECK64(0,r4,pre_tail_64b)
+ CHECK64(64,r4,pre_tail_64b)
+ addi r4,r4,256
+
+ CHECK64(0,r5,tail_64b)
+ CHECK64(64,r5,tail_64b)
+ addi r5,r5,256
+
+ b L(rawmemchr_loop)
+#endif
/* Switch to a more aggressive approach checking 64B each time. Use 2
pointers 128B apart and unroll the loop once to make the pointer
updates and usages separated enough to avoid stalls waiting for
address calculation. */
.p2align 5
L(loop):
+#undef RAWMEMCHR_SUBTRACT_VECTORS
+#define RAWMEMCHR_SUBTRACT_VECTORS /* nothing */
CHECK64(0,r4,pre_tail_64b)
CHECK64(64,r4,pre_tail_64b)
addi r4,r4,256
@@ -140,10 +225,10 @@ L(tail_64b):
block and mark it in its corresponding VR. lxvp vx,0(ry) puts the
low 16B bytes into vx+1, and the high into vx, so the order here is
v5, v4, v7, v6. */
- vcmpequb v1,v5,v18
- vcmpequb v2,v4,v18
- vcmpequb v3,v7,v18
- vcmpequb v4,v6,v18
+ vcmpequb v1,v5,VREG_ZERO
+ vcmpequb v2,v4,VREG_ZERO
+ vcmpequb v3,v7,VREG_ZERO
+ vcmpequb v4,v6,VREG_ZERO
/* Take into account the other 64B blocks we had already checked. */
add r5,r5,r6
@@ -165,7 +250,9 @@ L(tail_64b):
or r10,r8,r7
cnttzd r0,r10 /* Count trailing zeros before the match. */
+#ifndef USE_AS_RAWMEMCHR
subf r5,r3,r5
+#endif
add r3,r5,r0 /* Compute final length. */
blr
@@ -213,9 +300,32 @@ L(tail10):
L(tail11):
TAIL(v10,160)
-END (STRLEN)
+#ifdef USE_AS_RAWMEMCHR
+ .p2align 5
+L(tail12):
+ TAIL(v0,176)
+
+ .p2align 5
+L(tail13):
+ TAIL(v1,192)
+
+ .p2align 5
+L(tail14):
+ TAIL(v2,208)
+
+ .p2align 5
+L(tail15):
+ TAIL(v3,224)
+#endif
+
+END (FUNCNAME)
-#ifdef DEFINE_STRLEN_HIDDEN_DEF
+#ifdef USE_AS_RAWMEMCHR
+weak_alias (__rawmemchr,rawmemchr)
+libc_hidden_builtin_def (__rawmemchr)
+#else
+# ifdef DEFINE_STRLEN_HIDDEN_DEF
weak_alias (__strlen, strlen)
libc_hidden_builtin_def (strlen)
+# endif
#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 1d517698429e1230..ac2446aca62cc4ab 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -33,9 +33,9 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
ifneq (,$(filter %le,$(config-machine)))
sysdep_routines += memcpy-power10 memmove-power10 memset-power10 \
+ rawmemchr-power9 rawmemchr-power10 \
strcmp-power9 strncmp-power9 strcpy-power9 stpcpy-power9 \
- rawmemchr-power9 strlen-power9 strncpy-power9 stpncpy-power9 \
- strlen-power10
+ strlen-power9 strncpy-power9 stpncpy-power9 strlen-power10
endif
CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 6e36659d1903448a..127af84b32a8196f 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -257,6 +257,10 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
/* Support sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c. */
IFUNC_IMPL (i, name, rawmemchr,
#ifdef __LITTLE_ENDIAN__
+ IFUNC_IMPL_ADD (array, i, rawmemchr,
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1)
+ && (hwcap & PPC_FEATURE_HAS_VSX),
+ __rawmemchr_power10)
IFUNC_IMPL_ADD (array, i, rawmemchr,
hwcap2 & PPC_FEATURE2_ARCH_3_00,
__rawmemchr_power9)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power10.S b/sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power10.S
new file mode 100644
index 0000000000000000..bf1ed7e1941f922d
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power10.S
@@ -0,0 +1,21 @@
+/* Optimized rawmemchr implementation for PowerPC64/POWER10.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#define RAWMEMCHR __rawmemchr_power10
+
+#include <sysdeps/powerpc/powerpc64/le/power10/rawmemchr.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c b/sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c
index 2a7ae5a1ed02e556..369d6359e8987052 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c
@@ -26,6 +26,7 @@ extern __typeof (__rawmemchr) __rawmemchr_ppc attribute_hidden;
extern __typeof (__rawmemchr) __rawmemchr_power7 attribute_hidden;
# ifdef __LITTLE_ENDIAN__
extern __typeof (__rawmemchr) __rawmemchr_power9 attribute_hidden;
+extern __typeof (__rawmemchr) __rawmemchr_power10 attribute_hidden;
# endif
# undef __rawmemchr
@@ -34,6 +35,9 @@ extern __typeof (__rawmemchr) __rawmemchr_power9 attribute_hidden;
ifunc symbol properly. */
libc_ifunc_redirected (__redirect___rawmemchr, __rawmemchr,
# ifdef __LITTLE_ENDIAN__
+ (hwcap2 & PPC_FEATURE2_ARCH_3_1)
+ && (hwcap & PPC_FEATURE_HAS_VSX)
+ ? __rawmemchr_power10 :
(hwcap2 & PPC_FEATURE2_ARCH_3_00)
? __rawmemchr_power9 :
# endif

@ -0,0 +1,159 @@
nptl: Add __pthread_attr_copy for copying pthread_attr_t objects

Also add the private type union pthread_attr_transparent, to reduce
the amount of casting that is required.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 331c6e8a184167dd21a9f0b3fc165aeefea6eeca)

Difference from upstream:
Unlike upstream, __pthread_attr_copy is in libpthread.so.

# Conflicts:
# nptl/Makefile
# nptl/Versions
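The "transparent union" pattern and the copy-into-a-temporary approach can
be sketched with stand-in types. Everything below is hypothetical
(opaque_attr_t, struct internal_attr, attr_copy); it mirrors the shape of
the change under those assumptions and is not the glibc code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical public, opaque type: callers see only raw storage.  */
typedef union
{
  char size[64];
  long align;
} opaque_attr_t;

/* Hypothetical internal layout stored inside the opaque type.  */
struct internal_attr
{
  int flags;
  unsigned char *cpuset;   /* Heap-allocated; owned by the object.  */
  size_t cpusetsize;
};

/* One object, two views: no pointer casts at the point of use.  */
union attr_transparent
{
  opaque_attr_t external;
  struct internal_attr internal;
};

/* Deep copy: *TARGET is written only after the allocation succeeded.
   Returns 0 on success, -1 on allocation failure.  */
int
attr_copy (opaque_attr_t *target, const opaque_attr_t *source)
{
  union attr_transparent src, temp;
  src.external = *source;
  temp = src;

  /* Never share the source buffer; allocate a fresh one if needed.  */
  temp.internal.cpuset = NULL;
  if (src.internal.cpusetsize > 0)
    {
      temp.internal.cpuset = malloc (src.internal.cpusetsize);
      if (temp.internal.cpuset == NULL)
        return -1;
      memcpy (temp.internal.cpuset, src.internal.cpuset,
              src.internal.cpusetsize);
    }

  *target = temp.external;
  return 0;
}
```

Building the result in a temporary keeps *target untouched on failure, so
the caller never has to clean up a half-initialized object.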
diff --git a/nptl/Makefile b/nptl/Makefile
index d6b37b6efd3b7d78..b14de3ffb330c10b 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -54,7 +54,8 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
pthread_getconcurrency pthread_setconcurrency \
pthread_getschedparam pthread_setschedparam \
pthread_setschedprio \
- pthread_attr_init pthread_attr_destroy \
+ pthread_attr_init pthread_attr_copy \
+ pthread_attr_destroy \
pthread_attr_getdetachstate pthread_attr_setdetachstate \
pthread_attr_getguardsize pthread_attr_setguardsize \
pthread_attr_getschedparam pthread_attr_setschedparam \
diff --git a/nptl/Versions b/nptl/Versions
index 6007fd03e7ed117c..e38272aa187fbe78 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -283,5 +283,6 @@ libpthread {
__pthread_barrier_init; __pthread_barrier_wait;
__shm_directory;
__libpthread_freeres;
+ __pthread_attr_copy;
}
}
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index 00be8f92793e8710..a2d48b2015cd385c 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -464,6 +464,9 @@ extern int __pthread_attr_getstack (const pthread_attr_t *__restrict __attr,
size_t *__restrict __stacksize);
extern int __pthread_attr_setstack (pthread_attr_t *__attr, void *__stackaddr,
size_t __stacksize);
+extern int __pthread_attr_setaffinity_np (pthread_attr_t *attr,
+ size_t cpusetsize,
+ const cpu_set_t *cpuset);
extern int __pthread_rwlock_init (pthread_rwlock_t *__restrict __rwlock,
const pthread_rwlockattr_t *__restrict
__attr);
@@ -605,6 +608,11 @@ extern void __wait_lookup_done (void) attribute_hidden;
# define PTHREAD_STATIC_FN_REQUIRE(name) __asm (".globl " #name);
#endif
+/* Make a deep copy of the attribute *SOURCE in *TARGET. *TARGET is
+ not assumed to have been initialized. Returns 0 on success, or a
+ positive error code otherwise. */
+int __pthread_attr_copy (pthread_attr_t *target, const pthread_attr_t *source);
+
/* Returns 0 if POL is a valid scheduling policy. */
static inline int
check_sched_policy_attr (int pol)
diff --git a/nptl/pthread_attr_copy.c b/nptl/pthread_attr_copy.c
new file mode 100644
index 0000000000000000..67f272acf297100c
--- /dev/null
+++ b/nptl/pthread_attr_copy.c
@@ -0,0 +1,56 @@
+/* Deep copy of a pthread_attr_t object.
+ Copyright (C) 2020 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <errno.h>
+#include <pthreadP.h>
+#include <stdlib.h>
+
+int
+__pthread_attr_copy (pthread_attr_t *target, const pthread_attr_t *source)
+{
+ /* Avoid overwriting *TARGET until all allocations have
+ succeeded. */
+ union pthread_attr_transparent temp;
+ temp.external = *source;
+
+ /* Force new allocation. This function has full ownership of temp. */
+ temp.internal.cpuset = NULL;
+ temp.internal.cpusetsize = 0;
+
+ int ret = 0;
+
+ struct pthread_attr *isource = (struct pthread_attr *) source;
+
+ /* Propagate affinity mask information. */
+ if (isource->cpusetsize > 0)
+ ret = __pthread_attr_setaffinity_np (&temp.external,
+ isource->cpusetsize,
+ isource->cpuset);
+
+ if (ret != 0)
+ {
+ /* Deallocate because we have ownership. */
+ __pthread_attr_destroy (&temp.external);
+ return ret;
+ }
+
+ /* Transfer ownership. *target is not assumed to have been
+ initialized. */
+ *target = temp.external;
+ return 0;
+}
diff --git a/nptl/pthread_attr_setaffinity.c b/nptl/pthread_attr_setaffinity.c
index 545b72c91e290216..914ebf6f9cbfd5ff 100644
--- a/nptl/pthread_attr_setaffinity.c
+++ b/nptl/pthread_attr_setaffinity.c
@@ -55,6 +55,7 @@ __pthread_attr_setaffinity_new (pthread_attr_t *attr, size_t cpusetsize,
return 0;
}
+strong_alias (__pthread_attr_setaffinity_new, __pthread_attr_setaffinity_np)
versioned_symbol (libpthread, __pthread_attr_setaffinity_new,
pthread_attr_setaffinity_np, GLIBC_2_3_4);
diff --git a/sysdeps/nptl/internaltypes.h b/sysdeps/nptl/internaltypes.h
index b78ad99a888b4e3b..d3dce1278de989e2 100644
--- a/sysdeps/nptl/internaltypes.h
+++ b/sysdeps/nptl/internaltypes.h
@@ -49,6 +49,13 @@ struct pthread_attr
#define ATTR_FLAG_SCHED_SET 0x0020
#define ATTR_FLAG_POLICY_SET 0x0040
+/* Used to allocate a pthread_attr_t object which is also accessed
+ internally. */
+union pthread_attr_transparent
+{
+ pthread_attr_t external;
+ struct pthread_attr internal;
+};
/* Mutex attribute data structure. */
struct pthread_mutexattr

@ -0,0 +1,50 @@
Use __pthread_attr_copy in mq_notify (bug 27896)

Make a deep copy of the pthread attribute object to remove a potential
use-after-free issue.

(cherry picked from commit 42d359350510506b87101cf77202fefcbfc790cb)

# Conflicts:
# NEWS
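Why a byte-wise copy is unsafe here can be shown with a small model. The
struct and both functions below are hypothetical stand-ins, assuming only
that the attribute object owns a heap-allocated CPU set, as the patch to
pthread_attr_copy.c suggests.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an attribute object that owns heap memory.  */
struct attr_model
{
  size_t cpusetsize;
  unsigned char *cpuset;
};

/* Shallow copy, as a plain memcpy does: both objects end up pointing at
   the same buffer, so freeing one leaves the other dangling.  */
void
copy_shallow (struct attr_model *dst, const struct attr_model *src)
{
  memcpy (dst, src, sizeof *dst);
}

/* Deep copy: DST gets its own buffer.  Returns false if malloc fails,
   in which case *DST is left untouched.  */
bool
copy_deep (struct attr_model *dst, const struct attr_model *src)
{
  unsigned char *buf = NULL;
  if (src->cpusetsize > 0)
    {
      buf = malloc (src->cpusetsize);
      if (buf == NULL)
        return false;
      memcpy (buf, src->cpuset, src->cpusetsize);
    }
  dst->cpusetsize = src->cpusetsize;
  dst->cpuset = buf;
  return true;
}
```

With the shallow copy, the caller destroying its own attribute object after
mq_notify returns would invalidate the buffer the stored copy still refers
to; the deep copy removes that dependency.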
diff --git a/sysdeps/unix/sysv/linux/mq_notify.c b/sysdeps/unix/sysv/linux/mq_notify.c
index 3563e82cd4f4b552..c4091169306ffde8 100644
--- a/sysdeps/unix/sysv/linux/mq_notify.c
+++ b/sysdeps/unix/sysv/linux/mq_notify.c
@@ -135,8 +135,11 @@ helper_thread (void *arg)
(void) __pthread_barrier_wait (&notify_barrier);
}
else if (data.raw[NOTIFY_COOKIE_LEN - 1] == NOTIFY_REMOVED)
- /* The only state we keep is the copy of the thread attributes. */
- free (data.attr);
+ {
+ /* The only state we keep is the copy of the thread attributes. */
+ pthread_attr_destroy (data.attr);
+ free (data.attr);
+ }
}
return NULL;
}
@@ -257,8 +260,7 @@ mq_notify (mqd_t mqdes, const struct sigevent *notification)
if (data.attr == NULL)
return -1;
- memcpy (data.attr, notification->sigev_notify_attributes,
- sizeof (pthread_attr_t));
+ __pthread_attr_copy (data.attr, notification->sigev_notify_attributes);
}
/* Construct the new request. */
@@ -272,7 +274,10 @@ mq_notify (mqd_t mqdes, const struct sigevent *notification)
/* If it failed, free the allocated memory. */
if (__glibc_unlikely (retval != 0))
- free (data.attr);
+ {
+ pthread_attr_destroy (data.attr);
+ free (data.attr);
+ }
return retval;
}

@ -0,0 +1,44 @@
Fix use of __pthread_attr_copy in mq_notify (bug 27896)

__pthread_attr_copy can fail and does not initialize the attribute
structure in that case.

If __pthread_attr_copy is never called and there is no allocated
attribute, pthread_attr_destroy should not be called, otherwise
there is a null pointer dereference in rt/tst-mqueue6.

Fixes commit 42d359350510506b87101cf77202fefcbfc790cb
("Use __pthread_attr_copy in mq_notify (bug 27896)").

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit 217b6dc298156bdb0d6aea9ea93e7e394a5ff091)
diff --git a/sysdeps/unix/sysv/linux/mq_notify.c b/sysdeps/unix/sysv/linux/mq_notify.c
index c4091169306ffde8..45449571d14c379f 100644
--- a/sysdeps/unix/sysv/linux/mq_notify.c
+++ b/sysdeps/unix/sysv/linux/mq_notify.c
@@ -260,7 +260,14 @@ mq_notify (mqd_t mqdes, const struct sigevent *notification)
if (data.attr == NULL)
return -1;
- __pthread_attr_copy (data.attr, notification->sigev_notify_attributes);
+ int ret = __pthread_attr_copy (data.attr,
+ notification->sigev_notify_attributes);
+ if (ret != 0)
+ {
+ free (data.attr);
+ __set_errno (ret);
+ return -1;
+ }
}
/* Construct the new request. */
@@ -273,7 +280,7 @@ mq_notify (mqd_t mqdes, const struct sigevent *notification)
int retval = INLINE_SYSCALL (mq_notify, 2, mqdes, &se);
/* If it failed, free the allocated memory. */
- if (__glibc_unlikely (retval != 0))
+ if (retval != 0 && data.attr != NULL)
{
pthread_attr_destroy (data.attr);
free (data.attr);

@ -0,0 +1,34 @@
commit b805aebd42364fe696e417808a700fdb9800c9e8
Author: Nikita Popov <npv1310@gmail.com>
Date: Mon Aug 9 20:17:34 2021 +0530

librt: fix NULL pointer dereference (bug 28213)

The helper thread frees the copied attribute when it receives a
NOTIFY_REMOVED message from the OS kernel. Unfortunately, it fails to
check whether the copied attribute actually exists (data.attr != NULL).
This worked earlier because free() checks the passed pointer before
attempting to release the corresponding memory, but
__pthread_attr_destroy assumes the pointer is not NULL.

So passing a NULL pointer to __pthread_attr_destroy will result in a
segmentation fault. This scenario is possible if
notification->sigev_notify_attributes == NULL (which means default
thread attributes should be used).

Signed-off-by: Nikita Popov <npv1310@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
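The asymmetry between free() and a destroy routine can be illustrated with
a small model. The type and functions below are hypothetical stand-ins and
assume only what the message states: free() accepts NULL, while the destroy
routine reads through its argument.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap-allocated attribute object.  */
struct attr_model
{
  unsigned char *cpuset;
};

/* Like a destroy routine that assumes a valid object: it reads through
   ATTR, so it must never be handed NULL.  */
void
attr_model_destroy (struct attr_model *attr)
{
  free (attr->cpuset);
  attr->cpuset = NULL;
}

/* Cleanup that tolerates "no attribute was ever copied".  Returns 1 if
   an object was released and 0 if there was nothing to do.  */
int
release_attr (struct attr_model *attr)
{
  if (attr == NULL)
    return 0;
  attr_model_destroy (attr);
  free (attr);
  return 1;
}
```

The guard belongs in the caller because only the caller knows that a NULL
pointer is a legitimate state meaning "use default attributes".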
diff --git a/sysdeps/unix/sysv/linux/mq_notify.c b/sysdeps/unix/sysv/linux/mq_notify.c
index 45449571d14c379f..581959d621135fb0 100644
--- a/sysdeps/unix/sysv/linux/mq_notify.c
+++ b/sysdeps/unix/sysv/linux/mq_notify.c
@@ -134,7 +134,7 @@ helper_thread (void *arg)
to wait until it is done with it. */
(void) __pthread_barrier_wait (&notify_barrier);
}
- else if (data.raw[NOTIFY_COOKIE_LEN - 1] == NOTIFY_REMOVED)
+ else if (data.raw[NOTIFY_COOKIE_LEN - 1] == NOTIFY_REMOVED && data.attr != NULL)
{
/* The only state we keep is the copy of the thread attributes. */
pthread_attr_destroy (data.attr);

@ -0,0 +1,33 @@
commit 5adda61f62b77384718b4c0d8336ade8f2b4b35c
Author: Andreas Schwab <schwab@linux-m68k.org>
Date: Fri Jun 25 15:02:47 2021 +0200

wordexp: handle overflow in positional parameter number (bug 28011)

Use strtoul instead of atoi so that overflow can be detected.
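A minimal sketch of the range check, assuming a hypothetical helper named
positional_exists (the patch itself performs the comparison inline against
__libc_argc):

```c
#include <stdlib.h>

/* Returns 1 if the decimal string DIGITS names a positional parameter
   that exists for ARGC arguments, 0 otherwise.  */
int
positional_exists (const char *digits, unsigned long argc)
{
  /* On overflow strtoul returns ULONG_MAX instead of invoking undefined
     behavior as atoi does, so an oversized number simply compares as
     out of range.  */
  unsigned long n = strtoul (digits, NULL, 10);
  return n < argc;
}
```

The new test case in wordexp-test.c uses a 22-digit number, which exceeds
the range of unsigned long on both 32-bit and 64-bit targets.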
diff --git a/posix/wordexp-test.c b/posix/wordexp-test.c
index cc29840355e047cc..30c1dd65efcc0b49 100644
--- a/posix/wordexp-test.c
+++ b/posix/wordexp-test.c
@@ -200,6 +200,7 @@ struct test_case_struct
{ 0, NULL, "$var", 0, 0, { NULL, }, IFS },
{ 0, NULL, "\"\\n\"", 0, 1, { "\\n", }, IFS },
{ 0, NULL, "", 0, 0, { NULL, }, IFS },
+ { 0, NULL, "${1234567890123456789012}", 0, 0, { NULL, }, IFS },
/* Flags not already covered (testit() has special handling for these) */
{ 0, NULL, "one two", WRDE_DOOFFS, 2, { "one", "two", }, IFS },
diff --git a/posix/wordexp.c b/posix/wordexp.c
index 048a8068544c81fa..4061969c720f1f34 100644
--- a/posix/wordexp.c
+++ b/posix/wordexp.c
@@ -1420,7 +1420,7 @@ envsubst:
/* Is it a numeric parameter? */
else if (isdigit (env[0]))
{
- int n = atoi (env);
+ unsigned long n = strtoul (env, NULL, 10);
if (n >= __libc_argc)
/* Substitute NULL. */
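
The effect of switching from `atoi` to `strtoul` can be sketched outside of wordexp. This is an illustrative example under stated assumptions, not the glibc implementation: `positional_in_range` is a hypothetical helper and `argc` stands in for `__libc_argc`. On overflow `strtoul` returns `ULONG_MAX`, so an oversized parameter number compares as out of range, whereas the result of `atoi` on such input is undefined.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the decimal string DIGITS names a
   positional parameter below ARGC, 0 otherwise.  strtoul saturates at
   ULONG_MAX on overflow, so huge numbers are reported as out of range.  */
static int
positional_in_range (const char *digits, unsigned long argc)
{
  unsigned long n = strtoul (digits, NULL, 10);
  return n < argc;
}
```

With three arguments, `positional_in_range ("1", 3)` is in range, while the test-case value `"1234567890123456789012"` overflows and is treated as out of range, which matches the "substitute NULL" branch in the patch.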

View File

@ -1,6 +1,6 @@
 %define glibcsrcdir glibc-2.28
 %define glibcversion 2.28
-%define glibcrelease 148%{?dist}
+%define glibcrelease 164%{?dist}
 # Pre-release tarballs are pulled in from git using a command that is
 # effectively:
 #
@ -680,6 +680,45 @@ Patch543: glibc-rh1817513-133.patch
 Patch544: glibc-rh1912544.patch
 Patch545: glibc-rh1918115.patch
 Patch546: glibc-rh1924919.patch
+Patch547: glibc-rh1932770.patch
+Patch548: glibc-rh1936864.patch
+Patch549: glibc-rh1871386-1.patch
+Patch550: glibc-rh1871386-2.patch
+Patch551: glibc-rh1871386-3.patch
+Patch552: glibc-rh1871386-4.patch
+Patch553: glibc-rh1871386-5.patch
+Patch554: glibc-rh1871386-6.patch
+Patch555: glibc-rh1871386-7.patch
+Patch556: glibc-rh1912670-1.patch
+Patch557: glibc-rh1912670-2.patch
+Patch558: glibc-rh1912670-3.patch
+Patch559: glibc-rh1912670-4.patch
+Patch560: glibc-rh1912670-5.patch
+Patch561: glibc-rh1930302-1.patch
+Patch562: glibc-rh1930302-2.patch
+Patch563: glibc-rh1927877.patch
+Patch564: glibc-rh1918719-1.patch
+Patch565: glibc-rh1918719-2.patch
+Patch566: glibc-rh1918719-3.patch
+Patch567: glibc-rh1934155-1.patch
+Patch568: glibc-rh1934155-2.patch
+Patch569: glibc-rh1934155-3.patch
+Patch570: glibc-rh1934155-4.patch
+Patch571: glibc-rh1934155-5.patch
+Patch572: glibc-rh1934155-6.patch
+Patch573: glibc-rh1956357-1.patch
+Patch574: glibc-rh1956357-2.patch
+Patch575: glibc-rh1956357-3.patch
+Patch576: glibc-rh1956357-4.patch
+Patch577: glibc-rh1956357-5.patch
+Patch578: glibc-rh1956357-6.patch
+Patch579: glibc-rh1956357-7.patch
+Patch580: glibc-rh1956357-8.patch
+Patch581: glibc-rh1979127.patch
+Patch582: glibc-rh1966472-1.patch
+Patch583: glibc-rh1966472-2.patch
+Patch584: glibc-rh1966472-3.patch
+Patch585: glibc-rh1966472-4.patch
 ##############################################################################
 # Continued list of core "glibc" package information:
@ -1512,7 +1551,7 @@ install_different()
 %if %{buildpower9}
 pushd build-%{target}-power9
-install_different "$RPM_BUILD_ROOT/%{_lib}" power9 ..
+install_different "$RPM_BUILD_ROOT/%{_lib}/glibc-hwcaps" power9 "../.."
 popd
 %endif
@ -2356,7 +2395,8 @@ local remove_dirs = { "%{_libdir}/i686",
 "%{_libdir}/i686/nosegneg",
 "%{_libdir}/power6",
 "%{_libdir}/power7",
-"%{_libdir}/power8" }
+"%{_libdir}/power8",
+"%{_libdir}/power9"}
 -- Walk all the directories with files we need to remove...
 for _, rdir in ipairs (remove_dirs) do
@ -2500,7 +2540,7 @@ fi
 %files -f glibc.filelist
 %dir %{_prefix}/%{_lib}/audit
 %if %{buildpower9}
-%dir /%{_lib}/power9
+%dir /%{_lib}/glibc-hwcaps/power9
 %endif
 %ifarch s390x
 /lib/ld64.so.1
@ -2591,6 +2631,57 @@ fi
 %files -f compat-libpthread-nonshared.filelist -n compat-libpthread-nonshared
 %changelog
+* Mon Aug 9 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-164
+- librt: fix NULL pointer dereference (#1966472).
+* Mon Aug 9 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-163
+- CVE-2021-33574: Deep copy pthread attribute in mq_notify (#1966472)
+* Thu Jul 8 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-162
+- CVE-2021-35942: wordexp: handle overflow in positional parameter number
+(#1979127)
+* Fri Jun 18 2021 Carlos O'Donell <carlos@redhat.com> - 2.28-161
+- Improve POWER10 performance with POWER9 fallbacks (#1956357)
+* Mon May 31 2021 Arjun Shankar <arjun@redhat.com> - 2.28-160
+- Backport POWER10 optimized rawmemchr for ppc64le (#1956357)
+* Thu May 27 2021 Arjun Shankar <arjun@redhat.com> - 2.28-159
+- Backport additional ifunc optimizations for ppc64le (#1956357)
+* Thu Apr 22 2021 Florian Weimer <fweimer@redhat.com> - 2.28-158
+- Rebuild with new binutils (#1946518)
+* Wed Apr 14 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-157
+- Consistently SXID_ERASE tunables in sxid binaries (#1934155)
+* Wed Mar 31 2021 DJ Delorie <dj@redhat.com> - 2.28-156
+- Backport ifunc optimizations for glibc for ppc64le (#1918719)
+* Wed Mar 24 2021 Arjun Shankar <arjun@redhat.com> - 2.28-155
+- CVE-2021-27645: nscd: Fix double free in netgroupcache (#1927877)
+* Thu Mar 18 2021 Carlos O'Donell <carlos@redhat.com> - 2.28-154
+- Add IPPROTO_ETHERNET, IPPROTO_MPTCP, and INADDR_ALLSNOOPERS_GROUP defines
+(#1930302)
+* Thu Mar 18 2021 Carlos O'Donell <carlos@redhat.com> - 2.28-153
+- Support SEM_STAT_ANY via semctl. Return EINVAL for unknown commands to semctl,
+msgctl, and shmctl. (#1912670)
+* Tue Mar 16 2021 Patsy Griffin <patsy@redhat.com> - 2.28-152
+- Update syscall-names.list to 5.7, 5.8, 5.9, 5.10 and 5.11. (#1871386)
+* Mon Mar 15 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-151
+- CVE-2019-9169: Fix buffer overread in regexec.c (#1936864).
+* Mon Mar 15 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-150
+- Rebuild glibc to update security markup metadata (#1935128)
+* Mon Mar 15 2021 Siddhesh Poyarekar <siddhesh@redhat.com> - 2.28-149
+- Fix NSS files and compat service upgrade defect (#1932770).
 * Fri Feb 5 2021 Florian Weimer <fweimer@redhat.com> - 2.28-148
 - CVE-2021-3326: iconv assertion failure in ISO-2022-JP-3 decoding (#1924919)