Enable cycle counter on arm64

FFTW needs to measure plan speeds to find an optimal plan.  Its build
does not enable a cycle counter on ARM by default, so FFTW silently
falls back to always running in estimate mode.

The CNTVCT_EL0 counter works by default on Linux now, so enable it.
There's an FFTW issue, https://github.com/FFTW/fftw3/issues/287, to make
this the default, but it has not been acted on.
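
As a rough illustration of what such a counter provides, the sketch below
reads CNTVCT_EL0 with inline assembly and uses it to time one call of a
candidate routine.  It is not FFTW's code and FFTW's own implementation may
differ: the names ticks_t, read_ticks and time_once are hypothetical, and the
clock_gettime fallback is an assumption added only so the sketch also builds
on non-arm64 machines.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical tick type: a cheap, monotonically increasing counter value. */
typedef uint64_t ticks_t;

/* Read the current counter value. */
static inline ticks_t read_ticks(void)
{
#if defined(__aarch64__)
    uint64_t value;
    /* Read the ARMv8 virtual counter register. */
    __asm__ volatile("mrs %0, cntvct_el0" : "=r"(value));
    return value;
#else
    /* Assumed fallback for other architectures: nanoseconds from
     * CLOCK_MONOTONIC, so the sketch compiles and runs anywhere. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
#endif
}

/* Time a single call of a candidate routine, in counter ticks. */
static inline ticks_t time_once(void (*candidate)(void))
{
    ticks_t start = read_ticks();
    candidate();
    return read_ticks() - start;
}
```

A planner that compares candidates would call something like time_once on
each one and keep the fastest.  Without a usable counter there is nothing to
compare, which is why the fallback to estimate mode happens.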

In my tests, this results in a huge speedup of 2x to 3x.

Signed-off-by: Trent Piepho <tpiepho@gmail.com>
Trent Piepho 2022-11-28 11:05:35 -08:00
parent 17d2b0fa31
commit ca665b587a

@@ -307,6 +307,7 @@ done
 for ((i=0; i<2; i++)) ; do
 prec_flags[i]+=" --enable-neon"
 done
+BASEFLAGS+=" --enable-armv8-cntvct-el0"
 %endif
 %ifarch ppc ppc64
@@ -531,6 +532,7 @@ done
 - Fix for OpenMPI build with < 4 processors
 - Fix building with no enabled MPI types
 - Enable single precision Altivec on PPC
+- Enable CNTVCT_EL0 support on ARMv8
 * Thu Mar 02 2023 Orion Poplawski <orion@nwra.com> - 3.3.10-5
 - Use make macros