Enable cycle counter on arm64
FFTW needs to measure plan speeds to find an optimal plan. There isn't cycle counter support on ARM by default. This causes FFTW to silently fall back to always running in estimate mode.

The CNTVCT_EL0 counter works by default on Linux now, so enable it. There's an FFTW issue, https://github.com/FFTW/fftw3/issues/287, to make this the default, but it has not been acted on.

In my tests, this results in a huge speedup of 2x to 3x.

Signed-off-by: Trent Piepho <tpiepho@gmail.com>
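As a rough illustration of what the change does to the configure line, here is a minimal shell sketch. It is not taken from the spec file: the helper name `fftw_timer_flags` is hypothetical, the spec selects the architecture with `%ifarch` rather than `uname -m`, and `BASEFLAGS` is assumed to hold the flags shared by every precision build. Only the `--enable-armv8-cntvct-el0` option itself comes from the commit.

```shell
# Hypothetical helper (not part of the spec file): map a machine type to the
# extra FFTW configure flag this commit adds on arm64.
fftw_timer_flags() {
    # Use the first argument if given, otherwise ask the running kernel.
    machine=${1:-$(uname -m)}
    case "$machine" in
        aarch64|arm64)
            # Tell FFTW to use the CNTVCT_EL0 virtual counter for timing plans.
            echo "--enable-armv8-cntvct-el0"
            ;;
        *)
            # Other architectures need no extra flag in this sketch.
            echo ""
            ;;
    esac
}

# Assumption: BASEFLAGS carries the flags common to all precision builds.
BASEFLAGS="${BASEFLAGS:-} $(fftw_timer_flags)"
echo "extra configure flags:${BASEFLAGS}"
```

Calling `fftw_timer_flags aarch64` yields the new option, while any other machine type yields an empty string, so the flag list is unchanged there.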
parent 17d2b0fa31
commit ca665b587a
@@ -307,6 +307,7 @@ done
 for ((i=0; i<2; i++)) ; do
 prec_flags[i]+=" --enable-neon"
 done
+BASEFLAGS+=" --enable-armv8-cntvct-el0"
 %endif
 
 %ifarch ppc ppc64
@@ -531,6 +532,7 @@ done
 - Fix for OpenMPI build with < 4 processors
 - Fix building with no enabled MPI types
 - Enable single precision Altivec on PPC
+- Enable CNTVCT_EL0 support on ARMv8
 
 * Thu Mar 02 2023 Orion Poplawski <orion@nwra.com> - 3.3.10-5
 - Use make macros