gcc/gcc44-power7-3.patch
Commit c3b415bb00 ("4.4.0-5") by Jakub Jelinek, 2009-05-14 08:52:31 +00:00

2009-04-26  Michael Meissner  <meissner@linux.vnet.ibm.com>

* config/rs6000/vector.md (vector_vsel<mode>): Generate the insns
directly instead of calling VSX/Altivec expanders.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Map VSX
builtins that are identical to Altivec builtins to the Altivec
version.
(altivec_overloaded_builtins): Add V2DF/V2DI sel, perm support.
(altivec_resolve_overloaded_builtin): Add V2DF/V2DI support.
* config/rs6000/rs6000.c (rs6000_expand_vector_init): Rename VSX
splat functions.
(expand_vector_set): Merge V2DF/V2DI code.
(expand_vector_extract): Ditto.
(bdesc_3arg): Add more VSX builtins.
(bdesc_2arg): Ditto.
(bdesc_1arg): Ditto.
(rs6000_expand_ternop_builtin): Require the xxpermdi 3rd argument
to be a 2-bit constant, and the V2DF/V2DI set 3rd argument to be a
1-bit constant (see the sketch after these entries).
(altivec_expand_builtin): Add support for VSX overloaded builtins.
(altivec_init_builtins): Ditto.
(rs6000_common_init_builtins): Ditto.
(rs6000_init_builtins): Add V2DI types and vector long support.
(rs6000_handle_altivec_attribute): Ditto.
(rs6000_mangle_type): Ditto.
* config/rs6000/vsx.md (UNSPEC_*): Add new UNSPEC constants.
(vsx_vsel<mode>): Add support for all vector types, including
Altivec types.
(vsx_ftrunc<mode>2): Emit the correct instruction.
(vsx_x<VSv>r<VSs>i): New builtin rounding mode insns.
(vsx_x<VSv>r<VSs>ic): Ditto.
(vsx_concat_<mode>): Key off of VSX memory instructions being
generated instead of the vector arithmetic unit to enable V2DI
mode.
(vsx_extract_<mode>): Ditto.
(vsx_set_<mode>): Rewrite as an unspec.
(vsx_xxpermdi2_<mode>): Rename old vsx_xxpermdi_<mode> here. Key
off of VSX memory instructions instead of arithmetic unit.
(vsx_xxpermdi_<mode>): New insn for __builtin_vsx_xxpermdi.
(vsx_splat_<mode>): Rename from vsx_splat<mode>.
(vsx_xxspltw_<mode>): Change from V4SF only to V4SF/V4SI modes.
Fix up constraints. Key off of memory instructions instead of
arithmetic instructions to allow use with V4SI.
(vsx_xxmrghw_<mode>): Ditto.
(vsx_xxmrglw_<mode>): Ditto.
(vsx_xxsldwi_<mode>): Implement vector shift double by word
immediate.
* config/rs6000/rs6000.h (VSX_BUILTIN_*): Update for current
builtins being generated.
(RS6000_BTI_unsigned_V2DI): Add vector long support.
(RS6000_BTI_bool_long): Ditto.
(RS6000_BTI_bool_V2DI): Ditto.
(unsigned_V2DI_type_node): Ditto.
(bool_long_type_node): Ditto.
(bool_V2DI_type_node): Ditto.
* config/rs6000/altivec.md (altivec_vsel<mode>): Add '*' since we
don't need the generator function now. Use VSX instruction if
-mvsx.
(altivec_vmrghw): Use VSX instruction if -mvsx.
(altivec_vmrghsf): Ditto.
(altivec_vmrglw): Ditto.
(altivec_vmrglsf): Ditto.
* doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
Document that under VSX, vector double/long are available.

testsuite/
* gcc.target/powerpc/vsx-builtin-3.c: New test for VSX builtins.
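A usage sketch of the new overloaded __builtin_vsx_xxpermdi and the
literal-argument checking described above (illustrative only, not part
of the patch; assumes -O2 -mcpu=power7):

    __vector double a, b, c;

    void
    permute (void)
    {
      /* Argument 3 must be a 2-bit unsigned literal (0..3).  */
      c = __builtin_vsx_xxpermdi (a, b, 2);
    }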

2009-04-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

* config/rs6000/vector.md (VEC_E): New iterator to add V2DImode.
(vec_init<mode>): Use VEC_E instead of VEC_C iterator, to add
V2DImode support.
(vec_set<mode>): Ditto.
(vec_extract<mode>): Ditto.
* config/rs6000/predicates.md (easy_vector_constant): Add support
for setting TImode to 0.
* config/rs6000/rs6000.opt (-mvsx-vector-memory): Delete old debug
switch that is no longer used.
(-mvsx-vector-float): Ditto.
(-mvsx-vector-double): Ditto.
(-mvsx-v4sf-altivec-regs): Ditto.
(-mreload-functions): Ditto.
(-mallow-timode): New debug switch.
* config/rs6000/rs6000.c (rs6000_ira_cover_classes): New target
hook for IRA cover classes, recording that under VSX the float and
Altivec registers are part of the same register class, whereas
previously they were distinct.
(TARGET_IRA_COVER_CLASSES): Set the IRA cover classes target hook.
(rs6000_hard_regno_nregs): Key off of whether VSX/Altivec memory
instructions are supported, and not whether the vector unit has
arithmetic support to enable V2DI/TI mode.
(rs6000_hard_regno_mode_ok): Ditto.
(rs6000_init_hard_regno_mode_ok): Add V2DImode, TImode support.
Drop several of the debug switches.
(rs6000_emit_move): Force TImode constants to memory if we have
either Altivec or VSX.
(rs6000_builtin_conversion): Use correct insns for V2DI<->V2DF
conversions.
(rs6000_expand_vector_init): Add V2DI support.
(rs6000_expand_vector_set): Ditto.
(avoiding_indexed_address_p): Simplify the test: if the mode uses
VSX/Altivec memory instructions, reg+reg addressing cannot be
eliminated.
(rs6000_legitimize_address): Move VSX/Altivec REG+REG support
before the large integer support.
(rs6000_legitimate_address): Add support for TImode in VSX/Altivec
registers.
(rs6000_emit_move): Ditto.
(def_builtin): Change internal error message to provide more
information.
(bdesc_2arg): Add conversion builtins.
(builtin_hash_function): New function for hashing all of the types
for builtin functions.
(builtin_hash_eq): Ditto.
(builtin_function_type): Ditto.
(builtin_mode_to_type): New static for builtin argument hashing.
(builtin_hash_table): Ditto.
(rs6000_common_init_builtins): Rewrite so that types for builtin
functions are only created when we need them, and use a hash table
to store all of the different argument combinations that are
created. Add support for VSX conversion builtins.
(rs6000_preferred_reload_class): Add TImode support.
(reg_classes_cannot_change_mode_class): Be stricter about VSX and
Altivec vector types.
(rs6000_emit_vector_cond_expr): Use VSX_MOVE_MODE, not
VSX_VECTOR_MOVE_MODE.
(rs6000_handle_altivec_attribute): Allow __vector long on VSX.
* config/rs6000/vsx.md (VSX_D): New iterator for vectors with
64-bit elements.
(VSX_M): New iterator for 128-bit types for moves, except for
TImode.
(VSm, VSs, VSr): Add TImode.
(VSr4, VSr5): New mode attributes for float<->double conversion.
(VSX_SPDP): New iterator for float<->double conversion.
(VS_spdp_*): New mode attributes for float<->double conversion.
(UNSPEC_VSX_*): Rename unspec constants to remove XV from the
names. Change all users.
(vsx_mov<mode>): Drop TImode support here.
(vsx_movti): New TImode support, allow GPRs, but favor VSX
registers.
(vsx_<VS_spdp_insn>): New support for float<->double conversions.
(vsx_xvcvdpsp): Delete, move into vsx_<VS_spdp_insn>.
(vsx_xvcvspdp): Ditto.
(vsx_xvcvuxdsp): New conversion insn.
(vsx_xvcvspsxds): Ditto.
(vsx_xvcvspuxds): Ditto.
(vsx_concat_<mode>): Generalize V2DF permute/splat operations to
include V2DI.
(vsx_set_<mode>): Ditto.
(vsx_extract_<mode>): Ditto.
(vsx_xxpermdi_<mode>): Ditto.
(vsx_splat<mode>): Ditto.
* config/rs6000/rs6000.h (VSX_VECTOR_MOVE_MODE): Delete.
(VSX_MOVE_MODE): Add TImode.
(IRA_COVER_CLASSES): Delete.
(IRA_COVER_CLASSES_PRE_VSX): New cover classes for machines
without VSX where float and altivec are different registers.
(IRA_COVER_CLASS_VSX): New cover classes for machines with VSX
where float and altivec are part of the same register class.
* config/rs6000/altivec.md (VM2): New iterator for 128-bit types,
except TImode.
(altivec_mov<mode>): Drop movti mode here.
(altivec_movti): Add movti insn, and allow GPRs, but favor altivec
registers.
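A sketch of code exercising the V2DF init/set builtins covered by
these entries (illustrative only, not part of the patch; assumes
-mvsx):

    __vector double
    make_pair (double hi, double lo)
    {
      /* Expanded via rs6000_expand_vector_init / vsx_concat_v2df.  */
      return __builtin_vsx_concat_2df (hi, lo);
    }

    __vector double
    set_elt0 (__vector double v, double x)
    {
      /* Argument 3 must be a 1-bit unsigned literal.  */
      return __builtin_vsx_set_2df (v, x, 0);
    }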

2009-04-16  Michael Meissner  <meissner@linux.vnet.ibm.com>

* config/rs6000/rs6000-protos.h (rs6000_has_indirect_jump_p): New
declaration.
(rs6000_set_indirect_jump): Ditto.
* config/rs6000/rs6000.c (struct machine_function): Add
indirect_jump_p field.
(rs6000_override_options): Wrap warning messages in N_(). If
-mvsx was implicitly set, don't warn for -msoft-float; just
silently turn off VSX.
(rs6000_secondary_reload_inner): Don't use strict register
checking, since pseudos may still be present.
(register_move_cost): If -mdebug=cost, print out cost information.
(rs6000_memory_move_cost): Ditto.
(rs6000_has_indirect_jump_p): New function, return true if
current function has an indirect jump.
(rs6000_set_indirect_jump): New function, note that an indirect
jump has been generated.
* config/rs6000/rs6000.md (indirect_jump): Note that we've
generated an indirect jump.
(tablejump): Ditto.
(doloop_end): Do not generate decrement ctr and branch
instructions if an indirect jump has been generated.
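The indirect-jump tracking can be observed with a function like the
following sketch (whether the switch actually becomes a tablejump
depends on optimization level and case density; once it does,
doloop_end stops converting the counted loop into a CTR
decrement-and-branch):

    extern void f0 (void), f1 (void), f2 (void), f3 (void);

    void
    dispatch_and_loop (int sel, int n, double *p)
    {
      int i;

      switch (sel)              /* may be generated as a tablejump */
        {
        case 0: f0 (); break;
        case 1: f1 (); break;
        case 2: f2 (); break;
        default: f3 (); break;
        }

      for (i = 0; i < n; i++)   /* then left as a GPR-counted loop */
        p[i] = 0.0;
    }
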
--- gcc/doc/extend.texi (revision 146119)
+++ gcc/doc/extend.texi (revision 146798)
@@ -7094,7 +7094,7 @@ instructions, but allow the compiler to
* MIPS Loongson Built-in Functions::
* Other MIPS Built-in Functions::
* picoChip Built-in Functions::
-* PowerPC AltiVec Built-in Functions::
+* PowerPC AltiVec/VSX Built-in Functions::
* SPARC VIS Built-in Functions::
* SPU Built-in Functions::
@end menu
@@ -9571,7 +9571,7 @@ GCC defines the preprocessor macro @code
when this function is available.
@end table
-@node PowerPC AltiVec Built-in Functions
+@node PowerPC AltiVec/VSX Built-in Functions
@subsection PowerPC AltiVec Built-in Functions
GCC provides an interface for the PowerPC family of processors to access
@@ -9597,6 +9597,19 @@ vector bool int
vector float
@end smallexample
+If @option{-mvsx} is used the following additional vector types are
+implemented.
+
+@smallexample
+vector unsigned long
+vector signed long
+vector double
+@end smallexample
+
+The long types are only implemented for 64-bit code generation, and
+the long type is only used in the floating point/integer conversion
+instructions.
+
GCC's implementation of the high-level language interface available from
C and C++ code differs from Motorola's documentation in several ways.
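A sketch of the new types together with the V2DF<->V2DI conversion
builtins added elsewhere in this patch (assumes -mvsx; the long vector
types require 64-bit code generation):

    __vector double d;
    __vector long sl;

    void
    convert (void)
    {
      sl = __builtin_vsx_xvcvdpsxds (d);   /* double -> signed long  */
      d  = __builtin_vsx_xvcvsxddp (sl);   /* signed long -> double  */
    }
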
--- gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c (revision 146798)
@@ -0,0 +1,212 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power7" } */
+/* { dg-final { scan-assembler "xxsel" } } */
+/* { dg-final { scan-assembler "vperm" } } */
+/* { dg-final { scan-assembler "xvrdpi" } } */
+/* { dg-final { scan-assembler "xvrdpic" } } */
+/* { dg-final { scan-assembler "xvrdpim" } } */
+/* { dg-final { scan-assembler "xvrdpip" } } */
+/* { dg-final { scan-assembler "xvrdpiz" } } */
+/* { dg-final { scan-assembler "xvrspi" } } */
+/* { dg-final { scan-assembler "xvrspic" } } */
+/* { dg-final { scan-assembler "xvrspim" } } */
+/* { dg-final { scan-assembler "xvrspip" } } */
+/* { dg-final { scan-assembler "xvrspiz" } } */
+/* { dg-final { scan-assembler "xsrdpi" } } */
+/* { dg-final { scan-assembler "xsrdpic" } } */
+/* { dg-final { scan-assembler "xsrdpim" } } */
+/* { dg-final { scan-assembler "xsrdpip" } } */
+/* { dg-final { scan-assembler "xsrdpiz" } } */
+/* { dg-final { scan-assembler "xsmaxdp" } } */
+/* { dg-final { scan-assembler "xsmindp" } } */
+/* { dg-final { scan-assembler "xxland" } } */
+/* { dg-final { scan-assembler "xxlandc" } } */
+/* { dg-final { scan-assembler "xxlnor" } } */
+/* { dg-final { scan-assembler "xxlor" } } */
+/* { dg-final { scan-assembler "xxlxor" } } */
+/* { dg-final { scan-assembler "xvcmpeqdp" } } */
+/* { dg-final { scan-assembler "xvcmpgtdp" } } */
+/* { dg-final { scan-assembler "xvcmpgedp" } } */
+/* { dg-final { scan-assembler "xvcmpeqsp" } } */
+/* { dg-final { scan-assembler "xvcmpgtsp" } } */
+/* { dg-final { scan-assembler "xvcmpgesp" } } */
+/* { dg-final { scan-assembler "xxsldwi" } } */
+/* { dg-final { scan-assembler-not "call" } } */
+
+extern __vector int si[][4];
+extern __vector short ss[][4];
+extern __vector signed char sc[][4];
+extern __vector float f[][4];
+extern __vector unsigned int ui[][4];
+extern __vector unsigned short us[][4];
+extern __vector unsigned char uc[][4];
+extern __vector __bool int bi[][4];
+extern __vector __bool short bs[][4];
+extern __vector __bool char bc[][4];
+extern __vector __pixel p[][4];
+#ifdef __VSX__
+extern __vector double d[][4];
+extern __vector long sl[][4];
+extern __vector unsigned long ul[][4];
+extern __vector __bool long bl[][4];
+#endif
+
+int do_sel(void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_xxsel_4si (si[i][1], si[i][2], si[i][3]); i++;
+ ss[i][0] = __builtin_vsx_xxsel_8hi (ss[i][1], ss[i][2], ss[i][3]); i++;
+ sc[i][0] = __builtin_vsx_xxsel_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
+ f[i][0] = __builtin_vsx_xxsel_4sf (f[i][1], f[i][2], f[i][3]); i++;
+ d[i][0] = __builtin_vsx_xxsel_2df (d[i][1], d[i][2], d[i][3]); i++;
+
+ si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], bi[i][3]); i++;
+ ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], bs[i][3]); i++;
+ sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], bc[i][3]); i++;
+ f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], bi[i][3]); i++;
+ d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], bl[i][3]); i++;
+
+ si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], ui[i][3]); i++;
+ ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], us[i][3]); i++;
+ sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], uc[i][3]); i++;
+ f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], ui[i][3]); i++;
+ d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], ul[i][3]); i++;
+
+ return i;
+}
+
+int do_perm(void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_vperm_4si (si[i][1], si[i][2], sc[i][3]); i++;
+ ss[i][0] = __builtin_vsx_vperm_8hi (ss[i][1], ss[i][2], sc[i][3]); i++;
+ sc[i][0] = __builtin_vsx_vperm_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
+ f[i][0] = __builtin_vsx_vperm_4sf (f[i][1], f[i][2], sc[i][3]); i++;
+ d[i][0] = __builtin_vsx_vperm_2df (d[i][1], d[i][2], sc[i][3]); i++;
+
+ si[i][0] = __builtin_vsx_vperm (si[i][1], si[i][2], uc[i][3]); i++;
+ ss[i][0] = __builtin_vsx_vperm (ss[i][1], ss[i][2], uc[i][3]); i++;
+ sc[i][0] = __builtin_vsx_vperm (sc[i][1], sc[i][2], uc[i][3]); i++;
+ f[i][0] = __builtin_vsx_vperm (f[i][1], f[i][2], uc[i][3]); i++;
+ d[i][0] = __builtin_vsx_vperm (d[i][1], d[i][2], uc[i][3]); i++;
+
+ return i;
+}
+
+int do_xxperm (void)
+{
+ int i = 0;
+
+ d[i][0] = __builtin_vsx_xxpermdi_2df (d[i][1], d[i][2], 0); i++;
+ d[i][0] = __builtin_vsx_xxpermdi (d[i][1], d[i][2], 1); i++;
+ return i;
+}
+
+double x, y;
+void do_concat (void)
+{
+ d[0][0] = __builtin_vsx_concat_2df (x, y);
+}
+
+void do_set (void)
+{
+ d[0][0] = __builtin_vsx_set_2df (d[0][1], x, 0);
+ d[1][0] = __builtin_vsx_set_2df (d[1][1], y, 1);
+}
+
+extern double z[][4];
+
+int do_math (void)
+{
+ int i = 0;
+
+ d[i][0] = __builtin_vsx_xvrdpi (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpic (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpim (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpip (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpiz (d[i][1]); i++;
+
+ f[i][0] = __builtin_vsx_xvrspi (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspic (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspim (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspip (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspiz (f[i][1]); i++;
+
+ z[i][0] = __builtin_vsx_xsrdpi (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpic (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpim (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpip (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpiz (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsmaxdp (z[i][1], z[i][0]); i++;
+ z[i][0] = __builtin_vsx_xsmindp (z[i][1], z[i][0]); i++;
+ return i;
+}
+
+int do_cmp (void)
+{
+ int i = 0;
+
+ d[i][0] = __builtin_vsx_xvcmpeqdp (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xvcmpgtdp (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xvcmpgedp (d[i][1], d[i][2]); i++;
+
+ f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xvcmpgtsp (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xvcmpgesp (f[i][1], f[i][2]); i++;
+ return i;
+}
+
+int do_logical (void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_xxland (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlandc (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlnor (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlor (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlxor (si[i][1], si[i][2]); i++;
+
+ ss[i][0] = __builtin_vsx_xxland (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlandc (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlnor (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlor (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlxor (ss[i][1], ss[i][2]); i++;
+
+ sc[i][0] = __builtin_vsx_xxland (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlandc (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlnor (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlor (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlxor (sc[i][1], sc[i][2]); i++;
+
+ d[i][0] = __builtin_vsx_xxland (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlandc (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlnor (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlor (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlxor (d[i][1], d[i][2]); i++;
+
+ f[i][0] = __builtin_vsx_xxland (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlandc (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlnor (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlor (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlxor (f[i][1], f[i][2]); i++;
+ return i;
+}
+
+int do_xxsldwi (void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_xxsldwi (si[i][1], si[i][2], 0); i++;
+ ss[i][0] = __builtin_vsx_xxsldwi (ss[i][1], ss[i][2], 1); i++;
+ sc[i][0] = __builtin_vsx_xxsldwi (sc[i][1], sc[i][2], 2); i++;
+ ui[i][0] = __builtin_vsx_xxsldwi (ui[i][1], ui[i][2], 3); i++;
+ us[i][0] = __builtin_vsx_xxsldwi (us[i][1], us[i][2], 0); i++;
+ uc[i][0] = __builtin_vsx_xxsldwi (uc[i][1], uc[i][2], 1); i++;
+ f[i][0] = __builtin_vsx_xxsldwi (f[i][1], f[i][2], 2); i++;
+ d[i][0] = __builtin_vsx_xxsldwi (d[i][1], d[i][2], 3); i++;
+ return i;
+}
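To run just this test from a GCC build tree, the usual DejaGnu
invocation applies (standard testsuite practice, not part of the
patch):

    make check-gcc RUNTESTFLAGS="powerpc.exp=vsx-builtin-3.c"
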
--- gcc/config/rs6000/vector.md (revision 146119)
+++ gcc/config/rs6000/vector.md (revision 146798)
@@ -39,6 +39,9 @@ (define_mode_iterator VEC_M [V16QI V8HI
;; Vector comparison modes
(define_mode_iterator VEC_C [V16QI V8HI V4SI V4SF V2DF])
+;; Vector init/extract modes
+(define_mode_iterator VEC_E [V16QI V8HI V4SI V2DI V4SF V2DF])
+
;; Vector reload iterator
(define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF DF TI])
@@ -347,34 +350,13 @@ (define_expand "vector_geu<mode>"
;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
;; which is in the reverse order that we want
(define_expand "vector_vsel<mode>"
- [(match_operand:VEC_F 0 "vlogical_operand" "")
- (match_operand:VEC_F 1 "vlogical_operand" "")
- (match_operand:VEC_F 2 "vlogical_operand" "")
- (match_operand:VEC_F 3 "vlogical_operand" "")]
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (if_then_else:VEC_L (ne (match_operand:VEC_L 3 "vlogical_operand" "")
+ (const_int 0))
+ (match_operand:VEC_L 2 "vlogical_operand" "")
+ (match_operand:VEC_L 1 "vlogical_operand" "")))]
"VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
- "
-{
- if (VECTOR_UNIT_VSX_P (<MODE>mode))
- emit_insn (gen_vsx_vsel<mode> (operands[0], operands[3],
- operands[2], operands[1]));
- else
- emit_insn (gen_altivec_vsel<mode> (operands[0], operands[3],
- operands[2], operands[1]));
- DONE;
-}")
-
-(define_expand "vector_vsel<mode>"
- [(match_operand:VEC_I 0 "vlogical_operand" "")
- (match_operand:VEC_I 1 "vlogical_operand" "")
- (match_operand:VEC_I 2 "vlogical_operand" "")
- (match_operand:VEC_I 3 "vlogical_operand" "")]
- "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
- "
-{
- emit_insn (gen_altivec_vsel<mode> (operands[0], operands[3],
- operands[2], operands[1]));
- DONE;
-}")
+ "")
;; Vector logical instructions
@@ -475,19 +457,23 @@ (define_expand "fixuns_trunc<mode><VEC_i
;; Vector initialization, set, extract
(define_expand "vec_init<mode>"
- [(match_operand:VEC_C 0 "vlogical_operand" "")
- (match_operand:VEC_C 1 "vec_init_operand" "")]
- "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ [(match_operand:VEC_E 0 "vlogical_operand" "")
+ (match_operand:VEC_E 1 "vec_init_operand" "")]
+ "(<MODE>mode == V2DImode
+ ? VECTOR_MEM_VSX_P (V2DImode)
+ : VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode))"
{
rs6000_expand_vector_init (operands[0], operands[1]);
DONE;
})
(define_expand "vec_set<mode>"
- [(match_operand:VEC_C 0 "vlogical_operand" "")
+ [(match_operand:VEC_E 0 "vlogical_operand" "")
(match_operand:<VEC_base> 1 "register_operand" "")
(match_operand 2 "const_int_operand" "")]
- "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "(<MODE>mode == V2DImode
+ ? VECTOR_MEM_VSX_P (V2DImode)
+ : VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode))"
{
rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
DONE;
@@ -495,9 +481,11 @@ (define_expand "vec_set<mode>"
(define_expand "vec_extract<mode>"
[(match_operand:<VEC_base> 0 "register_operand" "")
- (match_operand:VEC_C 1 "vlogical_operand" "")
+ (match_operand:VEC_E 1 "vlogical_operand" "")
(match_operand 2 "const_int_operand" "")]
- "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "(<MODE>mode == V2DImode
+ ? VECTOR_MEM_VSX_P (V2DImode)
+ : VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode))"
{
rs6000_expand_vector_extract (operands[0], operands[1],
INTVAL (operands[2]));
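The rewritten vector_vsel<mode> expander is what the vectorizer uses
for a conditional move such as this sketch; with -O3 -mcpu=power7 the
loop body can become an xxsel (whether it vectorizes depends on
alignment and cost decisions):

    void
    select (double *r, const double *a, const double *b,
            const double *c, const double *d, int n)
    {
      int i;
      for (i = 0; i < n; i++)
        r[i] = (a[i] > b[i]) ? c[i] : d[i];
    }
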
--- gcc/config/rs6000/predicates.md (revision 146119)
+++ gcc/config/rs6000/predicates.md (revision 146798)
@@ -327,6 +327,9 @@ (define_predicate "easy_vector_constant"
if (TARGET_PAIRED_FLOAT)
return false;
+ if ((VSX_VECTOR_MODE (mode) || mode == TImode) && zero_constant (op, mode))
+ return true;
+
if (ALTIVEC_VECTOR_MODE (mode))
{
if (zero_constant (op, mode))
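With zero now an easy_vector_constant for VSX modes and TImode, a
vector zero can be materialized in a register (typically with xxlxor)
rather than loaded from the constant pool; a minimal sketch:

    __vector double
    zero_v2df (void)
    {
      return (__vector double) { 0.0, 0.0 };
    }
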
--- gcc/config/rs6000/rs6000-protos.h (revision 146119)
+++ gcc/config/rs6000/rs6000-protos.h (revision 146798)
@@ -176,6 +176,8 @@ extern int rs6000_register_move_cost (en
enum reg_class, enum reg_class);
extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int);
extern bool rs6000_tls_referenced_p (rtx);
+extern bool rs6000_has_indirect_jump_p (void);
+extern void rs6000_set_indirect_jump (void);
extern void rs6000_conditional_register_usage (void);
/* Declare functions in rs6000-c.c */
--- gcc/config/rs6000/rs6000-c.c (revision 146119)
+++ gcc/config/rs6000/rs6000-c.c (revision 146798)
@@ -336,7 +336,20 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
if (TARGET_NO_LWSYNC)
builtin_define ("__NO_LWSYNC__");
if (TARGET_VSX)
- builtin_define ("__VSX__");
+ {
+ builtin_define ("__VSX__");
+
+ /* For the VSX builtin functions identical to Altivec functions, just map
+ the altivec builtin into the vsx version (the altivec functions
+ generate VSX code if -mvsx). */
+ builtin_define ("__builtin_vsx_xxland=__builtin_vec_and");
+ builtin_define ("__builtin_vsx_xxlandc=__builtin_vec_andc");
+ builtin_define ("__builtin_vsx_xxlnor=__builtin_vec_nor");
+ builtin_define ("__builtin_vsx_xxlor=__builtin_vec_or");
+ builtin_define ("__builtin_vsx_xxlxor=__builtin_vec_xor");
+ builtin_define ("__builtin_vsx_xxsel=__builtin_vec_sel");
+ builtin_define ("__builtin_vsx_vperm=__builtin_vec_perm");
+ }
/* May be overridden by target configuration. */
RS6000_CPU_CPP_ENDIAN_BUILTINS();
@@ -400,7 +413,7 @@ struct altivec_builtin_types
};
const struct altivec_builtin_types altivec_overloaded_builtins[] = {
- /* Unary AltiVec builtins. */
+ /* Unary AltiVec/VSX builtins. */
{ ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V16QI,
RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
{ ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V8HI,
@@ -496,7 +509,7 @@ const struct altivec_builtin_types altiv
{ ALTIVEC_BUILTIN_VEC_VUPKLSB, ALTIVEC_BUILTIN_VUPKLSB,
RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V16QI, 0, 0 },
- /* Binary AltiVec builtins. */
+ /* Binary AltiVec/VSX builtins. */
{ ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI, 0 },
{ ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
@@ -2206,7 +2219,7 @@ const struct altivec_builtin_types altiv
{ ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR,
RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
- /* Ternary AltiVec builtins. */
+ /* Ternary AltiVec/VSX builtins. */
{ ALTIVEC_BUILTIN_VEC_DST, ALTIVEC_BUILTIN_DST,
RS6000_BTI_void, ~RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, RS6000_BTI_INTSI },
{ ALTIVEC_BUILTIN_VEC_DST, ALTIVEC_BUILTIN_DST,
@@ -2407,6 +2420,10 @@ const struct altivec_builtin_types altiv
RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI },
{ ALTIVEC_BUILTIN_VEC_NMSUB, ALTIVEC_BUILTIN_VNMSUBFP,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
+ { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_unsigned_V16QI },
+ { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_4SF,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_4SI,
@@ -2433,11 +2450,29 @@ const struct altivec_builtin_types altiv
RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_16QI,
RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_unsigned_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SF,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SF,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V4SI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI },
@@ -2805,6 +2840,37 @@ const struct altivec_builtin_types altiv
RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_STVRXL, ALTIVEC_BUILTIN_STVRXL,
RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI,
+ RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI,
+ RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_8HI,
+ RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_8HI,
+ RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SI,
+ RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SI,
+ RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DI,
+ RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SF,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DI,
+ RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
+ RS6000_BTI_NOT_OPAQUE },
/* Predicates. */
{ ALTIVEC_BUILTIN_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTUB_P,
@@ -3108,6 +3174,10 @@ altivec_resolve_overloaded_builtin (tree
goto bad;
switch (TYPE_MODE (type))
{
+ case DImode:
+ type = (unsigned_p ? unsigned_V2DI_type_node : V2DI_type_node);
+ size = 2;
+ break;
case SImode:
type = (unsigned_p ? unsigned_V4SI_type_node : V4SI_type_node);
size = 4;
@@ -3121,6 +3191,7 @@ altivec_resolve_overloaded_builtin (tree
size = 16;
break;
case SFmode: type = V4SF_type_node; size = 4; break;
+ case DFmode: type = V2DF_type_node; size = 2; break;
default:
goto bad;
}
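The builtin_define mapping above makes the preprocessor rewrite the
VSX spellings onto the overloaded Altivec builtins, so one name works
for any vector type; a sketch:

    __vector int a, b, c;

    void
    logical (void)
    {
      /* Textually remapped to __builtin_vec_or, which emits xxlor
         when compiled with -mvsx.  */
      c = __builtin_vsx_xxlor (a, b);
    }
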
--- gcc/config/rs6000/rs6000.opt (revision 146119)
+++ gcc/config/rs6000/rs6000.opt (revision 146798)
@@ -119,18 +119,6 @@ mvsx
Target Report Mask(VSX)
Use vector/scalar (VSX) instructions
-mvsx-vector-memory
-Target Undocumented Report Var(TARGET_VSX_VECTOR_MEMORY) Init(-1)
-; If -mvsx, use VSX vector load/store instructions instead of Altivec instructions
-
-mvsx-vector-float
-Target Undocumented Report Var(TARGET_VSX_VECTOR_FLOAT) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for float vectors (on by default)
-
-mvsx-vector-double
-Target Undocumented Report Var(TARGET_VSX_VECTOR_DOUBLE) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for double vectors (on by default)
-
mvsx-scalar-double
Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1)
; If -mvsx, use VSX arithmetic instructions for scalar double (on by default)
@@ -139,18 +127,14 @@ mvsx-scalar-memory
Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY)
; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default)
-mvsx-v4sf-altivec-regs
-Target Undocumented Report Var(TARGET_V4SF_ALTIVEC_REGS) Init(-1)
-; If -mvsx, prefer V4SF types to use Altivec regs and not the floating registers
-
-mreload-functions
-Target Undocumented Report Var(TARGET_RELOAD_FUNCTIONS) Init(-1)
-; If -mvsx or -maltivec, enable reload functions
-
mpower7-adjust-cost
Target Undocumented Var(TARGET_POWER7_ADJUST_COST)
; Add extra cost for setting CR registers before a branch like is done for Power5
+mallow-timode
+Target Undocumented Var(TARGET_ALLOW_TIMODE)
+; Allow VSX/Altivec to target loading TImode variables.
+
mdisallow-float-in-lr-ctr
Target Undocumented Var(TARGET_DISALLOW_FLOAT_IN_LR_CTR) Init(-1)
; Disallow floating point in LR or CTR, causes some reload bugs
--- gcc/config/rs6000/rs6000.c (revision 146119)
+++ gcc/config/rs6000/rs6000.c (revision 146798)
@@ -130,6 +130,8 @@ typedef struct machine_function GTY(())
64-bits wide and is allocated early enough so that the offset
does not overflow the 16-bit load/store offset field. */
rtx sdmode_stack_slot;
+ /* Whether an indirect jump or table jump was generated. */
+ bool indirect_jump_p;
} machine_function;
/* Target cpu type */
@@ -917,6 +919,11 @@ static rtx rs6000_expand_binop_builtin (
static rtx rs6000_expand_ternop_builtin (enum insn_code, tree, rtx);
static rtx rs6000_expand_builtin (tree, rtx, rtx, enum machine_mode, int);
static void altivec_init_builtins (void);
+static unsigned builtin_hash_function (const void *);
+static int builtin_hash_eq (const void *, const void *);
+static tree builtin_function_type (enum machine_mode, enum machine_mode,
+ enum machine_mode, enum machine_mode,
+ const char *name);
static void rs6000_common_init_builtins (void);
static void rs6000_init_libfuncs (void);
@@ -1018,6 +1025,8 @@ static enum reg_class rs6000_secondary_r
enum machine_mode,
struct secondary_reload_info *);
+static const enum reg_class *rs6000_ira_cover_classes (void);
+
const int INSN_NOT_AVAILABLE = -1;
static enum machine_mode rs6000_eh_return_filter_mode (void);
@@ -1033,6 +1042,16 @@ struct toc_hash_struct GTY(())
};
static GTY ((param_is (struct toc_hash_struct))) htab_t toc_hash_table;
+
+/* Hash table to keep track of the argument types for builtin functions. */
+
+struct builtin_hash_struct GTY(())
+{
+ tree type;
+ enum machine_mode mode[4]; /* return value + 3 arguments */
+};
+
+static GTY ((param_is (struct builtin_hash_struct))) htab_t builtin_hash_table;
/* Default register names. */
char rs6000_reg_names[][8] =
@@ -1350,6 +1369,9 @@ static const char alt_reg_names[][8] =
#undef TARGET_SECONDARY_RELOAD
#define TARGET_SECONDARY_RELOAD rs6000_secondary_reload
+#undef TARGET_IRA_COVER_CLASSES
+#define TARGET_IRA_COVER_CLASSES rs6000_ira_cover_classes
+
struct gcc_target targetm = TARGET_INITIALIZER;
/* Return number of consecutive hard regs needed starting at reg REGNO
@@ -1370,7 +1392,7 @@ rs6000_hard_regno_nregs_internal (int re
unsigned HOST_WIDE_INT reg_size;
if (FP_REGNO_P (regno))
- reg_size = (VECTOR_UNIT_VSX_P (mode)
+ reg_size = (VECTOR_MEM_VSX_P (mode)
? UNITS_PER_VSX_WORD
: UNITS_PER_FP_WORD);
@@ -1452,7 +1474,7 @@ rs6000_hard_regno_mode_ok (int regno, en
/* AltiVec only in AldyVec registers. */
if (ALTIVEC_REGNO_P (regno))
- return VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode);
+ return VECTOR_MEM_ALTIVEC_OR_VSX_P (mode);
/* ...but GPRs can hold SIMD data on the SPE in one register. */
if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode))
@@ -1613,10 +1635,8 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reload[m][1] = CODE_FOR_nothing;
}
- /* TODO, add TI/V2DI mode for moving data if Altivec or VSX. */
-
/* V2DF mode, VSX only. */
- if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_DOUBLE)
+ if (float_p && TARGET_VSX)
{
rs6000_vector_unit[V2DFmode] = VECTOR_VSX;
rs6000_vector_mem[V2DFmode] = VECTOR_VSX;
@@ -1624,17 +1644,11 @@ rs6000_init_hard_regno_mode_ok (void)
}
/* V4SF mode, either VSX or Altivec. */
- if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_FLOAT)
+ if (float_p && TARGET_VSX)
{
rs6000_vector_unit[V4SFmode] = VECTOR_VSX;
- if (TARGET_VSX_VECTOR_MEMORY || !TARGET_ALTIVEC)
- {
- rs6000_vector_align[V4SFmode] = 32;
- rs6000_vector_mem[V4SFmode] = VECTOR_VSX;
- } else {
- rs6000_vector_align[V4SFmode] = 128;
- rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC;
- }
+ rs6000_vector_align[V4SFmode] = 32;
+ rs6000_vector_mem[V4SFmode] = VECTOR_VSX;
}
else if (float_p && TARGET_ALTIVEC)
{
@@ -1655,7 +1669,7 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reg_class[V8HImode] = ALTIVEC_REGS;
rs6000_vector_reg_class[V4SImode] = ALTIVEC_REGS;
- if (TARGET_VSX && TARGET_VSX_VECTOR_MEMORY)
+ if (TARGET_VSX)
{
rs6000_vector_mem[V4SImode] = VECTOR_VSX;
rs6000_vector_mem[V8HImode] = VECTOR_VSX;
@@ -1675,6 +1689,23 @@ rs6000_init_hard_regno_mode_ok (void)
}
}
+ /* V2DImode, prefer vsx over altivec, since the main use will be for
+ vectorized floating point conversions. */
+ if (float_p && TARGET_VSX)
+ {
+ rs6000_vector_mem[V2DImode] = VECTOR_VSX;
+ rs6000_vector_unit[V2DImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[V2DImode] = vsx_rc;
+ rs6000_vector_align[V2DImode] = 64;
+ }
+ else if (TARGET_ALTIVEC)
+ {
+ rs6000_vector_mem[V2DImode] = VECTOR_ALTIVEC;
+ rs6000_vector_unit[V2DImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[V2DImode] = ALTIVEC_REGS;
+ rs6000_vector_align[V2DImode] = 128;
+ }
+
/* DFmode, see if we want to use the VSX unit. */
if (float_p && TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE)
{
@@ -1684,16 +1715,30 @@ rs6000_init_hard_regno_mode_ok (void)
= (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE);
}
- /* TODO, add SPE and paired floating point vector support. */
+ /* TImode. Until this is debugged, only add it under switch control. */
+ if (TARGET_ALLOW_TIMODE)
+ {
+ if (float_p && TARGET_VSX)
+ {
+ rs6000_vector_mem[TImode] = VECTOR_VSX;
+ rs6000_vector_unit[TImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[TImode] = vsx_rc;
+ rs6000_vector_align[TImode] = 64;
+ }
+ else if (TARGET_ALTIVEC)
+ {
+ rs6000_vector_mem[TImode] = VECTOR_ALTIVEC;
+ rs6000_vector_unit[TImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[TImode] = ALTIVEC_REGS;
+ rs6000_vector_align[TImode] = 128;
+ }
+ }
+
+ /* TODO add SPE and paired floating point vector support. */
/* Set the VSX register classes. */
-
- /* For V4SF, prefer the Altivec registers, because there are a few operations
- that want to use Altivec operations instead of VSX. */
rs6000_vector_reg_class[V4SFmode]
- = ((VECTOR_UNIT_VSX_P (V4SFmode)
- && VECTOR_MEM_VSX_P (V4SFmode)
- && !TARGET_V4SF_ALTIVEC_REGS)
+ = ((VECTOR_UNIT_VSX_P (V4SFmode) && VECTOR_MEM_VSX_P (V4SFmode))
? vsx_rc
: (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
? ALTIVEC_REGS
@@ -1712,7 +1757,7 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vsx_reg_class = (float_p && TARGET_VSX) ? vsx_rc : NO_REGS;
/* Set up the reload helper functions. */
- if (TARGET_RELOAD_FUNCTIONS && (TARGET_VSX || TARGET_ALTIVEC))
+ if (TARGET_VSX || TARGET_ALTIVEC)
{
if (TARGET_64BIT)
{
@@ -1728,6 +1773,11 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_di_load;
rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_di_store;
rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_di_load;
+ if (TARGET_ALLOW_TIMODE)
+ {
+ rs6000_vector_reload[TImode][0] = CODE_FOR_reload_ti_di_store;
+ rs6000_vector_reload[TImode][1] = CODE_FOR_reload_ti_di_load;
+ }
}
else
{
@@ -1743,6 +1793,11 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_si_load;
rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_si_store;
rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_si_load;
+ if (TARGET_ALLOW_TIMODE)
+ {
+ rs6000_vector_reload[TImode][0] = CODE_FOR_reload_ti_si_store;
+ rs6000_vector_reload[TImode][1] = CODE_FOR_reload_ti_si_load;
+ }
}
}
@@ -2132,23 +2187,29 @@ rs6000_override_options (const char *def
const char *msg = NULL;
if (!TARGET_HARD_FLOAT || !TARGET_FPRS
|| !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT)
- msg = "-mvsx requires hardware floating point";
+ {
+ if (target_flags_explicit & MASK_VSX)
+ msg = N_("-mvsx requires hardware floating point");
+ else
+ target_flags &= ~ MASK_VSX;
+ }
else if (TARGET_PAIRED_FLOAT)
- msg = "-mvsx and -mpaired are incompatible";
+ msg = N_("-mvsx and -mpaired are incompatible");
/* The hardware will allow VSX and little endian, but until we make sure
things like vector select, etc. work don't allow VSX on little endian
systems at this point. */
else if (!BYTES_BIG_ENDIAN)
- msg = "-mvsx used with little endian code";
+ msg = N_("-mvsx used with little endian code");
else if (TARGET_AVOID_XFORM > 0)
- msg = "-mvsx needs indexed addressing";
+ msg = N_("-mvsx needs indexed addressing");
if (msg)
{
warning (0, msg);
- target_flags &= MASK_VSX;
+ target_flags &= ~ MASK_VSX;
}
- else if (!TARGET_ALTIVEC && (target_flags_explicit & MASK_ALTIVEC) == 0)
+ else if (TARGET_VSX && !TARGET_ALTIVEC
+ && (target_flags_explicit & MASK_ALTIVEC) == 0)
target_flags |= MASK_ALTIVEC;
}
@@ -2581,8 +2642,8 @@ rs6000_builtin_conversion (enum tree_cod
return NULL_TREE;
return TYPE_UNSIGNED (type)
- ? rs6000_builtin_decls[VSX_BUILTIN_XVCVUXDSP]
- : rs6000_builtin_decls[VSX_BUILTIN_XVCVSXDSP];
+ ? rs6000_builtin_decls[VSX_BUILTIN_XVCVUXDDP]
+ : rs6000_builtin_decls[VSX_BUILTIN_XVCVSXDDP];
case V4SImode:
if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode))
@@ -3785,15 +3846,28 @@ rs6000_expand_vector_init (rtx target, r
}
}
- if (mode == V2DFmode)
+ if (VECTOR_MEM_VSX_P (mode) && (mode == V2DFmode || mode == V2DImode))
{
- gcc_assert (TARGET_VSX);
+ rtx (*splat) (rtx, rtx);
+ rtx (*concat) (rtx, rtx, rtx);
+
+ if (mode == V2DFmode)
+ {
+ splat = gen_vsx_splat_v2df;
+ concat = gen_vsx_concat_v2df;
+ }
+ else
+ {
+ splat = gen_vsx_splat_v2di;
+ concat = gen_vsx_concat_v2di;
+ }
+
if (all_same)
- emit_insn (gen_vsx_splatv2df (target, XVECEXP (vals, 0, 0)));
+ emit_insn (splat (target, XVECEXP (vals, 0, 0)));
else
- emit_insn (gen_vsx_concat_v2df (target,
- copy_to_reg (XVECEXP (vals, 0, 0)),
- copy_to_reg (XVECEXP (vals, 0, 1))));
+ emit_insn (concat (target,
+ copy_to_reg (XVECEXP (vals, 0, 0)),
+ copy_to_reg (XVECEXP (vals, 0, 1))));
return;
}
@@ -3856,10 +3930,12 @@ rs6000_expand_vector_set (rtx target, rt
int width = GET_MODE_SIZE (inner_mode);
int i;
- if (mode == V2DFmode)
+ if (mode == V2DFmode || mode == V2DImode)
{
+ rtx (*set_func) (rtx, rtx, rtx, rtx)
+ = ((mode == V2DFmode) ? gen_vsx_set_v2df : gen_vsx_set_v2di);
gcc_assert (TARGET_VSX);
- emit_insn (gen_vsx_set_v2df (target, val, target, GEN_INT (elt)));
+ emit_insn (set_func (target, val, target, GEN_INT (elt)));
return;
}
@@ -3900,10 +3976,12 @@ rs6000_expand_vector_extract (rtx target
enum machine_mode inner_mode = GET_MODE_INNER (mode);
rtx mem, x;
- if (mode == V2DFmode)
+ if (mode == V2DFmode || mode == V2DImode)
{
+ rtx (*extract_func) (rtx, rtx, rtx)
+ = ((mode == V2DFmode) ? gen_vsx_extract_v2df : gen_vsx_extract_v2di);
gcc_assert (TARGET_VSX);
- emit_insn (gen_vsx_extract_v2df (target, vec, GEN_INT (elt)));
+ emit_insn (extract_func (target, vec, GEN_INT (elt)));
return;
}
@@ -4323,9 +4401,7 @@ avoiding_indexed_address_p (enum machine
{
/* Avoid indexed addressing for modes that have non-indexed
load/store instruction forms. */
- return (TARGET_AVOID_XFORM
- && (!TARGET_ALTIVEC || !ALTIVEC_VECTOR_MODE (mode))
- && (!TARGET_VSX || !VSX_VECTOR_MODE (mode)));
+ return (TARGET_AVOID_XFORM && VECTOR_MEM_NONE_P (mode));
}
inline bool
@@ -4427,6 +4503,16 @@ rs6000_legitimize_address (rtx x, rtx ol
ret = rs6000_legitimize_tls_address (x, model);
}
+ else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
+ {
+ /* Make sure both operands are registers. */
+ if (GET_CODE (x) == PLUS)
+ ret = gen_rtx_PLUS (Pmode,
+ force_reg (Pmode, XEXP (x, 0)),
+ force_reg (Pmode, XEXP (x, 1)));
+ else
+ ret = force_reg (Pmode, x);
+ }
else if (GET_CODE (x) == PLUS
&& GET_CODE (XEXP (x, 0)) == REG
&& GET_CODE (XEXP (x, 1)) == CONST_INT
@@ -4436,8 +4522,6 @@ rs6000_legitimize_address (rtx x, rtx ol
&& (mode == DImode || mode == TImode)
&& (INTVAL (XEXP (x, 1)) & 3) != 0)
|| (TARGET_SPE && SPE_VECTOR_MODE (mode))
- || (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode))
- || (TARGET_VSX && VSX_VECTOR_MODE (mode))
|| (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
|| mode == DImode || mode == DDmode
|| mode == TDmode))))
@@ -4467,15 +4551,6 @@ rs6000_legitimize_address (rtx x, rtx ol
ret = gen_rtx_PLUS (Pmode, XEXP (x, 0),
force_reg (Pmode, force_operand (XEXP (x, 1), 0)));
}
- else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
- {
- /* Make sure both operands are registers. */
- if (GET_CODE (x) == PLUS)
- ret = gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)),
- force_reg (Pmode, XEXP (x, 1)));
- else
- ret = force_reg (Pmode, x);
- }
else if ((TARGET_SPE && SPE_VECTOR_MODE (mode))
|| (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
|| mode == DDmode || mode == TDmode
@@ -5113,7 +5188,7 @@ rs6000_legitimate_address (enum machine_
ret = 1;
else if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict))
ret = 1;
- else if (mode != TImode
+ else if ((mode != TImode || !VECTOR_MEM_NONE_P (TImode))
&& mode != TFmode
&& mode != TDmode
&& ((TARGET_HARD_FLOAT && TARGET_FPRS)
@@ -5953,7 +6028,13 @@ rs6000_emit_move (rtx dest, rtx source,
case TImode:
if (VECTOR_MEM_ALTIVEC_OR_VSX_P (TImode))
- break;
+ {
+ if (CONSTANT_P (operands[1])
+ && !easy_vector_constant (operands[1], mode))
+ operands[1] = force_const_mem (mode, operands[1]);
+
+ break;
+ }
rs6000_eliminate_indexed_memrefs (operands);
@@ -7869,7 +7950,8 @@ def_builtin (int mask, const char *name,
if ((mask & target_flags) || TARGET_PAIRED_FLOAT)
{
if (rs6000_builtin_decls[code])
- abort ();
+ fatal_error ("internal error: builtin function to %s already processed.",
+ name);
rs6000_builtin_decls[code] =
add_builtin_function (name, type, code, BUILT_IN_MD,
@@ -7934,6 +8016,34 @@ static const struct builtin_description
{ MASK_VSX, CODE_FOR_vsx_fnmaddv4sf4, "__builtin_vsx_xvnmaddsp", VSX_BUILTIN_XVNMADDSP },
{ MASK_VSX, CODE_FOR_vsx_fnmsubv4sf4, "__builtin_vsx_xvnmsubsp", VSX_BUILTIN_XVNMSUBSP },
+ { MASK_VSX, CODE_FOR_vector_vselv2di, "__builtin_vsx_xxsel_2di", VSX_BUILTIN_XXSEL_2DI },
+ { MASK_VSX, CODE_FOR_vector_vselv2df, "__builtin_vsx_xxsel_2df", VSX_BUILTIN_XXSEL_2DF },
+ { MASK_VSX, CODE_FOR_vector_vselv4sf, "__builtin_vsx_xxsel_4sf", VSX_BUILTIN_XXSEL_4SF },
+ { MASK_VSX, CODE_FOR_vector_vselv4si, "__builtin_vsx_xxsel_4si", VSX_BUILTIN_XXSEL_4SI },
+ { MASK_VSX, CODE_FOR_vector_vselv8hi, "__builtin_vsx_xxsel_8hi", VSX_BUILTIN_XXSEL_8HI },
+ { MASK_VSX, CODE_FOR_vector_vselv16qi, "__builtin_vsx_xxsel_16qi", VSX_BUILTIN_XXSEL_16QI },
+
+ { MASK_VSX, CODE_FOR_altivec_vperm_v2di, "__builtin_vsx_vperm_2di", VSX_BUILTIN_VPERM_2DI },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v2df, "__builtin_vsx_vperm_2df", VSX_BUILTIN_VPERM_2DF },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v4sf, "__builtin_vsx_vperm_4sf", VSX_BUILTIN_VPERM_4SF },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v4si, "__builtin_vsx_vperm_4si", VSX_BUILTIN_VPERM_4SI },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v8hi, "__builtin_vsx_vperm_8hi", VSX_BUILTIN_VPERM_8HI },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v16qi, "__builtin_vsx_vperm_16qi", VSX_BUILTIN_VPERM_16QI },
+
+ { MASK_VSX, CODE_FOR_vsx_xxpermdi_v2df, "__builtin_vsx_xxpermdi_2df", VSX_BUILTIN_XXPERMDI_2DF },
+ { MASK_VSX, CODE_FOR_vsx_xxpermdi_v2di, "__builtin_vsx_xxpermdi_2di", VSX_BUILTIN_XXPERMDI_2DI },
+ { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxpermdi", VSX_BUILTIN_VEC_XXPERMDI },
+ { MASK_VSX, CODE_FOR_vsx_set_v2df, "__builtin_vsx_set_2df", VSX_BUILTIN_SET_2DF },
+ { MASK_VSX, CODE_FOR_vsx_set_v2di, "__builtin_vsx_set_2di", VSX_BUILTIN_SET_2DI },
+
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v2di, "__builtin_vsx_xxsldwi_2di", VSX_BUILTIN_XXSLDWI_2DI },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v2df, "__builtin_vsx_xxsldwi_2df", VSX_BUILTIN_XXSLDWI_2DF },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v4sf, "__builtin_vsx_xxsldwi_4sf", VSX_BUILTIN_XXSLDWI_4SF },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v4si, "__builtin_vsx_xxsldwi_4si", VSX_BUILTIN_XXSLDWI_4SI },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v8hi, "__builtin_vsx_xxsldwi_8hi", VSX_BUILTIN_XXSLDWI_8HI },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v16qi, "__builtin_vsx_xxsldwi_16qi", VSX_BUILTIN_XXSLDWI_16QI },
+ { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxsldwi", VSX_BUILTIN_VEC_XXSLDWI },
+
{ 0, CODE_FOR_paired_msub, "__builtin_paired_msub", PAIRED_BUILTIN_MSUB },
{ 0, CODE_FOR_paired_madd, "__builtin_paired_madd", PAIRED_BUILTIN_MADD },
{ 0, CODE_FOR_paired_madds0, "__builtin_paired_madds0", PAIRED_BUILTIN_MADDS0 },
@@ -8083,6 +8193,9 @@ static struct builtin_description bdesc_
{ MASK_VSX, CODE_FOR_sminv2df3, "__builtin_vsx_xvmindp", VSX_BUILTIN_XVMINDP },
{ MASK_VSX, CODE_FOR_smaxv2df3, "__builtin_vsx_xvmaxdp", VSX_BUILTIN_XVMAXDP },
{ MASK_VSX, CODE_FOR_vsx_tdivv2df3, "__builtin_vsx_xvtdivdp", VSX_BUILTIN_XVTDIVDP },
+ { MASK_VSX, CODE_FOR_vector_eqv2df, "__builtin_vsx_xvcmpeqdp", VSX_BUILTIN_XVCMPEQDP },
+ { MASK_VSX, CODE_FOR_vector_gtv2df, "__builtin_vsx_xvcmpgtdp", VSX_BUILTIN_XVCMPGTDP },
+ { MASK_VSX, CODE_FOR_vector_gev2df, "__builtin_vsx_xvcmpgedp", VSX_BUILTIN_XVCMPGEDP },
{ MASK_VSX, CODE_FOR_addv4sf3, "__builtin_vsx_xvaddsp", VSX_BUILTIN_XVADDSP },
{ MASK_VSX, CODE_FOR_subv4sf3, "__builtin_vsx_xvsubsp", VSX_BUILTIN_XVSUBSP },
@@ -8091,6 +8204,21 @@ static struct builtin_description bdesc_
{ MASK_VSX, CODE_FOR_sminv4sf3, "__builtin_vsx_xvminsp", VSX_BUILTIN_XVMINSP },
{ MASK_VSX, CODE_FOR_smaxv4sf3, "__builtin_vsx_xvmaxsp", VSX_BUILTIN_XVMAXSP },
{ MASK_VSX, CODE_FOR_vsx_tdivv4sf3, "__builtin_vsx_xvtdivsp", VSX_BUILTIN_XVTDIVSP },
+ { MASK_VSX, CODE_FOR_vector_eqv4sf, "__builtin_vsx_xvcmpeqsp", VSX_BUILTIN_XVCMPEQSP },
+ { MASK_VSX, CODE_FOR_vector_gtv4sf, "__builtin_vsx_xvcmpgtsp", VSX_BUILTIN_XVCMPGTSP },
+ { MASK_VSX, CODE_FOR_vector_gev4sf, "__builtin_vsx_xvcmpgesp", VSX_BUILTIN_XVCMPGESP },
+
+ { MASK_VSX, CODE_FOR_smindf3, "__builtin_vsx_xsmindp", VSX_BUILTIN_XSMINDP },
+ { MASK_VSX, CODE_FOR_smaxdf3, "__builtin_vsx_xsmaxdp", VSX_BUILTIN_XSMAXDP },
+
+ { MASK_VSX, CODE_FOR_vsx_concat_v2df, "__builtin_vsx_concat_2df", VSX_BUILTIN_CONCAT_2DF },
+ { MASK_VSX, CODE_FOR_vsx_concat_v2di, "__builtin_vsx_concat_2di", VSX_BUILTIN_CONCAT_2DI },
+ { MASK_VSX, CODE_FOR_vsx_splat_v2df, "__builtin_vsx_splat_2df", VSX_BUILTIN_SPLAT_2DF },
+ { MASK_VSX, CODE_FOR_vsx_splat_v2di, "__builtin_vsx_splat_2di", VSX_BUILTIN_SPLAT_2DI },
+ { MASK_VSX, CODE_FOR_vsx_xxmrghw_v4sf, "__builtin_vsx_xxmrghw", VSX_BUILTIN_XXMRGHW_4SF },
+ { MASK_VSX, CODE_FOR_vsx_xxmrghw_v4si, "__builtin_vsx_xxmrghw_4si", VSX_BUILTIN_XXMRGHW_4SI },
+ { MASK_VSX, CODE_FOR_vsx_xxmrglw_v4sf, "__builtin_vsx_xxmrglw", VSX_BUILTIN_XXMRGLW_4SF },
+ { MASK_VSX, CODE_FOR_vsx_xxmrglw_v4si, "__builtin_vsx_xxmrglw_4si", VSX_BUILTIN_XXMRGLW_4SI },
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_add", ALTIVEC_BUILTIN_VEC_ADD },
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vaddfp", ALTIVEC_BUILTIN_VEC_VADDFP },
@@ -8508,6 +8636,47 @@ static struct builtin_description bdesc_
{ MASK_VSX, CODE_FOR_vsx_tsqrtv4sf2, "__builtin_vsx_xvtsqrtsp", VSX_BUILTIN_XVTSQRTSP },
{ MASK_VSX, CODE_FOR_vsx_frev4sf2, "__builtin_vsx_xvresp", VSX_BUILTIN_XVRESP },
+ { MASK_VSX, CODE_FOR_vsx_xscvdpsp, "__builtin_vsx_xscvdpsp", VSX_BUILTIN_XSCVDPSP },
+ { MASK_VSX, CODE_FOR_vsx_xscvdpsp, "__builtin_vsx_xscvspdp", VSX_BUILTIN_XSCVSPDP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvdpsp, "__builtin_vsx_xvcvdpsp", VSX_BUILTIN_XVCVDPSP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvspdp, "__builtin_vsx_xvcvspdp", VSX_BUILTIN_XVCVSPDP },
+
+ { MASK_VSX, CODE_FOR_vsx_fix_truncv2dfv2di2, "__builtin_vsx_xvcvdpsxds", VSX_BUILTIN_XVCVDPSXDS },
+ { MASK_VSX, CODE_FOR_vsx_fixuns_truncv2dfv2di2, "__builtin_vsx_xvcvdpuxds", VSX_BUILTIN_XVCVDPUXDS },
+ { MASK_VSX, CODE_FOR_vsx_floatv2div2df2, "__builtin_vsx_xvcvsxddp", VSX_BUILTIN_XVCVSXDDP },
+ { MASK_VSX, CODE_FOR_vsx_floatunsv2div2df2, "__builtin_vsx_xvcvuxddp", VSX_BUILTIN_XVCVUXDDP },
+
+ { MASK_VSX, CODE_FOR_vsx_fix_truncv4sfv4si2, "__builtin_vsx_xvcvspsxws", VSX_BUILTIN_XVCVSPSXWS },
+ { MASK_VSX, CODE_FOR_vsx_fixuns_truncv4sfv4si2, "__builtin_vsx_xvcvspuxws", VSX_BUILTIN_XVCVSPUXWS },
+ { MASK_VSX, CODE_FOR_vsx_floatv4siv4sf2, "__builtin_vsx_xvcvsxwsp", VSX_BUILTIN_XVCVSXWSP },
+ { MASK_VSX, CODE_FOR_vsx_floatunsv4siv4sf2, "__builtin_vsx_xvcvuxwsp", VSX_BUILTIN_XVCVUXWSP },
+
+ { MASK_VSX, CODE_FOR_vsx_xvcvdpsxws, "__builtin_vsx_xvcvdpsxws", VSX_BUILTIN_XVCVDPSXWS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvdpuxws, "__builtin_vsx_xvcvdpuxws", VSX_BUILTIN_XVCVDPUXWS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvsxwdp, "__builtin_vsx_xvcvsxwdp", VSX_BUILTIN_XVCVSXWDP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvuxwdp, "__builtin_vsx_xvcvuxwdp", VSX_BUILTIN_XVCVUXWDP },
+ { MASK_VSX, CODE_FOR_vsx_xvrdpi, "__builtin_vsx_xvrdpi", VSX_BUILTIN_XVRDPI },
+ { MASK_VSX, CODE_FOR_vsx_xvrdpic, "__builtin_vsx_xvrdpic", VSX_BUILTIN_XVRDPIC },
+ { MASK_VSX, CODE_FOR_vsx_floorv2df2, "__builtin_vsx_xvrdpim", VSX_BUILTIN_XVRDPIM },
+ { MASK_VSX, CODE_FOR_vsx_ceilv2df2, "__builtin_vsx_xvrdpip", VSX_BUILTIN_XVRDPIP },
+ { MASK_VSX, CODE_FOR_vsx_btruncv2df2, "__builtin_vsx_xvrdpiz", VSX_BUILTIN_XVRDPIZ },
+
+ { MASK_VSX, CODE_FOR_vsx_xvcvspsxds, "__builtin_vsx_xvcvspsxds", VSX_BUILTIN_XVCVSPSXDS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvspuxds, "__builtin_vsx_xvcvspuxds", VSX_BUILTIN_XVCVSPUXDS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvsxdsp, "__builtin_vsx_xvcvsxdsp", VSX_BUILTIN_XVCVSXDSP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvuxdsp, "__builtin_vsx_xvcvuxdsp", VSX_BUILTIN_XVCVUXDSP },
+ { MASK_VSX, CODE_FOR_vsx_xvrspi, "__builtin_vsx_xvrspi", VSX_BUILTIN_XVRSPI },
+ { MASK_VSX, CODE_FOR_vsx_xvrspic, "__builtin_vsx_xvrspic", VSX_BUILTIN_XVRSPIC },
+ { MASK_VSX, CODE_FOR_vsx_floorv4sf2, "__builtin_vsx_xvrspim", VSX_BUILTIN_XVRSPIM },
+ { MASK_VSX, CODE_FOR_vsx_ceilv4sf2, "__builtin_vsx_xvrspip", VSX_BUILTIN_XVRSPIP },
+ { MASK_VSX, CODE_FOR_vsx_btruncv4sf2, "__builtin_vsx_xvrspiz", VSX_BUILTIN_XVRSPIZ },
+
+ { MASK_VSX, CODE_FOR_vsx_xsrdpi, "__builtin_vsx_xsrdpi", VSX_BUILTIN_XSRDPI },
+ { MASK_VSX, CODE_FOR_vsx_xsrdpic, "__builtin_vsx_xsrdpic", VSX_BUILTIN_XSRDPIC },
+ { MASK_VSX, CODE_FOR_vsx_floordf2, "__builtin_vsx_xsrdpim", VSX_BUILTIN_XSRDPIM },
+ { MASK_VSX, CODE_FOR_vsx_ceildf2, "__builtin_vsx_xsrdpip", VSX_BUILTIN_XSRDPIP },
+ { MASK_VSX, CODE_FOR_vsx_btruncdf2, "__builtin_vsx_xsrdpiz", VSX_BUILTIN_XSRDPIZ },
+
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abs", ALTIVEC_BUILTIN_VEC_ABS },
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abss", ALTIVEC_BUILTIN_VEC_ABSS },
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_ceil", ALTIVEC_BUILTIN_VEC_CEIL },
@@ -8533,15 +8702,6 @@ static struct builtin_description bdesc_
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vec_fix_sfsi", VECTOR_BUILTIN_FIX_V4SF_V4SI },
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vec_fixuns_sfsi", VECTOR_BUILTIN_FIXUNS_V4SF_V4SI },
- { MASK_VSX, CODE_FOR_floatv2div2df2, "__builtin_vsx_xvcvsxddp", VSX_BUILTIN_XVCVSXDDP },
- { MASK_VSX, CODE_FOR_unsigned_floatv2div2df2, "__builtin_vsx_xvcvuxddp", VSX_BUILTIN_XVCVUXDDP },
- { MASK_VSX, CODE_FOR_fix_truncv2dfv2di2, "__builtin_vsx_xvdpsxds", VSX_BUILTIN_XVCVDPSXDS },
- { MASK_VSX, CODE_FOR_fixuns_truncv2dfv2di2, "__builtin_vsx_xvdpuxds", VSX_BUILTIN_XVCVDPUXDS },
- { MASK_VSX, CODE_FOR_floatv4siv4sf2, "__builtin_vsx_xvcvsxwsp", VSX_BUILTIN_XVCVSXDSP },
- { MASK_VSX, CODE_FOR_unsigned_floatv4siv4sf2, "__builtin_vsx_xvcvuxwsp", VSX_BUILTIN_XVCVUXWSP },
- { MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vsx_xvspsxws", VSX_BUILTIN_XVCVSPSXWS },
- { MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vsx_xvspuxws", VSX_BUILTIN_XVCVSPUXWS },
-
/* The SPE unary builtins must start with SPE_BUILTIN_EVABS and
end with SPE_BUILTIN_EVSUBFUSIAAW. */
{ 0, CODE_FOR_spe_evabs, "__builtin_spe_evabs", SPE_BUILTIN_EVABS },
@@ -9046,11 +9206,12 @@ rs6000_expand_ternop_builtin (enum insn_
|| arg2 == error_mark_node)
return const0_rtx;
- if (icode == CODE_FOR_altivec_vsldoi_v4sf
- || icode == CODE_FOR_altivec_vsldoi_v4si
- || icode == CODE_FOR_altivec_vsldoi_v8hi
- || icode == CODE_FOR_altivec_vsldoi_v16qi)
+ switch (icode)
{
+ case CODE_FOR_altivec_vsldoi_v4sf:
+ case CODE_FOR_altivec_vsldoi_v4si:
+ case CODE_FOR_altivec_vsldoi_v8hi:
+ case CODE_FOR_altivec_vsldoi_v16qi:
/* Only allow 4-bit unsigned literals. */
STRIP_NOPS (arg2);
if (TREE_CODE (arg2) != INTEGER_CST
@@ -9059,6 +9220,40 @@ rs6000_expand_ternop_builtin (enum insn_
error ("argument 3 must be a 4-bit unsigned literal");
return const0_rtx;
}
+ break;
+
+ case CODE_FOR_vsx_xxpermdi_v2df:
+ case CODE_FOR_vsx_xxpermdi_v2di:
+ case CODE_FOR_vsx_xxsldwi_v16qi:
+ case CODE_FOR_vsx_xxsldwi_v8hi:
+ case CODE_FOR_vsx_xxsldwi_v4si:
+ case CODE_FOR_vsx_xxsldwi_v4sf:
+ case CODE_FOR_vsx_xxsldwi_v2di:
+ case CODE_FOR_vsx_xxsldwi_v2df:
+ /* Only allow 2-bit unsigned literals. */
+ STRIP_NOPS (arg2);
+ if (TREE_CODE (arg2) != INTEGER_CST
+ || TREE_INT_CST_LOW (arg2) & ~0x3)
+ {
+ error ("argument 3 must be a 2-bit unsigned literal");
+ return const0_rtx;
+ }
+ break;
+
+ case CODE_FOR_vsx_set_v2df:
+ case CODE_FOR_vsx_set_v2di:
+ /* Only allow 1-bit unsigned literals. */
+ STRIP_NOPS (arg2);
+ if (TREE_CODE (arg2) != INTEGER_CST
+ || TREE_INT_CST_LOW (arg2) & ~0x1)
+ {
+ error ("argument 3 must be a 1-bit unsigned literal");
+ return const0_rtx;
+ }
+ break;
+
+ default:
+ break;
}
if (target == 0
@@ -9366,8 +9561,10 @@ altivec_expand_builtin (tree exp, rtx ta
enum machine_mode tmode, mode0;
unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
- if (fcode >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && fcode <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ if ((fcode >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && fcode <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (fcode >= VSX_BUILTIN_OVERLOADED_FIRST
+ && fcode <= VSX_BUILTIN_OVERLOADED_LAST))
{
*expandedp = true;
error ("unresolved overload for Altivec builtin %qF", fndecl);
@@ -10156,6 +10353,7 @@ rs6000_init_builtins (void)
unsigned_V16QI_type_node = build_vector_type (unsigned_intQI_type_node, 16);
unsigned_V8HI_type_node = build_vector_type (unsigned_intHI_type_node, 8);
unsigned_V4SI_type_node = build_vector_type (unsigned_intSI_type_node, 4);
+ unsigned_V2DI_type_node = build_vector_type (unsigned_intDI_type_node, 2);
opaque_V2SF_type_node = build_opaque_vector_type (float_type_node, 2);
opaque_V2SI_type_node = build_opaque_vector_type (intSI_type_node, 2);
@@ -10169,6 +10367,7 @@ rs6000_init_builtins (void)
bool_char_type_node = build_distinct_type_copy (unsigned_intQI_type_node);
bool_short_type_node = build_distinct_type_copy (unsigned_intHI_type_node);
bool_int_type_node = build_distinct_type_copy (unsigned_intSI_type_node);
+ bool_long_type_node = build_distinct_type_copy (unsigned_intDI_type_node);
pixel_type_node = build_distinct_type_copy (unsigned_intHI_type_node);
long_integer_type_internal_node = long_integer_type_node;
@@ -10201,6 +10400,7 @@ rs6000_init_builtins (void)
bool_V16QI_type_node = build_vector_type (bool_char_type_node, 16);
bool_V8HI_type_node = build_vector_type (bool_short_type_node, 8);
bool_V4SI_type_node = build_vector_type (bool_int_type_node, 4);
+ bool_V2DI_type_node = build_vector_type (bool_long_type_node, 2);
pixel_V8HI_type_node = build_vector_type (pixel_type_node, 8);
(*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
@@ -10241,9 +10441,17 @@ rs6000_init_builtins (void)
pixel_V8HI_type_node));
if (TARGET_VSX)
- (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
- get_identifier ("__vector double"),
- V2DF_type_node));
+ {
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector double"),
+ V2DF_type_node));
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector long"),
+ V2DI_type_node));
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector __bool long"),
+ bool_V2DI_type_node));
+ }
if (TARGET_PAIRED_FLOAT)
paired_init_builtins ();
@@ -10818,8 +11026,10 @@ altivec_init_builtins (void)
{
enum machine_mode mode1;
tree type;
- bool is_overloaded = dp->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && dp->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ bool is_overloaded = ((dp->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && dp->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (dp->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && dp->code <= VSX_BUILTIN_OVERLOADED_LAST));
if (is_overloaded)
mode1 = VOIDmode;
@@ -10982,592 +11192,302 @@ altivec_init_builtins (void)
ALTIVEC_BUILTIN_VEC_EXT_V4SF);
}
-static void
-rs6000_common_init_builtins (void)
+/* Hash function for builtin functions with up to 3 arguments and a return
+ type. */
+static unsigned
+builtin_hash_function (const void *hash_entry)
{
- const struct builtin_description *d;
- size_t i;
+ unsigned ret = 0;
+ int i;
+ const struct builtin_hash_struct *bh =
+ (const struct builtin_hash_struct *) hash_entry;
- tree v2sf_ftype_v2sf_v2sf_v2sf
- = build_function_type_list (V2SF_type_node,
- V2SF_type_node, V2SF_type_node,
- V2SF_type_node, NULL_TREE);
-
- tree v4sf_ftype_v4sf_v4sf_v16qi
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si_v16qi
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi_v16qi
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_v16qi_v16qi
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, V16QI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v4si_ftype_int
- = build_function_type_list (V4SI_type_node, integer_type_node, NULL_TREE);
- tree v8hi_ftype_int
- = build_function_type_list (V8HI_type_node, integer_type_node, NULL_TREE);
- tree v16qi_ftype_int
- = build_function_type_list (V16QI_type_node, integer_type_node, NULL_TREE);
- tree v8hi_ftype_v16qi
- = build_function_type_list (V8HI_type_node, V16QI_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf
- = build_function_type_list (V4SF_type_node, V4SF_type_node, NULL_TREE);
+ for (i = 0; i < 4; i++)
+ ret = (ret * (unsigned)MAX_MACHINE_MODE) + ((unsigned)bh->mode[i]);
- tree v2si_ftype_v2si_v2si
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SI_type_node,
- opaque_V2SI_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf_v2sf_spe
- = build_function_type_list (opaque_V2SF_type_node,
- opaque_V2SF_type_node,
- opaque_V2SF_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf_v2sf
- = build_function_type_list (V2SF_type_node,
- V2SF_type_node,
- V2SF_type_node, NULL_TREE);
-
-
- tree v2si_ftype_int_int
- = build_function_type_list (opaque_V2SI_type_node,
- integer_type_node, integer_type_node,
- NULL_TREE);
+ return ret;
+}
- tree opaque_ftype_opaque
- = build_function_type_list (opaque_V4SI_type_node,
- opaque_V4SI_type_node, NULL_TREE);
+/* Compare builtin hash entries H1 and H2 for equivalence. */
+static int
+builtin_hash_eq (const void *h1, const void *h2)
+{
+ const struct builtin_hash_struct *p1 = (const struct builtin_hash_struct *) h1;
+ const struct builtin_hash_struct *p2 = (const struct builtin_hash_struct *) h2;
- tree v2si_ftype_v2si
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SI_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf_spe
- = build_function_type_list (opaque_V2SF_type_node,
- opaque_V2SF_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf
- = build_function_type_list (V2SF_type_node,
- V2SF_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2si
- = build_function_type_list (opaque_V2SF_type_node,
- opaque_V2SI_type_node, NULL_TREE);
-
- tree v2si_ftype_v2sf
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SF_type_node, NULL_TREE);
-
- tree v2si_ftype_v2si_char
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SI_type_node,
- char_type_node, NULL_TREE);
-
- tree v2si_ftype_int_char
- = build_function_type_list (opaque_V2SI_type_node,
- integer_type_node, char_type_node, NULL_TREE);
-
- tree v2si_ftype_char
- = build_function_type_list (opaque_V2SI_type_node,
- char_type_node, NULL_TREE);
+ return ((p1->mode[0] == p2->mode[0])
+ && (p1->mode[1] == p2->mode[1])
+ && (p1->mode[2] == p2->mode[2])
+ && (p1->mode[3] == p2->mode[3]));
+}
- tree int_ftype_int_int
- = build_function_type_list (integer_type_node,
- integer_type_node, integer_type_node,
- NULL_TREE);
+/* Map selected modes to types for builtins. */
+static tree builtin_mode_to_type[MAX_MACHINE_MODE];
- tree opaque_ftype_opaque_opaque
- = build_function_type_list (opaque_V4SI_type_node,
- opaque_V4SI_type_node, opaque_V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node, NULL_TREE);
- tree v4sf_ftype_v4si_int
- = build_function_type_list (V4SF_type_node,
- V4SI_type_node, integer_type_node, NULL_TREE);
- tree v4si_ftype_v4sf_int
- = build_function_type_list (V4SI_type_node,
- V4SF_type_node, integer_type_node, NULL_TREE);
- tree v4si_ftype_v4si_int
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, integer_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_int
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, integer_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_int
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, integer_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_v16qi_int
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, V16QI_type_node,
- integer_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi_int
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node,
- integer_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si_int
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node,
- integer_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf_int
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- integer_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node, NULL_TREE);
- tree opaque_ftype_opaque_opaque_opaque
- = build_function_type_list (opaque_V4SI_type_node,
- opaque_V4SI_type_node, opaque_V4SI_type_node,
- opaque_V4SI_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf_v4si
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- V4SI_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf_v4sf
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- V4SF_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si_v4si
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node,
- V4SI_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi_v8hi
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node,
- V8HI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi_v8hi_v4si
- = build_function_type_list (V4SI_type_node,
- V8HI_type_node, V8HI_type_node,
- V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v16qi_v16qi_v4si
- = build_function_type_list (V4SI_type_node,
- V16QI_type_node, V16QI_type_node,
- V4SI_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_v16qi
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v4sf_v4sf
- = build_function_type_list (V4SI_type_node,
- V4SF_type_node, V4SF_type_node, NULL_TREE);
- tree v8hi_ftype_v16qi_v16qi
- = build_function_type_list (V8HI_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi_v8hi
- = build_function_type_list (V4SI_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v8hi_ftype_v4si_v4si
- = build_function_type_list (V8HI_type_node,
- V4SI_type_node, V4SI_type_node, NULL_TREE);
- tree v16qi_ftype_v8hi_v8hi
- = build_function_type_list (V16QI_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v4si_ftype_v16qi_v4si
- = build_function_type_list (V4SI_type_node,
- V16QI_type_node, V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v16qi_v16qi
- = build_function_type_list (V4SI_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi_v4si
- = build_function_type_list (V4SI_type_node,
- V8HI_type_node, V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi
- = build_function_type_list (V4SI_type_node, V8HI_type_node, NULL_TREE);
- tree int_ftype_v4si_v4si
- = build_function_type_list (integer_type_node,
- V4SI_type_node, V4SI_type_node, NULL_TREE);
- tree int_ftype_v4sf_v4sf
- = build_function_type_list (integer_type_node,
- V4SF_type_node, V4SF_type_node, NULL_TREE);
- tree int_ftype_v16qi_v16qi
- = build_function_type_list (integer_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree int_ftype_v8hi_v8hi
- = build_function_type_list (integer_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v2di_ftype_v2df
- = build_function_type_list (V2DI_type_node,
- V2DF_type_node, NULL_TREE);
- tree v2df_ftype_v2df
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, NULL_TREE);
- tree v2df_ftype_v2di
- = build_function_type_list (V2DF_type_node,
- V2DI_type_node, NULL_TREE);
- tree v2df_ftype_v2df_v2df
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, V2DF_type_node, NULL_TREE);
- tree v2df_ftype_v2df_v2df_v2df
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, V2DF_type_node,
- V2DF_type_node, NULL_TREE);
- tree v2di_ftype_v2di_v2di_v2di
- = build_function_type_list (V2DI_type_node,
- V2DI_type_node, V2DI_type_node,
- V2DI_type_node, NULL_TREE);
- tree v2df_ftype_v2df_v2df_v16qi
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, V2DF_type_node,
- V16QI_type_node, NULL_TREE);
- tree v2di_ftype_v2di_v2di_v16qi
- = build_function_type_list (V2DI_type_node,
- V2DI_type_node, V2DI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v4sf_ftype_v4si
- = build_function_type_list (V4SF_type_node, V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v4sf
- = build_function_type_list (V4SI_type_node, V4SF_type_node, NULL_TREE);
+/* Map types for builtin functions with an explicit return type and up to 3
+ arguments.  Functions with fewer than 3 arguments use VOIDmode as the mode
+ of the unused arguments. */
+static tree
+builtin_function_type (enum machine_mode mode_ret, enum machine_mode mode_arg0,
+ enum machine_mode mode_arg1, enum machine_mode mode_arg2,
+ const char *name)
+{
+ struct builtin_hash_struct h;
+ struct builtin_hash_struct *h2;
+ void **found;
+ int num_args = 3;
+ int i;
- /* Add the simple ternary operators. */
+ /* Create builtin_hash_table. */
+ if (builtin_hash_table == NULL)
+ builtin_hash_table = htab_create_ggc (1500, builtin_hash_function,
+ builtin_hash_eq, NULL);
+
+ h.type = NULL_TREE;
+ h.mode[0] = mode_ret;
+ h.mode[1] = mode_arg0;
+ h.mode[2] = mode_arg1;
+ h.mode[3] = mode_arg2;
+
+ /* Figure out how many args are present. */
+ while (num_args > 0 && h.mode[num_args] == VOIDmode)
+ num_args--;
+
+ if (num_args == 0)
+ fatal_error ("internal error: builtin function %s had no type", name);
+
+ if (!builtin_mode_to_type[h.mode[0]])
+ fatal_error ("internal error: builtin function %s had an unexpected "
+ "return type %s", name, GET_MODE_NAME (h.mode[0]));
+
+ for (i = 0; i < num_args; i++)
+ if (!builtin_mode_to_type[h.mode[i+1]])
+ fatal_error ("internal error: builtin function %s, argument %d "
+ "had unexpected argument type %s", name, i,
+ GET_MODE_NAME (h.mode[i+1]));
+
+ found = htab_find_slot (builtin_hash_table, &h, 1);
+ if (*found == NULL)
+ {
+ h2 = GGC_NEW (struct builtin_hash_struct);
+ *h2 = h;
+ *found = (void *)h2;
+
+ switch (num_args)
+ {
+ case 1:
+ h2->type = build_function_type_list (builtin_mode_to_type[mode_ret],
+ builtin_mode_to_type[mode_arg0],
+ NULL_TREE);
+ break;
+
+ case 2:
+ h2->type = build_function_type_list (builtin_mode_to_type[mode_ret],
+ builtin_mode_to_type[mode_arg0],
+ builtin_mode_to_type[mode_arg1],
+ NULL_TREE);
+ break;
+
+ case 3:
+ h2->type = build_function_type_list (builtin_mode_to_type[mode_ret],
+ builtin_mode_to_type[mode_arg0],
+ builtin_mode_to_type[mode_arg1],
+ builtin_mode_to_type[mode_arg2],
+ NULL_TREE);
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+ }
+
+ return ((struct builtin_hash_struct *)(*found))->type;
+}
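The hash-keyed cache lets builtin_function_type stand in for the hand-built
ftype trees deleted above.  A hedged sketch of the equivalence (the modes and
builtin name here are illustrative only):

    /* The first call builds the type via build_function_type_list and caches
       it in builtin_hash_table; a later call with the same four modes returns
       the identical tree.  */
    tree t = builtin_function_type (V4SImode, V4SImode, V4SImode, VOIDmode,
                                    "__builtin_altivec_vaddsws");
    /* t matches the old hand-built v4si_ftype_v4si_v4si, i.e.
       build_function_type_list (V4SI_type_node, V4SI_type_node,
                                 V4SI_type_node, NULL_TREE);  */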
+
+static void
+rs6000_common_init_builtins (void)
+{
+ const struct builtin_description *d;
+ size_t i;
+
+ tree opaque_ftype_opaque = NULL_TREE;
+ tree opaque_ftype_opaque_opaque = NULL_TREE;
+ tree opaque_ftype_opaque_opaque_opaque = NULL_TREE;
+ tree v2si_ftype_qi = NULL_TREE;
+ tree v2si_ftype_v2si_qi = NULL_TREE;
+ tree v2si_ftype_int_qi = NULL_TREE;
+
+ /* Initialize the tables for the unary, binary, and ternary ops. */
+ builtin_mode_to_type[QImode] = integer_type_node;
+ builtin_mode_to_type[HImode] = integer_type_node;
+ builtin_mode_to_type[SImode] = intSI_type_node;
+ builtin_mode_to_type[DImode] = intDI_type_node;
+ builtin_mode_to_type[SFmode] = float_type_node;
+ builtin_mode_to_type[DFmode] = double_type_node;
+ builtin_mode_to_type[V2SImode] = V2SI_type_node;
+ builtin_mode_to_type[V2SFmode] = V2SF_type_node;
+ builtin_mode_to_type[V2DImode] = V2DI_type_node;
+ builtin_mode_to_type[V2DFmode] = V2DF_type_node;
+ builtin_mode_to_type[V4HImode] = V4HI_type_node;
+ builtin_mode_to_type[V4SImode] = V4SI_type_node;
+ builtin_mode_to_type[V4SFmode] = V4SF_type_node;
+ builtin_mode_to_type[V8HImode] = V8HI_type_node;
+ builtin_mode_to_type[V16QImode] = V16QI_type_node;
+
+ if (!TARGET_PAIRED_FLOAT)
+ {
+ builtin_mode_to_type[V2SImode] = opaque_V2SI_type_node;
+ builtin_mode_to_type[V2SFmode] = opaque_V2SF_type_node;
+ }
+
+ /* Add the ternary operators. */
d = bdesc_3arg;
for (i = 0; i < ARRAY_SIZE (bdesc_3arg); i++, d++)
{
- enum machine_mode mode0, mode1, mode2, mode3;
tree type;
- bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ int mask = d->mask;
- if (is_overloaded)
- {
- mode0 = VOIDmode;
- mode1 = VOIDmode;
- mode2 = VOIDmode;
- mode3 = VOIDmode;
+ if ((mask != 0 && (mask & target_flags) == 0)
+ || (mask == 0 && !TARGET_PAIRED_FLOAT))
+ continue;
+
+ if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && d->code <= VSX_BUILTIN_OVERLOADED_LAST))
+ {
+ if (! (type = opaque_ftype_opaque_opaque_opaque))
+ type = opaque_ftype_opaque_opaque_opaque
+ = build_function_type_list (opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ NULL_TREE);
}
else
{
- if (d->name == 0 || d->icode == CODE_FOR_nothing)
+ enum insn_code icode = d->icode;
+ if (d->name == 0 || icode == CODE_FOR_nothing)
continue;
- mode0 = insn_data[d->icode].operand[0].mode;
- mode1 = insn_data[d->icode].operand[1].mode;
- mode2 = insn_data[d->icode].operand[2].mode;
- mode3 = insn_data[d->icode].operand[3].mode;
+ type = builtin_function_type (insn_data[icode].operand[0].mode,
+ insn_data[icode].operand[1].mode,
+ insn_data[icode].operand[2].mode,
+ insn_data[icode].operand[3].mode,
+ d->name);
}
- /* When all four are of the same mode. */
- if (mode0 == mode1 && mode1 == mode2 && mode2 == mode3)
- {
- switch (mode0)
- {
- case VOIDmode:
- type = opaque_ftype_opaque_opaque_opaque;
- break;
- case V2DImode:
- type = v2di_ftype_v2di_v2di_v2di;
- break;
- case V2DFmode:
- type = v2df_ftype_v2df_v2df_v2df;
- break;
- case V4SImode:
- type = v4si_ftype_v4si_v4si_v4si;
- break;
- case V4SFmode:
- type = v4sf_ftype_v4sf_v4sf_v4sf;
- break;
- case V8HImode:
- type = v8hi_ftype_v8hi_v8hi_v8hi;
- break;
- case V16QImode:
- type = v16qi_ftype_v16qi_v16qi_v16qi;
- break;
- case V2SFmode:
- type = v2sf_ftype_v2sf_v2sf_v2sf;
- break;
- default:
- gcc_unreachable ();
- }
- }
- else if (mode0 == mode1 && mode1 == mode2 && mode3 == V16QImode)
- {
- switch (mode0)
- {
- case V2DImode:
- type = v2di_ftype_v2di_v2di_v16qi;
- break;
- case V2DFmode:
- type = v2df_ftype_v2df_v2df_v16qi;
- break;
- case V4SImode:
- type = v4si_ftype_v4si_v4si_v16qi;
- break;
- case V4SFmode:
- type = v4sf_ftype_v4sf_v4sf_v16qi;
- break;
- case V8HImode:
- type = v8hi_ftype_v8hi_v8hi_v16qi;
- break;
- case V16QImode:
- type = v16qi_ftype_v16qi_v16qi_v16qi;
- break;
- default:
- gcc_unreachable ();
- }
- }
- else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V16QImode
- && mode3 == V4SImode)
- type = v4si_ftype_v16qi_v16qi_v4si;
- else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V8HImode
- && mode3 == V4SImode)
- type = v4si_ftype_v8hi_v8hi_v4si;
- else if (mode0 == V4SFmode && mode1 == V4SFmode && mode2 == V4SFmode
- && mode3 == V4SImode)
- type = v4sf_ftype_v4sf_v4sf_v4si;
-
- /* vchar, vchar, vchar, 4-bit literal. */
- else if (mode0 == V16QImode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v16qi_ftype_v16qi_v16qi_int;
-
- /* vshort, vshort, vshort, 4-bit literal. */
- else if (mode0 == V8HImode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v8hi_ftype_v8hi_v8hi_int;
-
- /* vint, vint, vint, 4-bit literal. */
- else if (mode0 == V4SImode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v4si_ftype_v4si_v4si_int;
-
- /* vfloat, vfloat, vfloat, 4-bit literal. */
- else if (mode0 == V4SFmode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v4sf_ftype_v4sf_v4sf_int;
-
- else
- gcc_unreachable ();
-
def_builtin (d->mask, d->name, type, d->code);
}
- /* Add the simple binary operators. */
+ /* Add the binary operators. */
d = (struct builtin_description *) bdesc_2arg;
for (i = 0; i < ARRAY_SIZE (bdesc_2arg); i++, d++)
{
enum machine_mode mode0, mode1, mode2;
tree type;
- bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ int mask = d->mask;
- if (is_overloaded)
- {
- mode0 = VOIDmode;
- mode1 = VOIDmode;
- mode2 = VOIDmode;
+ if ((mask != 0 && (mask & target_flags) == 0)
+ || (mask == 0 && !TARGET_PAIRED_FLOAT))
+ continue;
+
+ if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && d->code <= VSX_BUILTIN_OVERLOADED_LAST))
+ {
+ if (! (type = opaque_ftype_opaque_opaque))
+ type = opaque_ftype_opaque_opaque
+ = build_function_type_list (opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ NULL_TREE);
}
else
{
- if (d->name == 0 || d->icode == CODE_FOR_nothing)
+ enum insn_code icode = d->icode;
+ if (d->name == 0 || icode == CODE_FOR_nothing)
continue;
- mode0 = insn_data[d->icode].operand[0].mode;
- mode1 = insn_data[d->icode].operand[1].mode;
- mode2 = insn_data[d->icode].operand[2].mode;
- }
+ mode0 = insn_data[icode].operand[0].mode;
+ mode1 = insn_data[icode].operand[1].mode;
+ mode2 = insn_data[icode].operand[2].mode;
- /* When all three operands are of the same mode. */
- if (mode0 == mode1 && mode1 == mode2)
- {
- switch (mode0)
+ if (mode0 == V2SImode && mode1 == V2SImode && mode2 == QImode)
{
- case VOIDmode:
- type = opaque_ftype_opaque_opaque;
- break;
- case V2DFmode:
- type = v2df_ftype_v2df_v2df;
- break;
- case V4SFmode:
- type = v4sf_ftype_v4sf_v4sf;
- break;
- case V4SImode:
- type = v4si_ftype_v4si_v4si;
- break;
- case V16QImode:
- type = v16qi_ftype_v16qi_v16qi;
- break;
- case V8HImode:
- type = v8hi_ftype_v8hi_v8hi;
- break;
- case V2SImode:
- type = v2si_ftype_v2si_v2si;
- break;
- case V2SFmode:
- if (TARGET_PAIRED_FLOAT)
- type = v2sf_ftype_v2sf_v2sf;
- else
- type = v2sf_ftype_v2sf_v2sf_spe;
- break;
- case SImode:
- type = int_ftype_int_int;
- break;
- default:
- gcc_unreachable ();
+ if (! (type = v2si_ftype_v2si_qi))
+ type = v2si_ftype_v2si_qi
+ = build_function_type_list (opaque_V2SI_type_node,
+ opaque_V2SI_type_node,
+ char_type_node,
+ NULL_TREE);
}
- }
-
- /* A few other combos we really don't want to do manually. */
-
- /* vint, vfloat, vfloat. */
- else if (mode0 == V4SImode && mode1 == V4SFmode && mode2 == V4SFmode)
- type = v4si_ftype_v4sf_v4sf;
-
- /* vshort, vchar, vchar. */
- else if (mode0 == V8HImode && mode1 == V16QImode && mode2 == V16QImode)
- type = v8hi_ftype_v16qi_v16qi;
-
- /* vint, vshort, vshort. */
- else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V8HImode)
- type = v4si_ftype_v8hi_v8hi;
-
- /* vshort, vint, vint. */
- else if (mode0 == V8HImode && mode1 == V4SImode && mode2 == V4SImode)
- type = v8hi_ftype_v4si_v4si;
-
- /* vchar, vshort, vshort. */
- else if (mode0 == V16QImode && mode1 == V8HImode && mode2 == V8HImode)
- type = v16qi_ftype_v8hi_v8hi;
-
- /* vint, vchar, vint. */
- else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V4SImode)
- type = v4si_ftype_v16qi_v4si;
-
- /* vint, vchar, vchar. */
- else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V16QImode)
- type = v4si_ftype_v16qi_v16qi;
-
- /* vint, vshort, vint. */
- else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V4SImode)
- type = v4si_ftype_v8hi_v4si;
- /* vint, vint, 5-bit literal. */
- else if (mode0 == V4SImode && mode1 == V4SImode && mode2 == QImode)
- type = v4si_ftype_v4si_int;
-
- /* vshort, vshort, 5-bit literal. */
- else if (mode0 == V8HImode && mode1 == V8HImode && mode2 == QImode)
- type = v8hi_ftype_v8hi_int;
-
- /* vchar, vchar, 5-bit literal. */
- else if (mode0 == V16QImode && mode1 == V16QImode && mode2 == QImode)
- type = v16qi_ftype_v16qi_int;
-
- /* vfloat, vint, 5-bit literal. */
- else if (mode0 == V4SFmode && mode1 == V4SImode && mode2 == QImode)
- type = v4sf_ftype_v4si_int;
-
- /* vint, vfloat, 5-bit literal. */
- else if (mode0 == V4SImode && mode1 == V4SFmode && mode2 == QImode)
- type = v4si_ftype_v4sf_int;
-
- else if (mode0 == V2SImode && mode1 == SImode && mode2 == SImode)
- type = v2si_ftype_int_int;
-
- else if (mode0 == V2SImode && mode1 == V2SImode && mode2 == QImode)
- type = v2si_ftype_v2si_char;
-
- else if (mode0 == V2SImode && mode1 == SImode && mode2 == QImode)
- type = v2si_ftype_int_char;
-
- else
- {
- /* int, x, x. */
- gcc_assert (mode0 == SImode);
- switch (mode1)
+ else if (mode0 == V2SImode && GET_MODE_CLASS (mode1) == MODE_INT
+ && mode2 == QImode)
{
- case V4SImode:
- type = int_ftype_v4si_v4si;
- break;
- case V4SFmode:
- type = int_ftype_v4sf_v4sf;
- break;
- case V16QImode:
- type = int_ftype_v16qi_v16qi;
- break;
- case V8HImode:
- type = int_ftype_v8hi_v8hi;
- break;
- default:
- gcc_unreachable ();
+ if (! (type = v2si_ftype_int_qi))
+ type = v2si_ftype_int_qi
+ = build_function_type_list (opaque_V2SI_type_node,
+ integer_type_node,
+ char_type_node,
+ NULL_TREE);
}
+
+ else
+ type = builtin_function_type (mode0, mode1, mode2, VOIDmode,
+ d->name);
}
def_builtin (d->mask, d->name, type, d->code);
}
- /* Add the simple unary operators. */
+ /* Add the unary operators. */
d = (struct builtin_description *) bdesc_1arg;
for (i = 0; i < ARRAY_SIZE (bdesc_1arg); i++, d++)
{
enum machine_mode mode0, mode1;
tree type;
- bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ int mask = d->mask;
- if (is_overloaded)
- {
- mode0 = VOIDmode;
- mode1 = VOIDmode;
- }
+ if ((mask != 0 && (mask & target_flags) == 0)
+ || (mask == 0 && !TARGET_PAIRED_FLOAT))
+ continue;
+
+ if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && d->code <= VSX_BUILTIN_OVERLOADED_LAST))
+ {
+ if (! (type = opaque_ftype_opaque))
+ type = opaque_ftype_opaque
+ = build_function_type_list (opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ NULL_TREE);
+ }
else
{
- if (d->name == 0 || d->icode == CODE_FOR_nothing)
+ enum insn_code icode = d->icode;
+ if (d->name == 0 || icode == CODE_FOR_nothing)
continue;
- mode0 = insn_data[d->icode].operand[0].mode;
- mode1 = insn_data[d->icode].operand[1].mode;
- }
+ mode0 = insn_data[icode].operand[0].mode;
+ mode1 = insn_data[icode].operand[1].mode;
- if (mode0 == V4SImode && mode1 == QImode)
- type = v4si_ftype_int;
- else if (mode0 == V8HImode && mode1 == QImode)
- type = v8hi_ftype_int;
- else if (mode0 == V16QImode && mode1 == QImode)
- type = v16qi_ftype_int;
- else if (mode0 == VOIDmode && mode1 == VOIDmode)
- type = opaque_ftype_opaque;
- else if (mode0 == V2DFmode && mode1 == V2DFmode)
- type = v2df_ftype_v2df;
- else if (mode0 == V4SFmode && mode1 == V4SFmode)
- type = v4sf_ftype_v4sf;
- else if (mode0 == V8HImode && mode1 == V16QImode)
- type = v8hi_ftype_v16qi;
- else if (mode0 == V4SImode && mode1 == V8HImode)
- type = v4si_ftype_v8hi;
- else if (mode0 == V2SImode && mode1 == V2SImode)
- type = v2si_ftype_v2si;
- else if (mode0 == V2SFmode && mode1 == V2SFmode)
- {
- if (TARGET_PAIRED_FLOAT)
- type = v2sf_ftype_v2sf;
- else
- type = v2sf_ftype_v2sf_spe;
- }
- else if (mode0 == V2SFmode && mode1 == V2SImode)
- type = v2sf_ftype_v2si;
- else if (mode0 == V2SImode && mode1 == V2SFmode)
- type = v2si_ftype_v2sf;
- else if (mode0 == V2SImode && mode1 == QImode)
- type = v2si_ftype_char;
- else if (mode0 == V4SImode && mode1 == V4SFmode)
- type = v4si_ftype_v4sf;
- else if (mode0 == V4SFmode && mode1 == V4SImode)
- type = v4sf_ftype_v4si;
- else if (mode0 == V2DImode && mode1 == V2DFmode)
- type = v2di_ftype_v2df;
- else if (mode0 == V2DFmode && mode1 == V2DImode)
- type = v2df_ftype_v2di;
- else
- gcc_unreachable ();
+ if (mode0 == V2SImode && mode1 == QImode)
+ {
+ if (! (type = v2si_ftype_qi))
+ type = v2si_ftype_qi
+ = build_function_type_list (opaque_V2SI_type_node,
+ char_type_node,
+ NULL_TREE);
+ }
+
+ else
+ type = builtin_function_type (mode0, mode1, VOIDmode, VOIDmode,
+ d->name);
+ }
def_builtin (d->mask, d->name, type, d->code);
}
@@ -12618,12 +12538,12 @@ rs6000_secondary_reload_inner (rtx reg,
}
if (GET_CODE (addr) == PLUS
- && (!rs6000_legitimate_offset_address_p (TImode, addr, true)
+ && (!rs6000_legitimate_offset_address_p (TImode, addr, false)
|| and_op2 != NULL_RTX))
{
addr_op1 = XEXP (addr, 0);
addr_op2 = XEXP (addr, 1);
- gcc_assert (legitimate_indirect_address_p (addr_op1, true));
+ gcc_assert (legitimate_indirect_address_p (addr_op1, false));
if (!REG_P (addr_op2)
&& (GET_CODE (addr_op2) != CONST_INT
@@ -12642,8 +12562,8 @@ rs6000_secondary_reload_inner (rtx reg,
addr = scratch_or_premodify;
scratch_or_premodify = scratch;
}
- else if (!legitimate_indirect_address_p (addr, true)
- && !rs6000_legitimate_offset_address_p (TImode, addr, true))
+ else if (!legitimate_indirect_address_p (addr, false)
+ && !rs6000_legitimate_offset_address_p (TImode, addr, false))
{
rs6000_emit_move (scratch_or_premodify, addr, Pmode);
addr = scratch_or_premodify;
@@ -12672,24 +12592,24 @@ rs6000_secondary_reload_inner (rtx reg,
if (GET_CODE (addr) == PRE_MODIFY
&& (!VECTOR_MEM_VSX_P (mode)
|| and_op2 != NULL_RTX
- || !legitimate_indexed_address_p (XEXP (addr, 1), true)))
+ || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
{
scratch_or_premodify = XEXP (addr, 0);
gcc_assert (legitimate_indirect_address_p (scratch_or_premodify,
- true));
+ false));
gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS);
addr = XEXP (addr, 1);
}
- if (legitimate_indirect_address_p (addr, true) /* reg */
- || legitimate_indexed_address_p (addr, true) /* reg+reg */
+ if (legitimate_indirect_address_p (addr, false) /* reg */
+ || legitimate_indexed_address_p (addr, false) /* reg+reg */
|| GET_CODE (addr) == PRE_MODIFY /* VSX pre-modify */
|| GET_CODE (addr) == AND /* Altivec memory */
|| (rclass == FLOAT_REGS /* legacy float mem */
&& GET_MODE_SIZE (mode) == 8
&& and_op2 == NULL_RTX
&& scratch_or_premodify == scratch
- && rs6000_legitimate_offset_address_p (mode, addr, true)))
+ && rs6000_legitimate_offset_address_p (mode, addr, false)))
;
else if (GET_CODE (addr) == PLUS)
@@ -12709,7 +12629,7 @@ rs6000_secondary_reload_inner (rtx reg,
}
else if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
- || GET_CODE (addr) == CONST_INT)
+ || GET_CODE (addr) == CONST_INT || REG_P (addr))
{
rs6000_emit_move (scratch_or_premodify, addr, Pmode);
addr = scratch_or_premodify;
@@ -12741,7 +12661,7 @@ rs6000_secondary_reload_inner (rtx reg,
andi. instruction. */
if (and_op2 != NULL_RTX)
{
- if (! legitimate_indirect_address_p (addr, true))
+ if (! legitimate_indirect_address_p (addr, false))
{
emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
addr = scratch;
@@ -12776,6 +12696,26 @@ rs6000_secondary_reload_inner (rtx reg,
return;
}
+/* Target hook to return the cover classes for Integrated Register Allocator.
+ Cover classes are a set of non-intersecting register classes covering all hard
+ registers used for register allocation purposes. Any move between two
+ registers of a cover class should be cheaper than a load or store of the
+ registers. The value is an array of register classes with LIM_REG_CLASSES used
+ as the end marker.
+
+ We need two IRA_COVER_CLASSES, one for pre-VSX, and the other for VSX to
+ account for the Altivec and Floating registers being subsets of the VSX
+ register set under VSX, but distinct register sets on pre-VSX machines. */
+
+static const enum reg_class *
+rs6000_ira_cover_classes (void)
+{
+ static const enum reg_class cover_pre_vsx[] = IRA_COVER_CLASSES_PRE_VSX;
+ static const enum reg_class cover_vsx[] = IRA_COVER_CLASSES_VSX;
+
+ return (TARGET_VSX) ? cover_vsx : cover_pre_vsx;
+}
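The IRA_COVER_CLASSES_PRE_VSX and IRA_COVER_CLASSES_VSX macros live in
rs6000.h and are not part of this hunk; a hedged sketch of their plausible
shape, for orientation only:

    /* Hypothetical definitions.  Pre-VSX keeps the FP and Altivec files as
       separate cover classes; under VSX one VSX_REGS class covers both.  */
    #define IRA_COVER_CLASSES_PRE_VSX \
      { GENERAL_REGS, SPECIAL_REGS, FLOAT_REGS, ALTIVEC_REGS, LIM_REG_CLASSES }
    #define IRA_COVER_CLASSES_VSX \
      { GENERAL_REGS, SPECIAL_REGS, VSX_REGS, LIM_REG_CLASSES }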
+
/* Allocate a 64-bit stack slot to be used for copying SDmode
values through if this function has any SDmode references. */
@@ -12849,13 +12789,15 @@ rs6000_preferred_reload_class (rtx x, en
enum machine_mode mode = GET_MODE (x);
enum reg_class ret;
- if (TARGET_VSX && VSX_VECTOR_MODE (mode) && x == CONST0_RTX (mode)
- && VSX_REG_CLASS_P (rclass))
+ if (TARGET_VSX
+ && (VSX_VECTOR_MODE (mode) || mode == TImode)
+ && x == CONST0_RTX (mode) && VSX_REG_CLASS_P (rclass))
ret = rclass;
- else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode)
- && rclass == ALTIVEC_REGS && easy_vector_constant (x, mode))
- ret = rclass;
+ else if (TARGET_ALTIVEC && (ALTIVEC_VECTOR_MODE (mode) || mode == TImode)
+ && (rclass == ALTIVEC_REGS || rclass == VSX_REGS)
+ && easy_vector_constant (x, mode))
+ ret = ALTIVEC_REGS;
else if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS))
ret = NO_REGS;
@@ -13074,8 +13016,10 @@ rs6000_cannot_change_mode_class (enum ma
|| (((to) == TDmode) + ((from) == TDmode)) == 1
|| (((to) == DImode) + ((from) == DImode)) == 1))
|| (TARGET_VSX
- && (VSX_VECTOR_MODE (from) + VSX_VECTOR_MODE (to)) == 1)
+ && (VSX_MOVE_MODE (from) + VSX_MOVE_MODE (to)) == 1
+ && VSX_REG_CLASS_P (rclass))
|| (TARGET_ALTIVEC
+ && rclass == ALTIVEC_REGS
&& (ALTIVEC_VECTOR_MODE (from)
+ ALTIVEC_VECTOR_MODE (to)) == 1)
|| (TARGET_SPE
@@ -14953,7 +14897,7 @@ rs6000_emit_vector_cond_expr (rtx dest,
if (!mask)
return 0;
- if ((TARGET_VSX && VSX_VECTOR_MOVE_MODE (dest_mode))
+ if ((TARGET_VSX && VSX_MOVE_MODE (dest_mode))
|| (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dest_mode)))
{
rtx cond2 = gen_rtx_fmt_ee (NE, VOIDmode, mask, const0_rtx);
@@ -22044,7 +21988,8 @@ rs6000_handle_altivec_attribute (tree *n
mode = TYPE_MODE (type);
/* Check for invalid AltiVec type qualifiers. */
- if (type == long_unsigned_type_node || type == long_integer_type_node)
+ if ((type == long_unsigned_type_node || type == long_integer_type_node)
+ && !TARGET_VSX)
{
if (TARGET_64BIT)
error ("use of %<long%> in AltiVec types is invalid for 64-bit code");
@@ -22082,6 +22027,7 @@ rs6000_handle_altivec_attribute (tree *n
break;
case SFmode: result = V4SF_type_node; break;
case DFmode: result = V2DF_type_node; break;
+ case DImode: result = V2DI_type_node; break;
/* If the user says 'vector int bool', we may be handed the 'bool'
attribute _before_ the 'vector' attribute, and so select the
proper type in the 'b' case below. */
@@ -22093,6 +22039,7 @@ rs6000_handle_altivec_attribute (tree *n
case 'b':
switch (mode)
{
+ case DImode: case V2DImode: result = bool_V2DI_type_node; break;
case SImode: case V4SImode: result = bool_V4SI_type_node; break;
case HImode: case V8HImode: result = bool_V8HI_type_node; break;
case QImode: case V16QImode: result = bool_V16QI_type_node;
@@ -22137,6 +22084,7 @@ rs6000_mangle_type (const_tree type)
if (type == bool_short_type_node) return "U6__bools";
if (type == pixel_type_node) return "u7__pixel";
if (type == bool_int_type_node) return "U6__booli";
+ if (type == bool_long_type_node) return "U6__booll";
/* Mangle IBM extended float long double as `g' (__float128) on
powerpc*-linux where long-double-64 previously was the default. */
@@ -23647,6 +23595,8 @@ int
rs6000_register_move_cost (enum machine_mode mode,
enum reg_class from, enum reg_class to)
{
+ int ret;
+
/* Moves from/to GENERAL_REGS. */
if (reg_classes_intersect_p (to, GENERAL_REGS)
|| reg_classes_intersect_p (from, GENERAL_REGS))
@@ -23655,39 +23605,47 @@ rs6000_register_move_cost (enum machine_
from = to;
if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS)
- return (rs6000_memory_move_cost (mode, from, 0)
- + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
+ ret = (rs6000_memory_move_cost (mode, from, 0)
+ + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
/* It's more expensive to move CR_REGS than CR0_REGS because of the
shift. */
else if (from == CR_REGS)
- return 4;
+ ret = 4;
/* Power6 has slower LR/CTR moves so make them more expensive than
memory in order to bias spills to memory. */
else if (rs6000_cpu == PROCESSOR_POWER6
&& reg_classes_intersect_p (from, LINK_OR_CTR_REGS))
- return 6 * hard_regno_nregs[0][mode];
+ ret = 6 * hard_regno_nregs[0][mode];
else
/* A move will cost one instruction per GPR moved. */
- return 2 * hard_regno_nregs[0][mode];
+ ret = 2 * hard_regno_nregs[0][mode];
}
/* If we have VSX, we can easily move between FPR or Altivec registers. */
- else if (TARGET_VSX
- && ((from == VSX_REGS || from == FLOAT_REGS || from == ALTIVEC_REGS)
- || (to == VSX_REGS || to == FLOAT_REGS || to == ALTIVEC_REGS)))
- return 2;
+ else if (VECTOR_UNIT_VSX_P (mode)
+ && reg_classes_intersect_p (to, VSX_REGS)
+ && reg_classes_intersect_p (from, VSX_REGS))
+ ret = 2 * hard_regno_nregs[32][mode];
/* Moving between two similar registers is just one instruction. */
else if (reg_classes_intersect_p (to, from))
- return (mode == TFmode || mode == TDmode) ? 4 : 2;
+ ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
/* Everything else has to go through GENERAL_REGS. */
else
- return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
- + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+ ret = (rs6000_register_move_cost (mode, GENERAL_REGS, to)
+ + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+
+ if (TARGET_DEBUG_COST)
+ fprintf (stderr,
+ "rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n",
+ ret, GET_MODE_NAME (mode), reg_class_names[from],
+ reg_class_names[to]);
+
+ return ret;
}
/* A C expression returning the cost of moving data of MODE from a register to
@@ -23697,14 +23655,23 @@ int
rs6000_memory_move_cost (enum machine_mode mode, enum reg_class rclass,
int in ATTRIBUTE_UNUSED)
{
+ int ret;
+
if (reg_classes_intersect_p (rclass, GENERAL_REGS))
- return 4 * hard_regno_nregs[0][mode];
+ ret = 4 * hard_regno_nregs[0][mode];
else if (reg_classes_intersect_p (rclass, FLOAT_REGS))
- return 4 * hard_regno_nregs[32][mode];
+ ret = 4 * hard_regno_nregs[32][mode];
else if (reg_classes_intersect_p (rclass, ALTIVEC_REGS))
- return 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
+ ret = 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
else
- return 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+ ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+
+ if (TARGET_DEBUG_COST)
+ fprintf (stderr,
+ "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n",
+ ret, GET_MODE_NAME (mode), reg_class_names[rclass], in);
+
+ return ret;
}
/* Returns a code for a target-specific builtin that implements
@@ -24424,4 +24391,24 @@ rs6000_final_prescan_insn (rtx insn, rtx
}
}
+/* Return true if the function has an indirect jump or a table jump. The compiler
+ prefers the ctr register for such jumps, which interferes with using the
+ decrement-ctr-and-branch instructions. */
+
+bool
+rs6000_has_indirect_jump_p (void)
+{
+ gcc_assert (cfun && cfun->machine);
+ return cfun->machine->indirect_jump_p;
+}
+
+/* Remember when we've generated an indirect jump. */
+
+void
+rs6000_set_indirect_jump (void)
+{
+ gcc_assert (cfun && cfun->machine);
+ cfun->machine->indirect_jump_p = true;
+}
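A hedged sketch of the intended callers; the machine-description side is
outside this hunk, so the expander names are assumptions:

    /* In the indirect_jump and tablejump expanders: record the jump.  */
    rs6000_set_indirect_jump ();

    /* In the doloop patterns: avoid the decrement-ctr-and-branch form once
       an indirect jump competes for the ctr register.  */
    if (!rs6000_has_indirect_jump_p ())
      ; /* safe to generate bdnz-style loops */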
+
#include "gt-rs6000.h"
--- gcc/config/rs6000/vsx.md (revision 146119)
+++ gcc/config/rs6000/vsx.md (revision 146798)
@@ -22,12 +22,22 @@
;; Iterator for both scalar and vector floating point types supported by VSX
(define_mode_iterator VSX_B [DF V4SF V2DF])
+;; Iterator for the 2 64-bit vector types
+(define_mode_iterator VSX_D [V2DF V2DI])
+
+;; Iterator for the 2 32-bit vector types
+(define_mode_iterator VSX_W [V4SF V4SI])
+
;; Iterator for vector floating point types supported by VSX
(define_mode_iterator VSX_F [V4SF V2DF])
;; Iterator for logical types supported by VSX
(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF TI])
+;; Iterator for memory moves. TImode is excluded here and handled specially
+;; below, to allow it to use gprs as well as vsx registers.
+(define_mode_iterator VSX_M [V16QI V8HI V4SI V2DI V4SF V2DF])
+
;; Iterator for types for load/store with update
(define_mode_iterator VSX_U [V16QI V8HI V4SI V2DI V4SF V2DF TI DF])
@@ -49,9 +59,10 @@ (define_mode_attr VSs [(V16QI "sp")
(V2DF "dp")
(V2DI "dp")
(DF "dp")
+ (SF "sp")
(TI "sp")])
-;; Map into the register class used
+;; Map the register class used
(define_mode_attr VSr [(V16QI "v")
(V8HI "v")
(V4SI "v")
@@ -59,9 +70,10 @@ (define_mode_attr VSr [(V16QI "v")
(V2DI "wd")
(V2DF "wd")
(DF "ws")
+ (SF "f")
(TI "wd")])
-;; Map into the register class used for float<->int conversions
+;; Map the register class used for float<->int conversions
(define_mode_attr VSr2 [(V2DF "wd")
(V4SF "wf")
(DF "!f#r")])
@@ -70,6 +82,18 @@ (define_mode_attr VSr3 [(V2DF "wa")
(V4SF "wa")
(DF "!f#r")])
+;; Map the register class for sp<->dp float conversions, destination
+(define_mode_attr VSr4 [(SF "ws")
+ (DF "f")
+ (V2DF "wd")
+ (V4SF "v")])
+
+;; Map the register class for sp<->dp float conversions, source
+(define_mode_attr VSr5 [(SF "ws")
+ (DF "f")
+ (V2DF "v")
+ (V4SF "wd")])
+
;; Same size integer type for floating point data
(define_mode_attr VSi [(V4SF "v4si")
(V2DF "v2di")
@@ -137,6 +161,32 @@ (define_mode_attr VSfptype_sqrt [(V2DF "
(V4SF "fp_sqrt_s")
(DF "fp_sqrt_d")])
+;; Iterator and modes for sp<->dp conversions
+(define_mode_iterator VSX_SPDP [SF DF V4SF V2DF])
+
+(define_mode_attr VS_spdp_res [(SF "DF")
+ (DF "SF")
+ (V4SF "V2DF")
+ (V2DF "V4SF")])
+
+(define_mode_attr VS_spdp_insn [(SF "xscvspdp")
+ (DF "xscvdpsp")
+ (V4SF "xvcvspdp")
+ (V2DF "xvcvdpsp")])
+
+(define_mode_attr VS_spdp_type [(SF "fp")
+ (DF "fp")
+ (V4SF "vecfloat")
+ (V2DF "vecfloat")])
+
+;; Map the scalar mode for a vector type
+(define_mode_attr VS_scalar [(V2DF "DF")
+ (V2DI "DI")
+ (V4SF "SF")
+ (V4SI "SI")
+ (V8HI "HI")
+ (V16QI "QI")])
+
;; Appropriate type for load + update
(define_mode_attr VStype_load_update [(V16QI "vecload")
(V8HI "vecload")
@@ -159,25 +209,33 @@ (define_mode_attr VStype_store_update [(
;; Constants for creating unspecs
(define_constants
- [(UNSPEC_VSX_CONCAT_V2DF 500)
- (UNSPEC_VSX_XVCVDPSP 501)
- (UNSPEC_VSX_XVCVDPSXWS 502)
- (UNSPEC_VSX_XVCVDPUXWS 503)
- (UNSPEC_VSX_XVCVSPDP 504)
- (UNSPEC_VSX_XVCVSXWDP 505)
- (UNSPEC_VSX_XVCVUXWDP 506)
- (UNSPEC_VSX_XVMADD 507)
- (UNSPEC_VSX_XVMSUB 508)
- (UNSPEC_VSX_XVNMADD 509)
- (UNSPEC_VSX_XVNMSUB 510)
- (UNSPEC_VSX_XVRSQRTE 511)
- (UNSPEC_VSX_XVTDIV 512)
- (UNSPEC_VSX_XVTSQRT 513)])
+ [(UNSPEC_VSX_CONCAT 500)
+ (UNSPEC_VSX_CVDPSXWS 501)
+ (UNSPEC_VSX_CVDPUXWS 502)
+ (UNSPEC_VSX_CVSPDP 503)
+ (UNSPEC_VSX_CVSXWDP 504)
+ (UNSPEC_VSX_CVUXWDP 505)
+ (UNSPEC_VSX_CVSXDSP 506)
+ (UNSPEC_VSX_CVUXDSP 507)
+ (UNSPEC_VSX_CVSPSXDS 508)
+ (UNSPEC_VSX_CVSPUXDS 509)
+ (UNSPEC_VSX_MADD 510)
+ (UNSPEC_VSX_MSUB 511)
+ (UNSPEC_VSX_NMADD 512)
+ (UNSPEC_VSX_NMSUB 513)
+ (UNSPEC_VSX_RSQRTE 514)
+ (UNSPEC_VSX_TDIV 515)
+ (UNSPEC_VSX_TSQRT 516)
+ (UNSPEC_VSX_XXPERMDI 517)
+ (UNSPEC_VSX_SET 518)
+ (UNSPEC_VSX_ROUND_I 519)
+ (UNSPEC_VSX_ROUND_IC 520)
+ (UNSPEC_VSX_SLDWI 521)])
;; VSX moves
(define_insn "*vsx_mov<mode>"
- [(set (match_operand:VSX_L 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*o,*r,*r,<VSr>,?wa,v,wZ,v")
- (match_operand:VSX_L 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,o,r,j,j,W,v,wZ"))]
+ [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*o,*r,*r,<VSr>,?wa,v,wZ,v")
+ (match_operand:VSX_M 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,o,r,j,j,W,v,wZ"))]
"VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
@@ -220,6 +278,49 @@ (define_insn "*vsx_mov<mode>"
}
[(set_attr "type" "vecstore,vecload,vecsimple,vecstore,vecload,vecsimple,*,*,*,vecsimple,vecsimple,*,vecstore,vecload")])
+;; Unlike other VSX moves, allow the GPRs, since a normal use of TImode is for
+;; unions. However, for plain data movement, slightly favor the vector loads.
+(define_insn "*vsx_movti"
+ [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,wa,wa,?o,?r,?r,wa,v,v,wZ")
+ (match_operand:TI 1 "input_operand" "wa,Z,wa,r,o,r,j,W,wZ,v"))]
+ "VECTOR_MEM_VSX_P (TImode)
+ && (register_operand (operands[0], TImode)
+ || register_operand (operands[1], TImode))"
+{
+ switch (which_alternative)
+ {
+ case 0:
+ return "stxvd2%U0x %x1,%y0";
+
+ case 1:
+ return "lxvd2%U0x %x0,%y1";
+
+ case 2:
+ return "xxlor %x0,%x1,%x1";
+
+ case 3:
+ case 4:
+ case 5:
+ return "#";
+
+ case 6:
+ return "xxlxor %x0,%x0,%x0";
+
+ case 7:
+ return output_vec_const_move (operands);
+
+ case 8:
+ return "lvx %0,%y1";
+
+ case 9:
+ return "stvx %1,%y0";
+
+ default:
+ gcc_unreachable ();
+ }
+}
+ [(set_attr "type" "vecstore,vecload,vecsimple,*,*,*,vecsimple,*,vecstore,vecload")])
+
;; Load/store with update
;; Define insns that do load or store with update. Because VSX only has
;; reg+reg addressing, pre-decrement or pre-increment is unlikely to be
@@ -297,7 +398,7 @@ (define_insn "vsx_tdiv<mode>3"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")]
- UNSPEC_VSX_XVTDIV))]
+ UNSPEC_VSX_TDIV))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"x<VSv>tdiv<VSs> %x0,%x1,%x2"
[(set_attr "type" "<VStype_simple>")
@@ -367,7 +468,7 @@ (define_insn "*vsx_sqrt<mode>2"
(define_insn "vsx_rsqrte<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
- UNSPEC_VSX_XVRSQRTE))]
+ UNSPEC_VSX_RSQRTE))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"x<VSv>rsqrte<VSs> %x0,%x1"
[(set_attr "type" "<VStype_simple>")
@@ -376,7 +477,7 @@ (define_insn "vsx_rsqrte<mode>2"
(define_insn "vsx_tsqrt<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
- UNSPEC_VSX_XVTSQRT))]
+ UNSPEC_VSX_TSQRT))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"x<VSv>tsqrt<VSs> %x0,%x1"
[(set_attr "type" "<VStype_simple>")
@@ -426,7 +527,7 @@ (define_insn "vsx_fmadd<mode>4_2"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVMADD))]
+ UNSPEC_VSX_MADD))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>madda<VSs> %x0,%x1,%x2
@@ -474,7 +575,7 @@ (define_insn "vsx_fmsub<mode>4_2"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVMSUB))]
+ UNSPEC_VSX_MSUB))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>msuba<VSs> %x0,%x1,%x2
@@ -552,7 +653,7 @@ (define_insn "vsx_fnmadd<mode>4_3"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVNMADD))]
+ UNSPEC_VSX_NMADD))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>nmadda<VSs> %x0,%x1,%x2
@@ -629,7 +730,7 @@ (define_insn "vsx_fnmsub<mode>4_3"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVNMSUB))]
+ UNSPEC_VSX_NMSUB))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>nmsuba<VSs> %x0,%x1,%x2
@@ -667,13 +768,13 @@ (define_insn "*vsx_ge<mode>"
[(set_attr "type" "<VStype_simple>")
(set_attr "fp_type" "<VSfptype_simple>")])
-(define_insn "vsx_vsel<mode>"
- [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
- (if_then_else:VSX_F (ne (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+(define_insn "*vsx_vsel<mode>"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (if_then_else:VSX_L (ne (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
(const_int 0))
- (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")
- (match_operand:VSX_F 3 "vsx_register_operand" "<VSr>,wa")))]
- "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+ (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxsel %x0,%x3,%x2,%x1"
[(set_attr "type" "vecperm")])
@@ -698,7 +799,7 @@ (define_insn "vsx_ftrunc<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(fix:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
- "x<VSv>r<VSs>piz %x0,%x1"
+ "x<VSv>r<VSs>iz %x0,%x1"
[(set_attr "type" "<VStype_simple>")
(set_attr "fp_type" "<VSfptype_simple>")])
@@ -735,6 +836,24 @@ (define_insn "vsx_fixuns_trunc<mode><VSi
(set_attr "fp_type" "<VSfptype_simple>")])
;; Math rounding functions
+(define_insn "vsx_x<VSv>r<VSs>i"
+ [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
+ (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+ UNSPEC_VSX_ROUND_I))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "x<VSv>r<VSs>i %x0,%x1"
+ [(set_attr "type" "<VStype_simple>")
+ (set_attr "fp_type" "<VSfptype_simple>")])
+
+(define_insn "vsx_x<VSv>r<VSs>ic"
+ [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
+ (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+ UNSPEC_VSX_ROUND_IC))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "x<VSv>r<VSs>ic %x0,%x1"
+ [(set_attr "type" "<VStype_simple>")
+ (set_attr "fp_type" "<VSfptype_simple>")])
+
(define_insn "vsx_btrunc<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
@@ -765,22 +884,26 @@ (define_insn "vsx_ceil<mode>2"
;; VSX convert to/from double vector
+;; Convert between single and double precision
+;; Don't use xscvspdp and xscvdpsp for scalar conversions, since the normal
+;; scalar single precision instructions internally use the double format.
+;; Prefer the altivec registers, since we likely will need to do a vperm
+(define_insn "vsx_<VS_spdp_insn>"
+ [(set (match_operand:<VS_spdp_res> 0 "vsx_register_operand" "=<VSr4>,?wa")
+ (unspec:<VS_spdp_res> [(match_operand:VSX_SPDP 1 "vsx_register_operand" "<VSr5>,wa")]
+ UNSPEC_VSX_CVSPDP))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "<VS_spdp_insn> %x0,%x1"
+ [(set_attr "type" "<VS_spdp_type>")])
+
;; Convert from 64-bit to 32-bit types
;; Note, favor the Altivec registers since the usual use of these instructions
;; is in vector converts and we need to use the Altivec vperm instruction.
-(define_insn "vsx_xvcvdpsp"
- [(set (match_operand:V4SF 0 "vsx_register_operand" "=v,?wa")
- (unspec:V4SF [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")]
- UNSPEC_VSX_XVCVDPSP))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
- "xvcvdpsp %x0,%x1"
- [(set_attr "type" "vecfloat")])
-
(define_insn "vsx_xvcvdpsxws"
[(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa")
(unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")]
- UNSPEC_VSX_XVCVDPSXWS))]
+ UNSPEC_VSX_CVDPSXWS))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvdpsxws %x0,%x1"
[(set_attr "type" "vecfloat")])
@@ -788,24 +911,32 @@ (define_insn "vsx_xvcvdpsxws"
(define_insn "vsx_xvcvdpuxws"
[(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa")
(unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")]
- UNSPEC_VSX_XVCVDPUXWS))]
+ UNSPEC_VSX_CVDPUXWS))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvdpuxws %x0,%x1"
[(set_attr "type" "vecfloat")])
-;; Convert from 32-bit to 64-bit types
-(define_insn "vsx_xvcvspdp"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
- (unspec:V2DF [(match_operand:V4SF 1 "vsx_register_operand" "wf,wa")]
- UNSPEC_VSX_XVCVSPDP))]
+(define_insn "vsx_xvcvsxdsp"
+ [(set (match_operand:V4SF 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:V4SF [(match_operand:V2DI 1 "vsx_register_operand" "wf,wa")]
+ UNSPEC_VSX_CVSXDSP))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xvcvsxdsp %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "vsx_xvcvuxdsp"
+ [(set (match_operand:V4SF 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:V4SF [(match_operand:V2DI 1 "vsx_register_operand" "wf,wa")]
+ UNSPEC_VSX_CVUXDSP))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
- "xvcvspdp %x0,%x1"
+ "xvcvuxwdp %x0,%x1"
[(set_attr "type" "vecfloat")])
+;; Convert from 32-bit to 64-bit types
(define_insn "vsx_xvcvsxwdp"
[(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
(unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")]
- UNSPEC_VSX_XVCVSXWDP))]
+ UNSPEC_VSX_CVSXWDP))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvsxwdp %x0,%x1"
[(set_attr "type" "vecfloat")])
@@ -813,11 +944,26 @@ (define_insn "vsx_xvcvsxwdp"
(define_insn "vsx_xvcvuxwdp"
[(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
(unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")]
- UNSPEC_VSX_XVCVUXWDP))]
+ UNSPEC_VSX_CVUXWDP))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvuxwdp %x0,%x1"
[(set_attr "type" "vecfloat")])
+(define_insn "vsx_xvcvspsxds"
+ [(set (match_operand:V2DI 0 "vsx_register_operand" "=v,?wa")
+ (unspec:V2DI [(match_operand:V4SF 1 "vsx_register_operand" "wd,wa")]
+ UNSPEC_VSX_CVSPSXDS))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xvcvspsxds %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "vsx_xvcvspuxds"
+ [(set (match_operand:V2DI 0 "vsx_register_operand" "=v,?wa")
+ (unspec:V2DI [(match_operand:V4SF 1 "vsx_register_operand" "wd,wa")]
+ UNSPEC_VSX_CVSPUXDS))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xvcvspuxds %x0,%x1"
+ [(set_attr "type" "vecfloat")])
;; Logical and permute operations
(define_insn "*vsx_and<mode>3"
@@ -877,24 +1023,25 @@ (define_insn "*vsx_andc<mode>3"
;; Permute operations
-(define_insn "vsx_concat_v2df"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
- (unspec:V2DF
- [(match_operand:DF 1 "vsx_register_operand" "ws,wa")
- (match_operand:DF 2 "vsx_register_operand" "ws,wa")]
- UNSPEC_VSX_CONCAT_V2DF))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+;; Build a V2DF/V2DI vector from two scalars
+(define_insn "vsx_concat_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:VSX_D
+ [(match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,wa")
+ (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")]
+ UNSPEC_VSX_CONCAT))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxpermdi %x0,%x1,%x2,0"
[(set_attr "type" "vecperm")])
-;; Set a double into one element
-(define_insn "vsx_set_v2df"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
- (vec_merge:V2DF
- (match_operand:V2DF 1 "vsx_register_operand" "wd,wa")
- (vec_duplicate:V2DF (match_operand:DF 2 "vsx_register_operand" "ws,f"))
- (match_operand:QI 3 "u5bit_cint_operand" "i,i")))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+;; Set an element of a V2DF/V2DI vector
+(define_insn "vsx_set_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:VSX_D [(match_operand:VSX_D 1 "vsx_register_operand" "wd,wa")
+ (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")
+ (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
+ UNSPEC_VSX_SET))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
{
if (INTVAL (operands[3]) == 0)
return \"xxpermdi %x0,%x1,%x2,1\";
@@ -906,12 +1053,12 @@ (define_insn "vsx_set_v2df"
[(set_attr "type" "vecperm")])
;; Extract a DF element from V2DF
-(define_insn "vsx_extract_v2df"
- [(set (match_operand:DF 0 "vsx_register_operand" "=ws,f,?wa")
- (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd,wd,wa")
+(define_insn "vsx_extract_<mode>"
+ [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=ws,f,?wa")
+ (vec_select:<VS_scalar> (match_operand:VSX_D 1 "vsx_register_operand" "wd,wd,wa")
(parallel
[(match_operand:QI 2 "u5bit_cint_operand" "i,i,i")])))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
{
gcc_assert (UINTVAL (operands[2]) <= 1);
operands[3] = GEN_INT (INTVAL (operands[2]) << 1);
@@ -919,17 +1066,30 @@ (define_insn "vsx_extract_v2df"
}
[(set_attr "type" "vecperm")])
-;; General V2DF permute, extract_{high,low,even,odd}
-(define_insn "vsx_xxpermdi"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd")
- (vec_concat:V2DF
- (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd")
- (parallel
- [(match_operand:QI 2 "u5bit_cint_operand" "i")]))
- (vec_select:DF (match_operand:V2DF 3 "vsx_register_operand" "wd")
- (parallel
- [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+;; General V2DF/V2DI permute
+(define_insn "vsx_xxpermdi_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:VSX_D [(match_operand:VSX_D 1 "vsx_register_operand" "wd,wa")
+ (match_operand:VSX_D 2 "vsx_register_operand" "wd,wa")
+ (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
+ UNSPEC_VSX_XXPERMDI))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxpermdi %x0,%x1,%x2,%3"
+ [(set_attr "type" "vecperm")])
+
+;; Variant of xxpermdi that is emitted by the vec_interleave functions
+(define_insn "*vsx_xxpermdi2_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd")
+ (vec_concat:VSX_D
+ (vec_select:<VS_scalar>
+ (match_operand:VSX_D 1 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i")]))
+ (vec_select:<VS_scalar>
+ (match_operand:VSX_D 3 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
{
gcc_assert ((UINTVAL (operands[2]) <= 1) && (UINTVAL (operands[4]) <= 1));
operands[5] = GEN_INT (((INTVAL (operands[2]) & 1) << 1)
@@ -939,11 +1099,11 @@ (define_insn "vsx_xxpermdi"
[(set_attr "type" "vecperm")])
-;; V2DF splat
+;; V2DF/V2DI splat
-(define_insn "vsx_splatv2df"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa")
- (vec_duplicate:V2DF
- (match_operand:DF 1 "input_operand" "ws,f,Z,wa,wa,Z")))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+(define_insn "vsx_splat_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa")
+ (vec_duplicate:VSX_D
+ (match_operand:<VS_scalar> 1 "input_operand" "ws,f,Z,wa,wa,Z")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
xxpermdi %x0,%x1,%x1,0
xxpermdi %x0,%x1,%x1,0
@@ -953,52 +1113,66 @@ (define_insn "vsx_splatv2df"
lxvdsx %x0,%y1"
[(set_attr "type" "vecperm,vecperm,vecload,vecperm,vecperm,vecload")])
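
Splat usage sketch (builtin name assumed from the VSX_BUILTIN_SPLAT_2DF entry
below):

  /* Duplicate a into both elements; a register input becomes
     xxpermdi %x0,%x1,%x1,0, a memory input uses the lxvdsx
     load-and-splat alternative.  */
  vector double
  splat_v2df (double a)
  {
    return __builtin_vsx_splat_2df (a);
  }
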
-;; V4SF splat
-(define_insn "*vsx_xxspltw"
- [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,?wa")
- (vec_duplicate:V4SF
- (vec_select:SF (match_operand:V4SF 1 "vsx_register_operand" "wf,wa")
- (parallel
- [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
- "VECTOR_UNIT_VSX_P (V4SFmode)"
+;; V4SF/V4SI splat
+(define_insn "vsx_xxspltw_<mode>"
+ [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+ (vec_duplicate:VSX_W
+ (vec_select:<VS_scalar>
+ (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxspltw %x0,%x1,%2"
[(set_attr "type" "vecperm")])
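
A scalar C model of what xxspltw computes (reference only; big-endian word
numbering):

  /* Replicate word 'elt' of the source across all four result words.  */
  static void
  xxspltw_model (unsigned int d[4], const unsigned int v[4], int elt)
  {
    int i;
    for (i = 0; i < 4; i++)
      d[i] = v[elt];
  }
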
-;; V4SF interleave
-(define_insn "vsx_xxmrghw"
- [(set (match_operand:V4SF 0 "register_operand" "=wf,?wa")
- (vec_merge:V4SF
- (vec_select:V4SF (match_operand:V4SF 1 "vsx_register_operand" "wf,wa")
- (parallel [(const_int 0)
- (const_int 2)
- (const_int 1)
- (const_int 3)]))
- (vec_select:V4SF (match_operand:V4SF 2 "vsx_register_operand" "wf,wa")
- (parallel [(const_int 2)
- (const_int 0)
- (const_int 3)
- (const_int 1)]))
+;; V4SF/V4SI interleave
+(define_insn "vsx_xxmrghw_<mode>"
+ [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+ (vec_merge:VSX_W
+ (vec_select:VSX_W
+ (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (vec_select:VSX_W
+ (match_operand:VSX_W 2 "vsx_register_operand" "wf,wa")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
(const_int 5)))]
- "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxmrghw %x0,%x1,%x2"
[(set_attr "type" "vecperm")])
-(define_insn "vsx_xxmrglw"
- [(set (match_operand:V4SF 0 "register_operand" "=wf,?wa")
- (vec_merge:V4SF
- (vec_select:V4SF
- (match_operand:V4SF 1 "register_operand" "wf,wa")
+(define_insn "vsx_xxmrglw_<mode>"
+ [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+ (vec_merge:VSX_W
+ (vec_select:VSX_W
+ (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
(parallel [(const_int 2)
(const_int 0)
(const_int 3)
(const_int 1)]))
- (vec_select:V4SF
- (match_operand:V4SF 2 "register_operand" "wf,?wa")
+ (vec_select:VSX_W
+ (match_operand:VSX_W 2 "vsx_register_operand" "wf,?wa")
(parallel [(const_int 0)
(const_int 2)
(const_int 1)
(const_int 3)]))
(const_int 5)))]
- "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxmrglw %x0,%x1,%x2"
[(set_attr "type" "vecperm")])
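
The element mapping the two merge patterns implement, as a scalar C model
(big-endian element numbering):

  /* xxmrghw interleaves the high halves: d = { a0, b0, a1, b1 }.
     xxmrglw interleaves the low halves:  d = { a2, b2, a3, b3 }.  */
  static void
  xxmrghw_model (unsigned int d[4], const unsigned int a[4],
                 const unsigned int b[4])
  {
    d[0] = a[0]; d[1] = b[0]; d[2] = a[1]; d[3] = b[1];
  }

  static void
  xxmrglw_model (unsigned int d[4], const unsigned int a[4],
                 const unsigned int b[4])
  {
    d[0] = a[2]; d[1] = b[2]; d[2] = a[3]; d[3] = b[3];
  }
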
+
+;; Shift left double by word immediate
+(define_insn "vsx_xxsldwi_<mode>"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=wa")
+ (unspec:VSX_L [(match_operand:VSX_L 1 "vsx_register_operand" "wa")
+ (match_operand:VSX_L 2 "vsx_register_operand" "wa")
+ (match_operand:QI 3 "u5bit_cint_operand" "i")]
+ UNSPEC_VSX_SLDWI))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxsldwi %x0,%x1,%x2,%3"
+ [(set_attr "type" "vecperm")])
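
And a C model of the new shift: xxsldwi takes four consecutive words out of
the eight-word concatenation of its inputs, starting at the immediate:

  /* d = words [shift .. shift+3] of { a0 a1 a2 a3 b0 b1 b2 b3 },
     with shift in 0..3.  */
  static void
  xxsldwi_model (unsigned int d[4], const unsigned int a[4],
                 const unsigned int b[4], int shift)
  {
    unsigned int cat[8];
    int i;
    for (i = 0; i < 4; i++)
      {
        cat[i] = a[i];
        cat[i + 4] = b[i];
      }
    for (i = 0; i < 4; i++)
      d[i] = cat[i + shift];
  }
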
--- gcc/config/rs6000/rs6000.h (revision 146119)
+++ gcc/config/rs6000/rs6000.h (revision 146798)
@@ -1033,14 +1033,6 @@ extern int rs6000_vector_align[];
((MODE) == V4SFmode \
|| (MODE) == V2DFmode) \
-#define VSX_VECTOR_MOVE_MODE(MODE) \
- ((MODE) == V16QImode \
- || (MODE) == V8HImode \
- || (MODE) == V4SImode \
- || (MODE) == V2DImode \
- || (MODE) == V4SFmode \
- || (MODE) == V2DFmode) \
-
#define VSX_SCALAR_MODE(MODE) \
((MODE) == DFmode)
@@ -1049,12 +1041,9 @@ extern int rs6000_vector_align[];
|| VSX_SCALAR_MODE (MODE))
#define VSX_MOVE_MODE(MODE) \
- (VSX_VECTOR_MOVE_MODE (MODE) \
- || VSX_SCALAR_MODE(MODE) \
- || (MODE) == V16QImode \
- || (MODE) == V8HImode \
- || (MODE) == V4SImode \
- || (MODE) == V2DImode \
+ (VSX_VECTOR_MODE (MODE) \
+ || VSX_SCALAR_MODE (MODE) \
+ || ALTIVEC_VECTOR_MODE (MODE) \
|| (MODE) == TImode)
#define ALTIVEC_VECTOR_MODE(MODE) \
@@ -1304,12 +1293,24 @@ enum reg_class
purpose. Any move between two registers of a cover class should be
cheaper than load or store of the registers. The macro value is
array of register classes with LIM_REG_CLASSES used as the end
- marker. */
+ marker.
+
+   We need two IRA_COVER_CLASSES: one for pre-VSX, and one for VSX, to account
+   for the Altivec and floating-point registers being subsets of the VSX
+   register set.  */
+
+#define IRA_COVER_CLASSES_PRE_VSX \
+{ \
+ GENERAL_REGS, SPECIAL_REGS, FLOAT_REGS, ALTIVEC_REGS, /* VSX_REGS, */ \
+ /* VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
+ /* MQ_REGS, LINK_REGS, CTR_REGS, */ \
+ CR_REGS, XER_REGS, LIM_REG_CLASSES \
+}
-#define IRA_COVER_CLASSES \
+#define IRA_COVER_CLASSES_VSX \
{ \
- GENERAL_REGS, SPECIAL_REGS, FLOAT_REGS, ALTIVEC_REGS, \
- /*VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
+ GENERAL_REGS, SPECIAL_REGS, /* FLOAT_REGS, ALTIVEC_REGS, */ VSX_REGS, \
+ /* VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
/* MQ_REGS, LINK_REGS, CTR_REGS, */ \
CR_REGS, XER_REGS, LIM_REG_CLASSES \
}
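
How the two macros get selected is not visible in this hunk; presumably
rs6000.c picks one at initialization time, along these (hypothetical) lines:

  /* Illustration only; the helper name and mechanism are assumptions, not
     patch content.  */
  static const enum reg_class cover_pre_vsx[] = IRA_COVER_CLASSES_PRE_VSX;
  static const enum reg_class cover_vsx[] = IRA_COVER_CLASSES_VSX;

  static const enum reg_class *
  rs6000_cover_classes (void)
  {
    return TARGET_VSX ? cover_vsx : cover_pre_vsx;
  }
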
@@ -3371,21 +3372,36 @@ enum rs6000_builtins
VSX_BUILTIN_XVTDIVSP,
VSX_BUILTIN_XVTSQRTDP,
VSX_BUILTIN_XVTSQRTSP,
- VSX_BUILTIN_XXLAND,
- VSX_BUILTIN_XXLANDC,
- VSX_BUILTIN_XXLNOR,
- VSX_BUILTIN_XXLOR,
- VSX_BUILTIN_XXLXOR,
- VSX_BUILTIN_XXMRGHD,
- VSX_BUILTIN_XXMRGHW,
- VSX_BUILTIN_XXMRGLD,
- VSX_BUILTIN_XXMRGLW,
- VSX_BUILTIN_XXPERMDI,
- VSX_BUILTIN_XXSEL,
- VSX_BUILTIN_XXSLDWI,
- VSX_BUILTIN_XXSPLTD,
- VSX_BUILTIN_XXSPLTW,
- VSX_BUILTIN_XXSWAPD,
+ VSX_BUILTIN_XXSEL_2DI,
+ VSX_BUILTIN_XXSEL_2DF,
+ VSX_BUILTIN_XXSEL_4SI,
+ VSX_BUILTIN_XXSEL_4SF,
+ VSX_BUILTIN_XXSEL_8HI,
+ VSX_BUILTIN_XXSEL_16QI,
+ VSX_BUILTIN_VPERM_2DI,
+ VSX_BUILTIN_VPERM_2DF,
+ VSX_BUILTIN_VPERM_4SI,
+ VSX_BUILTIN_VPERM_4SF,
+ VSX_BUILTIN_VPERM_8HI,
+ VSX_BUILTIN_VPERM_16QI,
+ VSX_BUILTIN_XXPERMDI_2DF,
+ VSX_BUILTIN_XXPERMDI_2DI,
+ VSX_BUILTIN_CONCAT_2DF,
+ VSX_BUILTIN_CONCAT_2DI,
+ VSX_BUILTIN_SET_2DF,
+ VSX_BUILTIN_SET_2DI,
+ VSX_BUILTIN_SPLAT_2DF,
+ VSX_BUILTIN_SPLAT_2DI,
+ VSX_BUILTIN_XXMRGHW_4SF,
+ VSX_BUILTIN_XXMRGHW_4SI,
+ VSX_BUILTIN_XXMRGLW_4SF,
+ VSX_BUILTIN_XXMRGLW_4SI,
+ VSX_BUILTIN_XXSLDWI_16QI,
+ VSX_BUILTIN_XXSLDWI_8HI,
+ VSX_BUILTIN_XXSLDWI_4SI,
+ VSX_BUILTIN_XXSLDWI_4SF,
+ VSX_BUILTIN_XXSLDWI_2DI,
+ VSX_BUILTIN_XXSLDWI_2DF,
/* VSX overloaded builtins, add the overloaded functions not present in
Altivec. */
@@ -3395,7 +3411,13 @@ enum rs6000_builtins
VSX_BUILTIN_VEC_NMADD,
VSX_BUITLIN_VEC_NMSUB,
VSX_BUILTIN_VEC_DIV,
- VSX_BUILTIN_OVERLOADED_LAST = VSX_BUILTIN_VEC_DIV,
+ VSX_BUILTIN_VEC_XXMRGHW,
+ VSX_BUILTIN_VEC_XXMRGLW,
+ VSX_BUILTIN_VEC_XXPERMDI,
+ VSX_BUILTIN_VEC_XXSLDWI,
+ VSX_BUILTIN_VEC_XXSPLTD,
+ VSX_BUILTIN_VEC_XXSPLTW,
+ VSX_BUILTIN_OVERLOADED_LAST = VSX_BUILTIN_VEC_XXSPLTW,
/* Combined VSX/Altivec builtins. */
VECTOR_BUILTIN_FLOAT_V4SI_V4SF,
@@ -3425,13 +3447,16 @@ enum rs6000_builtin_type_index
RS6000_BTI_unsigned_V16QI,
RS6000_BTI_unsigned_V8HI,
RS6000_BTI_unsigned_V4SI,
+ RS6000_BTI_unsigned_V2DI,
RS6000_BTI_bool_char, /* __bool char */
RS6000_BTI_bool_short, /* __bool short */
RS6000_BTI_bool_int, /* __bool int */
+ RS6000_BTI_bool_long, /* __bool long */
RS6000_BTI_pixel, /* __pixel */
RS6000_BTI_bool_V16QI, /* __vector __bool char */
RS6000_BTI_bool_V8HI, /* __vector __bool short */
RS6000_BTI_bool_V4SI, /* __vector __bool int */
+ RS6000_BTI_bool_V2DI, /* __vector __bool long */
RS6000_BTI_pixel_V8HI, /* __vector __pixel */
RS6000_BTI_long, /* long_integer_type_node */
RS6000_BTI_unsigned_long, /* long_unsigned_type_node */
@@ -3466,13 +3491,16 @@ enum rs6000_builtin_type_index
#define unsigned_V16QI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V16QI])
#define unsigned_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V8HI])
#define unsigned_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V4SI])
+#define unsigned_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V2DI])
#define bool_char_type_node (rs6000_builtin_types[RS6000_BTI_bool_char])
#define bool_short_type_node (rs6000_builtin_types[RS6000_BTI_bool_short])
#define bool_int_type_node (rs6000_builtin_types[RS6000_BTI_bool_int])
+#define bool_long_type_node (rs6000_builtin_types[RS6000_BTI_bool_long])
#define pixel_type_node (rs6000_builtin_types[RS6000_BTI_pixel])
#define bool_V16QI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V16QI])
#define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI])
#define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI])
+#define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI])
#define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI])
#define long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long])
--- gcc/config/rs6000/altivec.md (revision 146119)
+++ gcc/config/rs6000/altivec.md (revision 146798)
@@ -166,12 +166,15 @@ (define_mode_iterator V [V4SI V8HI V16QI
;; otherwise handled by altivec (v2df, v2di, ti)
(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI TI])
+;; Like VM, but without TImode
+(define_mode_iterator VM2 [V4SI V8HI V16QI V4SF V2DF V2DI])
+
(define_mode_attr VI_char [(V4SI "w") (V8HI "h") (V16QI "b")])
;; Vector move instructions.
(define_insn "*altivec_mov<mode>"
- [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,*o,*r,*r,v,v")
- (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
+ [(set (match_operand:VM2 0 "nonimmediate_operand" "=Z,v,v,*o,*r,*r,v,v")
+ (match_operand:VM2 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
"VECTOR_MEM_ALTIVEC_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
@@ -191,6 +194,31 @@ (define_insn "*altivec_mov<mode>"
}
[(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")])
+;; Unlike other altivec moves, allow the GPRs, since a normal use of TImode
+;; is for unions.  However, for plain data movement, slightly favor the
+;; vector loads.
+(define_insn "*altivec_movti"
+ [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,v,v,?o,?r,?r,v,v")
+ (match_operand:TI 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
+ "VECTOR_MEM_ALTIVEC_P (TImode)
+ && (register_operand (operands[0], TImode)
+ || register_operand (operands[1], TImode))"
+{
+ switch (which_alternative)
+ {
+ case 0: return "stvx %1,%y0";
+ case 1: return "lvx %0,%y1";
+ case 2: return "vor %0,%1,%1";
+ case 3: return "#";
+ case 4: return "#";
+ case 5: return "#";
+ case 6: return "vxor %0,%0,%0";
+ case 7: return output_vec_const_move (operands);
+ default: gcc_unreachable ();
+ }
+}
+ [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")])
+
(define_split
[(set (match_operand:VM 0 "altivec_register_operand" "")
(match_operand:VM 1 "easy_vector_constant_add_self" ""))]
@@ -434,13 +462,13 @@ (define_insn "*altivec_gev4sf"
"vcmpgefp %0,%1,%2"
[(set_attr "type" "veccmp")])
-(define_insn "altivec_vsel<mode>"
+(define_insn "*altivec_vsel<mode>"
[(set (match_operand:VM 0 "altivec_register_operand" "=v")
(if_then_else:VM (ne (match_operand:VM 1 "altivec_register_operand" "v")
(const_int 0))
(match_operand:VM 2 "altivec_register_operand" "v")
(match_operand:VM 3 "altivec_register_operand" "v")))]
- "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vsel %0,%3,%2,%1"
[(set_attr "type" "vecperm")])
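
The bitwise semantics behind the vsel template (note the reversed operand
order in the output string: the mask, operand 1, goes last):

  /* Mask bits that are 1 select bits from t (operand 2), mask bits that
     are 0 select bits from f (operand 3); the if_then_else RTL matches
     this when each mask element is all-ones or all-zeros.  */
  static unsigned int
  vsel_model (unsigned int mask, unsigned int t, unsigned int f)
  {
    return (t & mask) | (f & ~mask);
  }
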
@@ -780,7 +808,7 @@ (define_insn "altivec_vmrghw"
(const_int 3)
(const_int 1)]))
(const_int 5)))]
- "TARGET_ALTIVEC"
+ "VECTOR_MEM_ALTIVEC_P (V4SImode)"
"vmrghw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -797,7 +825,7 @@ (define_insn "*altivec_vmrghsf"
(const_int 3)
(const_int 1)]))
(const_int 5)))]
- "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
+ "VECTOR_MEM_ALTIVEC_P (V4SFmode)"
"vmrghw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -881,7 +909,7 @@ (define_insn "altivec_vmrglw"
(const_int 1)
(const_int 3)]))
(const_int 5)))]
- "TARGET_ALTIVEC"
+ "VECTOR_MEM_ALTIVEC_P (V4SImode)"
"vmrglw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -899,7 +927,7 @@ (define_insn "*altivec_vmrglsf"
(const_int 1)
(const_int 3)]))
(const_int 5)))]
- "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
+ "VECTOR_MEM_ALTIVEC_P (V4SFmode)"
"vmrglw %0,%1,%2"
[(set_attr "type" "vecperm")])
--- gcc/config/rs6000/rs6000.md (revision 146119)
+++ gcc/config/rs6000/rs6000.md (revision 146798)
@@ -14667,7 +14667,11 @@ (define_insn "return"
[(set_attr "type" "jmpreg")])
(define_expand "indirect_jump"
- [(set (pc) (match_operand 0 "register_operand" ""))])
+ [(set (pc) (match_operand 0 "register_operand" ""))]
+ ""
+{
+ rs6000_set_indirect_jump ();
+})
(define_insn "*indirect_jump<mode>"
[(set (pc) (match_operand:P 0 "register_operand" "c,*l"))]
@@ -14682,14 +14686,14 @@ (define_expand "tablejump"
[(use (match_operand 0 "" ""))
(use (label_ref (match_operand 1 "" "")))]
""
- "
{
+ rs6000_set_indirect_jump ();
if (TARGET_32BIT)
emit_jump_insn (gen_tablejumpsi (operands[0], operands[1]));
else
emit_jump_insn (gen_tablejumpdi (operands[0], operands[1]));
DONE;
-}")
+})
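
The rs6000_set_indirect_jump / rs6000_has_indirect_jump_p helpers used above
are defined elsewhere in the patch (rs6000.c); a minimal sketch of the shape
they presumably take:

  /* Hypothetical sketch, not patch content: record that the current
     function emitted an indirect or table jump, so doloop_end below can
     refuse to tie up the ctr register that such jumps prefer.  */
  static int rs6000_indirect_jump_in_function;

  void
  rs6000_set_indirect_jump (void)
  {
    rs6000_indirect_jump_in_function = 1;
  }

  int
  rs6000_has_indirect_jump_p (void)
  {
    return rs6000_indirect_jump_in_function;
  }
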
(define_expand "tablejumpsi"
[(set (match_dup 3)
@@ -14749,6 +14753,11 @@ (define_expand "doloop_end"
/* Only use this on innermost loops. */
if (INTVAL (operands[3]) > 1)
FAIL;
+  /* Do not try to use decrement and count in code that has an indirect
+     jump or a table jump, because such jumps prefer to use the ctr register
+     rather than lr, and decrement and count would tie ctr up.  */
+ if (rs6000_has_indirect_jump_p ())
+ FAIL;
if (TARGET_64BIT)
{
if (GET_MODE (operands[0]) != DImode)