gcc/gcc44-power7-3.patch

2009-04-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vector.md (vector_vsel<mode>): Generate the insns
directly instead of calling VSX/Altivec expanders.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Map VSX
builtins that are identical to Altivec, to the Altivec version.
(altivec_overloaded_builtins): Add V2DF/V2DI sel, perm support.
(altivec_resolve_overloaded_builtin): Add V2DF/V2DI support.
* config/rs6000/rs6000.c (rs6000_expand_vector_init): Rename VSX
splat functions.
(expand_vector_set): Merge V2DF/V2DI code.
(expand_vector_extract): Ditto.
(bdesc_3arg): Add more VSX builtins.
(bdesc_2arg): Ditto.
(bdesc_1arg): Ditto.
(rs6000_expand_ternop_builtin): Require the xxpermdi 3rd argument
to be a 2-bit constant, and the V2DF/V2DI set 3rd argument to be a
1-bit constant.
(altivec_expand_builtin): Add support for VSX overloaded builtins.
(altivec_init_builtins): Ditto.
(rs6000_common_init_builtins): Ditto.
(rs6000_init_builtins): Add V2DI types and vector long support.
(rs6000_handle_altivec_attribute): Ditto.
(rs6000_mangle_type): Ditto.
* config/rs6000/vsx.md (UNSPEC_*): Add new UNSPEC constants.
(vsx_vsel<mode>): Add support for all vector types, including
Altivec types.
(vsx_ftrunc<mode>2): Emit the correct instruction.
(vsx_x<VSv>r<VSs>i): New builtin rounding mode insns.
(vsx_x<VSv>r<VSs>ic): Ditto.
(vsx_concat_<mode>): Key off of VSX memory instructions being
generated instead of the vector arithmetic unit to enable V2DI
mode.
(vsx_extract_<mode>): Ditto.
(vsx_set_<mode>): Rewrite as an unspec.
(vsx_xxpermdi2_<mode>): Rename old vsx_xxpermdi_<mode> here. Key
off of VSX memory instructions instead of arithmetic unit.
(vsx_xxpermdi_<mode>): New insn for __builtin_vsx_xxpermdi.
(vsx_splat_<mode>): Rename from vsx_splat<mode>.
(vsx_xxspltw_<mode>): Change from V4SF only to V4SF/V4SI modes.
Fix up constraints. Key off of memory instructions instead of
arithmetic instructions to allow use with V4SI.
(vsx_xxmrghw_<mode>): Ditto.
(vsx_xxmrglw_<mode>): Ditto.
(vsx_xxsldwi_<mode>): Implement vector shift double by word
immediate.
* config/rs6000/rs6000.h (VSX_BUILTIN_*): Update for current
builtins being generated.
(RS6000_BTI_unsigned_V2DI): Add vector long support.
(RS6000_BTI_bool_long): Ditto.
(RS6000_BTI_bool_V2DI): Ditto.
(unsigned_V2DI_type_node): Ditto.
(bool_long_type_node): Ditto.
(bool_V2DI_type_node): Ditto.
* config/rs6000/altivec.md (altivec_vsel<mode>): Add '*' since we
don't need the generator function now. Use VSX instruction if
-mvsx.
(altivec_vmrghw): Use VSX instruction if -mvsx.
(altivec_vmrghsf): Ditto.
(altivec_vmrglw): Ditto.
(altivec_vmrglsf): Ditto.
* doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
Document that under VSX, vector double/long are available.
testsuite/
* gcc.target/powerpc/vsx-builtin-3.c: New test for VSX builtins.
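[Not part of the patch: a minimal usage sketch of the overloaded VSX
builtins added above, assuming a compiler built with these changes.
The xxpermdi immediate encoding in the comment is an assumption, not
taken from the patch.]

    /* Compile with: gcc -O2 -mcpu=power7 -mvsx -c vsx-sketch.c  */
    typedef __vector double vec_double;
    typedef __vector __bool long vec_bool_long;

    vec_double
    select_double (vec_double a, vec_double b, vec_bool_long mask)
    {
      /* Bitwise select between a and b under mask; maps to xxsel.  */
      return __builtin_vsx_xxsel (a, b, mask);
    }

    vec_double
    swap_halves (vec_double a)
    {
      /* xxpermdi takes a 2-bit immediate choosing 64-bit halves of
	 its two inputs; 2 is assumed here to swap the doublewords.  */
      return __builtin_vsx_xxpermdi (a, a, 2);
    }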
2009-04-23 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vector.md (VEC_E): New iterator to add V2DImode.
(vec_init<mode>): Use VEC_E instead of VEC_C iterator, to add
V2DImode support.
(vec_set<mode>): Ditto.
(vec_extract<mode>): Ditto.
* config/rs6000/predicates.md (easy_vector_constant): Add support
for setting TImode to 0.
* config/rs6000/rs6000.opt (-mvsx-vector-memory): Delete old debug
switch that is no longer used.
(-mvsx-vector-float): Ditto.
(-mvsx-vector-double): Ditto.
(-mvsx-v4sf-altivec-regs): Ditto.
(-mreload-functions): Ditto.
(-mallow-timode): New debug switch.
* config/rs6000/rs6000.c (rs6000_ira_cover_classes): New target
hook for IRA cover classes, so that IRA knows that under VSX the
float and Altivec registers are part of the same register class,
whereas previously they were separate.
(TARGET_IRA_COVER_CLASSES): Set the IRA cover classes target hook.
(rs6000_hard_regno_nregs): Key off of whether VSX/Altivec memory
instructions are supported, and not whether the vector unit has
arithmetic support to enable V2DI/TI mode.
(rs6000_hard_regno_mode_ok): Ditto.
(rs6000_init_hard_regno_mode_ok): Add V2DImode, TImode support.
Drop several of the debug switches.
(rs6000_emit_move): Force TImode constants to memory if we have
either Altivec or VSX.
(rs6000_builtin_conversion): Use correct insns for V2DI<->V2DF
conversions.
(rs6000_expand_vector_init): Add V2DI support.
(rs6000_expand_vector_set): Ditto.
(avoiding_indexed_address_p): Simplify the tests: if the mode uses
VSX/Altivec memory instructions, we cannot eliminate reg+reg
addressing.
(rs6000_legitimize_address): Move VSX/Altivec REG+REG support
before the large integer support.
(rs6000_legitimate_address): Add support for TImode in VSX/Altivec
registers.
(rs6000_emit_move): Ditto.
(def_builtin): Change internal error message to provide more
information.
(bdesc_2arg): Add conversion builtins.
(builtin_hash_function): New function for hashing all of the types
for builtin functions.
(builtin_hash_eq): Ditto.
(builtin_function_type): Ditto.
(builtin_mode_to_type): New static for builtin argument hashing.
(builtin_hash_table): Ditto.
(rs6000_common_init_builtins): Rewrite so that types for builtin
functions are only created when we need them, and use a hash table
to store all of the different argument combinations that are
created. Add support for VSX conversion builtins.
(rs6000_preferred_reload_class): Add TImode support.
(reg_classes_cannot_change_mode_class): Be stricter about VSX and
Altivec vector types.
(rs6000_emit_vector_cond_expr): Use VSX_MOVE_MODE, not
VSX_VECTOR_MOVE_MODE.
(rs6000_handle_altivec_attribute): Allow __vector long on VSX.
* config/rs6000/vsx.md (VSX_D): New iterator for vectors with
64-bit elements.
(VSX_M): New iterator for 128-bit types for moves, except for
TImode.
(VSm, VSs, VSr): Add TImode.
(VSr4, VSr5): New mode attributes for float<->double conversion.
(VSX_SPDP): New iterator for float<->double conversion.
(VS_spdp_*): New mode attributes for float<->double conversion.
(UNSPEC_VSX_*): Rename unspec constants to remove XV from the
names. Change all users.
(vsx_mov<mode>): Drop TImode support here.
(vsx_movti): New TImode support, allow GPRs, but favor VSX
registers.
(vsx_<VS_spdp_insn>): New support for float<->double conversions.
(vsx_xvcvdpsp): Delete, move into vsx_<VS_spdp_insn>.
(vsx_xvcvspdp): Ditto.
(vsx_xvcvuxdsp): New conversion insn.
(vsx_xvcvspsxds): Ditto.
(vsx_xvcvspuxds): Ditto.
(vsx_concat_<mode>): Generalize V2DF permute/splat operations to
include V2DI.
(vsx_set_<mode>): Ditto.
(vsx_extract_<mode>): Ditto.
(vsx_xxpermdi_<mode>): Ditto.
(vsx_splat<mode>): Ditto.
* config/rs6000/rs6000.h (VSX_VECTOR_MOVE_MODE): Delete.
(VSX_MOVE_MODE): Add TImode.
(IRA_COVER_CLASSES): Delete.
(IRA_COVER_CLASSES_PRE_VSX): New cover classes for machines
without VSX where float and altivec are different registers.
(IRA_COVER_CLASS_VSX): New cover classes for machines with VSX
where float and altivec are part of the same register class.
* config/rs6000/altivec.md (VM2): New iterator for 128-bit types,
except TImode.
(altivec_mov<mode>): Drop TImode support here.
(altivec_movti): Add movti insn, and allow GPRs, but favor altivec
registers.
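[Not part of the patch: a condensed sketch of the builtin-type
hashing described above. Illustrative only; the real code in
rs6000.c uses GCC's htab interface and handles more cases.]

    /* One entry per distinct (return mode, argument modes) signature,
       so each function type is built once and then shared.  */
    struct builtin_hash_struct
    {
      tree type;
      enum machine_mode mode[4];	/* return value + 3 arguments */
    };

    static unsigned
    builtin_hash_function (const void *hash_entry)
    {
      const struct builtin_hash_struct *bh
	= (const struct builtin_hash_struct *) hash_entry;
      unsigned ret = 0;
      int i;

      /* Fold the four modes into one hash value.  */
      for (i = 0; i < 4; i++)
	ret = (ret * (unsigned) MAX_MACHINE_MODE) + (unsigned) bh->mode[i];

      return ret;
    }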
2009-04-16 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000-protos.h (rs6000_has_indirect_jump_p): New
declaration.
(rs6000_set_indirect_jump): Ditto.
* config/rs6000/rs6000.c (struct machine_function): Add
indirect_jump_p field.
(rs6000_override_options): Wrap warning messages in N_(). If
-mvsx was implicitly set, don't give a warning for -msoft-float,
just silently turn off vsx.
(rs6000_secondary_reload_inner): Don't use strict register
checking, since pseudos may still be present.
(register_move_cost): If -mdebug=cost, print out cost information.
(rs6000_memory_move_cost): Ditto.
(rs6000_has_indirect_jump_p): New function, return true if
current function has an indirect jump.
(rs6000_set_indirect_jump): New function, note that an indirect
jump has been generated.
* config/rs6000/rs6000.md (indirect_jump): Note that we've
generated an indirect jump.
(tablejump): Ditto.
(doloop_end): Do not generate decrement ctr and branch
instructions if an indirect jump has been generated.
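[Not part of the patch: the indirect-jump tracking described above
amounts to a flag in the per-function machine state. A plausible
sketch of the two new functions follows; the actual bodies are in
rs6000.c and may differ.]

    /* Return true if the current function generated an indirect jump.  */
    bool
    rs6000_has_indirect_jump_p (void)
    {
      return cfun->machine->indirect_jump_p;
    }

    /* Record that an indirect or table jump was generated.  */
    void
    rs6000_set_indirect_jump (void)
    {
      cfun->machine->indirect_jump_p = true;
    }

With this, the doloop_end expander can refuse to emit the
decrement-CTR-and-branch form once an indirect jump has been seen,
presumably because indirect jumps on PowerPC also go through the CTR
register.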
--- gcc/doc/extend.texi (revision 146119)
+++ gcc/doc/extend.texi (revision 146798)
@@ -7094,7 +7094,7 @@ instructions, but allow the compiler to
* MIPS Loongson Built-in Functions::
* Other MIPS Built-in Functions::
* picoChip Built-in Functions::
-* PowerPC AltiVec Built-in Functions::
+* PowerPC AltiVec/VSX Built-in Functions::
* SPARC VIS Built-in Functions::
* SPU Built-in Functions::
@end menu
@@ -9571,7 +9571,7 @@ GCC defines the preprocessor macro @code
when this function is available.
@end table
-@node PowerPC AltiVec Built-in Functions
+@node PowerPC AltiVec/VSX Built-in Functions
@subsection PowerPC AltiVec Built-in Functions
GCC provides an interface for the PowerPC family of processors to access
@@ -9597,6 +9597,19 @@ vector bool int
vector float
@end smallexample
+If @option{-mvsx} is used the following additional vector types are
+implemented.
+
+@smallexample
+vector unsigned long
+vector signed long
+vector double
+@end smallexample
+
+The long types are only implemented for 64-bit code generation, and
+the long type is only used in the floating point/integer conversion
+instructions.
+
GCC's implementation of the high-level language interface available from
C and C++ code differs from Motorola's documentation in several ways.
--- gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c (revision 146798)
@@ -0,0 +1,212 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power7" } */
+/* { dg-final { scan-assembler "xxsel" } } */
+/* { dg-final { scan-assembler "vperm" } } */
+/* { dg-final { scan-assembler "xvrdpi" } } */
+/* { dg-final { scan-assembler "xvrdpic" } } */
+/* { dg-final { scan-assembler "xvrdpim" } } */
+/* { dg-final { scan-assembler "xvrdpip" } } */
+/* { dg-final { scan-assembler "xvrdpiz" } } */
+/* { dg-final { scan-assembler "xvrspi" } } */
+/* { dg-final { scan-assembler "xvrspic" } } */
+/* { dg-final { scan-assembler "xvrspim" } } */
+/* { dg-final { scan-assembler "xvrspip" } } */
+/* { dg-final { scan-assembler "xvrspiz" } } */
+/* { dg-final { scan-assembler "xsrdpi" } } */
+/* { dg-final { scan-assembler "xsrdpic" } } */
+/* { dg-final { scan-assembler "xsrdpim" } } */
+/* { dg-final { scan-assembler "xsrdpip" } } */
+/* { dg-final { scan-assembler "xsrdpiz" } } */
+/* { dg-final { scan-assembler "xsmaxdp" } } */
+/* { dg-final { scan-assembler "xsmindp" } } */
+/* { dg-final { scan-assembler "xxland" } } */
+/* { dg-final { scan-assembler "xxlandc" } } */
+/* { dg-final { scan-assembler "xxlnor" } } */
+/* { dg-final { scan-assembler "xxlor" } } */
+/* { dg-final { scan-assembler "xxlxor" } } */
+/* { dg-final { scan-assembler "xvcmpeqdp" } } */
+/* { dg-final { scan-assembler "xvcmpgtdp" } } */
+/* { dg-final { scan-assembler "xvcmpgedp" } } */
+/* { dg-final { scan-assembler "xvcmpeqsp" } } */
+/* { dg-final { scan-assembler "xvcmpgtsp" } } */
+/* { dg-final { scan-assembler "xvcmpgesp" } } */
+/* { dg-final { scan-assembler "xxsldwi" } } */
+/* { dg-final { scan-assembler-not "call" } } */
+
+extern __vector int si[][4];
+extern __vector short ss[][4];
+extern __vector signed char sc[][4];
+extern __vector float f[][4];
+extern __vector unsigned int ui[][4];
+extern __vector unsigned short us[][4];
+extern __vector unsigned char uc[][4];
+extern __vector __bool int bi[][4];
+extern __vector __bool short bs[][4];
+extern __vector __bool char bc[][4];
+extern __vector __pixel p[][4];
+#ifdef __VSX__
+extern __vector double d[][4];
+extern __vector long sl[][4];
+extern __vector unsigned long ul[][4];
+extern __vector __bool long bl[][4];
+#endif
+
+int do_sel(void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_xxsel_4si (si[i][1], si[i][2], si[i][3]); i++;
+ ss[i][0] = __builtin_vsx_xxsel_8hi (ss[i][1], ss[i][2], ss[i][3]); i++;
+ sc[i][0] = __builtin_vsx_xxsel_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
+ f[i][0] = __builtin_vsx_xxsel_4sf (f[i][1], f[i][2], f[i][3]); i++;
+ d[i][0] = __builtin_vsx_xxsel_2df (d[i][1], d[i][2], d[i][3]); i++;
+
+ si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], bi[i][3]); i++;
+ ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], bs[i][3]); i++;
+ sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], bc[i][3]); i++;
+ f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], bi[i][3]); i++;
+ d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], bl[i][3]); i++;
+
+ si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], ui[i][3]); i++;
+ ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], us[i][3]); i++;
+ sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], uc[i][3]); i++;
+ f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], ui[i][3]); i++;
+ d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], ul[i][3]); i++;
+
+ return i;
+}
+
+int do_perm(void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_vperm_4si (si[i][1], si[i][2], sc[i][3]); i++;
+ ss[i][0] = __builtin_vsx_vperm_8hi (ss[i][1], ss[i][2], sc[i][3]); i++;
+ sc[i][0] = __builtin_vsx_vperm_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
+ f[i][0] = __builtin_vsx_vperm_4sf (f[i][1], f[i][2], sc[i][3]); i++;
+ d[i][0] = __builtin_vsx_vperm_2df (d[i][1], d[i][2], sc[i][3]); i++;
+
+ si[i][0] = __builtin_vsx_vperm (si[i][1], si[i][2], uc[i][3]); i++;
+ ss[i][0] = __builtin_vsx_vperm (ss[i][1], ss[i][2], uc[i][3]); i++;
+ sc[i][0] = __builtin_vsx_vperm (sc[i][1], sc[i][2], uc[i][3]); i++;
+ f[i][0] = __builtin_vsx_vperm (f[i][1], f[i][2], uc[i][3]); i++;
+ d[i][0] = __builtin_vsx_vperm (d[i][1], d[i][2], uc[i][3]); i++;
+
+ return i;
+}
+
+int do_xxperm (void)
+{
+ int i = 0;
+
+ d[i][0] = __builtin_vsx_xxpermdi_2df (d[i][1], d[i][2], 0); i++;
+ d[i][0] = __builtin_vsx_xxpermdi (d[i][1], d[i][2], 1); i++;
+ return i;
+}
+
+double x, y;
+void do_concat (void)
+{
+ d[0][0] = __builtin_vsx_concat_2df (x, y);
+}
+
+void do_set (void)
+{
+ d[0][0] = __builtin_vsx_set_2df (d[0][1], x, 0);
+ d[1][0] = __builtin_vsx_set_2df (d[1][1], y, 1);
+}
+
+extern double z[][4];
+
+int do_math (void)
+{
+ int i = 0;
+
+ d[i][0] = __builtin_vsx_xvrdpi (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpic (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpim (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpip (d[i][1]); i++;
+ d[i][0] = __builtin_vsx_xvrdpiz (d[i][1]); i++;
+
+ f[i][0] = __builtin_vsx_xvrspi (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspic (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspim (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspip (f[i][1]); i++;
+ f[i][0] = __builtin_vsx_xvrspiz (f[i][1]); i++;
+
+ z[i][0] = __builtin_vsx_xsrdpi (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpic (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpim (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpip (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsrdpiz (z[i][1]); i++;
+ z[i][0] = __builtin_vsx_xsmaxdp (z[i][1], z[i][0]); i++;
+ z[i][0] = __builtin_vsx_xsmindp (z[i][1], z[i][0]); i++;
+ return i;
+}
+
+int do_cmp (void)
+{
+ int i = 0;
+
+ d[i][0] = __builtin_vsx_xvcmpeqdp (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xvcmpgtdp (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xvcmpgedp (d[i][1], d[i][2]); i++;
+
+ f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xvcmpgtsp (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xvcmpgesp (f[i][1], f[i][2]); i++;
+ return i;
+}
+
+int do_logical (void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_xxland (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlandc (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlnor (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlor (si[i][1], si[i][2]); i++;
+ si[i][0] = __builtin_vsx_xxlxor (si[i][1], si[i][2]); i++;
+
+ ss[i][0] = __builtin_vsx_xxland (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlandc (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlnor (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlor (ss[i][1], ss[i][2]); i++;
+ ss[i][0] = __builtin_vsx_xxlxor (ss[i][1], ss[i][2]); i++;
+
+ sc[i][0] = __builtin_vsx_xxland (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlandc (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlnor (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlor (sc[i][1], sc[i][2]); i++;
+ sc[i][0] = __builtin_vsx_xxlxor (sc[i][1], sc[i][2]); i++;
+
+ d[i][0] = __builtin_vsx_xxland (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlandc (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlnor (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlor (d[i][1], d[i][2]); i++;
+ d[i][0] = __builtin_vsx_xxlxor (d[i][1], d[i][2]); i++;
+
+ f[i][0] = __builtin_vsx_xxland (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlandc (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlnor (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlor (f[i][1], f[i][2]); i++;
+ f[i][0] = __builtin_vsx_xxlxor (f[i][1], f[i][2]); i++;
+ return i;
+}
+
+int do_xxsldwi (void)
+{
+ int i = 0;
+
+ si[i][0] = __builtin_vsx_xxsldwi (si[i][1], si[i][2], 0); i++;
+ ss[i][0] = __builtin_vsx_xxsldwi (ss[i][1], ss[i][2], 1); i++;
+ sc[i][0] = __builtin_vsx_xxsldwi (sc[i][1], sc[i][2], 2); i++;
+ ui[i][0] = __builtin_vsx_xxsldwi (ui[i][1], ui[i][2], 3); i++;
+ us[i][0] = __builtin_vsx_xxsldwi (us[i][1], us[i][2], 0); i++;
+ uc[i][0] = __builtin_vsx_xxsldwi (uc[i][1], uc[i][2], 1); i++;
+ f[i][0] = __builtin_vsx_xxsldwi (f[i][1], f[i][2], 2); i++;
+ d[i][0] = __builtin_vsx_xxsldwi (d[i][1], d[i][2], 3); i++;
+ return i;
+}
--- gcc/config/rs6000/vector.md (revision 146119)
+++ gcc/config/rs6000/vector.md (revision 146798)
@@ -39,6 +39,9 @@ (define_mode_iterator VEC_M [V16QI V8HI
;; Vector comparison modes
(define_mode_iterator VEC_C [V16QI V8HI V4SI V4SF V2DF])
+;; Vector init/extract modes
+(define_mode_iterator VEC_E [V16QI V8HI V4SI V2DI V4SF V2DF])
+
;; Vector reload iterator
(define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF DF TI])
@@ -347,34 +350,13 @@ (define_expand "vector_geu<mode>"
;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
;; which is in the reverse order that we want
(define_expand "vector_vsel<mode>"
- [(match_operand:VEC_F 0 "vlogical_operand" "")
- (match_operand:VEC_F 1 "vlogical_operand" "")
- (match_operand:VEC_F 2 "vlogical_operand" "")
- (match_operand:VEC_F 3 "vlogical_operand" "")]
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (if_then_else:VEC_L (ne (match_operand:VEC_L 3 "vlogical_operand" "")
+ (const_int 0))
+ (match_operand:VEC_L 2 "vlogical_operand" "")
+ (match_operand:VEC_L 1 "vlogical_operand" "")))]
"VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
- "
-{
- if (VECTOR_UNIT_VSX_P (<MODE>mode))
- emit_insn (gen_vsx_vsel<mode> (operands[0], operands[3],
- operands[2], operands[1]));
- else
- emit_insn (gen_altivec_vsel<mode> (operands[0], operands[3],
- operands[2], operands[1]));
- DONE;
-}")
-
-(define_expand "vector_vsel<mode>"
- [(match_operand:VEC_I 0 "vlogical_operand" "")
- (match_operand:VEC_I 1 "vlogical_operand" "")
- (match_operand:VEC_I 2 "vlogical_operand" "")
- (match_operand:VEC_I 3 "vlogical_operand" "")]
- "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
- "
-{
- emit_insn (gen_altivec_vsel<mode> (operands[0], operands[3],
- operands[2], operands[1]));
- DONE;
-}")
+ "")
;; Vector logical instructions
@@ -475,19 +457,23 @@ (define_expand "fixuns_trunc<mode><VEC_i
;; Vector initialization, set, extract
(define_expand "vec_init<mode>"
- [(match_operand:VEC_C 0 "vlogical_operand" "")
- (match_operand:VEC_C 1 "vec_init_operand" "")]
- "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ [(match_operand:VEC_E 0 "vlogical_operand" "")
+ (match_operand:VEC_E 1 "vec_init_operand" "")]
+ "(<MODE>mode == V2DImode
+ ? VECTOR_MEM_VSX_P (V2DImode)
+ : VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode))"
{
rs6000_expand_vector_init (operands[0], operands[1]);
DONE;
})
(define_expand "vec_set<mode>"
- [(match_operand:VEC_C 0 "vlogical_operand" "")
+ [(match_operand:VEC_E 0 "vlogical_operand" "")
(match_operand:<VEC_base> 1 "register_operand" "")
(match_operand 2 "const_int_operand" "")]
- "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "(<MODE>mode == V2DImode
+ ? VECTOR_MEM_VSX_P (V2DImode)
+ : VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode))"
{
rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
DONE;
@@ -495,9 +481,11 @@ (define_expand "vec_set<mode>"
(define_expand "vec_extract<mode>"
[(match_operand:<VEC_base> 0 "register_operand" "")
- (match_operand:VEC_C 1 "vlogical_operand" "")
+ (match_operand:VEC_E 1 "vlogical_operand" "")
(match_operand 2 "const_int_operand" "")]
- "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "(<MODE>mode == V2DImode
+ ? VECTOR_MEM_VSX_P (V2DImode)
+ : VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode))"
{
rs6000_expand_vector_extract (operands[0], operands[1],
INTVAL (operands[2]));
--- gcc/config/rs6000/predicates.md (revision 146119)
+++ gcc/config/rs6000/predicates.md (revision 146798)
@@ -327,6 +327,9 @@ (define_predicate "easy_vector_constant"
if (TARGET_PAIRED_FLOAT)
return false;
+ if ((VSX_VECTOR_MODE (mode) || mode == TImode) && zero_constant (op, mode))
+ return true;
+
if (ALTIVEC_VECTOR_MODE (mode))
{
if (zero_constant (op, mode))
--- gcc/config/rs6000/rs6000-protos.h (revision 146119)
+++ gcc/config/rs6000/rs6000-protos.h (revision 146798)
@@ -176,6 +176,8 @@ extern int rs6000_register_move_cost (en
enum reg_class, enum reg_class);
extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int);
extern bool rs6000_tls_referenced_p (rtx);
+extern bool rs6000_has_indirect_jump_p (void);
+extern void rs6000_set_indirect_jump (void);
extern void rs6000_conditional_register_usage (void);
/* Declare functions in rs6000-c.c */
--- gcc/config/rs6000/rs6000-c.c (revision 146119)
+++ gcc/config/rs6000/rs6000-c.c (revision 146798)
@@ -336,7 +336,20 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
if (TARGET_NO_LWSYNC)
builtin_define ("__NO_LWSYNC__");
if (TARGET_VSX)
- builtin_define ("__VSX__");
+ {
+ builtin_define ("__VSX__");
+
+ /* For the VSX builtin functions identical to Altivec functions, just map
+ the altivec builtin into the vsx version (the altivec functions
+ generate VSX code if -mvsx). */
+ builtin_define ("__builtin_vsx_xxland=__builtin_vec_and");
+ builtin_define ("__builtin_vsx_xxlandc=__builtin_vec_andc");
+ builtin_define ("__builtin_vsx_xxlnor=__builtin_vec_nor");
+ builtin_define ("__builtin_vsx_xxlor=__builtin_vec_or");
+ builtin_define ("__builtin_vsx_xxlxor=__builtin_vec_xor");
+ builtin_define ("__builtin_vsx_xxsel=__builtin_vec_sel");
+ builtin_define ("__builtin_vsx_vperm=__builtin_vec_perm");
+ }
/* May be overridden by target configuration. */
RS6000_CPU_CPP_ENDIAN_BUILTINS();
@@ -400,7 +413,7 @@ struct altivec_builtin_types
};
const struct altivec_builtin_types altivec_overloaded_builtins[] = {
- /* Unary AltiVec builtins. */
+ /* Unary AltiVec/VSX builtins. */
{ ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V16QI,
RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
{ ALTIVEC_BUILTIN_VEC_ABS, ALTIVEC_BUILTIN_ABS_V8HI,
@@ -496,7 +509,7 @@ const struct altivec_builtin_types altiv
{ ALTIVEC_BUILTIN_VEC_VUPKLSB, ALTIVEC_BUILTIN_VUPKLSB,
RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V16QI, 0, 0 },
- /* Binary AltiVec builtins. */
+ /* Binary AltiVec/VSX builtins. */
{ ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI, 0 },
{ ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
@@ -2206,7 +2219,7 @@ const struct altivec_builtin_types altiv
{ ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR,
RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
- /* Ternary AltiVec builtins. */
+ /* Ternary AltiVec/VSX builtins. */
{ ALTIVEC_BUILTIN_VEC_DST, ALTIVEC_BUILTIN_DST,
RS6000_BTI_void, ~RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, RS6000_BTI_INTSI },
{ ALTIVEC_BUILTIN_VEC_DST, ALTIVEC_BUILTIN_DST,
@@ -2407,6 +2420,10 @@ const struct altivec_builtin_types altiv
RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI },
{ ALTIVEC_BUILTIN_VEC_NMSUB, ALTIVEC_BUILTIN_VNMSUBFP,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
+ { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_unsigned_V16QI },
+ { ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_4SF,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_4SI,
@@ -2433,11 +2450,29 @@ const struct altivec_builtin_types altiv
RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_PERM, ALTIVEC_BUILTIN_VPERM_16QI,
RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_unsigned_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SF,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SF,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_unsigned_V4SI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SI },
+ { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI },
@@ -2805,6 +2840,37 @@ const struct altivec_builtin_types altiv
RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI },
{ ALTIVEC_BUILTIN_VEC_STVRXL, ALTIVEC_BUILTIN_STVRXL,
RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI,
+ RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI,
+ RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_8HI,
+ RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_8HI,
+ RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SI,
+ RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SI,
+ RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DI,
+ RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
+ RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_4SF,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DF,
+ RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DI,
+ RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_NOT_OPAQUE },
+ { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DI,
+ RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
+ RS6000_BTI_NOT_OPAQUE },
/* Predicates. */
{ ALTIVEC_BUILTIN_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTUB_P,
@@ -3108,6 +3174,10 @@ altivec_resolve_overloaded_builtin (tree
goto bad;
switch (TYPE_MODE (type))
{
+ case DImode:
+ type = (unsigned_p ? unsigned_V2DI_type_node : V2DI_type_node);
+ size = 2;
+ break;
case SImode:
type = (unsigned_p ? unsigned_V4SI_type_node : V4SI_type_node);
size = 4;
@@ -3121,6 +3191,7 @@ altivec_resolve_overloaded_builtin (tree
size = 16;
break;
case SFmode: type = V4SF_type_node; size = 4; break;
+ case DFmode: type = V2DF_type_node; size = 2; break;
default:
goto bad;
}
--- gcc/config/rs6000/rs6000.opt (revision 146119)
+++ gcc/config/rs6000/rs6000.opt (revision 146798)
@@ -119,18 +119,6 @@ mvsx
Target Report Mask(VSX)
Use vector/scalar (VSX) instructions
-mvsx-vector-memory
-Target Undocumented Report Var(TARGET_VSX_VECTOR_MEMORY) Init(-1)
-; If -mvsx, use VSX vector load/store instructions instead of Altivec instructions
-
-mvsx-vector-float
-Target Undocumented Report Var(TARGET_VSX_VECTOR_FLOAT) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for float vectors (on by default)
-
-mvsx-vector-double
-Target Undocumented Report Var(TARGET_VSX_VECTOR_DOUBLE) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for double vectors (on by default)
-
mvsx-scalar-double
Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1)
; If -mvsx, use VSX arithmetic instructions for scalar double (on by default)
@@ -139,18 +127,14 @@ mvsx-scalar-memory
Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY)
; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default)
-mvsx-v4sf-altivec-regs
-Target Undocumented Report Var(TARGET_V4SF_ALTIVEC_REGS) Init(-1)
-; If -mvsx, prefer V4SF types to use Altivec regs and not the floating registers
-
-mreload-functions
-Target Undocumented Report Var(TARGET_RELOAD_FUNCTIONS) Init(-1)
-; If -mvsx or -maltivec, enable reload functions
-
mpower7-adjust-cost
Target Undocumented Var(TARGET_POWER7_ADJUST_COST)
; Add extra cost for setting CR registers before a branch like is done for Power5
+mallow-timode
+Target Undocumented Var(TARGET_ALLOW_TIMODE)
+; Allow VSX/Altivec to target loading TImode variables.
+
mdisallow-float-in-lr-ctr
Target Undocumented Var(TARGET_DISALLOW_FLOAT_IN_LR_CTR) Init(-1)
; Disallow floating point in LR or CTR, causes some reload bugs
--- gcc/config/rs6000/rs6000.c (revision 146119)
+++ gcc/config/rs6000/rs6000.c (revision 146798)
@@ -130,6 +130,8 @@ typedef struct machine_function GTY(())
64-bits wide and is allocated early enough so that the offset
does not overflow the 16-bit load/store offset field. */
rtx sdmode_stack_slot;
+ /* Whether an indirect jump or table jump was generated. */
+ bool indirect_jump_p;
} machine_function;
/* Target cpu type */
@@ -917,6 +919,11 @@ static rtx rs6000_expand_binop_builtin (
static rtx rs6000_expand_ternop_builtin (enum insn_code, tree, rtx);
static rtx rs6000_expand_builtin (tree, rtx, rtx, enum machine_mode, int);
static void altivec_init_builtins (void);
+static unsigned builtin_hash_function (const void *);
+static int builtin_hash_eq (const void *, const void *);
+static tree builtin_function_type (enum machine_mode, enum machine_mode,
+ enum machine_mode, enum machine_mode,
+ const char *name);
static void rs6000_common_init_builtins (void);
static void rs6000_init_libfuncs (void);
@@ -1018,6 +1025,8 @@ static enum reg_class rs6000_secondary_r
enum machine_mode,
struct secondary_reload_info *);
+static const enum reg_class *rs6000_ira_cover_classes (void);
+
const int INSN_NOT_AVAILABLE = -1;
static enum machine_mode rs6000_eh_return_filter_mode (void);
@@ -1033,6 +1042,16 @@ struct toc_hash_struct GTY(())
};
static GTY ((param_is (struct toc_hash_struct))) htab_t toc_hash_table;
+
+/* Hash table to keep track of the argument types for builtin functions. */
+
+struct builtin_hash_struct GTY(())
+{
+ tree type;
+ enum machine_mode mode[4]; /* return value + 3 arguments */
+};
+
+static GTY ((param_is (struct builtin_hash_struct))) htab_t builtin_hash_table;
/* Default register names. */
char rs6000_reg_names[][8] =
@@ -1350,6 +1369,9 @@ static const char alt_reg_names[][8] =
#undef TARGET_SECONDARY_RELOAD
#define TARGET_SECONDARY_RELOAD rs6000_secondary_reload
+#undef TARGET_IRA_COVER_CLASSES
+#define TARGET_IRA_COVER_CLASSES rs6000_ira_cover_classes
+
struct gcc_target targetm = TARGET_INITIALIZER;
/* Return number of consecutive hard regs needed starting at reg REGNO
@@ -1370,7 +1392,7 @@ rs6000_hard_regno_nregs_internal (int re
unsigned HOST_WIDE_INT reg_size;
if (FP_REGNO_P (regno))
- reg_size = (VECTOR_UNIT_VSX_P (mode)
+ reg_size = (VECTOR_MEM_VSX_P (mode)
? UNITS_PER_VSX_WORD
: UNITS_PER_FP_WORD);
@@ -1452,7 +1474,7 @@ rs6000_hard_regno_mode_ok (int regno, en
/* AltiVec only in AldyVec registers. */
if (ALTIVEC_REGNO_P (regno))
- return VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode);
+ return VECTOR_MEM_ALTIVEC_OR_VSX_P (mode);
/* ...but GPRs can hold SIMD data on the SPE in one register. */
if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode))
@@ -1613,10 +1635,8 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reload[m][1] = CODE_FOR_nothing;
}
- /* TODO, add TI/V2DI mode for moving data if Altivec or VSX. */
-
/* V2DF mode, VSX only. */
- if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_DOUBLE)
+ if (float_p && TARGET_VSX)
{
rs6000_vector_unit[V2DFmode] = VECTOR_VSX;
rs6000_vector_mem[V2DFmode] = VECTOR_VSX;
@@ -1624,17 +1644,11 @@ rs6000_init_hard_regno_mode_ok (void)
}
/* V4SF mode, either VSX or Altivec. */
- if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_FLOAT)
+ if (float_p && TARGET_VSX)
{
rs6000_vector_unit[V4SFmode] = VECTOR_VSX;
- if (TARGET_VSX_VECTOR_MEMORY || !TARGET_ALTIVEC)
- {
- rs6000_vector_align[V4SFmode] = 32;
- rs6000_vector_mem[V4SFmode] = VECTOR_VSX;
- } else {
- rs6000_vector_align[V4SFmode] = 128;
- rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC;
- }
+ rs6000_vector_align[V4SFmode] = 32;
+ rs6000_vector_mem[V4SFmode] = VECTOR_VSX;
}
else if (float_p && TARGET_ALTIVEC)
{
@@ -1655,7 +1669,7 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reg_class[V8HImode] = ALTIVEC_REGS;
rs6000_vector_reg_class[V4SImode] = ALTIVEC_REGS;
- if (TARGET_VSX && TARGET_VSX_VECTOR_MEMORY)
+ if (TARGET_VSX)
{
rs6000_vector_mem[V4SImode] = VECTOR_VSX;
rs6000_vector_mem[V8HImode] = VECTOR_VSX;
@@ -1675,6 +1689,23 @@ rs6000_init_hard_regno_mode_ok (void)
}
}
+ /* V2DImode, prefer vsx over altivec, since the main use will be for
+ vectorized floating point conversions. */
+ if (float_p && TARGET_VSX)
+ {
+ rs6000_vector_mem[V2DImode] = VECTOR_VSX;
+ rs6000_vector_unit[V2DImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[V2DImode] = vsx_rc;
+ rs6000_vector_align[V2DImode] = 64;
+ }
+ else if (TARGET_ALTIVEC)
+ {
+ rs6000_vector_mem[V2DImode] = VECTOR_ALTIVEC;
+ rs6000_vector_unit[V2DImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[V2DImode] = ALTIVEC_REGS;
+ rs6000_vector_align[V2DImode] = 128;
+ }
+
/* DFmode, see if we want to use the VSX unit. */
if (float_p && TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE)
{
@@ -1684,16 +1715,30 @@ rs6000_init_hard_regno_mode_ok (void)
= (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE);
}
- /* TODO, add SPE and paired floating point vector support. */
+ /* TImode. Until this is debugged, only add it under switch control. */
+ if (TARGET_ALLOW_TIMODE)
+ {
+ if (float_p && TARGET_VSX)
+ {
+ rs6000_vector_mem[TImode] = VECTOR_VSX;
+ rs6000_vector_unit[TImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[TImode] = vsx_rc;
+ rs6000_vector_align[TImode] = 64;
+ }
+ else if (TARGET_ALTIVEC)
+ {
+ rs6000_vector_mem[TImode] = VECTOR_ALTIVEC;
+ rs6000_vector_unit[TImode] = VECTOR_NONE;
+ rs6000_vector_reg_class[TImode] = ALTIVEC_REGS;
+ rs6000_vector_align[TImode] = 128;
+ }
+ }
+
+ /* TODO add SPE and paired floating point vector support. */
/* Set the VSX register classes. */
-
- /* For V4SF, prefer the Altivec registers, because there are a few operations
- that want to use Altivec operations instead of VSX. */
rs6000_vector_reg_class[V4SFmode]
- = ((VECTOR_UNIT_VSX_P (V4SFmode)
- && VECTOR_MEM_VSX_P (V4SFmode)
- && !TARGET_V4SF_ALTIVEC_REGS)
+ = ((VECTOR_UNIT_VSX_P (V4SFmode) && VECTOR_MEM_VSX_P (V4SFmode))
? vsx_rc
: (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
? ALTIVEC_REGS
@@ -1712,7 +1757,7 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vsx_reg_class = (float_p && TARGET_VSX) ? vsx_rc : NO_REGS;
/* Set up the reload helper functions. */
- if (TARGET_RELOAD_FUNCTIONS && (TARGET_VSX || TARGET_ALTIVEC))
+ if (TARGET_VSX || TARGET_ALTIVEC)
{
if (TARGET_64BIT)
{
@@ -1728,6 +1773,11 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_di_load;
rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_di_store;
rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_di_load;
+ if (TARGET_ALLOW_TIMODE)
+ {
+ rs6000_vector_reload[TImode][0] = CODE_FOR_reload_ti_di_store;
+ rs6000_vector_reload[TImode][1] = CODE_FOR_reload_ti_di_load;
+ }
}
else
{
@@ -1743,6 +1793,11 @@ rs6000_init_hard_regno_mode_ok (void)
rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_si_load;
rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_si_store;
rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_si_load;
+ if (TARGET_ALLOW_TIMODE)
+ {
+ rs6000_vector_reload[TImode][0] = CODE_FOR_reload_ti_si_store;
+ rs6000_vector_reload[TImode][1] = CODE_FOR_reload_ti_si_load;
+ }
}
}
@@ -2132,23 +2187,29 @@ rs6000_override_options (const char *def
const char *msg = NULL;
if (!TARGET_HARD_FLOAT || !TARGET_FPRS
|| !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT)
- msg = "-mvsx requires hardware floating point";
+ {
+ if (target_flags_explicit & MASK_VSX)
+ msg = N_("-mvsx requires hardware floating point");
+ else
+ target_flags &= ~ MASK_VSX;
+ }
else if (TARGET_PAIRED_FLOAT)
- msg = "-mvsx and -mpaired are incompatible";
+ msg = N_("-mvsx and -mpaired are incompatible");
/* The hardware will allow VSX and little endian, but until we make sure
things like vector select, etc. work don't allow VSX on little endian
systems at this point. */
else if (!BYTES_BIG_ENDIAN)
- msg = "-mvsx used with little endian code";
+ msg = N_("-mvsx used with little endian code");
else if (TARGET_AVOID_XFORM > 0)
- msg = "-mvsx needs indexed addressing";
+ msg = N_("-mvsx needs indexed addressing");
if (msg)
{
warning (0, msg);
- target_flags &= MASK_VSX;
+ target_flags &= ~ MASK_VSX;
}
- else if (!TARGET_ALTIVEC && (target_flags_explicit & MASK_ALTIVEC) == 0)
+ else if (TARGET_VSX && !TARGET_ALTIVEC
+ && (target_flags_explicit & MASK_ALTIVEC) == 0)
target_flags |= MASK_ALTIVEC;
}
@@ -2581,8 +2642,8 @@ rs6000_builtin_conversion (enum tree_cod
return NULL_TREE;
return TYPE_UNSIGNED (type)
- ? rs6000_builtin_decls[VSX_BUILTIN_XVCVUXDSP]
- : rs6000_builtin_decls[VSX_BUILTIN_XVCVSXDSP];
+ ? rs6000_builtin_decls[VSX_BUILTIN_XVCVUXDDP]
+ : rs6000_builtin_decls[VSX_BUILTIN_XVCVSXDDP];
case V4SImode:
if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode))
@@ -3785,15 +3846,28 @@ rs6000_expand_vector_init (rtx target, r
}
}
- if (mode == V2DFmode)
+ if (VECTOR_MEM_VSX_P (mode) && (mode == V2DFmode || mode == V2DImode))
{
- gcc_assert (TARGET_VSX);
+ rtx (*splat) (rtx, rtx);
+ rtx (*concat) (rtx, rtx, rtx);
+
+ if (mode == V2DFmode)
+ {
+ splat = gen_vsx_splat_v2df;
+ concat = gen_vsx_concat_v2df;
+ }
+ else
+ {
+ splat = gen_vsx_splat_v2di;
+ concat = gen_vsx_concat_v2di;
+ }
+
if (all_same)
- emit_insn (gen_vsx_splatv2df (target, XVECEXP (vals, 0, 0)));
+ emit_insn (splat (target, XVECEXP (vals, 0, 0)));
else
- emit_insn (gen_vsx_concat_v2df (target,
- copy_to_reg (XVECEXP (vals, 0, 0)),
- copy_to_reg (XVECEXP (vals, 0, 1))));
+ emit_insn (concat (target,
+ copy_to_reg (XVECEXP (vals, 0, 0)),
+ copy_to_reg (XVECEXP (vals, 0, 1))));
return;
}
@@ -3856,10 +3930,12 @@ rs6000_expand_vector_set (rtx target, rt
int width = GET_MODE_SIZE (inner_mode);
int i;
- if (mode == V2DFmode)
+ if (mode == V2DFmode || mode == V2DImode)
{
+ rtx (*set_func) (rtx, rtx, rtx, rtx)
+ = ((mode == V2DFmode) ? gen_vsx_set_v2df : gen_vsx_set_v2di);
gcc_assert (TARGET_VSX);
- emit_insn (gen_vsx_set_v2df (target, val, target, GEN_INT (elt)));
+ emit_insn (set_func (target, val, target, GEN_INT (elt)));
return;
}
@@ -3900,10 +3976,12 @@ rs6000_expand_vector_extract (rtx target
enum machine_mode inner_mode = GET_MODE_INNER (mode);
rtx mem, x;
- if (mode == V2DFmode)
+ if (mode == V2DFmode || mode == V2DImode)
{
+ rtx (*extract_func) (rtx, rtx, rtx)
+ = ((mode == V2DFmode) ? gen_vsx_extract_v2df : gen_vsx_extract_v2di);
gcc_assert (TARGET_VSX);
- emit_insn (gen_vsx_extract_v2df (target, vec, GEN_INT (elt)));
+ emit_insn (extract_func (target, vec, GEN_INT (elt)));
return;
}
@@ -4323,9 +4401,7 @@ avoiding_indexed_address_p (enum machine
{
/* Avoid indexed addressing for modes that have non-indexed
load/store instruction forms. */
- return (TARGET_AVOID_XFORM
- && (!TARGET_ALTIVEC || !ALTIVEC_VECTOR_MODE (mode))
- && (!TARGET_VSX || !VSX_VECTOR_MODE (mode)));
+ return (TARGET_AVOID_XFORM && VECTOR_MEM_NONE_P (mode));
}
inline bool
@@ -4427,6 +4503,16 @@ rs6000_legitimize_address (rtx x, rtx ol
ret = rs6000_legitimize_tls_address (x, model);
}
+ else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
+ {
+ /* Make sure both operands are registers. */
+ if (GET_CODE (x) == PLUS)
+ ret = gen_rtx_PLUS (Pmode,
+ force_reg (Pmode, XEXP (x, 0)),
+ force_reg (Pmode, XEXP (x, 1)));
+ else
+ ret = force_reg (Pmode, x);
+ }
else if (GET_CODE (x) == PLUS
&& GET_CODE (XEXP (x, 0)) == REG
&& GET_CODE (XEXP (x, 1)) == CONST_INT
@@ -4436,8 +4522,6 @@ rs6000_legitimize_address (rtx x, rtx ol
&& (mode == DImode || mode == TImode)
&& (INTVAL (XEXP (x, 1)) & 3) != 0)
|| (TARGET_SPE && SPE_VECTOR_MODE (mode))
- || (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode))
- || (TARGET_VSX && VSX_VECTOR_MODE (mode))
|| (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
|| mode == DImode || mode == DDmode
|| mode == TDmode))))
@@ -4467,15 +4551,6 @@ rs6000_legitimize_address (rtx x, rtx ol
ret = gen_rtx_PLUS (Pmode, XEXP (x, 0),
force_reg (Pmode, force_operand (XEXP (x, 1), 0)));
}
- else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
- {
- /* Make sure both operands are registers. */
- if (GET_CODE (x) == PLUS)
- ret = gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)),
- force_reg (Pmode, XEXP (x, 1)));
- else
- ret = force_reg (Pmode, x);
- }
else if ((TARGET_SPE && SPE_VECTOR_MODE (mode))
|| (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
|| mode == DDmode || mode == TDmode
@@ -5113,7 +5188,7 @@ rs6000_legitimate_address (enum machine_
ret = 1;
else if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict))
ret = 1;
- else if (mode != TImode
+ else if ((mode != TImode || !VECTOR_MEM_NONE_P (TImode))
&& mode != TFmode
&& mode != TDmode
&& ((TARGET_HARD_FLOAT && TARGET_FPRS)
@@ -5953,7 +6028,13 @@ rs6000_emit_move (rtx dest, rtx source,
case TImode:
if (VECTOR_MEM_ALTIVEC_OR_VSX_P (TImode))
- break;
+ {
+ if (CONSTANT_P (operands[1])
+ && !easy_vector_constant (operands[1], mode))
+ operands[1] = force_const_mem (mode, operands[1]);
+
+ break;
+ }
rs6000_eliminate_indexed_memrefs (operands);
@@ -7869,7 +7950,8 @@ def_builtin (int mask, const char *name,
if ((mask & target_flags) || TARGET_PAIRED_FLOAT)
{
if (rs6000_builtin_decls[code])
- abort ();
+ fatal_error ("internal error: builtin function to %s already processed.",
+ name);
rs6000_builtin_decls[code] =
add_builtin_function (name, type, code, BUILT_IN_MD,
@@ -7934,6 +8016,34 @@ static const struct builtin_description
{ MASK_VSX, CODE_FOR_vsx_fnmaddv4sf4, "__builtin_vsx_xvnmaddsp", VSX_BUILTIN_XVNMADDSP },
{ MASK_VSX, CODE_FOR_vsx_fnmsubv4sf4, "__builtin_vsx_xvnmsubsp", VSX_BUILTIN_XVNMSUBSP },
+ { MASK_VSX, CODE_FOR_vector_vselv2di, "__builtin_vsx_xxsel_2di", VSX_BUILTIN_XXSEL_2DI },
+ { MASK_VSX, CODE_FOR_vector_vselv2df, "__builtin_vsx_xxsel_2df", VSX_BUILTIN_XXSEL_2DF },
+ { MASK_VSX, CODE_FOR_vector_vselv4sf, "__builtin_vsx_xxsel_4sf", VSX_BUILTIN_XXSEL_4SF },
+ { MASK_VSX, CODE_FOR_vector_vselv4si, "__builtin_vsx_xxsel_4si", VSX_BUILTIN_XXSEL_4SI },
+ { MASK_VSX, CODE_FOR_vector_vselv8hi, "__builtin_vsx_xxsel_8hi", VSX_BUILTIN_XXSEL_8HI },
+ { MASK_VSX, CODE_FOR_vector_vselv16qi, "__builtin_vsx_xxsel_16qi", VSX_BUILTIN_XXSEL_16QI },
+
+ { MASK_VSX, CODE_FOR_altivec_vperm_v2di, "__builtin_vsx_vperm_2di", VSX_BUILTIN_VPERM_2DI },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v2df, "__builtin_vsx_vperm_2df", VSX_BUILTIN_VPERM_2DF },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v4sf, "__builtin_vsx_vperm_4sf", VSX_BUILTIN_VPERM_4SF },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v4si, "__builtin_vsx_vperm_4si", VSX_BUILTIN_VPERM_4SI },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v8hi, "__builtin_vsx_vperm_8hi", VSX_BUILTIN_VPERM_8HI },
+ { MASK_VSX, CODE_FOR_altivec_vperm_v16qi, "__builtin_vsx_vperm_16qi", VSX_BUILTIN_VPERM_16QI },
+
+ { MASK_VSX, CODE_FOR_vsx_xxpermdi_v2df, "__builtin_vsx_xxpermdi_2df", VSX_BUILTIN_XXPERMDI_2DF },
+ { MASK_VSX, CODE_FOR_vsx_xxpermdi_v2di, "__builtin_vsx_xxpermdi_2di", VSX_BUILTIN_XXPERMDI_2DI },
+ { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxpermdi", VSX_BUILTIN_VEC_XXPERMDI },
+ { MASK_VSX, CODE_FOR_vsx_set_v2df, "__builtin_vsx_set_2df", VSX_BUILTIN_SET_2DF },
+ { MASK_VSX, CODE_FOR_vsx_set_v2di, "__builtin_vsx_set_2di", VSX_BUILTIN_SET_2DI },
+
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v2di, "__builtin_vsx_xxsldwi_2di", VSX_BUILTIN_XXSLDWI_2DI },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v2df, "__builtin_vsx_xxsldwi_2df", VSX_BUILTIN_XXSLDWI_2DF },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v4sf, "__builtin_vsx_xxsldwi_4sf", VSX_BUILTIN_XXSLDWI_4SF },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v4si, "__builtin_vsx_xxsldwi_4si", VSX_BUILTIN_XXSLDWI_4SI },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v8hi, "__builtin_vsx_xxsldwi_8hi", VSX_BUILTIN_XXSLDWI_8HI },
+ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v16qi, "__builtin_vsx_xxsldwi_16qi", VSX_BUILTIN_XXSLDWI_16QI },
+ { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxsldwi", VSX_BUILTIN_VEC_XXSLDWI },
+
{ 0, CODE_FOR_paired_msub, "__builtin_paired_msub", PAIRED_BUILTIN_MSUB },
{ 0, CODE_FOR_paired_madd, "__builtin_paired_madd", PAIRED_BUILTIN_MADD },
{ 0, CODE_FOR_paired_madds0, "__builtin_paired_madds0", PAIRED_BUILTIN_MADDS0 },
@@ -8083,6 +8193,9 @@ static struct builtin_description bdesc_
{ MASK_VSX, CODE_FOR_sminv2df3, "__builtin_vsx_xvmindp", VSX_BUILTIN_XVMINDP },
{ MASK_VSX, CODE_FOR_smaxv2df3, "__builtin_vsx_xvmaxdp", VSX_BUILTIN_XVMAXDP },
{ MASK_VSX, CODE_FOR_vsx_tdivv2df3, "__builtin_vsx_xvtdivdp", VSX_BUILTIN_XVTDIVDP },
+ { MASK_VSX, CODE_FOR_vector_eqv2df, "__builtin_vsx_xvcmpeqdp", VSX_BUILTIN_XVCMPEQDP },
+ { MASK_VSX, CODE_FOR_vector_gtv2df, "__builtin_vsx_xvcmpgtdp", VSX_BUILTIN_XVCMPGTDP },
+ { MASK_VSX, CODE_FOR_vector_gev2df, "__builtin_vsx_xvcmpgedp", VSX_BUILTIN_XVCMPGEDP },
{ MASK_VSX, CODE_FOR_addv4sf3, "__builtin_vsx_xvaddsp", VSX_BUILTIN_XVADDSP },
{ MASK_VSX, CODE_FOR_subv4sf3, "__builtin_vsx_xvsubsp", VSX_BUILTIN_XVSUBSP },
@@ -8091,6 +8204,21 @@ static struct builtin_description bdesc_
{ MASK_VSX, CODE_FOR_sminv4sf3, "__builtin_vsx_xvminsp", VSX_BUILTIN_XVMINSP },
{ MASK_VSX, CODE_FOR_smaxv4sf3, "__builtin_vsx_xvmaxsp", VSX_BUILTIN_XVMAXSP },
{ MASK_VSX, CODE_FOR_vsx_tdivv4sf3, "__builtin_vsx_xvtdivsp", VSX_BUILTIN_XVTDIVSP },
+ { MASK_VSX, CODE_FOR_vector_eqv4sf, "__builtin_vsx_xvcmpeqsp", VSX_BUILTIN_XVCMPEQSP },
+ { MASK_VSX, CODE_FOR_vector_gtv4sf, "__builtin_vsx_xvcmpgtsp", VSX_BUILTIN_XVCMPGTSP },
+ { MASK_VSX, CODE_FOR_vector_gev4sf, "__builtin_vsx_xvcmpgesp", VSX_BUILTIN_XVCMPGESP },
+
+ { MASK_VSX, CODE_FOR_smindf3, "__builtin_vsx_xsmindp", VSX_BUILTIN_XSMINDP },
+ { MASK_VSX, CODE_FOR_smaxdf3, "__builtin_vsx_xsmaxdp", VSX_BUILTIN_XSMAXDP },
+
+ { MASK_VSX, CODE_FOR_vsx_concat_v2df, "__builtin_vsx_concat_2df", VSX_BUILTIN_CONCAT_2DF },
+ { MASK_VSX, CODE_FOR_vsx_concat_v2di, "__builtin_vsx_concat_2di", VSX_BUILTIN_CONCAT_2DI },
+ { MASK_VSX, CODE_FOR_vsx_splat_v2df, "__builtin_vsx_splat_2df", VSX_BUILTIN_SPLAT_2DF },
+ { MASK_VSX, CODE_FOR_vsx_splat_v2di, "__builtin_vsx_splat_2di", VSX_BUILTIN_SPLAT_2DI },
+ { MASK_VSX, CODE_FOR_vsx_xxmrghw_v4sf, "__builtin_vsx_xxmrghw", VSX_BUILTIN_XXMRGHW_4SF },
+ { MASK_VSX, CODE_FOR_vsx_xxmrghw_v4si, "__builtin_vsx_xxmrghw_4si", VSX_BUILTIN_XXMRGHW_4SI },
+ { MASK_VSX, CODE_FOR_vsx_xxmrglw_v4sf, "__builtin_vsx_xxmrglw", VSX_BUILTIN_XXMRGLW_4SF },
+ { MASK_VSX, CODE_FOR_vsx_xxmrglw_v4si, "__builtin_vsx_xxmrglw_4si", VSX_BUILTIN_XXMRGLW_4SI },
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_add", ALTIVEC_BUILTIN_VEC_ADD },
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vaddfp", ALTIVEC_BUILTIN_VEC_VADDFP },
@@ -8508,6 +8636,47 @@ static struct builtin_description bdesc_
{ MASK_VSX, CODE_FOR_vsx_tsqrtv4sf2, "__builtin_vsx_xvtsqrtsp", VSX_BUILTIN_XVTSQRTSP },
{ MASK_VSX, CODE_FOR_vsx_frev4sf2, "__builtin_vsx_xvresp", VSX_BUILTIN_XVRESP },
+ { MASK_VSX, CODE_FOR_vsx_xscvdpsp, "__builtin_vsx_xscvdpsp", VSX_BUILTIN_XSCVDPSP },
+ { MASK_VSX, CODE_FOR_vsx_xscvdpsp, "__builtin_vsx_xscvspdp", VSX_BUILTIN_XSCVSPDP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvdpsp, "__builtin_vsx_xvcvdpsp", VSX_BUILTIN_XVCVDPSP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvspdp, "__builtin_vsx_xvcvspdp", VSX_BUILTIN_XVCVSPDP },
+
+ { MASK_VSX, CODE_FOR_vsx_fix_truncv2dfv2di2, "__builtin_vsx_xvcvdpsxds", VSX_BUILTIN_XVCVDPSXDS },
+ { MASK_VSX, CODE_FOR_vsx_fixuns_truncv2dfv2di2, "__builtin_vsx_xvcvdpuxds", VSX_BUILTIN_XVCVDPUXDS },
+ { MASK_VSX, CODE_FOR_vsx_floatv2div2df2, "__builtin_vsx_xvcvsxddp", VSX_BUILTIN_XVCVSXDDP },
+ { MASK_VSX, CODE_FOR_vsx_floatunsv2div2df2, "__builtin_vsx_xvcvuxddp", VSX_BUILTIN_XVCVUXDDP },
+
+ { MASK_VSX, CODE_FOR_vsx_fix_truncv4sfv4si2, "__builtin_vsx_xvcvspsxws", VSX_BUILTIN_XVCVSPSXWS },
+ { MASK_VSX, CODE_FOR_vsx_fixuns_truncv4sfv4si2, "__builtin_vsx_xvcvspuxws", VSX_BUILTIN_XVCVSPUXWS },
+ { MASK_VSX, CODE_FOR_vsx_floatv4siv4sf2, "__builtin_vsx_xvcvsxwsp", VSX_BUILTIN_XVCVSXWSP },
+ { MASK_VSX, CODE_FOR_vsx_floatunsv4siv4sf2, "__builtin_vsx_xvcvuxwsp", VSX_BUILTIN_XVCVUXWSP },
+
+ { MASK_VSX, CODE_FOR_vsx_xvcvdpsxws, "__builtin_vsx_xvcvdpsxws", VSX_BUILTIN_XVCVDPSXWS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvdpuxws, "__builtin_vsx_xvcvdpuxws", VSX_BUILTIN_XVCVDPUXWS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvsxwdp, "__builtin_vsx_xvcvsxwdp", VSX_BUILTIN_XVCVSXWDP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvuxwdp, "__builtin_vsx_xvcvuxwdp", VSX_BUILTIN_XVCVUXWDP },
+ { MASK_VSX, CODE_FOR_vsx_xvrdpi, "__builtin_vsx_xvrdpi", VSX_BUILTIN_XVRDPI },
+ { MASK_VSX, CODE_FOR_vsx_xvrdpic, "__builtin_vsx_xvrdpic", VSX_BUILTIN_XVRDPIC },
+ { MASK_VSX, CODE_FOR_vsx_floorv2df2, "__builtin_vsx_xvrdpim", VSX_BUILTIN_XVRDPIM },
+ { MASK_VSX, CODE_FOR_vsx_ceilv2df2, "__builtin_vsx_xvrdpip", VSX_BUILTIN_XVRDPIP },
+ { MASK_VSX, CODE_FOR_vsx_btruncv2df2, "__builtin_vsx_xvrdpiz", VSX_BUILTIN_XVRDPIZ },
+
+ { MASK_VSX, CODE_FOR_vsx_xvcvspsxds, "__builtin_vsx_xvcvspsxds", VSX_BUILTIN_XVCVSPSXDS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvspuxds, "__builtin_vsx_xvcvspuxds", VSX_BUILTIN_XVCVSPUXDS },
+ { MASK_VSX, CODE_FOR_vsx_xvcvsxdsp, "__builtin_vsx_xvcvsxdsp", VSX_BUILTIN_XVCVSXDSP },
+ { MASK_VSX, CODE_FOR_vsx_xvcvuxdsp, "__builtin_vsx_xvcvuxdsp", VSX_BUILTIN_XVCVUXDSP },
+ { MASK_VSX, CODE_FOR_vsx_xvrspi, "__builtin_vsx_xvrspi", VSX_BUILTIN_XVRSPI },
+ { MASK_VSX, CODE_FOR_vsx_xvrspic, "__builtin_vsx_xvrspic", VSX_BUILTIN_XVRSPIC },
+ { MASK_VSX, CODE_FOR_vsx_floorv4sf2, "__builtin_vsx_xvrspim", VSX_BUILTIN_XVRSPIM },
+ { MASK_VSX, CODE_FOR_vsx_ceilv4sf2, "__builtin_vsx_xvrspip", VSX_BUILTIN_XVRSPIP },
+ { MASK_VSX, CODE_FOR_vsx_btruncv4sf2, "__builtin_vsx_xvrspiz", VSX_BUILTIN_XVRSPIZ },
+
+ { MASK_VSX, CODE_FOR_vsx_xsrdpi, "__builtin_vsx_xsrdpi", VSX_BUILTIN_XSRDPI },
+ { MASK_VSX, CODE_FOR_vsx_xsrdpic, "__builtin_vsx_xsrdpic", VSX_BUILTIN_XSRDPIC },
+ { MASK_VSX, CODE_FOR_vsx_floordf2, "__builtin_vsx_xsrdpim", VSX_BUILTIN_XSRDPIM },
+ { MASK_VSX, CODE_FOR_vsx_ceildf2, "__builtin_vsx_xsrdpip", VSX_BUILTIN_XSRDPIP },
+ { MASK_VSX, CODE_FOR_vsx_btruncdf2, "__builtin_vsx_xsrdpiz", VSX_BUILTIN_XSRDPIZ },
+
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abs", ALTIVEC_BUILTIN_VEC_ABS },
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abss", ALTIVEC_BUILTIN_VEC_ABSS },
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_ceil", ALTIVEC_BUILTIN_VEC_CEIL },
@@ -8533,15 +8702,6 @@ static struct builtin_description bdesc_
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vec_fix_sfsi", VECTOR_BUILTIN_FIX_V4SF_V4SI },
{ MASK_ALTIVEC|MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vec_fixuns_sfsi", VECTOR_BUILTIN_FIXUNS_V4SF_V4SI },
- { MASK_VSX, CODE_FOR_floatv2div2df2, "__builtin_vsx_xvcvsxddp", VSX_BUILTIN_XVCVSXDDP },
- { MASK_VSX, CODE_FOR_unsigned_floatv2div2df2, "__builtin_vsx_xvcvuxddp", VSX_BUILTIN_XVCVUXDDP },
- { MASK_VSX, CODE_FOR_fix_truncv2dfv2di2, "__builtin_vsx_xvdpsxds", VSX_BUILTIN_XVCVDPSXDS },
- { MASK_VSX, CODE_FOR_fixuns_truncv2dfv2di2, "__builtin_vsx_xvdpuxds", VSX_BUILTIN_XVCVDPUXDS },
- { MASK_VSX, CODE_FOR_floatv4siv4sf2, "__builtin_vsx_xvcvsxwsp", VSX_BUILTIN_XVCVSXDSP },
- { MASK_VSX, CODE_FOR_unsigned_floatv4siv4sf2, "__builtin_vsx_xvcvuxwsp", VSX_BUILTIN_XVCVUXWSP },
- { MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vsx_xvspsxws", VSX_BUILTIN_XVCVSPSXWS },
- { MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vsx_xvspuxws", VSX_BUILTIN_XVCVSPUXWS },
-
/* The SPE unary builtins must start with SPE_BUILTIN_EVABS and
end with SPE_BUILTIN_EVSUBFUSIAAW. */
{ 0, CODE_FOR_spe_evabs, "__builtin_spe_evabs", SPE_BUILTIN_EVABS },
@@ -9046,11 +9206,12 @@ rs6000_expand_ternop_builtin (enum insn_
|| arg2 == error_mark_node)
return const0_rtx;
- if (icode == CODE_FOR_altivec_vsldoi_v4sf
- || icode == CODE_FOR_altivec_vsldoi_v4si
- || icode == CODE_FOR_altivec_vsldoi_v8hi
- || icode == CODE_FOR_altivec_vsldoi_v16qi)
+ switch (icode)
{
+ case CODE_FOR_altivec_vsldoi_v4sf:
+ case CODE_FOR_altivec_vsldoi_v4si:
+ case CODE_FOR_altivec_vsldoi_v8hi:
+ case CODE_FOR_altivec_vsldoi_v16qi:
/* Only allow 4-bit unsigned literals. */
STRIP_NOPS (arg2);
if (TREE_CODE (arg2) != INTEGER_CST
@@ -9059,6 +9220,40 @@ rs6000_expand_ternop_builtin (enum insn_
error ("argument 3 must be a 4-bit unsigned literal");
return const0_rtx;
}
+ break;
+
+ case CODE_FOR_vsx_xxpermdi_v2df:
+ case CODE_FOR_vsx_xxpermdi_v2di:
+ case CODE_FOR_vsx_xxsldwi_v16qi:
+ case CODE_FOR_vsx_xxsldwi_v8hi:
+ case CODE_FOR_vsx_xxsldwi_v4si:
+ case CODE_FOR_vsx_xxsldwi_v4sf:
+ case CODE_FOR_vsx_xxsldwi_v2di:
+ case CODE_FOR_vsx_xxsldwi_v2df:
+ /* Only allow 2-bit unsigned literals. */
+ STRIP_NOPS (arg2);
+ if (TREE_CODE (arg2) != INTEGER_CST
+ || TREE_INT_CST_LOW (arg2) & ~0x3)
+ {
+ error ("argument 3 must be a 2-bit unsigned literal");
+ return const0_rtx;
+ }
+ break;
+
+ case CODE_FOR_vsx_set_v2df:
+ case CODE_FOR_vsx_set_v2di:
+ /* Only allow 1-bit unsigned literals. */
+ STRIP_NOPS (arg2);
+ if (TREE_CODE (arg2) != INTEGER_CST
+ || TREE_INT_CST_LOW (arg2) & ~0x1)
+ {
+ error ("argument 3 must be a 1-bit unsigned literal");
+ return const0_rtx;
+ }
+ break;
+
+ default:
+ break;
}
if (target == 0
@@ -9366,8 +9561,10 @@ altivec_expand_builtin (tree exp, rtx ta
enum machine_mode tmode, mode0;
unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
- if (fcode >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && fcode <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ if ((fcode >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && fcode <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (fcode >= VSX_BUILTIN_OVERLOADED_FIRST
+ && fcode <= VSX_BUILTIN_OVERLOADED_LAST))
{
*expandedp = true;
error ("unresolved overload for Altivec builtin %qF", fndecl);
@@ -10156,6 +10353,7 @@ rs6000_init_builtins (void)
unsigned_V16QI_type_node = build_vector_type (unsigned_intQI_type_node, 16);
unsigned_V8HI_type_node = build_vector_type (unsigned_intHI_type_node, 8);
unsigned_V4SI_type_node = build_vector_type (unsigned_intSI_type_node, 4);
+ unsigned_V2DI_type_node = build_vector_type (unsigned_intDI_type_node, 2);
opaque_V2SF_type_node = build_opaque_vector_type (float_type_node, 2);
opaque_V2SI_type_node = build_opaque_vector_type (intSI_type_node, 2);
@@ -10169,6 +10367,7 @@ rs6000_init_builtins (void)
bool_char_type_node = build_distinct_type_copy (unsigned_intQI_type_node);
bool_short_type_node = build_distinct_type_copy (unsigned_intHI_type_node);
bool_int_type_node = build_distinct_type_copy (unsigned_intSI_type_node);
+ bool_long_type_node = build_distinct_type_copy (unsigned_intDI_type_node);
pixel_type_node = build_distinct_type_copy (unsigned_intHI_type_node);
long_integer_type_internal_node = long_integer_type_node;
@@ -10201,6 +10400,7 @@ rs6000_init_builtins (void)
bool_V16QI_type_node = build_vector_type (bool_char_type_node, 16);
bool_V8HI_type_node = build_vector_type (bool_short_type_node, 8);
bool_V4SI_type_node = build_vector_type (bool_int_type_node, 4);
+ bool_V2DI_type_node = build_vector_type (bool_long_type_node, 2);
pixel_V8HI_type_node = build_vector_type (pixel_type_node, 8);
(*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
@@ -10241,9 +10441,17 @@ rs6000_init_builtins (void)
pixel_V8HI_type_node));
if (TARGET_VSX)
- (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
- get_identifier ("__vector double"),
- V2DF_type_node));
+ {
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector double"),
+ V2DF_type_node));
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector long"),
+ V2DI_type_node));
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector __bool long"),
+ bool_V2DI_type_node));
+ }
if (TARGET_PAIRED_FLOAT)
paired_init_builtins ();
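With the declarations above in place, user code compiled with -mvsx can use the new type names; a hedged sketch (hypothetical variables):

    /* Requires -mvsx; the type names come from the pushdecl calls above.  */
    __vector double vd;        /* V2DFmode */
    __vector long vl;          /* V2DImode */
    __vector __bool long vbl;  /* bool_V2DI_type_node */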
@@ -10818,8 +11026,10 @@ altivec_init_builtins (void)
{
enum machine_mode mode1;
tree type;
- bool is_overloaded = dp->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && dp->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ bool is_overloaded = ((dp->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && dp->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (dp->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && dp->code <= VSX_BUILTIN_OVERLOADED_LAST));
if (is_overloaded)
mode1 = VOIDmode;
@@ -10982,592 +11192,302 @@ altivec_init_builtins (void)
ALTIVEC_BUILTIN_VEC_EXT_V4SF);
}
-static void
-rs6000_common_init_builtins (void)
+/* Hash function for builtin functions with up to 3 arguments and a return
+ type. */
+static unsigned
+builtin_hash_function (const void *hash_entry)
{
- const struct builtin_description *d;
- size_t i;
+ unsigned ret = 0;
+ int i;
+ const struct builtin_hash_struct *bh =
+ (const struct builtin_hash_struct *) hash_entry;
- tree v2sf_ftype_v2sf_v2sf_v2sf
- = build_function_type_list (V2SF_type_node,
- V2SF_type_node, V2SF_type_node,
- V2SF_type_node, NULL_TREE);
-
- tree v4sf_ftype_v4sf_v4sf_v16qi
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si_v16qi
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi_v16qi
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_v16qi_v16qi
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, V16QI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v4si_ftype_int
- = build_function_type_list (V4SI_type_node, integer_type_node, NULL_TREE);
- tree v8hi_ftype_int
- = build_function_type_list (V8HI_type_node, integer_type_node, NULL_TREE);
- tree v16qi_ftype_int
- = build_function_type_list (V16QI_type_node, integer_type_node, NULL_TREE);
- tree v8hi_ftype_v16qi
- = build_function_type_list (V8HI_type_node, V16QI_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf
- = build_function_type_list (V4SF_type_node, V4SF_type_node, NULL_TREE);
+ for (i = 0; i < 4; i++)
+ ret = (ret * (unsigned)MAX_MACHINE_MODE) + ((unsigned)bh->mode[i]);
- tree v2si_ftype_v2si_v2si
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SI_type_node,
- opaque_V2SI_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf_v2sf_spe
- = build_function_type_list (opaque_V2SF_type_node,
- opaque_V2SF_type_node,
- opaque_V2SF_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf_v2sf
- = build_function_type_list (V2SF_type_node,
- V2SF_type_node,
- V2SF_type_node, NULL_TREE);
-
-
- tree v2si_ftype_int_int
- = build_function_type_list (opaque_V2SI_type_node,
- integer_type_node, integer_type_node,
- NULL_TREE);
+ return ret;
+}
- tree opaque_ftype_opaque
- = build_function_type_list (opaque_V4SI_type_node,
- opaque_V4SI_type_node, NULL_TREE);
+/* Compare builtin hash entries H1 and H2 for equivalence. */
+static int
+builtin_hash_eq (const void *h1, const void *h2)
+{
+ const struct builtin_hash_struct *p1 = (const struct builtin_hash_struct *) h1;
+ const struct builtin_hash_struct *p2 = (const struct builtin_hash_struct *) h2;
- tree v2si_ftype_v2si
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SI_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf_spe
- = build_function_type_list (opaque_V2SF_type_node,
- opaque_V2SF_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2sf
- = build_function_type_list (V2SF_type_node,
- V2SF_type_node, NULL_TREE);
-
- tree v2sf_ftype_v2si
- = build_function_type_list (opaque_V2SF_type_node,
- opaque_V2SI_type_node, NULL_TREE);
-
- tree v2si_ftype_v2sf
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SF_type_node, NULL_TREE);
-
- tree v2si_ftype_v2si_char
- = build_function_type_list (opaque_V2SI_type_node,
- opaque_V2SI_type_node,
- char_type_node, NULL_TREE);
-
- tree v2si_ftype_int_char
- = build_function_type_list (opaque_V2SI_type_node,
- integer_type_node, char_type_node, NULL_TREE);
-
- tree v2si_ftype_char
- = build_function_type_list (opaque_V2SI_type_node,
- char_type_node, NULL_TREE);
+ return ((p1->mode[0] == p2->mode[0])
+ && (p1->mode[1] == p2->mode[1])
+ && (p1->mode[2] == p2->mode[2])
+ && (p1->mode[3] == p2->mode[3]));
+}
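A minimal standalone sketch of the mode-tuple hashing scheme, with a stand-in mode count (the real code hashes enum machine_mode values against MAX_MACHINE_MODE):

    /* The four modes act as digits of a base-MAX_MACHINE_MODE number,
       so distinct (return, arg0, arg1, arg2) tuples hash apart.  */
    #define MAX_MODE_SKETCH 256         /* stand-in for MAX_MACHINE_MODE */

    struct builtin_hash_sketch
    {
      int mode[4];                      /* return type plus up to 3 arguments */
    };

    static unsigned
    sketch_hash (const struct builtin_hash_sketch *bh)
    {
      unsigned ret = 0;
      int i;

      for (i = 0; i < 4; i++)
        ret = ret * (unsigned) MAX_MODE_SKETCH + (unsigned) bh->mode[i];
      return ret;
    }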
- tree int_ftype_int_int
- = build_function_type_list (integer_type_node,
- integer_type_node, integer_type_node,
- NULL_TREE);
+/* Map selected modes to types for builtins. */
+static tree builtin_mode_to_type[MAX_MACHINE_MODE];
- tree opaque_ftype_opaque_opaque
- = build_function_type_list (opaque_V4SI_type_node,
- opaque_V4SI_type_node, opaque_V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node, NULL_TREE);
- tree v4sf_ftype_v4si_int
- = build_function_type_list (V4SF_type_node,
- V4SI_type_node, integer_type_node, NULL_TREE);
- tree v4si_ftype_v4sf_int
- = build_function_type_list (V4SI_type_node,
- V4SF_type_node, integer_type_node, NULL_TREE);
- tree v4si_ftype_v4si_int
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, integer_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_int
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, integer_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_int
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, integer_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_v16qi_int
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, V16QI_type_node,
- integer_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi_int
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node,
- integer_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si_int
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node,
- integer_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf_int
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- integer_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node, NULL_TREE);
- tree opaque_ftype_opaque_opaque_opaque
- = build_function_type_list (opaque_V4SI_type_node,
- opaque_V4SI_type_node, opaque_V4SI_type_node,
- opaque_V4SI_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf_v4si
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- V4SI_type_node, NULL_TREE);
- tree v4sf_ftype_v4sf_v4sf_v4sf
- = build_function_type_list (V4SF_type_node,
- V4SF_type_node, V4SF_type_node,
- V4SF_type_node, NULL_TREE);
- tree v4si_ftype_v4si_v4si_v4si
- = build_function_type_list (V4SI_type_node,
- V4SI_type_node, V4SI_type_node,
- V4SI_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v8hi_ftype_v8hi_v8hi_v8hi
- = build_function_type_list (V8HI_type_node,
- V8HI_type_node, V8HI_type_node,
- V8HI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi_v8hi_v4si
- = build_function_type_list (V4SI_type_node,
- V8HI_type_node, V8HI_type_node,
- V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v16qi_v16qi_v4si
- = build_function_type_list (V4SI_type_node,
- V16QI_type_node, V16QI_type_node,
- V4SI_type_node, NULL_TREE);
- tree v16qi_ftype_v16qi_v16qi
- = build_function_type_list (V16QI_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v4sf_v4sf
- = build_function_type_list (V4SI_type_node,
- V4SF_type_node, V4SF_type_node, NULL_TREE);
- tree v8hi_ftype_v16qi_v16qi
- = build_function_type_list (V8HI_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi_v8hi
- = build_function_type_list (V4SI_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v8hi_ftype_v4si_v4si
- = build_function_type_list (V8HI_type_node,
- V4SI_type_node, V4SI_type_node, NULL_TREE);
- tree v16qi_ftype_v8hi_v8hi
- = build_function_type_list (V16QI_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v4si_ftype_v16qi_v4si
- = build_function_type_list (V4SI_type_node,
- V16QI_type_node, V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v16qi_v16qi
- = build_function_type_list (V4SI_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi_v4si
- = build_function_type_list (V4SI_type_node,
- V8HI_type_node, V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v8hi
- = build_function_type_list (V4SI_type_node, V8HI_type_node, NULL_TREE);
- tree int_ftype_v4si_v4si
- = build_function_type_list (integer_type_node,
- V4SI_type_node, V4SI_type_node, NULL_TREE);
- tree int_ftype_v4sf_v4sf
- = build_function_type_list (integer_type_node,
- V4SF_type_node, V4SF_type_node, NULL_TREE);
- tree int_ftype_v16qi_v16qi
- = build_function_type_list (integer_type_node,
- V16QI_type_node, V16QI_type_node, NULL_TREE);
- tree int_ftype_v8hi_v8hi
- = build_function_type_list (integer_type_node,
- V8HI_type_node, V8HI_type_node, NULL_TREE);
- tree v2di_ftype_v2df
- = build_function_type_list (V2DI_type_node,
- V2DF_type_node, NULL_TREE);
- tree v2df_ftype_v2df
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, NULL_TREE);
- tree v2df_ftype_v2di
- = build_function_type_list (V2DF_type_node,
- V2DI_type_node, NULL_TREE);
- tree v2df_ftype_v2df_v2df
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, V2DF_type_node, NULL_TREE);
- tree v2df_ftype_v2df_v2df_v2df
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, V2DF_type_node,
- V2DF_type_node, NULL_TREE);
- tree v2di_ftype_v2di_v2di_v2di
- = build_function_type_list (V2DI_type_node,
- V2DI_type_node, V2DI_type_node,
- V2DI_type_node, NULL_TREE);
- tree v2df_ftype_v2df_v2df_v16qi
- = build_function_type_list (V2DF_type_node,
- V2DF_type_node, V2DF_type_node,
- V16QI_type_node, NULL_TREE);
- tree v2di_ftype_v2di_v2di_v16qi
- = build_function_type_list (V2DI_type_node,
- V2DI_type_node, V2DI_type_node,
- V16QI_type_node, NULL_TREE);
- tree v4sf_ftype_v4si
- = build_function_type_list (V4SF_type_node, V4SI_type_node, NULL_TREE);
- tree v4si_ftype_v4sf
- = build_function_type_list (V4SI_type_node, V4SF_type_node, NULL_TREE);
+/* Map types for builtin functions with an explicit return type and up to 3
+   arguments.  Functions with fewer than 3 arguments use VOIDmode as the mode
+   of the missing arguments.  */
+static tree
+builtin_function_type (enum machine_mode mode_ret, enum machine_mode mode_arg0,
+ enum machine_mode mode_arg1, enum machine_mode mode_arg2,
+ const char *name)
+{
+ struct builtin_hash_struct h;
+ struct builtin_hash_struct *h2;
+ void **found;
+ int num_args = 3;
+ int i;
- /* Add the simple ternary operators. */
+ /* Create builtin_hash_table. */
+ if (builtin_hash_table == NULL)
+ builtin_hash_table = htab_create_ggc (1500, builtin_hash_function,
+ builtin_hash_eq, NULL);
+
+ h.type = NULL_TREE;
+ h.mode[0] = mode_ret;
+ h.mode[1] = mode_arg0;
+ h.mode[2] = mode_arg1;
+ h.mode[3] = mode_arg2;
+
+ /* Figure out how many args are present. */
+ while (num_args > 0 && h.mode[num_args] == VOIDmode)
+ num_args--;
+
+ if (num_args == 0)
+ fatal_error ("internal error: builtin function %s had no type", name);
+
+ if (!builtin_mode_to_type[h.mode[0]])
+ fatal_error ("internal error: builtin function %s had an unexpected "
+ "return type %s", name, GET_MODE_NAME (h.mode[0]));
+
+ for (i = 0; i < num_args; i++)
+ if (!builtin_mode_to_type[h.mode[i+1]])
+ fatal_error ("internal error: builtin function %s, argument %d "
+ "had unexpected argument type %s", name, i,
+ GET_MODE_NAME (h.mode[i+1]));
+
+ found = htab_find_slot (builtin_hash_table, &h, 1);
+ if (*found == NULL)
+ {
+ h2 = GGC_NEW (struct builtin_hash_struct);
+ *h2 = h;
+ *found = (void *)h2;
+
+ switch (num_args)
+ {
+ case 1:
+ h2->type = build_function_type_list (builtin_mode_to_type[mode_ret],
+ builtin_mode_to_type[mode_arg0],
+ NULL_TREE);
+ break;
+
+ case 2:
+ h2->type = build_function_type_list (builtin_mode_to_type[mode_ret],
+ builtin_mode_to_type[mode_arg0],
+ builtin_mode_to_type[mode_arg1],
+ NULL_TREE);
+ break;
+
+ case 3:
+ h2->type = build_function_type_list (builtin_mode_to_type[mode_ret],
+ builtin_mode_to_type[mode_arg0],
+ builtin_mode_to_type[mode_arg1],
+ builtin_mode_to_type[mode_arg2],
+ NULL_TREE);
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+ }
+
+ return ((struct builtin_hash_struct *)(*found))->type;
+}
+
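builtin_function_type counts arguments by trimming trailing VOIDmode entries; the same logic in a standalone sketch (VOID_SKETCH stands in for VOIDmode):

    enum { VOID_SKETCH = 0 };   /* stand-in for VOIDmode */

    static int
    count_args (const int mode[4])
    {
      int num_args = 3;

      /* mode[0] is the return type; trailing "void" arguments do not count.  */
      while (num_args > 0 && mode[num_args] == VOID_SKETCH)
        num_args--;
      return num_args;          /* 0 means the builtin had no arguments */
    }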
+static void
+rs6000_common_init_builtins (void)
+{
+ const struct builtin_description *d;
+ size_t i;
+
+ tree opaque_ftype_opaque = NULL_TREE;
+ tree opaque_ftype_opaque_opaque = NULL_TREE;
+ tree opaque_ftype_opaque_opaque_opaque = NULL_TREE;
+ tree v2si_ftype_qi = NULL_TREE;
+ tree v2si_ftype_v2si_qi = NULL_TREE;
+ tree v2si_ftype_int_qi = NULL_TREE;
+
+ /* Initialize the tables for the unary, binary, and ternary ops. */
+ builtin_mode_to_type[QImode] = integer_type_node;
+ builtin_mode_to_type[HImode] = integer_type_node;
+ builtin_mode_to_type[SImode] = intSI_type_node;
+ builtin_mode_to_type[DImode] = intDI_type_node;
+ builtin_mode_to_type[SFmode] = float_type_node;
+ builtin_mode_to_type[DFmode] = double_type_node;
+ builtin_mode_to_type[V2SImode] = V2SI_type_node;
+ builtin_mode_to_type[V2SFmode] = V2SF_type_node;
+ builtin_mode_to_type[V2DImode] = V2DI_type_node;
+ builtin_mode_to_type[V2DFmode] = V2DF_type_node;
+ builtin_mode_to_type[V4HImode] = V4HI_type_node;
+ builtin_mode_to_type[V4SImode] = V4SI_type_node;
+ builtin_mode_to_type[V4SFmode] = V4SF_type_node;
+ builtin_mode_to_type[V8HImode] = V8HI_type_node;
+ builtin_mode_to_type[V16QImode] = V16QI_type_node;
+
+ if (!TARGET_PAIRED_FLOAT)
+ {
+ builtin_mode_to_type[V2SImode] = opaque_V2SI_type_node;
+ builtin_mode_to_type[V2SFmode] = opaque_V2SF_type_node;
+ }
+
+ /* Add the ternary operators. */
d = bdesc_3arg;
for (i = 0; i < ARRAY_SIZE (bdesc_3arg); i++, d++)
{
- enum machine_mode mode0, mode1, mode2, mode3;
tree type;
- bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ int mask = d->mask;
- if (is_overloaded)
- {
- mode0 = VOIDmode;
- mode1 = VOIDmode;
- mode2 = VOIDmode;
- mode3 = VOIDmode;
+ if ((mask != 0 && (mask & target_flags) == 0)
+ || (mask == 0 && !TARGET_PAIRED_FLOAT))
+ continue;
+
+ if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && d->code <= VSX_BUILTIN_OVERLOADED_LAST))
+ {
+ if (! (type = opaque_ftype_opaque_opaque_opaque))
+ type = opaque_ftype_opaque_opaque_opaque
+ = build_function_type_list (opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ NULL_TREE);
}
else
{
- if (d->name == 0 || d->icode == CODE_FOR_nothing)
+ enum insn_code icode = d->icode;
+ if (d->name == 0 || icode == CODE_FOR_nothing)
continue;
- mode0 = insn_data[d->icode].operand[0].mode;
- mode1 = insn_data[d->icode].operand[1].mode;
- mode2 = insn_data[d->icode].operand[2].mode;
- mode3 = insn_data[d->icode].operand[3].mode;
+ type = builtin_function_type (insn_data[icode].operand[0].mode,
+ insn_data[icode].operand[1].mode,
+ insn_data[icode].operand[2].mode,
+ insn_data[icode].operand[3].mode,
+ d->name);
}
- /* When all four are of the same mode. */
- if (mode0 == mode1 && mode1 == mode2 && mode2 == mode3)
- {
- switch (mode0)
- {
- case VOIDmode:
- type = opaque_ftype_opaque_opaque_opaque;
- break;
- case V2DImode:
- type = v2di_ftype_v2di_v2di_v2di;
- break;
- case V2DFmode:
- type = v2df_ftype_v2df_v2df_v2df;
- break;
- case V4SImode:
- type = v4si_ftype_v4si_v4si_v4si;
- break;
- case V4SFmode:
- type = v4sf_ftype_v4sf_v4sf_v4sf;
- break;
- case V8HImode:
- type = v8hi_ftype_v8hi_v8hi_v8hi;
- break;
- case V16QImode:
- type = v16qi_ftype_v16qi_v16qi_v16qi;
- break;
- case V2SFmode:
- type = v2sf_ftype_v2sf_v2sf_v2sf;
- break;
- default:
- gcc_unreachable ();
- }
- }
- else if (mode0 == mode1 && mode1 == mode2 && mode3 == V16QImode)
- {
- switch (mode0)
- {
- case V2DImode:
- type = v2di_ftype_v2di_v2di_v16qi;
- break;
- case V2DFmode:
- type = v2df_ftype_v2df_v2df_v16qi;
- break;
- case V4SImode:
- type = v4si_ftype_v4si_v4si_v16qi;
- break;
- case V4SFmode:
- type = v4sf_ftype_v4sf_v4sf_v16qi;
- break;
- case V8HImode:
- type = v8hi_ftype_v8hi_v8hi_v16qi;
- break;
- case V16QImode:
- type = v16qi_ftype_v16qi_v16qi_v16qi;
- break;
- default:
- gcc_unreachable ();
- }
- }
- else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V16QImode
- && mode3 == V4SImode)
- type = v4si_ftype_v16qi_v16qi_v4si;
- else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V8HImode
- && mode3 == V4SImode)
- type = v4si_ftype_v8hi_v8hi_v4si;
- else if (mode0 == V4SFmode && mode1 == V4SFmode && mode2 == V4SFmode
- && mode3 == V4SImode)
- type = v4sf_ftype_v4sf_v4sf_v4si;
-
- /* vchar, vchar, vchar, 4-bit literal. */
- else if (mode0 == V16QImode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v16qi_ftype_v16qi_v16qi_int;
-
- /* vshort, vshort, vshort, 4-bit literal. */
- else if (mode0 == V8HImode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v8hi_ftype_v8hi_v8hi_int;
-
- /* vint, vint, vint, 4-bit literal. */
- else if (mode0 == V4SImode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v4si_ftype_v4si_v4si_int;
-
- /* vfloat, vfloat, vfloat, 4-bit literal. */
- else if (mode0 == V4SFmode && mode1 == mode0 && mode2 == mode0
- && mode3 == QImode)
- type = v4sf_ftype_v4sf_v4sf_int;
-
- else
- gcc_unreachable ();
-
def_builtin (d->mask, d->name, type, d->code);
}
- /* Add the simple binary operators. */
+ /* Add the binary operators. */
d = (struct builtin_description *) bdesc_2arg;
for (i = 0; i < ARRAY_SIZE (bdesc_2arg); i++, d++)
{
enum machine_mode mode0, mode1, mode2;
tree type;
- bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ int mask = d->mask;
- if (is_overloaded)
- {
- mode0 = VOIDmode;
- mode1 = VOIDmode;
- mode2 = VOIDmode;
+ if ((mask != 0 && (mask & target_flags) == 0)
+ || (mask == 0 && !TARGET_PAIRED_FLOAT))
+ continue;
+
+ if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && d->code <= VSX_BUILTIN_OVERLOADED_LAST))
+ {
+ if (! (type = opaque_ftype_opaque_opaque))
+ type = opaque_ftype_opaque_opaque
+ = build_function_type_list (opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ NULL_TREE);
}
else
{
- if (d->name == 0 || d->icode == CODE_FOR_nothing)
+ enum insn_code icode = d->icode;
+ if (d->name == 0 || icode == CODE_FOR_nothing)
continue;
- mode0 = insn_data[d->icode].operand[0].mode;
- mode1 = insn_data[d->icode].operand[1].mode;
- mode2 = insn_data[d->icode].operand[2].mode;
- }
+ mode0 = insn_data[icode].operand[0].mode;
+ mode1 = insn_data[icode].operand[1].mode;
+ mode2 = insn_data[icode].operand[2].mode;
- /* When all three operands are of the same mode. */
- if (mode0 == mode1 && mode1 == mode2)
- {
- switch (mode0)
+ if (mode0 == V2SImode && mode1 == V2SImode && mode2 == QImode)
{
- case VOIDmode:
- type = opaque_ftype_opaque_opaque;
- break;
- case V2DFmode:
- type = v2df_ftype_v2df_v2df;
- break;
- case V4SFmode:
- type = v4sf_ftype_v4sf_v4sf;
- break;
- case V4SImode:
- type = v4si_ftype_v4si_v4si;
- break;
- case V16QImode:
- type = v16qi_ftype_v16qi_v16qi;
- break;
- case V8HImode:
- type = v8hi_ftype_v8hi_v8hi;
- break;
- case V2SImode:
- type = v2si_ftype_v2si_v2si;
- break;
- case V2SFmode:
- if (TARGET_PAIRED_FLOAT)
- type = v2sf_ftype_v2sf_v2sf;
- else
- type = v2sf_ftype_v2sf_v2sf_spe;
- break;
- case SImode:
- type = int_ftype_int_int;
- break;
- default:
- gcc_unreachable ();
+ if (! (type = v2si_ftype_v2si_qi))
+ type = v2si_ftype_v2si_qi
+ = build_function_type_list (opaque_V2SI_type_node,
+ opaque_V2SI_type_node,
+ char_type_node,
+ NULL_TREE);
}
- }
-
- /* A few other combos we really don't want to do manually. */
-
- /* vint, vfloat, vfloat. */
- else if (mode0 == V4SImode && mode1 == V4SFmode && mode2 == V4SFmode)
- type = v4si_ftype_v4sf_v4sf;
-
- /* vshort, vchar, vchar. */
- else if (mode0 == V8HImode && mode1 == V16QImode && mode2 == V16QImode)
- type = v8hi_ftype_v16qi_v16qi;
-
- /* vint, vshort, vshort. */
- else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V8HImode)
- type = v4si_ftype_v8hi_v8hi;
-
- /* vshort, vint, vint. */
- else if (mode0 == V8HImode && mode1 == V4SImode && mode2 == V4SImode)
- type = v8hi_ftype_v4si_v4si;
-
- /* vchar, vshort, vshort. */
- else if (mode0 == V16QImode && mode1 == V8HImode && mode2 == V8HImode)
- type = v16qi_ftype_v8hi_v8hi;
-
- /* vint, vchar, vint. */
- else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V4SImode)
- type = v4si_ftype_v16qi_v4si;
-
- /* vint, vchar, vchar. */
- else if (mode0 == V4SImode && mode1 == V16QImode && mode2 == V16QImode)
- type = v4si_ftype_v16qi_v16qi;
-
- /* vint, vshort, vint. */
- else if (mode0 == V4SImode && mode1 == V8HImode && mode2 == V4SImode)
- type = v4si_ftype_v8hi_v4si;
- /* vint, vint, 5-bit literal. */
- else if (mode0 == V4SImode && mode1 == V4SImode && mode2 == QImode)
- type = v4si_ftype_v4si_int;
-
- /* vshort, vshort, 5-bit literal. */
- else if (mode0 == V8HImode && mode1 == V8HImode && mode2 == QImode)
- type = v8hi_ftype_v8hi_int;
-
- /* vchar, vchar, 5-bit literal. */
- else if (mode0 == V16QImode && mode1 == V16QImode && mode2 == QImode)
- type = v16qi_ftype_v16qi_int;
-
- /* vfloat, vint, 5-bit literal. */
- else if (mode0 == V4SFmode && mode1 == V4SImode && mode2 == QImode)
- type = v4sf_ftype_v4si_int;
-
- /* vint, vfloat, 5-bit literal. */
- else if (mode0 == V4SImode && mode1 == V4SFmode && mode2 == QImode)
- type = v4si_ftype_v4sf_int;
-
- else if (mode0 == V2SImode && mode1 == SImode && mode2 == SImode)
- type = v2si_ftype_int_int;
-
- else if (mode0 == V2SImode && mode1 == V2SImode && mode2 == QImode)
- type = v2si_ftype_v2si_char;
-
- else if (mode0 == V2SImode && mode1 == SImode && mode2 == QImode)
- type = v2si_ftype_int_char;
-
- else
- {
- /* int, x, x. */
- gcc_assert (mode0 == SImode);
- switch (mode1)
+ else if (mode0 == V2SImode && GET_MODE_CLASS (mode1) == MODE_INT
+ && mode2 == QImode)
{
- case V4SImode:
- type = int_ftype_v4si_v4si;
- break;
- case V4SFmode:
- type = int_ftype_v4sf_v4sf;
- break;
- case V16QImode:
- type = int_ftype_v16qi_v16qi;
- break;
- case V8HImode:
- type = int_ftype_v8hi_v8hi;
- break;
- default:
- gcc_unreachable ();
+ if (! (type = v2si_ftype_int_qi))
+ type = v2si_ftype_int_qi
+ = build_function_type_list (opaque_V2SI_type_node,
+ integer_type_node,
+ char_type_node,
+ NULL_TREE);
}
+
+ else
+ type = builtin_function_type (mode0, mode1, mode2, VOIDmode,
+ d->name);
}
def_builtin (d->mask, d->name, type, d->code);
}
- /* Add the simple unary operators. */
+ /* Add the unary operators. */
d = (struct builtin_description *) bdesc_1arg;
for (i = 0; i < ARRAY_SIZE (bdesc_1arg); i++, d++)
{
enum machine_mode mode0, mode1;
tree type;
- bool is_overloaded = d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
- && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST;
+ int mask = d->mask;
- if (is_overloaded)
- {
- mode0 = VOIDmode;
- mode1 = VOIDmode;
- }
+ if ((mask != 0 && (mask & target_flags) == 0)
+ || (mask == 0 && !TARGET_PAIRED_FLOAT))
+ continue;
+
+ if ((d->code >= ALTIVEC_BUILTIN_OVERLOADED_FIRST
+ && d->code <= ALTIVEC_BUILTIN_OVERLOADED_LAST)
+ || (d->code >= VSX_BUILTIN_OVERLOADED_FIRST
+ && d->code <= VSX_BUILTIN_OVERLOADED_LAST))
+ {
+ if (! (type = opaque_ftype_opaque))
+ type = opaque_ftype_opaque
+ = build_function_type_list (opaque_V4SI_type_node,
+ opaque_V4SI_type_node,
+ NULL_TREE);
+ }
else
{
- if (d->name == 0 || d->icode == CODE_FOR_nothing)
+ enum insn_code icode = d->icode;
+ if (d->name == 0 || icode == CODE_FOR_nothing)
continue;
- mode0 = insn_data[d->icode].operand[0].mode;
- mode1 = insn_data[d->icode].operand[1].mode;
- }
+ mode0 = insn_data[icode].operand[0].mode;
+ mode1 = insn_data[icode].operand[1].mode;
- if (mode0 == V4SImode && mode1 == QImode)
- type = v4si_ftype_int;
- else if (mode0 == V8HImode && mode1 == QImode)
- type = v8hi_ftype_int;
- else if (mode0 == V16QImode && mode1 == QImode)
- type = v16qi_ftype_int;
- else if (mode0 == VOIDmode && mode1 == VOIDmode)
- type = opaque_ftype_opaque;
- else if (mode0 == V2DFmode && mode1 == V2DFmode)
- type = v2df_ftype_v2df;
- else if (mode0 == V4SFmode && mode1 == V4SFmode)
- type = v4sf_ftype_v4sf;
- else if (mode0 == V8HImode && mode1 == V16QImode)
- type = v8hi_ftype_v16qi;
- else if (mode0 == V4SImode && mode1 == V8HImode)
- type = v4si_ftype_v8hi;
- else if (mode0 == V2SImode && mode1 == V2SImode)
- type = v2si_ftype_v2si;
- else if (mode0 == V2SFmode && mode1 == V2SFmode)
- {
- if (TARGET_PAIRED_FLOAT)
- type = v2sf_ftype_v2sf;
- else
- type = v2sf_ftype_v2sf_spe;
- }
- else if (mode0 == V2SFmode && mode1 == V2SImode)
- type = v2sf_ftype_v2si;
- else if (mode0 == V2SImode && mode1 == V2SFmode)
- type = v2si_ftype_v2sf;
- else if (mode0 == V2SImode && mode1 == QImode)
- type = v2si_ftype_char;
- else if (mode0 == V4SImode && mode1 == V4SFmode)
- type = v4si_ftype_v4sf;
- else if (mode0 == V4SFmode && mode1 == V4SImode)
- type = v4sf_ftype_v4si;
- else if (mode0 == V2DImode && mode1 == V2DFmode)
- type = v2di_ftype_v2df;
- else if (mode0 == V2DFmode && mode1 == V2DImode)
- type = v2df_ftype_v2di;
- else
- gcc_unreachable ();
+ if (mode0 == V2SImode && mode1 == QImode)
+ {
+ if (! (type = v2si_ftype_qi))
+ type = v2si_ftype_qi
+ = build_function_type_list (opaque_V2SI_type_node,
+ char_type_node,
+ NULL_TREE);
+ }
+
+ else
+ type = builtin_function_type (mode0, mode1, VOIDmode, VOIDmode,
+ d->name);
+ }
def_builtin (d->mask, d->name, type, d->code);
}
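All three loops now share the same per-entry filter; a hedged standalone sketch with stand-in mask values:

    /* Mirrors "(mask != 0 && (mask & target_flags) == 0)
       || (mask == 0 && !TARGET_PAIRED_FLOAT)" from the loops above.  */
    #define MASK_ALTIVEC_SKETCH 0x1     /* stand-in flag values */
    #define MASK_VSX_SKETCH     0x2

    static int
    builtin_enabled_p (int mask, int target_flags, int paired_float_p)
    {
      if (mask != 0)
        return (mask & target_flags) != 0;
      return paired_float_p;    /* mask == 0 entries are PAIRED_FLOAT only */
    }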
@@ -12618,12 +12538,12 @@ rs6000_secondary_reload_inner (rtx reg,
}
if (GET_CODE (addr) == PLUS
- && (!rs6000_legitimate_offset_address_p (TImode, addr, true)
+ && (!rs6000_legitimate_offset_address_p (TImode, addr, false)
|| and_op2 != NULL_RTX))
{
addr_op1 = XEXP (addr, 0);
addr_op2 = XEXP (addr, 1);
- gcc_assert (legitimate_indirect_address_p (addr_op1, true));
+ gcc_assert (legitimate_indirect_address_p (addr_op1, false));
if (!REG_P (addr_op2)
&& (GET_CODE (addr_op2) != CONST_INT
@@ -12642,8 +12562,8 @@ rs6000_secondary_reload_inner (rtx reg,
addr = scratch_or_premodify;
scratch_or_premodify = scratch;
}
- else if (!legitimate_indirect_address_p (addr, true)
- && !rs6000_legitimate_offset_address_p (TImode, addr, true))
+ else if (!legitimate_indirect_address_p (addr, false)
+ && !rs6000_legitimate_offset_address_p (TImode, addr, false))
{
rs6000_emit_move (scratch_or_premodify, addr, Pmode);
addr = scratch_or_premodify;
@@ -12672,24 +12592,24 @@ rs6000_secondary_reload_inner (rtx reg,
if (GET_CODE (addr) == PRE_MODIFY
&& (!VECTOR_MEM_VSX_P (mode)
|| and_op2 != NULL_RTX
- || !legitimate_indexed_address_p (XEXP (addr, 1), true)))
+ || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
{
scratch_or_premodify = XEXP (addr, 0);
gcc_assert (legitimate_indirect_address_p (scratch_or_premodify,
- true));
+ false));
gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS);
addr = XEXP (addr, 1);
}
- if (legitimate_indirect_address_p (addr, true) /* reg */
- || legitimate_indexed_address_p (addr, true) /* reg+reg */
+ if (legitimate_indirect_address_p (addr, false) /* reg */
+ || legitimate_indexed_address_p (addr, false) /* reg+reg */
|| GET_CODE (addr) == PRE_MODIFY /* VSX pre-modify */
|| GET_CODE (addr) == AND /* Altivec memory */
|| (rclass == FLOAT_REGS /* legacy float mem */
&& GET_MODE_SIZE (mode) == 8
&& and_op2 == NULL_RTX
&& scratch_or_premodify == scratch
- && rs6000_legitimate_offset_address_p (mode, addr, true)))
+ && rs6000_legitimate_offset_address_p (mode, addr, false)))
;
else if (GET_CODE (addr) == PLUS)
@@ -12709,7 +12629,7 @@ rs6000_secondary_reload_inner (rtx reg,
}
else if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
- || GET_CODE (addr) == CONST_INT)
+ || GET_CODE (addr) == CONST_INT || REG_P (addr))
{
rs6000_emit_move (scratch_or_premodify, addr, Pmode);
addr = scratch_or_premodify;
@@ -12741,7 +12661,7 @@ rs6000_secondary_reload_inner (rtx reg,
andi. instruction. */
if (and_op2 != NULL_RTX)
{
- if (! legitimate_indirect_address_p (addr, true))
+ if (! legitimate_indirect_address_p (addr, false))
{
emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
addr = scratch;
@@ -12776,6 +12696,26 @@ rs6000_secondary_reload_inner (rtx reg,
return;
}
+/* Target hook to return the cover classes for the Integrated Register
+   Allocator.  Cover classes are a set of non-intersecting register classes
+   covering all hard registers used for register allocation purposes.  Any
+   move between two registers of a cover class should be cheaper than a load
+   or store of those registers.  The value is an array of register classes
+   with LIM_REG_CLASSES used as the end marker.
+
+   We need two IRA_COVER_CLASSES, one for pre-VSX and one for VSX, to account
+   for the Altivec and floating point registers being subsets of the VSX
+   register set under VSX, but distinct register sets on pre-VSX machines.  */
+
+static const enum reg_class *
+rs6000_ira_cover_classes (void)
+{
+ static const enum reg_class cover_pre_vsx[] = IRA_COVER_CLASSES_PRE_VSX;
+ static const enum reg_class cover_vsx[] = IRA_COVER_CLASSES_VSX;
+
+ return (TARGET_VSX) ? cover_vsx : cover_pre_vsx;
+}
+
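The two IRA_COVER_CLASSES_* macros come from rs6000.h (not shown in this hunk); the arrays have the usual IRA shape, sketched here with hypothetical contents:

    /* Hypothetical contents; the real arrays are
       IRA_COVER_CLASSES_PRE_VSX and IRA_COVER_CLASSES_VSX.  */
    enum reg_class_sketch { GENERAL_REGS_S, FLOAT_REGS_S, ALTIVEC_REGS_S,
                            VSX_REGS_S, LIM_REG_CLASSES_S };

    static const enum reg_class_sketch cover_vsx_sketch[] =
    {
      GENERAL_REGS_S, VSX_REGS_S,       /* FLOAT/ALTIVEC fold into VSX_REGS */
      LIM_REG_CLASSES_S                 /* required end marker */
    };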
/* Allocate a 64-bit stack slot to be used for copying SDmode
values through if this function has any SDmode references. */
@@ -12849,13 +12789,15 @@ rs6000_preferred_reload_class (rtx x, en
enum machine_mode mode = GET_MODE (x);
enum reg_class ret;
- if (TARGET_VSX && VSX_VECTOR_MODE (mode) && x == CONST0_RTX (mode)
- && VSX_REG_CLASS_P (rclass))
+ if (TARGET_VSX
+ && (VSX_VECTOR_MODE (mode) || mode == TImode)
+ && x == CONST0_RTX (mode) && VSX_REG_CLASS_P (rclass))
ret = rclass;
- else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode)
- && rclass == ALTIVEC_REGS && easy_vector_constant (x, mode))
- ret = rclass;
+ else if (TARGET_ALTIVEC && (ALTIVEC_VECTOR_MODE (mode) || mode == TImode)
+ && (rclass == ALTIVEC_REGS || rclass == VSX_REGS)
+ && easy_vector_constant (x, mode))
+ ret = ALTIVEC_REGS;
else if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS))
ret = NO_REGS;
@@ -13074,8 +13016,10 @@ rs6000_cannot_change_mode_class (enum ma
|| (((to) == TDmode) + ((from) == TDmode)) == 1
|| (((to) == DImode) + ((from) == DImode)) == 1))
|| (TARGET_VSX
- && (VSX_VECTOR_MODE (from) + VSX_VECTOR_MODE (to)) == 1)
+ && (VSX_MOVE_MODE (from) + VSX_MOVE_MODE (to)) == 1
+ && VSX_REG_CLASS_P (rclass))
|| (TARGET_ALTIVEC
+ && rclass == ALTIVEC_REGS
&& (ALTIVEC_VECTOR_MODE (from)
+ ALTIVEC_VECTOR_MODE (to)) == 1)
|| (TARGET_SPE
@@ -14953,7 +14897,7 @@ rs6000_emit_vector_cond_expr (rtx dest,
if (!mask)
return 0;
- if ((TARGET_VSX && VSX_VECTOR_MOVE_MODE (dest_mode))
+ if ((TARGET_VSX && VSX_MOVE_MODE (dest_mode))
|| (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dest_mode)))
{
rtx cond2 = gen_rtx_fmt_ee (NE, VOIDmode, mask, const0_rtx);
@@ -22044,7 +21988,8 @@ rs6000_handle_altivec_attribute (tree *n
mode = TYPE_MODE (type);
/* Check for invalid AltiVec type qualifiers. */
- if (type == long_unsigned_type_node || type == long_integer_type_node)
+ if ((type == long_unsigned_type_node || type == long_integer_type_node)
+ && !TARGET_VSX)
{
if (TARGET_64BIT)
error ("use of %<long%> in AltiVec types is invalid for 64-bit code");
@@ -22082,6 +22027,7 @@ rs6000_handle_altivec_attribute (tree *n
break;
case SFmode: result = V4SF_type_node; break;
case DFmode: result = V2DF_type_node; break;
+ case DImode: result = V2DI_type_node; break;
/* If the user says 'vector int bool', we may be handed the 'bool'
attribute _before_ the 'vector' attribute, and so select the
proper type in the 'b' case below. */
@@ -22093,6 +22039,7 @@ rs6000_handle_altivec_attribute (tree *n
case 'b':
switch (mode)
{
+ case DImode: case V2DImode: result = bool_V2DI_type_node; break;
case SImode: case V4SImode: result = bool_V4SI_type_node; break;
case HImode: case V8HImode: result = bool_V8HI_type_node; break;
case QImode: case V16QImode: result = bool_V16QI_type_node;
@@ -22137,6 +22084,7 @@ rs6000_mangle_type (const_tree type)
if (type == bool_short_type_node) return "U6__bools";
if (type == pixel_type_node) return "u7__pixel";
if (type == bool_int_type_node) return "U6__booli";
+ if (type == bool_long_type_node) return "U6__booll";
/* Mangle IBM extended float long double as `g' (__float128) on
powerpc*-linux where long-double-64 previously was the default. */
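As a hedged illustration of the new mangling fragment (hypothetical declaration, C++ mangling assumed):

    /* With -mvsx, the parameter below should mangle using the
       "U6__booll" fragment returned above.  */
    void take_vbl (__vector __bool long);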
@@ -23647,6 +23595,8 @@ int
rs6000_register_move_cost (enum machine_mode mode,
enum reg_class from, enum reg_class to)
{
+ int ret;
+
/* Moves from/to GENERAL_REGS. */
if (reg_classes_intersect_p (to, GENERAL_REGS)
|| reg_classes_intersect_p (from, GENERAL_REGS))
@@ -23655,39 +23605,47 @@ rs6000_register_move_cost (enum machine_
from = to;
if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS)
- return (rs6000_memory_move_cost (mode, from, 0)
- + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
+ ret = (rs6000_memory_move_cost (mode, from, 0)
+ + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
/* It's more expensive to move CR_REGS than CR0_REGS because of the
shift. */
else if (from == CR_REGS)
- return 4;
+ ret = 4;
/* Power6 has slower LR/CTR moves so make them more expensive than
     memory in order to bias spills to memory.  */
else if (rs6000_cpu == PROCESSOR_POWER6
&& reg_classes_intersect_p (from, LINK_OR_CTR_REGS))
- return 6 * hard_regno_nregs[0][mode];
+ ret = 6 * hard_regno_nregs[0][mode];
else
/* A move will cost one instruction per GPR moved. */
- return 2 * hard_regno_nregs[0][mode];
+ ret = 2 * hard_regno_nregs[0][mode];
}
/* If we have VSX, we can easily move between FPR or Altivec registers. */
- else if (TARGET_VSX
- && ((from == VSX_REGS || from == FLOAT_REGS || from == ALTIVEC_REGS)
- || (to == VSX_REGS || to == FLOAT_REGS || to == ALTIVEC_REGS)))
- return 2;
+ else if (VECTOR_UNIT_VSX_P (mode)
+ && reg_classes_intersect_p (to, VSX_REGS)
+ && reg_classes_intersect_p (from, VSX_REGS))
+ ret = 2 * hard_regno_nregs[32][mode];
/* Moving between two similar registers is just one instruction. */
else if (reg_classes_intersect_p (to, from))
- return (mode == TFmode || mode == TDmode) ? 4 : 2;
+ ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
/* Everything else has to go through GENERAL_REGS. */
else
- return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
- + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+ ret = (rs6000_register_move_cost (mode, GENERAL_REGS, to)
+ + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+
+ if (TARGET_DEBUG_COST)
+ fprintf (stderr,
+ "rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n",
+ ret, GET_MODE_NAME (mode), reg_class_names[from],
+ reg_class_names[to]);
+
+ return ret;
}
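The shape of the rewrite above: accumulate the cost in a local, optionally trace it, and return once; a minimal sketch (the debug flag is a stand-in for TARGET_DEBUG_COST):

    #include <stdio.h>

    static int
    move_cost_sketch (int base_cost, int debug)
    {
      int ret = base_cost;      /* computed by the if/else chain above */

      if (debug)
        fprintf (stderr, "move_cost_sketch: ret=%d\n", ret);
      return ret;
    }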
/* A C expression returning the cost of moving data of MODE from a register to
@@ -23697,14 +23655,23 @@ int
rs6000_memory_move_cost (enum machine_mode mode, enum reg_class rclass,
int in ATTRIBUTE_UNUSED)
{
+ int ret;
+
if (reg_classes_intersect_p (rclass, GENERAL_REGS))
- return 4 * hard_regno_nregs[0][mode];
+ ret = 4 * hard_regno_nregs[0][mode];
else if (reg_classes_intersect_p (rclass, FLOAT_REGS))
- return 4 * hard_regno_nregs[32][mode];
+ ret = 4 * hard_regno_nregs[32][mode];
else if (reg_classes_intersect_p (rclass, ALTIVEC_REGS))
- return 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
+ ret = 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
else
- return 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+ ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+
+ if (TARGET_DEBUG_COST)
+ fprintf (stderr,
+ "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n",
+ ret, GET_MODE_NAME (mode), reg_class_names[rclass], in);
+
+ return ret;
}
/* Returns a code for a target-specific builtin that implements
@@ -24424,4 +24391,24 @@ rs6000_final_prescan_insn (rtx insn, rtx
}
}
+/* Return true if the function has an indirect jump or a table jump.  The
+   compiler prefers the CTR register for such jumps, which interferes with
+   using the decrement-CTR-and-branch instructions.  */
+
+bool
+rs6000_has_indirect_jump_p (void)
+{
+ gcc_assert (cfun && cfun->machine);
+ return cfun->machine->indirect_jump_p;
+}
+
+/* Remember when we've generated an indirect jump. */
+
+void
+rs6000_set_indirect_jump (void)
+{
+ gcc_assert (cfun && cfun->machine);
+ cfun->machine->indirect_jump_p = true;
+}
+
#include "gt-rs6000.h"
--- gcc/config/rs6000/vsx.md (revision 146119)
+++ gcc/config/rs6000/vsx.md (revision 146798)
@@ -22,12 +22,22 @@
;; Iterator for both scalar and vector floating point types supported by VSX
(define_mode_iterator VSX_B [DF V4SF V2DF])
+;; Iterator for the 2 64-bit vector types
+(define_mode_iterator VSX_D [V2DF V2DI])
+
+;; Iterator for the 2 32-bit vector types
+(define_mode_iterator VSX_W [V4SF V4SI])
+
;; Iterator for vector floating point types supported by VSX
(define_mode_iterator VSX_F [V4SF V2DF])
;; Iterator for logical types supported by VSX
(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF TI])
+;; Iterator for memory moves.  Handle TImode specially to allow
+;; it to use GPRs as well as VSX registers.
+(define_mode_iterator VSX_M [V16QI V8HI V4SI V2DI V4SF V2DF])
+
;; Iterator for types for load/store with update
(define_mode_iterator VSX_U [V16QI V8HI V4SI V2DI V4SF V2DF TI DF])
@@ -49,9 +59,10 @@ (define_mode_attr VSs [(V16QI "sp")
(V2DF "dp")
(V2DI "dp")
(DF "dp")
+ (SF "sp")
(TI "sp")])
-;; Map into the register class used
+;; Map the register class used
(define_mode_attr VSr [(V16QI "v")
(V8HI "v")
(V4SI "v")
@@ -59,9 +70,10 @@ (define_mode_attr VSr [(V16QI "v")
(V2DI "wd")
(V2DF "wd")
(DF "ws")
+ (SF "f")
(TI "wd")])
-;; Map into the register class used for float<->int conversions
+;; Map the register class used for float<->int conversions
(define_mode_attr VSr2 [(V2DF "wd")
(V4SF "wf")
(DF "!f#r")])
@@ -70,6 +82,18 @@ (define_mode_attr VSr3 [(V2DF "wa")
(V4SF "wa")
(DF "!f#r")])
+;; Map the register class for sp<->dp float conversions, destination
+(define_mode_attr VSr4 [(SF "ws")
+ (DF "f")
+ (V2DF "wd")
+ (V4SF "v")])
+
+;; Map the register class for sp<->dp float conversions, source
+(define_mode_attr VSr5 [(SF "ws")
+ (DF "f")
+ (V2DF "v")
+ (V4SF "wd")])
+
;; Same size integer type for floating point data
(define_mode_attr VSi [(V4SF "v4si")
(V2DF "v2di")
@@ -137,6 +161,32 @@ (define_mode_attr VSfptype_sqrt [(V2DF "
(V4SF "fp_sqrt_s")
(DF "fp_sqrt_d")])
+;; Iterator and modes for sp<->dp conversions
+(define_mode_iterator VSX_SPDP [SF DF V4SF V2DF])
+
+(define_mode_attr VS_spdp_res [(SF "DF")
+ (DF "SF")
+ (V4SF "V2DF")
+ (V2DF "V4SF")])
+
+(define_mode_attr VS_spdp_insn [(SF "xscvspdp")
+ (DF "xscvdpsp")
+ (V4SF "xvcvspdp")
+ (V2DF "xvcvdpsp")])
+
+(define_mode_attr VS_spdp_type [(SF "fp")
+ (DF "fp")
+ (V4SF "vecfloat")
+ (V2DF "vecfloat")])
+
+;; Map the scalar mode for a vector type
+(define_mode_attr VS_scalar [(V2DF "DF")
+ (V2DI "DI")
+ (V4SF "SF")
+ (V4SI "SI")
+ (V8HI "HI")
+ (V16QI "QI")])
+
;; Appropriate type for load + update
(define_mode_attr VStype_load_update [(V16QI "vecload")
(V8HI "vecload")
@@ -159,25 +209,33 @@ (define_mode_attr VStype_store_update [(
;; Constants for creating unspecs
(define_constants
- [(UNSPEC_VSX_CONCAT_V2DF 500)
- (UNSPEC_VSX_XVCVDPSP 501)
- (UNSPEC_VSX_XVCVDPSXWS 502)
- (UNSPEC_VSX_XVCVDPUXWS 503)
- (UNSPEC_VSX_XVCVSPDP 504)
- (UNSPEC_VSX_XVCVSXWDP 505)
- (UNSPEC_VSX_XVCVUXWDP 506)
- (UNSPEC_VSX_XVMADD 507)
- (UNSPEC_VSX_XVMSUB 508)
- (UNSPEC_VSX_XVNMADD 509)
- (UNSPEC_VSX_XVNMSUB 510)
- (UNSPEC_VSX_XVRSQRTE 511)
- (UNSPEC_VSX_XVTDIV 512)
- (UNSPEC_VSX_XVTSQRT 513)])
+ [(UNSPEC_VSX_CONCAT 500)
+ (UNSPEC_VSX_CVDPSXWS 501)
+ (UNSPEC_VSX_CVDPUXWS 502)
+ (UNSPEC_VSX_CVSPDP 503)
+ (UNSPEC_VSX_CVSXWDP 504)
+ (UNSPEC_VSX_CVUXWDP 505)
+ (UNSPEC_VSX_CVSXDSP 506)
+ (UNSPEC_VSX_CVUXDSP 507)
+ (UNSPEC_VSX_CVSPSXDS 508)
+ (UNSPEC_VSX_CVSPUXDS 509)
+ (UNSPEC_VSX_MADD 510)
+ (UNSPEC_VSX_MSUB 511)
+ (UNSPEC_VSX_NMADD 512)
+ (UNSPEC_VSX_NMSUB 513)
+ (UNSPEC_VSX_RSQRTE 514)
+ (UNSPEC_VSX_TDIV 515)
+ (UNSPEC_VSX_TSQRT 516)
+ (UNSPEC_VSX_XXPERMDI 517)
+ (UNSPEC_VSX_SET 518)
+ (UNSPEC_VSX_ROUND_I 519)
+ (UNSPEC_VSX_ROUND_IC 520)
+ (UNSPEC_VSX_SLDWI 521)])
;; VSX moves
(define_insn "*vsx_mov<mode>"
- [(set (match_operand:VSX_L 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*o,*r,*r,<VSr>,?wa,v,wZ,v")
- (match_operand:VSX_L 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,o,r,j,j,W,v,wZ"))]
+ [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*o,*r,*r,<VSr>,?wa,v,wZ,v")
+ (match_operand:VSX_M 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,o,r,j,j,W,v,wZ"))]
"VECTOR_MEM_VSX_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
@@ -220,6 +278,49 @@ (define_insn "*vsx_mov<mode>"
}
[(set_attr "type" "vecstore,vecload,vecsimple,vecstore,vecload,vecsimple,*,*,*,vecsimple,vecsimple,*,vecstore,vecload")])
+;; Unlike other VSX moves, allow the GPRs, since a normal use of TImode is for
+;; unions.  However, for plain data movement, slightly favor the vector loads.
+(define_insn "*vsx_movti"
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,wa,wa,?o,?r,?r,wa,v,wZ,v")
+	(match_operand:TI 1 "input_operand" "wa,Z,wa,r,o,r,j,W,v,wZ"))]
+ "VECTOR_MEM_VSX_P (TImode)
+ && (register_operand (operands[0], TImode)
+ || register_operand (operands[1], TImode))"
+{
+ switch (which_alternative)
+ {
+ case 0:
+ return "stxvd2%U0x %x1,%y0";
+
+ case 1:
+ return "lxvd2%U0x %x0,%y1";
+
+ case 2:
+ return "xxlor %x0,%x1,%x1";
+
+ case 3:
+ case 4:
+ case 5:
+ return "#";
+
+ case 6:
+ return "xxlxor %x0,%x0,%x0";
+
+ case 7:
+ return output_vec_const_move (operands);
+
+ case 8:
+ return "stvx %1,%y0";
+
+ case 9:
+ return "lvx %0,%y1";
+
+ default:
+ gcc_unreachable ();
+ }
+}
+ [(set_attr "type" "vecstore,vecload,vecsimple,*,*,*,vecsimple,*,vecstore,vecload")])
+
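The union use of TImode mentioned above looks roughly like this hedged sketch (assumes a 64-bit target with __int128):

    /* TImode data that wants GPR moves as well as vector moves.  */
    union ti_bits
    {
      __int128 whole;                   /* TImode */
      unsigned long long half[2];       /* halves accessed in GPRs */
    };

    unsigned long long
    low_half (union ti_bits t)
    {
      return t.half[0];
    }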
;; Load/store with update
;; Define insns that do load or store with update. Because VSX only has
;; reg+reg addressing, pre-decrement or pre-increment is unlikely to be
@@ -297,7 +398,7 @@ (define_insn "vsx_tdiv<mode>3"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")]
- UNSPEC_VSX_XVTDIV))]
+ UNSPEC_VSX_TDIV))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"x<VSv>tdiv<VSs> %x0,%x1,%x2"
[(set_attr "type" "<VStype_simple>")
@@ -367,7 +468,7 @@ (define_insn "*vsx_sqrt<mode>2"
(define_insn "vsx_rsqrte<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
- UNSPEC_VSX_XVRSQRTE))]
+ UNSPEC_VSX_RSQRTE))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"x<VSv>rsqrte<VSs> %x0,%x1"
[(set_attr "type" "<VStype_simple>")
@@ -376,7 +477,7 @@ (define_insn "vsx_rsqrte<mode>2"
(define_insn "vsx_tsqrt<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
- UNSPEC_VSX_XVTSQRT))]
+ UNSPEC_VSX_TSQRT))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"x<VSv>tsqrt<VSs> %x0,%x1"
[(set_attr "type" "<VStype_simple>")
@@ -426,7 +527,7 @@ (define_insn "vsx_fmadd<mode>4_2"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVMADD))]
+ UNSPEC_VSX_MADD))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>madda<VSs> %x0,%x1,%x2
@@ -474,7 +575,7 @@ (define_insn "vsx_fmsub<mode>4_2"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVMSUB))]
+ UNSPEC_VSX_MSUB))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>msuba<VSs> %x0,%x1,%x2
@@ -552,7 +653,7 @@ (define_insn "vsx_fnmadd<mode>4_3"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVNMADD))]
+ UNSPEC_VSX_NMADD))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>nmadda<VSs> %x0,%x1,%x2
@@ -629,7 +730,7 @@ (define_insn "vsx_fnmsub<mode>4_3"
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%<VSr>,<VSr>,wa,wa")
(match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,0,wa,0")
(match_operand:VSX_B 3 "vsx_register_operand" "0,<VSr>,0,wa")]
- UNSPEC_VSX_XVNMSUB))]
+ UNSPEC_VSX_NMSUB))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
x<VSv>nmsuba<VSs> %x0,%x1,%x2
@@ -667,13 +768,13 @@ (define_insn "*vsx_ge<mode>"
[(set_attr "type" "<VStype_simple>")
(set_attr "fp_type" "<VSfptype_simple>")])
-(define_insn "vsx_vsel<mode>"
- [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
- (if_then_else:VSX_F (ne (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+(define_insn "*vsx_vsel<mode>"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (if_then_else:VSX_L (ne (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
(const_int 0))
- (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")
- (match_operand:VSX_F 3 "vsx_register_operand" "<VSr>,wa")))]
- "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+ (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxsel %x0,%x3,%x2,%x1"
[(set_attr "type" "vecperm")])
@@ -698,7 +799,7 @@ (define_insn "vsx_ftrunc<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(fix:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
"VECTOR_UNIT_VSX_P (<MODE>mode)"
- "x<VSv>r<VSs>piz %x0,%x1"
+ "x<VSv>r<VSs>iz %x0,%x1"
[(set_attr "type" "<VStype_simple>")
(set_attr "fp_type" "<VSfptype_simple>")])
@@ -735,6 +836,24 @@ (define_insn "vsx_fixuns_trunc<mode><VSi
(set_attr "fp_type" "<VSfptype_simple>")])
;; Math rounding functions
+(define_insn "vsx_x<VSv>r<VSs>i"
+ [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
+ (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+ UNSPEC_VSX_ROUND_I))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "x<VSv>r<VSs>i %x0,%x1"
+ [(set_attr "type" "<VStype_simple>")
+ (set_attr "fp_type" "<VSfptype_simple>")])
+
+(define_insn "vsx_x<VSv>r<VSs>ic"
+ [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
+ (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+ UNSPEC_VSX_ROUND_IC))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "x<VSv>r<VSs>ic %x0,%x1"
+ [(set_attr "type" "<VStype_simple>")
+ (set_attr "fp_type" "<VSfptype_simple>")])
+
(define_insn "vsx_btrunc<mode>2"
[(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
@@ -765,22 +884,26 @@ (define_insn "vsx_ceil<mode>2"
;; VSX convert to/from double vector
+;; Convert between single and double precision
+;; Don't use xscvspdp and xscvdpsp for scalar conversions, since the normal
+;; scalar single precision instructions internally use the double format.
+;; Prefer the Altivec registers, since we likely will need to do a vperm.
+(define_insn "vsx_<VS_spdp_insn>"
+ [(set (match_operand:<VS_spdp_res> 0 "vsx_register_operand" "=<VSr4>,?wa")
+ (unspec:<VS_spdp_res> [(match_operand:VSX_SPDP 1 "vsx_register_operand" "<VSr5>,wa")]
+ UNSPEC_VSX_CVSPDP))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "<VS_spdp_insn> %x0,%x1"
+ [(set_attr "type" "<VS_spdp_type>")])
+
;; Convert from 64-bit to 32-bit types
;; Note, favor the Altivec registers since the usual use of these instructions
;; is in vector converts and we need to use the Altivec vperm instruction.
-(define_insn "vsx_xvcvdpsp"
- [(set (match_operand:V4SF 0 "vsx_register_operand" "=v,?wa")
- (unspec:V4SF [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")]
- UNSPEC_VSX_XVCVDPSP))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
- "xvcvdpsp %x0,%x1"
- [(set_attr "type" "vecfloat")])
-
(define_insn "vsx_xvcvdpsxws"
[(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa")
(unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")]
- UNSPEC_VSX_XVCVDPSXWS))]
+ UNSPEC_VSX_CVDPSXWS))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvdpsxws %x0,%x1"
[(set_attr "type" "vecfloat")])
@@ -788,24 +911,32 @@ (define_insn "vsx_xvcvdpsxws"
(define_insn "vsx_xvcvdpuxws"
[(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa")
(unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")]
- UNSPEC_VSX_XVCVDPUXWS))]
+ UNSPEC_VSX_CVDPUXWS))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvdpuxws %x0,%x1"
[(set_attr "type" "vecfloat")])
-;; Convert from 32-bit to 64-bit types
-(define_insn "vsx_xvcvspdp"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
- (unspec:V2DF [(match_operand:V4SF 1 "vsx_register_operand" "wf,wa")]
- UNSPEC_VSX_XVCVSPDP))]
+(define_insn "vsx_xvcvsxdsp"
+ [(set (match_operand:V4SI 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wf,wa")]
+ UNSPEC_VSX_CVSXDSP))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xvcvsxdsp %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "vsx_xvcvuxdsp"
+ [(set (match_operand:V4SI 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wf,wa")]
+ UNSPEC_VSX_CVUXDSP))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
- "xvcvspdp %x0,%x1"
+ "xvcvuxwdp %x0,%x1"
[(set_attr "type" "vecfloat")])
+;; Convert from 32-bit to 64-bit types
(define_insn "vsx_xvcvsxwdp"
[(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
(unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")]
- UNSPEC_VSX_XVCVSXWDP))]
+ UNSPEC_VSX_CVSXWDP))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvsxwdp %x0,%x1"
[(set_attr "type" "vecfloat")])
@@ -813,11 +944,26 @@ (define_insn "vsx_xvcvsxwdp"
(define_insn "vsx_xvcvuxwdp"
[(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
(unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")]
- UNSPEC_VSX_XVCVUXWDP))]
+ UNSPEC_VSX_CVUXWDP))]
"VECTOR_UNIT_VSX_P (V2DFmode)"
"xvcvuxwdp %x0,%x1"
[(set_attr "type" "vecfloat")])
+(define_insn "vsx_xvcvspsxds"
+ [(set (match_operand:V2DI 0 "vsx_register_operand" "=v,?wa")
+ (unspec:V2DI [(match_operand:V4SF 1 "vsx_register_operand" "wd,wa")]
+ UNSPEC_VSX_CVSPSXDS))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xvcvspsxds %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "vsx_xvcvspuxds"
+ [(set (match_operand:V2DI 0 "vsx_register_operand" "=v,?wa")
+ (unspec:V2DI [(match_operand:V4SF 1 "vsx_register_operand" "wd,wa")]
+ UNSPEC_VSX_CVSPUXDS))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xvcvspuxds %x0,%x1"
+ [(set_attr "type" "vecfloat")])
;; Logical and permute operations
(define_insn "*vsx_and<mode>3"
@@ -877,24 +1023,25 @@ (define_insn "*vsx_andc<mode>3"
;; Permute operations
-(define_insn "vsx_concat_v2df"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
- (unspec:V2DF
- [(match_operand:DF 1 "vsx_register_operand" "ws,wa")
- (match_operand:DF 2 "vsx_register_operand" "ws,wa")]
- UNSPEC_VSX_CONCAT_V2DF))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+;; Build a V2DF/V2DI vector from two scalars
+(define_insn "vsx_concat_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:VSX_D
+ [(match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,wa")
+ (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")]
+ UNSPEC_VSX_CONCAT))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxpermdi %x0,%x1,%x2,0"
[(set_attr "type" "vecperm")])
-;; Set a double into one element
-(define_insn "vsx_set_v2df"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
- (vec_merge:V2DF
- (match_operand:V2DF 1 "vsx_register_operand" "wd,wa")
- (vec_duplicate:V2DF (match_operand:DF 2 "vsx_register_operand" "ws,f"))
- (match_operand:QI 3 "u5bit_cint_operand" "i,i")))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+;; Set an element of a V2DF/V2DI vector
+(define_insn "vsx_set_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:VSX_D [(match_operand:VSX_D 1 "vsx_register_operand" "wd,wa")
+ (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")
+ (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
+ UNSPEC_VSX_SET))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
{
if (INTVAL (operands[3]) == 0)
return \"xxpermdi %x0,%x1,%x2,1\";
@@ -906,12 +1053,12 @@ (define_insn "vsx_set_v2df"
[(set_attr "type" "vecperm")])
;; Extract a DF element from V2DF
-(define_insn "vsx_extract_v2df"
- [(set (match_operand:DF 0 "vsx_register_operand" "=ws,f,?wa")
- (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd,wd,wa")
+(define_insn "vsx_extract_<mode>"
+ [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=ws,f,?wa")
+ (vec_select:<VS_scalar> (match_operand:VSX_D 1 "vsx_register_operand" "wd,wd,wa")
(parallel
[(match_operand:QI 2 "u5bit_cint_operand" "i,i,i")])))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
{
gcc_assert (UINTVAL (operands[2]) <= 1);
operands[3] = GEN_INT (INTVAL (operands[2]) << 1);
@@ -919,17 +1066,30 @@ (define_insn "vsx_extract_v2df"
}
[(set_attr "type" "vecperm")])
-;; General V2DF permute, extract_{high,low,even,odd}
-(define_insn "vsx_xxpermdi"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd")
- (vec_concat:V2DF
- (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd")
- (parallel
- [(match_operand:QI 2 "u5bit_cint_operand" "i")]))
- (vec_select:DF (match_operand:V2DF 3 "vsx_register_operand" "wd")
- (parallel
- [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+;; General V2DF/V2DI permute
+(define_insn "vsx_xxpermdi_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:VSX_D [(match_operand:VSX_D 1 "vsx_register_operand" "wd,wa")
+ (match_operand:VSX_D 2 "vsx_register_operand" "wd,wa")
+ (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
+ UNSPEC_VSX_XXPERMDI))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxpermdi %x0,%x1,%x2,%3"
+ [(set_attr "type" "vecperm")])
+
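The interleave variant below assembles the 2-bit xxpermdi DM immediate from the two element selectors; the encoding as a standalone sketch:

    /* Bit 1 selects the element taken from the first input, bit 0 the
       element taken from the second; mirrors the GEN_INT expression in
       the insn below.  */
    static int
    xxpermdi_dm (int elt_from_op1, int elt_from_op2)
    {
      return ((elt_from_op1 & 1) << 1) | (elt_from_op2 & 1);
    }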
+;; Variant of xxpermdi that is emitted by the vec_interleave functions
+(define_insn "*vsx_xxpermdi2_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd")
+ (vec_concat:VSX_D
+ (vec_select:<VS_scalar>
+ (match_operand:VSX_D 1 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i")]))
+ (vec_select:<VS_scalar>
+ (match_operand:VSX_D 3 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
{
gcc_assert ((UINTVAL (operands[2]) <= 1) && (UINTVAL (operands[4]) <= 1));
operands[5] = GEN_INT (((INTVAL (operands[2]) & 1) << 1)
@@ -939,11 +1099,11 @@ (define_insn "vsx_xxpermdi"
[(set_attr "type" "vecperm")])
;; V2DF splat
-(define_insn "vsx_splatv2df"
- [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa")
- (vec_duplicate:V2DF
- (match_operand:DF 1 "input_operand" "ws,f,Z,wa,wa,Z")))]
- "VECTOR_UNIT_VSX_P (V2DFmode)"
+(define_insn "vsx_splat_<mode>"
+ [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa")
+ (vec_duplicate:VSX_D
+ (match_operand:<VS_scalar> 1 "input_operand" "ws,f,Z,wa,wa,Z")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
"@
xxpermdi %x0,%x1,%x1,0
xxpermdi %x0,%x1,%x1,0
@@ -953,52 +1113,66 @@ (define_insn "vsx_splatv2df"
lxvdsx %x0,%y1"
[(set_attr "type" "vecperm,vecperm,vecload,vecperm,vecperm,vecload")])
-;; V4SF splat
-(define_insn "*vsx_xxspltw"
- [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,?wa")
- (vec_duplicate:V4SF
- (vec_select:SF (match_operand:V4SF 1 "vsx_register_operand" "wf,wa")
- (parallel
- [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
- "VECTOR_UNIT_VSX_P (V4SFmode)"
+;; V4SF/V4SI splat
+(define_insn "vsx_xxspltw_<mode>"
+ [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+ (vec_duplicate:VSX_W
+ (vec_select:<VS_scalar>
+ (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxspltw %x0,%x1,%2"
[(set_attr "type" "vecperm")])
-;; V4SF interleave
-(define_insn "vsx_xxmrghw"
- [(set (match_operand:V4SF 0 "register_operand" "=wf,?wa")
- (vec_merge:V4SF
- (vec_select:V4SF (match_operand:V4SF 1 "vsx_register_operand" "wf,wa")
- (parallel [(const_int 0)
- (const_int 2)
- (const_int 1)
- (const_int 3)]))
- (vec_select:V4SF (match_operand:V4SF 2 "vsx_register_operand" "wf,wa")
- (parallel [(const_int 2)
- (const_int 0)
- (const_int 3)
- (const_int 1)]))
+;; V4SF/V4SI interleave
+(define_insn "vsx_xxmrghw_<mode>"
+ [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+ (vec_merge:VSX_W
+ (vec_select:VSX_W
+ (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (vec_select:VSX_W
+ (match_operand:VSX_W 2 "vsx_register_operand" "wf,wa")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
(const_int 5)))]
- "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxmrghw %x0,%x1,%x2"
[(set_attr "type" "vecperm")])
-(define_insn "vsx_xxmrglw"
- [(set (match_operand:V4SF 0 "register_operand" "=wf,?wa")
- (vec_merge:V4SF
- (vec_select:V4SF
- (match_operand:V4SF 1 "register_operand" "wf,wa")
+(define_insn "vsx_xxmrglw_<mode>"
+ [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+ (vec_merge:VSX_W
+ (vec_select:VSX_W
+ (match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
(parallel [(const_int 2)
(const_int 0)
(const_int 3)
(const_int 1)]))
- (vec_select:V4SF
- (match_operand:V4SF 2 "register_operand" "wf,?wa")
+ (vec_select:VSX_W
+ (match_operand:VSX_W 2 "vsx_register_operand" "wf,?wa")
(parallel [(const_int 0)
(const_int 2)
(const_int 1)
(const_int 3)]))
(const_int 5)))]
- "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
"xxmrglw %x0,%x1,%x2"
[(set_attr "type" "vecperm")])
+
+;; Shift left double by word immediate
+(define_insn "vsx_xxsldwi_<mode>"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=wa")
+ (unspec:VSX_L [(match_operand:VSX_L 1 "vsx_register_operand" "wa")
+ (match_operand:VSX_L 2 "vsx_register_operand" "wa")
+ (match_operand:QI 3 "u5bit_cint_operand" "i")]
+ UNSPEC_VSX_SLDWI))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxsldwi %x0,%x1,%x2,%3"
+ [(set_attr "type" "vecperm")])
--- gcc/config/rs6000/rs6000.h (revision 146119)
+++ gcc/config/rs6000/rs6000.h (revision 146798)
@@ -1033,14 +1033,6 @@ extern int rs6000_vector_align[];
((MODE) == V4SFmode \
|| (MODE) == V2DFmode) \
-#define VSX_VECTOR_MOVE_MODE(MODE) \
- ((MODE) == V16QImode \
- || (MODE) == V8HImode \
- || (MODE) == V4SImode \
- || (MODE) == V2DImode \
- || (MODE) == V4SFmode \
- || (MODE) == V2DFmode) \
-
#define VSX_SCALAR_MODE(MODE) \
((MODE) == DFmode)
@@ -1049,12 +1041,9 @@ extern int rs6000_vector_align[];
|| VSX_SCALAR_MODE (MODE))
#define VSX_MOVE_MODE(MODE) \
- (VSX_VECTOR_MOVE_MODE (MODE) \
- || VSX_SCALAR_MODE(MODE) \
- || (MODE) == V16QImode \
- || (MODE) == V8HImode \
- || (MODE) == V4SImode \
- || (MODE) == V2DImode \
+ (VSX_VECTOR_MODE (MODE) \
+ || VSX_SCALAR_MODE (MODE) \
+ || ALTIVEC_VECTOR_MODE (MODE) \
|| (MODE) == TImode)
#define ALTIVEC_VECTOR_MODE(MODE) \
@@ -1304,12 +1293,24 @@ enum reg_class
purpose. Any move between two registers of a cover class should be
cheaper than load or store of the registers. The macro value is
array of register classes with LIM_REG_CLASSES used as the end
- marker. */
+ marker.
+
+   We need two IRA_COVER_CLASSES variants: one for pre-VSX targets and one
+   for VSX, since the Altivec and floating point registers are subsets of
+   the VSX register set.  */
+
+#define IRA_COVER_CLASSES_PRE_VSX \
+{ \
+ GENERAL_REGS, SPECIAL_REGS, FLOAT_REGS, ALTIVEC_REGS, /* VSX_REGS, */ \
+ /* VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
+ /* MQ_REGS, LINK_REGS, CTR_REGS, */ \
+ CR_REGS, XER_REGS, LIM_REG_CLASSES \
+}
-#define IRA_COVER_CLASSES \
+#define IRA_COVER_CLASSES_VSX \
{ \
- GENERAL_REGS, SPECIAL_REGS, FLOAT_REGS, ALTIVEC_REGS, \
- /*VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
+ GENERAL_REGS, SPECIAL_REGS, /* FLOAT_REGS, ALTIVEC_REGS, */ VSX_REGS, \
+ /* VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
/* MQ_REGS, LINK_REGS, CTR_REGS, */ \
CR_REGS, XER_REGS, LIM_REG_CLASSES \
}
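+
+/* A sketch (not part of this patch) of how the backend might choose between
+   the two tables, assuming a target hook along these lines in rs6000.c:
+
+     static const enum reg_class *
+     rs6000_ira_cover_classes (void)
+     {
+       static const enum reg_class pre_vsx[] = IRA_COVER_CLASSES_PRE_VSX;
+       static const enum reg_class vsx[] = IRA_COVER_CLASSES_VSX;
+       return TARGET_VSX ? vsx : pre_vsx;
+     }  */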
@@ -3371,21 +3372,36 @@ enum rs6000_builtins
VSX_BUILTIN_XVTDIVSP,
VSX_BUILTIN_XVTSQRTDP,
VSX_BUILTIN_XVTSQRTSP,
- VSX_BUILTIN_XXLAND,
- VSX_BUILTIN_XXLANDC,
- VSX_BUILTIN_XXLNOR,
- VSX_BUILTIN_XXLOR,
- VSX_BUILTIN_XXLXOR,
- VSX_BUILTIN_XXMRGHD,
- VSX_BUILTIN_XXMRGHW,
- VSX_BUILTIN_XXMRGLD,
- VSX_BUILTIN_XXMRGLW,
- VSX_BUILTIN_XXPERMDI,
- VSX_BUILTIN_XXSEL,
- VSX_BUILTIN_XXSLDWI,
- VSX_BUILTIN_XXSPLTD,
- VSX_BUILTIN_XXSPLTW,
- VSX_BUILTIN_XXSWAPD,
+ VSX_BUILTIN_XXSEL_2DI,
+ VSX_BUILTIN_XXSEL_2DF,
+ VSX_BUILTIN_XXSEL_4SI,
+ VSX_BUILTIN_XXSEL_4SF,
+ VSX_BUILTIN_XXSEL_8HI,
+ VSX_BUILTIN_XXSEL_16QI,
+ VSX_BUILTIN_VPERM_2DI,
+ VSX_BUILTIN_VPERM_2DF,
+ VSX_BUILTIN_VPERM_4SI,
+ VSX_BUILTIN_VPERM_4SF,
+ VSX_BUILTIN_VPERM_8HI,
+ VSX_BUILTIN_VPERM_16QI,
+ VSX_BUILTIN_XXPERMDI_2DF,
+ VSX_BUILTIN_XXPERMDI_2DI,
+ VSX_BUILTIN_CONCAT_2DF,
+ VSX_BUILTIN_CONCAT_2DI,
+ VSX_BUILTIN_SET_2DF,
+ VSX_BUILTIN_SET_2DI,
+ VSX_BUILTIN_SPLAT_2DF,
+ VSX_BUILTIN_SPLAT_2DI,
+ VSX_BUILTIN_XXMRGHW_4SF,
+ VSX_BUILTIN_XXMRGHW_4SI,
+ VSX_BUILTIN_XXMRGLW_4SF,
+ VSX_BUILTIN_XXMRGLW_4SI,
+ VSX_BUILTIN_XXSLDWI_16QI,
+ VSX_BUILTIN_XXSLDWI_8HI,
+ VSX_BUILTIN_XXSLDWI_4SI,
+ VSX_BUILTIN_XXSLDWI_4SF,
+ VSX_BUILTIN_XXSLDWI_2DI,
+ VSX_BUILTIN_XXSLDWI_2DF,
/* VSX overloaded builtins, add the overloaded functions not present in
Altivec. */
@@ -3395,7 +3411,13 @@ enum rs6000_builtins
VSX_BUILTIN_VEC_NMADD,
VSX_BUITLIN_VEC_NMSUB,
VSX_BUILTIN_VEC_DIV,
- VSX_BUILTIN_OVERLOADED_LAST = VSX_BUILTIN_VEC_DIV,
+ VSX_BUILTIN_VEC_XXMRGHW,
+ VSX_BUILTIN_VEC_XXMRGLW,
+ VSX_BUILTIN_VEC_XXPERMDI,
+ VSX_BUILTIN_VEC_XXSLDWI,
+ VSX_BUILTIN_VEC_XXSPLTD,
+ VSX_BUILTIN_VEC_XXSPLTW,
+ VSX_BUILTIN_OVERLOADED_LAST = VSX_BUILTIN_VEC_XXSPLTW,
/* Combined VSX/Altivec builtins. */
VECTOR_BUILTIN_FLOAT_V4SI_V4SF,
@@ -3425,13 +3447,16 @@ enum rs6000_builtin_type_index
RS6000_BTI_unsigned_V16QI,
RS6000_BTI_unsigned_V8HI,
RS6000_BTI_unsigned_V4SI,
+ RS6000_BTI_unsigned_V2DI,
RS6000_BTI_bool_char, /* __bool char */
RS6000_BTI_bool_short, /* __bool short */
RS6000_BTI_bool_int, /* __bool int */
+ RS6000_BTI_bool_long, /* __bool long */
RS6000_BTI_pixel, /* __pixel */
RS6000_BTI_bool_V16QI, /* __vector __bool char */
RS6000_BTI_bool_V8HI, /* __vector __bool short */
RS6000_BTI_bool_V4SI, /* __vector __bool int */
+ RS6000_BTI_bool_V2DI, /* __vector __bool long */
RS6000_BTI_pixel_V8HI, /* __vector __pixel */
RS6000_BTI_long, /* long_integer_type_node */
RS6000_BTI_unsigned_long, /* long_unsigned_type_node */
@@ -3466,13 +3491,16 @@ enum rs6000_builtin_type_index
#define unsigned_V16QI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V16QI])
#define unsigned_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V8HI])
#define unsigned_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V4SI])
+#define unsigned_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_unsigned_V2DI])
#define bool_char_type_node (rs6000_builtin_types[RS6000_BTI_bool_char])
#define bool_short_type_node (rs6000_builtin_types[RS6000_BTI_bool_short])
#define bool_int_type_node (rs6000_builtin_types[RS6000_BTI_bool_int])
+#define bool_long_type_node (rs6000_builtin_types[RS6000_BTI_bool_long])
#define pixel_type_node (rs6000_builtin_types[RS6000_BTI_pixel])
#define bool_V16QI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V16QI])
#define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI])
#define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI])
+#define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI])
#define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI])
#define long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long])
--- gcc/config/rs6000/altivec.md (revision 146119)
+++ gcc/config/rs6000/altivec.md (revision 146798)
@@ -166,12 +166,15 @@ (define_mode_iterator V [V4SI V8HI V16QI
;; otherwise handled by altivec (v2df, v2di, ti)
(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI TI])
+;; Like VM, except don't do TImode
+(define_mode_iterator VM2 [V4SI V8HI V16QI V4SF V2DF V2DI])
+
(define_mode_attr VI_char [(V4SI "w") (V8HI "h") (V16QI "b")])
;; Vector move instructions.
(define_insn "*altivec_mov<mode>"
- [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,*o,*r,*r,v,v")
- (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
+ [(set (match_operand:VM2 0 "nonimmediate_operand" "=Z,v,v,*o,*r,*r,v,v")
+ (match_operand:VM2 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
"VECTOR_MEM_ALTIVEC_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
@@ -191,6 +194,31 @@ (define_insn "*altivec_mov<mode>"
}
[(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")])
+;; Unlike other altivec moves, allow the GPRs, since a normal use of TImode
+;; is for unions.  However, for plain data movement, slightly favor the
+;; vector loads.
+(define_insn "*altivec_movti"
+ [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,v,v,?o,?r,?r,v,v")
+ (match_operand:TI 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
+ "VECTOR_MEM_ALTIVEC_P (TImode)
+ && (register_operand (operands[0], TImode)
+ || register_operand (operands[1], TImode))"
+{
+ switch (which_alternative)
+ {
+ case 0: return "stvx %1,%y0";
+ case 1: return "lvx %0,%y1";
+ case 2: return "vor %0,%1,%1";
+ case 3: return "#";
+ case 4: return "#";
+ case 5: return "#";
+ case 6: return "vxor %0,%0,%0";
+ case 7: return output_vec_const_move (operands);
+ default: gcc_unreachable ();
+ }
+}
+ [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")])
+
(define_split
[(set (match_operand:VM 0 "altivec_register_operand" "")
(match_operand:VM 1 "easy_vector_constant_add_self" ""))]
@@ -434,13 +462,13 @@ (define_insn "*altivec_gev4sf"
"vcmpgefp %0,%1,%2"
[(set_attr "type" "veccmp")])
-(define_insn "altivec_vsel<mode>"
+(define_insn "*altivec_vsel<mode>"
[(set (match_operand:VM 0 "altivec_register_operand" "=v")
(if_then_else:VM (ne (match_operand:VM 1 "altivec_register_operand" "v")
(const_int 0))
(match_operand:VM 2 "altivec_register_operand" "v")
(match_operand:VM 3 "altivec_register_operand" "v")))]
- "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vsel %0,%3,%2,%1"
[(set_attr "type" "vecperm")])
@@ -780,7 +808,7 @@ (define_insn "altivec_vmrghw"
(const_int 3)
(const_int 1)]))
(const_int 5)))]
- "TARGET_ALTIVEC"
+ "VECTOR_MEM_ALTIVEC_P (V4SImode)"
"vmrghw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -797,7 +825,7 @@ (define_insn "*altivec_vmrghsf"
(const_int 3)
(const_int 1)]))
(const_int 5)))]
- "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
+ "VECTOR_MEM_ALTIVEC_P (V4SFmode)"
"vmrghw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -881,7 +909,7 @@ (define_insn "altivec_vmrglw"
(const_int 1)
(const_int 3)]))
(const_int 5)))]
- "TARGET_ALTIVEC"
+ "VECTOR_MEM_ALTIVEC_P (V4SImode)"
"vmrglw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -899,7 +927,7 @@ (define_insn "*altivec_vmrglsf"
(const_int 1)
(const_int 3)]))
(const_int 5)))]
- "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
+ "VECTOR_MEM_ALTIVEC_P (V4SFmode)"
"vmrglw %0,%1,%2"
[(set_attr "type" "vecperm")])
--- gcc/config/rs6000/rs6000.md (revision 146119)
+++ gcc/config/rs6000/rs6000.md (revision 146798)
@@ -14667,7 +14667,11 @@ (define_insn "return"
[(set_attr "type" "jmpreg")])
(define_expand "indirect_jump"
- [(set (pc) (match_operand 0 "register_operand" ""))])
+ [(set (pc) (match_operand 0 "register_operand" ""))]
+ ""
+{
+ rs6000_set_indirect_jump ();
+})
(define_insn "*indirect_jump<mode>"
[(set (pc) (match_operand:P 0 "register_operand" "c,*l"))]
@@ -14682,14 +14686,14 @@ (define_expand "tablejump"
[(use (match_operand 0 "" ""))
(use (label_ref (match_operand 1 "" "")))]
""
- "
{
+ rs6000_set_indirect_jump ();
if (TARGET_32BIT)
emit_jump_insn (gen_tablejumpsi (operands[0], operands[1]));
else
emit_jump_insn (gen_tablejumpdi (operands[0], operands[1]));
DONE;
-}")
+})
(define_expand "tablejumpsi"
[(set (match_dup 3)
@@ -14749,6 +14753,11 @@ (define_expand "doloop_end"
/* Only use this on innermost loops. */
if (INTVAL (operands[3]) > 1)
FAIL;
+ /* Do not try to use decrement and count on code that has an indirect
+ jump or a table jump, because the ctr register is preferred over the
+ lr register. */
+ if (rs6000_has_indirect_jump_p ())
+ FAIL;
if (TARGET_64BIT)
{
if (GET_MODE (operands[0]) != DImode)