2009-04-08  Michael Meissner

	* doc/invoke.texi (-mvsx-vector-memory): Make debug switches undocumented.
	(-mvsx-vector-float): Ditto.
	(-mvsx-vector-double): Ditto.
	(-mvsx-scalar-double): Ditto.
	(-mvsx-scalar-memory): Ditto.
	* config/rs6000/vector.md (VEC_R): New iterator for reload patterns.
	(mov): Don't call rs6000_emit_move.
	(reload___*): New insns for secondary_reload support.
	(vec_reload_and_plus_): New insns in case reload needs to move a VSX/Altivec (and (plus reg reg) -16) type address to a base register.
	* config/rs6000/rs6000-protos.h (rs6000_secondary_reload_inner): Rename from rs6000_vector_secondary_reload.
	* config/rs6000/rs6000.opt (-mvsx-vector-memory): Make debug switches undocumented.
	(-mvsx-vector-float): Ditto.
	(-mvsx-vector-double): Ditto.
	(-mvsx-scalar-double): Ditto.
	(-mvsx-scalar-memory): Ditto.
	(-mvsx-v4sf-altivec-regs): New undocumented debug switch to control whether V4SF types prefer the Altivec registers or all of the VSX registers.
	(-mreload-functions): New undocumented debug switch to enable/disable the secondary reload support.
	* config/rs6000/rs6000.c (rs6000_regno_regclass): New global to map register number to regclass.
	(rs6000_vector_reload): New array to hold the secondary reload insn codes for the vector types.
	(rs6000_init_hard_regno_mode_ok): Fill in rs6000_regno_regclass and rs6000_vector_reload.
	(rs6000_mode_dependent_address): Using AND in addresses is mode dependent.
	(rs6000_emit_move): Add debug information if -mdebug=addr.
	(rs6000_reload_register_type): Classify register classes for secondary reload.
	(rs6000_secondary_reload): For the vector types, add reload support to support reg+reg addressing for gprs, and reg+offset addressing for vector registers.
	(rs6000_secondary_reload_inner): Rename from rs6000_vector_secondary_reload. Fixup gpr addressing to just reg or reg+offset, and vector addressing to just reg or reg+reg.
	(rs6000_preferred_reload_class): Make sure all cases set the return value.
	If VSX/Altivec address with AND -16, prefer using an Altivec register.
	(rs6000_secondary_memory_needed): Handle things like SFmode that can go in floating registers, but not altivec registers under -mvsx.
	* config/rs6000/vsx.md (VSX_U): New iterator for load/store with update.
	(VSi, VSI): Reorder fields.
	(VSd): Add support for load/store with update rewrite.
	(VSv): Ditto.
	(VStype_load_update): New mode attribute for load/store with update.
	(VStype_store_update): Ditto.
	(vsx_mov): Use * instead of store/load attributes for multi-instruction gpr loads/stores.
	(vsx_reload**): Delete unused reload patterns.
	* config/rs6000/rs6000.h (REGNO_REG_CLASS): Change from a bunch of if statements to using a lookup table.
	(rs6000_regno_regclass): Lookup table for REGNO_REG_CLASS.
	* config/rs6000/altivec.md (altivec_reload*): Delete unused reload patterns.
	* config/rs6000/rs6000.md (tptrsize, mptrsize): New mode attributes for -m32/-m64 support.

2009-03-27  Jakub Jelinek

	PR target/39558
	* macro.c (cpp_get_token): If macro_to_expand returns NULL and used some tokens, add CPP_PADDING before next token.
	* gcc.target/powerpc/altivec-29.c: New test.

2009-03-27  Michael Meissner

	* config/rs6000/constraints.md ("wZ" constraint): New constraint for using Altivec loads/stores under VSX for vector realignment.
	* config/rs6000/predicates.md (altivec_indexed_or_indirect_operand): New predicate to recognize Altivec load/stores with an explicit AND -16.
	* config/rs6000/power7.md: Whitespace change.
	* config/rs6000/rs6000.opt (-mpower7-adjust-cost): New debug switch.
	* config/rs6000/rs6000-c.c (altivec_categorize_keyword): If -mvsx and -mno-altivec, recognize the 'vector' keyword, but do not recognize 'bool' or 'pixel'. Recognize vector double under VSX.
	(init_vector_keywords): Ditto.
	(rs6000_macro_to_expand): Ditto.
	(altivec_overloaded_builtins): Add VSX overloaded builtins.
	(altivec_resolve_overloaded_builtin): Ditto.
	* config/rs6000/rs6000.c (rs6000_debug_cost): New global for -mdebug=cost.
	(rs6000_debug_address_cost): New function for printing costs if -mdebug=cost.
	(rs6000_debug_rtx_costs): Ditto.
	(rs6000_debug_adjust_cost): Ditto.
	(rs6000_override_options): Add -mdebug=cost support.
	(rs6000_legitimize_reload_address): Allow Altivec loads and stores with an explicit AND -16, in VSX for vector realignment.
	(rs6000_legitimize_reload_address): Ditto.
	(rs6000_legitimate_address): Ditto.
	(print_operand): Ditto.
	(bdesc_3arg): Add VSX builtins.
	(bdesc_2arg): Ditto.
	(bdesc_1arg): Ditto.
	(bdesc_abs): Ditto.
	(vsx_expand_builtin): Stub function for expanding VSX builtins.
	(rs6000_expand_builtin): Call vsx_expand_builtin.
	* config/rs6000/vsx.md (most DF insns): Merge DF insns in with V2DF and V4SF insns, rather than duplicating much of the code.
	(all insns): Go through all insns, and alternatives to address the full VSX register set, as a non-preferred option.
	(vsx_mod): Add support for using Altivec load/store with explicit AND -16. Use xxlor to copy registers, not copy sign.
	(multiply/add insns): Add an expander and unspec so the insn can be used directly even if -mno-fused-madd.
	(vsx_tdiv3): New insn for use as a builtin function.
	(vsx_tsqrt2): Ditto.
	(vsx_rsqrte2): Ditto.
	* config/rs6000/rs6000.h (rs6000_debug_cost): New for -mdebug=cost.
	(TARGET_DEBUG_COST): Ditto.
	(VSX_BUILTIN_*): Merge the two forms of multiply/add instructions into a single insn. Start to add overloaded VSX builtins.
	* config/rs6000/altivec.md (build_vector_mask_for_load): Delete VSX code.
	* config/rs6000/rs6000.md (btruncsf2): Delete extra space.
	(movdf_hardfloat32): Use xxlor instead of xscpsgndp to copy data.
	(movdf_hardfloat64_mfpgpr): Ditto.
	(movdf_hardfloat64): Ditto.

2009-03-13  Michael Meissner

	PR target/39457
	* config/rs6000/rs6000.opt (-mdisallow-float-in-lr-ctr): Add temporary debug switch.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Revert behavior of disallowing

2009-03-13  Michael Meissner

	* config/rs6000/vector.md (vec_extract_evenv2df): Delete, insn causes problems in building spec 2006.
	(vec_extract_oddv2df): Ditto.
	(vec_pack_trunc_v2df): New expanders for VSX vectorized conversions.
	(vec_pack_sfix_trunc_v2df): Ditto.
	(vec_pack_ufix_trunc_v2df): Ditto.
	(vec_unpacks_hi_v4sf): Ditto.
	(vec_unpacks_lo_v4sf): Ditto.
	(vec_unpacks_float_hi_v4si): Ditto.
	(vec_unpacks_float_lo_v4si): Ditto.
	(vec_unpacku_float_hi_v4si): Ditto.
	(vec_unpacku_float_lo_v4si): Ditto.
	* config/rs6000/rs6000-protos.h (rs6000_vector_secondary_reload): Declaration for new target hook.
	* config/rs6000/rs6000.c (TARGET_SECONDARY_RELOAD): Add new target hook for eventually fixing up the memory references for Altivec and VSX reloads to be reg+reg instead of reg+offset. Right now, this is a stub function that prints debug information if -mdebug=addr and then calls default_secondary_reload.
	(rs6000_secondary_reload): Ditto.
	(rs6000_vector_secondary_reload): Ditto.
	(rs6000_builtin_conversion): Add support for V2DI/V2DF conversions.
	(rs6000_legitimate_offset_address_p): Test for the vector unit doing the memory references.
	(rs6000_legitimize_reload_address): Ditto.
	(rs6000_legitimize_address): Print extra \n if -mdebug=addr.
	(rs6000_legitimize_reload_address): Ditto.
	(rs6000_legitimate_address): Ditto.
	(rs6000_mode_dependent_address): Ditto.
	(bdesc_2arg): Add VSX builtins.
	(bdesc_abs): Ditto.
	(bdesc_1arg): Ditto.
	(altivec_init_builtins): Ditto.
	(rs6000_secondary_memory_needed_rtx): Add debug support if -mdebug=addr.
	(rs6000_preferred_reload_class): Ditto.
	(rs6000_secondary_memory_needed): Ditto.
	(rs6000_secondary_reload_class): Ditto.
	(rs6000_cannot_change_mode_class): Ditto.
	* config/rs6000/vsx.md (UNSPEC_VSX_*): Add unspecs for VSX conversions.
	(vsx_nabs): Add generator function.
	(vsx_float2): Ditto.
	(vsx_floatuns2): Ditto.
	(vsx_xxmrghw): Ditto.
	(vsx_xxmrglw): Ditto.
	(vsx_xvcvdpsp): New VSX vector conversion insn.
	(vsx_xvcvdpsxws): Ditto.
	(vsx_xvcvdpuxws): Ditto.
	(vsx_xvcvspdp): Ditto.
	(vsx_xvcvsxwdp): Ditto.
	(vsx_xvcvuxwdp): Ditto.
	(vsx_reload_*): New insns for reload support.
	* config/rs6000/rs6000.h: Fix a comment.
	* config/rs6000/altivec.md (altivec_reload_*): New insns for reload support.
	* config/rs6000/rs6000.md (ptrsize): New mode attribute for the pointer size.

2009-03-10  Michael Meissner

	* config/rs6000/vsx.md (vsx_concat_v2df): Add explicit 'f' register class for scalar data, correct uses of the xxpermdi instruction.
	(vsx_set_v2df): Ditto.
	(vsx_extract_v2df): Ditto.
	(vsx_xxpermdi): Ditto.
	(vsx_splatv2df): Ditto.
	(vsx_xxmrghw): Use wf instead of v constraints.
	(vsx_xxmrglw): Ditto.

2009-03-09  Michael Meissner

	* config/rs6000/vsx.md (vsx_store_update64): Use correct registers for store with update.
	(vsx_store_update32): Ditto.
	(vsx_storedf_update): Ditto.

2009-03-06  Michael Meissner

	* doc/invoke.texi (-mvsx-scalar-memory): New switch, to switch to use VSX reg+reg addressing for all scalar double precision floating point.
	* config/rs6000/rs6000.opt (-mvsx-scalar-memory): Ditto.
	* configure.ac (gcc_cv_as_powerpc_mfpgpr): Set binutils version to 2.19.2.
	(gcc_cv_as_powerpc_cmpb): Ditto.
	(gcc_cv_as_powerpc_dfp): Ditto.
	(gcc_cv_as_powerpc_vsx): Ditto.
	(gcc_cv_as_powerpc_popcntd): Ditto.
	* configure: Regenerate.
	* config/rs6000/vector.md (VEC_int): New mode attribute for vector conversions.
	(VEC_INT): Ditto.
	(ftrunc2): Make this a define_expand.
	(float2): New vector conversion support to add VSX 32 bit int/32 bit floating point convert and 64 bit int/64 bit floating point vector instructions.
	(unsigned_float2): Ditto.
	(fix_trunc2): Ditto.
	(fixuns_trunc2): Ditto.
	* config/rs6000/predicates.md (easy_fp_constant): 0.0 is an easy constant under VSX.
	(indexed_or_indirect_operand): Add VSX load/store with update support.
	* config/rs6000/rs6000.c (rs6000_debug_addr): New global for -mdebug=addr.
	(rs6000_init_hard_regno_mode_ok): Add -mvsx-scalar-memory support.
	(rs6000_override_options): Add -mdebug=addr support.
	(rs6000_builtin_conversion): Add VSX same size conversions.
	(rs6000_legitimize_address): Add -mdebug=addr support. Add support for VSX load/store with update instructions.
	(rs6000_legitimize_reload_address): Ditto.
	(rs6000_legitimate_address): Ditto.
	(rs6000_mode_dependent_address): Ditto.
	(print_operand): Ditto.
	(bdesc_1arg): Add builtins for conversion that calls either the VSX or Altivec insn pattern.
	(rs6000_common_init_builtins): Ditto.
	* config/rs6000/vsx.md (VSX_I): Delete, no longer used.
	(VSi): New mode attribute for conversions.
	(VSI): Ditto.
	(VSc): Ditto.
	(vsx_mov): Add load/store with update support.
	(vsx_load_update*): New insns for load/store with update support.
	(vsx_store_update*): Ditto.
	(vsx_fmadd4): Generate correct code for V4SF.
	(vsx_fmsub4): Ditto.
	(vsx_fnmadd4_*): Ditto.
	(vsx_fnmsub4_*): Ditto.
	(vsx_float2): New insn for vector conversion.
	(vsx_floatuns2): Ditto.
	(vsx_fix_trunc2): Ditto.
	(vsx_fixuns_trunc2): Ditto.
	(vsx_xxmrghw): New insn for V4SF interleave.
	(vsx_xxmrglw): Ditto.
	* config/rs6000/rs6000.h (rs6000_debug_addr): -mdebug=addr support.
	(TARGET_DEBUG_ADDR): Ditto.
	(rs6000_builtins): Add VSX instructions for eventual VSX builtins.
	* config/rs6000/altivec.md (altivec_vmrghsf): Don't do the altivec instruction if VSX.
	(altivec_vmrglsf): Ditto.
	* config/rs6000/rs6000.md (movdf_hardfloat32): Add support for using xxlxor to zero a floating register if VSX.
	(movdf_hardfloat64_mfpgpr): Ditto.
	(movdf_hardfloat64): Ditto.

2009-03-03  Michael Meissner

	* config/rs6000/vsx.md (vsx_xxmrglw): Delete for now, use Altivec.
	(vsx_xxmrghw): Ditto.
	* config/rs6000/altivec.md (altivec_vmrghsf): Use this insn even on VSX systems.
	(altivec_vmrglsf): Ditto.
	* config/rs6000/rs6000.h (ASM_CPU_NATIVE_SPEC): Use %(asm_default) if we are running as a cross compiler.
	* config/rs6000/vector.md (vec_interleave_highv4sf): Use correct constants for the extraction.
	(vec_interleave_lowv4sf): Ditto.
	* config/rs6000/rs6000.md (floordf2): Fix typo, make this a define_expand, not define_insn.
	* config/rs6000/aix53.h (ASM_CPU_SPEC): If -mcpu=native, call %:local_cpu_detect(asm) to get the appropriate assembler flags for the machine.
	* config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto.
	* config/rs6000/rs6000.h (ASM_CPU_SPEC): Ditto.
	(ASM_CPU_NATIVE_SPEC): New spec to get asm options if -mcpu=native.
	(EXTRA_SPECS): Add ASM_CPU_NATIVE_SPEC.
	* config/rs6000/driver-rs6000.c (asm_names): New static array to give the appropriate asm switches if -mcpu=native.
	(host_detect_local_cpu): Add support for "asm".
	* config/rs6000/rs6000.c (processor_target_table): Don't turn on -misel by default for power7.

2009-03-02  Michael Meissner

	* config/rs6000/rs6000.c (rs6000_emit_swdivdf): Revert last change, since we reverted the floating multiply/add changes.
	* doc/md.texi (Machine Constraints): Update rs6000 constraints.
	* config/rs6000/vector.md (neg2): Fix typo to enable vectorized negation.
	(ftrunc2): Move ftrunc expander here from altivec.md, and add V2DF case.
	(vec_interleave_highv4sf): Correct type to be V4SF, not V4SI.
	(vec_extract_evenv2df): Add expander.
	(vec_extract_oddv2df): Ditto.
	* config/rs6000/vsx.md (vsx_ftrunc2): New VSX pattern for truncate.
	(vsx_ftruncdf2): Ditto.
	(vsx_xxspltw): New instruction for word splat.
	(vsx_xxmrglw): Whitespace changes. Fix typo from V4SI to V4SF.
	(vsx_xxmrghw): Ditto.
	* config/rs6000/altivec.md (altivec_vmrghsf): Whitespace changes.
	(altivec_vmrglsf): Ditto.
	(altivec_vspltsf): Disable if we have VSX.
	(altivec_ftruncv4sf2): Move expander to vector.md, rename insn.
	* config/rs6000/rs6000.md (ftruncdf2): Add expander for VSX.
	* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Reenable vectorizing V4SF under altivec.
	(rs6000_hard_regno_mode_ok): Don't allow floating values in LR, CTR, MQ.
	Also, VRSAVE/VSCR are both 32-bits.
	(rs6000_init_hard_regno_mode_ok): Print some of the special registers if -mdebug=reg.
	* config/rs6000/rs6000.md (floating multiply/add insns): Go back to the original semantics for multiply add/subtract, particularly with -ffast-math.
	* config/rs6000/vsx.md (floating multiply/add insns): Mirror the rs6000 floating point multiply/add insns in VSX.

2009-03-01  Michael Meissner

	* config/rs6000/vector.md (VEC_L): Add TImode.
	(VEC_M): Like VEC_L, except no TImode.
	(VEC_base): Add TImode support.
	(mov): Use VEC_M, not VEC_L. If there is no extra optimization for the move, just generate the standard move.
	(vector_store_): Ditto.
	(vector_load_): Ditto.
	(vec_init): Use vec_init_operand predicate.
	* config/rs6000/predicates.md (vec_init_operand): New predicate.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Allow mode in a VSX register if there is a move operation.
	(rs6000_vector_reg_class): Add internal register number to the debug output.
	(rs6000_init_hard_regno_mode_ok): Reorganize so all of the code for a given type is located together. If not -mvsx, make "ws" constraint become NO_REGS, not FLOAT_REGS. Change -mdebug=reg output.
	(rs6000_expand_vector_init): Before calling gen_vsx_concat_v2df, make sure the two float arguments are copied into registers.
	(rs6000_legitimate_offset_address_p): If no vsx or altivec, don't disallow offset addressing. Add V2DImode. If TImode is handled by the vector unit, allow indexed addressing. Change default case to be a fatal_insn instead of gcc_unreachable.
	(rs6000_handle_altivec_attribute): Add support for vector double if -mvsx.
	(rs6000_register_move_cost): Add support for VSX_REGS. Know that under VSX, you can move between float and altivec registers cheaply.
	(rs6000_emit_swdivdf): Change the pattern of the negate multiply and subtract operation.
	* config/rs6000/vsx.md (VSX_I): Add TImode.
	(VSX_L): Add TImode.
	(VSm): Ditto.
	(VSs): Ditto.
	(VSr): Ditto.
	(UNSPEC_VSX_CONCAT_V2DF): New constant.
	(vsx_fre2): Add reciprocal estimate.
	(vsx_freDF2): Ditto.
	(vsx_fnmadd4): Rework pattern so it matches the canonicalization that the compiler does.
	(vsx_fnmsub4): Ditto.
	(vsx_fnmaddDF4): Ditto.
	(vsx_fnmsubDF4): Ditto.
	(vsx_vsel): Use vsx_register_operand, not register_operand.
	(vsx_adddf3): Ditto.
	(vsx_subdf3): Ditto.
	(vsx_muldf3): Ditto.
	(vsx_divdf3): Ditto.
	(vsx_negdf3): Ditto.
	(vsx_absdf2): Ditto.
	(vsx_nabsdf2): Ditto.
	(vsx_copysign3): Add copysign support.
	(vsx_copysignDF3): Ditto.
	(vsx_concat_v2df): Rewrite to use an UNSPEC.
	(vsx_set_v2df): Use "ws" constraint for scalar float.
	(vsx_splatv2df): Ditto.
	* config/rs6000/rs6000.h (VECTOR_UNIT_NONE_P): New macro to say no vector support.
	(VECTOR_MEM_NONE_P): Ditto.
	(VSX_MOVE_MODE): Add V2DImode, TImode.
	* config/rs6000/altivec.md (VM): Add V2DI, TI.
	(build_vector_mask_for_load): Fix thinko in VSX case.
	* config/rs6000/rs6000.md (fmaddsf4_powerpc): Name previously unnamed pattern. Fix insns so combine will generate the negative multiply and subtract operations.
	(fmaddsf4_power): Ditto.
	(fmsubsf4_powerpc): Ditto.
	(fmsubsf4_power): Ditto.
	(fnmaddsf4_powerpc): Ditto.
	(fnmaddsf4_power): Ditto.
	(fnmsubsf4_powerpc): Ditto.
	(fnmsubsf4_power): Ditto.
	(fnsubsf4_powerpc2): Ditto.
	(fnsubsf4_power2): Ditto.
	(fmadddf4_powerpc): Ditto.
	(fmadddf4_power): Ditto.
	(fmsubdf4_powerpc): Ditto.
	(fmsubdf4_power): Ditto.
	(fnmadddf4_powerpc): Ditto.
	(fnmadddf4_power): Ditto.
	(fnmsubdf4_powerpc): Ditto.
	(fnmsubdf4_power): Ditto.
	(copysigndf3): If VSX, call the VSX copysign.
	(fred): Split into an expander and insn. On insn, disable if VSX.
	(movdf_hardfloat32): Rework VSX support.
	(movdf_hardfloat64_mfpgpr): Ditto.
	(movdf_hardfloat64): Ditto.
	(movti_ppc64): If vector unit is handling TImode, disable this pattern.

2009-02-28  Michael Meissner

	* config/rs6000/ppc-asm.h: If __VSX__ define the additional scalar floating point registers that overlap with the Altivec registers.
2009-02-27  Michael Meissner

	* config/rs6000/spe.md (spe_fixuns_truncdfsi2): Rename from fixuns_truncdfsi2, and move fixuns_truncdfsi2 into rs6000.md.
	* config/rs6000/ppc-asm.h: If __ALTIVEC__ is defined, define the Altivec registers. If __VSX__ is defined, define the VSX registers.
	* config/rs6000/rs6000.opt (-mvsx-scalar-double): Make this on by default.
	(-misel): Make this a target mask instead of a variable.
	* config/rs6000/rs6000.c (rs6000_isel): Delete global variable.
	(rs6000_explicit_options): Delete isel field.
	(POWERPC_MASKS): Add MASK_ISEL.
	(processor_target_table): Add MASK_ISEL to 8540, 8548, e500mc, and power7 processors.
	(rs6000_override_options): Move -misel to a target mask.
	(rs6000_handle_option): Ditto.
	(rs6000_emit_int_cmove): Add support for 64-bit isel.
	* config/rs6000/vsx.md (vsx_floatdidf2): New scalar floating point pattern to support VSX conversion and rounding instructions.
	(vsx_floatunsdidf2): Ditto.
	(vsx_fix_trundfdi2): Ditto.
	(vsx_fixuns_trundfdi2): Ditto.
	(vsx_btrundf2): Ditto.
	(vsx_floordf2): Ditto.
	(vsx_ceildf2): Ditto.
	* config/rs6000/rs6000.h (rs6000_isel): Delete global.
	(TARGET_ISEL): Delete, since -misel is now a target mask.
	(TARGET_ISEL64): New target option for -misel on 64-bit systems.
	* config/rs6000/altivec.md (altivec_gtu): Use gtu, not geu.
	* config/rs6000/rs6000.md (sel): New mode attribute for 64-bit ISEL support.
	(movcc): Add 64-bit ISEL support.
	(isel_signed_): Ditto.
	(isel_unsigned_): Ditto.
	(fixuns_truncdfsi2): Move expander here from spe.md.
	(fixuns_truncdfdi2): New expander for unsigned floating point conversion on power7.
	(btruncdf2): Split into expander and insn. On the insn, disallow on VSX, so the VSX instruction will be generated.
	(ceildf2): Ditto.
	(floordf2): Ditto.
	(floatdidf2): Ditto.
	(fix_truncdfdi2): Ditto.
	(smindi3): Define if we have -misel on 64-bit systems.
	(smaxdi3): Ditto.
	(umindi3): Ditto.
	(umaxdi3): Ditto.
	* config/rs6000/e500.h (CHECK_E500_OPTIONS): Disable -mvsx on E500.
2009-02-26  Michael Meissner

	* config/rs6000/constraints.md ("wd" constraint): Change the variable that holds the register class to use.
	("wf" constraint): Ditto.
	("ws" constraint): Ditto.
	("wa" constraint): Ditto.
	* config/rs6000/rs6000.opt (-mvsx-vector-memory): Make this on by default.
	(-mvsx-vector-float): Ditto.
	* config/rs6000/rs6000.c (rs6000_vector_reg_class): New global to hold the register classes for the vector modes.
	(rs6000_vsx_v4sf_regclass): Delete, move into rs6000_vector_reg_class.
	(rs6000_vsx_v2df_regclass): Ditto.
	(rs6000_vsx_df_regclass): Ditto.
	(rs6000_vsx_reg_class): Rename from rs6000_vsx_any_regclass.
	(rs6000_hard_regno_mode_ok): Rework VSX, Altivec registers.
	(rs6000_init_hard_regno_mode_ok): Setup rs6000_vector_reg_class. Drop rs6000_vsx_*_regclass. By default, use all 64 registers for V4SF and V2DF. Use VSX_REG_CLASS_P macro instead of separate tests. Update -mdebug=reg printout.
	(rs6000_preferred_reload_class): If VSX, prefer FLOAT_REGS for scalar floating point and ALTIVEC_REGS for the types that have altivec instructions.
	(rs6000_secondary_memory_needed): If VSX, we can copy between FPR and Altivec registers without needing memory.
	(rs6000_secondary_reload_class): Delete ATTRIBUTE_UNUSED from an argument that is used. If VSX, we can copy between FPR and Altivec registers directly.
	* config/rs6000/rs6000.h (VSX_MOVE_MODE): Add in the Altivec types.
	(rs6000_vsx_v4sf_regclass): Delete.
	(rs6000_vsx_v2df_regclass): Ditto.
	(rs6000_vsx_df_regclass): Ditto.
	(rs6000_vsx_reg_class): Rename from rs6000_vsx_any_reg_class.
	(rs6000_vector_reg_class): New global to map machine mode to the preferred register class to use for that mode.
	(VSX_REG_CLASS_P): New macro to return true for all of the register classes VSX items can be in.

2009-02-25  Michael Meissner

	* doc/invoke.texi (-mvsx-vector-memory): Rename from -mvsx-vector-move.
	(-mvsx-vector-logical): Delete.
	* config/rs6000/aix53.h (ASM_CPU_SPEC): Add power7 support.
	* config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto.
	* config/rs6000/vector.md (all insns): Change from using rs6000_vector_info to VECTOR_MEM_* or VECTOR_UNIT_* macros.
	* config/rs6000/constraints.md ("wi" constraint): Delete.
	("wl" constraint): Ditto.
	("ws" constraint): Change to use rs6000_vsx_df_regclass.
	* config/rs6000/rs6000.opt (-mvsx-vector-memory): Rename from -mvsx-vector-move.
	(-mvsx-vector-float): Make default 0, not 1.
	(-mvsx-vector-double): Make default -1, not 1.
	(-mvsx-vector-logical): Delete.
	* config/rs6000/rs6000.c (rs6000_vector_info): Delete.
	(rs6000_vector_unit): New global array to say what vector unit is used for arithmetic instructions.
	(rs6000_vector_move): New global array to say what vector unit is used for memory references.
	(rs6000_vsx_int_regclass): Delete.
	(rs6000_vsx_logical_regclass): Delete.
	(rs6000_hard_regno_nregs_internal): Switch from using rs6000_vector_info to rs6000_vector_unit, rs6000_vector_move.
	(rs6000_hard_regno_mode_ok): Ditto. Reformat code somewhat.
	(rs6000_debug_vector_unit): New array to print vector unit information if -mdebug=reg.
	(rs6000_init_hard_regno_mode_ok): Rework to better describe VSX and Altivec register sets.
	(builtin_mask_for_load): Return 0 if -mvsx.
	(rs6000_legitimize_reload_address): Allow AND in VSX addresses.
	(rs6000_legitimate_address): Ditto.
	(bdesc_3arg): Delete vselv2di builtin.
	(rs6000_emit_minmax): Use rs6000_vector_unit instead of rs6000_vector_info.
	(rs6000_vector_mode_supported_p): Ditto.
	* config/rs6000/vsx.md (all insns): Change from using rs6000_vector_info to VECTOR_MEM_* and VECTOR_UNIT_* macros.
	(VSr): Change to use "v" register class, not "wi".
	(vsx_mov): Combine floating and integer. Allow preferred register class, and then use ?wa for all VSX registers.
	(vsx_fmadddf4): Use ws constraint, not f.
	(vsx_fmsubdf4): Ditto.
	(vsx_fnmadddf4): Ditto.
	(vsx_fnmsubdf4): Ditto.
	(vsx_and3): Use preferred register class, and then ?wa to catch all VSX registers.
	(vsx_ior3): Ditto.
	(vsx_xor3): Ditto.
	(vsx_one_cmpl2): Ditto.
	(vsx_nor3): Ditto.
	(vsx_andc3): Ditto.
	* config/rs6000/rs6000.h (rs6000_vector_struct): Delete.
	(rs6000_vector_info): Ditto.
	(rs6000_vector_unit): New global array to say whether a machine mode arithmetic is handled by a particular vector unit.
	(rs6000_vector_mem): New global array to say which vector unit to use for moves.
	(VECTOR_UNIT_*): New macros to say which vector unit to use.
	(VECTOR_MEM_*): Ditto.
	(rs6000_vsx_int_regclass): Delete.
	(rs6000_vsx_logical_regclass): Delete.
	* config/rs6000/altivec.md (all insns): Change from using rs6000_vector_info to VECTOR_MEM_* and VECTOR_UNIT_* macros.
	(build_vector_mask_for_load): Disable if VSX.
	* config/rs6000/rs6000.md (all DF insns): Change how the VSX exclusion is done.

2009-02-24  Michael Meissner

	* config/rs6000/rs6000.c (rs6000_debug_reg): New global.
	(rs6000_debug_reg_print): New function to print register classes for a given register range.
	(rs6000_init_hard_regno_mode_ok): If -mdebug=reg, print out the register class, call used, fixed information for most of the registers. Print the vsx register class variables.
	(rs6000_override_options): Add -mdebug=reg support.
	* config/rs6000/rs6000.h (rs6000_debug_reg): New global.
	(TARGET_DEBUG_REG): New target switch for -mdebug=reg.

2009-02-23  Michael Meissner

	* reload.c (subst_reloads): Change gcc_assert into a fatal_insn.
	* config/rs6000/vector.md (VEC_I): Reorder iterator.
	(VEC_L): Ditto.
	(VEC_C): New iterator field for vector comparisons.
	(VEC_base): New mode attribute that maps the vector type to the base type.
	(all insns): Switch to use rs6000_vector_info to determine whether the insn is valid instead of using TARGET_VSX or TARGET_ALTIVEC.
	(vcond): Move here from altivec, and add VSX support.
	(vcondu): Ditto.
	(vector_eq): New expander for vector comparisons.
	(vector_gt): Ditto.
	(vector_ge): Ditto.
	(vector_gtu): Ditto.
	(vector_geu): Ditto.
	(vector_vsel): New expander for vector select.
	(vec_init): Move expander from altivec.md and generalize for VSX.
	(vec_set): Ditto.
	(vec_extract): Ditto.
	(vec_interleave_highv4sf): Ditto.
	(vec_interleave_lowv4sf): Ditto.
	(vec_interleave_highv2df): New expander for VSX.
	(vec_interleave_lowv2df): Ditto.
	* config/rs6000/constraints.md (toplevel): Add comment on the available constraint letters.
	("w" constraint): Delete, in favor of using "w" as a two letter constraint.
	("wd" constraint): New VSX constraint.
	("wf" constraint): Ditto.
	("wi" constraint): Ditto.
	("wl" constraint): Ditto.
	("ws" constraint): Ditto.
	("wa" constraint): Ditto.
	* config/rs6000/predicates.md (indexed_or_indirect_operand): Disable altivec support allowing AND of memory address if -mvsx.
	* config/rs6000/rs6000.opt (-mvsx-vector-move): New switches to allow finer control over whether VSX, Altivec, or the traditional instructions are used.
	(-mvsx-scalar-move): Ditto.
	(-mvsx-vector-float): Ditto.
	(-mvsx-vector-double): Ditto.
	(-mvsx-vector-logical): Ditto.
	(-mvsx-scalar-double): Ditto.
	* config/rs6000/rs6000.c (rs6000_vector_info): New global to hold various information about which vector instruction set to use, and the alignment of data.
	(rs6000_vsx_v4sf_regclass): New global to hold VSX register class.
	(rs6000_vsx_v2df_regclass): Ditto.
	(rs6000_vsx_df_regclass): Ditto.
	(rs6000_vsx_int_regclass): Ditto.
	(rs6000_vsx_logical_regclass): Ditto.
	(rs6000_vsx_any_regclass): Ditto.
	(rs6000_hard_regno_nregs_internal): Rewrite to fine tune VSX/Altivec register selection.
	(rs6000_hard_regno_mode_ok): Ditto.
	(rs6000_init_hard_regno_mode_ok): Set up the vector information globals based on the -mvsx-* switches.
	(rs6000_override_options): Add warnings for -mvsx and -mlittle-endian or -mavoid-indexed-addresses.
	(rs6000_builtin_vec_perm): Add V2DF/V2DI support.
	(rs6000_expand_vector_init): Add V2DF support.
	(rs6000_expand_vector_set): Ditto.
	(rs6000_expand_vector_extract): Ditto.
	(avoiding_indexed_address_p): Add VSX support.
	(rs6000_legitimize_address): Ditto.
	(rs6000_legitimize_reload_address): Ditto.
	(rs6000_legitimate_address): Ditto.
	(USE_ALTIVEC_FOR_ARG_P): Ditto.
	(function_arg_boundary): Ditto.
	(function_arg_advance): Ditto.
	(function_arg): Ditto.
	(get_vec_cmp_insn): Delete.
	(rs6000_emit_vector_vsx): New function for VSX vector compare.
	(rs6000_emit_vector_altivec): New function for Altivec vector compare.
	(get_vsel_insn): Delete.
	(rs6000_emit_vector_select): Ditto.
	(rs6000_override_options): If -mvsx, turn on -maltivec by default.
	(rs6000_builtin_vec_perm): Add support for V2DI, V2DF modes.
	(bdesc_3arg): Add vector select and vector permute builtins for V2DI and V2DF types. Switch to using the vector_* expander instead of altivec_*.
	(rs6000_init_builtins): Initialize new type nodes for VSX. Initialize __vector double type. Initialize common builtins for VSX.
	(rs6000_emit_vector_compare): Add VSX support.
	(rs6000_vector_mode_supported_p): If VSX, support V2DF.
	* config/rs6000/vsx.md (VSX_I): New iterator for integer modes.
	(VSX_L): Reorder iterator.
	(lx__vsx): Delete, no longer needed.
	(stx__vsx): Ditto.
	(all insns): Change to use vsx_ instead of _vsx for consistency with the other rs6000 md files. Change to use the new "w" constraints for all insns. Change to use rs6000_vector_info deciding whether to execute the instruction or not.
	(vsx_mov): Rewrite constraints so GPR registers are not chosen as reload targets. Split integer vector loads into a separate insn, and favor the altivec register over the VSX fp registers.
	(vsx_fmadd4): Use , not .
	(vsx_fmsub4): Ditto.
	(vsx_eq): New insns for V2DF/V4SF vector compare.
	(vsx_gt): Ditto.
	(vsx_ge): Ditto.
	(vsx_vsel): New insns for VSX vector select.
	(vsx_xxpermdi): New insn for DF permute.
	(vsx_splatv2df): New insn for DF splat support.
	(vsx_xxmrglw): New insns for DF interleave.
	(vsx_xxmrghw): Ditto.
	* config/rs6000/rs6000.h (enum rs6000_vector): New enum to describe which vector unit is being used.
	(struct rs6000_vector_struct): New struct to describe the various aspects about the current vector instruction set.
	(rs6000_vector_info): New global to describe the current vector instruction set.
	(SLOW_UNALIGNED_ACCESS): If rs6000_vector_info has alignment information for a type, use that.
	(VSX_VECTOR_MOVE_MODE): New macro for all VSX vectors that are supported by move instructions.
	(VSX_MOVE_MODE): New macro for all VSX moves.
	(enum rs6000_builtins): Add V2DI/V2DF vector select and permute builtins.
	(rs6000_builtin_type_index): Add new types for VSX vectors.
	(rs6000_vsx_v4sf_regclass): New global to hold VSX register class.
	(rs6000_vsx_v2df_regclass): Ditto.
	(rs6000_vsx_df_regclass): Ditto.
	(rs6000_vsx_int_regclass): Ditto.
	(rs6000_vsx_logical_regclass): Ditto.
	(rs6000_vsx_any_regclass): Ditto.
	* config/rs6000/altivec.md (UNSPEC_VCMP*): Delete unspec constants no longer needed.
	(UNSPEC_VSEL*): Ditto.
	(altivec_lvx_): Delete, no longer needed.
	(altivec_stvx_): Ditto.
	(all insns): Rewrite to be consistent of altivec_. Switch to use rs6000_vector_info to determine whether to issue the altivec form of the instructions.
	(mov_altivec): Rewrite constraints so GPR registers are not chosen as reload targets.
	(altivec_eq): Rewrite vector conditionals, permute, select to use iterators, and work with VSX.
	(altivec_gt): Ditto.
	(altivec_ge): Ditto.
	(altivec_gtu): Ditto.
	(altivec_geu): Ditto.
	(altivec_vsel): Ditto.
	(altivec_vperm_): Ditto.
	(altivec_vcmp*): Rewrite to not use unspecs any more, and use mode iterators, add VSX support.
	(vcondv4si): Move to vector.md.
	(vconduv4si): Ditto.
	(vcondv8hi): Ditto.
	(vconduv8hi): Ditto.
	(vcondv16qi): Ditto.
	(vconduv16qi): Ditto.
	* config/rs6000/rs6000.md (negdf2_fpr): Add support for -mvsx-scalar-double.
	(absdf2_fpr): Ditto.
	(nabsdf2_fpr): Ditto.
	(adddf3_fpr): Ditto.
	(subdf3_fpr): Ditto.
	(muldf3_fpr): Ditto.
	(divdf3_fpr): Ditto.
	(DF multiply/add patterns): Ditto.
	(sqrtdf2): Ditto.
	(movdf_hardfloat32): Add VSX support.
	(movdf_hardfloat64_mfpgpr): Ditto.
	(movdf_hardfloat64): Ditto.
	* doc/invoke.texi (-mvsx-*): Add new vsx switches.

2009-02-13  Michael Meissner

	* config.in: Update two comments.
	* config/rs6000/vector.md (VEC_L): Add V2DI type.
	(move): Use VEC_L to get all vector types, and delete the separate integer mode move definitions.
	(vector_load_): Ditto.
	(vector_store_): Ditto.
	(vector move splitters): Move GPR register splitters here from altivec.md.
	* config/rs6000/constraints.md ("j"): Add "j" constraint to match the mode's 0 value.
	* config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Only count the FPRs as being 128 bits if the mode is a VSX type.
	(rs6000_hard_regno_mode_ok): Ditto.
	(rs6000_emit_minmax): Use new VSX_MODE instead of separate tests.
	* config/rs6000/vsx.md (VSX_L): Add V2DImode.
	(VSm): Rename from VSX_mem, add modes for integer vectors. Change all uses.
	(VSs): Rename from VSX_op, add modes for integer vectors. Change all uses.
	(VSr): New mode attribute to give the register class.
	(mov_vsx): Use VSr to get the register preferences. Add explicit 0 option.
	(scalar double precision patterns): Do not use v register constraint right now.
	(logical patterns): Use VSr mode attribute for register preferences.
	* config/rs6000/rs6000.h (VSX_SCALAR_MODE): New macro.
	(VSX_MODE): Ditto.
	* config/rs6000/altivec.md (VM): New mode iterator for memory operations. Add V2DI mode.
	(mov_altivec_): Disable if -mvsx for all modes, not just V4SFmode.
	(gpr move splitters): Move to vector.md.
	(and3_altivec): Use VM mode iterator, not V.
	(ior3_altivec): Ditto.
	(xor3_altivec): Ditto.
	(one_cmpl2_altivec): Ditto.
	(nor3_altivec): Ditto.
	(andc3_altivec): Ditto.
	* config/rs6000/rs6000.md (movdf_hardfloat): Back out vsx changes.
	(movdf_hardfloat64_vsx): Delete.

2009-02-12  Michael Meissner

	* config/rs6000/vector.md: New file to abstract out the expanders for vector operations from altivec.md.
	* config/rs6000/predicates.md (vsx_register_operand): New predicate to match VSX registers.
(vfloat_operand): New predicate to match registers used for vector floating point operations. (vint_operand): New predicate to match registers used for vector integer operations. (vlogical_operand): New predicate to match registers used for vector logical operations. * config/rs6000/rs6000-protos.h (rs6000_hard_regno_nregs): Change from a function to an array. (rs6000_class_max_nregs): Add declaration. * config/rs6000/t-rs6000 (MD_INCLUDES): Define to include all of the .md files included by rs6000.md. * config/rs6000/rs6000.c (rs6000_class_max_nregs): New global array to pre-calculate CLASS_MAX_NREGS. (rs6000_hard_regno_nregs): Change from a function to an array to pre-calculate HARD_REGNO_NREGS. (rs6000_hard_regno_nregs_internal): Rename from rs6000_hard_regno_nregs and add VSX support. (rs6000_hard_regno_mode_ok): Add VSX support, and switch to use lookup table rs6000_hard_regno_nregs. (rs6000_init_hard_regno_mode_ok): Add initialization of rs6000_hard_regno_nregs, and rs6000_class_max_nregs global arrays. (rs6000_override_options): Add some warnings for things that are incompatible with -mvsx. (rs6000_legitimate_offset_address_p): Add V2DFmode. (rs6000_conditional_register_usage): Enable altivec registers if -mvsx. (bdesc_2arg): Change the name of the nor pattern. (altivec_expand_ld_builtin): Change the names of the load patterns to be the generic vector loads. (altivec_expand_st_builtin): Change the names of the store patterns to be the generic vector stores. (print_operand): Add 'x' to print out a VSX register properly. (rs6000_emit_minmax): Directly emit the min/max patterns for VSX and Altivec. * config/rs6000/vsx.md: New file to add all of the VSX specific instructions. Add support for load, store, move, add, subtract, multiply, multiply/add, divide, negate, absolute value, maximum, minimum, sqrt, and, or, xor, and complement, xor, one's complement, and nor instructions. * config/rs6000/rs6000.h (UNITS_PER_VSX_WORD): Define.
(VSX_REGNO_P): New macro for VSX registers. (VFLOAT_REGNO): New macro for vector floating point registers. (VINT_REGNO): New macro for vector integer registers. (VLOGICAL_REGNO): New macro for vector logical registers. (VSX_VECTOR_MODE): New macro for vector modes supported by VSX. (HARD_REGNO_NREGS): Switch to using pre-computed table. (CLASS_MAX_NREGS): Ditto. * config/rs6000/altivec.md (altivec_lvx_): Delete, replaced by expanders in vector.md. (altivec_stvx_): Ditto. (mov): Ditto. (mov_altivec_): Rename from mov_internal, and prefer using VSX if available. (addv4sf3_altivec): Rename from standard name, and prefer using VSX if available. (subv4sf3_altivec): Ditto. (mulv4sf3_altivec): Ditto. (smaxv4sf3_altivec): Ditto. (sminv4sf3_altivec): Ditto. (and3_altivec): Ditto. (ior3_altivec): Ditto. (xor3_altivec): Ditto. (one_cmpl2): Ditto. (nor3_altivec): Ditto. (andc3_altivec): Ditto. (absv4sf2_altivec): Ditto. (vcondv4sf): Move to vector.md. * config/rs6000/rs6000.md (negdf2_fpr): Add !TARGET_VSX to prefer the version in vsx.md if -mvsx is available. (absdf2_fpr): Ditto. (nabsdf2_fpr): Ditto. (adddf3_fpr): Ditto. (subdf3_fpr): Ditto. (muldf3_fpr): Ditto. (multiply/add patterns): Ditto. (movdf_hardfloat64): Disable if -mvsx. (movdf_hardfloat64_vsx): Clone from movdf_hardfloat64 and add VSX support. (vector.md): Include new .md file. (vsx.md): Ditto. 2009-02-11 Michael Meissner * doc/invoke.texi (-mvsx, -mno-vsx): Document new switches. * config/rs6000/linux64.opt (-mprofile-kernel): Move to being a variable to reduce the number of target flag bits. * config/rs6000/sysv4.opt (-mbit-align): Ditto. (-mbit-word): Ditto. (-mregnames): Ditto. * config/rs6000/rs6000.opt (-mupdate, -mno-update): Ditto. (-mvsx): New switch, enable VSX support. * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define __VSX__ if the vector/scalar instruction set is available. * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Change to allow -mprofile-kernel to be a variable.
* config/rs6000/rs6000.c (processor_target_table): Set -mvsx for power7 cpus. (POWERPC_MASKS): Add -mvsx. * config/rs6000/rs6000.h (ADDITIONAL_REGISTER_NAMES): Add VSX register names for the registers that overlap with the floating point and altivec registers. * config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): TARGET_NO_BITFIELD_WORD is now a variable, not a target mask. 2009-02-11 Pat Haugen Michael Meissner * doc/invoke.texi (-mpopcntd, -mno-popcntd): Document new switches. * configure.ac (powerpc*-*-*): Test for the assembler having the popcntd instruction. * configure: Regenerate. * config.in (HAVE_AS_POPCNTD): Add default value for configure test. * config/rs6000/power7.md: New file. * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define _ARCH_PWR7 if the popcntd instruction is available. * config/rs6000/rs6000.opt (-mpopcntd): New switch to enable/disable the use of the popcntd and popcntw instructions. (-mfused-madd, -mno-fused-madd): Move to being a separate variable because we are out of mask bits. * config/rs6000/rs6000.c (power7_cost): Define. (rs6000_override_options): Add Power7 support. (rs6000_issue_rate): Ditto. (insn_must_be_first_in_group): Ditto. (insn_must_be_last_in_group): Ditto. (rs6000_emit_popcount): Add support for using the popcntw and popcntd instructions. * config/rs6000/rs6000.h (ASM_CPU_POWER7_SPEC): Switch to using popcntd as the test for a power7 assembler instead of vector scalar instructions. * (TARGET_POPCNTD): If assembler does not support the popcntd instruction, disable using it. (processor_type): Add Power7 entry. * config/rs6000/rs6000.md (define_attr "cpu"): Add power7. (power7.md): Include it. (andi./andis./nor. patterns): Change insn type to fast_compare. (popcntwsi2): Add popcntw support. (popcntddi2): Add popcntd support. 2009-03-27 Jakub Jelinek PR target/39558 * gcc.target/powerpc/altivec-29.c: New test. 2009-03-27 Michael Meissner * gcc.target/powerpc/vsx-builtin-1.c: Add more VSX builtins. 
Prevent the optimizer from combining the various multiplies. * gcc.target/powerpc/vsx-builtin-2.c: Ditto. 2009-03-13 Michael Meissner PR target/39457 * gcc.target/powerpc/pr39457.c: New test for PR39457. 2009-03-13 Michael Meissner * gcc.target/powerpc/vsx-builtin-1.c: New test for builtins. * gcc.target/powerpc/vsx-builtin-2.c: Ditto. 2009-03-01 Michael Meissner * gcc.target/powerpc/vsx-vector-1.c: Fix typos. * gcc.target/powerpc/vsx-vector-2.c: Ditto. * gcc.target/powerpc/vsx-vector-3.c: New file, test __vector double. * gcc.target/powerpc/vsx-vector-4.c: New file, test __vector float uses VSX instructions if -mvsx. * gcc.dg/vmx/vmx.exp (DEFAULT_VMXCFLAGS): Add -mno-vsx. * lib/target-supports.exp (check_vsx_hw_available): New function to test for VSX. (check_vmx_hw_available): Add -mno-vsx to options. (check_effective_target_powerpc_vsx_ok): New function to check if the powerpc compiler can support VSX. 2009-02-27 Michael Meissner * gcc.target/powerpc/vsx-vector-1.c: New file to test VSX code generation. * gcc.target/powerpc/vsx-vector-2.c: Ditto. 2009-02-01 Michael Meissner * gcc.target/powerpc/popcount-2.c: New file for power7 support. * gcc.target/powerpc/popcount-3.c: Ditto. --- gcc/doc/invoke.texi (.../trunk) (revision 145777) +++ gcc/doc/invoke.texi (.../branches/ibm/power7-meissner) (revision 146027) @@ -715,7 +715,8 @@ See RS/6000 and PowerPC Options. -maltivec -mno-altivec @gol -mpowerpc-gpopt -mno-powerpc-gpopt @gol -mpowerpc-gfxopt -mno-powerpc-gfxopt @gol --mmfcrf -mno-mfcrf -mpopcntb -mno-popcntb -mfprnd -mno-fprnd @gol +-mmfcrf -mno-mfcrf -mpopcntb -mno-popcntb -mpopcntd -mno-popcntd @gol +-mfprnd -mno-fprnd @gol -mcmpb -mno-cmpb -mmfpgpr -mno-mfpgpr -mhard-dfp -mno-hard-dfp @gol -mnew-mnemonics -mold-mnemonics @gol -mfull-toc -mminimal-toc -mno-fp-in-toc -mno-sum-in-toc @gol @@ -729,7 +730,7 @@ See RS/6000 and PowerPC Options.
-mstrict-align -mno-strict-align -mrelocatable @gol -mno-relocatable -mrelocatable-lib -mno-relocatable-lib @gol -mtoc -mno-toc -mlittle -mlittle-endian -mbig -mbig-endian @gol --mdynamic-no-pic -maltivec -mswdiv @gol +-mdynamic-no-pic -maltivec -mswdiv @gol -mprioritize-restricted-insns=@var{priority} @gol -msched-costly-dep=@var{dependence_type} @gol -minsert-sched-nops=@var{scheme} @gol @@ -13614,6 +13615,8 @@ These @samp{-m} options are defined for @itemx -mno-mfcrf @itemx -mpopcntb @itemx -mno-popcntb +@itemx -mpopcntd +@itemx -mno-popcntd @itemx -mfprnd @itemx -mno-fprnd @itemx -mcmpb @@ -13638,6 +13641,8 @@ These @samp{-m} options are defined for @opindex mno-mfcrf @opindex mpopcntb @opindex mno-popcntb +@opindex mpopcntd +@opindex mno-popcntd @opindex mfprnd @opindex mno-fprnd @opindex mcmpb @@ -13687,6 +13692,9 @@ The @option{-mpopcntb} option allows GCC double precision FP reciprocal estimate instruction implemented on the POWER5 processor and other processors that support the PowerPC V2.02 architecture. +The @option{-mpopcntd} option allows GCC to generate the popcount +instruction implemented on the POWER7 processor and other processors +that support the PowerPC V2.06 architecture. The @option{-mfprnd} option allows GCC to generate the FP round to integer instructions implemented on the POWER5+ processor and other processors that support the PowerPC V2.03 architecture. 
@@ -13765,9 +13773,9 @@ The @option{-mcpu} options automatically following options: @gccoptlist{-maltivec -mfprnd -mhard-float -mmfcrf -mmultiple @gol --mnew-mnemonics -mpopcntb -mpower -mpower2 -mpowerpc64 @gol +-mnew-mnemonics -mpopcntb -mpopcntd -mpower -mpower2 -mpowerpc64 @gol -mpowerpc-gpopt -mpowerpc-gfxopt -msingle-float -mdouble-float @gol --msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr} +-msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx} The particular options set for any particular CPU will vary between compiler versions, depending on what setting seems to produce optimal @@ -13868,6 +13876,14 @@ instructions. This option has been deprecated. Use @option{-mspe} and @option{-mno-spe} instead. +@item -mvsx +@itemx -mno-vsx +@opindex mvsx +@opindex mno-vsx +Generate code that uses (does not use) vector/scalar (VSX) +instructions, and also enable the use of built-in functions that allow +more direct access to the VSX instruction set. + @item -mfloat-gprs=@var{yes/single/double/no} @itemx -mfloat-gprs @opindex mfloat-gprs --- gcc/doc/md.texi (.../trunk) (revision 145777) +++ gcc/doc/md.texi (.../branches/ibm/power7-meissner) (revision 146027) @@ -1913,7 +1913,19 @@ Address base register Floating point register @item v -Vector register +Altivec vector register + +@item wd +VSX vector register to hold vector double data + +@item wf +VSX vector register to hold vector float data + +@item ws +VSX vector register to hold scalar float data + +@item wa +Any VSX register @item h @samp{MQ}, @samp{CTR}, or @samp{LINK} register @@ -1999,6 +2011,9 @@ AND masks that can be performed by two r @item W Vector constant that does not require memory +@item j +Vector constant that is all zeros. 
+ @end table @item Intel 386---@file{config/i386/constraints.md} --- gcc/reload.c (.../trunk) (revision 145777) +++ gcc/reload.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -6255,8 +6255,14 @@ subst_reloads (rtx insn) *r->where = reloadreg; } /* If reload got no reg and isn't optional, something's wrong. */ - else - gcc_assert (rld[r->what].optional); + else if (!rld[r->what].optional) + { + char buffer[100]; + sprintf (buffer, + "unable to find register for reload, replacement #%d", + i); + fatal_insn (buffer, insn); + } } } --- gcc/configure (.../trunk) (revision 145777) +++ gcc/configure (.../branches/ibm/power7-meissner) (revision 146027) @@ -23075,7 +23075,7 @@ if test "${gcc_cv_as_powerpc_mfpgpr+set} else gcc_cv_as_powerpc_mfpgpr=no if test $in_tree_gas = yes; then - if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2` then gcc_cv_as_powerpc_mfpgpr=yes fi elif test x$gcc_cv_as != x; then @@ -23171,7 +23171,7 @@ if test "${gcc_cv_as_powerpc_cmpb+set}" else gcc_cv_as_powerpc_cmpb=no if test $in_tree_gas = yes; then - if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2` then gcc_cv_as_powerpc_cmpb=yes fi elif test x$gcc_cv_as != x; then @@ -23217,7 +23217,7 @@ if test "${gcc_cv_as_powerpc_dfp+set}" = else gcc_cv_as_powerpc_dfp=no if test $in_tree_gas = yes; then - if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2` then gcc_cv_as_powerpc_dfp=yes fi elif test x$gcc_cv_as != x; then @@ -23263,7 +23263,7 @@ if test "${gcc_cv_as_powerpc_vsx+set}" = else gcc_cv_as_powerpc_vsx=no if test $in_tree_gas = yes; then - if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0` + if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2` 
then gcc_cv_as_powerpc_vsx=yes fi elif test x$gcc_cv_as != x; then @@ -23293,6 +23293,52 @@ _ACEOF fi + case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[PR] + popcntd 3,3';; + *) conftest_s=' .machine power7 + .text + popcntd 3,3';; + esac + + echo "$as_me:$LINENO: checking assembler for popcntd support" >&5 +echo $ECHO_N "checking assembler for popcntd support... $ECHO_C" >&6 +if test "${gcc_cv_as_powerpc_popcntd+set}" = set; then + echo $ECHO_N "(cached) $ECHO_C" >&6 +else + gcc_cv_as_powerpc_popcntd=no + if test $in_tree_gas = yes; then + if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2` + then gcc_cv_as_powerpc_popcntd=yes +fi + elif test x$gcc_cv_as != x; then + echo "$conftest_s" > conftest.s + if { ac_try='$gcc_cv_as -a32 -o conftest.o conftest.s >&5' + { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 + (eval $ac_try) 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); }; } + then + gcc_cv_as_powerpc_popcntd=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +echo "$as_me:$LINENO: result: $gcc_cv_as_powerpc_popcntd" >&5 +echo "${ECHO_T}$gcc_cv_as_powerpc_popcntd" >&6 +if test $gcc_cv_as_powerpc_popcntd = yes; then + +cat >>confdefs.h <<\_ACEOF +#define HAVE_AS_POPCNTD 1 +_ACEOF + +fi + echo "$as_me:$LINENO: checking assembler for .gnu_attribute support" >&5 echo $ECHO_N "checking assembler for .gnu_attribute support... 
$ECHO_C" >&6 if test "${gcc_cv_as_powerpc_gnu_attribute+set}" = set; then --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,42 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -mcpu=power7" } */ +/* { dg-final { scan-assembler "xvaddsp" } } */ +/* { dg-final { scan-assembler "xvsubsp" } } */ +/* { dg-final { scan-assembler "xvmulsp" } } */ +/* { dg-final { scan-assembler "xvmadd" } } */ +/* { dg-final { scan-assembler "xvmsub" } } */ +/* { dg-final { scan-assembler "xvnmadd" } } */ +/* { dg-final { scan-assembler "xvnmsub" } } */ +/* { dg-final { scan-assembler "xvdivsp" } } */ +/* { dg-final { scan-assembler "xvmaxsp" } } */ +/* { dg-final { scan-assembler "xvminsp" } } */ +/* { dg-final { scan-assembler "xvsqrtsp" } } */ +/* { dg-final { scan-assembler "xvabssp" } } */ +/* { dg-final { scan-assembler "xvnabssp" } } */ +/* { dg-final { scan-assembler "xvresp" } } */ +/* { dg-final { scan-assembler "xvrsqrtesp" } } */ +/* { dg-final { scan-assembler "xvtsqrtsp" } } */ +/* { dg-final { scan-assembler "xvtdivsp" } } */ + +void use_builtins (__vector float *p, __vector float *q, __vector float *r, __vector float *s) +{ + p[0] = __builtin_vsx_xvaddsp (q[0], r[0]); + p[1] = __builtin_vsx_xvsubsp (q[1], r[1]); + p[2] = __builtin_vsx_xvmulsp (q[2], r[2]); + p[3] = __builtin_vsx_xvdivsp (q[3], r[3]); + p[4] = __builtin_vsx_xvmaxsp (q[4], r[4]); + p[5] = __builtin_vsx_xvminsp (q[5], r[5]); + p[6] = __builtin_vsx_xvabssp (q[6]); + p[7] = __builtin_vsx_xvnabssp (q[7]); + p[8] = __builtin_vsx_xvsqrtsp (q[8]); + p[9] = __builtin_vsx_xvmaddsp (q[9], r[9], s[9]); + p[10] = __builtin_vsx_xvmsubsp (q[10], r[10], s[10]); + p[11] = __builtin_vsx_xvnmaddsp (q[11], r[11], s[11]); + p[12] = 
__builtin_vsx_xvnmsubsp (q[12], r[12], s[12]); + p[13] = __builtin_vsx_xvresp (q[13]); + p[14] = __builtin_vsx_xvrsqrtesp (q[14]); + p[15] = __builtin_vsx_xvtsqrtsp (q[15]); + p[16] = __builtin_vsx_xvtdivsp (q[16], r[16]); +} --- gcc/testsuite/gcc.target/powerpc/popcount-3.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/popcount-3.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,9 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-options "-O2 -mcpu=power7 -m64" } */ +/* { dg-final { scan-assembler "popcntd" } } */ + +long foo(int x) +{ + return __builtin_popcountl(x); +} --- gcc/testsuite/gcc.target/powerpc/vsx-vector-1.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsx-vector-1.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,74 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */ +/* { dg-final { scan-assembler "xvadddp" } } */ +/* { dg-final { scan-assembler "xvsubdp" } } */ +/* { dg-final { scan-assembler "xvmuldp" } } */ +/* { dg-final { scan-assembler "xvdivdp" } } */ +/* { dg-final { scan-assembler "xvmadd" } } */ +/* { dg-final { scan-assembler "xvmsub" } } */ + +#ifndef SIZE +#define SIZE 1024 +#endif + +double a[SIZE] __attribute__((__aligned__(32))); +double b[SIZE] __attribute__((__aligned__(32))); +double c[SIZE] __attribute__((__aligned__(32))); +double d[SIZE] __attribute__((__aligned__(32))); +double e[SIZE] __attribute__((__aligned__(32))); + +void +vector_add (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] + c[i]; +} + +void +vector_subtract (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] - c[i]; +} + +void +vector_multiply (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + 
a[i] = b[i] * c[i]; +} + +void +vector_multiply_add (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = (b[i] * c[i]) + d[i]; +} + +void +vector_multiply_subtract (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = (b[i] * c[i]) - d[i]; +} + +void +vector_divide (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] / c[i]; +} --- gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,74 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */ +/* { dg-final { scan-assembler "xvaddsp" } } */ +/* { dg-final { scan-assembler "xvsubsp" } } */ +/* { dg-final { scan-assembler "xvmulsp" } } */ +/* { dg-final { scan-assembler "xvdivsp" } } */ +/* { dg-final { scan-assembler "xvmadd" } } */ +/* { dg-final { scan-assembler "xvmsub" } } */ + +#ifndef SIZE +#define SIZE 1024 +#endif + +float a[SIZE] __attribute__((__aligned__(32))); +float b[SIZE] __attribute__((__aligned__(32))); +float c[SIZE] __attribute__((__aligned__(32))); +float d[SIZE] __attribute__((__aligned__(32))); +float e[SIZE] __attribute__((__aligned__(32))); + +void +vector_add (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] + c[i]; +} + +void +vector_subtract (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] - c[i]; +} + +void +vector_multiply (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] * c[i]; +} + +void +vector_multiply_add (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = (b[i] * c[i]) + d[i]; +} + +void +vector_multiply_subtract (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = (b[i] * c[i]) - d[i]; +} + +void +vector_divide (void) +{ + int i; + + for (i = 0; i < SIZE; i++) + a[i] = b[i] / c[i]; +} 
--- gcc/testsuite/gcc.target/powerpc/pr39457.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr39457.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,56 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-options "-m64 -O2 -mminimal-toc" } */ + +/* PR 39457 -- fix breakage because the compiler ran out of registers and + wanted to stash a floating point value to the LR/CTR register. */ + +/* -O2 -m64 -mminimal-toc */ +typedef struct { void *s; } S; +typedef void (*T1) (void); +typedef void (*T2) (void *, void *, int, void *); +char *fn1 (const char *, ...); +void *fn2 (void); +int fn3 (char *, int); +int fn4 (const void *); +int fn5 (const void *); +long fn6 (void) __attribute__ ((__const__)); +int fn7 (void *, void *, void *); +void *fn8 (void *, long); +void *fn9 (void *, long, const char *, ...); +void *fn10 (void *); +long fn11 (void) __attribute__ ((__const__)); +long fn12 (void *, const char *, T1, T2, void *); +void *fn13 (void *); +long fn14 (void) __attribute__ ((__const__)); +extern void *v1; +extern char *v2; +extern int v3; + +void +foo (void *x, char *z) +{ + void *i1, *i2; + int y; + if (v1) + return; + v1 = fn9 (fn10 (fn2 ()), fn6 (), "x", 0., "y", 0., 0); + y = 520 - (520 - fn4 (x)) / 2; + fn9 (fn8 (v1, fn6 ()), fn6 (), "wig", fn8 (v1, fn14 ()), "x", 18.0, + "y", 16.0, "wid", 80.0, "hi", 500.0, 0); + fn9 (fn10 (v1), fn6 (), "x1", 0., "y1", 0., "x2", 80.0, "y2", + 500.0, "f", fn3 ("fff", 0x0D0DFA00), 0); + fn13 (((S *) fn8 (v1, fn6 ()))->s); + fn12 (fn8 (v1, fn11 ()), "ev", (T1) fn7, 0, fn8 (v1, fn6 ())); + fn9 (fn8 (v1, fn6 ()), fn6 (), "wig", + fn8 (v1, fn14 ()), "x", 111.0, "y", 14.0, "wid", 774.0, "hi", + 500.0, 0); + v1 = fn9 (fn10 (v1), fn6 (), "x1", 0., "y1", 0., "x2", 774.0, "y2", + 500.0, "f", fn3 ("gc", 0x0D0DFA00), 0); + fn1 (z, 0); + i1 = fn9 (fn8 (v1, fn6 ()), fn6 (), "pixbuf", x, "x", + 800 - fn5 (x) / 2, "y", y - fn4 (x), 0); + fn12 (fn8 (i1, fn11 ()), "ev", (T1) fn7, 0, "/ok/"); + 
fn12 (fn8 (i1, fn11 ()), "ev", (T1) fn7, 0, 0); + i2 = fn9 (fn8 (v1, fn6 ()), fn6 (), "txt", "OK", "fnt", v2, "x", + 800, "y", y - fn4 (x) + 15, "ar", 0, "f", v3, 0); +} --- gcc/testsuite/gcc.target/powerpc/vsx-vector-3.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsx-vector-3.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,48 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */ +/* { dg-final { scan-assembler "xvadddp" } } */ +/* { dg-final { scan-assembler "xvsubdp" } } */ +/* { dg-final { scan-assembler "xvmuldp" } } */ +/* { dg-final { scan-assembler "xvdivdp" } } */ +/* { dg-final { scan-assembler "xvmadd" } } */ +/* { dg-final { scan-assembler "xvmsub" } } */ + +__vector double a, b, c, d; + +void +vector_add (void) +{ + a = b + c; +} + +void +vector_subtract (void) +{ + a = b - c; +} + +void +vector_multiply (void) +{ + a = b * c; +} + +void +vector_multiply_add (void) +{ + a = (b * c) + d; +} + +void +vector_multiply_subtract (void) +{ + a = (b * c) - d; +} + +void +vector_divide (void) +{ + a = b / c; +} --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-1.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-1.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,42 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -mcpu=power7" } */ +/* { dg-final { scan-assembler "xvadddp" } } */ +/* { dg-final { scan-assembler "xvsubdp" } } */ +/* { dg-final { scan-assembler "xvmuldp" } } */ +/* { dg-final { scan-assembler "xvmadd" } } */ +/* { dg-final { scan-assembler "xvmsub" } } */ +/* { dg-final { scan-assembler "xvnmadd" } } */ +/* { dg-final { scan-assembler 
"xvnmsub" } } */ +/* { dg-final { scan-assembler "xvdivdp" } } */ +/* { dg-final { scan-assembler "xvmaxdp" } } */ +/* { dg-final { scan-assembler "xvmindp" } } */ +/* { dg-final { scan-assembler "xvsqrtdp" } } */ +/* { dg-final { scan-assembler "xvrsqrtedp" } } */ +/* { dg-final { scan-assembler "xvtsqrtdp" } } */ +/* { dg-final { scan-assembler "xvabsdp" } } */ +/* { dg-final { scan-assembler "xvnabsdp" } } */ +/* { dg-final { scan-assembler "xvredp" } } */ +/* { dg-final { scan-assembler "xvtdivdp" } } */ + +void use_builtins (__vector double *p, __vector double *q, __vector double *r, __vector double *s) +{ + p[0] = __builtin_vsx_xvadddp (q[0], r[0]); + p[1] = __builtin_vsx_xvsubdp (q[1], r[1]); + p[2] = __builtin_vsx_xvmuldp (q[2], r[2]); + p[3] = __builtin_vsx_xvdivdp (q[3], r[3]); + p[4] = __builtin_vsx_xvmaxdp (q[4], r[4]); + p[5] = __builtin_vsx_xvmindp (q[5], r[5]); + p[6] = __builtin_vsx_xvabsdp (q[6]); + p[7] = __builtin_vsx_xvnabsdp (q[7]); + p[8] = __builtin_vsx_xvsqrtdp (q[8]); + p[9] = __builtin_vsx_xvmadddp (q[9], r[9], s[9]); + p[10] = __builtin_vsx_xvmsubdp (q[10], r[10], s[10]); + p[11] = __builtin_vsx_xvnmadddp (q[11], r[11], s[11]); + p[12] = __builtin_vsx_xvnmsubdp (q[12], r[12], s[12]); + p[13] = __builtin_vsx_xvredp (q[13]); + p[14] = __builtin_vsx_xvrsqrtedp (q[14]); + p[15] = __builtin_vsx_xvtsqrtdp (q[15]); + p[16] = __builtin_vsx_xvtdivdp (q[16], r[16]); +} --- gcc/testsuite/gcc.target/powerpc/popcount-2.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/popcount-2.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,9 @@ +/* { dg-do compile { target { ilp32 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-options "-O2 -mcpu=power7 -m32" } */ +/* { dg-final { scan-assembler "popcntw" } } */ + +int foo(int x) +{ + return __builtin_popcount(x); +} --- gcc/testsuite/gcc.target/powerpc/vsx-vector-4.c (.../trunk) (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsx-vector-4.c 
(.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,48 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */ +/* { dg-final { scan-assembler "xvaddsp" } } */ +/* { dg-final { scan-assembler "xvsubsp" } } */ +/* { dg-final { scan-assembler "xvmulsp" } } */ +/* { dg-final { scan-assembler "xvdivsp" } } */ +/* { dg-final { scan-assembler "xvmadd" } } */ +/* { dg-final { scan-assembler "xvmsub" } } */ + +__vector float a, b, c, d; + +void +vector_add (void) +{ + a = b + c; +} + +void +vector_subtract (void) +{ + a = b - c; +} + +void +vector_multiply (void) +{ + a = b * c; +} + +void +vector_multiply_add (void) +{ + a = (b * c) + d; +} + +void +vector_multiply_subtract (void) +{ + a = (b * c) - d; +} + +void +vector_divide (void) +{ + a = b / c; +} --- gcc/testsuite/gcc.dg/vmx/vmx.exp (.../trunk) (revision 145777) +++ gcc/testsuite/gcc.dg/vmx/vmx.exp (.../branches/ibm/power7-meissner) (revision 146027) @@ -31,7 +31,7 @@ if {![istarget powerpc*-*-*] # nothing but extensions. global DEFAULT_VMXCFLAGS if ![info exists DEFAULT_VMXCFLAGS] then { - set DEFAULT_VMXCFLAGS "-maltivec -mabi=altivec -std=gnu99" + set DEFAULT_VMXCFLAGS "-maltivec -mabi=altivec -std=gnu99 -mno-vsx" } # If the target system supports AltiVec instructions, the default action --- gcc/testsuite/lib/target-supports.exp (.../trunk) (revision 145777) +++ gcc/testsuite/lib/target-supports.exp (.../branches/ibm/power7-meissner) (revision 146027) @@ -873,6 +873,32 @@ proc check_sse2_hw_available { } { }] } +# Return 1 if the target supports executing VSX instructions, 0 +# otherwise. Cache the result. + +proc check_vsx_hw_available { } { + return [check_cached_effective_target vsx_hw_available { + # Some simulators are known to not support VSX instructions. 
+ # For now, disable on Darwin + if { [istarget powerpc-*-eabi] || [istarget powerpc*-*-eabispe] || [istarget *-*-darwin*]} { + expr 0 + } else { + set options "-mvsx" + check_runtime_nocache vsx_hw_available { + int main() + { + #ifdef __MACH__ + asm volatile ("xxlor vs0,vs0,vs0"); + #else + asm volatile ("xxlor 0,0,0"); + #endif + return 0; + } + } $options + } + }] +} + # Return 1 if the target supports executing AltiVec instructions, 0 # otherwise. Cache the result. @@ -883,12 +909,13 @@ proc check_vmx_hw_available { } { expr 0 } else { # Most targets don't require special flags for this test case, but - # Darwin does. + # Darwin does. Just to be sure, make sure VSX is not enabled for + # the altivec tests. if { [istarget *-*-darwin*] || [istarget *-*-aix*] } { - set options "-maltivec" + set options "-maltivec -mno-vsx" } else { - set options "" + set options "-mno-vsx" } check_runtime_nocache vmx_hw_available { int main() @@ -1519,6 +1546,33 @@ proc check_effective_target_powerpc_alti } } +# Return 1 if this is a PowerPC target supporting -mvsx + +proc check_effective_target_powerpc_vsx_ok { } { + if { ([istarget powerpc*-*-*] + && ![istarget powerpc-*-linux*paired*]) + || [istarget rs6000-*-*] } { + # AltiVec is not supported on AIX before 5.3. + if { [istarget powerpc*-*-aix4*] + || [istarget powerpc*-*-aix5.1*] + || [istarget powerpc*-*-aix5.2*] } { + return 0 + } + return [check_no_compiler_messages powerpc_vsx_ok object { + int main (void) { +#ifdef __MACH__ + asm volatile ("xxlor vs0,vs0,vs0"); +#else + asm volatile ("xxlor 0,0,0"); +#endif + return 0; + } + } "-mvsx"] + } else { + return 0 + } +} + # Return 1 if this is a PowerPC target supporting -mcpu=cell. proc check_effective_target_powerpc_ppu_ok { } { --- gcc/config.in (.../trunk) (revision 145777) +++ gcc/config.in (.../branches/ibm/power7-meissner) (revision 146027) @@ -327,12 +327,18 @@ #endif -/* Define if your assembler supports popcntb field. 
*/ +/* Define if your assembler supports popcntb instruction. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_POPCNTB #endif +/* Define if your assembler supports popcntd instruction. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_POPCNTD +#endif + + /* Define if your assembler supports .register. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_REGISTER_PSEUDO_OP --- gcc/configure.ac (.../trunk) (revision 145777) +++ gcc/configure.ac (.../branches/ibm/power7-meissner) (revision 146027) @@ -3054,7 +3054,7 @@ foo: nop esac gcc_GAS_CHECK_FEATURE([move fp gpr support], - gcc_cv_as_powerpc_mfpgpr, [9,99,0],, + gcc_cv_as_powerpc_mfpgpr, [2,19,2],, [$conftest_s],, [AC_DEFINE(HAVE_AS_MFPGPR, 1, [Define if your assembler supports mffgpr and mftgpr.])]) @@ -3088,7 +3088,7 @@ LCF0: esac gcc_GAS_CHECK_FEATURE([compare bytes support], - gcc_cv_as_powerpc_cmpb, [9,99,0], -a32, + gcc_cv_as_powerpc_cmpb, [2,19,2], -a32, [$conftest_s],, [AC_DEFINE(HAVE_AS_CMPB, 1, [Define if your assembler supports cmpb.])]) @@ -3103,7 +3103,7 @@ LCF0: esac gcc_GAS_CHECK_FEATURE([decimal float support], - gcc_cv_as_powerpc_dfp, [9,99,0], -a32, + gcc_cv_as_powerpc_dfp, [2,19,2], -a32, [$conftest_s],, [AC_DEFINE(HAVE_AS_DFP, 1, [Define if your assembler supports DFP instructions.])]) @@ -3118,11 +3118,26 @@ LCF0: esac gcc_GAS_CHECK_FEATURE([vector-scalar support], - gcc_cv_as_powerpc_vsx, [9,99,0], -a32, + gcc_cv_as_powerpc_vsx, [2,19,2], -a32, [$conftest_s],, [AC_DEFINE(HAVE_AS_VSX, 1, [Define if your assembler supports VSX instructions.])]) + case $target in + *-*-aix*) conftest_s=' .machine "pwr7" + .csect .text[[PR]] + popcntd 3,3';; + *) conftest_s=' .machine power7 + .text + popcntd 3,3';; + esac + + gcc_GAS_CHECK_FEATURE([popcntd support], + gcc_cv_as_powerpc_popcntd, [2,19,2], -a32, + [$conftest_s],, + [AC_DEFINE(HAVE_AS_POPCNTD, 1, + [Define if your assembler supports POPCNTD instructions.])]) + gcc_GAS_CHECK_FEATURE([.gnu_attribute support], gcc_cv_as_powerpc_gnu_attribute, [2,18,0],, [.gnu_attribute 
4,1],, --- gcc/config/rs6000/aix53.h (.../trunk) (revision 145777) +++ gcc/config/rs6000/aix53.h (.../branches/ibm/power7-meissner) (revision 146027) @@ -57,20 +57,24 @@ do { \ #undef ASM_SPEC #define ASM_SPEC "-u %{maix64:-a64 %{!mcpu*:-mppc64}} %(asm_cpu)" -/* Common ASM definitions used by ASM_SPEC amongst the various targets - for handling -mcpu=xxx switches. */ +/* Common ASM definitions used by ASM_SPEC amongst the various targets for + handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to + provide the default assembler options if the user uses -mcpu=native, so if + you make changes here, make them there also. */ #undef ASM_CPU_SPEC #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ %{!maltivec: %{!mpower64: %(asm_default)}}}} \ +%{mcpu=native: %(asm_cpu_native)} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -mpwr4} \ %{mcpu=power5: -mpwr5} \ %{mcpu=power5+: -mpwr5x} \ %{mcpu=power6: -mpwr6} \ %{mcpu=power6x: -mpwr6} \ +%{mcpu=power7: -mpwr7} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc} \ %{mcpu=603: -m603} \ --- gcc/config/rs6000/vector.md (.../trunk) (revision 0) +++ gcc/config/rs6000/vector.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,664 @@ +;; Expander definitions for vector support between altivec & vsx. No +;; instructions are in this file, this file provides the generic vector +;; expander, and the actual vector instructions will be in altivec.md and +;; vsx.md + +;; Copyright (C) 2009 +;; Free Software Foundation, Inc. +;; Contributed by Michael Meissner + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. 
+ +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . + + +;; Vector int modes +(define_mode_iterator VEC_I [V16QI V8HI V4SI]) + +;; Vector float modes +(define_mode_iterator VEC_F [V4SF V2DF]) + +;; Vector logical modes +(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF TI]) + +;; Vector modes for moves. Don't do TImode here. +(define_mode_iterator VEC_M [V16QI V8HI V4SI V2DI V4SF V2DF]) + +;; Vector comparison modes +(define_mode_iterator VEC_C [V16QI V8HI V4SI V4SF V2DF]) + +;; Vector reload iterator +(define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF DF TI]) + +;; Base type from vector mode +(define_mode_attr VEC_base [(V16QI "QI") + (V8HI "HI") + (V4SI "SI") + (V2DI "DI") + (V4SF "SF") + (V2DF "DF") + (TI "TI")]) + +;; Same size integer type for floating point data +(define_mode_attr VEC_int [(V4SF "v4si") + (V2DF "v2di")]) + +(define_mode_attr VEC_INT [(V4SF "V4SI") + (V2DF "V2DI")]) + +;; Vector move instructions. +(define_expand "mov" + [(set (match_operand:VEC_M 0 "nonimmediate_operand" "") + (match_operand:VEC_M 1 "any_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" +{ + if (can_create_pseudo_p ()) + { + if (CONSTANT_P (operands[1]) + && !easy_vector_constant (operands[1], mode)) + operands[1] = force_const_mem (mode, operands[1]); + + else if (!vlogical_operand (operands[0], mode) + && !vlogical_operand (operands[1], mode)) + operands[1] = force_reg (mode, operands[1]); + } +}) + +;; Generic vector floating point load/store instructions. These will match +;; insns defined in vsx.md or altivec.md depending on the switches. 
+(define_expand "vector_load_" + [(set (match_operand:VEC_M 0 "vfloat_operand" "") + (match_operand:VEC_M 1 "memory_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "vector_store_" + [(set (match_operand:VEC_M 0 "memory_operand" "") + (match_operand:VEC_M 1 "vfloat_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +;; Splits if a GPR register was chosen for the move +(define_split + [(set (match_operand:VEC_L 0 "nonimmediate_operand" "") + (match_operand:VEC_L 1 "input_operand" ""))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) + && reload_completed + && gpr_or_gpr_p (operands[0], operands[1])" + [(pc)] +{ + rs6000_split_multireg_move (operands[0], operands[1]); + DONE; +}) + + +;; Reload patterns for vector operations. We may need an additional base +;; register to convert the reg+offset addressing to reg+reg for vector +;; registers and reg+reg or (reg+reg)&(-16) addressing to just an index +;; register for gpr registers. +(define_expand "reload___store" + [(parallel [(match_operand:VEC_R 0 "memory_operand" "m") + (match_operand:VEC_R 1 "gpc_reg_operand" "r") + (match_operand:P 2 "register_operand" "=&b")])] + "" +{ + rs6000_secondary_reload_inner (operands[1], operands[0], operands[2], true); + DONE; +}) + +(define_expand "reload___load" + [(parallel [(match_operand:VEC_R 0 "gpc_reg_operand" "=&r") + (match_operand:VEC_R 1 "memory_operand" "m") + (match_operand:P 2 "register_operand" "=&b")])] + "" +{ + rs6000_secondary_reload_inner (operands[0], operands[1], operands[2], false); + DONE; +}) + +;; Reload sometimes tries to move the address to a GPR, and can generate +;; invalid RTL for addresses involving AND -16.
+ +(define_insn_and_split "*vec_reload_and_plus_" + [(set (match_operand:P 0 "gpc_reg_operand" "=b") + (and:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r") + (match_operand:P 2 "gpc_reg_operand" "r")) + (const_int -16)))] + "TARGET_ALTIVEC || TARGET_VSX" + "#" + "&& reload_completed" + [(set (match_dup 0) + (plus:P (match_dup 1) + (match_dup 2))) + (parallel [(set (match_dup 0) + (and:P (match_dup 0) + (const_int -16))) + (clobber:CC (scratch:CC))])]) + +;; Generic floating point vector arithmetic support +(define_expand "add3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (plus:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "sub3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (minus:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "mul3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (mult:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "(VECTOR_UNIT_VSX_P (mode) + || (VECTOR_UNIT_ALTIVEC_P (mode) && TARGET_FUSED_MADD))" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_mulv4sf3 (operands[0], operands[1], operands[2])); + DONE; + } +}") + +(define_expand "div3" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (div:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_VSX_P (mode)" + "") + +(define_expand "neg2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (neg:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_negv4sf2 (operands[0], operands[1])); + DONE; + } +}") + +(define_expand "abs2" + [(set 
(match_operand:VEC_F 0 "vfloat_operand" "") + (abs:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_absv4sf2 (operands[0], operands[1])); + DONE; + } +}") + +(define_expand "smin3" + [(set (match_operand:VEC_F 0 "register_operand" "") + (smin:VEC_F (match_operand:VEC_F 1 "register_operand" "") + (match_operand:VEC_F 2 "register_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "smax3" + [(set (match_operand:VEC_F 0 "register_operand" "") + (smax:VEC_F (match_operand:VEC_F 1 "register_operand" "") + (match_operand:VEC_F 2 "register_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + + +(define_expand "sqrt2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (sqrt:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_VSX_P (mode)" + "") + +(define_expand "ftrunc2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (fix:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + + +;; Vector comparisons +(define_expand "vcond" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (if_then_else:VEC_F + (match_operator 3 "comparison_operator" + [(match_operand:VEC_F 4 "vfloat_operand" "") + (match_operand:VEC_F 5 "vfloat_operand" "")]) + (match_operand:VEC_F 1 "vfloat_operand" "") + (match_operand:VEC_F 2 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], + operands[3], operands[4], operands[5])) + DONE; + else + FAIL; +}") + +(define_expand "vcond" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (if_then_else:VEC_I + (match_operator 3 "comparison_operator" + [(match_operand:VEC_I 4 "vint_operand" "") + (match_operand:VEC_I 5 "vint_operand" "")]) + (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + 
"VECTOR_UNIT_ALTIVEC_P (mode)" + " +{ + if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], + operands[3], operands[4], operands[5])) + DONE; + else + FAIL; +}") + +(define_expand "vcondu" + [(set (match_operand:VEC_I 0 "vint_operand" "=v") + (if_then_else:VEC_I + (match_operator 3 "comparison_operator" + [(match_operand:VEC_I 4 "vint_operand" "") + (match_operand:VEC_I 5 "vint_operand" "")]) + (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_P (mode)" + " +{ + if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], + operands[3], operands[4], operands[5])) + DONE; + else + FAIL; +}") + +(define_expand "vector_eq" + [(set (match_operand:VEC_C 0 "vlogical_operand" "") + (eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand:VEC_C 2 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "vector_gt" + [(set (match_operand:VEC_C 0 "vlogical_operand" "") + (gt:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand:VEC_C 2 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "vector_ge" + [(set (match_operand:VEC_C 0 "vlogical_operand" "") + (ge:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand:VEC_C 2 "vlogical_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "vector_gtu" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_P (mode)" + "") + +(define_expand "vector_geu" + [(set (match_operand:VEC_I 0 "vint_operand" "") + (geu:VEC_I (match_operand:VEC_I 1 "vint_operand" "") + (match_operand:VEC_I 2 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_P (mode)" + "") + +;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask +;; which is in the reverse order that we want +(define_expand 
"vector_vsel" + [(match_operand:VEC_F 0 "vlogical_operand" "") + (match_operand:VEC_F 1 "vlogical_operand" "") + (match_operand:VEC_F 2 "vlogical_operand" "") + (match_operand:VEC_F 3 "vlogical_operand" "")] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (VECTOR_UNIT_VSX_P (mode)) + emit_insn (gen_vsx_vsel (operands[0], operands[3], + operands[2], operands[1])); + else + emit_insn (gen_altivec_vsel (operands[0], operands[3], + operands[2], operands[1])); + DONE; +}") + +(define_expand "vector_vsel" + [(match_operand:VEC_I 0 "vlogical_operand" "") + (match_operand:VEC_I 1 "vlogical_operand" "") + (match_operand:VEC_I 2 "vlogical_operand" "") + (match_operand:VEC_I 3 "vlogical_operand" "")] + "VECTOR_UNIT_ALTIVEC_P (mode)" + " +{ + emit_insn (gen_altivec_vsel (operands[0], operands[3], + operands[2], operands[1])); + DONE; +}") + + +;; Vector logical instructions +(define_expand "xor3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (xor:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "ior3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "and3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (and:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "one_cmpl2" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (not:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "nor3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (not:VEC_L (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "") + (match_operand:VEC_L 2 "vlogical_operand" ""))))] + 
"VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +(define_expand "andc3" + [(set (match_operand:VEC_L 0 "vlogical_operand" "") + (and:VEC_L (not:VEC_L (match_operand:VEC_L 2 "vlogical_operand" "")) + (match_operand:VEC_L 1 "vlogical_operand" "")))] + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" + "") + +;; Same size conversions +(define_expand "float2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (float:VEC_F (match_operand: 1 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_vcfsx (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + +(define_expand "unsigned_float2" + [(set (match_operand:VEC_F 0 "vfloat_operand" "") + (unsigned_float:VEC_F (match_operand: 1 "vint_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_vcfux (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + +(define_expand "fix_trunc2" + [(set (match_operand: 0 "vint_operand" "") + (fix: (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_vctsxs (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + +(define_expand "fixuns_trunc2" + [(set (match_operand: 0 "vint_operand" "") + (unsigned_fix: (match_operand:VEC_F 1 "vfloat_operand" "")))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" + " +{ + if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) + { + emit_insn (gen_altivec_vctuxs (operands[0], operands[1], const0_rtx)); + DONE; + } +}") + + +;; Vector initialization, set, extract +(define_expand "vec_init" + [(match_operand:VEC_C 0 "vlogical_operand" "") + (match_operand:VEC_C 1 "vec_init_operand" "")] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" +{ + rs6000_expand_vector_init (operands[0], operands[1]); + DONE; +}) + +(define_expand "vec_set" + [(match_operand:VEC_C 0 
"vlogical_operand" "") + (match_operand: 1 "register_operand" "") + (match_operand 2 "const_int_operand" "")] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" +{ + rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); + DONE; +}) + +(define_expand "vec_extract" + [(match_operand: 0 "register_operand" "") + (match_operand:VEC_C 1 "vlogical_operand" "") + (match_operand 2 "const_int_operand" "")] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" +{ + rs6000_expand_vector_extract (operands[0], operands[1], + INTVAL (operands[2])); + DONE; +}) + +;; Interleave patterns +(define_expand "vec_interleave_highv4sf" + [(set (match_operand:V4SF 0 "vfloat_operand" "") + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "vfloat_operand" "") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (vec_select:V4SF (match_operand:V4SF 2 "vfloat_operand" "") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (const_int 5)))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" + "") + +(define_expand "vec_interleave_lowv4sf" + [(set (match_operand:V4SF 0 "vfloat_operand" "") + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "vfloat_operand" "") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SF (match_operand:V4SF 2 "vfloat_operand" "") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (const_int 5)))] + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" + "") + +(define_expand "vec_interleave_highv2df" + [(set (match_operand:V2DF 0 "vfloat_operand" "") + (vec_concat:V2DF + (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "") + (parallel [(const_int 0)])) + (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "") + (parallel [(const_int 0)]))))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "") + +(define_expand "vec_interleave_lowv2df" + [(set (match_operand:V2DF 0 "vfloat_operand" "") + (vec_concat:V2DF + (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" 
"") + (parallel [(const_int 1)])) + (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "") + (parallel [(const_int 1)]))))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "") + + +;; Convert double word types to single word types +(define_expand "vec_pack_trunc_v2df" + [(match_operand:V4SF 0 "vsx_register_operand" "") + (match_operand:V2DF 1 "vsx_register_operand" "") + (match_operand:V2DF 2 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && TARGET_ALTIVEC" +{ + rtx r1 = gen_reg_rtx (V4SFmode); + rtx r2 = gen_reg_rtx (V4SFmode); + + emit_insn (gen_vsx_xvcvdpsp (r1, operands[1])); + emit_insn (gen_vsx_xvcvdpsp (r2, operands[2])); + emit_insn (gen_vec_extract_evenv4sf (operands[0], r1, r2)); + DONE; +}) + +(define_expand "vec_pack_sfix_trunc_v2df" + [(match_operand:V4SI 0 "vsx_register_operand" "") + (match_operand:V2DF 1 "vsx_register_operand" "") + (match_operand:V2DF 2 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && TARGET_ALTIVEC" +{ + rtx r1 = gen_reg_rtx (V4SImode); + rtx r2 = gen_reg_rtx (V4SImode); + + emit_insn (gen_vsx_xvcvdpsxws (r1, operands[1])); + emit_insn (gen_vsx_xvcvdpsxws (r2, operands[2])); + emit_insn (gen_vec_extract_evenv4si (operands[0], r1, r2)); + DONE; +}) + +(define_expand "vec_pack_ufix_trunc_v2df" + [(match_operand:V4SI 0 "vsx_register_operand" "") + (match_operand:V2DF 1 "vsx_register_operand" "") + (match_operand:V2DF 2 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && TARGET_ALTIVEC" +{ + rtx r1 = gen_reg_rtx (V4SImode); + rtx r2 = gen_reg_rtx (V4SImode); + + emit_insn (gen_vsx_xvcvdpuxws (r1, operands[1])); + emit_insn (gen_vsx_xvcvdpuxws (r2, operands[2])); + emit_insn (gen_vec_extract_evenv4si (operands[0], r1, r2)); + DONE; +}) + +;; Convert single word types to double word +(define_expand "vec_unpacks_hi_v4sf" + [(match_operand:V2DF 0 "vsx_register_operand" "") + (match_operand:V4SF 1 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" +{ + rtx reg 
= gen_reg_rtx (V4SFmode); + + emit_insn (gen_vec_interleave_highv4sf (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvspdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacks_lo_v4sf" + [(match_operand:V2DF 0 "vsx_register_operand" "") + (match_operand:V4SF 1 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)" +{ + rtx reg = gen_reg_rtx (V4SFmode); + + emit_insn (gen_vec_interleave_lowv4sf (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvspdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacks_float_hi_v4si" + [(match_operand:V2DF 0 "vsx_register_operand" "") + (match_operand:V4SI 1 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_highv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvsxwdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacks_float_lo_v4si" + [(match_operand:V2DF 0 "vsx_register_operand" "") + (match_operand:V4SI 1 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_lowv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvsxwdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacku_float_hi_v4si" + [(match_operand:V2DF 0 "vsx_register_operand" "") + (match_operand:V4SI 1 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_highv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvuxwdp (operands[0], reg)); + DONE; +}) + +(define_expand "vec_unpacku_float_lo_v4si" + [(match_operand:V2DF 0 "vsx_register_operand" "") + (match_operand:V4SI 1 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (V2DFmode) && VECTOR_UNIT_ALTIVEC_OR_VSX_P 
(V4SImode)" +{ + rtx reg = gen_reg_rtx (V4SImode); + + emit_insn (gen_vec_interleave_lowv4si (reg, operands[1], operands[1])); + emit_insn (gen_vsx_xvcvuxwdp (operands[0], reg)); + DONE; +}) --- gcc/config/rs6000/spe.md (.../trunk) (revision 145777) +++ gcc/config/rs6000/spe.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -99,7 +99,7 @@ (define_insn "*divsf3_gpr" ;; Floating point conversion instructions. -(define_insn "fixuns_truncdfsi2" +(define_insn "spe_fixuns_truncdfsi2" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") (unsigned_fix:SI (match_operand:DF 1 "gpc_reg_operand" "r")))] "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE" --- gcc/config/rs6000/constraints.md (.../trunk) (revision 145777) +++ gcc/config/rs6000/constraints.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -17,6 +17,8 @@ ;; along with GCC; see the file COPYING3. If not see ;; . +;; Available constraint letters: "e", "k", "u", "A", "B", "C", "D" + ;; Register constraints (define_register_constraint "f" "TARGET_HARD_FLOAT && TARGET_FPRS @@ -50,6 +52,28 @@ (define_register_constraint "y" "CR_REGS (define_register_constraint "z" "XER_REGS" "@internal") +;; Use w as a prefix to add VSX modes +;; vector double (V2DF) +(define_register_constraint "wd" "rs6000_vector_reg_class[V2DFmode]" + "@internal") + +;; vector float (V4SF) +(define_register_constraint "wf" "rs6000_vector_reg_class[V4SFmode]" + "@internal") + +;; scalar double (DF) +(define_register_constraint "ws" "rs6000_vector_reg_class[DFmode]" + "@internal") + +;; any VSX register +(define_register_constraint "wa" "rs6000_vsx_reg_class" + "@internal") + +;; Altivec style load/store that ignores the bottom bits of the address +(define_memory_constraint "wZ" + "Indexed or indirect memory operand, ignoring the bottom 4 bits" + (match_operand 0 "altivec_indexed_or_indirect_operand")) + ;; Integer constraints (define_constraint "I" @@ -159,3 +183,7 @@ (define_constraint "t" (define_constraint "W" "vector constant that does not 
require memory" (match_operand 0 "easy_vector_constant")) + +(define_constraint "j" + "Zero vector constant" + (match_test "(op == const0_rtx || op == CONST0_RTX (GET_MODE (op)))")) --- gcc/config/rs6000/predicates.md (.../trunk) (revision 145777) +++ gcc/config/rs6000/predicates.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -38,6 +38,37 @@ (define_predicate "altivec_register_oper || ALTIVEC_REGNO_P (REGNO (op)) || REGNO (op) > LAST_VIRTUAL_REGISTER"))) +;; Return 1 if op is a VSX register. +(define_predicate "vsx_register_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VSX_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + +;; Return 1 if op is a vector register that operates on floating point vectors +;; (either altivec or VSX). +(define_predicate "vfloat_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VFLOAT_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + +;; Return 1 if op is a vector register that operates on integer vectors +;; (only altivec, VSX doesn't support integer vectors) +(define_predicate "vint_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VINT_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + +;; Return 1 if op is a vector register to do logical operations on (and, or, +;; xor, etc.) +(define_predicate "vlogical_operand" + (and (match_operand 0 "register_operand") + (match_test "GET_CODE (op) != REG + || VLOGICAL_REGNO_P (REGNO (op)) + || REGNO (op) > LAST_VIRTUAL_REGISTER"))) + ;; Return 1 if op is XER register. (define_predicate "xer_operand" (and (match_code "reg") @@ -234,6 +265,10 @@ (define_predicate "easy_fp_constant" && num_insns_constant_wide ((HOST_WIDE_INT) k[3]) == 1); case DFmode: + /* The constant 0.f is easy under VSX. 
*/ + if (op == CONST0_RTX (DFmode) && VECTOR_UNIT_VSX_P (DFmode)) + return 1; + /* Force constants to memory before reload to utilize compress_float_constant. Avoid this when flag_unsafe_math_optimizations is enabled @@ -396,16 +431,36 @@ (define_predicate "indexed_or_indirect_o (match_code "mem") { op = XEXP (op, 0); - if (TARGET_ALTIVEC - && ALTIVEC_VECTOR_MODE (mode) + if (VECTOR_MEM_ALTIVEC_P (mode) && GET_CODE (op) == AND && GET_CODE (XEXP (op, 1)) == CONST_INT && INTVAL (XEXP (op, 1)) == -16) op = XEXP (op, 0); + else if (VECTOR_MEM_VSX_P (mode) + && GET_CODE (op) == PRE_MODIFY) + op = XEXP (op, 1); + return indexed_or_indirect_address (op, mode); }) +;; Return 1 if the operand is an indexed or indirect memory operand with an +;; AND -16 in it, used to recognize when we need to switch to Altivec loads +;; to realign loops instead of VSX (altivec silently ignores the bottom bits, +;; while VSX uses the full address and traps) +(define_predicate "altivec_indexed_or_indirect_operand" + (match_code "mem") +{ + op = XEXP (op, 0); + if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) + && GET_CODE (op) == AND + && GET_CODE (XEXP (op, 1)) == CONST_INT + && INTVAL (XEXP (op, 1)) == -16) + return indexed_or_indirect_address (XEXP (op, 0), mode); + + return 0; +}) + ;; Return 1 if the operand is an indexed or indirect address. (define_special_predicate "indexed_or_indirect_address" (and (match_test "REG_P (op) @@ -1336,3 +1391,19 @@ (define_predicate "stmw_operation" return 1; }) + +;; Return true if the operand is a legitimate parallel for vec_init +(define_predicate "vec_init_operand" + (match_code "parallel") +{ + /* Disallow V2DF mode with MEM's unless both are the same under VSX. 
+ if (mode == V2DFmode && VECTOR_UNIT_VSX_P (mode)) + { + rtx op0 = XVECEXP (op, 0, 0); + rtx op1 = XVECEXP (op, 0, 1); + if ((MEM_P (op0) || MEM_P (op1)) && !rtx_equal_p (op0, op1)) + return 0; + } + + return 1; +}) --- gcc/config/rs6000/ppc-asm.h (.../trunk) (revision 145777) +++ gcc/config/rs6000/ppc-asm.h (.../branches/ibm/power7-meissner) (revision 146027) @@ -63,7 +63,7 @@ #define f16 16 #define f17 17 #define f18 18 -#define f19 19 +#define f19 19 #define f20 20 #define f21 21 #define f22 22 @@ -77,6 +77,143 @@ #define f30 30 #define f31 31 +#ifdef __VSX__ +#define f32 32 +#define f33 33 +#define f34 34 +#define f35 35 +#define f36 36 +#define f37 37 +#define f38 38 +#define f39 39 +#define f40 40 +#define f41 41 +#define f42 42 +#define f43 43 +#define f44 44 +#define f45 45 +#define f46 46 +#define f47 47 +#define f48 48 +#define f49 49 +#define f50 50 +#define f51 51 +#define f52 52 +#define f53 53 +#define f54 54 +#define f55 55 +#define f56 56 +#define f57 57 +#define f58 58 +#define f59 59 +#define f60 60 +#define f61 61 +#define f62 62 +#define f63 63 +#endif + +#ifdef __ALTIVEC__ +#define v0 0 +#define v1 1 +#define v2 2 +#define v3 3 +#define v4 4 +#define v5 5 +#define v6 6 +#define v7 7 +#define v8 8 +#define v9 9 +#define v10 10 +#define v11 11 +#define v12 12 +#define v13 13 +#define v14 14 +#define v15 15 +#define v16 16 +#define v17 17 +#define v18 18 +#define v19 19 +#define v20 20 +#define v21 21 +#define v22 22 +#define v23 23 +#define v24 24 +#define v25 25 +#define v26 26 +#define v27 27 +#define v28 28 +#define v29 29 +#define v30 30 +#define v31 31 +#endif + +#ifdef __VSX__ +#define vs0 0 +#define vs1 1 +#define vs2 2 +#define vs3 3 +#define vs4 4 +#define vs5 5 +#define vs6 6 +#define vs7 7 +#define vs8 8 +#define vs9 9 +#define vs10 10 +#define vs11 11 +#define vs12 12 +#define vs13 13 +#define vs14 14 +#define vs15 15 +#define vs16 16 +#define vs17 17 +#define vs18 18 +#define vs19 19 +#define vs20 20 +#define vs21 21 +#define
vs22 22 +#define vs23 23 +#define vs24 24 +#define vs25 25 +#define vs26 26 +#define vs27 27 +#define vs28 28 +#define vs29 29 +#define vs30 30 +#define vs31 31 +#define vs32 32 +#define vs33 33 +#define vs34 34 +#define vs35 35 +#define vs36 36 +#define vs37 37 +#define vs38 38 +#define vs39 39 +#define vs40 40 +#define vs41 41 +#define vs42 42 +#define vs43 43 +#define vs44 44 +#define vs45 45 +#define vs46 46 +#define vs47 47 +#define vs48 48 +#define vs49 49 +#define vs50 50 +#define vs51 51 +#define vs52 52 +#define vs53 53 +#define vs54 54 +#define vs55 55 +#define vs56 56 +#define vs57 57 +#define vs58 58 +#define vs59 59 +#define vs60 60 +#define vs61 61 +#define vs62 62 +#define vs63 63 +#endif + /* * Macros to glue together two tokens. */ --- gcc/config/rs6000/linux64.opt (.../trunk) (revision 145777) +++ gcc/config/rs6000/linux64.opt (.../branches/ibm/power7-meissner) (revision 146027) @@ -20,5 +20,5 @@ ; . mprofile-kernel -Target Report Mask(PROFILE_KERNEL) +Target Report Var(TARGET_PROFILE_KERNEL) Call mcount for profiling before a function prologue --- gcc/config/rs6000/sysv4.opt (.../trunk) (revision 145777) +++ gcc/config/rs6000/sysv4.opt (.../branches/ibm/power7-meissner) (revision 146027) @@ -32,7 +32,7 @@ Target RejectNegative Joined Specify bit size of immediate TLS offsets mbit-align -Target Report Mask(NO_BITFIELD_TYPE) +Target Report Var(TARGET_NO_BITFIELD_TYPE) Align to the base type of the bit-field mstrict-align @@ -87,11 +87,11 @@ Target Report Mask(EABI) Use EABI mbit-word -Target Report Mask(NO_BITFIELD_WORD) +Target Report Var(TARGET_NO_BITFIELD_WORD) Allow bit-fields to cross word boundaries mregnames -Target Mask(REGNAMES) +Target Var(TARGET_REGNAMES) Use alternate register names ;; FIXME: Does nothing.
--- gcc/config/rs6000/rs6000-protos.h (.../trunk) (revision 145777) +++ gcc/config/rs6000/rs6000-protos.h (.../branches/ibm/power7-meissner) (revision 146027) @@ -64,9 +64,15 @@ extern int insvdi_rshift_rlwimi_p (rtx, extern int registers_ok_for_quad_peep (rtx, rtx); extern int mems_ok_for_quad_peep (rtx, rtx); extern bool gpr_or_gpr_p (rtx, rtx); +extern enum reg_class rs6000_preferred_reload_class(rtx, enum reg_class); extern enum reg_class rs6000_secondary_reload_class (enum reg_class, enum machine_mode, rtx); - +extern bool rs6000_secondary_memory_needed (enum reg_class, enum reg_class, + enum machine_mode); +extern bool rs6000_cannot_change_mode_class (enum machine_mode, + enum machine_mode, + enum reg_class); +extern void rs6000_secondary_reload_inner (rtx, rtx, rtx, bool); extern int paired_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx); extern void paired_expand_vector_move (rtx operands[]); @@ -170,7 +176,6 @@ extern int rs6000_register_move_cost (en enum reg_class, enum reg_class); extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int); extern bool rs6000_tls_referenced_p (rtx); -extern int rs6000_hard_regno_nregs (int, enum machine_mode); extern void rs6000_conditional_register_usage (void); /* Declare functions in rs6000-c.c */ @@ -189,4 +194,6 @@ const char * rs6000_xcoff_strip_dollar ( void rs6000_final_prescan_insn (rtx, rtx *operand, int num_operands); extern bool rs6000_hard_regno_mode_ok_p[][FIRST_PSEUDO_REGISTER]; +extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES]; +extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER]; #endif /* rs6000-protos.h */ --- gcc/config/rs6000/t-rs6000 (.../trunk) (revision 145777) +++ gcc/config/rs6000/t-rs6000 (.../branches/ibm/power7-meissner) (revision 146027) @@ -16,3 +16,33 @@ rs6000-c.o: $(srcdir)/config/rs6000/rs60 # The rs6000 backend doesn't cause warnings in these files. 
insn-conditions.o-warn = + +MD_INCLUDES = $(srcdir)/config/rs6000/rios1.md \ + $(srcdir)/config/rs6000/rios2.md \ + $(srcdir)/config/rs6000/rs64.md \ + $(srcdir)/config/rs6000/mpc.md \ + $(srcdir)/config/rs6000/40x.md \ + $(srcdir)/config/rs6000/440.md \ + $(srcdir)/config/rs6000/603.md \ + $(srcdir)/config/rs6000/6xx.md \ + $(srcdir)/config/rs6000/7xx.md \ + $(srcdir)/config/rs6000/7450.md \ + $(srcdir)/config/rs6000/8540.md \ + $(srcdir)/config/rs6000/e300c2c3.md \ + $(srcdir)/config/rs6000/e500mc.md \ + $(srcdir)/config/rs6000/power4.md \ + $(srcdir)/config/rs6000/power5.md \ + $(srcdir)/config/rs6000/power6.md \ + $(srcdir)/config/rs6000/power7.md \ + $(srcdir)/config/rs6000/cell.md \ + $(srcdir)/config/rs6000/xfpu.md \ + $(srcdir)/config/rs6000/predicates.md \ + $(srcdir)/config/rs6000/constraints.md \ + $(srcdir)/config/rs6000/darwin.md \ + $(srcdir)/config/rs6000/sync.md \ + $(srcdir)/config/rs6000/vector.md \ + $(srcdir)/config/rs6000/vsx.md \ + $(srcdir)/config/rs6000/altivec.md \ + $(srcdir)/config/rs6000/spe.md \ + $(srcdir)/config/rs6000/dfp.md \ + $(srcdir)/config/rs6000/paired.md --- gcc/config/rs6000/power7.md (.../trunk) (revision 0) +++ gcc/config/rs6000/power7.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,318 @@ +;; Scheduling description for IBM POWER7 processor. +;; Copyright (C) 2009 Free Software Foundation, Inc. +;; +;; Contributed by Pat Haugen (pthaugen@us.ibm.com). + +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. +;; +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. 
+;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . + +(define_automaton "power7iu,power7lsu,power7vsu,power7misc") + +(define_cpu_unit "iu1_power7,iu2_power7" "power7iu") +(define_cpu_unit "lsu1_power7,lsu2_power7" "power7lsu") +(define_cpu_unit "vsu1_power7,vsu2_power7" "power7vsu") +(define_cpu_unit "bpu_power7,cru_power7" "power7misc") +(define_cpu_unit "du1_power7,du2_power7,du3_power7,du4_power7,du5_power7" + "power7misc") + + +(define_reservation "DU_power7" + "du1_power7|du2_power7|du3_power7|du4_power7") + +(define_reservation "DU2F_power7" + "du1_power7+du2_power7") + +(define_reservation "DU4_power7" + "du1_power7+du2_power7+du3_power7+du4_power7") + +(define_reservation "FXU_power7" + "iu1_power7|iu2_power7") + +(define_reservation "VSU_power7" + "vsu1_power7|vsu2_power7") + +(define_reservation "LSU_power7" + "lsu1_power7|lsu2_power7") + + +; Dispatch slots are allocated in order conforming to program order. 
+(absence_set "du1_power7" "du2_power7,du3_power7,du4_power7,du5_power7") +(absence_set "du2_power7" "du3_power7,du4_power7,du5_power7") +(absence_set "du3_power7" "du4_power7,du5_power7") +(absence_set "du4_power7" "du5_power7") + + +; LS Unit +(define_insn_reservation "power7-load" 2 + (and (eq_attr "type" "load") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7") + +(define_insn_reservation "power7-load-ext" 3 + (and (eq_attr "type" "load_ext") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7,FXU_power7") + +(define_insn_reservation "power7-load-update" 2 + (and (eq_attr "type" "load_u") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-load-update-indexed" 3 + (and (eq_attr "type" "load_ux") + (eq_attr "cpu" "power7")) + "DU4_power7,FXU_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-load-ext-update" 4 + (and (eq_attr "type" "load_ext_u") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-load-ext-update-indexed" 4 + (and (eq_attr "type" "load_ext_ux") + (eq_attr "cpu" "power7")) + "DU4_power7,FXU_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-fpload" 3 + (and (eq_attr "type" "fpload") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7") + +(define_insn_reservation "power7-fpload-update" 3 + (and (eq_attr "type" "fpload_u,fpload_ux") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-store" 6 ; store-forwarding latency + (and (eq_attr "type" "store") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+FXU_power7") + +(define_insn_reservation "power7-store-update" 6 + (and (eq_attr "type" "store_u") + (eq_attr "cpu" "power7")) + "DU2F_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-store-update-indexed" 6 + (and (eq_attr "type" "store_ux") + (eq_attr "cpu" "power7")) + 
"DU4_power7,LSU_power7+FXU_power7,FXU_power7") + +(define_insn_reservation "power7-fpstore" 6 + (and (eq_attr "type" "fpstore") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+VSU_power7") + +(define_insn_reservation "power7-fpstore-update" 6 + (and (eq_attr "type" "fpstore_u,fpstore_ux") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+VSU_power7+FXU_power7") + +(define_insn_reservation "power7-larx" 3 + (and (eq_attr "type" "load_l") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7") + +(define_insn_reservation "power7-stcx" 10 + (and (eq_attr "type" "store_c") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7") + +(define_insn_reservation "power7-vecload" 3 + (and (eq_attr "type" "vecload") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7") + +(define_insn_reservation "power7-vecstore" 6 + (and (eq_attr "type" "vecstore") + (eq_attr "cpu" "power7")) + "DU_power7,LSU_power7+VSU_power7") + +(define_insn_reservation "power7-sync" 11 + (and (eq_attr "type" "sync") + (eq_attr "cpu" "power7")) + "DU4_power7,LSU_power7") + + +; FX Unit +(define_insn_reservation "power7-integer" 1 + (and (eq_attr "type" "integer,insert_word,insert_dword,shift,trap,\ + var_shift_rotate,exts") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-cntlz" 2 + (and (eq_attr "type" "cntlz") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-two" 2 + (and (eq_attr "type" "two") + (eq_attr "cpu" "power7")) + "DU_power7+DU_power7,FXU_power7,FXU_power7") + +(define_insn_reservation "power7-three" 3 + (and (eq_attr "type" "three") + (eq_attr "cpu" "power7")) + "DU_power7+DU_power7+DU_power7,FXU_power7,FXU_power7,FXU_power7") + +(define_insn_reservation "power7-cmp" 1 + (and (eq_attr "type" "cmp,fast_compare") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-compare" 2 + (and (eq_attr "type" "compare,delayed_compare,var_delayed_compare") + (eq_attr "cpu" "power7")) + 
"DU2F_power7,FXU_power7,FXU_power7") + +(define_bypass 3 "power7-cmp,power7-compare" "power7-crlogical,power7-delayedcr") + +(define_insn_reservation "power7-mul" 4 + (and (eq_attr "type" "imul,imul2,imul3,lmul") + (eq_attr "cpu" "power7")) + "DU_power7,FXU_power7") + +(define_insn_reservation "power7-mul-compare" 5 + (and (eq_attr "type" "imul_compare,lmul_compare") + (eq_attr "cpu" "power7")) + "DU2F_power7,FXU_power7,nothing*3,FXU_power7") + +(define_insn_reservation "power7-idiv" 36 + (and (eq_attr "type" "idiv") + (eq_attr "cpu" "power7")) + "DU2F_power7,iu1_power7*36|iu2_power7*36") + +(define_insn_reservation "power7-ldiv" 68 + (and (eq_attr "type" "ldiv") + (eq_attr "cpu" "power7")) + "DU2F_power7,iu1_power7*68|iu2_power7*68") + +(define_insn_reservation "power7-isync" 1 ; + (and (eq_attr "type" "isync") + (eq_attr "cpu" "power7")) + "DU4_power7,FXU_power7") + + +; CR Unit +(define_insn_reservation "power7-mtjmpr" 4 + (and (eq_attr "type" "mtjmpr") + (eq_attr "cpu" "power7")) + "du1_power7,FXU_power7") + +(define_insn_reservation "power7-mfjmpr" 5 + (and (eq_attr "type" "mfjmpr") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7+FXU_power7") + +(define_insn_reservation "power7-crlogical" 3 + (and (eq_attr "type" "cr_logical") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-delayedcr" 3 + (and (eq_attr "type" "delayed_cr") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-mfcr" 6 + (and (eq_attr "type" "mfcr") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-mfcrf" 3 + (and (eq_attr "type" "mfcrf") + (eq_attr "cpu" "power7")) + "du1_power7,cru_power7") + +(define_insn_reservation "power7-mtcr" 3 + (and (eq_attr "type" "mtcr") + (eq_attr "cpu" "power7")) + "DU4_power7,cru_power7+FXU_power7") + + +; BR Unit +; Branches take dispatch Slot 4. 
The presence_sets prevent other insn from +; grabbing previous dispatch slots once this is assigned. +(define_insn_reservation "power7-branch" 3 + (and (eq_attr "type" "jmpreg,branch") + (eq_attr "cpu" "power7")) + "(du5_power7\ + |du4_power7+du5_power7\ + |du3_power7+du4_power7+du5_power7\ + |du2_power7+du3_power7+du4_power7+du5_power7\ + |du1_power7+du2_power7+du3_power7+du4_power7+du5_power7),bpu_power7") + + +; VS Unit (includes FP/VSX/VMX/DFP) +(define_insn_reservation "power7-fp" 6 + (and (eq_attr "type" "fp,dmul") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_bypass 8 "power7-fp" "power7-branch") + +(define_insn_reservation "power7-fpcompare" 4 + (and (eq_attr "type" "fpcompare") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-sdiv" 26 + (and (eq_attr "type" "sdiv") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-ddiv" 32 + (and (eq_attr "type" "ddiv") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-sqrt" 31 + (and (eq_attr "type" "ssqrt") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-dsqrt" 43 + (and (eq_attr "type" "dsqrt") + (eq_attr "cpu" "power7")) + "DU_power7,VSU_power7") + +(define_insn_reservation "power7-vecsimple" 2 + (and (eq_attr "type" "vecsimple") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_insn_reservation "power7-veccmp" 7 + (and (eq_attr "type" "veccmp") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_insn_reservation "power7-vecfloat" 7 + (and (eq_attr "type" "vecfloat") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_bypass 6 "power7-vecfloat" "power7-vecfloat") + +(define_insn_reservation "power7-veccomplex" 7 + (and (eq_attr "type" "veccomplex") + (eq_attr "cpu" "power7")) + "du1_power7,VSU_power7") + +(define_insn_reservation "power7-vecperm" 3 + (and (eq_attr "type" "vecperm") + (eq_attr 
"cpu" "power7")) + "du2_power7,VSU_power7") --- gcc/config/rs6000/rs6000-c.c (.../trunk) (revision 145777) +++ gcc/config/rs6000/rs6000-c.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -106,14 +106,17 @@ altivec_categorize_keyword (const cpp_to if (ident == C_CPP_HASHNODE (vector_keyword)) return C_CPP_HASHNODE (__vector_keyword); - if (ident == C_CPP_HASHNODE (pixel_keyword)) - return C_CPP_HASHNODE (__pixel_keyword); + if (TARGET_ALTIVEC) + { + if (ident == C_CPP_HASHNODE (pixel_keyword)) + return C_CPP_HASHNODE (__pixel_keyword); - if (ident == C_CPP_HASHNODE (bool_keyword)) - return C_CPP_HASHNODE (__bool_keyword); + if (ident == C_CPP_HASHNODE (bool_keyword)) + return C_CPP_HASHNODE (__bool_keyword); - if (ident == C_CPP_HASHNODE (_Bool_keyword)) - return C_CPP_HASHNODE (__bool_keyword); + if (ident == C_CPP_HASHNODE (_Bool_keyword)) + return C_CPP_HASHNODE (__bool_keyword); + } return ident; } @@ -131,23 +134,26 @@ init_vector_keywords (void) __vector_keyword = get_identifier ("__vector"); C_CPP_HASHNODE (__vector_keyword)->flags |= NODE_CONDITIONAL; - __pixel_keyword = get_identifier ("__pixel"); - C_CPP_HASHNODE (__pixel_keyword)->flags |= NODE_CONDITIONAL; - - __bool_keyword = get_identifier ("__bool"); - C_CPP_HASHNODE (__bool_keyword)->flags |= NODE_CONDITIONAL; - vector_keyword = get_identifier ("vector"); C_CPP_HASHNODE (vector_keyword)->flags |= NODE_CONDITIONAL; - pixel_keyword = get_identifier ("pixel"); - C_CPP_HASHNODE (pixel_keyword)->flags |= NODE_CONDITIONAL; + if (TARGET_ALTIVEC) + { + __pixel_keyword = get_identifier ("__pixel"); + C_CPP_HASHNODE (__pixel_keyword)->flags |= NODE_CONDITIONAL; + + __bool_keyword = get_identifier ("__bool"); + C_CPP_HASHNODE (__bool_keyword)->flags |= NODE_CONDITIONAL; - bool_keyword = get_identifier ("bool"); - C_CPP_HASHNODE (bool_keyword)->flags |= NODE_CONDITIONAL; + pixel_keyword = get_identifier ("pixel"); + C_CPP_HASHNODE (pixel_keyword)->flags |= NODE_CONDITIONAL; - _Bool_keyword = 
get_identifier ("_Bool"); - C_CPP_HASHNODE (_Bool_keyword)->flags |= NODE_CONDITIONAL; + bool_keyword = get_identifier ("bool"); + C_CPP_HASHNODE (bool_keyword)->flags |= NODE_CONDITIONAL; + + _Bool_keyword = get_identifier ("_Bool"); + C_CPP_HASHNODE (_Bool_keyword)->flags |= NODE_CONDITIONAL; + } } /* Called to decide whether a conditional macro should be expanded. @@ -214,7 +220,8 @@ rs6000_macro_to_expand (cpp_reader *pfil if (rid_code == RID_UNSIGNED || rid_code == RID_LONG || rid_code == RID_SHORT || rid_code == RID_SIGNED || rid_code == RID_INT || rid_code == RID_CHAR - || rid_code == RID_FLOAT) + || rid_code == RID_FLOAT + || (rid_code == RID_DOUBLE && TARGET_VSX)) { expand_this = C_CPP_HASHNODE (__vector_keyword); /* If the next keyword is bool or pixel, it @@ -284,13 +291,14 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi builtin_define ("_ARCH_PWR6X"); if (! TARGET_POWER && ! TARGET_POWER2 && ! TARGET_POWERPC) builtin_define ("_ARCH_COM"); + if (TARGET_POPCNTD) + builtin_define ("_ARCH_PWR7"); if (TARGET_ALTIVEC) { builtin_define ("__ALTIVEC__"); builtin_define ("__VEC__=10206"); /* Define the AltiVec syntactic elements. */ - builtin_define ("__vector=__attribute__((altivec(vector__)))"); builtin_define ("__pixel=__attribute__((altivec(pixel__))) unsigned short"); builtin_define ("__bool=__attribute__((altivec(bool__))) unsigned"); @@ -298,11 +306,20 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi { /* Define this when supporting context-sensitive keywords. */ builtin_define ("__APPLE_ALTIVEC__"); - - builtin_define ("vector=vector"); + builtin_define ("pixel=pixel"); builtin_define ("bool=bool"); builtin_define ("_Bool=_Bool"); + } + } + if (TARGET_ALTIVEC || TARGET_VSX) + { + /* Define the AltiVec/VSX syntactic elements. */ + builtin_define ("__vector=__attribute__((altivec(vector__)))"); + + if (!flag_iso) + { + builtin_define ("vector=vector"); init_vector_keywords (); /* Enable context-sensitive macros. 
*/ @@ -326,6 +343,8 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi /* Used by libstdc++. */ if (TARGET_NO_LWSYNC) builtin_define ("__NO_LWSYNC__"); + if (TARGET_VSX) + builtin_define ("__VSX__"); /* May be overridden by target configuration. */ RS6000_CPU_CPP_ENDIAN_BUILTINS(); @@ -504,6 +523,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_ADD, VSX_BUILTIN_XVADDDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VADDFP, ALTIVEC_BUILTIN_VADDFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VADDUWM, ALTIVEC_BUILTIN_VADDUWM, @@ -647,6 +668,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, + RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VAND, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -695,6 +722,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, + 
RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_ANDC, ALTIVEC_BUILTIN_VANDC, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -1198,6 +1231,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_MAX, ALTIVEC_BUILTIN_VMAXFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_MAX, VSX_BUILTIN_XVMAXDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VMAXFP, ALTIVEC_BUILTIN_VMAXFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VMAXSW, ALTIVEC_BUILTIN_VMAXSW, @@ -1374,6 +1409,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_MIN, ALTIVEC_BUILTIN_VMINFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_MIN, VSX_BUILTIN_XVMINDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VMINFP, ALTIVEC_BUILTIN_VMINFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VMINSW, ALTIVEC_BUILTIN_VMINSW, @@ -1459,6 +1496,8 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, @@ -1483,6 +1522,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { 
ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, + RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_OR, ALTIVEC_BUILTIN_VOR, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -1940,6 +1985,8 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_SUB, ALTIVEC_BUILTIN_VSUBFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, + { ALTIVEC_BUILTIN_VEC_SUB, VSX_BUILTIN_XVSUBDP, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_VSUBFP, ALTIVEC_BUILTIN_VSUBFP, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_VSUBUWM, ALTIVEC_BUILTIN_VSUBUWM, @@ -2099,6 +2146,12 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_V4SF, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, + RS6000_BTI_V2DF, RS6000_BTI_bool_V4SI, RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 }, @@ -2989,8 +3042,10 @@ altivec_resolve_overloaded_builtin (tree const struct altivec_builtin_types *desc; int n; - if (fcode < ALTIVEC_BUILTIN_OVERLOADED_FIRST - || fcode > 
ALTIVEC_BUILTIN_OVERLOADED_LAST) + if ((fcode < ALTIVEC_BUILTIN_OVERLOADED_FIRST + || fcode > ALTIVEC_BUILTIN_OVERLOADED_LAST) + && (fcode < VSX_BUILTIN_OVERLOADED_FIRST + || fcode > VSX_BUILTIN_OVERLOADED_LAST)) return NULL_TREE; /* For now treat vec_splats and vec_promote as the same. */ --- gcc/config/rs6000/rs6000.opt (.../trunk) (revision 145777) +++ gcc/config/rs6000/rs6000.opt (.../branches/ibm/power7-meissner) (revision 146027) @@ -111,24 +111,60 @@ mhard-float Target Report RejectNegative InverseMask(SOFT_FLOAT, HARD_FLOAT) Use hardware floating point -mno-update -Target Report RejectNegative Mask(NO_UPDATE) -Do not generate load/store with update instructions +mpopcntd +Target Report Mask(POPCNTD) +Use PowerPC V2.06 popcntd instruction + +mvsx +Target Report Mask(VSX) +Use vector/scalar (VSX) instructions + +mvsx-vector-memory +Target Undocumented Report Var(TARGET_VSX_VECTOR_MEMORY) Init(-1) +; If -mvsx, use VSX vector load/store instructions instead of Altivec instructions + +mvsx-vector-float +Target Undocumented Report Var(TARGET_VSX_VECTOR_FLOAT) Init(-1) +; If -mvsx, use VSX arithmetic instructions for float vectors (on by default) + +mvsx-vector-double +Target Undocumented Report Var(TARGET_VSX_VECTOR_DOUBLE) Init(-1) +; If -mvsx, use VSX arithmetic instructions for double vectors (on by default) + +mvsx-scalar-double +Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1) +; If -mvsx, use VSX arithmetic instructions for scalar double (on by default) + +mvsx-scalar-memory +Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY) +; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default) + +mvsx-v4sf-altivec-regs +Target Undocumented Report Var(TARGET_V4SF_ALTIVEC_REGS) Init(-1) +; If -mvsx, prefer V4SF types to use Altivec regs and not the floating registers + +mreload-functions +Target Undocumented Report Var(TARGET_RELOAD_FUNCTIONS) Init(-1) +; If -mvsx or -maltivec, enable reload functions + 
+mpower7-adjust-cost +Target Undocumented Var(TARGET_POWER7_ADJUST_COST) +; Add extra cost for setting CR registers before a branch, as is done for Power5 + +mdisallow-float-in-lr-ctr +Target Undocumented Var(TARGET_DISALLOW_FLOAT_IN_LR_CTR) Init(-1) +; Disallow floating point in LR or CTR; disallowing it causes some reload bugs mupdate -Target Report RejectNegative InverseMask(NO_UPDATE, UPDATE) +Target Report Var(TARGET_UPDATE) Init(1) Generate load/store with update instructions mavoid-indexed-addresses Target Report Var(TARGET_AVOID_XFORM) Init(-1) Avoid generation of indexed load/store instructions when possible -mno-fused-madd -Target Report RejectNegative Mask(NO_FUSED_MADD) -Do not generate fused multiply/add instructions - mfused-madd -Target Report RejectNegative InverseMask(NO_FUSED_MADD, FUSED_MADD) +Target Report Var(TARGET_FUSED_MADD) Init(1) Generate fused multiply/add instructions msched-prolog @@ -194,7 +230,7 @@ Target RejectNegative Joined -mvrsave=yes/no Deprecated option. Use -mvrsave/-mno-vrsave instead misel -Target +Target Report Mask(ISEL) Generate isel instructions misel= --- gcc/config/rs6000/linux64.h (.../trunk) (revision 145777) +++ gcc/config/rs6000/linux64.h (.../branches/ibm/power7-meissner) (revision 146027) @@ -114,7 +114,7 @@ extern int dot_symbols; error (INVALID_32BIT, "32"); \ if (TARGET_PROFILE_KERNEL) \ { \ - target_flags &= ~MASK_PROFILE_KERNEL; \ + SET_PROFILE_KERNEL (0); \ error (INVALID_32BIT, "profile-kernel"); \ } \ } \ --- gcc/config/rs6000/rs6000.c (.../trunk) (revision 145777) +++ gcc/config/rs6000/rs6000.c (.../branches/ibm/power7-meissner) (revision 146027) @@ -178,9 +178,6 @@ int rs6000_spe; /* Nonzero if we want SPE ABI extensions. */ int rs6000_spe_abi; -/* Nonzero to use isel instructions. */ -int rs6000_isel; - /* Nonzero if floating point operations are done in the GPRs.
*/ int rs6000_float_gprs = 0; @@ -227,12 +224,26 @@ int dot_symbols; const char *rs6000_debug_name; int rs6000_debug_stack; /* debug stack applications */ int rs6000_debug_arg; /* debug argument handling */ +int rs6000_debug_reg; /* debug register classes */ +int rs6000_debug_addr; /* debug memory addressing */ +int rs6000_debug_cost; /* debug rtx_costs */ /* Value is TRUE if register/mode pair is acceptable. */ bool rs6000_hard_regno_mode_ok_p[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER]; -/* Built in types. */ +/* Maximum number of registers needed for a given register class and mode. */ +unsigned char rs6000_class_max_nregs[NUM_MACHINE_MODES][LIM_REG_CLASSES]; + +/* How many registers are needed for a given register and mode. */ +unsigned char rs6000_hard_regno_nregs[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER]; + +/* Map register number to register class. */ +enum reg_class rs6000_regno_regclass[FIRST_PSEUDO_REGISTER]; +/* Reload functions based on the type and the vector unit. */ +static enum insn_code rs6000_vector_reload[NUM_MACHINE_MODES][2]; + +/* Built in types. */ tree rs6000_builtin_types[RS6000_BTI_MAX]; tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT]; @@ -270,7 +281,6 @@ struct { bool altivec_abi; /* True if -mabi=altivec/no-altivec used. */ bool spe; /* True if -mspe= was used. */ bool float_gprs; /* True if -mfloat-gprs= was used. */ - bool isel; /* True if -misel was used. */ bool long_double; /* True if -mlong-double- was used. */ bool ieee; /* True if -mabi=ieee/ibmlongdouble used. */ bool vrsave; /* True if -mvrsave was used. */ @@ -286,6 +296,18 @@ struct builtin_description const char *const name; const enum rs6000_builtins code; }; + +/* Describe the vector unit used for modes. */ +enum rs6000_vector rs6000_vector_unit[NUM_MACHINE_MODES]; +enum rs6000_vector rs6000_vector_mem[NUM_MACHINE_MODES]; +enum reg_class rs6000_vector_reg_class[NUM_MACHINE_MODES]; + +/* Describe the alignment of a vector. 
*/ +int rs6000_vector_align[NUM_MACHINE_MODES]; + +/* Describe the register classes used by VSX instructions. */ +enum reg_class rs6000_vsx_reg_class = NO_REGS; + /* Target cpu costs. */ @@ -749,6 +771,25 @@ struct processor_costs power6_cost = { 16, /* prefetch streams */ }; +/* Instruction costs on POWER7 processors. */ +static const +struct processor_costs power7_cost = { + COSTS_N_INSNS (2), /* mulsi */ + COSTS_N_INSNS (2), /* mulsi_const */ + COSTS_N_INSNS (2), /* mulsi_const9 */ + COSTS_N_INSNS (2), /* muldi */ + COSTS_N_INSNS (18), /* divsi */ + COSTS_N_INSNS (34), /* divdi */ + COSTS_N_INSNS (3), /* fp */ + COSTS_N_INSNS (3), /* dmul */ + COSTS_N_INSNS (13), /* sdiv */ + COSTS_N_INSNS (16), /* ddiv */ + 128, /* cache line size */ + 32, /* l1 cache */ + 256, /* l2 cache */ + 12, /* prefetch streams */ +}; + static bool rs6000_function_ok_for_sibcall (tree, tree); static const char *rs6000_invalid_within_doloop (const_rtx); @@ -827,7 +868,10 @@ static void rs6000_xcoff_file_end (void) #endif static int rs6000_variable_issue (FILE *, int, rtx, int); static bool rs6000_rtx_costs (rtx, int, int, int *, bool); +static bool rs6000_debug_rtx_costs (rtx, int, int, int *, bool); +static int rs6000_debug_address_cost (rtx, bool); static int rs6000_adjust_cost (rtx, rtx, rtx, int); +static int rs6000_debug_adjust_cost (rtx, rtx, rtx, int); static void rs6000_sched_init (FILE *, int, int); static bool is_microcoded_insn (rtx); static bool is_nonpipeline_insn (rtx); @@ -906,6 +950,7 @@ static rtx altivec_expand_stv_builtin (e static rtx altivec_expand_vec_init_builtin (tree, tree, rtx); static rtx altivec_expand_vec_set_builtin (tree); static rtx altivec_expand_vec_ext_builtin (tree, rtx); +static rtx vsx_expand_builtin (tree, rtx, bool *); static int get_element_number (tree, tree); static bool rs6000_handle_option (size_t, const char *, int); static void rs6000_parse_tls_size_option (void); @@ -963,14 +1008,16 @@ static tree rs6000_gimplify_va_arg (tree static bool 
rs6000_must_pass_in_stack (enum machine_mode, const_tree); static bool rs6000_scalar_mode_supported_p (enum machine_mode); static bool rs6000_vector_mode_supported_p (enum machine_mode); -static int get_vec_cmp_insn (enum rtx_code, enum machine_mode, - enum machine_mode); +static rtx rs6000_emit_vector_compare_vsx (enum rtx_code, rtx, rtx, rtx); +static rtx rs6000_emit_vector_compare_altivec (enum rtx_code, rtx, rtx, rtx); static rtx rs6000_emit_vector_compare (enum rtx_code, rtx, rtx, enum machine_mode); -static int get_vsel_insn (enum machine_mode); -static void rs6000_emit_vector_select (rtx, rtx, rtx, rtx); static tree rs6000_stack_protect_fail (void); +static enum reg_class rs6000_secondary_reload (bool, rtx, enum reg_class, + enum machine_mode, + struct secondary_reload_info *); + const int INSN_NOT_AVAILABLE = -1; static enum machine_mode rs6000_eh_return_filter_mode (void); @@ -1045,6 +1092,9 @@ static const char alt_reg_names[][8] = #endif #ifndef TARGET_PROFILE_KERNEL #define TARGET_PROFILE_KERNEL 0 +#define SET_PROFILE_KERNEL(N) +#else +#define SET_PROFILE_KERNEL(N) TARGET_PROFILE_KERNEL = (N) #endif /* The VRSAVE bitmask puts bit %v0 as the most significant bit. */ @@ -1297,30 +1347,101 @@ static const char alt_reg_names[][8] = #undef TARGET_INSTANTIATE_DECLS #define TARGET_INSTANTIATE_DECLS rs6000_instantiate_decls +#undef TARGET_SECONDARY_RELOAD +#define TARGET_SECONDARY_RELOAD rs6000_secondary_reload + struct gcc_target targetm = TARGET_INITIALIZER; +/* Return number of consecutive hard regs needed starting at reg REGNO + to hold something of mode MODE. + This is ordinarily the length in words of a value of mode MODE + but can be less for certain modes in special long registers. + + For the SPE, GPRs are 64 bits but only 32 bits are visible in + scalar instructions. The upper 32 bits are only available to the + SIMD instructions. + + POWER and PowerPC GPRs hold 32 bits worth; + PowerPC64 GPRs and FPRs point register holds 64 bits worth. 
*/ + +static int +rs6000_hard_regno_nregs_internal (int regno, enum machine_mode mode) +{ + unsigned HOST_WIDE_INT reg_size; + + if (FP_REGNO_P (regno)) + reg_size = (VECTOR_UNIT_VSX_P (mode) + ? UNITS_PER_VSX_WORD + : UNITS_PER_FP_WORD); + + else if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) + reg_size = UNITS_PER_SPE_WORD; + + else if (ALTIVEC_REGNO_P (regno)) + reg_size = UNITS_PER_ALTIVEC_WORD; + + /* The value returned for SCmode in the E500 double case is 2 for + ABI compatibility; storing an SCmode value in a single register + would require function_arg and rs6000_spe_function_arg to handle + SCmode so as to pass the value correctly in a pair of + registers. */ + else if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode + && !DECIMAL_FLOAT_MODE_P (mode)) + reg_size = UNITS_PER_FP_WORD; + + else + reg_size = UNITS_PER_WORD; + + return (GET_MODE_SIZE (mode) + reg_size - 1) / reg_size; +} /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE. */ static int rs6000_hard_regno_mode_ok (int regno, enum machine_mode mode) { + int last_regno = regno + rs6000_hard_regno_nregs[mode][regno] - 1; + + /* VSX registers that overlap the FPR registers are larger than for non-VSX + implementations. Don't allow an item to be split between a FP register + and an Altivec register. */ + if (VECTOR_UNIT_VSX_P (mode) || VECTOR_MEM_VSX_P (mode)) + { + enum reg_class rclass = rs6000_vector_reg_class[mode]; + if (FP_REGNO_P (regno)) + return ((rclass == FLOAT_REGS || rclass == VSX_REGS) + && FP_REGNO_P (last_regno)); + + if (ALTIVEC_REGNO_P (regno)) + return ((rclass == ALTIVEC_REGS || rclass == VSX_REGS) + && ALTIVEC_REGNO_P (last_regno)); + } + /* The GPRs can hold any mode, but values bigger than one register cannot go past R31. */ if (INT_REGNO_P (regno)) - return INT_REGNO_P (regno + HARD_REGNO_NREGS (regno, mode) - 1); + return INT_REGNO_P (last_regno); - /* The float registers can only hold floating modes and DImode. 
- This excludes the 32-bit decimal float mode for now. */ + /* The float registers (except for VSX vector modes) can only hold floating + modes and DImode. This excludes the 32-bit decimal float mode for + now. */ if (FP_REGNO_P (regno)) - return - ((SCALAR_FLOAT_MODE_P (mode) - && (mode != TDmode || (regno % 2) == 0) - && FP_REGNO_P (regno + HARD_REGNO_NREGS (regno, mode) - 1)) - || (GET_MODE_CLASS (mode) == MODE_INT + { + if (SCALAR_FLOAT_MODE_P (mode) + && (mode != TDmode || (regno % 2) == 0) + && FP_REGNO_P (last_regno)) + return 1; + + if (GET_MODE_CLASS (mode) == MODE_INT && GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD) - || (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT - && PAIRED_VECTOR_MODE (mode))); + return 1; + + if (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT + && PAIRED_VECTOR_MODE (mode)) + return 1; + + return 0; + } /* The CR register can only hold CC modes. */ if (CR_REGNO_P (regno)) @@ -1331,28 +1452,389 @@ rs6000_hard_regno_mode_ok (int regno, en /* AltiVec only in AldyVec registers. */ if (ALTIVEC_REGNO_P (regno)) - return ALTIVEC_VECTOR_MODE (mode); + return VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode); /* ...but GPRs can hold SIMD data on the SPE in one register. */ if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) return 1; - /* We cannot put TImode anywhere except general register and it must be - able to fit within the register set. */ + /* Don't allow anything but word sized integers (aka pointers) in CTR/LR. + You really don't want to spill your floating point values to those + registers. Also do it for the old MQ register in the power. + + While this is desirable in theory, disabling float to go in LR/CTR does + cause some regressions, so until they are taken care of, revert to the old + behavior by default for most power systems, but enable it for power7. 
*/ + if ((TARGET_DISALLOW_FLOAT_IN_LR_CTR > 0 + || (TARGET_DISALLOW_FLOAT_IN_LR_CTR < 0 && TARGET_VSX)) + && (regno == CTR_REGNO || regno == LR_REGNO || regno == MQ_REGNO)) + return (GET_MODE_CLASS (mode) == MODE_INT + && GET_MODE_SIZE (mode) <= UNITS_PER_WORD); + + /* The VRSAVE/VSCR registers are 32-bits (they are fixed, but add this for + completeness). */ + if (regno == VRSAVE_REGNO || regno == VSCR_REGNO) + return (mode == SImode); + + /* We cannot put TImode anywhere except general register and it must be able + to fit within the register set. In the future, allow TImode in the + Altivec or VSX registers. */ return GET_MODE_SIZE (mode) <= UNITS_PER_WORD; } -/* Initialize rs6000_hard_regno_mode_ok_p table. */ +/* Print interesting facts about registers. */ static void -rs6000_init_hard_regno_mode_ok (void) +rs6000_debug_reg_print (int first_regno, int last_regno, const char *reg_name) { int r, m; + for (r = first_regno; r <= last_regno; ++r) + { + const char *comma = ""; + int len; + + if (first_regno == last_regno) + fprintf (stderr, "%s:\t", reg_name); + else + fprintf (stderr, "%s%d:\t", reg_name, r - first_regno); + + len = 8; + for (m = 0; m < NUM_MACHINE_MODES; ++m) + if (rs6000_hard_regno_mode_ok_p[m][r] && rs6000_hard_regno_nregs[m][r]) + { + if (len > 70) + { + fprintf (stderr, ",\n\t"); + len = 8; + comma = ""; + } + + if (rs6000_hard_regno_nregs[m][r] > 1) + len += fprintf (stderr, "%s%s/%d", comma, GET_MODE_NAME (m), + rs6000_hard_regno_nregs[m][r]); + else + len += fprintf (stderr, "%s%s", comma, GET_MODE_NAME (m)); + + comma = ", "; + } + + if (call_used_regs[r]) + { + if (len > 70) + { + fprintf (stderr, ",\n\t"); + len = 8; + comma = ""; + } + + len += fprintf (stderr, "%s%s", comma, "call-used"); + comma = ", "; + } + + if (fixed_regs[r]) + { + if (len > 70) + { + fprintf (stderr, ",\n\t"); + len = 8; + comma = ""; + } + + len += fprintf (stderr, "%s%s", comma, "fixed"); + comma = ", "; + } + + if (len > 70) + { + fprintf (stderr, ",\n\t"); + 
comma = ""; + } + + fprintf (stderr, "%sregno = %d\n", comma, r); + } +} + +/* Map enum rs6000_vector to string. */ +static const char * +rs6000_debug_vector_unit[] = { + "none", + "altivec", + "vsx", + "paired", + "spe", + "other" +}; + +/* Initialize the various global tables that are based on register size. */ +static void +rs6000_init_hard_regno_mode_ok (void) +{ + int r, m, c; + enum reg_class vsx_rc = (TARGET_ALTIVEC ? VSX_REGS : FLOAT_REGS); + bool float_p = (TARGET_HARD_FLOAT && TARGET_FPRS); + + /* Precalculate REGNO_REG_CLASS. */ + rs6000_regno_regclass[0] = GENERAL_REGS; + for (r = 1; r < 32; ++r) + rs6000_regno_regclass[r] = BASE_REGS; + + for (r = 32; r < 64; ++r) + rs6000_regno_regclass[r] = FLOAT_REGS; + + for (r = 64; r < FIRST_PSEUDO_REGISTER; ++r) + rs6000_regno_regclass[r] = NO_REGS; + + for (r = FIRST_ALTIVEC_REGNO; r <= LAST_ALTIVEC_REGNO; ++r) + rs6000_regno_regclass[r] = ALTIVEC_REGS; + + rs6000_regno_regclass[CR0_REGNO] = CR0_REGS; + for (r = CR1_REGNO; r <= CR7_REGNO; ++r) + rs6000_regno_regclass[r] = CR_REGS; + + rs6000_regno_regclass[MQ_REGNO] = MQ_REGS; + rs6000_regno_regclass[LR_REGNO] = LINK_REGS; + rs6000_regno_regclass[CTR_REGNO] = CTR_REGS; + rs6000_regno_regclass[XER_REGNO] = XER_REGS; + rs6000_regno_regclass[VRSAVE_REGNO] = VRSAVE_REGS; + rs6000_regno_regclass[VSCR_REGNO] = VRSAVE_REGS; + rs6000_regno_regclass[SPE_ACC_REGNO] = SPE_ACC_REGS; + rs6000_regno_regclass[SPEFSCR_REGNO] = SPEFSCR_REGS; + rs6000_regno_regclass[ARG_POINTER_REGNUM] = BASE_REGS; + rs6000_regno_regclass[FRAME_POINTER_REGNUM] = BASE_REGS; + + /* Precalculate vector information, this must be set up before the + rs6000_hard_regno_nregs_internal below. */ + for (m = 0; m < NUM_MACHINE_MODES; ++m) + { + rs6000_vector_unit[m] = rs6000_vector_mem[m] = VECTOR_NONE; + rs6000_vector_reg_class[m] = NO_REGS; + rs6000_vector_reload[m][0] = CODE_FOR_nothing; + rs6000_vector_reload[m][1] = CODE_FOR_nothing; + } + + /* TODO, add TI/V2DI mode for moving data if Altivec or VSX. 
*/ + + /* V2DF mode, VSX only. */ + if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_DOUBLE) + { + rs6000_vector_unit[V2DFmode] = VECTOR_VSX; + rs6000_vector_mem[V2DFmode] = VECTOR_VSX; + rs6000_vector_align[V2DFmode] = 64; + } + + /* V4SF mode, either VSX or Altivec. */ + if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_FLOAT) + { + rs6000_vector_unit[V4SFmode] = VECTOR_VSX; + if (TARGET_VSX_VECTOR_MEMORY || !TARGET_ALTIVEC) + { + rs6000_vector_align[V4SFmode] = 32; + rs6000_vector_mem[V4SFmode] = VECTOR_VSX; + } else { + rs6000_vector_align[V4SFmode] = 128; + rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC; + } + } + else if (float_p && TARGET_ALTIVEC) + { + rs6000_vector_unit[V4SFmode] = VECTOR_ALTIVEC; + rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC; + rs6000_vector_align[V4SFmode] = 128; + } + + /* V16QImode, V8HImode, V4SImode are Altivec only, but possibly do VSX loads + and stores. */ + if (TARGET_ALTIVEC) + { + rs6000_vector_unit[V4SImode] = VECTOR_ALTIVEC; + rs6000_vector_unit[V8HImode] = VECTOR_ALTIVEC; + rs6000_vector_unit[V16QImode] = VECTOR_ALTIVEC; + + rs6000_vector_reg_class[V16QImode] = ALTIVEC_REGS; + rs6000_vector_reg_class[V8HImode] = ALTIVEC_REGS; + rs6000_vector_reg_class[V4SImode] = ALTIVEC_REGS; + + if (TARGET_VSX && TARGET_VSX_VECTOR_MEMORY) + { + rs6000_vector_mem[V4SImode] = VECTOR_VSX; + rs6000_vector_mem[V8HImode] = VECTOR_VSX; + rs6000_vector_mem[V16QImode] = VECTOR_VSX; + rs6000_vector_align[V4SImode] = 32; + rs6000_vector_align[V8HImode] = 32; + rs6000_vector_align[V16QImode] = 32; + } + else + { + rs6000_vector_mem[V4SImode] = VECTOR_ALTIVEC; + rs6000_vector_mem[V8HImode] = VECTOR_ALTIVEC; + rs6000_vector_mem[V16QImode] = VECTOR_ALTIVEC; + rs6000_vector_align[V4SImode] = 128; + rs6000_vector_align[V8HImode] = 128; + rs6000_vector_align[V16QImode] = 128; + } + } + + /* DFmode, see if we want to use the VSX unit. 
*/ + if (float_p && TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE) + { + rs6000_vector_unit[DFmode] = VECTOR_VSX; + rs6000_vector_align[DFmode] = 64; + rs6000_vector_mem[DFmode] + = (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE); + } + + /* TODO, add SPE and paired floating point vector support. */ + + /* Set the VSX register classes. */ + + /* For V4SF, prefer the Altivec registers, because there are a few operations + that want to use Altivec operations instead of VSX. */ + rs6000_vector_reg_class[V4SFmode] + = ((VECTOR_UNIT_VSX_P (V4SFmode) + && VECTOR_MEM_VSX_P (V4SFmode) + && !TARGET_V4SF_ALTIVEC_REGS) + ? vsx_rc + : (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode) + ? ALTIVEC_REGS + : NO_REGS)); + + rs6000_vector_reg_class[V2DFmode] + = (VECTOR_UNIT_VSX_P (V2DFmode) ? vsx_rc : NO_REGS); + + rs6000_vector_reg_class[DFmode] + = ((!float_p || !VECTOR_UNIT_VSX_P (DFmode)) + ? NO_REGS + : ((TARGET_VSX_SCALAR_MEMORY) + ? vsx_rc + : FLOAT_REGS)); + + rs6000_vsx_reg_class = (float_p && TARGET_VSX) ? vsx_rc : NO_REGS; + + /* Set up the reload helper functions. 
*/ + if (TARGET_RELOAD_FUNCTIONS && (TARGET_VSX || TARGET_ALTIVEC)) + { + if (TARGET_64BIT) + { + rs6000_vector_reload[V16QImode][0] = CODE_FOR_reload_v16qi_di_store; + rs6000_vector_reload[V16QImode][1] = CODE_FOR_reload_v16qi_di_load; + rs6000_vector_reload[V8HImode][0] = CODE_FOR_reload_v8hi_di_store; + rs6000_vector_reload[V8HImode][1] = CODE_FOR_reload_v8hi_di_load; + rs6000_vector_reload[V4SImode][0] = CODE_FOR_reload_v4si_di_store; + rs6000_vector_reload[V4SImode][1] = CODE_FOR_reload_v4si_di_load; + rs6000_vector_reload[V2DImode][0] = CODE_FOR_reload_v2di_di_store; + rs6000_vector_reload[V2DImode][1] = CODE_FOR_reload_v2di_di_load; + rs6000_vector_reload[V4SFmode][0] = CODE_FOR_reload_v4sf_di_store; + rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_di_load; + rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_di_store; + rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_di_load; + } + else + { + rs6000_vector_reload[V16QImode][0] = CODE_FOR_reload_v16qi_si_store; + rs6000_vector_reload[V16QImode][1] = CODE_FOR_reload_v16qi_si_load; + rs6000_vector_reload[V8HImode][0] = CODE_FOR_reload_v8hi_si_store; + rs6000_vector_reload[V8HImode][1] = CODE_FOR_reload_v8hi_si_load; + rs6000_vector_reload[V4SImode][0] = CODE_FOR_reload_v4si_si_store; + rs6000_vector_reload[V4SImode][1] = CODE_FOR_reload_v4si_si_load; + rs6000_vector_reload[V2DImode][0] = CODE_FOR_reload_v2di_si_store; + rs6000_vector_reload[V2DImode][1] = CODE_FOR_reload_v2di_si_load; + rs6000_vector_reload[V4SFmode][0] = CODE_FOR_reload_v4sf_si_store; + rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_si_load; + rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_si_store; + rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_si_load; + } + } + + /* Precalculate HARD_REGNO_NREGS. 
*/ + for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r) + for (m = 0; m < NUM_MACHINE_MODES; ++m) + rs6000_hard_regno_nregs[m][r] = rs6000_hard_regno_nregs_internal (r, m); + + /* Precalculate HARD_REGNO_MODE_OK. */ for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r) for (m = 0; m < NUM_MACHINE_MODES; ++m) if (rs6000_hard_regno_mode_ok (r, m)) rs6000_hard_regno_mode_ok_p[m][r] = true; + + /* Precalculate CLASS_MAX_NREGS sizes. */ + for (c = 0; c < LIM_REG_CLASSES; ++c) + { + int reg_size; + + if (TARGET_VSX && VSX_REG_CLASS_P (c)) + reg_size = UNITS_PER_VSX_WORD; + + else if (c == ALTIVEC_REGS) + reg_size = UNITS_PER_ALTIVEC_WORD; + + else if (c == FLOAT_REGS) + reg_size = UNITS_PER_FP_WORD; + + else + reg_size = UNITS_PER_WORD; + + for (m = 0; m < NUM_MACHINE_MODES; ++m) + rs6000_class_max_nregs[m][c] + = (GET_MODE_SIZE (m) + reg_size - 1) / reg_size; + } + + if (TARGET_E500_DOUBLE) + rs6000_class_max_nregs[DFmode][GENERAL_REGS] = 1; + + if (TARGET_DEBUG_REG) + { + const char *nl = (const char *)0; + + fprintf (stderr, "Register information: (last virtual reg = %d)\n", + LAST_VIRTUAL_REGISTER); + rs6000_debug_reg_print (0, 31, "gr"); + rs6000_debug_reg_print (32, 63, "fp"); + rs6000_debug_reg_print (FIRST_ALTIVEC_REGNO, + LAST_ALTIVEC_REGNO, + "vs"); + rs6000_debug_reg_print (LR_REGNO, LR_REGNO, "lr"); + rs6000_debug_reg_print (CTR_REGNO, CTR_REGNO, "ctr"); + rs6000_debug_reg_print (CR0_REGNO, CR7_REGNO, "cr"); + rs6000_debug_reg_print (MQ_REGNO, MQ_REGNO, "mq"); + rs6000_debug_reg_print (XER_REGNO, XER_REGNO, "xer"); + rs6000_debug_reg_print (VRSAVE_REGNO, VRSAVE_REGNO, "vrsave"); + rs6000_debug_reg_print (VSCR_REGNO, VSCR_REGNO, "vscr"); + rs6000_debug_reg_print (SPE_ACC_REGNO, SPE_ACC_REGNO, "spe_a"); + rs6000_debug_reg_print (SPEFSCR_REGNO, SPEFSCR_REGNO, "spe_f"); + + fprintf (stderr, + "\n" + "V16QI reg_class = %s\n" + "V8HI reg_class = %s\n" + "V4SI reg_class = %s\n" + "V2DI reg_class = %s\n" + "V4SF reg_class = %s\n" + "V2DF reg_class = %s\n" + "DF reg_class = %s\n" +
"vsx reg_class = %s\n\n", + reg_class_names[rs6000_vector_reg_class[V16QImode]], + reg_class_names[rs6000_vector_reg_class[V8HImode]], + reg_class_names[rs6000_vector_reg_class[V4SImode]], + reg_class_names[rs6000_vector_reg_class[V2DImode]], + reg_class_names[rs6000_vector_reg_class[V4SFmode]], + reg_class_names[rs6000_vector_reg_class[V2DFmode]], + reg_class_names[rs6000_vector_reg_class[DFmode]], + reg_class_names[rs6000_vsx_reg_class]); + + for (m = 0; m < NUM_MACHINE_MODES; ++m) + if (rs6000_vector_unit[m] || rs6000_vector_mem[m]) + { + nl = "\n"; + fprintf (stderr, "Vector mode: %-5s arithmetic: %-8s move: %-8s\n", + GET_MODE_NAME (m), + rs6000_debug_vector_unit[ rs6000_vector_unit[m] ], + rs6000_debug_vector_unit[ rs6000_vector_mem[m] ]); + } + + if (nl) + fputs (nl, stderr); + } } #if TARGET_MACHO @@ -1482,12 +1964,15 @@ rs6000_override_options (const char *def {"801", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"821", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"823", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, - {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN}, + {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN + | MASK_ISEL}, /* 8548 has a dummy entry for now. 
*/ - {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN}, + {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN + | MASK_ISEL}, {"e300c2", PROCESSOR_PPCE300C2, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"e300c3", PROCESSOR_PPCE300C3, POWERPC_BASE_MASK}, - {"e500mc", PROCESSOR_PPCE500MC, POWERPC_BASE_MASK | MASK_PPC_GFXOPT}, + {"e500mc", PROCESSOR_PPCE500MC, POWERPC_BASE_MASK | MASK_PPC_GFXOPT + | MASK_ISEL}, {"860", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"970", PROCESSOR_POWER4, POWERPC_7400_MASK | MASK_PPC_GPOPT | MASK_MFCRF | MASK_POWERPC64}, @@ -1520,9 +2005,10 @@ rs6000_override_options (const char *def POWERPC_BASE_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_PPC_GFXOPT | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_MFPGPR}, - {"power7", PROCESSOR_POWER5, + {"power7", PROCESSOR_POWER7, POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF - | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP}, + | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD + | MASK_VSX}, /* Don't add MASK_ISEL by default */ {"powerpc", PROCESSOR_POWERPC, POWERPC_BASE_MASK}, {"powerpc64", PROCESSOR_POWERPC64, POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64}, @@ -1549,7 +2035,8 @@ rs6000_override_options (const char *def POWERPC_MASKS = (POWERPC_BASE_MASK | MASK_PPC_GPOPT | MASK_STRICT_ALIGN | MASK_PPC_GFXOPT | MASK_POWERPC64 | MASK_ALTIVEC | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_MULHW - | MASK_DLMZB | MASK_CMPB | MASK_MFPGPR | MASK_DFP) + | MASK_DLMZB | MASK_CMPB | MASK_MFPGPR | MASK_DFP + | MASK_POPCNTD | MASK_VSX | MASK_ISEL) }; set_masks = POWER_MASKS | POWERPC_MASKS | MASK_SOFT_FLOAT; @@ -1594,10 +2081,6 @@ rs6000_override_options (const char *def } } - if ((TARGET_E500 || rs6000_cpu == PROCESSOR_PPCE500MC) - && !rs6000_explicit_options.isel) - rs6000_isel = 1; - if (rs6000_cpu == PROCESSOR_PPCE300C2 || rs6000_cpu == PROCESSOR_PPCE300C3 || rs6000_cpu == PROCESSOR_PPCE500MC) { @@ 
-1642,17 +2125,61 @@ rs6000_override_options (const char *def } } + /* Add some warnings for VSX. Enable -maltivec unless the user explicitly + used -mno-altivec */ + if (TARGET_VSX) + { + const char *msg = NULL; + if (!TARGET_HARD_FLOAT || !TARGET_FPRS + || !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT) + msg = "-mvsx requires hardware floating point"; + else if (TARGET_PAIRED_FLOAT) + msg = "-mvsx and -mpaired are incompatible"; + /* The hardware will allow VSX and little endian, but until we make sure + things like vector select, etc. work don't allow VSX on little endian + systems at this point. */ + else if (!BYTES_BIG_ENDIAN) + msg = "-mvsx used with little endian code"; + else if (TARGET_AVOID_XFORM > 0) + msg = "-mvsx needs indexed addressing"; + + if (msg) + { + warning (0, msg); + target_flags &= ~MASK_VSX; + } + else if (!TARGET_ALTIVEC && (target_flags_explicit & MASK_ALTIVEC) == 0) + target_flags |= MASK_ALTIVEC; + } + /* Set debug flags */ if (rs6000_debug_name) { if (! strcmp (rs6000_debug_name, "all")) - rs6000_debug_stack = rs6000_debug_arg = 1; + rs6000_debug_stack = rs6000_debug_arg = rs6000_debug_reg + = rs6000_debug_addr = rs6000_debug_cost = 1; else if (! strcmp (rs6000_debug_name, "stack")) rs6000_debug_stack = 1; else if (! strcmp (rs6000_debug_name, "arg")) rs6000_debug_arg = 1; + else if (! strcmp (rs6000_debug_name, "reg")) + rs6000_debug_reg = 1; + else if (! strcmp (rs6000_debug_name, "addr")) + rs6000_debug_addr = 1; + else if (! strcmp (rs6000_debug_name, "cost")) + rs6000_debug_cost = 1; else error ("unknown -mdebug-%s switch", rs6000_debug_name); + + /* If -mdebug=cost or -mdebug=all, replace the cost target hooks with + debug versions that call the real version and then print debugging + information.
*/ + if (TARGET_DEBUG_COST) + { + targetm.rtx_costs = rs6000_debug_rtx_costs; + targetm.address_cost = rs6000_debug_address_cost; + targetm.sched.adjust_cost = rs6000_debug_adjust_cost; + } } if (rs6000_traceback_name) @@ -1741,8 +2268,8 @@ rs6000_override_options (const char *def rs6000_spe = 0; if (!rs6000_explicit_options.float_gprs) rs6000_float_gprs = 0; - if (!rs6000_explicit_options.isel) - rs6000_isel = 0; + if (!(target_flags_explicit & MASK_ISEL)) + target_flags &= ~MASK_ISEL; } /* Detect invalid option combinations with E500. */ @@ -1751,12 +2278,14 @@ rs6000_override_options (const char *def rs6000_always_hint = (rs6000_cpu != PROCESSOR_POWER4 && rs6000_cpu != PROCESSOR_POWER5 && rs6000_cpu != PROCESSOR_POWER6 + && rs6000_cpu != PROCESSOR_POWER7 && rs6000_cpu != PROCESSOR_CELL); rs6000_sched_groups = (rs6000_cpu == PROCESSOR_POWER4 || rs6000_cpu == PROCESSOR_POWER5); rs6000_align_branch_targets = (rs6000_cpu == PROCESSOR_POWER4 || rs6000_cpu == PROCESSOR_POWER5 - || rs6000_cpu == PROCESSOR_POWER6); + || rs6000_cpu == PROCESSOR_POWER6 + || rs6000_cpu == PROCESSOR_POWER7); rs6000_sched_restricted_insns_priority = (rs6000_sched_groups ? 1 : 0); @@ -1951,6 +2480,10 @@ rs6000_override_options (const char *def rs6000_cost = &power6_cost; break; + case PROCESSOR_POWER7: + rs6000_cost = &power7_cost; + break; + default: gcc_unreachable (); } @@ -2001,7 +2534,7 @@ rs6000_override_options (const char *def static tree rs6000_builtin_mask_for_load (void) { - if (TARGET_ALTIVEC) + if (TARGET_ALTIVEC && !TARGET_VSX) return altivec_builtin_mask_for_load; else return 0; @@ -2015,18 +2548,27 @@ rs6000_builtin_mask_for_load (void) static tree rs6000_builtin_conversion (enum tree_code code, tree type) { - if (!TARGET_ALTIVEC) - return NULL_TREE; - switch (code) { case FIX_TRUNC_EXPR: switch (TYPE_MODE (type)) { + case V2DImode: + if (!VECTOR_UNIT_VSX_P (V2DFmode)) + return NULL_TREE; + + return TYPE_UNSIGNED (type) + ? 
rs6000_builtin_decls[VSX_BUILTIN_XVCVDPUXDS] + : rs6000_builtin_decls[VSX_BUILTIN_XVCVDPSXDS]; + case V4SImode: + if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode)) + return NULL_TREE; + return TYPE_UNSIGNED (type) - ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VCTUXS] - : rs6000_builtin_decls[ALTIVEC_BUILTIN_VCTSXS]; + ? rs6000_builtin_decls[VECTOR_BUILTIN_FIXUNS_V4SF_V4SI] + : rs6000_builtin_decls[VECTOR_BUILTIN_FIX_V4SF_V4SI]; + default: return NULL_TREE; } @@ -2034,10 +2576,22 @@ rs6000_builtin_conversion (enum tree_cod case FLOAT_EXPR: switch (TYPE_MODE (type)) { + case V2DImode: + if (!VECTOR_UNIT_VSX_P (V2DFmode)) + return NULL_TREE; + + return TYPE_UNSIGNED (type) + ? rs6000_builtin_decls[VSX_BUILTIN_XVCVUXDSP] + : rs6000_builtin_decls[VSX_BUILTIN_XVCVSXDSP]; + case V4SImode: + if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode)) + return NULL_TREE; + return TYPE_UNSIGNED (type) - ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VCFUX] - : rs6000_builtin_decls[ALTIVEC_BUILTIN_VCFSX]; + ? 
rs6000_builtin_decls[VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF] + : rs6000_builtin_decls[VECTOR_BUILTIN_FLOAT_V4SI_V4SF]; + default: return NULL_TREE; } @@ -2150,6 +2704,14 @@ rs6000_builtin_vec_perm (tree type, tree d = rs6000_builtin_decls[ALTIVEC_BUILTIN_VPERM_4SF]; break; + case V2DFmode: + d = rs6000_builtin_decls[ALTIVEC_BUILTIN_VPERM_2DF]; + break; + + case V2DImode: + d = rs6000_builtin_decls[ALTIVEC_BUILTIN_VPERM_2DI]; + break; + default: return NULL_TREE; } @@ -2229,6 +2791,7 @@ static bool rs6000_handle_option (size_t code, const char *arg, int value) { enum fpu_type_t fpu_type = FPU_NONE; + int isel; switch (code) { @@ -2331,14 +2894,14 @@ rs6000_handle_option (size_t code, const rs6000_parse_yes_no_option ("vrsave", arg, &(TARGET_ALTIVEC_VRSAVE)); break; - case OPT_misel: - rs6000_explicit_options.isel = true; - rs6000_isel = value; - break; - case OPT_misel_: - rs6000_explicit_options.isel = true; - rs6000_parse_yes_no_option ("isel", arg, &(rs6000_isel)); + target_flags_explicit |= MASK_ISEL; + isel = 0; + rs6000_parse_yes_no_option ("isel", arg, &isel); + if (isel) + target_flags |= MASK_ISEL; + else + target_flags &= ~MASK_ISEL; break; case OPT_mspe: @@ -2967,6 +3530,9 @@ output_vec_const_move (rtx *operands) vec = operands[1]; mode = GET_MODE (dest); + if (TARGET_VSX && zero_constant (vec, mode)) + return "xxlxor %x0,%x0,%x0"; + if (TARGET_ALTIVEC) { rtx splat_vec; @@ -3190,20 +3756,21 @@ rs6000_expand_vector_init (rtx target, r if (n_var == 0) { rtx const_vec = gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0)); - if (mode != V4SFmode && all_const_zero) + bool int_vector_p = (GET_MODE_CLASS (mode) == MODE_VECTOR_INT); + if ((int_vector_p || TARGET_VSX) && all_const_zero) { /* Zero register. */ emit_insn (gen_rtx_SET (VOIDmode, target, gen_rtx_XOR (mode, target, target))); return; } - else if (mode != V4SFmode && easy_vector_constant (const_vec, mode)) + else if (int_vector_p && easy_vector_constant (const_vec, mode)) { /* Splat immediate. 
*/ emit_insn (gen_rtx_SET (VOIDmode, target, const_vec)); return; } - else if (all_same) + else if (all_same && int_vector_p) ; /* Splat vector element. */ else { @@ -3213,6 +3780,18 @@ rs6000_expand_vector_init (rtx target, r } } + if (mode == V2DFmode) + { + gcc_assert (TARGET_VSX); + if (all_same) + emit_insn (gen_vsx_splatv2df (target, XVECEXP (vals, 0, 0))); + else + emit_insn (gen_vsx_concat_v2df (target, + copy_to_reg (XVECEXP (vals, 0, 0)), + copy_to_reg (XVECEXP (vals, 0, 1)))); + return; + } + /* Store value to stack temp. Load vector element. Splat. */ if (all_same) { @@ -3272,6 +3851,13 @@ rs6000_expand_vector_set (rtx target, rt int width = GET_MODE_SIZE (inner_mode); int i; + if (mode == V2DFmode) + { + gcc_assert (TARGET_VSX); + emit_insn (gen_vsx_set_v2df (target, val, target, GEN_INT (elt))); + return; + } + /* Load single variable value. */ mem = assign_stack_temp (mode, GET_MODE_SIZE (inner_mode), 0); emit_move_insn (adjust_address_nv (mem, inner_mode, 0), val); @@ -3309,6 +3895,13 @@ rs6000_expand_vector_extract (rtx target enum machine_mode inner_mode = GET_MODE_INNER (mode); rtx mem, x; + if (mode == V2DFmode) + { + gcc_assert (TARGET_VSX); + emit_insn (gen_vsx_extract_v2df (target, vec, GEN_INT (elt))); + return; + } + /* Allocate mode-sized buffer. */ mem = assign_stack_temp (mode, GET_MODE_SIZE (mode), 0); @@ -3627,9 +4220,13 @@ rs6000_legitimate_offset_address_p (enum case V8HImode: case V4SFmode: case V4SImode: - /* AltiVec vector modes. Only reg+reg addressing is valid and + case V2DFmode: + case V2DImode: + /* AltiVec/VSX vector modes. Only reg+reg addressing is valid and constant offset zero should not occur due to canonicalization. 
*/ - return false; + if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)) + return false; + break; case V4HImode: case V2SImode: @@ -3646,6 +4243,11 @@ rs6000_legitimate_offset_address_p (enum if (TARGET_E500_DOUBLE) return SPE_CONST_OFFSET_OK (offset); + /* If we are using VSX scalar loads, restrict ourselves to reg+reg + addressing. */ + if (VECTOR_MEM_VSX_P (DFmode)) + return false; + case DDmode: case DImode: /* On e500v2, we may have: @@ -3716,7 +4318,9 @@ avoiding_indexed_address_p (enum machine { /* Avoid indexed addressing for modes that have non-indexed load/store instruction forms. */ - return TARGET_AVOID_XFORM && !ALTIVEC_VECTOR_MODE (mode); + return (TARGET_AVOID_XFORM + && (!TARGET_ALTIVEC || !ALTIVEC_VECTOR_MODE (mode)) + && (!TARGET_VSX || !VSX_VECTOR_MODE (mode))); } inline bool @@ -3808,15 +4412,10 @@ rtx rs6000_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED, enum machine_mode mode) { + rtx ret = NULL_RTX; + rtx orig_x = x; unsigned int extra = 0; - if (GET_CODE (x) == SYMBOL_REF) - { - enum tls_model model = SYMBOL_REF_TLS_MODEL (x); - if (model != 0) - return rs6000_legitimize_tls_address (x, model); - } - switch (mode) { case DFmode: @@ -3838,19 +4437,26 @@ rs6000_legitimize_address (rtx x, rtx ol break; } - if (GET_CODE (x) == PLUS - && GET_CODE (XEXP (x, 0)) == REG - && GET_CODE (XEXP (x, 1)) == CONST_INT - && ((unsigned HOST_WIDE_INT) (INTVAL (XEXP (x, 1)) + 0x8000) - >= 0x10000 - extra) - && !((TARGET_POWERPC64 - && (mode == DImode || mode == TImode) - && (INTVAL (XEXP (x, 1)) & 3) != 0) - || SPE_VECTOR_MODE (mode) - || ALTIVEC_VECTOR_MODE (mode) - || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode - || mode == DImode || mode == DDmode - || mode == TDmode)))) + if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x)) + { + enum tls_model model = SYMBOL_REF_TLS_MODEL (x); + ret = rs6000_legitimize_tls_address (x, model); + } + + else if (GET_CODE (x) == PLUS + && GET_CODE (XEXP (x, 0)) == REG + && GET_CODE (XEXP (x, 1)) == CONST_INT + 
&& ((unsigned HOST_WIDE_INT) (INTVAL (XEXP (x, 1)) + 0x8000) + >= 0x10000 - extra) + && !((TARGET_POWERPC64 + && (mode == DImode || mode == TImode) + && (INTVAL (XEXP (x, 1)) & 3) != 0) + || (TARGET_SPE && SPE_VECTOR_MODE (mode)) + || (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode)) + || (TARGET_VSX && VSX_VECTOR_MODE (mode)) + || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode + || mode == DImode || mode == DDmode + || mode == TDmode)))) { HOST_WIDE_INT high_int, low_int; rtx sum; @@ -3860,7 +4466,7 @@ rs6000_legitimize_address (rtx x, rtx ol high_int = INTVAL (XEXP (x, 1)) - low_int; sum = force_operand (gen_rtx_PLUS (Pmode, XEXP (x, 0), GEN_INT (high_int)), 0); - return plus_constant (sum, low_int); + ret = plus_constant (sum, low_int); } else if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == REG @@ -3876,32 +4482,29 @@ rs6000_legitimize_address (rtx x, rtx ol && mode != TFmode && mode != TDmode) { - return gen_rtx_PLUS (Pmode, XEXP (x, 0), - force_reg (Pmode, force_operand (XEXP (x, 1), 0))); + ret = gen_rtx_PLUS (Pmode, XEXP (x, 0), + force_reg (Pmode, force_operand (XEXP (x, 1), 0))); } - else if (ALTIVEC_VECTOR_MODE (mode)) + else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)) { - rtx reg; - /* Make sure both operands are registers. */ if (GET_CODE (x) == PLUS) - return gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)), - force_reg (Pmode, XEXP (x, 1))); - - reg = force_reg (Pmode, x); - return reg; + ret = gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)), + force_reg (Pmode, XEXP (x, 1))); + else + ret = force_reg (Pmode, x); } - else if (SPE_VECTOR_MODE (mode) + else if ((TARGET_SPE && SPE_VECTOR_MODE (mode)) || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode || mode == DDmode || mode == TDmode || mode == DImode))) { if (mode == DImode) - return NULL_RTX; - /* We accept [reg + reg] and [reg + OFFSET]. */ + ret = NULL_RTX; - if (GET_CODE (x) == PLUS) - { + /* We accept [reg + reg] and [reg + OFFSET]. 
*/ + else if (GET_CODE (x) == PLUS) + { rtx op1 = XEXP (x, 0); rtx op2 = XEXP (x, 1); rtx y; @@ -3920,12 +4523,12 @@ rs6000_legitimize_address (rtx x, rtx ol y = gen_rtx_PLUS (Pmode, op1, op2); if ((GET_MODE_SIZE (mode) > 8 || mode == DDmode) && REG_P (op2)) - return force_reg (Pmode, y); + ret = force_reg (Pmode, y); else - return y; + ret = y; } - - return force_reg (Pmode, x); + else + ret = force_reg (Pmode, x); } else if (TARGET_ELF && TARGET_32BIT @@ -3941,7 +4544,7 @@ rs6000_legitimize_address (rtx x, rtx ol { rtx reg = gen_reg_rtx (Pmode); emit_insn (gen_elf_high (reg, x)); - return gen_rtx_LO_SUM (Pmode, reg, x); + ret = gen_rtx_LO_SUM (Pmode, reg, x); } else if (TARGET_MACHO && TARGET_32BIT && TARGET_NO_TOC && ! flag_pic @@ -3959,17 +4562,35 @@ rs6000_legitimize_address (rtx x, rtx ol { rtx reg = gen_reg_rtx (Pmode); emit_insn (gen_macho_high (reg, x)); - return gen_rtx_LO_SUM (Pmode, reg, x); + ret = gen_rtx_LO_SUM (Pmode, reg, x); } else if (TARGET_TOC && GET_CODE (x) == SYMBOL_REF && constant_pool_expr_p (x) && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode)) { - return create_TOC_reference (x); + ret = create_TOC_reference (x); } else - return NULL_RTX; + ret = NULL_RTX; + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_legitimize_address: mode %s, original addr:\n", + GET_MODE_NAME (mode)); + debug_rtx (orig_x); + if (ret) + { + fprintf (stderr, "New addr:\n"); + debug_rtx (ret); + } + else + fprintf (stderr, "NULL returned\n"); + fprintf (stderr, "\n"); + } + + return ret; } /* This is called from dwarf2out.c via TARGET_ASM_OUTPUT_DWARF_DTPREL. @@ -4258,6 +4879,9 @@ rs6000_legitimize_reload_address (rtx x, int opnum, int type, int ind_levels ATTRIBUTE_UNUSED, int *win) { + rtx orig_x = x; + rtx ret = NULL_RTX; + /* We must recognize output that we have already generated ourselves. 
*/ if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 0)) == PLUS @@ -4269,17 +4893,17 @@ rs6000_legitimize_reload_address (rtx x, BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, opnum, (enum reload_type)type); *win = 1; - return x; + ret = x; } #if TARGET_MACHO - if (DEFAULT_ABI == ABI_DARWIN && flag_pic - && GET_CODE (x) == LO_SUM - && GET_CODE (XEXP (x, 0)) == PLUS - && XEXP (XEXP (x, 0), 0) == pic_offset_table_rtx - && GET_CODE (XEXP (XEXP (x, 0), 1)) == HIGH - && XEXP (XEXP (XEXP (x, 0), 1), 0) == XEXP (x, 1) - && machopic_operand_p (XEXP (x, 1))) + else if (DEFAULT_ABI == ABI_DARWIN && flag_pic + && GET_CODE (x) == LO_SUM + && GET_CODE (XEXP (x, 0)) == PLUS + && XEXP (XEXP (x, 0), 0) == pic_offset_table_rtx + && GET_CODE (XEXP (XEXP (x, 0), 1)) == HIGH + && XEXP (XEXP (XEXP (x, 0), 1), 0) == XEXP (x, 1) + && machopic_operand_p (XEXP (x, 1))) { /* Result of previous invocation of this function on Darwin floating point constant. */ @@ -4287,40 +4911,40 @@ rs6000_legitimize_reload_address (rtx x, BASE_REG_CLASS, Pmode, VOIDmode, 0, 0, opnum, (enum reload_type)type); *win = 1; - return x; + ret = x; } #endif /* Force ld/std non-word aligned offset into base register by wrapping in offset 0. 
*/ - if (GET_CODE (x) == PLUS - && GET_CODE (XEXP (x, 0)) == REG - && REGNO (XEXP (x, 0)) < 32 - && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode) - && GET_CODE (XEXP (x, 1)) == CONST_INT - && (INTVAL (XEXP (x, 1)) & 3) != 0 - && !ALTIVEC_VECTOR_MODE (mode) - && GET_MODE_SIZE (mode) >= UNITS_PER_WORD - && TARGET_POWERPC64) + else if (GET_CODE (x) == PLUS + && GET_CODE (XEXP (x, 0)) == REG + && REGNO (XEXP (x, 0)) < 32 + && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode) + && GET_CODE (XEXP (x, 1)) == CONST_INT + && (INTVAL (XEXP (x, 1)) & 3) != 0 + && VECTOR_MEM_NONE_P (mode) + && GET_MODE_SIZE (mode) >= UNITS_PER_WORD + && TARGET_POWERPC64) { x = gen_rtx_PLUS (GET_MODE (x), x, GEN_INT (0)); push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL, BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, opnum, (enum reload_type) type); *win = 1; - return x; + ret = x; } - if (GET_CODE (x) == PLUS - && GET_CODE (XEXP (x, 0)) == REG - && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER - && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode) - && GET_CODE (XEXP (x, 1)) == CONST_INT - && !SPE_VECTOR_MODE (mode) - && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode - || mode == DDmode || mode == TDmode - || mode == DImode)) - && !ALTIVEC_VECTOR_MODE (mode)) + else if (GET_CODE (x) == PLUS + && GET_CODE (XEXP (x, 0)) == REG + && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER + && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode) + && GET_CODE (XEXP (x, 1)) == CONST_INT + && !SPE_VECTOR_MODE (mode) + && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode + || mode == DDmode || mode == TDmode + || mode == DImode)) + && VECTOR_MEM_NONE_P (mode)) { HOST_WIDE_INT val = INTVAL (XEXP (x, 1)); HOST_WIDE_INT low = ((val & 0xffff) ^ 0x8000) - 0x8000; @@ -4331,42 +4955,45 @@ rs6000_legitimize_reload_address (rtx x, if (high + low != val) { *win = 0; - return x; + ret = x; } + else + { + /* Reload the high part into a base reg; leave the low part + in the mem directly. 
*/ - /* Reload the high part into a base reg; leave the low part - in the mem directly. */ - - x = gen_rtx_PLUS (GET_MODE (x), - gen_rtx_PLUS (GET_MODE (x), XEXP (x, 0), - GEN_INT (high)), - GEN_INT (low)); + x = gen_rtx_PLUS (GET_MODE (x), + gen_rtx_PLUS (GET_MODE (x), XEXP (x, 0), + GEN_INT (high)), + GEN_INT (low)); - push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL, - BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, - opnum, (enum reload_type)type); - *win = 1; - return x; + push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL, + BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, + opnum, (enum reload_type)type); + *win = 1; + return x; + } } - if (GET_CODE (x) == SYMBOL_REF - && !ALTIVEC_VECTOR_MODE (mode) - && !SPE_VECTOR_MODE (mode) + else if (GET_CODE (x) == SYMBOL_REF + && VECTOR_MEM_NONE_P (mode) + && (!TARGET_SPE || !SPE_VECTOR_MODE (mode)) #if TARGET_MACHO - && DEFAULT_ABI == ABI_DARWIN - && (flag_pic || MACHO_DYNAMIC_NO_PIC_P) + && DEFAULT_ABI == ABI_DARWIN + && (flag_pic || MACHO_DYNAMIC_NO_PIC_P) #else - && DEFAULT_ABI == ABI_V4 - && !flag_pic + && DEFAULT_ABI == ABI_V4 + && !flag_pic #endif - /* Don't do this for TFmode or TDmode, since the result isn't offsettable. - The same goes for DImode without 64-bit gprs and DFmode and DDmode - without fprs. */ - && mode != TFmode - && mode != TDmode - && (mode != DImode || TARGET_POWERPC64) - && ((mode != DFmode && mode != DDmode) || TARGET_POWERPC64 - || (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT))) + /* Don't do this for TFmode or TDmode, since the result isn't + offsettable. The same goes for DImode without 64-bit gprs and + DFmode and DDmode without fprs. 
*/ + && VECTOR_MEM_NONE_P (mode) + && mode != TFmode + && mode != TDmode + && (mode != DImode || TARGET_POWERPC64) + && ((mode != DFmode && mode != DDmode) || TARGET_POWERPC64 + || (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT))) { #if TARGET_MACHO if (flag_pic) @@ -4385,37 +5012,63 @@ rs6000_legitimize_reload_address (rtx x, BASE_REG_CLASS, Pmode, VOIDmode, 0, 0, opnum, (enum reload_type)type); *win = 1; - return x; + ret = x; } /* Reload an offset address wrapped by an AND that represents the masking of the lower bits. Strip the outer AND and let reload convert the offset address into an indirect address. */ - if (TARGET_ALTIVEC - && ALTIVEC_VECTOR_MODE (mode) - && GET_CODE (x) == AND - && GET_CODE (XEXP (x, 0)) == PLUS - && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG - && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT - && GET_CODE (XEXP (x, 1)) == CONST_INT - && INTVAL (XEXP (x, 1)) == -16) + else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) + && GET_CODE (x) == AND + && GET_CODE (XEXP (x, 0)) == PLUS + && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG + && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT + && GET_CODE (XEXP (x, 1)) == CONST_INT + && INTVAL (XEXP (x, 1)) == -16) { x = XEXP (x, 0); *win = 1; - return x; + ret = x; } - if (TARGET_TOC - && GET_CODE (x) == SYMBOL_REF - && constant_pool_expr_p (x) - && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode)) + else if (TARGET_TOC + && GET_CODE (x) == SYMBOL_REF + && constant_pool_expr_p (x) + && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode)) { x = create_TOC_reference (x); *win = 1; - return x; + ret = x; + } + + else + { + *win = 0; + ret = x; + } + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_legitimize_reload_address: mode = %s, opnum = %d, " + "type = %d, ind_levels = %d, win = %d, original addr:\n", + GET_MODE_NAME (mode), opnum, type, ind_levels, *win); + debug_rtx (orig_x); + + if (orig_x == ret) + fprintf (stderr, "Same address returned\n"); + else if (!ret) + 
fprintf (stderr, "NULL returned\n"); + else + { + fprintf (stderr, "New address:\n"); + debug_rtx (ret); + } + + fprintf (stderr, "\n"); } - *win = 0; - return x; + + return ret; } /* GO_IF_LEGITIMATE_ADDRESS recognizes an RTL expression @@ -4438,77 +5091,101 @@ rs6000_legitimize_reload_address (rtx x, int rs6000_legitimate_address (enum machine_mode mode, rtx x, int reg_ok_strict) { + int ret; + rtx orig_x = x; + /* If this is an unaligned stvx/ldvx type address, discard the outer AND. */ - if (TARGET_ALTIVEC - && ALTIVEC_VECTOR_MODE (mode) + if ((TARGET_ALTIVEC || TARGET_VSX) + && VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && GET_CODE (x) == AND && GET_CODE (XEXP (x, 1)) == CONST_INT && INTVAL (XEXP (x, 1)) == -16) x = XEXP (x, 0); if (RS6000_SYMBOL_REF_TLS_P (x)) - return 0; - if (legitimate_indirect_address_p (x, reg_ok_strict)) - return 1; - if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC) - && !ALTIVEC_VECTOR_MODE (mode) - && !SPE_VECTOR_MODE (mode) - && mode != TFmode - && mode != TDmode - /* Restrict addressing for DI because of our SUBREG hackery. */ - && !(TARGET_E500_DOUBLE - && (mode == DFmode || mode == DDmode || mode == DImode)) - && TARGET_UPDATE - && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)) - return 1; - if (legitimate_small_data_p (mode, x)) - return 1; - if (legitimate_constant_pool_address_p (x)) - return 1; + ret = 0; + else if (legitimate_indirect_address_p (x, reg_ok_strict)) + ret = 1; + else if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC) + && !VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) + && (!TARGET_SPE || !SPE_VECTOR_MODE (mode)) + && mode != TFmode + && mode != TDmode + /* Restrict addressing for DI because of our SUBREG hackery.
*/ + && !(TARGET_E500_DOUBLE + && (mode == DFmode || mode == DDmode || mode == DImode)) + && TARGET_UPDATE + && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)) + ret = 1; + else if (legitimate_small_data_p (mode, x)) + ret = 1; + else if (legitimate_constant_pool_address_p (x)) + ret = 1; /* If not REG_OK_STRICT (before reload) let pass any stack offset. */ - if (! reg_ok_strict - && GET_CODE (x) == PLUS - && GET_CODE (XEXP (x, 0)) == REG - && (XEXP (x, 0) == virtual_stack_vars_rtx - || XEXP (x, 0) == arg_pointer_rtx) - && GET_CODE (XEXP (x, 1)) == CONST_INT) - return 1; - if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict)) - return 1; - if (mode != TImode - && mode != TFmode - && mode != TDmode - && ((TARGET_HARD_FLOAT && TARGET_FPRS) - || TARGET_POWERPC64 - || (mode != DFmode && mode != DDmode) - || (TARGET_E500_DOUBLE && mode != DDmode)) - && (TARGET_POWERPC64 || mode != DImode) - && !avoiding_indexed_address_p (mode) - && legitimate_indexed_address_p (x, reg_ok_strict)) - return 1; - if (GET_CODE (x) == PRE_MODIFY - && mode != TImode - && mode != TFmode - && mode != TDmode - && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) - || TARGET_POWERPC64 - || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE)) - && (TARGET_POWERPC64 || mode != DImode) - && !ALTIVEC_VECTOR_MODE (mode) - && !SPE_VECTOR_MODE (mode) - /* Restrict addressing for DI because of our SUBREG hackery. */ - && !(TARGET_E500_DOUBLE - && (mode == DFmode || mode == DDmode || mode == DImode)) - && TARGET_UPDATE - && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict) - && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1), reg_ok_strict) - || (!avoiding_indexed_address_p (mode) - && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict))) - && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0))) - return 1; - if (legitimate_lo_sum_address_p (mode, x, reg_ok_strict)) - return 1; - return 0; + else if (! 
reg_ok_strict + && GET_CODE (x) == PLUS + && GET_CODE (XEXP (x, 0)) == REG + && (XEXP (x, 0) == virtual_stack_vars_rtx + || XEXP (x, 0) == arg_pointer_rtx) + && GET_CODE (XEXP (x, 1)) == CONST_INT) + ret = 1; + else if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict)) + ret = 1; + else if (mode != TImode + && mode != TFmode + && mode != TDmode + && ((TARGET_HARD_FLOAT && TARGET_FPRS) + || TARGET_POWERPC64 + || (mode != DFmode && mode != DDmode) + || (TARGET_E500_DOUBLE && mode != DDmode)) + && (TARGET_POWERPC64 || mode != DImode) + && !avoiding_indexed_address_p (mode) + && legitimate_indexed_address_p (x, reg_ok_strict)) + ret = 1; + else if (GET_CODE (x) == PRE_MODIFY + && VECTOR_MEM_VSX_P (mode) + && TARGET_UPDATE + && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict) + && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0))) + ret = 1; + else if (GET_CODE (x) == PRE_MODIFY + && mode != TImode + && mode != TFmode + && mode != TDmode + && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) + || TARGET_POWERPC64 + || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE)) + && (TARGET_POWERPC64 || mode != DImode) + && !VECTOR_MEM_ALTIVEC_P (mode) + && (!TARGET_SPE || !SPE_VECTOR_MODE (mode)) + /* Restrict addressing for DI because of our SUBREG hackery. 
*/ + && !(TARGET_E500_DOUBLE + && (mode == DFmode || mode == DDmode || mode == DImode)) + && TARGET_UPDATE + && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict) + && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1), reg_ok_strict) + || (!avoiding_indexed_address_p (mode) + && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict))) + && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0))) + ret = 1; + else if (legitimate_lo_sum_address_p (mode, x, reg_ok_strict)) + ret = 1; + else + ret = 0; + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_legitimate_address: return = %d, mode = %s, " + "strict = %d\n", + ret, + GET_MODE_NAME (mode), + reg_ok_strict); + debug_rtx (orig_x); + fprintf (stderr, "\n"); + } + + return ret; } /* Go to LABEL if ADDR (a legitimate address expression) @@ -4525,28 +5202,46 @@ rs6000_legitimate_address (enum machine_ bool rs6000_mode_dependent_address (rtx addr) { + bool ret = false; + switch (GET_CODE (addr)) { case PLUS: if (GET_CODE (XEXP (addr, 1)) == CONST_INT) { unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1)); - return val + 12 + 0x8000 >= 0x10000; + ret = (val + 12 + 0x8000 >= 0x10000); } break; case LO_SUM: - return true; + ret = true; + break; /* Auto-increment cases are now treated generically in recog.c. */ case PRE_MODIFY: - return TARGET_UPDATE; + ret = (TARGET_UPDATE != 0); + break; + + /* AND is only allowed in Altivec loads. */ + case AND: + ret = true; + break; default: break; } - return false; + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_mode_dependent_address: ret = %s\n", + ret ? "true" : "false"); + debug_rtx (addr); + fprintf (stderr, "\n"); + } + + return ret; } /* Implement FIND_BASE_TERM. */ @@ -4597,43 +5292,6 @@ rs6000_offsettable_memref_p (rtx op) return rs6000_legitimate_offset_address_p (GET_MODE (op), XEXP (op, 0), 1); } -/* Return number of consecutive hard regs needed starting at reg REGNO - to hold something of mode MODE. 
- This is ordinarily the length in words of a value of mode MODE - but can be less for certain modes in special long registers. - - For the SPE, GPRs are 64 bits but only 32 bits are visible in - scalar instructions. The upper 32 bits are only available to the - SIMD instructions. - - POWER and PowerPC GPRs hold 32 bits worth; - PowerPC64 GPRs and FPRs point register holds 64 bits worth. */ - -int -rs6000_hard_regno_nregs (int regno, enum machine_mode mode) -{ - if (FP_REGNO_P (regno)) - return (GET_MODE_SIZE (mode) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD; - - if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) - return (GET_MODE_SIZE (mode) + UNITS_PER_SPE_WORD - 1) / UNITS_PER_SPE_WORD; - - if (ALTIVEC_REGNO_P (regno)) - return - (GET_MODE_SIZE (mode) + UNITS_PER_ALTIVEC_WORD - 1) / UNITS_PER_ALTIVEC_WORD; - - /* The value returned for SCmode in the E500 double case is 2 for - ABI compatibility; storing an SCmode value in a single register - would require function_arg and rs6000_spe_function_arg to handle - SCmode so as to pass the value correctly in a pair of - registers. */ - if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode - && !DECIMAL_FLOAT_MODE_P (mode)) - return (GET_MODE_SIZE (mode) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD; - - return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD; -} - /* Change register usage conditional on target flags. 
*/ void rs6000_conditional_register_usage (void) @@ -4698,14 +5356,14 @@ rs6000_conditional_register_usage (void) = call_really_used_regs[14] = 1; } - if (!TARGET_ALTIVEC) + if (!TARGET_ALTIVEC && !TARGET_VSX) { for (i = FIRST_ALTIVEC_REGNO; i <= LAST_ALTIVEC_REGNO; ++i) fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1; call_really_used_regs[VRSAVE_REGNO] = 1; } - if (TARGET_ALTIVEC) + if (TARGET_ALTIVEC || TARGET_VSX) global_regs[VSCR_REGNO] = 1; if (TARGET_ALTIVEC_ABI) @@ -4923,6 +5581,20 @@ rs6000_emit_move (rtx dest, rtx source, operands[0] = dest; operands[1] = source; + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "\nrs6000_emit_move: mode = %s, reload_in_progress = %d, " + "reload_completed = %d, can_create_pseudos = %d.\ndest:\n", + GET_MODE_NAME (mode), + reload_in_progress, + reload_completed, + can_create_pseudo_p ()); + debug_rtx (dest); + fprintf (stderr, "source:\n"); + debug_rtx (source); + } + /* Sanity checks. Check that we get CONST_DOUBLE only when we should. */ if (GET_CODE (operands[1]) == CONST_DOUBLE && ! FLOAT_MODE_P (mode) @@ -5127,6 +5799,8 @@ rs6000_emit_move (rtx dest, rtx source, case V2SFmode: case V2SImode: case V1DImode: + case V2DFmode: + case V2DImode: if (CONSTANT_P (operands[1]) && !easy_vector_constant (operands[1], mode)) operands[1] = force_const_mem (mode, operands[1]); @@ -5296,6 +5970,9 @@ rs6000_emit_move (rtx dest, rtx source, break; case TImode: + if (VECTOR_MEM_ALTIVEC_OR_VSX_P (TImode)) + break; + rs6000_eliminate_indexed_memrefs (operands); if (TARGET_POWER) @@ -5311,7 +5988,7 @@ rs6000_emit_move (rtx dest, rtx source, break; default: - gcc_unreachable (); + fatal_insn ("bad move", gen_rtx_SET (VOIDmode, dest, source)); } /* Above, we may have called force_const_mem which may have returned @@ -5331,10 +6008,10 @@ rs6000_emit_move (rtx dest, rtx source, && TARGET_HARD_FLOAT && TARGET_FPRS) /* Nonzero if we can use an AltiVec register to pass this arg. 
*/ -#define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,TYPE,NAMED) \ - (ALTIVEC_VECTOR_MODE (MODE) \ - && (CUM)->vregno <= ALTIVEC_ARG_MAX_REG \ - && TARGET_ALTIVEC_ABI \ +#define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,TYPE,NAMED) \ + ((ALTIVEC_VECTOR_MODE (MODE) || VSX_VECTOR_MODE (MODE)) \ + && (CUM)->vregno <= ALTIVEC_ARG_MAX_REG \ + && TARGET_ALTIVEC_ABI \ && (NAMED)) /* Return a nonzero value to say to return the function value in @@ -5575,7 +6252,7 @@ function_arg_boundary (enum machine_mode && int_size_in_bytes (type) >= 8 && int_size_in_bytes (type) < 16)) return 64; - else if (ALTIVEC_VECTOR_MODE (mode) + else if ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode)) || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) >= 16)) return 128; @@ -5720,7 +6397,7 @@ function_arg_advance (CUMULATIVE_ARGS *c cum->nargs_prototype--; if (TARGET_ALTIVEC_ABI - && (ALTIVEC_VECTOR_MODE (mode) + && ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode)) || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) == 16))) { @@ -6314,7 +6991,7 @@ function_arg (CUMULATIVE_ARGS *cum, enum else return gen_rtx_REG (mode, cum->vregno); else if (TARGET_ALTIVEC_ABI - && (ALTIVEC_VECTOR_MODE (mode) + && ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode)) || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) == 16))) { @@ -7238,10 +7915,13 @@ static const struct builtin_description { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v4si, "__builtin_altivec_vperm_4si", ALTIVEC_BUILTIN_VPERM_4SI }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v8hi, "__builtin_altivec_vperm_8hi", ALTIVEC_BUILTIN_VPERM_8HI }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v16qi, "__builtin_altivec_vperm_16qi", ALTIVEC_BUILTIN_VPERM_16QI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v4sf, "__builtin_altivec_vsel_4sf", ALTIVEC_BUILTIN_VSEL_4SF }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v4si, "__builtin_altivec_vsel_4si", ALTIVEC_BUILTIN_VSEL_4SI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v8hi, 
"__builtin_altivec_vsel_8hi", ALTIVEC_BUILTIN_VSEL_8HI }, - { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v16qi, "__builtin_altivec_vsel_16qi", ALTIVEC_BUILTIN_VSEL_16QI }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2df, "__builtin_altivec_vperm_2df", ALTIVEC_BUILTIN_VPERM_2DF }, + { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2di, "__builtin_altivec_vperm_2di", ALTIVEC_BUILTIN_VPERM_2DI }, + { MASK_ALTIVEC, CODE_FOR_vector_vselv4sf, "__builtin_altivec_vsel_4sf", ALTIVEC_BUILTIN_VSEL_4SF }, + { MASK_ALTIVEC, CODE_FOR_vector_vselv4si, "__builtin_altivec_vsel_4si", ALTIVEC_BUILTIN_VSEL_4SI }, + { MASK_ALTIVEC, CODE_FOR_vector_vselv8hi, "__builtin_altivec_vsel_8hi", ALTIVEC_BUILTIN_VSEL_8HI }, + { MASK_ALTIVEC, CODE_FOR_vector_vselv16qi, "__builtin_altivec_vsel_16qi", ALTIVEC_BUILTIN_VSEL_16QI }, + { MASK_ALTIVEC, CODE_FOR_vector_vselv2df, "__builtin_altivec_vsel_2df", ALTIVEC_BUILTIN_VSEL_2DF }, { MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v16qi, "__builtin_altivec_vsldoi_16qi", ALTIVEC_BUILTIN_VSLDOI_16QI }, { MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v8hi, "__builtin_altivec_vsldoi_8hi", ALTIVEC_BUILTIN_VSLDOI_8HI }, { MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v4si, "__builtin_altivec_vsldoi_4si", ALTIVEC_BUILTIN_VSLDOI_4SI }, @@ -7263,6 +7943,16 @@ static const struct builtin_description { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_perm", ALTIVEC_BUILTIN_VEC_PERM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sel", ALTIVEC_BUILTIN_VEC_SEL }, + { MASK_VSX, CODE_FOR_vsx_fmaddv2df4, "__builtin_vsx_xvmadddp", VSX_BUILTIN_XVMADDDP }, + { MASK_VSX, CODE_FOR_vsx_fmsubv2df4, "__builtin_vsx_xvmsubdp", VSX_BUILTIN_XVMSUBDP }, + { MASK_VSX, CODE_FOR_vsx_fnmaddv2df4, "__builtin_vsx_xvnmadddp", VSX_BUILTIN_XVNMADDDP }, + { MASK_VSX, CODE_FOR_vsx_fnmsubv2df4, "__builtin_vsx_xvnmsubdp", VSX_BUILTIN_XVNMSUBDP }, + + { MASK_VSX, CODE_FOR_vsx_fmaddv4sf4, "__builtin_vsx_xvmaddsp", VSX_BUILTIN_XVMADDSP }, + { MASK_VSX, CODE_FOR_vsx_fmsubv4sf4, "__builtin_vsx_xvmsubsp", VSX_BUILTIN_XVMSUBSP }, + { 
MASK_VSX, CODE_FOR_vsx_fnmaddv4sf4, "__builtin_vsx_xvnmaddsp", VSX_BUILTIN_XVNMADDSP }, + { MASK_VSX, CODE_FOR_vsx_fnmsubv4sf4, "__builtin_vsx_xvnmsubsp", VSX_BUILTIN_XVNMSUBSP }, + { 0, CODE_FOR_paired_msub, "__builtin_paired_msub", PAIRED_BUILTIN_MSUB }, { 0, CODE_FOR_paired_madd, "__builtin_paired_madd", PAIRED_BUILTIN_MADD }, { 0, CODE_FOR_paired_madds0, "__builtin_paired_madds0", PAIRED_BUILTIN_MADDS0 }, @@ -7315,18 +8005,18 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vcfux, "__builtin_altivec_vcfux", ALTIVEC_BUILTIN_VCFUX }, { MASK_ALTIVEC, CODE_FOR_altivec_vcfsx, "__builtin_altivec_vcfsx", ALTIVEC_BUILTIN_VCFSX }, { MASK_ALTIVEC, CODE_FOR_altivec_vcmpbfp, "__builtin_altivec_vcmpbfp", ALTIVEC_BUILTIN_VCMPBFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequb, "__builtin_altivec_vcmpequb", ALTIVEC_BUILTIN_VCMPEQUB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequh, "__builtin_altivec_vcmpequh", ALTIVEC_BUILTIN_VCMPEQUH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequw, "__builtin_altivec_vcmpequw", ALTIVEC_BUILTIN_VCMPEQUW }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpeqfp, "__builtin_altivec_vcmpeqfp", ALTIVEC_BUILTIN_VCMPEQFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgefp, "__builtin_altivec_vcmpgefp", ALTIVEC_BUILTIN_VCMPGEFP }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtub, "__builtin_altivec_vcmpgtub", ALTIVEC_BUILTIN_VCMPGTUB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsb, "__builtin_altivec_vcmpgtsb", ALTIVEC_BUILTIN_VCMPGTSB }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtuh, "__builtin_altivec_vcmpgtuh", ALTIVEC_BUILTIN_VCMPGTUH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsh, "__builtin_altivec_vcmpgtsh", ALTIVEC_BUILTIN_VCMPGTSH }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtuw, "__builtin_altivec_vcmpgtuw", ALTIVEC_BUILTIN_VCMPGTUW }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsw, "__builtin_altivec_vcmpgtsw", ALTIVEC_BUILTIN_VCMPGTSW }, - { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtfp, "__builtin_altivec_vcmpgtfp", ALTIVEC_BUILTIN_VCMPGTFP }, + { 
MASK_ALTIVEC, CODE_FOR_vector_eqv16qi, "__builtin_altivec_vcmpequb", ALTIVEC_BUILTIN_VCMPEQUB }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv8hi, "__builtin_altivec_vcmpequh", ALTIVEC_BUILTIN_VCMPEQUH }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv4si, "__builtin_altivec_vcmpequw", ALTIVEC_BUILTIN_VCMPEQUW }, + { MASK_ALTIVEC, CODE_FOR_vector_eqv4sf, "__builtin_altivec_vcmpeqfp", ALTIVEC_BUILTIN_VCMPEQFP }, + { MASK_ALTIVEC, CODE_FOR_vector_gev4sf, "__builtin_altivec_vcmpgefp", ALTIVEC_BUILTIN_VCMPGEFP }, + { MASK_ALTIVEC, CODE_FOR_vector_gtuv16qi, "__builtin_altivec_vcmpgtub", ALTIVEC_BUILTIN_VCMPGTUB }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv16qi, "__builtin_altivec_vcmpgtsb", ALTIVEC_BUILTIN_VCMPGTSB }, + { MASK_ALTIVEC, CODE_FOR_vector_gtuv8hi, "__builtin_altivec_vcmpgtuh", ALTIVEC_BUILTIN_VCMPGTUH }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv8hi, "__builtin_altivec_vcmpgtsh", ALTIVEC_BUILTIN_VCMPGTSH }, + { MASK_ALTIVEC, CODE_FOR_vector_gtuv4si, "__builtin_altivec_vcmpgtuw", ALTIVEC_BUILTIN_VCMPGTUW }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv4si, "__builtin_altivec_vcmpgtsw", ALTIVEC_BUILTIN_VCMPGTSW }, + { MASK_ALTIVEC, CODE_FOR_vector_gtv4sf, "__builtin_altivec_vcmpgtfp", ALTIVEC_BUILTIN_VCMPGTFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vctsxs, "__builtin_altivec_vctsxs", ALTIVEC_BUILTIN_VCTSXS }, { MASK_ALTIVEC, CODE_FOR_altivec_vctuxs, "__builtin_altivec_vctuxs", ALTIVEC_BUILTIN_VCTUXS }, { MASK_ALTIVEC, CODE_FOR_umaxv16qi3, "__builtin_altivec_vmaxub", ALTIVEC_BUILTIN_VMAXUB }, @@ -7357,7 +8047,7 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vmulosb, "__builtin_altivec_vmulosb", ALTIVEC_BUILTIN_VMULOSB }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulouh, "__builtin_altivec_vmulouh", ALTIVEC_BUILTIN_VMULOUH }, { MASK_ALTIVEC, CODE_FOR_altivec_vmulosh, "__builtin_altivec_vmulosh", ALTIVEC_BUILTIN_VMULOSH }, - { MASK_ALTIVEC, CODE_FOR_altivec_norv4si3, "__builtin_altivec_vnor", ALTIVEC_BUILTIN_VNOR }, + { MASK_ALTIVEC, CODE_FOR_norv4si3, "__builtin_altivec_vnor",
ALTIVEC_BUILTIN_VNOR }, { MASK_ALTIVEC, CODE_FOR_iorv4si3, "__builtin_altivec_vor", ALTIVEC_BUILTIN_VOR }, { MASK_ALTIVEC, CODE_FOR_altivec_vpkuhum, "__builtin_altivec_vpkuhum", ALTIVEC_BUILTIN_VPKUHUM }, { MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum, "__builtin_altivec_vpkuwum", ALTIVEC_BUILTIN_VPKUWUM }, @@ -7405,8 +8095,24 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vsumsws, "__builtin_altivec_vsumsws", ALTIVEC_BUILTIN_VSUMSWS }, { MASK_ALTIVEC, CODE_FOR_xorv4si3, "__builtin_altivec_vxor", ALTIVEC_BUILTIN_VXOR }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_add", ALTIVEC_BUILTIN_VEC_ADD }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddfp", ALTIVEC_BUILTIN_VEC_VADDFP }, + { MASK_VSX, CODE_FOR_addv2df3, "__builtin_vsx_xvadddp", VSX_BUILTIN_XVADDDP }, + { MASK_VSX, CODE_FOR_subv2df3, "__builtin_vsx_xvsubdp", VSX_BUILTIN_XVSUBDP }, + { MASK_VSX, CODE_FOR_mulv2df3, "__builtin_vsx_xvmuldp", VSX_BUILTIN_XVMULDP }, + { MASK_VSX, CODE_FOR_divv2df3, "__builtin_vsx_xvdivdp", VSX_BUILTIN_XVDIVDP }, + { MASK_VSX, CODE_FOR_sminv2df3, "__builtin_vsx_xvmindp", VSX_BUILTIN_XVMINDP }, + { MASK_VSX, CODE_FOR_smaxv2df3, "__builtin_vsx_xvmaxdp", VSX_BUILTIN_XVMAXDP }, + { MASK_VSX, CODE_FOR_vsx_tdivv2df3, "__builtin_vsx_xvtdivdp", VSX_BUILTIN_XVTDIVDP }, + + { MASK_VSX, CODE_FOR_addv4sf3, "__builtin_vsx_xvaddsp", VSX_BUILTIN_XVADDSP }, + { MASK_VSX, CODE_FOR_subv4sf3, "__builtin_vsx_xvsubsp", VSX_BUILTIN_XVSUBSP }, + { MASK_VSX, CODE_FOR_mulv4sf3, "__builtin_vsx_xvmulsp", VSX_BUILTIN_XVMULSP }, + { MASK_VSX, CODE_FOR_divv4sf3, "__builtin_vsx_xvdivsp", VSX_BUILTIN_XVDIVSP }, + { MASK_VSX, CODE_FOR_sminv4sf3, "__builtin_vsx_xvminsp", VSX_BUILTIN_XVMINSP }, + { MASK_VSX, CODE_FOR_smaxv4sf3, "__builtin_vsx_xvmaxsp", VSX_BUILTIN_XVMAXSP }, + { MASK_VSX, CODE_FOR_vsx_tdivv4sf3, "__builtin_vsx_xvtdivsp", VSX_BUILTIN_XVTDIVSP }, + + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_add", ALTIVEC_BUILTIN_VEC_ADD }, + { MASK_ALTIVEC|MASK_VSX, 
CODE_FOR_nothing, "__builtin_vec_vaddfp", ALTIVEC_BUILTIN_VEC_VADDFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vadduwm", ALTIVEC_BUILTIN_VEC_VADDUWM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vadduhm", ALTIVEC_BUILTIN_VEC_VADDUHM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddubm", ALTIVEC_BUILTIN_VEC_VADDUBM }, @@ -7418,8 +8124,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vadduhs", ALTIVEC_BUILTIN_VEC_VADDUHS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddsbs", ALTIVEC_BUILTIN_VEC_VADDSBS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vaddubs", ALTIVEC_BUILTIN_VEC_VADDUBS }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_and", ALTIVEC_BUILTIN_VEC_AND }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_andc", ALTIVEC_BUILTIN_VEC_ANDC }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_and", ALTIVEC_BUILTIN_VEC_AND }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_andc", ALTIVEC_BUILTIN_VEC_ANDC }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_avg", ALTIVEC_BUILTIN_VEC_AVG }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vavgsw", ALTIVEC_BUILTIN_VEC_VAVGSW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vavguw", ALTIVEC_BUILTIN_VEC_VAVGUW }, @@ -7444,8 +8150,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vcmpgtub", ALTIVEC_BUILTIN_VEC_VCMPGTUB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_cmple", ALTIVEC_BUILTIN_VEC_CMPLE }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_cmplt", ALTIVEC_BUILTIN_VEC_CMPLT }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_max", ALTIVEC_BUILTIN_VEC_MAX }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxfp", ALTIVEC_BUILTIN_VEC_VMAXFP }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_max", ALTIVEC_BUILTIN_VEC_MAX }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vmaxfp", ALTIVEC_BUILTIN_VEC_VMAXFP }, { MASK_ALTIVEC, 
CODE_FOR_nothing, "__builtin_vec_vmaxsw", ALTIVEC_BUILTIN_VEC_VMAXSW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxuw", ALTIVEC_BUILTIN_VEC_VMAXUW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmaxsh", ALTIVEC_BUILTIN_VEC_VMAXSH }, @@ -7460,8 +8166,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmrglw", ALTIVEC_BUILTIN_VEC_VMRGLW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmrglh", ALTIVEC_BUILTIN_VEC_VMRGLH }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmrglb", ALTIVEC_BUILTIN_VEC_VMRGLB }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_min", ALTIVEC_BUILTIN_VEC_MIN }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminfp", ALTIVEC_BUILTIN_VEC_VMINFP }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_min", ALTIVEC_BUILTIN_VEC_MIN }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vminfp", ALTIVEC_BUILTIN_VEC_VMINFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminsw", ALTIVEC_BUILTIN_VEC_VMINSW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminuw", ALTIVEC_BUILTIN_VEC_VMINUW }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vminsh", ALTIVEC_BUILTIN_VEC_VMINSH }, @@ -7478,8 +8184,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmulouh", ALTIVEC_BUILTIN_VEC_VMULOUH }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmulosb", ALTIVEC_BUILTIN_VEC_VMULOSB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vmuloub", ALTIVEC_BUILTIN_VEC_VMULOUB }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_nor", ALTIVEC_BUILTIN_VEC_NOR }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_or", ALTIVEC_BUILTIN_VEC_OR }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_nor", ALTIVEC_BUILTIN_VEC_NOR }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_or", ALTIVEC_BUILTIN_VEC_OR }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_pack", ALTIVEC_BUILTIN_VEC_PACK }, { MASK_ALTIVEC, 
CODE_FOR_nothing, "__builtin_vec_vpkuwum", ALTIVEC_BUILTIN_VEC_VPKUWUM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vpkuhum", ALTIVEC_BUILTIN_VEC_VPKUHUM }, @@ -7512,8 +8218,8 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsrab", ALTIVEC_BUILTIN_VEC_VSRAB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_srl", ALTIVEC_BUILTIN_VEC_SRL }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sro", ALTIVEC_BUILTIN_VEC_SRO }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sub", ALTIVEC_BUILTIN_VEC_SUB }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsubfp", ALTIVEC_BUILTIN_VEC_VSUBFP }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_sub", ALTIVEC_BUILTIN_VEC_SUB }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_vsubfp", ALTIVEC_BUILTIN_VEC_VSUBFP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsubuwm", ALTIVEC_BUILTIN_VEC_VSUBUWM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsubuhm", ALTIVEC_BUILTIN_VEC_VSUBUHM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsububm", ALTIVEC_BUILTIN_VEC_VSUBUBM }, @@ -7531,7 +8237,10 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vsum4ubs", ALTIVEC_BUILTIN_VEC_VSUM4UBS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sum2s", ALTIVEC_BUILTIN_VEC_SUM2S }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sums", ALTIVEC_BUILTIN_VEC_SUMS }, - { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_xor", ALTIVEC_BUILTIN_VEC_XOR }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_nothing, "__builtin_vec_xor", ALTIVEC_BUILTIN_VEC_XOR }, + + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_mul", VSX_BUILTIN_VEC_MUL }, + { MASK_VSX, CODE_FOR_nothing, "__builtin_vec_div", VSX_BUILTIN_VEC_DIV }, { 0, CODE_FOR_divv2sf3, "__builtin_paired_divv2sf3", PAIRED_BUILTIN_DIVV2SF3 }, { 0, CODE_FOR_addv2sf3, "__builtin_paired_addv2sf3", PAIRED_BUILTIN_ADDV2SF3 }, @@ -7776,7 +8485,11 @@ static const struct builtin_description { 
MASK_ALTIVEC, CODE_FOR_absv16qi2, "__builtin_altivec_abs_v16qi", ALTIVEC_BUILTIN_ABS_V16QI }, { MASK_ALTIVEC, CODE_FOR_altivec_abss_v4si, "__builtin_altivec_abss_v4si", ALTIVEC_BUILTIN_ABSS_V4SI }, { MASK_ALTIVEC, CODE_FOR_altivec_abss_v8hi, "__builtin_altivec_abss_v8hi", ALTIVEC_BUILTIN_ABSS_V8HI }, - { MASK_ALTIVEC, CODE_FOR_altivec_abss_v16qi, "__builtin_altivec_abss_v16qi", ALTIVEC_BUILTIN_ABSS_V16QI } + { MASK_ALTIVEC, CODE_FOR_altivec_abss_v16qi, "__builtin_altivec_abss_v16qi", ALTIVEC_BUILTIN_ABSS_V16QI }, + { MASK_VSX, CODE_FOR_absv2df2, "__builtin_vsx_xvabsdp", VSX_BUILTIN_XVABSDP }, + { MASK_VSX, CODE_FOR_vsx_nabsv2df2, "__builtin_vsx_xvnabsdp", VSX_BUILTIN_XVNABSDP }, + { MASK_VSX, CODE_FOR_absv4sf2, "__builtin_vsx_xvabssp", VSX_BUILTIN_XVABSSP }, + { MASK_VSX, CODE_FOR_vsx_nabsv4sf2, "__builtin_vsx_xvnabssp", VSX_BUILTIN_XVNABSSP }, }; /* Simple unary operations: VECb = foo (unsigned literal) or VECb = @@ -7802,6 +8515,18 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_altivec_vupklpx, "__builtin_altivec_vupklpx", ALTIVEC_BUILTIN_VUPKLPX }, { MASK_ALTIVEC, CODE_FOR_altivec_vupklsh, "__builtin_altivec_vupklsh", ALTIVEC_BUILTIN_VUPKLSH }, + { MASK_VSX, CODE_FOR_negv2df2, "__builtin_vsx_xvnegdp", VSX_BUILTIN_XVNEGDP }, + { MASK_VSX, CODE_FOR_sqrtv2df2, "__builtin_vsx_xvsqrtdp", VSX_BUILTIN_XVSQRTDP }, + { MASK_VSX, CODE_FOR_vsx_rsqrtev2df2, "__builtin_vsx_xvrsqrtedp", VSX_BUILTIN_XVRSQRTEDP }, + { MASK_VSX, CODE_FOR_vsx_tsqrtv2df2, "__builtin_vsx_xvtsqrtdp", VSX_BUILTIN_XVTSQRTDP }, + { MASK_VSX, CODE_FOR_vsx_frev2df2, "__builtin_vsx_xvredp", VSX_BUILTIN_XVREDP }, + + { MASK_VSX, CODE_FOR_negv4sf2, "__builtin_vsx_xvnegsp", VSX_BUILTIN_XVNEGSP }, + { MASK_VSX, CODE_FOR_sqrtv4sf2, "__builtin_vsx_xvsqrtsp", VSX_BUILTIN_XVSQRTSP }, + { MASK_VSX, CODE_FOR_vsx_rsqrtev4sf2, "__builtin_vsx_xvrsqrtesp", VSX_BUILTIN_XVRSQRTESP }, + { MASK_VSX, CODE_FOR_vsx_tsqrtv4sf2, "__builtin_vsx_xvtsqrtsp", VSX_BUILTIN_XVTSQRTSP }, + { MASK_VSX, 
CODE_FOR_vsx_frev4sf2, "__builtin_vsx_xvresp", VSX_BUILTIN_XVRESP }, + { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abs", ALTIVEC_BUILTIN_VEC_ABS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_abss", ALTIVEC_BUILTIN_VEC_ABSS }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_ceil", ALTIVEC_BUILTIN_VEC_CEIL }, @@ -7822,6 +8547,20 @@ static struct builtin_description bdesc_ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vupklsh", ALTIVEC_BUILTIN_VEC_VUPKLSH }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vupklsb", ALTIVEC_BUILTIN_VEC_VUPKLSB }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_floatv4siv4sf2, "__builtin_vec_float_sisf", VECTOR_BUILTIN_FLOAT_V4SI_V4SF }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_unsigned_floatv4siv4sf2, "__builtin_vec_uns_float_sisf", VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vec_fix_sfsi", VECTOR_BUILTIN_FIX_V4SF_V4SI }, + { MASK_ALTIVEC|MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vec_fixuns_sfsi", VECTOR_BUILTIN_FIXUNS_V4SF_V4SI }, + + { MASK_VSX, CODE_FOR_floatv2div2df2, "__builtin_vsx_xvcvsxddp", VSX_BUILTIN_XVCVSXDDP }, + { MASK_VSX, CODE_FOR_unsigned_floatv2div2df2, "__builtin_vsx_xvcvuxddp", VSX_BUILTIN_XVCVUXDDP }, + { MASK_VSX, CODE_FOR_fix_truncv2dfv2di2, "__builtin_vsx_xvdpsxds", VSX_BUILTIN_XVCVDPSXDS }, + { MASK_VSX, CODE_FOR_fixuns_truncv2dfv2di2, "__builtin_vsx_xvdpuxds", VSX_BUILTIN_XVCVDPUXDS }, + { MASK_VSX, CODE_FOR_floatv4siv4sf2, "__builtin_vsx_xvcvsxwsp", VSX_BUILTIN_XVCVSXDSP }, + { MASK_VSX, CODE_FOR_unsigned_floatv4siv4sf2, "__builtin_vsx_xvcvuxwsp", VSX_BUILTIN_XVCVUXWSP }, + { MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vsx_xvspsxws", VSX_BUILTIN_XVCVSPSXWS }, + { MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vsx_xvspuxws", VSX_BUILTIN_XVCVSPUXWS }, + /* The SPE unary builtins must start with SPE_BUILTIN_EVABS and end with SPE_BUILTIN_EVSUBFUSIAAW. 
*/ { 0, CODE_FOR_spe_evabs, "__builtin_spe_evabs", SPE_BUILTIN_EVABS }, @@ -8378,16 +9117,16 @@ altivec_expand_ld_builtin (tree exp, rtx switch (fcode) { case ALTIVEC_BUILTIN_LD_INTERNAL_16qi: - icode = CODE_FOR_altivec_lvx_v16qi; + icode = CODE_FOR_vector_load_v16qi; break; case ALTIVEC_BUILTIN_LD_INTERNAL_8hi: - icode = CODE_FOR_altivec_lvx_v8hi; + icode = CODE_FOR_vector_load_v8hi; break; case ALTIVEC_BUILTIN_LD_INTERNAL_4si: - icode = CODE_FOR_altivec_lvx_v4si; + icode = CODE_FOR_vector_load_v4si; break; case ALTIVEC_BUILTIN_LD_INTERNAL_4sf: - icode = CODE_FOR_altivec_lvx_v4sf; + icode = CODE_FOR_vector_load_v4sf; break; default: *expandedp = false; @@ -8431,16 +9170,16 @@ altivec_expand_st_builtin (tree exp, rtx switch (fcode) { case ALTIVEC_BUILTIN_ST_INTERNAL_16qi: - icode = CODE_FOR_altivec_stvx_v16qi; + icode = CODE_FOR_vector_store_v16qi; break; case ALTIVEC_BUILTIN_ST_INTERNAL_8hi: - icode = CODE_FOR_altivec_stvx_v8hi; + icode = CODE_FOR_vector_store_v8hi; break; case ALTIVEC_BUILTIN_ST_INTERNAL_4si: - icode = CODE_FOR_altivec_stvx_v4si; + icode = CODE_FOR_vector_store_v4si; break; case ALTIVEC_BUILTIN_ST_INTERNAL_4sf: - icode = CODE_FOR_altivec_stvx_v4sf; + icode = CODE_FOR_vector_store_v4sf; break; default: *expandedp = false; @@ -8835,6 +9574,26 @@ altivec_expand_builtin (tree exp, rtx ta /* Expand the builtin in EXP and store the result in TARGET. Store true in *EXPANDEDP if we found a builtin to expand. */ static rtx +vsx_expand_builtin (tree exp, rtx target ATTRIBUTE_UNUSED, bool *expandedp) +{ + tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); + unsigned int fcode = DECL_FUNCTION_CODE (fndecl); + + if (fcode >= VSX_BUILTIN_OVERLOADED_FIRST + && fcode <= VSX_BUILTIN_OVERLOADED_LAST) + { + *expandedp = true; + error ("unresolved overload for vsx builtin %qF", fndecl); + return const0_rtx; + } + + *expandedp = false; + return NULL_RTX; +} + +/* Expand the builtin in EXP and store the result in TARGET. 
Store + true in *EXPANDEDP if we found a builtin to expand. */ +static rtx paired_expand_builtin (tree exp, rtx target, bool * expandedp) { tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); @@ -9346,6 +10105,13 @@ rs6000_expand_builtin (tree exp, rtx tar if (success) return ret; } + if (TARGET_VSX) + { + ret = vsx_expand_builtin (exp, target, &success); + + if (success) + return ret; + } if (TARGET_SPE) { ret = spe_expand_builtin (exp, target, &success); @@ -9361,7 +10127,7 @@ rs6000_expand_builtin (tree exp, rtx tar return ret; } - gcc_assert (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT); + gcc_assert (TARGET_ALTIVEC || TARGET_VSX || TARGET_SPE || TARGET_PAIRED_FLOAT); /* Handle simple unary operations. */ d = (struct builtin_description *) bdesc_1arg; @@ -9398,6 +10164,8 @@ rs6000_init_builtins (void) { V2SI_type_node = build_vector_type (intSI_type_node, 2); V2SF_type_node = build_vector_type (float_type_node, 2); + V2DI_type_node = build_vector_type (intDI_type_node, 2); + V2DF_type_node = build_vector_type (double_type_node, 2); V4HI_type_node = build_vector_type (intHI_type_node, 4); V4SI_type_node = build_vector_type (intSI_type_node, 4); V4SF_type_node = build_vector_type (float_type_node, 4); @@ -9430,7 +10198,10 @@ rs6000_init_builtins (void) uintHI_type_internal_node = unsigned_intHI_type_node; intSI_type_internal_node = intSI_type_node; uintSI_type_internal_node = unsigned_intSI_type_node; + intDI_type_internal_node = intDI_type_node; + uintDI_type_internal_node = unsigned_intDI_type_node; float_type_internal_node = float_type_node; + double_type_internal_node = double_type_node; void_type_internal_node = void_type_node; (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, @@ -9488,13 +10259,18 @@ rs6000_init_builtins (void) get_identifier ("__vector __pixel"), pixel_V8HI_type_node)); + if (TARGET_VSX) + (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL, + get_identifier ("__vector double"), + V2DF_type_node)); + if (TARGET_PAIRED_FLOAT) 
paired_init_builtins (); if (TARGET_SPE) spe_init_builtins (); if (TARGET_ALTIVEC) altivec_init_builtins (); - if (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT) + if (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT || TARGET_VSX) rs6000_common_init_builtins (); if (TARGET_PPC_GFXOPT) { @@ -9957,6 +10733,8 @@ altivec_init_builtins (void) = build_function_type_list (V16QI_type_node, V16QI_type_node, NULL_TREE); tree v4sf_ftype_v4sf = build_function_type_list (V4SF_type_node, V4SF_type_node, NULL_TREE); + tree v2df_ftype_v2df + = build_function_type_list (V2DF_type_node, V2DF_type_node, NULL_TREE); tree void_ftype_pcvoid_int_int = build_function_type_list (void_type_node, pcvoid_type_node, integer_type_node, @@ -10114,6 +10892,9 @@ altivec_init_builtins (void) case V4SFmode: type = v4sf_ftype_v4sf; break; + case V2DFmode: + type = v2df_ftype_v2df; + break; default: gcc_unreachable (); } @@ -10433,6 +11214,38 @@ rs6000_common_init_builtins (void) tree int_ftype_v8hi_v8hi = build_function_type_list (integer_type_node, V8HI_type_node, V8HI_type_node, NULL_TREE); + tree v2di_ftype_v2df + = build_function_type_list (V2DI_type_node, + V2DF_type_node, NULL_TREE); + tree v2df_ftype_v2df + = build_function_type_list (V2DF_type_node, + V2DF_type_node, NULL_TREE); + tree v2df_ftype_v2di + = build_function_type_list (V2DF_type_node, + V2DI_type_node, NULL_TREE); + tree v2df_ftype_v2df_v2df + = build_function_type_list (V2DF_type_node, + V2DF_type_node, V2DF_type_node, NULL_TREE); + tree v2df_ftype_v2df_v2df_v2df + = build_function_type_list (V2DF_type_node, + V2DF_type_node, V2DF_type_node, + V2DF_type_node, NULL_TREE); + tree v2di_ftype_v2di_v2di_v2di + = build_function_type_list (V2DI_type_node, + V2DI_type_node, V2DI_type_node, + V2DI_type_node, NULL_TREE); + tree v2df_ftype_v2df_v2df_v16qi + = build_function_type_list (V2DF_type_node, + V2DF_type_node, V2DF_type_node, + V16QI_type_node, NULL_TREE); + tree v2di_ftype_v2di_v2di_v16qi + = build_function_type_list 
(V2DI_type_node, + V2DI_type_node, V2DI_type_node, + V16QI_type_node, NULL_TREE); + tree v4sf_ftype_v4si + = build_function_type_list (V4SF_type_node, V4SI_type_node, NULL_TREE); + tree v4si_ftype_v4sf + = build_function_type_list (V4SI_type_node, V4SF_type_node, NULL_TREE); /* Add the simple ternary operators. */ d = bdesc_3arg; @@ -10469,6 +11282,12 @@ rs6000_common_init_builtins (void) case VOIDmode: type = opaque_ftype_opaque_opaque_opaque; break; + case V2DImode: + type = v2di_ftype_v2di_v2di_v2di; + break; + case V2DFmode: + type = v2df_ftype_v2df_v2df_v2df; + break; case V4SImode: type = v4si_ftype_v4si_v4si_v4si; break; @@ -10492,6 +11311,12 @@ rs6000_common_init_builtins (void) { switch (mode0) { + case V2DImode: + type = v2di_ftype_v2di_v2di_v16qi; + break; + case V2DFmode: + type = v2df_ftype_v2df_v2df_v16qi; + break; case V4SImode: type = v4si_ftype_v4si_v4si_v16qi; break; @@ -10577,6 +11402,9 @@ rs6000_common_init_builtins (void) case VOIDmode: type = opaque_ftype_opaque_opaque; break; + case V2DFmode: + type = v2df_ftype_v2df_v2df; + break; case V4SFmode: type = v4sf_ftype_v4sf_v4sf; break; @@ -10726,6 +11554,8 @@ rs6000_common_init_builtins (void) type = v16qi_ftype_int; else if (mode0 == VOIDmode && mode1 == VOIDmode) type = opaque_ftype_opaque; + else if (mode0 == V2DFmode && mode1 == V2DFmode) + type = v2df_ftype_v2df; else if (mode0 == V4SFmode && mode1 == V4SFmode) type = v4sf_ftype_v4sf; else if (mode0 == V8HImode && mode1 == V16QImode) @@ -10747,6 +11577,14 @@ rs6000_common_init_builtins (void) type = v2si_ftype_v2sf; else if (mode0 == V2SImode && mode1 == QImode) type = v2si_ftype_char; + else if (mode0 == V4SImode && mode1 == V4SFmode) + type = v4si_ftype_v4sf; + else if (mode0 == V4SFmode && mode1 == V4SImode) + type = v4sf_ftype_v4si; + else if (mode0 == V2DImode && mode1 == V2DFmode) + type = v2di_ftype_v2df; + else if (mode0 == V2DFmode && mode1 == V2DImode) + type = v2df_ftype_v2di; else gcc_unreachable (); @@ -11529,8 +12367,10 @@ rtx 
rs6000_secondary_memory_needed_rtx (enum machine_mode mode) { static bool eliminated = false; + rtx ret; + if (mode != SDmode) - return assign_stack_local (mode, GET_MODE_SIZE (mode), 0); + ret = assign_stack_local (mode, GET_MODE_SIZE (mode), 0); else { rtx mem = cfun->machine->sdmode_stack_slot; @@ -11542,8 +12382,21 @@ rs6000_secondary_memory_needed_rtx (enum cfun->machine->sdmode_stack_slot = mem; eliminated = true; } - return mem; + ret = mem; + } + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "rs6000_secondary_memory_needed_rtx, mode %s, rtx:\n", + GET_MODE_NAME (mode)); + if (!ret) + fprintf (stderr, "\tNULL_RTX\n"); + else + debug_rtx (ret); + fprintf (stderr, "\n"); } + + return ret; } static tree @@ -11577,6 +12430,282 @@ rs6000_check_sdmode (tree *tp, int *walk return NULL_TREE; } +enum reload_reg_type { + GPR_REGISTER_TYPE, + VECTOR_REGISTER_TYPE, + OTHER_REGISTER_TYPE +}; + +static enum reload_reg_type +rs6000_reload_register_type (enum reg_class rclass) +{ + switch (rclass) + { + case GENERAL_REGS: + case BASE_REGS: + return GPR_REGISTER_TYPE; + + case FLOAT_REGS: + case ALTIVEC_REGS: + case VSX_REGS: + return VECTOR_REGISTER_TYPE; + + default: + return OTHER_REGISTER_TYPE; + } +} + +/* Inform reload about cases where moving X with a mode MODE to a register in + RCLASS requires an extra scratch or intermediate register. Return the class + needed for the intermediate register. + + For VSX and Altivec, we may need a register to convert sp+offset into + reg+sp. */ + +static enum reg_class +rs6000_secondary_reload (bool in_p, + rtx x, + enum reg_class rclass, + enum machine_mode mode, + secondary_reload_info *sri) +{ + enum reg_class ret; + enum insn_code icode; + + /* Convert vector loads and stores into gprs to use an additional base + register. 
*/ + icode = rs6000_vector_reload[mode][in_p != false]; + if (icode != CODE_FOR_nothing) + { + ret = NO_REGS; + sri->icode = CODE_FOR_nothing; + sri->extra_cost = 0; + + if (GET_CODE (x) == MEM) + { + rtx addr = XEXP (x, 0); + + /* Loads to and stores from gprs can do reg+offset, and wouldn't need + an extra register in that case, but it would need an extra + register if the addressing is reg+reg or (reg+reg)&(-16). */ + if (rclass == GENERAL_REGS || rclass == BASE_REGS) + { + if (! rs6000_legitimate_offset_address_p (TImode, addr, true)) + { + sri->icode = icode; + /* account for splitting the loads, and converting the + address from reg+reg to reg. */ + sri->extra_cost = (((TARGET_64BIT) ? 3 : 5) + + ((GET_CODE (addr) == AND) ? 1 : 0)); + } + } + /* Loads to and stores from vector registers can only do reg+reg + addressing. Altivec registers can also do (reg+reg)&(-16). */ + else if (rclass == VSX_REGS || rclass == ALTIVEC_REGS + || rclass == FLOAT_REGS) + { + if (rclass != ALTIVEC_REGS + && GET_CODE (addr) == AND + && GET_CODE (XEXP (addr, 1)) == CONST_INT + && INTVAL (XEXP (addr, 1)) == -16 + && (legitimate_indirect_address_p (XEXP (addr, 0), true) + || legitimate_indexed_address_p (XEXP (addr, 0), true))) + { + sri->icode = icode; + sri->extra_cost = ((GET_CODE (XEXP (addr, 0)) == PLUS) + ? 2 : 1); + } + else if (!legitimate_indexed_address_p (addr, true) + && !legitimate_indirect_address_p (addr, true)) + { + sri->icode = icode; + sri->extra_cost = 1; + } + else + icode = CODE_FOR_nothing; + } + /* Any other loads, including to pseudo registers which haven't been + assigned to a register yet, default to require a scratch + register. 
*/ + else + { + sri->icode = icode; + sri->extra_cost = 2; + } + } + else + { + int regno = true_regnum (x); + + icode = CODE_FOR_nothing; + if (regno < 0 || regno >= FIRST_PSEUDO_REGISTER) + ret = default_secondary_reload (in_p, x, rclass, mode, sri); + else + { + enum reg_class xclass = REGNO_REG_CLASS (regno); + enum reload_reg_type rtype1 = rs6000_reload_register_type (rclass); + enum reload_reg_type rtype2 = rs6000_reload_register_type (xclass); + + /* If memory is needed, use default_secondary_reload to create the + stack slot. */ + if (rtype1 != rtype2 || rtype1 == OTHER_REGISTER_TYPE) + ret = default_secondary_reload (in_p, x, rclass, mode, sri); + else + ret = NO_REGS; + } + } + } + else + ret = default_secondary_reload (in_p, x, rclass, mode, sri); + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "rs6000_secondary_reload, return %s, in_p = %s, rclass = %s, " + "mode = %s", + reg_class_names[ret], + in_p ? "true" : "false", + reg_class_names[rclass], + GET_MODE_NAME (mode)); + + if (icode != CODE_FOR_nothing) + fprintf (stderr, ", reload func = %s, extra cost = %d\n", + insn_data[icode].name, sri->extra_cost); + else + fprintf (stderr, "\n"); + + debug_rtx (x); + fprintf (stderr, "\n"); + } + + return ret; +} + +/* Fixup reload addresses for Altivec or VSX loads/stores to change SP+offset + to SP+reg addressing. */ + +void +rs6000_secondary_reload_inner (rtx reg, rtx mem, rtx scratch, bool store_p) +{ + int regno = true_regnum (reg); + enum machine_mode mode = GET_MODE (reg); + enum reg_class rclass; + rtx addr; + rtx and_op2 = NULL_RTX; + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, "rs6000_secondary_reload_inner, type = %s\n", + store_p ? 
"store" : "load"); + fprintf (stderr, "reg:\n"); + debug_rtx (reg); + fprintf (stderr, "mem:\n"); + debug_rtx (mem); + fprintf (stderr, "scratch:\n"); + debug_rtx (scratch); + fprintf (stderr, "\n"); + } + + gcc_assert (regno >= 0 && regno < FIRST_PSEUDO_REGISTER); + gcc_assert (GET_CODE (mem) == MEM); + rclass = REGNO_REG_CLASS (regno); + addr = XEXP (mem, 0); + + switch (rclass) + { + /* Move reg+reg addresses into a scratch register for GPRs. */ + case GENERAL_REGS: + case BASE_REGS: + if (GET_CODE (addr) == AND) + { + and_op2 = XEXP (addr, 1); + addr = XEXP (addr, 0); + } + if (GET_CODE (addr) == PLUS + && (!rs6000_legitimate_offset_address_p (TImode, addr, true) + || and_op2 != NULL_RTX)) + { + if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST + || GET_CODE (addr) == CONST_INT) + rs6000_emit_move (scratch, addr, GET_MODE (addr)); + else + emit_insn (gen_rtx_SET (VOIDmode, scratch, addr)); + addr = scratch; + } + else if (GET_CODE (addr) == PRE_MODIFY + && REG_P (XEXP (addr, 0)) + && GET_CODE (XEXP (addr, 1)) == PLUS) + { + emit_insn (gen_rtx_SET (VOIDmode, XEXP (addr, 0), XEXP (addr, 1))); + addr = XEXP (addr, 0); + } + break; + + /* With float regs, we need to handle the AND ourselves, since we can't + use the Altivec instruction with an implicit AND -16. Allow scalar + loads to float registers to use reg+offset even if VSX. */ + case FLOAT_REGS: + case VSX_REGS: + if (GET_CODE (addr) == AND) + { + and_op2 = XEXP (addr, 1); + addr = XEXP (addr, 0); + } + /* fall through */ + + /* Move reg+offset addresses into a scratch register. 
*/ + case ALTIVEC_REGS: + if (!legitimate_indirect_address_p (addr, true) + && !legitimate_indexed_address_p (addr, true) + && (GET_CODE (addr) != PRE_MODIFY + || !legitimate_indexed_address_p (XEXP (addr, 1), true)) + && (rclass != FLOAT_REGS + || GET_MODE_SIZE (mode) != 8 + || and_op2 != NULL_RTX + || !rs6000_legitimate_offset_address_p (mode, addr, true))) + { + if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST + || GET_CODE (addr) == CONST_INT) + rs6000_emit_move (scratch, addr, GET_MODE (addr)); + else + emit_insn (gen_rtx_SET (VOIDmode, scratch, addr)); + addr = scratch; + } + break; + + default: + gcc_unreachable (); + } + + /* If the original address involved an AND -16 that is part of the Altivec + addresses, recreate the and now. */ + if (and_op2 != NULL_RTX) + { + rtx and_rtx = gen_rtx_SET (VOIDmode, + scratch, + gen_rtx_AND (Pmode, addr, and_op2)); + rtx cc_clobber = gen_rtx_CLOBBER (CCmode, gen_rtx_SCRATCH (CCmode)); + emit_insn (gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (2, and_rtx, cc_clobber))); + addr = scratch; + } + + /* Adjust the address if it changed. */ + if (addr != XEXP (mem, 0)) + { + mem = change_address (mem, mode, addr); + if (TARGET_DEBUG_ADDR) + fprintf (stderr, "rs6000_secondary_reload_inner, mem adjusted.\n"); + } + + /* Now create the move. */ + if (store_p) + emit_insn (gen_rtx_SET (VOIDmode, mem, reg)); + else + emit_insn (gen_rtx_SET (VOIDmode, reg, mem)); + + return; +} /* Allocate a 64-bit stack slot to be used for copying SDmode values through if this function has any SDmode references. */ @@ -11627,15 +12756,146 @@ rs6000_instantiate_decls (void) instantiate_decl_rtl (cfun->machine->sdmode_stack_slot); } +/* Given an rtx X being reloaded into a reg required to be + in class CLASS, return the class of reg to actually use. + In general this is just CLASS; but on some machines + in some cases it is preferable to use a more restrictive class. 
+ + On the RS/6000, we have to return NO_REGS when we want to reload a + floating-point CONST_DOUBLE to force it to be copied to memory. + + We also don't want to reload integer values into floating-point + registers if we can at all help it. In fact, this can + cause reload to die, if it tries to generate a reload of CTR + into a FP register and discovers it doesn't have the memory location + required. + + ??? Would it be a good idea to have reload do the converse, that is + try to reload floating modes into FP registers if possible? + */ + +enum reg_class +rs6000_preferred_reload_class (rtx x, enum reg_class rclass) +{ + enum machine_mode mode = GET_MODE (x); + enum reg_class ret; + + if (TARGET_VSX && VSX_VECTOR_MODE (mode) && x == CONST0_RTX (mode) + && VSX_REG_CLASS_P (rclass)) + ret = rclass; + + else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode) + && rclass == ALTIVEC_REGS && easy_vector_constant (x, mode)) + ret = rclass; + + else if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS)) + ret = NO_REGS; + + else if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS) + ret = GENERAL_REGS; + + /* For VSX, prefer the traditional registers unless the address involves AND + -16, where we prefer to use the Altivec register so we don't have to break + down the AND. */ + else if (rclass == VSX_REGS) + { + if (mode == DFmode) + ret = FLOAT_REGS; + + else if (altivec_indexed_or_indirect_operand (x, mode)) + ret = ALTIVEC_REGS; + + else if (ALTIVEC_VECTOR_MODE (mode)) + ret = ALTIVEC_REGS; + + else + ret = rclass; + } + else + ret = rclass; + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "rs6000_preferred_reload_class, return %s, rclass = %s, x:\n", + reg_class_names[ret], reg_class_names[rclass]); + debug_rtx (x); + fprintf (stderr, "\n"); + } + + return ret; +} + +/* If we are copying between FP or AltiVec registers and anything else, we need + a memory location. 
The exception is when we are targeting ppc64 and the + move to/from fpr to gpr instructions are available. Also, under VSX, you + can copy vector registers from the FP register set to the Altivec register + set and vice versa. */ + +bool +rs6000_secondary_memory_needed (enum reg_class class1, + enum reg_class class2, + enum machine_mode mode) +{ + bool ret; + + if (class1 == class2) + ret = false; + + /* Under VSX, there are 3 register classes that values could be in (VSX_REGS, + ALTIVEC_REGS, and FLOAT_REGS). We don't need to use memory to copy + between these classes. But we need memory for other things that can go in + FLOAT_REGS like SFmode. */ + else if (TARGET_VSX + && (VECTOR_MEM_VSX_P (mode) || VECTOR_UNIT_VSX_P (mode)) + && (class1 == VSX_REGS || class1 == ALTIVEC_REGS + || class1 == FLOAT_REGS)) + ret = (class2 != VSX_REGS && class2 != ALTIVEC_REGS + && class2 != FLOAT_REGS); + + else if (class1 == VSX_REGS || class2 == VSX_REGS) + ret = true; + + else if (class1 == FLOAT_REGS + && (!TARGET_MFPGPR || !TARGET_POWERPC64 + || ((mode != DFmode) + && (mode != DDmode) + && (mode != DImode)))) + ret = true; + + else if (class2 == FLOAT_REGS + && (!TARGET_MFPGPR || !TARGET_POWERPC64 + || ((mode != DFmode) + && (mode != DDmode) + && (mode != DImode)))) + ret = true; + + else if (class1 == ALTIVEC_REGS || class2 == ALTIVEC_REGS) + ret = true; + + else + ret = false; + + if (TARGET_DEBUG_ADDR) + fprintf (stderr, + "rs6000_secondary_memory_needed, return: %s, class1 = %s, " + "class2 = %s, mode = %s\n", + ret ? "true" : "false", reg_class_names[class1], + reg_class_names[class2], GET_MODE_NAME (mode)); + + return ret; +} + /* Return the register class of a scratch register needed to copy IN into or out of a register in RCLASS in MODE. If it can be done directly, NO_REGS is returned. 
*/ enum reg_class rs6000_secondary_reload_class (enum reg_class rclass, - enum machine_mode mode ATTRIBUTE_UNUSED, + enum machine_mode mode, rtx in) { + enum reg_class ret = NO_REGS; int regno; if (TARGET_ELF || (DEFAULT_ABI == ABI_DARWIN @@ -11656,51 +12916,112 @@ rs6000_secondary_reload_class (enum reg_ || GET_CODE (in) == HIGH || GET_CODE (in) == LABEL_REF || GET_CODE (in) == CONST)) - return BASE_REGS; + ret = BASE_REGS; } - if (GET_CODE (in) == REG) + if (ret == NO_REGS) { - regno = REGNO (in); - if (regno >= FIRST_PSEUDO_REGISTER) + if (GET_CODE (in) == REG) + { + regno = REGNO (in); + if (regno >= FIRST_PSEUDO_REGISTER) + { + regno = true_regnum (in); + if (regno >= FIRST_PSEUDO_REGISTER) + regno = -1; + } + } + else if (GET_CODE (in) == SUBREG) { regno = true_regnum (in); if (regno >= FIRST_PSEUDO_REGISTER) regno = -1; } - } - else if (GET_CODE (in) == SUBREG) - { - regno = true_regnum (in); - if (regno >= FIRST_PSEUDO_REGISTER) + else regno = -1; + + /* We can place anything into GENERAL_REGS and can put GENERAL_REGS + into anything. */ + if (rclass == GENERAL_REGS || rclass == BASE_REGS + || (regno >= 0 && INT_REGNO_P (regno))) + ret = NO_REGS; + + /* Constants, memory, and FP registers can go into FP registers. */ + else if ((regno == -1 || FP_REGNO_P (regno)) + && (rclass == FLOAT_REGS || rclass == NON_SPECIAL_REGS)) + ret = (mode != SDmode) ? NO_REGS : GENERAL_REGS; + + /* Memory, and FP/altivec registers can go into fp/altivec registers under + VSX. */ + else if (TARGET_VSX + && (regno == -1 || VSX_REGNO_P (regno)) + && VSX_REG_CLASS_P (rclass)) + ret = NO_REGS; + + /* Memory, and AltiVec registers can go into AltiVec registers. */ + else if ((regno == -1 || ALTIVEC_REGNO_P (regno)) + && rclass == ALTIVEC_REGS) + ret = NO_REGS; + + /* We can copy among the CR registers. */ + else if ((rclass == CR_REGS || rclass == CR0_REGS) + && regno >= 0 && CR_REGNO_P (regno)) + ret = NO_REGS; + + /* Otherwise, we need GENERAL_REGS. 
*/ + else + ret = GENERAL_REGS; + } + + if (TARGET_DEBUG_ADDR) + { + fprintf (stderr, + "rs6000_secondary_reload_class, return %s, rclass = %s, " + "mode = %s, input rtx:\n", + reg_class_names[ret], reg_class_names[rclass], + GET_MODE_NAME (mode)); + debug_rtx (in); + fprintf (stderr, "\n"); } - else - regno = -1; - /* We can place anything into GENERAL_REGS and can put GENERAL_REGS - into anything. */ - if (rclass == GENERAL_REGS || rclass == BASE_REGS - || (regno >= 0 && INT_REGNO_P (regno))) - return NO_REGS; - - /* Constants, memory, and FP registers can go into FP registers. */ - if ((regno == -1 || FP_REGNO_P (regno)) - && (rclass == FLOAT_REGS || rclass == NON_SPECIAL_REGS)) - return (mode != SDmode) ? NO_REGS : GENERAL_REGS; - - /* Memory, and AltiVec registers can go into AltiVec registers. */ - if ((regno == -1 || ALTIVEC_REGNO_P (regno)) - && rclass == ALTIVEC_REGS) - return NO_REGS; - - /* We can copy among the CR registers. */ - if ((rclass == CR_REGS || rclass == CR0_REGS) - && regno >= 0 && CR_REGNO_P (regno)) - return NO_REGS; + return ret; +} + +/* Return nonzero if for CLASS a mode change from FROM to TO is invalid. */ - /* Otherwise, we need GENERAL_REGS. */ - return GENERAL_REGS; +bool +rs6000_cannot_change_mode_class (enum machine_mode from, + enum machine_mode to, + enum reg_class rclass) +{ + bool ret = (GET_MODE_SIZE (from) != GET_MODE_SIZE (to) + ? 
((GET_MODE_SIZE (from) < 8 || GET_MODE_SIZE (to) < 8 + || TARGET_IEEEQUAD) + && reg_classes_intersect_p (FLOAT_REGS, rclass)) + : (((TARGET_E500_DOUBLE + && ((((to) == DFmode) + ((from) == DFmode)) == 1 + || (((to) == TFmode) + ((from) == TFmode)) == 1 + || (((to) == DDmode) + ((from) == DDmode)) == 1 + || (((to) == TDmode) + ((from) == TDmode)) == 1 + || (((to) == DImode) + ((from) == DImode)) == 1)) + || (TARGET_VSX + && (VSX_VECTOR_MODE (from) + VSX_VECTOR_MODE (to)) == 1) + || (TARGET_ALTIVEC + && (ALTIVEC_VECTOR_MODE (from) + + ALTIVEC_VECTOR_MODE (to)) == 1) + || (TARGET_SPE + && (SPE_VECTOR_MODE (from) + SPE_VECTOR_MODE (to)) == 1)) + && reg_classes_intersect_p (GENERAL_REGS, rclass))); + + if (TARGET_DEBUG_ADDR) + fprintf (stderr, + "rs6000_cannot_change_mode_class, return %s, from = %s, " + "to = %s, rclass = %s\n", + ret ? "true" : "false", + GET_MODE_NAME (from), GET_MODE_NAME (to), + reg_class_names[rclass]); + + return ret; } /* Given a comparison operation, return the bit number in CCR to test. We @@ -12432,6 +13753,26 @@ print_operand (FILE *file, rtx x, int co fprintf (file, "%d", i + 1); return; + case 'x': + /* X is a FPR or Altivec register used in a VSX context. */ + if (GET_CODE (x) != REG || !VSX_REGNO_P (REGNO (x))) + output_operand_lossage ("invalid %%x value"); + else + { + int reg = REGNO (x); + int vsx_reg = (FP_REGNO_P (reg) + ? reg - 32 + : reg - FIRST_ALTIVEC_REGNO + 32); + +#ifdef TARGET_REGNAMES + if (TARGET_REGNAMES) + fprintf (file, "%%vs%d", vsx_reg); + else +#endif + fprintf (file, "%d", vsx_reg); + } + return; + case 'X': if (GET_CODE (x) == MEM && (legitimate_indexed_address_p (XEXP (x, 0), 0) @@ -12544,13 +13885,16 @@ print_operand (FILE *file, rtx x, int co /* Fall through. Must be [reg+reg]. 
*/ } - if (TARGET_ALTIVEC + if (VECTOR_MEM_ALTIVEC_OR_VSX_P (GET_MODE (x)) && GET_CODE (tmp) == AND && GET_CODE (XEXP (tmp, 1)) == CONST_INT && INTVAL (XEXP (tmp, 1)) == -16) tmp = XEXP (tmp, 0); + else if (VECTOR_MEM_VSX_P (GET_MODE (x)) + && GET_CODE (tmp) == PRE_MODIFY) + tmp = XEXP (tmp, 1); if (GET_CODE (tmp) == REG) - fprintf (file, "0,%s", reg_names[REGNO (tmp)]); + fprintf (file, "%s,%s", reg_names[0], reg_names[REGNO (tmp)]); else { if (!GET_CODE (tmp) == PLUS @@ -13322,55 +14666,62 @@ output_e500_flip_gt_bit (rtx dst, rtx sr return string; } -/* Return insn index for the vector compare instruction for given CODE, - and DEST_MODE, OP_MODE. Return INSN_NOT_AVAILABLE if valid insn is - not available. */ +/* Return insn for VSX comparisons. */ -static int -get_vec_cmp_insn (enum rtx_code code, - enum machine_mode dest_mode, - enum machine_mode op_mode) +static rtx +rs6000_emit_vector_compare_vsx (enum rtx_code code, + rtx mask, + rtx op0, + rtx op1) { - if (!TARGET_ALTIVEC) - return INSN_NOT_AVAILABLE; - switch (code) { + default: + break; + case EQ: - if (dest_mode == V16QImode && op_mode == V16QImode) - return UNSPEC_VCMPEQUB; - if (dest_mode == V8HImode && op_mode == V8HImode) - return UNSPEC_VCMPEQUH; - if (dest_mode == V4SImode && op_mode == V4SImode) - return UNSPEC_VCMPEQUW; - if (dest_mode == V4SImode && op_mode == V4SFmode) - return UNSPEC_VCMPEQFP; + case GT: + case GE: + emit_insn (gen_rtx_SET (VOIDmode, + mask, + gen_rtx_fmt_ee (code, GET_MODE (mask), + op0, + op1))); + return mask; + } + + return NULL_RTX; +} + +/* Return insn for Altivec comparisons. 
*/ + +static rtx +rs6000_emit_vector_compare_altivec (enum rtx_code code, + rtx mask, + rtx op0, + rtx op1) +{ + switch (code) + { + default: break; + case GE: - if (dest_mode == V4SImode && op_mode == V4SFmode) - return UNSPEC_VCMPGEFP; + if (GET_MODE (mask) != V4SFmode) + return NULL_RTX; + /* fall through */ + case EQ: case GT: - if (dest_mode == V16QImode && op_mode == V16QImode) - return UNSPEC_VCMPGTSB; - if (dest_mode == V8HImode && op_mode == V8HImode) - return UNSPEC_VCMPGTSH; - if (dest_mode == V4SImode && op_mode == V4SImode) - return UNSPEC_VCMPGTSW; - if (dest_mode == V4SImode && op_mode == V4SFmode) - return UNSPEC_VCMPGTFP; - break; case GTU: - if (dest_mode == V16QImode && op_mode == V16QImode) - return UNSPEC_VCMPGTUB; - if (dest_mode == V8HImode && op_mode == V8HImode) - return UNSPEC_VCMPGTUH; - if (dest_mode == V4SImode && op_mode == V4SImode) - return UNSPEC_VCMPGTUW; - break; - default: - break; + emit_insn (gen_rtx_SET (VOIDmode, + mask, + gen_rtx_fmt_ee (code, GET_MODE (mask), + op0, + op1))); + return mask; } - return INSN_NOT_AVAILABLE; + + return NULL_RTX; } /* Emit vector compare for operands OP0 and OP1 using code RCODE. @@ -13381,129 +14732,111 @@ rs6000_emit_vector_compare (enum rtx_cod rtx op0, rtx op1, enum machine_mode dmode) { - int vec_cmp_insn; rtx mask; - enum machine_mode dest_mode; - enum machine_mode op_mode = GET_MODE (op1); + bool swap_operands = false; + bool try_again = false; - gcc_assert (TARGET_ALTIVEC); + gcc_assert (TARGET_ALTIVEC || TARGET_VSX); gcc_assert (GET_MODE (op0) == GET_MODE (op1)); - /* Floating point vector compare instructions uses destination V4SImode. - Move destination to appropriate mode later. */ - if (dmode == V4SFmode) - dest_mode = V4SImode; - else - dest_mode = dmode; + mask = gen_reg_rtx (dmode); - mask = gen_reg_rtx (dest_mode); - vec_cmp_insn = get_vec_cmp_insn (rcode, dest_mode, op_mode); + /* Try for VSX before Altivec. 
*/ + if (TARGET_VSX && VSX_VECTOR_MODE (dmode)) + { + rtx vsx = rs6000_emit_vector_compare_vsx (rcode, mask, op0, op1); + if (vsx) + return vsx; + } + else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dmode)) + { + rtx av = rs6000_emit_vector_compare_altivec (rcode, mask, op0, op1); + if (av) + return av; + } - if (vec_cmp_insn == INSN_NOT_AVAILABLE) + switch (rcode) { - bool swap_operands = false; - bool try_again = false; - switch (rcode) - { - case LT: - rcode = GT; - swap_operands = true; - try_again = true; - break; - case LTU: - rcode = GTU; - swap_operands = true; - try_again = true; - break; - case NE: - case UNLE: - case UNLT: - case UNGE: - case UNGT: - /* Invert condition and try again. - e.g., A != B becomes ~(A==B). */ - { - enum rtx_code rev_code; - enum insn_code nor_code; - rtx eq_rtx; - - rev_code = reverse_condition_maybe_unordered (rcode); - eq_rtx = rs6000_emit_vector_compare (rev_code, op0, op1, - dest_mode); - - nor_code = optab_handler (one_cmpl_optab, (int)dest_mode)->insn_code; - gcc_assert (nor_code != CODE_FOR_nothing); - emit_insn (GEN_FCN (nor_code) (mask, eq_rtx)); + case LT: + rcode = GT; + swap_operands = true; + try_again = true; + break; + case LTU: + rcode = GTU; + swap_operands = true; + try_again = true; + break; + case NE: + case UNLE: + case UNLT: + case UNGE: + case UNGT: + /* Invert condition and try again. + e.g., A != B becomes ~(A==B). 
*/ + { + enum rtx_code rev_code; + enum insn_code nor_code; + rtx eq_rtx; + + rev_code = reverse_condition_maybe_unordered (rcode); + eq_rtx = rs6000_emit_vector_compare (rev_code, op0, op1, dmode); + + nor_code = optab_handler (one_cmpl_optab, (int)dmode)->insn_code; + gcc_assert (nor_code != CODE_FOR_nothing); + emit_insn (GEN_FCN (nor_code) (mask, eq_rtx)); + return mask; + } + break; + case GE: + case GEU: + case LE: + case LEU: + /* Try GT/GTU/LT/LTU OR EQ */ + { + rtx c_rtx, eq_rtx; + enum insn_code ior_code; + enum rtx_code new_code; - if (dmode != dest_mode) - { - rtx temp = gen_reg_rtx (dest_mode); - convert_move (temp, mask, 0); - return temp; - } - return mask; - } - break; - case GE: - case GEU: - case LE: - case LEU: - /* Try GT/GTU/LT/LTU OR EQ */ + switch (rcode) { - rtx c_rtx, eq_rtx; - enum insn_code ior_code; - enum rtx_code new_code; - - switch (rcode) - { - case GE: - new_code = GT; - break; - - case GEU: - new_code = GTU; - break; + case GE: + new_code = GT; + break; - case LE: - new_code = LT; - break; + case GEU: + new_code = GTU; + break; - case LEU: - new_code = LTU; - break; + case LE: + new_code = LT; + break; - default: - gcc_unreachable (); - } + case LEU: + new_code = LTU; + break; - c_rtx = rs6000_emit_vector_compare (new_code, - op0, op1, dest_mode); - eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1, - dest_mode); - - ior_code = optab_handler (ior_optab, (int)dest_mode)->insn_code; - gcc_assert (ior_code != CODE_FOR_nothing); - emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx)); - if (dmode != dest_mode) - { - rtx temp = gen_reg_rtx (dest_mode); - convert_move (temp, mask, 0); - return temp; - } - return mask; + default: + gcc_unreachable (); } - break; - default: - gcc_unreachable (); - } - if (try_again) - { - vec_cmp_insn = get_vec_cmp_insn (rcode, dest_mode, op_mode); - /* You only get two chances. 
*/ - gcc_assert (vec_cmp_insn != INSN_NOT_AVAILABLE); - } + c_rtx = rs6000_emit_vector_compare (new_code, + op0, op1, dmode); + eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1, + dmode); + + ior_code = optab_handler (ior_optab, (int)dmode)->insn_code; + gcc_assert (ior_code != CODE_FOR_nothing); + emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx)); + return mask; + } + break; + default: + gcc_unreachable (); + } + if (try_again) + { if (swap_operands) { rtx tmp; @@ -13511,69 +14844,23 @@ rs6000_emit_vector_compare (enum rtx_cod op0 = op1; op1 = tmp; } - } - - emit_insn (gen_rtx_SET (VOIDmode, mask, - gen_rtx_UNSPEC (dest_mode, - gen_rtvec (2, op0, op1), - vec_cmp_insn))); - if (dmode != dest_mode) - { - rtx temp = gen_reg_rtx (dest_mode); - convert_move (temp, mask, 0); - return temp; - } - return mask; -} -/* Return vector select instruction for MODE. Return INSN_NOT_AVAILABLE, if - valid insn doesn exist for given mode. */ - -static int -get_vsel_insn (enum machine_mode mode) -{ - switch (mode) - { - case V4SImode: - return UNSPEC_VSEL4SI; - break; - case V4SFmode: - return UNSPEC_VSEL4SF; - break; - case V8HImode: - return UNSPEC_VSEL8HI; - break; - case V16QImode: - return UNSPEC_VSEL16QI; - break; - default: - return INSN_NOT_AVAILABLE; - break; + if (TARGET_VSX && VSX_VECTOR_MODE (dmode)) + { + rtx vsx = rs6000_emit_vector_compare_vsx (rcode, mask, op0, op1); + if (vsx) + return vsx; + } + else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dmode)) + { + rtx av = rs6000_emit_vector_compare_altivec (rcode, mask, op0, op1); + if (av) + return av; + } } - return INSN_NOT_AVAILABLE; -} - -/* Emit vector select insn where DEST is destination using - operands OP1, OP2 and MASK. 
*/ - -static void -rs6000_emit_vector_select (rtx dest, rtx op1, rtx op2, rtx mask) -{ - rtx t, temp; - enum machine_mode dest_mode = GET_MODE (dest); - int vsel_insn_index = get_vsel_insn (GET_MODE (dest)); - - temp = gen_reg_rtx (dest_mode); - /* For each vector element, select op1 when mask is 1 otherwise - select op2. */ - t = gen_rtx_SET (VOIDmode, temp, - gen_rtx_UNSPEC (dest_mode, - gen_rtvec (3, op2, op1, mask), - vsel_insn_index)); - emit_insn (t); - emit_move_insn (dest, temp); - return; + /* You only get two chances. */ + gcc_unreachable (); } /* Emit vector conditional expression. @@ -13588,15 +14875,29 @@ rs6000_emit_vector_cond_expr (rtx dest, enum rtx_code rcode = GET_CODE (cond); rtx mask; - if (!TARGET_ALTIVEC) + if (!TARGET_ALTIVEC && !TARGET_VSX) return 0; /* Get the vector mask for the given relational operations. */ mask = rs6000_emit_vector_compare (rcode, cc_op0, cc_op1, dest_mode); - rs6000_emit_vector_select (dest, op1, op2, mask); + if (!mask) + return 0; - return 1; + if ((TARGET_VSX && VSX_VECTOR_MOVE_MODE (dest_mode)) + || (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dest_mode))) + { + rtx cond2 = gen_rtx_fmt_ee (NE, VOIDmode, mask, const0_rtx); + emit_insn (gen_rtx_SET (VOIDmode, + dest, + gen_rtx_IF_THEN_ELSE (dest_mode, + cond2, + op1, + op2))); + return 1; + } + + return 0; } /* Emit a conditional move: move TRUE_COND to DEST if OP of the @@ -13792,8 +15093,8 @@ rs6000_emit_int_cmove (rtx dest, rtx op, { rtx condition_rtx, cr; - /* All isel implementations thus far are 32-bits. 
*/ - if (GET_MODE (rs6000_compare_op0) != SImode) + if (GET_MODE (rs6000_compare_op0) != SImode + && (!TARGET_POWERPC64 || GET_MODE (rs6000_compare_op0) != DImode)) return 0; /* We still have to do the compare, because isel doesn't do a @@ -13802,12 +15103,24 @@ rs6000_emit_int_cmove (rtx dest, rtx op, condition_rtx = rs6000_generate_compare (GET_CODE (op)); cr = XEXP (condition_rtx, 0); - if (GET_MODE (cr) == CCmode) - emit_insn (gen_isel_signed (dest, condition_rtx, - true_cond, false_cond, cr)); + if (GET_MODE (rs6000_compare_op0) == SImode) + { + if (GET_MODE (cr) == CCmode) + emit_insn (gen_isel_signed_si (dest, condition_rtx, + true_cond, false_cond, cr)); + else + emit_insn (gen_isel_unsigned_si (dest, condition_rtx, + true_cond, false_cond, cr)); + } else - emit_insn (gen_isel_unsigned (dest, condition_rtx, - true_cond, false_cond, cr)); + { + if (GET_MODE (cr) == CCmode) + emit_insn (gen_isel_signed_di (dest, condition_rtx, + true_cond, false_cond, cr)); + else + emit_insn (gen_isel_unsigned_di (dest, condition_rtx, + true_cond, false_cond, cr)); + } return 1; } @@ -13834,6 +15147,15 @@ rs6000_emit_minmax (rtx dest, enum rtx_c enum rtx_code c; rtx target; + /* VSX/altivec have direct min/max insns. */ + if ((code == SMAX || code == SMIN) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)) + { + emit_insn (gen_rtx_SET (VOIDmode, + dest, + gen_rtx_fmt_ee (code, mode, op0, op1))); + return; + } + if (code == SMAX || code == SMIN) c = GE; else @@ -15811,6 +17133,7 @@ emit_frame_save (rtx frame_reg, rtx fram /* Some cases that need register indexed addressing. 
*/ if ((TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode)) + || (TARGET_VSX && VSX_VECTOR_MODE (mode)) || (TARGET_E500_DOUBLE && mode == DFmode) || (TARGET_SPE_ABI && SPE_VECTOR_MODE (mode) @@ -18923,6 +20246,7 @@ rs6000_adjust_cost (rtx insn, rtx link, || rs6000_cpu_attr == CPU_PPC7450 || rs6000_cpu_attr == CPU_POWER4 || rs6000_cpu_attr == CPU_POWER5 + || (rs6000_cpu_attr == CPU_POWER7 && TARGET_POWER7_ADJUST_COST) || rs6000_cpu_attr == CPU_CELL) && recog_memoized (dep_insn) && (INSN_CODE (dep_insn) >= 0)) @@ -19162,6 +20486,35 @@ rs6000_adjust_cost (rtx insn, rtx link, return cost; } +/* Debug version of rs6000_adjust_cost. */ + +static int +rs6000_debug_adjust_cost (rtx insn, rtx link, rtx dep_insn, int cost) +{ + int ret = rs6000_adjust_cost (insn, link, dep_insn, cost); + + if (ret != cost) + { + const char *dep; + + switch (REG_NOTE_KIND (link)) + { + default: dep = "unknown dependency"; break; + case REG_DEP_TRUE: dep = "data dependency"; break; + case REG_DEP_OUTPUT: dep = "output dependency"; break; + case REG_DEP_ANTI: dep = "anti dependency"; break; + } + + fprintf (stderr, + "\nrs6000_adjust_cost, final cost = %d, orig cost = %d, " + "%s, insn:\n", ret, cost, dep); + + debug_rtx (insn); + } + + return ret; +} + /* The function returns a true if INSN is microcoded. Return false otherwise.
*/ @@ -19443,6 +20796,7 @@ rs6000_issue_rate (void) case CPU_POWER4: case CPU_POWER5: case CPU_POWER6: + case CPU_POWER7: return 5; default: return 1; @@ -20044,6 +21398,41 @@ insn_must_be_first_in_group (rtx insn) break; } break; + case PROCESSOR_POWER7: + type = get_attr_type (insn); + + switch (type) + { + case TYPE_CR_LOGICAL: + case TYPE_MFCR: + case TYPE_MFCRF: + case TYPE_MTCR: + case TYPE_IDIV: + case TYPE_LDIV: + case TYPE_COMPARE: + case TYPE_DELAYED_COMPARE: + case TYPE_VAR_DELAYED_COMPARE: + case TYPE_ISYNC: + case TYPE_LOAD_L: + case TYPE_STORE_C: + case TYPE_LOAD_U: + case TYPE_LOAD_UX: + case TYPE_LOAD_EXT: + case TYPE_LOAD_EXT_U: + case TYPE_LOAD_EXT_UX: + case TYPE_STORE_U: + case TYPE_STORE_UX: + case TYPE_FPLOAD_U: + case TYPE_FPLOAD_UX: + case TYPE_FPSTORE_U: + case TYPE_FPSTORE_UX: + case TYPE_MFJMPR: + case TYPE_MTJMPR: + return true; + default: + break; + } + break; default: break; } @@ -20105,6 +21494,23 @@ insn_must_be_last_in_group (rtx insn) break; } break; + case PROCESSOR_POWER7: + type = get_attr_type (insn); + + switch (type) + { + case TYPE_ISYNC: + case TYPE_SYNC: + case TYPE_LOAD_L: + case TYPE_STORE_C: + case TYPE_LOAD_EXT_U: + case TYPE_LOAD_EXT_UX: + case TYPE_STORE_UX: + return true; + default: + break; + } + break; default: break; } @@ -20677,8 +22083,8 @@ rs6000_handle_altivec_attribute (tree *n else if (type == long_long_unsigned_type_node || type == long_long_integer_type_node) error ("use of % in AltiVec types is invalid"); - else if (type == double_type_node) - error ("use of % in AltiVec types is invalid"); + else if (type == double_type_node && !TARGET_VSX) + error ("use of % in AltiVec types is invalid without -mvsx"); else if (type == long_double_type_node) error ("use of % in AltiVec types is invalid"); else if (type == boolean_type_node) @@ -20704,6 +22110,7 @@ rs6000_handle_altivec_attribute (tree *n result = (unsigned_p ? 
unsigned_V16QI_type_node : V16QI_type_node); break; case SFmode: result = V4SF_type_node; break; + case DFmode: result = V2DF_type_node; break; /* If the user says 'vector int bool', we may be handed the 'bool' attribute _before_ the 'vector' attribute, and so select the proper type in the 'b' case below. */ @@ -22242,6 +23649,43 @@ rs6000_rtx_costs (rtx x, int code, int o return false; } +/* Debug form of rs6000_rtx_costs that is selected if -mdebug=cost. */ + +static bool +rs6000_debug_rtx_costs (rtx x, int code, int outer_code, int *total, + bool speed) +{ + bool ret = rs6000_rtx_costs (x, code, outer_code, total, speed); + + fprintf (stderr, + "\nrs6000_rtx_costs, return = %s, code = %s, outer_code = %s, " + "total = %d, speed = %s, x:\n", + ret ? "complete" : "scan inner", + GET_RTX_NAME (code), + GET_RTX_NAME (outer_code), + *total, + speed ? "true" : "false"); + + debug_rtx (x); + + return ret; +} + +/* Debug form of ADDRESS_COST that is selected if -mdebug=cost. */ + +static int +rs6000_debug_address_cost (rtx x, bool speed) +{ + int ret = TARGET_ADDRESS_COST (x, speed); + + fprintf (stderr, "\nrs6000_address_cost, return = %d, speed = %s, x:\n", + ret, speed ? "true" : "false"); + debug_rtx (x); + + return ret; +} + + /* A C expression returning the cost of moving data from a register of class CLASS1 to one of CLASS2. */ @@ -22256,7 +23700,7 @@ rs6000_register_move_cost (enum machine_ if (! reg_classes_intersect_p (to, GENERAL_REGS)) from = to; - if (from == FLOAT_REGS || from == ALTIVEC_REGS) + if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS) return (rs6000_memory_move_cost (mode, from, 0) + rs6000_memory_move_cost (mode, GENERAL_REGS, 0)); @@ -22276,6 +23720,12 @@ rs6000_register_move_cost (enum machine_ return 2 * hard_regno_nregs[0][mode]; } + /* If we have VSX, we can easily move between FPR or Altivec registers.
*/ + else if (TARGET_VSX + && ((from == VSX_REGS || from == FLOAT_REGS || from == ALTIVEC_REGS) + || (to == VSX_REGS || to == FLOAT_REGS || to == ALTIVEC_REGS))) + return 2; + /* Moving between two similar registers is just one instruction. */ else if (reg_classes_intersect_p (to, from)) return (mode == TFmode || mode == TDmode) ? 4 : 2; @@ -22516,8 +23966,8 @@ rs6000_emit_swrsqrtsf (rtx dst, rtx src) emit_label (XEXP (label, 0)); } -/* Emit popcount intrinsic on TARGET_POPCNTB targets. DST is the - target, and SRC is the argument operand. */ +/* Emit popcount intrinsic on TARGET_POPCNTB (Power5) and TARGET_POPCNTD + (Power7) targets. DST is the target, and SRC is the argument operand. */ void rs6000_emit_popcount (rtx dst, rtx src) @@ -22525,6 +23975,16 @@ rs6000_emit_popcount (rtx dst, rtx src) enum machine_mode mode = GET_MODE (dst); rtx tmp1, tmp2; + /* Use the PPC ISA 2.06 popcnt{w,d} instruction if we can. */ + if (TARGET_POPCNTD) + { + if (mode == SImode) + emit_insn (gen_popcntwsi2 (dst, src)); + else + emit_insn (gen_popcntddi2 (dst, src)); + return; + } + tmp1 = gen_reg_rtx (mode); if (mode == SImode) @@ -22937,7 +24397,7 @@ rs6000_vector_mode_supported_p (enum mac if (TARGET_SPE && SPE_VECTOR_MODE (mode)) return true; - else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode)) + else if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)) return true; else --- gcc/config/rs6000/vsx.md (.../trunk) (revision 0) +++ gcc/config/rs6000/vsx.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -0,0 +1,1004 @@ +;; VSX patterns. +;; Copyright (C) 2009 +;; Free Software Foundation, Inc. +;; Contributed by Michael Meissner + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published +;; by the Free Software Foundation; either version 3, or (at your +;; option) any later version. 
+ +;; GCC is distributed in the hope that it will be useful, but WITHOUT +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public +;; License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . + +;; Iterator for both scalar and vector floating point types supported by VSX +(define_mode_iterator VSX_B [DF V4SF V2DF]) + +;; Iterator for vector floating point types supported by VSX +(define_mode_iterator VSX_F [V4SF V2DF]) + +;; Iterator for logical types supported by VSX +(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF TI]) + +;; Iterator for types for load/store with update +(define_mode_iterator VSX_U [V16QI V8HI V4SI V2DI V4SF V2DF TI DF]) + +;; Map into the appropriate load/store name based on the type +(define_mode_attr VSm [(V16QI "vw4") + (V8HI "vw4") + (V4SI "vw4") + (V4SF "vw4") + (V2DF "vd2") + (V2DI "vd2") + (DF "d") + (TI "vw4")]) + +;; Map into the appropriate suffix based on the type +(define_mode_attr VSs [(V16QI "sp") + (V8HI "sp") + (V4SI "sp") + (V4SF "sp") + (V2DF "dp") + (V2DI "dp") + (DF "dp") + (TI "sp")]) + +;; Map into the register class used +(define_mode_attr VSr [(V16QI "v") + (V8HI "v") + (V4SI "v") + (V4SF "wf") + (V2DI "wd") + (V2DF "wd") + (DF "ws") + (TI "wd")]) + +;; Map into the register class used for float<->int conversions +(define_mode_attr VSr2 [(V2DF "wd") + (V4SF "wf") + (DF "!f#r")]) + +(define_mode_attr VSr3 [(V2DF "wa") + (V4SF "wa") + (DF "!f#r")]) + +;; Same size integer type for floating point data +(define_mode_attr VSi [(V4SF "v4si") + (V2DF "v2di") + (DF "di")]) + +(define_mode_attr VSI [(V4SF "V4SI") + (V2DF "V2DI") + (DF "DI")]) + +;; Word size for same size conversion +(define_mode_attr VSc [(V4SF "w") + (V2DF "d") + (DF "d")]) + +;; Bitsize for DF load with update +(define_mode_attr VSbit [(SI "32") + (DI "64")]) + +;; Map 
into either s or v, depending on whether this is a scalar or vector +;; operation +(define_mode_attr VSv [(V16QI "v") + (V8HI "v") + (V4SI "v") + (V4SF "v") + (V2DI "v") + (V2DF "v") + (TI "v") + (DF "s")]) + +;; Appropriate type for add ops (and other simple FP ops) +(define_mode_attr VStype_simple [(V2DF "vecfloat") + (V4SF "vecfloat") + (DF "fp")]) + +(define_mode_attr VSfptype_simple [(V2DF "fp_addsub_d") + (V4SF "fp_addsub_s") + (DF "fp_addsub_d")]) + +;; Appropriate type for multiply ops +(define_mode_attr VStype_mul [(V2DF "vecfloat") + (V4SF "vecfloat") + (DF "dmul")]) + +(define_mode_attr VSfptype_mul [(V2DF "fp_mul_d") + (V4SF "fp_mul_s") + (DF "fp_mul_d")]) + +;; Appropriate type for divide ops. For now, just lump the vector divide with +;; the scalar divides +(define_mode_attr VStype_div [(V2DF "ddiv") + (V4SF "sdiv") + (DF "ddiv")]) + +(define_mode_attr VSfptype_div [(V2DF "fp_div_d") + (V4SF "fp_div_s") + (DF "fp_div_d")]) + +;; Appropriate type for sqrt ops. For now, just lump the vector sqrt with +;; the scalar sqrt +(define_mode_attr VStype_sqrt [(V2DF "dsqrt") + (V4SF "sdiv") + (DF "ddiv")]) + +(define_mode_attr VSfptype_sqrt [(V2DF "fp_sqrt_d") + (V4SF "fp_sqrt_s") + (DF "fp_sqrt_d")]) + +;; Appropriate type for load + update +(define_mode_attr VStype_load_update [(V16QI "vecload") + (V8HI "vecload") + (V4SI "vecload") + (V4SF "vecload") + (V2DI "vecload") + (V2DF "vecload") + (TI "vecload") + (DF "fpload")]) + +;; Appropriate type for store + update +(define_mode_attr VStype_store_update [(V16QI "vecstore") + (V8HI "vecstore") + (V4SI "vecstore") + (V4SF "vecstore") + (V2DI "vecstore") + (V2DF "vecstore") + (TI "vecstore") + (DF "fpstore")]) + +;; Constants for creating unspecs +(define_constants + [(UNSPEC_VSX_CONCAT_V2DF 500) + (UNSPEC_VSX_XVCVDPSP 501) + (UNSPEC_VSX_XVCVDPSXWS 502) + (UNSPEC_VSX_XVCVDPUXWS 503) + (UNSPEC_VSX_XVCVSPDP 504) + (UNSPEC_VSX_XVCVSXWDP 505) + (UNSPEC_VSX_XVCVUXWDP 506) + (UNSPEC_VSX_XVMADD 507) + (UNSPEC_VSX_XVMSUB 
508) + (UNSPEC_VSX_XVNMADD 509) + (UNSPEC_VSX_XVNMSUB 510) + (UNSPEC_VSX_XVRSQRTE 511) + (UNSPEC_VSX_XVTDIV 512) + (UNSPEC_VSX_XVTSQRT 513)]) + +;; VSX moves +(define_insn "*vsx_mov" + [(set (match_operand:VSX_L 0 "nonimmediate_operand" "=Z,,,?Z,?wa,?wa,*o,*r,*r,,?wa,v,wZ,v") + (match_operand:VSX_L 1 "input_operand" ",Z,,wa,Z,wa,r,o,r,j,j,W,v,wZ"))] + "VECTOR_MEM_VSX_P (mode) + && (register_operand (operands[0], mode) + || register_operand (operands[1], mode))" +{ + switch (which_alternative) + { + case 0: + case 3: + return "stx%U0x %x1,%y0"; + + case 1: + case 4: + return "lx%U0x %x0,%y1"; + + case 2: + case 5: + return "xxlor %x0,%x1,%x1"; + + case 6: + case 7: + case 8: + return "#"; + + case 9: + case 10: + return "xxlxor %x0,%x0,%x0"; + + case 11: + return output_vec_const_move (operands); + + case 12: + return "stvx %1,%y0"; + + case 13: + return "lvx %0,%y1"; + + default: + gcc_unreachable (); + } +} + [(set_attr "type" "vecstore,vecload,vecsimple,vecstore,vecload,vecsimple,*,*,*,vecsimple,vecsimple,*,vecstore,vecload")]) + +;; Load/store with update +;; Define insns that do load or store with update. Because VSX only has +;; reg+reg addressing, pre-decrement or pre-increment is unlikely to be +;; generated. +;; +;; In all these cases, we use operands 0 and 1 for the register being +;; incremented because those are the operands that local-alloc will +;; tie and these are the pair most likely to be tieable (and the ones +;; that will benefit the most).
+ +(define_insn "*vsx_load_update_" + [(set (match_operand:VSX_U 3 "vsx_register_operand" "=,?wa") + (mem:VSX_U (plus:P (match_operand:P 1 "gpc_reg_operand" "0,0") + (match_operand:P 2 "gpc_reg_operand" "r,r")))) + (set (match_operand:P 0 "gpc_reg_operand" "=b,b") + (plus:P (match_dup 1) + (match_dup 2)))] + " && TARGET_UPDATE && VECTOR_MEM_VSX_P (mode)" + "lxux %x3,%0,%2" + [(set_attr "type" "")]) + +(define_insn "*vsx_store_update_" + [(set (mem:VSX_U (plus:P (match_operand:P 1 "gpc_reg_operand" "0,0") + (match_operand:P 2 "gpc_reg_operand" "r,r"))) + (match_operand:VSX_U 3 "gpc_reg_operand" ",?wa")) + (set (match_operand:P 0 "gpc_reg_operand" "=b,b") + (plus:P (match_dup 1) + (match_dup 2)))] + " && TARGET_UPDATE && VECTOR_MEM_VSX_P (mode)" + "stxux %x3,%0,%2" + [(set_attr "type" "")]) + +;; We may need to have a variant of the pattern for use in the prologue +;; that doesn't depend on TARGET_UPDATE. + + +;; VSX scalar and vector floating point arithmetic instructions +(define_insn "*vsx_add3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (plus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xadd %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_sub3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (minus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xsub %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_mul3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (mult:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xmul %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_div3" + [(set (match_operand:VSX_B 0
"vsx_register_operand" "=,?wa") + (div:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xdiv %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_tdiv3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")] + UNSPEC_VSX_XVTDIV))] + "VECTOR_UNIT_VSX_P (mode)" + "xtdiv %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fre2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",wa")] + UNSPEC_FRES))] + "VECTOR_UNIT_VSX_P (mode)" + "xre %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_neg2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (neg:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xneg %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_abs2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (abs:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xabs %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_nabs2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (neg:VSX_B + (abs:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" ",wa"))))] + "VECTOR_UNIT_VSX_P (mode)" + "xnabs %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_smax3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (smax:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xmax %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + 
+(define_insn "*vsx_smin3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (smin:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xmin %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_sqrt2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (sqrt:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xsqrt %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_rsqrte2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",wa")] + UNSPEC_VSX_XVRSQRTE))] + "VECTOR_UNIT_VSX_P (mode)" + "xrsqrte %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_tsqrt2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",wa")] + UNSPEC_VSX_XVTSQRT))] + "VECTOR_UNIT_VSX_P (mode)" + "xtsqrt %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +;; Fused vector multiply/add instructions + +;; Note we have a pattern for the multiply/add operations that uses unspec and +;; does not check -mfused-madd to allow users to use these ops when they know +;; they want the fused multiply/add. 
+ +(define_expand "vsx_fmadd4" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "") + (plus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "")) + (match_operand:VSX_B 3 "vsx_register_operand" "")))] + "VECTOR_UNIT_VSX_P (mode)" +{ + if (!TARGET_FUSED_MADD) + { + emit_insn (gen_vsx_fmadd4_2 (operands[0], operands[1], operands[2], + operands[3])); + DONE; + } +}) + +(define_insn "*vsx_fmadd4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (plus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] + "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD" + "@ + xmadda %x0,%x1,%x2 + xmaddm %x0,%x1,%x3 + xmadda %x0,%x1,%x2 + xmaddm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fmadd4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")] + UNSPEC_VSX_XVMADD))] + "VECTOR_UNIT_VSX_P (mode)" + "@ + xmadda %x0,%x1,%x2 + xmaddm %x0,%x1,%x3 + xmadda %x0,%x1,%x2 + xmaddm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_expand "vsx_fmsub4" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "") + (minus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "")) + (match_operand:VSX_B 3 "vsx_register_operand" "")))] + "VECTOR_UNIT_VSX_P (mode)" +{ + if (!TARGET_FUSED_MADD) + { + emit_insn (gen_vsx_fmsub4_2 (operands[0], operands[1], operands[2], + operands[3])); + DONE; + } +}) + +(define_insn "*vsx_fmsub4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (minus:VSX_B + 
(mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] + "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD" + "@ + xmsuba %x0,%x1,%x2 + xmsubm %x0,%x1,%x3 + xmsuba %x0,%x1,%x2 + xmsubm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fmsub4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")] + UNSPEC_VSX_XVMSUB))] + "VECTOR_UNIT_VSX_P (mode)" + "@ + xmsuba %x0,%x1,%x2 + xmsubm %x0,%x1,%x3 + xmsuba %x0,%x1,%x2 + xmsubm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_expand "vsx_fnmadd4" + [(match_operand:VSX_B 0 "vsx_register_operand" "") + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "") + (match_operand:VSX_B 3 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (mode)" +{ + if (TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmadd4_1 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else if (TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmadd4_2 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else + { + emit_insn (gen_vsx_fnmadd4_3 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } +}) + +(define_insn "vsx_fnmadd4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (neg:VSX_B + (plus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" ",,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa"))))] + "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD + && HONOR_SIGNED_ZEROS (DFmode)" + "@ 
+ xnmadda %x0,%x1,%x2 + xnmaddm %x0,%x1,%x3 + xnmadda %x0,%x1,%x2 + xnmaddm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fnmadd4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (minus:VSX_B + (mult:VSX_B + (neg:VSX_B + (match_operand:VSX_B 1 "gpc_reg_operand" ",,wa,wa")) + (match_operand:VSX_B 2 "gpc_reg_operand" ",0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] + "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD + && !HONOR_SIGNED_ZEROS (DFmode)" + "@ + xnmadda %x0,%x1,%x2 + xnmaddm %x0,%x1,%x3 + xnmadda %x0,%x1,%x2 + xnmaddm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fnmadd4_3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")] + UNSPEC_VSX_XVNMADD))] + "VECTOR_UNIT_VSX_P (mode)" + "@ + xnmadda %x0,%x1,%x2 + xnmaddm %x0,%x1,%x3 + xnmadda %x0,%x1,%x2 + xnmaddm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_expand "vsx_fnmsub4" + [(match_operand:VSX_B 0 "vsx_register_operand" "") + (match_operand:VSX_B 1 "vsx_register_operand" "") + (match_operand:VSX_B 2 "vsx_register_operand" "") + (match_operand:VSX_B 3 "vsx_register_operand" "")] + "VECTOR_UNIT_VSX_P (mode)" +{ + if (TARGET_FUSED_MADD && HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmsub4_1 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else if (TARGET_FUSED_MADD && !HONOR_SIGNED_ZEROS (DFmode)) + { + emit_insn (gen_vsx_fnmsub4_2 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } + else + { + emit_insn (gen_vsx_fnmsub4_3 (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } +}) + +(define_insn "vsx_fnmsub4_1" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + 
(neg:VSX_B + (minus:VSX_B + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa"))))] + "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD + && HONOR_SIGNED_ZEROS (DFmode)" + "@ + xnmsuba %x0,%x1,%x2 + xnmsubm %x0,%x1,%x3 + xnmsuba %x0,%x1,%x2 + xnmsubm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fnmsub4_2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (minus:VSX_B + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa") + (mult:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0"))))] + "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD + && !HONOR_SIGNED_ZEROS (DFmode)" + "@ + xnmsuba %x0,%x1,%x2 + xnmsubm %x0,%x1,%x3 + xnmsuba %x0,%x1,%x2 + xnmsubm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fnmsub4_3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")] + UNSPEC_VSX_XVNMSUB))] + "VECTOR_UNIT_VSX_P (mode)" + "@ + xnmsuba %x0,%x1,%x2 + xnmsubm %x0,%x1,%x3 + xnmsuba %x0,%x1,%x2 + xnmsubm %x0,%x1,%x3" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +;; Vector conditional expressions (no scalar version for these instructions) +(define_insn "vsx_eq" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=,?wa") + (eq:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" ",wa") + (match_operand:VSX_F 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xvcmpeq %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_gt" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=,?wa") + (gt:VSX_F 
(match_operand:VSX_F 1 "vsx_register_operand" ",wa") + (match_operand:VSX_F 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xvcmpgt %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "*vsx_ge" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=,?wa") + (ge:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" ",wa") + (match_operand:VSX_F 2 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xvcmpge %x0,%x1,%x2" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_vsel" + [(set (match_operand:VSX_F 0 "vsx_register_operand" "=,?wa") + (if_then_else:VSX_F (ne (match_operand:VSX_F 1 "vsx_register_operand" ",wa") + (const_int 0)) + (match_operand:VSX_F 2 "vsx_register_operand" ",wa") + (match_operand:VSX_F 3 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xxsel %x0,%x3,%x2,%x1" + [(set_attr "type" "vecperm")]) + +;; Copy sign +(define_insn "vsx_copysign3" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (if_then_else:VSX_B + (ge:VSX_B (match_operand:VSX_B 2 "vsx_register_operand" ",wa") + (const_int 0)) + (abs:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa")) + (neg:VSX_B (abs:VSX_B (match_dup 1)))))] + "VECTOR_UNIT_VSX_P (mode)" + "xcpsgn %x0,%x2,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +;; For the conversions, limit the register class for the integer value to be +;; the fprs because we don't want to add the altivec registers to movdi/movsi. +;; For the unsigned tests, there isn't a generic double -> unsigned conversion +;; in rs6000.md so don't test VECTOR_UNIT_VSX_P, just test against VSX. 
+(define_insn "vsx_ftrunc2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (fix:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xrpiz %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_float2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (float:VSX_B (match_operand: 1 "vsx_register_operand" ",")))] + "VECTOR_UNIT_VSX_P (mode)" + "xcvsx %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_floatuns2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unsigned_float:VSX_B (match_operand: 1 "vsx_register_operand" ",")))] + "VECTOR_UNIT_VSX_P (mode)" + "xcvux %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fix_trunc2" + [(set (match_operand: 0 "vsx_register_operand" "=,?") + (fix: (match_operand:VSX_B 1 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xcvsxs %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_fixuns_trunc2" + [(set (match_operand: 0 "vsx_register_operand" "=,?") + (unsigned_fix: (match_operand:VSX_B 1 "vsx_register_operand" ",wa")))] + "VECTOR_UNIT_VSX_P (mode)" + "xcvuxs %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +;; Math rounding functions +(define_insn "vsx_btrunc2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",wa")] + UNSPEC_FRIZ))] + "VECTOR_UNIT_VSX_P (mode)" + "xriz %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_floor2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" ",wa")] + UNSPEC_FRIM))] + "VECTOR_UNIT_VSX_P (mode)" + "xrim %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + +(define_insn "vsx_ceil2" + [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa") + (unspec:VSX_B 
[(match_operand:VSX_B 1 "vsx_register_operand" ",wa")] + UNSPEC_FRIP))] + "VECTOR_UNIT_VSX_P (mode)" + "xrip %x0,%x1" + [(set_attr "type" "") + (set_attr "fp_type" "")]) + + +;; VSX convert to/from double vector + +;; Convert from 64-bit to 32-bit types +;; Note, favor the Altivec registers since the usual use of these instructions +;; is in vector converts and we need to use the Altivec vperm instruction. + +(define_insn "vsx_xvcvdpsp" + [(set (match_operand:V4SF 0 "vsx_register_operand" "=v,?wa") + (unspec:V4SF [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_XVCVDPSP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvdpsp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvdpsxws" + [(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa") + (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_XVCVDPSXWS))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvdpsxws %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvdpuxws" + [(set (match_operand:V4SI 0 "vsx_register_operand" "=v,?wa") + (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wd,wa")] + UNSPEC_VSX_XVCVDPUXWS))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvdpuxws %x0,%x1" + [(set_attr "type" "vecfloat")]) + +;; Convert from 32-bit to 64-bit types +(define_insn "vsx_xvcvspdp" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF [(match_operand:V4SF 1 "vsx_register_operand" "wf,wa")] + UNSPEC_VSX_XVCVSPDP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvspdp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvsxwdp" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")] + UNSPEC_VSX_XVCVSXWDP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvsxwdp %x0,%x1" + [(set_attr "type" "vecfloat")]) + +(define_insn "vsx_xvcvuxwdp" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF [(match_operand:V4SI 1 
"vsx_register_operand" "wf,wa")] + UNSPEC_VSX_XVCVUXWDP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvuxwdp %x0,%x1" + [(set_attr "type" "vecfloat")]) + + +;; Logical and permute operations +(define_insn "*vsx_and3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=,?wa") + (and:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" ",?wa") + (match_operand:VSX_L 2 "vsx_register_operand" ",?wa")))] + "VECTOR_MEM_VSX_P (mode)" + "xxland %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_ior3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=,?wa") + (ior:VSX_L (match_operand:VSX_L 1 "vsx_register_operand" ",?wa") + (match_operand:VSX_L 2 "vsx_register_operand" ",?wa")))] + "VECTOR_MEM_VSX_P (mode)" + "xxlor %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_xor3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=,?wa") + (xor:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" ",?wa") + (match_operand:VSX_L 2 "vsx_register_operand" ",?wa")))] + "VECTOR_MEM_VSX_P (mode)" + "xxlxor %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_one_cmpl2" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=,?wa") + (not:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" ",?wa")))] + "VECTOR_MEM_VSX_P (mode)" + "xxlnor %x0,%x1,%x1" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_nor3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=,?wa") + (not:VSX_L + (ior:VSX_L + (match_operand:VSX_L 1 "vsx_register_operand" ",?wa") + (match_operand:VSX_L 2 "vsx_register_operand" ",?wa"))))] + "VECTOR_MEM_VSX_P (mode)" + "xxlnor %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + +(define_insn "*vsx_andc3" + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=,?wa") + (and:VSX_L + (not:VSX_L + (match_operand:VSX_L 2 "vsx_register_operand" ",?wa")) + (match_operand:VSX_L 1 "vsx_register_operand" ",?wa")))] + "VECTOR_MEM_VSX_P (mode)" + "xxlandc %x0,%x1,%x2" + [(set_attr "type" "vecsimple")]) + + +;; 
Permute operations + +(define_insn "vsx_concat_v2df" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (unspec:V2DF + [(match_operand:DF 1 "vsx_register_operand" "ws,wa") + (match_operand:DF 2 "vsx_register_operand" "ws,wa")] + UNSPEC_VSX_CONCAT_V2DF))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xxpermdi %x0,%x1,%x2,0" + [(set_attr "type" "vecperm")]) + +;; Set a double into one element +(define_insn "vsx_set_v2df" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") + (vec_merge:V2DF + (match_operand:V2DF 1 "vsx_register_operand" "wd,wa") + (vec_duplicate:V2DF (match_operand:DF 2 "vsx_register_operand" "ws,f")) + (match_operand:QI 3 "u5bit_cint_operand" "i,i")))] + "VECTOR_UNIT_VSX_P (V2DFmode)" +{ + if (INTVAL (operands[3]) == 0) + return \"xxpermdi %x0,%x1,%x2,1\"; + else if (INTVAL (operands[3]) == 1) + return \"xxpermdi %x0,%x2,%x1,0\"; + else + gcc_unreachable (); +} + [(set_attr "type" "vecperm")]) + +;; Extract a DF element from V2DF +(define_insn "vsx_extract_v2df" + [(set (match_operand:DF 0 "vsx_register_operand" "=ws,f,?wa") + (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd,wd,wa") + (parallel + [(match_operand:QI 2 "u5bit_cint_operand" "i,i,i")])))] + "VECTOR_UNIT_VSX_P (V2DFmode)" +{ + gcc_assert (UINTVAL (operands[2]) <= 1); + operands[3] = GEN_INT (INTVAL (operands[2]) << 1); + return \"xxpermdi %x0,%x1,%x1,%3\"; +} + [(set_attr "type" "vecperm")]) + +;; General V2DF permute, extract_{high,low,even,odd} +(define_insn "vsx_xxpermdi" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd") + (vec_concat:V2DF + (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd") + (parallel + [(match_operand:QI 2 "u5bit_cint_operand" "i")])) + (vec_select:DF (match_operand:V2DF 3 "vsx_register_operand" "wd") + (parallel + [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))] + "VECTOR_UNIT_VSX_P (V2DFmode)" +{ + gcc_assert ((UINTVAL (operands[2]) <= 1) && (UINTVAL (operands[4]) <= 1)); + operands[5] = GEN_INT 
(((INTVAL (operands[2]) & 1) << 1) + | (INTVAL (operands[4]) & 1)); + return \"xxpermdi %x0,%x1,%x3,%5\"; +} + [(set_attr "type" "vecperm")]) + +;; V2DF splat +(define_insn "vsx_splatv2df" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,wd,wd,?wa,?wa,?wa") + (vec_duplicate:V2DF + (match_operand:DF 1 "input_operand" "ws,f,Z,wa,wa,Z")))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "@ + xxpermdi %x0,%x1,%x1,0 + xxpermdi %x0,%x1,%x1,0 + lxvdsx %x0,%y1 + xxpermdi %x0,%x1,%x1,0 + xxpermdi %x0,%x1,%x1,0 + lxvdsx %x0,%y1" + [(set_attr "type" "vecperm,vecperm,vecload,vecperm,vecperm,vecload")]) + +;; V4SF splat +(define_insn "*vsx_xxspltw" + [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,?wa") + (vec_duplicate:V4SF + (vec_select:SF (match_operand:V4SF 1 "vsx_register_operand" "wf,wa") + (parallel + [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))] + "VECTOR_UNIT_VSX_P (V4SFmode)" + "xxspltw %x0,%x1,%2" + [(set_attr "type" "vecperm")]) + +;; V4SF interleave +(define_insn "vsx_xxmrghw" + [(set (match_operand:V4SF 0 "register_operand" "=wf,?wa") + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "vsx_register_operand" "wf,wa") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (vec_select:V4SF (match_operand:V4SF 2 "vsx_register_operand" "wf,wa") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (const_int 5)))] + "VECTOR_UNIT_VSX_P (V4SFmode)" + "xxmrghw %x0,%x1,%x2" + [(set_attr "type" "vecperm")]) + +(define_insn "vsx_xxmrglw" + [(set (match_operand:V4SF 0 "register_operand" "=wf,?wa") + (vec_merge:V4SF + (vec_select:V4SF + (match_operand:V4SF 1 "register_operand" "wf,wa") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SF + (match_operand:V4SF 2 "register_operand" "wf,?wa") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (const_int 5)))] + "VECTOR_UNIT_VSX_P (V4SFmode)" + "xxmrglw %x0,%x1,%x2" + [(set_attr "type" 
"vecperm")]) --- gcc/config/rs6000/rs6000.h (.../trunk) (revision 145777) +++ gcc/config/rs6000/rs6000.h (.../branches/ibm/power7-meissner) (revision 146027) @@ -72,14 +72,16 @@ #define ASM_CPU_POWER6_SPEC "-mpower4 -maltivec" #endif -#ifdef HAVE_AS_VSX +#ifdef HAVE_AS_POPCNTD #define ASM_CPU_POWER7_SPEC "-mpower7" #else #define ASM_CPU_POWER7_SPEC "-mpower4 -maltivec" #endif -/* Common ASM definitions used by ASM_SPEC among the various targets - for handling -mcpu=xxx switches. */ +/* Common ASM definitions used by ASM_SPEC among the various targets for + handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to + provide the default assembler options if the user uses -mcpu=native, so if + you make changes here, make them also there. */ #define ASM_CPU_SPEC \ "%{!mcpu*: \ %{mpower: %{!mpower2: -mpwr}} \ @@ -88,6 +90,7 @@ %{!mpowerpc64*: %{mpowerpc*: -mppc}} \ %{mno-power: %{!mpowerpc*: -mcom}} \ %{!mno-power: %{!mpower*: %(asm_default)}}} \ +%{mcpu=native: %(asm_cpu_native)} \ %{mcpu=common: -mcom} \ %{mcpu=cell: -mcell} \ %{mcpu=power: -mpwr} \ @@ -163,6 +166,7 @@ #define EXTRA_SPECS \ { "cpp_default", CPP_DEFAULT_SPEC }, \ { "asm_cpu", ASM_CPU_SPEC }, \ + { "asm_cpu_native", ASM_CPU_NATIVE_SPEC }, \ { "asm_default", ASM_DEFAULT_SPEC }, \ { "cc1_cpu", CC1_CPU_SPEC }, \ { "asm_cpu_power5", ASM_CPU_POWER5_SPEC }, \ @@ -179,6 +183,10 @@ extern const char *host_detect_local_cpu #define EXTRA_SPEC_FUNCTIONS \ { "local_cpu_detect", host_detect_local_cpu }, #define HAVE_LOCAL_CPU_DETECT +#define ASM_CPU_NATIVE_SPEC "%:local_cpu_detect(asm)" + +#else +#define ASM_CPU_NATIVE_SPEC "%(asm_default)" #endif #ifndef CC1_CPU_SPEC @@ -233,11 +241,12 @@ extern const char *host_detect_local_cpu #define TARGET_MFPGPR 0 #endif -/* Define TARGET_DFP if the target assembler does not support decimal - floating point instructions. 
*/ -#ifndef HAVE_AS_DFP -#undef TARGET_DFP -#define TARGET_DFP 0 +/* Define TARGET_POPCNTD if the target assembler does not support the + popcount word and double word instructions. */ + +#ifndef HAVE_AS_POPCNTD +#undef TARGET_POPCNTD +#define TARGET_POPCNTD 0 #endif #ifndef TARGET_SECURE_PLT @@ -295,6 +304,7 @@ enum processor_type PROCESSOR_POWER4, PROCESSOR_POWER5, PROCESSOR_POWER6, + PROCESSOR_POWER7, PROCESSOR_CELL }; @@ -388,9 +398,15 @@ extern struct rs6000_cpu_select rs6000_s extern const char *rs6000_debug_name; /* Name for -mdebug-xxxx option */ extern int rs6000_debug_stack; /* debug stack applications */ extern int rs6000_debug_arg; /* debug argument handling */ +extern int rs6000_debug_reg; /* debug register handling */ +extern int rs6000_debug_addr; /* debug memory addressing */ +extern int rs6000_debug_cost; /* debug rtx_costs */ #define TARGET_DEBUG_STACK rs6000_debug_stack #define TARGET_DEBUG_ARG rs6000_debug_arg +#define TARGET_DEBUG_REG rs6000_debug_reg +#define TARGET_DEBUG_ADDR rs6000_debug_addr +#define TARGET_DEBUG_COST rs6000_debug_cost extern const char *rs6000_traceback_name; /* Type of traceback table. */ @@ -401,13 +417,65 @@ extern int rs6000_ieeequad; extern int rs6000_altivec_abi; extern int rs6000_spe_abi; extern int rs6000_spe; -extern int rs6000_isel; extern int rs6000_float_gprs; extern int rs6000_alignment_flags; extern const char *rs6000_sched_insert_nops_str; extern enum rs6000_nop_insertion rs6000_sched_insert_nops; extern int rs6000_xilinx_fpu; +/* Describe which vector unit to use for a given machine mode. 
*/ +enum rs6000_vector { + VECTOR_NONE, /* Type is not a vector or not supported */ + VECTOR_ALTIVEC, /* Use altivec for vector processing */ + VECTOR_VSX, /* Use VSX for vector processing */ + VECTOR_PAIRED, /* Use paired floating point for vectors */ + VECTOR_SPE, /* Use SPE for vector processing */ + VECTOR_OTHER /* Some other vector unit */ +}; + +extern enum rs6000_vector rs6000_vector_unit[]; + +#define VECTOR_UNIT_NONE_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_NONE) + +#define VECTOR_UNIT_VSX_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_VSX) + +#define VECTOR_UNIT_ALTIVEC_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_ALTIVEC) + +#define VECTOR_UNIT_ALTIVEC_OR_VSX_P(MODE) \ + (rs6000_vector_unit[(MODE)] == VECTOR_ALTIVEC \ + || rs6000_vector_unit[(MODE)] == VECTOR_VSX) + +/* Describe whether to use VSX loads or Altivec loads. For now, just use the + same unit as the vector unit we are using, but we may want to migrate to + using VSX style loads even for types handled by altivec. */ +extern enum rs6000_vector rs6000_vector_mem[]; + +#define VECTOR_MEM_NONE_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_NONE) + +#define VECTOR_MEM_VSX_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_VSX) + +#define VECTOR_MEM_ALTIVEC_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_ALTIVEC) + +#define VECTOR_MEM_ALTIVEC_OR_VSX_P(MODE) \ + (rs6000_vector_mem[(MODE)] == VECTOR_ALTIVEC \ + || rs6000_vector_mem[(MODE)] == VECTOR_VSX) + +/* Return the alignment of a given vector type, which is set based on the + vector unit use. VSX for instance can load 32 or 64 bit aligned words + without problems, while Altivec requires 128-bit aligned vectors. */ +extern int rs6000_vector_align[]; + +#define VECTOR_ALIGN(MODE) \ + ((rs6000_vector_align[(MODE)] != 0) \ + ? rs6000_vector_align[(MODE)] \ + : (int)GET_MODE_BITSIZE ((MODE))) + /* Alignment options for fields in structures for sub-targets following AIX-like ABI. ALIGN_POWER word-aligns FP doubles (default AIX ABI). 
@@ -432,7 +500,7 @@ extern int rs6000_xilinx_fpu; #define TARGET_SPE_ABI 0 #define TARGET_SPE 0 #define TARGET_E500 0 -#define TARGET_ISEL rs6000_isel +#define TARGET_ISEL64 (TARGET_ISEL && TARGET_POWERPC64) #define TARGET_FPRS 1 #define TARGET_E500_SINGLE 0 #define TARGET_E500_DOUBLE 0 @@ -530,6 +598,7 @@ extern int rs6000_xilinx_fpu; #endif #define UNITS_PER_FP_WORD 8 #define UNITS_PER_ALTIVEC_WORD 16 +#define UNITS_PER_VSX_WORD 16 #define UNITS_PER_SPE_WORD 8 #define UNITS_PER_PAIRED_WORD 8 @@ -600,8 +669,9 @@ extern int rs6000_xilinx_fpu; #define PARM_BOUNDARY (TARGET_32BIT ? 32 : 64) /* Boundary (in *bits*) on which stack pointer should be aligned. */ -#define STACK_BOUNDARY \ - ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI) ? 64 : 128) +#define STACK_BOUNDARY \ + ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI && !TARGET_VSX) \ + ? 64 : 128) /* Allocation boundary (in *bits*) for the code of a function. */ #define FUNCTION_BOUNDARY 32 @@ -613,10 +683,11 @@ extern int rs6000_xilinx_fpu; local store. TYPE is the data type, and ALIGN is the alignment that the object would ordinarily have. */ #define LOCAL_ALIGNMENT(TYPE, ALIGN) \ - ((TARGET_ALTIVEC && TREE_CODE (TYPE) == VECTOR_TYPE) ? 128 : \ + (((TARGET_ALTIVEC || TARGET_VSX) \ + && TREE_CODE (TYPE) == VECTOR_TYPE) ? 128 : \ (TARGET_E500_DOUBLE \ - && TYPE_MODE (TYPE) == DFmode) ? 64 : \ - ((TARGET_SPE && TREE_CODE (TYPE) == VECTOR_TYPE \ + && TYPE_MODE (TYPE) == DFmode) ? 64 : \ + ((TARGET_SPE && TREE_CODE (TYPE) == VECTOR_TYPE \ && SPE_VECTOR_MODE (TYPE_MODE (TYPE))) || (TARGET_PAIRED_FLOAT \ && TREE_CODE (TYPE) == VECTOR_TYPE \ && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE)))) ? 64 : ALIGN) @@ -674,15 +745,17 @@ extern int rs6000_xilinx_fpu; /* Define this macro to be the value 1 if unaligned accesses have a cost many times greater than aligned accesses, for example if they are emulated in a trap handler. 
*/ -/* Altivec vector memory instructions simply ignore the low bits; SPE - vector memory instructions trap on unaligned accesses. */ +/* Altivec vector memory instructions simply ignore the low bits; SPE vector + memory instructions trap on unaligned accesses; VSX memory instructions are + aligned to 4 or 8 bytes. */ #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN) \ (STRICT_ALIGNMENT \ || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode \ || (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode \ || (MODE) == DImode) \ && (ALIGN) < 32) \ - || (VECTOR_MODE_P ((MODE)) && (ALIGN) < GET_MODE_BITSIZE ((MODE)))) + || (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE)))) + /* Standard register usage. */ @@ -909,16 +982,60 @@ extern int rs6000_xilinx_fpu; /* True if register is an AltiVec register. */ #define ALTIVEC_REGNO_P(N) ((N) >= FIRST_ALTIVEC_REGNO && (N) <= LAST_ALTIVEC_REGNO) +/* True if register is a VSX register. */ +#define VSX_REGNO_P(N) (FP_REGNO_P (N) || ALTIVEC_REGNO_P (N)) + +/* Alternate name for any vector register supporting floating point, no matter + which instruction set(s) are available. */ +#define VFLOAT_REGNO_P(N) \ + (ALTIVEC_REGNO_P (N) || (TARGET_VSX && FP_REGNO_P (N))) + +/* Alternate name for any vector register supporting integer, no matter which + instruction set(s) are available. */ +#define VINT_REGNO_P(N) ALTIVEC_REGNO_P (N) + +/* Alternate name for any vector register supporting logical operations, no + matter which instruction set(s) are available. */ +#define VLOGICAL_REGNO_P(N) VFLOAT_REGNO_P (N) + /* Return number of consecutive hard regs needed starting at reg REGNO to hold something of mode MODE. */ -#define HARD_REGNO_NREGS(REGNO, MODE) rs6000_hard_regno_nregs ((REGNO), (MODE)) +#define HARD_REGNO_NREGS(REGNO, MODE) rs6000_hard_regno_nregs[(MODE)][(REGNO)] #define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \ ((TARGET_32BIT && TARGET_POWERPC64 \ && (GET_MODE_SIZE (MODE) > 4) \ && INT_REGNO_P (REGNO)) ? 
1 : 0) +#define VSX_VECTOR_MODE(MODE) \ + ((MODE) == V4SFmode \ + || (MODE) == V2DFmode) \ + +#define VSX_VECTOR_MOVE_MODE(MODE) \ + ((MODE) == V16QImode \ + || (MODE) == V8HImode \ + || (MODE) == V4SImode \ + || (MODE) == V2DImode \ + || (MODE) == V4SFmode \ + || (MODE) == V2DFmode) \ + +#define VSX_SCALAR_MODE(MODE) \ + ((MODE) == DFmode) + +#define VSX_MODE(MODE) \ + (VSX_VECTOR_MODE (MODE) \ + || VSX_SCALAR_MODE (MODE)) + +#define VSX_MOVE_MODE(MODE) \ + (VSX_VECTOR_MOVE_MODE (MODE) \ + || VSX_SCALAR_MODE(MODE) \ + || (MODE) == V16QImode \ + || (MODE) == V8HImode \ + || (MODE) == V4SImode \ + || (MODE) == V2DImode \ + || (MODE) == TImode) + #define ALTIVEC_VECTOR_MODE(MODE) \ ((MODE) == V16QImode \ || (MODE) == V8HImode \ @@ -934,10 +1051,12 @@ extern int rs6000_xilinx_fpu; #define PAIRED_VECTOR_MODE(MODE) \ ((MODE) == V2SFmode) -#define UNITS_PER_SIMD_WORD(MODE) \ - (TARGET_ALTIVEC ? UNITS_PER_ALTIVEC_WORD \ - : (TARGET_SPE ? UNITS_PER_SPE_WORD : (TARGET_PAIRED_FLOAT ? \ - UNITS_PER_PAIRED_WORD : UNITS_PER_WORD))) +#define UNITS_PER_SIMD_WORD(MODE) \ + (TARGET_VSX ? UNITS_PER_VSX_WORD \ + : (TARGET_ALTIVEC ? UNITS_PER_ALTIVEC_WORD \ + : (TARGET_SPE ? UNITS_PER_SPE_WORD \ + : (TARGET_PAIRED_FLOAT ? UNITS_PER_PAIRED_WORD \ + : UNITS_PER_WORD)))) /* Value is TRUE if hard register REGNO can hold a value of machine-mode MODE. */ @@ -965,6 +1084,10 @@ extern int rs6000_xilinx_fpu; ? ALTIVEC_VECTOR_MODE (MODE2) \ : ALTIVEC_VECTOR_MODE (MODE2) \ ? ALTIVEC_VECTOR_MODE (MODE1) \ + : VSX_VECTOR_MODE (MODE1) \ + ? VSX_VECTOR_MODE (MODE2) \ + : VSX_VECTOR_MODE (MODE2) \ + ? VSX_VECTOR_MODE (MODE1) \ : 1) /* Post-reload, we can't use any new AltiVec registers, as we already @@ -1056,9 +1179,10 @@ extern int rs6000_xilinx_fpu; For any two classes, it is very desirable that there be another class that represents their union. 
*/ -/* The RS/6000 has three types of registers, fixed-point, floating-point, - and condition registers, plus three special registers, MQ, CTR, and the - link register. AltiVec adds a vector register class. +/* The RS/6000 has three types of registers, fixed-point, floating-point, and + condition registers, plus three special registers, MQ, CTR, and the link + register. AltiVec adds a vector register class. VSX registers overlap the + FPR registers and the Altivec registers. However, r0 is special in that it cannot be used as a base register. So make a class for registers valid as base registers. @@ -1073,6 +1197,7 @@ enum reg_class GENERAL_REGS, FLOAT_REGS, ALTIVEC_REGS, + VSX_REGS, VRSAVE_REGS, VSCR_REGS, SPE_ACC_REGS, @@ -1103,6 +1228,7 @@ enum reg_class "GENERAL_REGS", \ "FLOAT_REGS", \ "ALTIVEC_REGS", \ + "VSX_REGS", \ "VRSAVE_REGS", \ "VSCR_REGS", \ "SPE_ACC_REGS", \ @@ -1132,6 +1258,7 @@ enum reg_class { 0xffffffff, 0x00000000, 0x00000008, 0x00020000 }, /* GENERAL_REGS */ \ { 0x00000000, 0xffffffff, 0x00000000, 0x00000000 }, /* FLOAT_REGS */ \ { 0x00000000, 0x00000000, 0xffffe000, 0x00001fff }, /* ALTIVEC_REGS */ \ + { 0x00000000, 0xffffffff, 0xffffe000, 0x00001fff }, /* VSX_REGS */ \ { 0x00000000, 0x00000000, 0x00000000, 0x00002000 }, /* VRSAVE_REGS */ \ { 0x00000000, 0x00000000, 0x00000000, 0x00004000 }, /* VSCR_REGS */ \ { 0x00000000, 0x00000000, 0x00000000, 0x00008000 }, /* SPE_ACC_REGS */ \ @@ -1171,29 +1298,29 @@ enum reg_class reg number REGNO. This could be a conditional expression or could index an array. */ -#define REGNO_REG_CLASS(REGNO) \ - ((REGNO) == 0 ? GENERAL_REGS \ - : (REGNO) < 32 ? BASE_REGS \ - : FP_REGNO_P (REGNO) ? FLOAT_REGS \ - : ALTIVEC_REGNO_P (REGNO) ? ALTIVEC_REGS \ - : (REGNO) == CR0_REGNO ? CR0_REGS \ - : CR_REGNO_P (REGNO) ? CR_REGS \ - : (REGNO) == MQ_REGNO ? MQ_REGS \ - : (REGNO) == LR_REGNO ? LINK_REGS \ - : (REGNO) == CTR_REGNO ? CTR_REGS \ - : (REGNO) == ARG_POINTER_REGNUM ? BASE_REGS \ - : (REGNO) == XER_REGNO ? 
XER_REGS \ - : (REGNO) == VRSAVE_REGNO ? VRSAVE_REGS \ - : (REGNO) == VSCR_REGNO ? VRSAVE_REGS \ - : (REGNO) == SPE_ACC_REGNO ? SPE_ACC_REGS \ - : (REGNO) == SPEFSCR_REGNO ? SPEFSCR_REGS \ - : (REGNO) == FRAME_POINTER_REGNUM ? BASE_REGS \ - : NO_REGS) +extern enum reg_class rs6000_regno_regclass[FIRST_PSEUDO_REGISTER]; + +#if ENABLE_CHECKING +#define REGNO_REG_CLASS(REGNO) \ + (gcc_assert (IN_RANGE ((REGNO), 0, FIRST_PSEUDO_REGISTER-1)), \ + rs6000_regno_regclass[(REGNO)]) + +#else +#define REGNO_REG_CLASS(REGNO) rs6000_regno_regclass[(REGNO)] +#endif + +/* VSX register classes. */ +extern enum reg_class rs6000_vector_reg_class[]; +extern enum reg_class rs6000_vsx_reg_class; /* The class value for index registers, and the one for base regs. */ #define INDEX_REG_CLASS GENERAL_REGS #define BASE_REG_CLASS BASE_REGS +/* Return whether a given register class can hold VSX objects. */ +#define VSX_REG_CLASS_P(CLASS) \ + ((CLASS) == VSX_REGS || (CLASS) == FLOAT_REGS || (CLASS) == ALTIVEC_REGS) + /* Given an rtx X being reloaded into a reg required to be in class CLASS, return the class of reg to actually use. In general this is just CLASS; but on some machines @@ -1213,13 +1340,7 @@ enum reg_class */ #define PREFERRED_RELOAD_CLASS(X,CLASS) \ - ((CONSTANT_P (X) \ - && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) \ - ? NO_REGS \ - : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT \ - && (CLASS) == NON_SPECIAL_REGS) \ - ? GENERAL_REGS \ - : (CLASS)) + rs6000_preferred_reload_class (X, CLASS) /* Return the register class of a scratch register needed to copy IN into or out of a register in CLASS in MODE. 
If it can be done directly, @@ -1234,18 +1355,7 @@ enum reg_class are available.*/ #define SECONDARY_MEMORY_NEEDED(CLASS1,CLASS2,MODE) \ - ((CLASS1) != (CLASS2) && (((CLASS1) == FLOAT_REGS \ - && (!TARGET_MFPGPR || !TARGET_POWERPC64 \ - || ((MODE != DFmode) \ - && (MODE != DDmode) \ - && (MODE != DImode)))) \ - || ((CLASS2) == FLOAT_REGS \ - && (!TARGET_MFPGPR || !TARGET_POWERPC64 \ - || ((MODE != DFmode) \ - && (MODE != DDmode) \ - && (MODE != DImode)))) \ - || (CLASS1) == ALTIVEC_REGS \ - || (CLASS2) == ALTIVEC_REGS)) + rs6000_secondary_memory_needed (CLASS1, CLASS2, MODE) /* For cpus that cannot load/store SDmode values from the 64-bit FP registers without using a full 64-bit load/store, we need @@ -1257,32 +1367,15 @@ enum reg_class /* Return the maximum number of consecutive registers needed to represent mode MODE in a register of class CLASS. - On RS/6000, this is the size of MODE in words, - except in the FP regs, where a single reg is enough for two words. */ -#define CLASS_MAX_NREGS(CLASS, MODE) \ - (((CLASS) == FLOAT_REGS) \ - ? ((GET_MODE_SIZE (MODE) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD) \ - : (TARGET_E500_DOUBLE && (CLASS) == GENERAL_REGS \ - && (MODE) == DFmode) \ - ? 1 \ - : ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)) + On RS/6000, this is the size of MODE in words, except in the FP regs, where + a single reg is enough for two words, unless we have VSX, where the FP + registers can hold 128 bits. */ +#define CLASS_MAX_NREGS(CLASS, MODE) rs6000_class_max_nregs[(MODE)][(CLASS)] /* Return nonzero if for CLASS a mode change from FROM to TO is invalid. */ #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \ - (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \ - ? 
((GET_MODE_SIZE (FROM) < 8 || GET_MODE_SIZE (TO) < 8 \ - || TARGET_IEEEQUAD) \ - && reg_classes_intersect_p (FLOAT_REGS, CLASS)) \ - : (((TARGET_E500_DOUBLE \ - && ((((TO) == DFmode) + ((FROM) == DFmode)) == 1 \ - || (((TO) == TFmode) + ((FROM) == TFmode)) == 1 \ - || (((TO) == DDmode) + ((FROM) == DDmode)) == 1 \ - || (((TO) == TDmode) + ((FROM) == TDmode)) == 1 \ - || (((TO) == DImode) + ((FROM) == DImode)) == 1)) \ - || (TARGET_SPE \ - && (SPE_VECTOR_MODE (FROM) + SPE_VECTOR_MODE (TO)) == 1)) \ - && reg_classes_intersect_p (GENERAL_REGS, CLASS))) + rs6000_cannot_change_mode_class (FROM, TO, CLASS) /* Stack layout; function entry, exit and calling. */ @@ -1343,8 +1436,8 @@ extern enum rs6000_abi rs6000_current_ab #define STARTING_FRAME_OFFSET \ (FRAME_GROWS_DOWNWARD \ ? 0 \ - : (RS6000_ALIGN (crtl->outgoing_args_size, \ - TARGET_ALTIVEC ? 16 : 8) \ + : (RS6000_ALIGN (crtl->outgoing_args_size, \ + (TARGET_ALTIVEC || TARGET_VSX) ? 16 : 8) \ + RS6000_SAVE_AREA)) /* Offset from the stack pointer register to an item dynamically @@ -1354,8 +1447,8 @@ extern enum rs6000_abi rs6000_current_ab length of the outgoing arguments. The default is correct for most machines. See `function.c' for details. */ #define STACK_DYNAMIC_OFFSET(FUNDECL) \ - (RS6000_ALIGN (crtl->outgoing_args_size, \ - TARGET_ALTIVEC ? 16 : 8) \ + (RS6000_ALIGN (crtl->outgoing_args_size, \ + (TARGET_ALTIVEC || TARGET_VSX) ? 
16 : 8) \ + (STACK_POINTER_OFFSET)) /* If we generate an insn to push BYTES bytes, @@ -1605,7 +1698,7 @@ typedef struct rs6000_args #define EPILOGUE_USES(REGNO) \ ((reload_completed && (REGNO) == LR_REGNO) \ || (TARGET_ALTIVEC && (REGNO) == VRSAVE_REGNO) \ - || (crtl->calls_eh_return \ + || (crtl->calls_eh_return \ && TARGET_AIX \ && (REGNO) == 2)) @@ -2316,7 +2409,24 @@ extern char rs6000_reg_names[][8]; /* re /* no additional names for: mq, lr, ctr, ap */ \ {"cr0", 68}, {"cr1", 69}, {"cr2", 70}, {"cr3", 71}, \ {"cr4", 72}, {"cr5", 73}, {"cr6", 74}, {"cr7", 75}, \ - {"cc", 68}, {"sp", 1}, {"toc", 2} } + {"cc", 68}, {"sp", 1}, {"toc", 2}, \ + /* VSX registers overlaid on top of FR, Altivec registers */ \ + {"vs0", 32}, {"vs1", 33}, {"vs2", 34}, {"vs3", 35}, \ + {"vs4", 36}, {"vs5", 37}, {"vs6", 38}, {"vs7", 39}, \ + {"vs8", 40}, {"vs9", 41}, {"vs10", 42}, {"vs11", 43}, \ + {"vs12", 44}, {"vs13", 45}, {"vs14", 46}, {"vs15", 47}, \ + {"vs16", 48}, {"vs17", 49}, {"vs18", 50}, {"vs19", 51}, \ + {"vs20", 52}, {"vs21", 53}, {"vs22", 54}, {"vs23", 55}, \ + {"vs24", 56}, {"vs25", 57}, {"vs26", 58}, {"vs27", 59}, \ + {"vs28", 60}, {"vs29", 61}, {"vs30", 62}, {"vs31", 63}, \ + {"vs32", 77}, {"vs33", 78}, {"vs34", 79}, {"vs35", 80}, \ + {"vs36", 81}, {"vs37", 82}, {"vs38", 83}, {"vs39", 84}, \ + {"vs40", 85}, {"vs41", 86}, {"vs42", 87}, {"vs43", 88}, \ + {"vs44", 89}, {"vs45", 90}, {"vs46", 91}, {"vs47", 92}, \ + {"vs48", 93}, {"vs49", 94}, {"vs50", 95}, {"vs51", 96}, \ + {"vs52", 97}, {"vs53", 98}, {"vs54", 99}, {"vs55", 100}, \ + {"vs56", 101},{"vs57", 102},{"vs58", 103},{"vs59", 104}, \ + {"vs60", 105},{"vs61", 106},{"vs62", 107},{"vs63", 108} } /* Text to write out after a CALL that may be replaced by glue code by the loader. This depends on the AIX version. 
*/ @@ -2480,10 +2590,14 @@ enum rs6000_builtins ALTIVEC_BUILTIN_VSEL_4SF, ALTIVEC_BUILTIN_VSEL_8HI, ALTIVEC_BUILTIN_VSEL_16QI, + ALTIVEC_BUILTIN_VSEL_2DF, /* needed for VSX */ + ALTIVEC_BUILTIN_VSEL_2DI, /* needed for VSX */ ALTIVEC_BUILTIN_VPERM_4SI, ALTIVEC_BUILTIN_VPERM_4SF, ALTIVEC_BUILTIN_VPERM_8HI, ALTIVEC_BUILTIN_VPERM_16QI, + ALTIVEC_BUILTIN_VPERM_2DF, /* needed for VSX */ + ALTIVEC_BUILTIN_VPERM_2DI, /* needed for VSX */ ALTIVEC_BUILTIN_VPKUHUM, ALTIVEC_BUILTIN_VPKUWUM, ALTIVEC_BUILTIN_VPKPX, @@ -2839,6 +2953,7 @@ enum rs6000_builtins ALTIVEC_BUILTIN_VEC_PROMOTE, ALTIVEC_BUILTIN_VEC_INSERT, ALTIVEC_BUILTIN_VEC_SPLATS, + ALTIVEC_BUILTIN_OVERLOADED_LAST = ALTIVEC_BUILTIN_VEC_SPLATS, /* SPE builtins. */ @@ -3110,6 +3225,163 @@ enum rs6000_builtins RS6000_BUILTIN_RECIPF, RS6000_BUILTIN_RSQRTF, + /* VSX builtins. */ + VSX_BUILTIN_LXSDUX, + VSX_BUILTIN_LXSDX, + VSX_BUILTIN_LXVD2UX, + VSX_BUILTIN_LXVD2X, + VSX_BUILTIN_LXVDSX, + VSX_BUILTIN_LXVW4UX, + VSX_BUILTIN_LXVW4X, + VSX_BUILTIN_STXSDUX, + VSX_BUILTIN_STXSDX, + VSX_BUILTIN_STXVD2UX, + VSX_BUILTIN_STXVD2X, + VSX_BUILTIN_STXVW4UX, + VSX_BUILTIN_STXVW4X, + VSX_BUILTIN_XSABSDP, + VSX_BUILTIN_XSADDDP, + VSX_BUILTIN_XSCMPODP, + VSX_BUILTIN_XSCMPUDP, + VSX_BUILTIN_XSCPSGNDP, + VSX_BUILTIN_XSCVDPSP, + VSX_BUILTIN_XSCVDPSXDS, + VSX_BUILTIN_XSCVDPSXWS, + VSX_BUILTIN_XSCVDPUXDS, + VSX_BUILTIN_XSCVDPUXWS, + VSX_BUILTIN_XSCVSPDP, + VSX_BUILTIN_XSCVSXDDP, + VSX_BUILTIN_XSCVUXDDP, + VSX_BUILTIN_XSDIVDP, + VSX_BUILTIN_XSMADDADP, + VSX_BUILTIN_XSMADDMDP, + VSX_BUILTIN_XSMAXDP, + VSX_BUILTIN_XSMINDP, + VSX_BUILTIN_XSMOVDP, + VSX_BUILTIN_XSMSUBADP, + VSX_BUILTIN_XSMSUBMDP, + VSX_BUILTIN_XSMULDP, + VSX_BUILTIN_XSNABSDP, + VSX_BUILTIN_XSNEGDP, + VSX_BUILTIN_XSNMADDADP, + VSX_BUILTIN_XSNMADDMDP, + VSX_BUILTIN_XSNMSUBADP, + VSX_BUILTIN_XSNMSUBMDP, + VSX_BUILTIN_XSRDPI, + VSX_BUILTIN_XSRDPIC, + VSX_BUILTIN_XSRDPIM, + VSX_BUILTIN_XSRDPIP, + VSX_BUILTIN_XSRDPIZ, + VSX_BUILTIN_XSREDP, + VSX_BUILTIN_XSRSQRTEDP, + VSX_BUILTIN_XSSQRTDP, 
+ VSX_BUILTIN_XSSUBDP, + VSX_BUILTIN_XSTDIVDP, + VSX_BUILTIN_XSTSQRTDP, + VSX_BUILTIN_XVABSDP, + VSX_BUILTIN_XVABSSP, + VSX_BUILTIN_XVADDDP, + VSX_BUILTIN_XVADDSP, + VSX_BUILTIN_XVCMPEQDP, + VSX_BUILTIN_XVCMPEQSP, + VSX_BUILTIN_XVCMPGEDP, + VSX_BUILTIN_XVCMPGESP, + VSX_BUILTIN_XVCMPGTDP, + VSX_BUILTIN_XVCMPGTSP, + VSX_BUILTIN_XVCPSGNDP, + VSX_BUILTIN_XVCPSGNSP, + VSX_BUILTIN_XVCVDPSP, + VSX_BUILTIN_XVCVDPSXDS, + VSX_BUILTIN_XVCVDPSXWS, + VSX_BUILTIN_XVCVDPUXDS, + VSX_BUILTIN_XVCVDPUXWS, + VSX_BUILTIN_XVCVSPDP, + VSX_BUILTIN_XVCVSPSXDS, + VSX_BUILTIN_XVCVSPSXWS, + VSX_BUILTIN_XVCVSPUXDS, + VSX_BUILTIN_XVCVSPUXWS, + VSX_BUILTIN_XVCVSXDDP, + VSX_BUILTIN_XVCVSXDSP, + VSX_BUILTIN_XVCVSXWDP, + VSX_BUILTIN_XVCVSXWSP, + VSX_BUILTIN_XVCVUXDDP, + VSX_BUILTIN_XVCVUXDSP, + VSX_BUILTIN_XVCVUXWDP, + VSX_BUILTIN_XVCVUXWSP, + VSX_BUILTIN_XVDIVDP, + VSX_BUILTIN_XVDIVSP, + VSX_BUILTIN_XVMADDDP, + VSX_BUILTIN_XVMADDSP, + VSX_BUILTIN_XVMAXDP, + VSX_BUILTIN_XVMAXSP, + VSX_BUILTIN_XVMINDP, + VSX_BUILTIN_XVMINSP, + VSX_BUILTIN_XVMSUBDP, + VSX_BUILTIN_XVMSUBSP, + VSX_BUILTIN_XVMULDP, + VSX_BUILTIN_XVMULSP, + VSX_BUILTIN_XVNABSDP, + VSX_BUILTIN_XVNABSSP, + VSX_BUILTIN_XVNEGDP, + VSX_BUILTIN_XVNEGSP, + VSX_BUILTIN_XVNMADDDP, + VSX_BUILTIN_XVNMADDSP, + VSX_BUILTIN_XVNMSUBDP, + VSX_BUILTIN_XVNMSUBSP, + VSX_BUILTIN_XVRDPI, + VSX_BUILTIN_XVRDPIC, + VSX_BUILTIN_XVRDPIM, + VSX_BUILTIN_XVRDPIP, + VSX_BUILTIN_XVRDPIZ, + VSX_BUILTIN_XVREDP, + VSX_BUILTIN_XVRESP, + VSX_BUILTIN_XVRSPI, + VSX_BUILTIN_XVRSPIC, + VSX_BUILTIN_XVRSPIM, + VSX_BUILTIN_XVRSPIP, + VSX_BUILTIN_XVRSPIZ, + VSX_BUILTIN_XVRSQRTEDP, + VSX_BUILTIN_XVRSQRTESP, + VSX_BUILTIN_XVSQRTDP, + VSX_BUILTIN_XVSQRTSP, + VSX_BUILTIN_XVSUBDP, + VSX_BUILTIN_XVSUBSP, + VSX_BUILTIN_XVTDIVDP, + VSX_BUILTIN_XVTDIVSP, + VSX_BUILTIN_XVTSQRTDP, + VSX_BUILTIN_XVTSQRTSP, + VSX_BUILTIN_XXLAND, + VSX_BUILTIN_XXLANDC, + VSX_BUILTIN_XXLNOR, + VSX_BUILTIN_XXLOR, + VSX_BUILTIN_XXLXOR, + VSX_BUILTIN_XXMRGHD, + VSX_BUILTIN_XXMRGHW, + VSX_BUILTIN_XXMRGLD, + 
VSX_BUILTIN_XXMRGLW, + VSX_BUILTIN_XXPERMDI, + VSX_BUILTIN_XXSEL, + VSX_BUILTIN_XXSLDWI, + VSX_BUILTIN_XXSPLTD, + VSX_BUILTIN_XXSPLTW, + VSX_BUILTIN_XXSWAPD, + + /* VSX overloaded builtins, add the overloaded functions not present in + Altivec. */ + VSX_BUILTIN_VEC_MUL, + VSX_BUILTIN_OVERLOADED_FIRST = VSX_BUILTIN_VEC_MUL, + VSX_BUILTIN_VEC_MSUB, + VSX_BUILTIN_VEC_NMADD, + VSX_BUITLIN_VEC_NMSUB, + VSX_BUILTIN_VEC_DIV, + VSX_BUILTIN_OVERLOADED_LAST = VSX_BUILTIN_VEC_DIV, + + /* Combined VSX/Altivec builtins. */ + VECTOR_BUILTIN_FLOAT_V4SI_V4SF, + VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF, + VECTOR_BUILTIN_FIX_V4SF_V4SI, + VECTOR_BUILTIN_FIXUNS_V4SF_V4SI, + RS6000_BUILTIN_COUNT }; @@ -3123,6 +3395,8 @@ enum rs6000_builtin_type_index RS6000_BTI_V16QI, RS6000_BTI_V2SI, RS6000_BTI_V2SF, + RS6000_BTI_V2DI, + RS6000_BTI_V2DF, RS6000_BTI_V4HI, RS6000_BTI_V4SI, RS6000_BTI_V4SF, @@ -3146,7 +3420,10 @@ enum rs6000_builtin_type_index RS6000_BTI_UINTHI, /* unsigned_intHI_type_node */ RS6000_BTI_INTSI, /* intSI_type_node */ RS6000_BTI_UINTSI, /* unsigned_intSI_type_node */ + RS6000_BTI_INTDI, /* intDI_type_node */ + RS6000_BTI_UINTDI, /* unsigned_intDI_type_node */ RS6000_BTI_float, /* float_type_node */ + RS6000_BTI_double, /* double_type_node */ RS6000_BTI_void, /* void_type_node */ RS6000_BTI_MAX }; @@ -3157,6 +3434,8 @@ enum rs6000_builtin_type_index #define opaque_p_V2SI_type_node (rs6000_builtin_types[RS6000_BTI_opaque_p_V2SI]) #define opaque_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_opaque_V4SI]) #define V16QI_type_node (rs6000_builtin_types[RS6000_BTI_V16QI]) +#define V2DI_type_node (rs6000_builtin_types[RS6000_BTI_V2DI]) +#define V2DF_type_node (rs6000_builtin_types[RS6000_BTI_V2DF]) #define V2SI_type_node (rs6000_builtin_types[RS6000_BTI_V2SI]) #define V2SF_type_node (rs6000_builtin_types[RS6000_BTI_V2SF]) #define V4HI_type_node (rs6000_builtin_types[RS6000_BTI_V4HI]) @@ -3183,7 +3462,10 @@ enum rs6000_builtin_type_index #define uintHI_type_internal_node 
(rs6000_builtin_types[RS6000_BTI_UINTHI]) #define intSI_type_internal_node (rs6000_builtin_types[RS6000_BTI_INTSI]) #define uintSI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTSI]) +#define intDI_type_internal_node (rs6000_builtin_types[RS6000_BTI_INTDI]) +#define uintDI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTDI]) #define float_type_internal_node (rs6000_builtin_types[RS6000_BTI_float]) +#define double_type_internal_node (rs6000_builtin_types[RS6000_BTI_double]) #define void_type_internal_node (rs6000_builtin_types[RS6000_BTI_void]) extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX]; --- gcc/config/rs6000/altivec.md (.../trunk) (revision 145777) +++ gcc/config/rs6000/altivec.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -21,18 +21,7 @@ (define_constants [(UNSPEC_VCMPBFP 50) - (UNSPEC_VCMPEQUB 51) - (UNSPEC_VCMPEQUH 52) - (UNSPEC_VCMPEQUW 53) - (UNSPEC_VCMPEQFP 54) - (UNSPEC_VCMPGEFP 55) - (UNSPEC_VCMPGTUB 56) - (UNSPEC_VCMPGTSB 57) - (UNSPEC_VCMPGTUH 58) - (UNSPEC_VCMPGTSH 59) - (UNSPEC_VCMPGTUW 60) - (UNSPEC_VCMPGTSW 61) - (UNSPEC_VCMPGTFP 62) + ;; 51-62 deleted (UNSPEC_VMSUMU 65) (UNSPEC_VMSUMM 66) (UNSPEC_VMSUMSHM 68) @@ -87,10 +76,7 @@ (define_constants (UNSPEC_VEXPTEFP 156) (UNSPEC_VRSQRTEFP 157) (UNSPEC_VREFP 158) - (UNSPEC_VSEL4SI 159) - (UNSPEC_VSEL4SF 160) - (UNSPEC_VSEL8HI 161) - (UNSPEC_VSEL16QI 162) + ;; 159-162 deleted (UNSPEC_VLSDOI 163) (UNSPEC_VUPKHSB 167) (UNSPEC_VUPKHPX 168) @@ -125,11 +111,11 @@ (define_constants (UNSPEC_INTERHI_V4SI 228) (UNSPEC_INTERHI_V8HI 229) (UNSPEC_INTERHI_V16QI 230) - (UNSPEC_INTERHI_V4SF 231) + ;; delete 231 (UNSPEC_INTERLO_V4SI 232) (UNSPEC_INTERLO_V8HI 233) (UNSPEC_INTERLO_V16QI 234) - (UNSPEC_INTERLO_V4SF 235) + ;; delete 235 (UNSPEC_LVLX 236) (UNSPEC_LVLXL 237) (UNSPEC_LVRX 238) @@ -176,39 +162,17 @@ (define_mode_iterator VIshort [V8HI V16Q (define_mode_iterator VF [V4SF]) ;; Vec modes, pity mode iterators are not composable (define_mode_iterator V [V4SI V8HI V16QI 
V4SF]) +;; Vec modes for move/logical/permute ops, include vector types for move not +;; otherwise handled by altivec (v2df, v2di, ti) +(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI TI]) (define_mode_attr VI_char [(V4SI "w") (V8HI "h") (V16QI "b")]) -;; Generic LVX load instruction. -(define_insn "altivec_lvx_" - [(set (match_operand:V 0 "altivec_register_operand" "=v") - (match_operand:V 1 "memory_operand" "Z"))] - "TARGET_ALTIVEC" - "lvx %0,%y1" - [(set_attr "type" "vecload")]) - -;; Generic STVX store instruction. -(define_insn "altivec_stvx_" - [(set (match_operand:V 0 "memory_operand" "=Z") - (match_operand:V 1 "altivec_register_operand" "v"))] - "TARGET_ALTIVEC" - "stvx %1,%y0" - [(set_attr "type" "vecstore")]) - ;; Vector move instructions. -(define_expand "mov" - [(set (match_operand:V 0 "nonimmediate_operand" "") - (match_operand:V 1 "any_operand" ""))] - "TARGET_ALTIVEC" -{ - rs6000_emit_move (operands[0], operands[1], mode); - DONE; -}) - -(define_insn "*mov_internal" - [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,o,r,r,v") - (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,W"))] - "TARGET_ALTIVEC +(define_insn "*altivec_mov" + [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,*o,*r,*r,v,v") + (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,j,W"))] + "VECTOR_MEM_ALTIVEC_P (mode) && (register_operand (operands[0], mode) || register_operand (operands[1], mode))" { @@ -220,52 +184,17 @@ (define_insn "*mov_internal" case 3: return "#"; case 4: return "#"; case 5: return "#"; - case 6: return output_vec_const_move (operands); + case 6: return "vxor %0,%0,%0"; + case 7: return output_vec_const_move (operands); default: gcc_unreachable (); } } - [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,*")]) + [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")]) (define_split - [(set (match_operand:V4SI 0 "nonimmediate_operand" "") - (match_operand:V4SI 1 "input_operand" ""))] - "TARGET_ALTIVEC && 
reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ - rs6000_split_multireg_move (operands[0], operands[1]); DONE; -}) - -(define_split - [(set (match_operand:V8HI 0 "nonimmediate_operand" "") - (match_operand:V8HI 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) - -(define_split - [(set (match_operand:V16QI 0 "nonimmediate_operand" "") - (match_operand:V16QI 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) - -(define_split - [(set (match_operand:V4SF 0 "nonimmediate_operand" "") - (match_operand:V4SF 1 "input_operand" ""))] - "TARGET_ALTIVEC && reload_completed - && gpr_or_gpr_p (operands[0], operands[1])" - [(pc)] -{ - rs6000_split_multireg_move (operands[0], operands[1]); DONE; -}) - -(define_split - [(set (match_operand:V 0 "altivec_register_operand" "") - (match_operand:V 1 "easy_vector_constant_add_self" ""))] - "TARGET_ALTIVEC && reload_completed" + [(set (match_operand:VM 0 "altivec_register_operand" "") + (match_operand:VM 1 "easy_vector_constant_add_self" ""))] + "VECTOR_UNIT_ALTIVEC_P (mode) && reload_completed" [(set (match_dup 0) (match_dup 3)) (set (match_dup 0) (match_dup 4))] { @@ -346,11 +275,11 @@ (define_insn "add3" "vaddum %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "addv4sf3" +(define_insn "*altivec_addv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (plus:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vaddfp %0,%1,%2" [(set_attr "type" "vecfloat")]) @@ -392,11 +321,11 @@ (define_insn "sub3" "vsubum %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "subv4sf3" +(define_insn "*altivec_subv4sf3" [(set (match_operand:V4SF 
0 "register_operand" "=v") (minus:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vsubfp %0,%1,%2" [(set_attr "type" "vecfloat")]) @@ -457,131 +386,81 @@ (define_insn "altivec_vcmpbfp" "vcmpbfp %0,%1,%2" [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpequb" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")] - UNSPEC_VCMPEQUB))] +(define_insn "*altivec_eq" + [(set (match_operand:VI 0 "altivec_register_operand" "=v") + (eq:VI (match_operand:VI 1 "altivec_register_operand" "v") + (match_operand:VI 2 "altivec_register_operand" "v")))] "TARGET_ALTIVEC" - "vcmpequb %0,%1,%2" - [(set_attr "type" "vecsimple")]) + "vcmpequ %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpequh" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VCMPEQUH))] +(define_insn "*altivec_gt" + [(set (match_operand:VI 0 "altivec_register_operand" "=v") + (gt:VI (match_operand:VI 1 "altivec_register_operand" "v") + (match_operand:VI 2 "altivec_register_operand" "v")))] "TARGET_ALTIVEC" - "vcmpequh %0,%1,%2" - [(set_attr "type" "vecsimple")]) + "vcmpgts %0,%1,%2" + [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpequw" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")] - UNSPEC_VCMPEQUW))] +(define_insn "*altivec_gtu" + [(set (match_operand:VI 0 "altivec_register_operand" "=v") + (gtu:VI (match_operand:VI 1 "altivec_register_operand" "v") + (match_operand:VI 2 "altivec_register_operand" "v")))] "TARGET_ALTIVEC" - "vcmpequw %0,%1,%2" - [(set_attr "type" "vecsimple")]) + "vcmpgtu %0,%1,%2" + [(set_attr "type" 
"veccmp")]) -(define_insn "altivec_vcmpeqfp" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")] - UNSPEC_VCMPEQFP))] - "TARGET_ALTIVEC" +(define_insn "*altivec_eqv4sf" + [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") + (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") + (match_operand:V4SF 2 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vcmpeqfp %0,%1,%2" [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpgefp" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")] - UNSPEC_VCMPGEFP))] - "TARGET_ALTIVEC" - "vcmpgefp %0,%1,%2" +(define_insn "*altivec_gtv4sf" + [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") + (gt:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") + (match_operand:V4SF 2 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpgtfp %0,%1,%2" [(set_attr "type" "veccmp")]) -(define_insn "altivec_vcmpgtub" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")] - UNSPEC_VCMPGTUB))] - "TARGET_ALTIVEC" - "vcmpgtub %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtsb" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")] - UNSPEC_VCMPGTSB))] - "TARGET_ALTIVEC" - "vcmpgtsb %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtuh" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VCMPGTUH))] - "TARGET_ALTIVEC" - "vcmpgtuh %0,%1,%2" - 
[(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtsh" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VCMPGTSH))] - "TARGET_ALTIVEC" - "vcmpgtsh %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtuw" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")] - UNSPEC_VCMPGTUW))] - "TARGET_ALTIVEC" - "vcmpgtuw %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtsw" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")] - UNSPEC_VCMPGTSW))] - "TARGET_ALTIVEC" - "vcmpgtsw %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "altivec_vcmpgtfp" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")] - UNSPEC_VCMPGTFP))] - "TARGET_ALTIVEC" - "vcmpgtfp %0,%1,%2" +(define_insn "*altivec_gev4sf" + [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") + (ge:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") + (match_operand:V4SF 2 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" + "vcmpgefp %0,%1,%2" [(set_attr "type" "veccmp")]) +(define_insn "altivec_vsel" + [(set (match_operand:VM 0 "altivec_register_operand" "=v") + (if_then_else:VM (ne (match_operand:VM 1 "altivec_register_operand" "v") + (const_int 0)) + (match_operand:VM 2 "altivec_register_operand" "v") + (match_operand:VM 3 "altivec_register_operand" "v")))] + "VECTOR_UNIT_ALTIVEC_P (mode)" + "vsel %0,%3,%2,%1" + [(set_attr "type" "vecperm")]) + ;; Fused multiply add (define_insn "altivec_vmaddfp" [(set (match_operand:V4SF 0 "register_operand" "=v") (plus:V4SF 
(mult:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")) (match_operand:V4SF 3 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vmaddfp %0,%1,%2,%3" [(set_attr "type" "vecfloat")]) ;; We do multiply as a fused multiply-add with an add of a -0.0 vector. -(define_expand "mulv4sf3" +(define_expand "altivec_mulv4sf3" [(use (match_operand:V4SF 0 "register_operand" "")) (use (match_operand:V4SF 1 "register_operand" "")) (use (match_operand:V4SF 2 "register_operand" ""))] - "TARGET_ALTIVEC && TARGET_FUSED_MADD" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && TARGET_FUSED_MADD" " { rtx neg0; @@ -684,7 +563,7 @@ (define_insn "altivec_vnmsubfp" (neg:V4SF (minus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")) (match_operand:V4SF 3 "register_operand" "v"))))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vnmsubfp %0,%1,%2,%3" [(set_attr "type" "vecfloat")]) @@ -758,11 +637,11 @@ (define_insn "smax3" "vmaxs %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "smaxv4sf3" +(define_insn "*altivec_smaxv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (smax:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vmaxfp %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -782,11 +661,11 @@ (define_insn "smin3" "vmins %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "sminv4sf3" +(define_insn "*altivec_sminv4sf3" [(set (match_operand:V4SF 0 "register_operand" "=v") (smin:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vminfp %0,%1,%2" [(set_attr "type" "veccmp")]) @@ -905,7 +784,7 @@ (define_insn "altivec_vmrghw" "vmrghw %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "altivec_vmrghsf" +(define_insn 
"*altivec_vmrghsf" [(set (match_operand:V4SF 0 "register_operand" "=v") (vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v") (parallel [(const_int 0) @@ -918,7 +797,7 @@ (define_insn "altivec_vmrghsf" (const_int 3) (const_int 1)])) (const_int 5)))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vmrghw %0,%1,%2" [(set_attr "type" "vecperm")]) @@ -990,35 +869,37 @@ (define_insn "altivec_vmrglh" (define_insn "altivec_vmrglw" [(set (match_operand:V4SI 0 "register_operand" "=v") - (vec_merge:V4SI (vec_select:V4SI (match_operand:V4SI 1 "register_operand" "v") - (parallel [(const_int 2) - (const_int 0) - (const_int 3) - (const_int 1)])) - (vec_select:V4SI (match_operand:V4SI 2 "register_operand" "v") - (parallel [(const_int 0) - (const_int 2) - (const_int 1) - (const_int 3)])) - (const_int 5)))] + (vec_merge:V4SI + (vec_select:V4SI (match_operand:V4SI 1 "register_operand" "v") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SI (match_operand:V4SI 2 "register_operand" "v") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (const_int 5)))] "TARGET_ALTIVEC" "vmrglw %0,%1,%2" [(set_attr "type" "vecperm")]) -(define_insn "altivec_vmrglsf" +(define_insn "*altivec_vmrglsf" [(set (match_operand:V4SF 0 "register_operand" "=v") - (vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v") - (parallel [(const_int 2) - (const_int 0) - (const_int 3) - (const_int 1)])) - (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v") - (parallel [(const_int 0) - (const_int 2) - (const_int 1) - (const_int 3)])) - (const_int 5)))] - "TARGET_ALTIVEC" + (vec_merge:V4SF + (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v") + (parallel [(const_int 2) + (const_int 0) + (const_int 3) + (const_int 1)])) + (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v") + (parallel [(const_int 0) + (const_int 2) + (const_int 1) + (const_int 3)])) + (const_int 
5)))] + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vmrglw %0,%1,%2" [(set_attr "type" "vecperm")]) @@ -1095,68 +976,53 @@ (define_insn "altivec_vmulosh" [(set_attr "type" "veccomplex")]) -;; logical ops +;; logical ops. Have the logical ops follow the memory ops in +;; terms of whether to prefer VSX or Altivec -(define_insn "and3" - [(set (match_operand:VI 0 "register_operand" "=v") - (and:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_and3" + [(set (match_operand:VM 0 "register_operand" "=v") + (and:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (mode)" "vand %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "ior3" - [(set (match_operand:VI 0 "register_operand" "=v") - (ior:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_ior3" + [(set (match_operand:VM 0 "register_operand" "=v") + (ior:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (mode)" "vor %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "xor3" - [(set (match_operand:VI 0 "register_operand" "=v") - (xor:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_xor3" + [(set (match_operand:VM 0 "register_operand" "=v") + (xor:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (mode)" "vxor %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "xorv4sf3" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (xor:V4SF (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - "vxor %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "one_cmpl2" - [(set 
(match_operand:VI 0 "register_operand" "=v") - (not:VI (match_operand:VI 1 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_one_cmpl2" + [(set (match_operand:VM 0 "register_operand" "=v") + (not:VM (match_operand:VM 1 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (mode)" "vnor %0,%1,%1" [(set_attr "type" "vecsimple")]) -(define_insn "altivec_nor3" - [(set (match_operand:VI 0 "register_operand" "=v") - (not:VI (ior:VI (match_operand:VI 1 "register_operand" "v") - (match_operand:VI 2 "register_operand" "v"))))] - "TARGET_ALTIVEC" +(define_insn "*altivec_nor3" + [(set (match_operand:VM 0 "register_operand" "=v") + (not:VM (ior:VM (match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v"))))] + "VECTOR_MEM_ALTIVEC_P (mode)" "vnor %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "andc3" - [(set (match_operand:VI 0 "register_operand" "=v") - (and:VI (not:VI (match_operand:VI 2 "register_operand" "v")) - (match_operand:VI 1 "register_operand" "v")))] - "TARGET_ALTIVEC" - "vandc %0,%1,%2" - [(set_attr "type" "vecsimple")]) - -(define_insn "*andc3_v4sf" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (and:V4SF (not:V4SF (match_operand:V4SF 2 "register_operand" "v")) - (match_operand:V4SF 1 "register_operand" "v")))] - "TARGET_ALTIVEC" +(define_insn "*altivec_andc3" + [(set (match_operand:VM 0 "register_operand" "=v") + (and:VM (not:VM (match_operand:VM 2 "register_operand" "v")) + (match_operand:VM 1 "register_operand" "v")))] + "VECTOR_MEM_ALTIVEC_P (mode)" "vandc %0,%1,%2" [(set_attr "type" "vecsimple")]) @@ -1392,7 +1258,7 @@ (define_insn "*altivec_vspltsf" (vec_select:SF (match_operand:V4SF 1 "register_operand" "v") (parallel [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vspltw %0,%1,%2" [(set_attr "type" "vecperm")]) @@ -1404,19 +1270,19 @@ (define_insn "altivec_vspltis" "vspltis %0,%1" [(set_attr "type" "vecperm")]) -(define_insn 
"ftruncv4sf2" +(define_insn "*altivec_ftruncv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=v") (fix:V4SF (match_operand:V4SF 1 "register_operand" "v")))] - "TARGET_ALTIVEC" + "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" "vrfiz %0,%1" [(set_attr "type" "vecfloat")]) (define_insn "altivec_vperm_" - [(set (match_operand:V 0 "register_operand" "=v") - (unspec:V [(match_operand:V 1 "register_operand" "v") - (match_operand:V 2 "register_operand" "v") - (match_operand:V16QI 3 "register_operand" "v")] - UNSPEC_VPERM))] + [(set (match_operand:VM 0 "register_operand" "=v") + (unspec:VM [(match_operand:VM 1 "register_operand" "v") + (match_operand:VM 2 "register_operand" "v") + (match_operand:V16QI 3 "register_operand" "v")] + UNSPEC_VPERM))] "TARGET_ALTIVEC" "vperm %0,%1,%2,%3" [(set_attr "type" "vecperm")]) @@ -1515,180 +1381,6 @@ (define_insn "altivec_vrefp" "vrefp %0,%1" [(set_attr "type" "vecfloat")]) -(define_expand "vcondv4si" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (if_then_else:V4SI - (match_operator 3 "comparison_operator" - [(match_operand:V4SI 4 "register_operand" "v") - (match_operand:V4SI 5 "register_operand" "v")]) - (match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vconduv4si" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (if_then_else:V4SI - (match_operator 3 "comparison_operator" - [(match_operand:V4SI 4 "register_operand" "v") - (match_operand:V4SI 5 "register_operand" "v")]) - (match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vcondv4sf" - [(set 
(match_operand:V4SF 0 "register_operand" "=v") - (if_then_else:V4SF - (match_operator 3 "comparison_operator" - [(match_operand:V4SF 4 "register_operand" "v") - (match_operand:V4SF 5 "register_operand" "v")]) - (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vcondv8hi" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (if_then_else:V8HI - (match_operator 3 "comparison_operator" - [(match_operand:V8HI 4 "register_operand" "v") - (match_operand:V8HI 5 "register_operand" "v")]) - (match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vconduv8hi" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (if_then_else:V8HI - (match_operator 3 "comparison_operator" - [(match_operand:V8HI 4 "register_operand" "v") - (match_operand:V8HI 5 "register_operand" "v")]) - (match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vcondv16qi" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (if_then_else:V16QI - (match_operator 3 "comparison_operator" - [(match_operand:V16QI 4 "register_operand" "v") - (match_operand:V16QI 5 "register_operand" "v")]) - (match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], 
operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - -(define_expand "vconduv16qi" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (if_then_else:V16QI - (match_operator 3 "comparison_operator" - [(match_operand:V16QI 4 "register_operand" "v") - (match_operand:V16QI 5 "register_operand" "v")]) - (match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v")))] - "TARGET_ALTIVEC" - " -{ - if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2], - operands[3], operands[4], operands[5])) - DONE; - else - FAIL; -} - ") - - -(define_insn "altivec_vsel_v4si" - [(set (match_operand:V4SI 0 "register_operand" "=v") - (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") - (match_operand:V4SI 2 "register_operand" "v") - (match_operand:V4SI 3 "register_operand" "v")] - UNSPEC_VSEL4SI))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) - -(define_insn "altivec_vsel_v4sf" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v") - (match_operand:V4SI 3 "register_operand" "v")] - UNSPEC_VSEL4SF))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) - -(define_insn "altivec_vsel_v8hi" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v") - (match_operand:V8HI 3 "register_operand" "v")] - UNSPEC_VSEL8HI))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) - -(define_insn "altivec_vsel_v16qi" - [(set (match_operand:V16QI 0 "register_operand" "=v") - (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") - (match_operand:V16QI 2 "register_operand" "v") - (match_operand:V16QI 3 "register_operand" "v")] - UNSPEC_VSEL16QI))] - "TARGET_ALTIVEC" - "vsel %0,%1,%2,%3" - [(set_attr "type" "vecperm")]) - (define_insn "altivec_vsldoi_" 
[(set (match_operand:V 0 "register_operand" "=v") (unspec:V [(match_operand:V 1 "register_operand" "v") @@ -1959,95 +1651,6 @@ (define_insn "*altivec_stvesfx" "stvewx %1,%y0" [(set_attr "type" "vecstore")]) -(define_expand "vec_init" - [(match_operand:V 0 "register_operand" "") - (match_operand 1 "" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_init (operands[0], operands[1]); - DONE; -}) - -(define_expand "vec_setv4si" - [(match_operand:V4SI 0 "register_operand" "") - (match_operand:SI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_setv8hi" - [(match_operand:V8HI 0 "register_operand" "") - (match_operand:HI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_setv16qi" - [(match_operand:V16QI 0 "register_operand" "") - (match_operand:QI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_setv4sf" - [(match_operand:V4SF 0 "register_operand" "") - (match_operand:SF 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv4si" - [(match_operand:SI 0 "register_operand" "") - (match_operand:V4SI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv8hi" - [(match_operand:HI 0 "register_operand" "") - (match_operand:V8HI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" 
-{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv16qi" - [(match_operand:QI 0 "register_operand" "") - (match_operand:V16QI 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - -(define_expand "vec_extractv4sf" - [(match_operand:SF 0 "register_operand" "") - (match_operand:V4SF 1 "register_operand" "") - (match_operand 2 "const_int_operand" "")] - "TARGET_ALTIVEC" -{ - rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2])); - DONE; -}) - ;; Generate ;; vspltis? SCRATCH0,0 ;; vsubu?m SCRATCH2,SCRATCH1,%1 @@ -2069,7 +1672,7 @@ (define_expand "abs2" ;; vspltisw SCRATCH1,-1 ;; vslw SCRATCH2,SCRATCH1,SCRATCH1 ;; vandc %0,%1,SCRATCH2 -(define_expand "absv4sf2" +(define_expand "altivec_absv4sf2" [(set (match_dup 2) (vec_duplicate:V4SI (const_int -1))) (set (match_dup 3) @@ -2132,7 +1735,7 @@ (define_expand "vec_shl_" DONE; }") -;; Vector shift left in bits. Currently supported ony for shift +;; Vector shift right in bits. Currently supported ony for shift ;; amounts that can be expressed as byte shifts (divisible by 8). ;; General shift amounts can be supported using vsro + vsr. 
We're ;; not expecting to see these yet (the vectorizer currently @@ -2665,7 +2268,7 @@ (define_expand "vec_pack_trunc_v4si" DONE; }") -(define_expand "negv4sf2" +(define_expand "altivec_negv4sf2" [(use (match_operand:V4SF 0 "register_operand" "")) (use (match_operand:V4SF 1 "register_operand" ""))] "TARGET_ALTIVEC" @@ -2994,29 +2597,6 @@ (define_expand "vec_extract_oddv16qi" emit_insn (gen_vpkuhum_nomode (operands[0], operands[1], operands[2])); DONE; }") -(define_expand "vec_interleave_highv4sf" - [(set (match_operand:V4SF 0 "register_operand" "") - (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "") - (match_operand:V4SF 2 "register_operand" "")] - UNSPEC_INTERHI_V4SF))] - "TARGET_ALTIVEC" - " -{ - emit_insn (gen_altivec_vmrghsf (operands[0], operands[1], operands[2])); - DONE; -}") - -(define_expand "vec_interleave_lowv4sf" - [(set (match_operand:V4SF 0 "register_operand" "") - (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "") - (match_operand:V4SF 2 "register_operand" "")] - UNSPEC_INTERLO_V4SF))] - "TARGET_ALTIVEC" - " -{ - emit_insn (gen_altivec_vmrglsf (operands[0], operands[1], operands[2])); - DONE; -}") (define_expand "vec_interleave_high" [(set (match_operand:VI 0 "register_operand" "") --- gcc/config/rs6000/aix61.h (.../trunk) (revision 145777) +++ gcc/config/rs6000/aix61.h (.../branches/ibm/power7-meissner) (revision 146027) @@ -57,20 +57,24 @@ do { \ #undef ASM_SPEC #define ASM_SPEC "-u %{maix64:-a64 %{!mcpu*:-mppc64}} %(asm_cpu)" -/* Common ASM definitions used by ASM_SPEC amongst the various targets - for handling -mcpu=xxx switches. */ +/* Common ASM definitions used by ASM_SPEC amongst the various targets for + handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to + provide the default assembler options if the user uses -mcpu=native, so if + you make changes here, make them there also. 
*/ #undef ASM_CPU_SPEC #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ %{!maltivec: %{!mpower64: %(asm_default)}}}} \ +%{mcpu=native: %(asm_cpu_native)} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -mpwr4} \ %{mcpu=power5: -mpwr5} \ %{mcpu=power5+: -mpwr5x} \ %{mcpu=power6: -mpwr6} \ %{mcpu=power6x: -mpwr6} \ +%{mcpu=power7: -mpwr7} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc} \ %{mcpu=603: -m603} \ --- gcc/config/rs6000/rs6000.md (.../trunk) (revision 145777) +++ gcc/config/rs6000/rs6000.md (.../branches/ibm/power7-meissner) (revision 146027) @@ -138,7 +138,7 @@ ;; Processor type -- this attribute must exactly match the processor_type ;; enumeration in rs6000.h. -(define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc440,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,ppc8540,ppce300c2,ppce300c3,ppce500mc,power4,power5,power6,cell" +(define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc440,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,ppc8540,ppce300c2,ppce300c3,ppce500mc,power4,power5,power6,power7,cell" (const (symbol_ref "rs6000_cpu_attr"))) @@ -167,6 +167,7 @@ (include "power4.md") (include "power5.md") (include "power6.md") +(include "power7.md") (include "cell.md") (include "xfpu.md") @@ -218,6 +219,19 @@ ; DImode bits (define_mode_attr dbits [(QI "56") (HI "48") (SI "32")]) +;; ISEL/ISEL64 target selection +(define_mode_attr sel [(SI "") (DI "64")]) + +;; Suffix for reload patterns +(define_mode_attr ptrsize [(SI "32bit") + (DI "64bit")]) + +(define_mode_attr tptrsize [(SI "TARGET_32BIT") + (DI "TARGET_64BIT")]) + +(define_mode_attr mptrsize [(SI "si") + (DI "di")]) + ;; Start with fixed-point load and store insns. Here we put only the more ;; complex forms. Basic data transfer is done later. 
@@ -520,7 +534,7 @@ "@ {andil.|andi.} %2,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -546,7 +560,7 @@ "@ {andil.|andi.} %0,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -687,7 +701,7 @@ "@ {andil.|andi.} %2,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -713,7 +727,7 @@ "@ {andil.|andi.} %0,%1,0xff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -856,7 +870,7 @@ "@ {andil.|andi.} %2,%1,0xffff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -882,7 +896,7 @@ "@ {andil.|andi.} %0,%1,0xffff #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -1670,7 +1684,7 @@ "@ nor. %2,%1,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -1696,7 +1710,7 @@ "@ nor. 
%0,%1,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -2221,10 +2235,22 @@ "TARGET_POPCNTB" "popcntb %0,%1") +(define_insn "popcntwsi2" + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") + (popcount:SI (match_operand:SI 1 "gpc_reg_operand" "r")))] + "TARGET_POPCNTD" + "popcntw %0,%1") + +(define_insn "popcntddi2" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (popcount:DI (match_operand:DI 1 "gpc_reg_operand" "r")))] + "TARGET_POPCNTD && TARGET_POWERPC64" + "popcntd %0,%1") + (define_expand "popcount2" [(set (match_operand:GPR 0 "gpc_reg_operand" "") (popcount:GPR (match_operand:GPR 1 "gpc_reg_operand" "")))] - "TARGET_POPCNTB" + "TARGET_POPCNTB || TARGET_POPCNTD" { rs6000_emit_popcount (operands[0], operands[1]); DONE; @@ -2852,7 +2878,7 @@ {rlinm|rlwinm} %0,%1,0,%m2,%M2 {andil.|andi.} %0,%1,%b2 {andiu.|andis.} %0,%1,%u2" - [(set_attr "type" "*,*,compare,compare")]) + [(set_attr "type" "*,*,fast_compare,fast_compare")]) (define_insn "andsi3_nomc" [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r") @@ -2895,7 +2921,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,fast_compare,fast_compare,delayed_compare,\ + compare,compare,compare,compare") (set_attr "length" "4,4,4,4,8,8,8,8")]) (define_insn "*andsi3_internal3_mc" @@ -2915,7 +2942,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "compare,fast_compare,fast_compare,delayed_compare,compare,\ + compare,compare,compare") (set_attr "length" "8,4,4,4,8,8,8,8")]) (define_split @@ -2974,7 +3002,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,fast_compare,fast_compare,delayed_compare,\ + compare,compare,compare,compare") (set_attr "length" "4,4,4,4,8,8,8,8")]) (define_insn 
"*andsi3_internal5_mc" @@ -2996,7 +3025,8 @@ # # #" - [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare") + [(set_attr "type" "compare,fast_compare,fast_compare,delayed_compare,compare,\ + compare,compare,compare") (set_attr "length" "8,4,4,4,8,8,8,8")]) (define_split @@ -3127,7 +3157,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3156,7 +3186,7 @@ "@ %q4. %0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3281,7 +3311,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -3310,7 +3340,7 @@ "@ %q4. %0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -5303,7 +5333,7 @@ "fres %0,%1" [(set_attr "type" "fp")]) -(define_insn "" +(define_insn "*fmaddsf4_powerpc" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5314,7 +5344,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fmaddsf4_power" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5323,7 +5353,7 @@ "{fma|fmadd} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fmsubsf4_powerpc" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5334,7 +5364,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fmsubsf4_power" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF 
(match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5343,7 +5373,7 @@ "{fms|fmsub} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmaddsf4_powerpc_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5354,7 +5384,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmaddsf4_powerpc_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")) (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5365,7 +5395,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmaddsf4_power_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5374,7 +5404,7 @@ "{fnma|fnmadd} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmaddsf4_power_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")) (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5384,7 +5414,7 @@ "{fnma|fnmadd} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmsubsf4_powerpc_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5395,7 +5425,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn "*fnmsubsf4_powerpc_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (match_operand:SF 3 "gpc_reg_operand" "f") (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") @@ -5406,7 +5436,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "" +(define_insn 
"*fnmsubsf4_power_1" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") (match_operand:SF 2 "gpc_reg_operand" "f")) @@ -5415,7 +5445,7 @@ "{fnms|fnmsub} %0,%1,%2,%3" [(set_attr "type" "dmul")]) -(define_insn "" +(define_insn "*fnmsubsf4_power_2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (minus:SF (match_operand:SF 3 "gpc_reg_operand" "f") (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") @@ -5496,9 +5526,18 @@ (match_dup 5)) (match_dup 3) (match_dup 4)))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT - && !HONOR_NANS (DFmode) && !HONOR_SIGNED_ZEROS (DFmode)" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && ((TARGET_PPC_GFXOPT + && !HONOR_NANS (DFmode) + && !HONOR_SIGNED_ZEROS (DFmode)) + || VECTOR_UNIT_VSX_P (DFmode))" { + if (VECTOR_UNIT_VSX_P (DFmode)) + { + emit_insn (gen_vsx_copysigndf3 (operands[0], operands[1], + operands[2])); + DONE; + } operands[3] = gen_reg_rtx (DFmode); operands[4] = gen_reg_rtx (DFmode); operands[5] = CONST0_RTX (DFmode); @@ -5542,12 +5581,12 @@ DONE; }") -(define_expand "movsicc" - [(set (match_operand:SI 0 "gpc_reg_operand" "") - (if_then_else:SI (match_operand 1 "comparison_operator" "") - (match_operand:SI 2 "gpc_reg_operand" "") - (match_operand:SI 3 "gpc_reg_operand" "")))] - "TARGET_ISEL" +(define_expand "movcc" + [(set (match_operand:GPR 0 "gpc_reg_operand" "") + (if_then_else:GPR (match_operand 1 "comparison_operator" "") + (match_operand:GPR 2 "gpc_reg_operand" "") + (match_operand:GPR 3 "gpc_reg_operand" "")))] + "TARGET_ISEL" " { if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) @@ -5564,28 +5603,28 @@ ;; leave out the mode in operand 4 and use one pattern, but reload can ;; change the mode underneath our feet and then gets confused trying ;; to reload the value. 
-(define_insn "isel_signed" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (if_then_else:SI +(define_insn "isel_signed_" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") + (if_then_else:GPR (match_operator 1 "comparison_operator" [(match_operand:CC 4 "cc_reg_operand" "y") (const_int 0)]) - (match_operand:SI 2 "gpc_reg_operand" "b") - (match_operand:SI 3 "gpc_reg_operand" "b")))] - "TARGET_ISEL" + (match_operand:GPR 2 "gpc_reg_operand" "b") + (match_operand:GPR 3 "gpc_reg_operand" "b")))] + "TARGET_ISEL" "* { return output_isel (operands); }" [(set_attr "length" "4")]) -(define_insn "isel_unsigned" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (if_then_else:SI +(define_insn "isel_unsigned_" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") + (if_then_else:GPR (match_operator 1 "comparison_operator" [(match_operand:CCUNS 4 "cc_reg_operand" "y") (const_int 0)]) - (match_operand:SI 2 "gpc_reg_operand" "b") - (match_operand:SI 3 "gpc_reg_operand" "b")))] - "TARGET_ISEL" + (match_operand:GPR 2 "gpc_reg_operand" "b") + (match_operand:GPR 3 "gpc_reg_operand" "b")))] + "TARGET_ISEL" "* { return output_isel (operands); }" [(set_attr "length" "4")]) @@ -5633,7 +5672,8 @@ (define_insn "*negdf2_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (neg:DF (match_operand:DF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fneg %0,%1" [(set_attr "type" "fp")]) @@ -5646,14 +5686,16 @@ (define_insn "*absdf2_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (abs:DF (match_operand:DF 1 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fabs %0,%1" [(set_attr "type" "fp")]) (define_insn "*nabsdf2_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (neg:DF (abs:DF (match_operand:DF 1 
"gpc_reg_operand" "f"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fnabs %0,%1" [(set_attr "type" "fp")]) @@ -5668,7 +5710,8 @@ [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "{fa|fadd} %0,%1,%2" [(set_attr "type" "fp") (set_attr "fp_type" "fp_addsub_d")]) @@ -5684,7 +5727,8 @@ [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (minus:DF (match_operand:DF 1 "gpc_reg_operand" "f") (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "{fs|fsub} %0,%1,%2" [(set_attr "type" "fp") (set_attr "fp_type" "fp_addsub_d")]) @@ -5700,7 +5744,8 @@ [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "{fm|fmul} %0,%1,%2" [(set_attr "type" "dmul") (set_attr "fp_type" "fp_mul_d")]) @@ -5718,7 +5763,8 @@ [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (div:DF (match_operand:DF 1 "gpc_reg_operand" "f") (match_operand:DF 2 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU + && !VECTOR_UNIT_VSX_P (DFmode)" "{fd|fdiv} %0,%1,%2" [(set_attr "type" "ddiv")]) @@ -5734,73 +5780,81 @@ DONE; }) -(define_insn "fred" +(define_expand "fred" + [(set (match_operand:DF 0 "gpc_reg_operand" "=f") + (unspec:DF 
[(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))] + "(TARGET_POPCNTB || VECTOR_UNIT_VSX_P (DFmode)) && flag_finite_math_only" + "") + +(define_insn "*fred_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))] - "TARGET_POPCNTB && flag_finite_math_only" + "TARGET_POPCNTB && flag_finite_math_only && !VECTOR_UNIT_VSX_P (DFmode)" "fre %0,%1" [(set_attr "type" "fp")]) -(define_insn "" +(define_insn "*fmadddf4_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f")) (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && VECTOR_UNIT_NONE_P (DFmode)" "{fma|fmadd} %0,%1,%2,%3" [(set_attr "type" "dmul") (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" +(define_insn "*fmsubdf4_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f")) (match_operand:DF 3 "gpc_reg_operand" "f")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT" + "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT + && VECTOR_UNIT_NONE_P (DFmode)" "{fms|fmsub} %0,%1,%2,%3" [(set_attr "type" "dmul") (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" +(define_insn "*fnmadddf4_fpr_1" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (neg:DF (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f")) (match_operand:DF 3 "gpc_reg_operand" "f"))))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && HONOR_SIGNED_ZEROS (DFmode)" + && HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnma|fnmadd} %0,%1,%2,%3" [(set_attr 
"type" "dmul") (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" +(define_insn "*fnmadddf4_fpr_2" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (minus:DF (mult:DF (neg:DF (match_operand:DF 1 "gpc_reg_operand" "f")) (match_operand:DF 2 "gpc_reg_operand" "f")) (match_operand:DF 3 "gpc_reg_operand" "f")))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && ! HONOR_SIGNED_ZEROS (DFmode)" + && ! HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnma|fnmadd} %0,%1,%2,%3" [(set_attr "type" "dmul") (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" +(define_insn "*fnmsubdf4_fpr_1" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (neg:DF (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f")) (match_operand:DF 3 "gpc_reg_operand" "f"))))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && HONOR_SIGNED_ZEROS (DFmode)" + && HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnms|fnmsub} %0,%1,%2,%3" [(set_attr "type" "dmul") (set_attr "fp_type" "fp_maddsub_d")]) -(define_insn "" +(define_insn "*fnmsubdf4_fpr_2" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (minus:DF (match_operand:DF 3 "gpc_reg_operand" "f") (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f") (match_operand:DF 2 "gpc_reg_operand" "f"))))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && ! HONOR_SIGNED_ZEROS (DFmode)" + && ! 
HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)" "{fnms|fnmsub} %0,%1,%2,%3" [(set_attr "type" "dmul") (set_attr "fp_type" "fp_maddsub_d")]) @@ -5809,7 +5863,8 @@ [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "f")))] "(TARGET_PPC_GPOPT || TARGET_POWER2) && TARGET_HARD_FLOAT && TARGET_FPRS - && TARGET_DOUBLE_FLOAT" + && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "fsqrt %0,%1" [(set_attr "type" "dsqrt")]) @@ -5898,6 +5953,18 @@ "TARGET_HARD_FLOAT && !TARGET_FPRS && TARGET_SINGLE_FLOAT" "") +(define_expand "fixuns_truncdfsi2" + [(set (match_operand:SI 0 "gpc_reg_operand" "") + (unsigned_fix:SI (match_operand:DF 1 "gpc_reg_operand" "")))] + "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE" + "") + +(define_expand "fixuns_truncdfdi2" + [(set (match_operand:DI 0 "register_operand" "") + (unsigned_fix:DI (match_operand:DF 1 "register_operand" "")))] + "TARGET_HARD_FLOAT && TARGET_VSX" + "") + ; For each of these conversions, there is a define_expand, a define_insn ; with a '#' template, and a define_split (with C code). 
The idea is ; to allow constant folding with the template of the define_insn, @@ -6139,24 +6206,38 @@ "{fcirz|fctiwz} %0,%1" [(set_attr "type" "fp")]) -(define_insn "btruncdf2" +(define_expand "btruncdf2" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))] "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "") + +(define_insn "*btruncdf2_fprs" + [(set (match_operand:DF 0 "gpc_reg_operand" "=f") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "friz %0,%1" [(set_attr "type" "fp")]) (define_insn "btruncsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT " + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" "friz %0,%1" [(set_attr "type" "fp")]) -(define_insn "ceildf2" +(define_expand "ceildf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "")] UNSPEC_FRIP))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "") + +(define_insn "*ceildf2_fprs" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIP))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "frip %0,%1" [(set_attr "type" "fp")]) @@ -6167,10 +6248,17 @@ "frip %0,%1" [(set_attr "type" "fp")]) -(define_insn "floordf2" +(define_expand "floordf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "")] UNSPEC_FRIM))] + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + 
"") + +(define_insn "*floordf2_fprs" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIM))] - "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" + "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && !VECTOR_UNIT_VSX_P (DFmode)" "frim %0,%1" [(set_attr "type" "fp")]) @@ -6181,6 +6269,7 @@ "frim %0,%1" [(set_attr "type" "fp")]) +;; No VSX equivalent to frin (define_insn "rounddf2" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIN))] @@ -6195,6 +6284,12 @@ "frin %0,%1" [(set_attr "type" "fp")]) +(define_expand "ftruncdf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (fix:DF (match_operand:DF 1 "gpc_reg_operand" "")))] + "VECTOR_UNIT_VSX_P (DFmode)" + "") + ; An UNSPEC is used so we don't have to support SImode in FP registers. (define_insn "stfiwx" [(set (match_operand:SI 0 "memory_operand" "=Z") @@ -6210,17 +6305,40 @@ "TARGET_HARD_FLOAT && !TARGET_FPRS" "") -(define_insn "floatdidf2" +(define_expand "floatdidf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (float:DF (match_operand:DI 1 "gpc_reg_operand" "")))] + "(TARGET_POWERPC64 || TARGET_XILINX_FPU || VECTOR_UNIT_VSX_P (DFmode)) + && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS" + "") + +(define_insn "*floatdidf2_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (float:DF (match_operand:DI 1 "gpc_reg_operand" "!f#r")))] - "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS" + "(TARGET_POWERPC64 || TARGET_XILINX_FPU) + && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS + && !VECTOR_UNIT_VSX_P (DFmode)" "fcfid %0,%1" [(set_attr "type" "fp")]) -(define_insn "fix_truncdfdi2" +(define_expand "floatunsdidf2" + [(set (match_operand:DF 0 "gpc_reg_operand" "") + (unsigned_float:DF (match_operand:DI 1 "gpc_reg_operand" "")))] + "TARGET_VSX" + "") + 
+(define_expand "fix_truncdfdi2" + [(set (match_operand:DI 0 "gpc_reg_operand" "") + (fix:DI (match_operand:DF 1 "gpc_reg_operand" "")))] + "(TARGET_POWERPC64 || TARGET_XILINX_FPU || VECTOR_UNIT_VSX_P (DFmode)) + && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS" + "") + +(define_insn "*fix_truncdfdi2_fpr" [(set (match_operand:DI 0 "gpc_reg_operand" "=!f#r") (fix:DI (match_operand:DF 1 "gpc_reg_operand" "f")))] - "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS" + "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT + && TARGET_DOUBLE_FLOAT && TARGET_FPRS && !VECTOR_UNIT_VSX_P (DFmode)" "fctidz %0,%1" [(set_attr "type" "fp")]) @@ -7609,7 +7727,7 @@ andi. %0,%1,%b2 andis. %0,%1,%u2 #" - [(set_attr "type" "*,*,*,compare,compare,*") + [(set_attr "type" "*,*,*,fast_compare,fast_compare,*") (set_attr "length" "4,4,4,4,4,8")]) (define_insn "anddi3_nomc" @@ -7667,7 +7785,9 @@ # # #" - [(set_attr "type" "compare,compare,delayed_compare,compare,compare,compare,compare,compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,compare,delayed_compare,fast_compare,\ + fast_compare,compare,compare,compare,compare,compare,\ + compare,compare") (set_attr "length" "4,4,4,4,4,8,8,8,8,8,8,12")]) (define_split @@ -7718,7 +7838,9 @@ # # #" - [(set_attr "type" "compare,compare,delayed_compare,compare,compare,compare,compare,compare,compare,compare,compare,compare") + [(set_attr "type" "fast_compare,compare,delayed_compare,fast_compare,\ + fast_compare,compare,compare,compare,compare,compare,\ + compare,compare") (set_attr "length" "4,4,4,4,4,8,8,8,8,8,8,12")]) (define_split @@ -7858,7 +7980,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -7887,7 +8009,7 @@ "@ %q4. 
%0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -7958,7 +8080,7 @@ "@ %q4. %3,%2,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -7987,7 +8109,7 @@ "@ %q4. %0,%2,%1 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8024,7 +8146,7 @@ "@ %q4. %3,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8053,7 +8175,7 @@ "@ %q4. %0,%1,%2 #" - [(set_attr "type" "compare") + [(set_attr "type" "fast_compare,compare") (set_attr "length" "4,8")]) (define_split @@ -8070,6 +8192,51 @@ (compare:CC (match_dup 0) (const_int 0)))] "") + +(define_expand "smindi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); + DONE; +}") + +(define_expand "smaxdi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); + DONE; +}") + +(define_expand "umindi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], UMIN, operands[1], operands[2]); + DONE; +}") + +(define_expand "umaxdi3" + [(match_operand:DI 0 "gpc_reg_operand" "") + (match_operand:DI 1 "gpc_reg_operand" "") + (match_operand:DI 2 "gpc_reg_operand" "")] + "TARGET_ISEL64" + " +{ + rs6000_emit_minmax (operands[0], UMAX, operands[1], operands[2]); + DONE; +}") + ;; Now define ways of moving data around. @@ -8473,8 +8640,8 @@ ;; The "??" 
is a kludge until we can figure out a more reasonable way ;; of handling these non-offsettable values. (define_insn "*movdf_hardfloat32" - [(set (match_operand:DF 0 "nonimmediate_operand" "=!r,??r,m,f,f,m,!r,!r,!r") - (match_operand:DF 1 "input_operand" "r,m,r,f,m,f,G,H,F"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=!r, ??r, m, ws, ?wa, ws, ?wa, Z, ?Z, f, f, m, wa, !r, !r, !r") + (match_operand:DF 1 "input_operand" "r, m, r, ws, wa, Z, Z, ws, wa, f, m, f, j, G, H, F"))] "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && (gpc_reg_operand (operands[0], DFmode) || gpc_reg_operand (operands[1], DFmode))" @@ -8553,19 +8720,30 @@ return \"\"; } case 3: - return \"fmr %0,%1\"; case 4: - return \"lfd%U1%X1 %0,%1\"; + return \"xxlor %x0,%x1,%x1\"; case 5: - return \"stfd%U0%X0 %1,%0\"; case 6: + return \"lxsd%U1x %x0,%y1\"; case 7: case 8: + return \"stxsd%U0x %x1,%y0\"; + case 9: + return \"fmr %0,%1\"; + case 10: + return \"lfd%U1%X1 %0,%1\"; + case 11: + return \"stfd%U0%X0 %1,%0\"; + case 12: + return \"xxlxor %x0,%x0,%x0\"; + case 13: + case 14: + case 15: return \"#\"; } }" - [(set_attr "type" "two,load,store,fp,fpload,fpstore,*,*,*") - (set_attr "length" "8,16,16,4,4,4,8,12,16")]) + [(set_attr "type" "two, load, store, fp, fp, fpload, fpload, fpstore, fpstore, fp, fpload, fpstore, vecsimple, *, *, *") + (set_attr "length" "8, 16, 16, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 8, 12, 16")]) (define_insn "*movdf_softfloat32" [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r,m,r,r,r") @@ -8613,19 +8791,26 @@ ; ld/std require word-aligned displacements -> 'Y' constraint. ; List Y->r and r->Y before r->r for reload. 
(define_insn "*movdf_hardfloat64_mfpgpr" - [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,f,f,m,*c*l,!r,*h,!r,!r,!r,r,f") - (match_operand:DF 1 "input_operand" "r,Y,r,f,m,f,r,h,0,G,H,F,f,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=Y, r, !r, ws, ?wa, ws, ?wa, Z, ?Z, f, f, m, wa, *c*l, !r, *h, !r, !r, !r, r, f") + (match_operand:DF 1 "input_operand" "r, Y, r, ws, ?wa, Z, Z, ws, wa, f, m, f, j, r, h, 0, G, H, F, f, r"))] "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS - && TARGET_DOUBLE_FLOAT + && TARGET_DOUBLE_FLOAT && (gpc_reg_operand (operands[0], DFmode) || gpc_reg_operand (operands[1], DFmode))" "@ std%U0%X0 %1,%0 ld%U1%X1 %0,%1 mr %0,%1 + xxlor %x0,%x1,%x1 + xxlor %x0,%x1,%x1 + lxsd%U1x %x0,%y1 + lxsd%U1x %x0,%y1 + stxsd%U0x %x1,%y0 + stxsd%U0x %x1,%y0 fmr %0,%1 lfd%U1%X1 %0,%1 stfd%U0%X0 %1,%0 + xxlxor %x0,%x0,%x0 mt%0 %1 mf%1 %0 {cror 0,0,0|nop} @@ -8634,33 +8819,40 @@ # mftgpr %0,%1 mffgpr %0,%1" - [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr") - (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16,4,4")]) + [(set_attr "type" "store, load, *, fp, fp, fpload, fpload, fpstore, fpstore, fp, fpload, fpstore, vecsimple, mtjmpr, mfjmpr, *, *, *, *, mftgpr, mffgpr") + (set_attr "length" "4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 8, 12, 16, 4, 4")]) ; ld/std require word-aligned displacements -> 'Y' constraint. ; List Y->r and r->Y before r->r for reload. 
 (define_insn "*movdf_hardfloat64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,f,f,m,*c*l,!r,*h,!r,!r,!r")
-        (match_operand:DF 1 "input_operand" "r,Y,r,f,m,f,r,h,0,G,H,F"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y, r, !r, ws, ?wa, ws, ?wa, Z, ?Z, f, f, m, wa, *c*l, !r, *h, !r, !r, !r")
+        (match_operand:DF 1 "input_operand" "r, Y, r, ws, wa, Z, Z, ws, wa, f, m, f, j, r, h, 0, G, H, F"))]
   "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS
-   && TARGET_DOUBLE_FLOAT
+   && TARGET_DOUBLE_FLOAT
    && (gpc_reg_operand (operands[0], DFmode)
        || gpc_reg_operand (operands[1], DFmode))"
   "@
    std%U0%X0 %1,%0
    ld%U1%X1 %0,%1
    mr %0,%1
+   xxlor %x0,%x1,%x1
+   xxlor %x0,%x1,%x1
+   lxsd%U1x %x0,%y1
+   lxsd%U1x %x0,%y1
+   stxsd%U0x %x1,%y0
+   stxsd%U0x %x1,%y0
    fmr %0,%1
    lfd%U1%X1 %0,%1
    stfd%U0%X0 %1,%0
+   xxlxor %x0,%x0,%x0
    mt%0 %1
    mf%1 %0
    {cror 0,0,0|nop}
    #
    #
    #"
-  [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16")])
+  [(set_attr "type" "store, load, *, fp, fp, fpload, fpload, fpstore, fpstore, fp, fpload, fpstore, vecsimple, mtjmpr, mfjmpr, *, *, *, *")
+   (set_attr "length" " 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 8, 12, 16")])

 (define_insn "*movdf_softfloat64"
   [(set (match_operand:DF 0 "nonimmediate_operand" "=r,Y,r,cl,r,r,r,r,*h")
@@ -9237,15 +9429,16 @@
 (define_insn "*movti_ppc64"
   [(set (match_operand:TI 0 "nonimmediate_operand" "=r,o<>,r")
         (match_operand:TI 1 "input_operand" "r,r,m"))]
-  "TARGET_POWERPC64 && (gpc_reg_operand (operands[0], TImode)
-   || gpc_reg_operand (operands[1], TImode))"
+  "(TARGET_POWERPC64 && (gpc_reg_operand (operands[0], TImode)
+    || gpc_reg_operand (operands[1], TImode)))
+   && VECTOR_MEM_NONE_P (TImode)"
   "#"
   [(set_attr "type" "*,load,store")])

 (define_split
   [(set (match_operand:TI 0 "gpc_reg_operand" "")
         (match_operand:TI 1 "const_double_operand" ""))]
-  "TARGET_POWERPC64"
+  "TARGET_POWERPC64 && VECTOR_MEM_NONE_P (TImode)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
   "
@@ -9271,7 +9464,7 @@
 (define_split
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
         (match_operand:TI 1 "input_operand" ""))]
-  "reload_completed
+  "reload_completed && VECTOR_MEM_NONE_P (TImode)
    && gpr_or_gpr_p (operands[0], operands[1])"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
@@ -14891,6 +15084,8 @@


 (include "sync.md")
+(include "vector.md")
+(include "vsx.md")
 (include "altivec.md")
 (include "spe.md")
 (include "dfp.md")
--- gcc/config/rs6000/e500.h (.../trunk) (revision 145777)
+++ gcc/config/rs6000/e500.h (.../branches/ibm/power7-meissner) (revision 146027)
@@ -37,6 +37,8 @@
       { \
         if (TARGET_ALTIVEC) \
           error ("AltiVec and E500 instructions cannot coexist"); \
+        if (TARGET_VSX) \
+          error ("VSX and E500 instructions cannot coexist"); \
         if (TARGET_64BIT) \
           error ("64-bit E500 not supported"); \
         if (TARGET_HARD_FLOAT && TARGET_FPRS) \
--- gcc/config/rs6000/driver-rs6000.c (.../trunk) (revision 145777)
+++ gcc/config/rs6000/driver-rs6000.c (.../branches/ibm/power7-meissner) (revision 146027)
@@ -343,11 +343,115 @@ detect_processor_aix (void)

 #endif /* _AIX */

+/*
+ * Array to map -mcpu=native names to the switches passed to the assembler.
+ * This list mirrors the specs in ASM_CPU_SPEC, and any changes made here
+ * should be made there as well.
+ */
+
+struct asm_name {
+  const char *cpu;
+  const char *asm_sw;
+};
+
+static const
+struct asm_name asm_names[] = {
+#if defined (_AIX)
+  { "power3", "-m620" },
+  { "power4", "-mpwr4" },
+  { "power5", "-mpwr5" },
+  { "power5+", "-mpwr5x" },
+  { "power6", "-mpwr6" },
+  { "power6x", "-mpwr6" },
+  { "power7", "-mpwr7" },
+  { "powerpc", "-mppc" },
+  { "rs64a", "-mppc" },
+  { "603", "-m603" },
+  { "603e", "-m603" },
+  { "604", "-m604" },
+  { "604e", "-m604" },
+  { "620", "-m620" },
+  { "630", "-m620" },
+  { "970", "-m970" },
+  { "G5", "-m970" },
+  { NULL, "\
+%{!maix64: \
+%{mpowerpc64: -mppc64} \
+%{maltivec: -m970} \
+%{!maltivec: %{!mpower64: %(asm_default)}}}" },
+
+#else
+  { "common", "-mcom" },
+  { "cell", "-mcell" },
+  { "power", "-mpwr" },
+  { "power2", "-mpwrx" },
+  { "power3", "-mppc64" },
+  { "power4", "-mpower4" },
+  { "power5", "%(asm_cpu_power5)" },
+  { "power5+", "%(asm_cpu_power5)" },
+  { "power6", "%(asm_cpu_power6) -maltivec" },
+  { "power6x", "%(asm_cpu_power6) -maltivec" },
+  { "power7", "%(asm_cpu_power7)" },
+  { "powerpc", "-mppc" },
+  { "rios", "-mpwr" },
+  { "rios1", "-mpwr" },
+  { "rios2", "-mpwrx" },
+  { "rsc", "-mpwr" },
+  { "rsc1", "-mpwr" },
+  { "rs64a", "-mppc64" },
+  { "401", "-mppc" },
+  { "403", "-m403" },
+  { "405", "-m405" },
+  { "405fp", "-m405" },
+  { "440", "-m440" },
+  { "440fp", "-m440" },
+  { "464", "-m440" },
+  { "464fp", "-m440" },
+  { "505", "-mppc" },
+  { "601", "-m601" },
+  { "602", "-mppc" },
+  { "603", "-mppc" },
+  { "603e", "-mppc" },
+  { "ec603e", "-mppc" },
+  { "604", "-mppc" },
+  { "604e", "-mppc" },
+  { "620", "-mppc64" },
+  { "630", "-mppc64" },
+  { "740", "-mppc" },
+  { "750", "-mppc" },
+  { "G3", "-mppc" },
+  { "7400", "-mppc -maltivec" },
+  { "7450", "-mppc -maltivec" },
+  { "G4", "-mppc -maltivec" },
+  { "801", "-mppc" },
+  { "821", "-mppc" },
+  { "823", "-mppc" },
+  { "860", "-mppc" },
+  { "970", "-mpower4 -maltivec" },
+  { "G5", "-mpower4 -maltivec" },
+  { "8540", "-me500" },
+  { "8548", "-me500" },
+  { "e300c2", "-me300" },
+  { "e300c3", "-me300" },
+  { "e500mc", "-me500mc" },
+  { NULL, "\
+%{mpower: %{!mpower2: -mpwr}} \
+%{mpower2: -mpwrx} \
+%{mpowerpc64*: -mppc64} \
+%{!mpowerpc64*: %{mpowerpc*: -mppc}} \
+%{mno-power: %{!mpowerpc*: -mcom}} \
+%{!mno-power: %{!mpower*: %(asm_default)}}" },
+#endif
+};
+
 /* This will be called by the spec parser in gcc.c when it sees
    a %:local_cpu_detect(args) construct. Currently it will be called
    with either "arch" or "tune" as argument depending on if -march=native
    or -mtune=native is to be substituted.

+   Additionally it will be called with "asm" to select the appropriate flags
+   for the assembler.
+
    It returns a string containing new command line parameters to be
    put at the place of the above two options, depending on what CPU
    this is executed.
@@ -361,29 +465,35 @@ const char
   const char *cache = "";
   const char *options = "";
   bool arch;
+  bool assembler;
+  size_t i;

   if (argc < 1)
     return NULL;

   arch = strcmp (argv[0], "cpu") == 0;
-  if (!arch && strcmp (argv[0], "tune"))
+  assembler = (!arch && strcmp (argv[0], "asm") == 0);
+  if (!arch && !assembler && strcmp (argv[0], "tune"))
     return NULL;

+  if (! assembler)
+    {
 #if defined (_AIX)
-  cache = detect_caches_aix ();
+      cache = detect_caches_aix ();
 #elif defined (__APPLE__)
-  cache = detect_caches_darwin ();
+      cache = detect_caches_darwin ();
 #elif defined (__FreeBSD__)
-  cache = detect_caches_freebsd ();
-  /* FreeBSD PPC does not provide any cache information yet. */
-  cache = "";
+      cache = detect_caches_freebsd ();
+      /* FreeBSD PPC does not provide any cache information yet. */
+      cache = "";
 #elif defined (__linux__)
-  cache = detect_caches_linux ();
-  /* PPC Linux does not provide any cache information yet. */
-  cache = "";
+      cache = detect_caches_linux ();
+      /* PPC Linux does not provide any cache information yet. */
+      cache = "";
 #else
-  cache = "";
+      cache = "";
 #endif
+    }

 #if defined (_AIX)
   cpu = detect_processor_aix ();
@@ -397,6 +507,17 @@ const char
   cpu = "powerpc";
 #endif

+  if (assembler)
+    {
+      for (i = 0; i < sizeof (asm_names) / sizeof (asm_names[0]); i++)
+        {
+          if (!asm_names[i].cpu || !strcmp (asm_names[i].cpu, cpu))
+            return asm_names[i].asm_sw;
+        }
+
+      return NULL;
+    }
+
   return concat (cache, "-m", argv[0], "=", cpu, " ", options, NULL);
 }

--- gcc/config/rs6000/sysv4.h (.../trunk) (revision 145777)
+++ gcc/config/rs6000/sysv4.h (.../branches/ibm/power7-meissner) (revision 146027)
@@ -120,9 +120,9 @@ do { \
   else if (!strcmp (rs6000_abi_name, "i960-old")) \
     { \
       rs6000_current_abi = ABI_V4; \
-      target_flags |= (MASK_LITTLE_ENDIAN | MASK_EABI \
-                       | MASK_NO_BITFIELD_WORD); \
+      target_flags |= (MASK_LITTLE_ENDIAN | MASK_EABI); \
       target_flags &= ~MASK_STRICT_ALIGN; \
+      TARGET_NO_BITFIELD_WORD = 1; \
     } \
   else \
     { \
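
Reviewer note on the driver-rs6000.c change: the "asm" path appears to rely on the trailing { NULL, ... } entry of asm_names acting as a catch-all, since the loop returns on the first entry whose cpu field is NULL or matches the detected name. A minimal standalone sketch of that lookup pattern follows. The table here is a hypothetical three-entry excerpt, and the names demo_asm_names and demo_lookup_asm_switch are made up for illustration; they are not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the struct added by the patch.  */
struct asm_name {
  const char *cpu;     /* CPU name, or NULL for the catch-all entry.  */
  const char *asm_sw;  /* Switches handed to the assembler.  */
};

/* Hypothetical, shortened table for illustration only; the real
   asm_names array in the patch is much longer.  */
static const struct asm_name demo_asm_names[] = {
  { "power6", "-mpwr6" },
  { "power7", "-mpwr7" },
  { NULL, "%(asm_default)" },  /* Must stay last: it matches any CPU.  */
};

/* Return the assembler switches for CPU.  The first entry whose name
   is NULL or equal to CPU wins, so unknown names fall through to the
   NULL entry instead of reaching the final return.  */
static const char *
demo_lookup_asm_switch (const char *cpu)
{
  size_t i;

  for (i = 0; i < sizeof (demo_asm_names) / sizeof (demo_asm_names[0]); i++)
    if (!demo_asm_names[i].cpu || !strcmp (demo_asm_names[i].cpu, cpu))
      return demo_asm_names[i].asm_sw;

  /* Only reachable if the table has no NULL catch-all entry.  */
  return NULL;
}
```

With a table laid out this way, the ordering matters: any entry placed after the NULL row would never be reached, which is presumably why the patch keeps the NULL row at the end of each #if branch.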