comparison gcc/doc/md.texi @ 131:84e7813d76e9

gcc-8.2
author mir3636
date Thu, 25 Oct 2018 07:37:49 +0900
parents 04ced10e8804
children 1830386684a0
111:04ced10e8804 131:84e7813d76e9
1 @c Copyright (C) 1988-2017 Free Software Foundation, Inc. 1 @c Copyright (C) 1988-2018 Free Software Foundation, Inc.
2 @c This is part of the GCC manual. 2 @c This is part of the GCC manual.
3 @c For copying conditions, see the file gcc.texi. 3 @c For copying conditions, see the file gcc.texi.
4 4
5 @ifset INTERNALS 5 @ifset INTERNALS
6 @node Machine Desc 6 @node Machine Desc
113 113
114 A @code{define_insn} is an RTL expression containing four or five operands: 114 A @code{define_insn} is an RTL expression containing four or five operands:
115 115
116 @enumerate 116 @enumerate
117 @item 117 @item
118 An optional name. The presence of a name indicates that this instruction 118 An optional name @var{n}. When a name is present, the compiler
119 pattern can perform a certain standard job for the RTL-generation 119 automically generates a C++ function @samp{gen_@var{n}} that takes
120 pass of the compiler. This pass knows certain names and will use 120 the operands of the instruction as arguments and returns the instruction's
121 the instruction patterns with those names, if the names are defined 121 rtx pattern. The compiler also assigns the instruction a unique code
122 in the machine description. 122 @samp{CODE_FOR_@var{n}}, with all such codes belonging to an enum
123 called @code{insn_code}.
124
125 These names serve one of two purposes. The first is to indicate that the
126 instruction performs a certain standard job for the RTL-generation
127 pass of the compiler, such as a move, an addition, or a conditional
128 jump. The second is to help the target generate certain target-specific
129 operations, such as when implementing target-specific intrinsic functions.
130
131 It is better to prefix target-specific names with the name of the
132 target, to avoid any clash with current or future standard names.
123 133
124 The absence of a name is indicated by writing an empty string 134 The absence of a name is indicated by writing an empty string
125 where the name should go. Nameless instruction patterns are never 135 where the name should go. Nameless instruction patterns are never
126 used for generating RTL code, but they may permit several simpler insns 136 used for generating RTL code, but they may permit several simpler insns
127 to be combined later on. 137 to be combined later on.
128
129 Names that are not thus known and used in RTL-generation have no
130 effect; they are equivalent to no name at all.
131 138
132 For the purpose of debugging the compiler, you may also specify a 139 For the purpose of debugging the compiler, you may also specify a
133 name beginning with the @samp{*} character. Such a name is used only 140 name beginning with the @samp{*} character. Such a name is used only
134 for identifying the instruction in RTL dumps; it is equivalent to having 141 for identifying the instruction in RTL dumps; it is equivalent to having
135 a nameless pattern for all other purposes. Names beginning with the 142 a nameless pattern for all other purposes. Names beginning with the
136 @samp{*} character are not required to be unique. 143 @samp{*} character are not required to be unique.
144
145 The name may also have the form @samp{@@@var{n}}. This has the same
146 effect as a name @samp{@var{n}}, but in addition tells the compiler to
147 generate further helper functions; see @ref{Parameterized Names} for details.
137 148
138 @item 149 @item
139 The @dfn{RTL template}: This is a vector of incomplete RTL expressions 150 The @dfn{RTL template}: This is a vector of incomplete RTL expressions
140 which describe the semantics of the instruction (@pxref{RTL Template}). 151 which describe the semantics of the instruction (@pxref{RTL Template}).
141 It is incomplete because it may contain @code{match_operand}, 152 It is incomplete because it may contain @code{match_operand},
865 @end defun 876 @end defun
866 877
867 @defun push_operand 878 @defun push_operand
868 This predicate allows a memory reference suitable for pushing a value 879 This predicate allows a memory reference suitable for pushing a value
869 onto the stack. This will be a @code{MEM} which refers to 880 onto the stack. This will be a @code{MEM} which refers to
870 @code{stack_pointer_rtx}, with a side-effect in its address expression 881 @code{stack_pointer_rtx}, with a side effect in its address expression
871 (@pxref{Incdec}); which one is determined by the 882 (@pxref{Incdec}); which one is determined by the
872 @code{STACK_PUSH_CODE} macro (@pxref{Frame Layout}). 883 @code{STACK_PUSH_CODE} macro (@pxref{Frame Layout}).
873 @end defun 884 @end defun
874 885
875 @defun pop_operand 886 @defun pop_operand
876 This predicate allows a memory reference suitable for popping a value 887 This predicate allows a memory reference suitable for popping a value
877 off the stack. Again, this will be a @code{MEM} referring to 888 off the stack. Again, this will be a @code{MEM} referring to
878 @code{stack_pointer_rtx}, with a side-effect in its address 889 @code{stack_pointer_rtx}, with a side effect in its address
879 expression. However, this time @code{STACK_POP_CODE} is expected. 890 expression. However, this time @code{STACK_POP_CODE} is expected.
880 @end defun 891 @end defun
881 892
882 @noindent 893 @noindent
883 The fourth category of predicates allow some combination of the above 894 The fourth category of predicates allow some combination of the above
1089 operand can be a memory reference, and which kinds of address; whether the 1100 operand can be a memory reference, and which kinds of address; whether the
1090 operand may be an immediate constant, and which possible values it may 1101 operand may be an immediate constant, and which possible values it may
1091 have. Constraints can also require two operands to match. 1102 have. Constraints can also require two operands to match.
1092 Side-effects aren't allowed in operands of inline @code{asm}, unless 1103 Side-effects aren't allowed in operands of inline @code{asm}, unless
1093 @samp{<} or @samp{>} constraints are used, because there is no guarantee 1104 @samp{<} or @samp{>} constraints are used, because there is no guarantee
1094 that the side-effects will happen exactly once in an instruction that can update 1105 that the side effects will happen exactly once in an instruction that can update
1095 the addressing register. 1106 the addressing register.
1096 1107
1097 @ifset INTERNALS 1108 @ifset INTERNALS
1098 @menu 1109 @menu
1099 * Simple Constraints:: Basic use of constraints. 1110 * Simple Constraints:: Basic use of constraints.
1170 @cindex @samp{<} in constraint 1181 @cindex @samp{<} in constraint
1171 @item @samp{<} 1182 @item @samp{<}
1172 A memory operand with autodecrement addressing (either predecrement or 1183 A memory operand with autodecrement addressing (either predecrement or
1173 postdecrement) is allowed. In inline @code{asm} this constraint is only 1184 postdecrement) is allowed. In inline @code{asm} this constraint is only
1174 allowed if the operand is used exactly once in an instruction that can 1185 allowed if the operand is used exactly once in an instruction that can
1175 handle the side-effects. Not using an operand with @samp{<} in constraint 1186 handle the side effects. Not using an operand with @samp{<} in constraint
1176 string in the inline @code{asm} pattern at all or using it in multiple 1187 string in the inline @code{asm} pattern at all or using it in multiple
1177 instructions isn't valid, because the side-effects wouldn't be performed 1188 instructions isn't valid, because the side effects wouldn't be performed
1178 or would be performed more than once. Furthermore, on some targets 1189 or would be performed more than once. Furthermore, on some targets
1179 the operand with @samp{<} in constraint string must be accompanied by 1190 the operand with @samp{<} in constraint string must be accompanied by
1180 special instruction suffixes like @code{%U0} instruction suffix on PowerPC 1191 special instruction suffixes like @code{%U0} instruction suffix on PowerPC
1181 or @code{%P0} on IA-64. 1192 or @code{%P0} on IA-64.
1182 1193
1733 @table @code 1744 @table @code
1734 @item k 1745 @item k
1735 The stack pointer register (@code{SP}) 1746 The stack pointer register (@code{SP})
1736 1747
1737 @item w 1748 @item w
1738 Floating point or SIMD vector register 1749 Floating point register, Advanced SIMD vector register or SVE vector register
1750
1751 @item Upl
1752 One of the low eight SVE predicate registers (@code{P0} to @code{P7})
1753
1754 @item Upa
1755 Any of the SVE predicate registers (@code{P0} to @code{P15})
1739 1756
1740 @item I 1757 @item I
1741 Integer constant that is valid as an immediate operand in an @code{ADD} 1758 Integer constant that is valid as an immediate operand in an @code{ADD}
1742 instruction 1759 instruction
1743 1760
2112 Check for 64 bits wide constants for add/sub instructions 2129 Check for 64 bits wide constants for add/sub instructions
2113 2130
2114 @item G 2131 @item G
2115 Floating point constant that is legal for store immediate 2132 Floating point constant that is legal for store immediate
2116 @end table 2133 @end table
2134
2135 @item C-SKY---@file{config/csky/constraints.md}
2136 @table @code
2137
2138 @item a
2139 The mini registers r0 - r7.
2140
2141 @item b
2142 The low registers r0 - r15.
2143
2144 @item c
2145 C register.
2146
2147 @item y
2148 HI and LO registers.
2149
2150 @item l
2151 LO register.
2152
2153 @item h
2154 HI register.
2155
2156 @item v
2157 Vector registers.
2158
2159 @item z
2160 Stack pointer register (SP).
2161 @end table
2162
2163 @ifset INTERNALS
2164 The C-SKY back end supports a large set of additional constraints
2165 that are only useful for instruction selection or splitting rather
2166 than inline asm, such as constraints representing constant integer
2167 ranges accepted by particular instruction encodings.
2168 Refer to the source code for details.
2169 @end ifset
2117 2170
2118 @item Epiphany---@file{config/epiphany/constraints.md} 2171 @item Epiphany---@file{config/epiphany/constraints.md}
2119 @table @code 2172 @table @code
2120 @item U16 2173 @item U16
2121 An unsigned 16-bit constant. 2174 An unsigned 16-bit constant.
2958 3011
2959 @item d 3012 @item d
2960 Odd numbered general registers (R1, R3, R5). These are used for 3013 Odd numbered general registers (R1, R3, R5). These are used for
2961 16-bit multiply operations. 3014 16-bit multiply operations.
2962 3015
3016 @item D
3017 A memory reference that is encoded within the opcode, but not
3018 auto-increment or auto-decrement.
3019
2963 @item f 3020 @item f
2964 Any of the floating point registers (AC0 through AC5). 3021 Any of the floating point registers (AC0 through AC5).
2965 3022
2966 @item G 3023 @item G
2967 Floating point constant 0. 3024 Floating point constant 0.
3025
3026 @item h
3027 Floating point registers AC4 and AC5. These cannot be loaded from/to
3028 memory with a single instruction.
2968 3029
2969 @item I 3030 @item I
2970 An integer constant that fits in 16 bits. 3031 An integer constant that fits in 16 bits.
2971 3032
2972 @item J 3033 @item J
2984 3045
2985 @item N 3046 @item N
2986 The integer constant 0. 3047 The integer constant 0.
2987 3048
2988 @item O 3049 @item O
2989 Integer constants @minus{}4 through @minus{}1 and 1 through 4; shifts by these 3050 Integer constants 0 through 3; shifts by these
2990 amounts are handled as multiple single-bit shifts rather than a single 3051 amounts are handled as multiple single-bit shifts rather than a single
2991 variable-length shift. 3052 variable-length shift.
2992 3053
2993 @item Q 3054 @item Q
2994 A memory reference which requires an additional word (address or 3055 A memory reference which requires an additional word (address or
4181 VSIB address operand. 4242 VSIB address operand.
4182 4243
4183 @item Ts 4244 @item Ts
4184 Address operand without segment register. 4245 Address operand without segment register.
4185 4246
4186 @item Ti
4187 MPX address operand without index.
4188
4189 @item Tb
4190 MPX address operand without base.
4191
4192 @end table 4247 @end table
4193 4248
4194 @item Xstormy16---@file{config/stormy16/stormy16.h} 4249 @item Xstormy16---@file{config/stormy16/stormy16.h}
4195 @table @code 4250 @table @code
4196 @item a 4251 @item a
4847 instruction for some mode @var{n}, it also supports unaligned 4902 instruction for some mode @var{n}, it also supports unaligned
4848 loads for vectors of mode @var{n}. 4903 loads for vectors of mode @var{n}.
4849 4904
4850 This pattern is not allowed to @code{FAIL}. 4905 This pattern is not allowed to @code{FAIL}.
4851 4906
4907 @cindex @code{vec_mask_load_lanes@var{m}@var{n}} instruction pattern
4908 @item @samp{vec_mask_load_lanes@var{m}@var{n}}
4909 Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional
4910 mask operand (operand 2) that specifies which elements of the destination
4911 vectors should be loaded. Other elements of the destination
4912 vectors are set to zero. The operation is equivalent to:
4913
4914 @smallexample
4915 int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
4916 for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)
4917 if (operand2[j])
4918 for (i = 0; i < c; i++)
4919 operand0[i][j] = operand1[j * c + i];
4920 else
4921 for (i = 0; i < c; i++)
4922 operand0[i][j] = 0;
4923 @end smallexample
4924
4925 This pattern is not allowed to @code{FAIL}.
4926
4852 @cindex @code{vec_store_lanes@var{m}@var{n}} instruction pattern 4927 @cindex @code{vec_store_lanes@var{m}@var{n}} instruction pattern
4853 @item @samp{vec_store_lanes@var{m}@var{n}} 4928 @item @samp{vec_store_lanes@var{m}@var{n}}
4854 Equivalent to @samp{vec_load_lanes@var{m}@var{n}}, with the memory 4929 Equivalent to @samp{vec_load_lanes@var{m}@var{n}}, with the memory
4855 and register operands reversed. That is, the instruction is 4930 and register operands reversed. That is, the instruction is
4856 equivalent to: 4931 equivalent to:
4863 @end smallexample 4938 @end smallexample
4864 4939
4865 for a memory operand 0 and register operand 1. 4940 for a memory operand 0 and register operand 1.
4866 4941
4867 This pattern is not allowed to @code{FAIL}. 4942 This pattern is not allowed to @code{FAIL}.
4943
4944 @cindex @code{vec_mask_store_lanes@var{m}@var{n}} instruction pattern
4945 @item @samp{vec_mask_store_lanes@var{m}@var{n}}
4946 Like @samp{vec_store_lanes@var{m}@var{n}}, but takes an additional
4947 mask operand (operand 2) that specifies which elements of the source
4948 vectors should be stored. The operation is equivalent to:
4949
4950 @smallexample
4951 int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
4952 for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)
4953 if (operand2[j])
4954 for (i = 0; i < c; i++)
4955 operand0[j * c + i] = operand1[i][j];
4956 @end smallexample
4957
4958 This pattern is not allowed to @code{FAIL}.
4959
4960 @cindex @code{gather_load@var{m}} instruction pattern
4961 @item @samp{gather_load@var{m}}
4962 Load several separate memory locations into a vector of mode @var{m}.
4963 Operand 1 is a scalar base address and operand 2 is a vector of
4964 offsets from that base. Operand 0 is a destination vector with the
4965 same number of elements as the offset. For each element index @var{i}:
4966
4967 @itemize @bullet
4968 @item
4969 extend the offset element @var{i} to address width, using zero
4970 extension if operand 3 is 1 and sign extension if operand 3 is zero;
4971 @item
4972 multiply the extended offset by operand 4;
4973 @item
4974 add the result to the base; and
4975 @item
4976 load the value at that address into element @var{i} of operand 0.
4977 @end itemize
4978
4979 The value of operand 3 does not matter if the offsets are already
4980 address width.
4981
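The per-element steps listed for @samp{gather_load@var{m}} can be sketched as a scalar loop in C. This is only an illustrative model, not GCC code: the name @code{model_gather_load}, the fixed 32-bit element and offset types, and the byte-addressed @code{base} pointer are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of gather_load for 32-bit elements and
   32-bit offsets.  DEST stands for operand 0, BASE for operand 1,
   OFFSETS for operand 2, ZERO_EXTEND for operand 3 and SCALE for
   operand 4.  */
static void
model_gather_load (int32_t *dest, const unsigned char *base,
                   const uint32_t *offsets, int zero_extend,
                   int64_t scale, size_t nunits)
{
  assert (dest != NULL && base != NULL && offsets != NULL);
  for (size_t i = 0; i < nunits; i++)
    {
      /* Extend offset element I to address width (64 bits here).  */
      int64_t off = zero_extend ? (int64_t) offsets[i]
                                : (int64_t) (int32_t) offsets[i];
      /* Multiply by the scale, add the base, and load the value
         found at that address into element I of the result.  */
      memcpy (&dest[i], base + off * scale, sizeof dest[i]);
    }
}
```

With a scale equal to the element size, the offsets behave like array indices into the memory starting at @code{base}.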
4982 @cindex @code{mask_gather_load@var{m}} instruction pattern
4983 @item @samp{mask_gather_load@var{m}}
4984 Like @samp{gather_load@var{m}}, but takes an extra mask operand as
4985 operand 5. Bit @var{i} of the mask is set if element @var{i}
4986 of the result should be loaded from memory and clear if element @var{i}
4987 of the result should be set to zero.
4988
4989 @cindex @code{scatter_store@var{m}} instruction pattern
4990 @item @samp{scatter_store@var{m}}
4991 Store a vector of mode @var{m} into several distinct memory locations.
4992 Operand 0 is a scalar base address and operand 1 is a vector of offsets
4993 from that base. Operand 4 is the vector of values that should be stored,
4994 which has the same number of elements as the offset. For each element
4995 index @var{i}:
4996
4997 @itemize @bullet
4998 @item
4999 extend the offset element @var{i} to address width, using zero
5000 extension if operand 2 is 1 and sign extension if operand 2 is zero;
5001 @item
5002 multiply the extended offset by operand 3;
5003 @item
5004 add the result to the base; and
5005 @item
5006 store element @var{i} of operand 4 to that address.
5007 @end itemize
5008
5009 The value of operand 2 does not matter if the offsets are already
5010 address width.
5011
5012 @cindex @code{mask_scatter_store@var{m}} instruction pattern
5013 @item @samp{mask_scatter_store@var{m}}
5014 Like @samp{scatter_store@var{m}}, but takes an extra mask operand as
5015 operand 5. Bit @var{i} of the mask is set if element @var{i}
5016 of the result should be stored to memory.
4868 5017
4869 @cindex @code{vec_set@var{m}} instruction pattern 5018 @cindex @code{vec_set@var{m}} instruction pattern
4870 @item @samp{vec_set@var{m}} 5019 @item @samp{vec_set@var{m}}
4871 Set given field in the vector value. Operand 0 is the vector to modify, 5020 Set given field in the vector value. Operand 0 is the vector to modify,
4872 operand 1 is the new value of the field and operand 2 specifies the field index. 5021 operand 1 is the new value of the field and operand 2 specifies the field index.
4885 Initialize the vector to given values. Operand 0 is the vector to initialize 5034 Initialize the vector to given values. Operand 0 is the vector to initialize
4886 and operand 1 is parallel containing values for individual fields. The 5035 and operand 1 is parallel containing values for individual fields. The
4887 @var{n} mode is the mode of the elements, should be either element mode of 5036 @var{n} mode is the mode of the elements, should be either element mode of
4888 the vector mode @var{m}, or a vector mode with the same element mode and 5037 the vector mode @var{m}, or a vector mode with the same element mode and
4889 smaller number of elements. 5038 smaller number of elements.
5039
5040 @cindex @code{vec_duplicate@var{m}} instruction pattern
5041 @item @samp{vec_duplicate@var{m}}
5042 Initialize vector output operand 0 so that each element has the value given
5043 by scalar input operand 1. The vector has mode @var{m} and the scalar has
5044 the mode appropriate for one element of @var{m}.
5045
5046 This pattern only handles duplicates of non-constant inputs. Constant
5047 vectors go through the @code{mov@var{m}} pattern instead.
5048
5049 This pattern is not allowed to @code{FAIL}.
5050
5051 @cindex @code{vec_series@var{m}} instruction pattern
5052 @item @samp{vec_series@var{m}}
5053 Initialize vector output operand 0 so that element @var{i} is equal to
5054 operand 1 plus @var{i} times operand 2. In other words, create a linear
5055 series whose base value is operand 1 and whose step is operand 2.
5056
5057 The vector output has mode @var{m} and the scalar inputs have the mode
5058 appropriate for one element of @var{m}. This pattern is not used for
5059 floating-point vectors, in order to avoid having to specify the
5060 rounding behavior for @var{i} > 1.
5061
5062 This pattern is not allowed to @code{FAIL}.
5063
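The linear series described for @samp{vec_series@var{m}} can be written out as a short C loop. This is a sketch under stated assumptions rather than anything taken from GCC: the name @code{model_vec_series} and the choice of 32-bit unsigned elements are hypothetical, and unsigned arithmetic is used so that wraparound is well defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vec_series for 32-bit integer elements:
   element I of OUT (operand 0) is BASE (operand 1) plus I times STEP
   (operand 2).  */
static void
model_vec_series (uint32_t *out, uint32_t base, uint32_t step,
                  size_t nunits)
{
  assert (out != NULL);
  for (size_t i = 0; i < nunits; i++)
    out[i] = base + (uint32_t) i * step;
}
```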
5064 @cindex @code{while_ult@var{m}@var{n}} instruction pattern
5065 @item @code{while_ult@var{m}@var{n}}
5066 Set operand 0 to a mask that is true while incrementing operand 1
5067 gives a value that is less than operand 2. Operand 0 has mode @var{n}
5068 and operands 1 and 2 are scalar integers of mode @var{m}.
5069 The operation is equivalent to:
5070
5071 @smallexample
5072 operand0[0] = operand1 < operand2;
5073 for (i = 1; i < GET_MODE_NUNITS (@var{n}); i++)
5074 operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
5075 @end smallexample
4890 5076
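The pseudo-code given for @code{while_ult@var{m}@var{n}} can be turned into a self-contained C function. This is a minimal sketch: the name @code{model_while_ult}, the use of one @code{bool} per mask element and the 64-bit scalar type are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of while_ult.  MASK stands for operand 0;
   OP1 and OP2 are the unsigned scalars.  Once an element is false,
   every later element is false too, even if OP1 + I wraps around.  */
static void
model_while_ult (bool *mask, uint64_t op1, uint64_t op2, size_t nunits)
{
  assert (mask != NULL && nunits > 0);
  mask[0] = op1 < op2;
  for (size_t i = 1; i < nunits; i++)
    mask[i] = mask[i - 1] && (op1 + i < op2);
}
```

For example, with operand 1 equal to 5, operand 2 equal to 8 and four mask elements, the first three elements are true and the last is false.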
4891 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern 5077 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
4892 @item @samp{vec_cmp@var{m}@var{n}} 5078 @item @samp{vec_cmp@var{m}@var{n}}
4893 Output a vector comparison. Operand 0 of mode @var{n} is the destination for 5079 Output a vector comparison. Operand 0 of mode @var{n} is the destination for
4894 predicate in operand 1 which is a signed vector comparison with operands of 5080 predicate in operand 1 which is a signed vector comparison with operands of
4970 @samp{vec_perm} pattern for mode @var{m}, but there is for mode @var{q} 5156 @samp{vec_perm} pattern for mode @var{m}, but there is for mode @var{q}
4971 where @var{q} is a vector of @code{QImode} of the same width as @var{m}, 5157 where @var{q} is a vector of @code{QImode} of the same width as @var{m},
4972 the middle-end will lower the mode @var{m} @code{VEC_PERM_EXPR} to 5158 the middle-end will lower the mode @var{m} @code{VEC_PERM_EXPR} to
4973 mode @var{q}. 5159 mode @var{q}.
4974 5160
4975 @cindex @code{vec_perm_const@var{m}} instruction pattern 5161 See also @code{TARGET_VECTORIZER_VEC_PERM_CONST}, which performs
4976 @item @samp{vec_perm_const@var{m}} 5162 the analogous operation for constant selectors.
4977 Like @samp{vec_perm} except that the permutation is a compile-time
4978 constant. That is, operand 3, the @dfn{selector}, is a @code{CONST_VECTOR}.
4979
4980 Some targets cannot perform a permutation with a variable selector,
4981 but can efficiently perform a constant permutation. Further, the
4982 target hook @code{vec_perm_ok} is queried to determine if the
4983 specific constant permutation is available efficiently; the named
4984 pattern is never expanded without @code{vec_perm_ok} returning true.
4985
4986 There is no need for a target to supply both @samp{vec_perm@var{m}}
4987 and @samp{vec_perm_const@var{m}} if the former can trivially implement
4988 the operation with, say, the vector constant loaded into a register.
4989 5163
4990 @cindex @code{push@var{m}1} instruction pattern 5164 @cindex @code{push@var{m}1} instruction pattern
4991 @item @samp{push@var{m}1} 5165 @item @samp{push@var{m}1}
4992 Output a push instruction. Operand 0 is value to push. Used only when 5166 Output a push instruction. Operand 0 is value to push. Used only when
4993 @code{PUSH_ROUNDING} is defined. For historical reason, this pattern may be 5167 @code{PUSH_ROUNDING} is defined. For historical reason, this pattern may be
5139 @item @samp{reduc_plus_scal_@var{m}} 5313 @item @samp{reduc_plus_scal_@var{m}}
5140 Compute the sum of the elements of a vector. The vector is operand 1, and 5314 Compute the sum of the elements of a vector. The vector is operand 1, and
5141 operand 0 is the scalar result, with mode equal to the mode of the elements of 5315 operand 0 is the scalar result, with mode equal to the mode of the elements of
5142 the input vector. 5316 the input vector.
5143 5317
5318 @cindex @code{reduc_and_scal_@var{m}} instruction pattern
5319 @item @samp{reduc_and_scal_@var{m}}
5320 @cindex @code{reduc_ior_scal_@var{m}} instruction pattern
5321 @itemx @samp{reduc_ior_scal_@var{m}}
5322 @cindex @code{reduc_xor_scal_@var{m}} instruction pattern
5323 @itemx @samp{reduc_xor_scal_@var{m}}
5324 Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements
5325 of a vector of mode @var{m}. Operand 1 is the vector input and operand 0
5326 is the scalar result. The mode of the scalar result is the same as one
5327 element of @var{m}.
5328
5329 @cindex @code{extract_last_@var{m}} instruction pattern
5330 @item @code{extract_last_@var{m}}
5331 Find the last set bit in mask operand 1 and extract the associated element
5332 of vector operand 2. Store the result in scalar operand 0. Operand 2
5333 has vector mode @var{m} while operand 0 has the mode appropriate for one
5334 element of @var{m}. Operand 1 has the usual mask mode for vectors of mode
5335 @var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.
5336
5337 @cindex @code{fold_extract_last_@var{m}} instruction pattern
5338 @item @code{fold_extract_last_@var{m}}
5339 If any bits of mask operand 2 are set, find the last set bit, extract
5340 the associated element from vector operand 3, and store the result
5341 in operand 0. Store operand 1 in operand 0 otherwise. Operand 3
5342 has mode @var{m} and operands 0 and 1 have the mode appropriate for
5343 one element of @var{m}. Operand 2 has the usual mask mode for vectors
5344 of mode @var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.
5345
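The behavior described for @code{fold_extract_last_@var{m}} can be modeled with a scalar loop. The sketch below is illustrative only: the name @code{model_fold_extract_last}, the 32-bit element type and the one-@code{bool}-per-element mask are assumptions, not part of GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of fold_extract_last.  FALLBACK stands for
   operand 1, MASK for operand 2 and VEC for operand 3; the return
   value stands for operand 0.  */
static int32_t
model_fold_extract_last (int32_t fallback, const bool *mask,
                         const int32_t *vec, size_t nunits)
{
  assert (mask != NULL && vec != NULL);
  int32_t result = fallback;
  /* The last set mask element wins; if none is set, FALLBACK is kept.  */
  for (size_t i = 0; i < nunits; i++)
    if (mask[i])
      result = vec[i];
  return result;
}
```

Under the same assumptions, @code{extract_last_@var{m}} corresponds to the case where at least one mask element is set, so no fallback value is needed.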
5346 @cindex @code{fold_left_plus_@var{m}} instruction pattern
5347 @item @code{fold_left_plus_@var{m}}
5348 Take scalar operand 1 and successively add each element from vector
5349 operand 2. Store the result in scalar operand 0. The vector has
5350 mode @var{m} and the scalars have the mode appropriate for one
5351 element of @var{m}. The operation is strictly in-order: there is
5352 no reassociation.
5353
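The strictly in-order addition described for @code{fold_left_plus_@var{m}} can be sketched in C as follows. The name @code{model_fold_left_plus} and the choice of @code{float} elements are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of fold_left_plus for float elements.
   INIT stands for operand 1 and VEC for operand 2; the return value
   stands for operand 0.  Elements are added one at a time, in
   increasing index order, with no reassociation.  */
static float
model_fold_left_plus (float init, const float *vec, size_t nunits)
{
  assert (vec != NULL);
  float acc = init;
  for (size_t i = 0; i < nunits; i++)
    acc += vec[i];
  return acc;
}
```

The ordering matters for floating-point modes. Starting from 1e8f and adding the elements -1e8f and 1.0f in order gives 1.0f, whereas in IEEE single precision summing the two elements first would round -1e8f + 1.0f back to -1e8f and lose the 1.0f.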
5144 @cindex @code{sdot_prod@var{m}} instruction pattern 5354 @cindex @code{sdot_prod@var{m}} instruction pattern
5145 @item @samp{sdot_prod@var{m}} 5355 @item @samp{sdot_prod@var{m}}
5146 @cindex @code{udot_prod@var{m}} instruction pattern 5356 @cindex @code{udot_prod@var{m}} instruction pattern
5147 @itemx @samp{udot_prod@var{m}} 5357 @itemx @samp{udot_prod@var{m}}
5148 Compute the sum of the products of two signed/unsigned elements. 5358 Compute the sum of the products of two signed/unsigned elements.
5168 Operands 0 and 2 are of the same mode, which is wider than the mode of 5378 Operands 0 and 2 are of the same mode, which is wider than the mode of
5169 operand 1. Add operand 1 to operand 2 and place the widened result in 5379 operand 1. Add operand 1 to operand 2 and place the widened result in
5170 operand 0. (This is used to express accumulation of elements into an accumulator 5380 operand 0. (This is used to express accumulation of elements into an accumulator
5171 of a wider mode.) 5381 of a wider mode.)
5172 5382
5383 @cindex @code{vec_shl_insert_@var{m}} instruction pattern
5384 @item @samp{vec_shl_insert_@var{m}}
5385 Shift the elements in vector input operand 1 left one element (i.e.
5386 away from element 0) and fill the vacated element 0 with the scalar
5387 in operand 2. Store the result in vector output operand 0. Operands
5388 0 and 1 have mode @var{m} and operand 2 has the mode appropriate for
5389 one element of @var{m}.
5390
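The element movement described for @samp{vec_shl_insert_@var{m}} can be sketched in C. This is an illustrative model only: the name @code{model_vec_shl_insert} and the 32-bit element type are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vec_shl_insert.  OUT stands for
   operand 0, IN for operand 1 and SCALAR for operand 2.  Each element
   moves one position away from element 0 and the vacated element 0
   receives SCALAR; the old last element is dropped.  Copying from the
   top down keeps the result correct even if OUT and IN are the same
   array.  */
static void
model_vec_shl_insert (int32_t *out, const int32_t *in, int32_t scalar,
                      size_t nunits)
{
  assert (out != NULL && in != NULL && nunits > 0);
  for (size_t i = nunits - 1; i > 0; i--)
    out[i] = in[i - 1];
  out[0] = scalar;
}
```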
5173 @cindex @code{vec_shr_@var{m}} instruction pattern 5391 @cindex @code{vec_shr_@var{m}} instruction pattern
5174 @item @samp{vec_shr_@var{m}} 5392 @item @samp{vec_shr_@var{m}}
5175 Whole vector right shift in bits, i.e. towards element 0. 5393 Whole vector right shift in bits, i.e. towards element 0.
5176 Operand 1 is a vector to be shifted. 5394 Operand 1 is a vector to be shifted.
5177 Operand 2 is an integer shift amount in bits. 5395 Operand 2 is an integer shift amount in bits.
5198 @cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern 5416 @cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern
5199 @item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}} 5417 @item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}
5200 Narrow, convert to signed/unsigned integral type and merge the elements 5418 Narrow, convert to signed/unsigned integral type and merge the elements
5201 of two vectors. Operands 1 and 2 are vectors of the same mode having N 5419 of two vectors. Operands 1 and 2 are vectors of the same mode having N
5202 floating point elements of size S@. Operand 0 is the resulting vector 5420 floating point elements of size S@. Operand 0 is the resulting vector
5421 in which 2*N elements of size S/2 are concatenated.
5422
5423 @cindex @code{vec_packs_float_@var{m}} instruction pattern
5424 @cindex @code{vec_packu_float_@var{m}} instruction pattern
5425 @item @samp{vec_packs_float_@var{m}}, @samp{vec_packu_float_@var{m}}
5426 Narrow, convert to floating point type and merge the elements
5427 of two vectors. Operands 1 and 2 are vectors of the same mode having N
5428 signed/unsigned integral elements of size S@. Operand 0 is the resulting vector
5203 in which 2*N elements of size S/2 are concatenated. 5429 in which 2*N elements of size S/2 are concatenated.
5204 5430
5205 @cindex @code{vec_unpacks_hi_@var{m}} instruction pattern 5431 @cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
5206 @cindex @code{vec_unpacks_lo_@var{m}} instruction pattern 5432 @cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
5207 @item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}} 5433 @item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
5227 @itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}} 5453 @itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}}
5228 Extract, convert to floating point type and widen the high/low part of a 5454 Extract, convert to floating point type and widen the high/low part of a
5229 vector of signed/unsigned integral elements. The input vector (operand 1) 5455 vector of signed/unsigned integral elements. The input vector (operand 1)
5230 has N elements of size S@. Convert the high/low elements of the vector using 5456 has N elements of size S@. Convert the high/low elements of the vector using
5231 floating point conversion and place the resulting N/2 values of size 2*S in 5457 floating point conversion and place the resulting N/2 values of size 2*S in
5458 the output vector (operand 0).
5459
5460 @cindex @code{vec_unpack_sfix_trunc_hi_@var{m}} instruction pattern
5461 @cindex @code{vec_unpack_sfix_trunc_lo_@var{m}} instruction pattern
5462 @cindex @code{vec_unpack_ufix_trunc_hi_@var{m}} instruction pattern
5463 @cindex @code{vec_unpack_ufix_trunc_lo_@var{m}} instruction pattern
5464 @item @samp{vec_unpack_sfix_trunc_hi_@var{m}},
5465 @itemx @samp{vec_unpack_sfix_trunc_lo_@var{m}}
5466 @itemx @samp{vec_unpack_ufix_trunc_hi_@var{m}}
5467 @itemx @samp{vec_unpack_ufix_trunc_lo_@var{m}}
5468 Extract, convert to signed/unsigned integer type and widen the high/low part of a
5469 vector of floating point elements. The input vector (operand 1)
5470 has N elements of size S@. Convert the high/low elements of the vector
5471 to integers and place the resulting N/2 values of size 2*S in
5232 the output vector (operand 0). 5472 the output vector (operand 0).
5233 5473
5234 @cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern 5474 @cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern
5235 @cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern 5475 @cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern
5236 @cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern 5476 @cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern
5404 @cindex @code{vrotr@var{m}3} instruction pattern 5644 @cindex @code{vrotr@var{m}3} instruction pattern
5405 @item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3} 5645 @item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3}
5406 Vector shift and rotate instructions that take vectors as operand 2 5646 Vector shift and rotate instructions that take vectors as operand 2
5407 instead of a scalar type. 5647 instead of a scalar type.
5408 5648
5649 @cindex @code{avg@var{m}3_floor} instruction pattern
5650 @cindex @code{uavg@var{m}3_floor} instruction pattern
5651 @item @samp{avg@var{m}3_floor}
5652 @itemx @samp{uavg@var{m}3_floor}
5653 Signed and unsigned average instructions. These instructions add
5654 operands 1 and 2 without truncation, divide the result by 2,
5655 round towards -Inf, and store the result in operand 0. This is
5656 equivalent to the C code:
5657 @smallexample
5658 narrow op0, op1, op2;
5659 @dots{}
5660 op0 = (narrow) (((wide) op1 + (wide) op2) >> 1);
5661 @end smallexample
5662 where the sign of @samp{narrow} determines whether this is a signed
5663 or unsigned operation.
5664
5665 @cindex @code{avg@var{m}3_ceil} instruction pattern
5666 @cindex @code{uavg@var{m}3_ceil} instruction pattern
5667 @item @samp{avg@var{m}3_ceil}
5668 @itemx @samp{uavg@var{m}3_ceil}
5669 Like @samp{avg@var{m}3_floor} and @samp{uavg@var{m}3_floor}, but round
5670 towards +Inf. This is equivalent to the C code:
5671 @smallexample
5672 narrow op0, op1, op2;
5673 @dots{}
5674 op0 = (narrow) (((wide) op1 + (wide) op2 + 1) >> 1);
5675 @end smallexample
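
As a rough illustration of the two rounding modes, the sketch below models the signed and unsigned patterns for 8-bit elements in plain C. The function names are hypothetical, @code{int16_t} and @code{uint16_t} stand in for the @samp{wide} type, and the signed versions assume that @samp{>>} on a negative value is an arithmetic shift, as it is in GCC.

```c
#include <stdint.h>

/* Signed floor average: widen, add, then shift right by one.
   Assumes >> on a negative value is an arithmetic shift.  */
int8_t avg_floor_s8 (int8_t op1, int8_t op2)
{
  return (int8_t) (((int16_t) op1 + (int16_t) op2) >> 1);
}

/* Signed ceiling average: the extra 1 rounds towards +Inf.  */
int8_t avg_ceil_s8 (int8_t op1, int8_t op2)
{
  return (int8_t) (((int16_t) op1 + (int16_t) op2 + 1) >> 1);
}

/* Unsigned floor average.  */
uint8_t uavg_floor_u8 (uint8_t op1, uint8_t op2)
{
  return (uint8_t) (((uint16_t) op1 + (uint16_t) op2) >> 1);
}

/* Unsigned ceiling average.  */
uint8_t uavg_ceil_u8 (uint8_t op1, uint8_t op2)
{
  return (uint8_t) (((uint16_t) op1 + (uint16_t) op2 + 1) >> 1);
}
```

With these definitions, averaging @minus{}3 and 0 gives @minus{}2 in the floor form and @minus{}1 in the ceiling form, and averaging 127 with 127 does not overflow because the addition happens in the wider type.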
5676
5409 @cindex @code{bswap@var{m}2} instruction pattern 5677 @cindex @code{bswap@var{m}2} instruction pattern
5410 @item @samp{bswap@var{m}2} 5678 @item @samp{bswap@var{m}2}
5411 Reverse the order of bytes of operand 1 and store the result in operand 0. 5679 Reverse the order of bytes of operand 1 and store the result in operand 0.
5412 5680
5413 @cindex @code{neg@var{m}2} instruction pattern 5681 @cindex @code{neg@var{m}2} instruction pattern
6160 Similar to @samp{mov@var{mode}cc} but for conditional addition. Conditionally 6428 Similar to @samp{mov@var{mode}cc} but for conditional addition. Conditionally
6161 move operand 2 or (operand 2 + operand 3) into operand 0 according to the 6429 move operand 2 or (operand 2 + operand 3) into operand 0 according to the
6162 comparison in operand 1. If the comparison is false, operand 2 is moved into 6430 comparison in operand 1. If the comparison is false, operand 2 is moved into
6163 operand 0, otherwise (operand 2 + operand 3) is moved. 6431 operand 0, otherwise (operand 2 + operand 3) is moved.
6164 6432
6433 @cindex @code{cond_add@var{mode}} instruction pattern
6434 @cindex @code{cond_sub@var{mode}} instruction pattern
6435 @cindex @code{cond_mul@var{mode}} instruction pattern
6436 @cindex @code{cond_div@var{mode}} instruction pattern
6437 @cindex @code{cond_udiv@var{mode}} instruction pattern
6438 @cindex @code{cond_mod@var{mode}} instruction pattern
6439 @cindex @code{cond_umod@var{mode}} instruction pattern
6440 @cindex @code{cond_and@var{mode}} instruction pattern
6441 @cindex @code{cond_ior@var{mode}} instruction pattern
6442 @cindex @code{cond_xor@var{mode}} instruction pattern
6443 @cindex @code{cond_smin@var{mode}} instruction pattern
6444 @cindex @code{cond_smax@var{mode}} instruction pattern
6445 @cindex @code{cond_umin@var{mode}} instruction pattern
6446 @cindex @code{cond_umax@var{mode}} instruction pattern
6447 @item @samp{cond_add@var{mode}}
6448 @itemx @samp{cond_sub@var{mode}}
6449 @itemx @samp{cond_mul@var{mode}}
6450 @itemx @samp{cond_div@var{mode}}
6451 @itemx @samp{cond_udiv@var{mode}}
6452 @itemx @samp{cond_mod@var{mode}}
6453 @itemx @samp{cond_umod@var{mode}}
6454 @itemx @samp{cond_and@var{mode}}
6455 @itemx @samp{cond_ior@var{mode}}
6456 @itemx @samp{cond_xor@var{mode}}
6457 @itemx @samp{cond_smin@var{mode}}
6458 @itemx @samp{cond_smax@var{mode}}
6459 @itemx @samp{cond_umin@var{mode}}
6460 @itemx @samp{cond_umax@var{mode}}
6461 When operand 1 is true, perform an operation on operands 2 and 3 and
6462 store the result in operand 0, otherwise store operand 4 in operand 0.
6463 The operation works elementwise if the operands are vectors.
6464
6465 The scalar case is equivalent to:
6466
6467 @smallexample
6468 op0 = op1 ? op2 @var{op} op3 : op4;
6469 @end smallexample
6470
6471 while the vector case is equivalent to:
6472
6473 @smallexample
6474 for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)
6475 op0[i] = op1[i] ? op2[i] @var{op} op3[i] : op4[i];
6476 @end smallexample
6477
6478 where, for example, @var{op} is @code{+} for @samp{cond_add@var{mode}}.
6479
6480 When defined for floating-point modes, the contents of @samp{op3[i]}
6481 are not interpreted if @samp{op1[i]} is false, just like they would not
6482 be in a normal C @samp{?:} condition.
6483
6484 Operands 0, 2, 3 and 4 all have mode @var{m}. Operand 1 is a scalar
6485 integer if @var{m} is scalar, otherwise it has the mode returned by
6486 @code{TARGET_VECTORIZE_GET_MASK_MODE}.
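
The vector form can be modelled in ordinary C as follows. This is only a sketch of the semantics for @samp{cond_add@var{mode}}: the function name is hypothetical, 32-bit integer elements are assumed, and an array of @code{bool} stands in for whatever mask mode the target would actually use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the vector form of cond_add: each active lane receives
   op2[i] + op3[i]; each inactive lane receives the fallback op4[i].  */
void cond_add_model (int32_t *op0, const bool *op1,
                     const int32_t *op2, const int32_t *op3,
                     const int32_t *op4, size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    op0[i] = op1[i] ? op2[i] + op3[i] : op4[i];
}
```

Replacing @samp{+} with another operator gives the corresponding model for the other patterns in this group.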
6487
6488 @cindex @code{cond_fma@var{mode}} instruction pattern
6489 @cindex @code{cond_fms@var{mode}} instruction pattern
6490 @cindex @code{cond_fnma@var{mode}} instruction pattern
6491 @cindex @code{cond_fnms@var{mode}} instruction pattern
6492 @item @samp{cond_fma@var{mode}}
6493 @itemx @samp{cond_fms@var{mode}}
6494 @itemx @samp{cond_fnma@var{mode}}
6495 @itemx @samp{cond_fnms@var{mode}}
6496 Like @samp{cond_add@var{mode}}, except that the conditional operation
6497 takes three operands rather than two. For example, the vector form of
6498 @samp{cond_fma@var{mode}} is equivalent to:
6499
6500 @smallexample
6501 for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)
6502 op0[i] = op1[i] ? fma (op2[i], op3[i], op4[i]) : op5[i];
6503 @end smallexample
6504
6165 @cindex @code{neg@var{mode}cc} instruction pattern 6505 @cindex @code{neg@var{mode}cc} instruction pattern
6166 @item @samp{neg@var{mode}cc} 6506 @item @samp{neg@var{mode}cc}
6167 Similar to @samp{mov@var{mode}cc} but for conditional negation. Conditionally 6507 Similar to @samp{mov@var{mode}cc} but for conditional negation. Conditionally
6168 move the negation of operand 2 or the unchanged operand 3 into operand 0 6508 move the negation of operand 2 or the unchanged operand 3 into operand 0
6169 according to the comparison in operand 1. If the comparison is true, the negation 6509 according to the comparison in operand 1. If the comparison is true, the negation
6402 The @samp{tablejump} insn is always the last insn before the jump 6742 The @samp{tablejump} insn is always the last insn before the jump
6403 table it uses. Its assembler code normally has no need to use the 6743 table it uses. Its assembler code normally has no need to use the
6404 second operand, but you should incorporate it in the RTL pattern so 6744 second operand, but you should incorporate it in the RTL pattern so
6405 that the jump optimizer will not delete the table as unreachable code. 6745 that the jump optimizer will not delete the table as unreachable code.
6406 6746
6407
6408 @cindex @code{decrement_and_branch_until_zero} instruction pattern
6409 @item @samp{decrement_and_branch_until_zero}
6410 Conditional branch instruction that decrements a register and
6411 jumps if the register is nonzero. Operand 0 is the register to
6412 decrement and test; operand 1 is the label to jump to if the
6413 register is nonzero. @xref{Looping Patterns}.
6414
6415 This optional instruction pattern is only used by the combiner,
6416 typically for loops reversed by the loop optimizer when strength
6417 reduction is enabled.
6418 6747
6419 @cindex @code{doloop_end} instruction pattern 6748 @cindex @code{doloop_end} instruction pattern
6420 @item @samp{doloop_end} 6749 @item @samp{doloop_end}
6421 Conditional branch instruction that decrements a register and 6750 Conditional branch instruction that decrements a register and
6422 jumps if the register is nonzero. Operand 0 is the register to 6751 jumps if the register is nonzero. Operand 0 is the register to
6747 @item @samp{memory_barrier} 7076 @item @samp{memory_barrier}
6748 If the target memory model is not fully synchronous, then this pattern 7077 If the target memory model is not fully synchronous, then this pattern
6749 should be defined to an instruction that orders both loads and stores 7078 should be defined to an instruction that orders both loads and stores
6750 before the instruction with respect to loads and stores after the instruction. 7079 before the instruction with respect to loads and stores after the instruction.
6751 This pattern has no operands. 7080 This pattern has no operands.
7081
7082 @cindex @code{speculation_barrier} instruction pattern
7083 @item @samp{speculation_barrier}
7084 If the target can support speculative execution, then this pattern should
7085 be defined to an instruction that will block subsequent execution until
7086 any prior speculation conditions have been resolved. The pattern must also
7087 ensure that the compiler cannot move memory operations past the barrier,
7088 so it needs to be an @code{UNSPEC_VOLATILE} pattern. The pattern has no
7089 operands.
7090
7091 If this pattern is not defined then the default expansion of
7092 @code{__builtin_speculation_safe_value} will emit a warning. You can
7093 suppress this warning by defining this pattern with a final condition
7094 of @code{0} (zero), which tells the compiler that a speculation
7095 barrier is not needed for this target.
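
For context, the sketch below shows the kind of user code that reaches this pattern through @code{__builtin_speculation_safe_value}. The function @code{load_checked} and the macro @code{SPECULATION_SAFE} are hypothetical names. The preprocessor guard is an assumption made so that the example still compiles with compilers that lack the builtin, in which case the macro falls back to a plain use of the value and provides no protection.

```c
#include <stddef.h>

/* Use the builtin only when the compiler reports that it exists.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_speculation_safe_value)
#  define SPECULATION_SAFE(x) __builtin_speculation_safe_value (x)
# endif
#endif
#ifndef SPECULATION_SAFE
# define SPECULATION_SAFE(x) (x)  /* Fallback: no hardening.  */
#endif

/* Bounds-checked load.  If the bounds check is mispredicted, the
   builtin yields a safe index (zero by default) for the speculative
   access instead of the out-of-range one.  */
int load_checked (const int *table, size_t len, size_t idx)
{
  if (idx < len)
    return table[SPECULATION_SAFE (idx)];
  return -1;
}
```

The architectural result is the same with or without the builtin; only the speculative behavior differs.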
6752 7096
6753 @cindex @code{sync_compare_and_swap@var{mode}} instruction pattern 7097 @cindex @code{sync_compare_and_swap@var{mode}} instruction pattern
6754 @item @samp{sync_compare_and_swap@var{mode}} 7098 @item @samp{sync_compare_and_swap@var{mode}}
6755 This pattern, if defined, emits code for an atomic compare-and-swap 7099 This pattern, if defined, emits code for an atomic compare-and-swap
6756 operation. Operand 1 is the memory on which the atomic operation is 7100 operation. Operand 1 is the memory on which the atomic operation is
7237 mark the top and end of a loop and to count the number of loop 7581 mark the top and end of a loop and to count the number of loop
7238 iterations. This avoids the need for fetching and executing a 7582 iterations. This avoids the need for fetching and executing a
7239 @samp{dbra}-like instruction and avoids pipeline stalls associated with 7583 @samp{dbra}-like instruction and avoids pipeline stalls associated with
7240 the jump. 7584 the jump.
7241 7585
7242 GCC has three special named patterns to support low overhead looping. 7586 GCC has two special named patterns to support low overhead looping.
7243 They are @samp{decrement_and_branch_until_zero}, @samp{doloop_begin}, 7587 They are @samp{doloop_begin} and @samp{doloop_end}. These are emitted
7244 and @samp{doloop_end}. The first pattern, 7588 by the loop optimizer for certain well-behaved loops with a finite
7245 @samp{decrement_and_branch_until_zero}, is not emitted during RTL 7589 number of loop iterations using information collected during strength
7246 generation but may be emitted during the instruction combination phase. 7590 reduction.
7247 This requires the assistance of the loop optimizer, using information
7248 collected during strength reduction, to reverse a loop to count down to
7249 zero. Some targets also require the loop optimizer to add a
7250 @code{REG_NONNEG} note to indicate that the iteration count is always
7251 positive. This is needed if the target performs a signed loop
7252 termination test. For example, the 68000 uses a pattern similar to the
7253 following for its @code{dbra} instruction:
7254
7255 @smallexample
7256 @group
7257 (define_insn "decrement_and_branch_until_zero"
7258 [(set (pc)
7259 (if_then_else
7260 (ge (plus:SI (match_operand:SI 0 "general_operand" "+d*am")
7261 (const_int -1))
7262 (const_int 0))
7263 (label_ref (match_operand 1 "" ""))
7264 (pc)))
7265 (set (match_dup 0)
7266 (plus:SI (match_dup 0)
7267 (const_int -1)))]
7268 "find_reg_note (insn, REG_NONNEG, 0)"
7269 "@dots{}")
7270 @end group
7271 @end smallexample
7272
7273 Note that since the insn is both a jump insn and has an output, it must
7274 deal with its own reloads, hence the `m' constraints. Also note that
7275 since this insn is generated by the instruction combination phase
7276 combining two sequential insns together into an implicit parallel insn,
7277 the iteration counter needs to be biased by the same amount as the
7278 decrement operation, in this case @minus{}1. Note that the following similar
7279 pattern will not be matched by the combiner.
7280
7281 @smallexample
7282 @group
7283 (define_insn "decrement_and_branch_until_zero"
7284 [(set (pc)
7285 (if_then_else
7286 (ge (match_operand:SI 0 "general_operand" "+d*am")
7287 (const_int 1))
7288 (label_ref (match_operand 1 "" ""))
7289 (pc)))
7290 (set (match_dup 0)
7291 (plus:SI (match_dup 0)
7292 (const_int -1)))]
7293 "find_reg_note (insn, REG_NONNEG, 0)"
7294 "@dots{}")
7295 @end group
7296 @end smallexample
7297
7298 The other two special looping patterns, @samp{doloop_begin} and
7299 @samp{doloop_end}, are emitted by the loop optimizer for certain
7300 well-behaved loops with a finite number of loop iterations using
7301 information collected during strength reduction.
7302 7591
7303 The @samp{doloop_end} pattern describes the actual looping instruction 7592 The @samp{doloop_end} pattern describes the actual looping instruction
7304 (or the implicit looping operation) and the @samp{doloop_begin} pattern 7593 (or the implicit looping operation) and the @samp{doloop_begin} pattern
7305 is an optional companion pattern that can be used for initialization 7594 is an optional companion pattern that can be used for initialization
7306 needed for some low-overhead looping instructions. 7595 needed for some low-overhead looping instructions.
7316 additional labels can be emitted at this point. In addition, if the 7605 additional labels can be emitted at this point. In addition, if the
7317 desired special iteration counter register was not allocated, this 7606 desired special iteration counter register was not allocated, this
7318 machine dependent reorg pass could emit a traditional compare and jump 7607 machine dependent reorg pass could emit a traditional compare and jump
7319 instruction pair. 7608 instruction pair.
7320 7609
7321 The essential difference between the 7610 For the @samp{doloop_end} pattern, the loop optimizer allocates an
7322 @samp{decrement_and_branch_until_zero} and the @samp{doloop_end} 7611 additional pseudo register as an iteration counter. This pseudo
7323 patterns is that the loop optimizer allocates an additional pseudo 7612 register cannot be used within the loop (i.e., general induction
7324 register for the latter as an iteration counter. This pseudo register 7613 variables cannot be derived from it), however, in many cases the loop
7325 cannot be used within the loop (i.e., general induction variables cannot 7614 induction variable may become redundant and removed by the flow pass.
7326 be derived from it), however, in many cases the loop induction variable 7615
7327 may become redundant and removed by the flow pass. 7616 The @samp{doloop_end} pattern must have a specific structure to be
7328 7617 handled correctly by GCC. The example below is taken (slightly
7618 simplified) from the PDP-11 target:
7619
7620 @smallexample
7621 @group
7622 (define_expand "doloop_end"
7623 [(parallel [(set (pc)
7624 (if_then_else
7625 (ne (match_operand:HI 0 "nonimmediate_operand" "+r,!m")
7626 (const_int 1))
7627 (label_ref (match_operand 1 "" ""))
7628 (pc)))
7629 (set (match_dup 0)
7630 (plus:HI (match_dup 0)
7631 (const_int -1)))])]
7632 ""
7633 "@{
7634 if (GET_MODE (operands[0]) != HImode)
7635 FAIL;
7636 @}")
7637
7638 (define_insn "doloop_end_insn"
7639 [(set (pc)
7640 (if_then_else
7641 (ne (match_operand:HI 0 "nonimmediate_operand" "+r,!m")
7642 (const_int 1))
7643 (label_ref (match_operand 1 "" ""))
7644 (pc)))
7645 (set (match_dup 0)
7646 (plus:HI (match_dup 0)
7647 (const_int -1)))]
7648 ""
7649
7650 @{
7651 if (which_alternative == 0)
7652 return "sob %0,%l1";
7653
7654 /* emulate sob */
7655 output_asm_insn ("dec %0", operands);
7656 return "bne %l1";
7657 @})
7658 @end group
7659 @end smallexample
7660
7661 The first part of the pattern describes the branch condition. GCC
7662 supports three cases for the way the target machine handles the loop
7663 counter:
7664 @itemize @bullet
7665 @item Loop terminates when the loop register decrements to zero. This
7666 is represented by a @code{ne} comparison of the register (its old value)
7667 with constant 1 (as in the example above).
7668 @item Loop terminates when the loop register decrements to @minus{}1.
7669 This is represented by a @code{ne} comparison of the register with
7670 constant zero.
7671 @item Loop terminates when the loop register decrements to a negative
7672 value. This is represented by a @code{ge} comparison of the register
7673 with constant zero. For this case, GCC will attach a @code{REG_NONNEG}
7674 note to the @code{doloop_end} insn if it can determine that the register
7675 will be non-negative.
7676 @end itemize
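
The practical consequence of the three cases is the initial value that has to be loaded into the loop register. The C sketch below is a simplified model rather than GCC code: the names are hypothetical, and each case is expressed only through the termination condition described in the list above, applied to the register value after the decrement.

```c
/* How the target decides that the loop is finished.  */
enum doloop_kind
{
  EXIT_AT_ZERO,        /* register decrements to zero */
  EXIT_AT_MINUS_ONE,   /* register decrements to -1 */
  EXIT_WHEN_NEGATIVE   /* register decrements to any negative value */
};

/* Return how many times the loop body runs when the loop register
   starts at COUNTER.  COUNTER must be at least 1 for EXIT_AT_ZERO
   and at least 0 for EXIT_AT_MINUS_ONE, otherwise the model does
   not terminate in a meaningful way.  */
long doloop_trip_count (enum doloop_kind kind, long counter)
{
  long trips = 0;
  for (;;)
    {
      int exit_loop;

      trips++;     /* one execution of the loop body */
      counter--;   /* the decrement performed by doloop_end */

      switch (kind)
        {
        case EXIT_AT_ZERO:
          exit_loop = (counter == 0);
          break;
        case EXIT_AT_MINUS_ONE:
          exit_loop = (counter == -1);
          break;
        default:
          exit_loop = (counter < 0);
          break;
        }
      if (exit_loop)
        return trips;
    }
}
```

In this model, five iterations need an initial value of 5 in the first case and 4 in the other two. Only the third case stops promptly when the register starts out negative.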
7677
7678 Since the @code{doloop_end} insn is a jump insn that also has an output,
7679 the reload pass does not handle the output operand. Therefore, the
7680 constraint must allow for that operand to be in memory rather than a
7681 register. In the example shown above, that is handled (in the
7682 @code{doloop_end_insn} pattern) by using a loop instruction sequence
7683 that can handle memory operands when the memory alternative appears.
7684
7685 GCC does not check the mode of the loop register operand when generating
7686 the @code{doloop_end} pattern. If the pattern is only valid for some
7687 modes but not others, the pattern should be a @code{define_expand}
7688 pattern that checks the operand mode in the preparation code, and issues
7689 @code{FAIL} if an unsupported mode is found. The example above does
7690 this, since the machine instruction to be used only exists for
7691 @code{HImode}.
7692
7693 If the @code{doloop_end} pattern is a @code{define_expand}, there must
7694 also be a @code{define_insn} or @code{define_insn_and_split} matching
7695 the generated pattern. Otherwise, the compiler will fail during loop
7696 optimization.
7329 7697
7330 @end ifset 7698 @end ifset
7331 @ifset INTERNALS 7699 @ifset INTERNALS
7332 @node Insn Canonicalizations 7700 @node Insn Canonicalizations
7333 @section Canonicalization of Instructions 7701 @section Canonicalization of Instructions
7781 and are executed before the new RTL is generated to prepare for the 8149 and are executed before the new RTL is generated to prepare for the
7782 generated code or emit some insns whose pattern is not fixed. Unlike 8150 generated code or emit some insns whose pattern is not fixed. Unlike
7783 those in @code{define_expand}, however, these statements must not 8151 those in @code{define_expand}, however, these statements must not
7784 generate any new pseudo-registers. Once reload has completed, they also 8152 generate any new pseudo-registers. Once reload has completed, they also
7785 must not allocate any space in the stack frame. 8153 must not allocate any space in the stack frame.
8154
8155 There are two special macros defined for use in the preparation statements:
8156 @code{DONE} and @code{FAIL}. Use them with a following semicolon,
8157 as a statement.
8158
8159 @table @code
8160
8161 @findex DONE
8162 @item DONE
8163 Use the @code{DONE} macro to end RTL generation for the splitter. The
8164 only RTL insns generated as replacement for the matched input insn will
8165 be those already emitted by explicit calls to @code{emit_insn} within
8166 the preparation statements; the replacement pattern is not used.
8167
8168 @findex FAIL
8169 @item FAIL
8170 Make the @code{define_split} fail on this occasion. When a @code{define_split}
8171 fails, it means that the splitter was not truly available for the inputs
8172 it was given, and the input insn will not be split.
8173 @end table
8174
8175 If the preparation falls through (invokes neither @code{DONE} nor
8176 @code{FAIL}), then the @code{define_split} uses the replacement
8177 template.
7786 8178
7787 Patterns are matched against @var{insn-pattern} in two different 8179 Patterns are matched against @var{insn-pattern} in two different
7788 circumstances. If an insn needs to be split for delay slot scheduling 8180 circumstances. If an insn needs to be split for delay slot scheduling
7789 or insn scheduling, the insn is already known to be valid, which means 8181 or insn scheduling, the insn is already known to be valid, which means
7790 that it must have been matched by some @code{define_insn} and, if 8182 that it must have been matched by some @code{define_insn} and, if
7888 definitions, one for the insns that are valid and one for the insns that 8280 definitions, one for the insns that are valid and one for the insns that
7889 are not valid. 8281 are not valid.
7890 8282
7891 The splitter is allowed to split jump instructions into a sequence of 8283 The splitter is allowed to split jump instructions into a sequence of
7892 jumps or create new jumps while splitting non-jump instructions. As 8284 jumps or create new jumps while splitting non-jump instructions. As
7893 the central flowgraph and branch prediction information needs to be updated, 8285 the control flow graph and branch prediction information needs to be updated,
7894 several restrictions apply. 8286 several restrictions apply.
7895 8287
7896 Splitting of a jump instruction into a sequence that ends with another jump 8288 Splitting of a jump instruction into a sequence that ends with another jump
7897 instruction is always valid, as the compiler expects identical behavior of the new 8289 instruction is always valid, as the compiler expects identical behavior of the new
7898 jump. When the new sequence contains multiple jump instructions or new labels, 8290 jump. When the new sequence contains multiple jump instructions or new labels,
8336 (set (match_dup 0) (match_dup 4)) 8728 (set (match_dup 0) (match_dup 4))
8337 (set (match_dup 2) (match_dup 4)) 8729 (set (match_dup 2) (match_dup 4))
8338 (set (match_dup 3) (match_dup 4))] 8730 (set (match_dup 3) (match_dup 4))]
8339 "") 8731 "")
8340 @end smallexample 8732 @end smallexample
8733
8734 There are two special macros defined for use in the preparation statements:
8735 @code{DONE} and @code{FAIL}. Use them with a following semicolon,
8736 as a statement.
8737
8738 @table @code
8739
8740 @findex DONE
8741 @item DONE
8742 Use the @code{DONE} macro to end RTL generation for the peephole. The
8743 only RTL insns generated as replacement for the matched input insn will
8744 be those already emitted by explicit calls to @code{emit_insn} within
8745 the preparation statements; the replacement pattern is not used.
8746
8747 @findex FAIL
8748 @item FAIL
8749 Make the @code{define_peephole2} fail on this occasion. When a @code{define_peephole2}
8750 fails, it means that the replacement was not truly available for the
8751 particular inputs it was given. In that case, GCC may still apply a
8752 later @code{define_peephole2} that also matches the given insn pattern.
8753 (Note that this is different from @code{define_split}, where @code{FAIL}
8754 prevents the input insn from being split at all.)
8755 @end table
8756
8757 If the preparation falls through (invokes neither @code{DONE} nor
8758 @code{FAIL}), then the @code{define_peephole2} uses the replacement
8759 template.
8341 8760
8342 @noindent 8761 @noindent
8343 If we had not added the @code{(match_dup 4)} in the middle of the input 8762 If we had not added the @code{(match_dup 4)} in the middle of the input
8344 sequence, it might have been the case that the register we chose at the 8763 sequence, it might have been the case that the register we chose at the
8345 beginning of the sequence is killed by the first or second @code{set}. 8764 beginning of the sequence is killed by the first or second @code{set}.
9615 All simple integer insns can be executed in any integer pipeline and 10034 All simple integer insns can be executed in any integer pipeline and
9616 their result is ready in two cycles. The simple integer insns are 10035 their result is ready in two cycles. The simple integer insns are
9617 issued into the first pipeline unless it is reserved, otherwise they 10036 issued into the first pipeline unless it is reserved, otherwise they
9618 are issued into the second pipeline. Integer division and 10037 are issued into the second pipeline. Integer division and
9619 multiplication insns can be executed only in the second integer 10038 multiplication insns can be executed only in the second integer
9620 pipeline and their results are ready correspondingly in 8 and 4 10039 pipeline and their results are ready correspondingly in 9 and 4
9621 cycles. The integer division is not pipelined, i.e.@: the subsequent 10040 cycles. The integer division is not pipelined, i.e.@: the subsequent
9622 integer division insn cannot be issued until the current division 10041 integer division insn cannot be issued until the current division
9623 insn has finished. Floating point insns are fully pipelined and their 10042 insn has finished. Floating point insns are fully pipelined and their
9624 results are ready in 3 cycles. Where the result of a floating point 10043 results are ready in 3 cycles. Where the result of a floating point
9625 insn is used by an integer insn, an additional delay of one cycle is 10044 insn is used by an integer insn, an additional delay of one cycle is
9632 "(i0_pipeline | i1_pipeline), (port0 | port1)") 10051 "(i0_pipeline | i1_pipeline), (port0 | port1)")
9633 10052
9634 (define_insn_reservation "mult" 4 (eq_attr "type" "mult") 10053 (define_insn_reservation "mult" 4 (eq_attr "type" "mult")
9635 "i1_pipeline, nothing*2, (port0 | port1)") 10054 "i1_pipeline, nothing*2, (port0 | port1)")
9636 10055
9637 (define_insn_reservation "div" 8 (eq_attr "type" "div") 10056 (define_insn_reservation "div" 9 (eq_attr "type" "div")
9638 "i1_pipeline, div*7, div + (port0 | port1)") 10057 "i1_pipeline, div*7, div + (port0 | port1)")
9639 10058
9640 (define_insn_reservation "float" 3 (eq_attr "type" "float") 10059 (define_insn_reservation "float" 3 (eq_attr "type" "float")
9641 "f_pipeline, nothing, (port0 | port1)") 10060 "f_pipeline, nothing, (port0 | port1)")
9642 10061
9934 @code{match_dup N} is used in the output template to be replaced with 10353 @code{match_dup N} is used in the output template to be replaced with
9935 the expression from the original pattern, which matched 10354 the expression from the original pattern, which matched
9936 @code{match_operand N} from the input pattern. As a consequence, 10355 @code{match_operand N} from the input pattern. As a consequence,
9937 @code{match_dup} cannot be used to point to @code{match_operand}s from 10356 @code{match_dup} cannot be used to point to @code{match_operand}s from
9938 the output pattern, it should always refer to a @code{match_operand} 10357 the output pattern, it should always refer to a @code{match_operand}
9939 from the input pattern. 10358 from the input pattern. If a @code{match_dup N} occurs more than once
10359 in the output template, its first occurrence is replaced with the
10360 expression from the original pattern, and the subsequent expressions
10361 are replaced with @code{match_dup N}, i.e., a reference to the first
10362 expression.
9940 10363
9941 In the output template one can refer to the expressions from the 10364 In the output template one can refer to the expressions from the
9942 original pattern and create new ones. For instance, some operands could 10365 original pattern and create new ones. For instance, some operands could
9943 be added by means of standard @code{match_operand}. 10366 be added by means of standard @code{match_operand}.
9944 10367
10142 @menu 10565 @menu
10143 * Mode Iterators:: Generating variations of patterns for different modes. 10566 * Mode Iterators:: Generating variations of patterns for different modes.
10144 * Code Iterators:: Doing the same for codes. 10567 * Code Iterators:: Doing the same for codes.
10145 * Int Iterators:: Doing the same for integers. 10568 * Int Iterators:: Doing the same for integers.
10146 * Subst Iterators:: Generating variations of patterns for define_subst. 10569 * Subst Iterators:: Generating variations of patterns for define_subst.
10570 * Parameterized Names:: Specifying iterator values in C++ code.
10147 @end menu 10571 @end menu
10148 10572
10149 @node Mode Iterators 10573 @node Mode Iterators
10150 @subsection Mode Iterators 10574 @subsection Mode Iterators
10151 @cindex mode iterators in @file{.md} files 10575 @cindex mode iterators in @file{.md} files
10537 replaced in the first copy of the original RTL-template. 10961 replaced in the first copy of the original RTL-template.
10538 10962
10539 @var{subst-applied-value} is a value with which subst-attribute would be 10963 @var{subst-applied-value} is a value with which subst-attribute would be
10540 replaced in the second copy of the original RTL-template. 10964 replaced in the second copy of the original RTL-template.
10541 10965
10966 @node Parameterized Names
10967 @subsection Parameterized Names
10968 @cindex @samp{@@} in instruction pattern names
10969 Ports sometimes need to apply iterators using C++ code, in order to
10970 get the code or RTL pattern for a specific instruction. For example,
10971 suppose we have the @samp{neon_vq<absneg><mode>} pattern given above:
10972
10973 @smallexample
10974 (define_int_iterator QABSNEG [UNSPEC_VQABS UNSPEC_VQNEG])
10975
10976 (define_int_attr absneg [(UNSPEC_VQABS "abs") (UNSPEC_VQNEG "neg")])
10977
10978 (define_insn "neon_vq<absneg><mode>"
10979 [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
10980 (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
10981 (match_operand:SI 2 "immediate_operand" "i")]
10982 QABSNEG))]
10983 @dots{}
10984 )
10985 @end smallexample
10986
10987 A port might need to generate this pattern for a variable
10988 @samp{QABSNEG} value and a variable @samp{VDQIW} mode. There are two
10989 ways of doing this. The first is to build the rtx for the pattern
10990 directly from C++ code; this is a valid technique and avoids any risk
10991 of combinatorial explosion. The second is to prefix the instruction
10992 name with the special character @samp{@@}, which tells GCC to generate
10993 the four additional functions below. In each case, @var{name} is the
10994 name of the instruction without the leading @samp{@@} character,
10995 without the @samp{<@dots{}>} placeholders, and with any underscore
10996 before a @samp{<@dots{}>} placeholder removed if keeping it would
10997 lead to a double or trailing underscore.
10998
10999 @table @samp
11000 @item insn_code maybe_code_for_@var{name} (@var{i1}, @var{i2}, @dots{})
11001 See whether replacing the first @samp{<@dots{}>} placeholder with
11002 iterator value @var{i1}, the second with iterator value @var{i2}, and
11003 so on, gives a valid instruction. Return its code if so, otherwise
11004 return @code{CODE_FOR_nothing}.
11005
11006 @item insn_code code_for_@var{name} (@var{i1}, @var{i2}, @dots{})
11007 Same, but abort the compiler if the requested instruction does not exist.
11008
11009 @item rtx maybe_gen_@var{name} (@var{i1}, @var{i2}, @dots{}, @var{op0}, @var{op1}, @dots{})
11010 Check for a valid instruction in the same way as
11011 @code{maybe_code_for_@var{name}}. If the instruction exists,
11012 generate an instance of it using the operand values given by @var{op0},
11013 @var{op1}, and so on, otherwise return null.
11014
11015 @item rtx gen_@var{name} (@var{i1}, @var{i2}, @dots{}, @var{op0}, @var{op1}, @dots{})
11016 Same, but abort the compiler if the requested instruction does not exist,
11017 or if the instruction generator invoked the @code{FAIL} macro.
11018 @end table
11019
11020 For example, changing the pattern above to:
11021
11022 @smallexample
11023 (define_insn "@@neon_vq<absneg><mode>"
11024 [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
11025 (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
11026 (match_operand:SI 2 "immediate_operand" "i")]
11027 QABSNEG))]
11028 @dots{}
11029 )
11030 @end smallexample
11031
11032 would define the same patterns as before, but in addition would generate
11033 the four functions below:
11034
11035 @smallexample
11036 insn_code maybe_code_for_neon_vq (int, machine_mode);
11037 insn_code code_for_neon_vq (int, machine_mode);
11038 rtx maybe_gen_neon_vq (int, machine_mode, rtx, rtx, rtx);
11039 rtx gen_neon_vq (int, machine_mode, rtx, rtx, rtx);
11040 @end smallexample
11041
11042 Calling @samp{code_for_neon_vq (UNSPEC_VQABS, V8QImode)}
11043 would then give @code{CODE_FOR_neon_vqabsv8qi}.
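
To make the lookup concrete, the following self-contained C sketch models what the two @samp{code_for} functions do. It is not the code GCC generates: the enums are small stand-ins for the real @code{insn_code} and @code{machine_mode} types, only two of the @samp{VDQIW} modes are listed, and @code{abort} stands in for the compiler's internal error handling.

```c
#include <stdlib.h>

/* Stand-ins for the real GCC types; only a few values are modelled.  */
typedef enum { UNSPEC_VQABS, UNSPEC_VQNEG } unspec_model;
typedef enum { V8QImode, V16QImode, SImode } machine_mode;
typedef enum
{
  CODE_FOR_nothing,
  CODE_FOR_neon_vqabsv8qi,
  CODE_FOR_neon_vqnegv8qi,
  CODE_FOR_neon_vqabsv16qi,
  CODE_FOR_neon_vqnegv16qi
} insn_code;

/* Map an (iterator value, mode) pair to the matching instruction
   code, or to CODE_FOR_nothing if no such instruction exists.  */
insn_code maybe_code_for_neon_vq (int unspec, machine_mode mode)
{
  if (unspec == UNSPEC_VQABS && mode == V8QImode)
    return CODE_FOR_neon_vqabsv8qi;
  if (unspec == UNSPEC_VQNEG && mode == V8QImode)
    return CODE_FOR_neon_vqnegv8qi;
  if (unspec == UNSPEC_VQABS && mode == V16QImode)
    return CODE_FOR_neon_vqabsv16qi;
  if (unspec == UNSPEC_VQNEG && mode == V16QImode)
    return CODE_FOR_neon_vqnegv16qi;
  return CODE_FOR_nothing;
}

/* Same lookup, but a missing instruction is treated as fatal.  */
insn_code code_for_neon_vq (int unspec, machine_mode mode)
{
  insn_code code = maybe_code_for_neon_vq (unspec, mode);
  if (code == CODE_FOR_nothing)
    abort ();
  return code;
}
```

In this model, asking for @code{SImode}, which is not one of the modelled vector modes, returns @code{CODE_FOR_nothing} from the @samp{maybe_} variant and aborts in the other.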
11044
11045 It is possible to have multiple @samp{@@} patterns with the same
11046 name and same types of iterator. For example:
11047
11048 @smallexample
11049 (define_insn "@@some_arithmetic_op<mode>"
11050 [(set (match_operand:INTEGER_MODES 0 "register_operand") @dots{})]
11051 @dots{}
11052 )
11053
11054 (define_insn "@@some_arithmetic_op<mode>"
11055 [(set (match_operand:FLOAT_MODES 0 "register_operand") @dots{})]
11056 @dots{}
11057 )
11058 @end smallexample
11059
11060 would produce a single set of functions that handles both
11061 @code{INTEGER_MODES} and @code{FLOAT_MODES}.
11062
10542 @end ifset 11063 @end ifset