comparison gcc/doc/md.texi @ 0:a06113de4d67
first commit
author | kent <kent@cr.ie.u-ryukyu.ac.jp> |
---|---|
date | Fri, 17 Jul 2009 14:47:48 +0900 |
parents | |
children | 58ad6c70ea60 |
-1:000000000000 | 0:a06113de4d67 |
---|---|
1 @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001, | |
2 @c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 | |
3 @c Free Software Foundation, Inc. | |
4 @c This is part of the GCC manual. | |
5 @c For copying conditions, see the file gcc.texi. | |
6 | |
7 @ifset INTERNALS | |
8 @node Machine Desc | |
9 @chapter Machine Descriptions | |
10 @cindex machine descriptions | |
11 | |
12 A machine description has two parts: a file of instruction patterns | |
13 (@file{.md} file) and a C header file of macro definitions. | |
14 | |
15 The @file{.md} file for a target machine contains a pattern for each | |
16 instruction that the target machine supports (or at least each instruction | |
17 that is worth telling the compiler about). It may also contain comments. | |
18 A semicolon causes the rest of the line to be a comment, unless the semicolon | |
19 is inside a quoted string. | |
20 | |
21 See the next chapter for information on the C header file. | |
22 | |
23 @menu | |
24 * Overview:: How the machine description is used. | |
25 * Patterns:: How to write instruction patterns. | |
26 * Example:: An explained example of a @code{define_insn} pattern. | |
27 * RTL Template:: The RTL template defines what insns match a pattern. | |
28 * Output Template:: The output template says how to make assembler code | |
29 from such an insn. | |
30 * Output Statement:: For more generality, write C code to output | |
31 the assembler code. | |
32 * Predicates:: Controlling what kinds of operands can be used | |
33 for an insn. | |
34 * Constraints:: Fine-tuning operand selection. | |
35 * Standard Names:: Names mark patterns to use for code generation. | |
36 * Pattern Ordering:: When the order of patterns makes a difference. | |
37 * Dependent Patterns:: Having one pattern may make you need another. | |
38 * Jump Patterns:: Special considerations for patterns for jump insns. | |
39 * Looping Patterns:: How to define patterns for special looping insns. | |
40 * Insn Canonicalizations::Canonicalization of Instructions | |
41 * Expander Definitions::Generating a sequence of several RTL insns | |
42 for a standard operation. | |
43 * Insn Splitting:: Splitting Instructions into Multiple Instructions. | |
44 * Including Patterns:: Including Patterns in Machine Descriptions. | |
45 * Peephole Definitions::Defining machine-specific peephole optimizations. | |
46 * Insn Attributes:: Specifying the value of attributes for generated insns. | |
47 * Conditional Execution::Generating @code{define_insn} patterns for | |
48 predication. | |
49 * Constant Definitions::Defining symbolic constants that can be used in the | |
50 md file. | |
51 * Iterators:: Using iterators to generate patterns from a template. | |
52 @end menu | |
53 | |
54 @node Overview | |
55 @section Overview of How the Machine Description is Used | |
56 | |
57 There are three main conversions that happen in the compiler: | |
58 | |
59 @enumerate | |
60 | |
61 @item | |
62 The front end reads the source code and builds a parse tree. | |
63 | |
64 @item | |
65 The parse tree is used to generate an RTL insn list based on named | |
66 instruction patterns. | |
67 | |
68 @item | |
69 The insn list is matched against the RTL templates to produce assembler | |
70 code. | |
71 | |
72 @end enumerate | |
73 | |
74 For the generate pass, only the names of the insns matter, from either a | |
75 named @code{define_insn} or a @code{define_expand}. The compiler will | |
76 choose the pattern with the right name and apply the operands according | |
77 to the documentation later in this chapter, without regard for the RTL | |
78 template or operand constraints. Note that the names the compiler looks | |
79 for are hard-coded in the compiler---it will ignore unnamed patterns and | |
80 patterns with names it doesn't know about, but if you don't provide a | |
81 named pattern it needs, it will abort. | |
82 | |
83 If a @code{define_insn} is used, the template given is inserted into the | |
84 insn list. If a @code{define_expand} is used, one of three things | |
85 happens, based on the condition logic. The condition logic may manually | |
86 create new insns for the insn list, say via @code{emit_insn()}, and | |
87 invoke @code{DONE}. For certain named patterns, it may invoke @code{FAIL} to tell the | |
88 compiler to use an alternate way of performing that task. If it invokes | |
89 neither @code{DONE} nor @code{FAIL}, the template given in the pattern | |
90 is inserted, as if the @code{define_expand} were a @code{define_insn}. | |
91 | |
92 Once the insn list is generated, various optimization passes convert, | |
93 replace, and rearrange the insns in the insn list. This is where the | |
94 @code{define_split} and @code{define_peephole} patterns get used, for | |
95 example. | |
96 | |
97 Finally, the insn list's RTL is matched up with the RTL templates in the | |
98 @code{define_insn} patterns, and those patterns are used to emit the | |
99 final assembly code. For this purpose, each named @code{define_insn} | |
100 acts like it's unnamed, since the names are ignored. | |
101 | |
102 @node Patterns | |
103 @section Everything about Instruction Patterns | |
104 @cindex patterns | |
105 @cindex instruction patterns | |
106 | |
107 @findex define_insn | |
108 Each instruction pattern contains an incomplete RTL expression, with pieces | |
109 to be filled in later, operand constraints that restrict how the pieces can | |
110 be filled in, and an output pattern or C code to generate the assembler | |
111 output, all wrapped up in a @code{define_insn} expression. | |
112 | |
113 A @code{define_insn} is an RTL expression containing four or five operands: | |
114 | |
115 @enumerate | |
116 @item | |
117 An optional name. The presence of a name indicates that this instruction | |
118 pattern can perform a certain standard job for the RTL-generation | |
119 pass of the compiler. This pass knows certain names and will use | |
120 the instruction patterns with those names, if the names are defined | |
121 in the machine description. | |
122 | |
123 The absence of a name is indicated by writing an empty string | |
124 where the name should go. Nameless instruction patterns are never | |
125 used for generating RTL code, but they may permit several simpler insns | |
126 to be combined later on. | |
127 | |
128 Names that are not thus known and used in RTL-generation have no | |
129 effect; they are equivalent to no name at all. | |
130 | |
131 For the purpose of debugging the compiler, you may also specify a | |
132 name beginning with the @samp{*} character. Such a name is used only | |
133 for identifying the instruction in RTL dumps; it is entirely equivalent | |
134 to having a nameless pattern for all other purposes. | |
135 | |
136 @item | |
137 The @dfn{RTL template} (@pxref{RTL Template}) is a vector of incomplete | |
138 RTL expressions which show what the instruction should look like. It is | |
139 incomplete because it may contain @code{match_operand}, | |
140 @code{match_operator}, and @code{match_dup} expressions that stand for | |
141 operands of the instruction. | |
142 | |
143 If the vector has only one element, that element is the template for the | |
144 instruction pattern. If the vector has multiple elements, then the | |
145 instruction pattern is a @code{parallel} expression containing the | |
146 elements described. | |
147 | |
148 @item | |
149 @cindex pattern conditions | |
150 @cindex conditions, in patterns | |
151 A condition. This is a string which contains a C expression that is | |
152 the final test to decide whether an insn body matches this pattern. | |
153 | |
154 @cindex named patterns and conditions | |
155 For a named pattern, the condition (if present) may not depend on | |
156 the data in the insn being matched, but only the target-machine-type | |
157 flags. The compiler needs to test these conditions during | |
158 initialization in order to learn exactly which named instructions are | |
159 available in a particular run. | |
160 | |
161 @findex operands | |
162 For nameless patterns, the condition is applied only when matching an | |
163 individual insn, and only after the insn has matched the pattern's | |
164 recognition template. The insn's operands may be found in the vector | |
165 @code{operands}. Once the condition has matched an insn, it cannot be | |
166 used to control register allocation, for example by excluding | |
167 certain hard registers or hard register combinations. | |
168 | |
169 @item | |
170 The @dfn{output template}: a string that says how to output matching | |
171 insns as assembler code. @samp{%} in this string specifies where | |
172 to substitute the value of an operand. @xref{Output Template}. | |
173 | |
174 When simple substitution isn't general enough, you can specify a piece | |
175 of C code to compute the output. @xref{Output Statement}. | |
176 | |
177 @item | |
178 Optionally, a vector containing the values of attributes for insns matching | |
179 this pattern. @xref{Insn Attributes}. | |
180 @end enumerate | |
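
As a rough illustration of the naming rules in the first item, the
following C sketch classifies a pattern name by its first character.
The function @code{classify_pattern_name} and the @code{name_kind}
enumeration are hypothetical and exist only for this example; they are
not part of GCC.  Whether a visible name has any effect still depends
on whether the RTL-generation pass knows that name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification, for illustration only.  */
enum name_kind {
  NAME_NONE,        /* "": nameless pattern */
  NAME_DEBUG_ONLY,  /* "*foo": identifies the insn in RTL dumps only */
  NAME_VISIBLE      /* "foo": candidate for use by RTL generation */
};

/* Classify NAME according to the rules described for define_insn.  */
enum name_kind
classify_pattern_name (const char *name)
{
  assert (name != NULL);
  if (name[0] == '\0')
    return NAME_NONE;
  if (name[0] == '*')
    return NAME_DEBUG_ONLY;
  return NAME_VISIBLE;
}
```

Under this sketch, @code{"tstsi"} is classified as visible, while
@code{""} and a name beginning with @samp{*} are both treated as
nameless for every purpose other than dumps.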
181 | |
182 @node Example | |
183 @section Example of @code{define_insn} | |
184 @cindex @code{define_insn} example | |
185 | |
186 Here is an actual example of an instruction pattern, for the 68000/68020. | |
187 | |
188 @smallexample | |
189 (define_insn "tstsi" | |
190   [(set (cc0) | |
191         (match_operand:SI 0 "general_operand" "rm"))] | |
192   "" | |
193   "* | |
194 @{ | |
195   if (TARGET_68020 || ! ADDRESS_REG_P (operands[0])) | |
196     return \"tstl %0\"; | |
197   return \"cmpl #0,%0\"; | |
198 @}") | |
199 @end smallexample | |
200 | |
201 @noindent | |
202 This can also be written using braced strings: | |
203 | |
204 @smallexample | |
205 (define_insn "tstsi" | |
206   [(set (cc0) | |
207         (match_operand:SI 0 "general_operand" "rm"))] | |
208   "" | |
209 @{ | |
210   if (TARGET_68020 || ! ADDRESS_REG_P (operands[0])) | |
211     return "tstl %0"; | |
212   return "cmpl #0,%0"; | |
213 @}) | |
214 @end smallexample | |
215 | |
216 This is an instruction that sets the condition codes based on the value of | |
217 a general operand. It has no condition, so any insn whose RTL description | |
218 has the form shown may be handled according to this pattern. The name | |
219 @samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL generation | |
220 pass that, when it is necessary to test such a value, an insn to do so | |
221 can be constructed using this pattern. | |
222 | |
223 The output control string is a piece of C code which chooses which | |
224 output template to return based on the kind of operand and the specific | |
225 type of CPU for which code is being generated. | |
226 | |
227 @samp{"rm"} is an operand constraint. Its meaning is explained below. | |
228 | |
229 @node RTL Template | |
230 @section RTL Template | |
231 @cindex RTL insn template | |
232 @cindex generating insns | |
233 @cindex insns, generating | |
234 @cindex recognizing insns | |
235 @cindex insns, recognizing | |
236 | |
237 The RTL template is used to define which insns match the particular pattern | |
238 and how to find their operands. For named patterns, the RTL template also | |
239 says how to construct an insn from specified operands. | |
240 | |
241 Construction involves substituting specified operands into a copy of the | |
242 template. Matching involves determining the values that serve as the | |
243 operands in the insn being matched. Both of these activities are | |
244 controlled by special expression types that direct matching and | |
245 substitution of the operands. | |
246 | |
247 @table @code | |
248 @findex match_operand | |
249 @item (match_operand:@var{m} @var{n} @var{predicate} @var{constraint}) | |
250 This expression is a placeholder for operand number @var{n} of | |
251 the insn. When constructing an insn, operand number @var{n} | |
252 will be substituted at this point. When matching an insn, whatever | |
253 appears at this position in the insn will be taken as operand | |
254 number @var{n}; but it must satisfy @var{predicate} or this instruction | |
255 pattern will not match at all. | |
256 | |
257 Operand numbers must be chosen consecutively counting from zero in | |
258 each instruction pattern. There may be only one @code{match_operand} | |
259 expression in the pattern for each operand number. Usually operands | |
260 are numbered in the order of appearance in @code{match_operand} | |
261 expressions. In the case of a @code{define_expand}, any operand numbers | |
262 used only in @code{match_dup} expressions have higher values than all | |
263 other operand numbers. | |
264 | |
265 @var{predicate} is a string that is the name of a function that | |
266 accepts two arguments, an expression and a machine mode. | |
267 @xref{Predicates}. During matching, the function will be called with | |
268 the putative operand as the expression and @var{m} as the mode | |
269 argument (if @var{m} is not specified, @code{VOIDmode} will be used, | |
270 which normally causes @var{predicate} to accept any mode). If it | |
271 returns zero, this instruction pattern fails to match. | |
272 @var{predicate} may be an empty string; then it means no test is to be | |
273 done on the operand, so anything which occurs in this position is | |
274 valid. | |
275 | |
276 Most of the time, @var{predicate} will reject modes other than @var{m}---but | |
277 not always. For example, the predicate @code{address_operand} uses | |
278 @var{m} as the mode of memory ref that the address should be valid for. | |
279 Many predicates accept @code{const_int} nodes even though their mode is | |
280 @code{VOIDmode}. | |
281 | |
282 @var{constraint} controls reloading and the choice of the best register | |
283 class to use for a value, as explained later (@pxref{Constraints}). | |
284 If the constraint would be an empty string, it can be omitted. | |
285 | |
286 People are often unclear on the difference between the constraint and the | |
287 predicate. The predicate helps decide whether a given insn matches the | |
288 pattern. The constraint plays no role in this decision; instead, it | |
289 controls various decisions in the case of an insn which does match. | |
290 | |
291 @findex match_scratch | |
292 @item (match_scratch:@var{m} @var{n} @var{constraint}) | |
293 This expression is also a placeholder for operand number @var{n} | |
294 and indicates that operand must be a @code{scratch} or @code{reg} | |
295 expression. | |
296 | |
297 When matching patterns, this is equivalent to | |
298 | |
299 @smallexample | |
300 (match_operand:@var{m} @var{n} "scratch_operand" @var{constraint}) | |
301 @end smallexample | |
302 | |
303 but, when generating RTL, it produces a (@code{scratch}:@var{m}) | |
304 expression. | |
305 | |
306 If the last few expressions in a @code{parallel} are @code{clobber} | |
307 expressions whose operands are either a hard register or | |
308 @code{match_scratch}, the combiner can add or delete them when | |
309 necessary. @xref{Side Effects}. | |
310 | |
311 @findex match_dup | |
312 @item (match_dup @var{n}) | |
313 This expression is also a placeholder for operand number @var{n}. | |
314 It is used when the operand needs to appear more than once in the | |
315 insn. | |
316 | |
317 In construction, @code{match_dup} acts just like @code{match_operand}: | |
318 the operand is substituted into the insn being constructed. But in | |
319 matching, @code{match_dup} behaves differently. It assumes that operand | |
320 number @var{n} has already been determined by a @code{match_operand} | |
321 appearing earlier in the recognition template, and it matches only an | |
322 identical-looking expression. | |
323 | |
324 Note that @code{match_dup} should not be used to tell the compiler that | |
325 a particular register is being used for two operands (example: | |
326 @code{add} that adds one register to another; the second register is | |
327 both an input operand and the output operand). Use a matching | |
328 constraint (@pxref{Simple Constraints}) for those. @code{match_dup} is for the cases where one | |
329 operand is used in two places in the template, such as an instruction | |
330 that computes both a quotient and a remainder, where the opcode takes | |
331 two input operands but the RTL template has to refer to each of those | |
332 twice; once for the quotient pattern and once for the remainder pattern. | |
333 | |
334 @findex match_operator | |
335 @item (match_operator:@var{m} @var{n} @var{predicate} [@var{operands}@dots{}]) | |
336 This pattern is a kind of placeholder for a variable RTL expression | |
337 code. | |
338 | |
339 When constructing an insn, it stands for an RTL expression whose | |
340 expression code is taken from that of operand @var{n}, and whose | |
341 operands are constructed from the patterns @var{operands}. | |
342 | |
343 When matching an expression, it matches an expression if the function | |
344 @var{predicate} returns nonzero on that expression @emph{and} the | |
345 patterns @var{operands} match the operands of the expression. | |
346 | |
347 Suppose that the function @code{commutative_operator} is defined as | |
348 follows, to match any expression whose operator is one of the | |
349 commutative arithmetic operators of RTL and whose mode is @var{mode}: | |
350 | |
351 @smallexample | |
352 int | |
353 commutative_operator (x, mode) | |
354      rtx x; | |
355      enum machine_mode mode; | |
356 @{ | |
357   enum rtx_code code = GET_CODE (x); | |
358   if (GET_MODE (x) != mode) | |
359     return 0; | |
360   return (GET_RTX_CLASS (code) == RTX_COMM_ARITH | |
361           || code == EQ || code == NE); | |
362 @} | |
363 @end smallexample | |
364 | |
365 Then the following pattern will match any RTL expression consisting | |
366 of a commutative operator applied to two general operands: | |
367 | |
368 @smallexample | |
369 (match_operator:SI 3 "commutative_operator" | |
370   [(match_operand:SI 1 "general_operand" "g") | |
371    (match_operand:SI 2 "general_operand" "g")]) | |
372 @end smallexample | |
373 | |
374 Here the vector @code{[@var{operands}@dots{}]} contains two patterns | |
375 because the expressions to be matched all contain two operands. | |
376 | |
377 When this pattern does match, the two operands of the commutative | |
378 operator are recorded as operands 1 and 2 of the insn. (This is done | |
379 by the two instances of @code{match_operand}.) Operand 3 of the insn | |
380 will be the entire commutative expression: use @code{GET_CODE | |
381 (operands[3])} to see which commutative operator was used. | |
382 | |
383 The machine mode @var{m} of @code{match_operator} works like that of | |
384 @code{match_operand}: it is passed as the second argument to the | |
385 predicate function, and that function is solely responsible for | |
386 deciding whether the expression to be matched ``has'' that mode. | |
387 | |
388 When constructing an insn, argument 3 of the gen-function will specify | |
389 the operation (i.e.@: the expression code) for the expression to be | |
390 made. It should be an RTL expression, whose expression code is copied | |
391 into a new expression whose operands are arguments 1 and 2 of the | |
392 gen-function. The subexpressions of argument 3 are not used; | |
393 only its expression code matters. | |
394 | |
395 When @code{match_operator} is used in a pattern for matching an insn, | |
396 it is usually best if the operand number of the @code{match_operator} | |
397 is higher than that of the actual operands of the insn. This improves | |
398 register allocation because the register allocator often looks at | |
399 operands 1 and 2 of insns to see if it can do register tying. | |
400 | |
401 There is no way to specify constraints in @code{match_operator}. The | |
402 operand of the insn which corresponds to the @code{match_operator} | |
403 never has any constraints because it is never reloaded as a whole. | |
404 However, if parts of its @var{operands} are matched by | |
405 @code{match_operand} patterns, those parts may have constraints of | |
406 their own. | |
407 | |
408 @findex match_op_dup | |
409 @item (match_op_dup:@var{m} @var{n}[@var{operands}@dots{}]) | |
410 Like @code{match_dup}, except that it applies to operators instead of | |
411 operands. When constructing an insn, operand number @var{n} will be | |
412 substituted at this point. But in matching, @code{match_op_dup} behaves | |
413 differently. It assumes that operand number @var{n} has already been | |
414 determined by a @code{match_operator} appearing earlier in the | |
415 recognition template, and it matches only an identical-looking | |
416 expression. | |
417 | |
418 @findex match_parallel | |
419 @item (match_parallel @var{n} @var{predicate} [@var{subpat}@dots{}]) | |
420 This pattern is a placeholder for an insn that consists of a | |
421 @code{parallel} expression with a variable number of elements. This | |
422 expression should only appear at the top level of an insn pattern. | |
423 | |
424 When constructing an insn, operand number @var{n} will be substituted at | |
425 this point. When matching an insn, it matches if the body of the insn | |
426 is a @code{parallel} expression with at least as many elements as the | |
427 vector of @var{subpat} expressions in the @code{match_parallel}, if each | |
428 @var{subpat} matches the corresponding element of the @code{parallel}, | |
429 @emph{and} the function @var{predicate} returns nonzero on the | |
430 @code{parallel} that is the body of the insn. It is the responsibility | |
431 of the predicate to validate elements of the @code{parallel} beyond | |
432 those listed in the @code{match_parallel}. | |
433 | |
434 A typical use of @code{match_parallel} is to match load and store | |
435 multiple expressions, which can contain a variable number of elements | |
436 in a @code{parallel}. For example, | |
437 | |
438 @smallexample | |
439 (define_insn "" | |
440   [(match_parallel 0 "load_multiple_operation" | |
441      [(set (match_operand:SI 1 "gpc_reg_operand" "=r") | |
442            (match_operand:SI 2 "memory_operand" "m")) | |
443       (use (reg:SI 179)) | |
444       (clobber (reg:SI 179))])] | |
445   "" | |
446   "loadm 0,0,%1,%2") | |
447 @end smallexample | |
448 | |
449 This example comes from @file{a29k.md}. The function | |
450 @code{load_multiple_operation} is defined in @file{a29k.c} and checks | |
451 that subsequent elements in the @code{parallel} are the same as the | |
452 @code{set} in the pattern, except that they are referencing subsequent | |
453 registers and memory locations. | |
454 | |
455 An insn that matches this pattern might look like: | |
456 | |
457 @smallexample | |
458 (parallel | |
459  [(set (reg:SI 20) (mem:SI (reg:SI 100))) | |
460   (use (reg:SI 179)) | |
461   (clobber (reg:SI 179)) | |
462   (set (reg:SI 21) | |
463        (mem:SI (plus:SI (reg:SI 100) | |
464                         (const_int 4)))) | |
465   (set (reg:SI 22) | |
466        (mem:SI (plus:SI (reg:SI 100) | |
467                         (const_int 8))))]) | |
468 @end smallexample | |
469 | |
470 @findex match_par_dup | |
471 @item (match_par_dup @var{n} [@var{subpat}@dots{}]) | |
472 Like @code{match_op_dup}, but for @code{match_parallel} instead of | |
473 @code{match_operator}. | |
474 | |
475 @end table | |
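
The difference between @code{match_operand} and @code{match_dup}
during matching can be sketched with a toy expression tree in C.
Everything here is an assumption made for illustration: the
@code{struct expr} type, the @code{kind} enumeration and the functions
@code{same_expr} and @code{match_expr} are hypothetical stand-ins, not
GCC's @code{rtx} type or its recognizer, and predicates and modes are
left out entirely.

```c
#include <stddef.h>

enum kind { K_REG, K_CONST, K_PLUS, K_MATCH_OPERAND, K_MATCH_DUP };

struct expr {
  enum kind kind;
  int value;               /* register number, constant, or operand number */
  const struct expr *op0;  /* used by K_PLUS only */
  const struct expr *op1;
};

/* Return 1 if A and B look identical, 0 otherwise.  */
static int
same_expr (const struct expr *a, const struct expr *b)
{
  if (a->kind != b->kind)
    return 0;
  if (a->kind == K_PLUS)
    return same_expr (a->op0, b->op0) && same_expr (a->op1, b->op1);
  return a->value == b->value;
}

/* Match X against PAT.  OPERANDS must have room for every operand
   number used in PAT and must start out filled with NULL.  */
int
match_expr (const struct expr *pat, const struct expr *x,
            const struct expr **operands)
{
  switch (pat->kind)
    {
    case K_MATCH_OPERAND:
      /* Whatever appears here becomes operand number pat->value.  */
      operands[pat->value] = x;
      return 1;
    case K_MATCH_DUP:
      /* Only an identical-looking copy of an earlier operand matches.  */
      return operands[pat->value] != NULL
             && same_expr (operands[pat->value], x);
    case K_PLUS:
      return x->kind == K_PLUS
             && match_expr (pat->op0, x->op0, operands)
             && match_expr (pat->op1, x->op1, operands);
    default:
      return x->kind == pat->kind && x->value == pat->value;
    }
}
```

With a pattern built as a @code{K_PLUS} of a @code{K_MATCH_OPERAND}
numbered 0 and a @code{K_MATCH_DUP} numbered 0, a sum of the same
register with itself matches and a sum of two different registers
does not.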
476 | |
477 @node Output Template | |
478 @section Output Templates and Operand Substitution | |
479 @cindex output templates | |
480 @cindex operand substitution | |
481 | |
482 @cindex @samp{%} in template | |
483 @cindex percent sign | |
484 The @dfn{output template} is a string which specifies how to output the | |
485 assembler code for an instruction pattern. Most of the template is a | |
486 fixed string which is output literally. The character @samp{%} is used | |
487 to specify where to substitute an operand; it can also be used to | |
488 identify places where different variants of the assembler require | |
489 different syntax. | |
490 | |
491 In the simplest case, a @samp{%} followed by a digit @var{n} says to output | |
492 operand @var{n} at that point in the string. | |
493 | |
494 @samp{%} followed by a letter and a digit says to output an operand in an | |
495 alternate fashion. Four letters have standard, built-in meanings described | |
496 below. The machine description macro @code{PRINT_OPERAND} can define | |
497 additional letters with nonstandard meanings. | |
498 | |
499 @samp{%c@var{digit}} can be used to substitute an operand that is a | |
500 constant value without the syntax that normally indicates an immediate | |
501 operand. | |
502 | |
503 @samp{%n@var{digit}} is like @samp{%c@var{digit}} except that the value of | |
504 the constant is negated before printing. | |
505 | |
506 @samp{%a@var{digit}} can be used to substitute an operand as if it were a | |
507 memory reference, with the actual operand treated as the address. This may | |
508 be useful when outputting a ``load address'' instruction, because often the | |
509 assembler syntax for such an instruction requires you to write the operand | |
510 as if it were a memory reference. | |
511 | |
512 @samp{%l@var{digit}} is used to substitute a @code{label_ref} into a jump | |
513 instruction. | |
514 | |
515 @samp{%=} outputs a number which is unique to each instruction in the | |
516 entire compilation. This is useful for making local labels to be | |
517 referred to more than once in a single template that generates multiple | |
518 assembler instructions. | |
519 | |
520 @samp{%} followed by a punctuation character specifies a substitution that | |
521 does not use an operand. Only one case is standard: @samp{%%} outputs a | |
522 @samp{%} into the assembler code. Other nonstandard cases can be | |
523 defined in the @code{PRINT_OPERAND} macro. You must also define | |
524 which punctuation characters are valid with the | |
525 @code{PRINT_OPERAND_PUNCT_VALID_P} macro. | |
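
The two simplest substitutions, @samp{%} followed by a digit and
@samp{%%}, can be sketched in C as follows.  This is a toy model under
stated assumptions, not the compiler's output routine: the function
@code{expand_template} is hypothetical, operands are taken to be
already-printed strings, and the letter forms and
@code{PRINT_OPERAND} extensions are deliberately rejected.

```c
#include <stddef.h>
#include <string.h>

/* Expand TMPL into OUT (capacity OUT_SIZE), replacing %N with
   OPERAND_TEXT[N] and %% with a single %.  Return 0 on success and
   -1 on an unsupported sequence or insufficient space.  */
int
expand_template (const char *tmpl, const char *const *operand_text,
                 int n_operands, char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return -1;
  for (const char *p = tmpl; *p != '\0'; p++)
    {
      const char *piece = p;   /* by default, copy one literal character */
      size_t len = 1;

      if (*p == '%')
        {
          p++;
          if (*p == '%')
            piece = "%";
          else if (*p >= '0' && *p <= '9' && *p - '0' < n_operands)
            {
              piece = operand_text[*p - '0'];
              len = strlen (piece);
            }
          else
            return -1;         /* letters and other punctuation: not modeled */
        }
      if (used + len >= out_size)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return 0;
}
```

For instance, with operand strings @code{"d0"} and @code{"#4"}, the
template @code{"addl %1,%0"} expands to @code{"addl #4,d0"} in this
sketch.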
526 | |
527 @cindex \ | |
528 @cindex backslash | |
529 The template may generate multiple assembler instructions. Write the text | |
530 for the instructions, with @samp{\;} between them. | |
531 | |
532 @cindex matching operands | |
533 When the RTL contains two operands which are required by constraint to match | |
534 each other, the output template must refer only to the lower-numbered operand. | |
535 Matching operands are not always identical, and the rest of the compiler | |
536 arranges to put the proper RTL expression for printing into the lower-numbered | |
537 operand. | |
538 | |
539 One use of nonstandard letters or punctuation following @samp{%} is to | |
540 distinguish between different assembler languages for the same machine; for | |
541 example, Motorola syntax versus MIT syntax for the 68000. Motorola syntax | |
542 requires periods in most opcode names, while MIT syntax does not. For | |
543 example, the opcode @samp{movel} in MIT syntax is @samp{move.l} in Motorola | |
544 syntax. The same file of patterns is used for both kinds of output syntax, | |
545 but the character sequence @samp{%.} is used in each place where Motorola | |
546 syntax wants a period. The @code{PRINT_OPERAND} macro for Motorola syntax | |
547 defines the sequence to output a period; the macro for MIT syntax defines | |
548 it to do nothing. | |
549 | |
550 @cindex @code{#} in template | |
551 As a special case, a template consisting of the single character @code{#} | |
552 instructs the compiler to first split the insn, and then output the | |
553 resulting instructions separately. This helps eliminate redundancy in the | |
554 output templates. If you have a @code{define_insn} that needs to emit | |
555 multiple assembler instructions, and there is a matching @code{define_split} | |
556 already defined, then you can simply use @code{#} as the output template | |
557 instead of writing an output template that emits the multiple assembler | |
558 instructions. | |
559 | |
560 If the macro @code{ASSEMBLER_DIALECT} is defined, you can use constructs | |
561 of the form @samp{@{option0|option1|option2@}} in the templates. These | |
562 describe multiple variants of assembler language syntax. | |
563 @xref{Instruction Output}. | |
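
The effect of the brace construct can be sketched in C as a filter
that keeps only the alternative for one dialect number.  The function
@code{select_dialect} is a hypothetical helper written for this
example, it assumes braces are not nested, and the template string
used with it below is invented rather than taken from a real port.

```c
#include <stddef.h>

/* Copy TMPL to OUT (capacity OUT_SIZE), keeping from each
   {alt0|alt1|...} group only the alternative numbered DIALECT.
   Return 0 on success, -1 on an unterminated group or lack of space.  */
int
select_dialect (const char *tmpl, int dialect, char *out, size_t out_size)
{
  size_t used = 0;
  int in_braces = 0;
  int alt = 0;       /* index of the alternative currently being read */

  if (out_size == 0)
    return -1;
  for (const char *p = tmpl; *p != '\0'; p++)
    {
      if (!in_braces && *p == '{')
        {
          in_braces = 1;
          alt = 0;
        }
      else if (in_braces && *p == '|')
        alt++;
      else if (in_braces && *p == '}')
        in_braces = 0;
      else if (!in_braces || alt == dialect)
        {
          if (used + 1 >= out_size)
            return -1;
          out[used++] = *p;
        }
    }
  out[used] = '\0';
  return in_braces ? -1 : 0;
}
```

With the invented template @code{"mov@{l|.l@} %1,%0"}, dialect 0
yields @code{"movl %1,%0"} and dialect 1 yields
@code{"mov.l %1,%0"}.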
564 | |
565 @node Output Statement | |
566 @section C Statements for Assembler Output | |
567 @cindex output statements | |
568 @cindex C statements for assembler output | |
569 @cindex generating assembler output | |
570 | |
571 Often a single fixed template string cannot produce correct and efficient | |
572 assembler code for all the cases that are recognized by a single | |
573 instruction pattern. For example, the opcodes may depend on the kinds of | |
574 operands; or some unfortunate combinations of operands may require extra | |
575 machine instructions. | |
576 | |
577 If the output control string starts with a @samp{@@}, then it is actually | |
578 a series of templates, each on a separate line. (Blank lines and | |
579 leading spaces and tabs are ignored.) The templates correspond to the | |
580 pattern's constraint alternatives (@pxref{Multi-Alternative}). For example, | |
581 if a target machine has a two-address add instruction @samp{addr} to add | |
582 into a register and another @samp{addm} to add a register to memory, you | |
583 might write this pattern: | |
584 | |
585 @smallexample | |
586 (define_insn "addsi3" | |
587   [(set (match_operand:SI 0 "general_operand" "=r,m") | |
588         (plus:SI (match_operand:SI 1 "general_operand" "0,0") | |
589                  (match_operand:SI 2 "general_operand" "g,r")))] | |
590   "" | |
591   "@@ | |
592    addr %2,%0 | |
593    addm %2,%0") | |
594 @end smallexample | |
595 | |
596 @cindex @code{*} in template | |
597 @cindex asterisk in template | |
598 If the output control string starts with a @samp{*}, then it is not an | |
599 output template but rather a piece of C program that should compute a | |
600 template. It should execute a @code{return} statement to return the | |
601 template-string you want. Most such templates use C string literals, which | |
602 require doublequote characters to delimit them. To include these | |
603 doublequote characters in the string, prefix each one with @samp{\}. | |
604 | |
605 If the output control string is written as a brace block instead of a | |
606 double-quoted string, it is automatically assumed to be C code. In that | |
607 case, it is not necessary to put in a leading asterisk, or to escape the | |
608 doublequotes surrounding C string literals. | |
609 | |
610 The operands may be found in the array @code{operands}, whose C data type | |
611 is @code{rtx []}. | |
612 | |
613 It is very common to select different ways of generating assembler code | |
614 based on whether an immediate operand is within a certain range. Be | |
615 careful when doing this, because the result of @code{INTVAL} is an | |
616 integer on the host machine. If the host machine has more bits in an | |
617 @code{int} than the target machine has in the mode in which the constant | |
618 will be used, then some of the bits you get from @code{INTVAL} will be | |
619 superfluous. For proper results, you must carefully disregard the | |
620 values of those bits. | |
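
One way to disregard the superfluous bits is to reduce the host value
to the width of the target mode before any range test.  The following
C sketch shows the idea; the helper @code{sign_extend_to_width} is
hypothetical, the width is passed in explicitly rather than derived
from a machine mode, and @code{long long} merely stands in for
whatever integer type the host uses for constants.

```c
#include <assert.h>

/* Keep only the low BITS bits of VALUE and interpret them as a
   two's-complement number of that width.  */
long long
sign_extend_to_width (long long value, int bits)
{
  assert (bits >= 1 && bits <= 63);

  unsigned long long mask = (1ULL << bits) - 1;
  unsigned long long low = (unsigned long long) value & mask;
  unsigned long long sign = 1ULL << (bits - 1);

  /* Flipping the sign bit and then subtracting it maps the unsigned
     range [0, 2^BITS) onto the signed range of the same width.  */
  return (long long) (low ^ sign) - (long long) sign;
}
```

A range test such as ``fits in a signed 8-bit field'' would then be
applied to the returned value, not to the raw host integer.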
621 | |
622 @findex output_asm_insn | |
623 It is possible to output an assembler instruction and then go on to output | |
624 or compute more of them, using the subroutine @code{output_asm_insn}. This | |
625 receives two arguments: a template-string and a vector of operands. The | |
626 vector may be @code{operands}, or it may be another array of @code{rtx} | |
627 that you declare locally and initialize yourself. | |
628 | |
629 @findex which_alternative | |
630 When an insn pattern has multiple alternatives in its constraints, often | |
631 the appearance of the assembler code is determined mostly by which alternative | |
632 was matched. When this is so, the C code can test the variable | |
633 @code{which_alternative}, which is the ordinal number of the alternative | |
634 that was actually satisfied (0 for the first, 1 for the second alternative, | |
635 etc.). | |
636 | |
637 For example, suppose there are two opcodes for storing zero, @samp{clrreg} | |
638 for registers and @samp{clrmem} for memory locations. Here is how | |
639 a pattern could use @code{which_alternative} to choose between them: | |
640 | |
641 @smallexample | |
642 (define_insn "" | |
643 [(set (match_operand:SI 0 "general_operand" "=r,m") | |
644 (const_int 0))] | |
645 "" | |
646 @{ | |
647 return (which_alternative == 0 | |
648 ? "clrreg %0" : "clrmem %0"); | |
649 @}) | |
650 @end smallexample | |
651 | |
652 The example above, where the assembler code to generate was | |
653 @emph{solely} determined by the alternative, could also have been specified | |
654 as follows, having the output control string start with a @samp{@@}: | |
655 | |
656 @smallexample | |
657 @group | |
658 (define_insn "" | |
659 [(set (match_operand:SI 0 "general_operand" "=r,m") | |
660 (const_int 0))] | |
661 "" | |
662 "@@ | |
663 clrreg %0 | |
664 clrmem %0") | |
665 @end group | |
666 @end smallexample | |
667 | |
668 @node Predicates | |
669 @section Predicates | |
670 @cindex predicates | |
671 @cindex operand predicates | |
672 @cindex operator predicates | |
673 | |
674 A predicate determines whether a @code{match_operand} or | |
675 @code{match_operator} expression matches, and therefore whether the | |
676 surrounding instruction pattern will be used for that combination of | |
677 operands. GCC has a number of machine-independent predicates, and you | |
678 can define machine-specific predicates as needed. By convention, | |
679 predicates used with @code{match_operand} have names that end in | |
680 @samp{_operand}, and those used with @code{match_operator} have names | |
681 that end in @samp{_operator}. | |
682 | |
683 All predicates are Boolean functions (in the mathematical sense) of | |
684 two arguments: the RTL expression that is being considered at that | |
685 position in the instruction pattern, and the machine mode that the | |
686 @code{match_operand} or @code{match_operator} specifies. In this | |
687 section, the first argument is called @var{op} and the second argument | |
688 @var{mode}. Predicates can be called from C as ordinary two-argument | |
689 functions; this can be useful in output templates or other | |
690 machine-specific code. | |
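
As a sketch of the last point, the C block below calls the generic
predicate @code{register_operand} directly to pick an opcode.  The
opcodes @samp{mov} and @samp{movi} are hypothetical.

@smallexample
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
        (match_operand:SI 1 "nonmemory_operand" "ri"))]
  ""
@{
  /* Predicates take the operand and the mode, in that order.  */
  if (register_operand (operands[1], SImode))
    return "mov %1,%0";
  return "movi %1,%0";
@})
@end smallexample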
691 | |
692 Operand predicates can allow operands that are not actually acceptable | |
693 to the hardware, as long as the constraints give reload the ability to | |
694 fix them up (@pxref{Constraints}). However, GCC will usually generate | |
695 better code if the predicates specify the requirements of the machine | |
696 instructions as closely as possible. Reload cannot fix up operands | |
697 that must be constants (``immediate operands''); you must use a | |
698 predicate that allows only constants, or else enforce the requirement | |
699 in the extra condition. | |
700 | |
701 @cindex predicates and machine modes | |
702 @cindex normal predicates | |
703 @cindex special predicates | |
704 Most predicates handle their @var{mode} argument in a uniform manner. | |
705 If @var{mode} is @code{VOIDmode} (unspecified), then @var{op} can have | |
706 any mode. If @var{mode} is anything else, then @var{op} must have the | |
707 same mode, unless @var{op} is a @code{CONST_INT} or integer | |
708 @code{CONST_DOUBLE}. These RTL expressions always have | |
709 @code{VOIDmode}, so it would be counterproductive to check that their | |
710 mode matches. Instead, predicates that accept @code{CONST_INT} and/or | |
711 integer @code{CONST_DOUBLE} check that the value stored in the | |
712 constant will fit in the requested mode. | |
713 | |
714 Predicates with this behavior are called @dfn{normal}. | |
715 @command{genrecog} can optimize the instruction recognizer based on | |
716 knowledge of how normal predicates treat modes. It can also diagnose | |
717 certain kinds of common errors in the use of normal predicates; for | |
718 instance, it is almost always an error to use a normal predicate | |
719 without specifying a mode. | |
720 | |
721 Predicates that do something different with their @var{mode} argument | |
722 are called @dfn{special}. The generic predicates | |
723 @code{address_operand} and @code{pmode_register_operand} are special | |
724 predicates. @command{genrecog} does not do any optimizations or | |
725 diagnosis when special predicates are used. | |
726 | |
727 @menu | |
728 * Machine-Independent Predicates:: Predicates available to all back ends. | |
729 * Defining Predicates:: How to write machine-specific predicate | |
730 functions. | |
731 @end menu | |
732 | |
733 @node Machine-Independent Predicates | |
734 @subsection Machine-Independent Predicates | |
735 @cindex machine-independent predicates | |
736 @cindex generic predicates | |
737 | |
738 These are the generic predicates available to all back ends. They are | |
739 defined in @file{recog.c}. The first category of predicates allows | |
740 only constant, or @dfn{immediate}, operands. | |
741 | |
742 @defun immediate_operand | |
743 This predicate allows any sort of constant that fits in @var{mode}. | |
744 It is an appropriate choice for instructions that take operands that | |
745 must be constant. | |
746 @end defun | |
747 | |
748 @defun const_int_operand | |
749 This predicate allows any @code{CONST_INT} expression that fits in | |
750 @var{mode}. It is an appropriate choice for an immediate operand that | |
751 does not allow a symbol or label. | |
752 @end defun | |
753 | |
754 @defun const_double_operand | |
755 This predicate accepts any @code{CONST_DOUBLE} expression that has | |
756 exactly @var{mode}. If @var{mode} is @code{VOIDmode}, it will also | |
757 accept @code{CONST_INT}. It is intended for immediate floating point | |
758 constants. | |
759 @end defun | |
760 | |
761 @noindent | |
762 The second category of predicates allows only some kind of machine | |
763 register. | |
764 | |
765 @defun register_operand | |
766 This predicate allows any @code{REG} or @code{SUBREG} expression that | |
767 is valid for @var{mode}. It is often suitable for arithmetic | |
768 instruction operands on a RISC machine. | |
769 @end defun | |
770 | |
771 @defun pmode_register_operand | |
772 This is a slight variant on @code{register_operand} which works around | |
773 a limitation in the machine-description reader. | |
774 | |
775 @smallexample | |
776 (match_operand @var{n} "pmode_register_operand" @var{constraint}) | |
777 @end smallexample | |
778 | |
779 @noindent | |
780 means exactly what | |
781 | |
782 @smallexample | |
783 (match_operand:P @var{n} "register_operand" @var{constraint}) | |
784 @end smallexample | |
785 | |
786 @noindent | |
787 would mean, if the machine-description reader accepted @samp{:P} | |
788 mode suffixes. Unfortunately, it cannot, because @code{Pmode} is an | |
789 alias for some other mode, and might vary with machine-specific | |
790 options. @xref{Misc}. | |
791 @end defun | |
792 | |
793 @defun scratch_operand | |
794 This predicate allows hard registers and @code{SCRATCH} expressions, | |
795 but not pseudo-registers. It is used internally by @code{match_scratch}; | |
796 it should not be used directly. | |
797 @end defun | |
798 | |
799 @noindent | |
800 The third category of predicates allows only some kind of memory reference. | |
801 | |
802 @defun memory_operand | |
803 This predicate allows any valid reference to a quantity of mode | |
804 @var{mode} in memory, as determined by the weak form of | |
805 @code{GO_IF_LEGITIMATE_ADDRESS} (@pxref{Addressing Modes}). | |
806 @end defun | |
807 | |
808 @defun address_operand | |
809 This predicate is a little unusual; it allows any operand that is a | |
810 valid expression for the @emph{address} of a quantity of mode | |
811 @var{mode}, again determined by the weak form of | |
812 @code{GO_IF_LEGITIMATE_ADDRESS}. To first order, if | |
813 @samp{@w{(mem:@var{mode} (@var{exp}))}} is acceptable to | |
814 @code{memory_operand}, then @var{exp} is acceptable to | |
815 @code{address_operand}. Note that @var{exp} does not necessarily have | |
816 the mode @var{mode}. | |
817 @end defun | |
818 | |
819 @defun indirect_operand | |
820 This is a stricter form of @code{memory_operand} which allows only | |
821 memory references with a @code{general_operand} as the address | |
822 expression. New uses of this predicate are discouraged, because | |
823 @code{general_operand} is very permissive, so it's hard to tell what | |
824 an @code{indirect_operand} does or does not allow. If a target has | |
825 different requirements for memory operands for different instructions, | |
826 it is better to define target-specific predicates which enforce the | |
827 hardware's requirements explicitly. | |
828 @end defun | |
829 | |
830 @defun push_operand | |
831 This predicate allows a memory reference suitable for pushing a value | |
832 onto the stack. This will be a @code{MEM} which refers to | |
833 @code{stack_pointer_rtx}, with a side-effect in its address expression | |
834 (@pxref{Incdec}); which one is determined by the | |
835 @code{STACK_PUSH_CODE} macro (@pxref{Frame Layout}). | |
836 @end defun | |
837 | |
838 @defun pop_operand | |
839 This predicate allows a memory reference suitable for popping a value | |
840 off the stack. Again, this will be a @code{MEM} referring to | |
841 @code{stack_pointer_rtx}, with a side-effect in its address | |
842 expression. However, this time @code{STACK_POP_CODE} is expected. | |
843 @end defun | |
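
A sketch of a ``load address'' pattern using @code{address_operand} is
shown below.  The @samp{lea} opcode is hypothetical; @samp{%a1} asks
for operand 1 to be printed as a memory address, and the mode
@code{SI} on operand 1 is read as the mode of the memory reference
the address would be used for.

@smallexample
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
        (match_operand:SI 1 "address_operand" "p"))]
  ""
  "lea %a1,%0")
@end smallexample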
844 | |
845 @noindent | |
846 The fourth category of predicates allows some combination of the above | |
847 operands. | |
848 | |
849 @defun nonmemory_operand | |
850 This predicate allows any immediate or register operand valid for @var{mode}. | |
851 @end defun | |
852 | |
853 @defun nonimmediate_operand | |
854 This predicate allows any register or memory operand valid for @var{mode}. | |
855 @end defun | |
856 | |
857 @defun general_operand | |
858 This predicate allows any immediate, register, or memory operand | |
859 valid for @var{mode}. | |
860 @end defun | |
861 | |
862 @noindent | |
863 Finally, there is one generic operator predicate. | |
864 | |
865 @defun comparison_operator | |
866 This predicate matches any expression which performs an arithmetic | |
867 comparison in @var{mode}; that is, @code{COMPARISON_P} is true for the | |
868 expression code. | |
869 @end defun | |
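
As an illustration, a conditional branch pattern might use
@code{comparison_operator} with @code{match_operator} roughly as
follows.  This is a sketch under the assumption of a target that
compares two registers and branches in one instruction; the output
template is left elided.

@smallexample
(define_insn ""
  [(set (pc)
        (if_then_else
          (match_operator 0 "comparison_operator"
            [(match_operand:SI 1 "register_operand" "r")
             (match_operand:SI 2 "register_operand" "r")])
          (label_ref (match_operand 3 "" ""))
          (pc)))]
  ""
  "@dots{}")
@end smallexample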
870 | |
871 @node Defining Predicates | |
872 @subsection Defining Machine-Specific Predicates | |
873 @cindex defining predicates | |
874 @findex define_predicate | |
875 @findex define_special_predicate | |
876 | |
877 Many machines have requirements for their operands that cannot be | |
878 expressed precisely using the generic predicates. You can define | |
879 additional predicates using @code{define_predicate} and | |
880 @code{define_special_predicate} expressions. These expressions have | |
881 three operands: | |
882 | |
883 @itemize @bullet | |
884 @item | |
885 The name of the predicate, as it will be referred to in | |
886 @code{match_operand} or @code{match_operator} expressions. | |
887 | |
888 @item | |
889 An RTL expression which evaluates to true if the predicate allows the | |
890 operand @var{op}, false if it does not. This expression can only use | |
891 the following RTL codes: | |
892 | |
893 @table @code | |
894 @item MATCH_OPERAND | |
895 When written inside a predicate expression, a @code{MATCH_OPERAND} | |
896 expression evaluates to true if the predicate it names would allow | |
897 @var{op}. The operand number and constraint are ignored. Due to | |
898 limitations in @command{genrecog}, you can only refer to generic | |
899 predicates and predicates that have already been defined. | |
900 | |
901 @item MATCH_CODE | |
902 This expression evaluates to true if @var{op} or a specified | |
903 subexpression of @var{op} has one of a given list of RTX codes. | |
904 | |
905 The first operand of this expression is a string constant containing a | |
906 comma-separated list of RTX code names (in lower case). These are the | |
907 codes for which the @code{MATCH_CODE} will be true. | |
908 | |
909 The second operand is a string constant which indicates what | |
910 subexpression of @var{op} to examine. If it is absent or the empty | |
911 string, @var{op} itself is examined. Otherwise, the string constant | |
912 must be a sequence of digits and/or lowercase letters. Each character | |
913 indicates a subexpression to extract from the current expression; for | |
914 the first character this is @var{op}, for the second and subsequent | |
915 characters it is the result of the previous character. A digit | |
916 @var{n} extracts @samp{@w{XEXP (@var{e}, @var{n})}}; a letter @var{l} | |
917 extracts @samp{@w{XVECEXP (@var{e}, 0, @var{n})}} where @var{n} is the | |
918 alphabetic ordinal of @var{l} (0 for @samp{a}, 1 for @samp{b}, and so on). The | |
919 @code{MATCH_CODE} then examines the RTX code of the subexpression | |
920 extracted by the complete string. It is not possible to extract | |
921 components of an @code{rtvec} that is not at position 0 within its RTX | |
922 object. | |
923 | |
924 @item MATCH_TEST | |
925 This expression has one operand, a string constant containing a C | |
926 expression. The predicate's arguments, @var{op} and @var{mode}, are | |
927 available with those names in the C expression. The @code{MATCH_TEST} | |
928 evaluates to true if the C expression evaluates to a nonzero value. | |
929 @code{MATCH_TEST} expressions must not have side effects. | |
930 | |
931 @item AND | |
932 @itemx IOR | |
933 @itemx NOT | |
934 @itemx IF_THEN_ELSE | |
935 The basic @samp{MATCH_} expressions can be combined using these | |
936 logical operators, which have the semantics of the C operators | |
937 @samp{&&}, @samp{||}, @samp{!}, and @samp{@w{? :}} respectively. As | |
938 in Common Lisp, you may give an @code{AND} or @code{IOR} expression an | |
939 arbitrary number of arguments; this has exactly the same effect as | |
940 writing a chain of two-argument @code{AND} or @code{IOR} expressions. | |
941 @end table | |
942 | |
943 @item | |
944 An optional block of C code, which should execute | |
945 @samp{@w{return true}} if the predicate is found to match and | |
946 @samp{@w{return false}} if it does not. It must not have any side | |
947 effects. The predicate arguments, @var{op} and @var{mode}, are | |
948 available with those names. | |
949 | |
950 If a code block is present in a predicate definition, then the RTL | |
951 expression must evaluate to true @emph{and} the code block must | |
952 execute @samp{@w{return true}} for the predicate to allow the operand. | |
953 The RTL expression is evaluated first; do not re-check anything in the | |
954 code block that was checked in the RTL expression. | |
955 @end itemize | |
956 | |
957 The program @command{genrecog} scans @code{define_predicate} and | |
958 @code{define_special_predicate} expressions to determine which RTX | |
959 codes are possibly allowed. You should always make this explicit in | |
960 the RTL predicate expression, using @code{MATCH_OPERAND} and | |
961 @code{MATCH_CODE}. | |
962 | |
963 Here is an example of a simple predicate definition, from the IA64 | |
964 machine description: | |
965 | |
966 @smallexample | |
967 @group | |
968 ;; @r{True if @var{op} is a @code{SYMBOL_REF} which refers to the sdata section.} | |
969 (define_predicate "small_addr_symbolic_operand" | |
970 (and (match_code "symbol_ref") | |
971 (match_test "SYMBOL_REF_SMALL_ADDR_P (op)"))) | |
972 @end group | |
973 @end smallexample | |
974 | |
975 @noindent | |
976 And here is another, showing the use of the C block. | |
977 | |
978 @smallexample | |
979 @group | |
980 ;; @r{True if @var{op} is a register operand that is (or could be) a GR reg.} | |
981 (define_predicate "gr_register_operand" | |
982 (match_operand 0 "register_operand") | |
983 @{ | |
984 unsigned int regno; | |
985 if (GET_CODE (op) == SUBREG) | |
986 op = SUBREG_REG (op); | |
987 | |
988 regno = REGNO (op); | |
989 return (regno >= FIRST_PSEUDO_REGISTER || GENERAL_REGNO_P (regno)); | |
990 @}) | |
991 @end group | |
992 @end smallexample | |
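
@noindent
The second operand of @code{MATCH_CODE} can be sketched with a
hypothetical predicate, not taken from any real port, that accepts
only the sum of a register and a constant integer.  The strings
@samp{"0"} and @samp{"1"} select @samp{@w{XEXP (op, 0)}} and
@samp{@w{XEXP (op, 1)}} respectively.

@smallexample
@group
;; True if op has the form (plus (reg) (const_int)).
(define_predicate "reg_plus_const_operand"
  (and (match_code "plus")
       (match_code "reg" "0")
       (match_code "const_int" "1")))
@end group
@end smallexample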
993 | |
994 Predicates written with @code{define_predicate} automatically include | |
995 a test that @var{mode} is @code{VOIDmode}, or @var{op} has the same | |
996 mode as @var{mode}, or @var{op} is a @code{CONST_INT} or | |
997 @code{CONST_DOUBLE}. They do @emph{not} check specifically for | |
998 integer @code{CONST_DOUBLE}, nor do they test that the value of either | |
999 kind of constant fits in the requested mode. This is because | |
1000 target-specific predicates that take constants usually have to do more | |
1001 stringent value checks anyway. If you need the exact same treatment | |
1002 of @code{CONST_INT} or @code{CONST_DOUBLE} that the generic predicates | |
1003 provide, use a @code{MATCH_OPERAND} subexpression to call | |
1004 @code{const_int_operand}, @code{const_double_operand}, or | |
1005 @code{immediate_operand}. | |
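
For example, a hypothetical predicate for a 5-bit unsigned immediate
could reuse the generic checks of @code{const_int_operand} and then
add its own range test.  The name and the range below are assumptions
chosen for illustration.

@smallexample
@group
;; True if op is a CONST_INT in the range 0 to 31.
(define_predicate "shift_count_operand"
  (and (match_operand 0 "const_int_operand")
       (match_test "INTVAL (op) >= 0 && INTVAL (op) < 32")))
@end group
@end smallexample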
1006 | |
1007 Predicates written with @code{define_special_predicate} do not get any | |
1008 automatic mode checks, and are treated as having special mode handling | |
1009 by @command{genrecog}. | |
1010 | |
1011 The program @command{genpreds} is responsible for generating code to | |
1012 test predicates. It also writes a header file containing function | |
1013 declarations for all machine-specific predicates. It is not necessary | |
1014 to declare these predicates in @file{@var{cpu}-protos.h}. | |
1015 @end ifset | |
1016 | |
1017 @c Most of this node appears by itself (in a different place) even | |
1018 @c when the INTERNALS flag is clear. Passages that require the internals | |
1019 @c manual's context are conditionalized to appear only in the internals manual. | |
1020 @ifset INTERNALS | |
1021 @node Constraints | |
1022 @section Operand Constraints | |
1023 @cindex operand constraints | |
1024 @cindex constraints | |
1025 | |
1026 Each @code{match_operand} in an instruction pattern can specify | |
1027 constraints for the operands allowed. The constraints allow you to | |
1028 fine-tune matching within the set of operands allowed by the | |
1029 predicate. | |
1030 | |
1031 @end ifset | |
1032 @ifclear INTERNALS | |
1033 @node Constraints | |
1034 @section Constraints for @code{asm} Operands | |
1035 @cindex operand constraints, @code{asm} | |
1036 @cindex constraints, @code{asm} | |
1037 @cindex @code{asm} constraints | |
1038 | |
1039 Here are specific details on what constraint letters you can use with | |
1040 @code{asm} operands. | |
1041 @end ifclear | |
1042 Constraints can say whether | |
1043 an operand may be in a register, and which kinds of register; whether the | |
1044 operand can be a memory reference, and which kinds of address; whether the | |
1045 operand may be an immediate constant, and which possible values it may | |
1046 have. Constraints can also require two operands to match. | |
1047 | |
1048 @ifset INTERNALS | |
1049 @menu | |
1050 * Simple Constraints:: Basic use of constraints. | |
1051 * Multi-Alternative:: When an insn has two alternative constraint-patterns. | |
1052 * Class Preferences:: Constraints guide which hard register to put things in. | |
1053 * Modifiers:: More precise control over effects of constraints. | |
1054 * Disable Insn Alternatives:: Disable insn alternatives using the @code{enabled} attribute. | |
1055 * Machine Constraints:: Existing constraints for some particular machines. | |
1056 * Define Constraints:: How to define machine-specific constraints. | |
1057 * C Constraint Interface:: How to test constraints from C code. | |
1058 @end menu | |
1059 @end ifset | |
1060 | |
1061 @ifclear INTERNALS | |
1062 @menu | |
1063 * Simple Constraints:: Basic use of constraints. | |
1064 * Multi-Alternative:: When an insn has two alternative constraint-patterns. | |
1065 * Modifiers:: More precise control over effects of constraints. | |
1066 * Machine Constraints:: Special constraints for some particular machines. | |
1067 @end menu | |
1068 @end ifclear | |
1069 | |
1070 @node Simple Constraints | |
1071 @subsection Simple Constraints | |
1072 @cindex simple constraints | |
1073 | |
1074 The simplest kind of constraint is a string full of letters, each of | |
1075 which describes one kind of operand that is permitted. Here are | |
1076 the letters that are allowed: | |
1077 | |
1078 @table @asis | |
1079 @item whitespace | |
1080 Whitespace characters are ignored and can be inserted at any position | |
1081 except the first. This enables each alternative for different operands to | |
1082 be visually aligned in the machine description even if they have | |
1083 different numbers of constraints and modifiers. | |
1084 | |
1085 @cindex @samp{m} in constraint | |
1086 @cindex memory references in constraints | |
1087 @item @samp{m} | |
1088 A memory operand is allowed, with any kind of address that the machine | |
1089 supports in general. | |
1090 Note that the letter used for the general memory constraint can be | |
1091 re-defined by a back end using the @code{TARGET_MEM_CONSTRAINT} macro. | |
1092 | |
1093 @cindex offsettable address | |
1094 @cindex @samp{o} in constraint | |
1095 @item @samp{o} | |
1096 A memory operand is allowed, but only if the address is | |
1097 @dfn{offsettable}. This means that a small integer (actually, | |
1098 the width in bytes of the operand, as determined by its machine mode) | |
1099 may be added to the address and the result is also a valid memory | |
1100 address. | |
1101 | |
1102 @cindex autoincrement/decrement addressing | |
1103 For example, an address which is constant is offsettable; so is an | |
1104 address that is the sum of a register and a constant (as long as a | |
1105 slightly larger constant is also within the range of address-offsets | |
1106 supported by the machine); but an autoincrement or autodecrement | |
1107 address is not offsettable. More complicated indirect/indexed | |
1108 addresses may or may not be offsettable depending on the other | |
1109 addressing modes that the machine supports. | |
1110 | |
1111 Note that in an output operand which can be matched by another | |
1112 operand, the constraint letter @samp{o} is valid only when accompanied | |
1113 by both @samp{<} (if the target machine has predecrement addressing) | |
1114 and @samp{>} (if the target machine has preincrement addressing). | |
1115 | |
1116 @cindex @samp{V} in constraint | |
1117 @item @samp{V} | |
1118 A memory operand that is not offsettable. In other words, anything that | |
1119 would fit the @samp{m} constraint but not the @samp{o} constraint. | |
1120 | |
1121 @cindex @samp{<} in constraint | |
1122 @item @samp{<} | |
1123 A memory operand with autodecrement addressing (either predecrement or | |
1124 postdecrement) is allowed. | |
1125 | |
1126 @cindex @samp{>} in constraint | |
1127 @item @samp{>} | |
1128 A memory operand with autoincrement addressing (either preincrement or | |
1129 postincrement) is allowed. | |
1130 | |
1131 @cindex @samp{r} in constraint | |
1132 @cindex registers in constraints | |
1133 @item @samp{r} | |
1134 A register operand is allowed provided that it is in a general | |
1135 register. | |
1136 | |
1137 @cindex constants in constraints | |
1138 @cindex @samp{i} in constraint | |
1139 @item @samp{i} | |
1140 An immediate integer operand (one with constant value) is allowed. | |
1141 This includes symbolic constants whose values will be known only at | |
1142 assembly time or later. | |
1143 | |
1144 @cindex @samp{n} in constraint | |
1145 @item @samp{n} | |
1146 An immediate integer operand with a known numeric value is allowed. | |
1147 Many systems cannot support assembly-time constants for operands less | |
1148 than a word wide. Constraints for these operands should use @samp{n} | |
1149 rather than @samp{i}. | |
1150 | |
1151 @cindex @samp{I} in constraint | |
1152 @item @samp{I}, @samp{J}, @samp{K}, @dots{} @samp{P} | |
1153 Other letters in the range @samp{I} through @samp{P} may be defined in | |
1154 a machine-dependent fashion to permit immediate integer operands with | |
1155 explicit integer values in specified ranges. For example, on the | |
1156 68000, @samp{I} is defined to stand for the range of values 1 to 8. | |
1157 This is the range permitted as a shift count in the shift | |
1158 instructions. | |
1159 | |
1160 @cindex @samp{E} in constraint | |
1161 @item @samp{E} | |
1162 An immediate floating operand (expression code @code{const_double}) is | |
1163 allowed, but only if the target floating point format is the same as | |
1164 that of the host machine (on which the compiler is running). | |
1165 | |
1166 @cindex @samp{F} in constraint | |
1167 @item @samp{F} | |
1168 An immediate floating operand (expression code @code{const_double} or | |
1169 @code{const_vector}) is allowed. | |
1170 | |
1171 @cindex @samp{G} in constraint | |
1172 @cindex @samp{H} in constraint | |
1173 @item @samp{G}, @samp{H} | |
1174 @samp{G} and @samp{H} may be defined in a machine-dependent fashion to | |
1175 permit immediate floating operands in particular ranges of values. | |
1176 | |
1177 @cindex @samp{s} in constraint | |
1178 @item @samp{s} | |
1179 An immediate integer operand whose value is not an explicit integer is | |
1180 allowed. | |
1181 | |
1182 This might appear strange; if an insn allows a constant operand with a | |
1183 value not known at compile time, it certainly must allow any known | |
1184 value. So why use @samp{s} instead of @samp{i}? Sometimes it allows | |
1185 better code to be generated. | |
1186 | |
1187 For example, on the 68000 in a fullword instruction it is possible to | |
1188 use an immediate operand; but if the immediate value is between @minus{}128 | |
1189 and 127, better code results from loading the value into a register and | |
1190 using the register. This is because the load into the register can be | |
1191 done with a @samp{moveq} instruction. We arrange for this to happen | |
1192 by defining the letter @samp{K} to mean ``any integer outside the | |
1193 range @minus{}128 to 127'', and then specifying @samp{Ks} in the operand | |
1194 constraints. | |
1195 | |
1196 @cindex @samp{g} in constraint | |
1197 @item @samp{g} | |
1198 Any register, memory or immediate integer operand is allowed, except for | |
1199 registers that are not general registers. | |
1200 | |
1201 @cindex @samp{X} in constraint | |
1202 @item @samp{X} | |
1203 @ifset INTERNALS | |
1204 Any operand whatsoever is allowed, even if it does not satisfy | |
1205 @code{general_operand}. This is normally used in the constraint of | |
1206 a @code{match_scratch} when certain alternatives will not actually | |
1207 require a scratch register. | |
1208 @end ifset | |
1209 @ifclear INTERNALS | |
1210 Any operand whatsoever is allowed. | |
1211 @end ifclear | |
1212 | |
1213 @cindex @samp{0} in constraint | |
1214 @cindex digits in constraint | |
1215 @item @samp{0}, @samp{1}, @samp{2}, @dots{} @samp{9} | |
1216 An operand that matches the specified operand number is allowed. If a | |
1217 digit is used together with letters within the same alternative, the | |
1218 digit should come last. | |
1219 | |
1220 This number is allowed to be more than a single digit. If multiple | |
1221 digits are encountered consecutively, they are interpreted as a single | |
1222 decimal integer. There is scant chance for ambiguity, since to date | |
1223 it has never been desirable that @samp{10} be interpreted as matching | |
1224 either operand 1 @emph{or} operand 0. Should this be desired, one | |
1225 can use multiple alternatives instead. | |
1226 | |
1227 @cindex matching constraint | |
1228 @cindex constraint, matching | |
1229 This is called a @dfn{matching constraint} and what it really means is | |
1230 that the assembler has only a single operand that fills two roles | |
1231 @ifset INTERNALS | |
1232 considered separate in the RTL insn. For example, an add insn has two | |
1233 input operands and one output operand in the RTL, but on most CISC | |
1234 @end ifset | |
1235 @ifclear INTERNALS | |
1236 which @code{asm} distinguishes. For example, an add instruction uses | |
1237 two input operands and an output operand, but on most CISC | |
1238 @end ifclear | |
1239 machines an add instruction really has only two operands, one of them an | |
1240 input-output operand: | |
1241 | |
1242 @smallexample | |
1243 addl #35,r12 | |
1244 @end smallexample | |
1245 | |
1246 Matching constraints are used in these circumstances. | |
1247 More precisely, the two operands that match must include one input-only | |
1248 operand and one output-only operand. Moreover, the digit must be a | |
1249 smaller number than the number of the operand that uses it in the | |
1250 constraint. | |
1251 | |
1252 @ifset INTERNALS | |
1253 For operands to match in a particular case usually means that they | |
1254 are identical-looking RTL expressions. But in a few special cases | |
1255 specific kinds of dissimilarity are allowed. For example, @code{*x} | |
1256 as an input operand will match @code{*x++} as an output operand. | |
1257 For proper results in such cases, the output template should always | |
1258 use the output-operand's number when printing the operand. | |
1259 @end ifset | |
1260 | |
1261 @cindex load address instruction | |
1262 @cindex push address instruction | |
1263 @cindex address constraints | |
1264 @cindex @samp{p} in constraint | |
1265 @item @samp{p} | |
1266 An operand that is a valid memory address is allowed. This is | |
1267 for ``load address'' and ``push address'' instructions. | |
1268 | |
1269 @findex address_operand | |
1270 @samp{p} in the constraint must be accompanied by @code{address_operand} | |
1271 as the predicate in the @code{match_operand}. This predicate interprets | |
1272 the mode specified in the @code{match_operand} as the mode of the memory | |
1273 reference for which the address would be valid. | |
1274 | |
1275 @cindex other register constraints | |
1276 @cindex extensible constraints | |
1277 @item @var{other-letters} | |
1278 Other letters can be defined in machine-dependent fashion to stand for | |
1279 particular classes of registers or other arbitrary operand types. | |
1280 @samp{d}, @samp{a} and @samp{f} are defined on the 68000/68020 to stand | |
1281 for data, address and floating point registers. | |
1282 @end table | |
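
@ifset INTERNALS
As a small sketch of how these letters combine, the pattern below
(not taken from a real port) requires operand 0 to be a general
register, ties operand 1 to it with a matching constraint, and lets
operand 2 be a register, a memory reference, or an integer with a
known numeric value:

@smallexample
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "0")
                 (match_operand:SI 2 "general_operand" "rmn")))]
  ""
  "@dots{}")
@end smallexample
@end ifset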
1283 | |
1284 @ifset INTERNALS | |
1285 In order to have valid assembler code, each operand must satisfy | |
1286 its constraint. But a failure to do so does not prevent the pattern | |
1287 from applying to an insn. Instead, it directs the compiler to modify | |
1288 the code so that the constraint will be satisfied. Usually this is | |
1289 done by copying an operand into a register. | |
1290 | |
1291 Contrast, therefore, the two instruction patterns that follow: | |
1292 | |
1293 @smallexample | |
1294 (define_insn "" | |
1295 [(set (match_operand:SI 0 "general_operand" "=r") | |
1296 (plus:SI (match_dup 0) | |
1297 (match_operand:SI 1 "general_operand" "r")))] | |
1298 "" | |
1299 "@dots{}") | |
1300 @end smallexample | |
1301 | |
1302 @noindent | |
1303 which has two operands, one of which must appear in two places, and | |
1304 | |
1305 @smallexample | |
1306 (define_insn "" | |
1307 [(set (match_operand:SI 0 "general_operand" "=r") | |
1308 (plus:SI (match_operand:SI 1 "general_operand" "0") | |
1309 (match_operand:SI 2 "general_operand" "r")))] | |
1310 "" | |
1311 "@dots{}") | |
1312 @end smallexample | |
1313 | |
1314 @noindent | |
1315 which has three operands, two of which are required by a constraint to be | |
1316 identical. If we are considering an insn of the form | |
1317 | |
1318 @smallexample | |
1319 (insn @var{n} @var{prev} @var{next} | |
1320 (set (reg:SI 3) | |
1321 (plus:SI (reg:SI 6) (reg:SI 109))) | |
1322 @dots{}) | |
1323 @end smallexample | |
1324 | |
1325 @noindent | |
1326 the first pattern would not apply at all, because this insn does not | |
1327 contain two identical subexpressions in the right place. The pattern would | |
1328 say, ``That does not look like an add instruction; try other patterns''. | |
1329 The second pattern would say, ``Yes, that's an add instruction, but there | |
1330 is something wrong with it''. It would direct the reload pass of the | |
1331 compiler to generate additional insns to make the constraint true. The | |
1332 results might look like this: | |
1333 | |
1334 @smallexample | |
1335 (insn @var{n2} @var{prev} @var{n} | |
1336 (set (reg:SI 3) (reg:SI 6)) | |
1337 @dots{}) | |
1338 | |
1339 (insn @var{n} @var{n2} @var{next} | |
1340 (set (reg:SI 3) | |
1341 (plus:SI (reg:SI 3) (reg:SI 109))) | |
1342 @dots{}) | |
1343 @end smallexample | |
1344 | |
1345 It is up to you to make sure that each operand, in each pattern, has | |
1346 constraints that can handle any RTL expression that could be present for | |
1347 that operand. (When multiple alternatives are in use, each pattern must, | |
1348 for each possible combination of operand expressions, have at least one | |
1349 alternative which can handle that combination of operands.) The | |
1350 constraints don't need to @emph{allow} any possible operand---when this is | |
1351 the case, they do not constrain---but they must at least point the way to | |
1352 reloading any possible operand so that it will fit. | |
1353 | |
1354 @itemize @bullet | |
1355 @item | |
1356 If the constraint accepts whatever operands the predicate permits, | |
1357 there is no problem: reloading is never necessary for this operand. | |
1358 | |
1359 For example, an operand whose constraints permit everything except | |
1360 registers is safe provided its predicate rejects registers. | |
1361 | |
1362 An operand whose predicate accepts only constant values is safe | |
1363 provided its constraints include the letter @samp{i}. If any possible | |
1364 constant value is accepted, then nothing less than @samp{i} will do; | |
1365 if the predicate is more selective, then the constraints may also be | |
1366 more selective. | |
1367 | |
1368 @item | |
1369 Any operand expression can be reloaded by copying it into a register. | |
1370 So if an operand's constraints allow some kind of register, it is | |
1371 certain to be safe. It need not permit all classes of registers; the | |
1372 compiler knows how to copy a register into another register of the | |
1373 proper class in order to make an instruction valid. | |
1374 | |
1375 @cindex nonoffsettable memory reference | |
1376 @cindex memory reference, nonoffsettable | |
1377 @item | |
1378 A nonoffsettable memory reference can be reloaded by copying the | |
1379 address into a register. So if the constraint uses the letter | |
1380 @samp{o}, all memory references are taken care of. | |
1381 | |
1382 @item | |
1383 A constant operand can be reloaded by allocating space in memory to | |
1384 hold it as preinitialized data. Then the memory reference can be used | |
1385 in place of the constant. So if the constraint uses the letters | |
1386 @samp{o} or @samp{m}, constant operands are not a problem. | |
1387 | |
1388 @item | |
1389 If the constraint permits a constant and a pseudo register used in an insn | |
1390 was not allocated to a hard register and is equivalent to a constant, | |
1391 the register will be replaced with the constant. If the predicate does | |
1392 not permit a constant and the insn is re-recognized for some reason, the | |
1393 compiler will crash. Thus the predicate must always recognize any | |
1394 objects allowed by the constraint. | |
1395 @end itemize | |
1396 | |
1397 If the operand's predicate can recognize registers, but the constraint does | |
1398 not permit them, it can make the compiler crash. When this operand happens | |
1399 to be a register, the reload pass will be stymied, because it does not know | |
1400 how to copy a register temporarily into memory. | |
1401 | |
1402 If the predicate accepts a unary operator, the constraint applies to the | |
1403 operand. For example, the MIPS processor at ISA level 3 supports an | |
1404 instruction which adds two registers in @code{SImode} to produce a | |
1405 @code{DImode} result, but only if the registers are correctly sign | |
1406 extended. The predicate for the input operands accepts a | |
1407 @code{sign_extend} of an @code{SImode} register. Write the constraint | |
1408 to indicate the type of register that is required for the operand of the | |
1409 @code{sign_extend}. | |
1410 @end ifset | |
1411 | |
1412 @node Multi-Alternative | |
1413 @subsection Multiple Alternative Constraints | |
1414 @cindex multiple alternative constraints | |
1415 | |
1416 Sometimes a single instruction has multiple alternative sets of possible | |
1417 operands. For example, on the 68000, a logical-or instruction can combine | |
1418 a register or an immediate value into memory, or it can combine any kind of | |
1419 operand into a register; but it cannot combine one memory location into | |
1420 another. | |
1421 | |
1422 These constraints are represented as multiple alternatives. An alternative | |
1423 can be described by a series of letters for each operand. The overall | |
1424 constraint for an operand is made from the letters for this operand | |
1425 from the first alternative, a comma, the letters for this operand from | |
1426 the second alternative, a comma, and so on until the last alternative. | |
1427 @ifset INTERNALS | |
1428 Here is how it is done for fullword logical-or on the 68000: | |
1429 | |
1430 @smallexample | |
1431 (define_insn "iorsi3" | |
1432 [(set (match_operand:SI 0 "general_operand" "=m,d") | |
1433 (ior:SI (match_operand:SI 1 "general_operand" "%0,0") | |
1434 (match_operand:SI 2 "general_operand" "dKs,dmKs")))] | |
1435 @dots{}) | |
1436 @end smallexample | |
1437 | |
1438 The first alternative has @samp{m} (memory) for operand 0, @samp{0} for | |
1439 operand 1 (meaning it must match operand 0), and @samp{dKs} for operand | |
1440 2. The second alternative has @samp{d} (data register) for operand 0, | |
1441 @samp{0} for operand 1, and @samp{dmKs} for operand 2. The @samp{=} and | |
1442 @samp{%} in the constraints apply to all the alternatives; their | |
1443 meaning is explained in the next section (@pxref{Class Preferences}). | |
1444 @end ifset | |
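
The comma-separated layout described above can be illustrated with a small C sketch. The helper @code{alternative_letters} is hypothetical and is not part of GCC; it only shows how the letters for one alternative are picked out of an operand's overall constraint string. Note that a leading modifier such as @samp{=} stays attached to the first alternative's letters in this simplified view.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  Copy the letters of
   alternative number ALT (0-based) from the operand constraint string
   CONSTRAINT into OUT.  Returns 0 on success, or -1 if ALT does not
   exist or OUT is too small.  */
int alternative_letters(const char *constraint, int alt,
                        char *out, size_t outsz)
{
    const char *start = constraint;

    /* Skip ALT commas to reach the requested alternative.  */
    for (int i = 0; i < alt; i++) {
        const char *comma = strchr(start, ',');
        if (comma == NULL)
            return -1;
        start = comma + 1;
    }

    /* The alternative ends at the next comma or at the end of string.  */
    size_t len = strcspn(start, ",");
    if (len + 1 > outsz)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

For the constraint @samp{dKs,dmKs}, alternative 0 yields @samp{dKs} and alternative 1 yields @samp{dmKs}.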
1445 | |
1446 @c FIXME Is this ? and ! stuff of use in asm()? If not, hide unless INTERNAL | |
1447 If all the operands fit any one alternative, the instruction is valid. | |
1448 Otherwise, for each alternative, the compiler counts how many instructions | |
1449 must be added to copy the operands so that that alternative applies. | |
1450 The alternative requiring the least copying is chosen. If two alternatives | |
1451 need the same amount of copying, the one that comes first is chosen. | |
1452 These choices can be altered with the @samp{?} and @samp{!} characters: | |
1453 | |
1454 @table @code | |
1455 @cindex @samp{?} in constraint | |
1456 @cindex question mark | |
1457 @item ? | |
1458 Disparage slightly the alternative that the @samp{?} appears in, | |
1459 as a choice when no alternative applies exactly. The compiler regards | |
1460 this alternative as one unit more costly for each @samp{?} that appears | |
1461 in it. | |
1462 | |
1463 @cindex @samp{!} in constraint | |
1464 @cindex exclamation point | |
1465 @item ! | |
1466 Disparage severely the alternative that the @samp{!} appears in. | |
1467 This alternative can still be used if it fits without reloading, | |
1468 but if reloading is needed, some other alternative will be used. | |
1469 @end table | |
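
The selection rule above can be sketched as a toy model in C. This is only an illustration of the rule as stated in this section, not the algorithm the reload pass actually implements. The function name @code{pick_alternative} and its inputs are assumptions: the caller supplies, for each alternative, its constraint letters and the number of copy instructions that would be needed to make it apply.

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of character C in string S.  */
static int count_char(const char *s, char c)
{
    int n = 0;
    for (; *s != '\0'; s++)
        if (*s == c)
            n++;
    return n;
}

/* Toy model, not GCC code.  ALTS[i] holds the constraint letters of
   alternative i; COPIES[i] is the number of copy instructions needed
   to make alternative i apply.  Returns the index of the chosen
   alternative, or -1 if none is usable.  */
int pick_alternative(const char *const *alts, const int *copies, size_t n)
{
    /* An alternative that fits exactly makes the instruction valid.  */
    for (size_t i = 0; i < n; i++)
        if (copies[i] == 0)
            return (int)i;

    int best = -1;
    int best_cost = 0;
    for (size_t i = 0; i < n; i++) {
        /* '!' rules the alternative out once reloading is needed.  */
        if (strchr(alts[i], '!') != NULL)
            continue;
        /* Each '?' makes the alternative one unit more costly.  */
        int cost = copies[i] + count_char(alts[i], '?');
        /* Strict '<' keeps the earlier alternative on a tie.  */
        if (best < 0 || cost < best_cost) {
            best = (int)i;
            best_cost = cost;
        }
    }
    return best;
}
```

With two alternatives that each need one copy, the first is chosen; writing @samp{?} in the first tips the choice to the second.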
1470 | |
1471 @ifset INTERNALS | |
1472 When an insn pattern has multiple alternatives in its constraints, often | |
1473 the appearance of the assembler code is determined mostly by which | |
1474 alternative was matched. When this is so, the C code for writing the | |
1475 assembler code can use the variable @code{which_alternative}, which is | |
1476 the ordinal number of the alternative that was actually satisfied (0 for | |
1477 the first, 1 for the second alternative, etc.). @xref{Output Statement}. | |
1478 @end ifset | |
1479 | |
1480 @ifset INTERNALS | |
1481 @node Class Preferences | |
1482 @subsection Register Class Preferences | |
1483 @cindex class preference constraints | |
1484 @cindex register class preference constraints | |
1485 | |
1486 @cindex voting between constraint alternatives | |
1487 The operand constraints have another function: they enable the compiler | |
1488 to decide which kind of hardware register a pseudo register is best | |
1489 allocated to. The compiler examines the constraints that apply to the | |
1490 insns that use the pseudo register, looking for the machine-dependent | |
1491 letters such as @samp{d} and @samp{a} that specify classes of registers. | |
1492 The pseudo register is put in whichever class gets the most ``votes''. | |
1493 The constraint letters @samp{g} and @samp{r} also vote: they vote in | |
1494 favor of a general register. The machine description says which registers | |
1495 are considered general. | |
1496 | |
1497 Of course, on some machines all registers are equivalent, and no register | |
1498 classes are defined. Then none of this complexity is relevant. | |
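
The voting idea can be sketched in C as follows. This is a deliberately simplified model and not the compiler's real class-preference computation. It assumes 68000-style class letters (@samp{d}, @samp{a}, @samp{f}) plus @samp{r}, counts @samp{g} as a vote for the general class, and breaks ties in a fixed order; the name @code{preferred_class} is hypothetical.

```c
#include <stddef.h>

/* Toy model, not GCC code.  CONSTRAINTS holds the constraint strings
   of every operand position in which one pseudo register is used.
   Returns the class letter with the most votes ('r' stands for the
   general class), or '\0' if no class letter appears at all.  */
char preferred_class(const char *const *constraints, size_t n)
{
    int votes[256] = {0};

    for (size_t i = 0; i < n; i++)
        for (const char *p = constraints[i]; *p != '\0'; p++) {
            unsigned char c = (unsigned char)*p;
            if (c == 'g')
                c = 'r';            /* 'g' also votes for general */
            if (c == 'r' || c == 'd' || c == 'a' || c == 'f')
                votes[c]++;
        }

    /* Assumed tie-break: the earlier letter in this list wins.  */
    static const char order[] = "rdaf";
    char best = '\0';
    int best_votes = 0;
    for (const char *p = order; *p != '\0'; p++)
        if (votes[(unsigned char)*p] > best_votes) {
            best = *p;
            best_votes = votes[(unsigned char)*p];
        }
    return best;
}
```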
1499 @end ifset | |
1500 | |
1501 @node Modifiers | |
1502 @subsection Constraint Modifier Characters | |
1503 @cindex modifiers in constraints | |
1504 @cindex constraint modifier characters | |
1505 | |
1506 @c prevent bad page break with this line | |
1507 Here are constraint modifier characters. | |
1508 | |
1509 @table @samp | |
1510 @cindex @samp{=} in constraint | |
1511 @item = | |
1512 Means that this operand is write-only for this instruction: the previous | |
1513 value is discarded and replaced by output data. | |
1514 | |
1515 @cindex @samp{+} in constraint | |
1516 @item + | |
1517 Means that this operand is both read and written by the instruction. | |
1518 | |
1519 When the compiler fixes up the operands to satisfy the constraints, | |
1520 it needs to know which operands are inputs to the instruction and | |
1521 which are outputs from it. @samp{=} identifies an output; @samp{+} | |
1522 identifies an operand that is both input and output; all other operands | |
1523 are assumed to be input only. | |
1524 | |
1525 If you specify @samp{=} or @samp{+} in a constraint, you put it in the | |
1526 first character of the constraint string. | |
1527 | |
1528 @cindex @samp{&} in constraint | |
1529 @cindex earlyclobber operand | |
1530 @item & | |
1531 Means (in a particular alternative) that this operand is an | |
1532 @dfn{earlyclobber} operand, which is modified before the instruction is | |
1533 finished using the input operands. Therefore, this operand may not lie | |
1534 in a register that is used as an input operand or as part of any memory | |
1535 address. | |
1536 | |
1537 @samp{&} applies only to the alternative in which it is written. In | |
1538 constraints with multiple alternatives, sometimes one alternative | |
1539 requires @samp{&} while others do not. See, for example, the | |
1540 @samp{movdf} insn of the 68000. | |
1541 | |
1542 An input operand can be tied to an earlyclobber operand if its only | |
1543 use as an input occurs before the early result is written. Adding | |
1544 alternatives of this form often allows GCC to produce better code | |
1545 when only some of the inputs can be affected by the earlyclobber. | |
1546 See, for example, the @samp{mulsi3} insn of the ARM@. | |
1547 | |
1548 @samp{&} does not obviate the need to write @samp{=}. | |
1549 | |
1550 @cindex @samp{%} in constraint | |
1551 @item % | |
1552 Declares the instruction to be commutative for this operand and the | |
1553 following operand. This means that the compiler may interchange the | |
1554 two operands if that is the cheapest way to make all operands fit the | |
1555 constraints. | |
1556 @ifset INTERNALS | |
1557 This is often used in patterns for addition instructions | |
1558 that really have only two operands: the result must go in one of the | |
1559 arguments. Here, for example, is how the 68000 halfword-add | |
1560 instruction is defined: | |
1561 | |
1562 @smallexample | |
1563 (define_insn "addhi3" | |
1564 [(set (match_operand:HI 0 "general_operand" "=m,r") | |
1565 (plus:HI (match_operand:HI 1 "general_operand" "%0,0") | |
1566 (match_operand:HI 2 "general_operand" "di,g")))] | |
1567 @dots{}) | |
1568 @end smallexample | |
1569 @end ifset | |
1570 GCC can only handle one commutative pair in an asm; if you use more, | |
1571 the compiler may fail. Note that you need not use the modifier if | |
1572 the two alternatives are strictly identical; this would only waste | |
1573 time in the reload pass. The modifier is not operational after | |
1574 register allocation, so the result of @code{define_peephole2} | |
1575 and @code{define_split}s performed after reload cannot rely on | |
1576 @samp{%} to make the intended insn match. | |
1577 | |
1578 @cindex @samp{#} in constraint | |
1579 @item # | |
1580 Says that all following characters, up to the next comma, are to be | |
1581 ignored as a constraint. They are significant only for choosing | |
1582 register preferences. | |
1583 | |
1584 @cindex @samp{*} in constraint | |
1585 @item * | |
1586 Says that the following character should be ignored when choosing | |
1587 register preferences. @samp{*} has no effect on the meaning of the | |
1588 constraint as a constraint, and no effect on reloading. | |
1589 | |
1590 @ifset INTERNALS | |
1591 Here is an example: the 68000 has an instruction to sign-extend a | |
1592 halfword in a data register, and can also sign-extend a value by | |
1593 copying it into an address register. While either kind of register is | |
1594 acceptable, the constraints on an address-register destination are | |
1595 less strict, so it is best if register allocation makes an address | |
1596 register its goal. Therefore, @samp{*} is used so that the @samp{d} | |
1597 constraint letter (for data register) is ignored when computing | |
1598 register preferences. | |
1599 | |
1600 @smallexample | |
1601 (define_insn "extendhisi2" | |
1602 [(set (match_operand:SI 0 "general_operand" "=*d,a") | |
1603 (sign_extend:SI | |
1604 (match_operand:HI 1 "general_operand" "0,g")))] | |
1605 @dots{}) | |
1606 @end smallexample | |
1607 @end ifset | |
1608 @end table | |
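
As a rough illustration of how the @samp{=}, @samp{+} and @samp{&} modifiers described above could be read from a constraint string, here is a minimal C sketch. The names @code{operand_direction} and @code{is_earlyclobber} are hypothetical and do not correspond to functions in the compiler. The sketch relies on the rule that @samp{=} or @samp{+} is the first character of the constraint string, and that @samp{&} applies only to the alternative in which it is written.

```c
#include <string.h>

enum operand_dir { OP_INPUT, OP_OUTPUT, OP_INOUT };

/* Hypothetical helper, not GCC code.  Classify an operand from the
   first character of its constraint string.  */
enum operand_dir operand_direction(const char *constraint)
{
    switch (constraint[0]) {
    case '=':
        return OP_OUTPUT;   /* write-only */
    case '+':
        return OP_INOUT;    /* both read and written */
    default:
        return OP_INPUT;    /* everything else is input only */
    }
}

/* Hypothetical helper.  ALTERNATIVE holds the letters of a single
   alternative (no commas); returns nonzero if it is marked as
   earlyclobber.  */
int is_earlyclobber(const char *alternative)
{
    return strchr(alternative, '&') != NULL;
}
```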
1609 | |
1610 @node Machine Constraints | |
1611 @subsection Constraints for Particular Machines | |
1612 @cindex machine specific constraints | |
1613 @cindex constraints, machine specific | |
1614 | |
1615 Whenever possible, you should use the general-purpose constraint letters | |
1616 in @code{asm} arguments, since they will convey meaning more readily to | |
1617 people reading your code. Failing that, use the constraint letters | |
1618 that usually have very similar meanings across architectures. The most | |
1619 commonly used constraints are @samp{m} and @samp{r} (for memory and | |
1620 general-purpose registers respectively; @pxref{Simple Constraints}), and | |
1621 @samp{I}, usually the letter indicating the most common | |
1622 immediate-constant format. | |
1623 | |
1624 Each architecture defines additional constraints. These constraints | |
1625 are used by the compiler itself for instruction generation, as well as | |
1626 for @code{asm} statements; therefore, some of the constraints are not | |
1627 particularly useful for @code{asm}. Here is a summary of some of the | |
1628 machine-dependent constraints available on some particular machines; | |
1629 it includes both constraints that are useful for @code{asm} and | |
1630 constraints that aren't. The compiler source file mentioned in the | |
1631 table heading for each architecture is the definitive reference for | |
1632 the meanings of that architecture's constraints. | |
1633 | |
1634 @table @emph | |
1635 @item ARM family---@file{config/arm/arm.h} | |
1636 @table @code | |
1637 @item f | |
1638 Floating-point register | |
1639 | |
1640 @item w | |
1641 VFP floating-point register | |
1642 | |
1643 @item F | |
1644 One of the floating-point constants 0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0 | |
1645 or 10.0 | |
1646 | |
1647 @item G | |
1648 Floating-point constant that would satisfy the constraint @samp{F} if it | |
1649 were negated | |
1650 | |
1651 @item I | |
1652 Integer that is valid as an immediate operand in a data processing | |
1653 instruction. That is, an integer in the range 0 to 255 rotated by a | |
1654 multiple of 2 | |
1655 | |
1656 @item J | |
1657 Integer in the range @minus{}4095 to 4095 | |
1658 | |
1659 @item K | |
1660 Integer that satisfies constraint @samp{I} when inverted (ones complement) | |
1661 | |
1662 @item L | |
1663 Integer that satisfies constraint @samp{I} when negated (twos complement) | |
1664 | |
1665 @item M | |
1666 Integer in the range 0 to 32 | |
1667 | |
1668 @item Q | |
1669 A memory reference where the exact address is in a single register | |
1670 (@samp{m} is preferable for @code{asm} statements) | |
1671 | |
1672 @item R | |
1673 An item in the constant pool | |
1674 | |
1675 @item S | |
1676 A symbol in the text segment of the current file | |
1677 | |
1678 @item Uv | |
1679 A memory reference suitable for VFP load/store insns (reg+constant offset) | |
1680 | |
1681 @item Uy | |
1682 A memory reference suitable for iWMMXt load/store instructions. | |
1683 | |
1684 @item Uq | |
1685 A memory reference suitable for the ARMv4 ldrsb instruction. | |
1686 @end table | |
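
The ARM @samp{I}, @samp{K} and @samp{L} descriptions above can be turned into a small C check. This is a sketch that follows only the wording in this table, assuming a 32-bit operand and treating ``rotated by a multiple of 2'' as an even rotation within 32 bits; the function names are hypothetical, and the source file named in the table heading remains the definitive reference.

```c
#include <stdint.h>

/* Sketch of constraint 'I': V is an 8-bit value (0 to 255) rotated by
   an even amount within a 32-bit word.  */
int arm_imm_I(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotate V left by ROT bits; avoid a shift by 32 when ROT is 0.  */
        uint32_t r = (rot == 0) ? v : ((v << rot) | (v >> (32 - rot)));
        if (r <= 0xFFu)
            return 1;
    }
    return 0;
}

/* Sketch of constraint 'K': satisfies 'I' when inverted.  */
int arm_imm_K(uint32_t v)
{
    return arm_imm_I((uint32_t)~v);
}

/* Sketch of constraint 'L': satisfies 'I' when negated.  */
int arm_imm_L(uint32_t v)
{
    return arm_imm_I((uint32_t)(0u - v));
}
```

Under these assumptions, @code{0x3FC} is accepted by @code{arm_imm_I} because it is @code{0xFF} shifted by two bits, while @code{0x1FE} is rejected because it would need an odd rotation.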
1687 | |
1688 @item AVR family---@file{config/avr/constraints.md} | |
1689 @table @code | |
1690 @item l | |
1691 Registers from r0 to r15 | |
1692 | |
1693 @item a | |
1694 Registers from r16 to r23 | |
1695 | |
1696 @item d | |
1697 Registers from r16 to r31 | |
1698 | |
1699 @item w | |
1700 Registers from r24 to r31. These registers can be used in the @samp{adiw} instruction | |
1701 | |
1702 @item e | |
1703 Pointer register (r26--r31) | |
1704 | |
1705 @item b | |
1706 Base pointer register (r28--r31) | |
1707 | |
1708 @item q | |
1709 Stack pointer register (SPH:SPL) | |
1710 | |
1711 @item t | |
1712 Temporary register r0 | |
1713 | |
1714 @item x | |
1715 Register pair X (r27:r26) | |
1716 | |
1717 @item y | |
1718 Register pair Y (r29:r28) | |
1719 | |
1720 @item z | |
1721 Register pair Z (r31:r30) | |
1722 | |
1723 @item I | |
1724 Constant greater than @minus{}1, less than 64 | |
1725 | |
1726 @item J | |
1727 Constant greater than @minus{}64, less than 1 | |
1728 | |
1729 @item K | |
1730 Constant integer 2 | |
1731 | |
1732 @item L | |
1733 Constant integer 0 | |
1734 | |
1735 @item M | |
1736 Constant that fits in 8 bits | |
1737 | |
1738 @item N | |
1739 Constant integer @minus{}1 | |
1740 | |
1741 @item O | |
1742 Constant integer 8, 16, or 24 | |
1743 | |
1744 @item P | |
1745 Constant integer 1 | |
1746 | |
1747 @item G | |
1748 A floating point constant 0.0 | |
1749 | |
1750 @item R | |
1751 Integer constant in the range @minus{}6 @dots{} 5. | |
1752 | |
1753 @item Q | |
1754 A memory address based on Y or Z pointer with displacement. | |
1755 @end table | |
1756 | |
1757 @item CRX Architecture---@file{config/crx/crx.h} | |
1758 @table @code | |
1759 | |
1760 @item b | |
1761 Registers from r0 to r14 (registers without stack pointer) | |
1762 | |
1763 @item l | |
1764 Register r16 (64-bit accumulator lo register) | |
1765 | |
1766 @item h | |
1767 Register r17 (64-bit accumulator hi register) | |
1768 | |
1769 @item k | |
1770 Register pair r16-r17. (64-bit accumulator lo-hi pair) | |
1771 | |
1772 @item I | |
1773 Constant that fits in 3 bits | |
1774 | |
1775 @item J | |
1776 Constant that fits in 4 bits | |
1777 | |
1778 @item K | |
1779 Constant that fits in 5 bits | |
1780 | |
1781 @item L | |
1782 Constant that is one of @minus{}1, 4, @minus{}4, 7, 8, 12, 16, 20, 32, 48 | |
1783 | |
1784 @item G | |
1785 Floating point constant that is legal for store immediate | |
1786 @end table | |
1787 | |
1788 @item Hewlett-Packard PA-RISC---@file{config/pa/pa.h} | |
1789 @table @code | |
1790 @item a | |
1791 General register 1 | |
1792 | |
1793 @item f | |
1794 Floating point register | |
1795 | |
1796 @item q | |
1797 Shift amount register | |
1798 | |
1799 @item x | |
1800 Floating point register (deprecated) | |
1801 | |
1802 @item y | |
1803 Upper floating point register (32-bit), floating point register (64-bit) | |
1804 | |
1805 @item Z | |
1806 Any register | |
1807 | |
1808 @item I | |
1809 Signed 11-bit integer constant | |
1810 | |
1811 @item J | |
1812 Signed 14-bit integer constant | |
1813 | |
1814 @item K | |
1815 Integer constant that can be deposited with a @code{zdepi} instruction | |
1816 | |
1817 @item L | |
1818 Signed 5-bit integer constant | |
1819 | |
1820 @item M | |
1821 Integer constant 0 | |
1822 | |
1823 @item N | |
1824 Integer constant that can be loaded with a @code{ldil} instruction | |
1825 | |
1826 @item O | |
1827 Integer constant whose value plus one is a power of 2 | |
1828 | |
1829 @item P | |
1830 Integer constant that can be used for @code{and} operations in @code{depi} | |
1831 and @code{extru} instructions | |
1832 | |
1833 @item S | |
1834 Integer constant 31 | |
1835 | |
1836 @item U | |
1837 Integer constant 63 | |
1838 | |
1839 @item G | |
1840 Floating-point constant 0.0 | |
1841 | |
1842 @item A | |
1843 A @code{lo_sum} data-linkage-table memory operand | |
1844 | |
1845 @item Q | |
1846 A memory operand that can be used as the destination operand of an | |
1847 integer store instruction | |
1848 | |
1849 @item R | |
1850 A scaled or unscaled indexed memory operand | |
1851 | |
1852 @item T | |
1853 A memory operand for floating-point loads and stores | |
1854 | |
1855 @item W | |
1856 A register indirect memory operand | |
1857 @end table | |
1858 | |
1859 @item picoChip family---@file{picochip.h} | |
1860 @table @code | |
1861 @item k | |
1862 Stack register. | |
1863 | |
1864 @item f | |
1865 Pointer register. A register which can be used to access memory without | |
1866 supplying an offset. Any other register can be used to access memory, | |
1867 but will need a constant offset. In the case of the offset being zero, | |
1868 it is more efficient to use a pointer register, since this reduces code | |
1869 size. | |
1870 | |
1871 @item t | |
1872 A twin register. A register which may be paired with an adjacent | |
1873 register to create a 32-bit register. | |
1874 | |
1875 @item a | |
1876 Any absolute memory address (e.g., symbolic constant, symbolic | |
1877 constant + offset). | |
1878 | |
1879 @item I | |
1880 4-bit signed integer. | |
1881 | |
1882 @item J | |
1883 4-bit unsigned integer. | |
1884 | |
1885 @item K | |
1886 8-bit signed integer. | |
1887 | |
1888 @item M | |
1889 Any constant whose absolute value fits in 4 bits. | |
1890 | |
1891 @item N | |
1892 10-bit signed integer. | |
1893 | |
1894 @item O | |
1895 16-bit signed integer. | |
1896 | |
1897 @end table | |
1898 | |
1899 @item PowerPC and IBM RS6000---@file{config/rs6000/rs6000.h} | |
1900 @table @code | |
1901 @item b | |
1902 Address base register | |
1903 | |
1904 @item f | |
1905 Floating point register | |
1906 | |
1907 @item v | |
1908 Vector register | |
1909 | |
1910 @item h | |
1911 @samp{MQ}, @samp{CTR}, or @samp{LINK} register | |
1912 | |
1913 @item q | |
1914 @samp{MQ} register | |
1915 | |
1916 @item c | |
1917 @samp{CTR} register | |
1918 | |
1919 @item l | |
1920 @samp{LINK} register | |
1921 | |
1922 @item x | |
1923 @samp{CR} register (condition register) number 0 | |
1924 | |
1925 @item y | |
1926 @samp{CR} register (condition register) | |
1927 | |
1928 @item z | |
1929 @samp{FPMEM} stack memory for FPR-GPR transfers | |
1930 | |
1931 @item I | |
1932 Signed 16-bit constant | |
1933 | |
1934 @item J | |
1935 Unsigned 16-bit constant shifted left 16 bits (use @samp{L} instead for | |
1936 @code{SImode} constants) | |
1937 | |
1938 @item K | |
1939 Unsigned 16-bit constant | |
1940 | |
1941 @item L | |
1942 Signed 16-bit constant shifted left 16 bits | |
1943 | |
1944 @item M | |
1945 Constant larger than 31 | |
1946 | |
1947 @item N | |
1948 Exact power of 2 | |
1949 | |
1950 @item O | |
1951 Zero | |
1952 | |
1953 @item P | |
1954 Constant whose negation is a signed 16-bit constant | |
1955 | |
1956 @item G | |
1957 Floating point constant that can be loaded into a register with one | |
1958 instruction per word | |
1959 | |
1960 @item H | |
1961 Integer/Floating point constant that can be loaded into a register using | |
1962 three instructions | |
1963 | |
1964 @item Q | |
1965 Memory operand that is an offset from a register (@samp{m} is preferable | |
1966 for @code{asm} statements) | |
1967 | |
1968 @item Z | |
1969 Memory operand that is indexed or indirect from a register (@samp{m} is | |
1970 preferable for @code{asm} statements) | |
1971 | |
1972 @item R | |
1973 AIX TOC entry | |
1974 | |
1975 @item a | |
1976 Address operand that is indexed or indirect from a register (@samp{p} is | |
1977 preferable for @code{asm} statements) | |
1978 | |
1979 @item S | |
1980 Constant suitable as a 64-bit mask operand | |
1981 | |
1982 @item T | |
1983 Constant suitable as a 32-bit mask operand | |
1984 | |
1985 @item U | |
1986 System V Release 4 small data area reference | |
1987 | |
1988 @item t | |
1989 AND masks that can be performed by two rldic@{l, r@} instructions | |
1990 | |
1991 @item W | |
1992 Vector constant that does not require memory | |
1993 | |
1994 @end table | |
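
Several of the PowerPC integer constraints above are plain range tests. The following C sketch is derived only from the one-line descriptions of @samp{I}, @samp{K}, @samp{L} and @samp{P} in this table; the function names are hypothetical, and the source file named in the table heading remains the definitive reference for the exact ranges.

```c
#include <stdint.h>

/* 'I': signed 16-bit constant.  */
int ppc_const_I(int64_t v)
{
    return v >= -32768 && v <= 32767;
}

/* 'K': unsigned 16-bit constant.  */
int ppc_const_K(int64_t v)
{
    return v >= 0 && v <= 0xFFFF;
}

/* 'L': signed 16-bit constant shifted left 16 bits, so the low 16 bits
   are zero and the value lies between -32768 << 16 and 32767 << 16.  */
int ppc_const_L(int64_t v)
{
    return (v & 0xFFFF) == 0
           && v >= -2147483648LL && v <= 0x7FFF0000LL;
}

/* 'P': constant whose negation is a signed 16-bit constant.  */
int ppc_const_P(int64_t v)
{
    return v >= -32767 && v <= 32768;
}
```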
1995 | |
1996 @item Intel 386---@file{config/i386/constraints.md} | |
1997 @table @code | |
1998 @item R | |
1999 Legacy register---the eight integer registers available on all | |
2000 i386 processors (@code{a}, @code{b}, @code{c}, @code{d}, | |
2001 @code{si}, @code{di}, @code{bp}, @code{sp}). | |
2002 | |
2003 @item q | |
2004 Any register accessible as @code{@var{r}l}. In 32-bit mode, @code{a}, | |
2005 @code{b}, @code{c}, and @code{d}; in 64-bit mode, any integer register. | |
2006 | |
2007 @item Q | |
2008 Any register accessible as @code{@var{r}h}: @code{a}, @code{b}, | |
2009 @code{c}, and @code{d}. | |
2010 | |
2011 @ifset INTERNALS | |
2012 @item l | |
2013 Any register that can be used as the index in a base+index memory | |
2014 access: that is, any general register except the stack pointer. | |
2015 @end ifset | |
2016 | |
2017 @item a | |
2018 The @code{a} register. | |
2019 | |
2020 @item b | |
2021 The @code{b} register. | |
2022 | |
2023 @item c | |
2024 The @code{c} register. | |
2025 | |
2026 @item d | |
2027 The @code{d} register. | |
2028 | |
2029 @item S | |
2030 The @code{si} register. | |
2031 | |
2032 @item D | |
2033 The @code{di} register. | |
2034 | |
2035 @item A | |
2036 The @code{a} and @code{d} registers, as a pair (for instructions that | |
2037 return half the result in one and half in the other). | |
2038 | |
2039 @item f | |
2040 Any 80387 floating-point (stack) register. | |
2041 | |
2042 @item t | |
2043 Top of 80387 floating-point stack (@code{%st(0)}). | |
2044 | |
2045 @item u | |
2046 Second from top of 80387 floating-point stack (@code{%st(1)}). | |
2047 | |
2048 @item y | |
2049 Any MMX register. | |
2050 | |
2051 @item x | |
2052 Any SSE register. | |
2053 | |
2054 @item Yz | |
2055 First SSE register (@code{%xmm0}). | |
2056 | |
2057 @ifset INTERNALS | |
2058 @item Y2 | |
2059 Any SSE register, when SSE2 is enabled. | |
2060 | |
2061 @item Yi | |
2062 Any SSE register, when SSE2 and inter-unit moves are enabled. | |
2063 | |
2064 @item Ym | |
2065 Any MMX register, when inter-unit moves are enabled. | |
2066 @end ifset | |
2067 | |
2068 @item I | |
2069 Integer constant in the range 0 @dots{} 31, for 32-bit shifts. | |
2070 | |
2071 @item J | |
2072 Integer constant in the range 0 @dots{} 63, for 64-bit shifts. | |
2073 | |
2074 @item K | |
2075 Signed 8-bit integer constant. | |
2076 | |
2077 @item L | |
2078 @code{0xFF} or @code{0xFFFF}, for andsi as a zero-extending move. | |
2079 | |
2080 @item M | |
2081 0, 1, 2, or 3 (shifts for the @code{lea} instruction). | |
2082 | |
2083 @item N | |
2084 Unsigned 8-bit integer constant (for @code{in} and @code{out} | |
2085 instructions). | |
2086 | |
2087 @ifset INTERNALS | |
2088 @item O | |
2089 Integer constant in the range 0 @dots{} 127, for 128-bit shifts. | |
2090 @end ifset | |
2091 | |
2092 @item G | |
2093 Standard 80387 floating point constant. | |
2094 | |
2095 @item C | |
2096 Standard SSE floating point constant. | |
2097 | |
2098 @item e | |
2099 32-bit signed integer constant, or a symbolic reference known | |
2100 to fit that range (for immediate operands in sign-extending x86-64 | |
2101 instructions). | |
2102 | |
2103 @item Z | |
2104 32-bit unsigned integer constant, or a symbolic reference known | |
2105 to fit that range (for immediate operands in zero-extending x86-64 | |
2106 instructions). | |
2107 | |
2108 @end table | |
2109 | |
2110 @item Intel IA-64---@file{config/ia64/ia64.h} | |
2111 @table @code | |
2112 @item a | |
2113 General register @code{r0} to @code{r3} for @code{addl} instruction | |
2114 | |
2115 @item b | |
2116 Branch register | |
2117 | |
2118 @item c | |
2119 Predicate register (@samp{c} as in ``conditional'') | |
2120 | |
2121 @item d | |
2122 Application register residing in M-unit | |
2123 | |
2124 @item e | |
2125 Application register residing in I-unit | |
2126 | |
2127 @item f | |
2128 Floating-point register | |
2129 | |
2130 @item m | |
2131 Memory operand. | |
2132 Remember that @samp{m} allows postincrement and postdecrement which | |
2133 require printing with @samp{%Pn} on IA-64. | |
2134 Use @samp{S} to disallow postincrement and postdecrement. | |
2135 | |
2136 @item G | |
2137 Floating-point constant 0.0 or 1.0 | |
2138 | |
2139 @item I | |
2140 14-bit signed integer constant | |
2141 | |
2142 @item J | |
2143 22-bit signed integer constant | |
2144 | |
2145 @item K | |
2146 8-bit signed integer constant for logical instructions | |
2147 | |
2148 @item L | |
2149 8-bit adjusted signed integer constant for compare pseudo-ops | |
2150 | |
2151 @item M | |
2152 6-bit unsigned integer constant for shift counts | |
2153 | |
2154 @item N | |
2155 9-bit signed integer constant for load and store postincrements | |
2156 | |
2157 @item O | |
2158 The constant zero | |
2159 | |
2160 @item P | |
2161 0 or @minus{}1 for @code{dep} instruction | |
2162 | |
2163 @item Q | |
2164 Non-volatile memory for floating-point loads and stores | |
2165 | |
2166 @item R | |
2167 Integer constant in the range 1 to 4 for @code{shladd} instruction | |
2168 | |
2169 @item S | |
2170 Memory operand except postincrement and postdecrement | |
2171 @end table | |
2172 | |
2173 @item FRV---@file{config/frv/frv.h} | |
2174 @table @code | |
2175 @item a | |
2176 Register in the class @code{ACC_REGS} (@code{acc0} to @code{acc7}). | |
2177 | |
2178 @item b | |
2179 Register in the class @code{EVEN_ACC_REGS} (@code{acc0} to @code{acc7}). | |
2180 | |
2181 @item c | |
2182 Register in the class @code{CC_REGS} (@code{fcc0} to @code{fcc3} and | |
2183 @code{icc0} to @code{icc3}). | |
2184 | |
2185 @item d | |
2186 Register in the class @code{GPR_REGS} (@code{gr0} to @code{gr63}). | |
2187 | |
2188 @item e | |
2189 Register in the class @code{EVEN_REGS} (@code{gr0} to @code{gr63}). | |
2190 Odd registers are excluded not in the class but through the use of a machine | |
2191 mode larger than 4 bytes. | |
2192 | |
2193 @item f | |
2194 Register in the class @code{FPR_REGS} (@code{fr0} to @code{fr63}). | |
2195 | |
2196 @item h | |
2197 Register in the class @code{FEVEN_REGS} (@code{fr0} to @code{fr63}). | |
2198 Odd registers are excluded not in the class but through the use of a machine | |
2199 mode larger than 4 bytes. | |
2200 | |
2201 @item l | |
2202 Register in the class @code{LR_REG} (the @code{lr} register). | |
2203 | |
2204 @item q | |
2205 Register in the class @code{QUAD_REGS} (@code{gr2} to @code{gr63}). | |
2206 Register numbers not divisible by 4 are excluded not in the class but through | |
2207 the use of a machine mode larger than 8 bytes. | |
2208 | |
2209 @item t | |
2210 Register in the class @code{ICC_REGS} (@code{icc0} to @code{icc3}). | |
2211 | |
2212 @item u | |
2213 Register in the class @code{FCC_REGS} (@code{fcc0} to @code{fcc3}). | |
2214 | |
2215 @item v | |
2216 Register in the class @code{ICR_REGS} (@code{cc4} to @code{cc7}). | |
2217 | |
2218 @item w | |
2219 Register in the class @code{FCR_REGS} (@code{cc0} to @code{cc3}). | |
2220 | |
2221 @item x | |
2222 Register in the class @code{QUAD_FPR_REGS} (@code{fr0} to @code{fr63}). | |
2223 Register numbers not divisible by 4 are excluded not in the class but through | |
2224 the use of a machine mode larger than 8 bytes. | |
2225 | |
2226 @item z | |
2227 Register in the class @code{SPR_REGS} (@code{lcr} and @code{lr}). | |
2228 | |
2229 @item A | |
2230 Register in the class @code{QUAD_ACC_REGS} (@code{acc0} to @code{acc7}). | |
2231 | |
2232 @item B | |
2233 Register in the class @code{ACCG_REGS} (@code{accg0} to @code{accg7}). | |
2234 | |
2235 @item C | |
2236 Register in the class @code{CR_REGS} (@code{cc0} to @code{cc7}). | |
2237 | |
2238 @item G | |
2239 Floating point constant zero | |
2240 | |
2241 @item I | |
2242 6-bit signed integer constant | |
2243 | |
2244 @item J | |
2245 10-bit signed integer constant | |
2246 | |
2247 @item L | |
2248 16-bit signed integer constant | |
2249 | |
2250 @item M | |
2251 16-bit unsigned integer constant | |
2252 | |
2253 @item N | |
2254 12-bit signed integer constant that is negative---i.e.@: in the | |
2255 range of @minus{}2048 to @minus{}1 | |
2256 | |
2257 @item O | |
2258 Constant zero | |
2259 | |
2260 @item P | |
2261 12-bit signed integer constant that is greater than zero---i.e.@: in the | |
2262 range of 1 to 2047. | |
2263 | |
2264 @end table | |
2265 | |
2266 @item Blackfin family---@file{config/bfin/constraints.md} | |
2267 @table @code | |
2268 @item a | |
2269 P register | |
2270 | |
2271 @item d | |
2272 D register | |
2273 | |
2274 @item z | |
2275 A call clobbered P register. | |
2276 | |
2277 @item q@var{n} | |
2278 A single register. If @var{n} is in the range 0 to 7, the corresponding D | |
2279 register. If it is @code{A}, then the register P0. | |
2280 | |
2281 @item D | |
2282 Even-numbered D register | |
2283 | |
2284 @item W | |
2285 Odd-numbered D register | |
2286 | |
2287 @item e | |
2288 Accumulator register. | |
2289 | |
2290 @item A | |
2291 Even-numbered accumulator register. | |
2292 | |
2293 @item B | |
2294 Odd-numbered accumulator register. | |
2295 | |
2296 @item b | |
2297 I register | |
2298 | |
2299 @item v | |
2300 B register | |
2301 | |
2302 @item f | |
2303 M register | |
2304 | |
2305 @item c | |
2306 Registers used for circular buffering, i.e.@: I, B, or L registers. | |
2307 | |
2308 @item C | |
2309 The CC register. | |
2310 | |
2311 @item t | |
2312 LT0 or LT1. | |
2313 | |
2314 @item k | |
2315 LC0 or LC1. | |
2316 | |
2317 @item u | |
2318 LB0 or LB1. | |
2319 | |
2320 @item x | |
2321 Any D, P, B, M, I or L register. | |
2322 | |
2323 @item y | |
2324 Additional registers typically used only in prologues and epilogues: RETS, | |
2325 RETN, RETI, RETX, RETE, ASTAT, SEQSTAT and USP. | |
2326 | |
2327 @item w | |
2328 Any register except accumulators or CC. | |
2329 | |
2330 @item Ksh | |
2331 Signed 16 bit integer (in the range -32768 to 32767) | |
2332 | |
2333 @item Kuh | |
2334 Unsigned 16 bit integer (in the range 0 to 65535) | |
2335 | |
2336 @item Ks7 | |
2337 Signed 7 bit integer (in the range -64 to 63) | |
2338 | |
2339 @item Ku7 | |
2340 Unsigned 7 bit integer (in the range 0 to 127) | |
2341 | |
2342 @item Ku5 | |
2343 Unsigned 5 bit integer (in the range 0 to 31) | |
2344 | |
2345 @item Ks4 | |
2346 Signed 4 bit integer (in the range -8 to 7) | |
2347 | |
2348 @item Ks3 | |
2349 Signed 3 bit integer (in the range -4 to 3) | |
2350 | |
2351 @item Ku3 | |
2352 Unsigned 3 bit integer (in the range 0 to 7) | |
2353 | |
2354 @item P@var{n} | |
2355 Constant @var{n}, where @var{n} is a single-digit constant in the range 0 to 4. | |
2356 | |
2357 @item PA | |
2358 An integer equal to one of the MACFLAG_XXX constants that is suitable for | |
2359 use with either accumulator. | |
2360 | |
2361 @item PB | |
2362 An integer equal to one of the MACFLAG_XXX constants that is suitable for | |
2363 use only with accumulator A1. | |
2364 | |
2365 @item M1 | |
2366 Constant 255. | |
2367 | |
2368 @item M2 | |
2369 Constant 65535. | |
2370 | |
2371 @item J | |
2372 An integer constant with exactly a single bit set. | |
2373 | |
2374 @item L | |
2375 An integer constant with all bits set except exactly one. | |
2376 | |
2377 @item H | |
2378 | |
2379 @item Q | |
2380 Any SYMBOL_REF. | |
2381 @end table | |
2382 | |
2383 @item M32C---@file{config/m32c/m32c.c} | |
2384 @table @code | |
2385 @item Rsp | |
2386 @itemx Rfb | |
2387 @itemx Rsb | |
2388 @samp{$sp}, @samp{$fb}, @samp{$sb}. | |
2389 | |
2390 @item Rcr | |
2391 Any control register, when they're 16 bits wide (nothing if control | |
2392 registers are 24 bits wide) | |
2393 | |
2394 @item Rcl | |
2395 Any control register, when they're 24 bits wide. | |
2396 | |
2397 @item R0w | |
2398 @itemx R1w | |
2399 @itemx R2w | |
2400 @itemx R3w | |
2401 $r0, $r1, $r2, $r3. | |
2402 | |
2403 @item R02 | |
2404 $r0 or $r2, or $r2r0 for 32 bit values. | |
2405 | |
2406 @item R13 | |
2407 $r1 or $r3, or $r3r1 for 32 bit values. | |
2408 | |
2409 @item Rdi | |
2410 A register that can hold a 64 bit value. | |
2411 | |
2412 @item Rhl | |
2413 $r0 or $r1 (registers with addressable high/low bytes) | |
2414 | |
2415 @item R23 | |
2416 $r2 or $r3 | |
2417 | |
2418 @item Raa | |
2419 Address registers | |
2420 | |
2421 @item Raw | |
2422 Address registers when they're 16 bits wide. | |
2423 | |
2424 @item Ral | |
2425 Address registers when they're 24 bits wide. | |
2426 | |
2427 @item Rqi | |
2428 Registers that can hold QI values. | |
2429 | |
2430 @item Rad | |
2431 Registers that can be used with displacements ($a0, $a1, $sb). | |
2432 | |
2433 @item Rsi | |
2434 Registers that can hold 32 bit values. | |
2435 | |
2436 @item Rhi | |
2437 Registers that can hold 16 bit values. | |
2438 | |
2439 @item Rhc | |
2440 Registers that can hold 16 bit values, including all control | |
2441 registers. | |
2442 | |
2443 @item Rra | |
2444 $r0 through R1, plus $a0 and $a1. | |
2445 | |
2446 @item Rfl | |
2447 The flags register. | |
2448 | |
2449 @item Rmm | |
2450 The memory-based pseudo-registers $mem0 through $mem15. | |
2451 | |
2452 @item Rpi | |
2453 Registers that can hold pointers (16 bit registers for r8c, m16c; 24 | |
2454 bit registers for m32cm, m32c). | |
2455 | |
2456 @item Rpa | |
2457 Matches multiple registers in a PARALLEL to form a larger register. | |
2458 Used to match function return values. | |
2459 | |
2460 @item Is3 | |
2461 -8 @dots{} 7 | |
2462 | |
2463 @item IS1 | |
2464 -128 @dots{} 127 | |
2465 | |
2466 @item IS2 | |
2467 -32768 @dots{} 32767 | |
2468 | |
2469 @item IU2 | |
2470 0 @dots{} 65535 | |
2471 | |
2472 @item In4 | |
2473 -8 @dots{} -1 or 1 @dots{} 8 | |
2474 | |
2475 @item In5 | |
2476 -16 @dots{} -1 or 1 @dots{} 16 | |
2477 | |
2478 @item In6 | |
2479 -32 @dots{} -1 or 1 @dots{} 32 | |
2480 | |
2481 @item IM2 | |
2482 -65536 @dots{} -1 | |
2483 | |
2484 @item Ilb | |
2485 An 8 bit value with exactly one bit set. | |
2486 | |
2487 @item Ilw | |
2488 A 16 bit value with exactly one bit set. | |
2489 | |
2490 @item Sd | |
2491 The common src/dest memory addressing modes. | |
2492 | |
2493 @item Sa | |
2494 Memory addressed using $a0 or $a1. | |
2495 | |
2496 @item Si | |
2497 Memory addressed with immediate addresses. | |
2498 | |
2499 @item Ss | |
2500 Memory addressed using the stack pointer ($sp). | |
2501 | |
2502 @item Sf | |
2503 Memory addressed using the frame base register ($fb). | |
2504 | |
2505 @item Sb | |
2506 Memory addressed using the small base register ($sb). | |
2507 | |
2508 @item S1 | |
2509 $r1h | |
2510 @end table | |
2511 | |
2512 @item MIPS---@file{config/mips/constraints.md} | |
2513 @table @code | |
2514 @item d | |
2515 An address register. This is equivalent to @code{r} unless | |
2516 generating MIPS16 code. | |
2517 | |
2518 @item f | |
2519 A floating-point register (if available). | |
2520 | |
2521 @item h | |
2522 Formerly the @code{hi} register. This constraint is no longer supported. | |
2523 | |
2524 @item l | |
2525 The @code{lo} register. Use this register to store values that are | |
2526 no bigger than a word. | |
2527 | |
2528 @item x | |
2529 The concatenated @code{hi} and @code{lo} registers. Use this register | |
2530 to store doubleword values. | |
2531 | |
2532 @item c | |
2533 A register suitable for use in an indirect jump. This will always be | |
2534 @code{$25} for @option{-mabicalls}. | |
2535 | |
2536 @item v | |
2537 Register @code{$3}. Do not use this constraint in new code; | |
2538 it is retained only for compatibility with glibc. | |
2539 | |
2540 @item y | |
2541 Equivalent to @code{r}; retained for backwards compatibility. | |
2542 | |
2543 @item z | |
2544 A floating-point condition code register. | |
2545 | |
2546 @item I | |
2547 A signed 16-bit constant (for arithmetic instructions). | |
2548 | |
2549 @item J | |
2550 Integer zero. | |
2551 | |
2552 @item K | |
2553 An unsigned 16-bit constant (for logic instructions). | |
2554 | |
2555 @item L | |
2556 A signed 32-bit constant in which the lower 16 bits are zero. | |
2557 Such constants can be loaded using @code{lui}. | |
2558 | |
2559 @item M | |
2560 A constant that cannot be loaded using @code{lui}, @code{addiu} | |
2561 or @code{ori}. | |
2562 | |
2563 @item N | |
2564 A constant in the range -65535 to -1 (inclusive). | |
2565 | |
2566 @item O | |
2567 A signed 15-bit constant. | |
2568 | |
2569 @item P | |
2570 A constant in the range 1 to 65535 (inclusive). | |
2571 | |
2572 @item G | |
2573 Floating-point zero. | |
2574 | |
2575 @item R | |
2576 An address that can be used in a non-macro load or store. | |
2577 @end table | |
2578 | |
2579 @item Motorola 680x0---@file{config/m68k/constraints.md} | |
2580 @table @code | |
2581 @item a | |
2582 Address register | |
2583 | |
2584 @item d | |
2585 Data register | |
2586 | |
2587 @item f | |
2588 68881 floating-point register, if available | |
2589 | |
2590 @item I | |
2591 Integer in the range 1 to 8 | |
2592 | |
2593 @item J | |
2594 16-bit signed number | |
2595 | |
2596 @item K | |
2597 Signed number whose magnitude is greater than 0x80 | |
2598 | |
2599 @item L | |
2600 Integer in the range @minus{}8 to @minus{}1 | |
2601 | |
2602 @item M | |
2603 Signed number whose magnitude is greater than 0x100 | |
2604 | |
2605 @item N | |
2606 Range 24 to 31, rotatert:SI 8 to 1 expressed as rotate | |
2607 | |
2608 @item O | |
2609 16 (for rotate using swap) | |
2610 | |
2611 @item P | |
2612 Range 8 to 15, rotatert:HI 8 to 1 expressed as rotate | |
2613 | |
2614 @item R | |
2615 Numbers that mov3q can handle | |
2616 | |
2617 @item G | |
2618 Floating point constant that is not a 68881 constant | |
2619 | |
2620 @item S | |
2621 Operands that satisfy @samp{m} when @option{-mpcrel} is in effect | |
2622 | |
2623 @item T | |
2624 Operands that satisfy @samp{s} when @option{-mpcrel} is not in effect | |
2625 | |
2626 @item Q | |
2627 Address register indirect addressing mode | |
2628 | |
2629 @item U | |
2630 Register offset addressing | |
2631 | |
2632 @item W | |
2633 const_call_operand | |
2634 | |
2635 @item Cs | |
2636 symbol_ref or const | |
2637 | |
2638 @item Ci | |
2639 const_int | |
2640 | |
2641 @item C0 | |
2642 const_int 0 | |
2643 | |
2644 @item Cj | |
2645 Range of signed numbers that don't fit in 16 bits | |
2646 | |
2647 @item Cmvq | |
2648 Integers valid for mvq | |
2649 | |
2650 @item Capsw | |
2651 Integers valid for a moveq followed by a swap | |
2652 | |
2653 @item Cmvz | |
2654 Integers valid for mvz | |
2655 | |
2656 @item Cmvs | |
2657 Integers valid for mvs | |
2658 | |
2659 @item Ap | |
2660 push_operand | |
2661 | |
2662 @item Ac | |
2663 Non-register operands allowed in clr | |
2664 | |
2665 @end table | |
2666 | |
2667 @item Motorola 68HC11 & 68HC12 families---@file{config/m68hc11/m68hc11.h} | |
2668 @table @code | |
2669 @item a | |
2670 Register `a' | |
2671 | |
2672 @item b | |
2673 Register `b' | |
2674 | |
2675 @item d | |
2676 Register `d' | |
2677 | |
2678 @item q | |
2679 An 8-bit register | |
2680 | |
2681 @item t | |
2682 Temporary soft register _.tmp | |
2683 | |
2684 @item u | |
2685 A soft register _.d1 to _.d31 | |
2686 | |
2687 @item w | |
2688 Stack pointer register | |
2689 | |
2690 @item x | |
2691 Register `x' | |
2692 | |
2693 @item y | |
2694 Register `y' | |
2695 | |
2696 @item z | |
2697 Pseudo register `z' (replaced by `x' or `y' at the end) | |
2698 | |
2699 @item A | |
2700 An address register: x, y or z | |
2701 | |
2702 @item B | |
2703 An address register: x or y | |
2704 | |
2705 @item D | |
2706 Register pair (x:d) to form a 32-bit value | |
2707 | |
2708 @item L | |
2709 Constants in the range @minus{}65536 to 65535 | |
2710 | |
2711 @item M | |
2712 Constants whose 16-bit low part is zero | |
2713 | |
2714 @item N | |
2715 Constant integer 1 or @minus{}1 | |
2716 | |
2717 @item O | |
2718 Constant integer 16 | |
2719 | |
2720 @item P | |
2721 Constants in the range @minus{}8 to 2 | |
2722 | |
2723 @end table | |
2724 | |
2725 @need 1000 | |
2726 @item SPARC---@file{config/sparc/sparc.h} | |
2727 @table @code | |
2728 @item f | |
2729 Floating-point register on the SPARC-V8 architecture and | |
2730 lower floating-point register on the SPARC-V9 architecture. | |
2731 | |
2732 @item e | |
2733 Floating-point register. It is equivalent to @samp{f} on the | |
2734 SPARC-V8 architecture and contains both lower and upper | |
2735 floating-point registers on the SPARC-V9 architecture. | |
2736 | |
2737 @item c | |
2738 Floating-point condition code register. | |
2739 | |
2740 @item d | |
2741 Lower floating-point register. It is only valid on the SPARC-V9 | |
2742 architecture when the Visual Instruction Set is available. | |
2743 | |
2744 @item b | |
2745 Floating-point register. It is only valid on the SPARC-V9 architecture | |
2746 when the Visual Instruction Set is available. | |
2747 | |
2748 @item h | |
2749 64-bit global or out register for the SPARC-V8+ architecture. | |
2750 | |
2751 @item D | |
2752 A vector constant | |
2753 | |
2754 @item I | |
2755 Signed 13-bit constant | |
2756 | |
2757 @item J | |
2758 Zero | |
2759 | |
2760 @item K | |
2761 32-bit constant with the low 10 bits clear (a constant that can be | |
2762 loaded with the @code{sethi} instruction) | |
2763 | |
2764 @item L | |
2765 A constant in the range supported by @code{movcc} instructions | |
2766 | |
2767 @item M | |
2768 A constant in the range supported by @code{movrcc} instructions | |
2769 | |
2770 @item N | |
2771 Same as @samp{K}, except that it verifies that bits that are not in the | |
2772 lower 32-bit range are all zero. Must be used instead of @samp{K} for | |
2773 modes wider than @code{SImode} | |
2774 | |
2775 @item O | |
2776 The constant 4096 | |
2777 | |
2778 @item G | |
2779 Floating-point zero | |
2780 | |
2781 @item H | |
2782 Signed 13-bit constant, sign-extended to 32 or 64 bits | |
2783 | |
2784 @item Q | |
2785 Floating-point constant whose integral representation can | |
2786 be moved into an integer register using a single sethi | |
2787 instruction | |
2788 | |
2789 @item R | |
2790 Floating-point constant whose integral representation can | |
2791 be moved into an integer register using a single mov | |
2792 instruction | |
2793 | |
2794 @item S | |
2795 Floating-point constant whose integral representation can | |
2796 be moved into an integer register using a high/lo_sum | |
2797 instruction sequence | |
2798 | |
2799 @item T | |
2800 Memory address aligned to an 8-byte boundary | |
2801 | |
2802 @item U | |
2803 Even register | |
2804 | |
2805 @item W | |
2806 Memory address for @samp{e} constraint registers | |
2807 | |
2808 @item Y | |
2809 Vector zero | |
2810 | |
2811 @end table | |
2812 | |
2813 @item SPU---@file{config/spu/spu.h} | |
2814 @table @code | |
2815 @item a | |
2816 An immediate which can be loaded with the il/ila/ilh/ilhu instructions. const_int is treated as a 64 bit value. | |
2817 | |
2818 @item c | |
2819 An immediate for and/xor/or instructions. const_int is treated as a 64 bit value. | |
2820 | |
2821 @item d | |
2822 An immediate for the @code{iohl} instruction. const_int is treated as a 64 bit value. | |
2823 | |
2824 @item f | |
2825 An immediate which can be loaded with @code{fsmbi}. | |
2826 | |
2827 @item A | |
2828 An immediate which can be loaded with the il/ila/ilh/ilhu instructions. const_int is treated as a 32 bit value. | |
2829 | |
2830 @item B | |
2831 An immediate for most arithmetic instructions. const_int is treated as a 32 bit value. | |
2832 | |
2833 @item C | |
2834 An immediate for and/xor/or instructions. const_int is treated as a 32 bit value. | |
2835 | |
2836 @item D | |
2837 An immediate for the @code{iohl} instruction. const_int is treated as a 32 bit value. | |
2838 | |
2839 @item I | |
2840 A constant in the range [-64, 63] for shift/rotate instructions. | |
2841 | |
2842 @item J | |
2843 An unsigned 7-bit constant for conversion/nop/channel instructions. | |
2844 | |
2845 @item K | |
2846 A signed 10-bit constant for most arithmetic instructions. | |
2847 | |
2848 @item M | |
2849 A signed 16 bit immediate for @code{stop}. | |
2850 | |
2851 @item N | |
2852 An unsigned 16-bit constant for @code{iohl} and @code{fsmbi}. | |
2853 | |
2854 @item O | |
2855 An unsigned 7-bit constant whose 3 least significant bits are 0. | |
2856 | |
2857 @item P | |
2858 An unsigned 3-bit constant for 16-byte rotates and shifts | |
2859 | |
2860 @item R | |
2861 Call operand, reg, for indirect calls | |
2862 | |
2863 @item S | |
2864 Call operand, symbol, for relative calls. | |
2865 | |
2866 @item T | |
2867 Call operand, const_int, for absolute calls. | |
2868 | |
2869 @item U | |
2870 An immediate which can be loaded with the il/ila/ilh/ilhu instructions. const_int is sign extended to 128 bit. | |
2871 | |
2872 @item W | |
2873 An immediate for shift and rotate instructions. const_int is treated as a 32 bit value. | |
2874 | |
2875 @item Y | |
2876 An immediate for and/xor/or instructions. const_int is sign extended to 128 bit. | |
2877 | |
2878 @item Z | |
2879 An immediate for the @code{iohl} instruction. const_int is sign extended to 128 bit. | |
2880 | |
2881 @end table | |
2882 | |
2883 @item S/390 and zSeries---@file{config/s390/s390.h} | |
2884 @table @code | |
2885 @item a | |
2886 Address register (general purpose register except r0) | |
2887 | |
2888 @item c | |
2889 Condition code register | |
2890 | |
2891 @item d | |
2892 Data register (arbitrary general purpose register) | |
2893 | |
2894 @item f | |
2895 Floating-point register | |
2896 | |
2897 @item I | |
2898 Unsigned 8-bit constant (0--255) | |
2899 | |
2900 @item J | |
2901 Unsigned 12-bit constant (0--4095) | |
2902 | |
2903 @item K | |
2904 Signed 16-bit constant (@minus{}32768--32767) | |
2905 | |
2906 @item L | |
2907 Value appropriate as displacement. | |
2908 @table @code | |
2909 @item (0..4095) | |
2910 for short displacement | |
2911 @item (-524288..524287) | |
2912 for long displacement | |
2913 @end table | |
2914 | |
2915 @item M | |
2916 Constant integer with a value of 0x7fffffff. | |
2917 | |
2918 @item N | |
2919 Multiple letter constraint followed by 4 parameter letters. | |
2920 @table @code | |
2921 @item 0..9: | |
2922 number of the part counting from most to least significant | |
2923 @item H,Q: | |
2924 mode of the part | |
2925 @item D,S,H: | |
2926 mode of the containing operand | |
2927 @item 0,F: | |
2928 value of the other parts (F---all bits set) | |
2929 @end table | |
2930 The constraint matches if the specified part of a constant | |
2931 has a value different from its other parts. | |
2932 | |
2933 @item Q | |
2934 Memory reference without index register and with short displacement. | |
2935 | |
2936 @item R | |
2937 Memory reference with index register and short displacement. | |
2938 | |
2939 @item S | |
2940 Memory reference without index register but with long displacement. | |
2941 | |
2942 @item T | |
2943 Memory reference with index register and long displacement. | |
2944 | |
2945 @item U | |
2946 Pointer with short displacement. | |
2947 | |
2948 @item W | |
2949 Pointer with long displacement. | |
2950 | |
2951 @item Y | |
2952 Shift count operand. | |
2953 | |
2954 @end table | |
2955 | |
2956 @item Score family---@file{config/score/score.h} | |
2957 @table @code | |
2958 @item d | |
2959 Registers from r0 to r32. | |
2960 | |
2961 @item e | |
2962 Registers from r0 to r16. | |
2963 | |
2964 @item t | |
2965 r8---r11 or r22---r27 registers. | |
2966 | |
2967 @item h | |
2968 hi register. | |
2969 | |
2970 @item l | |
2971 lo register. | |
2972 | |
2973 @item x | |
2974 hi + lo register. | |
2975 | |
2976 @item q | |
2977 cnt register. | |
2978 | |
2979 @item y | |
2980 lcb register. | |
2981 | |
2982 @item z | |
2983 scb register. | |
2984 | |
2985 @item a | |
2986 cnt + lcb + scb register. | |
2987 | |
2988 @item c | |
2989 cr0---cr15 register. | |
2990 | |
2991 @item b | |
2992 cp1 registers. | |
2993 | |
2994 @item f | |
2995 cp2 registers. | |
2996 | |
2997 @item i | |
2998 cp3 registers. | |
2999 | |
3000 @item j | |
3001 cp1 + cp2 + cp3 registers. | |
3002 | |
3003 @item I | |
3004 High 16-bit constant (32-bit constant with 16 LSBs zero). | |
3005 | |
3006 @item J | |
3007 Unsigned 5 bit integer (in the range 0 to 31). | |
3008 | |
3009 @item K | |
3010 Unsigned 16 bit integer (in the range 0 to 65535). | |
3011 | |
3012 @item L | |
3013 Signed 16 bit integer (in the range @minus{}32768 to 32767). | |
3014 | |
3015 @item M | |
3016 Unsigned 14 bit integer (in the range 0 to 16383). | |
3017 | |
3018 @item N | |
3019 Signed 14 bit integer (in the range @minus{}8192 to 8191). | |
3020 | |
3021 @item Z | |
3022 Any SYMBOL_REF. | |
3023 @end table | |
3024 | |
3025 @item Xstormy16---@file{config/stormy16/stormy16.h} | |
3026 @table @code | |
3027 @item a | |
3028 Register r0. | |
3029 | |
3030 @item b | |
3031 Register r1. | |
3032 | |
3033 @item c | |
3034 Register r2. | |
3035 | |
3036 @item d | |
3037 Register r8. | |
3038 | |
3039 @item e | |
3040 Registers r0 through r7. | |
3041 | |
3042 @item t | |
3043 Registers r0 and r1. | |
3044 | |
3045 @item y | |
3046 The carry register. | |
3047 | |
3048 @item z | |
3049 Registers r8 and r9. | |
3050 | |
3051 @item I | |
3052 A constant between 0 and 3 inclusive. | |
3053 | |
3054 @item J | |
3055 A constant that has exactly one bit set. | |
3056 | |
3057 @item K | |
3058 A constant that has exactly one bit clear. | |
3059 | |
3060 @item L | |
3061 A constant between 0 and 255 inclusive. | |
3062 | |
3063 @item M | |
3064 A constant between @minus{}255 and 0 inclusive. | |
3065 | |
3066 @item N | |
3067 A constant between @minus{}3 and 0 inclusive. | |
3068 | |
3069 @item O | |
3070 A constant between 1 and 4 inclusive. | |
3071 | |
3072 @item P | |
3073 A constant between @minus{}4 and @minus{}1 inclusive. | |
3074 | |
3075 @item Q | |
3076 A memory reference that is a stack push. | |
3077 | |
3078 @item R | |
3079 A memory reference that is a stack pop. | |
3080 | |
3081 @item S | |
3082 A memory reference that refers to a constant address of known value. | |
3083 | |
3084 @item T | |
3085 The register indicated by Rx (not implemented yet). | |
3086 | |
3087 @item U | |
3088 A constant that is not between 2 and 15 inclusive. | |
3089 | |
3090 @item Z | |
3091 The constant 0. | |
3092 | |
3093 @end table | |
3094 | |
3095 @item Xtensa---@file{config/xtensa/constraints.md} | |
3096 @table @code | |
3097 @item a | |
3098 General-purpose 32-bit register | |
3099 | |
3100 @item b | |
3101 One-bit boolean register | |
3102 | |
3103 @item A | |
3104 MAC16 40-bit accumulator register | |
3105 | |
3106 @item I | |
3107 Signed 12-bit integer constant, for use in MOVI instructions | |
3108 | |
3109 @item J | |
3110 Signed 8-bit integer constant, for use in ADDI instructions | |
3111 | |
3112 @item K | |
3113 Integer constant valid for BccI instructions | |
3114 | |
3115 @item L | |
3116 Unsigned constant valid for BccUI instructions | |
3117 | |
3118 @end table | |
3119 | |
3120 @end table | |
3121 | |
3122 @ifset INTERNALS | |
3123 @node Disable Insn Alternatives | |
3124 @subsection Disable insn alternatives using the @code{enabled} attribute | |
3125 @cindex enabled | |
3126 | |
3127 The @code{enabled} insn attribute may be used to disable certain insn | |
3128 alternatives for machine-specific reasons. This is useful when adding | |
3129 new instructions to an existing pattern that are only available for | |
3130 certain CPU architecture levels, as specified with the @option{-march=} | |
3131 option. | |
3132 | |
3133 If an insn alternative is disabled, then it will never be used. The | |
3134 compiler treats the constraints for the disabled alternative as | |
3135 unsatisfiable. | |
3136 | |
3137 In order to make use of the @code{enabled} attribute a back end has to add | |
3138 in the machine description files: | |
3139 | |
3140 @enumerate | |
3141 @item | |
3142 A definition of the @code{enabled} insn attribute. The attribute is | |
3143 defined as usual using the @code{define_attr} command. This | |
3144 definition should be based on other insn attributes and/or target flags. | |
3145 The @code{enabled} attribute is a numeric attribute and should evaluate to | |
3146 @code{(const_int 1)} for an enabled alternative and to | |
3147 @code{(const_int 0)} otherwise. | |
3148 @item | |
3149 A definition of another insn attribute that describes why an insn | |
3150 alternative might or might not be available, e.g.@: @code{cpu_facility} | |
3151 in the example below. | |
3152 @item | |
3153 An assignment for the second attribute to each insn definition | |
3154 combining instructions which are not all available under the same | |
3155 circumstances. (Note: It obviously only makes sense for definitions | |
3156 with more than one alternative. Otherwise the insn pattern should be | |
3157 disabled or enabled using the insn condition.) | |
3158 @end enumerate | |
3159 | |
3160 E.g. the following two patterns could easily be merged using the @code{enabled} | |
3161 attribute: | |
3162 | |
3163 @smallexample | |
3164 | |
3165 (define_insn "*movdi_old" | |
3166 [(set (match_operand:DI 0 "register_operand" "=d") | |
3167 (match_operand:DI 1 "register_operand" " d"))] | |
3168 "!TARGET_NEW" | |
3169 "lgr %0,%1") | |
3170 | |
3171 (define_insn "*movdi_new" | |
3172 [(set (match_operand:DI 0 "register_operand" "=d,f,d") | |
3173 (match_operand:DI 1 "register_operand" " d,d,f"))] | |
3174 "TARGET_NEW" | |
3175 "@@ | |
3176 lgr %0,%1 | |
3177 ldgr %0,%1 | |
3178 lgdr %0,%1") | |
3179 | |
3180 @end smallexample | |
3181 | |
3182 to: | |
3183 | |
3184 @smallexample | |
3185 | |
3186 (define_insn "*movdi_combined" | |
3187 [(set (match_operand:DI 0 "register_operand" "=d,f,d") | |
3188 (match_operand:DI 1 "register_operand" " d,d,f"))] | |
3189 "" | |
3190 "@@ | |
3191 lgr %0,%1 | |
3192 ldgr %0,%1 | |
3193 lgdr %0,%1" | |
3194 [(set_attr "cpu_facility" "*,new,new")]) | |
3195 | |
3196 @end smallexample | |
3197 | |
3198 with the @code{enabled} attribute defined like this: | |
3199 | |
3200 @smallexample | |
3201 | |
3202 (define_attr "cpu_facility" "standard,new" (const_string "standard")) | |
3203 | |
3204 (define_attr "enabled" "" | |
3205 (cond [(eq_attr "cpu_facility" "standard") (const_int 1) | |
3206 (and (eq_attr "cpu_facility" "new") | |
3207 (ne (symbol_ref "TARGET_NEW") (const_int 0))) | |
3208 (const_int 1)] | |
3209 (const_int 0))) | |
3210 | |
3211 @end smallexample | |
3212 | |
3213 @end ifset | |
3214 | |
3215 @ifset INTERNALS | |
3216 @node Define Constraints | |
3217 @subsection Defining Machine-Specific Constraints | |
3218 @cindex defining constraints | |
3219 @cindex constraints, defining | |
3220 | |
3221 Machine-specific constraints fall into two categories: register and | |
3222 non-register constraints. Within the latter category, constraints | |
3223 which allow subsets of all possible memory or address operands should | |
3224 be specially marked, to give @code{reload} more information. | |
3225 | |
3226 Machine-specific constraints can be given names of arbitrary length, | |
3227 but they must be entirely composed of letters, digits, underscores | |
3228 (@samp{_}), and angle brackets (@samp{< >}). Like C identifiers, they | |
3229 must begin with a letter or underscore. | |
3230 | |
3231 In order to avoid ambiguity in operand constraint strings, no | |
3232 constraint can have a name that begins with any other constraint's | |
3233 name. For example, if @code{x} is defined as a constraint name, | |
3234 @code{xy} may not be, and vice versa. As a consequence of this rule, | |
3235 no constraint may begin with one of the generic constraint letters: | |
3236 @samp{E F V X g i m n o p r s}. | |
3237 | |
3238 Register constraints correspond directly to register classes. | |
3239 @xref{Register Classes}. There is thus not much flexibility in their | |
3240 definitions. | |
3241 | |
3242 @deffn {MD Expression} define_register_constraint name regclass docstring | |
3243 All three arguments are string constants. | |
3244 @var{name} is the name of the constraint, as it will appear in | |
3245 @code{match_operand} expressions. If @var{name} is a multi-letter | |
3246 constraint its length shall be the same for all constraints starting | |
3247 with the same letter. @var{regclass} can be either the | |
3248 name of the corresponding register class (@pxref{Register Classes}), | |
3249 or a C expression which evaluates to the appropriate register class. | |
3250 If it is an expression, it must have no side effects, and it cannot | |
3251 look at the operand. The usual use of expressions is to map some | |
3252 register constraints to @code{NO_REGS} when the register class | |
3253 is not available on a given subarchitecture. | |
3254 | |
3255 @var{docstring} is a sentence documenting the meaning of the | |
3256 constraint. Docstrings are explained further below. | |
3257 @end deffn | |
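
As a rough sketch of the expression form of @var{regclass} (the
constraint letter, register class, and target flag below are
illustrative assumptions, not taken from a particular port), a register
constraint that is unavailable on some subarchitectures might be
written as:

@smallexample
;; Hypothetical: TARGET_HARD_FLOAT and FP_REGS stand in for whatever
;; flag and register class the port actually defines.
(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
  "A floating-point register (if available).")
@end smallexample

When the flag is off, the constraint maps to @code{NO_REGS} and so
matches no register at all.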
3258 | |
3259 Non-register constraints are more like predicates: the constraint | |
3260 definition gives a Boolean expression which indicates whether the | |
3261 constraint matches. | |
3262 | |
3263 @deffn {MD Expression} define_constraint name docstring exp | |
3264 The @var{name} and @var{docstring} arguments are the same as for | |
3265 @code{define_register_constraint}, but note that the docstring comes | |
3266 immediately after the name for these expressions. @var{exp} is an RTL | |
3267 expression, obeying the same rules as the RTL expressions in predicate | |
3268 definitions. @xref{Defining Predicates}, for details. If it | |
3269 evaluates true, the constraint matches; if it evaluates false, it | |
3270 doesn't. Constraint expressions should indicate which RTL codes they | |
3271 might match, just like predicate expressions. | |
3272 | |
3273 @code{match_test} C expressions have access to the | |
3274 following variables: | |
3275 | |
3276 @table @var | |
3277 @item op | |
3278 The RTL object defining the operand. | |
3279 @item mode | |
3280 The machine mode of @var{op}. | |
3281 @item ival | |
3282 @samp{INTVAL (@var{op})}, if @var{op} is a @code{const_int}. | |
3283 @item hval | |
3284 @samp{CONST_DOUBLE_HIGH (@var{op})}, if @var{op} is an integer | |
3285 @code{const_double}. | |
3286 @item lval | |
3287 @samp{CONST_DOUBLE_LOW (@var{op})}, if @var{op} is an integer | |
3288 @code{const_double}. | |
3289 @item rval | |
3290 @samp{CONST_DOUBLE_REAL_VALUE (@var{op})}, if @var{op} is a floating-point | |
3291 @code{const_double}. | |
3292 @end table | |
3293 | |
3294 The @var{*val} variables should only be used once another piece of the | |
3295 expression has verified that @var{op} is the appropriate kind of RTL | |
3296 object. | |
3297 @end deffn | |
3298 | |
3299 Most non-register constraints should be defined with | |
3300 @code{define_constraint}. The remaining two definition expressions | |
3301 are only appropriate for constraints that should be handled specially | |
3302 by @code{reload} if they fail to match. | |
3303 | |
3304 @deffn {MD Expression} define_memory_constraint name docstring exp | |
3305 Use this expression for constraints that match a subset of all memory | |
3306 operands: that is, @code{reload} can make them match by converting the | |
3307 operand to the form @samp{@w{(mem (reg @var{X}))}}, where @var{X} is a | |
3308 base register (from the register class specified by | |
3309 @code{BASE_REG_CLASS}, @pxref{Register Classes}). | |
3310 | |
3311 For example, on the S/390, some instructions do not accept arbitrary | |
3312 memory references, but only those that do not make use of an index | |
3313 register. The constraint letter @samp{Q} is defined to represent a | |
3314 memory address of this type. If @samp{Q} is defined with | |
3315 @code{define_memory_constraint}, a @samp{Q} constraint can handle any | |
3316 memory operand, because @code{reload} knows it can simply copy the | |
3317 memory address into a base register if required. This is analogous to | |
3318 the way an @samp{o} constraint can handle any memory operand. | |
3319 | |
3320 The syntax and semantics are otherwise identical to | |
3321 @code{define_constraint}. | |
3322 @end deffn | |
3323 | |
3324 @deffn {MD Expression} define_address_constraint name docstring exp | |
3325 Use this expression for constraints that match a subset of all address | |
3326 operands: that is, @code{reload} can make the constraint match by | |
3327 converting the operand to the form @samp{@w{(reg @var{X})}}, again | |
3328 with @var{X} a base register. | |
3329 | |
3330 Constraints defined with @code{define_address_constraint} can only be | |
3331 used with the @code{address_operand} predicate, or machine-specific | |
3332 predicates that work the same way. They are treated analogously to | |
3333 the generic @samp{p} constraint. | |
3334 | |
3335 The syntax and semantics are otherwise identical to | |
3336 @code{define_constraint}. | |
3337 @end deffn | |
3338 | |
3339 For historical reasons, names beginning with the letters @samp{G H} | |
3340 are reserved for constraints that match only @code{const_double}s, and | |
3341 names beginning with the letters @samp{I J K L M N O P} are reserved | |
3342 for constraints that match only @code{const_int}s. This may change in | |
3343 the future. For the time being, constraints with these names must be | |
3344 written in a stylized form, so that @code{genpreds} can tell you did | |
3345 it correctly: | |
3346 | |
3347 @smallexample | |
3348 @group | |
3349 (define_constraint "[@var{GHIJKLMNOP}]@dots{}" | |
3350 "@var{doc}@dots{}" | |
3351 (and (match_code "const_int") ; @r{@code{const_double} for G/H} | |
3352 @var{condition}@dots{})) ; @r{usually a @code{match_test}} | |
3353 @end group | |
3354 @end smallexample | |
3355 @c the semicolons line up in the formatted manual | |
3356 | |
3357 It is fine to use names beginning with other letters for constraints | |
3358 that match @code{const_double}s or @code{const_int}s. | |
3359 | |
3360 Each docstring in a constraint definition should be one or more complete | |
3361 sentences, marked up in Texinfo format. @emph{They are currently unused.} | |
3362 In the future they will be copied into the GCC manual, in @ref{Machine | |
3363 Constraints}, replacing the hand-maintained tables currently found in | |
3364 that section. Also, in the future the compiler may use this to give | |
3365 more helpful diagnostics when poor choice of @code{asm} constraints | |
3366 causes a reload failure. | |
3367 | |
3368 If you put the pseudo-Texinfo directive @samp{@@internal} at the | |
3369 beginning of a docstring, then (in the future) it will appear only in | |
3370 the internals manual's version of the machine-specific constraint tables. | |
3371 Use this for constraints that should not appear in @code{asm} statements. | |
3372 | |
3373 @node C Constraint Interface | |
3374 @subsection Testing constraints from C | |
3375 @cindex testing constraints | |
3376 @cindex constraints, testing | |
3377 | |
3378 It is occasionally useful to test a constraint from C code rather than | |
3379 implicitly via the constraint string in a @code{match_operand}. The | |
3380 generated file @file{tm_p.h} declares a few interfaces for working | |
3381 with machine-specific constraints. None of these interfaces work with | |
3382 the generic constraints described in @ref{Simple Constraints}. This | |
3383 may change in the future. | |
3384 | |
3385 @strong{Warning:} @file{tm_p.h} may declare other functions that | |
3386 operate on constraints, besides the ones documented here. Do not use | |
3387 those functions from machine-dependent code. They exist to implement | |
3388 the old constraint interface that machine-independent components of | |
3389 the compiler still expect. They will change or disappear in the | |
3390 future. | |
3391 | |
3392 Some valid constraint names are not valid C identifiers, so there is a | |
3393 mangling scheme for referring to them from C@. Constraint names that | |
3394 do not contain angle brackets or underscores are left unchanged. | |
3395 Underscores are doubled, each @samp{<} is replaced with @samp{_l}, and | |
3396 each @samp{>} with @samp{_g}. Here are some examples: | |
3397 | |
3398 @c the @c's prevent double blank lines in the printed manual. | |
3399 @example | |
3400 @multitable {Original} {Mangled} | |
3401 @item @strong{Original} @tab @strong{Mangled} @c | |
3402 @item @code{x} @tab @code{x} @c | |
3403 @item @code{P42x} @tab @code{P42x} @c | |
3404 @item @code{P4_x} @tab @code{P4__x} @c | |
3405 @item @code{P4>x} @tab @code{P4_gx} @c | |
3406 @item @code{P4>>} @tab @code{P4_g_g} @c | |
3407 @item @code{P4_g>} @tab @code{P4__g_g} @c | |
3408 @end multitable | |
3409 @end example | |
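
As a rough illustration of the mangling rules above, the following C
sketch applies them to a string.  The function name
@code{mangle_constraint_name} and its interface are hypothetical and
exist only for this example; this is not a function that GCC provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mangle constraint name NAME into OUT, which has room for OUT_SIZE
   bytes including the terminating NUL.  Returns 0 on success and -1
   if OUT is too small.  Hypothetical helper, not part of GCC.  */
int
mangle_constraint_name (const char *name, char *out, size_t out_size)
{
  size_t used = 0;

  assert (name != NULL && out != NULL);
  for (; *name != '\0'; name++)
    {
      char single[2] = { *name, '\0' };
      const char *piece;
      size_t len;

      switch (*name)
        {
        case '_': piece = "__"; break;   /* underscores are doubled */
        case '<': piece = "_l"; break;   /* '<' becomes "_l" */
        case '>': piece = "_g"; break;   /* '>' becomes "_g" */
        default:  piece = single; break; /* everything else is kept */
        }

      len = strlen (piece);
      if (used + len + 1 > out_size)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }

  if (used + 1 > out_size)
    return -1;
  out[used] = '\0';
  return 0;
}
```

Because underscores are doubled before the angle-bracket escapes are
introduced, a literal @samp{_g} in a constraint name cannot be confused
with a mangled @samp{>}.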
3410 | |
3411 Throughout this section, the variable @var{c} is either a constraint | |
3412 in the abstract sense, or a constant from @code{enum constraint_num}; | |
3413 the variable @var{m} is a mangled constraint name (usually as part of | |
3414 a larger identifier). | |
3415 | |
3416 @deftp Enum constraint_num | |
3417 For each machine-specific constraint, there is a corresponding | |
3418 enumeration constant: @samp{CONSTRAINT_} plus the mangled name of the | |
3419 constraint. Functions that take an @code{enum constraint_num} as an | |
3420 argument expect one of these constants. | |
3421 | |
3422 Machine-independent constraints do not have associated constants. | |
3423 This may change in the future. | |
3424 @end deftp | |
3425 | |
3426 @deftypefun {inline bool} satisfies_constraint_@var{m} (rtx @var{exp}) | |
3427 For each machine-specific, non-register constraint @var{m}, there is | |
3428 one of these functions; it returns @code{true} if @var{exp} satisfies the | |
3429 constraint. These functions are only visible if @file{rtl.h} was included | |
3430 before @file{tm_p.h}. | |
3431 @end deftypefun | |
3432 | |
3433 @deftypefun bool constraint_satisfied_p (rtx @var{exp}, enum constraint_num @var{c}) | |
3434 Like the @code{satisfies_constraint_@var{m}} functions, but the | |
3435 constraint to test is given as an argument, @var{c}. If @var{c} | |
3436 specifies a register constraint, this function will always return | |
3437 @code{false}. | |
3438 @end deftypefun | |
3439 | |
3440 @deftypefun {enum reg_class} regclass_for_constraint (enum constraint_num @var{c}) | |
3441 Returns the register class associated with @var{c}. If @var{c} is not | |
3442 a register constraint, or those registers are not available for the | |
3443 currently selected subtarget, returns @code{NO_REGS}. | |
3444 @end deftypefun | |
3445 | |
3446 Here is an example use of @code{satisfies_constraint_@var{m}}. In | |
3447 peephole optimizations (@pxref{Peephole Definitions}), operand | |
3448 constraint strings are ignored, so if there are relevant constraints, | |
3449 they must be tested in the C condition. In the example, the | |
3450 optimization is applied if operand 2 does @emph{not} satisfy the | |
3451 @samp{K} constraint. (This is a simplified version of a peephole | |
3452 definition from the i386 machine description.) | |
3453 | |
3454 @smallexample | |
3455 (define_peephole2 | |
3456 [(match_scratch:SI 3 "r") | |
3457 (set (match_operand:SI 0 "register_operand" "") | |
3458 (mult:SI (match_operand:SI 1 "memory_operand" "") | |
3459 (match_operand:SI 2 "immediate_operand" "")))] | |
3460 | |
3461 "!satisfies_constraint_K (operands[2])" | |
3462 | |
3463 [(set (match_dup 3) (match_dup 1)) | |
3464 (set (match_dup 0) (mult:SI (match_dup 3) (match_dup 2)))] | |
3465 | |
3466 "") | |
3467 @end smallexample | |
3468 | |
3469 @node Standard Names | |
3470 @section Standard Pattern Names For Generation | |
3471 @cindex standard pattern names | |
3472 @cindex pattern names | |
3473 @cindex names, pattern | |
3474 | |
3475 Here is a table of the instruction names that are meaningful in the RTL | |
3476 generation pass of the compiler. Giving one of these names to an | |
3477 instruction pattern tells the RTL generation pass that it can use the | |
3478 pattern to accomplish a certain task. | |
3479 | |
3480 @table @asis | |
3481 @cindex @code{mov@var{m}} instruction pattern | |
3482 @item @samp{mov@var{m}} | |
3483 Here @var{m} stands for a two-letter machine mode name, in lowercase. | |
3484 This instruction pattern moves data with that machine mode from operand | |
3485 1 to operand 0. For example, @samp{movsi} moves full-word data. | |
3486 | |
3487 If operand 0 is a @code{subreg} with mode @var{m} of a register whose | |
3488 own mode is wider than @var{m}, the effect of this instruction is | |
3489 to store the specified value in the part of the register that corresponds | |
3490 to mode @var{m}. Bits outside of @var{m}, but which are within the | |
3491 same target word as the @code{subreg}, are undefined. Bits which are | |
3492 outside the target word are left unchanged. | |
3493 | |
3494 This class of patterns is special in several ways. First of all, each | |
3495 of these names up to and including full word size @emph{must} be defined, | |
3496 because there is no other way to copy a datum from one place to another. | |
3497 If there are patterns accepting operands in larger modes, | |
3498 @samp{mov@var{m}} must be defined for integer modes of those sizes. | |
3499 | |
3500 Second, these patterns are not used solely in the RTL generation pass. | |
3501 Even the reload pass can generate move insns to copy values from stack | |
3502 slots into temporary registers. When it does so, one of the operands is | |
3503 a hard register and the other is an operand that may need to be reloaded | |
3504 into a register. | |
3505 | |
3506 @findex force_reg | |
3507 Therefore, when given such a pair of operands, the pattern must generate | |
3508 RTL which needs no reloading and needs no temporary registers---no | |
3509 registers other than the operands. For example, if you support the | |
3510 pattern with a @code{define_expand}, then in such a case the | |
3511 @code{define_expand} mustn't call @code{force_reg} or any other such | |
3512 function which might generate new pseudo registers. | |
3513 | |
3514 This requirement exists even for subword modes on a RISC machine where | |
3515 fetching those modes from memory normally requires several insns and | |
3516 some temporary registers. | |
3517 | |
3518 @findex change_address | |
3519 During reload a memory reference with an invalid address may be passed | |
3520 as an operand. Such an address will be replaced with a valid address | |
3521 later in the reload pass. In this case, nothing may be done with the | |
3522 address except to use it as it stands. If it is copied, it will not be | |
3523 replaced with a valid address. No attempt should be made to make such | |
3524 an address into a valid address and no routine (such as | |
3525 @code{change_address}) that will do so may be called. Note that | |
3526 @code{general_operand} will fail when applied to such an address. | |
3527 | |
3528 @findex reload_in_progress | |
3529 The global variable @code{reload_in_progress} (which must be explicitly | |
3530 declared if required) can be used to determine whether such special | |
3531 handling is required. | |
3532 | |
3533 The variety of operands that have reloads depends on the rest of the | |
3534 machine description, but typically on a RISC machine these can only be | |
3535 pseudo registers that did not get hard registers, while on other | |
3536 machines explicit memory references will get optional reloads. | |
3537 | |
3538 If a scratch register is required to move an object to or from memory, | |
3539 it can be allocated using @code{gen_reg_rtx} prior to life analysis. | |
3540 | |
3541 If there are cases which need scratch registers during or after reload, | |
3542 you must provide an appropriate secondary_reload target hook. | |
3543 | |
3544 @findex can_create_pseudo_p | |
3545 The macro @code{can_create_pseudo_p} can be used to determine whether it | |
3546 is safe to create new pseudo registers. If this macro evaluates to zero, then | |
3547 it is unsafe to call @code{gen_reg_rtx} to allocate a new pseudo. | |
3548 | |
3549 The constraints on a @samp{mov@var{m}} must permit moving any hard | |
3550 register to any other hard register provided that | |
3551 @code{HARD_REGNO_MODE_OK} permits mode @var{m} in both registers and | |
3552 @code{REGISTER_MOVE_COST} applied to their classes returns a value of 2. | |
3553 | |
3554 It is obligatory to support floating point @samp{mov@var{m}} | |
3555 instructions into and out of any registers that can hold fixed point | |
3556 values, because unions and structures (which have modes @code{SImode} or | |
3557 @code{DImode}) can be in those registers and they may have floating | |
3558 point members. | |
3559 | |
3560 There may also be a need to support fixed point @samp{mov@var{m}} | |
3561 instructions in and out of floating point registers. Unfortunately, I | |
3562 have forgotten why this was so, and I don't know whether it is still | |
3563 true. If @code{HARD_REGNO_MODE_OK} rejects fixed point values in | |
3564 floating point registers, then the constraints of the fixed point | |
3565 @samp{mov@var{m}} instructions must be designed to avoid ever trying to | |
3566 reload into a floating point register. | |
3567 | |
3568 @cindex @code{reload_in} instruction pattern | |
3569 @cindex @code{reload_out} instruction pattern | |
3570 @item @samp{reload_in@var{m}} | |
3571 @itemx @samp{reload_out@var{m}} | |
3572 These named patterns have been obsoleted by the target hook | |
3573 @code{secondary_reload}. | |
3574 | |
3575 Like @samp{mov@var{m}}, but used when a scratch register is required to | |
3576 move between operand 0 and operand 1. Operand 2 describes the scratch | |
3577 register. See the discussion of the @code{SECONDARY_RELOAD_CLASS} | |
3578 macro in @ref{Register Classes}. | |
3579 | |
3580 There are special restrictions on the form of the @code{match_operand}s | |
3581 used in these patterns. First, only the predicate for the reload | |
3582 operand is examined, i.e., @code{reload_in} examines operand 1, but not | |
3583 the predicates for operand 0 or 2. Second, there may be only one | |
3584 alternative in the constraints. Third, only a single register class | |
3585 letter may be used for the constraint; subsequent constraint letters | |
3586 are ignored. As a special exception, an empty constraint string | |
3587 matches the @code{ALL_REGS} register class. This may relieve ports | |
3588 of the burden of defining an @code{ALL_REGS} constraint letter just | |
3589 for these patterns. | |
3590 | |
3591 @cindex @code{movstrict@var{m}} instruction pattern | |
3592 @item @samp{movstrict@var{m}} | |
3593 Like @samp{mov@var{m}} except that if operand 0 is a @code{subreg} | |
3594 with mode @var{m} of a register whose natural mode is wider, | |
3595 the @samp{movstrict@var{m}} instruction is guaranteed not to alter | |
3596 any of the register except the part which belongs to mode @var{m}. | |
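
The guarantee can be pictured with plain integers.  The sketch below
is only a model: it assumes a 32-bit register whose low 16 bits are
the part addressed by the @code{subreg}, and the function name
@code{model_movstrict_hi} is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a strict 16-bit store into a 32-bit register: only the low
   16 bits (the assumed subreg part) change.  Hypothetical helper.  */
uint32_t
model_movstrict_hi (uint32_t reg, uint16_t value)
{
  uint32_t result = (reg & UINT32_C (0xFFFF0000)) | value;

  /* The bits outside the 16-bit part must be exactly as before.  */
  assert ((result & UINT32_C (0xFFFF0000)) == (reg & UINT32_C (0xFFFF0000)));
  return result;
}
```

A plain @samp{mov@var{m}} to the same @code{subreg} would be free to
leave the upper 16 bits of the word in any state.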
3597 | |
3598 @cindex @code{movmisalign@var{m}} instruction pattern | |
3599 @item @samp{movmisalign@var{m}} | |
3600 This variant of a move pattern is designed to load or store a value | |
3601 from a memory address that is not naturally aligned for its mode. | |
3602 For a store, the memory will be in operand 0; for a load, the memory | |
3603 will be in operand 1. The other operand is guaranteed not to be a | |
3604 memory, so that it's easy to tell whether this is a load or store. | |
3605 | |
3606 This pattern is used by the autovectorizer, and when expanding a | |
3607 @code{MISALIGNED_INDIRECT_REF} expression. | |
3608 | |
3609 @cindex @code{load_multiple} instruction pattern | |
3610 @item @samp{load_multiple} | |
3611 Load several consecutive memory locations into consecutive registers. | |
3612 Operand 0 is the first of the consecutive registers, operand 1 | |
3613 is the first memory location, and operand 2 is a constant: the | |
3614 number of consecutive registers. | |
3615 | |
3616 Define this only if the target machine really has such an instruction; | |
3617 do not define this if the most efficient way of loading consecutive | |
3618 registers from memory is to do them one at a time. | |
3619 | |
3620 On some machines, there are restrictions as to which consecutive | |
3621 registers can be stored into memory, such as particular starting or | |
3622 ending register numbers or only a range of valid counts. For those | |
3623 machines, use a @code{define_expand} (@pxref{Expander Definitions}) | |
3624 and make the pattern fail if the restrictions are not met. | |
3625 | |
3626 Write the generated insn as a @code{parallel} with elements being a | |
3627 @code{set} of one register from the appropriate memory location (you may | |
3628 also need @code{use} or @code{clobber} elements). Use a | |
3629 @code{match_parallel} (@pxref{RTL Template}) to recognize the insn. See | |
3630 @file{rs6000.md} for examples of the use of this insn pattern. | |
3631 | |
3632 @cindex @code{store_multiple} instruction pattern | |
3633 @item @samp{store_multiple} | |
3634 Similar to @samp{load_multiple}, but store several consecutive registers | |
3635 into consecutive memory locations. Operand 0 is the first of the | |
3636 consecutive memory locations, operand 1 is the first register, and | |
3637 operand 2 is a constant: the number of consecutive registers. | |
3638 | |
3639 @cindex @code{vec_set@var{m}} instruction pattern | |
3640 @item @samp{vec_set@var{m}} | |
3641 Set the given field in the vector value. Operand 0 is the vector to modify, | |
3642 operand 1 is the new value of the field, and operand 2 specifies the field index. | |
3643 | |
3644 @cindex @code{vec_extract@var{m}} instruction pattern | |
3645 @item @samp{vec_extract@var{m}} | |
3646 Extract the given field from the vector value. Operand 1 is the vector, operand 2 | |
3647 specifies the field index, and operand 0 is the place to store the value. | |
3648 | |
3649 @cindex @code{vec_extract_even@var{m}} instruction pattern | |
3650 @item @samp{vec_extract_even@var{m}} | |
3651 Extract even elements from the input vectors (operand 1 and operand 2). | |
3652 The even elements of operand 2 are concatenated to the even elements of operand | |
3653 1 in their original order. The result is stored in operand 0. | |
3654 The output and input vectors should have the same modes. | |
3655 | |
3656 @cindex @code{vec_extract_odd@var{m}} instruction pattern | |
3657 @item @samp{vec_extract_odd@var{m}} | |
3658 Extract odd elements from the input vectors (operand 1 and operand 2). | |
3659 The odd elements of operand 2 are concatenated to the odd elements of operand | |
3660 1 in their original order. The result is stored in operand 0. | |
3661 The output and input vectors should have the same modes. | |
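
The element selection performed by @samp{vec_extract_even@var{m}} and
@samp{vec_extract_odd@var{m}} can be modelled on ordinary arrays.
This is a sketch only: it assumes elements are numbered from zero in
array order, and all function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the elements of A, then of B, whose index has parity PARITY
   (0 for even, 1 for odd) into OUT.  A, B and OUT each hold N
   elements and N must be even.  Hypothetical helper.  */
static void
model_extract_parity (const int32_t *a, const int32_t *b, int32_t *out,
                      size_t n, size_t parity)
{
  size_t k = 0;

  assert (n % 2 == 0 && parity < 2);
  for (size_t i = parity; i < n; i += 2)
    out[k++] = a[i];
  for (size_t i = parity; i < n; i += 2)
    out[k++] = b[i];
}

/* Model of vec_extract_even: A is operand 1, B is operand 2.  */
void
model_vec_extract_even (const int32_t *a, const int32_t *b, int32_t *out,
                        size_t n)
{
  model_extract_parity (a, b, out, n, 0);
}

/* Model of vec_extract_odd: A is operand 1, B is operand 2.  */
void
model_vec_extract_odd (const int32_t *a, const int32_t *b, int32_t *out,
                       size_t n)
{
  model_extract_parity (a, b, out, n, 1);
}
```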
3662 | |
3663 @cindex @code{vec_interleave_high@var{m}} instruction pattern | |
3664 @item @samp{vec_interleave_high@var{m}} | |
3665 Merge high elements of the two input vectors into the output vector. The output | |
3666 and input vectors should have the same modes (@code{N} elements). The high | |
3667 @code{N/2} elements of the first input vector are interleaved with the high | |
3668 @code{N/2} elements of the second input vector. | |
3669 | |
3670 @cindex @code{vec_interleave_low@var{m}} instruction pattern | |
3671 @item @samp{vec_interleave_low@var{m}} | |
3672 Merge low elements of the two input vectors into the output vector. The output | |
3673 and input vectors should have the same modes (@code{N} elements). The low | |
3674 @code{N/2} elements of the first input vector are interleaved with the low | |
3675 @code{N/2} elements of the second input vector. | |
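
The two interleave patterns can be modelled on arrays as follows.
This is a sketch under two assumptions that the text above does not
settle: that the ``low'' half is the one starting at array index 0
(on a real target this depends on element ordering), and that each
element of the first input precedes the matching element of the
second.  The function name is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave N/2 elements of A and B, starting at index START, into
   the N-element array OUT.  START is 0 for the half assumed to be
   "low" and N/2 for the half assumed to be "high".  Hypothetical
   helper, not part of GCC.  */
void
model_vec_interleave (const int32_t *a, const int32_t *b, int32_t *out,
                      size_t n, size_t start)
{
  assert (n % 2 == 0 && (start == 0 || start == n / 2));
  for (size_t i = 0; i < n / 2; i++)
    {
      out[2 * i] = a[start + i];
      out[2 * i + 1] = b[start + i];
    }
}
```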
3676 | |
3677 @cindex @code{vec_init@var{m}} instruction pattern | |
3678 @item @samp{vec_init@var{m}} | |
3679 Initialize the vector to the given values. Operand 0 is the vector to initialize | |
3680 and operand 1 is a @code{parallel} containing the values for the individual fields. | |
3681 | |
3682 @cindex @code{push@var{m}1} instruction pattern | |
3683 @item @samp{push@var{m}1} | |
3684 Output a push instruction. Operand 0 is the value to push. Used only when | |
3685 @code{PUSH_ROUNDING} is defined. For historical reasons, this pattern may be | |
3686 missing, in which case a @code{mov} expander is used instead, with a | |
3687 @code{MEM} expression forming the push operation. The @code{mov} expander | |
3688 method is deprecated. | |
3689 | |
3690 @cindex @code{add@var{m}3} instruction pattern | |
3691 @item @samp{add@var{m}3} | |
3692 Add operand 2 and operand 1, storing the result in operand 0. All operands | |
3693 must have mode @var{m}. This can be used even on two-address machines, by | |
3694 means of constraints requiring operands 1 and 0 to be the same location. | |
3695 | |
3696 @cindex @code{ssadd@var{m}3} instruction pattern | |
3697 @cindex @code{usadd@var{m}3} instruction pattern | |
3698 @cindex @code{sub@var{m}3} instruction pattern | |
3699 @cindex @code{sssub@var{m}3} instruction pattern | |
3700 @cindex @code{ussub@var{m}3} instruction pattern | |
3701 @cindex @code{mul@var{m}3} instruction pattern | |
3702 @cindex @code{ssmul@var{m}3} instruction pattern | |
3703 @cindex @code{usmul@var{m}3} instruction pattern | |
3704 @cindex @code{div@var{m}3} instruction pattern | |
3705 @cindex @code{ssdiv@var{m}3} instruction pattern | |
3706 @cindex @code{udiv@var{m}3} instruction pattern | |
3707 @cindex @code{usdiv@var{m}3} instruction pattern | |
3708 @cindex @code{mod@var{m}3} instruction pattern | |
3709 @cindex @code{umod@var{m}3} instruction pattern | |
3710 @cindex @code{umin@var{m}3} instruction pattern | |
3711 @cindex @code{umax@var{m}3} instruction pattern | |
3712 @cindex @code{and@var{m}3} instruction pattern | |
3713 @cindex @code{ior@var{m}3} instruction pattern | |
3714 @cindex @code{xor@var{m}3} instruction pattern | |
3715 @item @samp{ssadd@var{m}3}, @samp{usadd@var{m}3} | |
3716 @itemx @samp{sub@var{m}3}, @samp{sssub@var{m}3}, @samp{ussub@var{m}3} | |
3717 @itemx @samp{mul@var{m}3}, @samp{ssmul@var{m}3}, @samp{usmul@var{m}3} | |
3718 @itemx @samp{div@var{m}3}, @samp{ssdiv@var{m}3} | |
3719 @itemx @samp{udiv@var{m}3}, @samp{usdiv@var{m}3} | |
3720 @itemx @samp{mod@var{m}3}, @samp{umod@var{m}3} | |
3721 @itemx @samp{umin@var{m}3}, @samp{umax@var{m}3} | |
3722 @itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3} | |
3723 Similar, for other arithmetic operations. | |
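
The @samp{ss} and @samp{us} prefixes denote signed and unsigned
saturating arithmetic: a result that would overflow is clamped to the
nearest representable value instead of wrapping.  The following sketch
models @samp{ssadd@var{m}3} and @samp{usadd@var{m}3} for 8-bit
operands; the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Model of signed saturating addition on 8-bit values.  */
int8_t
model_ssadd_qi (int8_t a, int8_t b)
{
  int sum = (int) a + (int) b;   /* exact, computed in a wider type */

  if (sum > INT8_MAX)
    sum = INT8_MAX;
  else if (sum < INT8_MIN)
    sum = INT8_MIN;
  assert (sum >= INT8_MIN && sum <= INT8_MAX);
  return (int8_t) sum;
}

/* Model of unsigned saturating addition on 8-bit values.  */
uint8_t
model_usadd_qi (uint8_t a, uint8_t b)
{
  unsigned int sum = (unsigned int) a + (unsigned int) b;

  return sum > UINT8_MAX ? UINT8_MAX : (uint8_t) sum;
}
```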
3724 | |
3725 @cindex @code{smin@var{m}3} instruction pattern | |
3726 @cindex @code{smax@var{m}3} instruction pattern | |
3727 @item @samp{smin@var{m}3}, @samp{smax@var{m}3} | |
3728 Signed minimum and maximum operations. When used with floating point, | |
3729 if both operands are zeros, or if either operand is @code{NaN}, then | |
3730 it is unspecified which of the two operands is returned as the result. | |
3731 | |
3732 @cindex @code{reduc_smin_@var{m}} instruction pattern | |
3733 @cindex @code{reduc_smax_@var{m}} instruction pattern | |
3734 @item @samp{reduc_smin_@var{m}}, @samp{reduc_smax_@var{m}} | |
3735 Find the signed minimum/maximum of the elements of a vector. The vector is | |
3736 operand 1, and the scalar result is stored in the least significant bits of | |
3737 operand 0 (also a vector). The output and input vector should have the same | |
3738 modes. | |
3739 | |
3740 @cindex @code{reduc_umin_@var{m}} instruction pattern | |
3741 @cindex @code{reduc_umax_@var{m}} instruction pattern | |
3742 @item @samp{reduc_umin_@var{m}}, @samp{reduc_umax_@var{m}} | |
3743 Find the unsigned minimum/maximum of the elements of a vector. The vector is | |
3744 operand 1, and the scalar result is stored in the least significant bits of | |
3745 operand 0 (also a vector). The output and input vector should have the same | |
3746 modes. | |
3747 | |
3748 @cindex @code{reduc_splus_@var{m}} instruction pattern | |
3749 @item @samp{reduc_splus_@var{m}} | |
3750 Compute the sum of the signed elements of a vector. The vector is operand 1, | |
3751 and the scalar result is stored in the least significant bits of operand 0 | |
3752 (also a vector). The output and input vector should have the same modes. | |
3753 | |
3754 @cindex @code{reduc_uplus_@var{m}} instruction pattern | |
3755 @item @samp{reduc_uplus_@var{m}} | |
3756 Compute the sum of the unsigned elements of a vector. The vector is operand 1, | |
3757 and the scalar result is stored in the least significant bits of operand 0 | |
3758 (also a vector). The output and input vector should have the same modes. | |
3759 | |
3760 @cindex @code{sdot_prod@var{m}} instruction pattern | |
3761 @item @samp{sdot_prod@var{m}} | |
3762 @cindex @code{udot_prod@var{m}} instruction pattern | |
3763 @itemx @samp{udot_prod@var{m}} | |
3764 Compute the sum of the products of two signed/unsigned elements. | |
3765 Operand 1 and operand 2 are of the same mode. Their product, which is of a | |
3766 wider mode, is computed and added to operand 3. Operand 3 is of a mode equal to or | |
3767 wider than the mode of the product. The result is placed in operand 0, which | |
3768 is of the same mode as operand 3. | |
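
Reduced to a single pair of elements, the signed form behaves like the
step function in the sketch below, which assumes 16-bit inputs and a
32-bit accumulator.  The loop shows how repeated steps form a dot
product.  Both function names are invented for this example, and the
model assumes the running sum stays within the accumulator's range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One multiply-accumulate step: widen OP1 and OP2, multiply them, and
   add the product to the accumulator OP3.  Hypothetical helper.  */
int32_t
model_sdot_prod_step (int16_t op1, int16_t op2, int32_t op3)
{
  int32_t product = (int32_t) op1 * (int32_t) op2;
  int64_t sum = (int64_t) op3 + (int64_t) product;

  /* This model does not define behaviour on accumulator overflow.  */
  assert (sum >= INT32_MIN && sum <= INT32_MAX);
  return (int32_t) sum;
}

/* Dot product of the N-element arrays A and B, added to ACC.  */
int32_t
model_dot_product (const int16_t *a, const int16_t *b, size_t n, int32_t acc)
{
  for (size_t i = 0; i < n; i++)
    acc = model_sdot_prod_step (a[i], b[i], acc);
  return acc;
}
```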
3769 | |
3770 @cindex @code{ssum_widen@var{m3}} instruction pattern | |
3771 @item @samp{ssum_widen@var{m3}} | |
3772 @cindex @code{usum_widen@var{m3}} instruction pattern | |
3773 @itemx @samp{usum_widen@var{m3}} | |
3774 Operands 0 and 2 are of the same mode, which is wider than the mode of | |
3775 operand 1. Add operand 1 to operand 2 and place the widened result in | |
3776 operand 0. (This is used to express accumulation of elements into an accumulator | |
3777 of a wider mode.) | |
3778 | |
3779 @cindex @code{vec_shl_@var{m}} instruction pattern | |
3780 @cindex @code{vec_shr_@var{m}} instruction pattern | |
3781 @item @samp{vec_shl_@var{m}}, @samp{vec_shr_@var{m}} | |
3782 Whole vector left/right shift in bits. | |
3783 Operand 1 is a vector to be shifted. | |
3784 Operand 2 is an integer shift amount in bits. | |
3785 Operand 0 is where the resulting shifted vector is stored. | |
3786 The output and input vectors should have the same modes. | |
3787 | |
3788 @cindex @code{vec_pack_trunc_@var{m}} instruction pattern | |
3789 @item @samp{vec_pack_trunc_@var{m}} | |
3790 Narrow (demote) and merge the elements of two vectors. Operands 1 and 2 | |
3791 are vectors of the same mode having N integral or floating point elements | |
3792 of size S@. Operand 0 is the resulting vector in which 2*N elements of | |
3793 size S/2 are concatenated after narrowing them down using truncation. | |
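
A truncating pack can be modelled on arrays: each input element is
narrowed to half its width by discarding its upper bits, and the
narrowed elements of both inputs end up in one output vector.  The
sketch below assumes 16-bit unsigned elements and assumes that the
elements of operand 1 come first in the result; the function name is
invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow the N elements of OP1 and then of OP2 from 16 to 8 bits by
   truncation, writing 2*N elements to OUT.  Hypothetical helper.  */
void
model_vec_pack_trunc (const uint16_t *op1, const uint16_t *op2,
                      uint8_t *out, size_t n)
{
  assert (op1 != NULL && op2 != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (uint8_t) (op1[i] & 0xFF);
  for (size_t i = 0; i < n; i++)
    out[n + i] = (uint8_t) (op2[i] & 0xFF);
}
```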
3794 | |
3795 @cindex @code{vec_pack_ssat_@var{m}} instruction pattern | |
3796 @cindex @code{vec_pack_usat_@var{m}} instruction pattern | |
3797 @item @samp{vec_pack_ssat_@var{m}}, @samp{vec_pack_usat_@var{m}} | |
3798 Narrow (demote) and merge the elements of two vectors. Operands 1 and 2 | |
3799 are vectors of the same mode having N integral elements of size S@. | |
3800 Operand 0 is the resulting vector in which the elements of the two input | |
3801 vectors are concatenated after narrowing them down using signed/unsigned | |
3802 saturating arithmetic. | |
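
The saturating variants differ from truncation only in how an
out-of-range element is narrowed: it is clamped rather than cut off.
The sketch below models the signed case for 16-bit inputs, again
assuming that the elements of operand 1 come first in the result; the
function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 16-bit value to the range of an 8-bit signed integer.  */
static int8_t
model_saturate_to_int8 (int16_t v)
{
  if (v > INT8_MAX)
    return INT8_MAX;
  if (v < INT8_MIN)
    return INT8_MIN;
  return (int8_t) v;
}

/* Narrow the N elements of OP1 and then of OP2 from 16 to 8 bits
   using signed saturation, writing 2*N elements to OUT.
   Hypothetical helper, not part of GCC.  */
void
model_vec_pack_ssat (const int16_t *op1, const int16_t *op2,
                     int8_t *out, size_t n)
{
  assert (op1 != NULL && op2 != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = model_saturate_to_int8 (op1[i]);
  for (size_t i = 0; i < n; i++)
    out[n + i] = model_saturate_to_int8 (op2[i]);
}
```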
3803 | |
3804 @cindex @code{vec_pack_sfix_trunc_@var{m}} instruction pattern | |
3805 @cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern | |
3806 @item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}} | |
3807 Narrow, convert to signed/unsigned integral type and merge the elements | |
3808 of two vectors. Operands 1 and 2 are vectors of the same mode having N | |
3809 floating point elements of size S@. Operand 0 is the resulting vector | |
3810 in which 2*N elements of size S/2 are concatenated. | |
3811 | |
3812 @cindex @code{vec_unpacks_hi_@var{m}} instruction pattern | |
3813 @cindex @code{vec_unpacks_lo_@var{m}} instruction pattern | |
3814 @item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}} | |
3815 Extract and widen (promote) the high/low part of a vector of signed | |
3816 integral or floating point elements. The input vector (operand 1) has N | |
3817 elements of size S@. Widen (promote) the high/low elements of the vector | |
3818 using signed or floating point extension and place the resulting N/2 | |
3819 values of size 2*S in the output vector (operand 0). | |
3820 | |
3821 @cindex @code{vec_unpacku_hi_@var{m}} instruction pattern | |
3822 @cindex @code{vec_unpacku_lo_@var{m}} instruction pattern | |
3823 @item @samp{vec_unpacku_hi_@var{m}}, @samp{vec_unpacku_lo_@var{m}} | |
3824 Extract and widen (promote) the high/low part of a vector of unsigned | |
3825 integral elements. The input vector (operand 1) has N elements of size S@. | |
3826 Widen (promote) the high/low elements of the vector using zero extension and | |
3827 place the resulting N/2 values of size 2*S in the output vector (operand 0). | |
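
The difference between the signed and unsigned unpack patterns is the
kind of extension applied to each element.  The sketch below models
both for 8-bit inputs.  Which array half counts as ``high'' or ``low''
depends on the target's element ordering, so the model takes the
starting index as a parameter rather than assuming one; the function
names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend N/2 elements of IN, starting at index START, from 8 to
   16 bits.  IN holds N elements.  Hypothetical helper.  */
void
model_vec_unpacks (const int8_t *in, int16_t *out, size_t n, size_t start)
{
  assert (n % 2 == 0 && (start == 0 || start == n / 2));
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int16_t) in[start + i];
}

/* Zero-extend N/2 elements of IN, starting at index START, from 8 to
   16 bits.  IN holds N elements.  Hypothetical helper.  */
void
model_vec_unpacku (const uint8_t *in, uint16_t *out, size_t n, size_t start)
{
  assert (n % 2 == 0 && (start == 0 || start == n / 2));
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) in[start + i];
}
```

The same bit pattern, for instance all ones in an 8-bit element,
widens to @minus{}1 under sign extension and to 255 under zero
extension.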
3828 | |
3829 @cindex @code{vec_unpacks_float_hi_@var{m}} instruction pattern | |
3830 @cindex @code{vec_unpacks_float_lo_@var{m}} instruction pattern | |
3831 @cindex @code{vec_unpacku_float_hi_@var{m}} instruction pattern | |
3832 @cindex @code{vec_unpacku_float_lo_@var{m}} instruction pattern | |
3833 @item @samp{vec_unpacks_float_hi_@var{m}}, @samp{vec_unpacks_float_lo_@var{m}} | |
3834 @itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}} | |
3835 Extract, convert to floating point type and widen the high/low part of a | |
3836 vector of signed/unsigned integral elements. The input vector (operand 1) | |
3837 has N elements of size S@. Convert the high/low elements of the vector using | |
3838 floating point conversion and place the resulting N/2 values of size 2*S in | |
3839 the output vector (operand 0). | |
3840 | |
3841 @cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern | |
3842 @cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern | |
3843 @cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern | |
3844 @cindex @code{vec_widen_smult_lo_@var{m}} instruction pattern | |
3845 @item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}} | |
3846 @itemx @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}} | |
3847 Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2) | |
3848 are vectors with N signed/unsigned elements of size S@. Multiply the high/low | |
3849 elements of the two vectors, and put the N/2 products of size 2*S in the | |
3850 output vector (operand 0). | |
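
As a rough illustration only, the unsigned @samp{lo} variant for 16-bit
elements can be modelled in scalar C as shown here. The sketch assumes
that the low part is elements 0 to N/2-1, which in practice depends on the
target's element ordering; the name @code{widen_umult_lo} is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vec_widen_umult_lo for N elements of
   size S = 16 bits.  Assumption: "low" means elements 0 .. N/2-1.  */
void
widen_umult_lo (const uint16_t *op1, const uint16_t *op2,
                uint32_t *op0, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    /* Zero-extend both inputs to 2*S bits before multiplying.  */
    op0[i] = (uint32_t) op1[i] * (uint32_t) op2[i];
}
```

The signed variants would differ only in using sign extension.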
3851 | |
3852 @cindex @code{mulhisi3} instruction pattern | |
3853 @item @samp{mulhisi3} | |
3854 Multiply operands 1 and 2, which have mode @code{HImode}, and store | |
3855 a @code{SImode} product in operand 0. | |
3856 | |
3857 @cindex @code{mulqihi3} instruction pattern | |
3858 @cindex @code{mulsidi3} instruction pattern | |
3859 @item @samp{mulqihi3}, @samp{mulsidi3} | |
3860 Similar widening-multiplication instructions of other widths. | |
3861 | |
3862 @cindex @code{umulqihi3} instruction pattern | |
3863 @cindex @code{umulhisi3} instruction pattern | |
3864 @cindex @code{umulsidi3} instruction pattern | |
3865 @item @samp{umulqihi3}, @samp{umulhisi3}, @samp{umulsidi3} | |
3866 Similar widening-multiplication instructions that do unsigned | |
3867 multiplication. | |
3868 | |
3869 @cindex @code{usmulqihi3} instruction pattern | |
3870 @cindex @code{usmulhisi3} instruction pattern | |
3871 @cindex @code{usmulsidi3} instruction pattern | |
3872 @item @samp{usmulqihi3}, @samp{usmulhisi3}, @samp{usmulsidi3} | |
3873 Similar widening-multiplication instructions that interpret the first | |
3874 operand as unsigned and the second operand as signed, then do a signed | |
3875 multiplication. | |
3876 | |
3877 @cindex @code{smul@var{m}3_highpart} instruction pattern | |
3878 @item @samp{smul@var{m}3_highpart} | |
3879 Perform a signed multiplication of operands 1 and 2, which have mode | |
3880 @var{m}, and store the most significant half of the product in operand 0. | |
3881 The least significant half of the product is discarded. | |
3882 | |
3883 @cindex @code{umul@var{m}3_highpart} instruction pattern | |
3884 @item @samp{umul@var{m}3_highpart} | |
3885 Similar, but the multiplication is unsigned. | |
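
A minimal C sketch of the two highpart patterns for a 32-bit mode is
given here. The function names are hypothetical, and the final conversion
to @code{int32_t} relies on the usual two's complement behavior, which C
leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical model of smulsi3_highpart: the upper 32 bits of the
   64-bit signed product.  Shifting the bit pattern as unsigned avoids
   relying on how >> treats negative values.  */
int32_t
smul_highpart (int32_t a, int32_t b)
{
  int64_t product = (int64_t) a * (int64_t) b;
  return (int32_t) (uint32_t) ((uint64_t) product >> 32);
}

/* Hypothetical model of umulsi3_highpart.  */
uint32_t
umul_highpart (uint32_t a, uint32_t b)
{
  uint64_t product = (uint64_t) a * (uint64_t) b;
  return (uint32_t) (product >> 32);
}
```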
3886 | |
3887 @cindex @code{madd@var{m}@var{n}4} instruction pattern | |
3888 @item @samp{madd@var{m}@var{n}4} | |
3889 Sign-extend operands 1 and 2 to mode @var{n}, multiply them, add | |
3890 operand 3, and store the result in operand 0. Operands 1 and 2 | |
3891 have mode @var{m} and operands 0 and 3 have mode @var{n}. | |
3892 Both modes must be integer or fixed-point modes and @var{n} must be twice | |
3893 the size of @var{m}. | |
3894 | |
3895 In other words, @code{madd@var{m}@var{n}4} is like | |
3896 @code{mul@var{m}@var{n}3} except that it also adds operand 3. | |
3897 | |
3898 These instructions are not allowed to @code{FAIL}. | |
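
For illustration, a C model of the case where @var{m} is @code{HImode} and
@var{n} is @code{SImode} might look as follows. The function name is
hypothetical, and wraparound on overflow of the final sum is an assumption
of this sketch rather than something stated above.

```c
#include <stdint.h>

/* Hypothetical model of maddhisi4 (m = HImode, n = SImode).  */
int32_t
madd_hi_si (int16_t op1, int16_t op2, int32_t op3)
{
  /* Sign-extend to the wide mode, multiply, then accumulate.  The sum
     is formed in unsigned arithmetic so that wraparound is well
     defined in C.  */
  int32_t product = (int32_t) op1 * (int32_t) op2;
  return (int32_t) ((uint32_t) product + (uint32_t) op3);
}
```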
3899 | |
3900 @cindex @code{umadd@var{m}@var{n}4} instruction pattern | |
3901 @item @samp{umadd@var{m}@var{n}4} | |
3902 Like @code{madd@var{m}@var{n}4}, but zero-extend the multiplication | |
3903 operands instead of sign-extending them. | |
3904 | |
3905 @cindex @code{ssmadd@var{m}@var{n}4} instruction pattern | |
3906 @item @samp{ssmadd@var{m}@var{n}4} | |
3907 Like @code{madd@var{m}@var{n}4}, but all involved operations must be | |
3908 signed-saturating. | |
3909 | |
3910 @cindex @code{usmadd@var{m}@var{n}4} instruction pattern | |
3911 @item @samp{usmadd@var{m}@var{n}4} | |
3912 Like @code{umadd@var{m}@var{n}4}, but all involved operations must be | |
3913 unsigned-saturating. | |
3914 | |
3915 @cindex @code{msub@var{m}@var{n}4} instruction pattern | |
3916 @item @samp{msub@var{m}@var{n}4} | |
3917 Sign-extend operands 1 and 2 to mode @var{n}, multiply them, subtract the | |
3918 product from operand 3, and store the result in operand 0. Operands 1 and 2 | |
3919 have mode @var{m} and operands 0 and 3 have mode @var{n}. | |
3920 Both modes must be integer or fixed-point modes and @var{n} must be twice | |
3921 the size of @var{m}. | |
3922 | |
3923 In other words, @code{msub@var{m}@var{n}4} is like | |
3924 @code{mul@var{m}@var{n}3} except that it also subtracts the result | |
3925 from operand 3. | |
3926 | |
3927 These instructions are not allowed to @code{FAIL}. | |
3928 | |
3929 @cindex @code{umsub@var{m}@var{n}4} instruction pattern | |
3930 @item @samp{umsub@var{m}@var{n}4} | |
3931 Like @code{msub@var{m}@var{n}4}, but zero-extend the multiplication | |
3932 operands instead of sign-extending them. | |
3933 | |
3934 @cindex @code{ssmsub@var{m}@var{n}4} instruction pattern | |
3935 @item @samp{ssmsub@var{m}@var{n}4} | |
3936 Like @code{msub@var{m}@var{n}4}, but all involved operations must be | |
3937 signed-saturating. | |
3938 | |
3939 @cindex @code{usmsub@var{m}@var{n}4} instruction pattern | |
3940 @item @samp{usmsub@var{m}@var{n}4} | |
3941 Like @code{umsub@var{m}@var{n}4}, but all involved operations must be | |
3942 unsigned-saturating. | |
3943 | |
3944 @cindex @code{divmod@var{m}4} instruction pattern | |
3945 @item @samp{divmod@var{m}4} | |
3946 Signed division that produces both a quotient and a remainder. | |
3947 Operand 1 is divided by operand 2 to produce a quotient stored | |
3948 in operand 0 and a remainder stored in operand 3. | |
3949 | |
3950 For machines with an instruction that produces both a quotient and a | |
3951 remainder, provide a pattern for @samp{divmod@var{m}4} but do not | |
3952 provide patterns for @samp{div@var{m}3} and @samp{mod@var{m}3}. This | |
3953 allows optimization in the relatively common case when both the quotient | |
3954 and remainder are computed. | |
3955 | |
3956 If an instruction that just produces a quotient or just a remainder | |
3957 exists and is more efficient than the instruction that produces both, | |
3958 write the output routine of @samp{divmod@var{m}4} to call | |
3959 @code{find_reg_note} and look for a @code{REG_UNUSED} note on the | |
3960 quotient or remainder and generate the appropriate instruction. | |
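
The operand roles can be sketched in C with the standard @code{div}
function, which also yields both results at once. The wrapper name
@code{divmod_model} is hypothetical; the sketch assumes the usual
truncating signed division.

```c
#include <stdlib.h>

/* Hypothetical model of divmodsi4: one operation yields both results.
   Parameter names follow the operand numbers of the pattern.  */
void
divmod_model (int op1, int op2, int *op0, int *op3)
{
  div_t qr = div (op1, op2);   /* Truncates the quotient towards zero.  */
  *op0 = qr.quot;
  *op3 = qr.rem;
}
```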
3961 | |
3962 @cindex @code{udivmod@var{m}4} instruction pattern | |
3963 @item @samp{udivmod@var{m}4} | |
3964 Similar, but does unsigned division. | |
3965 | |
3966 @anchor{shift patterns} | |
3967 @cindex @code{ashl@var{m}3} instruction pattern | |
3968 @cindex @code{ssashl@var{m}3} instruction pattern | |
3969 @cindex @code{usashl@var{m}3} instruction pattern | |
3970 @item @samp{ashl@var{m}3}, @samp{ssashl@var{m}3}, @samp{usashl@var{m}3} | |
3971 Arithmetic-shift operand 1 left by a number of bits specified by operand | |
3972 2, and store the result in operand 0. Here @var{m} is the mode of | |
3973 operand 0 and operand 1; operand 2's mode is specified by the | |
3974 instruction pattern, and the compiler will convert the operand to that | |
3975 mode before generating the instruction. The meaning of out-of-range shift | |
3976 counts can optionally be specified by @code{TARGET_SHIFT_TRUNCATION_MASK}. | |
3977 @xref{TARGET_SHIFT_TRUNCATION_MASK}. Operand 2 is always a scalar type. | |
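
The effect of a truncation mask on out-of-range counts can be sketched in
C. The mask value 31 is an assumption for a hypothetical 32-bit target
that uses only the low five bits of the count; @code{SKETCH_SHIFT_MASK}
and @code{ashl_model} are illustrative names.

```c
#include <stdint.h>

/* Assumed mask for a hypothetical 32-bit target that truncates shift
   counts to 5 bits; a real target reports its own value.  */
#define SKETCH_SHIFT_MASK 31u

uint32_t
ashl_model (uint32_t op1, unsigned int op2)
{
  /* Only the low bits of the count are used, so 33 behaves like 1.  */
  return op1 << (op2 & SKETCH_SHIFT_MASK);
}
```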
3978 | |
3979 @cindex @code{ashr@var{m}3} instruction pattern | |
3980 @cindex @code{lshr@var{m}3} instruction pattern | |
3981 @cindex @code{rotl@var{m}3} instruction pattern | |
3982 @cindex @code{rotr@var{m}3} instruction pattern | |
3983 @item @samp{ashr@var{m}3}, @samp{lshr@var{m}3}, @samp{rotl@var{m}3}, @samp{rotr@var{m}3} | |
3984 Other shift and rotate instructions, analogous to the | |
3985 @code{ashl@var{m}3} instructions. Operand 2 is always a scalar type. | |
3986 | |
3987 @cindex @code{vashl@var{m}3} instruction pattern | |
3988 @cindex @code{vashr@var{m}3} instruction pattern | |
3989 @cindex @code{vlshr@var{m}3} instruction pattern | |
3990 @cindex @code{vrotl@var{m}3} instruction pattern | |
3991 @cindex @code{vrotr@var{m}3} instruction pattern | |
3992 @item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3} | |
3993 Vector shift and rotate instructions that take vectors as operand 2 | |
3994 instead of a scalar type. | |
3995 | |
3996 @cindex @code{neg@var{m}2} instruction pattern | |
3997 @cindex @code{ssneg@var{m}2} instruction pattern | |
3998 @cindex @code{usneg@var{m}2} instruction pattern | |
3999 @item @samp{neg@var{m}2}, @samp{ssneg@var{m}2}, @samp{usneg@var{m}2} | |
4000 Negate operand 1 and store the result in operand 0. | |
4001 | |
4002 @cindex @code{abs@var{m}2} instruction pattern | |
4003 @item @samp{abs@var{m}2} | |
4004 Store the absolute value of operand 1 into operand 0. | |
4005 | |
4006 @cindex @code{sqrt@var{m}2} instruction pattern | |
4007 @item @samp{sqrt@var{m}2} | |
4008 Store the square root of operand 1 into operand 0. | |
4009 | |
4010 The @code{sqrt} built-in function of C always uses the mode which | |
4011 corresponds to the C data type @code{double} and the @code{sqrtf} | |
4012 built-in function uses the mode which corresponds to the C data | |
4013 type @code{float}. | |
4014 | |
4015 @cindex @code{fmod@var{m}3} instruction pattern | |
4016 @item @samp{fmod@var{m}3} | |
4017 Store the remainder of dividing operand 1 by operand 2 into operand 0, | |
4018 where the quotient is rounded towards zero to an integer. | |
4019 | |
4020 The @code{fmod} built-in function of C always uses the mode which | |
4021 corresponds to the C data type @code{double} and the @code{fmodf} | |
4022 built-in function uses the mode which corresponds to the C data | |
4023 type @code{float}. | |
4024 | |
4025 @cindex @code{remainder@var{m}3} instruction pattern | |
4026 @item @samp{remainder@var{m}3} | |
4027 Store the remainder of dividing operand 1 by operand 2 into operand 0, | |
4028 where the quotient is rounded to the nearest integer. | |
4029 | |
4030 The @code{remainder} built-in function of C always uses the mode | |
4031 which corresponds to the C data type @code{double} and the | |
4032 @code{remainderf} built-in function uses the mode which corresponds | |
4033 to the C data type @code{float}. | |
4034 | |
4035 @cindex @code{cos@var{m}2} instruction pattern | |
4036 @item @samp{cos@var{m}2} | |
4037 Store the cosine of operand 1 into operand 0. | |
4038 | |
4039 The @code{cos} built-in function of C always uses the mode which | |
4040 corresponds to the C data type @code{double} and the @code{cosf} | |
4041 built-in function uses the mode which corresponds to the C data | |
4042 type @code{float}. | |
4043 | |
4044 @cindex @code{sin@var{m}2} instruction pattern | |
4045 @item @samp{sin@var{m}2} | |
4046 Store the sine of operand 1 into operand 0. | |
4047 | |
4048 The @code{sin} built-in function of C always uses the mode which | |
4049 corresponds to the C data type @code{double} and the @code{sinf} | |
4050 built-in function uses the mode which corresponds to the C data | |
4051 type @code{float}. | |
4052 | |
4053 @cindex @code{exp@var{m}2} instruction pattern | |
4054 @item @samp{exp@var{m}2} | |
4055 Store the exponential of operand 1 into operand 0. | |
4056 | |
4057 The @code{exp} built-in function of C always uses the mode which | |
4058 corresponds to the C data type @code{double} and the @code{expf} | |
4059 built-in function uses the mode which corresponds to the C data | |
4060 type @code{float}. | |
4061 | |
4062 @cindex @code{log@var{m}2} instruction pattern | |
4063 @item @samp{log@var{m}2} | |
4064 Store the natural logarithm of operand 1 into operand 0. | |
4065 | |
4066 The @code{log} built-in function of C always uses the mode which | |
4067 corresponds to the C data type @code{double} and the @code{logf} | |
4068 built-in function uses the mode which corresponds to the C data | |
4069 type @code{float}. | |
4070 | |
4071 @cindex @code{pow@var{m}3} instruction pattern | |
4072 @item @samp{pow@var{m}3} | |
4073 Store the value of operand 1 raised to the exponent operand 2 | |
4074 into operand 0. | |
4075 | |
4076 The @code{pow} built-in function of C always uses the mode which | |
4077 corresponds to the C data type @code{double} and the @code{powf} | |
4078 built-in function uses the mode which corresponds to the C data | |
4079 type @code{float}. | |
4080 | |
4081 @cindex @code{atan2@var{m}3} instruction pattern | |
4082 @item @samp{atan2@var{m}3} | |
4083 Store the arc tangent (inverse tangent) of operand 1 divided by | |
4084 operand 2 into operand 0, using the signs of both arguments to | |
4085 determine the quadrant of the result. | |
4086 | |
4087 The @code{atan2} built-in function of C always uses the mode which | |
4088 corresponds to the C data type @code{double} and the @code{atan2f} | |
4089 built-in function uses the mode which corresponds to the C data | |
4090 type @code{float}. | |
4091 | |
4092 @cindex @code{floor@var{m}2} instruction pattern | |
4093 @item @samp{floor@var{m}2} | |
4094 Store the largest integral value not greater than the argument. | |
4095 | |
4096 The @code{floor} built-in function of C always uses the mode which | |
4097 corresponds to the C data type @code{double} and the @code{floorf} | |
4098 built-in function uses the mode which corresponds to the C data | |
4099 type @code{float}. | |
4100 | |
4101 @cindex @code{btrunc@var{m}2} instruction pattern | |
4102 @item @samp{btrunc@var{m}2} | |
4103 Store the argument rounded to integer towards zero. | |
4104 | |
4105 The @code{trunc} built-in function of C always uses the mode which | |
4106 corresponds to the C data type @code{double} and the @code{truncf} | |
4107 built-in function uses the mode which corresponds to the C data | |
4108 type @code{float}. | |
4109 | |
4110 @cindex @code{round@var{m}2} instruction pattern | |
4111 @item @samp{round@var{m}2} | |
4112 Store the argument rounded to the nearest integer, with halfway cases rounded away from zero. | |
4113 | |
4114 The @code{round} built-in function of C always uses the mode which | |
4115 corresponds to the C data type @code{double} and the @code{roundf} | |
4116 built-in function uses the mode which corresponds to the C data | |
4117 type @code{float}. | |
4118 | |
4119 @cindex @code{ceil@var{m}2} instruction pattern | |
4120 @item @samp{ceil@var{m}2} | |
4121 Store the smallest integral value not less than the argument. | |
4122 | |
4123 The @code{ceil} built-in function of C always uses the mode which | |
4124 corresponds to the C data type @code{double} and the @code{ceilf} | |
4125 built-in function uses the mode which corresponds to the C data | |
4126 type @code{float}. | |
4127 | |
4128 @cindex @code{nearbyint@var{m}2} instruction pattern | |
4129 @item @samp{nearbyint@var{m}2} | |
4130 Store the argument rounded according to the current rounding mode. | |
4131 | |
4132 The @code{nearbyint} built-in function of C always uses the mode which | |
4133 corresponds to the C data type @code{double} and the @code{nearbyintf} | |
4134 built-in function uses the mode which corresponds to the C data | |
4135 type @code{float}. | |
4136 | |
4137 @cindex @code{rint@var{m}2} instruction pattern | |
4138 @item @samp{rint@var{m}2} | |
4139 Store the argument rounded according to the current rounding mode and | |
4140 raise the inexact exception when the result differs in value from | |
4141 the argument. | |
4142 | |
4143 The @code{rint} built-in function of C always uses the mode which | |
4144 corresponds to the C data type @code{double} and the @code{rintf} | |
4145 built-in function uses the mode which corresponds to the C data | |
4146 type @code{float}. | |
4147 | |
4148 @cindex @code{lrint@var{m}@var{n}2} instruction pattern | |
4149 @item @samp{lrint@var{m}@var{n}2} | |
4150 Convert operand 1 (valid for floating point mode @var{m}) to fixed | |
4151 point mode @var{n} as a signed number according to the current | |
4152 rounding mode and store in operand 0 (which has mode @var{n}). | |
4153 | |
4154 @cindex @code{lround@var{m}@var{n}2} instruction pattern | |
4155 @item @samp{lround@var{m}@var{n}2} | |
4156 Convert operand 1 (valid for floating point mode @var{m}) to fixed | |
4157 point mode @var{n} as a signed number rounding to nearest and away | |
4158 from zero and store in operand 0 (which has mode @var{n}). | |
4159 | |
4160 @cindex @code{lfloor@var{m}@var{n}2} instruction pattern | |
4161 @item @samp{lfloor@var{m}@var{n}2} | |
4162 Convert operand 1 (valid for floating point mode @var{m}) to fixed | |
4163 point mode @var{n} as a signed number rounding down and store in | |
4164 operand 0 (which has mode @var{n}). | |
4165 | |
4166 @cindex @code{lceil@var{m}@var{n}2} instruction pattern | |
4167 @item @samp{lceil@var{m}@var{n}2} | |
4168 Convert operand 1 (valid for floating point mode @var{m}) to fixed | |
4169 point mode @var{n} as a signed number rounding up and store in | |
4170 operand 0 (which has mode @var{n}). | |
4171 | |
4172 @cindex @code{copysign@var{m}3} instruction pattern | |
4173 @item @samp{copysign@var{m}3} | |
4174 Store a value with the magnitude of operand 1 and the sign of operand | |
4175 2 into operand 0. | |
4176 | |
4177 The @code{copysign} built-in function of C always uses the mode which | |
4178 corresponds to the C data type @code{double} and the @code{copysignf} | |
4179 built-in function uses the mode which corresponds to the C data | |
4180 type @code{float}. | |
4181 | |
4182 @cindex @code{ffs@var{m}2} instruction pattern | |
4183 @item @samp{ffs@var{m}2} | |
4184 Store into operand 0 one plus the index of the least significant 1-bit | |
4185 of operand 1. If operand 1 is zero, store zero. @var{m} is the mode | |
4186 of operand 0; operand 1's mode is specified by the instruction | |
4187 pattern, and the compiler will convert the operand to that mode before | |
4188 generating the instruction. | |
4189 | |
4190 The @code{ffs} built-in function of C always uses the mode which | |
4191 corresponds to the C data type @code{int}. | |
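
The result convention can be made concrete with a small reference model
in C. The name @code{ffs_model} is hypothetical, and the loop is only a
specification aid, not how a target would implement the pattern.

```c
/* Hypothetical reference model of ffs for an unsigned int operand:
   one plus the index of the least significant 1-bit, or zero.  */
int
ffs_model (unsigned int x)
{
  int pos = 1;

  if (x == 0)
    return 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      pos++;
    }
  return pos;
}
```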
4192 | |
4193 @cindex @code{clz@var{m}2} instruction pattern | |
4194 @item @samp{clz@var{m}2} | |
4195 Store into operand 0 the number of leading 0-bits in operand 1, starting | |
4196 at the most significant bit position. If operand 1 is 0, the | |
4197 @code{CLZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}) macro defines whether | |
4198 the result is undefined or has a useful value. | |
4199 @var{m} is the mode of operand 0; operand 1's mode is | |
4200 specified by the instruction pattern, and the compiler will convert the | |
4201 operand to that mode before generating the instruction. | |
4202 | |
4203 @cindex @code{ctz@var{m}2} instruction pattern | |
4204 @item @samp{ctz@var{m}2} | |
4205 Store into operand 0 the number of trailing 0-bits in operand 1, starting | |
4206 at the least significant bit position. If operand 1 is 0, the | |
4207 @code{CTZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}) macro defines whether | |
4208 the result is undefined or has a useful value. | |
4209 @var{m} is the mode of operand 0; operand 1's mode is | |
4210 specified by the instruction pattern, and the compiler will convert the | |
4211 operand to that mode before generating the instruction. | |
4212 | |
4213 @cindex @code{popcount@var{m}2} instruction pattern | |
4214 @item @samp{popcount@var{m}2} | |
4215 Store into operand 0 the number of 1-bits in operand 1. @var{m} is the | |
4216 mode of operand 0; operand 1's mode is specified by the instruction | |
4217 pattern, and the compiler will convert the operand to that mode before | |
4218 generating the instruction. | |
4219 | |
4220 @cindex @code{parity@var{m}2} instruction pattern | |
4221 @item @samp{parity@var{m}2} | |
4222 Store into operand 0 the parity of operand 1, i.e.@: the number of 1-bits | |
4223 in operand 1 modulo 2. @var{m} is the mode of operand 0; operand 1's mode | |
4224 is specified by the instruction pattern, and the compiler will convert | |
4225 the operand to that mode before generating the instruction. | |
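
The relationship between the population count and the parity can be
sketched in C. Both function names are hypothetical; the loops serve only
as a reference for the expected results.

```c
/* Hypothetical reference models; a real target would typically use a
   dedicated instruction.  */
int
popcount_model (unsigned int x)
{
  int count = 0;

  for (; x != 0; x &= x - 1)   /* Clear the lowest set bit each pass.  */
    count++;
  return count;
}

int
parity_model (unsigned int x)
{
  return popcount_model (x) & 1;   /* Number of 1-bits modulo 2.  */
}
```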
4226 | |
4227 @cindex @code{one_cmpl@var{m}2} instruction pattern | |
4228 @item @samp{one_cmpl@var{m}2} | |
4229 Store the bitwise-complement of operand 1 into operand 0. | |
4230 | |
4231 @cindex @code{cmp@var{m}} instruction pattern | |
4232 @item @samp{cmp@var{m}} | |
4233 Compare operand 0 and operand 1, and set the condition codes. | |
4234 The RTL pattern should look like this: | |
4235 | |
4236 @smallexample | |
4237 (set (cc0) (compare (match_operand:@var{m} 0 @dots{}) | |
4238 (match_operand:@var{m} 1 @dots{}))) | |
4239 @end smallexample | |
4240 | |
4241 @cindex @code{tst@var{m}} instruction pattern | |
4242 @item @samp{tst@var{m}} | |
4243 Compare operand 0 against zero, and set the condition codes. | |
4244 The RTL pattern should look like this: | |
4245 | |
4246 @smallexample | |
4247 (set (cc0) (match_operand:@var{m} 0 @dots{})) | |
4248 @end smallexample | |
4249 | |
4250 @samp{tst@var{m}} patterns should not be defined for machines that do | |
4251 not use @code{(cc0)}. Doing so would confuse the optimizer since it | |
4252 would no longer be clear which @code{set} operations were comparisons. | |
4253 The @samp{cmp@var{m}} patterns should be used instead. | |
4254 | |
4255 @cindex @code{movmem@var{m}} instruction pattern | |
4256 @item @samp{movmem@var{m}} | |
4257 Block move instruction. The destination and source blocks of memory | |
4258 are the first two operands, and both are @code{mem:BLK}s with an | |
4259 address in mode @code{Pmode}. | |
4260 | |
4261 The number of bytes to move is the third operand, in mode @var{m}. | |
4262 Usually, you specify @code{word_mode} for @var{m}. However, if you can | |
4263 generate better code knowing the range of valid lengths is smaller than | |
4264 those representable in a full word, you should provide a pattern with a | |
4265 mode corresponding to the range of values you can handle efficiently | |
4266 (e.g., @code{QImode} for values in the range 0--127; note we avoid numbers | |
4267 that appear negative) and also a pattern with @code{word_mode}. | |
4268 | |
4269 The fourth operand is the known shared alignment of the source and | |
4270 destination, in the form of a @code{const_int} rtx. Thus, if the | |
4271 compiler knows that both source and destination are word-aligned, | |
4272 it may provide the value 4 for this operand. | |
4273 | |
4274 Optional operands 5 and 6 specify the expected alignment and size of the | |
4275 block, respectively. Unlike the alignment in operand 4, the expected | |
4276 alignment is only a hint: the blocks are not required to be aligned to it in | |
4277 all cases. The expected alignment is also in bytes, just like operand 4. | |
4278 The expected size, when unknown, is set to @code{(const_int -1)}. | |
4279 | |
4280 Descriptions of multiple @code{movmem@var{m}} patterns can only be | |
4281 beneficial if the patterns for smaller modes have fewer restrictions | |
4282 on their first, second and fourth operands. Note that the mode @var{m} | |
4283 in @code{movmem@var{m}} does not impose any restriction on the mode of | |
4284 individually moved data units in the block. | |
4285 | |
4286 These patterns need not give special consideration to the possibility | |
4287 that the source and destination strings might overlap. | |
4288 | |
4289 @cindex @code{movstr} instruction pattern | |
4290 @item @samp{movstr} | |
4291 String copy instruction, with @code{stpcpy} semantics. Operand 0 is | |
4292 an output operand in mode @code{Pmode}. The destination and source | |
4293 strings are operands 1 and 2, and both are | |
4294 @code{mem:BLK}s with addresses in mode @code{Pmode}. The execution of | |
4295 the expansion of this pattern should store in operand 0 the address at | |
4296 which the @code{NUL} terminator was stored in the destination string. | |
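
The @code{stpcpy} semantics referred to here can be sketched in C as
follows. The name @code{movstr_model} is hypothetical; the return value
plays the role of operand 0.

```c
/* Hypothetical model of the movstr expansion: copy src to dst and
   return the address of the NUL terminator written to dst.  */
char *
movstr_model (char *dst, const char *src)
{
  while ((*dst = *src) != '\0')
    {
      dst++;
      src++;
    }
  return dst;
}
```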
4297 | |
4298 @cindex @code{setmem@var{m}} instruction pattern | |
4299 @item @samp{setmem@var{m}} | |
4300 Block set instruction. The destination string is the first operand, | |
4301 given as a @code{mem:BLK} whose address is in mode @code{Pmode}. The | |
4302 number of bytes to set is the second operand, in mode @var{m}. The value to | |
4303 initialize the memory with is the third operand. Targets that only support the | |
4304 clearing of memory should reject any value that is not the constant 0. See | |
4305 @samp{movmem@var{m}} for a discussion of the choice of mode. | |
4306 | |
4307 The fourth operand is the known alignment of the destination, in the form | |
4308 of a @code{const_int} rtx. Thus, if the compiler knows that the | |
4309 destination is word-aligned, it may provide the value 4 for this | |
4310 operand. | |
4311 | |
4312 Optional operands 5 and 6 specify the expected alignment and size of the | |
4313 block, respectively. Unlike the alignment in operand 4, the expected | |
4314 alignment is only a hint: the block is not required to be aligned to it in | |
4315 all cases. The expected alignment is also in bytes, just like operand 4. | |
4316 The expected size, when unknown, is set to @code{(const_int -1)}. | |
4317 | |
4318 Multiple @code{setmem@var{m}} patterns are useful in the same way as for @code{movmem@var{m}}. | |
4319 | |
4320 @cindex @code{cmpstrn@var{m}} instruction pattern | |
4321 @item @samp{cmpstrn@var{m}} | |
4322 String compare instruction, with five operands. Operand 0 is the output; | |
4323 it has mode @var{m}. The remaining four operands are like the operands | |
4324 of @samp{movmem@var{m}}. The two memory blocks specified are compared | |
4325 byte by byte in lexicographic order starting at the beginning of each | |
4326 string. The instruction is not allowed to prefetch more than one byte | |
4327 at a time since either string may end in the first byte and reading past | |
4328 that may access an invalid page or segment and cause a fault. The | |
4329 effect of the instruction is to store a value in operand 0 whose sign | |
4330 indicates the result of the comparison. | |
4331 | |
4332 @cindex @code{cmpstr@var{m}} instruction pattern | |
4333 @item @samp{cmpstr@var{m}} | |
4334 String compare instruction, without known maximum length. Operand 0 is the | |
4335 output; it has mode @var{m}. The second and third operands are the blocks of | |
4336 memory to be compared; both are @code{mem:BLK} with an address in mode | |
4337 @code{Pmode}. | |
4338 | |
4339 The fourth operand is the known shared alignment of the source and | |
4340 destination, in the form of a @code{const_int} rtx. Thus, if the | |
4341 compiler knows that both source and destination are word-aligned, | |
4342 it may provide the value 4 for this operand. | |
4343 | |
4344 The two memory blocks specified are compared byte by byte in lexicographic | |
4345 order starting at the beginning of each string. The instruction is not allowed | |
4346 to prefetch more than one byte at a time since either string may end in the | |
4347 first byte and reading past that may access an invalid page or segment and | |
4348 cause a fault. The effect of the instruction is to store a value in operand 0 | |
4349 whose sign indicates the result of the comparison. | |
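
A byte-at-a-time comparison that never reads beyond the first terminator
or the first difference could look like this in C. The sketch is
illustrative only and @code{cmpstr_model} is a hypothetical name.

```c
/* Hypothetical model of cmpstr: compare one byte at a time and stop at
   the first difference or at a NUL, never reading past either one.  */
int
cmpstr_model (const char *s1, const char *s2)
{
  const unsigned char *p = (const unsigned char *) s1;
  const unsigned char *q = (const unsigned char *) s2;

  while (*p != 0 && *p == *q)
    {
      p++;
      q++;
    }
  return (int) *p - (int) *q;   /* Only the sign is meaningful.  */
}
```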
4350 | |
4351 @cindex @code{cmpmem@var{m}} instruction pattern | |
4352 @item @samp{cmpmem@var{m}} | |
4353 Block compare instruction, with five operands like the operands | |
4354 of @samp{cmpstr@var{m}}. The two memory blocks specified are compared | |
4355 byte by byte in lexicographic order starting at the beginning of each | |
4356 block. Unlike @samp{cmpstr@var{m}} the instruction can prefetch | |
4357 any bytes in the two memory blocks. The effect of the instruction is | |
4358 to store a value in operand 0 whose sign indicates the result of the | |
4359 comparison. | |
4360 | |
4361 @cindex @code{strlen@var{m}} instruction pattern | |
4362 @item @samp{strlen@var{m}} | |
4363 Compute the length of a string, with four operands. | |
4364 Operand 0 is the result (of mode @var{m}), operand 1 is | |
4365 a @code{mem} referring to the first character of the string, | |
4366 operand 2 is the character to search for (normally zero), | |
4367 and operand 3 is a constant describing the known alignment | |
4368 of the beginning of the string. | |
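
In C terms, the role of the search character might be sketched as
follows. The function name is hypothetical, and the alignment operand is
not needed by this simple model.

```c
#include <stddef.h>

/* Hypothetical model of strlen<m>: count the characters before the
   first occurrence of the search character (operand 2, normally
   zero).  The caller must ensure that the character is present.  */
size_t
strlen_model (const char *str, char search)
{
  size_t len = 0;

  while (str[len] != search)
    len++;
  return len;
}
```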
4369 | |
4370 @cindex @code{float@var{mn}2} instruction pattern | |
4371 @item @samp{float@var{m}@var{n}2} | |
4372 Convert signed integer operand 1 (valid for fixed point mode @var{m}) to | |
4373 floating point mode @var{n} and store in operand 0 (which has mode | |
4374 @var{n}). | |
4375 | |
4376 @cindex @code{floatuns@var{mn}2} instruction pattern | |
4377 @item @samp{floatuns@var{m}@var{n}2} | |
4378 Convert unsigned integer operand 1 (valid for fixed point mode @var{m}) | |
4379 to floating point mode @var{n} and store in operand 0 (which has mode | |
4380 @var{n}). | |
4381 | |
4382 @cindex @code{fix@var{mn}2} instruction pattern | |
4383 @item @samp{fix@var{m}@var{n}2} | |
4384 Convert operand 1 (valid for floating point mode @var{m}) to fixed | |
4385 point mode @var{n} as a signed number and store in operand 0 (which | |
4386 has mode @var{n}). This instruction's result is defined only when | |
4387 the value of operand 1 is an integer. | |
4388 | |
4389 If the machine description defines this pattern, it also needs to | |
4390 define the @code{ftrunc} pattern. | |
4391 | |
4392 @cindex @code{fixuns@var{mn}2} instruction pattern | |
4393 @item @samp{fixuns@var{m}@var{n}2} | |
4394 Convert operand 1 (valid for floating point mode @var{m}) to fixed | |
4395 point mode @var{n} as an unsigned number and store in operand 0 (which | |
4396 has mode @var{n}). This instruction's result is defined only when the | |
4397 value of operand 1 is an integer. | |
4398 | |
4399 @cindex @code{ftrunc@var{m}2} instruction pattern | |
4400 @item @samp{ftrunc@var{m}2} | |
4401 Convert operand 1 (valid for floating point mode @var{m}) to an | |
4402 integer value, still represented in floating point mode @var{m}, and | |
4403 store it in operand 0 (valid for floating point mode @var{m}). | |
4404 | |
4405 @cindex @code{fix_trunc@var{mn}2} instruction pattern | |
4406 @item @samp{fix_trunc@var{m}@var{n}2} | |
4407 Like @samp{fix@var{m}@var{n}2} but works for any floating point value | |
4408 of mode @var{m} by converting the value to an integer. | |
4409 | |
4410 @cindex @code{fixuns_trunc@var{mn}2} instruction pattern | |
4411 @item @samp{fixuns_trunc@var{m}@var{n}2} | |
4412 Like @samp{fixuns@var{m}@var{n}2} but works for any floating point | |
4413 value of mode @var{m} by converting the value to an integer. | |
4414 | |
4415 @cindex @code{trunc@var{mn}2} instruction pattern | |
4416 @item @samp{trunc@var{m}@var{n}2} | |
4417 Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and | |
4418 store in operand 0 (which has mode @var{n}). Both modes must be fixed | |
4419 point or both floating point. | |
4420 | |
4421 @cindex @code{extend@var{mn}2} instruction pattern | |
4422 @item @samp{extend@var{m}@var{n}2} | |
4423 Sign-extend operand 1 (valid for mode @var{m}) to mode @var{n} and | |
4424 store in operand 0 (which has mode @var{n}). Both modes must be fixed | |
4425 point or both floating point. | |
4426 | |
4427 @cindex @code{zero_extend@var{mn}2} instruction pattern | |
4428 @item @samp{zero_extend@var{m}@var{n}2} | |
4429 Zero-extend operand 1 (valid for mode @var{m}) to mode @var{n} and | |
4430 store in operand 0 (which has mode @var{n}). Both modes must be fixed | |
4431 point. | |
4432 | |
4433 @cindex @code{fract@var{mn}2} instruction pattern | |
4434 @item @samp{fract@var{m}@var{n}2} | |
4435 Convert operand 1 of mode @var{m} to mode @var{n} and store in | |
4436 operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n} | |
4437 could be fixed-point to fixed-point, signed integer to fixed-point, | |
4438 fixed-point to signed integer, floating-point to fixed-point, | |
4439 or fixed-point to floating-point. | |
4440 When overflows or underflows happen, the results are undefined. | |
4441 | |
4442 @cindex @code{satfract@var{mn}2} instruction pattern | |
4443 @item @samp{satfract@var{m}@var{n}2} | |
4444 Convert operand 1 of mode @var{m} to mode @var{n} and store in | |
4445 operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n} | |
4446 could be fixed-point to fixed-point, signed integer to fixed-point, | |
4447 or floating-point to fixed-point. | |
4448 When overflows or underflows happen, the instruction saturates the | |
4449 results to the maximum or the minimum. | |
4450 | |
4451 @cindex @code{fractuns@var{mn}2} instruction pattern | |
4452 @item @samp{fractuns@var{m}@var{n}2} | |
4453 Convert operand 1 of mode @var{m} to mode @var{n} and store in | |
4454 operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n} | |
4455 could be unsigned integer to fixed-point, or | |
4456 fixed-point to unsigned integer. | |
4457 When overflows or underflows happen, the results are undefined. | |
4458 | |
4459 @cindex @code{satfractuns@var{mn}2} instruction pattern | |
4460 @item @samp{satfractuns@var{m}@var{n}2} | |
4461 Convert unsigned integer operand 1 of mode @var{m} to fixed-point mode | |
4462 @var{n} and store in operand 0 (which has mode @var{n}). | |
4463 When overflows or underflows happen, the instruction saturates the | |
4464 results to the maximum or the minimum. | |
4465 | |
4466 @cindex @code{extv} instruction pattern | |
4467 @item @samp{extv} | |
4468 Extract a bit-field from operand 1 (a register or memory operand), where | |
4469 operand 2 specifies the width in bits and operand 3 the starting bit, | |
4470 and store it in operand 0. Operand 0 must have mode @code{word_mode}. | |
4471 Operand 1 may have mode @code{byte_mode} or @code{word_mode}; often | |
4472 @code{word_mode} is allowed only for registers. Operands 2 and 3 must | |
4473 be valid for @code{word_mode}. | |
4474 | |
4475 The RTL generation pass generates this instruction only with constants | |
4476 for operands 2 and 3 and the constant is never zero for operand 2. | |
4477 | |
4478 The bit-field value is sign-extended to a full word integer | |
4479 before it is stored in operand 0. | |
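
As a rough sketch, a port whose @code{word_mode} is @code{SImode} and
that can extract bit-fields only from registers might write the pattern
along the following lines.  The predicates, constraints and the elided
output template are illustrative assumptions, not taken from any
particular port:

@smallexample
;; Hypothetical sketch: assumes word_mode is SImode and
;; register-only extraction.
(define_insn "extv"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (sign_extract:SI (match_operand:SI 1 "register_operand" "r")
                         (match_operand:SI 2 "const_int_operand" "n")
                         (match_operand:SI 3 "const_int_operand" "n")))]
  ""
  "@dots{}")
@end smallexample

Here operand 2 is the width and operand 3 the starting bit, matching the
size and position arguments of @code{sign_extract}.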
4480 | |
4481 @cindex @code{extzv} instruction pattern | |
4482 @item @samp{extzv} | |
4483 Like @samp{extv} except that the bit-field value is zero-extended. | |
4484 | |
4485 @cindex @code{insv} instruction pattern | |
4486 @item @samp{insv} | |
4487 Store operand 3 (which must be valid for @code{word_mode}) into a | |
4488 bit-field in operand 0, where operand 1 specifies the width in bits and | |
4489 operand 2 the starting bit. Operand 0 may have mode @code{byte_mode} or | |
4490 @code{word_mode}; often @code{word_mode} is allowed only for registers. | |
4491 Operands 1 and 2 must be valid for @code{word_mode}. | |
4492 | |
4493 The RTL generation pass generates this instruction only with constants | |
4494 for operands 1 and 2 and the constant is never zero for operand 1. | |
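
A minimal sketch of such a pattern, again assuming that @code{word_mode}
is @code{SImode} and that only registers are accepted, could look like
this.  The predicates, constraints and output template are placeholders
chosen for illustration:

@smallexample
;; Hypothetical sketch: assumes word_mode is SImode.
;; Operand 0 is read and written, hence the "+" constraint modifier.
(define_insn "insv"
  [(set (zero_extract:SI (match_operand:SI 0 "register_operand" "+r")
                         (match_operand:SI 1 "const_int_operand" "n")
                         (match_operand:SI 2 "const_int_operand" "n"))
        (match_operand:SI 3 "register_operand" "r"))]
  ""
  "@dots{}")
@end smallexample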
4495 | |
4496 @cindex @code{mov@var{mode}cc} instruction pattern | |
4497 @item @samp{mov@var{mode}cc} | |
4498 Conditionally move operand 2 or operand 3 into operand 0 according to the | |
4499 comparison in operand 1. If the comparison is true, operand 2 is moved | |
4500 into operand 0, otherwise operand 3 is moved. | |
4501 | |
4502 The mode of the operands being compared need not be the same as the operands | |
4503 being moved. Some machines, sparc64 for example, have instructions that | |
4504 conditionally move an integer value based on the floating point condition | |
4505 codes and vice versa. | |
4506 | |
4507 If the machine does not have conditional move instructions, do not | |
4508 define these patterns. | |
4509 | |
4510 @cindex @code{add@var{mode}cc} instruction pattern | |
4511 @item @samp{add@var{mode}cc} | |
4512 Similar to @samp{mov@var{mode}cc} but for conditional addition. Conditionally | |
4513 move operand 2 or (operand 2 + operand 3) into operand 0 according to the | |
4514 comparison in operand 1. If the comparison is false, operand 2 is moved into | |
4515 operand 0, otherwise (operand 2 + operand 3) is moved. | |
4516 | |
4517 @cindex @code{s@var{cond}} instruction pattern | |
4518 @item @samp{s@var{cond}} | |
4519 Store zero or nonzero in the operand according to the condition codes. | |
4520 The value stored is nonzero if and only if the condition @var{cond} is true. | |
4521 @var{cond} is the name of a comparison operation expression code, such | |
4522 as @code{eq}, @code{lt} or @code{leu}. | |
4523 | |
4524 You specify the mode that the operand must have when you write the | |
4525 @code{match_operand} expression. The compiler automatically sees | |
4526 which mode you have used and supplies an operand of that mode. | |
4527 | |
4528 The value stored for a true condition must have 1 as its low bit, or | |
4529 else must be negative. Otherwise the instruction is not suitable and | |
4530 you should omit it from the machine description. You describe to the | |
4531 compiler exactly which value is stored by defining the macro | |
4532 @code{STORE_FLAG_VALUE} (@pxref{Misc}). If a description cannot be | |
4533 found that can be used for all the @samp{s@var{cond}} patterns, you | |
4534 should omit those operations from the machine description. | |
4535 | |
4536 These operations may fail, but should do so only in relatively | |
4537 uncommon cases; if they would fail for common cases involving | |
4538 integer comparisons, it is best to omit these patterns. | |
4539 | |
4540 If these operations are omitted, the compiler will usually generate code | |
4541 that copies the constant one to the target and branches around an | |
4542 assignment of zero to the target. If this code is more efficient than | |
4543 the potential instructions used for the @samp{s@var{cond}} pattern | |
4544 followed by those required to convert the result into a 1 or a zero in | |
4545 @code{SImode}, you should omit the @samp{s@var{cond}} operations from | |
4546 the machine description. | |
4547 | |
4548 @cindex @code{b@var{cond}} instruction pattern | |
4549 @item @samp{b@var{cond}} | |
4550 Conditional branch instruction. Operand 0 is a @code{label_ref} that | |
4551 refers to the label to jump to. Jump if the condition codes meet | |
4552 condition @var{cond}. | |
4553 | |
4554 Some machines do not follow the model assumed here where a comparison | |
4555 instruction is followed by a conditional branch instruction. In that | |
4556 case, the @samp{cmp@var{m}} (and @samp{tst@var{m}}) patterns should | |
4557 simply store the operands away and generate all the required insns in a | |
4558 @code{define_expand} (@pxref{Expander Definitions}) for the conditional | |
4559 branch operations. All calls to expand @samp{b@var{cond}} patterns are | |
4560 immediately preceded by calls to expand either a @samp{cmp@var{m}} | |
4561 pattern or a @samp{tst@var{m}} pattern. | |
4562 | |
4563 Machines that use a pseudo register for the condition code value, or | |
4564 where the mode used for the comparison depends on the condition being | |
4565 tested, should also use the above mechanism. @xref{Jump Patterns}. | |
4566 | |
4567 The above discussion also applies to the @samp{mov@var{mode}cc} and | |
4568 @samp{s@var{cond}} patterns. | |
4569 | |
4570 @cindex @code{cbranch@var{mode}4} instruction pattern | |
4571 @item @samp{cbranch@var{mode}4} | |
4572 Conditional branch instruction combined with a compare instruction. | |
4573 Operand 0 is a comparison operator. Operand 1 and operand 2 are the | |
4574 first and second operands of the comparison, respectively. Operand 3 | |
4575 is a @code{label_ref} that refers to the label to jump to. | |
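
For a machine with a single compare-and-branch instruction on
@code{SImode} registers, the pattern might be sketched as follows.  The
mode, predicates, constraints and output template are assumptions made
for the sake of the example:

@smallexample
;; Hypothetical sketch: compare two SImode registers and branch.
(define_insn "cbranchsi4"
  [(set (pc)
        (if_then_else (match_operator 0 "comparison_operator"
                        [(match_operand:SI 1 "register_operand" "r")
                         (match_operand:SI 2 "register_operand" "r")])
                      (label_ref (match_operand 3 "" ""))
                      (pc)))]
  ""
  "@dots{}")
@end smallexample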
4576 | |
4577 @cindex @code{jump} instruction pattern | |
4578 @item @samp{jump} | |
4579 A jump inside a function; an unconditional branch. Operand 0 is the | |
4580 @code{label_ref} of the label to jump to. This pattern name is mandatory | |
4581 on all machines. | |
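
The RTL for this pattern is usually very simple.  A minimal sketch, with
the output template left elided because it is target-specific, is:

@smallexample
;; Sketch only: the assembler template depends on the target.
(define_insn "jump"
  [(set (pc) (label_ref (match_operand 0 "" "")))]
  ""
  "@dots{}")
@end smallexample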
4582 | |
4583 @cindex @code{call} instruction pattern | |
4584 @item @samp{call} | |
4585 Subroutine call instruction returning no value. Operand 0 is the | |
4586 function to call; operand 1 is the number of bytes of arguments pushed | |
4587 as a @code{const_int}; operand 2 is the number of registers used as | |
4588 operands. | |
4589 | |
4590 On most machines, operand 2 is not actually stored into the RTL | |
4591 pattern. It is supplied for the sake of some RISC machines which need | |
4592 to put this information into the assembler code; they can put it in | |
4593 the RTL instead of operand 1. | |
4594 | |
4595 Operand 0 should be a @code{mem} RTX whose address is the address of the | |
4596 function. Note, however, that this address can be a @code{symbol_ref} | |
4597 expression even if it would not be a legitimate memory address on the | |
4598 target machine. If it is also not a valid argument for a call | |
4599 instruction, the pattern for this operation should be a | |
4600 @code{define_expand} (@pxref{Expander Definitions}) that places the | |
4601 address into a register and uses that register in the call instruction. | |
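
A simple form of this pattern, which does not record operand 2 in the
RTL, might look like the sketch below.  The mode @code{QImode} on the
@code{mem}, the predicates and the constraints are illustrative
assumptions; a real port uses whatever its call instruction accepts:

@smallexample
;; Hypothetical sketch: operand 2 is accepted but not stored
;; in the RTL.
(define_insn "call"
  [(call (match_operand:QI 0 "memory_operand" "m")
         (match_operand:SI 1 "general_operand" "g"))]
  ""
  "@dots{}")
@end smallexample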
4602 | |
4603 @cindex @code{call_value} instruction pattern | |
4604 @item @samp{call_value} | |
4605 Subroutine call instruction returning a value. Operand 0 is the hard | |
4606 register in which the value is returned. There are three more | |
4607 operands, the same as the three operands of the @samp{call} | |
4608 instruction (but with numbers increased by one). | |
4609 | |
4610 Subroutines that return @code{BLKmode} objects use the @samp{call} | |
4611 insn. | |
4612 | |
4613 @cindex @code{call_pop} instruction pattern | |
4614 @cindex @code{call_value_pop} instruction pattern | |
4615 @item @samp{call_pop}, @samp{call_value_pop} | |
4616 Similar to @samp{call} and @samp{call_value}, except that they are used, | |
4617 if defined, when @code{RETURN_POPS_ARGS} is nonzero. They should emit a @code{parallel} | |
4618 that contains both the function call and a @code{set} to indicate the | |
4619 adjustment made to the frame pointer. | |
4620 | |
4621 For machines where @code{RETURN_POPS_ARGS} can be nonzero, the use of these | |
4622 patterns increases the number of functions for which the frame pointer | |
4623 can be eliminated, if desired. | |
4624 | |
4625 @cindex @code{untyped_call} instruction pattern | |
4626 @item @samp{untyped_call} | |
4627 Subroutine call instruction returning a value of any type. Operand 0 is | |
4628 the function to call; operand 1 is a memory location where the result of | |
4629 calling the function is to be stored; operand 2 is a @code{parallel} | |
4630 expression where each element is a @code{set} expression that indicates | |
4631 the saving of a function return value into the result block. | |
4632 | |
4633 This instruction pattern should be defined to support | |
4634 @code{__builtin_apply} on machines where special instructions are needed | |
4635 to call a subroutine with arbitrary arguments or to save the value | |
4636 returned. This instruction pattern is required on machines that have | |
4637 multiple registers that can hold a return value | |
4638 (i.e.@: @code{FUNCTION_VALUE_REGNO_P} is true for more than one register). | |
4639 | |
4640 @cindex @code{return} instruction pattern | |
4641 @item @samp{return} | |
4642 Subroutine return instruction. This instruction pattern name should be | |
4643 defined only if a single instruction can do all the work of returning | |
4644 from a function. | |
4645 | |
4646 Like the @samp{mov@var{m}} patterns, this pattern is also used after the | |
4647 RTL generation phase. In this case it is to support machines where | |
4648 multiple instructions are usually needed to return from a function, but | |
4649 some class of functions only requires one instruction to implement a | |
4650 return. Normally, the applicable functions are those which do not need | |
4651 to save any registers or allocate stack space. | |
4652 | |
4653 @findex reload_completed | |
4654 @findex leaf_function_p | |
4655 For such machines, the condition specified in this pattern should only | |
4656 be true when @code{reload_completed} is nonzero and the function's | |
4657 epilogue would only be a single instruction. For machines with register | |
4658 windows, the routine @code{leaf_function_p} may be used to determine if | |
4659 a register window push is required. | |
4660 | |
4661 Machines that have conditional return instructions should define patterns | |
4662 such as | |
4663 | |
4664 @smallexample | |
4665 (define_insn "" | |
4666 [(set (pc) | |
4667 (if_then_else (match_operator | |
4668 0 "comparison_operator" | |
4669 [(cc0) (const_int 0)]) | |
4670 (return) | |
4671 (pc)))] | |
4672 "@var{condition}" | |
4673 "@dots{}") | |
4674 @end smallexample | |
4675 | |
4676 where @var{condition} would normally be the same condition specified on the | |
4677 named @samp{return} pattern. | |
4678 | |
4679 @cindex @code{untyped_return} instruction pattern | |
4680 @item @samp{untyped_return} | |
4681 Untyped subroutine return instruction. This instruction pattern should | |
4682 be defined to support @code{__builtin_return} on machines where special | |
4683 instructions are needed to return a value of any type. | |
4684 | |
4685 Operand 0 is a memory location where the result of calling a function | |
4686 with @code{__builtin_apply} is stored; operand 1 is a @code{parallel} | |
4687 expression where each element is a @code{set} expression that indicates | |
4688 the restoring of a function return value from the result block. | |
4689 | |
4690 @cindex @code{nop} instruction pattern | |
4691 @item @samp{nop} | |
4692 No-op instruction. This instruction pattern name should always be defined | |
4693 to output a no-op in assembler code. @code{(const_int 0)} will do as an | |
4694 RTL pattern. | |
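
For instance, assuming the assembler mnemonic for a no-op is simply
@samp{nop}, the whole pattern could be sketched as:

@smallexample
;; Sketch: the "nop" mnemonic is an assumption about the assembler.
(define_insn "nop"
  [(const_int 0)]
  ""
  "nop")
@end smallexample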
4695 | |
4696 @cindex @code{indirect_jump} instruction pattern | |
4697 @item @samp{indirect_jump} | |
4698 An instruction to jump to an address which is operand zero. | |
4699 This pattern name is mandatory on all machines. | |
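
A minimal sketch for a machine that can jump only through a register,
and on which @code{Pmode} is assumed to be @code{SImode}, is:

@smallexample
;; Hypothetical sketch: assumes Pmode is SImode and a
;; register-only indirect jump.
(define_insn "indirect_jump"
  [(set (pc) (match_operand:SI 0 "register_operand" "r"))]
  ""
  "@dots{}")
@end smallexample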
4700 | |
4701 @cindex @code{casesi} instruction pattern | |
4702 @item @samp{casesi} | |
4703 Instruction to jump through a dispatch table, including bounds checking. | |
4704 This instruction takes five operands: | |
4705 | |
4706 @enumerate | |
4707 @item | |
4708 The index to dispatch on, which has mode @code{SImode}. | |
4709 | |
4710 @item | |
4711 The lower bound for indices in the table, an integer constant. | |
4712 | |
4713 @item | |
4714 The total range of indices in the table---the largest index | |
4715 minus the smallest one (both inclusive). | |
4716 | |
4717 @item | |
4718 A label that precedes the table itself. | |
4719 | |
4720 @item | |
4721 A label to jump to if the index has a value outside the bounds. | |
4722 @end enumerate | |
4723 | |
4724 The table is an @code{addr_vec} or @code{addr_diff_vec} inside of a | |
4725 @code{jump_insn}. The number of elements in the table is one plus the | |
4726 difference between the upper bound and the lower bound. | |
4727 | |
4728 @cindex @code{tablejump} instruction pattern | |
4729 @item @samp{tablejump} | |
4730 Instruction to jump to a variable address. This is a low-level | |
4731 capability which can be used to implement a dispatch table when there | |
4732 is no @samp{casesi} pattern. | |
4733 | |
4734 This pattern requires two operands: the address or offset, and a label | |
4735 which should immediately precede the jump table. If the macro | |
4736 @code{CASE_VECTOR_PC_RELATIVE} evaluates to a nonzero value then the first | |
4737 operand is an offset which counts from the address of the table; otherwise, | |
4738 it is an absolute address to jump to. In either case, the first operand has | |
4739 mode @code{Pmode}. | |
4740 | |
4741 The @samp{tablejump} insn is always the last insn before the jump | |
4742 table it uses. Its assembler code normally has no need to use the | |
4743 second operand, but you should incorporate it in the RTL pattern so | |
4744 that the jump optimizer will not delete the table as unreachable code. | |
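
One common way to keep the label in the RTL is a @code{use} of it
alongside the jump.  The following is a sketch under the assumptions
that @code{Pmode} is @code{SImode} and that the first operand is an
absolute address held in a register:

@smallexample
;; Hypothetical sketch: operand 1 appears only in a `use' so that
;; the jump table remains reachable.
(define_insn "tablejump"
  [(set (pc) (match_operand:SI 0 "register_operand" "r"))
   (use (label_ref (match_operand 1 "" "")))]
  ""
  "@dots{}")
@end smallexample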
4745 | |
4746 | |
4747 @cindex @code{decrement_and_branch_until_zero} instruction pattern | |
4748 @item @samp{decrement_and_branch_until_zero} | |
4749 Conditional branch instruction that decrements a register and | |
4750 jumps if the register is nonzero. Operand 0 is the register to | |
4751 decrement and test; operand 1 is the label to jump to if the | |
4752 register is nonzero. @xref{Looping Patterns}. | |
4753 | |
4754 This optional instruction pattern is only used by the combiner, | |
4755 typically for loops reversed by the loop optimizer when strength | |
4756 reduction is enabled. | |
4757 | |
4758 @cindex @code{doloop_end} instruction pattern | |
4759 @item @samp{doloop_end} | |
4760 Conditional branch instruction that decrements a register and jumps if | |
4761 the register is nonzero. This instruction takes five operands: Operand | |
4762 0 is the register to decrement and test; operand 1 is the number of loop | |
4763 iterations as a @code{const_int} or @code{const0_rtx} if this cannot be | |
4764 determined until run-time; operand 2 is the actual or estimated maximum | |
4765 number of iterations as a @code{const_int}; operand 3 is the number of | |
4766 enclosed loops as a @code{const_int} (an innermost loop has a value of | |
4767 1); operand 4 is the label to jump to if the register is nonzero. | |
4768 @xref{Looping Patterns}. | |
4769 | |
4770 This optional instruction pattern should be defined for machines with | |
4771 low-overhead looping instructions as the loop optimizer will try to | |
4772 modify suitable loops to utilize it. If nested low-overhead looping is | |
4773 not supported, use a @code{define_expand} (@pxref{Expander Definitions}) | |
4774 and make the pattern fail if operand 3 is not @code{const1_rtx}. | |
4775 Similarly, if the actual or estimated maximum number of iterations is | |
4776 too large for this instruction, make it fail. | |
4777 | |
4778 @cindex @code{doloop_begin} instruction pattern | |
4779 @item @samp{doloop_begin} | |
4780 Companion instruction to @code{doloop_end} required for machines that | |
4781 need to perform some initialization, such as loading special registers | |
4782 used by a low-overhead looping instruction. If initialization insns do | |
4783 not always need to be emitted, use a @code{define_expand} | |
4784 (@pxref{Expander Definitions}) and make it fail. | |
4785 | |
4786 | |
4787 @cindex @code{canonicalize_funcptr_for_compare} instruction pattern | |
4788 @item @samp{canonicalize_funcptr_for_compare} | |
4789 Canonicalize the function pointer in operand 1 and store the result | |
4790 into operand 0. | |
4791 | |
4792 Operand 0 is always a @code{reg} and has mode @code{Pmode}; operand 1 | |
4793 may be a @code{reg}, @code{mem}, @code{symbol_ref}, @code{const_int}, etc.@: | |
4794 and also has mode @code{Pmode}. | |
4795 | |
4796 Canonicalization of a function pointer usually involves computing | |
4797 the address of the function which would be called if the function | |
4798 pointer were used in an indirect call. | |
4799 | |
4800 Only define this pattern if function pointers on the target machine | |
4801 can have different values but still call the same function when | |
4802 used in an indirect call. | |
4803 | |
4804 @cindex @code{save_stack_block} instruction pattern | |
4805 @cindex @code{save_stack_function} instruction pattern | |
4806 @cindex @code{save_stack_nonlocal} instruction pattern | |
4807 @cindex @code{restore_stack_block} instruction pattern | |
4808 @cindex @code{restore_stack_function} instruction pattern | |
4809 @cindex @code{restore_stack_nonlocal} instruction pattern | |
4810 @item @samp{save_stack_block} | |
4811 @itemx @samp{save_stack_function} | |
4812 @itemx @samp{save_stack_nonlocal} | |
4813 @itemx @samp{restore_stack_block} | |
4814 @itemx @samp{restore_stack_function} | |
4815 @itemx @samp{restore_stack_nonlocal} | |
4816 Most machines save and restore the stack pointer by copying it to or | |
4817 from an object of mode @code{Pmode}. Do not define these patterns on | |
4818 such machines. | |
4819 | |
4820 Some machines require special handling for stack pointer saves and | |
4821 restores. On those machines, define the patterns corresponding to the | |
4822 non-standard cases by using a @code{define_expand} (@pxref{Expander | |
4823 Definitions}) that produces the required insns. The three types of | |
4824 saves and restores are: | |
4825 | |
4826 @enumerate | |
4827 @item | |
4828 @samp{save_stack_block} saves the stack pointer at the start of a block | |
4829 that allocates a variable-sized object, and @samp{restore_stack_block} | |
4830 restores the stack pointer when the block is exited. | |
4831 | |
4832 @item | |
4833 @samp{save_stack_function} and @samp{restore_stack_function} do a | |
4834 similar job for the outermost block of a function and are used when the | |
4835 function allocates variable-sized objects or calls @code{alloca}. Only | |
4836 the epilogue uses the restored stack pointer, allowing a simpler save or | |
4837 restore sequence on some machines. | |
4838 | |
4839 @item | |
4840 @samp{save_stack_nonlocal} is used in functions that contain labels | |
4841 branched to by nested functions. It saves the stack pointer in such a | |
4842 way that the inner function can use @samp{restore_stack_nonlocal} to | |
4843 restore the stack pointer. The compiler generates code to restore the | |
4844 frame and argument pointer registers, but some machines require saving | |
4845 and restoring additional data such as register window information or | |
4846 stack backchains. Place insns in these patterns to save and restore any | |
4847 such required data. | |
4848 @end enumerate | |
4849 | |
4850 When saving the stack pointer, operand 0 is the save area and operand 1 | |
4851 is the stack pointer. The mode used to allocate the save area defaults | |
4852 to @code{Pmode} but you can override that choice by defining the | |
4853 @code{STACK_SAVEAREA_MODE} macro (@pxref{Storage Layout}). You must | |
4854 specify an integral mode, or @code{VOIDmode} if no save area is needed | |
4855 for a particular type of save (either because no save is needed or | |
4856 because a machine-specific save area can be used). Operand 0 is the | |
4857 stack pointer and operand 1 is the save area for restore operations. If | |
4858 @samp{save_stack_block} is defined, operand 0 must not be | |
4859 @code{VOIDmode} since these saves can be arbitrarily nested. | |
4860 | |
4861 A save area is a @code{mem} that is at a constant offset from | |
4862 @code{virtual_stack_vars_rtx} when the stack pointer is saved for use by | |
4863 nonlocal gotos and a @code{reg} in the other two cases. | |
4864 | |
4865 @cindex @code{allocate_stack} instruction pattern | |
4866 @item @samp{allocate_stack} | |
4867 Subtract (or add if @code{STACK_GROWS_DOWNWARD} is undefined) operand 1 from | |
4868 the stack pointer to create space for dynamically allocated data. | |
4869 | |
4870 Store the resultant pointer to this space into operand 0. If you | |
4871 are allocating space from the main stack, do this by emitting a | |
4872 move insn to copy @code{virtual_stack_dynamic_rtx} to operand 0. | |
4873 If you are allocating the space elsewhere, generate code to copy the | |
4874 location of the space to operand 0. In the latter case, you must | |
4875 ensure this space gets freed when the corresponding space on the main | |
4876 stack is free. | |
4877 | |
4878 Do not define this pattern if all that must be done is the subtraction. | |
4879 Some machines require other operations such as stack probes or | |
4880 maintaining the back chain. Define this pattern to emit those | |
4881 operations in addition to updating the stack pointer. | |
4882 | |
4883 @cindex @code{check_stack} instruction pattern | |
4884 @item @samp{check_stack} | |
4885 If stack checking cannot be done on your system by probing the stack with | |
4886 a load or store instruction (@pxref{Stack Checking}), define this pattern | |
4887 to perform the needed check and signal an error if the stack | |
4888 has overflowed. The single operand is the location in the stack furthest | |
4889 from the current stack pointer that you need to validate. Normally, | |
4890 on machines where this pattern is needed, you would obtain the stack | |
4891 limit from a global or thread-specific variable or register. | |
4892 | |
4893 @cindex @code{nonlocal_goto} instruction pattern | |
4894 @item @samp{nonlocal_goto} | |
4895 Emit code to generate a nonlocal goto, e.g., a jump from one function | |
4896 to a label in an outer function. This pattern has four arguments, | |
4897 each representing a value to be used in the jump. The first | |
4898 argument is to be loaded into the frame pointer, the second is | |
4899 the address to branch to (code to dispatch to the actual label), | |
4900 the third is the address of a location where the stack is saved, | |
4901 and the last is the address of the label, to be placed in the | |
4902 location for the incoming static chain. | |
4903 | |
4904 On most machines you need not define this pattern, since GCC will | |
4905 already generate the correct code, which is to load the frame pointer | |
4906 and static chain, restore the stack (using the | |
4907 @samp{restore_stack_nonlocal} pattern, if defined), and jump indirectly | |
4908 to the dispatcher. You need only define this pattern if this code will | |
4909 not work on your machine. | |
4910 | |
4911 @cindex @code{nonlocal_goto_receiver} instruction pattern | |
4912 @item @samp{nonlocal_goto_receiver} | |
4913 This pattern, if defined, contains code needed at the target of a | |
4914 nonlocal goto after the code already generated by GCC@. You will not | |
4915 normally need to define this pattern. A typical reason why you might | |
4916 need this pattern is if some value, such as a pointer to a global table, | |
4917 must be restored when the frame pointer is restored. Note that a nonlocal | |
4918 goto only occurs within a unit-of-translation, so a global table pointer | |
4919 that is shared by all functions of a given module need not be restored. | |
4920 There are no arguments. | |
4921 | |
4922 @cindex @code{exception_receiver} instruction pattern | |
4923 @item @samp{exception_receiver} | |
4924 This pattern, if defined, contains code needed at the site of an | |
4925 exception handler that isn't needed at the site of a nonlocal goto. You | |
4926 will not normally need to define this pattern. A typical reason why you | |
4927 might need this pattern is if some value, such as a pointer to a global | |
4928 table, must be restored after control flow is branched to the handler of | |
4929 an exception. There are no arguments. | |
4930 | |
4931 @cindex @code{builtin_setjmp_setup} instruction pattern | |
4932 @item @samp{builtin_setjmp_setup} | |
4933 This pattern, if defined, contains additional code needed to initialize | |
4934 the @code{jmp_buf}. You will not normally need to define this pattern. | |
4935 A typical reason why you might need this pattern is if some value, such | |
4936 as a pointer to a global table, must be restored, though it is | |
4937 preferred that the pointer value be recalculated if possible (given the | |
4938 address of a label for instance). The single argument is a pointer to | |
4939 the @code{jmp_buf}. Note that the buffer is five words long and that | |
4940 the first three are normally used by the generic mechanism. | |
4941 | |
4942 @cindex @code{builtin_setjmp_receiver} instruction pattern | |
4943 @item @samp{builtin_setjmp_receiver} | |
4944 This pattern, if defined, contains code needed at the site of an | |
4945 built-in setjmp that isn't needed at the site of a nonlocal goto. You | |
4946 will not normally need to define this pattern. A typical reason why you | |
4947 might need this pattern is if some value, such as a pointer to a global | |
4948 table, must be restored. It takes one argument, which is the label | |
4949 to which builtin_longjmp transferred control; this pattern may be emitted | |
4950 at a small offset from that label. | |
4951 | |
4952 @cindex @code{builtin_longjmp} instruction pattern | |
4953 @item @samp{builtin_longjmp} | |
4954 This pattern, if defined, performs the entire action of the longjmp. | |
4955 You will not normally need to define this pattern unless you also define | |
4956 @code{builtin_setjmp_setup}. The single argument is a pointer to the | |
4957 @code{jmp_buf}. | |
4958 | |
4959 @cindex @code{eh_return} instruction pattern | |
4960 @item @samp{eh_return} | |
4961 This pattern, if defined, affects the way @code{__builtin_eh_return}, | |
4962 and thence the call frame exception handling library routines, are | |
4963 built. It is intended to handle non-trivial actions needed along | |
4964 the abnormal return path. | |
4965 | |
4966 The address of the exception handler to which the function should return | |
4967 is passed as the operand to this pattern. It will normally need to be copied by | |
4968 the pattern to some special register or memory location. | |
4969 If the pattern needs to determine the location of the target call | |
4970 frame in order to do so, it may use @code{EH_RETURN_STACKADJ_RTX}, | |
4971 if defined; it will have already been assigned. | |
4972 | |
4973 If this pattern is not defined, the default action will be to simply | |
4974 copy the return address to @code{EH_RETURN_HANDLER_RTX}. Either | |
4975 that macro or this pattern needs to be defined if call frame exception | |
4976 handling is to be used. | |
4977 | |
4978 @cindex @code{prologue} instruction pattern | |
4979 @anchor{prologue instruction pattern} | |
4980 @item @samp{prologue} | |
4981 This pattern, if defined, emits RTL for entry to a function. The function | |
4982 entry is responsible for setting up the stack frame, initializing the frame | |
4983 pointer register, saving callee saved registers, etc. | |
4984 | |
4985 Using a prologue pattern is generally preferred over defining | |
4986 @code{TARGET_ASM_FUNCTION_PROLOGUE} to emit assembly code for the prologue. | |
4987 | |
4988 The @code{prologue} pattern is particularly useful for targets which perform | |
4989 instruction scheduling. | |
4990 | |
4991 @cindex @code{epilogue} instruction pattern | |
4992 @anchor{epilogue instruction pattern} | |
4993 @item @samp{epilogue} | |
4994 This pattern emits RTL for exit from a function. The function | |
4995 exit is responsible for deallocating the stack frame, restoring callee saved | |
4996 registers and emitting the return instruction. | |
4997 | |
4998 Using an epilogue pattern is generally preferred over defining | |
4999 @code{TARGET_ASM_FUNCTION_EPILOGUE} to emit assembly code for the epilogue. | |
5000 | |
5001 The @code{epilogue} pattern is particularly useful for targets which perform | |
5002 instruction scheduling or which have delay slots for their return instruction. | |
5003 | |
5004 @cindex @code{sibcall_epilogue} instruction pattern | |
5005 @item @samp{sibcall_epilogue} | |
5006 This pattern, if defined, emits RTL for exit from a function without the final | |
5007 branch back to the calling function. This pattern will be emitted before any | |
5008 sibling call (aka tail call) sites. | |
5009 | |
5010 The @code{sibcall_epilogue} pattern must not clobber any arguments used for | |
5011 parameter passing or any stack slots for arguments passed to the current | |
5012 function. | |
5013 | |
5014 @cindex @code{trap} instruction pattern | |
5015 @item @samp{trap} | |
5016 This pattern, if defined, signals an error, typically by causing some | |
5017 kind of signal to be raised. Among other places, it is used by the Java | |
5018 front end to signal `invalid array index' exceptions. | |
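
An unconditional trap is typically expressed as a @code{trap_if} whose
condition is always true.  In the sketch below the trap code 0 and the
elided output template are placeholders; the meaning of the trap code is
target-specific:

@smallexample
;; Sketch: the trap code (const_int 0) is a placeholder value.
(define_insn "trap"
  [(trap_if (const_int 1) (const_int 0))]
  ""
  "@dots{}")
@end smallexample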
5019 | |
5020 @cindex @code{conditional_trap} instruction pattern | |
5021 @item @samp{conditional_trap} | |
5022 Conditional trap instruction. Operand 0 is a piece of RTL which | |
5023 performs a comparison. Operand 1 is the trap code, an integer. | |
5024 | |
5025 A typical @code{conditional_trap} pattern looks like | |
5026 | |
5027 @smallexample | |
5028 (define_insn "conditional_trap" | |
5029 [(trap_if (match_operator 0 "trap_operator" | |
5030 [(cc0) (const_int 0)]) | |
5031 (match_operand 1 "const_int_operand" "i"))] | |
5032 "" | |
5033 "@dots{}") | |
5034 @end smallexample | |
5035 | |
5036 @cindex @code{prefetch} instruction pattern | |
5037 @item @samp{prefetch} | |
5038 | |
5039 This pattern, if defined, emits code for a non-faulting data prefetch | |
5040 instruction. Operand 0 is the address of the memory to prefetch. Operand 1 | |
5041 is a constant 1 if the prefetch is preparing for a write to the memory | |
5042 address, or a constant 0 otherwise. Operand 2 is the expected degree of | |
5043 temporal locality of the data and is a value between 0 and 3, inclusive; 0 | |
5044 means that the data has no temporal locality, so it need not be left in the | |
5045 cache after the access; 3 means that the data has a high degree of temporal | |
5046 locality and should be left in all levels of cache possible; 1 and 2 mean, | |
5047 respectively, a low or moderate degree of temporal locality. | |
5048 | |
5049 Targets that do not support write prefetches or locality hints can ignore | |
5050 the values of operands 1 and 2. | |
5051 | |
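As a rough illustration of how the three operands surface at the source level, the sketch below uses GCC's @code{__builtin_prefetch}, whose second and third arguments correspond to the write flag and the locality hint described above. The function name @code{sum_with_prefetch} and the look-ahead distance @code{AHEAD} are assumptions made for this example only.

```c
#include <stddef.h>

/* Illustrative look-ahead distance; a real value would be tuned.  */
enum { AHEAD = 16 };

long
sum_with_prefetch (const long *a, size_t n)
{
  long total = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      /* Second argument 0: the access will be a read (operand 1).
         Third argument 3: high temporal locality (operand 2).  */
      if (i + AHEAD < n)
        __builtin_prefetch (&a[i + AHEAD], 0, 3);
      total += a[i];
    }
  return total;
}
```

On a target that defines no @samp{prefetch} pattern, the built-in should generate no prefetch instruction, so the function still computes the same sum.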
5052 @cindex @code{blockage} instruction pattern | |
5053 @item @samp{blockage} | |
5054 | |
5055 This pattern defines a pseudo insn that prevents the instruction | |
5056 scheduler from moving instructions across the boundary defined by the | |
5057 blockage insn. This is normally an @code{UNSPEC_VOLATILE} pattern. | |
5058 | |
5059 @cindex @code{memory_barrier} instruction pattern | |
5060 @item @samp{memory_barrier} | |
5061 | |
5062 If the target memory model is not fully synchronous, then this pattern | |
5063 should be defined to an instruction that orders both loads and stores | |
5064 before the instruction with respect to loads and stores after the instruction. | |
5065 This pattern has no operands. | |
5066 | |
5067 @cindex @code{sync_compare_and_swap@var{mode}} instruction pattern | |
5068 @item @samp{sync_compare_and_swap@var{mode}} | |
5069 | |
5070 This pattern, if defined, emits code for an atomic compare-and-swap | |
5071 operation. Operand 1 is the memory on which the atomic operation is | |
5072 performed. Operand 2 is the ``old'' value to be compared against the | |
5073 current contents of the memory location. Operand 3 is the ``new'' value | |
5074 to store in the memory if the compare succeeds. Operand 0 is the result | |
5075 of the operation; it should contain the contents of the memory | |
5076 before the operation. If the compare succeeds, this should obviously be | |
5077 a copy of operand 2. | |
5078 | |
5079 This pattern must show that both operand 0 and operand 1 are modified. | |
5080 | |
5081 This pattern must issue any memory barrier instructions such that all | |
5082 memory operations before the atomic operation occur before the atomic | |
5083 operation and all memory operations after the atomic operation occur | |
5084 after the atomic operation. | |
5085 | |
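The operand roles can be seen from the caller's side in the following sketch. It assumes the usual arrangement in which GCC's @code{__sync_val_compare_and_swap} built-in is expanded through this pattern; the helper @code{atomic_max} is a hypothetical name used only here.

```c
/* Atomically raise *P to at least V and return the value *P held
   when the function made its decision.  */
int
atomic_max (int *p, int v)
{
  int old = *p;

  while (old < v)
    {
      /* The built-in returns the contents of *P before the operation
         (operand 0).  The store of V (operand 3) happened exactly when
         that value equals OLD (operand 2).  */
      int seen = __sync_val_compare_and_swap (p, old, v);

      if (seen == old)
        break;
      old = seen;               /* Lost a race: retry with the new value.  */
    }
  return old;
}
```

Comparing the returned value with the expected one is how a caller detects success, which is why operand 0 must hold the prior memory contents.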
5086 @cindex @code{sync_compare_and_swap_cc@var{mode}} instruction pattern | |
5087 @item @samp{sync_compare_and_swap_cc@var{mode}} | |
5088 | |
5089 This pattern is just like @code{sync_compare_and_swap@var{mode}}, except | |
5090 it should act as if the compare part of the compare-and-swap were issued via | |
5091 @code{cmp@var{m}}. This comparison will only be used with @code{EQ} and | |
5092 @code{NE} branches and @code{setcc} operations. | |
5093 | |
5094 Some targets do expose the success or failure of the compare-and-swap | |
5095 operation via the status flags. Ideally we wouldn't need a separate | |
5096 named pattern in order to take advantage of this, but the combine pass | |
5097 does not handle patterns with multiple sets, which is required by | |
5098 definition for @code{sync_compare_and_swap@var{mode}}. | |
5099 | |
5100 @cindex @code{sync_add@var{mode}} instruction pattern | |
5101 @cindex @code{sync_sub@var{mode}} instruction pattern | |
5102 @cindex @code{sync_ior@var{mode}} instruction pattern | |
5103 @cindex @code{sync_and@var{mode}} instruction pattern | |
5104 @cindex @code{sync_xor@var{mode}} instruction pattern | |
5105 @cindex @code{sync_nand@var{mode}} instruction pattern | |
5106 @item @samp{sync_add@var{mode}}, @samp{sync_sub@var{mode}} | |
5107 @itemx @samp{sync_ior@var{mode}}, @samp{sync_and@var{mode}} | |
5108 @itemx @samp{sync_xor@var{mode}}, @samp{sync_nand@var{mode}} | |
5109 | |
5110 These patterns emit code for an atomic operation on memory. | |
5111 Operand 0 is the memory on which the atomic operation is performed. | |
5112 Operand 1 is the second operand to the binary operator. | |
5113 | |
5114 The ``nand'' operation is @code{~op0 & op1}. | |
5115 | |
5116 This pattern must issue any memory barrier instructions such that all | |
5117 memory operations before the atomic operation occur before the atomic | |
5118 operation and all memory operations after the atomic operation occur | |
5119 after the atomic operation. | |
5120 | |
5121 If these patterns are not defined, the operation will be constructed | |
5122 from a compare-and-swap operation, if defined. | |
5123 | |
5124 @cindex @code{sync_old_add@var{mode}} instruction pattern | |
5125 @cindex @code{sync_old_sub@var{mode}} instruction pattern | |
5126 @cindex @code{sync_old_ior@var{mode}} instruction pattern | |
5127 @cindex @code{sync_old_and@var{mode}} instruction pattern | |
5128 @cindex @code{sync_old_xor@var{mode}} instruction pattern | |
5129 @cindex @code{sync_old_nand@var{mode}} instruction pattern | |
5130 @item @samp{sync_old_add@var{mode}}, @samp{sync_old_sub@var{mode}} | |
5131 @itemx @samp{sync_old_ior@var{mode}}, @samp{sync_old_and@var{mode}} | |
5132 @itemx @samp{sync_old_xor@var{mode}}, @samp{sync_old_nand@var{mode}} | |
5133 | |
5134 These patterns emit code for an atomic operation on memory, | |
5135 and return the value that the memory contained before the operation. | |
5136 Operand 0 is the result value, operand 1 is the memory on which the | |
5137 atomic operation is performed, and operand 2 is the second operand | |
5138 to the binary operator. | |
5139 | |
5140 This pattern must issue any memory barrier instructions such that all | |
5141 memory operations before the atomic operation occur before the atomic | |
5142 operation and all memory operations after the atomic operation occur | |
5143 after the atomic operation. | |
5144 | |
5145 If these patterns are not defined, the operation will be constructed | |
5146 from a compare-and-swap operation, if defined. | |
5147 | |
5148 @cindex @code{sync_new_add@var{mode}} instruction pattern | |
5149 @cindex @code{sync_new_sub@var{mode}} instruction pattern | |
5150 @cindex @code{sync_new_ior@var{mode}} instruction pattern | |
5151 @cindex @code{sync_new_and@var{mode}} instruction pattern | |
5152 @cindex @code{sync_new_xor@var{mode}} instruction pattern | |
5153 @cindex @code{sync_new_nand@var{mode}} instruction pattern | |
5154 @item @samp{sync_new_add@var{mode}}, @samp{sync_new_sub@var{mode}} | |
5155 @itemx @samp{sync_new_ior@var{mode}}, @samp{sync_new_and@var{mode}} | |
5156 @itemx @samp{sync_new_xor@var{mode}}, @samp{sync_new_nand@var{mode}} | |
5157 | |
5158 These patterns are like their @code{sync_old_@var{op}} counterparts, | |
5159 except that they return the value that exists in the memory location | |
5160 after the operation, rather than before the operation. | |
5161 | |
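The difference between the ``old'' and ``new'' flavors is easiest to see through the corresponding built-in functions. The sketch below assumes that @code{__sync_fetch_and_add} and @code{__sync_sub_and_fetch} are expanded through @code{sync_old_add@var{mode}} and @code{sync_new_sub@var{mode}} respectively; the wrapper names are invented for the example.

```c
/* "Old" flavor: return the counter value from before the increment.  */
unsigned
take_ticket (unsigned *counter)
{
  return __sync_fetch_and_add (counter, 1u);
}

/* "New" flavor: return the reference count from after the decrement.  */
unsigned
drop_reference (unsigned *refs)
{
  return __sync_sub_and_fetch (refs, 1u);
}
```

A ticket dispenser wants the value before the update, while a reference count wants the value after it, so that the caller can test for zero.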
5162 @cindex @code{sync_lock_test_and_set@var{mode}} instruction pattern | |
5163 @item @samp{sync_lock_test_and_set@var{mode}} | |
5164 | |
5165 This pattern takes two forms, based on the capabilities of the target. | |
5166 In either case, operand 0 is the result of the operation, operand 1 is | |
5167 the memory on which the atomic operation is performed, and operand 2 | |
5168 is the value to set in the lock. | |
5169 | |
5170 In the ideal case, this operation is an atomic exchange operation, in | |
5171 which the previous value in the memory operand is copied into the result | |
5172 operand, and the value operand is stored in the memory operand. | |
5173 | |
5174 For less capable targets, any value operand that is not the constant 1 | |
5175 should be rejected with @code{FAIL}. In this case the target may use | |
5176 an atomic test-and-set bit operation. The result operand should contain | |
5177 1 if the bit was previously set and 0 if the bit was previously clear. | |
5178 The true contents of the memory operand are implementation defined. | |
5179 | |
5180 This pattern must issue any memory barrier instructions such that the | |
5181 pattern as a whole acts as an acquire barrier, that is, all memory | |
5182 operations after the pattern do not occur until the lock is acquired. | |
5183 | |
5184 If this pattern is not defined, the operation will be constructed from | |
5185 a compare-and-swap operation, if defined. | |
5186 | |
5187 @cindex @code{sync_lock_release@var{mode}} instruction pattern | |
5188 @item @samp{sync_lock_release@var{mode}} | |
5189 | |
5190 This pattern, if defined, releases a lock set by | |
5191 @code{sync_lock_test_and_set@var{mode}}. Operand 0 is the memory | |
5192 that contains the lock; operand 1 is the value to store in the lock. | |
5193 | |
5194 If the target doesn't implement full semantics for | |
5195 @code{sync_lock_test_and_set@var{mode}}, any value operand which is not | |
5196 the constant 0 should be rejected with @code{FAIL}, and the true contents | |
5197 of the memory operand are implementation defined. | |
5198 | |
5199 This pattern must issue any memory barrier instructions such that the | |
5200 pattern as a whole acts as a release barrier, that is, the lock is | |
5201 released only after all previous memory operations have completed. | |
5202 | |
5203 If this pattern is not defined, then a @code{memory_barrier} pattern | |
5204 will be emitted, followed by a store of the value to the memory operand. | |
5205 | |
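A minimal spin lock shows how the two lock patterns pair up. This is only a sketch: it assumes the @code{__sync_lock_test_and_set} and @code{__sync_lock_release} built-ins map onto the patterns above, and the names @code{demo_lock_t}, @code{demo_try_lock} and @code{demo_unlock} are hypothetical.

```c
typedef volatile int demo_lock_t;

/* Try to take the lock.  Storing the constant 1 is the one value that
   even "less capable" targets must accept.  A previous value of 0
   means the lock was free and is now ours.  */
int
demo_try_lock (demo_lock_t *lock)
{
  return __sync_lock_test_and_set (lock, 1) == 0;
}

/* Release the lock by storing 0 with release-barrier semantics.  */
void
demo_unlock (demo_lock_t *lock)
{
  __sync_lock_release (lock);
}
```

Restricting the stored values to 1 and 0 keeps the code usable on targets that offer only a test-and-set bit operation.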
5206 @cindex @code{stack_protect_set} instruction pattern | |
5207 @item @samp{stack_protect_set} | |
5208 | |
5209 This pattern, if defined, moves a @code{Pmode} value from the memory | |
5210 in operand 1 to the memory in operand 0 without leaving the value in | |
5211 a register afterward. This is to avoid leaking the value some place | |
5212 that an attacker might use to rewrite the stack guard slot after | |
5213 having clobbered it. | |
5214 | |
5215 If this pattern is not defined, then a plain move pattern is generated. | |
5216 | |
5217 @cindex @code{stack_protect_test} instruction pattern | |
5218 @item @samp{stack_protect_test} | |
5219 | |
5220 This pattern, if defined, compares a @code{Pmode} value from the | |
5221 memory in operand 1 with the memory in operand 0 without leaving the | |
5222 value in a register afterward and branches to operand 2 if the values | |
5223 weren't equal. | |
5224 | |
5225 If this pattern is not defined, then a plain compare pattern and | |
5226 conditional branch pattern are used. | |
5227 | |
5228 @cindex @code{clear_cache} instruction pattern | |
5229 @item @samp{clear_cache} | |
5230 | |
5231 This pattern, if defined, flushes the instruction cache for a region of | |
5232 memory. The region is bounded by the @code{Pmode} pointers in operand 0 | |
5233 inclusive and operand 1 exclusive. | |
5234 | |
5235 If this pattern is not defined, a call to the library function | |
5236 @code{__clear_cache} is used. | |
5237 | |
5238 @end table | |
5239 | |
5240 @end ifset | |
5241 @c Each of the following nodes are wrapped in separate | |
5242 @c "@ifset INTERNALS" to work around memory limits for the default | |
5243 @c configuration in older tetex distributions. Known to not work: | |
5244 @c tetex-1.0.7, known to work: tetex-2.0.2. | |
5245 @ifset INTERNALS | |
5246 @node Pattern Ordering | |
5247 @section When the Order of Patterns Matters | |
5248 @cindex Pattern Ordering | |
5249 @cindex Ordering of Patterns | |
5250 | |
5251 Sometimes an insn can match more than one instruction pattern. Then the | |
5252 pattern that appears first in the machine description is the one used. | |
5253 Therefore, more specific patterns (patterns that will match fewer things) | |
5254 and faster instructions (those that will produce better code when they | |
5255 do match) should usually go first in the description. | |
5256 | |
5257 In some cases the effect of ordering the patterns can be used to hide | |
5258 a pattern when it is not valid. For example, the 68000 has an | |
5259 instruction for converting a fullword to floating point and another | |
5260 for converting a byte to floating point. An instruction converting | |
5261 an integer to floating point could match either one. We put the | |
5262 pattern to convert the fullword first to make sure that one will | |
5263 be used rather than the other. (Otherwise a large integer might | |
5264 be generated as a single-byte immediate quantity, which would not work.) | |
5265 Instead of using this pattern ordering it would be possible to make the | |
5266 pattern for convert-a-byte smart enough to deal properly with any | |
5267 constant value. | |
5268 | |
5269 @end ifset | |
5270 @ifset INTERNALS | |
5271 @node Dependent Patterns | |
5272 @section Interdependence of Patterns | |
5273 @cindex Dependent Patterns | |
5274 @cindex Interdependence of Patterns | |
5275 | |
5276 Every machine description must have a named pattern for each of the | |
5277 conditional branch names @samp{b@var{cond}}. The recognition template | |
5278 must always have the form | |
5279 | |
5280 @smallexample | |
5281 (set (pc) | |
5282 (if_then_else (@var{cond} (cc0) (const_int 0)) | |
5283 (label_ref (match_operand 0 "" "")) | |
5284 (pc))) | |
5285 @end smallexample | |
5286 | |
5287 @noindent | |
5288 In addition, every machine description must have an anonymous pattern | |
5289 for each of the possible reverse-conditional branches. Their templates | |
5290 look like | |
5291 | |
5292 @smallexample | |
5293 (set (pc) | |
5294 (if_then_else (@var{cond} (cc0) (const_int 0)) | |
5295 (pc) | |
5296 (label_ref (match_operand 0 "" "")))) | |
5297 @end smallexample | |
5298 | |
5299 @noindent | |
5300 They are necessary because jump optimization can turn direct-conditional | |
5301 branches into reverse-conditional branches. | |
5302 | |
5303 It is often convenient to use the @code{match_operator} construct to | |
5304 reduce the number of patterns that must be specified for branches. For | |
5305 example, | |
5306 | |
5307 @smallexample | |
5308 (define_insn "" | |
5309 [(set (pc) | |
5310 (if_then_else (match_operator 0 "comparison_operator" | |
5311 [(cc0) (const_int 0)]) | |
5312 (pc) | |
5313 (label_ref (match_operand 1 "" ""))))] | |
5314 "@var{condition}" | |
5315 "@dots{}") | |
5316 @end smallexample | |
5317 | |
5318 In some cases machines support instructions identical except for the | |
5319 machine mode of one or more operands. For example, there may be | |
5320 ``sign-extend halfword'' and ``sign-extend byte'' instructions whose | |
5321 patterns are | |
5322 | |
5323 @smallexample | |
5324 (set (match_operand:SI 0 @dots{}) | |
5325 (extend:SI (match_operand:HI 1 @dots{}))) | |
5326 | |
5327 (set (match_operand:SI 0 @dots{}) | |
5328 (extend:SI (match_operand:QI 1 @dots{}))) | |
5329 @end smallexample | |
5330 | |
5331 @noindent | |
5332 Constant integers do not specify a machine mode, so an instruction to | |
5333 extend a constant value could match either pattern. The pattern it | |
5334 actually will match is the one that appears first in the file. For correct | |
5335 results, this must be the one for the widest possible mode (@code{HImode}, | |
5336 here). If the pattern matches the @code{QImode} instruction, the results | |
5337 will be incorrect if the constant value does not actually fit that mode. | |
5338 | |
5339 Such instructions to extend constants are rarely generated because they are | |
5340 optimized away, but they do occasionally happen in nonoptimized | |
5341 compilations. | |
5342 | |
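The hazard with the two sign-extension patterns can be made concrete with a small model of what each instruction computes. The functions below are illustrative stand-ins written in portable C, not part of GCC.

```c
#include <stdint.h>

/* What a "sign-extend halfword" instruction computes: keep the low
   16 bits of VALUE and propagate bit 15 upward.  */
int32_t
extend_hi (int32_t value)
{
  return ((value & 0xffff) ^ 0x8000) - 0x8000;
}

/* What a "sign-extend byte" instruction computes: keep the low
   8 bits of VALUE and propagate bit 7 upward.  */
int32_t
extend_qi (int32_t value)
{
  return ((value & 0xff) ^ 0x80) - 0x80;
}
```

For the constant 300, @code{extend_hi} yields 300 while @code{extend_qi} yields 44, which is why the pattern for the wider mode has to come first.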
5343 If a constraint in a pattern allows a constant, the reload pass may | |
5344 replace a register with a constant permitted by the constraint in some | |
5345 cases. Similarly for memory references. Because of this substitution, | |
5346 you should not provide separate patterns for increment and decrement | |
5347 instructions. Instead, they should be generated from the same pattern | |
5348 that supports register-register add insns by examining the operands and | |
5349 generating the appropriate machine instruction. | |
5350 | |
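One way to picture ``examining the operands'' is a small selection function of the kind that might sit in the output statement of the add pattern. The sketch is self-contained rather than real back-end code: the function name, its parameters and the mnemonics @samp{inc}, @samp{dec} and @samp{add} are all assumptions for illustration.

```c
/* Choose an assembler template for an add insn.  OP2_IS_CONST says
   whether operand 2 is a constant, and OP2_VALUE is its value when
   it is.  The mnemonics are hypothetical.  */
const char *
add_template (int op2_is_const, long op2_value)
{
  if (op2_is_const && op2_value == 1)
    return "inc %0";
  if (op2_is_const && op2_value == -1)
    return "dec %0";
  return "add %2,%0";
}
```

Because the choice is made when the assembler code is output, a constant that reload substitutes for a register is still handled correctly.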
5351 @end ifset | |
5352 @ifset INTERNALS | |
5353 @node Jump Patterns | |
5354 @section Defining Jump Instruction Patterns | |
5355 @cindex jump instruction patterns | |
5356 @cindex defining jump instruction patterns | |
5357 | |
5358 For most machines, GCC assumes that the machine has a condition code. | |
5359 A comparison insn sets the condition code, recording the results of both | |
5360 signed and unsigned comparison of the given operands. A separate branch | |
5361 insn tests the condition code and branches or not according to its value. | |
5362 The branch insns come in distinct signed and unsigned flavors. Many | |
5363 common machines, such as the VAX, the 68000 and the 32000, work this | |
5364 way. | |
5365 | |
5366 Some machines have distinct signed and unsigned compare instructions, and | |
5367 only one set of conditional branch instructions. The easiest way to handle | |
5368 these machines is to treat them just like the others until the final stage | |
5369 where assembly code is written. At this time, when outputting code for the | |
5370 compare instruction, peek ahead at the following branch using | |
5371 @code{next_cc0_user (insn)}. (The variable @code{insn} refers to the insn | |
5372 being output, in the output-writing code in an instruction pattern.) If | |
5373 the RTL says that it is an unsigned branch, output an unsigned compare; | |
5374 otherwise output a signed compare. When the branch itself is output, you | |
5375 can treat signed and unsigned branches identically. | |
5376 | |
5377 The reason you can do this is that GCC always generates a pair of | |
5378 consecutive RTL insns, possibly separated by @code{note} insns, one to | |
5379 set the condition code and one to test it, and keeps the pair inviolate | |
5380 until the end. | |
5381 | |
5382 To go with this technique, you must define the machine-description macro | |
5383 @code{NOTICE_UPDATE_CC} to do @code{CC_STATUS_INIT}; in other words, no | |
5384 compare instruction is superfluous. | |
5385 | |
5386 Some machines have compare-and-branch instructions and no condition code. | |
5387 A similar technique works for them. When it is time to ``output'' a | |
5388 compare instruction, record its operands in two static variables. When | |
5389 outputting the branch-on-condition-code instruction that follows, actually | |
5390 output a compare-and-branch instruction that uses the remembered operands. | |
5391 | |
5392 It also works to define patterns for compare-and-branch instructions. | |
5393 In optimizing compilation, the pair of compare and branch instructions | |
5394 will be combined according to these patterns. But this does not happen | |
5395 if optimization is not requested. So you must use one of the solutions | |
5396 above in addition to any special patterns you define. | |
5397 | |
5398 In many RISC machines, most instructions do not affect the condition | |
5399 code and there may not even be a separate condition code register. On | |
5400 these machines, the restriction that the definition and use of the | |
5401 condition code be adjacent insns is not necessary and can prevent | |
5402 important optimizations. For example, on the IBM RS/6000, there is a | |
5403 delay for taken branches unless the condition code register is set three | |
5404 instructions earlier than the conditional branch. The instruction | |
5405 scheduler cannot perform this optimization if it is not permitted to | |
5406 separate the definition and use of the condition code register. | |
5407 | |
5408 On these machines, do not use @code{(cc0)}, but instead use a register | |
5409 to represent the condition code. If there is a specific condition code | |
5410 register in the machine, use a hard register. If the condition code or | |
5411 comparison result can be placed in any general register, or if there are | |
5412 multiple condition registers, use a pseudo register. | |
5413 | |
5414 @findex prev_cc0_setter | |
5415 @findex next_cc0_user | |
5416 On some machines, the type of branch instruction generated may depend on | |
5417 the way the condition code was produced; for example, on the 68k and | |
5418 SPARC, setting the condition code directly from an add or subtract | |
5419 instruction does not clear the overflow bit the way that a test | |
5420 instruction does, so a different branch instruction must be used for | |
5421 some conditional branches. For machines that use @code{(cc0)}, the set | |
5422 and use of the condition code must be adjacent (separated only by | |
5423 @code{note} insns) allowing flags in @code{cc_status} to be used. | |
5424 (@xref{Condition Code}.) Also, the comparison and branch insns can be | |
5425 located from each other by using the functions @code{prev_cc0_setter} | |
5426 and @code{next_cc0_user}. | |
5427 | |
5428 However, this is not true on machines that do not use @code{(cc0)}. On | |
5429 those machines, no assumptions can be made about the adjacency of the | |
5430 compare and branch insns and the above methods cannot be used. Instead, | |
5431 we use the machine mode of the condition code register to record | |
5432 different formats of the condition code register. | |
5433 | |
5434 Registers used to store the condition code value should have a mode that | |
5435 is in class @code{MODE_CC}. Normally, it will be @code{CCmode}. If | |
5436 additional modes are required (as for the add example mentioned above for | |
5437 the SPARC), define them in @file{@var{machine}-modes.def} | |
5438 (@pxref{Condition Code}). Also define @code{SELECT_CC_MODE} to choose | |
5439 a mode given an operand of a compare. | |
5440 | |
5441 If it is known during RTL generation that a different mode will be | |
5442 required (for example, if the machine has separate compare instructions | |
5443 for signed and unsigned quantities, like most IBM processors), they can | |
5444 be specified at that time. | |
5445 | |
5446 If the cases that require different modes would be made by instruction | |
5447 combination, the macro @code{SELECT_CC_MODE} determines which machine | |
5448 mode should be used for the comparison result. The patterns should be | |
5449 written using that mode. To support the case of the add on the SPARC | |
5450 discussed above, we have the pattern | |
5451 | |
5452 @smallexample | |
5453 (define_insn "" | |
5454 [(set (reg:CC_NOOV 0) | |
5455 (compare:CC_NOOV | |
5456 (plus:SI (match_operand:SI 0 "register_operand" "%r") | |
5457 (match_operand:SI 1 "arith_operand" "rI")) | |
5458 (const_int 0)))] | |
5459 "" | |
5460 "@dots{}") | |
5461 @end smallexample | |
5462 | |
5463 The @code{SELECT_CC_MODE} macro on the SPARC returns @code{CC_NOOVmode} | |
5464 for comparisons whose argument is a @code{plus}. | |
5465 | |
5466 @end ifset | |
5467 @ifset INTERNALS | |
5468 @node Looping Patterns | |
5469 @section Defining Looping Instruction Patterns | |
5470 @cindex looping instruction patterns | |
5471 @cindex defining looping instruction patterns | |
5472 | |
5473 Some machines have special jump instructions that can be utilized to | |
5474 make loops more efficient. A common example is the 68000 @samp{dbra} | |
5475 instruction which performs a decrement of a register and a branch if the | |
5476 result was greater than zero. Other machines, in particular digital | |
5477 signal processors (DSPs), have special block repeat instructions to | |
5478 provide low-overhead loop support. For example, the TI TMS320C3x/C4x | |
5479 DSPs have a block repeat instruction that loads special registers to | |
5480 mark the top and end of a loop and to count the number of loop | |
5481 iterations. This avoids the need for fetching and executing a | |
5482 @samp{dbra}-like instruction and avoids pipeline stalls associated with | |
5483 the jump. | |
5484 | |
5485 GCC has three special named patterns to support low overhead looping. | |
5486 They are @samp{decrement_and_branch_until_zero}, @samp{doloop_begin}, | |
5487 and @samp{doloop_end}. The first pattern, | |
5488 @samp{decrement_and_branch_until_zero}, is not emitted during RTL | |
5489 generation but may be emitted during the instruction combination phase. | |
5490 This requires the assistance of the loop optimizer, using information | |
5491 collected during strength reduction, to reverse a loop to count down to | |
5492 zero. Some targets also require the loop optimizer to add a | |
5493 @code{REG_NONNEG} note to indicate that the iteration count is always | |
5494 positive. This is needed if the target performs a signed loop | |
5495 termination test. For example, the 68000 uses a pattern similar to the | |
5496 following for its @code{dbra} instruction: | |
5497 | |
5498 @smallexample | |
5499 @group | |
5500 (define_insn "decrement_and_branch_until_zero" | |
5501 [(set (pc) | |
5502 (if_then_else | |
5503 (ge (plus:SI (match_operand:SI 0 "general_operand" "+d*am") | |
5504 (const_int -1)) | |
5505 (const_int 0)) | |
5506 (label_ref (match_operand 1 "" "")) | |
5507 (pc))) | |
5508 (set (match_dup 0) | |
5509 (plus:SI (match_dup 0) | |
5510 (const_int -1)))] | |
5511 "find_reg_note (insn, REG_NONNEG, 0)" | |
5512 "@dots{}") | |
5513 @end group | |
5514 @end smallexample | |
5515 | |
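The control flow described by this RTL can be modeled in a few lines of C. This is a toy model for illustration only, not GCC code; the function name @code{dbra_trip_count} is invented, and the model ignores overflow at the most negative value.

```c
/* Count how many times a loop body runs when the loop is closed by
   the pattern above, starting from COUNTER.  */
int
dbra_trip_count (int counter)
{
  int trips = 0;
  int taken;

  do
    {
      trips++;                      /* the loop body runs here */
      taken = (counter - 1) >= 0;   /* (ge (plus counter -1) 0) */
      counter = counter - 1;        /* (set counter (plus counter -1)) */
    }
  while (taken);
  return trips;
}
```

The model makes the signed test visible: a counter that starts negative leaves the loop after a single trip, so the pattern is only a faithful replacement when the count is known to be nonnegative, which is what the @code{REG_NONNEG} note records.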
5516 Note that since the insn is both a jump insn and has an output, it must | |
5517 deal with its own reloads, hence the `m' constraints. Also note that | |
5518 since this insn is generated by the instruction combination phase | |
5519 combining two sequential insns together into an implicit parallel insn, | |
5520 the iteration counter needs to be biased by the same amount as the | |
5521 decrement operation, in this case @minus{}1. Note that the following similar | |
5522 pattern will not be matched by the combiner. | |
5523 | |
5524 @smallexample | |
5525 @group | |
5526 (define_insn "decrement_and_branch_until_zero" | |
5527 [(set (pc) | |
5528 (if_then_else | |
5529 (ge (match_operand:SI 0 "general_operand" "+d*am") | |
5530 (const_int 1)) | |
5531 (label_ref (match_operand 1 "" "")) | |
5532 (pc))) | |
5533 (set (match_dup 0) | |
5534 (plus:SI (match_dup 0) | |
5535 (const_int -1)))] | |
5536 "find_reg_note (insn, REG_NONNEG, 0)" | |
5537 "@dots{}") | |
5538 @end group | |
5539 @end smallexample | |
5540 | |
5541 The other two special looping patterns, @samp{doloop_begin} and | |
5542 @samp{doloop_end}, are emitted by the loop optimizer for certain | |
5543 well-behaved loops with a finite number of loop iterations using | |
5544 information collected during strength reduction. | |
5545 | |
5546 The @samp{doloop_end} pattern describes the actual looping instruction | |
5547 (or the implicit looping operation) and the @samp{doloop_begin} pattern | |
5548 is an optional companion pattern that can be used for initialization | |
5549 needed for some low-overhead looping instructions. | |
5550 | |
5551 Note that some machines require the actual looping instruction to be | |
5552 emitted at the top of the loop (e.g., the TMS320C3x/C4x DSPs). Emitting | |
5553 the true RTL for a looping instruction at the top of the loop can cause | |
5554 problems with flow analysis. So instead, a dummy @code{doloop} insn is | |
5555 emitted at the end of the loop. The machine dependent reorg pass checks | |
5556 for the presence of this @code{doloop} insn and then searches back to | |
5557 the top of the loop, where it inserts the true looping insn (provided | |
5558 there are no instructions in the loop which would cause problems). Any | |
5559 additional labels can be emitted at this point. In addition, if the | |
5560 desired special iteration counter register was not allocated, this | |
5561 machine dependent reorg pass could emit a traditional compare and jump | |
5562 instruction pair. | |
5563 | |
5564 The essential difference between the | |
5565 @samp{decrement_and_branch_until_zero} and the @samp{doloop_end} | |
5566 patterns is that the loop optimizer allocates an additional pseudo | |
5567 register for the latter as an iteration counter. This pseudo register | |
5568 cannot be used within the loop (i.e., general induction variables cannot | |
5569 be derived from it); however, in many cases the loop induction variable | |
5570 may become redundant and be removed by the flow pass. | |
5571 | |
5572 | |
5573 @end ifset | |
5574 @ifset INTERNALS | |
5575 @node Insn Canonicalizations | |
5576 @section Canonicalization of Instructions | |
5577 @cindex canonicalization of instructions | |
5578 @cindex insn canonicalization | |
5579 | |
5580 There are often cases where multiple RTL expressions could represent an | |
5581 operation performed by a single machine instruction. This situation is | |
5582 most commonly encountered with logical, branch, and multiply-accumulate | |
5583 instructions. In such cases, the compiler attempts to convert these | |
5584 multiple RTL expressions into a single canonical form to reduce the | |
5585 number of insn patterns required. | |
5586 | |
5587 In addition to algebraic simplifications, the following canonicalizations | |
5588 are performed: | |
5589 | |
5590 @itemize @bullet | |
5591 @item | |
5592 For commutative and comparison operators, a constant is always made the | |
5593 second operand. If a machine only supports a constant as the second | |
5594 operand, only patterns that match a constant in the second operand need | |
5595 be supplied. | |
5596 | |
5597 @item | |
5598 For associative operators, a sequence of operators will always chain | |
5599 to the left; for instance, only the left operand of an integer @code{plus} | |
5600 can itself be a @code{plus}. @code{and}, @code{ior}, @code{xor}, | |
5601 @code{plus}, @code{mult}, @code{smin}, @code{smax}, @code{umin}, and | |
5602 @code{umax} are associative when applied to integers, and sometimes to | |
5603 floating-point. | |
5604 | |
5605 @item | |
5606 @cindex @code{neg}, canonicalization of | |
5607 @cindex @code{not}, canonicalization of | |
5608 @cindex @code{mult}, canonicalization of | |
5609 @cindex @code{plus}, canonicalization of | |
5610 @cindex @code{minus}, canonicalization of | |
5611 For these operators, if only one operand is a @code{neg}, @code{not}, | |
5612 @code{mult}, @code{plus}, or @code{minus} expression, it will be the | |
5613 first operand. | |
5614 | |
5615 @item | |
5616 In combinations of @code{neg}, @code{mult}, @code{plus}, and | |
5617 @code{minus}, the @code{neg} operations (if any) will be moved inside | |
5618 the operations as far as possible. For instance, | |
5619 @code{(neg (mult A B))} is canonicalized as @code{(mult (neg A) B)}, but | |
5620 @code{(plus (mult (neg B) C) A)} is canonicalized as | |
5621 @code{(minus A (mult B C))}. | |
5622 | |
5623 @cindex @code{compare}, canonicalization of | |
5624 @item | |
5625 For the @code{compare} operator, a constant is always the second operand | |
5626 on machines where @code{cc0} is used (@pxref{Jump Patterns}). On other | |
5627 machines, there are rare cases where the compiler might want to construct | |
5628 a @code{compare} with a constant as the first operand. However, these | |
5629 cases are not common enough for it to be worthwhile to provide a pattern | |
5630 matching a constant as the first operand unless the machine actually has | |
5631 such an instruction. | |
5632 | |
5633 An operand of @code{neg}, @code{not}, @code{mult}, @code{plus}, or | |
5634 @code{minus} is made the first operand under the same conditions as | |
5635 above. | |
5636 | |
5637 @item | |
5638 @code{(ltu (plus @var{a} @var{b}) @var{b})} is converted to | |
5639 @code{(ltu (plus @var{a} @var{b}) @var{a})}. Likewise with @code{geu} instead | |
5640 of @code{ltu}. | |
5641 | |
5642 @item | |
5643 @code{(minus @var{x} (const_int @var{n}))} is converted to | |
5644 @code{(plus @var{x} (const_int @var{-n}))}. | |
5645 | |
5646 @item | |
5647 Within address computations (i.e., inside @code{mem}), a left shift is | |
5648 converted into the appropriate multiplication by a power of two. | |
5649 | |
5650 @cindex @code{ior}, canonicalization of | |
5651 @cindex @code{and}, canonicalization of | |
5652 @cindex De Morgan's law | |
5653 @item | |
5654 De Morgan's Law is used to move bitwise negation inside a bitwise | |
5655 logical-and or logical-or operation. If this results in only one | |
5656 operand being a @code{not} expression, it will be the first one. | |
5657 | |
5658 A machine that has an instruction that performs a bitwise logical-and of one | |
5659 operand with the bitwise negation of the other should specify the pattern | |
5660 for that instruction as | |
5661 | |
5662 @smallexample | |
5663 (define_insn "" | |
5664 [(set (match_operand:@var{m} 0 @dots{}) | |
5665 (and:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{})) | |
5666 (match_operand:@var{m} 2 @dots{})))] | |
5667 "@dots{}" | |
5668 "@dots{}") | |
5669 @end smallexample | |
5670 | |
5671 @noindent | |
5672 Similarly, a pattern for a ``NAND'' instruction should be written | |
5673 | |
5674 @smallexample | |
5675 (define_insn "" | |
5676 [(set (match_operand:@var{m} 0 @dots{}) | |
5677 (ior:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{})) | |
5678 (not:@var{m} (match_operand:@var{m} 2 @dots{}))))] | |
5679 "@dots{}" | |
5680 "@dots{}") | |
5681 @end smallexample | |
5682 | |
5683 In both cases, it is not necessary to include patterns for the many | |
5684 logically equivalent RTL expressions. | |
5685 | |
5686 @cindex @code{xor}, canonicalization of | |
5687 @item | |
5688 The only possible RTL expressions involving both bitwise exclusive-or | |
5689 and bitwise negation are @code{(xor:@var{m} @var{x} @var{y})} | |
5690 and @code{(not:@var{m} (xor:@var{m} @var{x} @var{y}))}. | |
5691 | |
5692 @item | |
5693 The sum of three items, one of which is a constant, will only appear in | |
5694 the form | |
5695 | |
5696 @smallexample | |
5697 (plus:@var{m} (plus:@var{m} @var{x} @var{y}) @var{constant}) | |
5698 @end smallexample | |
5699 | |
5700 @item | |
5701 On machines that do not use @code{cc0}, | |
5702 @code{(compare @var{x} (const_int 0))} will be converted to | |
5703 @var{x}. | |
5704 | |
5705 @cindex @code{zero_extract}, canonicalization of | |
5706 @cindex @code{sign_extract}, canonicalization of | |
5707 @item | |
5708 Equality comparisons of a group of bits (usually a single bit) with zero | |
5709 will be written using @code{zero_extract} rather than the equivalent | |
5710 @code{and} or @code{sign_extract} operations. | |
5711 | |
5712 @end itemize | |
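
As a brief illustration of the rule for @code{minus} with a constant in
the list above, here is a hypothetical sketch (the mode, the operand
predicates, and the elided condition and output template are
placeholders, not taken from any real port). A pattern intended to
match subtraction of a constant has to be written with @code{plus},
because @code{(minus @var{x} (const_int @var{n}))} does not appear in
canonical RTL:

@smallexample
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "")
        (plus:SI (match_operand:SI 1 "register_operand" "")
                 (match_operand:SI 2 "const_int_operand" "")))]
  "@dots{}"
  "@dots{}")
@end smallexample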
5713 | |
5714 Further canonicalization rules are defined in the function | |
5715 @code{commutative_operand_precedence} in @file{gcc/rtlanal.c}. | |
5716 | |
5717 @end ifset | |
5718 @ifset INTERNALS | |
5719 @node Expander Definitions | |
5720 @section Defining RTL Sequences for Code Generation | |
5721 @cindex expander definitions | |
5722 @cindex code generation RTL sequences | |
5723 @cindex defining RTL sequences for code generation | |
5724 | |
5725 On some target machines, some standard pattern names for RTL generation | |
5726 cannot be handled with a single insn, but a sequence of RTL insns can | |
5727 represent them. For these target machines, you can write a | |
5728 @code{define_expand} to specify how to generate the sequence of RTL@. | |
5729 | |
5730 @findex define_expand | |
5731 A @code{define_expand} is an RTL expression that looks almost like a | |
5732 @code{define_insn}; but, unlike the latter, a @code{define_expand} is used | |
5733 only for RTL generation and it can produce more than one RTL insn. | |
5734 | |
5735 A @code{define_expand} RTX has four operands: | |
5736 | |
5737 @itemize @bullet | |
5738 @item | |
5739 The name. Each @code{define_expand} must have a name, since the only | |
5740 use for it is to refer to it by name. | |
5741 | |
5742 @item | |
5743 The RTL template. This is a vector of RTL expressions representing | |
5744 a sequence of separate instructions. Unlike @code{define_insn}, there | |
5745 is no implicit surrounding @code{PARALLEL}. | |
5746 | |
5747 @item | |
5748 The condition, a string containing a C expression. This expression is | |
5749 used to express how the availability of this pattern depends on | |
5750 subclasses of target machine, selected by command-line options when GCC | |
5751 is run. This is just like the condition of a @code{define_insn} that | |
5752 has a standard name. Therefore, the condition (if present) may not | |
5753 depend on the data in the insn being matched, but only the | |
5754 target-machine-type flags. The compiler needs to test these conditions | |
5755 during initialization in order to learn exactly which named instructions | |
5756 are available in a particular run. | |
5757 | |
5758 @item | |
5759 The preparation statements, a string containing zero or more C | |
5760 statements which are to be executed before RTL code is generated from | |
5761 the RTL template. | |
5762 | |
5763 Usually these statements prepare temporary registers for use as | |
5764 internal operands in the RTL template, but they can also generate RTL | |
5765 insns directly by calling routines such as @code{emit_insn}, etc. | |
5766 Any such insns precede the ones that come from the RTL template. | |
5767 @end itemize | |
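
Putting the four operands together, a @code{define_expand} has roughly
the following shape. This is a schematic sketch in the style of the
other skeletons in this chapter; the @var{@dots{}} names are
placeholders, not keywords:

@smallexample
(define_expand "@var{name}"
  [@var{rtl-template}]
  "@var{condition}"
  "@var{preparation-statements}")
@end smallexample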
5768 | |
5769 Every RTL insn emitted by a @code{define_expand} must match some | |
5770 @code{define_insn} in the machine description. Otherwise, the compiler | |
5771 will crash when trying to generate code for the insn or trying to optimize | |
5772 it. | |
5773 | |
5774 The RTL template, in addition to controlling generation of RTL insns, | |
5775 also describes the operands that need to be specified when this pattern | |
5776 is used. In particular, it gives a predicate for each operand. | |
5777 | |
5778 A true operand, which needs to be specified in order to generate RTL from | |
5779 the pattern, should be described with a @code{match_operand} in its first | |
5780 occurrence in the RTL template. This enters information on the operand's | |
5781 predicate into the tables that record such things. GCC uses the | |
5782 information to preload the operand into a register if that is required for | |
5783 valid RTL code. If the operand is referred to more than once, subsequent | |
5784 references should use @code{match_dup}. | |
5785 | |
5786 The RTL template may also refer to internal ``operands'' which are | |
5787 temporary registers or labels used only within the sequence made by the | |
5788 @code{define_expand}. Internal operands are substituted into the RTL | |
5789 template with @code{match_dup}, never with @code{match_operand}. The | |
5790 values of the internal operands are not passed in as arguments by the | |
5791 compiler when it requests use of this pattern. Instead, they are computed | |
5792 within the pattern, in the preparation statements. These statements | |
5793 compute the values and store them into the appropriate elements of | |
5794 @code{operands} so that @code{match_dup} can find them. | |
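
As a minimal sketch of an internal operand, consider a hypothetical
expander that computes a negation as bitwise complement plus one. The
choice of name and of RTL is an assumption made for illustration only;
it is not taken from any real port, and a real port would also need
@code{define_insn} patterns matching both generated insns. Operand 2
is a temporary register that the caller never supplies, so it is
created in the preparation statement and referenced only through
@code{match_dup}:

@smallexample
(define_expand "negsi2"
  [(set (match_dup 2)
        (not:SI (match_operand:SI 1 "register_operand" "")))
   (set (match_operand:SI 0 "register_operand" "")
        (plus:SI (match_dup 2) (const_int 1)))]
  ""
  "operands[2] = gen_reg_rtx (SImode);")
@end smallexample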
5795 | |
5796 There are two special macros defined for use in the preparation statements: | |
5797 @code{DONE} and @code{FAIL}. Use them with a following semicolon, | |
5798 as a statement. | |
5799 | |
5800 @table @code | |
5801 | |
5802 @findex DONE | |
5803 @item DONE | |
5804 Use the @code{DONE} macro to end RTL generation for the pattern. The | |
5805 only RTL insns resulting from the pattern on this occasion will be | |
5806 those already emitted by explicit calls to @code{emit_insn} within the | |
5807 preparation statements; the RTL template will not be generated. | |
5808 | |
5809 @findex FAIL | |
5810 @item FAIL | |
5811 Make the pattern fail on this occasion. When a pattern fails, it means | |
5812 that the pattern was not truly available. The calling routines in the | |
5813 compiler will try other strategies for code generation using other patterns. | |
5814 | |
5815 Failure is currently supported only for binary (addition, multiplication, | |
5816 shifting, etc.) and bit-field (@code{extv}, @code{extzv}, and @code{insv}) | |
5817 operations. | |
5818 @end table | |
5819 | |
5820 If the preparation falls through (invokes neither @code{DONE} nor | |
5821 @code{FAIL}), then the @code{define_expand} acts like a | |
5822 @code{define_insn} in that the RTL template is used to generate the | |
5823 insn. | |
5824 | |
5825 The RTL template is not used for matching, only for generating the | |
5826 initial insn list. If the preparation statement always invokes | |
5827 @code{DONE} or @code{FAIL}, the RTL template may be reduced to a simple | |
5828 list of operands, such as this example: | |
5829 | |
5830 @smallexample | |
5831 @group | |
5832 (define_expand "addsi3" | |
5833 [(match_operand:SI 0 "register_operand" "") | |
5834 (match_operand:SI 1 "register_operand" "") | |
5835 (match_operand:SI 2 "register_operand" "")] | |
5836 @end group | |
5837 @group | |
5838 "" | |
5839 " | |
5840 @{ | |
5841 handle_add (operands[0], operands[1], operands[2]); | |
5842 DONE; | |
5843 @}") | |
5844 @end group | |
5845 @end smallexample | |
5846 | |
5847 Here is an example, the definition of left-shift for the SPUR chip: | |
5848 | |
5849 @smallexample | |
5850 @group | |
5851 (define_expand "ashlsi3" | |
5852 [(set (match_operand:SI 0 "register_operand" "") | |
5853 (ashift:SI | |
5854 @end group | |
5855 @group | |
5856 (match_operand:SI 1 "register_operand" "") | |
5857 (match_operand:SI 2 "nonmemory_operand" "")))] | |
5858 "" | |
5859 " | |
5860 @end group | |
5861 @end smallexample | |
5862 | |
5863 @smallexample | |
5864 @group | |
5865 @{ | |
5866 if (GET_CODE (operands[2]) != CONST_INT | |
5867 || (unsigned) INTVAL (operands[2]) > 3) | |
5868 FAIL; | |
5869 @}") | |
5870 @end group | |
5871 @end smallexample | |
5872 | |
5873 @noindent | |
5874 This example uses @code{define_expand} so that it can generate an RTL insn | |
5875 for shifting when the shift-count is in the supported range of 0 to 3 but | |
5876 fail in other cases where machine insns aren't available. When it fails, | |
5877 the compiler tries another strategy using different patterns (such as a | |
5878 library call). | |
5879 | |
5880 If the compiler were able to handle nontrivial condition-strings in | |
5881 patterns with names, then it would be possible to use a | |
5882 @code{define_insn} in that case. Here is another case (zero-extension | |
5883 on the 68000) which makes more use of the power of @code{define_expand}: | |
5884 | |
5885 @smallexample | |
5886 (define_expand "zero_extendhisi2" | |
5887 [(set (match_operand:SI 0 "general_operand" "") | |
5888 (const_int 0)) | |
5889 (set (strict_low_part | |
5890 (subreg:HI | |
5891 (match_dup 0) | |
5892 0)) | |
5893 (match_operand:HI 1 "general_operand" ""))] | |
5894 "" | |
5895 "operands[1] = make_safe_from (operands[1], operands[0]);") | |
5896 @end smallexample | |
5897 | |
5898 @noindent | |
5899 @findex make_safe_from | |
5900 Here two RTL insns are generated, one to clear the entire output operand | |
5901 and the other to copy the input operand into its low half. This sequence | |
5902 is incorrect if the input operand refers to [the old value of] the output | |
5903 operand, so the preparation statement makes sure this isn't so. The | |
5904 function @code{make_safe_from} copies the @code{operands[1]} into a | |
5905 temporary register if it refers to @code{operands[0]}. It does this | |
5906 by emitting another RTL insn. | |
5907 | |
5908 Finally, a third example shows the use of an internal operand. | |
5909 Zero-extension on the SPUR chip is done by @code{and}-ing the result | |
5910 against a halfword mask. But this mask cannot be represented by a | |
5911 @code{const_int} because the constant value is too large to be legitimate | |
5912 on this machine. So it must be copied into a register with | |
5913 @code{force_reg} and then the register used in the @code{and}. | |
5914 | |
5915 @smallexample | |
5916 (define_expand "zero_extendhisi2" | |
5917 [(set (match_operand:SI 0 "register_operand" "") | |
5918 (and:SI (subreg:SI | |
5919 (match_operand:HI 1 "register_operand" "") | |
5920 0) | |
5921 (match_dup 2)))] | |
5922 "" | |
5923 "operands[2] | |
5924 = force_reg (SImode, GEN_INT (65535)); ") | |
5925 @end smallexample | |
5926 | |
5927 @emph{Note:} If the @code{define_expand} is used to serve a | |
5928 standard binary or unary arithmetic operation or a bit-field operation, | |
5929 then the last insn it generates must not be a @code{code_label}, | |
5930 @code{barrier} or @code{note}. It must be an @code{insn}, | |
5931 @code{jump_insn} or @code{call_insn}. If you don't need a real insn | |
5932 at the end, emit an insn to copy the result of the operation into | |
5933 itself. Such an insn will generate no code, but it can avoid problems | |
5934 in the compiler. | |
5935 | |
5936 @end ifset | |
5937 @ifset INTERNALS | |
5938 @node Insn Splitting | |
5939 @section Defining How to Split Instructions | |
5940 @cindex insn splitting | |
5941 @cindex instruction splitting | |
5942 @cindex splitting instructions | |
5943 | |
5944 There are two cases where you should specify how to split a pattern | |
5945 into multiple insns. On machines that have instructions requiring | |
5946 delay slots (@pxref{Delay Slots}) or that have instructions whose | |
5947 output is not available for multiple cycles (@pxref{Processor pipeline | |
5948 description}), the compiler phases that optimize these cases need to | |
5949 be able to move insns into one-instruction delay slots. However, some | |
5950 insns may generate more than one machine instruction. These insns | |
5951 cannot be placed into a delay slot. | |
5952 | |
5953 Often you can rewrite the single insn as a list of individual insns, | |
5954 each corresponding to one machine instruction. The disadvantage of | |
5955 doing so is that it will cause the compilation to be slower and require | |
5956 more space. If the resulting insns are too complex, it may also | |
5957 suppress some optimizations. The compiler splits the insn if there is a | |
5958 reason to believe that it might improve instruction or delay slot | |
5959 scheduling. | |
5960 | |
5961 The insn combiner phase also splits putative insns. If three insns are | |
5962 merged into one insn with a complex expression that cannot be matched by | |
5963 some @code{define_insn} pattern, the combiner phase attempts to split | |
5964 the complex pattern into two insns that are recognized. Usually it can | |
5965 break the complex pattern into two patterns by splitting out some | |
5966 subexpression. However, in some other cases, such as performing an | |
5967 addition of a large constant in two insns on a RISC machine, the way to | |
5968 split the addition into two insns is machine-dependent. | |
5969 | |
5970 @findex define_split | |
5971 The @code{define_split} definition tells the compiler how to split a | |
5972 complex insn into several simpler insns. It looks like this: | |
5973 | |
5974 @smallexample | |
5975 (define_split | |
5976 [@var{insn-pattern}] | |
5977 "@var{condition}" | |
5978 [@var{new-insn-pattern-1} | |
5979 @var{new-insn-pattern-2} | |
5980 @dots{}] | |
5981 "@var{preparation-statements}") | |
5982 @end smallexample | |
5983 | |
5984 @var{insn-pattern} is a pattern that needs to be split and | |
5985 @var{condition} is the final condition to be tested, as in a | |
5986 @code{define_insn}. When an insn matching @var{insn-pattern} and | |
5987 satisfying @var{condition} is found, it is replaced in the insn list | |
5988 with the insns given by @var{new-insn-pattern-1}, | |
5989 @var{new-insn-pattern-2}, etc. | |
5990 | |
5991 The @var{preparation-statements} are similar to those statements that | |
5992 are specified for @code{define_expand} (@pxref{Expander Definitions}) | |
5993 and are executed before the new RTL is generated to prepare for the | |
5994 generated code or emit some insns whose pattern is not fixed. Unlike | |
5995 those in @code{define_expand}, however, these statements must not | |
5996 generate any new pseudo-registers. Once reload has completed, they also | |
5997 must not allocate any space in the stack frame. | |
5998 | |
5999 Patterns are matched against @var{insn-pattern} in two different | |
6000 circumstances. If an insn needs to be split for delay slot scheduling | |
6001 or insn scheduling, the insn is already known to be valid, which means | |
6002 that it must have been matched by some @code{define_insn} and, if | |
6003 @code{reload_completed} is nonzero, is known to satisfy the constraints | |
6004 of that @code{define_insn}. In that case, the new insn patterns must | |
6005 also be insns that are matched by some @code{define_insn} and, if | |
6006 @code{reload_completed} is nonzero, must also satisfy the constraints | |
6007 of those definitions. | |
6008 | |
6009 As an example of this usage of @code{define_split}, consider the following | |
6010 example from @file{a29k.md}, which splits a @code{sign_extend} from | |
6011 @code{HImode} to @code{SImode} into a pair of shift insns: | |
6012 | |
6013 @smallexample | |
6014 (define_split | |
6015 [(set (match_operand:SI 0 "gen_reg_operand" "") | |
6016 (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))] | |
6017 "" | |
6018 [(set (match_dup 0) | |
6019 (ashift:SI (match_dup 1) | |
6020 (const_int 16))) | |
6021 (set (match_dup 0) | |
6022 (ashiftrt:SI (match_dup 0) | |
6023 (const_int 16)))] | |
6024 " | |
6025 @{ operands[1] = gen_lowpart (SImode, operands[1]); @}") | |
6026 @end smallexample | |
6027 | |
6028 When the combiner phase tries to split an insn pattern, it is always the | |
6029 case that the pattern is @emph{not} matched by any @code{define_insn}. | |
6030 The combiner pass first tries to split a single @code{set} expression | |
6031 and then the same @code{set} expression inside a @code{parallel}, but | |
6032 followed by a @code{clobber} of a pseudo-reg to use as a scratch | |
6033 register. In these cases, the combiner expects exactly two new insn | |
6034 patterns to be generated. It will verify that these patterns match some | |
6035 @code{define_insn} definitions, so you need not do this test in the | |
6036 @code{define_split} (of course, there is no point in writing a | |
6037 @code{define_split} that will never produce insns that match). | |
6038 | |
6039 Here is an example of this use of @code{define_split}, taken from | |
6040 @file{rs6000.md}: | |
6041 | |
6042 @smallexample | |
6043 (define_split | |
6044 [(set (match_operand:SI 0 "gen_reg_operand" "") | |
6045 (plus:SI (match_operand:SI 1 "gen_reg_operand" "") | |
6046 (match_operand:SI 2 "non_add_cint_operand" "")))] | |
6047 "" | |
6048 [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3))) | |
6049 (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))] | |
6050 " | |
6051 @{ | |
6052 int low = INTVAL (operands[2]) & 0xffff; | |
6053 int high = (unsigned) INTVAL (operands[2]) >> 16; | |
6054 | |
6055 if (low & 0x8000) | |
6056 high++, low |= 0xffff0000; | |
6057 | |
6058 operands[3] = GEN_INT (high << 16); | |
6059 operands[4] = GEN_INT (low); | |
6060 @}") | |
6061 @end smallexample | |
6062 | |
6063 Here the predicate @code{non_add_cint_operand} matches any | |
6064 @code{const_int} that is @emph{not} a valid operand of a single add | |
6065 insn. The add with the smaller displacement is written so that it | |
6066 can be substituted into the address of a subsequent operation. | |
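
As a worked illustration of the preparation statements above, take an
arbitrarily chosen constant and assume 32-bit @code{int} arithmetic.
Because bit 15 of the low half is set, the code sign-extends the low
half and compensates by incrementing the high half, so that the two
addends still sum to the original constant:

@smallexample
INTVAL (operands[2]) = 0x12348000
low  = 0x8000                       (low 16 bits)
high = 0x1234                       (upper 16 bits)
low & 0x8000 is nonzero, so:
high = 0x1235
low  = 0xffff8000                   (that is, -0x8000)
operands[3] = 0x12350000
operands[4] = -0x8000
0x12350000 + (-0x8000) = 0x12348000
@end smallexample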
6067 | |
6068 An example that uses a scratch register, from the same file, generates | |
6069 an equality comparison of a register and a large constant: | |
6070 | |
6071 @smallexample | |
6072 (define_split | |
6073 [(set (match_operand:CC 0 "cc_reg_operand" "") | |
6074 (compare:CC (match_operand:SI 1 "gen_reg_operand" "") | |
6075 (match_operand:SI 2 "non_short_cint_operand" ""))) | |
6076 (clobber (match_operand:SI 3 "gen_reg_operand" ""))] | |
6077 "find_single_use (operands[0], insn, 0) | |
6078 && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ | |
6079 || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)" | |
6080 [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4))) | |
6081 (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))] | |
6082 " | |
6083 @{ | |
6084 /* @r{Get the constant we are comparing against, C, and see what it | |
6085 looks like sign-extended to 16 bits. Then see what constant | |
6086 could be XOR'ed with C to get the sign-extended value.} */ | |
6087 | |
6088 int c = INTVAL (operands[2]); | |
6089 int sextc = (c << 16) >> 16; | |
6090 int xorv = c ^ sextc; | |
6091 | |
6092 operands[4] = GEN_INT (xorv); | |
6093 operands[5] = GEN_INT (sextc); | |
6094 @}") | |
6095 @end smallexample | |
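
To see why this split preserves the result of an equality test, here
is a worked instance of the preparation statements with an arbitrarily
chosen constant. It assumes a 32-bit @code{int} and an arithmetic right
shift, as the code above does:

@smallexample
c     = 0x12345678
sextc = 0x00005678              (low 16 bits of c, sign-extended)
xorv  = c ^ sextc = 0x12340000
@end smallexample

@noindent
For any register value @var{x}, @code{@var{x} ^ xorv} equals
@code{sextc} exactly when @var{x} equals @code{c}. Comparing the result
of the @code{xor} against @code{sextc} therefore gives the same answer
for @code{EQ} and @code{NE} as comparing @var{x} against @code{c},
which is why the split condition restricts the single use to those two
codes.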
6096 | |
6097 To avoid confusion, don't write a single @code{define_split} that | |
6098 accepts some insns that match some @code{define_insn} as well as some | |
6099 insns that don't. Instead, write two separate @code{define_split} | |
6100 definitions, one for the insns that are valid and one for the insns that | |
6101 are not valid. | |
6102 | |
6103 The splitter is allowed to split jump instructions into a sequence of | |
6104 jumps or to create new jumps while splitting non-jump instructions. As | |
6105 the central flowgraph and branch prediction information need to be updated, | |
6106 several restrictions apply. | |
6107 | |
6108 Splitting a jump instruction into a sequence that ends with another jump | |
6109 instruction is always valid, as the compiler expects identical behavior of the new | |
6110 jump. When the new sequence contains multiple jump instructions or new labels, | |
6111 more assistance is needed. The splitter is required to create only unconditional | |
6112 jumps or simple conditional jump instructions. Additionally, it must attach a | |
6113 @code{REG_BR_PROB} note to each conditional jump. A global variable | |
6114 @code{split_branch_probability} holds the probability of the original branch in case | |
6115 it was a simple conditional jump, @minus{}1 otherwise. To simplify | |
6116 recomputing of edge frequencies, the new sequence is required to have only | |
6117 forward jumps to the newly created labels. | |
6118 | |
6119 @findex define_insn_and_split | |
6120 For the common case where the pattern of a @code{define_split} exactly matches the | |
6121 pattern of a @code{define_insn}, use @code{define_insn_and_split}. It looks like | |
6122 this: | |
6123 | |
6124 @smallexample | |
6125 (define_insn_and_split | |
6126 [@var{insn-pattern}] | |
6127 "@var{condition}" | |
6128 "@var{output-template}" | |
6129 "@var{split-condition}" | |
6130 [@var{new-insn-pattern-1} | |
6131 @var{new-insn-pattern-2} | |
6132 @dots{}] | |
6133 "@var{preparation-statements}" | |
6134 [@var{insn-attributes}]) | |
6135 | |
6136 @end smallexample | |
6137 | |
6138 @var{insn-pattern}, @var{condition}, @var{output-template}, and | |
6139 @var{insn-attributes} are used as in @code{define_insn}. The | |
6140 @var{new-insn-pattern} vector and the @var{preparation-statements} are used as | |
6141 in a @code{define_split}. The @var{split-condition} is also used as in | |
6142 @code{define_split}, with the additional behavior that if the condition starts | |
6143 with @samp{&&}, the condition used for the split will be constructed as a | |
6144 logical ``and'' of the split condition with the insn condition. For example, | |
6145 from @file{i386.md}: | |
6146 | |
6147 @smallexample | |
6148 (define_insn_and_split "zero_extendhisi2_and" | |
6149 [(set (match_operand:SI 0 "register_operand" "=r") | |
6150 (zero_extend:SI (match_operand:HI 1 "register_operand" "0"))) | |
6151 (clobber (reg:CC 17))] | |
6152 "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size" | |
6153 "#" | |
6154 "&& reload_completed" | |
6155 [(parallel [(set (match_dup 0) | |
6156 (and:SI (match_dup 0) (const_int 65535))) | |
6157 (clobber (reg:CC 17))])] | |
6158 "" | |
6159 [(set_attr "type" "alu1")]) | |
6160 | |
6161 @end smallexample | |
6162 | |
6163 In this case, the actual split condition will be | |
6164 @samp{TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed}. | |
6165 | |
6166 The @code{define_insn_and_split} construction provides exactly the same | |
6167 functionality as two separate @code{define_insn} and @code{define_split} | |
6168 patterns. It exists for compactness, and as a maintenance aid: with a single | |
6169 definition, there is no need to keep the two patterns' templates in sync by hand. | |
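
For comparison, the @code{zero_extendhisi2_and} example above could be
sketched as the following two separate patterns. This is an
illustrative expansion derived from the description in this section,
not text copied from @file{i386.md}. Note that the split condition is
written out in full, since the @samp{&&} shorthand applies only to
@code{define_insn_and_split}:

@smallexample
(define_insn "zero_extendhisi2_and"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (zero_extend:SI (match_operand:HI 1 "register_operand" "0")))
   (clobber (reg:CC 17))]
  "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size"
  "#"
  [(set_attr "type" "alu1")])

(define_split
  [(set (match_operand:SI 0 "register_operand" "=r")
        (zero_extend:SI (match_operand:HI 1 "register_operand" "0")))
   (clobber (reg:CC 17))]
  "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed"
  [(parallel [(set (match_dup 0)
                   (and:SI (match_dup 0) (const_int 65535)))
              (clobber (reg:CC 17))])]
  "")
@end smallexample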
6170 | |
6171 @end ifset | |
6172 @ifset INTERNALS | |
6173 @node Including Patterns | |
6174 @section Including Patterns in Machine Descriptions | |
6175 @cindex insn includes | |
6176 | |
6177 @findex include | |
6178 The @code{include} pattern tells the compiler tools where to | |
6179 look for patterns that are in files other than the main | |
6180 @file{.md} file. It is used only at build time, and no preprocessing is allowed. | |
6181 | |
6182 It looks like: | |
6183 | |
6184 @smallexample | |
6185 | |
6186 (include | |
6187 @var{pathname}) | |
6188 @end smallexample | |
6189 | |
6190 For example: | |
6191 | |
6192 @smallexample | |
6193 | |
6194 (include "filestuff") | |
6195 | |
6196 @end smallexample | |
6197 | |
6198 Here @var{pathname} is a string that specifies the location of the file. The | |
6199 example above specifies the include file to be in @file{gcc/config/@var{target}/filestuff}. The | |
6200 directory @file{gcc/config/@var{target}} is regarded as the default directory. | |
6201 | |
6202 | |
6203 Machine descriptions may be split up into smaller more manageable subsections | |
6204 and placed into subdirectories. | |
6205 | |
6206 By specifying: | |
6207 | |
6208 @smallexample | |
6209 | |
6210 (include "BOGUS/filestuff") | |
6211 | |
6212 @end smallexample | |
6213 | |
6214 the include file is specified to be in @file{gcc/config/@var{target}/BOGUS/filestuff}. | |
6215 | |
6216 Specifying an absolute path for the include file such as | |
6217 @smallexample | |
6218 | |
6219 (include "/u2/BOGUS/filestuff") | |
6220 | |
6221 @end smallexample | |
6222 is permitted but is not encouraged. | |
6223 | |
6224 @subsection RTL Generation Tool Options for Directory Search | |
6225 @cindex directory options .md | |
6226 @cindex options, directory search | |
6227 @cindex search options | |
6228 | |
6229 The @option{-I@var{dir}} option specifies directories to search for machine descriptions. | |
6230 For example: | |
6231 | |
6232 @smallexample | |
6233 | |
6234 genrecog -I/p1/abc/proc1 -I/p2/abcd/pro2 target.md | |
6235 | |
6236 @end smallexample | |
6237 | |
6238 | |
6239 This option adds the directory @var{dir} to the head of the list of directories to be | |
6240 searched for machine description files. This can be used to override a system machine definition | |
6241 file, substituting your own version, since these directories are | |
6242 searched before the default machine description file directories. If you use more than | |
6243 one @option{-I} option, the directories are scanned in left-to-right | |
6244 order; the standard default directory comes after. | |
6245 | |
6246 | |
6247 @end ifset | |
6248 @ifset INTERNALS | |
6249 @node Peephole Definitions | |
6250 @section Machine-Specific Peephole Optimizers | |
6251 @cindex peephole optimizer definitions | |
6252 @cindex defining peephole optimizers | |
6253 | |
6254 In addition to instruction patterns, the @file{.md} file may contain | |
6255 definitions of machine-specific peephole optimizations. | |
6256 | |
6257 The combiner does not notice certain peephole optimizations when the data | |
6258 flow in the program does not suggest that it should try them. For example, | |
6259 sometimes two consecutive insns related in purpose can be combined even | |
6260 though the second one does not appear to use a register computed in the | |
6261 first one. A machine-specific peephole optimizer can detect such | |
6262 opportunities. | |
6263 | |
6264 There are two forms of peephole definitions that may be used. The | |
6265 original @code{define_peephole} is run at assembly output time to | |
6266 match insns and substitute assembly text. Use of @code{define_peephole} | |
6267 is deprecated. | |
6268 | |
6269 A newer @code{define_peephole2} matches insns and substitutes new | |
6270 insns. The @code{peephole2} pass is run after register allocation | |
6271 but before scheduling, which may result in much better code for | |
6272 targets that do scheduling. | |
6273 | |
6274 @menu | |
6275 * define_peephole:: RTL to Text Peephole Optimizers | |
6276 * define_peephole2:: RTL to RTL Peephole Optimizers | |
6277 @end menu | |
6278 | |
6279 @end ifset | |
6280 @ifset INTERNALS | |
6281 @node define_peephole | |
6282 @subsection RTL to Text Peephole Optimizers | |
6283 @findex define_peephole | |
6284 | |
6285 @need 1000 | |
6286 A definition looks like this: | |
6287 | |
6288 @smallexample | |
6289 (define_peephole | |
6290 [@var{insn-pattern-1} | |
6291 @var{insn-pattern-2} | |
6292 @dots{}] | |
6293 "@var{condition}" | |
6294 "@var{template}" | |
6295 "@var{optional-insn-attributes}") | |
6296 @end smallexample | |
6297 | |
6298 @noindent | |
6299 The last string operand may be omitted if you are not using any | |
6300 machine-specific information in this machine description. If present, | |
6301 it must obey the same rules as in a @code{define_insn}. | |
6302 | |
6303 In this skeleton, @var{insn-pattern-1} and so on are patterns to match | |
6304 consecutive insns. The optimization applies to a sequence of insns when | |
6305 @var{insn-pattern-1} matches the first one, @var{insn-pattern-2} matches | |
6306 the next, and so on. | |
6307 | |
6308 Each of the insns matched by a peephole must also match a | |
6309 @code{define_insn}. Peepholes are checked only at the last stage just | |
6310 before code generation, and only optionally. Therefore, any insn which | |
6311 would match a peephole but no @code{define_insn} will cause a crash in code | |
6312 generation in an unoptimized compilation, or at various optimization | |
6313 stages. | |
6314 | |
6315 The operands of the insns are matched with @code{match_operand}, | |
6316 @code{match_operator}, and @code{match_dup}, as usual. What is not | |
6317 usual is that the operand numbers apply to all the insn patterns in the | |
6318 definition. So, you can check for identical operands in two insns by | |
6319 using @code{match_operand} in one insn and @code{match_dup} in the | |
6320 other. | |
6321 | |
6322 The operand constraints used in @code{match_operand} patterns do not have | |
6323 any direct effect on the applicability of the peephole, but they will | |
6324 be validated afterward, so make sure your constraints are general enough | |
6325 to apply whenever the peephole matches. If the peephole matches | |
6326 but the constraints are not satisfied, the compiler will crash. | |
6327 | |
6328 It is safe to omit constraints in all the operands of the peephole; or | |
6329 you can write constraints which serve as a double-check on the criteria | |
6330 previously tested. | |
6331 | |
6332 Once a sequence of insns matches the patterns, the @var{condition} is | |
6333 checked. This is a C expression which makes the final decision whether to | |
6334 perform the optimization (we do so if the expression is nonzero). If | |
6335 @var{condition} is omitted (in other words, the string is empty) then the | |
6336 optimization is applied to every sequence of insns that matches the | |
6337 patterns. | |
6338 | |
6339 The defined peephole optimizations are applied after register allocation | |
6340 is complete. Therefore, the peephole definition can check which | |
6341 operands have ended up in which kinds of registers, just by looking at | |
6342 the operands. | |
6343 | |
6344 @findex prev_active_insn | |
6345 The way to refer to the operands in @var{condition} is to write | |
6346 @code{operands[@var{i}]} for operand number @var{i} (as matched by | |
6347 @code{(match_operand @var{i} @dots{})}). Use the variable @code{insn} | |
6348 to refer to the last of the insns being matched; use | |
6349 @code{prev_active_insn} to find the preceding insns. | |
6350 | |
6351 @findex dead_or_set_p | |
6352 When optimizing computations with intermediate results, you can use | |
6353 @var{condition} to match only when the intermediate results are not used | |
6354 elsewhere. Use the C expression @code{dead_or_set_p (@var{insn}, | |
6355 @var{op})}, where @var{insn} is the insn in which you expect the value | |
6356 to be used for the last time (from the value of @code{insn}, together | |
6357 with use of @code{prev_nonnote_insn}), and @var{op} is the intermediate | |
6358 value (from @code{operands[@var{i}]}). | |
6359 | |
6360 Applying the optimization means replacing the sequence of insns with one | |
6361 new insn. The @var{template} controls ultimate output of assembler code | |
6362 for this combined insn. It works exactly like the template of a | |
6363 @code{define_insn}. Operand numbers in this template are the same ones | |
6364 used in matching the original sequence of insns. | |
6365 | |
6366 The result of a defined peephole optimizer does not need to match any of | |
6367 the insn patterns in the machine description; it does not even have an | |
6368 opportunity to match them. The peephole optimizer definition itself serves | |
6369 as the insn pattern to control how the insn is output. | |
6370 | |
6371 Defined peephole optimizers are run as assembler code is being output, | |
6372 so the insns they produce are never combined or rearranged in any way. | |
6373 | |
6374 Here is an example, taken from the 68000 machine description: | |
6375 | |
6376 @smallexample | |
6377 (define_peephole | |
6378 [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4))) | |
6379 (set (match_operand:DF 0 "register_operand" "=f") | |
6380 (match_operand:DF 1 "register_operand" "ad"))] | |
6381 "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])" | |
6382 @{ | |
6383 rtx xoperands[2]; | |
6384 xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1); | |
6385 #ifdef MOTOROLA | |
6386 output_asm_insn ("move.l %1,(sp)", xoperands); | |
6387 output_asm_insn ("move.l %1,-(sp)", operands); | |
6388 return "fmove.d (sp)+,%0"; | |
6389 #else | |
6390 output_asm_insn ("movel %1,sp@@", xoperands); | |
6391 output_asm_insn ("movel %1,sp@@-", operands); | |
6392 return "fmoved sp@@+,%0"; | |
6393 #endif | |
6394 @}) | |
6395 @end smallexample | |
6396 | |
6397 @need 1000 | |
6398 The effect of this optimization is to change | |
6399 | |
6400 @smallexample | |
6401 @group | |
6402 jbsr _foobar | |
6403 addql #4,sp | |
6404 movel d1,sp@@- | |
6405 movel d0,sp@@- | |
6406 fmoved sp@@+,fp0 | |
6407 @end group | |
6408 @end smallexample | |
6409 | |
6410 @noindent | |
6411 into | |
6412 | |
6413 @smallexample | |
6414 @group | |
6415 jbsr _foobar | |
6416 movel d1,sp@@ | |
6417 movel d0,sp@@- | |
6418 fmoved sp@@+,fp0 | |
6419 @end group | |
6420 @end smallexample | |
6421 | |
6422 @ignore | |
6423 @findex CC_REVERSED | |
6424 If a peephole matches a sequence including one or more jump insns, you must | |
6425 take account of the flags such as @code{CC_REVERSED} which specify that the | |
6426 condition codes are represented in an unusual manner. The compiler | |
6427 automatically alters any ordinary conditional jumps which occur in such | |
6428 situations, but the compiler cannot alter jumps which have been replaced by | |
6429 peephole optimizations. So it is up to you to alter the assembler code | |
6430 that the peephole produces. Supply C code to write the assembler output, | |
6431 and in this C code check the condition code status flags and change the | |
6432 assembler code as appropriate. | |
6433 @end ignore | |
6434 | |
6435 @var{insn-pattern-1} and so on look @emph{almost} like the second | |
6436 operand of @code{define_insn}. There is one important difference: the | |
6437 second operand of @code{define_insn} consists of one or more RTX's | |
6438 enclosed in square brackets. Usually, there is only one: then the same | |
6439 action can be written as an element of a @code{define_peephole}. But | |
6440 when there are multiple actions in a @code{define_insn}, they are | |
6441 implicitly enclosed in a @code{parallel}. Then you must explicitly | |
6442 write the @code{parallel}, and the square brackets within it, in the | |
6443 @code{define_peephole}. Thus, if an insn pattern looks like this, | |
6444 | |
6445 @smallexample | |
6446 (define_insn "divmodsi4" | |
6447 [(set (match_operand:SI 0 "general_operand" "=d") | |
6448 (div:SI (match_operand:SI 1 "general_operand" "0") | |
6449 (match_operand:SI 2 "general_operand" "dmsK"))) | |
6450 (set (match_operand:SI 3 "general_operand" "=d") | |
6451 (mod:SI (match_dup 1) (match_dup 2)))] | |
6452 "TARGET_68020" | |
6453 "divsl%.l %2,%3:%0") | |
6454 @end smallexample | |
6455 | |
6456 @noindent | |
6457 then the way to mention this insn in a peephole is as follows: | |
6458 | |
6459 @smallexample | |
6460 (define_peephole | |
6461 [@dots{} | |
6462 (parallel | |
6463 [(set (match_operand:SI 0 "general_operand" "=d") | |
6464 (div:SI (match_operand:SI 1 "general_operand" "0") | |
6465 (match_operand:SI 2 "general_operand" "dmsK"))) | |
6466 (set (match_operand:SI 3 "general_operand" "=d") | |
6467 (mod:SI (match_dup 1) (match_dup 2)))]) | |
6468 @dots{}] | |
6469 @dots{}) | |
6470 @end smallexample | |
6471 | |
6472 @end ifset | |
6473 @ifset INTERNALS | |
6474 @node define_peephole2 | |
6475 @subsection RTL to RTL Peephole Optimizers | |
6476 @findex define_peephole2 | |
6477 | |
6478 The @code{define_peephole2} definition tells the compiler how to | |
6479 substitute one sequence of instructions for another sequence, | |
6480 what additional scratch registers may be needed and what their | |
6481 lifetimes must be. | |
6482 | |
6483 @smallexample | |
6484 (define_peephole2 | |
6485 [@var{insn-pattern-1} | |
6486 @var{insn-pattern-2} | |
6487 @dots{}] | |
6488 "@var{condition}" | |
6489 [@var{new-insn-pattern-1} | |
6490 @var{new-insn-pattern-2} | |
6491 @dots{}] | |
6492 "@var{preparation-statements}") | |
6493 @end smallexample | |
6494 | |
6495 The definition is almost identical to @code{define_split} | |
6496 (@pxref{Insn Splitting}) except that the pattern to match is not a | |
6497 single instruction, but a sequence of instructions. | |
6498 | |
6499 It is possible to request additional scratch registers for use in the | |
6500 output template. If appropriate registers are not free, the pattern | |
6501 will simply not match. | |
6502 | |
6503 @findex match_scratch | |
6504 @findex match_dup | |
6505 Scratch registers are requested with a @code{match_scratch} pattern at | |
6506 the top level of the input pattern. The allocated register (initially) will | |
6507 be dead at the point requested within the original sequence. If the scratch | |
6508 is used at more than a single point, a @code{match_dup} pattern at the | |
6509 top level of the input pattern marks the last position in the input sequence | |
6510 at which the register must be available. | |
6511 | |
6512 Here is an example from the IA-32 machine description: | |
6513 | |
6514 @smallexample | |
6515 (define_peephole2 | |
6516 [(match_scratch:SI 2 "r") | |
6517 (parallel [(set (match_operand:SI 0 "register_operand" "") | |
6518 (match_operator:SI 3 "arith_or_logical_operator" | |
6519 [(match_dup 0) | |
6520 (match_operand:SI 1 "memory_operand" "")])) | |
6521 (clobber (reg:CC 17))])] | |
6522 "! optimize_size && ! TARGET_READ_MODIFY" | |
6523 [(set (match_dup 2) (match_dup 1)) | |
6524 (parallel [(set (match_dup 0) | |
6525 (match_op_dup 3 [(match_dup 0) (match_dup 2)])) | |
6526 (clobber (reg:CC 17))])] | |
6527 "") | |
6528 @end smallexample | |
6529 | |
6530 @noindent | |
6531 This pattern tries to split a load from its use in the hopes that we'll be | |
6532 able to schedule around the memory load latency. It allocates a single | |
6533 @code{SImode} register of class @code{GENERAL_REGS} (@code{"r"}) that needs | |
6534 to be live only at the point just before the arithmetic. | |
6535 | |
6536 A real example requiring extended scratch lifetimes is harder to come by, | |
6537 so here's a silly made-up example: | |
6538 | |
6539 @smallexample | |
6540 (define_peephole2 | |
6541 [(match_scratch:SI 4 "r") | |
6542 (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" "")) | |
6543 (set (match_operand:SI 2 "" "") (match_dup 1)) | |
6544 (match_dup 4) | |
6545 (set (match_operand:SI 3 "" "") (match_dup 1))] | |
6546 "/* @r{determine 1 does not overlap 0 and 2} */" | |
6547 [(set (match_dup 4) (match_dup 1)) | |
6548 (set (match_dup 0) (match_dup 4)) | |
6549 (set (match_dup 2) (match_dup 4)) | |
6550 (set (match_dup 3) (match_dup 4))] | |
6551 "") | |
6552 @end smallexample | |
6553 | |
6554 @noindent | |
6555 If we had not added the @code{(match_dup 4)} in the middle of the input | |
6556 sequence, it might have been the case that the register we chose at the | |
6557 beginning of the sequence is killed by the first or second @code{set}. | |
6558 | |
6559 @end ifset | |
6560 @ifset INTERNALS | |
6561 @node Insn Attributes | |
6562 @section Instruction Attributes | |
6563 @cindex insn attributes | |
6564 @cindex instruction attributes | |
6565 | |
6566 In addition to describing the instructions supported by the target machine, | |
6567 the @file{md} file also defines a group of @dfn{attributes} and a set of | |
6568 values for each. Every generated insn is assigned a value for each attribute. | |
6569 One possible attribute would be the effect that the insn has on the machine's | |
6570 condition code. This attribute can then be used by @code{NOTICE_UPDATE_CC} | |
6571 to track the condition codes. | |
6572 | |
6573 @menu | |
6574 * Defining Attributes:: Specifying attributes and their values. | |
6575 * Expressions:: Valid expressions for attribute values. | |
6576 * Tagging Insns:: Assigning attribute values to insns. | |
6577 * Attr Example:: An example of assigning attributes. | |
6578 * Insn Lengths:: Computing the length of insns. | |
6579 * Constant Attributes:: Defining attributes that are constant. | |
6580 * Delay Slots:: Defining delay slots required for a machine. | |
6581 * Processor pipeline description:: Specifying information for insn scheduling. | |
6582 @end menu | |
6583 | |
6584 @end ifset | |
6585 @ifset INTERNALS | |
6586 @node Defining Attributes | |
6587 @subsection Defining Attributes and their Values | |
6588 @cindex defining attributes and their values | |
6589 @cindex attributes, defining | |
6590 | |
6591 @findex define_attr | |
6592 The @code{define_attr} expression is used to define each attribute required | |
6593 by the target machine. It looks like: | |
6594 | |
6595 @smallexample | |
6596 (define_attr @var{name} @var{list-of-values} @var{default}) | |
6597 @end smallexample | |
6598 | |
6599 @var{name} is a string specifying the name of the attribute being defined. | |
6600 | |
6601 @var{list-of-values} is either a string that specifies a comma-separated | |
6602 list of values that can be assigned to the attribute, or a null string to | |
6603 indicate that the attribute takes numeric values. | |
6604 | |
6605 @var{default} is an attribute expression that gives the value of this | |
6606 attribute for insns that match patterns whose definition does not include | |
6607 an explicit value for this attribute. @xref{Attr Example}, for more | |
6608 information on the handling of defaults. @xref{Constant Attributes}, | |
6609 for information on attributes that do not depend on any particular insn. | |
6610 | |
6611 @findex insn-attr.h | |
6612 For each defined attribute, a number of definitions are written to the | |
6613 @file{insn-attr.h} file. For cases where an explicit set of values is | |
6614 specified for an attribute, the following are defined: | |
6615 | |
6616 @itemize @bullet | |
6617 @item | |
6618 A @samp{#define} is written for the symbol @samp{HAVE_ATTR_@var{name}}. | |
6619 | |
6620 @item | |
6621 An enumerated class is defined for @samp{attr_@var{name}} with | |
6622 elements of the form @samp{@var{upper-name}_@var{upper-value}} where | |
6623 the attribute name and value are first converted to uppercase. | |
6624 | |
6625 @item | |
6626 A function @samp{get_attr_@var{name}} is defined that is passed an insn and | |
6627 returns the attribute value for that insn. | |
6628 @end itemize | |
6629 | |
6630 For example, if the following is present in the @file{md} file: | |
6631 | |
6632 @smallexample | |
6633 (define_attr "type" "branch,fp,load,store,arith" @dots{}) | |
6634 @end smallexample | |
6635 | |
6636 @noindent | |
6637 the following lines will be written to the file @file{insn-attr.h}. | |
6638 | |
6639 @smallexample | |
6640 #define HAVE_ATTR_type | |
6641 enum attr_type @{TYPE_BRANCH, TYPE_FP, TYPE_LOAD, | |
6642 TYPE_STORE, TYPE_ARITH@}; | |
6643 extern enum attr_type get_attr_type (); | |
6644 @end smallexample | |
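The naming rule for the enumeration constants can be sketched in C.  This is only an illustration: @code{attr_enum_name} is a hypothetical helper written for this example, not a function in GCC, and it assumes plain ASCII attribute names and values.

```c
#include <ctype.h>
#include <stddef.h>

/* Build "<UPPER-NAME>_<UPPER-VALUE>" in OUT, which has room for
   OUT_SIZE bytes including the terminating null.  Returns 0 on
   success and -1 if the result does not fit.
   Hypothetical helper for illustration; not part of GCC.  */
int
attr_enum_name (const char *name, const char *value,
                char *out, size_t out_size)
{
  const char *parts[2] = { name, value };
  size_t n = 0;

  for (int p = 0; p < 2; p++)
    {
      if (p == 1)
        {
          /* Separator between the attribute name and the value.  */
          if (n + 1 >= out_size)
            return -1;
          out[n++] = '_';
        }
      for (const char *s = parts[p]; *s; s++)
        {
          if (n + 1 >= out_size)
            return -1;
          out[n++] = (char) toupper ((unsigned char) *s);
        }
    }
  out[n] = '\0';
  return 0;
}
```

Called with the name "type" and the value "branch", the helper fills the buffer with TYPE_BRANCH, which matches the first enumerator in the generated @code{enum attr_type} shown above.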
6645 | |
6646 If the attribute takes numeric values, no @code{enum} type will be | |
6647 defined and the function to obtain the attribute's value will return | |
6648 @code{int}. | |
6649 | |
6650 There are attributes which are tied to a specific meaning. These | |
6651 attributes are not free to use for other purposes: | |
6652 | |
6653 @table @code | |
6654 @item length | |
6655 The @code{length} attribute is used to calculate the length of emitted | |
6656 code chunks. This is especially important when verifying branch | |
6657 distances. @xref{Insn Lengths}. | |
6658 | |
6659 @item enabled | |
6660 The @code{enabled} attribute can be defined to prevent certain | |
6661 alternatives of an insn definition from being used during code | |
6662 generation. @xref{Disable Insn Alternatives}. | |
6663 | |
6664 @end table | |
6665 | |
6666 @end ifset | |
6667 @ifset INTERNALS | |
6668 @node Expressions | |
6669 @subsection Attribute Expressions | |
6670 @cindex attribute expressions | |
6671 | |
6672 RTL expressions used to define attributes use the codes described above | |
6673 plus a few specific to attribute definitions, to be discussed below. | |
6674 Attribute value expressions must have one of the following forms: | |
6675 | |
6676 @table @code | |
6677 @cindex @code{const_int} and attributes | |
6678 @item (const_int @var{i}) | |
6679 The integer @var{i} specifies the value of a numeric attribute. @var{i} | |
6680 must be non-negative. | |
6681 | |
6682 The value of a numeric attribute can be specified either with a | |
6683 @code{const_int}, or as an integer represented as a string in | |
6684 @code{const_string}, @code{eq_attr} (see below), @code{attr}, | |
6685 @code{symbol_ref}, simple arithmetic expressions, and @code{set_attr} | |
6686 overrides on specific instructions (@pxref{Tagging Insns}). | |
6687 | |
6688 @cindex @code{const_string} and attributes | |
6689 @item (const_string @var{value}) | |
6690 The string @var{value} specifies a constant attribute value. | |
6691 If @var{value} is specified as @samp{"*"}, it means that the default value of | |
6692 the attribute is to be used for the insn containing this expression. | |
6693 @samp{"*"} obviously cannot be used in the @var{default} expression | |
6694 of a @code{define_attr}. | |
6695 | |
6696 If the attribute whose value is being specified is numeric, @var{value} | |
6697 must be a string containing a non-negative integer (normally | |
6698 @code{const_int} would be used in this case). Otherwise, it must | |
6699 contain one of the valid values for the attribute. | |
6700 | |
6701 @cindex @code{if_then_else} and attributes | |
6702 @item (if_then_else @var{test} @var{true-value} @var{false-value}) | |
6703 @var{test} specifies an attribute test, whose format is defined below. | |
6704 The value of this expression is @var{true-value} if @var{test} is true, | |
6705 otherwise it is @var{false-value}. | |
6706 | |
6707 @cindex @code{cond} and attributes | |
6708 @item (cond [@var{test1} @var{value1} @dots{}] @var{default}) | |
6709 The first operand of this expression is a vector containing an even | |
6710 number of expressions and consisting of pairs of @var{test} and @var{value} | |
6711 expressions. The value of the @code{cond} expression is that of the | |
6712 @var{value} corresponding to the first true @var{test} expression. If | |
6713 none of the @var{test} expressions are true, the value of the @code{cond} | |
6714 expression is that of the @var{default} expression. | |
6715 @end table | |
6716 | |
6717 @var{test} expressions can have one of the following forms: | |
6718 | |
6719 @table @code | |
6720 @cindex @code{const_int} and attribute tests | |
6721 @item (const_int @var{i}) | |
6722 This test is true if @var{i} is nonzero and false otherwise. | |
6723 | |
6724 @cindex @code{not} and attributes | |
6725 @cindex @code{ior} and attributes | |
6726 @cindex @code{and} and attributes | |
6727 @item (not @var{test}) | |
6728 @itemx (ior @var{test1} @var{test2}) | |
6729 @itemx (and @var{test1} @var{test2}) | |
6730 These tests are true if the indicated logical function is true. | |
6731 | |
6732 @cindex @code{match_operand} and attributes | |
6733 @item (match_operand:@var{m} @var{n} @var{pred} @var{constraints}) | |
6734 This test is true if operand @var{n} of the insn whose attribute value | |
6735 is being determined has mode @var{m} (this part of the test is ignored | |
6736 if @var{m} is @code{VOIDmode}) and the function specified by the string | |
6737 @var{pred} returns a nonzero value when passed operand @var{n} and mode | |
6738 @var{m} (this part of the test is ignored if @var{pred} is the null | |
6739 string). | |
6740 | |
6741 The @var{constraints} operand is ignored and should be the null string. | |
6742 | |
6743 @cindex @code{le} and attributes | |
6744 @cindex @code{leu} and attributes | |
6745 @cindex @code{lt} and attributes | |
6746 @cindex @code{gt} and attributes | |
6747 @cindex @code{gtu} and attributes | |
6748 @cindex @code{ge} and attributes | |
6749 @cindex @code{geu} and attributes | |
6750 @cindex @code{ne} and attributes | |
6751 @cindex @code{eq} and attributes | |
6752 @cindex @code{plus} and attributes | |
6753 @cindex @code{minus} and attributes | |
6754 @cindex @code{mult} and attributes | |
6755 @cindex @code{div} and attributes | |
6756 @cindex @code{mod} and attributes | |
6757 @cindex @code{abs} and attributes | |
6758 @cindex @code{neg} and attributes | |
6759 @cindex @code{ashift} and attributes | |
6760 @cindex @code{lshiftrt} and attributes | |
6761 @cindex @code{ashiftrt} and attributes | |
6762 @item (le @var{arith1} @var{arith2}) | |
6763 @itemx (leu @var{arith1} @var{arith2}) | |
6764 @itemx (lt @var{arith1} @var{arith2}) | |
6765 @itemx (ltu @var{arith1} @var{arith2}) | |
6766 @itemx (gt @var{arith1} @var{arith2}) | |
6767 @itemx (gtu @var{arith1} @var{arith2}) | |
6768 @itemx (ge @var{arith1} @var{arith2}) | |
6769 @itemx (geu @var{arith1} @var{arith2}) | |
6770 @itemx (ne @var{arith1} @var{arith2}) | |
6771 @itemx (eq @var{arith1} @var{arith2}) | |
6772 These tests are true if the indicated comparison of the two arithmetic | |
6773 expressions is true. Arithmetic expressions are formed with | |
6774 @code{plus}, @code{minus}, @code{mult}, @code{div}, @code{mod}, | |
6775 @code{abs}, @code{neg}, @code{and}, @code{ior}, @code{xor}, @code{not}, | |
6776 @code{ashift}, @code{lshiftrt}, and @code{ashiftrt} expressions. | |
6777 | |
6778 @findex get_attr | |
6779 @code{const_int} and @code{symbol_ref} are always valid terms (@pxref{Insn | |
6780 Lengths}, for additional forms). @code{symbol_ref} is a string | |
6781 denoting a C expression that yields an @code{int} when evaluated by the | |
6782 @samp{get_attr_@dots{}} routine. It should normally be a global | |
6783 variable. | |
6784 | |
6785 @findex eq_attr | |
6786 @item (eq_attr @var{name} @var{value}) | |
6787 @var{name} is a string specifying the name of an attribute. | |
6788 | |
6789 @var{value} is a string that is either a valid value for attribute | |
6790 @var{name}, a comma-separated list of values, or @samp{!} followed by a | |
6791 value or list. If @var{value} does not begin with a @samp{!}, this | |
6792 test is true if the value of the @var{name} attribute of the current | |
6793 insn is in the list specified by @var{value}. If @var{value} begins | |
6794 with a @samp{!}, this test is true if the attribute's value is | |
6795 @emph{not} in the specified list. | |
6796 | |
6797 For example, | |
6798 | |
6799 @smallexample | |
6800 (eq_attr "type" "load,store") | |
6801 @end smallexample | |
6802 | |
6803 @noindent | |
6804 is equivalent to | |
6805 | |
6806 @smallexample | |
6807 (ior (eq_attr "type" "load") (eq_attr "type" "store")) | |
6808 @end smallexample | |
6809 | |
6810 If @var{name} specifies an attribute of @samp{alternative}, it refers to the | |
6811 value of the compiler variable @code{which_alternative} | |
6812 (@pxref{Output Statement}) and the values must be small integers. For | |
6813 example, | |
6814 | |
6815 @smallexample | |
6816 (eq_attr "alternative" "2,3") | |
6817 @end smallexample | |
6818 | |
6819 @noindent | |
6820 is equivalent to | |
6821 | |
6822 @smallexample | |
6823 (ior (eq (symbol_ref "which_alternative") (const_int 2)) | |
6824 (eq (symbol_ref "which_alternative") (const_int 3))) | |
6825 @end smallexample | |
6826 | |
6827 Note that, for most attributes, an @code{eq_attr} test is simplified in cases | |
6828 where the value of the attribute being tested is known for all insns matching | |
6829 a particular pattern. This is by far the most common case. | |
6830 | |
6831 @findex attr_flag | |
6832 @item (attr_flag @var{name}) | |
6833 The value of an @code{attr_flag} expression is true if the flag | |
6834 specified by @var{name} is true for the @code{insn} currently being | |
6835 scheduled. | |
6836 | |
6837 @var{name} is a string specifying one of a fixed set of flags to test. | |
6838 Test the flags @code{forward} and @code{backward} to determine the | |
6839 direction of a conditional branch. Test the flags @code{very_likely}, | |
6840 @code{likely}, @code{very_unlikely}, and @code{unlikely} to determine | |
6841 if a conditional branch is expected to be taken. | |
6842 | |
6843 If the @code{very_likely} flag is true, then the @code{likely} flag is also | |
6844 true. Likewise for the @code{very_unlikely} and @code{unlikely} flags. | |
6845 | |
6846 This example describes a conditional branch delay slot which | |
6847 can be nullified for forward branches that are taken (annul-true) or | |
6848 for backward branches which are not taken (annul-false). | |
6849 | |
6850 @smallexample | |
6851 (define_delay (eq_attr "type" "cbranch") | |
6852 [(eq_attr "in_branch_delay" "true") | |
6853 (and (eq_attr "in_branch_delay" "true") | |
6854 (attr_flag "forward")) | |
6855 (and (eq_attr "in_branch_delay" "true") | |
6856 (attr_flag "backward"))]) | |
6857 @end smallexample | |
6858 | |
6859 The @code{forward} and @code{backward} flags are false if the current | |
6860 @code{insn} being scheduled is not a conditional branch. | |
6861 | |
6862 The @code{very_likely} and @code{likely} flags are true if the | |
6863 @code{insn} being scheduled is not a conditional branch. | |
6864 The @code{very_unlikely} and @code{unlikely} flags are false if the | |
6865 @code{insn} being scheduled is not a conditional branch. | |
6866 | |
6867 @code{attr_flag} is only used during delay slot scheduling and has no | |
6868 meaning to other passes of the compiler. | |
6869 | |
6870 @findex attr | |
6871 @item (attr @var{name}) | |
6872 The value of another attribute is returned. This is most useful | |
6873 for numeric attributes, as @code{eq_attr} and @code{attr_flag} | |
6874 produce more efficient code for non-numeric attributes. | |
6875 @end table | |
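The matching rule described for the @var{value} string of @code{eq_attr} (a single value, a comma-separated list, or either of these negated with a leading @samp{!}) can be sketched in C as follows.  The function names are invented for this example and the code only models the documented behavior; it is not how the attribute generator is actually implemented.

```c
#include <stddef.h>
#include <string.h>

/* Return nonzero if ACTUAL is one of the items of the
   comma-separated LIST.  Hypothetical helper for illustration.  */
static int
in_value_list (const char *list, const char *actual)
{
  size_t len = strlen (actual);
  const char *p = list;

  while (*p)
    {
      const char *comma = strchr (p, ',');
      size_t item_len = comma ? (size_t) (comma - p) : strlen (p);

      /* Compare whole items only, so "loa" does not match "load".  */
      if (item_len == len && strncmp (p, actual, len) == 0)
        return 1;
      if (!comma)
        break;
      p = comma + 1;
    }
  return 0;
}

/* Model of (eq_attr NAME VALUE), where ACTUAL is the value that
   attribute NAME has for the current insn.  A leading '!' in VALUE
   inverts the membership test.  */
int
eq_attr_test (const char *value, const char *actual)
{
  if (value[0] == '!')
    return !in_value_list (value + 1, actual);
  return in_value_list (value, actual);
}
```

Under this model, the value string "load,store" is true for an insn whose attribute is "store" and false for one whose attribute is "arith", while "!load,store" gives the opposite answers.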
6876 | |
6877 @end ifset | |
6878 @ifset INTERNALS | |
6879 @node Tagging Insns | |
6880 @subsection Assigning Attribute Values to Insns | |
6881 @cindex tagging insns | |
6882 @cindex assigning attribute values to insns | |
6883 | |
6884 The value assigned to an attribute of an insn is primarily determined by | |
6885 which pattern is matched by that insn (or which @code{define_peephole} | |
6886 generated it). Every @code{define_insn} and @code{define_peephole} can | |
6887 have an optional last argument to specify the values of attributes for | |
6888 matching insns. The value of any attribute not specified in a particular | |
6889 insn is set to the default value for that attribute, as specified in its | |
6890 @code{define_attr}. Extensive use of default values for attributes | |
6891 permits the specification of the values for only one or two attributes | |
6892 in the definition of most insn patterns, as seen in the example in the | |
6893 next section. | |
6894 | |
6895 The optional last argument of @code{define_insn} and | |
6896 @code{define_peephole} is a vector of expressions, each of which defines | |
6897 the value for a single attribute. The most general way of assigning an | |
6898 attribute's value is to use a @code{set} expression whose first operand is an | |
6899 @code{attr} expression giving the name of the attribute being set. The | |
6900 second operand of the @code{set} is an attribute expression | |
6901 (@pxref{Expressions}) giving the value of the attribute. | |
6902 | |
6903 When the attribute value depends on the @samp{alternative} attribute | |
6904 (i.e., which is the applicable alternative in the constraint of the | |
6905 insn), the @code{set_attr_alternative} expression can be used. It | |
6906 allows the specification of a vector of attribute expressions, one for | |
6907 each alternative. | |
6908 | |
6909 @findex set_attr | |
6910 When the generality of arbitrary attribute expressions is not required, | |
6911 the simpler @code{set_attr} expression can be used, which allows | |
6912 specifying a string giving either a single attribute value or a list | |
6913 of attribute values, one for each alternative. | |
6914 | |
6915 The form of each of the above specifications is shown below. In each case, | |
6916 @var{name} is a string specifying the attribute to be set. | |
6917 | |
6918 @table @code | |
6919 @item (set_attr @var{name} @var{value-string}) | |
6920 @var{value-string} is either a string giving the desired attribute value, | |
6921 or a string containing a comma-separated list giving the values for | |
6922 succeeding alternatives. The number of elements must match the number | |
6923 of alternatives in the constraint of the insn pattern. | |
6924 | |
6925 Note that it may be useful to specify @samp{*} for some alternative, in | |
6926 which case the attribute will assume its default value for insns matching | |
6927 that alternative. | |
6928 | |
6929 @findex set_attr_alternative | |
6930 @item (set_attr_alternative @var{name} [@var{value1} @var{value2} @dots{}]) | |
6931 Depending on the alternative of the insn, the value will be one of the | |
6932 specified values. This is a shorthand for using a @code{cond} with | |
6933 tests on the @samp{alternative} attribute. | |
6934 | |
6935 @findex attr | |
6936 @item (set (attr @var{name}) @var{value}) | |
6937 The first operand of this @code{set} must be the special RTL expression | |
6938 @code{attr}, whose sole operand is a string giving the name of the | |
6939 attribute being set. @var{value} is the value of the attribute. | |
6940 @end table | |
6941 | |
6942 The following shows three different ways of representing the same | |
6943 attribute value specification: | |
6944 | |
6945 @smallexample | |
6946 (set_attr "type" "load,store,arith") | |
6947 | |
6948 (set_attr_alternative "type" | |
6949 [(const_string "load") (const_string "store") | |
6950 (const_string "arith")]) | |
6951 | |
6952 (set (attr "type") | |
6953 (cond [(eq_attr "alternative" "0") (const_string "load") | |
6954 (eq_attr "alternative" "1") (const_string "store")] | |
6955 (const_string "arith"))) | |
6956 @end smallexample | |
6957 | |
6958 @need 1000 | |
6959 @findex define_asm_attributes | |
6960 The @code{define_asm_attributes} expression provides a mechanism to | |
6961 specify the attributes assigned to insns produced from an @code{asm} | |
6962 statement. It has the form: | |
6963 | |
6964 @smallexample | |
6965 (define_asm_attributes [@var{attr-sets}]) | |
6966 @end smallexample | |
6967 | |
6968 @noindent | |
6969 where @var{attr-sets} is specified the same as for both the | |
6970 @code{define_insn} and the @code{define_peephole} expressions. | |
6971 | |
6972 These values will typically be the ``worst case'' attribute values. For | |
6973 example, they might indicate that the condition code will be clobbered. | |
6974 | |
6975 A specification for a @code{length} attribute is handled specially. The | |
6976 way to compute the length of an @code{asm} insn is to multiply the | |
6977 length specified in the expression @code{define_asm_attributes} by the | |
6978 number of machine instructions specified in the @code{asm} statement, | |
6979 determined by counting the number of semicolons and newlines in the | |
6980 string. Therefore, the value of the @code{length} attribute specified | |
6981 in a @code{define_asm_attributes} should be the maximum possible length | |
6982 of a single machine instruction. | |
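The length rule for @code{asm} insns can be approximated by the following C sketch.  It rests on one assumption that the text above does not spell out: each semicolon or newline is taken to separate two instructions, so a non-empty string holds one more instruction than it has separators.  The function names are invented for this illustration.

```c
/* Estimate how many machine instructions an asm string holds.
   Assumption for this sketch: every ';' or newline separates two
   instructions, so a non-empty string holds one more instruction
   than it has separators.  An empty string counts as zero.  */
int
count_asm_insns (const char *templ)
{
  int count;

  if (*templ == '\0')
    return 0;
  count = 1;
  for (; *templ; templ++)
    if (*templ == ';' || *templ == '\n')
      count++;
  return count;
}

/* Length assumed for the whole asm insn, given the per-instruction
   maximum taken from the length attribute in define_asm_attributes.  */
int
asm_insn_length (const char *templ, int max_insn_length)
{
  return count_asm_insns (templ) * max_insn_length;
}
```

With a maximum instruction length of 6, an @code{asm} string holding two instructions separated by one semicolon is given a length of 12 under this model, even if both instructions are in fact shorter.  This is why the specified value should be the worst case for a single instruction.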
6983 | |
6984 @end ifset | |
6985 @ifset INTERNALS | |
6986 @node Attr Example | |
6987 @subsection Example of Attribute Specifications | |
6988 @cindex attribute specifications example | |
6989 @cindex attribute specifications | |
6990 | |
6991 The judicious use of defaulting is important in the efficient use of | |
6992 insn attributes. Typically, insns are divided into @dfn{types} and an | |
6993 attribute, customarily called @code{type}, is used to represent this | |
6994 value. This attribute is normally used only to define the default value | |
6995 for other attributes. An example will clarify this usage. | |
6996 | |
6997 Assume we have a RISC machine with a condition code and in which only | |
6998 full-word operations are performed in registers. Let us assume that we | |
6999 can divide all insns into loads, stores, (integer) arithmetic | |
7000 operations, floating point operations, and branches. | |
7001 | |
7002 Here we will concern ourselves with determining the effect of an insn on | |
7003 the condition code and will limit ourselves to the following possible | |
7004 effects: The condition code can be set unpredictably (clobbered), not | |
7005 be changed, be set to agree with the results of the operation, or only | |
7006 changed if the item previously set into the condition code has been | |
7007 modified. | |
7008 | |
7009 Here is part of a sample @file{md} file for such a machine: | |
7010 | |
7011 @smallexample | |
7012 (define_attr "type" "load,store,arith,fp,branch" (const_string "arith")) | |
7013 | |
7014 (define_attr "cc" "clobber,unchanged,set,change0" | |
7015 (cond [(eq_attr "type" "load") | |
7016 (const_string "change0") | |
7017 (eq_attr "type" "store,branch") | |
7018 (const_string "unchanged") | |
7019 (eq_attr "type" "arith") | |
7020 (if_then_else (match_operand:SI 0 "" "") | |
7021 (const_string "set") | |
7022 (const_string "clobber"))] | |
7023 (const_string "clobber"))) | |
7024 | |
7025 (define_insn "" | |
7026 [(set (match_operand:SI 0 "general_operand" "=r,r,m") | |
7027 (match_operand:SI 1 "general_operand" "r,m,r"))] | |
7028 "" | |
7029 "@@ | |
7030 move %0,%1 | |
7031 load %0,%1 | |
7032 store %0,%1" | |
7033 [(set_attr "type" "arith,load,store")]) | |
7034 @end smallexample | |
7035 | |
7036 Note that we assume in the above example that arithmetic operations | |
7037 performed on quantities smaller than a machine word clobber the condition | |
7038 code since they will set the condition code to a value corresponding to the | |
7039 full-word result. | |
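The default for @code{cc} in the example above reads as a decision on the @code{type} attribute.  A rough C rendering follows.  The enumerator names follow the convention described for @file{insn-attr.h}, while @code{default_cc} and its @code{operand0_is_simode} parameter are made up for this sketch; the parameter stands in for the @code{match_operand:SI} test on operand 0.

```c
/* Enumerations in the order the values appear in the define_attr
   strings of the example.  */
enum attr_type { TYPE_LOAD, TYPE_STORE, TYPE_ARITH, TYPE_FP, TYPE_BRANCH };
enum attr_cc { CC_CLOBBER, CC_UNCHANGED, CC_SET, CC_CHANGE0 };

/* Hypothetical rendering of the default expression of "cc".
   OPERAND0_IS_SIMODE is nonzero when operand 0 has mode SImode.  */
enum attr_cc
default_cc (enum attr_type type, int operand0_is_simode)
{
  switch (type)
    {
    case TYPE_LOAD:
      return CC_CHANGE0;
    case TYPE_STORE:
    case TYPE_BRANCH:
      return CC_UNCHANGED;
    case TYPE_ARITH:
      /* Full-word arithmetic sets the condition code; narrower
         operations are treated as clobbering it.  */
      return operand0_is_simode ? CC_SET : CC_CLOBBER;
    default:
      /* No test matched (here, TYPE_FP): the final default of the
         cond expression applies.  */
      return CC_CLOBBER;
    }
}
```

Because the sample @code{define_insn} sets only @code{type}, each of its three alternatives gets its @code{cc} value through this default: @code{set} for the register move (when operand 0 is in @code{SImode}), @code{change0} for the load, and @code{unchanged} for the store.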
7040 | |
7041 @end ifset | |
7042 @ifset INTERNALS | |
7043 @node Insn Lengths | |
7044 @subsection Computing the Length of an Insn | |
7045 @cindex insn lengths, computing | |
7046 @cindex computing the length of an insn | |
7047 | |
7048 For many machines, multiple types of branch instructions are provided, each | |
7049 for different length branch displacements. In most cases, the assembler | |
7050 will choose the correct instruction to use. However, when the assembler | |
7051 cannot do so, GCC can, provided that a special attribute, the @code{length} | |
7052 attribute, is defined. This attribute must be defined to have numeric | |
7053 values by specifying a null string in its @code{define_attr}. | |
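
As a minimal sketch of such a definition (the four-byte default is an
assumption chosen for illustration, not taken from any particular port):

@smallexample
;; The null string in place of a value list makes the attribute numeric.
(define_attr "length" "" (const_int 4))
@end smallexample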
7054 | |
7055 In the case of the @code{length} attribute, two additional forms of | |
7056 arithmetic terms are allowed in test expressions: | |
7057 | |
7058 @table @code | |
7059 @cindex @code{match_dup} and attributes | |
7060 @item (match_dup @var{n}) | |
7061 This refers to the address of operand @var{n} of the current insn, which | |
7062 must be a @code{label_ref}. | |
7063 | |
7064 @cindex @code{pc} and attributes | |
7065 @item (pc) | |
7066 This refers to the address of the @emph{current} insn. It might have | |
7067 been more consistent with other usage to make this the address of the | |
7068 @emph{next} insn but this would be confusing because the length of the | |
7069 current insn is to be computed. | |
7070 @end table | |
7071 | |
7072 @cindex @code{addr_vec}, length of | |
7073 @cindex @code{addr_diff_vec}, length of | |
7074 For normal insns, the length will be determined by the value of the | |
7075 @code{length} attribute. In the case of @code{addr_vec} and | |
7076 @code{addr_diff_vec} insn patterns, the length is computed as | |
7077 the number of vectors multiplied by the size of each vector. | |
7078 | |
7079 Lengths are measured in addressable storage units (bytes). | |
7080 | |
7081 The following macros can be used to refine the length computation: | |
7082 | |
7083 @table @code | |
7084 @findex ADJUST_INSN_LENGTH | |
7085 @item ADJUST_INSN_LENGTH (@var{insn}, @var{length}) | |
7086 If defined, modifies the length assigned to instruction @var{insn} as a | |
7087 function of the context in which it is used. @var{length} is an lvalue | |
7088 that contains the initially computed length of the insn and should be | |
7089 updated with the correct length of the insn. | |
7090 | |
7091 This macro will normally not be required. A case in which it is | |
7092 required is the ROMP@. On this machine, the size of an @code{addr_vec} | |
7093 insn must be increased by two to compensate for the fact that alignment | |
7094 may be required. | |
7095 @end table | |
7096 | |
7097 @findex get_attr_length | |
7098 The routine @code{get_attr_length}, which returns the value of the | |
7099 @code{length} attribute, can be used by the output routine to | |
7100 determine the form of the branch instruction to be written, as the | |
7101 example below illustrates. | |
7102 | |
7103 As an example of the specification of variable-length branches, consider | |
7104 the IBM 360. If we adopt the convention that a register will be set to | |
7105 the starting address of a function, we can jump to labels within 4k of | |
7106 the start using a four-byte instruction. Otherwise, we need a six-byte | |
7107 sequence to load the address from memory and then branch to it. | |
7108 | |
7109 On such a machine, a pattern for a branch instruction might be specified | |
7110 as follows: | |
7111 | |
7112 @smallexample | |
7113 (define_insn "jump" | |
7114 [(set (pc) | |
7115 (label_ref (match_operand 0 "" "")))] | |
7116 "" | |
7117 @{ | |
7118 return (get_attr_length (insn) == 4 | |
7119 ? "b %l0" : "l r15,=a(%l0); br r15"); | |
7120 @} | |
7121 [(set (attr "length") | |
7122 (if_then_else (lt (match_dup 0) (const_int 4096)) | |
7123 (const_int 4) | |
7124 (const_int 6)))]) | |
7125 @end smallexample | |
7126 | |
7127 @end ifset | |
7128 @ifset INTERNALS | |
7129 @node Constant Attributes | |
7130 @subsection Constant Attributes | |
7131 @cindex constant attributes | |
7132 | |
7133 A special form of @code{define_attr}, where the expression for the | |
7134 default value is a @code{const} expression, indicates an attribute that | |
7135 is constant for a given run of the compiler. Constant attributes may be | |
7136 used to specify which variety of processor is used. For example, | |
7137 | |
7138 @smallexample | |
7139 (define_attr "cpu" "m88100,m88110,m88000" | |
7140 (const | |
7141 (cond [(symbol_ref "TARGET_88100") (const_string "m88100") | |
7142 (symbol_ref "TARGET_88110") (const_string "m88110")] | |
7143 (const_string "m88000")))) | |
7144 | |
7145 (define_attr "memory" "fast,slow" | |
7146 (const | |
7147 (if_then_else (symbol_ref "TARGET_FAST_MEM") | |
7148 (const_string "fast") | |
7149 (const_string "slow")))) | |
7150 @end smallexample | |
7151 | |
7152 The routine generated for constant attributes has no parameters as it | |
7153 does not depend on any particular insn. RTL expressions used to define | |
7154 the value of a constant attribute may use the @code{symbol_ref} form, | |
7155 but may not use either the @code{match_operand} form or @code{eq_attr} | |
7156 forms involving insn attributes. | |
7157 | |
7158 @end ifset | |
7159 @ifset INTERNALS | |
7160 @node Delay Slots | |
7161 @subsection Delay Slot Scheduling | |
7162 @cindex delay slots, defining | |
7163 | |
7164 The insn attribute mechanism can be used to specify the requirements for | |
7165 delay slots, if any, on a target machine. An instruction is said to | |
7166 require a @dfn{delay slot} if some instructions that are physically | |
7167 after the instruction are executed as if they were located before it. | |
7168 Classic examples are branch and call instructions, which often execute | |
7169 the following instruction before the branch or call is performed. | |
7170 | |
7171 On some machines, conditional branch instructions can optionally | |
7172 @dfn{annul} instructions in the delay slot. This means that the | |
7173 instruction will not be executed for certain branch outcomes. Both | |
7174 instructions that annul if the branch is true and instructions that | |
7175 annul if the branch is false are supported. | |
7176 | |
7177 Delay slot scheduling differs from instruction scheduling in that | |
7178 determining whether an instruction needs a delay slot is dependent only | |
7179 on the type of instruction being generated, not on data flow between the | |
7180 instructions. See the next section for a discussion of data-dependent | |
7181 instruction scheduling. | |
7182 | |
7183 @findex define_delay | |
7184 The requirement of an insn needing one or more delay slots is indicated | |
7185 via the @code{define_delay} expression. It has the following form: | |
7186 | |
7187 @smallexample | |
7188 (define_delay @var{test} | |
7189 [@var{delay-1} @var{annul-true-1} @var{annul-false-1} | |
7190 @var{delay-2} @var{annul-true-2} @var{annul-false-2} | |
7191 @dots{}]) | |
7192 @end smallexample | |
7193 | |
7194 @var{test} is an attribute test that indicates whether this | |
7195 @code{define_delay} applies to a particular insn. If so, the number of | |
7196 required delay slots is determined by the length of the vector specified | |
7197 as the second argument. An insn placed in delay slot @var{n} must | |
7198 satisfy attribute test @var{delay-n}. @var{annul-true-n} is an | |
7199 attribute test that specifies which insns may be annulled if the branch | |
7200 is true. Similarly, @var{annul-false-n} specifies which insns in the | |
7201 delay slot may be annulled if the branch is false. If annulling is not | |
7202 supported for that delay slot, @code{(nil)} should be coded. | |
7203 | |
7204 For example, in the common case where branch and call insns require | |
7205 a single delay slot, which may contain any insn other than a branch or | |
7206 call, the following would be placed in the @file{.md} file: | |
7207 | |
7208 @smallexample | |
7209 (define_delay (eq_attr "type" "branch,call") | |
7210 [(eq_attr "type" "!branch,call") (nil) (nil)]) | |
7211 @end smallexample | |
7212 | |
7213 Multiple @code{define_delay} expressions may be specified. In this | |
7214 case, each such expression specifies different delay slot requirements | |
7215 and there must be no insn for which tests in two @code{define_delay} | |
7216 expressions are both true. | |
7217 | |
7218 For example, if we have a machine that requires one delay slot for branches | |
7219 but two for calls, no delay slot can contain a branch or call insn, | |
7220 and any valid insn in the delay slot for the branch can be annulled if the | |
7221 branch is true, we might represent this as follows: | |
7222 | |
7223 @smallexample | |
7224 (define_delay (eq_attr "type" "branch") | |
7225 [(eq_attr "type" "!branch,call") | |
7226 (eq_attr "type" "!branch,call") | |
7227 (nil)]) | |
7228 | |
7229 (define_delay (eq_attr "type" "call") | |
7230 [(eq_attr "type" "!branch,call") (nil) (nil) | |
7231 (eq_attr "type" "!branch,call") (nil) (nil)]) | |
7232 @end smallexample | |
7233 @c the above is *still* too long. --mew 4feb93 | |
7234 | |
7235 @end ifset | |
7236 @ifset INTERNALS | |
7237 @node Processor pipeline description | |
7238 @subsection Specifying processor pipeline description | |
7239 @cindex processor pipeline description | |
7240 @cindex processor functional units | |
7241 @cindex instruction latency time | |
7242 @cindex interlock delays | |
7243 @cindex data dependence delays | |
7244 @cindex reservation delays | |
7245 @cindex pipeline hazard recognizer | |
7246 @cindex automaton based pipeline description | |
7247 @cindex regular expressions | |
7248 @cindex deterministic finite state automaton | |
7249 @cindex automaton based scheduler | |
7250 @cindex RISC | |
7251 @cindex VLIW | |
7252 | |
7253 To achieve better performance, most modern processors | |
7254 (super-pipelined, superscalar @acronym{RISC}, and @acronym{VLIW} | |
7255 processors) have many @dfn{functional units} on which several | |
7256 instructions can be executed simultaneously. An instruction starts | |
7257 execution if its issue conditions are satisfied. If not, the | |
7258 instruction is stalled until its conditions are satisfied. Such | |
7259 @dfn{interlock (pipeline) delay} causes interruption of the fetching | |
7260 of successor instructions (or demands nop instructions, e.g.@: for some | |
7261 MIPS processors). | |
7262 | |
7263 There are two major kinds of interlock delays in modern processors. | |
7264 The first one is a data dependence delay determining @dfn{instruction | |
7265 latency time}. The instruction execution is not started until all | |
7266 source data have been evaluated by prior instructions (there are more | |
7267 complex cases when the instruction execution starts even when the data | |
7268 are not available but will be ready in a given time after the | |
7269 instruction execution start). Taking the data dependence delays into | |
7270 account is simple. The data dependence (true, output, and | |
7271 anti-dependence) delay between two instructions is given by a | |
7272 constant. In most cases this approach is adequate. The second kind | |
7273 of interlock delays is a reservation delay. The reservation delay | |
7274 means that two instructions under execution will be in need of shared | |
7275 processor resources, i.e.@: buses, internal registers, and/or | |
7276 functional units, which are reserved for some time. Taking this kind | |
7277 of delay into account is complex especially for modern @acronym{RISC} | |
7278 processors. | |
7279 | |
7280 The task of exploiting more processor parallelism is solved by an | |
7281 instruction scheduler. For a better solution to this problem, the | |
7282 instruction scheduler has to have an adequate description of the | |
7283 processor parallelism (or @dfn{pipeline description}). GCC | |
7284 machine descriptions describe processor parallelism and functional | |
7285 unit reservations for groups of instructions with the aid of | |
7286 @dfn{regular expressions}. | |
7287 | |
7288 The GCC instruction scheduler uses a @dfn{pipeline hazard recognizer} to | |
7289 determine whether the processor can issue an instruction | |
7290 on a given simulated processor cycle. The pipeline hazard recognizer is | |
7291 automatically generated from the processor pipeline description. The | |
7292 pipeline hazard recognizer generated from the machine description | |
7293 is based on a deterministic finite state automaton (@acronym{DFA}): | |
7294 the instruction issue is possible if there is a transition from one | |
7295 automaton state to another one. This algorithm is very fast, and | |
7296 furthermore, its speed is not dependent on processor | |
7297 complexity@footnote{However, the size of the automaton depends on | |
7298 processor complexity. To limit this effect, machine descriptions | |
7299 can split orthogonal parts of the machine description among several | |
7300 automata: but then, since each of these must be stepped independently, | |
7301 this does cause a small decrease in the algorithm's performance.}. | |
7302 | |
7303 @cindex automaton based pipeline description | |
7304 The rest of this section describes the directives that constitute | |
7305 an automaton-based processor pipeline description. The order of | |
7306 these constructions within the machine description file is not | |
7307 important. | |
7308 | |
7309 @findex define_automaton | |
7310 @cindex pipeline hazard recognizer | |
7311 The following optional construction describes names of automata | |
7312 generated and used for the pipeline hazards recognition. Sometimes | |
7313 the generated finite state automaton used by the pipeline hazard | |
7314 recognizer is large. If we use more than one automaton and bind functional | |
7315 units to the automata, the total size of the automata is usually | |
7316 less than the size of the single automaton. If there is no such | |
7317 construction, only one finite state automaton is generated. | |
7318 | |
7319 @smallexample | |
7320 (define_automaton @var{automata-names}) | |
7321 @end smallexample | |
7322 | |
7323 @var{automata-names} is a string giving names of the automata. The | |
7324 names are separated by commas. All the automata should have unique names. | |
7325 The automaton name is used in the constructions @code{define_cpu_unit} and | |
7326 @code{define_query_cpu_unit}. | |
7327 | |
7328 @findex define_cpu_unit | |
7329 @cindex processor functional units | |
7330 Each processor functional unit used in the description of instruction | |
7331 reservations should be described by the following construction. | |
7332 | |
7333 @smallexample | |
7334 (define_cpu_unit @var{unit-names} [@var{automaton-name}]) | |
7335 @end smallexample | |
7336 | |
7337 @var{unit-names} is a string giving the names of the functional units | |
7338 separated by commas. Do not use the name @samp{nothing}; it is reserved | |
7339 for other purposes. | |
7340 | |
7341 @var{automaton-name} is a string giving the name of the automaton with | |
7342 which the unit is bound. The automaton should be described in | |
7343 construction @code{define_automaton}. You should give | |
7344 @var{automaton-name} if an automaton has been defined. | |
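
For illustration, a description that splits its units between two
automata might begin as follows. The automaton names @samp{integer}
and @samp{floating} and the unit names are hypothetical:

@smallexample
(define_automaton "integer, floating")

;; Each unit is bound to one of the automata declared above.
(define_cpu_unit "i0_pipeline, i1_pipeline" "integer")
(define_cpu_unit "f_pipeline" "floating")
@end smallexample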
7345 | |
7346 The assignment of units to automata is constrained by the uses of the | |
7347 units in insn reservations. The most important constraint is: if a | |
7348 unit reservation is present on a particular cycle of an alternative | |
7349 for an insn reservation, then some unit from the same automaton must | |
7350 be present on the same cycle for the other alternatives of the insn | |
7351 reservation. The rest of the constraints are mentioned in the | |
7352 description of the subsequent constructions. | |
7353 | |
7354 @findex define_query_cpu_unit | |
7355 @cindex querying function unit reservations | |
7356 The following construction describes CPU functional units analogously | |
7357 to @code{define_cpu_unit}. The reservation of such units can be | |
7358 queried for an automaton state. The instruction scheduler never | |
7359 queries reservation of functional units for given automaton state. So | |
7360 as a rule, you don't need this construction. This construction could | |
7361 be used for future code generation goals (e.g.@: to generate | |
7362 @acronym{VLIW} insn templates). | |
7363 | |
7364 @smallexample | |
7365 (define_query_cpu_unit @var{unit-names} [@var{automaton-name}]) | |
7366 @end smallexample | |
7367 | |
7368 @var{unit-names} is a string giving names of the functional units | |
7369 separated by commas. | |
7370 | |
7371 @var{automaton-name} is a string giving the name of the automaton with | |
7372 which the unit is bound. | |
7373 | |
7374 @findex define_insn_reservation | |
7375 @cindex instruction latency time | |
7376 @cindex regular expressions | |
7377 @cindex data bypass | |
7378 The following construction is the major one to describe pipeline | |
7379 characteristics of an instruction. | |
7380 | |
7381 @smallexample | |
7382 (define_insn_reservation @var{insn-name} @var{default_latency} | |
7383 @var{condition} @var{regexp}) | |
7384 @end smallexample | |
7385 | |
7386 @var{default_latency} is a number giving latency time of the | |
7387 instruction. There is an important difference between the old | |
7388 description and the automaton based pipeline description. The latency | |
7389 time is used for all dependencies when we use the old description. In | |
7390 the automaton based pipeline description, the given latency time is only | |
7391 used for true dependencies. The cost of anti-dependencies is always | |
7392 zero and the cost of output dependencies is the difference between | |
7393 latency times of the producing and consuming insns (if the difference | |
7394 is negative, the cost is considered to be zero). You can always | |
7395 change the default costs for any description by using the target hook | |
7396 @code{TARGET_SCHED_ADJUST_COST} (@pxref{Scheduling}). | |
7397 | |
7398 @var{insn-name} is a string giving the internal name of the insn. The | |
7399 internal names are used in constructions @code{define_bypass} and in | |
7400 the automaton description file generated for debugging. The internal | |
7401 name has nothing in common with the names in @code{define_insn}. It is a | |
7402 good practice to use insn classes described in the processor manual. | |
7403 | |
7404 @var{condition} defines what RTL insns are described by this | |
7405 construction. You should remember that you will be in trouble if | |
7406 @var{condition} for two or more different | |
7407 @code{define_insn_reservation} constructions is TRUE for an insn. In | |
7408 this case what reservation will be used for the insn is not defined. | |
7409 Such cases are not checked during generation of the pipeline hazards | |
7410 recognizer because in general recognizing that two conditions may have | |
7411 the same value is quite difficult (especially if the conditions | |
7412 contain @code{symbol_ref}). It is also not checked during the | |
7413 pipeline hazard recognizer work because it would slow down the | |
7414 recognizer considerably. | |
7415 | |
7416 @var{regexp} is a string describing the reservation of the cpu's functional | |
7417 units by the instruction. The reservations are described by a regular | |
7418 expression according to the following syntax: | |
7419 | |
7420 @smallexample | |
7421 regexp = regexp "," oneof | |
7422 | oneof | |
7423 | |
7424 oneof = oneof "|" allof | |
7425 | allof | |
7426 | |
7427 allof = allof "+" repeat | |
7428 | repeat | |
7429 | |
7430 repeat = element "*" number | |
7431 | element | |
7432 | |
7433 element = cpu_function_unit_name | |
7434 | reservation_name | |
7435 | result_name | |
7436 | "nothing" | |
7437 | "(" regexp ")" | |
7438 @end smallexample | |
7439 | |
7440 @itemize @bullet | |
7441 @item | |
7442 @samp{,} is used for describing the start of the next cycle in | |
7443 the reservation. | |
7444 | |
7445 @item | |
7446 @samp{|} is used for describing a reservation described by the first | |
7447 regular expression @strong{or} a reservation described by the second | |
7448 regular expression @strong{or} etc. | |
7449 | |
7450 @item | |
7451 @samp{+} is used for describing a reservation described by the first | |
7452 regular expression @strong{and} a reservation described by the | |
7453 second regular expression @strong{and} etc. | |
7454 | |
7455 @item | |
7456 @samp{*} is used for convenience and simply means a sequence in which | |
7457 the regular expression is repeated @var{number} times with cycle | |
7458 advancing (see @samp{,}). | |
7459 | |
7460 @item | |
7461 @samp{cpu_function_unit_name} denotes reservation of the named | |
7462 functional unit. | |
7463 | |
7464 @item | |
7465 @samp{reservation_name} --- see description of construction | |
7466 @samp{define_reservation}. | |
7467 | |
7468 @item | |
7469 @samp{nothing} denotes no unit reservations. | |
7470 @end itemize | |
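
As a sketch of how these operators combine, consider the following
reservation. The unit names (@samp{decode}, @samp{alu0}, @samp{alu1},
@samp{bus} and @samp{wb}), the reservation name, the @code{type} value
and the latency of 5 are all assumptions made for illustration:

@smallexample
(define_cpu_unit "decode, alu0, alu1, bus, wb")

(define_insn_reservation "example_alu" 5 (eq_attr "type" "alu")
  "decode, (alu0 | alu1) + bus, nothing*2, wb")
@end smallexample

Read cycle by cycle, this reserves @samp{decode} on the first cycle;
either @samp{alu0} or @samp{alu1}, together with @samp{bus}, on the
second; no unit on the third and fourth; and @samp{wb} on the fifth.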
7471 | |
7472 @findex define_reservation | |
7473 Sometimes unit reservations for different insns contain common parts. | |
7474 In such a case, you can simplify the pipeline description by describing | |
7475 the common part with the following construction: | |
7476 | |
7477 @smallexample | |
7478 (define_reservation @var{reservation-name} @var{regexp}) | |
7479 @end smallexample | |
7480 | |
7481 @var{reservation-name} is a string giving the name of @var{regexp}. | |
7482 Functional unit names and reservation names are in the same name | |
7483 space. So the reservation names should be different from the | |
7484 functional unit names and can not be the reserved name @samp{nothing}. | |
7485 | |
7486 @findex define_bypass | |
7487 @cindex instruction latency time | |
7488 @cindex data bypass | |
7489 The following construction is used to describe exceptions in the | |
7490 latency time for a given instruction pair. Such exceptions are called bypasses. | |
7491 | |
7492 @smallexample | |
7493 (define_bypass @var{number} @var{out_insn_names} @var{in_insn_names} | |
7494 [@var{guard}]) | |
7495 @end smallexample | |
7496 | |
7497 @var{number} defines when the result generated by the instructions | |
7498 given in string @var{out_insn_names} will be ready for the | |
7499 instructions given in string @var{in_insn_names}. The instructions in | |
7500 the string are separated by commas. | |
7501 | |
7502 @var{guard} is an optional string giving the name of a C function which | |
7503 defines an additional guard for the bypass. The function will get the | |
7504 two insns as parameters. If the function returns zero the bypass will | |
7505 be ignored for this case. The additional guard is necessary to | |
7506 recognize complicated bypasses, e.g.@: when the consumer uses the result only | |
7507 as the address of a @samp{store} insn (not as the stored value). | |
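
A guarded bypass might be sketched as follows. The reservation names
and the guard function @code{example_store_addr_dep_p} are hypothetical;
such a guard would have to be defined in the target's C sources:

@smallexample
;; The result of "example_load" is ready for "example_store" after
;; one cycle, but only when the guard function returns nonzero.
(define_bypass 1 "example_load" "example_store"
               "example_store_addr_dep_p")
@end smallexample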
7508 | |
7509 @findex exclusion_set | |
7510 @findex presence_set | |
7511 @findex final_presence_set | |
7512 @findex absence_set | |
7513 @findex final_absence_set | |
7514 @cindex VLIW | |
7515 @cindex RISC | |
7516 The following five constructions are usually used to describe | |
7517 @acronym{VLIW} processors, or more precisely, to describe a placement | |
7518 of small instructions into @acronym{VLIW} instruction slots. They | |
7519 can be used for @acronym{RISC} processors, too. | |
7520 | |
7521 @smallexample | |
7522 (exclusion_set @var{unit-names} @var{unit-names}) | |
7523 (presence_set @var{unit-names} @var{patterns}) | |
7524 (final_presence_set @var{unit-names} @var{patterns}) | |
7525 (absence_set @var{unit-names} @var{patterns}) | |
7526 (final_absence_set @var{unit-names} @var{patterns}) | |
7527 @end smallexample | |
7528 | |
7529 @var{unit-names} is a string giving names of functional units | |
7530 separated by commas. | |
7531 | |
7532 @var{patterns} is a string giving patterns of functional units | |
7533 separated by commas. Currently a pattern is one unit or several units | |
7534 separated by white space. | |
7535 | |
7536 The first construction (@samp{exclusion_set}) means that each | |
7537 functional unit in the first string can not be reserved simultaneously | |
7538 with a unit whose name is in the second string and vice versa. For | |
7539 example, the construction is useful for describing processors | |
7540 (e.g.@: some SPARC processors) with a fully pipelined floating point | |
7541 functional unit which can execute simultaneously only single floating | |
7542 point insns or only double floating point insns. | |
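
Such a restriction could be sketched as follows, assuming two
hypothetical units @samp{fp_single} and @samp{fp_double} that stand for
single and double precision use of the floating point unit:

@smallexample
(define_cpu_unit "fp_single, fp_double")

;; Neither unit may be reserved while the other one is reserved.
(exclusion_set "fp_single" "fp_double")
@end smallexample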
7543 | |
7544 The second construction (@samp{presence_set}) means that each | |
7545 functional unit in the first string can not be reserved unless at | |
7546 least one of the patterns of units whose names are in the second string is | |
7547 reserved. This is an asymmetric relation. For example, it is useful | |
7548 for describing that @acronym{VLIW} @samp{slot1} is reserved after | |
7549 @samp{slot0} reservation. We could describe it by the following | |
7550 construction: | |
7551 | |
7552 @smallexample | |
7553 (presence_set "slot1" "slot0") | |
7554 @end smallexample | |
7555 | |
7556 Or @samp{slot1} is reserved only after @samp{slot0} and unit @samp{b0} | |
7557 reservation. In this case we could write | |
7558 | |
7559 @smallexample | |
7560 (presence_set "slot1" "slot0 b0") | |
7561 @end smallexample | |
7562 | |
7563 The third construction (@samp{final_presence_set}) is analogous to | |
7564 @samp{presence_set}. The difference between them is when checking is | |
7565 done. When an instruction is issued in a given automaton state | |
7566 reflecting all current and planned unit reservations, the automaton | |
7567 state is changed. The first state is a source state, the second one | |
7568 is a result state. Checking for @samp{presence_set} is done on the | |
7569 source state reservation, checking for @samp{final_presence_set} is | |
7570 done on the result reservation. This construction is useful to | |
7571 describe a reservation which is actually two subsequent reservations. | |
7572 For example, if we use | |
7573 | |
7574 @smallexample | |
7575 (presence_set "slot1" "slot0") | |
7576 @end smallexample | |
7577 | |
7578 the following insn will never be issued (because @samp{slot1} requires | |
7579 @samp{slot0} which is absent in the source state). | |
7580 | |
7581 @smallexample | |
7582 (define_reservation "insn_and_nop" "slot0 + slot1") | |
7583 @end smallexample | |
7584 | |
7585 but it can be issued if we use analogous @samp{final_presence_set}. | |
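
That is, with the same unit names as above, the construction would
instead read:

@smallexample
;; Checked on the result state, where slot0 is already reserved.
(final_presence_set "slot1" "slot0")
@end smallexample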
7586 | |
7587 The fourth construction (@samp{absence_set}) means that each functional | |
7588 unit in the first string can be reserved only if each pattern of units | |
7589 whose names are in the second string is not reserved. This is an | |
7590 asymmetric relation (actually @samp{exclusion_set} is analogous to | |
7591 this one but it is symmetric). For example it might be useful in a | |
7592 @acronym{VLIW} description to say that @samp{slot0} cannot be reserved | |
7593 after either @samp{slot1} or @samp{slot2} have been reserved. This | |
7594 can be described as: | |
7595 | |
7596 @smallexample | |
7597 (absence_set "slot0" "slot1, slot2") | |
7598 @end smallexample | |
7599 | |
7600 Or @samp{slot2} can not be reserved if @samp{slot0} and unit @samp{b0} | |
7601 are reserved or @samp{slot1} and unit @samp{b1} are reserved. In | |
7602 this case we could write | |
7603 | |
7604 @smallexample | |
7605 (absence_set "slot2" "slot0 b0, slot1 b1") | |
7606 @end smallexample | |
7607 | |
7608 All functional units mentioned in a set should belong to the same | |
7609 automaton. | |
7610 | |
7611 The last construction (@samp{final_absence_set}) is analogous to | |
7612 @samp{absence_set} but checking is done on the result (state) | |
7613 reservation. See comments for @samp{final_presence_set}. | |
7614 | |
7615 @findex automata_option | |
7616 @cindex deterministic finite state automaton | |
7617 @cindex nondeterministic finite state automaton | |
7618 @cindex finite state automaton minimization | |
7619 You can control the generator of the pipeline hazard recognizer with | |
7620 the following construction. | |
7621 | |
7622 @smallexample | |
7623 (automata_option @var{options}) | |
7624 @end smallexample | |
7625 | |
7626 @var{options} is a string giving options which affect the generated | |
7627 code. Currently there are the following options: | |
7628 | |
7629 @itemize @bullet | |
7630 @item | |
7631 @dfn{no-minimization} makes no minimization of the automaton. This is | |
7632 only worth doing when we are debugging the description and need to | |
7633 look more accurately at reservations of states. | |
7634 | |
7635 @item | |
7636 @dfn{time} means printing time statistics about the generation of | |
7637 automata. | |
7638 | |
7639 @item | |
7640 @dfn{stats} means printing statistics about the generated automata | |
7641 such as the number of DFA states, NDFA states and arcs. | |
7642 | |
7643 @item | |
7644 @dfn{v} means generation of a file describing the resulting automata. | |
7645 The file has suffix @samp{.dfa} and can be used for description | |
7646 verification and debugging. | |
7647 | |
7648 @item | |
7649 @dfn{w} means generation of warnings instead of errors for | |
7650 non-critical errors. | |
7651 | |
7652 @item | |
7653 @dfn{ndfa} makes nondeterministic finite state automata. This affects | |
7654 the treatment of operator @samp{|} in the regular expressions. The | |
7655 usual treatment of the operator is to try the first alternative and, | |
7656 if the reservation is not possible, the second alternative. The | |
7657 nondeterministic treatment means trying all alternatives, some of which | |
7658 may be rejected by reservations in the subsequent insns. | |
7659 | |
7660 @item | |
7661 @dfn{progress} means output of a progress bar showing how many states | |
7662 were generated so far for the automaton being processed. This is useful | |
7663 when debugging a @acronym{DFA} description. If you see too many | |
7664 generated states, you could interrupt the generator of the pipeline | |
7665 hazard recognizer and try to figure out a reason for generation of the | |
7666 huge automaton. | |
7667 @end itemize | |
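
For example, a description under development might request the
nondeterministic treatment together with a @samp{.dfa} dump. This
sketch gives one option per construction:

@smallexample
(automata_option "ndfa")
(automata_option "v")
@end smallexample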
7668 | |
7669 As an example, consider a superscalar @acronym{RISC} machine which can | |
7670 issue three insns (two integer insns and one floating point insn) on | |
7671 each cycle but can finish only two insns. To describe this, we define | |
7672 the following functional units. | |
7673 | |
7674 @smallexample | |
7675 (define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline") | |
7676 (define_cpu_unit "port0, port1") | |
7677 @end smallexample | |
7678 | |
7679 All simple integer insns can be executed in any integer pipeline and | |
7680 their result is ready in two cycles. The simple integer insns are | |
7681 issued into the first pipeline unless it is reserved, otherwise they | |
7682 are issued into the second pipeline. Integer division and | |
7683 multiplication insns can be executed only in the second integer | |
7684 pipeline and their results are ready correspondingly in 8 and 4 | |
7685 cycles. The integer division is not pipelined, i.e.@: the subsequent | |
7686 integer division insn can not be issued until the current division | |
7687 insn has finished. Floating point insns are fully pipelined and their | |
7688 results are ready in 3 cycles. Where the result of a floating point | |
7689 insn is used by an integer insn, an additional delay of one cycle is | |
7690 incurred. To describe all of this we could specify | |
7691 | |
7692 @smallexample | |
7693 (define_cpu_unit "div") | |
7694 | |
7695 (define_insn_reservation "simple" 2 (eq_attr "type" "int") | |
7696 "(i0_pipeline | i1_pipeline), (port0 | port1)") | |
7697 | |
7698 (define_insn_reservation "mult" 4 (eq_attr "type" "mult") | |
7699 "i1_pipeline, nothing*2, (port0 | port1)") | |
7700 | |
7701 (define_insn_reservation "div" 8 (eq_attr "type" "div") | |
7702 "i1_pipeline, div*7, div + (port0 | port1)") | |
7703 | |
7704 (define_insn_reservation "float" 3 (eq_attr "type" "float") | |
7705 "f_pipeline, nothing, (port0 | port1)") | |
7706 | |
7707 (define_bypass 4 "float" "simple,mult,div") | |
7708 @end smallexample | |
7709 | |
7710 To simplify the description we could describe the following reservation | |
7711 | |
7712 @smallexample | |
7713 (define_reservation "finish" "port0|port1") | |
7714 @end smallexample | |
7715 | |
7716 and use it in all @code{define_insn_reservation} as in the following | |
7717 construction | |
7718 | |
7719 @smallexample | |
7720 (define_insn_reservation "simple" 2 (eq_attr "type" "int") | |
7721 "(i0_pipeline | i1_pipeline), finish") | |
7722 @end smallexample | |
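As a sketch, the remaining reservations from the example above could be
rewritten in the same way.  This assumes that a name introduced by
@code{define_reservation} may appear wherever a unit name may, including
as an operand of @samp{+}:

@smallexample
(define_insn_reservation "mult" 4 (eq_attr "type" "mult")
  "i1_pipeline, nothing*2, finish")

(define_insn_reservation "div" 8 (eq_attr "type" "div")
  "i1_pipeline, div*7, div + finish")

(define_insn_reservation "float" 3 (eq_attr "type" "float")
  "f_pipeline, nothing, finish")
@end smallexample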
7723 | |
7724 | |
7725 @end ifset | |
7726 @ifset INTERNALS | |
7727 @node Conditional Execution | |
7728 @section Conditional Execution | |
7729 @cindex conditional execution | |
7730 @cindex predication | |
7731 | |
7732 A number of architectures provide for some form of conditional | |
7733 execution, or predication. The hallmark of this feature is the | |
7734 ability to nullify most of the instructions in the instruction set. | |
7735 When the instruction set is large and not entirely symmetric, it | |
7736 can be quite tedious to describe these forms directly in the | |
7737 @file{.md} file. An alternative is the @code{define_cond_exec} template. | |
7738 | |
7739 @findex define_cond_exec | |
7740 @smallexample | |
7741 (define_cond_exec | |
7742 [@var{predicate-pattern}] | |
7743 "@var{condition}" | |
7744 "@var{output-template}") | |
7745 @end smallexample | |
7746 | |
7747 @var{predicate-pattern} is the condition that must be true for the | |
7748 insn to be executed at runtime and should match a relational operator. | |
7749 One can use @code{match_operator} to match several relational operators | |
7750 at once. Any @code{match_operand} operands must have no more than one | |
7751 alternative. | |
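
For instance, a port whose predicate register can be tested with any
comparison against zero might write something like the following sketch.
The standard predicate @code{comparison_operator} is used purely for
illustration; the constraint letter @samp{c} and the condition
@var{test2} are taken from the example later in this section, and the
empty output template assumes that the port prints the predicate by
other means (for example through @code{current_insn_predicate},
described below).

@smallexample
(define_cond_exec
  [(match_operator 0 "comparison_operator"
     [(match_operand:CC 1 "register_operand" "c")
      (const_int 0)])]
  "@var{test2}"
  "")
@end smallexample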
7752 | |
7753 @var{condition} is a C expression that must be true for the generated | |
7754 pattern to match. | |
7755 | |
7756 @findex current_insn_predicate | |
7757 @var{output-template} is a string similar to the @code{define_insn} | |
7758 output template (@pxref{Output Template}), except that the @samp{*} | |
7759 and @samp{@@} special cases do not apply. This is only useful if the | |
7760 assembly text for the predicate is a simple prefix to the main insn. | |
7761 In order to handle the general case, there is a global variable | |
7762 @code{current_insn_predicate} that will contain the entire predicate | |
7763 if the current insn is predicated, and will otherwise be @code{NULL}. | |
7764 | |
7765 When @code{define_cond_exec} is used, an implicit reference to | |
7766 the @code{predicable} instruction attribute is made. | |
7767 @xref{Insn Attributes}. This attribute must be boolean (i.e.@: have | |
7768 exactly two elements in its @var{list-of-values}). Further, it must | |
7769 not be used with complex expressions. That is, the default and all | |
7770 uses in the insns must be a simple constant, not dependent on the | |
7771 alternative or anything else. | |
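
A minimal sketch of such an attribute, and of an insn that opts in to
predication, might look as follows.  The value names @samp{no} and
@samp{yes} and the default of @samp{no} follow common port practice and
are assumptions here; @var{test1} is the same placeholder condition used
in the example that follows.

@smallexample
(define_attr "predicable" "no,yes" (const_string "no"))

(define_insn "addsi"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  "@var{test1}"
  "add %2,%1,%0"
  [(set_attr "predicable" "yes")])
@end smallexample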
7772 | |
7773 For each @code{define_insn} for which the @code{predicable} | |
7774 attribute is true, a new @code{define_insn} pattern will be | |
7775 generated that matches a predicated version of the instruction. | |
7776 For example, | |
7777 | |
7778 @smallexample | |
7779 (define_insn "addsi" | |
7780 [(set (match_operand:SI 0 "register_operand" "=r") | |
7781 (plus:SI (match_operand:SI 1 "register_operand" "r") | |
7782 (match_operand:SI 2 "register_operand" "r")))] | |
7783 "@var{test1}" | |
7784 "add %2,%1,%0") | |
7785 | |
7786 (define_cond_exec | |
7787 [(ne (match_operand:CC 0 "register_operand" "c") | |
7788 (const_int 0))] | |
7789 "@var{test2}" | |
7790 "(%0)") | |
7791 @end smallexample | |
7792 | |
7793 @noindent | |
7794 generates a new pattern | |
7795 | |
7796 @smallexample | |
7797 (define_insn "" | |
7798 [(cond_exec | |
7799 (ne (match_operand:CC 3 "register_operand" "c") (const_int 0)) | |
7800 (set (match_operand:SI 0 "register_operand" "=r") | |
7801 (plus:SI (match_operand:SI 1 "register_operand" "r") | |
7802 (match_operand:SI 2 "register_operand" "r"))))] | |
7803 "(@var{test2}) && (@var{test1})" | |
7804 "(%3) add %2,%1,%0") | |
7805 @end smallexample | |
7806 | |
7807 @end ifset | |
7808 @ifset INTERNALS | |
7809 @node Constant Definitions | |
7810 @section Constant Definitions | |
7811 @cindex constant definitions | |
7812 @findex define_constants | |
7813 | |
7814 Using literal constants inside instruction patterns reduces legibility and | |
7815 can be a maintenance problem. | |
7816 | |
7817 To overcome this problem, you may use the @code{define_constants} | |
7818 expression. It contains a vector of name-value pairs. From that | |
7819 point on, wherever any of the names appears in the MD file, it is as | |
7820 if the corresponding value had been written instead. You may use | |
7821 @code{define_constants} multiple times; each appearance adds more | |
7822 constants to the table. It is an error to redefine a constant with | |
7823 a different value. | |
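
For instance, the following sketch builds up the table in two steps.
The names and values are those of the a29k example below; splitting them
across two expressions is done here only for illustration.

@smallexample
(define_constants [(R_BP 177) (R_FC 178)])
(define_constants [(R_CR 179) (R_Q 180)])
@end smallexample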
7824 | |
7825 To come back to the a29k load multiple example, instead of | |
7826 | |
7827 @smallexample | |
7828 (define_insn "" | |
7829 [(match_parallel 0 "load_multiple_operation" | |
7830 [(set (match_operand:SI 1 "gpc_reg_operand" "=r") | |
7831 (match_operand:SI 2 "memory_operand" "m")) | |
7832 (use (reg:SI 179)) | |
7833 (clobber (reg:SI 179))])] | |
7834 "" | |
7835 "loadm 0,0,%1,%2") | |
7836 @end smallexample | |
7837 | |
7838 you could write: | |
7839 | |
7840 @smallexample | |
7841 (define_constants [ | |
7842 (R_BP 177) | |
7843 (R_FC 178) | |
7844 (R_CR 179) | |
7845 (R_Q 180) | |
7846 ]) | |
7847 | |
7848 (define_insn "" | |
7849 [(match_parallel 0 "load_multiple_operation" | |
7850 [(set (match_operand:SI 1 "gpc_reg_operand" "=r") | |
7851 (match_operand:SI 2 "memory_operand" "m")) | |
7852 (use (reg:SI R_CR)) | |
7853 (clobber (reg:SI R_CR))])] | |
7854 "" | |
7855 "loadm 0,0,%1,%2") | |
7856 @end smallexample | |
7857 | |
7858 The constants that are defined with @code{define_constants} are also output | |
7859 in the @file{insn-constants.h} header file as @code{#define}s. | |
7860 @end ifset | |
7861 @ifset INTERNALS | |
7862 @node Iterators | |
7863 @section Iterators | |
7864 @cindex iterators in @file{.md} files | |
7865 | |
7866 Ports often need to define similar patterns for more than one machine | |
7867 mode or for more than one rtx code. GCC provides some simple iterator | |
7868 facilities to make this process easier. | |
7869 | |
7870 @menu | |
7871 * Mode Iterators:: Generating variations of patterns for different modes. | |
7872 * Code Iterators:: Doing the same for codes. | |
7873 @end menu | |
7874 | |
7875 @node Mode Iterators | |
7876 @subsection Mode Iterators | |
7877 @cindex mode iterators in @file{.md} files | |
7878 | |
7879 Ports often need to define similar patterns for two or more different modes. | |
7880 For example: | |
7881 | |
7882 @itemize @bullet | |
7883 @item | |
7884 If a processor has hardware support for both single and double | |
7885 floating-point arithmetic, the @code{SFmode} patterns tend to be | |
7886 very similar to the @code{DFmode} ones. | |
7887 | |
7888 @item | |
7889 If a port uses @code{SImode} pointers in one configuration and | |
7890 @code{DImode} pointers in another, it will usually have very similar | |
7891 @code{SImode} and @code{DImode} patterns for manipulating pointers. | |
7892 @end itemize | |
7893 | |
7894 Mode iterators allow several patterns to be instantiated from one | |
7895 @file{.md} file template. They can be used with any type of | |
7896 rtx-based construct, such as a @code{define_insn}, | |
7897 @code{define_split}, or @code{define_peephole2}. | |
7898 | |
7899 @menu | |
7900 * Defining Mode Iterators:: Defining a new mode iterator. | |
7901 * Substitutions:: Combining mode iterators with substitutions | |
7902 * Examples:: Examples | |
7903 @end menu | |
7904 | |
7905 @node Defining Mode Iterators | |
7906 @subsubsection Defining Mode Iterators | |
7907 @findex define_mode_iterator | |
7908 | |
7909 The syntax for defining a mode iterator is: | |
7910 | |
7911 @smallexample | |
7912 (define_mode_iterator @var{name} [(@var{mode1} "@var{cond1}") @dots{} (@var{moden} "@var{condn}")]) | |
7913 @end smallexample | |
7914 | |
7915 This allows subsequent @file{.md} file constructs to use the mode suffix | |
7916 @code{:@var{name}}. Every construct that does so will be expanded | |
7917 @var{n} times, once with every use of @code{:@var{name}} replaced by | |
7918 @code{:@var{mode1}}, once with every use replaced by @code{:@var{mode2}}, | |
7919 and so on. In the expansion for a particular @var{modei}, every | |
7920 C condition will also require that @var{condi} be true. | |
7921 | |
7922 For example: | |
7923 | |
7924 @smallexample | |
7925 (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")]) | |
7926 @end smallexample | |
7927 | |
7928 defines a new mode suffix @code{:P}. Every construct that uses | |
7929 @code{:P} will be expanded twice, once with every @code{:P} replaced | |
7930 by @code{:SI} and once with every @code{:P} replaced by @code{:DI}. | |
7931 The @code{:SI} version will only apply if @code{Pmode == SImode} and | |
7932 the @code{:DI} version will only apply if @code{Pmode == DImode}. | |
7933 | |
7934 As with other @file{.md} conditions, an empty string is treated | |
7935 as ``always true''. @code{(@var{mode} "")} can also be abbreviated | |
7936 to @code{@var{mode}}. For example: | |
7937 | |
7938 @smallexample | |
7939 (define_mode_iterator GPR [SI (DI "TARGET_64BIT")]) | |
7940 @end smallexample | |
7941 | |
7942 means that the @code{:DI} expansion only applies if @code{TARGET_64BIT} | |
7943 but that the @code{:SI} expansion has no such constraint. | |
7944 | |
7945 Iterators are applied in the order they are defined. This can be | |
7946 significant if two iterators are used in a construct that requires | |
7947 substitutions. @xref{Substitutions}. | |
7948 | |
7949 @node Substitutions | |
7950 @subsubsection Substitution in Mode Iterators | |
7951 @findex define_mode_attr | |
7952 | |
7953 If an @file{.md} file construct uses mode iterators, each version of the | |
7954 construct will often need slightly different strings or modes. For | |
7955 example: | |
7956 | |
7957 @itemize @bullet | |
7958 @item | |
7959 When a @code{define_expand} defines several @code{add@var{m}3} patterns | |
7960 (@pxref{Standard Names}), each expander will need to use the | |
7961 appropriate mode name for @var{m}. | |
7962 | |
7963 @item | |
7964 When a @code{define_insn} defines several instruction patterns, | |
7965 each instruction will often use a different assembler mnemonic. | |
7966 | |
7967 @item | |
7968 When a @code{define_insn} requires operands with different modes, | |
7969 using an iterator for one of the operand modes usually requires a specific | |
7970 mode for the other operand(s). | |
7971 @end itemize | |
7972 | |
7973 GCC supports such variations through a system of ``mode attributes''. | |
7974 There are two standard attributes: @code{mode}, which is the name of | |
7975 the mode in lower case, and @code{MODE}, which is the same thing in | |
7976 upper case. You can define other attributes using: | |
7977 | |
7978 @smallexample | |
7979 (define_mode_attr @var{name} [(@var{mode1} "@var{value1}") @dots{} (@var{moden} "@var{valuen}")]) | |
7980 @end smallexample | |
7981 | |
7982 where @var{name} is the name of the attribute and @var{valuei} | |
7983 is the value associated with @var{modei}. | |
7984 | |
7985 When GCC replaces some @var{:iterator} with @var{:mode}, it will scan | |
7986 each string and mode in the pattern for sequences of the form | |
7987 @code{<@var{iterator}:@var{attr}>}, where @var{attr} is the name of a | |
7988 mode attribute. If the attribute is defined for @var{mode}, the whole | |
7989 @code{<@dots{}>} sequence will be replaced by the appropriate attribute | |
7990 value. | |
7991 | |
7992 For example, suppose an @file{.md} file has: | |
7993 | |
7994 @smallexample | |
7995 (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")]) | |
7996 (define_mode_attr load [(SI "lw") (DI "ld")]) | |
7997 @end smallexample | |
7998 | |
7999 If one of the patterns that uses @code{:P} contains the string | |
8000 @code{"<P:load>\t%0,%1"}, the @code{SI} version of that pattern | |
8001 will use @code{"lw\t%0,%1"} and the @code{DI} version will use | |
8002 @code{"ld\t%0,%1"}. | |
8003 | |
8004 Here is an example of using an attribute for a mode: | |
8005 | |
8006 @smallexample | |
8007 (define_mode_iterator LONG [SI DI]) | |
8008 (define_mode_attr SHORT [(SI "HI") (DI "SI")]) | |
8009 (define_insn @dots{} | |
8010 (sign_extend:LONG (match_operand:<LONG:SHORT> @dots{})) @dots{}) | |
8011 @end smallexample | |
8012 | |
8013 The @code{@var{iterator}:} prefix may be omitted, in which case the | |
8014 substitution will be attempted for every iterator expansion. | |
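
With only one iterator in use, the sign-extension sketch above could
therefore be written equivalently as:

@smallexample
(define_insn @dots{}
  (sign_extend:LONG (match_operand:<SHORT> @dots{})) @dots{})
@end smallexample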
8015 | |
8016 @node Examples | |
8017 @subsubsection Mode Iterator Examples | |
8018 | |
8019 Here is an example from the MIPS port. It defines the following | |
8020 modes and attributes (among others): | |
8021 | |
8022 @smallexample | |
8023 (define_mode_iterator GPR [SI (DI "TARGET_64BIT")]) | |
8024 (define_mode_attr d [(SI "") (DI "d")]) | |
8025 @end smallexample | |
8026 | |
8027 and uses the following template to define both @code{subsi3} | |
8028 and @code{subdi3}: | |
8029 | |
8030 @smallexample | |
8031 (define_insn "sub<mode>3" | |
8032 [(set (match_operand:GPR 0 "register_operand" "=d") | |
8033 (minus:GPR (match_operand:GPR 1 "register_operand" "d") | |
8034 (match_operand:GPR 2 "register_operand" "d")))] | |
8035 "" | |
8036 "<d>subu\t%0,%1,%2" | |
8037 [(set_attr "type" "arith") | |
8038 (set_attr "mode" "<MODE>")]) | |
8039 @end smallexample | |
8040 | |
8041 This is exactly equivalent to: | |
8042 | |
8043 @smallexample | |
8044 (define_insn "subsi3" | |
8045 [(set (match_operand:SI 0 "register_operand" "=d") | |
8046 (minus:SI (match_operand:SI 1 "register_operand" "d") | |
8047 (match_operand:SI 2 "register_operand" "d")))] | |
8048 "" | |
8049 "subu\t%0,%1,%2" | |
8050 [(set_attr "type" "arith") | |
8051 (set_attr "mode" "SI")]) | |
8052 | |
8053 (define_insn "subdi3" | |
8054 [(set (match_operand:DI 0 "register_operand" "=d") | |
8055 (minus:DI (match_operand:DI 1 "register_operand" "d") | |
8056 (match_operand:DI 2 "register_operand" "d")))] | |
8057 "" | |
8058 "dsubu\t%0,%1,%2" | |
8059 [(set_attr "type" "arith") | |
8060 (set_attr "mode" "DI")]) | |
8061 @end smallexample | |
8062 | |
8063 @node Code Iterators | |
8064 @subsection Code Iterators | |
8065 @cindex code iterators in @file{.md} files | |
8066 @findex define_code_iterator | |
8067 @findex define_code_attr | |
8068 | |
8069 Code iterators operate in a similar way to mode iterators. @xref{Mode Iterators}. | |
8070 | |
8071 The construct: | |
8072 | |
8073 @smallexample | |
8074 (define_code_iterator @var{name} [(@var{code1} "@var{cond1}") @dots{} (@var{coden} "@var{condn}")]) | |
8075 @end smallexample | |
8076 | |
8077 defines a pseudo rtx code @var{name} that can be instantiated as | |
8078 @var{codei} if condition @var{condi} is true. Each @var{codei} | |
8079 must have the same rtx format. @xref{RTL Classes}. | |
8080 | |
8081 As with mode iterators, each pattern that uses @var{name} will be | |
8082 expanded @var{n} times, once with all uses of @var{name} replaced by | |
8083 @var{code1}, once with all uses replaced by @var{code2}, and so on. | |
8084 @xref{Defining Mode Iterators}. | |
8085 | |
8086 It is possible to define attributes for codes as well as for modes. | |
8087 There are two standard code attributes: @code{code}, the name of the | |
8088 code in lower case, and @code{CODE}, the name of the code in upper case. | |
8089 Other attributes are defined using: | |
8090 | |
8091 @smallexample | |
8092 (define_code_attr @var{name} [(@var{code1} "@var{value1}") @dots{} (@var{coden} "@var{valuen}")]) | |
8093 @end smallexample | |
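
As a minimal sketch, a port could pair a code iterator over the two
extension codes with an attribute that supplies a mnemonic suffix.  The
names @code{any_extend} and @code{u}, the pattern name, and the
@samp{exth}/@samp{extuh} mnemonics are made up for this illustration.

@smallexample
(define_code_iterator any_extend [sign_extend zero_extend])
(define_code_attr u [(sign_extend "") (zero_extend "u")])

(define_insn "*<code>hisi"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (any_extend:SI (match_operand:HI 1 "register_operand" "r")))]
  ""
  "ext<u>h\t%0,%1")
@end smallexample

Under these assumptions the @code{sign_extend} version would use
@code{"exth\t%0,%1"} and the @code{zero_extend} version would use
@code{"extuh\t%0,%1"}.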
8094 | |
8095 Here's an example of code iterators in action, taken from the MIPS port: | |
8096 | |
8097 @smallexample | |
8098 (define_code_iterator any_cond [unordered ordered unlt unge uneq ltgt unle ungt | |
8099 eq ne gt ge lt le gtu geu ltu leu]) | |
8100 | |
8101 (define_expand "b<code>" | |
8102 [(set (pc) | |
8103 (if_then_else (any_cond:CC (cc0) | |
8104 (const_int 0)) | |
8105 (label_ref (match_operand 0 "")) | |
8106 (pc)))] | |
8107 "" | |
8108 @{ | |
8109 gen_conditional_branch (operands, <CODE>); | |
8110 DONE; | |
8111 @}) | |
8112 @end smallexample | |
8113 | |
8114 This is equivalent to: | |
8115 | |
8116 @smallexample | |
8117 (define_expand "bunordered" | |
8118 [(set (pc) | |
8119 (if_then_else (unordered:CC (cc0) | |
8120 (const_int 0)) | |
8121 (label_ref (match_operand 0 "")) | |
8122 (pc)))] | |
8123 "" | |
8124 @{ | |
8125 gen_conditional_branch (operands, UNORDERED); | |
8126 DONE; | |
8127 @}) | |
8128 | |
8129 (define_expand "bordered" | |
8130 [(set (pc) | |
8131 (if_then_else (ordered:CC (cc0) | |
8132 (const_int 0)) | |
8133 (label_ref (match_operand 0 "")) | |
8134 (pc)))] | |
8135 "" | |
8136 @{ | |
8137 gen_conditional_branch (operands, ORDERED); | |
8138 DONE; | |
8139 @}) | |
8140 | |
8141 @dots{} | |
8142 @end smallexample | |
8143 | |
8144 @end ifset |