CbC/CbC_gcc: gcc/doc/md.texi comparison

comparison gcc/doc/md.texi @ 145:1830386684a0

gcc-9.2.0

author	anatofuz
date	Thu, 13 Feb 2020 11:34:05 +0900
parents	84e7813d76e9
children


-:84e7813d76e9
+:1830386684a0
-@c Copyright (C) 1988-2018 Free Software Foundation, Inc.
+@c Copyright (C) 1988-2020 Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.
 @ifset INTERNALS
 @node Machine Desc
 The stack pointer register (@code{SP})
 @item w
 Floating point register, Advanced SIMD vector register or SVE vector register
+@item x
+Like @code{w}, but restricted to registers 0 to 15 inclusive.
+@item y
+Like @code{w}, but restricted to registers 0 to 7 inclusive.
 @item Upl
 One of the low eight SVE predicate registers (@code{P0} to @code{P7})
 @item Upa
 Any of the SVE predicate registers (@code{P0} to @code{P15})
 A memory address which uses a single base register with no offset
 @item Ump
 A memory address suitable for a load/store pair instruction in SI, DI, SF and
 DF modes
+@end table
+@item AMD GCN ---@file{config/gcn/constraints.md}
+@table @code
+@item I
+Immediate integer in the range @minus{}16 to 64
+@item J
+Immediate 16-bit signed integer
+@item Kf
+Immediate constant @minus{}1
+@item L
+Immediate 15-bit unsigned integer
+@item A
+Immediate constant that can be inlined in an instruction encoding: integer
+@minus{}16..64, or float 0.0, +/@minus{}0.5, +/@minus{}1.0, +/@minus{}2.0,
++/@minus{}4.0, 1.0/(2.0*PI)
+@item B
+Immediate 32-bit signed integer that can be attached to an instruction encoding
+@item C
+Immediate 32-bit integer in range @minus{}16..4294967295 (i.e. 32-bit unsigned
+integer or @samp{A} constraint)
+@item DA
+Immediate 64-bit constant that can be split into two @samp{A} constants
+@item DB
+Immediate 64-bit constant that can be split into two @samp{B} constants
+@item U
+Any @code{unspec}
+@item Y
+Any @code{symbol_ref} or @code{label_ref}
+@item v
+VGPR register
+@item Sg
+SGPR register
+@item SD
+SGPR registers valid for instruction destinations, including VCC, M0 and EXEC
+@item SS
+SGPR registers valid for instruction sources, including VCC, M0, EXEC and SCC
+@item Sm
+SGPR registers valid as a source for scalar memory instructions (excludes M0
+and EXEC)
+@item Sv
+SGPR registers valid as a source or destination for vector instructions
+(excludes EXEC)
+@item ca
+All condition registers: SCC, VCCZ, EXECZ
+@item cs
+Scalar condition register: SCC
+@item cV
+Vector condition register: VCC, VCC_LO, VCC_HI
+@item e
+EXEC register (EXEC_LO and EXEC_HI)
+@item RB
+Memory operand with address space suitable for @code{buffer_*} instructions
+@item RF
+Memory operand with address space suitable for @code{flat_*} instructions
+@item RS
+Memory operand with address space suitable for @code{s_*} instructions
+@item RL
+Memory operand with address space suitable for @code{ds_*} LDS instructions
+@item RG
+Memory operand with address space suitable for @code{ds_*} GDS instructions
+@item RD
+Memory operand with address space suitable for any @code{ds_*} instructions
+@item RM
+Memory operand with address space suitable for @code{global_*} instructions
 @end table
 @item ARC ---@file{config/arc/constraints.md}
 @item f
 M register
 @item c
-Registers used for circular buffering, i.e. I, B, or L registers.
+Registers used for circular buffering, i.e.@: I, B, or L registers.
 @item C
 The CC register.
 @item t
 representing a supported PIC or TLS relocation.
 @end ifset
 @end table
+@item OpenRISC---@file{config/or1k/constraints.md}
+@table @code
+@item I
+Integer that is valid as an immediate operand in an
+instruction taking a signed 16-bit number. Range
+@minus{}32768 to 32767.
+@item K
+Integer that is valid as an immediate operand in an
+instruction taking an unsigned 16-bit number. Range
+0 to 65535.
+@item M
+Signed 16-bit constant shifted left 16 bits. (Used with @code{l.movhi})
+@item O
+Zero
+@ifset INTERNALS
+@item c
+Register usable for sibcalls.
+@end ifset
+@end table
 @item PDP-11---@file{config/pdp11/constraints.md}
 @table @code
 @item a
 Floating point registers AC0 through AC3.  These can be loaded from/to
 memory with a single instruction.
 @end table
 @item PowerPC and IBM RS6000---@file{config/rs6000/constraints.md}
 @table @code
+@item r
+A general purpose register (GPR), @code{r0}@dots{}@code{r31}.
 @item b
-Address base register
+A base register.  Like @code{r}, but @code{r0} is not allowed, so
+@code{r1}@dots{}@code{r31}.
+@item f
+A floating point register (FPR), @code{f0}@dots{}@code{f31}.
 @item d
-Floating point register (containing 64-bit value)
+A floating point register.  This is the same as @code{f} nowadays;
+historically @code{f} was for single-precision and @code{d} was for
-@item f
+double-precision floating point.
-Floating point register (containing 32-bit value)
 @item v
-Altivec vector register
+An Altivec vector register (VR), @code{v0}@dots{}@code{v31}.
 @item wa
-Any VSX register if the @option{-mvsx} option was used or NO_REGS.
+A VSX register (VSR), @code{vs0}@dots{}@code{vs63}.  This is either an
+FPR (@code{vs0}@dots{}@code{vs31} are @code{f0}@dots{}@code{f31}) or a VR
-When using any of the register constraints (@code{wa}, @code{wd},
+(@code{vs32}@dots{}@code{vs63} are @code{v0}@dots{}@code{v31}).
-@code{wf}, @code{wg}, @code{wh}, @code{wi}, @code{wj}, @code{wk},
-@code{wl}, @code{wm}, @code{wo}, @code{wp}, @code{wq}, @code{ws},
+When using @code{wa}, you should use the @code{%x} output modifier, so that
-@code{wt}, @code{wu}, @code{wv}, @code{ww}, or @code{wy})
+the correct register number is printed.  For example:
-that take VSX registers, you must use @code{%x<n>} in the template so
-that the correct register is used.  Otherwise the register number
-output in the assembly file will be incorrect if an Altivec register
-is an operand of a VSX instruction that expects VSX register
-numbering.
 @smallexample
 asm ("xvadddp %x0,%x1,%x2"
 : "=wa" (v1)
 : "wa" (v2), "wa" (v3));
 @end smallexample
-@noindent
+You should not use @code{%x} for @code{v} operands:
-is correct, but:
-@smallexample
-asm ("xvadddp %0,%1,%2"
-: "=wa" (v1)
-: "wa" (v2), "wa" (v3));
-@end smallexample
-@noindent
-is not correct.
-If an instruction only takes Altivec registers, you do not want to use
-@code{%x<n>}.
 @smallexample
 asm ("xsaddqp %0,%1,%2"
 : "=v" (v1)
 : "v" (v2), "v" (v3));
 @end smallexample
-@noindent
+@ifset INTERNALS
-is correct because the @code{xsaddqp} instruction only takes Altivec
+@item h
-registers, while:
+A special register (@code{vrsave}, @code{ctr}, or @code{lr}).
+@end ifset
-@smallexample
-asm ("xsaddqp %x0,%x1,%x2"
+@item c
-: "=v" (v1)
+The count register, @code{ctr}.
-: "v" (v2), "v" (v3));
-@end smallexample
+@item l
+The link register, @code{lr}.
-@noindent
-is incorrect.
+@item x
+Condition register field 0, @code{cr0}.
-@item wb
-Altivec register if @option{-mcpu=power9} is used or NO_REGS.
+@item y
+Any condition register field, @code{cr0}@dots{}@code{cr7}.
-@item wd
-VSX vector register to hold vector double data or NO_REGS.
+@ifset INTERNALS
+@item z
+The carry bit, @code{XER[CA]}.
 @item we
-VSX register if the @option{-mcpu=power9} and @option{-m64} options
+Like @code{wa}, if @option{-mpower9-vector} and @option{-m64} are used;
-were used or NO_REGS.
+otherwise, @code{NO_REGS}.
-@item wf
-VSX vector register to hold vector float data or NO_REGS.
-@item wg
-If @option{-mmfpgpr} was used, a floating point register or NO_REGS.
-@item wh
-Floating point register if direct moves are available, or NO_REGS.
-@item wi
-FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.
-@item wj
-FP or VSX register to hold 64-bit integers for direct moves or NO_REGS.
-@item wk
-FP or VSX register to hold 64-bit doubles for direct moves or NO_REGS.
-@item wl
-Floating point register if the LFIWAX instruction is enabled or NO_REGS.
-@item wm
-VSX register if direct move instructions are enabled, or NO_REGS.
 @item wn
-No register (NO_REGS).
+No register (@code{NO_REGS}).
-@item wo
-VSX register to use for ISA 3.0 vector instructions, or NO_REGS.
-@item wp
-VSX register to use for IEEE 128-bit floating point TFmode, or NO_REGS.
-@item wq
-VSX register to use for IEEE 128-bit floating point, or NO_REGS.
 @item wr
-General purpose register if 64-bit instructions are enabled or NO_REGS.
+Like @code{r}, if @option{-mpowerpc64} is used; otherwise, @code{NO_REGS}.
-@item ws
-VSX vector register to hold scalar double values or NO_REGS.
-@item wt
-VSX vector register to hold 128 bit integer or NO_REGS.
-@item wu
-Altivec register to use for float/32-bit int loads/stores  or NO_REGS.
-@item wv
-Altivec register to use for double loads/stores  or NO_REGS.
-@item ww
-FP or VSX register to perform float operations under @option{-mvsx} or NO_REGS.
 @item wx
-Floating point register if the STFIWX instruction is enabled or NO_REGS.
+Like @code{d}, if @option{-mpowerpc-gfxopt} is used; otherwise, @code{NO_REGS}.
-@item wy
-FP or VSX register to perform ISA 2.07 float ops or NO_REGS.
-@item wz
-Floating point register if the LFIWZX instruction is enabled or NO_REGS.
 @item wA
-Address base register if 64-bit instructions are enabled or NO_REGS.
+Like @code{b}, if @option{-mpowerpc64} is used; otherwise, @code{NO_REGS}.
 @item wB
-Signed 5-bit constant integer that can be loaded into an altivec register.
+Signed 5-bit constant integer that can be loaded into an Altivec register.
 @item wD
 Int constant that is the element number of the 64-bit scalar in a vector.
 @item wE
 Vector constant that can be loaded with the XXSPLTIB instruction.
 @item wF
-Memory operand suitable for power9 fusion load/stores.
+Memory operand suitable for power8 GPR load fusion.
-@item wG
-Memory operand suitable for TOC fusion memory references.
-@item wH
-Altivec register if @option{-mvsx-small-integer}.
-@item wI
-Floating point register if @option{-mvsx-small-integer}.
-@item wJ
-FP register if @option{-mvsx-small-integer} and @option{-mpower9-vector}.
-@item wK
-Altivec register if @option{-mvsx-small-integer} and @option{-mpower9-vector}.
 @item wL
-Int constant that is the element number that the MFVSRLD instruction.
+Int constant that is the element number mfvsrld accesses in a vector.
-targets.
 @item wM
 Match vector constant with all 1's if the XXLORC instruction is available.
 @item wO
-A memory operand suitable for the ISA 3.0 vector d-form instructions.
+Memory operand suitable for the ISA 3.0 vector d-form instructions.
 @item wQ
-A memory address that will work with the @code{lq} and @code{stq}
+Memory operand suitable for the load/store quad instructions.
-instructions.
 @item wS
 Vector constant that can be loaded with XXSPLTIB & sign extension.
-@item h
+@item wY
-@samp{MQ}, @samp{CTR}, or @samp{LINK} register
+A memory operand for a DS-form instruction.
-@item c
+@item wZ
-@samp{CTR} register
+An indexed or indirect memory operand, ignoring the bottom 4 bits.
+@end ifset
-@item l
-@samp{LINK} register
-@item x
-@samp{CR} register (condition register) number 0
-@item y
-@samp{CR} register (condition register)
-@item z
-@samp{XER[CA]} carry bit (part of the XER register)
 @item I
-Signed 16-bit constant
+A signed 16-bit constant.
 @item J
-Unsigned 16-bit constant shifted left 16 bits (use @samp{L} instead for
+An unsigned 16-bit constant shifted left 16 bits (use @code{L} instead
-@code{SImode} constants)
+for @code{SImode} constants).
 @item K
-Unsigned 16-bit constant
+An unsigned 16-bit constant.
 @item L
-Signed 16-bit constant shifted left 16 bits
+A signed 16-bit constant shifted left 16 bits.
+@ifset INTERNALS
 @item M
-Constant larger than 31
+An integer constant greater than 31.
 @item N
-Exact power of 2
+An exact power of 2.
 @item O
-Zero
+The integer constant zero.
 @item P
-Constant whose negation is a signed 16-bit constant
+A constant whose negation is a signed 16-bit constant.
+@end ifset
+@item eI
+A signed 34-bit integer constant if prefixed instructions are supported.
+@ifset INTERNALS
 @item G
-Floating point constant that can be loaded into a register with one
+A floating point constant that can be loaded into a register with one
-instruction per word
+instruction per word.
 @item H
-Integer/Floating point constant that can be loaded into a register using
+A floating point constant that can be loaded into a register using
-three instructions
+three instructions.
+@end ifset
 @item m
-Memory operand.
+A memory operand.
 Normally, @code{m} does not allow addresses that update the base register.
-If @samp{<} or @samp{>} constraint is also used, they are allowed and
+If the @code{<} or @code{>} constraint is also used, they are allowed and
 therefore on PowerPC targets in that case it is only safe
-to use @samp{m<>} in an @code{asm} statement if that @code{asm} statement
+to use @code{m<>} in an @code{asm} statement if that @code{asm} statement
 accesses the operand exactly once.  The @code{asm} statement must also
-use @samp{%U@var{<opno>}} as a placeholder for the ``update'' flag in the
+use @code{%U@var{<opno>}} as a placeholder for the ``update'' flag in the
 corresponding load or store instruction.  For example:
 @smallexample
 asm ("st%U0 %1,%0" : "=m<>" (mem) : "r" (val));
 @end smallexample
 asm ("st %1,%0" : "=m<>" (mem) : "r" (val));
 @end smallexample
 is not.
+@ifset INTERNALS
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when
-@samp{m} allowed automodification of the base register, but as those are now only
+@code{m} allowed automodification of the base register, but as those
-allowed when @samp{<} or @samp{>} is used, @samp{es} is basically the same
+are now only allowed when @code{<} or @code{>} is used, @code{es} is
-as @samp{m} without @samp{<} and @samp{>}.
+basically the same as @code{m} without @code{<} and @code{>}.
+@end ifset
 @item Q
-Memory operand that is an offset from a register (it is usually better
+A memory operand addressed by just a base register.
-to use @samp{m} or @samp{es} in @code{asm} statements)
+@ifset INTERNALS
+@item Y
+A memory operand for a DQ-form instruction.
+@end ifset
 @item Z
-Memory operand that is an indexed or indirect from a register (it is
+A memory operand accessed with indexed or indirect addressing.
-usually better to use @samp{m} or @samp{es} in @code{asm} statements)
+@ifset INTERNALS
 @item R
-AIX TOC entry
+An AIX TOC entry.
+@end ifset
 @item a
-Address operand that is an indexed or indirect from a register (@samp{p} is
+An indexed or indirect address.
-preferable for @code{asm} statements)
+@ifset INTERNALS
 @item U
-System V Release 4 small data area reference
+A V.4 small data reference.
 @item W
-Vector constant that does not require memory
+A vector constant that does not require memory.
 @item j
-Vector constant that is all zeros.
+The zero vector constant.
+@end ifset
+@end table
+@item PRU---@file{config/pru/constraints.md}
+@table @code
+@item I
+An unsigned 8-bit integer constant.
+@item J
+An unsigned 16-bit integer constant.
+@item L
+An unsigned 5-bit integer constant (for shift counts).
+@item T
+A text segment (program memory) constant label.
+@item Z
+Integer constant zero.
 @end table
 @item RL78---@file{config/rl78/constraints.md}
 @table @code
 @item RISC-V---@file{config/riscv/constraints.md}
 @table @code
 @item f
-A floating-point register (if availiable).
+A floating-point register (if available).
 @item I
 An I-type 12-bit signed immediate.
 @item J
 @item w
 Memory address with only a base register
 @item Y
 Vector zero
-@end table
-@item SPU---@file{config/spu/spu.h}
-@table @code
-@item a
-An immediate which can be loaded with the il/ila/ilh/ilhu instructions.  const_int is treated as a 64 bit value.
-@item c
-An immediate for and/xor/or instructions.  const_int is treated as a 64 bit value.
-@item d
-An immediate for the @code{iohl} instruction.  const_int is treated as a 64 bit value.
-@item f
-An immediate which can be loaded with @code{fsmbi}.
-@item A
-An immediate which can be loaded with the il/ila/ilh/ilhu instructions.  const_int is treated as a 32 bit value.
-@item B
-An immediate for most arithmetic instructions.  const_int is treated as a 32 bit value.
-@item C
-An immediate for and/xor/or instructions.  const_int is treated as a 32 bit value.
-@item D
-An immediate for the @code{iohl} instruction.  const_int is treated as a 32 bit value.
-@item I
-A constant in the range [@minus{}64, 63] for shift/rotate instructions.
-@item J
-An unsigned 7-bit constant for conversion/nop/channel instructions.
-@item K
-A signed 10-bit constant for most arithmetic instructions.
-@item M
-A signed 16 bit immediate for @code{stop}.
-@item N
-An unsigned 16-bit constant for @code{iohl} and @code{fsmbi}.
-@item O
-An unsigned 7-bit constant whose 3 least significant bits are 0.
-@item P
-An unsigned 3-bit constant for 16-byte rotates and shifts
-@item R
-Call operand, reg, for indirect calls
-@item S
-Call operand, symbol, for relative calls.
-@item T
-Call operand, const_int, for absolute calls.
-@item U
-An immediate which can be loaded with the il/ila/ilh/ilhu instructions.  const_int is sign extended to 128 bit.
-@item W
-An immediate for shift and rotate instructions.  const_int is treated as a 32 bit value.
-@item Y
-An immediate for and/xor/or instructions.  const_int is sign extended as a 128 bit.
-@item Z
-An immediate for the @code{iohl} instruction.  const_int is sign extended to 128 bit.
 @end table
 @item TI C6X family---@file{config/c6x/constraints.md}
 @table @code
 @item u
 Second from top of 80387 floating-point stack (@code{%st(1)}).
 @ifset INTERNALS
 @item Yk
-Any mask register that can be used as a predicate, i.e. @code{k1-k7}.
+Any mask register that can be used as a predicate, i.e.@: @code{k1-k7}.
 @item k
 Any mask register.
 @end ifset
 @code{define_constraint}.
 @end deffn
 @deffn {MD Expression} define_special_memory_constraint name docstring exp
 Use this expression for constraints that match a subset of all memory
-operands: that is, @code{reload} can not make them match by reloading
+operands: that is, @code{reload} cannot make them match by reloading
 the address as it is described for @code{define_memory_constraint} or
 such address reload is undesirable from a performance point of view.
 For example, @code{define_special_memory_constraint} can be useful if
 specifically aligned memory is necessary or desirable for some insn
 operand0[j * c + i] = operand1[i][j];
 @end smallexample
 This pattern is not allowed to @code{FAIL}.
-@cindex @code{gather_load@var{m}} instruction pattern
+@cindex @code{gather_load@var{m}@var{n}} instruction pattern
-@item @samp{gather_load@var{m}}
+@item @samp{gather_load@var{m}@var{n}}
 Load several separate memory locations into a vector of mode @var{m}.
-Operand 1 is a scalar base address and operand 2 is a vector of
+Operand 1 is a scalar base address and operand 2 is a vector of mode @var{n}
-offsets from that base.  Operand 0 is a destination vector with the
+containing offsets from that base.  Operand 0 is a destination vector with
-same number of elements as the offset.  For each element index @var{i}:
+the same number of elements as @var{n}.  For each element index @var{i}:
 @itemize @bullet
 @item
 extend the offset element @var{i} to address width, using zero
 extension if operand 3 is 1 and sign extension if operand 3 is zero;
 @end itemize
 The value of operand 3 does not matter if the offsets are already
 address width.
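As a rough illustration of the gather semantics described above, the following C sketch models the per-element address computation for 32-bit elements and 32-bit offsets. The name gather_load_ref and its parameters are hypothetical and chosen only for this example. The multiplication by a scale factor (operand 4) follows the full manual text rather than the lines shown in this comparison, and the sketch assumes that intptr_t is wider than the offset type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model of a gather load, not GCC code.
   dst         models operand 0 (destination vector)
   base        models operand 1 (scalar base address)
   offsets     models operand 2 (vector of offsets)
   zero_extend models operand 3 (1 = zero extension, 0 = sign extension)
   scale       models operand 4 (assumed from the full manual text)  */
void gather_load_ref(int32_t *dst, const unsigned char *base,
                     const int32_t *offsets, size_t n,
                     int zero_extend, int scale)
{
    for (size_t i = 0; i < n; i++) {
        /* Extend offset element i to address width.  */
        intptr_t off = zero_extend ? (intptr_t)(uint32_t)offsets[i]
                                   : (intptr_t)offsets[i];
        /* Scale it, add it to the base and load element i.  */
        memcpy(&dst[i], base + off * scale, sizeof dst[i]);
    }
}
```

With a scale of 4 and offsets 3, 0 and 2, the model reads elements 3, 0 and 2 of an int32_t array starting at the base address.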
-@cindex @code{mask_gather_load@var{m}} instruction pattern
+@cindex @code{mask_gather_load@var{m}@var{n}} instruction pattern
-@item @samp{mask_gather_load@var{m}}
+@item @samp{mask_gather_load@var{m}@var{n}}
-Like @samp{gather_load@var{m}}, but takes an extra mask operand as
+Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand as
 operand 5.  Bit @var{i} of the mask is set if element @var{i}
 of the result should be loaded from memory and clear if element @var{i}
 of the result should be set to zero.
-@cindex @code{scatter_store@var{m}} instruction pattern
+@cindex @code{scatter_store@var{m}@var{n}} instruction pattern
-@item @samp{scatter_store@var{m}}
+@item @samp{scatter_store@var{m}@var{n}}
 Store a vector of mode @var{m} into several distinct memory locations.
-Operand 0 is a scalar base address and operand 1 is a vector of offsets
+Operand 0 is a scalar base address and operand 1 is a vector of mode
-from that base.  Operand 4 is the vector of values that should be stored,
+@var{n} containing offsets from that base.  Operand 4 is the vector of
-which has the same number of elements as the offset.  For each element
+values that should be stored, which has the same number of elements as
-index @var{i}:
+@var{n}.  For each element index @var{i}:
 @itemize @bullet
 @item
 extend the offset element @var{i} to address width, using zero
 extension if operand 2 is 1 and sign extension if operand 2 is zero;
 @end itemize
 The value of operand 2 does not matter if the offsets are already
 address width.
-@cindex @code{mask_scatter_store@var{m}} instruction pattern
+@cindex @code{mask_scatter_store@var{m}@var{n}} instruction pattern
-@item @samp{mask_scatter_store@var{m}}
+@item @samp{mask_scatter_store@var{m}@var{n}}
-Like @samp{scatter_store@var{m}}, but takes an extra mask operand as
+Like @samp{scatter_store@var{m}@var{n}}, but takes an extra mask operand as
 operand 5.  Bit @var{i} of the mask is set if element @var{i}
 of the result should be stored to memory.
 @cindex @code{vec_set@var{m}} instruction pattern
 @item @samp{vec_set@var{m}}
 @smallexample
 operand0[0] = operand1 < operand2;
 for (i = 1; i < GET_MODE_NUNITS (@var{n}); i++)
 operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
+@end smallexample
+@cindex @code{check_raw_ptrs@var{m}} instruction pattern
+@item @samp{check_raw_ptrs@var{m}}
+Check whether, given two pointers @var{a} and @var{b} and a length @var{len},
+a write of @var{len} bytes at @var{a} followed by a read of @var{len} bytes
+at @var{b} can be split into interleaved byte accesses
+@samp{@var{a}[0], @var{b}[0], @var{a}[1], @var{b}[1], @dots{}}
+without affecting the dependencies between the bytes.  Set operand 0
+to true if the split is possible and false otherwise.
+Operands 1, 2 and 3 provide the values of @var{a}, @var{b} and @var{len}
+respectively.  Operand 4 is a constant integer that provides the known
+common alignment of @var{a} and @var{b}.  All inputs have mode @var{m}.
+This split is possible if:
+@smallexample
+@var{a} == @var{b} || @var{a} + @var{len} <= @var{b} || @var{b} + @var{len} <= @var{a}
+@end smallexample
+You should only define this pattern if the target has a way of accelerating
+the test without having to do the individual comparisons.
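The condition above can be written as plain C for reference. This is only a sketch of the comparisons a target would otherwise have to perform individually. The function name raw_split_ok is hypothetical, addresses are modelled as uintptr_t, and wraparound of a + len is assumed not to occur.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of the check_raw_ptrs condition:
   a write of len bytes at a followed by a read of len bytes at b
   may be interleaved if the ranges coincide or do not overlap.  */
bool raw_split_ok(uintptr_t a, uintptr_t b, uintptr_t len)
{
    return a == b || a + len <= b || b + len <= a;
}
```

For example, with len equal to 8, the pairs (100, 100) and (100, 108) pass the test, while (100, 104) fails because the two ranges partially overlap.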
+@cindex @code{check_war_ptrs@var{m}} instruction pattern
+@item @samp{check_war_ptrs@var{m}}
+Like @samp{check_raw_ptrs@var{m}}, but with the read and write swapped round.
+The split is possible in this case if:
+@smallexample
+@var{b} <= @var{a} || @var{a} + @var{len} <= @var{b}
 @end smallexample
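A scalar C sketch of the write-after-read condition, under the same assumptions as the formula above (the name war_split_ok is hypothetical, addresses are modelled as uintptr_t, and a + len is assumed not to wrap around):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of the check_war_ptrs condition:
   a read of len bytes at a followed by a write of len bytes at b.
   The split is safe if b does not lie strictly inside (a, a + len).  */
bool war_split_ok(uintptr_t a, uintptr_t b, uintptr_t len)
{
    return b <= a || a + len <= b;
}
```

Unlike the read-after-write case, the test is asymmetric: with len equal to 8, (a, b) = (104, 100) passes, but (100, 104) does not.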
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
 Like @code{add@var{m}3} but is guaranteed to only be used for address
 calculations.  The expanded code is not allowed to clobber the
 condition code.  It only needs to be defined if @code{add@var{m}3}
 sets the condition code.  If adds used for address calculations and
 normal adds are not compatible it is required to expand a distinct
-pattern (e.g. using an unspec).  The pattern is used by LRA to emit
+pattern (e.g.@: using an unspec).  The pattern is used by LRA to emit
 address calculations.  @code{add@var{m}3} is used if
 @code{addptr@var{m}3} is not defined.
 @cindex @code{fma@var{m}4} instruction pattern
 @item @samp{fma@var{m}4}
 operand 2.  Store the result in scalar operand 0.  The vector has
 mode @var{m} and the scalars have the mode appropriate for one
 element of @var{m}.  The operation is strictly in-order: there is
 no reassociation.
+@cindex @code{mask_fold_left_plus_@var{m}} instruction pattern
+@item @samp{mask_fold_left_plus_@var{m}}
+Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
+(operand 3) that specifies which elements of the source vector should be added.
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
 @cindex @code{udot_prod@var{m}} instruction pattern
 @itemx @samp{udot_prod@var{m}}
 Compute the sum of the products of two signed/unsigned elements.
 Operands 0 and 2 are of the same mode, which is wider than the mode of
 operand 1. Add operand 1 to operand 2 and place the widened result in
 operand 0. (This is used to express accumulation of elements into an accumulator
 of a wider mode.)
+@cindex @code{smulhs@var{m3}} instruction pattern
+@item @samp{smulhs@var{m3}}
+@cindex @code{umulhs@var{m3}} instruction pattern
+@itemx @samp{umulhs@var{m3}}
+Signed/unsigned multiply high with scale. This is equivalent to the C code:
+@smallexample
+narrow op0, op1, op2;
+@dots{}
+op0 = (narrow) (((wide) op1 * (wide) op2) >> (N / 2 - 1));
+@end smallexample
+where the sign of @samp{narrow} determines whether this is a signed
+or unsigned operation, and @var{N} is the size of @samp{wide} in bits.
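To make the formula concrete, here is a minimal C sketch for one signed instantiation, assuming narrow is int16_t and wide is int32_t, so that N is 32 and the shift count is 15. The function name smulhs16 is hypothetical. The sketch also assumes that right-shifting a negative value is an arithmetic shift, which C leaves implementation-defined but which GCC guarantees.

```c
#include <stdint.h>

/* Hypothetical model of smulhs for 16-bit signed elements:
   multiply in the wide type, then drop the low N / 2 - 1 = 15 bits.  */
int16_t smulhs16(int16_t op1, int16_t op2)
{
    int32_t wide = (int32_t)op1 * (int32_t)op2;
    return (int16_t)(wide >> (32 / 2 - 1));
}
```

Reading the operands as Q15 fixed-point values, 16384 represents 0.5, and smulhs16 (16384, 16384) gives 8192, which represents 0.25. The result is truncated rather than rounded: smulhs16 (3, 16384) gives 1.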
+@cindex @code{smulhrs@var{m3}} instruction pattern
+@item @samp{smulhrs@var{m3}}
+@cindex @code{umulhrs@var{m3}} instruction pattern
+@itemx @samp{umulhrs@var{m3}}
+Signed/unsigned multiply high with round and scale. This is
+equivalent to the C code:
+@smallexample
+narrow op0, op1, op2;
+@dots{}
+op0 = (narrow) (((((wide) op1 * (wide) op2) >> (N / 2 - 2)) + 1) >> 1);
+@end smallexample
+where the sign of @samp{narrow} determines whether this is a signed
+or unsigned operation, and @var{N} is the size of @samp{wide} in bits.
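The rounding variant can be sketched in C in the same way, again assuming narrow is int16_t and wide is int32_t (N is 32). The function name smulhrs16 is hypothetical, and an arithmetic right shift of negative values is assumed, as GCC provides.

```c
#include <stdint.h>

/* Hypothetical model of smulhrs for 16-bit signed elements:
   keep one extra fraction bit, add 1 to round, then drop that bit.  */
int16_t smulhrs16(int16_t op1, int16_t op2)
{
    int32_t wide = (int32_t)op1 * (int32_t)op2;
    return (int16_t)(((wide >> (32 / 2 - 2)) + 1) >> 1);
}
```

The extra bit makes the result round to nearest instead of truncating: smulhrs16 (3, 16384) gives 2, where a plain shift by 15 would give 1.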
+@cindex @code{sdiv_pow2@var{m3}} instruction pattern
+@item @samp{sdiv_pow2@var{m3}}
+Signed division by power-of-2 immediate. Equivalent to:
+@smallexample
+signed op0, op1;
+@dots{}
+op0 = op1 / (1 << imm);
+@end smallexample
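Because C division truncates toward zero, this operation is not the same as an arithmetic right shift for negative inputs. The following sketch illustrates the difference, assuming 32-bit signed elements and an immediate smaller than 31. The function name sdiv_pow2_ref is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of sdiv_pow2 for 32-bit signed elements.
   The quotient rounds toward zero, as C division does.  */
int32_t sdiv_pow2_ref(int32_t op1, unsigned imm)
{
    return op1 / ((int32_t)1 << imm);
}
```

For example, sdiv_pow2_ref (-7, 2) is -1, whereas an arithmetic shift of -7 right by 2 would round toward negative infinity and give -2.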
 @cindex @code{vec_shl_insert_@var{m}} instruction pattern
 @item @samp{vec_shl_insert_@var{m}}
-Shift the elements in vector input operand 1 left one element (i.e.
+Shift the elements in vector input operand 1 left one element (i.e.@:
 away from element 0) and fill the vacated element 0 with the scalar
 in operand 2.  Store the result in vector output operand 0.  Operands
 0 and 1 have mode @var{m} and operand 2 has the mode appropriate for
 one element of @var{m}.
+@cindex @code{vec_shl_@var{m}} instruction pattern
+@item @samp{vec_shl_@var{m}}
+Whole vector left shift in bits, i.e.@: away from element 0.
+Operand 1 is a vector to be shifted.
+Operand 2 is an integer shift amount in bits.
+Operand 0 is where the resulting shifted vector is stored.
+The output and input vectors should have the same modes.
 @cindex @code{vec_shr_@var{m}} instruction pattern
 @item @samp{vec_shr_@var{m}}
-Whole vector right shift in bits, i.e. towards element 0.
+Whole vector right shift in bits, i.e.@: towards element 0.
 Operand 1 is a vector to be shifted.
 Operand 2 is an integer shift amount in bits.
 Operand 0 is where the resulting shifted vector is stored.
 The output and input vectors should have the same modes.
 @cindex @code{vec_pack_trunc_@var{m}} instruction pattern
 @item @samp{vec_pack_trunc_@var{m}}
 Narrow (demote) and merge the elements of two vectors. Operands 1 and 2
 are vectors of the same mode having N integral or floating point elements
 of size S@.  Operand 0 is the resulting vector in which 2*N elements of
-size N/2 are concatenated after narrowing them down using truncation.
+size S/2 are concatenated after narrowing them down using truncation.
+@cindex @code{vec_pack_sbool_trunc_@var{m}} instruction pattern
+@item @samp{vec_pack_sbool_trunc_@var{m}}
+Narrow and merge the elements of two vectors.  Operands 1 and 2 are vectors
+of the same type having N boolean elements.  Operand 0 is the resulting
+vector in which 2*N elements are concatenated.  The last operand (operand 3)
+is the number of elements in the output vector 2*N as a @code{CONST_INT}.
+This instruction pattern is used when all the vector input and output
+operands have the same scalar mode @var{m} and thus using
+@code{vec_pack_trunc_@var{m}} would be ambiguous.
 @cindex @code{vec_pack_ssat_@var{m}} instruction pattern
 @cindex @code{vec_pack_usat_@var{m}} instruction pattern
 @item @samp{vec_pack_ssat_@var{m}}, @samp{vec_pack_usat_@var{m}}
 Narrow (demote) and merge the elements of two vectors.  Operands 1 and 2
 @cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern
 @item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}
 Narrow, convert to signed/unsigned integral type and merge the elements
 of two vectors.  Operands 1 and 2 are vectors of the same mode having N
 floating point elements of size S@.  Operand 0 is the resulting vector
-in which 2*N elements of size N/2 are concatenated.
+in which 2*N elements of size S/2 are concatenated.
 @cindex @code{vec_packs_float_@var{m}} instruction pattern
 @cindex @code{vec_packu_float_@var{m}} instruction pattern
 @item @samp{vec_packs_float_@var{m}}, @samp{vec_packu_float_@var{m}}
 Narrow, convert to floating point type and merge the elements
 of two vectors.  Operands 1 and 2 are vectors of the same mode having N
 signed/unsigned integral elements of size S@.  Operand 0 is the resulting vector
-in which 2*N elements of size N/2 are concatenated.
+in which 2*N elements of size S/2 are concatenated.
 @cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
 @cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
 @item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
 Extract and widen (promote) the high/low part of a vector of signed
 @item @samp{vec_unpacku_hi_@var{m}}, @samp{vec_unpacku_lo_@var{m}}
 Extract and widen (promote) the high/low part of a vector of unsigned
 integral elements.  The input vector (operand 1) has N elements of size S.
 Widen (promote) the high/low elements of the vector using zero extension and
 place the resulting N/2 values of size 2*S in the output vector (operand 0).
+@cindex @code{vec_unpacks_sbool_hi_@var{m}} instruction pattern
+@cindex @code{vec_unpacks_sbool_lo_@var{m}} instruction pattern
+@item @samp{vec_unpacks_sbool_hi_@var{m}}, @samp{vec_unpacks_sbool_lo_@var{m}}
+Extract the high/low part of a vector of boolean elements that have scalar
+mode @var{m}.  The input vector (operand 1) has N elements, the output
+vector (operand 0) has N/2 elements.  The last operand (operand 2) is the
+number of elements of the input vector N as a @code{CONST_INT}.  These
+patterns are used if both the input and output vectors have the same scalar
+mode @var{m} and thus using @code{vec_unpacks_hi_@var{m}} or
+@code{vec_unpacks_lo_@var{m}} would be ambiguous.
 @cindex @code{vec_unpacks_float_hi_@var{m}} instruction pattern
 @cindex @code{vec_unpacks_float_lo_@var{m}} instruction pattern
 @cindex @code{vec_unpacku_float_hi_@var{m}} instruction pattern
 @cindex @code{vec_unpacku_float_lo_@var{m}} instruction pattern
 2 into operand 0.  All operands have mode @var{m}, which is a scalar or
 vector floating-point mode.
 This pattern is not allowed to @code{FAIL}.
+@cindex @code{xorsign@var{m}3} instruction pattern
+@item @samp{xorsign@var{m}3}
+Equivalent to @samp{op0 = op1 * copysign (1.0, op2)}: store a value with
+the magnitude of operand 1 and a sign that is the exclusive OR of the signs
+of operands 1 and 2 into operand 0.
+All operands have mode @var{m}, which is a scalar or vector
+floating-point mode.
+This pattern is not allowed to @code{FAIL}.
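Taking the formula @samp{op0 = op1 * copysign (1.0, op2)} literally, multiplying by plus or minus one leaves the magnitude of operand 1 unchanged and flips its sign exactly when operand 2 is negative, so the sign bit of the result is the exclusive OR of the two input sign bits. A minimal C sketch of that bit-level view follows, assuming double is IEEE 754 binary64. The function name xorsign_ref is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical bit-level model of xorsign for double operands.
   Assumes IEEE 754 binary64, where bit 63 holds the sign.  */
double xorsign_ref(double op1, double op2)
{
    uint64_t b1, b2;
    memcpy(&b1, &op1, sizeof b1);
    memcpy(&b2, &op2, sizeof b2);
    /* XOR the sign bit of op2 into op1; the magnitude is untouched.  */
    b1 ^= b2 & UINT64_C(0x8000000000000000);
    memcpy(&op1, &b1, sizeof op1);
    return op1;
}
```

With this model, xorsign_ref (2.0, -3.0) is -2.0 and xorsign_ref (-2.0, -3.0) is 2.0.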
 @cindex @code{ffs@var{m}2} instruction pattern
 @item @samp{ffs@var{m}2}
 Store into operand 0 one plus the index of the least significant 1-bit
 of operand 1.  If operand 1 is zero, store zero.
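The described result can be modelled in plain C as below. This is only a reference sketch of the semantics; the name `ffs_ref` is hypothetical, and a real expansion would normally use a target instruction instead of a loop.

```c
/* Hypothetical reference model: one plus the index of the least
   significant 1-bit of X, or zero when X is zero.  */
static int
ffs_ref (unsigned int x)
{
  int pos = 1;

  if (x == 0)
    return 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      pos++;
    }
  return pos;
}
```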
 @cindex @code{one_cmpl@var{m}2} instruction pattern
 @item @samp{one_cmpl@var{m}2}
 Store the bitwise-complement of operand 1 into operand 0.
-@cindex @code{movmem@var{m}} instruction pattern
+@cindex @code{cpymem@var{m}} instruction pattern
-@item @samp{movmem@var{m}}
+@item @samp{cpymem@var{m}}
-Block move instruction.  The destination and source blocks of memory
+Block copy instruction.  The destination and source blocks of memory
 are the first two operands, and both are @code{mem:BLK}s with an
 address in mode @code{Pmode}.
-The number of bytes to move is the third operand, in mode @var{m}.
+The number of bytes to copy is the third operand, in mode @var{m}.
 Usually, you specify @code{Pmode} for @var{m}.  However, if you can
 generate better code knowing the range of valid lengths is smaller than
 those representable in a full Pmode pointer, you should provide
 a pattern with a
 mode corresponding to the range of values you can handle efficiently
 respectively.  The expected alignment differs from alignment in operand 4
 in a way that the blocks are not required to be aligned according to it in
 all cases. This expected alignment is also in bytes, just like operand 4.
 Expected size, when unknown, is set to @code{(const_int -1)}.
+Descriptions of multiple @code{cpymem@var{m}} patterns can only be
+beneficial if the patterns for smaller modes have fewer restrictions
+on their first, second and fourth operands.  Note that the mode @var{m}
+in @code{cpymem@var{m}} does not impose any restriction on the mode of
+individually copied data units in the block.
+The @code{cpymem@var{m}} patterns need not give special consideration
+to the possibility that the source and destination strings might
+overlap. These patterns are used to do inline expansion of
+@code{__builtin_memcpy}.
+@cindex @code{movmem@var{m}} instruction pattern
+@item @samp{movmem@var{m}}
+Block move instruction.  The destination and source blocks of memory
+are the first two operands, and both are @code{mem:BLK}s with an
+address in mode @code{Pmode}.
+The number of bytes to copy is the third operand, in mode @var{m}.
+Usually, you specify @code{Pmode} for @var{m}.  However, if you can
+generate better code knowing the range of valid lengths is smaller than
+those representable in a full Pmode pointer, you should provide
+a pattern with a
+mode corresponding to the range of values you can handle efficiently
+(e.g., @code{QImode} for values in the range 0--127; note we avoid numbers
+that appear negative) and also a pattern with @code{Pmode}.
+The fourth operand is the known shared alignment of the source and
+destination, in the form of a @code{const_int} rtx.  Thus, if the
+compiler knows that both source and destination are word-aligned,
+it may provide the value 4 for this operand.
+Optional operands 5 and 6 specify expected alignment and size of block
+respectively.  The expected alignment differs from alignment in operand 4
+in a way that the blocks are not required to be aligned according to it in
+all cases. This expected alignment is also in bytes, just like operand 4.
+Expected size, when unknown, is set to @code{(const_int -1)}.
 Descriptions of multiple @code{movmem@var{m}} patterns can only be
 beneficial if the patterns for smaller modes have fewer restrictions
 on their first, second and fourth operands.  Note that the mode @var{m}
 in @code{movmem@var{m}} does not impose any restriction on the mode of
-individually moved data units in the block.
+individually copied data units in the block.
-These patterns need not give special consideration to the possibility
+The @code{movmem@var{m}} patterns must correctly handle the case where
-that the source and destination strings might overlap.
+the source and destination strings overlap. These patterns are used to
+do inline expansion of @code{__builtin_memmove}.
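The difference between the two contracts can be sketched in C. A `cpymem`-style copy may proceed in any order because overlap need not be considered, whereas a `movmem`-style move has to choose a direction that never overwrites source bytes it has not yet read. Both function names below are hypothetical, and the byte-at-a-time loops only model the semantics, not an efficient expansion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a copy that ignores overlap (memcpy contract).  */
static void
copy_forward (unsigned char *dst, const unsigned char *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
}

/* Hypothetical model of an overlap-safe move (memmove contract): copy
   forwards when DST is below SRC, backwards otherwise.  */
static void
move_overlap_safe (unsigned char *dst, const unsigned char *src, size_t n)
{
  if ((uintptr_t) dst <= (uintptr_t) src)
    copy_forward (dst, src, n);
  else
    for (size_t i = n; i > 0; i--)
      dst[i - 1] = src[i - 1];
}
```

With a buffer holding `abcdef`, moving four bytes from offset 0 to offset 2 with `move_overlap_safe` gives `ababcd`, while the forward-only loop re-reads bytes it has already written and gives `ababab`.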
 @cindex @code{movstr} instruction pattern
 @item @samp{movstr}
 String copy instruction, with @code{stpcpy} semantics.  Operand 0 is
 an output operand in mode @code{Pmode}.  The addresses of the
 destination and source strings are operands 1 and 2, and both are
 @code{mem:BLK}s with addresses in mode @code{Pmode}.  The execution of
 the expansion of this pattern should store in operand 0 the address in
 which the @code{NUL} terminator was stored in the destination string.
-This patern has also several optional operands that are same as in
+This pattern also has several optional operands that are the same as in
 @code{setmem}.
 @cindex @code{setmem@var{m}} instruction pattern
 @item @samp{setmem@var{m}}
 Block set instruction.  The destination string is the first operand,
 given as a @code{mem:BLK} whose address is in mode @code{Pmode}.  The
 number of bytes to set is the second operand, in mode @var{m}.  The value to
 initialize the memory with is the third operand. Targets that only support the
 clearing of memory should reject any value that is not the constant 0.  See
-@samp{movmem@var{m}} for a discussion of the choice of mode.
+@samp{cpymem@var{m}} for a discussion of the choice of mode.
 The fourth operand is the known alignment of the destination, in the form
 of a @code{const_int} rtx.  Thus, if the compiler knows that the
 destination is word-aligned, it may provide the value 4 for this
 operand.
 respectively.  The expected alignment differs from alignment in operand 4
 in a way that the blocks are not required to be aligned according to it in
 all cases. This expected alignment is also in bytes, just like operand 4.
 Expected size, when unknown, is set to @code{(const_int -1)}.
 Operand 7 is the minimal size of the block and operand 8 is the
-maximal size of the block (NULL if it can not be represented as CONST_INT).
+maximal size of the block (NULL if it cannot be represented as CONST_INT).
-Operand 9 is the probable maximal size (i.e. we can not rely on it for correctness,
+Operand 9 is the probable maximal size (i.e.@: we cannot rely on it for
-but it can be used for choosing proper code sequence for a given size).
+correctness, but it can be used for choosing a proper code sequence for a
+given size).
-The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}.
+The use for multiple @code{setmem@var{m}} is as for @code{cpymem@var{m}}.
 @cindex @code{cmpstrn@var{m}} instruction pattern
 @item @samp{cmpstrn@var{m}}
 String compare instruction, with five operands.  Operand 0 is the output;
 it has mode @var{m}.  The remaining four operands are like the operands
-of @samp{movmem@var{m}}.  The two memory blocks specified are compared
+of @samp{cpymem@var{m}}.  The two memory blocks specified are compared
 byte by byte in lexicographic order starting at the beginning of each
 string.  The instruction is not allowed to prefetch more than one byte
 at a time since either string may end in the first byte and reading past
 that may access an invalid page or segment and cause a fault.  The
 comparison terminates early if the fetched bytes are different or if
 @end smallexample
 where, for example, @var{op} is @code{+} for @samp{cond_add@var{mode}}.
 When defined for floating-point modes, the contents of @samp{op3[i]}
-are not interpreted if @var{op1[i]} is false, just like they would not
+are not interpreted if @samp{op1[i]} is false, just like they would not
 be in a normal C @samp{?:} condition.
 Operands 0, 2, 3 and 4 all have mode @var{m}.  Operand 1 is a scalar
 integer if @var{m} is scalar, otherwise it has the mode returned by
 @code{TARGET_VECTORIZE_GET_MASK_MODE}.
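A scalar C model of the conditional add case may help. It assumes the per-element form `op0[i] = op1[i] ? op2[i] + op3[i] : op4[i]`, which matches the operand roles given above; the function name `cond_add_ref` and the use of `int` lanes with a `bool` mask are choices made for this sketch only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of a conditional add over N lanes: lane i
   receives op2[i] + op3[i] when op1[i] is true and op4[i] otherwise.
   As with a C ?: expression, the sum is not evaluated for inactive
   lanes.  */
static void
cond_add_ref (int *op0, const bool *op1, const int *op2,
              const int *op3, const int *op4, size_t n)
{
  for (size_t i = 0; i < n; i++)
    op0[i] = op1[i] ? op2[i] + op3[i] : op4[i];
}
```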
 builtins.
 The get/set patterns have a single output/input operand respectively,
 with @var{mode} intended to be @code{Pmode}.
+@cindex @code{stack_protect_combined_set} instruction pattern
+@item @samp{stack_protect_combined_set}
+This pattern, if defined, moves a @code{ptr_mode} value from an address
+whose declaration RTX is given in operand 1 to the memory in operand 0
+without leaving the value in a register afterward.  If several
+instructions are needed by the target to perform the operation (e.g.@: to
+load the address from a GOT entry then load the @code{ptr_mode} value
+and finally store it), it is the backend's responsibility to ensure no
+intermediate result gets spilled.  This is to avoid leaking the value
+some place that an attacker might use to rewrite the stack guard slot
+after having clobbered it.
+If this pattern is not defined, then the address declaration is
+expanded first in the standard way and a @code{stack_protect_set}
+pattern is then generated to move the value from that address to the
+address in operand 0.
 @cindex @code{stack_protect_set} instruction pattern
 @item @samp{stack_protect_set}
-This pattern, if defined, moves a @code{ptr_mode} value from the memory
+This pattern, if defined, moves a @code{ptr_mode} value from the valid
-in operand 1 to the memory in operand 0 without leaving the value in
+memory location in operand 1 to the memory in operand 0 without leaving
-a register afterward.  This is to avoid leaking the value some place
+the value in a register afterward.  This is to avoid leaking the value
-that an attacker might use to rewrite the stack guard slot after
+some place that an attacker might use to rewrite the stack guard slot
-having clobbered it.
+after having clobbered it.
+Note: on targets where the addressing modes do not allow loading
+directly from the stack guard address, the address is first expanded in
+the standard way, which could cause some spills.
 If this pattern is not defined, then a plain move pattern is generated.
+@cindex @code{stack_protect_combined_test} instruction pattern
+@item @samp{stack_protect_combined_test}
+This pattern, if defined, compares a @code{ptr_mode} value from an
+address whose declaration RTX is given in operand 1 with the memory in
+operand 0 without leaving the value in a register afterward and
+branches to operand 2 if the values were equal.  If several
+instructions are needed by the target to perform the operation (e.g.@: to
+load the address from a GOT entry then load the @code{ptr_mode} value
+and finally compare it), it is the backend's responsibility to ensure no
+intermediate result gets spilled.  This is to avoid leaking the value
+some place that an attacker might use to rewrite the stack guard slot
+after having clobbered it.
+If this pattern is not defined, then the address declaration is
+expanded first in the standard way and a @code{stack_protect_test}
+pattern is then generated to compare the value from that address to the
+value at the memory in operand 0.
 @cindex @code{stack_protect_test} instruction pattern
 @item @samp{stack_protect_test}
 This pattern, if defined, compares a @code{ptr_mode} value from the
-memory in operand 1 with the memory in operand 0 without leaving the
+valid memory location in operand 1 with the memory in operand 0 without
-value in a register afterward and branches to operand 2 if the values
+leaving the value in a register afterward and branches to operand 2 if
-were equal.
+the values were equal.
 If this pattern is not defined, then a plain compare pattern and
 conditional branch pattern is used.
 @cindex @code{clear_cache} instruction pattern
 separating compares and branches is limiting, which is why the
 more flexible approach with one @code{define_expand} is used in GCC.
 The machine description becomes clearer for architectures that
 have compare-and-branch instructions but no condition code.  It also
 works better when different sets of comparison operators are supported
-by different kinds of conditional branches (e.g. integer vs. floating-point),
+by different kinds of conditional branches (e.g.@: integer vs.@:
-or by conditional branches with respect to conditional stores.
+floating-point), or by conditional branches with respect to conditional stores.
 Two separate insns are always used if the machine description represents
 a condition code register using the legacy RTL expression @code{(cc0)},
 and on most machines that use a separate condition code register
 (@pxref{Condition Code}).  For machines that use @code{(cc0)}, in
 When the combiner phase tries to split an insn pattern, it is always the
 case that the pattern is @emph{not} matched by any @code{define_insn}.
 The combiner pass first tries to split a single @code{set} expression
 and then the same @code{set} expression inside a @code{parallel}, but
 followed by a @code{clobber} of a pseudo-reg to use as a scratch
-register.  In these cases, the combiner expects exactly two new insn
+register.  In these cases, the combiner expects exactly one or two new insn
 patterns to be generated.  It will verify that these patterns match some
 @code{define_insn} definitions, so you need not do this test in the
 @code{define_split} (of course, there is no point in writing a
 @code{define_split} that will never produce insns that match).
 The @code{define_insn_and_split} construction provides exactly the same
 functionality as two separate @code{define_insn} and @code{define_split}
 patterns.  It exists for compactness, and as a maintenance tool to prevent
 having to ensure the two patterns' templates match.
+@findex define_insn_and_rewrite
+It is sometimes useful to have a @code{define_insn_and_split}
+that replaces specific operands of an instruction but leaves the
+rest of the instruction pattern unchanged.  You can do this directly
+with a @code{define_insn_and_split}, but it requires a
+@var{new-insn-pattern-1} that repeats most of the original @var{insn-pattern}.
+There is also the complication that an implicit @code{parallel} in
+@var{insn-pattern} must become an explicit @code{parallel} in
+@var{new-insn-pattern-1}, which is easy to overlook.
+A simpler alternative is to use @code{define_insn_and_rewrite}, which
+is a form of @code{define_insn_and_split} that automatically generates
+@var{new-insn-pattern-1} by replacing each @code{match_operand}
+in @var{insn-pattern} with a corresponding @code{match_dup}, and each
+@code{match_operator} in the pattern with a corresponding @code{match_op_dup}.
+The arguments are otherwise identical to @code{define_insn_and_split}:
+@smallexample
+(define_insn_and_rewrite
+[@var{insn-pattern}]
+"@var{condition}"
+"@var{output-template}"
+"@var{split-condition}"
+"@var{preparation-statements}"
+[@var{insn-attributes}])
+@end smallexample
+The @code{match_dup}s and @code{match_op_dup}s in the new
+instruction pattern use any new operand values that the
+@var{preparation-statements} store in the @code{operands} array,
+as for a normal @code{define_insn_and_split}.  @var{preparation-statements}
+can also emit additional instructions before the new instruction.
+They can even emit an entirely different sequence of instructions and
+use @code{DONE} to avoid emitting a new form of the original
+instruction.
+The split in a @code{define_insn_and_rewrite} is only intended
+to apply to existing instructions that match @var{insn-pattern}.
+@var{split-condition} must therefore start with @code{&&},
+so that the split condition applies on top of @var{condition}.
+Here is an example from the AArch64 SVE port, in which operand 1 is
+known to be equivalent to an all-true constant and isn't used by the
+output template:
+@smallexample
+(define_insn_and_rewrite "*while_ult<GPI:mode><PRED_ALL:mode>_cc"
+[(set (reg:CC CC_REGNUM)
+(compare:CC
+(unspec:SI [(match_operand:PRED_ALL 1)
+(unspec:PRED_ALL
+[(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ")
+(match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")]
+UNSPEC_WHILE_LO)]
+UNSPEC_PTEST_PTRUE)
+(const_int 0)))
+(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
+(unspec:PRED_ALL [(match_dup 2)
+(match_dup 3)]
+UNSPEC_WHILE_LO))]
+"TARGET_SVE"
+"whilelo\t%0.<PRED_ALL:Vetype>, %<w>2, %<w>3"
+;; Force the compiler to drop the unused predicate operand, so that we
+;; don't have an unnecessary PTRUE.
+"&& !CONSTANT_P (operands[1])"
+@{
+operands[1] = CONSTM1_RTX (<MODE>mode);
+@}
+)
+@end smallexample
+The splitter in this case simply replaces operand 1 with the constant
+value that it is known to have.  The equivalent @code{define_insn_and_split}
+would be:
+@smallexample
+(define_insn_and_split "*while_ult<GPI:mode><PRED_ALL:mode>_cc"
+[(set (reg:CC CC_REGNUM)
+(compare:CC
+(unspec:SI [(match_operand:PRED_ALL 1)
+(unspec:PRED_ALL
+[(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ")
+(match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")]
+UNSPEC_WHILE_LO)]
+UNSPEC_PTEST_PTRUE)
+(const_int 0)))
+(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
+(unspec:PRED_ALL [(match_dup 2)
+(match_dup 3)]
+UNSPEC_WHILE_LO))]
+"TARGET_SVE"
+"whilelo\t%0.<PRED_ALL:Vetype>, %<w>2, %<w>3"
+;; Force the compiler to drop the unused predicate operand, so that we
+;; don't have an unnecessary PTRUE.
+"&& !CONSTANT_P (operands[1])"
+[(parallel
+[(set (reg:CC CC_REGNUM)
+(compare:CC
+(unspec:SI [(match_dup 1)
+(unspec:PRED_ALL [(match_dup 2)
+(match_dup 3)]
+UNSPEC_WHILE_LO)]
+UNSPEC_PTEST_PTRUE)
+(const_int 0)))
+(set (match_dup 0)
+(unspec:PRED_ALL [(match_dup 2)
+(match_dup 3)]
+UNSPEC_WHILE_LO))])]
+@{
+operands[1] = CONSTM1_RTX (<MODE>mode);
+@}
+)
+@end smallexample
 @end ifset
 @ifset INTERNALS
 @node Including Patterns
 @section Including Patterns in Machine Descriptions.
 alternatives of an insn definition from being used during code
 generation. @xref{Disable Insn Alternatives}.
 @item mnemonic
 The @code{mnemonic} attribute can be defined to implement instruction
-specific checks in e.g. the pipeline description.
+specific checks in e.g.@: the pipeline description.
 @xref{Mnemonic Attribute}.
 @end table
 For each of these special attributes, the corresponding
 @samp{HAVE_ATTR_@var{name}} @samp{#define} is also written when the
 @end smallexample
 @var{reservation-name} is a string giving the name of @var{regexp}.
 Functional unit names and reservation names are in the same name
 space.  So the reservation names should be different from the
-functional unit names and can not be the reserved name @samp{nothing}.
+functional unit names and cannot be the reserved name @samp{nothing}.
 @findex define_bypass
 @cindex instruction latency time
 @cindex data bypass
 The following construction is used to describe exceptions in the
 @var{patterns} is a string giving patterns of functional units
 separated by comma.  Currently pattern is one unit or units
 separated by white-spaces.
 The first construction (@samp{exclusion_set}) means that each
-functional unit in the first string can not be reserved simultaneously
+functional unit in the first string cannot be reserved simultaneously
 with a unit whose name is in the second string and vice versa.  For
 example, the construction is useful for describing processors
 (e.g.@: some SPARC processors) with a fully pipelined floating point
 functional unit which can execute simultaneously only single floating
 point insns or only double floating point insns.
 The second construction (@samp{presence_set}) means that each
-functional unit in the first string can not be reserved unless at
+functional unit in the first string cannot be reserved unless at
 least one of the patterns of units whose names are in the second string is
 reserved.  This is an asymmetric relation.  For example, it is useful
 for describing that @acronym{VLIW} @samp{slot1} is reserved after
 @samp{slot0} reservation.  We could describe it by the following
 construction
 @smallexample
 (absence_set "slot0" "slot1, slot2")
 @end smallexample
-Or @samp{slot2} can not be reserved if @samp{slot0} and unit @samp{b0}
+Or @samp{slot2} cannot be reserved if @samp{slot0} and unit @samp{b0}
 are reserved or @samp{slot1} and unit @samp{b1} are reserved.  In
 this case we could write
 @smallexample
 (absence_set "slot2" "slot0 b0, slot1 b1")
 issued into the first pipeline unless it is reserved, otherwise they
 are issued into the second pipeline.  Integer division and
 multiplication insns can be executed only in the second integer
 pipeline and their results are ready correspondingly in 9 and 4
 cycles.  The integer division is not pipelined, i.e.@: the subsequent
-integer division insn can not be issued until the current division
+integer division insn cannot be issued until the current division
 insn has finished.  Floating point insns are fully pipelined and their
 results are ready in 3 cycles.  Where the result of a floating point
 insn is used by an integer insn, an additional delay of one cycle is
 incurred.  To describe all of this we could specify
 case when the source RTL template is not matched against the
 input-template of the @code{define_subst}.  In such case the copy is
 deleted.
 @code{define_subst} can be used only in @code{define_insn} and
-@code{define_expand}, it cannot be used in other expressions (e.g. in
+@code{define_expand}, it cannot be used in other expressions (e.g.@: in
 @code{define_insn_and_split}).
 @menu
 * Define Subst Example::	    Example of @code{define_subst} work.
 * Define Subst Pattern Matching::   Process of template comparison.
 [(set (match_operand:SI 0 "" "")
 (match_operand:SI 1 "" ""))]
 ""
 [(set (match_dup 0)
 (match_dup 1))
-(clobber (reg:CC FLAGS_REG))]
+(clobber (reg:CC FLAGS_REG))])
 @end smallexample
 This @code{define_subst} can be applied to any RTL pattern containing
 @code{set} of mode SI and generates a copy with clobber when it is
 applied.
 @smallexample
 (define_code_attr @var{name} [(@var{code1} "@var{value1}") @dots{} (@var{coden} "@var{valuen}")])
 @end smallexample
+Instruction patterns can use code attributes as rtx codes, which can be
+useful if two sets of codes act in tandem.  For example, the following
+@code{define_insn} defines two patterns, one calculating a signed absolute
+difference and another calculating an unsigned absolute difference:
+@smallexample
+(define_code_iterator any_max [smax umax])
+(define_code_attr paired_min [(smax "smin") (umax "umin")])
+(define_insn @dots{}
+[(set (match_operand:SI 0 @dots{})
+(minus:SI (any_max:SI (match_operand:SI 1 @dots{})
+(match_operand:SI 2 @dots{}))
+(<paired_min>:SI (match_dup 1) (match_dup 2))))]
+@dots{})
+@end smallexample
+The signed version of the instruction uses @code{smax} and @code{smin}
+while the unsigned version uses @code{umax} and @code{umin}.  There
+are no versions that pair @code{smax} with @code{umin} or @code{umax}
+with @code{smin}.
 Here's an example of code iterators in action, taken from the MIPS port:
 @smallexample
 (define_code_iterator any_cond [unordered ordered unlt unge uneq ltgt unle ungt
 eq ne gt ge lt le gtu geu ltu leu])
 @end smallexample
 would produce a single set of functions that handles both
 @code{INTEGER_MODES} and @code{FLOAT_MODES}.
+It is also possible for these @samp{@@} patterns to have different
+numbers of operands from each other.  For example, patterns with
+a binary rtl code might take three operands (one output and two inputs)
+while patterns with a ternary rtl code might take four operands (one
+output and three inputs).  This combination would produce separate
+@samp{maybe_gen_@var{name}} and @samp{gen_@var{name}} functions for
+each operand count, but it would still produce a single
+@samp{maybe_code_for_@var{name}} and a single @samp{code_for_@var{name}}.
 @end ifset
