comparison gcc/ira.c @ 111:04ced10e8804
gcc 7
author | kono |
date | Fri, 27 Oct 2017 22:46:09 +0900 |
parents | f6334be47118 |
children | 84e7813d76e9 |
68:561a7518be6b | 111:04ced10e8804 |
---|---|
1 /* Integrated Register Allocator (IRA) entry point. | 1 /* Integrated Register Allocator (IRA) entry point. |
2 Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 | 2 Copyright (C) 2006-2017 Free Software Foundation, Inc. |
3 Free Software Foundation, Inc. | |
4 Contributed by Vladimir Makarov <vmakarov@redhat.com>. | 3 Contributed by Vladimir Makarov <vmakarov@redhat.com>. |
5 | 4 |
6 This file is part of GCC. | 5 This file is part of GCC. |
7 | 6 |
8 GCC is free software; you can redistribute it and/or modify it under | 7 GCC is free software; you can redistribute it and/or modify it under |
36 nested CFG regions forming a tree. Currently the regions are | 35 nested CFG regions forming a tree. Currently the regions are |
37 the entire function for the root region and natural loops for | 36 the entire function for the root region and natural loops for |
38 the other regions. Therefore data structure representing a | 37 the other regions. Therefore data structure representing a |
39 region is called loop_tree_node. | 38 region is called loop_tree_node. |
40 | 39 |
41 o *Cover class* is a register class belonging to a set of | 40 o *Allocno class* is a register class used for allocation of |
42 non-intersecting register classes containing all of the | 41 given allocno. It means that only hard register of given |
43 hard-registers available for register allocation. The set of | 42 register class can be assigned to given allocno. In reality, |
44 all cover classes for a target is defined in the corresponding | 43 even smaller subset of (*profitable*) hard registers can be |
45 machine-description file according some criteria. Such notion | 44 assigned. In rare cases, the subset can be even smaller |
46 is needed because Chaitin-Briggs algorithm works on | 45 because our modification of Chaitin-Briggs algorithm requires |
47 non-intersected register classes. | 46 that sets of hard registers can be assigned to allocnos forms a |
47 forest, i.e. the sets can be ordered in a way where any | |
48 previous set is not intersected with given set or is a superset | |
49 of given set. | |
50 | |
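The forest requirement in the allocno class paragraph above means that any two hard-register sets assignable to allocnos must be either disjoint or nested. A minimal sketch of that pairwise check, using plain uint64_t masks as a stand-in for GCC's HARD_REG_SET; nothing here is actual IRA code.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Return true when every pair of register sets is either disjoint or
   nested, i.e. the sets can be arranged as a forest.  */
static bool
forms_forest (const uint64_t *sets, int n)
{
  for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++)
      {
        uint64_t common = sets[i] & sets[j];
        if (common != 0 && common != sets[i] && common != sets[j])
          return false;
      }
  return true;
}

int
main (void)
{
  /* A general-register-like set, a one-register subset of it, and a
     disjoint floating-point-like set: a forest.  */
  uint64_t ok[] = { 0xffull, 0x01ull, 0xff00ull };
  /* Two partially overlapping sets violate the property.  */
  uint64_t bad[] = { 0x0full, 0x3cull };
  printf ("%d %d\n", forms_forest (ok, 3), forms_forest (bad, 2)); /* 1 0 */
  return 0;
}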
51 o *Pressure class* is a register class belonging to a set of | |
52 register classes containing all of the hard-registers available | |
53 for register allocation. The set of all pressure classes for a | |
54 target is defined in the corresponding machine-description file | |
55 according some criteria. Register pressure is calculated only | |
56 for pressure classes and it affects some IRA decisions as | |
57 forming allocation regions. | |
48 | 58 |
49 o *Allocno* represents the live range of a pseudo-register in a | 59 o *Allocno* represents the live range of a pseudo-register in a |
50 region. Besides the obvious attributes like the corresponding | 60 region. Besides the obvious attributes like the corresponding |
51 pseudo-register number, cover class, conflicting allocnos and | 61 pseudo-register number, allocno class, conflicting allocnos and |
52 conflicting hard-registers, there are a few allocno attributes | 62 conflicting hard-registers, there are a few allocno attributes |
53 which are important for understanding the allocation algorithm: | 63 which are important for understanding the allocation algorithm: |
54 | 64 |
55 - *Live ranges*. This is a list of ranges of *program | 65 - *Live ranges*. This is a list of ranges of *program points* |
56 points* where the allocno lives. Program points represent | 66 where the allocno lives. Program points represent places |
57 places where a pseudo can be born or become dead (there are | 67 where a pseudo can be born or become dead (there are |
58 approximately two times more program points than the insns) | 68 approximately two times more program points than the insns) |
59 and they are represented by integers starting with 0. The | 69 and they are represented by integers starting with 0. The |
60 live ranges are used to find conflicts between allocnos of | 70 live ranges are used to find conflicts between allocnos. |
61 different cover classes. They also play very important role | 71 They also play very important role for the transformation of |
62 for the transformation of the IRA internal representation of | 72 the IRA internal representation of several regions into a one |
63 several regions into a one region representation. The later is | 73 region representation. The later is used during the reload |
64 used during the reload pass work because each allocno | 74 pass work because each allocno represents all of the |
65 represents all of the corresponding pseudo-registers. | 75 corresponding pseudo-registers. |
66 | 76 |
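The live ranges described above are lists of program-point intervals, and a conflict between two allocnos amounts to their interval lists overlapping. A minimal sketch of such an overlap test on arrays sorted by start point; the struct and function names are illustrative, not IRA's own data structures.

#include <stdbool.h>
#include <stdio.h>

struct live_range { int start, finish; };

/* Merge-style walk over two sorted range lists: return true as soon as a
   pair of ranges shares a program point.  */
static bool
ranges_intersect (const struct live_range *a, int na,
                  const struct live_range *b, int nb)
{
  int i = 0, j = 0;
  while (i < na && j < nb)
    {
      if (a[i].finish < b[j].start)
        i++;                  /* a's range ends before b's begins */
      else if (b[j].finish < a[i].start)
        j++;                  /* b's range ends before a's begins */
      else
        return true;          /* overlapping program points */
    }
  return false;
}

int
main (void)
{
  struct live_range r1[] = { { 0, 4 }, { 10, 12 } };
  struct live_range r2[] = { { 5, 9 }, { 13, 20 } };
  struct live_range r3[] = { { 3, 6 } };
  printf ("%d %d\n", ranges_intersect (r1, 2, r2, 2),
          ranges_intersect (r1, 2, r3, 1));   /* prints: 0 1 */
  return 0;
}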
67 - *Hard-register costs*. This is a vector of size equal to the | 77 - *Hard-register costs*. This is a vector of size equal to the |
68 number of available hard-registers of the allocno's cover | 78 number of available hard-registers of the allocno class. The |
69 class. The cost of a callee-clobbered hard-register for an | 79 cost of a callee-clobbered hard-register for an allocno is |
70 allocno is increased by the cost of save/restore code around | 80 increased by the cost of save/restore code around the calls |
71 the calls through the given allocno's life. If the allocno | 81 through the given allocno's life. If the allocno is a move |
72 is a move instruction operand and another operand is a | 82 instruction operand and another operand is a hard-register of |
73 hard-register of the allocno's cover class, the cost of the | 83 the allocno class, the cost of the hard-register is decreased |
74 hard-register is decreased by the move cost. | 84 by the move cost. |
75 | 85 |
76 When an allocno is assigned, the hard-register with minimal | 86 When an allocno is assigned, the hard-register with minimal |
77 full cost is used. Initially, a hard-register's full cost is | 87 full cost is used. Initially, a hard-register's full cost is |
78 the corresponding value from the hard-register's cost vector. | 88 the corresponding value from the hard-register's cost vector. |
79 If the allocno is connected by a *copy* (see below) to | 89 If the allocno is connected by a *copy* (see below) to |
137 following subpasses: | 147 following subpasses: |
138 | 148 |
139 * First, IRA builds regions and creates allocnos (file | 149 * First, IRA builds regions and creates allocnos (file |
140 ira-build.c) and initializes most of their attributes. | 150 ira-build.c) and initializes most of their attributes. |
141 | 151 |
142 * Then IRA finds a cover class for each allocno and calculates | 152 * Then IRA finds an allocno class for each allocno and |
143 its initial (non-accumulated) cost of memory and each | 153 calculates its initial (non-accumulated) cost of memory and |
144 hard-register of its cover class (file ira-cost.c). | 154 each hard-register of its allocno class (file ira-cost.c). |
145 | 155 |
146 * IRA creates live ranges of each allocno, calulates register | 156 * IRA creates live ranges of each allocno, calculates register |
147 pressure for each cover class in each region, sets up | 157 pressure for each pressure class in each region, sets up |
148 conflict hard registers for each allocno and info about calls | 158 conflict hard registers for each allocno and info about calls |
149 the allocno lives through (file ira-lives.c). | 159 the allocno lives through (file ira-lives.c). |
150 | 160 |
151 * IRA removes low register pressure loops from the regions | 161 * IRA removes low register pressure loops from the regions |
152 mostly to speed IRA up (file ira-build.c). | 162 mostly to speed IRA up (file ira-build.c). |
155 allocnos to corresponding upper region allocnos (file | 165 allocnos to corresponding upper region allocnos (file |
156 ira-build.c). | 166 ira-build.c). |
157 | 167 |
158 * IRA creates all caps (file ira-build.c). | 168 * IRA creates all caps (file ira-build.c). |
159 | 169 |
160 * Having live-ranges of allocnos and their cover classes, IRA | 170 * Having live-ranges of allocnos and their classes, IRA creates |
161 creates conflicting allocnos of the same cover class for each | 171 conflicting allocnos for each allocno. Conflicting allocnos |
162 allocno. Conflicting allocnos are stored as a bit vector or | 172 are stored as a bit vector or array of pointers to the |
163 array of pointers to the conflicting allocnos whatever is | 173 conflicting allocnos whatever is more profitable (file |
164 more profitable (file ira-conflicts.c). At this point IRA | 174 ira-conflicts.c). At this point IRA creates allocno copies. |
165 creates allocno copies. | |
166 | 175 |
167 o Coloring. Now IRA has all necessary info to start graph coloring | 176 o Coloring. Now IRA has all necessary info to start graph coloring |
168 process. It is done in each region on top-down traverse of the | 177 process. It is done in each region on top-down traverse of the |
169 region tree (file ira-color.c). There are following subpasses: | 178 region tree (file ira-color.c). There are following subpasses: |
179 | |
180 * Finding profitable hard registers of corresponding allocno | |
181 class for each allocno. For example, only callee-saved hard | |
182 registers are frequently profitable for allocnos living | |
183 through colors. If the profitable hard register set of | |
184 allocno does not form a tree based on subset relation, we use | |
185 some approximation to form the tree. This approximation is | |
186 used to figure out trivial colorability of allocnos. The | |
187 approximation is a pretty rare case. | |
170 | 188 |
171 * Putting allocnos onto the coloring stack. IRA uses Briggs | 189 * Putting allocnos onto the coloring stack. IRA uses Briggs |
172 optimistic coloring which is a major improvement over | 190 optimistic coloring which is a major improvement over |
173 Chaitin's coloring. Therefore IRA does not spill allocnos at | 191 Chaitin's coloring. Therefore IRA does not spill allocnos at |
174 this point. There is some freedom in the order of putting | 192 this point. There is some freedom in the order of putting |
175 allocnos on the stack which can affect the final result of | 193 allocnos on the stack which can affect the final result of |
176 the allocation. IRA uses some heuristics to improve the order. | 194 the allocation. IRA uses some heuristics to improve the |
195 order. The major one is to form *threads* from colorable | |
196 allocnos and push them on the stack by threads. Thread is a | |
197 set of non-conflicting colorable allocnos connected by | |
198 copies. The thread contains allocnos from the colorable | |
199 bucket or colorable allocnos already pushed onto the coloring | |
200 stack. Pushing thread allocnos one after another onto the | |
201 stack increases chances of removing copies when the allocnos | |
202 get the same hard reg. | |
203 | |
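One way to read the thread heuristic above: allocnos connected by copies are grouped, and two groups are merged only when the union contains no conflicting pair, so the whole thread can later take the same hard register. The union-find sketch below is only an illustration under that reading; the bitmask representation and names are invented here, not IRA's thread code.

#include <stdint.h>
#include <stdio.h>

#define N 5

static int parent[N];
static uint64_t members[N];    /* allocnos in the thread rooted here */
static uint64_t conflicts[N];  /* allocnos conflicting with any member */

static int
find (int x)
{
  while (parent[x] != x)
    x = parent[x] = parent[parent[x]];
  return x;
}

/* Try to merge the threads of allocnos A and B, which are connected by a
   copy; refuse when the union would contain a conflicting pair.  */
static void
merge_by_copy (int a, int b)
{
  int ra = find (a), rb = find (b);
  if (ra == rb)
    return;
  if ((members[ra] & conflicts[rb]) || (members[rb] & conflicts[ra]))
    return;
  parent[rb] = ra;
  members[ra] |= members[rb];
  conflicts[ra] |= conflicts[rb];
}

int
main (void)
{
  for (int i = 0; i < N; i++)
    {
      parent[i] = i;
      members[i] = 1ull << i;
      conflicts[i] = 0;
    }
  conflicts[0] = 1ull << 2;    /* allocnos 0 and 2 conflict */
  conflicts[2] = 1ull << 0;
  merge_by_copy (0, 1);        /* thread {0, 1} */
  merge_by_copy (1, 2);        /* rejected: 2 conflicts with 0 */
  merge_by_copy (3, 4);        /* thread {3, 4} */
  for (int i = 0; i < N; i++)
    printf ("allocno %d in thread %d\n", i, find (i));
  return 0;
}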
204 We also use a modification of Chaitin-Briggs algorithm which | |
205 works for intersected register classes of allocnos. To | |
206 figure out trivial colorability of allocnos, the mentioned | |
207 above tree of hard register sets is used. To get an idea how | |
208 the algorithm works in i386 example, let us consider an | |
209 allocno to which any general hard register can be assigned. | |
210 If the allocno conflicts with eight allocnos to which only | |
211 EAX register can be assigned, given allocno is still | |
212 trivially colorable because all conflicting allocnos might be | |
213 assigned only to EAX and all other general hard registers are | |
214 still free. | |
215 | |
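A rough way to restate the EAX example just above: each group of conflicting allocnos can block at most as many registers as its register set contains, so eight EAX-only conflicts block a single general register and the allocno stays trivially colorable. The sketch below applies that bound to one level of the set forest, with uint64_t masks in place of HARD_REG_SET; IRA's real criterion accumulates counts over the whole tree of profitable hard-register sets.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static int
popcount64 (uint64_t x)
{
  int n = 0;
  for (; x; x &= x - 1)
    n++;
  return n;
}

/* PROFITABLE is the set profitable for the allocno being tested.
   Conflicting allocnos are grouped by the register set they are
   restricted to: CONFLICT_SET[g] with CONFLICT_NUM[g] members.  */
static bool
trivially_colorable_p (uint64_t profitable, const uint64_t *conflict_set,
                       const int *conflict_num, int ngroups)
{
  int blocked = 0;
  for (int g = 0; g < ngroups; g++)
    {
      int avail = popcount64 (conflict_set[g] & profitable);
      blocked += conflict_num[g] < avail ? conflict_num[g] : avail;
    }
  return blocked < popcount64 (profitable);
}

int
main (void)
{
  uint64_t general = 0xff;   /* eight general registers */
  uint64_t eax = 0x01;       /* a single register, as in the i386 example */
  int eight = 8;
  /* Eight EAX-only conflicts: still trivially colorable (prints 1).
     Eight conflicts that may use any general register: not (prints 0).  */
  printf ("%d\n", trivially_colorable_p (general, &eax, &eight, 1));
  printf ("%d\n", trivially_colorable_p (general, &general, &eight, 1));
  return 0;
}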
216 To get an idea of the used trivial colorability criterion, it | |
217 is also useful to read article "Graph-Coloring Register | |
218 Allocation for Irregular Architectures" by Michael D. Smith | |
219 and Glen Holloway. Major difference between the article | |
220 approach and approach used in IRA is that Smith's approach | |
221 takes register classes only from machine description and IRA | |
222 calculate register classes from intermediate code too | |
223 (e.g. an explicit usage of hard registers in RTL code for | |
224 parameter passing can result in creation of additional | |
225 register classes which contain or exclude the hard | |
226 registers). That makes IRA approach useful for improving | |
227 coloring even for architectures with regular register files | |
228 and in fact some benchmarking shows the improvement for | |
229 regular class architectures is even bigger than for irregular | |
230 ones. Another difference is that Smith's approach chooses | |
231 intersection of classes of all insn operands in which a given | |
232 pseudo occurs. IRA can use bigger classes if it is still | |
233 more profitable than memory usage. | |
177 | 234 |
178 * Popping the allocnos from the stack and assigning them hard | 235 * Popping the allocnos from the stack and assigning them hard |
179 registers. If IRA can not assign a hard register to an | 236 registers. If IRA can not assign a hard register to an |
180 allocno and the allocno is coalesced, IRA undoes the | 237 allocno and the allocno is coalesced, IRA undoes the |
181 coalescing and puts the uncoalesced allocnos onto the stack in | 238 coalescing and puts the uncoalesced allocnos onto the stack in |
185 assigns the allocno the hard-register with minimal full | 242 assigns the allocno the hard-register with minimal full |
186 allocation cost which reflects the cost of usage of the | 243 allocation cost which reflects the cost of usage of the |
187 hard-register for the allocno and cost of usage of the | 244 hard-register for the allocno and cost of usage of the |
188 hard-register for allocnos conflicting with given allocno. | 245 hard-register for allocnos conflicting with given allocno. |
189 | 246 |
190 * After allono assigning in the region, IRA modifies the hard | 247 * Chaitin-Briggs coloring assigns as many pseudos as possible |
248 to hard registers. After coloring we try to improve | |
249 allocation with cost point of view. We improve the | |
250 allocation by spilling some allocnos and assigning the freed | |
251 hard registers to other allocnos if it decreases the overall | |
252 allocation cost. | |
253 | |
254 * After allocno assigning in the region, IRA modifies the hard | |
191 register and memory costs for the corresponding allocnos in | 255 register and memory costs for the corresponding allocnos in |
192 the subregions to reflect the cost of possible loads, stores, | 256 the subregions to reflect the cost of possible loads, stores, |
193 or moves on the border of the region and its subregions. | 257 or moves on the border of the region and its subregions. |
194 When default regional allocation algorithm is used | 258 When default regional allocation algorithm is used |
195 (-fira-algorithm=mixed), IRA just propagates the assignment | 259 (-fira-algorithm=mixed), IRA just propagates the assignment |
196 for allocnos if the register pressure in the region for the | 260 for allocnos if the register pressure in the region for the |
197 corresponding cover class is less than number of available | 261 corresponding pressure class is less than number of available |
198 hard registers for given cover class. | 262 hard registers for given pressure class. |
199 | 263 |
200 o Spill/restore code moving. When IRA performs an allocation | 264 o Spill/restore code moving. When IRA performs an allocation |
201 by traversing regions in top-down order, it does not know what | 265 by traversing regions in top-down order, it does not know what |
202 happens below in the region tree. Therefore, sometimes IRA | 266 happens below in the region tree. Therefore, sometimes IRA |
203 misses opportunities to perform a better allocation. A simple | 267 misses opportunities to perform a better allocation. A simple |
208 implements a simple iterative algorithm performing profitable | 272 implements a simple iterative algorithm performing profitable |
209 transformations while they are still possible. It is fast in | 273 transformations while they are still possible. It is fast in |
210 practice, so there is no real need for a better time complexity | 274 practice, so there is no real need for a better time complexity |
211 algorithm. | 275 algorithm. |
212 | 276 |
213 o Code change. After coloring, two allocnos representing the same | 277 o Code change. After coloring, two allocnos representing the |
214 pseudo-register outside and inside a region respectively may be | 278 same pseudo-register outside and inside a region respectively |
215 assigned to different locations (hard-registers or memory). In | 279 may be assigned to different locations (hard-registers or |
216 this case IRA creates and uses a new pseudo-register inside the | 280 memory). In this case IRA creates and uses a new |
217 region and adds code to move allocno values on the region's | 281 pseudo-register inside the region and adds code to move allocno |
218 borders. This is done during top-down traversal of the regions | 282 values on the region's borders. This is done during top-down |
219 (file ira-emit.c). In some complicated cases IRA can create a | 283 traversal of the regions (file ira-emit.c). In some |
220 new allocno to move allocno values (e.g. when a swap of values | 284 complicated cases IRA can create a new allocno to move allocno |
221 stored in two hard-registers is needed). At this stage, the | 285 values (e.g. when a swap of values stored in two hard-registers |
222 new allocno is marked as spilled. IRA still creates the | 286 is needed). At this stage, the new allocno is marked as |
223 pseudo-register and the moves on the region borders even when | 287 spilled. IRA still creates the pseudo-register and the moves |
224 both allocnos were assigned to the same hard-register. If the | 288 on the region borders even when both allocnos were assigned to |
225 reload pass spills a pseudo-register for some reason, the | 289 the same hard-register. If the reload pass spills a |
226 effect will be smaller because another allocno will still be in | 290 pseudo-register for some reason, the effect will be smaller |
227 the hard-register. In most cases, this is better then spilling | 291 because another allocno will still be in the hard-register. In |
228 both allocnos. If reload does not change the allocation | 292 most cases, this is better then spilling both allocnos. If |
229 for the two pseudo-registers, the trivial move will be removed | 293 reload does not change the allocation for the two |
230 by post-reload optimizations. IRA does not generate moves for | 294 pseudo-registers, the trivial move will be removed by |
295 post-reload optimizations. IRA does not generate moves for | |
231 allocnos assigned to the same hard register when the default | 296 allocnos assigned to the same hard register when the default |
232 regional allocation algorithm is used and the register pressure | 297 regional allocation algorithm is used and the register pressure |
233 in the region for the corresponding allocno cover class is less | 298 in the region for the corresponding pressure class is less than |
234 than number of available hard registers for given cover class. | 299 number of available hard registers for given pressure class. |
235 IRA also does some optimizations to remove redundant stores and | 300 IRA also does some optimizations to remove redundant stores and |
236 to reduce code duplication on the region borders. | 301 to reduce code duplication on the region borders. |
237 | 302 |
238 o Flattening internal representation. After changing code, IRA | 303 o Flattening internal representation. After changing code, IRA |
239 transforms its internal representation for several regions into | 304 transforms its internal representation for several regions into |
240 one region representation (file ira-build.c). This process is | 305 one region representation (file ira-build.c). This process is |
241 called IR flattening. Such process is more complicated than IR | 306 called IR flattening. Such process is more complicated than IR |
242 rebuilding would be, but is much faster. | 307 rebuilding would be, but is much faster. |
243 | 308 |
244 o After IR flattening, IRA tries to assign hard registers to all | 309 o After IR flattening, IRA tries to assign hard registers to all |
245 spilled allocnos. This is impelemented by a simple and fast | 310 spilled allocnos. This is implemented by a simple and fast |
246 priority coloring algorithm (see function | 311 priority coloring algorithm (see function |
247 ira_reassign_conflict_allocnos::ira-color.c). Here new allocnos | 312 ira_reassign_conflict_allocnos::ira-color.c). Here new allocnos |
248 created during the code change pass can be assigned to hard | 313 created during the code change pass can be assigned to hard |
249 registers. | 314 registers. |
250 | 315 |
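The priority coloring mentioned above walks the remaining allocnos in priority order and gives each one the first hard register not already taken by a colored conflicting allocno. The self-contained sketch below shows that scheme; the priority measure, the bitmask conflict representation, and the helper names are assumptions for illustration, not the actual ira_reassign_conflict_allocnos code.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct allocno
{
  int id;
  int priority;       /* e.g. spill cost weighted by execution frequency */
  uint64_t conflicts; /* bit j set when this allocno conflicts with allocno j */
  int hard_regno;     /* result: assigned hard register, or -1 for memory */
};

static int
cmp_priority (const void *p1, const void *p2)
{
  const struct allocno *a1 = *(const struct allocno *const *) p1;
  const struct allocno *a2 = *(const struct allocno *const *) p2;
  return a2->priority - a1->priority;   /* descending priority */
}

static void
priority_color (struct allocno *a, int n, int nregs)
{
  struct allocno **order = malloc (n * sizeof *order);
  for (int i = 0; i < n; i++)
    {
      a[i].hard_regno = -1;
      order[i] = &a[i];
    }
  qsort (order, n, sizeof *order, cmp_priority);
  for (int i = 0; i < n; i++)
    {
      struct allocno *cur = order[i];
      /* Registers already taken by colored conflicting allocnos.  */
      uint64_t busy = 0;
      for (int j = 0; j < n; j++)
        if (((cur->conflicts >> a[j].id) & 1) && a[j].hard_regno >= 0)
          busy |= 1ull << a[j].hard_regno;
      for (int r = 0; r < nregs; r++)
        if (!((busy >> r) & 1))
          {
            cur->hard_regno = r;
            break;
          }
    }
  free (order);
}

int
main (void)
{
  /* Three mutually conflicting allocnos and two hard registers: the two
     highest-priority ones get registers, the third stays in memory.  */
  struct allocno a[3] = {
    { 0, 10, 0x6, -1 },
    { 1, 30, 0x5, -1 },
    { 2, 20, 0x3, -1 },
  };
  priority_color (a, 3, 2);
  for (int i = 0; i < 3; i++)
    printf ("allocno %d -> %d\n", i, a[i].hard_regno);
  return 0;
}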
261 * choosing a better hard-register to spill based on IRA info | 326 * choosing a better hard-register to spill based on IRA info |
262 about pseudo-register live ranges and the register pressure | 327 about pseudo-register live ranges and the register pressure |
263 in places where the pseudo-register lives. | 328 in places where the pseudo-register lives. |
264 | 329 |
265 IRA uses a lot of data representing the target processors. These | 330 IRA uses a lot of data representing the target processors. These |
266 data are initilized in file ira.c. | 331 data are initialized in file ira.c. |
267 | 332 |
268 If function has no loops (or the loops are ignored when | 333 If function has no loops (or the loops are ignored when |
269 -fira-algorithm=CB is used), we have classic Chaitin-Briggs | 334 -fira-algorithm=CB is used), we have classic Chaitin-Briggs |
270 coloring (only instead of separate pass of coalescing, we use hard | 335 coloring (only instead of separate pass of coalescing, we use hard |
271 register preferencing). In such case, IRA works much faster | 336 register preferencing). In such case, IRA works much faster |
285 Callahan-Koblenz Algorithms. | 350 Callahan-Koblenz Algorithms. |
286 | 351 |
287 o Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai. Global | 352 o Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai. Global |
288 Register Allocation Based on Graph Fusion. | 353 Register Allocation Based on Graph Fusion. |
289 | 354 |
355 o Michael D. Smith and Glenn Holloway. Graph-Coloring Register | |
356 Allocation for Irregular Architectures | |
357 | |
290 o Vladimir Makarov. The Integrated Register Allocator for GCC. | 358 o Vladimir Makarov. The Integrated Register Allocator for GCC. |
291 | 359 |
292 o Vladimir Makarov. The top-down register allocator for irregular | 360 o Vladimir Makarov. The top-down register allocator for irregular |
293 register file architectures. | 361 register file architectures. |
294 | 362 |
296 | 364 |
297 | 365 |
298 #include "config.h" | 366 #include "config.h" |
299 #include "system.h" | 367 #include "system.h" |
300 #include "coretypes.h" | 368 #include "coretypes.h" |
301 #include "tm.h" | 369 #include "backend.h" |
370 #include "target.h" | |
371 #include "rtl.h" | |
372 #include "tree.h" | |
373 #include "df.h" | |
374 #include "memmodel.h" | |
375 #include "tm_p.h" | |
376 #include "insn-config.h" | |
302 #include "regs.h" | 377 #include "regs.h" |
303 #include "rtl.h" | 378 #include "ira.h" |
304 #include "tm_p.h" | 379 #include "ira-int.h" |
305 #include "target.h" | 380 #include "diagnostic-core.h" |
306 #include "flags.h" | 381 #include "cfgrtl.h" |
307 #include "obstack.h" | 382 #include "cfgbuild.h" |
308 #include "bitmap.h" | 383 #include "cfgcleanup.h" |
309 #include "hard-reg-set.h" | |
310 #include "basic-block.h" | |
311 #include "df.h" | |
312 #include "expr.h" | 384 #include "expr.h" |
313 #include "recog.h" | |
314 #include "params.h" | |
315 #include "timevar.h" | |
316 #include "tree-pass.h" | 385 #include "tree-pass.h" |
317 #include "output.h" | 386 #include "output.h" |
318 #include "except.h" | |
319 #include "reload.h" | 387 #include "reload.h" |
320 #include "diagnostic-core.h" | 388 #include "cfgloop.h" |
321 #include "integrate.h" | 389 #include "lra.h" |
322 #include "ggc.h" | 390 #include "dce.h" |
323 #include "ira-int.h" | 391 #include "dbgcnt.h" |
324 | 392 #include "rtl-iter.h" |
393 #include "shrink-wrap.h" | |
394 #include "print-rtl.h" | |
325 | 395 |
326 struct target_ira default_target_ira; | 396 struct target_ira default_target_ira; |
327 struct target_ira_int default_target_ira_int; | 397 struct target_ira_int default_target_ira_int; |
328 #if SWITCHABLE_TARGET | 398 #if SWITCHABLE_TARGET |
329 struct target_ira *this_target_ira = &default_target_ira; | 399 struct target_ira *this_target_ira = &default_target_ira; |
341 | 411 |
342 /* The following array contains info about spilled pseudo-registers | 412 /* The following array contains info about spilled pseudo-registers |
343 stack slots used in current function so far. */ | 413 stack slots used in current function so far. */ |
344 struct ira_spilled_reg_stack_slot *ira_spilled_reg_stack_slots; | 414 struct ira_spilled_reg_stack_slot *ira_spilled_reg_stack_slots; |
345 | 415 |
346 /* Correspondingly overall cost of the allocation, cost of the | 416 /* Correspondingly overall cost of the allocation, overall cost before |
347 allocnos assigned to hard-registers, cost of the allocnos assigned | 417 reload, cost of the allocnos assigned to hard-registers, cost of |
348 to memory, cost of loads, stores and register move insns generated | 418 the allocnos assigned to memory, cost of loads, stores and register |
349 for pseudo-register live range splitting (see ira-emit.c). */ | 419 move insns generated for pseudo-register live range splitting (see |
350 int ira_overall_cost; | 420 ira-emit.c). */ |
351 int ira_reg_cost, ira_mem_cost; | 421 int64_t ira_overall_cost, overall_cost_before; |
352 int ira_load_cost, ira_store_cost, ira_shuffle_cost; | 422 int64_t ira_reg_cost, ira_mem_cost; |
423 int64_t ira_load_cost, ira_store_cost, ira_shuffle_cost; | |
353 int ira_move_loops_num, ira_additional_jumps_num; | 424 int ira_move_loops_num, ira_additional_jumps_num; |
354 | 425 |
355 /* All registers that can be eliminated. */ | 426 /* All registers that can be eliminated. */ |
356 | 427 |
357 HARD_REG_SET eliminable_regset; | 428 HARD_REG_SET eliminable_regset; |
429 | |
430 /* Value of max_reg_num () before IRA work start. This value helps | |
431 us to recognize a situation when new pseudos were created during | |
432 IRA work. */ | |
433 static int max_regno_before_ira; | |
358 | 434 |
359 /* Temporary hard reg set used for a different calculation. */ | 435 /* Temporary hard reg set used for a different calculation. */ |
360 static HARD_REG_SET temp_hard_regset; | 436 static HARD_REG_SET temp_hard_regset; |
361 | 437 |
438 #define last_mode_for_init_move_cost \ | |
439 (this_target_ira_int->x_last_mode_for_init_move_cost) | |
362 | 440 |
363 | 441 |
364 /* The function sets up the map IRA_REG_MODE_HARD_REGSET. */ | 442 /* The function sets up the map IRA_REG_MODE_HARD_REGSET. */ |
365 static void | 443 static void |
366 setup_reg_mode_hard_regset (void) | 444 setup_reg_mode_hard_regset (void) |
369 | 447 |
370 for (m = 0; m < NUM_MACHINE_MODES; m++) | 448 for (m = 0; m < NUM_MACHINE_MODES; m++) |
371 for (hard_regno = 0; hard_regno < FIRST_PSEUDO_REGISTER; hard_regno++) | 449 for (hard_regno = 0; hard_regno < FIRST_PSEUDO_REGISTER; hard_regno++) |
372 { | 450 { |
373 CLEAR_HARD_REG_SET (ira_reg_mode_hard_regset[hard_regno][m]); | 451 CLEAR_HARD_REG_SET (ira_reg_mode_hard_regset[hard_regno][m]); |
374 for (i = hard_regno_nregs[hard_regno][m] - 1; i >= 0; i--) | 452 for (i = hard_regno_nregs (hard_regno, (machine_mode) m) - 1; |
453 i >= 0; i--) | |
375 if (hard_regno + i < FIRST_PSEUDO_REGISTER) | 454 if (hard_regno + i < FIRST_PSEUDO_REGISTER) |
376 SET_HARD_REG_BIT (ira_reg_mode_hard_regset[hard_regno][m], | 455 SET_HARD_REG_BIT (ira_reg_mode_hard_regset[hard_regno][m], |
377 hard_regno + i); | 456 hard_regno + i); |
378 } | 457 } |
379 } | 458 } |
424 ira_non_ordered_class_hard_regs[cl][n++] = i; | 503 ira_non_ordered_class_hard_regs[cl][n++] = i; |
425 ira_assert (ira_class_hard_regs_num[cl] == n); | 504 ira_assert (ira_class_hard_regs_num[cl] == n); |
426 } | 505 } |
427 } | 506 } |
428 | 507 |
429 /* Set up IRA_AVAILABLE_CLASS_REGS. */ | |
430 static void | |
431 setup_available_class_regs (void) | |
432 { | |
433 int i, j; | |
434 | |
435 memset (ira_available_class_regs, 0, sizeof (ira_available_class_regs)); | |
436 for (i = 0; i < N_REG_CLASSES; i++) | |
437 { | |
438 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); | |
439 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
440 for (j = 0; j < FIRST_PSEUDO_REGISTER; j++) | |
441 if (TEST_HARD_REG_BIT (temp_hard_regset, j)) | |
442 ira_available_class_regs[i]++; | |
443 } | |
444 } | |
445 | |
446 /* Set up global variables defining info about hard registers for the | 508 /* Set up global variables defining info about hard registers for the |
447 allocation. These depend on USE_HARD_FRAME_P whose TRUE value means | 509 allocation. These depend on USE_HARD_FRAME_P whose TRUE value means |
448 that we can use the hard frame pointer for the allocation. */ | 510 that we can use the hard frame pointer for the allocation. */ |
449 static void | 511 static void |
450 setup_alloc_regs (bool use_hard_frame_p) | 512 setup_alloc_regs (bool use_hard_frame_p) |
451 { | 513 { |
452 #ifdef ADJUST_REG_ALLOC_ORDER | 514 #ifdef ADJUST_REG_ALLOC_ORDER |
453 ADJUST_REG_ALLOC_ORDER; | 515 ADJUST_REG_ALLOC_ORDER; |
454 #endif | 516 #endif |
455 COPY_HARD_REG_SET (no_unit_alloc_regs, fixed_reg_set); | 517 COPY_HARD_REG_SET (no_unit_alloc_regs, fixed_nonglobal_reg_set); |
456 if (! use_hard_frame_p) | 518 if (! use_hard_frame_p) |
457 SET_HARD_REG_BIT (no_unit_alloc_regs, HARD_FRAME_POINTER_REGNUM); | 519 SET_HARD_REG_BIT (no_unit_alloc_regs, HARD_FRAME_POINTER_REGNUM); |
458 setup_class_hard_regs (); | 520 setup_class_hard_regs (); |
459 setup_available_class_regs (); | |
460 } | 521 } |
461 | 522 |
462 | 523 |
463 | 524 |
464 /* Set up IRA_MEMORY_MOVE_COST, IRA_REGISTER_MOVE_COST. */ | 525 #define alloc_reg_class_subclasses \ |
526 (this_target_ira_int->x_alloc_reg_class_subclasses) | |
527 | |
528 /* Initialize the table of subclasses of each reg class. */ | |
529 static void | |
530 setup_reg_subclasses (void) | |
531 { | |
532 int i, j; | |
533 HARD_REG_SET temp_hard_regset2; | |
534 | |
535 for (i = 0; i < N_REG_CLASSES; i++) | |
536 for (j = 0; j < N_REG_CLASSES; j++) | |
537 alloc_reg_class_subclasses[i][j] = LIM_REG_CLASSES; | |
538 | |
539 for (i = 0; i < N_REG_CLASSES; i++) | |
540 { | |
541 if (i == (int) NO_REGS) | |
542 continue; | |
543 | |
544 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); | |
545 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
546 if (hard_reg_set_empty_p (temp_hard_regset)) | |
547 continue; | |
548 for (j = 0; j < N_REG_CLASSES; j++) | |
549 if (i != j) | |
550 { | |
551 enum reg_class *p; | |
552 | |
553 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[j]); | |
554 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); | |
555 if (! hard_reg_set_subset_p (temp_hard_regset, | |
556 temp_hard_regset2)) | |
557 continue; | |
558 p = &alloc_reg_class_subclasses[j][0]; | |
559 while (*p != LIM_REG_CLASSES) p++; | |
560 *p = (enum reg_class) i; | |
561 } | |
562 } | |
563 } | |
564 | |
565 | |
566 | |
567 /* Set up IRA_MEMORY_MOVE_COST and IRA_MAX_MEMORY_MOVE_COST. */ | |
465 static void | 568 static void |
466 setup_class_subset_and_memory_move_costs (void) | 569 setup_class_subset_and_memory_move_costs (void) |
467 { | 570 { |
468 int cl, cl2, mode; | 571 int cl, cl2, mode, cost; |
469 HARD_REG_SET temp_hard_regset2; | 572 HARD_REG_SET temp_hard_regset2; |
470 | 573 |
471 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | 574 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) |
472 ira_memory_move_cost[mode][NO_REGS][0] | 575 ira_memory_move_cost[mode][NO_REGS][0] |
473 = ira_memory_move_cost[mode][NO_REGS][1] = SHRT_MAX; | 576 = ira_memory_move_cost[mode][NO_REGS][1] = SHRT_MAX; |
474 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--) | 577 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--) |
475 { | 578 { |
476 if (cl != (int) NO_REGS) | 579 if (cl != (int) NO_REGS) |
477 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | 580 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) |
478 { | 581 { |
479 ira_memory_move_cost[mode][cl][0] = | 582 ira_max_memory_move_cost[mode][cl][0] |
480 memory_move_cost ((enum machine_mode) mode, | 583 = ira_memory_move_cost[mode][cl][0] |
481 (enum reg_class) cl, false); | 584 = memory_move_cost ((machine_mode) mode, |
482 ira_memory_move_cost[mode][cl][1] = | 585 (reg_class_t) cl, false); |
483 memory_move_cost ((enum machine_mode) mode, | 586 ira_max_memory_move_cost[mode][cl][1] |
484 (enum reg_class) cl, true); | 587 = ira_memory_move_cost[mode][cl][1] |
588 = memory_move_cost ((machine_mode) mode, | |
589 (reg_class_t) cl, true); | |
485 /* Costs for NO_REGS are used in cost calculation on the | 590 /* Costs for NO_REGS are used in cost calculation on the |
486 1st pass when the preferred register classes are not | 591 1st pass when the preferred register classes are not |
487 known yet. In this case we take the best scenario. */ | 592 known yet. In this case we take the best scenario. */ |
488 if (ira_memory_move_cost[mode][NO_REGS][0] | 593 if (ira_memory_move_cost[mode][NO_REGS][0] |
489 > ira_memory_move_cost[mode][cl][0]) | 594 > ira_memory_move_cost[mode][cl][0]) |
490 ira_memory_move_cost[mode][NO_REGS][0] | 595 ira_max_memory_move_cost[mode][NO_REGS][0] |
596 = ira_memory_move_cost[mode][NO_REGS][0] | |
491 = ira_memory_move_cost[mode][cl][0]; | 597 = ira_memory_move_cost[mode][cl][0]; |
492 if (ira_memory_move_cost[mode][NO_REGS][1] | 598 if (ira_memory_move_cost[mode][NO_REGS][1] |
493 > ira_memory_move_cost[mode][cl][1]) | 599 > ira_memory_move_cost[mode][cl][1]) |
494 ira_memory_move_cost[mode][NO_REGS][1] | 600 ira_max_memory_move_cost[mode][NO_REGS][1] |
601 = ira_memory_move_cost[mode][NO_REGS][1] | |
495 = ira_memory_move_cost[mode][cl][1]; | 602 = ira_memory_move_cost[mode][cl][1]; |
496 } | 603 } |
497 for (cl2 = (int) N_REG_CLASSES - 1; cl2 >= 0; cl2--) | 604 } |
498 { | 605 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--) |
499 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | 606 for (cl2 = (int) N_REG_CLASSES - 1; cl2 >= 0; cl2--) |
500 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | 607 { |
501 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[cl2]); | 608 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); |
502 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); | 609 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
503 ira_class_subset_p[cl][cl2] | 610 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[cl2]); |
504 = hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2); | 611 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); |
505 } | 612 ira_class_subset_p[cl][cl2] |
506 } | 613 = hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2); |
614 if (! hard_reg_set_empty_p (temp_hard_regset2) | |
615 && hard_reg_set_subset_p (reg_class_contents[cl2], | |
616 reg_class_contents[cl])) | |
617 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | |
618 { | |
619 cost = ira_memory_move_cost[mode][cl2][0]; | |
620 if (cost > ira_max_memory_move_cost[mode][cl][0]) | |
621 ira_max_memory_move_cost[mode][cl][0] = cost; | |
622 cost = ira_memory_move_cost[mode][cl2][1]; | |
623 if (cost > ira_max_memory_move_cost[mode][cl][1]) | |
624 ira_max_memory_move_cost[mode][cl][1] = cost; | |
625 } | |
626 } | |
627 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--) | |
628 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | |
629 { | |
630 ira_memory_move_cost[mode][cl][0] | |
631 = ira_max_memory_move_cost[mode][cl][0]; | |
632 ira_memory_move_cost[mode][cl][1] | |
633 = ira_max_memory_move_cost[mode][cl][1]; | |
634 } | |
635 setup_reg_subclasses (); | |
507 } | 636 } |
508 | 637 |
509 | 638 |
510 | 639 |
511 /* Define the following macro if allocation through malloc if | 640 /* Define the following macro if allocation through malloc if |
529 | 658 |
530 #ifndef IRA_NO_OBSTACK | 659 #ifndef IRA_NO_OBSTACK |
531 res = obstack_alloc (&ira_obstack, len); | 660 res = obstack_alloc (&ira_obstack, len); |
532 #else | 661 #else |
533 res = xmalloc (len); | 662 res = xmalloc (len); |
534 #endif | |
535 return res; | |
536 } | |
537 | |
538 /* Reallocate memory PTR of size LEN for IRA data. */ | |
539 void * | |
540 ira_reallocate (void *ptr, size_t len) | |
541 { | |
542 void *res; | |
543 | |
544 #ifndef IRA_NO_OBSTACK | |
545 res = obstack_alloc (&ira_obstack, len); | |
546 #else | |
547 res = xrealloc (ptr, len); | |
548 #endif | 663 #endif |
549 return res; | 664 return res; |
550 } | 665 } |
551 | 666 |
552 /* Free memory ADDR allocated for IRA data. */ | 667 /* Free memory ADDR allocated for IRA data. */ |
598 n++; | 713 n++; |
599 fprintf (f, " %4d:r%-4d", ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); | 714 fprintf (f, " %4d:r%-4d", ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); |
600 if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL) | 715 if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL) |
601 fprintf (f, "b%-3d", bb->index); | 716 fprintf (f, "b%-3d", bb->index); |
602 else | 717 else |
603 fprintf (f, "l%-3d", ALLOCNO_LOOP_TREE_NODE (a)->loop->num); | 718 fprintf (f, "l%-3d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num); |
604 if (ALLOCNO_HARD_REGNO (a) >= 0) | 719 if (ALLOCNO_HARD_REGNO (a) >= 0) |
605 fprintf (f, " %3d", ALLOCNO_HARD_REGNO (a)); | 720 fprintf (f, " %3d", ALLOCNO_HARD_REGNO (a)); |
606 else | 721 else |
607 fprintf (f, " mem"); | 722 fprintf (f, " mem"); |
608 } | 723 } |
616 { | 731 { |
617 ira_print_disposition (stderr); | 732 ira_print_disposition (stderr); |
618 } | 733 } |
619 | 734 |
620 | 735 |
621 #define alloc_reg_class_subclasses \ | 736 |
622 (this_target_ira_int->x_alloc_reg_class_subclasses) | 737 /* Set up ira_stack_reg_pressure_class which is the biggest pressure |
623 | 738 register class containing stack registers or NO_REGS if there are |
624 /* Initialize the table of subclasses of each reg class. */ | 739 no stack registers. To find this class, we iterate through all |
740 register pressure classes and choose the first register pressure | |
741 class containing all the stack registers and having the biggest | |
742 size. */ | |
625 static void | 743 static void |
626 setup_reg_subclasses (void) | 744 setup_stack_reg_pressure_class (void) |
627 { | 745 { |
628 int i, j; | 746 ira_stack_reg_pressure_class = NO_REGS; |
747 #ifdef STACK_REGS | |
748 { | |
749 int i, best, size; | |
750 enum reg_class cl; | |
751 HARD_REG_SET temp_hard_regset2; | |
752 | |
753 CLEAR_HARD_REG_SET (temp_hard_regset); | |
754 for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++) | |
755 SET_HARD_REG_BIT (temp_hard_regset, i); | |
756 best = 0; | |
757 for (i = 0; i < ira_pressure_classes_num; i++) | |
758 { | |
759 cl = ira_pressure_classes[i]; | |
760 COPY_HARD_REG_SET (temp_hard_regset2, temp_hard_regset); | |
761 AND_HARD_REG_SET (temp_hard_regset2, reg_class_contents[cl]); | |
762 size = hard_reg_set_size (temp_hard_regset2); | |
763 if (best < size) | |
764 { | |
765 best = size; | |
766 ira_stack_reg_pressure_class = cl; | |
767 } | |
768 } | |
769 } | |
770 #endif | |
771 } | |
772 | |
773 /* Find pressure classes which are register classes for which we | |
774 calculate register pressure in IRA, register pressure sensitive | |
775 insn scheduling, and register pressure sensitive loop invariant | |
776 motion. | |
777 | |
778 To make register pressure calculation easy, we always use | |
779 non-intersected register pressure classes. A move of hard | |
780 registers from one register pressure class is not more expensive | |
781 than load and store of the hard registers. Most likely an allocno | |
782 class will be a subset of a register pressure class and in many | |
783 cases a register pressure class. That makes usage of register | |
784 pressure classes a good approximation to find a high register | |
785 pressure. */ | |
786 static void | |
787 setup_pressure_classes (void) | |
788 { | |
789 int cost, i, n, curr; | |
790 int cl, cl2; | |
791 enum reg_class pressure_classes[N_REG_CLASSES]; | |
792 int m; | |
629 HARD_REG_SET temp_hard_regset2; | 793 HARD_REG_SET temp_hard_regset2; |
630 | 794 bool insert_p; |
631 for (i = 0; i < N_REG_CLASSES; i++) | 795 |
632 for (j = 0; j < N_REG_CLASSES; j++) | 796 if (targetm.compute_pressure_classes) |
633 alloc_reg_class_subclasses[i][j] = LIM_REG_CLASSES; | 797 n = targetm.compute_pressure_classes (pressure_classes); |
634 | 798 else |
635 for (i = 0; i < N_REG_CLASSES; i++) | 799 { |
636 { | 800 n = 0; |
637 if (i == (int) NO_REGS) | 801 for (cl = 0; cl < N_REG_CLASSES; cl++) |
802 { | |
803 if (ira_class_hard_regs_num[cl] == 0) | |
804 continue; | |
805 if (ira_class_hard_regs_num[cl] != 1 | |
806 /* A register class without subclasses may contain a few | |
807 hard registers and movement between them is costly | |
808 (e.g. SPARC FPCC registers). We still should consider it | |
809 as a candidate for a pressure class. */ | |
810 && alloc_reg_class_subclasses[cl][0] < cl) | |
811 { | |
812 /* Check that the moves between any hard registers of the | |
813 current class are not more expensive for a legal mode | |
814 than load/store of the hard registers of the current | |
815 class. Such class is a potential candidate to be a | |
816 register pressure class. */ | |
817 for (m = 0; m < NUM_MACHINE_MODES; m++) | |
818 { | |
819 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
820 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
821 AND_COMPL_HARD_REG_SET (temp_hard_regset, | |
822 ira_prohibited_class_mode_regs[cl][m]); | |
823 if (hard_reg_set_empty_p (temp_hard_regset)) | |
824 continue; | |
825 ira_init_register_move_cost_if_necessary ((machine_mode) m); | |
826 cost = ira_register_move_cost[m][cl][cl]; | |
827 if (cost <= ira_max_memory_move_cost[m][cl][1] | |
828 || cost <= ira_max_memory_move_cost[m][cl][0]) | |
829 break; | |
830 } | |
831 if (m >= NUM_MACHINE_MODES) | |
832 continue; | |
833 } | |
834 curr = 0; | |
835 insert_p = true; | |
836 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
837 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
838 /* Remove so far added pressure classes which are subset of the | |
839 current candidate class. Prefer GENERAL_REGS as a pressure | |
840 register class to another class containing the same | |
841 allocatable hard registers. We do this because machine | |
842 dependent cost hooks might give wrong costs for the latter | |
843 class but always give the right cost for the former class | |
844 (GENERAL_REGS). */ | |
845 for (i = 0; i < n; i++) | |
846 { | |
847 cl2 = pressure_classes[i]; | |
848 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[cl2]); | |
849 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); | |
850 if (hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2) | |
851 && (! hard_reg_set_equal_p (temp_hard_regset, | |
852 temp_hard_regset2) | |
853 || cl2 == (int) GENERAL_REGS)) | |
854 { | |
855 pressure_classes[curr++] = (enum reg_class) cl2; | |
856 insert_p = false; | |
857 continue; | |
858 } | |
859 if (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset) | |
860 && (! hard_reg_set_equal_p (temp_hard_regset2, | |
861 temp_hard_regset) | |
862 || cl == (int) GENERAL_REGS)) | |
863 continue; | |
864 if (hard_reg_set_equal_p (temp_hard_regset2, temp_hard_regset)) | |
865 insert_p = false; | |
866 pressure_classes[curr++] = (enum reg_class) cl2; | |
867 } | |
868 /* If the current candidate is a subset of a so far added | |
869 pressure class, don't add it to the list of the pressure | |
870 classes. */ | |
871 if (insert_p) | |
872 pressure_classes[curr++] = (enum reg_class) cl; | |
873 n = curr; | |
874 } | |
875 } | |
876 #ifdef ENABLE_IRA_CHECKING | |
877 { | |
878 HARD_REG_SET ignore_hard_regs; | |
879 | |
880 /* Check pressure classes correctness: here we check that hard | |
881 registers from all register pressure classes contains all hard | |
882 registers available for the allocation. */ | |
883 CLEAR_HARD_REG_SET (temp_hard_regset); | |
884 CLEAR_HARD_REG_SET (temp_hard_regset2); | |
885 COPY_HARD_REG_SET (ignore_hard_regs, no_unit_alloc_regs); | |
886 for (cl = 0; cl < LIM_REG_CLASSES; cl++) | |
887 { | |
888 /* For some targets (like MIPS with MD_REGS), there are some | |
889 classes with hard registers available for allocation but | |
890 not able to hold value of any mode. */ | |
891 for (m = 0; m < NUM_MACHINE_MODES; m++) | |
892 if (contains_reg_of_mode[cl][m]) | |
893 break; | |
894 if (m >= NUM_MACHINE_MODES) | |
895 { | |
896 IOR_HARD_REG_SET (ignore_hard_regs, reg_class_contents[cl]); | |
897 continue; | |
898 } | |
899 for (i = 0; i < n; i++) | |
900 if ((int) pressure_classes[i] == cl) | |
901 break; | |
902 IOR_HARD_REG_SET (temp_hard_regset2, reg_class_contents[cl]); | |
903 if (i < n) | |
904 IOR_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
905 } | |
906 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) | |
907 /* Some targets (like SPARC with ICC reg) have allocatable regs | |
908 for which no reg class is defined. */ | |
909 if (REGNO_REG_CLASS (i) == NO_REGS) | |
910 SET_HARD_REG_BIT (ignore_hard_regs, i); | |
911 AND_COMPL_HARD_REG_SET (temp_hard_regset, ignore_hard_regs); | |
912 AND_COMPL_HARD_REG_SET (temp_hard_regset2, ignore_hard_regs); | |
913 ira_assert (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset)); | |
914 } | |
915 #endif | |
916 ira_pressure_classes_num = 0; | |
917 for (i = 0; i < n; i++) | |
918 { | |
919 cl = (int) pressure_classes[i]; | |
920 ira_reg_pressure_class_p[cl] = true; | |
921 ira_pressure_classes[ira_pressure_classes_num++] = (enum reg_class) cl; | |
922 } | |
923 setup_stack_reg_pressure_class (); | |
924 } | |
925 | |
926 /* Set up IRA_UNIFORM_CLASS_P. Uniform class is a register class | |
927 whose register move cost between any registers of the class is the | |
928 same as for all its subclasses. We use the data to speed up the | |
929 2nd pass of calculations of allocno costs. */ | |
930 static void | |
931 setup_uniform_class_p (void) | |
932 { | |
933 int i, cl, cl2, m; | |
934 | |
935 for (cl = 0; cl < N_REG_CLASSES; cl++) | |
936 { | |
937 ira_uniform_class_p[cl] = false; | |
938 if (ira_class_hard_regs_num[cl] == 0) | |
638 continue; | 939 continue; |
639 | 940 /* We can not use alloc_reg_class_subclasses here because move |
941 cost hooks does not take into account that some registers are | |
942 unavailable for the subtarget. E.g. for i686, INT_SSE_REGS | |
943 is element of alloc_reg_class_subclasses for GENERAL_REGS | |
944 because SSE regs are unavailable. */ | |
945 for (i = 0; (cl2 = reg_class_subclasses[cl][i]) != LIM_REG_CLASSES; i++) | |
946 { | |
947 if (ira_class_hard_regs_num[cl2] == 0) | |
948 continue; | |
949 for (m = 0; m < NUM_MACHINE_MODES; m++) | |
950 if (contains_reg_of_mode[cl][m] && contains_reg_of_mode[cl2][m]) | |
951 { | |
952 ira_init_register_move_cost_if_necessary ((machine_mode) m); | |
953 if (ira_register_move_cost[m][cl][cl] | |
954 != ira_register_move_cost[m][cl2][cl2]) | |
955 break; | |
956 } | |
957 if (m < NUM_MACHINE_MODES) | |
958 break; | |
959 } | |
960 if (cl2 == LIM_REG_CLASSES) | |
961 ira_uniform_class_p[cl] = true; | |
962 } | |
963 } | |
964 | |
965 /* Set up IRA_ALLOCNO_CLASSES, IRA_ALLOCNO_CLASSES_NUM, | |
966 IRA_IMPORTANT_CLASSES, and IRA_IMPORTANT_CLASSES_NUM. | |
967 | |
968 Target may have many subtargets and not all target hard registers can | |
969 be used for allocation, e.g. x86 port in 32-bit mode can not use | |
970 hard registers introduced in x86-64 like r8-r15). Some classes | |
971 might have the same allocatable hard registers, e.g. INDEX_REGS | |
972 and GENERAL_REGS in x86 port in 32-bit mode. To decrease different | |
973 calculations efforts we introduce allocno classes which contain | |
974 unique non-empty sets of allocatable hard-registers. | |
975 | |
976 Pseudo class cost calculation in ira-costs.c is very expensive. | |
977 Therefore we are trying to decrease number of classes involved in | |
978 such calculation. Register classes used in the cost calculation | |
979 are called important classes. They are allocno classes and other | |
980 non-empty classes whose allocatable hard register sets are inside | |
981 of an allocno class hard register set. From the first sight, it | |
982 looks like that they are just allocno classes. It is not true. In | |
983 example of x86-port in 32-bit mode, allocno classes will contain | |
984 GENERAL_REGS but not LEGACY_REGS (because allocatable hard | |
985 registers are the same for the both classes). The important | |
986 classes will contain GENERAL_REGS and LEGACY_REGS. It is done | |
987 because a machine description insn constraint may refers for | |
988 LEGACY_REGS and code in ira-costs.c is mostly base on investigation | |
989 of the insn constraints. */ | |
990 static void | |
991 setup_allocno_and_important_classes (void) | |
992 { | |
993 int i, j, n, cl; | |
994 bool set_p; | |
995 HARD_REG_SET temp_hard_regset2; | |
996 static enum reg_class classes[LIM_REG_CLASSES + 1]; | |
997 | |
998 n = 0; | |
999 /* Collect classes which contain unique sets of allocatable hard | |
1000 registers. Prefer GENERAL_REGS to other classes containing the | |
1001 same set of hard registers. */ | |
1002 for (i = 0; i < LIM_REG_CLASSES; i++) | |
1003 { | |
640 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); | 1004 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); |
641 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | 1005 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
642 if (hard_reg_set_empty_p (temp_hard_regset)) | 1006 for (j = 0; j < n; j++) |
643 continue; | |
644 for (j = 0; j < N_REG_CLASSES; j++) | |
645 if (i != j) | |
646 { | |
647 enum reg_class *p; | |
648 | |
649 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[j]); | |
650 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); | |
651 if (! hard_reg_set_subset_p (temp_hard_regset, | |
652 temp_hard_regset2)) | |
653 continue; | |
654 p = &alloc_reg_class_subclasses[j][0]; | |
655 while (*p != LIM_REG_CLASSES) p++; | |
656 *p = (enum reg_class) i; | |
657 } | |
658 } | |
659 } | |
660 | |
661 | |
662 | |
663 /* Set the four global variables defined above. */ | |
664 static void | |
665 setup_cover_and_important_classes (void) | |
666 { | |
667 int i, j, n, cl; | |
668 bool set_p; | |
669 const reg_class_t *cover_classes; | |
670 HARD_REG_SET temp_hard_regset2; | |
671 static enum reg_class classes[LIM_REG_CLASSES + 1]; | |
672 | |
673 if (targetm.ira_cover_classes == NULL) | |
674 cover_classes = NULL; | |
675 else | |
676 cover_classes = targetm.ira_cover_classes (); | |
677 if (cover_classes == NULL) | |
678 ira_assert (flag_ira_algorithm == IRA_ALGORITHM_PRIORITY); | |
679 else | |
680 { | |
681 for (i = 0; (cl = cover_classes[i]) != LIM_REG_CLASSES; i++) | |
682 classes[i] = (enum reg_class) cl; | |
683 classes[i] = LIM_REG_CLASSES; | |
684 } | |
685 | |
686 if (flag_ira_algorithm == IRA_ALGORITHM_PRIORITY) | |
687 { | |
688 n = 0; | |
689 for (i = 0; i <= LIM_REG_CLASSES; i++) | |
690 { | 1007 { |
691 if (i == NO_REGS) | 1008 cl = classes[j]; |
692 continue; | 1009 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[cl]); |
693 #ifdef CONSTRAINT_NUM_DEFINED_P | 1010 AND_COMPL_HARD_REG_SET (temp_hard_regset2, |
694 for (j = 0; j < CONSTRAINT__LIMIT; j++) | 1011 no_unit_alloc_regs); |
695 if ((int) REG_CLASS_FOR_CONSTRAINT ((enum constraint_num) j) == i) | 1012 if (hard_reg_set_equal_p (temp_hard_regset, |
696 break; | 1013 temp_hard_regset2)) |
697 if (j < CONSTRAINT__LIMIT) | 1014 break; |
698 { | |
699 classes[n++] = (enum reg_class) i; | |
700 continue; | |
701 } | |
702 #endif | |
703 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[i]); | |
704 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
705 for (j = 0; j < LIM_REG_CLASSES; j++) | |
706 { | |
707 if (i == j) | |
708 continue; | |
709 COPY_HARD_REG_SET (temp_hard_regset2, reg_class_contents[j]); | |
710 AND_COMPL_HARD_REG_SET (temp_hard_regset2, | |
711 no_unit_alloc_regs); | |
712 if (hard_reg_set_equal_p (temp_hard_regset, | |
713 temp_hard_regset2)) | |
714 break; | |
715 } | |
716 if (j >= i) | |
717 classes[n++] = (enum reg_class) i; | |
718 } | 1015 } |
719 classes[n] = LIM_REG_CLASSES; | 1016 if (j >= n || targetm.additional_allocno_class_p (i)) |
720 } | 1017 classes[n++] = (enum reg_class) i; |
721 | 1018 else if (i == GENERAL_REGS) |
722 ira_reg_class_cover_size = 0; | 1019 /* Prefer general regs. For i386 example, it means that |
1020 we prefer GENERAL_REGS over INDEX_REGS or LEGACY_REGS | |
1021 (all of them consists of the same available hard | |
1022 registers). */ | |
1023 classes[j] = (enum reg_class) i; | |
1024 } | |
1025 classes[n] = LIM_REG_CLASSES; | |
1026 | |
1027 /* Set up classes which can be used for allocnos as classes | |
1028 containing non-empty unique sets of allocatable hard | |
1029 registers. */ | |
1030 ira_allocno_classes_num = 0; | |
723 for (i = 0; (cl = classes[i]) != LIM_REG_CLASSES; i++) | 1031 for (i = 0; (cl = classes[i]) != LIM_REG_CLASSES; i++) |
724 { | 1032 if (ira_class_hard_regs_num[cl] > 0) |
725 for (j = 0; j < i; j++) | 1033 ira_allocno_classes[ira_allocno_classes_num++] = (enum reg_class) cl; |
726 if (flag_ira_algorithm != IRA_ALGORITHM_PRIORITY | |
727 && reg_classes_intersect_p ((enum reg_class) cl, classes[j])) | |
728 gcc_unreachable (); | |
729 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
730 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
731 if (! hard_reg_set_empty_p (temp_hard_regset)) | |
732 ira_reg_class_cover[ira_reg_class_cover_size++] = (enum reg_class) cl; | |
733 } | |
734 ira_important_classes_num = 0; | 1034 ira_important_classes_num = 0; |
1035 /* Add non-allocno classes containing to non-empty set of | |
1036 allocatable hard regs. */ | |
735 for (cl = 0; cl < N_REG_CLASSES; cl++) | 1037 for (cl = 0; cl < N_REG_CLASSES; cl++) |
736 { | 1038 if (ira_class_hard_regs_num[cl] > 0) |
737 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
738 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
739 if (! hard_reg_set_empty_p (temp_hard_regset)) | |
740 { | |
741 set_p = false; | |
742 for (j = 0; j < ira_reg_class_cover_size; j++) | |
743 { | |
744 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
745 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
746 COPY_HARD_REG_SET (temp_hard_regset2, | |
747 reg_class_contents[ira_reg_class_cover[j]]); | |
748 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); | |
749 if ((enum reg_class) cl == ira_reg_class_cover[j] | |
750 || hard_reg_set_equal_p (temp_hard_regset, | |
751 temp_hard_regset2)) | |
752 break; | |
753 else if (hard_reg_set_subset_p (temp_hard_regset, | |
754 temp_hard_regset2)) | |
755 set_p = true; | |
756 } | |
757 if (set_p && j >= ira_reg_class_cover_size) | |
758 ira_important_classes[ira_important_classes_num++] | |
759 = (enum reg_class) cl; | |
760 } | |
761 } | |
762 for (j = 0; j < ira_reg_class_cover_size; j++) | |
763 ira_important_classes[ira_important_classes_num++] | |
764 = ira_reg_class_cover[j]; | |
765 } | |
766 | |
767 /* Set up array IRA_CLASS_TRANSLATE. */ | |
768 static void | |
769 setup_class_translate (void) | |
770 { | |
771 int cl, mode; | |
772 enum reg_class cover_class, best_class, *cl_ptr; | |
773 int i, cost, min_cost, best_cost; | |
774 | |
775 for (cl = 0; cl < N_REG_CLASSES; cl++) | |
776 ira_class_translate[cl] = NO_REGS; | |
777 | |
778 if (flag_ira_algorithm == IRA_ALGORITHM_PRIORITY) | |
779 for (cl = 0; cl < LIM_REG_CLASSES; cl++) | |
780 { | 1039 { |
781 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | 1040 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); |
782 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | 1041 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
783 for (i = 0; i < ira_reg_class_cover_size; i++) | 1042 set_p = false; |
1043 for (j = 0; j < ira_allocno_classes_num; j++) | |
784 { | 1044 { |
785 HARD_REG_SET temp_hard_regset2; | |
786 | |
787 cover_class = ira_reg_class_cover[i]; | |
788 COPY_HARD_REG_SET (temp_hard_regset2, | 1045 COPY_HARD_REG_SET (temp_hard_regset2, |
789 reg_class_contents[cover_class]); | 1046 reg_class_contents[ira_allocno_classes[j]]); |
790 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); | 1047 AND_COMPL_HARD_REG_SET (temp_hard_regset2, no_unit_alloc_regs); |
791 if (hard_reg_set_equal_p (temp_hard_regset, temp_hard_regset2)) | 1048 if ((enum reg_class) cl == ira_allocno_classes[j]) |
792 ira_class_translate[cl] = cover_class; | 1049 break; |
1050 else if (hard_reg_set_subset_p (temp_hard_regset, | |
1051 temp_hard_regset2)) | |
1052 set_p = true; | |
793 } | 1053 } |
1054 if (set_p && j >= ira_allocno_classes_num) | |
1055 ira_important_classes[ira_important_classes_num++] | |
1056 = (enum reg_class) cl; | |
794 } | 1057 } |
795 for (i = 0; i < ira_reg_class_cover_size; i++) | 1058 /* Now add allocno classes to the important classes. */ |
796 { | 1059 for (j = 0; j < ira_allocno_classes_num; j++) |
797 cover_class = ira_reg_class_cover[i]; | 1060 ira_important_classes[ira_important_classes_num++] |
798 if (flag_ira_algorithm != IRA_ALGORITHM_PRIORITY) | 1061 = ira_allocno_classes[j]; |
799 for (cl_ptr = &alloc_reg_class_subclasses[cover_class][0]; | |
800 (cl = *cl_ptr) != LIM_REG_CLASSES; | |
801 cl_ptr++) | |
802 { | |
803 if (ira_class_translate[cl] == NO_REGS) | |
804 ira_class_translate[cl] = cover_class; | |
805 #ifdef ENABLE_IRA_CHECKING | |
806 else | |
807 { | |
808 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | |
809 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | |
810 if (! hard_reg_set_empty_p (temp_hard_regset)) | |
811 gcc_unreachable (); | |
812 } | |
813 #endif | |
814 } | |
815 ira_class_translate[cover_class] = cover_class; | |
816 } | |
817 /* For classes which are not fully covered by a cover class (in | |
818 other words covered by more one cover class), use the cheapest | |
819 cover class. */ | |
820 for (cl = 0; cl < N_REG_CLASSES; cl++) | 1062 for (cl = 0; cl < N_REG_CLASSES; cl++) |
821 { | 1063 { |
822 if (cl == NO_REGS || ira_class_translate[cl] != NO_REGS) | 1064 ira_reg_allocno_class_p[cl] = false; |
1065 ira_reg_pressure_class_p[cl] = false; | |
1066 } | |
1067 for (j = 0; j < ira_allocno_classes_num; j++) | |
1068 ira_reg_allocno_class_p[ira_allocno_classes[j]] = true; | |
1069 setup_pressure_classes (); | |
1070 setup_uniform_class_p (); | |
1071 } | |
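To make the allocno/important distinction concrete, here is a small standalone sketch in plain C -- the bit-mask register sets are invented, not real target data. A candidate class counts as important when its allocatable set is a proper subset of some allocno class; the allocno classes themselves are appended to the important classes afterwards, as the loop above does.

#include <stdio.h>

int main (void)
{
  /* Invented allocatable register sets: two allocno classes and four
     candidate classes.  Not real target data.  */
  const unsigned allocno_class[] = { 0x0f, 0xf0 };
  const unsigned candidate[] = { 0x03, 0x0f, 0x30, 0xff };

  for (unsigned i = 0; i < sizeof candidate / sizeof *candidate; i++)
    {
      int is_allocno = 0, is_subset = 0;

      for (unsigned j = 0; j < sizeof allocno_class / sizeof *allocno_class; j++)
        {
          if (candidate[i] == allocno_class[j])
            is_allocno = 1;
          else if ((candidate[i] & ~allocno_class[j]) == 0)
            is_subset = 1;
        }
      printf ("0x%02x: %s\n", candidate[i],
              is_allocno ? "allocno class (important by definition)"
              : is_subset ? "important (proper subset of an allocno class)"
              : "not important");
    }
  return 0;
}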
1072 | |
1073 /* Set up translation in CLASS_TRANSLATE of all classes into a class | |
1074 given by array CLASSES of length CLASSES_NUM. The function is used | |
1075 to translate any reg class to an allocno class or to a | |
1076 pressure class. This translation is necessary for some | |
1077 calculations when we can use only allocno or pressure classes and | |
1078 such translation gives an approximate representation of all | |
1079 classes. | |
1080 | |
1081 The translation in case when allocatable hard register set of a | |
1082 given class is subset of allocatable hard register set of a class | |
1083 in CLASSES is pretty simple. We use smallest classes from CLASSES | |
1084 containing a given class. If allocatable hard register set of a | |
1085 given class is not a subset of any corresponding set of a class | |
1086 from CLASSES, we use the cheapest (from the load/store point of view) | |
1087 class from CLASSES whose set intersects with given class set. */ | |
1088 static void | |
1089 setup_class_translate_array (enum reg_class *class_translate, | |
1090 int classes_num, enum reg_class *classes) | |
1091 { | |
1092 int cl, mode; | |
1093 enum reg_class aclass, best_class, *cl_ptr; | |
1094 int i, cost, min_cost, best_cost; | |
1095 | |
1096 for (cl = 0; cl < N_REG_CLASSES; cl++) | |
1097 class_translate[cl] = NO_REGS; | |
1098 | |
1099 for (i = 0; i < classes_num; i++) | |
1100 { | |
1101 aclass = classes[i]; | |
1102 for (cl_ptr = &alloc_reg_class_subclasses[aclass][0]; | |
1103 (cl = *cl_ptr) != LIM_REG_CLASSES; | |
1104 cl_ptr++) | |
1105 if (class_translate[cl] == NO_REGS) | |
1106 class_translate[cl] = aclass; | |
1107 class_translate[aclass] = aclass; | |
1108 } | |
1109 /* For classes which are not fully covered by one of given classes | |
1110 (in other words covered by more than one given class), use the | |
1111 cheapest class. */ | |
1112 for (cl = 0; cl < N_REG_CLASSES; cl++) | |
1113 { | |
1114 if (cl == NO_REGS || class_translate[cl] != NO_REGS) | |
823 continue; | 1115 continue; |
824 best_class = NO_REGS; | 1116 best_class = NO_REGS; |
825 best_cost = INT_MAX; | 1117 best_cost = INT_MAX; |
826 for (i = 0; i < ira_reg_class_cover_size; i++) | 1118 for (i = 0; i < classes_num; i++) |
827 { | 1119 { |
828 cover_class = ira_reg_class_cover[i]; | 1120 aclass = classes[i]; |
829 COPY_HARD_REG_SET (temp_hard_regset, | 1121 COPY_HARD_REG_SET (temp_hard_regset, |
830 reg_class_contents[cover_class]); | 1122 reg_class_contents[aclass]); |
831 AND_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); | 1123 AND_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); |
832 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | 1124 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
833 if (! hard_reg_set_empty_p (temp_hard_regset)) | 1125 if (! hard_reg_set_empty_p (temp_hard_regset)) |
834 { | 1126 { |
835 min_cost = INT_MAX; | 1127 min_cost = INT_MAX; |
836 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | 1128 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) |
837 { | 1129 { |
838 cost = (ira_memory_move_cost[mode][cl][0] | 1130 cost = (ira_memory_move_cost[mode][aclass][0] |
839 + ira_memory_move_cost[mode][cl][1]); | 1131 + ira_memory_move_cost[mode][aclass][1]); |
840 if (min_cost > cost) | 1132 if (min_cost > cost) |
841 min_cost = cost; | 1133 min_cost = cost; |
842 } | 1134 } |
843 if (best_class == NO_REGS || best_cost > min_cost) | 1135 if (best_class == NO_REGS || best_cost > min_cost) |
844 { | 1136 { |
845 best_class = cover_class; | 1137 best_class = aclass; |
846 best_cost = min_cost; | 1138 best_cost = min_cost; |
847 } | 1139 } |
848 } | 1140 } |
849 } | 1141 } |
850 ira_class_translate[cl] = best_class; | 1142 class_translate[cl] = best_class; |
851 } | 1143 } |
852 } | 1144 } |
853 | 1145 |
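The comment before setup_class_translate_array describes a two-step rule: prefer a class from the given list that contains the class being translated, otherwise fall back to the cheapest class (by memory-move cost) that at least intersects it. Below is a rough standalone model of that rule; the toy classes, their bit-mask contents and the costs are all invented.

#include <stdio.h>
#include <limits.h>

enum { TOY_NO_REGS, TOY_A_REGS, TOY_B_REGS, TOY_ALL_REGS, TOY_N };
static const unsigned contents[TOY_N] = { 0x0, 0x3, 0xc, 0xf };
static const int mem_cost[TOY_N] = { 0, 2, 6, 4 };   /* invented costs */

/* Translate CL into one of the N classes in LIST: the first class whose
   contents contain CL (the real code arranges the walk so this is the
   smallest such class), otherwise the cheapest intersecting class.  */
static int
translate (int cl, const int *list, int n)
{
  int best = TOY_NO_REGS, best_cost = INT_MAX;

  for (int i = 0; i < n; i++)
    {
      if ((contents[cl] & ~contents[list[i]]) == 0)
        return list[i];
      if ((contents[cl] & contents[list[i]]) != 0
          && mem_cost[list[i]] < best_cost)
        {
          best = list[i];
          best_cost = mem_cost[list[i]];
        }
    }
  return best;
}

int main (void)
{
  const int allocno_classes[] = { TOY_A_REGS, TOY_B_REGS };

  for (int cl = TOY_A_REGS; cl < TOY_N; cl++)
    printf ("class %d -> %d\n", cl, translate (cl, allocno_classes, 2));
  return 0;
}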
854 /* Order numbers of cover classes in original target cover class | 1146 /* Set up array IRA_ALLOCNO_CLASS_TRANSLATE and |
855 array, -1 for non-cover classes. This is only live during | 1147 IRA_PRESSURE_CLASS_TRANSLATE. */ |
856 reorder_important_classes. */ | 1148 static void |
857 static int cover_class_order[N_REG_CLASSES]; | 1149 setup_class_translate (void) |
1150 { | |
1151 setup_class_translate_array (ira_allocno_class_translate, | |
1152 ira_allocno_classes_num, ira_allocno_classes); | |
1153 setup_class_translate_array (ira_pressure_class_translate, | |
1154 ira_pressure_classes_num, ira_pressure_classes); | |
1155 } | |
1156 | |
1157 /* Order numbers of allocno classes in original target allocno class | |
1158 array, -1 for non-allocno classes. */ | |
1159 static int allocno_class_order[N_REG_CLASSES]; | |
858 | 1160 |
859 /* The function used to sort the important classes. */ | 1161 /* The function used to sort the important classes. */ |
860 static int | 1162 static int |
861 comp_reg_classes_func (const void *v1p, const void *v2p) | 1163 comp_reg_classes_func (const void *v1p, const void *v2p) |
862 { | 1164 { |
863 enum reg_class cl1 = *(const enum reg_class *) v1p; | 1165 enum reg_class cl1 = *(const enum reg_class *) v1p; |
864 enum reg_class cl2 = *(const enum reg_class *) v2p; | 1166 enum reg_class cl2 = *(const enum reg_class *) v2p; |
1167 enum reg_class tcl1, tcl2; | |
865 int diff; | 1168 int diff; |
866 | 1169 |
867 cl1 = ira_class_translate[cl1]; | 1170 tcl1 = ira_allocno_class_translate[cl1]; |
868 cl2 = ira_class_translate[cl2]; | 1171 tcl2 = ira_allocno_class_translate[cl2]; |
869 if (cl1 != NO_REGS && cl2 != NO_REGS | 1172 if (tcl1 != NO_REGS && tcl2 != NO_REGS |
870 && (diff = cover_class_order[cl1] - cover_class_order[cl2]) != 0) | 1173 && (diff = allocno_class_order[tcl1] - allocno_class_order[tcl2]) != 0) |
871 return diff; | 1174 return diff; |
872 return (int) cl1 - (int) cl2; | 1175 return (int) cl1 - (int) cl2; |
873 } | 1176 } |
874 | 1177 |
875 /* Reorder important classes according to the order of their cover | 1178 /* For setup_reg_class_relations to work correctly, we need to |
876 classes. */ | 1179 reorder important classes according to the order of their allocno |
1180 classes. It places important classes containing the same | |
1181 allocatable hard register set adjacent to each other, and places the | |
1182 allocno class with that hard register set right after the other | |
1183 important classes with the same set. | |
1184 | |
1185 In the example from the comments of function | |
1186 setup_allocno_and_important_classes, it places LEGACY_REGS and | |
1187 GENERAL_REGS close to each other and GENERAL_REGS is after | |
1188 LEGACY_REGS. */ | |
877 static void | 1189 static void |
878 reorder_important_classes (void) | 1190 reorder_important_classes (void) |
879 { | 1191 { |
880 int i; | 1192 int i; |
881 | 1193 |
882 for (i = 0; i < N_REG_CLASSES; i++) | 1194 for (i = 0; i < N_REG_CLASSES; i++) |
883 cover_class_order[i] = -1; | 1195 allocno_class_order[i] = -1; |
884 for (i = 0; i < ira_reg_class_cover_size; i++) | 1196 for (i = 0; i < ira_allocno_classes_num; i++) |
885 cover_class_order[ira_reg_class_cover[i]] = i; | 1197 allocno_class_order[ira_allocno_classes[i]] = i; |
886 qsort (ira_important_classes, ira_important_classes_num, | 1198 qsort (ira_important_classes, ira_important_classes_num, |
887 sizeof (enum reg_class), comp_reg_classes_func); | 1199 sizeof (enum reg_class), comp_reg_classes_func); |
888 } | 1200 for (i = 0; i < ira_important_classes_num; i++) |
889 | 1201 ira_important_class_nums[ira_important_classes[i]] = i; |
890 /* Set up the above reg class relations. */ | 1202 } |
1203 | |
1204 /* Set up IRA_REG_CLASS_SUBUNION, IRA_REG_CLASS_SUPERUNION, | |
1205 IRA_REG_CLASS_SUPER_CLASSES, IRA_REG_CLASSES_INTERSECT, and | |
1206 IRA_REG_CLASSES_INTERSECT_P. For the meaning of the relations, | |
1207 please see corresponding comments in ira-int.h. */ | |
891 static void | 1208 static void |
892 setup_reg_class_relations (void) | 1209 setup_reg_class_relations (void) |
893 { | 1210 { |
894 int i, cl1, cl2, cl3; | 1211 int i, cl1, cl2, cl3; |
895 HARD_REG_SET intersection_set, union_set, temp_set2; | 1212 HARD_REG_SET intersection_set, union_set, temp_set2; |
903 ira_reg_class_super_classes[cl1][0] = LIM_REG_CLASSES; | 1220 ira_reg_class_super_classes[cl1][0] = LIM_REG_CLASSES; |
904 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++) | 1221 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++) |
905 { | 1222 { |
906 ira_reg_classes_intersect_p[cl1][cl2] = false; | 1223 ira_reg_classes_intersect_p[cl1][cl2] = false; |
907 ira_reg_class_intersect[cl1][cl2] = NO_REGS; | 1224 ira_reg_class_intersect[cl1][cl2] = NO_REGS; |
1225 ira_reg_class_subset[cl1][cl2] = NO_REGS; | |
908 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]); | 1226 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]); |
909 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | 1227 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
910 COPY_HARD_REG_SET (temp_set2, reg_class_contents[cl2]); | 1228 COPY_HARD_REG_SET (temp_set2, reg_class_contents[cl2]); |
911 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); | 1229 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); |
912 if (hard_reg_set_empty_p (temp_hard_regset) | 1230 if (hard_reg_set_empty_p (temp_hard_regset) |
913 && hard_reg_set_empty_p (temp_set2)) | 1231 && hard_reg_set_empty_p (temp_set2)) |
914 { | 1232 { |
1233 /* Both classes have no allocatable hard registers | |
1234 -- take all class hard registers into account and use | |
1235 reg_class_subunion and reg_class_superunion. */ | |
915 for (i = 0;; i++) | 1236 for (i = 0;; i++) |
916 { | 1237 { |
917 cl3 = reg_class_subclasses[cl1][i]; | 1238 cl3 = reg_class_subclasses[cl1][i]; |
918 if (cl3 == LIM_REG_CLASSES) | 1239 if (cl3 == LIM_REG_CLASSES) |
919 break; | 1240 break; |
920 if (reg_class_subset_p (ira_reg_class_intersect[cl1][cl2], | 1241 if (reg_class_subset_p (ira_reg_class_intersect[cl1][cl2], |
921 (enum reg_class) cl3)) | 1242 (enum reg_class) cl3)) |
922 ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; | 1243 ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; |
923 } | 1244 } |
924 ira_reg_class_union[cl1][cl2] = reg_class_subunion[cl1][cl2]; | 1245 ira_reg_class_subunion[cl1][cl2] = reg_class_subunion[cl1][cl2]; |
1246 ira_reg_class_superunion[cl1][cl2] = reg_class_superunion[cl1][cl2]; | |
925 continue; | 1247 continue; |
926 } | 1248 } |
927 ira_reg_classes_intersect_p[cl1][cl2] | 1249 ira_reg_classes_intersect_p[cl1][cl2] |
928 = hard_reg_set_intersect_p (temp_hard_regset, temp_set2); | 1250 = hard_reg_set_intersect_p (temp_hard_regset, temp_set2); |
929 if (important_class_p[cl1] && important_class_p[cl2] | 1251 if (important_class_p[cl1] && important_class_p[cl2] |
930 && hard_reg_set_subset_p (temp_hard_regset, temp_set2)) | 1252 && hard_reg_set_subset_p (temp_hard_regset, temp_set2)) |
931 { | 1253 { |
1254 /* CL1 and CL2 are important classes and CL1 allocatable | |
1255 hard register set is inside of CL2 allocatable hard | |
1256 registers -- record CL2 as a super class of CL1. */ | |
932 enum reg_class *p; | 1257 enum reg_class *p; |
933 | 1258 |
934 p = &ira_reg_class_super_classes[cl1][0]; | 1259 p = &ira_reg_class_super_classes[cl1][0]; |
935 while (*p != LIM_REG_CLASSES) | 1260 while (*p != LIM_REG_CLASSES) |
936 p++; | 1261 p++; |
937 *p++ = (enum reg_class) cl2; | 1262 *p++ = (enum reg_class) cl2; |
938 *p = LIM_REG_CLASSES; | 1263 *p = LIM_REG_CLASSES; |
939 } | 1264 } |
940 ira_reg_class_union[cl1][cl2] = NO_REGS; | 1265 ira_reg_class_subunion[cl1][cl2] = NO_REGS; |
1266 ira_reg_class_superunion[cl1][cl2] = NO_REGS; | |
941 COPY_HARD_REG_SET (intersection_set, reg_class_contents[cl1]); | 1267 COPY_HARD_REG_SET (intersection_set, reg_class_contents[cl1]); |
942 AND_HARD_REG_SET (intersection_set, reg_class_contents[cl2]); | 1268 AND_HARD_REG_SET (intersection_set, reg_class_contents[cl2]); |
943 AND_COMPL_HARD_REG_SET (intersection_set, no_unit_alloc_regs); | 1269 AND_COMPL_HARD_REG_SET (intersection_set, no_unit_alloc_regs); |
944 COPY_HARD_REG_SET (union_set, reg_class_contents[cl1]); | 1270 COPY_HARD_REG_SET (union_set, reg_class_contents[cl1]); |
945 IOR_HARD_REG_SET (union_set, reg_class_contents[cl2]); | 1271 IOR_HARD_REG_SET (union_set, reg_class_contents[cl2]); |
946 AND_COMPL_HARD_REG_SET (union_set, no_unit_alloc_regs); | 1272 AND_COMPL_HARD_REG_SET (union_set, no_unit_alloc_regs); |
947 for (i = 0; i < ira_important_classes_num; i++) | 1273 for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++) |
948 { | 1274 { |
949 cl3 = ira_important_classes[i]; | |
950 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl3]); | 1275 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl3]); |
951 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); | 1276 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
952 if (hard_reg_set_subset_p (temp_hard_regset, intersection_set)) | 1277 if (hard_reg_set_subset_p (temp_hard_regset, intersection_set)) |
953 { | 1278 { |
1279 /* CL3 allocatable hard register set is inside of | |
1280 intersection of allocatable hard register sets | |
1281 of CL1 and CL2. */ | |
1282 if (important_class_p[cl3]) | |
1283 { | |
1284 COPY_HARD_REG_SET | |
1285 (temp_set2, | |
1286 reg_class_contents | |
1287 [(int) ira_reg_class_intersect[cl1][cl2]]); | |
1288 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); | |
1289 if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) | |
1290 /* If the allocatable hard register sets are | |
1291 the same, prefer GENERAL_REGS or the | |
1292 smallest class for debugging | |
1293 purposes. */ | |
1294 || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) | |
1295 && (cl3 == GENERAL_REGS | |
1296 || ((ira_reg_class_intersect[cl1][cl2] | |
1297 != GENERAL_REGS) | |
1298 && hard_reg_set_subset_p | |
1299 (reg_class_contents[cl3], | |
1300 reg_class_contents | |
1301 [(int) | |
1302 ira_reg_class_intersect[cl1][cl2]]))))) | |
1303 ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; | |
1304 } | |
954 COPY_HARD_REG_SET | 1305 COPY_HARD_REG_SET |
955 (temp_set2, | 1306 (temp_set2, |
956 reg_class_contents[(int) | 1307 reg_class_contents[(int) ira_reg_class_subset[cl1][cl2]]); |
957 ira_reg_class_intersect[cl1][cl2]]); | |
958 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); | 1308 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); |
959 if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) | 1309 if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) |
960 /* Ignore unavailable hard registers and prefer | 1310 /* Ignore unavailable hard registers and prefer |
961 smallest class for debugging purposes. */ | 1311 smallest class for debugging purposes. */ |
962 || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) | 1312 || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) |
963 && hard_reg_set_subset_p | 1313 && hard_reg_set_subset_p |
964 (reg_class_contents[cl3], | 1314 (reg_class_contents[cl3], |
965 reg_class_contents | 1315 reg_class_contents |
966 [(int) ira_reg_class_intersect[cl1][cl2]]))) | 1316 [(int) ira_reg_class_subset[cl1][cl2]]))) |
967 ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; | 1317 ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3; |
968 } | 1318 } |
969 if (hard_reg_set_subset_p (temp_hard_regset, union_set)) | 1319 if (important_class_p[cl3] |
1320 && hard_reg_set_subset_p (temp_hard_regset, union_set)) | |
970 { | 1321 { |
1322 /* CL3 allocatable hard register set is inside of | |
1323 union of allocatable hard register sets of CL1 | |
1324 and CL2. */ | |
971 COPY_HARD_REG_SET | 1325 COPY_HARD_REG_SET |
972 (temp_set2, | 1326 (temp_set2, |
973 reg_class_contents[(int) ira_reg_class_union[cl1][cl2]]); | 1327 reg_class_contents[(int) ira_reg_class_subunion[cl1][cl2]]); |
974 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); | 1328 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); |
975 if (ira_reg_class_union[cl1][cl2] == NO_REGS | 1329 if (ira_reg_class_subunion[cl1][cl2] == NO_REGS |
976 || (hard_reg_set_subset_p (temp_set2, temp_hard_regset) | 1330 || (hard_reg_set_subset_p (temp_set2, temp_hard_regset) |
977 | 1331 |
978 && (! hard_reg_set_equal_p (temp_set2, | 1332 && (! hard_reg_set_equal_p (temp_set2, |
979 temp_hard_regset) | 1333 temp_hard_regset) |
980 /* Ignore unavailable hard registers and | 1334 || cl3 == GENERAL_REGS |
981 prefer smallest class for debugging | 1335 /* If the allocatable hard register sets are the |
982 purposes. */ | 1336 same, prefer GENERAL_REGS or the smallest |
983 || hard_reg_set_subset_p | 1337 class for debugging purposes. */ |
984 (reg_class_contents[cl3], | 1338 || (ira_reg_class_subunion[cl1][cl2] != GENERAL_REGS |
985 reg_class_contents | 1339 && hard_reg_set_subset_p |
986 [(int) ira_reg_class_union[cl1][cl2]])))) | 1340 (reg_class_contents[cl3], |
987 ira_reg_class_union[cl1][cl2] = (enum reg_class) cl3; | 1341 reg_class_contents |
1342 [(int) ira_reg_class_subunion[cl1][cl2]]))))) | |
1343 ira_reg_class_subunion[cl1][cl2] = (enum reg_class) cl3; | |
1344 } | |
1345 if (hard_reg_set_subset_p (union_set, temp_hard_regset)) | |
1346 { | |
1347 /* CL3 allocatable hard register set contains union | |
1348 of allocatable hard register sets of CL1 and | |
1349 CL2. */ | |
1350 COPY_HARD_REG_SET | |
1351 (temp_set2, | |
1352 reg_class_contents[(int) ira_reg_class_superunion[cl1][cl2]]); | |
1353 AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); | |
1354 if (ira_reg_class_superunion[cl1][cl2] == NO_REGS | |
1355 || (hard_reg_set_subset_p (temp_hard_regset, temp_set2) | |
1356 | |
1357 && (! hard_reg_set_equal_p (temp_set2, | |
1358 temp_hard_regset) | |
1359 || cl3 == GENERAL_REGS | |
1360 /* If the allocatable hard register sets are the | |
1361 same, prefer GENERAL_REGS or the smallest | |
1362 class for debugging purposes. */ | |
1363 || (ira_reg_class_superunion[cl1][cl2] != GENERAL_REGS | |
1364 && hard_reg_set_subset_p | |
1365 (reg_class_contents[cl3], | |
1366 reg_class_contents | |
1367 [(int) ira_reg_class_superunion[cl1][cl2]]))))) | |
1368 ira_reg_class_superunion[cl1][cl2] = (enum reg_class) cl3; | |
988 } | 1369 } |
989 } | 1370 } |
990 } | 1371 } |
991 } | 1372 } |
992 } | 1373 } |
993 | 1374 |
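As a rough model of the relations filled in above -- again with invented bit-mask classes rather than target data -- the sketch below computes, for each pair of classes, a largest class contained in their intersection, a largest class contained in their union, and a smallest class containing their union: the roles played by ira_reg_class_intersect, ira_reg_class_subunion and ira_reg_class_superunion (the tie-breaking toward GENERAL_REGS done by the real code is omitted).

#include <stdio.h>

#define N_TOY 4
static const char *const toy_name[N_TOY] = { "NO", "A", "B", "ALL" };
static const unsigned toy_contents[N_TOY] = { 0x0, 0x3, 0xc, 0xf };

static int
popcount (unsigned x)
{
  int n = 0;
  for (; x != 0; x >>= 1)
    n += x & 1;
  return n;
}

int main (void)
{
  for (int c1 = 1; c1 < N_TOY; c1++)
    for (int c2 = 1; c2 < N_TOY; c2++)
      {
        unsigned inter = toy_contents[c1] & toy_contents[c2];
        unsigned uni = toy_contents[c1] | toy_contents[c2];
        int isect = 0, subu = 0, superu = -1;

        for (int c3 = 0; c3 < N_TOY; c3++)
          {
            /* Largest class inside the intersection.  */
            if ((toy_contents[c3] & ~inter) == 0
                && popcount (toy_contents[c3]) >= popcount (toy_contents[isect]))
              isect = c3;
            /* Largest class inside the union.  */
            if ((toy_contents[c3] & ~uni) == 0
                && popcount (toy_contents[c3]) >= popcount (toy_contents[subu]))
              subu = c3;
            /* Smallest class containing the union.  */
            if ((uni & ~toy_contents[c3]) == 0
                && (superu < 0
                    || popcount (toy_contents[c3]) < popcount (toy_contents[superu])))
              superu = c3;
          }
        printf ("%s,%s: intersect=%s subunion=%s superunion=%s\n",
                toy_name[c1], toy_name[c2], toy_name[isect], toy_name[subu],
                toy_name[superu]);
      }
  return 0;
}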
994 /* Output all cover classes and the translation map into file F. */ | 1375 /* Output all uniform and important classes into file F. */ |
995 static void | 1376 static void |
996 print_class_cover (FILE *f) | 1377 print_uniform_and_important_classes (FILE *f) |
997 { | 1378 { |
998 static const char *const reg_class_names[] = REG_CLASS_NAMES; | 1379 int i, cl; |
1380 | |
1381 fprintf (f, "Uniform classes:\n"); | |
1382 for (cl = 0; cl < N_REG_CLASSES; cl++) | |
1383 if (ira_uniform_class_p[cl]) | |
1384 fprintf (f, " %s", reg_class_names[cl]); | |
1385 fprintf (f, "\nImportant classes:\n"); | |
1386 for (i = 0; i < ira_important_classes_num; i++) | |
1387 fprintf (f, " %s", reg_class_names[ira_important_classes[i]]); | |
1388 fprintf (f, "\n"); | |
1389 } | |
1390 | |
1391 /* Output all possible allocno or pressure classes and their | |
1392 translation map into file F. */ | |
1393 static void | |
1394 print_translated_classes (FILE *f, bool pressure_p) | |
1395 { | |
1396 int classes_num = (pressure_p | |
1397 ? ira_pressure_classes_num : ira_allocno_classes_num); | |
1398 enum reg_class *classes = (pressure_p | |
1399 ? ira_pressure_classes : ira_allocno_classes); | |
1400 enum reg_class *class_translate = (pressure_p | |
1401 ? ira_pressure_class_translate | |
1402 : ira_allocno_class_translate); | |
999 int i; | 1403 int i; |
1000 | 1404 |
1001 fprintf (f, "Class cover:\n"); | 1405 fprintf (f, "%s classes:\n", pressure_p ? "Pressure" : "Allocno"); |
1002 for (i = 0; i < ira_reg_class_cover_size; i++) | 1406 for (i = 0; i < classes_num; i++) |
1003 fprintf (f, " %s", reg_class_names[ira_reg_class_cover[i]]); | 1407 fprintf (f, " %s", reg_class_names[classes[i]]); |
1004 fprintf (f, "\nClass translation:\n"); | 1408 fprintf (f, "\nClass translation:\n"); |
1005 for (i = 0; i < N_REG_CLASSES; i++) | 1409 for (i = 0; i < N_REG_CLASSES; i++) |
1006 fprintf (f, " %s -> %s\n", reg_class_names[i], | 1410 fprintf (f, " %s -> %s\n", reg_class_names[i], |
1007 reg_class_names[ira_class_translate[i]]); | 1411 reg_class_names[class_translate[i]]); |
1008 } | 1412 } |
1009 | 1413 |
1010 /* Output all cover classes and the translation map into | 1414 /* Output all possible allocno and translation classes and the |
1011 stderr. */ | 1415 translation maps into stderr. */ |
1012 void | 1416 void |
1013 ira_debug_class_cover (void) | 1417 ira_debug_allocno_classes (void) |
1014 { | 1418 { |
1015 print_class_cover (stderr); | 1419 print_uniform_and_important_classes (stderr); |
1016 } | 1420 print_translated_classes (stderr, false); |
1017 | 1421 print_translated_classes (stderr, true); |
1018 /* Set up different arrays concerning class subsets, cover and | 1422 } |
1423 | |
1424 /* Set up different arrays concerning class subsets, allocno and | |
1019 important classes. */ | 1425 important classes. */ |
1020 static void | 1426 static void |
1021 find_reg_class_closure (void) | 1427 find_reg_classes (void) |
1022 { | 1428 { |
1023 setup_reg_subclasses (); | 1429 setup_allocno_and_important_classes (); |
1024 setup_cover_and_important_classes (); | |
1025 setup_class_translate (); | 1430 setup_class_translate (); |
1026 reorder_important_classes (); | 1431 reorder_important_classes (); |
1027 setup_reg_class_relations (); | 1432 setup_reg_class_relations (); |
1028 } | 1433 } |
1029 | 1434 |
1030 | 1435 |
1031 | 1436 |
1032 /* Set up the array above. */ | 1437 /* Set up the array above. */ |
1033 static void | 1438 static void |
1034 setup_hard_regno_cover_class (void) | 1439 setup_hard_regno_aclass (void) |
1035 { | 1440 { |
1036 int i; | 1441 int i; |
1037 | 1442 |
1038 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) | 1443 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) |
1039 { | 1444 { |
1040 ira_hard_regno_cover_class[i] | 1445 #if 1 |
1446 ira_hard_regno_allocno_class[i] | |
1041 = (TEST_HARD_REG_BIT (no_unit_alloc_regs, i) | 1447 = (TEST_HARD_REG_BIT (no_unit_alloc_regs, i) |
1042 ? NO_REGS | 1448 ? NO_REGS |
1043 : ira_class_translate[REGNO_REG_CLASS (i)]); | 1449 : ira_allocno_class_translate[REGNO_REG_CLASS (i)]); |
1450 #else | |
1451 int j; | |
1452 enum reg_class cl; | |
1453 ira_hard_regno_allocno_class[i] = NO_REGS; | |
1454 for (j = 0; j < ira_allocno_classes_num; j++) | |
1455 { | |
1456 cl = ira_allocno_classes[j]; | |
1457 if (ira_class_hard_reg_index[cl][i] >= 0) | |
1458 { | |
1459 ira_hard_regno_allocno_class[i] = cl; | |
1460 break; | |
1461 } | |
1462 } | |
1463 #endif | |
1044 } | 1464 } |
1045 } | 1465 } |
1046 | 1466 |
1047 | 1467 |
1048 | 1468 |
1049 /* Form IRA_REG_CLASS_NREGS map. */ | 1469 /* Form IRA_REG_CLASS_MAX_NREGS and IRA_REG_CLASS_MIN_NREGS maps. */ |
1050 static void | 1470 static void |
1051 setup_reg_class_nregs (void) | 1471 setup_reg_class_nregs (void) |
1052 { | 1472 { |
1053 int cl, m; | 1473 int i, cl, cl2, m; |
1054 | 1474 |
1055 for (cl = 0; cl < N_REG_CLASSES; cl++) | 1475 for (m = 0; m < MAX_MACHINE_MODE; m++) |
1056 for (m = 0; m < MAX_MACHINE_MODE; m++) | 1476 { |
1057 ira_reg_class_nregs[cl][m] = CLASS_MAX_NREGS ((enum reg_class) cl, | 1477 for (cl = 0; cl < N_REG_CLASSES; cl++) |
1058 (enum machine_mode) m); | 1478 ira_reg_class_max_nregs[cl][m] |
1479 = ira_reg_class_min_nregs[cl][m] | |
1480 = targetm.class_max_nregs ((reg_class_t) cl, (machine_mode) m); | |
1481 for (cl = 0; cl < N_REG_CLASSES; cl++) | |
1482 for (i = 0; | |
1483 (cl2 = alloc_reg_class_subclasses[cl][i]) != LIM_REG_CLASSES; | |
1484 i++) | |
1485 if (ira_reg_class_min_nregs[cl2][m] | |
1486 < ira_reg_class_min_nregs[cl][m]) | |
1487 ira_reg_class_min_nregs[cl][m] = ira_reg_class_min_nregs[cl2][m]; | |
1488 } | |
1059 } | 1489 } |
1060 | 1490 |
1061 | 1491 |
1062 | 1492 |
1063 /* Set up PROHIBITED_CLASS_MODE_REGS. */ | 1493 /* Set up IRA_PROHIBITED_CLASS_MODE_REGS and IRA_CLASS_SINGLETON. |
1494 This function is called once IRA_CLASS_HARD_REGS has been initialized. */ | |
1064 static void | 1495 static void |
1065 setup_prohibited_class_mode_regs (void) | 1496 setup_prohibited_class_mode_regs (void) |
1066 { | 1497 { |
1067 int i, j, k, hard_regno; | 1498 int j, k, hard_regno, cl, last_hard_regno, count; |
1068 enum reg_class cl; | 1499 |
1069 | 1500 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--) |
1070 for (i = 0; i < ira_reg_class_cover_size; i++) | 1501 { |
1071 { | 1502 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl]); |
1072 cl = ira_reg_class_cover[i]; | 1503 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); |
1073 for (j = 0; j < NUM_MACHINE_MODES; j++) | 1504 for (j = 0; j < NUM_MACHINE_MODES; j++) |
1074 { | 1505 { |
1075 CLEAR_HARD_REG_SET (prohibited_class_mode_regs[cl][j]); | 1506 count = 0; |
1507 last_hard_regno = -1; | |
1508 CLEAR_HARD_REG_SET (ira_prohibited_class_mode_regs[cl][j]); | |
1076 for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--) | 1509 for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--) |
1077 { | 1510 { |
1078 hard_regno = ira_class_hard_regs[cl][k]; | 1511 hard_regno = ira_class_hard_regs[cl][k]; |
1079 if (! HARD_REGNO_MODE_OK (hard_regno, (enum machine_mode) j)) | 1512 if (!targetm.hard_regno_mode_ok (hard_regno, (machine_mode) j)) |
1080 SET_HARD_REG_BIT (prohibited_class_mode_regs[cl][j], | 1513 SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], |
1081 hard_regno); | 1514 hard_regno); |
1515 else if (in_hard_reg_set_p (temp_hard_regset, | |
1516 (machine_mode) j, hard_regno)) | |
1517 { | |
1518 last_hard_regno = hard_regno; | |
1519 count++; | |
1520 } | |
1082 } | 1521 } |
1522 ira_class_singleton[cl][j] = (count == 1 ? last_hard_regno : -1); | |
1083 } | 1523 } |
1084 } | 1524 } |
1085 } | 1525 } |
1086 | 1526 |
1527 /* Clarify IRA_PROHIBITED_CLASS_MODE_REGS by excluding hard registers | |
1528 spanning from one register pressure class to another one. It is | |
1529 called after defining the pressure classes. */ | |
1530 static void | |
1531 clarify_prohibited_class_mode_regs (void) | |
1532 { | |
1533 int j, k, hard_regno, cl, pclass, nregs; | |
1534 | |
1535 for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--) | |
1536 for (j = 0; j < NUM_MACHINE_MODES; j++) | |
1537 { | |
1538 CLEAR_HARD_REG_SET (ira_useful_class_mode_regs[cl][j]); | |
1539 for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--) | |
1540 { | |
1541 hard_regno = ira_class_hard_regs[cl][k]; | |
1542 if (TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], hard_regno)) | |
1543 continue; | |
1544 nregs = hard_regno_nregs (hard_regno, (machine_mode) j); | |
1545 if (hard_regno + nregs > FIRST_PSEUDO_REGISTER) | |
1546 { | |
1547 SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], | |
1548 hard_regno); | |
1549 continue; | |
1550 } | |
1551 pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)]; | |
1552 for (nregs-- ;nregs >= 0; nregs--) | |
1553 if (((enum reg_class) pclass | |
1554 != ira_pressure_class_translate[REGNO_REG_CLASS | |
1555 (hard_regno + nregs)])) | |
1556 { | |
1557 SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], | |
1558 hard_regno); | |
1559 break; | |
1560 } | |
1561 if (!TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], | |
1562 hard_regno)) | |
1563 add_to_hard_reg_set (&ira_useful_class_mode_regs[cl][j], | |
1564 (machine_mode) j, hard_regno); | |
1565 } | |
1566 } | |
1567 } | |
1087 | 1568 |
1088 | 1569 /* Allocate and initialize IRA_REGISTER_MOVE_COST, IRA_MAY_MOVE_IN_COST |
1089 /* Allocate and initialize IRA_REGISTER_MOVE_COST, | 1570 and IRA_MAY_MOVE_OUT_COST for MODE. */ |
1090 IRA_MAY_MOVE_IN_COST, and IRA_MAY_MOVE_OUT_COST for MODE if it is | |
1091 not done yet. */ | |
1092 void | 1571 void |
1093 ira_init_register_move_cost (enum machine_mode mode) | 1572 ira_init_register_move_cost (machine_mode mode) |
1094 { | 1573 { |
1095 int cl1, cl2; | 1574 static unsigned short last_move_cost[N_REG_CLASSES][N_REG_CLASSES]; |
1575 bool all_match = true; | |
1576 unsigned int cl1, cl2; | |
1096 | 1577 |
1097 ira_assert (ira_register_move_cost[mode] == NULL | 1578 ira_assert (ira_register_move_cost[mode] == NULL |
1098 && ira_may_move_in_cost[mode] == NULL | 1579 && ira_may_move_in_cost[mode] == NULL |
1099 && ira_may_move_out_cost[mode] == NULL); | 1580 && ira_may_move_out_cost[mode] == NULL); |
1100 if (move_cost[mode] == NULL) | 1581 ira_assert (have_regs_of_mode[mode]); |
1101 init_move_cost (mode); | |
1102 ira_register_move_cost[mode] = move_cost[mode]; | |
1103 /* Don't use ira_allocate because the tables exist out of scope of a | |
1104 IRA call. */ | |
1105 ira_may_move_in_cost[mode] | |
1106 = (move_table *) xmalloc (sizeof (move_table) * N_REG_CLASSES); | |
1107 memcpy (ira_may_move_in_cost[mode], may_move_in_cost[mode], | |
1108 sizeof (move_table) * N_REG_CLASSES); | |
1109 ira_may_move_out_cost[mode] | |
1110 = (move_table *) xmalloc (sizeof (move_table) * N_REG_CLASSES); | |
1111 memcpy (ira_may_move_out_cost[mode], may_move_out_cost[mode], | |
1112 sizeof (move_table) * N_REG_CLASSES); | |
1113 for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++) | 1582 for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++) |
1114 { | 1583 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++) |
1115 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++) | 1584 { |
1116 { | 1585 int cost; |
1117 if (ira_class_subset_p[cl1][cl2]) | 1586 if (!contains_reg_of_mode[cl1][mode] |
1118 ira_may_move_in_cost[mode][cl1][cl2] = 0; | 1587 || !contains_reg_of_mode[cl2][mode]) |
1119 if (ira_class_subset_p[cl2][cl1]) | 1588 { |
1120 ira_may_move_out_cost[mode][cl1][cl2] = 0; | 1589 if ((ira_reg_class_max_nregs[cl1][mode] |
1121 } | 1590 > ira_class_hard_regs_num[cl1]) |
1122 } | 1591 || (ira_reg_class_max_nregs[cl2][mode] |
1592 > ira_class_hard_regs_num[cl2])) | |
1593 cost = 65535; | |
1594 else | |
1595 cost = (ira_memory_move_cost[mode][cl1][0] | |
1596 + ira_memory_move_cost[mode][cl2][1]) * 2; | |
1597 } | |
1598 else | |
1599 { | |
1600 cost = register_move_cost (mode, (enum reg_class) cl1, | |
1601 (enum reg_class) cl2); | |
1602 ira_assert (cost < 65535); | |
1603 } | |
1604 all_match &= (last_move_cost[cl1][cl2] == cost); | |
1605 last_move_cost[cl1][cl2] = cost; | |
1606 } | |
1607 if (all_match && last_mode_for_init_move_cost != -1) | |
1608 { | |
1609 ira_register_move_cost[mode] | |
1610 = ira_register_move_cost[last_mode_for_init_move_cost]; | |
1611 ira_may_move_in_cost[mode] | |
1612 = ira_may_move_in_cost[last_mode_for_init_move_cost]; | |
1613 ira_may_move_out_cost[mode] | |
1614 = ira_may_move_out_cost[last_mode_for_init_move_cost]; | |
1615 return; | |
1616 } | |
1617 last_mode_for_init_move_cost = mode; | |
1618 ira_register_move_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES); | |
1619 ira_may_move_in_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES); | |
1620 ira_may_move_out_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES); | |
1621 for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++) | |
1622 for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++) | |
1623 { | |
1624 int cost; | |
1625 enum reg_class *p1, *p2; | |
1626 | |
1627 if (last_move_cost[cl1][cl2] == 65535) | |
1628 { | |
1629 ira_register_move_cost[mode][cl1][cl2] = 65535; | |
1630 ira_may_move_in_cost[mode][cl1][cl2] = 65535; | |
1631 ira_may_move_out_cost[mode][cl1][cl2] = 65535; | |
1632 } | |
1633 else | |
1634 { | |
1635 cost = last_move_cost[cl1][cl2]; | |
1636 | |
1637 for (p2 = ®_class_subclasses[cl2][0]; | |
1638 *p2 != LIM_REG_CLASSES; p2++) | |
1639 if (ira_class_hard_regs_num[*p2] > 0 | |
1640 && (ira_reg_class_max_nregs[*p2][mode] | |
1641 <= ira_class_hard_regs_num[*p2])) | |
1642 cost = MAX (cost, ira_register_move_cost[mode][cl1][*p2]); | |
1643 | |
1644 for (p1 = ®_class_subclasses[cl1][0]; | |
1645 *p1 != LIM_REG_CLASSES; p1++) | |
1646 if (ira_class_hard_regs_num[*p1] > 0 | |
1647 && (ira_reg_class_max_nregs[*p1][mode] | |
1648 <= ira_class_hard_regs_num[*p1])) | |
1649 cost = MAX (cost, ira_register_move_cost[mode][*p1][cl2]); | |
1650 | |
1651 ira_assert (cost <= 65535); | |
1652 ira_register_move_cost[mode][cl1][cl2] = cost; | |
1653 | |
1654 if (ira_class_subset_p[cl1][cl2]) | |
1655 ira_may_move_in_cost[mode][cl1][cl2] = 0; | |
1656 else | |
1657 ira_may_move_in_cost[mode][cl1][cl2] = cost; | |
1658 | |
1659 if (ira_class_subset_p[cl2][cl1]) | |
1660 ira_may_move_out_cost[mode][cl1][cl2] = 0; | |
1661 else | |
1662 ira_may_move_out_cost[mode][cl1][cl2] = cost; | |
1663 } | |
1664 } | |
1123 } | 1665 } |
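The rewritten ira_init_register_move_cost first fills last_move_cost for MODE and, when that matrix is identical to the one computed for the previously initialized mode, reuses that mode's tables instead of allocating new ones (the 65535 entries mark class/mode combinations that cannot hold a value of the mode). The sharing idea in isolation looks roughly like the sketch below, which uses generic types rather than the GCC move_table structures.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#define NCL 8
typedef unsigned short cost_table[NCL][NCL];

/* Return PREV if CUR holds exactly the same costs, otherwise a fresh
   heap copy of CUR.  Sharing avoids one allocation per mode when many
   modes end up with identical move costs.  */
static cost_table *
share_or_copy (cost_table *prev, cost_table *cur)
{
  if (prev != NULL && memcmp (prev, cur, sizeof *cur) == 0)
    return prev;
  cost_table *p = malloc (sizeof *p);
  memcpy (p, cur, sizeof *p);
  return p;
}

int main (void)
{
  cost_table cur = { { 0 } };
  cur[1][2] = 4;

  cost_table *first = share_or_copy (NULL, &cur);
  cost_table *second = share_or_copy (first, &cur);   /* identical: shared */

  printf ("shared: %s\n", first == second ? "yes" : "no");
  free (first);
  return 0;
}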
1124 | 1666 |
1125 | 1667 |
1126 | 1668 |
1127 /* This is called once during compiler work. It sets up | 1669 /* This is called once during compiler work. It sets up |
1128 different arrays whose values don't depend on the compiled | 1670 different arrays whose values don't depend on the compiled |
1129 function. */ | 1671 function. */ |
1130 void | 1672 void |
1131 ira_init_once (void) | 1673 ira_init_once (void) |
1132 { | 1674 { |
1133 int mode; | 1675 ira_init_costs_once (); |
1134 | 1676 lra_init_once (); |
1677 | |
1678 ira_use_lra_p = targetm.lra_p (); | |
1679 } | |
1680 | |
1681 /* Free ira_max_register_move_cost, ira_may_move_in_cost and | |
1682 ira_may_move_out_cost for each mode. */ | |
1683 void | |
1684 target_ira_int::free_register_move_costs (void) | |
1685 { | |
1686 int mode, i; | |
1687 | |
1688 /* Reset move_cost and friends, making sure we only free shared | |
1689 table entries once. */ | |
1135 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | 1690 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) |
1136 { | 1691 if (x_ira_register_move_cost[mode]) |
1137 ira_register_move_cost[mode] = NULL; | 1692 { |
1138 ira_may_move_in_cost[mode] = NULL; | 1693 for (i = 0; |
1139 ira_may_move_out_cost[mode] = NULL; | 1694 i < mode && (x_ira_register_move_cost[i] |
1140 } | 1695 != x_ira_register_move_cost[mode]); |
1141 ira_init_costs_once (); | 1696 i++) |
1142 } | 1697 ; |
1143 | 1698 if (i == mode) |
1144 /* Free ira_register_move_cost, ira_may_move_in_cost, and | 1699 { |
1145 ira_may_move_out_cost for each mode. */ | 1700 free (x_ira_register_move_cost[mode]); |
1146 static void | 1701 free (x_ira_may_move_in_cost[mode]); |
1147 free_register_move_costs (void) | 1702 free (x_ira_may_move_out_cost[mode]); |
1148 { | 1703 } |
1149 int mode; | 1704 } |
1150 | 1705 memset (x_ira_register_move_cost, 0, sizeof x_ira_register_move_cost); |
1151 for (mode = 0; mode < MAX_MACHINE_MODE; mode++) | 1706 memset (x_ira_may_move_in_cost, 0, sizeof x_ira_may_move_in_cost); |
1152 { | 1707 memset (x_ira_may_move_out_cost, 0, sizeof x_ira_may_move_out_cost); |
1153 if (ira_may_move_in_cost[mode] != NULL) | 1708 last_mode_for_init_move_cost = -1; |
1154 free (ira_may_move_in_cost[mode]); | 1709 } |
1155 if (ira_may_move_out_cost[mode] != NULL) | 1710 |
1156 free (ira_may_move_out_cost[mode]); | 1711 target_ira_int::~target_ira_int () |
1157 ira_register_move_cost[mode] = NULL; | 1712 { |
1158 ira_may_move_in_cost[mode] = NULL; | 1713 free_ira_costs (); |
1159 ira_may_move_out_cost[mode] = NULL; | 1714 free_register_move_costs (); |
1160 } | |
1161 } | 1715 } |
1162 | 1716 |
1163 /* This is called every time when register related information is | 1717 /* This is called every time when register related information is |
1164 changed. */ | 1718 changed. */ |
1165 void | 1719 void |
1166 ira_init (void) | 1720 ira_init (void) |
1167 { | 1721 { |
1168 free_register_move_costs (); | 1722 this_target_ira_int->free_register_move_costs (); |
1169 setup_reg_mode_hard_regset (); | 1723 setup_reg_mode_hard_regset (); |
1170 setup_alloc_regs (flag_omit_frame_pointer != 0); | 1724 setup_alloc_regs (flag_omit_frame_pointer != 0); |
1171 setup_class_subset_and_memory_move_costs (); | 1725 setup_class_subset_and_memory_move_costs (); |
1172 find_reg_class_closure (); | |
1173 setup_hard_regno_cover_class (); | |
1174 setup_reg_class_nregs (); | 1726 setup_reg_class_nregs (); |
1175 setup_prohibited_class_mode_regs (); | 1727 setup_prohibited_class_mode_regs (); |
1728 find_reg_classes (); | |
1729 clarify_prohibited_class_mode_regs (); | |
1730 setup_hard_regno_aclass (); | |
1176 ira_init_costs (); | 1731 ira_init_costs (); |
1177 } | |
1178 | |
1179 /* Function called once at the end of compiler work. */ | |
1180 void | |
1181 ira_finish_once (void) | |
1182 { | |
1183 ira_finish_costs_once (); | |
1184 free_register_move_costs (); | |
1185 } | 1732 } |
1186 | 1733 |
1187 | 1734 |
1188 #define ira_prohibited_mode_move_regs_initialized_p \ | 1735 #define ira_prohibited_mode_move_regs_initialized_p \ |
1189 (this_target_ira_int->x_ira_prohibited_mode_move_regs_initialized_p) | 1736 (this_target_ira_int->x_ira_prohibited_mode_move_regs_initialized_p) |
1191 /* Set up IRA_PROHIBITED_MODE_MOVE_REGS. */ | 1738 /* Set up IRA_PROHIBITED_MODE_MOVE_REGS. */ |
1192 static void | 1739 static void |
1193 setup_prohibited_mode_move_regs (void) | 1740 setup_prohibited_mode_move_regs (void) |
1194 { | 1741 { |
1195 int i, j; | 1742 int i, j; |
1196 rtx test_reg1, test_reg2, move_pat, move_insn; | 1743 rtx test_reg1, test_reg2, move_pat; |
1744 rtx_insn *move_insn; | |
1197 | 1745 |
1198 if (ira_prohibited_mode_move_regs_initialized_p) | 1746 if (ira_prohibited_mode_move_regs_initialized_p) |
1199 return; | 1747 return; |
1200 ira_prohibited_mode_move_regs_initialized_p = true; | 1748 ira_prohibited_mode_move_regs_initialized_p = true; |
1201 test_reg1 = gen_rtx_REG (VOIDmode, 0); | 1749 test_reg1 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 1); |
1202 test_reg2 = gen_rtx_REG (VOIDmode, 0); | 1750 test_reg2 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 2); |
1203 move_pat = gen_rtx_SET (VOIDmode, test_reg1, test_reg2); | 1751 move_pat = gen_rtx_SET (test_reg1, test_reg2); |
1204 move_insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, 0, move_pat, 0, -1, 0); | 1752 move_insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, move_pat, 0, -1, 0); |
1205 for (i = 0; i < NUM_MACHINE_MODES; i++) | 1753 for (i = 0; i < NUM_MACHINE_MODES; i++) |
1206 { | 1754 { |
1207 SET_HARD_REG_SET (ira_prohibited_mode_move_regs[i]); | 1755 SET_HARD_REG_SET (ira_prohibited_mode_move_regs[i]); |
1208 for (j = 0; j < FIRST_PSEUDO_REGISTER; j++) | 1756 for (j = 0; j < FIRST_PSEUDO_REGISTER; j++) |
1209 { | 1757 { |
1210 if (! HARD_REGNO_MODE_OK (j, (enum machine_mode) i)) | 1758 if (!targetm.hard_regno_mode_ok (j, (machine_mode) i)) |
1211 continue; | 1759 continue; |
1212 SET_REGNO_RAW (test_reg1, j); | 1760 set_mode_and_regno (test_reg1, (machine_mode) i, j); |
1213 PUT_MODE (test_reg1, (enum machine_mode) i); | 1761 set_mode_and_regno (test_reg2, (machine_mode) i, j); |
1214 SET_REGNO_RAW (test_reg2, j); | |
1215 PUT_MODE (test_reg2, (enum machine_mode) i); | |
1216 INSN_CODE (move_insn) = -1; | 1762 INSN_CODE (move_insn) = -1; |
1217 recog_memoized (move_insn); | 1763 recog_memoized (move_insn); |
1218 if (INSN_CODE (move_insn) < 0) | 1764 if (INSN_CODE (move_insn) < 0) |
1219 continue; | 1765 continue; |
1220 extract_insn (move_insn); | 1766 extract_insn (move_insn); |
1221 if (! constrain_operands (1)) | 1767 /* We don't know whether the move will be in code that is optimized |
1768 for size or speed, so consider all enabled alternatives. */ | |
1769 if (! constrain_operands (1, get_enabled_alternatives (move_insn))) | |
1222 continue; | 1770 continue; |
1223 CLEAR_HARD_REG_BIT (ira_prohibited_mode_move_regs[i], j); | 1771 CLEAR_HARD_REG_BIT (ira_prohibited_mode_move_regs[i], j); |
1224 } | 1772 } |
1225 } | 1773 } |
1774 } | |
1775 | |
1776 | |
1777 | |
1778 /* Setup possible alternatives in ALTS for INSN. */ | |
1779 void | |
1780 ira_setup_alts (rtx_insn *insn, HARD_REG_SET &alts) | |
1781 { | |
1782 /* MAP nalt * nop -> start of constraints for given operand and | |
1783 alternative. */ | |
1784 static vec<const char *> insn_constraints; | |
1785 int nop, nalt; | |
1786 bool curr_swapped; | |
1787 const char *p; | |
1788 int commutative = -1; | |
1789 | |
1790 extract_insn (insn); | |
1791 alternative_mask preferred = get_preferred_alternatives (insn); | |
1792 CLEAR_HARD_REG_SET (alts); | |
1793 insn_constraints.release (); | |
1794 insn_constraints.safe_grow_cleared (recog_data.n_operands | |
1795 * recog_data.n_alternatives + 1); | |
1796 /* Check that the hard reg set is big enough to hold all | |
1797 alternatives. It is hard to imagine the situation when the | |
1798 assertion is wrong. */ | |
1799 ira_assert (recog_data.n_alternatives | |
1800 <= (int) MAX (sizeof (HARD_REG_ELT_TYPE) * CHAR_BIT, | |
1801 FIRST_PSEUDO_REGISTER)); | |
1802 for (curr_swapped = false;; curr_swapped = true) | |
1803 { | |
1804 /* Calculate some data common for all alternatives to speed up the | |
1805 function. */ | |
1806 for (nop = 0; nop < recog_data.n_operands; nop++) | |
1807 { | |
1808 for (nalt = 0, p = recog_data.constraints[nop]; | |
1809 nalt < recog_data.n_alternatives; | |
1810 nalt++) | |
1811 { | |
1812 insn_constraints[nop * recog_data.n_alternatives + nalt] = p; | |
1813 while (*p && *p != ',') | |
1814 { | |
1815 /* We only support one commutative marker, the first | |
1816 one. We already set commutative above. */ | |
1817 if (*p == '%' && commutative < 0) | |
1818 commutative = nop; | |
1819 p++; | |
1820 } | |
1821 if (*p) | |
1822 p++; | |
1823 } | |
1824 } | |
1825 for (nalt = 0; nalt < recog_data.n_alternatives; nalt++) | |
1826 { | |
1827 if (!TEST_BIT (preferred, nalt) | |
1828 || TEST_HARD_REG_BIT (alts, nalt)) | |
1829 continue; | |
1830 | |
1831 for (nop = 0; nop < recog_data.n_operands; nop++) | |
1832 { | |
1833 int c, len; | |
1834 | |
1835 rtx op = recog_data.operand[nop]; | |
1836 p = insn_constraints[nop * recog_data.n_alternatives + nalt]; | |
1837 if (*p == 0 || *p == ',') | |
1838 continue; | |
1839 | |
1840 do | |
1841 switch (c = *p, len = CONSTRAINT_LEN (c, p), c) | |
1842 { | |
1843 case '#': | |
1844 case ',': | |
1845 c = '\0'; | |
1846 /* FALLTHRU */ | |
1847 case '\0': | |
1848 len = 0; | |
1849 break; | |
1850 | |
1851 case '%': | |
1852 /* The commutative modifier is handled above. */ | |
1853 break; | |
1854 | |
1855 case '0': case '1': case '2': case '3': case '4': | |
1856 case '5': case '6': case '7': case '8': case '9': | |
1857 goto op_success; | |
1858 break; | |
1859 | |
1860 case 'g': | |
1861 goto op_success; | |
1862 break; | |
1863 | |
1864 default: | |
1865 { | |
1866 enum constraint_num cn = lookup_constraint (p); | |
1867 switch (get_constraint_type (cn)) | |
1868 { | |
1869 case CT_REGISTER: | |
1870 if (reg_class_for_constraint (cn) != NO_REGS) | |
1871 goto op_success; | |
1872 break; | |
1873 | |
1874 case CT_CONST_INT: | |
1875 if (CONST_INT_P (op) | |
1876 && (insn_const_int_ok_for_constraint | |
1877 (INTVAL (op), cn))) | |
1878 goto op_success; | |
1879 break; | |
1880 | |
1881 case CT_ADDRESS: | |
1882 case CT_MEMORY: | |
1883 case CT_SPECIAL_MEMORY: | |
1884 goto op_success; | |
1885 | |
1886 case CT_FIXED_FORM: | |
1887 if (constraint_satisfied_p (op, cn)) | |
1888 goto op_success; | |
1889 break; | |
1890 } | |
1891 break; | |
1892 } | |
1893 } | |
1894 while (p += len, c); | |
1895 break; | |
1896 op_success: | |
1897 ; | |
1898 } | |
1899 if (nop >= recog_data.n_operands) | |
1900 SET_HARD_REG_BIT (alts, nalt); | |
1901 } | |
1902 if (commutative < 0) | |
1903 break; | |
1904 /* Swap forth and back to avoid changing recog_data. */ | |
1905 std::swap (recog_data.operand[commutative], | |
1906 recog_data.operand[commutative + 1]); | |
1907 if (curr_swapped) | |
1908 break; | |
1909 } | |
1910 } | |
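ira_setup_alts scans each operand's constraint string once per alternative; commas delimit the alternatives and '#' disables the alternative it appears in. The standalone snippet below illustrates just that splitting step on an invented constraint string; it does not use recog_data or the real constraint machinery.

#include <stdio.h>
#include <string.h>

int main (void)
{
  const char *constraint = "r,m,#r";   /* invented example */
  const char *p = constraint;
  int alt = 0;

  while (*p != '\0')
    {
      size_t len = strcspn (p, ",");
      int disabled = memchr (p, '#', len) != NULL;

      printf ("alternative %d: \"%.*s\"%s\n", alt, (int) len, p,
              disabled ? " (disabled by '#')" : "");
      p += len;
      if (*p == ',')
        p++;
      alt++;
    }
  return 0;
}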
1911 | |
1912 /* Return the number of the output non-early clobber operand which | |
1913 should be the same in any case as operand with number OP_NUM (or | |
1914 negative value if there is no such operand). The function takes | |
1915 only really possible alternatives into consideration. */ | |
1916 int | |
1917 ira_get_dup_out_num (int op_num, HARD_REG_SET &alts) | |
1918 { | |
1919 int curr_alt, c, original, dup; | |
1920 bool ignore_p, use_commut_op_p; | |
1921 const char *str; | |
1922 | |
1923 if (op_num < 0 || recog_data.n_alternatives == 0) | |
1924 return -1; | |
1925 /* We should find duplications only for input operands. */ | |
1926 if (recog_data.operand_type[op_num] != OP_IN) | |
1927 return -1; | |
1928 str = recog_data.constraints[op_num]; | |
1929 use_commut_op_p = false; | |
1930 for (;;) | |
1931 { | |
1932 rtx op = recog_data.operand[op_num]; | |
1933 | |
1934 for (curr_alt = 0, ignore_p = !TEST_HARD_REG_BIT (alts, curr_alt), | |
1935 original = -1;;) | |
1936 { | |
1937 c = *str; | |
1938 if (c == '\0') | |
1939 break; | |
1940 if (c == '#') | |
1941 ignore_p = true; | |
1942 else if (c == ',') | |
1943 { | |
1944 curr_alt++; | |
1945 ignore_p = !TEST_HARD_REG_BIT (alts, curr_alt); | |
1946 } | |
1947 else if (! ignore_p) | |
1948 switch (c) | |
1949 { | |
1950 case 'g': | |
1951 goto fail; | |
1952 default: | |
1953 { | |
1954 enum constraint_num cn = lookup_constraint (str); | |
1955 enum reg_class cl = reg_class_for_constraint (cn); | |
1956 if (cl != NO_REGS | |
1957 && !targetm.class_likely_spilled_p (cl)) | |
1958 goto fail; | |
1959 if (constraint_satisfied_p (op, cn)) | |
1960 goto fail; | |
1961 break; | |
1962 } | |
1963 | |
1964 case '0': case '1': case '2': case '3': case '4': | |
1965 case '5': case '6': case '7': case '8': case '9': | |
1966 if (original != -1 && original != c) | |
1967 goto fail; | |
1968 original = c; | |
1969 break; | |
1970 } | |
1971 str += CONSTRAINT_LEN (c, str); | |
1972 } | |
1973 if (original == -1) | |
1974 goto fail; | |
1975 dup = -1; | |
1976 for (ignore_p = false, str = recog_data.constraints[original - '0']; | |
1977 *str != 0; | |
1978 str++) | |
1979 if (ignore_p) | |
1980 { | |
1981 if (*str == ',') | |
1982 ignore_p = false; | |
1983 } | |
1984 else if (*str == '#') | |
1985 ignore_p = true; | |
1986 else if (! ignore_p) | |
1987 { | |
1988 if (*str == '=') | |
1989 dup = original - '0'; | |
1990 /* It is better to ignore an alternative with an early clobber. */ | |
1991 else if (*str == '&') | |
1992 goto fail; | |
1993 } | |
1994 if (dup >= 0) | |
1995 return dup; | |
1996 fail: | |
1997 if (use_commut_op_p) | |
1998 break; | |
1999 use_commut_op_p = true; | |
2000 if (recog_data.constraints[op_num][0] == '%') | |
2001 str = recog_data.constraints[op_num + 1]; | |
2002 else if (op_num > 0 && recog_data.constraints[op_num - 1][0] == '%') | |
2003 str = recog_data.constraints[op_num - 1]; | |
2004 else | |
2005 break; | |
2006 } | |
2007 return -1; | |
2008 } | |
2009 | |
2010 | |
2011 | |
2012 /* Search forward to see if the source register of a copy insn dies | |
2013 before either it or the destination register is modified, but don't | |
2014 scan past the end of the basic block. If so, we can replace the | |
2015 source with the destination and let the source die in the copy | |
2016 insn. | |
2017 | |
2018 This will reduce the number of registers live in that range and may | |
2019 enable the destination and the source coalescing, thus often saving | |
2020 one register in addition to a register-register copy. */ | |
2021 | |
2022 static void | |
2023 decrease_live_ranges_number (void) | |
2024 { | |
2025 basic_block bb; | |
2026 rtx_insn *insn; | |
2027 rtx set, src, dest, dest_death, note; | |
2028 rtx_insn *p, *q; | |
2029 int sregno, dregno; | |
2030 | |
2031 if (! flag_expensive_optimizations) | |
2032 return; | |
2033 | |
2034 if (ira_dump_file) | |
2035 fprintf (ira_dump_file, "Starting decreasing number of live ranges...\n"); | |
2036 | |
2037 FOR_EACH_BB_FN (bb, cfun) | |
2038 FOR_BB_INSNS (bb, insn) | |
2039 { | |
2040 set = single_set (insn); | |
2041 if (! set) | |
2042 continue; | |
2043 src = SET_SRC (set); | |
2044 dest = SET_DEST (set); | |
2045 if (! REG_P (src) || ! REG_P (dest) | |
2046 || find_reg_note (insn, REG_DEAD, src)) | |
2047 continue; | |
2048 sregno = REGNO (src); | |
2049 dregno = REGNO (dest); | |
2050 | |
2051 /* We don't want to mess with hard regs if register classes | |
2052 are small. */ | |
2053 if (sregno == dregno | |
2054 || (targetm.small_register_classes_for_mode_p (GET_MODE (src)) | |
2055 && (sregno < FIRST_PSEUDO_REGISTER | |
2056 || dregno < FIRST_PSEUDO_REGISTER)) | |
2057 /* We don't see all updates to SP if they are in an | |
2058 auto-inc memory reference, so we must disallow this | |
2059 optimization on them. */ | |
2060 || sregno == STACK_POINTER_REGNUM | |
2061 || dregno == STACK_POINTER_REGNUM) | |
2062 continue; | |
2063 | |
2064 dest_death = NULL_RTX; | |
2065 | |
2066 for (p = NEXT_INSN (insn); p; p = NEXT_INSN (p)) | |
2067 { | |
2068 if (! INSN_P (p)) | |
2069 continue; | |
2070 if (BLOCK_FOR_INSN (p) != bb) | |
2071 break; | |
2072 | |
2073 if (reg_set_p (src, p) || reg_set_p (dest, p) | |
2074 /* If SRC is an asm-declared register, it must not be | |
2075 replaced in any asm. Unfortunately, the REG_EXPR | |
2076 tree for the asm variable may be absent in the SRC | |
2077 rtx, so we can't check the actual register | |
2078 declaration easily (the asm operand will have it, | |
2079 though). To avoid complicating the test for a rare | |
2080 case, we just don't perform register replacement | |
2081 for a hard reg mentioned in an asm. */ | |
2082 || (sregno < FIRST_PSEUDO_REGISTER | |
2083 && asm_noperands (PATTERN (p)) >= 0 | |
2084 && reg_overlap_mentioned_p (src, PATTERN (p))) | |
2085 /* Don't change hard registers used by a call. */ | |
2086 || (CALL_P (p) && sregno < FIRST_PSEUDO_REGISTER | |
2087 && find_reg_fusage (p, USE, src)) | |
2088 /* Don't change a USE of a register. */ | |
2089 || (GET_CODE (PATTERN (p)) == USE | |
2090 && reg_overlap_mentioned_p (src, XEXP (PATTERN (p), 0)))) | |
2091 break; | |
2092 | |
2093 /* See if all of SRC dies in P. This test is slightly | |
2094 more conservative than it needs to be. */ | |
2095 if ((note = find_regno_note (p, REG_DEAD, sregno)) | |
2096 && GET_MODE (XEXP (note, 0)) == GET_MODE (src)) | |
2097 { | |
2098 int failed = 0; | |
2099 | |
2100 /* We can do the optimization. Scan forward from INSN | |
2101 again, replacing regs as we go. Set FAILED if a | |
2102 replacement can't be done. In that case, we can't | |
2103 move the death note for SRC. This should be | |
2104 rare. */ | |
2105 | |
2106 /* Set to stop at next insn. */ | |
2107 for (q = next_real_insn (insn); | |
2108 q != next_real_insn (p); | |
2109 q = next_real_insn (q)) | |
2110 { | |
2111 if (reg_overlap_mentioned_p (src, PATTERN (q))) | |
2112 { | |
2113 /* If SRC is a hard register, we might miss | |
2114 some overlapping registers with | |
2115 validate_replace_rtx, so we would have to | |
2116 undo it. We can't if DEST is present in | |
2117 the insn, so fail in that combination of | |
2118 cases. */ | |
2119 if (sregno < FIRST_PSEUDO_REGISTER | |
2120 && reg_mentioned_p (dest, PATTERN (q))) | |
2121 failed = 1; | |
2122 | |
2123 /* Attempt to replace all uses. */ | |
2124 else if (!validate_replace_rtx (src, dest, q)) | |
2125 failed = 1; | |
2126 | |
2127 /* If this succeeded, but some part of the | |
2128 register is still present, undo the | |
2129 replacement. */ | |
2130 else if (sregno < FIRST_PSEUDO_REGISTER | |
2131 && reg_overlap_mentioned_p (src, PATTERN (q))) | |
2132 { | |
2133 validate_replace_rtx (dest, src, q); | |
2134 failed = 1; | |
2135 } | |
2136 } | |
2137 | |
2138 /* If DEST dies here, remove the death note and | |
2139 save it for later. Make sure ALL of DEST dies | |
2140 here; again, this is overly conservative. */ | |
2141 if (! dest_death | |
2142 && (dest_death = find_regno_note (q, REG_DEAD, dregno))) | |
2143 { | |
2144 if (GET_MODE (XEXP (dest_death, 0)) == GET_MODE (dest)) | |
2145 remove_note (q, dest_death); | |
2146 else | |
2147 { | |
2148 failed = 1; | |
2149 dest_death = 0; | |
2150 } | |
2151 } | |
2152 } | |
2153 | |
2154 if (! failed) | |
2155 { | |
2156 /* Move death note of SRC from P to INSN. */ | |
2157 remove_note (p, note); | |
2158 XEXP (note, 1) = REG_NOTES (insn); | |
2159 REG_NOTES (insn) = note; | |
2160 } | |
2161 | |
2162 /* DEST is also dead if INSN has a REG_UNUSED note for | |
2163 DEST. */ | |
2164 if (! dest_death | |
2165 && (dest_death | |
2166 = find_regno_note (insn, REG_UNUSED, dregno))) | |
2167 { | |
2168 PUT_REG_NOTE_KIND (dest_death, REG_DEAD); | |
2169 remove_note (insn, dest_death); | |
2170 } | |
2171 | |
2172 /* Put death note of DEST on P if we saw it die. */ | |
2173 if (dest_death) | |
2174 { | |
2175 XEXP (dest_death, 1) = REG_NOTES (p); | |
2176 REG_NOTES (p) = dest_death; | |
2177 } | |
2178 break; | |
2179 } | |
2180 | |
2181 /* If SRC is a hard register which is set or killed in | |
2182 some other way, we can't do this optimization. */ | |
2183 else if (sregno < FIRST_PSEUDO_REGISTER && dead_or_set_p (p, src)) | |
2184 break; | |
2185 } | |
2186 } | |
1226 } | 2187 } |
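As a schematic example of the transformation (the register numbers are invented, not taken from a real dump): given a copy insn setting r120 from r100, followed later in the same basic block by the last use of r100, the pass rewrites that use to refer to r120 and moves the REG_DEAD note for r100 up to the copy insn. The live ranges of r100 and r120 then no longer overlap, so the allocator can coalesce them and the copy itself usually disappears.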
1227 | 2188 |
1228 | 2189 |
1229 | 2190 |
1230 /* Return nonzero if REGNO is a particularly bad choice for reloading X. */ | 2191 /* Return nonzero if REGNO is a particularly bad choice for reloading X. */ |
1269 { | 2230 { |
1270 return (ira_bad_reload_regno_1 (regno, in) | 2231 return (ira_bad_reload_regno_1 (regno, in) |
1271 || ira_bad_reload_regno_1 (regno, out)); | 2232 || ira_bad_reload_regno_1 (regno, out)); |
1272 } | 2233 } |
1273 | 2234 |
1274 /* Function specific hard registers that can not be used for the | |
1275 register allocation. */ | |
1276 HARD_REG_SET ira_no_alloc_regs; | |
1277 | |
1278 /* Return TRUE if *LOC contains an asm. */ | |
1279 static int | |
1280 insn_contains_asm_1 (rtx *loc, void *data ATTRIBUTE_UNUSED) | |
1281 { | |
1282 if ( !*loc) | |
1283 return FALSE; | |
1284 if (GET_CODE (*loc) == ASM_OPERANDS) | |
1285 return TRUE; | |
1286 return FALSE; | |
1287 } | |
1288 | |
1289 | |
1290 /* Return TRUE if INSN contains an ASM. */ | |
1291 static bool | |
1292 insn_contains_asm (rtx insn) | |
1293 { | |
1294 return for_each_rtx (&insn, insn_contains_asm_1, NULL); | |
1295 } | |
1296 | |
1297 /* Add register clobbers from asm statements. */ | 2235 /* Add register clobbers from asm statements. */ |
1298 static void | 2236 static void |
1299 compute_regs_asm_clobbered (void) | 2237 compute_regs_asm_clobbered (void) |
1300 { | 2238 { |
1301 basic_block bb; | 2239 basic_block bb; |
1302 | 2240 |
1303 FOR_EACH_BB (bb) | 2241 FOR_EACH_BB_FN (bb, cfun) |
1304 { | 2242 { |
1305 rtx insn; | 2243 rtx_insn *insn; |
1306 FOR_BB_INSNS_REVERSE (bb, insn) | 2244 FOR_BB_INSNS_REVERSE (bb, insn) |
1307 { | 2245 { |
1308 df_ref *def_rec; | 2246 df_ref def; |
1309 | 2247 |
1310 if (insn_contains_asm (insn)) | 2248 if (NONDEBUG_INSN_P (insn) && asm_noperands (PATTERN (insn)) >= 0) |
1311 for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++) | 2249 FOR_EACH_INSN_DEF (def, insn) |
1312 { | 2250 { |
1313 df_ref def = *def_rec; | |
1314 unsigned int dregno = DF_REF_REGNO (def); | 2251 unsigned int dregno = DF_REF_REGNO (def); |
1315 if (dregno < FIRST_PSEUDO_REGISTER) | 2252 if (HARD_REGISTER_NUM_P (dregno)) |
1316 { | 2253 add_to_hard_reg_set (&crtl->asm_clobbers, |
1317 unsigned int i; | 2254 GET_MODE (DF_REF_REAL_REG (def)), |
1318 enum machine_mode mode = GET_MODE (DF_REF_REAL_REG (def)); | 2255 dregno); |
1319 unsigned int end = dregno | |
1320 + hard_regno_nregs[dregno][mode] - 1; | |
1321 | |
1322 for (i = dregno; i <= end; ++i) | |
1323 SET_HARD_REG_BIT(crtl->asm_clobbers, i); | |
1324 } | |
1325 } | 2256 } |
1326 } | 2257 } |
1327 } | 2258 } |
1328 } | 2259 } |
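Note: in the rewritten loop above, each asm def is handed to add_to_hard_reg_set as a whole (mode, regno) pair instead of being expanded into individual hard registers by hand with hard_regno_nregs, as the removed lines did. A minimal standalone sketch of that semantics follows (plain C, not GCC code; the bitmask type, register number and register count are illustrative assumptions):

/* Model of marking regno .. regno + nregs - 1 in a hard register set.  */
#include <stdint.h>
#include <stdio.h>

typedef uint64_t hard_reg_set;          /* assume at most 64 hard registers */

static void
add_to_set (hard_reg_set *set, unsigned int regno, unsigned int nregs)
{
  /* Equivalent of the removed explicit loop over the clobbered range.  */
  for (unsigned int i = regno; i < regno + nregs; i++)
    *set |= (hard_reg_set) 1 << i;
}

int
main (void)
{
  hard_reg_set asm_clobbers = 0;
  add_to_set (&asm_clobbers, 4, 2);     /* e.g. a def occupying two registers */
  printf ("%#llx\n", (unsigned long long) asm_clobbers);   /* prints 0x30 */
  return 0;
}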
1329 | 2260 |
1330 | 2261 |
1331 /* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE. */ | 2262 /* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and |
2263 REGS_EVER_LIVE. */ | |
1332 void | 2264 void |
1333 ira_setup_eliminable_regset (void) | 2265 ira_setup_eliminable_regset (void) |
1334 { | 2266 { |
1335 #ifdef ELIMINABLE_REGS | |
1336 int i; | 2267 int i; |
1337 static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS; | 2268 static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS; |
1338 #endif | 2269 |
2270 /* Setup is_leaf as frame_pointer_required may use it. This function | |
2271 is called by sched_init before ira if scheduling is enabled. */ | |
2272 crtl->is_leaf = leaf_function_p (); | |
2273 | |
1339 /* FIXME: If EXIT_IGNORE_STACK is set, we will not save and restore | 2274 /* FIXME: If EXIT_IGNORE_STACK is set, we will not save and restore |
1340 sp for alloca. So we can't eliminate the frame pointer in that | 2275 sp for alloca. So we can't eliminate the frame pointer in that |
1341 case. At some point, we should improve this by emitting the | 2276 case. At some point, we should improve this by emitting the |
1342 sp-adjusting insns for this case. */ | 2277 sp-adjusting insns for this case. */ |
1343 int need_fp | 2278 frame_pointer_needed |
1344 = (! flag_omit_frame_pointer | 2279 = (! flag_omit_frame_pointer |
1345 || (cfun->calls_alloca && EXIT_IGNORE_STACK) | 2280 || (cfun->calls_alloca && EXIT_IGNORE_STACK) |
1346 /* We need the frame pointer to catch stack overflow exceptions | 2281 /* We need the frame pointer to catch stack overflow exceptions if |
1347 if the stack pointer is moving. */ | 2282 the stack pointer is moving (as for the alloca case just above). */ |
1348 || (flag_stack_check && STACK_CHECK_MOVING_SP) | 2283 || (STACK_CHECK_MOVING_SP |
2284 && flag_stack_check | |
2285 && flag_exceptions | |
2286 && cfun->can_throw_non_call_exceptions) | |
1349 || crtl->accesses_prior_frames | 2287 || crtl->accesses_prior_frames |
1350 || crtl->stack_realign_needed | 2288 || (SUPPORTS_STACK_ALIGNMENT && crtl->stack_realign_needed) |
2289 /* We need a frame pointer for all Cilk Plus functions that use | |
2290 Cilk keywords. */ | |
2291 || (flag_cilkplus && cfun->is_cilk_function) | |
1351 || targetm.frame_pointer_required ()); | 2292 || targetm.frame_pointer_required ()); |
1352 | 2293 |
1353 frame_pointer_needed = need_fp; | 2294 /* The chance that FRAME_POINTER_NEEDED will change after inspecting the | 
1354 | 2295 RTL is very small. So if we use the frame pointer for RA and the RTL | 
2296 later prevents this, we will spill the pseudos assigned to the | |
2297 frame pointer in LRA. */ | |
2298 | |
2299 if (frame_pointer_needed) | |
2300 df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true); | |
2301 | |
1355 COPY_HARD_REG_SET (ira_no_alloc_regs, no_unit_alloc_regs); | 2302 COPY_HARD_REG_SET (ira_no_alloc_regs, no_unit_alloc_regs); |
1356 CLEAR_HARD_REG_SET (eliminable_regset); | 2303 CLEAR_HARD_REG_SET (eliminable_regset); |
1357 | 2304 |
1358 compute_regs_asm_clobbered (); | 2305 compute_regs_asm_clobbered (); |
1359 | 2306 |
1360 /* Build the regset of all eliminable registers and show we can't | 2307 /* Build the regset of all eliminable registers and show we can't |
1361 use those that we already know won't be eliminated. */ | 2308 use those that we already know won't be eliminated. */ |
1362 #ifdef ELIMINABLE_REGS | |
1363 for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++) | 2309 for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++) |
1364 { | 2310 { |
1365 bool cannot_elim | 2311 bool cannot_elim |
1366 = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to) | 2312 = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to) |
1367 || (eliminables[i].to == STACK_POINTER_REGNUM && need_fp)); | 2313 || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed)); |
1368 | 2314 |
1369 if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from)) | 2315 if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from)) |
1370 { | 2316 { |
1371 SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from); | 2317 SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from); |
1372 | 2318 |
1377 error ("%s cannot be used in asm here", | 2323 error ("%s cannot be used in asm here", |
1378 reg_names[eliminables[i].from]); | 2324 reg_names[eliminables[i].from]); |
1379 else | 2325 else |
1380 df_set_regs_ever_live (eliminables[i].from, true); | 2326 df_set_regs_ever_live (eliminables[i].from, true); |
1381 } | 2327 } |
1382 #if !HARD_FRAME_POINTER_IS_FRAME_POINTER | 2328 if (!HARD_FRAME_POINTER_IS_FRAME_POINTER) |
1383 if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM)) | 2329 { |
1384 { | 2330 if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM)) |
1385 SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM); | |
1386 if (need_fp) | |
1387 SET_HARD_REG_BIT (ira_no_alloc_regs, HARD_FRAME_POINTER_REGNUM); | |
1388 } | |
1389 else if (need_fp) | |
1390 error ("%s cannot be used in asm here", | |
1391 reg_names[HARD_FRAME_POINTER_REGNUM]); | |
1392 else | |
1393 df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true); | |
1394 #endif | |
1395 | |
1396 #else | |
1397 if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM)) | |
1398 { | |
1399 SET_HARD_REG_BIT (eliminable_regset, FRAME_POINTER_REGNUM); | |
1400 if (need_fp) | |
1401 SET_HARD_REG_BIT (ira_no_alloc_regs, FRAME_POINTER_REGNUM); | |
1402 } | |
1403 else if (need_fp) | |
1404 error ("%s cannot be used in asm here", reg_names[FRAME_POINTER_REGNUM]); | |
1405 else | |
1406 df_set_regs_ever_live (FRAME_POINTER_REGNUM, true); | |
1407 #endif | |
1408 } | |
1409 | |
1410 | |
1411 | |
1412 /* The length of the following two arrays. */ | |
1413 int ira_reg_equiv_len; | |
1414 | |
1415 /* The element value is TRUE if the corresponding regno value is | |
1416 invariant. */ | |
1417 bool *ira_reg_equiv_invariant_p; | |
1418 | |
1419 /* The element value is equiv constant of given pseudo-register or | |
1420 NULL_RTX. */ | |
1421 rtx *ira_reg_equiv_const; | |
1422 | |
1423 /* Set up the two arrays declared above. */ | |
1424 static void | |
1425 find_reg_equiv_invariant_const (void) | |
1426 { | |
1427 int i; | |
1428 bool invariant_p; | |
1429 rtx list, insn, note, constant, x; | |
1430 | |
1431 for (i = FIRST_PSEUDO_REGISTER; i < reg_equiv_init_size; i++) | |
1432 { | |
1433 constant = NULL_RTX; | |
1434 invariant_p = false; | |
1435 for (list = reg_equiv_init[i]; list != NULL_RTX; list = XEXP (list, 1)) | |
1436 { | 2331 { |
1437 insn = XEXP (list, 0); | 2332 SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM); |
1438 note = find_reg_note (insn, REG_EQUIV, NULL_RTX); | 2333 if (frame_pointer_needed) |
1439 | 2334 SET_HARD_REG_BIT (ira_no_alloc_regs, HARD_FRAME_POINTER_REGNUM); |
1440 if (note == NULL_RTX) | |
1441 continue; | |
1442 | |
1443 x = XEXP (note, 0); | |
1444 | |
1445 if (! CONSTANT_P (x) | |
1446 || ! flag_pic || LEGITIMATE_PIC_OPERAND_P (x)) | |
1447 { | |
1448 /* It can happen that a REG_EQUIV note contains a MEM | |
1449 that is not a legitimate memory operand. As later | |
1450 stages of the reload assume that all addresses found | |
1451 in the reg_equiv_* arrays were originally legitimate, | |
1452 we ignore such REG_EQUIV notes. */ | |
1453 if (memory_operand (x, VOIDmode)) | |
1454 invariant_p = MEM_READONLY_P (x); | |
1455 else if (function_invariant_p (x)) | |
1456 { | |
1457 if (GET_CODE (x) == PLUS | |
1458 || x == frame_pointer_rtx || x == arg_pointer_rtx) | |
1459 invariant_p = true; | |
1460 else | |
1461 constant = x; | |
1462 } | |
1463 } | |
1464 } | 2335 } |
1465 ira_reg_equiv_invariant_p[i] = invariant_p; | 2336 else if (frame_pointer_needed) |
1466 ira_reg_equiv_const[i] = constant; | 2337 error ("%s cannot be used in asm here", |
2338 reg_names[HARD_FRAME_POINTER_REGNUM]); | |
2339 else | |
2340 df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true); | |
1467 } | 2341 } |
1468 } | 2342 } |
1469 | 2343 |
1470 | 2344 |
1471 | 2345 |
1487 ira_allocno_iterator ai; | 2361 ira_allocno_iterator ai; |
1488 | 2362 |
1489 caller_save_needed = 0; | 2363 caller_save_needed = 0; |
1490 FOR_EACH_ALLOCNO (a, ai) | 2364 FOR_EACH_ALLOCNO (a, ai) |
1491 { | 2365 { |
2366 if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL) | |
2367 continue; | |
1492 /* There are no caps at this point. */ | 2368 /* There are no caps at this point. */ |
1493 ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL); | 2369 ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL); |
1494 if (! ALLOCNO_ASSIGNED_P (a)) | 2370 if (! ALLOCNO_ASSIGNED_P (a)) |
1495 /* It can happen if A is not referenced but partially anticipated | 2371 /* It can happen if A is not referenced but partially anticipated |
1496 somewhere in a region. */ | 2372 somewhere in a region. */ |
1497 ALLOCNO_ASSIGNED_P (a) = true; | 2373 ALLOCNO_ASSIGNED_P (a) = true; |
1498 ira_free_allocno_updated_costs (a); | 2374 ira_free_allocno_updated_costs (a); |
1499 hard_regno = ALLOCNO_HARD_REGNO (a); | 2375 hard_regno = ALLOCNO_HARD_REGNO (a); |
1500 regno = (int) REGNO (ALLOCNO_REG (a)); | 2376 regno = ALLOCNO_REGNO (a); |
1501 reg_renumber[regno] = (hard_regno < 0 ? -1 : hard_regno); | 2377 reg_renumber[regno] = (hard_regno < 0 ? -1 : hard_regno); |
1502 if (hard_regno >= 0 && ALLOCNO_CALLS_CROSSED_NUM (a) != 0 | 2378 if (hard_regno >= 0) |
1503 && ! ira_hard_reg_not_in_set_p (hard_regno, ALLOCNO_MODE (a), | |
1504 call_used_reg_set)) | |
1505 { | 2379 { |
1506 ira_assert (!optimize || flag_caller_saves | 2380 int i, nwords; |
1507 || regno >= ira_reg_equiv_len | 2381 enum reg_class pclass; |
1508 || ira_reg_equiv_const[regno] | 2382 ira_object_t obj; |
1509 || ira_reg_equiv_invariant_p[regno]); | 2383 |
1510 caller_save_needed = 1; | 2384 pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)]; |
2385 nwords = ALLOCNO_NUM_OBJECTS (a); | |
2386 for (i = 0; i < nwords; i++) | |
2387 { | |
2388 obj = ALLOCNO_OBJECT (a, i); | |
2389 IOR_COMPL_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), | |
2390 reg_class_contents[pclass]); | |
2391 } | |
2392 if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0 | |
2393 && ira_hard_reg_set_intersection_p (hard_regno, ALLOCNO_MODE (a), | |
2394 call_used_reg_set)) | |
2395 { | |
2396 ira_assert (!optimize || flag_caller_saves | |
2397 || (ALLOCNO_CALLS_CROSSED_NUM (a) | |
2398 == ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)) | |
2399 || regno >= ira_reg_equiv_len | |
2400 || ira_equiv_no_lvalue_p (regno)); | |
2401 caller_save_needed = 1; | |
2402 } | |
1511 } | 2403 } |
1512 } | 2404 } |
1513 } | 2405 } |
1514 | 2406 |
1515 /* Set up allocno assignment flags for further allocation | 2407 /* Set up allocno assignment flags for further allocation |
1533 the same value in different hard registers. It is also | 2425 the same value in different hard registers. It is also |
1534 impossible to assign hard registers correctly to such | 2426 impossible to assign hard registers correctly to such |
1535 allocnos because the cost info and info about intersected | 2427 allocnos because the cost info and info about intersected |
1536 calls are incorrect for them. */ | 2428 calls are incorrect for them. */ |
1537 ALLOCNO_ASSIGNED_P (a) = (hard_regno >= 0 | 2429 ALLOCNO_ASSIGNED_P (a) = (hard_regno >= 0 |
1538 || ALLOCNO_MEM_OPTIMIZED_DEST_P (a) | 2430 || ALLOCNO_EMIT_DATA (a)->mem_optimized_dest_p |
1539 || (ALLOCNO_MEMORY_COST (a) | 2431 || (ALLOCNO_MEMORY_COST (a) |
1540 - ALLOCNO_COVER_CLASS_COST (a)) < 0); | 2432 - ALLOCNO_CLASS_COST (a)) < 0); |
1541 ira_assert (hard_regno < 0 | 2433 ira_assert |
1542 || ! ira_hard_reg_not_in_set_p (hard_regno, ALLOCNO_MODE (a), | 2434 (hard_regno < 0 |
1543 reg_class_contents | 2435 || ira_hard_reg_in_set_p (hard_regno, ALLOCNO_MODE (a), |
1544 [ALLOCNO_COVER_CLASS (a)])); | 2436 reg_class_contents[ALLOCNO_CLASS (a)])); |
1545 } | 2437 } |
1546 } | 2438 } |
1547 | 2439 |
1548 /* Evaluate overall allocation cost and the costs for using hard | 2440 /* Evaluate overall allocation cost and the costs for using hard |
1549 registers and memory for allocnos. */ | 2441 registers and memory for allocnos. */ |
1557 ira_overall_cost = ira_reg_cost = ira_mem_cost = 0; | 2449 ira_overall_cost = ira_reg_cost = ira_mem_cost = 0; |
1558 FOR_EACH_ALLOCNO (a, ai) | 2450 FOR_EACH_ALLOCNO (a, ai) |
1559 { | 2451 { |
1560 hard_regno = ALLOCNO_HARD_REGNO (a); | 2452 hard_regno = ALLOCNO_HARD_REGNO (a); |
1561 ira_assert (hard_regno < 0 | 2453 ira_assert (hard_regno < 0 |
1562 || ! ira_hard_reg_not_in_set_p | 2454 || (ira_hard_reg_in_set_p |
1563 (hard_regno, ALLOCNO_MODE (a), | 2455 (hard_regno, ALLOCNO_MODE (a), |
1564 reg_class_contents[ALLOCNO_COVER_CLASS (a)])); | 2456 reg_class_contents[ALLOCNO_CLASS (a)]))); |
1565 if (hard_regno < 0) | 2457 if (hard_regno < 0) |
1566 { | 2458 { |
1567 cost = ALLOCNO_MEMORY_COST (a); | 2459 cost = ALLOCNO_MEMORY_COST (a); |
1568 ira_mem_cost += cost; | 2460 ira_mem_cost += cost; |
1569 } | 2461 } |
1570 else if (ALLOCNO_HARD_REG_COSTS (a) != NULL) | 2462 else if (ALLOCNO_HARD_REG_COSTS (a) != NULL) |
1571 { | 2463 { |
1572 cost = (ALLOCNO_HARD_REG_COSTS (a) | 2464 cost = (ALLOCNO_HARD_REG_COSTS (a) |
1573 [ira_class_hard_reg_index | 2465 [ira_class_hard_reg_index |
1574 [ALLOCNO_COVER_CLASS (a)][hard_regno]]); | 2466 [ALLOCNO_CLASS (a)][hard_regno]]); |
1575 ira_reg_cost += cost; | 2467 ira_reg_cost += cost; |
1576 } | 2468 } |
1577 else | 2469 else |
1578 { | 2470 { |
1579 cost = ALLOCNO_COVER_CLASS_COST (a); | 2471 cost = ALLOCNO_CLASS_COST (a); |
1580 ira_reg_cost += cost; | 2472 ira_reg_cost += cost; |
1581 } | 2473 } |
1582 ira_overall_cost += cost; | 2474 ira_overall_cost += cost; |
1583 } | 2475 } |
1584 | 2476 |
1585 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) | 2477 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) |
1586 { | 2478 { |
1587 fprintf (ira_dump_file, | 2479 fprintf (ira_dump_file, |
1588 "+++Costs: overall %d, reg %d, mem %d, ld %d, st %d, move %d\n", | 2480 "+++Costs: overall %" PRId64 |
2481 ", reg %" PRId64 | |
2482 ", mem %" PRId64 | |
2483 ", ld %" PRId64 | |
2484 ", st %" PRId64 | |
2485 ", move %" PRId64, | |
1589 ira_overall_cost, ira_reg_cost, ira_mem_cost, | 2486 ira_overall_cost, ira_reg_cost, ira_mem_cost, |
1590 ira_load_cost, ira_store_cost, ira_shuffle_cost); | 2487 ira_load_cost, ira_store_cost, ira_shuffle_cost); |
1591 fprintf (ira_dump_file, "+++ move loops %d, new jumps %d\n", | 2488 fprintf (ira_dump_file, "\n+++ move loops %d, new jumps %d\n", |
1592 ira_move_loops_num, ira_additional_jumps_num); | 2489 ira_move_loops_num, ira_additional_jumps_num); |
1593 } | 2490 } |
1594 | 2491 |
1595 } | 2492 } |
1596 | 2493 |
1611 int i; | 2508 int i; |
1612 | 2509 |
1613 if (ALLOCNO_CAP_MEMBER (a) != NULL | 2510 if (ALLOCNO_CAP_MEMBER (a) != NULL |
1614 || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0) | 2511 || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0) |
1615 continue; | 2512 continue; |
1616 nregs = hard_regno_nregs[hard_regno][ALLOCNO_MODE (a)]; | 2513 nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a)); |
1617 if (nregs == 1) | 2514 if (nregs == 1) |
1618 /* We allocated a single hard register. */ | 2515 /* We allocated a single hard register. */ |
1619 n = 1; | 2516 n = 1; |
1620 else if (n > 1) | 2517 else if (n > 1) |
1621 /* We allocated multiple hard registers, and we will test | 2518 /* We allocated multiple hard registers, and we will test |
1628 ira_object_t conflict_obj; | 2525 ira_object_t conflict_obj; |
1629 ira_object_conflict_iterator oci; | 2526 ira_object_conflict_iterator oci; |
1630 int this_regno = hard_regno; | 2527 int this_regno = hard_regno; |
1631 if (n > 1) | 2528 if (n > 1) |
1632 { | 2529 { |
1633 if (WORDS_BIG_ENDIAN) | 2530 if (REG_WORDS_BIG_ENDIAN) |
1634 this_regno += n - i - 1; | 2531 this_regno += n - i - 1; |
1635 else | 2532 else |
1636 this_regno += i; | 2533 this_regno += i; |
1637 } | 2534 } |
1638 FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) | 2535 FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) |
1640 ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); | 2537 ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); |
1641 int conflict_hard_regno = ALLOCNO_HARD_REGNO (conflict_a); | 2538 int conflict_hard_regno = ALLOCNO_HARD_REGNO (conflict_a); |
1642 if (conflict_hard_regno < 0) | 2539 if (conflict_hard_regno < 0) |
1643 continue; | 2540 continue; |
1644 | 2541 |
1645 conflict_nregs | 2542 conflict_nregs = hard_regno_nregs (conflict_hard_regno, |
1646 = (hard_regno_nregs | 2543 ALLOCNO_MODE (conflict_a)); |
1647 [conflict_hard_regno][ALLOCNO_MODE (conflict_a)]); | |
1648 | 2544 |
1649 if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1 | 2545 if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1 |
1650 && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a)) | 2546 && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a)) |
1651 { | 2547 { |
1652 if (WORDS_BIG_ENDIAN) | 2548 if (REG_WORDS_BIG_ENDIAN) |
1653 conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a) | 2549 conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a) |
1654 - OBJECT_SUBWORD (conflict_obj) - 1); | 2550 - OBJECT_SUBWORD (conflict_obj) - 1); |
1655 else | 2551 else |
1656 conflict_hard_regno += OBJECT_SUBWORD (conflict_obj); | 2552 conflict_hard_regno += OBJECT_SUBWORD (conflict_obj); |
1657 conflict_nregs = 1; | 2553 conflict_nregs = 1; |
1670 } | 2566 } |
1671 } | 2567 } |
1672 } | 2568 } |
1673 #endif | 2569 #endif |
1674 | 2570 |
2571 /* Allocate REG_EQUIV_INIT. Set it up from IRA_REG_EQUIV, which should | |
2572 already be calculated. */ | |
2573 static void | |
2574 setup_reg_equiv_init (void) | |
2575 { | |
2576 int i; | |
2577 int max_regno = max_reg_num (); | |
2578 | |
2579 for (i = 0; i < max_regno; i++) | |
2580 reg_equiv_init (i) = ira_reg_equiv[i].init_insns; | |
2581 } | |
2582 | |
2583 /* Update the equiv regno after the movement of FROM_REGNO to TO_REGNO. | |
2584 INSNS are the insns that were generated for that movement. It is assumed | |
2585 that FROM_REGNO and TO_REGNO always have the same value at the | |
2586 point of any move containing such registers. This function is used | |
2587 to update equiv info for register shuffles on the region borders | |
2588 and for caller save/restore insns. */ | |
2589 void | |
2590 ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx_insn *insns) | |
2591 { | |
2592 rtx_insn *insn; | |
2593 rtx x, note; | |
2594 | |
2595 if (! ira_reg_equiv[from_regno].defined_p | |
2596 && (! ira_reg_equiv[to_regno].defined_p | |
2597 || ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX | |
2598 && ! MEM_READONLY_P (x)))) | |
2599 return; | |
2600 insn = insns; | |
2601 if (NEXT_INSN (insn) != NULL_RTX) | |
2602 { | |
2603 if (! ira_reg_equiv[to_regno].defined_p) | |
2604 { | |
2605 ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX); | |
2606 return; | |
2607 } | |
2608 ira_reg_equiv[to_regno].defined_p = false; | |
2609 ira_reg_equiv[to_regno].memory | |
2610 = ira_reg_equiv[to_regno].constant | |
2611 = ira_reg_equiv[to_regno].invariant | |
2612 = ira_reg_equiv[to_regno].init_insns = NULL; | |
2613 if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) | |
2614 fprintf (ira_dump_file, | |
2615 " Invalidating equiv info for reg %d\n", to_regno); | |
2616 return; | |
2617 } | |
2618 /* It is possible that FROM_REGNO still has no equivalence because | |
2619 in shuffles to_regno<-from_regno and from_regno<-to_regno the second | |
2620 insn has not been processed yet. */ | |
2621 if (ira_reg_equiv[from_regno].defined_p) | |
2622 { | |
2623 ira_reg_equiv[to_regno].defined_p = true; | |
2624 if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX) | |
2625 { | |
2626 ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX | |
2627 && ira_reg_equiv[from_regno].constant == NULL_RTX); | |
2628 ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX | |
2629 || rtx_equal_p (ira_reg_equiv[to_regno].memory, x)); | |
2630 ira_reg_equiv[to_regno].memory = x; | |
2631 if (! MEM_READONLY_P (x)) | |
2632 /* We don't add the insn to the init insn list because a memory | |
2633 equivalence only says which memory it is better to use | |
2634 when the pseudo is spilled. */ | |
2635 return; | |
2636 } | |
2637 else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX) | |
2638 { | |
2639 ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX); | |
2640 ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX | |
2641 || rtx_equal_p (ira_reg_equiv[to_regno].constant, x)); | |
2642 ira_reg_equiv[to_regno].constant = x; | |
2643 } | |
2644 else | |
2645 { | |
2646 x = ira_reg_equiv[from_regno].invariant; | |
2647 ira_assert (x != NULL_RTX); | |
2648 ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX | |
2649 || rtx_equal_p (ira_reg_equiv[to_regno].invariant, x)); | |
2650 ira_reg_equiv[to_regno].invariant = x; | |
2651 } | |
2652 if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX) | |
2653 { | |
2654 note = set_unique_reg_note (insn, REG_EQUIV, copy_rtx (x)); | |
2655 gcc_assert (note != NULL_RTX); | |
2656 if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) | |
2657 { | |
2658 fprintf (ira_dump_file, | |
2659 " Adding equiv note to insn %u for reg %d ", | |
2660 INSN_UID (insn), to_regno); | |
2661 dump_value_slim (ira_dump_file, x, 1); | |
2662 fprintf (ira_dump_file, "\n"); | |
2663 } | |
2664 } | |
2665 } | |
2666 ira_reg_equiv[to_regno].init_insns | |
2667 = gen_rtx_INSN_LIST (VOIDmode, insn, | |
2668 ira_reg_equiv[to_regno].init_insns); | |
2669 if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) | |
2670 fprintf (ira_dump_file, | |
2671 " Adding equiv init move insn %u to reg %d\n", | |
2672 INSN_UID (insn), to_regno); | |
2673 } | |
2674 | |
1675 /* Fix values of array REG_EQUIV_INIT after live range splitting done | 2675 /* Fix values of array REG_EQUIV_INIT after live range splitting done |
1676 by IRA. */ | 2676 by IRA. */ |
1677 static void | 2677 static void |
1678 fix_reg_equiv_init (void) | 2678 fix_reg_equiv_init (void) |
1679 { | 2679 { |
1680 int max_regno = max_reg_num (); | 2680 int max_regno = max_reg_num (); |
1681 int i, new_regno; | 2681 int i, new_regno, max; |
1682 rtx x, prev, next, insn, set; | 2682 rtx set; |
1683 | 2683 rtx_insn_list *x, *next, *prev; |
1684 if (reg_equiv_init_size < max_regno) | 2684 rtx_insn *insn; |
1685 { | 2685 |
1686 reg_equiv_init = GGC_RESIZEVEC (rtx, reg_equiv_init, max_regno); | 2686 if (max_regno_before_ira < max_regno) |
1687 while (reg_equiv_init_size < max_regno) | 2687 { |
1688 reg_equiv_init[reg_equiv_init_size++] = NULL_RTX; | 2688 max = vec_safe_length (reg_equivs); |
1689 for (i = FIRST_PSEUDO_REGISTER; i < reg_equiv_init_size; i++) | 2689 grow_reg_equivs (); |
1690 for (prev = NULL_RTX, x = reg_equiv_init[i]; x != NULL_RTX; x = next) | 2690 for (i = FIRST_PSEUDO_REGISTER; i < max; i++) |
2691 for (prev = NULL, x = reg_equiv_init (i); | |
2692 x != NULL_RTX; | |
2693 x = next) | |
1691 { | 2694 { |
1692 next = XEXP (x, 1); | 2695 next = x->next (); |
1693 insn = XEXP (x, 0); | 2696 insn = x->insn (); |
1694 set = single_set (insn); | 2697 set = single_set (insn); |
1695 ira_assert (set != NULL_RTX | 2698 ira_assert (set != NULL_RTX |
1696 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set)))); | 2699 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set)))); |
1697 if (REG_P (SET_DEST (set)) | 2700 if (REG_P (SET_DEST (set)) |
1698 && ((int) REGNO (SET_DEST (set)) == i | 2701 && ((int) REGNO (SET_DEST (set)) == i |
1706 gcc_unreachable (); | 2709 gcc_unreachable (); |
1707 if (new_regno == i) | 2710 if (new_regno == i) |
1708 prev = x; | 2711 prev = x; |
1709 else | 2712 else |
1710 { | 2713 { |
2714 /* Remove the wrong list element. */ | |
1711 if (prev == NULL_RTX) | 2715 if (prev == NULL_RTX) |
1712 reg_equiv_init[i] = next; | 2716 reg_equiv_init (i) = next; |
1713 else | 2717 else |
1714 XEXP (prev, 1) = next; | 2718 XEXP (prev, 1) = next; |
1715 XEXP (x, 1) = reg_equiv_init[new_regno]; | 2719 XEXP (x, 1) = reg_equiv_init (new_regno); |
1716 reg_equiv_init[new_regno] = x; | 2720 reg_equiv_init (new_regno) = x; |
1717 } | 2721 } |
1718 } | 2722 } |
1719 } | 2723 } |
1720 } | 2724 } |
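Note: the surviving lines of fix_reg_equiv_init above unlink an init-insn list node from the chain of the original pseudo and prepend it to the chain of the pseudo the insn now refers to after live range splitting. A standalone sketch of that list surgery, using an invented node type rather than GCC's rtx_insn_list:

#include <stdio.h>

struct node { int insn_uid; struct node *next; };

/* Unlink X (whose predecessor in the old chain is PREV, or NULL if X is
   the head) and push it onto the new chain.  */
static void
move_node (struct node **old_head, struct node *prev, struct node *x,
           struct node **new_head)
{
  if (prev == NULL)
    *old_head = x->next;
  else
    prev->next = x->next;
  x->next = *new_head;
  *new_head = x;
}

int
main (void)
{
  struct node b = { 2, NULL }, a = { 1, &b };
  struct node *old_list = &a, *new_list = NULL;

  move_node (&old_list, NULL, &a, &new_list);  /* insn 1 now belongs to the new regno */
  printf ("old head: %d, new head: %d\n", old_list->insn_uid, new_list->insn_uid);
  /* prints "old head: 2, new head: 1" */
  return 0;
}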
1721 | 2725 |
1730 ira_allocno_iterator ai; | 2734 ira_allocno_iterator ai; |
1731 | 2735 |
1732 FOR_EACH_ALLOCNO (a, ai) | 2736 FOR_EACH_ALLOCNO (a, ai) |
1733 { | 2737 { |
1734 if (ALLOCNO_CAP_MEMBER (a) != NULL) | 2738 if (ALLOCNO_CAP_MEMBER (a) != NULL) |
1735 /* It is a cap. */ | 2739 /* It is a cap. */ |
1736 continue; | 2740 continue; |
1737 hard_regno = ALLOCNO_HARD_REGNO (a); | 2741 hard_regno = ALLOCNO_HARD_REGNO (a); |
1738 if (hard_regno >= 0) | 2742 if (hard_regno >= 0) |
1739 continue; | 2743 continue; |
1740 for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) | 2744 for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) |
1766 { | 2770 { |
1767 old_regno = ORIGINAL_REGNO (regno_reg_rtx[i]); | 2771 old_regno = ORIGINAL_REGNO (regno_reg_rtx[i]); |
1768 ira_assert (i != old_regno); | 2772 ira_assert (i != old_regno); |
1769 setup_reg_classes (i, reg_preferred_class (old_regno), | 2773 setup_reg_classes (i, reg_preferred_class (old_regno), |
1770 reg_alternate_class (old_regno), | 2774 reg_alternate_class (old_regno), |
1771 reg_cover_class (old_regno)); | 2775 reg_allocno_class (old_regno)); |
1772 if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) | 2776 if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) |
1773 fprintf (ira_dump_file, | 2777 fprintf (ira_dump_file, |
1774 " New r%d: setting preferred %s, alternative %s\n", | 2778 " New r%d: setting preferred %s, alternative %s\n", |
1775 i, reg_class_names[reg_preferred_class (old_regno)], | 2779 i, reg_class_names[reg_preferred_class (old_regno)], |
1776 reg_class_names[reg_alternate_class (old_regno)]); | 2780 reg_class_names[reg_alternate_class (old_regno)]); |
1777 } | 2781 } |
1778 } | 2782 } |
1779 | 2783 |
1780 | 2784 |
2785 /* The number of entries allocated in reg_info. */ | |
2786 static int allocated_reg_info_size; | |
1781 | 2787 |
1782 /* Regional allocation can create new pseudo-registers. This function | 2788 /* Regional allocation can create new pseudo-registers. This function |
1783 expands some arrays for pseudo-registers. */ | 2789 expands some arrays for pseudo-registers. */ |
1784 static void | 2790 static void |
1785 expand_reg_info (int old_size) | 2791 expand_reg_info (void) |
1786 { | 2792 { |
1787 int i; | 2793 int i; |
1788 int size = max_reg_num (); | 2794 int size = max_reg_num (); |
1789 | 2795 |
1790 resize_reg_info (); | 2796 resize_reg_info (); |
1791 for (i = old_size; i < size; i++) | 2797 for (i = allocated_reg_info_size; i < size; i++) |
1792 setup_reg_classes (i, GENERAL_REGS, ALL_REGS, GENERAL_REGS); | 2798 setup_reg_classes (i, GENERAL_REGS, ALL_REGS, GENERAL_REGS); |
2799 setup_preferred_alternate_classes_for_new_pseudos (allocated_reg_info_size); | |
2800 allocated_reg_info_size = size; | |
1793 } | 2801 } |
1794 | 2802 |
1795 /* Return TRUE if register pressure in the function is too high. | 2803 /* Return TRUE if register pressure in the function is too high. | 
1796 It is used to decide when stack slot sharing is worth doing. */ | 2804 It is used to decide when stack slot sharing is worth doing. */ | 
1797 static bool | 2805 static bool |
1798 too_high_register_pressure_p (void) | 2806 too_high_register_pressure_p (void) |
1799 { | 2807 { |
1800 int i; | 2808 int i; |
1801 enum reg_class cover_class; | 2809 enum reg_class pclass; |
1802 | 2810 |
1803 for (i = 0; i < ira_reg_class_cover_size; i++) | 2811 for (i = 0; i < ira_pressure_classes_num; i++) |
1804 { | 2812 { |
1805 cover_class = ira_reg_class_cover[i]; | 2813 pclass = ira_pressure_classes[i]; |
1806 if (ira_loop_tree_root->reg_pressure[cover_class] > 10000) | 2814 if (ira_loop_tree_root->reg_pressure[pclass] > 10000) |
1807 return true; | 2815 return true; |
1808 } | 2816 } |
1809 return false; | 2817 return false; |
1810 } | 2818 } |
1811 | 2819 |
1818 | 2826 |
1819 void | 2827 void |
1820 mark_elimination (int from, int to) | 2828 mark_elimination (int from, int to) |
1821 { | 2829 { |
1822 basic_block bb; | 2830 basic_block bb; |
1823 | 2831 bitmap r; |
1824 FOR_EACH_BB (bb) | 2832 |
1825 { | 2833 FOR_EACH_BB_FN (bb, cfun) |
1826 /* We don't use LIVE info in IRA. */ | 2834 { |
1827 bitmap r = DF_LR_IN (bb); | 2835 r = DF_LR_IN (bb); |
1828 | 2836 if (bitmap_bit_p (r, from)) |
1829 if (REGNO_REG_SET_P (r, from)) | |
1830 { | 2837 { |
1831 CLEAR_REGNO_REG_SET (r, from); | 2838 bitmap_clear_bit (r, from); |
1832 SET_REGNO_REG_SET (r, to); | 2839 bitmap_set_bit (r, to); |
1833 } | 2840 } |
1834 } | 2841 if (! df_live) |
2842 continue; | |
2843 r = DF_LIVE_IN (bb); | |
2844 if (bitmap_bit_p (r, from)) | |
2845 { | |
2846 bitmap_clear_bit (r, from); | |
2847 bitmap_set_bit (r, to); | |
2848 } | |
2849 } | |
2850 } | |
2851 | |
2852 | |
2853 | |
2854 /* The length of the following array. */ | |
2855 int ira_reg_equiv_len; | |
2856 | |
2857 /* Info about equiv. info for each register. */ | |
2858 struct ira_reg_equiv_s *ira_reg_equiv; | |
2859 | |
2860 /* Expand ira_reg_equiv if necessary. */ | |
2861 void | |
2862 ira_expand_reg_equiv (void) | |
2863 { | |
2864 int old = ira_reg_equiv_len; | |
2865 | |
2866 if (ira_reg_equiv_len > max_reg_num ()) | |
2867 return; | |
2868 ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1; | |
2869 ira_reg_equiv | |
2870 = (struct ira_reg_equiv_s *) xrealloc (ira_reg_equiv, | |
2871 ira_reg_equiv_len | |
2872 * sizeof (struct ira_reg_equiv_s)); | |
2873 gcc_assert (old < ira_reg_equiv_len); | |
2874 memset (ira_reg_equiv + old, 0, | |
2875 sizeof (struct ira_reg_equiv_s) * (ira_reg_equiv_len - old)); | |
2876 } | |
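Note: ira_expand_reg_equiv grows the table to one and a half times the current number of registers plus one and clears only the newly added tail, so existing entries survive repeated calls as new pseudos are created. A self-contained sketch of the same growth policy (the entry type and register counts are illustrative, not GCC code):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct equiv_entry { void *init_insns; int defined_p; };

static struct equiv_entry *table;
static int table_len;

static void
expand_table (int max_reg_num)
{
  int old = table_len;

  if (table_len > max_reg_num)
    return;                                  /* already big enough */
  table_len = max_reg_num * 3 / 2 + 1;
  table = realloc (table, table_len * sizeof (struct equiv_entry));
  if (table == NULL)
    abort ();
  /* Zero only the entries added by this call.  */
  memset (table + old, 0, (table_len - old) * sizeof (struct equiv_entry));
}

int
main (void)
{
  expand_table (100);           /* first call: 151 zero-initialized entries */
  printf ("%d\n", table_len);   /* prints 151 */
  expand_table (120);           /* 151 > 120, so nothing happens */
  printf ("%d\n", table_len);   /* still prints 151 */
  free (table);
  return 0;
}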
2877 | |
2878 static void | |
2879 init_reg_equiv (void) | |
2880 { | |
2881 ira_reg_equiv_len = 0; | |
2882 ira_reg_equiv = NULL; | |
2883 ira_expand_reg_equiv (); | |
2884 } | |
2885 | |
2886 static void | |
2887 finish_reg_equiv (void) | |
2888 { | |
2889 free (ira_reg_equiv); | |
1835 } | 2890 } |
1836 | 2891 |
1837 | 2892 |
1838 | 2893 |
1839 struct equivalence | 2894 struct equivalence |
1841 /* Set when a REG_EQUIV note is found or created. Use to | 2896 /* Set when a REG_EQUIV note is found or created. Use to |
1842 keep track of what memory accesses might be created later, | 2897 keep track of what memory accesses might be created later, |
1843 e.g. by reload. */ | 2898 e.g. by reload. */ |
1844 rtx replacement; | 2899 rtx replacement; |
1845 rtx *src_p; | 2900 rtx *src_p; |
1846 /* The list of each instruction which initializes this register. */ | 2901 |
1847 rtx init_insns; | 2902 /* The list of each instruction which initializes this register. |
2903 | |
2904 NULL indicates we know nothing about this register's equivalence | |
2905 properties. | |
2906 | |
2907 An INSN_LIST with a NULL insn indicates this pseudo is already | |
2908 known to not have a valid equivalence. */ | |
2909 rtx_insn_list *init_insns; | |
2910 | |
1848 /* Loop depth is used to recognize equivalences which appear | 2911 /* Loop depth is used to recognize equivalences which appear |
1849 to be present within the same loop (or in an inner loop). */ | 2912 to be present within the same loop (or in an inner loop). */ |
1850 int loop_depth; | 2913 short loop_depth; |
1851 /* Nonzero if this had a preexisting REG_EQUIV note. */ | 2914 /* Nonzero if this had a preexisting REG_EQUIV note. */ |
1852 int is_arg_equivalence; | 2915 unsigned char is_arg_equivalence : 1; |
1853 /* Set when an attempt should be made to replace a register | 2916 /* Set when an attempt should be made to replace a register |
1854 with the associated src_p entry. */ | 2917 with the associated src_p entry. */ |
1855 char replace; | 2918 unsigned char replace : 1; |
2919 /* Set if this register has no known equivalence. */ | |
2920 unsigned char no_equiv : 1; | |
2921 /* Set if this register is mentioned in a paradoxical subreg. */ | |
2922 unsigned char pdx_subregs : 1; | |
1856 }; | 2923 }; |
1857 | 2924 |
1858 /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence | 2925 /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence |
1859 structure for that register. */ | 2926 structure for that register. */ |
1860 static struct equivalence *reg_equiv; | 2927 static struct equivalence *reg_equiv; |
1861 | 2928 |
1862 /* Used for communication between the following two functions: contains | 2929 /* Used for communication between the following two functions. */ |
1863 a MEM that we wish to ensure remains unchanged. */ | 2930 struct equiv_mem_data |
1864 static rtx equiv_mem; | 2931 { |
1865 | 2932 /* A MEM that we wish to ensure remains unchanged. */ |
1866 /* Set nonzero if EQUIV_MEM is modified. */ | 2933 rtx equiv_mem; |
1867 static int equiv_mem_modified; | 2934 |
2935 /* Set true if EQUIV_MEM is modified. */ | |
2936 bool equiv_mem_modified; | |
2937 }; | |
1868 | 2938 |
1869 /* If EQUIV_MEM is modified by modifying DEST, indicate that it is modified. | 2939 /* If EQUIV_MEM is modified by modifying DEST, indicate that it is modified. |
1870 Called via note_stores. */ | 2940 Called via note_stores. */ |
1871 static void | 2941 static void |
1872 validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED, | 2942 validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED, |
1873 void *data ATTRIBUTE_UNUSED) | 2943 void *data) |
1874 { | 2944 { |
2945 struct equiv_mem_data *info = (struct equiv_mem_data *) data; | |
2946 | |
1875 if ((REG_P (dest) | 2947 if ((REG_P (dest) |
1876 && reg_overlap_mentioned_p (dest, equiv_mem)) | 2948 && reg_overlap_mentioned_p (dest, info->equiv_mem)) |
1877 || (MEM_P (dest) | 2949 || (MEM_P (dest) |
1878 && true_dependence (dest, VOIDmode, equiv_mem, rtx_varies_p))) | 2950 && anti_dependence (info->equiv_mem, dest))) |
1879 equiv_mem_modified = 1; | 2951 info->equiv_mem_modified = true; |
1880 } | 2952 } |
2953 | |
2954 enum valid_equiv { valid_none, valid_combine, valid_reload }; | |
1881 | 2955 |
1882 /* Verify that no store between START and the death of REG invalidates | 2956 /* Verify that no store between START and the death of REG invalidates |
1883 MEMREF. MEMREF is invalidated by modifying a register used in MEMREF, | 2957 MEMREF. MEMREF is invalidated by modifying a register used in MEMREF, |
1884 by storing into an overlapping memory location, or with a non-const | 2958 by storing into an overlapping memory location, or with a non-const |
1885 CALL_INSN. | 2959 CALL_INSN. |
1886 | 2960 |
1887 Return 1 if MEMREF remains valid. */ | 2961 Return VALID_RELOAD if MEMREF remains valid for both reload and |
1888 static int | 2962 combine_and_move insns, VALID_COMBINE if only valid for |
1889 validate_equiv_mem (rtx start, rtx reg, rtx memref) | 2963 combine_and_move_insns, and VALID_NONE otherwise. */ |
1890 { | 2964 static enum valid_equiv |
1891 rtx insn; | 2965 validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref) |
2966 { | |
2967 rtx_insn *insn; | |
1892 rtx note; | 2968 rtx note; |
1893 | 2969 struct equiv_mem_data info = { memref, false }; |
1894 equiv_mem = memref; | 2970 enum valid_equiv ret = valid_reload; |
1895 equiv_mem_modified = 0; | |
1896 | 2971 |
1897 /* If the memory reference has side effects or is volatile, it isn't a | 2972 /* If the memory reference has side effects or is volatile, it isn't a |
1898 valid equivalence. */ | 2973 valid equivalence. */ |
1899 if (side_effects_p (memref)) | 2974 if (side_effects_p (memref)) |
1900 return 0; | 2975 return valid_none; |
1901 | 2976 |
1902 for (insn = start; insn && ! equiv_mem_modified; insn = NEXT_INSN (insn)) | 2977 for (insn = start; insn; insn = NEXT_INSN (insn)) |
1903 { | 2978 { |
1904 if (! INSN_P (insn)) | 2979 if (!INSN_P (insn)) |
1905 continue; | 2980 continue; |
1906 | 2981 |
1907 if (find_reg_note (insn, REG_DEAD, reg)) | 2982 if (find_reg_note (insn, REG_DEAD, reg)) |
1908 return 1; | 2983 return ret; |
1909 | 2984 |
1910 /* This used to ignore readonly memory and const/pure calls. The problem | |
1911 is the equivalent form may reference a pseudo which gets assigned a | |
1912 call clobbered hard reg. When we later replace REG with its | |
1913 equivalent form, the value in the call-clobbered reg has been | |
1914 changed and all hell breaks loose. */ | |
1915 if (CALL_P (insn)) | 2985 if (CALL_P (insn)) |
1916 return 0; | 2986 { |
1917 | 2987 /* We can combine a reg def from one insn into a reg use in |
1918 note_stores (PATTERN (insn), validate_equiv_mem_from_store, NULL); | 2988 another over a call if the memory is readonly or the call |
2989 const/pure. However, we can't set reg_equiv notes up for | |
2990 reload over any call. The problem is the equivalent form | |
2991 may reference a pseudo which gets assigned a call | |
2992 clobbered hard reg. When we later replace REG with its | |
2993 equivalent form, the value in the call-clobbered reg has | |
2994 been changed and all hell breaks loose. */ | |
2995 ret = valid_combine; | |
2996 if (!MEM_READONLY_P (memref) | |
2997 && !RTL_CONST_OR_PURE_CALL_P (insn)) | |
2998 return valid_none; | |
2999 } | |
3000 | |
3001 note_stores (PATTERN (insn), validate_equiv_mem_from_store, &info); | |
3002 if (info.equiv_mem_modified) | |
3003 return valid_none; | |
1919 | 3004 |
1920 /* If a register mentioned in MEMREF is modified via an | 3005 /* If a register mentioned in MEMREF is modified via an |
1921 auto-increment, we lose the equivalence. Do the same if one | 3006 auto-increment, we lose the equivalence. Do the same if one |
1922 dies; although we could extend the life, it doesn't seem worth | 3007 dies; although we could extend the life, it doesn't seem worth |
1923 the trouble. */ | 3008 the trouble. */ |
1925 for (note = REG_NOTES (insn); note; note = XEXP (note, 1)) | 3010 for (note = REG_NOTES (insn); note; note = XEXP (note, 1)) |
1926 if ((REG_NOTE_KIND (note) == REG_INC | 3011 if ((REG_NOTE_KIND (note) == REG_INC |
1927 || REG_NOTE_KIND (note) == REG_DEAD) | 3012 || REG_NOTE_KIND (note) == REG_DEAD) |
1928 && REG_P (XEXP (note, 0)) | 3013 && REG_P (XEXP (note, 0)) |
1929 && reg_overlap_mentioned_p (XEXP (note, 0), memref)) | 3014 && reg_overlap_mentioned_p (XEXP (note, 0), memref)) |
1930 return 0; | 3015 return valid_none; |
1931 } | 3016 } |
1932 | 3017 |
1933 return 0; | 3018 return valid_none; |
1934 } | 3019 } |
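Note: the rewritten validate_equiv_mem no longer gives a yes/no answer. It returns valid_reload when the memory equivalence may be used by reload, valid_combine when it is only safe for combining/moving the defining insn (for example after crossing a call while the memory is read-only or the call is const/pure), and valid_none otherwise. The following standalone model of that decision rule is illustrative only; the event list and the simplified call handling are assumptions, not ira.c code:

#include <stdio.h>

enum valid_equiv { valid_none, valid_combine, valid_reload };

enum event { EV_REG_DEAD, EV_CALL, EV_CONFLICTING_STORE, EV_OTHER };

static enum valid_equiv
scan (const enum event *ev, int n, int memref_safe_over_calls)
{
  enum valid_equiv ret = valid_reload;

  for (int i = 0; i < n; i++)
    switch (ev[i])
      {
      case EV_REG_DEAD:
        return ret;                  /* the register died: report what survived */
      case EV_CALL:
        ret = valid_combine;         /* still usable for combine_and_move ... */
        if (!memref_safe_over_calls)
          return valid_none;         /* ... but never as a reload equivalence */
        break;
      case EV_CONFLICTING_STORE:
        return valid_none;           /* the memory was clobbered */
      default:
        break;
      }
  return valid_none;                 /* never saw the death note */
}

int
main (void)
{
  enum event trace[] = { EV_OTHER, EV_CALL, EV_REG_DEAD };
  printf ("%d\n", scan (trace, 3, 1));   /* prints 1, i.e. valid_combine */
  return 0;
}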
1935 | 3020 |
1936 /* Returns zero if X is known to be invariant. */ | 3021 /* Returns zero if X is known to be invariant. */ |
1937 static int | 3022 static int |
1938 equiv_init_varies_p (rtx x) | 3023 equiv_init_varies_p (rtx x) |
1945 { | 3030 { |
1946 case MEM: | 3031 case MEM: |
1947 return !MEM_READONLY_P (x) || equiv_init_varies_p (XEXP (x, 0)); | 3032 return !MEM_READONLY_P (x) || equiv_init_varies_p (XEXP (x, 0)); |
1948 | 3033 |
1949 case CONST: | 3034 case CONST: |
1950 case CONST_INT: | 3035 CASE_CONST_ANY: |
1951 case CONST_DOUBLE: | |
1952 case CONST_FIXED: | |
1953 case CONST_VECTOR: | |
1954 case SYMBOL_REF: | 3036 case SYMBOL_REF: |
1955 case LABEL_REF: | 3037 case LABEL_REF: |
1956 return 0; | 3038 return 0; |
1957 | 3039 |
1958 case REG: | 3040 case REG: |
2013 case PRE_MODIFY: | 3095 case PRE_MODIFY: |
2014 case POST_MODIFY: | 3096 case POST_MODIFY: |
2015 return 0; | 3097 return 0; |
2016 | 3098 |
2017 case REG: | 3099 case REG: |
2018 return (reg_equiv[REGNO (x)].loop_depth >= reg_equiv[regno].loop_depth | 3100 return ((reg_equiv[REGNO (x)].loop_depth >= reg_equiv[regno].loop_depth |
2019 && reg_equiv[REGNO (x)].replace) | 3101 && reg_equiv[REGNO (x)].replace) |
2020 || (REG_BASIC_BLOCK (REGNO (x)) < NUM_FIXED_BLOCKS && ! rtx_varies_p (x, 0)); | 3102 || (REG_BASIC_BLOCK (REGNO (x)) < NUM_FIXED_BLOCKS |
3103 && ! rtx_varies_p (x, 0))); | |
2021 | 3104 |
2022 case UNSPEC_VOLATILE: | 3105 case UNSPEC_VOLATILE: |
2023 return 0; | 3106 return 0; |
2024 | 3107 |
2025 case ASM_OPERANDS: | 3108 case ASM_OPERANDS: |
2048 } | 3131 } |
2049 | 3132 |
2050 return 1; | 3133 return 1; |
2051 } | 3134 } |
2052 | 3135 |
2053 /* TRUE if X uses any registers for which reg_equiv[REGNO].replace is true. */ | |
2054 static int | |
2055 contains_replace_regs (rtx x) | |
2056 { | |
2057 int i, j; | |
2058 const char *fmt; | |
2059 enum rtx_code code = GET_CODE (x); | |
2060 | |
2061 switch (code) | |
2062 { | |
2063 case CONST_INT: | |
2064 case CONST: | |
2065 case LABEL_REF: | |
2066 case SYMBOL_REF: | |
2067 case CONST_DOUBLE: | |
2068 case CONST_FIXED: | |
2069 case CONST_VECTOR: | |
2070 case PC: | |
2071 case CC0: | |
2072 case HIGH: | |
2073 return 0; | |
2074 | |
2075 case REG: | |
2076 return reg_equiv[REGNO (x)].replace; | |
2077 | |
2078 default: | |
2079 break; | |
2080 } | |
2081 | |
2082 fmt = GET_RTX_FORMAT (code); | |
2083 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) | |
2084 switch (fmt[i]) | |
2085 { | |
2086 case 'e': | |
2087 if (contains_replace_regs (XEXP (x, i))) | |
2088 return 1; | |
2089 break; | |
2090 case 'E': | |
2091 for (j = XVECLEN (x, i) - 1; j >= 0; j--) | |
2092 if (contains_replace_regs (XVECEXP (x, i, j))) | |
2093 return 1; | |
2094 break; | |
2095 } | |
2096 | |
2097 return 0; | |
2098 } | |
2099 | |
2100 /* TRUE if X references a memory location that would be affected by a store | 3136 /* TRUE if X references a memory location that would be affected by a store |
2101 to MEMREF. */ | 3137 to MEMREF. */ |
2102 static int | 3138 static int |
2103 memref_referenced_p (rtx memref, rtx x) | 3139 memref_referenced_p (rtx memref, rtx x) |
2104 { | 3140 { |
2106 const char *fmt; | 3142 const char *fmt; |
2107 enum rtx_code code = GET_CODE (x); | 3143 enum rtx_code code = GET_CODE (x); |
2108 | 3144 |
2109 switch (code) | 3145 switch (code) |
2110 { | 3146 { |
2111 case CONST_INT: | |
2112 case CONST: | 3147 case CONST: |
2113 case LABEL_REF: | 3148 case LABEL_REF: |
2114 case SYMBOL_REF: | 3149 case SYMBOL_REF: |
2115 case CONST_DOUBLE: | 3150 CASE_CONST_ANY: |
2116 case CONST_FIXED: | |
2117 case CONST_VECTOR: | |
2118 case PC: | 3151 case PC: |
2119 case CC0: | 3152 case CC0: |
2120 case HIGH: | 3153 case HIGH: |
2121 case LO_SUM: | 3154 case LO_SUM: |
2122 return 0; | 3155 return 0; |
2125 return (reg_equiv[REGNO (x)].replacement | 3158 return (reg_equiv[REGNO (x)].replacement |
2126 && memref_referenced_p (memref, | 3159 && memref_referenced_p (memref, |
2127 reg_equiv[REGNO (x)].replacement)); | 3160 reg_equiv[REGNO (x)].replacement)); |
2128 | 3161 |
2129 case MEM: | 3162 case MEM: |
2130 if (true_dependence (memref, VOIDmode, x, rtx_varies_p)) | 3163 if (true_dependence (memref, VOIDmode, x)) |
2131 return 1; | 3164 return 1; |
2132 break; | 3165 break; |
2133 | 3166 |
2134 case SET: | 3167 case SET: |
2135 /* If we are setting a MEM, it doesn't count (its address does), but any | 3168 /* If we are setting a MEM, it doesn't count (its address does), but any |
2165 | 3198 |
2166 return 0; | 3199 return 0; |
2167 } | 3200 } |
2168 | 3201 |
2169 /* TRUE if some insn in the range (START, END] references a memory location | 3202 /* TRUE if some insn in the range (START, END] references a memory location |
2170 that would be affected by a store to MEMREF. */ | 3203 that would be affected by a store to MEMREF. |
3204 | |
3205 Callers should not call this routine if START is after END in the | |
3206 RTL chain. */ | |
3207 | |
2171 static int | 3208 static int |
2172 memref_used_between_p (rtx memref, rtx start, rtx end) | 3209 memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end) |
2173 { | 3210 { |
2174 rtx insn; | 3211 rtx_insn *insn; |
2175 | 3212 |
2176 for (insn = NEXT_INSN (start); insn != NEXT_INSN (end); | 3213 for (insn = NEXT_INSN (start); |
3214 insn && insn != NEXT_INSN (end); | |
2177 insn = NEXT_INSN (insn)) | 3215 insn = NEXT_INSN (insn)) |
2178 { | 3216 { |
2179 if (!NONDEBUG_INSN_P (insn)) | 3217 if (!NONDEBUG_INSN_P (insn)) |
2180 continue; | 3218 continue; |
2181 | 3219 |
2185 /* Nonconst functions may access memory. */ | 3223 /* Nonconst functions may access memory. */ |
2186 if (CALL_P (insn) && (! RTL_CONST_CALL_P (insn))) | 3224 if (CALL_P (insn) && (! RTL_CONST_CALL_P (insn))) |
2187 return 1; | 3225 return 1; |
2188 } | 3226 } |
2189 | 3227 |
3228 gcc_assert (insn == NEXT_INSN (end)); | |
2190 return 0; | 3229 return 0; |
2191 } | 3230 } |
2192 | 3231 |
2193 /* Mark REG as having no known equivalence. | 3232 /* Mark REG as having no known equivalence. |
2194 Some instructions might have been processed before and furnished | 3233 Some instructions might have been processed before and furnished |
2196 removed. | 3235 removed. |
2197 STORE is the piece of RTL that does the non-constant / conflicting | 3236 STORE is the piece of RTL that does the non-constant / conflicting |
2198 assignment - a SET, CLOBBER or REG_INC note. It is currently not used, | 3237 assignment - a SET, CLOBBER or REG_INC note. It is currently not used, |
2199 but needs to be there because this function is called from note_stores. */ | 3238 but needs to be there because this function is called from note_stores. */ |
2200 static void | 3239 static void |
2201 no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED, void *data ATTRIBUTE_UNUSED) | 3240 no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED, |
3241 void *data ATTRIBUTE_UNUSED) | |
2202 { | 3242 { |
2203 int regno; | 3243 int regno; |
2204 rtx list; | 3244 rtx_insn_list *list; |
2205 | 3245 |
2206 if (!REG_P (reg)) | 3246 if (!REG_P (reg)) |
2207 return; | 3247 return; |
2208 regno = REGNO (reg); | 3248 regno = REGNO (reg); |
3249 reg_equiv[regno].no_equiv = 1; | |
2209 list = reg_equiv[regno].init_insns; | 3250 list = reg_equiv[regno].init_insns; |
2210 if (list == const0_rtx) | 3251 if (list && list->insn () == NULL) |
2211 return; | 3252 return; |
2212 reg_equiv[regno].init_insns = const0_rtx; | 3253 reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, NULL_RTX, NULL); |
2213 reg_equiv[regno].replacement = NULL_RTX; | 3254 reg_equiv[regno].replacement = NULL_RTX; |
2214 /* This doesn't matter for equivalences made for argument registers, we | 3255 /* This doesn't matter for equivalences made for argument registers, we |
2215 should keep their initialization insns. */ | 3256 should keep their initialization insns. */ |
2216 if (reg_equiv[regno].is_arg_equivalence) | 3257 if (reg_equiv[regno].is_arg_equivalence) |
2217 return; | 3258 return; |
2218 reg_equiv_init[regno] = NULL_RTX; | 3259 ira_reg_equiv[regno].defined_p = false; |
2219 for (; list; list = XEXP (list, 1)) | 3260 ira_reg_equiv[regno].init_insns = NULL; |
2220 { | 3261 for (; list; list = list->next ()) |
2221 rtx insn = XEXP (list, 0); | 3262 { |
3263 rtx_insn *insn = list->insn (); | |
2222 remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX)); | 3264 remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX)); |
3265 } | |
3266 } | |
3267 | |
3268 /* Scan INSN for paradoxical subregs and set the pdx_subregs flag in | |
3269 reg_equiv for every register mentioned in one. */ | |
3270 | |
3271 static void | |
3272 set_paradoxical_subreg (rtx_insn *insn) | |
3273 { | |
3274 subrtx_iterator::array_type array; | |
3275 FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST) | |
3276 { | |
3277 const_rtx subreg = *iter; | |
3278 if (GET_CODE (subreg) == SUBREG) | |
3279 { | |
3280 const_rtx reg = SUBREG_REG (subreg); | |
3281 if (REG_P (reg) && paradoxical_subreg_p (subreg)) | |
3282 reg_equiv[REGNO (reg)].pdx_subregs = true; | |
3283 } | |
2223 } | 3284 } |
2224 } | 3285 } |
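Note: a paradoxical subreg is one whose outer mode is wider than the mode of the register inside it, e.g. (subreg:DI (reg:SI x) 0) on a 32-bit target. Reading it touches bytes beyond the underlying object, which is why set_paradoxical_subreg flags such registers so that update_equiv_regs never gives them a memory equivalence. A trivial standalone illustration of the size test (byte sizes are assumptions, not GCC code):

#include <stdbool.h>
#include <stdio.h>

static bool
paradoxical_p (unsigned int outer_mode_size, unsigned int inner_mode_size)
{
  /* The access is paradoxical when the outer mode covers more bytes than
     the inner register provides.  */
  return outer_mode_size > inner_mode_size;
}

int
main (void)
{
  printf ("%d\n", paradoxical_p (8, 4));   /* DImode over an SImode reg: 1 */
  printf ("%d\n", paradoxical_p (4, 8));   /* SImode over a DImode reg: 0 */
  return 0;
}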
2225 | 3286 |
2226 /* In DEBUG_INSN location adjust REGs from CLEARED_REGS bitmap to the | 3287 /* In DEBUG_INSN location adjust REGs from CLEARED_REGS bitmap to the |
2227 equivalent replacement. */ | 3288 equivalent replacement. */ |
2231 { | 3292 { |
2232 if (REG_P (loc)) | 3293 if (REG_P (loc)) |
2233 { | 3294 { |
2234 bitmap cleared_regs = (bitmap) data; | 3295 bitmap cleared_regs = (bitmap) data; |
2235 if (bitmap_bit_p (cleared_regs, REGNO (loc))) | 3296 if (bitmap_bit_p (cleared_regs, REGNO (loc))) |
2236 return simplify_replace_fn_rtx (*reg_equiv[REGNO (loc)].src_p, | 3297 return simplify_replace_fn_rtx (copy_rtx (*reg_equiv[REGNO (loc)].src_p), |
2237 NULL_RTX, adjust_cleared_regs, data); | 3298 NULL_RTX, adjust_cleared_regs, data); |
2238 } | 3299 } |
2239 return NULL_RTX; | 3300 return NULL_RTX; |
2240 } | 3301 } |
2241 | 3302 |
2242 /* Nonzero if we recorded an equivalence for a LABEL_REF. */ | 3303 /* Given register REGNO is set only once, return true if the defining |
2243 static int recorded_label_ref; | 3304 insn dominates all uses. */ |
3305 | |
3306 static bool | |
3307 def_dominates_uses (int regno) | |
3308 { | |
3309 df_ref def = DF_REG_DEF_CHAIN (regno); | |
3310 | |
3311 struct df_insn_info *def_info = DF_REF_INSN_INFO (def); | |
3312 /* If this is an artificial def (eh handler regs, hard frame pointer | |
3313 for non-local goto, regs defined on function entry) then def_info | |
3314 is NULL and the reg is always live before any use. We might | |
3315 reasonably return true in that case, but since the only call | |
3316 of this function is currently here in ira.c when we are looking | |
3317 at a defining insn we can't have an artificial def as that would | |
3318 bump DF_REG_DEF_COUNT. */ | |
3319 gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && def_info != NULL); | |
3320 | |
3321 rtx_insn *def_insn = DF_REF_INSN (def); | |
3322 basic_block def_bb = BLOCK_FOR_INSN (def_insn); | |
3323 | |
3324 for (df_ref use = DF_REG_USE_CHAIN (regno); | |
3325 use; | |
3326 use = DF_REF_NEXT_REG (use)) | |
3327 { | |
3328 struct df_insn_info *use_info = DF_REF_INSN_INFO (use); | |
3329 /* Only check real uses, not artificial ones. */ | |
3330 if (use_info) | |
3331 { | |
3332 rtx_insn *use_insn = DF_REF_INSN (use); | |
3333 if (!DEBUG_INSN_P (use_insn)) | |
3334 { | |
3335 basic_block use_bb = BLOCK_FOR_INSN (use_insn); | |
3336 if (use_bb != def_bb | |
3337 ? !dominated_by_p (CDI_DOMINATORS, use_bb, def_bb) | |
3338 : DF_INSN_INFO_LUID (use_info) < DF_INSN_INFO_LUID (def_info)) | |
3339 return false; | |
3340 } | |
3341 } | |
3342 } | |
3343 return true; | |
3344 } | |
2244 | 3345 |
2245 /* Find registers that are equivalent to a single value throughout the | 3346 /* Find registers that are equivalent to a single value throughout the |
2246 compilation (either because they can be referenced in memory or are set once | 3347 compilation (either because they can be referenced in memory or are |
2247 from a single constant). Lower their priority for a register. | 3348 set once from a single constant). Lower their priority for a |
2248 | 3349 register. |
2249 If such a register is only referenced once, try substituting its value | 3350 |
2250 into the using insn. If it succeeds, we can eliminate the register | 3351 If such a register is only referenced once, try substituting its |
2251 completely. | 3352 value into the using insn. If it succeeds, we can eliminate the |
2252 | 3353 register completely. |
2253 Initialize the REG_EQUIV_INIT array of initializing insns. | 3354 |
2254 | 3355 Initialize init_insns in ira_reg_equiv array. */ |
2255 Return non-zero if jump label rebuilding should be done. */ | 3356 static void |
2256 static int | |
2257 update_equiv_regs (void) | 3357 update_equiv_regs (void) |
2258 { | 3358 { |
2259 rtx insn; | 3359 rtx_insn *insn; |
2260 basic_block bb; | 3360 basic_block bb; |
2261 int loop_depth; | 3361 |
2262 bitmap cleared_regs; | 3362 /* Scan insns and set pdx_subregs if the reg is used in a |
2263 | 3363 paradoxical subreg. Don't set such reg equivalent to a mem, |
2264 /* We need to keep track of whether or not we recorded a LABEL_REF so | 3364 because lra will not substitute such equiv memory in order to |
2265 that we know if the jump optimizer needs to be rerun. */ | 3365 prevent access beyond allocated memory for paradoxical memory subreg. */ |
2266 recorded_label_ref = 0; | 3366 FOR_EACH_BB_FN (bb, cfun) |
2267 | 3367 FOR_BB_INSNS (bb, insn) |
2268 reg_equiv = XCNEWVEC (struct equivalence, max_regno); | 3368 if (NONDEBUG_INSN_P (insn)) |
2269 reg_equiv_init = ggc_alloc_cleared_vec_rtx (max_regno); | 3369 set_paradoxical_subreg (insn); |
2270 reg_equiv_init_size = max_regno; | |
2271 | |
2272 init_alias_analysis (); | |
2273 | 3370 |
2274 /* Scan the insns and find which registers have equivalences. Do this | 3371 /* Scan the insns and find which registers have equivalences. Do this |
2275 in a separate scan of the insns because (due to -fcse-follow-jumps) | 3372 in a separate scan of the insns because (due to -fcse-follow-jumps) |
2276 a register can be set below its use. */ | 3373 a register can be set below its use. */ |
2277 FOR_EACH_BB (bb) | 3374 bitmap setjmp_crosses = regstat_get_setjmp_crosses (); |
2278 { | 3375 FOR_EACH_BB_FN (bb, cfun) |
2279 loop_depth = bb->loop_depth; | 3376 { |
3377 int loop_depth = bb_loop_depth (bb); | |
2280 | 3378 |
2281 for (insn = BB_HEAD (bb); | 3379 for (insn = BB_HEAD (bb); |
2282 insn != NEXT_INSN (BB_END (bb)); | 3380 insn != NEXT_INSN (BB_END (bb)); |
2283 insn = NEXT_INSN (insn)) | 3381 insn = NEXT_INSN (insn)) |
2284 { | 3382 { |
2296 | 3394 |
2297 set = single_set (insn); | 3395 set = single_set (insn); |
2298 | 3396 |
2299 /* If this insn contains more (or less) than a single SET, | 3397 /* If this insn contains more (or less) than a single SET, |
2300 only mark all destinations as having no known equivalence. */ | 3398 only mark all destinations as having no known equivalence. */ |
2301 if (set == 0) | 3399 if (set == NULL_RTX |
3400 || side_effects_p (SET_SRC (set))) | |
2302 { | 3401 { |
2303 note_stores (PATTERN (insn), no_equiv, NULL); | 3402 note_stores (PATTERN (insn), no_equiv, NULL); |
2304 continue; | 3403 continue; |
2305 } | 3404 } |
2306 else if (GET_CODE (PATTERN (insn)) == PARALLEL) | 3405 else if (GET_CODE (PATTERN (insn)) == PARALLEL) |
2324 if (note) | 3423 if (note) |
2325 { | 3424 { |
2326 gcc_assert (REG_P (dest)); | 3425 gcc_assert (REG_P (dest)); |
2327 regno = REGNO (dest); | 3426 regno = REGNO (dest); |
2328 | 3427 |
2329 /* Note that we don't want to clear reg_equiv_init even if there | 3428 /* Note that we don't want to clear init_insns in |
2330 are multiple sets of this register. */ | 3429 ira_reg_equiv even if there are multiple sets of this |
3430 register. */ | |
2331 reg_equiv[regno].is_arg_equivalence = 1; | 3431 reg_equiv[regno].is_arg_equivalence = 1; |
2332 | 3432 |
2333 /* Record for reload that this is an equivalencing insn. */ | 3433 /* The insn result can have equivalence memory although |
2334 if (rtx_equal_p (src, XEXP (note, 0))) | 3434 the equivalence is not set up by the insn. We add |
2335 reg_equiv_init[regno] | 3435 this insn to init insns as it is a flag for now that |
2336 = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init[regno]); | 3436 regno has an equivalence. We will remove the insn |
3437 from init insn list later. */ | |
3438 if (rtx_equal_p (src, XEXP (note, 0)) || MEM_P (XEXP (note, 0))) | |
3439 ira_reg_equiv[regno].init_insns | |
3440 = gen_rtx_INSN_LIST (VOIDmode, insn, | |
3441 ira_reg_equiv[regno].init_insns); | |
2337 | 3442 |
2338 /* Continue normally in case this is a candidate for | 3443 /* Continue normally in case this is a candidate for |
2339 replacements. */ | 3444 replacements. */ |
2340 } | 3445 } |
2341 | 3446 |
2354 preferred class of a pseudo depends on all instructions that set | 3459 preferred class of a pseudo depends on all instructions that set |
2355 or use it. */ | 3460 or use it. */ |
2356 | 3461 |
2357 if (!REG_P (dest) | 3462 if (!REG_P (dest) |
2358 || (regno = REGNO (dest)) < FIRST_PSEUDO_REGISTER | 3463 || (regno = REGNO (dest)) < FIRST_PSEUDO_REGISTER |
2359 || reg_equiv[regno].init_insns == const0_rtx | 3464 || (reg_equiv[regno].init_insns |
3465 && reg_equiv[regno].init_insns->insn () == NULL) | |
2360 || (targetm.class_likely_spilled_p (reg_preferred_class (regno)) | 3466 || (targetm.class_likely_spilled_p (reg_preferred_class (regno)) |
2361 && MEM_P (src) && ! reg_equiv[regno].is_arg_equivalence)) | 3467 && MEM_P (src) && ! reg_equiv[regno].is_arg_equivalence)) |
2362 { | 3468 { |
2363 /* This might be setting a SUBREG of a pseudo, a pseudo that is | 3469 /* This might be setting a SUBREG of a pseudo, a pseudo that is |
2364 also set somewhere else to a constant. */ | 3470 also set somewhere else to a constant. */ |
2365 note_stores (set, no_equiv, NULL); | 3471 note_stores (set, no_equiv, NULL); |
2366 continue; | 3472 continue; |
2367 } | 3473 } |
2368 | 3474 |
3475 /* Don't set reg mentioned in a paradoxical subreg | |
3476 equivalent to a mem. */ | |
3477 if (MEM_P (src) && reg_equiv[regno].pdx_subregs) | |
3478 { | |
3479 note_stores (set, no_equiv, NULL); | |
3480 continue; | |
3481 } | |
3482 | |
2369 note = find_reg_note (insn, REG_EQUAL, NULL_RTX); | 3483 note = find_reg_note (insn, REG_EQUAL, NULL_RTX); |
2370 | 3484 |
2371 /* cse sometimes generates function invariants, but doesn't put a | 3485 /* cse sometimes generates function invariants, but doesn't put a |
2372 REG_EQUAL note on the insn. Since this note would be redundant, | 3486 REG_EQUAL note on the insn. Since this note would be redundant, |
2373 there's no point creating it earlier than here. */ | 3487 there's no point creating it earlier than here. */ |
2374 if (! note && ! rtx_varies_p (src, 0)) | 3488 if (! note && ! rtx_varies_p (src, 0)) |
2375 note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src)); | 3489 note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src)); |
2376 | 3490 |
2377 /* Don't bother considering a REG_EQUAL note containing an EXPR_LIST | 3491 /* Don't bother considering a REG_EQUAL note containing an EXPR_LIST |
2378 since it represents a function call */ | 3492 since it represents a function call. */ |
2379 if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST) | 3493 if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST) |
2380 note = NULL_RTX; | 3494 note = NULL_RTX; |
2381 | 3495 |
2382 if (DF_REG_DEF_COUNT (regno) != 1 | 3496 if (DF_REG_DEF_COUNT (regno) != 1) |
2383 && (! note | 3497 { |
3498 bool equal_p = true; | |
3499 rtx_insn_list *list; | |
3500 | |
3501 /* If we have already processed this pseudo and determined it | |
3502 cannot have an equivalence, then honor that decision. */ | 
3503 if (reg_equiv[regno].no_equiv) | |
3504 continue; | |
3505 | |
3506 if (! note | |
2384 || rtx_varies_p (XEXP (note, 0), 0) | 3507 || rtx_varies_p (XEXP (note, 0), 0) |
2385 || (reg_equiv[regno].replacement | 3508 || (reg_equiv[regno].replacement |
2386 && ! rtx_equal_p (XEXP (note, 0), | 3509 && ! rtx_equal_p (XEXP (note, 0), |
2387 reg_equiv[regno].replacement)))) | 3510 reg_equiv[regno].replacement))) |
2388 { | 3511 { |
2389 no_equiv (dest, set, NULL); | 3512 no_equiv (dest, set, NULL); |
2390 continue; | 3513 continue; |
3514 } | |
3515 | |
3516 list = reg_equiv[regno].init_insns; | |
3517 for (; list; list = list->next ()) | |
3518 { | |
3519 rtx note_tmp; | |
3520 rtx_insn *insn_tmp; | |
3521 | |
3522 insn_tmp = list->insn (); | |
3523 note_tmp = find_reg_note (insn_tmp, REG_EQUAL, NULL_RTX); | |
3524 gcc_assert (note_tmp); | |
3525 if (! rtx_equal_p (XEXP (note, 0), XEXP (note_tmp, 0))) | |
3526 { | |
3527 equal_p = false; | |
3528 break; | |
3529 } | |
3530 } | |
3531 | |
3532 if (! equal_p) | |
3533 { | |
3534 no_equiv (dest, set, NULL); | |
3535 continue; | |
3536 } | |
2391 } | 3537 } |
3538 | |
2392 /* Record this insn as initializing this register. */ | 3539 /* Record this insn as initializing this register. */ |
2393 reg_equiv[regno].init_insns | 3540 reg_equiv[regno].init_insns |
2394 = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv[regno].init_insns); | 3541 = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv[regno].init_insns); |
2395 | 3542 |
2396 /* If this register is known to be equal to a constant, record that | 3543 /* If this register is known to be equal to a constant, record that |
2397 it is always equivalent to the constant. */ | 3544 it is always equivalent to the constant. |
3545 Note that it is possible to have a register use before | |
3546 the def in loops (see gcc.c-torture/execute/pr79286.c) | |
3547 where the reg is undefined on first use. If the def insn | |
3548 won't trap we can use it as an equivalence, effectively | |
3549 choosing the "undefined" value for the reg to be the | |
3550 same as the value set by the def. */ | |
2398 if (DF_REG_DEF_COUNT (regno) == 1 | 3551 if (DF_REG_DEF_COUNT (regno) == 1 |
2399 && note && ! rtx_varies_p (XEXP (note, 0), 0)) | 3552 && note |
3553 && !rtx_varies_p (XEXP (note, 0), 0) | |
3554 && (!may_trap_or_fault_p (XEXP (note, 0)) | |
3555 || def_dominates_uses (regno))) | |
2400 { | 3556 { |
2401 rtx note_value = XEXP (note, 0); | 3557 rtx note_value = XEXP (note, 0); |
2402 remove_note (insn, note); | 3558 remove_note (insn, note); |
2403 set_unique_reg_note (insn, REG_EQUIV, note_value); | 3559 set_unique_reg_note (insn, REG_EQUIV, note_value); |
2404 } | 3560 } |
2415 | 3571 |
2416 If we don't have a REG_EQUIV note, see if this insn is loading | 3572 If we don't have a REG_EQUIV note, see if this insn is loading |
2417 a register used only in one basic block from a MEM. If so, and the | 3573 a register used only in one basic block from a MEM. If so, and the |
2418 MEM remains unchanged for the life of the register, add a REG_EQUIV | 3574 MEM remains unchanged for the life of the register, add a REG_EQUIV |
2419 note. */ | 3575 note. */ |
2420 | |
2421 note = find_reg_note (insn, REG_EQUIV, NULL_RTX); | 3576 note = find_reg_note (insn, REG_EQUIV, NULL_RTX); |
2422 | 3577 |
2423 if (note == 0 && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS | 3578 rtx replacement = NULL_RTX; |
2424 && MEM_P (SET_SRC (set)) | |
2425 && validate_equiv_mem (insn, dest, SET_SRC (set))) | |
2426 note = set_unique_reg_note (insn, REG_EQUIV, copy_rtx (SET_SRC (set))); | |
2427 | |
2428 if (note) | 3579 if (note) |
3580 replacement = XEXP (note, 0); | |
3581 else if (REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS | |
3582 && MEM_P (SET_SRC (set))) | |
2429 { | 3583 { |
2430 int regno = REGNO (dest); | 3584 enum valid_equiv validity; |
2431 rtx x = XEXP (note, 0); | 3585 validity = validate_equiv_mem (insn, dest, SET_SRC (set)); |
2432 | 3586 if (validity != valid_none) |
2433 /* If we haven't done so, record for reload that this is an | 3587 { |
2434 equivalencing insn. */ | 3588 replacement = copy_rtx (SET_SRC (set)); |
2435 if (!reg_equiv[regno].is_arg_equivalence) | 3589 if (validity == valid_reload) |
2436 reg_equiv_init[regno] | 3590 note = set_unique_reg_note (insn, REG_EQUIV, replacement); |
2437 = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init[regno]); | 3591 } |
2438 | 3592 } |
2439 /* Record whether or not we created a REG_EQUIV note for a LABEL_REF. | 3593 |
2440 We might end up substituting the LABEL_REF for uses of the | 3594 /* If we haven't done so, record for reload that this is an |
2441 pseudo here or later. That kind of transformation may turn an | 3595 equivalencing insn. */ |
2442 indirect jump into a direct jump, in which case we must rerun the | 3596 if (note && !reg_equiv[regno].is_arg_equivalence) |
2443 jump optimizer to ensure that the JUMP_LABEL fields are valid. */ | 3597 ira_reg_equiv[regno].init_insns |
2444 if (GET_CODE (x) == LABEL_REF | 3598 = gen_rtx_INSN_LIST (VOIDmode, insn, |
2445 || (GET_CODE (x) == CONST | 3599 ira_reg_equiv[regno].init_insns); |
2446 && GET_CODE (XEXP (x, 0)) == PLUS | 3600 |
2447 && (GET_CODE (XEXP (XEXP (x, 0), 0)) == LABEL_REF))) | 3601 if (replacement) |
2448 recorded_label_ref = 1; | 3602 { |
2449 | 3603 reg_equiv[regno].replacement = replacement; |
2450 reg_equiv[regno].replacement = x; | |
2451 reg_equiv[regno].src_p = &SET_SRC (set); | 3604 reg_equiv[regno].src_p = &SET_SRC (set); |
2452 reg_equiv[regno].loop_depth = loop_depth; | 3605 reg_equiv[regno].loop_depth = (short) loop_depth; |
2453 | 3606 |
2454 /* Don't mess with things live during setjmp. */ | 3607 /* Don't mess with things live during setjmp. */ |
2455 if (REG_LIVE_LENGTH (regno) >= 0 && optimize) | 3608 if (optimize && !bitmap_bit_p (setjmp_crosses, regno)) |
2456 { | 3609 { |
2457 /* Note that the statement below does not affect the priority | |
2458 in local-alloc! */ | |
2459 REG_LIVE_LENGTH (regno) *= 2; | |
2460 | |
2461 /* If the register is referenced exactly twice, meaning it is | 3610 /* If the register is referenced exactly twice, meaning it is |
2462 set once and used once, indicate that the reference may be | 3611 set once and used once, indicate that the reference may be |
2463 replaced by the equivalence we computed above. Do this | 3612 replaced by the equivalence we computed above. Do this |
2464 even if the register is only used in one block so that | 3613 even if the register is only used in one block so that |
2465 dependencies can be handled where the last register is | 3614 dependencies can be handled where the last register is |
2466 used in a different block (i.e. HIGH / LO_SUM sequences) | 3615 used in a different block (i.e. HIGH / LO_SUM sequences) |
2467 and to reduce the number of registers alive across | 3616 and to reduce the number of registers alive across |
2468 calls. */ | 3617 calls. */ |
2469 | 3618 |
2470 if (REG_N_REFS (regno) == 2 | 3619 if (REG_N_REFS (regno) == 2 |
2471 && (rtx_equal_p (x, src) | 3620 && (rtx_equal_p (replacement, src) |
2472 || ! equiv_init_varies_p (src)) | 3621 || ! equiv_init_varies_p (src)) |
2473 && NONJUMP_INSN_P (insn) | 3622 && NONJUMP_INSN_P (insn) |
2474 && equiv_init_movable_p (PATTERN (insn), regno)) | 3623 && equiv_init_movable_p (PATTERN (insn), regno)) |
2475 reg_equiv[regno].replace = 1; | 3624 reg_equiv[regno].replace = 1; |
2476 } | 3625 } |
2477 } | 3626 } |
2478 } | 3627 } |
2479 } | 3628 } |
2480 | 3629 } |
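
The loop above keeps an equivalence for a multiply-defined pseudo only while every defining insn carries the same REG_EQUAL value; a def with no usable note, or with a mismatching value, drops the equivalence for good. Below is a minimal standalone sketch of that bookkeeping, using a hypothetical equiv_rec type and plain strings in place of GCC's reg_equiv records and REG_EQUAL rtx values.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-pseudo record, loosely modelled on the bookkeeping
   above: the first equivalent value seen, plus a sticky "no
   equivalence" flag.  Plain strings stand in for REG_EQUAL values.  */
struct equiv_rec {
  char value[64];
  bool has_value;
  bool no_equiv;
};

/* Record one defining insn of the pseudo whose REG_EQUAL note is VALUE
   (NULL when the insn has no usable note).  A missing or mismatching
   note kills the equivalence for good, as in the scan above.  */
static void record_def (struct equiv_rec *r, const char *value)
{
  if (r->no_equiv)
    return;
  if (value == NULL
      || (r->has_value && strcmp (r->value, value) != 0))
    {
      r->no_equiv = true;
      r->has_value = false;
      return;
    }
  if (!r->has_value)
    {
      snprintf (r->value, sizeof r->value, "%s", value);
      r->has_value = true;
    }
}

int main (void)
{
  struct equiv_rec r = { "", false, false };
  record_def (&r, "(const_int 42)");
  record_def (&r, "(const_int 42)");   /* same value: equivalence kept */
  printf ("equiv: %s\n", r.has_value ? r.value : "<none>");
  record_def (&r, "(const_int 7)");    /* mismatch: equivalence dropped */
  printf ("equiv: %s\n", r.has_value ? r.value : "<none>");
  return 0;
}
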
2481 if (!optimize) | 3630 |
2482 goto out; | 3631 /* For insns that set a MEM to the contents of a REG that is only used |
2483 | 3632 in a single basic block, see if the register is always equivalent |
2484 /* A second pass, to gather additional equivalences with memory. This needs | 3633 to that memory location and if moving the store from INSN to the |
2485 to be done after we know which registers we are going to replace. */ | 3634 insn that sets REG is safe. If so, put a REG_EQUIV note on the |
2486 | 3635 initializing insn. */ |
2487 for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) | 3636 static void |
3637 add_store_equivs (void) | |
3638 { | |
3639 auto_bitmap seen_insns; | |
3640 | |
3641 for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn)) | |
2488 { | 3642 { |
2489 rtx set, src, dest; | 3643 rtx set, src, dest; |
2490 unsigned regno; | 3644 unsigned regno; |
3645 rtx_insn *init_insn; | |
3646 | |
3647 bitmap_set_bit (seen_insns, INSN_UID (insn)); | |
2491 | 3648 |
2492 if (! INSN_P (insn)) | 3649 if (! INSN_P (insn)) |
2493 continue; | 3650 continue; |
2494 | 3651 |
2495 set = single_set (insn); | 3652 set = single_set (insn); |
2497 continue; | 3654 continue; |
2498 | 3655 |
2499 dest = SET_DEST (set); | 3656 dest = SET_DEST (set); |
2500 src = SET_SRC (set); | 3657 src = SET_SRC (set); |
2501 | 3658 |
2502 /* If this sets a MEM to the contents of a REG that is only used | 3659 /* Don't add a REG_EQUIV note if the insn already has one. The existing |
2503 in a single basic block, see if the register is always equivalent | 3660 REG_EQUIV is likely more useful than the one we are adding. */ |
2504 to that memory location and if moving the store from INSN to the | |
2505 insn that set REG is safe. If so, put a REG_EQUIV note on the | |
2506 initializing insn. | |
2507 | |
2508 Don't add a REG_EQUIV note if the insn already has one. The existing | |
2509 REG_EQUIV is likely more useful than the one we are adding. | |
2510 | |
2511 If one of the regs in the address has reg_equiv[REGNO].replace set, | |
2512 then we can't add this REG_EQUIV note. The reg_equiv[REGNO].replace | |
2513 optimization may move the set of this register immediately before | |
2514 insn, which puts it after reg_equiv[REGNO].init_insns, and hence | |
2515 the mention in the REG_EQUIV note would be to an uninitialized | |
2516 pseudo. */ | |
2517 | |
2518 if (MEM_P (dest) && REG_P (src) | 3661 if (MEM_P (dest) && REG_P (src) |
2519 && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER | 3662 && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER |
2520 && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS | 3663 && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS |
2521 && DF_REG_DEF_COUNT (regno) == 1 | 3664 && DF_REG_DEF_COUNT (regno) == 1 |
2522 && reg_equiv[regno].init_insns != 0 | 3665 && ! reg_equiv[regno].pdx_subregs |
2523 && reg_equiv[regno].init_insns != const0_rtx | 3666 && reg_equiv[regno].init_insns != NULL |
2524 && ! find_reg_note (XEXP (reg_equiv[regno].init_insns, 0), | 3667 && (init_insn = reg_equiv[regno].init_insns->insn ()) != 0 |
2525 REG_EQUIV, NULL_RTX) | 3668 && bitmap_bit_p (seen_insns, INSN_UID (init_insn)) |
2526 && ! contains_replace_regs (XEXP (dest, 0))) | 3669 && ! find_reg_note (init_insn, REG_EQUIV, NULL_RTX) |
3670 && validate_equiv_mem (init_insn, src, dest) == valid_reload | |
3671 && ! memref_used_between_p (dest, init_insn, insn) | |
3672 /* Attaching a REG_EQUIV note will fail if INIT_INSN has | |
3673 multiple sets. */ | |
3674 && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest))) | |
2527 { | 3675 { |
2528 rtx init_insn = XEXP (reg_equiv[regno].init_insns, 0); | 3676 /* This insn makes the equivalence, not the one initializing |
2529 if (validate_equiv_mem (init_insn, src, dest) | 3677 the register. */ |
2530 && ! memref_used_between_p (dest, init_insn, insn) | 3678 ira_reg_equiv[regno].init_insns |
2531 /* Attaching a REG_EQUIV note will fail if INIT_INSN has | 3679 = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX); |
2532 multiple sets. */ | 3680 df_notes_rescan (init_insn); |
2533 && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest))) | 3681 if (dump_file) |
2534 { | 3682 fprintf (dump_file, |
2535 /* This insn makes the equivalence, not the one initializing | 3683 "Adding REG_EQUIV to insn %d for source of insn %d\n", |
2536 the register. */ | 3684 INSN_UID (init_insn), |
2537 reg_equiv_init[regno] | 3685 INSN_UID (insn)); |
2538 = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX); | |
2539 df_notes_rescan (init_insn); | |
2540 } | |
2541 } | 3686 } |
2542 } | 3687 } |
2543 | 3688 } |
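
add_store_equivs above only attaches a REG_EQUIV note naming the stored MEM when a chain of cheap tests all pass. A condensed restatement of that filter follows, as plain booleans; the struct and its fields are illustrative stand-ins for the df and validate_equiv_mem queries, not real GCC calls.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical summary of one candidate store insn "mem = reg".  */
struct store_candidate {
  bool src_is_pseudo;          /* REG is a pseudo, not a hard register   */
  bool single_block;           /* REG is used in only one basic block    */
  bool single_def;             /* REG has exactly one definition         */
  bool def_seen_earlier;       /* the def was scanned before this store  */
  bool def_has_equiv_note;     /* the def already carries REG_EQUIV      */
  bool mem_valid_over_range;   /* MEM unchanged between def and store    */
};

/* Every test must pass before a REG_EQUIV naming the MEM is attached
   to the defining insn.  */
static bool store_equiv_ok (const struct store_candidate *c)
{
  return c->src_is_pseudo
         && c->single_block
         && c->single_def
         && c->def_seen_earlier
         && !c->def_has_equiv_note
         && c->mem_valid_over_range;
}

int main (void)
{
  struct store_candidate c = { true, true, true, true, false, true };
  printf ("%d\n", store_equiv_ok (&c));   /* prints 1 */
  return 0;
}
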
2544 cleared_regs = BITMAP_ALLOC (NULL); | 3689 |
2545 /* Now scan all regs killed in an insn to see if any of them are | 3690 /* Scan all regs killed in an insn to see if any of them are registers |
2546 registers that are used only once. If so, see if we can replace the | 3691 that are used only once. If so, see if we can replace the reference
2547 reference with the equivalent form. If we can, delete the | 3692 with the equivalent form. If we can, delete the initializing |
2548 initializing reference and this register will go away. If we | 3693 reference and this register will go away. If we can't replace the |
2549 can't replace the reference, and the initializing reference is | 3694 reference, and the initializing reference is within the same loop |
2550 within the same loop (or in an inner loop), then move the register | 3695 (or in an inner loop), then move the register initialization just |
2551 initialization just before the use, so that they are in the same | 3696 before the use, so that they are in the same basic block. */ |
2552 basic block. */ | 3697 static void |
2553 FOR_EACH_BB_REVERSE (bb) | 3698 combine_and_move_insns (void) |
2554 { | 3699 { |
2555 loop_depth = bb->loop_depth; | 3700 auto_bitmap cleared_regs; |
2556 for (insn = BB_END (bb); | 3701 int max = max_reg_num (); |
2557 insn != PREV_INSN (BB_HEAD (bb)); | 3702 |
2558 insn = PREV_INSN (insn)) | 3703 for (int regno = FIRST_PSEUDO_REGISTER; regno < max; regno++) |
3704 { | |
3705 if (!reg_equiv[regno].replace) | |
3706 continue; | |
3707 | |
3708 rtx_insn *use_insn = 0; | |
3709 for (df_ref use = DF_REG_USE_CHAIN (regno); | |
3710 use; | |
3711 use = DF_REF_NEXT_REG (use)) | |
3712 if (DF_REF_INSN_INFO (use)) | |
3713 { | |
3714 if (DEBUG_INSN_P (DF_REF_INSN (use))) | |
3715 continue; | |
3716 gcc_assert (!use_insn); | |
3717 use_insn = DF_REF_INSN (use); | |
3718 } | |
3719 gcc_assert (use_insn); | |
3720 | |
3721 /* Don't substitute into jumps. indirect_jump_optimize does | |
3722 this for anything we are prepared to handle. */ | |
3723 if (JUMP_P (use_insn)) | |
3724 continue; | |
3725 | |
3726 /* Also don't substitute into a conditional trap insn -- it can become | |
3727 an unconditional trap, and that is a flow control insn. */ | |
3728 if (GET_CODE (PATTERN (use_insn)) == TRAP_IF) | |
3729 continue; | |
3730 | |
3731 df_ref def = DF_REG_DEF_CHAIN (regno); | |
3732 gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && DF_REF_INSN_INFO (def)); | |
3733 rtx_insn *def_insn = DF_REF_INSN (def); | |
3734 | |
3735 /* We may not move instructions that can throw, since that | |
3736 changes basic block boundaries and we are not prepared to | |
3737 adjust the CFG to match. */ | |
3738 if (can_throw_internal (def_insn)) | |
3739 continue; | |
3740 | |
3741 basic_block use_bb = BLOCK_FOR_INSN (use_insn); | |
3742 basic_block def_bb = BLOCK_FOR_INSN (def_insn); | |
3743 if (bb_loop_depth (use_bb) > bb_loop_depth (def_bb)) | |
3744 continue; | |
3745 | |
3746 if (asm_noperands (PATTERN (def_insn)) < 0 | |
3747 && validate_replace_rtx (regno_reg_rtx[regno], | |
3748 *reg_equiv[regno].src_p, use_insn)) | |
2559 { | 3749 { |
2560 rtx link; | 3750 rtx link; |
2561 | 3751 /* Append the REG_DEAD notes from def_insn. */ |
2562 if (! INSN_P (insn)) | 3752 for (rtx *p = ®_NOTES (def_insn); (link = *p) != 0; ) |
3753 { | |
3754 if (REG_NOTE_KIND (XEXP (link, 0)) == REG_DEAD) | |
3755 { | |
3756 *p = XEXP (link, 1); | |
3757 XEXP (link, 1) = REG_NOTES (use_insn); | |
3758 REG_NOTES (use_insn) = link; | |
3759 } | |
3760 else | |
3761 p = &XEXP (link, 1); | |
3762 } | |
3763 | |
3764 remove_death (regno, use_insn); | |
3765 SET_REG_N_REFS (regno, 0); | |
3766 REG_FREQ (regno) = 0; | |
3767 df_ref use; | |
3768 FOR_EACH_INSN_USE (use, def_insn) | |
3769 { | |
3770 unsigned int use_regno = DF_REF_REGNO (use); | |
3771 if (!HARD_REGISTER_NUM_P (use_regno)) | |
3772 reg_equiv[use_regno].replace = 0; | |
3773 } | |
3774 | |
3775 delete_insn (def_insn); | |
3776 | |
3777 reg_equiv[regno].init_insns = NULL; | |
3778 ira_reg_equiv[regno].init_insns = NULL; | |
3779 bitmap_set_bit (cleared_regs, regno); | |
3780 } | |
3781 | |
3782 /* Move the initialization of the register to just before | |
3783 USE_INSN. Update the flow information. */ | |
3784 else if (prev_nondebug_insn (use_insn) != def_insn) | |
3785 { | |
3786 rtx_insn *new_insn; | |
3787 | |
3788 new_insn = emit_insn_before (PATTERN (def_insn), use_insn); | |
3789 REG_NOTES (new_insn) = REG_NOTES (def_insn); | |
3790 REG_NOTES (def_insn) = 0; | |
3791 /* Rescan it to process the notes. */ | |
3792 df_insn_rescan (new_insn); | |
3793 | |
3794 /* Make sure this insn is recognized before reload begins, | |
3795 otherwise eliminate_regs_in_insn will die. */ | |
3796 INSN_CODE (new_insn) = INSN_CODE (def_insn); | |
3797 | |
3798 delete_insn (def_insn); | |
3799 | |
3800 XEXP (reg_equiv[regno].init_insns, 0) = new_insn; | |
3801 | |
3802 REG_BASIC_BLOCK (regno) = use_bb->index; | |
3803 REG_N_CALLS_CROSSED (regno) = 0; | |
3804 | |
3805 if (use_insn == BB_HEAD (use_bb)) | |
3806 BB_HEAD (use_bb) = new_insn; | |
3807 | |
3808 /* We know regno dies in use_insn, but inside a loop | |
3809 REG_DEAD notes might be missing when def_insn was in | |
3810 another basic block. However, when we move def_insn into | |
3811 this bb we'll definitely get a REG_DEAD note and reload | |
3812 will see the death. It's possible that update_equiv_regs | |
3813 set up an equivalence referencing regno for a reg set by | |
3814 use_insn, when regno was seen as non-local. Now that | |
3815 regno is local to this block, and dies, such an | |
3816 equivalence is invalid. */ | |
3817 if (find_reg_note (use_insn, REG_EQUIV, regno_reg_rtx[regno])) | |
3818 { | |
3819 rtx set = single_set (use_insn); | |
3820 if (set && REG_P (SET_DEST (set))) | |
3821 no_equiv (SET_DEST (set), set, NULL); | |
3822 } | |
3823 | |
3824 ira_reg_equiv[regno].init_insns | |
3825 = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX); | |
3826 bitmap_set_bit (cleared_regs, regno); | |
3827 } | |
3828 } | |
3829 | |
3830 if (!bitmap_empty_p (cleared_regs)) | |
3831 { | |
3832 basic_block bb; | |
3833 | |
3834 FOR_EACH_BB_FN (bb, cfun) | |
3835 { | |
3836 bitmap_and_compl_into (DF_LR_IN (bb), cleared_regs); | |
3837 bitmap_and_compl_into (DF_LR_OUT (bb), cleared_regs); | |
3838 if (!df_live) | |
2563 continue; | 3839 continue; |
2564 | |
2565 /* Don't substitute into a non-local goto, this confuses CFG. */ | |
2566 if (JUMP_P (insn) | |
2567 && find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX)) | |
2568 continue; | |
2569 | |
2570 for (link = REG_NOTES (insn); link; link = XEXP (link, 1)) | |
2571 { | |
2572 if (REG_NOTE_KIND (link) == REG_DEAD | |
2573 /* Make sure this insn still refers to the register. */ | |
2574 && reg_mentioned_p (XEXP (link, 0), PATTERN (insn))) | |
2575 { | |
2576 int regno = REGNO (XEXP (link, 0)); | |
2577 rtx equiv_insn; | |
2578 | |
2579 if (! reg_equiv[regno].replace | |
2580 || reg_equiv[regno].loop_depth < loop_depth | |
2581 /* There is no sense in moving insns if | 
2582 register pressure-sensitive scheduling was | 
2583 done, because it will not improve allocation | 
2584 but will very likely worsen the insn | 
2585 schedule. */ | 
2586 || (flag_sched_pressure && flag_schedule_insns)) | |
2587 continue; | |
2588 | |
2589 /* reg_equiv[REGNO].replace gets set only when | |
2590 REG_N_REFS[REGNO] is 2, i.e. the register is set | |
2591 once and used once. (If it were only set, but not used, | |
2592 flow would have deleted the setting insns.) Hence | |
2593 there can only be one insn in reg_equiv[REGNO].init_insns. */ | |
2594 gcc_assert (reg_equiv[regno].init_insns | |
2595 && !XEXP (reg_equiv[regno].init_insns, 1)); | |
2596 equiv_insn = XEXP (reg_equiv[regno].init_insns, 0); | |
2597 | |
2598 /* We may not move instructions that can throw, since | |
2599 that changes basic block boundaries and we are not | |
2600 prepared to adjust the CFG to match. */ | |
2601 if (can_throw_internal (equiv_insn)) | |
2602 continue; | |
2603 | |
2604 if (asm_noperands (PATTERN (equiv_insn)) < 0 | |
2605 && validate_replace_rtx (regno_reg_rtx[regno], | |
2606 *(reg_equiv[regno].src_p), insn)) | |
2607 { | |
2608 rtx equiv_link; | |
2609 rtx last_link; | |
2610 rtx note; | |
2611 | |
2612 /* Find the last note. */ | |
2613 for (last_link = link; XEXP (last_link, 1); | |
2614 last_link = XEXP (last_link, 1)) | |
2615 ; | |
2616 | |
2617 /* Append the REG_DEAD notes from equiv_insn. */ | |
2618 equiv_link = REG_NOTES (equiv_insn); | |
2619 while (equiv_link) | |
2620 { | |
2621 note = equiv_link; | |
2622 equiv_link = XEXP (equiv_link, 1); | |
2623 if (REG_NOTE_KIND (note) == REG_DEAD) | |
2624 { | |
2625 remove_note (equiv_insn, note); | |
2626 XEXP (last_link, 1) = note; | |
2627 XEXP (note, 1) = NULL_RTX; | |
2628 last_link = note; | |
2629 } | |
2630 } | |
2631 | |
2632 remove_death (regno, insn); | |
2633 SET_REG_N_REFS (regno, 0); | |
2634 REG_FREQ (regno) = 0; | |
2635 delete_insn (equiv_insn); | |
2636 | |
2637 reg_equiv[regno].init_insns | |
2638 = XEXP (reg_equiv[regno].init_insns, 1); | |
2639 | |
2640 reg_equiv_init[regno] = NULL_RTX; | |
2641 bitmap_set_bit (cleared_regs, regno); | |
2642 } | |
2643 /* Move the initialization of the register to just before | |
2644 INSN. Update the flow information. */ | |
2645 else if (prev_nondebug_insn (insn) != equiv_insn) | |
2646 { | |
2647 rtx new_insn; | |
2648 | |
2649 new_insn = emit_insn_before (PATTERN (equiv_insn), insn); | |
2650 REG_NOTES (new_insn) = REG_NOTES (equiv_insn); | |
2651 REG_NOTES (equiv_insn) = 0; | |
2652 /* Rescan it to process the notes. */ | |
2653 df_insn_rescan (new_insn); | |
2654 | |
2655 /* Make sure this insn is recognized before | |
2656 reload begins, otherwise | |
2657 eliminate_regs_in_insn will die. */ | |
2658 INSN_CODE (new_insn) = INSN_CODE (equiv_insn); | |
2659 | |
2660 delete_insn (equiv_insn); | |
2661 | |
2662 XEXP (reg_equiv[regno].init_insns, 0) = new_insn; | |
2663 | |
2664 REG_BASIC_BLOCK (regno) = bb->index; | |
2665 REG_N_CALLS_CROSSED (regno) = 0; | |
2666 REG_FREQ_CALLS_CROSSED (regno) = 0; | |
2667 REG_N_THROWING_CALLS_CROSSED (regno) = 0; | |
2668 REG_LIVE_LENGTH (regno) = 2; | |
2669 | |
2670 if (insn == BB_HEAD (bb)) | |
2671 BB_HEAD (bb) = PREV_INSN (insn); | |
2672 | |
2673 reg_equiv_init[regno] | |
2674 = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX); | |
2675 bitmap_set_bit (cleared_regs, regno); | |
2676 } | |
2677 } | |
2678 } | |
2679 } | |
2680 } | |
2681 | |
2682 if (!bitmap_empty_p (cleared_regs)) | |
2683 { | |
2684 FOR_EACH_BB (bb) | |
2685 { | |
2686 bitmap_and_compl_into (DF_LIVE_IN (bb), cleared_regs); | 3840 bitmap_and_compl_into (DF_LIVE_IN (bb), cleared_regs); |
2687 bitmap_and_compl_into (DF_LIVE_OUT (bb), cleared_regs); | 3841 bitmap_and_compl_into (DF_LIVE_OUT (bb), cleared_regs); |
2688 bitmap_and_compl_into (DF_LR_IN (bb), cleared_regs); | |
2689 bitmap_and_compl_into (DF_LR_OUT (bb), cleared_regs); | |
2690 } | 3842 } |
2691 | 3843 |
2692 /* Last pass - adjust debug insns referencing cleared regs. */ | 3844 /* Last pass - adjust debug insns referencing cleared regs. */ |
2693 if (MAY_HAVE_DEBUG_INSNS) | 3845 if (MAY_HAVE_DEBUG_INSNS) |
2694 for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) | 3846 for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn)) |
2695 if (DEBUG_INSN_P (insn)) | 3847 if (DEBUG_INSN_P (insn)) |
2696 { | 3848 { |
2697 rtx old_loc = INSN_VAR_LOCATION_LOC (insn); | 3849 rtx old_loc = INSN_VAR_LOCATION_LOC (insn); |
2698 INSN_VAR_LOCATION_LOC (insn) | 3850 INSN_VAR_LOCATION_LOC (insn) |
2699 = simplify_replace_fn_rtx (old_loc, NULL_RTX, | 3851 = simplify_replace_fn_rtx (old_loc, NULL_RTX, |
2701 (void *) cleared_regs); | 3853 (void *) cleared_regs); |
2702 if (old_loc != INSN_VAR_LOCATION_LOC (insn)) | 3854 if (old_loc != INSN_VAR_LOCATION_LOC (insn)) |
2703 df_insn_rescan (insn); | 3855 df_insn_rescan (insn); |
2704 } | 3856 } |
2705 } | 3857 } |
2706 | 3858 } |
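
For a pseudo that is set once and used once, the function above either substitutes the equivalent expression into the use and deletes the def, or, failing that, moves the def down next to its use. A toy decision function mirroring that order of checks is sketched below; all field names are hypothetical summaries of the real tests, not GCC APIs.

#include <stdbool.h>
#include <stdio.h>

enum action { SUBSTITUTE_AND_DELETE, MOVE_DEF_TO_USE, LEAVE_ALONE };

/* Hypothetical summary of a "set once, used once" pseudo.  */
struct once_used {
  bool use_is_jump_or_trap;    /* never substitute into flow control     */
  bool def_can_throw;          /* moving it would change the CFG         */
  bool use_in_deeper_loop;     /* moving would sink the def into a loop  */
  bool substitution_valid;     /* validate_replace_rtx-style check held  */
  bool already_adjacent;       /* def already sits just before the use   */
};

static enum action decide (const struct once_used *p)
{
  if (p->use_is_jump_or_trap || p->def_can_throw || p->use_in_deeper_loop)
    return LEAVE_ALONE;
  if (p->substitution_valid)
    return SUBSTITUTE_AND_DELETE;      /* the def insn disappears entirely */
  if (!p->already_adjacent)
    return MOVE_DEF_TO_USE;            /* shrink the live range instead */
  return LEAVE_ALONE;
}

int main (void)
{
  struct once_used p = { false, false, false, true, false };
  printf ("%d\n", decide (&p));        /* prints 0: SUBSTITUTE_AND_DELETE */
  return 0;
}
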
2707 BITMAP_FREE (cleared_regs); | 3859 |
2708 | 3860 /* A pass over indirect jumps, converting simple cases to direct jumps. |
2709 out: | 3861 Combine does this optimization too, but only within a basic block. */ |
2710 /* Clean up. */ | 3862 static void |
2711 | 3863 indirect_jump_optimize (void) |
2712 end_alias_analysis (); | 3864 { |
2713 free (reg_equiv); | 3865 basic_block bb; |
2714 return recorded_label_ref; | 3866 bool rebuild_p = false; |
3867 | |
3868 FOR_EACH_BB_REVERSE_FN (bb, cfun) | |
3869 { | |
3870 rtx_insn *insn = BB_END (bb); | |
3871 if (!JUMP_P (insn) | |
3872 || find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX)) | |
3873 continue; | |
3874 | |
3875 rtx x = pc_set (insn); | |
3876 if (!x || !REG_P (SET_SRC (x))) | |
3877 continue; | |
3878 | |
3879 int regno = REGNO (SET_SRC (x)); | |
3880 if (DF_REG_DEF_COUNT (regno) == 1) | |
3881 { | |
3882 df_ref def = DF_REG_DEF_CHAIN (regno); | |
3883 if (!DF_REF_IS_ARTIFICIAL (def)) | |
3884 { | |
3885 rtx_insn *def_insn = DF_REF_INSN (def); | |
3886 rtx lab = NULL_RTX; | |
3887 rtx set = single_set (def_insn); | |
3888 if (set && GET_CODE (SET_SRC (set)) == LABEL_REF) | |
3889 lab = SET_SRC (set); | |
3890 else | |
3891 { | |
3892 rtx eqnote = find_reg_note (def_insn, REG_EQUAL, NULL_RTX); | |
3893 if (eqnote && GET_CODE (XEXP (eqnote, 0)) == LABEL_REF) | |
3894 lab = XEXP (eqnote, 0); | |
3895 } | |
3896 if (lab && validate_replace_rtx (SET_SRC (x), lab, insn)) | |
3897 rebuild_p = true; | |
3898 } | |
3899 } | |
3900 } | |
3901 | |
3902 if (rebuild_p) | |
3903 { | |
3904 timevar_push (TV_JUMP); | |
3905 rebuild_jump_labels (get_insns ()); | |
3906 if (purge_all_dead_edges ()) | |
3907 delete_unreachable_blocks (); | |
3908 timevar_pop (TV_JUMP); | |
3909 } | |
3910 } | |
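
The pass above rewrites an indirect jump through a register into a direct jump when the register's single definition is a known label, then rebuilds jump labels and prunes dead edges. A minimal model of the rewrite itself, over plain integers standing in for registers and labels, is sketched here.

#include <stdio.h>

#define NO_TARGET (-1)

/* Toy jump: either through a register or to a direct label.  */
struct jump { int through_reg; int direct_label; };

/* single_def_label[r] is the label loaded by register r's only def,
   or NO_TARGET if the def is not a simple label load.  */
static void optimize_jump (struct jump *j, const int *single_def_label)
{
  if (j->through_reg != NO_TARGET
      && single_def_label[j->through_reg] != NO_TARGET)
    {
      j->direct_label = single_def_label[j->through_reg];
      j->through_reg = NO_TARGET;      /* now a direct jump */
    }
}

int main (void)
{
  int single_def_label[4] = { NO_TARGET, 7, NO_TARGET, NO_TARGET };
  struct jump j = { 1, NO_TARGET };    /* "jump *r1", r1 defined as label 7 */
  optimize_jump (&j, single_def_label);
  printf ("direct label: %d\n", j.direct_label);   /* prints 7 */
  return 0;
}
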
3911 | |
3912 /* Set up fields memory, constant, and invariant from init_insns in | |
3913 the structures of array ira_reg_equiv. */ | |
3914 static void | |
3915 setup_reg_equiv (void) | |
3916 { | |
3917 int i; | |
3918 rtx_insn_list *elem, *prev_elem, *next_elem; | |
3919 rtx_insn *insn; | |
3920 rtx set, x; | |
3921 | |
3922 for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++) | |
3923 for (prev_elem = NULL, elem = ira_reg_equiv[i].init_insns; | |
3924 elem; | |
3925 prev_elem = elem, elem = next_elem) | |
3926 { | |
3927 next_elem = elem->next (); | |
3928 insn = elem->insn (); | |
3929 set = single_set (insn); | |
3930 | |
3931 /* Init insns can set up equivalence when the reg is a destination or | |
3932 a source (in this case the destination is memory). */ | |
3933 if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set)))) | |
3934 { | |
3935 if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL) | |
3936 { | |
3937 x = XEXP (x, 0); | |
3938 if (REG_P (SET_DEST (set)) | |
3939 && REGNO (SET_DEST (set)) == (unsigned int) i | |
3940 && ! rtx_equal_p (SET_SRC (set), x) && MEM_P (x)) | |
3941 { | |
3942 /* This insn reports the equivalence but does | 
3943 not actually set it. Remove it from the | 
3944 list. */ | 
3945 if (prev_elem == NULL) | |
3946 ira_reg_equiv[i].init_insns = next_elem; | |
3947 else | |
3948 XEXP (prev_elem, 1) = next_elem; | |
3949 elem = prev_elem; | |
3950 } | |
3951 } | |
3952 else if (REG_P (SET_DEST (set)) | |
3953 && REGNO (SET_DEST (set)) == (unsigned int) i) | |
3954 x = SET_SRC (set); | |
3955 else | |
3956 { | |
3957 gcc_assert (REG_P (SET_SRC (set)) | |
3958 && REGNO (SET_SRC (set)) == (unsigned int) i); | |
3959 x = SET_DEST (set); | |
3960 } | |
3961 if (! function_invariant_p (x) | |
3962 || ! flag_pic | |
3963 /* A function invariant is often CONSTANT_P but may | |
3964 include a register. We promise to only pass | |
3965 CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */ | |
3966 || (CONSTANT_P (x) && LEGITIMATE_PIC_OPERAND_P (x))) | |
3967 { | |
3968 /* It can happen that a REG_EQUIV note contains a MEM | |
3969 that is not a legitimate memory operand. As later | |
3970 stages of reload assume that all addresses found in | |
3971 the lra_regno_equiv_* arrays were originally | |
3972 legitimate, we ignore such REG_EQUIV notes. */ | |
3973 if (memory_operand (x, VOIDmode)) | |
3974 { | |
3975 ira_reg_equiv[i].defined_p = true; | |
3976 ira_reg_equiv[i].memory = x; | |
3977 continue; | |
3978 } | |
3979 else if (function_invariant_p (x)) | |
3980 { | |
3981 machine_mode mode; | |
3982 | |
3983 mode = GET_MODE (SET_DEST (set)); | |
3984 if (GET_CODE (x) == PLUS | |
3985 || x == frame_pointer_rtx || x == arg_pointer_rtx) | |
3986 /* This is PLUS of frame pointer and a constant, | |
3987 or fp, or argp. */ | |
3988 ira_reg_equiv[i].invariant = x; | |
3989 else if (targetm.legitimate_constant_p (mode, x)) | |
3990 ira_reg_equiv[i].constant = x; | |
3991 else | |
3992 { | |
3993 ira_reg_equiv[i].memory = force_const_mem (mode, x); | |
3994 if (ira_reg_equiv[i].memory == NULL_RTX) | |
3995 { | |
3996 ira_reg_equiv[i].defined_p = false; | |
3997 ira_reg_equiv[i].init_insns = NULL; | |
3998 break; | |
3999 } | |
4000 } | |
4001 ira_reg_equiv[i].defined_p = true; | |
4002 continue; | |
4003 } | |
4004 } | |
4005 } | |
4006 ira_reg_equiv[i].defined_p = false; | |
4007 ira_reg_equiv[i].init_insns = NULL; | |
4008 break; | |
4009 } | |
2715 } | 4010 } |
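
setup_reg_equiv classifies each recorded equivalence as memory, invariant, or constant, falling back to the constant pool and otherwise dropping the equivalence entirely. The compact sketch below restates that classification order; its boolean fields are hypothetical stand-ins for memory_operand, function_invariant_p, targetm.legitimate_constant_p and force_const_mem.

#include <stdbool.h>
#include <stdio.h>

enum equiv_kind { EQ_NONE, EQ_MEMORY, EQ_INVARIANT, EQ_CONSTANT };

struct equiv_value {
  bool is_valid_mem;          /* a legitimate memory operand           */
  bool is_frame_invariant;    /* fp, argp, or fp/argp plus a constant  */
  bool is_legitimate_const;   /* target accepts it as a constant       */
  bool can_force_to_mem;      /* spilling to the constant pool works   */
};

/* Mirrors the order above: memory first, then invariants, then
   constants, then a constant forced into the pool.  */
static enum equiv_kind classify (const struct equiv_value *v)
{
  if (v->is_valid_mem)
    return EQ_MEMORY;
  if (v->is_frame_invariant)
    return EQ_INVARIANT;
  if (v->is_legitimate_const)
    return EQ_CONSTANT;
  if (v->can_force_to_mem)
    return EQ_MEMORY;
  return EQ_NONE;             /* equivalence is dropped entirely */
}

int main (void)
{
  struct equiv_value v = { false, false, true, true };
  printf ("%d\n", classify (&v));      /* prints 3: EQ_CONSTANT */
  return 0;
}
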
2716 | 4011 |
2717 | 4012 |
2718 | 4013 |
2719 /* Print chain C to FILE. */ | 4014 /* Print chain C to FILE. */ |
2720 static void | 4015 static void |
2721 print_insn_chain (FILE *file, struct insn_chain *c) | 4016 print_insn_chain (FILE *file, struct insn_chain *c) |
2722 { | 4017 { |
2723 fprintf (file, "insn=%d, ", INSN_UID(c->insn)); | 4018 fprintf (file, "insn=%d, ", INSN_UID (c->insn)); |
2724 bitmap_print (file, &c->live_throughout, "live_throughout: ", ", "); | 4019 bitmap_print (file, &c->live_throughout, "live_throughout: ", ", "); |
2725 bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n"); | 4020 bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n"); |
2726 } | 4021 } |
2727 | 4022 |
2728 | 4023 |
2748 /* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] using | 4043 /* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] using |
2749 REG to the number of nregs, and INIT_VALUE to get the | 4044 REG to the number of nregs, and INIT_VALUE to get the |
2750 initialization. ALLOCNUM need not be the regno of REG. */ | 4045 initialization. ALLOCNUM need not be the regno of REG. */ |
2751 static void | 4046 static void |
2752 init_live_subregs (bool init_value, sbitmap *live_subregs, | 4047 init_live_subregs (bool init_value, sbitmap *live_subregs, |
2753 int *live_subregs_used, int allocnum, rtx reg) | 4048 bitmap live_subregs_used, int allocnum, rtx reg) |
2754 { | 4049 { |
2755 unsigned int regno = REGNO (SUBREG_REG (reg)); | 4050 unsigned int regno = REGNO (SUBREG_REG (reg)); |
2756 int size = GET_MODE_SIZE (GET_MODE (regno_reg_rtx[regno])); | 4051 int size = GET_MODE_SIZE (GET_MODE (regno_reg_rtx[regno])); |
2757 | 4052 |
2758 gcc_assert (size > 0); | 4053 gcc_assert (size > 0); |
2759 | 4054 |
2760 /* Been there, done that. */ | 4055 /* Been there, done that. */ |
2761 if (live_subregs_used[allocnum]) | 4056 if (bitmap_bit_p (live_subregs_used, allocnum)) |
2762 return; | 4057 return; |
2763 | 4058 |
2764 /* Create a new one with zeros. */ | 4059 /* Create a new one. */ |
2765 if (live_subregs[allocnum] == NULL) | 4060 if (live_subregs[allocnum] == NULL) |
2766 live_subregs[allocnum] = sbitmap_alloc (size); | 4061 live_subregs[allocnum] = sbitmap_alloc (size); |
2767 | 4062 |
2768 /* If the entire reg was live before blasting into subregs, we need | 4063 /* If the entire reg was live before blasting into subregs, we need |
2769 to init all of the subregs to ones else init to 0. */ | 4064 to init all of the subregs to ones else init to 0. */ |
2770 if (init_value) | 4065 if (init_value) |
2771 sbitmap_ones (live_subregs[allocnum]); | 4066 bitmap_ones (live_subregs[allocnum]); |
2772 else | 4067 else |
2773 sbitmap_zero (live_subregs[allocnum]); | 4068 bitmap_clear (live_subregs[allocnum]); |
2774 | 4069 |
2775 /* Set the number of bits that we really want. */ | 4070 bitmap_set_bit (live_subregs_used, allocnum); |
2776 live_subregs_used[allocnum] = size; | |
2777 } | 4071 } |
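
init_live_subregs lazily materialises a per-byte liveness map for a multiword pseudo, seeded to all-ones when the whole register was live at this point and all-zeros otherwise. A freestanding sketch of the same idea, with a byte array in place of an sbitmap, follows; sizes and names are illustrative only.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Byte-granular liveness for one multiword pseudo.  */
struct subreg_live {
  unsigned char *bytes;   /* 1 = that byte of the pseudo is live */
  int size;               /* bytes the pseudo occupies           */
  bool in_use;            /* counterpart of live_subregs_used    */
};

/* First touch of a subreg of this pseudo: allocate the map and seed it
   from whether the whole register was live.  */
static void init_subreg_live (struct subreg_live *s, int size, bool whole_live)
{
  if (s->in_use)
    return;                            /* been there, done that */
  if (s->bytes == NULL)
    s->bytes = malloc ((size_t) size);
  if (s->bytes == NULL)
    return;
  memset (s->bytes, whole_live ? 1 : 0, (size_t) size);
  s->size = size;
  s->in_use = true;
}

int main (void)
{
  struct subreg_live s = { NULL, 0, false };
  init_subreg_live (&s, 8, true);      /* 8-byte pseudo, fully live */
  s.bytes[0] = s.bytes[1] = s.bytes[2] = s.bytes[3] = 0;  /* low word defined */
  int live = 0;
  for (int i = 0; i < s.size; i++)
    live += s.bytes[i];
  printf ("live bytes: %d\n", live);   /* prints 4 */
  free (s.bytes);
  return 0;
}
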
2778 | 4072 |
2779 /* Walk the insns of the current function and build reload_insn_chain, | 4073 /* Walk the insns of the current function and build reload_insn_chain, |
2780 and record register life information. */ | 4074 and record register life information. */ |
2781 static void | 4075 static void |
2784 unsigned int i; | 4078 unsigned int i; |
2785 struct insn_chain **p = &reload_insn_chain; | 4079 struct insn_chain **p = &reload_insn_chain; |
2786 basic_block bb; | 4080 basic_block bb; |
2787 struct insn_chain *c = NULL; | 4081 struct insn_chain *c = NULL; |
2788 struct insn_chain *next = NULL; | 4082 struct insn_chain *next = NULL; |
2789 bitmap live_relevant_regs = BITMAP_ALLOC (NULL); | 4083 auto_bitmap live_relevant_regs; |
2790 bitmap elim_regset = BITMAP_ALLOC (NULL); | 4084 auto_bitmap elim_regset; |
2791 /* live_subregs is a vector used to keep accurate information about | 4085 /* live_subregs is a vector used to keep accurate information about |
2792 which hardregs are live in multiword pseudos. live_subregs and | 4086 which hardregs are live in multiword pseudos. live_subregs and |
2793 live_subregs_used are indexed by pseudo number. The live_subreg | 4087 live_subregs_used are indexed by pseudo number. The live_subreg |
2794 entry for a particular pseudo is only used if the corresponding | 4088 entry for a particular pseudo is only used if the corresponding |
2795 element is non zero in live_subregs_used. The value in | 4089 element is non zero in live_subregs_used. The sbitmap size of |
2796 live_subregs_used is number of bytes that the pseudo can | 4090 live_subreg[allocno] is number of bytes that the pseudo can |
2797 occupy. */ | 4091 occupy. */ |
2798 sbitmap *live_subregs = XCNEWVEC (sbitmap, max_regno); | 4092 sbitmap *live_subregs = XCNEWVEC (sbitmap, max_regno); |
2799 int *live_subregs_used = XNEWVEC (int, max_regno); | 4093 auto_bitmap live_subregs_used; |
2800 | 4094 |
2801 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) | 4095 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) |
2802 if (TEST_HARD_REG_BIT (eliminable_regset, i)) | 4096 if (TEST_HARD_REG_BIT (eliminable_regset, i)) |
2803 bitmap_set_bit (elim_regset, i); | 4097 bitmap_set_bit (elim_regset, i); |
2804 FOR_EACH_BB_REVERSE (bb) | 4098 FOR_EACH_BB_REVERSE_FN (bb, cfun) |
2805 { | 4099 { |
2806 bitmap_iterator bi; | 4100 bitmap_iterator bi; |
2807 rtx insn; | 4101 rtx_insn *insn; |
2808 | 4102 |
2809 CLEAR_REG_SET (live_relevant_regs); | 4103 CLEAR_REG_SET (live_relevant_regs); |
2810 memset (live_subregs_used, 0, max_regno * sizeof (int)); | 4104 bitmap_clear (live_subregs_used); |
2811 | 4105 |
2812 EXECUTE_IF_SET_IN_BITMAP (DF_LR_OUT (bb), 0, i, bi) | 4106 EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), 0, i, bi) |
2813 { | 4107 { |
2814 if (i >= FIRST_PSEUDO_REGISTER) | 4108 if (i >= FIRST_PSEUDO_REGISTER) |
2815 break; | 4109 break; |
2816 bitmap_set_bit (live_relevant_regs, i); | 4110 bitmap_set_bit (live_relevant_regs, i); |
2817 } | 4111 } |
2818 | 4112 |
2819 EXECUTE_IF_SET_IN_BITMAP (DF_LR_OUT (bb), | 4113 EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), |
2820 FIRST_PSEUDO_REGISTER, i, bi) | 4114 FIRST_PSEUDO_REGISTER, i, bi) |
2821 { | 4115 { |
2822 if (pseudo_for_reload_consideration_p (i)) | 4116 if (pseudo_for_reload_consideration_p (i)) |
2823 bitmap_set_bit (live_relevant_regs, i); | 4117 bitmap_set_bit (live_relevant_regs, i); |
2824 } | 4118 } |
2825 | 4119 |
2826 FOR_BB_INSNS_REVERSE (bb, insn) | 4120 FOR_BB_INSNS_REVERSE (bb, insn) |
2827 { | 4121 { |
2828 if (!NOTE_P (insn) && !BARRIER_P (insn)) | 4122 if (!NOTE_P (insn) && !BARRIER_P (insn)) |
2829 { | 4123 { |
2830 unsigned int uid = INSN_UID (insn); | 4124 struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn); |
2831 df_ref *def_rec; | 4125 df_ref def, use; |
2832 df_ref *use_rec; | |
2833 | 4126 |
2834 c = new_insn_chain (); | 4127 c = new_insn_chain (); |
2835 c->next = next; | 4128 c->next = next; |
2836 next = c; | 4129 next = c; |
2837 *p = c; | 4130 *p = c; |
2838 p = &c->prev; | 4131 p = &c->prev; |
2839 | 4132 |
2840 c->insn = insn; | 4133 c->insn = insn; |
2841 c->block = bb->index; | 4134 c->block = bb->index; |
2842 | 4135 |
2843 if (INSN_P (insn)) | 4136 if (NONDEBUG_INSN_P (insn)) |
2844 for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++) | 4137 FOR_EACH_INSN_INFO_DEF (def, insn_info) |
2845 { | 4138 { |
2846 df_ref def = *def_rec; | |
2847 unsigned int regno = DF_REF_REGNO (def); | 4139 unsigned int regno = DF_REF_REGNO (def); |
2848 | 4140 |
2849 /* Ignore may clobbers because these are generated | 4141 /* Ignore may clobbers because these are generated |
2850 from calls. However, every other kind of def is | 4142 from calls. However, every other kind of def is |
2851 added to dead_or_set. */ | 4143 added to dead_or_set. */ |
2890 last = ((last + UNITS_PER_WORD - 1) | 4182 last = ((last + UNITS_PER_WORD - 1) |
2891 / UNITS_PER_WORD * UNITS_PER_WORD); | 4183 / UNITS_PER_WORD * UNITS_PER_WORD); |
2892 } | 4184 } |
2893 | 4185 |
2894 /* Ignore the paradoxical bits. */ | 4186 /* Ignore the paradoxical bits. */ |
2895 if ((int)last > live_subregs_used[regno]) | 4187 if (last > SBITMAP_SIZE (live_subregs[regno])) |
2896 last = live_subregs_used[regno]; | 4188 last = SBITMAP_SIZE (live_subregs[regno]); |
2897 | 4189 |
2898 while (start < last) | 4190 while (start < last) |
2899 { | 4191 { |
2900 RESET_BIT (live_subregs[regno], start); | 4192 bitmap_clear_bit (live_subregs[regno], start); |
2901 start++; | 4193 start++; |
2902 } | 4194 } |
2903 | 4195 |
2904 if (sbitmap_empty_p (live_subregs[regno])) | 4196 if (bitmap_empty_p (live_subregs[regno])) |
2905 { | 4197 { |
2906 live_subregs_used[regno] = 0; | 4198 bitmap_clear_bit (live_subregs_used, regno); |
2907 bitmap_clear_bit (live_relevant_regs, regno); | 4199 bitmap_clear_bit (live_relevant_regs, regno); |
2908 } | 4200 } |
2909 else | 4201 else |
2910 /* Set live_relevant_regs here because | 4202 /* Set live_relevant_regs here because |
2911 that bit has to be true to get us to | 4203 that bit has to be true to get us to |
2919 ZERO_EXTRACT. We handle the subreg | 4211 ZERO_EXTRACT. We handle the subreg |
2920 case above so here we have to keep from | 4212 case above so here we have to keep from |
2921 modeling the def as a killing def. */ | 4213 modeling the def as a killing def. */ |
2922 if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL)) | 4214 if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL)) |
2923 { | 4215 { |
4216 bitmap_clear_bit (live_subregs_used, regno); | |
2924 bitmap_clear_bit (live_relevant_regs, regno); | 4217 bitmap_clear_bit (live_relevant_regs, regno); |
2925 live_subregs_used[regno] = 0; | |
2926 } | 4218 } |
2927 } | 4219 } |
2928 } | 4220 } |
2929 } | 4221 } |
2930 | 4222 |
2931 bitmap_and_compl_into (live_relevant_regs, elim_regset); | 4223 bitmap_and_compl_into (live_relevant_regs, elim_regset); |
2932 bitmap_copy (&c->live_throughout, live_relevant_regs); | 4224 bitmap_copy (&c->live_throughout, live_relevant_regs); |
2933 | 4225 |
2934 if (INSN_P (insn)) | 4226 if (NONDEBUG_INSN_P (insn)) |
2935 for (use_rec = DF_INSN_UID_USES (uid); *use_rec; use_rec++) | 4227 FOR_EACH_INSN_INFO_USE (use, insn_info) |
2936 { | 4228 { |
2937 df_ref use = *use_rec; | |
2938 unsigned int regno = DF_REF_REGNO (use); | 4229 unsigned int regno = DF_REF_REGNO (use); |
2939 rtx reg = DF_REF_REG (use); | 4230 rtx reg = DF_REF_REG (use); |
2940 | 4231 |
2941 /* DF_REF_READ_WRITE on a use means that this use | 4232 /* DF_REF_READ_WRITE on a use means that this use |
2942 is fabricated from a def that is a partial set | 4233 is fabricated from a def that is a partial set |
2943 to a multiword reg. Here, we only model the | 4234 to a multiword reg. Here, we only model the |
2944 subreg case that is not wrapped in ZERO_EXTRACT | 4235 subreg case that is not wrapped in ZERO_EXTRACT |
2945 precisely so we do not need to look at the | 4236 precisely so we do not need to look at the |
2946 fabricated use. */ | 4237 fabricated use. */ |
2947 if (DF_REF_FLAGS_IS_SET (use, DF_REF_READ_WRITE) | 4238 if (DF_REF_FLAGS_IS_SET (use, DF_REF_READ_WRITE) |
2948 && !DF_REF_FLAGS_IS_SET (use, DF_REF_ZERO_EXTRACT) | 4239 && !DF_REF_FLAGS_IS_SET (use, DF_REF_ZERO_EXTRACT) |
2949 && DF_REF_FLAGS_IS_SET (use, DF_REF_SUBREG)) | 4240 && DF_REF_FLAGS_IS_SET (use, DF_REF_SUBREG)) |
2950 continue; | 4241 continue; |
2951 | 4242 |
2976 init_live_subregs | 4267 init_live_subregs |
2977 (bitmap_bit_p (live_relevant_regs, regno), | 4268 (bitmap_bit_p (live_relevant_regs, regno), |
2978 live_subregs, live_subregs_used, regno, reg); | 4269 live_subregs, live_subregs_used, regno, reg); |
2979 | 4270 |
2980 /* Ignore the paradoxical bits. */ | 4271 /* Ignore the paradoxical bits. */ |
2981 if ((int)last > live_subregs_used[regno]) | 4272 if (last > SBITMAP_SIZE (live_subregs[regno])) |
2982 last = live_subregs_used[regno]; | 4273 last = SBITMAP_SIZE (live_subregs[regno]); |
2983 | 4274 |
2984 while (start < last) | 4275 while (start < last) |
2985 { | 4276 { |
2986 SET_BIT (live_subregs[regno], start); | 4277 bitmap_set_bit (live_subregs[regno], start); |
2987 start++; | 4278 start++; |
2988 } | 4279 } |
2989 } | 4280 } |
2990 else | 4281 else |
2991 /* Resetting the live_subregs_used is | 4282 /* Resetting the live_subregs_used is |
2992 effectively saying do not use the subregs | 4283 effectively saying do not use the subregs |
2993 because we are reading the whole | 4284 because we are reading the whole |
2994 pseudo. */ | 4285 pseudo. */ |
2995 live_subregs_used[regno] = 0; | 4286 bitmap_clear_bit (live_subregs_used, regno); |
2996 bitmap_set_bit (live_relevant_regs, regno); | 4287 bitmap_set_bit (live_relevant_regs, regno); |
2997 } | 4288 } |
2998 } | 4289 } |
2999 } | 4290 } |
3000 } | 4291 } |
3033 } | 4324 } |
3034 insn = PREV_INSN (insn); | 4325 insn = PREV_INSN (insn); |
3035 } | 4326 } |
3036 } | 4327 } |
3037 | 4328 |
3038 for (i = 0; i < (unsigned int) max_regno; i++) | |
3039 if (live_subregs[i]) | |
3040 free (live_subregs[i]); | |
3041 | |
3042 reload_insn_chain = c; | 4329 reload_insn_chain = c; |
3043 *p = NULL; | 4330 *p = NULL; |
3044 | 4331 |
4332 for (i = 0; i < (unsigned int) max_regno; i++) | |
4333 if (live_subregs[i] != NULL) | |
4334 sbitmap_free (live_subregs[i]); | |
3045 free (live_subregs); | 4335 free (live_subregs); |
3046 free (live_subregs_used); | |
3047 BITMAP_FREE (live_relevant_regs); | |
3048 BITMAP_FREE (elim_regset); | |
3049 | 4336 |
3050 if (dump_file) | 4337 if (dump_file) |
3051 print_insn_chains (dump_file); | 4338 print_insn_chains (dump_file); |
3052 } | 4339 } |
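
The reverse walk in build_insn_chain derives, for each insn, the registers live throughout it and those it sets or kills, while propagating liveness upwards through the block. The toy version below shows just that propagation rule over small bitmasks; it is a simplification, since the real code also handles subregs, clobbers and eliminable registers.

#include <stdio.h>

/* Per-insn record, echoing live_throughout / dead_or_set above.  */
struct chain_entry { unsigned live_throughout; unsigned dead_or_set; };

/* LIVE is the liveness below the insn; DEF/USE are its defs and uses.
   Returns the liveness above the insn.  */
static unsigned walk_insn (unsigned live, unsigned def, unsigned use,
                           struct chain_entry *e)
{
  e->dead_or_set = def;
  e->live_throughout = live & ~def;
  return (live & ~def) | use;          /* standard backward liveness */
}

int main (void)
{
  struct chain_entry e;
  /* regs: bit0..bit3; insn "r2 = r0 + r1" with r2 live below it.  */
  unsigned live = walk_insn (0x4 /* {r2} */, 0x4 /* def {r2} */,
                             0x3 /* use {r0,r1} */, &e);
  printf ("live above: %#x, throughout: %#x, dead_or_set: %#x\n",
          live, e.live_throughout, e.dead_or_set);
  return 0;
}
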
4340 | |
4341 /* Examine the rtx found in *LOC, which is read or written to as determined | |
4342 by TYPE. Return false if we find a reason why an insn containing this | |
4343 rtx should not be moved (such as accesses to non-constant memory), true | |
4344 otherwise. */ | |
4345 static bool | |
4346 rtx_moveable_p (rtx *loc, enum op_type type) | |
4347 { | |
4348 const char *fmt; | |
4349 rtx x = *loc; | |
4350 enum rtx_code code = GET_CODE (x); | |
4351 int i, j; | |
4352 | |
4353 code = GET_CODE (x); | |
4354 switch (code) | |
4355 { | |
4356 case CONST: | |
4357 CASE_CONST_ANY: | |
4358 case SYMBOL_REF: | |
4359 case LABEL_REF: | |
4360 return true; | |
4361 | |
4362 case PC: | |
4363 return type == OP_IN; | |
4364 | |
4365 case CC0: | |
4366 return false; | |
4367 | |
4368 case REG: | |
4369 if (x == frame_pointer_rtx) | |
4370 return true; | |
4371 if (HARD_REGISTER_P (x)) | |
4372 return false; | |
4373 | |
4374 return true; | |
4375 | |
4376 case MEM: | |
4377 if (type == OP_IN && MEM_READONLY_P (x)) | |
4378 return rtx_moveable_p (&XEXP (x, 0), OP_IN); | |
4379 return false; | |
4380 | |
4381 case SET: | |
4382 return (rtx_moveable_p (&SET_SRC (x), OP_IN) | |
4383 && rtx_moveable_p (&SET_DEST (x), OP_OUT)); | |
4384 | |
4385 case STRICT_LOW_PART: | |
4386 return rtx_moveable_p (&XEXP (x, 0), OP_OUT); | |
4387 | |
4388 case ZERO_EXTRACT: | |
4389 case SIGN_EXTRACT: | |
4390 return (rtx_moveable_p (&XEXP (x, 0), type) | |
4391 && rtx_moveable_p (&XEXP (x, 1), OP_IN) | |
4392 && rtx_moveable_p (&XEXP (x, 2), OP_IN)); | |
4393 | |
4394 case CLOBBER: | |
4395 return rtx_moveable_p (&SET_DEST (x), OP_OUT); | |
4396 | |
4397 case UNSPEC_VOLATILE: | |
4398 /* It is a bad idea to consider insns with such rtl | |
4399 as moveable ones. The insn scheduler also treats them as a barrier | 
4400 for a reason. */ | |
4401 return false; | |
4402 | |
4403 case ASM_OPERANDS: | |
4404 /* The same is true for a volatile asm: it has unknown side effects, so it | 
4405 cannot be moved at will. */ | |
4406 if (MEM_VOLATILE_P (x)) | |
4407 return false; | |
4408 | |
4409 default: | |
4410 break; | |
4411 } | |
4412 | |
4413 fmt = GET_RTX_FORMAT (code); | |
4414 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) | |
4415 { | |
4416 if (fmt[i] == 'e') | |
4417 { | |
4418 if (!rtx_moveable_p (&XEXP (x, i), type)) | |
4419 return false; | |
4420 } | |
4421 else if (fmt[i] == 'E') | |
4422 for (j = XVECLEN (x, i) - 1; j >= 0; j--) | |
4423 { | |
4424 if (!rtx_moveable_p (&XVECEXP (x, i, j), type)) | |
4425 return false; | |
4426 } | |
4427 } | |
4428 return true; | |
4429 } | |
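
rtx_moveable_p is a structural walk: leaves decide immediately (constants yes, hard registers and volatile operations no, memory only when it is read-only and being read), and interior nodes recurse over their operands. The toy expression walk below has the same shape; its node kinds and flags are illustrative, not rtx codes.

#include <stdbool.h>
#include <stdio.h>

enum kind { K_CONST, K_PSEUDO, K_HARD_REG, K_MEM, K_VOLATILE, K_OP };

struct expr {
  enum kind kind;
  bool mem_readonly;               /* only meaningful for K_MEM */
  const struct expr *op0, *op1;    /* only meaningful for K_OP  */
};

/* IS_INPUT distinguishes reads from writes, as OP_IN/OP_OUT do above:
   a read-only MEM may be moved only when it is read.  */
static bool moveable_p (const struct expr *e, bool is_input)
{
  if (e == NULL)
    return true;
  switch (e->kind)
    {
    case K_CONST:    return true;
    case K_PSEUDO:   return true;
    case K_HARD_REG: return false;
    case K_VOLATILE: return false;
    case K_MEM:      return is_input && e->mem_readonly;
    case K_OP:       return moveable_p (e->op0, is_input)
                            && moveable_p (e->op1, is_input);
    }
  return false;
}

int main (void)
{
  struct expr c = { K_CONST, false, NULL, NULL };
  struct expr m = { K_MEM, true, NULL, NULL };
  struct expr add = { K_OP, false, &c, &m };
  printf ("%d\n", moveable_p (&add, true));   /* prints 1 */
  return 0;
}
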
4430 | |
4431 /* A wrapper around dominated_by_p, which uses the information in UID_LUID | |
4432 to give dominance relationships between two insns I1 and I2. */ | |
4433 static bool | |
4434 insn_dominated_by_p (rtx i1, rtx i2, int *uid_luid) | |
4435 { | |
4436 basic_block bb1 = BLOCK_FOR_INSN (i1); | |
4437 basic_block bb2 = BLOCK_FOR_INSN (i2); | |
4438 | |
4439 if (bb1 == bb2) | |
4440 return uid_luid[INSN_UID (i2)] < uid_luid[INSN_UID (i1)]; | |
4441 return dominated_by_p (CDI_DOMINATORS, bb1, bb2); | |
4442 } | |
4443 | |
4444 /* Record the range of register numbers added by find_moveable_pseudos. */ | |
4445 int first_moveable_pseudo, last_moveable_pseudo; | |
4446 | |
4447 /* This vector holds data for every register added by | 
4448 find_moveable_pseudos, with index 0 holding data for the | 
4449 first_moveable_pseudo. */ | 
4450 /* The original home register. */ | |
4451 static vec<rtx> pseudo_replaced_reg; | |
4452 | |
4453 /* Look for instances where we have an instruction that is known to increase | |
4454 register pressure, and whose result is not used immediately. If it is | |
4455 possible to move the instruction downwards to just before its first use, | |
4456 split its lifetime into two ranges. We create a new pseudo to compute the | |
4457 value, and emit a move instruction just before the first use. If, after | |
4458 register allocation, the new pseudo remains unallocated, the function | |
4459 move_unallocated_pseudos then deletes the move instruction and places | |
4460 the computation just before the first use. | |
4461 | |
4462 Such a move is safe and profitable if all the input registers remain live | |
4463 and unchanged between the original computation and its first use. In such | |
4464 a situation, the computation is known to increase register pressure, and | |
4465 moving it is known to at least not worsen it. | |
4466 | |
4467 We restrict moves to only those cases where a register remains unallocated, | |
4468 in order to avoid interfering too much with the instruction schedule. As | |
4469 an exception, we may move insns which only modify their input register | |
4470 (typically induction variables), as this increases the freedom for our | |
4471 intended transformation, and does not limit the second instruction | |
4472 scheduler pass. */ | |
4473 | |
4474 static void | |
4475 find_moveable_pseudos (void) | |
4476 { | |
4477 unsigned i; | |
4478 int max_regs = max_reg_num (); | |
4479 int max_uid = get_max_uid (); | |
4480 basic_block bb; | |
4481 int *uid_luid = XNEWVEC (int, max_uid); | |
4482 rtx_insn **closest_uses = XNEWVEC (rtx_insn *, max_regs); | |
4483 /* A set of registers which are live but not modified throughout a block. */ | |
4484 bitmap_head *bb_transp_live = XNEWVEC (bitmap_head, | |
4485 last_basic_block_for_fn (cfun)); | |
4486 /* A set of registers which only exist in a given basic block. */ | |
4487 bitmap_head *bb_local = XNEWVEC (bitmap_head, | |
4488 last_basic_block_for_fn (cfun)); | |
4489 /* A set of registers which are set once, in an instruction that can be | |
4490 moved freely downwards, but are otherwise transparent to a block. */ | |
4491 bitmap_head *bb_moveable_reg_sets = XNEWVEC (bitmap_head, | |
4492 last_basic_block_for_fn (cfun)); | |
4493 auto_bitmap live, used, set, interesting, unusable_as_input; | |
4494 bitmap_iterator bi; | |
4495 | |
4496 first_moveable_pseudo = max_regs; | |
4497 pseudo_replaced_reg.release (); | |
4498 pseudo_replaced_reg.safe_grow_cleared (max_regs); | |
4499 | |
4500 df_analyze (); | |
4501 calculate_dominance_info (CDI_DOMINATORS); | |
4502 | |
4503 i = 0; | |
4504 FOR_EACH_BB_FN (bb, cfun) | |
4505 { | |
4506 rtx_insn *insn; | |
4507 bitmap transp = bb_transp_live + bb->index; | |
4508 bitmap moveable = bb_moveable_reg_sets + bb->index; | |
4509 bitmap local = bb_local + bb->index; | |
4510 | |
4511 bitmap_initialize (local, 0); | |
4512 bitmap_initialize (transp, 0); | |
4513 bitmap_initialize (moveable, 0); | |
4514 bitmap_copy (live, df_get_live_out (bb)); | |
4515 bitmap_and_into (live, df_get_live_in (bb)); | |
4516 bitmap_copy (transp, live); | |
4517 bitmap_clear (moveable); | |
4518 bitmap_clear (live); | |
4519 bitmap_clear (used); | |
4520 bitmap_clear (set); | |
4521 FOR_BB_INSNS (bb, insn) | |
4522 if (NONDEBUG_INSN_P (insn)) | |
4523 { | |
4524 df_insn_info *insn_info = DF_INSN_INFO_GET (insn); | |
4525 df_ref def, use; | |
4526 | |
4527 uid_luid[INSN_UID (insn)] = i++; | |
4528 | |
4529 def = df_single_def (insn_info); | |
4530 use = df_single_use (insn_info); | |
4531 if (use | |
4532 && def | |
4533 && DF_REF_REGNO (use) == DF_REF_REGNO (def) | |
4534 && !bitmap_bit_p (set, DF_REF_REGNO (use)) | |
4535 && rtx_moveable_p (&PATTERN (insn), OP_IN)) | |
4536 { | |
4537 unsigned regno = DF_REF_REGNO (use); | |
4538 bitmap_set_bit (moveable, regno); | |
4539 bitmap_set_bit (set, regno); | |
4540 bitmap_set_bit (used, regno); | |
4541 bitmap_clear_bit (transp, regno); | |
4542 continue; | |
4543 } | |
4544 FOR_EACH_INSN_INFO_USE (use, insn_info) | |
4545 { | |
4546 unsigned regno = DF_REF_REGNO (use); | |
4547 bitmap_set_bit (used, regno); | |
4548 if (bitmap_clear_bit (moveable, regno)) | |
4549 bitmap_clear_bit (transp, regno); | |
4550 } | |
4551 | |
4552 FOR_EACH_INSN_INFO_DEF (def, insn_info) | |
4553 { | |
4554 unsigned regno = DF_REF_REGNO (def); | |
4555 bitmap_set_bit (set, regno); | |
4556 bitmap_clear_bit (transp, regno); | |
4557 bitmap_clear_bit (moveable, regno); | |
4558 } | |
4559 } | |
4560 } | |
4561 | |
4562 FOR_EACH_BB_FN (bb, cfun) | |
4563 { | |
4564 bitmap local = bb_local + bb->index; | |
4565 rtx_insn *insn; | |
4566 | |
4567 FOR_BB_INSNS (bb, insn) | |
4568 if (NONDEBUG_INSN_P (insn)) | |
4569 { | |
4570 df_insn_info *insn_info = DF_INSN_INFO_GET (insn); | |
4571 rtx_insn *def_insn; | |
4572 rtx closest_use, note; | |
4573 df_ref def, use; | |
4574 unsigned regno; | |
4575 bool all_dominated, all_local; | |
4576 machine_mode mode; | |
4577 | |
4578 def = df_single_def (insn_info); | |
4579 /* There must be exactly one def in this insn. */ | |
4580 if (!def || !single_set (insn)) | |
4581 continue; | |
4582 /* This must be the only definition of the reg. We also limit | |
4583 which modes we deal with so that we can assume we can generate | |
4584 move instructions. */ | |
4585 regno = DF_REF_REGNO (def); | |
4586 mode = GET_MODE (DF_REF_REG (def)); | |
4587 if (DF_REG_DEF_COUNT (regno) != 1 | |
4588 || !DF_REF_INSN_INFO (def) | |
4589 || HARD_REGISTER_NUM_P (regno) | |
4590 || DF_REG_EQ_USE_COUNT (regno) > 0 | |
4591 || (!INTEGRAL_MODE_P (mode) && !FLOAT_MODE_P (mode))) | |
4592 continue; | |
4593 def_insn = DF_REF_INSN (def); | |
4594 | |
4595 for (note = REG_NOTES (def_insn); note; note = XEXP (note, 1)) | |
4596 if (REG_NOTE_KIND (note) == REG_EQUIV && MEM_P (XEXP (note, 0))) | |
4597 break; | |
4598 | |
4599 if (note) | |
4600 { | |
4601 if (dump_file) | |
4602 fprintf (dump_file, "Ignoring reg %d, has equiv memory\n", | |
4603 regno); | |
4604 bitmap_set_bit (unusable_as_input, regno); | |
4605 continue; | |
4606 } | |
4607 | |
4608 use = DF_REG_USE_CHAIN (regno); | |
4609 all_dominated = true; | |
4610 all_local = true; | |
4611 closest_use = NULL_RTX; | |
4612 for (; use; use = DF_REF_NEXT_REG (use)) | |
4613 { | |
4614 rtx_insn *insn; | |
4615 if (!DF_REF_INSN_INFO (use)) | |
4616 { | |
4617 all_dominated = false; | |
4618 all_local = false; | |
4619 break; | |
4620 } | |
4621 insn = DF_REF_INSN (use); | |
4622 if (DEBUG_INSN_P (insn)) | |
4623 continue; | |
4624 if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (def_insn)) | |
4625 all_local = false; | |
4626 if (!insn_dominated_by_p (insn, def_insn, uid_luid)) | |
4627 all_dominated = false; | |
4628 if (closest_use != insn && closest_use != const0_rtx) | |
4629 { | |
4630 if (closest_use == NULL_RTX) | |
4631 closest_use = insn; | |
4632 else if (insn_dominated_by_p (closest_use, insn, uid_luid)) | |
4633 closest_use = insn; | |
4634 else if (!insn_dominated_by_p (insn, closest_use, uid_luid)) | |
4635 closest_use = const0_rtx; | |
4636 } | |
4637 } | |
4638 if (!all_dominated) | |
4639 { | |
4640 if (dump_file) | |
4641 fprintf (dump_file, "Reg %d not all uses dominated by set\n", | |
4642 regno); | |
4643 continue; | |
4644 } | |
4645 if (all_local) | |
4646 bitmap_set_bit (local, regno); | |
4647 if (closest_use == const0_rtx || closest_use == NULL | |
4648 || next_nonnote_nondebug_insn (def_insn) == closest_use) | |
4649 { | |
4650 if (dump_file) | |
4651 fprintf (dump_file, "Reg %d uninteresting%s\n", regno, | |
4652 closest_use == const0_rtx || closest_use == NULL | |
4653 ? " (no unique first use)" : ""); | |
4654 continue; | |
4655 } | |
4656 if (HAVE_cc0 && reg_referenced_p (cc0_rtx, PATTERN (closest_use))) | |
4657 { | |
4658 if (dump_file) | |
4659 fprintf (dump_file, "Reg %d: closest user uses cc0\n", | |
4660 regno); | |
4661 continue; | |
4662 } | |
4663 | |
4664 bitmap_set_bit (interesting, regno); | |
4665 /* If we get here, we know closest_use is a non-NULL insn | |
4666 (as opposed to const0_rtx). */ | |
4667 closest_uses[regno] = as_a <rtx_insn *> (closest_use); | |
4668 | |
4669 if (dump_file && (all_local || all_dominated)) | |
4670 { | |
4671 fprintf (dump_file, "Reg %u:", regno); | |
4672 if (all_local) | |
4673 fprintf (dump_file, " local to bb %d", bb->index); | |
4674 if (all_dominated) | |
4675 fprintf (dump_file, " def dominates all uses"); | |
4676 if (closest_use != const0_rtx) | |
4677 fprintf (dump_file, " has unique first use"); | |
4678 fputs ("\n", dump_file); | |
4679 } | |
4680 } | |
4681 } | |
4682 | |
4683 EXECUTE_IF_SET_IN_BITMAP (interesting, 0, i, bi) | |
4684 { | |
4685 df_ref def = DF_REG_DEF_CHAIN (i); | |
4686 rtx_insn *def_insn = DF_REF_INSN (def); | |
4687 basic_block def_block = BLOCK_FOR_INSN (def_insn); | |
4688 bitmap def_bb_local = bb_local + def_block->index; | |
4689 bitmap def_bb_moveable = bb_moveable_reg_sets + def_block->index; | |
4690 bitmap def_bb_transp = bb_transp_live + def_block->index; | |
4691 bool local_to_bb_p = bitmap_bit_p (def_bb_local, i); | |
4692 rtx_insn *use_insn = closest_uses[i]; | |
4693 df_ref use; | |
4694 bool all_ok = true; | |
4695 bool all_transp = true; | |
4696 | |
4697 if (!REG_P (DF_REF_REG (def))) | |
4698 continue; | |
4699 | |
4700 if (!local_to_bb_p) | |
4701 { | |
4702 if (dump_file) | |
4703 fprintf (dump_file, "Reg %u not local to one basic block\n", | |
4704 i); | |
4705 continue; | |
4706 } | |
4707 if (reg_equiv_init (i) != NULL_RTX) | |
4708 { | |
4709 if (dump_file) | |
4710 fprintf (dump_file, "Ignoring reg %u with equiv init insn\n", | |
4711 i); | |
4712 continue; | |
4713 } | |
4714 if (!rtx_moveable_p (&PATTERN (def_insn), OP_IN)) | |
4715 { | |
4716 if (dump_file) | |
4717 fprintf (dump_file, "Found def insn %d for %d to be not moveable\n", | |
4718 INSN_UID (def_insn), i); | |
4719 continue; | |
4720 } | |
4721 if (dump_file) | |
4722 fprintf (dump_file, "Examining insn %d, def for %d\n", | |
4723 INSN_UID (def_insn), i); | |
4724 FOR_EACH_INSN_USE (use, def_insn) | |
4725 { | |
4726 unsigned regno = DF_REF_REGNO (use); | |
4727 if (bitmap_bit_p (unusable_as_input, regno)) | |
4728 { | |
4729 all_ok = false; | |
4730 if (dump_file) | |
4731 fprintf (dump_file, " found unusable input reg %u.\n", regno); | |
4732 break; | |
4733 } | |
4734 if (!bitmap_bit_p (def_bb_transp, regno)) | |
4735 { | |
4736 if (bitmap_bit_p (def_bb_moveable, regno) | |
4737 && !control_flow_insn_p (use_insn) | |
4738 && (!HAVE_cc0 || !sets_cc0_p (use_insn))) | |
4739 { | |
4740 if (modified_between_p (DF_REF_REG (use), def_insn, use_insn)) | |
4741 { | |
4742 rtx_insn *x = NEXT_INSN (def_insn); | |
4743 while (!modified_in_p (DF_REF_REG (use), x)) | |
4744 { | |
4745 gcc_assert (x != use_insn); | |
4746 x = NEXT_INSN (x); | |
4747 } | |
4748 if (dump_file) | |
4749 fprintf (dump_file, " input reg %u modified but insn %d moveable\n", | |
4750 regno, INSN_UID (x)); | |
4751 emit_insn_after (PATTERN (x), use_insn); | |
4752 set_insn_deleted (x); | |
4753 } | |
4754 else | |
4755 { | |
4756 if (dump_file) | |
4757 fprintf (dump_file, " input reg %u modified between def and use\n", | |
4758 regno); | |
4759 all_transp = false; | |
4760 } | |
4761 } | |
4762 else | |
4763 all_transp = false; | |
4764 } | |
4765 } | |
4766 if (!all_ok) | |
4767 continue; | |
4768 if (!dbg_cnt (ira_move)) | |
4769 break; | |
4770 if (dump_file) | |
4771 fprintf (dump_file, " all ok%s\n", all_transp ? " and transp" : ""); | |
4772 | |
4773 if (all_transp) | |
4774 { | |
4775 rtx def_reg = DF_REF_REG (def); | |
4776 rtx newreg = ira_create_new_reg (def_reg); | |
4777 if (validate_change (def_insn, DF_REF_REAL_LOC (def), newreg, 0)) | |
4778 { | |
4779 unsigned nregno = REGNO (newreg); | |
4780 emit_insn_before (gen_move_insn (def_reg, newreg), use_insn); | |
4781 nregno -= max_regs; | |
4782 pseudo_replaced_reg[nregno] = def_reg; | |
4783 } | |
4784 } | |
4785 } | |
4786 | |
4787 FOR_EACH_BB_FN (bb, cfun) | |
4788 { | |
4789 bitmap_clear (bb_local + bb->index); | |
4790 bitmap_clear (bb_transp_live + bb->index); | |
4791 bitmap_clear (bb_moveable_reg_sets + bb->index); | |
4792 } | |
4793 free (uid_luid); | |
4794 free (closest_uses); | |
4795 free (bb_local); | |
4796 free (bb_transp_live); | |
4797 free (bb_moveable_reg_sets); | |
4798 | |
4799 last_moveable_pseudo = max_reg_num (); | |
4800 | |
4801 fix_reg_equiv_init (); | |
4802 expand_reg_info (); | |
4803 regstat_free_n_sets_and_refs (); | |
4804 regstat_free_ri (); | |
4805 regstat_init_n_sets_and_refs (); | |
4806 regstat_compute_ri (); | |
4807 free_dominance_info (CDI_DOMINATORS); | |
4808 } | |
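In outline, find_moveable_pseudos splits a qualifying pseudo's long live range: the single definition is rewritten to set a brand-new pseudo, and a plain register move into the old pseudo is emitted immediately before its unique first use. A rough sketch of the rewrite, with made-up insn and pseudo numbers:

/* Before (one def, a unique first use, the register untouched in between):

     insn 10:  (set (reg 100) (plus:SI (reg 101) (const_int 4)))
     ...
     insn 40:  ... first use of (reg 100) ...

   After find_moveable_pseudos (the old pseudo is remembered in
   pseudo_replaced_reg so the split can be undone later):

     insn 10:  (set (reg 150) (plus:SI (reg 101) (const_int 4)))
     ...
     insn 39:  (set (reg 100) (reg 150))   ;; emitted just before insn 40
     insn 40:  ... first use of (reg 100) ...

   The new pseudo 150 now carries the long live range; if it ends up with
   no hard register, move_unallocated_pseudos below undoes the split.  */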
4809 | |
4810 /* If the SET pattern SET is an assignment from a hard register to a pseudo which | |
4811 is live at CALL_DOM (if non-NULL; otherwise this check is omitted), return | |
4812 the destination. Otherwise return NULL. */ | |
4813 | |
4814 static rtx | |
4815 interesting_dest_for_shprep_1 (rtx set, basic_block call_dom) | |
4816 { | |
4817 rtx src = SET_SRC (set); | |
4818 rtx dest = SET_DEST (set); | |
4819 if (!REG_P (src) || !HARD_REGISTER_P (src) | |
4820 || !REG_P (dest) || HARD_REGISTER_P (dest) | |
4821 || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest)))) | |
4822 return NULL; | |
4823 return dest; | |
4824 } | |
4825 | |
4826 /* If INSN is interesting for parameter range-splitting shrink-wrapping | |
4827 preparation, i.e. it is a single SET from a hard register to a pseudo that | |
4828 is live at CALL_DOM (if non-NULL; otherwise this check is omitted), or a | |
4829 PARALLEL containing exactly one such SET, return the destination. | |
4830 Otherwise return NULL. */ | |
4831 | |
4832 static rtx | |
4833 interesting_dest_for_shprep (rtx_insn *insn, basic_block call_dom) | |
4834 { | |
4835 if (!INSN_P (insn)) | |
4836 return NULL; | |
4837 rtx pat = PATTERN (insn); | |
4838 if (GET_CODE (pat) == SET) | |
4839 return interesting_dest_for_shprep_1 (pat, call_dom); | |
4840 | |
4841 if (GET_CODE (pat) != PARALLEL) | |
4842 return NULL; | |
4843 rtx ret = NULL; | |
4844 for (int i = 0; i < XVECLEN (pat, 0); i++) | |
4845 { | |
4846 rtx sub = XVECEXP (pat, 0, i); | |
4847 if (GET_CODE (sub) == USE || GET_CODE (sub) == CLOBBER) | |
4848 continue; | |
4849 if (GET_CODE (sub) != SET | |
4850 || side_effects_p (sub)) | |
4851 return NULL; | |
4852 rtx dest = interesting_dest_for_shprep_1 (sub, call_dom); | |
4853 if (dest && ret) | |
4854 return NULL; | |
4855 if (dest) | |
4856 ret = dest; | |
4857 } | |
4858 return ret; | |
4859 } | |
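interesting_dest_for_shprep typically matches the argument-copy insns emitted at the top of the function. A sketch of the two accepted shapes, using illustrative x86-64-style register numbering:

/* A plain SET copying an incoming argument register into a pseudo:

     (set (reg/v:DI 90 [ x ]) (reg:DI 5 di))

   or a PARALLEL with exactly one such SET beside USE/CLOBBER elements and
   no side effects:

     (parallel [(set (reg:SI 91) (reg:SI 4 si))
                (clobber (reg:CC 17 flags))])

   Anything else, including a PARALLEL with two qualifying SETs, yields
   NULL.  */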
4860 | |
4861 /* Split the live ranges of pseudos that are loaded from hard registers in the | |
4862 first BB, at a BB that dominates all non-sibling calls, if such a BB can be | |
4863 found and is not in a loop. Return true if the function has made any | |
4864 changes. */ | |
4865 | |
4866 static bool | |
4867 split_live_ranges_for_shrink_wrap (void) | |
4868 { | |
4869 basic_block bb, call_dom = NULL; | |
4870 basic_block first = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); | |
4871 rtx_insn *insn, *last_interesting_insn = NULL; | |
4872 auto_bitmap need_new, reachable; | |
4873 vec<basic_block> queue; | |
4874 | |
4875 if (!SHRINK_WRAPPING_ENABLED) | |
4876 return false; | |
4877 | |
4878 queue.create (n_basic_blocks_for_fn (cfun)); | |
4879 | |
4880 FOR_EACH_BB_FN (bb, cfun) | |
4881 FOR_BB_INSNS (bb, insn) | |
4882 if (CALL_P (insn) && !SIBLING_CALL_P (insn)) | |
4883 { | |
4884 if (bb == first) | |
4885 { | |
4886 queue.release (); | |
4887 return false; | |
4888 } | |
4889 | |
4890 bitmap_set_bit (need_new, bb->index); | |
4891 bitmap_set_bit (reachable, bb->index); | |
4892 queue.quick_push (bb); | |
4893 break; | |
4894 } | |
4895 | |
4896 if (queue.is_empty ()) | |
4897 { | |
4898 queue.release (); | |
4899 return false; | |
4900 } | |
4901 | |
4902 while (!queue.is_empty ()) | |
4903 { | |
4904 edge e; | |
4905 edge_iterator ei; | |
4906 | |
4907 bb = queue.pop (); | |
4908 FOR_EACH_EDGE (e, ei, bb->succs) | |
4909 if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) | |
4910 && bitmap_set_bit (reachable, e->dest->index)) | |
4911 queue.quick_push (e->dest); | |
4912 } | |
4913 queue.release (); | |
4914 | |
4915 FOR_BB_INSNS (first, insn) | |
4916 { | |
4917 rtx dest = interesting_dest_for_shprep (insn, NULL); | |
4918 if (!dest) | |
4919 continue; | |
4920 | |
4921 if (DF_REG_DEF_COUNT (REGNO (dest)) > 1) | |
4922 return false; | |
4923 | |
4924 for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest)); | |
4925 use; | |
4926 use = DF_REF_NEXT_REG (use)) | |
4927 { | |
4928 int ubbi = DF_REF_BB (use)->index; | |
4929 if (bitmap_bit_p (reachable, ubbi)) | |
4930 bitmap_set_bit (need_new, ubbi); | |
4931 } | |
4932 last_interesting_insn = insn; | |
4933 } | |
4934 | |
4935 if (!last_interesting_insn) | |
4936 return false; | |
4937 | |
4938 call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, need_new); | |
4939 if (call_dom == first) | |
4940 return false; | |
4941 | |
4942 loop_optimizer_init (AVOID_CFG_MODIFICATIONS); | |
4943 while (bb_loop_depth (call_dom) > 0) | |
4944 call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom); | |
4945 loop_optimizer_finalize (); | |
4946 | |
4947 if (call_dom == first) | |
4948 return false; | |
4949 | |
4950 calculate_dominance_info (CDI_POST_DOMINATORS); | |
4951 if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom)) | |
4952 { | |
4953 free_dominance_info (CDI_POST_DOMINATORS); | |
4954 return false; | |
4955 } | |
4956 free_dominance_info (CDI_POST_DOMINATORS); | |
4957 | |
4958 if (dump_file) | |
4959 fprintf (dump_file, "Will split live ranges of parameters at BB %i\n", | |
4960 call_dom->index); | |
4961 | |
4962 bool ret = false; | |
4963 FOR_BB_INSNS (first, insn) | |
4964 { | |
4965 rtx dest = interesting_dest_for_shprep (insn, call_dom); | |
4966 if (!dest || dest == pic_offset_table_rtx) | |
4967 continue; | |
4968 | |
4969 bool need_newreg = false; | |
4970 df_ref use, next; | |
4971 for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next) | |
4972 { | |
4973 rtx_insn *uin = DF_REF_INSN (use); | |
4974 next = DF_REF_NEXT_REG (use); | |
4975 | |
4976 if (DEBUG_INSN_P (uin)) | |
4977 continue; | |
4978 | |
4979 basic_block ubb = BLOCK_FOR_INSN (uin); | |
4980 if (ubb == call_dom | |
4981 || dominated_by_p (CDI_DOMINATORS, ubb, call_dom)) | |
4982 { | |
4983 need_newreg = true; | |
4984 break; | |
4985 } | |
4986 } | |
4987 | |
4988 if (need_newreg) | |
4989 { | |
4990 rtx newreg = ira_create_new_reg (dest); | |
4991 | |
4992 for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next) | |
4993 { | |
4994 rtx_insn *uin = DF_REF_INSN (use); | |
4995 next = DF_REF_NEXT_REG (use); | |
4996 | |
4997 basic_block ubb = BLOCK_FOR_INSN (uin); | |
4998 if (ubb == call_dom | |
4999 || dominated_by_p (CDI_DOMINATORS, ubb, call_dom)) | |
5000 validate_change (uin, DF_REF_REAL_LOC (use), newreg, true); | |
5001 } | |
5002 | |
5003 rtx_insn *new_move = gen_move_insn (newreg, dest); | |
5004 emit_insn_after (new_move, bb_note (call_dom)); | |
5005 if (dump_file) | |
5006 { | |
5007 fprintf (dump_file, "Split live-range of register "); | |
5008 print_rtl_single (dump_file, dest); | |
5009 } | |
5010 ret = true; | |
5011 } | |
5012 | |
5013 if (insn == last_interesting_insn) | |
5014 break; | |
5015 } | |
5016 apply_change_group (); | |
5017 return ret; | |
5018 } | |
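The effect of split_live_ranges_for_shrink_wrap is easiest to see on a small CFG; block and pseudo numbers below are made up for illustration:

/* Before: the pseudo loaded from the argument register in the first block
   stays live into the block that dominates all the calls:

     BB 2 (first):     (set (reg 90) (reg:DI 5 di))
     BB 3:             ... uses of (reg 90), no calls ...
     BB 7 (call_dom):  ... calls and further uses of (reg 90) ...

   After: uses in call_dom and the blocks it dominates are redirected to a
   fresh pseudo, initialized by a move placed right after call_dom's block
   note, so the live range of (reg 90) no longer extends into the region
   containing the calls:

     BB 2 (first):     (set (reg 90) (reg:DI 5 di))
     BB 3:             ... uses of (reg 90) ...
     BB 7 (call_dom):  (set (reg 95) (reg 90))
                       ... calls and uses of (reg 95) ...  */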
5019 | |
5020 /* Perform the second half of the transformation started in | |
5021 find_moveable_pseudos. We look for instances where the newly introduced | |
5022 pseudo remains unallocated, and remove it by moving the definition to | |
5023 just before its use, replacing the move instruction generated by | |
5024 find_moveable_pseudos. */ | |
5025 static void | |
5026 move_unallocated_pseudos (void) | |
5027 { | |
5028 int i; | |
5029 for (i = first_moveable_pseudo; i < last_moveable_pseudo; i++) | |
5030 if (reg_renumber[i] < 0) | |
5031 { | |
5032 int idx = i - first_moveable_pseudo; | |
5033 rtx other_reg = pseudo_replaced_reg[idx]; | |
5034 rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (i)); | |
5035 /* The use must follow all definitions of OTHER_REG, so we can | |
5036 insert the new definition immediately after any of them. */ | |
5037 df_ref other_def = DF_REG_DEF_CHAIN (REGNO (other_reg)); | |
5038 rtx_insn *move_insn = DF_REF_INSN (other_def); | |
5039 rtx_insn *newinsn = emit_insn_after (PATTERN (def_insn), move_insn); | |
5040 rtx set; | |
5041 int success; | |
5042 | |
5043 if (dump_file) | |
5044 fprintf (dump_file, "moving def of %d (insn %d now) ", | |
5045 REGNO (other_reg), INSN_UID (def_insn)); | |
5046 | |
5047 delete_insn (move_insn); | |
5048 while ((other_def = DF_REG_DEF_CHAIN (REGNO (other_reg)))) | |
5049 delete_insn (DF_REF_INSN (other_def)); | |
5050 delete_insn (def_insn); | |
5051 | |
5052 set = single_set (newinsn); | |
5053 success = validate_change (newinsn, &SET_DEST (set), other_reg, 0); | |
5054 gcc_assert (success); | |
5055 if (dump_file) | |
5056 fprintf (dump_file, " %d) rather than keep unallocated replacement %d\n", | |
5057 INSN_UID (newinsn), i); | |
5058 SET_REG_N_REFS (i, 0); | |
5059 } | |
5060 } | |
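Continuing the sketch given after find_moveable_pseudos, the undo performed here looks roughly like this when the new pseudo receives no hard register (numbers again illustrative):

/* Left over from find_moveable_pseudos, with (reg 150) still unallocated:

     insn 10:  (set (reg 150) (plus:SI (reg 101) (const_int 4)))
     ...
     insn 39:  (set (reg 100) (reg 150))
     insn 40:  ... first use of (reg 100) ...

   move_unallocated_pseudos re-emits the original pattern where the move
   was, deletes the move and the old definition, and retargets the copy at
   the original pseudo:

     ...
     insn 41:  (set (reg 100) (plus:SI (reg 101) (const_int 4)))
     insn 40:  ... first use of (reg 100) ...
                         ;; insn 41 has a fresh UID but sits where the
                         ;; deleted move used to be.  */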
3053 | 5061 |
3054 /* Allocate memory for reg_equiv_memory_loc. */ | 5062 /* If the backend knows where to allocate pseudos for hard |
5063 register initial values, register these allocations now. */ | |
3055 static void | 5064 static void |
3056 init_reg_equiv_memory_loc (void) | 5065 allocate_initial_values (void) |
3057 { | 5066 { |
3058 max_regno = max_reg_num (); | 5067 if (targetm.allocate_initial_value) |
3059 | 5068 { |
3060 /* And the reg_equiv_memory_loc array. */ | 5069 rtx hreg, preg, x; |
3061 VEC_safe_grow (rtx, gc, reg_equiv_memory_loc_vec, max_regno); | 5070 int i, regno; |
3062 memset (VEC_address (rtx, reg_equiv_memory_loc_vec), 0, | 5071 |
3063 sizeof (rtx) * max_regno); | 5072 for (i = 0; HARD_REGISTER_NUM_P (i); i++) |
3064 reg_equiv_memory_loc = VEC_address (rtx, reg_equiv_memory_loc_vec); | 5073 { |
3065 } | 5074 if (! initial_value_entry (i, &hreg, &preg)) |
3066 | 5075 break; |
3067 /* All natural loops. */ | 5076 |
3068 struct loops ira_loops; | 5077 x = targetm.allocate_initial_value (hreg); |
5078 regno = REGNO (preg); | |
5079 if (x && REG_N_SETS (regno) <= 1) | |
5080 { | |
5081 if (MEM_P (x)) | |
5082 reg_equiv_memory_loc (regno) = x; | |
5083 else | |
5084 { | |
5085 basic_block bb; | |
5086 int new_regno; | |
5087 | |
5088 gcc_assert (REG_P (x)); | |
5089 new_regno = REGNO (x); | |
5090 reg_renumber[regno] = new_regno; | |
5091 /* Poke the regno right into regno_reg_rtx so that even | |
5092 fixed regs are accepted. */ | |
5093 SET_REGNO (preg, new_regno); | |
5094 /* Update global register liveness information. */ | |
5095 FOR_EACH_BB_FN (bb, cfun) | |
5096 { | |
5097 if (REGNO_REG_SET_P (df_get_live_in (bb), regno)) | |
5098 SET_REGNO_REG_SET (df_get_live_in (bb), new_regno); | |
5099 if (REGNO_REG_SET_P (df_get_live_out (bb), regno)) | |
5100 SET_REGNO_REG_SET (df_get_live_out (bb), new_regno); | |
5101 } | |
5102 } | |
5103 } | |
5104 } | |
5105 | |
5106 gcc_checking_assert (! initial_value_entry (FIRST_PSEUDO_REGISTER, | |
5107 &hreg, &preg)); | |
5108 } | |
5109 } | |
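allocate_initial_values drives the TARGET_ALLOCATE_INITIAL_VALUE hook. The function below is a hypothetical back-end sketch, not code from any real target (LINK_REGNUM and the slot offset are invented for the example); it only shows the kind of RTX the loop above expects back:

static rtx
example_allocate_initial_value (rtx hard_reg)
{
  /* Pretend the prologue always stores the incoming link register into a
     fixed slot just below the incoming arguments; returning that MEM lets
     the caller record it as the pseudo's equivalent memory location.  */
  if (REGNO (hard_reg) == LINK_REGNUM)
    return gen_frame_mem (Pmode,
			  plus_constant (Pmode, arg_pointer_rtx,
					 -UNITS_PER_WORD));
  /* No machine-specific home for the initial value of other registers.  */
  return NULL_RTX;
}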
5110 | |
5111 | |
5112 /* True when we use LRA instead of reload pass for the current | |
5113 function. */ | |
5114 bool ira_use_lra_p; | |
3069 | 5115 |
3070 /* True if we have allocno conflicts. It is false for non-optimized | 5116 /* True if we have allocno conflicts. It is false for non-optimized |
3071 mode or when the conflict table is too big. */ | 5117 mode or when the conflict table is too big. */ |
3072 bool ira_conflicts_p; | 5118 bool ira_conflicts_p; |
3073 | 5119 |
5120 /* Saved between IRA and reload. */ | |
5121 static int saved_flag_ira_share_spill_slots; | |
5122 | |
3074 /* This is the main entry of IRA. */ | 5123 /* This is the main entry of IRA. */ |
3075 static void | 5124 static void |
3076 ira (FILE *f) | 5125 ira (FILE *f) |
3077 { | 5126 { |
3078 int overall_cost_before, allocated_reg_info_size; | |
3079 bool loops_p; | 5127 bool loops_p; |
3080 int max_regno_before_ira, ira_max_point_before_emit; | 5128 int ira_max_point_before_emit; |
3081 int rebuild_p; | 5129 bool saved_flag_caller_saves = flag_caller_saves; |
3082 int saved_flag_ira_share_spill_slots; | 5130 enum ira_region saved_flag_ira_region = flag_ira_region; |
3083 basic_block bb; | 5131 |
3084 | 5132 clear_bb_flags (); |
3085 timevar_push (TV_IRA); | 5133 |
3086 | 5134 /* Determine if the current function is a leaf before running IRA |
3087 if (flag_caller_saves) | 5135 since this can impact optimizations done by the prologue and |
5136 epilogue thus changing register elimination offsets. | |
5137 Other target callbacks may use crtl->is_leaf too, including | |
5138 SHRINK_WRAPPING_ENABLED, so initialize as early as possible. */ | |
5139 crtl->is_leaf = leaf_function_p (); | |
5140 | |
5141 /* Perform target specific PIC register initialization. */ | |
5142 targetm.init_pic_reg (); | |
5143 | |
5144 ira_conflicts_p = optimize > 0; | |
5145 | |
5146 /* If there are too many pseudos and/or basic blocks (e.g. 10K | |
5147 pseudos and 10K blocks or 100K pseudos and 1K blocks), we will | |
5148 use simplified and faster algorithms in LRA. */ | |
5149 lra_simple_p | |
5150 = (ira_use_lra_p | |
5151 && max_reg_num () >= (1 << 26) / last_basic_block_for_fn (cfun)); | |
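  /* As a concrete instance of the threshold above: with 10,000 basic blocks
     the cutoff is (1 << 26) / 10000 == 6710 pseudos, and with 1,000 blocks
     it is 67,108 pseudos, matching the "10K pseudos and 10K blocks" and
     "100K pseudos and 1K blocks" examples in the comment.  */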
5152 if (lra_simple_p) | |
5153 { | |
5154 /* This permits live range splitting to be skipped in LRA. */ | |
5155 flag_caller_saves = false; | |
5156 /* There is no point in doing regional allocation when we use the | |
5157 simplified LRA. */ | |
5158 flag_ira_region = IRA_REGION_ONE; | |
5159 ira_conflicts_p = false; | |
5160 } | |
5161 | |
5162 #ifndef IRA_NO_OBSTACK | |
5163 gcc_obstack_init (&ira_obstack); | |
5164 #endif | |
5165 bitmap_obstack_initialize (&ira_bitmap_obstack); | |
5166 | |
5167 /* LRA uses its own infrastructure to handle caller save registers. */ | |
5168 if (flag_caller_saves && !ira_use_lra_p) | |
3088 init_caller_save (); | 5169 init_caller_save (); |
3089 | 5170 |
3090 if (flag_ira_verbose < 10) | 5171 if (flag_ira_verbose < 10) |
3091 { | 5172 { |
3092 internal_flag_ira_verbose = flag_ira_verbose; | 5173 internal_flag_ira_verbose = flag_ira_verbose; |
3096 { | 5177 { |
3097 internal_flag_ira_verbose = flag_ira_verbose - 10; | 5178 internal_flag_ira_verbose = flag_ira_verbose - 10; |
3098 ira_dump_file = stderr; | 5179 ira_dump_file = stderr; |
3099 } | 5180 } |
3100 | 5181 |
3101 ira_conflicts_p = optimize > 0; | |
3102 setup_prohibited_mode_move_regs (); | 5182 setup_prohibited_mode_move_regs (); |
3103 | 5183 decrease_live_ranges_number (); |
3104 df_note_add_problem (); | 5184 df_note_add_problem (); |
3105 | 5185 |
3106 if (optimize == 1) | 5186 /* DF_LIVE can't be used in the register allocator, too many other |
3107 { | 5187 parts of the compiler depend on using the "classic" liveness |
3108 df_live_add_problem (); | 5188 interpretation of the DF_LR problem. See PR38711. |
3109 df_live_set_all_dirty (); | 5189 Remove the problem, so that we don't spend time updating it in |
3110 } | 5190 any of the df_analyze() calls during IRA/LRA. */ |
3111 #ifdef ENABLE_CHECKING | 5191 if (optimize > 1) |
3112 df->changeable_flags |= DF_VERIFY_SCHEDULED; | 5192 df_remove_problem (df_live); |
3113 #endif | 5193 gcc_checking_assert (df_live == NULL); |
5194 | |
5195 if (flag_checking) | |
5196 df->changeable_flags |= DF_VERIFY_SCHEDULED; | |
5197 | |
3114 df_analyze (); | 5198 df_analyze (); |
5199 | |
5200 init_reg_equiv (); | |
5201 if (ira_conflicts_p) | |
5202 { | |
5203 calculate_dominance_info (CDI_DOMINATORS); | |
5204 | |
5205 if (split_live_ranges_for_shrink_wrap ()) | |
5206 df_analyze (); | |
5207 | |
5208 free_dominance_info (CDI_DOMINATORS); | |
5209 } | |
5210 | |
3115 df_clear_flags (DF_NO_INSN_RESCAN); | 5211 df_clear_flags (DF_NO_INSN_RESCAN); |
5212 | |
5213 indirect_jump_optimize (); | |
5214 if (delete_trivially_dead_insns (get_insns (), max_reg_num ())) | |
5215 df_analyze (); | |
5216 | |
3116 regstat_init_n_sets_and_refs (); | 5217 regstat_init_n_sets_and_refs (); |
3117 regstat_compute_ri (); | 5218 regstat_compute_ri (); |
3118 | 5219 |
3119 /* If we are not optimizing, then this is the only place before | 5220 /* If we are not optimizing, then this is the only place before |
3120 register allocation where dataflow is done. And that is needed | 5221 register allocation where dataflow is done. And that is needed |
3121 to generate these warnings. */ | 5222 to generate these warnings. */ |
3122 if (warn_clobbered) | 5223 if (warn_clobbered) |
3123 generate_setjmp_warnings (); | 5224 generate_setjmp_warnings (); |
3124 | 5225 |
3125 /* Determine if the current function is a leaf before running IRA | |
3126 since this can impact optimizations done by the prologue and | |
3127 epilogue thus changing register elimination offsets. */ | |
3128 current_function_is_leaf = leaf_function_p (); | |
3129 | |
3130 if (resize_reg_info () && flag_ira_loop_pressure) | 5226 if (resize_reg_info () && flag_ira_loop_pressure) |
3131 ira_set_pseudo_classes (ira_dump_file); | 5227 ira_set_pseudo_classes (true, ira_dump_file); |
3132 | 5228 |
3133 rebuild_p = update_equiv_regs (); | 5229 init_alias_analysis (); |
3134 | 5230 loop_optimizer_init (AVOID_CFG_MODIFICATIONS); |
3135 #ifndef IRA_NO_OBSTACK | 5231 reg_equiv = XCNEWVEC (struct equivalence, max_reg_num ()); |
3136 gcc_obstack_init (&ira_obstack); | 5232 update_equiv_regs (); |
3137 #endif | 5233 |
3138 bitmap_obstack_initialize (&ira_bitmap_obstack); | 5234 /* Don't move insns if live range shrinkage or register |
5235 pressure-sensitive scheduling was done, because it will not | |
5236 improve allocation but will likely worsen insn scheduling. */ | |
5237 if (optimize | |
5238 && !flag_live_range_shrinkage | |
5239 && !(flag_sched_pressure && flag_schedule_insns)) | |
5240 combine_and_move_insns (); | |
5241 | |
5242 /* Gather additional equivalences with memory. */ | |
3139 if (optimize) | 5243 if (optimize) |
3140 { | 5244 add_store_equivs (); |
3141 max_regno = max_reg_num (); | 5245 |
3142 ira_reg_equiv_len = max_regno; | 5246 loop_optimizer_finalize (); |
3143 ira_reg_equiv_invariant_p | 5247 free_dominance_info (CDI_DOMINATORS); |
3144 = (bool *) ira_allocate (max_regno * sizeof (bool)); | 5248 end_alias_analysis (); |
3145 memset (ira_reg_equiv_invariant_p, 0, max_regno * sizeof (bool)); | 5249 free (reg_equiv); |
3146 ira_reg_equiv_const = (rtx *) ira_allocate (max_regno * sizeof (rtx)); | 5250 |
3147 memset (ira_reg_equiv_const, 0, max_regno * sizeof (rtx)); | 5251 setup_reg_equiv (); |
3148 find_reg_equiv_invariant_const (); | 5252 grow_reg_equivs (); |
3149 if (rebuild_p) | 5253 setup_reg_equiv_init (); |
3150 { | 5254 |
3151 timevar_push (TV_JUMP); | 5255 allocated_reg_info_size = max_reg_num (); |
3152 rebuild_jump_labels (get_insns ()); | 5256 |
3153 if (purge_all_dead_edges ()) | 5257 /* It is not worth doing such an improvement when we use simple |
3154 delete_unreachable_blocks (); | 5258 allocation, either because of -O0 usage or because the function is too |
3155 timevar_pop (TV_JUMP); | 5259 big. */ |
3156 } | 5260 if (ira_conflicts_p) |
3157 } | 5261 find_moveable_pseudos (); |
3158 | 5262 |
3159 max_regno_before_ira = allocated_reg_info_size = max_reg_num (); | 5263 max_regno_before_ira = max_reg_num (); |
3160 ira_setup_eliminable_regset (); | 5264 ira_setup_eliminable_regset (); |
3161 | 5265 |
3162 ira_overall_cost = ira_reg_cost = ira_mem_cost = 0; | 5266 ira_overall_cost = ira_reg_cost = ira_mem_cost = 0; |
3163 ira_load_cost = ira_store_cost = ira_shuffle_cost = 0; | 5267 ira_load_cost = ira_store_cost = ira_shuffle_cost = 0; |
3164 ira_move_loops_num = ira_additional_jumps_num = 0; | 5268 ira_move_loops_num = ira_additional_jumps_num = 0; |
3165 | 5269 |
3166 ira_assert (current_loops == NULL); | 5270 ira_assert (current_loops == NULL); |
3167 flow_loops_find (&ira_loops); | 5271 if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED) |
3168 record_loop_exits (); | 5272 loop_optimizer_init (AVOID_CFG_MODIFICATIONS | LOOPS_HAVE_RECORDED_EXITS); |
3169 current_loops = &ira_loops; | |
3170 | |
3171 init_reg_equiv_memory_loc (); | |
3172 | 5273 |
3173 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) | 5274 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) |
3174 fprintf (ira_dump_file, "Building IRA IR\n"); | 5275 fprintf (ira_dump_file, "Building IRA IR\n"); |
3175 loops_p = ira_build (optimize | 5276 loops_p = ira_build (); |
3176 && (flag_ira_region == IRA_REGION_ALL | |
3177 || flag_ira_region == IRA_REGION_MIXED)); | |
3178 | 5277 |
3179 ira_assert (ira_conflicts_p || !loops_p); | 5278 ira_assert (ira_conflicts_p || !loops_p); |
3180 | 5279 |
3181 saved_flag_ira_share_spill_slots = flag_ira_share_spill_slots; | 5280 saved_flag_ira_share_spill_slots = flag_ira_share_spill_slots; |
3182 if (too_high_register_pressure_p () || cfun->calls_setjmp) | 5281 if (too_high_register_pressure_p () || cfun->calls_setjmp) |
3189 | 5288 |
3190 ira_color (); | 5289 ira_color (); |
3191 | 5290 |
3192 ira_max_point_before_emit = ira_max_point; | 5291 ira_max_point_before_emit = ira_max_point; |
3193 | 5292 |
5293 ira_initiate_emit_data (); | |
5294 | |
3194 ira_emit (loops_p); | 5295 ira_emit (loops_p); |
3195 | 5296 |
5297 max_regno = max_reg_num (); | |
3196 if (ira_conflicts_p) | 5298 if (ira_conflicts_p) |
3197 { | 5299 { |
3198 max_regno = max_reg_num (); | |
3199 | |
3200 if (! loops_p) | 5300 if (! loops_p) |
3201 ira_initiate_assign (); | 5301 { |
5302 if (! ira_use_lra_p) | |
5303 ira_initiate_assign (); | |
5304 } | |
3202 else | 5305 else |
3203 { | 5306 { |
3204 expand_reg_info (allocated_reg_info_size); | 5307 expand_reg_info (); |
3205 setup_preferred_alternate_classes_for_new_pseudos | 5308 |
3206 (allocated_reg_info_size); | 5309 if (ira_use_lra_p) |
3207 allocated_reg_info_size = max_regno; | 5310 { |
3208 | 5311 ira_allocno_t a; |
3209 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) | 5312 ira_allocno_iterator ai; |
3210 fprintf (ira_dump_file, "Flattening IR\n"); | 5313 |
3211 ira_flattening (max_regno_before_ira, ira_max_point_before_emit); | 5314 FOR_EACH_ALLOCNO (a, ai) |
5315 { | |
5316 int old_regno = ALLOCNO_REGNO (a); | |
5317 int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)->reg); | |
5318 | |
5319 ALLOCNO_REGNO (a) = new_regno; | |
5320 | |
5321 if (old_regno != new_regno) | |
5322 setup_reg_classes (new_regno, reg_preferred_class (old_regno), | |
5323 reg_alternate_class (old_regno), | |
5324 reg_allocno_class (old_regno)); | |
5325 } | |
5326 } | |
5327 else | |
5328 { | |
5329 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) | |
5330 fprintf (ira_dump_file, "Flattening IR\n"); | |
5331 ira_flattening (max_regno_before_ira, ira_max_point_before_emit); | |
5332 } | |
3212 /* New insns were generated: add notes and recalculate live | 5333 /* New insns were generated: add notes and recalculate live |
3213 info. */ | 5334 info. */ |
3214 df_analyze (); | 5335 df_analyze (); |
3215 | 5336 |
3216 flow_loops_find (&ira_loops); | 5337 /* ??? Rebuild the loop tree, but why? Does the loop tree |
3217 record_loop_exits (); | 5338 change if new insns were generated? Can that be handled |
3218 current_loops = &ira_loops; | 5339 by updating the loop tree incrementally? */ |
3219 | 5340 loop_optimizer_finalize (); |
3220 setup_allocno_assignment_flags (); | 5341 free_dominance_info (CDI_DOMINATORS); |
3221 ira_initiate_assign (); | 5342 loop_optimizer_init (AVOID_CFG_MODIFICATIONS |
3222 ira_reassign_conflict_allocnos (max_regno); | 5343 | LOOPS_HAVE_RECORDED_EXITS); |
5344 | |
5345 if (! ira_use_lra_p) | |
5346 { | |
5347 setup_allocno_assignment_flags (); | |
5348 ira_initiate_assign (); | |
5349 ira_reassign_conflict_allocnos (max_regno); | |
5350 } | |
3223 } | 5351 } |
3224 } | 5352 } |
3225 | 5353 |
5354 ira_finish_emit_data (); | |
5355 | |
3226 setup_reg_renumber (); | 5356 setup_reg_renumber (); |
3227 | 5357 |
3228 calculate_allocation_cost (); | 5358 calculate_allocation_cost (); |
3229 | 5359 |
3230 #ifdef ENABLE_IRA_CHECKING | 5360 #ifdef ENABLE_IRA_CHECKING |
3231 if (ira_conflicts_p) | 5361 if (ira_conflicts_p && ! ira_use_lra_p) |
5362 /* Unlike the reload pass, LRA does not use any conflict info | |
5363 from IRA. We don't rebuild conflict info for LRA (through an | |
5364 ira_flattening call) and cannot use the check here. We could | |
5365 rebuild this info for LRA in checking mode, but there is a risk | |
5366 that code generated with the check and without it will be a bit | |
5367 different. Calling ira_flattening in any mode would just | |
5368 waste CPU time. So do not check the allocation for LRA. */ | |
3232 check_allocation (); | 5369 check_allocation (); |
3233 #endif | 5370 #endif |
3234 | |
3235 if (delete_trivially_dead_insns (get_insns (), max_reg_num ())) | |
3236 df_analyze (); | |
3237 | |
3238 init_reg_equiv_memory_loc (); | |
3239 | 5371 |
3240 if (max_regno != max_regno_before_ira) | 5372 if (max_regno != max_regno_before_ira) |
3241 { | 5373 { |
3242 regstat_free_n_sets_and_refs (); | 5374 regstat_free_n_sets_and_refs (); |
3243 regstat_free_ri (); | 5375 regstat_free_ri (); |
3244 regstat_init_n_sets_and_refs (); | 5376 regstat_init_n_sets_and_refs (); |
3245 regstat_compute_ri (); | 5377 regstat_compute_ri (); |
3246 } | 5378 } |
3247 | 5379 |
3248 allocate_initial_values (reg_equiv_memory_loc); | |
3249 | |
3250 overall_cost_before = ira_overall_cost; | 5380 overall_cost_before = ira_overall_cost; |
3251 if (ira_conflicts_p) | 5381 if (! ira_conflicts_p) |
5382 grow_reg_equivs (); | |
5383 else | |
3252 { | 5384 { |
3253 fix_reg_equiv_init (); | 5385 fix_reg_equiv_init (); |
3254 | 5386 |
3255 #ifdef ENABLE_IRA_CHECKING | 5387 #ifdef ENABLE_IRA_CHECKING |
3256 print_redundant_copies (); | 5388 print_redundant_copies (); |
3257 #endif | 5389 #endif |
3258 | 5390 if (! ira_use_lra_p) |
3259 ira_spilled_reg_stack_slots_num = 0; | 5391 { |
3260 ira_spilled_reg_stack_slots | 5392 ira_spilled_reg_stack_slots_num = 0; |
3261 = ((struct ira_spilled_reg_stack_slot *) | 5393 ira_spilled_reg_stack_slots |
3262 ira_allocate (max_regno | 5394 = ((struct ira_spilled_reg_stack_slot *) |
3263 * sizeof (struct ira_spilled_reg_stack_slot))); | 5395 ira_allocate (max_regno |
3264 memset (ira_spilled_reg_stack_slots, 0, | 5396 * sizeof (struct ira_spilled_reg_stack_slot))); |
3265 max_regno * sizeof (struct ira_spilled_reg_stack_slot)); | 5397 memset (ira_spilled_reg_stack_slots, 0, |
3266 } | 5398 max_regno * sizeof (struct ira_spilled_reg_stack_slot)); |
3267 | 5399 } |
3268 timevar_pop (TV_IRA); | 5400 } |
5401 allocate_initial_values (); | |
5402 | |
5403 /* See comment for find_moveable_pseudos call. */ | |
5404 if (ira_conflicts_p) | |
5405 move_unallocated_pseudos (); | |
5406 | |
5407 /* Restore original values. */ | |
5408 if (lra_simple_p) | |
5409 { | |
5410 flag_caller_saves = saved_flag_caller_saves; | |
5411 flag_ira_region = saved_flag_ira_region; | |
5412 } | |
5413 } | |
5414 | |
5415 static void | |
5416 do_reload (void) | |
5417 { | |
5418 basic_block bb; | |
5419 bool need_dce; | |
5420 unsigned pic_offset_table_regno = INVALID_REGNUM; | |
5421 | |
5422 if (flag_ira_verbose < 10) | |
5423 ira_dump_file = dump_file; | |
5424 | |
5425 /* If pic_offset_table_rtx is a pseudo register, then keep it so | |
5426 after reload to avoid possible wrong usages of hard reg assigned | |
5427 to it. */ | |
5428 if (pic_offset_table_rtx | |
5429 && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER) | |
5430 pic_offset_table_regno = REGNO (pic_offset_table_rtx); | |
3269 | 5431 |
3270 timevar_push (TV_RELOAD); | 5432 timevar_push (TV_RELOAD); |
3271 df_set_flags (DF_NO_INSN_RESCAN); | 5433 if (ira_use_lra_p) |
3272 build_insn_chain (); | 5434 { |
3273 | 5435 if (current_loops != NULL) |
3274 reload_completed = !reload (get_insns (), ira_conflicts_p); | 5436 { |
5437 loop_optimizer_finalize (); | |
5438 free_dominance_info (CDI_DOMINATORS); | |
5439 } | |
5440 FOR_ALL_BB_FN (bb, cfun) | |
5441 bb->loop_father = NULL; | |
5442 current_loops = NULL; | |
5443 | |
5444 ira_destroy (); | |
5445 | |
5446 lra (ira_dump_file); | |
5447 /* ???!!! Move it before lra () when we use ira_reg_equiv in | |
5448 LRA. */ | |
5449 vec_free (reg_equivs); | |
5450 reg_equivs = NULL; | |
5451 need_dce = false; | |
5452 } | |
5453 else | |
5454 { | |
5455 df_set_flags (DF_NO_INSN_RESCAN); | |
5456 build_insn_chain (); | |
5457 | |
5458 need_dce = reload (get_insns (), ira_conflicts_p); | |
5459 } | |
3275 | 5460 |
3276 timevar_pop (TV_RELOAD); | 5461 timevar_pop (TV_RELOAD); |
3277 | 5462 |
3278 timevar_push (TV_IRA); | 5463 timevar_push (TV_IRA); |
3279 | 5464 |
3280 if (ira_conflicts_p) | 5465 if (ira_conflicts_p && ! ira_use_lra_p) |
3281 { | 5466 { |
3282 ira_free (ira_spilled_reg_stack_slots); | 5467 ira_free (ira_spilled_reg_stack_slots); |
3283 | |
3284 ira_finish_assign (); | 5468 ira_finish_assign (); |
3285 | 5469 } |
3286 } | 5470 |
3287 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL | 5471 if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL |
3288 && overall_cost_before != ira_overall_cost) | 5472 && overall_cost_before != ira_overall_cost) |
3289 fprintf (ira_dump_file, "+++Overall after reload %d\n", ira_overall_cost); | 5473 fprintf (ira_dump_file, "+++Overall after reload %" PRId64 "\n", |
3290 ira_destroy (); | 5474 ira_overall_cost); |
3291 | 5475 |
3292 flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots; | 5476 flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots; |
3293 | 5477 |
3294 flow_loops_free (&ira_loops); | 5478 if (! ira_use_lra_p) |
3295 free_dominance_info (CDI_DOMINATORS); | 5479 { |
3296 FOR_ALL_BB (bb) | 5480 ira_destroy (); |
3297 bb->loop_father = NULL; | 5481 if (current_loops != NULL) |
3298 current_loops = NULL; | 5482 { |
3299 | 5483 loop_optimizer_finalize (); |
3300 regstat_free_ri (); | 5484 free_dominance_info (CDI_DOMINATORS); |
3301 regstat_free_n_sets_and_refs (); | 5485 } |
5486 FOR_ALL_BB_FN (bb, cfun) | |
5487 bb->loop_father = NULL; | |
5488 current_loops = NULL; | |
5489 | |
5490 regstat_free_ri (); | |
5491 regstat_free_n_sets_and_refs (); | |
5492 } | |
3302 | 5493 |
3303 if (optimize) | 5494 if (optimize) |
3304 { | 5495 cleanup_cfg (CLEANUP_EXPENSIVE); |
3305 cleanup_cfg (CLEANUP_EXPENSIVE); | 5496 |
3306 | 5497 finish_reg_equiv (); |
3307 ira_free (ira_reg_equiv_invariant_p); | |
3308 ira_free (ira_reg_equiv_const); | |
3309 } | |
3310 | 5498 |
3311 bitmap_obstack_release (&ira_bitmap_obstack); | 5499 bitmap_obstack_release (&ira_bitmap_obstack); |
3312 #ifndef IRA_NO_OBSTACK | 5500 #ifndef IRA_NO_OBSTACK |
3313 obstack_free (&ira_obstack, NULL); | 5501 obstack_free (&ira_obstack, NULL); |
3314 #endif | 5502 #endif |
3315 | 5503 |
3316 /* The code after the reload has changed so much that at this point | 5504 /* The code after the reload has changed so much that at this point |
3317 we might as well just rescan everything. Not that | 5505 we might as well just rescan everything. Note that |
3318 df_rescan_all_insns is not going to help here because it does not | 5506 df_rescan_all_insns is not going to help here because it does not |
3319 touch the artificial uses and defs. */ | 5507 touch the artificial uses and defs. */ |
3320 df_finish_pass (true); | 5508 df_finish_pass (true); |
3321 if (optimize > 1) | |
3322 df_live_add_problem (); | |
3323 df_scan_alloc (NULL); | 5509 df_scan_alloc (NULL); |
3324 df_scan_blocks (); | 5510 df_scan_blocks (); |
3325 | 5511 |
5512 if (optimize > 1) | |
5513 { | |
5514 df_live_add_problem (); | |
5515 df_live_set_all_dirty (); | |
5516 } | |
5517 | |
3326 if (optimize) | 5518 if (optimize) |
3327 df_analyze (); | 5519 df_analyze (); |
3328 | 5520 |
5521 if (need_dce && optimize) | |
5522 run_fast_dce (); | |
5523 | |
5524 /* Diagnose uses of the hard frame pointer when it is used as a global | |
5525 register. Often we can get away with letting the user appropriate | |
5526 the frame pointer, but we should let them know when code generation | |
5527 makes that impossible. */ | |
5528 if (global_regs[HARD_FRAME_POINTER_REGNUM] && frame_pointer_needed) | |
5529 { | |
5530 tree decl = global_regs_decl[HARD_FRAME_POINTER_REGNUM]; | |
5531 error_at (DECL_SOURCE_LOCATION (current_function_decl), | |
5532 "frame pointer required, but reserved"); | |
5533 inform (DECL_SOURCE_LOCATION (decl), "for %qD", decl); | |
5534 } | |
5535 | |
5536 /* If we are doing generic stack checking, give a warning if this | |
5537 function's frame size is larger than we expect. */ | |
5538 if (flag_stack_check == GENERIC_STACK_CHECK) | |
5539 { | |
5540 HOST_WIDE_INT size = get_frame_size () + STACK_CHECK_FIXED_FRAME_SIZE; | |
5541 | |
5542 for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++) | |
5543 if (df_regs_ever_live_p (i) && !fixed_regs[i] && call_used_regs[i]) | |
5544 size += UNITS_PER_WORD; | |
5545 | |
5546 if (size > STACK_CHECK_MAX_FRAME_SIZE) | |
5547 warning (0, "frame size too large for reliable stack checking"); | |
5548 } | |
5549 | |
5550 if (pic_offset_table_regno != INVALID_REGNUM) | |
5551 pic_offset_table_rtx = gen_rtx_REG (Pmode, pic_offset_table_regno); | |
5552 | |
3329 timevar_pop (TV_IRA); | 5553 timevar_pop (TV_IRA); |
3330 } | 5554 } |
3331 | |
3332 | 5555 |
3333 | |
3334 static bool | |
3335 gate_ira (void) | |
3336 { | |
3337 return true; | |
3338 } | |
3339 | |
3340 /* Run the integrated register allocator. */ | 5556 /* Run the integrated register allocator. */ |
3341 static unsigned int | 5557 |
3342 rest_of_handle_ira (void) | 5558 namespace { |
3343 { | 5559 |
3344 ira (dump_file); | 5560 const pass_data pass_data_ira = |
3345 return 0; | 5561 { |
3346 } | 5562 RTL_PASS, /* type */ |
3347 | 5563 "ira", /* name */ |
3348 struct rtl_opt_pass pass_ira = | 5564 OPTGROUP_NONE, /* optinfo_flags */ |
3349 { | 5565 TV_IRA, /* tv_id */ |
3350 { | 5566 0, /* properties_required */ |
3351 RTL_PASS, | 5567 0, /* properties_provided */ |
3352 "ira", /* name */ | 5568 0, /* properties_destroyed */ |
3353 gate_ira, /* gate */ | 5569 0, /* todo_flags_start */ |
3354 rest_of_handle_ira, /* execute */ | 5570 TODO_do_not_ggc_collect, /* todo_flags_finish */ |
3355 NULL, /* sub */ | |
3356 NULL, /* next */ | |
3357 0, /* static_pass_number */ | |
3358 TV_NONE, /* tv_id */ | |
3359 0, /* properties_required */ | |
3360 0, /* properties_provided */ | |
3361 0, /* properties_destroyed */ | |
3362 0, /* todo_flags_start */ | |
3363 TODO_dump_func | | |
3364 TODO_ggc_collect /* todo_flags_finish */ | |
3365 } | |
3366 }; | 5571 }; |
5572 | |
5573 class pass_ira : public rtl_opt_pass | |
5574 { | |
5575 public: | |
5576 pass_ira (gcc::context *ctxt) | |
5577 : rtl_opt_pass (pass_data_ira, ctxt) | |
5578 {} | |
5579 | |
5580 /* opt_pass methods: */ | |
5581 virtual bool gate (function *) | |
5582 { | |
5583 return !targetm.no_register_allocation; | |
5584 } | |
5585 virtual unsigned int execute (function *) | |
5586 { | |
5587 ira (dump_file); | |
5588 return 0; | |
5589 } | |
5590 | |
5591 }; // class pass_ira | |
5592 | |
5593 } // anon namespace | |
5594 | |
5595 rtl_opt_pass * | |
5596 make_pass_ira (gcc::context *ctxt) | |
5597 { | |
5598 return new pass_ira (ctxt); | |
5599 } | |
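These constructors are only invoked by the pass manager; in passes.def the allocator and reload passes are expected to be chained back-to-back, roughly as:

      NEXT_PASS (pass_ira);
      NEXT_PASS (pass_reload);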
5600 | |
5601 namespace { | |
5602 | |
5603 const pass_data pass_data_reload = | |
5604 { | |
5605 RTL_PASS, /* type */ | |
5606 "reload", /* name */ | |
5607 OPTGROUP_NONE, /* optinfo_flags */ | |
5608 TV_RELOAD, /* tv_id */ | |
5609 0, /* properties_required */ | |
5610 0, /* properties_provided */ | |
5611 0, /* properties_destroyed */ | |
5612 0, /* todo_flags_start */ | |
5613 0, /* todo_flags_finish */ | |
5614 }; | |
5615 | |
5616 class pass_reload : public rtl_opt_pass | |
5617 { | |
5618 public: | |
5619 pass_reload (gcc::context *ctxt) | |
5620 : rtl_opt_pass (pass_data_reload, ctxt) | |
5621 {} | |
5622 | |
5623 /* opt_pass methods: */ | |
5624 virtual bool gate (function *) | |
5625 { | |
5626 return !targetm.no_register_allocation; | |
5627 } | |
5628 virtual unsigned int execute (function *) | |
5629 { | |
5630 do_reload (); | |
5631 return 0; | |
5632 } | |
5633 | |
5634 }; // class pass_reload | |
5635 | |
5636 } // anon namespace | |
5637 | |
5638 rtl_opt_pass * | |
5639 make_pass_reload (gcc::context *ctxt) | |
5640 { | |
5641 return new pass_reload (ctxt); | |
5642 } |