Mercurial > hg > CbC > CbC_gcc
diff gcc/ada/exp_unst.ads @ 111:04ced10e8804
gcc 7
author | kono |
---|---|
date | Fri, 27 Oct 2017 22:46:09 +0900 |
parents | |
children | 84e7813d76e9 |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/gcc/ada/exp_unst.ads Fri Oct 27 22:46:09 2017 +0900 @@ -0,0 +1,694 @@ +------------------------------------------------------------------------------ +-- -- +-- GNAT COMPILER COMPONENTS -- +-- -- +-- E X P _ U N S T -- +-- -- +-- S p e c -- +-- -- +-- Copyright (C) 2014-2017, Free Software Foundation, Inc. -- +-- -- +-- GNAT is free software; you can redistribute it and/or modify it under -- +-- terms of the GNU General Public License as published by the Free Soft- -- +-- ware Foundation; either version 3, or (at your option) any later ver- -- +-- sion. GNAT is distributed in the hope that it will be useful, but WITH- -- +-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY -- +-- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License -- +-- for more details. You should have received a copy of the GNU General -- +-- Public License distributed with GNAT; see file COPYING3. If not, go to -- +-- http://www.gnu.org/licenses for a complete copy of the license. -- +-- -- +-- GNAT was originally developed by the GNAT team at New York University. -- +-- Extensive contributions were provided by Ada Core Technologies Inc. -- +-- -- +------------------------------------------------------------------------------ + +-- Expand routines for unnesting subprograms + +with Table; +with Types; use Types; + +package Exp_Unst is + + -- ----------------- + -- -- The Problem -- + -- ----------------- + + -- Normally, nested subprograms in the source result in corresponding + -- nested subprograms in the resulting tree. We then expect the back end + -- to handle such nested subprograms, including all cases of uplevel + -- references. For example, the GCC back end can do this relatively easily + -- since GNU C (as an extension) allows nested functions with uplevel + -- references, and implements an appropriate static chain approach to + -- dealing with such uplevel references. + + -- However, we also want to be able to interface with back ends that do + -- not easily handle such uplevel references. One example is the back end + -- that translates the tree into standard C source code. In the future, + -- other back ends might need the same capability (e.g. a back end that + -- generated LLVM intermediate code). + + -- We could imagine simply handling such references in the appropriate + -- back end. For example the back end that generates C could recognize + -- nested subprograms and rig up some way of translating them, e.g. by + -- making a static-link source level visible. + + -- Rather than take that approach, we prefer to do a semantics-preserving + -- transformation on the GNAT tree, that eliminates the problem before we + -- hand the tree over to the back end. There are two reasons for preferring + -- this approach: + + -- First: the work needs only to be done once for all affected back ends + -- and we can remain within the semantics of the tree. The front end is + -- full of tree transformations, so we have all the infrastructure for + -- doing transformations of this type. + + -- Second: given that the transformation will be semantics-preserving, + -- we can still used the standard GCC back end to build code from it. + -- This means we can easily run our full test suite to verify that the + -- transformations are indeed semantics preserving. It is a lot more + -- work to thoroughly test the output of specialized back ends. + + -- Looking at the problem, we have three situations to deal with. Note + -- that in these examples, we use all lower case, since that is the way + -- the internal tree is cased. + + -- First, cases where there are no uplevel references, for example + + -- procedure case1 is + -- function max (m, n : Integer) return integer is + -- begin + -- return integer'max (m, n); + -- end max; + -- ... + -- end case1; + + -- Second, cases where there are explicit uplevel references. + + -- procedure case2 (b : integer) is + -- procedure Inner (bb : integer); + -- + -- procedure inner2 is + -- begin + -- inner(5); + -- end; + -- + -- x : integer := 77; + -- y : constant integer := 15 * 16; + -- rv : integer := 10; + -- + -- procedure inner (bb : integer) is + -- begin + -- x := rv + y + bb + b; + -- end; + -- + -- begin + -- inner2; + -- end case2; + + -- In this second example, B, X, RV are uplevel referenced. Y is not + -- considered as an uplevel reference since it is a static constant + -- where references are replaced by the value at compile time. + + -- Third, cases where there are implicit uplevel references via types + -- whose bounds depend on locally declared constants or variables: + + -- function case3 (x, y : integer) return boolean is + -- subtype dynam is integer range x .. y + 3; + -- subtype static is integer range 42 .. 73; + -- xx : dynam := y; + -- + -- type darr is array (dynam) of Integer; + -- type darec is record + -- A : darr; + -- B : integer; + -- end record; + -- darecv : darec; + -- + -- function inner (b : integer) return boolean is + -- begin + -- return b in dynam and then darecv.b in static; + -- end inner; + -- + -- begin + -- return inner (42) and then inner (xx * 3 - y * 2); + -- end case3; + -- + -- In this third example, the membership test implicitly references the + -- the bounds of Dynam, which both involve uplevel references. + + -- ------------------ + -- -- The Solution -- + -- ------------------ + + -- Looking at the three cases above, the first case poses no problem at + -- all. Indeed the subprogram could have been declared at the outer level + -- (perhaps changing the name). But this style is quite common as a way + -- of limiting the scope of a local procedure called only within the outer + -- procedure. We could move it to the outer level (with a name change if + -- needed), but we don't bother. We leave it nested, and the back end just + -- translates it as though it were not nested. + + -- In general we leave nested procedures nested, rather than trying to move + -- them to the outer level (the back end may do that, e.g. as part of the + -- translation to C, but we don't do it in the tree itself). This saves a + -- LOT of trouble in terms of visibility and semantics. + + -- But of course we have to deal with the uplevel references. The idea is + -- to rewrite these nested subprograms so that they no longer have any such + -- uplevel references, so by the time they reach the back end, they all are + -- case 1 (no uplevel references) and thus easily handled. + + -- To deal with explicit uplevel references (case 2 above), we proceed with + -- the following steps: + + -- All entities marked as being uplevel referenced are marked as aliased + -- since they will be accessed indirectly via an activation record as + -- described below. + + -- An activation record is created containing system address values + -- for each uplevel referenced entity in a given scope. In the example + -- given before, we would have: + + -- type AREC1T is record + -- b : Address; + -- x : Address; + -- rv : Address; + -- end record; + + -- type AREC1PT is access all AREC1T; + + -- AREC1 : aliased AREC1T; + -- AREC1P : constant AREC1PT := AREC1'Access; + + -- The fields of AREC1 are set at the point the corresponding entity + -- is declared (immediately for parameters). + + -- Note: the 1 in all these names is a unique index number. Different + -- scopes requiring different ARECnT declarations will have different + -- values of n to ensure uniqueness. + + -- Note: normally the field names in the activation record match the + -- name of the entity. An exception is when the entity is declared in + -- a declare block, in which case we append the entity number, to avoid + -- clashes between the same name declared in different declare blocks. + + -- For all subprograms nested immediately within the corresponding scope, + -- a parameter AREC1F is passed, and all calls to these routines have + -- AREC1P added as an additional formal. + + -- Now within the nested procedures, any reference to an uplevel entity + -- xxx is replaced by typ'Deref(AREC1.xxx) where typ is the type of the + -- reference. + + -- Note: the reason that we use Address as the component type in the + -- declaration of AREC1T is that we may create this type before we see + -- the declaration of this type. + + -- The following shows example 2 above after this translation: + + -- procedure case2x (b : aliased Integer) is + -- type AREC1T is record + -- b : Address; + -- x : Address; + -- rv : Address; + -- end record; + -- + -- type AREC1PT is access all AREC1T; + -- + -- AREC1 : aliased AREC1T; + -- AREC1P : constant AREC1PT := AREC1'Access; + -- + -- AREC1.b := b'Address; + -- + -- procedure inner (bb : integer; AREC1F : AREC1PT); + -- + -- procedure inner2 (AREC1F : AREC1PT) is + -- begin + -- inner(5, AREC1F); + -- end; + -- + -- x : aliased integer := 77; + -- AREC1.x := X'Address; + -- + -- y : constant Integer := 15 * 16; + -- + -- rv : aliased Integer; + -- AREC1.rv := rv'Address; + -- + -- procedure inner (bb : integer; AREC1F : AREC1PT) is + -- begin + -- Integer'Deref(AREC1F.x) := + -- Integer'Deref(AREC1F.rv) + y + b + Integer_Deref(AREC1F.b); + -- end; + -- + -- begin + -- inner2 (AREC1P); + -- end case2x; + + -- And now the inner procedures INNER2 and INNER have no uplevel references + -- so they have been reduced to case 1, which is the case easily handled by + -- the back end. Note that the generated code is not strictly legal Ada + -- because of the assignments to AREC1 in the declarative sequence, but the + -- GNAT tree always allows such mixing of declarations and statements, so + -- the back end must be prepared to handle this in any case. + + -- Case 3 where we have uplevel references to types is a bit more complex. + -- That would especially be the case if we did a full transformation that + -- completely eliminated such uplevel references as we did for case 2. But + -- instead of trying to do that, we rewrite the subprogram so that the code + -- generator can easily detect and deal with these uplevel type references. + + -- First we distinguish two cases + + -- Static types are one of the two following cases: + + -- Discrete types whose bounds are known at compile time. This is not + -- quite the same as what is tested by Is_OK_Static_Subtype, in that + -- it allows compile time known values that are not static expressions. + + -- Composite types, whose components are (recursively) static types. + + -- Dynamic types are one of the two following cases: + + -- Discrete types with at least one bound not known at compile time. + + -- Composite types with at least one component that is (recursively) + -- a dynamic type. + + -- Uplevel references to static types are not a problem, the front end + -- or the code generator fetches the bounds as required, and since they + -- are compile time known values, this value can just be extracted and + -- no actual uplevel reference is required. + + -- Uplevel references to dynamic types are a potential problem, since + -- such references may involve an implicit access to a dynamic bound, + -- and this reference is an implicit uplevel access. + + -- To fully unnest such references would be messy, since we would have + -- to create local copies of the dynamic types involved, so that the + -- front end or code generator could generate an explicit uplevel + -- reference to the bound involved. Rather than do that, we set things + -- up so that this situation can be easily detected and dealt with when + -- there is an implicit reference to the bounds. + + -- What we do is to always generate a local constant for any dynamic + -- bound in a dynamic subtype xx with name xx_FIRST or xx_LAST. The one + -- case where we can skip this is where the bound is already a constant. + -- E.g. in the third example above, subtype dynam is expanded as + + -- dynam_LAST : constant Integer := y + 3; + -- subtype dynam is integer range x .. dynam_LAST; + + -- Now if type dynam is uplevel referenced (as it is in this case), then + -- the bounds x and dynam_LAST are marked as uplevel references + -- so that appropriate entries are made in the activation record. Any + -- explicit reference to such a bound in the front end generated code + -- will be handled by the normal uplevel reference mechanism which we + -- described above for case 2. For implicit references by a back end + -- that needs to unnest things, any such implicit reference to one of + -- these bounds can be replaced by an appropriate reference to the entry + -- in the activation record for xx_FIRST or xx_LAST. Thus the back end + -- can eliminate the problematical uplevel reference without the need to + -- do the heavy tree modification to do that at the code expansion level. + + -- Looking at case 3 again, here is the normal -gnatG expanded code + + -- function case3 (x : integer; y : integer) return boolean is + -- dynam_LAST : constant integer := y {+} 3; + -- subtype dynam is integer range x .. dynam_LAST; + -- subtype static is integer range 42 .. 73; + -- + -- [constraint_error when + -- not (y in x .. dynam_LAST) + -- "range check failed"] + -- + -- xx : dynam := y; + -- + -- type darr is array (x .. dynam_LAST) of integer; + -- type darec is record + -- a : darr; + -- b : integer; + -- end record; + -- [type TdarrB is array (x .. dynam_LAST range <>) of integer] + -- freeze TdarrB [] + -- darecv : darec; + -- + -- function inner (b : integer) return boolean is + -- begin + -- return b in x .. dynam_LAST and then darecv.b in 42 .. 73; + -- end inner; + -- begin + -- return inner (42) and then inner (xx {*} 3 {-} y {*} 2); + -- end case3; + + -- Note: the actual expanded code has fully qualified names so for + -- example function inner is actually function case3__inner. For now + -- we ignore that detail to clarify the examples. + + -- Here we see that some of the bounds references are expanded by the + -- front end, so that we get explicit references to y or dynam_Last. These + -- cases are handled by the normal uplevel reference mechanism described + -- above for case 2. This is the case for the constraint check for the + -- initialization of xx, and the range check in function inner. + + -- But the reference darecv.b in the return statement of function + -- inner has an implicit reference to the bounds of dynam, since to + -- compute the location of b in the record, we need the length of a. + + -- Here is the full translation of the third example: + + -- function case3x (x, y : integer) return boolean is + -- type AREC1T is record + -- x : Address; + -- dynam_LAST : Address; + -- end record; + -- + -- type AREC1PT is access all AREC1T; + -- + -- AREC1 : aliased AREC1T; + -- AREC1P : constant AREC1PT := AREC1'Access; + -- + -- AREC1.x := x'Address; + -- + -- dynam_LAST : constant integer := y {+} 3; + -- AREC1.dynam_LAST := dynam_LAST'Address; + -- subtype dynam is integer range x .. dynam_LAST; + -- xx : dynam := y; + -- + -- [constraint_error when + -- not (y in x .. dynam_LAST) + -- "range check failed"] + -- + -- subtype static is integer range 42 .. 73; + -- + -- type darr is array (x .. dynam_LAST) of Integer; + -- type darec is record + -- A : darr; + -- B : integer; + -- end record; + -- darecv : darec; + -- + -- function inner (b : integer; AREC1F : AREC1PT) return boolean is + -- begin + -- return b in x .. Integer'Deref(AREC1F.dynam_LAST) + -- and then darecv.b in 42 .. 73; + -- end inner; + -- + -- begin + -- return inner (42, AREC1P) and then inner (xx * 3, AREC1P); + -- end case3x; + + -- And now the back end when it processes darecv.b will access the bounds + -- of darecv.a by referencing the d and dynam_LAST fields of AREC1P. + + ----------------------------- + -- Multiple Nesting Levels -- + ----------------------------- + + -- In our examples so far, we have only nested to a single level, but the + -- scheme generalizes to multiple levels of nesting and in this section we + -- discuss how this generalization works. + + -- Consider this example with two nesting levels + + -- To deal with elimination of uplevel references, we follow the same basic + -- approach described above for case 2, except that we need an activation + -- record at each nested level. Basically the rule is that any procedure + -- that has nested procedures needs an activation record. When we do this, + -- the inner activation records have a pointer (uplink) to the immediately + -- enclosing activation record, the normal arrangement of static links. The + -- following shows the full translation of this fourth case. + + -- function case4x (x : integer) return integer is + -- type AREC1T is record + -- v1 : Address; + -- end record; + -- + -- type AREC1PT is access all AREC1T; + -- + -- AREC1 : aliased AREC1T; + -- AREC1P : constant AREC1PT := AREC1'Access; + -- + -- v1 : integer := x; + -- AREC1.v1 := v1'Address; + -- + -- function inner1 (y : integer; AREC1F : AREC1PT) return integer is + -- type AREC2T is record + -- AREC1U : AREC1PT; + -- v2 : Address; + -- end record; + -- + -- type AREC2PT is access all AREC2T; + -- + -- AREC2 : aliased AREC2T; + -- AREC2P : constant AREC2PT := AREC2'Access; + -- + -- AREC2.AREC1U := AREC1F; + -- + -- v2 : integer := Integer'Deref (AREC1F.v1) {+} 1; + -- AREC2.v2 := v2'Address; + -- + -- function inner2 + -- (z : integer; AREC2F : AREC2PT) return integer + -- is + -- begin + -- return integer(z {+} + -- Integer'Deref (AREC2F.AREC1U.v1) {+} + -- Integer'Deref (AREC2F.v2).all); + -- end inner2; + -- begin + -- return integer(y {+} + -- inner2 (Integer'Deref (AREC1F.v1), AREC2P)); + -- end inner1; + -- begin + -- return inner1 (x, AREC1P); + -- end case4x; + + -- As can be seen in this example, the index numbers following AREC in the + -- generated names avoid confusion between AREC names at different levels. + + ------------------------- + -- Name Disambiguation -- + ------------------------- + + -- As described above, the translation scheme would raise issues when the + -- code generator did the actual unnesting if identically named nested + -- subprograms exist. Similarly overloading would cause a naming issue. + + -- In fact, the expanded code includes qualified names which eliminate this + -- problem. We omitted the qualification from the exapnded examples above + -- for simplicity. But to see this in action, consider this example: + + -- function Mnames return Boolean is + -- procedure Inner is + -- procedure Inner is + -- begin + -- null; + -- end; + -- begin + -- Inner; + -- end; + -- function F (A : Boolean) return Boolean is + -- begin + -- return not A; + -- end; + -- function F (A : Integer) return Boolean is + -- begin + -- return A > 42; + -- end; + -- begin + -- Inner; + -- return F (42) or F (True); + -- end; + + -- The expanded code actually looks like: + + -- function mnames return boolean is + -- procedure mnames__inner is + -- procedure mnames__inner__inner is + -- begin + -- null; + -- return; + -- end mnames__inner__inner; + -- begin + -- mnames__inner__inner; + -- return; + -- end mnames__inner; + -- function mnames__f (a : boolean) return boolean is + -- begin + -- return not a; + -- end mnames__f; + -- function mnames__f__2 (a : integer) return boolean is + -- begin + -- return a > 42; + -- end mnames__f__2; + -- begin + -- mnames__inner; + -- return mnames__f__2 (42) or mnames__f (true); + -- end mnames; + + -- As can be seen from studying this example, the qualification deals both + -- with the issue of clashing names (mnames__inner, mnames__inner__inner), + -- and with overloading (mnames__f, mnames__f__2). + + -- In addition, the declarations of ARECnT and ARECnPT get moved to the + -- outer level when we actually generate C code, so we suffix these names + -- with the corresponding entity name to make sure they are unique. + + --------------------------- + -- Terminology for Calls -- + --------------------------- + + -- The level of a subprogram in the nest being analyzed is defined to be + -- the level of nesting, so the outer level subprogram (the one passed to + -- Unnest_Subprogram) is 1, subprograms immediately nested within this + -- outer level subprogram have a level of 2, etc. + + -- Calls within the nest being analyzed are of three types: + + -- Downward call: this is a call from a subprogram to a subprogram that + -- is immediately nested with in the caller, and thus has a level that + -- is one greater than the caller. It is a fundamental property of the + -- nesting structure and visibility that it is not possible to make a + -- call from level N to level M, where M is greater than N + 1. + + -- Parallel call: this is a call from a nested subprogram to another + -- nested subprogram that is at the same level. + + -- Upward call: this is a call from a subprogram to a subprogram that + -- encloses the caller. The level of the callee is less than the level + -- of the caller, and there is no limit on the difference, e.g. for an + -- uplevel call, a subprogram at level 5 can call one at level 2 or even + -- the outer level subprogram at level 1. + + ----------- + -- Subps -- + ----------- + + -- Table to record subprograms within the nest being currently analyzed. + -- Entries in this table are made for each subprogram expanded, and do not + -- get cleared as we complete the expansion, since we want the table info + -- around in Cprint for the actual unnesting operation. Subps_First in this + -- unit records the starting entry in the table for the entries for Subp + -- and this is also recorded in the Subps_Index field of the outer level + -- subprogram in the nest. The last subps index for the nest can be found + -- in the Subp_Entry Last field of this first entry. + + subtype SI_Type is Nat; + -- Index type for the table + + Subps_First : SI_Type; + -- Record starting index for entries in the current nest (this is the table + -- index of the entry for Subp itself, and is recorded in the Subps_Index + -- field of the entity for this subprogram). + + type Subp_Entry is record + Ent : Entity_Id; + -- Entity of the subprogram + + Bod : Node_Id; + -- Subprogram_Body node for this subprogram + + Lev : Nat; + -- Subprogram level (1 = outer subprogram (Subp argument), 2 = nested + -- immediately within this outer subprogram etc.) + + Reachable : Boolean; + -- This flag is set True if there is a call path from the outer level + -- subprogram to this subprogram. If Reachable is False, it means that + -- the subprogram is declared but not actually referenced. We remove + -- such subprograms from the tree, which simplifies our task, because + -- we don't have to worry about e.g. uplevel references from such an + -- unreferenced subpogram, which might require (useless) activation + -- records to be created. This is computed by setting the outer level + -- subprogram (Subp itself) as reachable, and then doing a transitive + -- closure following all calls. + + Uplevel_Ref : Nat; + -- The outermost level which defines entities which this subprogram + -- references either directly or indirectly via a call. This cannot + -- be greater than Lev. If it is equal to Lev, then it means that the + -- subprogram does not make any uplevel references and that thus it + -- does not need an activation record pointer passed. If it is less than + -- Lev, then an activation record pointer is needed, since there is at + -- least one uplevel reference. This is computed by initially setting + -- Uplevel_Ref to Lev for all subprograms. Then on the initial tree + -- traversal, decreasing Uplevel_Ref for an explicit uplevel reference, + -- and finally by doing a transitive closure that follows calls (if A + -- calls B and B has an uplevel reference to level X, then A references + -- level X indirectly). + + Declares_AREC : Boolean; + -- This is set True for a subprogram which include the declarations + -- for a local activation record to be passed on downward calls. It + -- is set True for the target level of an uplevel reference, and for + -- all intervening nested subprograms. For example, if a subprogram X + -- at level 5 makes an uplevel reference to an entity declared in a + -- level 2 subprogram, then the subprograms at levels 4,3,2 enclosing + -- the level 5 subprogram will have this flag set True. + + Uents : Elist_Id; + -- This is a list of entities declared in this subprogram which are + -- uplevel referenced. It contains both objects (which will be put in + -- the corresponding AREC activation record), and types. The types are + -- not put in the AREC activation record, but referenced bounds (i.e. + -- generated _FIRST and _LAST entites, and formal parameters) will be + -- in the list in their own right. + + Last : SI_Type; + -- This field is set only in the entry for the outer level subprogram + -- in a nest, and records the last index in the Subp table for all the + -- entries for subprograms in this nest. + + ARECnF : Entity_Id; + -- This entity is defined for all subprograms which need an extra formal + -- that contains a pointer to the activation record needed for uplevel + -- references. ARECnF must be defined for any subprogram which has a + -- direct or indirect uplevel reference (i.e. Reference_Level < Lev). + + ARECn : Entity_Id; + ARECnT : Entity_Id; + ARECnPT : Entity_Id; + ARECnP : Entity_Id; + -- These AREC entities are defined only for subprograms for which we + -- generate an activation record declaration, i.e. for subprograms for + -- which the Declares_AREC flag is set True. + + ARECnU : Entity_Id; + -- This AREC entity is the uplink component. It is other than Empty only + -- for nested subprograms that declare an activation record as indicated + -- by Declares_AREC being Ture, and which have uplevel references (Lev + -- greater than Uplevel_Ref). It is the additional component in the + -- activation record that references the ARECnF pointer (which points + -- the activation record one level higher, thus forming the chain). + + end record; + + package Subps is new Table.Table ( + Table_Component_Type => Subp_Entry, + Table_Index_Type => SI_Type, + Table_Low_Bound => 1, + Table_Initial => 1000, + Table_Increment => 200, + Table_Name => "Unnest_Subps"); + -- Records the subprograms in the nest whose outer subprogram is Subp + + ----------------- + -- Subprograms -- + ----------------- + + function Get_Level (Subp : Entity_Id; Sub : Entity_Id) return Nat; + -- Sub is either Subp itself, or a subprogram nested within Subp. This + -- function returns the level of nesting (Subp = 1, subprograms that + -- are immediately nested within Subp = 2, etc.). + + function Subp_Index (Sub : Entity_Id) return SI_Type; + -- Given the entity for a subprogram, return corresponding Subp's index + + procedure Unnest_Subprograms (N : Node_Id); + -- Called to unnest subprograms. If we are in unnest subprogram mode, this + -- is the call that traverses the tree N and locates all the library level + -- subprograms with nested subprograms to process them. + +end Exp_Unst;