Mercurial > hg > CbC > CbC_gcc
comparison gcc/doc/tree-ssa.texi @ 55:77e2b8dfacca gcc-4.4.5
update it from 4.4.3 to 4.5.0
author | ryoma <e075725@ie.u-ryukyu.ac.jp> |
---|---|
date | Fri, 12 Feb 2010 23:39:51 +0900 |
parents | a06113de4d67 |
children | b7f97abdc517 |
comparison
equal
deleted
inserted
replaced
52:c156f1bd5cd9 | 55:77e2b8dfacca |
---|---|
39 @menu | 39 @menu |
40 * Annotations:: Attributes for variables. | 40 * Annotations:: Attributes for variables. |
41 * SSA Operands:: SSA names referenced by GIMPLE statements. | 41 * SSA Operands:: SSA names referenced by GIMPLE statements. |
42 * SSA:: Static Single Assignment representation. | 42 * SSA:: Static Single Assignment representation. |
43 * Alias analysis:: Representing aliased loads and stores. | 43 * Alias analysis:: Representing aliased loads and stores. |
44 * Memory model:: Memory model used by the middle-end. | |
44 @end menu | 45 @end menu |
45 | 46 |
46 @node Annotations | 47 @node Annotations |
47 @section Annotations | 48 @section Annotations |
48 @cindex annotations | 49 @cindex annotations |
106 @} | 107 @} |
107 @end smallexample | 108 @end smallexample |
108 | 109 |
109 Since @code{a} and @code{b} are non-aliased locals, the statement | 110 Since @code{a} and @code{b} are non-aliased locals, the statement |
110 @code{a = b} will have one real definition and one real use because | 111 @code{a = b} will have one real definition and one real use because |
111 variable @code{b} is completely modified with the contents of | 112 variable @code{a} is completely modified with the contents of |
112 variable @code{a}. Real definition are also known as @dfn{killing | 113 variable @code{b}. Real definition are also known as @dfn{killing |
113 definitions}. Similarly, the use of @code{a} reads all its bits. | 114 definitions}. Similarly, the use of @code{b} reads all its bits. |
114 | 115 |
115 In contrast, virtual operands are used with variables that can have | 116 In contrast, virtual operands are used with variables that can have |
116 a partial or ambiguous reference. This includes structures, arrays, | 117 a partial or ambiguous reference. This includes structures, arrays, |
117 globals, and aliased variables. In these cases, we have two types of | 118 globals, and aliased variables. In these cases, we have two types of |
118 definitions. For globals, structures, and arrays, we can determine from | 119 definitions. For globals, structures, and arrays, we can determine from |
793 @section Alias analysis | 794 @section Alias analysis |
794 @cindex alias | 795 @cindex alias |
795 @cindex flow-sensitive alias analysis | 796 @cindex flow-sensitive alias analysis |
796 @cindex flow-insensitive alias analysis | 797 @cindex flow-insensitive alias analysis |
797 | 798 |
798 Alias analysis proceeds in 4 main phases: | 799 Alias analysis in GIMPLE SSA form consists of two pieces. First |
800 the virtual SSA web ties conflicting memory accesses and provides | |
801 a SSA use-def chain and SSA immediate-use chains for walking | |
802 possibly dependent memory accesses. Second an alias-oracle can | |
803 be queried to disambiguate explicit and implicit memory references. | |
799 | 804 |
800 @enumerate | 805 @enumerate |
801 @item Structural alias analysis. | 806 @item Memory SSA form. |
802 | 807 |
803 This phase walks the types for structure variables, and determines which | 808 All statements that may use memory have exactly one accompanied use of |
804 of the fields can overlap using offset and size of each field. For each | 809 a virtual SSA name that represents the state of memory at the |
805 field, a ``subvariable'' called a ``Structure field tag'' (SFT)@ is | 810 given point in the IL. |
806 created, which represents that field as a separate variable. All | 811 |
807 accesses that could possibly overlap with a given field will have | 812 All statements that may define memory have exactly one accompanied |
808 virtual operands for the SFT of that field. | 813 definition of a virtual SSA name using the previous state of memory |
809 | 814 and defining the new state of memory after the given point in the IL. |
810 @smallexample | 815 |
811 struct foo | 816 @smallexample |
817 int i; | |
818 int foo (void) | |
812 @{ | 819 @{ |
813 int a; | 820 # .MEM_3 = VDEF <.MEM_2(D)> |
814 int b; | 821 i = 1; |
822 # VUSE <.MEM_3> | |
823 return i; | |
815 @} | 824 @} |
816 struct foo temp; | 825 @end smallexample |
817 int bar (void) | 826 |
818 @{ | 827 The virtual SSA names in this case are @code{.MEM_2(D)} and |
819 int tmp1, tmp2, tmp3; | 828 @code{.MEM_3}. The store to the global variable @code{i} |
820 SFT.0_2 = VDEF <SFT.0_1> | 829 defines @code{.MEM_3} invalidating @code{.MEM_2(D)}. The |
821 temp.a = 5; | 830 load from @code{i} uses that new state @code{.MEM_3}. |
822 SFT.1_4 = VDEF <SFT.1_3> | 831 |
823 temp.b = 6; | 832 The virtual SSA web serves as constraints to SSA optimizers |
824 | 833 preventing illegitimate code-motion and optimization. It |
825 VUSE <SFT.1_4> | 834 also provides a way to walk related memory statements. |
826 tmp1_5 = temp.b; | |
827 VUSE <SFT.0_2> | |
828 tmp2_6 = temp.a; | |
829 | |
830 tmp3_7 = tmp1_5 + tmp2_6; | |
831 return tmp3_7; | |
832 @} | |
833 @end smallexample | |
834 | |
835 If you copy the symbol tag for a variable for some reason, you probably | |
836 also want to copy the subvariables for that variable. | |
837 | 835 |
838 @item Points-to and escape analysis. | 836 @item Points-to and escape analysis. |
839 | 837 |
840 This phase walks the use-def chains in the SSA web looking for | 838 Points-to analysis builds a set of constraints from the GIMPLE |
841 three things: | 839 SSA IL representing all pointer operations and facts we do |
840 or do not know about pointers. Solving this set of constraints | |
841 yields a conservatively correct solution for each pointer | |
842 variable in the program (though we are only interested in | |
843 SSA name pointers) as to what it may possibly point to. | |
844 | |
845 This points-to solution for a given SSA name pointer is stored | |
846 in the @code{pt_solution} sub-structure of the | |
847 @code{SSA_NAME_PTR_INFO} record. The following accessor | |
848 functions are available: | |
842 | 849 |
843 @itemize @bullet | 850 @itemize @bullet |
844 @item Assignments of the form @code{P_i = &VAR} | 851 @item @code{pt_solution_includes} |
845 @item Assignments of the form P_i = malloc() | 852 @item @code{pt_solutions_intersect} |
846 @item Pointers and ADDR_EXPR that escape the current function. | |
847 @end itemize | 853 @end itemize |
848 | 854 |
849 The concept of `escaping' is the same one used in the Java world. | 855 Points-to analysis also computes the solution for two special |
850 When a pointer or an ADDR_EXPR escapes, it means that it has been | 856 set of pointers, @code{ESCAPED} and @code{CALLUSED}. Those |
851 exposed outside of the current function. So, assignment to | 857 represent all memory that has escaped the scope of analysis |
852 global variables, function arguments and returning a pointer are | 858 or that is used by pure or nested const calls. |
853 all escape sites. | 859 |
854 | 860 @item Type-based alias analysis |
855 This is where we are currently limited. Since not everything is | 861 |
856 renamed into SSA, we lose track of escape properties when a | 862 Type-based alias analysis is frontend dependent though generic |
857 pointer is stashed inside a field in a structure, for instance. | 863 support is provided by the middle-end in @code{alias.c}. TBAA |
858 In those cases, we are assuming that the pointer does escape. | 864 code is used by both tree optimizers and RTL optimizers. |
859 | |
860 We use escape analysis to determine whether a variable is | |
861 call-clobbered. Simply put, if an ADDR_EXPR escapes, then the | |
862 variable is call-clobbered. If a pointer P_i escapes, then all | |
863 the variables pointed-to by P_i (and its memory tag) also escape. | |
864 | |
865 @item Compute flow-sensitive aliases | |
866 | |
867 We have two classes of memory tags. Memory tags associated with | |
868 the pointed-to data type of the pointers in the program. These | |
869 tags are called ``symbol memory tag'' (SMT)@. The other class are | |
870 those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@. | |
871 The basic idea is that when adding operands for an INDIRECT_REF | |
872 *P_i, we will first check whether P_i has a name tag, if it does | |
873 we use it, because that will have more precise aliasing | |
874 information. Otherwise, we use the standard symbol tag. | |
875 | |
876 In this phase, we go through all the pointers we found in | |
877 points-to analysis and create alias sets for the name memory tags | |
878 associated with each pointer P_i. If P_i escapes, we mark | |
879 call-clobbered the variables it points to and its tag. | |
880 | |
881 | |
882 @item Compute flow-insensitive aliases | |
883 | |
884 This pass will compare the alias set of every symbol memory tag and | |
885 every addressable variable found in the program. Given a symbol | |
886 memory tag SMT and an addressable variable V@. If the alias sets | |
887 of SMT and V conflict (as computed by may_alias_p), then V is | |
888 marked as an alias tag and added to the alias set of SMT@. | |
889 | 865 |
890 Every language that wishes to perform language-specific alias analysis | 866 Every language that wishes to perform language-specific alias analysis |
891 should define a function that computes, given a @code{tree} | 867 should define a function that computes, given a @code{tree} |
892 node, an alias set for the node. Nodes in different alias sets are not | 868 node, an alias set for the node. Nodes in different alias sets are not |
893 allowed to alias. For an example, see the C front-end function | 869 allowed to alias. For an example, see the C front-end function |
894 @code{c_get_alias_set}. | 870 @code{c_get_alias_set}. |
871 | |
872 @item Tree alias-oracle | |
873 | |
874 The tree alias-oracle provides means to disambiguate two memory | |
875 references and memory references against statements. The following | |
876 queries are available: | |
877 | |
878 @itemize @bullet | |
879 @item @code{refs_may_alias_p} | |
880 @item @code{ref_maybe_used_by_stmt_p} | |
881 @item @code{stmt_may_clobber_ref_p} | |
882 @end itemize | |
883 | |
884 In addition to those two kind of statement walkers are available | |
885 walking statements related to a reference ref. | |
886 @code{walk_non_aliased_vuses} walks over dominating memory defining | |
887 statements and calls back if the statement does not clobber ref | |
888 providing the non-aliased VUSE. The walk stops at | |
889 the first clobbering statement or if asked to. | |
890 @code{walk_aliased_vdefs} walks over dominating memory defining | |
891 statements and calls back on each statement clobbering ref | |
892 providing its aliasing VDEF. The walk stops if asked to. | |
893 | |
895 @end enumerate | 894 @end enumerate |
896 | 895 |
897 For instance, consider the following function: | 896 |
898 | 897 @node Memory model |
899 @smallexample | 898 @section Memory model |
900 foo (int i) | 899 @cindex memory model |
901 @{ | 900 |
902 int *p, *q, a, b; | 901 The memory model used by the middle-end models that of the C/C++ |
903 | 902 languages. The middle-end has the notion of an effective type |
904 if (i > 10) | 903 of a memory region which is used for type-based alias analysis. |
905 p = &a; | 904 |
906 else | 905 The following is a refinement of ISO C99 6.5/6, clarifying the block copy case |
907 q = &b; | 906 to follow common sense and extending the concept of a dynamic effective |
908 | 907 type to objects with a declared type as required for C++. |
909 *p = 3; | 908 |
910 *q = 5; | 909 @smallexample |
911 a = b + 2; | 910 The effective type of an object for an access to its stored value is |
912 return *p; | 911 the declared type of the object or the effective type determined by |
913 @} | 912 a previous store to it. If a value is stored into an object through |
914 @end smallexample | 913 an lvalue having a type that is not a character type, then the |
915 | 914 type of the lvalue becomes the effective type of the object for that |
916 After aliasing analysis has finished, the symbol memory tag for | 915 access and for subsequent accesses that do not modify the stored value. |
917 pointer @code{p} will have two aliases, namely variables @code{a} and | 916 If a value is copied into an object using @code{memcpy} or @code{memmove}, |
918 @code{b}. | 917 or is copied as an array of character type, then the effective type |
919 Every time pointer @code{p} is dereferenced, we want to mark the | 918 of the modified object for that access and for subsequent accesses that |
920 operation as a potential reference to @code{a} and @code{b}. | 919 do not modify the value is undetermined. For all other accesses to an |
921 | 920 object, the effective type of the object is simply the type of the |
922 @smallexample | 921 lvalue used for the access. |
923 foo (int i) | 922 @end smallexample |
924 @{ | 923 |
925 int *p, a, b; | |
926 | |
927 if (i_2 > 10) | |
928 p_4 = &a; | |
929 else | |
930 p_6 = &b; | |
931 # p_1 = PHI <p_4(1), p_6(2)>; | |
932 | |
933 # a_7 = VDEF <a_3>; | |
934 # b_8 = VDEF <b_5>; | |
935 *p_1 = 3; | |
936 | |
937 # a_9 = VDEF <a_7> | |
938 # VUSE <b_8> | |
939 a_9 = b_8 + 2; | |
940 | |
941 # VUSE <a_9>; | |
942 # VUSE <b_8>; | |
943 return *p_1; | |
944 @} | |
945 @end smallexample | |
946 | |
947 In certain cases, the list of may aliases for a pointer may grow | |
948 too large. This may cause an explosion in the number of virtual | |
949 operands inserted in the code. Resulting in increased memory | |
950 consumption and compilation time. | |
951 | |
952 When the number of virtual operands needed to represent aliased | |
953 loads and stores grows too large (configurable with @option{--param | |
954 max-aliased-vops}), alias sets are grouped to avoid severe | |
955 compile-time slow downs and memory consumption. The alias | |
956 grouping heuristic proceeds as follows: | |
957 | |
958 @enumerate | |
959 @item Sort the list of pointers in decreasing number of contributed | |
960 virtual operands. | |
961 | |
962 @item Take the first pointer from the list and reverse the role | |
963 of the memory tag and its aliases. Usually, whenever an | |
964 aliased variable Vi is found to alias with a memory tag | |
965 T, we add Vi to the may-aliases set for T@. Meaning that | |
966 after alias analysis, we will have: | |
967 | |
968 @smallexample | |
969 may-aliases(T) = @{ V1, V2, V3, @dots{}, Vn @} | |
970 @end smallexample | |
971 | |
972 This means that every statement that references T, will get | |
973 @code{n} virtual operands for each of the Vi tags. But, when | |
974 alias grouping is enabled, we make T an alias tag and add it | |
975 to the alias set of all the Vi variables: | |
976 | |
977 @smallexample | |
978 may-aliases(V1) = @{ T @} | |
979 may-aliases(V2) = @{ T @} | |
980 @dots{} | |
981 may-aliases(Vn) = @{ T @} | |
982 @end smallexample | |
983 | |
984 This has two effects: (a) statements referencing T will only get | |
985 a single virtual operand, and, (b) all the variables Vi will now | |
986 appear to alias each other. So, we lose alias precision to | |
987 improve compile time. But, in theory, a program with such a high | |
988 level of aliasing should not be very optimizable in the first | |
989 place. | |
990 | |
991 @item Since variables may be in the alias set of more than one | |
992 memory tag, the grouping done in step (2) needs to be extended | |
993 to all the memory tags that have a non-empty intersection with | |
994 the may-aliases set of tag T@. For instance, if we originally | |
995 had these may-aliases sets: | |
996 | |
997 @smallexample | |
998 may-aliases(T) = @{ V1, V2, V3 @} | |
999 may-aliases(R) = @{ V2, V4 @} | |
1000 @end smallexample | |
1001 | |
1002 In step (2) we would have reverted the aliases for T as: | |
1003 | |
1004 @smallexample | |
1005 may-aliases(V1) = @{ T @} | |
1006 may-aliases(V2) = @{ T @} | |
1007 may-aliases(V3) = @{ T @} | |
1008 @end smallexample | |
1009 | |
1010 But note that now V2 is no longer aliased with R@. We could | |
1011 add R to may-aliases(V2), but we are in the process of | |
1012 grouping aliases to reduce virtual operands so what we do is | |
1013 add V4 to the grouping to obtain: | |
1014 | |
1015 @smallexample | |
1016 may-aliases(V1) = @{ T @} | |
1017 may-aliases(V2) = @{ T @} | |
1018 may-aliases(V3) = @{ T @} | |
1019 may-aliases(V4) = @{ T @} | |
1020 @end smallexample | |
1021 | |
1022 @item If the total number of virtual operands due to aliasing is | |
1023 still above the threshold set by max-alias-vops, go back to (2). | |
1024 @end enumerate |