annotate gcc/doc/analyzer.texi @ 145:1830386684a0

gcc-9.2.0
author anatofuz
date Thu, 13 Feb 2020 11:34:05 +0900
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
145
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
1 @c Copyright (C) 2019 Free Software Foundation, Inc.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
2 @c This is part of the GCC manual.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
3 @c For copying conditions, see the file gcc.texi.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
4 @c Contributed by David Malcolm <dmalcolm@redhat.com>.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
5
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
6 @node Static Analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
7 @chapter Static Analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
8 @cindex analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
9 @cindex static analysis
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
10 @cindex static analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
11
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
12 @menu
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
13 * Analyzer Internals:: Analyzer Internals
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
14 * Debugging the Analyzer:: Useful debugging tips
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
15 @end menu
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
16
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
17 @node Analyzer Internals
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
18 @section Analyzer Internals
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
19 @cindex analyzer, internals
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
20 @cindex static analyzer, internals
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
21
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
22 @subsection Overview
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
23
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
24 The analyzer implementation works on the gimple-SSA representation.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
25 (I chose this in the hopes of making it easy to work with LTO to
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
26 do whole-program analysis).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
27
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
28 The implementation is read-only: it doesn't attempt to change anything,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
29 just emit warnings.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
30
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
31 The gimple representation can be seen using @option{-fdump-ipa-analyzer}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
32
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
33 First, we build a @code{supergraph} which combines the callgraph and all
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
34 of the CFGs into a single directed graph, with both interprocedural and
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
35 intraprocedural edges. The nodes and edges in the supergraph are called
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
36 ``supernodes'' and ``superedges'', and often referred to in code as
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
37 @code{snodes} and @code{sedges}. Basic blocks in the CFGs are split at
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
38 interprocedural calls, so there can be more than one supernode per
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
39 basic block. Most statements will be in just one supernode, but a call
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
40 statement can appear in two supernodes: at the end of one for the call,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
41 and again at the start of another for the return.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
42
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
43 The supergraph can be seen using @option{-fdump-analyzer-supergraph}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
44
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
45 We then build an @code{analysis_plan} which walks the callgraph to
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
46 determine which calls might be suitable for being summarized (rather
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
47 than fully explored) and thus in what order to explore the functions.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
48
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
49 Next is the heart of the analyzer: we use a worklist to explore state
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
50 within the supergraph, building an "exploded graph".
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
51 Nodes in the exploded graph correspond to <point,@w{ }state> pairs, as in
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
52 "Precise Interprocedural Dataflow Analysis via Graph Reachability"
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
53 (Thomas Reps, Susan Horwitz and Mooly Sagiv).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
54
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
55 We reuse nodes for <point, state> pairs we've already seen, and avoid
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
56 tracking state too closely, so that (hopefully) we rapidly converge
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
57 on a final exploded graph, and terminate the analysis. We also bail
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
58 out if the number of exploded <end-of-basic-block, state> nodes gets
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
59 larger than a particular multiple of the total number of basic blocks
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
60 (to ensure termination in the face of pathological state-explosion
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
61 cases, or bugs). We also stop exploring a point once we hit a limit
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
62 of states for that point.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
63
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
64 We can identify problems directly when processing a <point,@w{ }state>
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
65 instance. For example, if we're finding the successors of
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
66
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
67 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
68 <point: before-stmt: "free (ptr);",
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
69 state: @{"ptr": freed@}>
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
70 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
71
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
72 then we can detect a double-free of "ptr". We can then emit a path
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
73 to reach the problem by finding the simplest route through the graph.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
74
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
75 Program points in the analysis are much more fine-grained than in the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
76 CFG and supergraph, with points (and thus potentially exploded nodes)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
77 for various events, including before individual statements.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
78 By default the exploded graph merges multiple consecutive statements
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
79 in a supernode into one exploded edge to minimize the size of the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
80 exploded graph. This can be suppressed via
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
81 @option{-fanalyzer-fine-grained}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
82 The fine-grained approach seems to make things simpler and more debuggable
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
83 that other approaches I tried, in that each point is responsible for one
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
84 thing.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
85
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
86 Program points in the analysis also have a "call string" identifying the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
87 stack of callsites below them, so that paths in the exploded graph
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
88 correspond to interprocedurally valid paths: we always return to the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
89 correct call site, propagating state information accordingly.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
90 We avoid infinite recursion by stopping the analysis if a callsite
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
91 appears more than @code{analyzer-max-recursion-depth} in a callstring
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
92 (defaulting to 2).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
93
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
94 @subsection Graphs
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
95
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
96 Nodes and edges in the exploded graph are called ``exploded nodes'' and
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
97 ``exploded edges'' and often referred to in the code as
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
98 @code{enodes} and @code{eedges} (especially when distinguishing them
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
99 from the @code{snodes} and @code{sedges} in the supergraph).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
100
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
101 Each graph numbers its nodes, giving unique identifiers - supernodes
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
102 are referred to throughout dumps in the form @samp{SN': @var{index}} and
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
103 exploded nodes in the form @samp{EN: @var{index}} (e.g. @samp{SN: 2} and
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
104 @samp{EN:29}).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
105
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
106 The supergraph can be seen using @option{-fdump-analyzer-supergraph-graph}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
107
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
108 The exploded graph can be seen using @option{-fdump-analyzer-exploded-graph}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
109 and other dump options. Exploded nodes are color-coded in the .dot output
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
110 based on state-machine states to make it easier to see state changes at
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
111 a glance.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
112
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
113 @subsection State Tracking
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
114
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
115 There's a tension between:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
116 @itemize @bullet
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
117 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
118 precision of analysis in the straight-line case, vs
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
119 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
120 exponential blow-up in the face of control flow.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
121 @end itemize
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
122
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
123 For example, in general, given this CFG:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
124
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
125 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
126 A
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
127 / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
128 B C
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
129 \ /
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
130 D
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
131 / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
132 E F
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
133 \ /
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
134 G
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
135 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
136
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
137 we want to avoid differences in state-tracking in B and C from
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
138 leading to blow-up. If we don't prevent state blowup, we end up
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
139 with exponential growth of the exploded graph like this:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
140
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
141 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
142
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
143 1:A
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
144 / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
145 / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
146 / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
147 2:B 3:C
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
148 | |
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
149 4:D 5:D (2 exploded nodes for D)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
150 / \ / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
151 6:E 7:F 8:E 9:F
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
152 | | | |
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
153 10:G 11:G 12:G 13:G (4 exploded nodes for G)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
154
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
155 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
156
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
157 Similar issues arise with loops.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
158
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
159 To prevent this, we follow various approaches:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
160
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
161 @enumerate a
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
162 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
163 state pruning: which tries to discard state that won't be relevant
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
164 later on withing the function.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
165 This can be disabled via @option{-fno-analyzer-state-purge}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
166
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
167 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
168 state merging. We can try to find the commonality between two
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
169 program_state instances to make a third, simpler program_state.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
170 We have two strategies here:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
171
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
172 @enumerate
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
173 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
174 the worklist keeps new nodes for the same program_point together,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
175 and tries to merge them before processing, and thus before they have
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
176 successors. Hence, in the above, the two nodes for D (4 and 5) reach
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
177 the front of the worklist together, and we create a node for D with
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
178 the merger of the incoming states.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
179
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
180 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
181 try merging with the state of existing enodes for the program_point
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
182 (which may have already been explored). There will be duplication,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
183 but only one set of duplication; subsequent duplicates are more likely
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
184 to hit the cache. In particular, (hopefully) all merger chains are
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
185 finite, and so we guarantee termination.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
186 This is intended to help with loops: we ought to explore the first
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
187 iteration, and then have a "subsequent iterations" exploration,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
188 which uses a state merged from that of the first, to be more abstract.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
189 @end enumerate
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
190
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
191 We avoid merging pairs of states that have state-machine differences,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
192 as these are the kinds of differences that are likely to be most
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
193 interesting. So, for example, given:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
194
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
195 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
196 if (condition)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
197 ptr = malloc (size);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
198 else
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
199 ptr = local_buf;
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
200
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
201 .... do things with 'ptr'
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
202
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
203 if (condition)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
204 free (ptr);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
205
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
206 ...etc
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
207 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
208
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
209 then we end up with an exploded graph that looks like this:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
210
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
211 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
212
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
213 if (condition)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
214 / T \ F
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
215 --------- ----------
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
216 / \
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
217 ptr = malloc (size) ptr = local_buf
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
218 | |
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
219 copy of copy of
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
220 "do things with 'ptr'" "do things with 'ptr'"
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
221 with ptr: heap-allocated with ptr: stack-allocated
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
222 | |
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
223 if (condition) if (condition)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
224 | known to be T | known to be F
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
225 free (ptr); |
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
226 \ /
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
227 -----------------------------
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
228 | ('ptr' is pruned, so states can be merged)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
229 etc
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
230
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
231 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
232
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
233 where some duplication has occurred, but only for the places where the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
234 the different paths are worth exploringly separately.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
235
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
236 Merging can be disabled via @option{-fno-analyzer-state-merge}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
237 @end enumerate
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
238
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
239 @subsection Region Model
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
240
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
241 Part of the state stored at a @code{exploded_node} is a @code{region_model}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
242 This is an implementation of the region-based ternary model described in
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
243 @url{http://lcs.ios.ac.cn/~xuzb/canalyze/memmodel.pdf,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
244 "A Memory Model for Static Analysis of C Programs"}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
245 (Zhongxing Xu, Ted Kremenek, and Jian Zhang).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
246
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
247 A @code{region_model} encapsulates a representation of the state of
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
248 memory, with a tree of @code{region} instances, along with their associated
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
249 values. The representation is graph-like because values can be pointers
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
250 to regions. It also stores a constraint_manager, capturing relationships
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
251 between the values.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
252
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
253 Because each node in the @code{exploded_graph} has a @code{region_model},
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
254 and each of the latter is graph-like, the @code{exploded_graph} is in some
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
255 ways a graph of graphs.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
256
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
257 Here's an example of printing a @code{region_model}, showing the ASCII-art
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
258 used to visualize the region hierarchy (colorized when printing to stderr):
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
259
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
260 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
261 (gdb) call debug (*this)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
262 r0: @{kind: 'root', parent: null, sval: null@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
263 |-stack: r1: @{kind: 'stack', parent: r0, sval: sv1@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
264 | |: sval: sv1: @{poisoned: uninit@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
265 | |-frame for 'test': r2: @{kind: 'frame', parent: r1, sval: null, map: @{'ptr_3': r3@}, function: 'test', depth: 0@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
266 | | `-'ptr_3': r3: @{kind: 'map', parent: r2, sval: sv3, type: 'void *', map: @{@}@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
267 | | |: sval: sv3: @{type: 'void *', unknown@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
268 | | |: type: 'void *'
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
269 | `-frame for 'calls_malloc': r4: @{kind: 'frame', parent: r1, sval: null, map: @{'result_3': r7, '_4': r8, '<anonymous>': r5@}, function: 'calls_malloc', depth: 1@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
270 | |-'<anonymous>': r5: @{kind: 'map', parent: r4, sval: sv4, type: 'void *', map: @{@}@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
271 | | |: sval: sv4: @{type: 'void *', &r6@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
272 | | |: type: 'void *'
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
273 | |-'result_3': r7: @{kind: 'map', parent: r4, sval: sv4, type: 'void *', map: @{@}@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
274 | | |: sval: sv4: @{type: 'void *', &r6@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
275 | | |: type: 'void *'
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
276 | `-'_4': r8: @{kind: 'map', parent: r4, sval: sv4, type: 'void *', map: @{@}@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
277 | |: sval: sv4: @{type: 'void *', &r6@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
278 | |: type: 'void *'
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
279 `-heap: r9: @{kind: 'heap', parent: r0, sval: sv2@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
280 |: sval: sv2: @{poisoned: uninit@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
281 `-r6: @{kind: 'symbolic', parent: r9, sval: null, map: @{@}@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
282 svalues:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
283 sv0: @{type: 'size_t', '1024'@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
284 sv1: @{poisoned: uninit@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
285 sv2: @{poisoned: uninit@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
286 sv3: @{type: 'void *', unknown@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
287 sv4: @{type: 'void *', &r6@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
288 constraint manager:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
289 equiv classes:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
290 ec0: @{sv0 == '1024'@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
291 ec1: @{sv4@}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
292 constraints:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
293 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
294
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
295 This is the state at the point of returning from @code{calls_malloc} back
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
296 to @code{test} in the following:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
297
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
298 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
299 void *
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
300 calls_malloc (void)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
301 @{
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
302 void *result = malloc (1024);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
303 return result;
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
304 @}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
305
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
306 void test (void)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
307 @{
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
308 void *ptr = calls_malloc ();
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
309 /* etc. */
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
310 @}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
311 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
312
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
313 The ``root'' region (``r0'') has a ``stack'' child (``r1''), with two
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
314 children: a frame for @code{test} (``r2''), and a frame for
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
315 @code{calls_malloc} (``r4''). These frame regions have child regions for
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
316 storing their local variables. For example, the return region
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
317 and that of various other regions within the ``calls_malloc'' frame all have
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
318 value ``sv4'', a pointer to a heap-allocated region ``r6''. Within the parent
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
319 frame, @code{ptr_3} has value ``sv3'', an unknown @code{void *}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
320
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
321 @subsection Analyzer Paths
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
322
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
323 We need to explain to the user what the problem is, and to persuade them
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
324 that there really is a problem. Hence having a @code{diagnostic_path}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
325 isn't just an incidental detail of the analyzer; it's required.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
326
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
327 Paths ought to be:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
328 @itemize @bullet
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
329 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
330 interprocedurally-valid
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
331 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
332 feasible
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
333 @end itemize
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
334
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
335 Without state-merging, all paths in the exploded graph are feasible
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
336 (in terms of constraints being satisified).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
337 With state-merging, paths in the exploded graph can be infeasible.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
338
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
339 We collate warnings and only emit them for the simplest path
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
340 e.g. for a bug in a utility function, with lots of routes to calling it,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
341 we only emit the simplest path (which could be intraprocedural, if
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
342 it can be reproduced without a caller). We apply a check that
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
343 each duplicate warning's shortest path is feasible, rejecting any
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
344 warnings for which the shortest path is infeasible (which could lead to
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
345 false negatives).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
346
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
347 We use the shortest feasible @code{exploded_path} through the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
348 @code{exploded_graph} (a list of @code{exploded_edge *}) to build a
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
349 @code{diagnostic_path} (a list of events for the diagnostic subsystem) -
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
350 specifically a @code{checker_path}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
351
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
352 Having built the @code{checker_path}, we prune it to try to eliminate
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
353 events that aren't relevant, to minimize how much the user has to read.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
354
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
355 After pruning, we notify each event in the path of its ID and record the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
356 IDs of interesting events, allowing for events to refer to other events
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
357 in their descriptions. The @code{pending_diagnostic} class has various
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
358 vfuncs to support emitting more precise descriptions, so that e.g.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
359
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
360 @itemize @bullet
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
361 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
362 a deref-of-unchecked-malloc diagnostic might use:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
363 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
364 returning possibly-NULL pointer to 'make_obj' from 'allocator'
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
365 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
366 for a @code{return_event} to make it clearer how the unchecked value moves
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
367 from callee back to caller
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
368 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
369 a double-free diagnostic might use:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
370 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
371 second 'free' here; first 'free' was at (3)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
372 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
373 and a use-after-free might use
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
374 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
375 use after 'free' here; memory was freed at (2)
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
376 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
377 @end itemize
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
378
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
379 At this point we can emit the diagnostic.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
380
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
381 @subsection Limitations
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
382
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
383 @itemize @bullet
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
384 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
385 Only for C so far
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
386 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
387 The implementation of call summaries is currently very simplistic.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
388 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
389 Lack of function pointer analysis
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
390 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
391 The constraint-handling code assumes reflexivity in some places
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
392 (that values are equal to themselves), which is not the case for NaN.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
393 As a simple workaround, constraints on floating-point values are
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
394 currently ignored.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
395 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
396 The region model code creates lots of little mutable objects at each
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
397 @code{region_model} (and thus per @code{exploded_node}) rather than
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
398 sharing immutable objects and having the mutable state in the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
399 @code{program_state} or @code{region_model}. The latter approach might be
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
400 more efficient, and might avoid dealing with IDs rather than pointers
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
401 (which requires us to impose an ordering to get meaningful equality).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
402 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
403 The region model code doesn't yet support @code{memcpy}. At the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
404 gimple-ssa level these have been optimized to statements like this:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
405 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
406 _10 = MEM <long unsigned int> [(char * @{ref-all@})&c]
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
407 MEM <long unsigned int> [(char * @{ref-all@})&d] = _10;
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
408 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
409 Perhaps they could be supported via a new @code{compound_svalue} type.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
410 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
411 There are various other limitations in the region model (grep for TODO/xfail
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
412 in the testsuite).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
413 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
414 The constraint_manager's implementation of transitivity is currently too
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
415 expensive to enable by default and so must be manually enabled via
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
416 @option{-fanalyzer-transitivity}).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
417 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
418 The checkers are currently hardcoded and don't allow for user extensibility
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
419 (e.g. adding allocate/release pairs).
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
420 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
421 Although the analyzer's test suite has a proof-of-concept test case for
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
422 LTO, LTO support hasn't had extensive testing. There are various
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
423 lang-specific things in the analyzer that assume C rather than LTO.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
424 For example, SSA names are printed to the user in ``raw'' form, rather
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
425 than printing the underlying variable name.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
426 @end itemize
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
427
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
428 Some ideas for other checkers
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
429 @itemize @bullet
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
430 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
431 File-descriptor-based APIs
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
432 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
433 Linux kernel internal APIs
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
434 @item
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
435 Signal handling
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
436 @end itemize
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
437
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
438 @node Debugging the Analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
439 @section Debugging the Analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
440 @cindex analyzer, debugging
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
441 @cindex static analyzer, debugging
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
442
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
443 @subsection Special Functions for Debugging the Analyzer
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
444
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
445 The analyzer recognizes various special functions by name, for use
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
446 in debugging the analyzer. Declarations can be seen in the testsuite
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
447 in @file{analyzer-decls.h}. None of these functions are actually
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
448 implemented.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
449
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
450 Add:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
451 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
452 __analyzer_break ();
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
453 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
454 to the source being analyzed to trigger a breakpoint in the analyzer when
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
455 that source is reached. By putting a series of these in the source, it's
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
456 much easier to effectively step through the program state as it's analyzed.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
457
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
458 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
459 __analyzer_dump ();
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
460 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
461
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
462 will dump the copious information about the analyzer's state each time it
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
463 reaches the call in its traversal of the source.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
464
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
465 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
466 __analyzer_dump_path ();
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
467 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
468
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
469 will emit a placeholder ``note'' diagnostic with a path to that call site,
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
470 if the analyzer finds a feasible path to it.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
471
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
472 The builtin @code{__analyzer_dump_exploded_nodes} will emit a warning
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
473 after analysis containing information on all of the exploded nodes at that
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
474 program point:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
475
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
476 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
477 __analyzer_dump_exploded_nodes (0);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
478 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
479
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
480 will output the number of ``processed'' nodes, and the IDs of
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
481 both ``processed'' and ``merger'' nodes, such as:
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
482
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
483 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
484 warning: 2 processed enodes: [EN: 56, EN: 58] merger(s): [EN: 54-55, EN: 57, EN: 59]
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
485 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
486
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
487 With a non-zero argument
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
488
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
489 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
490 __analyzer_dump_exploded_nodes (1);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
491 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
492
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
493 it will also dump all of the states within the ``processed'' nodes.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
494
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
495 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
496 __analyzer_dump_region_model ();
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
497 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
498 will dump the region_model's state to stderr.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
499
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
500 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
501 __analyzer_eval (expr);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
502 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
503 will emit a warning with text "TRUE", FALSE" or "UNKNOWN" based on the
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
504 truthfulness of the argument. This is useful for writing DejaGnu tests.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
505
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
506
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
507 @subsection Other Debugging Techniques
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
508
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
509 One approach when tracking down where a particular bogus state is
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
510 introduced into the @code{exploded_graph} is to add custom code to
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
511 @code{region_model::validate}.
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
512
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
513 For example, this custom code (added to @code{region_model::validate})
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
514 breaks with an assertion failure when a variable called @code{ptr}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
515 acquires a value that's unknown, using
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
516 @code{region_model::get_value_by_name} to locate the variable
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
517
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
518 @smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
519 /* Find a variable matching "ptr". */
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
520 svalue_id sid = get_value_by_name ("ptr");
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
521 if (!sid.null_p ())
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
522 @{
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
523 svalue *sval = get_svalue (sid);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
524 gcc_assert (sval->get_kind () != SK_UNKNOWN);
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
525 @}
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
526 @end smallexample
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
527
1830386684a0 gcc-9.2.0
anatofuz
parents:
diff changeset
528 making it easier to investigate further in a debugger when this occurs.