Copyright (C) 2000-2020 Free Software Foundation, Inc.

This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.

The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
Until now this knowledge has been scattered around, so I
thought I'd collect it in one useful place. Please add and correct
any problems as you come across them.

I'm going to start from a base of the ISO C90 standard, since that is
probably what most people code to naturally. Obviously using
constructs introduced after that is not a good idea.

For the complete coding style conventions used in GCC, please read
http://gcc.gnu.org/codingconventions.html


String literals
---------------

Some compilers, such as MSVC++, have fairly low limits on the maximum
length of a string literal; 509 characters is the lowest we've come
across. You may need to break up a long printf statement into many
smaller ones.
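
As a rough sketch of what breaking up the output can look like, the
function below prints a usage message one short literal at a time.
The name print_usage and the option texts are invented for
illustration. Note that relying on adjacent-literal concatenation
would not help, because the C90 limit applies after concatenation.

```c
#include <stdio.h>

/* Hypothetical usage printer.  Each literal is kept short so that no
   single string approaches the 509-character limit mentioned above.
   Returns the number of fputs calls that succeeded.  */
static int
print_usage (FILE *stream)
{
  int ok = 0;

  ok += fputs ("Usage: prog [options] file...\n", stream) != EOF;
  ok += fputs ("  -o <file>   write output to <file>\n", stream) != EOF;
  ok += fputs ("  -v          print verbose messages\n", stream) != EOF;
  return ok;
}
```

A caller would simply write print_usage (stdout).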


Empty macro arguments
---------------------

ISO C (6.8.3 in the 1990 standard) specifies the following:

  If (before argument substitution) any argument consists of no
  preprocessing tokens, the behavior is undefined.

This was relaxed by ISO C99, but some older compilers emit an error,
so code like

  #define foo(x, y) x y
  foo (bar, )

needs to be coded in some other way.
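
One possible "other way", offered only as a sketch, is to pass a
macro that expands to nothing, so that the argument is not empty
before substitution. The names EMPTY and answer are assumptions made
up for this example; they are not taken from GCC.

```c
/* Hypothetical helper: a macro whose expansion is empty.  */
#define EMPTY
#define foo(x, y) x y

/* "foo (int, )" would pass an empty second argument.  Passing EMPTY
   instead gives the preprocessor a real token to substitute; the
   line below expands to "int answer = 42;".  */
foo (int, EMPTY) answer = 42;
```

Splitting foo into separate one-argument and two-argument macros is
another option when the placeholder looks too obscure.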


Avoid unnecessary test before free
----------------------------------

Since SunOS 4 stopped being a reasonable portability target (which
happened around 2007), there has been no need to guard against
"free (NULL)". Thus, any guard like the following constitutes a
redundant test:

  if (P)
    free (P);

It is better to avoid the test.[*]
Instead, simply free P, regardless of whether it is NULL.

[*] However, if your profiling exposes a test like this in a
performance-critical loop, say where P is nearly always NULL, and
the cost of calling free on a NULL pointer would be prohibitively
high, consider using __builtin_expect, e.g., like this:

  if (__builtin_expect (ptr != NULL, 0))
    free (ptr);
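
For the common, non-critical case, a cleanup routine might look like
the sketch below. The struct and the function name are hypothetical.
The only NULL test left is the one that protects a dereference.

```c
#include <stdlib.h>

/* Hypothetical record with optional, heap-allocated members.  */
struct record
{
  char *name;      /* May be NULL.  */
  char *comment;   /* May be NULL.  */
};

/* Release a record.  free (NULL) is a no-op in ISO C, so the members
   are freed without testing them first.  */
static void
record_free (struct record *rec)
{
  if (rec == NULL)
    return;   /* Still needed: REC is dereferenced below.  */
  free (rec->name);
  free (rec->comment);
  free (rec);
}
```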


Trigraphs
---------

You weren't going to use them anyway, but some otherwise ISO C
compliant compilers do not accept trigraphs.


Suffixes on Integer Constants
-----------------------------

You should never use a 'l' suffix on integer constants ('L' is fine),
since it can easily be confused with the number '1'.


Common Coding Pitfalls
======================

errno
-----

errno might be declared as a macro, so include <errno.h> rather than
declaring it yourself.
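
A small sketch of the safe pattern follows. The function name
parse_long is invented for this example. The point is only that
errno comes from <errno.h> and is never declared by hand.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse TEXT as a decimal long.  Returns 0 on success, or -1 if the
   value is out of range or no digits were consumed.  There is no
   "extern int errno;" here, because errno may be a macro.  */
static int
parse_long (const char *text, long *result)
{
  char *end;

  errno = 0;
  *result = strtol (text, &end, 10);
  if (errno == ERANGE || end == text)
    return -1;
  return 0;
}
```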


Implicit int
------------

In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write

  unsigned variable;

as shorthand for

  unsigned int variable;

There are several places where this can cause trouble. First, suppose
'variable' is a long; then you might think

  (unsigned) variable

would convert it to unsigned long. It does not. It converts to
unsigned int. This mostly causes problems on 64-bit platforms, where
long and int are not the same size.
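
The difference can be seen in a short sketch. The function names
narrowed and preserved are made up for illustration.

```c
#include <limits.h>   /* UINT_MAX and ULONG_MAX, used in the note below.  */

/* "(unsigned)" means "(unsigned int)", so this may discard the
   high-order bits when long is wider than int.  */
static unsigned long
narrowed (long value)
{
  return (unsigned) value;
}

/* Spelling out the full type keeps every bit.  */
static unsigned long
preserved (long value)
{
  return (unsigned long) value;
}
```

Here narrowed (-1L) yields UINT_MAX while preserved (-1L) yields
ULONG_MAX. The two only agree on platforms where int and long have
the same width.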

Second, if you write a function definition with no return type at
all:

  operate (int a, int b)
  {
    ...
  }

that function is expected to return int, *not* void. GCC will warn
about this.

Implicit function declarations always have return type int. So if you
correct the above definition to

  void
  operate (int a, int b)
  ...

but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration". The cure
is to prototype all functions at the top of the file, or in an
appropriate header.
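
A minimal sketch of the cure follows. The caller run and the variable
last_sum are hypothetical and exist only to make the example
complete.

```c
/* Prototype first, so every later call sees the real return type
   instead of an implicit "int operate ()".  */
static void operate (int a, int b);

static int last_sum;   /* Hypothetical state, for illustration only.  */

static void
run (void)
{
  operate (2, 3);   /* Called above the definition: fine.  */
}

static void
operate (int a, int b)
{
  last_sum = a + b;
}
```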

Char vs unsigned char vs int
----------------------------

In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice. When you are processing 7-bit ASCII, it does
not matter. But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem. The most obvious
issue is if you have a look-up table indexed by characters.

For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be
true. But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31. And the ctype table has no entry at offset -31, so
your program will crash. (If you're lucky.)

It is wise to use unsigned char everywhere you possibly can. This
avoids all these problems. Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.
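
When the data already lives in a plain char buffer, one way to stay
safe is to cast at the point of the <ctype.h> call. The sketch below
assumes a hypothetical helper named count_alpha.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters of TEXT in the current locale.
   The cast keeps the argument of isalpha within the range the
   standard requires (an unsigned char value or EOF), even where
   plain char is signed.  */
static size_t
count_alpha (const char *text)
{
  size_t n = 0;

  for (; *text != '\0'; text++)
    if (isalpha ((unsigned char) *text))
      n++;
  return n;
}
```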

Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions. They may
return EOF, which is outside the range of values representable by
char. If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
The correct choice is int.
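
A read loop written along these lines might look as follows. The
function name count_bytes is an assumption made for the example.

```c
#include <stdio.h>

/* Count the bytes left in STREAM.  C is an int, so the byte 0377
   and the EOF indicator remain distinct values.  */
static long
count_bytes (FILE *stream)
{
  int c;
  long n = 0;

  while ((c = getc (stream)) != EOF)
    n++;
  return n;
}
```

Had c been declared as a signed char, the loop could stop early at
the first 0377 byte. With unsigned char it would never see EOF at
all.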

A more subtle version of the same mistake might look like this:

  unsigned char pushback[NPUSHBACK];
  int pbidx;
  #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
  #define get(c) (pbidx ? pushback[--pbidx] : getchar())
  ...
  unget(EOF);

which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.
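
One possible repair, shown here only as a sketch, is to store the
pushed-back values as int. The value chosen for NPUSHBACK is
arbitrary, and get is written without a parameter in this version,
which is a change from the fragment above.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 4   /* Arbitrary size, chosen for illustration.  */

/* int elements, so that EOF survives a round trip through the
   buffer unchanged.  */
static int pushback[NPUSHBACK];
static int pbidx;

#define unget(c) (assert (pbidx < NPUSHBACK), pushback[pbidx++] = (c))
#define get()    (pbidx ? pushback[--pbidx] : getchar ())
```

With this layout, unget (EOF) followed by get () gives back EOF
rather than 255.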


Other common pitfalls
---------------------

o Expecting 'plain' char to be either sign-extending or
  zero-extending.

o Shifting an item by a negative amount or by greater than or equal to
  the number of bits in a type (expecting shifts by 32 to be sensible
  has caused quite a number of bugs at least in the early days).

o Expecting ints shifted right to be sign extended.

o Modifying the same value twice between two sequence points.

o Host vs. target floating point representation, including emitting NaNs
  and Infinities in a form that the assembler handles.

o qsort being an unstable sort function (unstable in the sense that
  multiple items that sort the same may be sorted in different orders
  by different qsort functions).

o Passing incorrect types to fprintf and friends.

o Adding a declaration for a function defined in another file to
  a .c file instead of to a .h file.
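
Two of the items above lend themselves to a short sketch: giving
qsort a comparison function that never reports a tie, and guarding a
shift count. The struct and both function names are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical element type: KEY is the sort key, SEQ records the
   original position and serves as a tie-breaker.  */
struct item
{
  int key;
  int seq;
};

/* Total order: two distinct items never compare equal, so the
   result does not depend on which qsort implementation is used.  */
static int
item_cmp (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->seq != b->seq)
    return a->seq < b->seq ? -1 : 1;
  return 0;
}

/* Left shift that yields 0 instead of undefined behavior when COUNT
   is at least the width of unsigned long.  */
static unsigned long
shift_left (unsigned long value, unsigned int count)
{
  if (count >= sizeof (unsigned long) * CHAR_BIT)
    return 0;
  return value << count;
}
```

The tie-breaker costs one extra field per element, which is usually
a small price for output that is identical on every host.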