1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
|
{{+bindTo:partials.standard_nacl_article}}
<section id="pnacl-undefined-behavior">
<h1 id="pnacl-undefined-behavior">PNaCl Undefined Behavior</h1>
<div class="contents local" id="contents" style="display: none">
<ul class="small-gap">
<li><a class="reference internal" href="#overview" id="id2">Overview</a></li>
<li><a class="reference internal" href="#specification" id="id3">Specification</a></li>
<li><p class="first"><a class="reference internal" href="#behavior-in-pnacl-bitcode" id="id4">Behavior in PNaCl Bitcode</a></p>
<ul class="small-gap">
<li><a class="reference internal" href="#well-defined" id="id5">Well-Defined</a></li>
<li><p class="first"><a class="reference internal" href="#not-well-defined" id="id6">Not Well-Defined</a></p>
<ul class="small-gap">
<li><a class="reference internal" href="#potentially-fixable" id="id7">Potentially Fixable</a></li>
<li><a class="reference internal" href="#floating-point" id="id8">Floating-Point</a></li>
<li><a class="reference internal" href="#simd-vectors" id="id9">SIMD Vectors</a></li>
<li><a class="reference internal" href="#hard-to-fix" id="id10">Hard to Fix</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div><h2 id="overview"><span id="undefined-behavior"></span>Overview</h2>
<p>C and C++ undefined behavior allows efficient mapping of the source
language onto hardware, but leads to different behavior on different
platforms.</p>
<p>PNaCl exposes undefined behavior in the following ways:</p>
<ul class="small-gap">
<li><p class="first">The Clang frontend and optimizations that occur on the developer’s
machine determine what behavior will occur, and it will be specified
deterministically in the <em>pexe</em>. All targets will observe the same
behavior. In some cases, recompiling with a newer PNaCl SDK version
will either:</p>
<ul class="small-gap">
<li>Reliably emit the same behavior in the resulting <em>pexe</em>.</li>
<li>Change the behavior that gets specified in the <em>pexe</em>.</li>
</ul>
</li>
<li><p class="first">The behavior specified in the <em>pexe</em> relies on PNaCl’s bitcode,
runtime or CPU architecture vagaries.</p>
<ul class="small-gap">
<li>In some cases, the behavior using the same PNaCl translator version
on different architectures will produce different behavior.</li>
<li>Sometimes runtime parameters determine the behavior, e.g. memory
allocation determines which out-of-bounds accesses crash versus
returning garbage.</li>
<li>In some cases, different versions of the PNaCl translator
(i.e. after a Chrome update) will compile the code differently and
cause different behavior.</li>
<li>In some cases, the same versions of the PNaCl translator, on the
same architecture, will generate a different <em>nexe</em> for
defense-in-depth purposes, but may cause code that reads invalid
stack values or code sections on the heap to observe these
randomizations.</li>
</ul>
</li>
</ul>
<h2 id="specification">Specification</h2>
<p>PNaCl’s goal is that a single <em>pexe</em> should work reliably in the same
manner on all architectures, irrespective of runtime parameters and
through Chrome updates. This goal is unfortunately not attainable; PNaCl
therefore specifies as much as it can and outlines areas for
improvement.</p>
<p>One interesting solution is to offer good support for LLVM’s sanitizer
tools (including <a class="reference external" href="http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation">UBSan</a>)
at development time, so that developers can test their code against
undefined behavior. Shipping code would then still get good performance,
and diverging behavior would be rare.</p>
<p>Note that none of these issues are vulnerabilities in PNaCl and Chrome:
the NaCl sandboxing still constrains the code through Software Fault
Isolation.</p>
<h2 id="behavior-in-pnacl-bitcode">Behavior in PNaCl Bitcode</h2>
<h3 id="well-defined">Well-Defined</h3>
<p>The following are traditionally undefined behavior in C/C++ but are well
defined at the <em>pexe</em> level:</p>
<ul class="small-gap">
<li>Dynamic initialization order dependencies: the order is deterministic
in the <em>pexe</em>.</li>
<li>Bool which isn’t <code>0</code>/<code>1</code>: the bitcode instruction sequence is
deterministic in the <em>pexe</em>.</li>
<li>Out-of-range <code>enum</code> value: the backing integer type and bitcode
instruction sequence is deterministic in the <em>pexe</em>.</li>
<li>Aggressive optimizations based on type-based alias analysis: TBAA
optimizations are done before stable bitcode is generated and their
metadata is stripped from the <em>pexe</em>; behavior is therefore
deterministic in the <em>pexe</em>.</li>
<li>Operator and subexpression evaluation order in the same expression
(e.g. function parameter passing, or pre-increment): the order is
defined in the <em>pexe</em>.</li>
<li>Signed integer overflow: two’s complement integer arithmetic is
assumed.</li>
<li>Atomic access to a non-atomic memory location (not declared as
<code>std::atomic</code>): atomics and <code>volatile</code> variables all lower to the
same compatible intrinsics or external functions; the behavior is
therefore deterministic in the <em>pexe</em> (see <a class="reference internal" href="/native-client/reference/pnacl-c-cpp-language-support.html#memory-model-and-atomics"><em>Memory Model and
Atomics</em></a>).</li>
<li>Integer divide by zero: always raises a fault (through hardware on
x86, and through integer divide emulation routine or explicit checks
on ARM).</li>
</ul>
<h3 id="not-well-defined">Not Well-Defined</h3>
<p>The following are traditionally undefined behavior in C/C++ which also
exhibit undefined behavior at the <em>pexe</em> level. Some are easier to fix
than others.</p>
<h4 id="potentially-fixable">Potentially Fixable</h4>
<ul class="small-gap">
<li><p class="first">Shift by greater-than-or-equal to left-hand-side’s bit-width or
negative (see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3604">bug 3604</a>).</p>
<ul class="small-gap">
<li>Some of the behavior will be specified in the <em>pexe</em> depending on
constant propagation and integer type of variables.</li>
<li>There is still some architecture-specific behavior.</li>
<li>PNaCl could force-mask the right-hand-side to <cite>bitwidth-1</cite>, which
could become a no-op on some architectures while ensuring all
architectures behave similarly. Regular optimizations could also be
applied, removing redundant masks.</li>
</ul>
</li>
<li><p class="first">Using a virtual pointer of the wrong type, or of an unallocated
object.</p>
<ul class="small-gap">
<li>Will produce wrong results which will depend on what data is treated
as a <cite>vftable</cite>.</li>
<li>PNaCl could add runtime checks for this, and elide them when types
are provably correct (see this CFI <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3786">bug 3786</a>).</li>
</ul>
</li>
<li><p class="first">Some unaligned load/store (see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3445">bug 3445</a>).</p>
<ul class="small-gap">
<li>Could force everything to <cite>align 1</cite>; performance cost should be
measured.</li>
<li>The frontend could also be more pessimistic when it sees dubious
casts.</li>
</ul>
</li>
<li>Some values can be marked as <code>undef</code> (see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3796">bug 3796</a>).</li>
<li>Reaching end-of-value-returning-function without returning a value:
reduces to <code>ret i32 undef</code> in bitcode. This is mostly-defined, but
could be improved (see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3796">bug 3796</a>).</li>
<li><p class="first">Reaching “unreachable” code.</p>
<ul class="small-gap">
<li>LLVM provides an IR instruction called “unreachable” whose effect
will be undefined. PNaCl could change this to always trap, as the
<code>llvm.trap</code> intrinsic does.</li>
</ul>
</li>
<li>Zero or negative-sized variable-length array (and <code>alloca</code>) aren’t
defined behavior. PNaCl’s frontend or the translator could insert
checks with <code>-fsanitize=vla-bound</code>.</li>
</ul>
<h4 id="floating-point"><span id="undefined-behavior-fp"></span>Floating-Point</h4>
<p>PNaCl offers a IEEE-754 implementation which is as correct as the
underlying hardware allows, with a few limitations. These are a few
sources of undefined behavior which are believed to be fixable:</p>
<ul class="small-gap">
<li>Float cast overflow is currently undefined.</li>
<li>Float divide by zero is currently undefined.</li>
<li>The default denormal behavior is currently unspecified, which isn’t
IEEE-754 compliant (denormals must be supported in IEEE-754). PNaCl
could mandate flush-to-zero, and may give an API to enable denormals
in a future release. The latter is problematic for SIMD and
vectorization support, where some platforms do not support denormal
SIMD operations.</li>
<li><code>NaN</code> values are currently not guaranteed to be canonical; see <a class="reference external" href="https://code.google.com/p/nativeclient/issues/detail?id=3536">bug
3536</a>.</li>
<li>Passing <code>NaN</code> to STL functions (the math is defined, but the
function implementation isn’t, e.g. <code>std::min</code> and <code>std::max</code>), is
well-defined in the <em>pexe</em>.</li>
</ul>
<h4 id="simd-vectors">SIMD Vectors</h4>
<p>SIMD vector instructions aren’t part of the C/C++ standards and as such
their behavior isn’t specified at all in C/C++; it is usually left up to
the target architecture to specify behavior. Portable Native Client
instead exposed <a class="reference internal" href="/native-client/reference/pnacl-c-cpp-language-support.html#portable-simd-vectors"><em>Portable SIMD Vectors</em></a> and
offers the same guarantees on these vectors as the guarantees offered by
the contained elements. Of notable interest amongst these guarantees are
those of alignment for load/store instructions on vectors: they have the
same alignment restriction as the contained elements.</p>
<h4 id="hard-to-fix">Hard to Fix</h4>
<ul class="small-gap">
<li><p class="first">Null pointer/reference has behavior determined by the NaCl sandbox:</p>
<ul class="small-gap">
<li>Raises a segmentation fault in the bottom <code>64KiB</code> bytes on all
platforms, and on some sandboxes there are further non-writable
pages after the initial <code>64KiB</code>.</li>
<li>Negative offsets aren’t handled consistently on all platforms:
x86-64 and ARM will wrap around to the stack (because they mask the
address), whereas x86-32 will fault (because of segmentation).</li>
</ul>
</li>
<li><p class="first">Accessing uninitialized/free’d memory (including out-of-bounds array
access):</p>
<ul class="small-gap">
<li>Might cause a segmentation fault or not, depending on where memory
is allocated and how it gets reclaimed.</li>
<li>Added complexity because of the NaCl sandboxing: some of the
load/stores might be forced back into sandbox range, or eliminated
entirely if they fall out of the sandbox.</li>
</ul>
</li>
<li><p class="first">Executing non-program data (jumping to an address obtained from a
non-function pointer is undefined, can only do <code>void(*)()</code> to
<code>intptr_t</code> to <code>void(*)()</code>).</p>
<ul class="small-gap">
<li>Just-In-Time code generation is supported by NaCl, but is not
currently supported by PNaCl. It is currently not possible to mark
code as executable.</li>
<li>Offering full JIT capabilities would reduce PNaCl’s ability to
change the sandboxing model. It would also require a “jump to JIT
code” syscall (to guarantee a calling convention), and means that
JITs aren’t portable.</li>
<li>PNaCl could offer “portable” JIT capabilities where the code hands
PNaCl some form of LLVM IR, which PNaCl then JIT-compiles.</li>
</ul>
</li>
<li>Out-of-scope variable usage: will produce unknown data, mostly
dependent on stack and memory allocation.</li>
<li>Data races: any two operations that conflict (target overlapping
memory), at least one of which is a store or atomic read-modify-write,
and at least one of which is not atomic: this will be very dependent
on processor and execution sequence, see <a class="reference internal" href="/native-client/reference/pnacl-c-cpp-language-support.html#memory-model-and-atomics"><em>Memory Model and
Atomics</em></a>.</li>
</ul>
</section>
{{/partials.standard_nacl_article}}
|