Typestate Checking of Machine Code
Zhichen Xu, Thomas Reps, and Barton P. Miller
We check statically whether it is safe for untrusted foreign machine
code to be loaded into a trusted host system. (Here ``safety'' means
that the program abides by a memory-access policy that is supplied on
the host side.) Our technique works on ordinary machine code, and
mechanically synthesizes (and verifies) a safety proof. Our earlier
work along these lines was based on a C-like type system, which does
not suffice for machine code whose origin is C++ source code. In the
present paper, we address this limitation with an improved typestate
system and introduce several new techniques, including: summarizing
the effects of function calls so that our analysis can stop at trusted
boundaries, inferring information about the sizes and types of
stack-allocated arrays, and a symbolic range analysis for propagating
information about array bounds. These techniques make our approach to
safety checking more precise, more efficient, and able to handle a
larger collection of real-life code sequences than was previously the
case. For example, allowing subtyping among structures and pointers
allowed our implementation to analyze code originating from
object-oriented source code. The use of symbolic range analysis
eliminated 60% of the total attempts to synthesize loop invariants in
the 11 programs of our test suite that have array accesses. In 4 of
these programs, it eliminated the need to synthesize loop invariants
altogether. The resulting speedup for the global-verification phase
of the system ranges from -0.4% to 63% (with a median of 37%).
(Click here to access the paper:
PostScript,
PDF.)
University of Wisconsin