Typestate Checking of Machine Code

Zhichen Xu, Thomas Reps, and Barton P. Miller
University of Wisconsin

We check statically whether it is safe for untrusted foreign machine code to be loaded into a trusted host system. (Here ``safety'' means that the program abides by a memory-access policy that is supplied on the host side.) Our technique works on ordinary machine code, and mechanically synthesizes (and verifies) a safety proof. Our earlier work along these lines was based on a C-like type system, which does not suffice for machine code whose origin is C++ source code. In the present paper, we address this limitation with an improved typestate system and introduce several new techniques, including: summarizing the effects of function calls so that our analysis can stop at trusted boundaries, inferring information about the sizes and types of stack-allocated arrays, and a symbolic range analysis for propagating information about array bounds. These techniques make our approach to safety checking more precise, more efficient, and able to handle a larger collection of real-life code sequences than was previously the case. For example, allowing subtyping among structures and pointers allowed our implementation to analyze code originating from object-oriented source code. The use of symbolic range analysis eliminated 60% of the total attempts to synthesize loop invariants in the 11 programs of our test suite that have array accesses. In 4 of these programs, it eliminated the need to synthesize loop invariants altogether. The resulting speedup for the global-verification phase of the system ranges from -0.4% to 63% (with a median of 37%).

(Click here to access the paper.)