An Improved Algorithm for Slicing Machine Code
Venkatesh Srinivasan and Thomas Reps
Machine-code slicing is an important primitive for building binary
analysis and rewriting tools, such as taint trackers, fault
localizers, and partial evaluators. However, it is not easy to create
a machine-code slicer that exhibits a high level of
precision. Moreover, the problem of creating such a tool is compounded
by the fact that a small amount of local imprecision can be amplified
via cascade effects.
Most instructions in instruction sets such as Intel's IA-32 and ARM
are multi-assignments: they have several inputs and several outputs
(registers, flags, and memory locations). This aspect of the
instruction set introduces a granularity issue during slicing: there
are often instructions at which we would like the slice to include
only a subset of the instruction's multiple assignments, whereas the
slice is forced to include the entire instruction. Consequently, the
slice computed by state-of-the-art tools is very imprecise, often
including essentially the entire program. We present an algorithm to
slice machine code more accurately. To counter the granularity issue,
our algorithm attempts to include in the slice only the subset of
assignments in an instruction's semantics that is relevant to the
slicing criterion. Our experiments on IA-32 binaries of FreeBSD
utilities show that, in comparison to slices computed by a
state-of-the-art tool, our algorithm reduces the number of
instructions in backward slices by 36%, and in forward slices by 82%.
(Click here to access the paper:
PDF.)
University of Wisconsin