An Improved Algorithm for Slicing Machine Code

Venkatesh Srinivasan and Thomas Reps
University of Wisconsin

Machine-code slicing is an important primitive for building binary analysis and rewriting tools, such as taint trackers, fault localizers, and partial evaluators. However, it is not easy to create a machine-code slicer that exhibits a high level of precision. Moreover, the problem of creating such a tool is compounded by the fact that a small amount of local imprecision can be amplified via cascade effects.

Most instructions in instruction sets such as Intel's IA-32 and ARM are multi-assignments: they have several inputs and several outputs (registers, flags, and memory locations). This aspect of the instruction set introduces a granularity issue during slicing: there are often instructions at which we would like the slice to include only a subset of the instruction's multiple assignments, whereas the slice is forced to include the entire instruction. Consequently, the slice computed by state-of-the-art tools is very imprecise, often including essentially the entire program. We present an algorithm to slice machine code more accurately. To counter the granularity issue, our algorithm attempts to include in the slice only the subset of assignments in an instruction's semantics that is relevant to the slicing criterion. Our experiments on IA-32 binaries of FreeBSD utilities show that, in comparison to slices computed by a state-of-the-art tool, our algorithm reduces the number of instructions in backward slices by 36%, and in forward slices by 82%.

(Click here to access the paper: PDF.)