Extracting Output Formats from Executables
Junghee Lim, Thomas Reps, and Ben Liblit
We describe the design and implementation of FFE/x86
(File-Format Extractor for x86), an analysis tool that works
on stripped executables (i.e., neither source code nor debugging
information need be available) and extracts output
data formats, such as file formats and network packet formats.
We first construct a Hierarchical Finite State Machine
(HFSM) that over-approximates the output data format. An
HFSM defines a language over the operations used to generate
output data. We use Value-Set Analysis (VSA) and Aggregate
Structure Identification (ASI) to annotate HFSMs
with information that partially characterizes some of the
output data values. VSA determines an over-approximation
of the set of addresses and integer values that each data
object can hold at each program point, and ASI analyzes
memory accesses in the program to recover information
about the structure of aggregates. A series of filtering operations
is performed to over-approximate an HFSM with
a finite-state machine, which can result in a final answer
that is easier to understand. Our experiments with FFE/x86
uncovered a possible bug in the image-conversion utility
png2ico.
University of Wisconsin
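To make the idea of over-approximating an HFSM with a finite-state machine concrete, here is a minimal sketch in Python. It is not the filtering procedure used by FFE/x86, which the abstract does not spell out. It assumes one simple, well-known technique: context-insensitive flattening, where each call becomes an epsilon edge to the callee's entry and each callee exit gets an epsilon edge back to every return site. The names `Machine`, `flatten`, `accepts`, and the `emit_u32` example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """One level of the hierarchy (roughly one procedure). Hypothetical model."""
    entry: str
    exit: str
    # Each edge is (source, label, target). A label is either an output
    # operation such as "write(u32)" or a tuple ("call", callee_name).
    edges: list = field(default_factory=list)


def flatten(machines, main):
    """Flatten an HFSM into an NFA; None is the epsilon label.

    Calls and returns are not matched, so the NFA accepts every sequence
    the HFSM accepts, plus possibly some spurious ones.
    """
    nfa = []
    for name, m in machines.items():
        for src, label, dst in m.edges:
            if isinstance(label, tuple) and label[0] == "call":
                callee = machines[label[1]]
                nfa.append(((name, src), None, (label[1], callee.entry)))
                nfa.append(((label[1], callee.exit), None, (name, dst)))
            else:
                nfa.append(((name, src), label, (name, dst)))
    return nfa, (main, machines[main].entry), (main, machines[main].exit)


def accepts(nfa, start, final, word):
    """Standard NFA simulation with epsilon closure."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            s = stack.pop()
            for src, label, dst in nfa:
                if src == s and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for sym in word:
        step = {dst for src, label, dst in nfa if src in current and label == sym}
        current = closure(step)
    return final in current


# Hypothetical program: main writes a magic number, calls emit_u32,
# writes a payload, then calls emit_u32 again.
machines = {
    "main": Machine("m0", "m4", [
        ("m0", "write(magic)", "m1"),
        ("m1", ("call", "emit_u32"), "m2"),
        ("m2", "write(payload)", "m3"),
        ("m3", ("call", "emit_u32"), "m4"),
    ]),
    "emit_u32": Machine("e0", "e1", [("e0", "write(u32)", "e1")]),
}
nfa, start, final = flatten(machines, "main")
exact = ["write(magic)", "write(u32)", "write(payload)", "write(u32)"]
spurious = ["write(magic)", "write(u32)"]
print(accepts(nfa, start, final, exact), accepts(nfa, start, final, spurious))
```

The flattened machine accepts the exact output sequence, and it also accepts the shorter `spurious` sequence because the first return from `emit_u32` may jump to the second return site. That loss of precision is the price of trading the hierarchy for a plain finite-state machine in this sketch.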