Finding Concurrency-Related Bugs using Random Isolation

Nicholas Kidd, Thomas Reps, Julian Dolby, and Mandana Vaziri

This paper concerns automatically verifying safety properties of concurrent programs. In our work, the safety property of interest is to check for multi-location data races in concurrent Java programs, where a multi-location data race arises when a program is supposed to maintain an invariant over multiple data locations, but accesses/updates are not protected correctly by locks.

The main technical challenge that we address is how to generate a program model that retains (at least some of) the synchronization operations of the concrete program, when the concrete program uses dynamic memory allocation. In the presence of dynamic memory allocation, the finite number of abstract objects of the abstract program must represent the unbounded number of concrete objects that the concrete program may allocate, and thus by the pigeon-hole principle some of the abstract objects must be summary objects -- they represent more than one concrete object. Because abstract summary objects represent multiple concrete objects, the program analyzer typically must perform weak updates on the abstract state of a summary object, where a weak update accumulates information. Because weak updates accumulate rather than overwrite, the analyzer is only able to determine weak judgements on the abstract state, i.e., that some property possibly holds, and not that it definitely holds. The problem with weak judgements is that determining whether an interleaved execution respects program synchronization requires the ability to determine strong judgements, i.e., that some lock is definitely held, and thus the analyzer needs to be able to perform strong updates -- an overwrite of the abstract state -- to enable strong judgements.

We present the random-isolation abstraction as a new principle for enabling strong updates of special abstract objects. The idea is to associate with a program allocation site two abstract objects, r# and o#, where r# is a non-summary object and o# is a summary object. Abstract object r# models a distinguished concrete object that is chosen at random in each program execution. Because r# is a non-summary object---i.e, it models only one concrete object---strong updates are able to be performed on its abstract state. Because which concrete object r# models is chosen randomly, a proof that a safety property holds for r# generalizes to all objects modeled by o#.

We implemented the random-isolation abstraction in a tool called Empire, which verifies atomic-set serializability of concurrent Java programs. (Atomic-set serializability is one notion of multi-location data-race freedom.) Random isolation allows Empire to track lock states in ways that would not otherwise have been possible with conventional approaches.

(Click here to access the paper: PDF.)