Wei Zhang, Chong Sun, Junghee Lim, Shan Lu, and Thomas Reps

ConMem: Detecting Crash-Triggering Concurrency Bugs through an Effect-Oriented Approach
University of Wisconsin

Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and nondeterministic interleaving space. In reality, without the resources to thoroughly check the interleaving space, critical concurrency bugs can slip into production versions and cause failures in the field. Approaches to making the best use of the limited resources and exposing severe concurrency bugs before software release would be desirable.

Unlike previous work that focuses on bugs caused by specific interleavings (e.g., races and atomicity violations), this article targets concurrency bugs that result in one type of severe effect: program crashes. Our study of the error-propagation process of real-world concurrency bugs reveals a common pattern (50% in our nondeadlock concurrency bug set) that is highly correlated with program crashes. We call this pattern concurrency-memory bugs: buggy interleavings directly cause memory bugs (NULL-pointer-dereferences, dangling-pointers, buffer-overflows, uninitialized-reads) on shared memory objects.

Guided by this study, we built ConMem to monitor program execution, analyze memory accesses and synchronizations, and predictively detect these common and severe concurrency-memory bugs. We also built a validator, ConMem-v, to automatically prune false positives by enforcing potential bug-triggering interleavings.

We evaluated ConMem using 7 open-source programs with 10 real-world concurrency bugs. ConMem detects more tested bugs (9 out of 10 bugs) than a lock-set-based race detector and an unserializable-interleaving detector, which detect 4 and 6 bugs, respectively, with a false-positive rate about one tenth of the compared tools. ConMem-v further prunes out all the false positives. ConMem has reasonable overhead suitable for development usage.

(Click here to access the paper: Link via ACM Digital Library.)