Using Architecture Support to Make Concurrent and Parallel Software Less Buggy and More Reliable
The limits of single-thread performance and the demands of emerging applications have caused a shift toward increasingly concurrent and parallel software. For example, concurrency and parallelism unlock the performance and energy benefits of multi-core architectures, and many domains, such as servers, mobile devices, and cloud applications, require concurrency. Unfortunately, writing correct, reliable concurrent software is extremely difficult. In this talk, I will discuss my research on using architecture and system support to make programs easier to debug and less prone to failure.
First, I will present Recon, a new technique for concurrency debugging. Using a simple statistical model, Recon isolates and reconstructs the root cause of failures to help programmers understand their errors. With hardware support, Recon works efficiently even in production. In experiments with real, buggy programs (e.g., MySQL, Apache), we showed that Recon reveals bug root causes with few, and often zero, false positives.
Second, I will present Aviso, a new technique for avoiding failures in buggy concurrent programs. Aviso traces events as programs run. When an execution fails, Aviso uses the failing event trace and a statistical model to generate thread schedule constraints that prevent the same failure from occurring in the future. Collections of systems running Aviso can work cooperatively to find and share effective constraints. Our experiments with real software show that Aviso decreases failure rates by up to two orders of magnitude with performance overheads tolerable for production use.
Bio: Brandon Lucia is a sixth-year (and final-year) PhD student at the University of Washington. Brandon's research focuses on designing new computer architectures and systems that address the challenges of concurrency and parallelism. His thesis work developed architecture and system support for new programming and execution models, new debugging techniques, and new failure avoidance mechanisms for concurrent software. Brandon's work crosses the boundaries of traditional architecture, spanning not just hardware but also compilers, system software layers, and even application-level support such as statistical models. Brandon won the 2010 IBM PhD fellowship for his work in this area. Throughout his graduate career, Brandon has worked with his advisor, Luis Ceze, and many other wonderful collaborators from academia and industrial research labs, such as MSR and HP Labs. Brandon lives in Seattle with his cat.