Failure as a First Class Entity of Software Defined Networks
Despite the proven benefits of software-defined networking (SDN), there remains significant reluctance to adopt it. Among the issues that hamper SDN’s adoption, two stand out: reliability and fault tolerance. Ensuring bug-free code is an almost impossible task due to the complexity of production networks and the lack of appropriate testing suites. Further, there exists a fate-sharing relationship between SDN control applications and the controller, wherein a crash of the former results in a crash of the latter, thereby affecting the network’s availability. In this talk, I will present two frameworks to address these problems.
First, I will describe Armageddon, a framework to improve network availability by proactively detecting bugs. Armageddon introduces sustainable and systematic failures into a network. At the heart of Armageddon is a set of efficient algorithms for computing failure scenarios that guarantee coverage of the set of potential scenarios while preserving operator-specified invariants (e.g., end-to-end connectivity). I will demonstrate the viability and efficiency of Armageddon by evaluating it on over 100 real network topologies.
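To make the idea of invariant-preserving failure scenarios concrete, here is a minimal sketch. It is not Armageddon's algorithm, which the abstract does not specify; it is a simple greedy heuristic written for illustration. The function names (`is_connected`, `plan_failure_scenarios`) and the choice of "every link fails in some scenario" as the coverage goal are assumptions. Each scenario is a group of links that can fail together while the remaining topology stays connected.

```python
from collections import defaultdict


def is_connected(nodes, links):
    """Return True if all nodes are mutually reachable over the given links."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        cur = stack.pop()
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)


def plan_failure_scenarios(nodes, links):
    """Greedily group links into scenarios that preserve connectivity.

    Hypothetical helper, not the algorithm from the talk.
    Returns (scenarios, uncovered). Each scenario is a list of links to
    fail at the same time. `uncovered` lists links that cannot fail even
    alone without breaking the connectivity invariant.
    """
    links = list(links)
    remaining = list(links)
    scenarios, uncovered = [], []
    while remaining:
        scenario = []
        for link in list(remaining):
            candidate = scenario + [link]
            surviving = [l for l in links if l not in candidate]
            if is_connected(nodes, surviving):
                scenario.append(link)
                remaining.remove(link)
        if not scenario:
            # No remaining link can fail on its own: stop and report them.
            uncovered.extend(remaining)
            break
        scenarios.append(scenario)
    return scenarios, uncovered


# Example: a four-node ring with one stub node attached to D.
ring_nodes = ["A", "B", "C", "D", "E"]
ring_links = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("D", "E")]
ring_scenarios, ring_uncovered = plan_failure_scenarios(ring_nodes, ring_links)
```

In this example each ring link ends up in its own scenario, because failing any two ring links would partition the ring, and the stub link to E is reported as uncovered since its failure always isolates E.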
Next, I will present a fault-tolerant SDN controller framework, LegoSDN, that allows SDN controllers to quickly recover from both deterministic and non-deterministic failures. LegoSDN redesigns SDN controller architectures to isolate and tolerate SDN-App failures. Using real SDN-Apps in an emulated network, I will show that LegoSDN can recover failed SDN-Apps 3x faster than traditional controller reboots.
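The fate-sharing problem and the idea of isolating SDN-App failures can be illustrated with a toy sketch. This is not LegoSDN's design, which the abstract does not detail. It assumes the simplest possible form of isolation: the controller catches an app's crash at the event-dispatch boundary, replaces the app with a fresh instance, and keeps running. All class and method names here (`IsolatingController`, `FlakyApp`, `StableApp`, `on_event`) are hypothetical. A real system would likely need a stronger boundary, such as separate processes, and some way to restore app state.

```python
class StableApp:
    """Hypothetical SDN-App that handles every event."""

    def __init__(self):
        self.handled = []

    def on_event(self, event):
        self.handled.append(event)


class FlakyApp(StableApp):
    """Hypothetical SDN-App with a deterministic bug on one event type."""

    def on_event(self, event):
        if event == "bad-packet":
            raise RuntimeError("app bug triggered")
        super().on_event(event)


class IsolatingController:
    """Toy controller that survives crashes of the apps it hosts."""

    def __init__(self, app_factories):
        self.factories = dict(app_factories)
        self.apps = {name: make() for name, make in self.factories.items()}
        self.restarts = {name: 0 for name in self.factories}

    def dispatch(self, event):
        for name in list(self.apps):
            try:
                self.apps[name].on_event(event)
            except Exception:
                # Contain the crash: restart only the failed app and skip
                # the offending event. The app's in-memory state is lost.
                self.apps[name] = self.factories[name]()
                self.restarts[name] += 1


controller = IsolatingController({"flaky": FlakyApp, "stable": StableApp})
for ev in ["pkt1", "bad-packet", "pkt2"]:
    controller.dispatch(ev)
```

After the three events, the controller and the stable app are unaffected, while the flaky app has been restarted once and has only seen the event delivered after its restart.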
Bio: Dr. Benson is an Assistant Professor in the Computer Science Department of Duke University. His research group focuses on solving practical networking and systems problems, with an emphasis on Software Defined Networking, data centers, clouds, and configuration management. In addition to serving regularly on several conference committees, he currently serves as the chair for the workshop on hot topics for Middleboxes (HotMiddleboxes 2015). In October 2014, he received the Yahoo! ACE Award, a prestigious award granted to the top five first- and second-year faculty nationally, for his contributions to the field of data center performance diagnosis. His honors include IBM Fellowships, a Yahoo Faculty Engagement Award, and award papers at the SIGCOMM Internet Measurement Conference and the SIGCOMM Workshop: Research on Enterprise Networking.