Efficient Algorithms and Routing Protocols for Handling Transient Single Node Failures
Amit M Bhosle, Teofilo F Gonzalez

TL;DR
This paper introduces efficient algorithms and protocols for quickly routing around transient single node failures in large networks, avoiding costly global recomputations and maintaining near-optimal paths.
Contribution
The authors develop faster algorithms for transient failure handling that outperform existing protocols while maintaining comparable path quality.
Findings
The algorithms are an order of magnitude faster than the Failure Insensitive Routing protocol of Ref. [11].
Paths are within 15% of optimal in simulated networks.
Proposed solutions outperform existing protocols in speed and efficiency.
Abstract
Single node failures represent more than 85% of all node failures in today's large communication networks such as the Internet. These node failures are also usually transient. Consequently, globally recomputing the routing paths does not pay off: the failed nodes recover fairly quickly, and the recomputed routing paths then need to be discarded. Instead, we develop algorithms and protocols for dealing with such transient single node failures by suppressing the failure (instead of advertising it across the network) and routing messages to the destination via alternate paths that do not use the failed node. We compare our solution to that of Ref. [11], wherein the authors present a "Failure Insensitive Routing" protocol as a proactive recovery scheme for handling transient node failures. We show that our algorithms are faster by an order of magnitude while our paths are…
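To make the idea of "alternate paths that do not use the failed node" concrete, the sketch below computes a shortest path while skipping one node. It is only a naive baseline for illustration, not the authors' algorithm: it reruns Dijkstra for a single failure, which is exactly the kind of per-failure recomputation the paper aims to avoid. The function name `shortest_path_avoiding`, the adjacency-dict graph format, and the four-node example network are all assumptions made for this example.

```python
import heapq

def shortest_path_avoiding(graph, source, dest, failed=None):
    """Dijkstra on graph (dict: node -> {neighbor: weight}), never entering `failed`.

    Returns (path, cost), or (None, inf) when no path avoids the failed node.
    Hypothetical helper for illustration; not the method from the paper.
    """
    inf = float("inf")
    if source == failed or dest == failed:
        return None, inf
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale heap entry
        if u == dest:
            break
        for v, w in graph.get(u, {}).items():
            if v == failed:
                continue  # route around the failed node
            nd = d + w
            if nd < dist.get(v, inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dest not in dist:
        return None, inf
    # Walk predecessor links back from the destination.
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[dest]

# Assumed toy network: undirected, weights listed in both directions.
graph = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "t": 1, "b": 1},
    "b": {"s": 4, "a": 1, "t": 2},
    "t": {"a": 1, "b": 2},
}

primary = shortest_path_avoiding(graph, "s", "t")
detour = shortest_path_avoiding(graph, "s", "t", failed="a")
stretch = detour[1] / primary[1]  # detour cost relative to the failure-free path
```

In this toy network the failure-free route is s, a, t with cost 2, and the route that avoids node a is s, b, t with cost 6. The ratio of the two costs is one simple way to express how far a recovery path is from the failure-free one.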
Taxonomy
Topics: Software Testing and Debugging Techniques · Interconnection Networks and Systems · Distributed systems and fault tolerance
