An Asynchronous Scheme for Rollback Recovery in Message-Passing Concurrent Programming Languages (2401.00087v2)
Abstract: Rollback recovery strategies are well-known in concurrent and distributed systems. In this context, recovering from unexpected failures is even more relevant given the non-deterministic nature of execution, which means that it is practically impossible to foresee all possible process interactions. In this work, we consider a message-passing concurrent programming language where processes interact through message sending and receiving, but shared memory is not allowed. In this context, we design a checkpoint-based rollback recovery strategy that does not need a central coordination. For this purpose, we extend the language with three new operators: check, commit, and rollback. Furthermore, our approach is purely asynchronous, which is an essential ingredient to developing a source-to-source program instrumentation implementing a rollback recovery strategy.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.