Software Refactoring of Virtualization-Based Fault-Tolerant Systems
Sustainable Development Goals
Abstract/Objectives
As software systems grow larger and more complex, refactoring them efficiently and effectively has become an important research topic, while ensuring high availability and resilience remains a key concern. To address both, this paper studies how software system refactoring can enhance virtualization-based fault tolerance. We focus on Cuju, an open-source project that primarily implements a time synchronization technology for virtualization-based fault tolerance. Cuju employs many performance optimization techniques, including non-disruptive/pipelined continuous migration, tracking of guest virtual memory and device state, and elimination of data transfer between QEMU and KVM. By upgrading the Kernel-based Virtual Machine (KVM) from Linux kernel 4.15 to 5.4, we expanded Cuju's capabilities and obtained a more stable base for our research. The main objective of this study is to explore how to make full use of these capabilities and to enhance the fault tolerance of software systems through effective refactoring methods. We first identify and categorize the most common refactoring techniques, as well as the kinds of code that have a significant negative impact on system fault tolerance. Next, we apply these refactoring techniques to a set of open-source software systems and achieve higher system fault tolerance through integration with Cuju. Our preliminary results indicate that effective software refactoring can reduce the occurrence of system failures, while Cuju's virtualization-based fault tolerance mechanism provides seamless system recovery. Combined, the two approaches are expected to enhance the resilience and reliability of software systems.
Results/Contributions
We believe the results of this research will offer new perspectives and methods for developing and maintaining software systems, particularly with respect to fault tolerance.