
ISSN 2063-5346

An Optimal Checkpointing with Message Logging Protocol for Fault Tolerance of Distributed Applications in the Cloud Data Center


Priti Kumari, Vandana Dubey, Vinita
doi: 10.48047/ecb/2023.12.si4.475

Abstract

Cloud data centers (CDCs) generally run services on low-cost hardware, and adding resources makes horizontal scaling simple. Because a CDC uses commodity hardware, its physical servers (PSs) have a high failure rate. When a PS fails, the virtual servers (VSs) provisioned on it fail as well. Fault tolerance is therefore a major challenge for cloud service providers. Objective: Handling failures of commodity hardware in a CDC requires a fault tolerance method. The aim of this work is to develop an optimal and efficient checkpointing-based failure recovery methodology for the fault tolerance of cloud-based services. Methods: The proposed approach is implemented in two steps. In the first step, a virtual backbone is built across the CDC network architecture using a novel Connected Dominating Set (CDS) based method. The network topology is represented as a graph whose vertices are the PSs. To obtain a CDS, that is, an optimal number of PSs for the topology graph, the method applies a set of criteria based on the rate of CPU heating, storage capacity, and vertex degree of the PSs. This backbone is then used as the base for a fault-tolerant system built on checkpointing and rollback recovery (CRR) in order to increase reliability. In the second step, the CDSCKP method is proposed, which implements uncoordinated checkpointing with message logging for distributed applications. Checkpoint snapshots of tasks or VSs are placed on the CDS vertices. Results: The effectiveness of the proposed scheme is assessed using recoverability, bandwidth consumption, power consumption, and rollback and recovery time. CDSCKP is compared with a Random checkpointing placement protocol (RCKP) and the Wu-Li checkpointing placement protocol (WLCKP). The simulation results show that CDSCKP offers greater recoverability, uses less bandwidth and power, and incurs little rollback and recovery overhead. Conclusion: To increase the dependability of VS-based services, this study proposes a CDS-based scheme for building a virtual backbone over the data center network (DCN) topology. This scheme is then used to produce a CRR-based fault tolerance scheme called CDSCKP.
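
To make the first step more concrete, the following is a minimal Python sketch of selecting a connected dominating set of PSs by greedy growth. It is not the authors' algorithm: the abstract names the criteria (vertex degree, storage capacity, rate of CPU heating) but not how they are combined, so the `score` function, its equal weighting, the sample topology, and all names such as `greedy_cds` are assumptions made for illustration only.

```python
def dominates(adj, nodes):
    """True if every vertex is in `nodes` or adjacent to a vertex in `nodes`."""
    nodes = set(nodes)
    return all(u in nodes or any(v in nodes for v in adj[u]) for u in adj)


def induced_connected(adj, nodes):
    """True if the subgraph induced by `nodes` is connected (depth-first search)."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def greedy_cds(adj, storage, heat_rate):
    """Grow a connected dominating set from the best-scoring PS.

    Assumed (hypothetical) score: degree + storage - heat_rate, so
    well-connected, roomy, cool-running servers are preferred.
    """
    def score(u):
        return len(adj[u]) + storage[u] - heat_rate[u]

    start = max(sorted(adj), key=score)
    cds = {start}
    covered = {start, *adj[start]}          # vertices dominated so far
    while len(covered) < len(adj):
        # Only vertices adjacent to the current set may join, which keeps it connected.
        frontier = sorted(covered - cds)

        def key(u):
            return (len(set(adj[u]) - covered), score(u))

        best = max(frontier, key=key)
        if key(best)[0] == 0:
            raise ValueError("topology graph must be connected")
        cds.add(best)
        covered.update(adj[best])
    return cds


# Hypothetical six-server topology with normalised storage and heating values.
adj = {
    "ps1": ["ps2", "ps3"],
    "ps2": ["ps1", "ps3", "ps4"],
    "ps3": ["ps1", "ps2", "ps5"],
    "ps4": ["ps2", "ps6"],
    "ps5": ["ps3"],
    "ps6": ["ps4"],
}
storage = {"ps1": 0.5, "ps2": 0.8, "ps3": 0.6, "ps4": 0.7, "ps5": 0.4, "ps6": 0.9}
heat_rate = {"ps1": 0.2, "ps2": 0.3, "ps3": 0.5, "ps4": 0.1, "ps5": 0.6, "ps6": 0.4}

cds = greedy_cds(adj, storage, heat_rate)
print(sorted(cds))
```

In a scheme like the one described, the servers in `cds` would be the candidates for holding checkpoint snapshots, since every other PS is at most one hop away from one of them. A greedy construction of this kind yields a valid CDS but not necessarily a minimum one.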
