Fault tolerance in big data storage and processing systems: A review on challenges and solutions
Document Type
Article
Publication Date
3-1-2022
Abstract
Big data systems are sufficiently stable to store and process massive volumes of rapidly changing data. However, they are built on large-scale hardware resources, which makes their components prone to failure. Fault tolerance is the main property of such systems because it maintains availability, reliability, and consistent performance when faults occur. Achieving an efficient fault tolerance solution in a big data system is challenging because the solution must satisfy constraints on system performance and resource consumption. This study aims to provide a consistent understanding of fault tolerance in big data systems and highlights common challenges that hinder improvements in fault tolerance efficiency. It reviews the fault tolerance solutions that previous studies have applied to address the identified challenges. The paper also presents a perceptive discussion of the findings derived from previous studies and proposes a list of future directions for addressing the fault tolerance challenges. © 2021 The Authors. Published by Elsevier BV on behalf of Faculty of Engineering, Ain Shams University.
Keywords
Fault tolerance, Fault detection, Fault recovery, Big data storage, Big data processing
Divisions
Software
Funders
Malaysia Ministry of Education [Grant No: GPF097C-2020]
Publication Title
Ain Shams Engineering Journal
Volume
13
Issue
2
Publisher
Elsevier
Publisher Location
Radarweg 29, 1043 NX Amsterdam, Netherlands