Dependability in Embedded Systems: A Survey of Fault Tolerance Methods and Software-Based Mitigation Techniques

Fault tolerance is a critical aspect of modern computing systems, ensuring correct functionality in the presence of faults. This paper presents a comprehensive survey of fault tolerance methods and mitigation techniques in embedded systems, with a focus on both software and hardware faults. Emphasis...

Bibliographic Details
Main Authors: Mohammadreza Amel Solouki, Shaahin Angizi, Massimo Violante
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Embedded systems; fault tolerance; reliability; analytical redundancy; dependability
Online Access: https://ieeexplore.ieee.org/document/10772080/
author Mohammadreza Amel Solouki
Shaahin Angizi
Massimo Violante
collection DOAJ
description Fault tolerance is a critical aspect of modern computing systems, ensuring correct functionality in the presence of faults. This paper presents a comprehensive survey of fault tolerance methods and mitigation techniques in embedded systems, with a focus on both software and hardware faults. Emphasis is placed on real-time embedded systems, considering their resource constraints and the increasing interconnectivity of computing systems in commercial and industrial applications. The survey covers various fault tolerance methods, including hardware, software, and hybrid redundancy. Particular attention is given to software faults, acknowledging their significance as a leading cause of system failures, while also addressing hardware faults and their mitigation. Moreover, the paper explores the challenges posed by soft errors in modern computing systems. The survey concludes by emphasizing the need for continued research and development in fault tolerance methods, specifically in the context of real-time embedded systems, and highlights the potential for extending fault tolerance approaches to diverse computing environments.
format Article
id doaj-art-d5e6c43e94d64e26abafca63103433a6
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-d5e6c43e94d64e26abafca63103433a6
2024-12-10T00:02:08Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
volume 12
pages 180939-180967
doi 10.1109/ACCESS.2024.3509633
article number 10772080
Dependability in Embedded Systems: A Survey of Fault Tolerance Methods and Software-Based Mitigation Techniques
Mohammadreza Amel Solouki (0) https://orcid.org/0000-0002-3430-9706
Shaahin Angizi (1) https://orcid.org/0000-0003-2289-6381
Massimo Violante (2) https://orcid.org/0000-0002-5821-3418
Department of Control and Computer Engineering, Politecnico di Torino, Turin, Italy
Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ, USA
Department of Control and Computer Engineering, Politecnico di Torino, Turin, Italy
Fault tolerance is a critical aspect of modern computing systems, ensuring correct functionality in the presence of faults. This paper presents a comprehensive survey of fault tolerance methods and mitigation techniques in embedded systems, with a focus on both software and hardware faults. Emphasis is placed on real-time embedded systems, considering their resource constraints and the increasing interconnectivity of computing systems in commercial and industrial applications. The survey covers various fault tolerance methods, including hardware, software, and hybrid redundancy. Particular attention is given to software faults, acknowledging their significance as a leading cause of system failures, while also addressing hardware faults and their mitigation. Moreover, the paper explores the challenges posed by soft errors in modern computing systems. The survey concludes by emphasizing the need for continued research and development in fault tolerance methods, specifically in the context of real-time embedded systems, and highlights the potential for extending fault tolerance approaches to diverse computing environments.
https://ieeexplore.ieee.org/document/10772080/
Embedded systems
fault tolerance
reliability
analytical redundancy
dependability
title Dependability in Embedded Systems: A Survey of Fault Tolerance Methods and Software-Based Mitigation Techniques
topic Embedded systems
fault tolerance
reliability
analytical redundancy
dependability
url https://ieeexplore.ieee.org/document/10772080/