The aim of this paper is to cover past and present approaches to software-implemented fault tolerance that rely on both software design diversity and on single but ... The root cause of software design errors is the complexity of the systems. Despite continued improvements in fault-prevention techniques, faults remain in every complex software system. Software fault tolerance methods are discussed, resulting in definitions for soft and solid faults. This chapter concentrates on software fault tolerance based on design diversity.
This result supports software fault tolerance by design diversity as a creditable approach for software reliability engineering. Approaches that achieve fault tolerance by using multiple cores to establish redundancy have been presented in the literature. Design and Technology of Integrated Systems in Nanoscale Era, Apr 2015, Naples, Italy. Assessment of data diversity methods for software fault tolerance. Many see fault tolerance to design faults as a low-quality solution compared to the more desirable goal of fault-free software. One other event, again 25 years ago, also had a great though largely negative influence on my subsequent activities. Software designers or system integrators who want an introduction to the problems found in designing for fault tolerance and to the range of design solutions. Coverage includes fault tolerance techniques through hardware, software, information and time redundancy. If design fault detection is required, design diversity in the software has to be used, too. The importance of fault tolerance: fault-tolerant computing is the art and science of building computing systems that continue to operate satisfactorily in the presence of faults. The IEEE Council on Electronic Design Automation (CEDA) is a technical cosponsor of. Data diversity (DD) has been said to be orthogonal to design diversity [8]. Software reliability growth modelling; modelling of software design diversity for fault tolerance; exploring the limits of what can be claimed rigorously for system.
This is achieved by creating fault-tolerant composite services that leverage functionally equivalent services. Software engineers assume that the different implementations use different designs and thereby, it is hoped. They will gain a thorough understanding of fault-tolerant computers, including both the theory of how to design and evaluate them and the practical knowledge of achieving fault tolerance in electronic, communication and software systems. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may. Design diversity is the generation of different implementations.
Software engineering: software fault tolerance (Javatpoint). As more and more complex systems get designed and built, especially safety-critical systems, software fault tolerance and the next generation of hardware fault tolerance will need to evolve to be able to solve the design fault problem. Software fault tolerance techniques are designed to allow a system to tolerate software faults that remain in the system after its development. Experimental studies of a design diversity approach, tech. A structured definition of hardware- and software-fault-tolerant architectures is presented. Structured software fault tolerance are those techniques where redundancy both for. Software fault tolerance in computer operating systems, R.
DFT 2018: 31st IEEE International Symposium on Defect and Fault. Therefore fault tolerance is achieved by using diversity in the data space. Definition and analysis of hardware- and software-fault. Oct 14, 2011: architectural practices for building highly available applications. Software fault tolerance using data diversity. Fault-tolerant strategies: fault tolerance in a computer system is achieved through redundancy in hardware, software, information, and/or time. Data diversity fault tolerance design: the software fault tolerance architecture in this research uses data diversity (DD), a complementary approach to design diversity. Data diversity relies on a different form of redundancy from existing approaches to software fault tolerance and is substantially less expensive to implement. Assessment of data diversity methods for software fault.
Fault tolerance: fault-tolerant solutions have the ability to keep operating even if a component, or multiple components, should fail. We also found that exact faults found among versions are very limited. While there is clear evidence that the approach can be expected to deliver some increase in reliability compared to a single version, there is no agreement about the extent of this. They include the recovery block scheme (RBS), consensus recovery block programming, N-version programming (NVP), N self-checking programming (NSCP) and data diversity. Design fault tolerance by means of design diversity is a concept that traces back to the very early age of informatics. Structuring redundancy for software fault tolerance: robust software. Kelly, Specification of fault-tolerant multi-version software. With the advent of computers, N-version software diversity has been proposed [Avi77] as a means of dealing with the uncertainties of design faults in a computer.
Software fault tolerance: DEDIX as an experimentation tool. Program interface: in multiple-version software the versions of an application program are all written according to the same functional specification. Software fault tolerance, CMU ECE, Carnegie Mellon University. A survey of software fault tolerance techniques, Zaipeng Xie, Hongyu Sun and Kewal Saluja. Challenges in building a fault-tolerant flight control system. Design diversity: NVP is based on the principle of design diversity, that is, coding a software module by different teams of programmers to obtain multiple versions [2]. The diversity can also be introduced by employing different algorithms for obtaining the same solution or. Software fault tolerance refers to the use of techniques to increase the likelihood that the final design embodiment will produce correct and/or safe outputs. As software fault tolerance is often measured in terms of system availability, which is a function of reliability, we should include various single-version (SV) software-based approaches to fault tolerance for more effective software fault avoidance in order to combat latent defects, environment and.
The cost effectiveness of telecommunication service dependability, Y. The loss of a single engine would not cause the aircraft to fail, as the three remaining engines would continue to operate. Fault-tolerant system dependability: explicit modeling of hardware and software component interactions. The International Symposium on Design and Diagnostics of. Both schemes are based on software redundancy, assuming that the events of coincidental software failures are rare. Background: over recent years, software developers have been evaluating the benefits of both service-oriented architecture (SOA) and software fault tolerance techniques based on design diversity. Index terms: data diversity, design diversity, N-copy programming, N-version programming, recovery blocks, retry blocks, software faults, software fault tolerance. Increasing reliability of programmable mixed-signal systems by applying design diversity redundancy. Consequently, such an adaptation of fault tolerance features can lead to new faults. Mutation testing, software fault tolerance, systematic failures. Suffice it to say that our respective choices of research problem match our respective skills at program design and verification.
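The assumption that coincidental failures are rare can be made concrete with a small calculation. The sketch below is illustrative only: it assumes three versions that each produce a correct output with the same probability `p`, that fail independently, and that are adjudicated by a perfect majority voter. The function name is hypothetical and does not come from any of the works quoted above.

```python
def majority_of_three_reliability(p: float) -> float:
    """Probability that at least two of three independent versions are correct.

    Assumes identical per-version reliability p, statistically independent
    failures, and a voter that never fails (all simplifying assumptions).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # P(all three correct) + P(exactly two correct) = p^3 + 3 p^2 (1 - p)
    return 3 * p**2 - 2 * p**3


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        print(p, majority_of_three_reliability(p))
```

With `p = 0.9` the voted system reaches about 0.972, but the gain depends entirely on the independence assumption; when versions share common faults, the improvement shrinks.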
The multiple computation approach and its extension to design diversity: multiple computation is a fundamental method employed to attain fault tolerance. There are two basic techniques for obtaining fault-tolerant software. Request PDF: Assessment of data diversity methods for software fault tolerance based on mutation analysis. One of the main concerns in safety-critical software is to ensure sufficient. Software testing and software fault injection, LIRMM CNRS. In this paper, we propose an orthogonal fault tolerance modeling (OFTM) approach to address the problems introduced by dynamically adapting fault tolerance features in a system. That is, it should compensate for the faults and continue to. Study a specific software fault tolerance scheme, middleware or application using software fault tolerance, e.g. An introduction to software engineering and fault tolerance. In this article we will be covering several techniques that can be used to limit the impact of software faults (read: bugs) on system performance. Fault-tolerant software architecture, Stack Overflow. Software fault tolerance in the application layer, Y.
Application of design diversity in computerized control systems, Proc. The N-version programming (NVP) approach achieves fault-tolerant software units, called N-version software (NVS) units, through the development and use of software diversity. In the picture above the aircraft has four engines. An empirical study on testing and fault tolerance for.
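A minimal sketch of an N-version software unit follows. It assumes three hypothetical, independently written versions of the same specification (squaring a number) and a simple strict-majority voter; the names `n_version_execute`, `version_a`, `version_b` and `version_c` are illustrative and are not taken from any NVP framework.

```python
from collections import Counter
from typing import Callable, Sequence


def n_version_execute(versions: Sequence[Callable[[float], float]],
                      x: float, ndigits: int = 6) -> float:
    """Run every version on the same input and return the strict-majority result."""
    results = []
    for version in versions:
        try:
            # Round so that small floating-point differences still agree.
            results.append(round(version(x), ndigits))
        except Exception:
            # A version that crashes contributes no vote.
            continue
    if not results:
        raise RuntimeError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    # Require agreement from more than half of all versions, not just survivors.
    if votes * 2 <= len(versions):
        raise RuntimeError("no majority among versions")
    return value


def version_a(x: float) -> float:
    return x * x


def version_b(x: float) -> float:
    return x ** 2


def version_c(x: float) -> float:
    return x * x + 1.0  # seeded design fault, for illustration


if __name__ == "__main__":
    print(n_version_execute([version_a, version_b, version_c], 3.0))
```

The voter masks the fault in `version_c` because the other two versions agree; if two versions shared the same fault, the voter would return the wrong value, which is why fault correlation matters for this scheme.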
In this introduction, we describe the motivation for SIFT and provide some background for our work. We suggest the combined utilization of so-called systematic diversity and design diversity in a time-redundant system instead of the structurally redundant duplex system. Fault tolerance via diversity for off-the-shelf products. By software fault tolerance in the application layer, we mean a set of application-level software components to detect and recover from faults that are not handled in the hardware or operating. Index terms: dependability, faults, fault tolerance, fault. To tolerate faults, both of these techniques rely on design diversity, the availability of multiple implementations of a specification.
A design-diversity-based fault-tolerant COTS avionics. The remainder of the paper describes the actual design of the SIFT system. Design diversity is the generation of different implementations (codes) from a common specification [3, 8]. By simply triplicating a software module and voting on its outputs we cannot. Software-based fault recovery via adaptive diversity for COTS multicore processors. Sc High Integrity System, University of Applied Sciences, Frankfurt am Main 2. The versions are used as alternatives with a separate means of. The program co-chairs will make the final decisions about which submissions are accepted for. Designing fault-tolerant SOA based on design diversity.
Compounding the problems in building correct software is the difficulty in assessing the correctness of software for highly complex systems. Also, there are multiple methodologies, a few of which we already follow without knowing. This means that design diversity has, therefore, become another issue of. Therefore, it is reasonable to deal with the remaining software faults (bugs) during runtime to increase the overall reliability. Introduction to fault tolerance techniques and implementation. If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Since correctness and safety are really system-level concepts, the need and degree to use software fault tolerance. The present work addresses the application of the concept of TMR by design diversity 2. Software fault tolerance via environmental diversity. Multiple computations are implemented by N-fold (N ≥ 2) replications in three domains. Fault tolerance is the realization that we will have faults in our system (hardware and/or software) and that we have to design the system in such a way that it will be tolerant of those faults. Software fault tolerance by design diversity, CUHK CSE.
The traditional method of software fault tolerance based on design diversity is expensive and hence is not used extensively. From a different point of view, any emphasis on providing fault tolerance for design faults is, in this author's experience, a radical change from the common attitudes of many practitioners and researchers alike. Software fault tolerance, audits, rollback, exception handling. Are games the most complex or impressive applications? Chen, On the implementation of N-version programming for software fault tolerance during program execution, Proceedings COMPSAC 77, Chicago IL, pp.
When a fault occurs, these techniques provide mechanisms to. Finally, we applied a domain analysis approach to test case generation and concluded that it is a promising technique for software testing. The two best-known methods of building fault-tolerant software are N-version programming [3] and recovery blocks [7]. Such redundancy can be implemented in static, dynamic, or hybrid configurations. Course PM EDA122/DIT061 Fault-tolerant computer systems HT16. Introduction to software fault tolerance techniques and implementation. System requirements specification. Fault tolerance through automated diversity in the.
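The recovery block method named above can be sketched as follows. This is a simplified illustration under stated assumptions: alternates are tried in order, each one starts from a fresh copy of the input (standing in for a checkpoint), and an acceptance test decides whether a result may be released. The sorting alternates and all function names are hypothetical.

```python
import copy
from collections import Counter


def recovery_block(data, alternates, acceptance_test):
    """Try each alternate in turn; return the first result that passes the test."""
    for alternate in alternates:
        # Checkpoint: every alternate works on its own copy of the original state.
        working_copy = copy.deepcopy(data)
        try:
            result = alternate(working_copy)
        except Exception:
            continue  # treat an exception as a failed alternate
        if acceptance_test(data, result):
            return result
    raise RuntimeError("every alternate failed the acceptance test")


def fast_but_faulty_sort(items):
    items.sort()
    return items[:-1]  # seeded fault: drops the largest element


def simple_sort(items):
    return sorted(items)


def is_sorted_permutation(original, result):
    """Acceptance test: same elements as the input, in non-decreasing order."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(original) == Counter(result)


if __name__ == "__main__":
    data = [3, 1, 2]
    print(recovery_block(data, [fast_but_faulty_sort, simple_sort],
                         is_sorted_permutation))
```

Unlike N-version programming, the alternates here run sequentially and only as needed, so the quality of the acceptance test determines how many faults are actually caught.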
Software fault tolerance is the ability of software to detect and recover from a fault that is happening or has already happened, in either the software or the hardware of the system in which the software is running, in order to provide service in accordance with the specification. Keywords: software fault, redundancy, reliability, design diversity, check. These principles deal with desktop and server applications and/or SOA. Since the 1980s, the field of fault-tolerant design has broadened in appeal. Basic fault-tolerant software techniques, GeeksforGeeks. It would be very difficult to sum it up in one article, since there are multiple ways to achieve fault tolerance in software. Using commercially available hardware and software, we design the communication network to interface. I had been a member of the IFIP ALGOL committee since 1964. Three major design issues need to be considered while building software fault tolerant. Fault-tolerant software has the ability to satisfy requirements despite failures.
For example, if component B performs some operation based on the output from component A, then fault tolerance in B can hide a problem with A. If component B is later changed to a less fault-tolerant design, the system may fail suddenly, making it appear that the new component B is the problem. CiteSeerX: Software fault tolerance by design diversity. Fault masking is any process that prevents faults in a system. Fault tolerance can be achieved by the following techniques.
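The component A and component B scenario can be illustrated with a small sketch. Everything here is hypothetical: `component_a` contains a seeded fault, `tolerant_b` masks it with a default value, and `strict_b` is the later, less tolerant replacement. Recording each masked fault is one possible way, assumed here for illustration, to keep the problem in A visible.

```python
def component_a(reading):
    """Hypothetical upstream component with a seeded fault for negative input."""
    return None if reading < 0 else reading * 2


def tolerant_b(value, masked_faults):
    """Fault-tolerant consumer: substitutes a default but records what it masked."""
    if value is None:
        masked_faults.append("A returned None")
        value = 0
    return value + 1


def strict_b(value):
    """Less tolerant replacement: fails as soon as A misbehaves."""
    return value + 1  # raises TypeError when value is None


if __name__ == "__main__":
    masked = []
    print(tolerant_b(component_a(-1), masked), masked)
    try:
        strict_b(component_a(-1))
    except TypeError as exc:
        print("strict_b failed, although the fault is in component_a:", exc)
```

Without the `masked_faults` record, the failure would first surface when `strict_b` is introduced, and the new component would look like the culprit.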
Such systems focus strongly on design faults, where the term. Reliability and fault correlation are two main concerns for design diversity, yet empirical data for investigating the two are limited. An empirical study on testing and fault tolerance for software reliability engineering. Software engineers assume that the different implementations use different designs. Look to this innovative resource for the most comprehensive coverage of software fault tolerance techniques available in a single volume. Software fault tolerance, Carnegie Mellon University.
It offers you a thorough understanding of the operation of critical software fault tolerance techniques and guides you through their design. Laura L. Pullum. Annotation: this innovative resource provides the most comprehensive coverage of software fault tolerance techniques as it guides professionals through their design, operation and performance. Design diversity was not a concept applied to the solutions to hardware fault tolerance, and to this end, N-way redundant systems solved many single errors by. Professor Bev Littlewood, City, University of London. Data diversity can also be applied to software testing and greatly facilitates the automation of testing. Design diversity has been used for many years now as a means of achieving a degree of fault tolerance in software-based systems. Dependability modeling for fault-tolerant software and systems. Several software fault tolerance schemes, as proposed in [46, 47, 48, 49, 50], are based on software design diversity in order to tolerate software design bugs. Design diversity is a very expensive approach, as the same software has to be developed several times by several teams. A flight control system requires software fault tolerance through diversity to complement hardware fault tolerance. Review of software design diversity. Software fault tolerance is not a license to ship the system with bugs. Then it surveys design techniques used by fault-tolerant systems. Abstract: nowadays the reliability of software is often the main goal in the software development process.
Software fault tolerance in the application layer. Development of this software has been an evolutionary process. IFIP Workshop on Design Diversity in Action, Springer-Verlag, Vienna. Data-diverse software fault tolerance techniques: data diversity complements design diversity by compensating for design diversity's limitations. It involves obtaining a related set of points in the program data space, executing the same software on those points, and then using a decision algorithm to determine the resulting output. The main idea here is to contain the damage caused by software faults.
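The three steps of data diversity described above (obtain related input points, run the same program on each, decide on an output) can be sketched as follows. The example assumes a hypothetical sine routine with a seeded fault in a small input region, three re-expressions based on the identities sin(x) = sin(x + 2π) = sin(π − x), and a median as the decision algorithm; none of these choices comes from the quoted sources.

```python
import math
import statistics


def faulty_sin(x: float) -> float:
    """Single program version with a seeded fault in a small input region."""
    if 0.999 < x < 1.001:
        return 0.0  # wrong answer, for illustration
    return math.sin(x)


# Re-expressions: different inputs that should yield the same sine value.
REEXPRESSIONS = (
    lambda x: x,
    lambda x: x + 2 * math.pi,
    lambda x: math.pi - x,
)


def n_copy_execute(program, x: float) -> float:
    """Run one program on re-expressed inputs and take the median result."""
    results = [program(reexpress(x)) for reexpress in REEXPRESSIONS]
    # The median tolerates one outlier among three approximately equal results.
    return statistics.median(results)


if __name__ == "__main__":
    print(faulty_sin(1.0), n_copy_execute(faulty_sin, 1.0))
```

Only one version of the program is needed, which is why this form of redundancy is cheaper than design diversity; it helps only when the fault is triggered by some inputs and not by their re-expressions.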
Software fault tolerance: during the development of software, it is infeasible to find all its bugs, which can reach as far back as the design phase. Finally, the relationship of software diversity to other methods used to achieve dependable software systems is discussed. Software fault tolerance: sequential fault tolerance. Thus, in the simplest case we have the well-known duplex system. Software fault tolerance is an immature area of research. Fault-tolerant software assures system reliability by using protective redundancy at the software level. This chapter presents a non-homogeneous Poisson process (NHPP) reliability model for N-version programming systems. Software fault tolerance techniques are employed during the procurement, or development, of the software.
Software fault tolerance techniques and implementation. Such techniques use data diversity to tolerate residual faults. For example, in fault tolerance and correctness, it's probably code for medical or aerospace applications. Introduction: an effective technique frequently employed to add fault tolerance to electronic systems is triple modular redundancy (TMR) [1]. A fault-tolerant system may be able to tolerate one or more fault types [9]. Design of fault-tolerant software, Andrea Bondavalli, CNUCE-CNR, via S. Orthogonal fault tolerance for dynamically adaptive systems. Checkpointing and the modeling of program execution time. A design-diversity-based fault-tolerant COTS avionics bus network. This course has been developed by the Centre for Software Reliability with funding from the Engineering and Physical Sciences Research Council, grant number 00711eng95, as part of their. In order to complement design diversity in the quest for fault-tolerant software, there exist several data diversity techniques that are similar to the aforementioned ones for the design diversity approach. Software fault tolerance is the ability of computer software to continue its normal operation despite the presence of system or hardware faults. We separate all faults within NVP systems into independent faults and common faults, and model each type of failure as an NHPP. The main design diversity and data diversity techniques have been summarized in Table 1.