Fault Tolerant Computing
(3-0-0-3)
CMPE Degree: This course is Not Applicable for the CMPE degree.
EE Degree: This course is Not Applicable for the EE degree.
Lab Hours: 0 supervised lab hours and 0 unsupervised lab hours.
Technical Interest Group(s) / Course Type(s): VLSI Systems and Digital Design
Course Coordinator:
Prerequisites: ECE 6100
Catalog Description
Key concepts in fault-tolerant computing. Understanding and use of modern fault-tolerant hardware and software design practices. Case studies.
Course Outcomes
Not Applicable
Strategic Performance Indicators (SPIs)
Not Applicable
Topical Outline
Goals and Applications of Fault Tolerant Computing
Reliability, Availability, Safety, Dependability, etc.
Long Life, Critical Computation
High Availability Applications
Fault Tolerance as a Design Objective
Fault Models
Faults, Errors, and Failures
Causes and Characteristics of Faults
Logical and Physical Faults
Error Models
Fault Tolerant Design Techniques Based on Hardware Redundancy
Hardware Redundancy
TMR, N-modular Redundancy
Voting Methods
Duplication, Standby Sparing
Watchdog Timers
Hybrid Hardware Redundancy
N-modular Redundancy with Spares
Sift-out Modular Redundancy
Triple-duplex Architecture
Fault Tolerant Interconnection Networks
Fault Tolerant Design Techniques Based on Information Redundancy
Parity, M-of-N, Duplication Codes
Checksums, Cyclic Codes, Arithmetic Codes
Berger Codes, Hamming Error Correcting Codes
Code Selection Issues
Time Redundancy, Recomputing with Shifted Operands (RESO)
Software Redundancy, Checks and N-version Programming
Reliability Evaluation Techniques
Failure Rate, Mean Time to Repair, Mean Time Between Failures
Reliability Modeling, Fault Coverage
M-of-N Systems
Markov Models
Safety, Maintainability, Availability
Fault Tolerance in VLSI Circuits
Failure Models in VLSI
Redundancy Techniques in VLSI
Self-checking Logic
Reconfiguration Array Structures
Effect on Yield
Case Studies
FTSC, FTBBC
Space Shuttle
Tandem 16 NonStop System
Stratus/32 System
ESS
Students will write a term paper on research, a literature review, or a
design in the area of fault-tolerant computing. Topics will be chosen in
consultation with the instructor.