Fault Tolerant Computing
CMPE Degree: This course is Not Applicable for the CMPE degree.
EE Degree: This course is Not Applicable for the EE degree.
Lab Hours: 0 supervised lab hours and 0 unsupervised lab hours.
Technical Interest Group(s) / Course Type(s): VLSI Systems and Digital Design
Prerequisites: ECE 6100
Catalog Description
Key concepts in fault-tolerant computing. Understanding and use of modern
fault-tolerant hardware and software design practices. Case studies.
Student Outcomes
In the parentheses for each Student Outcome:
"P" for primary indicates the outcome is a major focus of the entire course.
"M" for moderate indicates the outcome is the focus of at least one component of the course, but not the majority of the course material.
"LN" for "little to none" indicates that the course does not contribute significantly to this outcome.
1. ( Not Applicable ) An ability to identify, formulate, and solve complex engineering problems by applying principles of engineering, science, and mathematics
2. ( Not Applicable ) An ability to apply engineering design to produce solutions that meet specified needs with consideration of public health, safety, and welfare, as well as global, cultural, social, environmental, and economic factors
3. ( Not Applicable ) An ability to communicate effectively with a range of audiences
4. ( Not Applicable ) An ability to recognize ethical and professional responsibilities in engineering situations and make informed judgments, which must consider the impact of engineering solutions in global, economic, environmental, and societal contexts
5. ( Not Applicable ) An ability to function effectively on a team whose members together provide leadership, create a collaborative and inclusive environment, establish goals, plan tasks, and meet objectives
6. ( Not Applicable ) An ability to develop and conduct appropriate experimentation, analyze and interpret data, and use engineering judgment to draw conclusions
7. ( Not Applicable ) An ability to acquire and apply new knowledge as needed, using appropriate learning strategies.
Strategic Performance Indicators (SPIs)
Goals and Applications of Fault Tolerant Computing
Reliability, Availability, Safety, Dependability, etc.
Long Life, Critical Computation
High Availability Applications
Fault Tolerance as a Design Objective
Faults, Errors, and Failures
Causes and Characteristics of Faults
Logical and Physical Faults
Fault Tolerant Design Techniques Based on Hardware Redundancy
TMR, N-modular Redundancy
Duplication, Standby Sparing
Hybrid Hardware Redundancy
N-modular Redundancy with Spares
Sift-out Modular Redundancy
Fault Tolerant Interconnection Networks
Fault Tolerant Design Techniques Based on Information Redundancy
Parity, M-of-N, Duplication Codes
Checksums, Cyclic Codes, Arithmetic Codes
Berger Codes, Hamming Error Correcting Codes
Code Selection Issues
Time Redundancy, Recomputing with Shifted Operands (RESO)
Software Redundancy, Checks and N-version Programming
Reliability Evaluation Techniques
Failure Rate, Mean Time to Repair, Mean Time Between Failures
Reliability Modeling, Fault Coverage
Safety, Maintainability, Availability
Fault Tolerance in VLSI Circuits
Failure Models in VLSI
Redundancy Techniques in VLSI
Reconfigurable Array Structures
Effect on Yield
Tandem 16 NonStop System
Students will write a term paper on research, a literature review, or a
design in the area of fault-tolerant computing. Topics will be chosen in
consultation with the instructor.