Subject Code:  CSA4023

Name:  Fault Tolerant Systems

L-T-P: 3-0-0

Credit: 4

Introduction: Fault Classification, Types of Redundancy, Fault tolerance metrics

Hardware Fault Tolerance: Failure rate, Reliability, MTTF, Canonical and Resilient structures, Reliability evaluation techniques, Processor level techniques, Byzantine failures

Information Redundancy: Coding techniques, Resilient Disk Systems, Data replication, Algorithm based fault tolerance

Fault tolerant Networks: Network topologies and their Resilience, Fault tolerant routing

Software Fault tolerance: Single version fault tolerance, N-version programming, Recovery blocks, Conditions and assertions, Exception handling, Fault tolerant remote procedure calls

Checkpointing: Analytical model, Checkpointing in shared memory systems, Checkpointing in real-time systems

Case studies: NonStop systems, Itanium

Defect tolerance in VLSI circuits: Basic yield models, Yield enhancement through redundancy

Faults in Cryptographic Systems: Security attacks, Countermeasures

Prerequisite: Programming and Data Structures, Computer Organization and Architecture

Text Books:
I. Koren and C. Mani Krishna, Fault-Tolerant Systems, Morgan Kaufmann

Reference Books:
D. Pradhan, Fault-Tolerant Computer System Design, Prentice Hall
E. Dubrova, Fault-Tolerant Design, Springer, 2013
K. Trivedi, Probability and Statistics with Reliability, Queuing, and Computer Science Applications, John Wiley