Title
System-level diagnosis
Publisher
Wail Ali Hegazy
Author
Hegazy, Wail Ali
Preparation committee
Supervisor / Mohamed N. Elderini
Supervisor / Yahia El-hakim
Researcher / Wail Ali Hegazy
Examiner / Ahmed Aly
Subject
Computer science
Publication date
1983
Number of pages
95 p.
Language
English
Degree
Master's
Specialization
Computer science
Date of approval
1/1/1983
Place of approval
Alexandria University - Faculty of Engineering - Computer science

Abstract

System diagnosis is concerned with the location of the malfunctioning subsystems within a digital system. Interest in this topic is motivated by the need for highly-available systems that can continue essential operations when failures occur.
Many real-time applications, such as air traffic control, manned spaceflight, hospital patient monitoring, and on-line process control, place severe demands on system availability. These demands are satisfied by maximizing the system reliability.
The traditional approach to achieving reliable computing systems has been based largely on fault avoidance (or fault intolerance). This approach requires the acquisition of the most reliable components, the use of thoroughly refined techniques for the interconnection of components and assembly of subsystems, and the carrying out of comprehensive testing to eliminate the design faults. However, occasional system failures are accepted as a necessary evil, and manual maintenance is provided for their correction.
There are several situations in which the fault avoidance approach does not suffice. These include situations where the frequency and duration of repairs are unacceptable. An alternative approach to fault avoidance is that of fault-tolerance. In this approach the causes of unreliability are expected to be present and to induce errors, but their disrupting effects are automatically counteracted. One reason for the use of this approach is to achieve a reliability or availability that cannot be attained by the fault avoidance approach. A second reason may be the attainment of a reliability that matches that attained by the fault avoidance approach, but at a lower overall cost of implementation. A third reason is the psychological support given to the users, who know that provisions have been made to handle faults automatically.
The techniques for attempting to achieve fault-tolerance comprise strategies for error detection, fault treatment, damage assessment, and error recovery. Fault treatment is essential to avoid the fault during further operation of the system. It can be accomplished in two ways. One method is to provide standby spares which can be switched in to replace faulty elements. The other method is to design into the system a capability for graceful degradation. In this scheme, rather than replacing a faulty element, the system is reconfigured to continue operation at reduced capacity without that element. Regardless of the fault treatment mechanism, the fault must be located to within a component of a size which is acceptable for the treatment mechanism. This is system-level fault diagnosis. System-level fault diagnosis is also of interest in those cases in which manual repair is performed, where initial diagnosis to the level of large replaceable modules can reduce the system downtime.
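The two fault treatment methods described above can be illustrated with a small sketch. It is not taken from the thesis: the System class, the unit names, and both functions are hypothetical, and the faulty unit is assumed to have already been located by system-level diagnosis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class System:
    """Hypothetical toy model: active units plus a pool of standby spares."""
    active: List[str]
    spares: List[str] = field(default_factory=list)

    def capacity(self) -> int:
        # Capacity is modelled simply as the number of active units.
        return len(self.active)


def treat_with_spare(system: System, faulty: str) -> bool:
    """Standby sparing: switch a spare in to replace the located faulty unit."""
    if faulty not in system.active or not system.spares:
        return False  # nothing to replace, or no spare left
    position = system.active.index(faulty)
    system.active[position] = system.spares.pop(0)
    return True


def treat_by_degradation(system: System, faulty: str) -> bool:
    """Graceful degradation: reconfigure to run without the faulty unit."""
    if faulty not in system.active:
        return False
    system.active.remove(faulty)
    return True


if __name__ == "__main__":
    system = System(active=["u1", "u2", "u3"], spares=["s1"])
    treat_with_spare(system, "u2")      # capacity is preserved
    treat_by_degradation(system, "u3")  # capacity drops by one unit
    print(system.active, system.capacity())
```

In both cases the treatment acts on a whole unit, which is why the fault only needs to be located to within a component of the size the mechanism can replace or remove.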