Quasi-static fault-tolerant scheduling schemes for
energy-efficient hard real-time systems
• Wei Tongquan, CS Department of East China Normal University, China
• Piyush Mishra, GE Global Research, Niskayuna, NY 12309, USA
• Kaijie Wu, ECE Department of University of Illinois, Chicago, IL 60607, USA
• Junlong Zhou, CS Department of East China Normal University, China
Journal of Systems and Software
2012
Reza Ramezani
A Unified Approach for Fault Tolerance and Dynamic
Power Management in Fixed-Priority Real-Time
Embedded Systems
• Ying Zhang
– a Senior Software engineer with the Research and Development
Department, Guidant Corporation, St. Paul, MN, USA
• Krishnendu Chakrabarty
– Department of Electrical and Computer Engineering, Duke University,
Durham, USA
IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems 25, no. 1 (2006): 111-125.
Overview
▪ Preliminaries
▪ Checkpointing & Response Time
▪ Reliability: the best fault tolerance count?
▪ Feasibility Analysis
▪ Offline Application Level Voltage Scaling
▪ Offline Task Level Voltage Scaling
▪ Online DVS by Using Slacks
▪ Previous Work (Ying Zhang, Krishnendu Chakrabarty, 2006)
▪ Results
▪ Suggestion
Preliminaries
Features
• Fault-Tolerant Scheduling
▪ Transient faults
▪ Fast detection
▪ Faults that occur at runtime are handled by checkpointing and state restoration.
• Dynamic Voltage Scaling (DVS)
• Offline Scheduling
▪ Application Level Voltage Scaling (A-DVS)
▪ Task Level Voltage Scaling (T-DVS)
• Online Scheduling
▪ Using slack
• Exact Rate-Monotonic Characterization
▪ Used for feasibility analysis instead of iteratively deriving the response time of each task.
Online DVS Outline
• The offline task schedules are adapted to the runtime behavior of fault occurrences by:
▪ (1) Pre-computing, and saving in a lookup table, the maximum slack required for the processor to dynamically slow down.
▪ (2) Retrieving the stored slack requirements at runtime and comparing them with the cumulative slack generated so far.
▪ (3) Dynamically scaling down the processor speed when the generated slack is equal to or greater than the stored slack requirement.
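The three steps of the outline can be sketched as a table lookup. This is a minimal illustration, not the authors' implementation: the table layout (pairs of speed and required slack), the function name select_speed, and the normalized speed values are all assumptions made for the example.

```python
from typing import Sequence, Tuple

# Hypothetical lookup table, assumed to be computed offline:
# each entry is (normalized speed, slack required to run at that speed).
SlackTable = Sequence[Tuple[float, float]]


def select_speed(table: SlackTable, cumulative_slack: float,
                 current_speed: float) -> float:
    """Return the lowest speed whose stored slack requirement is met.

    Only speeds at or below the current speed are considered, because
    the outline only describes slowing the processor down.
    """
    eligible = [
        speed
        for speed, required in table
        if required <= cumulative_slack and speed <= current_speed
    ]
    # If no entry qualifies, keep running at the current speed.
    return min(eligible, default=current_speed)


# Example table (values are made up for illustration).
table = [(1.0, 0.0), (0.8, 5.0), (0.6, 12.0)]
speed = select_speed(table, cumulative_slack=6.0, current_speed=1.0)
```

The lookup keeps the runtime work to a comparison per table entry, which is the point of pre-computing the slack requirements offline.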
System Architecture
System Architecture (2)
Checkpointing & Response Time
Checkpoint count
▪ Fault-tolerant computing refers to the correct execution of user programs and system software in the presence of faults.
▪ In real-time systems, fault tolerance is typically achieved through online fault detection, checkpointing, and rollback recovery.
▪ Checkpointing increases task execution time; even in the absence of faults, it can cause a task that would complete on time without checkpointing to miss its deadline.
▪ Frequent checkpointing reduces the re-execution time after a fault but increases the task execution time; infrequent checkpointing does the opposite.
▪ Therefore, the checkpointing interval, i.e., the duration between two consecutive checkpoints, must be chosen carefully to balance the checkpointing cost against the re-execution time.
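The trade-off between checkpointing cost and re-execution time can be illustrated with a simplified cost model. This is a sketch under stated assumptions, not the formula used in the paper: a task with fault-free execution time E is split into m equal segments, each segment ends with a checkpoint of cost c, and each of the k faults forces one segment plus its checkpoint to be re-executed. The function names are hypothetical.

```python
import math


def worst_case_time(E: float, c: float, k: int, m: int) -> float:
    """Worst-case completion time with m segments under the assumed model.

    E     : fault-free execution time
    m * c : cost of the m checkpoints
    k * (E / m + c) : re-execution of one segment (and its checkpoint) per fault
    """
    return E + m * c + k * (E / m + c)


def best_segment_count(E: float, c: float, k: int) -> int:
    """Pick the integer m that minimizes worst_case_time.

    The continuous minimizer of the model is sqrt(k * E / c); the cost is
    convex in m, so only the two neighboring integers need to be compared.
    """
    m_real = math.sqrt(k * E / c)
    candidates = {max(1, math.floor(m_real)), max(1, math.ceil(m_real))}
    return min(candidates, key=lambda m: worst_case_time(E, c, k, m))


m_best = best_segment_count(E=100.0, c=1.0, k=4)
```

Under this model, more faults or cheaper checkpoints push the best segment count up, while expensive checkpoints push it down, which matches the qualitative statement on the slide.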
Fault occurrences count
• Relation between the fault occurrence count and the fault arrival rate
▪ k is the number of fault occurrences to be tolerated.
▪ Given a fault arrival rate λ and a task execution interval t, the mean number of faults that arrive during the interval is λt.
o If k is much smaller than λt, a sophisticated fault-tolerant scheme with its associated overhead is not appropriate.
o If k is much larger than λt, a fault-tolerant scheme that provides a deterministic real-time guarantee may not exist.
▪ To target a system with reasonable real-time performance and fault tolerance, k can be taken to be a small multiple of λt, e.g., 2λt ≤ k ≤ 3λt.
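The rule of thumb 2λt ≤ k ≤ 3λt can be turned into a small calculation. The sketch below computes the integer range for k and, as an added assumption that is not stated on the slide, models fault arrivals as a Poisson process to estimate the probability that no more than k faults occur in the interval. The function names are hypothetical.

```python
import math
from typing import Tuple


def k_bounds(lam: float, t: float) -> Tuple[int, int]:
    """Integer bounds for k satisfying 2*lam*t <= k <= 3*lam*t.

    The range is empty (lower > upper) when lam * t is very small.
    """
    mean = lam * t
    return math.ceil(2 * mean), math.floor(3 * mean)


def prob_at_most_k_faults(lam: float, t: float, k: int) -> float:
    """P(N <= k) for N ~ Poisson(lam * t). The Poisson model is an assumption."""
    mean = lam * t
    return sum(
        math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1)
    )


low, high = k_bounds(lam=0.5, t=4.0)            # mean number of faults is 2
coverage = prob_at_most_k_faults(0.5, 4.0, low)  # chance that k = low suffices
```

With a mean of two faults per interval, the lower bound k = 4 already covers most intervals under the Poisson assumption; raising k buys additional coverage at the price of more reserved recovery time.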
Checkpointing
Fault placement
Fault placement (2)
Task response time
Task response time (2)
Reliability: The best fault tolerance count?
Reliability
Task Reliability
Task Reliability (2)
Task Reliability (3)
Feasibility Analysis
Exact Characterization of RMA (ECRMA)
• Critical Instant
▪ The worst-case behavior of RMA occurs when all tasks in a task set are instantiated simultaneously and are ready for execution immediately after initiation.
▪ It has been shown that a schedule of independent periodic tasks is feasible if the first instance of each task is schedulable when it is instantiated at a critical instant (Lehoczky et al., 1989).
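The exact test from Lehoczky et al. (1989) checks the first instance of each task at a finite set of scheduling points. The sketch below is a plain, fault-free version of that test, assuming integer periods, deadlines equal to periods, and no checkpointing or recovery overhead; the function name rm_schedulable is hypothetical.

```python
from typing import List, Tuple


def rm_schedulable(tasks: List[Tuple[int, int]]) -> bool:
    """Exact rate-monotonic test for tasks given as (C, T) pairs.

    Task i is schedulable if, at some scheduling point t <= T_i, the
    cumulative demand of task i and all higher-priority tasks released
    at the critical instant does not exceed t.
    """
    ordered = sorted(tasks, key=lambda ct: ct[1])  # shorter period = higher priority
    for i, (_, T_i) in enumerate(ordered):
        higher = ordered[: i + 1]
        # Scheduling points: multiples of each period up to T_i.
        points = sorted(
            {k * T_j for _, T_j in higher for k in range(1, T_i // T_j + 1)}
        )
        # Demand at time t uses an integer ceiling of t / T_j.
        if not any(
            sum(C_j * -(-t // T_j) for C_j, T_j in higher) <= t for t in points
        ):
            return False
    return True


feasible = rm_schedulable([(1, 4), (2, 6), (3, 12)])
overloaded = rm_schedulable([(2, 4), (3, 6), (3, 12)])
```

Because only the scheduling points are examined, the test avoids iterating a response-time recurrence for each task.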
Exact Characterization of RMA (ECRMA) (2)
Exact Characterization of RMA (ECRMA) (3)
Offline Application Level Voltage Scaling
Application level voltage scaling (A-DVS)
A-DVS algorithm
A-DVS algorithm (2)
• Some Considerations
▪ The binary-search-based A-DVS algorithm is valid only if energy consumption is monotonic with respect to frequency/voltage changes.
▪ When the processor's static power consumption and the context switching overhead are considered, this monotonicity does not hold.
▪ In this case, there is a critical processor speed below which scaling down the processor speed increases the energy consumption instead of reducing it.
▪ The minimum voltage level, low, is therefore initialized to the level corresponding to the processor's critical speed.
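The role of the critical speed in the binary search can be sketched as follows. This is an illustration only, not the A-DVS algorithm from the paper: it assumes a sorted list of discrete speed levels, a feasibility predicate is_feasible (a hypothetical stand-in for the feasibility check) that stays true for every speed above the first feasible one, and a given critical speed that bounds the search from below.

```python
from typing import Callable, Optional, Sequence


def lowest_feasible_speed(
    speeds: Sequence[float],
    is_feasible: Callable[[float], bool],
    critical_speed: float = 0.0,
) -> Optional[float]:
    """Binary search for the lowest feasible speed at or above critical_speed.

    speeds must be sorted in ascending order. Returns None when no
    candidate speed makes the task set feasible.
    """
    # Start the search at the critical speed: lower levels would cost more energy.
    candidates = [s for s in speeds if s >= critical_speed]
    lo, hi = 0, len(candidates) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_feasible(candidates[mid]):
            best = candidates[mid]  # feasible: try an even lower speed
            hi = mid - 1
        else:
            lo = mid + 1            # infeasible: a higher speed is needed
    return best


levels = [0.2, 0.4, 0.6, 0.8, 1.0]
chosen = lowest_feasible_speed(levels, lambda s: s >= 0.55, critical_speed=0.7)
```

In the example, the task set would already be feasible at 0.6, but the search never goes below the critical speed, so 0.8 is selected.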
Feasibility Checking Algorithm (FCA)
Feasibility Checking Algorithm (FCA) (2)
Offline Task Level Voltage Scaling
Task level voltage scaling (T-DVS)
T-DVS algorithm
T-DVS algorithm (2)
T-DVS algorithm (3)
T-DVS Consideration
Schedulability Checking Algorithm (SCA)
Online DVS by Using Slacks
Online reevaluation of DVS policies
▪ Offline scheduling assumes that all tasks take their worst-case execution time and that all faults occur during checkpointing.
▪ The runtime behavior of task execution and fault occurrences can differ significantly from these assumptions.
▪ At runtime, not all tasks execute up to their worst-case execution times and not all faults occur during task execution.
▪ Hence, the slack generated at runtime can be used to dynamically scale down the processor speed and save energy.
▪ The online reevaluation of DVS policies can save significant energy by exploiting the slack generated by uncertainties in fault occurrence.
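The slack mentioned above can be made concrete with simple bookkeeping. The sketch below is an assumption-laden illustration rather than the paper's accounting: the offline budget of a job is taken to be its worst-case execution time plus k worst-case recoveries of a fixed cost, and the slack is whatever part of that budget the job did not use. All names are hypothetical.

```python
def job_slack(wcet: float, actual_exec: float, k: int,
              faults_seen: int, recovery_cost: float) -> float:
    """Slack released by one finished job under the assumed budget model.

    budget = wcet + k * recovery_cost           (reserved offline)
    used   = actual_exec + faults_seen * recovery_cost
    """
    if faults_seen > k:
        raise ValueError("more faults occurred than the schedule tolerates")
    budget = wcet + k * recovery_cost
    used = actual_exec + faults_seen * recovery_cost
    return budget - used


# One job finishes early with a single fault; another uses its full budget.
cumulative_slack = (
    job_slack(wcet=10.0, actual_exec=7.0, k=2, faults_seen=1, recovery_cost=1.5)
    + job_slack(wcet=10.0, actual_exec=10.0, k=2, faults_seen=2, recovery_cost=1.5)
)
```

The cumulative value is what an online policy would compare against its stored slack requirements before lowering the processor speed.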
Reevaluation of DVS at application level
Reevaluation of DVS at application level (2)
Reevaluation of DVS at application level (3)
Dynamic ADVS Algorithm
Dynamic ADVS Algorithm (2)
Example
Reevaluation of DVS policies at task level
Reevaluation of T-DVS (D-TDVS)
Previous Work (Ying Zhang, Krishnendu Chakrabarty, 2006)
Feasibility Analysis
Feasibility of a task set under fault-free conditions
Fault Free
Tolerating k Faults in Each Task
Fault Tolerance With DVS
Fault Tolerance With DVS (2)
Fault Tolerance With DVS (3)
Heuristic Method Based on GA
Heuristic Method Based on GA (2)
• Init function
▪ Initializes the search space (chromosome population).
▪ One chromosome is generated using the computationally feasible application-level speed scaling method.
▪ The other chromosomes are generated randomly.
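The Init function described above can be sketched as follows. The chromosome encoding is an assumption made for illustration (one speed level per task), as are the function name and parameters; the slide only states that one chromosome comes from application-level scaling and the rest are random.

```python
import random
from typing import List, Optional, Sequence


def init_population(
    num_tasks: int,
    speed_levels: Sequence[float],
    app_level_speed: float,
    pop_size: int,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Build the initial GA population.

    The first chromosome assigns the application-level speed to every
    task; the remaining chromosomes pick a random level for each task.
    """
    rng = rng or random.Random()
    population = [[app_level_speed] * num_tasks]  # seeded chromosome
    while len(population) < pop_size:
        population.append([rng.choice(speed_levels) for _ in range(num_tasks)])
    return population


levels = [0.5, 0.75, 1.0]
population = init_population(4, levels, app_level_speed=0.75, pop_size=6,
                             rng=random.Random(0))
```

Seeding the population with a known feasible solution gives the search a starting point that is already schedulable, while the random chromosomes provide diversity.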
Heuristic Method Based on GA (3)
Results
Experiments
Processors
Task Sets
Application level results on Transmeta Crusoe
Task level results on Transmeta Crusoe
Application level results on Intel XScale
Task level results on Intel XScale
Real life implementation
▪ The energy consumption of the system board, excluding the processor time.
Suggestion
▪ The scheduler guarantees tolerance of at least k faults and then applies DVS using the available slack.
▪ More than k faults could be tolerated by increasing the processor speed when more than k faults occur.

unit 4 immunoblotting technique complete.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Fault tolerant real-time scheduling

  • 1. Quasi-static fault-tolerant scheduling schemes for energy-efficient hard real-time systems
      • Wei Tongquan, CS Department of East China Normal University, China
      • Piyush Mishra, GE Global Research, Niskayuna, NY 12309, USA
      • Kaijie Wu, ECE Department of University of Illinois, Chicago, IL 60607, USA
      • Junlong Zhou, CS Department of East China Normal University, China
      • Journal of Systems and Software, 2012
      • Reza Ramezani
  • 2. A Unified Approach for Fault Tolerance and Dynamic Power Management in Fixed-Priority Real-Time Embedded Systems
      • Ying Zhang: Senior Software Engineer, Research and Development Department, Guidant Corporation, St. Paul, MN, USA
      • Krishnendu Chakrabarty: Department of Electrical and Computer Engineering, Duke University, Durham, USA
      • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 25, no. 1 (2006): 111-125.
  • 3. Overview
      - Primaries
      - Checkpointing & Response Time
      - Reliability: the best fault tolerance count?
      - Feasibility Analysis
      - Offline Application Level Voltage Scaling
      - Offline Task Level Voltage Scaling
      - Online DVS by Using Slacks
      - Previous Work (Ying Zhang, Krishnendu Chakrabarty, 2006)
      - Results
      - Suggestion
  • 5. Features
      • Fault Tolerance Scheduling
          - Transient Faults
          - Fast Detection
          - Fault occurrences at runtime, checkpointing and state restoration.
      • Dynamic Voltage Scaling (DVS)
      • Offline Scheduling
          - Application Level Voltage Scaling (A-DVS)
          - Task Level Voltage Scaling (T-DVS)
      • Online Scheduling
          - Using Slacks
      • Exact Rate-Monotonic Characterization
          - Instead of iteratively deriving the response time of each task for feasibility analysis.
  • 6. Online DVS Outline
      • The offline task schedules are adapted to the runtime behavior of fault occurrences by:
          - (1) Pre-computing and saving in a lookup table the maximum slack requirements for the processor to dynamically slow down.
          - (2) Retrieving the stored slack requirements and comparing them with the cumulative slack generated at runtime.
          - (3) Dynamically scaling down the processor speed when the generated slack is equal to or greater than the stored slack requirement.
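
The three steps of the online outline can be sketched as a table lookup. This is only an illustration: the table contents, the millisecond unit, the normalized speeds, and the names `SLACK_TABLE` and `select_speed` are assumptions, not values or identifiers from the slides.

```python
from bisect import bisect_right

# Hypothetical table pre-computed offline (step 1):
# (required slack in ms, normalized processor speed), sorted by slack.
# Slower speeds need more accumulated slack before they are safe to use.
SLACK_TABLE = [(0.0, 1.0), (2.0, 0.8), (5.0, 0.6), (9.0, 0.4)]


def select_speed(cumulative_slack, table=SLACK_TABLE):
    """Return the slowest speed whose stored slack requirement is met.

    Steps 2 and 3: compare the slack generated at runtime with the stored
    requirements and scale down when slack >= requirement.
    """
    requirements = [req for req, _ in table]
    # Index of the last entry with requirement <= cumulative_slack.
    idx = bisect_right(requirements, cumulative_slack) - 1
    return table[max(idx, 0)][1]


if __name__ == "__main__":
    for slack in (0.0, 4.9, 5.0, 100.0):
        print(slack, select_speed(slack))
```

Keeping the table sorted lets the runtime decision be a binary search, which keeps the scheduler overhead small.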
  • 10. Checkpoint count
      - Fault-tolerant computing refers to the correct execution of user programs and system software in the presence of faults.
      - Fault tolerance is typically achieved in real-time systems through online fault detection, checkpointing, and rollback recovery.
      - Checkpointing increases the task execution time, and in the absence of faults, it might cause a missed deadline for a task that completes on time without checkpointing.
      - Frequent checkpointing reduces re-execution time due to faults but increases task execution time, and vice versa.
      - Therefore, the checkpointing interval, i.e., the duration between two consecutive checkpoints, must be carefully chosen to balance checkpointing cost with the re-execution time.
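
The trade-off on this slide can be made concrete with a deliberately simplified cost model. This is an assumption for illustration, not the formula from the paper: a task with fault-free execution time `C` takes `n` equally spaced checkpoints of cost `c` each, and each of `k` faults forces re-execution of at most one segment of length `C / (n + 1)`. Rollback overhead and faults during checkpointing are ignored.

```python
def worst_case_time(C, c, k, n):
    """Worst-case execution time under the simplified model:
    base time + checkpointing cost + re-execution of k segments."""
    return C + n * c + k * C / (n + 1)


def best_checkpoint_count(C, c, k, n_max=1000):
    """Brute-force the checkpoint count that minimizes worst_case_time."""
    return min(range(n_max + 1), key=lambda n: worst_case_time(C, c, k, n))


if __name__ == "__main__":
    C, c, k = 100, 1, 4  # hypothetical values, arbitrary time units
    n = best_checkpoint_count(C, c, k)
    print(n, worst_case_time(C, c, k, n))  # 19 139.0
    print(worst_case_time(C, c, k, 0))     # 500.0 without checkpoints
```

With no checkpoints every fault re-executes the whole task; with too many, the checkpointing cost dominates. The minimum sits between the two extremes.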
  • 11. Fault occurrences count
      • Relation between fault occurrences count and fault arrival rate
          - k is the number of fault occurrences to be tolerated.
          - For a fault arrival rate λ and a task execution interval t, the mean number of faults that arrive during the interval is λt.
              o If k is much smaller than λt, a sophisticated fault-tolerant scheme with its associated overhead is not appropriate.
              o If k is much larger than λt, a fault-tolerant scheme that provides a deterministic real-time guarantee may not exist.
          - To target a system with reasonable real-time performance and fault tolerance, the value of k can be taken to be a small multiple of λt, e.g., 2λt ≤ k ≤ 3λt.
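
As a small worked example of the rule of thumb 2λt ≤ k ≤ 3λt, the sketch below returns the integer values of k that fall in that range. The function name and the sample numbers are hypothetical.

```python
import math


def fault_count_range(lam, t):
    """Return (k_min, k_max), the integers k with 2*lam*t <= k <= 3*lam*t.

    The range is empty (k_min > k_max) when lam*t is very small.
    """
    mean_faults = lam * t
    return math.ceil(2 * mean_faults), math.floor(3 * mean_faults)


if __name__ == "__main__":
    # Hypothetical: 0.5 faults per time unit over an interval of 4 units,
    # so the mean number of faults is 2.
    print(fault_count_range(0.5, 4))  # (4, 6)
```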
  • 17. Reliability: the best fault tolerance count?
  • 23. Exact Characterization of RMA (ECRMA)
      • Critical Instant
          - The worst-case behavior of RMA occurs when all tasks in a task set are instantiated simultaneously and are ready for execution immediately after initiation.
          - It has been shown that a schedule of independent periodic tasks is feasible if the first instance of each task is schedulable when it is instantiated at a critical instant (Lehoczky et al., 1989).
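
A minimal sketch of the exact rate-monotonic test of Lehoczky et al. (1989) for the fault-free case follows. It assumes integer execution times and periods, deadlines equal to periods, and all tasks released together at the critical instant. It does not include the checkpointing or fault-recovery terms that the presented schemes add, and the function name is hypothetical.

```python
def rm_feasible(tasks):
    """tasks: list of (C, T) pairs with integer execution time C and period T.

    Task i is schedulable if, at some scheduling point t <= T_i, the
    cumulative demand of tasks 1..i released at the critical instant
    fits within t.
    """
    tasks = sorted(tasks, key=lambda ct: ct[1])  # shorter period = higher priority
    for i, (_, T_i) in enumerate(tasks):
        group = tasks[: i + 1]
        # Scheduling points: multiples of each period T_j that do not exceed T_i.
        points = {l * T_j for _, T_j in group for l in range(1, T_i // T_j + 1)}

        def demand(t):
            # -(-t // T_j) is an integer ceiling of t / T_j.
            return sum(C_j * -(-t // T_j) for C_j, T_j in group)

        if not any(demand(t) <= t for t in points):
            return False
    return True


if __name__ == "__main__":
    print(rm_feasible([(1, 4), (2, 6), (3, 12)]))  # True
    print(rm_feasible([(2, 4), (3, 6)]))           # False
```

Only a finite set of scheduling points needs to be checked, which is why this characterization avoids iterating on the response time of each task.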
  • 24. Exact Characterization of RMA (ECRMA) (2)
  • 25. Exact Characterization of RMA (ECRMA) (3)
  • 26. Offline Application Level Voltage Scaling
  • 27. Application level voltage scaling (A-DVS)
  • 29. A-DVS algorithm (2)
      • Some Considerations
          - The binary-search-based A-DVS algorithm is valid only if the energy consumption is monotonic with respect to frequency/voltage changes.
          - When the processor's static power consumption and the context switching overhead are considered, the monotonicity does not hold.
          - In this case, there exists a critical processor speed below which scaling down the processor speed instead increases the energy consumption.
          - The minimum voltage level, low, is therefore initialized to the level corresponding to the processor's critical speed.
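
The idea of bounding the binary search from below by the critical speed can be sketched as follows. This is not the A-DVS algorithm from the paper: the discrete speed levels, the `is_feasible` predicate, and all names are assumptions, and feasibility is assumed to be monotone in speed (if the task set is feasible at one speed, it is feasible at every higher speed).

```python
def lowest_feasible_level(levels, is_feasible, critical_speed):
    """Binary search for the lowest feasible speed that is >= critical_speed.

    levels: normalized speeds sorted in ascending order.
    is_feasible: callable(speed) -> bool, assumed monotone in speed.
    Returns None when no candidate level is feasible.
    """
    # Levels below the critical speed would increase energy, so skip them.
    candidates = [s for s in levels if s >= critical_speed]
    lo, hi = 0, len(candidates) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_feasible(candidates[mid]):
            best = candidates[mid]
            hi = mid - 1  # try a slower level
        else:
            lo = mid + 1  # need a faster level
    return best


if __name__ == "__main__":
    levels = [0.2, 0.4, 0.6, 0.8, 1.0]

    def feasible(s):
        return s >= 0.55  # hypothetical schedulability check

    print(lowest_feasible_level(levels, feasible, critical_speed=0.3))  # 0.6
    print(lowest_feasible_level(levels, feasible, critical_speed=0.7))  # 0.8
```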
  • 32. Offline Task Level Voltage Scaling
  • 33. Task level voltage scaling (T-DVS)
  • 39. Online DVS by Using Slacks
  • 40. Online reevaluation of DVS policies
      - Offline scheduling assumes that all tasks exhibit the worst-case execution time and all faults occur during the checkpointing.
      - The runtime behavior of task execution and fault occurrences can vary significantly.
      - At runtime, not all tasks execute up to their worst-case execution times and not all faults occur during task executions.
      - Hence, the slack generated at runtime can be used to dynamically scale down the processor speed to save energy.
      - The online reevaluation of DVS policies can save significant energy by using the slack generated by uncertainties in fault occurrence.
  • 41. Reevaluation of DVS at application level
  • 42. Reevaluation of DVS at application level (2)
  • 43. Reevaluation of DVS at application level (3)
  • 49. Reevaluation of DVS policies at task level
  • 50. Reevaluation of T-DVS (D-TDVS)
  • 51. Previous Work (Ying Zhang, Krishnendu Chakrabarty, 2006)
  • 53. Feasibility of a task set under fault-free conditions
  • 54. Tolerating k Faults in Each Task
  • 56. Fault Tolerance With DVS (2)
  • 57. Fault Tolerance With DVS (3)
  • 59. Heuristic Method Based on GA (2)
      • Init function
          - Initializes the search space (chromosome population).
          - One chromosome is initially generated using the computationally feasible application-level speed scaling method.
          - The other chromosomes are generated randomly.
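
The Init function described on this slide might look like the sketch below. The chromosome encoding (one speed level per task) and every name here are assumptions made for illustration; the slides do not specify them.

```python
import random


def init_population(num_tasks, levels, app_level_speed, pop_size, rng=None):
    """Build the initial chromosome population for the GA.

    One chromosome assigns the application-level speed to every task;
    the remaining pop_size - 1 chromosomes are drawn at random.
    """
    rng = rng or random.Random()
    seeded = [app_level_speed] * num_tasks
    randoms = [
        [rng.choice(levels) for _ in range(num_tasks)]
        for _ in range(pop_size - 1)
    ]
    return [seeded] + randoms


if __name__ == "__main__":
    population = init_population(
        num_tasks=3,
        levels=[0.4, 0.6, 0.8, 1.0],  # hypothetical normalized speeds
        app_level_speed=0.6,
        pop_size=5,
        rng=random.Random(0),
    )
    for chromosome in population:
        print(chromosome)
```

Seeding the population with a known feasible solution gives the search a starting point that is at least as good as the application-level result.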
  • 65. Application level results on Transmeta Crusoe
  • 66. Task level results on Transmeta Crusoe
  • 67. Application level results on Intel XScale
  • 68. Task level results on Intel XScale
  • 69. Real life implementation
      - The energy consumption of the system board, excluding the processor time.
  • 70. Suggestion
      - The scheduler can tolerate at least k faults and then tries to apply DVS by using slacks.
      - More than k faults could be tolerated by increasing the processor speed when more than k faults occur.