FlexRay Fault-Tolerance:
               Capabilities, weaknesses and proposed enhancements


                               Antonio Cappiello, Omar Jaradat
                       Mälardalen University, Västerås, Sweden, 03/2011
                             {aco10003, ojt10001}@student.mdh.se




Abstract

This paper gives an overview of FlexRay and summarizes its main components, with adequate detail on how those components work. It focuses on FlexRay reliability and on why FlexRay is considered a fault-tolerant protocol, and it discusses the protocol's capabilities, its weaknesses and the authors' enhancement proposals. After reading this paper, readers should have a good understanding of FlexRay and of how well the protocol achieves reliability.

Introduction

FlexRay is a communication system that is considered one of the next generation of bus protocols for automotive networks. It can be applied in any other real-time distributed system environment, but it is usually tied to the automotive industry, simply because it was developed in 1999 by a cooperation of leading automotive companies, exclusively for automotive use.

Software errors are considered one of the big challenges that seriously affect software performance. Our mission is to show how FlexRay can be considered a fault-tolerant system and how it handles the failures and errors that can happen at any given time, and to propose ideas that could enhance the reliability of the FlexRay protocol.

In this paper, we describe the FlexRay protocol and analyse the bus controller structure and how the nodes communicate and interact within the whole communication system. We begin with the protocol itself and then turn to fault tolerance, explaining the weaknesses and the capabilities based on two main points: the bus controller and the network topology. The paper then proceeds with the current state of the work and ends with our conclusions.

The FlexRay Protocol

The FlexRay protocol is a time-triggered protocol that offers deterministic data delivery: data arrives in a predictable time frame. At its core is a communication cycle that provides predefined space for static frames and dynamic frames. Nodes on a FlexRay network must therefore know how all the pieces of the network are configured in order to communicate. Embedded networks differ from normal PC networks in this respect: PC networks require mechanisms to automatically discover and configure devices at run-time, whereas FlexRay does not need any such mechanism, because a FlexRay network has a closed configuration that should not be changed once it is assembled in production.

FlexRay manages multiple nodes with a Time Division Multiple Access (TDMA) scheme. Every FlexRay node is synchronized to the same clock, and each node waits for its turn to write on the bus. Because the timing is consistent in a TDMA scheme, FlexRay can guarantee consistent delivery of data to the nodes on the network, which provides many advantages for systems that depend on up-to-date data between nodes.
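
As a rough illustration of the TDMA scheme described for the FlexRay protocol, the following Python sketch models only the static part of a communication cycle: every slot has a fixed owner, and a node may write on the bus only during its own slot. The class name StaticSchedule, the node names and the slot length of 50 microseconds are assumptions made for this example; they are not taken from the FlexRay specification, and the dynamic segment is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StaticSchedule:
    """Fixed TDMA schedule: slot i of every cycle belongs to slot_owners[i].

    Hypothetical helper for illustration only, not a FlexRay API.
    """
    slot_owners: Tuple[str, ...]
    slot_length_us: int  # assumed slot length in microseconds

    @property
    def cycle_length_us(self) -> int:
        # One cycle is simply all static slots placed back to back.
        return len(self.slot_owners) * self.slot_length_us

    def owner_at(self, global_time_us: int) -> str:
        # All nodes share the same global time, so each of them
        # derives the same slot owner without any negotiation.
        offset_in_cycle = global_time_us % self.cycle_length_us
        return self.slot_owners[offset_in_cycle // self.slot_length_us]

    def may_transmit(self, node: str, global_time_us: int) -> bool:
        # A node waits for its turn: it may write only in its own slot.
        return self.owner_at(global_time_us) == node


# Assumed example cluster with three nodes and 50 us slots.
schedule = StaticSchedule(
    slot_owners=("brake_ecu", "steer_ecu", "engine_ecu"),
    slot_length_us=50,
)

for t in (0, 49, 50, 149, 150):
    print(t, schedule.owner_at(t))
```

Because the schedule is fixed at design time and every node evaluates it against the same clock, no arbitration is needed at run-time, which is the property the closed configuration described above relies on.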
Fault-Tolerance: Capabilities and Weaknesses

In this section we point out the means adopted by the FlexRay protocol to provide fault-tolerant communication.

We have identified two aspects of a FlexRay system that are involved in assuring fault tolerance:

      1. The bus controller.
      2. The physical network architecture.

The bus controller

The bus controller consists of six components, as shown in Figure 1 [1]. Some of these components use mechanisms to protect the communication from errors.

Figure 1

The Protocol Operation Control (POC), which is responsible for reacting to host commands and for instructing the other components, also reacts to error situations. For example, when an error occurs, the POC falls into the normal passive state and tries to reintegrate; when the error is fatal, the POC falls into the halt state and all operations are stopped. These two states, together with the active state in which the bus controller operates normally, constitute the so-called three-level error model (Figure 2). This model provides a self-diagnostic mechanism for possible errors.

Figure 2

The Frame and Symbol Processing (FSP) unit, besides separating the payload from the header of a received message, also provides the host with status data about frame reception, for example whether the received frame is valid or invalid.

On the sender node, the Coding/Decoding Unit (CODEC) computes the CRC checksum and appends it to the message that it has to encode and send on the bus. On the receiver node, after decoding the received message, the CODEC performs the CRC check to verify whether the message integrity has been affected by electromagnetic noise on the bus, that is, whether some bits have been flipped.

In addition, in a time-triggered real-time system such as FlexRay, the different nodes have to keep a consistent view of the global time even in faulty situations. The component responsible for this is the Clock Synchronisation (CS) unit. It tries to improve the fault tolerance of the protocol through two kinds of correction: offset correction and rate correction. In particular, for offset correction the CS adopts a fault-tolerant midpoint algorithm to compute an average over the time differences between the communication rounds. On the basis of this computation, the
next message schedule is brought forward or delayed in such a way that all nodes have almost the same time in the next cycle.

The FlexRay Consortium claims that, thanks to this algorithm, up to two Byzantine faults¹ can be tolerated. When more than two of these faults happen, the system can fall into a situation in which there are different views of the global time, and consequently another problem can affect the system: the clique problem.

¹ A Byzantine fault is typical of distributed systems. It shows up as the wrong behaviour of a node in the system, which sends arbitrary messages, including messages aimed at corrupting the system. More details about this topic are provided in the Current State of Work section.

A clique is a group of nodes connected to a network that can communicate only inside the same group and not with the other nodes. FlexRay does not provide any means to detect and resolve this kind of problem. The clique problem in FlexRay has been well analysed in [2], where two kinds of cliques have been identified:
     1. Time domain cliques, which happen when subsets of nodes have different views of the global time, as described before.
     2. Value domain cliques, which occur when a frame is correctly placed in a slot but contains a different cycle counter.
Moreover, in [2] it is said that "the FlexRay consortium is aware of the potential clique problem", but also that "the cliques do not constitute a noticeable risk in practice", maybe because "there are no report published on cliques observed in a practical setup". For these reasons the authors of that document show with experiments how to create cliques in a physical FlexRay cluster and how to avoid or detect possible cliques.

Finally, when all the means adopted by the bus controller illustrated above are not enough to prevent faulty behaviour, an additional component can be inserted between the bus controller and the network, as shown in Figure 1: the Bus Guardian (BG). In [3], four properties of the BG have been identified and formally proven:

     1. Correct Relay.
     2. Validity.
     3. Agreement.
     4. Integrity.

These properties guarantee, on the one hand, that the communication controller does not access the communication channel outside its pre-allocated slots and, on the other hand, that messages coming from non-faulty communication controllers are relayed correctly.

Summarising [4] [5] with respect to fault tolerance, we can state that the FlexRay protocol
     - manages errors with a "never-give-up" strategy thanks to the three-level error model explained above, because "stopping communication is a critical decision which must be made by the application whenever possible";
     - is able to handle both internal and external faults;
     - does not adopt any strategy such as retransmission in case of a corrupted message; dealing with these problems is the responsibility of the host application, because the strategy of the protocol is to "signal" the error;
     - likewise leaves security to the application, because the protocol itself does not provide it;
     - "requires application support for Byzantine faults (e.g. group membership)".

The physical network architecture

FlexRay supports single and dual channel configurations, which consist of one or two pairs of wires respectively. Most FlexRay nodes typically also have power and ground wires available to power transceivers and microprocessors.

FlexRay can be distinguished from other automotive protocols such as CAN and LIN by its network layout. It supports a very flexible network topology because its two channels can be used in different ways. This flexibility allows the protocol to scale its fault tolerance and plays a big role in shaping the structure of a FlexRay system, so that redundant and independent systems are possible.

There are three possible FlexRay topologies:

      1. Passive Bus Topology: all nodes are connected to a bus; in the dual channel case, a node can be connected to both channels or to only one of them. Figure 3.A
      2. Active Star Topology: the network is built as an active star that contains star couplers, and each node must be connected to one coupler. Figure 3.B

      3. Combination of the topologies: a combination of the passive bus and the active star is used. Figure 3.C

It is very important for designers to choose carefully between these topologies, because choosing the most suitable one can play a big role in optimizing the cost, performance and reliability of their design.

The nodes of a FlexRay network must know how all the pieces of the network are configured in order to communicate efficiently.

Figures 3.A, 3.B and 3.C show several possible topologies that can be supported by FlexRay channels [1] [6].

Figure 3.A              Figure 3.B              Figure 3.C

The FlexRay communications bus is a deterministic, fault-tolerant and high-speed bus system. Its two separate physical communication lines, at 10 Mbps each, can be used either for doubly redundant, fault-tolerant message transmission or to double the data throughput. FlexRay delivers the error tolerance and time-determinism performance requirements for x-by-wire applications (e.g. drive-by-wire, steer-by-wire, brake-by-wire) [7].

Most first-generation FlexRay networks use only a single channel in order to keep wiring costs down, but later networks will use dual channels because of the big advantage they offer: the dual channel enhances fault tolerance and increases the bandwidth.

FlexRay can redundantly transmit individual messages to provide an additional layer of network reliability. In fact, FlexRay networks provide scalable fault tolerance by allowing single or dual channel communication. The dual channel is preferred in many cases; for example, in safety-critical applications all devices connected to the bus may use both channels for transferring data. However, it is always possible to connect only a single channel when redundancy is not needed, or to increase the bandwidth by using both channels for transferring non-redundant data.

As a result, FlexRay can be used with single or dual channels, but since the dual channel adds redundancy, using a dual channel topology instead of a single channel one increases the fault tolerance [1].

Current State of Work

In this section we describe the second step of our work, which consists of collecting practical and theoretical research on the enhancement of the FlexRay fault-tolerance capabilities.

Although FlexRay is still a new protocol in the automotive industry, many works have been conducted by companies and researchers, on the one hand to find out the true potential of the protocol and determine its working features, and on the other hand to improve its reliability and effectiveness. Therefore, in collecting information we decided to adopt a research strategy based on selecting the most reliable work from international conferences, workshops and leading companies in
the field of embedded systems, such as the Real-Time Systems Symposium (RTSS), the Euromicro Conference on Real-Time Systems (ECRTS), the International Workshop on Automated Verification of Critical Systems (AVoCS), the Real-Time and Embedded Technology and Applications Symposium (RTAS), the IEEE Computer Society and many others.

Moreover, our research strategy focuses on selecting works on the reliability and fault-tolerance aspects of FlexRay that try to estimate its capacities and propose concrete solutions to its weaknesses.

As a result of this research, we describe the most interesting outcomes as a kind of insight into the current state of work on FlexRay.

Before going into the individual results in depth, we note a common motivation behind each work: everyone agrees on the need to determine precisely the true performance, predictability and reliability of the protocol as a mandatory requirement for using FlexRay successfully in safety-critical applications. This common view is due to the fact that FlexRay is becoming the leader among distributed embedded systems targeted at high-performance vehicles.

Several studies, such as [8] and [9], compare the FlexRay protocol with the protocols most popular in the automotive industry today, such as LIN, CAN and TTCAN, with the purpose of showing how the flexibility and potential of FlexRay include all the benefits of the other protocols. In addition, other works such as [16] show in practice how it is possible to "migrate" from CAN to FlexRay, explaining the migration requirements, parameter calculation, message analysis, payload optimization and slot size definition. In the end, however, they indicate that there is a big problem in optimizing a FlexRay cycle, namely formalizing the static segment and dynamic segment parameters. The latter is one of the most interesting aspects, on which many researchers have spent their efforts.

For example, in [10] a technique to schedule messages on the FlexRay static segment has been proposed in order to compensate for the protocol's lack of handling of messages corrupted by transient and intermittent faults, which affect the reliability of the communication. The proposed technique generates a schedule on the basis of the probability of failure of the message, using a heuristic that is very close to CLP (Constraint Logic Programming) in terms of results but computationally less expensive.

On message scheduling, a good contribution has been made by [11], where the authors suggest different techniques to analyse the timing properties of both the static and the dynamic segment of a FlexRay communication cycle.

In more depth, for the timing properties of the static segment, an algorithm that builds the static schedule has been proposed and analysed. For the dynamic segment, several factors that can affect the worst-case response time have been analysed in three different approaches: the optimal (OO), heuristic (HH) and holistic (OH) solutions. OO uses an ILP formulation, HH treats the problem as a bin-covering problem, and OH further reduces the time of HH by partially using an ILP formulation. All the proposed analyses are supported by extensive experiments.

In another article [12], closely related to the previous one [11] and written by almost the same authors, a further step towards an efficient use of FlexRay is taken. While the first article bounds the message transmission time on both the ST and DYN segments, the second one focuses on finding the right bus configuration for a particular application in order to meet all the timing constraints.

This purpose is achieved by providing four techniques, extensively tested by the authors:

    1. The Basic Bus Configuration (BBC), which results from analysing the minimal bandwidth requirements of the application;

    2. The OBC heuristic with curve fitting (OBCCF), which, instead of exhaustively performing the scheduling for all possible values of the DYN segment length, evaluates the response time for only some values and then uses curve fitting to extrapolate the response time for the other points (this is based on the regularity of the dependence of response time on the size of the DYN segment, noticed in several experiments and depicted in Figure 4)
Figure 4

    3. The OBC heuristic with an exhaustive exploration of the size of the DYN segment;

    4. The Simulated Annealing (SA) based design space exploration, used to provide a baseline for the evaluation of the proposed heuristics.

The results of the experiments conducted by the authors are summarised in Figure 5, taken from the same article:

Figure 5

As these studies have shown, designing the FlexRay schedule is a complex operation, not only because it must guarantee the tight timing constraints and performance required by some automotive applications, but also because, in order to increase reusability and reduce validation time, it must cope with the continuous and rapid changes in electronic control features. This means elaborating a schedule that takes a certain amount of uncertainty into account.

In [13] the info-gap technique has been presented with the purpose of generating different schedules with a degree of robustness related to different ranges of uncertainty. In more depth, the uncertainty analysed is in the payloads of the messages, but the same approach can also be used for uncertainty related to the dependency between tasks and messages, to the period (rate of task execution or message transmission) and to the topology (mapping of tasks to hosts and messages to channels).

So far we have discussed only the message scheduling problems in a system that uses the FlexRay communication protocol, but there are many other issues pointed out by other works that need particular attention. Most of these are related, for example, to the Byzantine fault, which is very common in distributed systems.

A Byzantine fault occurs when a faulty node corrupts its local state and sends arbitrary messages. To face this problem, one can use either a Byzantine fault tolerance (BFT) technique, which masks a bounded number of Byzantine faults, e.g. using state machine replication, or a detection technique, which equips each node with a detector in order to monitor other nodes and isolate the nodes with faulty behaviour. A formal study of these techniques has been conducted in [14]. Its outcome is that the first technique is stronger than the second one, but an analysis of the trade-off between them shows that:
     - detection requires f+1 replicas, versus 3f+1 for BFT, in order to cope with f concurrent faults;
     - detection systems need only be provisioned for the average load, while a BFT system must be provisioned for the peak load;
     - detection is cheaper.
In addition to this analysis, in the same article the authors propose a sketch of a system that implements a Byzantine fault detector providing accountability, completeness and accuracy.

Against Byzantine faults, the FlexRay system can be equipped with an additional module placed between the bus controller and the network: the Bus Guardian. The functionality of this module has already been described in the
previous section of the document, but the FlexRay specification does not give any proof of its functionality. In this regard, in [15], four properties have been identified and formally proven:
     1. Correct Relay,
     2. Validity,
     3. Agreement,
     4. Integrity.
Moreover, regarding Byzantine faults, the FlexRay specification claims that up to two Byzantine faults can be tolerated thanks to the clock synchronization algorithm, but this property also still has to be proven, and the author of [15] is currently working on this problem as well.

Conclusion

The FlexRay communications bus is a deterministic, fault-tolerant, high-speed and high-performance bus system, and it has a promising future in real-time distributed systems, especially in the automotive industry. The dual channel topology offers enhanced fault tolerance and increased bandwidth: it allows messages to be transmitted redundantly, which increases reliability, or the two channels can be used only to increase the bandwidth, without message redundancy. FlexRay has a good mechanism for handling errors, the three-level error model, which provides a self-diagnostic mechanism for possible errors.

References

[1] Introduction to FlexRay and TTA, Peter Bohm, November 21, 2005.

[2] An Investigation of the Clique Problem in FlexRay, P. Milbredt, M. Horauer, A. Steininger, IEEE, 2008.

[3] On the Formal Verification of the FlexRay Communication Protocol, Bo Zhang, AVoCS 2006.

[4] Protocol Overview, C. Temple, Motorola, FlexRay International Workshop, Detroit, 2003.

[5] The FlexRay Protocol, P. Koopman, Carnegie Mellon, 2010.

[6] Seminar FlexRay, Robert Rieb, Chemnitz University, 2009.

[7] FlexRay Automotive Communication Bus Overview, National Instruments ("NI").

[8] Comparison of FieldBus Systems CAN, TTCAN, FlexRay and LIN in Passenger Vehicles, Steve C. Talbot, Shangping Ren, 29th IEEE International Conference on Distributed Computing Systems Workshops, Montreal, Quebec, Canada, June 22-26, 2009.

[9] In-Vehicle Networking, freescale.com.

[10] Scheduling for Fault-Tolerant Communication on the Static Segment of FlexRay, Bogdan Tanasa, Unmesh D. Bordoloi, Petru Eles, Zebo Peng, 31st IEEE Real-Time Systems Symposium, 2010.

[11] Timing Analysis of the FlexRay Communication Protocol, Traian Pop, Paul Pop, Petru Eles, Zebo Peng, Alexandru Andrei, Real-Time Systems Journal, Volume 39, Numbers 1-3, pp. 205-235, August 2008.

[12] Bus Access Optimisation for FlexRay-based Distributed Embedded Systems, Traian Pop, Paul Pop, Petru Eles, Zebo Peng, Design, Automation, and Test in Europe Conference (DATE '07).

[13] Computing Robustness of FlexRay Schedules to Uncertainties in Design Parameters, A. Ghosal, H. Zeng, Y. Ben-Haim, M. Di Natale, DATE '10, 2010.

[14] The Case for Byzantine Fault Detection, Andreas Haeberlen, Petr Kouznetsov, Peter Druschel, HotDep '06: Proceedings of the 2nd Conference on Hot Topics in System Dependability, Volume 2, 2006.

[15] On the Formal Verification of the FlexRay Communication Protocol, Bo Zhang, Automatic Verification of Critical Systems (AVoCS), 2006, pp. 184-189.

[16] Migration Framework from CAN to FlexRay, Richard Murphy, Frank Walsh, Brendan Jackman, Automotive Control Group, Waterford Institute of Technology, Cork Road, Waterford, Ireland.

FlexRay Fault Tolerance article

  • 1. FlaxRay Fault-Tolerance: Capabilities, weaknesses and proposed enhancements Antonio Cappiello, Omar Jaradat Mälardalen University, Västerås, Sweden, 03/2011 {aco10003, ojt10001}@student.mdh.se A bstract weaknesses and the capabilities based on two This paper gives an overview about main points: bus controller and network FlexRay, and it summarizes its main topology. The paper will proceed with the components, in addition to give an current state of the work and will be ended up adequate details of how those components work. by showing our conclusions. The document focuses on FlexRay reliability and how it is considered as a fault tolerant The FlexRay Protocol protocol, as well as, it discusses the capabilities, weaknesses and the authors’ enhancement The FlexRay protocol is a time-triggered proposals, so after reading this paper, readers protocol, and it can offer options for can create a good knowledge about FlexRay as deterministic data that arrives in a predictable well as how well this protocol achieves the time frame. FlexRay has a core with static reliability. 
frames and dynamic frames with a communication cycle that provides a predefined Introduction space for static and dynamic data, so nodes on FlexRay network must know how all the pieces FlexRay is a communication system, it is of the network are configured in order to considered one of the next generations of bus communicate, and since the embedded networks protocol for automotive networks; even it can be are different from normal PC networks, it applied on any other real time distributed means that FlexRay does need any additional system environment, but any researcher will mechanism to automatically discover and notice that this protocol is usually tied with the configure devices at run-time, like the PCs automotive industries, and this is simply, networks which require these procedures, because it was developed in 1999 by a FlexRay network and simply, have a closed cooperation of leading companies in automotive configuration and should not be changed once it industry and it was developed exclusively for is assembled in the production. automotive. FlexRay manages more than one node “Multiple Since, software errors are considered one of the nodes” with a Time Division Multiple big challenges that affect seriously on the Access (TDMA) scheme and every FlexRay node software performance. Our mission is to show is synchronized to the same clock, and each node how FlexRay can be considered as a fault waits for its turn to write on the bus, and tolerant system, and how it can handle the because the timing is harmonious in failures and errors that can be happened in any a TDMA scheme, FlexRay is always able to given time, as well as, try to suggest or propose guarantee consistency of data deliver to nodes any idea can lead to enhance the reliability of on the network, this provides many advantages FlexRay protocol. for systems that depend on up to date data between nodes. 
In this paper, we will describe FlexRay protocol and analyse the bus controller structure and how the nodes can communicate and interact within the whole communication system, so we will begin to talk about the protocol itself, and then the fault tolerance, by explaining the
  • 2. Fault-Tolerance: Capabilities and operates normal, constitute all together the so- called three-level error model, Figure 2. This Weaknesses model provides a self-diagnostic mechanism of the possible error. In this section of the document we are going to point out the means adopted by the FlexRay protocol in order to provide a fault-tolerance communication. We have individuated two aspects of a FlexRay System involved in the assurance of the fault- tolerance: 1. The bus controller. 2. The physical network architecture. The bus controller The bus controller consists of six components as showed in Figure 1 [1], but in particular there are some of these that use a mechanism to protect the communication from errors. Figure 2 The Frame and Symbol Processing (FSP) beside to separate the payload from the header of the message received, it provides also status data to the host regarding the frame reception, as for example if the received frame is valid or invalid. On the sender node, The Coding/Decoding Unit (CODEC) computes and appends the CRC checksum to the message that it has to encode and send on the bus. On the receiver node, after decoding the message received, the CODEC performs the CRC check in order to verify whether the message integrity has been affected by electromagnetic noise on the bus and consequently some bits have been flipped. In addition, in a time-triggered real time system such as FlexRay, different nodes have to keep a Figure 1 consistent view of the global time even in faulty situations, and the component responsible for The Protocol Operation Control (POC) this is the Clock Synchronisation (CS). This responsible to react to host commands component tries to improve the fault tolerance of instructing/guiding the other components also the Protocol through two kind of correction: the reacts to error situations. For example when an offset correction and the rate correction. 
In error occurs, the POC falls to normal passive particular, it is in the offset correction method state and tries to reintegrate, but when the that the CS adopts a fault-tolerant midpoint error is fatal the POC falls in the halt state and algorithm in order to compute an average over all operations are stopped. These two states and the time differences between the communication the active state, in which the bus controller rounds. On the base of this computation, the
  • 3. next message schedule is brought forward or allocated slots, and from the other hand the delayed in such a way that all nodes have correctly relay of messages coming from non- almost the same time in the next cycle. faulty communication controller. The FlaxRay Consortium claims that thanks to this algorithm, up to two Byzantine faults 1 can Summarising [4] [5] about the fault tolerance, be tolerated. When more than two of these faults we can state that the FlexRay Protocol happen, the System can fall in a situation in  manages the errors with a “never-give- which there are different views of the global up”-strategy thanks to the three-level time and consequently another problem can error model explained above, because affect the System, the Clique problem. “stopping communication is a critical A Clique is a group of nodes connected to a decision which must be made by the network which can communicate only inside the application whenever possible”; same group and not with the other ones.  is able to handle both internal and FlexRay doesn’t provide any mean to detect and external faults; resolve this kind of problem. The Clique  does not adopt any strategy like Problem in FlexRay has been well analysed in retransmission in case of a corrupted [2], and more in depth, two kinds of Cliques has message, but this is responsibility of the been identified: host application to face with these 1. Time domain cliques, that happens problems because the strategy of the when subsets of nodes have different protocol is to “signal” the error; view of the global time, as described  as well as for the security aspect, before, because the Protocol does not provide 2. Value domain cliques, that occurs when security, but it is responsibility of the a frame is correctly placed in a slot but application contains a different cycle counter.  “Requires application support for Moreover, in [2] it is said that “the FlaxRay Byzantine faults (e.g. 
group consortium is aware of the potential clique membership). problem” but it is even said that “the cliques do not constitute a noticeable risk in practice” The physical network architecture maybe because “there are no report published on cliques observed in a practical setup”. For these FlexRay supports single and dual channel reasons in that document the authors show with configurations which consist of one or two pairs experiments how to create cliques in a physical of wires respectively, most FlexRay nodes FlexRay cluster and how to avoid or detect typically also have power and ground wires possible cliques. available to power transceivers and microprocessors. Finally, when all the above illustrated means adopted by the bus controller are not enough to FlexRay can be distinguished from all other prevent faulty behaviours, an additional automotive protocols such as CAN and LIN by component can be inserted between the bus its Network layout because FlexRay supports a controller and the network as showed in the very flexible network topology, and this is Figure 1: the Bus Guardian (BG). In [3], four because it has two channels that can be used in properties for the BG have been identified and a different ways, this for sure will increase the formally proofed: flexibility which will allow the protocol to provide a scalability of the fault tolerance, in 1. Correct Relay. addition to that it plays a big role in forming 2. Validity. FlexRay system structure, so redundant and 3. Agreement. independent systems are possible. 4. Integrity. 
There are three possible FlexRay topologies: These properties guarantee from one hand no accesses of the communication control to the 1.Passive Bus Topology: it means that all communication channel outside the pre- nodes can be connected to a bus but in dual channels case one node can be 1 A Byzantine Fault is typical of the distributed system and is connected to both channels or only to visible with the wrong behaviour of a node in the system, that consist in sending arbitrary messages, including messages one of these channel. Figure 3.A aimed to corrupt the system. More details about this topic will be provided in the Current State of Work paragraph
  • 4. 2.Active Star Topology: In this topology tolerance and time-determinism performance the network can be built as an active requirements for x-by-wire applications (i.e. star that contains star couplers, each drive-by-wire, steer-by-wire, brake-by-wire, node must be connected to one etc.). This article covers the basics FlexRay. [7] coupler. Figure 3.B Most first FlexRay networks generation only use 3.Combination of the topologies: In this the “single channel” and this is to decrease the topology a combination between the wires cost and keep it down, but further passive bus and active star is used. networks will use dual channel and this is Figure 3.C because the big advantage that they can gain from dual channel, since the dual channel It is very important for designers to select enhances fault – tolerance and increase the between these topologies because choosing the bandwidth. more suitable topology can play a big role to optimize the cost, performance, and reliability FlexRay can redundantly transmit individual for their design. messages to provide an additional layer of network reliability. In fact, FlexRay networks FlexRay network must know how all the pieces provide scalable fault – tolerance by allowing of the network are configured in order to single or dual channel communication, but for communicate efficiently. sure the dual channel is preferred in many cases, for example, in security – critical Figures 3.A, 3.B and 3.C show several possible applications, all devices connected to the bus topologies can be supported by FlexRay may use both channels for transferring data. channels [1] [6]. However, it is always possible to connect one single channel when the redundancy is not needed, or to increase the bandwidth by using both channels for transferring non-redundant data. 
As a result, FlexRay can be used with single or dual channels, but since the dual channel provides and increases the redundancy this will lead to increase the fault – tolerance, thus, using Figure3.A Figure 3.B dual channel topology instead of single channel will logically influence the fault – tolerance cumulatively [1]. Current State of Work In this section of the document we are going to describe the second step of our work consisting in collecting practical and theoretical research on the enhancement of the FlaxRay fault- tolerance capabilities. Regardless of the fact that FlaxRay is still a new protocol in the automotive industries, there are many works conducted by companies or Figure 3.C researchers form one hand in order to find out the true potentialities of the protocol and determine its working features and from the FlexRay communications bus is a deterministic, other hand with the purpose to improve its fault-tolerant and high-speed bus system, and reliability and effectiveness. Therefore in our using two separate physical FlexRay work of collecting information we decided to communication lines with 10Mbps implement adopt a strategy of research based on selecting double redundant fault tolerant message the most reliable work form international transmission so that data throughput can be conferences, workshops and companies leader in doubled as well. FlexRay delivers the error
the field of embedded systems, such as the Real-Time Systems Symposium (RTSS), the Euromicro Technical Committee on Real-Time Systems (ECRTS), the International Workshop on Automated Verification of Critical Systems (AVoCS), the Real-Time and Embedded Technology and Applications Symposium (RTAS), the IEEE Computer Society and many others.

Moreover, our research strategy focuses on selecting works on the reliability and fault-tolerance aspects of FlexRay that try to estimate its capabilities and propose concrete solutions to its weaknesses.

As a result of this research, we describe the most interesting outcomes as an insight into the current state of work on FlexRay.

Before going into the individual results, we note a common motivation behind each work: everyone agrees on the need to precisely determine the true performance, predictability and reliability of the protocol as a mandatory requirement for using FlexRay successfully in safety-critical applications. This common view is due to the fact that FlexRay is becoming the leading protocol for distributed embedded systems in high-performance vehicles.

Several studies, such as [8] and [9], compare the FlexRay protocol with the most popular protocols in today's automotive industry, such as LIN, CAN, TTCAN and others, with the purpose of showing how the flexibility and potential of FlexRay cover all the benefits of the other protocols. In addition, other works such as [16] show in practice how it is possible to "migrate" from CAN to FlexRay, explaining the migration requirements, parameter calculation, message analysis, payload optimization and slot size definition. In the end, however, they indicate that a major problem in optimizing a FlexRay cycle is formalizing the static segment and dynamic segment parameters. The latter is one of the most interesting aspects, on which many researchers have spent their efforts.

For example, in [10] a technique to schedule messages on the FlexRay static segment has been proposed in order to compensate for the protocol's lack of protection against messages corrupted by transient and intermittent faults, which affect the reliability of the communication. The proposed technique generates a schedule based on the probability of message failure, using a heuristic that comes very close to CLP (Constraint Logic Programming) in terms of results but is computationally less expensive.

Regarding message scheduling, a good contribution has been given by [11], where the authors suggest different techniques to analyse the timing properties of both the static and the dynamic segment of a FlexRay communication cycle.

In more depth, for the timing properties of the static segment, an algorithm that builds the static schedule has been proposed and analysed. For the dynamic segment, several factors that can affect the worst-case response time have been analysed with three different approaches: the optimal (OO), heuristic (HH) and holistic (OH) solutions.

OO uses an ILP formulation, HH treats the problem as a bin-covering problem, and OH further reduces the running time of HH by partially using an ILP formulation. All the proposed analyses are backed by extensive experiments.

In another article [12], closely related to the previous one [11] and written by almost the same authors, a further step toward an efficient use of FlexRay is taken. While the first article bounds the message transmission time in both the ST and DYN segments, the second focuses on finding the right bus configuration for a particular application in order to meet all the timing constraints.

This is achieved by providing four techniques, extensively tested by the authors:

   1. The Basic Bus Configuration (BBC), which results from analysing the minimal bandwidth requirements of the application;

   2. The OBC heuristic with curve fitting (OBCCF), which, instead of exhaustively performing the scheduling for all possible values of the DYN segment length, evaluates the response time for only some values and then extrapolates the response time for the other points with a curve-fitting approach (this is based on the regularity of the dependence of response time on DYN segment size, noticed in several experiments and depicted in the following picture);
Figure 4

   3. The OBC heuristic with an exhaustive exploration of the size of the DYN segment;

   4. The Simulated Annealing (SA) based design space exploration, used to provide a baseline for evaluating the proposed heuristics.

The results of the experiments conducted by the authors are summarised in the following picture, taken from the same article:

Figure 5

As these studies have shown, designing a FlexRay schedule is a complex operation, not only because it must guarantee the tight timing constraints and performance required by some automotive applications, but also because, in order to increase reusability and reduce validation time, it must cope with the continuous and rapid changes in electronic control features. This means producing a schedule that takes a certain amount of uncertainty into account.

In [13] the info-gap technique is presented, with the purpose of generating different schedules whose degree of robustness is related to different ranges of uncertainty. In more depth, the uncertainty analysed is in the payloads of the messages, but the same approach can also be used for uncertainty in the dependencies between tasks and messages, in the period (rate of task execution or message transmission) and in the topology (mapping of tasks to hosts and messages to channels).

So far we have discussed only the message scheduling problems in a system that uses the FlexRay communication protocol, but there are many other issues pointed out by other works that need particular attention.

Most of these relate, for example, to Byzantine faults, which are very common in distributed systems.

A Byzantine fault occurs when a faulty node corrupts its local state and sends arbitrary messages. To deal with this problem, one can use a Byzantine fault tolerance (BFT) technique, which masks a bounded number of Byzantine faults, e.g. using state machine replication, or a detection technique, which equips each node with a detector that monitors the other nodes and isolates nodes exhibiting faulty behaviour.

A formal study of these techniques has been conducted in [14]. Its outcome is that the first technique is stronger than the second, but an analysis of the trade-off between them shows that:

   • Detection requires f+1 replicas vs. 3f+1 for BFT in order to cope with f concurrent faults;
   • Detection systems need only be provisioned for the average load, while a BFT system must be provisioned for the peak load;
   • Detection is cheaper.

In addition to this analysis, in the same article the authors sketch a system that implements a Byzantine fault detector providing accountability, completeness and accuracy.

Against Byzantine faults, the FlexRay system can be equipped with an additional module placed between the Bus Controller and the network: the Bus Guardian. The functionality of this module has already been described in the
previous section of the document, but the FlexRay specification does not give any proof of its functionality. In this regard, in [15], four properties have been identified and formally proved:

   1. Correct Relay,
   2. Validity,
   3. Agreement,
   4. Integrity.

Moreover, regarding Byzantine faults, the FlexRay specification claims that up to two Byzantine faults can be tolerated thanks to the Clock Synchronization Algorithm, but this property also has yet to be proved, and the author of the previous article ([15]) is currently working on this problem as well.

Conclusion

The FlexRay communications bus is a deterministic, fault-tolerant, high-speed bus system with high performance, and it has an increasingly promising future in real-time distributed systems, especially in the automotive industry. The dual-channel topology offers enhanced fault tolerance and increased bandwidth: it provides message redundancy, i.e. duplicated transmission, which increases reliability, and the two channels can also be used solely to increase bandwidth, without redundant messages. FlexRay has a good error-handling mechanism (i.e. the three-level error model), which provides self-diagnosis of possible errors.

References

[1] Introduction to FlexRay and TTA, Peter Bohm, November 21, 2005.

[2] An Investigation of the Clique Problem in FlexRay, P. Milbredt, M. Horauer, A. Steininger, IEEE, 2008.

[3] On the Formal Verification of the FlexRay Communication Protocol, Bo Zhang, AVoCS 2006.

[4] Protocol Overview, C. Temple, Motorola, FlexRay International Workshop, Detroit, 2003.

[5] The FlexRay Protocol, P. Koopman, Carnegie Mellon, 2010.

[6] Seminar FlexRay, Robert Rieb, Chemnitz University, 2009.

[7] FlexRay Automotive Communication Bus Overview, National Instruments ("NI").

[8] Comparison of FieldBus Systems CAN, TTCAN, FlexRay and LIN in Passenger Vehicles, Steve C. Talbot, Shangping Ren, 29th IEEE International Conference on Distributed Computing Systems Workshops, Montreal, Quebec, Canada, June 22-26, 2009.

[9] In-Vehicle Networking, frescale.com

[10] Scheduling for Fault-Tolerant Communication on the Static Segment of FlexRay, Bogdan Tanasa, Unmesh D. Bordoloi, Petru Eles, Zebo Peng, 31st IEEE Real-Time Systems Symposium, 2010.

[11] Timing Analysis of the FlexRay Communication Protocol, Traian Pop, Paul Pop, Petru Eles, Zebo Peng, Alexandru Andrei, Real-Time Systems Journal, Volume 39, Numbers 1-3, pp. 205-235, August 2008.

[12] Bus Access Optimisation for FlexRay-based Distributed Embedded Systems, Traian Pop, Paul Pop, Petru Ion Eles and Zebo Peng, Design, Automation, and Test in Europe Conference, DATE07.

[13] Computing Robustness of FlexRay Schedules to Uncertainties in Design Parameters, A. Ghosal, H. Zeng, Y. Ben-Haim, M. Di Natale, DATE '10, 2010.

[14] The Case for Byzantine Fault Detection, Andreas Haeberlen, Petr Kouznetsov, Peter Druschel, HOTDEP'06 Proceedings of the 2nd Conference on Hot Topics in System Dependability, Volume 2, 2006.

[15] On the Formal Verification of the FlexRay Communication Protocol, Bo Zhang, Automatic Verification of Critical Systems (AVoCS), 2006, pp. 184-189.

[16] Migration Framework from CAN to FlexRay, Richard Murphy, Frank Walsh and Brendan Jackman, Automotive Control Group, Waterford Institute of Technology, Cork Road, Waterford, Ireland.
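
As a worked illustration of the replication trade-off quoted from [14] in the discussion of Byzantine faults (f+1 replicas for detection versus 3f+1 for BFT), the following minimal Python sketch tabulates both counts for a few values of f. The function names are hypothetical and introduced only for this illustration; the sketch assumes nothing beyond the two formulas reported in the text and does not model FlexRay itself.

```python
def replicas_for_detection(f: int) -> int:
    """Replicas needed by a fault-detection scheme for f concurrent faults (f + 1, as quoted from [14])."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 1


def replicas_for_bft(f: int) -> int:
    """Replicas needed by a BFT (masking) scheme for f concurrent faults (3f + 1, as quoted from [14])."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def replication_table(max_faults: int) -> list[tuple[int, int, int]]:
    """Return rows (f, detection replicas, BFT replicas) for f = 1 .. max_faults."""
    return [
        (f, replicas_for_detection(f), replicas_for_bft(f))
        for f in range(1, max_faults + 1)
    ]


if __name__ == "__main__":
    print("f  detection  BFT")
    for f, det, bft in replication_table(3):
        print(f"{f}  {det:9d}  {bft:3d}")
```

The gap between the two columns grows by two replicas for every additional fault to be covered, which is one way to read the remark that detection is cheaper.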