Dependable Real-Time Systems

Trends and Visions.

The demands placed on reliable real-time systems can be met only through a combination of measures. Applying a coordinated strategy of fault avoidance, detection and removal (perfection strategy) throughout the development process reduces the number of design and implementation mistakes to a minimum.

Because of the complexity of today's systems, faults can never be entirely eliminated with the current state of the art. In particular, the effort invested and the associated costs are limited by economic considerations. For this reason, common standards (e.g. IEC 61508, EN 50128, ISO WD 26262, EN ISO 13849) derive a required safety integrity level (SIL) from the criticality of an application (extent of damage and probability of occurrence) and recommend graded sets of measures for each level. As the integrity level rises, so do the number and intensity of the measures to be taken. High integrity levels usually require fault tolerance, which leads to redundant components or structures. Redundancy therefore has a direct impact on the system architecture, but it allows sporadic hardware failures or activated software faults to be detected and masked (N-out-of-M voting; diversity and software fault tolerance).
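The majority decision among redundant channels mentioned above can be illustrated with a minimal sketch. It assumes that each of the M channels delivers one comparable result per cycle and that a channel which has failed detectably reports None. The function name vote and its return format are hypothetical and chosen only for illustration; they do not describe any particular product or standard. The sketch also assumes N is a strict majority of M (for example 2 out of 3), so that at most one value can win.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def vote(outputs: Sequence[Optional[Hashable]],
         n: int) -> Tuple[Optional[Hashable], List[int]]:
    """N-out-of-M voter (illustrative sketch, not a certified implementation).

    outputs: one result per redundant channel; None marks a channel
             that delivered no result in this cycle (assumption).
    n:       number of agreeing channels required; must be a strict
             majority of M so that the winner is unambiguous.

    Returns (winner, suspects): the agreed value, or None if fewer than
    n channels agree, plus the indices of channels that deviate.
    """
    m = len(outputs)
    if not (2 * n > m and n <= m):
        raise ValueError("n must satisfy m/2 < n <= m")

    # Count only channels that actually delivered a result.
    counts = Counter(v for v in outputs if v is not None)

    winner = None
    if counts:
        value, hits = counts.most_common(1)[0]
        if hits >= n:
            winner = value

    # Without a winner every channel is suspect; otherwise only the
    # channels that disagree with the winner (or delivered nothing).
    suspects = [i for i, v in enumerate(outputs)
                if winner is None or v != winner]
    return winner, suspects


# 2-out-of-3: the deviating third channel is masked and reported.
print(vote([42, 42, 17], n=2))   # (42, [2])
# No two channels agree: the fault is detected but cannot be masked.
print(vote([1, 2, 3], n=2))      # (None, [0, 1, 2])
```

In the first call the fault is masked, because the system can continue with the agreed value while the suspect channel is diagnosed. In the second call the fault is only detected, so a real system would have to fall back to a safe state.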

Technological progress in microelectronics and the increasing integration density of integrated circuits have in the past led to ever more powerful and more reliable components and systems. For this reason, the use of fault tolerance has so far been limited to application fields that inherently place demands on functional safety (e.g. railway, avionics, medical devices, manufacturing technology). In these fields, domain-specific solutions and system concepts have emerged that reflect the respective safety cultures.

The law formulated by Gordon Moore in 1965, according to which the number of transistors doubles in a fixed period (at first 12, then 18 months), is still valid and leads to ever more highly integrated circuits. Integration will continue to advance in the future, but it comes with significant disadvantages and reverses the recent trend toward ever more reliable digital components. The drawback has already become visible with the transition to 45 nm technology. Because the structures shrink to the range of a few atomic layers, the circuits become susceptible to transient faults (SEU) and aging effects (migration) [1]. A paradigm shift toward "build dependable systems with undependable devices" has already been put up for discussion [2]. In the future, redundancy and fault tolerance will also be used in applications outside the safety-critical domain.

For that reason, ReliaTec is pursuing the development of future products (ReliaSYS and ReliaOS) alongside the development of basic platforms for safety-critical and/or highly available applications. ReliaSYS is a generic, redundant platform for real-time systems in the variants "embedded" and "IPC", based almost entirely on standard hardware components (COTS). ReliaOS provides a software framework that supports application development and simplifies the certification of system instances. Among other things, we will draw on our existing know-how in the field of hardware fault diagnosis (e.g. processor test algorithms for Infineon XC16x and ARM-9).

[1] S. Borkar, N.P. Jouppi and P. Stenstrom: Microprocessors in the Era of Terascale Integration, Proceedings DATE 2007, p. 237.
[2] D. Serpanos and J. Henkel: Dependability and Security Will Change Embedded Computing, IEEE Computer, vol. 41, no. 1, January 2008, ISSN 0018-9162.

    © ReliaTec GmbH - All Rights Reserved