Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 16 2009-10-1
16 Model-Based Design for Embedded Systems
α
u
(Δ) = min{(α
u
⊗β
u
) β
l
, β
u
}
α
l
(Δ) = min{(α
l
β
u
) ⊗ β
l
, β
l
}
β
u
(Δ) = max{ inf
Δ≤λ
{β
u
(λ) − α
l
(λ)},0}
β
l
(Δ) = sup
0≤λ≤Δ
{β
l
(λ) − α
u
(λ)}
In the example system of Figure 1.2, the processing semantics of the tasks
P
1
, P
2
, P
3
,andP
4
can be modeled with abstract GPCs.
Finally, let us consider a component that is used for event stream shaping.
A greedy shaper component (GSC) with a shaping curve σ delays events of
an input event stream such that the output event stream has σ as an upper
arrival curve. Additionally, a greedy shaper guarantees that no events are
delayed longer than necessary. Typically, greedy shapers are used to reshape
bursty event streams and to reduce global buffer requirements. If the abstract
input event stream of a GSC with the shaping curve, σ, is represented by the
tuple of arrival curves, [α
l
, α
u
], then the output of the GSC can be modeled
as an abstract event stream with arrival curves:
α
u
GSC
= α
u
⊗σα
l
GSC
= α
l
⊗(σσ)
Note that a greedy shaper does not need any computation or communica-
tion resources. Thus, the transfer function of an abstract GSC considers only
the ingoing and the outgoing event stream, as well as the shaping curve, σ.
More details about greedy shapers inthe context of MPA can be found in [19].
In the example system of Figure 1.2, the semantics of the shapers, S
1
and
S
2
, can be modeled with abstract GSCs.
1.4.4 System Performance Model
In order to analyze the performance of a distributed embedded platform, it
is necessary to build a system performance model. This model has to repre-
sent the hardware architecture of the platform. In particular, it has to reflect
the mapping of tasks to computation or communication resources and the
scheduling policies adopted by these resources.
To obtain a performance model of a system, we first have to model the
event streams that trigger the system, the computation and communication
resources that are available, and the processing components. Then, we have
to interconnect the arrival and service inputs and outputs of all these ele-
ments so that the architecture of the system is correctly represented.
Figure 1.7 depicts the MPA performance model for the example system
described in Figure 1.2. Note that the outgoing abstract service stream of
GPC
2
is used as the ingoing abstract service stream for GPC
4
, i.e., GPC
4
gets only the resources that are left by GPC
2
. This represents the fact that
the two tasks share the same processor and are scheduled according to a
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 17 2009-10-1
Performance Prediction of Distributed Platforms 17
α
HR
α
LR
β
1
β
3
β
2
σ
1
σ
2
GPC
1
GPC
3
GPC
4
GSC
2
GPC
2
GSC
1
FIGURE 1.7
Performance model for the example system in Figure 1.2.
pre-emptive fixed priority scheduling policy with GPC
2
having a higher pri-
ority than GPC
4
.
In general, scheduling policies for shared resources can be modeled by
the way the abstract resources β are distributed among the different abstract
tasks. For some scheduling policies, such as earliest deadline first (EDF) [16],
TDMA [20], nonpreemptive fixed priority scheduling [4], various kinds of
servers [16], or any hierarchical composition of these elementary policies,
abstract components with appropriate transfer functions have been intro-
duced. Figure 1.8 shows some examples of how to model different schedul-
ing policies within the MPA framework.
1.4.5 Performance Analysis
The performance model provides the basis for the performance analysis
of a system. Several performance characteristics such as worst-case end-to-
end delays of events or buffer requirements can be determined analytically
within the MPA framework.
The performance of each abstract component can be determined as a
function of the ingoing arrival and service curves by the formulas of the RTC.
For instance, the maximum delay, d
max
, experienced by an event of an event
stream with arrival curves, [α
l
, α
u
], that is processed by a GPC on a resource
with service curves, [β
l
, β
u
], is bounded by
d
max
≤ sup
λ≥0
inf{τ ≥ 0: α
u
(λ) ≤ β
l
(λ + τ)}
def
= Del(α
u
, β
l
)
The maximum buffer space, b
max
, that is required to buffer an event stream
with arrival curves, [α
l
, α
u
], that is processed by a GPC on a resource with
service curves, [β
l
, β
u
], is bounded by
b
max
≤ sup
λ≥0
{α
u
(λ) − β
l
(λ)}
def
= Buf(α
u
, β
l
)
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 18 2009-10-1
18 Model-Based Design for Embedded Systems
α
A
α
A
GPC
GPC
α
B
β
β
(a)
α
A
α
B
β
β
(b)
EDF
α
A
α
B
β
slot
1
β
slot
2
(c)
α
A
α
B
β
β
s
2
α
B
β
α
A
α
B
α
A
α
B
β
slot
1
β
slot
2
α
A
α
B
β
β
s
2
β
s
1
β
s
1
(d)
GPC
TDMA
Share
Sum
GPC
GPC
GPC
FIGURE 1.8
Modeling scheduling policies in the MPA framework: (a) preemptive fixed
priority, (b) EDF, (c) TDMA, and (d) generalized processor sharing.
Figure 1.9 shows the graphical interpretation of the maximum delay expe-
rienced by an event at a GPC and the maximum buffer requirement of the
GPC: d
max
corresponds to the maximum horizontal distance between α
u
and β
l
,andb
max
corresponds to the maximum vertical distance between α
u
and β
l
.
In order to compute the end-to-end delay of an event stream over several
consecutive GPCs, one can simply add the single delays at the various com-
ponents. Besides this strictly modular approach, one can also use a holistic
delay analysis that takes into consideration that in a chain of task the worst-
case burst cannot appear simultaneously in all tasks. (This phenomenon is
described as “pay burst only once” [7].) For such a task chain the total delay
can be tightened to
d
max
≤ Del(α
u
, β
l
1
⊗β
l
2
⊗ ⊗β
l
n
)
For an abstract GSC, the maximum delay and the maximum backlog are
bounded by
d
max
= Del(α
u
, σ) b
max
= Buf(α
u
, σ)
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 19 2009-10-1
Performance Prediction of Distributed Platforms 19
d
max
b
max
β
l
α
u
Δ
FIGURE 1.9
Graphical interpretation of d
max
and b
max
.
Let us come back to the example of Figure 1.2. By applying the above
reasoning, the worst-case end-to-end delay for the packets of the two video
streams can be analytically bounded by
d
HR
≤ Del(α
u
HR
, β
l
1
⊗β
l
3
⊗σ
1
) d
LR
≤ Del(α
u
LR
, β
l
2
⊗β
l
3
⊗σ
2
)
1.4.6 Compact Representation of VCCs
The performance analysis method presented above relies on computations
on arrival and service curves. While the RTC provides compact mathemati-
cal representations for the different operations on curves, their computation
in practice is typically more involved. The main issue is that the VCCs are
defined for the infinite range of positive real numbers. However, any com-
putation on these curves requires a finite representation.
To overcome this problem, we introduce a compact representation for
special classes of VCCs. In particular, we consider piecewise linear VCCs
that are finite, periodic, or mixed.
• Finite piecewise linear VCCs consist of a finite set of linear segments.
• Periodic piecewise linear VCCs consist of a finite set of linear segments
that are repeated periodically with a constant offset between consecu-
tive repetitions.
• Mixed piecewise linear VCCs consist of a finite set of linear segments
that are followed by a second finite set of linear segments that are
repeated periodically, again with a constant offset between consecu-
tive repetitions.
Figure 1.10a through c shows examples of these three classes of curves.
Many practically relevant arrival and service curves are piecewise linear.
For example, if a stream consists of a discrete token, the corresponding
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 20 2009-10-1
20 Model-Based Design for Embedded Systems
8
6
4
2
0
0246810
Δ (ms)(a)
(b)
(c)
Resource units
4
3
2
1
0
0
246810
Resource units
Δ (ms)
8
6
4
2
0
0246810
Events
Δ (ms)
FIGURE 1.10
(a) A finite piecewise linear VCC, (b) a periodic piecewise linear VCC, and
(c) a mixed piecewise linear VCC.
arrival curve is an integer and can be represented as a piecewise constant
function. For the service curves,one could use the samereasoning, as the basic
resource units (number of clock cycles, number of bytes, etc.) are typically
also atomic. However, these units are often too fine-grained for a practical
analysis andhence it is preferable to use a continuous model. In most practical
applications, the fluid resource availability is piecewise constant over time,
that is, practically relevant service curves are also piecewise linear.
Here, we want to note that there are also piecewise linear VCCs that are
not covered by the three classes of curves that we have defined above. In
particular, we have excluded irregular VCCs, that is, VCCs with an infinite
number of linear segments that do not eventually show periodicity.
However, most practically relevant timing specifications for event streams
and availability specifications for resources can be captured by either finite,
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 21 2009-10-1
Performance Prediction of Distributed Platforms 21
periodic, or mixed piecewise linear VCCs. In addition, note that VCCs only
describe bounds on token or resource streams, and, therefore, one can always
safely approximate an irregular VCC to a mixed piecewise VCC.
In the following, we describe how these three classes of curves can be rep-
resented by means of a compact data structure. First, we note that a single lin-
ear segment of a curve can be represented by a triple x, y,swith x ∈ R
≥0
and
y, s ∈ R that specifies a straight line in the Cartesian coordinate system, which
starts at the point (x, y) and has a slope s. Further, a piecewise linear VCC can
be represented as a (finite or infinite) sequence
x
1
, y
1
, s
1
, x
2
, y
2
, s
2
,
of
such triples with x
i
< x
i+1
for all i. To obtain a curve defined by such a
sequence, the single linear segments are simply extended with their slopes
until the x-coordinate of the starting point of the next segment is reached.
The key property of the three classes of VCCs defined above is that these
VCCs can be represented with a finite number of segments, which is funda-
mental for practical computations: Let ρ be a lower or an upper VCC belong-
ing to a set of finite, periodic, or mixed VCCs. Then ρ can be represented with
atuple
ν
ρ
=
Σ
A
, Σ
P
, p
x
, p
y
, x
p0
, y
p0
where
Σ
A
is a sequence of linear segments describing a possibly existing irregular
initial part of ρ
Σ
P
is a sequence of linear segments describing a possibly existing regularly
repeated part of ρ
If Σ
P
is not an empty sequence, then the regular part of ρ is defined by
the period p
x
and the vertical offset p
y
between two consecutive repetitions
of Σ
P
, and the first occurrence of the regular sequence Σ
P
starts at (x
p0
, y
p0
).
In this compact representation, we call Σ
A
the aperiodic curve part and Σ
P
the periodic curve part.
In the compact representation, a finite piecewise linear VCC has Σ
P
={},
that is, it consists of only the aperiodic part, Σ
A
,withx
A,1
= 0. A periodic
piecewise linear VCC can be described with Σ
A
={}, x
P,1
= 0, and x
p0
= 0,
that is, it has no aperiodic part. And finally, a mixed piecewise linear VCC is
characterized by x
A,1
= 0, x
P,1
= 0, and x
p0
> 0.
As an example, consider the regular mixed piecewise linear VCC
depicted in Figure 1.10c. Its compact representation according to the defi-
nition above is given by the tuple
ν
C
=
0, 1, 0, 0.2, 2,0, 0.4, 3, 0, 0.6,4, 0
,
0, 0, 0
,2,1,2,5
The described compact representation of VCCs is used as a basis for prac-
tical computations in the RTC framework. All the curve operators adopted
in the RTC (minimum, maximum, convolutions, deconvolutions, etc.) are
closed on the set of mixed piecewise linear VCCs. This means that the result
of the operators, when applied to finite, periodic, or mixed piecewise linear
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 22 2009-10-1
22 Model-Based Design for Embedded Systems
VCCs, is again a mixed piecewise linear VCC. Further details about the com-
pact representation of VCCs and, in particular, on the computation of the
operators can be found in [17].
1.5 RTC Toolbox
The framework for the MPA with the RTC that we have described in this
chapter has been implemented in the RTC Toolbox for MATLAB
R
[21],
which is available at />The RTC Toolbox is a powerful instrument for system-level performance
analysis of distributed embedded platforms. At its core, the toolbox provides
a MATLAB type for the compact representation of VCCs (see details in Sec-
tion 1.4) and an implementation of a set of the RTC curve operations. Built
around this core, the RTC Toolbox provides libraries to perform the MPA,
and to visualize VCCs and the related data.
Figure 1.11 shows the underlying software architecture of the toolbox.
The RTC toolbox internally consists of a kernel that is implemented in Java,
and a set of MATLAB libraries that connect the Java kernel to the MAT-
LAB command line interface. The kernel consists of classes for the compact
representation of VCCs and classes that implement the RTC operators. These
two principal components are supported by classes that provide various
utilities. On top of these classes, the Java kernel provides APIs that provide
methods to create compact VCCs, compute the RTC operations, and access
parts of the utilities.
MATLAB command line
RTC toolbox
RTC operators
MPA library
VCC library
MATLAB/Java interface
Min-plus/Max-plus algebra, utilities
Compact representation of VCCs
Java API
FIGURE 1.11
Software architecture of the RTC toolbox.
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 23 2009-10-1
Performance Prediction of Distributed Platforms 23
The Java kernel is accessed from MATLAB via the MATLAB Java Inter-
face. However, this access is completely hidden from the user who only
uses the MATLAB functions provided by the RTC libraries. The MATLAB
libraries of the RTC Toolbox provide functions to create VCCs, plot VCCs,
and apply operators of the RTC on VCCs. From the point of view of the user,
the VCCs are MATLAB data types, even if internally they are represented
as Java objects. Similarly, the MATLAB functions for the RTC operators are
wrapper functions for the corresponding methods that are implemented in
the Java kernel.
On top of the VCC and the RTC libraries, there is the MPA library. It
provides a set of functions that facilitate the use of the RTC Toolbox for the
MPA. In particular, it contains functions to create commonly used arrival
and service curves, as well as functions to conveniently compute the outputs
of the various abstract components of the MPA framework.
1.6 Extensions
In the previous sections, we have introduced the basics of the MPA approach
based on the RTC. Recently, several extensions have been developed to refine
the analysis method.
In [4], the existing methods for analyzing heterogeneous multiprocessor
systems are extended to nonpreemptive scheduling policies. In this work,
more complex task-activation schemes are investigated as well. In particular,
components with multiple inputs and AND- or OR-activation semantics are
introduced.
The MPA approach also supports the modeling and analysis of systems
with dynamic scheduling policies. In [16], a component for the modeling of
the EDF scheduling is presented. This work also extends the ability of the
MPA framework to model and analyze hierarchical scheduling policies by
introducing appropriate server components. The TDMA policies have been
modeled using the MPA as well [20].
In Section 1.4, we have briefly described the GSC. More details about
traffic shaping in the context of multiprocessor embedded systems and the
embedding of the GSC component into the MPA framework can be found
in [19].
In many embedded systems, the events of an event stream can have
various types and impose different workloads on the systems depending on
their types. Abstract stream models for the characterization of streams with
different event types are introduced in [18]. In order to get more accurate
analysis results, these models permit to capture and exploit the knowledge
aboutcorrelationsanddependenciesbetweendifferenteventtypesinastream.
Further, in distributed embedded platforms, there often exist correlations
in the workloads imposed by events of a given type on different system
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 24 2009-10-1
24 Model-Based Design for Embedded Systems
components. In [22], a model is introduced to capture and characterize such
workload correlations in the framework of the MPA. This work shows that
the exploitation of workload correlations can lead to considerably improved
analysis results.
The theory of real-time interfaces is introduced in [14]. It connects the
principles of the RTC and the interface-based embedded system design [2].
The real-time interfaces represent a powerful extension of the MPA frame-
work. They permit an abstraction of the component behavior into interfaces.
This means that a system designer does not need to understand the details
of a component’s implementation, but only needs to know its interface in
order to ensure that the component will work properly in the system. Before
the introduction of the real-time interfaces, the MPA method was limited
to the a posteriori analysis of component-based real-time system designs.
With the real-time interfaces, it is possible to compose systems that are
correct by construction.
1.7 Concluding Remarks
In this chapter, we have introduced the reader to the system-level perfor-
mance prediction of distributed embedded platforms in the early design
stages. We have defined the problem and given a brief overview of
approaches to performance analysis.
Starting from a simple application scenario, we have presented a formal
system description method in the time domain. We have described its use-
fulness for the simulation of concrete system executions, but at the same time
we have pointed out that the method is inappropriate for worst-case analy-
sis, as in general it cannot guarantee the coverage of corner cases.
Driven by the need to provide hard performance bounds for distributed
embedded platforms, we have generalized the formalism to an abstraction
in the time interval domain based on the VCCs and the RTC. We have pre-
sented the essential models underlying the resulting framework for the MPA
and we have demonstrated its application. Finally, we have described a com-
pact representation of the VCCs that enables an efficient computation of RTC
curve operations in practice, and we have presented the RTC Toolbox for
MATLAB, the implementation of the MPA analysis framework.
Acknowledgments
The authors would like to thank Ernesto Wandeler for contributing to some
part of this chapter and Nikolay Stoimenov for helpful comments on an ear-
lier version.
Nicolescu/Model-Based Design for Embedded Systems 67842_C001 Finals Page 25 2009-10-1
Performance Prediction of Distributed Platforms 25
References
1. S. Chakraborty, S. Künzli, and L. Thiele. A general framework
for analysing system properties in platform-based embedded system
designs. In Design Automation and Test in Europe (DATE), pp. 190–195,
Munich, Germany, March 2003. IEEE Press.
2. L. de Alfaro and T. A. Henzinger. Interface theories for component-based
design. In EMSOFT ’01: Proceedings of the First International Workshop on
Embedded Software, pp. 148–165, London, U.K., 2001. Springer-Verlag.
3. M. G. Harbour, J. J. Gutiérrez García, J. C. Palencia Gutiérrez, and J. M.
Drake Moyano. Mast: Modeling and analysis suite for real time appli-
cations. In Proceedings of 13th Euromicro Conference on Real-Time Systems,
pp. 125–134, Delft, the Netherlands, 2001. IEEE Computer Society.
4. W. Haid and L. Thiele. Complex task activation schemes in system level
performance analysis. In 5th International Conference on Hardware/Software
Codesign and System Synthesis (CODES+ISSS’07), pp. 173–178, Salzburg,
Austria, October 2007.
5. S. Künzli, F. Poletti, L. Benini, and L. Thiele. Combining simulation and
formal methods for system-level performance analysis. In Design Automa-
tion and Test in Europe (DATE), pp. 236–241, Munich, Germany, 2006. IEEE
Computer Society.
6. K. Lahiri, A. Raghunathan, and S. Dey. System-level performance analy-
sis for designing on-chip communication architectures. IEEE Transactions
on CAD of Integrated Circuits and Systems, 20(6):768–783, 2001.
7. J Y. Le Boudec and P. Thiran. Network Calculus: A Theory of Deterministic
Queuing Systems for the Internet. Springer-Verlag, New York, Inc., 2001.
8. A. Maxiaguine, S. Künzli, and L. Thiele. Workload characterization
model for tasks with variable execution demand. In Design Automation
and Test in Europe (DATE), pp. 1040–1045, Paris, France, February 2004.
IEEE Computer Society.
9. The Open SystemC Initiative (OSCI). .
10. J. C. Palencia Gutiérrez and M. G. Harbour. Schedulability analysis for
tasks with static and dynamic offsets. In Proceedings of the 19th Real-Time
Systems Symposium, Madrid, Spain, 1998. IEEE Computer Society.
11. T. Pop, P. Eles, and Z. Peng. Holistic scheduling and analysis of mixed
time/event-triggered distributed embedded systems. In CODES ’02: