Handling Overload Conditions in Real-Time Systems



Introduction
This chapter deals with the problem of handling overload conditions, that is, those critical situations in which the computational demand requested by the application exceeds the processor capacity (Buttazzo, 2011). If not properly handled, an overload can cause an abrupt performance degradation, or even a system crash. Therefore, a real-time system should be designed to anticipate and tolerate unexpected overload situations through specific kernel mechanisms.
Overload conditions can occur for different causes, including bad system design, simultaneous arrival of events, operating system exceptions, malfunctioning of input devices, and unpredicted variations of the environmental conditions. In the following, we consider a set of n periodic or sporadic tasks, Γ = {τ_1, ..., τ_n}, each characterized by a worst-case execution time (WCET) C_i, a relative deadline D_i, and a period (or minimum inter-arrival time) T_i. Each task τ_i is initially activated at time Φ_i (denoted as the task phase) and generates an infinite sequence of jobs τ_{i,k} (k = 1, 2, ...). If a task τ_i is periodic, a generic job τ_{i,k} is regularly activated at time r_{i,k} = Φ_i + (k − 1)T_i. In general, the activation time of job τ_{i,k+1} satisfies r_{i,k+1} ≥ r_{i,k} + T_i. Also, each job τ_{i,k} is characterized by an absolute deadline d_{i,k} = r_{i,k} + D_i. For a set of periodic tasks, the hyperperiod H denotes the minimum interval of time after which the schedule repeats itself. For a set of periodic tasks synchronously activated at time t = 0 (Φ_i = 0 for all i), the hyperperiod is equal to the least common multiple of all the periods, that is, H = lcm(T_1, ..., T_n).
In a real-time system, the computational load depends on the temporal characteristics of the executing activities. For example, for a set of n periodic tasks, the system load is equivalent to the processor utilization factor (Liu & Layland, 1973): U_p = Σ_{i=1}^{n} C_i/T_i. A value U_p > 1 means that the total computation time requested by the task set in the hyperperiod exceeds the available time on the processor (i.e., the length H); therefore, the task set cannot be scheduled by any algorithm.
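As a quick illustration (not part of the chapter), the utilization factor can be computed directly from the (C_i, T_i) parameters; the task set below is the one later used in Figure 3:

```python
def utilization(tasks):
    """Processor utilization factor U_p = sum of C_i / T_i.
    tasks: list of (C, T) pairs (WCET, period)."""
    return sum(c / t for c, t in tasks)

# Task set of Figure 3: C_i = (2, 3, 2), T_i = (4, 6, 8).
u = utilization([(2, 4), (3, 6), (2, 8)])
print(u)  # 1.25 -> U_p > 1: permanent overload, not schedulable by any algorithm
```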
For a generic set of real-time jobs that can be dynamically activated, the system load varies at each job activation and is a function of the current time and the job deadlines. In general, if there are n active jobs at time t, with absolute deadlines d_1, d_2, ..., d_n, the instantaneous load ρ(t) can be defined as follows (Buttazzo & Stankovic, 1995): ρ(t) = max_i { Σ_{k: d_k ≤ d_i} c_k(t) / (d_i − t) }, where c_k(t) denotes the remaining worst-case computation time of the k-th job. Figure 1 shows how the instantaneous load varies as a function of time for a set of three real-time jobs {J_1, J_2, J_3} having activation times (r_i: 3, 1, 2), computation times (C_i: 2, 3, 1), and relative deadlines (D_i: 3, 6, 7). When dealing with computational load, it is important to distinguish between overload and overrun:
• A computing system is said to experience an overload when the computation time demanded by the task set in a certain interval of time exceeds the available processing time in the same interval.
• A task is said to experience an overrun when it exceeds its expected utilization. An overrun may occur either because the next job is activated before its expected arrival time (activation overrun), or because the job computation time exceeds its expected value (execution overrun).
Note that, while the overload is a condition related to the processor, the overrun is a condition related to a single job. A job overrun does not necessarily cause an overload. However, a large unexpected overrun or a sequence of overruns on multiple jobs can cause very unpredictable effects on the system, if not properly handled.
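The instantaneous-load definition can be sketched in code. This is a minimal illustration under my own data layout: each active job is a (c_rem, d) pair, with c_rem its remaining worst-case computation time and d its absolute deadline; the snapshot values are assumed, not taken from Figure 1.

```python
def instantaneous_load(t, jobs):
    """rho(t) = max over active jobs i of the total remaining demand of
    the jobs with deadline <= d_i, divided by the time left until d_i.
    jobs: list of (c_rem, d) pairs with d > t."""
    rho = 0.0
    for _, d_i in jobs:
        demand = sum(c for c, d in jobs if d <= d_i)  # work due by d_i
        rho = max(rho, demand / (d_i - t))
    return rho

# Assumed snapshot at t = 3: three jobs with remaining times 2, 1, 1
# and absolute deadlines 6, 7, 9.
print(instantaneous_load(3, [(2, 6), (1, 7), (1, 9)]))  # 0.75
```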
In this chapter, two types of overload conditions will be analyzed:
• Transient overload due to task overruns. This type of overload is due to periodic or aperiodic tasks that sporadically execute (or are activated) more than expected. Under fixed priority scheduling, an overrun in a task τ_i does not affect tasks with higher priority, but any of the lower priority tasks could miss its deadline. Under the Earliest Deadline First (EDF) scheduling algorithm (Liu & Layland, 1973), a task overrun can potentially affect all the other tasks in the system. Figure 2 shows an example of an execution overrun under EDF scheduling. In this example, task τ_3 experiences an overrun of 7 units of time (shown in light gray), since its expected execution time was C_3 = 3.
• Permanent overload in periodic task systems. This type of overload occurs when the total utilization of the periodic task set is greater than one. This can happen either because the execution requirement of the task set was not correctly estimated, or because of some unexpected activation of new periodic tasks, or because some of the current tasks increased their activation rate to react to some change in the environment. In such a situation, tasks start accumulating in the system's queues (which tend to become longer and longer, if the overload persists), and their response times tend to increase indefinitely. Figure 3 shows the effect of a permanent overload condition in a Rate Monotonic schedule, where computation times are C_i: (2, 3, 2) and periods are T_i: (4, 6, 8). Note that, since U_p = 1.25, τ_2 misses its deadline and τ_3 can never execute.

Handling transient overloads
If not properly handled, task overruns can cause serious problems in a real-time system, jeopardizing the guarantees provided for the critical tasks and causing an abrupt performance degradation.
To prevent an overrun from introducing unbounded delays on tasks' execution, the system could either decide to abort the current job experiencing the overrun or let it continue with a lower priority. The first solution is not safe, because the job could be in a critical section when aborted, thus leaving a shared resource with inconsistent data (a very dangerous situation). The second solution is much more flexible, since the degree of interference caused by the overrun on the other tasks can be tuned by acting on the priority assigned to the "faulty" task for executing the remaining computation. Such a solution can be efficiently implemented through the resource reservation approach, which is a general kernel technique for limiting the inter-task interference and isolating the temporal behavior of a task subset.

Resource reservation
Resource reservation is a general technique used in real-time systems for limiting the effects of overruns in tasks with variable computation times. According to this method, each task is assigned a fraction of the processor bandwidth, just enough to satisfy its timing constraints. The kernel, however, must prevent each task from consuming more than the requested amount, to protect the other tasks in the system (temporal protection). In this way, a task receiving a fraction U_i of the total processor bandwidth behaves as if it were executing alone on a slower processor with a speed equal to U_i times the full speed. The advantage of this method is that each task can be guaranteed in isolation, independently of the behavior of the other tasks.
A resource reservation technique for fixed priority scheduling was first presented by Mercer, Savage, and Tokuda (Mercer et al., 1994). According to this method, a task τ_i is handled by a server, which is a kernel mechanism capable of controlling the execution of the task assigned to it through a pair of parameters (Q_s, P_s) (denoted as a CPU capacity reserve). The server enables τ_i to execute for Q_s units of time every P_s. In this case, the bandwidth reserved to the task is U_s = Q_s/P_s. When the task consumes its reserved quantum Q_s, it is blocked until the next period, if the reservation is hard, or it is scheduled in background as a non-real-time task, if the reservation is soft. If the task is not finished, it is assigned another time quantum Q_s at the beginning of the next period and is scheduled as a real-time task until the budget expires, and so on. In this way, the execution of τ_i is reshaped to be more uniform along the timeline, thus avoiding long intervals of time in which τ_i prevents other tasks from running.
Under EDF scheduling, resource reservation can be efficiently implemented through the Constant Bandwidth Server (CBS), which is a service mechanism also controlled by two parameters, (Q_s, P_s), where Q_s is the server maximum budget and P_s is the server period. The ratio U_s = Q_s/P_s is denoted as the server bandwidth. At each instant, two state variables are maintained: the server deadline d_s and the actual server budget q_s. Each job handled by a server is scheduled using the current server deadline, and whenever the server executes a job, the budget q_s is decreased by the same amount. At the beginning, d_s = q_s = 0. Assuming that a job is not activated while the previous one is still active, the CBS algorithm can be formally defined as follows:
1. When a job τ_{i,j} arrives, if q_s ≥ (d_s − r_{i,j})U_s, it is assigned a server deadline d_s = r_{i,j} + P_s and q_s is recharged to the maximum value Q_s; otherwise, the job is served with the current deadline using the current budget.
2. When q_s = 0, the server budget is recharged to the maximum value Q_s and the server deadline is postponed to d_s = d_s + P_s. Note that there are no finite intervals of time in which the budget is equal to zero.
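The two CBS rules can be sketched as a small state machine. This is an illustrative sketch of the rules above (the class and its interface are my own, and actual job dispatching is left out):

```python
class CBS:
    """Constant Bandwidth Server state: budget q_s and deadline d_s,
    updated according to rules 1 and 2 above."""

    def __init__(self, Q, P):
        self.Q, self.P = Q, P
        self.U = Q / P      # server bandwidth U_s = Q_s / P_s
        self.d = 0.0        # server deadline d_s (initially 0)
        self.q = 0.0        # server budget q_s (initially 0)

    def job_arrival(self, r):
        # Rule 1: if the residual budget exceeds what can be consumed
        # by the current deadline, assign a new deadline and recharge.
        if self.q >= (self.d - r) * self.U:
            self.d = r + self.P
            self.q = self.Q
        # otherwise the job is served with the current (d_s, q_s)

    def execute(self, delta):
        # The budget decreases while the served job executes.
        self.q -= delta
        # Rule 2: on exhaustion, recharge and postpone the deadline.
        while self.q <= 0:
            self.q += self.Q
            self.d += self.P
```

For example, a server with (Q_s, P_s) = (2, 10) serving a job arriving at t = 0 gets deadline 10; after the job consumes 2 units, the budget is recharged and the deadline postponed to 20.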
As shown in (Abeni & Buttazzo, 2004), if a task τ_i is handled by a CBS with bandwidth U_s, it will never demand more than U_s, independently of the actual execution times of its jobs. As a consequence, possible overruns occurring in the served task do not create extra interference in the other tasks, but only affect τ_i.
To properly implement temporal protection, however, each task τ_i with variable computation time should be handled by a dedicated CBS with bandwidth U_si, so that it cannot interfere with the rest of the tasks for more than U_si. Figure 4 illustrates an example in which two tasks (τ_1 and τ_2) are served by two dedicated CBSs with bandwidth U_s1 = 0.15 and U_s2 = 0.1, a group of two tasks (τ_3, τ_4) is handled by a single CBS with bandwidth U_s3 = 0.25, and three hard periodic tasks (τ_5, τ_6, τ_7) with utilization U_p = 0.5 are directly scheduled by EDF, without server intercession, since their execution times are not subject to large variations. In this example, the total processor bandwidth is shared among the tasks as shown in Figure 5. The properties of the CBS guarantee that the set of hard periodic tasks (with utilization U_p) is schedulable by EDF if and only if U_p + U_s1 + U_s2 + U_s3 ≤ 1. (3)

Note that if condition (3) holds, the set of hard periodic tasks is always guaranteed to use 50% of the processor, independently of the execution times of the other tasks. Also observe that τ_3 and τ_4 are not isolated with respect to each other (i.e., one can steal processor time from the other), but they cannot interfere with the other tasks for more than one-fourth of the total processor bandwidth.
The CBS version presented in this chapter is meant for handling soft reservations. In fact, when the budget is exhausted, it is always replenished to its full value and the server deadline is postponed (i.e., the server is always active). As a consequence, a served task can execute more than Q_s in each period P_s, if there are no other tasks in the system. However, the CBS can be easily modified to enforce hard reservations, just by postponing the budget replenishment to the server deadline.

Schedulability analysis
Although a reservation R k is typically implemented using a server characterized by a budget Q k and a period T k , there are cases in which temporal isolation can be achieved by executing tasks in a static partition of disjoint time slots.
To characterize a bandwidth reservation independently of the specific implementation, Mok et al. (Mok et al., 2001) introduced the concept of bounded delay partition, which describes a reservation R_k by two parameters: a bandwidth α_k and a delay Δ_k. The bandwidth α_k measures the fraction of the resource that is assigned to the served tasks, whereas the delay Δ_k represents the longest interval of time in which the resource is not available. In general, the minimum service provided by a resource can be precisely described by its supply function (Lipari & Bini, 2003; Shin & Lee, 2003), representing the minimum amount of time the resource can provide in a given interval of time.
Definition 1. Given a reservation, the supply function Z k (t) is the minimum amount of time provided by the reservation in every time interval of length t ≥ 0.
The supply function can be defined for many kinds of reservations, such as static time partitions (Feng & Mok, 2002; Mok et al., 2001), periodic servers (Lipari & Bini, 2003; Shin & Lee, 2003), or periodic servers with arbitrary deadline (Easwaran et al., 2007). Consider, for example, a reservation in which processing time is provided only in the intervals [0,3], [6,8], and [9,10], with a period of 12 units. In this case, the minimum service occurs when the resource is requested at the beginning of the longest idle interval; hence, the supply function is the one depicted in Figure 6.
For this example we have α_k = 0.5 and Δ_k = 3. Once the bandwidth and the delay are computed, the supply function of a resource reservation can be lower bounded by the following supply bound function: sbf_k(t) = max{0, α_k(t − Δ_k)}, represented by the dashed line in Figure 6. The advantage of using such a lower bound instead of the exact Z_k(t) is that a reservation can be expressed with just two parameters. In general, for a given supply function Z_k(t), the bandwidth α_k and the delay Δ_k can be formally defined as follows: α_k = lim_{t→∞} Z_k(t)/t and Δ_k = sup_{t≥0} {t − Z_k(t)/α_k}. If a reservation is implemented using a periodic server with unspecified priority that allocates a budget Q_k every period T_k, then the supply function is the one illustrated in Figure 7, where α_k = Q_k/T_k and Δ_k = 2(T_k − Q_k). It is worth observing that reservations with smaller delays are able to serve tasks with shorter deadlines, providing better responsiveness. However, small delays can only be achieved with servers with a small period, a condition under which the context switch overhead cannot be neglected. If σ is the runtime overhead due to a context switch (subtracted from the budget every period), then the effective bandwidth of reservation R_k is B_k = (Q_k − σ)/T_k. Expressing Q_k and T_k as functions of α_k and Δ_k, we have Q_k = α_k Δ_k/(2(1 − α_k)) and T_k = Δ_k/(2(1 − α_k)). Hence, B_k = α_k − 2σ(1 − α_k)/Δ_k.

Fig. 7. A reservation implemented by a periodic server.
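These quantities can be collected into a few helper functions. This is a sketch assuming the standard bounded-delay abstraction, i.e., sbf(t) = max(0, α(t − Δ)), with α = Q/T and Δ = 2(T − Q) for a periodic server (function names are mine):

```python
def sbf(t, alpha, delta):
    """Supply bound function: max(0, alpha * (t - delta))."""
    return max(0.0, alpha * (t - delta))

def periodic_server_abstraction(Q, T):
    """Bandwidth and delay of a periodic server allocating a budget Q
    every period T: alpha = Q / T, delta = 2 * (T - Q)."""
    return Q / T, 2 * (T - Q)

def effective_bandwidth(Q, T, sigma):
    """Bandwidth remaining when a context-switch overhead sigma is
    subtracted from the budget every period: (Q - sigma) / T."""
    return (Q - sigma) / T

# A server with Q = 2, T = 4 gives alpha = 0.5 and delta = 4.
alpha, delta = periodic_server_abstraction(2, 4)
print(alpha, delta, sbf(10, alpha, delta))  # 0.5 4 3.0
```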
Within a reservation, the schedulability analysis of a task set under fixed priorities can be performed through the following theorem (Bini et al., 2009):

Theorem 1 (Bini et al., 2009). A set of preemptive periodic tasks with relative deadlines less than or equal to periods can be scheduled by a fixed priority algorithm, under a reservation characterized by a supply function Z_k(t), if and only if ∀i ∃t ∈ (0, D_i]: W_i(t) ≤ Z_k(t), where W_i(t) represents the Level-i workload, computed as follows: W_i(t) = C_i + Σ_{j<i} ⌈t/T_j⌉ C_j.

Similarly, the schedulability analysis of a task set under EDF can be performed using the following theorem (Bini et al., 2009):

Theorem 2 (Bini et al., 2009). A set of preemptive periodic tasks with utilization U_p and relative deadlines less than or equal to periods can be scheduled by EDF, under a reservation characterized by a supply function Z_k(t), if and only if U_p < α_k and ∀t > 0: dbf(t) ≤ Z_k(t), where dbf(t) is the Demand Bound Function (Baruah et al., 1990) defined as dbf(t) = Σ_{i=1}^{n} ⌊(t + T_i − D_i)/T_i⌋ C_i.

In the specific case in which Z_k(t) is lower bounded by the supply bound function sbf_k(t) = max{0, α_k(t − Δ_k)}, the test becomes only sufficient, and the set of testing points can be restricted as stated in the following theorem (Bertogna et al., 2009):

Theorem 3 (Bertogna et al., 2009). A set of preemptive periodic tasks with utilization U_p and relative deadlines less than or equal to periods is schedulable by EDF, under a reservation lower bounded by the supply bound function sbf_k(t), if U_p < α_k and dbf(t) ≤ sbf_k(t) holds for all t in a finite set of absolute deadlines, bounded by a maximum testing interval that depends on U_p, α_k, and Δ_k.
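A sketch of the EDF test is given below, assuming the standard demand bound function dbf(t) = Σ ⌊(t + T_i − D_i)/T_i⌋ C_i and the supply bound function max(0, α(t − Δ)). The testing horizon is left as a caller-chosen parameter rather than the exact bound of Bertogna et al. (2009), so this is a simplified illustration, not the authors' test:

```python
from math import floor

def dbf(t, tasks):
    """Demand Bound Function for periodic tasks given as (C, T, D) triples:
    dbf(t) = sum of floor((t + T_i - D_i) / T_i) * C_i."""
    return sum(floor((t + T - D) / T) * C for C, T, D in tasks)

def edf_schedulable_in_reservation(tasks, alpha, delta, horizon):
    """Sufficient test sketch: U_p < alpha and dbf(t) <= sbf(t) at every
    absolute deadline up to `horizon` (assumed >= max relative deadline)."""
    if sum(C / T for C, T, _ in tasks) >= alpha:
        return False
    deadlines = sorted({D + k * T for _, T, D in tasks
                        for k in range((horizon - D) // T + 1)})
    return all(dbf(t, tasks) <= max(0.0, alpha * (t - delta))
               for t in deadlines)
```

For instance, two tasks (C, T, D) = (1, 4, 4) and (2, 8, 8) with U_p = 0.5 pass the test inside a reservation with α = 0.75 and Δ = 2, but fail it when α = 0.5 (the condition U_p < α is violated).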

Handling wrong reservations
Although resource reservation is essential for achieving predictability in the presence of tasks with variable execution times, the overall system performance becomes quite dependent on a correct bandwidth allocation. In fact, if the CPU bandwidth allocated to a task is much less than its average requested value, the task may slow down too much, degrading the system's performance. On the other hand, if the allocated bandwidth is much greater than the actual needs, the system will run with low efficiency, wasting the available resources. This problem can be solved by using capacity sharing mechanisms that can transfer unused budgets to the reservations that need more bandwidth.
Capacity sharing algorithms have been developed both under fixed priority servers (Bernat et al., 2004; Bernat & Burns, 2002) and dynamic priority servers (Caccamo et al., 2000). For example, the CASH algorithm (Caccamo et al., 2005) extends the CBS to include slack reclamation. When a server becomes idle with residual budget, the slack is inserted in a queue of spare budgets (CASH queue) ordered by server deadlines. Whenever a new server is scheduled for execution, it first uses any CASH budget whose deadline is less than or equal to its own.
The bandwidth inheritance (BWI) algorithm (Lamastra et al., 2001) applies the idea of priority inheritance to CPU resources in the CBS, allowing a blocking low-priority process to steal resources from a blocked higher-priority process. IRIS (Marzario et al., 2004) enhances the CBS with fairer slack reclaiming, so that slack is not reclaimed until all current jobs have been serviced and the processor is idle. BACKSLASH (Lin & Brandt, 2005) is another algorithm that enhances the efficiency of the reclaiming mechanism under EDF.
Wrong reservations can also be handled through feedback scheduling. If the operating system is able to monitor the actual execution time e_{i,k} of each task instance, the actual maximum computation time of a task τ_i can be estimated (in a moving window of the most recent jobs) as Ĉ_i = max_k e_{i,k}, and the actual requested bandwidth as Û_i = Ĉ_i/T_i. Hence, Û_i can be used as a reference value in a feedback loop to adapt the reservation bandwidth allocated to the task according to its actual needs. If more reservations are adapted online, we must ensure that the overall allocated bandwidth does not exceed the processor utilization; hence, a form of global feedback adaptation is required to prevent an overload condition. Similar approaches to achieve adaptive reservations have been proposed by Abeni and Buttazzo (Abeni & Buttazzo, 2001) and by Palopoli et al. (Palopoli et al., 2002).
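The moving-window estimate can be sketched as follows (class, interface, and window size are my own choices, not taken from the cited papers):

```python
from collections import deque

class BandwidthEstimator:
    """Tracks the maximum observed execution time of the last `window`
    jobs of a task and derives the requested bandwidth U_hat = C_hat / T."""

    def __init__(self, period, window=10):
        self.T = period
        self.samples = deque(maxlen=window)  # old samples fall out automatically

    def observe(self, e):
        """Record the measured execution time e of a completed job."""
        self.samples.append(e)

    def C_hat(self):
        """Estimated maximum computation time over the window."""
        return max(self.samples)

    def U_hat(self):
        """Estimated requested bandwidth."""
        return self.C_hat() / self.T
```

U_hat would then drive the feedback loop that resizes the task's reservation, subject to a global check that the total allocated bandwidth stays below one.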

Handling permanent overloads
This section presents some methodologies for handling permanent overload conditions occurring in periodic task systems when the total processor utilization exceeds one. Basically, there are three methods to reduce the load:
• Job skipping. This method reduces the total load by properly skipping (i.e., aborting) some job executions in the periodic tasks, in such a way that a minimum number of jobs per task is guaranteed to execute within their timing constraints.
• Period adaptation. According to this approach, the load is reduced by enlarging task periods to suitable values, so that the total workload can be kept below a desired threshold.
• Service adaptation. According to this method, the load is reduced by decreasing the computational requirements of the tasks, trading predictability with quality of service.

Job skipping
The computational load of a set of periodic tasks can be reduced by properly skipping a few jobs in the task set, in such a way that the remaining jobs can be scheduled within their deadlines. This approach is suitable for real-time applications characterized by soft or firm deadlines, such as those typically found in multimedia systems, where skipping a video frame once in a while is better than processing it with a long delay. Even in certain control applications, the sporadic skip of some jobs can be tolerated when the controlled system is characterized by a high inertia.
To understand how job skipping can make an overloaded system schedulable, consider the following example, consisting of two tasks, with computation times C_1 = 2 and C_2 = 8 and periods T_1 = 4 and T_2 = 12. Since the processor utilization factor is U_p = 14/12 > 1, the system is in a permanent overload, and the tasks cannot be scheduled within their deadlines. Nevertheless, Figure 8 shows that, by skipping one job every three in task τ_1, the overload can be resolved and all the remaining jobs can be scheduled within their deadlines. In order to control the overall system load, it is important to derive the relation between the number of skips (i.e., the number of aborted jobs per task) and the total computational demand. In 1995, Koren and Shasha (Koren & Shasha, 1995) proposed a new task model (known as the firm periodic model) suited to be handled by this technique. According to this model, each periodic task τ_i is characterized by the following parameters: τ_i(C_i, T_i, D_i, S_i), where C_i is the worst-case computation time, T_i its period, D_i its relative deadline (assumed to be equal to the period), and S_i a skip parameter, 2 ≤ S_i ≤ ∞, expressing the minimum distance between two consecutive skips. For example, if S_i = 5 the task can skip one instance every five. When S_i = ∞ no skips are allowed and τ_i is equivalent to a hard periodic task. The skip parameter can be viewed as a Quality of Service (QoS) metric (the higher S_i, the better the quality of service).
Using the terminology introduced by Koren and Shasha (Koren & Shasha, 1995), every job of a periodic task can be red or blue: a red job must be completed within its deadline, whereas a blue job can be aborted at any time. To meet the constraint imposed by the skip parameter S_i, each scheduling algorithm must have the following characteristics:
• if a blue job is skipped, then the next S_i − 1 jobs must be red;
• if a blue job completes successfully, the next job is also blue.
The authors showed that making optimal use of skips is NP-hard and presented two algorithms (one working under Rate Monotonic and one under EDF) that exploit skips to schedule slightly overloaded systems. In general, these algorithms are not optimal, but they become optimal under a particular condition, called the deeply-red condition.
Definition 2. A system is deeply-red if all tasks are synchronously activated and the first S_i − 1 instances of every task τ_i are red.
Koren and Shasha showed that the worst case for a periodic skippable task set occurs when tasks are deeply-red. For this reason, the feasibility tests are derived under this condition, so that, if a task set is schedulable under the deeply-red condition, it is also schedulable in any other situation.

Schedulability analysis
The feasibility analysis of a set of firm tasks can be performed through the Processor Demand Criterion (Baruah et al., 1990) under the deeply-red condition, assuming that in the worst case all blue jobs are aborted. In such a worst-case scenario, the processor demand of τ_i due to the red jobs in an interval [0, t] can be obtained as the difference between the demand of all the jobs and the demand of the blue jobs: dbf_i^skip(t) = (⌊t/T_i⌋ − ⌊t/(T_i S_i)⌋) C_i. (15) Hence, the feasibility of the task set can be verified through the following theorem.
Theorem 4 (Koren and Shasha, 1995). A set of firm periodic tasks is schedulable by EDF if ∀t ≥ 0: Σ_{i=1}^{n} dbf_i^skip(t) ≤ t. (16) A necessary condition can be easily derived by observing that a schedule is certainly infeasible when the utilization factor due to the red jobs is greater than one.
Theorem 5 (Koren and Shasha, 1995). A necessary condition for the schedulability of a set of firm periodic tasks is Σ_{i=1}^{n} (C_i/T_i)·(S_i − 1)/S_i ≤ 1. (17) To better clarify the concepts mentioned above, consider the task set shown in Figure 9 and the corresponding feasible schedule, obtained by EDF. Note that the processor utilization factor is greater than 1 (U_p = 1.25), but both conditions (16) and (17) are satisfied. If skips are permitted in the periodic task set, the spare time saved by rejecting the blue instances can be reallocated for other purposes, for example, for scheduling slightly overloaded systems or for advancing the execution of soft aperiodic requests.
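The skip-aware demand can be checked directly. In the sketch below, the test points are all job deadlines up to the least common multiple of the T_i·S_i values, which covers the repeating pattern for simple examples like the two-task set introduced earlier; this choice of horizon is my simplification, not the authors' formal bound:

```python
from math import floor, lcm, inf

def dbf_skip(t, C, T, S):
    """Red-job demand of one firm task in [0, t]:
    (floor(t/T) - floor(t/(T*S))) * C; S may be math.inf (no skips)."""
    skipped = 0 if S == inf else floor(t / (T * S))
    return (floor(t / T) - skipped) * C

def feasible_with_skips(tasks):
    """EDF test sketch: total red demand <= t at every job deadline
    up to lcm of T_i * S_i (S = inf treated as a plain period)."""
    H = lcm(*(T * (S if S != inf else 1) for _, T, S in tasks))
    points = {k * T for _, T, _ in tasks for k in range(1, H // T + 1)}
    return all(sum(dbf_skip(t, C, T, S) for C, T, S in tasks) <= t
               for t in sorted(points))

# The two tasks of the earlier example: (C, T, S) = (2, 4, 3) and (8, 12, inf).
print(feasible_with_skips([(2, 4, 3), (8, 12, inf)]))  # True: feasible with skips
```

Without the skip in τ_1 (S_1 = ∞), the same task set fails the test, consistently with U_p = 14/12 > 1.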

[Figure 9: table of task parameters — Task, C_i, T_i, D_i, S_i]
Unfortunately, the spare time has a "granular" distribution and cannot be reclaimed at any time. Nevertheless, it can be shown that skipping blue instances still produces a bandwidth saving in the periodic schedule. Caccamo and Buttazzo (Caccamo & Buttazzo, 1997) identified the amount of bandwidth saved by skips using a simple parameter, the equivalent utilization factor U_p^skip, which can be defined as U_p^skip = max_{t>0} { Σ_i dbf_i^skip(t) / t }, (18) where dbf_i^skip(t) is given in Equation (15).
Using this definition, the schedulability of a deeply-red skippable task set can also be verified using the following theorem (Caccamo & Buttazzo, 1997): Theorem 6 (Caccamo and Buttazzo, 1997). A set of firm periodic tasks is schedulable in deeply-red conditions if U_p^skip ≤ 1. Note that, setting S_i = ∞ in Equation (18), U_p can also be defined as U_p = max_{t>0} { Σ_i ⌊t/T_i⌋ C_i / t }. The bandwidth saved by skips can also be exploited by an aperiodic server to advance the execution of aperiodic tasks.

Period adaptation
There are several real-time applications in which timing constraints are not rigid, but depend on the system state. The possibility of varying tasks' rates increases the flexibility of the system in handling overload conditions, providing a more general admission control mechanism. For example, if the total utilization of the task set is greater than one, the system could reduce the utilizations of some tasks (by increasing their periods in a controlled fashion) to decrease the total load.
The elastic model presented in this section (originally introduced by Buttazzo et al. and later extended by the same authors to deal with resource constraints (Buttazzo et al., 2002)) provides a novel theoretical framework for flexible workload management in real-time applications.

The elastic model
The basic idea behind the elastic model is to consider each task as flexible as a spring with a given rigidity coefficient and length constraints. In particular, the utilization of a task is treated as an elastic parameter, whose value can be modified by changing the period within a specified range. Each task is characterized by four parameters: a computation time C_i, a minimum period T_i^min, a maximum period T_i^max, and an elastic coefficient E_i ≥ 0, which specifies the flexibility of the task to vary its utilization for adapting the system to a new feasible rate configuration. The greater E_i, the more elastic the task. Thus, an elastic task is denoted as τ_i(C_i, T_i^min, T_i^max, E_i). In the following, T_i denotes the actual period of task τ_i, which is constrained to be in the range [T_i^min, T_i^max]. Any task can vary its period according to its needs within the specified range. Any variation, however, is subject to an elastic guarantee and is accepted only if there exists a feasible schedule in which all the other periods are within their ranges.
It is worth noting that the elastic model is more general than the classical Liu and Layland's task model (Liu & Layland, 1973), so it does not prevent a user from defining hard real-time tasks. In fact, a task having T max i = T min i is equivalent to a hard real-time task with fixed period, independently of its elastic coefficient. A task with E i = 0 can arbitrarily vary its period within its specified range, but it cannot be varied by the system during load reconfigurations. Under the elastic model, given a set of n periodic tasks with utilization U p > 1, the objective of the elastic guarantee is to compress tasks' utilization factors to achieve a new desired utilization U d ≤ 1 such that all the periods are within their ranges.
The following definitions are also used in this section: U_i^max = C_i/T_i^min, U_i^min = C_i/T_i^max, U_max = Σ_{i=1}^{n} U_i^max, and U_min = Σ_{i=1}^{n} U_i^min. Clearly, a solution can always be found if U_min ≤ U_d; hence, this condition has to be verified a priori.
To understand how an elastic guarantee is performed in this model, it is convenient to compare an elastic task τ_i having utilization U_i and elasticity E_i with a linear spring S_i characterized by a length x_i and a rigidity coefficient k_i, equivalent to the inverse of the task's elasticity (k_i = 1/E_i). In this comparison, the nominal length x_i0 of the spring is equivalent to U_i^max, whereas the minimum length x_i^min is equivalent to U_i^min. Hence, a set of n periodic tasks with total utilization factor U_p = Σ_{i=1}^{n} U_i can be viewed as a sequence of n springs with total length L = Σ_{i=1}^{n} x_i.
In the special case in which U_i^min = 0 for all tasks, the compressed task utilizations can be derived by solving a set of n linear spring equations, under the constraint that Σ_{i=1}^{n} U_i = U_d. The resulting expression is: U_i = U_i^max − (U_max − U_d) E_i/E_s, (19) where E_s = Σ_{i=1}^{n} E_i. If each spring has a length constraint, in the sense that its length cannot be less than a minimum value x_i^min, the problem of finding the values x_i requires an iterative solution. In fact, if during compression one or more springs reach their minimum length, the additional compression force will only deform the remaining springs. Such a situation is depicted in Figure 10.
Thus, at each instant, the set Γ can be divided into two subsets: a set Γ_f of fixed springs having minimum length (equivalent to tasks that reached their minimum utilization with the maximum period), and a set Γ_v of variable springs that can still be compressed. If U_v^max is the sum of the maximum utilizations of tasks in Γ_v, and U_f is the total utilization factor of tasks in Γ_f, then, to achieve a desired utilization U_d ≤ 1, each task in Γ_v has to be compressed up to the following utilization: U_i = U_i^max − (U_v^max − U_d + U_f) E_i/E_v, (20) where E_v = Σ_{τ_i ∈ Γ_v} E_i.

Fig. 10. Springs with minimum length constraints (a); during compression, spring S_2 reaches its minimum length and cannot be compressed any further (b).
If there are tasks for which U_i < U_i^min, then the period of those tasks has to be fixed at its maximum value T_i^max (so that U_i = U_i^min), sets Γ_f and Γ_v must be updated (hence, U_f and E_v recomputed), and Equation (20) applied again to the tasks in Γ_v. If there is a feasible solution, that is, if U_min ≤ U_d, the iterative process ends when each value U_i computed by Equation (20) is greater than or equal to its corresponding minimum U_i^min. The algorithm for compressing a set Γ of n elastic tasks up to a desired utilization U_d is shown in Figure 11.
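A minimal version of this iterative compression can be sketched as follows (the data layout and names are mine; Figure 11 gives the authors' pseudocode, and corner cases such as a fully inelastic task set are not handled here):

```python
def elastic_compress(tasks, U_d):
    """Iteratively compress task utilizations down to U_d.
    tasks: list of (C, T_min, T_max, E) tuples. Returns the list of
    compressed utilizations U_i, or None if U_d < U_min."""
    U_min = [C / T_max for C, _, T_max, _ in tasks]
    U_max = [C / T_min for C, T_min, _, _ in tasks]
    if sum(U_min) > U_d:
        return None                       # no feasible solution
    U = U_max[:]                          # start from nominal utilizations
    fixed = [E == 0 for *_, E in tasks]   # inelastic tasks are never varied
    while True:
        U_f = sum(u for u, f in zip(U, fixed) if f)       # fixed utilization
        E_v = sum(t[3] for t, f in zip(tasks, fixed) if not f)
        Uv_max = sum(u for u, f in zip(U_max, fixed) if not f)
        done = True
        for i, (task, f) in enumerate(zip(tasks, fixed)):
            if f:
                continue
            # Compression formula: U_i = U_i_max - (Uv_max - U_d + U_f) * E_i / E_v
            U[i] = U_max[i] - (Uv_max - U_d + U_f) * task[3] / E_v
            if U[i] < U_min[i]:           # period pinned at T_max: move to fixed set
                U[i] = U_min[i]
                fixed[i] = True
                done = False
        if done:
            return U
```

For example, two tasks with C = 1, T_min = 2, T_max = 10, and E = 1 compressed to U_d = 0.8 end up with U_1 = U_2 = 0.4; tightening U_d can force one task to its maximum period, after which only the remaining task absorbs further compression.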
All tasks' utilizations that have been compressed to cope with an overload situation can return toward their nominal values when the overload is over.
The elastic compression algorithm can be efficiently implemented on top of a real-time kernel as a routine (elastic manager) that is activated every time a new task is created, terminated, or there is a request for a period change. When activated, the elastic manager computes the new periods according to the compression algorithm and modifies them atomically.
To avoid any deadline miss during the transition phase, it is crucial to ensure that all the periods are modified at opportune time instants. In particular, the period of a task τ_i can be increased at any time, but can only be reduced at the next job activation. The earliest instant at which a period can be safely reduced without causing any deadline miss in the transition phase was computed by Buttazzo et al. (Buttazzo et al., 2002) and later improved by Guangming (Guangming, 2009).

Period rescaling
If the elastic coefficients are set equal to the task nominal utilizations, elastic compression has the effect of a simple rescaling, where all the periods are increased by the same percentage.

Fig. 11. Algorithm Elastic_compression(Γ, U_d): given a task set Γ and a desired utilization U_d ≤ 1, it modifies the task periods so that U_p = U_d.

This means that in overload situations (U_max > 1) the compression algorithm causes all task periods to be increased by a common scale factor η = U_max/U_d. Note that after compression is performed, the total processor utilization becomes U_p = Σ_{i=1}^{n} C_i/(η T_i^min) = U_max/η = U_d. If a maximum period needs to be defined for some task, an online guarantee test can easily be performed before compression to check whether all the new periods are less than or equal to the maximum value. This can be done in O(n) by testing whether ∀i = 1, ..., n: η T_i^min ≤ T_i^max. By deciding to apply period rescaling, we lose the freedom of choosing the elastic coefficients, since they must be set equal to the task maximum utilizations (E_i = U_i^max). However, this technique has the advantage of leaving the task periods ordered as in the nominal configuration, which simplifies the compression algorithm in the presence of resource constraints and enables its usage in fixed priority systems, where priorities are typically assigned based on periods.
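The rescaling itself is a one-liner. The sketch below reuses the overloaded task set of Figure 3 (C_i: 2, 3, 2; T_i: 4, 6, 8) as the nominal configuration, which is my choice of example:

```python
def rescale_periods(tasks, U_d):
    """Period rescaling: every minimum period is multiplied by
    eta = U_max / U_d, so the new total utilization equals U_d.
    tasks: list of (C, T_min) pairs."""
    U_max = sum(C / T for C, T in tasks)
    eta = U_max / U_d
    return eta, [T * eta for _, T in tasks]

eta, periods = rescale_periods([(2, 4), (3, 6), (2, 8)], 1.0)
print(eta, periods)  # 1.25 [5.0, 7.5, 10.0]
```

With η = 1.25, the rescaled utilization is 2/5 + 3/7.5 + 2/10 = 1.0, and the period ordering of the nominal configuration is preserved.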

Service adaptation
A third method for coping with a permanent overload condition is to reduce the load by decreasing the task computation times. This can be done only if the tasks have been originally designed to trade performance for computational requirements. When tasks use some incremental algorithm to produce approximate results, the precision of the results is related to the number of iterations, and thus to the computation time. In this case, an overload condition can be handled by reducing the quality of the results, aborting the remaining computation whenever the quality of the current results is acceptable.
The concept of imprecise and approximate computation has emerged as a new approach to increasing flexibility in dynamic scheduling by trading computation accuracy with timing requirements. If the processing time is not enough to produce high-quality results within the deadlines, there may still be enough time for producing approximate results with a lower quality. This concept has been formalized by many authors (Liu et al., 1987; Natarajan, 1995; Shih et al., 1991), and specific techniques have been developed for designing programs that can produce partial results.
In a real-time system that supports imprecise computation, every task τ_i is decomposed into a mandatory subtask M_i and an optional subtask O_i. The mandatory subtask is the portion of the computation that must be done in order to produce a result of acceptable quality, whereas the optional subtask refines this result (Shih et al., 1989). Both subtasks have the same activation time r_i and the same deadline d_i as the original task τ_i; however, O_i becomes ready for execution when M_i is completed. If C_i is the worst-case computation time associated with the task, subtasks M_i and O_i have computation times m_i and o_i, such that m_i + o_i = C_i. In order to guarantee a minimum level of performance, M_i must be completed within its deadline, whereas O_i can be left incomplete, if necessary, at the expense of the quality of the result produced by the task.
It is worth noting that the task model used in traditional real-time systems is a special case of the one adopted for imprecise computation. In fact, a hard task corresponds to a task with no optional part (o i = 0), whereas a soft task is equivalent to a task with no mandatory part (m i = 0).
In systems that support imprecise computation, the error ε_i in the result produced by τ_i (or simply the error of τ_i) is defined as the length of the portion of O_i discarded in the schedule. If σ_i is the total processor time assigned to O_i by the scheduler, the error of task τ_i is equal to

ε_i = o_i − σ_i.

The average error ε on the task set is defined as

ε = Σ_{i=1..n} w_i ε_i,

where w_i is the relative importance of τ_i in the task set. An error ε_i > 0 means that a portion of subtask O_i has been discarded in the schedule at the expense of the quality of the result produced by task τ_i, but for the benefit of other mandatory subtasks that can complete within their deadlines.
In this model, a schedule is said to be feasible if every mandatory subtask M_i is completed within its deadline. A schedule is said to be precise if the average error ε on the task set is zero. In a precise schedule, all mandatory and optional subtasks are completed within their deadlines.
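The error metric defined above can be illustrated with a small numeric sketch (all values are illustrative; o_i is the length of the optional part, σ_i the service it actually received, and w_i its weight):

```python
def task_errors(o, sigma):
    """epsilon_i = o_i - sigma_i: length of the optional part left undone."""
    return [oi - si for oi, si in zip(o, sigma)]

def average_error(o, sigma, w):
    """Weighted average error over the task set: sum of w_i * epsilon_i."""
    return sum(wi * ei for wi, ei in zip(w, task_errors(o, sigma)))

# Three tasks: the first completed its optional part, the second received only
# 2 of 6 units of optional service, the third is hard (no optional part).
eps = task_errors([4, 6, 0], [4, 2, 0])
avg = average_error([4, 6, 0], [4, 2, 0], [0.5, 0.3, 0.2])
```

With these numbers the schedule is feasible but not precise, since the second task contributes a positive error.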
For a set of periodic tasks, the problem of deciding the best level of quality compatible with a given load condition can be solved by associating with the optional part of each task a reward function R_i(σ_i), which indicates the reward accrued by the task when it receives σ_i units of service beyond its mandatory portion. This problem has been addressed by Aydin et al. (2001), who presented an optimal algorithm that maximizes the weighted average of the rewards over the task set.
Note that in the absence of a reward function, the problem can easily be solved by using a compression algorithm like the elastic approach. In fact, once the new task utilizations U_i are computed, the new computation times C_i that lead to a given desired load can easily be computed from the periods as

C_i = U_i T_i.

Finally, if an algorithm cannot be executed in an incremental fashion, or cannot be aborted at any time, a task can be provided with multiple versions, each characterized by a different quality of performance and a different execution time. Then, the value C_i can be used to select the task version having the computation time closest to, but no larger than, C_i.
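Both steps can be sketched in a few lines of Python (the helper names and all numbers are illustrative assumptions): the compressed utilizations give the new budgets C_i = U_i T_i, and for a multi-version task we pick the version whose WCET is closest to, but not larger than, the assigned budget.

```python
def new_budgets(utilizations, periods):
    """New computation-time budgets C_i = U_i * T_i."""
    return [u * t for u, t in zip(utilizations, periods)]

def pick_version(versions, budget):
    """versions: list of per-version WCETs (longer WCET = higher quality).
    Returns the highest-quality version that still fits the budget."""
    feasible = [c for c in versions if c <= budget]
    if not feasible:
        raise ValueError("no version fits the assigned budget")
    return max(feasible)

budgets = new_budgets([0.2, 0.3], [50, 40])     # compressed U_i, periods in ms
chosen = pick_version([4, 9, 14], budgets[0])   # task versions with WCET 4, 9, 14
```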

Two case studies
This section describes two real-world applications to illustrate how the presented techniques can be used to prevent the negative effects of an overload. The first example considers a multimedia system and shows how resource reservation can isolate the timing behavior of concurrent applications characterized by highly variable execution times, preventing a performance degradation due to reciprocal interference. The second example illustrates how to handle a permanent overload in a robot system that activates a new task to cope with obstacle avoidance.

Resource reservation in multimedia systems
Let us consider a multimedia device, like a cell phone, in which a phone call, a video player, and a web browser can be concurrently executed, so that a user can simultaneously make a phone call while watching a video and downloading a file from the web. These applications are characterized by a highly variable computational demand and share common resources (e.g., processor, memory, touch screen, audio codec, and graphic display). They are briefly described below.
• Phone call (A 1 ). This application consists of at least two periodic activities, executed with a period of 20 ms. The first task is in charge of receiving the incoming audio signal, decoding it, and transferring the packets to the speaker buffer for reproduction. Its processing time can vary from 1 ms (during silence) up to 3 ms. The second task is responsible for sampling the voice from the microphone, performing data encoding, speech enhancement, and packet transmission through the modem. Its processing time can vary from 5 ms (during silence) up to 10 ms. Hence, overall, this application requires a processor bandwidth that can vary from 30% to 65%.
• Video player (A 2 ). The MPEG standard adopted for video compression is characterized by highly variable execution times. For instance, Figure 12 shows the typical distribution of frame decoding times for an MPEG player decoding a specific video on a given platform (Abeni & Buttazzo, 2000;Isovic et al., 2005). Note that the processing time of this task can vary from 6 ms to 30 ms, with an average decoding time of about 12 ms. If using the PAL standard, the frame rate is set to 25 frames per second, meaning that each frame has to be processed every 40 ms. Therefore, running an MPEG player requires an average processor bandwidth of 30%, which can reach 75% in peak load conditions.
• Web browser (A 3 ). This activity is also characterized by high load variations. In fact, when loading a web page, there is a peak processing load to parse the input, taking about one second. Then, requests are submitted to the server for additional content, such as images and layout files. At this point, the processing pauses for about another second while waiting for the new input. When the full data arrive, there is another peak load, since the data must be decoded, parsed, and rendered as quickly as possible (this phase takes a few seconds, depending on the content).
Note that each of the considered applications is typically implemented as a set of tasks with different priorities. Hence, when the applications are concurrently executed on the same processor, their tasks are subject to reciprocal interference and can experience long blocking delays and jerky behavior. As a consequence, the user could experience a temporary freeze of the movie, or perceive an annoying jitter in the sound. Although these applications are not safety critical, the unpleasant effects of such interference on the user's perception are taken seriously by developers, since they can make the difference with respect to a competitor's device.
Resource reservation can be effectively used in this system to isolate the temporal behavior of the applications and limit their reciprocal interference (Bini et al., 2011). To do that, the processor should be partitioned into three reservations, with bandwidths U_1^s, U_2^s, and U_3^s, each behaving as a slower processor running at speed U_i^s. The advantage of this approach is that an overrun occurring in one application does not affect the other applications, but only has the effect of postponing the execution of the tasks of the application in which the overrun is generated. Moreover, an application can be designed and analyzed independently of the others, because its execution behavior only depends on its own computational demand and the allocated bandwidth.
In our system, allocating each bandwidth to satisfy the worst-case processing demand of the corresponding application would waste resources and would lead to an infeasible schedule. For instance, the maximum bandwidth requirements of the first two applications alone (65% + 75%) already exceed the full processor capacity. To achieve a feasible schedule, the bandwidth can instead be reserved to satisfy a processing demand slightly higher than the average value, handling sporadic overruns through the resource reservation mechanism. In the considered example, 40% of the processor can be reserved to the phone call, 50% to the video player, and the remaining 10% to the web browser, which has less stringent timing constraints.
When the amount of allocated bandwidth turns out to be quite different from the actual application needs, adaptive approaches based on feedback mechanisms can be applied at runtime to adjust the allocated bandwidth to the real resource needs (Abeni & Buttazzo, 1999; Bini et al., 2011).
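A feedback-based bandwidth adaptor in the spirit of the adaptive reservation work cited above can be sketched as follows. This is a hedged sketch under assumed parameters: the gain, the bounds, and the function name `adapt_bandwidth` are illustrative, not the published algorithm. At each step, the reservation is nudged toward the bandwidth demand observed over the last interval, within given bounds.

```python
def adapt_bandwidth(reserved, observed_demand, gain=0.5, b_min=0.05, b_max=0.8):
    """One adaptation step: move the reservation toward the observed demand,
    then clamp it to the allowed range [b_min, b_max]."""
    new_b = reserved + gain * (observed_demand - reserved)
    return min(b_max, max(b_min, new_b))

# Sustained demand above an initially under-provisioned 10% reservation:
# the reserved bandwidth converges toward the 25% actually needed.
b = 0.10
for demand in [0.25, 0.25, 0.25]:
    b = adapt_bandwidth(b, demand)
```

The clamping keeps the sum of all reservations controllable by an outer admission test, which must still verify that the total reserved bandwidth does not exceed the processor capacity.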

Overload handling in robot control systems
Let us consider a mobile robot system whose goal is to explore an unknown environment to localize given targets through a dedicated sensor, while avoiding obstacles along the path using proximity sensing. Note that, by equipping the robot with suitable sensors, the system could be used for very different applications; for instance, to discover electrical sockets in a room, using a video camera, or to localize unexploded mines in a field, using a georadar. For the purpose of this chapter, we consider a robot equipped with two motors to move in two directions on a flat surface, two encoders to measure the angular rotation of the wheels and reconstruct the traveled path, a sensor to detect the target (target sensor), a proximity sensor (e.g., based on ultrasound transducers) to detect the distance from possible obstacles along the path, and an electronic compass to orient the trajectory in desired directions.
From the software point of view, the application consists of the following periodic tasks:
• Motor Control Task (MCT or τ 1 ): it performs the low-level motor control loop to drive the robot in a desired direction (θ) at a given speed (v);
• Obstacle Detection Task (ODT or τ 2 ): it reads the proximity sensor to detect a possible obstacle along the path;
• Target Detection Task (TDT or τ 3 ): it reads the target sensor and stores the target location in a buffer when it is detected;
• Exploration Task (EXT or τ 4 ): it generates the proper set points (exploring direction and speed) for the Motor Control Task;
• Obstacle Avoidance Task (OAT or τ 5 ): it is activated when an obstacle is detected and computes a sequence of set points to be followed to avoid the obstacle and return to the planned path.
To illustrate the use of an overload management technique, we assume that in normal operating conditions (i.e., in the absence of obstacles) the first four tasks (τ 1 ,...,τ 4 ) generate a load equal to U norm = 0.9, while the fifth task has a utilization U 5 = 0.3. Hence, when τ 5 is activated together with the other tasks, the total system utilization becomes U max = 1.2. In this example, the elastic approach is applied to bring the load back to a desired value U d = 0.9 (equal to the normal load condition).
The tasks are organized as shown in Figure 13, where hardware components are represented by rounded boxes, tasks by circles, and shared buffers by rectangles.
To apply the elastic method, each task τ_i must specify a range of valid periods [T_i^min, T_i^max] and an elastic coefficient E_i. The task parameters are reported in Table 1 and are expressed in milliseconds. Note that, when the OAT task is not active, the utilization of the task set is

U_norm = Σ_{i=1..4} C_i / T_i = 0.9,

whereas, when OAT is active, the system becomes overloaded, being

U_max = Σ_{i=1..5} C_i / T_i = 1.2.

By applying the elastic approach with a desired utilization U_d = 0.9 (to keep a safety margin), the task utilizations are compressed according to Equation (19) and then enforced by re-computing the periods as T_i = C_i / U_i. The new task utilizations and periods derived by the elastic algorithm are reported in Table 2.
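The compression step for this scenario can be sketched numerically. Since Table 1 is not reproduced here, the utilizations, WCETs, and elastic coefficients below are assumed values, chosen only to be consistent with U_norm = 0.9 and U_5 = 0.3; the formula shown is the single-round compression rule (without period saturation at T_max):

```python
def compress(u_nom, e, u_d):
    """Single compression round: U_i' = U_i - (U_tot - U_d) * E_i / E_sum.
    Ignores T_max saturation, which would require recompression."""
    u_tot, e_sum = sum(u_nom), sum(e)
    return [u - (u_tot - u_d) * ei / e_sum for u, ei in zip(u_nom, e)]

u_nom = [0.3, 0.2, 0.2, 0.2, 0.3]   # assumed utilizations of tau_1..tau_4 and OAT
e     = [1.0, 1.0, 1.0, 1.0, 0.0]   # OAT kept inelastic in this sketch
u_new = compress(u_nom, e, 0.9)     # total load goes from 1.2 back to 0.9
c     = [6, 8, 10, 20, 15]          # assumed WCETs (ms)
t_new = [ci / ui for ci, ui in zip(c, u_new)]   # new periods T_i = C_i / U_i
```

With these assumed values, the 0.3 units of excess load introduced by OAT are absorbed evenly by the four elastic tasks, each giving up 0.075 units of utilization.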
When the obstacle is overcome, task OAT can be suspended and the remaining tasks can return to their original condition, running with their minimum periods.