"Factory Automation", book edited by Javier Silvestre-Blanes, ISBN 978-953-307-024-7, Published: March 1, 2010 under CC BY-NC-SA 3.0 license

Chapter 16

Robustness Enhancement of Networked Control Systems

By Michał Morawski and Antoni Zajączkowski
DOI: 10.5772/9531


1. Introduction

Over the last decade there has been growing interest in design and implementation methods for Networked Control Systems (NCS). Typically, in an NCS a plant is both controlled and monitored by a remote computer system connected with the plant via a communication channel provided by a computer network (Hristu-Varsekelis & Levine, 2005). This architecture differs significantly from the classical one, where all components of the system are attached directly to the controlled plant and exchange data over dedicated wiring. NCSs are gaining popularity because they can be implemented more cost-effectively and can provide extended functionality compared with classical control systems. Using computer networks for data exchange in automated production systems is not a new idea, and over the last decades several industrial network standards have been developed. Industrial networks are typically based on asynchronous links (fieldbus) or on technologies developed for specific areas. The latter category includes Profibus, CAN, ARINC, WorldFIP, and many others.

The most important feature of these industrial networks is that they guarantee bounded transmission delays. However, NCSs based on industrial standards suffer from some disadvantages, such as high installation and maintenance costs, excessive weight of physical links, and difficulties with scalability and redundancy. One solution that partially avoids these problems is to use standard, inexpensive devices that are easy to purchase and replace and are typically used for office networking. Ethernet belongs to this class; it currently dominates general networking and seems likely to become an industrial standard in the near future (Decotignie, 2005; PROFIBUS Nutzerorganisation, 2006; Andersson, Brunner & Engler, 2003; IEC 61850, 2002).

Guided media, which form the basis of today's plants, are quite reliable in nature but have several disadvantages, such as limited scalability, fairly high installation and maintenance costs, and low flexibility. Networks are increasingly applied in new domains where the use of guided media is impossible or inconvenient. These domains include home or building automation, military systems, inter-vehicle communication (VANET), sensor networks and others (Murthy & Manoj, 2004; Schoch, Kargl & Weber, 2008; Callaway, 2004).

Unguided (wireless) communication is therefore increasingly seen as a significant complement to wired communication (Willig, Matheus & Wolisz, 2005).

Such solutions, based on 802.11 (Gast, 2005; WAVE), 802.15 (Gislason, 2008; Bluetooth Specification; ZigBee Specification, 2008), or proprietary technologies, are relatively inexpensive, easy to deploy, and often make it possible to avoid installing power supply cables. Some of them, like WISA (Frey, 2008), are commercially available. Wireless media, however, suffer from low reliability and limited capacity and are prone to noise and interference.

When addressing the problem of exploiting limited network resources, one should keep in mind that meeting the specific requirements of any kind of homogeneous network is unrealistic in real plants today. The simultaneous use of technologies whose maturity differs by more than a decade, which makes the network heterogeneous, requires traffic to be controlled at higher layers: the network layer or even the application layer (gateways).

It is well known that incorrect operation of the control system due to some hardware failure can be dangerous or even catastrophic. Therefore in order to increase reliability of computer controlled systems, redundancy realised by duplication or multiplication of sensors, actuators and selected processing units is applied, especially in the case of systems responsible for control of critical processes.

In network based control systems the situation is additionally complicated by the possibility of network failures and by the stochastically time-varying transmission delays that the network introduces into feedback loops. Hence, resiliency of communication is crucial for satisfactory operation of these systems. Multiplying communication links is clearly not sufficient to assure proper operation of NCSs and should be complemented by suitable traffic engineering and by procedures at the application level. Given the need to keep network introduced delays below a certain bound, traffic engineering is uncomplicated with point-to-point links, more complicated with shared media, and quite complex with wireless links prone to collisions, noise, and interference.

In this chapter only application layer resiliency is considered. The complementary solution in the network layer was proposed previously (Morawski, 2005; Morawski, 2006b; Morawski, 2007) and is only briefly discussed here. The physical layer (i.e. modulation and coding) is defined by the appropriate standards and therefore cannot be easily altered. The significant influence of the link layer on latencies cannot be exploited in practice due to limitations of the available hardware, firmware, and drivers. In particular, implementations of the advanced 802.11e features such as burst acknowledgements, no acknowledgements, and HCF are extremely rare, and software support for the more widely deployed WMM extensions is manufacturer dependent, diverse, and compatible in theory only. It is worth noting, however, that some companies adjust the lower layers (1 and 2) to their needs (Frey, 2008). We have decided not to follow this path for cost reasons. In what follows, we discuss the resiliency of NCSs with networks built from standard, inexpensive, off-the-shelf devices and systems. It is crucial to underline that there is no “silver bullet”: no single method is enough to achieve a robust NCS.

Organization of the chapter is as follows: in Section 2 delays introduced by the computer network are discussed, in Section 3 the compensation of network introduced delays at the application layer is presented together with experimental results illustrating properties of the laboratory NCS with the Magnetic Levitation System. These results are next complemented by addition of hardware redundancy as discussed in Section 4. Section 5 presents a short description of an approach to the network layer compensation of latencies, and Section 6 contains conclusions.

2. Network introduced delays

In many publications, e.g. (Stallings, 2002; Welzl, 2005), the inevitable data transmission latencies introduced by computer networks are modelled statistically by well-known distributions such as exponential, Poisson, Pareto, gamma, Erlang, etc. Such models are acceptable for the Internet, where the main source of delays is queueing in intermediate nodes. In the real-time traffic generated by NCSs, especially those using wireless channels, the media access procedure and layer 2 retransmissions have a significant influence, and experimentally obtained latency distributions cannot be approximated by the aforementioned models (Natori, Oboe & Ohnishi, 2008). This observation was confirmed by a number of experiments performed in our laboratory (Morawski, 2006a).

Fig. 1 presents the probability density functions (PDF) of round trip times (RTT) measured in a network with an 802.11b/g access point without the “e” extension, two Ethernet switches and a PIX firewall. The three graphs correspond to three independent measurements performed in three short time intervals about 15 minutes apart. Fig. 2 presents the same RTT data as discrete-time functions. The considerable increase of RTTs during the first measurement can be interpreted as the effect of unidentified traffic in the shared network or of disturbances. It should be noted that reducing the sampling period can significantly increase delays due to retransmissions, the contention procedure for media access control, and buffering in network card drivers (Fig. 3). These phenomena are especially important for wireless media, but the problem exists in any shared medium. It can be diminished in the network layer by using multiple transmission channels and suitable traffic engineering mechanisms (Morawski, 2005), where link costs (delays) are treated as uncertain numbers. It follows from the above discussion that selecting the sampling period in network based control systems requires additional analysis compared with that performed for classical computer controlled systems (Åström & Wittenmark, 1997; Wang & Liu, 2008).

It is worth mentioning that in our experiments we used general purpose, inexpensive hardware typically used for office networking, and that the ultimate goal of our research was to investigate whether such hardware enables successful deployment of NCSs with highly dynamic controlled plants. The plant in our experimental NCS was a Magnetic Levitation System (MLS), which is a structurally unstable system with fast dynamics (Morawski & Zajączkowski, 2007). Analytic considerations and simulations of the stand-alone closed-loop control system with the MLS show that stable operation requires frequent sampling of the measured output and fast floating point computation of the control feedback. Hence, compensating the influence of network introduced delays is critical for proper operation of an NCS with the MLS. Taking into account the results concerning network introduced delays and the requirements on the sampling period for digital control of the MLS, we concluded that a statistical approach to the design of network based controllers for the MLS is less useful than the predictive control approach described in (Liu, Xia & Rees, 2005; Yang, Wang & Yang, 2005).

media/image1.jpg

Figure 1.

Probability density functions of RTT obtained in the laboratory network

media/image2.jpg

Figure 2.

RTTs time series in the laboratory network obtained for the same data as depicted above

media/image3.jpg

Figure 3.

Effect of a congestion in some (wireless) link. Long delays resulting in unstable operation of the system. Such system is useless in practice.

3. Application Layer Compensation

We consider a discrete-time, linear, time-invariant control plant modelled by the following state and output equations:

$$x(k+1) = A x(k) + B u(k) \tag{1}$$
$$y(k) = C x(k) \tag{2}$$

where $x(k)$ is the state vector, $u(k)$ is the vector of control inputs, $y(k)$ is the vector of measured outputs and $A$, $B$, $C$ are matrices of suitable dimensions.

Further we assume that the model of the plant is completely reachable and completely observable (Åström & Wittenmark, 1997; Wang & Liu, 2008). In other words, we assume that the pair $(A, B)$ is completely reachable and the pair $(A, C)$ is completely observable. It follows from the first assumption that, using the pole shifting method or linear-quadratic optimal control theory (Åström & Wittenmark, 1997), we can design a linear state feedback

$$u(k) = K_s x(k) \tag{3}$$

such that the closed-loop system

$$x(k+1) = (A + B K_s)\, x(k) \tag{4}$$

is asymptotically stable with desired dynamic properties.
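As a small numerical illustration of (3) and (4), the sketch below checks that a state feedback gain moves the closed-loop eigenvalues inside the unit circle. The second-order plant matrices and the gain `K_s` are hypothetical values chosen for this example; they are not the MLS model used in the chapter.

```python
import cmath

# Hypothetical 2-state, single-input plant (illustrative values, not the MLS model).
A = [[1.0, 0.1],
     [0.5, 1.0]]
B = [0.0, 0.1]        # input column of the plant
K_s = [-25.0, -9.0]   # assumed feedback gain for u(k) = K_s x(k)


def closed_loop(A, B, K):
    """Return A + B K for a single-input plant (B is a column, K is a row)."""
    n = len(A)
    return [[A[i][j] + B[i] * K[j] for j in range(n)] for i in range(n)]


def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix computed from its trace and determinant."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def spectral_radius(M):
    """Largest eigenvalue magnitude; below 1 means asymptotic stability."""
    return max(abs(lam) for lam in eigenvalues_2x2(M))


open_loop_radius = spectral_radius(A)                          # above 1: unstable plant
closed_loop_radius = spectral_radius(closed_loop(A, B, K_s))   # below 1: stable loop
```

For these assumed values the closed-loop matrix has eigenvalues near 0.5 and 0.6, so the state decays even though the open-loop plant is unstable.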

The second assumption guarantees that we can design a Luenberger observer (Åström & Wittenmark, 1997), which estimates the state vector of the plant on the basis of the measured output and control input. Further, we consider the problem of implementing the control law (3) using the NCS (Welzl, 2005; Srikant, 2003) shown in Fig. 4.

A networked control system consists of a plant, a controller and a computer network across which all sensor and actuator data must be sent (Hristu-Varsekelis & Levine, 2005). In this system, the plant is equipped with a computer (a local controller) responsible mainly for sending data to and receiving data from a remote controller over a communication channel provided by the computer network. When a computer network is used to exchange information between the local and remote controllers, transmission delays are introduced into the feedback loop. We distinguish two transmission delays, the sensor to controller delay $\tau_{sc}$ and the controller to actuator delay $\tau_{ca}$, and we assume that there exist nonnegative integers $n_{sc}$ and $n_{ca}$ (Liu, Xia & Rees, 2005; Yang, Wang & Yang, 2005) such that the total transmission delay satisfies the inequality

$$0 < \tau_{sa} \le \Delta\, n_{sa} \tag{5}$$

where $n_{sa} = n_{sc} + n_{ca}$ and

$$\tau_{sa} = \tau_{sc} + \tau_{ca}$$

Remark. The delay τca includes the time spent by the remote controller for execution of the control algorithm.

The second assumption concerning operation of the network is as follows: the numbers of lost packets on the route from the plant to the controller and on the route from the controller to the plant are less than $n_{sc}$ and $n_{ca}$, respectively. The controller operates in the event driven mode, which means that the control algorithm is started when a new packet with measured data is received. Sensors work in the clock (time) driven mode, that is, samples of the measured output signals are taken at the time instants $t_k = k\Delta$, where $k$ is the integer sample number and $\Delta$ is the sampling period. The sensor is therefore responsible for the stringent timing properties. Actuators operate in both the clock and event driven modes: control values are received when a packet from the controller arrives, and the new control value is applied at the immediately following sampling instant.

Let $T_k = [t_k, t_{k+1})$. In this time interval the local computer can receive zero, one or more (at most $n_{sa}$) data packets with new values of the control input. If no packet arrives during this interval, then the preceding value of the control input will be applied in the next time interval $T_{k+1} = [t_{k+1}, t_{k+2})$.

If in the interval $T_k$ the local computer receives more than one data packet with control values, then the most recent value will be applied and earlier values (i.e. those based on earlier samples, not those received earlier) will be neglected. These scenarios are illustrated in Fig. 5.
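The actuator-side rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ControlPacket` type, its field names and the `select_control` function are hypothetical, and each packet is assumed to carry a single control value together with the sample number it was computed from.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ControlPacket:
    stamp: int     # sample number k on which the control value is based (assumed field)
    value: float   # control value carried by the packet (assumed field)


def select_control(arrived: List[ControlPacket], previous: float) -> float:
    """Pick the control value to apply at the next sampling instant.

    `arrived` holds the packets received during the interval T_k, in arrival order.
    """
    if not arrived:
        # No packet in T_k: keep applying the preceding control value.
        return previous
    # "Most recent" refers to the newest underlying sample, not the latest arrival.
    newest = max(arrived, key=lambda packet: packet.stamp)
    return newest.value
```

For example, if a packet based on sample 7 arrives before a delayed packet based on sample 5, the value from sample 7 is the one applied.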

Suppose now that on the basis of the plant model (1) the linear state feedback (3) was designed. If this control law is used in an NCS in which random time-varying transmission delays occur, then control values are applied with the delay $r(k)$, which takes values from the set $R = \{1, 2, \ldots, n_{sa}\}$.

media/image35.png

Figure 4.

One channel closed-loop NCS. P – plant, S – sensors, A – actuators, C – remote controller, N – computer network.

Note that $r(k)$ corresponds to the time instant $t_k = k\Delta$ and to the sample $y(k)$ of the output. It is easy to see that the best situation occurs when $r(k) = 1$, but even in this case application of the new control value by the actuator is delayed. If an NCS has no mechanism compensating the influence of the random delays introduced by the network, then the behaviour of the closed-loop system degrades significantly and in some cases the system can lose stability. Hence, compensation of this delay phenomenon is among the most important problems encountered in the area of network based control systems.

media/image40.png

Figure 5.

Packets with control data are received in the interval Tk=[tk,tk+1) (on the left). No packet with control data is received in the interval Tk=[tk,tk+1) (on the right)

As mentioned above, the remote controller should be equipped with a mechanism compensating the influence of the transmission delays introduced by the computer network. One of the mechanisms proposed recently (Liu, Xia & Rees, 2005; Yang, Wang & Yang, 2005) is the method of prediction described in what follows. Consider the NCS depicted in Fig. 4 and assume that the local computer has a buffer for storing a finite number of control values. Assume also that the remote controller stores the model of the plant represented by the triple $(A, B, C)$.

Suppose that the controller has received the sample $x(k)$ of the plant state together with the time stamp $k$ corresponding to this sample. Knowing the state $x(k)$, we can use (3) to compute the control value $u(k)$. The next predicted value $\hat{x}(k+1)$ of the state can then be computed according to equation (1):

$$\hat{x}(k+1) = A x(k) + B u(k) \tag{6}$$

Applying (3) again we obtain the predicted value $u(k+1)$ of the control signal

$$u(k+1) = K_s \hat{x}(k+1) \tag{7}$$

Repeating these computations $n_{sa} - 1$ more times, with the arguments shifted accordingly, gives the last predicted values of the state and control

$$\hat{x}(k+n_{sa}) = A \hat{x}(k+n_{sa}-1) + B u(k+n_{sa}-1) \tag{8}$$
$$u(k+n_{sa}) = K_s \hat{x}(k+n_{sa}) \tag{9}$$

As a result we obtain the finite sequence of control values

$$U = u(k),\, u(k+1),\, \ldots,\, u(k+n_{sa}) \tag{10}$$

This sequence, with the corresponding time stamp $k$ attached, is sent via the computer network to the local computer. The local computer compares its current discrete time with the received time stamp and computes the value of $r(k)$. The element of index $r(k)$ is then selected from the control sequence and applied at the immediately following sampling instant (Fig. 5).
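The prediction on the controller side and the selection on the actuator side can be sketched as follows. This is a minimal sketch under stated assumptions: a single-input plant, the full state available to the controller, hypothetical function names, and hypothetical matrices in the usage note. Clamping the index to the length of the sequence is an addition for safety in this example and is not described in the chapter.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def predict_controls(A, B, K_s, x, n_sa):
    """Return the sequence u(k), u(k+1), ..., u(k+n_sa) for a single-input plant.

    A is a list of rows, B is the input column, K_s is the feedback row,
    and x is the state sample x(k) received from the plant.
    """
    U = []
    x_hat = list(x)
    for _ in range(n_sa + 1):
        # Control from the current (predicted) state: u = K_s x_hat.
        u = sum(g * xi for g, xi in zip(K_s, x_hat))
        U.append(u)
        # One-step model prediction: x_hat <- A x_hat + B u.
        Ax = mat_vec(A, x_hat)
        x_hat = [a + b * u for a, b in zip(Ax, B)]
    return U


def pick_control(U, stamp, now):
    """Select the element of index r(k) = now - stamp from the received sequence.

    The clamp to the valid index range is an assumption of this sketch.
    """
    r = now - stamp
    r = min(max(r, 0), len(U) - 1)
    return U[r]
```

With the hypothetical values `A = [[1.0, 0.1], [0.5, 1.0]]`, `B = [0.0, 0.1]`, `K_s = [-25.0, -9.0]` and `x = [0.01, 0.0]`, calling `predict_controls(A, B, K_s, x, 10)` returns eleven values. The element at index 0 is $u(k)$, which is carried in the sequence but never selected when $r(k) \ge 1$.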

The experimental MLS described in (Morawski & Zajączkowski, 2007) is connected to a typical PC (the local computer) that reads data from the height sensor and controls the voltage applied to the winding of the electromagnet. The height of levitation is measured by an optical sensor connected to a 12-bit AD converter. The local computer sends data to and receives data from the remote computer system responsible for computing the control values. The two computers exchange data via a computer network using UDP packets. The local computer sends packets at a rate of 1024 packets per second. Each packet contains the current and past samples of the height of levitation, the corresponding time stamp and the respective control values. The packets produced by the remote controller are sent asynchronously to the local computer and can be considered responses to the packets containing samples of the plant output. Each packet sent by the remote controller contains a copy of the respective time stamp and the finite sequence of $n_{sa} = 10$ [1] control values computed according to the presented method of prediction. It follows that the size of the UDP packet is 72 B, which results in a throughput estimate of 901120 bps.
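The chapter does not spell out how the 901120 bps figure is obtained from the 72 B packet size. One decomposition that reproduces it, shown here purely as an assumption, adds a 20 B IPv4 header and 18 B of Ethernet framing (14 B header plus 4 B frame check sequence) to each 72 B UDP datagram:

```python
# Values stated in the text.
UDP_DATAGRAM_BYTES = 72
PACKETS_PER_SECOND = 1024

# Assumed per-packet overheads (not given in the chapter).
IPV4_HEADER_BYTES = 20        # IPv4 header without options
ETHERNET_FRAMING_BYTES = 18   # 14 B Ethernet header + 4 B frame check sequence

frame_bytes = UDP_DATAGRAM_BYTES + IPV4_HEADER_BYTES + ETHERNET_FRAMING_BYTES
throughput_bps = frame_bytes * 8 * PACKETS_PER_SECOND

print(frame_bytes, throughput_bps)  # 110 901120
```

Under these assumptions each frame is 110 B, and 110 B × 8 × 1024 packets per second gives the quoted estimate for one direction of traffic.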

The bandwidth estimates, including the media access procedure and gaps (< 2 Mb/s), are low enough for data transmission over typical modern Ethernet or WiFi (802.11) networks. The above approximation does not take retransmissions into account and may therefore be an underestimate in the shared media case. Nonetheless, in every case the available bandwidth is far from saturation. The local computer evaluates the difference between its current time and the received time stamp. This quantity, interpreted as the Round Trip Time (RTT), is expressed in milliseconds and is used as the index for selecting the suitable element from the received sequence of control values $U$ (10).

Fig. 6 presents the time variability of the RTT (and thus of the index into the table obtained using (6), (7)) seen by the local computer. Fig. 7 presents graphs of the ball position when the network based control loop uses the Ethernet and WiFi networks, respectively. The results of operation of the network based systems are compared with the standalone system.

media/image60.jpg

Figure 6.

RTT seen by the actuator. These values can be interpreted as indices in vector obtained using (10).

media/image61.png

Figure 7.

Quality of control of the magnetic levitation system obtained using the methods described in the chapter with an Ethernet network (above) and a mixed (WiFi and Ethernet) network (below). The better results of the NCS in the case of Ethernet networks result from using an observer to obtain the ball velocity instead of simple differentiation via filtering. The WiFi based controller without compensation does not work at all.

4. Application layer redundancy

The aforementioned results were obtained for unicast communication. In order to increase the robustness of the control system, we have proposed multicast communication originating from the local computer, with responses sent as unicast communication originating from the remote computer. When the remote computer has multiple different interfaces, such a solution does not increase the traffic volume, and hence the delays. However, such redundancy concerns only the communication links and not the control process. If several computers are used in the same distributed control system, the data traffic increases, but so does the redundancy level, which then covers both the links and the control process. An intermediate solution can be based on the traffic engineering algorithms briefly described in the next section, where traffic is limited at the “intelligent” network level (in middleware).

The scheme of the redundant system, and the results of experiments concerning the switching over the active remote computer are presented in Fig. 8. The glitch seen on this chart is the effect of the different internal state of the observer at the moment of this structural change.

Redundancy slightly lowers the quality of control because delays increase statistically, but this effect is negligible. However, the aforementioned glitch is too large to keep the system stable when switching from the Ethernet to the WiFi network. In the remaining cases switching between the networks is possible.

5. Network layer compensation

The described method of compensating network induced delays has limited capabilities, because as $n_{sa}$ increases the quality of prediction and of the plant state estimation performed by the observer degrades significantly. Although theoretically $n_{sa}$ can take any value, in practice it should not exceed 5 to 10 (depending on the particular problem; for the MLS see Fig. 6 and 7). Occasional violation of this limit usually does not result in system misbehaviour, but in the case of bursty delays affecting a series of data, a system failure is practically unavoidable. Therefore, network layer compensation should be applied as a complementary mechanism enhancing the robustness of NCSs.

The network layer sees the destination of a communication through some additive, multiplicative or convex metric built from link cost attributes (Alkahtani, Woodward, & Al-Begain, 2003). The attributes can be considered a vector of technical, economic and other properties of the link. In classical networks (or high performance networks) the quality of a link (i.e. its cost) is described only by its existence or by its physical bandwidth. Other properties are used infrequently, in particular link delay, which is the most obvious attribute in the application areas considered here. Moreover, forwarding packets based on minimum delay has been proven to be optimal (Gallager, 1977). Unfortunately, information about delays is always outdated, and earlier approaches exploiting this rule have failed (Khanna & Zinky, 1989).

media/image64.png

Figure 8.

Application layer redundancy. The scheme of the system (above). Process of switching the active remote controller (below).

Here we briefly discuss another approach developed recently by Morawski (Morawski, 2005; Morawski, 2006b; Morawski, 2007). It is based on suitable utilisation of multiple communication channels, which dissipates congestion (caused by the media access procedure, noise, retransmissions, interference from foreign flows, etc.) by balancing the traffic gradually, in contrast to previous attempts. The algorithm was initially designed for general networks and was recently adapted to real-time traffic (Morawski, 2007).

As mentioned above, the quality of links can be expressed as the overall transmission latencies associated with particular links. These latencies are highly variable quantities computed as the sum of propagation, transmission, media access, processing and queuing times. While the propagation, transmission and processing times are more or less constant, the remaining ones are not, and the media access time necessary for retransmissions directly influences the queue depletion ratio. Therefore, we use the sum of the queue depletion time and the constant components of the latency as the input $s \mapsto \eta(s)$ of a first-order, linear, low-pass IIR filter:

$$\xi(s+1) = (1-\alpha)\,\xi(s) + \alpha\,\eta(s) \tag{11}$$

where $s$ is the discrete time corresponding to the subsequent time instants, $\alpha \in (0,1) \subset \mathbb{R}$ is the constant defining the dynamics of the filter, and $s \mapsto \xi(s) \in \mathbb{R}$ denotes the estimate of the average latency introduced by the link.

Additionally we take into consideration the average standard deviation v (variability) of ξ computed as follows:

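$$\begin{aligned}
v^{+}(s+1) &= \beta\,(\eta(s) - \xi(s)) + (1-\beta)\,v^{+}(s) && \text{if } \eta(s) \ge \xi(s) \\
v^{-}(s+1) &= \beta\,(\xi(s) - \eta(s)) + (1-\beta)\,v^{-}(s) && \text{otherwise}
\end{aligned} \tag{12}$$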
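The updates (11) and (12) can be sketched as a single step function. This is an illustrative sketch: the function name is hypothetical, and leaving the spread that is not selected by the condition unchanged is an assumption, since the text does not state it explicitly.

```python
def update_link_estimate(xi, v_plus, v_minus, eta, alpha, beta):
    """One step of the link cost estimator.

    xi      - current estimate of the average latency
    v_plus  - current upward variability estimate
    v_minus - current downward variability estimate
    eta     - newly measured latency sample
    alpha   - smoothing constant of the latency filter, 0 < alpha < 1
    beta    - smoothing constant of the variability estimates
    """
    # Asymmetric variability, computed against the current xi (before it is updated).
    if eta >= xi:
        v_plus = beta * (eta - xi) + (1.0 - beta) * v_plus
        # v_minus is left unchanged (assumption of this sketch).
    else:
        v_minus = beta * (xi - eta) + (1.0 - beta) * v_minus
        # v_plus is left unchanged (assumption of this sketch).
    # First-order low-pass filter of the latency.
    xi_next = (1.0 - alpha) * xi + alpha * eta
    return xi_next, v_plus, v_minus
```

Starting from `xi = 10.0` with both spreads at zero, a sample `eta = 20.0` with `alpha = 0.25` and `beta = 0.5` moves the estimate to 12.5 and raises the upward spread to 5.0, while the downward spread stays at zero.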

It is well known (Welzl, 2005; Stallings, 2002; Srikant, 2003) that the network model does not belong to the class of statistically invariant systems which, according to the definition of the standard deviation, fulfil the condition $v^{+}(s) = v^{-}(s)$. The results of the approximations computed by (11) and (12) are presented in Fig. 9. The values of $\xi$, $v^{+}$ and $v^{-}$ form an uncertain quantity defining the link quality (or link cost), where $\xi$ is the most likely (or central) value, $\xi + v^{+}$ is the estimate of the upper limit and $\xi - v^{-}$ the estimate of the lower limit of variability of $\eta(s)$. A possible geometric interpretation of such an uncertain value is presented in Fig. 9. The uncertain sum of the link costs gives the uncertain metric of the path (Hanss, 2005). Packets are forwarded randomly, with probability inversely proportional to the respective value of the path metric, which can be expressed as a degree of occlusion of the particular metrics.
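The random forwarding rule can be illustrated with a deliberately simplified sketch that uses only the central values of the path metrics and ignores their spreads. The occlusion-based comparison of uncertain metrics mentioned above is not reproduced here, and the function names are hypothetical.

```python
import random


def forwarding_probabilities(path_costs):
    """Probabilities inversely proportional to the (central) path metric values."""
    weights = [1.0 / cost for cost in path_costs]
    total = sum(weights)
    return [w / total for w in weights]


def choose_path(path_costs, rng=random):
    """Draw the index of the path used for the next packet."""
    probabilities = forwarding_probabilities(path_costs)
    return rng.choices(range(len(path_costs)), weights=probabilities, k=1)[0]
```

For two paths with central costs of 10 ms and 30 ms, the first path receives about three quarters of the packets. Traffic is shifted gradually as the costs change rather than switched abruptly to the currently best path.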

While $\eta(s)$ is highly variable, the value $\xi(s)$ changes more slowly, but it still changes at every time instant. The algorithms which decide when to report a new value of the link cost and what the updated value of this cost should be, together with the corresponding adaptation mechanisms, are discussed in detail in (Morawski, 2005; Morawski, 2006b; Morawski, 2007).

The proposed algorithm can be used with any routing protocol that works in an event triggered fashion, i.e. all except those that have only timed updates. The algorithm has properties similar to soft handover and therefore does not induce the route flapping phenomenon.

Therefore the quality of this algorithm is far better than that of the classical version (Khanna & Zinky, 1989). The quality was measured in simulations using Network Simulator ver. 2 by comparing the drop/receive ratio for connectionless traffic and by evaluating the variability of the main control value of the TCP protocol (cwnd) for connection oriented traffic. The variability of cwnd can be evaluated successfully only for well equipped appliances that can run sophisticated versions of TCP requiring many resources. Most TCP stacks available for microcontrollers are limited (because resources must be conserved) and do not use the aforementioned sophisticated flow control, so assessments based on connectionless traffic are closer to what is observed in real networks. The quality for connectionless traffic was assessed using impulse or self-similar traffic with different statistical properties. In all cases the algorithm outperforms the standard solutions. Finally, the laboratory network was tested.

media/image84.png

Figure 9.

Approximation of the link cost using equations (11) and (12) (above). Possible geometrical interpretation (below)

6. Conclusions

Neither suitable traffic engineering nor application layer compensation applied alone can keep the closed loop latencies of an NCS within the desired bounds. However, a proper combination of algorithms in the network layer and the application layer can make NCSs applicable to network based control of highly dynamic systems.

Notes

[1] - In fact, the first element is included in this vector U (10) but is not used as a control value. This value can be used for the assessment of the closed loop system properties.

References