Final testing and training results of ANN model (from Ref. [35]).

## Abstract

This chapter discusses methods used in the simulation and modeling of radio frequency (RF)/microwave circuits and components. The main topic is one of the most powerful of these methods, artificial neural networks, and different types of neural network, such as dynamic and recurrent neural networks, are discussed. Other techniques that are popular in microwave component simulation and modeling are numerical techniques such as vector fitting, the Krylov method, and Padé approximation. At the end of the chapter, vector fitting is discussed as an example of a numerical method.

### Keywords

- artificial neural networks
- circuit simulation and modeling
- transient analysis
- function approximation
- RF/microwave circuits

## 1. Introduction

In recent years, the rapid development of wireless communication products and strong commercial demand in this field have driven significant improvement in modeling and simulation approaches for radio frequency (RF) and microwave circuits. Such high-frequency circuits have led to the development of a large variety of microwave models for passive and active devices and circuit components [1]. Modeling and computer-aided design (CAD) methods have an essential role in microwave design and simulation [2]. Older approaches were mainly based on slow trial-and-error processes and an emphasis on performance at any price, but high-frequency circuit design and modeling has now entered a new era, since progress in this field has enabled microwave engineers to design larger, more efficient, and more complicated circuits than before [1, 3]. This complexity calls for new materials and technologies, which in turn require not only new models but also new CAD algorithms [4] for RF/microwave circuits, antennas, and systems, so as to keep up with the advancement of technology with emphasis on time-to-market and low-cost approaches [1, 3]. In addition to accurate parametric-modeling techniques that describe the behavior of a microwave device, a reliable description of how that behavior changes with geometrical or physical parameters is also needed [5].

Also, since circuit models at high frequencies often lack fidelity, detailed electromagnetic (EM) simulation techniques are needed to improve design accuracy. Although EM simulation techniques are heavily used, they are computationally expensive, so there is a demand for design methodologies that are not only accurate but also fast. Another concern today is optimization, which requires computer-based algorithms that evaluate the circuit iteratively and is therefore a highly repetitive computational process. A further issue, according to Ref. [6], is the possibility of employing knowledge-based tools for initial design, which is one of the steps in the design and modeling process. It is hard to address all these problems with traditional CAD technologies [1, 3]. In short, there is a serious need for a powerful, accurate, and fast modeling tool.

Neural networks (NNs), or artificial neural networks (ANNs), are information-processing systems that imitate the ability of the human brain to learn from observation and to generalize by abstraction in order to create complex models [7]. A neural network can approximate a system well regardless of whether the underlying input-output relationship is linear or nonlinear, and it can be used as a knowledge-based tool (to be employed for initial design in RF/microwave applications) [1]. The ability of NNs to be trained has led to their use in many diverse fields such as pattern recognition, system identification, control, telecommunications, biomedical instrumentation, and many others. Recently, many researchers in the communications area have focused on using neural networks in their modeling and simulation, and NNs have been recognized as a useful alternative to conventional approaches in microwave modeling [1, 3]. Neural network models are simple and fast, and they can enhance the accuracy of existing models. Neural networks rest on the universal approximation theorem, which states that a neural network with at least one hidden layer can approximate a nonlinear multidimensional function to any desired accuracy [8]. This property makes neural networks a favorite modeling tool for microwave engineers. The neural network approach is generic, that is, the same modeling technique can be reused for passive and active devices and circuits. Another advantage of NNs is the ease of updating neural models when the technology changes [2]. Neural networks are now used in various microwave modeling and simulation applications, such as vertical interconnect accesses (Vias) and interconnects [9], parasitic modeling [10], coplanar waveguide (CPW) components [11], antenna applications, nonlinear microwave circuit optimization [12], power amplifier modeling, nonlinear device modeling, waveguide filters, enhanced electromagnetic (EM) computation, and so on [2].

Artificial neural networks are classified into two main categories: static neural networks and dynamic neural networks. In this chapter, neural network structures are presented first, followed by a general overview of static and dynamic neural networks, their different types, and their applications in microwave modeling. The last part is devoted to another method, vector fitting (VF), a numerical technique used for system identification and macromodeling [13].

## 2. Training neural network

### 2.1. Basic structure of neural network

The idea behind a neural network is similar to the operation of the human brain. A typical neural network structure has two types of basic components: the processing elements and the interconnections between them. The processing elements are known as neurons, and the interconnections are called links. Each link has a corresponding weight parameter. Every neuron receives stimuli from the neighboring neurons connected to it [3]. Neurons that receive stimuli from outside the network are called input neurons, neurons that produce the output result are called output neurons, and neurons that both receive stimuli from and send stimuli to other neurons are called hidden neurons [1]. Neurons can be connected to each other in different ways, which gives rise to different neural network structures. A neural network structure defines how information is processed inside a neuron and how the neurons are connected. In this chapter, we discuss the models that are most common in microwave simulation and modeling.

Generally, artificial neural networks have an input data vector, an output data vector, a vector including all the weight parameters, and a function that mathematically presents the neural network [14].

Let *N*_{i} and *N*_{o} denote the number of input and output neurons of the neural network, respectively, let *w* be the vector of weight parameters, and let *y* = *y*(*x*, *w*) be the function that represents the neural network. A simple scheme of an artificial neural network is shown in Figure 1.

Given a set of input and output data, a neural network can be constructed and trained. The network tries to estimate a function that gives the closest result to the intended output. Commonly, a large percentage of the input data and the corresponding output data is used as training data, and the network is trained with it. Training means identifying the weights so that they reach their optimum values. The remaining data are used for validation and testing. The validation set is used to estimate the generalization error and to decide when to stop training, so as to prevent over-learning and under-learning [7]. The testing data are used to check the accuracy of the network after training is completed.
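The division of the data into training, validation, and testing sets can be sketched as follows. This is a minimal illustration that assumes NumPy; the 70/15/15 split, the function name `split_data`, and the sample data are hypothetical choices, since the chapter does not prescribe specific percentages.

```python
import numpy as np

def split_data(x, y, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the samples and split them into training, validation, and test sets.

    The fractions are illustrative assumptions, not values from the chapter.
    """
    n = len(x)
    order = np.random.default_rng(seed).permutation(n)
    n_train = int(round(train_frac * n))
    n_val = int(round(val_frac * n))
    # Everything left after the training and validation parts is the test set.
    train, val, test = np.split(order, [n_train, n_train + n_val])
    return (x[train], y[train]), (x[val], y[val]), (x[test], y[test])

# Hypothetical data set: 100 samples, 3 inputs, 1 output.
x = np.linspace(0.0, 1.0, 300).reshape(100, 3)
y = x.sum(axis=1, keepdims=True)
(train_x, train_y), (val_x, val_y), (test_x, test_y) = split_data(x, y)
```

Shuffling before splitting keeps each subset representative of the whole input range, which matters when the samples were generated by sweeping a parameter in order.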

At each level of information processing, the output of each neuron is passed on to the neurons of the next layer, from the input neurons to the output neurons. An overview of information processing in layers is shown in Figure 2. The inputs of a neuron are first multiplied individually by the corresponding weight parameters; the results are then added to produce a weighted sum *γ*, which is passed through a neuron activation function *σ*(·) to produce the final output of the neuron. This output is an input of the neurons in the next layer, and the process is repeated for all the neurons until the output layer is reached.

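Let *w*_{i0}^{l} be the bias of the *i*th neuron of the *l*th layer. The vector of weights is then

$$w = \left[\, w_{10}^{2} \;\; w_{11}^{2} \;\; w_{12}^{2} \;\; \cdots \;\; w_{N_L N_{L-1}}^{L} \,\right]^{T}$$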

There are different types of training in neural networks [1]; each is explained briefly here:

- Sample-by-sample (or online) training: each time a training sample is presented, the weights (*w*) are updated based on the training error.
- Batch-mode (or offline) training: after each epoch, the weights are updated based on the training error from all the samples in the training data set.
- Supervised training: both *x* and *y* data are used in the training process, where *x* is the input of the neural network and *y* is the output of the neural network.
- Unsupervised training: only *x* data are used in the training process.
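The difference between sample-by-sample and batch-mode updates can be illustrated with a short sketch. To keep it small, the "network" here is a single linear neuron trained by gradient descent on the squared error; this model, the learning rate, and the function names are assumptions made for illustration, not the training algorithm of any particular reference.

```python
import numpy as np

def train_online(x, y, w, lr, epochs):
    """Sample-by-sample training: update w after every training sample."""
    for _ in range(epochs):
        for xi, yi in zip(x, y):
            error = xi @ w - yi           # training error of this one sample
            w = w - lr * error * xi       # immediate weight update
    return w

def train_batch(x, y, w, lr, epochs):
    """Batch-mode training: update w once per epoch using all samples."""
    for _ in range(epochs):
        errors = x @ w - y                # training errors of all samples
        w = w - lr * (x.T @ errors) / len(x)
    return w

# Supervised data: every input x has a known target y (hypothetical values).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(50, 2))
w_true = np.array([1.5, -0.5])
y = x @ w_true

w0 = np.zeros(2)
w_online = train_online(x, y, w0, lr=0.1, epochs=200)
w_batch = train_batch(x, y, w0, lr=0.1, epochs=2000)
```

Both loops use the targets `y`, so both are supervised; the only difference is how often the weights change within one pass over the data.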

### 2.2. Activation functions

The activation function, also known as the transfer function, is one of the most important units in a neural network structure; it is a scalar-to-scalar function that transforms the weighted sum of a neuron's input signals into an output signal. Common activation functions are the arctangent shown in Figure 3, the hyperbolic tangent shown in Figure 4, and the sigmoid function shown in Figure 5 [15].

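Sigmoid function:

$$\sigma(\gamma) = \frac{1}{1 + e^{-\gamma}}$$

Arctangent function:

$$\sigma(\gamma) = \frac{2}{\pi}\arctan(\gamma)$$

Hyperbolic tangent function:

$$\sigma(\gamma) = \frac{e^{\gamma} - e^{-\gamma}}{e^{\gamma} + e^{-\gamma}}$$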
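The three activation functions named above can be written in NumPy as follows. The arctangent is scaled by 2/π here so that its output lies in (−1, 1); this is one common convention and is an assumption about the form intended in this chapter.

```python
import numpy as np

def sigmoid(gamma):
    """Sigmoid (logistic) activation, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-gamma))

def arctangent(gamma):
    """Arctangent activation, scaled (by assumption) to an output in (-1, 1)."""
    return (2.0 / np.pi) * np.arctan(gamma)

def hyperbolic_tangent(gamma):
    """Hyperbolic tangent activation, output in (-1, 1)."""
    return np.tanh(gamma)

# Evaluate each function on a few weighted sums.
gamma = np.array([-2.0, 0.0, 2.0])
outputs = {f.__name__: f(gamma) for f in (sigmoid, arctangent, hyperbolic_tangent)}
```

All three are smooth, monotonically increasing, and bounded, which is what makes them usable with gradient-based training.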

## 3. Static neural networks

In the past few years, artificial neural networks have gained attention as a valuable computer-aided design tool for modeling high-frequency circuits. They can mainly be categorized into techniques for modeling the frequency-domain response of components and techniques for modeling their time-domain response. For frequency-domain modeling, static neural networks are employed. Their main architectures are the multilayer perceptron (MLP) and the radial-basis function (RBF) network, which are discussed in this section.

### 3.1. Multilayer perceptron (MLP)

The multilayer perceptron is the most frequently used structure in many areas, including microwave modeling and optimization problems. It belongs to a subcategory of neural networks called feed-forward neural networks, which are able to approximate continuous and integrable functions [1]; their neurons are grouped into layers that are linked only to adjacent layers, so there is no cycle or recursive path [16].

#### 3.1.1. MLP structure

In the MLP structure, neurons are grouped into layers. A typical MLP neural network consists of one input layer, one or more hidden layers, and one output layer, as shown in Figure 6. Let *L* be the total number of layers: layer 1 is the input layer, layers 2 to (*L* − 1) are hidden layers, and layer *L* is the output layer. Also, let the number of neurons in the *l*th layer be *N*_{l}, *l* = 2, 3, …, *L*.

Here, let *x*_{i} be the *i*th input of the MLP and *z*_{i}^{l} be the output of the *i*th neuron of the *l*th layer. Also, *w*_{ij}^{l} is the weight of the link between the *j*th neuron of the (*l* − 1)th layer and the *i*th neuron of the *l*th layer.

One of the most commonly used activation functions in MLP structure is sigmoid function [1], which is shown in Figure 5.

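In summary, if we suppose

$$\gamma_i^{l} = \sum_{j=0}^{N_{l-1}} w_{ij}^{l}\, z_j^{l-1}, \qquad z_0^{l-1} = 1,$$

For the input layer,

$$z_i^{1} = x_i, \qquad i = 1, 2, \ldots, N_1,$$

For the hidden and output layers,

$$z_i^{l} = \sigma(\gamma_i^{l}), \qquad i = 1, 2, \ldots, N_l, \quad l = 2, 3, \ldots, L,$$

And the outputs of the network are

$$y_i = z_i^{L}, \qquad i = 1, 2, \ldots, N_L.$$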
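The layer-by-layer computation of an MLP can be sketched in a few lines of NumPy. The sketch assumes that each layer stores its weights in a matrix whose first column holds the biases *w*_{i0}^{l}, and that the sigmoid is applied in every layer including the output layer (many practical models use a linear output layer instead); the function names and the 3-5-2 structure are hypothetical.

```python
import numpy as np

def sigmoid(gamma):
    return 1.0 / (1.0 + np.exp(-gamma))

def mlp_forward(x, weights):
    """Feed-forward pass through an MLP.

    weights[k] has shape (N_l, N_{l-1} + 1); column 0 holds the biases w_i0.
    """
    z = np.asarray(x, dtype=float)             # layer 1: z_i = x_i
    for w in weights:
        z_ext = np.concatenate(([1.0], z))     # z_0 = 1 multiplies the bias
        gamma = w @ z_ext                      # weighted sum of each neuron
        z = sigmoid(gamma)                     # activation gives the layer output
    return z                                   # y_i = z_i of the last layer

def init_weights(layer_sizes, seed=0):
    """Random initial weights for the given layer sizes, e.g. [3, 5, 2]."""
    rng = np.random.default_rng(seed)
    return [rng.normal(scale=0.5, size=(n_out, n_in + 1))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

weights = init_weights([3, 5, 2])              # 3 inputs, 5 hidden, 2 outputs
y = mlp_forward([0.1, -0.4, 0.7], weights)
```

Training then amounts to adjusting the entries of `weights` so that `mlp_forward` reproduces the training outputs.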

### 3.2. Radial-basis function (RBF) networks

Radial-basis function neural network, like MLP, is a subset of feed-forward neural network. It is used in a wide range of applications related to microwave transistors and high-speed integrated circuits, and modeling of intermodulation distortion behavior of MESFETs and HEMTs [1, 17].

#### 3.2.1. RBF structure

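A radial-basis function is a real-valued function whose value depends only on the distance from the origin, so that *φ*(*x*) = *φ*(‖*x*‖), or alternatively on the distance from some other point *c*, called a *center*, so that

$$\phi(x, c) = \phi(\lVert x - c \rVert)$$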

The main approach in this structure is to approximate the curve that best fits the training data set in a high-dimensional space by determining *λ* and *c*, which are the standard deviation and center of the activation functions, respectively, and are parameters of the function [7, 15]. The dimension of the hidden space is directly related to the accuracy of the approximated model [14]. In this structure, we have

$$y(x) = \sum_{i=1}^{N} w_i\, \phi(\lVert x - c_i \rVert)$$

where *y*(*x*) is the approximating function, a weighted sum of *N* radial-basis functions, each with its own center *c*_{i} and weight *w*_{i} [1].

In the RBF structure, there is just one hidden layer, and the input and output layers function in the same way as in the MLP structure. RBF networks use radial-basis functions as activation functions [7]. Figure 7 shows a typical RBF neural network.

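Radial-basis activation functions include the Gaussian and multiquadratic functions. In both, *γ* denotes the distance between the input and the center *c*, scaled by *λ*.

Gaussian function:

$$\sigma(\gamma) = \exp(-\gamma^{2})$$

Multiquadratic function:

$$\sigma(\gamma) = \frac{1}{(a^{2} + \gamma^{2})^{\alpha}}$$

where *a* and *α* > 0 are constants.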

The Gaussian and multiquadratic functions are shown in Figure 8 and Figure 9, respectively.
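The output of an RBF network as a weighted sum of radial-basis functions can be sketched as follows. The sketch assumes Gaussian hidden neurons with the distance to each center scaled by a single *λ* per neuron; the centers, widths, and weights are hypothetical values chosen only to make the example concrete.

```python
import numpy as np

def rbf_forward(x, centers, lam, weights):
    """RBF network output y(x) = sum_i w_i * exp(-gamma_i**2).

    gamma_i = ||x - c_i|| / lam_i is the scaled distance to the i-th center.
    """
    x = np.asarray(x, dtype=float)
    gamma = np.linalg.norm(x - centers, axis=1) / lam   # one distance per hidden neuron
    return weights @ np.exp(-gamma ** 2)                # weighted sum of Gaussians

# Hypothetical network: two hidden neurons in a two-dimensional input space.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
lam = np.array([0.5, 0.5])
weights = np.array([2.0, -1.0])

y_at_first_center = rbf_forward([0.0, 0.0], centers, lam, weights)
y_far_away = rbf_forward([10.0, 10.0], centers, lam, weights)
```

At a center, the corresponding hidden neuron outputs 1 and dominates the sum; far from every center, all hidden outputs decay toward zero, which is the local behavior that distinguishes RBF networks from MLPs.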

## 4. Time-domain neural networks

For time-domain modeling of components and systems, time-domain artificial neural network structures are usually employed in the literature. The main time-domain architectures are dynamic neural networks (DNNs) and recurrent neural networks (RNNs), which will be discussed in this section.

Most neural network structures used by engineers are feed-forward neural networks, which are suitable for time-independent, static input-output mapping [18]. In feed-forward neural networks, information flows straight from the neurons of the first layer to the neurons of the output layer, and the procedure is not recursive, so the output of a neuron has no effect on the inputs of the neurons in preceding layers; this absence of feedback is what keeps such a network stable. In contrast to static NNs, a dynamic neural network uses feedback between neurons in the same layer, or even between neurons in different layers, which also gives it computational advantages [19]. Feedback-based neural networks are a good approach for modeling, identification, and control of systems, since most real-world systems, such as airplanes and rockets, are nonlinear dynamical systems [18, 20].

### 4.1. Recurrent neural networks (RNNs)

A recurrent neural network is a discrete time-domain neural network that allows the time-domain behavior of a dynamic system to be modeled [1]. Its structure is suitable for tasks such as dynamic system control and finite-difference time-domain (FDTD) solutions in electromagnetic modeling [21]. The output of the neural network is a function of its present inputs and of a history of its inputs and outputs [22]. The delayed outputs are fed back to the inputs, and the feed-forward network together with the feedback delays forms the recurrent neural network structure. In this architecture, the inputs and outputs are functions of time, denoted by the parameter *t*, and *τ* is the delay that represents the effect of the history of the neural network inputs and outputs.

Let the single external input of the neural network be *x*(*t*), so that its history is *x*(*t* − *τ*), *x*(*t* − 2*τ*), *x*(*t* − 3*τ*), …, *x*(*t* − *ατ*), where *α* is the maximum number of delay steps for *x*, and let the single output of the RNN be *y*(*t*), with history *y*(*t* − *τ*), *y*(*t* − 2*τ*), *y*(*t* − 3*τ*), …, *y*(*t* − *βτ*), where *β* is the maximum number of delay steps for *y*. The architecture of the RNN is shown in Figure 10. The corresponding formulation is

$$y(t) = f\big(y(t-\tau), \ldots, y(t-\beta\tau),\; x(t), x(t-\tau), \ldots, x(t-\alpha\tau)\big)$$

where *f*(·) is the function realized by the feed-forward network.

Suppose a three-layer discrete-time MLP neural network as above, with activation function *z* = *σ(γ)*. Applying the delays to the process, the output of the *ith* neuron at *t* is

in which,

where *T* is the total number of neurons, *x*_{i} is the external input, *i* = 1, 2, 3, …, *T*, *y*_{i} is the output of the neuron itself, and *y*_{j} is the output of the other neurons, *j* = 1, 2, 3, …, *T*, *i* ≠ *j*.
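The way an RNN produces its output from the present input and from delayed inputs and outputs can be sketched as a simple time-stepping loop. In this sketch one array index corresponds to one delay *τ*, the history before the first sample is assumed to be zero, and the function `f` is a hypothetical stand-in for a trained feed-forward network.

```python
import numpy as np

def simulate_rnn(f, x, alpha, beta):
    """Time-step y[k] = f(y[k-1..k-beta], x[k..k-alpha]).

    Samples before the start of the record are taken as zero (an assumption).
    """
    y = np.zeros(len(x))
    for k in range(len(x)):
        # Delayed outputs y(t - tau), ..., y(t - beta*tau) fed back to the input.
        y_hist = [y[k - d] if k - d >= 0 else 0.0 for d in range(1, beta + 1)]
        # Present input x(t) and delayed inputs x(t - tau), ..., x(t - alpha*tau).
        x_hist = [x[k - d] if k - d >= 0 else 0.0 for d in range(0, alpha + 1)]
        y[k] = f(np.array(y_hist), np.array(x_hist))
    return y

def f(y_hist, x_hist):
    """Hypothetical stand-in for the trained network: a first-order low-pass."""
    return 0.5 * y_hist[0] + 0.5 * x_hist[0]

x = np.ones(20)                      # unit-step input
y = simulate_rnn(f, x, alpha=1, beta=1)
```

For the unit-step input, the stand-in model gives a response that rises monotonically toward 1, illustrating how the fed-back output carries the memory of the system.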

## 5. Dynamic neural networks (DNNs)

A dynamic neural network is a continuous time-domain neural network and is one of the best formulations for modeling nonlinear microwave circuits [9]. The DNN is efficient in both theory and practice. It is suitable for a wide range of needs in nonlinear microwave simulation, for example, both time- and frequency-domain applications, multitone simulations, and so on [12]. In comparison with other neural network methods, the DNN provides faster and more accurate modeling, which today’s efficient CAD algorithms require for high-level and large-scale nonlinear microwave design. A DNN can also be developed directly from input-output data, without depending on the internal details of the circuit [12]. In a DNN, the outputs are a function of the inputs and their derivatives, and also of the derivatives of the outputs. Figure 11 shows the architecture of the dynamic neural network and the process occurring at each level.

In Figure 11, *y*^{(n)}(*t*) is the *n*th derivative of *y*(*t*); it is integrated and fed back as an input to the system along with the inputs *x*(*t*) and their derivatives *x*^{(i)}(*t*). *f*(·) represents the MLP nonlinear function, and *y*^{(i)}(*t*) represents the *i*th derivative of *y*(*t*).

The DNN model can represent a nonlinear circuit when trained and tested with an appropriate data set, measured or obtained from the original circuit.
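The feedback loop of the DNN, in which the highest derivative of the output is computed by the network and then integrated, can be sketched with a forward-Euler integrator. This is a simplified illustration under several assumptions: a single input and output, only the first derivative of the input is used, the integrator is forward Euler, and `f` is a hypothetical stand-in for the trained MLP.

```python
import numpy as np

def simulate_dnn(f, x, dt, order, y0=None):
    """Integrate y^(n)(t) = f([y, y', ..., y^(n-1)], [x, x']) by forward Euler."""
    x = np.asarray(x, dtype=float)
    dx = np.gradient(x, dt)                       # numerical first derivative of the input
    state = np.zeros(order) if y0 is None else np.array(y0, dtype=float)
    y = np.empty(len(x))
    for k in range(len(x)):
        y[k] = state[0]                           # current output y(t)
        highest = f(state, np.array([x[k], dx[k]]))
        # Each stored derivative is advanced using the next higher one.
        derivs = np.append(state[1:], highest)
        state = state + dt * derivs
    return y

def f(y_derivs, x_derivs):
    """Hypothetical stand-in for the MLP: first-order system y' = x - y."""
    return x_derivs[0] - y_derivs[0]

t = np.linspace(0.0, 1.0, 1001)
dt = t[1] - t[0]
y = simulate_dnn(f, np.ones_like(t), dt, order=1)
```

With the stand-in `f`, the result approximates the step response 1 − e^{−t}; replacing `f` with a trained network gives the transient response of the modeled circuit.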

### 5.1. State-space dynamic neural network (SSDNN)

State-space dynamic neural network (SSDNN) is a technique for modeling nonlinear transient behaviors especially in high-speed IC and nonlinear circuits. The SSDNN-modeling technique is based on DNN structure and is a combination of DNN and state-space concept, which expands continuous DNN into a more general and flexible approach for nonlinear transient modeling and design with good accuracy [23].

Let *v* ∈ ℝ^{N} be the transient input signal of a nonlinear circuit and *y* ∈ ℝ^{M} its transient output signal, where *N* and *M* are the number of inputs and outputs of the circuit, respectively. The weight parameter matrix *w* is divided into three matrices, *w*_{v}, *w*_{s}, and *w*_{o}, which are the weights connecting to the inputs (*v*), the weights connecting to the state variables, and the weights connecting the neurons of the hidden layer to the outputs, respectively. Also, *η* is a constant scaling parameter. The SSDNN model formulation can be represented by the following equations:

where *x* = [*x*_{1}, …, *x*_{L}]^{T} ∈ ℝ^{L} is a vector of state variables with initial condition *x*(0) = *x*_{0}, and *L* is the dimension of the state space, that is, the order of the model, *N* + *L* input neurons and *L* output neurons, also

#### 5.1.1. Adjoint state-space dynamic neural network (ASSDNN)

The adjoint state-space dynamic neural network (ASSDNN) method, like the SSDNN method, is used for modeling the transient behavior of nonlinear electronic and photonic components. It is an extension of the SSDNN technique that adds the derivative information of the output to the training patterns of nonlinear components, so that training can be done more efficiently and with less data, without sacrificing model accuracy [23, 24]. It has been shown in Ref. [24] that the testing error of a model trained by the ASSDNN method is much lower than that obtained from SSDNN. The formulation of ASSDNN uses notation similar to that of SSDNN given above, and an overview of the structure of ASSDNN is shown in Figure 13.

## 6. Other methods in microwave modeling

There are other methods for modeling microwave components that are not based on neural networks, such as the Krylov method [25], the finite element method [26], the mode-matching method [27], the vector-fitting method, and so forth [28]. The following section presents the vector-fitting method, which has been used in many microwave simulation and modeling studies [29–31].

### 6.1. Vector-fitting method

Vector fitting (VF) is a robust numerical technique for rational approximation of transfer functions and *S*-parameters in the frequency domain using poles and residues, especially for microwave devices [32]. It allows multiport models to be calculated directly from measured or computed frequency responses. The resulting approximation has guaranteed stable poles that are real or come in complex conjugate pairs, and the model can be converted directly into a state-space model [13].

Basically, vector fitting is a pole relocation method where the poles are improved in an iterative manner. This is achieved by repeatedly solving a linear problem until convergence is achieved [13]. The VF formulation avoids the ill-conditioning problems encountered with some alternative approaches, as the formulation is given in the form of simple fractions instead of polynomials. Unstable poles are flipped into the left-half plane to enforce stable poles. This makes VF applicable to high-order systems and wide frequency bands [33].

Mathematical representation of vector-fitting method is presented briefly in this section.

Let {*p*_{n}} be a set of unknown poles and {*r*_{n}} the corresponding residues, and let *H*(*s*) be the given rational function:

$$H(s) = \sum_{n=1}^{M} \frac{r_n}{s - p_n} \tag{15}$$

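in which *M* is the order of the macromodel. The poles are identified by solving the linear problem shown in Eq. (16) for the *i*th iteration,

$$\sum_{n=1}^{M} \frac{r_n^{i}}{s - p_n^{i}} = \left(1 + \sum_{n=1}^{M} \frac{\gamma_n^{i}}{s - p_n^{i}}\right) H(s) \tag{16}$$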

in which *r*_{n}^{i} is *r*_{n} at the *i*th iteration, and likewise for *p*_{n}^{i}, and *γ*_{n}^{i} is found in the matrix *x*. In Eq. (16), we call *σ*^{i}(*s*) and (*σH*)^{i}(*s*) the unknown rational functions with given poles. By writing Eq. (16) for several frequency points, we have an overdetermined linear problem with frequency-sampled data points *f*_{t}:

in which *t* = 1, 2, …, *N*_{f},

It can be proven that the poles of *H*(*s*) are equal to the zeros of *σ*^{i}(*s*), and that the zeros of *σ*^{i}(*s*) can be calculated by solving an eigenvalue problem, as shown in Eq. (19) [33]

There are also different approaches to initialization. Basically, the initial poles should be complex with weak attenuation; they can be obtained by a simple calculation such as the Prony method [13], or they can simply be spaced over the frequency range of interest, for example, between 50 Hz and 1 MHz [17]. An advantage is that even if the starting poles are poorly selected, the result does not change significantly [32]. By solving Eq. (19), a new set of poles is identified. After all the poles have been identified, the residues are calculated by solving Eq. (16), which is again a linear problem. In summary, the VF method samples the given function at an appropriate sample rate and from these samples finds a sum of partial fractions that approximates the original transfer function.
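The pole-relocation idea can be illustrated with a deliberately reduced sketch. It is not the full algorithm of the cited references: it assumes a scalar response, real poles only (no complex-conjugate pairs), and no constant or proportional terms. The function name `vector_fit_real`, the test function, and the starting poles are hypothetical.

```python
import numpy as np

def vector_fit_real(s, h, poles, n_iter=5):
    """Fit h(s) ~ sum_n r_n / (s - p_n) with real poles by iterative pole relocation."""
    poles = np.asarray(poles, dtype=float)
    m = len(poles)
    for _ in range(n_iter):
        # Linear problem: sum_n r_n/(s-p_n) - h(s) * sum_n g_n/(s-p_n) = h(s)
        basis = 1.0 / (s[:, None] - poles[None, :])
        a = np.hstack([basis, -h[:, None] * basis])
        # Stack real and imaginary parts so that the unknowns stay real.
        a_ri = np.vstack([a.real, a.imag])
        b_ri = np.concatenate([h.real, h.imag])
        sol = np.linalg.lstsq(a_ri, b_ri, rcond=None)[0]
        gamma = sol[m:]
        # Zeros of sigma(s) = eigenvalues of diag(p) - 1 * gamma^T: the new poles.
        new_poles = np.linalg.eigvals(np.diag(poles) - np.outer(np.ones(m), gamma))
        # Sketch restriction: keep real parts only; flip unstable poles to the left half-plane.
        poles = -np.abs(np.real(new_poles))
    # With the poles fixed, the residues follow from one more linear problem.
    basis = 1.0 / (s[:, None] - poles[None, :])
    a_ri = np.vstack([basis.real, basis.imag])
    b_ri = np.concatenate([h.real, h.imag])
    residues = np.linalg.lstsq(a_ri, b_ri, rcond=None)[0]
    order = np.argsort(poles)
    return poles[order], residues[order]

# Hypothetical test response with known poles at -5 and -20.
omega = np.logspace(-1, 3, 60)
s = 1j * omega
h = 2.0 / (s + 5.0) + 3.0 / (s + 20.0)
poles, residues = vector_fit_real(s, h, poles=[-1.0, -100.0])
```

Because the unknowns appear only in partial fractions, every step is a linear least-squares or eigenvalue problem, which is why the starting poles can be far from the true ones.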

## 7. Related work

This part of the chapter briefly discusses the application of computer-aided design (CAD) techniques to the modeling and simulation of passive RF and microwave components. Neural network-based modeling approaches have been widely used for modeling a variety of passive RF and microwave components, such as coupled-line filters, coplanar waveguides, Vias and multilayer interconnects, and some other passive components.

### 7.1. General procedure of modeling

Here, we provide a brief review of procedure used in neural network-based modeling of RF and microwave-passive components.

For modeling microwave components in the frequency domain [1], the input and output parameters of the components are first selected over a wide range of frequencies. In most ANN models, it is desirable to represent the outputs in terms of scattering parameters (*S*-parameters). The next step is data generation. For passive component models, EM simulation is widely used for generating data; the EM simulators used in developing ANN models produce the *S*-parameters of the components. After training, a criterion is needed for judging the accuracy of the model, so the error of the model is measured in different formulations. In most EM-ANN models, the average absolute error and the standard deviation of the error are measured for each output.
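The two error measures mentioned above can be computed per output as in the following sketch. The normalization by the range of the reference data, which turns the errors into percentages, is an assumption made for illustration; the chapter does not state which normalization the cited models use, and the sample values are hypothetical.

```python
import numpy as np

def error_statistics(y_model, y_ref):
    """Average absolute error and standard deviation of the error, in percent, per output.

    Errors are normalized by the range of the reference data for each output
    (an assumed convention).
    """
    span = y_ref.max(axis=0) - y_ref.min(axis=0)
    err = 100.0 * np.abs(y_model - y_ref) / span
    return err.mean(axis=0), err.std(axis=0)

# Hypothetical reference data (rows: samples, columns: outputs) and model outputs.
y_ref = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
y_model = y_ref + np.array([[0.02, 0.0], [0.0, 0.2], [-0.02, 0.0]])
avg_err, sd_err = error_statistics(y_model, y_ref)
```

Reporting both numbers for the training set and for a separate validation or test set shows whether the model generalizes or has merely memorized the training data.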

After training and verification, EM-ANN models can be used, depending on the application, either in stand-alone mode or in integrated mode along with microwave circuit simulators. In integrated mode, a linear model subroutine connects the models to the simulator and returns the *S*-parameters of a component for further simulation. While the simulation is running, the simulator passes parameters of the component, such as the frequency, to this subroutine. To compute the *S*-parameters, a feed-forward ANN subroutine receives the input variables and holds the algorithm for evaluating the output of the ANN model. In addition, models can be connected to the circuit simulator as a group; such collections of models are called libraries [34].

### 7.2. Parametric modeling of a coupled-line filter

In this example, ANN techniques are used to develop a model for a family of coupled-line filters [35]. Here, *S*_{1} and *S*_{2} are the spacings between the lines, and *D*_{1}, *D*_{2}, and *D*_{3} are the offset distances from the ends of each coupled line to the corresponding fringes. The model has six inputs, *x* = {*S*_{1}, *S*_{2}, *D*_{1}, *D*_{2}, *D*_{3}, *ω*}, and four outputs, *RS*_{11}, *IS*_{11}, *RS*_{12}, and *IS*_{12}, which are the real and imaginary parts of the *S*-parameters *S*_{11} and *S*_{12}. The testing and training data were obtained from CST Microwave Studio [36]. Table 1 shows the final testing and training results after developing the ANN model for the coupled-line filter. In the table, the neural network structure 6-40-4 denotes a neural network with six inputs, 40 hidden neurons, and four outputs. As can be seen from the table, the ANN model trained with 120 sets of data matches the data obtained from the simulation tool with average training and testing errors below 1%, which supports the use of this method for model creation.

Model type | Neural network structure | Average training error (%) | Average testing error (%) |
---|---|---|---|
ANN model using 120 sets of training data | 6-40-4 | 0.897 | 0.989 |
ANN model using 40 sets of training data | 6-35-4 | 1.073 | 4.357 |
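To give a sense of the size of the two network structures listed in the table, the number of trainable weights can be counted as in this short sketch. It assumes a fully connected MLP with one bias per hidden and output neuron; the function name is hypothetical.

```python
def count_mlp_weights(layer_sizes):
    """Number of weights in a fully connected MLP, with one bias per non-input neuron."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

n_large = count_mlp_weights([6, 40, 4])   # (6 + 1) * 40 + (40 + 1) * 4 = 444
n_small = count_mlp_weights([6, 35, 4])   # (6 + 1) * 35 + (35 + 1) * 4 = 389
```

Under this assumption the two structures differ by only 55 weights, so the gap between the testing errors in the table is associated mainly with the amount of training data rather than with the size of the network.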

### 7.3. EM-ANN models for CPW components

The use of coplanar waveguides (CPWs) in RF and microwave integrated circuits has brought many advantages. Accurate modeling of these components is necessary for accurate circuit simulation. Recent work has aimed at developing accurate and efficient methods for EM simulation of CPW discontinuities, but the challenge in using these tools for iterative CAD and circuit optimization [37] is the time-consuming nature of EM simulation. To overcome this problem, EM-ANN models have been suggested [38]. The models cover CPW transmission lines, short- and open-circuit stubs, step-in-width discontinuities, and T-junctions. These EM-ANN models are linked to microwave circuit simulators and allow accurate and very fast EM-based circuit optimization within the framework of the circuit simulator [1]. A general schematic of a coplanar waveguide is shown in Figure 14. In this figure, *W* is the center conductor width, *G* is the spacing between the conductor and the ground plane, and *L* is the center conductor length.

### 7.4. Vias elements in microstrip circuits and multilayer Vias connectors

Progress in technology has led to the integration of large numbers of microwave circuits into complex multilayer structures, and considerable effort is invested in optimizing these circuits and lowering their cost and weight. In addition, accuracy and efficiency are important requirements for obtaining useful simulation results. Unlike other solutions that have been suggested, which suffered from limitations such as heavy computational expense or a limited frequency range, the EM-ANN-based methodology has been very successful in modeling Via elements in microstrip circuits and multilayer Via connectors [39]. As an example, the microstrip transmission line model is one of the ANN-based models implemented in this work. In this model, the input parameters are the frequency, log(*W*_{l}/*H*_{sub}), which varies between −1 and 1, where *W*_{l} is the microstrip width and *H*_{sub} is the substrate height, and *ε*_{r}, the relative dielectric constant of the substrate, which is in the range of 2–13. A simple schematic of a microstrip transmission line is shown in Figure 15.

The output parameters are *Z*_{0}, the characteristic impedance, and *ε*_{eff}, the effective dielectric constant. A total of 155 data samples were used for training and 100 for validation, and the model uses 10 hidden neurons. To test the model, the standard deviation and average error criteria were used. Error results for the microstrip transmission line are shown in Table 2.

Error measure | *Z*_{0} (%) | *ε*_{eff} (%) |
---|---|---|
Training data average error | 1.161 | 0.377 |
Training data SD | 1.157 | 0.376 |
Validation data average error | 0.774 | 0.293 |
Validation data SD | 0.875 | 0.223 |

## 8. Conclusion

In this chapter, a review of some tools commonly used in RF/microwave simulation and modeling has been presented. In the last few decades, high-frequency effects have become an important factor in the RF/microwave area. These effects appear at all levels of design, from tiny chips to packaging structures. To capture them, it is common to use physics-based or electrical models, which lead to large systems of equations whose solution and simulation are extremely time-consuming and expensive. Artificial neural networks have recently become popular among computer-aided design tools. The main topic of this chapter was the neural network as a powerful tool for modeling and simulation, and the two main types of neural network structure, static and dynamic, and their variants were presented. The static neural network section covered the multilayer perceptron (MLP) and radial-basis function (RBF) structures, and the time-domain part covered the recurrent neural network (RNN), dynamic neural network (DNN), state-space dynamic neural network (SSDNN), and adjoint state-space dynamic neural network (ASSDNN) methods. Besides neural networks, several numerical methods are used in the simulation and modeling of microwave components, such as the Krylov method, finite-difference time-domain (FDTD), finite-element time-domain (FETD), and vector fitting (VF). Here, we presented the vector-fitting method, which is widely used for modeling microwave and electromagnetic components with good performance. Unlike other system identification methods, VF avoids ill-conditioned calculations, and because of this it works more efficiently. The method is also very robust: it performs well even for high-order fitting and is not disturbed by poorly selected starting poles. The VF technique is easy to implement in a computer program, since it is built on matrices of simple fractions, and the resulting linear problems are easy to solve.

In conclusion, ANN-based methodologies and the other methods discussed can be applied to RF/microwave modeling and component simulation and have been shown to offer advantages in both speed and accuracy for modeling nonlinear functions compared with many conventional techniques.