Open access peer-reviewed chapter

Formation of Inter-Frame Deformation Field of Images Using Reverse Stochastic Gradient Estimation

By Alexander Tashlinskii and Pavel Smirnov

Submitted: July 1st, 2018. Reviewed: December 11th, 2018. Published: January 9th, 2019

DOI: 10.5772/intechopen.83489


Abstract

The effectiveness of the use of stochastic gradient estimation to detect motion in a sequence of images is investigated. Pixel-by-pixel reverse estimation of the deformation field is used. The representation of the shift vector is considered both by the projections on the basic axes and by the polar parameters. Two approaches to estimate the parameters of the deformation field are proposed and analyzed. In the first approach, the stochastic gradient procedure sequentially processes all rows of an image to find estimates of shifts for all points of the reference image. The joint processing of the results allows compensating the inertia of the stochastic estimation. In the second approach, to improve the accuracy of estimation, the correlation of image rows is taken into account. As a criterion for the formation of the resulting estimate, the minimum of gradient estimation and correlation maximum were investigated. The computational complexity of the proposed algorithms is investigated. The algorithms are compared with the MVFAST algorithm. Examples of experimental results on the formation of the deformation field, the selection of a moving object area, and the finding of the movement and trajectory parameters of a moving object in a video sequence are given.

Keywords

  • image processing
  • video sequence
  • motion
  • shift vector
  • detection
  • stochastic estimation
  • stochastic gradient
  • trajectory
  • reverse estimation
  • correlation

1. Introduction

One of the challenges in video processing is moving object detection and tracking. Some tasks require only detection of the motion, while others require extraction of the moving object or of the motion area boundary. The biggest challenge is to estimate the trajectory parameters of a moving object in a video sequence. The quality of a solution to this problem largely depends on the accuracy of moving object area detection, since all the information needed to determine the motion parameters and trajectory of the object is extracted from the images.

There are various approaches to identifying the area of a moving object, based on the inter-frame difference [1, 2], background subtraction [2, 3], the use of statistics [2, 4], block estimation [5], and optical flow analysis [6]. The processing can be presented as estimation of the inter-frame geometric deformations of two images, one of which is considered the reference image Z^s = {z^s_{i,j}} and the other the deformed image Z^d = {z^d_{i,j}}, where z_{i,j} is the brightness value at node (i, j) of the image. The field H = {h̄_{i,j}} of inter-frame shift vectors h̄_{i,j} of all points of the reference image corresponding to the nodes of the sampling grid will be called the deformation field, or the shift vectors' field.

An important task in estimating the trajectory of a moving object is to increase the spatial accuracy of detecting the area of motion in an image sequence. The solution of this problem depends on the quality of estimation of the deformation field. When finding shift estimates for each node of the deformed image, the approach of using stochastic gradient estimation [7, 8, 9] is promising. It finds the coordinates (x, y) of points of the reference image on the deformed image. That is, it estimates the vectors h̄_{i,j} = (h^x_{i,j}, h^y_{i,j})^T of inter-frame shifts of the points (i, j) (Figure 1) of the reference image.

Figure 1.

Representation of the shift of reference image point (i, j).

In view of the insignificant change in brightness between adjacent frames of a video sequence, it is advisable to use the mean square inter-frame difference as the objective function of estimation. Then, for the projections h^x_{i,j} and h^y_{i,j} of the shift vector, we get [10]

$$\hat{\bar h}_{i,j+1} = \hat{\bar h}_{i,j} - \lambda_h \operatorname{sign} \bar\beta_h\!\left(z^d_{i,j+1}, \hat{\bar h}_{i,j}\right), \quad i = \overline{1, N_x},\; j = \overline{1, N_y - 1}, \tag{1}$$

where λ_h is the learning rate, which determines the rate of change of the estimated parameters, β̄_h = (β_{hx}, β_{hy})^T is the stochastic gradient estimate of the objective function, and N_x × N_y is the image size.

The projections of the stochastic gradient can be represented as [11]

$$\beta_{hx} = \left(\tilde z^s_{x+\Delta x,\,y} - \tilde z^s_{x-\Delta x,\,y}\right)\left(\tilde z^s_{x,y} - z^d_{i,j}\right),$$
$$\beta_{hy} = \left(\tilde z^s_{x,\,y+\Delta y} - \tilde z^s_{x,\,y-\Delta y}\right)\left(\tilde z^s_{x,y} - z^d_{i,j}\right),$$

where (x, y) are the coordinates of the reference image point on the deformed image, z̃^s_{x,y} is the brightness of the continuous image Z̃^s obtained from Z^s by means of interpolation, and Δx, Δy are the steps used to estimate the derivatives ∂z̃^s_{x,y}/∂x and ∂z̃^s_{x,y}/∂y [12].
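The gradient projections and the sign update of Eq. (1) can be sketched in Python. This is a minimal illustration, not the authors' implementation: the bilinear interpolation scheme, the border clamping, and all function names are assumptions introduced here.

```python
import numpy as np

def bilinear(img, x, y):
    """Brightness of the continuous image z~(x, y) obtained from the sampled
    image by bilinear interpolation (coordinates clamped at the borders)."""
    h, w = img.shape
    x = min(max(x, 0.0), w - 1.001)
    y = min(max(y, 0.0), h - 1.001)
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x0 + 1]
            + (1 - fx) * fy * img[y0 + 1, x0] + fx * fy * img[y0 + 1, x0 + 1])

def grad_projections(ref, zd, x, y, dx=1.0, dy=1.0):
    """Stochastic gradient projections beta_hx, beta_hy of the mean square
    inter-frame difference at reference-image position (x, y); zd is the
    brightness of the corresponding node of the deformed image."""
    diff = bilinear(ref, x, y) - zd
    bx = (bilinear(ref, x + dx, y) - bilinear(ref, x - dx, y)) * diff
    by = (bilinear(ref, x, y + dy) - bilinear(ref, x, y - dy)) * diff
    return bx, by
```

On a horizontal brightness ramp, a node whose true shift is positive in x yields a negative beta_hx, so the update h ← h − λ·sign(beta_hx) moves the estimate toward the true shift, as Eq. (1) requires.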

Note that the inter-frame shift vector h̄_{i,j} can also be represented in polar form: h̄_{i,j} = (ρ_{i,j}, φ_{i,j}) (Figure 1), where ρ_{i,j} is the length of the vector and φ_{i,j} is its angle with respect to the x axis. Functionally, these representations are equivalent. However, due to the inertia of the estimates of the shift vectors' field H under stochastic gradient descent, the use of the parameters ρ_{i,j} and φ_{i,j} does not give estimates equivalent to those obtained with h^x_{i,j} and h^y_{i,j}. This is because the two sets of parameters have different physical meanings [13]. The answer to the question of which set is preferable for solving the problem of moving object area detection is not obvious and requires research.

2. Variants of estimation algorithms

Let us compare the efficiency of moving object area identification using the sets of parameters (h^x, h^y) and (ρ, φ). The following method for estimating the shift vectors' field H is used. At the first stage, the stochastic gradient descent algorithm sequentially processes all nodes of the reference image row by row with the set of parameters (h^x, h^y). It processes each row i bidirectionally [14]: first from left to right, obtaining, according to Eq. (1), the estimates

$$\hat h^{x,l}_{i,j+1} = \hat h^{x,l}_{i,j} - \lambda_h \operatorname{sign} \beta_{hx}\!\left(z^d_{i,j+1}, \hat{\bar h}_{i,j}\right), \quad \hat h^{x,l}_{i,1} = \hat h^{x,l}_{i-1,1},$$
$$\hat h^{y,l}_{i,j+1} = \hat h^{y,l}_{i,j} - \lambda_h \operatorname{sign} \beta_{hy}\!\left(z^d_{i,j+1}, \hat{\bar h}_{i,j}\right), \quad \hat h^{y,l}_{i,1} = \hat h^{y,l}_{i-1,1},$$

and then from right to left, obtaining the estimates

$$\hat h^{x,r}_{i,N_y-j} = \hat h^{x,r}_{i,N_y-j+1} - \lambda_h \operatorname{sign} \beta_{hx}\!\left(z^d_{i,N_y-j}, \hat{\bar h}_{i,N_y-j+1}\right), \quad \hat h^{x,r}_{i,N_y} = \hat h^{x,r}_{i-1,N_y},$$
$$\hat h^{y,r}_{i,N_y-j} = \hat h^{y,r}_{i,N_y-j+1} - \lambda_h \operatorname{sign} \beta_{hy}\!\left(z^d_{i,N_y-j}, \hat{\bar h}_{i,N_y-j+1}\right), \quad \hat h^{y,r}_{i,N_y} = \hat h^{y,r}_{i-1,N_y},$$

where the parameter λ_h is determined by the maximum speed of the moving objects.

When using the set of parameters (ρ, φ), considering that ∂x/∂ρ = cos φ, ∂x/∂φ = −ρ sin φ, ∂y/∂ρ = sin φ, and ∂y/∂φ = ρ cos φ, for the gradient estimate β̄ = (β_ρ, β_φ)^T we get [11]

$$\beta_\rho = \left(\tilde z^s_{x+\Delta x,\,y} - \tilde z^s_{x-\Delta x,\,y}\right)\left(\frac{\tilde z^s_{x+\Delta x,\,y} + \tilde z^s_{x-\Delta x,\,y}}{2} - z^d_{i,j}\right)\cos\varphi + \left(\tilde z^s_{x,\,y+\Delta y} - \tilde z^s_{x,\,y-\Delta y}\right)\left(\frac{\tilde z^s_{x,\,y+\Delta y} + \tilde z^s_{x,\,y-\Delta y}}{2} - z^d_{i,j}\right)\sin\varphi,$$
$$\beta_\varphi = -\left(\tilde z^s_{x+\Delta x,\,y} - \tilde z^s_{x-\Delta x,\,y}\right)\left(\frac{\tilde z^s_{x+\Delta x,\,y} + \tilde z^s_{x-\Delta x,\,y}}{2} - z^d_{i,j}\right)\rho\sin\varphi + \left(\tilde z^s_{x,\,y+\Delta y} - \tilde z^s_{x,\,y-\Delta y}\right)\left(\frac{\tilde z^s_{x,\,y+\Delta y} + \tilde z^s_{x,\,y-\Delta y}}{2} - z^d_{i,j}\right)\rho\cos\varphi.$$

Then according to Eq. (1), we get the estimates

$$\hat\rho^{\,l}_{i,j+1} = \hat\rho^{\,l}_{i,j} - \lambda_\rho \operatorname{sign} \beta_\rho\!\left(z^d_{i,j+1}, \hat\rho_{i,j}, \hat\varphi_{i,j}\right),$$
$$\hat\varphi^{\,l}_{i,j+1} = \hat\varphi^{\,l}_{i,j} - \lambda_\varphi \operatorname{sign} \beta_\varphi\!\left(z^d_{i,j+1}, \hat\rho_{i,j}, \hat\varphi_{i,j}\right),$$
$$\hat\rho^{\,r}_{i,N_y-j} = \hat\rho^{\,r}_{i,N_y-j+1} - \lambda_\rho \operatorname{sign} \beta_\rho\!\left(z^d_{i,N_y-j}, \hat\rho_{i,N_y-j+1}, \hat\varphi_{i,N_y-j+1}\right),$$
$$\hat\varphi^{\,r}_{i,N_y-j} = \hat\varphi^{\,r}_{i,N_y-j+1} - \lambda_\varphi \operatorname{sign} \beta_\varphi\!\left(z^d_{i,N_y-j}, \hat\rho_{i,N_y-j+1}, \hat\varphi_{i,N_y-j+1}\right),$$

where the parameters λ_ρ and λ_φ are also determined by the maximum speed of the moving objects.

At the second stage, for each node (i, j), the estimates ĥ^{x,l}_{i,j} and ĥ^{x,r}_{i,j} are processed jointly [15]. The optimal value of ĥ^x_{i,j} is sought between the two estimates with some step Δh, which is determined by the required accuracy. If the absolute difference between the estimates ĥ^{x,l}_{i,j} and ĥ^{x,r}_{i,j} is less than Δh, then the estimate ĥ^x_{i,j} of the deformation parameter is taken equal to ĥ^{x,l}_{i,j}. Otherwise, the set of possible values of the deformation parameter estimate is given by

$$\hat h^x_{i,j}(m) = \hat h^{x,l}_{i,j} + m\,\Delta h, \quad m = \overline{0, k+1}, \quad k = \left\lfloor \left|\hat h^{x,r}_{i,j} - \hat h^{x,l}_{i,j}\right| / \Delta h \right\rfloor.$$

The optimal value of ĥ^x_{i,j} from the resulting set is found using some criterion.
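The construction of the candidate set and the selection of the resulting estimate can be sketched as follows. This is an illustrative Python sketch under the assumption that the criterion is supplied as an abstract cost function; the function names are not from the chapter.

```python
import math

def candidate_set(h_l, h_r, dh):
    """Candidate shift values between the forward-pass estimate h_l and the
    backward-pass estimate h_r, spaced by the accuracy step dh."""
    if abs(h_r - h_l) < dh:           # passes agree: keep the forward estimate
        return [h_l]
    k = math.floor(abs(h_r - h_l) / dh)
    step = dh if h_r >= h_l else -dh  # walk from h_l toward h_r
    return [h_l + m * step for m in range(k + 2)]   # m = 0, ..., k + 1

def resolve(h_l, h_r, dh, cost):
    """Pick the candidate minimizing a criterion; `cost` stands in for the
    gradient magnitude of Eq. (2) or a negated correlation, Eq. (3)."""
    return min(candidate_set(h_l, h_r, dh), key=cost)
```

For example, with h_l = 1.0, h_r = 2.0, and Δh = 0.25, the set contains the values 1.0, 1.25, ..., 2.25, and `resolve` returns whichever of them the criterion prefers.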

2.1 Criteria for the formation of the resulting estimation

Two criteria for the formation of the resulting estimate are studied [16]:

  • Gradient estimation minimum

$$\min_{m=\overline{0,k+1}} \left|\bar\beta_h\!\left(\hat h^{x,l}_{i,j} + m\,\Delta h\right)\right|, \tag{2}$$

  • Correlation coefficient maximum (CC)

$$\max_{m=\overline{0,k+1}} \mathrm{CC}\!\left(\tilde z^s_{x_m+p,\,y_m+s},\; z^d_{i+p,\,j+s}\right), \quad p = \overline{-a, a},\; s = \overline{-b, b}, \tag{3}$$

where (x_m, y_m) are the coordinates of the point (i, j) of the image Z^s in the image Z^d for the estimate ĥ_{i,j}(m), and (2a + 1) × (2b + 1) is the window size for the calculation of CC. The optimal value of ĥ^y_{i,j} is determined similarly.

Joint processing of the estimates allows compensating the inertia of the recursive estimation. The results given below were obtained with a = b = 1.

The joint processing of the estimates ρ̂^l_{i,j}, ρ̂^r_{i,j} and φ̂^l_{i,j}, φ̂^r_{i,j}, as well as the finding of the optimal values ρ̂_{i,j} and φ̂_{i,j}, is performed as described above using the same criteria of gradient estimation minimum and CC maximum.
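The CC criterion of Eq. (3) can be sketched in Python. As a simplifying assumption, the sketch reads the reference window at integer positions, whereas in the chapter z̃^s is interpolated at fractional coordinates; the function names are illustrative.

```python
import numpy as np

def window_cc(ref, dfm, xm, ym, i, j, a=1, b=1):
    """Sample correlation coefficient between a (2a+1)x(2b+1) window of the
    reference image around the candidate position (xm, ym) and the window of
    the deformed image around node (i, j). Arrays are indexed [row, col]."""
    ws = ref[ym - b: ym + b + 1, xm - a: xm + a + 1].ravel()
    wd = dfm[j - b: j + b + 1, i - a: i + a + 1].ravel()
    ws = ws - ws.mean()
    wd = wd - wd.mean()
    denom = np.sqrt((ws ** 2).sum() * (wd ** 2).sum())
    return float((ws * wd).sum() / denom) if denom > 0 else 0.0

def best_candidate(ref, dfm, i, j, candidates, a=1, b=1):
    """Among integer candidate shifts (hx, hy), return the one maximizing CC,
    i.e., the selection rule of Eq. (3)."""
    return max(candidates,
               key=lambda h: window_cc(ref, dfm, i + h[0], j + h[1], i, j, a, b))
```

When the deformed image is the reference shifted by a whole number of pixels, the true shift yields CC = 1 in the window and is selected over the other candidates.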

Thus, for the joint processing of the estimates, depending on the criterion, we obtain two versions of the estimation algorithm:

Algorithm A: reverse processing using the criterion Eq. (2);

Algorithm B: reverse processing using the criterion Eq. (3).

2.2 Comparison of the estimation algorithm efficiency

Let us first consider the efficiency of the criterion Eq. (2). Figure 2 shows an example of the images used for the study. These are two adjacent frames of a video sequence in which the vehicle located in the center is moving and the vehicle on the right is motionless. The parameters of the inter-frame spatial shift of the moving vehicle are h^x = 3, h^y = 2.95 (or ρ = 4.2, φ = 45°).

    Figure 2.

    An example of adjacent frames of a video sequence with a moving object.

For example, Figure 3 shows typical results of shift magnitude estimation for a single row of the reference image (Figure 2) when using Algorithm A. For a correct comparison of the estimates ρ̂_{i,j}, φ̂_{i,j} with ĥ^x_{i,j}, ĥ^y_{i,j}, the latter are recalculated into polar parameters:

$$\rho_h = \sqrt{\left(\hat h^x_{i,j}\right)^2 + \left(\hat h^y_{i,j}\right)^2}, \quad \varphi_h = \operatorname{arctg}\left(\hat h^y_{i,j} / \hat h^x_{i,j}\right).$$

    Figure 3.

    An example of estimates of shift magnitude for a row using criterion Eq. (2).

Figure 3a shows the dependencies of ρ̂^l_{i,j} and ρ̂^r_{i,j} on the point i in a row, Figure 3b shows the result of their joint processing, Figure 3c shows the dependencies of ρ^l_h and ρ^r_h on i, and Figure 3d shows the result of their joint processing. The solid gray line represents the true value of the deformation parameter. The results for the parameters (ρ, φ) are visually better, and the estimates given in Table 1 confirm this. Table 1 shows the mean value m̄_h and the variance σ̂²_h of the estimation error, for the areas with and without motion, for the results shown in Figure 3b and d.

For a single row:

| Algorithm | Motion area, m̄: (h^x, h^y) | (ρ, φ) | Motion area, σ̂² × 10²: (h^x, h^y) | (ρ, φ) | No motion, m̄: (h^x, h^y) | (ρ, φ) | No motion, σ̂² × 10²: (h^x, h^y) | (ρ, φ) |
|---|---|---|---|---|---|---|---|---|
| A | 0.46 | 0.31 | 21.27 | 10.41 | 0.77 | 0.49 | 21.91 | 26.11 |
| B | 0.18 | 0.11 | 1.22 | 0.66 | 0.3 | 0.15 | 3.82 | 0.89 |
| C | 0.37 | 0.06 | 19.8 | 4.84 | 0.44 | 0.22 | 3.4 | 1.42 |
| D | 0.06 | 0.01 | 0.38 | 0.26 | 0.03 | 0.02 | 0.29 | 0.03 |
| MVFAST | 0.05 | — | 25.3 | — | 0.01 | — | 0.01 | — |

For the entire image:

| Algorithm | Motion area, m̄: (h^x, h^y) | (ρ, φ) | Motion area, σ̂² × 10²: (h^x, h^y) | (ρ, φ) | No motion, m̄: (h^x, h^y) | (ρ, φ) | No motion, σ̂² × 10²: (h^x, h^y) | (ρ, φ) |
|---|---|---|---|---|---|---|---|---|
| A | 0.61 | 0.53 | 23.6 | 14.3 | 0.73 | 0.45 | 23.8 | 18.8 |
| B | 0.42 | 0.34 | 7.3 | 1.7 | 0.29 | 0.15 | 2.6 | 1.5 |
| C | 0.13 | 0.14 | 15.9 | 10.1 | 0.46 | 0.25 | 2.8 | 2.9 |
| D | 0.07 | 0.01 | 1.4 | 0.69 | 0.09 | 0.02 | 0.28 | 0.02 |
| MVFAST | 0.08 | — | 18.6 | — | 0.02 | — | 0.05 | — |

(The MVFAST algorithm does not use either parameter set, so a single value per characteristic is given.)

    Table 1.

    Estimation error of shift vectors of deformation field.

Note that with the use of the parameters (ρ, φ) in the motion area, the mean value of the estimation error is lower, and the variance is more than two times less than with the set of parameters (h^x, h^y). In the area without motion, the use of the parameters (ρ, φ) is also preferable, since the bias of the estimates is less despite the slightly larger variance. For comparison, Table 1 also shows the mean value and the variance estimate σ̂²_h of the estimation errors averaged over the entire image.

Figure 4 shows the results of the joint processing of the sets of estimates (ĥ^x_{i,j}, ĥ^y_{i,j}) and (ρ̂^l_{i,j}, ρ̂^r_{i,j}) using the criterion Eq. (3) for the formation of the resulting estimate (Algorithm B). The results are for the same row as in Figure 3. Figure 4a corresponds to the set of parameters (h^x, h^y), and Figure 4b to (ρ, φ). Table 1 summarizes the numerical values of the estimation error for the row and averaged over the entire image.

    Figure 4.

    The resulting estimates of shift magnitude for a row using criterion Eq. (3).

The results show that the use of the CC maximum as the criterion can significantly improve the results of the joint processing of the estimates. For both sets of parameters, the mean value of the error in the motion area does not exceed 0.2. The variance of the error is approximately halved for the set of parameters (h^x, h^y) and is 15 times smaller for the parameters (ρ, φ). For the area without motion, the mean error is smaller by about 2.6 and 3.3 times, respectively. The variance of the estimates also decreases. However, when using CC, the computational complexity of the estimation procedure increases significantly.

Figure 5 shows the visualization of the estimates of the shift vectors' field H for the four estimation algorithms described above. It shows the magnitudes of the estimated vectors as a function of the coordinates of the nodes of the reference image. Figure 5a and b shows the results for criteria Eqs. (2) and (3) with the set of parameters (h^x, h^y), and Figure 5c and d shows the same criteria when the parameters (ρ, φ) are used. The figures illustrate and confirm the conclusions made above.

    Figure 5.

    Visualization of estimating shift vectors’ field by algorithms A and B with different criteria.

    2.3 Use of inter-row correlation of adjacent rows of the image

The approach to estimating the field H considered above processes images row by row as one-dimensional signals. Let us consider the possibility of increasing processing efficiency by taking into account the correlation of adjacent rows of the image [13, 17]. For each node (i, j) of the reference image, we again form two estimates, h̄̂¹_{i,j} and h̄̂²_{i,j}, of the parameters of the vector h̄_{i,j}. To form the estimate h̄̂¹_{i,j}, the stochastic gradient procedure processes the rows one after another with a change of direction after each row: for example, the odd rows from left to right, forming the estimates h̄̂^{1,l}_{i,j}, and the even rows from right to left, forming the estimates h̄̂^{1,r}_{i+1,j}. The current shift estimate h̄̂¹_{i+1,j} in a node is compared with the estimate h̄̂¹_{i,j} obtained in the previous row. If the estimate h̄̂¹_{i,j} is judged better by some criterion, it is taken as the current estimate.

As the criterion, we use the CC maximum, Eq. (3), with the difference that here the set of possible values is limited to the estimates h̄̂¹_{i+1,j} and h̄̂¹_{i,j}. The estimate h̄̂²_{i,j} is formed in a similar way, with the difference that the odd rows are processed from right to left and the even rows from left to right.

The joint processing of the estimates h̄̂¹_{i,j} and h̄̂²_{i,j} is carried out as described above. On the one hand, taking inter-row correlation into account makes the formation of h̄̂¹_{i,j} and h̄̂²_{i,j} more computationally expensive. On the other hand, the set of possible estimates for their joint processing turns out to be significantly smaller, which reduces the computational complexity.
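The serpentine scan order that underlies this inter-row scheme can be sketched in a few lines of Python. The function name and signature are illustrative assumptions; the point is only that the direction flips after every row, so the node entering row i + 1 is spatially adjacent to the last node of row i.

```python
def serpentine_rows(n_rows, n_cols, start_left=True):
    """Scan order used when exploiting inter-row correlation: the direction
    alternates per row, so the estimate carried into row i+1 was formed at a
    nearby column of row i."""
    order = []
    for i in range(n_rows):
        left_to_right = (i % 2 == 0) == start_left
        cols = range(n_cols) if left_to_right else range(n_cols - 1, -1, -1)
        order.extend((i, j) for j in cols)
    return order
```

For a 2 × 3 grid this yields (0,0), (0,1), (0,2), (1,2), (1,1), (1,0): the crossing from row 0 to row 1 happens in the same column, which is what lets the estimate from the previous row be compared with the current one.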

Thus we have two more algorithms for estimating the shift vectors' field:

• Algorithm C: processing of adjacent rows using the criterion Eq. (2); and

• Algorithm D: processing of adjacent rows using the criterion Eq. (3).

2.4 Comparison of algorithms C and D

Figure 6 shows typical results of the joint processing of the sets of estimates ĥ^x_{i,j}, ĥ^y_{i,j} (Figure 6a) and ρ̂_{i,j}, φ̂_{i,j} (Figure 6b) using Algorithm C. Figure 6c and d shows the results for the same sets using Algorithm D. It can be seen from the figures that the use of inter-row correlation can significantly improve the estimation results. For the parameters (h^x, h^y) and Algorithm C, the mean value of the error in the motion area decreases by 1.2 times and the error variance by 1.1 times; for Algorithm D, by 3 times and 29 times, respectively. For the set of parameters (ρ, φ) and Algorithm C, the mean value of the error in the motion area decreases by 5 times and the variance by 2.1 times; for Algorithm D, by 10 times and 2.5 times, respectively. The numerical values are listed in Table 1.

    Figure 6.

    Example of estimates of shift magnitude for a row using inter-row correlation.

Figure 7 shows the visualization of the estimates of the shift vectors' field H for these estimation algorithms. Figure 7a and b corresponds to algorithms C and D with the set of parameters (h^x, h^y), and Figure 7c and d corresponds to the same algorithms with the set of parameters (ρ, φ). It can be seen that the criterion Eq. (3) provides greater estimation accuracy (but also larger computational cost). When maximum spatial accuracy is required, it is advisable to use one of the algorithms B or D. The error of Algorithm D is smaller, but the algorithm has greater computational complexity.

    Figure 7.

    Visualization of shift vectors’ field while using inter-row correlation.

Figure 8 shows the moving object area and its contour obtained from the results of algorithms C (Figure 8a) and D (Figure 8b) by thresholding, with the threshold equal to 0.1 of the maximum shift magnitude. Note that there are practically no errors of the second kind for Algorithm D.

    Figure 8.

    Results of moving object area detection.

    3. Efficiency of the algorithms for complex motion

Let us consider the capabilities of the developed algorithms when estimating the shift vectors' field for more complex types of motion, in particular for the case when the inter-frame geometric deformations of the images of a moving object can be described by a similarity model with a vector of parameters ᾱ = (h̄, φ, κ). To compare the results of the estimation of the shift vectors' field with the results obtained for the images in Figure 2, the right image was modeled with the corresponding parameters of inter-frame deformations: h̄ = (2, 3)^T, φ = 4°, and κ = 1.

The results presented below are for algorithms B and D, which were shown to have the highest efficiency when Eq. (3) is used as the criterion. Figure 9 shows typical results for a single row of the image; graphs (a) and (b) correspond to Algorithm B with the sets of parameters (h^x, h^y) and (ρ, φ), respectively, and graphs (c) and (d) correspond to Algorithm D. In this case the estimates for the set of parameters (ρ, φ) are closer to the true values (gray line). Figure 10 shows the visualization of the estimates of the shift vectors' field for the entire image. Figure 10a and b corresponds to Algorithm B, and Figure 10c and d to Algorithm D, with the sets of parameters (h^x, h^y) and (ρ, φ), respectively.

    Figure 9.

    Example of estimates of shift magnitude for a row in case of complex motion of an object.

    Figure 10.

    Visualization of shift vectors’ field in case of complex motion of an object.

In the presented figures, in order to remove outliers significantly exceeding the true value of the shift, the magnitude of the estimates was limited to ρ_max = 10. When using Algorithm B, we note a significant number of false estimates ρ ≥ ρ_max both inside and outside the area of motion. In particular, when using the parameters (h^x, h^y), the estimates for a significant number of bottom rows of the image exceed the threshold. The use of inter-row correlation significantly reduces the number of outliers for the set of parameters (h^x, h^y) and virtually eliminates them for the set (ρ, φ).

    4. Comparison of the developed algorithms with the MVFAST algorithm

For the block approach [18, 19], one of the most effective algorithms for the formation of the shift vectors' field is MVFAST (Motion Vector Field Adaptive Search Technique) [20]. Therefore, we compare the effectiveness of the proposed algorithms against MVFAST.

In the MVFAST algorithm, the deformed image is divided into many nonoverlapping square blocks of a predetermined size. In most cases the block sizes are 2 × 2, 4 × 4, etc., but the algorithm allows blocks of any size, starting from one pixel. The estimate of the shift vector for the current block is determined from the estimates of the motion vectors of the neighboring blocks. The maximum length of the motion vectors of the neighboring blocks is compared with two threshold values, L1 and L2, whose values depend on the type of video sequence; as a result, motion activity is classified as low, medium, or high. The matching criterion is the minimum of the sum of absolute differences (SAD), which for a block is the sum of the absolute differences between the nodes of the current block and the corresponding nodes of the image in which the block location is searched. To improve performance and eliminate the influence of noise, MVFAST uses an early stop of the search: the search is terminated if the SAD is less than a specified threshold value T. Early stopping can be disabled by setting the threshold T to zero.
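The SAD criterion and the early-stop rule can be sketched as follows. This is a simplified stand-in written for illustration: it performs an exhaustive neighborhood search rather than MVFAST's adaptive search patterns, and all names and parameters are assumptions.

```python
import numpy as np

def sad(block, cand):
    """Sum of absolute differences between a block and a candidate block."""
    return float(np.abs(block - cand).sum())

def search_block(dfm_block, ref, top, left, radius, T=1.0):
    """Find the best-matching position of dfm_block in ref within +/-radius of
    (top, left). The search stops early as soon as SAD drops below the
    threshold T; setting T = 0 disables early stopping."""
    bh, bw = dfm_block.shape
    best, best_shift = float("inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= ref.shape[0] - bh and 0 <= x <= ref.shape[1] - bw:
                s = sad(dfm_block, ref[y:y + bh, x:x + bw])
                if s < best:
                    best, best_shift = s, (dx, dy)
                    if best < T:       # early stop of the search
                        return best_shift, best
    return best_shift, best
```

A block cut out of the reference image itself is found at zero shift with SAD = 0, which also illustrates the failure mode discussed below: in low-contrast areas the SAD at the search center may fall below T, stopping the search before the true shift is reached.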

Let us compare the results of the estimation of the deformation field by Algorithm D and MVFAST. For correctness of the comparison, the results below were obtained for the images in Figure 2 with the parameters of Algorithm D set to λ_h = λ_ρ = 0.1, λ_φ = π/180, Δx = Δy = 1, Δh = Δρ = 0.2, and Δφ = 5π/180, and the parameters of the MVFAST algorithm set to L1 = 1, L2 = 2, T = 1, and block size 1 × 1. Figure 11a shows the estimates of the shift magnitude of the image points corresponding to the nodes of a single row of the reference image, formed by the MVFAST algorithm. The results of Algorithm D with the sets of parameters (h^x, h^y) and (ρ, φ) are shown in Figure 6c and d, respectively.

    Figure 11.

    Example of estimates for the shift magnitude for a single row and visualization of the deformation field generated by the MVFAST algorithm.

It can be seen from the figure that the MVFAST algorithm forms fairly accurate estimates of node shifts but, unlike Algorithm D, makes errors at the boundaries of the object image. Errors also occur in low-contrast areas inside the object. The latter is due to the fact that for such areas the SAD at the center of the search often does not exceed the threshold T. Setting T = 0 with a block size of 1 pixel does not solve this problem either. Algorithm D, due to the inertia of the changes of its estimates, is free of this disadvantage.

The mathematical expectation and variance of the estimation errors when using the MVFAST algorithm, for a row and for the entire image, are shown in Table 1. The table shows that the mathematical expectation for the motion area for MVFAST is almost the same as for Algorithm D with the set of parameters (h^x, h^y), but several times higher (about five times for a row, eight times for the entire image) than for Algorithm D with the set of parameters (ρ, φ). The variance of the estimation error for the motion area for the MVFAST algorithm significantly exceeds the variance for Algorithm D (13 times with the set of parameters (h^x, h^y) and 27 times with the set of parameters (ρ, φ)). In the absence of noise, for the area without motion, MVFAST shows slightly better results than Algorithm D with the set of parameters (h^x, h^y), but the results of Algorithm D with the set of parameters (ρ, φ) are better. This is well illustrated by Figure 11b, which shows the visualization of the deformation field estimates for the entire image. The results of Algorithm D with the sets of parameters (h^x, h^y) and (ρ, φ) are shown in Figure 7b and d, respectively.

Let us also compare the efficiency of the algorithms for complex motion, under the same conditions and parameters of inter-frame deformations (h̄ = (2, 3)^T, φ = 4°, κ = 1) that were used in the analysis of algorithms B and D.

Figure 12a shows the estimates of the shift magnitudes for a single row of the reference image (the same row selected in the analysis of Algorithm D) when the MVFAST algorithm is used. The results of Algorithm D with the sets of parameters (h^x, h^y) and (ρ, φ) are shown in Figure 9c and d, respectively. The visualization of the deformation field for the MVFAST algorithm is shown in Figure 12b. The deformation fields obtained by Algorithm D with the sets of parameters (h^x, h^y) and (ρ, φ) are shown in Figure 10c and d, respectively.

    Figure 12.

    Example of estimates of shift magnitude for a single row and visualization of the deformation field formed by the MVFAST algorithm for complex motion.

    It can be seen from the figure that when estimating a deformation field for complex motion, the MVFAST algorithm gives significantly worse results in terms of spatial accuracy compared to the developed algorithms, and it is practically inapplicable for estimating the parameters of the motion trajectory.

Note that increasing the block size in the MVFAST algorithm eliminates the detection gaps inside the object image but still does not solve the problem of errors at its boundaries. At the same time, increasing the block size reduces the accuracy of moving object area detection. The proposed algorithms, in contrast to MVFAST, provide subpixel accuracy when estimating the shift parameters, which is fundamentally impossible for the MVFAST algorithm.

    5. Analysis of computational costs

Along with the accuracy of estimation, the computational complexity of the algorithms is also important. In this case, the required amount of computation per pixel of the reference image with coordinates (i, j) consists of two components:

• The computational cost of finding the estimates h̄̂^l_{i,j} and h̄̂^r_{i,j} (or ρ̂^l_{i,j}, ρ̂^r_{i,j} and φ̂^l_{i,j}, φ̂^r_{i,j});

• The computational cost of obtaining the resulting estimate in the joint processing of these estimates.

The second component depends on many factors: the complexity of the background, noise, the number and speed of moving objects, etc. It can be evaluated under the constraints of the particular problem being solved. Therefore, within the framework of this work, we restrict ourselves to the analysis of the first component of the computational cost, for the two sets of estimated shift vector parameters, (h^x, h^y) and (ρ, φ), and for the three algorithms discussed above:

    • Algorithm A: single row reverse processing

    • Algorithm C: joint processing of adjacent rows using gradient estimation minimum criterion

    • Algorithm D: joint processing of adjacent rows using CC maximum criterion

A detailed analysis of the computational cost requires taking into account not only the algorithm itself but also a large number of other factors (the type of computing device, the time of memory access and of other operations, etc.). Many of these factors depend on the particular imaging device and computing resources. Therefore, we analyze only the number of operations prescribed by the computational formulas of the algorithms. We count the operations of addition and subtraction (+, −), multiplication (×), division (÷), taking the root (√), trigonometric functions (sin, cos), and interpolation for a node (z̃). Table 2 shows the number of such operations needed to process one node of an image.

| Operation | Algorithm A: (h^x, h^y) | (ρ, φ) | Algorithm C: (h^x, h^y) | (ρ, φ) | Algorithm D: (h^x, h^y) | (ρ, φ) |
|---|---|---|---|---|---|---|
| +, − | 12 | 16 | 24 | 32 | 156 | 160 |
| × | 6 | 20 | 10 | 38 | 114 | 128 |
| ÷ | — | — | — | — | 4 | 4 |
| z̃ | 8 | 8 | 16 | 16 | 44 | 44 |
| sin, cos | — | 4 | — | 8 | — | 4 |
| √ | — | — | — | — | 4 | 4 |

    Table 2.

    The number of operations needed to process a single node of an image.

    The total computational cost is defined as

$$N = c_{\pm} N_{\pm} + c_{\times} N_{\times} + c_{\div} N_{\div} + c_{\tilde z} N_{\tilde z} + c_{\surd} N_{\surd} + c_{\sin} N_{\sin}, \tag{4}$$

where c_±, c_×, c_÷, c_z̃, c_√, c_sin are coefficients characterizing the time needed to perform the corresponding operations, and N_±, N_×, N_÷, N_z̃, N_√, N_sin are the numbers of such operations.

Equation (4) allows estimating the computational cost of the algorithms on particular computing facilities. For example, for a PC with an AMD Turion II X2 M500 processor, c_± = 1.4, c_× = 1.4, c_÷ = 13.6, c_z̃ = 63, c_√ = 135, and c_sin = 212 ns. Then, for the set of parameters (h^x, h^y), the average time needed to process one node of an image is 0.53 μs for Algorithm A, 1.06 μs for Algorithm C, and 3.74 μs for Algorithm D. The average time for the parameters (ρ, φ) is 1.4 μs for Algorithm A, 2.8 μs for Algorithm C, and 4.62 μs for Algorithm D.
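The arithmetic of Eq. (4) can be checked directly against the operation counts of Table 2 and the per-operation times above; the dictionary layout and function name below are illustrative.

```python
# Per-operation times (ns) for the AMD Turion II X2 M500 example above
c = {"pm": 1.4, "mul": 1.4, "div": 13.6, "interp": 63.0, "root": 135.0, "trig": 212.0}

def node_cost(counts):
    """Total time per node, Eq. (4): sum over operation types of count * cost."""
    return sum(n * c[op] for op, n in counts.items())

# Operation counts taken from Table 2
alg_A_hxy = {"pm": 12, "mul": 6, "interp": 8}
alg_D_hxy = {"pm": 156, "mul": 114, "div": 4, "interp": 44, "root": 4}

print(round(node_cost(alg_A_hxy) / 1000, 2))   # 0.53 (us), matching the text
print(round(node_cost(alg_D_hxy) / 1000, 2))   # 3.74 (us), matching the text
```

The dominant term is the interpolation cost c_z̃ N_z̃, which is why Algorithm D, with 44 interpolations per node against 8 for Algorithm A, is roughly seven times slower.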

Let us note the features of the computational complexity of algorithms D and MVFAST. Studies have shown that the amount of computation of the latter is largely determined by the nature of the processed images. If the adjacent frames of a video sequence contain no motion, the computational complexity of the algorithm is determined only by computing one SAD per block. In the presence of motion, depending on its intensity, the computational complexity may increase tens or even hundreds of times. In particular, when processing two frames of a video sequence without movement by the MVFAST algorithm with a block size of 1 pixel on the same PC, the average time spent per node was 0.64 μs; this is the lowest possible computational cost. The average processing time per node for the images shown in Figure 2 was 8.91 μs. For similar images with a complex type of motion, for example with ᾱ = ((2, 3)^T, 4°, 1), the processing time is 21.84 μs. The time costs of Algorithm D with the set of parameters (h^x, h^y) were 7.44, 8.8, and 10.75 μs under similar conditions, while with the parameters (ρ, φ) they were 8.13, 10.27, and 11.8 μs.

From the above results, it can be seen that the computational complexity of Algorithm D depends only weakly on the presence or absence of moving objects, their size, and the type of motion, whereas the computational complexity of the MVFAST algorithm is largely determined by these factors. When the inter-frame deformations of the images of a moving object correspond only to a parallel shift, the computation time of algorithms D and MVFAST is approximately the same; when the motion is complex, it increases several times for the MVFAST algorithm.

    6. An example of estimating the trajectory of a moving object using deformation field

    Let us give an example of estimating the trajectory of a moving object using the developed algorithms. Figure 13a and b shows two frames from the video sequence being processed in which the aircraft lands on an aircraft carrier.

    Figure 13.

    An example of estimation of moving object trajectory.

A complicating factor is the uneven movement of the camera toward the landing plane. Therefore, to estimate the position of the moving object relative to the scene (rather than to the camera), it was necessary not only to detect and identify the moving object area but also to stabilize the image. First, the distortion caused by the movement of the camera is compensated. To do this, the inter-frame deformation of adjacent frames, represented by the similarity model ᾱ = (h^x, h^y, φ, κ), is estimated using deformation fields [21]. Then the area of the moving object is detected. Using the estimates of the deformation field in the area of the moving object, the parameters of the inter-frame motion are found; for 34 frames of the video sequence, they are presented isometrically in Figure 13c. These parameters are used to estimate the three-dimensional trajectory of the moving object. Figure 13d shows the result in relative coordinates (X, Y, Z). The camera position at the initial moment of shooting is taken as the origin.

    7. Conclusion

The effectiveness of motion area detection in a sequence of images using pixel-by-pixel stochastic gradient estimation of the inter-frame shift vectors for all points of the reference image corresponding to its nodes (the deformation field) has been investigated. It is shown that stochastic gradient estimation of the shift vectors does not give equivalent results when they are represented by projections on the basic axes of the image and by polar parameters. This is due to the fact that these sets of parameters have a different physical meaning. The analysis of variants of estimation algorithms for the deformation field has shown that the use of polar parameters provides higher estimation accuracy.

Two approaches to estimating the parameters of the deformation field have been considered. In the first approach, the stochastic gradient procedure sequentially processes all rows of an image to find shift estimates for all points of the reference image. It processes each row bidirectionally, i.e., from left to right and from right to left, and the joint processing of the results compensates the inertia of the stochastic estimation. However, this approach does not take inter-row correlation into account, and the images are processed as one-dimensional signals. In the second approach, to improve the accuracy of estimation, the correlation of the image rows is taken into account: the algorithm processes the rows one after another with a change of direction after each row and, from the values obtained for each node of the reference image, forms the resulting estimate of the shift vector. Studies have shown that the second approach, with approximately the same computational complexity, gives significantly higher accuracy in the estimation of the shift vectors' field.

As criteria for forming the resulting estimate, the minimum of the gradient estimate of the objective function and the maximum of the correlation of local neighborhoods of the deformed and reference images were investigated. The correlation maximum criterion showed the best results: not only were the mean value and the variance of the estimation errors smaller, but the effect of "oscillations" outside the motion area was also reduced. However, this criterion requires a higher computational cost.
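The correlation maximum criterion compares candidate estimates by how well the correspondingly shifted neighborhood of the deformed image correlates with the reference neighborhood. A minimal sketch of that selection step (illustrative helper names of our own, not the chapter's code):

```python
import numpy as np

def local_correlation(a, b):
    """Sample correlation coefficient of two equally sized local
    neighborhoods (flattened)."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return float(a @ b / denom) if denom > 0 else 0.0

def pick_by_correlation(ref_patch, candidate_patches):
    """Among the neighborhoods produced by candidate shift estimates,
    keep the index of the one that correlates best with the reference."""
    scores = [local_correlation(ref_patch, p) for p in candidate_patches]
    return int(np.argmax(scores))

# A linearly rescaled copy of the patch correlates perfectly (score 1),
# an inverted copy anti-correlates, so the rescaled copy is chosen.
ref = np.arange(9.0).reshape(3, 3)
print(pick_by_correlation(ref, [-ref, 2.0 * ref + 1.0]))
```

Every candidate requires one neighborhood-sized correlation, which is where the extra computational cost of this criterion comes from.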

The computational complexity of the proposed algorithms is investigated. Under equal conditions, estimating the projections of the shift vectors requires slightly fewer computations than estimating the polar parameters. Reverse processing of a single row is less time-consuming than processing adjacent rows. The correlation maximum criterion implies a larger amount of calculation than the gradient estimation minimum criterion. The algorithms with larger computational costs provide smaller errors; therefore, the choice of algorithm depends on the specific task.

The experiments have confirmed the effectiveness of the developed algorithms for finding the motion parameters and trajectories of objects in a video sequence. In particular, an example of estimating the three-dimensional trajectory of an aircraft in relative coordinates using 34 frames of a video sequence is considered. The parameters of the similarity model (parallel shift, rotation angle, and scale factor) were used to describe the inter-frame motion.

Comparison of the developed algorithms with the block-based MVFAST (Motion Vector Field Adaptive Search Technique) algorithm showed that, under identical conditions, the latter is less accurate in detecting the boundaries of motion. In addition, unlike MVFAST, the proposed algorithms achieve subpixel estimation accuracy.

Thus, the conducted study allows us to conclude that pixel-by-pixel inter-frame estimation of the shift vectors of the reference image points corresponding to its nodes is a promising approach to detecting the area of a moving object in a series of images.

    Acknowledgments

The reported study was funded by the RFBR and the Government of the Ulyanovsk Region under research projects 18-41-730006 and 18-41-730011.

    © 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Alexander Tashlinskii and Pavel Smirnov (January 9th 2019). Formation of Inter-Frame Deformation Field of Images Using Reverse Stochastic Gradient Estimation. In: Andrzej Zak (ed.), Pattern Recognition - Selected Methods and Applications. IntechOpen. DOI: 10.5772/intechopen.83489
