Video Surveillance-Based Intelligent Traffic Management in Smart Cities

Video visualization is an important part of visual analytics. Massive video content raises several challenges that data analytics can help resolve, and the topic is consequently gaining significance. Rapid progress in digital technologies has caused an explosion of video data, which creates a need for visualization and computer graphics derived from videos. In this chapter, a state-of-the-art algorithm is proposed for the 3D conversion of traffic video content and its display on Google Maps. Time-stamped, glyph-based visualization is applied to surveillance videos and used for event detection. This form of visualization can reduce the complexity of the data while giving a complete view of the videos in a collection. The effectiveness of the proposed system is shown by testing the algorithm on numerous unprocessed videos, without regard to field conditions. The proposed visualization technique produces promising results and is found effective in conveying meaningful information while alleviating the need to search exhaustively through colossal amounts of video data.


Video visualization in smart cities
The growing number of surveillance cameras in public places has led to an increase in the automated analysis of video content, and traffic video surveillance [43] is one of its applications. These automated systems identify a number of traffic rule violations. Video features at the pixel, object, and semantic levels are extracted for analysis [53,56,59,60]. The basic purposes of surveillance video-based systems are tracking vehicles, analyzing their patterns and behaviors, predicting abnormal events, and detecting anomalies before they occur. This research aims to develop a glyph-based system for real-time video visualization that covers a comprehensive set of traffic videos along the complete length of highways.
Intelligent monitoring has progressed rapidly in the last 10 years and is intended to provide situational awareness and semantic information for understanding activity in the environment [14,69]. Video visualization (VV) denotes the joint process of analyzing video and subsequently deriving a representative presentation of the essence of the visual content [2,4,19,34,45,54,57,68]. VV is gaining attention because it addresses the data analysis challenges that arise from video camera content [1,15,16]. Over the past decade, the usefulness of VV for traffic surveillance [17,18] applications has been effectively demonstrated by researchers [3,75,76].
VV offers a spatio-temporal summary and overview of a large collection of videos, and its abstract representation of meaningful information assists users in understanding video content [3,35]. In contrast, conventional visual representation techniques [67], such as time series plots, have difficulty conveying impressions from a large video collection [3].
In addition, the visual content of videos needs to be presented in compact forms so that a user can quickly navigate through different segments of a video sequence, locate a segment of interest, and zoom in to different levels of detail [1]. Viewing videos is a time-consuming process; consequently, it is desirable to develop methods for highlighting and extracting interesting features in videos. Numerous techniques exist for data analysis in images, along with a variety of statistical indicators for data processing. In contrast, there is a lack of effective techniques for conveying complex statistical information intuitively to a layperson such as a security officer, apart from line graphs that portray 1D signal levels [1]. Many researchers have studied video processing in the context of video surveillance [16], vehicle monitoring, and crowd monitoring. However, the main problem in automatic video processing is communicating the results to the human operator, since statistical results are not easily comprehensible and sequences of difference images still need sequential viewing [1].
Conventional video surveillance systems rely heavily on human operators to monitor activity and determine the actions to be taken when an incident occurs. Several actionable incidents are missed in such a manual system because of the inherent limitations of relying solely on human operators watching CCTV screens [58]. Hence, automatic VV [56] should prove very beneficial for improved traffic management. Missed detections may be caused by an excessive number of video screens to monitor, as shown in Figure 1, and by tiredness from prolonged monitoring. In fact, numerous studies have shown the limits of human-dependent surveillance. In a study conducted by Sandia National Laboratories in the United States, the attention of most people fell below an adequate level after only 20 minutes of monitoring video surveillance screens [67]. The video content analysis paradigm is shifting from a fully human-operated model to an intelligent, machine-assisted automated model [58].

State of the art
In the field of visualization, Borgo et al. [51] carried out a comprehensive survey of video visualization. Daniel et al. [1] demonstrated the effectiveness of VV for conveying the meaningful information contained in video sequences. Andrienko et al. [47] illustrated a visual analytics technique for visualizing huge amounts of video data, in which data were clustered and aggregated for display on a map using colored arrows. Wang et al. [48] presented a situational understanding approach that combines video frames with a 3D environment. Romero et al. [49] used a visualization approach to analyze human behavior and explored the visualization of activity in normal settings over time.
Hoummady surveyed the shortcomings of the sensing devices used to collect traffic information in real time [40] and also proposed the use of video cameras as a data collection method for traffic management. This approach relies mainly on a computational device for recognizing pedestrians, vehicles, two-wheeled vehicles, etc.
For traffic visualization, a commonly employed approach is to color the areas that represent roads on the map [44]. Ang et al. [46] presented an analytical approach for traffic management from multiple cameras, in which vehicle trajectories were estimated and features were extracted. Subsequently, Jiang [62] demonstrated an analytical technique for visualizing huge amounts of video data; data were clustered and aggregated for display on a map using colored arrows. Afterward, Botchen [53] proposed a technique for flow and volume signature visualization and found that ordinary people can recognize events on the basis of event signatures rather than by viewing the entire video content.
End users and technology providers recognize that a manual process is inadequate for searching massive amounts of video content comprehensively and screening it in a timely manner. To lessen these issues, we project camera activity onto Google Maps to obtain a summarized and holistic view of the video content. Massive video data render manual analysis of videos ineffectual, whereas current automatic video analysis techniques perform better.
A state-of-the-art visualization technique for surveillance videos is presented and tested on several traffic videos. It produces suitable visual representations to assist the decision-making process. From the visualization, one can perceive the level and pattern of the recorded activities, as it offers more spatial information than statistical indicators do. Semantic information is obtained from numerous surveillance videos, which are connected to Google Maps in order to perform 3D association. At the same time, the glyph [5,20] is a familiar device that conveys multi-field video visualization [10]. A well-developed glyph-based visualization approach is proposed that enables efficient and effective information encoding and visual communication.

Glyph-based semantic information visualization
The proposed approach aims to visualize the semantic information of traffic videos using time-stamped glyphs. Input video frames are processed continuously to detect changes in visual information. The approach consists of several steps for estimating traffic flow.

Preprocessing
The first step segments objects from the surveillance video by thresholding, which converts each grayscale image to a binary image. Parts of the road are thinned out, and holes are filled in the video frames using morphological operations, as shown in Figure 2.
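The thresholding and hole-filling steps can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the frame is assumed to be a nested list of gray values, the threshold value 128 is hypothetical, and holes are filled by flood-filling the background from the image border, which is one common way to realize a morphological hole fill.

```python
from collections import deque

def threshold(gray, t):
    """Convert a grayscale frame (list of rows) to a binary image."""
    return [[1 if v > t else 0 for v in row] for row in gray]

def fill_holes(binary):
    """Set to 1 every background pixel that cannot be reached from the border."""
    h, w = len(binary), len(binary[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with all background pixels on the image border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and binary[y][x] == 0:
                outside[y][x] = True
                queue.append((y, x))
    # Spread through 4-connected background pixels.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and binary[ny][nx] == 0 and not outside[ny][nx]):
                outside[ny][nx] = True
                queue.append((ny, nx))
    return [[0 if outside[y][x] else 1 for x in range(w)] for y in range(h)]

# Toy frame: a bright ring (an object) with a dark hole in the middle.
frame = [
    [10, 10, 10, 10, 10],
    [10, 200, 210, 200, 10],
    [10, 205, 20, 220, 10],
    [10, 200, 215, 200, 10],
    [10, 10, 10, 10, 10],
]
mask = fill_holes(threshold(frame, 128))
```

In practice, an image processing library would normally replace these loops; the pure-Python version is used here only to keep the sketch self-contained.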
Object segmentation is a vital step in understanding the image during preprocessing. Its purpose is to divide the image into regions of interest, and objects are identified in the video frame using the region growing method. Image segmentation yields a binary image containing connected components that represent the multiple objects. Connected component analysis is performed to distinguish between these components, and features are extracted to track the moving objects in successive frames. The image is scanned pixel by pixel, and the gray value of the central pixel is compared with those of the top and left pixels. A region grows until it has found all the connected pixels. The values of the pixels are compared with those of the 3 × 3 neighboring pixels. If there is a break in the connected pixels and the gap is greater than a threshold value, the algorithm assigns the pixel to a new region. The threshold is user defined, and its value is chosen on the basis of the distance between pixels. All pixels that are part of the object are set to 1, and those that are not part of the object are set to zero. In the region growing method, the 3 × 3 window finds all neighboring pixels with value 1 and keeps growing the region until a pixel with value zero is found. Isolated pixels are marked as outliers. On a sunny day, the moving shadows of vehicles can produce false alarms. However, the proposed system estimates the vehicle size and is able to predict the shadow size, so the extracted shadow can be removed on the basis of the size of the moving vehicle.
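The region growing procedure described above can be sketched as a breadth-first search over a binary image. This is an illustrative sketch under stated assumptions: the parameter names `gap` and `min_size` are hypothetical, `gap=1` corresponds to the 3 × 3 window, and outlier pixels are marked with the label -1.

```python
from collections import deque

def label_regions(binary, gap=1, min_size=2):
    """Grow regions over foreground pixels that lie within `gap` pixels of each other.

    Returns (labels, outliers). labels holds a region id per pixel
    (0 = background, -1 = outlier); outliers lists the isolated pixels.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    outliers = []
    next_id = 1
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] != 1 or labels[sy][sx] != 0:
                continue
            # Start a new region at this seed and grow it until no
            # unlabeled foreground pixel remains inside the window.
            region = [(sy, sx)]
            labels[sy][sx] = next_id
            queue = deque(region)
            while queue:
                y, x = queue.popleft()
                for dy in range(-gap, gap + 1):
                    for dx in range(-gap, gap + 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_id
                            region.append((ny, nx))
                            queue.append((ny, nx))
            if len(region) < min_size:
                # Too small to be an object: mark as outlier.
                for y, x in region:
                    labels[y][x] = -1
                outliers.extend(region)
            else:
                next_id += 1
    return labels, outliers

binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
labels, outliers = label_regions(binary)
```

Here the two blobs receive the ids 1 and 2, and the single pixel at row 1, column 4 is reported as an outlier. A larger `gap` would merge components separated by small breaks, which is one way to read the user-defined gap threshold.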
The proposed system is tested on several surveillance videos covering different scenarios, such as different weather conditions and traffic densities. It is robust in handling problems such as occlusion and illumination variation encountered in surveillance videos. The data set contains a diverse set of scenarios in terms of traffic density and violations.
Traffic flow is assessed on each video frame, and the number of vehicles is counted in every frame. For every vehicle, the mean speed is computed. The flow rate is found by dividing the total number of vehicles by time. The top-level flow diagram of the proposed approach is depicted in Figure 3. Object tracking [7,8] is the part of the proposed system that collects temporal and spatial information about the object under consideration from the video sequence. Semantic information such as the trajectories of detected objects is acquired by motion tracking and given as input for mapping and 3D computation, and the outcomes are displayed on Google Maps. As the coordinates of Google space and video space differ, 3D mapping is performed between the two spaces. A time-based glyph is created to represent semantic information in both Google space and video space.
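The two quantities named above, flow rate and per-vehicle mean speed, reduce to simple arithmetic. The sketch below assumes per-frame centroid positions in pixels and a known frame rate; the function names and the pixel-per-second unit are illustrative choices, not taken from the chapter.

```python
import math

def flow_rate(total_vehicles, duration_s):
    """Vehicles per second: total vehicle count divided by elapsed time."""
    return total_vehicles / duration_s

def mean_speed(positions, fps):
    """Mean speed in pixels per second from per-frame (x, y) positions."""
    if len(positions) < 2:
        return 0.0
    # Sum the distances travelled between consecutive frames.
    distance = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    )
    elapsed = (len(positions) - 1) / fps
    return distance / elapsed

rate = flow_rate(30, 60.0)                               # 30 vehicles in one minute
speed = mean_speed([(0, 0), (3, 4), (6, 8)], fps=25)     # 10 px in 2 frames
```

Converting pixel speeds to physical units would additionally require the image-to-ground mapping.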
The layout of the table that stores vehicle coordinates is shown in Figure 4. Blobs detected within a frame indicate the number of vehicles; Figure 4 illustrates a case in which a single vehicle exists in the current video frame. An array is defined for storing vehicle coordinates. Its first two columns hold the y and x coordinates of the vehicle in the first frame, and the following two columns hold its y and x coordinates in the next frame. The fifth column holds the number of frames during which the vehicle has been visible in the field of view. The vanishing flag in the last column defines the status of the vehicle, for example, its departure: the flag remains zero while the vehicle is in the field of view and becomes 1 when the vehicle disappears. The last column is significant because the reshuffling of values in the array depends on the flag value.
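One possible reading of this table layout is sketched below. The column constants and helper names are hypothetical, and two details are assumptions rather than statements from the chapter: the coordinate pairs are slid forward on each update (earlier frame, next frame), and "reshuffling" is interpreted as dropping rows whose vanishing flag is set.

```python
# Column indices of one vehicle row: y/x in the earlier frame, y/x in the
# next frame, number of frames visible, vanishing flag.
Y_PREV, X_PREV, Y_NEXT, X_NEXT, FRAMES, VANISHED = range(6)

def new_row(y, x):
    """Create a row for a vehicle seen for the first time at (y, x)."""
    return [y, x, y, x, 1, 0]

def update_row(row, detection):
    """Update a row with the (y, x) detection in the current frame.

    Pass None when the vehicle is no longer detected; the flag is then set to 1.
    """
    if detection is None:
        row[VANISHED] = 1
        return row
    row[Y_PREV], row[X_PREV] = row[Y_NEXT], row[X_NEXT]
    row[Y_NEXT], row[X_NEXT] = detection
    row[FRAMES] += 1
    return row

def compact(table):
    """Drop rows of vanished vehicles so the remaining rows move up."""
    return [row for row in table if row[VANISHED] == 0]

table = [new_row(10, 5)]
update_row(table[0], (12, 6))
update_row(table[0], (15, 8))
```

After the two updates, the single row holds the coordinates from the last two frames, a frame count of 3, and a flag of 0.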
Vehicle trajectories have different spans even when vehicles travel on the same route, because vehicles travel at different mean speeds [8,12]. Motion vectors [77] are used to represent this information, as motion information is strongly related to semantic events. Different events are identified by analyzing motion features. The path describes the vehicle movement, and the dynamical measurements represent the raw vehicle trajectory. A common trajectory representation is a flow sequence, in which the flow vectors hold the object's velocity $[v_x, v_y]$, position $[x, y]$, and direction $[a_x, a_y]$ at time $t$, extracted by tracking the object.
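A flow sequence of this kind can be built from tracked positions by finite differences. The sketch below is an assumption-laden illustration: velocity is the difference between consecutive positions divided by a hypothetical time step `dt`, and the direction $[a_x, a_y]$ is taken to be the unit vector of the velocity, which the chapter does not specify.

```python
import math

def flow_sequence(positions, dt=1.0):
    """Return flow vectors [x, y, vx, vy, ax, ay] for each step of a track."""
    flows = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
        speed = math.hypot(vx, vy)
        # Direction as a unit vector; zero when the object did not move.
        ax, ay = (vx / speed, vy / speed) if speed > 0 else (0.0, 0.0)
        flows.append([x1, y1, vx, vy, ax, ay])
    return flows

flows = flow_sequence([(0, 0), (3, 4), (3, 4)])
```

The first vector describes a move of 3 pixels in x and 4 in y; the second describes a stationary step.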

Bezier fitting for glyph generation
Bezier curves are widely employed for modeling and smoothing chaotic vehicle trajectories. A Bezier curve is defined by control points, which have a geometric modeling interpretation and can model inconsistency in trajectories [61]. The curve is confined by its control points, which can be displayed graphically and used to manipulate the curve. Given two points $P_0$ and $P_1$, the Bezier curve is the straight line between them, $B(t) = (1 - t)P_0 + tP_1$ for $0 \le t \le 1$, which is equivalent to linear interpolation. Here, the Bezier curve is used to smooth the chaotic vehicle trajectories obtained from motion tracking. Because each car moves at a different speed, the lengths of the trajectories vary. Figure 5(a) illustrates the chaotic trajectories of different vehicles, which are smoothed using Bezier curves to visualize the traffic pattern shown in Figure 5(b). Time-stamped semantic information is represented using glyphs. The vehicle trajectory is tracked over time, and the semantic information is presented as shown in Figure 5. The outer circle of the glyph is red when a small vehicle changes lane, and green when a small vehicle does not change lane within the field of view.
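The linear case generalizes to any number of control points by repeated linear interpolation (de Casteljau's algorithm). The following sketch assumes that the raw tracked points themselves serve as control points; how the chapter selects control points is not stated, so this choice and the function names are illustrative.

```python
def lerp(p, q, t):
    """Linear interpolation between points p and q: (1 - t) * p + t * q."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t with de Casteljau's algorithm."""
    pts = list(control_points)
    while len(pts) > 1:
        # Each pass replaces n points by n - 1 interpolated points.
        pts = [lerp(p, q, t) for p, q in zip(pts, pts[1:])]
    return pts[0]

def smooth_trajectory(control_points, samples=20):
    """Sample the Bezier curve at evenly spaced parameter values in [0, 1]."""
    return [bezier_point(control_points, i / (samples - 1)) for i in range(samples)]

midpoint = bezier_point([(0, 0), (2, 4)], 0.5)          # linear case
apex = bezier_point([(0, 0), (1, 2), (2, 0)], 0.5)      # quadratic case
curve = smooth_trajectory([(0, 0), (1, 3), (2, -1), (3, 2)], samples=5)
```

With two control points the function reduces to the linear interpolation given above; the curve always starts at the first control point and ends at the last.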

Motion tracking and semantic event display
The significance of motion tracking [9,39,64,66,71,73] in surveillance videos is unquestionable, and it is valuable in countless applications. Semantic analysis [62,63] of video is used to extract vital information from the video, particularly the vehicle type, speed, lane changes, and trajectory [38,41]. This semantic information is extracted automatically in order to provide high-level descriptors for indexing, retrieving, and searching video content. Vehicle tracking comprises maintaining the velocity, appearance, and position of a detected object over time. Vehicles are followed by linking each detected object to the most similar object in consecutive video frames.
Flow vectors provide the common trajectory representation that is the basis of further analysis. Figure 6(a) shows the chaotic vehicle trajectories taken from different surveillance videos. Each vehicle trajectory is obtained by tracking the detected vehicle individually. Figure 6(b) displays the smoothed curves acquired by applying Bezier curves to the chaotic vehicle trajectories.
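Linking each object to its most similar counterpart in the next frame can be sketched as a greedy nearest-neighbor association. This is a simplified stand-in: similarity is reduced to centroid distance only, and the `max_dist` gate of 50 pixels is a hypothetical value, whereas the chapter also mentions appearance and velocity.

```python
import math

def link_detections(prev, curr, max_dist=50.0):
    """Greedily match previous centroids to the closest unused current centroids.

    Returns a dict mapping index in `prev` to index in `curr`.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.hypot(px - cx, py - cy), i, j)
        for i, (px, py) in enumerate(prev)
        for j, (cx, cy) in enumerate(curr)
    )
    links, used = {}, set()
    for dist, i, j in pairs:
        if dist <= max_dist and i not in links and j not in used:
            links[i] = j
            used.add(j)
    return links

prev = [(10, 10), (100, 50)]
curr = [(104, 52), (13, 11), (300, 300)]
links = link_detections(prev, curr)
```

In this example the two known vehicles are matched to their nearby detections, and the unmatched detection at (300, 300) would be treated as a vehicle entering the field of view.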

3D conversion and perspective view from video space to Google Maps
Capturing real-time information is considered the main challenge in dynamic VV [39]. Recovering 3D information from surveillance video is essential for acquiring significant information from the videos. As a video frame is a projection of 3D space, abstracting vital information is a difficult task. In the proposed approach, the 3D transformation from surveillance video onto Google Maps is performed using a homographic transformation, a projective transformation that maps points from one plane to another. Estimating the homography between image space and video space requires four point correspondences [42]. The image is calibrated through the transformation H, under which image pixels on the ground plane are mapped to the latitude and longitude coordinates of the map.
The location of the vehicle in each video frame is marked by a plus symbol in Figure 7; it is computed with the homography matrix, which relates the coordinates of the map and video spaces. Under perspective projection, corresponding points in the two spaces are alike but not identical because of the overall scale ambiguity. In camera-based view geometry, the homography [6,11,45,72,79] takes the particular form H = KE, where E is the Euclidean transformation matrix that defines the camera pose while viewing, and K is the camera perspective matrix of intrinsic parameters. Consider a pair of corresponding points $p = (x_1, y_1, z_1)^T$ and $u = (x_2, y_2, z_2)^T$ related by the homography H. The solution h, the vector of the entries of H, is obtained as the eigenvector of $A^T A$ that corresponds to its smallest eigenvalue, where A is the matrix of the linear system built from the point correspondences. After H is computed, the corner points of the video are projected onto their correspondence points on Google Maps. The position of each pixel is then estimated on the map by means of the H matrix.
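A homography can be estimated from four point correspondences with a few lines of code. The sketch below deliberately swaps in a simpler solver than the eigenvector solution mentioned above: it fixes the last entry of H to 1 and solves the resulting 8 × 8 linear system by Gaussian elimination. The direction of the mapping (video pixel to map point) and the map coordinates used are hypothetical planar values, not real latitudes and longitudes.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def estimate_homography(src, dst):
    """Estimate a 3x3 H (last entry fixed to 1) from four (x, y) -> (u, v) pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def project(H, x, y):
    """Map a video pixel (x, y) to map coordinates, dividing out the scale."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Corners of a 640 x 480 frame and hypothetical map positions for them.
video_corners = [(0, 0), (640, 0), (640, 480), (0, 480)]
map_corners = [(100.0, 200.0), (180.0, 210.0), (170.0, 300.0), (90.0, 280.0)]
H = estimate_homography(video_corners, map_corners)
centre_on_map = project(H, 320, 240)
```

With more than four correspondences or noisy points, the eigenvector (or SVD) solution described in the text is the more robust choice, since it does not assume that the last entry of H is nonzero.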

Time stamped semantic glyph representation
Glyph-based visualization is a common form of visual design in which a data set is represented by a collection of graphical objects known as glyphs [35]. The glyph method is used to visualize motion vectors overlaid on the frames of the video stream. Our main concern is to collect the visual information that appears in all frames of the video for as long as the object remains within view. A time-stamped glyph is generated to signify the type of car, its speed (with distinctive colors), and event information such as lane changes. The proposed system accurately determines the lane change of a vehicle at a specific time owing to precise localization. Abnormal event detection is performed from the vehicle trajectories [52]: trajectory analysis and interaction with scene features allow interesting events to be recognized. For any image point, the position of the corresponding scene point is determined in every video frame until the vehicle leaves the field of view.
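Time-stamped lane change detection from a trajectory can be sketched as follows. The sketch assumes that each track sample carries a timestamp and a lateral position in a rectified ground-plane coordinate, and that lane boundaries are known lateral positions; both the boundary values and the function names are hypothetical. The red and green outer colors follow the glyph convention described earlier for small vehicles.

```python
import bisect

def lane_index(x, boundaries):
    """Index of the lane containing lateral position x (boundaries sorted ascending)."""
    return bisect.bisect_right(boundaries, x)

def lane_changes(track, boundaries):
    """Return (timestamp, from_lane, to_lane) for every lane change in a track.

    track is a list of (timestamp, lateral_position) samples.
    """
    events = []
    prev_lane = None
    for t, x in track:
        lane = lane_index(x, boundaries)
        if prev_lane is not None and lane != prev_lane:
            events.append((t, prev_lane, lane))
        prev_lane = lane
    return events

boundaries = [3.5, 7.0]  # hypothetical lane dividers, in metres
track = [(0.0, 1.8), (0.5, 2.9), (1.0, 3.8), (1.5, 5.0)]
events = lane_changes(track, boundaries)
# Outer glyph circle: red if the vehicle changed lane, green otherwise.
outer_colour = "red" if events else "green"
```

For this track, one lane change from lane 0 to lane 1 is reported at timestamp 1.0.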
Vehicle speed varies even in the absence of obstacles because of curves and turns. Experimental data confirm the common insight that speed is one of the most significant factors in safe driving. Variation in vehicle speed is considered one of the likely causes of congestion and accidents [37,51,94]. Therefore, the proposed algorithm determines the speed variation of vehicles in each frame on the basis of trajectory analysis. Trajectories with different speeds are identified and represented using glyphs. At each time frame, if the vehicle speed is lower than the threshold, the same color is assigned; if the vehicle speed changes abruptly, a different color is assigned at each such time instant. With this time-stamped identification method, the precise instant of a speed variation that disrupts the flow of traffic is identified in the video frame.
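The color assignment rule can be illustrated with a short sketch. It rests on one interpretation of the text, stated here as an assumption: the threshold is applied to the change in speed between consecutive time stamps, the color is kept while the change stays below it, and the next color of a hypothetical palette is used whenever the change is abrupt.

```python
def glyph_colours(speeds, threshold,
                  palette=("green", "yellow", "orange", "red")):
    """Assign one colour per time stamp, switching colour at abrupt speed changes."""
    colours = []
    idx = 0
    for i, speed in enumerate(speeds):
        # An abrupt change is a jump of at least `threshold` since the last stamp.
        if i > 0 and abs(speed - speeds[i - 1]) >= threshold:
            idx = (idx + 1) % len(palette)
        colours.append(palette[idx])
    return colours

speeds = [20, 21, 20, 12, 12, 19]  # one speed value per time stamp
colours = glyph_colours(speeds, threshold=5)
# Time stamps at which the colour changes mark the speed variations.
change_points = [i for i in range(1, len(colours)) if colours[i] != colours[i - 1]]
```

The indices in `change_points` correspond to the time stamps at which a speed variation would be highlighted on the glyph.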

Association between Google Maps and video visualization
To visualize analysis results properly on Google Maps, the output must be aligned to the map coordinates [13]. The camera image is rectified automatically and mapped onto the map. In surveillance video, activity is detected in each frame, and the ground location of each vehicle is obtained through correspondence points and trajectory learning and then mapped onto the map. Consequently, video inspection across several road cameras is improved by projecting the activity of outdoor surveillance cameras onto Google Maps. To localize the vehicle coordinates on Google Maps, that is, to associate video space with Google Maps space, the homography is computed and its perspective view is drawn. The transformation matrix provides the association information and its mapping. As events occur, the corresponding video is visualized on Google Maps, as shown in Figure 8.

Holistic view of video using Google Maps
A surveillance video naturally captures a perspective view of the visual scene, which is regarded as quasi-3D. Significant information is gathered from the different videos and displayed to represent unusual events, as depicted in Figure 8.
In a video surveillance-based system, identifying unusual events is considered the most significant task. Anomalous behavior can be drastic or subtle [36,58,63]. Changing lanes on highways is hazardous. The proposed system precisely identifies a vehicle lane change at a specific time because of precise localization. Anomalous events [40] can be detected from the trajectory [62]; trajectory analysis then reveals dangerous behavior of the vehicle. Different glyph colors in the video visualization portray the vehicle type, vehicle position, and event information within the video frame.

Small scale
The proposed technique has been tested on a small scale, for example, in the area around Northumbria University City Campus, Newcastle upon Tyne, UK. The trajectories of detected objects are shown using semantic glyphs for as long as the objects remain in the scene, as shown in Figure 9.
There is scope for future work in the area of visualization. The proposed visualization approach can be used in a traffic management system at city level to give a precise view of a bigger city. A spatio-temporal view of a collection of videos can be acquired by mapping the trajectories onto Google Maps, as shown in Figure 10. For interpreting data in a real-time system, visualization of video data offers intuitive information that can be used to identify trends and patterns. However, gathering statistics automatically from video data is computationally costly. Walton et al. [39] visualized traffic video data on Google Maps to display traffic information, although displaying numerous traffic videos simultaneously was challenging because of the heavy transmission load. Human intelligence was used to gather semantic features from surveillance videos in a graphic mapping scheme. More recently, Hsieh and Wang [50] proposed a system for visualizing traffic information by inferring vehicle data and composing a video in the database. Traffic flow was assessed from surveillance videos, and a Google mapping was created between vehicle detector data and videos. When visualizing the traffic information, however, the approach was ineffective in simulating all types of kinematics and dynamics because driving behavior differs across regions.

Conclusion
VV is concerned with the visual illustration of input surveillance video so as to reveal the vital features and events it contains. It is intended to assist intellectual reasoning while easing the burden of watching videos. A novel glyph-based visualization approach has been proposed that can be used efficiently for road surveillance videos. A visual analysis based on motion tracking is performed to monitor live road traffic on highways. The proposed approach has been verified at numerous video frame rates and resolutions for visualizing traffic flows. Experimental outcomes illustrate that the approach can be employed in field conditions and permits better utilization of existing traffic management systems.