Open access peer-reviewed chapter

Development of a Versatile Modular Platform for Aerial Manipulators

Written By

Nikolaos Evangeliou, Athanasios Tsoukalas, Nikolaos Giakoumidis, Steffen Holter and Anthony Tzes

Submitted: March 23rd, 2020 Reviewed: September 14th, 2020 Published: October 22nd, 2020

DOI: 10.5772/intechopen.94027

From the Edited Volume

Service Robotics

Edited by Volkan Sezer, Sinan Öncü and Pınar Boyraz Baykas



The scope of this chapter is the development of an aerial manipulator platform using an octarotor drone with an attached manipulator. An on-board spherical camera provides visual information on the drone’s surroundings, while a Pan-Tilt-Zoom camera system is used to track targets. A powerful computer with a GPU offers significant on-board computational power for the visual servoing of the aerial manipulator system. This vision system, along with the Inertial Measurement Unit based controller, provides guidance in confined and outdoor spaces. Coupled with the manipulator’s force sensing capabilities, the system can interact with the environment. This aerial manipulation system is modular in that various payloads can be attached depending on the application (e.g., environmental sensing, facade cleaning, aerial netting for evader-drone geofencing, and others). Experimental studies using a motion capture system are offered to validate the system’s efficiency.


  • aerial manipulation
  • visual localization

1. Introduction

The introduction of drones has revolutionized many sectors, including but not limited to cinematography [1], search and rescue [2, 3], maintenance [4], surveillance [5, 6], delivery of goods and transportation [7, 8].

The main components of a drone are its Propelling System and its Flight Control Unit (FCU). The propelling system provides the necessary thrust to change the attitude of the drone, described by its pitch, roll and yaw angles, and thus its three dimensional motion. The dominant propelling system currently is composed of propellers, each driven by a brushless motor and Electronic Speed Controller (ESC) combination. The FCU is the “brain” of the drone, since it issues the control commands to the ESCs for changing the attitude and the pose of a drone. It usually contains GPS receiver(s), accelerometer(s), gyroscope(s), magnetometer(s) and barometer(s) coupled to environment sensing devices like laser scanners to extract the current pose of the drone. The output of an FCU is computed by taking into account the current pose and the desired reference.

Multi-rotor drones have been very popular among researchers and are typically named by their rotor count (tricopters, quadcopters, hexacopters, and octacopters). The drone’s thrust increases with the number of rotors, allowing heavier payloads to be lifted at the expense of a reduced flight time; for this reason, power tethering systems are often sought [9].

The majority of off-the-shelf drones have a 1-2 kg payload capability, with very few being capable of lifting an order of magnitude more [10]. This is primarily due to the FCU tuning required, the advanced ESCs needed, and the need to abide by the laws imposed by each country’s regulatory authority.

Addressing these challenges, this chapter presents a drone that can be modular in terms of software and hardware, depending on its mission, while lifting a high payload. The drone can operate either indoors or outdoors, has navigation and mapping capabilities, and can interact with the environment through an attached robot manipulator.

In Section 2 the mechatronic design of the drone is presented, while in Section 3 the drone’s software for localization is explained and evaluated. The drone’s ability to perform either in a collaborating or an adversarial environment using computer vision is discussed in Section 4. The aerial manipulation concept is addressed in Section 5, followed by Concluding remarks.


2. Drone’s mechatronic design

2.1 Drone electric power units

The developed octarotor drone has a take-off weight of 40 kg and a 30 min flight time. The drone’s frame was designed and fabricated in collaboration with Vulcan UAV©. The authors’ contribution in this respect relates both to extending the bare-bone design of Vulcan to accommodate payload carriage, and to fabricating the final prototype and mounting all the additional modalities mentioned in the sequel. The backbone structure consists of three ø25 mm, 1200 mm long aluminum tubes in a triangular cross-sectional configuration. Four 575 mm long rectangular aluminum arms are attached at the ends of this structure, each carrying two motors in a coaxial configuration. The arms are fixed to the main frame using a 5 mm thick carbon plate. The resulting “H-frame” configuration can be visualized in Figure 1.

Figure 1.

Drone’s backbone structure.

Although the lower motor provides 25% less thrust [11] it offers some redundancy against single motor failure. The selected 135 KV KDE© brushless motors coupled with ø71.12 cm custom designed carbon propellers, collectively provide 37.2 kg of thrust at 50% throttle input. The extra thrust can be used for rapid maneuvering of the drone and for exerting forces by the aerial manipulator shown in Section 5.

Power is provided by a pair of 12S 22 Ah LiPo batteries connected in parallel to the Power Distribution Board (PDB). At 50% thrust with full payload while hovering, the octarotor’s motors draw 11.7 A each, resulting in a flight time of $\frac{2 \times 22}{11.7 \times 8} \times 60\ \text{min} \approx 26\ \text{min}$.
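As a rough cross-check of the endurance estimate, the ideal hover time follows from dividing the total battery capacity by the total motor current. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the full rated capacity is usable and that nothing but the eight motors draws current. Under those assumptions it yields an upper bound of roughly 28 min, slightly above the 26 min quoted in the text, a gap that presumably reflects losses or a capacity reserve not itemised here.

```python
def hover_time_min(capacity_ah, packs_in_parallel, amps_per_motor, motor_count):
    """Ideal hover endurance in minutes (no capacity reserve, no other loads)."""
    total_capacity_ah = capacity_ah * packs_in_parallel   # parallel packs add capacity
    total_current_a = amps_per_motor * motor_count        # all motors draw simultaneously
    return total_capacity_ah / total_current_a * 60.0     # hours -> minutes


# Values taken from the text: two 22 Ah packs, eight motors at 11.7 A each.
ideal_minutes = hover_time_min(22.0, 2, 11.7, 8)
print(f"ideal hover time: {ideal_minutes:.1f} min")
```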

Two ø12 mm carbon rods are fixed at the underside of the main frame tubes for payload carriage. The maximum payload weight is 30 kg, and the payload can be easily detached from the main frame using quick release clamps. Similarly, the retractable landing gear assembly is attached with these clamps to the main frame tubes for enhanced modularity, as shown in Figure 2. The gear can retract within a 45° to 80° angle window using a Pulse Width Modulation (PWM) signal with a 50 Hz switching frequency, provided by the FCU’s rail pins. The landing gear operation is commanded via the MAVLink protocol command set [12].
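To make the PWM figures concrete, the following sketch maps a landing gear angle to a pulse width and to the corresponding duty cycle at 50 Hz. Only the 50 Hz frequency and the 45° to 80° window come from the text; the 1000 to 2000 µs pulse range and the linear mapping are assumptions borrowed from common hobby-servo practice, and the function names are hypothetical.

```python
PWM_FREQUENCY_HZ = 50.0  # switching frequency stated in the text


def gear_pulse_us(angle_deg, min_angle=45.0, max_angle=80.0,
                  min_pulse_us=1000.0, max_pulse_us=2000.0):
    """Linearly map a gear angle to a pulse width in microseconds.

    The pulse range is an assumed calibration, not a value from the chapter.
    """
    if not min_angle <= angle_deg <= max_angle:
        raise ValueError("angle outside the retraction window")
    fraction = (angle_deg - min_angle) / (max_angle - min_angle)
    return min_pulse_us + fraction * (max_pulse_us - min_pulse_us)


def duty_cycle(pulse_us, frequency_hz=PWM_FREQUENCY_HZ):
    """Fraction of one PWM period during which the signal is high."""
    period_us = 1e6 / frequency_hz   # 20 000 us at 50 Hz
    return pulse_us / period_us


mid_pulse = gear_pulse_us(62.5)      # halfway through the window
mid_duty = duty_cycle(mid_pulse)
```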

Figure 2.

Landing gear detail (left) and payload assembly with battery holder (right).

Additional power for peripherals and sensing modalities can be supplied through a dedicated 750 W buck converter, mounted on the payload carrier assembly, as shown in Figure 3. The converter is contained within a custom 3D printed case, and standard Unmanned Aerial Vehicle (UAV) XT30 and XT60 connectors protrude to provide 24, 19 and 12 V to the end-user.

Figure 3.

Enhanced power distribution board (left) and i7-minicomputer (right).

2.2 Flight control unit and related software

The PixHawk Cube FCU was selected [13] featuring triple redundant dampened Inertial Measurement Units (IMUs), with a modular design and industrial standard I/O connectors. Additional telemetry and R/C circuits are deployed to enable monitoring and intervention and comply with flying regulations.

The Here+ Global Navigation Satellite System (GNSS) [14] with Real-time kinematic (RTK) capabilities was selected for outdoor navigation and placed on top of a carbon fiber pole at a height of 35 cm from the main frame’s top plane. For immunity to electromagnetic interference, the primary magnetometer of the flight controller is selected to be the built-in magnetometer module of the GNSS receiver.

A high processing power 8th generation Intel NUC i7-computing unit with 32 GB RAM and 1 TB SSD, shown in Figure 3, was mounted symmetrically to the buck converter on the underside of the main frame. This 90 W computing unit allows for online computations on demanding tasks such as the visual object tracking methods of Section 4, as well as the easy development of autonomous flying applications.

On the software side, the ArduCopter flight stack [15] was selected to run on the FCU. The pose estimation is carried out by an Extended Kalman Filter (EKF) at 400 Hz. The Intel NUC companion computer is serially connected to the FCU at a baud rate of 1 Mbps, and the communication packets follow the MAVLink protocol. The NUC runs Ubuntu Linux 16.04 and all applications are developed on the Robot Operating System (ROS) and MAVROS [16] middleware with a 50 Hz refresh rate.

The developed drone without any payload can be visualized in Figure 4.

Figure 4.

Drone prototype.


3. Drone localization

3.1 Drone outdoor localization using RTK GNSS

The RTK enhancement feature of GPS is used for outdoor localization purposes. It offers more precise positioning [17] because the GPS satellite measurements are corrected using feedback from an additional stationary GPS module. The disadvantage of such systems is that they require a significant pre-flight setup time, with longer setup times yielding higher accuracy (down to the cm range).

Although the internal loop of the flight controller operates at 400 Hz, the GPS receiver streams data at a lower rate of 5 Hz. In popular flight software such as ArduPilot, this rate needs to be taken into consideration by the underlying EKFs run by the FCU. A typical comparison of the achieved accuracy using a drone in a hovering state can be seen in Figure 5.

Figure 5.

Drone’s EKF 3D-position output with (red) and without (blue) RTK correction.

The drone was flown in a hovering position with the RTK GPS module injecting measurements to the flight controller and the output of the FCU’s EKF was compared with and without the presence of the injected RTK measurements. The red line represents the EKF’s output based solely on the GPS signal, whilst the blue line indicates the same output when RTK correction (using a 30 min warmup period) is injected on the FCU.

The standard deviation was computed to be 0.74 m, 0.47 m and 0.27 m for X, Y and Z respectively when no RTK correction was applied. In contrast, the same values with RTK injection were computed to be 0.05 m, 0.02 m and 0.23 m respectively. It should be noted that there is no significant improvement in the Z-direction, indicating the need to use either a barometer or a laser sensor for ground clearance measurements.
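Per-axis figures of this kind can be obtained from a log of EKF position estimates recorded while hovering. The sketch below shows one way to do so; the function name and the sample values are hypothetical (the log is synthetic, not experimental data), and the population standard deviation is used as an assumption since the text does not state which estimator was applied.

```python
import statistics


def per_axis_std(positions):
    """Return (std_x, std_y, std_z) for a list of (x, y, z) samples in metres."""
    xs, ys, zs = zip(*positions)
    return tuple(statistics.pstdev(axis) for axis in (xs, ys, zs))


# Synthetic hover log in metres, for illustration only.
hover_log = [
    (0.00, 0.00, 1.50),
    (0.04, -0.02, 1.70),
    (-0.04, 0.02, 1.30),
    (0.00, 0.00, 1.50),
]
std_x, std_y, std_z = per_axis_std(hover_log)
print(f"std X={std_x:.3f} m, Y={std_y:.3f} m, Z={std_z:.3f} m")
```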

3.2 Drone indoor localization

During indoor navigation: a) the lack of GPS guidance, b) pressure changes affecting the barometric sensor, and c) power lines affecting compass accuracy can severely affect the output of an FCU. With only the accelerometers and gyroscopes being unaffected, the injection of an external feedback source to the FCU is considered essential. Such feedback is usually based on visual techniques, such as those presented in [18, 19].

For experimentation purposes, a Motion Capture System (MoCaS) [20] injects measurements into the ArduCopter flight stack. The system comprises 24 Vicon cameras uniformly distributed within a rectangular space of 15 × 5 × 8 m (L × W × H). The utilized system allows simultaneous tracking of 100 objects at 120 Hz with sub-millimeter accuracy. Despite the MoCaS’s high refresh rate, the ArduCopter flight stack at the FCU accepts external positioning data at only a 4 Hz streaming rate.
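The mismatch between the 120 Hz capture rate and the 4 Hz input rate implies that most motion capture samples have to be dropped before they reach the FCU. A minimal, time-based throttle is sketched below; the class name is hypothetical and this is not the authors’ implementation, only an illustration of the rate reduction.

```python
class RateLimiter:
    """Pass through at most `rate_hz` samples per second and drop the rest."""

    def __init__(self, rate_hz):
        self.min_interval = 1.0 / rate_hz
        self.last_sent = None

    def accept(self, timestamp):
        """Return True if the sample at `timestamp` (seconds) should be forwarded."""
        # The small epsilon guards against floating-point round-off in timestamps.
        if self.last_sent is None or timestamp - self.last_sent >= self.min_interval - 1e-9:
            self.last_sent = timestamp
            return True
        return False


limiter = RateLimiter(rate_hz=4.0)
timestamps = [i / 120.0 for i in range(120)]          # one second of 120 Hz samples
forwarded = [t for t in timestamps if limiter.accept(t)]
print(len(forwarded), "of", len(timestamps), "samples forwarded")
```

With these settings every 30th sample is kept, so the FCU sees a new external position every 0.25 s.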

The ROS software at the MoCaS operates at 25 Hz and wirelessly streams the measurements to the drone’s FCU. The latency is $t_d = t_{VC} + t_{CFCU} + t_{FCU}$, where $t_{VC}$ ($t_{CFCU}$) is the delay of data streaming to the companion computer (FCU), and $t_{FCU}$ is the delay of processing the data on the FCU. In the developed system, typical measured values are $t_{VC} \approx 5$ ms, $t_{CFCU} = 20$ ms and $t_{FCU} = 40$ ms, resulting in $t_d \approx 75$ ms.

Because of the MoCaS’s efficiency, its weighting in the EKF is ten times larger than the GPS’s weight when flying outdoors. Subsequently, the efficiency of the implementation is assessed by comparing the EKF’s position output with and without the MoCaS’s feedback injection. In Figure 6 the drone’s position error (in each axis) between the aforementioned two quantities is visualized, where the red, green and blue lines represent the error along the X, Y and Z axes respectively.

Figure 6.

Drone’s EKF position error when MoCaS’ measurements are not injected.

Real time pose tracking is satisfactorily achieved and minor differences are attributed to the EKF’s weighting of the accelerometer and gyroscope measurements during calculations.


4. Drone awareness of surrounding environment

An important aspect of aerial navigation is awareness of the surrounding environment, including the close proximity of cooperating or evasive drones [21, 22], so as to avoid potential contacts.

High accuracy awareness may not be feasible [23] and can become prohibitive in indoor environments; visual sensors along with Lidars can assist in this aspect. A spherical camera provides an all-around visualization of the surroundings and can detect neighboring targets. A Pan-Tilt-Zoom (PTZ) camera with a limited Field of View (FoV) can then provide a more accurate description of this target. The suggested approach relies on the detection of moving objects. Correlation techniques and/or deep learning Visual Object Tracking (VOT) methods [24] are employed for this purpose.

4.1 Environment awareness using a spherical camera

Rather than using several cameras with a limited Field of View (FoV) to observe the surrounding space, a 360° FoV camera [25] is used. The spherical camera records images in a “spherical format” which is comprised of two wide-angle frames stitched together to form a virtual sphere [25]. The image can be rectified to the classic distortionless rectilinear format of a pinhole camera [26]. However, due to the nature of the “spherical format,” it is preferable to split the image into smaller segments and rectify each one to achieve results closer to the pinhole camera model. Instead of splitting into equal sized square segments [27], each image is split into tiles based on orientation-independent circles. With every tile having a different a-priori known calibration, the rectification can be carried out for each one independently, without high computational cost. By applying the solution and rectifying the image in Figure 7, for a selection of N = 12 tiles, the resulting rectified partitions are visualized in Figure 8.

Figure 7.

Spherical flat image.

Figure 8.

Rectified N = 12 partitions for a single “spherical” frame.
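The geometry behind rectifying one tile can be sketched as a mapping from a pixel of a virtual pinhole view to the corresponding location in the panorama. The example below is a generic illustration under stated assumptions, not the authors’ calibrated tiling: it assumes an ideal equirectangular panorama (longitude across the width, latitude across the height), square tiles, and a hypothetical function name and parameter set.

```python
import math


def tile_pixel_to_equirect(u, v, tile_size, fov_deg, yaw_deg, pitch_deg, pano_w, pano_h):
    """Map pixel (u, v) of a virtual pinhole tile to (x, y) in an equirectangular image.

    Positive yaw turns the tile to the right, positive pitch tilts it upwards.
    """
    focal = (tile_size / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    # Ray in the tile camera frame: x right, y down, z forward.
    x, y, z = u - tile_size / 2.0, v - tile_size / 2.0, focal
    pitch, yaw = math.radians(pitch_deg), math.radians(yaw_deg)
    # Rotate about the x axis (pitch), then about the vertical axis (yaw).
    y1 = y * math.cos(pitch) - z * math.sin(pitch)
    z1 = y * math.sin(pitch) + z * math.cos(pitch)
    x2 = x * math.cos(yaw) + z1 * math.sin(yaw)
    z2 = -x * math.sin(yaw) + z1 * math.cos(yaw)
    lon = math.atan2(x2, z2)                       # 0 at the panorama centre
    lat = math.atan2(-y1, math.hypot(x2, z2))      # positive towards the zenith
    px = (lon / (2.0 * math.pi) + 0.5) * pano_w
    py = (0.5 - lat / math.pi) * pano_h
    return px, py


# Centre pixel of a 256 px, 90 degree tile looking straight ahead in a 2000x1000 panorama.
centre = tile_pixel_to_equirect(128, 128, 256, 90.0, 0.0, 0.0, 2000, 1000)
print(centre)
```

Rectifying a whole tile amounts to evaluating this mapping for every tile pixel and sampling the panorama at the returned coordinates; because the mapping depends only on the tile parameters, it can be precomputed once per tile.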

For the case of collaborating drones, it is assumed that each one carries passive markers for visual recognition. Subsequently, the rectified images are processed for identification of these markers [28, 29, 30, 31] thus estimating the neighboring drone’s pose. For improved pose extraction, the solution of a multi-marker rhombicuboctahedron formation arrangement [32] is assumed to be present in each target.

The experimental setup for evaluation consists of the spherical camera mounted on a 2.7 m protruding stick, which in turn is mounted to the underside of the octarotor using the generic mount base discussed in Section 5. A rhombicuboctahedron arrangement with markers on its faces is attached to a DJI-Mavic drone. Both UAVs were located within the MoCaS test volume, as shown in Figure 9. The quadrotor drone was flown in a randomized trajectory in the vicinity of the octarotor.

Figure 9.

Experimental setup for 360° camera relative localization.

In Figure 10 the relative 3D-flight path between the drones is presented. The results recorded from the MoCaS and the visual ones are shown; for the cases in which the marker was detected, the relative accuracy of these measurements was 2.2 cm.

Figure 10.

3D-relative path inferred through the MoCaS and the visual method between two collaborating drones.

4.2 Visual object tracking using a pan-tilt-zoom camera

Having identified the adversary or collaborating drone, a PTZ-camera is utilized to track its motion. This Visual Object Tracking (VOT) problem is challenging when the drone is occluded, thus Long Term Efficient (LTE) algorithms are sought for moving objects. Despite the development of Short Term Efficient (STE) algorithms [33] using either correlation methods or deep learning ones, an initial bounding box containing the target is required. In the authors’ case, the developed VOT algorithm employs two methods relying on a comparison: (a) between the tracking of the points transformed based on the PTZ-parameters and those using an optical flow, and (b) between the homography matrix transformed points and the optical flow.

The first method is based on the known PTZ motion and the IMU’s accelerometer and gyroscope measurements, in order to estimate the motion of the pixels due to the motion of the camera in relation to the surroundings [34]. An IMU with triple accelerometers, gyroscopes, and magnetometers is attached to the PTZ-camera, as shown in Figure 11. While the enhancement provided by the PTZ camera allows for efficient VOT, the need to control its parameters (pan, tilt, and zoom) while placed on a floating base and in the presence of several occlusions needs to be addressed.

Figure 11.

PTZ-camera for visual object tracking.

The objective is to provide the bounding box $p_t$ of the approaching drone from the camera attached to the “Tracking drone”, as shown in Figure 12. The IMU’s sensor measurements are sent to an embedded EKF to compute the camera’s pan and tilt angles in the global coordinate system (and their angular velocities) at a 100 Hz rate. The angular velocities are used to compute the optical flow, and the angles are used for VOT purposes.

Figure 12.

Sample drone tracking setup.

A GPU-based background subtraction technique eliminates the background pixels leaving only the moving object pixels. The bounding box encapsulates all pixels of the moving drone and the pan and tilt angles are adjusted to position the centroid of the moving bounding box at the image’s center while the zoom is adjusted to enlarge this box. The communication between the i7-minicomputer and the PTZ-camera is shown in Figure 13, while the VOT algorithm is shown in Figure 14.
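The centring step can be illustrated with a simple pinhole-camera calculation that converts the pixel offset of the bounding box centroid into pan and tilt corrections. This is a sketch under assumptions rather than the controller used on the platform: the function name is hypothetical, the focal length in pixels is assumed to be known for the current zoom level, and positive pan is taken to the right while positive tilt is taken upwards.

```python
import math


def pan_tilt_correction(bbox, image_w, image_h, focal_px):
    """Angular offsets (degrees) that bring the bounding-box centroid to the image centre.

    bbox is (x_min, y_min, x_max, y_max) in pixels; image y grows downwards.
    """
    cx = (bbox[0] + bbox[2]) / 2.0
    cy = (bbox[1] + bbox[3]) / 2.0
    pan = math.degrees(math.atan2(cx - image_w / 2.0, focal_px))
    tilt = math.degrees(math.atan2(image_h / 2.0 - cy, focal_px))
    return pan, tilt


# Illustrative values: a 1920x1080 frame, 960 px focal length, target right of centre.
pan_deg, tilt_deg = pan_tilt_correction((1400, 500, 1480, 580), 1920, 1080, 960.0)
print(f"pan by {pan_deg:.2f} deg, tilt by {tilt_deg:.2f} deg")
```

In a closed loop these offsets would typically be fed to the pan and tilt actuators as setpoint increments, with the zoom adjusted separately from the size of the bounding box.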

Figure 13.

PTZ-camera hardware tracking and control schematic.

Figure 14.

Pan-tilt-zoom/IMU and optical flow VOT algorithm.

The feature points are recognized in each frame and the transformation matrix between successive frames follows [35]; the formulas provide the transformation based on the PTZ-parameters, and an augmentation is needed to account for the camera’s translation, as provided by the on-board accelerometers. The pixels that correspond to static background objects will follow the motion predicted from the camera motion and coincide with the positions predicted by an optical flow based estimation, while the rest will be classified as belonging to moving objects of interest (Figure 15). The computation of the optical flow parallels that of the Lucas-Kanade method using a pyramidal scheme with variable image resolutions [36]. The basic optical flow premise is to locate, in the current frame captured by the camera, an image feature found in the previous frame.

Figure 15.

Background/foreground estimation using Homography-based VOT.

The second method relies only on visual feedback and homography calculations [37] between two successive frames and requires neither the PTZ nor the IMU measurements, as shown in Figure 16. Initially, a set of “strong image features” is identified in the previous camera frame and an optical flow technique is used to estimate the position of the features in the current frame. The method involves the discovery of special image areas with specific characteristics.

Figure 16.

Homography-based VOT.

The algorithm used for finding the strong corner features relies on the GPU-enhanced “goodFeaturesToTrack” [38]. Under the assumption that the background is formed by the majority of the pixels, a homography is calculated that transforms the feature positions from the previous to the current frame; these correspond to the background pixels. The previous frame features are then transformed using the homography to get their position in the current frame. Herein, it is assumed that the background points transformed with the homography will coincide with the ones estimated by the optical flow, while the moving objects’ features estimated by the optical flow will diverge from the homography transformed pixels.
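The comparison between homography-transformed features and optical-flow estimates can be sketched as a simple distance test. The code below assumes that the homography and the flow positions have already been computed elsewhere (for instance with a robust fit over the tracked corners, which is an assumption, not a detail given in the text); the function names, the pixel tolerance and the sample points are hypothetical.

```python
def apply_homography(H, point):
    """Transform a 2D point with a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w


def split_background_foreground(H, prev_pts, flow_pts, tol_px=2.0):
    """Label features as background (flow agrees with the homography) or moving."""
    background, moving = [], []
    for prev, flow in zip(prev_pts, flow_pts):
        hx, hy = apply_homography(H, prev)
        error = ((hx - flow[0]) ** 2 + (hy - flow[1]) ** 2) ** 0.5
        (background if error <= tol_px else moving).append(flow)
    return background, moving


# Illustrative case: the camera motion shifts the background 5 px to the right.
H = [[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
prev_pts = [(10.0, 10.0), (50.0, 20.0), (30.0, 40.0)]
flow_pts = [(15.0, 10.0), (55.2, 20.1), (42.0, 33.0)]   # the last feature moves on its own
background, moving = split_background_foreground(H, prev_pts, flow_pts)
print("background:", background, "moving:", moving)
```

The features left in the moving set are the candidates from which a bounding box around the target can be formed.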

One downside of the technique is that when the tracked object remains static and blends with the background, the method is unable to identify it. In this case, a fast correlation-based STE-tracker relying on the MOSSE algorithm [39] is also used in order to estimate the drone’s position until new measurements of a moving drone are available. Several more robust but slower tracking algorithms were evaluated, including KCF [40], CSRT [41], MIL [42], MedianFlow [43] and TLD [44]; the MOSSE algorithm was selected because of its fast implementation (600 Frames-per-Second (FpS)). A Kalman prediction scheme [45] was used to predict the bounding box, including the one obtained from MOSSE, in the presence of noisy measurements of the moving object’s center, using a 2D constant-acceleration model for the estimated tracking window.
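A constant-acceleration Kalman predictor of this kind can be sketched for one image axis as follows; a 2D tracker would run one such filter per axis of the bounding box center. This is a minimal illustration under assumptions, not the authors’ implementation: the noise values, the diagonal process noise and the function names are all hypothetical, and the measurement is assumed to be the position only.

```python
def predict(x, P, dt, q):
    """Constant-acceleration prediction for state x = [pos, vel, acc] with covariance P."""
    F = [[1.0, dt, 0.5 * dt * dt],
         [0.0, 1.0, dt],
         [0.0, 0.0, 1.0]]
    x_new = [sum(F[i][k] * x[k] for k in range(3)) for i in range(3)]
    FP = [[sum(F[i][k] * P[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    P_new = [[sum(FP[i][k] * F[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
    for i in range(3):
        P_new[i][i] += q   # simple diagonal process noise (assumed form)
    return x_new, P_new


def update(x, P, z, r):
    """Correct the prediction with a position measurement z of variance r."""
    s = P[0][0] + r                       # innovation variance for H = [1, 0, 0]
    K = [P[i][0] / s for i in range(3)]   # Kalman gain
    residual = z - x[0]
    x_new = [x[i] + K[i] * residual for i in range(3)]
    P_new = [[P[i][j] - K[i] * P[0][j] for j in range(3)] for i in range(3)]
    return x_new, P_new


# Track a centre coordinate moving at 2 px per frame, then predict through one missed frame.
x = [0.0, 0.0, 0.0]
P = [[100.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
for k in range(1, 21):
    x, P = predict(x, P, dt=1.0, q=0.01)
    x, P = update(x, P, z=2.0 * k, r=1.0)
x_occluded, _ = predict(x, P, dt=1.0, q=0.01)   # no measurement available
print("estimate:", x, "prediction for the next frame:", x_occluded)
```

When no reliable measurement arrives, as in the occluded or static-target case, only the prediction step is applied and the window is propagated with the last estimated velocity and acceleration.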


5. Aerial manipulation

A seven Degree-of-Freedom (DoF) robotic arm has been attached for exerting forces on surfaces in aerial manipulation tasks, such as grinding, cleaning or physical contact based inspection [46]. The Kinova Gen 2 Assistive 7DoF robot [47] was attached through a custom mount. This manipulator is characterized by a 2:1 weight to payload ratio, with the available payload at the end-effector being 1.2 kg grasped by the 3-finger gripper. Torque sensing is provided at each joint and these measurements along with the joint angles are communicated to the main computer at 100 Hz under ROS middleware.

For mounting the robot to the drone’s payload attachment rods, a generic payload mount base was designed and manufactured. The base is firmly mounted to the drone’s payload carrying rods utilizing quick attachment clamps. The construction material was selected to be T-6065 aluminum and features four 10 mm openings for attaching the payload. A second rigid base was similarly designed for attaching the robot’s base to the generic payload mount base using 10 mm hex bolts. An exploded view of the entire mounting configuration can be visualized in Figure 17. The aerial manipulator is shown in Figure 18.

Figure 17.

Universal mount of robotic manipulators on aerial platform.

Figure 18.

Aerial manipulation system with PTZ-camera.

The indoor position hold scheme of Section 3.2 was expanded [48, 49] so as to utilize the manipulator in a surface ultra-sound scanning scenario. The surface is placed at a 45° angle in an a-priori known position. After take-off, the FCU retracts the landing gear (if commanded) and the manipulator is moved to the joint angles [180, 90, 180, −30, 90]°. On arrival at the desired setpoint pose, the manipulator’s end tip comes into contact with the surface and the system hovers at the specified pose for some time to perform the area scan. The process is completed with the onboard computer initiating a landing after returning to the initial take-off position.

The described scheme is intended for future use in the Abu Dhabi airport’s Midfield Terminal [50], for scanning the integrity of critical structures such as facades and rooftops. Figure 19 presents the hovering pose of the physical prototype while scanning the surface, whilst the full video concept including moments of the experiment is available through the link given in [51].

Figure 19.

Surface ultra-sound scanning utilizing aerial manipulation.


6. Conclusions/discussion

In this chapter the mechatronic aspects (hardware and software) of a heavy lift drone are presented. This drone can operate either indoors or outdoors in an autonomous manner. Equipped with spherical and PTZ cameras, the drone has environment perception capabilities and can collaborate with other drones. A robot manipulator is attached to the drone for physical interaction purposes. The ability to carry out the aforementioned tasks in an accurate and modular manner demonstrates the efficiency of the system for future robotic aerial applications of increased complexity. However, many challenges are yet to be examined. The authors’ aim is to focus future research on autonomous navigation in confined environments as well as aerial manipulation with high interaction forces [52].

In aerial manipulation, the challenge lies with the forces at the tip of a stiff 7-DoF manipulator being directly transferred to the main UAS frame. Additionally, their orientation can vary, depending on the pose of the manipulator. Thus, the ability of the aerial manipulator to robustly maintain its position and attitude while performing the task is mandatory. Compared to the experiments presented in this chapter, the forces induced by such operations are calculated to be in the range of 10 to 100 N. Subsequently, although the existing position controller of the ArduCopter flight stack is able to maintain a proper pose during ultrasound scanning of inclined areas, advanced control techniques [49] will be utilized in the sequel. The authors intend to test the efficiency of the built-in attitude controller of the ArduCopter flight stack, as well as exploit the adaptive backstepping control strategies in [48, 49] and other (model predictive) control techniques. The implementation of such controllers relies on the ability to directly control the angular velocity of the drone’s motors independently, at rates greater than or equal to 1 kHz.


Conflict of interest

The authors declare no conflict of interest.




Abbreviations

UAV: unmanned aerial vehicle
R/C: remote control
PDB: power distribution board
IMU: inertial measurement unit
FCU: flight control unit
GNSS: global navigation satellite system
GPS: global positioning system
RTK: real-time kinematic
EKF: extended Kalman filter
ROS: robot operating system
VOT: visual object tracking


References

  1. Mademlis I, Mygdalis V, Nikolaidis N, Pitas I. Challenges in Autonomous UAV Cinematography: An Overview. In: 2018 IEEE International Conference on Multimedia and Expo; 2018. p. 1-6.
  2. Nourbakhsh IR, Sycara K, Koes M, Yong M, Lewis M, Burion S. Human-robot teaming for search and rescue. IEEE Pervasive Computing. 2005;4(1):72-79.
  3. Hildmann H, Kovacs E. Using Unmanned Aerial Vehicles (UAVs) as Mobile Sensing Platforms (MSPs) for Disaster Response, Civil Security and Public Safety. Drones. 2019;3(3):59.
  4. Prada Delgado J, Ramon Soria P, Arrue BC, Ollero A. Bridge Mapping for Inspection Using an UAV Assisted by a Total Station. In: ROBOT 2017: Third Iberian Robotics Conference. Springer International Publishing; 2018. p. 309-319.
  5. Ding G, Wu Q, Zhang L, Lin Y, Tsiftsis TA, Yao Y. An Amateur Drone Surveillance System Based on the Cognitive Internet of Things. IEEE Communications Magazine. 2018;56(1):29-35.
  6. Ganesh Y, Raju R, Hegde R. Surveillance Drone for Landmine Detection. In: 2015 International Conference on Advanced Computing and Communications; 2015. p. 33-38.
  7. Barmpounakis E, Vlahogianni E, Golias J. Unmanned Aerial Aircraft Systems for transportation engineering: Current practice and future challenges. International Journal of Transportation Science and Technology. 2016;5(3):111-122.
  8. Kellermann R, Biehle T, Fischer L. Drones for parcel and passenger transportation: A literature review. Transportation Research Interdisciplinary Perspectives. 2020;p. 100088.
  9. Papachristos C, Tzes A. The power-tethered UAV-UGV team: A collaborative strategy for navigation in partially-mapped environments. In: 22nd Mediterranean Conference on Control and Automation; 2014. p. 1153-1158.
  10. D130 X8 Titan Drone; 2020. Internet. Available from:
  11. Otsuka H, Nagatani K. Thrust loss saving design of overlapping rotor arrangement on small multirotor unmanned aerial vehicles. In: 2016 IEEE International Conference on Robotics and Automation; 2016. p. 3242-3248.
  12. Koubâa A, Allouch A, Alajlan M, Javed Y, Belghith A, Khalgui M. Micro Air Vehicle Link (MAVlink) in a Nutshell: A Survey. IEEE Access. 2019;7:87658-87680.
  13. PixHawk 2 1 flight controller; 2020. Internet. Available from:
  14. Here+ RTK GNSS module; 2020. Internet. Available from:
  15. The ArduCopter flight stack; 2020. Internet. Available from:
  16. MAVLink extendable communication node for ROS; 2020. Internet. Available from:
  17. Rietdorf A, Daub C, Loef P. Precise positioning in real-time using navigation satellites and telecommunication. In: Proceedings of The 3rd Workshop on Positioning and Communication; 2006.
  18. Chuang HM, He D, Namiki A. Autonomous Target Tracking of UAV Using High-Speed Visual Feedback. Applied Sciences. 2019;9(21):4552.
  19. Maravall D, de Lope J, Fuentes JP. Navigation and Self-Semantic Location of Drones in Indoor Environments by Combining the Visual Bug Algorithm and Entropy-Based Vision. Frontiers in Neurorobotics. 2017;11:46.
  20. Vicon Motion Tracking system; 2020. Internet. Available from:
  21. Finn R, Wright D. Unmanned aircraft systems: Surveillance, ethics and privacy in civil applications. Computer Law & Security Review. 2012;28(2):184-194.
  22. Sappington RN, Acosta GA, Hassanalian M, Lee K, Morelli R. Drone stations in airports for runway and airplane inspection using image processing techniques. In: AIAA Aviation 2019 Forum; 2019. p. 3316.
  23. Woods AC, La HM. Dynamic target tracking and obstacle avoidance using a drone. In: International Symposium on Visual Computing. Springer; 2015. p. 857-866.
  24. Unlu HU, Niehaus PS, Chirita D, Evangeliou N, Tzes A. Deep Learning-based Visual Tracking of UAVs using a PTZ Camera System. In: IECON 2019 - 45th Annual Conference of the IEEE Industrial Electronics Society. vol. 1; 2019. p. 638-644.
  25. Aghayari S, Saadatseresht M, Omidalizarandi M, Neumann I. Geometric calibration of full spherical panoramic Ricoh-Theta camera. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences IV-1/W1 (2017). 2017;4:237-245.
  26. Young M. Pinhole optics. Applied Optics. 1971;10(12):2763-2767.
  27. Saff EB, Kuijlaars AB. Distributing many points on a sphere. The mathematical intelligencer. 1997;19(1):5-11.
  28. Munoz-Salinas R. Aruco: a minimal library for augmented reality applications based on OpenCV. Universidad de Córdoba. 2012.
  29. Olson E. AprilTag: A robust and flexible visual fiducial system. In: 2011 IEEE International Conference on Robotics and Automation. IEEE; 2011. p. 3400-3407.
  30. Wang J, Olson E. AprilTag 2: Efficient and robust fiducial detection. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE; 2016. p. 4193-4198.
  31. ALVAR A. Library for Virtual and Augmented Reality; 2016. Available from:
  32. Tsoukalas A, Tzes A, Khorrami F. Relative Pose Estimation of Unmanned Aerial Systems. In: 2018 26th Mediterranean Conference on Control and Automation. IEEE; 2018. p. 155-160.
  33. Visual Object Tracking 2019; 2019. Internet. Available from:
  34. Tsoukalas A, Evangeliou N, Giakoumidis N, Tzes A. Airborne visual tracking of UAVs with a Pan-Tilt-Zoom Camera. In: International Conference on Unmanned Aerial Systems; 2020. p. 66-71.
  35. Doyle DD, Jennings AL, Black JT. Optical flow background subtraction for real-time PTZ camera object tracking. In: 2013 IEEE International Instrumentation and Measurement Technology Conference; 2013. p. 866-871.
  36. Bouguet JY. Pyramidal implementation of the Lucas Kanade feature tracker. Intel Corporation, Microprocessor Research Labs. 2000.
  37. Szeliski R. Computer Vision: Algorithms and Applications. 1st ed. Berlin, Heidelberg: Springer-Verlag; 2010.
  38. Shi J, Tomasi C. Good Features to Track. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition; 1994. p. 593-600.
  39. Bolme D, Beveridge J, Draper B, Lui Y. Visual object tracking using adaptive correlation filters. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; 2010. p. 2544-2550.
  40. Danelljan M, Khan FS, Felsberg M, v d Weijer J. Adaptive Color Attributes for Real-Time Visual Tracking. In: IEEE Conference on Computer Vision and Pattern Recognition; 2014. p. 1090-1097.
  41. Lukezic A, Vojir T, Cehovin L, Matas J, Kristan M. Discriminative Correlation Filter with Channel and Spatial Reliability. International Journal of Computer Vision. 2016 11;126.
  42. Babenko B, Yang M, Belongie S. Visual tracking with online Multiple Instance Learning. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009. p. 983-990.
  43. Kalal Z, Mikolajczyk K, Matas J. Forward-Backward Error: Automatic Detection of Tracking Failures. In: 2010 20th International Conference on Pattern Recognition; 2010. p. 2756-2759.
  44. Kalal Z, Mikolajczyk K, Matas J. Tracking-Learning-Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2012 July;34(7):1409-1422.
  45. Zarchan P, Musoff H. Fundamentals of Kalman Filtering: A Practical Approach. American Institute of Aeronautics and Astronautics, Inc.; 2015.
  46. Jung S, Song S, Youn P, Myung H. Multi-Layer Coverage Path Planner for Autonomous Structural Inspection of High-Rise Structures. In: IEEE/RSJ Int. Conference on Intelligent Robots and Systems; 2018. p. 1-9.
  47. Kinova Gen2 assistive manipulator; 2020. Internet. Available from:
  48. Stergiopoulos Y, Kontouras E, Gkountas K, Giannousakis K, Tzes A. Modeling and control aspects of a UAV with an attached manipulator. In: 24th Mediterranean Conference on Control and Automation; 2016. p. 653-658.
  49. Chaikalis D, Khorrami F, Tzes A. Adaptive Control Approaches for an Unmanned Aerial Manipulation System. In: International Conference on Unmanned Aerial Systems; 2020. p. 498-503.
  50. UAE Midfield Terminal Project; 2020. Internet. Available from:
  51. UAE Midfield Terminal structure maintenance promotional video; 2020. Internet. Available from:
  52. Abbasi F, Mesbahi A, Velni JM. Coverage control of moving sensor networks with multiple regions of interest. In: American Control Conference; 2017. p. 3587-3592.
