Abstract
Maintaining reliable wireless connectivity is essential for the continuing growth of mobile devices and their massive access to the Internet of Things (IoT). However, terrestrial cellular networks often fail to meet the required quality of service (QoS) because of limited spectrum capacity. Deploying more base stations (BSs) in the area of concern is costly and requires regular maintenance. Alternatively, unmanned aerial vehicles (UAVs) are a potential solution because they can provide on-demand coverage and have a high likelihood of strong line-of-sight (LoS) communication links. Therefore, this chapter focuses on UAV deployment and movement design that supports existing BSs by reducing their data traffic load and providing reliable wireless communication. Specifically, we design the UAV’s deployment and trajectory under an efficient resource allocation strategy, i.e., assigning devices’ association indicators and transmit power to maximize the overall system throughput and minimize the total energy consumption of all devices. For these implementations, we adopt a reinforcement learning framework because it does not require complete information about the system environment. The proposed methodology finds the optimal policy using a Markov decision process, exploiting previous interactions with the environment. The proposed technique significantly improves the system’s performance compared with the benchmark schemes.
Keywords
- unmanned aerial vehicle
- reinforcement learning
- energy efficiency
- offloading
- throughput
1. Introduction
With the proliferation of mobile electronic devices, such as smartphones, tablets, and Internet of Things (IoT) gadgets, the need for high-speed wireless connectivity has been growing rapidly [1]. However, existing cellular networks, with their limited spectrum, coverage, and energy capacity, fail to satisfy users’ quality of service (QoS) requirements. Hence, next-generation 5G technologies, such as device-to-device (D2D) communications, ultra-dense small cell networks, and millimeter wave (mmW) communications, are emerging as potential alternatives to deal with such issues [2, 3]. However, these modern 5G cellular networks face several challenges related to resource allocation, backhaul interference, high reliance on the line-of-sight (LoS) link, and signal blockage. On the other hand, integrating unmanned aerial vehicles (UAVs) into fifth-generation (5G) and sixth-generation (6G) cellular networks as aerial base stations is a promising way to achieve several goals, namely ubiquitous accessibility, robust navigation, and ease of monitoring and management, because UAVs can establish LoS-dominant air-to-ground channels in a controllable manner [4]. Notably, a cellular-connected UAV-assisted system gains significant performance improvement over existing point-to-point UAV-ground communication in terms of coverage and throughput [5]. UAVs can also offload temporary high-traffic demands from terrestrial base stations (BSs) during large crowd events such as festivals, concerts, and stadium games [6]. Therefore, the utility of UAVs in a cellular network is directly related to the number of users they serve. Nevertheless, many challenges related to the utilization of UAVs need to be addressed, including their deployment strategy, trajectory optimization, and resource allocation under flight time limitations, all of which affect the instantaneous LoS probability and strongly influence system performance.
The relevant studies [7, 8, 9, 10] optimized the trajectory and deployment of UAVs in different circumstances. However, most of them incorporate nonlinear algorithms that rely on average spatial throughput. Thus, their computational complexity grows rapidly with the number of users and the flight time. Moreover, in practice, without prior knowledge about the network state, it becomes very difficult for a UAV to find its path to accomplish a given real-time task. Alternatively, machine learning (ML) techniques [11, 12, 13] intelligently support UAVs and ground users in performing mission-oriented operations with low complexity when complete network information is not available. In particular, reinforcement learning (RL), a branch of ML, can search for the optimal policy through trial and error while interacting with the environment [14]. Hence, this chapter investigates the optimal deployment, trajectory, and resource allocation of UAVs to meet the throughput requirements of the cellular network.
2. Background
The existing literature focuses on the deployment and movement of UAV relays for numerous applications. In [15], the authors estimated the optimal UAV relay position in a multi-rate communication system using theoretical and simulated analysis. The work in [16] investigated the mission planning of UAV relays to improve the connectivity of ground users. The authors of [17, 18] maximized the lower bound of the uplink transmission rate over the link between the UAV relay and ground devices using dynamic heading adjustment approaches. For throughput maximization of the mobile relaying system, iterative algorithms were developed in [19, 20] that jointly optimize the relays’ trajectory and the transmit power of the sources and UAVs while satisfying practical constraints. In [21], the authors maximized the UAV relay network’s throughput by optimizing transmit power, bandwidth, transmission rate, and relay deployment. However, these works use a model-based centralized approach that requires all necessary system parameters. Additionally, a research gap still exists in enhancing network performance for source-destination device pair communication. To overcome these shortcomings, Indu et al. [22] minimized the energy consumption of a UAV along its trajectory using a genetic algorithm (GA). The authors in [6] proposed two meta-heuristic algorithms, namely GA and particle swarm optimization (PSO), to find the optimal UAV trajectory that satisfies users’ minimum data rate requirements. They showed that PSO significantly improves the UAV’s wireless coverage compared to GA. Although meta-heuristic algorithms can deal with the complexity of UAV path planning, challenges remain in exchanging information between the UAV and the core network, because the constraints are either unavailable or their gradients are difficult to obtain analytically.
Another line of research studied the mobility management of UAVs for resource allocation and coverage optimization using RL techniques to deal with convergence issues. Kawamoto et al. [23] presented a Q-learning-based resource allocation algorithm for UAVs that allocates time slots and modulation schemes. The work in [24] presented a framework for the optimal UAV trajectory under a given data rate constraint, which relies on a state-action-reward-state-action (SARSA) algorithm. Hu et al. [25] proposed a real-time sensing and transmission protocol in UAV-aided cellular networks and designed optimal UAV trajectories under limited spectrum resources using RL based on a Q-learning algorithm. Furthermore, the authors of [26] transformed the UAV trajectory optimization problem of maximizing the cumulative data collected from sensors into a Markov decision process (MDP) and proposed two stochastic modeling RL algorithms, namely Q-learning and SARSA, to learn the UAV’s policy. They showed that SARSA outperforms Q-learning owing to its adaptive state update rule. To the best of our knowledge, the coupled relationship among UAV trajectory, device association, and transmit power allocation of IoT devices for the enhancement of network lifetime has not been investigated for the data collection process of UAV-assisted IoT networks.
3. Channel characterization of UAV-operated communication system
This section proposes a multi-hop radio frequency and free space optical (RF-FSO) communication framework that analytically optimizes the UAV’s altitude to enhance the performance of a relaying system. Here, we minimize the outage probability and symbol error rate based on independent and identically distributed statistical parameters, i.e., pointing errors, atmospheric turbulence, and scintillation.
3.1 Channel model
Consider a multi-hop hybrid RF–FSO system as shown in Figure 1, where ground base stations, each equipped with a single antenna, exchange data periodically. Since there are significant obstacles in the LoS path, a direct link cannot be established between them. Therefore, two UAVs are deployed at a certain altitude and employed as relays between the source and destination. These UAVs operate as RF and optical link transceiver modules with single-directional apertures. Depending on the environmental conditions, the source-to-destination link is divided into three different channels, i.e., Ground to UAV (G2U), UAV to UAV (U2U), and UAV to Ground (U2G) channels.
3.1.1 G2U channel model
As ground to UAV channel consists of RF signals, experiencing small-scale fading and large-scale path loss, the received symbol at UAV
where,
where
where, the average SNR is given as,
3.1.2 U2U channel model
UAV
where
Since the optical link between UAV
where Γ (.) is the Gamma function,
3.1.3 U2G channel model
After receiving the optical signal
where
where
3.2 Performance metrics of multihop RF-FSO system
3.2.1 Outage probability
It is defined as the probability that instantaneous SNR is less than the minimum required threshold level,
Cumulative distribution function (CDF) of equivalent SNR is expressed by,
where
3.2.2 Symbol error rate
It is defined as the probability of false estimation of the received symbol, which can be expressed as [32]
where,
3.3 UAVs’ optimal altitude
According to Eq. (11), outage probability is a function of UAV’s altitude, distance from source to destination, and distance between the projection points of UAVs on the ground and end users. For these given parameters values, the optimal altitude is obtained as
where the optimal altitude must satisfy the following condition [33]
Finally, the optimal elevation angle at the receiver side
where
3.4 Numerical results
In this section, we provide numerical insights of optimal UAVs’ altitude and corresponding performance analysis and then cross-validate the proposed methodology using Monte-Carlo simulation. We assume that the system is operated under moderate and strong atmospheric turbulence conditions with a maximum free space optical distance 7 km, where the average SNR is set as
The variations of elevation angle corresponding to the optimal UAVs’ altitude for the given distance between the projection points of UAVs on the ground and end users under moderate atmospheric turbulence conditions are depicted in Figure 2. According to this figure, the optimal elevation angles decrease with the increase in distance from the end-user location to the projection point of the UAVs on the ground because the variation of optimal elevation angle follows Eq. (15).
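The Monte-Carlo cross-validation mentioned at the start of this section can be illustrated on a much simpler link than the multi-hop RF-FSO system. The sketch below assumes a single Rayleigh-faded RF hop, for which the instantaneous SNR is exponentially distributed and the outage probability has the closed form 1 - exp(-threshold / mean SNR). The function names, sample size, and numeric values are illustrative assumptions and are not taken from the chapter.

```python
import math
import random


def outage_analytic(snr_threshold, snr_mean):
    """Closed-form outage probability of a Rayleigh-faded link.

    Assumption: the instantaneous SNR is exponential with mean snr_mean.
    Both arguments are in linear scale, not dB.
    """
    return 1.0 - math.exp(-snr_threshold / snr_mean)


def outage_monte_carlo(snr_threshold, snr_mean, n_samples=200_000, seed=0):
    """Estimate Pr(SNR < threshold) from random exponential SNR samples."""
    rng = random.Random(seed)
    outages = sum(
        1
        for _ in range(n_samples)
        if rng.expovariate(1.0 / snr_mean) < snr_threshold
    )
    return outages / n_samples


if __name__ == "__main__":
    gamma_th = 10 ** (5 / 10)    # hypothetical 5 dB SNR threshold
    gamma_bar = 10 ** (10 / 10)  # hypothetical 10 dB average SNR
    print("analytic   :", outage_analytic(gamma_th, gamma_bar))
    print("monte carlo:", outage_monte_carlo(gamma_th, gamma_bar))
```

The two values should agree to within the sampling error of the simulation. The same pattern (draw channel samples, count threshold violations, compare with the analytical expression) carries over to more detailed fading and turbulence models.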
The variation of outage probability with respect to UAVs’ altitude under moderate atmospheric turbulence conditions is statistically visualized in Figure 3 when the SNR threshold is assumed as
Figure 4 shows the impact of various modulation schemes on the symbol error rate when the distance between the projection points of the UAVs on the ground and the end users is 2000 m under different atmospheric turbulence conditions. The results show that the symbol error rate decreases as the average SNR increases. Furthermore, binary phase shift keying (BPSK) outperforms quadrature phase shift keying (QPSK). Although higher-order modulation techniques offer higher data rates and better bandwidth efficiency, they are more complicated to implement, place more stringent requirements on the RF amplifier, and are less resilient to errors. Therefore, BPSK offers more reliable transmission with fewer symbol errors than the other modulation techniques considered.
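The ordering of BPSK and QPSK can be checked against the textbook error-rate expressions for an additive white Gaussian noise (AWGN) channel. This is only a sketch under simplifying assumptions: it ignores the pointing errors and atmospheric turbulence considered in this section, it compares the two schemes at the same SNR per symbol, and the function names are illustrative.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = Pr(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ser_bpsk(snr_symbol):
    """BPSK symbol error rate in AWGN; snr_symbol is linear Es/N0."""
    return q_function(math.sqrt(2.0 * snr_symbol))


def ser_qpsk(snr_symbol):
    """QPSK symbol error rate in AWGN; snr_symbol is linear Es/N0.

    Each quadrature branch carries half the symbol energy, so a branch
    errs with probability Q(sqrt(Es/N0)); a symbol error occurs when at
    least one of the two branches errs.
    """
    p_branch = q_function(math.sqrt(snr_symbol))
    return 2.0 * p_branch - p_branch ** 2


if __name__ == "__main__":
    for snr_db in (0, 5, 10, 15):
        snr = 10 ** (snr_db / 10)
        print(snr_db, "dB  BPSK:", ser_bpsk(snr), " QPSK:", ser_qpsk(snr))
```

At every SNR in the loop the BPSK error rate is below the QPSK error rate, and both fall as the SNR grows, which matches the qualitative trend described for Figure 4.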
4. Throughput maximization in UAVs-supported D2D network
This section proposes a UAVs-supported self-organized device-to-device (USSD2D) network containing multiple source-destination device pairs and multiple UAVs, where the objective is to find the optimal deployed location of UAVs to support reliable data transmission between source and destination device pairs. Here, we consider SNR-constrained maximization of the total instantaneous transmission rate of the USSD2D network by jointly optimizing device association, UAV’s channel selection, and UAVs’ deployed location at every time slot.
4.1 System model
Figure 5 depicts the UAVs-supported self-organized device-to-device (USSD2D) network, where the stationary source and destination device pairs are randomly deployed on the ground within the target area. The direct D2D pairs can establish LoS links owing to good channel conditions and the short distance between them. On the other hand, UAV-assisted D2D pairs cannot establish direct links because of significant obstacles in the signal propagation path and therefore use the deployed UAVs as relays.
4.1.1 Channel model
Consider
Similarly, when UAV
The path loss between the device
where
4.1.2 Transmission model
The received SNR at UAV
where
where
where
where
The instantaneous transmission rate achieved by the destination device
The total instantaneous transmission rate achieved by all direct D2D pairs can be calculated as
Similarly,
The total instantaneous transmission rate of all UAV-assisted D2D pairs can be expressed as
The overall instantaneous transmission rate of the USSD2D network is formulated as
4.1.3 Problem formulation
From the practical scenario, it is observed that when UAVs fly toward a group of devices to obtain better channel conditions, the remaining devices of the network cannot receive adequate services from the UAV, and consequently, UAVs cannot allocate network resources fairly. Hence, we jointly optimize UAVs’ location, device association, and channel selection indicators at every time slot to maximize the total instantaneous transmission rate of the USSD2D network while assuring that each device should achieve a minimum SNR of
Subject to the constraints
C1 indicates that a device should achieve a minimum SNR threshold to maintain the required QoS. C2 defines the instantaneous device association indicator and UAVs’ channel selection indicator. C3 assures that each device can be associated with a single UAV at a time slot, and C4 implies UAVs’ channel selection conditions at each time slot. The optimization variables
4.2 RL-based solution methodology
UAVs acting as RL agents select their actions depending on their current positions, which are related only to their previous states. Hence, the proposed framework follows the Markov property and is composed of states, actions, rewards, state transition probabilities, and the flying time periods. The following subsections explain each of these elements in detail.
4.2.1 State space
The state of the
4.2.2 Action space
UAV’s action
4.2.3 Reward formulation
RL agents choose their actions so as to maximize the long-term cumulative reward. Since our objective is to maximize the total instantaneous transmission rate of the USSD2D network, we need to find UAV locations that impact the immediate objective value. Hence, we model the instantaneous reward function contributed by UAV
4.2.4 State transition probability
It is the probability that UAV
where
where
From (43) and (44), it is observed that the update of the selection probability vectors depends only on the instantaneous transmission rate and does not need any prior information. Thus, device association and UAVs’ channel selection at each time slot are entirely model-free.
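A model-free probability update of this kind can be sketched as follows. The rule shown is the linear reward-inaction scheme from stochastic learning automata, used here only as an assumed stand-in: the exact forms of (43) and (44) may differ. The step size `step`, the reward normalized to [0, 1], and all function names are hypothetical.

```python
import random


def update_selection_probs(probs, chosen, reward, step=0.1):
    """Linear reward-inaction update of a selection probability vector.

    probs  : current probabilities over the options (must sum to 1)
    chosen : index of the option that was tried in this time slot
    reward : observed payoff scaled to [0, 1], e.g. a normalized rate
    step   : learning step size in (0, 1)
    """
    # Every option loses probability mass in proportion to the reward ...
    updated = [p - step * reward * p for p in probs]
    # ... and the chosen option collects exactly the mass the others lost.
    updated[chosen] = probs[chosen] + step * reward * (1.0 - probs[chosen])
    return updated


def sample_option(probs, rng):
    """Draw one option index according to the current probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    normalized_rate = [0.2, 0.9, 0.4]  # hypothetical rates scaled to [0, 1]
    probs = [1.0 / 3.0] * 3
    for _ in range(500):
        k = sample_option(probs, rng)
        probs = update_selection_probs(probs, k, normalized_rate[k])
    print(probs)
```

The update uses only the reward observed for the option that was actually tried, so no channel model or knowledge of the other options' rates is required. It also keeps the vector a valid probability distribution after every step.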
4.2.5 Updating the action value function
During the operation period, each UAV acts as an RL agent where UAV
UAVs consider all the possible actions from the action space and select an action with a certain probability that provides maximum long-term reward.
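For reference, the textbook on-policy SARSA update and an ε-greedy selection rule are written below. The symbols α (learning rate), γ (discount factor), and ε (exploration probability) follow common RL notation and are assumptions here; the update rule used in this chapter may use different symbols or an adaptive variant.

```latex
\begin{align}
Q(s_t, a_t) &\leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma\, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right], \\
a_t &=
\begin{cases}
\arg\max_{a} Q(s_t, a), & \text{with probability } 1 - \varepsilon, \\
\text{a uniformly random action}, & \text{with probability } \varepsilon.
\end{cases}
\end{align}
```

The update is on-policy because the bootstrap term uses the action that the agent actually selects in the next state, rather than the greedy maximum used by Q-learning.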
UAVs execute state-action pairs repeatedly to gain experience of interacting with the environment. These interaction results are recorded in
4.3 Simulation results
In this sub-section, we validate the proposed analysis and provide various numerical insights on key system parameters to improve the system’s performance. Later, we compare the obtained results corresponding to the proposed SARSA algorithm with the existing works [34], such as random selection with fixed optimal relay deployment (RS-FORD), an exhaustive search for relay assignment and channel allocation with fixed initial relay deployment (ES-FIRD), and alternative optimization for the individual variable (AOIV). Here, we consider that direct D2D pair and UAV-assisted D2D pair devices are uniformly distributed in a 4 km
The iterative evolutions of the proposed and benchmark schemes are depicted in Figure 6, where the number of UAVs, UAV-assisted D2D pairs, direct D2D pairs, orthogonal channels, and transmit power are set as 5, 10, 2, 7, and 10 mW respectively. From this figure, it is clear that the proposed algorithm outperforms the benchmark scheme with respect to the converged value because it utilizes
1: Initialize
2: Set initial device association probability as
3: Set initial channel selection probability of UAVs as
4: Initially deploy UAV
5:
6:
7:
8: Obtain the association probability of device
9: Calculate
10:
11: Calculate
12:
13:
14:
15:
16: Set
17: According to (43), update the association probability as
18:
19:
20: UAV
21: Calculate
22:
23:
24:
25:
26:
27:
28: Set
29: According to (44), update channel selection probability as
30:
31: Choose the action values
32: Find next state as
33: Calculate the immediate reward
34: Choose the action
35: Update
36: Update the state and action for the next time slot as
37: Calculate the instantaneous reward generated by all UAVs as
Figure 7a shows the variation of the instantaneous transmission rate for different numbers of UAVs while the other network parameters are the same as in Figure 6. The figure shows that the performance metric increases with the number of UAVs because all UAVs use the available channels efficiently at their deployed locations. However, when the number of UAVs exceeds 7, the total instantaneous transmission rate does not increase significantly because all UAVs reuse the limited spectrum, which increases mutual interference among UAVs and source-destination device pairs.
Figure 7b plots the objective value corresponding to the different number of available orthogonal channels. From this figure, we can say that the instantaneous transmission rate increases with the number of channels because all the communication nodes select individual channels according to the channel selection probability vectors. But when the number of channels exceeds 7, no such variation in objective value is found because this is a sufficient resource to avoid mutual interferences completely.
Figure 7c shows the network throughput variation for different numbers of UAV-assisted D2D pairs when their transmit power is 10 mW. Since all the devices and UAVs share a fixed number of orthogonal channels, the network’s performance is nearly independent of the number of UAV-assisted D2D pairs, and the performance metric remains almost constant as this parameter varies.
The performance metric variations for different numbers of direct D2D pairs are illustrated in Figure 7d when their transmit power is set to 10 mW. The instantaneous transmission rate decreases with the number of direct D2D pairs because they occupy more orthogonal channels. As a result, mutual interference among UAV-assisted D2D pairs increases since they share limited network resources. Furthermore, the proposed scheme selects actions adaptively and therefore significantly outperforms the benchmark techniques. From Figure 7, the overall network throughput improves by 77.58%, 52.51%, and 12.14% compared to the RS-FORD, ES-FIRD, and AOIV schemes, respectively.
5. Minimization of devices’ energy consumption in UAV-assisted IoT network
The devices at the cell edge consume high energy to achieve the required data rate when transmitting data to the nearest BS because of the large LoS distance between BSs and those devices. Alternatively, a quad-rotor UAV-assisted IoT network could provide reliable communication compared to fixed terrestrial BSs. Therefore, in this section, we aim to find the optimal trajectory of UAV and the association of IoT devices that simultaneously support energy-efficient data collection.
5.1 System model
Figure 8 illustrates the UAV-assisted IoT network, in which
5.1.1 Data collection of core network
The transmission environment is categorized into two scenarios, i.e., ground to ground (G2G) and ground to air (G2A) channels. G2G channel establishes the links between BS and IoT devices, whereas G2A channel connects the IoT devices with the UAV platform. We generalize the wireless channel gain between each device and its destination (either UAV or BS) at each time slot as the combination of large-scale path loss and small-scale fading. The channel gain between each device and its destination can be modeled as [39]
where
where
where
5.1.2 Problem formulation
We aim for energy-efficient data collection that jointly exploits reliable data transmission, the optimal instantaneous position of the UAV, and transmit power control. Fluctuations in channel gain cause unstable network performance, which quickly drains the devices’ on-board battery energy. Thus, to minimize the total energy consumption of all devices, we jointly optimize the UAV’s trajectory, the device association indicators, and the transmit power allocation, while ensuring that each device transmits a minimum amount of data to its destination and that the UAV maintains a constant speed along its trajectory between the initial and final locations. Therefore, the optimization problem is formulated as
Subject to the constraints
Here, C1 ensures that each device transmits at least
5.2 Reinforcement learning based on SARSA algorithm
As discussed earlier in Section 4.2, the RL framework follows an MDP, where the current state depends only on the immediate past state, and the UAV acting as an RL agent chooses an action according to the
Algorithm 2 summarizes the optimal trajectory learning procedure using the improved SARSA technique. In this framework, we first calculate UAV’s current state, channel gain, and distances from all devices to UAV and the nearest BS at every time slot. Then, all devices select the destination (either UAV or nearest BS) by estimating the instantaneous device association indicator and the required transmit power while satisfying the data rate constraint value. This process is repeated at each step, and UAV obtains optimal policy at the final episode. Since the number of episodes is
Algorithm 2: UAV trajectory learning process using SARSA
1: Initialize
2:
3: Set the starting point as
4:
5:
6: Choose the action values
7: Find next state by (40) and (41) as
8: Calculate reward
9: Choose the next action
10: Update
11: Update the respective state and action as
12:
13: Obtain the next state as
14: Calculate reward
15: Choose the next action
16: Update
17: Update the respective state and action as
18:
19:
20: Find an optimal policy as
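The overall flow of Algorithm 2 (pick an action, observe the next state and reward, pick the next action, update the action value, then shift the state-action pair) can be sketched on a toy problem. The grid world, the reward values, the hyperparameters, and every name below are illustrative assumptions; the sketch uses plain tabular SARSA and does not reproduce the chapter's state, reward, or improved update rule.

```python
import random

# Hypothetical action set: move north, south, east, west, or hover.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]


def step(state, action, size):
    """Move on a size x size grid, clipping at the borders."""
    x = min(max(state[0] + action[0], 0), size - 1)
    y = min(max(state[1] + action[1], 0), size - 1)
    return (x, y)


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))


def train_sarsa(size=5, start=(0, 0), goal=(4, 4), episodes=2000,
                horizon=50, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular SARSA; returns the learned action-value table."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = start
        action = epsilon_greedy(q, state, epsilon, rng)
        for _ in range(horizon):
            next_state = step(state, ACTIONS[action], size)
            # Assumed reward: unit cost per time slot, bonus at the final location.
            reward = 10.0 if next_state == goal else -1.0
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            # On-policy bootstrap; no future value once the goal is reached.
            future = 0.0 if next_state == goal else q.get((next_state, next_action), 0.0)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state, action = next_state, next_action
            if state == goal:
                break
    return q


def greedy_path(q, size=5, start=(0, 0), goal=(4, 4), max_steps=50):
    """Follow the greedy policy from start for at most max_steps moves."""
    path, state = [start], start
    for _ in range(max_steps):
        if state == goal:
            break
        values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
        state = step(state, ACTIONS[values.index(max(values))], size)
        path.append(state)
    return path


if __name__ == "__main__":
    table = train_sarsa()
    print(greedy_path(table))
```

In the setting of this section, the state would additionally carry the time slot, and the reward would reflect the devices' energy consumption rather than a fixed per-step cost.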
5.3 Simulation results
This sub-section presents the training outcomes corresponding to the proposed SARSA algorithm for optimal trajectory and subsequently evaluates the energy-efficient data collection. Here, we compare the effectiveness and superiority of the proposed design with the benchmark PSO technique [41], where 100 IoT devices are uniformly distributed within a square field of size
5.3.1 Convergence analysis
The agents’ training evaluations using the RL-based SARSA algorithm are illustrated in Figure 9a, where all IoT devices maintain a data rate constraint of 10 Mbps. The figure shows that the convergence rate varies with the flying time because the UAV explores the target area more efficiently when more time slots are available. As a result, more devices associate with the UAV, and convergence occurs before 10,000 episodes.
Figure 9b shows the episode-wise objective value evaluation using the PSO algorithm. The figure shows that PSO takes more time to converge and that its final converged value is lower than that of the SARSA algorithm. This is because PSO updates the particles’ positions and velocities according to a random inertia weight, which regulates the particles’ moving directions and speeds less precisely. Its computational complexity also increases with the high dimensionality of the decision variables. Overall, the proposed SARSA algorithm improves the cumulative reward by 10.26% with respect to PSO.
5.3.2 Optimal trajectory
Using the same parameters as in Figure 9, the UAV finds its optimal trajectories with the SARSA and PSO algorithms, as depicted in Figure 10. These figures indicate that the UAV moves toward the devices that are far away from the BS and reaches the final destination point within the flight period. Since devices consume more energy while transmitting data to the BS, the UAV flies toward those devices to improve their channel conditions. As mentioned earlier, device association with the UAV increases with the flying time, so more devices transmit their data to the UAV instead of the BS, which reduces their energy consumption.
5.3.3 Performance comparison of proposed SARSA with benchmark PSO
The variation of the devices’ average transmit power required to achieve a 10 Mbps data rate is shown against the device index in Figure 11a, where a device’s index indicates its distance from the nearest BS. When there is no UAV support, the average transmit power increases with the index value because, according to (52), devices far away from the BS use more power to obtain the given data rate. When a UAV is employed, however, its optimal trajectory focuses on the devices that consume more power and associates with them for data collection. Furthermore, since a straight UAV trajectory cannot improve the channel conditions of all devices, it does not achieve comparably energy-efficient data collection.
The total energy consumption of all devices for various data rate constraint values is illustrated in Figure 11b. Devices’ energy consumption increases with the data rate constraint because, according to (49), devices allocate more power to achieve the given rate. Furthermore, as seen in Figure 11a, the UAV’s optimal trajectory obtained with the proposed SARSA algorithm reduces the devices’ transmit power within the available flying time compared to the PSO algorithm, because PSO converges slowly in its iterative process and cannot identify the local optimum in a high-dimensional space. Hence, the proposed SARSA methodology reduces the total energy consumption of all devices by 8.15%, 7.72%, and 5.67% for UAV flying times of 80, 100, and 120 time slots, respectively, compared to PSO.
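The dependence of transmit power on the rate constraint and on the channel gain can be made concrete with a small calculation. The sketch assumes that the achievable rate follows the Shannon expression R = B log2(1 + p g / N), which may differ in detail from (49) and (52); the bandwidth, noise power, channel gains, and function names are hypothetical, and only the 10 Mbps rate is taken from the text.

```python
import math


def achievable_rate(power_w, bandwidth_hz, channel_gain, noise_power_w):
    """Shannon rate in bit/s for a given transmit power (assumed model)."""
    return bandwidth_hz * math.log2(1.0 + power_w * channel_gain / noise_power_w)


def required_transmit_power(rate_bps, bandwidth_hz, channel_gain, noise_power_w):
    """Smallest transmit power (W) that meets rate_bps under the same model."""
    return (2.0 ** (rate_bps / bandwidth_hz) - 1.0) * noise_power_w / channel_gain


if __name__ == "__main__":
    bandwidth = 10e6   # hypothetical 10 MHz channel
    noise = 1e-13      # hypothetical noise power in W
    rate = 10e6        # 10 Mbps data rate constraint
    for gain in (1e-8, 1e-9, 1e-10):  # weaker gain ~ device farther away
        p = required_transmit_power(rate, bandwidth, gain, noise)
        print(f"gain={gain:.0e}  required power={p:.3e} W")
```

Under this model the required power grows exponentially with the rate constraint and is inversely proportional to the channel gain, so a device with a poor link to the BS benefits most when the UAV offers it a stronger channel. Multiplying the power by the transmission duration gives the corresponding energy.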
6. Conclusion
This chapter proposes deployment and trajectory designs of UAVs for efficient resource allocation to achieve reliable wireless communication. The main features of this work are threefold. In the first part, we optimize the UAVs’ altitude to minimize the outage probability and symbol error rate, considering pointing errors, atmospheric turbulence, and scintillation parameters, where a hybrid RF-FSO channel governs the transmission environment. The second part finds the optimal deployed locations of UAVs to maximize the total instantaneous transmission rate of the devices in the USSD2D network under an SNR constraint. Finally, the last part focuses on energy-efficient data collection, where the devices’ total energy consumption is minimized by jointly optimizing their association with the nearest BS or UAV, their transmit power, and the UAV trajectory while satisfying a given data rate requirement. Numerical results validate the analysis and provide insights into the optimal UAV control design for various key system parameters. The proposed methodology significantly improves system performance compared with the benchmark techniques. This work could be extended toward a multi-UAV-assisted energy-efficient data collection system that considers the age of information and users who follow a certain mobility model.
References
1. Zanella A, Bui N, Castellani A, Vangelista L, Zorzi M. Internet of things for smart cities. IEEE Internet of Things Journal. 2014;1(1):22-32. DOI: 10.1109/JIOT.2014.2306328
2. Samarakoon S, Bennis M, Saad W, Debbah M, Latva-Aho M. Ultra-dense small cell networks: Turning density into energy efficiency. IEEE Journal on Selected Areas in Communications. 2016;34(5):1267-1280. DOI: 10.1109/JSAC.2016.2545539
3. Semiari O, Saad W, Bennis M, Dawy Z. Inter-operator resource management for millimeter wave multi-hop backhaul networks. IEEE Transactions on Wireless Communications. 2017;16(8):5258-5272. DOI: 10.1109/TWC.2017.2707410
4. Mozaffari M, Saad W, Bennis M, Nam YH, Debbah M. A tutorial on UAVs for wireless networks: Applications challenges and open problems. IEEE Communications Surveys and Tutorials. 2019;21(3):2334-2360. DOI: 10.1109/COMST.2019.2902862
5. Esrafilian O, Gangula R, Gesbert D. 3D map-based trajectory design in UAV-aided wireless localization systems. IEEE Internet of Things Journal. 2021;8(12):9894-9904. DOI: 10.1109/JIOT.2020.3021611
6. Sawalmeh AH, Othman NS, Shakhatreh H, Khreishah A. Wireless coverage for mobile users in dynamic environments using UAV. IEEE Access. 2019;7:126376-126390. DOI: 10.1109/ACCESS.2019.2938272
7. Lyu J, Zeng Y, Zhang R, Lim TJ. Placement optimization of UAV-mounted mobile base stations. IEEE Communications Letters. 2017;21(3):604-607. DOI: 10.1109/LCOMM.2016.2633248
8. Wang Z, Duan L, Zhang R. Adaptive deployment for UAV-aided communication networks. IEEE Transactions on Wireless Communications. 2019;18(9):4531-4543. DOI: 10.1109/TWC.2019.2926279
9. Alzenad M, El-Keyi A, Yanikomeroglu H. 3-D placement of an unmanned aerial vehicle base station for maximum coverage of users with different QoS requirements. IEEE Wireless Communications Letters. 2018;7(1):38-41. DOI: 10.1109/LWC.2017.2752161
10. El-Hammouti H, Benjillali M, Shihada B, Alouini M. Learn-as-you-fly: A distributed algorithm for joint 3D placement and user association in multi-UAVs networks. IEEE Transactions on Wireless Communications. 2019;18(12):5831-5844. DOI: 10.1109/TWC.2019.2939315
11. Zhang H, Hanzo L. Federated learning assisted multi-UAV networks. IEEE Transactions on Vehicular Technology. 2020;69(11):14104-14109. DOI: 10.1109/TVT.2020.3028011
12. Liu X, Liu Y, Chen Y, Hanzo L. Trajectory design and power control for multi-UAV assisted wireless networks: A machine learning approach. IEEE Transactions on Vehicular Technology. 2019;68(8):7957-7969. DOI: 10.1109/TVT.2019.2920284
13. Duong TQ, Nguyen LD, Tuan HD, Hanzo L. Learning-aided real-time performance optimization of cognitive UAV-assisted disaster communication. In: Proceeding IEEE Global Communications Conference (GLOBECOM); 09–13 December 2019. Waikoloa, HI, USA: IEEE; 2020. pp. 1-6
14. Liu X, Liu Y, Chen Y. Reinforcement learning in multiple UAV networks: Deployment and movement design. IEEE Transactions on Vehicular Technology. 2019;68(8):8036-8049. DOI: 10.1109/TVT.2019.2922849
15. Larsen E, Landmark L, Kure O. Optimal UAV relay positions in multi-rate networks. In: Proceedings Wireless Days; 29–31 March 2017. Porto, Portugal: IEEE; 2017. pp. 8-14
16. Han Z, Swindlehurst AL, Liu KJR. Optimization of MANET connectivity via smart deployment/movement of unmanned air vehicles. IEEE Transactions on Vehicular Technology. 2009;58(7):3533-3546. DOI: 10.1109/TVT.2009.2015953
17. Jiang F, Swindlehurst AL. Dynamic UAV relay positioning for the ground-to-air uplink. In: Proceedings IEEE Globecom Workshop; 06–10 December 2010. Miami, FL, USA: IEEE; 2011. pp. 1766-1770
18. Zhan P, Yu K, Swindlehurst AL. Wireless relay communications with unmanned aerial vehicles: Performance and optimization. IEEE Transactions on Aerospace and Electronic Systems. 2011;47(3):2068-2085. DOI: 10.1109/TAES.2011.5937283
19. Zeng Y, Zhang R, Lim TJ. Throughput maximization for UAV-enabled mobile relaying systems. IEEE Transactions on Communications. 2016;64(12):4983-4996. DOI: 10.1109/TCOMM.2016.2611512
20. Ono F, Ochiai H, Miura R. A wireless relay network based on unmanned aircraft system with rate optimization. IEEE Transactions on Wireless Communications. 2016;15(11):7699-7708. DOI: 10.1109/TWC.2016.2606388
21. Fan R, Cui J, Jin S, Yang K. Optimal node placement and resource allocation for UAV relaying network. IEEE Communications Letters. 2018;22(4):808-811. DOI: 10.1109/LCOMM.2018.2800737
22. Indu SRP, Choudhary HR, Dubey AK. Trajectory design for UAV-to-ground communication with energy optimization using genetic algorithm for agriculture application. IEEE Sensors Journal. 2021;21(16):17548-17555. DOI: 10.1109/JSEN.2020.3046463
23. Kawamoto Y, Takagi H, Nishiyama H, Kato N. Efficient resource allocation utilizing Q-learning in multiple UA communications. IEEE Transactions on Network Science and Engineering. 2019;6(3):293-302. DOI: 10.1109/TNSE.2018.2842246
24. Mondal A, Mishra D, Prasad G, Hossain A. Joint optimization framework for minimization of device energy consumption in transmission rate constrained UAV-assisted IoT network. IEEE Internet of Things Journal. 2021;9(12):9591-9607. DOI: 10.1109/JIOT.2021.3128883
25. Hu J, Zhang H, Song L. Reinforcement learning for decentralized trajectory design in cellular UAV networks with the sense-and-send protocol. IEEE Internet of Things Journal. 2019;6(4):6177-6189. DOI: 10.1109/JIOT.2018.2876513
26. Cui J, Ding Z, Deng Y, Nallanathan A, Hanzo L. Adaptive UAV trajectory optimization under the quality of service constraints: A model-free solution. IEEE Access. 2020;8:112253-112265. DOI: 10.1109/ACCESS.2020.3001752
27.
Yang L, Yuan J, Liu X, Hasna MO. On the performance of LAP-based multiple-hop RF/FSO systems. IEEE Transactions on Aerospace and Electronic Systems. 2019; 55 (1):499-505. DOI: 10.1109/TAES.2018.2852399 - 28.
Azari MM, Rosas F, Chen KC, Pollin S. Ultra reliable UAV communication using altitude and cooperation diversity. IEEE Transactions on Communications. 2018; 66 (1):330-344. DOI: 10.1109/TCOMM.2017.2746105 - 29.
Puri P, Garg P, Aggarwal M. Outage and error rate analysis of network-coded coherent TWR-FSO systems. IEEE Photonics Technology Letters. 2014; 26 (18):1797-1800. DOI: 10.1109/LPT.2014.2333032 - 30.
Gappmair W. Further results on the capacity of free-space optical channels in turbulent atmosphere. IET Communications. 2011; 5 (9):1262-1267. DOI: 10.1049/iet-com.2010.0172 - 31.
Gil A, Segura J, Temme NM. Computation of the Marcum Q-function. ACM Transactions on Mathematical Software. 2013; 40 (3):280-295. DOI: 10.48550/arXiv.1311.0681 - 32.
Muller A, Speidel J. Exact symbol error probability of M-PSK for multihop transmission with regenerative relays. IEEE Communications Letters. 2007; 11 (12):952-954. DOI: 10.1109/LCOMM.2007.070820 - 33.
Mondal A, Hossain A. Channel characterization and performance analysis of UAV operated communication system with multihop RF–FSO link in dynamic environment. International Journal of Communication Systems. 2020; 33 (16):e4568. DOI: 10.1002/dac.4568 - 34.
Zhong X, Guo Y, Li N, Chen Y. Joint optimization of relay deployment, channel allocation, and relay assignment for UAVs-aided D2D networks. IEEE/ACM Transactions on Networking. 2020; 28 (2):804-817. DOI: 10.1109/TNET.2020.2970744 - 35.
Al-Hourani A, Kandeepan S, Lardner S. Optimal LAP altitude for maximum coverage. IEEE Wireless Communications Letters. 2014; 3 (6):569-572. DOI: 10.1109/LWC.2014.2342736 - 36.
Hasna MO, Alouini MS. Outage probability of multihop transmission over Nakagami fading channels. IEEE Communications Letters. 2003; 7 (5):216-218. DOI: 10.1109/LCOMM.2003.812178 - 37.
Mu X, Zhao X, Liang H. Power allocation based on reinforcement learning for MIMO system with energy harvesting. IEEE Transactions on Vehicular Technology. 2020; 69 (7):7622-7633. DOI: 10.1109/TVT.2020.2993275 - 38.
Mondal A, Hossain A. Maximization of instantaneous transmission rate in UAVs-supported self-organized device-to-device network. International Journal of Communication Systems. 2022; 35 (6):e5064. DOI: 10.1002/dac.5064 - 39.
You C, Zhang R. 3D trajectory optimization in Rician fading for UAV-enabled data harvesting. IEEE Transactions on Wireless Communications. 2019; 18 (6):3192-3207. DOI: 10.1109/TWC.2019.2911939 - 40.
Ho TM, Nguyen KK, Cheriet M. UAV control for wireless service provisioning in critical demand areas: A deep reinforcement learning approach. IEEE Transactions on Vehicular Technology. 2021; 70 (7):7138-7152. DOI: 10.1109/TVT.2021.3088129 - 41.
Milner S, Davis C, Zhang H, Llorca J. Nature-inspired self-organization, control, and optimization in heterogeneous wireless networks. IEEE Transactions on Mobile Computing. 2012; 11 (7):1207-1222. DOI: 10.1109/TMC.2011.141