Research Article  Open Access
Kefeng Wei, Lincong Zhang, Shupeng Wang, "Intelligent Channel Allocation for Age of Information Optimization in Internet of Medical Things", Wireless Communications and Mobile Computing, vol. 2021, Article ID 6645803, 10 pages, 2021. https://doi.org/10.1155/2021/6645803
Intelligent Channel Allocation for Age of Information Optimization in Internet of Medical Things
Abstract
Along with the development of real-time applications, the freshness of information has become significant, because overdue information is worthless and can even harm the judgement of the system. The Age of Information (AoI) has therefore been proposed as a measure of the freshness of information. In the Internet of Medical Things (IoMT), which is derived from the requirements of the Internet of Things (IoT) in medicine, high freshness of medical information should be guaranteed. In this paper, we introduce the AoI of medical information when allocating channels for users in the IoMT. Motivated by the advantages of the Deep Q-learning Network (DQN) in resource management for wireless networks, we propose a novel DQN-based Channel Allocation (DQCA) algorithm that provides a channel allocation strategy by optimizing a system cost that accounts for the AoI and the energy consumption of coordinator nodes. Unlike traditional centralized channel allocation methods, the DQCA algorithm is distributed, as each user performs the DQN process separately. The simulation results show that the proposed DQCA algorithm is superior to the Greedy algorithm and the Q-learning algorithm in terms of average AoI, average energy consumption, and system cost.
1. Introduction
Coronavirus Disease 2019 (COVID-19) had caused more than 2.32 million deaths worldwide by February 8, 2021 [1]. People have been forced to stay at home, travel less, and avoid crowded places. In this situation, the government, medical staff, and the general public all hope to monitor viral infections such as COVID-19 and isolate cases in time to avoid the large-scale spread of the virus. Besides, people are more concerned about their health than ever before. More and more chronic patients, and even healthy people, hope to have long-term effective monitoring of their bodies and to obtain important information about their health as soon as possible. The emergence of the Internet of Medical Things (IoMT) has made it possible to solve these problems, and its intelligent monitoring function is in massive demand around the world [2].
For the COVID-19 virus, Swati Swayamsiddha et al. proposed a Cognitive Internet of Medical Things (CIoMT), a particular case of the IoMT that enables real-time tracking, remote monitoring of patients, rapid diagnosis, contact tracing and clustering, screening and monitoring, etc., thus reducing the workload of medical staff and helping to prevent and control the spread of the virus [3]. Ravi Pratap Singh et al. discussed the feasibility of using the IoMT to track, monitor, analyze data, and provide treatment plans for orthopedic patients in an environment ravaged by COVID-19 [4]. For COVID-19 management, M. A. Mujawar et al. also proposed a health monitoring system based on wearable devices and artificial intelligence, which continuously monitors the patient’s heartbeat, body temperature, and other parameters through medical sensors and transmits them to cloud storage through a wireless sensor network (WSN). At the same time, these parameters are used to update the user’s health status in real time, and the status is then sent to the medical staff [5].
The IoMT is a vast network system with diverse technologies. This paper studies only the channel allocation problem that arises when monitoring and transmitting human physiological data in the IoMT. During monitoring and transmission, data that is too old may cause erroneous analysis and evaluation, reduce the accuracy and reliability of system decision-making, and even threaten the safety of users. Therefore, the freshness of information is crucial, and it also occupies an essential position in the design of 6G systems applied to body area networks [6–10]. To describe the freshness of information effectively, this paper introduces the Age of Information (AoI) [11] and studies the channel allocation problem of the IoMT with the AoI as the target.
In recent years, artificial intelligence has become an effective method for solving resource allocation problems that involve processing large amounts of data [12]. As a main branch of artificial intelligence, machine learning has also received tremendous attention. Machine learning uses algorithms to analyze and learn from data in order to make decisions and predictions about real-world events. Deep learning is currently the most popular machine learning method and has been applied successfully in automatic detection [13, 14], case recognition [15–17], environmental monitoring [18], and epidemic prediction [19], among others. In terms of channel allocation, with the rapid growth of network size and data volume, deep learning can significantly improve the processing speed for a large number of nodes [20–23].
This paper studies the problem of channel allocation among users, oriented to the optimization of the AoI. At the gateway, the AoI of the controller on each user’s body is the age, in slots, of the latest update received from this controller, measured at the end of each slot. In each time slot, the system pays a cost for the AoI. Our requirement that the content received by the gateway be updated in a timely manner is reflected in minimizing the cost paid by the whole system. This paper adopts a deep learning method to solve the proposed optimization problem. The main contributions of this paper are as follows:
(i) For the channel allocation problem of the IoMT, we focus on the timeliness of the information while also considering the mobility of nodes. To measure the cost that the system pays for the lack of new information at the gateway, we propose a system cost function based on the AoI and the current energy consumption rate of the nodes.
(ii) Based on the cost function, we construct a mathematical model of the optimization problem that minimizes the average cost of channel allocation in the IoMT.
(iii) To solve this problem, we propose a Deep Q-Learning Network (DQN) based channel allocation algorithm, named DQCA, which provides a channel allocation scheme that minimizes the cost while meeting the requirements on node SNR and residual energy.
The rest of the paper is organized as follows. Section 2 provides an overview of the AoI. Section 3 describes the system model and the optimization model of the channel allocation problem in the IoMT. The proposed DQCA algorithm is presented in Section 4. The simulation and performance evaluation are presented in Section 5. Finally, we conclude the paper in Section 6.
2. Related Works
With the growth of the Internet of Things (IoT), real-time applications are becoming more common. Driverless cars, for example, make decisions and control the vehicle based on road information detected by sensors, adjusting the travel mode, avoiding collisions, and ensuring safe driving. This type of application requires timely and fresh data, and outdated data will lead to wrong judgments and decisions. The older the data, the less important and effective it is. To measure the freshness and effectiveness of data, scholars proposed the AoI in 2011 to quantify the freshness of information about the state of a remote system [11]. The AoI is the time that has elapsed since the most recently successfully received information was created. It is therefore different from the transmission delay of information. In a system with multiple source nodes and one destination node, each source node collects information and sends it to the destination node regularly. At the destination node, the AoI of each source node can be calculated [24]. Since the source node is constantly sending information to the destination node, the AoI of each source node refers to the AoI of the latest information received by the destination node from that source node. In other words, the AoI of each source node is not fixed; it depends on the sending rate of the source node and on the rate at which the destination node receives that source node’s information. If the destination node has not received new information from a certain source node, the AoI of that source node increases linearly until the newest information arrives, at which point it changes to the AoI of that newest information.
Figure 1 shows how the AoI of node j evolves at the destination. Each data packet i has a generation time at node j and a reception time at the destination. When t = 0, the destination node receives data packet 0 from node j, and the AoI is set to the age of that packet. The AoI then increases linearly until the destination node receives the next data packet 1, at which point the AoI is updated to the age of packet 1. In the same way, the AoI is updated again when the destination node receives data packet 2, and so forth.
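The sawtooth behaviour described for Figure 1 can be illustrated with a minimal Python sketch. The function name `aoi_trace`, the `(generation_time, reception_time)` input format, and the sampling step are assumptions made for illustration; they are not notation from the paper.

```python
def aoi_trace(deliveries, horizon, step=1.0):
    """Sample the AoI of one source node at the destination.

    deliveries: list of (generation_time, reception_time) pairs sorted by
    reception time (hypothetical input format, not from the paper).
    Returns a list of (t, age) samples for t = 0, step, 2*step, ... <= horizon.
    """
    trace = []
    latest_gen = None  # generation time of the freshest packet received so far
    idx = 0
    t = 0.0
    while t <= horizon:
        # Absorb every packet that has been received by time t.
        while idx < len(deliveries) and deliveries[idx][1] <= t:
            gen = deliveries[idx][0]
            if latest_gen is None or gen > latest_gen:
                latest_gen = gen
            idx += 1
        if latest_gen is not None:
            # Age grows linearly between receptions and drops at a reception.
            trace.append((t, t - latest_gen))
        t += step
    return trace


# Packet 0: generated at t=-1, received at t=0.
# Packet 1: generated at t=2, received at t=3.
samples = aoi_trace([(-1.0, 0.0), (2.0, 3.0)], horizon=4.0)
```

In this toy run the age climbs from 1 to 3 while no update arrives and falls back to 1 when packet 1 is received at t = 3.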
The Swedish scholar Antzela Kosta et al. published a review paper on the AoI in 2017, introducing the concept of AoI in detail and summarizing the early research [25]. Jhunjhunwala P R et al. proposed an AoI-aware channel scheduling algorithm for a sensor network with a monitoring station and multiple source nodes. That work states that the cost function is non-decreasing, but it does not provide a complete function or optimization model [24].
There has been some research on the AoI in the IoT. Abbas Q et al. studied the importance and optimization of the AoI and energy efficiency in the IoT [26]. Gu Y et al. studied the average peak AoI under the overlay and underlay schemes in a cognitive radio-based IoT network [27]. Li J et al. studied the average peak AoI of time-limited multicast transmission in the IoT. The authors first describe the evolution of the instantaneous AoI, then derive the service time distribution of all possible reception results on IoT devices, and obtain closed-form expressions for the average AoI and the average peak AoI [28]. Azarhava H et al. proposed a new protocol based on non-orthogonal multiple access (NOMA) in a wireless IoT network with energy harvesting sensors and limited battery cells. A closed-form equation of the AoI for the entire network is obtained, and the AoI is optimized through power scheduling parameters [29].
3. System Model and Optimization Model
3.1. System Model
Figure 2 illustrates the topology of the IoMT, which grew out of the IoT and wearable devices. The core of the IoMT is therefore the set of users equipped with several wearable devices that contain wireless sensors. The wearable devices on a user’s body can detect physiological information (such as blood pressure, pulse, temperature, and electrocardiogram (ECG)) and mobility information (such as location, moving speed, and moving direction). In addition, a coordinator on the user’s body collects the information from all the wearable devices on the same body and communicates with the gateway. The physiological information of all users is sent to the gateway and then transmitted on demand to the nurse, doctor, or ambulance through the Internet. In this paper, each user selects a channel from a gateway in each time slot. To describe the problem more conveniently, we first introduce the notations.
The AoI of each mobile node is defined as the time elapsed since the generation of the latest data of this node received by the gateway, as shown in Eq. (1).
In Eq. (1), the AoI is expressed in terms of the generation time of the currently received data frame and the length of each time slot. Here, we represent the AoI using the specific time rather than the time slot index, which is more precise. In each time slot, the system pays a cost for the AoI, and the cost is defined as a function of the AoI of all mobile nodes. Since this cost is what the system pays for the lack of fresh information from the source node, it is a non-decreasing function of the AoI, as shown in Eq. (2).
In Eq. (2), each node j has its own AoI cost function and a weight coefficient. The weight coefficient is determined by the ratio of the consumed energy of node j to its initial energy. The energy terms involved are the energy consumed by the node and the energy consumption of free-space transmission.
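As a rough illustration of how an energy-weighted AoI cost could be evaluated, the following Python sketch combines the two ingredients named above. It is only a sketch under stated assumptions: the weighted-sum form, the identity per-node cost function, and the names `energy_weight` and `system_cost` are all hypothetical, since the exact expressions are not reproduced here.

```python
def energy_weight(consumed_energy, initial_energy):
    """Weight of a node: the fraction of its initial energy already consumed."""
    if initial_energy <= 0:
        raise ValueError("initial_energy must be positive")
    return consumed_energy / initial_energy


def system_cost(aoi, consumed, initial, aoi_cost=lambda age: age):
    """Sum of weighted per-node AoI costs (assumed form).

    aoi_cost must be non-decreasing in the age. The identity used by
    default is an illustrative assumption, not the paper's function.
    """
    return sum(
        energy_weight(c, e0) * aoi_cost(a)
        for a, c, e0 in zip(aoi, consumed, initial)
    )


# Two nodes: ages 2 and 4, having used 25% and 75% of their initial energy.
cost = system_cost(aoi=[2.0, 4.0], consumed=[1.0, 3.0], initial=[4.0, 4.0])
```

With these toy numbers, the node that is both staler and more depleted dominates the cost, which is the behaviour a non-decreasing, energy-weighted cost is meant to capture.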
The mobile node communication complies with the 802.11 standards and adopts OFDM technology. The signal-to-noise ratio of the mobile node is defined as follows:
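The Notations list a transmission power, a channel gain, and Gaussian noise for this quantity. A minimal sketch, assuming the common ratio of received power to noise power (the function names and the linear-scale threshold check are hypothetical):

```python
def snr(tx_power, channel_gain, noise_power):
    """Linear-scale SNR as received power over noise power (assumed form)."""
    if noise_power <= 0:
        raise ValueError("noise_power must be positive")
    return tx_power * channel_gain / noise_power


def meets_threshold(tx_power, channel_gain, noise_power, threshold):
    """True if the link satisfies a minimum-SNR requirement."""
    return snr(tx_power, channel_gain, noise_power) >= threshold


# Example values chosen only for illustration.
value = snr(tx_power=0.1, channel_gain=1e-6, noise_power=1e-9)
ok = meets_threshold(0.1, 1e-6, 1e-9, threshold=50.0)
```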
3.2. Optimization Model
s.t.
Formula (7) indicates that in any time slot t, a channel k can be allocated to only one node j. Formula (8) indicates that in time slot t, a node can communicate with only one gateway. Formula (9) indicates whether channel k of gateway i is allocated to user j in time slot t: 1 means yes, and 0 means no. Formula (10) indicates that the number of occupied gateways cannot exceed the number of available gateways. Formula (11) indicates that the number of occupied channels cannot exceed the number of available channels. Formula (12) indicates that the occupied channel bandwidth cannot exceed the total channel bandwidth. Formula (13) indicates that the signal-to-noise ratio of a node must be higher than the threshold so as to ensure the transmission rate.
For a small network with a small total number of channels, enumeration can be used to calculate the cost of each user choosing each subchannel of each gateway and then to find the subchannel with the lowest cost. However, if there are 1000 users, 5 gateways, and 64 subchannels in the network, the enumeration method requires at least 1000 × 5 × 64 = 320,000 cost evaluations. Thus, for larger networks, the computational complexity is quite high, and it is important to design a low-complexity algorithm to solve the proposed problem.
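The size of the enumeration, and the exhaustive search it implies for a single user, can be sketched as follows. The helper names and the toy cost function are hypothetical and serve only to make the count concrete.

```python
from itertools import product


def enumeration_size(num_users, num_gateways, num_subchannels):
    """Number of (user, gateway, subchannel) cost evaluations per time slot."""
    return num_users * num_gateways * num_subchannels


def cheapest_choice(cost, num_gateways, num_subchannels):
    """Exhaustively find the (gateway, subchannel) pair with the lowest cost.

    cost is a hypothetical callable: cost(gateway, subchannel) -> float.
    """
    return min(
        product(range(num_gateways), range(num_subchannels)),
        key=lambda pair: cost(*pair),
    )


evaluations = enumeration_size(1000, 5, 64)
# Toy cost whose minimum is at gateway 2, subchannel 10.
best = cheapest_choice(lambda i, k: abs(i - 2) + abs(k - 10), 5, 64)
```

The count grows linearly in each of the three dimensions, so it is repeated in every time slot for every user, which is what motivates a learned policy instead of a per-slot search.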
4. DQCA Algorithm Design
We assume that the channel selection of each user is a Markov decision process (MDP), and that the policy decision and the AoI depend only on the selection in the last time slot. In this network, there are a large number of users, and they move randomly. It is difficult to obtain an optimal analytical solution of the optimization model above, because the result of the optimization depends largely on the model that is built and on the processing speed of the computer. Reinforcement learning is suitable for the channel allocation problem of this network. On the one hand, it adjusts actions through rewards and the interaction between the user and the environment, which can solve optimization problems for which analytical solutions are difficult to obtain; on the other hand, it adapts well to a highly dynamic environment and a frequently changing channel. Q-learning and DQN are two typical reinforcement learning algorithms. Their flow diagrams are shown in Figures 3 and 4, respectively.
In Q-learning, the agent chooses an action in each state, builds a Q-table, and records the Q-value for each state-action pair. The Q-value is updated using the reward produced by the selected action. However, since all possible states and actions are enumerated in the Q-table, Q-learning is suitable only for MDP problems with a small state space and a small action space. When these spaces become large, the storage required by the Q-table becomes very large, and the Q-table may no longer fit in memory. Meanwhile, the convergence of Q-learning slows down.
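The tabular procedure described above corresponds to the standard Q-learning update. A minimal sketch follows; the learning rate `alpha`, discount `gamma`, the state labels, and the three-channel action set are illustrative assumptions rather than the paper's settings.

```python
import random
from collections import defaultdict


def q_learning_step(q_table, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


def epsilon_greedy(q_table, state, actions, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the best known action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_table[(state, a)])


q = defaultdict(float)   # one entry per (state, action) pair: the Q-table
channels = [0, 1, 2]     # hypothetical action set: which channel to pick
new_value = q_learning_step(q, state="s0", action=1, reward=1.0,
                            next_state="s1", actions=channels)
chosen = epsilon_greedy(q, "s0", channels, epsilon=0.0)
```

Because the table holds one entry per state-action pair, its size is the product of the number of states and the number of actions, which is the scaling limit mentioned above.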
Compared with the Q-learning algorithm, DQN uses an artificial neural network (ANN) to approximate the value function, uses a target Q network to update the target value, and uses experience replay to train the reinforcement learning process. DQN updates only the parameters of the artificial neural network rather than the whole Q-table. It therefore shortens the convergence time and is more suitable for problems with large state and action spaces. Considering the large number of users and channels, we abandon the Q-table-based Q-learning algorithm and choose the DQN to train the network and obtain an approximately optimal solution. Our proposed DQCA algorithm is a channel allocation algorithm based on DQN.
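The three DQN ingredients named above (function approximation, a target network, and experience replay) can be seen together in a small, dependency-free sketch. To keep it short, the neural network is replaced here by a linear Q-function with one weight row per action; that substitution, the class name `LinearDQN`, the hyperparameters, and the toy one-state environment are all assumptions for illustration and not the authors' implementation.

```python
import random
from collections import deque


class LinearDQN:
    """Tiny DQN-style learner with a linear Q-function (one weight row per action)."""

    def __init__(self, n_features, n_actions, lr=0.05, gamma=0.9,
                 buffer_size=1000, sync_every=20, seed=0):
        self.rng = random.Random(seed)
        self.theta = [[0.0] * n_features for _ in range(n_actions)]  # prediction net
        self.theta_target = [row[:] for row in self.theta]           # target net
        self.replay = deque(maxlen=buffer_size)                      # experience replay
        self.lr, self.gamma, self.sync_every = lr, gamma, sync_every
        self.updates = 0

    @staticmethod
    def _q(params, state):
        """Q-values of all actions for one state."""
        return [sum(w * x for w, x in zip(row, state)) for row in params]

    def act(self, state, epsilon):
        """Epsilon-greedy action selection using the prediction parameters."""
        if self.rng.random() < epsilon:
            return self.rng.randrange(len(self.theta))
        q = self._q(self.theta, state)
        return q.index(max(q))

    def remember(self, state, action, reward, next_state):
        self.replay.append((state, action, reward, next_state))

    def train(self, batch_size=8):
        """One minibatch update; targets come from the frozen target parameters."""
        if len(self.replay) < batch_size:
            return
        for s, a, r, s2 in self.rng.sample(list(self.replay), batch_size):
            target = r + self.gamma * max(self._q(self.theta_target, s2))
            error = target - self._q(self.theta, s)[a]
            # Gradient step on 0.5 * error**2 with respect to theta[a].
            self.theta[a] = [w + self.lr * error * x
                             for w, x in zip(self.theta[a], s)]
        self.updates += 1
        if self.updates % self.sync_every == 0:
            self.theta_target = [row[:] for row in self.theta]  # sync target net


# Toy problem: one state, two channels; reward is the negative cost,
# and channel 1 is the cheaper one.
agent = LinearDQN(n_features=1, n_actions=2)
state = [1.0]
rewards = {0: -5.0, 1: -1.0}
for _ in range(300):
    action = agent.act(state, epsilon=0.5)
    agent.remember(state, action, rewards[action], state)
    agent.train()
best_action = agent.act(state, epsilon=0.0)
```

Only the parameter rows are stored and updated, so memory no longer grows with the number of distinct states, which is the advantage over the Q-table noted above.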
Agent: We define the controller node on mobile user as an agent. As an agent, it trains the neural network according to the network status (number of users, user location, moving speed and direction of users, etc.) to obtain reasonable actions.
System state: Denoted by s(t), the system state includes the channel environment and the node behavior. The node behavior mainly refers to the current position of the node (node mobility follows the random walk model [24]), and the nearest gateway is selected for access according to this position. The channel environment is characterized by the signal-to-noise ratio of the node: if node j selects gateway i for data transmission in time slot t, the state is set according to whether the signal-to-noise ratio of node j in time slot t reaches the threshold.
System action: After the node selects the gateway i, the system action is defined as which channel k of the gateway i is selected by the node j.
Reward: The immediate reward that user j obtains by taking an action in the current system state is defined in Eq. (16). This reward function ensures that the cost of the AoI is minimized while meeting the channel ratio constraint.
For each user j, we define the Q function as the value of taking a given action in a given state, as shown in Eq. (17).
Eq. (17) involves the transition function from the current state to the next state, a discount factor used to balance the immediate reward and the long-term reward, and the set of feasible actions.
Q function and optimal policy: Then the optimal value of Q function and the optimal policy can be represented as Eq. (18) and Eq. (19), respectively.
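For orientation, the textbook form of the optimal Q function and the greedy optimal policy is given below. The symbols (r for the immediate reward, P for the transition function, γ for the discount factor, and 𝒜 for the set of feasible actions) are assumed here and may differ from the paper's own notation in Eqs. (17)–(19).

```latex
Q^{*}(s, a) = r(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, \max_{a' \in \mathcal{A}} Q^{*}(s', a'),
\qquad
\pi^{*}(s) = \arg\max_{a \in \mathcal{A}} Q^{*}(s, a)
```

A discount factor close to 0 makes the agent favour the immediate reward, while a value close to 1 gives more weight to the long-term reward.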
Target value: To avoid the overestimation caused by using a single set of parameters in the neural network, we use two separate sets of parameters for the prediction network and the target network, respectively. The Q-function can then be given by Eq. (20).
Loss function: To approximate the Q-function, we also define the loss function as Eq. (20) to train the parameters of the ANN.
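In the standard DQN formulation, the target value and the loss take the following form. This is the generic textbook version, written with assumed symbols (θ for the prediction network parameters and θ⁻ for the target network parameters); the paper's exact expressions may differ.

```latex
y = r + \gamma \max_{a'} Q(s', a'; \theta^{-}),
\qquad
L(\theta) = \mathbb{E}\left[ \left( y - Q(s, a; \theta) \right)^{2} \right]
```

The target y is computed with the slowly updated parameters θ⁻, so the regression target does not shift with every gradient step on θ.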
In DQCA, we first obtain the locations of all nodes and gateways and select a gateway for each node according to the shortest distance. We then perform the channel allocation described in Algorithm 1.
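The gateway pre-selection step can be sketched as a nearest-neighbour assignment. The sketch assumes two-dimensional coordinates and Euclidean distance; the function names and sample positions are hypothetical.

```python
import math


def nearest_gateway(node_xy, gateways_xy):
    """Index of the gateway closest to the node (Euclidean distance)."""
    return min(
        range(len(gateways_xy)),
        key=lambda i: math.dist(node_xy, gateways_xy[i]),
    )


def assign_gateways(nodes_xy, gateways_xy):
    """Map every node to the index of its nearest gateway."""
    return [nearest_gateway(p, gateways_xy) for p in nodes_xy]


gateways = [(0.0, 0.0), (10.0, 0.0)]
nodes = [(1.0, 1.0), (9.0, -2.0), (5.5, 0.0)]
assignment = assign_gateways(nodes, gateways)
```

Fixing the gateway first reduces each node's remaining decision to the choice of a channel at that gateway, which is the action the learning agent then has to select.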

5. Simulation and Performance Evaluation
In this section, we first introduce the simulation setup, then show the simulation results and analyze the performance of the proposed algorithm.
5.1. Simulation Setup
To verify the effectiveness of the proposed algorithm, the Q-learning algorithm and the greedy algorithm are simulated alongside the DQCA algorithm for comparison. The Q-learning algorithm builds a Q-table for each node and finds the maximum Q-value for each node over all available actions. The main idea of the greedy algorithm is to allocate the channel in each time slot with the minimum growth of the cost function in the next slot as the optimization objective [24]. This paper compares the three algorithms in three respects: cost, AoI, and energy consumption. Here, cost refers to the overall cost of the network, calculated according to formula (6), and the average AoI of all nodes is calculated as follows:
Energy consumption is the average energy consumption of all nodes, defined as
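Both metrics are averages over the nodes. A minimal sketch, assuming a plain arithmetic mean over per-node values in a given time slot (whether the paper additionally averages over time slots is not stated here, and the function names are hypothetical):

```python
def _mean(values):
    """Arithmetic mean with an explicit error for an empty input."""
    if not values:
        raise ValueError("at least one node is required")
    return sum(values) / len(values)


def average_aoi(aoi_per_node):
    """Average AoI over all nodes."""
    return _mean(aoi_per_node)


def average_energy(energy_per_node):
    """Average energy consumption over all nodes."""
    return _mean(energy_per_node)


avg_age = average_aoi([2.0, 4.0, 6.0])
avg_energy = average_energy([0.5, 1.5])
```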
5.2. Performance Evaluation
To verify the effectiveness and feasibility of the proposed DQCA algorithm, this paper uses three different scenarios. In the first, the average size of a data packet is 5 M and the data packet arrival interval is 50 ms, while the number of nodes in the network changes. In the second, the number of nodes is 20 and the data packet arrival interval is 50 ms, while the average size of a data packet changes. In the third, the number of nodes is 20 and the average size of a data packet is 5 M, while the data packet arrival interval changes. The simulation program runs on a computer with an Intel Core i7-3520M CPU at 2.90 GHz and 8 GB of RAM. The parameters used in the simulation are shown in Table 1.

Figures 5–7 show the impact of the number of nodes on network performance when the length of the data packet and the time slot are fixed, as defined in the first scenario. It can be seen from Figures 5 and 6 that the average AoI and the average energy consumption of the three algorithms decrease continuously as the number of nodes increases. This is because the AoI and the energy consumption increase more slowly than the number of nodes, resulting in a decrease in the average value. At the same time, due to the large state space, Q-learning needs more time and computing resources, and it is inferior to DQCA in terms of AoI and energy consumption. For energy consumption in particular, the average energy consumption of all nodes under the DQCA algorithm is about 38.56% lower than under the Q-learning algorithm. It can be seen from Figure 7 that the costs of the three algorithms all increase with the number of nodes, with the DQCA algorithm increasing slowly and by a small increment. The cost takes into account the AoI and the energy consumption of the nodes, and the DQCA algorithm has an advantage over the other two algorithms in both respects. Therefore, its total cost is significantly lower than that of the Greedy and Q-learning algorithms and can be reduced by up to 57.3% compared with the Greedy algorithm.
Figures 8–10 show how network performance changes with the size of the data packet when the number of nodes and the packet arrival interval are fixed, as in the second scenario. It can be seen that as the size of the data packet increases, the AoI and the energy consumption of the nodes also increase, so the cost increases accordingly. This is because a larger data packet takes longer to process and transmit, so the gateway node waits longer for the latest update from the node, and the energy consumption of the transmitters and receivers of the nodes increases accordingly. Compared with the greedy algorithm and the Q-learning algorithm, the DQCA algorithm can reduce the cost by about 62% and 60%, respectively.
Figures 11–13 show how network performance changes with the packet arrival interval when the number of nodes and the size of the data packet are fixed, as in the third scenario. It can be seen that when the data packet arrival interval increases, the number of data packets in the network decreases, and the nodes send and receive fewer data packets, so the energy consumption of the nodes is reduced. As the data packet arrival interval increases, the probability that a node is allocated a channel at the gateway node also increases; that is, the time a node waits for an assigned channel is shortened. As can be seen from Figure 11, the AoI of the nodes is reduced overall. The simulation results in Figure 13 show that as the packet arrival interval continues to increase, its impact on the average energy consumption and cost of the nodes gradually decreases, and the curves in Figures 12 and 13 tend to become stable. This is because, once the packet arrival interval has increased to a certain extent, the basic energy consumption of a node accounts for a larger proportion of its total energy consumption, and the energy consumption of the node is less affected by the sending and receiving of data packets.
The greedy algorithm considers only the optimal value of the current function; it takes into account neither previous choices nor the consequences of the current choice. In practice, this approach often does not give the best results. Accordingly, in Figures 5–13, the greedy algorithm performs worst compared with Q-learning and DQCA.
6. Conclusion
Focusing on the freshness of information in the IoMT, this paper studied the channel allocation problem oriented to the AoI. The system cost is defined as a non-decreasing function of the AoI and the energy consumption of nodes. Since the system cost optimization problem is difficult to solve due to the large number of users and their mobility, we adopt a DQN-based method named the DQCA algorithm. The simulation compared the proposed DQCA algorithm with the Greedy algorithm and the Q-learning algorithm in three different cases. The simulation results show the superiority of the DQCA algorithm in terms of the average AoI, the average energy consumption of nodes, and the system cost.
Notations
N:  the total number of gateway nodes 
M:  the total number of nodes 
K:  the total number of subchannels 
i:  the gateway index
j:  the node index
k:  the channel index
:  are the set of channels, the set of gateways, and the set of nodes, respectively 
q:  the time slot sequence number 
:  the time when the data frame is received in the time slot 
:  the information age of the node j communicating with the gateway i in the time slot q 
:  the length of the frame sent 
:  the basic energy consumption of the transmitter 
:  energy consumption parameter for free space transmission 
:  the distance between mobile node j and gateway i 
:  the signal-to-noise ratio of the mobile node 
:  is the transmission power of node j to gateway i 
:  the channel gain 
:  Gaussian noise 
:  a number can be set on demand. 
Data Availability
The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.
Conflicts of Interest
The author(s) declare(s) that they have no conflicts of interest.
Acknowledgments
National Natural Science Foundation of China, Grant/Award Number: 61501308; Basic research project of Liaoning Provincial Department of Education, Grant/Award Number: LG202027; Postdoctoral Research Station project of Shenyang Ligong University.
References
[1] https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner#tab4.
[2] Z. Ning, P. Dong, X. Wang et al., “Mobile Edge Computing Enabled 5G Health Monitoring for Internet of Medical Things: A Decentralized Game Theoretic Approach,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 2, pp. 463–478, 2021.
[3] S. Swayamsiddha and C. Mohanty, “Application of cognitive Internet of Medical Things for COVID-19 pandemic,” Diabetes and Metabolic Syndrome Clinical Research and Reviews, vol. 14, no. 5, pp. 911–915, 2020.
[4] R. Pratap Singh, M. Javaid, A. Haleem, R. Vaishya, and S. Ali, “Internet of medical things (IoMT) for orthopaedic in COVID-19 pandemic: roles, challenges, and applications,” Journal of Clinical Orthopaedics and Trauma, vol. 11, no. 4, pp. 713–717, 2020.
[5] M. A. Mujawar, H. Gohel, S. K. Bhardwaj, S. Srinivasan, N. Hickman, and A. Kaushik, “Nano-enabled biosensing systems for intelligent healthcare: towards COVID-19 management,” Materials Today Chemistry, vol. 17, p. 100306, 2020.
[6] L. Barbierato, A. Estebsari, E. Pons et al., “A distributed IoT infrastructure to test and deploy real-time demand response in smart grids,” IEEE Internet of Things Journal, vol. 6, no. 1, pp. 1136–1146, 2019.
[7] Z. Ning, P. Dong, X. Wang et al., “Partial Computation Offloading and Adaptive Task Scheduling for 5G-enabled Vehicular Networks,” IEEE Transactions on Mobile Computing, p. 1, 2020.
[8] S. H. Shao, A. Khreishah, and I. Khalil, “Enabling real-time indoor tracking of IoT devices through visible light retroreflection,” IEEE Transactions on Mobile Computing, vol. 19, no. 4, pp. 836–851, 2020.
[9] Z. Ning, P. Dong, X. Wang et al., “Distributed and Dynamic Service Placement in Pervasive Edge Computing Networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 6, pp. 1277–1292, 2021.
[10] H. Viswanathan and P. Mogensen, “Communications in the 6G era,” IEEE Access, vol. 99, pp. 1–1, 2020.
[11] S. Kaul, M. Gruteser, V. Rai, and J. Kenney, “Minimizing age of information in vehicular networks,” in 8th annual IEEE communications society conference on sensor, mesh and ad hoc communications and networks (SECON), pp. 350–358, Salt Lake City, UT, USA, 2011.
[12] X. Wang, Z. Ning, and S. Guo, “Minimizing the Age-of-Critical-Information: An Imitation Learning-Based Scheduling Approach under Partial Observations,” IEEE Transactions on Mobile Computing, p. 1, 2021.
[13] F. James and M. Priya, “Deep Learning Radial Basis Function Neural Networks Based Automatic Detection of Diabetic Retinopathy,” SSRN Electronic Journal, 2020.
[14] X. Wang, Z. Ning, S. Guo, and L. Wang, “Imitation Learning Enabled Task Scheduling for Online Vehicular Edge Computing,” IEEE Transactions on Mobile Computing, p. 1, 2020.
[15] S. Abbasi, S. Saberi, M. Zarvani, P. Amiri, and R. Azmi, “Deep Learning Classification Schemes for the Identification of COVID-19 Infected Patients Using Large Chest X-Ray Image Dataset,” in 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'20), Montreal, QC, Canada, 2020.
[16] T. Dong, C. Yang, B. Cui et al., “Development and validation of a deep learning Radiomics model predicting lymph node status in operable cervical Cancer,” Frontiers in Oncology, vol. 10, 2020.
[17] Z. Ning, K. Zhang, X. Wang et al., “Intelligent Edge Computing in Internet of Vehicles: A Joint Computation Offloading and Caching Solution,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 4, pp. 2212–2225, 2021.
[18] W. Qiao, W. Tian, Y. Tian, Q. Yang, Y. Wang, and J. Zhang, “The forecasting of PM2.5 using a hybrid model based on wavelet transform and an improved deep learning algorithm,” IEEE Access, vol. 7, pp. 142814–142825, 2019.
[19] J. Sadefo Kamdem, R. Bandolo Essomba, and J. Njong Berinyuy, “Deep learning models for forecasting and analyzing the implications of COVID-19 spread on some commodities markets volatilities,” Chaos, Solitons & Fractals, vol. 140, p. 110215, 2020.
[20] B. Zhao, J. Liu, Z. Wei, and I. You, “A deep reinforcement learning based approach for Energy-Efficient Channel allocation in satellite internet of things,” IEEE Access, vol. 8, pp. 62197–62206, 2020.
[21] X. Wang, Z. Ning, and S. Guo, “Multiagent imitation learning for pervasive edge computing: a decentralized computation offloading algorithm,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 2, pp. 411–425, 2021.
[22] Z. Gao, M. Eisen, and A. Ribeiro, “Resource Allocation Via Model-Free Deep Learning in Free Space Optical Networks,” 2020, https://arxiv.org/abs/2007.13709v1.
[23] Z. Ning, S. Sun, X. Wang et al., “Intelligent Resource Allocation in Mobile Blockchain for Privacy and Security Transactions: A Deep Reinforcement Learning Based Approach,” Science China Information Sciences, vol. 64, no. 6, 2021.
[24] P. R. Jhunjhunwala, “Age-of-Information Aware Scheduling,” in 2018 International Conference on Signal Processing and Communications (SPCOM), Bangalore, India, 2018.
[25] A. Kosta, N. Pappas, and V. Angelakis, “Age of information: a new concept, metric, and tool,” Foundations and Trends in Networking, vol. 12, no. 3, pp. 162–259, 2017.
[26] Q. Abbas, S. Zeb, S. A. Hassan, R. Mumtaz, and S. A. R. Zaidi, “Joint Optimization of Age of Information and Energy Efficiency in IoT Networks,” in IEEE VTC Spring 2020, Antwerp, Belgium, 2020.
[27] Y. Gu, H. Chen, C. Zhai, Y. Li, and B. Vucetic, “Minimizing age of information in cognitive radio-based IoT Systems: underlay or Overlay?” IEEE Internet of Things Journal, vol. 6, no. 6, pp. 10273–10288, 2019.
[28] J. Li, Y. Zhou, and H. Chen, “Age of Information for Multicast Transmission with Fixed and Random Deadlines in IoT Systems,” IEEE Internet of Things Journal, vol. 7, no. 9, pp. 8178–8191, 2020.
[29] H. Azarhava, M. Pourmohammad Abdollahi, and J. Musevi Niya, “Age of information in wireless powered IoT networks: NOMA vs. TDMA,” Ad Hoc Networks, vol. 104, article 102179, 2020.
Copyright
Copyright © 2021 Kefeng Wei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.