Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks

Many factors influence the connection states between nodes of wireless sensor networks, such as physical distance, and the network load, making the network’s edge length dynamic in abundant scenarios. This dynamic property makes the network essentially form a graph with stochastic edge le...

Full description

Saved in:

Bibliographic Details
Main Authors:	Wenwen Xia, Chong Di, Haonan Guo, Shenghong Li
Format:	Article
Language:	English
Published:	IEEE 2019-01-01
Series:	IEEE Access
Subjects:	Stochastic shortest path finding reinforcement learning Q-learning SARSA convergence proof
Online Access:	https://ieeexplore.ieee.org/document/8886484/
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1841554045646405632
author	Wenwen Xia Chong Di Haonan Guo Shenghong Li
author_facet	Wenwen Xia Chong Di Haonan Guo Shenghong Li
author_sort	Wenwen Xia
collection	DOAJ
description	Many factors influence the connection states between nodes of wireless sensor networks, such as physical distance, and the network load, making the network’s edge length dynamic in abundant scenarios. This dynamic property makes the network essentially form a graph with stochastic edge lengths. In this paper, we study the stochastic shortest path problem on a directional graph with stochastic edge lengths, using reinforcement learning algorithms. we regard each edge length as a random variable following unknown probability distribution and aim to find the stochastic shortest path on this stochastic graph. We evaluate the performance of path-finding algorithms using regret, which represents the cumulative reward difference between the practical path-finding algorithm and the optimal strategy that chooses the global stochastic shortest path every time. We model the path-finding procedure as a Markov decision process and propose two online path-finding algorithms: Q<italic><sub>SSP</sub></italic> algorithm and SARSA<italic><sub>SSP</sub></italic> algorithm, both combined with specifically-devised average reward mechanism. We justify the convergence property and correctness of the proposed algorithms theoretically. Experiments conducted on two benchmark graphs illustrate the superior performance of the proposed Q<italic><sub>SSP</sub></italic> algorithm which outperforms the SARSA<italic><sub>SSP</sub></italic> algorithm and other competitive algorithms about the regret metric.
format	Article
id	doaj-art-8273418c69854c00a09e22ba6953b29d
institution	Kabale University
issn	2169-3536
language	English
publishDate	2019-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-8273418c69854c00a09e22ba6953b29d2025-01-09T00:00:42ZengIEEEIEEE Access2169-35362019-01-01715780715781710.1109/ACCESS.2019.29500558886484Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor NetworksWenwen Xia0https://orcid.org/0000-0003-2928-7298Chong Di1https://orcid.org/0000-0002-6008-1813Haonan Guo2https://orcid.org/0000-0003-4450-0683Shenghong Li3School of Cyber Security, Shanghai Jiao Tong University, Shanghai, ChinaSchool of Cyber Security, Shanghai Jiao Tong University, Shanghai, ChinaSchool of Cyber Security, Shanghai Jiao Tong University, Shanghai, ChinaSchool of Cyber Security, Shanghai Jiao Tong University, Shanghai, ChinaMany factors influence the connection states between nodes of wireless sensor networks, such as physical distance, and the network load, making the network’s edge length dynamic in abundant scenarios. This dynamic property makes the network essentially form a graph with stochastic edge lengths. In this paper, we study the stochastic shortest path problem on a directional graph with stochastic edge lengths, using reinforcement learning algorithms. we regard each edge length as a random variable following unknown probability distribution and aim to find the stochastic shortest path on this stochastic graph. We evaluate the performance of path-finding algorithms using regret, which represents the cumulative reward difference between the practical path-finding algorithm and the optimal strategy that chooses the global stochastic shortest path every time. We model the path-finding procedure as a Markov decision process and propose two online path-finding algorithms: Q<italic><sub>SSP</sub></italic> algorithm and SARSA<italic><sub>SSP</sub></italic> algorithm, both combined with specifically-devised average reward mechanism. We justify the convergence property and correctness of the proposed algorithms theoretically. Experiments conducted on two benchmark graphs illustrate the superior performance of the proposed Q<italic><sub>SSP</sub></italic> algorithm which outperforms the SARSA<italic><sub>SSP</sub></italic> algorithm and other competitive algorithms about the regret metric.https://ieeexplore.ieee.org/document/8886484/Stochastic shortest path findingreinforcement learningQ-learningSARSAconvergence proof
spellingShingle	Wenwen Xia Chong Di Haonan Guo Shenghong Li Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks IEEE Access Stochastic shortest path finding reinforcement learning Q-learning SARSA convergence proof
title	Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks
title_full	Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks
title_fullStr	Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks
title_full_unstemmed	Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks
title_short	Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks
title_sort	reinforcement learning based stochastic shortest path finding in wireless sensor networks
topic	Stochastic shortest path finding reinforcement learning Q-learning SARSA convergence proof
url	https://ieeexplore.ieee.org/document/8886484/
work_keys_str_mv	AT wenwenxia reinforcementlearningbasedstochasticshortestpathfindinginwirelesssensornetworks AT chongdi reinforcementlearningbasedstochasticshortestpathfindinginwirelesssensornetworks AT haonanguo reinforcementlearningbasedstochasticshortestpathfindinginwirelesssensornetworks AT shenghongli reinforcementlearningbasedstochasticshortestpathfindinginwirelesssensornetworks

Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks

Similar Items