TY - JOUR
T1 - Detection of Hidden Moving Targets by a Group of Mobile Agents with Deep Q-Learning
AU - Matzliach, Barouch
AU - Ben-Gal, Irad
AU - Kagan, Evgeny
N1 - Publisher Copyright: © 2023 by the authors.
PY - 2023/8
Y1 - 2023/8
N2 - In this paper, we propose a solution for the problem of searching for multiple targets by a group of mobile agents with sensing errors of the first and the second types. The agents’ goal is to plan the search and follow its trajectories that lead to target detection in minimal time. Relying on real sensors’ properties, we assume that the agents can detect the targets in various directions and distances; however, they are exposed to first- and second-type statistical errors. Furthermore, we assume that the agents in the group have errorless communication with each other. No central station or coordinating agent is assumed to control the search. Thus, the search follows a fully distributed decision-making process, in which each agent plans its path independently based on the information about the targets, which is collected independently or received from the other agents. The suggested solution includes two algorithms: the Distributed Expected Information Gain (DEIG) algorithm, which implements dynamic Voronoi partitioning of the search space and plans the paths by maximizing the expected one-step look-ahead information per region, and the Collective Q-max (CQM) algorithm, which finds the shortest paths of the agents in the group by maximizing the cumulative information about the targets’ locations using deep Q-learning techniques. The developed algorithms are compared against previously developed reactive and learning methods, such as the greedy centralized Expected Information Gain (EIG) method. It is demonstrated that these algorithms, specifically the Collective Q-max algorithm, considerably outperform existing solutions. In particular, the proposed algorithms improve the results by 20% to 100% under different scenarios of noisy environments and sensors’ sensitivity.
KW - decision making
KW - deep learning
KW - group dynamics
KW - mobile agents
KW - neural networks
KW - path planning
KW - search and detection
UR - http://www.scopus.com/inward/record.url?scp=85169053411&partnerID=8YFLogxK
U2 - 10.3390/robotics12040103
DO - 10.3390/robotics12040103
M3 - Article
AN - SCOPUS:85169053411
SN - 2218-6581
VL - 12
JO - Robotics
JF - Robotics
IS - 4
M1 - 103
ER -