TY - JOUR
T1 - Criticality-based Varying Step-number Algorithm for Reinforcement Learning
AU - Spielberg, Yitzhak
AU - Azaria, Amos
N1 - Publisher Copyright: © 2021 World Scientific Publishing Company.
PY - 2021/6
Y1 - 2021/6
N2 - In the context of reinforcement learning we introduce the concept of criticality of a state, which indicates the extent to which the choice of action in that particular state influences the expected return. That is, a state in which the choice of action is more likely to influence the final outcome is considered as more critical than a state in which it is less likely to influence the final outcome. We formulate a criticality-based varying step number algorithm (CVS) - a flexible step number algorithm that utilizes the criticality function provided by a human, or learned directly from the environment. We test it in three different domains including the Atari Pong environment, Road-Tree environment, and Shooter environment. We demonstrate that CVS is able to outperform popular learning algorithms such as Deep Q-Learning and Monte Carlo.
KW - Human-aided reinforcement learning
KW - deep reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85109028499&partnerID=8YFLogxK
U2 - 10.1142/S0218213021500196
DO - 10.1142/S0218213021500196
M3 - Article
AN - SCOPUS:85109028499
SN - 0218-2130
VL - 30
JO - International Journal on Artificial Intelligence Tools
JF - International Journal on Artificial Intelligence Tools
IS - 4
M1 - 2150019
ER -