TY - GEN
T1 - Path planning of a mobile robot as a discrete optimization problem and adjustment of weight parameters in the objective function by reinforcement learning
AU - Igarashi, Harukazu
PY - 2001
Y1 - 2001
AB - In a previous paper, we proposed a solution to the path planning problem for a mobile robot. In that approach, we formulated path planning as a discrete optimization problem solved at each time step, using an objective function consisting of a goal term, a smoothness term, and a collision term. This paper presents a theoretical method that uses reinforcement learning to adjust the weight parameters in the objective function. Because conventional Q-learning cannot be applied to a non-Markov decision process, we applied Williams's learning algorithm, REINFORCE, to derive an updating rule for the weight parameters. The resulting rule is a stochastic hill-climbing method that maximizes a value function. We verified the updating rule by experiment.
UR - http://www.scopus.com/inward/record.url?scp=84867460364&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867460364&partnerID=8YFLogxK
U2 - 10.1007/3-540-45324-5_32
DO - 10.1007/3-540-45324-5_32
M3 - Conference contribution
AN - SCOPUS:84867460364
SN - 3540421858
SN - 9783540421856
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 315
EP - 320
BT - RoboCup 2000
A2 - Stone, Peter
A2 - Balch, Tucker
A2 - Kraetzschmar, Gerhard
PB - Springer-Verlag
T2 - 4th Robot World Cup Soccer Games and Conferences, RoboCup 2000
Y2 - 27 August 2000 through 3 September 2000
ER -