Reinforcement learning in non-Markovian environments using automatic discovery of subgoals

Le Tien Dung, Takashi Komeda, Motoki Takagi

Research output: Conference contribution

6 Citations (Scopus)

Abstract

Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q values. By creating useful subgoals, we can speed up learning. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. The learned RNNs are then integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experimental results on the E maze problem and the virtual office problem show the potential of this approach.
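The abstract describes the architecture only at a high level. The following is a minimal Python/PyTorch sketch of one plausible reading: a recurrent Q-value predictor whose hidden state summarizes the observation history in place of the missing Markov state, plus per-subgoal expert networks. Every name here (RQNet, obs_dim, the additive mixing of main and expert Q-values) is an illustrative assumption, not a detail taken from the paper.

    # Minimal sketch (assumptions, not the paper's implementation):
    # a recurrent Q-network plus one expert network per discovered
    # subgoal. The abstract does not specify how the learned experts
    # are integrated into the main RNN; the additive Q-value mixing
    # below is purely illustrative.
    import torch
    import torch.nn as nn

    class RQNet(nn.Module):
        """Recurrent Q-value predictor for a non-Markovian task."""
        def __init__(self, obs_dim: int, num_actions: int, hidden: int = 64):
            super().__init__()
            self.rnn = nn.RNN(obs_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, num_actions)

        def forward(self, obs_seq, h=None):
            out, h = self.rnn(obs_seq, h)   # out: (batch, time, hidden)
            return self.head(out), h        # Q-values at every time step

    obs_dim, num_actions = 8, 4
    main_net = RQNet(obs_dim, num_actions)   # main policy
    experts = [RQNet(obs_dim, num_actions)   # one expert per subgoal
               for _ in range(2)]

    history = torch.randn(1, 10, obs_dim)    # a 10-step observation history
    q_main, _ = main_net(history)
    q_mix = q_main[:, -1] + sum(e(history)[0][:, -1] for e in experts)
    action = int(torch.argmax(q_mix, dim=-1))  # greedy action selection

Per the abstract, the experts would first be trained to attain their subgoals and then kept fixed while the agent continues learning with the combined policy; the additive mixing above merely stands in for that unspecified integration step.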

Original language: English
Title of host publication: SICE Annual Conference, SICE 2007
Pages: 2601-2605
Number of pages: 5
DOI
Publication status: Published - 2007 Dec 1
Event: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007 - Takamatsu, Japan
Duration: 2007 Sep 17 - 2007 Sep 20

Publication series

Name: Proceedings of the SICE Annual Conference

Conference

Conference: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007
Country/Territory: Japan
City: Takamatsu
Period: 07/9/17 - 07/9/20

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering
