Reinforcement learning in non-Markovian environments using automatic discovery of subgoals

Le Tien Dung, Takashi Komeda, Motoki Takagi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q values. By creating useful subgoals, we can speed up learning. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. The learned RNNs are then integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experimental results on the E-maze problem and the virtual office problem show the potential of this approach.
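The abstract outlines the pipeline but does not say how subgoals are actually discovered. The following is a minimal, purely illustrative Python sketch of one common heuristic from the subgoal-discovery literature: flag states that appear disproportionately often on successful trajectories (bottleneck-style detection). Every identifier here (find_subgoals, the threshold, the toy trajectories) is an assumption made for this sketch, not the paper's algorithm.

    # Hypothetical sketch only: frequency-based subgoal discovery.
    # The paper's actual discovery mechanism is not specified in the
    # abstract; all names and parameters here are illustrative.
    from collections import Counter

    def find_subgoals(successful, unsuccessful, threshold=0.6, max_subgoals=3):
        """Return states visited far more often on successful trajectories.

        Each trajectory is a list of hashable states; the final (goal)
        state of every trajectory is excluded from the candidate pool.
        """
        # Count, per outcome class, how many trajectories visit each state.
        pos = Counter(s for traj in successful for s in set(traj[:-1]))
        neg = Counter(s for traj in unsuccessful for s in set(traj[:-1]))

        scores = {}
        for state, hits in pos.items():
            p_succ = hits / len(successful)
            p_fail = neg.get(state, 0) / max(len(unsuccessful), 1)
            scores[state] = p_succ - p_fail  # bottleneck-style score

        ranked = sorted(scores, key=scores.get, reverse=True)
        return [s for s in ranked if scores[s] >= threshold][:max_subgoals]

    if __name__ == "__main__":
        # Toy E-maze-like trajectories over grid cells (row, col).
        ok = [[(0, 0), (0, 1), (1, 1), (2, 1)],
              [(0, 2), (0, 1), (1, 1), (2, 1)]]
        bad = [[(0, 0), (0, 1), (0, 2)]]
        print(find_subgoals(ok, bad))  # -> [(1, 1)]: the shared corridor cell

In the method the abstract describes, each discovered subgoal would then receive its own RNN-based sub-policy, and the trained sub-policy RNNs are folded back into the main RNN as experts before learning continues.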

Original language: English
Title of host publication: SICE Annual Conference, SICE 2007
Pages: 2601-2605
Number of pages: 5
DOIs
Publication status: Published - 2007 Dec 1
Event: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007 - Takamatsu, Japan
Duration: 2007 Sept 17 - 2007 Sept 20

Publication series

Name: Proceedings of the SICE Annual Conference

Conference

Conference: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007
Country/Territory: Japan
City: Takamatsu
Period: 07/9/17 - 07/9/20

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering
