Embedding fuzzy mechanisms and knowledge in box-type reinforcement learning controllers

Bibliographic details
Published in: IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics : a publication of the IEEE Systems, Man, and Cybernetics Society. - 1996. - 32(2002), 5, dated: 15., pages 645-53
Main author: Su, Shun-Feng (Author)
Other authors: Hsieh, Sheng-Hsiung
Format: Online article
Language: English
Published: 2002
Access to the parent work: IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics : a publication of the IEEE Systems, Man, and Cybernetics Society
Subjects: Journal Article
Description
Summary: In this paper, we report our study on embedding fuzzy mechanisms and knowledge into box-type reinforcement learning controllers. One previous approach for incorporating fuzzy mechanisms can only achieve one successful run out of nine tests compared to eight successful runs in a nonfuzzy learning control scheme. After analysis, the credit assignment problem and the weighting domination problem are identified. Furthermore, the use of fuzzy mechanisms in temporal difference seems to play a negative role. Modifications to overcome those problems are proposed. Furthermore, several remedies are employed in that approach. The effects of those remedies applied to our learning scheme are presented and possible variations are also studied. Finally, the issue of incorporating knowledge into reinforcement learning systems is studied. From our simulations, it is concluded that the use of knowledge for the control network can provide good learning results, but the use of knowledge for the evaluation network alone seems unable to provide any significant advantages. Furthermore, we also employ Makarovic's (1988) rules as the knowledge for the initial setting of the control network. In our study, the rules are separated into four groups to avoid the ordering problem.
Description: Date Completed 02.10.2012
Date Revised 04.02.2008
published: Print
Citation Status PubMed-not-MEDLINE
ISSN: 1941-0492
DOI: 10.1109/TSMCB.2002.1033183