Continuous Control Monte Carlo Tree Search Informed by Multiple Experts

Efficient algorithms for 3D character control in continuous control setting remain an open problem in spite of the remarkable recent advances in the field. We present a sampling-based model-predictive controller that comes in the form of a Monte Carlo tree search (MCTS). The tree search utilizes inf...


Bibliographic Details
Published in:	IEEE transactions on visualization and computer graphics. - 1996. - 25(2019), 8, 24 Aug., pages 2540-2553
Main author:	Rajamaki, Joose (Author)
Other authors:	Hamalainen, Perttu
Format:	Online article
Language:	English
Published:	2019
Access to parent work:	IEEE transactions on visualization and computer graphics
Subjects:	Journal Article; Research Support, Non-U.S. Gov't


LEADER	01000naa a22002652 4500
001	NLM28637157X
003	DE-627
005	20231225051543.0
007	cr uuu---uuuuu
008	231225s2019 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TVCG.2018.2849386 \|2 doi
028	5	2	\|a pubmed24n0954.xml
035			\|a (DE-627)NLM28637157X
035			\|a (NLM)29994613
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Rajamaki, Joose \|e verfasserin \|4 aut
245	1	0	\|a Continuous Control Monte Carlo Tree Search Informed by Multiple Experts
264		1	\|c 2019
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 21.08.2019
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Efficient algorithms for 3D character control in continuous control setting remain an open problem in spite of the remarkable recent advances in the field. We present a sampling-based model-predictive controller that comes in the form of a Monte Carlo tree search (MCTS). The tree search utilizes information from multiple sources including two machine learning models. This allows rapid development of complex skills such as 3D humanoid locomotion with less than a million simulation steps, in less than a minute of computing on a modest personal computer. We demonstrate locomotion of 3D characters with varying topologies under disturbances such as heavy projectile hits and abruptly changing target direction. In this paper we also present a new way to combine information from the various sources such that minimal amount of information is lost. We furthermore extend the neural network, involved in the algorithm, to represent stochastic policies. Our approach yields a robust control algorithm that is easy to use. While learning, the algorithm runs in near real-time, and after learning the sampling budget can be reduced for real-time operation
650		4	\|a Journal Article
650		4	\|a Research Support, Non-U.S. Gov't
700	1		\|a Hamalainen, Perttu \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on visualization and computer graphics \|d 1996 \|g 25(2019), 8 vom: 24. Aug., Seite 2540-2553 \|w (DE-627)NLM098269445 \|x 1941-0506 \|7 nnns
773	1	8	\|g volume:25 \|g year:2019 \|g number:8 \|g day:24 \|g month:08 \|g pages:2540-2553
856	4	0	\|u http://dx.doi.org/10.1109/TVCG.2018.2849386 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 25 \|j 2019 \|e 8 \|b 24 \|c 08 \|h 2540-2553