Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions

In reinforcement learning, a promising direction to avoid online trial-and-error costs is learning from an offline dataset. Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 12, 1 Dec., pages 15260-15274
Main author: Chen, Xiong-Hui (Author)
Other authors: Luo, Fan-Ming, Yu, Yang, Li, Qingyang, Qin, Zhiwei, Shang, Wenjie, Ye, Jieping
Format: Online article
Language: English
Published: 2023
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article