Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions
In reinforcement learning, a promising direction to avoid online trial-and-error costs is learning from an offline dataset. Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of...
Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), no. 12 (1 Dec.), pages 15260-15274
Main author: Chen, Xiong-Hui (Author)
Other authors: Luo, Fan-Ming; Yu, Yang; Li, Qingyang; Qin, Zhiwei; Shang, Wenjie; Ye, Jieping
Format: Online article
Language: English
Published: 2023
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article