Distributional Soft Actor-Critic With Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal polic...
Bibliographic Details
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 47(2025), 5, dated 7 May, pages 3935-3946 |
Main author: | Duan, Jingliang (author) |
Other authors: | Wang, Wenxuan; Xiao, Liming; Gao, Jiaxin; Li, Shengbo Eben; Liu, Chang; Zhang, Ya-Qin; Cheng, Bo; Li, Keqiang |
Format: | Online article |
Language: | English |
Published: | 2025 |
Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence |
Subjects: | Journal Article |