Distributional Soft Actor-Critic With Three Refinements

Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal polic...


Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 47(2025), issue 5, 7 May, pp. 3935-3946
Main Author:	Duan, Jingliang (Author)
Other Authors:	Wang, Wenxuan; Xiao, Liming; Gao, Jiaxin; Li, Shengbo Eben; Liu, Chang; Zhang, Ya-Qin; Cheng, Bo; Li, Keqiang
Format:	Online article
Language:	English
Published:	2025
Access to parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article

Available online:	Full text