Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to o...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), 13 May
Main author: Liu, Shunyu (author)
Other authors: Song, Jie; Zhou, Yihe; Yu, Na; Chen, Kaixuan; Feng, Zunlei; Song, Mingli
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM372280366
003 DE-627
005 20240514233109.0
007 cr uuu---uuuuu
008 240514s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3399936  |2 doi 
028 5 2 |a pubmed24n1407.xml 
035 |a (DE-627)NLM372280366 
035 |a (NLM)38739512 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Liu, Shunyu  |e verfasserin  |4 aut 
245 1 0 |a Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 14.05.2024 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method, to disentangle the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among discovered interaction prototypes. Then the model selectively restructures these prototypes into a compact interaction pattern by an aggregator with learnable weights. To alleviate the training instability issue caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the history behaviors of each agent. Experiments on single-task, multi-task and zero-shot benchmarks demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT 
650 4 |a Journal Article 
700 1 |a Song, Jie  |e verfasserin  |4 aut 
700 1 |a Zhou, Yihe  |e verfasserin  |4 aut 
700 1 |a Yu, Na  |e verfasserin  |4 aut 
700 1 |a Chen, Kaixuan  |e verfasserin  |4 aut 
700 1 |a Feng, Zunlei  |e verfasserin  |4 aut 
700 1 |a Song, Mingli  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g PP(2024) vom: 13. Mai  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:PP  |g year:2024  |g day:13  |g month:05 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3399936  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d PP  |j 2024  |b 13  |c 05