Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to o...

Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), 13 May
Main Author:	Liu, Shunyu (Author)
Other Authors:	Song, Jie, Zhou, Yihe, Yu, Na, Chen, Kaixuan, Feng, Zunlei, Song, Mingli
Format:	Online Article
Language:	English
Published:	2024
Access to parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM372280366
003	DE-627
005	20240514233109.0
007	cr uuu---uuuuu
008	240514s2024 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TPAMI.2024.3399936 \|2 doi
028	5	2	\|a pubmed24n1407.xml
035			\|a (DE-627)NLM372280366
035			\|a (NLM)38739512
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Liu, Shunyu \|e verfasserin \|4 aut
245	1	0	\|a Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning
264		1	\|c 2024
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 14.05.2024
500			\|a published: Print-Electronic
500			\|a Citation Status Publisher
520			\|a Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method, to disentangle the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among discovered interaction prototypes. Then the model selectively restructures these prototypes into a compact interaction pattern by an aggregator with learnable weights. To alleviate the training instability issue caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the history behaviors of each agent. Experiments on single-task, multi-task and zero-shot benchmarks demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT
650		4	\|a Journal Article
700	1		\|a Song, Jie \|e verfasserin \|4 aut
700	1		\|a Zhou, Yihe \|e verfasserin \|4 aut
700	1		\|a Yu, Na \|e verfasserin \|4 aut
700	1		\|a Chen, Kaixuan \|e verfasserin \|4 aut
700	1		\|a Feng, Zunlei \|e verfasserin \|4 aut
700	1		\|a Song, Mingli \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on pattern analysis and machine intelligence \|d 1979 \|g PP(2024) vom: 13. Mai \|w (DE-627)NLM098212257 \|x 1939-3539 \|7 nnns
773	1	8	\|g volume:PP \|g year:2024 \|g day:13 \|g month:05
856	4	0	\|u http://dx.doi.org/10.1109/TPAMI.2024.3399936 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d PP \|j 2024 \|b 13 \|c 05