Distributed Variational Representation Learning

The problem of distributed representation learning is one in which multiple sources of information X1,…, XK are processed separately so as to learn as much information as possible about some ground truth Y. We investigate this problem from information-theoretic grounds, through a generalization of Tishby's centralized Information Bottleneck (IB) method to the distributed setting. Specifically, K encoders, K ≥ 2, compress their observations X1,…, XK separately in a manner such that, collectively, the produced representations preserve as much information as possible about Y. We study both discrete memoryless (DM) and memoryless vector Gaussian data models. For the discrete model, we establish a single-letter characterization of the optimal tradeoff between complexity (or rate) and relevance (or information) for a class of memoryless sources (the observations X1,…, XK being conditionally independent given Y). For the vector Gaussian model, we provide an explicit characterization of the optimal complexity-relevance tradeoff. Furthermore, we develop a variational bound on the complexity-relevance tradeoff which generalizes the evidence lower bound (ELBO) to the distributed setting. We also provide two algorithms for computing this bound: i) a Blahut-Arimoto type iterative algorithm that computes optimal complexity-relevance encoding mappings by iterating over a set of self-consistent equations, and ii) a variational inference type algorithm in which the encoding mappings are parametrized by neural networks and the bound is approximated by Markov sampling and optimized with stochastic gradient descent. Numerical results on synthetic and real datasets are provided to support the effectiveness of the approaches and algorithms developed in this paper.
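For readers who want the objective behind the abstract: the centralized Information Bottleneck trades the relevance I(U;Y) of a representation U against its complexity I(U;X). A hedged LaTeX sketch follows; the centralized Lagrangian is the standard one due to Tishby et al., while the distributed form is a schematic reconstruction from the abstract (collective relevance against per-encoder rates), not necessarily the paper's exact formulation.

% Centralized IB (Tishby et al.): trade relevance against complexity,
% with beta >= 0 setting the operating point on the tradeoff curve.
\mathcal{L}_{\mathrm{IB}} = I(U;Y) - \beta\, I(U;X)

% Schematic distributed analogue suggested by the abstract: K encoders map
% X_1,\ldots,X_K to U_1,\ldots,U_K, judged collectively through Y.
\mathcal{L}_{\mathrm{D\text{-}IB}} = I(U_1,\ldots,U_K;Y) - \beta \sum_{k=1}^{K} I(U_k;X_k)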
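The abstract's second algorithm parametrizes the encoders with neural networks and optimizes the variational bound by sampling and stochastic gradient descent. Below is a minimal Python sketch of one such training step for K = 2, assuming Gaussian encoders, a standard-normal prior, and a softmax decoder; the layer sizes, the prior, and the single-sample Monte Carlo estimate are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps one observation X_k to a Gaussian posterior q(u_k | x_k)."""
    def __init__(self, x_dim, u_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, u_dim)
        self.logvar = nn.Linear(hidden, u_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

def reparameterize(mu, logvar):
    # u = mu + sigma * eps keeps the sampling step differentiable.
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

def kl_std_normal(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over batch.
    return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=1).mean()

x_dim, u_dim, n_classes, beta = 20, 8, 4, 1e-2
enc1, enc2 = Encoder(x_dim, u_dim), Encoder(x_dim, u_dim)
decoder = nn.Linear(2 * u_dim, n_classes)  # q(y | u_1, u_2)
opt = torch.optim.SGD(
    list(enc1.parameters()) + list(enc2.parameters()) + list(decoder.parameters()),
    lr=1e-2)

# One step on a synthetic batch; real use would loop over a data loader.
x1, x2 = torch.randn(64, x_dim), torch.randn(64, x_dim)
y = torch.randint(0, n_classes, (64,))

mu1, lv1 = enc1(x1)
mu2, lv2 = enc2(x2)
u = torch.cat([reparameterize(mu1, lv1), reparameterize(mu2, lv2)], dim=1)
# Sampled relevance term (cross-entropy) plus beta-weighted per-encoder rates.
loss = F.cross_entropy(decoder(u), y) + beta * (kl_std_normal(mu1, lv1) +
                                                kl_std_normal(mu2, lv2))
opt.zero_grad()
loss.backward()
opt.step()

The beta-weighted KL terms play the role of the per-encoder complexity (rate) penalties, while the cross-entropy term is the sampled relevance part of the bound.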

Detailed Description

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 43(2021), issue 1, 22 Jan., pages 120-138
First author: Aguerri, Inaki Estella (author)
Other authors: Zaidi, Abdellatif
Format: Online article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM299414868
003 DE-627
005 20231225100304.0
007 cr uuu---uuuuu
008 231225s2021 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2019.2928806  |2 doi 
028 5 2 |a pubmed24n0998.xml 
035 |a (DE-627)NLM299414868 
035 |a (NLM)31329108 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Aguerri, Inaki Estella  |e verfasserin  |4 aut 
245 1 0 |a Distributed Variational Representation Learning 
264 1 |c 2021 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 07.12.2020 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a The problem of distributed representation learning is one in which multiple sources of information X1,…, XK are processed separately so as to learn as much information as possible about some ground truth Y. We investigate this problem from information-theoretic grounds, through a generalization of Tishby's centralized Information Bottleneck (IB) method to the distributed setting. Specifically, K encoders, K ≥ 2, compress their observations X1,…, XK separately in a manner such that, collectively, the produced representations preserve as much information as possible about Y. We study both discrete memoryless (DM) and memoryless vector Gaussian data models. For the discrete model, we establish a single-letter characterization of the optimal tradeoff between complexity (or rate) and relevance (or information) for a class of memoryless sources (the observations X1,…, XK being conditionally independent given Y). For the vector Gaussian model, we provide an explicit characterization of the optimal complexity-relevance tradeoff. Furthermore, we develop a variational bound on the complexity-relevance tradeoff which generalizes the evidence lower bound (ELBO) to the distributed setting. We also provide two algorithms for computing this bound: i) a Blahut-Arimoto type iterative algorithm that computes optimal complexity-relevance encoding mappings by iterating over a set of self-consistent equations, and ii) a variational inference type algorithm in which the encoding mappings are parametrized by neural networks and the bound is approximated by Markov sampling and optimized with stochastic gradient descent. Numerical results on synthetic and real datasets are provided to support the effectiveness of the approaches and algorithms developed in this paper.
650 4 |a Journal Article 
700 1 |a Zaidi, Abdellatif  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 43(2021), 1 vom: 22. Jan., Seite 120-138  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:43  |g year:2021  |g number:1  |g day:22  |g month:01  |g pages:120-138 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2019.2928806  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 43  |j 2021  |e 1  |b 22  |c 01  |h 120-138