Metric learning for text documents

Many algorithms in machine learning rely on being given a good distance metric over the input space. Rather than using a default metric such as the Euclidean metric, it is desirable to obtain a metric based on the provided data. We consider the problem of learning a Riemannian metric associated with...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1998. - 28(2006), no. 4 (11 Apr.), pages 497-508
Main author: Lebanon, Guy (author)
Format: Article
Language: English
Published: 2006
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM161552900
003 DE-627
005 20250207052852.0
007 tu
008 231223s2006 xx ||||| 00| ||eng c
028 5 2 |a pubmed25n0539.xml 
035 |a (DE-627)NLM161552900 
035 |a (NLM)16566500 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Lebanon, Guy  |e verfasserin  |4 aut 
245 1 0 |a Metric learning for text documents 
264 1 |c 2006 
336 |a Text  |b txt  |2 rdacontent 
337 |a ohne Hilfsmittel zu benutzen  |b n  |2 rdamedia 
338 |a Band  |b nc  |2 rdacarrier 
500 |a Date Completed 18.04.2006 
500 |a Date Revised 01.12.2018 
500 |a published: Print 
500 |a Citation Status MEDLINE 
520 |a Many algorithms in machine learning rely on being given a good distance metric over the input space. Rather than using a default metric such as the Euclidean metric, it is desirable to obtain a metric based on the provided data. We consider the problem of learning a Riemannian metric associated with a given differentiable manifold and a set of points. Our approach to the problem involves choosing a metric from a parametric family that is based on maximizing the inverse volume of a given data set of points. From a statistical perspective, it is related to maximum likelihood under a model that assigns probabilities inversely proportional to the Riemannian volume element. We discuss in detail learning a metric on the multinomial simplex where the metric candidates are pull-back metrics of the Fisher information under a Lie group of transformations. When applied to text document classification the resulting geodesic distances resemble, but outperform, the tfidf cosine similarity measure
650 4 |a Journal Article 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1998  |g 28(2006), 4 vom: 11. Apr., Seite 497-508  |w (DE-627)NLM098212257  |x 0162-8828  |7 nnns 
773 1 8 |g volume:28  |g year:2006  |g number:4  |g day:11  |g month:04  |g pages:497-508 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 28  |j 2006  |e 4  |b 11  |c 04  |h 497-508
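
The abstract in this record refers to geodesic distances on the multinomial simplex under the Fisher information metric, and to pull-back metrics obtained through a group of transformations. As a rough illustration only, the sketch below uses the standard closed form for the Fisher geodesic distance between two probability vectors, 2 arccos(sum_i sqrt(p_i q_i)). The function names, the coordinate-wise rescaling used as the transformation, and the hand-picked weights are all assumptions made for this example and are not taken from the record. In particular, the weights are not learned with the volume-based criterion the abstract describes.

```python
import numpy as np


def fisher_geodesic_distance(p, q):
    """Fisher-metric geodesic distance between two points of the simplex.

    Uses the closed form 2 * arccos(sum_i sqrt(p_i * q_i)).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Clip to [0, 1] so rounding error cannot push arccos out of its domain.
    affinity = np.clip(np.sum(np.sqrt(p * q)), 0.0, 1.0)
    return 2.0 * np.arccos(affinity)


def rescale(p, weights):
    """Hypothetical transformation: reweight each coordinate, then renormalize."""
    scaled = np.asarray(p, dtype=float) * np.asarray(weights, dtype=float)
    return scaled / scaled.sum()


def pullback_distance(p, q, weights):
    """Distance measured after mapping both points through `rescale`.

    This is an illustrative stand-in for a pull-back metric; `weights`
    is an assumed parameter with positive entries.
    """
    return fisher_geodesic_distance(rescale(p, weights), rescale(q, weights))


# Two toy "documents" as normalized term-frequency vectors (made-up values).
doc_a = np.array([0.5, 0.3, 0.2])
doc_b = np.array([0.2, 0.3, 0.5])
weights = np.array([1.0, 0.1, 1.0])  # down-weights the shared middle term

plain = fisher_geodesic_distance(doc_a, doc_b)
weighted = pullback_distance(doc_a, doc_b, weights)
```

Down-weighting a term that both documents use equally is loosely analogous to what an inverse-document-frequency weight does in tfidf, which is one way to read the comparison drawn in the abstract.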