Semi-Supervised Clustering With Constraints of Different Types From Multiple Information Sources

Full Description

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 43(2021), 9, 01 Sept., pages 3247-3258
Main author: Bai, Liang (Author)
Other authors: Liang, Jiye, Cao, Fuyuan
Format: Online article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
Description
Summary: Semi-supervised clustering is an important research topic in cluster analysis; it uses prior knowledge as constraints to improve clustering performance. When clustering a data set, people often obtain prior constraints from different information sources, which may differ in representation and content, to guide the clustering process. However, most existing semi-supervised clustering algorithms are based on single-source constraints and rarely integrate multi-source constraints to enhance clustering quality. To solve this problem, we analyze the relations among different types of constraints and propose a uniform representation for them. Based on it, we propose a new semi-supervised clustering algorithm that finds a clustering with good cluster structure and high consensus among all the sources of constraints. In the algorithm, we construct an optimization objective model and a solution method for it. The algorithm integrates multi-source constraints well, reducing the effect of incorrect constraints from single sources and finding a high-quality clustering. Experimental studies on several benchmark data sets illustrate the effectiveness of the proposed algorithm compared to other semi-supervised clustering algorithms.
Description: Date Revised 05.08.2021
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN: 1939-3539
DOI: 10.1109/TPAMI.2020.2979699
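
The summary mentions a uniform representation for constraints of different types and a consensus across constraint sources, but this record does not give the details. The following is only a minimal illustrative sketch, not the authors' formulation. It assumes that partial labels and must-link/cannot-link pairs can both be expressed as signed pairwise constraints (+1 for "same cluster", -1 for "different clusters") and that sources are combined by simple vote averaging. The function names `labels_to_pairs`, `links_to_pairs`, and `consensus` are hypothetical.

```python
from itertools import combinations


def labels_to_pairs(labels):
    """Convert partial labels {index: class} into pairwise constraints.

    Assumed representation (not from the paper): {(i, j): +1 or -1}
    with i < j, where +1 means must-link and -1 means cannot-link.
    """
    pairs = {}
    for i, j in combinations(sorted(labels), 2):
        pairs[(i, j)] = 1 if labels[i] == labels[j] else -1
    return pairs


def links_to_pairs(must_link, cannot_link):
    """Convert explicit must-link / cannot-link lists to the same form."""
    pairs = {}
    for i, j in must_link:
        pairs[(min(i, j), max(i, j))] = 1
    for i, j in cannot_link:
        pairs[(min(i, j), max(i, j))] = -1
    return pairs


def consensus(sources):
    """Average the votes of all sources that mention a pair.

    Simple averaging is an assumption made for illustration; a value
    near 0 signals that the sources disagree about that pair.
    """
    votes = {}
    for pairs in sources:
        for key, value in pairs.items():
            votes.setdefault(key, []).append(value)
    return {key: sum(v) / len(v) for key, v in votes.items()}


# Three hypothetical sources describing the same three objects.
source_a = labels_to_pairs({0: "x", 1: "x", 2: "y"})
source_b = links_to_pairs(must_link=[(1, 0)], cannot_link=[(1, 2)])
source_c = links_to_pairs(must_link=[(0, 2)], cannot_link=[])

combined = consensus([source_a, source_b, source_c])
```

In this toy setup, a pair on which sources conflict ends up with a weak averaged score, which is one simple way a single source's incorrect constraint could carry less weight than a constraint all sources agree on.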