Support Vector Machine Classifier via L0/1 Soft-Margin Loss

Support vector machines (SVM) have drawn wide attention for the last two decades due to its extensive applications, so a vast body of work has developed optimization algorithms to solve SVM with various soft-margin losses. To distinguish all, in this paper, we aim at solving an ideal soft-margin los...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 10 vom: 24. Okt., Seite 7253-7265
1. Verfasser:	Wang, Huajun (VerfasserIn)
Weitere Verfasser:	Shao, Yuanhai, Zhou, Shenglong, Zhang, Ce, Xiu, Naihua
Format:	Online-Aufsatz
Sprache:	English
Veröffentlicht:	2022
Zugriff auf das übergeordnete Werk:	IEEE transactions on pattern analysis and machine intelligence
Schlagworte:	Journal Article Research Support, Non-U.S. Gov't

Beschreibung
Zusammenfassung:	Support vector machines (SVM) have drawn wide attention for the last two decades due to its extensive applications, so a vast body of work has developed optimization algorithms to solve SVM with various soft-margin losses. To distinguish all, in this paper, we aim at solving an ideal soft-margin loss SVM: L0/1 soft-margin loss SVM (dubbed as L0/1-SVM). Many of the existing (non)convex soft-margin losses can be viewed as one of the surrogates of the L0/1 soft-margin loss. Despite its discrete nature, we manage to establish the optimality theory for the L0/1-SVM including the existence of the optimal solutions, the relationship between them and P-stationary points. These not only enable us to deliver a rigorous definition of L0/1 support vectors but also allow us to define a working set. Integrating such a working set, a fast alternating direction method of multipliers is then proposed with its limit point being a locally optimal solution to the L0/1-SVM. Finally, numerical experiments demonstrate that our proposed method outperforms some leading classification solvers from SVM communities, in terms of faster computational speed and a fewer number of support vectors. The bigger the data size is, the more evident its advantage appears
Beschreibung:	Date Completed 16.09.2022 Date Revised 22.12.2022 published: Print-Electronic Citation Status MEDLINE
ISSN:	1939-3539
DOI:	10.1109/TPAMI.2021.3092177