A review of semi-supervised learning for text classification

© The Author(s), under exclusive licence to Springer Nature B.V. 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript v...

Detailed description

Bibliographic details
Published in: Artificial intelligence review. - 1998. - (2023), 31 Jan., pages 1-69
Main author: Duarte, José Marcio (Author)
Other authors: Berton, Lilian
Format: Online article
Language: English
Published: 2023
Access to the parent work: Artificial intelligence review
Subjects: Journal Article; Machine learning; Natural language processing; Semi-supervised learning; Text classification
LEADER 01000caa a22002652 4500
001 NLM352532718
003 DE-627
005 20240911232253.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1007/s10462-023-10393-8  |2 doi 
028 5 2 |a pubmed24n1530.xml 
035 |a (DE-627)NLM352532718 
035 |a (NLM)36743267 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Duarte, José Marcio  |e verfasserin  |4 aut 
245 1 2 |a A review of semi-supervised learning for text classification 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 11.09.2024 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a © The Author(s), under exclusive licence to Springer Nature B.V. 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. 
520 |a A huge amount of data is generated daily, leading to big data challenges. One of them is related to text mining, especially text classification. To perform this task, we usually need a large set of labeled data, which can be expensive, time-consuming, or difficult to obtain. Considering this scenario, semi-supervised learning (SSL), the branch of machine learning concerned with using labeled and unlabeled data, has expanded in volume and scope. Since no recent survey exists to overview how SSL has been used in text classification, we aim to fill this gap and present an up-to-date review of SSL for text classification. We retrieved 1794 works from the last 5 years from IEEE Xplore, ACM Digital Library, Science Direct, and Springer. Then, 157 articles were selected to be included in this review. We present the application domains, datasets, and languages employed in the works, as well as the text representations and machine learning algorithms. We also summarize and organize the works following a recent taxonomy of SSL. We analyze the percentage of labeled data used, the evaluation metrics, and the results obtained. Lastly, we present some limitations and future trends in the area. We aim to provide researchers and practitioners with an outline of the area as well as useful information for their current research. 
650 4 |a Journal Article 
650 4 |a Machine learning 
650 4 |a Natural language processing 
650 4 |a Semi-supervised learning 
650 4 |a Text classification 
700 1 |a Berton, Lilian  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Artificial intelligence review  |d 1998  |g (2023) vom: 31. Jan., Seite 1-69  |w (DE-627)NLM098184490  |x 0269-2821  |7 nnns 
773 1 8 |g year:2023  |g day:31  |g month:01  |g pages:1-69 
856 4 0 |u http://dx.doi.org/10.1007/s10462-023-10393-8  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |j 2023  |b 31  |c 01  |h 1-69