Fine-Grained Recognition With Learnable Semantic Data Augmentation

Fine-grained image recognition is a longstanding computer vision challenge that focuses on differentiating objects belonging to multiple subordinate categories within the same meta-category. Since images belonging to the same meta-category usually share similar visual appearances, mining discriminat...
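The full abstract in this record describes the core idea as translating image features along semantically meaningful directions drawn from a sample-wise covariance matrix. The following is a minimal NumPy sketch of that sampling step only, not the authors' implementation: the function name `augment_features`, the `strength` parameter, and the random low-rank stand-in for the covariance prediction network are all assumptions made for illustration, and the meta-learning optimization is omitted.

```python
import numpy as np


def augment_features(features, covariances, strength=0.5, rng=None):
    """Shift each feature vector by a draw from N(0, strength * covariance).

    features:    array of shape (n, d), one feature vector per sample.
    covariances: array of shape (n, d, d), one PSD matrix per sample.
    `strength` is a hypothetical scaling knob, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = features.shape
    # A small diagonal jitter keeps the Cholesky factorization well defined
    # even when a covariance matrix is only positive semi-definite.
    jitter = 1e-6 * np.eye(d)
    chol = np.linalg.cholesky(strength * covariances + jitter)  # (n, d, d)
    noise = rng.standard_normal((n, d, 1))
    shift = (chol @ noise)[..., 0]  # (n, d), covariance ~ strength * cov
    return features + shift


def placeholder_covariances(n, d, rank=4, rng=None):
    """Random low-rank PSD matrices standing in for a learned predictor."""
    rng = np.random.default_rng() if rng is None else rng
    factors = rng.standard_normal((n, d, rank))
    return factors @ factors.transpose(0, 2, 1)


rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))
covs = placeholder_covariances(8, 16, rng=rng)
augmented = augment_features(feats, covs, rng=rng)
```

In a real pipeline the covariance matrices would come from a trained network conditioned on each sample, and the augmented features would be passed to the classifier head in place of the originals.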


Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 33(2024), day: 01, pages 3130-3144
Main author: Pu, Yifan (Author)
Other authors: Han, Yizeng, Wang, Yulin, Feng, Junlan, Deng, Chao, Huang, Gao
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM371514096
003 DE-627
005 20240501233332.0
007 cr uuu---uuuuu
008 240426s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2024.3364500  |2 doi 
028 5 2 |a pubmed24n1394.xml 
035 |a (DE-627)NLM371514096 
035 |a (NLM)38662557 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Pu, Yifan  |e verfasserin  |4 aut 
245 1 0 |a Fine-Grained Recognition With Learnable Semantic Data Augmentation 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 01.05.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Fine-grained image recognition is a longstanding computer vision challenge that focuses on differentiating objects belonging to multiple subordinate categories within the same meta-category. Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories. Although commonly used image-level data augmentation techniques have achieved great success in generic image classification problems, they are rarely applied in fine-grained scenarios, because their random editing-region behavior is prone to destroy the discriminative visual cues residing in the subtle regions. In this paper, we propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem. Specifically, we produce diversified augmented samples by translating image features along semantically meaningful directions. The semantic directions are estimated with a covariance prediction network, which predicts a sample-wise covariance matrix to adapt to the large intra-class variation inherent in fine-grained images. Furthermore, the covariance prediction network is jointly optimized with the classification network in a meta-learning manner to alleviate the degenerate solution problem. Experiments on four competitive fine-grained recognition benchmarks (CUB-200-2011, Stanford Cars, FGVC Aircrafts, NABirds) demonstrate that our method significantly improves the generalization performance on several popular classification networks (e.g., ResNets, DenseNets, EfficientNets, RegNets and ViT). Combined with a recently proposed method, our semantic data augmentation approach achieves state-of-the-art performance on the CUB-200-2011 dataset. Source code is available at https://github.com/LeapLabTHU/LearnableISDA 
650 4 |a Journal Article 
700 1 |a Han, Yizeng  |e verfasserin  |4 aut 
700 1 |a Wang, Yulin  |e verfasserin  |4 aut 
700 1 |a Feng, Junlan  |e verfasserin  |4 aut 
700 1 |a Deng, Chao  |e verfasserin  |4 aut 
700 1 |a Huang, Gao  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 33(2024) vom: 01., Seite 3130-3144  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:33  |g year:2024  |g day:01  |g pages:3130-3144 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2024.3364500  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 33  |j 2024  |b 01  |h 3130-3144