LEADER |
01000caa a22002652 4500 |
001 |
NLM371514096 |
003 |
DE-627 |
005 |
20240501233332.0 |
007 |
cr uuu---uuuuu |
008 |
240426s2024 xx |||||o 00| ||eng c |
024 |
7 |
|
|a 10.1109/TIP.2024.3364500
|2 doi
|
028 |
5 |
2 |
|a pubmed24n1394.xml
|
035 |
|
|
|a (DE-627)NLM371514096
|
035 |
|
|
|a (NLM)38662557
|
040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rakwb
|
041 |
|
|
|a eng
|
100 |
1 |
|
|a Pu, Yifan
|e verfasserin
|4 aut
|
245 |
1 |
0 |
|a Fine-Grained Recognition With Learnable Semantic Data Augmentation
|
264 |
|
1 |
|c 2024
|
336 |
|
|
|a Text
|b txt
|2 rdacontent
|
337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
500 |
|
|
|a Date Revised 01.05.2024
|
500 |
|
|
|a published: Print-Electronic
|
500 |
|
|
|a Citation Status PubMed-not-MEDLINE
|
520 |
|
|
|a Fine-grained image recognition is a longstanding computer vision challenge that focuses on differentiating objects belonging to multiple subordinate categories within the same meta-category. Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories. Although commonly used image-level data augmentation techniques have achieved great success in generic image classification problems, they are rarely applied in fine-grained scenarios, because their random editing-region behavior is prone to destroying the discriminative visual cues residing in the subtle regions. In this paper, we propose diversifying the training data at the feature level to alleviate the discriminative region loss problem. Specifically, we produce diversified augmented samples by translating image features along semantically meaningful directions. The semantic directions are estimated with a covariance prediction network, which predicts a sample-wise covariance matrix to adapt to the large intra-class variation inherent in fine-grained images. Furthermore, the covariance prediction network is jointly optimized with the classification network in a meta-learning manner to alleviate the degenerate solution problem. Experiments on four competitive fine-grained recognition benchmarks (CUB-200-2011, Stanford Cars, FGVC Aircrafts, NABirds) demonstrate that our method significantly improves the generalization performance on several popular classification networks (e.g., ResNets, DenseNets, EfficientNets, RegNets and ViT). Combined with a recently proposed method, our semantic data augmentation approach achieves state-of-the-art performance on the CUB-200-2011 dataset. Source code is available at https://github.com/LeapLabTHU/LearnableISDA
|
650 |
|
4 |
|a Journal Article
|
700 |
1 |
|
|a Han, Yizeng
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Wang, Yulin
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Feng, Junlan
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Deng, Chao
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Huang, Gao
|e verfasserin
|4 aut
|
773 |
0 |
8 |
|i Contained in
|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
|d 1992
|g 33(2024) from: 01., pages 3130-3144
|w (DE-627)NLM09821456X
|x 1941-0042
|7 nnns
|
773 |
1 |
8 |
|g volume:33
|g year:2024
|g day:01
|g pages:3130-3144
|
856 |
4 |
0 |
|u http://dx.doi.org/10.1109/TIP.2024.3364500
|3 Full text
|
912 |
|
|
|a GBV_USEFLAG_A
|
912 |
|
|
|a SYSFLAG_A
|
912 |
|
|
|a GBV_NLM
|
912 |
|
|
|a GBV_ILN_350
|
951 |
|
|
|a AR
|
952 |
|
|
|d 33
|j 2024
|b 01
|h 3130-3144
|