Learning Semantic Correspondence Exploiting an Object-Level Prior

We address the problem of semantic correspondence, that is, establishing a dense flow field between images depicting different instances of the same object or scene category. We propose to use images annotated with binary foreground masks and subjected to synthetic geometric deformations to train a...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 3, 15 March, pages 1399-1414
Main author: Lee, Junghyup (Author)
Other authors: Kim, Dohyung, Lee, Wonkyung, Ponce, Jean, Ham, Bumsub
Format: Online article
Language: English
Published: 2022
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM313260826
003 DE-627
005 20231225150210.0
007 cr uuu---uuuuu
008 231225s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2020.3013620  |2 doi 
028 5 2 |a pubmed24n1044.xml 
035 |a (DE-627)NLM313260826 
035 |a (NLM)32750842 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Lee, Junghyup  |e verfasserin  |4 aut 
245 1 0 |a Learning Semantic Correspondence Exploiting an Object-Level Prior 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 04.02.2022 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a We address the problem of semantic correspondence, that is, establishing a dense flow field between images depicting different instances of the same object or scene category. We propose to use images annotated with binary foreground masks and subjected to synthetic geometric deformations to train a convolutional neural network (CNN) for this task. Using these masks as part of the supervisory signal provides an object-level prior for the semantic correspondence task and offers a good compromise between semantic flow methods, where the amount of training data is limited by the cost of manually selecting point correspondences, and semantic alignment ones, where the regression of a single global geometric transformation between images may be sensitive to image-specific details such as background clutter. We propose a new CNN architecture, dubbed SFNet, which implements this idea. It leverages a new and differentiable version of the argmax function for end-to-end training, with a loss that combines mask and flow consistency with smoothness terms. Experimental results demonstrate the effectiveness of our approach, which significantly outperforms the state of the art on standard benchmarks 
650 4 |a Journal Article 
700 1 |a Kim, Dohyung  |e verfasserin  |4 aut 
700 1 |a Lee, Wonkyung  |e verfasserin  |4 aut 
700 1 |a Ponce, Jean  |e verfasserin  |4 aut 
700 1 |a Ham, Bumsub  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 44(2022), 3 vom: 15. März, Seite 1399-1414  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:44  |g year:2022  |g number:3  |g day:15  |g month:03  |g pages:1399-1414 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2020.3013620  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 44  |j 2022  |e 3  |b 15  |c 03  |h 1399-1414