|
|
|
|
| LEADER |
01000caa a22002652c 4500 |
| 001 |
NLM315921331 |
| 003 |
DE-627 |
| 005 |
20250228041138.0 |
| 007 |
cr uuu---uuuuu |
| 008 |
231225s2022 xx |||||o 00| ||eng c |
| 024 |
7 |
|
|a 10.1109/TPAMI.2020.3029008
|2 doi
|
| 028 |
5 |
2 |
|a pubmed25n1052.xml
|
| 035 |
|
|
|a (DE-627)NLM315921331
|
| 035 |
|
|
|a (NLM)33021939
|
| 040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rakwb
|
| 041 |
|
|
|a eng
|
| 100 |
1 |
|
|a Plummer, Bryan A
|e verfasserin
|4 aut
|
| 245 |
1 |
0 |
|a Revisiting Image-Language Networks for Open-Ended Phrase Detection
|
| 264 |
|
1 |
|c 2022
|
| 336 |
|
|
|a Text
|b txt
|2 rdacontent
|
| 337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
| 338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
| 500 |
|
|
|a Date Completed 28.03.2022
|
| 500 |
|
|
|a Date Revised 01.04.2022
|
| 500 |
|
|
|a published: Print-Electronic
|
| 500 |
|
|
|a Citation Status MEDLINE
|
| 520 |
|
|
|a Most existing work that grounds natural language phrases in images starts with the assumption that the phrase in question is relevant to the image. In this paper we address a more realistic version of the natural language grounding task where we must both identify whether the phrase is relevant to an image and localize the phrase. This can also be viewed as a generalization of object detection to an open-ended vocabulary, introducing elements of few- and zero-shot detection. We propose an approach for this task that extends Faster R-CNN to relate image regions and phrases. By carefully initializing the classification layers of our network using canonical correlation analysis (CCA), we encourage a solution that is more discerning when reasoning between similar phrases, resulting in over double the performance compared to a naive adaptation on three popular phrase grounding datasets, Flickr30K Entities, ReferIt Game, and Visual Genome, with test-time phrase vocabulary sizes of 5K, 32K, and 159K, respectively.
|
| 650 |
|
4 |
|a Journal Article
|
| 650 |
|
4 |
|a Research Support, U.S. Gov't, Non-P.H.S.
|
| 700 |
1 |
|
|a Shih, Kevin J
|e verfasserin
|4 aut
|
| 700 |
1 |
|
|a Li, Yichen
|e verfasserin
|4 aut
|
| 700 |
1 |
|
|a Xu, Ke
|e verfasserin
|4 aut
|
| 700 |
1 |
|
|a Lazebnik, Svetlana
|e verfasserin
|4 aut
|
| 700 |
1 |
|
|a Sclaroff, Stan
|e verfasserin
|4 aut
|
| 700 |
1 |
|
|a Saenko, Kate
|e verfasserin
|4 aut
|
| 773 |
0 |
8 |
|i Enthalten in
|t IEEE transactions on pattern analysis and machine intelligence
|d 1979
|g 44(2022), 4 vom: 06. Apr., Seite 2155-2167
|w (DE-627)NLM098212257
|x 1939-3539
|7 nnas
|
| 773 |
1 |
8 |
|g volume:44
|g year:2022
|g number:4
|g day:06
|g month:04
|g pages:2155-2167
|
| 856 |
4 |
0 |
|u http://dx.doi.org/10.1109/TPAMI.2020.3029008
|3 Volltext
|
| 912 |
|
|
|a GBV_USEFLAG_A
|
| 912 |
|
|
|a SYSFLAG_A
|
| 912 |
|
|
|a GBV_NLM
|
| 912 |
|
|
|a GBV_ILN_350
|
| 951 |
|
|
|a AR
|
| 952 |
|
|
|d 44
|j 2022
|e 4
|b 06
|c 04
|h 2155-2167
|