APA citation style: Deng, J., Yang, Z., Liu, D., Chen, T., Zhou, W., Zhang, Y., . . . Ouyang, W. (2023). TransVG++: End-to-End Visual Grounding With Language Conditioned Vision Transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11), 13636. https://doi.org/10.1109/TPAMI.2023.3296823
Chicago citation style: Deng, Jiajun, Zhengyuan Yang, Daqing Liu, Tianlang Chen, Wengang Zhou, Yanyong Zhang, Houqiang Li, and Wanli Ouyang. "TransVG++: End-to-End Visual Grounding With Language Conditioned Vision Transformer." IEEE Transactions on Pattern Analysis and Machine Intelligence 45, no. 11 (2023): 13636. https://dx.doi.org/10.1109/TPAMI.2023.3296823.
MLA citation style: Deng, Jiajun, et al. "TransVG++: End-to-End Visual Grounding With Language Conditioned Vision Transformer." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 11, 2023, p. 13636.