MoIL: Momentum Imitation Learning for Efficient Vision-Language Adaptation

Pre-training and fine-tuning have been the de facto paradigm in vision-language domains. With the rapid growth of model sizes, fully fine-tuning these large-scale vision-language pre-training (VLP) models incurs prohibitively expensive storage costs. To address this issue, recent advances in...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), 30 July
Main Author: Luo, Gen (Author)
Other Authors: Zhou, Yiyi; Huang, Minglang; Ren, Tianhe; Sun, Xiaoshuai; Ji, Rongrong
Format: Online Article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article