MoIL: Momentum Imitation Learning for Efficient Vision-Language Adaptation

Pre-training and fine-tuning have been the de facto paradigm in vision-language domains. Along with the rapid growth of model sizes, fully fine-tuning these large-scale vision-language pre-training (VLP) models incurs prohibitively expensive storage costs. To address this issue, recent advances in...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), dated 30 July
Main author: Luo, Gen (Author)
Other authors: Zhou, Yiyi; Huang, Minglang; Ren, Tianhe; Sun, Xiaoshuai; Ji, Rongrong
Format: Online article
Language: English
Published: 2024
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article