M3D: a Multimodal, Multilingual and Multitask Dataset for Grounded Document-level Information Extraction

Multimodal information extraction (IE) tasks have attracted increasing attention because many studies have shown that multimodal information benefits text information extraction. However, existing multimodal IE datasets mainly focus on sentence-level image-facilitated IE in English text, and pay lit...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2025), 11 Sept.
Main author: Liu, Jiang (Author)
Other authors: Li, Bobo; Yang, Xinran; Yang, Na; Fei, Hao; Zhang, Mingyao; Li, Fei; Ji, Donghong
Format: Online article
Language: English
Published: 2025
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article