Feb 7, 2022 · OFA unifies a diverse set of cross-modal and unimodal tasks, including image generation, visual grounding, image captioning, image classification, and language modeling.
Abstract. In this work, we pursue a unified paradigm for multimodal pretraining to break the shackles of complex task/modality-specific customization.
OFA is a unified sequence-to-sequence pretrained model (supporting English and Chinese) that unifies modalities (i.e., cross-modality, vision, language) and tasks.
OFA is proposed as a task-agnostic and modality-agnostic framework that supports task comprehensiveness and achieves new state-of-the-art results on a series of cross-modal tasks.
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework. OFA (One-For-All) is task-agnostic and modality-agnostic.
OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image generation, visual grounding, image captioning).
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework. Jul 19, 2022.
Feb 8, 2022 · The authors propose OFA, a unified multimodal pretrained model that expresses every modality and task in a single sequence-to-sequence framework.
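The unification described above works by casting every task as an instruction-plus-input sequence fed to one sequence-to-sequence model. A minimal sketch of that idea, assuming illustrative prompt templates and a placeholder `<image:…>` token (the exact templates and image handling here are hypothetical, not OFA's actual implementation):

```python
# Hypothetical sketch: one instruction-building function for all tasks,
# mirroring how an OFA-style model receives different tasks as text prompts.
# Template wording and the <image:...> placeholder are assumptions for
# illustration, not OFA's real tokenization.

def build_sequence(task: str, **inputs) -> str:
    """Map a task name plus its inputs to a single instruction sequence."""
    templates = {
        "caption": "what does the image describe? <image:{image}>",
        "vqa": "{question} <image:{image}>",
        "grounding": 'which region does the text "{text}" describe? <image:{image}>',
    }
    return templates[task].format(**inputs)

# Every task becomes plain sequence-to-sequence input for the same model:
print(build_sequence("caption", image="img_001"))
print(build_sequence("grounding", image="img_001", text="a red car"))
```

Because all tasks share one input/output format, no task-specific heads or architectural customization are needed — which is the point of the "One-For-All" framing in the snippets above.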