
EVJVQA CHALLENGE: MULTILINGUAL VISUAL QUESTION ANSWERING

Visual Question Answering (VQA) is a challenging task at the intersection of natural language processing (NLP) and computer vision (CV) that has attracted significant attention from researchers. English is a resource-rich language that has seen many developments in datasets and models for visual question answering. Resources and models for visual question answering in other languages still need to be developed. In addition, there is no multilingual dataset targeting the visual content of a particular country with its own objects and cultural characteristics. To address this gap, we provide the research community with a benchmark dataset named EVJVQA, which includes more than 33,000 question-answer pairs in three languages (Vietnamese, English, and Japanese) on approximately 5,000 images taken in Vietnam, for evaluating multilingual VQA systems or models. EVJVQA is used as the benchmark dataset for the multilingual visual question answering challenge at the 9th Workshop on Vietnamese Language and Speech Processing (VLSP 2022). The task attracted 62 participating teams from various universities and organizations. In this article, we present details of the organization of the challenge, an overview of the methods employed by shared-task participants, and the results. The highest scores on the private test set are 0.4392 in F1-score and 0.4009 in BLEU. The multilingual QA systems proposed by the top two teams use ViT as the pre-trained vision model and mT5, a powerful pre-trained language model based on the transformer architecture, as the pre-trained language model. EVJVQA is a challenging dataset that motivates NLP and CV researchers to further explore multilingual models and systems for visual question answering.
Publishing House for Science and Technology, Vietnam Academy of Science and Technology (Publications)
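
The abstract reports an F1-score for generated answers but does not say how it is computed. As a rough illustration only, the sketch below shows a token-overlap F1 of the kind commonly used for free-form question answering. It assumes whitespace tokenization and lowercasing, which would not suit Japanese without a proper tokenizer; the function names `token_f1` and `mean_f1` are hypothetical, and the challenge's official scoring script may differ.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs) -> float:
    """Average token_f1 over (prediction, reference) pairs."""
    scores = [token_f1(pred, ref) for pred, ref in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Under these assumptions, a prediction of "two red motorbikes" against the reference "two motorbikes" has precision 2/3 and recall 1, so its F1 is 0.8. Averaging such per-answer scores over a test set gives a single figure comparable in form to the one reported above.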
