CE-BART: Cause-and-Effect BART for Visual Commonsense Generation
“A picture is worth a thousand words.” Given an image, humans can infer cause-and-effect descriptions of past, current, and future events that extend beyond what the image shows. Visual commonsense generation aims to produce three such cause-and-effect captions for a given image: (1) what needed to happen before, (2) what the current intent is, and (3) what will happen after. This task remains challenging for machines because existing approaches have two limitations: they (1) directly use conventional vision–language transformers to learn relationships between input modalities and (2) generate each caption independently, ignoring the relations among the target cause-and-effect captions. We propose Cause-and-Effect BART (CE-BART), which is based on (1) a structured graph reasoner that captures intra- and inter-modality relationships among visual and textual representations and (2) a cause-and-effect generator that produces the captions by considering the causal relations among inferences. We evaluate CE-BART on the VisualCOMET and AVSD benchmarks. CE-BART achieved state-of-the-art (SOTA) performance on both benchmarks, and an extensive ablation study and qualitative analysis demonstrated the performance gain and improved interpretability.


