Curriculum Multi-Negative Augmentation for Debiased Video Grounding
Video Grounding (VG) aims to locate the desired segment in a video given a sentence query. Recent studies have found that current VG models are prone to over-rely on the distribution biases of the ground-truth moment annotations in the training set. To discourage the standard VG model from exploiting such temporal annotation biases and to improve its generalization ability, we propose multiple negative augmentations organized hierarchically, including cross-video augmentations at the clip and video level, and self-shuffled augmentations with masks. These augmentations effectively diversify the data distribution so that the model can make more reasonable predictions instead of merely fitting the temporal biases. However, directly adopting such a data augmentation strategy may introduce some noise, as shown in our cases, since not all of the handcrafted augmentations are semantically irrelevant to the ground-truth video. To further denoise the augmentations and improve grounding accuracy, we design a multi-stage curriculum strategy that adaptively trains the standard VG model from easy to hard negative augmentations. Experiments on the newly collected Charades-CD and ActivityNet-CD datasets demonstrate that our proposed strategy improves the performance of the base model in both i.i.d. and o.o.d. scenarios.
Association for the Advancement of Artificial Intelligence (AAAI)
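The abstract names three kinds of negative augmentation and an easy-to-hard curriculum but gives no implementation details. The following is only a minimal sketch of one possible reading, not the authors' method. Every function name, the representation of a video as a list of clip features, the `mask_ratio` value, the `"<mask>"` token, and the stage order (video-level first, then clip-level, then self-shuffled) are assumptions made for illustration.

```python
import random

def video_level_negative(other_video):
    """Assumed video-level negative: swap in an entirely different video."""
    return list(other_video)

def clip_level_negative(video, other_video, start, end, rng):
    """Assumed clip-level negative: replace only the annotated clips
    [start, end) with clips sampled from another video."""
    negative = list(video)
    for i in range(start, end):
        negative[i] = rng.choice(other_video)
    return negative

def self_shuffled_negative(video, mask_ratio, mask_token, rng):
    """Assumed self-shuffled negative: shuffle the video's own clips,
    then overwrite a fraction of them with a mask token."""
    negative = list(video)
    rng.shuffle(negative)
    n_mask = int(len(negative) * mask_ratio)
    for i in rng.sample(range(len(negative)), n_mask):
        negative[i] = mask_token
    return negative

# Hypothetical easy-to-hard order; the abstract does not specify it.
CURRICULUM = ("video", "clip", "shuffle")

def negatives_for_stage(stage, video, other_video, start, end, rng):
    """Build one negative per augmentation type unlocked up to `stage` (0-based)."""
    builders = {
        "video": lambda: video_level_negative(other_video),
        "clip": lambda: clip_level_negative(video, other_video, start, end, rng),
        "shuffle": lambda: self_shuffled_negative(video, 0.25, "<mask>", rng),
    }
    return {name: builders[name]() for name in CURRICULUM[: stage + 1]}

if __name__ == "__main__":
    rng = random.Random(0)
    video = [f"a{i}" for i in range(8)]   # stand-ins for clip features
    other = [f"b{i}" for i in range(8)]
    for stage in range(len(CURRICULUM)):
        print(stage, negatives_for_stage(stage, video, other, 2, 5, rng))
```

In this sketch, later stages add harder negative types on top of the earlier ones. How the paper actually schedules or weights the augmentations cannot be determined from the abstract alone.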

