
Curriculum Multi-Negative Augmentation for Debiased Video Grounding

Video Grounding (VG) aims to locate the desired segment in a video given a sentence query. Recent studies have found that current VG models are prone to over-rely on biases in the distribution of ground-truth moment annotations in the training set. To discourage the standard VG model from exploiting such temporal annotation biases and to improve its generalization ability, we propose multiple negative augmentations organized hierarchically, including cross-video augmentations at the clip and video levels, and self-shuffled augmentations with masks. These augmentations effectively diversify the data distribution so that the model makes more reasonable predictions instead of merely fitting the temporal biases. However, directly adopting such a data augmentation strategy inevitably introduces some noise, as our cases show, since not all of the handcrafted augmentations are semantically irrelevant to the ground-truth video. To further denoise and improve the grounding accuracy, we design a multi-stage curriculum strategy that adaptively trains the standard VG model from easy to hard negative augmentations. Experiments on the newly collected Charades-CD and ActivityNet-CD datasets demonstrate that our proposed strategy improves the performance of the base model in both i.i.d. and o.o.d. scenarios.
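The three kinds of negative augmentation and the easy-to-hard schedule named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of `None` as a mask placeholder, the difficulty ordering (video-level, then clip-level, then self-shuffled), and the fixed epoch-based schedule are all assumptions made for illustration. The abstract describes the curriculum as adaptive, which this fixed schedule does not reproduce. Videos are represented as plain lists of clip identifiers.

```python
import random

MASK = None  # hypothetical placeholder for a masked clip


def video_level_negative(other_video):
    """Negative sample: a whole different video paired with the original query."""
    return list(other_video)


def clip_level_negative(video, span, other_video, rng):
    """Replace the clips inside the annotated span with clips from another video."""
    start, end = span  # half-open interval [start, end)
    out = list(video)
    for i in range(start, end):
        out[i] = rng.choice(other_video)
    return out


def self_shuffled_negative(video, mask_ratio, rng):
    """Shuffle the clip order of the same video, then mask a fraction of clips."""
    out = list(video)
    rng.shuffle(out)
    n_mask = int(len(out) * mask_ratio)
    for i in rng.sample(range(len(out)), n_mask):
        out[i] = MASK
    return out


# Assumed easy-to-hard ordering; the abstract does not state it.
CURRICULUM = ["video_level", "clip_level", "self_shuffled"]


def allowed_negatives(epoch, epochs_per_stage):
    """Unlock one more negative type at each stage of a fixed schedule."""
    stage = min(epoch // epochs_per_stage, len(CURRICULUM) - 1)
    return CURRICULUM[: stage + 1]


if __name__ == "__main__":
    rng = random.Random(0)
    video = ["a0", "a1", "a2", "a3", "a4", "a5"]
    other = ["b0", "b1", "b2"]
    print(clip_level_negative(video, (2, 4), other, rng))
    print(self_shuffled_negative(video, 0.5, rng))
    print(allowed_negatives(epoch=3, epochs_per_stage=2))
```

In a training loop, each negative would be paired with the original query and the model penalized for predicting a confident moment on it. How that loss is defined is not specified in the abstract, so it is left out here.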

Association for the Advancement of Artificial Intelligence (AAAI)

Xiaohan Lan, Yitian Yuan, Hong Chen, Xin Wang, Zequn Jie, Lin Ma, Zhi Wang, Wenwu Zhu

Proceedings of the AAAI Conference on Artificial Intelligence

2023


