TempCo-Painter: Temporal Consistency Enhanced Painter with Adaptive Diffusion Transformers for Long Video Inpainting

Video inpainting, a critical task in computer vision, aims to plausibly fill missing regions in video sequences while maintaining both spatial realism and robust spatio-temporal consistency. Current methods often struggle with ultra-long videos and highly dynamic occlusions, and they rarely achieve strong temporal coherence efficiently, which leads to common artifacts. To address these challenges, we propose TempCo-Painter: Temporal Consistency Enhanced Painter with Adaptive Diffusion Transformers. Our framework uses a specialized 3D-VAE for efficient latent space compression and introduces an Adaptive Diffusion Transformer (ADiT). ADiT integrates hierarchical spatial-temporal attention, a motion-guided attention mechanism for accurate restoration of dynamic content, and dynamic mask awareness for robust handling of diverse occlusions. An efficient Flow Matching scheduler further enables TempCo-Painter to generate high-quality results with minimal denoising steps. For processing arbitrarily long videos, we introduce an enhanced MultiDiffusion strategy featuring an adaptive sliding window and temporal smoothing regularization to ensure seamless global consistency. Extensive experiments demonstrate that TempCo-Painter achieves state-of-the-art performance on standard short video benchmarks, significantly outperforming existing methods in PSNR and SSIM and notably reducing Video Fréchet Inception Distance. Furthermore, it exhibits superior robustness and coherence on challenging minute-level long videos and complex mask scenarios, while maintaining high inference efficiency.
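The abstract does not give implementation details for the long-video strategy, so the following is only a minimal sketch of the general idea behind overlapping-window processing in the MultiDiffusion style: each window is processed independently and overlapping frames are merged by a weighted average. It is not the authors' method. The function names (`window_starts`, `triangular_weights`, `blend_windows`), the fixed window and stride, the triangular weighting, and the use of one float per frame as a stand-in for a latent are all assumptions made for illustration; the adaptive window sizing and temporal smoothing regularization mentioned above are not modeled.

```python
from typing import Callable, List, Sequence


def window_starts(num_frames: int, window: int, stride: int) -> List[int]:
    """Start indices of overlapping windows that together cover every frame."""
    if window >= num_frames:
        return [0]
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        # Add a final window flush with the end so no trailing frame is missed.
        starts.append(num_frames - window)
    return starts


def triangular_weights(window: int) -> List[float]:
    """Weights that peak mid-window, so frames near a window border count less."""
    peak = (window + 1) // 2
    return [min(i + 1, window - i) / peak for i in range(window)]


def blend_windows(
    frames: Sequence[float],
    process_window: Callable[[Sequence[float]], Sequence[float]],
    window: int = 8,
    stride: int = 4,
) -> List[float]:
    """Run process_window on overlapping windows and average the overlaps.

    `frames` holds one float per frame as a stand-in for a per-frame latent;
    `process_window` is a hypothetical placeholder for a per-window model call.
    """
    n = len(frames)
    if n == 0:
        return []
    window = min(window, n)
    if stride < 1 or stride > window:
        raise ValueError("stride must be between 1 and the window length")
    weights = triangular_weights(window)
    acc = [0.0] * n   # weighted sum of window outputs per frame
    norm = [0.0] * n  # sum of weights per frame
    for start in window_starts(n, window, stride):
        out = process_window(frames[start:start + window])
        for i, value in enumerate(out):
            acc[start + i] += weights[i] * value
            norm[start + i] += weights[i]
    return [a / w for a, w in zip(acc, norm)]


if __name__ == "__main__":
    video = [float(t) for t in range(20)]
    # An identity "model" should reproduce the input after blending.
    print(blend_windows(video, lambda chunk: list(chunk), window=8, stride=4))
```

Weighting the window centers more heavily than the borders is one common way to soften seams between neighboring windows; other weightings (uniform, Gaussian) would fit the same structure.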

MDPI AG

Ruohan Qi, Tianhao Nian

2026
