
Universal Adversarial Purification with DDIM Metric Loss for Stable Diffusion

Stable Diffusion (SD) often produces degraded outputs when its training dataset contains adversarial noise. Adversarial purification offers a promising solution by removing that noise from contaminated data. However, existing purification methods are designed primarily for classification tasks and fail to address SD-specific adversarial strategies, such as attacks targeting the VAE encoder, the UNet denoiser, or both. To address this gap in SD security, we propose Universal Diffusion Adversarial Purification (UDAP), a novel framework tailored to defending against adversarial attacks that target SD models. UDAP leverages the distinct reconstruction behaviors of clean and adversarial images during Denoising Diffusion Implicit Models (DDIM) inversion to optimize the purification process. By minimizing the DDIM metric loss, UDAP can effectively remove adversarial noise. Additionally, we introduce a dynamic epoch adjustment strategy that adapts the number of optimization iterations to the reconstruction error, significantly improving efficiency without sacrificing purification quality. Experiments demonstrate UDAP’s robustness against diverse adversarial methods, including PID (VAE-targeted), Anti-DreamBooth (UNet-targeted), MIST (hybrid), and robustness-enhanced variants such as Anti-Diffusion (Anti-DF) and MetaCloak. UDAP also generalizes well across SD versions and text prompts, showcasing its practical applicability in real-world scenarios.
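The abstract names two mechanisms without giving formulas: a reconstruction error measured through DDIM inversion, and an iteration count that adapts to that error. The following is a minimal Python sketch of both ideas, not the authors' implementation. The deterministic DDIM update is the standard one; everything else is an assumption made for illustration: `toy_eps` is a hypothetical stand-in for SD's UNet noise predictor, the vectors are plain lists rather than image latents, and `iteration_budget` with its parameters `err_ref`, `base_iters`, and `max_iters` is one invented rule for turning an error into an iteration count.

```python
import math


def toy_eps(x, a):
    # Hypothetical stand-in for the UNet noise predictor (NOT the real model).
    return [math.sqrt(1.0 - a) * math.tanh(v) for v in x]


def ddim_move(x, a_from, a_to):
    # One deterministic DDIM update between cumulative-alpha levels.
    # The noise is predicted at the current level (a_from) in both directions,
    # which is why inversion followed by sampling is only approximate.
    eps = toy_eps(x, a_from)
    x0 = [(v - math.sqrt(1.0 - a_from) * e) / math.sqrt(a_from)
          for v, e in zip(x, eps)]
    return [math.sqrt(a_to) * p + math.sqrt(1.0 - a_to) * e
            for p, e in zip(x0, eps)]


def reconstruction_error(x, alphas):
    # alphas: decreasing cumulative alphas, each in (0, 1].
    z = list(x)
    for a_from, a_to in zip(alphas, alphas[1:]):   # inversion (toward noise)
        z = ddim_move(z, a_from, a_to)
    rev = alphas[::-1]
    for a_from, a_to in zip(rev, rev[1:]):         # sampling (back toward data)
        z = ddim_move(z, a_from, a_to)
    # Root-mean-square distance between the input and its round trip.
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x, z)) / len(x))


def iteration_budget(err, err_ref, base_iters=20, max_iters=100):
    # Hypothetical rule: scale the optimization steps with err / err_ref,
    # clamped to [1, max_iters]. err_ref must be positive.
    scaled = math.ceil(base_iters * err / err_ref)
    return max(1, min(max_iters, scaled))


alphas = [0.999, 0.9, 0.7, 0.5]
err = reconstruction_error([0.5, -1.0, 2.0], alphas)
steps = iteration_budget(err, err_ref=1.0)
```

In this sketch an input with a larger round-trip error receives more optimization steps, and one that already reconstructs well receives fewer, which is the efficiency argument the abstract makes for dynamic epoch adjustment.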

Association for the Advancement of Artificial Intelligence (AAAI)

Li Zheng, Liangbin Xie, Jiantao Zhou, He YiMin

Proceedings of the AAAI Conference on Artificial Intelligence

2026

