
VCP-CLIP+: Stabilizing and Optimizing VCP-CLIP with Minimal Architectural Changes

Zero-shot anomaly segmentation (ZSAS) has advanced significantly with the emergence of vision–language models such as CLIP. Among recent ZSAS approaches, VCP-CLIP introduced visual context prompting (VCP) and demonstrated strong zero-shot localization without class-specific training. Revisiting VCP-CLIP, however, we find that the framework leaves room for improvement. In this study, we upgrade VCP-CLIP with simple yet effective modifications designed to enhance pixel-level localization and image-level reliability. Specifically, we propose: (1) a fixed temperature scaling scheme that improves the consistency of similarity estimation and the stability of training; (2) a learnable anomaly map fusion scheme that adaptively aggregates anomaly cues from complementary branches; (3) an adaptive loss weighting mechanism that balances the segmentation objectives; and (4) an image-conditioned direct prompting module that injects visual context information directly into the text prompts. With minimal architectural changes, the upgraded model, dubbed VCP-CLIP+, improves on VCP-CLIP across the ZSAS benchmark datasets and outperforms other state-of-the-art CLIP-based ZSAS methods in both pixel-level and image-level anomaly detection.
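To make items (1) and (2) of the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that a patch is scored by a softmax over its cosine similarities to a "normal" and an "abnormal" text embedding, that the temperature is a fixed constant (0.07 here is only an illustrative value), and that two anomaly maps are fused with a single sigmoid-gated scalar that would be learned during training. The function names `anomaly_prob` and `fuse_maps` and the parameter `fusion_logit` are hypothetical; the abstract does not specify any of these details.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def anomaly_prob(patch, normal_text, abnormal_text, temperature=0.07):
    """Probability that a patch is anomalous.

    Assumption: a two-way softmax over the similarities to the normal and
    abnormal text embeddings, scaled by a FIXED (non-learned) temperature.
    """
    s_normal = cosine(patch, normal_text) / temperature
    s_abnormal = cosine(patch, abnormal_text) / temperature
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(s_normal, s_abnormal)
    e_normal = math.exp(s_normal - m)
    e_abnormal = math.exp(s_abnormal - m)
    return e_abnormal / (e_normal + e_abnormal)


def fuse_maps(map_a, map_b, fusion_logit):
    """Fuse two flattened anomaly maps with one scalar weight.

    Assumption: the weight is sigmoid(fusion_logit), where fusion_logit
    stands in for a parameter that training would update.
    """
    w = 1.0 / (1.0 + math.exp(-fusion_logit))
    return [w * a + (1.0 - w) * b for a, b in zip(map_a, map_b)]


# Usage: score two patches, then fuse two toy two-pixel maps.
normal_text = [0.0, 1.0]
abnormal_text = [1.0, 0.0]
p_defect = anomaly_prob([1.0, 0.0], normal_text, abnormal_text)
p_ambiguous = anomaly_prob([1.0, 1.0], normal_text, abnormal_text)
fused = fuse_maps([0.2, 0.8], [0.4, 0.6], fusion_logit=0.0)
```

With `fusion_logit=0.0` the two maps are averaged equally; a positive value would favor the first map and a negative value the second. A small fixed temperature sharpens the softmax, so a patch aligned with the abnormal embedding receives a probability close to 1, while a patch equally similar to both embeddings stays at 0.5.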

