Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation
A central concern with deep reinforcement learning (RL) agents is whether they can be trusted to perform reliably once deployed, since training environments may not reflect real-life environments. To anticipate inputs outside their training scope, learning-enabled systems are often equipped with out-of-distribution (OOD) detectors that raise an alert when a trained system encounters a state it does not recognize or in which it exhibits uncertainty. Little work has addressed OOD detection in RL, and prior studies have not reached a consensus on how to define OOD execution in this setting. Framing the problem as a Markov Decision Process, we assume a transition distribution that maps each state-action pair to a next state with some probability. On this basis, we adopt the following definition of OOD execution in RL: a transition is OOD if its probability during real-life deployment differs from its probability under the transition distribution encountered during training. We therefore use conditional variational autoencoders (CVAEs) to approximate the transition dynamics of the training environment and implement a conformity-based detector on the reconstruction loss that guarantees OOD detection with a pre-determined confidence level. We evaluate our detector by adapting existing benchmarks and compare it with existing OOD detection models for RL.
Association for the Advancement of Artificial Intelligence (AAAI)
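
The abstract describes a conformity-based detector that thresholds a reconstruction loss at a pre-determined confidence level. The sketch below illustrates that general idea with standard split-conformal calibration. It is not the authors' implementation. The CVAE is replaced by a hypothetical stand-in predictor (`predict_next`), the toy dynamics in `sample_transitions` are invented for illustration, and all function names are assumptions. Under the usual exchangeability assumption, a threshold calibrated this way bounds the false-alarm rate on in-distribution transitions by `alpha`.

```python
import numpy as np


def predict_next(state, action):
    # Hypothetical stand-in for a learned transition model (a CVAE in the
    # paper); here it simply encodes the toy training dynamics s' = s + a.
    return state + action


def nonconformity(state, action, next_state):
    # Reconstruction-style score: squared error between the observed next
    # state and the model's prediction, one score per transition.
    return np.sum((next_state - predict_next(state, action)) ** 2, axis=-1)


def calibrate_threshold(cal_scores, alpha):
    # Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest
    # calibration score. Returns +inf when n is too small for this alpha.
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return scores[k - 1]


def sample_transitions(rng, n, shift=0.0):
    # Toy environment (an assumption for this sketch): s' = s + a + noise.
    # A nonzero `shift` changes the dynamics, simulating deployment drift.
    state = rng.normal(size=(n, 2))
    action = rng.uniform(-1.0, 1.0, size=(n, 2))
    noise = rng.normal(scale=0.1, size=(n, 2))
    return state, action, state + action + shift + noise


rng = np.random.default_rng(0)
alpha = 0.05  # target false-alarm level on in-distribution transitions

# Calibrate on held-out transitions drawn from the training dynamics.
threshold = calibrate_threshold(nonconformity(*sample_transitions(rng, 2000)), alpha)

# Flag a transition as OOD when its score exceeds the calibrated threshold.
in_dist_scores = nonconformity(*sample_transitions(rng, 2000))
shifted_scores = nonconformity(*sample_transitions(rng, 2000, shift=1.0))
false_alarm_rate = float(np.mean(in_dist_scores > threshold))
detection_rate = float(np.mean(shifted_scores > threshold))
print(f"false alarms: {false_alarm_rate:.3f}, detections: {detection_rate:.3f}")
```

The calibration step needs only scores from held-out training-environment transitions, so any transition model that yields a per-transition loss could be substituted for `predict_next`.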