Analysis of Inverse Reinforcement Learning with Perturbed Demonstrations
Inverse reinforcement learning (IRL) addresses the problem of recovering the unknown reward function for a given Markov decision problem (MDP) given the corresponding optimal policy or a perturbed version thereof. This paper studies the space of possible solutions to the general IRL problem, when the agent is provided with incomplete/imperfect information regarding the optimal policy for the MDP whose reward must be estimated. We focus on scenarios with finite state-action spaces and discuss the constraints imposed on the set of possible solutions when the agent is provided with (i) perturbed policies; (ii) optimal policies; and (iii) incomplete policies. We discuss previous works on IRL in light of our analysis and show that, with our characterization of the solution space, it is possible to determine non-trivial closed-form solutions for the IRL problem. We also discuss several other interesting aspects of the IRL problem that stem from our analysis.
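To make the notion of a "space of possible solutions" concrete, the following is a minimal sketch, not the paper's method. It assumes a toy two-state, two-action MDP with a state-only reward and a deterministic demonstrated policy; the names `policy_value`, `is_consistent`, and the transition table `P` are hypothetical and introduced only for illustration. The sketch tests whether a candidate reward makes the demonstrated policy optimal, and shows that several rewards, including the trivial all-zero reward, pass the test.

```python
def policy_value(P, r, policy, gamma, iters=2000):
    """Evaluate V^pi for a deterministic policy by fixed-point iteration.

    P[a][s][t] is the probability of moving from state s to t under action a
    (an assumed layout for this sketch); r[s] is a state-only reward.
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            r[s] + gamma * sum(P[policy[s]][s][t] * v[t] for t in range(n))
            for s in range(n)
        ]
    return v


def is_consistent(P, r, policy, gamma, tol=1e-9):
    """Return True if no single action improves on `policy` in any state,
    i.e. the policy is optimal for reward r in this finite MDP."""
    n = len(r)
    v = policy_value(P, r, policy, gamma)
    for s in range(n):
        for a in range(len(P)):
            q = r[s] + gamma * sum(P[a][s][t] * v[t] for t in range(n))
            if q > v[s] + tol:
                return False
    return True


# Toy MDP (assumed): action 0 stays in place, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
demonstrated = [1, 0]  # switch out of state 0, stay in state 1
gamma = 0.9

for reward in ([0.0, 1.0], [0.0, 5.0], [0.0, 0.0], [1.0, 0.0]):
    print(reward, is_consistent(P, reward, demonstrated, gamma))
```

In this toy example the first three rewards are all consistent with the demonstrated policy while the last is not, which illustrates why a characterization of the solution set is needed to single out non-trivial rewards.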

