
A Sampling-Based Gittins Index Approximation

A sampling-based method is introduced to approximate the Gittins index for a general family of alternative bandit processes. The approximation consists of a truncation of the optimization horizon and support for the immediate rewards, an optimal stopping value approximation, and a stochastic approximation procedure. Finite-time error bounds are given for the three approximations, leading to a procedure to construct a confidence interval for the Gittins index using a finite number of Monte Carlo samples as well as an epsilon-optimal policy for the family of alternative bandit processes. Proofs are given for almost sure convergence and a central limit theorem for the sampling-based Gittins index approximation. In a numerical study, the quality of the approximation is verified for the Bernoulli bandit and the Gaussian bandit with known variance, and the method is shown to significantly outperform Thompson sampling and the Bayesian upper-confidence-bound algorithms for a novel random effects multi-armed bandit.
Institute for Operations Research and the Management Sciences (INFORMS)
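
The abstract describes the index as the result of a horizon truncation combined with an optimal stopping value. As a rough illustration of those two ingredients only, here is a minimal sketch for a Bernoulli arm with a Beta(a, b) posterior. It is not the paper's sampling-based procedure: it uses the classical calibration view of the Gittins index (the per-pull charge at which pulling and then stopping optimally is worth exactly zero), computed by exact dynamic programming and bisection. The function name `gittins_bernoulli` and the parameters `gamma`, `horizon`, and `tol` are assumptions made for this example.

```python
from functools import lru_cache


def gittins_bernoulli(a, b, gamma=0.9, horizon=50, tol=1e-6):
    """Approximate Gittins index of a Bernoulli arm with a Beta(a, b) posterior.

    Illustrative assumptions: rewards lie in [0, 1], the discount factor is
    `gamma`, and the arm may be pulled at most `horizon` times (truncation).
    """

    def net_value(lam):
        # Value of pulling once and then stopping optimally, when every pull
        # pays (reward - lam) and stopping pays 0.
        @lru_cache(maxsize=None)
        def stop_or_go(s, f):
            # s successes and f failures observed so far.
            if s + f >= horizon:
                return 0.0  # truncated horizon: forced to stop
            return max(0.0, pull(s, f))

        def pull(s, f):
            p = (a + s) / (a + b + s + f)  # posterior mean of the success rate
            future = p * stop_or_go(s + 1, f) + (1 - p) * stop_or_go(s, f + 1)
            return p - lam + gamma * future

        return pull(0, 0)

    # net_value is decreasing in lam, so bisection on [0, 1] finds its root.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_value(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # With horizon=1 the index reduces to the posterior mean a / (a + b).
    print(gittins_bernoulli(1, 1, horizon=1))
    print(gittins_bernoulli(1, 1, horizon=50))
```

Under this truncation the computed value is a lower bound on the infinite-horizon index that tightens as `horizon` grows. The exact recursion used here is only tractable for conjugate cases like this one, which is where a sampling-based approximation becomes relevant.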

Related Results

Learning Theory and Approximation
The workshop Learning Theory and Approximation, organised by Kurt Jetter (Stuttgart-Hohenheim), Steve Smale (Berkeley) and Ding-Xuan Zhou (...
A note on Gittins indices for pharmaceutical research
The Bernoulli/exponential target process is considered. Such processes have been found useful in modelling the search for active compounds in pharmaceutical research. An inequality...
Analysis of fracture problems of airport pavement by improved element-free Galerkin method
Using the improved element-free Galerkin (IEFG) method, in this paper we introduce the characteristic parameter r which can reflect the singular stress near the crack tip into the ...
Generalized Jacobi Chebyshev Wavelet Approximation
General Background: Wavelet approximations are fundamental in numerical analysis and signal processing, with classical orthogonal polynomials like Jacobi and Chebyshev serving as k...
Optimal with Respect to Accuracy Recovery of Some Classes Functions by Fourier Series
Introduction. Function approximation (approximation or restoration) is widely used in data analysis, model building, and forecasting. The goal of function approximation is to find ...
Frequency-limited model approximation of large-scale dynamical models
Approximation of large-scale dynamical models over limited frequency intervals. Physical systems are represented by mathematical models ...
Efficient by Precision Algorithms for Approximating Functions from Some Classes by Fourier Series
Introduction. The problem of approximation can be considered as the basis of computational methods, namely, the approximation of individual functions or classes of functions by fun...
Taylor approximation and infinitary Lambda-calculi
Taylor approximation and infinitary λ-calculus. Since its introduction by Church, the λ-calculus has played a major role in a century of development of computer sci...
