Multi-armed Bandit Algorithms for Cournot Games

Abstract: We investigate using a multi-armed bandit (MAB) setting for modeling repeated Cournot oligopoly games. Agents interact with separate bandit problems. Each agent chooses from a set of arms/actions representing discrete production quantities; here, the action space is ordered. Agents are independent and autonomous and cannot observe anything from the environment; they see only their own rewards after taking an action and work only towards maximizing these rewards. We first study Cournot models with stationary market demand where random entry or exit from the market is not allowed. We propose two novel approaches that take advantage of the fact that the action space is ordered: ϵ-greedy+HL and ϵ-greedy+EL. These use the ϵ-greedy approach as an underlying mechanism because, unlike other popular methods such as UCB or Thompson sampling, the ϵ-greedy method does not require any knowledge of the reward distributions, not even their priors. Our proposed approaches help firms focus on more profitable actions by eliminating less profitable choices and are designed to optimize exploration. However, in real-world scenarios, market demands evolve over a product’s lifetime for a myriad of reasons. Therefore, we also investigate repeated Cournot games with non-stationary demand such that firms/agents face independent instances of the non-stationary multi-armed bandit problem. We propose a novel algorithm, Adaptive with Weighted Exploration (AWE) ϵ-greedy, that is loosely based on the ϵ-greedy approach. We use computer simulations to study the emergence of various equilibria in the outcomes and empirically analyze joint cumulative regrets. Using our proposed method, agents are able to swiftly change their course of action according to the changes in demand. In most of the simulations, firms overall produce collusive outcomes, i.e., outcomes better than the Nash equilibrium.

Springer Science and Business Media LLC

Kshitija Taywade, Judy Goldsmith, Brent Harrison, Adib Bagh

2023
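
To make the setting in the abstract concrete, the following is a minimal sketch of plain ϵ-greedy agents playing a repeated Cournot game. It is not the authors' ϵ-greedy+HL, ϵ-greedy+EL, or AWE ϵ-greedy algorithms, whose details the abstract does not give. The linear inverse demand, the parameter values `a`, `b`, `c`, the quantity grid, and all function and class names are assumptions chosen for illustration only. Each agent sees nothing but its own profit, as the abstract describes.

```python
import random


def cournot_profit(q_own, q_total, a=100.0, b=1.0, c=10.0):
    """Profit of one firm. ASSUMPTION: linear inverse demand
    P = max(a - b*Q, 0) and constant marginal cost c (not from the source)."""
    price = max(a - b * q_total, 0.0)
    return (price - c) * q_own


def symmetric_benchmarks(n_firms, a=100.0, b=1.0, c=10.0):
    """Per-firm Nash and collusive quantities under the assumed linear model."""
    nash_q = (a - c) / (b * (n_firms + 1))
    collusive_q = (a - c) / (2 * b * n_firms)
    return nash_q, collusive_q


class EpsilonGreedyAgent:
    """Plain epsilon-greedy over a discrete, ordered set of quantities."""

    def __init__(self, quantities, epsilon, rng):
        self.quantities = list(quantities)
        self.epsilon = epsilon
        self.rng = rng
        self.counts = [0] * len(self.quantities)
        self.means = [0.0] * len(self.quantities)

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.quantities))
        best = max(self.means)
        ties = [i for i, m in enumerate(self.means) if m == best]
        return self.rng.choice(ties)

    def update(self, arm, reward):
        # Incremental sample-mean update of the chosen arm's estimate.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(n_firms=2, rounds=5000, epsilon=0.1, seed=0):
    """Run the repeated game; return the quantities chosen in every round."""
    master = random.Random(seed)
    grid = list(range(0, 50, 5))  # hypothetical quantity grid: 0, 5, ..., 45
    agents = [
        EpsilonGreedyAgent(grid, epsilon, random.Random(master.randrange(2**32)))
        for _ in range(n_firms)
    ]
    history = []
    for _ in range(rounds):
        arms = [agent.choose() for agent in agents]
        chosen = [grid[i] for i in arms]
        total = sum(chosen)
        # Each agent observes only its own profit, never rivals' actions.
        for agent, arm, q in zip(agents, arms, chosen):
            agent.update(arm, cournot_profit(q, total))
        history.append(chosen)
    return history


if __name__ == "__main__":
    history = simulate()
    tail = history[-1000:]
    avg_total = sum(sum(row) for row in tail) / len(tail)
    nash_q, collusive_q = symmetric_benchmarks(2)
    print("average total output, last 1000 rounds:", avg_total)
    print("Nash total:", 2 * nash_q, "collusive total:", 2 * collusive_q)
```

Under these assumed parameters with two firms, the per-firm Nash quantity is 30 and the per-firm collusive quantity is 22.5, so a total output below 60 would correspond to an outcome more profitable for the firms than the Nash equilibrium. Whether this plain ϵ-greedy sketch actually reaches such outcomes depends on the seed and parameters and is not a claim about the paper's results.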
