Surprise acts as a reducer of outcome value in human reinforcement learning
Surprise arises from a difference between a decision outcome and its predicted outcome (prediction error), regardless of whether the error is positive or negative. It has recently been postulated that surprise affects the reward value of the action outcome itself; studies have indicated that increasing surprise, measured as the absolute value of the prediction error, decreases the value of the outcome. However, how surprise affects the value of the outcome and subsequent decision making remains unclear. We hypothesized that, if surprise decreases the outcome value, agents will make more risk-averse choices when outcomes are often surprising. Here, we propose the surprise-sensitive utility model, a reinforcement learning model in which surprise decreases the outcome value, to explain how surprise affects subsequent decision making. To test this assumption, we compared the model with previous reinforcement learning models in a simulation analysis of a risky probabilistic learning task, and performed model selection on two experimental datasets with different tasks and populations. We further simulated a simple decision-making task to investigate how the parameters of the proposed model modulate choice preference. We found that the proposed model explains risk-averse choices in a manner similar to the previous models, and that risk-averse choices increased as the surprise-based modulation parameter of outcome value increased. The model fits these datasets better than the other models with the same number of free parameters, thus providing a more parsimonious and robust account of risk-averse choices. These findings indicate that surprise acts as a reducer of outcome value and decreases the action value of risky choices, in which prediction errors often occur.
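The core idea of the abstract, that surprise (the absolute prediction error) is subtracted from the outcome before the action value is updated, can be illustrated with a minimal sketch. The abstract does not give the model equations, so everything specific here is an assumption made for illustration: the update form `utility = reward - d * |reward - Q|`, the parameter names `alpha` (learning rate), `beta` (softmax inverse temperature), and `d` (surprise-based modulation), and the two-option task with a sure payoff of 0.5 versus a 50/50 gamble between 0 and 1.

```python
import math
import random


def softmax_choice(q_values, beta, rng):
    """Sample an action index with probability proportional to exp(beta * Q)."""
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1


def surprise_sensitive_update(q, reward, alpha, d):
    """Assumed update rule: surprise lowers the outcome's utility.

    surprise = |reward - q| (absolute prediction error)
    utility  = reward - d * surprise
    q_new    = q + alpha * (utility - q)
    With d = 0 this reduces to a standard delta-rule update.
    """
    surprise = abs(reward - q)
    utility = reward - d * surprise
    return q + alpha * (utility - q)


def simulate(d, alpha=0.2, beta=5.0, n_trials=2000, seed=0):
    """Return the fraction of risky choices in a hypothetical two-option task.

    Option 0 (safe) always pays 0.5; option 1 (risky) pays 1.0 or 0.0
    with equal probability, so both options have the same expected reward.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]
    risky_count = 0
    for _ in range(n_trials):
        action = softmax_choice(q, beta, rng)
        if action == 0:
            reward = 0.5
        else:
            reward = 1.0 if rng.random() < 0.5 else 0.0
        q[action] = surprise_sensitive_update(q[action], reward, alpha, d)
        risky_count += action
    return risky_count / n_trials


if __name__ == "__main__":
    for d in (0.0, 0.4, 0.8):
        print(f"d={d:.1f}  fraction of risky choices: {simulate(d):.3f}")
```

Under these assumptions, the safe option produces no surprise once its value is learned, whereas every outcome of the risky option is surprising, so a larger `d` pulls the risky action value down and the simulated agent chooses the safe option more often. This mirrors the qualitative claim in the abstract; it is not a reproduction of the authors' simulations or fitting procedure.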

