Exploring Replay
Exploration is vital for animals and artificial agents who face uncertainty about their environments due to initial ignorance or subsequent changes. Their choices need to balance exploitation of the knowledge already acquired with exploration to resolve uncertainty [1, 2]. However, the exact algorithmic structure of exploratory choices in the brain remains largely elusive. A venerable idea in reinforcement learning is that agents can plan appropriate exploratory choices offline, during the equivalent of quiet wakefulness or sleep. Although offline processing in humans and other animals, in the form of hippocampal replay and preplay, has recently been the subject of highly successful modelling [3–5], existing methods only apply to known environments. Thus, they cannot predict exploratory replay choices during learning and/or behaviour in dynamic environments. Here, we extend the theory of Mattar & Daw [3] to examine the potential role of replay in approximately optimal exploration, deriving testable predictions for the patterns of exploratory replay choices in a paradigmatic spatial navigation task. Our modelling provides a normative interpretation of the available experimental data suggestive of exploratory replay. Furthermore, we highlight the importance of sequence replay, and license a range of new experimental paradigms that should further our understanding of offline processing.
Related Results
Evaluating hippocampal replay without a ground truth
Abstract
During rest and sleep, memory traces replay in the brain. The dialogue between brain regions during replay is thought to stabilize labile memory traces for long-term storag...
Theta-band phase locking during encoding leads to coordinated entorhinal-hippocampal replay
Abstract
Precisely timed interactions between hippocampal and cortical neurons during replay epochs are thought to support memory consolidation. ...
Exploring the roles of memory replay in targeted memory reactivation and birdsong development: Insights from computational models of complementary learning systems
Abstract
Replay facilitates memory consolidation in both biological and artificial systems. Using the complementary learning systems (CLS) framework, we study repla...
Post-learning replay of hippocampal-striatal activity is biased by reward-prediction signals
Abstract
Neural activity encoding recent experiences is replayed during sleep and rest to promote consolidation of memories. However, precisely which features of ex...
Replay of factorized temporal journey
Abstract
Time is a fundamental dimension of episodic memory, structuring the sequence of events that form our experiences. While replay of spatial paths and item sequences ...
The Role of Experience in Prioritizing Hippocampal Replay
Abstract
During sleep, recent memories are consolidated, whereby behavioral episodes first encoded by the hippocampus get transformed into long-term memories. However, the ...
Replay Attack Detection Based on High Frequency Missing Spectrum
Automatic Speaker Verification (ASV) has its benefits compared to other biometric verification methods, such as face recognition. It is convenient, low cost, and more privacy prote...
Offline Replay Supports Planning: fMRI Evidence from Reward Revaluation
Abstract
Making decisions in sequentially structured tasks requires integrating distally acquired information. The extensive computational cost of such integration ...

