Composition and Zero-Shot Transfer with Lattice Structures in Reinforcement Learning
An important property of long-lived agents is the ability to reuse existing knowledge to solve new tasks. An appealing approach to obtaining such agents is to leverage logical composition over tasks, where new tasks are defined by applying logic operators to previously solved ones. This composition is particularly powerful since it provides a human-understandable mechanism for task specification. However, no unifying formalism for applying logic operators to tasks and generalising combinatorially over them has yet been developed. We address the problem by formally defining logical composition as operators acting on a set of tasks in a lattice structure, the algebraic structure that generalises the study of Boolean logic. This provides a theoretically rigorous method for composing tasks, allowing us to formulate new tasks in terms of the negation, disjunction, and conjunction of a set of base tasks. We prove that by learning, in a model-free manner, a new type of goal-oriented value function called the world value function, an agent can solve composite tasks involving arbitrary logical operators with no further learning. We verify our approach in high-dimensional domains, including a video game environment and a continuous-control task, where an agent first learns to solve a set of base tasks and then composes these solutions to solve a super-exponential number of new tasks.
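To make the idea of composing solved tasks more concrete, here is a minimal sketch in Python. It rests on assumptions that the abstract does not state: each task is summarised by a list of learned values, one per goal; disjunction is taken as the element-wise maximum, conjunction as the element-wise minimum, and negation as a reflection between an upper and a lower bound. The function names and the toy numbers are hypothetical and are not the authors' implementation.

```python
# Hypothetical sketch: composing per-goal value estimates with lattice operators.
# Assumption: each task is a list of values, one entry per goal.

def disjunction(q1, q2):
    """Assumed OR: element-wise maximum of two value lists."""
    return [max(a, b) for a, b in zip(q1, q2)]

def conjunction(q1, q2):
    """Assumed AND: element-wise minimum of two value lists."""
    return [min(a, b) for a, b in zip(q1, q2)]

def negation(q, q_upper, q_lower):
    """Assumed NOT: reflect each value between the upper and lower bounds."""
    return [(hi + lo) - v for v, hi, lo in zip(q, q_upper, q_lower)]

# Toy values for three goals (illustrative numbers only).
q_upper = [1.0, 1.0, 1.0]   # task in which every goal is desirable
q_lower = [0.0, 0.0, 0.0]   # task in which no goal is desirable
q_a = [1.0, 1.0, 0.0]       # base task A
q_b = [0.0, 1.0, 1.0]       # base task B

q_a_or_b = disjunction(q_a, q_b)
q_a_and_b = conjunction(q_a, q_b)
q_not_a = negation(q_a, q_upper, q_lower)

# Exclusive-or built from the three operators, with no further learning.
q_not_b = negation(q_b, q_upper, q_lower)
q_a_xor_b = disjunction(conjunction(q_a, q_not_b), conjunction(q_not_a, q_b))
```

Under these assumptions, any Boolean expression over the base tasks can be evaluated directly on the stored values, which is the sense in which composition requires no further learning.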
AI Access Foundation

