Relational Transfer in Reinforcement Learning
Transfer learning is an inherent aspect of human learning. When humans learn to perform a task, we rarely start from scratch. Instead, we recall relevant knowledge from previous learning experiences and apply that knowledge to help us master the new task more quickly.
This principle can be applied to machine learning as well. Machine learning often addresses single learning tasks in isolation. Even though multiple related tasks may exist in a domain, many algorithms for machine learning have no way to utilize those relationships. Algorithms that allow successful transfer from one task (the source) to another task (the target) are necessary steps towards making machine learning as adaptable as human learning.
This thesis investigates transfer methods for reinforcement learning (RL), where an agent takes series of actions in an environment. RL often requires substantial amounts of nearly random exploration, particularly in the early stages of learning. The ability to transfer knowledge from previous tasks can therefore be an important asset for RL agents. Transfer from related source tasks can improve the low initial performance that is common in challenging target tasks.
I focus on transferring relational knowledge that guides action choices. Relational knowledge typically uses first-order logic to express information about relationships among objects. First-order logic, unlike propositional logic, can use variables that generalize over classes of objects. This greater generalization makes first-order logic more effective for transfer.
This thesis contributes six transfer algorithms in three categories: advice-based transfer, macro transfer, and MLN transfer. Advice-based transfer uses source-task knowledge to provide advice for a target-task learner, which can follow, refine, or ignore the advice according to its value. Macro-transfer and MLN-transfer methods use source-task experience to demonstrate good behavior for a target-task learner.
I evaluate these transfer algorithms experimentally in the complex reinforcement-learning domain of RoboCup simulated soccer. All of my algorithms provide empirical benefits compared to non-transfer approaches, either by increasing initial performance or by enabling faster learning in the target task.
Download this report (PDF)
Return to tech report index