Tutorial & Overview
- Book 2018: Sutton & Barto: Reinforcement Learning: An Introduction, Book, Note
- arXiv 2018: An Introduction to Deep Reinforcement Learning, arXiv, Note
- arXiv 2024: Reinforcement Learning: An Overview, arXiv, Note
- INFORMS Tutorial 2025: Statistical and Algorithmic Foundations of Reinforcement Learning, arXiv, Slides, Note
Model-Free RL
- arXiv 2018: Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, arXiv, Note
- ICLR 2021 Oral: What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study, arXiv, Note
Value-Based Methods
- Nature 2015, DQN: Human-level Control through Deep Reinforcement Learning, Nature, Note
- AAAI 2016, Double DQN: Deep Reinforcement Learning with Double Q-learning, arXiv, Note
- ICML 2017, Soft Q-Learning: Reinforcement Learning with Deep Energy-Based Policies, arXiv, Note
Policy Gradient & On-Policy Methods
Tutorial:
- arXiv 2024: The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations, arXiv, Note
Papers:
- NIPS 1999: Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, Note
- NIPS 2001, NPG: A Natural Policy Gradient, NIPS, Note
- ICML 2016, A3C: Asynchronous Methods for Deep Reinforcement Learning, arXiv, Note
- ICML 2015, TRPO: Trust Region Policy Optimization, arXiv, Note
- arXiv 2017, PPO: Proximal Policy Optimization Algorithms, arXiv, Note
Policy Gradient & Off-Policy Methods
- ICLR 2016, DDPG: Continuous Control with Deep Reinforcement Learning, arXiv, Note
- ICML 2018, SAC: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, arXiv, Note
Exploration Bonus
Model-Based RL
- arXiv 2017: Learning Model-based Planning from Scratch, arXiv, Note
- ICML 2013: Guided Policy Search, Online PDF, Note
- ICML 2017, Predictron: The Predictron: End-To-End Learning and Planning, arXiv, Note
- NIPS 2017, VPN: Value Prediction Network, arXiv, Note
- AAAI 2019, CRAR: Combined Reinforcement Learning via Abstract Representations, arXiv, Note
Imitation Learning
Tutorial:
Papers:
- AISTATS 2011, DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, arXiv, Note
- NIPS 2016, GAIL: Generative Adversarial Imitation Learning, arXiv, Note
- NIPS 2017, InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, arXiv, Note
- ICLR 2023, HOIL: Seeing Differently, Acting Similarly: Heterogeneously Observable Imitation Learning, arXiv, Note
Inverse Reinforcement Learning
Tutorial:
- arXiv 2018: A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress, arXiv, Note
Papers:
- AAAI 2008, MaxEnt IRL: Maximum Entropy Inverse Reinforcement Learning, AAAI, Note
- ICML 2016, MaxEnt IOC: Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, arXiv, Note
- ICLR 2018, AIRL: Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, arXiv, Note
Generalization & Overfitting
- arXiv 2018: A Study on Overfitting in Deep Reinforcement Learning, arXiv, Note
- arXiv 2018: A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning, arXiv, Note
Representation Learning & Transfer
- ICML 2017, DARLA: Improving Zero-Shot Transfer in Reinforcement Learning, arXiv, Note
- ICLR 2018 Workshop: Decoupling Dynamics and Reward for Transfer Learning, arXiv, Note