Topic Boundary
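This overview covers Imitation Learning (IL) outside the LLM setting. Work on active querying, selective labeling, multiple oracles, and preference-based active queries is collected in the Active Imitation Learning subdirectory; it still belongs to IL, but those subproblems place more emphasis on the query policy, expert feedback cost, and on-policy interaction. RL/RLVR/RLHF training for LLM agents is not collected here.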
Tutorial
- arXiv 2018, An Algorithmic Perspective on Imitation Learning, arXiv
Papers
- AISTATS 2011, DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, arXiv, Note
- ICML 2017, AggreVaTeD: Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction, arXiv, Note
- ICML 2022, ILEED: Imitation Learning by Estimating Expertise of Demonstrators, arXiv, Note
Adversarial Imitation Learning:
- NIPS 2016, GAIL: Generative Adversarial Imitation Learning, arXiv, Note
- ICLR 2017, TPIL: Third-Person Imitation Learning, arXiv, Note
- NIPS 2017, InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, arXiv, Note
- IJCAI 2020, Triple-GAIL: Triple-GAIL: A Multi-Modal Imitation Learning Framework, arXiv
- IJCAI 2021, SAIL: Robust Adversarial Imitation Learning via Adaptively-Selected Demonstrations, IJCAI
- ICLR 2023 Spotlight, HOIL: Seeing Differently, Acting Similarly: Heterogeneously Observable Imitation Learning, arXiv, Note
- ICML 2023, PCIL: Policy Contrastive Imitation Learning, arXiv
Policy Distillation:
- ICLR 2016, Policy Distillation: Policy Distillation, arXiv, Note
- AISTATS 2019, Distilling Policy Distillation: Distilling Policy Distillation, arXiv, Note
Explainable Imitation Learning:
Active Imitation Learning
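The full list of AIL papers is in Active Imitation Learning. AIL is kept here as a subdirection of IL rather than as a global top-level topic, both to make explicit that AIL ⊂ IL and to restrict its scope to traditional control, MDP, DRL, and oracle-query imitation learning problems.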