Paper List

Tag: preference_optimization

2 items with this tag.

May 01, 2026
DPO Misspecification: Why DPO is a Misspecified Estimator and How to Fix It
May 01, 2026
MNPO: Multiplayer Nash Preference Optimization
