SFT & Reward Models used in the experiments of the ICML 2024 paper "Towards Efficient Exact Optimization of Language Model Alignment"
Haozhe Ji
ehzoah
AI & ML interests
language modeling, text generation