RLHF - a huanbin11 Collection

RLHF

updated about 8 hours ago

MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

Paper • 2410.02743 • Published 6 days ago