Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing Paper • 2402.15151 • Published Feb 23, 2024 • 7 • 2
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages Paper • 2303.12582 • Published Mar 22, 2023 • 20 • 3
DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing Paper • 2306.14435 • Published Jun 26, 2023 • 20 • 5
Agile Catching with Whole-Body MPC and Blackbox Policy Learning Paper • 2306.08205 • Published Jun 14, 2023 • 9 • 1
RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation Paper • 2306.11706 • Published Jun 20, 2023 • 7 • 1