Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models • Mar 20
Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model • Aug 22, 2023
Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling • Paper • 2409.14683 • Published 13 days ago