Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Paper • 2410.02740 • Published 2 days ago • 46
Llama 3.2 3B & 1B GGUF Quants Collection Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated 10 days ago • 40
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 11 items • Updated 10 days ago • 327
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 17 days ago • 201
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 23 days ago • 76
Configurable Foundation Models: Building LLMs from a Modular Perspective Paper • 2409.02877 • Published Sep 4 • 27
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data Paper • 2409.03810 • Published about 1 month ago • 30
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 174
Bielik-11B-v2.2 Collection A collection of models based on Bielik-11B-v2.2 - instruct and quantized versions. • 16 items • Updated 15 days ago • 27
LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 264 items • Updated Jun 22 • 399
Sapiens Collection Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated 17 days ago • 30
view article Article DEMO: French Spoken Language Understanding with the new speech resources from NAVER LABS Europe By mzboito • Aug 28 • 9
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27 • 138
view article Article Going multimodal: How Prezi is leveraging the Hub and the Expert Support Program to accelerate their ML roadmap Jun 19 • 11
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations Paper • 2408.12590 • Published Aug 22 • 33
DreamCinema: Cinematic Transfer with Free Camera and 3D Character Paper • 2408.12601 • Published Aug 22 • 28
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22 • 50
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22 • 61
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation Paper • 2408.13252 • Published Aug 23 • 23
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 111
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 28
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22 • 75
Gradio Spaces for Background Removal Collection Enhance your images by removing the background. Will ensure these Spaces are up and maintained for the community. • 5 items • Updated Aug 20 • 23
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 10 days ago • 586
XGen-MM-1 models and datasets Collection A collection of all XGen-MM (Foundation LMM) models! • 14 items • Updated 8 days ago • 34
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated 3 days ago • 54
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 17 days ago • 473
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated Aug 20 • 41
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance Paper • 2408.08189 • Published Aug 15 • 14
MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model Paper • 2408.10198 • Published Aug 19 • 32
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 73
ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget Paper • 2408.00103 • Published Jul 31 • 16
Medical SAM 2: Segment medical images as video via Segment Anything Model 2 Paper • 2408.00874 • Published Aug 1 • 41
A Large Encoder-Decoder Family of Foundation Models For Chemical Language Paper • 2407.20267 • Published Jul 24 • 31
🍃 MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 50
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17 • 48
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 9 items • Updated 12 days ago • 40