Depth Pro: Sharp Monocular Metric Depth in Less Than a Second Paper • 2410.02073 • Published 3 days ago • 21
DepthPro Models Collection Depth Pro: Sharp Monocular Metric Depth in Less Than a Second • 2 items • Updated about 6 hours ago • 1
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 10 days ago • 218
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness Paper • 2409.18125 • Published 9 days ago • 32
Article Assisted Generation: a new direction toward low-latency text generation May 11, 2023 • 26
LM (MLX) Collection State-Space-Model powered Language Models for Apple Silicon • 12 items • Updated Aug 27 • 4
DiffusionKit Collection Models, datasets and evaluations results for DiffusionKit: https://github.com/argmaxinc/DiffusionKit • 6 items • Updated 26 days ago • 3
WhisperKit Collection Models, datasets and evaluation results for WhisperKit: https://github.com/argmaxinc/WhisperKit • 4 items • Updated about 1 month ago • 6
Enhancing Training Efficiency Using Packing with Flash Attention Paper • 2407.09105 • Published Jul 12 • 12
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14 • 44
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 73
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8 • 7
Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 35
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Paper • 2311.06242 • Published Nov 10, 2023 • 79
MobileCLIP Models + DataCompDR Data Collection MobileCLIP: Mobile-friendly image-text models with SOTA zero-shot capabilities. DataCompDR: Improved datasets for training image-text SOTA models. • 22 items • Updated 1 day ago • 23
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 64
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion Paper • 2406.03184 • Published Jun 5 • 18
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 10 days ago • 676
SD 2.x, Zero-terminal SNR Collection SD 2.x models with zero terminal SNR noise schedule. • 3 items • Updated Nov 3, 2023 • 3
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • May 22 • 25
INDUS: Effective and Efficient Language Models for Scientific Applications Paper • 2405.10725 • Published May 17 • 32
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 136
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • Jun 23 • 33
Depth Anything Release Collection Depth Anything models, foundation models for monocular depth estimation, trained on 1.5 million labeled images and 62 million unlabeled images • 8 items • Updated Jan 26 • 9
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Jun 4 • 69
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325