RATIONALYST: Pre-training Process-Supervision for Improving Reasoning Paper • 2410.01044 • Published 4 days ago • 34
NVLM 1.0 Collection A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks and text-only tasks. • 1 item • Updated 5 days ago • 20
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3. • 11 items • Updated 10 days ago • 327
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 10 days ago • 218
Llama 3.2 All Versions Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 20 items • Updated 3 days ago • 30
Llama 3.2 3B & 1B GGUF Quants Collection Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated 10 days ago • 40
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning Paper • 2409.14674 • Published 13 days ago • 40
jina-embeddings-v3: Multilingual Embeddings With Task LoRA Paper • 2409.10173 • Published 20 days ago • 21
Loradex Highlights Collection This collection features awesome open-source LoRAs trained by members of the Glif Community during Loradex Early Access! • 12 items • Updated 12 days ago • 16
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published 16 days ago • 128
Temporally Aligned Audio for Video with Autoregression Paper • 2409.13689 • Published 15 days ago • 7
Portrait Video Editing Empowered by Multimodal Generative Priors Paper • 2409.13591 • Published 15 days ago • 15
Colorful Diffuse Intrinsic Image Decomposition in the Wild Paper • 2409.13690 • Published 15 days ago • 12
Prithvi WxC: Foundation Model for Weather and Climate Paper • 2409.13598 • Published 15 days ago • 33
Imagine yourself: Tuning-Free Personalized Image Generation Paper • 2409.13346 • Published 16 days ago • 66
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published 17 days ago • 69
NIM Serverless Inference API Collection Models in this collection are available for inference via a serverless API powered by NVIDIA NIM. • 8 items • Updated 5 days ago • 18
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 17 days ago • 224
QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation Paper • 2405.15863 • Published May 24 • 3
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 17 days ago • 201
Qwen2.5-Math Collection Math-specific model series based on Qwen2.5 • 9 items • Updated 13 days ago • 35
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 14 items • Updated 11 days ago • 69
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published 22 days ago • 45
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems Paper • 2402.12875 • Published Feb 20 • 12
Article Fine-tuning a token classification model for legal data using Argilla and AutoTrain By bikashpatra • 29 days ago • 11
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published 25 days ago • 54
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Paper • 2409.06820 • Published 25 days ago • 59
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers Paper • 2409.04109 • Published 30 days ago • 41
Article In-browser LLM app in pure Python: Gemini Nano + Gradio-Lite By whitphx • Jul 12 • 9
Article Exploring a Public Domain dataset with Visual Topic Modeling By charlesdedampierre • Feb 22 • 3
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29 • 51
RPMax Models Collection RPMax series of models with higher creativity and reduced repetition for "classic" RP chats. • 8 items • Updated 10 days ago • 8
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4 • 72
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data Paper • 2409.03810 • Published about 1 month ago • 30
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing Paper • 2409.01322 • Published Sep 2 • 95
FuzzCoder: Byte-level Fuzzing Test via Large Language Model Paper • 2409.01944 • Published Sep 3 • 44
Attention Heads of Large Language Models: A Survey Paper • 2409.03752 • Published about 1 month ago • 86
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning Paper • 2310.11716 • Published Oct 18, 2023 • 5
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture Paper • 2409.02889 • Published Sep 4 • 54
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4 • 85
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization Paper • 2408.15914 • Published Aug 28 • 21
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios Paper • 2408.17267 • Published Aug 30 • 22
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding Paper • 2408.15545 • Published Aug 28 • 33
Qwen2-VL Collection Vision-language model series based on Qwen2 • 15 items • Updated 18 days ago • 129
Video Generation models Collection The domain of video generation is booming. Here is a list of selected Open Access video generation (T2V) models. • 14 items • Updated Aug 27 • 12
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27 • 138
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models Paper • 2408.15518 • Published Aug 28 • 41
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Paper • 2408.15998 • Published Aug 28 • 83