Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 10 days ago • 676
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 124
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9 • 41
MAGNeT Collection Masked Audio Generation using a Single Non-Autoregressive Transformer • 9 items • Updated Apr 4 • 39
QuIP: 2-Bit Quantization of Large Language Models With Guarantees Paper • 2307.13304 • Published Jul 25, 2023 • 2
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models Paper • 2312.09767 • Published Dec 15, 2023 • 25
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 79
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Paper • 2312.02145 • Published Dec 4, 2023 • 4
Notus 7B v1 Collection Notus 7B v1 models (DPO fine-tune of Zephyr SFT) and datasets used. More information at https://github.com/argilla-io/notus • 11 items • Updated Jul 30 • 17
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
Positional Description Matters for Transformers Arithmetic Paper • 2311.14737 • Published Nov 22, 2023 • 2
Memory Augmented Language Models through Mixture of Word Experts Paper • 2311.10768 • Published Nov 15, 2023 • 16
read papers Collection This is a collection of some papers I've read in the past few months • 10 items • Updated Nov 21, 2023 • 47
Biomedical NLP papers Collection Papers posted on @ArxivHealthcareNLP@sigmoid.social (Clinical, Healthcare & Biomedical NLP) • 150 items • Updated 18 days ago • 33
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity Paper • 2101.03961 • Published Jan 11, 2021 • 14
Robust Speech Recognition via Large-Scale Weak Supervision Paper • 2212.04356 • Published Dec 6, 2022 • 19
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference Paper • 2310.04378 • Published Oct 6, 2023 • 19
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Paper • 2311.05556 • Published Nov 9, 2023 • 79
Nemotron 3 8B Collection The Nemotron 3 8B Family of models is optimized for building production-ready generative AI applications for the enterprise. • 5 items • Updated 5 days ago • 43
Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark Paper • 2311.09122 • Published Nov 15, 2023 • 6
Music ControlNet: Multiple Time-varying Controls for Music Generation Paper • 2311.07069 • Published Nov 13, 2023 • 43
Latent Consistency Models LoRAs Collection Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 98
EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision Paper • 2311.02077 • Published Nov 3, 2023 • 14
OpenChat Collection OpenChat: Advancing Open-source Language Models with Mixed-Quality Data • 7 items • Updated Jul 31 • 33
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 65
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 56
Quantum control of a cat-qubit with bit-flip times exceeding ten seconds Paper • 2307.06617 • Published Jul 13, 2023 • 1
Responsible AI resources Collection These are the resources I use and mention in my talks & workshops, for more check hf.co/ethics • 15 items • Updated Jun 18 • 3
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 14
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 69
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation Paper • 2108.12409 • Published Aug 27, 2021 • 5
Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models Paper • 2307.11224 • Published Jul 20, 2023 • 5
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 39
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 25
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 144
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 70 items • Updated 3 days ago • 84
🕹️ AI Games Collection An ongoing collection of games you can play on HF Spaces • 14 items • Updated 3 days ago • 25
AI Ethics projects in Spanish Collection Datasets, models and spaces related to hate speech detection and bias evaluation in Spanish. • 20 items • Updated 11 days ago • 6
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12 • 212
Community Tools Collection Cool HF tools that I and others at HF work on that I regularly use • 4 items • Updated May 21 • 3
Core ML Diffusers 🧨 Collection Some diffusion models ported to Core ML that work with apple/ml-stable-diffusion and huggingface/swift-coreml-diffusers. • 16 items • Updated Jun 13 • 8
Historical - Spaces of the Week Collection All Spaces of the Week...from all weeks • 636 items • Updated Jan 17 • 19
Whisper Release Collection Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 79
Learning Transferable Visual Models From Natural Language Supervision Paper • 2103.00020 • Published Feb 26, 2021 • 11
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning Paper • 2310.12921 • Published Oct 19, 2023 • 19