Merve Noyan PRO

merve

AI & ML interests

VLMs, vision & co

Articles

Llama can now see and run on your device - welcome Llama 3.2

Preference Optimization for Vision Language Models

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Vision Language Models Explained

Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳

Deploy MusicGen in no time with Inference Endpoints

Open-Source Text Generation & LLM Ecosystem at Hugging Face

Jupyter X Hugging Face

Using Machine Learning to Aid Survivors and Race through Time

Introducing Skops

Announcing the Hugging Face Fellowship Program

Showcase Your Projects in Spaces using Gradio

Hosting your Models and Datasets on Hugging Face Spaces using Streamlit

Organizations

Posts 58

Post

3132

If you feel like you missed out on ECCV 2024, there's an app to browse the papers, rank them by popularity, and filter for open models, datasets, and demos 📝

Get started at ECCV/ECCV2024-papers ✨

Post

2249

NVIDIA just dropped a gigantic multimodal model called NVLM 72B 🦖
Model: nvidia/NVLM-D-72B
Paper page: NVLM: Open Frontier-Class Multimodal LLMs (2409.11402)

The paper contains many ablation studies on various ways to use the LLM backbone 👇🏻

🦩 Flamingo-like cross-attention (NVLM-X)
🌋 Llava-like concatenation of image and text embeddings to a decoder-only model (NVLM-D)
✨ a hybrid architecture (NVLM-H)
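To make the difference between the first two designs concrete, here is a toy sketch in plain Python. It is not NVLM's code: the function names `concat_tokens` and `cross_attend` are hypothetical, there are no learned projections, and tokens are just short lists of floats. The hybrid variant is not sketched.

```python
import math

def concat_tokens(image_tokens, text_tokens):
    """Llava-like: image embeddings join the text sequence fed to the decoder."""
    return image_tokens + text_tokens  # the sequence gets longer

def cross_attend(text_tokens, image_tokens):
    """Flamingo-like: each text token attends over image tokens.

    Toy version: queries, keys and values are the raw embeddings
    (an assumption; real models use learned projections).
    """
    dim = len(text_tokens[0])
    output = []
    for query in text_tokens:
        # Scaled dot-product scores against every image token.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in image_tokens]
        # Numerically stable softmax over the scores.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of image tokens, added back as a residual.
        mixed = [sum(w * value[j] for w, value in zip(weights, image_tokens))
                 for j in range(dim)]
        output.append([q + m for q, m in zip(query, mixed)])
    return output  # same length as the text sequence
```

The point of the sketch is the shape of the result: concatenation lengthens the decoder's input by the number of image tokens, while cross-attention keeps the text sequence length unchanged and mixes image information into it.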

In the reported evaluations, NVLM-D and NVLM-H rank best or second best against the other models compared 👏

The released model is NVLM-D, built on Qwen-2 Instruct and aligned with the InternViT-6B vision encoder using a large mixture of different datasets

You can easily use this model by loading it through transformers' AutoModel 😍
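A minimal sketch of what that loading step might look like. The model id comes from the post; everything else is an assumption: the helper names `build_load_kwargs` and `load_nvlm` are hypothetical, `trust_remote_code=True` is assumed because the repository likely ships custom modeling code, and bfloat16 is just one plausible dtype. A 72B model needs substantial GPU memory, so treat this as illustrative rather than a tested recipe.

```python
from typing import Any, Dict

MODEL_ID = "nvidia/NVLM-D-72B"  # repository id quoted in the post

def build_load_kwargs(dtype: str = "bfloat16") -> Dict[str, Any]:
    """Collect keyword arguments for AutoModel.from_pretrained (assumed settings)."""
    return {
        "torch_dtype": dtype,        # resolved to a torch dtype at load time
        "low_cpu_mem_usage": True,   # avoid materializing weights twice on CPU
        "trust_remote_code": True,   # assumption: repo provides custom model code
    }

def load_nvlm(model_id: str = MODEL_ID):
    """Load the model in eval mode; downloads the full checkpoint."""
    # Heavy imports are deferred so the helper above stays inspectable
    # on machines without torch or transformers installed.
    import torch
    from transformers import AutoModel

    kwargs = build_load_kwargs()
    kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])
    return AutoModel.from_pretrained(model_id, **kwargs).eval()
```

Calling `load_nvlm()` triggers the download; check the model card for the recommended device placement and inference code before running it.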

Collections 28

Spaces 104

Sam2.1

SuperPoint

Gradio Tgi

Vision Papers

OWLSAM2

Llava Interleave

Models 85

merve/idefics3-llama-vqav2

Updated 24 days ago • 1

merve/idefics3llama-vqav2

Updated 24 days ago • 8

merve/flux-dreambooth-lora

Updated Aug 16 • 1

merve/trained-flux-lora-lego

Text-to-Image • Updated Aug 16 • 17 • 1

merve/flux-lego-lora-dreambooth

Text-to-Image • Updated Aug 16 • 284 • 12

merve/sam2-hiera-large

Mask Generation • Updated Aug 2 • 27.9k • 2

merve/sam2-hiera-base-plus

Mask Generation • Updated Aug 2 • 18

merve/sam2-hiera-small

Mask Generation • Updated Aug 2 • 71 • 1

merve/sam2-hiera-tiny

Mask Generation • Updated Aug 2 • 12

merve/vq-vae

Updated Jul 18 • 12 • 2

Datasets 26

merve/model-test-inputs

Updated Aug 22 • 1

merve/vqav2-small

Viewer • Updated Aug 8 • 21.4k • 504 • 6

merve/SGinW

Preview • Updated Jul 11 • 2

merve/pascal-voc

Viewer • Updated Jul 6 • 336k • 2

merve/YouCook2

Viewer • Updated May 28 • 2k • 2

merve/faiss_embeddings

Updated Jan 25 • 1

merve/pokemon-ds-embeddings

Viewer • Updated Jan 10 • 833 • 6 • 4

merve/tr-h4-norobots

Updated Jan 7 • 3 • 10

merve/lego_sets_latest

Viewer • Updated Jan 6 • 61 • 4 • 1

merve/ai-tube-dummy

Updated Dec 1, 2023 • 2