LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding Paper • 2407.15754 • Published Jul 22 • 19
CMC-Bench: Towards a New Paradigm of Visual Signal Compression Paper • 2406.09356 • Published Jun 13 • 4
Visual Evaluation Benchmarks! Collection Q-Bench (ICLR'24 Spotlight), Q-Bench-Pair (TPAMI), and A-Bench in HuggingFace format. Supports auto-loading via `dataset = load_dataset("q-future/**-HF")` • 3 items • Updated Aug 27 • 1
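The collection above advertises auto-loading through the `q-future/**-HF` naming pattern. A minimal sketch of that usage, assuming a benchmark name such as `Q-Bench` fills the wildcard (the exact repo ids should be checked on the collection page):

```python
def repo_id(name: str) -> str:
    # Builds a Hub repo id following the "q-future/**-HF" naming
    # pattern described in the collection.
    return f"q-future/{name}-HF"

def load_benchmark(name: str):
    # Downloads the benchmark in HuggingFace format.
    # Requires network access and `pip install datasets`.
    from datasets import load_dataset
    return load_dataset(repo_id(name))
```

For example, `load_benchmark("Q-Bench")` would fetch the hypothetical `q-future/Q-Bench-HF` dataset; the lazy import keeps the helper usable even where `datasets` is not installed until the download is actually triggered.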
A-Bench: Are LMMs Masters at Evaluating AI-generated Images? Paper • 2406.03070 • Published Jun 5 • 2
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29 • 43
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7 • 13
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models Paper • 2310.14566 • Published Oct 23, 2023 • 25
A Benchmark for Multi-modal Foundation Models on Low-level Vision: from Single Images to Pairs Paper • 2402.07116 • Published Feb 11 • 2
LLM Leaderboard best models ❤️🔥 Collection A daily updated list of the best-evaluated models on the LLM leaderboard • 264 items • Updated Jun 22 • 399
Visual Scorers! Collection Variants of the visual evaluation models proposed in [Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-defined Levels]. Use via `model.score()` • 8 items • Updated Jun 14 • 2
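A minimal sketch of loading one of these scorers with `transformers`; the `q-future/one-align` repo id and the keyword arguments are assumptions based on the collection, so consult the individual model cards for the exact checkpoint names and `score()` signature:

```python
DEFAULT_REPO = "q-future/one-align"  # assumption: one Q-Align variant from the collection

def load_scorer(repo: str = DEFAULT_REPO):
    # Loads a Q-Align visual scorer from the Hub.
    # Requires network access and `pip install transformers`;
    # trust_remote_code is needed because scoring is implemented
    # in the repo's custom modeling code.
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(
        repo, trust_remote_code=True, torch_dtype="auto"
    )

# Usage sketch (network required), per the collection's `model.score()` hint:
#   model = load_scorer()
#   quality = model.score([pil_image])
```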
Low-level Visual Assistants! Collection Multi-purpose assistants for low-level visual perception, from [Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models] • 4 items • Updated Jun 14 • 1
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model Paper • 2401.16420 • Published Jan 29 • 54
Q-Boost: On Visual Quality Assessment Ability of Low-level Multi-Modality Foundation Models Paper • 2312.15300 • Published Dec 23, 2023 • 2
Q-Refine: A Perceptual Quality Refiner for AI-Generated Image Paper • 2401.01117 • Published Jan 2 • 8
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels Paper • 2312.17090 • Published Dec 28, 2023 • 4
Enhancing Diffusion Models with Text-Encoder Reinforcement Learning Paper • 2311.15657 • Published Nov 27, 2023 • 2
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision Paper • 2312.09390 • Published Dec 14, 2023 • 32
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions Paper • 2311.12793 • Published Nov 21, 2023 • 18
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models Paper • 2311.06783 • Published Nov 12, 2023 • 26
Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision Paper • 2309.14181 • Published Sep 25, 2023 • 2