Cool - a Dandandooo Collection

Cool

updated 4 days ago

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

Paper • 2409.08513 • Published 23 days ago • 10
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Paper • 2409.08264 • Published 23 days ago • 42
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published 17 days ago • 69
LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published 18 days ago • 30
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published 17 days ago • 46
Language Models Learn to Mislead Humans via RLHF

Paper • 2409.12822 • Published 16 days ago • 9
Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published 16 days ago • 66
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models

Paper • 2409.13592 • Published 15 days ago • 45
A Case Study of Web App Coding with OpenAI Reasoning Models

Paper • 2409.13773 • Published 17 days ago • 4
Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published 8 days ago • 73
MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published 10 days ago • 45