OmniBench: Towards The Future of Universal Omni-Language Models Paper • 2409.15272 • Published 12 days ago • 24
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published 11 days ago • 40
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published 22 days ago • 45
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4 • 72
FuzzCoder: Byte-level Fuzzing Test via Large Language Model Paper • 2409.01944 • Published Sep 3 • 44
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Paper • 2409.02813 • Published Sep 4 • 27
OLMoE Collection Artifacts for open mixture-of-experts language models. • 13 items • Updated 11 days ago • 21
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17 • 51
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15 • 51
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm Paper • 2408.08072 • Published Aug 15 • 31
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 85
LongIns: A Challenging Long-context Instruction-based Exam for LLMs Paper • 2406.17588 • Published Jun 25 • 20
MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation Paper • 2406.15252 • Published Jun 21 • 14
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents Paper • 2406.13923 • Published Jun 20 • 21
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models Paper • 2406.05862 • Published Jun 9 • 4
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29 • 43
MusiLingo Collection Checkpoints and datasets for MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response • 5 items • Updated Apr 4 • 2
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding Paper • 2402.16671 • Published Feb 26 • 26
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling Paper • 2402.12226 • Published Feb 19 • 40
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation Paper • 2402.04324 • Published Feb 6 • 23
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark Paper • 2401.11944 • Published Jan 22 • 24
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models Paper • 2401.06951 • Published Jan 13 • 24
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 53
A Comprehensive Study of Knowledge Editing for Large Language Models Paper • 2401.01286 • Published Jan 2 • 16
TIGERScore Collection List of model variants of TIGERScore checkpoints and the associated dataset • 8 items • Updated 9 days ago • 4
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers Paper • 2311.17136 • Published Nov 28, 2023 • 7
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Paper • 2311.16502 • Published Nov 27, 2023 • 35