Hyper-multi-step: The Truth Behind Difficult Long-context Tasks
Abstract
Long-context language models (LCLMs), characterized by their extensive context windows, are becoming increasingly popular. Meanwhile, many long-context benchmarks present challenging tasks that even the most advanced LCLMs struggle to complete. However, the underlying sources of difficulty in these challenging long-context tasks have seldom been studied. To bridge this gap, we conduct experiments showing that their difficulty stems primarily from two basic issues: "multi-matching retrieval," which requires retrieving multiple items simultaneously, and "logic-based retrieval," which requires logical judgment within the retrieval criterion. These two problems, while seemingly straightforward, actually exceed the capabilities of LCLMs because they prove to be hyper-multi-step in nature, demanding numerous steps to solve. This finding could explain why LLMs struggle with more advanced long-context tasks, providing a more accurate perspective for rethinking solutions to them.
Community
Our code and datasets are publicly available at https://github.com/yuyijiong/hard_retrieval_for_llm
This paper reveals a tough fact that:
A long-context language model can never perfectly address advanced long-context tasks, such as repo-level code generation or tabular data filtering, because LLMs are inherently unable to complete a large number of reasoning steps within a limited generation length, which such tasks often require.
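To make the two problem types concrete, here is a minimal sketch of how synthetic "multi-matching retrieval" and "logic-based retrieval" prompts could be constructed over a long key-value context. The key format and generator functions are hypothetical illustrations, not the paper's actual dataset format (see the repository linked above for the real benchmarks).

```python
import random
import string

def random_key(rng):
    # Hypothetical 8-character alphanumeric key, for illustration only.
    return "".join(rng.choices(string.ascii_lowercase + string.digits, k=8))

def multi_matching_task(n_items=200, n_targets=5, seed=0):
    """Multi-matching retrieval: several keys share one target value,
    and the model must return ALL of them at once."""
    rng = random.Random(seed)
    target_value = rng.randint(0, 999)  # distractor values use a disjoint range
    keys = [random_key(rng) for _ in range(n_items)]
    target_keys = rng.sample(keys, n_targets)
    lines = []
    for k in keys:
        v = target_value if k in target_keys else rng.randint(1000, 9999)
        lines.append(f"{k}: {v}")
    question = f"List every key whose value is {target_value}."
    return "\n".join(lines), question, sorted(target_keys)

def logic_based_task(n_items=200, seed=0):
    """Logic-based retrieval: the criterion is a comparison rather than an
    exact match, so each item requires a logical judgment, not a lookup."""
    rng = random.Random(seed)
    threshold = 9000
    items = {random_key(rng): rng.randint(0, 9999) for _ in range(n_items)}
    question = f"List every key whose value is greater than {threshold}."
    answer = sorted(k for k, v in items.items() if v > threshold)
    context = "\n".join(f"{k}: {v}" for k, v in items.items())
    return context, question, answer
```

In both tasks the number of items that must be checked or returned grows with the context, which is what makes them hyper-multi-step: a single forward pass cannot shortcut the per-item work.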