Jun 25, 2024 · We propose the LongIns benchmark dataset, a challenging long-context instruction-based exam for LLMs, built on existing instruction datasets.
Jun 25, 2024 · LongIns aims to assess the ability of LLMs to maintain focus on key information over long contexts by identifying the numbers of incorrect questions through a ...
Specifically, in our LongIns, we introduce three evaluation settings: Global Instruction & Single Task (GIST), Local Instruction & Single Task (LIST), and Local ...
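The GIST and LIST settings above differ in where the task instruction is placed relative to the long context. A minimal sketch of that difference, assuming a simple numbered-question layout (the prompt template, function names, and example data here are illustrative assumptions, not the paper's actual format):

```python
# Hypothetical sketch of GIST- vs LIST-style prompt construction.
# The exact LongIns prompt template is an assumption, not the paper's.

def build_gist_prompt(task_instruction, segments):
    """Global Instruction & Single Task: the instruction appears once,
    at the top, and the model must carry it across the whole context."""
    body = "\n".join(f"Q{i + 1}: {s}" for i, s in enumerate(segments))
    return f"{task_instruction}\n\n{body}"

def build_list_prompt(task_instruction, segments):
    """Local Instruction & Single Task: the instruction is repeated
    before every segment, so it is always locally available."""
    return "\n\n".join(
        f"{task_instruction}\nQ{i + 1}: {s}" for i, s in enumerate(segments)
    )

# Toy context: the model should flag the question numbers with wrong answers.
segments = ["2 + 2 = 5", "3 + 3 = 6", "4 + 4 = 9"]
instruction = "Identify the numbers of the questions with incorrect answers."
gist_prompt = build_gist_prompt(instruction, segments)
list_prompt = build_list_prompt(instruction, segments)
```

Under this sketch the instruction occurs once in the GIST prompt and once per segment in the LIST prompt, which is what makes GIST the harder test of long-range focus.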
The long-context capabilities of large language models (LLMs) have been a hot topic in recent years. To evaluate the performance of LLMs in different ...
Jun 26, 2024 · This paper introduces LongIns, a challenging long-context instruction-based exam for evaluating the performance of large language models ...
Based on LongIns, we perform comprehensive evaluations on existing LLMs and have the following important findings: (1). The top-performing GPT-4 with 128k ...
Dec 6, 2024 · LongIns: A Challenging Long-context Instruction-based Exam for LLMs. Paper 2406.17588, published Jun 25.
Jun 25, 2024 · Delighted to introduce a new benchmark in AI research: "LongIns: A Challenging Long-context Instruction-based Exam for Large ...
Jun 26, 2024 · LongIns: A Challenging Long-context Instruction-based Exam for LLMs: https://arxiv.org/abs/2406.17588