Oct 9, 2024 · Based on this concept, we propose the Knowledge-Orthogonal Reasoning Benchmark (KOR-Bench), encompassing five task categories: Operation, Logic, ...
Sep 27, 2024 · This paper presents KOR-Bench, a new benchmark that tests LLMs' reasoning abilities across five categories: Operation, Logic, Cipher, Puzzle, and ...
Knowledge-Orthogonal Reasoning Benchmark (KOR-Bench) is designed to evaluate models' intrinsic reasoning and planning abilities by minimizing interference ...
Oct 18, 2024 · To more accurately assess large models' reasoning in new, unfamiliar areas, we're thrilled to introduce the all-new KOR-Bench (Knowledge-Orthogonal Reasoning ...
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks. K Ma, X Du, Y Wang, H Zhang, Z Wen, X Qu, J Yang, J Liu, M Liu, X Yue ...
OmniDocBench: Benchmarking Diverse PDF Document Parsing with ... KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
Oct 18, 2024 · In this paper, we introduce Knowledge-Orthogonal Reasoning (KOR), which minimizes the impact of domain-specific knowledge for a more accurate evaluation of ...
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks · 27 Sept 2024 (modified: 26 Nov 2024) · ICLR 2025 Conference Submission · Readers: ...