ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

Cheng, Zhi-Qi; Dai, Qi; Li, Siyao; Sun, Jingdong; Mitamura, Teruko; Hauptmann, Alexander G.

arXiv:2304.02173 (cs)

[Submitted on 5 Apr 2023]

Abstract: Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods suffer from either heuristic rules or an over-reliance on OCR systems, resulting in suboptimal performance. To address these issues, we present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks. Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks. By learning the rules of charts automatically from annotated datasets, our approach eliminates the need for manual rule-making, reducing effort and enhancing accuracy. We also introduce a data variable replacement technique and extend the input and position embeddings of the pre-trained model for cross-task training. We evaluate ChartReader on Chart-to-Table, ChartQA, and Chart-to-Text tasks, demonstrating its superiority over existing methods. Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model. Moreover, our approach offers opportunities for plug-and-play integration with mainstream LLMs such as T5 and TaPas, extending their capability to chart comprehension tasks. The code is available at this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
Cite as:	arXiv:2304.02173 [cs.CV]
	(or arXiv:2304.02173v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2304.02173

Submission history

From: Zhi-Qi Cheng
[v1] Wed, 5 Apr 2023 00:25:27 UTC (2,262 KB)
