Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 60 tok/s
Gemini 2.5 Pro 51 tok/s Pro
GPT-5 Medium 18 tok/s Pro
GPT-5 High 14 tok/s Pro
GPT-4o 77 tok/s Pro
Kimi K2 159 tok/s Pro
GPT OSS 120B 456 tok/s Pro
Claude Sonnet 4 38 tok/s Pro
2000 character limit reached

Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images (2312.00699v2)

Published 1 Dec 2023 in cs.CV and cs.IR

Abstract: Table Structure Recognition (TSR) is a widely discussed task aiming at transforming unstructured table images into structured formats, such as HTML sequences, to make text-only models, such as ChatGPT, that can further process these tables. One type of solution is using detection models to detect table components, such as columns and rows, then applying a rule-based post-processing method to convert detection results into HTML sequences. However, existing detection-based models usually cannot perform as well as other types of solutions regarding cell-level TSR metrics, such as TEDS, and the underlying reasons limiting the performance of these models on the TSR task are also not well-explored. Therefore, we revisit existing detection-based models comprehensively and explore the underlying reasons hindering these models' performance, including the improper problem definition, the mismatch issue of detection and TSR metrics, the characteristics of detection models, and the impact of local and long-range features extraction. Based on our analysis and findings, we apply simple methods to tailor a typical two-stage detection model, Cascade R-CNN, for the TSR task. The experimental results show that the tailored Cascade R-CNN based model can improve the base Cascade R-CNN model by 16.35\% on the FinTabNet dataset regarding the structure-only TEDS, outperforming other types of state-of-the-art methods, demonstrating that our findings can be a guideline for improving detection-based TSR models and that a purely detection-based solution is competitive with other types of solutions, such as graph-based and image-to-sequence solutions.

Summary

We haven't generated a summary for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Lightbulb On Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.