TableExtractNet: A Model of Automatic Detection and Recognition of Table Structures from Unstructured Documents

Bibliographic Details
Main Authors: Thokozani Ngubane, Jules-Raymond Tapamo
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Informatics
Online Access: https://www.mdpi.com/2227-9709/11/4/77
Description
Summary: This paper presents TableExtractNet, a model that automatically detects tables in scanned documents and recognizes their structure, two tasks that are essential for making document content quickly usable in many fields. The work is motivated by the growing need for efficient and accurate table interpretation in business documents, where tables improve data communication and support decision-making. The model combines two established techniques, CornerNet and Faster R-CNN, to locate tables accurately and to recover their layout. Tests on the standard datasets IIIT-AR-13K, STDW, SciTSR, and PubTabNet show that the model outperforms previous approaches and is well suited to tables with complicated designs and to documents with dense detail. These results mark a step forward in automating document analysis, making it easier to turn complex scanned documents containing tables into data that computers can readily process.
ISSN: 2227-9709