International Journal on Science and Technology

E-ISSN: 2229-7677     Impact Factor: 9.88

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Incorporate Structure and Content for Devanagari Table Extraction

Author(s): Ms. Anuja Ramu Dumada, Prof. Sandeep G. Shah
Country: India
Abstract: Table extraction is a crucial component of document image analysis: it transforms unstructured document content into structured, machine-readable data. Although significant progress has been made for English and other Latin-script documents, table extraction from Devanagari-script documents remains challenging because of script complexity, limited annotated datasets, and poor-quality scans. This research proposes a hybrid framework that combines structural analysis with content-based validation to improve table extraction accuracy for Devanagari documents. The framework integrates preprocessing, Devanagari-specific OCR post-correction, semantic consistency checks, and confidence-based feature fusion. Experimental results on over 500 annotated Devanagari documents show improved performance over existing methods, with high structural precision, high OCR accuracy, and a high overall TEDS-S score.
Keywords: Table Extraction, Devanagari Script, Document Image Analysis, OCR, Hybrid Framework, TEDS-S
Field: Computer
Published In: Volume 17, Issue 1, January-March 2026
Published On: 2026-02-06
