Towards Text-Line Segmentation of Historical Documents Using Graph Neural Networks

FLAME University, Pune, India
ICLR 2026 Workshop on Geometry-grounded Representation Learning and Generative Modeling

Abstract

We present an initial investigation into a graph-based formulation of text-line segmentation for historical documents. Characters (or grapheme clusters) are represented as nodes, and edges connect each character to the previous and next characters on its text-line. This converts the image segmentation task into a binary edge classification task. It also permits training on large-scale synthetic data that simulates complex layouts, which enables better robustness to the layout-level distribution shifts observed in historical documents. Furthermore, we introduce a benchmark dataset of 15 Sanskrit manuscripts with diverse layouts. We propose a method based on CRAFT and Graph Neural Networks (GNNs) that uses geometric priors of text-lines to perform competitively with leading approaches in zero-shot and few-shot settings on the introduced Sanskrit dataset and the U-DIADS-TL dataset. The proposed method further demonstrates competitive accuracy and better consistency than the leading methods Doc-UFCN and SeamFormer when robustness to distribution shifts is evaluated over increasing data sizes (using intra-manuscript and inter-manuscript train–test data splits) on the introduced Sanskrit dataset and the DIVA-HisDB dataset. Finally, we demonstrate that the proposed method achieves strong performance in a downstream, goal-oriented evaluation of the text recognized from the segmented text-lines.
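
The graph formulation described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the k-nearest-neighbour candidate graph, the function names, and the toy coordinates are all assumptions made for illustration, and the edge labels here come from ground-truth annotations where the proposed method would use GNN predictions. The sketch only shows how character positions become nodes, how candidate edges receive binary labels (1 if the two characters are adjacent on the same text-line), and how text-lines can be read back as connected components of the positive edges.

```python
from math import dist  # Python 3.8+


def candidate_edges(points, k=2):
    """Assumed candidate graph: link each node to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: dist(p, points[j]))[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))  # undirected, deduplicated
    return sorted(edges)


def edge_labels(edges, line_ids, order):
    """Binary target: 1 if the two characters are adjacent on the same line."""
    return [
        int(line_ids[a] == line_ids[b] and abs(order[a] - order[b]) == 1)
        for a, b in edges
    ]


def lines_from_edges(n, edges, labels):
    """Recover text-lines as connected components of positive edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), y in zip(edges, labels):
        if y:
            parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical toy page: two text-lines with three characters each.
points = [(0, 0), (10, 0), (20, 0), (0, 8), (10, 8), (20, 8)]
line_ids = [0, 0, 0, 1, 1, 1]   # which line each character belongs to
order = [0, 1, 2, 0, 1, 2]      # reading position within its line

edges = candidate_edges(points, k=2)
labels = edge_labels(edges, line_ids, order)
lines = lines_from_edges(len(points), edges, labels)
print(lines)
```

On this toy page the vertical candidate edges between the two rows are labelled 0, so the positive edges split the six characters into two components, one per text-line.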


BibTeX

@inproceedings{
chincholikar2026towards,
title={Towards Text-Line Segmentation of Historical Documents Using Graph Neural Networks},
author={Kartik Chincholikar and Kaushik Gopalan and Mihir Hasabnis},
booktitle={ICLR 2026 Workshop on Geometry-grounded Representation Learning and Generative Modeling},
year={2026},
url={https://openreview.net/forum?id=0GoutqIh3l}
}