LS-HDIB: A Large Scale Handwritten Document Image Binarization Dataset

Source
Proceedings International Conference on Pattern Recognition
ISSN
10514651
Date Issued
2022-01-01
Author(s)
Sadekar, Kaustubh
Tiwari, Ashish
Singh, Prajwal
Raman, Shanmuganathan
DOI
10.1109/ICPR56361.2022.9956447
Volume
2022-August
Abstract
Handwritten document image binarization is challenging due to high variability in the written content and complex background attributes such as page style, paper quality, stains, shadow gradients, and non-uniform illumination. While traditional thresholding methods do not generalize effectively to such challenging real-world scenarios, deep learning-based methods have performed relatively well when provided with sufficient training data. However, the existing datasets are limited in size and diversity. This work proposes LS-HDIB, a large-scale handwritten document image binarization dataset containing over a million document images that span numerous real-world scenarios. Additionally, we introduce a novel technique that combines adaptive thresholding and seamless cloning to create the dataset with accurate ground truths. Through an extensive quantitative and qualitative evaluation of eight different deep learning-based models, we demonstrate that the performance of these models improves when they are trained on the LS-HDIB dataset and tested on unseen images.
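The abstract names adaptive thresholding as one of the two building blocks of the dataset-creation technique. As a rough illustration of that idea only, and not the authors' actual pipeline, the following is a minimal pure-Python sketch of adaptive mean thresholding: each pixel is compared with the mean of its local neighbourhood rather than with one global cutoff, which is what makes the approach tolerant of shadow gradients and uneven illumination. The function names, the `block` and `c` parameters, and the toy image are assumptions made for this example. The seamless cloning step is not shown; libraries such as OpenCV expose comparable operations as `cv2.adaptiveThreshold` and `cv2.seamlessClone`.

```python
def integral_image(img):
    """Summed-area table with an extra leading zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def adaptive_mean_threshold(img, block=3, c=15):
    """Binarize a grayscale image given as a list of rows of ints.

    A pixel becomes 0 (ink) when it is darker than the mean of its
    block x block neighbourhood minus the offset c; otherwise it
    becomes 255 (background). Windows are clipped at the image border.
    """
    h, w = len(img), len(img[0])
    r = block // 2
    sat = integral_image(img)
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            # Window sum from four lookups in the summed-area table.
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            if img[y][x] < mean - c:
                out[y][x] = 0
    return out


# Toy page (hypothetical data): brightness rises left to right,
# with one dark pen stroke in the middle column.
page = [[100, 125, 90, 175, 200] for _ in range(5)]
binary = adaptive_mean_threshold(page, block=3, c=15)
for row in binary:
    print(row)
```

On this toy page a single global cutoff of 128 would also mark the two dim left-hand background columns as ink, whereas the local comparison isolates only the stroke column. In practice `block` and `c` need tuning per image type.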
URI
https://d8.irins.org/handle/IITG2025/26228