A connected component-based deep learning model for multi-type struck-out component classification

Document Type

Conference Item

Publication Date

1-1-2021

Abstract

Struck-out handwritten words in document images degrade the performance of methods for several important applications, such as handwriting recognition, writer identification, gender identification, fraudulent document identification, document age estimation, writer age estimation, analysis of normal/abnormal behavior of a person, and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classification of struck-out and non-struck-out words. For text component detection, the proposed method finds the stroke width to detect the edges of text in images and then performs smoothing operations to remove noise. Morphological operations are then performed on the smoothed images to label connected components as text by fixing bounding boxes. Inspired by the success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking the text components as input. Experimental results on our dataset demonstrate that the proposed method outperforms existing methods in terms of classification rate.
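
The component-labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the image has already been binarized (1 = ink, 0 = background), uses 8-connectivity, and omits the stroke-width, smoothing, morphological, and DenseNet stages entirely. The function name `component_boxes` and the parameter `min_pixels` are hypothetical and chosen only for illustration.

```python
from collections import deque


def component_boxes(binary, min_pixels=1):
    """Return (top, left, bottom, right) boxes of 8-connected foreground regions.

    `binary` is a list of equal-length rows holding 0/1 values (an assumption
    of this sketch, not a detail taken from the paper).
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Discard very small regions, a crude stand-in for noise removal.
            if count >= min_pixels:
                boxes.append((top, left, bottom, right))
    return boxes


# Toy binary image with three separate foreground regions.
image = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
boxes = component_boxes(image)
large_boxes = component_boxes(image, min_pixels=2)
```

In a pipeline of the kind the abstract outlines, each box would be cropped from the image and passed to the classifier as one text component; how the crops are resized or normalized is not stated in this record.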

Keywords

Connected component analysis, Deep learning, Handwriting recognition, Struck-out words, Writer identification

Divisions

fsktm

Funders

None

Volume

12917

Event Title

International Workshops co-located with the 16th International Conference on Document Analysis and Recognition, ICDAR 2021

Event Location

Lausanne

Event Dates

5 - 10 September 2021

Event Type

conference

