Manuscript document digitalization and recognition: a first approach

De Giusti, Marisa Raquel; Vila, María Marta; Villarreal, Gonzalo Luján

Article

Open Access

Publication date

October 1, 2005

Place of development

Servicio de Difusión de la Creación Intelectual

Series

Journal of Computer Science & Technology

Journal volume

vol. 5, no. 3

Language

English

Subject

Computer and Information Sciences

Length

6 pp.

HDL 11746/3826

ISSN 1666-6038

Abstract

Handwritten manuscript recognition belongs to a set of initiatives aimed at preserving the cultural heritage held in libraries and archives, which contain a great wealth of documents and even handwritten cards that accompany incunabula. This work is the starting point of a research and development project oriented to the digitization and recognition of manuscript materials. The paper discusses different algorithms used in the first stage, which is dedicated to cleaning noise from the image in order to improve it before character recognition begins. For handwritten-text recognition and image digitization to be efficient, they must be preceded by a preprocessing stage applied to the image, which includes thresholding, noise cleaning, thinning, baseline alignment and image segmentation, among others. Each of these steps reduces the harmful variability encountered when recognizing manuscripts (noise, random gray levels, slanted characters, varying ink levels in different zones), and so increases the probability of obtaining a suitable text recognition. In this paper, two image thinning methods are considered and implemented. Finally, an evaluation is carried out, yielding several conclusions related to efficiency, speed and requirements, as well as ideas for future implementations.
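The first two preprocessing steps named in the abstract, thresholding and noise cleaning, can be illustrated with a minimal sketch. The Python below is not taken from the paper: it assumes a grayscale image stored as a list of rows with values from 0 to 255, a fixed global threshold of 128, and a 3x3 median filter as the noise-cleaning step. The function names `threshold` and `median_filter` and the sample `page` are hypothetical.

```python
from statistics import median_low


def threshold(image, level=128):
    """Binarize a grayscale image: 1 = ink (dark pixel), 0 = background.

    The fixed level of 128 is an assumption made for illustration.
    """
    return [[1 if px < level else 0 for px in row] for row in image]


def median_filter(binary):
    """Replace each pixel with the low median of its 3x3 neighbourhood.

    Border pixels use only the neighbours that exist, so their window
    is smaller than nine pixels.
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median_low(window)
    return out


# Hypothetical 5x5 page: a dark 3x3 stroke plus one isolated dark speck
# in the bottom-right corner.
page = [
    [200, 200, 200, 200, 200],
    [200,  30,  30,  30, 200],
    [200,  30,  30,  30, 200],
    [200,  30,  30,  30, 200],
    [200, 200, 200, 200,  30],
]

binary = threshold(page)
cleaned = median_filter(binary)
```

In this sample the isolated speck is marked as ink after thresholding and is removed by the filter, while the centre of the stroke is kept. A median filter also rounds off the corners of the stroke, which is one reason the choice of noise-cleaning algorithm can matter before thinning.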

Keywords

digitization

Image processing software

heritage conservation

This work is published under the Creative Commons Attribution 4.0 International license (CC BY 4.0)
