Conference paper
Open Access

Rule-Based Matching for Real Estate Features Detection

Abstract

Most information about real estate for sale in Buenos Aires province, Argentina, is unstructured: it does not follow a consistent format, which makes extraction challenging. Variability in wording, human errors, noise, and incomplete data further complicate the task. Given the large volume of information available, automated techniques are required to transform unstructured text into structured data. This article presents an approach for extracting attribute-value pairs from property listings in the province of Buenos Aires so that the data can be incorporated into a knowledge graph. The approach applies pattern-based information extraction to 17 features and is evaluated exhaustively on two datasets: a ground truth labeled by experts and a dataset drawn from a real-world use case. The results show that the extracted values are accurate.
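
As a rough illustration of what pattern-based extraction of attribute-value pairs can look like, the sketch below applies a few regular expressions to a made-up Spanish listing. The feature names, the patterns, and the sample text are all assumptions for illustration only; the abstract does not list the paper's 17 features or its actual rules, and the authors may well use a different matching mechanism.

```python
import re

# Hypothetical patterns for three illustrative features. These are NOT the
# paper's rules; the abstract does not describe them.
PATTERNS = {
    "bedrooms": re.compile(
        r"(\d+)\s*(?:dormitorios?|habitaci(?:ó|o)n(?:es)?)", re.IGNORECASE
    ),
    "bathrooms": re.compile(r"(\d+)\s*baños?", re.IGNORECASE),
    "area_m2": re.compile(
        r"(\d+(?:[.,]\d+)?)\s*(?:m2|m²|mts2|metros cuadrados)", re.IGNORECASE
    ),
}


def extract_features(text):
    """Return a dict mapping each matched attribute to its value (as a string)."""
    features = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Normalize a decimal comma to a decimal point.
            features[name] = match.group(1).replace(",", ".")
    return features


# Made-up listing text, used only to exercise the patterns.
listing = "Casa en La Plata, 3 dormitorios, 2 baños, 120 m2 cubiertos."
print(extract_features(listing))
```

Each resulting attribute-value pair could then be attached to the node representing the property in a knowledge graph. Attributes whose pattern does not match are simply left out, which reflects the incomplete data the abstract mentions.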

Keywords
Information Extraction
Rule-based matching
Natural Language Processing
Knowledge Graph Completion
http://creativecommons.org/licenses/by-nc-nd/4.0/

This work is published under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (BY-NC-ND 4.0).
