Article
Open Access

A co-training model based on transfer learning for the classification of research papers

Abstract

A multitude of scholarly papers can be accessed online, and their continual growth poses challenges for categorization. Organizing these documents across diverse academic fields is important, as it helps institutions, journals, and scholars structure their content and improve the visibility of research. In this study, we propose a co-training model based on transfer learning to classify papers according to institutional research lines. We use co-training text-processing techniques to enhance model learning through transformers, enabling the identification of trends and patterns in document texts. The model is structured with two views (titles and abstracts) for data preprocessing and training. Each view employs a different document representation technique that augments its training using BERT's pre-trained scheme. To evaluate the proposed model, a dataset of 898 institutional papers was compiled. These documents are classified into five or eleven classes, and the model's performance is compared with individually trained models from each view using the BART pre-trained scheme, as well as with combined models. The best precision achieved was 0.87, compared to 0.78 for the BERT pre-trained model (five classes). These findings suggest that co-training can be a valuable approach to improving the predictive performance of text classification.
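The following is a minimal sketch of the two-view co-training loop the abstract describes, assuming title and abstract embeddings (e.g., from a pre-trained BERT encoder) are precomputed as X_title and X_abs and that labels live in a NumPy array y. Logistic regression stands in for the classifier heads; the function names, confidence threshold, and round counts are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(X_title, X_abs, y, labeled_idx, unlabeled_idx,
             rounds=10, per_round=20, conf_threshold=0.9):
    """Each view's classifier pseudo-labels its most confident unlabeled
    examples, growing the shared labeled pool used by the other view."""
    y = y.copy()                    # pseudo-labels are written into this copy
    labeled = list(labeled_idx)
    unlabeled = list(unlabeled_idx)
    clf_title = LogisticRegression(max_iter=1000)
    clf_abs = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        clf_title.fit(X_title[labeled], y[labeled])
        clf_abs.fit(X_abs[labeled], y[labeled])
        for clf, X in ((clf_title, X_title), (clf_abs, X_abs)):
            if not unlabeled:
                break
            proba = clf.predict_proba(X[unlabeled])
            conf = proba.max(axis=1)
            preds = clf.classes_[proba.argmax(axis=1)]
            # Keep only the most confident predictions above the threshold.
            order = np.argsort(-conf)[:per_round]
            picked = [j for j in order if conf[j] >= conf_threshold]
            for j in picked:
                y[unlabeled[j]] = preds[j]
                labeled.append(unlabeled[j])
            chosen = set(picked)
            unlabeled = [u for k, u in enumerate(unlabeled)
                         if k not in chosen]
    return clf_title, clf_abs

def predict_proba(clf_title, clf_abs, X_title, X_abs):
    # Combine the two views by averaging class probabilities (assumes both
    # classifiers saw every class in the labeled seed, so classes_ align).
    return (clf_title.predict_proba(X_title)
            + clf_abs.predict_proba(X_abs)) / 2
```

Averaging the two views' probabilities at inference is one simple combination scheme; the paper's exact method of combining the views may differ.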

Keywords
text classification
co-training
transformer
pre-trained
http://creativecommons.org/licenses/by-nc-sa/4.0/

This work is published under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (BY-NC-SA 4.0) license.
