Clustering Tasks and Decision Trees with Augustan Love Poets: Cohesion and Separation in Feature Importance Extraction

Nusch, Carlos Javier; del Rio Riande, Gimena; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis; Antonelli, Leandro

cic.institucionOrigen	Laboratorio de Investigación y Formación en Informática Avanzada (LIFIA)
cic.isFulltext	Yes
cic.isPeerReviewed	Yes
cic.lugarDesarrollo	Laboratorio de Investigación y Formación en Informática Avanzada (LIFIA)
cic.parentType	Conference object
cic.version	Accepted
dc.date.accessioned	2025-02-25T17:07:41Z
dc.date.available	2025-02-25T17:07:41Z
dc.identifier.uri	https://digital.cic.gba.gob.ar/handle/11746/12425
dc.title	Clustering Tasks and Decision Trees with Augustan Love Poets: Cohesion and Separation in Feature Importance Extraction	en
dc.type	Conference document
dcterms.abstract	This article extends various automatic text analysis tasks from previous works by applying natural language processing techniques to a corpus of Latin texts from the 1st century BC and 1st century AD. The motivation behind this work is to delve into and understand a historical literary trend revolving around the themes of love, spanning from antiquity through to the medieval period. The analyzed authors include Gaius Valerius Catullus, Albius Tibullus, and Sextus Propertius, representing the literary movement of the neoterics, and Publius Vergilius Maro and Marcus Annaeus Lucanus, epic poets with distinct styles, serving as control samples. Unlike previous works, various corrections were added to the preprocessing tasks, including improved word tokenization with enclitics and handling of orthographic variances. For the clustering tasks, the K-Means method and the Silhouette Score were used to determine the optimal cluster sizes. Using these optimal clusters as labels, decision trees were trained for each range of n-grams, aiming to identify features with the highest Information Gain and Information Gain Ratio. The trees were trained based on the criterion of Entropy, and calculations of Feature Importance were performed. In this study, we focused on detailing the classification results and features extracted by the decision trees, based on the best Silhouette scores obtained and the Information Gain. We examined whether the words or parts of words with classificatory potential identified in the process matched the findings from previous exploratory tasks performed using other techniques.	en
dcterms.creator.author	Nusch, Carlos Javier
dcterms.creator.author	del Rio Riande, Gimena
dcterms.creator.author	Cagnina, Leticia Cecilia
dcterms.creator.author	Errecalde, Marcelo Luis
dcterms.creator.author	Antonelli, Leandro
dcterms.identifier.other	ISSN: 1613-0073
dcterms.isPartOf.issue	CHR2024
dcterms.isPartOf.item	Proceedings of the Computational Humanities Research Conference 2024 (CHR 2024), vol. 3834
dcterms.isPartOf.series	Computational Humanities Research Conference 2024 (Denmark, December 4-6, 2024)
dcterms.issued	2024-12
dcterms.language	English
dcterms.license	Attribution-NonCommercial-ShareAlike 4.0 International (BY-NC-SA 4.0)
dcterms.subject	Augustan love poets	en
dcterms.subject	Document Clustering	en
dcterms.subject	K Means	en
dcterms.subject	Silhouette Coefficient	en
dcterms.subject	Decision Trees	en
dcterms.subject	Feature Importance	en
dcterms.subject	Information Gain Ratio	en
dcterms.subject.materia	Computer and Information Sciences
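
The abstract names Entropy, Information Gain, and Information Gain Ratio as the criteria used to rank n-gram features against cluster labels. As an illustration only, the following minimal sketch computes these quantities from their standard textbook definitions. It is not the authors' code: the function names, the toy cluster labels, and the binary "n-gram present" features are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels inside each feature value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the split itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: six documents, two cluster labels, and whether an
# invented n-gram occurs in each document (1) or not (0).
clusters = [0, 0, 0, 1, 1, 1]
has_ngram = [1, 1, 1, 0, 0, 0]    # separates the two clusters perfectly
noisy_ngram = [1, 0, 1, 0, 1, 0]  # says little about cluster membership

for name, feature in [("has_ngram", has_ngram), ("noisy_ngram", noisy_ngram)]:
    print(name, information_gain(feature, clusters), gain_ratio(feature, clusters))
```

In this toy setting the feature that splits the clusters cleanly receives a far higher gain than the noisy one, which is the kind of ranking the abstract describes for identifying features with classificatory potential.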

Files

Original bundle

Name: Clustering Tasks and Decision Trees-PDFA.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format
Description: Full document

License bundle

Name: license.txt
Size: 3.46 KB
Format: Item-specific license agreed upon to submission

Collections

LIFIA conference articles and presentations