Extract, transform and load architecture for metadata collection

cic.isFulltext: true [es]
cic.isPeerReviewed: true [es]
cic.lugarDesarrollo: Universidad Nacional de La Plata [es]
cic.version: info:eu-repo/semantics/publishedVersion [es]
dc.date.accessioned: 2016-06-07T13:49:49Z
dc.date.available: 2016-06-07T13:49:49Z
dc.identifier.uri: https://digital.cic.gba.gob.ar/handle/11746/2223
dc.title: Extract, transform and load architecture for metadata collection [en]
dc.type: Conference document [es]
dcterms.abstract: Digital repositories acting as resource aggregators typically face challenges that fall roughly into three main categories: extraction, improvement and storage. The first category comprises issues related to handling different resource collection protocols (OAI-PMH, web crawling, web services, etc.) and the different representations of the collected resources (XML, HTML, database tuples, unstructured documents, etc.). The second category comprises information improvements based on controlled vocabularies, specific date formats, correction of malformed data, etc. Finally, the third category deals with the destination of downloaded resources: unification into a common database, sorting by certain criteria, etc. This paper proposes an ETL architecture for designing a software application that provides a comprehensive solution to the challenges a digital repository faces as a resource aggregator. Design and implementation aspects considered during the development of this tool are described, with a particular focus on architecture highlights. [en]
dcterms.alternative: Arquitectura ETL para la recolección de metadatos [es]
dcterms.creator.author: De Giusti, Marisa Raquel [es]
dcterms.creator.author: Lira, Ariel Jorge [es]
dcterms.creator.author: Oviedo, Néstor [es]
dcterms.identifier.url: External link [es]
dcterms.isPartOf.issue: VI Simposio [es]
dcterms.isPartOf.series: Simposio Internacional de Bibliotecas Digitales (Brasil, 2011) [es]
dcterms.issued: 2011-05-17
dcterms.language: English [es]
dcterms.license: Attribution 4.0 International (BY 4.0) [es]
dcterms.subject: Information search and retrieval [es]
dcterms.subject: Information Systems Applications [es]
dcterms.subject: repositories [en]
dcterms.subject: aggregation [en]
dcterms.subject: harvesting [en]
dcterms.subject: datawarehousing [en]
dcterms.subject: data integration [en]
dcterms.subject.materia: Computer and Information Sciences [es]
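
The three categories named in the abstract (extraction, improvement and storage) can be illustrated with a minimal Python sketch. This is not the authors' tool: every name here (`extract_from_tuples`, `normalize_date`, `apply_vocabulary`, `Store`, `run_pipeline`, `TYPE_VOCABULARY`) is hypothetical, the source is an in-memory list of tuples rather than a real OAI-PMH or web-service endpoint, and the vocabulary and date formats are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]

# Assumed controlled vocabulary: maps free-text type labels to one preferred term.
TYPE_VOCABULARY = {
    "documento de conferencia": "conference object",
    "conference paper": "conference object",
}


def extract_from_tuples(rows: Iterable[Tuple[str, str, str, str]]) -> Iterable[Record]:
    """Extraction: adapt one source representation (database-style tuples) to plain records."""
    for identifier, title, issued, doc_type in rows:
        yield {"identifier": identifier, "title": title, "issued": issued, "type": doc_type}


def normalize_date(record: Record) -> Record:
    """Improvement: rewrite the date formats we know about as ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            parsed = datetime.strptime(record["issued"], fmt)
        except ValueError:
            continue  # not this format, try the next one
        return {**record, "issued": parsed.date().isoformat()}
    return record  # unknown format: leave the value untouched


def apply_vocabulary(record: Record) -> Record:
    """Improvement: replace the type label with its controlled-vocabulary term, if one exists."""
    key = record["type"].strip().lower()
    return {**record, "type": TYPE_VOCABULARY.get(key, record["type"])}


def clean_title(record: Record) -> Record:
    """Improvement: collapse runs of whitespace in the title."""
    return {**record, "title": " ".join(record["title"].split())}


@dataclass
class Store:
    """Storage: unify records into one collection, keyed by identifier to avoid duplicates."""
    records: Dict[str, Record] = field(default_factory=dict)

    def load(self, record: Record) -> None:
        self.records[record["identifier"]] = record


def run_pipeline(source: Iterable[Record],
                 transforms: List[Callable[[Record], Record]],
                 store: Store) -> Store:
    """Run every extracted record through each transform in order, then load it."""
    for record in source:
        for transform in transforms:
            record = transform(record)
        store.load(record)
    return store


if __name__ == "__main__":
    rows = [
        ("11746/2223", "  Extract, transform  and load ", "17/05/2011", "Documento de conferencia"),
        ("11746/2223", "Extract, transform and load", "2011-05-17", "conference paper"),
    ]
    result = run_pipeline(extract_from_tuples(rows),
                          [clean_title, normalize_date, apply_vocabulary],
                          Store())
    for item in result.records.values():
        print(item)
```

Under these assumptions, supporting another protocol or representation means writing another extractor that yields the same record shape, while the transforms and the store stay unchanged.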

Files

Original bundle

Name: ETL_Documento_completo.pdf
Size: 381.89 KB
Format: Adobe Portable Document Format
Description: Paper (4 pp.)

Name: ETL_Presentación__diapositivas_.pdf
Size: 233.34 KB
Format: Adobe Portable Document Format
Description: Presentation (28 slides)