2016-06-07T13:49:49Z 2016-06-07T13:49:49Z
dc.title Extract, transform and load architecture for metadata collection en
dc.type Conference paper en
dcterms.abstract Digital repositories acting as resource aggregators typically face challenges that fall into three main categories: extraction, improvement and storage. The first category covers the different resource collection protocols (OAI-PMH, web crawling, web services, etc.) and the representations of the collected resources (XML, HTML, database tuples, unstructured documents, etc.). The second category covers improvements to the information based on controlled vocabularies, specific date formats, correction of malformed data, etc. Finally, the third category deals with the destination of downloaded resources: unification into a common database, sorting by certain criteria, etc. This paper proposes an ETL architecture for designing a software application that provides a comprehensive solution to the challenges a digital repository faces as a resource aggregator. Design and implementation aspects considered during the development of this tool are described, with particular focus on architecture highlights. en
dcterms.alternative Arquitectura ETL para la recolección de metadatos es
dcterms.issued 2011-05-17
dcterms.language English en
dcterms.license Attribution 4.0 International (BY 4.0) es
dcterms.subject Information search and retrieval en
dcterms.subject Information systems applications en
dcterms.subject repositories en
dcterms.subject aggregation en
dcterms.subject harvesting en
dcterms.subject data warehousing en
dcterms.subject data integration en
cic.version info:eu-repo/semantics/publishedVersion es De Giusti, Marisa Raquel es Lira, Ariel Jorge es Oviedo, Néstor es
cic.lugarDesarrollo Universidad Nacional de La Plata es
dcterms.subject.materia Computer and Information Sciences en
dcterms.identifier.url External link en
dcterms.isPartOf.issue VI Simposio es
dcterms.isPartOf.series Simposio Internacional de Bibliotecas Digitales (Brasil, 2011) es
cic.isPeerReviewed true es
cic.isFulltext true es


  • Paper (4 pp.), PDF file (381.8Kb)

  • Presentation (28 slides), PDF file (233.3Kb)
