DBUnion Archive Merger Component


Download : DBUnion-1.1.tar.gz
Download : DBUnion-1.01.tar.gz

Description
-----------

Merges multiple OAI-accessible archives into a single archive for
local storage and processing, and exposes the result through a
pseudo-OAI (ODL-Union) interface.


Features
--------

- Works with any OAI (PMH v1.0/1.1/2.0) or ODL (XOAIPMH v1.0) archive
- Strict compliance with ODL-Union protocol as specified on the ODL
  website (http://oai.dlib.vt.edu/odl)
- No installation or compilation - Perl scripts need only be copied
  (requires a database and the DBI database connection module; all
  other modules are built-in)
- Code layout for separate components or libraries of components
- One installation can easily be used for multiple union engines
- ./configure.pl sets all parameters
- Tested with MySQL (uses standard SQL)
- Stores any metadata format
- Supports storing of sets from individual archives, and places
  each component archive in its own set
- All extensions, configurations, and containers are specified 
  using XML Schema
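
As an illustration of how the stored sets might be queried through the
union interface, the sketch below builds standard OAI-PMH request URLs.
The base URL and the set name "archive1" are hypothetical, and the
ODL-Union protocol may accept arguments beyond the plain OAI-PMH ones
shown here.

```shell
#!/bin/sh
# Sketch only. The base URL and the set name below are hypothetical.

# Build an OAI-PMH request URL: base URL, verb, then optional key=value
# arguments. Values are appended as given (no URL encoding is applied).
oai_url() {
    base=$1 verb=$2
    shift 2
    url="${base}?verb=${verb}"
    for arg in "$@"; do
        url="${url}&${arg}"
    done
    printf '%s\n' "$url"
}

# Hypothetical location of the union.pl script.
BASE="http://localhost/cgi-bin/DBUnion/cstc/union.pl"

oai_url "$BASE" Identify
oai_url "$BASE" ListSets
# "archive1" stands in for the set assigned to one component archive.
oai_url "$BASE" ListRecords metadataPrefix=oai_dc set=archive1
```

Each printed URL can be fetched with any HTTP client to inspect the
XML response.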


Requirements
------------

- MySQL or a similar database, with permission to create tables in
  an existing database
- Perl, with modules DBI, DBD::mysql (or DBD::Pg or ...)
- Ability to run CGI scripts
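
One way to confirm that the Perl modules are available before
configuring is sketched below. The helper name check_perl_module is
made up for this example, and DBD::mysql should be replaced with the
driver for the database actually in use.

```shell
#!/bin/sh
# Sketch only: check_perl_module is a helper invented for this example.

# Print "ok <module>" if Perl can load the module, "missing <module>" otherwise.
check_perl_module() {
    if perl -M"$1" -e 1 2>/dev/null; then
        echo "ok $1"
    else
        echo "missing $1"
    fi
}

check_perl_module DBI
check_perl_module DBD::mysql   # or DBD::Pg, depending on the database
```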


Instructions
------------

1. Copy all files with default directory structure into a directory
   from which CGI scripts may be run

2. Change to the ODL-DBunion-1.1/DBUnion directory

3. Run './configure.pl' with the name of the archive to index as its
   parameter. For example,
      ./configure.pl cstc

4. If necessary, edit config.xml in the directory corresponding to the
   archive name. Rerunning configure.pl is preferable to editing by
   hand, since the script also performs sanity checks.
   
5. Test the harvester
   - run harvest.pl from the archive directory
   - check the harvest.log file to see if new items were processed

6. Run the harvest.pl script from a scheduler such as cron as often as
   desired - every 10 minutes is a good starting point. The Harvester's
   scheduling algorithm only starts a harvest once the interval
   specified in the configuration has elapsed, so running the script
   more often than that does not trigger extra harvests.

7. Test the ODL-Union (extended OAI) interface
   - use the Repository Explorer at http://purl.org/net/oai_explorer
     and point it to the 'union.pl' script in the archive directory
     
8. Create additional union archives using the same procedure - 
   each union archive will have its own directory and must use
   a different database
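
The scheduling behaviour described in step 6 can be pictured with the
sketch below. It is not the Harvester's own code: the function name
harvest_due, the one-hour interval, and the installation path in the
crontab comment are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: this is not the Harvester's own code.
#
# A crontab entry along these lines would run the harvester every
# 10 minutes (the installation path is hypothetical):
#   */10 * * * * cd /var/www/cgi-bin/DBUnion/cstc && ./harvest.pl

# Succeed (exit status 0) when at least INTERVAL seconds separate NOW
# from LAST. All three arguments are in seconds.
harvest_due() {
    now=$1 last=$2 interval=$3
    [ $((now - last)) -ge "$interval" ]
}

# With a hypothetical one-hour interval and a last harvest at t=0,
# report for a few run times whether a harvest would start.
for t in 600 1800 3600; do
    if harvest_due "$t" 0 3600; then
        echo "t=$t: harvest"
    else
        echo "t=$t: skip"
    fi
done
```

Under these assumptions the scheduler can fire frequently while the
configured interval alone decides how often the source archives are
actually contacted.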


Module Layout
-------------

DBUnion/template:
 - scripts to interface with the component

lib/Pure:
 - utility modules (pure Perl)

lib/OAI:
 - OAI template modules
 - OAISP = service provider
 - OAIDP = data provider
 
lib/XOAI:
 - XOAI extensions to the OAI modules
 - XOAISP = service provider
 - XOAIDP = data provider
 - Harvester = scheduled harvester for ODL+OAI archives

lib/ODL/DBUnion:
 - Union archive engine using a database
 - UnionDB = database creation and open routine
 - UnionSP = union archive service provider/harvester
 - UnionDP = union archive data provider


Links/Acknowledgements
----------------------

This software is part of the larger project to build componentized
Digital Libraries based on the work of the Open Archives Initiative.
See http://oai.dlib.vt.edu/odl and http://www.openarchives.org for
more information.

This is a research project, and we are always interested in 
feedback - questions, comments, and suggestions for improvement.
Please contact hussein@vt.edu as appropriate.


Last updated : 1 August 2002