OAI-PMH feed agglomeration

Description of problem

The National Library of Australia (NLA) can only harvest a single OAI-PMH feed from us. It can accept this either as a feed or as a single file uploaded using Ceres. The NLA process is described in the attached document.

We have multiple repositories that we wish to expose, and each repository creates its own OAI-PMH feed. To give the NLA post-processing services a single source, we need to produce one agglomerated feed.

Approaches

  1. Use a feed concatenation tool such as Moai.
  2. Develop an in-house script to combine feeds. Such a script must be flexible and able to produce multiple output feeds from multiple inputs. As the work is essentially XML feed manipulation and concatenation, it should be relatively straightforward.

Solution

Develop a Java servlet that periodically performs an HTTP GET on each collection's OAI-PMH feed, concatenates the responses as text, and re-exports the result as a single feed.
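
The fetch-and-concatenate step might look something like the sketch below. It is illustrative only and is not the servlet's actual code: the class name FeedAgglomerator and its methods are hypothetical, and it assumes each source feed returns a plain ListRecords response with unprefixed record elements whose metadata does not itself contain an element called record. Resumption tokens issued by the source feeds are ignored here.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: plain text concatenation of OAI-PMH ListRecords responses.
class FeedAgglomerator {

    // Matches each <record>...</record> element; DOTALL lets '.' span newlines.
    private static final Pattern RECORD =
            Pattern.compile("<record\\b.*?</record>", Pattern.DOTALL);

    /** Pulls the raw text of every record element out of one ListRecords response. */
    static List<String> extractRecords(String feedXml) {
        List<String> records = new ArrayList<>();
        Matcher m = RECORD.matcher(feedXml);
        while (m.find()) {
            records.add(m.group());
        }
        return records;
    }

    /** Wraps the records from all source feeds in a single OAI-PMH envelope. */
    static String merge(List<String> feeds, String baseUrl, String responseDate) {
        StringBuilder out = new StringBuilder();
        out.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
           .append("<OAI-PMH xmlns=\"http://www.openarchives.org/OAI/2.0/\">\n")
           .append("<responseDate>").append(responseDate).append("</responseDate>\n")
           .append("<request verb=\"ListRecords\">").append(baseUrl).append("</request>\n")
           .append("<ListRecords>\n");
        for (String feed : feeds) {
            for (String record : extractRecords(feed)) {
                out.append(record).append('\n');
            }
        }
        out.append("</ListRecords>\n</OAI-PMH>\n");
        return out.toString();
    }

    /** Fetches one source feed with an HTTP GET (requires Java 11 or later). */
    static String fetch(HttpClient client, String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

A servlet could call fetch for each source URL on a timer, pass the responses to merge, and cache the resulting string to serve on request. Text matching keeps the code short, but an XML parser would be the safer choice if the source feeds use namespace prefixes or nested record elements.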

Advantages

  • Relatively simple and quick to implement
  • Original OAI-PMH feeds remain available for separate harvest
  • No requirement for a dedicated VM for the agglomeration service

Disadvantages

  • Potential lack of flexibility if we wish to produce multiple agglomerations as separate feeds. Mitigation: specifying a suitable URL allows a custom agglomeration.
  • Multiple feeds would otherwise require multiple servlets. Custom URL specification bypasses this, although running multiple servlets may still be appropriate for performance reasons.
  • The out-of-the-box servlet did not support resumption tokens. It needs to be modified to buffer the records and respond to a resumption token.
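
The buffering needed for resumption tokens could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the servlet's real implementation: the class ResumptionBuffer is hypothetical, the token is simply the offset of the next record in the buffer, and token expiry and changes to the buffer between requests are not handled.

```java
import java.util.List;

// Hypothetical helper: pages through buffered records using offset-based tokens.
class ResumptionBuffer {
    private final List<String> records; // buffered <record> elements from all feeds
    private final int pageSize;

    ResumptionBuffer(List<String> records, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.records = List.copyOf(records);
        this.pageSize = pageSize;
    }

    /** Converts a token into a start offset; null or empty means the first page. */
    int offsetFor(String token) {
        if (token == null || token.isEmpty()) {
            return 0;
        }
        int offset;
        try {
            offset = Integer.parseInt(token);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("badResumptionToken", e);
        }
        if (offset <= 0 || offset >= records.size()) {
            throw new IllegalArgumentException("badResumptionToken");
        }
        return offset;
    }

    /** Builds the content of the ListRecords element for the requested page. */
    String page(String token) {
        int start = offsetFor(token);
        int end = Math.min(start + pageSize, records.size());
        StringBuilder out = new StringBuilder();
        for (String record : records.subList(start, end)) {
            out.append(record).append('\n');
        }
        // A token element is only needed when the list spans more than one page.
        if (records.size() > pageSize) {
            out.append("<resumptionToken completeListSize=\"").append(records.size())
               .append("\" cursor=\"").append(start).append("\">");
            if (end < records.size()) {
                out.append(end); // offset at which the next request resumes
            }
            // On the last page the element is left empty to signal completion.
            out.append("</resumptionToken>\n");
        }
        return out.toString();
    }
}
```

The servlet would call page with the resumptionToken request parameter, if present, and map the IllegalArgumentException to an OAI-PMH badResumptionToken error response. Using the offset as the token keeps the sketch stateless per harvester; a production version would probably tie each token to a snapshot of the buffer.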

Technical
