Archiving Blogs

Conceptually, archiving a blog is simple:

  1. listen to the RSS feed of each blog being archived
  2. use a workflow pipeline to:
    1. convert the text of each post to a PDF file
      1. include the source URL of the blog post and other provenance information
    2. auto-ingest the PDF, with its provenance information, into a repository
  3. create one collection of PDFs per blog
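
The steps above can be sketched roughly as follows. This is a minimal illustration, not a finished design: it assumes RSS 2.0 feeds, uses an in-memory dictionary in place of a real repository, and stores the post text where the PDF would go, since the choice of PDF converter and repository is left open here. The names SAMPLE_FEED, parse_feed, provenance_for and archive are hypothetical, as are the provenance fields beyond the source URL.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample feed, standing in for a feed fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://blog.example.org/</link>
    <item>
      <title>First post</title>
      <link>https://blog.example.org/first-post</link>
      <guid>https://blog.example.org/first-post</guid>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
      <description>Hello, archive.</description>
    </item>
  </channel>
</rss>"""


def parse_feed(rss_xml):
    """Step 1: read an RSS 2.0 document; return (blog title, list of posts)."""
    channel = ET.fromstring(rss_xml).find("channel")
    blog_title = channel.findtext("title", default="")
    posts = []
    for item in channel.findall("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "source_url": item.findtext("link", default=""),
            "guid": item.findtext("guid", default=""),
            "published": item.findtext("pubDate", default=""),
            "text": item.findtext("description", default=""),
        })
    return blog_title, posts


def provenance_for(blog_title, post, retrieved_at):
    """Step 2.1.1: collect the provenance that travels with each archived post."""
    return {
        "blog": blog_title,
        "source_url": post["source_url"],
        "guid": post["guid"],
        "published": post["published"],
        "retrieved": retrieved_at,
    }


def archive(rss_xml, repository, seen, retrieved_at=None):
    """Steps 2.2 and 3: ingest unseen posts into one collection per blog.

    `repository` maps blog title -> list of archived records (an assumption
    standing in for a real repository); `seen` holds already-archived ids.
    Returns the number of newly archived posts.
    """
    retrieved_at = retrieved_at or datetime.now(timezone.utc).isoformat()
    blog_title, posts = parse_feed(rss_xml)
    added = 0
    for post in posts:
        key = post["guid"] or post["source_url"]
        if key in seen:
            continue  # already archived on an earlier poll of the feed
        seen.add(key)
        repository[blog_title].append({
            "provenance": provenance_for(blog_title, post, retrieved_at),
            "content": post["text"],  # placeholder: the PDF would go here
        })
        added += 1
    return added


repository = defaultdict(list)
seen = set()
new_posts = archive(SAMPLE_FEED, repository, seen)
```

Polling the same feed again adds nothing, because each post is keyed by its guid (or its link when no guid is present). A real deployment would replace the "content" placeholder with the output of whichever PDF converter the pipeline uses.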

Lateral thought

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License