Auto-generating Content

Sat 16 August 2014
tags: meta

This is the second post in the series about this very website, and this time I shall say a few words about the auto-generated content on the site.

Some parts of the website are almost entirely auto-generated, as it would be cumbersome to write them by hand. The most notable example is the publications page. I essentially wanted a page presenting all the publications we have on the popular physics pre-print server, arxiv.org. It seemed kind of stupid to copy and paste the contents of the arXiv page by hand, especially when we have the technology to do this automatically! An additional advantage is that if anyone in my group updates the articles (e.g. to add details of publication in a peer-reviewed journal) then all I have to do is rebuild the website to pull in the updates.

To this end I use the excellent Mechanize library to interact with the arXiv website and submit a search query for my full name. I then parse the search results using Beautiful Soup 4 and pull down further details of the publications (abstracts, author lists, etc.). Once the data has been parsed into Python objects, it is just a matter of rendering the data using a Jinja template.
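
To give a flavour of the parse-then-render step, here is a minimal sketch. It is not the site's actual build code: so that it runs with no extra packages and no network access, it swaps in the standard library's html.parser for Beautiful Soup and string.Template for Jinja, and it skips the Mechanize query entirely. The sample HTML, its class names, the Publication class and the helper functions are all made up for illustration and do not reflect the real arXiv markup.

```python
from dataclasses import dataclass
from html import escape
from html.parser import HTMLParser
from string import Template

# Hypothetical stand-in for a search results page (NOT the real arXiv markup).
SAMPLE_HTML = """
<ul>
  <li class="paper"><span class="title">First Example Paper</span>
      <span class="authors">A. Author, B. Author</span></li>
  <li class="paper"><span class="title">Second Example Paper</span>
      <span class="authors">C. Author</span></li>
</ul>
"""


@dataclass
class Publication:
    """Illustrative container for one parsed search result."""
    title: str
    authors: list


class PaperParser(HTMLParser):
    """Collects the title and author text from each <li class="paper">."""

    def __init__(self):
        super().__init__()
        self.publications = []
        self._current = None  # raw text of the entry being read, if any
        self._field = None    # which field ("title"/"authors") is open

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "paper":
            self._current = {"title": "", "authors": ""}
        elif tag == "span" and self._current is not None and cls in self._current:
            self._field = cls

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            authors = [a.strip() for a in self._current["authors"].split(",")]
            title = self._current["title"].strip()
            self.publications.append(Publication(title, authors))
            self._current = None


def parse_publications(html):
    """Turn the results HTML into a list of Publication objects."""
    parser = PaperParser()
    parser.feed(html)
    return parser.publications


# Stand-in for a Jinja template: one list item per publication.
ENTRY = Template("<li><strong>$title</strong> ($authors)</li>")


def render_publications(publications):
    """Render the parsed objects as an HTML list, escaping the text."""
    items = [
        ENTRY.substitute(title=escape(p.title),
                         authors=escape(", ".join(p.authors)))
        for p in publications
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


publications = parse_publications(SAMPLE_HTML)
page = render_publications(publications)
print(page)
```

The useful property of this arrangement is that parsing and rendering only meet at the list of Python objects, so a change in the source markup touches the parser alone and a change in page layout touches the template alone.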