Culturomics needs large-scale digital resources to study cultural trends quantitatively. Fortunately, more and more resources are becoming available.
Time series for two billion words and phrases, based on 5.2 million books written in seven languages. Beware: the data is large. In the coming weeks, we will add a guide on how to set up your own culturomic server.
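Because the raw files are large, it usually helps to stream them row by row rather than load them into memory. The sketch below sums yearly counts for a single n-gram. It assumes tab-separated rows laid out as n-gram, year, match count, followed by any extra columns; that layout and the function name yearly_counts are assumptions for illustration, so check them against the files you actually download.

```python
import csv
import io
from collections import defaultdict


def yearly_counts(lines, target):
    """Sum match counts per year for one n-gram, reading rows one at a time."""
    totals = defaultdict(int)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumed layout: ngram, year, match_count, then optional extra columns.
        if len(row) >= 3 and row[0] == target:
            totals[int(row[1])] += int(row[2])
    return dict(sorted(totals.items()))


# Tiny in-memory stand-in for a real file; the numbers are made up.
sample = io.StringIO(
    "banana\t1900\t12\t10\t8\n"
    "banana\t1901\t15\t14\t9\n"
    "bandana\t1900\t3\t3\t2\n"
)
print(yearly_counts(sample, "banana"))
```

For a real file, pass an open file handle in place of the StringIO object; only one row is held in memory at a time.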
Get Ngrams: basic Python code that retrieves the data behind the trajectories plotted on the Google Books Ngram Viewer. Type in the same string you would enter on books.google.com/ngrams, and retrieve the data in TSV format.
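To illustrate the shape of that output, here is a minimal sketch that splits a comma-separated query string into individual n-grams and writes per-year values as TSV. It is not the Get Ngrams code itself: the function names split_query and to_tsv are hypothetical, the frequencies are placeholders rather than real data, and the retrieval step is deliberately left out.

```python
def split_query(query):
    """Split a comma-separated query string into individual n-grams."""
    return [part.strip() for part in query.split(",") if part.strip()]


def to_tsv(series, year_start):
    """Render {ngram: [value per year]} as TSV text with a header row."""
    names = list(series)
    lines = ["year\t" + "\t".join(names)]
    # Use the shortest series so every row has a value in each column.
    n_years = min((len(values) for values in series.values()), default=0)
    for i in range(n_years):
        cells = [str(year_start + i)] + [str(series[name][i]) for name in names]
        lines.append("\t".join(cells))
    return "\n".join(lines) + "\n"


ngrams = split_query("banana, United States of America")
# Placeholder values only; real frequencies would come from the viewer's data.
placeholder = {name: [0.0, 0.0, 0.0] for name in ngrams}
print(to_tsv(placeholder, 1900))
```

Keeping one column per n-gram and one row per year makes the result easy to load into a spreadsheet or plotting tool.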
Google Labs N-gram Viewer is the first tool of its kind, capable of precisely and rapidly quantifying cultural trends based on massive quantities of data. The viewer lets you examine the frequency of words (‘banana’) or phrases (‘United States of America’) in books over time. You’ll be searching through over 5.2 million books: ~4% of all books ever published!
Useful Links
HathiTrust Digital Library – A group of libraries and research institutions whose goal is to digitally preserve cultural records.
Europeana – Explore the digital resources of European museums and galleries.