This is a basic Python script for retrieving the data behind trajectories plotted on the Google Books Ngram Viewer. Type in the same string you would enter in the Google Ngram Viewer and retrieve the data in TSV format. By default, the data is printed on screen and saved to the current directory. For more on the types of queries accepted, see this info page.
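Once a query has been saved, the TSV file can be read back with Python's standard csv module. The sketch below is only an illustration: the layout it assumes (a header row, the year in the first column, one column per n-gram) is a guess about the saved file, not something this page states, and load_tsv is a hypothetical helper name.

```python
import csv


def load_tsv(path):
    """Read a tab-separated results file.

    ASSUMPTION: the first row is a header and the first column holds the year;
    every other column holds one numeric series per n-gram.
    Returns (ngram_names, rows), where each row is (year, [values...]).
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        rows = [
            (int(record[0]), [float(value) for value in record[1:]])
            for record in reader
            if record  # skip blank lines
        ]
    return header[1:], rows
```

If the saved file turns out to use a different layout, only the indexing inside load_tsv should need to change.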
Albert Einstein, Charles Darwin
Pearl Harbor, Watergate -corpus=eng_2009 -nosave
bells and whistles -startYear=1900 -endYear=2001 -smoothing=2
(electric car) * 10,solar energy
-corpus=CORPUS [default: eng_2012] runs the query in CORPUS. Possible values are listed below.
-startYear=YEAR [default: 1800] starts the query in YEAR (integer).
-endYear=YEAR [default: 2000] ends the query in YEAR (integer).
-smoothing=N [default: 3] smoothing parameter N (integer). Minimum is 0.
-nosave results will not be saved to file.
results will not be printed on screen.
prints this screen.
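To make the flag syntax concrete, here is a minimal sketch of how an input line could be split into the n-gram query and its options. It is not the script's actual parser: parse_query is a hypothetical name, and only the flags that appear in the examples above (-corpus, -startYear, -endYear, -smoothing, -nosave) are handled, with the defaults stated above.

```python
# Defaults as documented above.
DEFAULTS = {"corpus": "eng_2012", "startYear": 1800, "endYear": 2000, "smoothing": 3}


def parse_query(line):
    """Split an input line into (query_text, options).

    Hypothetical helper: tokens that look like known flags become options,
    everything else is kept, in order, as part of the query.
    """
    options = dict(DEFAULTS, save=True)
    words = []
    for token in line.split():
        if token == "-nosave":
            options["save"] = False
        elif token.startswith("-") and "=" in token:
            name, value = token[1:].split("=", 1)
            if name not in DEFAULTS:
                words.append(token)  # unknown flag: leave it in the query
            elif name == "corpus":
                options[name] = value
            else:
                options[name] = int(value)  # years and smoothing are integers
        else:
            words.append(token)
    if options["smoothing"] < 0:
        raise ValueError("smoothing must be 0 or greater")
    return " ".join(words), options
```

For example, parse_query("Pearl Harbor, Watergate -corpus=eng_2009 -nosave") keeps "Pearl Harbor, Watergate" as the query, sets the corpus to eng_2009, and turns saving off.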
eng_2012, eng_2009, eng_us_2012, eng_us_2009, eng_gb_2012, eng_gb_2009,
chi_sim_2012, chi_sim_2009, fre_2012, fre_2009, ger_2012, ger_2009,
spa_2012, spa_2009, rus_2012, rus_2009, heb_2012, heb_2009, ita_2012,
eng_fiction_2012, eng_fiction_2009, eng_1m_2009
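A corpus name can be checked against this list before a query is sent. The sketch below copies the names exactly as listed above; check_corpus and split_corpus are hypothetical helpers, and the reading of the trailing number as the corpus year is an assumption based on the naming pattern.

```python
# Corpus names exactly as listed above.
CORPORA = {
    "eng_2012", "eng_2009", "eng_us_2012", "eng_us_2009", "eng_gb_2012", "eng_gb_2009",
    "chi_sim_2012", "chi_sim_2009", "fre_2012", "fre_2009", "ger_2012", "ger_2009",
    "spa_2012", "spa_2009", "rus_2012", "rus_2009", "heb_2012", "heb_2009", "ita_2012",
    "eng_fiction_2012", "eng_fiction_2009", "eng_1m_2009",
}


def check_corpus(name):
    """Return name unchanged if it is a listed corpus, else raise ValueError."""
    if name not in CORPORA:
        raise ValueError("unknown corpus: %r" % name)
    return name


def split_corpus(name):
    """Split e.g. 'eng_us_2012' into ('eng_us', 2012).

    ASSUMPTION: the part after the last underscore is the corpus year.
    """
    collection, year = check_corpus(name).rsplit("_", 1)
    return collection, int(year)
```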
Note to savvy users:
- You can pass queries directly as arguments, such as
python getNgrams.py awesome
- If you pass the ‘-quit’ flag as an argument, the program will run once and quit without asking for more input:
python getNgrams.py awesome, sauce -quit
- Known caveat: quotation marks are removed from the input query.
- License: none, please distribute, modify and improve as you see fit.
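The command-line usage described in the notes above could be handled roughly as follows. This is a sketch rather than the script's own code: build_query is a hypothetical name, and stripping double quotes is simply one way to reproduce the caveat about quotation marks.

```python
import sys


def build_query(argv):
    """Turn command-line arguments into (query, run_once).

    Hypothetical helper: argv[0] is the script name, '-quit' means
    "run once, then exit", and every other argument is joined into the query.
    """
    args = argv[1:]
    run_once = "-quit" in args
    query = " ".join(arg for arg in args if arg != "-quit")
    # Mirrors the known caveat: quotation marks do not survive in the query.
    query = query.replace('"', "")
    return query, run_once


if __name__ == "__main__":
    query, run_once = build_query(sys.argv)
    print(repr(query), run_once)
```

With the arguments from the example, the query becomes "awesome, sauce" and run_once is True.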
PLEASE do respect the terms of service of the Google Books Ngram Viewer while using this code. This code is meant to help users retrieve the data behind a few queries, not to hammer Google’s servers with thousands of queries. The complete dataset can be downloaded freely from Google’s website.
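If you do script a handful of queries, one simple way to stay polite is to cap the batch size and pause between requests. The sketch below is an illustration only: run_politely and its fetch argument are hypothetical, and the delay and cap values are arbitrary choices, not limits taken from Google's terms of service.

```python
import time


def run_politely(queries, fetch, delay_seconds=2.0, max_queries=20):
    """Run fetch(query) for each query, pausing between calls.

    ASSUMPTIONS: fetch is whatever function performs a single query;
    delay_seconds and max_queries are arbitrary illustrative values.
    """
    queries = list(queries)
    if len(queries) > max_queries:
        raise ValueError("too many queries; download the full dataset instead")
    results = []
    for index, query in enumerate(queries):
        if index > 0:
            time.sleep(delay_seconds)  # wait before every call except the first
        results.append(fetch(query))
    return results
```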
This code is not a Google product and is not endorsed by Google in any way. Contact us at firstname.lastname@example.org, @culturomics or @jb_michel
With this in mind… happy plotting!