With millions of books scanned and digitized by Google, a new type of linguistic analysis has become possible, as people are now able to delve into hundreds of years' worth of data. Matjaž Perc of the University of Maribor, Slovenia, has crunched the numbers and analyzed the most commonly used words and phrases in English, stretching back to the 1500s.
His paper, "Evolution of the most common English words and phrases over the centuries", has just been published in the Journal of the Royal Society Interface, but the data he used is free to paw through online.
Unsurprisingly, "the", "of", and "and" are the most common words throughout the entire time period, but once you get into the longer phrases, things get a bit more interesting.
In the first corpus, from 1520, the most common phrases seem related to the church or ruling. A tumultuous period of rapid change follows, and then the early 1700s bring a period of heavy religious references. But by the 1800s, the text has settled down, and it remains very similar to what was scanned in the last few years, full of "the greater part of the" and "in the course of the".
Perc's analysis points out that popular phrases now last far longer than they used to, saying that they "had a much shorter popularity lifespan in the sixteenth century than they had in the twentieth century." Given that the number of books published every year has doubtless increased, it makes sense that the language would plateau, as it would take far more leverage to disrupt each trend.
Top image: David Flores/Flickr.com