Treemaps

The following treemap visualizations are by Marc Alexander, and are based on work done in papers published in 2010 and 2012. Clicking on any image will take you to a much larger version for downloading. You are welcome to use any of these visualizations for research or teaching, although we ask that you credit their source. Please contact marc.alexander@glasgow.ac.uk with any comments or queries regarding them.

1. Modern English

This first image represents all of modern English. Each dot represents one word, and the dots are arranged by meaning, so that words close to each other in meaning sit next to each other. The colour varies from black (meaning the word is first found in Old English) to yellow (meaning the word is first cited recently). There are 469,470 words displayed here in all.
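As a rough illustration of the colour scale described above, the sketch below maps a word's first-citation year onto a shade between black and yellow. The linear interpolation, the function name `first_citation_colour`, and the endpoint years 700 and 2000 are all assumptions made for this example; the text does not state how the published images compute their colours.

```python
def first_citation_colour(year, earliest=700, latest=2000):
    """Map a first-citation year to an (R, G, B) triple between black and yellow.

    The default endpoint years and the linear scale are assumptions for
    illustration only, not values taken from the Thesaurus visualizations.
    """
    # Clamp so that years outside the assumed range take the end colours.
    clamped = max(earliest, min(latest, year))
    fraction = (clamped - earliest) / (latest - earliest)
    level = round(255 * fraction)
    # Yellow is full red plus full green with no blue, so only the
    # red and green channels grow as the word gets more recent.
    return (level, level, 0)


if __name__ == "__main__":
    for year in (700, 1400, 1750, 2000):
        print(year, first_citation_colour(year))
```

Under these assumptions, an older first citation always yields a darker dot than a more recent one, which is the property the images rely on.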

2. Modern English With Key

This second image acts as a key for the first: it overlays the major semantic fields on top of the original visualization. This view shows how old or how recent the words making up each semantic field are. For example, Physics is fairly bright, meaning that it is, unsurprisingly, a field made up mostly of new words, while Action is made up of many words inherited from Middle and Old English. Similarly, Faith, Emotion, Time, and Death are dark and filled with old words, whereas Travel, Chemistry and Biology are filled with more recent innovations.

It is evident from these images that modern English is characterized by a "patchwork" effect, with some areas clearly set apart from others by their high proportion of very recent or very old words. This phenomenon makes it possible to see the areas where the English language has "stretched" in order to accommodate new vocabulary. The light patches within the greater patchwork are areas of recent lexical innovation, made up of clusters of words first recorded in the Thesaurus database in recent years. We therefore see this patchwork effect in areas affected by rapid social, technological or academic growth, such as Computing (inside Number, adjacent to Mathematics), Physics, Chemistry, Linguistics, Communication, Travel, and so on. Conversely, darker and therefore older patches cover existence in Time and Space, Creation, Causation, Faith, Emotion, and the parts of Number which refer to Arithmetic or Enumeration.

2. Slices from the History of English

This patchwork effect is most pronounced in modern English, and is substantially reduced when other selections of the data are taken. If the present-day data is thought of as a "slice" of the Thesaurus, then other such slices can be taken between the Old English period and the present day. These other slices reveal that the patchwork effect, with its pronounced edges between new and old semantic fields, does not occur throughout the rest of the history of the language.
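One way to picture a "slice" is as a filter over dated word records. The sketch below is a minimal illustration under stated assumptions: the `Word` record, its `first_cited` and `last_cited` fields, the `take_slice` function and the sample entries are all hypothetical, and the rule used (a word belongs to a slice if its attested lifespan overlaps the period) is only one plausible reading of how such slices might be taken.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    form: str
    first_cited: int            # year of first citation
    last_cited: Optional[int]   # year of last citation; None if still in use


def take_slice(words: List[Word], start: int, end: int) -> List[Word]:
    """Return the words whose attested lifespan overlaps the years start..end."""
    return [
        w for w in words
        if w.first_cited <= end
        and (w.last_cited is None or w.last_cited >= start)
    ]


# Invented placeholder records, not real Thesaurus data.
sample = [
    Word("word_a", 900, None),
    Word("word_b", 1600, 1750),
    Word("word_c", 1950, None),
]

if __name__ == "__main__":
    for label, (start, end) in {
        "Shakespeare": (1564, 1616),
        "Johnson": (1709, 1784),
    }.items():
        print(label, [w.form for w in take_slice(sample, start, end)])
```

With a filter of this kind, each period's slice contains only the vocabulary attested at that time, which is why earlier slices are smaller than the present-day one.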

Proceeding backwards, as below, from the present day through the Late Modern, Early Modern and Middle English periods, there is a visible reduction in the patchwork effect observed above, so that by Middle English almost no delineated rectangular patches of innovation can be found.

3. Recorded English in the Age of Samuel Johnson (1709–1784, approximately 248,000 words)

4. Recorded English in the Age of William Shakespeare (1564–1616, approximately 208,000 words)

5. Recorded English in the Age of Geoffrey Chaucer (c.1340–1400, approximately 73,000 words)

6. Comparative Sizes of the Modern English, Johnson, Shakespeare, and Chaucer Slices

These patchwork effects therefore appear over time as a result of rapid growth, generally following trauma in the history of the language, including fundamental and rapid progress in the external and social worlds.
