A Guide to the Thesaurus

The following page gives a guide to the use of the Historical Thesaurus of English. Here, we outline the Thesaurus’ overall structure, provide details of the different types of information to be found in a typical entry, and explain the conventions that the Thesaurus uses.

1. Content

The Historical Thesaurus of English contains almost every recorded word in English from Old English to the present day, based on information in the Oxford English Dictionary (OED), and supplemented in particular by A Thesaurus of Old English (TOE, second edition 2000), which provided words from c700-1150 AD not included in the OED. The Bibliography gives full details of the latter text and the Sources page lists the other specialist dictionaries of Old English used in compiling the Thesaurus.

The Historical Thesaurus draws directly on the senses and the dating of words in the OED, with some further information where necessary. Where a word has more than one sense in the OED, it will appear in more than one Thesaurus category. In each Thesaurus entry, synonyms are presented in chronological order according to the first recorded date of each word’s use in English, with the earliest synonyms first. The word’s last recorded date, where appropriate, is also given. The Thesaurus therefore lists both obsolete words and obsolete meanings of current words, as well as offering a comprehensive treatment of current English.

As well as its dates of recorded use, each word is accompanied where appropriate by any restrictive register, style, frequency, or geographical labels given in the OED, such as poetic, Anthropology, or US.

2. Structure

The Thesaurus is organized so as to present words and their dates of recorded use within a highly detailed semantic framework of categories and subcategories: that is, words are arranged according to their meanings, rather than alphabetically.

Within this structure, categories (and subcategories within categories) are ordered from the most general to the most specific, and words within categories and subcategories are ordered chronologically, with earliest words first. Semantic categories are separated by their grammatical category (noun, verb, and so on), and are arranged in a fixed order of parts of speech, listed below.

These principles are discussed in more detail in the following sections.

2.1 Semantic categories

The Thesaurus is organized into three major sections, reflecting the main activities and preoccupations of users of the language: the external world, the mental world, and the social world.

These in turn are divided into up to seven levels, or ‘tiers’, of semantic category. The highest level, tier 1, is identified by a single pair of numbers. The tier 1 sections are divided into 26 further semantic categories (tier 2), each identified by a second pair of numbers, and below these major sections are five more levels of categories, giving each word its place in a hierarchy of meaning. The hierarchy therefore descends in numbered stages from the very general to the minute. The more number pairs a heading has, the lower its place in the semantic framework. Headings with the same number of pairs are at the same semantic level.

A heading such as 03.13.03.05.17 (adj.) Cinematography therefore locates the category as:

  • being within tier 5, as it has five sets of paired numbers in its category number;
  • containing adjectives;
  • being within its parent categories 03.13.03.05 Performance Arts, 03.13.03 The Arts, 03.13 Leisure, and 03 Society;
  • and being alongside other sister categories such as 03.13.03.05.03 Drama, 03.13.03.05.07 Minstrelsy, 03.13.03.05.09 Circus performance, and so forth.
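
As a minimal sketch of how such a heading number can be read, the Python function below counts the number pairs to find the tier and lists the parent categories from the nearest upwards. The function name and the returned dictionary are our own illustration, not part of the Thesaurus site.

```python
def describe_category(number):
    """Split a main-sequence category number such as '03.13.03.05.17'."""
    pairs = number.split(".")
    # Each element must be a two-digit pair, and there are at most seven tiers.
    if not 1 <= len(pairs) <= 7 or any(len(p) != 2 or not p.isdigit() for p in pairs):
        raise ValueError(f"not a valid category number: {number!r}")
    # Ancestors run from the immediate parent up to the tier 1 section.
    ancestors = [".".join(pairs[:i]) for i in range(len(pairs) - 1, 0, -1)]
    return {"tier": len(pairs), "ancestors": ancestors}


info = describe_category("03.13.03.05.17")
print(info["tier"])       # 5
print(info["ancestors"])  # ['03.13.03.05', '03.13.03', '03.13', '03']
```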

2.2 Semantic subcategories

Any of the seven major tiers described above may be further divided into one to five tiers of subcategories, representing further, very detailed, levels of meaning. These are displayed below the main-sequence category, under the heading ‘subcategories’. Subcategory numbers are usually written with a vertical bar, for example 03.13.03.05.17|01.03 (adj.) for Cinematography’s American/Hollywood films subcategory. We recommend citing them with the parent main-sequence category first and then the subcategory. The style used in the ‘Cite’ links on each Thesaurus entry in this site is to list the main category and then the subcategories, separating each with doubled colons, such as ‘Cinematography :: films/the cinema :: American/Hollywood’.
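
The vertical-bar notation and the doubled-colon citation style can be sketched in Python as follows. Both helper names are hypothetical, and the sketch assumes that subcategory tiers are separated by full stops in the same way as main-sequence tiers, as in the example number above.

```python
def split_subcategory(number):
    """Split '03.13.03.05.17|01.03' into its main category and subcategory tiers."""
    main, bar, sub = number.partition("|")
    # No bar (or nothing after it) means a plain main-sequence category.
    sub_tiers = sub.split(".") if bar and sub else []
    if len(sub_tiers) > 5:
        raise ValueError("at most five tiers of subcategory are expected")
    return main, sub_tiers


def cite(headings):
    """Join the main heading and its subcategory headings with doubled colons."""
    return " :: ".join(headings)


print(split_subcategory("03.13.03.05.17|01.03"))
# ('03.13.03.05.17', ['01', '03'])
print(cite(["Cinematography", "films/the cinema", "American/Hollywood"]))
# Cinematography :: films/the cinema :: American/Hollywood
```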

2.3 An illustration

The rationale behind the semantic structure of Thesaurus categories and subcategories and their interrelationship can be seen in more detail in the following abridged example, taken from part of the Armed hostility section:

The top level is 03 Society. This major section contains thirteen main categories in its second tier, of which the category 03.03 Armed hostility, covering the general concept of warfare, is one. Within 03.03, the third tier contains a set of categories running from 03.03.01 War to 03.03.19 Peace. At that point Armed hostility is complete and a new main category, 03.04 Authority, begins.

Within each level, there is the potential for embedded categories at lower levels, either within the main sequence of categories or within highly-specific subcategories. Thus under 03.03.16 Military equipment we find Weapon at the next tier below, and below that different types and sub-types of weapons such as Sharp weapon (tier 5), Side-arms (tier 6), Sword (tier 7), and scimitar (tier 7, subcategory 2).

By supplying links such as ‘kind of’, you can reconstruct the semantic pathways by reading back through the headings from lowest to highest, thereby creating a quasi-definition, as in ‘a scimitar is a kind of sword, which is a kind of side-arm, which is a kind of weapon, which is a kind of military equipment, used in armed hostilities, which are engaged in by societies’.
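
A rough Python sketch of this reading-back process is given below. It is deliberately simplified: it supplies only the ‘kind of’ link, omits articles, and takes a hand-written list of headings ordered from highest to lowest, so it will not reproduce the varied links (‘used in’, ‘engaged in by’) of the fuller definition above. The function name is our own.

```python
def quasi_definition(headings_top_down):
    """Read headings from lowest to highest, linking them with 'kind of'."""
    lowest, *higher = [h.lower() for h in reversed(headings_top_down)]
    if not higher:
        return lowest
    return f"{lowest} is a kind of " + ", which is a kind of ".join(higher)


print(quasi_definition(["Weapon", "Side-arm", "Sword", "Scimitar"]))
# scimitar is a kind of sword, which is a kind of side-arm, which is a kind of weapon
```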

2.4 Parts of speech

Where words from more than one part of speech belong to the same concept, they will have the same number, as in, for example:

Parts of speech follow a fixed order in the Thesaurus, as follows:

  • n. noun
  • adj. adjective
  • adv. adverb
  • v. verb
  • vi. intransitive verb
  • v. pass. passive verb
  • vt. transitive verb
  • v. refl. reflexive verb
  • v. impers. impersonal verb
  • phr. phrase
  • int. interjection
  • conj. conjunction
  • prep. preposition

You can move between these using the parts of speech selector at the head of each category page. Only those parts of speech used within the category are shown.
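
If you are working with Thesaurus data of your own, the fixed order can be reproduced with a simple lookup table. The sketch below is illustrative only: it assumes each part of speech is stored using exactly the abbreviations listed above, and the names POS_ORDER and sort_by_part_of_speech are our own.

```python
# The fixed order of parts of speech, as listed in section 2.4.
POS_ORDER = [
    "n.", "adj.", "adv.", "v.", "vi.", "v. pass.", "vt.",
    "v. refl.", "v. impers.", "phr.", "int.", "conj.", "prep.",
]
POS_RANK = {pos: rank for rank, pos in enumerate(POS_ORDER)}


def sort_by_part_of_speech(parts):
    """Sort abbreviations into Thesaurus order; unknown ones raise KeyError."""
    return sorted(parts, key=POS_RANK.__getitem__)


print(sort_by_part_of_speech(["vt.", "adj.", "n.", "v."]))
# ['n.', 'adj.', 'v.', 'vt.']
```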

3. Dating

A distinctive feature of the Thesaurus is the provision of a chronology of every featured word in every entry, based on the OED, some dictionaries of modern English, and supplementary dictionaries of Old English.

Old English runs from around 700-1150. OE words are not given numerical dates but are followed by OE in small capitals instead. A word is regarded as having been in continuous use from OE if there is evidence of its use after the OE period but before 1301. If the gap extends to 1450, and there is some doubt about the direct connection between the words, then they are treated as separate headwords. This is the case even when the OE word and the modern English word have the same spelling. Dates which follow OE follow normal chronological order, so that those with a closing date precede those which survive into modern English. Where there is more than one date, there is a secondary principle of ordering by the earlier.

The main conventions used to present this information follow.

3.1 Single date

When a word is followed by a single date, this means that the word has only one recorded OED citation in that sense:

or that it is only recorded in Old English (see above):

When a word is followed by a single date plus a long dash, the word was first recorded on that date but is still current:

3.2 Two dates

When a word is followed by two dates, this shows the first and last recorded OED citations for that sense:

3.3 Longer date groups

When a word is followed by a date group in which dates are separated by a plus sign, this indicates that there is a gap of more than about 150 years in the recorded evidence for the word, or that there are other reasons to suggest that it is rare. In the following example, the date group shows that the obsolete word chantment, in the sense of casting spells, was first recorded in 1297, with evidence of use until 1430; there is then a gap in the evidence until 1803, after which there is no further recorded use:
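
As a rough illustration of how the conventions in sections 3.1 to 3.3 fit together, the Python sketch below splits a date group into spans. It assumes the group is written as plain text with an en dash between first and last dates and a plus sign between spans (for instance ‘1297–1430 + 1803’); that textual form, and the function name, are our assumptions. Dates are kept as strings so that OE and any prefixes pass through unchanged.

```python
def split_date_group(group):
    """Split a date group on plus signs into (first, last) spans."""
    spans = []
    for part in group.split("+"):
        first, dash, last = part.strip().partition("–")
        first, last = first.strip(), last.strip()
        if not dash:
            spans.append((first, first))  # single date: one recorded citation
        elif last == "":
            spans.append((first, None))   # date plus dash: still current
        else:
            spans.append((first, last))   # first and last recorded dates
    return spans


print(split_date_group("1297–1430 + 1803"))
# [('1297', '1430'), ('1803', '1803')]
```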

The plus sign is also used to separate dates where the register label applies only to one of them. In the example below, the label obs. exc. arch. (obsolete except in archaic senses, for example when people wish to give a deliberately old-fashioned flavour to the word) only applies to the use from 1860 to the present day:

3.4 Other conventions

A forward slash is used to indicate that the source of the citation cannot be dated more accurately than the stated range:

Dates may also be prefixed by a (ante, ‘before’) or c (circa, ‘about’) to show that the first or last recorded citation cannot be dated more accurately than ‘before’ or ‘about’ the stated date. The Thesaurus also uses a and c to simplify other OED dates indicating uncertainty; thus, for example, 14.. in OED has become a1500 and ?1500 has become c1500.
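
The two simplifications just mentioned can be sketched in Python as below. Only the pairs 14.. to a1500 and ?1500 to c1500 are stated in this guide; extending the same pattern to other centuries and years is our assumption, and the function name is hypothetical.

```python
import re


def simplify_oed_date(date):
    """Rewrite an uncertain OED date using the a/c prefixes."""
    # Assumption: 'NN..' (an unspecified year in a century) becomes
    # 'a' plus the first year of the following century, as in 14.. -> a1500.
    century = re.fullmatch(r"(\d{2})\.\.", date)
    if century:
        return f"a{int(century.group(1)) + 1}00"
    # Assumption: any '?YYYY' becomes 'cYYYY', as in ?1500 -> c1500.
    if re.fullmatch(r"\?\d{3,4}", date):
        return "c" + date[1:]
    return date  # anything else is left unchanged


print(simplify_oed_date("14.."))   # a1500
print(simplify_oed_date("?1500"))  # c1500
```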

If two words have identical start dates, the word with the earliest closing date comes first. If all dates are identical, the words appear in alphabetical order:
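
The ordering rule can be expressed as a sort key, sketched here in Python. The sketch assumes each entry is a (word, first year, last year) tuple with numeric years, that a word still in current use has None as its last year, and that such words sort after those with a closing date; Old English dates and the a/c prefixes are not handled. The words and years below are placeholders, not real Thesaurus data.

```python
def order_key(entry):
    """Sort by first date, then closing date, then alphabetically."""
    word, first, last = entry
    # 'last is None' is True for current words, which pushes them after
    # words that share a start date but have a closing date.
    return (first, last is None, last or 0, word.lower())


# Placeholder entries for illustration only.
entries = [
    ("gamma", 1500, None),
    ("beta", 1500, 1600),
    ("alpha", 1500, 1600),
    ("delta", 1400, 1700),
    ("epsilon", 1500, 1550),
]
print([word for word, _, _ in sorted(entries, key=order_key)])
# ['delta', 'epsilon', 'alpha', 'beta', 'gamma']
```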

A bracketed numeral following a date indicates the number of citations for a word that share the same date and source. In the following example, hagging (the action of witches meeting) is shown as having only 2 recorded citations, both of which are from the same 1584 source:

Where citations have the same date but come from different sources, they are separated by a plus sign.

4. Labels

Restrictions on the usage of words are selected from those given in the OED, and are represented in italics enclosed by brackets after the date or date group of a word:

Labels can show restriction to the English of a particular region (e.g. US, NZ, Indian), to a specific style or register of language (e.g. colloq., slang, poet.), or to a specific subject area (e.g. Law, Sociology, Physics). Many of the latter are not reproduced from the OED, however, as they are unnecessary in a meaning-structured thesaurus. Labels can also show a word’s currency or frequency of use (e.g. rare, arch.). Dict. indicates that a word occurs only in a dictionary or similar work, and so warns it may not have been in general use.

Labels may be qualified by words such as chiefly, orig., esp., and may also appear in combination with other labels (e.g. US & Austral.; chiefly Law & technical). Spec. means either specific or specifically, according to the relevant context. No labels are attached to Old English words.

Where regional and subject labels are straightforward to understand, register labels may be less so. An explanatory list of labels and abbreviations can be found here.

5. Alternative spellings and forms

5.1 Optional prefixes and characters

Brackets are used to show variant spellings such as optional prefixes, characters, etc., as in the following examples:

A forward slash is used to indicate a variant form of a word or phrase:

5.2 Old and Middle English

Old English contains two letters which are no longer in use:

  • æ, Æ: ash, pronounced like the ‘a’ in ‘cat’ and alphabetized as ae.
  • þ, Þ: thorn, pronounced ‘th’ and alphabetized between t and u. The alternative form eth is not used here.

In Middle English, there are occasional uses of ȝ, Ȝ, yogh, which is a variant of ‘g’.

You can search for all of these on the Search page using the Add special character function.

Many Old English verbs are prefixed either obligatorily or optionally with ‘ge’. This prefix is ignored for purposes of alphabetization.
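
The alphabetization rules in this section can be approximated with a sort key, sketched below in Python. The sketch treats æ as ae, places þ between t and u, and ignores the ‘ge’ prefix. How the prefix is marked in the input is our assumption: here it must be written as ‘(ge)’ or ‘ge-’, since a bare leading ‘ge’ cannot be told apart from a word that simply begins with those letters. The function name and sample word list are illustrative.

```python
import re


def alpha_key(word):
    """Build a sort key following the alphabetization rules of section 5.2."""
    # Assumption: an ignorable prefix is marked as '(ge)' or 'ge-'.
    stem = re.sub(r"^(\(ge\)|ge-)", "", word.lower())
    # Ash is alphabetized as 'ae'.
    stem = stem.replace("æ", "ae")
    # Thorn is given a rank halfway between 't' and 'u'.
    return [ord("t") + 0.5 if ch == "þ" else ord(ch) for ch in stem]


words = ["þing", "unnan", "tacan", "æppel", "(ge)bindan", "beran", "adl"]
print(sorted(words, key=alpha_key))
# ['adl', 'æppel', 'beran', '(ge)bindan', 'tacan', 'þing', 'unnan']
```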

A chevron between words (witch < wicca OE–) indicates derivation of a later form from an OE form, with the earlier OE form given last.
