What is a collocation dictionary?

“It is commonly assumed that word choice is entirely free, a matter of preference, taste, and social circumstances. The reality is quite different from this: as psychological and linguistic research has revealed, languages impose upon their users certain patterns of thought and chains of association.” (Prof. Dr. Dirk Siepmann)

Collocation refers to the ways in which words can co-occur in sequence to form multi-word units (MWUs). The words may be adjacent, e.g. noun + noun, adjective + noun, adverb + adjective/adverb, or in some cases non-adjacent, e.g. verb + noun, verb + preposition, verb + adverb/adjective. The ARCS provides, for example, 112 verbs that co-occur with up and/or down. The exact meanings of words or collocations are context-related. The phrasal verb "make up" may, for example, have up to ten different meanings.

There are scales of collocational probability and acceptability. The difference in collocational probability and acceptability between sentences derives from their referential meaning. Referential restrictions on collocation demand knowledge of the world, which is changing very fast. The invention of new ideas, cunning solutions to old problems, and innovative concepts require new collocations. As a consequence, the ARCS is being updated on an almost daily basis.

There are firm or frozen collocations like

"staycation", "crowdsourcing", "debt-for-nature swap", "second lifer", "ghost call", "unintentional entrepreneur", "gastric bypass surgery", "time-starved reader", "gentrification", "Vatican roulette", "wealth formation by long-term saving with tax concessions", "swine flu", "palpable disdain", "black swan event", "school shooting", "deep packet inspection", "teotwawki", "freegan", "surfer's ear", "video snacking", "geotagged photo", "cruelty-free consumption",

or weak or acceptable collocations like

"big car", "good idea", or "interesting question".

“Knowledge of collocations is vital for the competent and authentic use of a language: a grammatically correct sentence will stand out as 'awkward' if collocational preferences are violated. This makes collocation an interesting area for language teaching. One of the learner's major problems is a lack of collocational expertise in English.” (D.A. Wilkins) The Cambridge Learner's Dictionary says that "these word combinations are often difficult to guess, so you need to learn them in order to sound natural in English".

The reader will need some degree of prior semantic knowledge about the parts of a collocation in order to understand the word combination. “Moreover, repeated encounters with known words … are useful in view of the fact that vocabulary knowledge is known to develop or deepen 'incrementally' with a number of encounters.” (Horst et al. 1998) Only dictionary readers who have acquired “automatic word recognition” (Hulstijn 2001), i.e. the ability to retrieve word meanings efficiently and effortlessly from memory, can make an educated guess about the meaning of new collocations. Comprehension of English collocations requires knowledge of nearly 100% of the defining vocabulary. In addition, the ordinary intelligent user of the ARCS should be able to form acceptable new word combinations.

Also available are synonyms of such collocations, as well as the words or parts of speech that may co-occur with a given word. For example, the adjective "particular" collocates with about 500 nouns, on the basis of a text corpus of over 8 million words. The ARCS covers English/English as well as German/English because it provides all probable and acceptable collocations, weak as well as firm. The German equivalents of the English headwords form semantic fields or a collection of German collocations.
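How a figure such as "about 500 nouns" can be derived from a corpus may be illustrated with a minimal sketch. It assumes the simplest possible definition of a collocate, namely the word that immediately follows the headword, and it does no part-of-speech tagging. The function name right_collocates and the toy corpus are invented for illustration; this is not the method used to compile the ARCS.

```python
import re
from collections import Counter


def right_collocates(text, headword):
    """Count the words that immediately follow `headword` in `text`.

    Assumption: a collocate is simply the next token. A real
    collocation dictionary would also filter by part of speech
    and by frequency.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Walk over each pair of adjacent tokens.
    for current, following in zip(tokens, tokens[1:]):
        if current == headword:
            counts[following] += 1
    return counts


# Toy corpus invented for illustration; not taken from the ARCS.
toy_corpus = (
    "She paid particular attention to detail. "
    "This particular case needs particular attention. "
    "No particular reason was given."
)

counts = right_collocates(toy_corpus, "particular")
for word, n in counts.most_common():
    print(word, n)
```

In this toy corpus "particular" is followed by three different words, and "attention" is counted twice. Applied to a corpus of several million words, the same count would give the number of distinct collocates of a headword.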

The ARCS is an electronic word combination searcher. It will offer answers to queries concerned with words that occur together and form one single semantic unit, e.g. "awareness for the environment" (G. Umweltbewusstsein), or "cruelty-free consumption" (G. Gerichte aus nichttierischen Produkten). Such groups of words are called collocations, and the words in them collocates, i.e. words that are spontaneously associated with one another or glued together in the minds of native speakers and omnivorous and discerning readers.

"Despite the absence of glosses, learners can nonetheless select collocations through a culling process whereby familiar words are contemplated as potential candidates and unknown words ignored." (Nicolas R. Cueto)

The design of the ARCS connects more than 2 million words with the help of more than 70,000 nodes or headwords. "Nodes must be highly clustered - that is, if two nodes are both linked to a third, there must be a high probability that the two are also directly linked. … Searching our memories for a particular word really entails wandering mentally along links in the network." (A.E. Motter, Y.-C. Lai, P. Dasgupta, From: Nature, Science Update July 24, 2002)
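The clustering property described in the quotation can be made concrete with a small sketch. It computes, for one node, the share of pairs of its neighbours that are also directly linked to each other (the local clustering coefficient). The word network below is a toy example invented for illustration, and the function name local_clustering is an assumption; nothing here reflects the actual data structures of the ARCS.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Share of neighbour pairs of `node` that are directly linked."""
    neighbours = graph[node]
    if len(neighbours) < 2:
        return 0.0  # no pairs of neighbours to compare
    pairs = list(combinations(sorted(neighbours), 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


# Toy undirected word network (hypothetical links between headwords).
edges = [
    ("idea", "good"),
    ("idea", "new"),
    ("good", "new"),
    ("idea", "bright"),
]

graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

for word in sorted(graph):
    print(word, round(local_clustering(graph, word), 2))
```

Here "idea" has three neighbours, of which only one pair ("good" and "new") is directly linked, so its coefficient is one third. A highly clustered network in the sense of the quotation is one in which most nodes have a coefficient close to 1.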

The Advanced Reader's Collocation Searcher imitates the mental lexicon. It is unique. Depending on whether the search starts from a noun, verb, adjective, or adverb, the user looks up a headword (node) in the Noun, Verb, Adjective, Adverb, Synonym, or German section. A global search can start from English and/or German words or phrases. Sample pages have also been prepared as examples of the 4,000 pages of the ARCS.


Horst Bogatz