- Compound (word) A word made up of two or more other words: teapot, from tea and pot; blackbird, from black and bird. Compounds can be solid (like teapot and blackbird), hyphenated (like mother-of-pearl and mind-blowing), or open (like coffee cup and table lamp). Compound words occur in many languages. In German, they are conventionally written in solid form: Eisenbahn (‘ironway’) railway; Eisenbahnunglück (‘ironwayaccident’) railway crash. In French, one kind of compound has the form of a prepositional phrase: pomme de terre (‘apple of earth’) potato; arc-en-ciel (‘arch in sky’) rainbow. Another consists of a verb-noun phrase: gratte-ciel (‘scrape-sky’) skyscraper; grille-pain (‘grill-bread’) toaster.
- Corpus A large collection of written or spoken material in machine-readable form, often collected from the web, assembled for the purpose of research into language and how it is changing. Corpora are also used in machine translation tools, text mining applications and plagiarism detection. Different types of corpora include monolingual corpora, which can be curated or uncurated; parallel corpora, which show text translated into multiple languages, and can be sourced from multilingual parliamentary proceedings or localized corporate websites; and learner corpora, which show language used by non-native speakers and contain data on typical errors made.
- Defined term Any element in language data from Oxford Global Language Solutions (OGLS) that is linked to a definition. Defined terms will usually include headwords, subentries, and compounds.
- Derivative A word or other item of language that has been created according to a set of rules from a simpler word or item, for example absolutely from absolute, childlike from child, or speechless from speech.
- Etymology The study of the history of words and, in a dictionary, a description of the origin and history of a word, sometimes including changes in its form and meaning.
- Example, example phrase, or phrase Shows the headword in use in a sentence or phrase. For instance, the compound cutting edge is illustrated by the example phrase researchers at the cutting edge of molecular biology.
- Frequency The number of occurrences of particular units, such as words, in a particular language context (such as regional language, legal language, written versus spoken language). Frequency data is usually generated from general or specialized corpora to meet specific requirements.
- Full inflections OGLS can supply inflected forms datasets that list all forms of an agreed list of root words. The root words list can be compiled according to frequency of use or can represent a specific subject area (e.g. legal or scientific terms). See FAQs for more details.
- Homograph Each of two or more words spelled the same but not necessarily pronounced the same and having different meanings and origins, e.g. entrance (noun: stress on first syllable) a door, gate, etc., and entrance (verb: stress on second syllable) to put in a trance; lead (verb: rhyming with ‘deed’) to take, conduct, guide, etc., and lead (noun: rhyming with ‘dead’) a metal.
- Homophone Each of two or more words having the same pronunciation but different meanings, origins, or spelling, e.g. new and knew.
- Homonym (1) A broader term for Homograph and Homophone; (2) each of two or more words having simultaneously the same spelling (Homograph) and pronunciation (Homophone) but different meanings and origins, e.g. bank: a slope; a place for money; and a bench or row of switches.
- Idiom A group of words established by usage as having a meaning not deducible from those of the individual words, e.g. over the moon or see the light.
- Inflection or inflected form A word form showing tense, person, mood, gender, or number. Some languages make more use of inflections than others: French is highly inflected for verbs but less so for other parts of speech; in English, there are relatively few inflections. English verbs inflect through suffixation (look/looks/looking/looked), but some irregular verbs have past forms that depart from the norm (see/sees/seeing/saw/seen). The verb be has eight forms: am, are, be, been, being, is, was, were. Nouns inflect for plurality and possession (worker/workers/worker’s/workers’) and some adjectives inflect for their comparatives and superlatives (big/bigger/biggest).
- International Phonetic Alphabet (IPA) An internationally recognized and standardized set of phonetic symbols, devised by the International Phonetic Association and based on the principle of strict one-to-one correspondence between sounds and symbols.
- Lemma The root form of a word, which is usually what is presented as a dictionary headword. For example, the lemma of the plural noun ‘dogs’ is ‘dog’, and the lemma of the conjugated verb ‘went’ is ‘go’.
- Morphology Language data that collates inflected forms in a given language, and can be linked to monolingual or bilingual datasets for use in technology applications such as dictionary lookups and search engines.
- N-gram Groups of two or more words that frequently occur together in language use. Bi-grams are groups of two, tri-grams are groups of three, etc. N-grams are usually generated from corpora and can be used in tools such as search engines and predictive text.
- Part of speech (aka word class) A category to which a word is assigned in accordance with its syntactic functions – in English the main parts of speech are noun, pronoun, adjective, determiner, verb, adverb, preposition, conjunction, and interjection.
- Regional varieties Indicates whether an OGLS dataset includes regional variants and dialect words and senses in addition to the standard language.
- Respelling (phonetic system) As an alternative to the International Phonetic Alphabet (IPA), some dictionaries use a respelling system to show pronunciation. There are many such systems. In a respelling system, the word respell might be shown as: rēˈspel rather than as: riːˈspɛl in IPA.
- Subentry A word related to the headword and included within the entry of a print dictionary rather than as a headword in its own right, e.g. preacher man is listed under the entry for preacher.
- Synonym A word or phrase that means exactly or nearly the same as another word or phrase in the same language. For example, synonyms for the word cup include beaker, mug, and drinking vessel.
- Translated term Any element in language data from Oxford Global Language Solutions that is linked to a translation. Translated terms will usually include headwords, subentries, compounds, examples, and idioms.
- Variant Different forms of the same word or phrase, typically spelling variants or regional variations. For example, color is an American English variant spelling of the British English colour.
- Wordlist A list of words, which can be extracted from a variety of lexical resources such as corpora or dictionaries. Wordlists can be tailored according to requirements; for example, to contain the most frequently used words in a given language, to be with or without expletives and obscenities, or to list domain-specific vocabulary, such as scientific or legal. Wordlists provided by OGLS are drawn from web-corpora and include frequency, lemma, and part-of-speech.
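
The Frequency, N-gram, and Wordlist entries above all describe counts derived from a corpus. As a rough illustration only, the sketch below counts word frequencies and bi-grams over a tiny made-up corpus using the Python standard library. The sample sentence, the naive tokenizer, and the function names `tokenize` and `ngrams` are assumptions made for this example; they do not describe how OGLS datasets are actually produced.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical one-sentence corpus; a real corpus would be far larger.
corpus = "the cat sat on the mat and the cat slept"

tokens = tokenize(corpus)

# Frequency: number of occurrences of each word in the corpus.
freq = Counter(tokens)

# Bi-grams: counts of each pair of adjacent words.
bigram_counts = Counter(ngrams(tokens, 2))

# A simple frequency-ordered wordlist: (word, count) pairs, most frequent first.
wordlist = freq.most_common()

print(wordlist[:3])
print(bigram_counts.most_common(1))
```

A production wordlist would typically also carry a lemma and a part of speech for each item, which requires a lemmatizer and tagger that this sketch leaves out.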