{
"derived": [
{
"_dis1": "0 0 0 0 0",
"word": "aligned parallel corpus"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus callosum"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus cavernosum"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus cavernosum clitoridis"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus cavernosum penis"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus delicti"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus language"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus linguistics"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus luteum"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus manager"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus spongiosum"
},
{
"_dis1": "0 0 0 0 0",
"word": "corpus striatum"
},
{
"_dis1": "0 0 0 0 0",
"word": "habeas corpus"
},
{
"_dis1": "0 0 0 0 0",
"word": "metacorpus"
},
{
"_dis1": "0 0 0 0 0",
"word": "noncorpus"
},
{
"_dis1": "0 0 0 0 0",
"word": "procorpus"
},
{
"_dis1": "0 0 0 0 0",
"word": "subcorpus"
}
],
"etymology_templates": [
{
"args": {
"1": "en",
"2": "ine-pro",
"3": "*krep-"
},
"expansion": "",
"name": "root"
},
{
"args": {
"1": "en",
"2": "la",
"3": "corpus",
"4": "",
"5": "body"
},
"expansion": "Borrowed from Latin corpus (“body”)",
"name": "bor+"
},
{
"args": {
"1": "en",
"2": "corpse",
"3": "corps",
"4": "riff#Etymology 2"
},
"expansion": "Doublet of corpse, corps, and riff",
"name": "doublet"
}
],
"etymology_text": "Borrowed from Latin corpus (“body”). Doublet of corpse, corps, and riff.",
"forms": [
{
"form": "corpora",
"tags": [
"plural"
]
},
{
"form": "corpuses",
"tags": [
"plural"
]
},
{
"form": "corpusses",
"tags": [
"plural"
]
},
{
"form": "corpi",
"tags": [
"plural",
"proscribed"
]
}
],
"head_templates": [
{
"args": {
"1": "corpora",
"2": "+",
"3": "corpusses",
"4": "corpi<l:proscribed>"
},
"expansion": "corpus (plural corpora or corpuses or corpusses or (proscribed) corpi)",
"name": "en-noun"
}
],
"hyphenations": [
{
"parts": [
"cor",
"pus"
]
}
],
"lang": "English",
"lang_code": "en",
"pos": "noun",
"related": [
{
"_dis1": "0 0 0 0 0",
"word": "Wiktionary:Corpora"
}
],
"senses": [
{
"categories": [],
"examples": [
{
"bold_text_offsets": [
[
192,
198
]
],
"ref": "2011, Patrick Spedding, James Lambert, “Fanny Hill, Lord Fanny, and the Myth of Metonymy”, in Studies in Philology, volume 108, number 1, page 113:",
"text": "No one suggests that Browning intended to mean vagina when he wrote “owls and bats, / Cowls and twats,” because the context does not allow for it, nor does the greater context of the Browning corpus.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
2,
8
]
],
"ref": "2014, Giuseppina Balossi, “Corpus Approaches to the Study of Language and Literature”, in A Corpus Linguistic Approach to Literary Language and Characterization: Virginia Woolf's The Waves (Linguistic Approaches to Literature; 18), Amsterdam: John Benjamins Publishing Company, →ISBN, page 41:",
"text": "A corpus approach is a useful methodology for observing, describing and interpreting the stylistic features of language in literary and non-literary texts.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
30,
37
]
],
"ref": "2018, James Lambert, “A multitude of ‘lishes’: The nomenclature of hybridity”, in English World-Wide, page 4:",
"text": "Today, computer databases and corpora infinitely increase the ease of this type of research, but the collecting process remains essentially the same.",
"type": "quotation"
}
],
"glosses": [
"A collection of written or spoken texts."
],
"id": "en-corpus-en-noun-UkGFE-T0",
"links": [
[
"collection",
"collection"
],
[
"written",
"written"
],
[
"spoken",
"spoken"
]
]
},
{
"categories": [
{
"kind": "other",
"langcode": "en",
"name": "Linguistics",
"orig": "en:Linguistics",
"parents": [],
"source": "w"
},
{
"_dis": "16 37 29 14 4",
"kind": "other",
"name": "English links with manual fragments",
"parents": [],
"source": "w+disamb"
}
],
"examples": [
{
"bold_text_offsets": [
[
5,
12
],
[
124,
131
]
],
"ref": "2007, Mihail Mihailov, Hannu Tommola, “Compiling Parallel Text Corpora: Towards Automation of Routine Procedures”, in Wolfgang Teubert, editor, Text Corpora and Multilingual Lexicography (Benjamins Current Topics; 8), Amsterdam: John Benjamins Publishing Company, →ISBN, page 60:",
"text": "Text corpora are being used in most current lexicographic projects. Applied linguistic research is another field where text corpora are welcome as an inexhaustible source of empirical information, a polygon for testing various linguistic tools – spell-checkers, OCRs, machine translation systems, NLP systems, etc.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
11,
18
]
],
"ref": "2008, Anabel Borja, “Corpora for Translators in Spain. The CDJ-GITRAD Corpus and the GENITT Project.”, in Gunilla [M.] Anderman, Margaret Rogers, editors, Incorporating Corpora: The Linguist and the Translator, Clevedon, North Somerset: Multilingual Matters, →ISBN, page 248:",
"text": "Comparable corpora are made up of texts in different languages that may be related in various ways, but are not translations of each other. They may have nothing in common at all, or be on the same subject, of the same genre, or from the same chronological period, etc.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
33,
39
],
[
169,
175
],
[
286,
293
],
[
450,
456
]
],
"ref": "2013, “Introduction”, in Gerry Knowles, Briony Williams, L[ita] Taylor, editors, A Corpus of Formal British English Speech: The Lancaster/IBM Spoken English Corpus, Abingdon, Oxon.; New York, N.Y.: Routledge, →ISBN, page 1:",
"text": "The Lancaster/IBM Spoken English Corpus began in September 1984 as part of a research project into the automatic assignment of intonation […] The original design of the corpus was determined by the need to provide data for research into speech synthesis. As a result, unlike most other corpora currently being used in the computational linguistics field, the SEC exists in several forms. […] However, whatever the original motivation for compiling a corpus, it quickly becomes an object of interest in its own right. New users find it valuable for applications for which it was not designed.",
"type": "quotation"
}
],
"glosses": [
"A collection of written or spoken texts.",
"Such a collection in form of an electronic database used for linguistic analyses."
],
"id": "en-corpus-en-noun-~WLNVeo-",
"links": [
[
"collection",
"collection"
],
[
"written",
"written"
],
[
"spoken",
"spoken"
],
[
"linguistics",
"linguistics"
],
[
"electronic",
"electronic"
],
[
"database",
"database"
],
[
"linguistic",
"linguistic"
]
],
"raw_glosses": [
"A collection of written or spoken texts.",
"(specifically, linguistics) Such a collection in form of an electronic database used for linguistic analyses."
],
"synonyms": [
{
"word": "digital corpus"
},
{
"word": "text corpus"
}
],
"tags": [
"specifically"
],
"topics": [
"human-sciences",
"linguistics",
"sciences"
],
"translations": [
{
"_dis1": "12 66 18 5 0",
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "پیکره متنی"
},
{
"_dis1": "12 66 18 5 0",
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "گنجینه واژگان"
},
{
"_dis1": "12 66 18 5 0",
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "گنجینه نوشتگان"
},
{
"_dis1": "12 66 18 5 0",
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "جنگ واژگان"
},
{
"_dis1": "12 66 18 5 0",
"code": "pl",
"lang": "Polish",
"lang_code": "pl",
"sense": "linguistics: electronic text database",
"tags": [
"masculine"
],
"word": "korpus"
}
]
},
{
"categories": [
{
"kind": "other",
"langcode": "en",
"name": "Physics",
"orig": "en:Physics",
"parents": [],
"source": "w"
},
{
"_dis": "6 16 55 22 1",
"kind": "other",
"name": "English entries with incorrect language header",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "6 8 71 13 2",
"kind": "other",
"name": "Entries with translation boxes",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Arabic translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Belarusian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "13 15 47 21 4",
"kind": "other",
"name": "Terms with Bulgarian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Catalan translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Czech translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "7 9 66 16 3",
"kind": "other",
"name": "Terms with Danish translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "12 14 49 20 4",
"kind": "other",
"name": "Terms with Dutch translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "9 12 55 20 4",
"kind": "other",
"name": "Terms with Esperanto translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Estonian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 13 51 20 5",
"kind": "other",
"name": "Terms with Finnish translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "13 15 47 21 3",
"kind": "other",
"name": "Terms with French translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "10 12 57 17 4",
"kind": "other",
"name": "Terms with German translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "9 21 52 16 2",
"kind": "other",
"name": "Terms with Greek translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "8 21 53 16 2",
"kind": "other",
"name": "Terms with Hungarian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Indonesian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "9 12 55 20 4",
"kind": "other",
"name": "Terms with Italian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "9 12 55 20 4",
"kind": "other",
"name": "Terms with Japanese translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Korean translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Macedonian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "12 17 45 23 3",
"kind": "other",
"name": "Terms with Mandarin translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "8 11 48 30 3",
"kind": "other",
"name": "Terms with Māori translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Norwegian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 49 20 5",
"kind": "other",
"name": "Terms with Persian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "13 16 46 21 4",
"kind": "other",
"name": "Terms with Polish translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Portuguese translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "9 12 55 20 4",
"kind": "other",
"name": "Terms with Russian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Slovak translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Slovene translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "8 11 55 24 2",
"kind": "other",
"name": "Terms with Spanish translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "9 11 59 17 3",
"kind": "other",
"name": "Terms with Swedish translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 13 52 20 4",
"kind": "other",
"name": "Terms with Turkish translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Ukrainian translations",
"parents": [],
"source": "w+disamb"
},
{
"_dis": "11 14 50 20 5",
"kind": "other",
"name": "Terms with Vietnamese translations",
"parents": [],
"source": "w+disamb"
}
],
"examples": [
{
"bold_text_offsets": [
[
4,
10
]
],
"text": "the corpus of the uterus",
"type": "example"
}
],
"glosses": [
"A structure of a special character or function in the animal body."
],
"id": "en-corpus-en-noun-wr5UKqz9",
"links": [
[
"physics",
"physics"
],
[
"structure",
"structure"
],
[
"character",
"character"
],
[
"function",
"function"
]
],
"raw_glosses": [
"(physics) A structure of a special character or function in the animal body."
],
"related": [
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus allatum"
},
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus callosotomy"
},
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus fetishism"
},
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus fimbriatum"
},
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus juris"
},
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus separatum"
},
{
"_dis1": "4 20 54 21 0",
"sense": "other expressions with corpus",
"word": "corpus vile"
}
],
"topics": [
"natural-sciences",
"physical-sciences",
"physics"
]
},
{
"categories": [],
"examples": [
{
"bold_text_offsets": [
[
56,
64
],
[
154,
162
]
],
"ref": "1998, Dimitǎr Draganov, “New Coin Types of Hadrianopolis”, in Ulrike Peter, editor, Stephanos Nomismatikos: Edith Schönert-Geiss zum 65. Geburtstag (Griechisches Münzwerk), Berlin: Akademie Verlag, →ISBN, page 221:",
"text": "About a hundred years ago in Germany, the publishing of corpuses of the ancient Greek coinages was started. […] The significance of those, and some other corpuses is exclusive, because they allowed an enormous amount of numismatic material kept in museum and private collections all over the world, to be studied and systematized.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
104,
110
]
],
"ref": "2014, Margaret Darling, Barbara Precious, “Introduction”, in A Corpus of Roman Pottery from Lincoln (Lincoln Archaeological Studies; 6), Oxford: Oxbow Books, →ISBN, page 1:",
"text": "An assessment in 1991 proposed publication of the results of this work in three stages: […] secondly, a corpus of the Roman pottery to present the type series and to discuss the fabrics and forms recovered, […]",
"type": "quotation"
}
],
"glosses": [
"A collection or body of objects with similar characteristics."
],
"id": "en-corpus-en-noun-Ebiu7MdY",
"links": [
[
"collection",
"collection"
],
[
"body",
"body"
],
[
"object",
"object"
],
[
"characteristic",
"characteristic"
]
],
"raw_glosses": [
"(uncommon) A collection or body of objects with similar characteristics."
],
"synonyms": [
{
"word": "collection"
},
{
"source": "Thesaurus:body",
"word": "bod"
},
{
"source": "Thesaurus:body",
"word": "body"
},
{
"source": "Thesaurus:body",
"word": "flesh"
},
{
"source": "Thesaurus:body",
"tags": [
"humorous"
],
"word": "carcass"
},
{
"source": "Thesaurus:body",
"word": "likam"
},
{
"source": "Thesaurus:body",
"tags": [
"obsolete"
],
"word": "quarrons"
},
{
"source": "Thesaurus:body",
"word": "soma"
}
],
"tags": [
"uncommon"
]
},
{
"categories": [],
"glosses": [
"The body of a man or animal."
],
"id": "en-corpus-en-noun-Y6zRSbpX",
"links": [
[
"body",
"body"
],
[
"man",
"man"
],
[
"animal",
"animal"
]
],
"raw_glosses": [
"(archaic) The body of a man or animal."
],
"tags": [
"archaic"
]
}
],
"sounds": [
{
"ipa": "/ˈkɔːpəs/",
"tags": [
"Received-Pronunciation"
]
},
{
"ipa": "/ˈkɔɹpəs/",
"tags": [
"General-American"
]
},
{
"audio": "en-au-corpus.ogg",
"mp3_url": "https://upload.wikimedia.org/wikipedia/commons/transcoded/4/4b/En-au-corpus.ogg/En-au-corpus.ogg.mp3",
"ogg_url": "https://upload.wikimedia.org/wikipedia/commons/4/4b/En-au-corpus.ogg"
},
{
"rhymes": "-ɔː(ɹ)pəs"
}
],
"translations": [
{
"_dis1": "30 34 9 27 0",
"code": "ar",
"lang": "Arabic",
"lang_code": "ar",
"roman": "matn",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "مَتْن"
},
{
"_dis1": "30 34 9 27 0",
"code": "ar",
"lang": "Arabic",
"lang_code": "ar",
"roman": "maknaz luḡawiyy",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "مَكْنَز لُغَوِيّ"
},
{
"_dis1": "30 34 9 27 0",
"code": "be",
"lang": "Belarusian",
"lang_code": "be",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"_dis1": "30 34 9 27 0",
"code": "be",
"lang": "Belarusian",
"lang_code": "be",
"roman": "zbor",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "збор"
},
{
"_dis1": "30 34 9 27 0",
"code": "bg",
"lang": "Bulgarian",
"lang_code": "bg",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"_dis1": "30 34 9 27 0",
"code": "ca",
"lang": "Catalan",
"lang_code": "ca",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "cmn",
"lang": "Chinese Mandarin",
"lang_code": "cmn",
"roman": "yǔliàokù",
"sense": "linguistics: collection of writings",
"word": "語料庫 /语料库"
},
{
"_dis1": "30 34 9 27 0",
"code": "cs",
"lang": "Czech",
"lang_code": "cs",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "da",
"lang": "Danish",
"lang_code": "da",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "nl",
"lang": "Dutch",
"lang_code": "nl",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "corpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "eo",
"lang": "Esperanto",
"lang_code": "eo",
"sense": "linguistics: collection of writings",
"word": "tekstaro"
},
{
"_dis1": "30 34 9 27 0",
"code": "eo",
"lang": "Esperanto",
"lang_code": "eo",
"sense": "linguistics: collection of writings",
"word": "korpuso"
},
{
"_dis1": "30 34 9 27 0",
"code": "et",
"lang": "Estonian",
"lang_code": "et",
"sense": "linguistics: collection of writings",
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "fi",
"lang": "Finnish",
"lang_code": "fi",
"sense": "linguistics: collection of writings",
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "fr",
"lang": "French",
"lang_code": "fr",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "de",
"lang": "German",
"lang_code": "de",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "Korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "de",
"lang": "German",
"lang_code": "de",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "Textkorpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "el",
"lang": "Greek",
"lang_code": "el",
"roman": "sóma",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "σώμα"
},
{
"_dis1": "30 34 9 27 0",
"code": "el",
"lang": "Greek",
"lang_code": "el",
"roman": "syllogí",
"sense": "linguistics: collection of writings",
"tags": [
"feminine"
],
"word": "συλλογή"
},
{
"_dis1": "30 34 9 27 0",
"code": "hu",
"lang": "Hungarian",
"lang_code": "hu",
"sense": "linguistics: collection of writings",
"word": "korpusz"
},
{
"_dis1": "30 34 9 27 0",
"code": "id",
"lang": "Indonesian",
"lang_code": "id",
"sense": "linguistics: collection of writings",
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "it",
"lang": "Italian",
"lang_code": "it",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "ja",
"lang": "Japanese",
"lang_code": "ja",
"roman": "kōpasu",
"sense": "linguistics: collection of writings",
"word": "コーパス"
},
{
"_dis1": "30 34 9 27 0",
"code": "ko",
"lang": "Korean",
"lang_code": "ko",
"roman": "malmungchi",
"sense": "linguistics: collection of writings",
"word": "말뭉치"
},
{
"_dis1": "30 34 9 27 0",
"code": "ko",
"lang": "Korean",
"lang_code": "ko",
"roman": "kopeoseu",
"sense": "linguistics: collection of writings",
"word": "코퍼스"
},
{
"_dis1": "30 34 9 27 0",
"code": "mk",
"lang": "Macedonian",
"lang_code": "mk",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"_dis1": "30 34 9 27 0",
"code": "mi",
"lang": "Māori",
"lang_code": "mi",
"sense": "linguistics: collection of writings",
"word": "putunga kōrero"
},
{
"_dis1": "30 34 9 27 0",
"code": "mi",
"lang": "Māori",
"lang_code": "mi",
"sense": "linguistics: collection of writings",
"word": "whakaputunga"
},
{
"_dis1": "30 34 9 27 0",
"code": "no",
"lang": "Norwegian",
"lang_code": "no",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: collection of writings",
"word": "گنجینه نوشتگان"
},
{
"_dis1": "30 34 9 27 0",
"code": "pl",
"lang": "Polish",
"lang_code": "pl",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "pt",
"lang": "Portuguese",
"lang_code": "pt",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "ru",
"lang": "Russian",
"lang_code": "ru",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"_dis1": "30 34 9 27 0",
"code": "ru",
"lang": "Russian",
"lang_code": "ru",
"roman": "sobránije",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "собра́ние"
},
{
"_dis1": "30 34 9 27 0",
"code": "sk",
"lang": "Slovak",
"lang_code": "sk",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "sl",
"lang": "Slovene",
"lang_code": "sl",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "es",
"lang": "Spanish",
"lang_code": "es",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "sv",
"lang": "Swedish",
"lang_code": "sv",
"sense": "linguistics: collection of writings",
"tags": [
"common-gender"
],
"word": "korpus"
},
{
"_dis1": "30 34 9 27 0",
"code": "sv",
"lang": "Swedish",
"lang_code": "sv",
"sense": "linguistics: collection of writings",
"tags": [
"common-gender"
],
"word": "språkbank"
},
{
"_dis1": "30 34 9 27 0",
"code": "tr",
"english": "all works of a single author",
"lang": "Turkish",
"lang_code": "tr",
"sense": "linguistics: collection of writings",
"translation": "all works of a single author",
"word": "külliyat"
},
{
"_dis1": "30 34 9 27 0",
"code": "uk",
"lang": "Ukrainian",
"lang_code": "uk",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"_dis1": "30 34 9 27 0",
"code": "uk",
"lang": "Ukrainian",
"lang_code": "uk",
"roman": "zbírnyk",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "збі́рник"
},
{
"_dis1": "30 34 9 27 0",
"code": "vi",
"lang": "Vietnamese",
"lang_code": "vi",
"sense": "linguistics: collection of writings",
"word": "ngữ liệu"
}
],
"word": "corpus"
}
{
"categories": [
"English countable nouns",
"English doublets",
"English entries with incorrect language header",
"English lemmas",
"English links with manual fragments",
"English nouns",
"English nouns with irregular plurals",
"English terms borrowed from Latin",
"English terms derived from Latin",
"English terms derived from Proto-Indo-European",
"English terms derived from the Proto-Indo-European root *krep-",
"Entries with translation boxes",
"Pages with 10 entries",
"Pages with entries",
"Rhymes:English/ɔː(ɹ)pəs",
"Rhymes:English/ɔː(ɹ)pəs/2 syllables",
"Terms with Arabic translations",
"Terms with Belarusian translations",
"Terms with Bulgarian translations",
"Terms with Catalan translations",
"Terms with Czech translations",
"Terms with Danish translations",
"Terms with Dutch translations",
"Terms with Esperanto translations",
"Terms with Estonian translations",
"Terms with Finnish translations",
"Terms with French translations",
"Terms with German translations",
"Terms with Greek translations",
"Terms with Hungarian translations",
"Terms with Indonesian translations",
"Terms with Italian translations",
"Terms with Japanese translations",
"Terms with Korean translations",
"Terms with Macedonian translations",
"Terms with Mandarin translations",
"Terms with Māori translations",
"Terms with Norwegian translations",
"Terms with Persian translations",
"Terms with Polish translations",
"Terms with Portuguese translations",
"Terms with Russian translations",
"Terms with Slovak translations",
"Terms with Slovene translations",
"Terms with Spanish translations",
"Terms with Swedish translations",
"Terms with Turkish translations",
"Terms with Ukrainian translations",
"Terms with Vietnamese translations"
],
"derived": [
{
"word": "aligned parallel corpus"
},
{
"word": "corpus callosum"
},
{
"word": "corpus cavernosum"
},
{
"word": "corpus cavernosum clitoridis"
},
{
"word": "corpus cavernosum penis"
},
{
"word": "corpus delicti"
},
{
"word": "corpus language"
},
{
"word": "corpus linguistics"
},
{
"word": "corpus luteum"
},
{
"word": "corpus manager"
},
{
"word": "corpus spongiosum"
},
{
"word": "corpus striatum"
},
{
"word": "habeas corpus"
},
{
"word": "metacorpus"
},
{
"word": "noncorpus"
},
{
"word": "procorpus"
},
{
"word": "subcorpus"
}
],
"etymology_templates": [
{
"args": {
"1": "en",
"2": "ine-pro",
"3": "*krep-"
},
"expansion": "",
"name": "root"
},
{
"args": {
"1": "en",
"2": "la",
"3": "corpus",
"4": "",
"5": "body"
},
"expansion": "Borrowed from Latin corpus (“body”)",
"name": "bor+"
},
{
"args": {
"1": "en",
"2": "corpse",
"3": "corps",
"4": "riff#Etymology 2"
},
"expansion": "Doublet of corpse, corps, and riff",
"name": "doublet"
}
],
"etymology_text": "Borrowed from Latin corpus (“body”). Doublet of corpse, corps, and riff.",
"forms": [
{
"form": "corpora",
"tags": [
"plural"
]
},
{
"form": "corpuses",
"tags": [
"plural"
]
},
{
"form": "corpusses",
"tags": [
"plural"
]
},
{
"form": "corpi",
"tags": [
"plural",
"proscribed"
]
}
],
"head_templates": [
{
"args": {
"1": "corpora",
"2": "+",
"3": "corpusses",
"4": "corpi<l:proscribed>"
},
"expansion": "corpus (plural corpora or corpuses or corpusses or (proscribed) corpi)",
"name": "en-noun"
}
],
"hyphenations": [
{
"parts": [
"cor",
"pus"
]
}
],
"lang": "English",
"lang_code": "en",
"pos": "noun",
"related": [
{
"word": "Wiktionary:Corpora"
},
{
"sense": "other expressions with corpus",
"word": "corpus allatum"
},
{
"sense": "other expressions with corpus",
"word": "corpus callosotomy"
},
{
"sense": "other expressions with corpus",
"word": "corpus fetishism"
},
{
"sense": "other expressions with corpus",
"word": "corpus fimbriatum"
},
{
"sense": "other expressions with corpus",
"word": "corpus juris"
},
{
"sense": "other expressions with corpus",
"word": "corpus separatum"
},
{
"sense": "other expressions with corpus",
"word": "corpus vile"
}
],
"senses": [
{
"categories": [
"English terms with quotations"
],
"examples": [
{
"bold_text_offsets": [
[
192,
198
]
],
"ref": "2011, Patrick Spedding, James Lambert, “Fanny Hill, Lord Fanny, and the Myth of Metonymy”, in Studies in Philology, volume 108, number 1, page 113:",
"text": "No one suggests that Browning intended to mean vagina when he wrote “owls and bats, / Cowls and twats,” because the context does not allow for it, nor does the greater context of the Browning corpus.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
2,
8
]
],
"ref": "2014, Giuseppina Balossi, “Corpus Approaches to the Study of Language and Literature”, in A Corpus Linguistic Approach to Literary Language and Characterization: Virginia Woolf's The Waves (Linguistic Approaches to Literature; 18), Amsterdam: John Benjamins Publishing Company, →ISBN, page 41:",
"text": "A corpus approach is a useful methodology for observing, describing and interpreting the stylistic features of language in literary and non-literary texts.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
30,
37
]
],
"ref": "2018, James Lambert, “A multitude of ‘lishes’: The nomenclature of hybridity”, in English World-Wide, page 4:",
"text": "Today, computer databases and corpora infinitely increase the ease of this type of research, but the collecting process remains essentially the same.",
"type": "quotation"
}
],
"glosses": [
"A collection of written or spoken texts."
],
"links": [
[
"collection",
"collection"
],
[
"written",
"written"
],
[
"spoken",
"spoken"
]
]
},
{
"categories": [
"English terms with quotations",
"en:Linguistics"
],
"examples": [
{
"bold_text_offsets": [
[
5,
12
],
[
124,
131
]
],
"ref": "2007, Mihail Mihailov, Hannu Tommola, “Compiling Parallel Text Corpora: Towards Automation of Routine Procedures”, in Wolfgang Teubert, editor, Text Corpora and Multilingual Lexicography (Benjamins Current Topics; 8), Amsterdam: John Benjamins Publishing Company, →ISBN, page 60:",
"text": "Text corpora are being used in most current lexicographic projects. Applied linguistic research is another field where text corpora are welcome as an inexhaustible source of empirical information, a polygon for testing various linguistic tools – spell-checkers, OCRs, machine translation systems, NLP systems, etc.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
11,
18
]
],
"ref": "2008, Anabel Borja, “Corpora for Translators in Spain. The CDJ-GITRAD Corpus and the GENITT Project.”, in Gunilla [M.] Anderman, Margaret Rogers, editors, Incorporating Corpora: The Linguist and the Translator, Clevedon, North Somerset: Multilingual Matters, →ISBN, page 248:",
"text": "Comparable corpora are made up of texts in different languages that may be related in various ways, but are not translations of each other. They may have nothing in common at all, or be on the same subject, of the same genre, or from the same chronological period, etc.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
33,
39
],
[
169,
175
],
[
286,
293
],
[
450,
456
]
],
"ref": "2013, “Introduction”, in Gerry Knowles, Briony Williams, L[ita] Taylor, editors, A Corpus of Formal British English Speech: The Lancaster/IBM Spoken English Corpus, Abingdon, Oxon.; New York, N.Y.: Routledge, →ISBN, page 1:",
"text": "The Lancaster/IBM Spoken English Corpus began in September 1984 as part of a research project into the automatic assignment of intonation […] The original design of the corpus was determined by the need to provide data for research into speech synthesis. As a result, unlike most other corpora currently being used in the computational linguistics field, the SEC exists in several forms. […] However, whatever the original motivation for compiling a corpus, it quickly becomes an object of interest in its own right. New users find it valuable for applications for which it was not designed.",
"type": "quotation"
}
],
"glosses": [
"A collection of written or spoken texts.",
"Such a collection in form of an electronic database used for linguistic analyses."
],
"links": [
[
"collection",
"collection"
],
[
"written",
"written"
],
[
"spoken",
"spoken"
],
[
"linguistics",
"linguistics"
],
[
"electronic",
"electronic"
],
[
"database",
"database"
],
[
"linguistic",
"linguistic"
]
],
"raw_glosses": [
"A collection of written or spoken texts.",
"(specifically, linguistics) Such a collection in form of an electronic database used for linguistic analyses."
],
"synonyms": [
{
"word": "digital corpus"
},
{
"word": "text corpus"
}
],
"tags": [
"specifically"
],
"topics": [
"human-sciences",
"linguistics",
"sciences"
]
},
{
"categories": [
"English terms with usage examples",
"en:Physics"
],
"examples": [
{
"bold_text_offsets": [
[
4,
10
]
],
"text": "the corpus of the uterus",
"type": "example"
}
],
"glosses": [
"A structure of a special character or function in the animal body."
],
"links": [
[
"physics",
"physics"
],
[
"structure",
"structure"
],
[
"character",
"character"
],
[
"function",
"function"
]
],
"raw_glosses": [
"(physics) A structure of a special character or function in the animal body."
],
"topics": [
"natural-sciences",
"physical-sciences",
"physics"
]
},
{
"categories": [
"English terms with quotations",
"English terms with uncommon senses"
],
"examples": [
{
"bold_text_offsets": [
[
56,
64
],
[
154,
162
]
],
"ref": "1998, Dimitǎr Draganov, “New Coin Types of Hadrianopolis”, in Ulrike Peter, editor, Stephanos Nomismatikos: Edith Schönert-Geiss zum 65. Geburtstag (Griechisches Münzwerk), Berlin: Akademie Verlag, →ISBN, page 221:",
"text": "About a hundred years ago in Germany, the publishing of corpuses of the ancient Greek coinages was started. […] The significance of those, and some other corpuses is exclusive, because they allowed an enormous amount of numismatic material kept in museum and private collections all over the world, to be studied and systematized.",
"type": "quotation"
},
{
"bold_text_offsets": [
[
104,
110
]
],
"ref": "2014, Margaret Darling, Barbara Precious, “Introduction”, in A Corpus of Roman Pottery from Lincoln (Lincoln Archaeological Studies; 6), Oxford: Oxbow Books, →ISBN, page 1:",
"text": "An assessment in 1991 proposed publication of the results of this work in three stages: […] secondly, a corpus of the Roman pottery to present the type series and to discuss the fabrics and forms recovered, […]",
"type": "quotation"
}
],
"glosses": [
"A collection or body of objects with similar characteristics."
],
"links": [
[
"collection",
"collection"
],
[
"body",
"body"
],
[
"object",
"object"
],
[
"characteristic",
"characteristic"
]
],
"raw_glosses": [
"(uncommon) A collection or body of objects with similar characteristics."
],
"synonyms": [
{
"word": "collection"
},
{
"source": "Thesaurus:body",
"word": "bod"
},
{
"source": "Thesaurus:body",
"word": "body"
},
{
"source": "Thesaurus:body",
"word": "flesh"
},
{
"source": "Thesaurus:body",
"tags": [
"humorous"
],
"word": "carcass"
},
{
"source": "Thesaurus:body",
"word": "likam"
},
{
"source": "Thesaurus:body",
"tags": [
"obsolete"
],
"word": "quarrons"
},
{
"source": "Thesaurus:body",
"word": "soma"
}
],
"tags": [
"uncommon"
]
},
{
"categories": [
"English terms with archaic senses"
],
"glosses": [
"The body of a man or animal."
],
"links": [
[
"body",
"body"
],
[
"man",
"man"
],
[
"animal",
"animal"
]
],
"raw_glosses": [
"(archaic) The body of a man or animal."
],
"tags": [
"archaic"
]
}
],
"sounds": [
{
"ipa": "/ˈkɔːpəs/",
"tags": [
"Received-Pronunciation"
]
},
{
"ipa": "/ˈkɔɹpəs/",
"tags": [
"General-American"
]
},
{
"audio": "en-au-corpus.ogg",
"mp3_url": "https://upload.wikimedia.org/wikipedia/commons/transcoded/4/4b/En-au-corpus.ogg/En-au-corpus.ogg.mp3",
"ogg_url": "https://upload.wikimedia.org/wikipedia/commons/4/4b/En-au-corpus.ogg"
},
{
"rhymes": "-ɔː(ɹ)pəs"
}
],
"translations": [
{
"code": "ar",
"lang": "Arabic",
"lang_code": "ar",
"roman": "matn",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "مَتْن"
},
{
"code": "ar",
"lang": "Arabic",
"lang_code": "ar",
"roman": "maknaz luḡawiyy",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "مَكْنَز لُغَوِيّ"
},
{
"code": "be",
"lang": "Belarusian",
"lang_code": "be",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"code": "be",
"lang": "Belarusian",
"lang_code": "be",
"roman": "zbor",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "збор"
},
{
"code": "bg",
"lang": "Bulgarian",
"lang_code": "bg",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"code": "ca",
"lang": "Catalan",
"lang_code": "ca",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"code": "cmn",
"lang": "Chinese Mandarin",
"lang_code": "cmn",
"roman": "yǔliàokù",
"sense": "linguistics: collection of writings",
"word": "語料庫 /语料库"
},
{
"code": "cs",
"lang": "Czech",
"lang_code": "cs",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"code": "da",
"lang": "Danish",
"lang_code": "da",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "korpus"
},
{
"code": "nl",
"lang": "Dutch",
"lang_code": "nl",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "corpus"
},
{
"code": "eo",
"lang": "Esperanto",
"lang_code": "eo",
"sense": "linguistics: collection of writings",
"word": "tekstaro"
},
{
"code": "eo",
"lang": "Esperanto",
"lang_code": "eo",
"sense": "linguistics: collection of writings",
"word": "korpuso"
},
{
"code": "et",
"lang": "Estonian",
"lang_code": "et",
"sense": "linguistics: collection of writings",
"word": "korpus"
},
{
"code": "fi",
"lang": "Finnish",
"lang_code": "fi",
"sense": "linguistics: collection of writings",
"word": "korpus"
},
{
"code": "fr",
"lang": "French",
"lang_code": "fr",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"code": "de",
"lang": "German",
"lang_code": "de",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "Korpus"
},
{
"code": "de",
"lang": "German",
"lang_code": "de",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "Textkorpus"
},
{
"code": "el",
"lang": "Greek",
"lang_code": "el",
"roman": "sóma",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "σώμα"
},
{
"code": "el",
"lang": "Greek",
"lang_code": "el",
"roman": "syllogí",
"sense": "linguistics: collection of writings",
"tags": [
"feminine"
],
"word": "συλλογή"
},
{
"code": "hu",
"lang": "Hungarian",
"lang_code": "hu",
"sense": "linguistics: collection of writings",
"word": "korpusz"
},
{
"code": "id",
"lang": "Indonesian",
"lang_code": "id",
"sense": "linguistics: collection of writings",
"word": "korpus"
},
{
"code": "it",
"lang": "Italian",
"lang_code": "it",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"code": "ja",
"lang": "Japanese",
"lang_code": "ja",
"roman": "kōpasu",
"sense": "linguistics: collection of writings",
"word": "コーパス"
},
{
"code": "ko",
"lang": "Korean",
"lang_code": "ko",
"roman": "malmungchi",
"sense": "linguistics: collection of writings",
"word": "말뭉치"
},
{
"code": "ko",
"lang": "Korean",
"lang_code": "ko",
"roman": "kopeoseu",
"sense": "linguistics: collection of writings",
"word": "코퍼스"
},
{
"code": "mk",
"lang": "Macedonian",
"lang_code": "mk",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"code": "mi",
"lang": "Māori",
"lang_code": "mi",
"sense": "linguistics: collection of writings",
"word": "putunga kōrero"
},
{
"code": "mi",
"lang": "Māori",
"lang_code": "mi",
"sense": "linguistics: collection of writings",
"word": "whakaputunga"
},
{
"code": "no",
"lang": "Norwegian",
"lang_code": "no",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: collection of writings",
"word": "گنجینه نوشتگان"
},
{
"code": "pl",
"lang": "Polish",
"lang_code": "pl",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"code": "pt",
"lang": "Portuguese",
"lang_code": "pt",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"code": "ru",
"lang": "Russian",
"lang_code": "ru",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"code": "ru",
"lang": "Russian",
"lang_code": "ru",
"roman": "sobránije",
"sense": "linguistics: collection of writings",
"tags": [
"neuter"
],
"word": "собра́ние"
},
{
"code": "sk",
"lang": "Slovak",
"lang_code": "sk",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"code": "sl",
"lang": "Slovene",
"lang_code": "sl",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "korpus"
},
{
"code": "es",
"lang": "Spanish",
"lang_code": "es",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "corpus"
},
{
"code": "sv",
"lang": "Swedish",
"lang_code": "sv",
"sense": "linguistics: collection of writings",
"tags": [
"common-gender"
],
"word": "korpus"
},
{
"code": "sv",
"lang": "Swedish",
"lang_code": "sv",
"sense": "linguistics: collection of writings",
"tags": [
"common-gender"
],
"word": "språkbank"
},
{
"code": "tr",
"english": "all works of a single author",
"lang": "Turkish",
"lang_code": "tr",
"sense": "linguistics: collection of writings",
"translation": "all works of a single author",
"word": "külliyat"
},
{
"code": "uk",
"lang": "Ukrainian",
"lang_code": "uk",
"roman": "kórpus",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "ко́рпус"
},
{
"code": "uk",
"lang": "Ukrainian",
"lang_code": "uk",
"roman": "zbírnyk",
"sense": "linguistics: collection of writings",
"tags": [
"masculine"
],
"word": "збі́рник"
},
{
"code": "vi",
"lang": "Vietnamese",
"lang_code": "vi",
"sense": "linguistics: collection of writings",
"word": "ngữ liệu"
},
{
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "پیکره متنی"
},
{
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "گنجینه واژگان"
},
{
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "گنجینه نوشتگان"
},
{
"code": "fa",
"lang": "Persian",
"lang_code": "fa",
"sense": "linguistics: electronic text database",
"word": "جنگ واژگان"
},
{
"code": "pl",
"lang": "Polish",
"lang_code": "pl",
"sense": "linguistics: electronic text database",
"tags": [
"masculine"
],
"word": "korpus"
}
],
"word": "corpus"
}
This page is a part of the kaikki.org machine-readable English dictionary. This dictionary is based on structured data extracted on 2026-02-14 from the enwiktionary dump dated 2026-02-01 using wiktextract (f492ef9 and 59dc20b). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.
If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.
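The example and quotation objects in the entry above carry a "bold_text_offsets" list alongside their "text". A minimal Python sketch of how those offsets might be applied follows. It assumes each pair is a half-open [start, end) character range, which is consistent with the "the corpus of the uterus" example in this entry but is not stated on this page. The helper name bold_spans is hypothetical and is not part of wiktextract.

```python
import json

# A trimmed copy of one example object from the entry above.
example_json = """
{
  "bold_text_offsets": [[4, 10]],
  "text": "the corpus of the uterus",
  "type": "example"
}
"""


def bold_spans(example):
    """Return the substrings of example["text"] marked as bold.

    Assumption: every offset pair is a half-open [start, end) range of
    character positions in the "text" string.
    """
    text = example.get("text", "")
    offsets = example.get("bold_text_offsets", [])
    return [text[start:end] for start, end in offsets]


example = json.loads(example_json)
spans = bold_spans(example)
print(spans)
```

Objects without a "bold_text_offsets" key yield an empty list, so the same helper can be run over every example in a sense without special-casing.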