Wiktionary data extraction errors and warnings

Inflection check

List of different kinds of inflection tables. When wiktextract parses word heads and tables, it assigns the forms it encounters with tags that describe grammatical or contextual information. The tags and forms that are found in head sections and tables are kept separate from other head section and table tags, and later they are merged with other heads and tables into table types that all contain the same number of word forms with the same tags for those forms.

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typoes and badly formated Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.

Language ⏶ Table forms Errors (% affected words) Language Table forms ⏷ Errors (% affected words)
Abkhaz 1 2 (100.00%) Greek 1856 2394 (35.54%)
Adyghe 1 2 (100.00%) Ancient Greek 496 798 (14.12%)
Afar 1 2 (100.00%) Latin 69 80 (70.75%)
Afrikaans 1 2 (100.00%) Medieval Greek 64 50 (36.46%)
Albanian 4 2 (8.39%) English 44 24 (64.50%)
Alemannic 3 2 (19.05%) German 24 24 (7.62%)
Issues with parsing articles inside parentheses.
Algonquin 1 2 (100.00%) French 20 26 (62.12%)
Amharic 1 2 (100.00%) Cypriot Greek 20 18 (39.80%)
Ancient Egyptian 3 2 (85.71%) Spanish 19 48 (93.60%)
Ancient Greek 496 798 (14.12%) Turkish 17 20 (9.17%)
Ancient Macedonian 1 0 (0.00%) Cretan Greek 14 4 (28.05%)
Anglo-Norman 1 2 (100.00%) Multiple languages 13 6 (79.31%)
Arabic 2 2 (99.81%) Esperanto 13 16 (85.12%)
Aragonese 1 2 (100.00%) Romanian 12 26 (89.20%)
Aramaic 4 2 (66.67%) Italian 12 10 (3.96%)
Armenian 11 18 (11.48%) Serbo-Croatian 12 10 (84.75%)
Aromanian 6 4 (81.69%) Slovak 12 64 (92.43%)
Arpitan 1 2 (100.00%) Polish 11 8 (55.54%)
Assamese 1 2 (100.00%) Armenian 11 18 (11.48%)
Asturian 4 6 (91.09%) uni 10 12 (92.65%)
Aymara 1 2 (100.00%) Old French 10 8 (56.69%)
Azerbaijani 5 6 (62.16%) Russian 10 10 (97.96%)
Bambara 1 2 (100.00%) Portuguese 9 6 (73.43%)
Bangla 1 2 (100.00%) Pontic 9 4 (19.87%)
Bashkir 1 2 (100.00%) Dutch 7 4 (22.74%)
Basque 1 2 (100.00%) Tsakonian 7 6 (76.98%)
Bavarian 2 2 (68.18%) Swedish 6 2 (3.57%)
Belarusian 1 2 (100.00%) Aromanian 6 4 (81.69%)
Bemba 1 2 (100.00%) Serbian 6 2 (32.89%)
Biblical Hebrew 4 2 (72.22%) Bulgarian 6 4 (99.26%)
Bikol Central 2 2 (66.67%) Azerbaijani 5 6 (62.16%)
Bislama 1 2 (100.00%) Icelandic 5 4 (23.60%)
Bosnian 1 2 (100.00%) Czech 5 2 (96.97%)
Breton 2 2 (99.38%) Croatian 5 6 (98.31%)
Bulgarian 6 4 (99.26%) Slovene 5 2 (1.26%)
Burmese 1 2 (100.00%) Cappadocian Greek 5 2 (45.45%)
Cantonese 1 2 (100.00%) Macedonian 5 2 (98.04%)
Cappadocian Greek 5 2 (45.45%) Ladino 5 2 (23.81%)
Carpathian Rusyn 2 2 (50.00%) Italiot Greek 5 6 (40.00%)
Catalan 3 2 (99.57%) Proto-Hellenic 5 4 (36.36%)
Cebuano 2 2 (90.91%) Finnish 4 2 (8.75%)
Central 2 2 (75.00%) Danish 4 2 (4.07%)
Chamorro 1 2 (100.00%) Albanian 4 2 (8.39%)
Chechen 1 2 (100.00%) Norwegian 4 2 (11.84%)
Cherokee 1 2 (100.00%) Japanese 4 20 (100.00%)
Cheyenne 1 2 (100.00%) West Flemish 4 2 (0.03%)
Chinese 1 2 (100.00%) Asturian 4 6 (91.09%)
Church Slavic 2 0 (0.00%) Georgian 4 4 (97.84%)
Chuvash 1 2 (100.00%) Kabyle 4 2 (75.00%)
Classical Nahuatl 1 2 (100.00%) Biblical Hebrew 4 2 (72.22%)
Colognian 1 2 (100.00%) Aramaic 4 2 (66.67%)
Coptic 2 2 (73.08%) Mycenaean Greek 4 12 (72.97%)
Cornish 1 2 (100.00%) Proto-Indo-European 4 2 (80.60%)
Corsican 1 2 (100.00%) Latvian 3 2 (99.32%)
Cretan Greek 14 4 (28.05%) Catalan 3 2 (99.57%)
Crimean Tatar 1 2 (100.00%) Hungarian 3 2 (99.75%)
Croatian 5 6 (98.31%) Narom 3 2 (88.24%)
Cypriot Arabic 1 4 (100.00%) Venetan 3 2 (95.45%)
Cypriot Greek 20 18 (39.80%) Neapolitan 3 2 (96.05%)
Czech 5 2 (96.97%) Lithuanian 3 2 (99.76%)
Danish 4 2 (4.07%) Lombard 3 2 (87.88%)
Dhivehi 1 2 (100.00%) Ancient Egyptian 3 2 (85.71%)
Dimli 1 2 (100.00%) Maltese 3 2 (96.55%)
Dutch 7 4 (22.74%) Ukrainian 3 2 (99.63%)
Dutch Low Saxon 1 2 (100.00%) Friulian 3 2 (90.00%)
Dzongkha 1 4 (100.00%) Alemannic 3 2 (19.05%)
Egyptian Arabic 1 2 (100.00%) Ottoman Turkish 3 4 (92.86%)
Elamite 1 4 (100.00%) Jamaican Creole 3 6 (53.85%)
English 44 24 (64.50%) Old Irish 3 2 (21.43%)
Leave be; Old Irish tables can wait until they've sorted themselves out.
Esperanto 13 16 (85.12%) Gothic 3 4 (54.55%)
Estonian 2 2 (96.79%) Proto-Slavic 3 0 (0.00%)
Ewe 1 2 (100.00%) Proto-Italic 3 2 (33.33%)
Extremaduran 2 2 (50.00%) Irish 2 2 (99.63%)
Faroese 1 2 (100.00%) Indonesian 2 2 (99.62%)
Fiji Hindi 1 2 (100.00%) Ido 2 4 (99.99%)
Fijian 1 2 (100.00%) Breton 2 2 (99.38%)
Finnish 4 2 (8.75%) Estonian 2 2 (96.79%)
French 20 26 (62.12%) Sardinian 2 2 (96.49%)
Friulian 3 2 (90.00%) Scots 2 2 (94.74%)
Galician 1 2 (100.00%) Bavarian 2 2 (68.18%)
Galoli 1 2 (100.00%) Igbo 2 2 (95.83%)
Gan 2 2 (50.00%) Hebrew 2 2 (99.79%)
Georgian 4 4 (97.84%) Cebuano 2 2 (90.91%)
German 24 24 (7.62%)
Issues with parsing articles inside parentheses.
Zealandic 2 2 (42.31%)
Ghotuo 1 2 (100.00%) Arabic 2 2 (99.81%)
Gothic 3 4 (54.55%) Minnan 2 2 (96.00%)
Greek 1856 2394 (35.54%) Romansch 2 2 (94.44%)
Greenlandic 1 2 (100.00%) Pali 2 2 (98.41%)
Guarani 1 2 (100.00%) Uzbek 2 2 (69.70%)
Gujarati 1 2 (100.00%) Ossetian 2 2 (95.24%)
Haitian Creole 1 2 (100.00%) Turkmen 2 2 (64.00%)
Hakka 1 2 (100.00%) Romani 2 2 (85.71%)
Hausa 1 2 (100.00%) Kazakh 2 4 (1.11%)
Hawaiian 2 2 (95.00%) Samogitian 2 2 (75.00%)
Hebrew 2 2 (99.79%) Pennsylvania German 2 2 (66.67%)
Hindi 1 2 (100.00%) Tibetan 2 4 (85.71%)
Hittite 2 4 (50.00%) Hawaiian 2 2 (95.00%)
Hungarian 3 2 (99.75%) Javanese 2 2 (91.67%)
Icelandic 5 4 (23.60%) Old Occitan 2 2 (33.33%)
Ido 2 4 (99.99%) Extremaduran 2 2 (50.00%)
Igbo 2 2 (95.83%) Karachay-Balkar 2 2 (66.67%)
Ilocano 1 2 (100.00%) Central 2 2 (75.00%)
Inari Sami 1 2 (100.00%) Bikol Central 2 2 (66.67%)
Indonesian 2 2 (99.62%) Nahuatl 2 2 (50.00%)
Interlingua 1 2 (100.00%) Carpathian Rusyn 2 2 (50.00%)
Interlingue 1 2 (100.00%) Saterland Frisian 2 2 (50.00%)
Inuktitut 1 2 (100.00%) Gan 2 2 (50.00%)
Irish 2 2 (99.63%) Old Persian 2 2 (50.00%)
Italian 12 10 (3.96%) Old Norse 2 2 (80.00%)
Italiot Greek 5 6 (40.00%) Church Slavic 2 0 (0.00%)
Jamaican Creole 3 6 (53.85%) Hittite 2 4 (50.00%)
Japanese 4 20 (100.00%) Coptic 2 2 (73.08%)
Javanese 2 2 (91.67%) Proto-Germanic 2 2 (50.00%)
Kabardian 1 4 (100.00%) Old East Slavic 2 0 (0.00%)
Kabyle 4 2 (75.00%) Scottish Gaelic 1 2 (100.00%)
Kannada 1 2 (100.00%) Afrikaans 1 2 (100.00%)
Kapampangan 1 2 (100.00%) Luxembourgish 1 2 (100.00%)
Karachay-Balkar 2 2 (66.67%) Low German 1 2 (100.00%)
Kashmiri 1 2 (100.00%) Interlingua 1 2 (100.00%)
Kashubian 1 2 (100.00%) Occitan 1 2 (100.00%)
Kazakh 2 4 (1.11%) Papiamento 1 2 (100.00%)
Khmer 1 2 (100.00%) Novial 1 2 (100.00%)
Kinyarwanda 1 2 (100.00%) Basque 1 2 (100.00%)
Kongo 1 2 (100.00%) Ligurian 1 2 (100.00%)
Korean 1 2 (100.00%) Piedmontese 1 2 (100.00%)
Kotava 1 2 (100.00%) West Frisian 1 2 (100.00%)
Kurdish 1 2 (100.00%) Classical Nahuatl 1 2 (100.00%)
Kyrgyz 1 2 (100.00%) Norwegian Nynorsk 1 2 (100.00%)
Ladin 1 2 (100.00%) Manx 1 2 (100.00%)
Ladino 5 2 (23.81%) Welsh 1 2 (100.00%)
Lakota 1 2 (100.00%) Bosnian 1 2 (100.00%)
Lao 1 2 (100.00%) Kurdish 1 2 (100.00%)
Latin 69 80 (70.75%) Limburgish 1 2 (100.00%)
Latvian 3 2 (99.32%) Galician 1 2 (100.00%)
Ligurian 1 2 (100.00%) Corsican 1 2 (100.00%)
Limburgish 1 2 (100.00%) Sicilian 1 2 (100.00%)
Lingala 1 2 (100.00%) Old English 1 2 (100.00%)
Lithuanian 3 2 (99.76%) Sranan Tongo 1 2 (100.00%)
Livonian 1 2 (100.00%) Tarantino 1 2 (100.00%)
Lojban 1 2 (100.00%) Vietnamese 1 2 (100.00%)
Lombard 3 2 (87.88%) Malay 1 2 (100.00%)
Low German 1 2 (100.00%) Faroese 1 2 (100.00%)
Lower Sorbian 1 2 (100.00%) Cornish 1 2 (100.00%)
Luxembourgish 1 2 (100.00%) Ladin 1 2 (100.00%)
Lydian 1 4 (100.00%) Moldovan 1 2 (100.00%)
Macedonian 5 2 (98.04%) Aragonese 1 2 (100.00%)
Malagasy 1 2 (100.00%) Tagalog 1 2 (100.00%)
Malay 1 2 (100.00%) Volapük 1 2 (100.00%)
Malayalam 1 2 (100.00%) Lower Sorbian 1 2 (100.00%)
Maltese 3 2 (96.55%) Ewe 1 2 (100.00%)
Manchu 1 2 (100.00%) Korean 1 2 (100.00%)
Manx 1 2 (100.00%) Chinese 1 2 (100.00%)
Maori 1 2 (100.00%) Quechua 1 2 (100.00%)
Mapuche 1 2 (100.00%) Swahili 1 2 (100.00%)
Because subject class concord and object class concord uses the same form (for example "c5"), we don't have yet a way to distinguish them. The meaning is human-parsable because we can look at the alignment of the headers (horizontal=subject concord, vertical=object concord), but the parser does not (yet) have this kind of memory or spatial awareness. /////// Sep 9th 2022: I've recently implemented a way to transform row or column tags into other tags based on language, but it doesn't solve an underlying issue with Swahili tables that I'd missed or forgotten about: The horizontal column headers for class and person don't get inherited through the whole template because it is chopped up into several subtables that don't directly inherit from above. It's an issue with 'bleed' again. /////// Later (comment in Dec 2022): forgot to put this here, but Swahili tables have now gotten their own big systems that make them work, including a "save to register" feature that can keep column headers in memory to be used in a later subtable within a template.
Marathi 1 2 (100.00%) Ilocano 1 2 (100.00%)
Mari 1 2 (100.00%) Belarusian 1 2 (100.00%)
Medieval Greek 64 50 (36.46%) Thai 1 2 (100.00%)
Middle Dutch 1 0 (0.00%) Old High German 1 2 (100.00%)
Middle English 1 2 (100.00%) Persian 1 2 (100.00%)
Middle French 1 0 (0.00%) Hindi 1 2 (100.00%)
Middle High German 1 0 (0.00%) Zulu 1 2 (100.00%)
Minnan 2 2 (96.00%) Middle French 1 0 (0.00%)
Mirandese 1 2 (100.00%) Kotava 1 2 (100.00%)
Moksha 1 2 (100.00%) Crimean Tatar 1 2 (100.00%)
Moldovan 1 2 (100.00%) Fijian 1 2 (100.00%)
Mongolian 1 2 (100.00%) Bambara 1 2 (100.00%)
Multiple languages 13 6 (79.31%) Nepali 1 2 (100.00%)
Mycenaean Greek 4 12 (72.97%) Tatar 1 2 (100.00%)
Nahuatl 2 2 (50.00%) Tamil 1 2 (100.00%)
Narom 3 2 (88.24%) Urdu 1 2 (100.00%)
Nauru 1 2 (100.00%) Gujarati 1 2 (100.00%)
Navajo 1 2 (100.00%) Yiddish 1 2 (100.00%)
Neapolitan 3 2 (96.05%) Samoan 1 2 (100.00%)
Nepali 1 2 (100.00%) Kinyarwanda 1 2 (100.00%)
Newar 1 2 (100.00%) Somali 1 2 (100.00%)
Northern Sami 1 2 (100.00%) Cantonese 1 2 (100.00%)
Northern Sotho 1 2 (100.00%) Bangla 1 2 (100.00%)
Norwegian 4 2 (11.84%) Zhuang 1 2 (100.00%)
Norwegian Bokmål 1 2 (100.00%) Walloon 1 2 (100.00%)
Norwegian Nynorsk 1 2 (100.00%) Tajik 1 2 (100.00%)
Novial 1 2 (100.00%) Tahitian 1 2 (100.00%)
Occitan 1 2 (100.00%) Chuvash 1 2 (100.00%)
Odia 1 2 (100.00%) Rundi 1 2 (100.00%)
Ojibwa 1 2 (100.00%) Kashmiri 1 2 (100.00%)
Old Armenian 1 4 (100.00%) Malayalam 1 2 (100.00%)
Old East Slavic 2 0 (0.00%) Dhivehi 1 2 (100.00%)
Old English 1 2 (100.00%) Yucatec Maya 1 2 (100.00%)
Old French 10 8 (56.69%) Elamite 1 4 (100.00%)
Old High German 1 2 (100.00%) Bemba 1 2 (100.00%)
Old Irish 3 2 (21.43%)
Leave be; Old Irish tables can wait until they've sorted themselves out.
Swati 1 2 (100.00%)
Old Norse 2 2 (80.00%) Northern Sotho 1 2 (100.00%)
Old Occitan 2 2 (33.33%) Uyghur 1 2 (100.00%)
Old Persian 2 2 (50.00%) Punjabi 1 2 (100.00%)
Old Spanish 1 0 (0.00%) Maori 1 2 (100.00%)
Old Turkic 1 2 (100.00%) Mongolian 1 2 (100.00%)
Ossetian 2 2 (95.24%) Kashubian 1 2 (100.00%)
Ottoman Turkish 3 4 (92.86%) Tok Pisin 1 2 (100.00%)
Palatine German 1 2 (100.00%) Lingala 1 2 (100.00%)
Pali 2 2 (98.41%) Chamorro 1 2 (100.00%)
Papiamento 1 2 (100.00%) Haitian Creole 1 2 (100.00%)
Pashto 1 2 (100.00%) Inuktitut 1 2 (100.00%)
Pennsylvania German 2 2 (66.67%) Tongan 1 2 (100.00%)
Persian 1 2 (100.00%) Shona 1 2 (100.00%)
Piedmontese 1 2 (100.00%) Veps 1 2 (100.00%)
Polish 11 8 (55.54%) Marathi 1 2 (100.00%)
Pontic 9 4 (19.87%) Kyrgyz 1 2 (100.00%)
Portuguese 9 6 (73.43%) Northern Sami 1 2 (100.00%)
Proto-Albanian 1 0 (0.00%) Western Maninkakan 1 2 (100.00%)
Proto-Balto-Slavic 1 0 (0.00%) Yoruba 1 2 (100.00%)
Proto-Germanic 2 2 (50.00%) Wolof 1 2 (100.00%)
Proto-Hellenic 5 4 (36.36%) Sanskrit 1 2 (100.00%)
Proto-Indo-European 4 2 (80.60%) Tswana 1 2 (100.00%)
Proto-Italic 3 2 (33.33%) Pashto 1 2 (100.00%)
Proto-Slavic 3 0 (0.00%) Lojban 1 2 (100.00%)
Proto-Turkic 1 2 (100.00%) Kannada 1 2 (100.00%)
Punjabi 1 2 (100.00%) Malagasy 1 2 (100.00%)
Quechua 1 2 (100.00%) Venda 1 2 (100.00%)
Rapa Nui 1 2 (100.00%) Xhosa 1 2 (100.00%)
Romani 2 2 (85.71%) Sotho 1 2 (100.00%)
Romanian 12 26 (89.20%) Tsonga 1 2 (100.00%)
Romansch 2 2 (94.44%) Aymara 1 2 (100.00%)
Rundi 1 2 (100.00%) Telugu 1 2 (100.00%)
Russian 10 10 (97.96%) Moksha 1 2 (100.00%)
Samoan 1 2 (100.00%) Middle Dutch 1 0 (0.00%)
Samogitian 2 2 (75.00%) Rapa Nui 1 2 (100.00%)
Sanskrit 1 2 (100.00%) Guarani 1 2 (100.00%)
Sardinian 2 2 (96.49%) Sundanese 1 2 (100.00%)
Saterland Frisian 2 2 (50.00%) Livonian 1 2 (100.00%)
Scots 2 2 (94.74%) Kapampangan 1 2 (100.00%)
Scottish Gaelic 1 2 (100.00%) Abkhaz 1 2 (100.00%)
Seneca 1 2 (100.00%) Adyghe 1 2 (100.00%)
Serbian 6 2 (32.89%) Amharic 1 2 (100.00%)
Serbo-Croatian 12 10 (84.75%) Dutch Low Saxon 1 2 (100.00%)
Shona 1 2 (100.00%) Dimli 1 2 (100.00%)
Sicilian 1 2 (100.00%) Navajo 1 2 (100.00%)
Sinhala 1 2 (100.00%) Kongo 1 2 (100.00%)
Slovak 12 64 (92.43%) Mapuche 1 2 (100.00%)
Slovene 5 2 (1.26%) Bislama 1 2 (100.00%)
Somali 1 2 (100.00%) Colognian 1 2 (100.00%)
Sotho 1 2 (100.00%) Arpitan 1 2 (100.00%)
Spanish 19 48 (93.60%) Tetum 1 2 (100.00%)
Sranan Tongo 1 2 (100.00%) Cherokee 1 2 (100.00%)
Sundanese 1 2 (100.00%) Cheyenne 1 2 (100.00%)
Swahili 1 2 (100.00%)
Because subject class concord and object class concord uses the same form (for example "c5"), we don't have yet a way to distinguish them. The meaning is human-parsable because we can look at the alignment of the headers (horizontal=subject concord, vertical=object concord), but the parser does not (yet) have this kind of memory or spatial awareness. /////// Sep 9th 2022: I've recently implemented a way to transform row or column tags into other tags based on language, but it doesn't solve an underlying issue with Swahili tables that I'd missed or forgotten about: The horizontal column headers for class and person don't get inherited through the whole template because it is chopped up into several subtables that don't directly inherit from above. It's an issue with 'bleed' again. /////// Later (comment in Dec 2022): forgot to put this here, but Swahili tables have now gotten their own big systems that make them work, including a "save to register" feature that can keep column headers in memory to be used in a later subtable within a template.
Bashkir 1 2 (100.00%)
Swati 1 2 (100.00%) Newar 1 2 (100.00%)
Swedish 6 2 (3.57%) Interlingue 1 2 (100.00%)
Tagalog 1 2 (100.00%) Seneca 1 2 (100.00%)
Tahitian 1 2 (100.00%) Norwegian Bokmål 1 2 (100.00%)
Tajik 1 2 (100.00%) Algonquin 1 2 (100.00%)
Tamil 1 2 (100.00%) Afar 1 2 (100.00%)
Tarantino 1 2 (100.00%) Fiji Hindi 1 2 (100.00%)
Tatar 1 2 (100.00%) Inari Sami 1 2 (100.00%)
Telugu 1 2 (100.00%) Greenlandic 1 2 (100.00%)
Tetum 1 2 (100.00%) Upper Sorbian 1 2 (100.00%)
Thai 1 2 (100.00%) Galoli 1 2 (100.00%)
Tibetan 2 4 (85.71%) Lao 1 2 (100.00%)
Tok Pisin 1 2 (100.00%) Hausa 1 2 (100.00%)
Tongan 1 2 (100.00%) Middle English 1 2 (100.00%)
Tsakonian 7 6 (76.98%) Burmese 1 2 (100.00%)
Tsonga 1 2 (100.00%) Sinhala 1 2 (100.00%)
Tswana 1 2 (100.00%) Egyptian Arabic 1 2 (100.00%)
Turkish 17 20 (9.17%) Khmer 1 2 (100.00%)
Turkmen 2 2 (64.00%) Kabardian 1 4 (100.00%)
Udmurt 1 4 (100.00%) Mari 1 2 (100.00%)
Ukrainian 3 2 (99.63%) Palatine German 1 2 (100.00%)
Upper Sorbian 1 2 (100.00%) Western Punjabi 1 2 (100.00%)
Urdu 1 2 (100.00%) Yakut 1 2 (100.00%)
Uyghur 1 2 (100.00%) Waray 1 2 (100.00%)
Uzbek 2 2 (69.70%) Anglo-Norman 1 2 (100.00%)
Venda 1 2 (100.00%) Wu 1 0 (0.00%)
Venetan 3 2 (95.45%) Manchu 1 2 (100.00%)
Veps 1 2 (100.00%) Ghotuo 1 2 (100.00%)
Vietnamese 1 2 (100.00%) Old Armenian 1 4 (100.00%)
Volapük 1 2 (100.00%) Nauru 1 2 (100.00%)
Walloon 1 2 (100.00%) Chechen 1 2 (100.00%)
Waray 1 2 (100.00%) Hakka 1 2 (100.00%)
Welsh 1 2 (100.00%) Odia 1 2 (100.00%)
West Flemish 4 2 (0.03%) Old Turkic 1 2 (100.00%)
West Frisian 1 2 (100.00%) Assamese 1 2 (100.00%)
Western Maninkakan 1 2 (100.00%) Mirandese 1 2 (100.00%)
Western Punjabi 1 2 (100.00%) Lydian 1 4 (100.00%)
Wolof 1 2 (100.00%) Udmurt 1 4 (100.00%)
Wu 1 0 (0.00%) Lakota 1 2 (100.00%)
Xhosa 1 2 (100.00%) Ojibwa 1 2 (100.00%)
Yakut 1 2 (100.00%) Dzongkha 1 4 (100.00%)
Yiddish 1 2 (100.00%) Cypriot Arabic 1 4 (100.00%)
Yoruba 1 2 (100.00%) Proto-Balto-Slavic 1 0 (0.00%)
Yucatec Maya 1 2 (100.00%) Proto-Albanian 1 0 (0.00%)
Zealandic 2 2 (42.31%) Ancient Macedonian 1 0 (0.00%)
Zhuang 1 2 (100.00%) Old Spanish 1 0 (0.00%)
Zulu 1 2 (100.00%) Middle High German 1 0 (0.00%)
uni 10 12 (92.65%) Proto-Turkic 1 2 (100.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2026-03-11 from the elwiktionary dump dated 2026-03-01 using wiktextract (602557e and 59dc20b). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.