Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables found for each language. When wiktextract parses word heads and inflection tables, it tags each form it encounters with grammatical or contextual information. The forms and tags found in one head section or table are first kept separate from those of other head sections and tables. They are later merged into table types, where every table of a given type contains the same number of word forms with the same tags for those forms.
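The grouping into table types can be pictured with a minimal Python sketch. This is not wiktextract's actual code: the function names `table_signature` and `group_into_table_types`, and the input shape (a word paired with a list of `(form, tags)` tuples), are assumptions made purely for illustration. Two tables land in the same type when they have the same number of forms and the same tags at each position.

```python
from collections import defaultdict


def table_signature(forms):
    """Return a hashable key: the tag set of each form, in table order.

    Hypothetical helper; tag order within one form is ignored.
    """
    return tuple(frozenset(tags) for _form, tags in forms)


def group_into_table_types(tables):
    """Map each signature to the list of words whose table has that shape."""
    types = defaultdict(list)
    for word, forms in tables:
        types[table_signature(forms)].append(word)
    return dict(types)


# Made-up sample data, not taken from the extraction.
tables = [
    ("Hund", [("Hund", ["nominative", "singular"]), ("Hunde", ["nominative", "plural"])]),
    ("Tag", [("Tag", ["singular", "nominative"]), ("Tage", ["nominative", "plural"])]),
    ("Milch", [("Milch", ["nominative", "singular"])]),
]
table_types = group_into_table_types(tables)
```

Here "Hund" and "Tag" share one table type, while "Milch" forms a type of its own because its table has a single form.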

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos, and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to reflect some kind of minor error in the original data.
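The idea that rare table types tend to point at errors can be turned into a simple check. The sketch below is an illustration only: the `rare_table_types` function, the type labels, and the `max_instances` threshold are hypothetical and are not part of wiktextract or kaikki.org. It counts how many words share each table type and returns the words whose type occurs at most `max_instances` times.

```python
from collections import Counter


def rare_table_types(type_of_word, max_instances=2):
    """Return the words whose table type is shared by few words.

    type_of_word maps a word to a label for its table type.
    max_instances is an assumed threshold to be tuned per language.
    """
    counts = Counter(type_of_word.values())
    return {
        word: table_type
        for word, table_type in type_of_word.items()
        if counts[table_type] <= max_instances
    }


# Made-up sample data; "Hnud" stands in for a malformed entry.
type_of_word = {
    "Hund": "noun-8",
    "Tag": "noun-8",
    "Baum": "noun-8",
    "Milch": "noun-4",
    "Hnud": "noun-7",
}
suspects = rare_table_types(type_of_word, max_instances=1)
```

A flagged word is only a candidate for review: in this sample "Milch" would be flagged alongside the malformed entry even though a singular-only paradigm is legitimate.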

Language Table forms Errors (% affected words)
Acehnesisch 1 0 (0.00%)
Afrikaans 8 6 (8.85%)
Akkadisch 8 230 (52.00%)
Albanisch 18 88 (18.62%)
Altenglisch 9 72 (28.83%)
Altfranzösisch 2 0 (0.00%)
Altgriechisch 115 802 (15.95%)
Althochdeutsch 11 24 (0.00%)
Altirisch 4 0 (0.00%)
Altkirchenslawisch 9 68 (21.88%)
Altnordisch 4 0 (0.00%)
Alttschechisch 1 0 (0.00%)
Arabisch 4 100 (98.64%)
Armenisch 29 134 (3.63%)
Aserbaidschanisch 5 6 (3.04%)
Asturisch 3 0 (0.00%)
Bairisch 1 0 (0.00%)
Baschkirisch 2 28 (100.00%)
Baskisch 6 52 (99.30%)
Belutschi 1 0 (0.00%)
Bengalisch 1 0 (0.00%)
Birmanisch 3 0 (0.00%)
Bosnisch 7 0 (0.00%)
Bretonisch 5 0 (0.00%)
Bulgarisch 8 126 (60.71%)
Chinesisch 2 0 (0.00%)
Deutsch 29491 0 (0.00%)
Durango-Nahuatl 1 0 (0.00%)
Dänisch 23 68 (15.07%)
Englisch 66 96 (16.73%)
Esperanto 11 8 (0.00%)
Estnisch 3 22 (80.95%)
Faliskisch 4 0 (0.00%)
Finnisch 13 82 (4.84%)
Französisch 54 188 (10.22%)
Friaulisch 5 0 (0.00%)
Frühneuhochdeutsch 8 0 (0.00%)
Fulfulde 3 0 (0.00%)
Färöisch 10 62 (64.15%)
Galicisch 6 0 (0.00%)
Georgisch 11 220 (77.85%)
Gotisch 19 30 (50.91%)
Guaraní 1 0 (0.00%)
Guerrero-Nahuatl 1 0 (0.00%)
Haitianisch 1 0 (0.00%)
Hausa 8 12 (3.43%)
Hawaiianisch 2 0 (0.00%)
Hebräisch 2 36 (95.00%)
Hethitisch 7 0 (0.00%)
Hindi 3 0 (0.00%)
Huastekisches Ost-Nahuatl 1 0 (0.00%)
Huastekisches West-Nahuatl 1 0 (0.00%)
Huastekisches Zentral-Nahuatl 6 0 (0.00%)
Hurritisch 1 0 (0.00%)
Ido 18 30 (42.98%)
Indonesisch 3 0 (0.00%)
Interlingua 4 0 (0.00%)
Interlingue 1 0 (0.00%)
International 2 0 (0.00%)
Inuktitut 1 0 (0.00%)
Inupiaq 1 0 (0.00%)
Irisch 13 4 (3.36%)
Isländisch 20 50 (10.98%)
Italienisch 53 40 (0.01%)
Jamaika-Kreolisch 4 6 (20.00%)
Japanisch 1 0 (0.00%)
Jiddisch 3 0 (0.00%)
Kasachisch 1 14 (100.00%)
Kaschubisch 3 0 (0.00%)
Katalanisch 17 42 (3.82%)
Kirchenslawisch 1 0 (0.00%)
Kirgisisch 2 14 (50.00%)
Klassisches Nahuatl 11 0 (0.00%)
Klassisches Nahuatl‎ 3 0 (0.00%)
Komi 1 0 (0.00%)
Komorisch 1 0 (0.00%)
Koptisch 5 0 (0.00%)
Koreanisch 8 0 (0.00%)
Korsisch 2 0 (0.00%)
Kotava 1 2 (100.00%)
Krimtatarisch 1 0 (0.00%)
Kroatisch 21 24 (0.17%)
Kurdisch 7 38 (80.00%)
Ladinisch 1 0 (0.00%)
Latein 99 374 (3.09%)
Lettgallisch 1 0 (0.00%)
Lettisch 10 32 (11.64%)
Litauisch 10 0 (0.00%)
Luxemburgisch 5 96 (8.70%)
Láadan 1 0 (0.00%)
Maltesisch 12 60 (41.18%)
Maori 1 0 (0.00%)
Marathi 2 0 (0.00%)
Marsisch 7 0 (0.00%)
Mazedonisch 16 0 (0.00%)
Mezquital-Otomi 1 0 (0.00%)
Mittelenglisch 1 0 (0.00%)
Mittelgriechisch 1 0 (0.00%)
Mittelhochdeutsch 7 0 (0.00%)
Mongolisch 1 6 (100.00%)
Morisien 1 0 (0.00%)
Nahuatl 1 0 (0.00%)
Nauruisch 1 0 (0.00%)
Nepalesisch 2 24 (0.00%)
Neugriechisch 25 176 (14.90%)
Niederdeutsch 20 114 (58.27%)
Niederländisch 39 126 (8.33%)
Niedersorbisch 40 52 (0.31%)
Nord-Sotho 1 0 (0.00%)
Nordfriesisch 1 0 (0.00%)
Norwegisch 11 18 (18.52%)
Novial 1 0 (0.00%)
Obersorbisch 22 50 (0.00%)
Okzitanisch 19 18 (0.05%)
Orizaba-Nahuatl 3 0 (0.00%)
Oromo 1 0 (0.00%)
Oskisch 4 0 (0.00%)
Papiamentu 2 0 (0.00%)
Paschtu 9 96 (1.56%)
Persisch 4 0 (0.00%)
Polabisch 2 2 (0.00%)
Polnisch 85 8 (0.03%)
Portugiesisch 24 40 (1.42%)
Prußisch 44 422 (51.63%)
Rumänisch 20 380 (61.39%)
Russisch 43 360 (0.00%)
Rätoromanisch 5 0 (0.00%)
Sami 1 0 (0.00%)
Samoanisch 4 0 (0.00%)
Sanskrit 3 102 (62.50%)
Sardisch 4 0 (0.00%)
Schottisch-Gälisch 2 32 (75.00%)
Schwedisch 54 100 (0.00%)
Scots 6 16 (17.65%)
Serbisch 15 212 (1.30%)
Serbokroatisch 3 14 (0.00%)
Sesotho 3 0 (0.00%)
Shona 4 0 (0.00%)
Sindarin 1 0 (0.00%)
Sindhi 7 36 (0.00%)
Sizilianisch 3 0 (0.00%)
Slowakisch 26 0 (0.00%)
Slowenisch 18 18 (0.00%)
Sogdisch 1 0 (0.00%)
Somalisch 1 0 (0.00%)
Spanisch 33 40 (12.44%)
Suaheli 8 38 (31.19%)
Sumerisch 2 20 (74.29%)
Südpikenisch 14 0 (0.00%)
Tadschikisch 3 0 (0.00%)
Tagalog 4 6 (57.14%)
Tahitianisch 1 0 (0.00%)
Tatarisch 1 12 (0.00%)
Telugu 1 0 (0.00%)
Temascaltepec-Nahuatl 3 0 (0.00%)
Tetelcingo-Nahuatl 5 0 (0.00%)
Tetum 2 0 (0.00%)
Thai 1 0 (0.00%)
Tok Pisin 2 0 (0.00%)
Tschechisch 42 0 (0.00%)
Turkmenisch 1 12 (0.00%)
Tuvaluisch 1 2 (0.00%)
Twi 2 0 (0.00%)
Türkisch 15 192 (83.28%)
Ukrainisch 28 0 (0.00%)
Umbrisch 5 0 (0.00%)
Ungarisch 15 58 (78.27%)
Urdu 10 96 (4.88%)
Usbekisch 13 192 (0.00%)
Venezianisch 6 0 (0.00%)
Vestinisch 4 0 (0.00%)
Vietnamesisch 7 0 (0.00%)
Volapük 3 0 (0.00%)
Volskisch 6 0 (0.00%)
Walisisch 8 0 (0.00%)
Weißrussisch 15 0 (0.00%)
West-Pandschabi 4 42 (4.43%)
Westflämisch 2 4 (0.00%)
Westfriesisch 6 54 (29.36%)
Zentral-Alaska-Yupik 1 0 (0.00%)
Zentral-Nahuatl 9 0 (0.00%)
Zentrales Puebla-Nahuatl 2 0 (0.00%)
isiZulu 3 0 (0.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2025-04-17 from the dewiktionary dump dated 2025-04-03 using wiktextract (ada610d and ea19a0a). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.