Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables. When wiktextract parses word heads and tables, it tags each form it encounters with grammatical or contextual information. The forms and tags found in each head section or table are first kept separate from those of other head sections and tables; later they are merged into table types, each of which groups the heads and tables that contain the same number of word forms with the same tags for those forms.
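The grouping described above can be illustrated with a minimal sketch. It is not wiktextract's actual implementation: the field names `word`, `forms`, `form` and `tags`, the helper names `table_signature` and `group_by_table_type`, and the sample entries are all assumptions made for illustration.

```python
from collections import defaultdict


def table_signature(forms):
    """Build a hashable signature from the tag sets of all forms.

    Two tables get the same signature only if they have the same
    number of forms and the same tags for those forms.
    """
    return tuple(sorted(tuple(sorted(f["tags"])) for f in forms))


def group_by_table_type(entries):
    """Map each table signature to the words whose table matches it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[table_signature(entry["forms"])].append(entry["word"])
    return dict(groups)


# Hypothetical sample entries, not taken from the real data set.
entries = [
    {"word": "Haus", "forms": [
        {"form": "Haus", "tags": ["nominative", "singular"]},
        {"form": "Häuser", "tags": ["nominative", "plural"]},
    ]},
    {"word": "Maus", "forms": [
        {"form": "Maus", "tags": ["singular", "nominative"]},
        {"form": "Mäuse", "tags": ["plural", "nominative"]},
    ]},
    {"word": "Milch", "forms": [
        {"form": "Milch", "tags": ["nominative", "singular"]},
    ]},
]

table_types = group_by_table_type(entries)
for signature, words in table_types.items():
    print(len(signature), "forms:", words)
```

Sorting the tags inside each form and then sorting the forms makes the signature independent of the order in which forms and tags were encountered, so "Haus" and "Maus" land in the same table type while "Milch" forms its own.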

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos, and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.
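As a rough sketch of that heuristic, the snippet below counts how often each table type occurs and reports the ones that occur only a few times. The function name `rare_table_types`, the threshold `max_instances`, and the string signatures are hypothetical; the site does not state which cutoff, if any, it uses.

```python
from collections import Counter


def rare_table_types(signatures, max_instances=2):
    """Return table types seen at most `max_instances` times.

    `signatures` holds one signature per word; rare signatures are
    candidates for manual review, not proven errors.
    """
    counts = Counter(signatures)
    return {sig: n for sig, n in counts.items() if n <= max_instances}


# Hypothetical signatures: two common table types and one outlier
# that looks like a typo in a tag name.
signatures = (
    ["nom-sg|nom-pl"] * 50
    + ["nom-sg"] * 12
    + ["nom-sg|nom-pll"]
)

suspects = rare_table_types(signatures)
print(suspects)
```

A rare table type is only a hint: some words have unusual but correct paradigms, so each candidate still needs to be checked against the Wiktionary entry.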

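The same rows are shown twice side by side: the left three columns are sorted by language (ascending), the right three by number of table forms (descending).
Language Table forms Errors (% affected words) Language Table forms Errors (% affected words)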
Acehnesisch 1 0 (0.00%) Deutsch 29071 0 (0.00%)
Afrikaans 8 6 (8.93%) Altgriechisch 115 810 (15.95%)
Akkadisch 8 220 (52.00%) Latein 97 366 (3.25%)
Albanisch 18 88 (18.62%) Polnisch 85 8 (0.03%)
Altenglisch 9 72 (28.83%) Englisch 66 96 (16.80%)
Altfranzösisch 2 0 (0.00%) Französisch 54 188 (10.21%)
Altgriechisch 115 810 (15.95%) Schwedisch 54 100 (0.00%)
Althochdeutsch 11 24 (0.00%) Italienisch 52 40 (0.01%)
Altirisch 4 0 (0.00%) Prußisch 44 424 (51.63%)
Altkirchenslawisch 9 68 (21.88%) Russisch 43 360 (0.00%)
Altnordisch 4 0 (0.00%) Tschechisch 42 0 (0.00%)
Alttschechisch 1 0 (0.00%) Niedersorbisch 40 52 (0.35%)
Arabisch 4 116 (98.63%) Niederländisch 39 122 (8.34%)
Armenisch 27 134 (3.64%) Spanisch 33 40 (12.41%)
Aserbaidschanisch 4 0 (0.00%) Ukrainisch 30 0 (0.00%)
Asturisch 3 0 (0.00%) Armenisch 27 134 (3.64%)
Bairisch 1 0 (0.00%) Slowakisch 26 0 (0.00%)
Baschkirisch 2 28 (100.00%) Neugriechisch 25 138 (14.90%)
Baskisch 6 52 (99.30%) Portugiesisch 24 40 (1.42%)
Belutschi 1 0 (0.00%) Dänisch 23 68 (15.09%)
Bengalisch 1 0 (0.00%) Rumänisch 20 388 (61.39%)
Birmanisch 3 0 (0.00%) Isländisch 20 50 (10.98%)
Bosnisch 7 0 (0.00%) Slowenisch 20 22 (0.00%)
Bretonisch 5 0 (0.00%) Kroatisch 20 14 (0.00%)
Bulgarisch 8 102 (60.00%) Niederdeutsch 19 118 (59.12%)
Chinesisch 2 0 (0.00%) Okzitanisch 19 18 (0.05%)
Deutsch 29071 0 (0.00%) Obersorbisch 19 46 (0.00%)
Durango-Nahuatl 1 0 (0.00%) Gotisch 19 32 (50.91%)
Dänisch 23 68 (15.09%) Ido 18 30 (41.64%)
Englisch 66 96 (16.80%) Albanisch 18 88 (18.62%)
Esperanto 11 8 (0.00%) Mazedonisch 18 0 (0.00%)
Estnisch 3 22 (80.95%) Katalanisch 17 42 (3.84%)
Faliskisch 4 0 (0.00%) Ungarisch 15 58 (78.27%)
Finnisch 13 82 (4.84%) Türkisch 15 192 (83.50%)
Französisch 54 188 (10.21%) Weißrussisch 15 0 (0.00%)
Friaulisch 5 0 (0.00%) Serbisch 15 212 (1.30%)
Frühneuhochdeutsch 8 0 (0.00%) Südpikenisch 14 0 (0.00%)
Fulfulde 3 0 (0.00%) Usbekisch 13 192 (0.00%)
Färöisch 10 62 (64.15%) Finnisch 13 82 (4.84%)
Galicisch 6 0 (0.00%) Irisch 13 4 (3.36%)
Georgisch 11 220 (77.85%) Maltesisch 12 60 (41.18%)
Gotisch 19 32 (50.91%) Esperanto 11 8 (0.00%)
Guaraní 1 0 (0.00%) Norwegisch 11 18 (18.52%)
Guerrero-Nahuatl 1 0 (0.00%) Lettisch 11 32 (11.64%)
Haitianisch 1 0 (0.00%) Althochdeutsch 11 24 (0.00%)
Hausa 8 14 (3.43%) Klassisches Nahuatl 11 0 (0.00%)
Hawaiianisch 2 0 (0.00%) Georgisch 11 220 (77.85%)
Hebräisch 2 36 (95.00%) Färöisch 10 62 (64.15%)
Hethitisch 7 0 (0.00%) Litauisch 10 0 (0.00%)
Hindi 3 0 (0.00%) Urdu 10 96 (4.89%)
Huastekisches Ost-Nahuatl 1 0 (0.00%) Paschtu 9 96 (1.56%)
Huastekisches West-Nahuatl 1 0 (0.00%) Altkirchenslawisch 9 68 (21.88%)
Huastekisches Zentral-Nahuatl 6 0 (0.00%) Zentral-Nahuatl 9 0 (0.00%)
Hurritisch 1 0 (0.00%) Altenglisch 9 72 (28.83%)
Ido 18 30 (41.64%) Walisisch 8 0 (0.00%)
Indonesisch 3 0 (0.00%) Koreanisch 8 0 (0.00%)
Interlingua 4 0 (0.00%) Hausa 8 14 (3.43%)
Interlingue 1 0 (0.00%) Bulgarisch 8 102 (60.00%)
International 2 0 (0.00%) Afrikaans 8 6 (8.93%)
Inuktitut 1 0 (0.00%) Suaheli 8 38 (31.19%)
Inupiaq 1 0 (0.00%) Frühneuhochdeutsch 8 0 (0.00%)
Irisch 13 4 (3.36%) Akkadisch 8 220 (52.00%)
Isländisch 20 50 (10.98%) Vietnamesisch 7 0 (0.00%)
Italienisch 52 40 (0.01%) Mittelhochdeutsch 7 0 (0.00%)
Jamaika-Kreolisch 4 6 (20.00%) Bosnisch 7 0 (0.00%)
Japanisch 1 0 (0.00%) Marsisch 7 0 (0.00%)
Jiddisch 3 0 (0.00%) Hethitisch 7 0 (0.00%)
Kasachisch 1 14 (100.00%) Sindhi 7 36 (0.00%)
Kaschubisch 3 0 (0.00%) Galicisch 6 0 (0.00%)
Katalanisch 17 42 (3.84%) Westfriesisch 6 54 (29.36%)
Kirchenslawisch 1 0 (0.00%) Baskisch 6 52 (99.30%)
Kirgisisch 2 14 (50.00%) Kurdisch 6 38 (80.54%)
Klassisches Nahuatl 11 0 (0.00%) Huastekisches Zentral-Nahuatl 6 0 (0.00%)
Klassisches Nahuatl‎ 3 0 (0.00%) Volskisch 6 0 (0.00%)
Komi 1 0 (0.00%) Scots 6 20 (17.65%)
Komorisch 1 0 (0.00%) Venezianisch 6 0 (0.00%)
Koptisch 5 0 (0.00%) Luxemburgisch 5 96 (9.09%)
Koreanisch 8 0 (0.00%) Friaulisch 5 0 (0.00%)
Korsisch 2 0 (0.00%) Bretonisch 5 0 (0.00%)
Kotava 1 2 (100.00%) Tetelcingo-Nahuatl 5 0 (0.00%)
Krimtatarisch 1 0 (0.00%) Rätoromanisch 5 0 (0.00%)
Kroatisch 20 14 (0.00%) Umbrisch 5 0 (0.00%)
Kurdisch 6 38 (80.54%) Koptisch 5 0 (0.00%)
Ladinisch 1 0 (0.00%) Interlingua 4 0 (0.00%)
Latein 97 366 (3.25%) Altnordisch 4 0 (0.00%)
Lettgallisch 1 0 (0.00%) Shona 4 0 (0.00%)
Lettisch 11 32 (11.64%) Arabisch 4 116 (98.63%)
Litauisch 10 0 (0.00%) Persisch 4 0 (0.00%)
Luxemburgisch 5 96 (9.09%) Tagalog 4 6 (57.14%)
Láadan 1 0 (0.00%) Altirisch 4 0 (0.00%)
Maltesisch 12 60 (41.18%) Aserbaidschanisch 4 0 (0.00%)
Maori 1 0 (0.00%) Sardisch 4 0 (0.00%)
Marathi 2 0 (0.00%) Samoanisch 4 0 (0.00%)
Marsisch 7 0 (0.00%) Oskisch 4 0 (0.00%)
Mazedonisch 18 0 (0.00%) West-Pandschabi 4 42 (4.43%)
Mezquital-Otomi 1 0 (0.00%) Faliskisch 4 0 (0.00%)
Mittelenglisch 1 0 (0.00%) Jamaika-Kreolisch 4 6 (20.00%)
Mittelgriechisch 1 0 (0.00%) Vestinisch 4 0 (0.00%)
Mittelhochdeutsch 7 0 (0.00%) Estnisch 3 22 (80.95%)
Mongolisch 1 6 (100.00%) Asturisch 3 0 (0.00%)
Morisien 1 0 (0.00%) Indonesisch 3 0 (0.00%)
Nahuatl 1 0 (0.00%) Kaschubisch 3 0 (0.00%)
Nauruisch 1 0 (0.00%) Hindi 3 0 (0.00%)
Nepalesisch 2 24 (0.00%) Volapük 3 0 (0.00%)
Neugriechisch 25 138 (14.90%) Serbokroatisch 3 14 (0.00%)
Niederdeutsch 19 118 (59.12%) Tadschikisch 3 0 (0.00%)
Niederländisch 39 122 (8.34%) Temascaltepec-Nahuatl 3 0 (0.00%)
Niedersorbisch 40 52 (0.35%) Sizilianisch 3 0 (0.00%)
Nord-Sotho 1 0 (0.00%) Jiddisch 3 0 (0.00%)
Nordfriesisch 1 0 (0.00%) Klassisches Nahuatl‎ 3 0 (0.00%)
Norwegisch 11 18 (18.52%) Sesotho 3 0 (0.00%)
Novial 1 0 (0.00%) isiZulu 3 0 (0.00%)
Obersorbisch 19 46 (0.00%) Fulfulde 3 0 (0.00%)
Okzitanisch 19 18 (0.05%) Sanskrit 3 102 (62.50%)
Orizaba-Nahuatl 3 0 (0.00%) Orizaba-Nahuatl 3 0 (0.00%)
Oromo 1 0 (0.00%) Birmanisch 3 0 (0.00%)
Oskisch 4 0 (0.00%) Chinesisch 2 0 (0.00%)
Papiamentu 2 0 (0.00%) International 2 0 (0.00%)
Paschtu 9 96 (1.56%) Hawaiianisch 2 0 (0.00%)
Persisch 4 0 (0.00%) Tetum 2 0 (0.00%)
Polabisch 2 2 (0.00%) Tok Pisin 2 0 (0.00%)
Polnisch 85 8 (0.03%) Papiamentu 2 0 (0.00%)
Portugiesisch 24 40 (1.42%) Schottisch-Gälisch 2 32 (75.00%)
Prußisch 44 424 (51.63%) Westflämisch 2 4 (0.00%)
Rumänisch 20 388 (61.39%) Hebräisch 2 36 (95.00%)
Russisch 43 360 (0.00%) Baschkirisch 2 28 (100.00%)
Rätoromanisch 5 0 (0.00%) Altfranzösisch 2 0 (0.00%)
Sami 1 0 (0.00%) Kirgisisch 2 14 (50.00%)
Samoanisch 4 0 (0.00%) Zentrales Puebla-Nahuatl 2 0 (0.00%)
Sanskrit 3 102 (62.50%) Korsisch 2 0 (0.00%)
Sardisch 4 0 (0.00%) Sumerisch 2 20 (74.29%)
Schottisch-Gälisch 2 32 (75.00%) Marathi 2 0 (0.00%)
Schwedisch 54 100 (0.00%) Nepalesisch 2 24 (0.00%)
Scots 6 20 (17.65%) Twi 2 0 (0.00%)
Serbisch 15 212 (1.30%) Polabisch 2 2 (0.00%)
Serbokroatisch 3 14 (0.00%) Japanisch 1 0 (0.00%)
Sesotho 3 0 (0.00%) Haitianisch 1 0 (0.00%)
Shona 4 0 (0.00%) Interlingue 1 0 (0.00%)
Sindarin 1 0 (0.00%) Mittelenglisch 1 0 (0.00%)
Sindhi 7 36 (0.00%) Krimtatarisch 1 0 (0.00%)
Sizilianisch 3 0 (0.00%) Lettgallisch 1 0 (0.00%)
Slowakisch 26 0 (0.00%) Nordfriesisch 1 0 (0.00%)
Slowenisch 20 22 (0.00%) Huastekisches Ost-Nahuatl 1 0 (0.00%)
Sogdisch 1 0 (0.00%) Nauruisch 1 0 (0.00%)
Somalisch 1 0 (0.00%) Thai 1 0 (0.00%)
Spanisch 33 40 (12.41%) Maori 1 0 (0.00%)
Suaheli 8 38 (31.19%) Tuvaluisch 1 2 (0.00%)
Sumerisch 2 20 (74.29%) Turkmenisch 1 12 (0.00%)
Südpikenisch 14 0 (0.00%) Tatarisch 1 12 (0.00%)
Tadschikisch 3 0 (0.00%) Kasachisch 1 14 (100.00%)
Tagalog 4 6 (57.14%) Komi 1 0 (0.00%)
Tahitianisch 1 0 (0.00%) Mongolisch 1 6 (100.00%)
Tatarisch 1 12 (0.00%) Nahuatl 1 0 (0.00%)
Telugu 1 0 (0.00%) Huastekisches West-Nahuatl 1 0 (0.00%)
Temascaltepec-Nahuatl 3 0 (0.00%) Nord-Sotho 1 0 (0.00%)
Tetelcingo-Nahuatl 5 0 (0.00%) Somalisch 1 0 (0.00%)
Tetum 2 0 (0.00%) Sindarin 1 0 (0.00%)
Thai 1 0 (0.00%) Guaraní 1 0 (0.00%)
Tok Pisin 2 0 (0.00%) Inuktitut 1 0 (0.00%)
Tschechisch 42 0 (0.00%) Acehnesisch 1 0 (0.00%)
Turkmenisch 1 12 (0.00%) Hurritisch 1 0 (0.00%)
Tuvaluisch 1 2 (0.00%) Bairisch 1 0 (0.00%)
Twi 2 0 (0.00%) Sami 1 0 (0.00%)
Türkisch 15 192 (83.50%) Novial 1 0 (0.00%)
Ukrainisch 30 0 (0.00%) Zentral-Alaska-Yupik 1 0 (0.00%)
Umbrisch 5 0 (0.00%) Telugu 1 0 (0.00%)
Ungarisch 15 58 (78.27%) Sogdisch 1 0 (0.00%)
Urdu 10 96 (4.89%) Oromo 1 0 (0.00%)
Usbekisch 13 192 (0.00%) Inupiaq 1 0 (0.00%)
Venezianisch 6 0 (0.00%) Belutschi 1 0 (0.00%)
Vestinisch 4 0 (0.00%) Bengalisch 1 0 (0.00%)
Vietnamesisch 7 0 (0.00%) Mittelgriechisch 1 0 (0.00%)
Volapük 3 0 (0.00%) Mezquital-Otomi 1 0 (0.00%)
Volskisch 6 0 (0.00%) Guerrero-Nahuatl 1 0 (0.00%)
Walisisch 8 0 (0.00%) Durango-Nahuatl 1 0 (0.00%)
Weißrussisch 15 0 (0.00%) Alttschechisch 1 0 (0.00%)
West-Pandschabi 4 42 (4.43%) Kotava 1 2 (100.00%)
Westflämisch 2 4 (0.00%) Láadan 1 0 (0.00%)
Westfriesisch 6 54 (29.36%) Komorisch 1 0 (0.00%)
Zentral-Alaska-Yupik 1 0 (0.00%) Morisien 1 0 (0.00%)
Zentral-Nahuatl 9 0 (0.00%) Tahitianisch 1 0 (0.00%)
Zentrales Puebla-Nahuatl 2 0 (0.00%) Kirchenslawisch 1 0 (0.00%)
isiZulu 3 0 (0.00%) Ladinisch 1 0 (0.00%)
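
For further processing, the rows above can be split back into fields. The following is a minimal sketch that assumes each line holds two records of the form "language, table forms, errors, (percentage)" as in the table header; the `Row` class, `ROW_RE` and `parse_line` are names invented for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    language: str
    table_forms: int
    errors: int
    percent_affected: float


# Language names may contain spaces or hyphens, so the name is matched
# lazily up to the two integers and the parenthesised percentage.
ROW_RE = re.compile(r"\s*(.+?)\s+(\d+)\s+(\d+)\s+\((\d+(?:\.\d+)?)%\)")


def parse_line(line):
    """Return every record found on one line of the table."""
    return [
        Row(m.group(1), int(m.group(2)), int(m.group(3)), float(m.group(4)))
        for m in ROW_RE.finditer(line)
    ]


left, right = parse_line("Afrikaans 8 6 (8.93%) Altgriechisch 115 810 (15.95%)")
print(left)
print(right)
```

Because each line carries one record from each sort order, collecting only the first record of every line yields the full table sorted by language.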

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2024-12-21 from the dewiktionary dump dated 2024-12-20 using wiktextract (d8cb2f3 and 4e554ae). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.