Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables. When wiktextract parses word heads and tables, it assigns tags describing grammatical or contextual information to the forms it encounters. The tags and forms found in each head section or table are first kept separate from those of other head sections and tables; later, they are merged into table types, each of which groups the tables that contain the same number of word forms with the same tags for those forms.
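The grouping described above can be illustrated with a minimal sketch. It is not wiktextract's actual implementation: the function names `table_signature` and `group_by_table_type` are hypothetical, the sample words are invented for illustration, and treating a table type as the sorted multiset of tag sets is an assumption about what "same number of word forms with the same tags" means.

```python
from collections import defaultdict


def table_signature(forms):
    """Return a hashable signature for one table (hypothetical helper).

    Assumption: a table type is defined by how many forms it has and
    which tags each form carries, regardless of the order of the forms.
    """
    return tuple(sorted(tuple(sorted(f["tags"])) for f in forms))


def group_by_table_type(tables):
    """Group (word, forms) pairs by their table signature."""
    groups = defaultdict(list)
    for word, forms in tables:
        groups[table_signature(forms)].append(word)
    return dict(groups)


# Invented sample data, shaped like lists of {"form", "tags"} dicts.
tables = [
    ("kucing", [{"form": "kucing", "tags": ["singular"]},
                {"form": "kucing-kucing", "tags": ["plural"]}]),
    ("buku", [{"form": "buku-buku", "tags": ["plural"]},
              {"form": "buku", "tags": ["singular"]}]),
    ("makan", [{"form": "makan", "tags": ["active"]}]),
]

groups = group_by_table_type(tables)
for signature, words in groups.items():
    print(len(signature), "form(s):", signature, "->", words)
```

Under these assumptions, the first two words fall into the same table type even though their forms appear in a different order, while the third word forms a type of its own.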

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos, and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.
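One way to act on this heuristic is to count how often each table type occurs and list the rare ones first. The sketch below is only an illustration: `rare_table_types` is a hypothetical helper, the threshold `max_instances=2` is an arbitrary stand-in for "only a few", and the signature strings are made-up placeholders for whatever identifies a table type.

```python
from collections import Counter


def rare_table_types(signatures, max_instances=2):
    """Return (signature, count) pairs for table types seen at most
    `max_instances` times, rarest first (hypothetical helper)."""
    counts = Counter(signatures)
    rare = [(sig, n) for sig, n in counts.items() if n <= max_instances]
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Made-up placeholder signatures, one per table instance.
signatures = ["sg+pl"] * 5 + ["sg only"] * 2 + ["sg+pl+dual"]

for sig, n in rare_table_types(signatures):
    print(f"{sig}: {n} instance(s)")
```

A rare type is only a candidate for review; it may equally be a legitimate but unusual paradigm.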

Language | Table forms | Errors (% affected words)
Bahas Melayu | 1 | 0 (0.00%)
Bahas Scots | 1 | 0 (0.00%)
Bahasa Afrikaans | 3 | 8 (44.00%)
Bahasa Ainu | 1 | 2 (100.00%)
Bahasa Albania | 1 | 6 (100.00%)
Bahasa Amuzgo San Pedro Amuzgos | 1 | 0 (0.00%)
Bahasa Arab | 3 | 6 (86.67%)
Bahasa Arab Hijaz | 1 | 0 (0.00%)
Bahasa Asturia | 1 | 0 (0.00%)
Bahasa Azerbaijan | 1 | 2 (0.00%)
Bahasa Azeri | 2 | 8 (20.00%)
Bahasa Bali | 1 | 6 (100.00%)
Bahasa Banjar | 1 | 0 (0.00%)
Bahasa Belanda | 3 | 6 (98.01%)
Bahasa Bugis | 1 | 2 (100.00%)
Bahasa Burma | 1 | 4 (100.00%)
Bahasa Cam Barat | 2 | 2 (0.00%)
Bahasa Catalan | 1 | 0 (0.00%)
Bahasa Catalonia | 1 | 2 (100.00%)
Bahasa Cina | 1 | 0 (0.00%)
Bahasa Denmark | 1 | 10 (100.00%)
Bahasa Dhivehi | 1 | 0 (0.00%)
Bahasa Estonia | 2 | 6 (1.75%)
Bahasa Farefare | 1 | 0 (0.00%)
Bahasa Georgia | 2 | 2 (50.00%)
Bahasa Ghotuo | 1 | 0 (0.00%)
Bahasa Hindi | 2 | 2 (84.72%)
Bahasa Hungary | 1 | 0 (0.00%)
Bahasa Iban | 1 | 0 (0.00%)
Bahasa Ibrani | 1 | 2 (100.00%)
Bahasa Iceland | 1 | 2 (100.00%)
Bahasa Igbo | 2 | 2 (50.00%)
Bahasa Ilocano | 1 | 2 (100.00%)
Bahasa Indoneisa | 1 | 0 (0.00%)
Bahasa Indonesia | 4 | 2 (0.29%)
Bahasa Ingeris | 1 | 4 (100.00%)
Bahasa Inggeris | 6 | 6 (98.31%)
Bahasa Ireland | 1 | 4 (100.00%)
Bahasa Itali | 1 | 0 (0.00%)
Bahasa Jawa | 4 | 4 (89.13%)
Bahasa Jepun | 3 | 2 (88.68%)
Bahasa Jerman | 7 | 10 (3.44%)
Bahasa Kantonis | 1 | 2 (100.00%)
Bahasa Kazakh | 1 | 2 (100.00%)
Bahasa Kikuyu | 1 | 0 (0.00%)
Bahasa Korea | 4 | 4 (76.36%)
Bahasa Kunigami | 1 | 2 (100.00%)
Bahasa Kurdi Utara | 1 | 2 (100.00%)
Bahasa Ladino | 1 | 2 (100.00%)
Bahasa Latin | 1 | 0 (0.00%)
Bahasa Limbu | 1 | 2 (100.00%)
Bahasa Makassar | 1 | 2 (100.00%)
Bahasa Makau | 1 | 0 (0.00%)
Bahasa Malta | 1 | 0 (0.00%)
Bahasa Mandarin | 4 | 6 (1.02%)
Bahasa Maori | 1 | 2 (100.00%)
Bahasa Melayu | 14 | 14 (3.44%)
Bahasa Melayu Brunei | 2 | 2 (71.43%)
Bahasa Melayu Kedah | 1 | 2 (100.00%)
Bahasa Melayu Kelantan-Patani | 3 | 2 (0.00%)
Bahasa Melayu Sarawak | 1 | 0 (0.00%)
Bahasa Melayu Terengganu Pesisir | 1 | 0 (0.00%)
Bahasa Minangkabau | 2 | 2 (50.00%)
Bahasa Miranda | 1 | 0 (0.00%)
Bahasa Moore | 1 | 0 (0.00%)
Bahasa Mooré | 2 | 0 (0.00%)
Bahasa Nias | 1 | 2 (100.00%)
Bahasa Norman | 1 | 0 (0.00%)
Bahasa Norway Bokmål | 1 | 8 (100.00%)
Bahasa Norway Nynorsk | 1 | 8 (100.00%)
Bahasa Okinawa | 1 | 4 (100.00%)
Bahasa Parsi | 3 | 4 (70.71%)
Bahasa Perancis | 4 | 10 (10.16%)
Bahasa Perancis Lama | 1 | 10 (100.00%)
Bahasa Phalura | 1 | 2 (100.00%)
Bahasa Piedmont | 1 | 0 (0.00%)
Bahasa Poland | 1 | 4 (100.00%)
Bahasa Portugis | 1 | 2 (100.00%)
Bahasa Provençal Kuno | 1 | 6 (100.00%)
Bahasa Punic | 1 | 0 (0.00%)
Bahasa Punjabi | 2 | 2 (66.67%)
Bahasa Rohingya | 1 | 2 (100.00%)
Bahasa Rungus | 1 | 0 (0.00%)
Bahasa Semai | 1 | 0 (0.00%)
Bahasa Sepanyol | 3 | 4 (99.03%)
Bahasa Serbia | 1 | 2 (100.00%)
Bahasa Sinhala | 1 | 0 (0.00%)
Bahasa Slovak | 1 | 8 (100.00%)
Bahasa Slovene | 1 | 2 (100.00%)
Bahasa Sorbia Bawah | 1 | 2 (100.00%)
Bahasa Suluk | 1 | 2 (100.00%)
Bahasa Sunda | 1 | 2 (100.00%)
Bahasa Suryani Klasik | 1 | 0 (0.00%)
Bahasa Swahili | 1 | 2 (100.00%)
Bahasa Tagalog | 1 | 2 (100.00%)
Bahasa Tajik | 1 | 2 (100.00%)
Bahasa Tatar | 1 | 2 (100.00%)
Bahasa Telugu | 1 | 0 (0.00%)
Bahasa Thai | 1 | 2 (100.00%)
Bahasa Turki | 2 | 4 (4.00%)
Bahasa Turkmen | 3 | 10 (13.11%)
Bahasa Uyghur | 1 | 0 (0.00%)
Bahasa Uzbek | 1 | 0 (0.00%)
Bahasa Wales | 1 | 2 (100.00%)
Bahasa Yonaguni | 1 | 2 (100.00%)
Bahasa Yunani | 1 | 2 (100.00%)
Bahasa Yup'ik | 1 | 0 (0.00%)
English | 1 | 2 (100.00%)
Persian | 1 | 0 (0.00%)
Rentas bahasa | 3 | 4 (62.50%)
Translingual | 4 | 4 (50.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2025-12-15 from the mswiktionary dump dated 2025-12-01 using wiktextract (e2469cc and 9905b1f). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.