Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables. When wiktextract parses word heads and tables, it assigns tags describing grammatical or contextual information to the forms it encounters. The forms and tags found in each head section or table are first kept separate from those of other head sections and tables; they are later merged into table types, each of which groups the heads and tables that contain the same number of word forms with the same tags for those forms.
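The grouping described above can be illustrated with a minimal Python sketch. It is not wiktextract's actual implementation: the helper names `table_signature` and `group_by_table_type`, the choice of signature (the sorted tag sets of all forms), and the sample entries are assumptions made purely for illustration. The only thing borrowed from wiktextract is the general shape of a form record, a dictionary with a `form` string and a `tags` list.

```python
from collections import defaultdict


def table_signature(forms):
    """Build a hashable signature for one table (hypothetical helper).

    The signature is the sorted tuple of each form's sorted tags, so two
    tables with the same number of forms and the same tags compare equal
    regardless of the order in which forms or tags were listed.
    """
    return tuple(sorted(tuple(sorted(f["tags"])) for f in forms))


def group_by_table_type(entries):
    """Group words by table signature (hypothetical helper).

    `entries` is an iterable of (word, forms) pairs.
    """
    groups = defaultdict(list)
    for word, forms in entries:
        groups[table_signature(forms)].append(word)
    return dict(groups)


# Made-up sample data, not taken from the actual dump.
entries = [
    ("kucing", [{"form": "kucing-kucing", "tags": ["plural"]}]),
    ("buku", [{"form": "buku-buku", "tags": ["plural"]}]),
    ("makan", [
        {"form": "dimakan", "tags": ["passive"]},
        {"form": "memakan", "tags": ["active"]},
    ]),
]

groups = group_by_table_type(entries)
for signature, words in groups.items():
    print(len(signature), "form(s):", signature, "->", words)
```

In this sketch the two single-form "plural" tables collapse into one table type, while the two-form table becomes a table type of its own.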

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos, and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.
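As a rough illustration of that heuristic, the sketch below counts how often each table type occurs and reports the ones seen only a handful of times. The function name `rare_table_types`, the `max_instances` threshold, and the sample labels are hypothetical; nothing here reflects how this page's statistics are actually computed.

```python
from collections import Counter


def rare_table_types(signatures, max_instances=2):
    """Return table types that occur at most `max_instances` times.

    `signatures` is an iterable with one hashable table-type label per
    table instance. The threshold is an arbitrary assumption.
    """
    counts = Counter(signatures)
    return sorted(sig for sig, n in counts.items() if n <= max_instances)


# Made-up sample: 40 tables tagged "plural", 12 tagged "active+passive",
# and a single table whose tag looks like a typo.
signatures = ["plural"] * 40 + ["active+passive"] * 12 + ["plurl"]

print(rare_table_types(signatures))
```

A table type flagged this way is only a candidate for review; a rare type can also be a genuine but uncommon paradigm.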

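Each row below holds two entries of the form Language, Table forms, Errors (% affected words): the left entry comes from the list sorted by language (ascending), the right entry from the list sorted by table forms (descending).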
Bahas Melayu 2 2 (50.00%) Bahasa Melayu 23 42 (67.35%)
Bahasa Afrikaans 3 8 (46.15%) Bahasa Indonesia 8 10 (1.05%)
Bahasa Ainu 1 2 (100.00%) Bahasa Inggeris 7 8 (98.50%)
Bahasa Albania 1 6 (100.00%) Bahasa Jerman 7 12 (3.44%)
Bahasa Amuzgo San Pedro Amuzgos 1 0 (0.00%) Bahasa Korea 6 6 (45.30%)
Bahasa Arab 3 4 (86.67%) Bahasa Jawa 4 6 (89.13%)
Bahasa Arab Hijaz 1 0 (0.00%) Bahasa Perancis 4 14 (10.16%)
Bahasa Asturia 1 0 (0.00%) Rentas bahasa 4 6 (66.67%)
Bahasa Azerbaijan 1 2 (0.00%) Translingual 4 6 (47.37%)
Bahasa Azeri 2 8 (20.00%) Bahasa Mandarin 4 6 (1.02%)
Bahasa Bali 1 6 (100.00%) Bahasa Sepanyol 3 4 (99.03%)
Bahasa Banjar 1 0 (0.00%) Bahasa Jepun 3 4 (88.46%)
Bahasa Belanda 3 12 (98.00%) Bahasa Melayu Kelantan-Patani 3 2 (0.00%)
Bahasa Bugis 1 2 (100.00%) Bahasa Belanda 3 12 (98.00%)
Bahasa Burma 1 2 (100.00%) Bahasa Arab 3 4 (86.67%)
Bahasa Cam Barat 2 2 (0.00%) Bahasa Parsi 3 4 (70.71%)
Bahasa Catalonia 1 2 (100.00%) Bahasa Afrikaans 3 8 (46.15%)
Bahasa Cina 1 0 (0.00%) Bahasa Turkmen 3 10 (13.11%)
Bahasa Denmark 1 10 (100.00%) Bahasa Minangkabau 2 2 (50.00%)
Bahasa Dhivehi 1 0 (0.00%) Bahasa Turki 2 4 (4.00%)
Bahasa Estonia 2 4 (1.75%) Bahasa Punjabi 2 2 (66.67%)
Bahasa Farefare 1 0 (0.00%) Bahasa Estonia 2 4 (1.75%)
Bahasa Georgia 2 2 (50.00%) Bahasa Melayu Brunei 2 2 (71.43%)
Bahasa Ghotuo 1 0 (0.00%) Bahasa Cam Barat 2 2 (0.00%)
Bahasa Hindi 2 2 (84.72%) Bahasa Hindi 2 2 (84.72%)
Bahasa Hungary 1 0 (0.00%) Bahasa Georgia 2 2 (50.00%)
Bahasa Iban 1 0 (0.00%) Bahasa Azeri 2 8 (20.00%)
Bahasa Ibrani 1 2 (100.00%) Bahasa Mooré 2 0 (0.00%)
Bahasa Iceland 1 4 (100.00%) Bahas Melayu 2 2 (50.00%)
Bahasa Igbo 2 2 (50.00%) Bahasa Igbo 2 2 (50.00%)
Bahasa Ilocano 1 2 (100.00%) Bahasa Semai 1 0 (0.00%)
Bahasa Indonesia 8 10 (1.05%) Bahasa Sunda 1 2 (100.00%)
Bahasa Ingeris 1 4 (100.00%) Bahasa Wales 1 2 (100.00%)
Bahasa Inggeris 7 8 (98.50%) Bahasa Banjar 1 0 (0.00%)
Bahasa Ireland 1 2 (100.00%) Bahasa Portugis 1 2 (100.00%)
Bahasa Itali 1 0 (0.00%) Bahasa Latin 1 0 (0.00%)
Bahasa Jawa 4 6 (89.13%) Bahasa Moore 1 0 (0.00%)
Bahasa Jepun 3 4 (88.46%) Bahasa Nias 1 2 (100.00%)
Bahasa Jerman 7 12 (3.44%) Bahasa Bugis 1 2 (100.00%)
Bahasa Kantonis 1 2 (100.00%) Bahasa Iban 1 0 (0.00%)
Bahasa Kazakh 1 2 (100.00%) Bahasa Kimaragang 1 6 (100.00%)
Bahasa Kelantan 1 0 (0.00%) Bahasa Rohingya 1 2 (100.00%)
Bahasa Kikuyu 1 0 (0.00%) Bahasa Itali 1 0 (0.00%)
Bahasa Kimaragang 1 6 (100.00%) Bahasa Suluk 1 2 (100.00%)
Bahasa Korea 6 6 (45.30%) Bahasa Melayu Sarawak 1 0 (0.00%)
Bahasa Kunigami 1 2 (100.00%) Bahasa Kantonis 1 2 (100.00%)
Bahasa Kurdi Utara 1 2 (100.00%) Bahasa Cina 1 0 (0.00%)
Bahasa Ladino 1 2 (100.00%) Bahasa Ilocano 1 2 (100.00%)
Bahasa Latin 1 0 (0.00%) Bahasa Tagalog 1 2 (100.00%)
Bahasa Limbu 1 2 (100.00%) Bahasa Yonaguni 1 2 (100.00%)
Bahasa Makassar 1 2 (100.00%) Bahasa Malta 1 0 (0.00%)
Bahasa Makau 1 0 (0.00%) Bahasa Maori 1 2 (100.00%)
Bahasa Malta 1 0 (0.00%) Bahasa Rungus 1 0 (0.00%)
Bahasa Mandarin 4 6 (1.02%) Bahasa Thai 1 2 (100.00%)
Bahasa Maori 1 2 (100.00%) Bahasa Swahili 1 2 (100.00%)
Bahasa Melayu 23 42 (67.35%) Bahasa Melayu Kedah 1 2 (100.00%)
Bahasa Melayu Brunei 2 2 (71.43%) Bahasa Catalonia 1 2 (100.00%)
Bahasa Melayu Kedah 1 2 (100.00%) Bahasa Poland 1 4 (100.00%)
Bahasa Melayu Kelantan-Patani 3 2 (0.00%) Bahasa Slovak 1 8 (100.00%)
Bahasa Melayu Sarawak 1 0 (0.00%) Bahasa Hungary 1 0 (0.00%)
Bahasa Melayu Terengganu Pesisir 1 0 (0.00%) Bahasa Ireland 1 2 (100.00%)
Bahasa Minangkabau 2 2 (50.00%) Bahasa Provençal Kuno 1 6 (100.00%)
Bahasa Miranda 1 0 (0.00%) Bahasa Albania 1 6 (100.00%)
Bahasa Moore 1 0 (0.00%) Bahasa Ibrani 1 2 (100.00%)
Bahasa Mooré 2 0 (0.00%) Bahasa Burma 1 2 (100.00%)
Bahasa Nias 1 2 (100.00%) Bahasa Tajik 1 2 (100.00%)
Bahasa Norman 1 0 (0.00%) Bahasa Tatar 1 2 (100.00%)
Bahasa Norway Bokmål 1 2 (100.00%) Bahasa Kazakh 1 2 (100.00%)
Bahasa Norway Nynorsk 1 8 (100.00%) Bahasa Iceland 1 4 (100.00%)
Bahasa Okinawa 1 4 (100.00%) Bahasa Telugu 1 0 (0.00%)
Bahasa Parsi 3 4 (70.71%) Bahasa Asturia 1 0 (0.00%)
Bahasa Perancis 4 14 (10.16%) Bahasa Yunani 1 2 (100.00%)
Bahasa Perancis Kuno 1 10 (100.00%) Bahasa Denmark 1 10 (100.00%)
Bahasa Perancis Lama 1 6 (100.00%) Bahasa Norway Nynorsk 1 8 (100.00%)
Bahasa Phalura 1 2 (100.00%) Bahasa Norway Bokmål 1 2 (100.00%)
Bahasa Piedmont 1 0 (0.00%) Bahasa Slovene 1 2 (100.00%)
Bahasa Poland 1 4 (100.00%) Bahasa Kurdi Utara 1 2 (100.00%)
Bahasa Portugis 1 2 (100.00%) Bahasa Ladino 1 2 (100.00%)
Bahasa Provençal Kuno 1 6 (100.00%) Bahasa Sorbia Bawah 1 2 (100.00%)
Bahasa Punic 1 0 (0.00%) Bahasa Scots 1 0 (0.00%)
Bahasa Punjabi 2 2 (66.67%) Bahasa Perancis Lama 1 6 (100.00%)
Bahasa Rohingya 1 2 (100.00%) Persian 1 0 (0.00%)
Bahasa Rungus 1 0 (0.00%) Bahasa Ingeris 1 4 (100.00%)
Bahasa Scots 1 0 (0.00%) Bahasa Piedmont 1 0 (0.00%)
Bahasa Semai 1 0 (0.00%) Bahasa Okinawa 1 4 (100.00%)
Bahasa Sepanyol 3 4 (99.03%) Bahasa Punic 1 0 (0.00%)
Bahasa Sinhala 1 0 (0.00%) Bahasa Norman 1 0 (0.00%)
Bahasa Slovak 1 8 (100.00%) Bahasa Tobilung 1 6 (100.00%)
Bahasa Slovene 1 2 (100.00%) Bahasa Bali 1 6 (100.00%)
Bahasa Sorbia Bawah 1 2 (100.00%) Bahasa Suryani Klasik 1 0 (0.00%)
Bahasa Suluk 1 2 (100.00%) Bahasa Uyghur 1 0 (0.00%)
Bahasa Sunda 1 2 (100.00%) Bahasa Uzbek 1 0 (0.00%)
Bahasa Suryani Klasik 1 0 (0.00%) Bahasa Ghotuo 1 0 (0.00%)
Bahasa Swahili 1 2 (100.00%) Bahasa Perancis Kuno 1 10 (100.00%)
Bahasa Tagalog 1 2 (100.00%) Bahasa Kikuyu 1 0 (0.00%)
Bahasa Tajik 1 2 (100.00%) Bahasa Arab Hijaz 1 0 (0.00%)
Bahasa Tatar 1 2 (100.00%) Bahasa Azerbaijan 1 2 (0.00%)
Bahasa Telugu 1 0 (0.00%) Bahasa Miranda 1 0 (0.00%)
Bahasa Thai 1 2 (100.00%) Bahasa Amuzgo San Pedro Amuzgos 1 0 (0.00%)
Bahasa Tobilung 1 6 (100.00%) Bahasa Melayu Terengganu Pesisir 1 0 (0.00%)
Bahasa Turki 2 4 (4.00%) Bahasa Makau 1 0 (0.00%)
Bahasa Turkmen 3 10 (13.11%) Bahsa Melayu 1 2 (100.00%)
Bahasa Uyghur 1 0 (0.00%) Bahasa Ainu 1 2 (100.00%)
Bahasa Uzbek 1 0 (0.00%) Bahasa Limbu 1 2 (100.00%)
Bahasa Wales 1 2 (100.00%) Bahasa Phalura 1 2 (100.00%)
Bahasa Yonaguni 1 2 (100.00%) Bahasa Kunigami 1 2 (100.00%)
Bahasa Yunani 1 2 (100.00%) Bahasa Makassar 1 2 (100.00%)
Bahasa Yup'ik 1 0 (0.00%) English 1 2 (100.00%)
Bahsa Melayu 1 2 (100.00%) Bahasa Sinhala 1 0 (0.00%)
English 1 2 (100.00%) bahasa Melayu 1 2 (100.00%)
Persian 1 0 (0.00%) Bahasa Yup'ik 1 0 (0.00%)
Rentas bahasa 4 6 (66.67%) Bahasa Farefare 1 0 (0.00%)
Translingual 4 6 (47.37%) Bahasa Dhivehi 1 0 (0.00%)
bahasa Melayu 1 2 (100.00%) Bahasa Kelantan 1 0 (0.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2026-01-06 from the mswiktionary dump dated 2026-01-01 using wiktextract (96027d6 and 9905b1f). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.