Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables. When wiktextract parses word heads and tables, it assigns tags describing grammatical or contextual information to the forms it encounters. The tags and forms found in each head section or table are first kept separate from those of other head sections and tables. Later they are merged into table types, each of which groups the heads and tables that contain the same number of word forms with the same tags for those forms.
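The grouping step can be illustrated with a minimal sketch. It is not wiktextract's own code: the helper names `table_signature` and `group_by_table_type` are hypothetical, the sample entries are invented, and the `word`, `forms` and `tags` field names are assumed to mirror the JSON that wiktextract emits. Two tables are treated as the same table type when they have the same number of forms carrying the same tag sets.

```python
from collections import defaultdict


def table_signature(forms):
    """Return a hashable signature: the sorted tag sets of every form.

    Two tables get the same signature only if they have the same number
    of forms and the same tags on those forms (hypothetical helper).
    """
    return tuple(sorted(tuple(sorted(f.get("tags", []))) for f in forms))


def group_by_table_type(entries):
    """Map each signature to the list of words whose forms share it."""
    groups = defaultdict(list)
    for entry in entries:
        signature = table_signature(entry.get("forms", []))
        groups[signature].append(entry["word"])
    return dict(groups)


# Invented sample entries; field names are an assumption about the data.
entries = [
    {"word": "a", "forms": [{"form": "a-s", "tags": ["plural"]}]},
    {"word": "b", "forms": [{"form": "b-s", "tags": ["plural"]}]},
    {"word": "c", "forms": [{"form": "c-s", "tags": ["plural"]},
                            {"form": "c-d", "tags": ["definite"]}]},
]

table_types = group_by_table_type(entries)
for signature, words in table_types.items():
    print(len(signature), "form(s):", signature, "->", words)
```

Sorting the tags inside each form, and the forms inside each table, makes the signature independent of the order in which forms were encountered. Whether the real merge is order-independent is not stated on this page.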

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.
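As a rough illustration of that heuristic, the sketch below counts how often each table type occurs and reports the ones seen only rarely. The function name `rare_table_types`, the threshold `max_instances` and the string labels standing in for table types are all hypothetical. The page does not define how many instances count as "a few", so the cutoff is an arbitrary assumption.

```python
from collections import Counter


def rare_table_types(signatures, max_instances=1):
    """Return the table types seen at most `max_instances` times.

    `signatures` holds one hashable label per parsed table. The
    threshold is an arbitrary assumption, not a value from the site.
    """
    counts = Counter(signatures)
    return sorted(sig for sig, n in counts.items() if n <= max_instances)


# Invented labels: one common type, one rare type, one probable typo.
observed = ["plural", "plural", "plural", "plural+definite", "plrual"]

suspects = rare_table_types(observed)
print(suspects)
```

A rare table type is only a candidate for review. It may equally be a legitimate but uncommon paradigm.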

Language (sorted ascending) | Table forms | Errors (% affected words) || Language | Table forms (sorted descending) | Errors (% affected words)
:Templat:Dayak Benuaq | 1 | 2 (100.00%) || Bahasa Indonesia | 12 | 8 (3.36%)
:Templat:Dayak Maanyan | 1 | 2 (100.00%) || bahasa Indonesia Peranakan | 9 | 4 (1.22%)
:Templat:batak toba | 1 | 2 (100.00%) || Bahasa Indonesia Peranakan | 7 | 2 (1.92%)
:Templat:dusun balangan | 1 | 2 (100.00%) || bahasa Indonesia | 6 | 2 (16.00%)
:Templat:jawa ngapak | 1 | 2 (100.00%) || Bahasa Jawa | 3 | 2 (80.00%)
:Templat:jawa ngoko | 1 | 2 (100.00%) || Bahasa Melayu | 3 | 2 (14.29%)
:Templat:kutai | 1 | 2 (100.00%) || Bahasa Jepang | 3 | 20 (68.42%)
:Templat:lamin adat | 1 | 2 (100.00%) || Bahasa Simalungun | 3 | 4 (33.33%)
:Templat:maanyan siong | 1 | 2 (100.00%) || Bahasa Banjar | 2 | 0 (0.00%)
:Templat:maluku | 1 | 2 (100.00%) || Bahasa Inggris | 1 | 2 (100.00%)
:Templat:palembang ogan | 1 | 2 (100.00%) || bahasa Inggris | 1 | 0 (0.00%)
:Templat:samihim | 1 | 2 (100.00%) || :Templat:sulawesi | 1 | 2 (100.00%)
:Templat:semarang | 1 | 2 (100.00%) || bahasa Hakka | 1 | 6 (100.00%)
:Templat:semendo | 1 | 2 (100.00%) || bahasa Melayu | 1 | 0 (0.00%)
:Templat:sulawesi | 1 | 2 (100.00%) || Bahasa Lampung Api | 1 | 2 (100.00%)
Bahasa Banjar | 2 | 0 (0.00%) || :Templat:batak toba | 1 | 2 (100.00%)
Bahasa Bintauna | 1 | 0 (0.00%) || Bahasa Bintauna | 1 | 0 (0.00%)
Bahasa Indonesia | 12 | 8 (3.36%) || bahasa Asturia | 1 | 0 (0.00%)
Bahasa Indonesia Peranakan | 7 | 2 (1.92%) || Bahasa Sunda | 1 | 2 (100.00%)
Bahasa Inggris | 1 | 2 (100.00%) || GHWOSMXbahasa Indonesia | 1 | 0 (0.00%)
Bahasa Jawa | 3 | 2 (80.00%) || Bahasa Tionghoa | 1 | 0 (0.00%)
Bahasa Jepang | 3 | 20 (68.42%) || Bahasa Korea | 1 | 4 (100.00%)
Bahasa Korea | 1 | 4 (100.00%) || Lintas bahasa | 1 | 0 (0.00%)
Bahasa Lampung Api | 1 | 2 (100.00%) || Bahasa Vietnam | 1 | 2 (100.00%)
Bahasa Melayu | 3 | 2 (14.29%) || bahasa Korea | 1 | 4 (100.00%)
Bahasa Rusia | 1 | 2 (100.00%) || :Templat:kutai | 1 | 2 (100.00%)
Bahasa Simalungun | 3 | 4 (33.33%) || bahasa Bali | 1 | 2 (100.00%)
Bahasa Sunda | 1 | 2 (100.00%) || bahasa Armenia | 1 | 0 (0.00%)
Bahasa Tionghoa | 1 | 0 (0.00%) || Bahasa Rusia | 1 | 2 (100.00%)
Bahasa Vietnam | 1 | 2 (100.00%) || Bahasa Yami | 1 | 0 (0.00%)
Bahasa Yami | 1 | 0 (0.00%) || :Templat:maanyan siong | 1 | 2 (100.00%)
GHWOSMXbahasa Indonesia | 1 | 0 (0.00%) || :Templat:dusun balangan | 1 | 2 (100.00%)
Lintas bahasa | 1 | 0 (0.00%) || :Templat:samihim | 1 | 2 (100.00%)
bahasa Armenia | 1 | 0 (0.00%) || bahasa Gorontalo ( dalam Bahasa Belanda ) | 1 | 2 (100.00%)
bahasa Asturia | 1 | 0 (0.00%) || bahasa Jepang lama | 1 | 2 (100.00%)
bahasa Bali | 1 | 2 (100.00%) || bahasa Jepang Kuno | 1 | 2 (100.00%)
bahasa Gorontalo ( dalam Bahasa Belanda ) | 1 | 2 (100.00%) || bahasa Tamiang | 1 | 0 (0.00%)
bahasa Hakka | 1 | 6 (100.00%) || bahasa indonesia | 1 | 0 (0.00%)
bahasa Indonesia | 6 | 2 (16.00%) || bahasa Turkmen | 1 | 0 (0.00%)
bahasa Indonesia Peranakan | 9 | 4 (1.22%) || bahasa Kimaragang | 1 | 2 (100.00%)
bahasa Inggris | 1 | 0 (0.00%) || :Templat:jawa ngapak | 1 | 2 (100.00%)
bahasa Jepang Kuno | 1 | 2 (100.00%) || :Templat:semendo | 1 | 2 (100.00%)
bahasa Jepang lama | 1 | 2 (100.00%) || :Templat:jawa ngoko | 1 | 2 (100.00%)
bahasa Kimaragang | 1 | 2 (100.00%) || :Templat:palembang ogan | 1 | 2 (100.00%)
bahasa Korea | 1 | 4 (100.00%) || :Templat:maluku | 1 | 2 (100.00%)
bahasa Melayu | 1 | 0 (0.00%) || :Templat:semarang | 1 | 2 (100.00%)
bahasa Tamiang | 1 | 0 (0.00%) || :Templat:lamin adat | 1 | 2 (100.00%)
bahasa Turkmen | 1 | 0 (0.00%) || :Templat:Dayak Maanyan | 1 | 2 (100.00%)
bahasa indonesia | 1 | 0 (0.00%) || :Templat:Dayak Benuaq | 1 | 2 (100.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2026-02-19 from the idwiktionary dump dated 2026-02-01 using wiktextract (f492ef9 and 59dc20b). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.