Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables found in the data. When wiktextract parses word heads and tables, it assigns tags to the forms it encounters; the tags describe grammatical or contextual information. The forms and tags found in each head section or table are first kept separate from those of other head sections and tables. Later they are merged into table types, each of which groups the heads and tables that contain the same number of word forms with the same tags for those forms.
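The grouping described above can be sketched as follows. This is only an illustration of the idea, not wiktextract's actual implementation: the function names, the sample words and the tag names are assumptions made for the example, and each form is assumed to be a dictionary with "form" and "tags" keys.

```python
from collections import defaultdict


def table_signature(forms):
    """Build a hashable signature from the tag sets of a table's forms.

    Two tables share a signature when they have the same number of
    forms and the same tags for those forms.
    """
    return tuple(sorted(tuple(sorted(f["tags"])) for f in forms))


def group_into_table_types(tables):
    """Group (word, forms) pairs into table types keyed by signature."""
    types = defaultdict(list)
    for word, forms in tables:
        types[table_signature(forms)].append(word)
    return dict(types)


# Hypothetical sample data, invented for illustration only.
tables = [
    ("makan", [{"form": "memakan", "tags": ["active"]},
               {"form": "dimakan", "tags": ["passive"]}]),
    ("baca", [{"form": "membaca", "tags": ["active"]},
              {"form": "dibaca", "tags": ["passive"]}]),
    ("rumah", [{"form": "rumah-rumah", "tags": ["plural"]}]),
]

table_types = group_into_table_types(tables)
for signature, words in table_types.items():
    print(len(signature), "forms:", signature, "->", words)
```

In this sketch the first two words end up in one table type because their forms carry identical tags, while the third forms a table type of its own.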

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.
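One way to act on this heuristic is to flag table types with very few instances for manual review. The sketch below assumes a hypothetical mapping from a table-type signature to the words that use it; the threshold of two instances and the sample data are arbitrary assumptions, not values taken from wiktextract.

```python
def rare_table_types(table_types, max_instances=2):
    """Return the table types used by at most max_instances words.

    table_types maps a signature (any hashable value) to the list of
    words whose tables have that signature.
    """
    return {
        signature: words
        for signature, words in table_types.items()
        if len(words) <= max_instances
    }


# Hypothetical sample data, invented for illustration only.
sample_types = {
    ("active", "passive"): ["makan", "baca", "tulis", "ambil"],
    ("plural",): ["rumah"],
}

suspicious = rare_table_types(sample_types)
for signature, words in suspicious.items():
    print("check manually:", signature, words)
```

A rare table type is not necessarily wrong; the filter only narrows down which entries are worth inspecting first.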

Language | Table forms | Errors (% affected words)
:Templat:Lampung Api | 1 | 2 (100.00%)
:Templat:dusun balangan | 1 | 2 (100.00%)
:Templat:maanyan siong | 1 | 2 (100.00%)
:Templat:melayu sambas | 1 | 2 (100.00%)
:Templat:samihim | 1 | 2 (100.00%)
Arti | 1 | 4 (100.00%)
Bahasa Indonesia | 1 | 0 (0.00%)
Bahasa Jawa | 3 | 2 (85.71%)
GHWOSMXbahasa Indonesia | 1 | 0 (0.00%)
bahasa Aceh | 1 | 2 (100.00%)
bahasa Arab | 1 | 2 (100.00%)
bahasa Badui | 1 | 2 (100.00%)
bahasa Bahnar | 1 | 2 (100.00%)
bahasa Banjar | 2 | 2 (50.00%)
bahasa Batak Mandailing | 1 | 2 (100.00%)
bahasa Batak Simalungun | 3 | 4 (40.00%)
bahasa Batak Toba | 1 | 2 (100.00%)
bahasa Belanda | 2 | 4 (100.00%)
bahasa Berawan | 1 | 2 (100.00%)
bahasa Betawi | 1 | 2 (100.00%)
bahasa Bugis | 1 | 2 (100.00%)
bahasa Cham Timur | 1 | 2 (100.00%)
bahasa Esperanto | 1 | 2 (100.00%)
bahasa Gorontalo | 1 | 2 (100.00%)
bahasa Gorontalo ( dalam Bahasa Belanda ) | 1 | 2 (100.00%)
bahasa Hakka | 1 | 2 (100.00%)
bahasa Indonesia | 18 | 28 (3.75%)
bahasa Indonesia Peranakan | 11 | 4 (1.85%)
bahasa Inggris | 2 | 2 (87.50%)
bahasa Jawa | 3 | 2 (70.59%)
bahasa Jawa Kuna | 1 | 4 (100.00%)
bahasa Jepang | 1 | 2 (100.00%)
bahasa Jepang Kuno | 1 | 2 (100.00%)
bahasa Jepang lama | 1 | 2 (100.00%)
bahasa Kanakanabu | 1 | 2 (100.00%)
bahasa Kangean | 1 | 8 (100.00%)
bahasa Korea | 1 | 4 (100.00%)
bahasa Kristang | 1 | 2 (100.00%)
bahasa Lampung Api | 1 | 6 (100.00%)
bahasa Madura | 1 | 2 (100.00%)
bahasa Makassar | 1 | 4 (100.00%)
bahasa Melayu | 4 | 4 (31.58%)
bahasa Melayu Pontianak | 1 | 2 (100.00%)
bahasa Melayu Tengah | 1 | 4 (100.00%)
bahasa Minangkabau | 2 | 2 (99.52%)
bahasa Musi | 1 | 2 (100.00%)
bahasa Nias | 1 | 2 (100.00%)
bahasa Okinawa | 1 | 2 (100.00%)
bahasa Paiwan | 1 | 2 (100.00%)
bahasa Palembang | 1 | 2 (100.00%)
bahasa Rukai | 1 | 2 (100.00%)
bahasa Rusia | 1 | 2 (100.00%)
bahasa Sunda | 2 | 4 (96.00%)
bahasa Sunda kuno | 1 | 2 (100.00%)
bahasa Tamiang | 2 | 2 (50.00%)
bahasa Tetun | 1 | 2 (100.00%)
bahasa Tionghoa | 1 | 2 (100.00%)
bahasa Urak Lawoi' | 1 | 2 (100.00%)
bahasa Vietnam | 1 | 4 (100.00%)
bahasa Zazaki | 1 | 2 (100.00%)
bahasa indonesia | 1 | 0 (0.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2025-04-18 from the idwiktionary dump dated 2025-04-03 using wiktextract (ada610d and ea19a0a). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.