Wiktionary data extraction errors and warnings

Inflection check

This page lists the different kinds of inflection tables found for each language. When wiktextract parses word heads and tables, it assigns tags describing grammatical or contextual information to the forms it encounters. The forms and tags found in each head section and table are first kept separate from those of other head sections and tables. Later they are merged into table types, each of which groups the heads and tables that contain the same number of word forms with the same tags for those forms.
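The grouping into table types can be pictured with a small sketch. This is not wiktextract's actual implementation; the function names, the `(word, forms)` input shape, and the sample words and tags are assumptions made for illustration. Each table is reduced to a signature built from the tag sets of its forms, and tables with identical signatures fall into the same type.

```python
from collections import defaultdict


def table_signature(forms):
    """Return a hashable signature for one table: the sorted tag sets of its forms.

    Two tables share a signature only if they have the same number of forms
    carrying the same tags. (Hypothetical helper, not part of wiktextract.)
    """
    return tuple(sorted(tuple(sorted(f["tags"])) for f in forms))


def group_into_table_types(tables):
    """Group (word, forms) pairs by signature; each group is one table type."""
    types = defaultdict(list)
    for word, forms in tables:
        types[table_signature(forms)].append(word)
    return dict(types)


# Illustrative input only; the tags are invented for this sketch.
tables = [
    ("makan", [{"form": "memakan", "tags": ["active"]},
               {"form": "dimakan", "tags": ["passive"]}]),
    ("baca", [{"form": "membaca", "tags": ["active"]},
              {"form": "dibaca", "tags": ["passive"]}]),
    ("rumah", [{"form": "rumah-rumah", "tags": ["plural"]}]),
]

table_types = group_into_table_types(tables)
for signature, words in table_types.items():
    print(len(words), "word(s) share table type", signature)
```

Under these assumptions the first two words end up in one table type and the third in another, which corresponds to a "Table forms" count of 2 for that hypothetical language.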

The information presented here is mostly for debugging, but it can also be used to find interesting word paradigms and to hunt down mistakes, typos, and badly formatted Wiktionary entries. A table type that has only a few unique instances is quite likely to contain some kind of minor error in the original data.
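One way to act on this heuristic is to count how often each table type occurs and list the ones seen only a handful of times. The sketch below assumes each table has already been reduced to a signature string; the function name, the `max_instances` threshold, and the sample signatures are hypothetical and not taken from wiktextract.

```python
from collections import Counter


def rare_table_types(signatures, max_instances=2):
    """Return the table-type signatures that occur at most max_instances times.

    Rare types are candidates for manual review, since they often stem from
    a typo or a badly formatted entry. (Hypothetical helper for illustration.)
    """
    counts = Counter(signatures)
    return sorted(sig for sig, n in counts.items() if n <= max_instances)


# Illustrative data: one signature is a misspelled variant of a common one.
signatures = (
    ["active|passive"] * 5
    + ["plural"] * 3
    + ["active|pasive"]
)

suspects = rare_table_types(signatures, max_instances=1)
print(suspects)
```

A low threshold keeps the review list short, but a rare type is only a hint: some languages genuinely have paradigms that appear in very few entries.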

Language  Table forms  Errors (% affected words)
:Templat:Lampung Api 1 2 (100.00%)
:Templat:batak toba 1 2 (100.00%)
:Templat:dusun balangan 1 2 (100.00%)
:Templat:jawa ngapak 1 2 (100.00%)
:Templat:jawa ngoko 1 2 (100.00%)
:Templat:maanyan siong 1 2 (100.00%)
:Templat:palembang ogan 1 2 (100.00%)
:Templat:samihim 1 2 (100.00%)
:Templat:semendo 1 2 (100.00%)
Arti 1 4 (100.00%)
Bahasa Indonesia 1 0 (0.00%)
Bahasa Jawa 3 2 (80.00%)
Bahasa Jepang 1 0 (0.00%)
Bahasa Vietnam 1 26 (100.00%)
GHWOSMXbahasa Indonesia 1 0 (0.00%)
Lintas bahasa 1 0 (0.00%)
bahasa Aceh 2 2 (66.67%)
bahasa Afrikaans 1 0 (0.00%)
bahasa Amis 1 0 (0.00%)
bahasa Arab 1 2 (100.00%)
bahasa Armenia 1 0 (0.00%)
bahasa Badui 1 2 (100.00%)
bahasa Bahnar 1 2 (100.00%)
bahasa Bali 1 2 (100.00%)
bahasa Banjar 3 2 (25.00%)
bahasa Batak Simalungun 3 4 (40.00%)
bahasa Batak Toba 2 2 (60.00%)
bahasa Belanda 3 4 (66.67%)
bahasa Berawan 1 2 (100.00%)
bahasa Betawi 1 4 (100.00%)
bahasa Bugis 1 2 (100.00%)
bahasa Bunak 1 4 (100.00%)
bahasa Bunun 1 0 (0.00%)
bahasa Cham Timur 2 10 (83.33%)
bahasa Esperanto 1 2 (100.00%)
bahasa Galisia 1 0 (0.00%)
bahasa Gorontalo 1 2 (100.00%)
bahasa Gorontalo ( dalam Bahasa Belanda ) 1 2 (100.00%)
bahasa Hakka 1 6 (100.00%)
bahasa Hakka Hoiliukfung 1 4 (100.00%)
bahasa Indonesia 17 20 (3.90%)
bahasa Indonesia Peranakan 11 4 (1.79%)
bahasa Inggris 2 2 (88.89%)
bahasa Italia 1 0 (0.00%)
bahasa Jawa 4 2 (66.67%)
bahasa Jawa kuno 2 4 (40.00%)
bahasa Jepang 3 20 (94.34%)
bahasa Jepang Kuno 1 2 (100.00%)
bahasa Jepang lama 1 2 (100.00%)
bahasa Kanakanabu 2 2 (80.00%)
bahasa Kangean 1 0 (0.00%)
bahasa Kavalan 1 0 (0.00%)
bahasa Kendayan 1 2 (100.00%)
bahasa Kerinci 1 0 (0.00%)
bahasa Kimaragang 1 2 (100.00%)
bahasa Komering 1 2 (100.00%)
bahasa Korea 1 4 (100.00%)
bahasa Kristang 1 2 (100.00%)
bahasa Lampung Api 1 4 (100.00%)
bahasa Madura 1 2 (100.00%)
bahasa Makassar 1 2 (100.00%)
bahasa Mandailing 1 2 (100.00%)
bahasa Melayu 5 2 (83.91%)
bahasa Melayu Pontianak 1 2 (100.00%)
bahasa Melayu Tengah 1 4 (100.00%)
bahasa Minangkabau 2 2 (99.54%)
bahasa Musi 1 4 (100.00%)
bahasa Nias 1 2 (100.00%)
bahasa Okinawa 1 2 (100.00%)
bahasa Paiwan 1 2 (100.00%)
bahasa Palembang 1 2 (100.00%)
bahasa Polandia 1 0 (0.00%)
bahasa Portugis 1 0 (0.00%)
bahasa Prancis 1 0 (0.00%)
bahasa Rukai 1 2 (100.00%)
bahasa Rusia 1 2 (100.00%)
bahasa Sanskerta 1 4 (100.00%)
bahasa Sunda 2 8 (96.00%)
bahasa Sunda kuno 1 2 (100.00%)
bahasa Tamiang 2 2 (50.00%)
bahasa Tetun 2 2 (50.00%)
bahasa Tionghoa 2 2 (90.00%)
bahasa Tsou 1 0 (0.00%)
bahasa Turkmen 1 0 (0.00%)
bahasa Urak Lawoi' 1 2 (100.00%)
bahasa Vietnam 2 4 (90.77%)
bahasa Yami 1 0 (0.00%)
bahasa indonesia 1 0 (0.00%)

This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2025-11-06 from the idwiktionary dump dated 2025-11-02 using wiktextract (1977306 and 928f69b). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.