Wiktionary data extraction errors and warnings

hi/Chinese/intj

hi (Chinese intj) hi/Chinese/intj: invalid uppercase tag Hong-Kong not in or uppercase_tags: {"categories": ["Chinese doublets", "Chinese entries with incorrect language header", "Chinese greetings", "Chinese interjections", "Chinese lemmas", "Chinese terms with IPA pronunciation", "Chinese terms written in foreign scripts", "Chinese verbs", "Pages with 35 entries", "Pages with entries"], "etymology_number": 1, "etymology_templates": [{"args": {"1": "yue", "2": "en", "3": "hi"}, "expansion": "English hi", "name": "bor"}, {"args": {"1": "zh", "2": "嗨"}, "expansion": "Doublet of 嗨 (hāi)", "name": "doublet"}], "etymology_text": "From English hi. Doublet of 嗨 (hāi).", "head_templates": [{"args": {"1": "zh", "2": "interjection"}, "expansion": "hi", "name": "head"}], "lang": "Chinese", "lang_code": "zh", "pos": "intj", "senses": [{"categories": ["Hong Kong Cantonese"], "glosses": ["hi (interjection)"], "links": [["hi", "#English"]], "raw_glosses": ["(Hong Kong Cantonese) hi (interjection)"], "tags": ["Cantonese", "Hong-Kong"]}], "sounds": [{"tags": ["Cantonese", "Jyutping"], "zh-pron": "haai¹"}, {"tags": ["Cantonese", "Yale"], "zh-pron": "hāai"}, {"tags": ["Cantonese", "Pinyin"], "zh-pron": "haai¹"}, {"tags": ["Cantonese", "Guangdong-Romanization"], "zh-pron": "hai¹"}, {"ipa": "/haːi̯⁵⁵/", "tags": ["Cantonese", "Sinological-IPA"]}, {"ipa": "/haːi̯⁵⁵/"}], "word": "hi"}
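The message above reports that the capitalized tag Hong-Kong on the entry's sense is not among the uppercase tags the extractor recognizes. A minimal sketch of that kind of check follows. It is not wiktextract's actual implementation: the function name find_unknown_uppercase_tags and the small KNOWN_UPPERCASE_TAGS set are hypothetical, and the entry is a trimmed copy of the JSON shown above.

```python
# Hypothetical allow-list for illustration only; the set of uppercase
# tags that wiktextract really accepts is assumed to be much larger.
KNOWN_UPPERCASE_TAGS = {
    "Cantonese",
    "Jyutping",
    "Yale",
    "Pinyin",
    "Guangdong-Romanization",
    "Sinological-IPA",
}


def find_unknown_uppercase_tags(entry, known=KNOWN_UPPERCASE_TAGS):
    """Return capitalized tags in senses/sounds that are not in `known`."""
    unknown = []
    for field in ("senses", "sounds"):
        for item in entry.get(field, []):
            for tag in item.get("tags", []):
                # Only tags starting with an uppercase letter are checked.
                if tag[:1].isupper() and tag not in known and tag not in unknown:
                    unknown.append(tag)
    return unknown


# Trimmed copy of the entry from the debug message above.
entry = {
    "word": "hi",
    "lang": "Chinese",
    "pos": "intj",
    "senses": [{"glosses": ["hi (interjection)"], "tags": ["Cantonese", "Hong-Kong"]}],
    "sounds": [
        {"tags": ["Cantonese", "Jyutping"], "zh-pron": "haai¹"},
        {"ipa": "/haːi̯⁵⁵/"},
    ],
}

for tag in find_unknown_uppercase_tags(entry):
    print(f"{entry['word']}/{entry['lang']}/{entry['pos']}: unknown uppercase tag {tag}")
```

Under these assumptions the only tag flagged is Hong-Kong; adding it to the allow-list would make the check pass for this entry.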


This page is a part of the kaikki.org machine-readable dictionary. This dictionary is based on structured data extracted on 2025-03-13 from the enwiktionary dump dated 2025-03-02 using wiktextract (f074e77 and 633533e). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.