"tokenizer" meaning in All languages combined

See tokenizer on Wiktionary

Noun [English]

  1. Tags: no-gloss
    Sense id: ru-tokenizer-en-noun-47DEQpj8
{
  "categories": [
    {
      "kind": "other",
      "name": "Английские существительные",
      "parents": [],
      "source": "w"
    },
    {
      "kind": "other",
      "name": "Английский язык",
      "parents": [],
      "source": "w"
    },
    {
      "kind": "other",
      "name": "Слова из 9 букв/en",
      "parents": [],
      "source": "w"
    },
    {
      "kind": "other",
      "name": "Требуется категоризация/en",
      "parents": [],
      "source": "w"
    }
  ],
  "hyphenations": [
    {
      "parts": [
        "tokenizer"
      ]
    }
  ],
  "lang": "Английский",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "id": "ru-tokenizer-en-noun-47DEQpj8",
      "tags": [
        "no-gloss"
      ]
    }
  ],
  "word": "tokenizer"
}
{
  "categories": [
    "Английские существительные",
    "Английский язык",
    "Слова из 9 букв/en",
    "Требуется категоризация/en"
  ],
  "hyphenations": [
    {
      "parts": [
        "tokenizer"
      ]
    }
  ],
  "lang": "Английский",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "tags": [
        "no-gloss"
      ]
    }
  ],
  "word": "tokenizer"
}
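
The two records above differ in how they store categories: the first uses objects with a "name" key, the second uses plain strings. As a minimal sketch of how such a record might be read from a JSONL line, the following Python handles both shapes. The helper names `category_names` and `is_gloss_free` are hypothetical and are not part of wiktextract or kaikki.org, and the sample line is an abbreviated copy of the second record.

```python
import json

# Abbreviated copy of the second record above, as one JSONL line.
RAW_LINE = (
    '{"categories": ["Английские существительные", "Английский язык"], '
    '"lang_code": "en", "pos": "noun", '
    '"senses": [{"tags": ["no-gloss"]}], "word": "tokenizer"}'
)


def category_names(entry):
    """Return category names for either shape shown above.

    Hypothetical helper: accepts plain strings or objects with a "name" key.
    """
    names = []
    for cat in entry.get("categories", []):
        names.append(cat["name"] if isinstance(cat, dict) else cat)
    return names


def is_gloss_free(entry):
    """Return True if the entry has senses and every one is tagged "no-gloss".

    Hypothetical helper: relies only on the "senses" and "tags" keys seen above.
    """
    senses = entry.get("senses", [])
    return bool(senses) and all("no-gloss" in s.get("tags", []) for s in senses)


entry = json.loads(RAW_LINE)
print(entry["word"], entry["pos"], category_names(entry), is_gloss_free(entry))
```

For this entry the check reports that no sense has a definition, which matches the "no-gloss" tag shown on the page.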


This page is a part of the kaikki.org machine-readable All languages combined dictionary. This dictionary is based on structured data extracted on 2025-10-25 from the ruwiktionary dump dated 2025-10-20 using wiktextract (bd88cf0 and 0a198a9). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.