"bag of words" meaning in English

Noun

Forms: bags of words [plural]
Head templates: {{en-noun|bags of words|head=bag of words}} bag of words (plural bags of words)
  1. (computational linguistics) The collection of words from an unprocessed text without regard to grammar; a collection of word unigrams.
     Wikipedia link: bag-of-words model
     Categories (topical): Computational linguistics
     Synonyms: BOW
     Translations (collection of words): conjunto de palavras (Portuguese)
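
To illustrate the sense above, here is a minimal sketch of building a bag of words as a multiset of word unigrams. The function name `bag_of_words`, the lowercasing step, and the simple regular-expression tokenizer are assumptions made for this example; they are not part of the dictionary entry, and real tokenizers differ.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count word unigrams in `text`, discarding order and grammar.

    Assumption for illustration: a "word" is a run of ASCII letters or
    apostrophes, and case is folded so "The" and "the" count together.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


bag = bag_of_words("The cat saw the dog. The dog saw the cat!")
print(bag.most_common())

# Word order is not kept, so these two sentences give the same bag.
same = bag_of_words("dog bites man") == bag_of_words("man bites dog")
print(same)
```

Because only counts are kept, two texts with the same words in a different order are indistinguishable, which is the "without regard to grammar" part of the definition.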

JSON data for bag of words meaning in English (1.9kB)

{
  "forms": [
    {
      "form": "bags of words",
      "tags": [
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {
        "1": "bags of words",
        "head": "bag of words"
      },
      "expansion": "bag of words (plural bags of words)",
      "name": "en-noun"
    }
  ],
  "lang": "English",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        {
          "kind": "other",
          "name": "English entries with incorrect language header",
          "parents": [
            "Entries with incorrect language header",
            "Entry maintenance"
          ],
          "source": "w"
        },
        {
          "kind": "topical",
          "langcode": "en",
          "name": "Computational linguistics",
          "orig": "en:Computational linguistics",
          "parents": [
            "Computer science",
            "Linguistics",
            "Computing",
            "Sciences",
            "Language",
            "Social sciences",
            "Technology",
            "All topics",
            "Communication",
            "Society",
            "Fundamental"
          ],
          "source": "w"
        }
      ],
      "examples": [
        {
          "ref": "2023, John Paul Mueller, Luca Massaron, Python for Data Science For Dummies, John Wiley & Sons, page 132",
          "text": "You can then use the bag of words to train classifiers, a special kind of algorithm used to break words down into categories.",
          "type": "quotation"
        }
      ],
      "glosses": [
        "The collection of words from an unprocessed text without regard to grammar; a collection of word unigrams."
      ],
      "id": "en-bag_of_words-en-noun-1tXne3JR",
      "links": [
        [
          "computational linguistics",
          "computational linguistics"
        ],
        [
          "collection",
          "collection"
        ],
        [
          "word",
          "word"
        ],
        [
          "unprocessed",
          "unprocessed"
        ],
        [
          "text",
          "text"
        ],
        [
          "grammar",
          "grammar"
        ],
        [
          "unigram",
          "unigram"
        ]
      ],
      "raw_glosses": [
        "(computational linguistics) The collection of words from an unprocessed text without regard to grammar; a collection of word unigrams."
      ],
      "synonyms": [
        {
          "word": "BOW"
        }
      ],
      "topics": [
        "computational",
        "computing",
        "engineering",
        "human-sciences",
        "linguistics",
        "mathematics",
        "natural-sciences",
        "physical-sciences",
        "sciences"
      ],
      "translations": [
        {
          "code": "pt",
          "lang": "Portuguese",
          "sense": "collection of words",
          "word": "conjunto de palavras"
        }
      ],
      "wikipedia": [
        "bag-of-words model"
      ]
    }
  ],
  "word": "bag of words"
}
{
  "forms": [
    {
      "form": "bags of words",
      "tags": [
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {
        "1": "bags of words",
        "head": "bag of words"
      },
      "expansion": "bag of words (plural bags of words)",
      "name": "en-noun"
    }
  ],
  "lang": "English",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        "English countable nouns",
        "English entries with incorrect language header",
        "English lemmas",
        "English multiword terms",
        "English nouns",
        "English terms with quotations",
        "en:Computational linguistics"
      ],
      "examples": [
        {
          "ref": "2023, John Paul Mueller, Luca Massaron, Python for Data Science For Dummies, John Wiley & Sons, page 132",
          "text": "You can then use the bag of words to train classifiers, a special kind of algorithm used to break words down into categories.",
          "type": "quotation"
        }
      ],
      "glosses": [
        "The collection of words from an unprocessed text without regard to grammar; a collection of word unigrams."
      ],
      "links": [
        [
          "computational linguistics",
          "computational linguistics"
        ],
        [
          "collection",
          "collection"
        ],
        [
          "word",
          "word"
        ],
        [
          "unprocessed",
          "unprocessed"
        ],
        [
          "text",
          "text"
        ],
        [
          "grammar",
          "grammar"
        ],
        [
          "unigram",
          "unigram"
        ]
      ],
      "raw_glosses": [
        "(computational linguistics) The collection of words from an unprocessed text without regard to grammar; a collection of word unigrams."
      ],
      "synonyms": [
        {
          "word": "BOW"
        }
      ],
      "topics": [
        "computational",
        "computing",
        "engineering",
        "human-sciences",
        "linguistics",
        "mathematics",
        "natural-sciences",
        "physical-sciences",
        "sciences"
      ],
      "wikipedia": [
        "bag-of-words model"
      ]
    }
  ],
  "translations": [
    {
      "code": "pt",
      "lang": "Portuguese",
      "sense": "collection of words",
      "word": "conjunto de palavras"
    }
  ],
  "word": "bag of words"
}
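
As a rough sketch of how an entry shaped like the ones above could be read, the snippet below pulls out the word, part of speech, plural forms, glosses, and synonyms. The embedded `entry_json` string is an abbreviated copy of the data shown on this page, and the helper name `summarize` is hypothetical; the field names (`word`, `pos`, `forms`, `senses`, `glosses`, `synonyms`) are taken from the JSON above.

```python
import json

# Abbreviated copy of the entry shown above; most fields are omitted.
entry_json = """
{
  "word": "bag of words",
  "pos": "noun",
  "forms": [{"form": "bags of words", "tags": ["plural"]}],
  "senses": [
    {
      "glosses": [
        "The collection of words from an unprocessed text without regard to grammar; a collection of word unigrams."
      ],
      "synonyms": [{"word": "BOW"}]
    }
  ]
}
"""


def summarize(entry: dict) -> dict:
    """Collect a few commonly used fields from one entry (hypothetical helper)."""
    senses = entry.get("senses", [])
    return {
        "word": entry.get("word"),
        "pos": entry.get("pos"),
        # Keep only forms tagged as plural.
        "plurals": [
            f["form"] for f in entry.get("forms", []) if "plural" in f.get("tags", [])
        ],
        "glosses": [g for s in senses for g in s.get("glosses", [])],
        "synonyms": [syn["word"] for s in senses for syn in s.get("synonyms", [])],
    }


summary = summarize(json.loads(entry_json))
print(summary["word"], summary["pos"], summary["plurals"], summary["synonyms"])
```

The `.get(..., [])` defaults are a defensive choice for this sketch, since not every entry necessarily carries every field.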

This page is a part of the kaikki.org machine-readable English dictionary. This dictionary is based on structured data extracted on 2024-04-30 from the enwiktionary dump dated 2024-04-21 using wiktextract (210104c and c9440ce). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.