"toŋuz" meaning in Proto-Turkic

Noun

Etymology: Possibly from earlier *tonkuz, a derivation of *tonk- (of unclear meaning) with a suffix. If the word was present in Proto-Bulgaric (Oghuric), the form *toŋuŕ could be reconstructed. However, no form that can be traced back to Proto-Bulgaric (via cognates in e.g. Chuvash or Hungarian) is attested. Vovin (2011:260-263) speculates on a link among Common Turkic *toŋuz, Old Chinese 豚 (OC *duːn, “piglet”), and Middle Korean 돝 (twòth, “pig”).

Etymology templates: {{cog|och|-}} Old Chinese, {{och-l|豚|piglet}} 豚 (OC *duːn, “piglet”), {{cog|okm|돝|t=pig|tr=twòth}} Middle Korean 돝 (twòth, “pig”)

Head templates: {{head|trk-pro|noun}} *toŋuz

Forms: toŋuz [nominative, singular], toŋuzlar [nominative, plural], toŋuznïŋ [genitive, singular], toŋuzlarnïŋ [genitive, plural], toŋuzka [dative, singular], toŋuzlarka [dative, plural], toŋuzda [locative, singular], toŋuzlarda [locative, plural], toŋuzdan [ablative, singular], toŋuzlardan [ablative, plural], toŋuzlarïn [instrumental, plural], toŋuzča [equative, singular], toŋuzlarča [equative, plural]

  1. pig (Common Turkic) Tags: reconstruction
{
  "descendants": [
    {
      "descendants": [
        {
          "lang": "Khalaj",
          "lang_code": "klj",
          "word": "tongquz"
        }
      ],
      "lang": "Arghu",
      "lang_code": "klj"
    },
    {
      "descendants": [
        {
          "descendants": [
            {
              "lang": "Azerbaijani",
              "lang_code": "az",
              "word": "donuz"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "doñuz",
              "word": "طوڭوز"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "doñuz",
              "word": "طوكز"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "domuz",
              "word": "طوموز"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "donguz",
              "word": "տօնկուզ"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "raw_tags": [
                "Armeno-Turkish"
              ],
              "roman": "domuz",
              "word": "տօմուզ"
            }
          ],
          "lang": "Old Anatolian Turkish",
          "lang_code": "trk-oat"
        },
        {
          "lang": "Salar",
          "lang_code": "slr",
          "word": "doñıs"
        },
        {
          "lang": "Turkmen",
          "lang_code": "tk",
          "word": "doňuz"
        },
        {
          "lang": "Pecheneg",
          "lang_code": "xpc",
          "raw_tags": [
            "reshaped by analogy or addition of morphemes"
          ],
          "word": "Tonuzaba"
        }
      ],
      "lang": "Oghuz",
      "lang_code": "unknown"
    },
    {
      "descendants": [
        {
          "descendants": [
            {
              "descendants": [
                {
                  "lang": "Uzbek",
                  "lang_code": "uz",
                  "word": "toʻngʻiz"
                },
                {
                  "lang": "Uyghur",
                  "lang_code": "ug",
                  "roman": "tongguz",
                  "word": "توڭگۇز"
                },
                {
                  "lang": "Uyghur",
                  "lang_code": "ug",
                  "roman": "toqguz",
                  "word": "توقگۇز"
                }
              ],
              "lang": "Chagatai",
              "lang_code": "chg",
              "roman": "toŋuz",
              "word": "توڭوُز"
            }
          ],
          "lang": "Karakhanid",
          "lang_code": "xqa",
          "roman": "toŋuz",
          "word": "توڭُوز"
        }
      ],
      "lang": "Karluk",
      "lang_code": "unknown"
    },
    {
      "descendants": [
        {
          "descendants": [
            {
              "lang": "Bashkir",
              "lang_code": "ba",
              "roman": "duñğıź",
              "word": "дуңғыҙ"
            },
            {
              "lang": "Tatar",
              "lang_code": "tt",
              "roman": "duñgız",
              "word": "дуңгыз"
            }
          ],
          "lang": "North Kipchak",
          "lang_code": "unknown"
        },
        {
          "descendants": [
            {
              "lang": "Crimean Tatar",
              "lang_code": "crh",
              "word": "domuz"
            },
            {
              "lang": "Karachay-Balkar",
              "lang_code": "krc",
              "roman": "toñuz",
              "word": "тонгуз"
            },
            {
              "lang": "Karaim",
              "lang_code": "kdr",
              "word": "домуз"
            },
            {
              "lang": "Karaim",
              "lang_code": "kdr",
              "word": "тонгъуз"
            },
            {
              "lang": "Karaim",
              "lang_code": "kdr",
              "word": "tonguz"
            },
            {
              "lang": "Kumyk",
              "lang_code": "kum",
              "roman": "doñuz",
              "word": "донгуз"
            }
          ],
          "lang": "West Kipchak",
          "lang_code": "unknown"
        },
        {
          "descendants": [
            {
              "lang": "Karakalpak",
              "lang_code": "kaa",
              "word": "доңыз"
            },
            {
              "lang": "Kazakh",
              "lang_code": "kk",
              "roman": "doñyz",
              "word": "доңыз"
            },
            {
              "lang": "Nogai",
              "lang_code": "nog",
              "roman": "doñız",
              "word": "донъыз"
            }
          ],
          "lang": "South Kipchak",
          "lang_code": "unknown"
        },
        {
          "descendants": [
            {
              "lang": "Kyrgyz",
              "lang_code": "ky",
              "roman": "doŋuz",
              "word": "доңуз"
            },
            {
              "lang": "Southern Altai",
              "lang_code": "alt",
              "roman": "tonus",
              "word": "тонус"
            },
            {
              "lang": "Southern Altai",
              "lang_code": "alt",
              "roman": "toŋus",
              "word": "тоҥус"
            },
            {
              "lang": "Southern Altai",
              "lang_code": "alt",
              "roman": "toŋïs",
              "word": "тоҥыс"
            }
          ],
          "lang": "East Kipchak",
          "lang_code": "unknown"
        }
      ],
      "lang": "Kipchak",
      "lang_code": "qwm"
    },
    {
      "descendants": [
        {
          "lang": "Old Turkic",
          "lang_code": "otk",
          "roman": "toŋuz",
          "word": "𐰑𐰭𐰔"
        },
        {
          "lang": "Old Turkic",
          "lang_code": "otk",
          "roman": "t¹uŋuz",
          "word": "𐱃𐰆𐰭𐰆𐰕"
        },
        {
          "lang": "Old Uyghur",
          "lang_code": "oui",
          "sense": "pig",
          "word": "toŋuz"
        }
      ],
      "lang": "Siberian",
      "lang_code": "unknown"
    }
  ],
  "etymology_templates": [
    {
      "args": {
        "1": "och",
        "2": "-"
      },
      "expansion": "Old Chinese",
      "name": "cog"
    },
    {
      "args": {
        "1": "豚",
        "2": "piglet"
      },
      "expansion": "豚 (OC *duːn, “piglet”)",
      "name": "och-l"
    },
    {
      "args": {
        "1": "okm",
        "2": "돝",
        "t": "pig",
        "tr": "twòth"
      },
      "expansion": "Middle Korean 돝 (twòth, “pig”)",
      "name": "cog"
    }
  ],
  "etymology_text": "Possibly from earlier *tonkuz, a derivation of *tonk- with unclear meaning and a suffix\nIf the word was present in Proto-Bulgaric (Oghuric), the form *toŋuŕ could be reconstructed. However, no form that can be traced back to Proto-Bulgaric (via cognates in e.g. Chuvash or Hungarian) is attested.\nVovin (2011:260-263) speculates on a link among Common Turkic *toŋuz, Old Chinese 豚 (OC *duːn, “piglet”), and Middle Korean 돝 (twòth, “pig”).",
  "forms": [
    {
      "form": "no-table-tags",
      "source": "declension",
      "tags": [
        "table-tags"
      ]
    },
    {
      "form": "toŋuz",
      "source": "declension",
      "tags": [
        "nominative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlar",
      "source": "declension",
      "tags": [
        "nominative",
        "plural"
      ]
    },
    {
      "form": "toŋuznïŋ",
      "source": "declension",
      "tags": [
        "genitive",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarnïŋ",
      "source": "declension",
      "tags": [
        "genitive",
        "plural"
      ]
    },
    {
      "form": "toŋuzka",
      "source": "declension",
      "tags": [
        "dative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarka",
      "source": "declension",
      "tags": [
        "dative",
        "plural"
      ]
    },
    {
      "form": "toŋuzda",
      "source": "declension",
      "tags": [
        "locative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarda",
      "source": "declension",
      "tags": [
        "locative",
        "plural"
      ]
    },
    {
      "form": "toŋuzdan",
      "source": "declension",
      "tags": [
        "ablative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlardan",
      "source": "declension",
      "tags": [
        "ablative",
        "plural"
      ]
    },
    {
      "form": "toŋuzlarïn",
      "source": "declension",
      "tags": [
        "instrumental",
        "plural"
      ]
    },
    {
      "form": "toŋuzča",
      "source": "declension",
      "tags": [
        "equative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarča",
      "source": "declension",
      "tags": [
        "equative",
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {
        "1": "trk-pro",
        "2": "noun"
      },
      "expansion": "*toŋuz",
      "name": "head"
    }
  ],
  "lang": "Proto-Turkic",
  "lang_code": "trk-pro",
  "original_title": "Reconstruction:Proto-Turkic/toŋuz",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        {
          "kind": "other",
          "name": "Middle Korean terms with non-redundant manual transliterations",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Old Turkic terms with non-redundant manual transliterations",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Pages with 1 entry",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Pages with entries",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Pecheneg terms in nonstandard scripts",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Proto-Common Turkic",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Proto-Turkic entries with incorrect language header",
          "parents": [],
          "source": "w"
        }
      ],
      "glosses": [
        "pig (Common Turkic)"
      ],
      "id": "en-toŋuz-trk-pro-noun-AX0Cd-Gk",
      "links": [
        [
          "pig",
          "pig"
        ]
      ],
      "tags": [
        "reconstruction"
      ]
    }
  ],
  "word": "toŋuz"
}
{
  "descendants": [
    {
      "descendants": [
        {
          "lang": "Khalaj",
          "lang_code": "klj",
          "word": "tongquz"
        }
      ],
      "lang": "Arghu",
      "lang_code": "klj"
    },
    {
      "descendants": [
        {
          "descendants": [
            {
              "lang": "Azerbaijani",
              "lang_code": "az",
              "word": "donuz"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "doñuz",
              "word": "طوڭوز"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "doñuz",
              "word": "طوكز"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "domuz",
              "word": "طوموز"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "roman": "donguz",
              "word": "տօնկուզ"
            },
            {
              "descendants": [
                {
                  "lang": "Turkish",
                  "lang_code": "tr",
                  "word": "domuz"
                },
                {
                  "lang": "Romanian",
                  "lang_code": "ro",
                  "raw_tags": [
                    "borrowed"
                  ],
                  "word": "domuz"
                }
              ],
              "lang": "Ottoman Turkish",
              "lang_code": "ota",
              "raw_tags": [
                "Armeno-Turkish"
              ],
              "roman": "domuz",
              "word": "տօմուզ"
            }
          ],
          "lang": "Old Anatolian Turkish",
          "lang_code": "trk-oat"
        },
        {
          "lang": "Salar",
          "lang_code": "slr",
          "word": "doñıs"
        },
        {
          "lang": "Turkmen",
          "lang_code": "tk",
          "word": "doňuz"
        },
        {
          "lang": "Pecheneg",
          "lang_code": "xpc",
          "raw_tags": [
            "reshaped by analogy or addition of morphemes"
          ],
          "word": "Tonuzaba"
        }
      ],
      "lang": "Oghuz",
      "lang_code": "unknown"
    },
    {
      "descendants": [
        {
          "descendants": [
            {
              "descendants": [
                {
                  "lang": "Uzbek",
                  "lang_code": "uz",
                  "word": "toʻngʻiz"
                },
                {
                  "lang": "Uyghur",
                  "lang_code": "ug",
                  "roman": "tongguz",
                  "word": "توڭگۇز"
                },
                {
                  "lang": "Uyghur",
                  "lang_code": "ug",
                  "roman": "toqguz",
                  "word": "توقگۇز"
                }
              ],
              "lang": "Chagatai",
              "lang_code": "chg",
              "roman": "toŋuz",
              "word": "توڭوُز"
            }
          ],
          "lang": "Karakhanid",
          "lang_code": "xqa",
          "roman": "toŋuz",
          "word": "توڭُوز"
        }
      ],
      "lang": "Karluk",
      "lang_code": "unknown"
    },
    {
      "descendants": [
        {
          "descendants": [
            {
              "lang": "Bashkir",
              "lang_code": "ba",
              "roman": "duñğıź",
              "word": "дуңғыҙ"
            },
            {
              "lang": "Tatar",
              "lang_code": "tt",
              "roman": "duñgız",
              "word": "дуңгыз"
            }
          ],
          "lang": "North Kipchak",
          "lang_code": "unknown"
        },
        {
          "descendants": [
            {
              "lang": "Crimean Tatar",
              "lang_code": "crh",
              "word": "domuz"
            },
            {
              "lang": "Karachay-Balkar",
              "lang_code": "krc",
              "roman": "toñuz",
              "word": "тонгуз"
            },
            {
              "lang": "Karaim",
              "lang_code": "kdr",
              "word": "домуз"
            },
            {
              "lang": "Karaim",
              "lang_code": "kdr",
              "word": "тонгъуз"
            },
            {
              "lang": "Karaim",
              "lang_code": "kdr",
              "word": "tonguz"
            },
            {
              "lang": "Kumyk",
              "lang_code": "kum",
              "roman": "doñuz",
              "word": "донгуз"
            }
          ],
          "lang": "West Kipchak",
          "lang_code": "unknown"
        },
        {
          "descendants": [
            {
              "lang": "Karakalpak",
              "lang_code": "kaa",
              "word": "доңыз"
            },
            {
              "lang": "Kazakh",
              "lang_code": "kk",
              "roman": "doñyz",
              "word": "доңыз"
            },
            {
              "lang": "Nogai",
              "lang_code": "nog",
              "roman": "doñız",
              "word": "донъыз"
            }
          ],
          "lang": "South Kipchak",
          "lang_code": "unknown"
        },
        {
          "descendants": [
            {
              "lang": "Kyrgyz",
              "lang_code": "ky",
              "roman": "doŋuz",
              "word": "доңуз"
            },
            {
              "lang": "Southern Altai",
              "lang_code": "alt",
              "roman": "tonus",
              "word": "тонус"
            },
            {
              "lang": "Southern Altai",
              "lang_code": "alt",
              "roman": "toŋus",
              "word": "тоҥус"
            },
            {
              "lang": "Southern Altai",
              "lang_code": "alt",
              "roman": "toŋïs",
              "word": "тоҥыс"
            }
          ],
          "lang": "East Kipchak",
          "lang_code": "unknown"
        }
      ],
      "lang": "Kipchak",
      "lang_code": "qwm"
    },
    {
      "descendants": [
        {
          "lang": "Old Turkic",
          "lang_code": "otk",
          "roman": "toŋuz",
          "word": "𐰑𐰭𐰔"
        },
        {
          "lang": "Old Turkic",
          "lang_code": "otk",
          "roman": "t¹uŋuz",
          "word": "𐱃𐰆𐰭𐰆𐰕"
        },
        {
          "lang": "Old Uyghur",
          "lang_code": "oui",
          "sense": "pig",
          "word": "toŋuz"
        }
      ],
      "lang": "Siberian",
      "lang_code": "unknown"
    }
  ],
  "etymology_templates": [
    {
      "args": {
        "1": "och",
        "2": "-"
      },
      "expansion": "Old Chinese",
      "name": "cog"
    },
    {
      "args": {
        "1": "豚",
        "2": "piglet"
      },
      "expansion": "豚 (OC *duːn, “piglet”)",
      "name": "och-l"
    },
    {
      "args": {
        "1": "okm",
        "2": "돝",
        "t": "pig",
        "tr": "twòth"
      },
      "expansion": "Middle Korean 돝 (twòth, “pig”)",
      "name": "cog"
    }
  ],
  "etymology_text": "Possibly from earlier *tonkuz, a derivation of *tonk- with unclear meaning and a suffix\nIf the word was present in Proto-Bulgaric (Oghuric), the form *toŋuŕ could be reconstructed. However, no form that can be traced back to Proto-Bulgaric (via cognates in e.g. Chuvash or Hungarian) is attested.\nVovin (2011:260-263) speculates on a link among Common Turkic *toŋuz, Old Chinese 豚 (OC *duːn, “piglet”), and Middle Korean 돝 (twòth, “pig”).",
  "forms": [
    {
      "form": "no-table-tags",
      "source": "declension",
      "tags": [
        "table-tags"
      ]
    },
    {
      "form": "toŋuz",
      "source": "declension",
      "tags": [
        "nominative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlar",
      "source": "declension",
      "tags": [
        "nominative",
        "plural"
      ]
    },
    {
      "form": "toŋuznïŋ",
      "source": "declension",
      "tags": [
        "genitive",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarnïŋ",
      "source": "declension",
      "tags": [
        "genitive",
        "plural"
      ]
    },
    {
      "form": "toŋuzka",
      "source": "declension",
      "tags": [
        "dative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarka",
      "source": "declension",
      "tags": [
        "dative",
        "plural"
      ]
    },
    {
      "form": "toŋuzda",
      "source": "declension",
      "tags": [
        "locative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarda",
      "source": "declension",
      "tags": [
        "locative",
        "plural"
      ]
    },
    {
      "form": "toŋuzdan",
      "source": "declension",
      "tags": [
        "ablative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlardan",
      "source": "declension",
      "tags": [
        "ablative",
        "plural"
      ]
    },
    {
      "form": "toŋuzlarïn",
      "source": "declension",
      "tags": [
        "instrumental",
        "plural"
      ]
    },
    {
      "form": "toŋuzča",
      "source": "declension",
      "tags": [
        "equative",
        "singular"
      ]
    },
    {
      "form": "toŋuzlarča",
      "source": "declension",
      "tags": [
        "equative",
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {
        "1": "trk-pro",
        "2": "noun"
      },
      "expansion": "*toŋuz",
      "name": "head"
    }
  ],
  "lang": "Proto-Turkic",
  "lang_code": "trk-pro",
  "original_title": "Reconstruction:Proto-Turkic/toŋuz",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        "Middle Korean terms with non-redundant manual transliterations",
        "Old Turkic terms with non-redundant manual transliterations",
        "Pages with 1 entry",
        "Pages with entries",
        "Pecheneg terms in nonstandard scripts",
        "Proto-Common Turkic",
        "Proto-Turkic entries with incorrect language header",
        "Proto-Turkic lemmas",
        "Proto-Turkic nouns",
        "Requests for native script for Old Uyghur terms",
        "trk-pro:Even-toed ungulates",
        "trk-pro:Livestock"
      ],
      "glosses": [
        "pig (Common Turkic)"
      ],
      "links": [
        [
          "pig",
          "pig"
        ]
      ],
      "tags": [
        "reconstruction"
      ]
    }
  ],
  "word": "toŋuz"
}
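
The nested "descendants" lists in the records above can be walked recursively to get a flat, indented outline of the family tree. The following is a minimal sketch in Python, assuming a record has already been parsed into a dict. The function name `flatten_descendants` and the abridged `sample` data are illustrative only and are not part of kaikki.org or wiktextract.

```python
from typing import Iterator


def flatten_descendants(nodes: list[dict], depth: int = 0) -> Iterator[tuple[int, str, str]]:
    """Yield (depth, language, word) for every node in a descendants tree.

    Grouping nodes such as "Arghu" or "Karluk" carry no "word" key,
    so the word defaults to an empty string for them.
    """
    for node in nodes:
        yield depth, node["lang"], node.get("word", "")
        # Recurse into children, if any, one level deeper.
        yield from flatten_descendants(node.get("descendants", []), depth + 1)


# Abridged from the record above (assumption: only two branches are kept,
# and native-script "word" values are omitted for brevity).
sample = [
    {"lang": "Arghu", "descendants": [{"lang": "Khalaj", "word": "tongquz"}]},
    {
        "lang": "Karluk",
        "descendants": [
            {
                "lang": "Karakhanid",
                "roman": "toŋuz",
                "descendants": [
                    {
                        "lang": "Chagatai",
                        "roman": "toŋuz",
                        "descendants": [{"lang": "Uzbek", "word": "toʻngʻiz"}],
                    }
                ],
            }
        ],
    },
]

rows = list(flatten_descendants(sample))
for depth, lang, word in rows:
    label = f"{lang}: {word}" if word else lang
    print("  " * depth + label)
```

Returning plain tuples keeps the sketch independent of any particular output format; the same rows could be written to CSV or filtered by language name.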

This page is a part of the kaikki.org machine-readable Proto-Turkic dictionary. This dictionary is based on structured data extracted on 2025-09-03 from the enwiktionary dump dated 2025-08-23 using wiktextract (20da82b and a97feda). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.
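
Because the data is distributed as JSONL (one JSON object per line), it can be read with a standard JSON parser one line at a time. The sketch below is an assumption-laden illustration, not official tooling: the helper names `iter_entries` and `paradigm` are hypothetical, and the sample line is abridged from the *toŋuz record shown on this page.

```python
import json
from typing import Iterable, Iterator


def iter_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSONL input: one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def paradigm(entry: dict) -> dict[tuple[str, str], str]:
    """Map (case, number) to the inflected form listed under "forms"."""
    table: dict[tuple[str, str], str] = {}
    for form in entry.get("forms", []):
        tags = form.get("tags", [])
        # Rows with a single tag (the "no-table-tags" placeholder) are skipped.
        if len(tags) == 2:
            case, number = tags
            table[(case, number)] = form["form"]
    return table


# Abridged sample line (assumption: only three "forms" rows are kept).
sample_lines = [
    json.dumps(
        {
            "word": "toŋuz",
            "lang_code": "trk-pro",
            "forms": [
                {"form": "no-table-tags", "source": "declension", "tags": ["table-tags"]},
                {"form": "toŋuz", "source": "declension", "tags": ["nominative", "singular"]},
                {"form": "toŋuzlar", "source": "declension", "tags": ["nominative", "plural"]},
            ],
        },
        ensure_ascii=False,
    ),
    "",
]

entries = list(iter_entries(sample_lines))
table = paradigm(entries[0])
print(table[("nominative", "plural")])
```

With a downloaded file, the same `iter_entries` function can be given an open file handle (opened with `encoding="utf-8"`) in place of `sample_lines`, since iterating over a file yields its lines.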

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.