Welcome to kaikki.org

kaikki — [Finnish] all, everything, everyone

Kaikki.org is a digital archive and a data mining group. We aim to make our digital heritage more accessible and useful for people, researchers, linguists, software developers, and artificial intelligence (AI).

Available resources

Machine-readable computational dictionaries for European languages

These were extracted from the English Wiktionary edition and have glosses in English.

Downloadable dictionaries for various other languages

These were extracted from the English Wiktionary edition and have glosses in English.

Combined dictionary of all languages

Data extracted from non-English Wiktionary editions

Each non-English edition has glosses written in that edition's own language. These extractions are a work in progress and may still contain many errors or omissions. Contributions to improve the extraction code are welcome. Please report bugs here.
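
For software developers, the sketch below shows one way the downloadable dictionaries might be read. It assumes the data is distributed as JSON Lines (one JSON object per line) and that each entry has a "word" field and a "senses" list whose items carry a "glosses" list. That layout, the helper names iter_entries and english_glosses, and the sample record are illustrative assumptions, not something this page specifies, so check an actual download before relying on them.

```python
import io
import json
from typing import Dict, Iterator, List, TextIO


def iter_entries(lines: TextIO) -> Iterator[Dict]:
    """Yield one parsed entry per non-empty line of a JSON Lines stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def english_glosses(entry: Dict) -> List[str]:
    """Collect all gloss strings from an entry's senses.

    Assumes the layout {"senses": [{"glosses": [...]}, ...]};
    missing keys are treated as empty lists.
    """
    glosses: List[str] = []
    for sense in entry.get("senses", []):
        glosses.extend(sense.get("glosses", []))
    return glosses


# Hypothetical sample record, made up for illustration only.
SAMPLE = (
    '{"word": "kaikki", "lang": "Finnish", "pos": "pron", '
    '"senses": [{"glosses": ["all, everything, everyone"]}]}\n'
)

if __name__ == "__main__":
    for entry in iter_entries(io.StringIO(SAMPLE)):
        print(entry["word"], english_glosses(entry))
```

To read a real download, the in-memory sample could be replaced with an open file handle, for example open(path, encoding="utf-8"), where path points at a file you have downloaded. Reading line by line keeps memory use low, which matters for the combined dictionary of all languages.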

Publications

If you use Wiktextract or the data on this site in academic work, please cite: Tatu Ylonen, "Wiktextract: Wiktionary as Machine-Readable Structured Data", Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022.

Linking to this web site would also be greatly appreciated.

Contact

Kaikki.org is currently maintained by Tatu Ylonen. You can contact us at info at kaikki.org. Please do not use this email for any marketing or mass emailing.