Kind - The Taxonomy Project

Version: May 17, 2025.

What is this?


Kind is a taxonomy of nouns. The present version is derived from the Spanish and English Wiktionaries, the only two languages covered for the moment.
You are now on the English side. You can select the language (English or Spanish) and enter any common single noun (or a list of nouns, one per line), and the system will produce complete hypernymy chains. It does not yet work with multiword expressions, but we are currently working on that.
You can also request a RANDOM SAMPLE of 100 entries.

Some more context

This is the new (2025) version of a rather old project. Originally, it was a statistical algorithm for inducing taxonomies from corpora. It combined several strategies that did not involve explicit linguistic knowledge and relied instead on the computation of distributional similarity coefficients. This new version is very different. First, it is simpler than its predecessors, because it is based on a single dictionary (in this case Wiktionary, but in principle it could work with any other dictionary as well). Second, it has something that previous versions did not have: a word sense disambiguation algorithm. This new feature gives the system a great advantage, because it helps produce more coherent hypernymy chains.
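To make the idea of a dictionary-based hypernymy chain with sense disambiguation more concrete, here is a minimal sketch in Python. It is not the algorithm used by Kind, which is not described on this page. The toy dictionary, the function names and the disambiguation heuristic (a simplified Lesk-style word overlap between definitions) are all assumptions made for illustration only.

```python
# Toy dictionary, invented for illustration; it is NOT Wiktionary data.
# Each noun maps to a list of senses: a gloss plus the hypernym named in it.
TOY_DICT = {
    "poodle": [{"gloss": "a pet dog of a curly-haired breed", "hypernym": "dog"}],
    "dog": [
        {"gloss": "a mechanical device for gripping", "hypernym": "device"},
        {"gloss": "a domesticated mammal kept as a pet", "hypernym": "mammal"},
    ],
    "mammal": [{"gloss": "a warm-blooded animal that nurses its young", "hypernym": "animal"}],
    "animal": [{"gloss": "a living organism that can move", "hypernym": "organism"}],
    "organism": [{"gloss": "a living thing", "hypernym": None}],
    "device": [{"gloss": "an object made for a purpose", "hypernym": "object"}],
    "object": [{"gloss": "a material thing", "hypernym": None}],
}

STOPWORDS = {"a", "an", "the", "of", "for", "that", "its", "as", "can"}


def content_words(text):
    """Return the set of non-stopword tokens in a gloss."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def pick_sense(senses, context):
    """Assumed heuristic: choose the sense whose gloss shares the most
    words with the context gathered so far (ties go to the first sense)."""
    return max(senses, key=lambda s: len(content_words(s["gloss"]) & context))


def hypernymy_chain(noun, dictionary=TOY_DICT):
    """Follow hypernym links upward, disambiguating at every step."""
    chain, seen, context = [noun], {noun}, {noun}
    current = noun
    while current in dictionary:
        sense = pick_sense(dictionary[current], context)
        context |= content_words(sense["gloss"])
        current = sense["hypernym"]
        if current is None or current in seen:  # top reached, or a cycle
            break
        chain.append(current)
        seen.add(current)
    return chain


def chains_for_input(text):
    """Accept a list of nouns, one per line, and return one chain per noun."""
    return [hypernymy_chain(line.strip()) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for chain in chains_for_input("poodle\ndevice"):
        print(" > ".join(chain))
```

In this toy example the word "pet" in the definition of "poodle" steers "dog" towards its animal sense rather than its mechanical sense, which is the kind of coherence that sense disambiguation is meant to provide.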

At some point in the near future we will reinstate some of the components of the older version of Kind that were also useful for obtaining information from corpora. We will post an update on this soon.

Documentation and source code

Today is Monday, May 12, 2025. The source code and documentation are changing very rapidly as we work on the details. However, you can have a look at what we have so far.
We will not maintain older versions. However, copies of older versions of Kind are available thanks to the great Wayback Machine (Internet Archive) at the following URL:

https://web.archive.org/web/20230926170113/http://www.tecling.com/cgi-bin/kind/2021/


That, for example, is the 2021 version, but many others are available there as well.
If you would like to send inquiries, you are welcome to do so at rogelio (dot) nazar (at) gmail (dot) com.

Funding


This project has been supported by two successive grants:

  1. Conicyt-Fondecyt 11140686, “Inducción automática de taxonomías de sustantivos generales y especializados a partir de corpus textuales desde el enfoque de la lingüística cuantitativa” (Automatic taxonomy induction from corpora for terminology and general vocabulary using quantitative measures). Lead researcher: Rogelio Nazar (2014-2017).
  2. Ecos Sud-Conicyt Project C16H02, “Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus” (Automatic taxonomy induction from corpora for Spanish and French using quantitative corpus analysis). Lead researcher: Irene Renau (2016-2019).

Credits

Researchers:

Also, the following researchers worked on previous versions of this project:

Related publications: