GeNom: automatic detection of the gender of proper names
is a project awarded to us on June 20, 2017, funded by the
Technology Prototypes track of the Innovation and Entrepreneurship 2017 Competition
(Vicerrectoría de Investigación y Estudios Avanzados - Pontificia Universidad Católica de Valparaíso).
The result is offered as a web service for batch processing of information for terminography or lexicography projects
or for mailing purposes.
This software is designed to automatically determine the gender of a list of names based on their co-occurrence with words and abbreviations in a large corpus.
GeNom differs from other automatic name-gender recognition software in that it is based on natural language processing and does not rely on
precompiled lists of first names, which quickly become outdated and cannot handle previously unseen names.
GeNom uses corpora to address the problem, because they make it possible to obtain real and up-to-date name-gender links.
This approach also performs better than machine learning methods: 93% precision and 88% recall on a database of ca. 10,000 mixed names.
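As a rough illustration of the co-occurrence idea described above, the following Python sketch counts gendered cue words that appear just before a name in a small corpus. It is not GeNom's actual implementation: the cue lists, the function name guess_gender, the window size and the min_ratio threshold are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical cue words and abbreviations (assumption: the real cues
# used by GeNom are not documented in this text).
MASCULINE_CUES = {"señor", "sr", "don"}
FEMININE_CUES = {"señora", "sra", "doña"}


def guess_gender(name, corpus_sentences, window=2, min_ratio=2.0):
    """Guess a name's gender from cue words found in the `window`
    tokens that precede each occurrence of the name."""
    counts = Counter()
    target = name.lower()
    for sentence in corpus_sentences:
        # Lowercase and split into word tokens; "Sra." becomes "sra".
        tokens = re.findall(r"\w+", sentence.lower())
        for i, token in enumerate(tokens):
            if token != target:
                continue
            for context in tokens[max(0, i - window):i]:
                if context in MASCULINE_CUES:
                    counts["m"] += 1
                elif context in FEMININE_CUES:
                    counts["f"] += 1
    m, f = counts["m"], counts["f"]
    # Decide only when one gender clearly dominates the evidence.
    if m > 0 and m >= min_ratio * f:
        return "masculine"
    if f > 0 and f >= min_ratio * m:
        return "feminine"
    return "unknown"


# Toy corpus, invented for this example.
sentences = [
    "El señor Rogelio llegó tarde.",
    "La Sra. Camila firmó el contrato.",
    "Don Rogelio habló con doña Camila.",
]

for candidate in ["Rogelio", "Camila", "Valparaíso"]:
    print(candidate, guess_gender(candidate, sentences))
```

With a real corpus, the counts would come from many more occurrences, and a name with no gendered context (like the place name in the toy example) would stay unclassified instead of being forced into a category.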
This software can be used to conduct large-scale studies on gender, such as studies of gender bias, or for a variety of other NLP tasks,
such as information extraction, machine translation and anaphora resolution.
It is designed to work with Spanish names, since it relies on a Spanish corpus, but it will also be able to process names in other languages, provided that they use the same alphabet.
Web demo: http://www.tecling.com/genom
The interface is currently available only in Spanish.
Contact: rogelio.nazar (imagine the 'at' symbol here) gmail.com