Termout.org is the first implementation of a new method for terminology extraction based on distributional analysis. The intuition behind the algorithm is that single or multi-word lexical units that refer to specialised concepts show a characteristic co-occurrence pattern: they tend to appear in the same contexts as other conceptually related terms. For example, the term fluoxetine will systematically appear in the same sentences as related terms such as depression, serotonin reuptake inhibitor, obsessive–compulsive disorder and others. Terms also co-occur with general vocabulary units, of course, but not with the characteristic pattern that appears when a conceptual relation holds. The method was evaluated experimentally on a corpus of psychiatry journals from Spain and Latin America, and its results were found to be significantly better than those of other methods.
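To make the intuition concrete, here is a minimal Python sketch of sentence-level co-occurrence counting. It is not the Termout.org implementation: the function names (`cooccurrence_profiles`, `top_partners`) and the toy sentences are hypothetical, and the sketch assumes that sentences have already been split into candidate units, with multi-word units kept as single strings.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_profiles(sentences):
    """Map each unit to a Counter of the units that share a sentence with it.

    `sentences` is an iterable of lists of units (hypothetical input format).
    """
    profiles = defaultdict(Counter)
    for sentence in sentences:
        # Count each pair at most once per sentence.
        for a, b in combinations(set(sentence), 2):
            profiles[a][b] += 1
            profiles[b][a] += 1
    return profiles


def top_partners(profiles, unit, n=3):
    """Return the n units that most often share a sentence with `unit`."""
    return profiles[unit].most_common(n)


# Toy data for illustration only, not taken from the evaluation corpus.
sentences = [
    ["fluoxetine", "depression", "patients"],
    ["fluoxetine", "serotonin reuptake inhibitor", "depression"],
    ["fluoxetine", "obsessive-compulsive disorder", "treatment"],
    ["patients", "treatment", "study"],
    ["fluoxetine", "depression", "study"],
]

profiles = cooccurrence_profiles(sentences)
print(top_partners(profiles, "fluoxetine", n=1))
```

On this toy input, depression is the most frequent partner of fluoxetine (three shared sentences), while general vocabulary units such as patients or study each share only one sentence with it. A real system would need a statistical criterion to decide when such a pattern is characteristic; that step is left out here.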
Web demo: http://www.termout.org/
The web interface is now online (24 January 2018)! It works in Spanish and English, but it is still a little unstable, so be prepared for the dreaded "Internal Server Error" message from time to time.
Reference: Nazar, R. (2016). Distributional analysis applied to terminology extraction: example in the domain of psychiatry in Spanish. Terminology: International Journal of Theoretical and Applied Issues in Specialized Communication, 22(2):142-170.
Related concepts: co-occurrence, distributional semantics, terminology extraction, topic signatures, text-mining
Contact: rogelio.nazar at gmail.com