Tecling: Technologies for Linguistic Analysis
»The World is automatic

21 September, 2018: Best paper award at SEPLN 2018

Hernán Robledo and Rogelio Nazar were awarded the prize for best paper at SEPLN 2018 (Seville, Spain) for their work entitled ``Clasificación automatizada de marcadores discursivos'' (Automatic discourse-marker categorization).



-------------------------------

8 September, 2018: New version of Termout.org

Now comes with:

  • a better stoplist
  • a parameter for minimum frequency
  • a control for minimum and maximum size of the text
  • unseen elements are not discarded
  • non-utf8 material is discarded
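
As an illustration only, the sketch below shows how filters of this kind could be combined before counting candidate terms. It is not Termout's actual code: the function names, the default values, the whitespace tokenization and the choice to measure text size in characters are all assumptions made for this example.

```python
from collections import Counter


def decode_utf8_or_none(raw: bytes):
    """Return the decoded text, or None when the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def candidate_terms(raw_docs, stoplist, min_freq=2, min_size=10, max_size=1_000_000):
    """Count single-word candidates (hypothetical helper, not Termout's API).

    raw_docs : iterable of bytes objects, one per document
    stoplist : set of lowercase words to ignore
    min_freq : minimum frequency a candidate needs to be kept
    min_size, max_size : accepted text size, measured here in characters
    """
    counts = Counter()
    for raw in raw_docs:
        text = decode_utf8_or_none(raw)
        if text is None:
            continue  # non-UTF-8 material is discarded
        if not (min_size <= len(text) <= max_size):
            continue  # text is outside the accepted size range
        for token in text.lower().split():
            token = token.strip(".,;:!?()\"'")
            if token and token not in stoplist:
                counts[token] += 1
    # Keep only the candidates that reach the minimum frequency.
    return {term: n for term, n in counts.items() if n >= min_freq}


docs = [b"the corpus aligner aligns the corpus", b"\xff\xfe bad bytes", b"tiny"]
terms = candidate_terms(docs, stoplist={"the"}, min_freq=2, min_size=10)
```

In this toy run the second document is dropped because it is not valid UTF-8, the third because it is shorter than the minimum size, and only words that appear at least twice survive the frequency filter.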


-------------------------------

21 August, 2018: New version of the Taxonomy Project in French, English and Spanish

We have a new web demo of the project. It integrates all the algorithms and currently works in French, English and Spanish. The user interface is still rough, but the idea is that one can provide a noun (single nouns only, at the moment) and the program will try to assign the best semantic categories to that noun. It is also possible to provide a list of nouns (one per line), in which case the program will treat each noun as an independent trial.


-------------------------------

12 August, 2018: EMaD: new software for automatic categorization of discourse markers

In the context of the PhD thesis of Hernán Robledo, and coinciding with the publication of our new paper on the subject, we present this new web demo to detect and classify discourse markers. (The program and the documentation are at the moment only in Spanish).

-------------------------------

If you have comments or questions, feel free to contact us.

 
Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Bifid: a parallel corpus aligner

+ EMaD: automatic categorization of discourse markers

+ Dsele: a model dictionary for ELE learners

+ Estilector: a tool for assisted writing

+ GeNom: a program to detect the gender of proper nouns

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Neven: a program to detect eventive nouns

+ Termout: a terminology extraction system

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ Sapo: a program to detect similarities between documents

+ Verbario: corpus pattern analysis in Spanish

 
Sausalito

This is the view from where we are located, in the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

16 October, 2018: at 13:00 (GMT) Hernán Robledo will deliver a presentation entitled ``A proposal for the inductive categorisation of discourse markers'' at the Institute for Language, Cognition and Computation of the University of Edinburgh (Scotland). Address: Room 4.31/4.33, Informatics Forum of the University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB

 
 

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Ref. C16H02). Lead researcher: Irene Renau

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Ref. 11121597). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar

+ There is more.

 
Recent publications

+ Irene Renau; Rogelio Nazar; Valesca Lecaros. (Forthcoming). "La evolución de las marcas ortográficas y tipográficas en los procesos de lexicalización de neologismos: un estudio en el vocabulario de la crisis económica en prensa española". Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics.

+ Robledo, H.; Nazar, R. (2018). "Clasificación automatizada de marcadores discursivos", Procesamiento del Lenguaje Natural, n. 61, pp. 109-116.

+ Nazar, R. (Forthcoming). "El análisis cuantitativo de la coocurrencia léxica en la lexicografía especializada". Actas del VIII Congreso Internacional de Lexicografía Hispánica. Valencia, España: 27-29 Junio 2018.

+ Nazar, R. (2009 [2018]). Invitación al estudio estadístico del lenguaje. ArXiv:1804.07349 [stat.AP]
(PDF)

+ See more.

 

Solutions for text processing

It is critical for organizations to have the ability to process information automatically, and very often that information is contained in documents to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can help by teaching people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, which frees up time, effort and resources.

Tell us what your needs are and we will show you what we can do about them.

 
Contact