Tecling » The universe is not perfect, but it's working on it.
Technologies for Linguistic Analysis

January 19, 2023
A new version of Termout is coming soon...


For some months we have been working on a new version of Termout, our term extraction system. It now has many functions for automatic and manual terminology processing. As an example of computer-assisted terminology work, it lets you use your own specialized corpus to extract terms, evaluate the extracted candidates, classify them into semantic categories, extract definitions from the corpus, find equivalents in other languages, obtain synonyms (term variants), and more. We plan to open it to the public in February 2023 (next month). Follow the link for more details:
http://www.tecling.com/termout2022


January 16, 2023
Irene Renau is awarded a Fondecyt project


Professor Irene Renau, co-founder of the Tecling.com Research Group, has just been awarded one of the most competitive grants in the country, the Fondecyt Regular 2023. The four-year project is titled 'Mapa de las metáforas conceptuales en sustantivos y verbos del español: un estudio de los patrones metafóricos basado en corpus' (Map of conceptual metaphors in Spanish nouns and verbs: a corpus-based study of metaphorical patterns). Rogelio Nazar is taking part as co-investigator.


January 12, 2023
We wrap up a week of thesis defenses


All of our graduate students in linguistics have now defended their work. Ignacio Lobos presented his doctoral thesis proposal on discourse markers on Tuesday, and Javier Obreque (pictured) presented his master's thesis on modalization this morning. They all did very well and presented excellent work. We are very happy to end the semester this way.


January 9, 2023
Our students defend their master's theses


Benjamín López, Enzo Soto and Ana Castro, whose theses were supervised by Professor Irene Renau, are defending their master's theses today. Ana (pictured) is defending hers at this very moment. Good luck!


January 6, 2023
Two of our collaborators are awarded research grants


Hernán Robledo and Ricardo Martínez, both graduates of the PhD Program in Linguistics at PUCV.cl with years of research collaboration with the Tecling Group, have now been awarded research grants by ANID.cl. Hernán, with a Fondecyt Posdoc, is doing research on discourse markers, and Ricardo, with a Fondecyt Iniciación, studies the linguistic properties of poetry. Even though their research interests are different, both have made extensive use of computational models in their work. We are very proud to be your friends!


December 15, 2022
Rogelio Nazar receives the Distinguished Teaching Award


The Rector of the Pontificia Universidad Católica de Valparaíso, Prof. Nelson Vásquez, yesterday presented Rogelio Nazar with the Distinguished Teaching Award, based on the results of the student evaluations of his teaching in the second semester of 2021 and the first semester of 2022.


November 6, 2022
Randall: a script to sort a list in random order


Sometimes our students need to sort things randomly and they usually don't have a simple way to do it, or when they do, it is something like a web page full of advertisements. Now, with this script, you can paste a list of words, lines or anything else and it will return the same material in random order:
http://www.tecling.com/randall
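
For those who prefer to do this offline, the same idea can be sketched in a few lines of Python. This is only a minimal illustration and not the code behind Randall; the function name `shuffle_lines` and its optional `seed` parameter are assumptions made for this example.

```python
import random


def shuffle_lines(text, seed=None):
    """Return the non-empty lines of `text` in random order.

    `shuffle_lines` and `seed` are hypothetical names used for
    illustration; they are not part of Randall.
    """
    # Keep only lines that contain something other than whitespace.
    lines = [line for line in text.splitlines() if line.strip()]
    # A private generator; passing a seed makes the order reproducible.
    rng = random.Random(seed)
    # Shuffle the list in place.
    rng.shuffle(lines)
    return lines


if __name__ == "__main__":
    sample = "apple\nbanana\ncherry\ndate"
    print("\n".join(shuffle_lines(sample)))
```

Leaving `seed` unset gives a different order on each run, while passing a fixed number is useful when the same random order needs to be reproduced later.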


November 2, 2022
We are already halfway through the Corpus-Based Lexicography seminar


Today is the third day of the seminar on corpus-based lexicography at the Facultad de Filosofía y Letras of the Universidad Nacional de Cuyo, Mendoza. So far we have discussed what a corpus is and how to work with corpora in a lexicographic project. Today we will talk about management systems for lexical and terminological databases.
Tomorrow, Thursday, we will dive into the automated processing of linguistic data.
https://ffyl.uncuyo.edu.ar/cursos/item/lexicografia-basada-en-corpus


October 27, 2022
The Corpus-Based Lexicography Seminar is approaching


Organized by the Universidad Nacional de Cuyo in Mendoza, Argentina, the Seminar on Corpus-Based Lexicography will begin on October 31, 2022 at the Facultad de Filosofía y Letras and will run all that week until November 4, 2022.
It will be taught by Irene Renau and Rogelio Nazar, and will offer an overview of methods and techniques for exploiting text corpora for lexicographic and terminological purposes.
It will be held in person only. For information and contact:
cursosposgrado@ffyl.uncu.edu.ar / +54 261 4494168
https://ffyl.uncuyo.edu.ar/cursos/item/lexicografia-basada-en-corpus


October 19, 2022
We are going mixlingual


Yes, after some thought we decided it is best to go mixlingual, i.e., we are going to mix content in Spanish and English. The decision process is very simple, as the flowchart shows: if what we have to say is relevant for a general, international audience, we use English. If, in contrast, the news is relevant only for local audiences or is specifically directed at Spanish-speaking people, we write in Spanish. We find this more reasonable than trying to offer translations of all content.

Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Compare: a simple script to compare two lists of words

+ Cryptoman: a script to generate cryptograms

+ Dismark: a multilingual taxonomy of discourse markers (new!)

+ Dsele: a model dictionary for ELE learners

+ Estilector: computer-assisted writing for Spanish

+ GeNom: a program to detect the gender of proper nouns

+ HAT: a project for the treatment of polysemy in lexical taxonomies

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a lexical taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Lealem: a reading pacer for parallel German-Spanish texts

+ Leafran: a reading pacer for parallel French-Spanish texts

+ Linguini: a language detector

+ Neven: a program to detect eventive nouns

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ pullPOS: a project for the detection of plurals in Spanish

+ Randall: a list randomizer (new)

+ Readeutsch: a reading pacer for parallel German-English texts

+ Sapo: a program to detect similarities between documents

+ Sicam: a program to analyze Spanish poetry

+ Termout: a terminology extraction system

+ Termoutling: an automatic linguistics glossary

+ TEXT·A·GRAM: a program to analyze Spanish texts

+ Verbario: corpus pattern analysis in Spanish

Sausalito

This is the view from where we are located, by the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

Mid-February, 2022: We will be launching a new version of our term extraction software 'Termout'. We are very enthusiastic about it. It has many functions, covering not only term extraction but also the automation of a full terminology project.

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing:

+ Fondecyt Regular (2019-2021): "Polisemia regular de los sustantivos del español: análisis semiautomático de corpus, caracterización y tipología" (Regular polysemy of nouns in Spanish: semiautomatic corpus analysis, characterization and typology). Lead researcher: Irene Renau. Ref.: 1191204.

+ Fondecyt Regular (2019-2021): "Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües" (Automatic induction of taxonomies of discourse markers from multilingual corpora). Lead researcher: Rogelio Nazar. Ref.: 1191481.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Automatic induction of Spanish and French taxonomies using quantitative techniques and corpus statistics). Lead researcher: Irene Renau. Ref.: C16H02.

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Development of terminological competence over the course of disciplinary integration). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar. Ref.: 11121597.

+ See more.

Recent publications

+ Renau, I.; Nazar, R. (2022). "Towards a multilingual dictionary of discourse markers: automatic extraction of units from parallel corpus". In: Klosa-Kückelhaus, A.; Engelberg, S.; Möhrs, C.; Storjohann, P. (eds.) Dictionaries and Society. Proceedings of the XX EURALEX International Congress, Mannheim: IDS-Verlag, pp. 262-272.

+ Nazar, R.; Lindemann, D. (2022). "Terminology extraction using co-occurrence patterns as predictors of semantic relevance". Proceedings of the TERM21 Workshop. Language Resources and Evaluation Conference (LREC 2022), Marseille, 20-25 June 2022, pp. 26-29.

+ Nazar, R. (2021). "Inducción automática de una taxonomía multilingüe de marcadores discursivos: primeros resultados en castellano, inglés, francés, alemán y catalán". Procesamiento del Lenguaje Natural, núm. 67, pp. 127-138.

+ Nazar, R. (2021). "Automatic induction of a multilingual taxonomy of discourse markers". In: Iztok Kosem et al. (eds.) Electronic lexicography in the 21st century: post-editing lexicography. Lexical Computing CZ s.r.o., Brno, pp. 440-454.

+ Castro, A.; Nazar, R.; Renau, I. (2021). "New verbs and dictionaries: a method for the automatic detection of neology in Spanish verbs". International Journal of Lexicography, ...

+ Nazar, R.; Renau, I.; Acosta, N.; Robledo, H.; Soliman, H.; Zamora, S. (2021). "Corpus-Based Methods for Recognizing the Gender of Anthroponyms". Names: A Journal of Onomastics, vol. 69, num. 3.



+ See more.

Solutions for text processing

It is critical for organizations to be able to process information automatically, and very often that information is contained in documents meant to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can help by teaching people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on the information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, which frees up time, effort and resources.

Tell us what your needs are and we will show you what we can do about them.