Dismark

Official website of Project Fondecyt 1191481

AUTOMATIC INDUCTION OF TAXONOMIES OF DISCOURSE MARKERS FROM MULTILINGUAL CORPORA
Current version:

August 5th, 2022



This website offers the following content:
  1. Documentation
  2. A multilingual taxonomy resulting from the project
  3. An automatic classifier of discourse markers

3. Automatic classifier of discourse markers

Paste a list of expressions here:

Language:

(Processing is faster if the language is selected manually.)

This demo of the algorithm takes as input a list of one or more expressions (one per line) and performs the following tasks:
  1. It identifies the language of each expression (among the languages already listed).
  2. It decides, in each case, whether the expression is a discourse marker.
  3. If it is, it assigns a category to it.
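The three steps above can be pictured as a small pipeline. The following Python sketch is only an illustration of that flow and not the project's actual algorithm: the lexicon, the category names, and the function names (`parse_input`, `detect_language`, `classify`) are all hypothetical placeholders, and a plain dictionary lookup stands in for the real classifier.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon: (language, expression) -> placeholder category.
# These entries and category names are illustrative assumptions; they are
# NOT taken from the project's taxonomy.
TOY_LEXICON = {
    ("en", "however"): "contrast",
    ("en", "for example"): "exemplification",
    ("es", "sin embargo"): "contrast",
}


@dataclass
class Result:
    expression: str
    language: Optional[str]   # None if the toy lookup cannot guess it
    is_marker: bool
    category: Optional[str]   # None when the expression is not a marker


def parse_input(text: str) -> List[str]:
    """Split pasted text into one expression per line, dropping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def detect_language(expression: str) -> Optional[str]:
    """Stand-in for step 1: return the first language whose toy lexicon
    contains the expression, or None if no entry matches."""
    for lang, known in TOY_LEXICON:
        if known == expression:
            return lang
    return None


def classify(text: str, language: Optional[str] = None) -> List[Result]:
    """Run the three steps on every expression in the pasted text.

    If `language` is given, step 1 is skipped, which mirrors the note
    that manual language selection is faster.
    """
    results = []
    for expression in parse_input(text):
        key = expression.lower()
        lang = language or detect_language(key)       # step 1
        category = TOY_LEXICON.get((lang, key))       # steps 2 and 3
        results.append(Result(expression, lang, category is not None, category))
    return results
```

Calling `classify("however\nsin embargo\ntable")` returns one `Result` per input line; in this toy version, expressions that are missing from the lexicon are simply reported as non-markers with no category.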

At present, the demo does not process more than 750 expressions at once, because the algorithm is not yet optimized for large-scale data processing. This optimization is planned but is not a current priority.
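For lists longer than the limit, one workaround is to split the input into several submissions. The sketch below shows one way to do that on the user's side; the helper name `batches` and the constant `MAX_EXPRESSIONS` are assumptions made for this example and are not part of the demo itself.

```python
from typing import Iterator, List

# Limit stated for the demo; the constant name is a hypothetical choice.
MAX_EXPRESSIONS = 750


def batches(expressions: List[str], size: int = MAX_EXPRESSIONS) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` expressions each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(expressions), size):
        yield expressions[start:start + size]
```

Each yielded batch can then be joined with newlines (`"\n".join(batch)`) and pasted into the form as a separate submission.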