Automatic keyword tagger MARTA

The tool can be found here: https://marta.nlib.ee (website is temporarily unavailable)

MARTA is a prototype for automatic keyword tagging of Estonian articles. The prototype accepts text as input in one of three ways: entered as plain text, downloaded from a given URL, or extracted from an uploaded file. Optionally, the user can select the methods to apply and/or the article domains. In the next step, the text is lemmatized and part-of-speech tags are assigned using the MLP10 (multilingual preprocessor) tool from Texta Toolkit. After lemmatization, keyword tagging methods are applied to extract the following types of keywords from the text:

  • Topical keywords
  • Personal names
  • Locations and geopolitical entities
  • Organizations
  • Temporal keywords

The detected keywords are compared with the Estonian Thesaurus (EMS). If a keyword also appears in the EMS, a checkmark is displayed next to it. The identified keywords can be exported from the application in MARC format.
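
The thesaurus check and export step described above can be illustrated with a minimal Python sketch. It is not MARTA's actual implementation: the `Keyword` class, the function names, the case-insensitive exact matching, and the mapping from keyword categories to MARC 21 subject fields (650, 600, 651, 610, 648) are all assumptions made for illustration, and the output lines are a simplified display form rather than valid MARC records.

```python
from dataclasses import dataclass

# Assumed mapping from the keyword categories listed above to MARC 21
# subject fields. MARTA's real export may use different fields.
MARC_TAGS = {
    "topic": "650",         # topical term
    "person": "600",        # personal name
    "location": "651",      # geographic name
    "organization": "610",  # corporate name
    "time": "648",          # chronological term
}


@dataclass
class Keyword:
    """Hypothetical container for one detected keyword."""
    text: str
    category: str
    in_ems: bool = False  # True if the keyword also appears in the thesaurus


def mark_thesaurus_matches(keywords, thesaurus_terms):
    """Return copies of the keywords with in_ems set by a
    case-insensitive exact match against the thesaurus terms."""
    normalized = {term.casefold() for term in thesaurus_terms}
    return [
        Keyword(k.text, k.category, k.text.casefold() in normalized)
        for k in keywords
    ]


def to_marc_lines(keywords):
    """Render keywords as simplified 'TAG $a value' lines (illustrative only)."""
    return [f"{MARC_TAGS[k.category]} $a {k.text}" for k in keywords]


if __name__ == "__main__":
    detected = [Keyword("raamatukogu", "topic"), Keyword("Tallinn", "location")]
    marked = mark_thesaurus_matches(detected, {"Raamatukogu"})
    for kw, line in zip(marked, to_marc_lines(marked)):
        # A checkmark is shown only for keywords found in the thesaurus.
        print(("✓ " if kw.in_ems else "  ") + line)
```

Because the keywords are lemmatized before tagging, a plain set lookup like this is plausible, but a real thesaurus comparison might also need to handle multi-word terms and spelling variants.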

You can find a more detailed user guide for the prototype here (in Estonian).
