What's New
corpus
Description:
A Latvian speech corpus for the validation, testing and comparison of ASR models.
The audio data is segmented and aligned with the corresponding orthographic transcriptions, which are human-verified.
The dataset consists ...
This item contains no files.
corpus
Description:
A dataset of hierarchically annotated named entities in Latvian news articles (provided by the Latvian Information Agency LETA) for the development and evaluation of transition-based parsers for named entity recognition (NER).
This item contains 1 file (1.02 MB).
Academic Use
toolService
Description:
The SELMA Open-Source Software (OSS) offers effective means to test and compare the performance of various language models used in multilingual media monitoring and content production. The SELMA OSS Platform (also referred ...
This item contains no files.
Most Viewed Items
Top Last Week
lexicalConceptualResource
Description:
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 397,000 entries based on 346 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic ...
This item contains 2 files (38.76 MB).
Publicly Available
toolService
Description:
The SELMA Open-Source Software (OSS) offers effective means to test and compare the performance of various language models used in multilingual media monitoring and content production. The SELMA OSS Platform (also referred ...
This item contains no files.
lexicalConceptualResource
Description:
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 393,500 entries based on 345 sources. The dictionary is enriched with phonetic, morphological, semantic and other ...
This item contains 2 files (26.16 MB).
Publicly Available