What's New

corpus
Description:
The CSV dataset contains sentence pairs for a text-to-text transformation task: given a sentence that contains 0..n abbreviations, rewrite (normalize) the sentence in full words (word forms). Training dataset: 64,665 ...
This item contains 1 file (6.73 MB).
 
Publicly Available
corpus
Description:
The Balanced Corpus of Modern Latvian contains unique texts not yet included in the previously developed balanced corpora (LVK2013 and LVK2018). The corpus is primarily based on the design principles of previous ...
This item contains no files.
corpus
Description:
The corpus contains texts from the magazine "Karogs" from 1940 to 1994.
This item contains no files.
 
Publicly Available

Most Viewed Items

Most Popular in the Last Week
lexicalConceptualResource
Description:
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains nearly 390,000 entries compiled from more than 330 sources. The dictionary is enriched with phonetic, morphological, semantic ...
This item contains 1 file (24.76 MB).
 
Publicly Available
corpus
Description:
A text corpus of orthographic transcriptions of a Latvian medical speech corpus. It consists of 900 transcripts (documents) of a ~35 hour radiology speech corpus. Modalities covered: CT, MR, MG, CR, US.
This item contains 1 file (267.18 KB).
 
Publicly Available
toolService
Author(s):
Description:
LVBERT is the first publicly available monolingual BERT language model pre-trained for Latvian. For training we used the original TensorFlow implementation of BERT with whole-word masking and the next sentence ...
This item contains 3 files (1.51 GB).
 
Publicly Available