<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>Language resources and tools of AiLab IMCS UL</title>
<link href="http://hdl.handle.net/20.500.12574/2" rel="alternate"/>
<subtitle/>
<id>http://hdl.handle.net/20.500.12574/2</id>
<updated>2026-04-08T05:01:36Z</updated>
<dc:date>2026-04-08T05:01:36Z</dc:date>
<entry>
<title>Latvian Communist Leaflet Corpus (1934–1940)</title>
<link href="http://hdl.handle.net/20.500.12574/154" rel="alternate"/>
<author>
<name>Babaņins, Vladislavs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/154</id>
<updated>2026-04-07T07:39:09Z</updated>
<published>2026-03-30T00:00:00Z</published>
<summary type="text">Latvian Communist Leaflet Corpus (1934–1940)
Babaņins, Vladislavs
The Latvian Communist Leaflet Corpus (1934–1940) is a structured digital corpus of underground political leaflets produced by illegal communist organizations in Latvia between January 1934 and July 1940, covering the final months of the parliamentary period and the authoritarian regime of Kārlis Ulmanis. The corpus contains 251 unique leaflet texts. In total, there are 458 records, of which 273 include transcribed text (including textual variants) and the remainder are metadata-only records for leaflets not reproduced in the source edition. The transcribed texts have been manually reviewed and corrected to reduce transcription errors. Each record includes structured metadata fields such as title, author, date, print run, typography name, production method, original language, and text language. The corpus also includes manually compiled topic annotations and inferred location data as additional research annotations.
</summary>
<dc:date>2026-03-30T00:00:00Z</dc:date>
</entry>
<entry>
<title>Historical Dictionary of Latvian Given Names</title>
<link href="http://hdl.handle.net/20.500.12574/152" rel="alternate"/>
<author>
<name>Siliņa-Piņķe, Renāte</name>
</author>
<author>
<name>Rapa, Sanda</name>
</author>
<author>
<name>Jansone, Ilga</name>
</author>
<author>
<name>Kazakevičs, Ņikita</name>
</author>
<id>http://hdl.handle.net/20.500.12574/152</id>
<updated>2026-02-18T18:10:00Z</updated>
<published>2026-01-01T00:00:00Z</published>
<summary type="text">Historical Dictionary of Latvian Given Names
Siliņa-Piņķe, Renāte; Rapa, Sanda; Jansone, Ilga; Kazakevičs, Ņikita
"Historical Dictionary of Latvian Given Names" (LPVV) is an online scientific dictionary that collects and describes Latvian given names documented in written sources spanning more than eight centuries. This dictionary focuses on names that entered the Latvian given name system before the end of the 19th century.
</summary>
<dc:date>2026-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Tēzaurs.lv 2026 (Winter Edition)</title>
<link href="http://hdl.handle.net/20.500.12574/151" rel="alternate"/>
<author>
<name>Spektors, Andrejs</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Grūzītis, Normunds</name>
</author>
<author>
<name>Paikens, Pēteris</name>
</author>
<author>
<name>Rituma, Laura</name>
</author>
<author>
<name>Saulīte, Baiba</name>
</author>
<author>
<name>Nešpore-Bērzkalne, Gunta</name>
</author>
<author>
<name>Lokmane, Ilze</name>
</author>
<author>
<name>Klints, Agute</name>
</author>
<author>
<name>Stāde, Madara</name>
</author>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Auziņa, Ilze</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<author>
<name>Darģis, Roberts</name>
</author>
<author>
<name>Bārzdiņš, Guntis</name>
</author>
<id>http://hdl.handle.net/20.500.12574/151</id>
<updated>2025-12-22T17:55:11Z</updated>
<published>2025-12-21T00:00:00Z</published>
<summary type="text">Tēzaurs.lv 2026 (Winter Edition)
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis
Tēzaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic, and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data.

This dataset is available as open data in TEI/XML and LMF/XML formats, as well as a PostgreSQL database dump.
</summary>
<dc:date>2025-12-21T00:00:00Z</dc:date>
</entry>
<entry>
<title>Dictionary of Contemporary Latvian Language (MLVV) (2025-12-21)</title>
<link href="http://hdl.handle.net/20.500.12574/150" rel="alternate"/>
<author>
<name>Zuicena, Ieva</name>
</author>
<author>
<name>Auziņa, Ieva</name>
</author>
<author>
<name>Briede, Santa</name>
</author>
<author>
<name>Jansone, Irēna Ilga</name>
</author>
<author>
<name>Kuplā, Ieva</name>
</author>
<author>
<name>Lejniece, Gunta</name>
</author>
<author>
<name>Migla, Ilga</name>
</author>
<author>
<name>Oldere, Laimdota</name>
</author>
<author>
<name>Ozola, Ārija</name>
</author>
<author>
<name>Požarnova, Vija</name>
</author>
<author>
<name>Rapa, Sanda</name>
</author>
<author>
<name>Roze, Anitra</name>
</author>
<author>
<name>Šmidebergs, Imants</name>
</author>
<author>
<name>Šnē, Dorisa</name>
</author>
<author>
<name>Šnē, Māra</name>
</author>
<author>
<name>Timuška, Agris</name>
</author>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/150</id>
<updated>2025-12-22T17:53:05Z</updated>
<published>2025-12-21T00:00:00Z</published>
<summary type="text">Dictionary of Contemporary Latvian Language (MLVV) (2025-12-21)
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs
“Contemporary dictionary of Latvian language” (MLVV), developed by the Latvian Language Institute of the Faculty of Humanities at the University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and encyclopaedias and dictionaries from the last decade. Some of the dictionary content is machine-readable.
</summary>
<dc:date>2025-12-21T00:00:00Z</dc:date>
</entry>
<entry>
<title>Dictionary of Latvian Literary Language (LLVV) (2025-12-21)</title>
<link href="http://hdl.handle.net/20.500.12574/149" rel="alternate"/>
<author>
<name>Ceplītis, Laimdots</name>
</author>
<author>
<name>Spektors, Andrejs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/149</id>
<updated>2025-12-22T17:50:44Z</updated>
<published>2025-12-21T00:00:00Z</published>
<summary type="text">Dictionary of Latvian Literary Language (LLVV) (2025-12-21)
Ceplītis, Laimdots; Spektors, Andrejs
In the 20th century, the Latvian Language Institute of the University of Latvia (UL LLI, formerly the Language and Literature Institute of the Academy of Sciences) produced the largest lexicographic source of the Latvian language, which was digitized (2001–2022) by the Institute of Mathematics and Computer Sciences, UL. The dictionary contains words of standard Latvian used from the 1870s up to the end of the 20th century, when the work on the dictionary was carried out (1972–1996). The dictionary was created using words and example sentences from fiction, scientific texts, newswire, and folklore.
</summary>
<dc:date>2025-12-21T00:00:00Z</dc:date>
</entry>
<entry>
<title>Latvian word frequency dataset</title>
<link href="http://hdl.handle.net/20.500.12574/148" rel="alternate"/>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Valkovska, Baiba</name>
</author>
<author>
<name>Levāne-Petrova, Kristīne</name>
</author>
<id>http://hdl.handle.net/20.500.12574/148</id>
<updated>2025-12-19T13:53:53Z</updated>
<published>2025-12-19T00:00:00Z</published>
<summary type="text">Latvian word frequency dataset
Grasmanis, Mikus; Valkovska, Baiba; Levāne-Petrova, Kristīne
This frequency list contains the 25,000 most frequent Latvian lemmas, obtained from 18 morphologically annotated corpora totalling 1.5 billion tokens from the Latvian National Corpora Collection (Korpuss.lv) and Tēzaurs.lv. Supporting academic and practical applications, including language teaching, machine translation, and speech technologies, the list provides a broader and more representative view of the modern Latvian lexicon and usage trends.
</summary>
<dc:date>2025-12-19T00:00:00Z</dc:date>
</entry>
<entry>
<title>Latvian and Latgalian Parallel Sample Treebank (Cairo)</title>
<link href="http://hdl.handle.net/20.500.12574/143" rel="alternate"/>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Nešpore-Bērzkalne, Gunta</name>
</author>
<author>
<name>Pokratniece, Kristīne</name>
</author>
<author>
<name>Rituma, Laura</name>
</author>
<id>http://hdl.handle.net/20.500.12574/143</id>
<updated>2025-11-26T15:23:28Z</updated>
<published>2025-11-15T00:00:00Z</published>
<summary type="text">Latvian and Latgalian Parallel Sample Treebank (Cairo)
Pretkalniņa, Lauma; Nešpore-Bērzkalne, Gunta; Pokratniece, Kristīne; Rituma, Laura
This corpus contains 20 Latvian and Latgalian sample sentences annotated in the same hybrid annotation model used in the Latvian Treebank. The sentences in this corpus are the same ones used in the "Cairo" sample corpora that showcase annotation choices for Universal Dependencies treebanks, and this corpus serves as a basis for both the UD-Latvian_Cairo and UD-Latgalian_Cairo corpora. Based on the experience with these sentences, preliminary UD annotation documentation for Latgalian was also prepared. This work allows Latgalian UD data to be used to assess how multilingual tools perform on a language that has no training data, and to serve as a base for further treebank development later.
</summary>
<dc:date>2025-11-15T00:00:00Z</dc:date>
</entry>
<entry>
<title>LVTB - Latvian Treebank v2.17</title>
<link href="http://hdl.handle.net/20.500.12574/142" rel="alternate"/>
<author>
<name>Rituma, Laura</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Saulīte, Baiba</name>
</author>
<author>
<name>Nešpore-Bērzkalne, Gunta</name>
</author>
<author>
<name>Grūzītis, Normunds</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/142</id>
<updated>2025-11-25T07:35:50Z</updated>
<published>2025-11-15T00:00:00Z</published>
<summary type="text">LVTB - Latvian Treebank v2.17
Rituma, Laura; Pretkalniņa, Lauma; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Grūzītis, Normunds; Znotiņš, Artūrs
Latvian Treebank (LVTB) has been under development since 2010. It is manually annotated according to a hybrid dependency-constituency grammar model. This version of LVTB contains the data used for deriving the corresponding version of the Latvian UD Treebank (UDLV-LVTB).
</summary>
<dc:date>2025-11-15T00:00:00Z</dc:date>
</entry>
<entry>
<title>The Corpus of Early Written Latvian (2025)</title>
<link href="http://hdl.handle.net/20.500.12574/141" rel="alternate"/>
<author>
<name>Andronova, Everita</name>
</author>
<author>
<name>Baltiņa, Maija</name>
</author>
<author>
<name>Frīdenberga, Anna</name>
</author>
<author>
<name>Grūzītis, Normunds</name>
</author>
<author>
<name>Ķauķīte, Sintija</name>
</author>
<author>
<name>Pokratniece, Kristīne</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Siliņa-Piņķe, Renāte</name>
</author>
<author>
<name>Skrūzmane, Elga</name>
</author>
<author>
<name>Spektors, Andrejs</name>
</author>
<author>
<name>Spektors, Mārtiņš</name>
</author>
<author>
<name>Štrausa, Ilze</name>
</author>
<author>
<name>Trumpa, Anta</name>
</author>
<author>
<name>Trumpa, Edmunds</name>
</author>
<author>
<name>Vanags, Pēteris</name>
</author>
<id>http://hdl.handle.net/20.500.12574/141</id>
<updated>2025-11-20T16:14:51Z</updated>
<published>2025-11-27T00:00:00Z</published>
<summary type="text">The Corpus of Early Written Latvian (2025)
Andronova, Everita; Baltiņa, Maija; Frīdenberga, Anna; Grūzītis, Normunds; Ķauķīte, Sintija; Pokratniece, Kristīne; Pretkalniņa, Lauma; Siliņa-Piņķe, Renāte; Skrūzmane, Elga; Spektors, Andrejs; Spektors, Mārtiņš; Štrausa, Ilze; Trumpa, Anta; Trumpa, Edmunds; Vanags, Pēteris
The Corpus of early written Latvian 'SENIE' provides access to the texts and facsimiles of written Latvian of the 16th–18th centuries. Its aim is to facilitate studies of early Latvian in general and to serve as the basis for 'The Historical dictionary of Latvian (16th–17th cc.)'. The corpus serves as a unique digital repository of early Latvian texts, whose physical copies are distributed all over the world. The corpus was first launched in January 2003, and in 2017 it was converted to Unicode. Work on the corpus continues in various directions, including adding new sources. This version contains 102 sources.
</summary>
<dc:date>2025-11-27T00:00:00Z</dc:date>
</entry>
<entry>
<title>Spelling normalization tool for Latvian 18th century texts</title>
<link href="http://hdl.handle.net/20.500.12574/140" rel="alternate"/>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Andronova, Everita</name>
</author>
<author>
<name>Frīdenberga, Anna</name>
</author>
<author>
<name>Skrūzmane, Elga</name>
</author>
<author>
<name>Siliņa-Piņķe, Renāte</name>
</author>
<author>
<name>Trumpa, Anta</name>
</author>
<author>
<name>Vanags, Pēteris</name>
</author>
<id>http://hdl.handle.net/20.500.12574/140</id>
<updated>2025-11-06T12:29:40Z</updated>
<published>2025-01-01T00:00:00Z</published>
<summary type="text">Spelling normalization tool for Latvian 18th century texts
Pretkalniņa, Lauma; Andronova, Everita; Frīdenberga, Anna; Skrūzmane, Elga; Siliņa-Piņķe, Renāte; Trumpa, Anta; Vanags, Pēteris
The spelling normalization tool (pilot converter) is meant for converting any 18th-century Latvian Unicode-encoded text into a more modern spelling. This version of the tool normalizes the roots of words and is thus meant to facilitate user-friendly corpus search in tools like Sketch Engine. The tool consists of 134 universal rules. It has been successfully tested on various 18th-century sources from the SENIE corpus, achieving an average accuracy of 94%.
</summary>
<dc:date>2025-01-01T00:00:00Z</dc:date>
</entry>
</feed>
