<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>Language resources and tools of AiLab IMCS UL</title>
<link href="http://hdl.handle.net/20.500.12574/2" rel="alternate"/>
<subtitle/>
<id>http://hdl.handle.net/20.500.12574/2</id>
<updated>2026-05-13T07:36:21Z</updated>
<dc:date>2026-05-13T07:36:21Z</dc:date>
<entry>
<title>ConLoan-LV: A Contrastive Dataset for Latvian Language Loanwords, Code-switching, and Named Entities</title>
<link href="http://hdl.handle.net/20.500.12574/158" rel="alternate"/>
<author>
<name>Štekeļs, Jorens</name>
</author>
<id>http://hdl.handle.net/20.500.12574/158</id>
<updated>2026-05-12T12:35:02Z</updated>
<published>2026-05-11T00:00:00Z</published>
<summary type="text">ConLoan-LV: A Contrastive Dataset for Latvian Language Loanwords, Code-switching, and Named Entities
Štekeļs, Jorens
ConLoan-LV is a multi-purpose contrastive dataset designed for the classification and analysis of Latvian language loanwords, code-switching, and named entities. Replicating and extending the ConLoan methodology, the dataset contains 353 manually validated sentences in the baseline version and 676 in the extended version, with all sentences sourced from the LVK2022 corpus. Each entry is enriched with labels for material borrowings (LOAN), while the extended version adds labels for code-switching (CS) and named entities (NE). Furthermore, the dataset includes native-language semantic equivalents for loanwords and English translations, providing a parallel structure for comparative analysis. This resource is intended for training and benchmarking language models in identifying non-native lexical elements within Latvian language texts.
</summary>
<dc:date>2026-05-11T00:00:00Z</dc:date>
</entry>
<entry>
<title>Dictionary of Contemporary Latvian Language (MLVV) (2026-04-08)</title>
<link href="http://hdl.handle.net/20.500.12574/157" rel="alternate"/>
<author>
<name>Zuicena, Ieva</name>
</author>
<author>
<name>Auziņa, Ieva</name>
</author>
<author>
<name>Briede, Santa</name>
</author>
<author>
<name>Jansone, Irēna Ilga</name>
</author>
<author>
<name>Kuplā, Ieva</name>
</author>
<author>
<name>Lejniece, Gunta</name>
</author>
<author>
<name>Migla, Ilga</name>
</author>
<author>
<name>Oldere, Laimdota</name>
</author>
<author>
<name>Ozola, Ārija</name>
</author>
<author>
<name>Požarnova, Vija</name>
</author>
<author>
<name>Rapa, Sanda</name>
</author>
<author>
<name>Roze, Anitra</name>
</author>
<author>
<name>Šmidebergs, Imants</name>
</author>
<author>
<name>Šnē, Dorisa</name>
</author>
<author>
<name>Šnē, Māra</name>
</author>
<author>
<name>Timuška, Agris</name>
</author>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/157</id>
<updated>2026-04-20T15:07:04Z</updated>
<published>2026-04-08T00:00:00Z</published>
<summary type="text">Dictionary of Contemporary Latvian Language (MLVV) (2026-04-08)
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs
“Dictionary of Contemporary Latvian Language” (MLVV), developed by the Latvian Language Institute of the Faculty of Humanities at the University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and encyclopaedias and dictionaries from the last decade. Some of the dictionary content is machine-readable.
</summary>
<dc:date>2026-04-08T00:00:00Z</dc:date>
</entry>
<entry>
<title>Tēzaurs.lv 2026 (Spring Edition)</title>
<link href="http://hdl.handle.net/20.500.12574/156" rel="alternate"/>
<author>
<name>Spektors, Andrejs</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Grūzītis, Normunds</name>
</author>
<author>
<name>Paikens, Pēteris</name>
</author>
<author>
<name>Rituma, Laura</name>
</author>
<author>
<name>Saulīte, Baiba</name>
</author>
<author>
<name>Nešpore-Bērzkalne, Gunta</name>
</author>
<author>
<name>Lokmane, Ilze</name>
</author>
<author>
<name>Klints, Agute</name>
</author>
<author>
<name>Stāde, Madara</name>
</author>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Auziņa, Ilze</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<author>
<name>Darģis, Roberts</name>
</author>
<author>
<name>Bārzdiņš, Guntis</name>
</author>
<id>http://hdl.handle.net/20.500.12574/156</id>
<updated>2026-04-20T15:05:36Z</updated>
<published>2026-04-08T00:00:00Z</published>
<summary type="text">Tēzaurs.lv 2026 (Spring Edition)
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data.

This dataset is available as open data in TEI/XML and LMF/XML formats, as well as a PostgreSQL database dump.
</summary>
<dc:date>2026-04-08T00:00:00Z</dc:date>
</entry>
<entry>
<title>Latvian Communist Leaflet Corpus (1934–1940)</title>
<link href="http://hdl.handle.net/20.500.12574/154" rel="alternate"/>
<author>
<name>Babaņins, Vladislavs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/154</id>
<updated>2026-04-07T07:39:09Z</updated>
<published>2026-03-30T00:00:00Z</published>
<summary type="text">Latvian Communist Leaflet Corpus (1934–1940)
Babaņins, Vladislavs
The Latvian Communist Leaflet Corpus (1934–1940) is a structured digital corpus of underground political leaflets produced by illegal communist organizations in Latvia between January 1934 and July 1940, covering the final months of the parliamentary period and the authoritarian regime of Kārlis Ulmanis. The corpus contains 251 unique leaflet texts. In total, there are 458 records, of which 273 include transcribed text (including textual variants) and the remainder are metadata-only records for leaflets not reproduced in the source edition. The transcribed texts have been manually reviewed and corrected to reduce transcription errors. Each record includes structured metadata fields such as title, author, date, print run, typography name, production method, original language, and text language. The corpus also includes manually compiled topic annotations and inferred location data as additional research annotations.
</summary>
<dc:date>2026-03-30T00:00:00Z</dc:date>
</entry>
<entry>
<title>Historical Dictionary of Latvian Given Names</title>
<link href="http://hdl.handle.net/20.500.12574/152" rel="alternate"/>
<author>
<name>Siliņa-Piņķe, Renāte</name>
</author>
<author>
<name>Rapa, Sanda</name>
</author>
<author>
<name>Jansone, Ilga</name>
</author>
<author>
<name>Kazakevičs, Ņikita</name>
</author>
<id>http://hdl.handle.net/20.500.12574/152</id>
<updated>2026-02-18T18:10:00Z</updated>
<published>2026-01-01T00:00:00Z</published>
<summary type="text">Historical Dictionary of Latvian Given Names
Siliņa-Piņķe, Renāte; Rapa, Sanda; Jansone, Ilga; Kazakevičs, Ņikita
"Historical Dictionary of Latvian Given Names" (LPVV) is an online scientific dictionary that collects and describes Latvian given names documented in written sources spanning more than eight centuries. This dictionary focuses on names that entered the Latvian given name system before the end of the 19th century.
</summary>
<dc:date>2026-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Tēzaurs.lv 2026 (Winter Edition)</title>
<link href="http://hdl.handle.net/20.500.12574/151" rel="alternate"/>
<author>
<name>Spektors, Andrejs</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Grūzītis, Normunds</name>
</author>
<author>
<name>Paikens, Pēteris</name>
</author>
<author>
<name>Rituma, Laura</name>
</author>
<author>
<name>Saulīte, Baiba</name>
</author>
<author>
<name>Nešpore-Bērzkalne, Gunta</name>
</author>
<author>
<name>Lokmane, Ilze</name>
</author>
<author>
<name>Klints, Agute</name>
</author>
<author>
<name>Stāde, Madara</name>
</author>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Auziņa, Ilze</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<author>
<name>Darģis, Roberts</name>
</author>
<author>
<name>Bārzdiņš, Guntis</name>
</author>
<id>http://hdl.handle.net/20.500.12574/151</id>
<updated>2026-04-20T15:05:36Z</updated>
<published>2025-12-21T00:00:00Z</published>
<summary type="text">Tēzaurs.lv 2026 (Winter Edition)
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data.

This dataset is available as open data in TEI/XML and LMF/XML formats, as well as a PostgreSQL database dump.
</summary>
<dc:date>2025-12-21T00:00:00Z</dc:date>
</entry>
<entry>
<title>Dictionary of Contemporary Latvian Language (MLVV) (2025-12-21)</title>
<link href="http://hdl.handle.net/20.500.12574/150" rel="alternate"/>
<author>
<name>Zuicena, Ieva</name>
</author>
<author>
<name>Auziņa, Ieva</name>
</author>
<author>
<name>Briede, Santa</name>
</author>
<author>
<name>Jansone, Irēna Ilga</name>
</author>
<author>
<name>Kuplā, Ieva</name>
</author>
<author>
<name>Lejniece, Gunta</name>
</author>
<author>
<name>Migla, Ilga</name>
</author>
<author>
<name>Oldere, Laimdota</name>
</author>
<author>
<name>Ozola, Ārija</name>
</author>
<author>
<name>Požarnova, Vija</name>
</author>
<author>
<name>Rapa, Sanda</name>
</author>
<author>
<name>Roze, Anitra</name>
</author>
<author>
<name>Šmidebergs, Imants</name>
</author>
<author>
<name>Šnē, Dorisa</name>
</author>
<author>
<name>Šnē, Māra</name>
</author>
<author>
<name>Timuška, Agris</name>
</author>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Znotiņš, Artūrs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/150</id>
<updated>2026-04-20T15:07:04Z</updated>
<published>2025-12-21T00:00:00Z</published>
<summary type="text">Dictionary of Contemporary Latvian Language (MLVV) (2025-12-21)
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs
“Dictionary of Contemporary Latvian Language” (MLVV), developed by the Latvian Language Institute of the Faculty of Humanities at the University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and encyclopaedias and dictionaries from the last decade. Some of the dictionary content is machine-readable.
</summary>
<dc:date>2025-12-21T00:00:00Z</dc:date>
</entry>
<entry>
<title>Dictionary of Latvian Literary Language (LLVV) (2025-12-21)</title>
<link href="http://hdl.handle.net/20.500.12574/149" rel="alternate"/>
<author>
<name>Ceplītis, Laimdots</name>
</author>
<author>
<name>Spektors, Andrejs</name>
</author>
<id>http://hdl.handle.net/20.500.12574/149</id>
<updated>2025-12-22T17:50:44Z</updated>
<published>2025-12-21T00:00:00Z</published>
<summary type="text">Dictionary of Latvian Literary Language (LLVV) (2025-12-21)
Ceplītis, Laimdots; Spektors, Andrejs
In the 20th century, the Latvian Language Institute of the University of Latvia (UL LLI, formerly the Language and Literature Institute of the Academy of Sciences) produced the largest lexicographic source of the Latvian language, which was digitized (2001–2022) by the Institute of Mathematics and Computer Sciences, UL. The dictionary contains words of standard Latvian used from the 1870s to the end of the 20th century, when the work on the dictionary was carried out (1972–1996). The dictionary was created using words and example sentences from fiction, scientific texts, newswire and folklore.
</summary>
<dc:date>2025-12-21T00:00:00Z</dc:date>
</entry>
<entry>
<title>Latvian word frequency dataset</title>
<link href="http://hdl.handle.net/20.500.12574/148" rel="alternate"/>
<author>
<name>Grasmanis, Mikus</name>
</author>
<author>
<name>Valkovska, Baiba</name>
</author>
<author>
<name>Levāne-Petrova, Kristīne</name>
</author>
<id>http://hdl.handle.net/20.500.12574/148</id>
<updated>2025-12-19T13:53:53Z</updated>
<published>2025-12-19T00:00:00Z</published>
<summary type="text">Latvian word frequency dataset
Grasmanis, Mikus; Valkovska, Baiba; Levāne-Petrova, Kristīne
This frequency list contains the 25,000 most frequent Latvian lemmas, obtained from 18 morphologically annotated corpora totalling 1.5 billion tokens from the Latvian National Corpora Collection (Korpuss.lv) and Tēzaurs.lv. Supporting academic and practical applications, including language teaching, machine translation, and speech technologies, the list provides a broader and more representative view of the modern Latvian lexicon and usage trends.
</summary>
<dc:date>2025-12-19T00:00:00Z</dc:date>
</entry>
<entry>
<title>Latvian and Latgalian Parallel Sample Treebank (Cairo)</title>
<link href="http://hdl.handle.net/20.500.12574/143" rel="alternate"/>
<author>
<name>Pretkalniņa, Lauma</name>
</author>
<author>
<name>Nešpore-Bērzkalne, Gunta</name>
</author>
<author>
<name>Pokratniece, Kristīne</name>
</author>
<author>
<name>Rituma, Laura</name>
</author>
<id>http://hdl.handle.net/20.500.12574/143</id>
<updated>2025-11-26T15:23:28Z</updated>
<published>2025-11-15T00:00:00Z</published>
<summary type="text">Latvian and Latgalian Parallel Sample Treebank (Cairo)
Pretkalniņa, Lauma; Nešpore-Bērzkalne, Gunta; Pokratniece, Kristīne; Rituma, Laura
This corpus contains 20 Latvian and Latgalian sample sentences annotated with the same hybrid annotation model used in the Latvian Treebank. The sentences are the same ones used in the "Cairo" sample corpora that showcase annotation choices for Universal Dependencies treebanks, and this corpus serves as the basis for both the UD-Latvian_Cairo and UD-Latgalian_Cairo corpora. Based on the experience with these sentences, preliminary UD annotation documentation for Latgalian was also prepared. This work allows Latgalian UD data to be used to assess how multilingual tools perform on a language that has no training data, and it provides a base for further treebank development.
</summary>
<dc:date>2025-11-15T00:00:00Z</dc:date>
</entry>
</feed>
