CLARIN-LV digital library at IMCS, University of Latvia

The CLARIN-LV digital repository system captures, stores, indexes, preserves, and distributes digital research material. https://repository.clarin.lv:443/repository/xmlui 2024-07-06T09:28:26Z 2024-07-06T09:28:26Z

Dictionary of Contemporary Latvian Language (MLVV) (2024-06-21) Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs http://hdl.handle.net/20.500.12574/108 2024-07-04T12:41:24Z 2024-06-21T00:00:00Z

Dictionary of Contemporary Latvian Language (MLVV) (2024-06-21) Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs The “Dictionary of Contemporary Latvian Language” (MLVV), developed by the UL Latvian Language Institute, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and encyclopaedias and dictionaries from the last decade. Some of the dictionary content is machine-readable.

2024-06-21T00:00:00Z Tēzaurs.lv 2024 (Summer Edition) (2024-06-21) Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis http://hdl.handle.net/20.500.12574/107 2024-07-04T11:46:13Z 2024-06-21T00:00:00Z

Tēzaurs.lv 2024 (Summer Edition) (2024-06-21) Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 403,000 entries based on 345 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data. This dataset is available as open data in TEI/XML and LMF/XML formats. If you are interested in acquiring the corresponding PostgreSQL database dump, please send a request to info@tezaurs.lv.

2024-06-21T00:00:00Z LVTB - Latvian Treebank v2.14 (2024-05-15) Rituma, Laura; Pretkalniņa, Lauma; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Grūzītis, Normunds; Znotiņš, Artūrs http://hdl.handle.net/20.500.12574/106 2024-05-15T10:07:01Z 2024-05-15T00:00:00Z

LVTB - Latvian Treebank v2.14 (2024-05-15) Rituma, Laura; Pretkalniņa, Lauma; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Grūzītis, Normunds; Znotiņš, Artūrs Latvian Treebank (LVTB) has been under development since 2010. It is manually annotated according to a hybrid dependency-constituency grammar model. This version of LVTB contains the data used for deriving the corresponding version of Latvian UD Treebank (UDLV-LVTB).

2024-05-15T00:00:00Z Corpus of Contemporary Latgalian Speech Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura http://hdl.handle.net/20.500.12574/105 2024-05-08T12:41:29Z 2024-01-01T00:00:00Z

Corpus of Contemporary Latgalian Speech Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech, including field research recordings, interviews, TV and radio broadcasts.

2024-01-01T00:00:00Z Tēzaurs.lv 2024 (Spring Edition) Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis http://hdl.handle.net/20.500.12574/104 2024-07-04T11:46:13Z 2024-03-01T00:00:00Z

Tēzaurs.lv 2024 (Spring Edition) Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 400,000 entries based on 346 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data. This dataset is available as open data in TEI/XML and LMF/XML formats. If you are interested in acquiring the corresponding PostgreSQL database dump, please send a request to info@tezaurs.lv.

2024-03-01T00:00:00Z Tēzaurs.lv 2024 (Winter Edition) Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis http://hdl.handle.net/20.500.12574/103 2024-04-05T12:20:03Z 2023-12-01T00:00:00Z

Tēzaurs.lv 2024 (Winter Edition) Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 397,000 entries based on 346 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data. This dataset is available as open data in TEI/XML and LMF/XML formats. If you are interested in acquiring the corresponding PostgreSQL database dump, please send a request to info@tezaurs.lv.

2023-12-01T00:00:00Z Dictionary of Contemporary Latvian Language (MLVV) (2024-04-03) Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Zuicena, Ieva; Pretkalniņa, Lauma; Auziņa, Ieva; Briede, Santa; Timuška, Agris; Jansone, Irēna Ilga; Rapa, Sanda http://hdl.handle.net/20.500.12574/101 2024-07-04T12:41:24Z 2024-03-01T00:00:00Z

Dictionary of Contemporary Latvian Language (MLVV) (2024-04-03) Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Zuicena, Ieva; Pretkalniņa, Lauma; Auziņa, Ieva; Briede, Santa; Timuška, Agris; Jansone, Irēna Ilga; Rapa, Sanda The “Dictionary of Contemporary Latvian Language” (MLVV), developed by the UL Latvian Language Institute, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and encyclopaedias and dictionaries from the last decade. Some of the dictionary content is machine-readable.

2024-03-01T00:00:00Z Dictionary of Latvian Literary Language (LLVV) (2024-02) Ceplītis, Laimdots; Spektors, Andrejs http://hdl.handle.net/20.500.12574/100 2024-04-04T14:33:42Z 2024-03-01T00:00:00Z

Dictionary of Latvian Literary Language (LLVV) (2024-02) Ceplītis, Laimdots; Spektors, Andrejs In the 20th century, the UL Latvian Language Institute (formerly the Language and Literature Institute of the Academy of Sciences) produced the largest lexicographic source of the Latvian language, which was digitized (2001–2022) by the UL Institute of Mathematics and Computer Sciences. The dictionary contains words of standard Latvian used from the 1870s up to the end of the 20th century; the work on the dictionary itself was carried out in 1972–1996. The dictionary was created using words and example sentences from fiction, scientific texts, newswire and folklore.

2024-03-01T00:00:00Z LATE Dev&Test Set V1 for Latvian ASR Darģis, Roberts; Znotiņš, Artūrs; Auziņa, Ilze; Rābante-Buša, Guna http://hdl.handle.net/20.500.12574/99 2024-05-02T10:17:32Z 2024-03-01T00:00:00Z

LATE Dev&Test Set V1 for Latvian ASR Darģis, Roberts; Znotiņš, Artūrs; Auziņa, Ilze; Rābante-Buša, Guna A Latvian speech corpus for the development (validation), testing and comparison of ASR models. The audio data is segmented and aligned with the corresponding orthographic transcriptions, which are human-verified. The LATE-media subset contains both verbatim (raw) and formatted transcriptions (with punctuation, capitalisation, numbers, abbreviations, etc.), while the LATE-conversations subset currently contains only verbatim transcriptions (no punctuation, capitalisation, etc.). The dataset consists of:
- 5 hours of broadcast media recordings, both spontaneous and prepared speech (2.5h dev set, 2.5h test set);
- 5 hours of conversational speech recordings, spontaneous speech (2.5h dev set, 2.5h test set).

2024-03-01T00:00:00Z SELMA Latvian NER Dataset Rābante-Buša, Guna; Grūzītis, Normunds; Bārzdiņš, Guntis; Mendes, Afonso http://hdl.handle.net/20.500.12574/98 2024-03-25T12:55:24Z 2022-03-01T00:00:00Z

SELMA Latvian NER Dataset Rābante-Buša, Guna; Grūzītis, Normunds; Bārzdiņš, Guntis; Mendes, Afonso A dataset of hierarchically annotated named entities in Latvian news articles (provided by the Latvian Information Agency LETA) for the development and evaluation of transition-based parsers for named entity recognition (NER).

2022-03-01T00:00:00Z