dc.contributor.author Spektors, Andrejs
dc.contributor.author Pretkalniņa, Lauma
dc.contributor.author Grūzītis, Normunds
dc.contributor.author Paikens, Pēteris
dc.contributor.author Rituma, Laura
dc.contributor.author Saulīte, Baiba
dc.contributor.author Nešpore-Bērzkalne, Gunta
dc.contributor.author Lokmane, Ilze
dc.contributor.author Klints, Agute
dc.contributor.author Stāde, Madara
dc.contributor.author Grasmanis, Mikus
dc.contributor.author Auziņa, Ilze
dc.contributor.author Znotiņš, Artūrs
dc.contributor.author Darģis, Roberts
dc.contributor.author Bārzdiņš, Guntis
dc.date.accessioned 2026-04-20T15:05:36Z
dc.date.available 2026-04-20T15:05:36Z
dc.date.issued 2026-04-08
dc.identifier.uri http://hdl.handle.net/20.500.12574/156
dc.description Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data. This dataset is available as open data in TEI/XML and LMF/XML formats, as well as a PostgreSQL database dump.
dc.language.iso lav
dc.publisher AiLab IMCS UL
dc.relation.isreferencedby http://www.lrec-conf.org/proceedings/lrec2016/pdf/1095_Paper.pdf
dc.relation.isreferencedby http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.300.pdf
dc.relation.isreferencedby https://elex.link/elex2023/wp-content/uploads/89.pdf
dc.relation.replaces http://hdl.handle.net/20.500.12574/151
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri https://tezaurs.lv
dc.subject thesaurus
dc.subject dictionary
dc.subject lexicon
dc.title Tēzaurs.lv 2026 (Spring Edition)
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType computationalLexicon
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN Centre of Latvian language resources and tools
demo.uri https://tezaurs.lv
contact.person Normunds Grūzītis normundsg@ailab.lv Normunds Grūzītis
sponsor Ministry of Education and Science VPP-IZM-DH-2020/1-0001 Digital Resources for Humanities: Integration and Development nationalFunds
sponsor Latvian Council of Science lzp-2019/1-0464 Latvian WordNet and Word Sense Disambiguation nationalFunds
sponsor Ministry of Education and Science VPP-IZM-2018/2-0002 Latvian Language nationalFunds
sponsor Ministry of Education and Science VPP-LETONIKA-2021/1-0006 Research on Modern Latvian Language and Development of Language Technology nationalFunds
sponsor Latvian Council of Science lzp-2022/1-0443 Advancing Latvian computational lexical resources for natural language understanding and generation nationalFunds
size.info 413577 entries
files.size 304533102
files.count 5


 Files in this item

 Download all files in item (290.43 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name
tezaurs_2026_2_tei.xml.zip
Size
41.55 MB
Format
application/zip
Description
Tezaurs.lv open data in the TEI/XML format (https://tei-c.org/release/doc/tei-p5-doc/en/html/DI.html)
MD5
1258174f9f3da8c190b2e1f8f1747b0c
File Preview
    • tezaurs_2026_2_tei.xml (391 MB)
Name
tezaurs_2026_2_wordforms_tei.xml.zip
Size
169.6 MB
Format
application/zip
Description
Tezaurs.lv open data (appendix: wordforms) in the TEI/XML format (https://tei-c.org/release/doc/tei-p5-doc/en/html/DI.html)
MD5
2f067c5454cead1c5763a27a4d0bc7ad
File Preview
    • tezaurs_2026_2_wordforms_tei.xml (13 GB)
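Because the unpacked TEI files are large (13 GB for the wordform appendix), reading them as a stream is likely safer than loading the whole document. The following is a minimal sketch, not the publisher's tooling. It assumes the files use the standard TEI namespace and mark dictionary entries with `<entry>` and headwords with `<orth>`, as the TEI dictionaries chapter describes; the element names actually used in this release should be checked against the data. The function name `iter_headwords` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace. The element names below are an assumption
# about this dataset, based on the TEI dictionaries chapter.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ENTRY = f"{{{TEI_NS}}}entry"
ORTH = f"{{{TEI_NS}}}orth"


def iter_headwords(source):
    """Yield the first <orth> text of each <entry>, parsing incrementally.

    `source` is a filename or a binary file-like object. Entries without
    an <orth> element yield None.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == ENTRY:
            orth = elem.find(f".//{ORTH}")
            yield orth.text if orth is not None else None
            # Drop the entry's children so memory use stays bounded.
            elem.clear()
```

The archive does not need to be unpacked first: `zipfile.ZipFile("tezaurs_2026_2_wordforms_tei.xml.zip").open("tezaurs_2026_2_wordforms_tei.xml")` returns a binary stream that can be passed to `iter_headwords` directly, assuming the member name matches the preview listing.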
Name
tezaurs_2026_2_lmf.xml.zip
Size
7.35 MB
Format
application/zip
Description
Latvian WordNet open data in the LMF/XML format (https://globalwordnet.github.io/schemas/#xml)
MD5
1d68843b4d4c60e6532de0d9f15922bd
File Preview
    • tezaurs_2026_2_lmf.xml (30 MB)
Name
tezaurs_2026_02.ispell.zip
Size
8.09 MB
Format
application/zip
Description
Newline-separated, filtered wordform list.
MD5
36e6103dc3539ced9fb3906701efa3fd
File Preview
    • tezaurs_2026_02.ispell (67 MB)
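Since the list is described as newline-separated, it can be loaded into a set for quick membership checks. This is a minimal sketch under two assumptions that the record does not state: that each line holds exactly one wordform with no extra flags, and that the file is UTF-8 encoded. The function name `load_wordforms` is hypothetical.

```python
def load_wordforms(lines):
    """Build a set of wordforms from an iterable of text lines.

    Surrounding whitespace is stripped and blank lines are skipped.
    Assumes one wordform per line (an assumption about the file layout).
    """
    return {line.strip() for line in lines if line.strip()}
```

With the unpacked file, `open("tezaurs_2026_02.ispell", encoding="utf-8")` can be passed straight to `load_wordforms`, after which `"vārds" in wordforms` tests whether a form is listed.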
Name
tezaurs_2026_02-public.pgsql.gz
Size
63.84 MB
Format
application/gzip
Description
PostgreSQL DB dump.
MD5
da7d2d8fec9cdf9c36fcc6bdcd7bc8da
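The MD5 values listed for each file can be used to check that a download arrived intact. The sketch below shows one way to do this with the Python standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and the file is read in chunks so that larger archives do not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 with the value listed in the record."""
    return md5_of_file(path) == listed_md5.strip().lower()
```

For example, `matches_listed_md5("tezaurs_2026_2_lmf.xml.zip", "1d68843b4d4c60e6532de0d9f15922bd")` should return `True` for an uncorrupted copy, assuming the file was saved locally under its listed name.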