Balanced Corpus of Modern Latvian (LVK2018)

 
CLARIN Centre of Latvian language resources and tools
  Authors
Levāne-Petrova, Kristīne and Darģis, Roberts
  Item identifier
http://hdl.handle.net/20.500.12574/11
 Project URL
http://www.korpuss.lv/id/LVK2018
 Demo URL
http://nosketch.korpuss.lv/#dashboard?corpname=LVK2018
 Referenced by
https://doi.org/10.22364/vnf.10.12
 Date issued
2018
 Type
corpus, text
 Size
12289240 tokens, 9813014 words, 20864 documents
 Language(s)
Latvian
 Description
LVK2018 is a balanced and representative 10-million-word text corpus of modern Latvian. It covers five genres: journalism (60%), fiction (20%), scientific texts (10%), legal texts (8%), and transcriptions (2%). LVK2018 is an extended version of LVK2013.
 Publisher
AiLab IMCS UL
 Acknowledgement

European Regional Development Fund

Project code: 1.1.1.1/16/A/219

Project name: Full Stack of Language Resources for Natural Language Understanding and Generation in Latvian

 Subject(s)
text corpus, general, representative, morphology, reference corpus
 Collection(s)
Language resources and tools of AiLab IMCS UL