Latvian Web Corpus 2007

 
CLARIN Centre of Latvian language resources and tools
  Authors
Džeriņš, Jānis and Džonsons, Kristaps
  Item identifier
http://hdl.handle.net/20.500.12574/46
 Demo URL
http://nosketch.korpuss.lv/#dashboard?corpname=timeklis
 Referenced by
http://www.semti-kamols.lv/doc_upl/Kamols-Kaunas-paper-2.pdf
 Date issued
2007
 Type
corpus, text
 Size
123,000,000 tokens
 Language(s)
Latvian
 Description
The Latvian Web Corpus 2007 contains 700,000 Latvian web pages published before 2005. The corpus is automatically annotated, and repeated content is excluded.
 Publisher
AiLab IMCS UL
 Acknowledgement

State research programme

Project code: State research programme

Project name: Research and Development of the Semantic Web Technologies for Latvia (SemTi-Kamols)

 Subject(s)
text web morphology
 Collection(s)
Language resources and tools of AiLab IMCS UL