dc.contributor.author Pretkalniņa, Lauma
dc.contributor.author Andronova, Everita
dc.contributor.author Frīdenberga, Anna
dc.contributor.author Skrūzmane, Elga
dc.contributor.author Siliņa-Piņķe, Renāte
dc.contributor.author Trumpa, Anta
dc.contributor.author Vanags, Pēteris
dc.date.accessioned 2025-11-06T12:29:40Z
dc.date.available 2025-11-06T12:29:40Z
dc.date.issued 2025
dc.identifier.uri http://hdl.handle.net/20.500.12574/140
dc.description The spelling normalization tool (pilot converter) converts any 18th-century Latvian Unicode-encoded text into a more modern spelling. This version of the tool normalizes the roots of words, so it is intended to facilitate user-friendly corpus search in tools such as Sketch Engine. The tool consists of 134 universal rules. It has been tested on various 18th-century sources from the SENIE corpus, achieving an average accuracy of 94%.
dc.language.iso lav
dc.publisher AiLab IMCS UL
dc.publisher Latvian Language Institute, Faculty of Humanities, University of Latvia
dc.relation.isreferencedby https://www.researchgate.net/profile/Everita-Andronova/publication/389224172_Latviesu_18_gadsimta_tekstu_pilotkonvertors_izveide_rezultatu_novertesana_un_izmantosanas_iespejas/links/67cae49bcc055043ce6ecdde/Latviesu-18-gadsimta-tekstu-pilotkonvertors-izveide-rezultatu-novertesana-un-izmantosanas-iespejas.pdf
dc.rights GNU General Public Licence, version 3
dc.rights.uri http://opensource.org/licenses/GPL-3.0
dc.rights.label PUB
dc.source.uri https://senie.korpuss.lv
dc.subject 18th century Latvian texts
dc.subject historical spelling normalisation
dc.subject spelling normalisation
dc.subject user-friendly search
dc.title Spelling normalization tool for Latvian 18th century texts
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding CLARIN Centre of Latvian language resources and tools
contact.person Lauma Pretkalniņa lauma@ailab.lv AiLab IMCS UL
contact.person Anta Trumpa anta.trumpa@lu.lv Latvian Language Institute, Faculty of Humanities, University of Latvia
sponsor Ministry of Education and Science VPP-IZM-DH-2022/1-0002 Towards Development of Open and FAIR Digital Humanities Ecosystem in Latvia (DHELI) nationalFunds
files.size 8535
files.count 2


 Files in this item

This item is Publicly Available and licensed under: GNU General Public Licence, version 3
Name 18thCentConverter.zip
Size 7.53 KB
Format application/zip
Description Executable Perl source code with run examples
MD5 d4a6c1918fbb9c1f3295fd9378d95bca
 File Preview  
  • LvSenie
    • Translit
      • SimpleTranslitTables.pm (6 kB)
      • Transliterator.pm (8 kB)
      • NoreplaceCoding.pm (1 kB)
    • run18thCentTransliterator-sample.bat (411 B)
    • readme.md (818 B)
    • test.txt (445 B)
Name readme.md
Size 827 bytes
Format Unknown
Description Short documentation
MD5 b8ea82166bd4f937075d8ae772eb3de3