| dc.contributor.author |
Pretkalniņa, Lauma |
| dc.contributor.author |
Andronova, Everita |
| dc.contributor.author |
Frīdenberga, Anna |
| dc.contributor.author |
Skrūzmane, Elga |
| dc.contributor.author |
Siliņa-Piņķe, Renāte |
| dc.contributor.author |
Trumpa, Anta |
| dc.contributor.author |
Vanags, Pēteris |
| dc.date.accessioned |
2025-11-06T12:29:40Z |
| dc.date.available |
2025-11-06T12:29:40Z |
| dc.date.issued |
2025 |
| dc.identifier.uri |
http://hdl.handle.net/20.500.12574/140 |
| dc.description |
The spelling normalization tool (pilot converter) converts any 18th-century Latvian Unicode-encoded text into a more modern spelling. This version of the tool normalizes word roots, so it is intended to facilitate user-friendly corpus search in tools such as Sketch Engine. The tool consists of 134 universal rules. It has been successfully tested on various 18th-century sources from the SENIE corpus, achieving an average accuracy of 94%. |
| dc.language.iso |
lav |
| dc.publisher |
AiLab IMCS UL |
| dc.publisher |
Latvian Language Institute, Faculty of Humanities, University of Latvia |
| dc.relation.isreferencedby |
https://www.researchgate.net/profile/Everita-Andronova/publication/389224172_Latviesu_18_gadsimta_tekstu_pilotkonvertors_izveide_rezultatu_novertesana_un_izmantosanas_iespejas/links/67cae49bcc055043ce6ecdde/Latviesu-18-gadsimta-tekstu-pilotkonvertors-izveide-rezultatu-novertesana-un-izmantosanas-iespejas.pdf |
| dc.rights |
GNU General Public License, version 3 |
| dc.rights.uri |
http://opensource.org/licenses/GPL-3.0 |
| dc.rights.label |
PUB |
| dc.source.uri |
https://senie.korpuss.lv |
| dc.subject |
18th century Latvian texts |
| dc.subject |
historical spelling normalisation |
| dc.subject |
spelling normalisation |
| dc.subject |
user-friendly search |
| dc.title |
Spelling normalization tool for Latvian 18th century texts |
| dc.type |
toolService |
| metashare.ResourceInfo#ContentInfo.detailedType |
tool |
| metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
| has.files |
yes |
| branding |
CLARIN Centre of Latvian language resources and tools |
| contact.person |
Lauma Pretkalniņa lauma@ailab.lv AiLab IMCS UL |
| contact.person |
Anta Trumpa anta.trumpa@lu.lv Latvian Language Institute, Faculty of Humanities, University of Latvia |
| sponsor |
Ministry of Education and Science VPP-IZM-DH-2022/1-0002 Towards Development of Open and FAIR Digital Humanities Ecosystem in Latvia (DHELI) nationalFunds |
| files.size |
8535 |
| files.count |
2 |