SELMA Open Source Platform (UC0)

 
CLARIN Centre of Latvian language resources and tools
  Authors
Goško, Didzis and Bārzdiņš, Guntis
  Item identifier
http://hdl.handle.net/20.500.12574/97
 Project URL
https://selma-project.eu
 Demo URL
https://selma.ailab.lv
 Referenced by
https://selma-project.eu/2023/10/18/the-selma-open-source-platform/
https://github.com/SELMA-project/UC0-OpenSource
 Date issued
2024-02
 Type
toolService
 Description
The SELMA Open-Source Software (OSS) offers effective means to test and compare the performance of various language models used in multilingual media monitoring and content production. The SELMA OSS Platform (also referred to as Use Case 0, UC0, or the Basic Testing and Configuration Interface) provides:
  • automatic speech recognition (ASR) from audio/video files,
  • punctuation and capitalization of the transcribed text,
  • machine translation (MT) into a target language,
  • text-to-speech synthesis (TTS) and voice-over generation.
To provide this functionality, the demonstrator release uses these multilingual open-source models: OpenAI Whisper (ASR), Meta MMS (TTS, ASR), Meta M2M-100 (MT). It thus gives easy access to such open large language models. Developers can use the SELMA Platform to combine and test alternative language models before integrating them into end-user applications. Journalists and media producers can also use it directly as an entry-level application to transcribe their recordings, generate subtitles and voice-over, or generate a podcast from an input text. The demonstrator of the SELMA OSS Platform requires neither registration nor authentication, and it does not store any content, original or generated, after the user closes the session.
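The processing chain described above (ASR, then punctuation, then MT, then TTS) can be pictured as a sequence of swappable stages. The following is a minimal sketch only, not the SELMA implementation: every function name here (`stub_asr`, `stub_punctuate`, `make_stub_mt`, `stub_tts`, `run_pipeline`) is hypothetical, and the stages are canned stubs standing in for real models such as Whisper, MMS, or M2M-100.

```python
from typing import Any, Callable, Dict, List

# A stage is any callable that turns one payload into the next one.
Stage = Callable[[Any], Any]


def stub_asr(audio_path: str) -> str:
    # Hypothetical stand-in for an ASR model: ignores the file and
    # returns a canned, unpunctuated, lower-case transcript.
    return "hello world this is a test"


def stub_punctuate(text: str) -> str:
    # Hypothetical stand-in for punctuation/capitalization restoration:
    # upper-cases the first letter and appends a full stop.
    text = text.strip()
    return text[:1].upper() + text[1:] + "."


def make_stub_mt(target_lang: str) -> Stage:
    # Hypothetical stand-in for an MT model: tags the text with the
    # target language code instead of translating it.
    def translate(text: str) -> str:
        return f"[{target_lang}] {text}"

    return translate


def stub_tts(text: str) -> Dict[str, Any]:
    # Hypothetical stand-in for a TTS model: returns the spoken text
    # together with empty placeholder audio bytes.
    return {"text": text, "audio": b""}


def run_pipeline(stages: List[Stage], payload: Any) -> Any:
    # Feed the output of each stage into the next one, in order.
    for stage in stages:
        payload = stage(payload)
    return payload


if __name__ == "__main__":
    stages = [stub_asr, stub_punctuate, make_stub_mt("lv"), stub_tts]
    print(run_pipeline(stages, "clip.wav")["text"])
```

Because each stage is just a callable, comparing alternative models in this sketch amounts to replacing one entry in the `stages` list and running the same input again.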
 Publisher
AiLab IMCS UL
 Acknowledgement

European Commission

Project code: 957017

Project name: SELMA – Stream Learning for Multilingual Knowledge Transfer

 Subject(s)
ASR; TTS; MT; multilingual content production; multilingual media monitoring; LLM
 Collection(s)
Language resources and tools of AiLab IMCS UL