Mapping voices to their descriptions

Organisation name
TartuNLP
Summary
DALL-E 2 and Stable Diffusion generate images from text using joint vector representations of text and images. The aim of this thesis is to extend this approach to voices that can be used in text-to-speech synthesis (such as Neurokõne, https://neurokone.ee) and to train neural network models that generate voices from their descriptions.
Thesis defence year
2023-2024
Supervisor
Mark Fishel
Communication language(s)
Estonian, English
Requirements for the applicant
Level
Bachelor's, Master's
Keywords
#neurokõne #texttospeech #voicegeneration #transformers

Application contact

Name
Mark Fishel
Tel
E-mail
fishel@ut.ee