Open LLM for Baltic languages

Organization
TartuNLP
Abstract
ChatGPT is cool but closed behind an API. Llama 2 is cool and open, but does not support Estonian. This thesis will try to train an LLM for Estonian and other Baltic and neighboring languages on the LUMI supercomputer.
Graduation thesis defence year
2023-2024
Supervisor
Hele-Andra Kuulmets, Mark Fishel
Spoken language(s)
Estonian, English
Requirements for candidates
Python, HPC, text data processing, neural net / transformer training
Level
Master's
Keywords
#chatgpt #llm

Contact for application
Name
Mark Fishel
Phone
E-mail
fishel@ut.ee