Emergent Theory of Mind (ToM) from Token Merging

Wai Tik Chan
Large language models built with deep learning methods have achieved surprising, human-level performance on a variety of tasks. In particular, they show signs of Artificial General Intelligence (AGI) and of understanding false beliefs from language training alone. This suggests a strong relationship between Theory of Mind (ToM) and language training, a relationship that has long been studied in psychology experiments. However, there is a lack of evidence about which factors affect language model performance, whether it is the statistical properties of the corpora or other factors. In this direction, the thesis focuses on a natural language understanding task (semantic classification) and hypothesizes that, apart from statistics, a language model's understanding is built on a deep structure acquired during training. This structure is not directly accessible and can only be probed through tests, much like the studies of ToM in psychology.
Thus, this thesis proposes a method, Token Merge, that makes it possible to test for the existence of the structure. The experimental results support the proposed hypothesis and also give an ordering of grammatical tags by their importance to performance.
Graduation Thesis language
Graduation Thesis type
Master - Computer Science
Kallol Roy
Defence year