Exploring the Human-like Ability of LLMs in Recognizing Self-generated Text

Name
Katariina Ingerma
Abstract
A Large Language Model (LLM) is a type of generative artificial intelligence model that can produce human-like text. The popularity of LLMs is increasing rapidly due to their ability to understand and generate text that closely resembles human language. This generative ability continues to expand their acceptance in professional tasks such as advertising slogan creation, news composition, and story generation. At the same time, the proliferation of LLMs across diverse areas facilitates their malicious use, which poses a serious threat to information ecosystems and public trust. Therefore, there is an imperative need to develop effective methods for distinguishing between LLM-generated and human-written textual content. In this thesis, we study the linguistic differences between human-written and LLM-generated texts, the machine-generated text detection performance of the LLM that generated the content, and the effect of text length on detection performance. The results reveal that LLMs with fewer parameters generate texts with a higher Type-Token Ratio than human-authored texts, while more advanced LLMs exhibit closer similarity to human writing. The results also indicate that the larger and more advanced the LLM, the less likely it is to detect its own generated texts, owing to their close resemblance to human-authored content. This research contributes to addressing problems arising from LLM-generated texts, such as misinformation, and to the development of new approaches for identifying LLM-generated content.
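The Type-Token Ratio referred to in the abstract is the number of distinct words (types) divided by the total number of words (tokens) in a text. A minimal sketch of how it can be computed is shown below; the regex-based, lowercased tokenization is a simplifying assumption, and the thesis may use a different tokenizer.

```python
import re

def type_token_ratio(text: str) -> float:
    """Compute the Type-Token Ratio: unique words / total words.

    Tokenization by lowercased word characters is a simplifying
    assumption for illustration only.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# A more repetitive text yields a lower TTR:
# "the cat sat on the mat" has 6 tokens but only 5 types.
print(type_token_ratio("the cat sat on the mat"))
```

A higher TTR indicates greater lexical diversity, which is one of the linguistic cues the thesis examines when comparing human-written and LLM-generated texts.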
Graduation Thesis language
English
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Somnath Banerjee
Defence year
2024