Large Language Models for Control System Code Analysis

Name
Sander Sats
Abstract
Large Language Models (LLMs) are used ubiquitously in applications ranging from text summarization and automated theorem proving to code comprehension and generation. Instructions (a.k.a. prompts) are used to tune LLMs for new downstream tasks and thus enhance their generalization capability. To date, comparatively little effort has been devoted to applying LLMs in software engineering, e.g., NL-to-code tasks, code repair, and code comprehension. Comprehending code with LLMs is a non-trivial problem, as structural differences exist between code (programs) and natural language. Moreover, commercial LLMs trained on text corpora are closed-source, often lacking transparency and reproducibility. To fill this gap, this thesis proposes a novel method of tuning the instructions (prompt engineering) of LLMs for solving the code comprehension task. We tune GPT-3.5 and GPT-4 to comprehend MATLAB code from control systems engineering. Our handwritten MATLAB code simulates the behavior of various control systems, e.g., feedback control and PID control. We propose and design three types of prompts, (i) text prompts, (ii) logical prompts, and (iii) numerical prompts, to assist the LLM in comprehending the MATLAB code for control systems. We also propose a new metric to evaluate the correctness of the LLM's understanding while it solves the task. The findings of this thesis show that while LLMs (GPT-3.5) are good at language tasks, they are not yet mature enough to comprehend MATLAB scripts that work primarily in the numerical and linear algebra domains.
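The MATLAB scripts the abstract refers to are not part of this record. As a rough illustration of the kind of feedback/PID simulation described, the Python sketch below implements a discrete-time PID controller acting on a hypothetical first-order plant; the plant model, the gains, and the name simulate_pid are illustrative assumptions, not the thesis's actual code.

import numpy as np

def simulate_pid(kp=2.0, ki=1.0, kd=0.1, setpoint=1.0,
                 tau=0.5, dt=0.01, steps=1000):
    # Hypothetical first-order plant dy/dt = (-y + u) / tau,
    # discretized with forward Euler. Assumed for illustration only.
    y = 0.0          # plant output
    integral = 0.0   # accumulated error for the I term
    prev_err = setpoint - y
    history = []
    for _ in range(steps):
        err = setpoint - y
        integral += err * dt
        derivative = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * derivative  # PID control law
        y += dt * (-y + u) / tau                        # plant update
        prev_err = err
        history.append(y)
    return np.array(history)

if __name__ == "__main__":
    response = simulate_pid()
    print(f"final output: {response[-1]:.3f} (setpoint 1.0)")

A script like this, handed to an LLM together with a text, logical, or numerical prompt, is the kind of code-comprehension input the thesis evaluates.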
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Kallol Roy
Defence year
2024