Classification of Email Messages
Name
Martin Mäe
Abstract
Today email is one of the most widely used communication methods. It has been
in use for decades and is used daily by organizations and individuals alike to
send and receive all kinds of information. As a result, the number of email
messages sent and received has grown significantly, and more than ever before we
face a serious message overload problem.
To make managing and finding messages easier, it is reasonable to classify messages
based on user needs. Each person can develop a classification scheme that suits
his or her own needs.
An electronic message, or email for short, consists of two parts: the message header
and the message body (email content). Using information from both parts, I will try to
classify email messages to make it easier to find and manage both incoming and
existing emails.
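As an illustration of the idea, the following minimal sketch reads the header and body of a message with Python's standard `email` module and assigns a category using simple keyword rules. The sample message, the category names ("work", "finance", "other") and the rules themselves are assumptions made for this example only; they are not the methods or email clients examined in the thesis.

```python
from email import message_from_string

# Hypothetical sample message: header fields, a blank line, then the body.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Subject: Project meeting on Friday

Hi Bob, the agenda for the meeting is attached.
"""


def classify(raw_message):
    """Return a category name based on header and body keywords.

    The rules below are illustrative assumptions, not a recommended scheme.
    """
    msg = message_from_string(raw_message)

    # Header information: missing fields are treated as empty strings.
    subject = (msg["Subject"] or "").lower()
    sender = (msg["From"] or "").lower()

    # Body information: only plain, single-part messages are handled here.
    payload = msg.get_payload()
    body = payload.lower() if isinstance(payload, str) else ""

    if sender.endswith("@example.com") and "meeting" in subject:
        return "work"
    if "invoice" in body:
        return "finance"
    return "other"


print(classify(RAW_MESSAGE))
```

A rule based on the header (sender and subject) is checked first, and a rule based on the body is used as a fallback, which mirrors the two sources of information described above.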
This thesis aims to give an overview of what classification is and to introduce some
common classification methods. Another aim is to briefly introduce the email format and
the message overload problem and to look at the number of emails sent yearly. The last
aim is to study the built-in features of widely used email programs to see whether these
features are useful for classifying emails so that information can be found faster and more easily.
This thesis is divided into three chapters. The first chapter gives an overview of the email
message format, the message overload problem, widely used email clients, and the
number of emails sent. Chapter two introduces some classification methods, as well as
information extraction, categorization and classification. In chapter three, some
real-life experiments are conducted to show how email clients can be used to classify email
messages.
Graduation Thesis language
English
Graduation Thesis type
Bachelor - Information Technology
Supervisor(s)
Tõnu Tamme
Defence year
2011