Logs Mining Based Approach to eCommerce Customer Classification

Name
Aleksei Panarin
Abstract
Fits.me has developed a web tool that helps online shoppers choose the right clothing size. The Virtual Fitting Room application logs users’ actions and saves the entered body measurements to a database. In addition, Google Analytics is used to collect data on visit sessions at the online shops’ websites and on user characteristics such as location, software, and hardware. The main goal of the thesis is to analyse this data and learn to extract useful information from it. More precisely, we want to develop a method for grouping web-shop customers.
In the first stage we find a way to combine data from the different sources and aggregate it into user- and session-based profiles. After cleaning, the data has a more informative form and is ready for further analysis. Data cleaning and pre-processing form a significant part of the thesis.
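As a rough illustration of the aggregation into session- and user-based profiles, the following minimal sketch groups raw log events first by session and then by user. The event layout, the field names, and the action labels are assumptions made for this example only; they are not taken from the thesis data.

```python
from collections import defaultdict

# Hypothetical log events as (user_id, session_id, action) tuples.
# The layout and the action names are illustrative assumptions.
events = [
    ("u1", "s1", "open_fitting_room"),
    ("u1", "s1", "enter_measurements"),
    ("u1", "s2", "open_fitting_room"),
    ("u2", "s3", "open_fitting_room"),
    ("u2", "s3", "purchase"),
]


def build_profiles(events):
    """Aggregate raw events into session-level and user-level profiles."""
    # Session profile: owning user, number of actions, purchase flag.
    sessions = defaultdict(lambda: {"user": None, "actions": 0, "purchased": False})
    for user_id, session_id, action in events:
        s = sessions[session_id]
        s["user"] = user_id
        s["actions"] += 1
        s["purchased"] = s["purchased"] or action == "purchase"

    # User profile: totals over all of the user's sessions.
    users = defaultdict(lambda: {"sessions": 0, "actions": 0, "purchased": False})
    for s in sessions.values():
        u = users[s["user"]]
        u["sessions"] += 1
        u["actions"] += s["actions"]
        u["purchased"] = u["purchased"] or s["purchased"]
    return dict(sessions), dict(users)


session_profiles, user_profiles = build_profiles(events)
```

Profiles of this kind give one row per session or per user, which is the shape a classifier typically expects as input.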
In the analysis stage we use two methods for classifying the data: decision trees and Naïve Bayes. We group customers by features that are important for eCommerce: whether a user makes a purchase, and whether a user returns a purchased item. Neither the classification tree nor Naïve Bayes found a significant relationship between the studied attributes and shopping behaviour. However, a regression tree turned out to be useful for finding groups of users with similar behaviour. It reveals patterns of behaviour that lead to a higher probability of making a purchase.
Graduation Thesis language
English
Graduation Thesis type
Master - Software Engineering
Supervisor(s)
Rauno Viin, Siim Karus
Defence year
2015
 