Sparse Matrix Support for Apache Flink

Organization
Data Systems Group
Abstract
Sparse matrices are very common in data representation, especially in machine learning. When analytics or machine learning is performed on data streams, the constraints on memory footprint and latency are stricter.

Apache Flink is a well-known open-source system for distributed, large-scale stream processing. Unfortunately, Flink does not support efficient handling of and access to sparse data, even though it has been used in several online machine learning research projects.

The objective of this thesis is to build a general-purpose sparse matrix library that stores and accesses sparse data efficiently and can be integrated with the Flink APIs.
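
To illustrate what "efficient storage and access" can mean for sparse data, here is a minimal sketch of the compressed sparse row (CSR) layout, one common choice for such a library. It is written in plain Python for brevity and is not tied to any Flink API. The class name `CsrMatrix`, its methods, and the sample ratings are hypothetical and chosen only for illustration. The sketch assumes each (row, column) coordinate appears at most once in the input.

```python
from bisect import bisect_left


class CsrMatrix:
    """Hypothetical CSR container: only non-zero entries are stored."""

    def __init__(self, n_rows, n_cols, triples):
        # triples: iterable of (row, col, value); explicit zeros are dropped.
        # Assumption: no coordinate is repeated.
        self.shape = (n_rows, n_cols)
        entries = sorted((r, c, v) for r, c, v in triples if v != 0)
        self.indptr = [0] * (n_rows + 1)  # row r spans indptr[r]:indptr[r + 1]
        self.indices = []                 # column index of each stored value
        self.data = []                    # the stored non-zero values
        for r, c, v in entries:
            self.indptr[r + 1] += 1       # count the entries of each row
            self.indices.append(c)
            self.data.append(v)
        for r in range(n_rows):           # prefix sum turns counts into offsets
            self.indptr[r + 1] += self.indptr[r]

    def nnz(self):
        """Number of stored (non-zero) entries."""
        return len(self.data)

    def row(self, r):
        """Return the non-zero (col, value) pairs of row r, sorted by column."""
        start, end = self.indptr[r], self.indptr[r + 1]
        return list(zip(self.indices[start:end], self.data[start:end]))

    def get(self, r, c):
        """Look up one cell by binary search within the row's column indices."""
        start, end = self.indptr[r], self.indptr[r + 1]
        pos = bisect_left(self.indices, c, start, end)
        if pos < end and self.indices[pos] == c:
            return self.data[pos]
        return 0


# Toy user-item ratings matrix: 3 users, 4 items, 3 ratings.
ratings = CsrMatrix(3, 4, [(0, 1, 5.0), (2, 3, 2.0), (0, 3, 1.0)])
```

With this layout, memory grows with the number of non-zero entries rather than with rows times columns, and reading a whole row is a contiguous slice. The trade-off is that inserting a new entry shifts the arrays, which matters for streaming updates and is one of the design questions such a library would have to address.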

We have an application scenario in online recommender systems that would benefit from this sparse matrix implementation.
Graduation Theses defence year
2021-2022
Supervisor
Ahmed Awad
Spoken language(s)
English
Requirements for candidates
Level
Bachelor, Masters
Keywords
#Flink, online machine learning, sparse matrix

Application of contact

 
Name
Ahmed Awad
Phone
E-mail
ahmed.awad@ut.ee