Set reconciliation

Name
Ivo Kubjas
Abstract
Assume that we have several devices with different subsets of a set of files. The problem of file synchronization arises in cloud storage. The goal is to find a union of the sets at each of these devices. The naive solution to the problem is to transmit the whole sets by all parties in the network. This, however, results in high communicational complexity. It would be desirable to find an algorithm with communicational complexity that is proportional to the size of the symmetric difference of the sets, which is typically small when compared to the total number of files. We define a number of communication network models. Several efficient algorithms for set reconciliation over a network with two devices have been described in the literature, but similar algorithms for general networks are still unknown. We study the connection between the network topologies and the communicational cost in the set reconciliation algorithms. We observe that it is possible to reduce the communicational cost in networks of specific topology. We also study a problem of minimization of a number of communication rounds in a wired network. We define weights on the edges of the graph according to the number of different files in the communicating devices. Then, we experimentally test the efficiency of choosing the communicating pairs using maximum weight matching. The results imply that this algorithm provides better results than its counterpart which chooses the communicating pairs randomly. The main result of this Thesis is an algebraic analytical framework for studying the set reconciliation algorithms in wireless networks. This frameworks allows for optimizing set reconciliation protocols which use linearly coded messages. This approach aims at minimizing the number of iterations and the number of transmissions during each iteration.
Graduation Thesis language
English
Graduation Thesis type
Master - Cyber Security
Supervisor(s)
Vitaly Skachek
Defence year
2014
 
PDF