# Defence of dissertation in the field of Mathematics, Matthias Grezet, M.Sc.

On the constraints of reliable storage of big data

The goal of the dissertation "On Matroid Theory and Distributed Data Storage" is to obtain tradeoffs between the main parameters of a distributed storage system with locality, as well as to analyse the repair properties of certain optimal storage codes. This is done by developing connections between storage codes and matroids.

In the last few years, the development of web services and social media content has generated an astronomical quantity of digital data. From the point of view of a single user, cloud storage allows constant, remote access to the data without overwhelming the user's own storage capacity. The same benefits apply to companies as well, on a much larger scale. Therefore, big information technology companies such as Amazon and Microsoft have built huge data storage systems to offer cloud storage and cloud computing. The starting point of this thesis is to study how to store data efficiently and reliably. Since the data is spread amongst multiple storage servers, a storage system has to deal with several server failures on a daily basis. To prevent data loss, it is necessary to store redundant data alongside the initial data by using a storage code. The amount of redundant data in the system is referred to as the storage overhead. When a server fails, a new server is added to the system and nearby servers are contacted to reconstruct the lost data. The number of servers contacted for repairing a server failure is called the locality.

This thesis focuses on the notion of locality. More precisely, the main goal is to derive tradeoffs between the storage overhead, the failure tolerance, and the locality when the underlying code alphabet is fixed. Deriving a tradeoff is important in practice as it characterises the best possible codes. Furthermore, since the alphabet relates to the repair complexity and affects the aforementioned notions, it is interesting to derive alphabet-dependent tradeoffs. To approach this problem, we use the internal structure of the storage codes and the relation between codes and matroids. Matroids are interesting mathematical objects in their own right and provide useful tools to analyse the internal structure of the storage codes. In addition to deriving tradeoffs, matroidal tools help in the design of efficient repair processes for storage codes.

Opponent: Dr. Thomas Britz, University of New South Wales, Australia

Custos: Professor Camilla Hollanti, Aalto University School of Science, Department of Mathematics and Systems Analysis

Doctoral candidate: Matthias Grezet, Department of Mathematics and Systems Analysis,  [email protected], +358 505052525

Electronic dissertation: http://urn.fi/URN:ISBN:978-952-60-8711-5

The dissertation is publicly displayed 10 days before the defence on the noticeboard of the School of Science at Konemiehentie 2, Espoo.
