BlobSeer Scalability: A Multi-version Managers Approach
Baya Chalabi, Yahya Slimani National School of Computer Science (ESI), Oued-Smar, Algiers, Algeria & ISAMM, Manouba University, Tunisia
Abstract: With the emergence of Cloud Computing, the amount of data generated in fields such as physics, medicine, and social networks is growing exponentially. This growth in volume and scale makes data processing more complex. Current datasets are very different in nature, ranging from small to very large, from structured to unstructured, and from largely complete to noisy and incomplete. In addition, these datasets evolve over time, often at very rapid rates. Given these characteristics, traditional data management systems are not well suited to them. For example, Relational Database Management Systems (RDBMS) manage only databases whose data conforms to a schema, whereas current databases contain a mix of structured, loosely structured, and unstructured data. Furthermore, relational systems lack support for version management, which is very important in a data management system. As a data management system dedicated to large-scale datasets, we consider BlobSeer, a concurrency-optimized data management system for data-intensive distributed applications. BlobSeer targets applications that handle massive unstructured data in large-scale distributed environments. It uses versioning to support concurrent manipulation of large binary objects (BLOBs) and thereby provide efficient access to data. To this end, BlobSeer uses a version manager that generates a new snapshot version of a BLOB every time a client writes or appends to it. However, if the number of BLOBs or the number of operations (writes, appends, or reads) increases while a single version manager handles them all, that version manager becomes overloaded and turns into a performance bottleneck. To avoid this bottleneck, we propose a multi-version managers approach in which each version manager maintains a subset of the BLOBs.
Keywords: Cloud Computing, Data Management System, BlobSeer, BLOBs, Scalability, Multi-version Managers
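The idea stated in the abstract, that each version manager maintains a subset of the BLOBs, can be illustrated with a minimal Python sketch. It is not BlobSeer's API or the authors' implementation: the class names `VersionManager` and `VersionManagerPool`, the method names, and the hash-based assignment of BLOBs to managers are all assumptions made here for illustration, since the abstract does not say how BLOBs are distributed among the managers.

```python
import hashlib
from collections import defaultdict


class VersionManager:
    """Hypothetical manager: assigns snapshot versions for its own BLOBs only."""

    def __init__(self, name):
        self.name = name
        self.latest = defaultdict(int)  # blob_id -> latest snapshot version

    def new_version(self, blob_id):
        # Every write or append produces a new snapshot version of the BLOB.
        self.latest[blob_id] += 1
        return self.latest[blob_id]


class VersionManagerPool:
    """Routes each BLOB to exactly one manager.

    Hash partitioning is an assumption of this sketch, not the paper's scheme.
    """

    def __init__(self, count):
        self.managers = [VersionManager(f"vm-{i}") for i in range(count)]

    def manager_for(self, blob_id):
        # A stable hash keeps a given BLOB on the same manager across calls.
        digest = hashlib.sha256(blob_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.managers)
        return self.managers[index]

    def write(self, blob_id):
        # Only the responsible manager is contacted, so the versioning load
        # is spread over the pool instead of a single manager.
        return self.manager_for(blob_id).new_version(blob_id)


pool = VersionManagerPool(4)
first = pool.write("blob-a")   # first snapshot of blob-a
second = pool.write("blob-a")  # second snapshot of blob-a
other = pool.write("blob-b")   # blob-b is versioned independently
```

Under these assumptions, version numbers stay consistent per BLOB because a BLOB is always served by the same manager, while different BLOBs may be served by different managers.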