Journal of Networking Technology

DLINE Journals portal

Print ISSN: 0976-898X
Online ISSN: 0976-8998

BlobSeer Scalability: A Multi-version Managers Approach

Baya Chalabi, Yahya Slimani
National School of Computer Science, Oued-Smar -ESI- (Algiers), Algeria & ISAMM, Manouba University, Tunisia

Abstract: With the emergence of Cloud Computing, the amount of data generated in fields such as physics, medicine, and social networks is growing exponentially. This growth in volume and scale makes processing the data more complex. Current datasets vary widely in nature: from small to very large, from structured to unstructured, and from largely complete to noisy and incomplete. These datasets also evolve over time, often at very rapid rates. Given these characteristics, traditional data management systems are not well suited to support them. For example, Relational Database Management Systems (RDBMS) manage only databases whose data conforms to a schema, whereas current databases contain a mix of structured, semi-structured, and unstructured data. Furthermore, relational systems lack support for version management, which is very important in a data management system. As a data management system dedicated to large-scale datasets, we consider BlobSeer, a concurrency-optimized data management system for data-intensive distributed applications. BlobSeer targets applications that handle massive unstructured data in large-scale distributed environments. It uses versioning to support concurrent manipulation of large binary objects (BLOBs) and thus efficient access to data. To this end, BlobSeer uses a version manager that generates a new snapshot version of a BLOB every time a client writes or appends to it. However, when the number of BLOBs or the number of operations (writes, appends, or reads) grows and all of them are handled by a single version manager, that manager becomes overloaded and turns into a performance bottleneck. To avoid this bottleneck, we propose a multi-version-manager approach in which each version manager maintains a subset of the BLOBs.

Keywords: Cloud Computing, Data Management System, BlobSeer, BLOBs, Scalability, Multi-version Managers
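
The idea of several version managers, each responsible for a subset of BLOBs, can be illustrated with a minimal sketch. The abstract does not say how BLOBs are assigned to managers, so the routing rule used here (a stable hash of the BLOB identifier modulo the number of managers) is only an assumption, and the names VersionManager, VersionManagerPool, manager_for, and write are hypothetical and not part of the BlobSeer API.

```python
import hashlib
from collections import defaultdict


class VersionManager:
    """Hypothetical version manager: issues snapshot versions for the BLOBs it owns."""

    def __init__(self, name):
        self.name = name
        self.latest = defaultdict(int)  # blob_id -> latest version issued

    def new_version(self, blob_id):
        # Each write or append yields a new snapshot version of the BLOB.
        self.latest[blob_id] += 1
        return self.latest[blob_id]


class VersionManagerPool:
    """Routes each BLOB to exactly one manager.

    Assumed routing rule (not taken from the paper): stable hash modulo pool size.
    """

    def __init__(self, count):
        self.managers = [VersionManager(f"vm-{i}") for i in range(count)]

    def manager_for(self, blob_id):
        # sha256 gives the same result across runs, unlike Python's built-in hash().
        digest = hashlib.sha256(blob_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.managers)
        return self.managers[index]

    def write(self, blob_id):
        # Only the owning manager is contacted, so load is spread across the pool.
        return self.manager_for(blob_id).new_version(blob_id)


pool = VersionManagerPool(4)
first = pool.write("blob-a")   # first snapshot of blob-a
second = pool.write("blob-a")  # second snapshot, issued by the same manager
other = pool.write("blob-b")   # versions of blob-b are numbered independently
```

Because every operation on a given BLOB reaches the same manager, version numbers for that BLOB stay totally ordered, while operations on different BLOBs can be served by different managers in parallel. A modulo rule remaps most BLOBs when the pool size changes; a scheme such as consistent hashing would reduce that, at the cost of extra bookkeeping.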

DOI: https://doi.org/10.6025/jnt/2019/10/2/40-53


