Progress in Computing Applications(PCA)
 

SQL Hadoop Processing Engineers using MapReduce
Edson Ramiro Lucas Filho, Eduardo Cunha de Almeida, Stefanie Scherzinger
Universidade Federal do Paraná, Brazil; OTH Regensburg, Germany
Abstract: SQL-on-Hadoop processing engines have become state of the art, yet the skills required to tune these systems are rare in the job market. Automated tuning advisers can profile the low-level MapReduce jobs and propose appropriate tuning setups, but up-front tuning is time-consuming and costly. In this demo, we present DejaVu. DejaVu integrates with Hive and reduces tuning costs by caching tuning setups for partial query plans: when the SQL-on-Hadoop engine Hive compiles SQL queries into physical query plans, individual MapReduce jobs tend to be similar across query plans. By recycling the tuning setups for similar low-level MapReduce jobs, DejaVu can cut the time spent profiling the TPC-H query workload in half while achieving a similar impact on job performance. While we employ Starfish in this demo, DejaVu can leverage any third-party MapReduce tuning adviser.
Keywords: MapReduce, SQL Queries, Hadoop Processing
DOI: https://doi.org/10.6025/pca/2020/9/1/1-5
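
The caching idea described in the abstract can be illustrated with a minimal Python sketch. This is not DejaVu's implementation: the class `TuningCache`, the stand-in `fake_adviser`, and the choice to treat two jobs as "similar" only when they contain exactly the same set of operators are all assumptions made for illustration. The operator names and the returned parameter value are likewise illustrative only.

```python
from typing import Callable, Dict, FrozenSet, Iterable

# Hypothetical types: a job is identified by the set of its operator names,
# and a tuning setup maps configuration parameters to values.
JobSignature = FrozenSet[str]
TuningSetup = Dict[str, str]


class TuningCache:
    """Reuses tuning setups for jobs that share an operator signature."""

    def __init__(self, adviser: Callable[[JobSignature], TuningSetup]) -> None:
        self._adviser = adviser
        self._cache: Dict[JobSignature, TuningSetup] = {}
        self.profiling_runs = 0  # how often the (costly) adviser was invoked

    def setup_for(self, operators: Iterable[str]) -> TuningSetup:
        signature = frozenset(operators)
        if signature not in self._cache:
            # Cache miss: profile the job once and remember the result.
            self.profiling_runs += 1
            self._cache[signature] = self._adviser(signature)
        return self._cache[signature]


def fake_adviser(signature: JobSignature) -> TuningSetup:
    # Stand-in for a third-party tuning adviser; the value is made up.
    return {"mapreduce.job.reduces": str(len(signature))}


cache = TuningCache(fake_adviser)

# Two hypothetical query plans, each a list of MapReduce jobs.
plan_a = [["TableScan", "Filter", "ReduceSink"],
          ["TableScan", "GroupBy", "FileSink"]]
plan_b = [["TableScan", "Filter", "ReduceSink"],  # same signature as in plan_a
          ["TableScan", "Join", "FileSink"]]

total_jobs = 0
for plan in (plan_a, plan_b):
    for job in plan:
        cache.setup_for(job)
        total_jobs += 1

print(f"{total_jobs} jobs, {cache.profiling_runs} profiling runs")
```

In this toy workload the first job of the second plan hits the cache, so four jobs need only three profiling runs. A real system would need a looser notion of similarity than exact signature equality; that choice is outside the scope of this sketch.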
References:

[1] Dean, J., Ghemawat, S. (2004). MapReduce: Simplified Data Processing on Large Clusters. In: OSDI.
[2] Duan, S., Thummala, V., Babu, S. (2009). Tuning Database Configuration Parameters with iTuned. Proceedings of the VLDB Endowment, 2(1), 1246–1257.
[3] Filho, E. R. L., de Almeida, E.C., Scherzinger, S. (2019). Don’t Tune Twice: Reusing Tuning Setups for SQL-on-Hadoop Queries. In: ER 2019 – 38th International Conference on Conceptual Modeling.
[4] Filho, E. R. L., Picoli, I. L., de Almeida, E. C., Le Traon, Y. (2014). Chameleon: The Performance Tuning Tool for MapReduce Query Processing Systems. In: 29th SBBD, Demos and Applications Session, ISSN 2316-5170, October 6-9, 2014, Curitiba, PR, Brazil.
[5] Floratou, A., Minhas, U. F., Ozcan, F. (2014). SQL-on-Hadoop: full circle back to shared-nothing database architectures. Proceedings of the VLDB Endowment, 7(12), 1295–1306.
[6] Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F. B., Babu, S. Starfish: A Self-Tuning System for Big Data Analytics. In: CIDR.
[7] Thusoo, A., Sarma, J. S., Jain, N., Shao, Z., Chakka, P., Zhang, N., Antony, S., Liu, H., Murthy, R. (2010). Hive - A petabyte scale data warehouse using Hadoop. In: Proceedings - International Conference on Data Engineering, pp. 996–1005.
[8] Chen, Y., Alspaugh, S., Katz, R. H. (2012). Interactive Query Processing in Big Data Systems: A Cross Industry Study of MapReduce Workloads. Tech. Rep. 12, University of California, Berkeley.


Copyright 2011 dline.info