Tuesday, September 29 • 15:00 - 15:50
Hive on Spark: What It Means to You? - Xuefu Zhang, Cloudera

Apache Hive is widely used for batch-oriented SQL workloads for ETL and data analytics in the Hadoop ecosystem. Up to now, most of these workloads have still been executed by a 10-year-old technology, MapReduce. Apache Spark, on the other hand, is a general-purpose, open-source data processing framework positioned to replace MapReduce with faster data processing and more efficient memory utilization.

The Hive on Spark initiative introduced Spark as Hive's new execution engine, providing faster SQL on Hadoop while maintaining Hive's feature richness. With a joint effort from the Hive community and feedback from early adopters and beta users, Hive on Spark is ready for production deployment!

This presentation covers the motivation, architecture, deployment practices, and performance tuning of Hive on Spark. A live demo will conclude the presentation.


Xuefu Zhang

Software Engineer, Uber Technologies
Xuefu Zhang has over 10 years' experience in software development. Earlier this year he joined Uber as a software engineer from Cloudera, where he focused mainly on Apache Hive and Pig. He also worked in the Hadoop team at Yahoo when the majority of the development on...

Tuesday September 29, 2015 15:00 - 15:50 CEST
