Name: Netflix: Integrating Spark at Petabyte Scale - Cheolsoo Park, Netflix and Ashwin Shankar, Netflix
Start: 2015-09-29T16:00:00+0200
End: 2015-09-29T16:50:00+0200

The Big Data Platform team at Netflix maintains a cloud-based data warehouse with over 10 petabytes of data stored predominantly in Parquet format. Our platform has traditionally leveraged Pig for ETL processing, Hive for large analytic workloads, and Presto for interactive and exploratory use cases. Spark long seemed an attractive complement to our platform, but technical gaps prevented effective use at scale in our environment. Recent improvements have allowed us to add Spark to our cloud data architecture and interoperate seamlessly with the other tools and services in our stack.

We will go into detail about our deployment configuration and what it takes to run Spark alongside traditional workloads on YARN. We will share examples of a few of our largest workflows translated to Spark for comparison in terms of both performance and complexity.

Speakers

Cheolsoo Park

Senior Software Engineer, Netflix

Cheolsoo Park is an Apache Pig PMC member and Spark contributor. He is also a senior software engineer at Netflix and works on cloud-based big data analytics infrastructure that leverages open source technologies including Hadoop, Hive, Pig, and Spark.

Ashwin Shankar

Ashwin Shankar is an Apache Hadoop and Spark contributor. He is a senior software engineer at Netflix and is passionate about developing features and debugging problems in large scale distributed systems. Ashwin holds a Master's degree in Computer Science from University of Illinois...

Tuesday September 29, 2015 16:00 - 16:50 CEST
Dery/Mikszath

Spark - Flink - Tajo - Cascading

Apache: Big Data 2015
