Back To Schedule
Monday, September 28 • 10:30 - 11:20
Large-Scale Stream Processing in the Hadoop Ecosystem - Gyula Fóra, SICs and Márton Balassi, Hungarian Academy of Sciences

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Distributed stream processing is one of the hot topics in big data analytics today. An increasing number of applications are shifting from traditional static data sources to processing the incoming data in real-time. Performing large scale stream processing or analysis requires specialized tools and techniques which have become publicly available in the last couple of years.

This talk will give a deep, technical overview of the top-level Apache stream processing landscape. We compare several frameworks including Spark, Storm, Samza and Flink. Our goal is to highlight the strengths and weaknesses of the individual systems in a project-neutral manner to help selecting the best tools for the specific applications. We will touch on the topics of API expressivity, runtime architecture, performance, fault-tolerance and strong use-cases for the individual frameworks.

avatar for Márton Balassi

Márton Balassi

Solutions Architect, Cloudera
Márton Balassi is a Solution Architect at Cloudera and a PMC member at Apache Flink. He focuses on Big Data application development, especially in the streaming space. Marton is a regular contributor to open source and has been a speaker of a number of Big Data related conferences... Read More →
avatar for Gyula Fóra

Gyula Fóra

Researcher, Distributed Systems, SICS
Gyula is a committer and PMC member for the Apache Flink project, currently working as a researcher at the Swedish Institute of Computer Science. His main expertise and interest is real-time distributed data processing frameworks, and their connections to other big data applications... Read More →

Monday September 28, 2015 10:30 - 11:20 CEST

Attendees (0)