Wednesday, September 30 • 15:30 - 16:20
Apache Spark for High-Throughput Systems - Michael Starch, NASA Jet Propulsion Laboratory

Data systems are increasingly expected to support data rates approaching network bandwidth limits of around 10 Gb/s. Apache Spark is capable of high throughput via distributed computing and is thus a good candidate to support a data system in this environment; however, most technologies break down under these conditions. It is therefore essential that Apache Spark be characterized for production use at these scales.

This talk will discuss an approach to running Apache Spark at throughputs on the order of 10 Gb/s while performing non-trivial processing. This will give users a feel for Apache Spark’s performance under the most demanding conditions. The setup of Apache Spark, the configuration used, and the resource requirements for processing at this scale will be discussed. In addition, concrete takeaways will be provided for users who want to push Apache Spark to this scale.


Michael Starch

Computer Engineer in Applications, NASA Jet Propulsion Laboratory
Michael Starch has been employed by the Jet Propulsion Laboratory for the past 5 years. His primary responsibilities include engineering big data processing systems for handling scientific data, researching the next generation of big data technologies, and helping infuse these systems into the mission world. He is a committer and PMC member on Apache OODT and has spoken about his work at the Southern California Linux Expo and ApacheCon North America.

Wednesday September 30, 2015 15:30 - 16:20