Monday, September 28 • 11:30 - 12:20
Magellan: Geospatial Analytics on Spark - Ram Sriharsha, Hortonworks

Geospatial data is pervasive. Spatial context is a rich signal of user intent and relevance in search and targeted advertising, and an important variable in many predictive analytics applications. In this talk, we describe the motivation for and the internals of Magellan, an open source library that we are building for geospatial analytics, with Spark SQL, DataFrames and Catalyst as the underlying engine. We outline how we leverage Catalyst's pluggable optimizer to execute spatial joins efficiently and how Spark SQL's operators allow us to express geometric queries in a natural DSL, and we discuss some of the geometric algorithms that we implemented in the library. We also describe the Python bindings that we expose through PySpark's Python integration.


Ram Sriharsha

Senior Member of Technical Staff, Hortonworks
Ram is currently Product Manager for Apache Spark at Databricks. Prior to joining Databricks, he was Principal Research Scientist at Yahoo Research where he worked on large scale machine learning algorithms and systems related to login risk detection, sponsored search advertising...

Monday September 28, 2015 11:30 - 12:20 CEST
