Tuesday, September 29 • 11:30 - 12:20
Adding Insert, Update, and Delete to Apache Hive - Owen O'Malley, Hortonworks

Apache Hive provides a convenient SQL query engine and table abstraction for data stored in Hadoop. Hive uses Hadoop to provide highly scalable bandwidth to the data, but until recently it did not support updates, deletes, or transaction isolation. This prevented many desirable use cases, such as updating dimension tables or cleaning up data. We have implemented the standard SQL commands insert, update, and delete, allowing users to insert new records as they become available, update changing dimension tables, repair incorrect data, and remove individual records. This also allows very low latency ingestion of streaming data from tools like Storm and Flume. Additionally, we have added ACID-compliant snapshot isolation between queries, so that each query sees a consistent view of the transactions that were committed when it was launched.
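
The snapshot isolation described in the abstract can be illustrated with a toy model. The sketch below is not Hive's implementation. `ToyTable` and all of its methods are hypothetical names invented for this example, and the append-only change log is an assumption made to keep the code short. It shows only the general idea: a query captures the set of committed transactions when it launches and ignores anything committed later.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical in-memory table: an append-only log of change records."""
    log: list = field(default_factory=list)       # (txn_id, op, key, value)
    committed: set = field(default_factory=set)   # ids of committed transactions
    next_txn: int = 1

    def begin(self):
        """Start a transaction and return its id."""
        txn = self.next_txn
        self.next_txn += 1
        return txn

    def write(self, txn, op, key, value=None):
        """Record an insert, update, or delete under the given transaction."""
        self.log.append((txn, op, key, value))

    def commit(self, txn):
        self.committed.add(txn)

    def snapshot(self):
        """Capture the transactions committed right now (taken at query launch)."""
        return frozenset(self.committed)

    def read(self, snapshot):
        """Replay only the changes whose transaction is in the snapshot."""
        rows = {}
        for txn, op, key, value in self.log:
            if txn not in snapshot:
                continue  # uncommitted, or committed after the query launched
            if op == "delete":
                rows.pop(key, None)
            else:  # "insert" and "update" both set the current value
                rows[key] = value
        return rows


table = ToyTable()

t1 = table.begin()
table.write(t1, "insert", "alice", "Paris")
table.write(t1, "insert", "bob", "Lyon")
table.commit(t1)

snap = table.snapshot()  # a long-running query launches here

t2 = table.begin()
table.write(t2, "update", "alice", "Berlin")
table.write(t2, "delete", "bob")
table.commit(t2)

old_view = table.read(snap)              # still sees the rows as of t1
new_view = table.read(table.snapshot())  # a new query sees the effects of t2
```

In this toy model, the query that launched before `t2` committed keeps reading the original two rows, while a query launched afterwards sees the update and the delete.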

Speakers
Owen O’Malley

Co-founder & Sr Architect, Hortonworks
Owen O’Malley is a co-founder and architect at Hortonworks, which develops the completely open source Hortonworks Data Platform (HDP). HDP includes Hadoop and the large ecosystem of big data tools that enterprises need for data analytics. Owen has been working on Hadoop since 2006...


Tuesday September 29, 2015 11:30 - 12:20 CEST
Petofi
