Security
Monday, September 28


Hadoop Elephant in Active Directory Forest - Marek Gawiński, Allegro Group Sp. z o.o.
Active Directory (AD) is a well-known industry standard for authenticating employees in back office services. It provides password management and clear policies for requesting and gaining access to secured resources.
Integrating AD with Hadoop infrastructure brings those benefits to the big data world. It also includes other features that make big data developers’ tasks much easier. For example, our developers can submit Spark applications that use HDFS, YARN and Hive directly from their IDE.
In this talk we provide technical details which include:
Making AD users and groups visible to Linux via System Security Services Daemon.
Integrating new Linux servers automatically with the AD forest at the Kerberos level, with all credentials needed.
Making the whole architecture resilient to AD service outages.
Auto-deployment and autoconfiguration of Hadoop client software on users’ desktops.


Marek Gawiński

Senior Data Platform Engineer, Allegro Group Sp. z o.o.
Has spent 6 years in the Infrastructure and Services Maintenance Team, where he provides technical support for the scrum teams and maintains multiple services in the Allegro Group's portfolio. He is now developing big data solutions. Passionate about web technologies and...

Arkadiusz Osinski

Senior Data Engineer, Allegro Group Sp. z o.o.
Works at Allegro Group as a senior data engineer. He has been involved from the beginning in building and maintaining the Hadoop infrastructure within Allegro Group. Previously he was responsible for maintaining large scale database systems. Passionate about new technologies and cyclin...

Monday September 28, 2015 11:30 - 12:20


Protecting Enterprise Data In Apache Hadoop - Owen O'Malley, Hortonworks
Hadoop has long had strong authentication via integration with Kerberos, authorization via User/Group/Other HDFS permissions, and auditing via the audit log. Recent developments in Hadoop have added HDFS file access control lists, pluggable encryption key provider APIs, HDFS snapshots, and HDFS encryption zones. These features combine to give important new data protection features that every company should be using to protect their data. This talk will cover what the new features are and when and how to use them in enterprise production environments. Upcoming features including columnar encryption in the ORC columnar format will also be covered. By encrypting particular columns, enterprises can control which users have access to particularly sensitive columns that contain personally identifiable information or financial information.

Owen O’Malley

Co-founder & Sr Architect, Hortonworks
Owen O’Malley is a co-founder and architect at Hortonworks, which develops the completely open source Hortonworks Data Platform (HDP). HDP includes Hadoop and the large ecosystem of big data tools that enterprises need for data analytics. Owen has been working on Hadoop since 2006...

Monday September 28, 2015 14:00 - 14:50


Encryption and Anonymization in Hadoop - Current and Future - Balaji Ganesan, Hortonworks and Don Bosco Durai
As enterprises expand their use of Hadoop as a platform to store and process data, data security and compliance needs in the platform are becoming more pressing.
Beyond the traditional Hadoop security controls of authentication through Kerberos and access management through Apache Ranger, users are increasingly asking for data to be encrypted both in transit and at rest on disk. Relatedly, there is a movement toward anonymizing data by tokenizing or masking it, so that the data can still be used in query processing while the original sensitive values stay hidden from the end user.
The community has recently built encryption into the HDFS file system. In this talk we look at the current encryption capabilities in Hadoop and the areas where the community needs to focus to further enhance Hadoop as an enterprise-ready data platform.


Don Bosco Durai

Security Architect, Hortonworks
Bosco Durai is an Apache committer currently working at Hortonworks, focused on enabling enterprise-grade security within the Hadoop platform. Bosco brings years of experience building and managing enterprise data security products. Before Hortonworks, Bosco was the co-founder and...

Balaji Ganesan

Hortonworks In
Balaji Ganesan is part of the enterprise security team at Hortonworks, where he is leading and executing the vision to bring comprehensive enterprise security into Apache Hadoop. He came to Hortonworks through its acquisition of his security startup, XA Secure. As the Senior Director...

Monday September 28, 2015 16:00 - 16:50
Tuesday, September 29


Apache Sentry (incubating): Fine-Grained Access Control to Hadoop Ecosystem - Sravya Tirukkovalur, Cloudera
Historically, each Hadoop component has offered its own method of access control, so each one needs its own set of permission rules, even when they are accessing the same data in Hadoop. This is an administrative nightmare that slows the adoption of Hadoop when sensitive data is involved. Apache Sentry is a framework that enables fine-grained, role-based authorization for multiple Hadoop ecosystem components. Apache Sentry is a highly modular system that supports authorization for various data models, such as database-style schemas and search indexes. It comes with out-of-the-box support for SQL query frameworks like Apache Hive and Cloudera Impala, extending the table privileges to the underlying HDFS storage, as well as for the open source search framework Apache Solr. This session will present an overview of Apache Sentry.

Sravya Tirukkovalur

Software Engineer, Cloudera
Sravya Tirukkovalur is a software engineer at Cloudera working on Hadoop security. She is one of the active contributors to the Apache Sentry project and also the PMC Chair. She received her Master's degree from The Ohio State University, with her research focus on High performance and...

Tuesday September 29, 2015 14:00 - 14:50
Wednesday, September 30


How to Deploy a Secure, High-Available, Hadoop Platform - Olaf Flebbe, science+computing ag
We demonstrate the fully automatic installation of a Hadoop cluster, including infrastructure. The basic building blocks of the demonstration are the Debian distribution including Puppet, configuration with Hiera, a list of community Puppet modules, and deploy scripts and packages from the Apache Bigtop distribution. The automatically installed cluster sports an HA MIT Kerberos and OpenLDAP setup, Apache ZooKeeper fencing, and HA Hadoop (journalling and RM). Web GUIs are authenticated with SPNEGO, Hive is configured with standard SQL authorization, and Hue is provided as the frontend.

One of the advanced features is the use of Puppet's CA via PKINIT for preauthentication, for bootstrapping a secure Kerberos KDC, and for securing Hadoop with it.


Olaf Flebbe

Chief Software Architect
Dr. Olaf Flebbe received his PhD in computational physics in Tübingen, Germany. He works as the chief software architect at science+computing ag. He is a member of the PMC of Apache Bigtop. Occasionally he gives talks about random projects at various conferences.

Wednesday September 30, 2015 10:00 - 10:50


Hadoop and Kerberos: the Madness Beyond the Gate - Steve Loughran, Hortonworks
When HP Lovecraft wrote of forbidden knowledge about non-human deities, knowledge which would reduce the reader to insanity, most people assumed that he was making up a fantasy world. In fact he was documenting Kerberos and its Hadoop integration.

There are some things humanity was not meant to know. Most people are better off living lives of naive innocence, never having to see an error message about SASL or GSS, never having to stare in terror at classes whose initials alone, UGI, are ever spoken aloud (or, more accurately, whispered).

This talk goes into the depths, to the knowledge which you need to write applications in a secure Hadoop cluster, knowledge that may drive you insane. Forevermore, you shall fear voices calling out in the night, voices saying things like "we have an urgent Kerberos-related support call, can you help?"

Steve Loughran

Member of Technical Staff, Hortonworks
Steve Loughran is a developer at Hortonworks, where he works on leading-edge Hadoop applications, most recently on Apache Slider and on Apache Spark's integration with Hadoop and YARN, and Hadoop's S3A connector to Amazon S3. He's the author of Ant in Action, a member of the Apache...

Wednesday September 30, 2015 11:00 - 11:50


Securing Hadoop in an Enterprise Context - Hellmar Becker, ING
Hadoop clusters can be secured using Kerberos and LDAP, and tools like Ranger and Sentry facilitate security administration. How do you connect a cluster to an enterprise directory with 100,000+ users and centralized role and access management? Hellmar will present ING's approach to synchronizing Hadoop role management with the central repository, emphasizing aspects of performance and system stability. He will discuss specific changes to the Ranger security tool that ING introduced to mitigate directory server load, as well as general aspects of the security model.

Hellmar Becker

Sr. IT Specialist, ING
Hellmar has worked in a number of positions in big data analytics and digital analytics. He currently works at ING Bank, implementing the Datalake Foundation project (based on Hadoop) within Client Information management. Long-standing experience in advanced analytics and data management...

Wednesday September 30, 2015 12:00 - 12:50