The Apache Big Data Europe conference is in town, and we are organizing an event where you can listen to and meet with the international speakers and attendees.
We plan to have the following talks:
Dive deeper, Soar higher: MADlib + HAWQ for advanced SQL machine learning on Hadoop
The growing Apache ecosystem just got bigger and better -- now with the ability to crunch vast volumes of data using fully ANSI-compliant SQL and at-scale machine learning algorithms.
Apache HAWQ has been years in the making and derives its heritage from Greenplum Database and PostgreSQL. HAWQ enables developers, analysts, data scientists and engineers to run advanced SQL queries, transform data sets of extreme size, visualize data with standard tools, and seamlessly run R and Python in a highly distributed fashion, all in the same environment. Invoke powerful machine learning and advanced statistical functions using Apache MADlib, and build models on billions of rows of data.
Speakers from Pivotal and Hortonworks will discuss the following:
- Introduction to Apache HAWQ & Apache MADlib
- All about the Open Data Platform initiative
- Data science in the Hadoop ecosystem
- Live, end-to-end data science demo using Apache HAWQ, Apache MADlib, and Hortonworks Data Platform
Caleb Welton is Director for SQL on Hadoop at Pivotal, covering the Pivotal HAWQ database. He has spent the last 18 years developing database technology for Oracle, Greenplum, EMC and Pivotal. In addition to his contributions to database technology, he is one of the founding members of the open source MADlib project for in-database machine learning. Caleb is a named inventor on 11 patents in database technology and has presented papers at SIGMOD, VLDB and KDD.
Michael Natusch leads Pivotal's Data Science team in EMEA. His experience lies in predictive analytics and his area of specialization is the application of statistical methods to large-scale data sets, in particular through the application of machine learning algorithms. Michael holds a PhD in theoretical physics from the University of Cambridge and an MBA. He is a Fellow of the Royal Statistical Society and lectures at the Open University.
Janos Matyas is a Sr. Director of Engineering at Hortonworks and former CTO at SequenceIQ (acquired by Hortonworks). Before co-founding SequenceIQ he was a Solutions Architect at EPAM Systems. He is an open source advocate and Apache Ambari committer, a Hadoop YARN evangelist and a keen surfer and freeskier. He holds a Master's Degree in Computer Science, specializing in distributed systems.
18:30 Doors open
19:00 Talks begin
21:00 Meetup finishes
After the meetup we'll visit a nearby pub (exact location to be announced). Join us there, have some drinks and talk data (or anything else) even if you can't make it to the meetup!
This will be an English-speaking event. The meetup will be hosted by LogMeIn.
International and Hungarian friends of Big Data, unite!
After our September meetup on Monday evening, we are heading to An'kert, a fine example of the world-famous ruin pubs of Budapest, for drinks and discussions.
This AfterParty is open to everyone who loves Big Data, and we are especially looking forward to meeting and greeting the attendees of the Apache Big Data Europe conference.
If you are staying at Hotel Corinthia, the venue of the conference, you can easily walk to the pub. Here's a Google Maps link showing the route and distance: goo.gl/maps/RWtbxGN27qT2
If you are also attending the earlier Big Data Meetup at LogMeIn, then getting to An'kert is even simpler: just walk a block (200 m) along Paulay street.
This session is an informal meeting about post-MapReduce frameworks such as Spark or Flink. We will also talk about the ecosystem, architectural patterns (e.g. Lambda and Kappa), programming (Scala et al.) and abstraction/SQL frameworks on general-purpose data engines.
After hours of listening, it is about time that you have a chance to talk. Share your thoughts, ideas and questions. Remember, there is no such thing as a stupid question. This is also the perfect place to ask questions about session topics that came up after the session was closed.
1. Recap and introduction to the topic
2. General discussion in the big room
3. Fork into sub-BoFs in smaller rooms on demand (if you want to discuss a certain topic and deep-dive into its technical details, we invite you to gather some people and create a “sub-BoF”)
Trafodion is a world-class transactional SQL RDBMS running on HBase/Hadoop, currently in Apache incubation.
In this talk we will discuss: