In an event sourcing architecture there is a single source of truth, and Hadoop is the tool we use to fulfil that role. We use Apache Kafka and the Hermes message bus as the single entry point for events. There is no efficient solution for live backup of data when CRUD operations are enabled. However, when handling immutable events, we can back up data live to multiple locations, such as a Hadoop cluster in another data center or any storage provider that supports the S3 API.
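As a minimal sketch of this idea (not the speakers' actual pipeline), the snippet below shows how a single stream of immutable events read from Kafka can be fanned out verbatim to two backup sinks: an HDFS cluster in another data center and any S3-compatible store. The topic name, HDFS URI, bucket and paths are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class EventBackupConsumer {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");               // hypothetical broker
        props.put("group.id", "event-backup");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("events"));    // hypothetical topic

        // Sink 1: HDFS cluster in a remote data center (hypothetical URI).
        FileSystem hdfs = FileSystem.get(
                URI.create("hdfs://backup-dc-namenode:8020"), new Configuration());

        // Sink 2: any storage provider that speaks the S3 API.
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                String name = record.topic() + "/" + record.partition() + "/" + record.offset();

                // Events are immutable: they are only ever appended, never updated or deleted,
                // so both copies stay exact replicas of the source stream.
                try (FSDataOutputStream out = hdfs.create(new Path("/backup/" + name))) {
                    out.write(record.value().getBytes(StandardCharsets.UTF_8));
                }
                s3.putObject("event-backup-bucket", name, record.value());
            }
            // Commit offsets only after both copies are durable.
            consumer.commitSync();
        }
    }
}
```

A production pipeline would batch events into larger files before writing (small files are costly on HDFS), but the design choice illustrated here is the key one: immutability makes each sink an append-only, conflict-free replica.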
Storing exact copies of data in different locations allows us to extend the compute power of a private data center with a public platform provider. Such a hybrid solution benefits from cloud elasticity, so we can easily scale on demand.
In this presentation we will describe architectural design patterns for backup and compute power scaling. We will also focus on the technical aspects of our architecture, built on top of open source software.
Paweł holds a PhD in distributed databases and his interests focus on making Big Data easy. He has 7 years of technical experience at Allegro and currently works as a Hadoop Product Owner in the Big Data Solutions Team. The team develops and maintains a petabyte Hadoop cluster with endpoints...
In 2006 he completed his master's studies in Computer Science at Nicolaus Copernicus University. In the years 2006 - 2011 he was a PhD student in Computer Science; his research field was Computer Science applied to Bioinformatics. He gained experience in the Hadoop world building and maintaining a...
A software developer with 5+ years of professional experience. He now works as a Senior Data Engineer at Allegro Group, developing tools that support the internal Big Data ecosystem and contributing to Open Source.
Wednesday September 30, 2015 15:30 - 16:20 CEST
Tas