BigPetStore on Spark and Flink
Implementing use cases on unified Big Data engines
- Track: HPC, Big Data and Data Science devroom
- Room: H.2213
- Day: Saturday
- Start: 16:30
- End: 16:55
Having a unified data processing engine empowers Big Data application developers, as it makes connections between seemingly unrelated use cases natural. This talk discusses the implementation of the BigPetStore project (which is part of Apache Bigtop) in Apache Spark and Apache Flink. The aim of BigPetStore is to provide a common suite to test and benchmark Big Data installations. The talk presents best practices and implementations side by side, using the batch, streaming, SQL, DataFrames and machine learning APIs of Apache Spark and Apache Flink. A range of use cases is outlined in both systems, from data generation through ETL and recommender systems to online prediction.
- Session type: Lecture
- Session length: 30 min + 5 min discussion
- Expected prior knowledge / intended audience: Basic exposure to Big Data systems
Speaker bio: Márton Balassi is a Solution Architect at Cloudera and a PMC member of Apache Flink. He focuses on Big Data application development, especially in the streaming space. He is a regular contributor to open source and has recently spoken at a number of open source Big Data conferences, including Hadoop Summit and Apache Big Data, as well as at meetups.
Speakers
Marton Balassi