Build an Open Source Streaming Data Pipeline
- Track: HPC, Big Data, and Data Science devroom
- Room: D.hpc
- Day: Saturday
- Start: 11:00
- End: 11:30
Any conversation about Big Data would be incomplete without talking about Apache Kafka and Apache Flink: the winning open source combination for high-volume streaming data pipelines.
In this talk we'll explore how moving from long-running batches to streaming data changes the game completely. We'll show how to build a streaming data pipeline, starting with Apache Kafka for storing and transmitting messages at high throughput and low latency. Then we'll add Apache Flink, a distributed stateful compute engine, to create complex streaming transformations using familiar SQL statements.
This session is aimed at data professionals who are ready to embrace open source streaming and make their data fly.
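To make the shape of such a pipeline concrete, here is a minimal sketch of the Flink SQL it might involve: a source table over a Kafka topic, a sink table over a second topic, and a one-minute tumbling-window average between them. It is an illustration under stated assumptions, not the speakers' demo: the broker address, topic names, table names, and columns are all hypothetical. The statements are kept as Python strings so the block runs without a cluster.

```python
# Hypothetical names throughout: the broker address, topics, tables, and
# columns are placeholders for illustration, not taken from the talk.
BOOTSTRAP_SERVERS = "localhost:9092"


def kafka_table_ddl(name, columns, topic, extra_options=None):
    """Return Flink SQL DDL for a table backed by a Kafka topic with JSON values."""
    options = {
        "connector": "kafka",
        "topic": topic,
        "properties.bootstrap.servers": BOOTSTRAP_SERVERS,
        "format": "json",
    }
    options.update(extra_options or {})
    column_sql = ",\n  ".join(columns)
    option_sql = ",\n  ".join(f"'{k}' = '{v}'" for k, v in options.items())
    return f"CREATE TABLE {name} (\n  {column_sql}\n) WITH (\n  {option_sql}\n)"


# Source: raw readings from Kafka. The watermark tolerates events that
# arrive up to 5 seconds out of order.
SOURCE_DDL = kafka_table_ddl(
    "sensor_readings",
    [
        "sensor_id STRING",
        "temperature DOUBLE",
        "event_time TIMESTAMP(3)",
        "WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND",
    ],
    topic="sensor-readings",
    extra_options={"scan.startup.mode": "earliest-offset"},
)

# Sink: one row per sensor per window, written back to Kafka.
SINK_DDL = kafka_table_ddl(
    "sensor_averages",
    [
        "sensor_id STRING",
        "window_start TIMESTAMP(3)",
        "window_end TIMESTAMP(3)",
        "avg_temperature DOUBLE",
    ],
    topic="sensor-averages",
)

# Transformation: average temperature per sensor over 1-minute tumbling windows.
PIPELINE_SQL = """
INSERT INTO sensor_averages
SELECT sensor_id, window_start, window_end, AVG(temperature) AS avg_temperature
FROM TABLE(
  TUMBLE(TABLE sensor_readings, DESCRIPTOR(event_time), INTERVAL '1' MINUTE)
)
GROUP BY sensor_id, window_start, window_end
""".strip()

STATEMENTS = [SOURCE_DDL, SINK_DDL, PIPELINE_SQL]

if __name__ == "__main__":
    for statement in STATEMENTS:
        print(statement + ";\n")
```

Given a Flink installation with the Kafka SQL connector available, each statement could be pasted into the Flink SQL client or submitted through PyFlink's `TableEnvironment.execute_sql`. The continuously running `INSERT INTO` is what replaces the periodic batch job.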
Speakers
Olena Kutsenko
Francesco Tisiot