FOSDEM 2022

Build an Open Source Streaming Data Pipeline

Track: HPC, Big Data, and Data Science devroom
Room: D.hpc
Day: Saturday
Start: 11:00
End: 11:30

No conversation about Big Data is complete without Apache Kafka and Apache Flink, a widely used open source combination for high-volume streaming data pipelines.

In this talk we'll explore how moving from long-running batches to streaming data changes the game completely. We'll show how to build a streaming data pipeline, starting with Apache Kafka for storing and transmitting high-throughput, low-latency messages. Then we'll add Apache Flink, a distributed stateful compute engine, to create complex streaming transformations using familiar SQL statements.
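
As a rough illustration of the pipeline shape described above (not the speakers' actual demo), the sketch below builds three Flink SQL statements in Python: a Kafka-backed source table, a sink that prints results, and a one-minute tumbling-window aggregation. The table names, columns, topic, consumer group, and broker address are all hypothetical. The optional `run_pipeline` helper assumes PyFlink is installed and the Kafka SQL connector JAR is available to Flink.

```python
def build_statements(topic, bootstrap_servers):
    """Return Flink SQL statements for a small Kafka -> Flink pipeline.

    All table and column names here are made up for illustration.
    """
    # Source table: reads JSON messages from a Kafka topic and declares
    # an event-time watermark that tolerates 5 seconds of lateness.
    source = f"""
        CREATE TABLE clicks (
            user_id STRING,
            page STRING,
            event_time TIMESTAMP(3),
            WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
        ) WITH (
            'connector' = 'kafka',
            'topic' = '{topic}',
            'properties.bootstrap.servers' = '{bootstrap_servers}',
            'properties.group.id' = 'demo-group',
            'scan.startup.mode' = 'earliest-offset',
            'format' = 'json'
        )
    """
    # Sink table: the 'print' connector writes each result row to stdout.
    sink = """
        CREATE TABLE clicks_per_minute (
            window_start TIMESTAMP(3),
            window_end TIMESTAMP(3),
            user_id STRING,
            click_count BIGINT
        ) WITH (
            'connector' = 'print'
        )
    """
    # Continuous query: count clicks per user in one-minute tumbling windows.
    query = """
        INSERT INTO clicks_per_minute
        SELECT window_start, window_end, user_id, COUNT(*) AS click_count
        FROM TABLE(
            TUMBLE(TABLE clicks, DESCRIPTOR(event_time), INTERVAL '1' MINUTE)
        )
        GROUP BY window_start, window_end, user_id
    """
    return [source, sink, query]


def run_pipeline(statements):
    """Submit the statements with PyFlink (assumes PyFlink and a Kafka broker)."""
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    for statement in statements:
        t_env.execute_sql(statement)


if __name__ == "__main__":
    # Print the SQL only; call run_pipeline(...) once a cluster is available.
    for statement in build_statements("clicks", "localhost:9092"):
        print(statement)
```

Unlike a batch job, the `INSERT INTO` statement here never finishes on its own: it keeps emitting a row per user each time a one-minute window closes.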

This session is aimed at data professionals who are ready to embrace open source streaming and make their data fly.