Apache Flink: streaming done right
- Track: HPC, Big Data and Data Science devroom
- Room: AW1.126
- Day: Sunday
- Start: 16:00
- End: 16:25
Data streaming is gaining popularity, as it offers decreased latency, a radically simplified data infrastructure architecture, and the ability to cope with new data that is generated continuously. Apache Flink is a full-featured stream processing framework with:
- Easy-to-use APIs embedded in Java and Scala that simplify stream analytics, yet provide powerful tools to deal with time and uncertainty
- Throughput of millions of events per second per core
- Latencies as low as the millisecond range
- Exactly-once consistency guarantees, and the ability to realize distributed transactional data movement between systems (e.g., between Kafka and HDFS)
- Ease of configuration and separation between application logic and fault tolerance via a novel asynchronous checkpointing algorithm
- No single point of failure
- Integration with popular open source infrastructure (e.g., Hadoop, HBase, Kafka, Cascading, Elasticsearch, …)
- Support for event time and out-of-order arrivals with flexible windows, watermarks, and triggers
- Batch processing as a special case of stream processing, including dedicated libraries for machine learning and graph processing, managed memory (on- and off-heap), and query optimization
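To make the event-time bullet concrete, here is a minimal sketch in plain Python of how watermarks let tumbling windows tolerate out-of-order arrivals. It is an illustration of the concept only, not Flink's API: the function name `tumbling_event_time_counts`, its parameters, and the simplified firing rule (a window fires once the watermark reaches its end, and later events for it are dropped) are all assumptions made for this example.

```python
from collections import defaultdict


def tumbling_event_time_counts(events, window_size, max_out_of_orderness):
    """Count events per (key, window) by event time, not arrival order.

    events: iterable of (key, event_timestamp) pairs in ARRIVAL order.
    Returns (results, late): results is a list of (key, window_start, count)
    in the order windows fire; late lists events that arrived after their
    window had already fired.
    """
    open_windows = defaultdict(int)  # (window_start, key) -> running count
    results, late = [], []
    watermark = float("-inf")

    def fire_ready(up_to):
        # Emit every open window whose end is at or before the watermark.
        ready = sorted(w for w in open_windows if w[0] + window_size <= up_to)
        for start, key in ready:
            results.append((key, start, open_windows.pop((start, key))))

    for key, ts in events:
        start = ts - ts % window_size  # tumbling window this event belongs to
        if start + window_size <= watermark:
            late.append((key, ts))  # window already fired: treat as late
        else:
            open_windows[(start, key)] += 1
        # Hypothetical watermark rule: trail the largest timestamp seen so far.
        watermark = max(watermark, ts - max_out_of_orderness)
        fire_ready(watermark)

    fire_ready(float("inf"))  # end of input: flush the remaining windows
    return results, late


# Timestamp 3 arrives after 12 but is still counted in window [0, 10);
# timestamp 2 arrives after that window has fired and is reported as late.
arrivals = [("a", 1), ("a", 4), ("a", 12), ("a", 3), ("a", 16), ("a", 2), ("a", 27)]
results, late = tumbling_event_time_counts(arrivals, window_size=10, max_out_of_orderness=5)
```

A larger `max_out_of_orderness` admits more stragglers at the cost of firing windows later, which is the latency and completeness trade-off that watermarks expose.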
Flink is used at several companies, including ResearchGate, Bouygues Telecom, the Otto Group, and Capital One, and has a large and active developer community of well over 140 contributors. In this talk, we provide an overview of the system and its streaming-first philosophy, as well as the project roadmap and vision: fully unifying the now-separate worlds of “batch” and “streaming” analytics.
Apache Flink is a full-featured streaming framework with a unique combination of features such as high throughput, millisecond latency, strong consistency, and support for event time and out-of-order streams. Flink also fully supports classic batch processing as a special case of stream processing, incorporating optimizations such as managed memory (on- and off-heap) and program optimization. Flink is used at several companies, including ResearchGate, Bouygues Telecom, the Otto Group, and Capital One, and has a large and active developer community of well over 120 contributors.
Speakers
Till Rohrmann