Brussels / 30 & 31 January 2016


FlinkML: Large-Scale Machine Learning for Apache Flink

Apache Flink is an open-source platform for distributed stream and batch data processing. In this talk we will show how Flink's streaming engine and native support for iterations make it an excellent candidate for developing large-scale machine learning algorithms.

This talk will focus on FlinkML, a new effort to bring scalable machine learning tools to the Flink community. We will introduce the library, illustrate how we use state-of-the-art algorithms to make FlinkML scale, and discuss the challenges and design decisions involved in building a robust and scalable machine learning library.

Finally, if time permits, we will demonstrate how to perform interactive analysis using FlinkML and the Apache Zeppelin notebook environment.


Theodore Vasiloudis