FOSDEM 2016

Streaming Architecture: Why Flow Instead of State?

Track: HPC, Big Data and Data Science devroom
Room: AW1.126
Day: Sunday
Start: 16:30
End: 16:55

Until recently, batch processing was the standard model for big data. This is largely due to the strong influence of the original MapReduce implementation in Hadoop and the difficulty of replacing MapReduce within the original Hadoop framework.

More recently, there has been a shift to streaming architectures built on tools such as Apache Spark and Apache Kafka. These architectures offer surprisingly large benefits in simplicity and robustness, but they also differ markedly from earlier message-queuing designs. The design changes in these newer systems allow far higher scalability and make fault tolerance relatively simple to achieve while maintaining good latency.

In this talk I will describe the key design tools and best-practice techniques used in modern systems, including percolators, the big-data oscilloscope, replayable queues, state-point queuing and universal micro-architectures. The benefits that I will highlight include:

a decrease in total system complexity
flexible throughput/latency trade-offs
fault tolerance without the difficulties of the lambda architecture and
easy debuggability

I will detail several Apache projects that attempt to support flow computing.
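
To make the idea of a replayable queue concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the design presented in the talk and not the API of any Apache project: the names `ReplayableLog` and `running_total` are hypothetical, and the log lives in memory. The point it tries to show is that when records are kept after they are read and each consumer tracks its own offset, a failed consumer can rebuild its state by rewinding and re-reading, and a new consumer can be attached later without touching the producer.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ReplayableLog:
    """Append-only in-memory log (hypothetical stand-in for a replayable queue)."""
    records: List[Any] = field(default_factory=list)
    # Committed read position per consumer name.
    offsets: Dict[str, int] = field(default_factory=dict)

    def append(self, record: Any) -> int:
        """Append a record and return its offset. Reading never removes records."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, start: int) -> List[Any]:
        """Return every record from offset `start` to the end of the log."""
        return self.records[start:]

    def commit(self, consumer: str, offset: int) -> None:
        """Record the next offset this consumer should read from."""
        self.offsets[consumer] = offset

    def committed(self, consumer: str) -> int:
        """Return the consumer's committed offset, or 0 if it has never committed."""
        return self.offsets.get(consumer, 0)


def running_total(log: ReplayableLog, consumer: str, state: int = 0) -> int:
    """Fold all unread records into `state`, then commit the new position."""
    start = log.committed(consumer)
    for value in log.read(start):
        state += value
    log.commit(consumer, len(log.records))
    return state


log = ReplayableLog()
for amount in [5, 7, 11]:
    log.append(amount)

# First pass: the "billing" consumer reads the whole log.
total = running_total(log, "billing")

# Simulated crash: the in-memory state is lost, so rewind and replay.
log.commit("billing", 0)
rebuilt = running_total(log, "billing")

# A consumer added later sees the full history as well.
audit_total = running_total(log, "audit")
```

In this sketch recovery is simply a rewind followed by a re-read, which is one way to see why replay can make fault tolerance and debugging simpler than in a design where messages are deleted once delivered.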

Speakers

Tugdual Grall

FOSDEM16

Brussels / 30 & 31 January 2016
