Building A Real-Time Analytics Dashboard with Streamlit, Apache Pinot, and Apache Pulsar
Best of Both Worlds with Event Streaming and Real-Time Analytics
- Track: Fast and Streaming Data devroom
- Room: K.4.201
- Day: Saturday
- Start: 13:10
- End: 13:40
- Video only: k4201
Real-time data analytics is becoming very popular because of the value and insights it brings to a business. It relies on super-fast data ingestion, which is made possible by a powerful event-driven streaming platform. Two important open source Apache projects have risen to serve this purpose: Apache Pulsar for event streaming and Apache Pinot for real-time analytics. We take a look at how these two projects integrate well together to achieve the blazingly fast results we're after, and show you a use case that we've built.
When you hear "decision maker", it's natural to think "C-suite" or "executive". But these days, we're all decision makers. Restaurant owners, bloggers, big-box shoppers, diners - we all have important decisions to make and need instant, actionable insights. To provide these insights to end users like us, businesses need access to fast, fresh analytics.
In this session we will learn how to build our own real-time analytics application on top of a streaming data source using Apache Pulsar, Apache Pinot, and Streamlit. Pulsar is a distributed, open source pub-sub messaging and streaming platform for real-time workloads; Pinot is an OLAP database designed for ultra-low-latency analytics; and Streamlit is a Python-based tool that makes it super easy to build data apps.
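To give a feel for the streaming side, here is one way to publish events to Pulsar with its Python client. This is only a minimal sketch under our own assumptions: the pulsar-client package, a broker running on localhost, and a hypothetical `orders` topic and event schema rather than the exact data source used in the demo.

```python
# Minimal sketch: publish JSON events to a Pulsar topic with the Python client.
# Assumes the pulsar-client package and a broker on localhost; the topic name
# and event fields are hypothetical stand-ins for the talk's data source.
import json
import random
import time

import pulsar

client = pulsar.Client("pulsar://localhost:6650")
producer = client.create_producer("orders")  # hypothetical topic name

try:
    while True:
        event = {
            "order_id": random.randint(1, 1_000_000),
            "amount": round(random.uniform(1, 100), 2),
            "ts": int(time.time() * 1000),
        }
        # Pulsar messages are byte payloads, so each event is encoded as JSON.
        producer.send(json.dumps(event).encode("utf-8"))
        time.sleep(0.5)
finally:
    client.close()
```

A Pinot real-time table can then be configured to consume from that topic, which is the ingestion step described next.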
After introducing each of these tools, we'll stream data into Pulsar using its Python client, ingest that data into a Pinot real-time table, and write some basic queries using Pinot's Python SDK. Once we've done that, we'll bring everything together with an auto-refreshing Streamlit dashboard so that we can see changes to the data as they happen. There will be lots of graphs and other visualisations!
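As a companion sketch of the query-and-dashboard side, the snippet below queries Pinot through the pinotdb DB-API driver and renders the results in a Streamlit app that re-runs on a timer. The table name, columns, and the use of the third-party streamlit-autorefresh package are assumptions for illustration; the dashboard shown in the talk may be wired up differently.

```python
# Minimal sketch: query a Pinot real-time table and show the results in an
# auto-refreshing Streamlit dashboard. Assumes the pinotdb driver, the
# streamlit-autorefresh helper package, and a hypothetical "orders" table
# fed from the Pulsar topic in the previous sketch.
import pandas as pd
import streamlit as st
from pinotdb import connect
from streamlit_autorefresh import st_autorefresh

st.title("Recent orders")

# Re-run the script every 2 seconds so the charts pick up fresh data.
st_autorefresh(interval=2000, key="refresh")

# Pinot's broker answers SQL queries on port 8099 by default.
conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
curs = conn.cursor()
curs.execute("SELECT ts, amount FROM orders ORDER BY ts DESC LIMIT 1000")
df = pd.DataFrame(curs.fetchall(), columns=["ts", "amount"])

st.metric("Orders returned", len(df))
st.line_chart(df.set_index("ts")["amount"])
```

Running `streamlit run dashboard.py` (hypothetical file name) would serve the page locally and redraw the metric and chart on every refresh tick.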
This session is aimed at application developers and data engineers who want to quickly make sense of streaming data.
Speakers
Mark Needham
Mary Grygleski