BEGIN:VCALENDAR VERSION:2.0 PRODID:-//Pentabarf//Schedule 0.3//EN CALSCALE:GREGORIAN METHOD:PUBLISH X-WR-CALDESC;VALUE=TEXT:Monitoring and Observability devroom X-WR-CALNAME;VALUE=TEXT:Monitoring and Observability devroom X-WR-TIMEZONE;VALUE=TEXT:Europe/Brussels BEGIN:VEVENT METHOD:PUBLISH UID:10735@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T090000 DTEND:20200202T090500 SUMMARY:Intro DESCRIPTION:
Introduction and welcome to the monitoring and observability devroom
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/monitoring_and_observability_devroom_intro/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Richard Hartmann":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:9442@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T091000 DTEND:20200202T093500 SUMMARY:Distributed Tracing for beginners DESCRIPTION:Distributed tracing is a tool that belongs to every developer's tool belt, but what it actually can do remains a mystery to most developers.
In this slideless talk, we will introduce you to the world of distributed tracing by developing a cloud native application from scratch and applying all important distributed tracing concepts in practice, at first by hand and then by using existing libraries to automate our work.
You will learn not only what distributed tracing is, but how it works, what it can do and what it can’t. By the end of this talk, you will have working knowledge to start using distributed tracing tools with your new projects, as well as with your legacy ones.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/tracing_beginners/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Juraci Paixão Kröhling":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10168@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T095000 DTEND:20200202T101500 SUMMARY:Grafana: Successfully correlate metrics, logs, and traces DESCRIPTION:This talk presents Grafana's current capabilities for integrating metrics, logs, and traces, and shows how to set up both Grafana and application code so that all three can be correlated in Grafana. It assumes some familiarity with Grafana to follow the how-to steps but should be suitable for beginners. Afterwards, it shows the future direction of Grafana in the context of "Experiences", aimed at an even more seamless experience when correlating data from multiple data sources.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/tracing_grafana/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Andrej Ocenas":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10130@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T103000 DTEND:20200202T105500 SUMMARY:Jaegertracing in Ceph DESCRIPTION:Jaeger and OpenTracing provide ready-to-use tracing services for distributed systems and are becoming a widely used de facto standard because of their ease of use. By making use of these libraries, Ceph can reach a much-improved monitoring state, with visibility into its background distributed processes. This would, in turn, improve the way Ceph is debugged, “making Ceph more transparent” when identifying abnormalities. In this session, the audience will learn about using distributed tracing in large-scale distributed systems like Ceph, get an overview of Jaegertracing in Ceph, and see how it can be used for debugging Ceph.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/tracing_ceph/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Deepika Upadhyay":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10171@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T111000 DTEND:20200202T113500 SUMMARY:Stories around ModBus DESCRIPTION:Society would end if all ModBus stopped working overnight. Good thing it has zero security built in. Still, it's useful to get data out of industrial systems, be they a datacenter or a power plant.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/modbus_2020/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Richard Hartmann":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10156@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T115000 DTEND:20200202T121500 SUMMARY:Monitoring strawberries DESCRIPTION:According to the United Nations, 2.5 billion more people will be living in cities by 2050. This trend has caused indoor farming to draw a lot of attention and effort in recent years, in an attempt to scale the production of highly nutritious, healthy food inside cities.
Over the past 3 years, Agricool has recycled 20 industrial containers into farms that grow strawberries, herbs and salads, in the very heart of cities, and without any pesticides. These urban farms are currently operated in Paris and Dubai.
Operating a fleet of indoor farms presents a diverse set of observability challenges. At the most traditional end of the observability spectrum, engineers rely on devops tools to operate computers, microservices, and an IoT infrastructure embedded inside the farms. At the other end, living organisms like strawberry plants bring their own observability requirements, such as disease detection, physiological measurements, nutrient absorption, water analysis, or the rate of exposure to pollinating bumblebees.
The purpose of this talk is to highlight observability challenges and best practices that are specific to indoor farming, and to illustrate them through the lessons learned at Agricool when building observability pipelines.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/strawberries/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Jean-Marc Davril":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10244@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T123000 DTEND:20200202T125500 SUMMARY:Querying millions to billions of metrics with M3DB's inverted index DESCRIPTION:The cardinality of monitoring data we are collecting today continues to rise, in no small part due to the ephemeral nature of containers and compute platforms like Kubernetes. Querying a flat dataset composed of an increasing number of metrics requires searching through millions and in some cases billions of metrics to select a subset to display or alert on. The ability to use wildcards or regex within the tag names and values of these metrics and traces is becoming less of a nice-to-have feature and more of a requirement as ad-hoc exploratory queries grow in popularity.
In this talk we will look at how Prometheus introduced the concept of a reverse index existing side-by-side with a traditional column-based TSDB in a single process. We will then walk through the evolution of M3’s metric index, starting with ElasticSearch and evolving over the years to the current M3DB reverse index. We will give an in-depth overview of the alternate designs and dive deep into the architecture of the current distributed index and the optimizations we’ve made in order to fulfill wildcard and regex queries across billions of metrics.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/m3db/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Rob Skillington":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10209@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T131000 DTEND:20200202T133500 SUMMARY:Secret History of Prometheus Histograms DESCRIPTION:Representing distributions in a metrics-based monitoring system is both important and hard. Doing it right unlocks many powerful use cases that would otherwise require expensive event processing. Prometheus offers the somewhat weirdly named Histogram and Summary metric types for distributions. How have they become what they are today with all their weal and woe? To help understand the present, let's shed light on the past. Studying this piece of Prometheus's history will also allow a glimpse of the bigger picture, why certain things are the way they are in Prometheus, and which parts of the original vision are still awaiting fulfillment.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/histograms/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Björn Rabenstein (Beorn)":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10216@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T135000 DTEND:20200202T141500 SUMMARY:Are You Testing Your Observability? Patterns for Instrumenting Your Services DESCRIPTION:Observability is the key to understanding how your application runs and behaves in action. This is especially true for distributed environments like Kubernetes, where users run Cloud-Native microservices.
Among many other observability signals like logs and traces, the metrics signal has a substantial role. Sampled measurements observed throughout the system are crucial for monitoring the health of the applications, and they enable real-time, actionable alerting. While there are many robust open-source libraries, in various languages, that allow us to easily instrument services for backends like Prometheus, there are still numerous possibilities to make a mistake or misuse those tools.
During this talk, two engineers from Red Hat, Kemal and Bartek (Prometheus and Thanos project maintainer), will discuss valuable patterns and best practices for instrumenting your application. The speakers will go through common pitfalls and failure cases while sharing valuable insights and methods to avoid those mistakes. In addition, this talk will demonstrate how to leverage unit testing to verify the correctness of your observability signals, how it helps, and why it is important. Last but not least, the talk will include a demo of an example instrumented application based on our experience and the projects we maintain.
The audience will leave knowing how to answer the following important questions:
What are the essential metrics that services should have? Should you test your observability? What are the ways to test it on a unit-test level? What are the common mistakes while instrumenting services and how to avoid them?
And more!
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/testing_observability/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Bartek Plotka":invalid:nomail ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Kemal Akkoyun":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:9923@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T143000 DTEND:20200202T145500 SUMMARY:How to measure Linux Performance Wrong DESCRIPTION:In this presentation, we will look at typical mistakes measuring or interpreting Linux Performance. Do you use LoadAvg to assess if your CPU is overloaded or Disk Utilization to see if your disks are overloaded? We will look into these and a number of other metrics that are often misunderstood and/or misused as well as provide suggestions for better ways to measure Linux Performance.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/measure_linux_performance/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Peter Zaitsev":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:9657@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T151000 DTEND:20200202T153500 SUMMARY:From Zero to Useless to Hero: Make Runtime Data Useful in Teams DESCRIPTION:We introduced distributed tracing, central logging with trace correlation and monitoring with Prometheus and Grafana in a large internationally distributed software development project from the beginning. The result: Nobody used it.
In this talk we share the good and not-so-good lessons we have learned while introducing and operating the observability tools. We show which extensions and conventions were necessary in order to bring about a cultural change and to awaken enthusiasm for these tools. Today the tools are first-class citizens, and people complain loudly when they are not available.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/useless/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Florian Lautenschlager":invalid:nomail ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Robert Hoffmann":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:9945@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T155000 DTEND:20200202T161500 SUMMARY:Grafana-As-Code: Fully reproducible Grafana dashboards with Grafonnet DESCRIPTION:Grafana configuration can nowadays be fully done as code, which enables code review, code reuse, and in general better workflows when working with dashboards.
This talk will present Grafonnet, a Jsonnet library to generate Grafana dashboards, along with some tips and tricks on how to use it efficiently and how to fully manage your Grafana instances from code. We will also explore how Jsonnet and Grafonnet enable collaboration on dashboards using Mixins, and explain how to push dashboards to Grafana, either using Kubernetes or directly to the Grafana API.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/grafana_as_code/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Julien Pivotto":invalid:nomail ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Malcolm Holmes":invalid:nomail END:VEVENT BEGIN:VEVENT METHOD:PUBLISH UID:10211@FOSDEM20@fosdem.org TZID:Europe-Brussels DTSTART:20200202T163000 DTEND:20200202T165500 SUMMARY:Monitoring of a Large-Scale University Network: Lessons Learned and Future Directions DESCRIPTION:The complexity of network monitoring strongly depends on the size of the network under observation. Challenges in monitoring large-scale networks arise not only from dealing with a large volume of traffic, but also from keeping track of all traffic sources, destinations, and who-talks-to-whom communications. Analyzing this information makes it possible to uncover new behaviors that would not have been visible by merely observing common metrics such as bytes and packets. The drawback is that extra pressure is put on the monitoring system as well as on the downstream data- and timeseries-stores.
This talk presents a case study based on the monitoring of a large-scale university network. Challenges faced, findings, and lessons learned will be examined. It will be shown how to make sense of the input data to properly manage and reduce its scale as early as possible in the monitoring system. The discussion will also highlight the advantages and limitations of the open-source software components of the monitoring system. In particular, the open-source network monitoring tool ntopng and the timeseries-store InfluxDB will be considered. It will be shown what happens when ntopng and InfluxDB are pushed to their limits and beyond, and what can be done to ensure their smooth operation. Relevant findings, behaviors uncovered in the network traffic, and future directions will conclude the talk. The intended audience is technical and managerial individuals who are familiar with network monitoring.
CLASS:PUBLIC STATUS:CONFIRMED CATEGORIES:Monitoring and Observability URL:https:/fosdem.org/2020/schedule/2020/schedule/event/monitoring_large_scale_uni_network/ LOCATION:UD2.120 (Chavanne) ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Simone Mainardi":invalid:nomail ATTENDEE;ROLE=REQ-PARTICIPANT;CUTYPE=INDIVIDUAL;CN="Tobias Appel":invalid:nomail END:VEVENT END:VCALENDAR