Observability Pipeline

September 9, 2025

Introduction

An observability pipeline is the end-to-end flow that captures, processes, and visualizes logs and metrics to provide insights into system health, performance, and user activity.

There are several ways to build such a pipeline using different tools and technologies. We will try to understand its components with the help of a simple example.

Pipeline Overview

An observability pipeline typically handles several operations on telemetry data, such as collection, processing, storage, and visualization.

Telemetry Data: Information collected about a system's performance, behavior, and health that is transmitted for monitoring and analysis.

The observability pipeline can include many components and operations as shown in the diagram below:

Pipeline components often overlap with those of data pipelines, and technology choices vary by architecture and scale.

Example

Let's build a very simple pipeline where we ingest user location details and view the users on a world map. We will use the following tools:

- Filebeat to collect and ship the logs
- Elasticsearch to store and index them
- Kibana to visualize them

Make sure you have Docker already installed on your system; we will use it for a quick setup. If you are not familiar with it, the official Docker documentation is a good place to get some basic understanding.

Our goal is to generate some random logs, have Filebeat ingest them into Elasticsearch, and use Kibana to view them on a world map. This small exercise will help us understand the pipeline better.

Generating Sample Logs

We need a small program to generate the random logs that Filebeat will ingest. Save the following script as log-generator.py:

import json
import time
import random

# Pool of sample usernames to pick from.
users = ["mark", "bob", "charlie", "david", "james", "carl", "jennie", "jane", "alex"]

log_file = "logs.json"

while True:
    # Build one log entry with a random user and random coordinates.
    user = random.choice(users)
    log = {
        "username": user,
        "location": {
            "lat": round(random.uniform(-90, 90), 6),
            "lon": round(random.uniform(-180, 180), 6)
        }
    }
    # Append the entry as a single JSON line, then wait a second.
    with open(log_file, "a") as f:
        f.write(json.dumps(log) + "\n")
    time.sleep(1)

This script keeps appending logs to the logs.json file, one entry per second, in the following format:

...
{"username": "david", "location": {"lat": 46.638099, "lon": -141.664286}}
{"username": "jennie", "location": {"lat": 2.067681, "lon": 6.107782}}
{"username": "charlie", "location": {"lat": -68.763366, "lon": -138.960341}}
{"username": "bob", "location": {"lat": -62.107125, "lon": -119.426837}}
...

Docker Compose Setup

Now, let's create docker-compose.yml:

version: "3.9"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.15.0
    container_name: elasticsearch
    environment:
      - discovery.type=single-node    # no clustering needed for this demo
      - xpack.security.enabled=false  # disables auth/TLS; local testing only
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
    ports:
      - "9200:9200"
    volumes:
      - es_data:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:8.15.0
    container_name: kibana
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

  filebeat:
    image: docker.elastic.co/beats/filebeat:8.15.0
    container_name: filebeat
    user: root
    volumes:
      # Bind-mount the Filebeat config and the generated log file.
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - ./logs.json:/usr/share/filebeat/logs/logs.json
    depends_on:
      - elasticsearch

volumes:
  es_data:

Filebeat Configuration

We have added Elasticsearch, Kibana, and Filebeat. For Filebeat we need to supply additional configuration through filebeat.yml, telling it which file to read, how to parse it, and where to send the events:

filebeat.inputs:
  - type: log   # the classic log input; 'filestream' is its successor in 8.x
    enabled: true
    paths:
      - /usr/share/filebeat/logs/logs.json
    # Parse each line as JSON and put the fields at the root of the event.
    json.keys_under_root: true
    json.add_error_key: true
    json.overwrite_keys: true

output.elasticsearch:
  hosts: ["http://elasticsearch:9200"]

Running the Pipeline

Once we have these files, we can run the following commands.

1. Execute the Python script, which keeps adding a new log every second. Run it first so that logs.json exists before the containers start (otherwise Docker creates the bind-mount target as a directory):

python3 log-generator.py

2. Run docker-compose in a separate terminal:

docker-compose up -d

If the Filebeat container exits complaining about the ownership or permissions of filebeat.yml (a common gotcha when bind-mounting the config), make sure the file is owned by root or start Filebeat with the --strict.perms=false flag.

Accessing Services

This will spin up the containers; the first run can take from a few seconds to a few minutes while the images are pulled. Once all the containers are running, we can check the following URLs:

- Elasticsearch: http://localhost:9200
- Kibana: http://localhost:5601
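The same check can be done from code. Here is a minimal sketch, assuming the ports mapped in docker-compose.yml and security disabled as configured above:

import json
import urllib.request

# Query cluster health on the port mapped in docker-compose.yml.
with urllib.request.urlopen("http://localhost:9200/_cluster/health") as resp:
    health = json.loads(resp.read())

# "green" or "yellow" means the single-node cluster is ready to ingest.
print(health["status"])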

Kibana: Create Data View

In Kibana we will need to create a data view, which specifies which Elasticsearch indices to use. This way we will be able to see our logs based on the time filter configured at the top right.

In the side bar, go to Stack Management → Data Views, click Create data view, and set the index pattern to filebeat-* (Filebeat writes to a filebeat-* data stream by default).

(*The index pattern matches three sources in my case because I created some indices manually using Dev Tools; in your case it should match only one: the Filebeat data stream.)
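If you prefer an API over the UI, the data view can also be created through Kibana's data views API. A rough sketch, assuming Kibana is reachable on localhost:5601 without authentication:

import json
import urllib.request

# Create a "filebeat-*" data view via Kibana's data views API.
payload = {
    "data_view": {
        "title": "filebeat-*",          # index pattern to match
        "timeFieldName": "@timestamp",  # Filebeat sets this on every event
    }
}

req = urllib.request.Request(
    "http://localhost:5601/api/data_views/data_view",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "kbn-xsrf": "true"},
    method="POST",
)
print(urllib.request.urlopen(req).read().decode())

The kbn-xsrf header is required by Kibana on any write request.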

Once we save this data view, we can see the logs coming in and getting updated (adjust the time filter or press refresh).

This confirms that the logs are getting ingested properly.
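We can also confirm it directly against Elasticsearch by counting the documents in the data stream. A small sketch, assuming the default filebeat-* pattern and no authentication:

import json
import urllib.request

# Count the documents Filebeat has ingested so far.
with urllib.request.urlopen("http://localhost:9200/filebeat-*/_count") as resp:
    print(json.loads(resp.read())["count"])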

Visualizing User Locations on a Map

Now we want to view these locations on the map. Kibana's Maps app plots documents by a geo_point field, but with dynamic mapping our location field would be indexed as two plain floats, so it needs an explicit geo_point mapping first.
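One way to add that mapping is an index template. This is a minimal sketch, not from the original article: the template name and priority are arbitrary choices, security is assumed disabled, and the template must exist before Filebeat first creates the data stream (delete any existing filebeat-* data stream and restart Filebeat if needed):

import json
import urllib.request

# Index template that maps "location" as geo_point for filebeat-* data streams.
# NOTE: name and priority are arbitrary choices for this sketch, and it must be
# installed BEFORE Filebeat creates the data stream.
template = {
    "index_patterns": ["filebeat-*"],
    "data_stream": {},  # filebeat-* is a data stream, not a plain index
    "priority": 300,    # higher than Filebeat's own template so this one wins
    "template": {
        "mappings": {
            "properties": {
                "location": {"type": "geo_point"}
            }
        }
    },
}

req = urllib.request.Request(
    "http://localhost:9200/_index_template/filebeat-geopoint",
    data=json.dumps(template).encode(),
    headers={"Content-Type": "application/json"},
    method="PUT",
)
print(urllib.request.urlopen(req).read().decode())

With the geo_point mapping in place, open Analytics → Maps in Kibana, add a layer of type Documents, pick our data view, and select the location field; each user should now show up as a point on the world map.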

There are many other features in Kibana you can explore and experiment with to deepen your understanding.

Summary

The simple setup that we tried can be extended to more complex observability scenarios, enabling real-time insights into applications, infrastructure, and user activity across the globe.

The starting point of any observability pipeline is the data source itself, along with a clear understanding of what you expect from that data—whether it’s querying, visualization, alerting, or automation.