The end result of the code developed in this document can be found in the GitHub monorepo springboot-demo-projects, under the tag observability.
Observability is about understanding what your application is doing in production without having to add print statements and redeploy. When something breaks at 3 AM, you need to trace requests across services, see error logs in context, and understand performance bottlenecks without guessing.
Data Collection (The Application): You modify the app to include specialized code that collects various kinds of data about the app's internal state and the host server environment.
Data Storage (The Telemetry Backends): The collected data ships out of the application process and goes to corresponding, optimized telemetry backends, which are databases designed specifically for logs, metrics, or traces.
Visualization (The Dashboard): You use a powerful visualization tool (like Grafana) to pull the stored data from the backends and present it in cohesive, readable dashboards.
Grafana acts as the "single pane of glass" that unifies all three data types. It is a web-based visualization platform that connects to multiple telemetry backends simultaneously.
These are the most common choices in the Grafana ecosystem, but they are not the only options. Alternatives include Elasticsearch for logs, InfluxDB for metrics, and Jaeger for traces.
networks: Joins the monitoring network so apps can reach Tempo
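As a hedged sketch of what that compose entry might look like (the service and network names here are illustrative assumptions, not taken from the repository):

```yaml
# Hypothetical application service entry; names are illustrative.
services:
  spring-app:
    build: .
    networks:
      - monitoring   # joins the shared network so the app can reach Tempo

networks:
  monitoring:
    driver: bridge
```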
The observability stack includes:
Prometheus: Scrapes metrics from all services
Loki: Stores and indexes log data
Promtail: Collects Docker container logs and forwards them to Loki

Tempo: Receives and stores distributed traces
Grafana: Visualizes metrics, logs, and traces in unified dashboards
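A minimal docker-compose.yml sketch of such a stack might look like the following. Image tags, ports, and volume paths here are assumptions for illustration, not the repository's exact file:

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./observability/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  loki:
    image: grafana/loki:3.5.10
    command: -config.file=/etc/loki/local-config.yaml
    ports:
      - "3100:3100"

  promtail:
    image: grafana/promtail:latest
    volumes:
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock
      - ./observability/promtail-config.yml:/etc/promtail/config.yml

  tempo:
    image: grafana/tempo:latest
    command: -config.file=/etc/tempo/tempo.yml
    ports:
      - "4318:4318"   # OTLP HTTP trace ingest

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
      - loki
      - tempo
```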
Real-World Setups Look Different
In some projects, it's common to find observability services living in a completely separate Docker Compose project, or even managed by third-party providers like Datadog or Grafana Cloud. Bundling everything into a single docker-compose.yml here keeps things simple for documentation and makes the setup easier to follow.
Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus.
Dockerfile:
observability/loki.Dockerfile
```dockerfile
FROM alpine:latest AS builder
RUN mkdir -p /loki/chunks /loki/rules

FROM grafana/loki:3.5.10
COPY --from=builder --chown=10001:10001 /loki /loki
COPY observability/loki-config.yml /etc/loki/local-config.yaml
USER 10001
```
Uses a multi-stage build to create directories with correct permissions (Loki runs as user 10001).
Configuration:
observability/loki-config.yml
```yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

compactor:
  working_directory: /loki/compactor
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
  delete_request_store: filesystem

limits_config:
  retention_period: 360h  # 15 days, matches Prometheus and Tempo

# By default, Loki will send anonymous usage data to Grafana.
# This can be disabled by setting this to false
analytics:
  reporting_enabled: false
```
auth_enabled: false: Disables authentication for local development
storage.filesystem: Uses local filesystem storage (suitable for single-node setups)
retention_period: Keeps logs for 15 days (360 hours)
relabel_configs: Filters for only spring-* services and renames labels
pipeline_stages: Parses log lines to extract the log level and create indexed labels
The regex pattern trace_id=\S+ span_id=\S+ trace_flags=\S+ (?P<type>\w+) \S+ --- extracts the log level from your Spring Boot log format, enabling filtering by log type (INFO, ERROR, DEBUG, etc.) in Grafana.
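A hedged sketch of the corresponding Promtail scrape configuration (job name and label targets are assumptions; only the regex itself comes from the text above) might look like:

```yaml
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      # keep only containers whose name starts with spring-
      - source_labels: ['__meta_docker_container_name']
        regex: '/(spring-.*)'
        action: keep
      # rename the container name into a cleaner "container" label
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
    pipeline_stages:
      # extract the log level into a "type" capture group
      - regex:
          expression: 'trace_id=\S+ span_id=\S+ trace_flags=\S+ (?P<type>\w+) \S+ ---'
      # promote the captured group to an indexed label
      - labels:
          type:
```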
Loki: For logs, with trace ID extraction for correlation
Tempo: For traces, with links back to Loki logs
The exemplarTraceIdDestinations and derivedFields configurations enable trace-to-log correlation. When you see a metric spike, you can click to view the trace; when viewing logs, you can click the trace ID to see the full distributed trace.
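As an illustrative sketch of how these settings could be wired in a Grafana datasource provisioning file (the UIDs, URLs, and the trace_id matcher regex are assumptions, not the repository's exact configuration):

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    uid: prometheus
    url: http://prometheus:9090
    jsonData:
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: tempo    # metric exemplar -> trace

  - name: Loki
    type: loki
    uid: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - name: TraceID
          matcherRegex: 'trace_id=(\w+)'
          url: '$${__value.raw}'  # $$ escapes env expansion in provisioning files
          datasourceUid: tempo    # log line -> trace

  - name: Tempo
    type: tempo
    uid: tempo
    url: http://tempo:3200
    jsonData:
      tracesToLogsV2:
        datasourceUid: loki       # trace -> logs
```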
Application-level metrics with HTTP request rates, response times, and error rates
These are omitted from the patch due to their size (thousands of lines of JSON), but you can find them in the repository at observability/grafana/dashboards/.
When deploying to Coolify, the platform automatically detects the new services defined in your docker-compose.yml and starts them alongside your Spring Boot applications. You do not need to manually configure the monitoring stack.
The only additional step is to assign a domain to Grafana so you can access the dashboards:
In Coolify, find the Grafana service in your project
Click on it and set a domain (e.g., grafana.yourdomain.com)
Coolify will handle SSL certificates and routing
Grafana Environment Variables
Grafana expects the environment variables GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD to be set. Make sure to define them in your Coolify service configuration before starting the stack.
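For reference, a sketch of how the compose service could read those variables (the values themselves are placeholders supplied by Coolify, not hardcoded here):

```yaml
services:
  grafana:
    image: grafana/grafana:latest
    environment:
      # set the actual values in Coolify's service configuration
      - GF_SECURITY_ADMIN_USER=${GF_SECURITY_ADMIN_USER}
      - GF_SECURITY_ADMIN_PASSWORD=${GF_SECURITY_ADMIN_PASSWORD}
```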