What is OpenTelemetry (OTEL)?
OpenTelemetry (OTEL) is an open standard for generating, collecting, and exporting telemetry data. In Tines, this telemetry data consists of traces and spans. A trace represents a single end-to-end operation as it executes through the system (for example, a story execution or a background job run). Each trace is composed of one or more spans, where each span is a single step such as an HTTP call, a database query, a background job, or an action execution.
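To make the relationship between traces and spans concrete, here is a minimal sketch in Python. It is only an illustration of the concept: the Span and Trace classes, the span names, and the timings are hypothetical and are not the Tines data model or the OpenTelemetry SDK types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """One step inside a trace (hypothetical model for illustration only)."""
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        # Time spent in this single step.
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    """One end-to-end operation, made up of one or more spans."""
    trace_id: str
    spans: List[Span] = field(default_factory=list)

    def total_duration_ms(self) -> int:
        # Wall-clock time from the earliest span start to the latest span end.
        if not self.spans:
            return 0
        first_start = min(span.start_ms for span in self.spans)
        last_end = max(span.end_ms for span in self.spans)
        return last_end - first_start


def build_example_trace() -> Trace:
    # Assumed example: a story run with three sequential steps.
    return Trace(
        trace_id="story-run-1",
        spans=[
            Span("http_call", 0, 120),
            Span("db_query", 120, 150),
            Span("action_execution", 150, 400),
        ],
    )


if __name__ == "__main__":
    trace = build_example_trace()
    for span in trace.spans:
        print(f"{span.name}: {span.duration_ms} ms")
    print(f"total: {trace.total_duration_ms()} ms")
```

In a real deployment the spans are produced by Tines and viewed in your observability stack. The sketch only shows why a trace can answer "where did the time go?" while a single duration figure cannot.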
With self-hosted release v34.7.0, Tines can send OpenTelemetry traces to any observability stack that supports OTEL ingestion.
Read the following documentation on how to configure this new feature.
What are we trying to solve with OpenTelemetry?
When something in Tines feels slow, fails intermittently, or behaves differently under load, the default tools may not give you enough information to act on. You can look at system metrics, confirm that stories are running, and dig through logs. Even after doing all of that, the same questions remain:
What exactly is slow?
Where is the time actually going?
Is a specific story or action causing the problem?
Why is OpenTelemetry important?
This is where OpenTelemetry helps. It lets you see how a workflow moves through Tines, rather than just whether it finished. With tracing enabled, you can watch:
How requests and background jobs move through the system
Which parts of a run are really driving latency
How the different spans fit together inside a story
What changes when the system is under heavy load or stories back up
That changes conversations from:
“The system feels slow.”
to something more useful, like:
“Queue latency is zero, but a couple of actions in Story 42 are slow. We should fix the story, not the infrastructure.”
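As a rough illustration of how span data supports that kind of statement, the following Python sketch ranks the spans of one run by duration. The span names and timings are invented for the example, and slowest_spans is a hypothetical helper, not part of Tines or of any OpenTelemetry library. In practice you would run this kind of query in your observability tool.

```python
from typing import List, Tuple

# A span is reduced here to (name, duration in milliseconds).
SpanTiming = Tuple[str, float]


def slowest_spans(spans: List[SpanTiming], top_n: int = 2) -> List[SpanTiming]:
    """Return the top_n spans with the longest duration, slowest first."""
    return sorted(spans, key=lambda span: span[1], reverse=True)[:top_n]


def total_duration(spans: List[SpanTiming]) -> float:
    """Sum of all span durations (assumes the spans ran one after another)."""
    return sum(duration for _, duration in spans)


if __name__ == "__main__":
    # Assumed timings for a single story run: no queue wait, two slow actions.
    story_run = [
        ("queue_wait", 0.0),
        ("action: fetch_alerts", 4200.0),
        ("action: enrich_ip", 3100.0),
        ("db_query", 35.0),
    ]
    for name, duration in slowest_spans(story_run):
        print(f"{name}: {duration:.0f} ms")
    print(f"total: {total_duration(story_run):.0f} ms")
```

With these assumed numbers, the two actions account for almost all of the run time while the queue wait contributes nothing, which points at the story rather than the infrastructure.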
Read the following articles for help with:
OpenTelemetry: Designing and Implementing an Observability Stack
OpenTelemetry: Designing a Dashboard
OpenTelemetry: Practical Troubleshooting