🚀 Describe the new functionality needed
The telemetry system llama stack has today is a good start, but more work is needed to polish it into something production-ready.
Goals
The goal of this feature is to significantly simplify the telemetry system in llama stack so that:
- The developer experience for testing and capturing telemetry for new or existing services uses OpenTelemetry in a way that is simple and consistent across the project
- The telemetry that is captured is well documented, and easy to use and integrate with popular telemetry provider offerings such as Datadog, New Relic, Dynatrace, Jaeger, Prometheus, and Grafana
- It is as simple as possible to export telemetry to an OTLP collector from llama stack
- Power-user features are available for advanced cases, but are not necessary to expose to regular users
- Telemetry that is captured is secure by default, but can be configured to be more detailed as needed.
Design
User Facing Changes
Since the telemetry API is being removed, the provider should also be removed from the config. There are two reasons for this:
- providers are generally API implementations, which telemetry will no longer be
- it can be confusing for users to set OpenTelemetry config options in a config file, since they are always overwritten by the environment variables.
Because llama stack uses uvicorn, and OpenTelemetry auto-instrumentation is known not to propagate to the Python subprocesses uvicorn starts for each worker, we will need to initialize telemetry manually in llama stack. As a result, llama stack will ship with telemetry enabled by default; it will capture data that is always secure and export it via http/protobuf unless otherwise configured.
We will defer OTel configuration as much as possible to its predefined environment variables unless there is a very good reason not to do so.
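As a rough sketch, assuming the standard opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http packages (the function name here is illustrative, not the actual implementation), manual initialization that defers all endpoint and protocol configuration to the environment could look like:

```python
# Illustrative sketch of manual telemetry initialization; not the actual
# llama stack implementation. This must run in each uvicorn worker process,
# since auto-instrumentation is not inherited by the subprocesses uvicorn
# spawns for its workers.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

def setup_telemetry() -> None:
    # The OTLP exporter reads OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS,
    # etc. from the environment on its own, so no explicit config is passed here.
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(provider)
```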
A warning log will be written if the OTLP exporter endpoint or protocol is empty or incorrectly set, to alert users to configuration errors; otherwise the telemetry system fails silently.
To export data with a different protocol, the OTEL_EXPORTER_OTLP_PROTOCOL environment variable can be used. To export data to an OTLP collector at a custom location, OTEL_EXPORTER_OTLP_ENDPOINT can be set. To disable telemetry entirely, users can set OTEL_SDK_DISABLED=true. To disable capturing telemetry from a given service, they can use OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=sqlalchemy.
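A minimal sketch of the configuration warning described above (the helper name and exact checks are illustrative; the environment variables and protocol values are the standard OTel ones):

```python
import logging
import os

logger = logging.getLogger(__name__)

# Protocol values defined by the OTLP exporter specification.
_VALID_PROTOCOLS = {"grpc", "http/protobuf", "http/json"}

def warn_on_bad_otlp_config() -> None:
    # Illustrative check only; the real implementation may differ.
    endpoint = os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    protocol = os.environ.get("OTEL_EXPORTER_OTLP_PROTOCOL", "http/protobuf")
    if not endpoint:
        logger.warning(
            "OTEL_EXPORTER_OTLP_ENDPOINT is not set; telemetry will use the "
            "default local endpoint or fail silently"
        )
    if protocol not in _VALID_PROTOCOLS:
        logger.warning("Unrecognized OTEL_EXPORTER_OTLP_PROTOCOL: %r", protocol)
```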
New Config Structure
To simplify the telemetry workflow, we are changing the config to have a simple top-level option for enabling and disabling telemetry, which can be expanded later.
```yaml
version: 2
telemetry:
  enabled: True
apis:
- inference
- safety
- vector_io
providers:
  inference:
  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY:=}
  vector_io:
  - provider_id: faiss
    provider_type: inline::faiss
    config:
      kvstore:
        type: sqlite
        db_path: ~/.llama/faiss_store.db
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config: {}
server:
  port: 8321
```
Documentation Changes
Maintain a Telemetry subsection of the llama stack docs, which keeps detailed records of what custom telemetry data we capture for each API endpoint. Customers can reasonably assume what data gets captured automatically by OpenTelemetry, since that is standardized, so it does not need to be documented.
Internal Changes
Once the testing changes land, we can use them to make sure that the same baseline quality of telemetry data is exported.
I would advocate for the following internal changes to be made:
- fix: remove broken tracing middleware #3723: remove the custom tracing middleware and lean on built-in FastAPI instrumentation
- WIP: auto instrument #3733: use automatic instrumentation installation to make sure all ingress and egress points are being traced and observed at a baseline level, then improve from there later with manual instrumentation (see the sketch after this list).
- make capturing request/response bodies something that is disabled by default, but can be enabled. This prevents accidental capture of sensitive data like prompts or images.
- capture attributes more efficiently. We capture the same attributes multiple times in a given trace, causing inflated data volumes.
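As a rough illustration of the auto-instrumentation approach, using the standard OpenTelemetry instrumentor packages (the exact set of instrumentors is an assumption, not a decided list):

```python
# Illustrative wiring only; not the actual llama stack implementation.
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

app = FastAPI()

# Instrument ingress (HTTP routes) at a baseline level. Egress points such as
# HTTP clients or database engines would get their own instrumentors
# (e.g. HTTPXClientInstrumentor, SQLAlchemyInstrumentor) installed the same way.
FastAPIInstrumentor.instrument_app(app)
```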
Reference Architecture
Improve the scripts/telemetry library to work out of the box, and include a Grafana dashboard that shows off the telemetry data we capture across the stack.
💡 Why is this needed? What if we don't build it?
This makes the telemetry offering from llama stack complete, easy to use, and digestible to users. It gets out of their way as much as possible, and offers a reference architecture that they can adopt or lean on to consume llama stack telemetry with little or no effort.
Core Tasks
- Remove Telemetry API
- Create Instrumentation data test
- Implement automatic instrumentation installation
- Enrich auto-instrumentation so that it captures data in a way that conforms to the tested requirements
- Documentation for new user workflow
- Remove Telemetry Provider
Nice to Have
- Create a configurable way to enable/disable capture of request/response bodies
- Remove duplicate fields from telemetry data
- Power User documentation for things like enabling/disabling capture of HTTP bodies or headers