[OpAMP] Map fields from AgentToServer message to Agent fields#6400

Open
juliaElastic wants to merge 11 commits into elastic:main from
juliaElastic:opamp-data

Conversation

Contributor

@juliaElastic juliaElastic commented Feb 20, 2026

What is the problem this PR solves?

Map data from OpAMP AgentToServer message to Agent fields in .fleet-agents

How does this PR solve the problem?

Add OpAMP message fields to Agent document:

  • convert capabilities to string array
  • convert effective config to a JSON object and sanitize it
  • add health and set last_checkin_status, last_checkin_message, etc.
  • add identifying and non-identifying attributes
  • add sequence number
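As a rough illustration of the first bullet: OpAMP reports capabilities as a bitmask, so converting them to a string array amounts to testing each bit. The sketch below is not the code from this PR. The bit values follow the AgentCapabilities enum in the OpAMP specification (only a subset is listed), while the helper name `capabilitiesToStrings` and the exact string names are assumptions made for the example.

```go
package main

import "fmt"

// capabilityNames lists a subset of the OpAMP AgentCapabilities bits.
// The string names are illustrative assumptions, not necessarily the
// values this PR writes to the agent document.
var capabilityNames = []struct {
	bit  uint64
	name string
}{
	{0x00000001, "ReportsStatus"},
	{0x00000002, "AcceptsRemoteConfig"},
	{0x00000004, "ReportsEffectiveConfig"},
	{0x00000800, "ReportsHealth"},
	{0x00002000, "ReportsHeartbeat"},
}

// capabilitiesToStrings (hypothetical helper) expands the bitmask into
// the names of the bits that are set, in ascending bit order.
func capabilitiesToStrings(mask uint64) []string {
	names := make([]string, 0, len(capabilityNames))
	for _, c := range capabilityNames {
		if mask&c.bit != 0 {
			names = append(names, c.name)
		}
	}
	return names
}

func main() {
	// ReportsStatus | ReportsEffectiveConfig | ReportsHealth
	fmt.Println(capabilitiesToStrings(0x1 | 0x4 | 0x800))
	// prints: [ReportsStatus ReportsEffectiveConfig ReportsHealth]
}
```

Bits that are not in the table are silently dropped in this sketch; a real mapping would need to decide whether to ignore or report unknown bits.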

How to test this PR locally

Follow instructions in https://github.com/elastic/fleet-server/blob/main/docs/opamp.md

Download and extract otel collector, e.g.:

curl -L -O https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.144.0/otelcol-contrib_0.144.0_darwin_arm64.tar.gz 

Create otel config to include system fields and internal telemetry, e.g. otel-opamp.yaml:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  resourcedetection:
    detectors: ["system","env"]
    system:
      hostname_sources: ["os"]
      resource_attributes:
        host.name:
          enabled: true
        host.arch:
          enabled: true
        os.description:
          enabled: true
        os.type:
          enabled: true

exporters:
  debug:
    verbosity: detailed
  elasticsearch/otel:
    endpoints: [ "http://localhost:9200" ]
    api_key: ${env:ES_API_KEY} 
    mapping:
      mode: otel
  otlp:
    endpoint: "http://localhost:4317"
    tls:
      insecure: true

extensions:
  opamp:
    server:
      http:
        endpoint: http://localhost:8220/v1/opamp
        tls:
          insecure: true
        headers:
          Authorization: ApiKey ${env:FLEET_ENROLLMENT_TOKEN}
    instance_uid: ${env:INSTANCE_UID}
    capabilities:
      reports_effective_config: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [debug]
    metrics:
      receivers: [otlp]
      processors: [resourcedetection]
      exporters: [elasticsearch/otel]
  extensions: [opamp]

  # publish collector internal telemetry
  telemetry:
    metrics:
      level: detailed
      readers:
        - periodic:
            interval: 30000
            exporter:
              otlp:
                protocol: grpc
                endpoint: http://localhost:4317

Create API keys and start otel collector:

 cd ~/Downloads/otelcol-contrib_0.144.0_darwin_arm64
 
 export INSTANCE_UID=<uuid> # e.g. "519b8d7a-2da8-7657-b52d-492a9de33313"
 export OTEL_RESOURCE_ATTRIBUTES="service.instance.id=$INSTANCE_UID" # to include instance id in internal telemetry data
 export ES_API_KEY=<api_key> # ES API key from observability onboarding UI
 export FLEET_ENROLLMENT_TOKEN=<enrollment_token> 
 ./otelcol-contrib --config ./otel-opamp.yaml
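If no UUID is at hand for INSTANCE_UID, one can be generated. The example value above has a version 7 layout, so the sketch below produces a UUIDv7 using only the Go standard library. The helper name `newUUIDv7` is an assumption made for this example, and it is also an assumption that any well-formed UUID is accepted here.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"time"
)

// newUUIDv7 (hypothetical helper) builds a version 7 UUID from a Unix
// timestamp in milliseconds plus random bits.
func newUUIDv7(unixMilli int64) (string, error) {
	var b [16]byte
	// Fill the non-timestamp bytes with random data.
	if _, err := rand.Read(b[6:]); err != nil {
		return "", err
	}
	// First 48 bits: Unix time in milliseconds, big-endian.
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], uint64(unixMilli))
	copy(b[:6], ts[2:])
	b[6] = (b[6] & 0x0f) | 0x70 // set version 7
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC variant bits (10xx)
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUIDv7(time.Now().UnixMilli())
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // use this value for INSTANCE_UID
}
```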

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

Related issues

Closes https://github.com/elastic/ingest-dev/issues/6982

@juliaElastic juliaElastic added the enhancement New feature or request label Feb 20, 2026
Contributor

mergify bot commented Feb 20, 2026

This pull request does not have a backport label. Could you fix it @juliaElastic? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@juliaElastic juliaElastic added backport-skip Skip notification from the automated backport with mergify skip-changelog labels Feb 20, 2026
@juliaElastic juliaElastic marked this pull request as ready for review February 20, 2026 11:37
@juliaElastic juliaElastic requested a review from a team as a code owner February 20, 2026 11:37
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Feb 20, 2026
Member

cmacknz commented Feb 23, 2026

Depending on merge order, the test in #6289 will have to be updated not to expect the updating state anymore.

ycombinator previously approved these changes Feb 25, 2026
Contributor

@ycombinator ycombinator left a comment

LGTM!

@juliaElastic
Contributor Author

@ycombinator Could you help merge the PR? It seems I no longer have access.

}

// if agents index doesn't exist yet, it will be created when the first agent document is indexed
if err != nil && strings.Contains(err.Error(), "index_not_found_exception") {
Contributor

Can we compare to es.ErrIndexNotFound here instead?

Contributor Author

done

Type: agentType,
},
LocalMetadata: data,
// Setting revision to 1, the collector won't receive policy changes and 0 would keep the collector in updating state
Contributor

qq: will the policy on Fleet never have a policy id > 1? If it can, then agents can appear as outdated.

Contributor Author

The UI hides the policy name and revision for collectors, as policies are not relevant for them (other than needing an enrollment token to authenticate with).

// suite.Equal(otelVersion, agentDoc.Agent.Version, "expected agent.version to match otelcol-contrib binary version")
// suite.Equal(1, agentDoc.Revision, "expected policy_revision_idx to be 1")
// suite.Contains(agentDoc.Tags, "otelcontribcol", "expected tags to contain otelcontribcol")
// suite.Equal("online", agentDoc.Status, "expected status to be online")
Contributor Author

Removed the status field check from the agent doc; the status is calculated by a runtime field in the Fleet UI.

Labels

  • backport-skip: Skip notification from the automated backport with mergify
  • enhancement: New feature or request
  • skip-changelog
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team

5 participants