[RFC] Advance Entity Field Set to Stage 1 #2461

tinnytintin10 · 2025-03-25T04:27:14Z

This PR advances the Entity Field Set RFC (0049) from Stage 0 (strawperson) to Stage 1 (draft).

Changes Since Stage 0

Since the initial Stage 0 proposal (PR #2434), the following additions have been made:

Added a "Usage" section highlighting how the entity field set enables normalized entity data querying and its role in the upcoming security solution inventory experience
Added "Source data" section explaining how the field set's taxonomy allows entity modeling from any data source
Added "Concerns" section addressing potential challenges (To Do)
Added subject matter experts to the "People" section
Created YAML schema definition in the rfcs/text/0049/ directory

Next Steps

After advancing to Stage 1, we plan to:

Implement experimental field definitions in the ECS schema
Gather feedback from early adopters
Refine the field definitions based on practical usage
Begin work toward Stage 2 criteria

tinnytintin10 · 2025-03-25T04:33:44Z

@mjwolf, to what level of detail are we supposed to document usage and source data sections in a stage 1 RFC? Does the current level of detail I provide suffice? Also, for the concerns section, are we supposed to update that during the PR review process or upfront (I guess a mix of both but wanted to clarify)? Thanks!

maxcold · 2025-04-01T13:39:46Z

rfcs/text/0049/entity.yml

+      field sets (e.g., host), this field should mirror the corresponding *.name value.
+    example: my-production-database, web-server-01, payment-processing-queue
+
+  - name: entity.url


Should we consider entity.reference here? url is quite specific.

maxcold · 2025-04-01T13:45:11Z

rfcs/text/0049/entity.yml

+      can be preserved in entity.raw.
+    example: i-04ff5d36be3d6896c, arn:aws:s3:::my-bucket, projects/123456789/locations/us-central1/instances/my-db
+
+  - name: entity.source


What about provider? I couldn't find any example of a *.source field.

maxcold · 2025-04-01T13:46:50Z

rfcs/text/0049/entity.yml

+    multi_fields:
+      - name: text
+        type: text
+    short: The human-readable name of the entity.


It's not very human-readable in the examples. Do we need an additional entity.title to capture the human-readable title of the entity? For example, an EC2 instance named maxcold-ec2 with the title "EC2 instance for testing".

mjwolf · 2025-04-01T23:12:42Z

rfcs/text/0049-entity-fields.md

-<!--
-* Stage 1: https://github.com/elastic/ecs/pull/NNN
-...
-->
 * Stage 0: https://github.com/elastic/ecs/pull/2434


You can leave this in, for historical reference

mjwolf · 2025-04-01T23:14:40Z

rfcs/text/0049-entity-fields.md

 The following are the people that consulted on the contents of this RFC.

-* Author: @tinnytintin10


You can leave this info from stage 0 in, along with the sections below. The RFC stages are intended to build on each other, so you can keep relevant data from past stages.

mjwolf · 2025-04-01T23:17:51Z

rfcs/text/0049-entity-fields.md


-<!--


It's not necessarily required, but if you leave in the comments for future stages, it'll be easier for you to update this doc when you get to those stages.

mjwolf · 2025-04-01T23:25:37Z

rfcs/text/0049/entity.yml

+    level: core
+    type: keyword
+    short: Source module or integration that provided the entity data.
+    description: >


I think the document source can usually be determined from event.dataset or the index it's in. Do you know of any cases where the existing data isn't enough to determine the source?

romulets · 2025-04-08T13:08:00Z

rfcs/text/0049/entity.yml

+title: Entity
+group: 2
+type: group
+short: Fields to describe various types of entities across IT environments.


Nitpick: What are "IT environments"? We might have a term to describe the space in which entities can be found.

"IT environments" commonly refers to environments inside the company, whereas this field set should also capture entities from services and external references.

romulets · 2025-04-08T13:10:03Z

rfcs/text/0049/entity.yml

+      A unique identifier for the entity. When multiple identifiers exist, this should be
+      the most stable and commonly used identifier that: 1) persists across the entity's
+      lifecycle, 2) ensures uniqueness within its scope, 3) is commonly used for queries
+      and correlation, and 4) is readily available in most observations (logs/events).
+      For entities with dedicated field sets (e.g., host, user), this value should match
+      the corresponding *.id field. Alternative identifiers (e.g., ARNs values in AWS, URLs)
+      can be preserved in entity.raw.


this value should match the corresponding *.id field. Alternative identifiers (e.g., ARNs values in AWS, URLs) can be preserved in entity.raw

In particular, the mention of ARN values in AWS seems to contradict what was previously suggested as guidance for entity IDs. Am I misreading something?

romulets · 2025-04-08T13:12:16Z

rfcs/text/0049/entity.yml

+      A standardized high-level classification of the entity. This provides a normalized way
+      to group similar entities across different providers or systems. There will be an
+      allowed set of values maintained for this field to ensure consistency.


This contradicts what was previously mentioned in the document:

The entity.type field needs a controlled vocabulary to maintain consistency and interoperability. However, an overly restrictive list might limit the field set's utility for emerging technologies and use cases.
Potential solution: Establish a governance process for entity.type values, including an initial set of well-defined types and a mechanism for proposing and reviewing new types. Document a clear taxonomy with examples to guide users in selecting appropriate types.

Do we really want to maintain an allowed set of values for this field, or only provide guidance? Also, the wording "there will be" reads as if we are promising things in documentation. Maybe it's better to avoid it?
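To make the two options concrete, here is a minimal sketch of the difference between an enforced allowed set and guidance only. Everything in it is an assumption for illustration: the function name `check_entity_type`, the `strict` flag, and the placeholder type values are hypothetical and do not come from the RFC or from ECS tooling.

```python
import warnings

# Placeholder values for illustration only; the RFC does not define this list here.
KNOWN_ENTITY_TYPES = {"bucket", "database", "function"}


def check_entity_type(value: str, strict: bool) -> bool:
    """Return True if the value is in the known set.

    strict=True models an enforced allowed set: unknown values raise.
    strict=False models guidance only: unknown values warn but are accepted.
    """
    if value in KNOWN_ENTITY_TYPES:
        return True
    if strict:
        raise ValueError(f"entity.type {value!r} is not an allowed value")
    warnings.warn(f"entity.type {value!r} is not in the documented taxonomy")
    return False
```

With guidance only, a new entity type from an emerging technology still gets indexed and merely triggers a warning. With an enforced set, it is rejected until the governance process adds it.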

romulets · 2025-04-08T13:18:33Z

rfcs/text/0049/entity.yml

+      Supports existence queries, exact value matches, and simple aggregations.
+    dynamic: true
+
+  - name: entity.risk


Should we also map *risk.calculated_level, *risk.calculated_score and *risk.calculated_score_norm? Or add a reference to risk? And maybe also update that page to mention entity.risk.

orouz · 2025-04-09T09:24:58Z

rfcs/text/0049/entity.yml

+      provides more granular classification than entity.type. While entity.type provides a normalized
+      classification across different systems, entity.sub_type preserves the provider-specific
+      categorization.
+    example: aws_s3_bucket, gcp_cloud_storage_bucket, azure_blob_container, aws_lambda_function


Maybe we should favor the original asset type string as defined by the vendor, and only create our own if there isn't one?

This list would be:

| name | vendor name |
| --- | --- |
| aws_s3_bucket | `AWS::S3::Bucket` |
| gcp_cloud_storage_bucket | `storage.googleapis.com/Bucket` |
| azure_blob_container | `Microsoft.Storage/storageAccounts/blobServices/containers` |
| aws_lambda_function | `AWS::Lambda::Function` |

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html

https://learn.microsoft.com/en-us/azure/governance/resource-graph/reference/supported-tables-resources

https://cloud.google.com/asset-inventory/docs/asset-types
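As a rough sketch of the suggestion above, the four pairs listed in this comment can be expressed as a lookup that prefers the vendor's own string. The names `VENDOR_ASSET_TYPES` and `vendor_sub_type`, and the fallback to the normalized name, are hypothetical and not part of the RFC.

```python
# Hypothetical lookup from the RFC's example sub_type values to the
# vendor-defined asset type strings listed in this comment.
VENDOR_ASSET_TYPES = {
    "aws_s3_bucket": "AWS::S3::Bucket",
    "gcp_cloud_storage_bucket": "storage.googleapis.com/Bucket",
    "azure_blob_container": "Microsoft.Storage/storageAccounts/blobServices/containers",
    "aws_lambda_function": "AWS::Lambda::Function",
}


def vendor_sub_type(normalized_name: str) -> str:
    """Prefer the vendor's asset type string; fall back to the normalized
    name only when no vendor string is known (an assumption of this sketch)."""
    return VENDOR_ASSET_TYPES.get(normalized_name, normalized_name)
```

For example, `vendor_sub_type("aws_s3_bucket")` yields the CloudFormation-style string, while a name with no vendor equivalent passes through unchanged.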

add stage 1 readme updates and entity.yml

670f145

tinnytintin10 requested review from romulets, kubasobon, eyalkraft, oren-zohar, opauloh, JordanSh, orouz and maxcold April 1, 2025 12:49

tinnytintin10 marked this pull request as ready for review April 1, 2025 12:51

tinnytintin10 requested a review from a team as a code owner April 1, 2025 12:51

tinnytintin10 requested a review from hop-dev April 1, 2025 13:02

maxcold reviewed Apr 1, 2025

mjwolf reviewed Apr 1, 2025

romulets reviewed Apr 8, 2025

orouz reviewed Apr 9, 2025

romulets added a commit to romulets/kibana that referenced this pull request Apr 11, 2025

Align entity fields with last developments of elastic/ecs#2461

810989e
