Normalize-for-Stream processor
Detects whether a document is OpenTelemetry-compliant and, if it is not,
normalizes it as described below. When used together with OTel-related
mappings, such as the ones defined in logs-otel@template, the resulting
document can be queried seamlessly by clients that expect either the ECS format or the OpenTelemetry Semantic Conventions format.
This processor is in tech preview and is not available in our serverless offering.
The processor detects OpenTelemetry compliance by checking the following fields:
- resource exists as a key and the value is a map
- resource either doesn't contain an attributes field, or contains an attributes field of type map
- scope is either missing or a map
- attributes is either missing or a map
- body is either missing or a map
- body either doesn't contain a text field, or contains a text field of type String
- body either doesn't contain a structured field, or contains a structured field that is not of type String
If all of these conditions are met, the document is considered OpenTelemetry-compliant and is not modified by the processor.
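The checks above can be sketched in Python as a single predicate. This is an illustrative sketch only, not the processor's implementation: the function name is_otel_compliant is hypothetical, and the sketch assumes that a key holding an explicit null value counts as present, which the text above does not specify.

```python
from typing import Any


def is_otel_compliant(doc: dict[str, Any]) -> bool:
    """Return True when doc passes every structural check listed above."""
    resource = doc.get("resource")
    # resource must exist as a key and its value must be a map.
    if not isinstance(resource, dict):
        return False
    # resource.attributes, when present, must be a map.
    if "attributes" in resource and not isinstance(resource["attributes"], dict):
        return False
    # scope and attributes are each either missing or a map.
    for key in ("scope", "attributes"):
        if key in doc and not isinstance(doc[key], dict):
            return False
    # body is either missing or a map.
    if "body" not in doc:
        return True
    body = doc["body"]
    if not isinstance(body, dict):
        return False
    # body.text, when present, must be a string.
    if "text" in body and not isinstance(body["text"], str):
        return False
    # body.structured, when present, must not be a string.
    if "structured" in body and isinstance(body["structured"], str):
        return False
    return True
```

A document with no resource key, such as a plain ECS log event, fails the first check and would therefore be normalized.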
If the document is not OpenTelemetry-compliant, the processor normalizes it as follows:
Specific ECS fields are renamed to have their corresponding OpenTelemetry Semantic Conventions attribute names. These include the following:
ECS Field    Semantic Conventions Attribute
span.id      span_id
trace.id     trace_id
message      body.text
log.level    severity_text

The processor first looks for the nested form of the ECS field. If the nested form does not exist, it looks for a top-level field with the dotted field name.
Other specific ECS fields that describe resources and have counterparts in the OpenTelemetry Semantic Conventions are moved to the
resource.attributes map. A field is considered a resource attribute when it meets the following conditions:
- It is an ECS field that has a corresponding counterpart (with either the same name or a different name) in the OpenTelemetry Semantic Conventions.
- The corresponding OpenTelemetry attribute is defined in the Semantic Conventions within a group that is defined as type: entity.
All other fields, except for @timestamp, are moved to the attributes map.
All non-array entries of the attributes and resource.attributes maps are flattened. Flattening means that nested objects are merged into their parent object, and the keys are concatenated with a dot. See examples below.
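The flattening step can be illustrated with a short Python sketch. The helper name flatten is hypothetical and this is not the processor's code; it only shows nested maps being merged into their parent under dot-joined keys while arrays are left untouched.

```python
from typing import Any


def flatten(obj: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Merge nested maps into one level, joining keys with a dot."""
    flat: dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested map: lift its entries into the parent under dotted keys.
            flat.update(flatten(value, path))
        else:
            # Scalars and arrays (including arrays of maps) are kept as is.
            flat[path] = value
    return flat


attributes = {
    "http": {
        "method": "GET",
        "url": {"path": "/api/v1/resource"},
        "headers": [{"name": "Authorization", "value": "Bearer token"}],
    }
}
flattened = flatten(attributes)
```

Here flattened holds the keys http.method, http.url.path, and http.headers, with the headers array carried over unchanged.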
If an OpenTelemetry-compliant document is detected, the processor does nothing. For example, the following document will stay unchanged:
{
  "resource": {
    "attributes": {
      "service.name": "my-service"
    }
  },
  "scope": {
    "name": "my-library",
    "version": "1.0.0"
  },
  "attributes": {
    "http.method": "GET"
  },
  "body": {
    "text": "Hello, world!"
  }
}
If a non-OpenTelemetry-compliant document is detected, the processor normalizes it. For example, the following document:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "service": {
    "name": "my-service",
    "version": "1.0.0",
    "environment": "production",
    "language": {
      "name": "python",
      "version": "3.8"
    }
  },
  "log": {
    "level": "INFO"
  },
  "message": "Hello, world!",
  "http": {
    "method": "GET",
    "url": {
      "path": "/api/v1/resource"
    },
    "headers": [
      {
        "name": "Authorization",
        "value": "Bearer token"
      },
      {
        "name": "User-Agent",
        "value": "my-client/1.0"
      }
    ]
  },
  "span": {
    "id": "1234567890abcdef"
  },
  "span.id": "abcdef1234567890",
  "trace.id": "abcdef1234567890abcdef1234567890"
}
will be normalized into the following form:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "resource": {
    "attributes": {
      "service.name": "my-service",
      "service.version": "1.0.0",
      "service.environment": "production"
    }
  },
  "attributes": {
    "service.language.name": "python",
    "service.language.version": "3.8",
    "http.method": "GET",
    "http.url.path": "/api/v1/resource",
    "http.headers": [
      {
        "name": "Authorization",
        "value": "Bearer token"
      },
      {
        "name": "User-Agent",
        "value": "my-client/1.0"
      }
    ]
  },
  "severity_text": "INFO",
  "body": {
    "text": "Hello, world!"
  },
  "span_id": "1234567890abcdef",
  "trace_id": "abcdef1234567890abcdef1234567890"
}
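The example above contains both a nested span object and a top-level span.id key, and the nested value is the one that ends up in span_id. A rough Python sketch of that nested-first lookup follows. The names RENAMES, lookup, set_path, and renamed_fields are hypothetical, and the sketch only collects the renamed fields; it does not remove the source fields or perform the other normalization steps.

```python
from typing import Any

# ECS field -> OpenTelemetry attribute, as listed in the rename table.
RENAMES = {
    "span.id": "span_id",
    "trace.id": "trace_id",
    "message": "body.text",
    "log.level": "severity_text",
}

_MISSING = object()  # sentinel, so that stored null values are not mistaken for absence


def lookup(doc: dict[str, Any], dotted: str) -> Any:
    """Find a field by its nested form first, then by its literal dotted key."""
    node: Any = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            node = _MISSING
            break
        node = node[part]
    if node is not _MISSING:
        return node
    # Fall back to a top-level key whose name contains the dots.
    return doc.get(dotted, _MISSING)


def set_path(doc: dict[str, Any], dotted: str, value: Any) -> None:
    """Write value at a nested path, creating intermediate maps as needed."""
    *parents, leaf = dotted.split(".")
    node = doc
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


def renamed_fields(doc: dict[str, Any]) -> dict[str, Any]:
    """Return only the renamed fields; the source doc is left untouched."""
    out: dict[str, Any] = {}
    for ecs_name, otel_name in RENAMES.items():
        value = lookup(doc, ecs_name)
        if value is not _MISSING:
            set_path(out, otel_name, value)
    return out
```

Calling renamed_fields on the input document above picks the nested span.id value for span_id and falls back to the dotted trace.id key for trace_id, because no nested trace object exists.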
If the message field in the ingested document is a JSON-formatted string, the
processor determines whether it is in ECS format based on the presence of
the @timestamp field. If the @timestamp field is present, the message content
is considered to be in ECS format: it is merged into the root of the document
and then normalized as described above. The @timestamp from the message field
overrides the root @timestamp field in the resulting document.
If the @timestamp field is absent, the content of the message field is moved to
the body.structured field as is, without any further normalization.
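The branching described above can be sketched in Python as follows. This is an assumption-laden illustration, not the processor's code: the function name unpack_message is hypothetical, the merge is a shallow update in which values from the message win on conflict (the text only states this for @timestamp), and the regular normalization that follows in the ECS case is left out.

```python
import json
from typing import Any


def unpack_message(doc: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of doc with a JSON message merged into the root or moved to body.structured."""
    out = dict(doc)
    message = out.get("message")
    if not isinstance(message, str):
        return out
    try:
        parsed = json.loads(message)
    except json.JSONDecodeError:
        return out  # plain-text message: nothing to unpack
    if not isinstance(parsed, dict):
        return out  # valid JSON, but not an object
    del out["message"]
    if "@timestamp" in parsed:
        # Treated as ECS-JSON: merge into the root. Values from the message
        # win, so its @timestamp replaces the root @timestamp.
        out.update(parsed)
    else:
        # Not recognized as ECS: keep the parsed object as is under body.structured.
        existing = out.get("body")
        body = dict(existing) if isinstance(existing, dict) else {}
        body["structured"] = parsed
        out["body"] = body
    return out
```

In the ECS case the merged document would still need to go through the renaming, resource-attribute, and flattening steps described earlier.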
For example, if the message field contains ECS-formatted JSON, as follows:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "message": "{\"@timestamp\":\"2023-10-01T12:01:00Z\",\"log.level\":\"INFO\",\"service.name\":\"my-service\",\"message\":\"The actual log message\",\"http\":{\"method\":\"GET\",\"url\":{\"path\":\"/api/v1/resource\"}}}"
}
it will be normalized into the following form:
{
  "@timestamp": "2023-10-01T12:01:00Z",
  "severity_text": "INFO",
  "body": {
    "text": "The actual log message"
  },
  "resource": {
    "attributes": {
      "service.name": "my-service"
    }
  },
  "attributes": {
    "http.method": "GET",
    "http.url.path": "/api/v1/resource"
  }
}
However, if the message field is not recognized as ECS format, as follows:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "log": {
    "level": "INFO"
  },
  "service": {
    "name": "my-service"
  },
  "tags": ["user-action", "api-call"],
  "message": "{\"root_cause\":\"Network error\",\"http\":{\"method\":\"GET\",\"url\":{\"path\":\"/api/v1/resource\"}}}"
}
it will be normalized into the following form:
{
  "@timestamp": "2023-10-01T12:00:00Z",
  "severity_text": "INFO",
  "resource": {
    "attributes": {
      "service.name": "my-service"
    }
  },
  "attributes": {
    "tags": ["user-action", "api-call"]
  },
  "body": {
    "structured": {
      "root_cause": "Network error",
      "http": {
        "method": "GET",
        "url": {
          "path": "/api/v1/resource"
        }
      }
    }
  }
}