Troubleshoot an out-of-memory EDOT Collector
If your EDOT Collector pods terminate with an `OOMKilled` status, this usually indicates sustained memory pressure or a memory leak introduced by a regression or a bug. You can use the Performance Profiler (`pprof`) extension to collect and analyze memory profiles, helping you identify the root cause of the issue.
These symptoms typically indicate that the EDOT Collector is experiencing a memory-related failure:
- EDOT Collector pod restarts with an `OOMKilled` status in Kubernetes.
- Memory usage steadily increases before the crash.
- The Collector's logs don't show clear errors before termination.
Turn on runtime profiling using the `pprof` extension, then gather memory heap profiles from the affected pod:
- **Enable `pprof` in the Collector**

  Edit the EDOT Collector DaemonSet configuration and include the `pprof` extension:

  ```yaml
  exporters:
    ...
  processors:
    ...
  receivers:
    ...
  extensions:
    pprof:
  service:
    extensions:
      - pprof
      - ...
    pipelines:
      metrics:
        receivers: [ ... ]
        processors: [ ... ]
        exporters: [ ... ]
  ```

  Restart the Collector after applying these changes. When the DaemonSet is redeployed, identify the pod that is being restarted.
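  To confirm the extension is active before a crash occurs, you can check that its HTTP endpoint responds. This is a minimal sketch, assuming the extension listens on its default address `localhost:1777` and using `<collector-pod-name>` as a placeholder for the affected pod:

  ```bash
  # Forward the pprof port from the Collector pod to your workstation.
  kubectl port-forward <collector-pod-name> 1777:1777

  # In a second terminal, request the pprof index; a page listing the
  # available profiles indicates that the extension is enabled.
  curl http://localhost:1777/debug/pprof/
  ```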
- **Access the affected pod and collect a heap dump**

  When a pod starts exhibiting high memory usage or restarts due to OOM, run the following to enter a debug shell:

  ```bash
  kubectl debug -it <collector-pod-name> --image=ubuntu:latest
  ```

  In the debug container:

  ```bash
  apt update
  apt install -y curl
  curl http://localhost:1777/debug/pprof/heap > heap.out
  ```
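  If a single heap snapshot isn't conclusive, the same endpoint can typically serve other profile types as well, since the extension generally exposes the standard Go `pprof` handlers. A short sketch, still from inside the debug container and assuming the default port:

  ```bash
  # Goroutine dump: useful for spotting leaked goroutines that hold memory.
  curl http://localhost:1777/debug/pprof/goroutine > goroutine.out

  # Allocation profile: cumulative allocations since the process started.
  curl http://localhost:1777/debug/pprof/allocs > allocs.out
  ```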
- **Copy the heap file from the pod**

  From your local machine, copy the heap file using:

  ```bash
  kubectl cp <collector-pod-name>:heap.out ./heap.out -c <debug-container-name>
  ```

  **Note:** Replace `<debug-container-name>` with the name assigned to the debug container. Without the `-c` flag, Kubernetes will show the list of available containers.
- **Convert the heap profile for analysis**

  You can now generate a visual representation, for example a PNG:

  ```bash
  go tool pprof -png heap.out > heap.png
  ```
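  If you prefer a text summary or an interactive view over a static image, `go tool pprof` supports other output modes. Two sketches using the same `heap.out` file:

  ```bash
  # Print the functions with the largest in-use memory directly in the terminal.
  go tool pprof -top heap.out

  # Launch the interactive web UI (graph, flame graph, source view) on port 8080.
  go tool pprof -http=:8080 heap.out
  ```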
To improve the effectiveness of memory diagnostics and reduce investigation time, consider the following:
- Collect multiple heap profiles over time (for example, every few minutes) to observe memory trends before the crash.
- Automate heap profile collection at intervals to observe trends over time, as shown in the sketch below.
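One way to automate the collection is a small loop that writes a timestamped heap profile every few minutes from inside the debug container. This is a minimal sketch; the endpoint, interval, and output directory are assumptions to adapt to your environment:

```bash
#!/usr/bin/env bash
# Collect a heap profile every 5 minutes until interrupted.
# Assumes the pprof extension listens on localhost:1777 (its default address).
set -euo pipefail

OUTPUT_DIR="./heap-profiles"
INTERVAL_SECONDS=300

mkdir -p "${OUTPUT_DIR}"

while true; do
  TIMESTAMP="$(date +%Y%m%dT%H%M%S)"
  curl -s "http://localhost:1777/debug/pprof/heap" > "${OUTPUT_DIR}/heap-${TIMESTAMP}.out"
  echo "Collected ${OUTPUT_DIR}/heap-${TIMESTAMP}.out"
  sleep "${INTERVAL_SECONDS}"
done
```

You can then copy the directory out with `kubectl cp` and compare consecutive snapshots with `go tool pprof -diff_base=<older>.out <newer>.out` to see which allocations grow between profiles.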