This repository has been archived by the owner on Feb 27, 2023. It is now read-only.

OOM in containerized dfget in Kubernetes #1563

Open
shengnuo opened this issue Oct 27, 2021 · 0 comments
Ⅰ. Issue Description

We have seen frequent OOM kills of the dfget process when it is containerized and orchestrated by Kubernetes.

Ⅱ. Describe what happened

When dfget is executed within a Kubernetes pod, Prometheus reports high memory usage in the container_memory_working_set_bytes metric. From what we have seen, the reported value can easily exceed hundreds of megabytes, and it is not uncommon for it to creep into gigabytes.


This metric is reported by cAdvisor, and it is total memory usage minus inactive files. See here for its definition.

container_memory_working_set_bytes excludes cached data, and it is what the OOM killer uses when calculating oom_score.
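The subtraction cAdvisor performs can be sketched as a small shell helper. This is a minimal sketch: the 900 MiB / 600 MiB numbers are illustrative, not taken from our clusters, and the cgroup file names in the comments assume cgroup v1.

```shell
# working_set_bytes as cAdvisor computes it: total memory usage minus
# inactive file-cache pages. Under cgroup v1 the two inputs would come from
# memory.usage_in_bytes and the total_inactive_file line of memory.stat.
working_set() {
  usage_bytes=$1
  inactive_file_bytes=$2
  echo $(( usage_bytes - inactive_file_bytes ))
}

# Illustrative numbers: 900 MiB total usage, 600 MiB of it inactive page cache.
working_set $(( 900 * 1024 * 1024 )) $(( 600 * 1024 * 1024 ))   # prints 314572800 (300 MiB)
```

This is why a download that fills the page cache need not inflate the working set: pages the kernel has already marked inactive are subtracted back out before the metric is reported.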

Ⅲ. Describe what you expected to happen

I expect stable memory usage (as reported by the container_memory_working_set_bytes metric) while a file is being downloaded. I did not observe a significant memory spike when downloading the same file using wget.

Ⅳ. How to reproduce it (as minimally and precisely as possible)

  1. Create a Kubernetes Deployment with dragonflyoss/dfclient:1.0.6 image
  2. Execute the dfget process within the pod

Ⅴ. Anything else we need to know?

Ⅵ. Environment:

  • dragonfly version: v1.0.6
  • Host OS (e.g. from /etc/os-release): CentOS Linux 8
  • Kubernetes Version: v1.21.1
  • Install tools:
  • Others: