vcluster-eks: vcluster-api: "watch chan error: etcdserver: mvcc: required revision has been compacted" #1342

Open
joaocc opened this issue Nov 1, 2023 · 3 comments
@joaocc
Contributor

joaocc commented Nov 1, 2023

What happened?

Installed vcluster-eks 0.16.4 on EKS 1.27. Storage for etcd is on EFS.
The warnings below start almost immediately after the vcluster-api pod starts:

I1101 10:33:31.743782       1 aggregator.go:164] waiting for initial CRD sync...
I1101 10:33:31.748914       1 gc_controller.go:78] Starting apiserver lease garbage collector
I1101 10:33:31.748967       1 handler_discovery.go:412] Starting ResourceDiscoveryManager
I1101 10:33:31.742914       1 controller.go:78] Starting OpenAPI AggregationController
I1101 10:33:31.750829       1 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/run/config/pki/ca.crt"
I1101 10:33:31.751017       1 dynamic_cafile_content.go:157] "Starting controller" name="request-header::/run/config/pki/front-proxy-ca.crt"
E1101 10:33:31.843347       1 controller.go:95] Found stale data, removed previous endpoints on kubernetes service, apiserver didn't exit successfully previously
I1101 10:33:31.846125       1 shared_informer.go:318] Caches are synced for cluster_authentication_trust_controller
I1101 10:33:31.849180       1 apf_controller.go:377] Running API Priority and Fairness config worker
I1101 10:33:31.849368       1 apf_controller.go:380] Running API Priority and Fairness periodic rebalancing process
I1101 10:33:31.927892       1 shared_informer.go:318] Caches are synced for node_authorizer
I1101 10:33:31.932853       1 controller.go:624] quota admission added evaluator for: leases.coordination.k8s.io
I1101 10:33:31.939522       1 cache.go:39] Caches are synced for AvailableConditionController controller
I1101 10:33:31.940661       1 shared_informer.go:318] Caches are synced for crd-autoregister
I1101 10:33:31.940723       1 aggregator.go:166] initial CRD sync complete...
I1101 10:33:31.940737       1 autoregister_controller.go:141] Starting autoregister controller
I1101 10:33:31.940745       1 cache.go:32] Waiting for caches to sync for autoregister controller
I1101 10:33:31.940757       1 cache.go:39] Caches are synced for autoregister controller
I1101 10:33:31.943259       1 cache.go:39] Caches are synced for APIServiceRegistrationController controller
I1101 10:33:31.944056       1 shared_informer.go:318] Caches are synced for configmaps
I1101 10:33:32.748362       1 storage_scheduling.go:111] all system priority classes are created successfully or already exist.
W1101 10:33:35.109679       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:35.712071       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.027911       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.027960       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.027983       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.028004       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.028029       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.028051       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.028070       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.029210       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.029246       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.029823       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.029858       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.029865       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.029880       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.030282       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.128476       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
W1101 10:33:38.128503       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted
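
To quantify how often the warning fires, the log above can be tallied per second. The following is a minimal sketch, not part of vcluster or Kubernetes tooling: it assumes the standard klog line layout seen in the excerpt (severity letter, MMDD, time, thread id, source file), and the names `KLOG_LINE`, `COMPACTED`, and `count_compaction_warnings` are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# klog header as seen in the excerpt: severity letter, MMDD, HH:MM:SS.micros,
# thread id, "file:line]" and then the message.
KLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\.\d+\s+\d+ "
    r"(?P<src>[^\]]+)\] (?P<msg>.*)$"
)
COMPACTED = "required revision has been compacted"


def count_compaction_warnings(lines):
    """Return a Counter mapping HH:MM:SS to the number of compaction warnings."""
    per_second = Counter()
    for line in lines:
        m = KLOG_LINE.match(line.strip())
        if m and m.group("sev") == "W" and COMPACTED in m.group("msg"):
            per_second[m.group("time")] += 1
    return per_second


# A few lines copied from the log above.
sample = [
    "I1101 10:33:32.748362       1 storage_scheduling.go:111] all system priority classes are created successfully or already exist.",
    "W1101 10:33:35.109679       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted",
    "W1101 10:33:35.712071       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted",
    "W1101 10:33:38.027911       1 watcher.go:245] watch chan error: etcdserver: mvcc: required revision has been compacted",
]
counts = count_compaction_warnings(sample)
for second, n in sorted(counts.items()):
    print(second, n)
```

In practice the input could be the saved output of the pod's logs rather than the inline `sample` list; a burst of many warnings within the same second, as at 10:33:38 above, would then show up as a single large count.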

What did you expect to happen?

No warning messages

How can we reproduce it (as minimally and precisely as possible)?

Not sure how to reproduce in a minimal environment.

Anything else we need to know?

Install done via flux2 (HelmRelease)
Potentially relevant links:

Host cluster Kubernetes version

$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.6-eks-f8587cb

Host cluster Kubernetes distribution

EKS 1.27

vcluster version

$ vcluster --version
vcluster version 0.15.7

vcluster Kubernetes distribution (k3s (default), k8s, k0s)

eks

OS and Arch

OS: macOS
Arch: arm64
@joaocc joaocc added the kind/bug label Nov 1, 2023
@FabianKramm
Member

@joaocc sorry for the delay. vcluster has problems with EFS, as EFS causes issues with databases in general. Is there any chance you could use EBS or something similar?

@joaocc
Contributor Author

joaocc commented Nov 14, 2023

Hi. Not really. We are using EFS as a way to simplify HA storage.
Unlike Azure, where ZRS allows mountable volumes that span different AZs, AWS EBS seems to be restricted to a single AZ, so a vcluster that ends up being scheduled on a node in another AZ would not be able to mount the EBS volume.
On the other hand, we haven't noticed any practical issues. Are you saying that EFS is not supported as storage for etcd?
Thanks

@joaocc
Contributor Author

joaocc commented Mar 19, 2024

@FabianKramm following up on this thread...

  • We use eks-d because the remaining distros use sqlite, which indeed cannot be hosted on NFS-type file systems. Do you think the new k3s-with-etcd in v0.19.x will have the same issues?
  • Regarding EBS, is there any guidance on making non-HA deployments work well with EBS on multi-AZ clusters (in case a single AZ becomes unavailable)? This scenario is supported quite well with EFS.
  • Is there any official statement on EFS being a supported store for etcd?

For reference, we still have no practical issues, except for an elevated EFS bill (writes of ~440 MB/sec); we are still trying to understand whether it comes from these writes or from something else.
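
For a sense of scale, the write rate quoted above can be converted into a daily volume. This is only a back-of-the-envelope sketch: it assumes the ~440 MB/sec figure is sustained around the clock and uses decimal units (1 TB = 1,000,000 MB). Both are assumptions, and `daily_volume_tb` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(rate_mb_per_s):
    """Convert a sustained MB/s rate to decimal terabytes written per day."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


# Assumption: the reported ~440 MB/sec is a sustained average, not a peak.
print(daily_volume_tb(440))
```

If the reported figure is a peak rather than an average, the real daily volume would be correspondingly lower.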
