Remove faultyHostDetection check in the placement service and replace it with the gRPC keepalive mechanism #7743

Open
elena-kolevska opened this issue May 17, 2024 · 0 comments

elena-kolevska commented May 17, 2024

/area placement

Describe the proposal

Remove the faultyHostDetection check and replace it with the gRPC keepalive mechanism.

Current Behavior:
The current implementation of faultyHostDetection in Dapr keeps track of every host's status update message. It locks the store and loops through all the members in the raft state to identify and disconnect hosts that haven't reported their status recently.
The approach works, but holding the lock while scanning every member is inefficient, and the overhead of maintaining and processing these state checks can lead to reliability issues.
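
To make the cost concrete, here is a minimal sketch of the kind of check described above. It is not the actual Dapr code: `memberStore`, `removeFaultyHosts`, and the 3-second timeout are hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// memberStore is a hypothetical stand-in for the placement service's
// view of the members held in the raft state.
type memberStore struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time // host name -> time of last status report
}

// removeFaultyHosts locks the whole store, scans every member, and removes
// those whose last report is older than timeout. It returns the removed
// host names in sorted order.
func (s *memberStore) removeFaultyHosts(now time.Time, timeout time.Duration) []string {
	s.mu.Lock()
	defer s.mu.Unlock()

	var removed []string
	for host, seen := range s.lastSeen {
		if now.Sub(seen) > timeout {
			removed = append(removed, host)
			delete(s.lastSeen, host) // deleting during range is safe in Go
		}
	}
	sort.Strings(removed)
	return removed
}

// demoPeriodicCheck runs one pass of the check over two example hosts.
func demoPeriodicCheck() []string {
	now := time.Date(2024, 5, 17, 12, 0, 0, 0, time.UTC)
	s := &memberStore{lastSeen: map[string]time.Time{
		"host-a": now.Add(-1 * time.Second),  // reported recently
		"host-b": now.Add(-10 * time.Second), // silent for too long
	}}
	return s.removeFaultyHosts(now, 3*time.Second) // timeout is an assumed value
}

func main() {
	fmt.Println(demoPeriodicCheck())
}
```

Every pass takes the lock and touches every member, so the work grows with the number of hosts even when all of them are healthy.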

Proposed Change:
I propose removing the faultyHostDetection check and replacing it with the gRPC keepalive mechanism. gRPC keepalive is a built-in feature designed to handle connection liveness, which makes it a more efficient and reliable way to detect and handle unresponsive hosts.
Dapr would still keep sending the status messages, for two reasons: they are needed to determine which hosts haven't connected to the new leader after a placement server failover, and they are a metric we expose.
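
As a rough illustration of the proposed direction, server-side keepalive in grpc-go is configured through server options. The sketch below uses the `google.golang.org/grpc/keepalive` package; the durations and the listen address are assumptions for the example, not values proposed in this issue, and the placement service registration is left out.

```go
package main

import (
	"log"
	"net"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// Listen address is an assumption for this sketch.
	lis, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}

	srv := grpc.NewServer(
		// The server pings a client after Time of inactivity and closes the
		// connection if no ack arrives within Timeout. Values are illustrative.
		grpc.KeepaliveParams(keepalive.ServerParameters{
			Time:    2 * time.Second,
			Timeout: 3 * time.Second,
		}),
		// Bounds how aggressively clients may send their own keepalive pings.
		grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
			MinTime:             1 * time.Second,
			PermitWithoutStream: true,
		}),
	)

	// The placement service would be registered on srv here.

	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```

When a keepalive ping times out, the connection is closed and any open stream on it returns an error, so the stream handler can remove that host without a separate scan over all members.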

On placement service failover, the new leader should wait for faultyHostDetectInitialDuration (currently 6 seconds) to give all sidecars enough time to connect to the placement service. After that, it should run the faultyHostDetection check only once, to remove from the placement table any hosts that weren't able to connect.
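
The failover step could look roughly like the sketch below. It is only an assumption about the shape of the logic: `pruneAfterGrace`, the `table` map, and the `connected` callback are hypothetical, and the demo uses a short grace period in place of faultyHostDetectInitialDuration so that it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"time"
)

// pruneAfterGrace waits for the grace period (or for ctx to be cancelled),
// then runs a single pass that removes from table every host that has not
// connected to the new leader. It returns the removed hosts in sorted order.
func pruneAfterGrace(
	ctx context.Context,
	grace time.Duration,
	table map[string]bool,
	connected func() map[string]bool,
) ([]string, error) {
	timer := time.NewTimer(grace)
	defer timer.Stop()

	select {
	case <-ctx.Done():
		return nil, ctx.Err() // leadership lost or shutdown before the check ran
	case <-timer.C:
	}

	conn := connected() // snapshot taken only after the grace period
	var removed []string
	for host := range table {
		if !conn[host] {
			removed = append(removed, host)
			delete(table, host)
		}
	}
	sort.Strings(removed)
	return removed, nil
}

// demoFailover simulates a new leader that inherited three hosts, of which
// only two reconnected during the grace period.
func demoFailover() []string {
	table := map[string]bool{"host-a": true, "host-b": true, "host-c": true}
	connected := func() map[string]bool {
		return map[string]bool{"host-a": true, "host-b": true}
	}
	removed, err := pruneAfterGrace(context.Background(), 10*time.Millisecond, table, connected)
	if err != nil {
		return nil
	}
	return removed
}

func main() {
	fmt.Println(demoFailover())
}
```

Because the check runs once per leadership change rather than on a timer, the full scan of the table happens only at failover; steady-state liveness is left to keepalive.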