Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipsec: cache xfrm state list #32588

Merged
merged 1 commit into from
Jun 3, 2024

Conversation

marseel
Copy link
Contributor

@marseel marseel commented May 16, 2024

Reduces GC CPU usage and memory allocations coming from XfrmStateList.
To ensure we have up-to-date cache, wrap all XfrmState related
functions inside cache, which is invalidated whenever XfrmState changes.

This is follow-up to #32577
While that PR averages out CPU usage over time, in large cluster 100+
nodes amount of allocations coming from netlink.XfrmStateList() is high
due to backgroundSync where we usually don't change any Xfrm states.
This becomes more and more expensive as number of nodes increases.

Some pprofs/GC CPU time graphs.
Before:
Number of alloc objects:
image
Total GC time for 100 nodes:
image
After:
Number of alloc objects:
image
Total GC time for 100 nodes:
image

This would become even more important for larger clusters, as XfrmStateList allocations in backgroundSync are essentially O(n^2).

ipsec: Improve CPU usage of cilum-agent in large clusters

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 16, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from d4637f1 to b90b538 Compare May 17, 2024 11:43
@marseel marseel changed the title Improve background check and cache xfrm test Cache XfrmStateList May 17, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from b90b538 to f5dd892 Compare May 17, 2024 11:54
@marseel marseel changed the title Cache XfrmStateList ipsec: cache xfrm state list May 17, 2024
@marseel marseel added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 17, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 17, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from f5dd892 to 76a9b0b Compare May 17, 2024 12:03
@marseel
Copy link
Contributor Author

marseel commented May 17, 2024

/test

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch 2 times, most recently from 4ce9faf to 0bb20b5 Compare May 17, 2024 12:16
@marseel
Copy link
Contributor Author

marseel commented May 17, 2024

/test

@marseel marseel marked this pull request as ready for review May 17, 2024 12:59
@marseel marseel requested a review from a team as a code owner May 17, 2024 12:59
@marseel marseel requested a review from pchaigno May 17, 2024 12:59
@marseel
Copy link
Contributor Author

marseel commented May 17, 2024

@pchaigno I wonder which CI test runs IPSec privileged test?
I wanted to double-check it runs, I was thinking that probably Conformance Runtime (ci-runtime) but junit shows 0 for privileged:

<testsuite name="github.com/cilium/cilium/pkg/datapath/linux/ipsec" tests="0" failures="0" errors="0" id="118" hostname="kind-bpf-next" time="0.130" timestamp="2024-05-17T12:41:11Z"></testsuite>

🤷

@pchaigno
Copy link
Member

@pchaigno I wonder which CI test runs IPSec privileged test? I wanted to double-check it runs, I was thinking that probably Conformance Runtime (ci-runtime) but junit shows 0 for privileged:

<testsuite name="github.com/cilium/cilium/pkg/datapath/linux/ipsec" tests="0" failures="0" errors="0" id="118" hostname="kind-bpf-next" time="0.130" timestamp="2024-05-17T12:41:11Z"></testsuite>

But it prints tests="0" for every single packages, doesn't it? I'm not sure we can rely on that.
In the output, it does look like it's doing something: https://github.com/cilium/cilium/actions/runs/9128109980/job/25099903131#step:17:367.

Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Couple questions below, but nothing I believe is blocking.

pkg/datapath/linux/ipsec/xfrm_state_cache.go Show resolved Hide resolved
pkg/datapath/linux/ipsec/xfrm_state_cache.go Show resolved Hide resolved
pkg/datapath/linux/ipsec/ipsec_linux.go Show resolved Hide resolved
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 17, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 0bb20b5 to 522c126 Compare May 17, 2024 14:43
@sjdot
Copy link
Contributor

sjdot commented May 17, 2024

@marseel @pchaigno I originally reported this issue and I've been playing around with a patch on my side also. Instead of calling xfrm state list here and looping until we find a match, can we just use XfrmStateGet instead? This seems to find states just fine in my testing and avoids getting the whole list of states back in the first place.

I haven't tested the impact of this on the GC on a large cluster however, but wondering why we wouldn't want to do this instead of the looping logic?

I'm a bit wary of caching the states in general just because it feels like troubleshooting issues with the cache in the future may be difficult or hard to find.

@marseel
Copy link
Contributor Author

marseel commented May 21, 2024

@sjdot I do agree that we should use XfrmStateGet and probably this part needs some refactoring in general.

While XfrmStateGet would work for background-sync triggered events which usually don't change anything, I am a bit worried about interleaving for other cases:

  • we use states list later in xfrmTemporarilyRemoveState and xfrmDeleteConflictingState which are fetched before performing any initial update/delete/add etc.
  • We ignore errors from XfrmStateDel in the initial part

I would say adding caching should be safer than refactoring that part (also safer for backports), as long as we ensure we don't have other calls that are modifying xfrmState with CI test.
If something external to Cilium modifies xfrmStates, then caching for 1 minute won't make any significant difference, as this is going to be only resolved by background-sync, which runs less often than 1 minute.

@sjdot
Copy link
Contributor

sjdot commented May 21, 2024

Thanks @marseel

Yeah in my patch there are still cases where I have to call list and pass to the functions you mentioned (e.g. a state that needs to be deleted an re-added), I've also not handled the migration of old state so I agree back porting is painful/difficult.

I see the usefulness of the caching given this, I'm just nervous given our long history of very painful ipsec issues. Given it's a pretty significant change, do you think it would be feasible to put this behind a feature flag initially? I would like the ability to turn this off if we hit an issue we suspect might be related to be able to quickly verify if this is the cause or not.

@marseel
Copy link
Contributor Author

marseel commented May 21, 2024

I see the usefulness of the caching given this, I'm just nervous given our long history of very painful ipsec issues. Given it's a pretty significant change, do you think it would be feasible to put this behind a feature flag initially? I would like the ability to turn this off if we hit an issue we suspect might be related to be able to quickly verify if this is the cause or not.

Yup, I think that definitely makes sense.
Let me get that flag + CI coverage before merging.

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 522c126 to ccf72a2 Compare May 22, 2024 10:07
@marseel marseel requested review from a team as code owners May 22, 2024 10:07
@marseel marseel requested a review from ldelossa May 22, 2024 10:07
@marseel
Copy link
Contributor Author

marseel commented May 29, 2024

@borkmann friendly ping :)

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 3, 2024
@pchaigno pchaigno added this pull request to the merge queue Jun 3, 2024
Merged via the queue into main with commit 3a4c57f Jun 3, 2024
271 checks passed
@pchaigno pchaigno deleted the improve_background_check_and_cache_xfrm_test branch June 3, 2024 09:21
@marseel marseel added the backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. label Jun 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport pending to v1.15 in 1.15.6 Jun 4, 2024
@marseel marseel added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Jun 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport pending to v1.14 in 1.14.12 Jun 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport pending to v1.13 in 1.13.17 Jun 4, 2024
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.12 Jun 5, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.15 to Backport done to v1.15 in 1.15.6 Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport done to v1.15 in 1.15.6 Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.14 in 1.14.12 Jun 5, 2024
@github-actions github-actions bot added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.15 in 1.15.6 Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.13 in 1.13.17 Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
1.14.12
Backport done to v1.14
1.15.6
Backport done to v1.15
Development

Successfully merging this pull request may close these issues.

None yet

8 participants