-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Active connection tracking in ebpf #32584
base: main
Are you sure you want to change the base?
Active connection tracking in ebpf #32584
Conversation
e8f0244
to
beecf87
Compare
This comment was marked as resolved.
This comment was marked as resolved.
c7a9677
to
58f239e
Compare
This comment was marked as resolved.
This comment was marked as resolved.
58f239e
to
6bd2077
Compare
6bd2077
to
66e1d80
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loader changes are fine. As a rando reviewer, it's not at all clear to me what lrs
stands for however.
@julianwiedmann would you be able to review this for @cilium/sig-datapath, I'm assigned sig-cli but my review would also count towards datapath - and I'm not familiar enough with this code to provide a thorough review. |
76f4e16
to
27af564
Compare
LRS stands for Load Reporting Service from Envoy. Please see: https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/load_stats/v3/lrs.proto. I am bad at naming stuff, so I am open to suggestions |
@AwesomePatrol Does the TODO list in the second commits message represent work that's still to be done? |
Based on the TODOs and the missing description in the second patch, this looks like it still needs some more work. Flipping to draft for now to get it off my queue, please flip back when you're ready! 🙏 |
323a58c
to
36a789c
Compare
This comment was marked as outdated.
This comment was marked as outdated.
I'm guessing Julian means the commit descriptions. They should describe what the commit is doing, but more importantly why and any subtleties to help reviewers understand what is going on. See examples in the |
36a789c
to
4571040
Compare
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving as I'm going on vacation and mostly had nits about unused methods.
a0a23e9
to
31d3ec5
Compare
31d3ec5
to
01bf7ab
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋 thank you! I'd want to address the integration into the BPF codepath below.
Besides that, could you please squash the last two patches in the series into the initial series? I don't think we need to merge code that immediately gets refactored again.
bpf/lib/conntrack.h
Outdated
#ifdef ENABLE_ACTIVE_CONNECTION_TRACKING | ||
_lb_act_conn_open(state->rev_nat_index, state->backend_id); | ||
#endif | ||
} else if (dir == CT_INGRESS || dir == CT_EGRESS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Picking on this part in particular, as it's clearly misplaced. ct_create_fill_entry()
is meant to fill the CT entry (from the info in the ct_state
representation), but here we don't even touch the CT entry. So at best this part should just live in ct_create{4,6}()
instead.
But as more general question - why are these changes intertwined with the actual CT lookup? If we call the feature Active connection counting
instead, would it still fit into conntrack.h
? We're not actually tracking any connection state, after all ...
My expectation would be to have this logic live here, where it would be included into the BPF program only once. What's missing to make that happen? Do we just need to pass back a few bits of connection-internal state to detect reopened / closed connections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's clearly misplaced. ct_create_fill_entry() is meant to fill the CT entry
Agreed. I moved all calls to a single function, __ct_lookup
.
If we call the feature Active connection counting instead, would it still fit into conntrack.h?
Yes, the decision to place it in conntrack.h
is a technical requirement of what information is available in each function. Please, feel free to suggest other places that are suitable for this, if there are any.
My expectation would be to have this logic live here
This handles counting new connections only, where do you want to handle closing connections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what information is available in each function. Please, feel free to suggest other places that are suitable for this, if there are any.
Let's use the ct_state
then to provide this detailed level of information to callers :).
This handles counting new connections only, where do you want to handle closing connections?
How about the following?
ct_result = ct_lazy_lookup4(...);
if (ct_result == CT_NEW || ct_state->reopened)
// account for opened connection
else if (ct_state->closing)
// account for closed connection
This is needed to break a cyclic dependency on LB{4,6}_map from lb.h in lrs.h (next patch) imported by conntrack.h. Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
Head branch was pushed to by a user without write access
01bf7ab
to
8aa2cc8
Compare
Add new map - LB_ACT_MAP - behind ENABLE_ACTIVE_CONNECTION_TRACKING flag with counters of opened and closed connections. Behavior of eBPF remains completely unchanged when ENABLE_ACTIVE_CONNECTION_TRACKING flag is not set. When an entry, to conntrack table (with dir == CT_SERVICE) is created, an entry in LB_ACT_MAP.opened is incremented by one. The same occurs when a connection is reopened. When connection is closed (action == ACTION_CLOSE and dir == CT_SERVICE), the related LB_ACT_MAP.closed is incremented by one. LB_ACT_MAP is keyed by svc_id (also known as rev_nat_index) and zone. Backend ID is used to fetch the backend definition from LB{4,6}_BACKEND_MAP which also contains zone field. This field is populated only when EndpointSlice contains a reference to zone in FixedZoneMapping (so it is possible to convert between uint8 ID and string). Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
8aa2cc8
to
3cc35fe
Compare
#ifdef ENABLE_ACTIVE_CONNECTION_TRACKING | ||
if (dir == CT_SERVICE) { | ||
if (ct_state) | ||
_lb_act_conn_open(ct_state->rev_nat_index, ct_state->backend_id); | ||
else if (entry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this would happen too early in the code flow now. For a new connection, we haven't selected a backend yet.
Note that you'll probably need a rebase to pick up the changes from #32653. |
As described in cilium/design-cfps#31, this is part of #31752
Example configuration (in
cilium-config
) for KIND cluster:where topology labels were set on the nodes:
Zone mapping can be verified with
cilium map get cilium_lb4_backends_v3
:LRS accounting can be verified with
cilium map get cilium_lb_act
: