-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for tcp_long_connection_metrics #1224
base: main
Are you sure you want to change the base?
Conversation
Some questions:
|
The accesslog is printed after connection closed. Can keep it as now Integrate with OTEL sounds reasonable to me. |
|
|
||
- Reporting of metrics and access logs, at periodic time or based on throughput (e.g. after transfer of 1mb of data). | ||
|
||
- User can fine tune the time and throughput using yaml during kmesh deployment or can use CLI tool kmeshctl anytime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this mean? How to fine tune the time and throughput?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking to, give users options to set their prefered periodic time and threshold values during the start of kmesh daemon, be setting the values in values.yaml file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
understand.
You can change the description
} | ||
``` | ||
|
||
The value of the period or the threshold is provided by the user, if not a default value of 5 seconds and 1 mb is chosen. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you able to explain in the proposal why 1MB is the threshold?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the threshold were set too low, the system might generate too many reports, leading to noise and increased processing overhead, 1 mb threshold sounds appropriate to me. We are also giving users options to set their own threshold if he is not satisfied with 1mb
@nlgwcy PTAL |
Sorry for inactivity i am sick from last 6 days, i will reply to all your queries and complete my proposal asap. |
authors: | ||
- "yp969803" | ||
reviewers: | ||
- "nglwcy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- "nglwcy" | |
- "nlgwcy" |
know that this has succeeded? | ||
--> | ||
|
||
- Collect detailed traffic metrics (e.g. bytes send/recieved, direction, throughput, round-trip time, latency , state-change) continously during the lifetime of long TCP connections using ebpf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is recommended that TCP retransmission and packet loss measurement indicators be added.
@LiZhenCheng9527 can you review the ebpf code in the proposal! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please fix typo in your proposal
- In Kmesh's dual-engine mode, traffic passes through a waypoint. In the original metrics, the origin destination would be preserved before modifying the TCP metadata. However, I noticed that the current proposal does not address this aspect.
}; | ||
|
||
struct ipv6_addr { | ||
__u8 addr[16]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If ipv4
and ipv6
are represented together, why not use __u32 addr[4]
. Is there a difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using union for ipv4 or ipv6,
union {
struct ipv4_addr v4;
struct ipv6_addr v6;
} saddr;
union {
struct ipv4_addr v4;
struct ipv6_addr v6;
} daddr;
This approach enhance code readiblity, i can use a single "__u32 addr[4]", which might require conversion logic (e.g., mapping IPv4 into an IPv6-like format), adding complexity
|
||
Using various ebpf tracepoints hooks to collects metrics of tcp_long_collection, a ring buffer is also decleared to send data from kernel space to userspace | ||
|
||
Code update in tracepoint.c file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is best to add a separate file.
key.dport = ctx->dport; | ||
|
||
if (ctx->newstate == TCP_ESTABLISHED) { | ||
struct long_tcp_metrics m = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kernel-native mode has implemented some observation capabilities, such as on_cluster_sock_connect
and on_cluster_sock_close
. It is recommended to enhance them based on the existing implementation;
|
||
struct long_tcp_metrics *m = bpf_map_lookup_elem(&conn_metrics_map, &key); | ||
if (m) { | ||
__sync_fetch_and_add(&m->bytes_sent, bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The socket structure already stores link information such as the number of bytes sent and received. Why do we need to calculate it? You can refer to tcp_report
|
||
|
||
SEC("tracepoint/tcp/tcp_set_state") | ||
int trace_tcp_set_state(struct trace_event_raw_tcp_set_state *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All newly added BPF observation hook points need to use is_managed_by_kmesh
to determine whether the current link is taken over by Kmesh
// Flush Function: Periodically invoked via a perf event. | ||
// Iterates over the conn_metrics_map and submits events for connections | ||
// that have been open longer than LONG_CONN_THRESHOLD_NS. | ||
SEC("perf_event/flush") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the periodic reporting of the indicator information of the long link be realized based on bpf_timer
? For reference: https://github.com/Asphaltt/learn-by-example/blob/main/ebpf/timer/tcp-connecting.c
what is the difference between kmesh_map64, kmesh_map192, kmesh_map296, kmesh_map1600 ebpf maps, they all look similar? |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
These are inner implementions, you should never use them directly |
you can refer to #1029 and https://github.com/kmesh-net/kmesh/blob/main/docs/proposal/map-in-map_management_enhancement-en.md |
|
||
Currently kmesh provides access logs during termination and establisment of a TCP connection with more detailed information about the connection. | ||
|
||
Kmesh also provides metrics during connection establishment, completion and deny apturing a variety of details about the connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kmesh also provides metrics during connection establishment, completion and deny apturing a variety of details about the connection. | |
Kmesh also provides metrics during connection establishment, completion and deny capturing a variety of details about the connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update once the solution settle down
nitty-gritty. | ||
--> | ||
|
||
Metrics will be collected using eBPF tracepoint hooks, and a eBPF map will be used to transfer metrics from kernel space to userspace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracepoint hooks is not consistent with impl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to update the proposal with the new implementations:
#1249
this is the pr with the latest work
What type of PR is this?
Proposal for "Metrics for TCP Long Connection"
LFX 2025 term-1
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #1211
Special notes for your reviewer:
Does this PR introduce a user-facing change?: