When profiling multithreaded applications, the profiler incorrectly assumes the entire process has exited when only the main thread terminates. This leads to the following issues:
The eBPF program reports the PID as dead when the task with pid == tgid (the main thread) exits
Userspace handles the dead process by cleaning up all process info and mappings
When the eBPF program detects activity from the remaining threads, it reports them as a new process again
The profiler fails to find any mappings for this "new" process and marks it as dead
This cycle repeats for the remaining active threads, leading to endless attempts to enable and disable profiling for this process.
Reproducer:
#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

int fib(int n)
{
    if (n <= 1) return 1;
    return fib(n-1) + fib(n-2);
}

void *thread_function(void *arg) {
    printf("Thread T1 is still running... %d\n", gettid());
    while (1) {
        fib(43);
    }
    return NULL;
}

int main() {
    printf("Main thread is starting... %d \n", gettid());
    pthread_t thread_id;
    if (pthread_create(&thread_id, NULL, thread_function, NULL) != 0) {
        return 1;
    }
    fib(43);
    printf("Main thread is exiting... %d \n", gettid());
    pthread_exit(NULL);
    return 0;
}
If we want to keep profiling the remaining threads when the main thread calls pthread_exit (instead of returning from main or explicitly calling exit), we must not unload the process information, which implies that we need to be able to detect when this happens. To me this seems like a legitimate use case that the profiler should support, even if it may not be very prevalent.
We can use the presence of /proc/PID, which will still be there if the main thread exits but other threads are still running, and update the mapping synchronization logic to NOT call processPIDExit when no mappings can be retrieved from /proc/PID/maps (which will be empty once the main thread exits).
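To make the proposed check concrete, here is a minimal sketch in C (the language of the reproducer; the profiler's own code would look different). The enum, the `classify_pid` name and the `proc_root` parameter are hypothetical and exist only for illustration; `proc_root` would normally be "/proc". Treating an unopenable maps file as "gone" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical states for this sketch; not names used by the profiler. */
enum pid_state {
    PID_GONE,        /* <proc_root>/<pid> does not exist: safe to clean up   */
    PID_MAIN_EXITED, /* directory exists but maps is empty: keep process info */
    PID_ALIVE        /* directory exists and maps has at least one entry      */
};

/* Classify a PID by looking at <proc_root>/<pid> and <proc_root>/<pid>/maps. */
static enum pid_state classify_pid(const char *proc_root, pid_t pid)
{
    char path[256];
    struct stat st;

    assert(proc_root != NULL);

    snprintf(path, sizeof(path), "%s/%d", proc_root, (int)pid);
    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
        return PID_GONE;

    snprintf(path, sizeof(path), "%s/%d/maps", proc_root, (int)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return PID_GONE; /* assumption: process vanished between the two checks */

    int c = fgetc(f);    /* an empty maps file yields EOF on the first read */
    fclose(f);
    return c == EOF ? PID_MAIN_EXITED : PID_ALIVE;
}
```

With something like this, the synchronization logic would only call processPIDExit for `PID_GONE` and would skip it for `PID_MAIN_EXITED`.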
This can take place in userspace code (no updates to eBPF needed). However, this creates two new problems:
1. Userspace will not receive another notification for the PID when the process eventually dies, so triggering cleanup in userspace is left solely to the periodic PID cleanup logic, which currently runs every 5 minutes. This may introduce a PID reuse concern.
2. As /proc/PID/maps will be empty after the main thread exits, the profiler has no way to detect mapping changes (e.g. made by one of the remaining threads).
For 1. we could execute the PID cleanup logic more frequently (e.g. once a minute or less). We could also keep a separate set of PIDs whose main thread has called pthread_exit and only use a higher frequency for them.
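As a rough illustration of the "separate set" idea, the sketch below keeps the PIDs whose main thread has exited in a small fixed-size set and sweeps only those on a faster timer. Everything here (`exited_main_set`, `exited_main_sweep`, the `proc_root` parameter, the 64-entry bound) is hypothetical and written in C to match the reproducer; it is not how the profiler currently does cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_TRACKED 64 /* arbitrary bound for the sketch */

/* PIDs whose main thread has exited while other threads keep running. */
struct exited_main_set {
    pid_t pids[MAX_TRACKED];
    size_t count;
};

/* Add a PID; duplicates are ignored. Returns false if the set is full. */
static bool exited_main_add(struct exited_main_set *s, pid_t pid)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++)
        if (s->pids[i] == pid)
            return true;
    if (s->count == MAX_TRACKED)
        return false;
    s->pids[s->count++] = pid;
    return true;
}

/* True while <proc_root>/<pid> exists as a directory. */
static bool proc_dir_exists(const char *proc_root, pid_t pid)
{
    char path[256];
    struct stat st;
    snprintf(path, sizeof(path), "%s/%d", proc_root, (int)pid);
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * High-frequency sweep over the small set only. PIDs that are gone are
 * removed from the set and written to dead[] so the caller can run the
 * regular exit cleanup for them. Returns the number of dead PIDs found.
 */
static size_t exited_main_sweep(struct exited_main_set *s, const char *proc_root,
                                pid_t *dead, size_t dead_cap)
{
    size_t kept = 0, ndead = 0;
    assert(s != NULL && proc_root != NULL);
    for (size_t i = 0; i < s->count; i++) {
        pid_t pid = s->pids[i];
        if (proc_dir_exists(proc_root, pid) || ndead == dead_cap)
            s->pids[kept++] = pid; /* still running, or no room to report yet */
        else
            dead[ndead++] = pid;
    }
    s->count = kept;
    return ndead;
}
```

The regular 5-minute cleanup could stay as it is; only this small set would be swept at the higher frequency, which shortens the window for PID reuse without scanning every tracked process more often.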
To fully support 2. we'd need a more elaborate solution. One option is to introduce thread tracking into the process manager (for example, instead of having the kernel report only the PID to userspace, we could also send the TID and have processmanager track TIDs as well).
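A minimal sketch of what per-process TID tracking could look like, assuming the kernel side reported both PID and TID. The struct and function names are hypothetical, the fixed-size array is only for brevity, and C is used to match the reproducer rather than the profiler's own code. The point is that a process is treated as exited only when its last tracked thread is gone, not when the thread with pid == tgid exits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_THREADS 128 /* arbitrary bound for the sketch */

/* Hypothetical per-process record that also tracks live TIDs. */
struct tracked_process {
    pid_t tgid;
    pid_t tids[MAX_THREADS];
    size_t ntids;
};

/* Record a TID reported alongside the PID. Returns false if the table is full. */
static bool thread_seen(struct tracked_process *p, pid_t tid)
{
    assert(p != NULL);
    for (size_t i = 0; i < p->ntids; i++)
        if (p->tids[i] == tid)
            return true;
    if (p->ntids == MAX_THREADS)
        return false;
    p->tids[p->ntids++] = tid;
    return true;
}

/*
 * Handle a thread-exit report. Returns true only when no tracked threads
 * remain, i.e. when it is safe to drop the process info and mappings.
 * Unknown TIDs are ignored.
 */
static bool thread_exited(struct tracked_process *p, pid_t tid)
{
    assert(p != NULL);
    for (size_t i = 0; i < p->ntids; i++) {
        if (p->tids[i] == tid) {
            p->ntids--;
            p->tids[i] = p->tids[p->ntids]; /* swap-remove, order is irrelevant */
            break;
        }
    }
    return p->ntids == 0;
}
```

For the reproducer above, the exit of the main thread would return false because T1 is still tracked, so the process info would be kept.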
Alternatively, we could first start with 1. and leave solving 2. for another day.
Potential workarounds:
disassociate_ctty
, but I'm not 100% sure yet; this requires more thorough investigation and 👀