Replies: 2 comments 18 replies
-
Speaking of the zombie state, you might have a similar problem to the one in #40596, where a badly written Docker image (a badly written init script) was not reaping zombie processes correctly. You can read it and see whether that is what causes your problems. As for the threading warning, this is an interesting one. I've not seen it before, and we should likely take a look at it. I don't recall that we start threads in Airflow by default, but other things might be at play, for example if you are using OpenLineage. I believe some older versions of OpenLineage started some threads, and that kind of warning could be caused by those threads starting before fork() happened. I also believe this was recently changed to multiprocessing: @mobuchowski @kacpermuda -> WDYT? Can you comment on that, please?
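To make "reaping" concrete, here is a minimal Python sketch, assuming a POSIX system. `reap_children` is a hypothetical helper written for illustration, not something from Airflow or from the image in #40596. It shows the loop a PID 1 process is expected to run: keep calling `waitpid` so that exited children do not stay in the `Z` state.

```python
import os
import time


def reap_children() -> list[int]:
    """Collect every already-exited child without blocking.

    Returns the PIDs that were reaped. Hypothetical helper for illustration.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call return immediately.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(0)  # the child exits at once and becomes a zombie until reaped

    time.sleep(0.5)  # give the child time to exit
    print("reaped:", reap_children())
```

In a container this job is normally left to a small init process (for example `tini`, or Docker's `--init` flag) rather than hand-written code, so the sketch is only meant to show what a broken init script fails to do.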
-
It would be good to know whether you are indeed using OpenLineage. When it comes to OpenLineage, we had these two PRs, Snowflake and Redshift, that may have been causing deadlocks. We later switched to EDIT: I did not notice you mentioned DockerOperator, so not everything I wrote makes sense.
-
Hello, on Airflow 2.9.2, tasks hang in the running status when using `DockerOperator` with `StandardTaskRunner` as the `task_runner`. There is a theory that the freezes are caused by deadlocks, because one of the first messages in the logs warns about possible deadlocks when `os.fork()` is used, and `StandardTaskRunner` uses it (message below).

The status of a hung task (process) is `S` (sleeping) in the `STAT` column, obtained using the `ps -aux` command. An attempt to wake up the process using the `kill -CONT <PID>` command was unsuccessful.

After creating a new attempt for a hung task with the `Clear task` button in the Airflow UI, the previous task (process) goes into the `Z` (zombie) state and stays in the system from then on. Zombie processes disappear only after a complete reboot of the system.

I will be glad to receive any comments and help in solving this problem. 🙏
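For anyone who wants to reproduce the check on the `STAT` column, a small Python sketch follows. It assumes a Linux host with procps `ps`. The names `parse_ps`, `find_zombies`, and `list_zombies` are hypothetical helpers, not part of Airflow. Note that `SIGCONT` only resumes stopped (`T`) processes, so it is not expected to affect a process in `S`.

```python
import subprocess


def parse_ps(output: str) -> list[dict]:
    """Parse the text printed by `ps -eo pid,ppid,stat,comm` into dicts."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header line
        # maxsplit=3 keeps command names with spaces, e.g. "python <defunct>"
        pid, ppid, stat, comm = line.split(None, 3)
        rows.append({"pid": int(pid), "ppid": int(ppid), "stat": stat, "comm": comm})
    return rows


def find_zombies(rows: list[dict]) -> list[dict]:
    """Keep only processes whose state starts with Z (zombie)."""
    return [row for row in rows if row["stat"].startswith("Z")]


def list_zombies() -> list[dict]:
    """Run ps on the current host and return its zombie processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return find_zombies(parse_ps(result.stdout))


if __name__ == "__main__":
    for proc in list_zombies():
        # The parent (ppid) is the process that should have called wait().
        print(f"zombie pid={proc['pid']} parent={proc['ppid']} comm={proc['comm']}")
```

The `ppid` of each zombie is the useful part: it points at the parent that has not collected the exit status, which is where a missing `wait()` or a broken init would show up.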