You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We're seeing some sort of a deadlock, or some other type of locking problem while opening new Postgres DB connections using ddtrace:2.10.4. The problem seems to have been introduced between 2.9.2 and 2.10.3. (I know that the problem does not occur on 2.9.2, but that is occurs on both 2.10.3 and 2.10.4, but haven't tried other versions. Unfortunately, it's tough for me to determine the specific release it was introduced in better, as I don't have a reproducible test case).
The problem manifests in an application where:
We're running async code under ASGI
We're calling some database code using asgiref SyncToAsync. Critically, SyncToAsync is implemented by executing the synchronous code in a ThreadPoolWorker
When this ThreadPoolWorker has all of its active threads exhausted (32 workers, in our case), then anything in our application that awaits something SyncToAsync will hang, ultimately crashing our app.
With some debug logging to examine that executor, we've managed to capture some stacks from the callables stuck in that executor:
Ultimately, there are various examples, but all of these end up stuck in:
File "/usr/local/lib/python3.10/site-packages/dj_db_conn_pool/core/mixins/core.py", line 58, in get_new_connection
with pool_container.lock:
File "/usr/local/lib/python3.10/site-packages/ddtrace/profiling/collector/_lock.py", line 241, in __enter__
return self._acquire(self.__wrapped__.__enter__, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/ddtrace/profiling/collector/_lock.py", line 105, in _acquire
return inner_func(*args, **kwargs)
I've noticed there've been several commits to collector/_lock.py in the past few releases since 2.9.2, but haven't caught the breaking change.
Here's the full stack:
ile "/usr/local/lib/python3.10/threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/threading.py", line 99, in run
return _run_old_run_func()
File "/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/threading.py", line 92, in _run_old_run_func
return old_run_func(self, *a, **kw)
File "/usr/local/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 83, in _worker
work_item.run()
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.10/site-packages/ddtrace/contrib/futures/threading.py", line 44, in _wrap_execution
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/channels/db.py", line 13, in thread_handler
return super().thread_handler(loop, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/asgiref/sync.py", line 522, in thread_handler
return func(*args, **kwargs)
File "/usr/src/odm/papi/authentication.py", line 23, in authenticate_token
return User.objects.filter(id=int(access_token['user_id'])).first(), token
File "/usr/local/lib/python3.10/site-packages/cacheops/query.py", line 350, in first
return self._no_monkey.first(self._clone().cache())
File "/usr/local/lib/python3.10/site-packages/django/db/models/query.py", line 1057, in first
for obj in queryset[:1]:
File "/usr/local/lib/python3.10/site-packages/django/db/models/query.py", line 398, in __iter__
self._fetch_all()
File "/usr/local/lib/python3.10/site-packages/cacheops/query.py", line 260, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/usr/local/lib/python3.10/site-packages/django/db/models/query.py", line 91, in __iter__
results = compiler.execute_sql(
File "/usr/local/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1560, in execute_sql
cursor = self.connection.cursor()
File "/usr/local/lib/python3.10/site-packages/ddtrace/contrib/trace_utils.py", line 335, in wrapper
return func(mod, pin, wrapped, instance, args, kwargs)
File "/usr/local/lib/python3.10/site-packages/ddtrace/contrib/django/patch.py", line 144, in cursor
cursor = func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 330, in cursor
return self._cursor()
File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 306, in _cursor
self.ensure_connection()
File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
self.connect()
File "/usr/local/lib/python3.10/site-packages/sentry_sdk/utils.py", line 1713, in runner
return sentry_patched_function(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/django/__init__.py", line 689, in connect
return real_connect(self)
File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 270, in connect
self.connection = self.get_new_connection(conn_params)
File "/usr/local/lib/python3.10/site-packages/dj_db_conn_pool/backends/postgresql/mixins.py", line 18, in get_new_connection
connection = super().get_new_connection(conn_params)
File "/usr/local/lib/python3.10/site-packages/dj_db_conn_pool/core/mixins/core.py", line 58, in get_new_connection
with pool_container.lock:
File "/usr/local/lib/python3.10/site-packages/ddtrace/profiling/collector/_lock.py", line 241, in __enter__
return self._acquire(self.__wrapped__.__enter__, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/ddtrace/profiling/collector/_lock.py", line 105, in _acquire
return inner_func(*args, **kwargs)
The text was updated successfully, but these errors were encountered:
We're seeing some sort of a deadlock, or some other type of locking problem while opening new Postgres DB connections using
ddtrace:2.10.4
. The problem seems to have been introduced between2.9.2
and2.10.3
. (I know that the problem does not occur on2.9.2
, but that is occurs on both2.10.3
and2.10.4
, but haven't tried other versions. Unfortunately, it's tough for me to determine the specific release it was introduced in better, as I don't have a reproducible test case).The problem manifests in an application where:
SyncToAsync
. Critically,SyncToAsync
is implemented by executing the synchronous code in a ThreadPoolWorkerWhen this
ThreadPoolWorker
has all of its active threads exhausted (32 workers, in our case), then anything in our application that awaits somethingSyncToAsync
will hang, ultimately crashing our app.With some debug logging to examine that executor, we've managed to capture some stacks from the callables stuck in that executor:
Ultimately, there are various examples, but all of these end up stuck in:
I've noticed there've been several commits to
collector/_lock.py
in the past few releases since 2.9.2, but haven't caught the breaking change.Here's the full stack:
The text was updated successfully, but these errors were encountered: