fix(shutdown): Prevent race condition when GlobalObject destruction routine unlocks global mutex #8652

TreeHunter9 · 2025-07-17T13:08:00Z

Unlocking the global mutex in the GlobalObject destruction routine makes it possible for a new attachment to slip in: it creates a new GlobalObject and starts using it while the destruction routine is still running. This can leave global objects, such as shared memory, in an undefined state, with one thread actively using an object while another thread is destroying it.

v5 is also affected.

Examples of this race condition:

Example 1 - Deadlock.
Thread 1 is holding sh_mem_mutex (at Jrd::LockManager::~LockManager), and waiting for flock on initFile;
Thread 2 is holding flock on initFile (at Firebird::SharedMemoryBase::SharedMemoryBase), and waiting for sh_mem_mutex;

Trace

thread #1
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7ffff65f76b8) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=0, abstime=0x0, clockid=0, expected=0, futex_word=0x7ffff65f76b8) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7ffff65f76b8, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at ./nptl/futex-internal.c:139
#3  0x00007ffff7693a41 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7ffff65f7708, cond=0x7ffff65f7690) at ./nptl/pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x7ffff65f7690, mutex=0x7ffff65f7708) at ./nptl/pthread_cond_wait.c:627
#5  0x00007ffff57e6651 in Firebird::Condition::wait (this=0x7ffff65f7690, m=...) at /src/common/../common/classes/condition.h:192
#6  0x00007ffff5e96ac1 in Firebird::SharedFileInfo::lock (this=0x7ffff65f7680, shared=false, wait=true, init=0x0) at /src/common/isc_sync.cpp:359
#7  0x00007ffff5e91075 in Firebird::FileLock::setlock (this=0x7ffff7849930, mode=Firebird::FileLock::FLM_EXCLUSIVE) at /src/common/isc_sync.cpp:508
#8  0x00007ffff5e910b6 in Firebird::FileLock::setlock (this=0x7ffff7849930, status=0x7ffff15fc8a0, mode=Firebird::FileLock::FLM_EXCLUSIVE) at /src/common/isc_sync.cpp:517
#9  0x00007ffff5e90c7a in (anonymous namespace)::FileLockHolder::FileLockHolder (this=0x7ffff15fc9d0, l=0x7ffff7849930) at /src/common/isc_sync.cpp:178
#10 0x00007ffff5e91cd5 in Firebird::SharedMemoryBase::removeMapFile (this=0x7ffff4e6b650) at /src/common/isc_sync.cpp:1138
#11 0x00007ffff5d9bac2 in Jrd::LockManager::~LockManager (this=0x7ffff319b820, __in_chrg=<optimized out>) at /src/lock/lock.cpp:247
#12 0x00007ffff57f3dad in Firebird::SimpleDelete<Jrd::LockManager>::clear (ptr=0x7ffff319b820) at /src/include/../common/classes/auto.h:46
#13 0x00007ffff57f351f in Firebird::AutoPtr<Jrd::LockManager, Firebird::SimpleDelete>::operator= (this=0x7ffff784fde0, v=0x0) at /src/include/../common/classes/auto.h:122
#14 0x00007ffff57f0ee5 in Jrd::Database::GlobalObjectHolder::~GlobalObjectHolder (this=0x7ffff784fd80, __in_chrg=<optimized out>) at /src/jrd/Database.cpp:634
#15 0x00007ffff57f1010 in Jrd::Database::GlobalObjectHolder::~GlobalObjectHolder (this=0x7ffff784fd80, __in_chrg=<optimized out>) at /src/jrd/Database.cpp:641
#16 0x00007ffff57a162b in Firebird::RefCounted::release (this=0x7ffff784fd80) at /src/include/../common/classes/RefCounted.h:47
#17 0x00007ffff57f0ade in Jrd::Database::GlobalObjectHolder::release (this=0x7ffff784fd80) at /src/jrd/Database.cpp:580
#18 0x00007ffff57f2d43 in Firebird::RefPtr<Jrd::Database::GlobalObjectHolder>::~RefPtr (this=0x7ffff297f1f8, __in_chrg=<optimized out>) at /src/include/../common/classes/RefCounted.h:140
#19 0x00007ffff57ef49d in Jrd::Database::~Database (this=0x7ffff297ecd0, __in_chrg=<optimized out>) at /src/jrd/Database.cpp:185
#20 0x00007ffff5a2b735 in Jrd::Database::destroy (toDelete=0x7ffff297ecd0) at /src/jrd/../jrd/../jrd/Database.h:422
#21 0x00007ffff5a2252c in JRD_shutdown_database (dbb=0x7ffff297ecd0, flags=3) at /src/jrd/jrd.cpp:8274
#22 0x00007ffff5a23793 in purge_attachment (tdbb=0x7ffff15fd518, sAtt=0x7ffff319b040, flags=2) at /src/jrd/jrd.cpp:8678
#23 0x00007ffff5a0fd9f in Jrd::JAttachment::freeEngineData (this=0x7ffff319b250, user_status=0x7ffff15fd6d0, forceFree=false) at /src/jrd/jrd.cpp:3455
#24 0x00007ffff5a0fb50 in Jrd::JAttachment::internalDetach (this=0x7ffff319b250, user_status=0x7ffff15fd6d0) at /src/jrd/jrd.cpp:3392
#25 0x00007ffff5a0fba7 in Jrd::JAttachment::detach (this=0x7ffff319b250, user_status=0x7ffff15fd6d0) at /src/jrd/jrd.cpp:3404
#26 0x00007ffff5a3aad4 in Firebird::IAttachmentBaseImpl<Jrd::JAttachment, Firebird::CheckStatusWrapper, Firebird::IReferenceCountedImpl<Jrd::JAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Jrd::JAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IAttachment> > > > >::cloopdetachDispatcher (self=0x7ffff319b258, status=0x7ffff15fd908) at /src/include/firebird/IdlFbInterfaces.h:12325
#27 0x00007ffff7af228b in Firebird::IAttachment::detach<Firebird::CheckStatusWrapper> (this=0x7ffff319b258, status=0x7ffff15fd900) at /src/include/firebird/IdlFbInterfaces.h:2846
#28 0x00007ffff7ad6d4e in operator() (__closure=0x7ffff15fd880) at /src/yvalve/why.cpp:6081
#29 0x00007ffff7ae37f6 in std::__invoke_impl<void, Why::YAttachment::detach(Firebird::CheckStatusWrapper*)::<lambda()>&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#30 0x00007ffff7ae1b4a in std::__invoke_r<void, Why::YAttachment::detach(Firebird::CheckStatusWrapper*)::<lambda()>&>(struct {...} &) (__fn=...) at /usr/include/c++/11/bits/invoke.h:111
#31 0x00007ffff7adf5a5 in std::_Function_handler<void(), Why::YAttachment::detach(Firebird::CheckStatusWrapper*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/11/bits/std_function.h:290
#32 0x00007ffff7af7068 in std::function<void ()>::operator()() const (this=0x7ffff15fd880) at /usr/include/c++/11/bits/std_function.h:590
#33 0x00007ffff7af2366 in Why::done<Why::YAttachment>(Firebird::CheckStatusWrapper*, Why::YEntry<Why::YAttachment>&, Why::YAttachment*, std::function<void ()>, std::function<void ()>) (status=0x7ffff15fd900, entry=..., y=0x7ffff7e74450, newClose=..., oldClose=...) at /src/yvalve/why.cpp:1360
#34 0x00007ffff7ad6ebd in Why::YAttachment::detach (this=0x7ffff7e74450, status=0x7ffff15fd900) at /src/yvalve/why.cpp:6080
#35 0x00007ffff7b1a5e8 in Firebird::IAttachmentBaseImpl<Why::YAttachment, Firebird::CheckStatusWrapper, Firebird::IReferenceCountedImpl<Why::YAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Why::YAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IAttachment> > > > >::cloopdetachDispatcher (self=0x7ffff7e74458, status=0x7ffff15fd9a8) at /src/include/firebird/IdlFbInterfaces.h:12325
#36 0x00005555555df102 in Firebird::IAttachment::detach<Firebird::CheckStatusWrapper> (this=0x7ffff7e74458, status=0x7ffff15fd9a0) at /src/include/firebird/IdlFbInterfaces.h:2846
#37 0x00005555555c7fb4 in rem_port::end_database (this=0x7ffff7e54e50, sendL=0x7ffff2d9f068) at /src/remote/server/server.cpp:3274
#38 0x00005555555cf3f3 in process_packet (port=0x7ffff7e54e50, sendL=0x7ffff2d9f068, receive=0x7ffff2d9f640, result=0x7ffff15fdc58) at /src/remote/server/server.cpp:5187
#39 0x00005555555d578a in loopThread () at /src/remote/server/server.cpp:6987
#40 0x00005555555fe5ec in (anonymous namespace)::ThreadArgs::run (this=0x7ffff15fdd90) at /src/common/ThreadStart.cpp:78
#41 0x00005555555fe6bc in (anonymous namespace)::threadStart (arg=0x7ffff7e653d0) at /src/common/ThreadStart.cpp:94

thread #2
#0  futex_wait (private=128, expected=2, futex_word=0x7ffff65eb010) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x7ffff65eb010, private=128) at ./nptl/lowlevellock.c:49
#2  0x00007ffff7698002 in lll_mutex_lock_optimized (mutex=0x7ffff65eb010) at ./nptl/pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=0x7ffff65eb010) at ./nptl/pthread_mutex_lock.c:93
#4  0x00007ffff5e92f5a in Firebird::SharedMemoryBase::mutexLock (this=0x7ffff4eef950) at /src/common/isc_sync.cpp:2754
#5  0x00007ffff5d9e17f in Jrd::LockManager::acquire_shmem (this=0x7ffff319fbc0, owner_offset=78584) at /src/lock/lock.cpp:1067
#6  0x00007ffff5da6f04 in Jrd::LockManager::LockTableGuard::LockTableGuard (this=0x7fffe26c4110, lm=0x7ffff319fbc0, f=0x7ffff615a6d7 "enqueue", owner=78584) at /src/lock/../lock/lock_proto.h:310
#7  0x00007ffff5d9c2cf in Jrd::LockManager::enqueue (this=0x7ffff319fbc0, tdbb=0x7fffe26c6f78, statusVector=0x7fffe26c4340, prior_request=0, series=25, value=0x7ffff0038ad8 "", length=8, type=3 '\003', ast_routine=0x7ffff5aa5e5c <Jrd::TipCache::tpc_block_blocking_ast(void*)>, ast_argument=0x7ffff0038a50, data=0, lck_wait=1, owner_offset=78584) at /src/lock/lock.cpp:468
#8  0x00007ffff5a440d6 in enqueue (tdbb=0x7fffe26c6f78, statusVector=0x7fffe26c4340, lock=0x7ffff0038a60, level=3, wait=1) at /src/jrd/lck.cpp:948
#9  0x00007ffff5a4590d in ENQUEUE (tdbb=0x7fffe26c6f78, statusVector=0x7fffe26c4340, lock=0x7ffff0038a60, level=3, wait=1) at /src/jrd/lck.cpp:149
#10 0x00007ffff5a43671 in LCK_lock (tdbb=0x7fffe26c6f78, lock=0x7ffff0038a60, level=3, wait=1) at /src/jrd/lck.cpp:675
#11 0x00007ffff5aa49b8 in Jrd::TipCache::StatusBlockData::StatusBlockData (this=0x7ffff0038a50, tdbb=0x7fffe26c6f78, tipCache=0x7ffff0038740, blockSize=4194304, blkNumber=0) at /src/jrd/tpc.cpp:387
#12 0x00007ffff5aa51eb in Jrd::TipCache::createTransactionStatusBlock (this=0x7ffff0038740, blockSize=4194304, blockNumber=0) at /src/jrd/tpc.cpp:500
#13 0x00007ffff5aa45f7 in Jrd::TipCache::loadInventoryPages (this=0x7ffff0038740, tdbb=0x7fffe26c6f78, header=0x7ffff4e24000) at /src/jrd/tpc.cpp:340
#14 0x00007ffff5aa333d in Jrd::TipCache::GlobalTpcInitializer::initialize (this=0x7ffff0038760, sm=0x7fffe2c86b50, initFlag=true) at /src/jrd/tpc.cpp:80
#15 0x00007ffff5e9267d in Firebird::SharedMemoryBase::SharedMemoryBase (this=0x7fffe2c86b50, filename=0x7ffff00389a0 "fb_tpc_0203010000000000f2552c0000000000", length=136, callback=0x7ffff0038760, skipLock=false) at /src/common/isc_sync.cpp:1383
#16 0x00007ffff5aa8587 in Firebird::SharedMemory<Jrd::TipCache::GlobalTpcHeader>::SharedMemory (this=0x7fffe2c86b50, fileName=0x7ffff00389a0 "fb_tpc_0203010000000000f2552c0000000000", size=136, cb=0x7ffff0038760, skipLock=false) at /src/jrd/../jrd/../jrd/../jrd/../lock/../common/isc_s_proto.h:344
#17 0x00007ffff5aa3f60 in Jrd::TipCache::initializeTpc (this=0x7ffff0038740, tdbb=0x7fffe26c6f78) at /src/jrd/tpc.cpp:251
#18 0x00007ffff5a2c27d in Jrd::TipCache::create (tdbb=0x7fffe26c6f78) at /src/jrd/../jrd/tpc_proto.h:64
#19 0x00007ffff5a096a9 in Jrd::JProvider::internalAttach (this=0x7ffff319ec80, user_status=0x7fffe26c8010, filename=0x7ffff2d9caf0 "employee", dpb_length=366, dpb=0x7ffff2d9c950 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM", existingId=0x0) at /src/jrd/jrd.cpp:1917
#20 0x00007ffff5a0892f in Jrd::JProvider::attachDatabase (this=0x7ffff319ec80, user_status=0x7fffe26c8010, filename=0x7ffff2d9caf0 "employee", dpb_length=366, dpb=0x7ffff2d9c950 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/jrd/jrd.cpp:1665
#21 0x00007ffff57ec9b7 in Firebird::IProviderBaseImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::IPluginBaseImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IReferenceCountedImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IProvider> > > > > > >::cloopattachDatabaseDispatcher (self=0x7ffff319ec88, status=0x7fffe26c8668, fileName=0x7ffff2d9caf0 "employee", dpbLength=366, dpb=0x7ffff2d9c950 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:12652
#22 0x00007ffff7af39c9 in Firebird::IProvider::attachDatabase<Firebird::CheckStatusWrapper> (this=0x7ffff319ec88, status=0x7fffe26c8660, fileName=0x7ffff2d9caf0 "employee", dpbLength=366, dpb=0x7ffff2d9c950 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:3033
#23 0x00007ffff7ad9538 in Why::Dispatcher::attachOrCreateDatabase (this=0x7ffff2d9c7e0, status=0x7fffe26c8660, createFlag=false, filename=0x7ffff7e6fb6c "employee", dpbLength=366, dpb=0x7ffff7e6fbf0 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/yvalve/why.cpp:6579
#24 0x00007ffff7ad8fa4 in Why::Dispatcher::attachDatabase (this=0x7ffff2d9c7e0, status=0x7fffe26c8660, filename=0x7ffff7e6fb6c "employee", dpbLength=366, dpb=0x7ffff7e6fbf0 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/yvalve/why.cpp:6489
#25 0x00007ffff7a81c7f in Firebird::IProviderBaseImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::IPluginBaseImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IReferenceCountedImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IProvider> > > > > > >::cloopattachDatabaseDispatcher (self=0x7ffff2d9c7e8, status=0x7fffe26c8758, fileName=0x7ffff7e6fb6c "employee", dpbLength=366, dpb=0x7ffff7e6fbf0 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:12652
#26 0x00005555555bd4af in Firebird::IProvider::attachDatabase<Firebird::CheckStatusWrapper> (this=0x7ffff2d9c7e8, status=0x7fffe26c8750, fileName=0x7ffff7e6fb6c "employee", dpbLength=366, dpb=0x7ffff7e6fbf0 "\001J>/gen/Debug/firebird/bin/isqlP'LI-T6.0.0.1036-dev Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:3033
#27 0x00005555555c59ea in (anonymous namespace)::DatabaseAuth::accept (this=0x7ffff7e6faf0, send=0x7ffff2da8568, authBlock=0x7ffff2da95a8) at /src/remote/server/server.cpp:2602
#28 0x00005555555c0199 in (anonymous namespace)::ServerAuth::authenticate (this=0x7ffff7e6faf0, send=0x7ffff2da8568, flags=0) at /src/remote/server/server.cpp:676
#29 0x00005555555c5582 in attach_database (port=0x7ffff2da3ed0, operation=op_attach, attach=0x7ffff2da8ca8, send=0x7ffff2da8568) at /src/remote/server/server.cpp:2539
#30 0x00005555555cf0dc in process_packet (port=0x7ffff2da3ed0, sendL=0x7ffff2da8568, receive=0x7ffff2da8b40, result=0x7fffe26c8c58) at /src/remote/server/server.cpp:5106
#31 0x00005555555d578a in loopThread () at /src/remote/server/server.cpp:6987
#32 0x00005555555fe5ec in (anonymous namespace)::ThreadArgs::run (this=0x7fffe26c8d90) at /src/common/ThreadStart.cpp:78
#33 0x00005555555fe6bc in (anonymous namespace)::threadStart (arg=0x7ffff7e6cb80) at /src/common/ThreadStart.cpp:94
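
The cycle in Example 1 has the shape of a classic lock-order inversion. Below is a minimal sketch of that shape, assuming two plain `std::mutex` objects as stand-ins for sh_mem_mutex and the flock on initFile. The names `shMemMutex`, `initFileLock`, and `demonstrateInversion` are hypothetical and are not Firebird code; `try_lock()` is used only so the sketch reports the cycle instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; not the Firebird objects themselves.
std::mutex shMemMutex;    // plays the role of sh_mem_mutex
std::mutex initFileLock;  // plays the role of the flock on initFile

// Returns true when each thread finds its second lock held by the other one.
bool demonstrateInversion()
{
    std::atomic<int> step{0};
    bool destroyerStuck = false;
    bool attacherStuck = false;

    auto worker = [&step](std::mutex& first, std::mutex& second, bool& stuck)
    {
        std::lock_guard<std::mutex> hold(first);
        ++step;
        while (step.load() < 2)          // wait until both threads hold their first lock
            std::this_thread::yield();

        // A blocking lock() here would never return; try_lock() only reports it.
        if (second.try_lock())
            second.unlock();
        else
            stuck = true;

        ++step;
        while (step.load() < 4)          // keep `first` held until both threads have tried
            std::this_thread::yield();
    };

    // Thread 1 in the trace: holds the shared-memory mutex, wants the file lock.
    std::thread destroyer(worker, std::ref(shMemMutex), std::ref(initFileLock),
                          std::ref(destroyerStuck));
    // Thread 2 in the trace: holds the file lock, wants the shared-memory mutex.
    std::thread attacher(worker, std::ref(initFileLock), std::ref(shMemMutex),
                         std::ref(attacherStuck));
    destroyer.join();
    attacher.join();

    assert(step.load() == 4);
    return destroyerStuck && attacherStuck;
}
```

Calling `demonstrateInversion()` from a test driver should return true: neither thread can take its second lock while the other still holds it.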

Example 2 - Crash.
Thread 1 - A new attachment tries to use the deleted shared file for LockManager.
Thread 2 - Completes the JRD_shutdown_database routine, clears GlobalObject, and leaves without any trace...

Trace

thread #1
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140736995522112) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140736995522112) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140736995522112, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7642476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff76287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff5e4a2b7 in fb_utils::logAndDie (text=0x7fffe29fa290 "Fatal lock manager error: Process disappeared in LockManager::acquire_shmem, errno: 1\n--Operation not permitted") at /src/common/utils.cpp:1452
#6  0x00007ffff5d3971b in Jrd::LockManager::bug (this=0x7ffff4ed7b60, statusVector=0x0, string=0x7ffff60970e0 "Process disappeared in LockManager::acquire_shmem") at /src/lock/lock.cpp:1643
#7  0x00007ffff5d37de5 in Jrd::LockManager::acquire_shmem (this=0x7ffff4ed7b60, owner_offset=78584) at /src/lock/lock.cpp:1075
#8  0x00007ffff5d3f272 in Jrd::LockManager::LockTableGuard::LockTableGuard (this=0x7fffe29fc4e0, lm=0x7ffff4ed7b60, f=0x7ffff6097080 "enqueue", owner=78584) at /src/lock/../lock/lock_proto.h:310
#9  0x00007ffff5d361d7 in Jrd::LockManager::enqueue (this=0x7ffff4ed7b60, tdbb=0x7fffe29fcf98, statusVector=0x7fffe29fc710, prior_request=0, series=29, value=0x7ffff7889f78 "", length=0, type=6 '\006', ast_routine=0x7ffff57cdb68 <Jrd::CryptoManager::blockingAstChangeCryptState(void*)>, ast_argument=0x7ffff54e6e40, data=0, lck_wait=0, owner_offset=78584) at /src/lock/lock.cpp:468
#10 0x00007ffff5a0978d in enqueue (tdbb=0x7fffe29fcf98, statusVector=0x7fffe29fc710, lock=0x7ffff7889f00, level=6, wait=0) at /src/jrd/lck.cpp:948
#11 0x00007ffff5a0a9cf in ENQUEUE (tdbb=0x7fffe29fcf98, statusVector=0x7fffe29fc710, lock=0x7ffff7889f00, level=6, wait=0) at /src/jrd/lck.cpp:149
#12 0x00007ffff5a09069 in LCK_lock (tdbb=0x7fffe29fcf98, lock=0x7ffff7889f00, level=6, wait=0) at /src/jrd/lck.cpp:675
#13 0x00007ffff57c9174 in Jrd::CryptoManager::lockAndReadHeader (this=0x7ffff54e6e40, tdbb=0x7fffe29fcf98, flags=1) at /src/jrd/CryptoManager.cpp:379
#14 0x00007ffff57cbc31 in Jrd::CryptoManager::attach (this=0x7ffff54e6e40, tdbb=0x7fffe29fcf98, att=0x7ffff65ae040) at /src/jrd/CryptoManager.cpp:892
#15 0x00007ffff59d1283 in Jrd::JProvider::internalAttach (this=0x7ffff4ed54c0, user_status=0x7fffe29fe030, filename=0x7ffff7f95e40 "employee", dpb_length=362, dpb=0x7ffff7f9c1c0 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM", existingId=0x0) at /src/jrd/jrd.cpp:1909
#16 0x00007ffff59d05cf in Jrd::JProvider::attachDatabase (this=0x7ffff4ed54c0, user_status=0x7fffe29fe030, filename=0x7ffff7f95e40 "employee", dpb_length=362, dpb=0x7ffff7f9c1c0 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/jrd/jrd.cpp:1665
#17 0x00007ffff57d48f7 in Firebird::IProviderBaseImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::IPluginBaseImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IReferenceCountedImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Jrd::JProvider, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IProvider> > > > > > >::cloopattachDatabaseDispatcher (self=0x7ffff4ed54c8, status=0x7fffe29fe688, fileName=0x7ffff7f95e40 "employee", dpbLength=362, dpb=0x7ffff7f9c1c0 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:12652
#18 0x00007ffff7aee01d in Firebird::IProvider::attachDatabase<Firebird::CheckStatusWrapper> (this=0x7ffff4ed54c8, status=0x7fffe29fe680, fileName=0x7ffff7f95e40 "employee", dpbLength=362, dpb=0x7ffff7f9c1c0 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:3033
#19 0x00007ffff7ad4e1a in Why::Dispatcher::attachOrCreateDatabase (this=0x7ffff7f9ca40, status=0x7fffe29fe680, createFlag=false, filename=0x7ffff7e68afc "employee", dpbLength=362, dpb=0x7ffff7e66b90 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/yvalve/why.cpp:6579
#20 0x00007ffff7ad4886 in Why::Dispatcher::attachDatabase (this=0x7ffff7f9ca40, status=0x7fffe29fe680, filename=0x7ffff7e68afc "employee", dpbLength=362, dpb=0x7ffff7e66b90 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/yvalve/why.cpp:6489
#21 0x00007ffff7a80693 in Firebird::IProviderBaseImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::IPluginBaseImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IReferenceCountedImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Why::Dispatcher, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IProvider> > > > > > >::cloopattachDatabaseDispatcher (self=0x7ffff7f9ca48, status=0x7fffe29fe778, fileName=0x7ffff7e68afc "employee", dpbLength=362, dpb=0x7ffff7e66b90 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:12652
#22 0x00005555555ba779 in Firebird::IProvider::attachDatabase<Firebird::CheckStatusWrapper> (this=0x7ffff7f9ca48, status=0x7fffe29fe770, fileName=0x7ffff7e68afc "employee", dpbLength=362, dpb=0x7ffff7e66b90 "\001J>/gen/Debug/firebird/bin/isqlP#LI-T6.0.0.1036 Firebird 6.0 Initial>9/gen/Debug/firebird/binR\006treepcS\ntreehunterM") at /src/include/firebird/IdlFbInterfaces.h:3033
#23 0x00005555555c27fa in (anonymous namespace)::DatabaseAuth::accept (this=0x7ffff7e68a80, send=0x7ffff65a7358, authBlock=0x7ffff65a8398) at /src/remote/server/server.cpp:2602
#24 0x00005555555bd3a5 in (anonymous namespace)::ServerAuth::authenticate (this=0x7ffff7e68a80, send=0x7ffff65a7358, flags=0) at /src/remote/server/server.cpp:676
#25 0x00005555555c23ec in attach_database (port=0x7ffff65a2cc0, operation=op_attach, attach=0x7ffff65a7a98, send=0x7ffff65a7358) at /src/remote/server/server.cpp:2539
#26 0x00005555555cba76 in process_packet (port=0x7ffff65a2cc0, sendL=0x7ffff65a7358, receive=0x7ffff65a7930, result=0x7fffe29fec78) at /src/remote/server/server.cpp:5106
#27 0x00005555555d1ea4 in loopThread () at /src/remote/server/server.cpp:6987
#28 0x00005555555f89c6 in (anonymous namespace)::ThreadArgs::run (this=0x7fffe29fed90) at /src/common/ThreadStart.cpp:78
#29 0x00005555555f8a5d in (anonymous namespace)::threadStart (arg=0x7ffff7e67490) at /src/common/ThreadStart.cpp:94
#30 0x00007ffff7694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#31 0x00007ffff7726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

thread #2
It completed the `JRD_shutdown_database` routine, cleared GlobalObject, and left without any trace...

TreeHunter9 · 2025-07-17T14:29:04Z

Found an issue with the current fix, which uses a bool:
Thread 1: Disconnecting from db1, AutoSetRestore saves false as the old value and sets g_shuttingDown to true;
Thread 2: Disconnecting from db2, AutoSetRestore saves true as the old value and sets g_shuttingDown to true;
Thread 1: Restores the old value, setting g_shuttingDown to false;
Thread 2: Restores the old value, setting g_shuttingDown to true;
Now g_shuttingDown stays true forever, and no one can connect to any database.

If we instead set g_shuttingDown manually to true on acquire and to false on release, Thread 1 can set g_shuttingDown to false while Thread 2 is still in the middle of its shutdown, so we get the same race condition.
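
The first interleaving can be replayed deterministically on a single thread. This is only a sketch: `SaveAndRestore` is a hypothetical guard that assumes save-on-construct and restore-on-destruct behaviour, and it is not the actual AutoSetRestore class. The local `shuttingDown` stands in for g_shuttingDown.

```cpp
#include <cassert>
#include <memory>

// Hypothetical guard: saves the old value, sets a new one, restores on destruction.
class SaveAndRestore
{
public:
    SaveAndRestore(bool& target, bool newValue)
        : m_target(target), m_old(target)
    {
        m_target = newValue;
    }

    ~SaveAndRestore()
    {
        m_target = m_old;
    }

    SaveAndRestore(const SaveAndRestore&) = delete;
    SaveAndRestore& operator=(const SaveAndRestore&) = delete;

private:
    bool& m_target;
    const bool m_old;
};

// Replays the four steps from the comment above and returns the final flag value.
bool replayInterleaving()
{
    bool shuttingDown = false;  // stand-in for g_shuttingDown

    // Thread 1 starts its shutdown: saves false, sets true.
    auto thread1 = std::make_unique<SaveAndRestore>(shuttingDown, true);
    // Thread 2 starts its shutdown: saves true, sets true.
    auto thread2 = std::make_unique<SaveAndRestore>(shuttingDown, true);

    thread1.reset();  // Thread 1 finishes first: restores false.
    thread2.reset();  // Thread 2 finishes last: restores true.

    return shuttingDown;
}
```

`replayInterleaving()` returns true even though both shutdowns have finished, which is the stuck state described above.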

hvlad · 2025-07-17T14:48:11Z

It looks like, instead of a global flag common to all databases, we need a per-database flag.

Raw idea: remove the GlobalObjectHolder instance from g_hashTable in two stages: first replace it with some special fixed constant, then delete the instance, and finally remove the entry from the hash table.

When a concurrent creator finds that constant, it should wait until the entry is gone from the hash table before attempting to create a new instance.

Instead of a fixed constant, consider using some sync object (a mutex? a special instance of GlobalObjectHolder?) that can be waited on, instead of polling and sleeping in a loop.
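
A rough sketch of the two-stage removal, assuming a `std::unordered_map` in place of g_hashTable and a null pointer as the "special fixed constant". The `Registry`, `Holder`, and `Lookup` names are hypothetical and only illustrate the state transitions, not the waiting.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

struct Holder { std::string id; };  // hypothetical stand-in for GlobalObjectHolder

enum class Lookup { Created, Reused, BeingDestroyed };

class Registry
{
public:
    Lookup acquire(const std::string& id)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        auto it = m_table.find(id);
        if (it == m_table.end())
        {
            m_table[id] = std::make_unique<Holder>(Holder{id});
            return Lookup::Created;
        }
        // A null slot is the tombstone: the old instance is still being destroyed.
        return it->second ? Lookup::Reused : Lookup::BeingDestroyed;
    }

    // Stage 1: leave a tombstone in the table and hand the instance to the caller,
    // who destroys it without holding the table mutex.
    std::unique_ptr<Holder> beginDestroy(const std::string& id)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        auto it = m_table.find(id);
        if (it == m_table.end())
            return nullptr;
        return std::move(it->second);  // the slot is now nullptr
    }

    // Stage 2: the instance is gone, so the tombstone can be removed.
    void finishDestroy(const std::string& id)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        auto it = m_table.find(id);
        if (it != m_table.end() && !it->second)
            m_table.erase(it);
    }

private:
    std::mutex m_mutex;
    std::unordered_map<std::string, std::unique_ptr<Holder>> m_table;
};

bool replayTwoStageRemoval()
{
    Registry registry;
    if (registry.acquire("db1") != Lookup::Created)
        return false;

    std::unique_ptr<Holder> dying = registry.beginDestroy("db1");
    // A concurrent creator arriving now must not build a new instance yet.
    const bool creatorHeldBack = registry.acquire("db1") == Lookup::BeingDestroyed;

    dying.reset();                   // the slow destruction happens here
    registry.finishDestroy("db1");

    return creatorHeldBack && registry.acquire("db1") == Lookup::Created;
}
```

In this sketch a creator that sees `BeingDestroyed` would still have to poll, which is what the last paragraph of the comment suggests replacing with a waitable sync object.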

asfernandes · 2025-07-18T00:14:43Z

> Instead of fixed constant, consider to use some sync object (mutex? special instance of GlobalObjectHolder ?) that could be used to wait for, instead of poll + sleep in a loop.

This looks like the pattern of a mutex plus a condition variable that waits while g_shuttingDown is true and is notified when it becomes false.
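
A minimal sketch of that pattern using the standard library. `ShutdownGate` and its methods are hypothetical names, and the sketch assumes one gate per database with at most one shutdown in flight per gate, so it does not by itself address two overlapping shutdowns sharing one flag.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical gate: attachers wait while a shutdown is in progress.
class ShutdownGate
{
public:
    void beginShutdown()
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_shuttingDown = true;
    }

    void endShutdown()
    {
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            m_shuttingDown = false;
        }
        m_cond.notify_all();  // wake every attacher waiting on this gate
    }

    void waitUntilOpen()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        // The predicate form re-checks the flag after spurious wakeups.
        m_cond.wait(lock, [this] { return !m_shuttingDown; });
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_shuttingDown = false;
};

// Returns the order in which the two sides ran.
std::vector<std::string> replayGate()
{
    ShutdownGate gate;
    std::vector<std::string> events;

    gate.beginShutdown();
    std::thread attacher([&gate, &events]
    {
        gate.waitUntilOpen();          // blocks until endShutdown()
        events.push_back("attach");
    });

    events.push_back("destroy");       // stands in for the slow destruction
    gate.endShutdown();
    attacher.join();
    return events;
}
```

The attacher touches `events` only after `waitUntilOpen()` returns, so the mutex inside the gate orders the two `push_back` calls and the result is always "destroy" followed by "attach".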

TreeHunter9 · 2025-07-18T11:28:16Z

I reimplemented the fix by adding a mutex to DbId, so now there is no global flag, as Vlad suggested.
I couldn't find any issues with the new implementation, but maybe I missed something. I added some comments in the code to describe the underlying logic.
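
The general idea of a per-entry mutex can be sketched as follows. This is not the code from the pull request: `Entry`, `Registry`, `attach`, `destroy`, and the `alive` flag are hypothetical, and `std::shared_ptr` stands in for the reference counting on DbId. The destroyer holds the entry's own mutex across the slow teardown while the global mutex is released, so attachers to other databases are not blocked.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

struct Entry                       // hypothetical stand-in for DbId
{
    std::mutex shutdownMutex;      // held for the whole teardown of this entry
    bool alive = true;             // guarded by shutdownMutex
};

class Registry
{
public:
    std::shared_ptr<Entry> attach(const std::string& id)
    {
        for (;;)
        {
            std::shared_ptr<Entry> entry;
            {
                std::lock_guard<std::mutex> global(m_global);
                auto& slot = m_table[id];
                if (!slot)
                {
                    slot = std::make_shared<Entry>();
                    return slot;
                }
                entry = slot;
            }
            // Global mutex released: wait here if this entry is being torn down.
            std::lock_guard<std::mutex> wait(entry->shutdownMutex);
            if (entry->alive)
                return entry;
            // The entry was destroyed while we waited; retry and create a new one.
        }
    }

    template <typename Teardown>
    void destroy(const std::string& id, Teardown slowTeardown)
    {
        std::unique_lock<std::mutex> global(m_global);
        std::shared_ptr<Entry> entry = m_table.at(id);
        std::unique_lock<std::mutex> shutdown(entry->shutdownMutex);
        global.unlock();           // other databases can attach during teardown
        slowTeardown();
        entry->alive = false;
        global.lock();
        m_table.erase(id);
    }

private:
    std::mutex m_global;
    std::unordered_map<std::string, std::shared_ptr<Entry>> m_table;
};

bool replayPerEntryLock()
{
    Registry registry;
    std::shared_ptr<Entry> first = registry.attach("db1");
    std::shared_ptr<Entry> second;
    bool otherDbAttached = false;
    std::thread attacher;

    registry.destroy("db1", [&]
    {
        // A new attachment to db1 arrives in the middle of the teardown.
        attacher = std::thread([&] { second = registry.attach("db1"); });
        // An attachment to another database is not held up by db1.
        otherDbAttached = registry.attach("db2") != nullptr;
    });
    attacher.join();

    return otherDbAttached && second && second != first
        && !first->alive && second->alive;
}
```

In this sketch `attach` never takes the global mutex while holding an entry mutex, which is the property the lock-order question in the review thread further down is about.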

hvlad · 2025-07-19T11:13:12Z

src/jrd/Database.cpp

+				}
+				// Now we are the one who owned DbId object.
+				// It also was removed from hash table, so simply delete it and recreate it next.
+				fb_assert(entry->getRefCount() == 1);


If there are many concurrent initializers, this assert could be violated.
I think it is not needed.
Instead, you could set entry->holder to nullptr in ~GlobalObjectHolder() and check it for nullptr here.

hvlad · 2025-07-19T11:17:59Z

src/jrd/Database.cpp

+		// Stole the object from the hash table without incrementing ref counter, so we will be the one who will delete the object
+		// at the end of this function.
+		RefPtr<Database::GlobalObjectHolder::DbId> entry(REF_NO_INCR, g_hashTable->lookup(m_id));
+		fb_assert(entry);


Also add fb_assert(entry->holder == this)?

hvlad · 2025-07-19T11:26:03Z

src/jrd/Database.cpp

@@ -616,7 +649,8 @@ namespace Jrd
 		m_eventMgr = nullptr;


On entry, g_mutex should be locked.
At line 639, shutdownMutex is locked.
So the mutex order is: g_mutex, then shutdownMutex.

At line 643, g_mutex is unlocked while shutdownMutex is still locked.
At line 646, g_mutex is locked again, and the order is inverted: shutdownMutex, then g_mutex.
This is a path to deadlock.

Am I wrong?

TreeHunter9 · 2025-07-20

As far as I can see, every instance of DbId is linked to a specific GlobalObjectHolder, so shutdownMutex is a different object each time ~GlobalObjectHolder() is called. Therefore, a deadlock can occur only if the destructor is called twice on the same GlobalObjectHolder, which is not possible, so we are safe here.
shutdownMutex can be locked by another thread in GlobalObjectHolder::init, but only when g_mutex is unlocked. Therefore, a deadlock is not possible there either.

hvlad · 2025-07-19T11:26:59Z

src/jrd/Database.cpp

@@ -616,7 +649,8 @@ namespace Jrd
 		m_eventMgr = nullptr;
 		m_replMgr = nullptr;

-		delete entry;
+		if (!g_hashTable->remove(m_id))
+			fb_assert(false);



Add entry->holder = nullptr ?

TreeHunter9 marked this pull request as draft July 17, 2025 14:19

fix(shutdown): Reimplement synchronization between shutdown and init routine for GlobalObjectHolder

57024ac

TreeHunter9 marked this pull request as ready for review July 18, 2025 11:29

refactor(shutdown): Remove unused variable

6299042

hvlad reviewed Jul 19, 2025
