Search before asking
- I searched the issues and found no similar issues.
Description
When the catalog recycle bin contains a large number of partitions, the recycle bin daemon thread holds the CatalogRecycleBin lock <0x000000046c2bcc68> for an extended period. Other operations that need the same lock, such as DROP PARTITION, block until it is released, as the following thread dump shows:
"thrift-server-pool-86" #19384 daemon prio=5 os_prio=0 tid=0x00007f6ddc088000 nid=0x3815aa waiting for monitor entry [0x00007f6973bac000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.doris.catalog.CatalogRecycleBin.recyclePartition(CatalogRecycleBin.java:187)
- waiting to lock <0x000000046c2bcc68> (a org.apache.doris.catalog.CatalogRecycleBin)
at org.apache.doris.catalog.OlapTable.dropPartition(OlapTable.java:950)
at org.apache.doris.catalog.OlapTable.dropPartition(OlapTable.java:973)
at org.apache.doris.datasource.InternalCatalog.dropPartitionWithoutCheck(InternalCatalog.java:1879)
at org.apache.doris.datasource.InternalCatalog.dropPartition(InternalCatalog.java:1868)
at org.apache.doris.catalog.Env.dropPartition(Env.java:3179)
at org.apache.doris.alter.Alter.processAlterOlapTable(Alter.java:224)
at org.apache.doris.alter.Alter.processAlterTable(Alter.java:467)
at org.apache.doris.catalog.Env.alterTable(Env.java:4456)
at org.apache.doris.qe.DdlExecutor.execute(DdlExecutor.java:170)
at org.apache.doris.qe.StmtExecutor.handleDdlStmt(StmtExecutor.java:2801)
at org.apache.doris.qe.StmtExecutor.executeByLegacy(StmtExecutor.java:963)
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:595)
at org.apache.doris.qe.ConnectProcessor.proxyExecute(ConnectProcessor.java:704)
at org.apache.doris.service.FrontendServiceImpl.forward(FrontendServiceImpl.java:1060)
at sun.reflect.GeneratedMethodAccessor464.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.doris.service.FeServer.lambda$start$0(FeServer.java:60)
at org.apache.doris.service.FeServer$$Lambda$193/1612883140.invoke(Unknown Source)
at com.sun.proxy.$Proxy28.forward(Unknown Source)
at org.apache.doris.thrift.FrontendService$Processor$forward.getResult(FrontendService.java:3792)
at org.apache.doris.thrift.FrontendService$Processor$forward.getResult(FrontendService.java:3772)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"recycle bin" #38 daemon prio=5 os_prio=0 tid=0x00007f6e64116000 nid=0x380d62 runnable [0x00007f69402df000]
java.lang.Thread.State: RUNNABLE
at org.apache.doris.catalog.CatalogRecycleBin.getSameNamePartitionIdListToErase(CatalogRecycleBin.java:527)
- locked <0x000000046c2bcc68> (a org.apache.doris.catalog.CatalogRecycleBin)
at org.apache.doris.catalog.CatalogRecycleBin.erasePartitionWithSameName(CatalogRecycleBin.java:556)
- eliminated <0x000000046c2bcc68> (a org.apache.doris.catalog.CatalogRecycleBin)
at org.apache.doris.catalog.CatalogRecycleBin.erasePartition(CatalogRecycleBin.java:510)
- locked <0x000000046c2bcc68> (a org.apache.doris.catalog.CatalogRecycleBin)
at org.apache.doris.catalog.CatalogRecycleBin.runAfterCatalogReady(CatalogRecycleBin.java:1010)
at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58)
at org.apache.doris.common.util.Daemon.run(Daemon.java:116)
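One way to shorten the time the monitor is held is to erase expired entries in small batches and release the lock between batches. The sketch below only illustrates that idea. `BatchedRecycleBin`, its methods, and the millisecond timestamps are hypothetical stand-ins, not the actual `CatalogRecycleBin` code or a proposed patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a recycle bin; not the Doris CatalogRecycleBin. */
public class BatchedRecycleBin {
    // partition id -> time (ms) at which it was recycled; guarded by "this"
    private final Map<Long, Long> recycleTimeById = new LinkedHashMap<>();

    /** Called by DDL threads (the role DROP PARTITION plays in the trace above). */
    public synchronized void recyclePartition(long id, long nowMs) {
        recycleTimeById.put(id, nowMs);
    }

    public synchronized int size() {
        return recycleTimeById.size();
    }

    /** Collects the ids that have expired; the monitor is held only for this scan. */
    private synchronized List<Long> expiredIds(long nowMs, long expireMs) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Long> e : recycleTimeById.entrySet()) {
            if (nowMs - e.getValue() >= expireMs) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    /** Erases one batch under the monitor and returns how many entries were removed. */
    private synchronized int eraseBatch(List<Long> batch) {
        int removed = 0;
        for (Long id : batch) {
            if (recycleTimeById.remove(id) != null) {
                removed++;
            }
        }
        return removed;
    }

    /** Daemon-side entry point: the monitor is released after every batch. */
    public int eraseExpired(long nowMs, long expireMs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Long> ids = expiredIds(nowMs, expireMs);
        int erased = 0;
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(ids.size(), from + batchSize);
            erased += eraseBatch(ids.subList(from, to));
            // Only a scheduling hint: Java monitors are not fair, so a waiting
            // thread is not guaranteed to acquire the lock at this point.
            Thread.yield();
        }
        return erased;
    }
}
```

With this shape a blocked `recyclePartition` call waits for at most one scan or one batch rather than the whole erase cycle. Because monitors are not fair, a real fix might need an explicit fair lock or a different data layout.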
Reproduction Steps
1. Create a table and run frequent DROP PARTITION / CREATE PARTITION operations on it.
2. Set catalog_trash_expire_second to a large value so that dropped partitions accumulate in the recycle bin.
3. Monitor thread lock contention with jstack.
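For the monitoring step, the same information jstack prints for BLOCKED threads can also be read programmatically through the standard `java.lang.management` API. The sketch below is only an illustration: `BlockedThreadReport` is a hypothetical helper, and it inspects the JVM it runs in, so using it against a frontend process would require running it inside that process or adapting it to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists threads blocked on a monitor in the current JVM. */
public class BlockedThreadReport {

    /** Returns one line per BLOCKED thread: who waits, for which lock, and who holds it. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // false/false: locked monitors and ownable synchronizers are not collected,
        // which keeps the dump cheap; the lock a thread waits for is still reported.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName()
                        + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        for (String line : blockedThreads()) {
            System.out.println(line);
        }
    }
}
```

Sampling this report periodically while the DROP/CREATE workload runs would show whether DDL threads keep waiting on the lock held by the "recycle bin" thread.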
Expected Behavior
DDL operations should complete within predictable timeframes regardless of recycle bin size.
Code of Conduct
- I agree to follow this project's Code of Conduct