You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
alluxio.client.file.cache.LocalCacheManager Failed to get existing page due to java.io.IOException: java.util.concurrent.RejectedExecutionException
#24663
Open
fgwang7w opened this issue
Mar 3, 2025
· 1 comment
The cache hit ratio has been around 70%+ in our prod environment. The log indicates massive error message showing RejectedExecutionException.
JStack trace:
2025-01-30T15:25:56.973Z ERROR 20250130_152200_10236_cup4a.21.0.6.0-90-78 alluxio.client.file.cache.LocalCacheManager Failed to get existing page PageId{FileId=2150900da0ee9376cca0e746ddb91f9e, PageIndex=0} from pageStore
java.io.IOException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5a7a510[Not completed, task = alluxio.client.file.cache.TimeBoundPageStore$$Lambda$4165/0x0000ffc1f2cff568@7821a7e7] rejected from java.util.concurrent.ThreadPoolExecutor@18d371d3[Running, pool size = 64, active threads = 64, queued tasks = 0, completed tasks = 1144697]
at alluxio.client.file.cache.TimeBoundPageStore.get(TimeBoundPageStore.java:100)
at alluxio.client.file.cache.LocalCacheManager.getPage(LocalCacheManager.java:641)
at alluxio.client.file.cache.LocalCacheManager.get(LocalCacheManager.java:481)
at alluxio.client.file.cache.NoExceptionCacheManager.get(NoExceptionCacheManager.java:87)
at alluxio.client.file.cache.LocalCacheFileInStream.readInternal(LocalCacheFileInStream.java:150)
at alluxio.client.file.cache.LocalCacheFileInStream.positionedRead(LocalCacheFileInStream.java:228)
at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:153)
at alluxio.hadoop.HdfsFileInputStream.readFully(HdfsFileInputStream.java:170)
at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:107)
at com.facebook.presto.hive.parquet.HdfsParquetDataSource.readInternal(HdfsParquetDataSource.java:65)
at com.facebook.presto.parquet.AbstractParquetDataSource.readFully(AbstractParquetDataSource.java:60)
at com.facebook.presto.parquet.reader.ParquetReader$1.read(ParquetReader.java:445)
at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:252)
at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:292)
at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
at com.facebook.presto.hive.$internal.parquet.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:177)
at com.facebook.presto.hive.$internal.parquet.org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
at com.facebook.presto.hive.$internal.parquet.org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:636)
at com.facebook.presto.hive.$internal.parquet.org.apache.thrift.protocol.TCompactProtocol.readFieldBegin(TCompactProtocol.java:539)
at org.apache.parquet.format.InterningProtocol.readFieldBegin(InterningProtocol.java:155)
at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:1026)
at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:1019)
at org.apache.parquet.format.PageHeader.read(PageHeader.java:896)
at org.apache.parquet.format.Util.read(Util.java:362)
at org.apache.parquet.format.Util.readPageHeader(Util.java:132)
at com.facebook.presto.parquet.reader.ParquetColumnChunk.readPageHeader(ParquetColumnChunk.java:90)
at com.facebook.presto.parquet.reader.ParquetColumnChunk.buildPageReader(ParquetColumnChunk.java:119)
at com.facebook.presto.parquet.reader.ParquetReader.createPageReaderInternal(ParquetReader.java:502)
at com.facebook.presto.parquet.reader.ParquetReader.createPageReader(ParquetReader.java:495)
at com.facebook.presto.parquet.reader.ParquetReader.readPrimitive(ParquetReader.java:368)
at com.facebook.presto.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:558)
at com.facebook.presto.parquet.reader.ParquetReader.readBlock(ParquetReader.java:541)
at com.facebook.presto.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:230)
at com.facebook.presto.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:208)
at com.facebook.presto.common.block.LazyBlock.assureLoaded(LazyBlock.java:313)
at com.facebook.presto.common.block.LazyBlock.getLoadedBlock(LazyBlock.java:304)
at com.facebook.presto.operator.ScanFilterAndProjectOperator$RecordingLazyBlockLoader.load(ScanFilterAndProjectOperator.java:335)
at com.facebook.presto.operator.ScanFilterAndProjectOperator$RecordingLazyBlockLoader.load(ScanFilterAndProjectOperator.java:321)
at com.facebook.presto.common.block.LazyBlock.assureLoaded(LazyBlock.java:313)
at com.facebook.presto.common.block.LazyBlock.getLoadedBlock(LazyBlock.java:304)
at com.facebook.presto.operator.project.InputPageProjection.project(InputPageProjection.java:69)
at com.facebook.presto.operator.project.PageProjectionWithOutputs.project(PageProjectionWithOutputs.java:56)
at com.facebook.presto.operator.project.PageProcessor$ProjectSelectedPositions.processBatch(PageProcessor.java:327)
at com.facebook.presto.operator.project.PageProcessor$ProjectSelectedPositions.process(PageProcessor.java:201)
at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
at com.facebook.presto.operator.WorkProcessorUtils$YieldingIterator.computeNext(WorkProcessorUtils.java:79)
at com.facebook.presto.operator.WorkProcessorUtils$YieldingIterator.computeNext(WorkProcessorUtils.java:65)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
at com.facebook.presto.operator.project.MergingPageOutput.getOutput(MergingPageOutput.java:128)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.processPageSource(ScanFilterAndProjectOperator.java:316)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:260)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:428)
at com.facebook.presto.operator.Driver.lambda$processFor$9(Driver.java:311)
at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:732)
at com.facebook.presto.operator.Driver.processFor(Driver.java:304)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1079)
at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:165)
at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:603)
at com.facebook.presto.$gen.Presto_0_282____20250130_134124_1.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5a7a510[Not completed, task = alluxio.client.file.cache.TimeBoundPageStore$$Lambda$4165/0x0000ffc1f2cff568@7821a7e7] rejected from java.util.concurrent.ThreadPoolExecutor@18d371d3[Running, pool size = 64, active threads = 64, queued tasks = 0, completed tasks = 1144697]
at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140)
at alluxio.shaded.client.com.google.common.util.concurrent.SimpleTimeLimiter.callWithTimeout(SimpleTimeLimiter.java:159)
at alluxio.client.file.cache.TimeBoundPageStore.get(TimeBoundPageStore.java:91)
... 62 more
Some system metrics indications:
Some cache related metrics indications:
JMX metric shows CachePutErrors for which majority of them are RejectedExecutionExceptions.
Fetching metrics for table: com.facebook.alluxio:name=Client.CachePutInsufficientSpaceErrors.presto,type=counters
Not sure where to look for next, @beinan@apc999 can you help from alluxio point of view, any suggestions? This was very similar to #15972. However we are currently using alluxio-273 release. cc @nmahadevuni@tdcmeehan
The text was updated successfully, but these errors were encountered:
All PUT calls are in this stack trace when the issue occurs
"pool-62-thread-1" #291 prio=5 os_prio=0 cpu=251.61ms elapsed=89.81s tid=0x0000ffc7180099a0 nid=0x1bf runnable [0x0000ffc4ff3fe000]
java.lang.Thread.State: RUNNABLE
at java.io.FileDescriptor.close0([email protected]/Native Method)
at java.io.FileDescriptor.close([email protected]/FileDescriptor.java:297)
- locked <0x0000fff929144df0> (a java.io.FileDescriptor)
at java.io.FileOutputStream$1.close([email protected]/FileOutputStream.java:398)
at java.io.FileDescriptor.closeAll([email protected]/FileDescriptor.java:355)
- locked <0x0000fff929144df0> (a java.io.FileDescriptor)
at java.io.FileOutputStream.close([email protected]/FileOutputStream.java:396)
at alluxio.client.file.cache.store.LocalPageStore.put(LocalPageStore.java:82)
at alluxio.client.file.cache.TimeBoundPageStore.lambda$put$0(TimeBoundPageStore.java:61)
at alluxio.client.file.cache.TimeBoundPageStore$$Lambda$3479/0x0000ffc4fff7f500.call(Unknown Source)
at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
at java.lang.Thread.run([email protected]/Thread.java:829)
The cache hit ratio has been around 70%+ in our prod environment. The log indicates massive error message showing
RejectedExecutionException
.JStack trace:
Some system metrics indications:
Some cache related metrics indications:
JMX metric shows CachePutErrors for which majority of them are RejectedExecutionExceptions.
Fetching metrics for table: com.facebook.alluxio:name=Client.CachePutInsufficientSpaceErrors.presto,type=counters
Fetching metrics for table: com.facebook.alluxio:name=Client.CachePutErrors.presto,type=counters
Not sure where to look for next, @beinan @apc999 can you help from alluxio point of view, any suggestions? This was very similar to #15972. However we are currently using alluxio-273 release. cc @nmahadevuni @tdcmeehan
The text was updated successfully, but these errors were encountered: