Add ClassLoaderSafe ConnectorPlanOptimizer and SplitSource #24685

Draft: wants to merge 2 commits into base: master

@ZacBlanco ZacBlanco commented Mar 6, 2025

Description

Adds a ClassLoaderSafeConnectorPlanOptimizer class and wraps the Iceberg connector optimizer rules with it. Additionally, we do the same for the IcebergSplitSource.

Motivation and Context

This is required when plugin optimizer rules have a dependency that relies on a properly set thread ContextClassLoader in order to perform its optimizations.

In the case of Iceberg, some code paths read the table's metadata files, which initiates calls to the FileSystem client. The Hadoop FileSystem Configuration object may use the ContextClassLoader to load the correct FileSystem client implementation. In production, when many connectors are enabled and a query spans multiple connectors, we found cases where the ContextClassLoader was not set properly, resulting in ClassNotFoundException errors inside Hadoop's Configuration class.
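For readers unfamiliar with the pattern, the following is a minimal, self-contained sketch of a class-loader-safe wrapper. It is not the code in this PR: `PlanOptimizer`, `ContextClassLoaderScope`, and `ClassLoaderSafePlanOptimizer` are hypothetical names, and the `String`-based `optimize` signature is a simplified stand-in for the real SPI interface. The idea is only that the wrapper sets the plugin's class loader as the thread context class loader for the duration of the delegated call and restores the previous one afterwards.

```java
import java.util.Objects;

// Hypothetical stand-in for a connector-provided optimizer rule.
// This is NOT the Presto SPI interface; it only illustrates the pattern.
interface PlanOptimizer {
    String optimize(String plan);
}

// Swaps the thread context class loader on construction and
// restores the previous one when closed.
final class ContextClassLoaderScope implements AutoCloseable {
    private final ClassLoader original;

    ContextClassLoaderScope(ClassLoader replacement) {
        this.original = Thread.currentThread().getContextClassLoader();
        Thread.currentThread().setContextClassLoader(replacement);
    }

    @Override
    public void close() {
        Thread.currentThread().setContextClassLoader(original);
    }
}

// Wraps a delegate so that every call runs with the plugin's class loader
// installed as the thread context class loader.
final class ClassLoaderSafePlanOptimizer implements PlanOptimizer {
    private final PlanOptimizer delegate;
    private final ClassLoader classLoader;

    ClassLoaderSafePlanOptimizer(PlanOptimizer delegate, ClassLoader classLoader) {
        this.delegate = Objects.requireNonNull(delegate, "delegate is null");
        this.classLoader = Objects.requireNonNull(classLoader, "classLoader is null");
    }

    @Override
    public String optimize(String plan) {
        // try-with-resources guarantees the original loader is restored,
        // even if the delegate throws.
        try (ContextClassLoaderScope ignored = new ContextClassLoaderScope(classLoader)) {
            return delegate.optimize(plan);
        }
    }
}
```

With a wrapper of this shape, any library code reached from the delegate that resolves classes through `Thread.currentThread().getContextClassLoader()` sees the plugin's loader regardless of which thread or connector invoked the call. The same approach would apply to a split source wrapper.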

Failure stack traces from production

java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.facebook.presto.hive.s3.PrestoS3FileSystem not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2636)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3341)
	at org.apache.hadoop.fs.PrestoFileSystemCache.createFileSystem(PrestoFileSystemCache.java:136)
	at org.apache.hadoop.fs.PrestoFileSystemCache.access$000(PrestoFileSystemCache.java:40)
	at org.apache.hadoop.fs.PrestoFileSystemCache$FileSystemHolder.createFileSystemOnce(PrestoFileSystemCache.java:352)
	at org.apache.hadoop.fs.PrestoFileSystemCache.getInternal(PrestoFileSystemCache.java:122)
	at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:71)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:485)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.facebook.presto.hive.cache.HiveCachingHdfsConfiguration.lambda$getConfiguration$0(HiveCachingHdfsConfiguration.java:80)
	at com.facebook.presto.hive.cache.HiveCachingHdfsConfiguration$CachingJobConf.createFileSystem(HiveCachingHdfsConfiguration.java:140)
	at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:68)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:485)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.facebook.presto.hive.HdfsEnvironment.lambda$getFileSystem$0(HdfsEnvironment.java:71)
	at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
	at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:70)
	at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:64)
	at com.facebook.presto.iceberg.HdfsInputFile.<init>(HdfsInputFile.java:42)
	at com.facebook.presto.iceberg.HdfsFileIO.newInputFile(HdfsFileIO.java:44)
	at org.apache.iceberg.BaseSnapshot.cacheManifests(BaseSnapshot.java:146)
	at org.apache.iceberg.BaseSnapshot.dataManifests(BaseSnapshot.java:172)
	at org.apache.iceberg.DataTableScan.doPlanFiles(DataTableScan.java:68)
	at org.apache.iceberg.SnapshotScan.planFiles(SnapshotScan.java:139)
	at com.facebook.presto.iceberg.IcebergUtil.getDeleteFiles(IcebergUtil.java:899)
	at com.facebook.presto.iceberg.optimizer.IcebergEqualityDeleteAsJoin$DeleteAsJoinRewriter.collectDeleteInformation(IcebergEqualityDeleteAsJoin.java:277)
	at com.facebook.presto.iceberg.optimizer.IcebergEqualityDeleteAsJoin$DeleteAsJoinRewriter.visitTableScan(IcebergEqualityDeleteAsJoin.java:186)
	at com.facebook.presto.iceberg.optimizer.IcebergEqualityDeleteAsJoin$DeleteAsJoinRewriter.visitTableScan(IcebergEqualityDeleteAsJoin.java:145)
	at com.facebook.presto.spi.plan.TableScanNode.accept(TableScanNode.java:203)
	at com.facebook.presto.spi.ConnectorPlanRewriter.rewriteWith(ConnectorPlanRewriter.java:39)
	at com.facebook.presto.spi.ConnectorPlanRewriter.rewriteWith(ConnectorPlanRewriter.java:27)
	at com.facebook.presto.iceberg.optimizer.IcebergEqualityDeleteAsJoin.optimize(IcebergEqualityDeleteAsJoin.java:141)
	at com.facebook.presto.sql.planner.optimizations.ApplyConnectorOptimization.optimize(ApplyConnectorOptimization.java:144)
	at com.facebook.presto.sql.Optimizer.validateAndOptimizePlan(Optimizer.java:149)
	at com.facebook.presto.execution.SqlQueryExecution.lambda$OptimizePlan$3(SqlQueryExecution.java:591)
	at com.facebook.presto.common.RuntimeStats.profileNanos(RuntimeStats.java:136)
	at com.facebook.presto.execution.SqlQueryExecution.OptimizePlan(SqlQueryExecution.java:589)
	at com.facebook.presto.execution.SqlQueryExecution.createLogicalPlanAndOptimize(SqlQueryExecution.java:562)
	at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:474)
	at com.facebook.presto.$gen.Presto_0_286____20250304_155618_1.run(Unknown Source)
	at com.facebook.presto.execution.SqlQueryManager.createQuery(SqlQueryManager.java:320)
	at com.facebook.presto.dispatcher.LocalDispatchQuery.lambda$startExecution$8(LocalDispatchQuery.java:214)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:839)
Caused by: java.lang.ClassNotFoundException: Class com.facebook.presto.hive.s3.PrestoS3FileSystem not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2540)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2634)
	... 44 more

java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.facebook.presto.hive.s3.PrestoS3FileSystem not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2636)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3341)
	at org.apache.hadoop.fs.PrestoFileSystemCache.createFileSystem(PrestoFileSystemCache.java:136)
	at org.apache.hadoop.fs.PrestoFileSystemCache.access$000(PrestoFileSystemCache.java:40)
	at org.apache.hadoop.fs.PrestoFileSystemCache$FileSystemHolder.createFileSystemOnce(PrestoFileSystemCache.java:352)
	at org.apache.hadoop.fs.PrestoFileSystemCache.getInternal(PrestoFileSystemCache.java:122)
	at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:71)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:485)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.facebook.presto.hive.cache.HiveCachingHdfsConfiguration.lambda$getConfiguration$0(HiveCachingHdfsConfiguration.java:80)
	at com.facebook.presto.hive.cache.HiveCachingHdfsConfiguration$CachingJobConf.createFileSystem(HiveCachingHdfsConfiguration.java:140)
	at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:68)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:485)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.facebook.presto.hive.HdfsEnvironment.lambda$getFileSystem$0(HdfsEnvironment.java:71)
	at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
	at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:70)
	at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:64)
	at com.facebook.presto.iceberg.HdfsInputFile.<init>(HdfsInputFile.java:42)
	at com.facebook.presto.iceberg.HdfsFileIO.newInputFile(HdfsFileIO.java:44)
	at org.apache.iceberg.io.FileIO.newInputFile(FileIO.java:42)
	at org.apache.iceberg.ManifestFiles.newInputFile(ManifestFiles.java:368)
	at org.apache.iceberg.ManifestFiles.read(ManifestFiles.java:129)
	at org.apache.iceberg.ManifestGroup$1.iterator(ManifestGroup.java:311)
	at org.apache.iceberg.io.CloseableIterable$ConcatCloseableIterable$ConcatCloseableIterator.hasNext(CloseableIterable.java:257)
	at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:46)
	at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:46)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterators$ConcatenatedIterator.getTopMetaIterator(Iterators.java:1379)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterators$ConcatenatedIterator.hasNext(Iterators.java:1395)
	at org.apache.iceberg.io.CloseableIterator$1.hasNext(CloseableIterator.java:50)
	at com.google.common.collect.Iterators$7.hasNext(Iterators.java:966)
	at com.facebook.presto.iceberg.IcebergSplitSource.getNextBatch(IcebergSplitSource.java:97)
	at com.facebook.presto.split.ConnectorAwareSplitSource.getNextBatch(ConnectorAwareSplitSource.java:65)
	at com.facebook.presto.split.BufferingSplitSource$GetNextBatch.fetchSplits(BufferingSplitSource.java:119)
	at com.facebook.presto.split.BufferingSplitSource$GetNextBatch.fetchNextBatchAsync(BufferingSplitSource.java:100)
	at com.facebook.presto.split.BufferingSplitSource.getNextBatch(BufferingSplitSource.java:60)
	at com.facebook.presto.sql.planner.LazySplitSource.getNextBatch(LazySplitSource.java:60)
	at com.facebook.presto.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:223)
	at com.facebook.presto.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:147)
	at com.facebook.presto.execution.scheduler.LegacySqlQueryScheduler.schedule(LegacySqlQueryScheduler.java:463)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:839)
Caused by: java.lang.ClassNotFoundException: Class com.facebook.presto.hive.s3.PrestoS3FileSystem not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2540)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2634)
	... 44 more

Impact

N/A

Test Plan

This was a difficult bug to trigger in production and it only happened periodically. I don't think we have a good way to test this.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Mar 6, 2025
This is required when plugin optimizer rules have a dependency which
may rely on a properly set ContextClassloader in order to perform
any of the optimizations.

In the case of Iceberg, there are cases when we can reach out for some
of the table's metadata files which initiates calls to the FileSystem
client. The Hadoop FileSystem Configuration object uses the
ContextClassLoader to load the correct FileSystem client implementation.
We found in production when many connectors are enabled and the query
spans multiple connectors, there are cases where the ContextClassLoader
is not set properly, resulting in ClassNotFound exceptions.
@ZacBlanco ZacBlanco force-pushed the upstream-classloader-safe-optimizer-rule branch from 2f6509d to 3c93224 Compare March 7, 2025 21:32
@ZacBlanco ZacBlanco changed the title Add ClassLoaderSafeConnectorPlanOptimizer Add ClassLoaderSafe ConnectorPlanOptimizer and SplitSource Mar 7, 2025