@jayantsing-db
Collaborator

Description

Complete link futures for upfront-fetched chunks to prevent deadlock

When chunk links were fetched upfront, their corresponding futures were never completed, so threads waiting on those futures blocked indefinitely. The constructor now completes the futures for all pre-fetched chunks.
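
To illustrate the idea, here is a minimal sketch and not the driver's actual code. The class `LinkFutureSketch`, its constructor arguments, and the use of plain `String` links are assumptions made for illustration; only the pattern of completing futures in the constructor comes from this PR.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the link download service; all names are illustrative.
class LinkFutureSketch {
    private final Map<Long, CompletableFuture<String>> linkFutures = new ConcurrentHashMap<>();

    LinkFutureSketch(long totalChunks, Map<Long, String> prefetchedLinks) {
        // One future per chunk; consumers wait on these until a link is available.
        for (long i = 0; i < totalChunks; i++) {
            linkFutures.put(i, new CompletableFuture<>());
        }
        // Links that already arrived upfront complete their futures right away,
        // so no consumer waits on a future that nothing would ever complete.
        prefetchedLinks.forEach((index, link) -> {
            CompletableFuture<String> future = linkFutures.get(index);
            if (future != null) {
                future.complete(link);
            }
        });
    }

    CompletableFuture<String> getLinkForChunk(long chunkIndex) {
        return linkFutures.get(chunkIndex);
    }

    public static void main(String[] args) throws Exception {
        LinkFutureSketch service = new LinkFutureSketch(3, Map.of(0L, "link-0", 1L, "link-1"));
        // Pre-fetched chunk: the future is already complete, so get() returns at once.
        System.out.println(service.getLinkForChunk(0).get(1, TimeUnit.SECONDS)); // prints link-0
        // Chunk 2 was not pre-fetched, so its future is still pending.
        System.out.println(service.getLinkForChunk(2).isDone()); // prints false
    }
}
```

Without the `forEach` step in the constructor, a caller asking for chunk 0 or 1 would block on a future that no download task is responsible for completing.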

Testing

Additional Notes to the Reviewer

@jayantsing-db
Collaborator Author

E2E test confirming that the link download service triggers a refresh when a link has expired while its chunk is being downloaded:

2025-11-25 11:19:43 INFO com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#handleExpiredLinksAndReset - Detected expired link for chunk 0, re-triggering batch download from the smallest index with the expired link
2025-11-25 11:19:43 INFO com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#handleExpiredLinksAndReset - Found the smallest index 0 with the expired link, initiating reset
2025-11-25 11:19:43 INFO com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#resetFuturesFromIndex - Resetting futures from index 0
2025-11-25 11:19:43 INFO com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#prepareNewBatchDownload - Preparing new batch download from index 0
2025-11-25 11:19:43 INFO com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#getLinkForChunk - Initiating first download chain for chunk 0
2025-11-25 11:19:43 INFO com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#triggerNextBatchDownload - Starting batch download from index 0
2025-11-25 11:19:43 FINE com.databricks.jdbc.api.impl.DatabricksSession#getSessionId - public String getSessionId()
2025-11-25 11:19:43 FINE com.databricks.jdbc.api.impl.DatabricksSession#getSessionId - public String getSessionId()
2025-11-25 11:20:01 FINE com.databricks.jdbc.api.impl.DatabricksSession#getDatabricksClient - public IDatabricksClient getDatabricksClient()
2025-11-25 11:20:03 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksThriftServiceClient#getResultChunks - public Optional<ExternalLink> getResultChunk(String statementId = {01f0c9f0-a2b9-18bf-a166-84331f41e3ad|338d529d-8272-46eb-8482-cb419466839d}, long chunkIndex = {0}) using Thrift client
2025-11-25 11:20:03 FINE com.databricks.jdbc.common.util.DatabricksThriftUtil#getOperationHandle - getOperationHandle for statementId {01f0c9f0-a2b9-18bf-a166-84331f41e3ad}
2025-11-25 11:20:03 FINE com.databricks.jdbc.dbclient.impl.http.DatabricksHttpClient#execute - Executing HTTP request https://e2-dogfood.staging.cloud.databricks.com:443//sql/1.0/warehouses/dd43ee29fedd958d
2025-11-25 11:20:03 FINE com.databricks.jdbc.api.impl.DatabricksSession#getSessionId - public String getSessionId()
2025-11-25 11:20:03 FINE com.databricks.jdbc.telemetry.latency.DatabricksMetricsTimedProcessor$TimedInvocationHandler#invoke - Method [executeStatement] with args [select * from samples.tpch.lineitem limit 200000, SQL Warehouse with warehouse ID {dd43ee29fedd958d}, {}, QUERY, DatabricksSession[compute='SQL Warehouse with warehouse ID {dd43ee29fedd958d}', schema='default', sessionID='01f0c9f0-a288-15a7-8798-fa9eee9c1e06'], DatabricksStatement[statementId=01f0c9f0-a2b9-18bf-a166-84331f41e3ad|338d529d-8272-46eb-8482-cb419466839d]] execution time: 23239ms
2025-11-25 11:20:03 FINE com.databricks.jdbc.api.impl.DatabricksStatement#executeInternal - Result retrieved successfully DatabricksResultSet[statementStatus=com.databricks.jdbc.api.impl.ExecutionStatus@4f3faa70, statementId=01f0c9f0-a2b9-18bf-a166-84331f41e3ad|338d529d-8272-46eb-8482-cb419466839d, statementType=QUERY, isClosed=false, wasNull=false, resultSetType=THRIFT_ARROW_ENABLED]


Printing ResultSet contents:

l_orderkey	l_partkey	l_suppkey	l_linenumber	l_quantity	l_extendedprice	l_discount	l_tax	l_returnflag	l_linestatus	l_shipdate	l_commitdate	l_receiptdate	l_shipinstruct	l_shipmode	l_comment	
BIGINT		BIGINT		BIGINT		INT		DECIMAL		DECIMAL		DECIMAL		DECIMAL		STRING		STRING		DATE		DATE		DATE		STRING		STRING		STRING		
-5			-5			-5			4			3			3			3			3			12			12			91			91			91			12			12			12			
19			19			19			10			18			18			18			18			255			255			10			10			10			255			255			255			
1			1			1			1			1			1			1			1			1			1			1			1			1			1			1			1			
20			20			20			11			20			20			20			20			255			255			10			10			10			255			255			255			
2025-11-25 11:20:04 SEVERE com.databricks.jdbc.dbclient.impl.thrift.DatabricksThriftServiceClient#getResultChunks - Out of bounds error for chunkIndex. Context: public Optional<ExternalLink> getResultChunk(String statementId = {01f0c9f0-a2b9-18bf-a166-84331f41e3ad|338d529d-8272-46eb-8482-cb419466839d}, long chunkIndex = {0}) using Thrift client
2025-11-25 11:20:04 SEVERE com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#handleBatchDownloadError - Failed to download links for batch starting at 0 : Out of bounds error for chunkIndex. Context: public Optional<ExternalLink> getResultChunk(String statementId = {01f0c9f0-a2b9-18bf-a166-84331f41e3ad|338d529d-8272-46eb-8482-cb419466839d}, long chunkIndex = {0}) using Thrift client
2025-11-25 11:20:04 FINE com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#lambda$handleBatchDownloadError$2 - Completing future for chunk 0 exceptionally due to batch download error
2025-11-25 11:20:04 FINE com.databricks.jdbc.api.impl.arrow.ChunkLinkDownloadService#lambda$handleBatchDownloadError$2 - Completing future for chunk 1 exceptionally due to batch download error
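
The reset sequence visible at the top of the log (find the smallest index with an expired link, reset futures from that index, then start a new batch download from there) can be sketched roughly as follows. This is an assumption-laden illustration: the method names are borrowed from the log lines, but the class `ExpiredLinkResetSketch`, the signatures, and the bodies are hypothetical and not taken from the driver.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; method names echo the log, the bodies are assumptions.
class ExpiredLinkResetSketch {
    final Map<Long, CompletableFuture<String>> linkFutures = new ConcurrentHashMap<>();
    private final long totalChunks;

    ExpiredLinkResetSketch(long totalChunks) {
        this.totalChunks = totalChunks;
        // Start with every link already resolved, as after a successful first pass.
        for (long i = 0; i < totalChunks; i++) {
            linkFutures.put(i, CompletableFuture.completedFuture("link-" + i));
        }
    }

    // Returns the index the next batch download should start from, if any link expired.
    OptionalLong handleExpiredLinksAndReset(Set<Long> expiredIndexes) {
        OptionalLong smallest = expiredIndexes.stream().mapToLong(Long::longValue).min();
        smallest.ifPresent(this::resetFuturesFromIndex);
        return smallest;
    }

    // Replaces futures from the given index onward with fresh, pending ones.
    private void resetFuturesFromIndex(long startIndex) {
        for (long i = startIndex; i < totalChunks; i++) {
            linkFutures.put(i, new CompletableFuture<>());
        }
    }
}
```

In this sketch, futures below the smallest expired index stay completed, and the caller would use the returned index as the starting point for the new batch download.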
