@devhawk
Collaborator

@devhawk devhawk commented Dec 18, 2025

  • Added a static durableSleepDuration method. Now sleep, getEvent and recv all share the durable sleep duration calculation while each handles the actual sleep differently. This eliminates the skipSleep param.
  • Updated SchedulerService to use an AtomicReference<ScheduledExecutorService> to track service lifetime.
  • Rewrote QueueService to use a ScheduledExecutorService to schedule queue reads instead of a dedicated thread.
    • Like SchedulerService, it uses an AtomicReference<ScheduledExecutorService> to track service lifetime.
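
For readers unfamiliar with the lifetime-tracking pattern named in the bullets above, here is a minimal sketch of one way an AtomicReference<ScheduledExecutorService> can represent "stopped" (null) versus "running" (non-null). The class PollingService and its methods are hypothetical names used for illustration; this is not the actual SchedulerService or QueueService code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical service; names are illustrative, not the DBOS API. */
final class PollingService {
  // null means "stopped"; a non-null executor means "running".
  private final AtomicReference<ScheduledExecutorService> executorRef =
      new AtomicReference<>();
  private final Runnable poll;

  PollingService(Runnable poll) {
    this.poll = poll;
  }

  /** Starts polling. Returns false if the service was already running. */
  boolean start(long periodMillis) {
    ScheduledExecutorService candidate = Executors.newScheduledThreadPool(2);
    if (!executorRef.compareAndSet(null, candidate)) {
      candidate.shutdown(); // another caller started the service first
      return false;
    }
    // Note: a concurrent stop() between the CAS and this call would make
    // scheduling throw RejectedExecutionException; real code may want to
    // handle that case explicitly.
    candidate.scheduleWithFixedDelay(poll, 0, periodMillis, TimeUnit.MILLISECONDS);
    return true;
  }

  /** Stops polling. Returns false if the service was not running. */
  boolean stop() {
    // getAndSet makes the stop idempotent: only one caller sees the executor.
    ScheduledExecutorService running = executorRef.getAndSet(null);
    if (running == null) {
      return false;
    }
    running.shutdownNow();
    try {
      running.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return true;
  }

  boolean isRunning() {
    return executorRef.get() != null;
  }
}
```

The appeal of this shape is that start and stop are both safe to call repeatedly or concurrently, without a separate boolean flag or lock.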

chuck-dbos previously approved these changes Dec 18, 2025
@chuck-dbos
Contributor

The test failure is a bad sign; it looks like a hang caused by the change.

@chuck-dbos chuck-dbos self-requested a review December 18, 2025 11:54
@chuck-dbos chuck-dbos dismissed their stale review December 18, 2025 11:55

Changed mind

@chuck-dbos
Contributor

Looking into whether this is a patch test issue, caught exception issue, or shutdown issue.

    02:07:37.507 [pool-289-thread-3] DEBUG d.d.transact.execution.DBOSExecutor - executeWorkflowById 0b8d0d88-bd67-4ea9-8488-c3d4d0027b2a
    02:07:37.508 [pool-289-thread-3] ERROR d.d.transact.execution.QueueService - Error executing queued workflow(s) for queue _dbos_internal_queue
    dev.dbos.transact.exceptions.DBOSWorkflowFunctionNotFoundException: Workflow function dev.dbos.transact.invocation.PatchServiceImplOne//workflow does not exist for workflow id 0b8d0d88-bd67-4ea9-8488-c3d4d0027b2a.
    	at dev.dbos.transact.execution.DBOSExecutor.executeWorkflowById(DBOSExecutor.java:1050)
    	at dev.dbos.transact.execution.QueueService$1.processPartition(QueueService.java:106)
    	at dev.dbos.transact.execution.QueueService$1.run(QueueService.java:124)
    	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    	at java.base/java.lang.Thread.run(Thread.java:840)
    02:07:38.453 [pool-289-thread-2] DEBUG d.d.transact.execution.QueueService - Retrieving workflows from partition <null> of queue schedulerQueue
    02:07:39.443 [pool-289-thread-2] DEBUG d.d.transact.execution.QueueService - Retrieving workflows from partition <null> of queue schedulerQueue
    02:07:39.482 [pool-289-thread-3] DEBUG d.d.transact.execution.QueueService - Retrieving workflows from partition <null> of queue _dbos_internal_queue
    02:07:40.427 [pool-289-thread-4] DEBUG d.d.transact.execution.QueueService - Retrieving workflows from partition <null> of queue schedulerQueue
    02:07:41.200 [pool-289-thread-2] DEBUG d.d.transact.execution.QueueService - Retrieving workflows from partition <null> of queue _dbos_internal_queue
    02:07:41.431 [pool-289-thread-1] DEBUG d.d.transact.execution.QueueService - Retrieving workflows from partition <null> of queue schedulerQueue
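
On the "caught exception" possibility: with a ScheduledExecutorService, a periodic task that throws has its subsequent executions suppressed, and a one-shot task that reschedules itself stops polling if it throws before rescheduling. The log above suggests the exception was caught and logged, since reads of _dbos_internal_queue continue afterwards. A minimal sketch of that kind of guard follows; SafeTasks and guarded are hypothetical names, not the actual QueueService code.

```java
import java.util.function.Consumer;

/** Hypothetical helper; not part of the DBOS code base. */
final class SafeTasks {
  private SafeTasks() {}

  /**
   * Wraps a task so that an exception is handed to onError instead of
   * escaping into the scheduler, which keeps the polling loop alive.
   */
  static Runnable guarded(Runnable task, Consumer<Throwable> onError) {
    return () -> {
      try {
        task.run();
      } catch (Exception e) {
        // Errors (e.g. OutOfMemoryError) are deliberately not caught here.
        onError.accept(e);
      }
    };
  }
}
```

A caller would pass something like `SafeTasks.guarded(readQueue, e -> logger.error("...", e))` to the scheduler, where readQueue and logger are placeholders.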

@devhawk
Collaborator Author

devhawk commented Dec 18, 2025

> Looking into whether this is a patch test issue, caught exception issue, or shutdown issue.

Pretty sure this was a transient test issue. testPatch has to update the workflow names in order to simulate changes to existing code. Looking in the test log, we see this:

2025-12-18T02:07:37.5811671Z     02:07:37.503 [Test worker] WARN  com.zaxxer.hikari.pool.PoolBase - HikariPool-219 - Failed to validate connection org.postgresql.jdbc.PgConnection@528c4353 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.

and shortly after:

2025-12-18T02:07:37.5820121Z     dev.dbos.transact.exceptions.DBOSWorkflowFunctionNotFoundException: Workflow function dev.dbos.transact.invocation.PatchServiceImplOne//workflow does not exist for workflow id 0b8d0d88-bd67-4ea9-8488-c3d4d0027b2a.
2025-12-18T02:07:37.5821256Z     	at dev.dbos.transact.execution.DBOSExecutor.executeWorkflowById(DBOSExecutor.java:1050)
2025-12-18T02:07:37.5821899Z     	at dev.dbos.transact.execution.QueueService$1.processPartition(QueueService.java:106)
2025-12-18T02:07:37.5822455Z     	at dev.dbos.transact.execution.QueueService$1.run(QueueService.java:124)
2025-12-18T02:07:37.5822992Z     	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
2025-12-18T02:07:37.5823524Z     	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2025-12-18T02:07:37.5824218Z     	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
2025-12-18T02:07:37.5824996Z     	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2025-12-18T02:07:37.5825642Z     	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2025-12-18T02:07:37.5826145Z     	at java.base/java.lang.Thread.run(Thread.java:840)

I suspect that testPatch failed to update the workflow name, which led to the workflow never completing. Also, PatchTest was missing the test timeout annotation the other tests have.

I added logging to the workflow name update function.

@chuck-dbos
Contributor

> I suspect that testPatch failed to update the workflow name, which led to the workflow never completing. Also, PatchTest was missing the test timeout annotation the other tests have.

After looking deeper, I concur: the service implementations are straightforward, and it is quite possible that this is a test issue. The utility that does the workflow rename is only for test usage and does not use any variant of dbRetry, so the failure can be explained as "bad luck". I wonder if we should put dbRetry on the test facilities as well.
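
To make the suggestion concrete, here is a rough sketch of what retrying a test-only database helper could look like. It does not use or reproduce dbRetry, whose signature is not shown in this thread; TestRetry, SqlAction and withRetry are hypothetical names, and retrying on every SQLException is a simplifying assumption.

```java
import java.sql.SQLException;

/** Hypothetical test helper; not the project's dbRetry. */
final class TestRetry {
  private TestRetry() {}

  /** A database action that may fail with SQLException. */
  @FunctionalInterface
  interface SqlAction<T> {
    T run() throws SQLException;
  }

  /**
   * Runs the action up to maxAttempts times, sleeping a little longer after
   * each failure. Assumption: every SQLException is treated as transient.
   */
  static <T> T withRetry(SqlAction<T> action, int maxAttempts, long backoffMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    SQLException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.run();
      } catch (SQLException e) {
        last = e;
        if (attempt < maxAttempts) {
          try {
            Thread.sleep(backoffMillis * attempt); // linear backoff
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while retrying", ie);
          }
        }
      }
    }
    throw new IllegalStateException("failed after " + maxAttempts + " attempts", last);
  }
}
```

A test utility such as the workflow rename could then be wrapped as `TestRetry.withRetry(() -> renameWorkflow(conn), 3, 100)`, where renameWorkflow and conn are placeholders for the existing test code.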

@chuck-dbos chuck-dbos self-requested a review December 18, 2025 22:54
@devhawk devhawk merged commit 99139dd into main Dec 18, 2025
20 checks passed
@devhawk devhawk deleted the devhawk/assorted branch December 18, 2025 23:37