Conversation
quantinuum-richard-morrison
left a comment
There was a problem hiding this comment.
Early thoughts on the approach. I'm sorry: this proposes quite a re-write, so I've tried to justify.
If easier, we could accept that there are different ways to skin a cat and I'll pipe down and review as-is. Let me know what you think about the alternative approach I've suggested
| job=job, | ||
| wait_for_status=wait_for_status, | ||
| websocket_timeout=websocket_timeout, | ||
| ) |
There was a problem hiding this comment.
Instead of an enum a case and three functions, these could be BaseStrategy (perhaps an ABC) and three subclasses WebsocketStrategy, PollingStrategy and HybridStrategy. The advantages would be:
- users could implement their own variations easily (if they wanted to customise)
HybridStrategycould be based onPollingStrategywithWebsocketStrategypulled in as a mixin. Fewer lines of code (maybe) (not that that's a great metric). DRYer (hopefully)- arguably more idiomatic python, easier to maintain and reason about
qnexus/client/jobs/__init__.py
Outdated
| initial_interval: float = 1.0, | ||
| max_interval_queued: float = 1200.0, | ||
| max_interval_running: float = 180.0, | ||
| backoff_factor: float = 2.0, |
There was a problem hiding this comment.
(related to my other comment about using subclasses for different strategies)
The implementation of wait_for gives the user no way to choose different values for these params. They have to pass in an enum member and the code in wait_for picks a function, calling it with just the required args, so these'll always get defaults.
With a class-based approach, it could be:
class PollingStrategy(BaseStrategy):
...
backoff_factor: float = 2.0
...
and then a user could (if they chose) do:
class MyCustomPollingStrategy(PollingStrategy):
backoff_factor: float = 1.5
and they could pass that to wait_for
| - QUEUED: Polls less frequently (default 20 min) since queue position changes slowly | ||
| - RUNNING/SUBMITTED: Polls more frequently (default 3 min) for responsiveness |
There was a problem hiding this comment.
Great idea
Nexus frequently has users that want to monitor a long running job (one that will take more than 10 minutes from submission to execution, sometimes as much as several days or weeks).
This PR adds some additional strategies to the
wait_forfunction, namely:Also I have set the default timeout to None, as this is what users seem to expect.
The hope being that this will handle long running jobs better than a pure websocket, which I find is often dropped if kept open for a long time.
I have also added optional logging, trying to stick to best practises for python libraries