-
Notifications
You must be signed in to change notification settings - Fork 130
feat(nexus): Initial Nexus support #1708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
test.beforeEach(async (t) => { | ||
const taskQueue = t.title + randomUUID(); | ||
const { env } = t.context; | ||
const response = await env.connection.operatorService.createNexusEndpoint({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this is the way for now, but what's the intent regarding this? Do we expect users to have to do the same in their own Temporal+Nexus tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. in Java we have a shortcut, we should probably make it easier in the test environment for Core based SDKs. @dandavison, wondering if you considered this in Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I have not considered APIs for user Temporal Nexus testing yet; I'll add a TODO to the code. In my tests I have a helper function like this for creating Nexus endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need an issue to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Created one for Python SDK-3837 "Python Nexus user testing utilities and docs"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have made this a features issue with checkboxes for each SDK but 🤷, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be added post-initial-release.
packages/worker/src/nexus.ts
Outdated
options: nexus.StartOperationOptions | ||
): Promise<coresdk.nexus.INexusTaskCompletion> { | ||
try { | ||
const input = await decodeFromPayload(this.dataConverter, payload); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the data converter fails here, I assume it should result in a 400 BAD_REQUEST
Nexus HTTP response, right? If that's correct can you confirm that it does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a test worth adding. Thanks!
- This PR is blocked on publishing an initial version of the `nexus-rpc` package. - There is a TODO to figure out HandlerError and OperationError message rehydration, pending discussion. - Interceptors not yet implemented. - `WorkflowRunOperation` and `getClient()` not implemented for the `@temporalio/nexus` package. - Tests use the HTTP API directly in lieu of a workflow caller or strongly typed client, we can refactor those later.
} | ||
if (!token.wid) { | ||
throw new TypeError('invalid workflow run token: missing workflow ID (wid)'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that Go and TS don't validate the 'ns'
field. Is that deliberate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaving this for now, but haven't finished my review
* | ||
* @experimental Nexus support in Temporal SDK is experimental. | ||
*/ | ||
export interface StartNexusOperationInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this ever user-facing? Do the fields need docstrings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's user facing, and yes this could benefit from some doc strings.
But we don't have any on other interceptors input/output types, so I'll just skip for the moment.
const completed = | ||
!activator.completions.nexusOperationStart.has(seq) && | ||
!activator.completions.nexusOperationComplete.has(seq); | ||
|
||
if (!completed) { | ||
activator.pushCommand({ | ||
requestCancelNexusOperation: { seq }, | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just check nexusOperationComplete
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because cancellation could also be requested for an operation that is still in the "Start" state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duh 🤦
What is the result of cancelling a completed operation then? It's already completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nexusOperationStart
=> waiting for the operation to be started on the remote nexus handler service.nexusOperationComplete
=> waiting for the operation to be completed asynchronously by the remote nexus handler.
So while the op is in nexusOperationComplete
, it is not yet completed, it is pending completion, and so it can still be cancelled.
/** | ||
* A promise that will resolve when the Nexus operation completes, either with the result of the | ||
* operation if it completed successfully, or a failure if the Nexus Operation failed. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW - when it comes to cancellation types this comment will be technically incomplete/inaccurate: under WaitRequested
the promise will resolve with a cancellation error if the cancel handler succeeds, and it will resolve with a non-cancellation error if the cancel handler fails, in both cases regardless of the operation outcome.
nit: decide on capitalization of [Oo]peration and [N]nexus and make docstrings consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: decide on capitalization of [Oo]peration and [N]nexus and make docstrings consistent.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW - when it comes to cancellation types this comment will be technically incomplete/inaccurate: under WaitRequested the promise will resolve with a cancellation error if the cancel handler succeeds, and it will resolve with a non-cancellation error if the cancel handler fails, in both cases regardless of the operation outcome.
Thanks for the heads-up.
|
||
/** | ||
* A fixed, single-line summary for this Nexus Operation that may appear in the UI/CLI. | ||
* This can be in single-line Temporal markdown format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I supposed to know what "Temporal markdown format" is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the normal Markdown elements except we strip away images, hmtl, and scripts.
Maybe it's worth clarifying that in the docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question.
Unfortunately, we don't yet have a doc page to link to, which is ok at this point as User Metadata is still considered prerelease.
And FWIW, that's the same typedoc text as we have in other SDKs.
packages/workflow/src/nexus.ts
Outdated
service: options.service.name, | ||
operation: opName, | ||
options: operationOptions ?? {}, | ||
nexusHeader: {}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to make it possible for the user to pass headers now, or leave an open issue for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I know Go uses the (weird) singular for "header"; does TS also? Python uses "headers".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I know Go uses the (weird) singular for "header"; does TS also? Python uses "headers".
Good catch. Renamed to headers
everywhere (no nexus
prefix, plural form). That's coherent with both Python and other TS interceptors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to make it possible for the user to pass headers now, or leave an open issue for it?
Not sure what's the expectation here. For all other APIs, headers are not exposed directly in the "main" user facing call, only in interceptors. Are doing this differently for Nexus ops? If that's only in interceptors, then that should already be working now (though no test atm covering that).
In any case, I'm leaving that for later.
packages/workflow/src/nexus.ts
Outdated
* Start a Nexus Operation and wait for its completion taking a {@link nexus.operation}. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Start a Nexus Operation and wait for its completion taking a {@link nexus.operation}. | |
*/ | |
* Start a Nexus Operation and wait for its completion taking a {@link nexus.operation}. | |
* Returns the operation's result. | |
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be worth giving users some guidance in the docstrings regarding which of the two overloads (operation vs operation name) they should use?
Maybe dumb question: doesn't the string overload always provide better type safety? The structural typing (operation) overload allows a user to refer to an operation that has a name that does not match a name on the service for which the client was created, giving rise to a request-time HandlerError
I believe e.g.
const service = nexus.service('test', {
syncOp: nexus.operation<string, string>({ name: 'my-sync-op' }),
asyncOp: nexus.operation<string, string>(),
});
const service2 = nexus.service('test', {
syncOp2: nexus.operation<string, string>({ name: 'my-sync-op' }),
asyncOp2: nexus.operation<string, string>(),
});
export async function caller(endpoint: string, op: keyof typeof service.operations, action: string): Promise<string> {
const client = workflow.createNexusClient({
endpoint,
service,
});
return await workflow.CancellationScope.cancellable(async () => {
let opRef = service2.operations['syncOp2'];
const handle = await client.startOperation(opRef, action);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe dumb question: doesn't the string overload always provide better type safety? The structural typing (operation) overload allows a user to refer to an operation that has a name that does not match a name on the service for which the client was created
No, both provide the same level of type safety. I just added a test to demonstrate this.
Would it be worth giving users some guidance in the docstrings regarding which of the two overloads (operation vs operation name) they should use?
We have no opinion on that question. We intentionally chose to allow both.
I however expanded the typedoc to clarify that this expects the operation's property's name, rather than the operation's name. That may sound a bit confusing at first, but in practice, I'd expect most users will either not override operation names, or they will rely on their IDE's autocompletion, or TSC's compilation errors, to point out incorrect usage. Only pure-JS users (with no implicit TS IDE support) and Nexus framework implementers would really be exposed to this notion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both provide the same level of type safety.
Using an operation method reference, users can do this without a type error:
await service1client.startOperation(service2.operations.syncOp2, action)
That would result in a request time error. I.e. it would send a request for service='service1
and operation='syncOp2'
, which is invalid.
But using string operation names they cannot represent that invalid nexus request. For example, service1client.startOperation('syncOp2', action)
would be a type-check-time error.
My question is: given that operation method references are less type safe in this sense, what value do they add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have limited context on the nexus semantics, so I'll leave that to @dandavison and @bergundy . The integration lgtm to me though.
err.failure = failure; | ||
if (err instanceof TemporalFailure) { | ||
err.failure = failure; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has the blind err.failure = failure
caused issues before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was simply not possible before. I mean, until now, err
here would always be a TemporalFailure
(look at the method signature). Now, it may be a Nexus HandlerError
, which is part of the (Temporal agnostic) Nexus RPC.
I'm considering storing the failure on HandlerError anyway, possibly using a private symbol property. But we are revisiting nexus error serialization at the moment, so I'm deferring that decision until we've set our mind.
packages/worker/src/interceptors.ts
Outdated
}; | ||
|
||
/** | ||
* A function that takes a Nexus Context and returns an interceptor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that comment should have been on NexusInterceptorsFactory
, above. I improved that one, as well as that of the equivalent ActivityInterceptorsFactory
type, near the top of this file.
However, I'd point out that typedocs on everything interceptors-related (Client/Workflow/Activity/Nexus) is very disappointing. I'm taking note to revisit all interceptors soon.
packages/worker/src/interceptors.ts
Outdated
} | ||
|
||
/** | ||
* A function that takes a Nexus Context and returns an interceptor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above: I don't understand this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed description on that specific typedoc. It should have one, but so should all other interceptor-related types, but most don't. I'll revisit in a distinct PR.
protected async startOperation( | ||
ctx: nexus.StartOperationContext, | ||
payload: Payload | undefined | ||
): Promise<coresdk.nexus.INexusTaskCompletion> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment: I note that in Python, all the classes that have a "Handler"-like interface are in the nexus SDK:
whereas here you have a class with a handler-like interface in the Tempral worker. Perhaps Python could/should be written more like yours; I hadn't considered it. Do you have any comments on that difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what difference you are trying to point out.
That's an internal method of the worker to bridge between what Nexus Tasks provided by Core/the Temporal server, and the Nexus RPC's APIs. It really can't be moved to the Nexus RPC, as it is dealing with Core's NexusTaskCompletion
and Temporal Payload
.
It is equivalent to this method in Python.
The signature of the methods are slightly different, I think that's mostly copying the respective style of existing Workflow/Activity Task handling in various SDKs. I really don't think we need to worry about cross-SDK alignment at that level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, but should get approval from someone who has more familiarity with the nexus semantics before merging.
Specifically, approval on the nexus
package and the corresponding test
s.
The changes integrating this with the TS SDK lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm comfortable with the PR as it stands at this point.
However, given that I myself authored a significant part of this PR, I think it's pertinent to wait for some complementary signoffs.
]; | ||
} | ||
|
||
const { taskQueue: userSpeficiedTaskQueue, ...rest } = workflowOptions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo in this variable name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What changed
This adds support for Nexus in the Temporal TS SDK.
All Nexus-related APIs are experimental, and some changes are expected before this reaches GA.
Review notes
HandlerError
andOperationError
serialization. In the mean time, this PR mostly adheres to the original spec (i.e. a Nexus should have either amessage
or acause
), but little effort has been put into compliance testing, as we know the present model is broken in certain aspects.