Contributor

@lina-temporal lina-temporal commented Sep 5, 2025

What changed?

  • Visibility writes are added to the Scheduler components.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

@lina-temporal lina-temporal requested a review from a team as a code owner September 5, 2025 22:42
customSearchAttributes *commonpb.SearchAttributes,
) error {
needsTask := false // Set to true if we need to write anything to Visibility.
visibility, err := s.Visibility.Get(ctx)
Member

Not related to this PR, but do we ever expect an error from getting a component to be one that the application needs to handle? What would cause this failure? Can we just make the method panic instead?

Member

When we support partial read in the future, there could be some transient error returned here. For now, it will only be serialization errors.

Member

If it's a transient, unexpected error, it should be okay to panic IMHO. That would be better than complicating the user interface. There's never really a way for a developer to react to this situation.

Comment on lines 338 to 339
if customSearchAttributes != nil &&
len(customSearchAttributes.GetIndexedFields()) > 0 {
Member

nit: The length check alone is enough, because proto getters are nil-safe.

Suggested change
- if customSearchAttributes != nil &&
- len(customSearchAttributes.GetIndexedFields()) > 0 {
+ if len(customSearchAttributes.GetIndexedFields()) > 0 {
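To show why the explicit nil check is redundant, here is a small sketch using a hand-written stand-in for a generated message. The `SearchAttributes` type below is not the real `commonpb.SearchAttributes`; it only reproduces the nil-receiver pattern that generated Go getters follow.

```go
package main

import "fmt"

// SearchAttributes is a hand-written stand-in for a generated proto message.
type SearchAttributes struct {
	IndexedFields map[string][]byte
}

// GetIndexedFields follows the generated-getter pattern: calling it on a nil
// pointer returns the zero value instead of panicking.
func (x *SearchAttributes) GetIndexedFields() map[string][]byte {
	if x != nil {
		return x.IndexedFields
	}
	return nil
}

// hasCustomAttributes needs no separate nil check on sa.
func hasCustomAttributes(sa *SearchAttributes) bool {
	return len(sa.GetIndexedFields()) > 0
}

func main() {
	var unset *SearchAttributes
	fmt.Println(hasCustomAttributes(unset))               // false
	fmt.Println(hasCustomAttributes(&SearchAttributes{})) // false
	fmt.Println(hasCustomAttributes(&SearchAttributes{
		IndexedFields: map[string][]byte{"team": []byte(`"infra"`)},
	})) // true
}
```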

}

// Update Paused status.
var currentPaused bool
Member

This is exactly why I didn't want components to use imperative logic to update their search attributes. This logic should be part of the framework.

Member

If we can afford it timeline-wise I'd much prefer if we coded up a better abstraction in the framework.

Member

In general I agree. I also acknowledge that imperative logic means the component author needs to remember to update those attributes.

We discussed the field tag approach, which addresses this concern and also the SA registration problem, but decided to do that as a follow-up to unblock the Scheduler migration and deliver value sooner. It's more of an ROI and priority discussion for the entire OSS team.


// Update visibility if the memo is out-of-date or absent.
if currentInfoPayload == nil ||
!bytes.Equal(currentInfoPayload.Data, newInfoPayload) {
Member

Proto marshaling is not deterministic by default, so this comparison will give you false negatives. You will want to unmarshal and use proto.Equal instead.

Another reason why component authors shouldn't implement all of this logic.

// attributes will be left as-is with it unset.
//
// See mergeCustomSearchAttributes for how custom search attributes are merged.
func (s *Scheduler) UpdateVisibility(
Member

I don't like this approach: it assumes that the component author will remember to update visibility any time an attribute changes. This should be the framework's responsibility, and it should happen automatically when a transaction is closed.

for key, newPayload := range customAttrs {
oldPayload, alreadySet := currentAttrs[key]

if !alreadySet || !bytes.Equal(oldPayload.Data, newPayload.Data) {
Member

As mentioned before, you cannot rely on byte equality for this comparison.

Comment on lines 393 to 395
if needsTask {
visibility.GenerateTask(ctx)
}
Member

This is unnecessary, Visibility component automatically generates a task when it's updated.

Comment on lines 356 to 358
if !ok || currentPaused != s.Schedule.State.Paused {
upsertAttrs[searchattribute.TemporalSchedulePaused] = s.Schedule.State.Paused
}
Member

This is more like a "refresh visibility" approach. Do we have something like a PauseSchedule method where we can update the field?

Comment on lines 376 to 380
currentMemo, err := visibility.GetMemo(ctx)
if err != nil {
return err
}
currentInfoPayload := currentMemo[visibilityMemoFieldInfo]
Member

nit: can use the generic chasm.GetMemo() function which returns a strongly typed value instead of Payload

Comment on lines 385 to 386
newMemo := map[string]any{visibilityMemoFieldInfo: newInfoPayload}
err = visibility.UpsertMemo(ctx, newMemo)
Member

UpsertMemo will perform the encoding to Payload type, should just use newInfo here.

Comment on lines 349 to 355
currentPausedPayload, ok := currentAttrs[searchattribute.TemporalSchedulePaused]
if ok {
err = payload.Decode(currentPausedPayload, &currentPaused)
if err != nil {
return err
}
}
Member

nit: can use the generic chasm.GetSearchAttribute() method here as well and you don't need to do the decoding.
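As a rough picture of what a strongly typed accessor saves the caller, here is a sketch of a generic getter. The `getTyped` helper is hypothetical, and JSON decoding stands in for Temporal's payload decoding; the real signatures of chasm.GetSearchAttribute() and chasm.GetMemo() are not shown in this thread and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// getTyped (hypothetical) looks up key and decodes it into T, so callers do
// not handle raw payloads or write their own decode step.
func getTyped[T any](attrs map[string][]byte, key string) (T, bool, error) {
	var value T
	raw, ok := attrs[key]
	if !ok {
		return value, false, nil
	}
	if err := json.Unmarshal(raw, &value); err != nil {
		var zero T
		return zero, false, err
	}
	return value, true, nil
}

func main() {
	attrs := map[string][]byte{"TemporalSchedulePaused": []byte("true")}

	paused, ok, err := getTyped[bool](attrs, "TemporalSchedulePaused")
	fmt.Println(paused, ok, err)
}
```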

Comment on lines 97 to 98
// Key isn't in the new map, delete it.
upsertAttrs[key] = nil
Member

Will this work? At least when I was coding the visibility.UpsertSearchAttributes() method, I wasn't thinking of using it as a way to delete attributes.

Member

This is how it worked in the SDKs before we had typed search attributes.
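The convention described here, where a nil value in an upsert map means "delete this key", can be sketched as follows. The `Payload` type and `applyUpsert` function are hypothetical; whether visibility.UpsertSearchAttributes() actually treats nil this way is exactly the open question in this thread.

```go
package main

import "fmt"

// Payload is a simplified stand-in for an encoded search attribute value.
type Payload struct {
	Data []byte
}

// applyUpsert merges updates into a copy of current. A nil payload in updates
// removes the key; any other payload sets or overwrites it.
func applyUpsert(current, updates map[string]*Payload) map[string]*Payload {
	merged := make(map[string]*Payload, len(current)+len(updates))
	for k, v := range current {
		merged[k] = v
	}
	for k, v := range updates {
		if v == nil {
			delete(merged, k)
			continue
		}
		merged[k] = v
	}
	return merged
}

func main() {
	current := map[string]*Payload{
		"team": {Data: []byte("infra")},
		"env":  {Data: []byte("prod")},
	}
	updates := map[string]*Payload{
		"env":  nil, // key isn't in the new map, delete it
		"tier": {Data: []byte("gold")},
	}

	merged := applyUpsert(current, updates)
	_, hasEnv := merged["env"]
	fmt.Println(len(merged), hasEnv) // 2 false
}
```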

Member

@bergundy bergundy left a comment

Based on @yycptt's comments the abstraction isn't too bad today.
Seems like you don't need to manually generate a task, serialize and deserialize payloads, etc...

If we could also eliminate the need to compare search attributes and do incremental updates only to fields that changed, that would be good enough for me for merging this PR and deferring the declarative APIs for later.

Ideally we wouldn't even need to do that. If you could just recompute the desired visibility state and have the visibility component work out what needs to be updated (if anything at all), that would be much better than what we have today.
Is that doable?
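One way to picture the "recompute the desired state and let the component diff it" idea is the sketch below. It is not the framework's implementation: `diffAttributes` and the plain string-valued maps are hypothetical simplifications of what a visibility component might do internally.

```go
package main

import (
	"fmt"
	"sort"
)

// diffAttributes compares the stored attributes with the freshly recomputed
// desired set. It returns the keys that must be written and the keys that
// must be removed. Empty results mean no visibility write is needed.
func diffAttributes(current, desired map[string]string) (changed map[string]string, removed []string) {
	changed = make(map[string]string)
	for k, want := range desired {
		if have, ok := current[k]; !ok || have != want {
			changed[k] = want
		}
	}
	for k := range current {
		if _, ok := desired[k]; !ok {
			removed = append(removed, k)
		}
	}
	sort.Strings(removed) // stable output regardless of map iteration order
	return changed, removed
}

func main() {
	current := map[string]string{"TemporalSchedulePaused": "false", "team": "infra", "env": "prod"}
	desired := map[string]string{"TemporalSchedulePaused": "true", "team": "infra"}

	changed, removed := diffAttributes(current, desired)
	fmt.Println(changed, removed) // map[TemporalSchedulePaused:true] [env]
}
```

With this shape, the component author only describes the desired state, and the decision about whether anything needs writing lives in one place.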

…move some unused metric definitions (#8293)

## What changed?
- Did a pass on metrics for the CHASM scheduler and added the missing
delay metric.
- A few of the old schedule metrics aren't used anywhere at all, so I've
removed them.

## How did you test it?
- [ ] built
- [ ] run locally and tested manually
- [ ] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)

Base automatically changed from chasm_cleanup_addedtasks to main September 23, 2025 22:43
@lina-temporal lina-temporal requested a review from a team as a code owner October 9, 2025 20:37
Comment on lines -1244 to -1257
ScheduleActionAttempt = NewCounterDef(
"schedule_action_attempt",
WithDescription("The number of schedule actions attempts"),
)
Contributor Author

I'm pretty sure these were from another PR that might still be pending.

Comment on lines 544 to 551
// realTime may be slightly past the time of the action's first scheduled WFT.
realTime := time.Now()
desiredTime := start.ActualTime
e.MetricsHandler.Timer(metrics.ScheduleActionDelay.Name()).Record(realTime.Sub(desiredTime.AsTime()))

return &schedulepb.ScheduleActionResult{
- ScheduleTime: start.ActualTime,
- ActualTime: timestamppb.New(time.Now()),
+ ScheduleTime: desiredTime,
+ ActualTime: timestamppb.New(realTime),
Contributor Author

also from another PR (nexus-watcher). Ignore, will fix before merge.

Comment on lines +375 to +379
// "Pause" is the only Temporal-managed search attribute for schedules, so it is
// updated here ad-hoc, instead of via the SearchAttributesProvider interface.
pauseAttr := make(map[string]*commonpb.Payload)
pauseAttr[searchattribute.TemporalSchedulePaused] = p
return vis.SetSearchAttributes(ctx, pauseAttr) // merges with custom search attributes
Contributor Author

I did it this way because it seemed that to do it in the SA provider, I'd have to manually first Get the existing SAs and merge the paused status. Plus, schedule search attributes rarely change.

Member

You should put this in the SA provider. The visibility component is for custom search attributes. The way that you've implemented it, you'll need to update this attribute when a user modifies the schedule, and you're deviating from the design and from how we want to structure the other CHASM components.

Contributor Author

> The way that you've implemented it, you'll need to update this attribute when a user modifies the schedule,

Why would I have to do that? SetSearchAttributes does a merge, the SA provider does not, from my read of it (it looks to be a full replace). Because these change infrequently, and independently, the ad-hoc call seemed more appropriate than writing an SA provider that has to run every transaction and have its own manual merge.

Contributor Author

Chatted about this offline - the idea is for SearchAttributes provider to provide the Temporal-provided attributes, and SetSearchAttributes to be for custom search attributes. Will update
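The split agreed on here can be sketched as follows: Temporal-provided attributes are recomputed from component state by a provider, while user-supplied custom attributes are stored separately. Everything in the sketch is an assumption made for illustration, including the method signature on `SearchAttributesProvider`, the `schedule` type, the string key, and the choice that provider values win on a key collision.

```go
package main

import "fmt"

// SearchAttributesProvider is a hypothetical version of the interface named
// in the thread: it derives Temporal-provided attributes from component state.
type SearchAttributesProvider interface {
	SearchAttributes() map[string]any
}

// schedule is a minimal stand-in for the Scheduler component.
type schedule struct {
	paused bool
}

// SearchAttributes recomputes the system attributes from current state, so
// there is nothing to remember to update when the schedule is paused.
func (s schedule) SearchAttributes() map[string]any {
	return map[string]any{"TemporalSchedulePaused": s.paused}
}

// effectiveAttributes overlays provider attributes on the custom ones.
// Assumption: provider (system) values take precedence on collisions.
func effectiveAttributes(p SearchAttributesProvider, custom map[string]any) map[string]any {
	out := make(map[string]any, len(custom)+1)
	for k, v := range custom {
		out[k] = v
	}
	for k, v := range p.SearchAttributes() {
		out[k] = v
	}
	return out
}

func main() {
	custom := map[string]any{"team": "infra"}
	fmt.Println(effectiveAttributes(schedule{paused: true}, custom))
}
```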

msg := "failed to update future action times"
logger.Error(msg, tag.Error(err))
return fmt.Errorf("%w: %w",
serviceerror.NewInternal(msg),
Member

Should this be a non-retryable error, to ensure this task doesn't stay in the queue?

google.protobuf.Timestamp last_processed_time = 3;

// A list of upcoming times an action will be triggered.
repeated google.protobuf.Timestamp future_action_times = 4;
Member

Does this need to be persisted? Can't it be calculated on the fly when calculating the memo?

Contributor Author

No, it can't be, because the Memo is computed in the context of a component (not a task executor), and therefore it doesn't have the wired-in dependencies that computing this relies upon (SpecProcessor).

"google.golang.org/protobuf/types/known/timestamppb"
)

type schedulerTestSuite struct {
Member

Please, we asked not to add more suites into the codebase.

Contributor Author

So we're just going to start redoing scheduler's tests, for a package where it's a well-established pattern, before it's even landed, to match that..? Also, that isn't the agreement we came to as a team, as far as I can remember. I'd asked specifically during that meeting if there was an issue with test suites themselves, and you'd said that it was about the way that an improper testing.T gets propagated to subtests.

Member

We said we wouldn't use suites for new tests. It's just impossible to not use the suite's T() and embedded assertions. But I know you're in a time crunch, so we can leave this as tech debt.

4 participants