[flytepropeller][flyteagent] Set ListAgent Timeout to unblock propeller launch execution #6312

Open
wants to merge 5 commits into
base: master


Future-Outlier
Member

@Future-Outlier Future-Outlier commented Mar 6, 2025

Tracking issue

#3936

Why are the changes needed?

We noticed that when propeller starts, it has to wait for the agent to return its supported task types before it can launch executions.
However, sometimes the agent server does not respond and propeller ends up waiting for a long time, so we give the ListAgents request a timeout.

clientSet := getAgentClientSets(ctx)
agentRegistry := getAgentRegistry(ctx, clientSet)
supportedTaskTypes := maps.Keys(agentRegistry)

What changes were proposed in this pull request?

Set ListAgents request timeout to 3s.

How was this patch tested?

single binary.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This PR refines concurrency handling in the Flyte agent plugin by optimizing ListAgents timeout and adjusting goroutine usage. It removes an unnecessary go invocation and adds a goroutine for plugin.watchAgents, properly separating periodic execution from agent watching functionality. The changes also include multiple dependency upgrades, configuration updates, and documentation improvements to enhance performance and maintainability.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 4

@flyte-bot
Collaborator

flyte-bot commented Mar 6, 2025

Code Review Agent Run #4d26d7

Actionable Suggestions - 1
  • flyteplugins/go/tasks/plugins/webapi/agent/plugin.go - 1
    • Consider adding error handling for goroutine · Line 422-422
Review Details
  • Files reviewed - 1 · Commit Range: 26225c8..26225c8
    • flyteplugins/go/tasks/plugins/webapi/agent/plugin.go
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful

@flyte-bot
Collaborator

flyte-bot commented Mar 6, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Agent Timeout Configuration Update

values.yaml - Added 'ListAgents: 3s' to configure the agent timeout in the values file.

complete-agent.yaml - Inserted 'ListAgents: 3s' configuration to enforce a reduced timeout in the agent manifest.

config.go - Reduced the DefaultTimeout from 10 seconds to 3 seconds to speed up agent startup.

Other Improvements - Manifest Secret and Checksum Updates

complete-agent.yaml - Updated haSharedSecret and both checksum/configuration and checksum/secret values for enhanced consistency.

complete.yaml - Modified haSharedSecret and checksum/secret to align with current configuration standards.

dev.yaml - Replaced haSharedSecret and checksum/secret values to ensure manifest consistency and security.

@@ -419,7 +419,7 @@ func newAgentPlugin(agentService *core.AgentService) webapi.PluginEntry {
 		cs:       clientSet,
 		registry: agentRegistry,
 	}
-	plugin.watchAgents(ctx, agentService)
+	go plugin.watchAgents(ctx, agentService)

Consider adding error handling for goroutine

The watchAgents function is now called in a goroutine with go plugin.watchAgents(ctx, agentService). This is a good change because it no longer blocks plugin initialization, but we should consider adding panic handling for this goroutine. In Go, an unrecovered panic in any goroutine terminates the whole process, and without a recover nothing is reported through the plugin's logger first.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-	go plugin.watchAgents(ctx, agentService)
+	go func() {
+		defer func() {
+			if r := recover(); r != nil {
+				logger.Errorf(ctx, "watchAgents goroutine panicked: %v", r)
+			}
+		}()
+		plugin.watchAgents(ctx, agentService)
+	}()
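
For readers who want to try the recover pattern in isolation, here is a small self-contained sketch. `runRecovered` is a hypothetical helper, not part of Flyte. It runs a function in a goroutine, passes any recovered panic value to a callback, and returns a WaitGroup so the caller can wait for completion.

```go
package main

import (
	"fmt"
	"sync"
)

// runRecovered runs fn in a new goroutine and reports any panic through
// onPanic instead of letting it terminate the process. The returned
// WaitGroup lets callers wait for the goroutine to finish.
func runRecovered(fn func(), onPanic func(recovered any)) *sync.WaitGroup {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		// Deferred calls run last-in, first-out: the recover handler
		// below runs before wg.Done, so onPanic completes before
		// Wait returns.
		defer wg.Done()
		defer func() {
			if r := recover(); r != nil {
				onPanic(r)
			}
		}()
		fn()
	}()
	return &wg
}

func main() {
	wg := runRecovered(
		func() { panic("agent watcher failed") },
		func(r any) { fmt.Println("recovered:", r) },
	)
	wg.Wait()
	fmt.Println("process is still running")
}
```

Note that recover only works in a deferred function on the panicking goroutine itself, which is why the handler has to live inside the goroutine rather than in the caller.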

Code Review Run #4d26d7



codecov bot commented Mar 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.89%. Comparing base (7bc47df) to head (b96b57a).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6312      +/-   ##
==========================================
- Coverage   58.48%   57.89%   -0.60%     
==========================================
  Files         937      774     -163     
  Lines       71091    57321   -13770     
==========================================
- Hits        41580    33186    -8394     
+ Misses      26359    21642    -4717     
+ Partials     3152     2493     -659     
Flag Coverage Δ
unittests-datacatalog 59.06% <ø> (ø)
unittests-flyteadmin 56.30% <ø> (+0.02%) ⬆️
unittests-flytecopilot 30.99% <ø> (ø)
unittests-flytectl 64.70% <ø> (ø)
unittests-flyteidl 76.12% <ø> (ø)
unittests-flyteplugins ?
unittests-flytepropeller 54.80% <ø> (ø)
unittests-flytestdlib 64.04% <ø> (ø)


@Future-Outlier Future-Outlier changed the title [WIP][flytepropeller][flyteagent] watch agent support task type by go routine [flytepropeller][flyteagent] watch agent support task type by go routine Mar 11, 2025
@Future-Outlier Future-Outlier changed the title [flytepropeller][flyteagent] watch agent support task type by go routine [flytepropeller][flyteagent] Set ListAgent Timeout to unblock propeller launch execution Mar 12, 2025
Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier Future-Outlier enabled auto-merge (squash) March 12, 2025 01:20
@flyte-bot
Collaborator

flyte-bot commented Mar 12, 2025

Code Review Agent Run #c6f436

Actionable Suggestions - 0
Review Details
  • Files reviewed - 18 · Commit Range: 26225c8..b96b57a
    • boilerplate/flyte/golang_test_targets/download_tooling.sh
    • charts/flyte-binary/values.yaml
    • datacatalog/go.mod
    • datacatalog/go.sum
    • docker/sandbox-bundled/bootstrap/go.mod
    • docker/sandbox-bundled/bootstrap/go.sum
    • docker/sandbox-bundled/manifests/complete-agent.yaml
    • docker/sandbox-bundled/manifests/complete.yaml
    • docker/sandbox-bundled/manifests/dev.yaml
    • flytectl/docs/docs-requirements.txt
    • flyteidl/go.mod
    • flyteidl/go.sum
    • flyteplugins/go/tasks/plugins/webapi/agent/config.go
    • flyteplugins/go/tasks/plugins/webapi/agent/plugin.go
    • flytepropeller/pkg/webhook/aws_secret_manager.go
    • flytepropeller/pkg/webhook/aws_secret_manager_test.go
    • flytestdlib/go.mod
    • flytestdlib/go.sum
  • Files skipped - 4
    • .github/workflows/single-binary.yml - Reason: Filter setting
    • charts/flyte-binary/README.md - Reason: Filter setting
    • docs/user_guide/productionizing/secrets.md - Reason: Filter setting
    • rfc/system/RFC-5659-execution-concurrency.md - Reason: Filter setting
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • OWASP (Security Vulnerability) - ✔︎ Successful
    • GOVULNCHECK (Security Vulnerability) - ✖︎ Failed
    • SNYK (Security Vulnerability) - ✔︎ Successful



2 participants