Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: polling based execution config watcher #1671

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

endigma
Copy link
Contributor

@endigma endigma commented Mar 10, 2025

Motivation and Context

  • inotify and fsnotify are great, until various not-quite-standard filesystems and symlinks get thrown into the mix, then it doesn't seem so great anymore.
  • The new system should be virtually edge-case free, and has a lot lower mental complexity, it just checks on a timer if the last modified date of the watched file has changed

Theoretical limitations

  • We trust the filesystem/OS to give us a useful and precise mod time value
  • If you somehow circumvented the OS/fs and changed the file contents directly without changing mod time, we would not know

Copy link

github-actions bot commented Mar 10, 2025

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-4880e9c3512b46c90982bd0b867a6f6985f2f498

@endigma endigma force-pushed the jesse/eng-6304-switch-to-polling-based-file-watcher branch from b74080a to ec855ae Compare March 11, 2025 14:23
@endigma endigma requested a review from Copilot March 11, 2025 14:25

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the configuration file watcher to use a polling-based approach instead of relying on filesystem notifications. Key changes include the introduction of a new simple watcher, corresponding test improvements, and updates to configuration and router startup to support polling parameters.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file
File Description
router/pkg/watcher/simple_watcher.go New polling-based file watcher implementation
router/pkg/watcher/simple_watcher_test.go Comprehensive tests for the new watcher functionality
router-tests/router_config_watch_test.go Integration test updating the router config and monitoring hot-reload behavior
router/core/router.go Switch from legacy file watcher to the new polling-based implementation, update error handling flow
router/pkg/config/config.go Add WatchInterval support to config schema
router/cmd/instance.go Pass WatchInterval configuration to the router execution config
Comments suppressed due to low confidence (1)

router/core/router.go:1139

  • The error message indicates a failure to 'serialize' the config file when the operation is actually unmarshaling. Consider updating the message to 'Failed to parse config file' or 'Failed to unmarshal config file' for clarity.
r.logger.Error("Failed to serialize config file", zap.Error(err))
@endigma endigma marked this pull request as ready for review March 11, 2025 15:06
@endigma endigma requested a review from StarpTech March 11, 2025 15:06
@endigma endigma force-pushed the jesse/eng-6304-switch-to-polling-based-file-watcher branch 2 times, most recently from 5e876e2 to bdbe017 Compare March 11, 2025 16:09
@endigma endigma force-pushed the jesse/eng-6304-switch-to-polling-based-file-watcher branch from b9a1020 to 239915d Compare March 12, 2025 13:12
@endigma endigma requested a review from StarpTech March 12, 2025 13:12
Callback func()
}

func LogSimpleWatch(ctx context.Context, options SimpleWatcherOptions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we basically require all options to be set, would it make sense to make them private and have a function NewSimpleWatcherOptions() with a bit of validation that returns an error instead of a panicking when calling Logger or Callback?

@@ -1116,45 +1117,36 @@ func (r *Router) Start(ctx context.Context) error {
}()

if r.executionConfig != nil && r.executionConfig.Watch {
go watcher.LogSimpleWatch(ctx, watcher.SimpleWatcherOptions{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old watcher is unused now, correct? Any reason to keep unused code?

// Wait for the first cycle to complete to set baseline
time.Sleep(2 * watchInterval)

wg.Add(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just from a review perspective wg here looks a bit like a race condition. I personally would try to avoid time.Sleep. I suggest adding something to the waitgroup first and waiting 2 times or you could also work with channels and a select with something like <-time.After.

What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants