Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a nginxprocess package for working with NGINX processes #956

Open
wants to merge 4 commits into
base: v3
Choose a base branch
from

Conversation

ryepup
Copy link
Contributor

@ryepup ryepup commented Dec 27, 2024

Proposed changes

Add a new nginxprocess package to export useful utilities around working with NGINX processes, and update internals to use that package. NGINXaaS for Azure has some proprietary code that also wants to use these utilities.

This comes with a significant speed improvement by filtering for NGINX processes earlier and fetching less data from the OS. See individual commits for more details.

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING document
  • I have run make install-tools and have attached any dependency changes to this pull request
  • If applicable, I have added tests that prove my fix is effective or that my feature works
  • If applicable, I have checked that any relevant tests pass after adding my changes
  • If applicable, I have updated any relevant documentation (README.md)
  • If applicable, I have tested my cross-platform changes on Ubuntu 22, Redhat 8, SUSE 15 and FreeBSD 13

@ryepup ryepup requested a review from a team as a code owner December 27, 2024 19:50
@github-actions github-actions bot added the chore Pull requests for routine tasks label Dec 27, 2024
Copy link
Contributor

github-actions bot commented Dec 27, 2024

✅ All required contributors have signed the F5 CLA for this PR. Thank you!
Posted by the CLA Assistant Lite bot.

@ryepup
Copy link
Contributor Author

ryepup commented Dec 27, 2024

I have hereby read the F5 CLA and agree to its terms

@ryepup ryepup force-pushed the export-pid-features branch from 22a996f to 39ba7e4 Compare December 27, 2024 20:01
@ryepup
Copy link
Contributor Author

ryepup commented Dec 27, 2024

I'm not sure why the pipeline is failing; make unit-test and make performance-test work if I run them locally, and I didn't change the otel collector section.

@ryepup ryepup force-pushed the export-pid-features branch from 39ba7e4 to eed2042 Compare December 27, 2024 20:18
@ryepup ryepup force-pushed the export-pid-features branch from eed2042 to f27ef37 Compare December 27, 2024 20:34
pkg/nginxprocess/process.go Outdated Show resolved Hide resolved
pkg/nginxprocess/process.go Outdated Show resolved Hide resolved
pkg/nginxprocess/process.go Outdated Show resolved Hide resolved
pkg/nginxprocess/process.go Show resolved Hide resolved
@ryepup ryepup force-pushed the export-pid-features branch 4 times, most recently from 2a306bf to fe5c421 Compare December 30, 2024 19:43
@ryepup
Copy link
Contributor Author

ryepup commented Dec 30, 2024

I'm not sure why the pipeline is failing; make unit-test and make performance-test work if I run them locally, and I didn't change the otel collector section.

unit-test fixed itself, and the performance-test was my fault; I added a benchmark that failed if there were no nginx processes. I switched that up so hopefully it'll be happy now.

pkg/nginxprocess/process.go Outdated Show resolved Hide resolved
@ryepup ryepup force-pushed the export-pid-features branch 2 times, most recently from 690b7a5 to 04c36a0 Compare January 2, 2025 20:46
}
}

func BenchmarkList(b *testing.B) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should keep these benchmarks as they are dependent on reading processes running on your machine. Results would vary between machines. It could cause our benchmark comparisons to fail in our pipeline https://github.com/nginx/agent/blob/v3/.github/workflows/ci.yml#L225

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This benchmark was useful for development to compare with the existing implementation and verify how expensive additional gopsutil calls were. I can see how it might mess up CI due to randomness of the runners.

Which would you prefer:

  1. delete it
  2. put a b.Skipf at the top that devs can comment out if they're working around here
  3. put a if os.Getenv("CI") != "" { b.Skipf(...) } at the top

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It they were useful for you during development then maybe we should keep them and put a b.Skipf in. Maybe just add a comment as well to the function to say why they are skipped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does that look?

pkg/nginxprocess/process_test.go Outdated Show resolved Hide resolved
ryepup added a commit to ryepup/agent that referenced this pull request Jan 9, 2025
@ryepup ryepup force-pushed the export-pid-features branch from 04c36a0 to 8344487 Compare January 9, 2025 15:33
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 9, 2025
ryepup added a commit to ryepup/agent that referenced this pull request Jan 9, 2025
@ryepup ryepup force-pushed the export-pid-features branch from 8344487 to 988212a Compare January 9, 2025 16:40
@ryepup ryepup requested a review from dhurley January 15, 2025 18:48
This package is intended for use by any software that wants to
identify NGINX processes.

It's mostly a refactor of the existing `process.ProcessOperator` with
a few differences and some features wanted by NGINXaaS.

- minimizes `gopsutil` calls, many of which end up shelling out to
`ps`. `nginxprocess.List` filters out non-NGINX processes
early. `process.ProcessOperator` gathered details for all processes,
and then non-NGINX processes were filtered out later by
`instance.NginxProcessParser`. `nginxprocess.List` also fetches less
process information, keeping only what the agent codebase actually
used.

A quick benchmark on my laptop (~750 processes, 12 of which are NGINX)
showed a drastic speed improvement:

| Benchmark   | Iterations | Time (ns/op) | Memory (B/op) | Allocations (allocs/op) |
| ----------- | ---------- | ------------ | ------------- | ----------------------- |
| old         | 1          | 26832273195  | 18282768      | 133775                  |
| new         | 9          | 114558263    | 3439672       | 12836                   |

- uses a functional options pattern to optionally fetch process
status, which is only needed in a few places and saves some more
shelling out to `ps`. This could be expanded in the future to support
other options while retaining backwards compatibility.

- includes some error testing funcs to let callers handle different
errors without needing to know how those errors are implemented. This
helps trap `gopsutil` implementation details in `package nginxprocess`
and makes future `gopsutil` upgrades easier.

- respects context cancellation a little faster
Swap internal packages to use the shared `nginxprocess` package for
details about NGINX processes.

This is mostly switching `model.Process` for `nginxprocess.Process`,
updating tests, and using features of `nginxprocess. There is one
non-trivial change: `model.Process.Running` has been dropped, there is
no equivalent in `nginxprocess`.

`Running` was almost always true, because it was calculated immediately after
a successful call to `NewProcessWithContext`. In order for it to be
false, a process would have to end after `NewProcessWithContext`, but
before `IsRunningWithContext`, which is an incredibly tiny time
window. There is also no guarantee that the process is _still_ running
by the time callers check `Running`, so the field was a little
misleading. The system now relies solely on errors from
`NewProcessWithContext` to identify terminated processes. Worst case
we catch a dead process on the next tick of `HealthWatcherService`
Existing names were really similar to imported package names or other
variables.
@ryepup ryepup force-pushed the export-pid-features branch from 988212a to 9bc5a98 Compare January 17, 2025 16:21
@ryepup
Copy link
Contributor Author

ryepup commented Jan 17, 2025

Did a rebase, looks like there's some test flake around the otel collector? panic: close of closed channel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chore Pull requests for routine tasks documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants