newlog: add filter, dedup and counter functions #4702

Open
wants to merge 2 commits into
base: master

europaul
Contributor

Newlog will have 3 ways to reduce the amount of logs:

  • filter: filter logs out based on the source code line that produced them
  • counter: count the number of logs produced by a specific source code line. Add that number to the first occurrence of the log and remove the rest
  • deduplicator (for errors only): record the last X errors in a sliding window and remove duplicates

The benchmark of newlogd with the new features is in the dedup_test.go file. It shows that CPU and RAM usage increase by a factor of 3 when the features are enabled. For that reason they can be disabled by setting the deduplication window size to 0 and not providing anything to the filter and counter functions.
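The filter and counter described above can be sketched as a single pass over a batch of entries. This is only a minimal illustration, not the code from this PR: the `entry` type, the `reduce` function and the `file.go:line` style source keys are hypothetical names chosen for the example.

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for a log entry.
// Source identifies the source code line that produced it (assumed format).
type entry struct {
	Source string
	Msg    string
	Count  int // number of entries folded into this one (0 if not counted)
}

// reduce applies the filter and the counter to one batch of entries.
// drop lists sources whose logs are removed entirely;
// count lists sources whose logs are collapsed into the first occurrence.
func reduce(batch []entry, drop, count map[string]struct{}) []entry {
	out := make([]entry, 0, len(batch))
	first := make(map[string]int) // source -> index of its first occurrence in out
	for _, e := range batch {
		if _, ok := drop[e.Source]; ok {
			continue // filtered out
		}
		if _, ok := count[e.Source]; ok {
			if i, seen := first[e.Source]; seen {
				out[i].Count++ // fold into the first occurrence
				continue
			}
			first[e.Source] = len(out)
			e.Count = 1
		}
		out = append(out, e)
	}
	return out
}

func main() {
	batch := []entry{
		{Source: "a.go:10", Msg: "tick"},
		{Source: "b.go:20", Msg: "noise"},
		{Source: "a.go:10", Msg: "tick"},
		{Source: "c.go:30", Msg: "hello"},
		{Source: "a.go:10", Msg: "tick"},
	}
	drop := map[string]struct{}{"b.go:20": {}}
	count := map[string]struct{}{"a.go:10": {}}
	for _, e := range reduce(batch, drop, count) {
		fmt.Printf("%s %q count=%d\n", e.Source, e.Msg, e.Count)
	}
}
```

In this example the three `a.go:10` entries collapse into one entry with a count of 3, the `b.go:20` entry is dropped, and `c.go:30` passes through unchanged. With both sets empty every entry is appended as is, which corresponds to disabling a feature by not configuring it.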

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds new features to reduce log volume in Newlog by implementing three log reduction mechanisms: filtering, deduplication, and counting.

  • Introduces a deduplication feature for error logs using a ring buffer.
  • Implements a log counter that appends occurrence counts to log entries.
  • Adds configurable log filtering and updates global configuration settings accordingly.
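A sliding-window deduplicator backed by a ring buffer could look roughly like the sketch below. It is an illustration under assumptions, not the implementation in dedup.go: the `dedup` type, `newDedup` and `forward` are hypothetical names, and the caller is assumed to pass in only the entries that are subject to deduplication (error logs, per the PR description).

```go
package main

import "fmt"

// dedup is a hypothetical sliding-window deduplicator.
type dedup struct {
	ring   []string       // the last len(ring) keys; the oldest is overwritten first
	seen   map[string]int // key -> number of occurrences currently in the window
	next   int            // index in ring that will be overwritten next
	filled int            // number of slots in ring that hold a key
}

func newDedup(window int) *dedup {
	return &dedup{ring: make([]string, window), seen: make(map[string]int)}
}

// forward reports whether an entry with this key should be kept.
// Every key is recorded in the window, whether or not it is forwarded.
func (d *dedup) forward(key string) bool {
	if len(d.ring) == 0 {
		return true // a window size of 0 disables deduplication
	}
	dup := d.seen[key] > 0
	if d.filled == len(d.ring) {
		// The window is full: evict the oldest key before overwriting its slot.
		old := d.ring[d.next]
		d.seen[old]--
		if d.seen[old] == 0 {
			delete(d.seen, old)
		}
	} else {
		d.filled++
	}
	d.ring[d.next] = key
	d.seen[key]++
	d.next = (d.next + 1) % len(d.ring)
	return !dup
}

func main() {
	d := newDedup(2)
	for _, key := range []string{"A", "B", "A", "C", "D", "A"} {
		fmt.Printf("%s forwarded=%v\n", key, d.forward(key))
	}
}
```

With a window of 2, the second `A` is dropped because `A` is still among the last two keys, while the final `A` is forwarded again because it has left the window by then.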

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/newlog/cmd/dedup.go | Adds deduplication functions for real-time log reduction |
| pkg/newlog/cmd/counter.go | Implements a counter to tag and suppress duplicate log entries |
| pkg/newlog/cmd/dedup_test.go | Provides tests for deduplication logic |
| pkg/newlog/cmd/filter.go | Introduces log filtering using configurable filename filters |
| pkg/newlog/cmd/newlogd.go | Updates main log processing to integrate the new deduplication and filtering logic |
| pkg/pillar/types/global.go | Adds new global settings keys for deduplication, log counting, and filtering |
| pkg/newlog/cmd/newlogd_test.go | Minor test update for gzip log parsing |

@europaul europaul force-pushed the log-filtering-and-dedup branch from c0462b9 to e947722 Compare March 21, 2025 20:07
@europaul europaul requested a review from Copilot March 21, 2025 20:08
@Copilot Copilot AI left a comment

Pull Request Overview

This pull request adds three new log reduction mechanisms—filtering, deduplication, and counting—to help reduce excessive logs. Key changes include adding deduplication functionality with a sliding window (pkg/newlog/cmd/dedup.go), implementing log counting (pkg/newlog/cmd/counter.go), introducing log filtering (pkg/newlog/cmd/filter.go), and integrating these features via configuration into the log compression routine (pkg/newlog/cmd/newlogd.go).

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pkg/newlog/cmd/dedup.go | New deduplication logic with a sliding window for error log entries |
| pkg/newlog/cmd/counter.go | New log counting functionality to annotate log entries |
| pkg/newlog/cmd/dedup_test.go | Test cases for deduplication with an identified iteration bug |
| pkg/newlog/cmd/filter.go | New log filtering mechanism based on filename criteria |
| pkg/pillar/types/global.go | Added configuration items for deduplication and filtering settings |
| pkg/newlog/cmd/newlogd.go | Integrated log deduplication, counting, and filtering into file compression |
| pkg/newlog/cmd/newlogd_test.go | Test scaffolding for log compression and integration testing |

@europaul
Contributor Author

Waiting on #4703 to be merged to import the newer version of pillar with the right global config parameters.

@europaul europaul force-pushed the log-filtering-and-dedup branch from a2357dc to e2ff9b3 Compare March 28, 2025 09:56
@europaul
Contributor Author

Yetus seems to be bailing out again because the PR includes too many updated vendor dependencies.

@rene
Contributor

rene commented Mar 28, 2025

@eriknordmark could you rebase this PR, go modules for pkg/newlog were updated by dependabot...

This commit updates the go version to 1.24 and packages eve-api and
eve/pkg/pillar to latest versions.

Signed-off-by: Paul Gaiduk <[email protected]>
Newlog will have 3 ways to reduce the amount of logs:
- filter: filter logs out based on the source code line that
produced them
- counter: count the number of logs produced by a specific
source code line. Add that number to the first occurrence of the log
and remove the rest
- deduplicator (for errors only): record the last X errors in a sliding
window and remove duplicates

The benchmarking of the newlogd with the new features is in the
dedup_test.go file. It shows that CPU and RAM usage increase by a factor
of 3 when the features are enabled. So they can be disabled by setting
the deduplication window size to 0 and not providing anything to the
filter and counter functions.

Signed-off-by: Paul Gaiduk <[email protected]>
@europaul europaul force-pushed the log-filtering-and-dedup branch from e2ff9b3 to aac93d6 Compare March 28, 2025 14:46
@europaul
Contributor Author

> @eriknordmark could you rebase this PR, go modules for pkg/newlog were updated by dependabot...

@rene done

Contributor

@eriknordmark eriknordmark left a comment

I see yetus/golangcilint is failing with the output
pkg/newlog/level=error msg="Running error: context loading failed: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: downloading go1.24 (linux/amd64)\ngo: download go1.24 for linux/amd64: toolchain not available\n"
pkg/newlog/level=warning msg="Failed to discover go env: failed to run 'go env': exit status 1"

@europaul
Contributor Author

europaul commented Mar 28, 2025

> I see yetus/golangcilint is failing with the output
> pkg/newlog/level=error msg="Running error: context loading failed: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: downloading go1.24 (linux/amd64)\ngo: download go1.24 for linux/amd64: toolchain not available\n"
> pkg/newlog/level=warning msg="Failed to discover go env: failed to run 'go env': exit status 1"

@eriknordmark where can you see this? I cannot find the error

nvm, I found it in scan-results artefact

// done gzip conversion, get rid of the temp log file in collect directory
err = os.Remove(tmpLogfileInfo.tmpfile)
if err != nil {
	log.Fatal("doMoveCompressFile: remove file failed", err)
Contributor

Does this error really need to be fatal?

Contributor

I know you are only moving the code, but maybe it is time to change it...

@@ -1,14 +1,13 @@
module github.com/lf-edge/eve/pkg/newlog

go 1.23
toolchain go1.24.1
Contributor

@rene rene Mar 29, 2025

Just a note: the sources parser was updated in 6c40145, so the toolchain line is supported now.

@@ -1,14 +1,13 @@
module github.com/lf-edge/eve/pkg/newlog
Contributor

Please fix the typo in the commit message: "...go verison" -> "...go version"

@@ -1,14 +1,13 @@
module github.com/lf-edge/eve/pkg/newlog

go 1.23
toolchain go1.24.1
go 1.24
Contributor

I think it must be 1.24.1 due to security fixes....

@rene
Contributor

rene commented Mar 29, 2025

@europaul , LGTM, just left a few comments....

func addLogCount(logEntry *logs.LogEntry, filterMap map[string]int) bool {
	if count, ok := filterMap[logEntry.Filename]; !ok {
		return true
	} else {
Contributor

Suggested change
- 	} else {
+ 	}


// If the file hasn't appeared in the last bufferSize logs, forward it.
if _, ok := seen[dedupField]; ok && logEntry.severity == "error" {
	log.Tracef("Deduped log at %s because of the log at %s\n", logEntry.timestamp, seen[dedupField])
Contributor

Isn't this slow? Even if trace logs are disabled it gets called quite often, doesn't it?
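The cost raised in this question can be illustrated with a small sketch. In Go, the arguments of a call are evaluated before the call happens, so a disabled trace call still pays for computing its arguments, even if the logger returns early without formatting. The `logger` type, `traceEnabled` and `describe` below are hypothetical and stand in for whatever logger and arguments newlogd actually uses.

```go
package main

import "fmt"

type level int

const (
	levelError level = iota
	levelTrace
)

// logger is a hypothetical minimal leveled logger, not the one used by newlogd.
type logger struct {
	lvl       level
	formatted int // how many messages were actually formatted
}

func (l *logger) traceEnabled() bool { return l.lvl >= levelTrace }

func (l *logger) Tracef(format string, args ...any) {
	if !l.traceEnabled() {
		return // cheap early exit: no formatting happens
	}
	l.formatted++
	fmt.Printf(format, args...)
}

var describeCalls int

// describe stands in for an expensive argument computation.
func describe(ts int64) string {
	describeCalls++
	return fmt.Sprintf("t=%d", ts)
}

func main() {
	log := &logger{lvl: levelError} // trace is disabled

	// The argument is evaluated before the call, although nothing is printed.
	log.Tracef("deduped log at %s\n", describe(1))

	// Guarding the call skips the argument evaluation as well.
	if log.traceEnabled() {
		log.Tracef("deduped log at %s\n", describe(2))
	}
	fmt.Println("describe calls:", describeCalls, "formatted:", log.formatted)
}
```

Whether the overhead matters for the real `log.Tracef` call depends on how cheap its arguments are and how early the logger checks the level; a benchmark would be the way to settle that.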

}
}

func BenchmarkDoMoveCompressFile(b *testing.B) {
Contributor

Nice!

I guess we do not run it as a GH action, do we?


// deduplicateLogs can be used to deduplicate logs on the fly reading from a channel
// and writing to another channel
func deduplicateLogs(in <-chan inputEntry, out chan<- inputEntry) {
Contributor

Where is this actually used outside of tests?

}

// If the file hasn't appeared in the last bufferSize logs, forward it.
if _, ok := seen[dedupField]; ok && logEntry.Severity == "error" {
Contributor

logEntry.Severity == "error"
Does this mean that only error log messages are deduplicated? What is the reasoning behind it?

log.Functionf("handleGlobalConfigModify: gonna count the logs from the following lines %v", filenamesToCount)

// parse a comma separated list of log filenames to filter
newFilenameFilter := make(map[string]any)
Contributor

Suggested change
- newFilenameFilter := make(map[string]any)
+ newFilenameFilter := make(map[string]struct{})
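For context, `map[string]struct{}` is the usual Go idiom for a set, since the empty struct value occupies no storage and makes it explicit that only the keys matter. A minimal sketch of parsing a comma separated list into such a set follows; `parseSet` and the example keys are hypothetical, and the exact key format used by newlogd is an assumption here.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSet splits a comma separated list into a set,
// trimming whitespace and skipping empty items.
func parseSet(list string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, item := range strings.Split(list, ",") {
		if item = strings.TrimSpace(item); item != "" {
			set[item] = struct{}{}
		}
	}
	return set
}

func main() {
	filter := parseSet("zedagent.go:123, nim.go:45,,")
	_, ok := filter["nim.go:45"]
	fmt.Println(len(filter), ok) // prints "2 true"
}
```

Membership is tested with the two-value map lookup, and `len(set) == 0` can serve as the "feature disabled" check.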

Comment on lines +1338 to +1339
if dirName == uploadDevDir {
if len(filenameFilter.Load().(map[string]any)) != 0 || len(logCounter) != 0 || dedupWindowSize.Load() != 0 {
Contributor

Suggested change
- if dirName == uploadDevDir {
- 	if len(filenameFilter.Load().(map[string]any)) != 0 || len(logCounter) != 0 || dedupWindowSize.Load() != 0 {
+ if dirName == uploadDevDir &&
+ 	len(filenameFilter.Load().(map[string]any)) != 0 ||
+ 	len(logCounter) != 0 ||
+ 	dedupWindowSize.Load() != 0 {
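One thing to keep in mind when merging nested conditions like this: in Go, `&&` binds tighter than `||`, so a merged condition written without parentheses is parsed as `(dirName == uploadDevDir && A) || B || C`, which differs from the nested form. The sketch below uses hypothetical boolean parameters in place of the real expressions to show the difference.

```go
package main

import "fmt"

// nested mirrors a two-level if: the inner checks only matter when outer holds.
func nested(outer, a, b, c bool) bool {
	if outer {
		if a || b || c {
			return true
		}
	}
	return false
}

// flat mirrors a merge without parentheses: parsed as (outer && a) || b || c.
func flat(outer, a, b, c bool) bool {
	return outer && a || b || c
}

// grouped keeps the meaning of the nested form.
func grouped(outer, a, b, c bool) bool {
	return outer && (a || b || c)
}

func main() {
	// outer is false, but one of the inner conditions is true.
	fmt.Println(nested(false, false, true, false))  // false
	fmt.Println(flat(false, false, true, false))    // true
	fmt.Println(grouped(false, false, true, false)) // false
}
```

Wrapping the three `||` operands in parentheses keeps the single-`if` form equivalent to the original nested one.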

continue // we don't care about the error here
}
var useEntry bool
if useEntry = !filterOut(&logEntry); !useEntry {
Contributor

Suggested change
- if useEntry = !filterOut(&logEntry); !useEntry {
+ if useEntry = filterOut(&logEntry); useEntry {

Contributor

or perhaps directly:

Suggested change
- if useEntry = !filterOut(&logEntry); !useEntry {
+ if filterOut(&logEntry) {
