@kellybyrd commented Sep 14, 2025
Contributor

NOTE: This doesn't fix anything that is broken. It is just an attempt to get ahead of ntopng deprecating old probe formats, so hopefully we don't have to react to new versions of ntopng as quickly. Let me know if you'd rather not merge this.

Update the ZMQ header to the latest version as of ntopng 6.4. The goal here is just to use the latest thing so we have longer before we have to worry about being deprecated.

The work here was:

  • New header with new version constant

  • The new header requires knowing the compressed and uncompressed sizes, so the compression code moved from Format to Transport. We're still only compressing JSON.
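
To illustrate why compression belongs next to the header construction, here is a minimal Go sketch. It is not the code from this PR: the `payload` type and `preparePayload` function are hypothetical names, and the sketch deliberately does not reproduce the zmq_message_header_v3 layout. It only assumes that whoever builds the header needs both the uncompressed and the compressed length.

```go
package main

import (
	"bytes"
	"compress/zlib"
)

// payload is a hypothetical container used only for this illustration.
type payload struct {
	body            []byte // bytes that would go on the wire
	uncompressedLen uint32 // size before compression
	compressedLen   uint32 // size after compression; 0 when not compressed (an assumption)
}

// preparePayload optionally zlib-compresses data and records both sizes,
// so a header builder can read them without inspecting the body again.
func preparePayload(data []byte, compress bool) (payload, error) {
	p := payload{body: data, uncompressedLen: uint32(len(data))}
	if !compress {
		return p, nil
	}

	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return payload{}, err
	}
	// Close flushes the remaining compressed bytes; the compressed
	// length is only final after Close returns.
	if err := w.Close(); err != nil {
		return payload{}, err
	}

	p.body = buf.Bytes()
	p.compressedLen = uint32(len(p.body))
	return p, nil
}
```

A transport would call `preparePayload(data, true)` for compressed JSON and then copy the two lengths into whatever header structure it uses. If the formatter did the compression instead, the transport would only ever see the compressed bytes and could not report the original size.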

TESTING DONE:
Tested TLV, JSON, and compressed JSON with ntopng 6.4 built from source. Saw flows with all three. In order to get 6.4 to work with compressed JSON, I had to fix a bug in ntopng as well as disable the pro license checks. Without my ntopng fix, I cannot see how ntopng decompresses anything that has zmq_message_header_v3.

I have opened an issue on the ntopng repo, and they confirmed it is fixed in their dev branch.

codecov bot commented Sep 14, 2025

Codecov Report

❌ Patch coverage is 0% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (ef2beb5) to head (ef23664).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| transport/zmq.go | 0.00% | 56 Missing ⚠️ |
| cmd/netflow2ng.go | 0.00% | 1 Missing ⚠️ |
| formatter/ntopng_json.go | 0.00% | 1 Missing ⚠️ |

@@          Coverage Diff          @@
##            main    #111   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files          7       7           
  Lines        826     842   +16     
=====================================
- Misses       826     842   +16     

| Files with missing lines | Coverage Δ |
|---|---|
| cmd/netflow2ng.go | 0.00% <0.00%> (ø) |
| formatter/ntopng_json.go | 0.00% <0.00%> (ø) |
| transport/zmq.go | 0.00% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@kellybyrd
Contributor Author

@synfinatic : No hurry on this. I want to test it more, but wanted to get eyes on it too.

One more thing we may want to do is make TLV the default instead of JSON. When testing this, I had to comment out this code in ntopng:

u_int8_t ZMQParserInterface::parseJSONFlow(const char *payload,
                                           int payload_size, u_int32_t source_id,
                                           u_int32_t msg_id) {
  json_object *f = NULL;
  enum json_tokener_error jerr = json_tokener_success;

#ifndef NTOPNG_PRO
  /*
    nProbe exports flows in TLV so this code will be removed in the future
    Leaving here for old nProbes that will be discontinued soon
  */
  return(0);
#endif

That code was added in commit 0324a16684 on 2023-06-01, so it's been two years already.

@kellybyrd kellybyrd mentioned this pull request Sep 15, 2025
@kellybyrd kellybyrd marked this pull request as draft September 15, 2025 18:21
@kellybyrd kellybyrd force-pushed the ntopng_zmqheader_v3 branch 3 times, most recently from 80160c7 to f92ba71 Compare September 18, 2025 20:18
@kellybyrd
Contributor Author

I've rebased and will test this soon, then turn this back into a real PR.

@kellybyrd kellybyrd marked this pull request as ready for review September 18, 2025 21:48
@synfinatic
Owner

Not sure if this is related to this PR, but I'm seeing an issue where traffic is no longer being processed, i.e.:

netflow2ng  | level=debug msg="Netflow packet received before template was set. Discarding"
netflow2ng  | level=debug msg="Netflow packet received before template was set. Discarding"
netflow2ng  | level=debug msg="Netflow packet received before template was set. Discarding"
netflow2ng  | level=info msg="Sending first ZMQ message."
netflow2ng  | level=debug msg="Sending ZMQ message id 1000."
netflow2ng  | level=debug msg="Netflow packet received before template was set. Discarding"
netflow2ng  | level=debug msg="Netflow packet received before template was set. Discarding"
netflow2ng  | level=debug msg="Netflow packet received before template was set. Discarding"

the first discarding messages are expected at startup, but the latter ones seem to be a bug?

@kellybyrd
Contributor Author

Definitely feels like a bug, unless the router has diff templates for ipv4 and ipv6?
I'll try to repro it.

@kellybyrd
Contributor Author

I can't repro this, so I'm going to try to put better logging around it. My theory is that the router sends multiple templates and netflow2ng is getting packets for ipv4 or ipv6 before the respective template has been received. I don't have ipv6 working right now, so I haven't proved this yet.
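
To make that theory concrete, here is a small Go sketch of a per-template lookup. It is an illustration only and does not use the netflow library's real types: `templateCache`, `errTemplateNotFound`, and the template ids 256 and 257 are all hypothetical. The point is that a data record can be decoded only once the template with the same id has arrived, so one address family can keep failing while the other already works.

```go
package main

import (
	"errors"
	"fmt"
)

// errTemplateNotFound is a hypothetical sentinel error for this sketch.
var errTemplateNotFound = errors.New("template not found")

// templateCache maps a template id to a simplified template
// (here just a field count). It is not safe for concurrent use.
type templateCache struct {
	templates map[uint16]int
}

func newTemplateCache() *templateCache {
	return &templateCache{templates: make(map[uint16]int)}
}

// addTemplate records a template announced by the exporter.
func (c *templateCache) addTemplate(id uint16, fieldCount int) {
	c.templates[id] = fieldCount
}

// decode looks up the template for a data record. The returned error
// wraps the sentinel and includes the missing template id.
func (c *templateCache) decode(id uint16) (int, error) {
	n, ok := c.templates[id]
	if !ok {
		return 0, fmt.Errorf("template id %d: %w", id, errTemplateNotFound)
	}
	return n, nil
}

// isTemplateNotFound reports whether err wraps errTemplateNotFound.
func isTemplateNotFound(err error) bool {
	return errors.Is(err, errTemplateNotFound)
}
```

If the exporter has announced only template 256, records for 256 decode while records for 257 keep returning an error that names id 257. Logging the wrapped error is what would show whether a single template id accounts for all the discards.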

@synfinatic
Owner

@kellybyrd do you have some improved logging code you want to share? happy to try on my end where i run both v4 & v6

@kellybyrd
Contributor Author

This will log the error message from the template system, which shows the template number. It also removes some other logs from Trace level to make it easier to see the "TemplateNotFound" errors.

If the issue is just ipv6 packets arriving before the ipv6 template, I expect that after the "Sending first ZMQ message." log you will see TemplateNotFound for only a single template id.

diff --git a/cmd/netflow2ng.go b/cmd/netflow2ng.go
index 8ae08af..2b0e2f8 100644
--- a/cmd/netflow2ng.go
+++ b/cmd/netflow2ng.go
@@ -303,6 +303,7 @@ func main() {
                                        } else {
                                                if errors.Is(err, netflow.ErrorTemplateNotFound) {
                                                        log.Debug("Netflow packet received before template was set. Discarding")
+                                                       log.Trace("More info: ", err)
                                                } else if errors.Is(err, debug.PanicError) {
                                                        var pErrMsg *debug.PanicErrorMessage
                                                        log.Error("Intercepted panic", pErrMsg)
diff --git a/transport/zmq.go b/transport/zmq.go
index ebf8b84..3b25d08 100644
--- a/transport/zmq.go
+++ b/transport/zmq.go
@@ -14,7 +14,6 @@ import (
        "bytes"
        "compress/zlib"
        "encoding/binary"
-       "encoding/hex"
        "fmt"
        "sync"
        "time"
@@ -142,15 +141,15 @@ func (d *ZmqDriver) Send(key, data []byte) error {

        switch d.msgType {
        case PBUF:
-               log.Tracef("Sent %d bytes of pbuf:\n%s", orig_len, hex.Dump(data))
+               //log.Tracef("Sent %d bytes of pbuf:\n%s", orig_len, hex.Dump(data))
        case JSON:
                if d.compress {
-                       log.Tracef("Sent %d bytes of zlib json:\n%s", compressed_len, hex.Dump(data))
+                       //log.Tracef("Sent %d bytes of zlib json:\n%s", compressed_len, hex.Dump(data))
                } else {
-                       log.Tracef("Sent %d bytes of json: %s", orig_len, string(data))
+                       //log.Tracef("Sent %d bytes of json: %s", orig_len, string(data))
                }
        case TLV:
-               log.Tracef("Sent %d bytes of ntop tlv:\n%s", orig_len, hex.Dump(data))
+               //log.Tracef("Sent %d bytes of ntop tlv:\n%s", orig_len, hex.Dump(data))
        default:
                log.Errorf("Sent %d bytes of unknown message type %d", orig_len, d.msgType)

@kellybyrd
Contributor Author

So....I can't repro this. I did this:

  • Got two VMs on diff ipv6 subnets pinging each other
  • Proved with a diff instance of netflow2ng and ntop that I could see the ping
  • Tried running this version of netflow2ng and I don't see template errors after: INFO Sending first ZMQ message.

I did see that the check for first send happens outside the mutex, so maybe that's the cause? I'll push an update

Update the ZMQ header to the latest version as of ntopng 6.4.
The goal here is to just use the latest thing so we have longer
before we have to worry about being deprecated.

The work here was:
* New header with new version constant

* New header requires knowing compress/uncompressed sizes so move
  the compress code from Format to Transport. We're still only
  compressing JSON.

TESTING DONE:
Tested TLV, JSON, and compressed JSON with ntopng 6.4 built from
source. Saw flows with all three. In order to get 6.4 to work with
compressed JSON, I had to fix a bug in ntopng as well as disable
the pro license checks.

Without this fix, I cannot see how ntopng decompresses anything
that has zmq_message_header_v3. I confirmed with them this is
fixed in their dev branch, but broken in v6.4.

@kellybyrd
Contributor Author

The latest code in the branch has the "Sending first ZMQ message" inside the mutex and also has some TRACE logs for the missing template case that should contain the template id that is missing.
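
For readers following along, here is a minimal Go sketch of the pattern described above: checking and setting a "first send" flag while holding the mutex. The `sender` type and its fields are hypothetical and are not the actual `ZmqDriver` code. The sketch assumes only that sends can happen from more than one goroutine.

```go
package main

import "sync"

// sender is a hypothetical stand-in for a transport driver.
type sender struct {
	mu        sync.Mutex
	sentFirst bool
	msgID     uint32
	firstLogs int // how many times the "first message" branch ran
}

// send bumps the message id and reports whether this call was the first send.
// The check and the update of sentFirst both happen while holding mu, so two
// concurrent callers cannot both observe sentFirst == false.
func (s *sender) send() (id uint32, first bool) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if !s.sentFirst {
		s.sentFirst = true
		s.firstLogs++ // a real driver would log "Sending first ZMQ message." here
		first = true
	}
	s.msgID++
	return s.msgID, first
}
```

If the `!s.sentFirst` check ran before `s.mu.Lock()`, two goroutines could both see `false` and both log the first-message line. That would be a data race on the flag, although whether it explains the behavior reported earlier in this thread is not established.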
