Commit 357a36e

Fix non-deterministic fingerprint for nested hashes by sorting
Consider this example:

```JSON
{
  "@timestamp": "2023-05-23T23:23:23.555Z",
  "@Version": "1",
  "beat": {
    "hostname": "gnu.example.com",
    "name": "gnu.example.com",
    "version": "5.2.2"
  },
  "host": "gnu.example.com"
}
```

Using this filter:

```Logstash
fingerprint {
  concatenate_all_fields => true
  target => "[@metadata][_id]"
  method => "SHA512"
  key => "XXX"
  base64encode => true
}
```

Here, the order of the `.beat` hash is non-deterministic and the plugin did not do a deep sort as part of the serialization. This resulted in different fingerprints for the same event because the order of the three keys (hostname, name, version) changed randomly in the serialization.

This has been fixed by recursively checking for hashes and serializing them in sorted order.

Note that this changes the serialization format and thus breaks backwards compatibility. The old format could be emulated in order to not break backwards compatibility. Backwards compatibility in this case means to generate the same fingerprint for the same input.

Closes: #39
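The effect of the deep sort can be checked outside Logstash. The following is a minimal sketch, not the plugin itself: it assumes plain Ruby hashes in place of `LogStash::Event`, and the `fingerprint` helper is a hypothetical stand-in for the plugin's keyed SHA512 path with `base64encode => true`.

```ruby
require "openssl"
require "base64"

# Recursive serializer in the spirit of this commit, applied to plain Ruby
# hashes (an assumption; the plugin passes a LogStash::Event).
def serialize(value)
  if value.respond_to?(:to_hash)
    # Sort by key at every nesting level so insertion order cannot leak in.
    body = value.to_hash.sort.map { |k, v| "#{k}:#{serialize(v)}," }.join
    "{#{body}}"
  else
    value.to_s
  end
end

# Hypothetical helper: HMAC-SHA512 over the serialized event, Base64-encoded.
def fingerprint(event, key)
  data = "#{serialize(event)}|"
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA512"), key, data)
  Base64.strict_encode64(digest)
end

# Same event, with the keys of "beat" inserted in a different order.
first = {
  "host" => "gnu.example.com",
  "beat" => { "hostname" => "gnu.example.com", "name" => "gnu.example.com", "version" => "5.2.2" }
}
second = {
  "beat" => { "version" => "5.2.2", "name" => "gnu.example.com", "hostname" => "gnu.example.com" },
  "host" => "gnu.example.com"
}

puts serialize(first)
puts fingerprint(first, "XXX") == fingerprint(second, "XXX")
```

Both hashes serialize to the same string, so the keyed digest is identical regardless of how the nested keys were ordered on input.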
1 parent aa7d522 commit 357a36e

1 file changed

Lines changed: 17 additions & 4 deletions

lib/logstash/filters/fingerprint.rb

@@ -104,6 +104,21 @@ class << self; alias_method :fingerprint, :fingerprint_openssl; end
     end
   end
 
+  def serialize(event)
+    to_string = ""
+    if event.respond_to?(:to_hash)
+      to_string << "{"
+      event.to_hash.sort.map do |k,v|
+        to_string << "#{k}:#{serialize(v)},"
+      end
+      to_string << "}"
+    else
+      to_string << "#{event}"
+    end
+
+    return to_string
+  end
+
   def filter(event)
     case @method
     when :UUID
@@ -120,12 +135,10 @@ def filter(event)
       if @concatenate_sources || @concatenate_all_fields
         to_string = ""
         if @concatenate_all_fields
-          event.to_hash.sort.map do |k,v|
-            to_string << "|#{k}|#{v}"
-          end
+          to_string << serialize(event)
         else
           @source.sort.each do |k|
-            to_string << "|#{k}|#{event.get(k)}"
+            to_string << "|#{k}|#{serialize(event.get(k))}"
           end
         end
         to_string << "|"
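To make the compatibility break in the `concatenate_all_fields` branch concrete, here is a rough side-by-side sketch of the removed and added string formats. It assumes plain Ruby hashes rather than `LogStash::Event`; `legacy_concat` and `sorted_concat` are hypothetical names for this illustration and do not exist in the plugin. Plain Ruby hashes keep insertion order, so the non-determinism is simulated by building the nested hash in two different orders.

```ruby
# Old format (the removed lines): only top-level keys are sorted and each
# value is interpolated as-is, so a nested hash keeps its own key order.
def legacy_concat(hash)
  hash.sort.map { |k, v| "|#{k}|#{v}" }.join + "|"
end

# New format (the added lines): hashes are serialized recursively with
# sorted keys at every level.
def serialize(value)
  if value.respond_to?(:to_hash)
    "{" + value.to_hash.sort.map { |k, v| "#{k}:#{serialize(v)}," }.join + "}"
  else
    value.to_s
  end
end

def sorted_concat(hash)
  serialize(hash) + "|"
end

flat = { "host" => "gnu.example.com", "@version" => "1" }
puts legacy_concat(flat)  # |@version|1|host|gnu.example.com|
puts sorted_concat(flat)  # {@version:1,host:gnu.example.com,}|

# Same nested content, different insertion order.
a = { "beat" => { "name" => "gnu", "version" => "5.2.2" } }
b = { "beat" => { "version" => "5.2.2", "name" => "gnu" } }
puts legacy_concat(a) == legacy_concat(b)  # false
puts sorted_concat(a) == sorted_concat(b)  # true
```

Even for a flat event the two formats produce different input strings, which is why fingerprints computed before and after this commit do not match.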
