
@jostaub (Contributor) commented Aug 4, 2025

This change updates the OpenVAS parser to fix the issues described in #12378. This PR was initially planned as an update to the existing parser. However, given the number of changes and the problems they could cause for existing users, we decided to move the changes into a v2 version of the parser (see linked issue).

Detailed changes:
General:

  • Updated the CSV implementation to behave more like the XML parser.
  • Introduced de-duplication using unique_id_from_tool to the OpenVAS parser.
  • Increased behavior consistency between the CSV and XML parsers.
  • Combined findings where the only differences are in fields that can’t be reliably hashed due to inconsistent values between scans, e.g. timestamps.
  • The parser now combines multiple identical findings with different endpoints into one finding with multiple endpoints.
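
To illustrate the last point, here is a minimal sketch of how identical findings on different endpoints could be collapsed into one. It is not the code from this PR: the `Finding` dataclass, the `merge_findings` helper, the row keys, and the choice of grouping key are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    # Simplified stand-in for a finding; NOT the real DefectDojo model.
    title: str
    severity: str
    vuln_id: str
    endpoints: list = field(default_factory=list)


def merge_findings(rows):
    """Collapse rows that share (vuln_id, title, severity) into one finding.

    The grouping key is an assumption made for this sketch.
    """
    merged = {}
    for row in rows:
        key = (row["vuln_id"], row["title"], row["severity"])
        finding = merged.get(key)
        if finding is None:
            finding = Finding(row["title"], row["severity"], row["vuln_id"])
            merged[key] = finding
        endpoint = f'{row["host"]}:{row["port"]}'
        # Avoid listing the same endpoint twice on one finding.
        if endpoint not in finding.endpoints:
            finding.endpoints.append(endpoint)
    return list(merged.values())


# Hypothetical sample rows, not taken from a real scan report.
rows = [
    {"vuln_id": "oid-1", "title": "Weak cipher", "severity": "Medium", "host": "10.0.0.1", "port": 443},
    {"vuln_id": "oid-1", "title": "Weak cipher", "severity": "Medium", "host": "10.0.0.2", "port": 443},
    {"vuln_id": "oid-2", "title": "Open telnet", "severity": "High", "host": "10.0.0.1", "port": 23},
]
findings = merge_findings(rows)
```

With the sample rows above, the two "Weak cipher" rows end up as a single finding carrying both endpoints.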

CSV Parser:

  • removed ip from description
  • added qod to description

XML Parser:

  • finding name no longer includes ip and protocol
  • parser no longer appends extra information to the description (same description behavior as csv)
  • severity now maps to cvss v3 score
  • the description xml tag now maps to reference
  • the summary inside the xml tag (part of nvt tag) now maps to description
  • impact is now included in finding
  • same qod behavior as csv parser
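
As a rough illustration of the severity mapping, the sketch below converts a CVSS v3 base score into a severity label using the standard CVSS v3 qualitative rating bands. The function name `severity_from_cvss3` is hypothetical, and whether the parser uses exactly these bands is an assumption.

```python
def severity_from_cvss3(score: float) -> str:
    """Map a CVSS v3 base score to a severity label.

    Uses the standard CVSS v3 qualitative bands; a score of 0.0
    ("None" in CVSS terms) is reported as "Info" here, which is an
    assumption for this sketch.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS v3 score out of range: {score}")
    if score == 0.0:
        return "Info"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```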

TODOs:
[x] extract more information from xml
~~[ ] migration for OpenVAS parser~~
[x] improve testing with better test files

Open Questions:

  1. My current implementation misuses the unique_id_from_tool variable a bit. Is this OK?
  2. I currently deduplicate on port inside the parser (also using the unique_id_from_tool). Is this Ok and/or desired at all?

@github-actions github-actions bot added the settings_changes (Needs changes to settings.py based on changes in settings.dist.py included in this PR), unittests, and parser labels Aug 4, 2025
@github-actions github-actions bot added the docs label Aug 30, 2025
@@ -1348,6 +1348,7 @@ def saml2_attrib_map_format(din):
"Qualys Hacker Guardian Scan": ["title", "severity", "description"],
"Cyberwatch scan (Galeax)": ["title", "description", "severity"],
"Cycognito Scan": ["title", "severity"],
"OpenVAS Parser v2": ["title", "unique_id_from_tool", "vuln_id_from_tool"],
Member

unique_id_from_tool should not be in here, you should set DEDUPLICATION_ALGORITHM_PER_PARSER to DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE for this v2 parser.
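
A minimal sketch of what that could look like in the settings file. The two dictionaries below are reduced to the v2 entry only, the constant's string value is assumed to match the one in settings.dist.py, and the remaining hash code fields are simply what is left after dropping unique_id_from_tool from the line in this diff.

```python
# Assumed to match the constant defined in settings.dist.py.
DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE = "unique_id_from_tool_or_hash_code"

# Hash code fields for the v2 parser, without unique_id_from_tool.
HASHCODE_FIELDS_PER_SCANNER = {
    "OpenVAS Parser v2": ["title", "vuln_id_from_tool"],
}

# Deduplicate on unique_id_from_tool first, falling back to the hash code.
DEDUPLICATION_ALGORITHM_PER_PARSER = {
    "OpenVAS Parser v2": DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE,
}
```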

@valentijnscholten (Member)

Open Questions:

My current implementation misuses the unique_id_from_tool variable a bit. Is this OK?
I currently deduplicate on port inside the parser (also using the unique_id_from_tool). Is this Ok and/or desired at all?

Using the port could work, but I don't see where you're using it?

The unique_id_from_tool must contain a value that is present in the report as it's meant to link the Finding to something that is an ID in the scanner/tool we're importing from. I have been arguing for an extra field that could contain a unique_id_from_parser, but not everyone agrees with me. Could you explain what problem gets solved by letting the parser generate an ID for deduplication (outside the parser)?
