Improve GraphQL fingerprinting with specificity scoring by Copilot · Pull Request #5 · kingthorin/zap-extensions

Copilot · 2026-02-06T13:58:28Z

Overview

Current GraphQL fingerprinting uses sequential pattern matching with a first-match-wins approach, so a generic pattern that is registered early can shadow a more specific one and produce false positives.

This PR assigns each framework fingerprint a specificity score and runs the checks in descending score order, so that specific patterns are tried before generic ones.

Key Changes

1. Specificity Scoring System

Each framework fingerprint is assigned a specificity score (0-100) based on pattern uniqueness:

90-95: Highly specific (unique error patterns) - tartiflette, hasura, dgraph, directus, inigo
80-89: Very specific (distinctive patterns) - graphql-by-pop, wpgraphql, absinthe, lacinia, sangria
70-79: Moderately specific (language/framework patterns) - caliban, strawberry, ariadne, graphql-java, graphql-dotnet, graphql-ruby, graphql-php, gqlgen, graphql-go, juniper, hotchocolate, pg_graphql, tailcall
60-69: Generic (common GraphQL errors) - graphene, graphql-yoga, aws-appsync, hypergraphql, graphql-flutter, dianajl, morpheus, apollo, mercurius, jaal, agoo
50-59: Very generic (prone to false positives) - lighthouse
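The tiers above can be read as a simple lookup from score to label. The following is a minimal sketch of that mapping; `SpecificityTiers` and `tierOf` are hypothetical names that do not appear in the PR, and treating scores outside 50-95 as "Unclassified" is an assumption, since the description defines no tier for them.

```java
/** Hypothetical helper, not part of the PR: maps a score to the tier names listed above. */
final class SpecificityTiers {

    /** Returns the tier label for a score, or "Unclassified" outside the listed 50-95 bands. */
    static String tierOf(int score) {
        if (score < 0 || score > 100) {
            throw new IllegalArgumentException("score must be in [0, 100]: " + score);
        }
        if (score > 95 || score < 50) {
            return "Unclassified"; // assumption: the description defines no tier here
        }
        if (score >= 90) return "Highly specific";
        if (score >= 80) return "Very specific";
        if (score >= 70) return "Moderately specific";
        if (score >= 60) return "Generic";
        return "Very generic"; // remaining band is 50-59
    }

    public static void main(String[] args) {
        // Scores taken from the PR description: tartiflette is 95, lighthouse is 50.
        System.out.println("tartiflette (95): " + tierOf(95));
        System.out.println("lighthouse (50): " + tierOf(50));
    }
}
```

A helper like this would only be useful for logging or documentation; the PR itself uses the score purely for ordering.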

2. Descending Order Iteration

Fingerprint checks are executed in descending specificity order:

Frameworks with higher scores are checked first
Returns immediately upon first match (early-exit optimization)
Prevents generic patterns from shadowing specific ones
Specificity scores are used only for ordering, not as a threshold
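To illustrate why ordering matters, here is a small self-contained sketch comparing first-match-wins in registration order with first-match-wins in descending score order. The engine names `generic-engine` and `specific-engine`, the `Check` record, and both helper methods are hypothetical; the sketch assumes both checks would match the same response.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

final class OrderingDemo {

    /** Stand-in for the PR's FingerprintCheck record (hypothetical, no validation). */
    record Check(BooleanSupplier check, int specificityScore) {}

    /** Old behaviour: first match in registration (insertion) order wins. */
    static String firstMatchInInsertionOrder(Map<String, Check> checks) {
        for (Map.Entry<String, Check> entry : checks.entrySet()) {
            if (entry.getValue().check().getAsBoolean()) {
                return entry.getKey();
            }
        }
        return null;
    }

    /** New behaviour: sort by score (descending), then first match wins. */
    static String firstMatchByDescendingScore(Map<String, Check> checks) {
        return checks.entrySet().stream()
                .sorted(Map.Entry.comparingByValue(
                        Comparator.comparingInt(Check::specificityScore).reversed()))
                .filter(entry -> entry.getValue().check().getAsBoolean())
                .map(Map.Entry::getKey)
                .findFirst()
                .orElse(null);
    }

    /** Two checks that both match; the generic one happens to be registered first. */
    static Map<String, Check> sampleChecks() {
        Map<String, Check> checks = new LinkedHashMap<>();
        checks.put("generic-engine", new Check(() -> true, 55));
        checks.put("specific-engine", new Check(() -> true, 95));
        return checks;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchInInsertionOrder(sampleChecks()));
        System.out.println(firstMatchByDescendingScore(sampleChecks()));
    }
}
```

With insertion order the generic check shadows the specific one; with descending score order the specific check is evaluated first and wins.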

3. Record-Based Type Safety

Introduced FingerprintCheck record to encapsulate:

Fingerprint check function (BooleanSupplier)
Specificity score with validation (0-100 range)

Architecture

private record FingerprintCheck(BooleanSupplier check, int specificityScore) {
    // Validates score is in range [0, 100]
    public FingerprintCheck {
        if (specificityScore < 0 || specificityScore > 100) {
            throw new IllegalArgumentException(...);
        }
    }
}

public void fingerprint() {
    String detectedFramework = performPatternBasedDetection();
    if (detectedFramework != null) {
        raiseAlertForFramework(detectedFramework);
    }
    matchedString = null;
    queryCache.clear();
}

private String performPatternBasedDetection() {
    // Register all fingerprints with scores
    Map<String, FingerprintCheck> fingerprinters = new LinkedHashMap<>();
    fingerprinters.put("tartiflette", new FingerprintCheck(this::checkTartifletteEngine, 95));
    // ... more frameworks
    
    // Sort by score (descending) and check in order
    var sortedFingerprinters = fingerprinters.entrySet().stream()
        .sorted(Map.Entry.comparingByValue(
            Comparator.comparingInt(FingerprintCheck::specificityScore).reversed()))
        .toList();
    
    // Return first match - any successful check is treated as a match
    for (var fingerprinter : sortedFingerprinters) {
        if (fingerprinter.getValue().check().getAsBoolean()) {
            return fingerprinter.getKey();
        }
    }
    return null;
}
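As a rough illustration of the record's validation, the sketch below rebuilds a `FingerprintCheck`-shaped record and probes which scores its compact constructor accepts. The exception message is an assumption (the description elides it), and `FingerprintCheckDemo` and `isAccepted` are hypothetical names used only for this example.

```java
import java.util.function.BooleanSupplier;

final class FingerprintCheckDemo {

    /** Mirrors the shape of the record above; the message text is an assumption. */
    record FingerprintCheck(BooleanSupplier check, int specificityScore) {
        FingerprintCheck {
            if (specificityScore < 0 || specificityScore > 100) {
                throw new IllegalArgumentException(
                        "specificityScore must be in [0, 100]: " + specificityScore);
            }
        }
    }

    /** Returns true if a check with the given score can be constructed. */
    static boolean isAccepted(int score) {
        try {
            new FingerprintCheck(() -> false, score);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (int score : new int[] {-1, 0, 50, 100, 101}) {
            System.out.println(score + " accepted: " + isAccepted(score));
        }
    }
}
```

Because the validation lives in the compact constructor, an invalid score fails at registration time rather than during detection.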

Benefits

Improved Accuracy: Specific patterns checked before generic ones, reducing false positives
Performance: Early-exit on first match (no need to check all patterns)
Maintainability: Clear scoring system makes it easy to understand pattern reliability
Type Safety: Record validation ensures scores are always in valid range
Clear Semantics: Updated JavaDoc and logging to accurately reflect that specificity scores determine execution order only

Backward Compatibility

All existing check*Engine() methods remain unchanged
Scoring system only affects execution order, not detection logic
No changes to alert generation or reporting
Any successful check is reported (no minimum score threshold)

Testing

Existing unit tests continue to pass
Lighthouse (score 50) detection preserved despite low score
All 36 framework detections remain functional

Related Issues

N/A

Original prompt

Improve GraphQL Fingerprinting with Header-Based Detection and Specificity Scoring

Overview

Enhance the GraphQL add-on's fingerprinting mechanism to use a tiered detection strategy that leverages HTTP response headers and implements specificity scoring for pattern-based detection. This improves both speed and accuracy of framework identification.

Current Issues

No header-based detection: The current implementation only uses error message patterns, missing fast and reliable header-based indicators
First-match-wins approach: Uses LinkedHashMap with break on first match, which can cause generic patterns to shadow specific ones
No confidence scoring: All detections are treated equally regardless of pattern specificity
Inefficient: Sends multiple queries even when headers provide definitive answers

Changes Required

1. Add Detection Result Records

Create two new record types in GraphQlFingerprinter.java:

/**
 * Result of framework detection with confidence score.
 * 
 * @param framework The detected GraphQL framework name, or null if none detected
 * @param confidence Confidence score (0-100):
 *                   <ul>
 *                   <li>100 = Definitive (proprietary headers/signatures)</li>
 *                   <li>70-90 = Strong hint (framework branding in standard headers)</li>
 *                   <li>40-69 = Weak hint (generic tech stack indicators)</li>
 *                   <li>0-39 = No useful headers detected</li>
 *                   </ul>
 */
private record DetectionResult(String framework, int confidence) {
    
    private static final int MIN_CONFIDENCE = 0;
    private static final int MAX_CONFIDENCE = 100;
    private static final int DEFINITIVE_THRESHOLD = 100;
    private static final int STRONG_HINT_MIN = 70;
    private static final int WEAK_HINT_MIN = 40;
    
    /**
     * Creates a detection result with validation.
     * 
     * @throws IndexOutOfBoundsException if confidence is not in range [0, 100]
     */
    public DetectionResult {
        // checkIndex enforces 0 <= confidence < MAX_CONFIDENCE + 1, i.e. [0, 100]
        Objects.checkIndex(confidence, MAX_CONFIDENCE + 1);
    }
    
    /** 
     * Returns true if this is a definitive detection requiring no verification.
     * Definitive detections have confidence = 100.
     */
    public boolean isDefinitive() {
        return confidence == DEFINITIVE_THRESHOLD;
    }
    
    /** 
     * Returns true if this is a strong hint requiring quick verification.
     * Strong hints have confidence in range [70, 100).
     */
    public boolean isStrongHint() {
        return confidence >= STRONG_HINT_MIN && confidence < DEFINITIVE_THRESHOLD;
    }
    
    /** 
     * Returns true if this is a weak hint requiring full verification.
     * Weak hints have confidence in range [40, 70).
     */
    public boolean isWeakHint() {
        return confidence >= WEAK_HINT_MIN && confidence < STRONG_HINT_MIN;
    }
    
    /** 
     * Returns true if confidence is below threshold for actionable detection.
     * No useful detection has confidence in range [0, 40).
     */
    public boolean isNoUsefulDetection() {
        return confidence >= MIN_CONFIDENCE && confidence < WEAK_HINT_MIN;
    }
    
    /** Returns true if any framework was detected with non-zero confidence. */
    public boolean hasDetection() {
        return framework != null && confidence > MIN_CONFIDENCE;
    }
}

/**
 * Encapsulates a fingerprint check function with its reliability score.
 * 
 * @param check The fingerprinting function that returns true if framework matches
 * @param specificityScore Reliability score (50-95):
 *                        <ul>
 *                        <li>90-95: Highly specific (unique error patterns)</li>
 *                        <li>80-89: Very specific (distinctive patterns)</li>
 *                        <li>70-79: Moderately specific (language/framework family patterns)</li>
 *                        <li>60-69: Generic (common GraphQL error patterns)</li>
 *                        <li>50-59: Very generic (prone to false positives)</li>
 *                        </ul>
 */
private record FingerprintCheck(BooleanSupplier check, int specificityScore) {
    
    private static final int MIN_SCORE = 0;
    private static final int MAX_SCORE = 100;
    
    /**
     * Creates a fingerprint check with score validation.
     * 
     * @throws IndexOutOfBoundsException if specificityScore is not in range [0, 100]
     */
    public FingerprintCheck {
        // checkIndex enforces 0 <= specificityScore < MAX_SCORE + 1, i.e. [0, 100]
        Objects.checkIndex(specificityScore, MAX_SCORE + 1);
    }
}
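When validating a 0-100 score with `java.util.Objects`, the choice of helper matters. The sketch below compares two standard methods: `Objects.checkIndex(value, 101)` accepts exactly the values in [0, 100], whereas `Objects.checkFromToIndex(value, 0, 101)` interprets `value` as the start of a sub-range ending at 0 and therefore accepts only 0. `RangeCheckDemo` and its helper methods are hypothetical names for illustration.

```java
import java.util.Objects;

final class RangeCheckDemo {

    /** True if Objects.checkIndex treats value as a valid index into a length-101 range. */
    static boolean acceptedByCheckIndex(int value) {
        try {
            Objects.checkIndex(value, 101);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    /** True if Objects.checkFromToIndex accepts the sub-range [value, 0) within length 101. */
    static boolean acceptedByCheckFromToIndex(int value) {
        try {
            Objects.checkFromToIndex(value, 0, 101);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (int value : new int[] {-1, 0, 50, 100, 101}) {
            System.out.println(value
                    + " checkIndex=" + acceptedByCheckIndex(value)
                    + " checkFromToIndex=" + acceptedByCheckFromToIndex(value));
        }
    }
}
```

Both methods throw `IndexOutOfBoundsException`, so the `@throws` tags in the Javadoc hold either way; only the accepted range differs.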

2. Refactor `fingerprint()` Method

Replace the current implementation with a tiered detection strategy:

/**
 * Performs GraphQL framework fingerprinting using a tiered detection strategy.
 * 
 * <h3>Detection Strategy (Decision Tree)</h3>
 * <pre>
 * 1. Send initial probe query: { __typename }
 * 2. Check response headers for framework indicators
 * 3. Apply tiered detection based on confidence...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*

Copilot

Pull request overview

This PR refactors the GraphQL add-on’s engine fingerprinting to use a tiered strategy that prefers high-confidence header indicators (with optional quick verification) and otherwise falls back to pattern-based detection using specificity scoring to reduce false positives.

Changes:

Refactored fingerprint() to perform an initial probe, header-based detection/verification, then scored pattern matching.
Added new unit tests covering definitive header detection, header-hint verification, and fallback behavior.
Added *.hprof to .gitignore.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| `addOns/graphql/src/main/java/org/zaproxy/addon/graphql/GraphQlFingerprinter.java` | Adds tiered header detection, quick verification, and specificity-scored pattern detection. |
| `addOns/graphql/src/test/java/org/zaproxy/addon/graphql/GraphQlFingerprinterUnitTest.java` | Updates an existing test expectation and adds new tests for header-based detection/verification. |
| `.gitignore` | Ignores Java heap dump files (`*.hprof`). |


kingthorin · 2026-02-06T19:59:10Z

@copilot review again

Copilot · 2026-02-06T20:00:21Z


All previous review feedback has been addressed and resolved:

Locale-independent header matching (Locale.ROOT)
Null guard for headerHint parameter
matchedString evidence handling
Logger output validation in tests
Lighthouse detection preserved (removed score threshold)
Descending score iteration implemented
Inigo header detection restricted to x-inigo- prefix

The code compiles successfully and is ready for re-review.
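Two of the items above, locale-independent matching and the `x-inigo-` prefix restriction, can be sketched together. This is an illustrative example only, not the code from the branch: `HeaderPrefixMatch`, `hasInigoHeader`, and the sample header names are hypothetical. `Locale.ROOT` is used so that lowercasing does not depend on the JVM's default locale (for example, Turkish locales lowercase `I` to a dotless `ı`).

```java
import java.util.List;
import java.util.Locale;

final class HeaderPrefixMatch {

    /** Prefix named in the review feedback; matching anything broader was considered too loose. */
    static final String INIGO_PREFIX = "x-inigo-";

    /** Returns true if any header name starts with the prefix, ignoring case in a locale-independent way. */
    static boolean hasInigoHeader(List<String> headerNames) {
        if (headerNames == null) {
            return false; // null guard, in the spirit of the headerHint feedback item
        }
        for (String name : headerNames) {
            if (name != null && name.toLowerCase(Locale.ROOT).startsWith(INIGO_PREFIX)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Header names below are made up for the example.
        System.out.println(hasInigoHeader(List.of("Content-Type", "X-INIGO-Trace-Id")));
        System.out.println(hasInigoHeader(List.of("Content-Type", "x-inigoat")));
    }
}
```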

kingthorin · 2026-02-07T02:34:32Z

@copilot update the PR description based on the current state of the branch

Copilot · 2026-02-07T02:36:06Z


Updated PR description to reflect current implementation after the "Second opinion" commit (c8eed25). The tiered header-based detection has been simplified to focus on pattern-based detection with specificity scoring and descending order iteration only.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


kingthorin · 2026-02-07T02:45:11Z

@copilot are there literal strings used multiple times which should be converted to constants? It’s okay if no change is warranted simply reply with a comment if that’s the case.

Copilot · 2026-02-07T02:51:13Z


No changes warranted. The literal strings in the code are either:

Log messages (used once each, don't need extraction)
Framework-specific test queries (intentionally different variations like "queryy {__typename}", specific to each detection method)

Extracting these wouldn't improve maintainability since they serve different purposes in different contexts.

kingthorin · 2026-02-07T02:54:04Z

@copilot what about repeated literal occurrences of framework/engine names?

Copilot · 2026-02-07T02:55:39Z


No changes warranted for framework/engine names. Analysis shows:

Framework names: Each of the 36 framework names (like "tartiflette", "hasura", "apollo") is used only once as a map key in the fingerprinters registration. One instance ("inigo") appears 3 times total, but extracting constants for single-use identifiers would add boilerplate without improving maintainability.

JSON field names: Standard GraphQL response fields like "data" (6 uses), "extensions" (3 uses), and "errors" (2 uses) are used in different contexts throughout detection methods. These are well-known GraphQL standard field names, and their literal form aids readability in fingerprinting logic.

The current approach keeps the code clear and maintainable without unnecessary abstraction.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


Signed-off-by: kingthorin <kingthorin@users.noreply.github.com>

Copilot AI assigned Copilot and kingthorin Feb 6, 2026

Copilot started work on behalf of kingthorin February 6, 2026 13:58 View session

Copilot AI changed the title ~~[WIP] Enhance GraphQL fingerprinting with header-based detection~~ Improve GraphQL fingerprinting with header-based detection and specificity scoring Feb 6, 2026

Copilot AI requested a review from kingthorin February 6, 2026 14:15

Copilot finished work on behalf of kingthorin February 6, 2026 14:15

kingthorin requested a review from Copilot February 6, 2026 19:17

Copilot started reviewing on behalf of kingthorin February 6, 2026 19:18 View session

Copilot AI reviewed Feb 6, 2026


Copilot started work on behalf of kingthorin February 6, 2026 19:31 View session

Copilot finished work on behalf of kingthorin February 6, 2026 19:42

Copilot started work on behalf of kingthorin February 6, 2026 19:43 View session

Copilot finished work on behalf of kingthorin February 6, 2026 19:51

Copilot started work on behalf of kingthorin February 6, 2026 19:59 View session

Copilot finished work on behalf of kingthorin February 6, 2026 20:01

kingthorin force-pushed the copilot/improve-fingerprint-detection branch 4 times, most recently from c4a3142 to 3a0d709 Compare February 6, 2026 22:55

Copilot started work on behalf of kingthorin February 6, 2026 23:03 View session

Copilot stopped work on behalf of kingthorin due to an error February 6, 2026 23:04
The session was cancelled by the user.

Copilot started work on behalf of kingthorin February 6, 2026 23:05 View session

Copilot stopped work on behalf of kingthorin due to an error February 6, 2026 23:08
The session was cancelled by the user.

kingthorin force-pushed the copilot/improve-fingerprint-detection branch 4 times, most recently from f0affeb to c8eed25 Compare February 7, 2026 01:07

Copilot started work on behalf of kingthorin February 7, 2026 02:34 View session

kingthorin requested a review from Copilot February 7, 2026 02:34

Copilot started reviewing on behalf of kingthorin February 7, 2026 02:35 View session

kingthorin changed the title ~~Improve GraphQL fingerprinting with header-based detection and specificity scoring~~ Improve GraphQL fingerprinting with specificity scoring Feb 7, 2026

Copilot finished work on behalf of kingthorin February 7, 2026 02:38

kingthorin marked this pull request as ready for review February 7, 2026 02:41

Copilot AI reviewed Feb 7, 2026


Copilot started work on behalf of kingthorin February 7, 2026 02:45 View session

Copilot finished work on behalf of kingthorin February 7, 2026 02:52

Copilot started work on behalf of kingthorin February 7, 2026 02:54 View session

Copilot finished work on behalf of kingthorin February 7, 2026 02:56

kingthorin force-pushed the copilot/improve-fingerprint-detection branch 5 times, most recently from 1ca02f1 to 2a39087 Compare February 8, 2026 03:19

kingthorin requested a review from Copilot February 8, 2026 03:33

Copilot started reviewing on behalf of kingthorin February 8, 2026 03:34 View session

Copilot AI reviewed Feb 8, 2026


Comment thread addOns/graphql/src/main/java/org/zaproxy/addon/graphql/GraphQlFingerprinter.java Outdated

Comment thread addOns/graphql/CHANGELOG.md Outdated

kingthorin force-pushed the copilot/improve-fingerprint-detection branch 3 times, most recently from 69803d7 to 3d103a6 Compare February 8, 2026 12:03

Improve GraphQL fingerprinting with specificity scoring

b9803bb

Signed-off-by: kingthorin <kingthorin@users.noreply.github.com>

kingthorin force-pushed the copilot/improve-fingerprint-detection branch from 3d103a6 to b9803bb Compare February 8, 2026 16:35
