java: Added Java/Kotlin Sensitive Logging barriers (substrings) #20741

aegilops · 2025-10-31T18:07:15Z

Logging of sensitive data can be mitigated by taking a substring of the data.

This affects the library used by java/sensitive-log (CWE 532).

The exact length suitable will vary by application, but this change takes a conservative approach and allows either substring or take/takeLast of up to 7 characters.

Complex redaction with a regular expression, replacement of particular characters, a substring not at the start or end of the string, or a conditional substring (such as with Apache StringUtils) are not supported in this sanitizer.

There is a new abstraction for sanitizer barriers, along with logic to detect substring operations that restrict logged data to a safe length in both Java and Kotlin code. The existing sanitizers are pulled into a new class called GenericSanitizer, which implements the new abstract class, alongside the new sanitizer.

The safe length restriction must be done in the analyzed code with a compile-time constant integer, but the integer can reach the substring operation using taint.

Copilot

Pull Request Overview

This PR adds support for detecting substring operations as sanitizer barriers in sensitive logging analysis for both Java and Kotlin. The implementation takes a conservative approach by allowing substring operations that extract up to 7 characters, which can help mitigate the logging of sensitive data.

Key Changes:

Introduced abstract SensitiveLoggerBarrier class for sanitizer barriers
Implemented substring-based sanitizers supporting Java String.substring() and Kotlin take()/takeLast() methods
Refactored existing sanitizers into a new GenericSanitizer class

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll	Core implementation adding `SensitiveLoggerBarrier` abstraction, `SensitiveLoggerSanitizerCalled` for substring operations with helper predicates, `IntegerToArgConfig` for constant propagation, and refactored `GenericSanitizer`
java/ql/test/query-tests/security/CWE-532/Test.java	Added test cases demonstrating safe usage of `substring(4)` and `substring(0,4)` on sensitive auth tokens

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll

…iers

owen-mc

A few comments to make it easier to read and maintain. I'll run performance tests on it. Are we sure that this is a reasonable sanitizer? You have more security experience than me so I will take your word for it.

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll

java/ql/test/query-tests/security/CWE-532/Test.java

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll

java/ql/src/change-notes/2025-11-17-sensitive-logging-new-sanitizers.md

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll

aschackmull

The IntegerToArg flow must be removed.

…tyle and better comments

…iers

owen-mc · 2025-11-20T14:16:58Z

@aegilops I'm not sure if you can see the CI results, but your tests are failing because lines 12 and 13, which are meant to be safe, are causing alerts. (And also, the .expected file would need to be updated even if that wasn't the case.) All of your changes look sensible, so I'm not sure what would be causing it.

… did not work

aegilops · 2025-11-20T17:41:04Z

Hey @owen-mc, thanks for that.

I've fixed things up so that it's just lines 7, 8, 14 and 15 that now produce alerts.

I thought that with the inline testcases I didn't need to populate the .expected file.

I was not able to get the suggested change to using bounded(mc.getArgument(secondArgIndex), any(ZeroBound z), limit, true, _) working, unfortunately, and I spent quite some time trying to get it to work.

I was able to get an isolated predicate working with a limit, but with the code with that change in the correct place wasn't valid CodeQL - the compiler complained that limit was not bound.

If you are able to get that working then please make that change, but I've resorted to just looking for literals in the argument.

This worked:

predicate boundedLimit(MethodCall mc, int limit) {
  mc.getNumArgument() = 1 and
  bounded(mc.getArgument(0), any(ZeroBound z), limit, true, _) and
  limit < 8
}

but inserting bounded() in context with the rest of the code did not work.

aegilops · 2025-11-20T17:45:36Z

I'm happy to try to justify that a substring prefix/suffix is an security appropriate sanitizer - if someone is actively trimming the output down to a small portion of the data it shows intent.

We're looking for cases where sensitive data is output, and it's pretty standard practice to use a small prefix or suffix with sensitive data, such as credit card numbers or phone numbers; you reveal enough info to make the output identifiable if you know the full string, but without revealing it.

It was a customer request, so it isn't speculation that people want this.

owen-mc

I got the range analysis working, after some experimentation. The main things I found are:
(1) if a number has an upper bound of 4 and you ask for a number with an upper bound of 5 you won't get it - you have to ask for a number with an upper bound <= 5;
(2) in some situations reusing the ZeroBound z between different calls to bounded will give incorrect results, so best to just use any(ZeroBound z) each time.

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll

java/ql/test/query-tests/security/CWE-532/Test.java

java/ql/test/query-tests/security/CWE-532/SensitiveLogInfo.expected

owen-mc · 2025-11-21T10:34:35Z

@aschackmull Bug report for range analysis - if you replace long fourL = 4 with long fourL = 4L then range analysis no longer identifies that it has an upper bound of 4. I also think that it doesn't work through casts, so if fourL has certain bounds then (int)fourL doesn't, but I'm not 100% sure of that.

owen-mc · 2025-11-21T10:42:18Z

I thought that with the inline testcases I didn't need to populate the .expected file.

That's true unless you're using it via the post-processing query in a qlref test. Some people prefer that all the details are recorded, so that if a minor detail like an edge changes then it is flagged up. (This is useful e.g. when making changes to the underlying implementation of the dataflow library, so you can see what effect it's having). So you do have to accept the changes to the .expected file. This is easily done with codeql test run --learn <testpath>, or codeql test accept <testpath> after running codeql test run <testpath>. However, you still shouldn't have to look at the .expected file to see what the test is doing, as the comments in the test should be enough.

owen-mc

Ideally there would be a kotlin test as well, to exercise the take sanitizer, but I won't hold this up on that.

owen-mc

Oh wait, I should run dca again before approving since we changed how it works.

aschackmull · 2025-11-21T11:57:46Z

@aschackmull Bug report for range analysis - if you replace long fourL = 4 with long fourL = 4L then range analysis no longer identifies that it has an upper bound of 4

#20886

I also think that it doesn't work through casts, so if fourL has certain bounds then (int)fourL doesn't, but I'm not 100% sure of that.

Range analysis ought to handle casts - but note that bounds don't always apply through casts: if we, say, only have an upper bound then casting to a smaller type may cause negative values to wrap around, which could invalidate the bound.

aschackmull

Approving, since the QL looks reasonable now. But yes, as Owen says, we need a dca run before merging.

…iers

owen-mc · 2025-11-21T14:20:14Z

@aschackmull I'm not familiar with reading java DCA results, can you have a quick look? The failing projects seems to be failing in the latest nightly as well. The query changes seem fine. The only thing I'm unsure about is that the "Analysis time, per source" table shows apache__commons-math taking 28.5% less time and apache__commons-logging taking 55% more time. Is that just noise, or something we should look into? How can I tell (preferably without running the query locally)?

Added java-kotlin Sensitive Logging barriers (substrings)

26087f6

github-actions bot added the Java label Oct 31, 2025

aegilops added 2 commits October 31, 2025 18:19

Fixed format/docs issues

d1eceee

Test cases for sensitive logging sanitizer

fa703e3

aegilops changed the title ~~Added java-kotlin Sensitive Logging barriers (substrings)~~ java: Added Java/Kotlin Sensitive Logging barriers (substrings) Nov 14, 2025

Fixed sensitive logging barriers for substring to allow single-arg use

0c0fbc1

aegilops marked this pull request as ready for review November 15, 2025 09:43

aegilops requested a review from a team as a code owner November 15, 2025 09:43

Copilot AI review requested due to automatic review settings November 15, 2025 09:43

Copilot started reviewing on behalf of aegilops November 15, 2025 09:44 View session

Copilot finished reviewing on behalf of aegilops November 15, 2025 09:46

Copilot AI reviewed Nov 15, 2025

View reviewed changes

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll Outdated Show resolved Hide resolved

Added change note, adjusted spacing in comment

528c451

github-actions bot added the documentation label Nov 17, 2025

Merge branch 'main' into java-kotlin-sensitive-logging-substring-barr…

7b25e22

…iers

owen-mc requested changes Nov 19, 2025

View reviewed changes

aschackmull reviewed Nov 19, 2025

View reviewed changes

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll Outdated Show resolved Hide resolved

aschackmull reviewed Nov 19, 2025

View reviewed changes

java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll Outdated Show resolved Hide resolved

aschackmull requested changes Nov 19, 2025

View reviewed changes

aegilops added 3 commits November 20, 2025 11:46

Made changes requested by reviewers - bounded() for range checking, s…

62ee6d3

…tyle and better comments

Merge commit

1e67907

Merge branch 'main' into java-kotlin-sensitive-logging-substring-barr…

801cd72

…iers

aegilops added 3 commits November 20, 2025 17:31

Removed bounds checking and only using literals - bounded() predicate…

29a5b27

… did not work

Fixed formatting

e904520

Fixed formatting

ce13668

owen-mc reviewed Nov 21, 2025

View reviewed changes

Use range analysis and improve tests

ec381e4

owen-mc approved these changes Nov 21, 2025

View reviewed changes

owen-mc requested changes Nov 21, 2025

View reviewed changes

aschackmull mentioned this pull request Nov 21, 2025

Java: Recognize int-sized long literals. #20886

Merged

aschackmull approved these changes Nov 21, 2025

View reviewed changes

Merge branch 'main' into java-kotlin-sensitive-logging-substring-barr…

d2fc6a7

…iers

java: Added Java/Kotlin Sensitive Logging barriers (substrings) #20741

Are you sure you want to change the base?

java: Added Java/Kotlin Sensitive Logging barriers (substrings) #20741

Conversation

aegilops commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

owen-mc left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aschackmull left a comment

Choose a reason for hiding this comment

Uh oh!

owen-mc commented Nov 20, 2025

Uh oh!

aegilops commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aegilops commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

owen-mc left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

owen-mc commented Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

owen-mc commented Nov 21, 2025

Uh oh!

owen-mc left a comment

Choose a reason for hiding this comment

Uh oh!

owen-mc left a comment

Choose a reason for hiding this comment

Uh oh!

aschackmull commented Nov 21, 2025

Uh oh!

aschackmull left a comment

Choose a reason for hiding this comment

Uh oh!

owen-mc commented Nov 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

aegilops commented Oct 31, 2025 •

edited

Loading

aegilops commented Nov 20, 2025 •

edited

Loading

aegilops commented Nov 20, 2025 •

edited

Loading

owen-mc left a comment •

edited

Loading

owen-mc commented Nov 21, 2025 •

edited

Loading