Skip to content

Conversation

@aegilops
Copy link
Contributor

@aegilops aegilops commented Oct 31, 2025

Logging of sensitive data can be mitigated by taking a substring of the data.

This affects the library used by java/sensitive-log (CWE 532).

The exact length suitable will vary by application, but this change takes a conservative approach and allows either substring or take/takeLast of up to 7 characters.

Complex redaction with a regular expression, replacement of particular characters, a substring not at the start or end of the string, or a conditional substring (such as with Apache StringUtils) are not supported in this sanitizer.

There is a new abstraction for sanitizer barriers, along with logic to detect substring operations that restrict logged data to a safe length in both Java and Kotlin code. The existing sanitizers are pulled into a new class called GenericSanitizer, which implements the new abstract class, alongside the new sanitizer.

The safe length restriction must be done in the analyzed code with a compile-time constant integer, but the integer can reach the substring operation using taint.

@github-actions github-actions bot added the Java label Oct 31, 2025
@aegilops aegilops changed the title Added java-kotlin Sensitive Logging barriers (substrings) java: Added Java/Kotlin Sensitive Logging barriers (substrings) Nov 14, 2025
@aegilops aegilops marked this pull request as ready for review November 15, 2025 09:43
@aegilops aegilops requested a review from a team as a code owner November 15, 2025 09:43
Copilot AI review requested due to automatic review settings November 15, 2025 09:43
Copilot finished reviewing on behalf of aegilops November 15, 2025 09:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for detecting substring operations as sanitizer barriers in sensitive logging analysis for both Java and Kotlin. The implementation takes a conservative approach by allowing substring operations that extract up to 7 characters, which can help mitigate the logging of sensitive data.

Key Changes:

  • Introduced abstract SensitiveLoggerBarrier class for sanitizer barriers
  • Implemented substring-based sanitizers supporting Java String.substring() and Kotlin take()/takeLast() methods
  • Refactored existing sanitizers into a new GenericSanitizer class

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll Core implementation adding SensitiveLoggerBarrier abstraction, SensitiveLoggerSanitizerCalled for substring operations with helper predicates, IntegerToArgConfig for constant propagation, and refactored GenericSanitizer
java/ql/test/query-tests/security/CWE-532/Test.java Added test cases demonstrating safe usage of substring(4) and substring(0,4) on sensitive auth tokens

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments to make it easier to read and maintain. I'll run performance tests on it. Are we sure that this is a reasonable sanitizer? You have more security experience than me so I will take your word for it.

Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IntegerToArg flow must be removed.

@owen-mc
Copy link
Contributor

owen-mc commented Nov 20, 2025

@aegilops I'm not sure if you can see the CI results, but your tests are failing because lines 12 and 13, which are meant to be safe, are causing alerts. (And also, the .expected file would need to be updated even if that wasn't the case.) All of your changes look sensible, so I'm not sure what would be causing it.

@aegilops
Copy link
Contributor Author

aegilops commented Nov 20, 2025

Hey @owen-mc, thanks for that.

I've fixed things up so that it's just lines 7, 8, 14 and 15 that now produce alerts.

I thought that with the inline testcases I didn't need to populate the .expected file.

I was not able to get the suggested change to using bounded(mc.getArgument(secondArgIndex), any(ZeroBound z), limit, true, _) working, unfortunately, and I spent quite some time trying to get it to work.

I was able to get an isolated predicate working with a limit, but with the code with that change in the correct place wasn't valid CodeQL - the compiler complained that limit was not bound.

If you are able to get that working then please make that change, but I've resorted to just looking for literals in the argument.

This worked:

predicate boundedLimit(MethodCall mc, int limit) {
  mc.getNumArgument() = 1 and
  bounded(mc.getArgument(0), any(ZeroBound z), limit, true, _) and
  limit < 8
}

but inserting bounded() in context with the rest of the code did not work.

@aegilops
Copy link
Contributor Author

aegilops commented Nov 20, 2025

I'm happy to try to justify that a substring prefix/suffix is an security appropriate sanitizer - if someone is actively trimming the output down to a small portion of the data it shows intent.

We're looking for cases where sensitive data is output, and it's pretty standard practice to use a small prefix or suffix with sensitive data, such as credit card numbers or phone numbers; you reveal enough info to make the output identifiable if you know the full string, but without revealing it.

It was a customer request, so it isn't speculation that people want this.

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the range analysis working, after some experimentation. The main things I found are:
(1) if a number has an upper bound of 4 and you ask for a number with an upper bound of 5 you won't get it - you have to ask for a number with an upper bound <= 5;
(2) in some situations reusing the ZeroBound z between different calls to bounded will give incorrect results, so best to just use any(ZeroBound z) each time.

@owen-mc
Copy link
Contributor

owen-mc commented Nov 21, 2025

@aschackmull Bug report for range analysis - if you replace long fourL = 4 with long fourL = 4L then range analysis no longer identifies that it has an upper bound of 4. I also think that it doesn't work through casts, so if fourL has certain bounds then (int)fourL doesn't, but I'm not 100% sure of that.

@owen-mc
Copy link
Contributor

owen-mc commented Nov 21, 2025

I thought that with the inline testcases I didn't need to populate the .expected file.

That's true unless you're using it via the post-processing query in a qlref test. Some people prefer that all the details are recorded, so that if a minor detail like an edge changes then it is flagged up. (This is useful e.g. when making changes to the underlying implementation of the dataflow library, so you can see what effect it's having). So you do have to accept the changes to the .expected file. This is easily done with codeql test run --learn <testpath>, or codeql test accept <testpath> after running codeql test run <testpath>. However, you still shouldn't have to look at the .expected file to see what the test is doing, as the comments in the test should be enough.

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally there would be a kotlin test as well, to exercise the take sanitizer, but I won't hold this up on that.

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, I should run dca again before approving since we changed how it works.

@aschackmull
Copy link
Contributor

@aschackmull Bug report for range analysis - if you replace long fourL = 4 with long fourL = 4L then range analysis no longer identifies that it has an upper bound of 4

#20886

I also think that it doesn't work through casts, so if fourL has certain bounds then (int)fourL doesn't, but I'm not 100% sure of that.

Range analysis ought to handle casts - but note that bounds don't always apply through casts: if we, say, only have an upper bound then casting to a smaller type may cause negative values to wrap around, which could invalidate the bound.

Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, since the QL looks reasonable now. But yes, as Owen says, we need a dca run before merging.

@owen-mc
Copy link
Contributor

owen-mc commented Nov 21, 2025

@aschackmull I'm not familiar with reading java DCA results, can you have a quick look? The failing projects seems to be failing in the latest nightly as well. The query changes seem fine. The only thing I'm unsure about is that the "Analysis time, per source" table shows apache__commons-math taking 28.5% less time and apache__commons-logging taking 55% more time. Is that just noise, or something we should look into? How can I tell (preferably without running the query locally)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants