-
Notifications
You must be signed in to change notification settings - Fork 1.8k
java: Added Java/Kotlin Sensitive Logging barriers (substrings) #20741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
java: Added Java/Kotlin Sensitive Logging barriers (substrings) #20741
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for detecting substring operations as sanitizer barriers in sensitive logging analysis for both Java and Kotlin. The implementation takes a conservative approach by allowing substring operations that extract up to 7 characters, which can help mitigate the logging of sensitive data.
Key Changes:
- Introduced abstract
SensitiveLoggerBarrierclass for sanitizer barriers - Implemented substring-based sanitizers supporting Java
String.substring()and Kotlintake()/takeLast()methods - Refactored existing sanitizers into a new
GenericSanitizerclass
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll | Core implementation adding SensitiveLoggerBarrier abstraction, SensitiveLoggerSanitizerCalled for substring operations with helper predicates, IntegerToArgConfig for constant propagation, and refactored GenericSanitizer |
| java/ql/test/query-tests/security/CWE-532/Test.java | Added test cases demonstrating safe usage of substring(4) and substring(0,4) on sensitive auth tokens |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
owen-mc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments to make it easier to read and maintain. I'll run performance tests on it. Are we sure that this is a reasonable sanitizer? You have more security experience than me so I will take your word for it.
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/change-notes/2025-11-17-sensitive-logging-new-sanitizers.md
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
aschackmull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IntegerToArg flow must be removed.
|
@aegilops I'm not sure if you can see the CI results, but your tests are failing because lines 12 and 13, which are meant to be safe, are causing alerts. (And also, the .expected file would need to be updated even if that wasn't the case.) All of your changes look sensible, so I'm not sure what would be causing it. |
|
Hey @owen-mc, thanks for that. I've fixed things up so that it's just lines 7, 8, 14 and 15 that now produce alerts. I thought that with the inline testcases I didn't need to populate the I was not able to get the suggested change to using I was able to get an isolated predicate working with a limit, but with the code with that change in the correct place wasn't valid CodeQL - the compiler complained that If you are able to get that working then please make that change, but I've resorted to just looking for literals in the argument. This worked: predicate boundedLimit(MethodCall mc, int limit) {
mc.getNumArgument() = 1 and
bounded(mc.getArgument(0), any(ZeroBound z), limit, true, _) and
limit < 8
}but inserting |
|
I'm happy to try to justify that a substring prefix/suffix is an security appropriate sanitizer - if someone is actively trimming the output down to a small portion of the data it shows intent. We're looking for cases where sensitive data is output, and it's pretty standard practice to use a small prefix or suffix with sensitive data, such as credit card numbers or phone numbers; you reveal enough info to make the output identifiable if you know the full string, but without revealing it. It was a customer request, so it isn't speculation that people want this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the range analysis working, after some experimentation. The main things I found are:
(1) if a number has an upper bound of 4 and you ask for a number with an upper bound of 5 you won't get it - you have to ask for a number with an upper bound <= 5;
(2) in some situations reusing the ZeroBound z between different calls to bounded will give incorrect results, so best to just use any(ZeroBound z) each time.
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/lib/semmle/code/java/security/SensitiveLoggingQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/test/query-tests/security/CWE-532/SensitiveLogInfo.expected
Outdated
Show resolved
Hide resolved
java/ql/test/query-tests/security/CWE-532/SensitiveLogInfo.expected
Outdated
Show resolved
Hide resolved
java/ql/test/query-tests/security/CWE-532/SensitiveLogInfo.expected
Outdated
Show resolved
Hide resolved
|
@aschackmull Bug report for range analysis - if you replace |
That's true unless you're using it via the post-processing query in a qlref test. Some people prefer that all the details are recorded, so that if a minor detail like an edge changes then it is flagged up. (This is useful e.g. when making changes to the underlying implementation of the dataflow library, so you can see what effect it's having). So you do have to accept the changes to the .expected file. This is easily done with |
owen-mc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally there would be a kotlin test as well, to exercise the take sanitizer, but I won't hold this up on that.
owen-mc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I should run dca again before approving since we changed how it works.
Range analysis ought to handle casts - but note that bounds don't always apply through casts: if we, say, only have an upper bound then casting to a smaller type may cause negative values to wrap around, which could invalidate the bound. |
aschackmull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, since the QL looks reasonable now. But yes, as Owen says, we need a dca run before merging.
|
@aschackmull I'm not familiar with reading java DCA results, can you have a quick look? The failing projects seems to be failing in the latest nightly as well. The query changes seem fine. The only thing I'm unsure about is that the "Analysis time, per source" table shows apache__commons-math taking 28.5% less time and apache__commons-logging taking 55% more time. Is that just noise, or something we should look into? How can I tell (preferably without running the query locally)? |
Logging of sensitive data can be mitigated by taking a substring of the data.
This affects the library used by
java/sensitive-log(CWE 532).The exact length suitable will vary by application, but this change takes a conservative approach and allows either
substringortake/takeLastof up to 7 characters.Complex redaction with a regular expression, replacement of particular characters, a substring not at the start or end of the string, or a conditional substring (such as with Apache StringUtils) are not supported in this sanitizer.
There is a new abstraction for sanitizer barriers, along with logic to detect substring operations that restrict logged data to a safe length in both Java and Kotlin code. The existing sanitizers are pulled into a new class called
GenericSanitizer, which implements the new abstract class, alongside the new sanitizer.The safe length restriction must be done in the analyzed code with a compile-time constant integer, but the integer can reach the substring operation using taint.