Filter characters before byte conversion #416
- Please fix the build. Make sure you can run mvn clean verify locally.
- Please make a symmetric change to the else clause of the same if statement:

for (int i = 0; i < lineBuffer.length(); i++) {
    buffer().put((byte) lineBuffer.charAt(i));
}
Thank you for the feedback. I will work on it.
Made sure mvn compile, clean, and verify run successfully.
@@ -171,12 +171,26 @@ public void writeLine(final CharArrayBuffer lineBuffer) throws CharacterCodingEx
final int off = buffer().position();
final int arrayOffset = buffer().arrayOffset();
for (int i = 0; i < len; i++) {
b[arrayOffset + off + i] = (byte) lineBuffer.charAt(i);
final int c = lineBuffer.charAt(i);
if ((c >= 0x20 && c <= 0x7E) || // Visible ASCII
Is it just me, or does this seem duplicated? Perhaps we can move this to a LangUtils class or somewhere similar to avoid repetition? I've noticed it's used both here and in ByteArrayBuffer.
@vismayku I agree with @arturobernalg. There are now three instances of the same logic repeated verbatim. I propose the common logic be extracted and moved to TextUtils. Please mark the new method @Internal. It should not be considered a part of the public APIs.
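A minimal sketch of what such an extraction could look like. The class name `TextUtilsSketch` and the method name `castAsByte` are assumptions made for illustration and are not taken from the change-set; the range check itself is copied from the diff under review. The `@Internal` annotation requested above is omitted only to keep the example self-contained.

```java
// Hypothetical stand-in for the utility class proposed in the review.
final class TextUtilsSketch {

    private TextUtilsSketch() {
        // Static helpers only; not meant to be instantiated.
    }

    /**
     * Returns {@code c} as a byte when it is TAB, visible ASCII, or visible
     * ISO-8859-1; returns '?' for every other value.
     */
    static byte castAsByte(final int c) {
        if ((c >= 0x20 && c <= 0x7E) || // Visible ASCII
            (c >= 0xA0 && c <= 0xFF) || // Visible ISO-8859-1
             c == 0x09) {               // TAB
            return (byte) c;
        }
        return (byte) '?';
    }
}
```

With a helper of this shape, each of the three call sites could shrink to a single call such as `buffer().put(TextUtilsSketch.castAsByte(lineBuffer.charAt(i)));`.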
ACK. I'm working on it.
I'm seeing an unrelated test case failure. I see the same failure even with a freshly checked-out repo.
if ((c >= 0x20 && c <= 0x7E) || // Visible ASCII
    (c >= 0xA0 && c <= 0xFF) || // Visible ISO-8859-1
    c == 0x09) {                // TAB
    return (byte) c;
Check failure
Code scanning / CodeQL
User-controlled data in numeric cast (Critical)
This cast to a narrower type depends on a user-provided value.
@vismayku I think you need to prevent unexpected truncation by checking the range of the input before performing the cast.

if (c <= 127) { // Ensure it's within byte range
    return (byte) c;
}

Also, squash your commits into a single one for a cleaner commit history.
> @vismayku I think you need to ensure unexpected truncation by checking the range of the input before performing the cast.

@arturobernalg I think the existing implementation already does that. All non-printable as well as all non-ASCII characters get converted to ?. This is expected behavior.
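To make the point in this reply concrete: a Java `byte` is signed, so values in 0x80..0xFF come out negative after the cast, but their low eight bits are preserved and masking with 0xFF recovers the original code. An upper bound of 0xFF is therefore enough to rule out truncation, whereas a bound of 127 would also replace the 0xA0..0xFF range that the check quoted in this thread lets through unchanged. The class and method names below are illustrative only.

```java
// Hypothetical helper used only to demonstrate the cast behaviour.
final class CastRangeSketch {

    private CastRangeSketch() {
    }

    // True when casting c to byte and widening it back yields c again.
    static boolean roundTrips(final int c) {
        final byte b = (byte) c;
        return (b & 0xFF) == c;
    }

    // Counts the values in [from, to] that do not survive the cast.
    static int countLossy(final int from, final int to) {
        int lossy = 0;
        for (int c = from; c <= to; c++) {
            if (!roundTrips(c)) {
                lossy++;
            }
        }
        return lossy;
    }
}
```

Under this sketch, `countLossy(0, 0xFF)` is 0, while every value from 0x100 through 0xFFFF is lossy.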
@vismayku @arturobernalg The change-set looks good to me.
@ok2c Sorry for the ping, but I have a follow-up question.
@vismayku We are not going to make any changes to the 4.x code beyond critical security and protocol fixes. This is not one of those.
This change causes SessionOutputBufferImpl to filter out all characters that
cannot be correctly converted to ISO-8859-1 by simple downcasting to a
byte.
The fix is inspired by #116.
The above-mentioned fix was applied to sync clients only. This request makes a similar change to the async client. Once approved, I will raise a similar request for the 4.x branch.
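For illustration, a small sketch of what unfiltered downcasting does to characters outside ISO-8859-1: only the low eight bits survive, so an unrelated byte is written in place of the original character. The class and method names are hypothetical and not part of the change-set.

```java
// Hypothetical demonstration of the unfiltered char-to-byte conversion.
final class DowncastSketch {

    private DowncastSketch() {
    }

    // Mimics the unfiltered conversion: the cast keeps only the low 8 bits,
    // and the mask returns them as an unsigned value for easy comparison.
    static int unfiltered(final char ch) {
        return ((byte) ch) & 0xFF;
    }
}
```

Here `unfiltered((char) 0x20AC)` (the euro sign) evaluates to 0xAC, and `unfiltered((char) 0x010A)` evaluates to 0x0A, a line feed. This illustrates why such characters are better replaced than cast.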