Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter characters before byte conversion #416

Merged
merged 4 commits into from
Aug 22, 2023
Merged

Conversation

vismayku
Copy link
Contributor

This change causes SessionOutputBufferImpl to filter out all characters that
cannot be correctly converted to ISO-8859-1 by simple downcasting to a
byte.

Fix is inspired from: #116
Above mentioned fix was applied to Sync clients only. This request make similar change to async client. Once approved, I will raise similar request for 4.x branch.

@vismayku vismayku marked this pull request as ready for review August 17, 2023 23:27
Copy link
Member

@ok2c ok2c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vismayku

  1. Please fix the build. Make sure you can run mvn clean verify locally.
  2. Please make symmetric change to the else clause of the same if statement
                    for (int i = 0; i < lineBuffer.length(); i++) {
                        buffer().put((byte) lineBuffer.charAt(i));
                    }

@vismayku
Copy link
Contributor Author

Thank you for the feedback. I will work on it.

@vismayku
Copy link
Contributor Author

Made sure mvn compile clean and verify runs successfully.
Also addressed the change requested.

@@ -171,12 +171,26 @@ public void writeLine(final CharArrayBuffer lineBuffer) throws CharacterCodingEx
final int off = buffer().position();
final int arrayOffset = buffer().arrayOffset();
for (int i = 0; i < len; i++) {
b[arrayOffset + off + i] = (byte) lineBuffer.charAt(i);
final int c = lineBuffer.charAt(i);
if ((c >= 0x20 && c <= 0x7E) || // Visible ASCII
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just me, or does this seem duplicated? Perhaps we can move this to a LangUtils class or somewhere similar to avoid repetition? I've noticed it's used both here and in ByteArrayBuffer.

@vismayku vismayku requested a review from ok2c August 18, 2023 20:25
Copy link
Member

@ok2c ok2c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vismayku I agree with @arturobernalg . There are now three instances of the same logic repeated verbatim. I propose the common logic be extracted and moved to TextUtils. Please mark the new method @Internal. It should not be considered a part of the public APIs

@vismayku
Copy link
Contributor Author

ACK. I'm working on it.

@vismayku
Copy link
Contributor Author

vismayku commented Aug 21, 2023

I'm seeing an un-related test case failure. I am seeing the same failure even with freshly checked out repo.


[ERROR] Failures: 
[ERROR]   ClassicRequestBuilderTest.constructor:65 expected: <REDACTED> but was: <[REDACTED]>

if ((c >= 0x20 && c <= 0x7E) || // Visible ASCII
(c >= 0xA0 && c <= 0xFF) || // Visible ISO-8859-1
c == 0x09) { // TAB
return (byte) c;

Check failure

Code scanning / CodeQL

User-controlled data in numeric cast Critical

This cast to a narrower type depends on a
user-provided value
, potentially causing truncation.
This cast to a narrower type depends on a
user-provided value
, potentially causing truncation.
Copy link
Member

@arturobernalg arturobernalg Aug 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vismayku I think you need to ensure unexpected truncation by checking the range of the input before performing the cast.

if (c <= 127) {   // Ensure it's within byte range
    return (byte) c;
}

Also, squash your commits into a single one for a cleaner commit history

Copy link
Member

@ok2c ok2c Aug 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vismayku I think you need to ensure unexpected truncation by checking the range of the input before performing the cast.

@arturobernalg I think the existing implementation already does that. All non-printable as well as all non-ascii characters get converted to ?. This is expected behavior.

if ((c >= 0x20 && c <= 0x7E) || // Visible ASCII
(c >= 0xA0 && c <= 0xFF) || // Visible ISO-8859-1
c == 0x09) { // TAB
return (byte) c;
Copy link
Member

@arturobernalg arturobernalg Aug 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vismayku I think you need to ensure unexpected truncation by checking the range of the input before performing the cast.

if (c <= 127) {   // Ensure it's within byte range
    return (byte) c;
}

Also, squash your commits into a single one for a cleaner commit history

Copy link
Member

@ok2c ok2c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vismayku @arturobernalg The change-set looks good to me.

@ok2c ok2c merged commit 4e72f40 into apache:master Aug 22, 2023
9 checks passed
ok2c pushed a commit that referenced this pull request Aug 22, 2023
@vismayku
Copy link
Contributor Author

@ok2c Sorry for the ping but I have a follow up question.
I also want to contribute similar change to 4.x major version. Which specific major-minor version I should work on?

@ok2c
Copy link
Member

ok2c commented Aug 22, 2023

@ok2c Sorry for the ping but I have a follow up question. I also want to contribute similar change to 4.x major version. Which specific major-minor version I should work on?

@vismayku We are not going to make any changes to the 4.x code beyond critical security and protocol fixes. This is not one of those.

ok2c pushed a commit that referenced this pull request Sep 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants