Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redact query string values for http client spans #13114

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

jeanbisutti
Copy link
Member

@jeanbisutti jeanbisutti commented Jan 27, 2025

@jeanbisutti jeanbisutti requested a review from a team as a code owner January 27, 2025 14:02
@jeanbisutti jeanbisutti marked this pull request as draft January 27, 2025 14:02
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch 2 times, most recently from aba2207 to 608bc78 Compare January 27, 2025 14:43
arguments("https://github.com#[email protected]", "https://github.com#[email protected]"),
arguments("user1:[email protected]", "user1:[email protected]"),
arguments("https://github.com@", "https://github.com@"),
arguments(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was (probably inadvertly) removed in #9925.

The test is added again with new test cases.

@jeanbisutti jeanbisutti marked this pull request as ready for review January 27, 2025 15:18
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch 2 times, most recently from d8d8dc3 to c95b161 Compare January 27, 2025 16:57
@trask trask added this to the v2.13.0 milestone Feb 10, 2025
@github-actions github-actions bot added the test native This label can be applied to PRs to trigger them to run native tests label Feb 11, 2025
@jeanbisutti jeanbisutti requested a review from laurit February 11, 2025 15:55
@laurit
Copy link
Contributor

laurit commented Feb 13, 2025

semantic conventions define the same redactions for url.query in http server spans

@trask
Copy link
Member

trask commented Feb 13, 2025

I think it's ok to scope this PR to http client spans since that's the primary issue in Java at least (update the title, the config option is already scoped to http clients)

@jeanbisutti
Copy link
Member Author

semantic conventions define the same redactions for url.query in http server spans

I have created #13292

@jeanbisutti jeanbisutti changed the title Redact query string values Redact query string values for http client spans Feb 13, 2025
@@ -104,11 +112,21 @@ public SpanKey internalGetSpanKey() {
}

@Nullable
private static String stripSensitiveData(@Nullable String url) {
private String stripSensitiveData(@Nullable String url) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method could still be static

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"https://service.com?paramA=valA&paramB=valB",
"https://service.com?paramA=valA&paramB=valB"),
arguments(
"https://service.com?AWSAccessKeyId=AKIAIOSFODNN7",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does not correctly work with https://service.com?AWSAccessKeyId=AKIAIOSFODNN7&a

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the resulting string will be https://service.com?AWSAccessKeyId=AKIAIOSFODNN7&. The parsing is waiting for a = before adding new characters.

The parsing was developed with the asumption that the parameter values don't use a reserved character for the percent encoding: https://en.wikipedia.org/wiki/Percent-encoding

For the Signature parameter, the AWS documentation provides an example with the Signature value "URL-Encoded to make it suitable for placement in the query string. ": see Creating a signature part in https://docs.aws.amazon.com/AmazonS3/latest/API/RESTAuthentication.html No encoding is mentioned for AWSAccessKeyId. So, it may be safe to suppose that AWSAccessKeyId value does not contain &.

I have created open-telemetry/semantic-conventions#1916 to clarify. @trask @laurit Waiting for a clarification, is-it a blocker for this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have misunderstood this. query?some=query&a&b is request with 3 parameters some, a and b Have a look at https://url.spec.whatwg.org/#urlencoded-parsing Ideally the redaction process should should redact just the values for parameters like AWSAccessKeyId and leave rest of the url as it is, even if it includes sequences that are a bit weird like &a or &&&.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laurit I was not aware of that. I have added a commit to manage several "weird" cases. cc @trask

@trask trask removed this from the v2.13.0 milestone Feb 14, 2025
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch from f9a1d63 to 4911e83 Compare February 17, 2025 13:40
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch from edcdb3a to 5164fe5 Compare February 18, 2025 16:44
return url;
}

StringBuilder redactedParameters = new StringBuilder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename this variable since it also contains non redacted parameters

paramWithValue = false;
inRedactedParamValue = false;
currentParamName.setLength(
0); // To avoid creating a new StringBuilder for each new parameter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move the comment to previous line to avoid weird wrapping?

@@ -145,4 +163,66 @@ private static String stripSensitiveData(@Nullable String url) {
}
return url.substring(0, schemeEndIndex + 3) + "REDACTED:REDACTED" + url.substring(atIndex);
}

private static String redactQueryParameters(String url) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe

  private static String redactQueryParameters(String url) {
    int questionMarkIndex = url.indexOf('?');
    if (questionMarkIndex == -1 || !containsParamToRedact(url)) {
      return url;
    }

    StringBuilder buffer = new StringBuilder();

    // To build a parameter name until we reach the '=' character
    // If the parameter name is a one to redact, we will redact the value
    StringBuilder currentParamName = new StringBuilder();

    for (int i = questionMarkIndex + 1; i < url.length(); i++) {
      char currentChar = url.charAt(i);

      if (currentChar == '=') {
        buffer.append('=');
        if (PARAMS_TO_REDACT.contains(currentParamName.toString())) {
          buffer.append("REDACTED");
          // skip over parameter value
          for (; i + 1 < url.length(); i++) {
            char c = url.charAt(i + 1);
            if (c == '&' || c == '#') {
              break;
            }
          }
        }
      } else if (currentChar == '&') { // New parameter delimiter
        buffer.append(currentChar);
        // To avoid creating a new StringBuilder for each new parameter
        currentParamName.setLength(0);
      } else if (currentChar == '#') { // Reference delimiter
        buffer.append(url.substring(i));
        break;
      } else {
        currentParamName.append(currentChar);
        buffer.append(currentChar);
      }
    }

    return url.substring(0, questionMarkIndex) + "?" + buffer;
  }

is easier to follow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test native This label can be applied to PRs to trigger them to run native tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants