Conversation

markdroth
Member

No description provided.

@markdroth markdroth marked this pull request as ready for review September 18, 2025 22:50
@markdroth markdroth requested a review from dfawley September 18, 2025 22:50
will be set to the local address of the connection that the request
came in on.
- [principal](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L82):
If TLS is used, this will be set to the server's cert's first URI SAN

Contributor

The tls package in Go does not provide a way to retrieve the local certificate used in the TLS handshake. We filed an issue for this with the Go team ages ago, but there has not been much progress: golang/go#24673

If the configured channel credentials contain a certificate provider though, we would be able to retrieve the local certs from the provider. But the provider could return more than one cert (if the server is serving multiple domain names and expecting the client to specify an SNI during the TLS handshake). Do other languages have a mechanism to retrieve the actual cert used for the handshake?

Member Author

In C-core, we control the TLS code, so I think we'll be able to handle this.

I'm not sure how hard this will be in Java and Go. If we can't support it, I'm open to dropping it from the gRFC.

I'd like input from @ejona86, @dfawley, @matthewstevenson88, and @gtcooke94 on this.

Contributor

There is an open PR to support this in Go: golang/go#75699. The changes seem very straightforward, but the PR might need some pushing to get through, since we originally asked for this about 6 years ago.

- [disallow_all](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L70)
- [allow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L75)
- [disallow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L79)
- [disallow_is_error](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L87)

Contributor

The Envoy docs say that when this is true and the rules in this list cause a header mutation to be disallowed, the filter using this configuration will terminate the request with a 500 error.

Does this map to Unavailable or Unknown for us?

Member Author

According to the normal HTTP-to-gRPC status mapping, an HTTP 500 status would map to UNKNOWN.

I could also see an argument for using INTERNAL here. I don't feel strongly, but maybe @ejona86 or @dfawley have thoughts on this.
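
As a reference point for this question, the normal HTTP-to-gRPC status mapping can be sketched as a small lookup. This is a minimal illustration based on the mapping document linked from the gRFC; `grpcStatusFromHTTP` is a hypothetical name, and status codes are returned as strings to keep the example free of external dependencies.

```go
package main

// grpcStatusFromHTTP maps an HTTP status code to the name of a gRPC
// status code, following the standard HTTP-to-gRPC mapping.
// Hypothetical helper for illustration only.
func grpcStatusFromHTTP(httpStatus int) string {
	switch httpStatus {
	case 400:
		return "INTERNAL"
	case 401:
		return "UNAUTHENTICATED"
	case 403:
		return "PERMISSION_DENIED"
	case 404:
		return "UNIMPLEMENTED"
	case 429, 502, 503, 504:
		return "UNAVAILABLE"
	default:
		// Every other HTTP status, including 500, maps to UNKNOWN.
		return "UNKNOWN"
	}
}
```

Under this mapping a 500 falls through to the default case, which is why UNKNOWN is the answer above; choosing INTERNAL instead would be a deliberate deviation for this field.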

- [request](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L200):
Will always be set. Inside it:
- [time](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L95):
Will be set to the RPC's start time.

Contributor

The documentation on this field says:

// The timestamp when the proxy receives the first byte of the request.

Is this supposed to correspond to the dataplane RPC's start time? Or the current time when we are making the ext_authz RPC?

Member Author

The start time of the data plane RPC.

Contributor

Do gRPC implementations have a definitive way to determine this timestamp? Or is getting the current time from within the ext_authz interceptor/filter good enough?

Member Author

In C-core, we do have access to the RPC's start time. But if Go doesn't, it's probably good enough to just grab the current time when the ext_authz filter sees the client's headers.

The HTTP status to fail the RPC with. We apply the normal
[HTTP-to-gRPC status conversion
rules](https://github.com/grpc/grpc/blob/master/doc/http-grpc-status-mapping.md).
- [headers](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/external_auth.proto#L55)

Contributor

Are we supposed to send these headers in the trailer that we send on the dataplane RPC? If that is true, I feel we need to be more specific here on what the expected behavior should be.

The headers field contains a bunch of HeaderValueOptions, each of which contains:

  • a HeaderValue
    • this contains a key, a string value and a bytes raw_value
    • The documentation says only one of value or raw_value can be set, but they are not part of a oneof. What do we do if both are set?
    • Are we supposed to ensure that the key has a -bin suffix when raw_value is set? Or are we supposed to add the -bin suffix to the key in this case?
  • a deprecated append field (whose default value is false for the ext_authz service), which controls whether this header value is appended to existing values for the same key or replaces them
  • a recommended append_action field, which can specify a broader set of actions: when to append, when to replace, and when to do nothing
  • a keep_empty_value field which controls whether we allow headers with an empty value

In grpc-go, there are a couple of ways of adding trailers (one for unary and one for streaming), but both of them behave almost the same, in the sense that if they are called multiple times, the values for existing keys are merged together.

If we are to support the other ways of appending as specified in the append_action field, grpc-go would require a new API. Do other languages already have support for this?

Or does the appending apply only to the headers specified in this message? But if that is the case, things will still get merged if previous interceptors or filters in the chain have also added trailers.

Member Author

I've fleshed out the details on header rewriting.

Which of the append actions would Go have trouble with? I'm surprised that they aren't easy to support -- they certainly are in C-core. If this is really hard, we can consider taking some of them out of scope, but on the surface they seem to me like things we should support.

Note that the HeaderValueOption type is not specific to ext_authz; it is also used in ext_proc and may be used in other places in the future, so I suggest writing this code in a way that it can be reused.
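
To make the append actions concrete, here is a minimal, reusable sketch of how they might be applied to a metadata map. The four constants mirror the values of Envoy's HeaderAppendAction enum as I understand them; the Go names, the `applyHeaderMutation` function, and the use of a plain `map[string][]string` are assumptions for illustration, not grpc-go's API. Handling of raw_value, the -bin suffix, and keep_empty_value is deliberately omitted.

```go
package main

// appendAction mirrors Envoy's HeaderAppendAction values (hypothetical Go names).
type appendAction int

const (
	appendIfExistsOrAdd    appendAction = iota // APPEND_IF_EXISTS_OR_ADD
	addIfAbsent                                // ADD_IF_ABSENT
	overwriteIfExistsOrAdd                     // OVERWRITE_IF_EXISTS_OR_ADD
	overwriteIfExists                          // OVERWRITE_IF_EXISTS
)

// applyHeaderMutation applies one key/value mutation to md in place
// according to action. It knows nothing about ext_authz, so the same
// code could serve other users of HeaderValueOption.
func applyHeaderMutation(md map[string][]string, key, value string, action appendAction) {
	_, exists := md[key]
	switch action {
	case appendIfExistsOrAdd:
		// Append to existing values, or create the entry.
		md[key] = append(md[key], value)
	case addIfAbsent:
		// Do nothing when the key is already present.
		if !exists {
			md[key] = []string{value}
		}
	case overwriteIfExistsOrAdd:
		// Discard existing values, or create the entry.
		md[key] = []string{value}
	case overwriteIfExists:
		// Do nothing when the key is absent.
		if exists {
			md[key] = []string{value}
		}
	}
}
```

Keeping the function independent of any filter type is one way to follow the suggestion above that the logic be shared between ext_authz and ext_proc.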

Contributor

> I've fleshed out the details on header rewriting.

Thank you. It's much clearer now. The only thing that is still unclear to me is what happens if one of the headers from the ext_authz server fails our validation: do we just ignore that particular header? I would assume so.

Contributor

> Which of the append actions would Go have trouble with? I'm surprised that they aren't easy to support -- they certainly are in C-core. If this is really hard, we can consider taking some of them out of scope, but on the surface they seem to me like things we should support.

It's not technically hard to implement as such. But as I mentioned, we currently have a single API to set the trailers and the way it handles duplicate keys is by merging the values. This behavior is not configurable with the current API.

I can see a couple of options:

  • The current API does take a context argument. So, we could stash the append behavior in there and have the implementation honor it.
  • Add a new API that explicitly accepts an argument to configure the append behavior.

Will discuss this with Doug once he is back.

Member Author

> The only thing that is unclear to me at this point is if one of the headers from the ext_authz server fails our validation, do we just ignore that particular header? I would assume so.

Which validation are we talking about here?

In general, if we can't support a given header mutation, we should use the decoder_header_mutation_rules.disallow_is_error field to decide whether to ignore the change or fail the data plane RPC. And actually, this makes me think that we should do the same thing for headers starting with : and the host header -- we just hard-code the fact that those are always disallowed. I've tried to clarify this in the text.

5 participants