-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: add RFC for advanced data privacy and response obfuscation #1645
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very promising!
I left a few comments in the code.
Here's one more more: the obfuscate
is very powerful and very generic. It can contain conditions which is great but my concern is that on a large enough schema those blocks will eventually contain a ton of if/else and ternary operators to enable rules based on type/field/scalar/scalar field etc.
I believe it would be nice to have ability to trigger the rule using yaml. IMO it will significantly improve readability and make it less error prone. It'll also be easier to do codegeneration. Can we have in addition to everything else a field (an object) in yaml which can be used to specify "target" of the obfuscation block using data available in the context object?
this
data_privacy:
obfuscation:
expressions:
- name: "email",
expression: "repeat(\"*\",len(split(value,\"@\")[0])) + \"@\" + split(value,\"@\")[1]"
policies:
- name: "Data Scientist Obfuscation"
activate: "'data-scientist' in request.auth.roles"
obfuscate: "typeName == 'User' && fieldName == 'email' ? email(value) : value"
- name: "Developer Obfuscation"
activate: "'developer' in request.auth.roles"
obfuscate: "typeName == 'User' && fieldName == 'email' ? email(value) : value"
could become
data_privacy:
obfuscation:
expressions:
- name: "email",
expression: "repeat(\"*\",len(split(value,\"@\")[0])) + \"@\" + split(value,\"@\")[1]"
policies:
- name: "Data Scientist Obfuscation"
activate: "'data-scientist' in request.auth.roles"
target:
typeName: User
fieldName: email
obfuscate: email(value)
- name: "Developer Obfuscation"
activate: "'developer' in request.auth.roles"
target:
typeName: User
fieldName: email
obfuscate: email(value)
thanks!
Co-authored-by: Max Komarychev <[email protected]>
I was initially trying to keep the complexity of the configuration as simple as possible. I was also thinking about something similar to what you're describing. Here's an alternative approach to your idea. Let me know what you think about it. data_privacy:
obfuscation:
expressions:
- name: "email",
expression: "repeat(\"*\",len(split(value,\"@\")[0])) + \"@\" + split(value,\"@\")[1]"
policies:
- name: "Data Scientist Obfuscation"
activate: "'data-scientist' in request.auth.roles"
filter: "typeName == 'User' && fieldName == 'email'"
obfuscate: email(value)
- name: "Developer Obfuscation"
activate: "'developer' in request.auth.roles"
target:
typeName: User
fieldName: email
obfuscate: email(value) This achieves the same functionality, but it's more flexible, e.g. we can also filter by scalar name. data_privacy:
obfuscation:
expressions:
- name: "email",
expression: "repeat(\"*\",len(split(value,\"@\")[0])) + \"@\" + split(value,\"@\")[1]"
policies:
- name: "Data Scientist Obfuscation"
activate: "'data-scientist' in request.auth.roles"
obfuscate: "typeName == 'User' && fieldName == 'email' ? email(value) : value"
- name: "Developer Obfuscation"
activate: "'developer' in request.auth.roles"
target:
typeName: User
fieldName: email
obfuscate: email(value) |
@maxkomarychev You mentioned code generation. Can you explain what the relationship would be between obfuscation and code generation? What are you looking to generate? |
This is more readable. There is a clear place which selects the field for obfuscation and obfuscation block is simpler too because it doesn't need ternary operation with fallback to "value". |
If we can allow having simpler expressions in |
@jensneuse One more thing: we'd like to be able to filter a field using the interface. Can we have the information to check if a field comes as a result of implementing an interface? e.g. interface Hello {
world: String!
}
type One implements Hello {
world: String
}
type Two {
world: String
} I'd like to be able to target only instances of |
Introduces a comprehensive RFC for configurable response obfuscation in the Cosmo Router. Key features include: - Flexible data privacy policies with role-based and type-based targeting - Configurable transformers for obfuscating sensitive fields - Support for dynamic activation rules using expr-lang - Detailed configuration options for handling undefined fields - Comprehensive schema validation and performance considerations
```yaml | ||
targets: | ||
# Match a specific field on a specific type | ||
- type: "User" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this work for interfaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we resolve a response, we always resolve a concrete type, not an interface. Consequently, you need to define all types implementing an interface. Or to put it another way, you can ignore interfaces.
date(value, "2006-01-02 15:04:05").Year() + "-**-**" | ||
# Obfuscation policies | ||
policies: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are policies supposed to contain mutually exclusive conditions in their activate
blocks? can multiple policies be applied one after another?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple policies can be active, yes, but if one active policy matches a field, we will not evaluate further. Is this in line with your expectations or are you thinking of having multiple policies active?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need multiple policies to be active. "first policy wins" is good enough 👍
validate_transformers: true # Validate transformer return types on startup | ||
|
||
# Reusable transformers | ||
transformers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will transformers be invoked for null
values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a use case of obfuscating null? I'd say that we don't obfuscate null and just ignore null values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works for me
"fieldName": "email", | ||
"fieldType": "String", | ||
"parentType": "User", | ||
"value": "[email protected]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scalar fields can be null, yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the answer above you said we don't deal with nulls 🤔
{ | ||
"fieldName": "email", | ||
"fieldType": "String", | ||
"parentType": "User", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we expect interface names here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the parent will always be a concrete type
|
||
# For complex conditions not easily expressed in YAML | ||
# Custom rule is only evaluated if no target rule matches | ||
custom: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is custom
optional? is it correct to assume fields not matched by targets
will go unchanged if custom
is missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's optional, yes
if custom is missing and no target matches, fields will be unchanged, yes
however, it's possible to define a default transformer if you don't have a custom function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love the v2. custom
and default_transform
, targets
are great!
Could you clarify please if the matchers will be called for null values and if we can use interfaces in targets?
|
||
### Target-Based Field Selection | ||
|
||
The `targets` field provides a readable, YAML-based way to specify which fields should be obfuscated and how: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to change field
to fields
in case you want to apply the same obfuscation function to multiple fields in one type? That might omit a bit of redundant configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can, in fact, we can allow "fields" to be a selection set, so you can just define multiple space delimited fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why space delimited and not a yaml array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yaml array is probably better 👍
No description provided.