Support for delta sharing profile via options #573
Hi @stevenayers, it seems that you are aware of the previous effort #103, and that we didn't want to proceed with it because it introduces secrets in the code.
Hi @linzhou-db, I understand. However, I think some of the feedback provided by the community in #103, which counters that decision, is still very valid. Whether you are reading the secrets from a file or passing them in via options, your library still has "secrets in the code." However, the current method forces us to hard-code our credentials in a plain-text file on our filesystem (which is just as dangerous as storing them in a notebook). Your colleague's comment here about tempfiles also will not work in Python. You cannot create a tempfile in Python which is then used by the underlying Scala code. A true temp file is stored in memory and never flushed to the filesystem, but a file on the filesystem is exactly what is required when working across two different processes, such as PySpark communicating with the underlying JVM. If we look at Databricks' Security Best Practices:
Secrets should be managed by a dedicated secrets manager. These secrets managers exist so that, as developers, we do not need to store secrets in plain-text files or hardcode them as variables; instead, they can be stored in memory and passed into the logic that requires them. 😄 Doing this requires an interface such as the one proposed in this PR. Please reach out to some of the Infrastructure & Security professionals within your organization for their opinions; I think they would share a similar sentiment to me, @ssimeonov and @shcent.
Guidelines from Microsoft's Security Fundamentals: https://learn.microsoft.com/en-us/azure/security/fundamentals/secrets-best-practices#avoid-hardcoding-secrets
Completely agree with @stevenayers. Having to store the authentication token in a file at all is dangerous. AWS, Azure and Databricks (and I'm sure any other cloud provider not mentioned) all come with secrets managers so that this should not be needed. Using a temp file in Python is still awkward, especially on Databricks, where the path has a tendency to change when moving from a Notebook into pure PySpark, and there is always the risk that the file is not tidied up correctly. Also, as @stevenayers mentions, in-memory versions do not work.
@stevenayers and @shcent thanks for sharing your thoughts, super helpful for me to get more context on this. I just want to reply acking that we've read your comments and are actively discussing this. Will get back to you soon.
Thanks @linzhou-db!
Can you please merge this and release a new version of the lib? Our project is waiting for this change 🙂
any news on this one?
Signed-off-by: Steven Ayers <[email protected]>
…`DeltaSharingProfile`. Signed-off-by: Steven Ayers <[email protected]>
@linzhou-db @chakankardb @zsxwing @mateiz any further thoughts on this change? thanks :)
Add support for delta sharing profile via options (resolves #483).
Task List:
Examples
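The following is not taken from the PR itself. It is a minimal sketch of the pattern argued for in the conversation: credentials fetched from a secrets manager, kept in memory, and handed over as reader options instead of being written to a profile file. The field names `shareCredentialsVersion`, `endpoint` and `bearerToken` come from the Delta Sharing profile file format; using them as option keys is an assumption, and `profile_options`, `get_secret` and the `delta-sharing-` prefix are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Optional

# Field names from the Delta Sharing profile file format. Treating them as
# reader option keys is an ASSUMPTION; check the PR diff for the real names.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def profile_options(
    get_secret: Callable[[str], Optional[str]],
    prefix: str = "delta-sharing-",
) -> Dict[str, str]:
    """Build an in-memory options dict from a secrets-manager lookup.

    `get_secret` is a hypothetical callable, for example a thin wrapper
    around your secrets manager's client. `prefix` is an illustrative
    naming scheme, not something defined by the library.
    """
    options: Dict[str, str] = {}
    for field in REQUIRED_FIELDS:
        value = get_secret(prefix + field)
        if not value:
            # Fail early instead of sending an incomplete profile.
            raise ValueError(f"missing secret for profile field: {field}")
        options[field] = value
    return options


# Stand-in for a real secrets manager so the sketch runs anywhere.
_fake_store = {
    "delta-sharing-shareCredentialsVersion": "1",
    "delta-sharing-endpoint": "https://sharing.example.com/delta-sharing/",
    "delta-sharing-bearerToken": "not-a-real-token",
}
opts = profile_options(_fake_store.get)

# Hypothetical usage; the path form without a profile file is an assumption:
# df = spark.read.format("deltaSharing").options(**opts).load("<share>.<schema>.<table>")
```

The point of the design is that nothing touches the filesystem: the token only exists in the `opts` dict for as long as the caller holds it.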