Add a hash method for strings #15139
base: master
MD5 is insecure and totally should not be used. Maybe SHA1 or SHA256 but certainly not MD5.
Maybe I should add a
@bonzini I think it looks really great, please review it again :)
Left a few comments: testcases are needed and it should be a method rather than a function.
- sha3_224
- sha3_256
- sha3_384
- sha3_512
The Python documentation lists more algorithms that are always present.
Wrong section of the docs. That section specifically uses the language "such as these", indicating it isn't complete. Elsewhere on that page, it says:
> Constructors for hash algorithms that are always present in this module are sha1(), sha224(), sha256(), sha384(), sha512(), sha3_224(), sha3_256(), sha3_384(), sha3_512(), shake_128(), shake_256(), blake2b(), and blake2s(). md5() is normally available as well, though it may be missing or blocked if you are using a rare “FIPS compliant” build of Python. These correspond to algorithms_guaranteed.
>
> Additional algorithms may also be available if your Python distribution’s hashlib was linked against a build of OpenSSL that provides others. Others are not guaranteed available on all installations and will only be accessible by name via new(). See algorithms_available.
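The distinction quoted above can be checked directly against `hashlib`. The following is a minimal sketch, not code from this PR: `QUOTED_ALWAYS_PRESENT` and `split_by_guarantee` are hypothetical names introduced only for illustration, and the set of names is copied from the quoted passage.

```python
import hashlib

# Names copied from the quoted documentation passage (illustrative constant,
# not something Meson or this PR defines).
QUOTED_ALWAYS_PRESENT = frozenset({
    "sha1", "sha224", "sha256", "sha384", "sha512",
    "sha3_224", "sha3_256", "sha3_384", "sha3_512",
    "shake_128", "shake_256", "blake2b", "blake2s",
})


def split_by_guarantee(names):
    """Partition names into (guaranteed, only_available, unknown).

    guaranteed:     in hashlib.algorithms_guaranteed on every platform
    only_available: provided by this particular build (e.g. via OpenSSL)
    unknown:        not offered by this interpreter at all
    """
    names = set(names)
    guaranteed = names & hashlib.algorithms_guaranteed
    only_available = (names & hashlib.algorithms_available) - guaranteed
    unknown = names - guaranteed - only_available
    return guaranteed, only_available, unknown


if __name__ == "__main__":
    g, a, u = split_by_guarantee(QUOTED_ALWAYS_PRESENT | {"md5", "ripemd160"})
    print("guaranteed:", sorted(g))
    print("build-specific:", sorted(a))
    print("unknown:", sorted(u))
```

An allow-list built from `algorithms_guaranteed` would behave the same on every installation, whereas anything taken from `algorithms_available` depends on how Python was built.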
I could have used it in the past to create an opaque identifier for the version (to build a private symbol in a library for example) but it's indeed quite niche. It would be useful to hear from the submitter though.
> hash constant is widely used in file verification, so
The fs module already supports this:

```meson
fs = import('fs')
myhash = fs.hash('foo.txt', 'sha512')
```

I do not understand the purpose of hashing a string object.
> Unless I'm missing something, this implementation of `hash()` cannot support binary files (`str` in Meson is always a Unicode string and anyway there is no facility to read a binary file into a `str` object) thus I think it is of limited use. What is the use case for adding a `hash()` method to the `str` class?
Yes, and the fs module (already) solves this. :)
Alright, it seems there was a problem with my description. Actually, I hope to replicate the
This is my first time using Meson. I believe I need a string function, but it seems that Meson doesn't have such a built-in feature. And this can do many things, for example, using configure to generate a set of build-time signatures that can be read and verified by the program, without having to sign them within the program itself. This greatly reduces runtime overhead, since the operations are moved to build time.
Hash constants are widely used in signatures, so I suggest adding a hash method for strings to obtain the hash constant of a string. For its usage example:

```meson
hash_str = 'foobar'.hash('sha1') # return a sha1 constant
# some process
```
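For readers unfamiliar with what the proposed `'foobar'.hash('sha1')` call would compute, here is a rough Python sketch. It is not the implementation in this PR: `hash_string` is a hypothetical helper, and encoding the string as UTF-8 before hashing is an assumption, since the thread only establishes that a Meson `str` is always a Unicode string.

```python
import hashlib


def hash_string(text: str, algorithm: str) -> str:
    """Return the hex digest of `text` for the named algorithm.

    Assumption: the string is encoded as UTF-8 before hashing.
    """
    # Only accept algorithms every Python build is documented to provide.
    if algorithm not in hashlib.algorithms_guaranteed:
        raise ValueError(f"unsupported hash algorithm: {algorithm!r}")
    # SHAKE digests are variable-length and need an explicit size,
    # so this sketch leaves them out rather than picking one.
    if algorithm.startswith("shake_"):
        raise ValueError("variable-length digests are not handled here")
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(hash_string("foobar", "sha1"))
```

Because the input is text rather than bytes, the result depends on the chosen encoding, which is one reason a string method cannot stand in for hashing binary files.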
This CMake example is based on the CMake documentation for registering a search directory path for find_package(). The purpose of an md5 sum here is as a form of UUID, it doesn't have to be md5. Formally speaking, the exact value is irrelevant and "has no meaning" (it is not even used for collating a search order). I am not sure this is an entirely compelling argument in favor of supporting md5 in particular, or hashes at all. I suppose that if I were to anyways use a script to set a registry key, I'd include UUID generation (or even cheap hashing) inside that script. Although I do agree that in this case what you're interested in using as input is indeed a string, not a file.
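The "UUID generation inside that script" alternative mentioned above could look roughly like the following. This is only a sketch under assumptions: `stable_id` is a hypothetical helper, and deriving a name-based UUID from a path string with `uuid.NAMESPACE_URL` is one arbitrary choice among several.

```python
import uuid


def stable_id(path: str) -> str:
    """Derive a deterministic, opaque identifier from a path string.

    uuid5 is name-based: the same input always yields the same UUID,
    and the exact value carries no meaning beyond being stable.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, path))


if __name__ == "__main__":
    # Hypothetical install prefix, used only to show the call.
    print(stable_id("/opt/example/lib/cmake/Example"))
```

Like the md5 sum in the CMake example, the identifier only needs to be stable for a given input string.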
I'm not certain that I fully understand this use case. By build time signatures do you mean security codesigning? How does this relate to a string value? (I would assume you'd want to compute the hash of a compiled artifact that the program would want to load, and requires a known matching hash before agreeing to load it.)
Yes, that's one of the applications, for example, creating hash information for a set of build-time data, or generating a unique identifier for a file path. There may be many other scenarios as well; I believe it has great potential. (Most importantly, it’s built-in and doesn’t require writing any external scripts. Quite simple, and I think that's good :) )