-
-
Notifications
You must be signed in to change notification settings - Fork 415
feat: add sha256 feature to gix-commitgraph
#2377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: add sha256 feature to gix-commitgraph
#2377
Conversation
|
Thanks a lot, I am looking forward to sorting this out! There is also #2378, hopefully done in the next couple of days (it might conflict). |
|
Thanks a lot for starting this! This crate is special as it's the first one that parses a file-format that changes depending on the hash kind. It seems like one commit currently in the |
|
Assuming you’re referring to the one adding Apart from that, my plan is to go through the tests and, for every tests that currently uses |
Yes, that's the one, thanks 🙏.
Sorry for the confusion, that's the way. The only thing tests have to do is to actively set all hashes using feature toggles, and they should just set sha1/sha256 while they are at it. Once a crate has been converted, the tests should automatically test both hashes, and make sure they are compiled in. |
bde9c33 to
7bc1409
Compare
|
I added fixtures for SHA256 hashes to |
Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
|
I think I’ve now gotten to a point where I need some feedback in order to be able to continue. (CI is still running, so I also might need to address any issue that comes up there.) Specifically, I’ve got the following questions:
|
Byron
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is great!
The PR runs tests for both hashes through cargo nextest run -p gix-commitgraph --features sha256 --no-fail-fast. Is that correct?
I made a note in Cargo.toml which should lead to the extra run in the justfile to be removed. Please take a look to see if it makes sense.
Last time CI ran it did not complain about any of the generated archives in gix-commitgraph, but when I delete both generated-archives as well as generated-do-not-edit in order to regenerate the archives, git status shows that most files have changed their size by 512 bytes and that one file has been both deleted and modified. (I’m running cargo test --features sha256 on Ubuntu 25.10.) What’s the best way to debug this?
generated-do-no-edit would be the folder to delete for it to restore from archive or regenerate the local cache in generated-do-not-edit. That's all there is to it. With the sha256 versions showing up, I'd expect it to want to create the sha256 versions which seems to have happened as well. It seems to work from my PoV, but I will take a closer look when reviewing properly.
PS: removing generated-archives will regenerate them, and they always appear changed. That's normal, and one would avoid deleting the tracked files.
Is there anything else I would need to change or include before I can mark this PR as ready?
It looks like it's ready for a non-armchair and final review after this one 🎉.
| if hash_kind != gix_hash::Kind::Sha1 { | ||
| path = path.join(hash_kind.to_string()); | ||
| } | ||
| path = path.join(format!("{}-{}", script_identity, family_name())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it would be more cluttered, I seem to prefer the idea of not nesting hash_kind, but add it to the format string here.
In future, there will be even more of these as another parameter, the one for the refs backend, is added.
Alternatively… what do you think about looking at both parameters and using them for nesting?
path.join(hash_kind).join(refs_kind)
The refs_kind is nothing that exists now, but it would be added at some point.
Yes, that seems like the way to go.
So let's make a mild breaking change and always add the hash_kind to the path via join.
tests/tools/src/lib.rs
Outdated
| .env_remove("GIT_WORK_TREE") | ||
| .env_remove("GIT_COMMON_DIR") | ||
| .env_remove("GIT_ASKPASS") | ||
| .env_remove("GIT_DEFAULT_HASH") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the remove here isn't needed if we are always overriding. If there is something special, then let's keep this and document it though.
gix-commitgraph/tests/access/mod.rs
Outdated
| "overflow happened, can't represent huge dates" | ||
| ); | ||
| assert_eq!( | ||
| for hash_kind in gix_hash::Kind::all() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could all() rather return an iterator that hands out owned versions of hash_kind? I think that should work with impl Iterator<Item = Kind>.
|
|
||
| [features] | ||
| ## Support for SHA256 hashes and digests. | ||
| sha256 = ["gix-hash/sha256"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd want to use dev-dependencies to always add sha256 to the features when testing. That way, there is only one way to run tests: all of them.
|
As I am looking at Git for how they tackle this I wonder this: Should we, rather than changing every test and looping over it, prefer to build with all hashes, but run tests for each hash selectively, maybe by setting an environment variable that controls which hash is under test? Performance-wise it should be very similar, even though development is a bit less convenient as one will have to run twice with different flags or environment variables set. There, having the tests run everything automatically, always, and what we have now, seems better. Also, how do we handle expectations on hashes, particularly those with insta? It basically requires to chose one hash and find/replace the actual output to match that. I.e. So I think it's fine to keep this PR as is as it's happy with this simple and convenient solution, and we can probably layer moer on top of that when a need arises. |
|
I’m glad you mention this other option because I’m still very hesitant to force every test to be rewritten to make it loop over hash kinds as well. What I could try instead is adding an environment variable, e. g. |
|
I’ve added a |
That's fair, and probably that's best whenever there is no actual assertions against a hash. The manual looping is always something one can implement if that's a good choice in specific cases. With such a system, we'd need just one more job to run the SHA256 version of the test suite, which just sets an environment variable.
Oh, right, the If In any case, it seems we'd have to find a way to get CI green. Something that I'd not want to lose is to see what the best possible size is, i.e. if just one hash is supported and that is the smallest hash. But then one would have to test the individual crate just with one hash kind which may also be difficult in future. So… maybe just change the size expectations to the new value, while also using |
This PR adds the
sha256feature togix-commitgraph. I opened it as a draft, so we have a space where we can discuss how to best approach testing this new feature.This PR was triggered by a comment in #2359.