-
Notifications
You must be signed in to change notification settings - Fork 142
Fix REF_DELTA chain bug in 'git index-pack' #1906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -821,6 +821,7 @@ TEST_BUILTINS_OBJS += test-mergesort.o | |
TEST_BUILTINS_OBJS += test-mktemp.o | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): I needed this to make
$ SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=true make
$ cd t && sh t5309-pack-delta-cycles.sh
pass.
--- >8 ------ >8 ------ >8 ---
Subject: [PATCH] fixup! test-tool: add pack-deltas helper
t/helper/test-pack-deltas.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
index db7d1c3cd1..c8e837ea06 100644
--- a/t/helper/test-pack-deltas.c
+++ b/t/helper/test-pack-deltas.c
@@ -122,6 +122,7 @@ int cmd__pack_deltas(int argc, const char **argv)
if (get_oid_hex(base_oid_str, &base_oid))
die("invalid object: %s", base_oid_str);
}
+ string_list_clear(&items, 0);
if (!strcmp(type_str, "REF_DELTA"))
write_ref_delta(f, &content_oid, &base_oid);
--
2.49.0-555-g43235db9c8
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Derrick Stolee wrote (reply to this): On 4/24/2025 3:41 PM, Junio C Hamano wrote:
> I needed this to make
>
> $ SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=true make
> $ cd t && sh t5309-pack-delta-cycles.sh
>
> pass.
> --- >8 ------ >8 ------ >8 ---
> Subject: [PATCH] fixup! test-tool: add pack-deltas helper
>
> t/helper/test-pack-deltas.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
> index db7d1c3cd1..c8e837ea06 100644
> --- a/t/helper/test-pack-deltas.c
> +++ b/t/helper/test-pack-deltas.c
> @@ -122,6 +122,7 @@ int cmd__pack_deltas(int argc, const char **argv)
> if (get_oid_hex(base_oid_str, &base_oid))
> die("invalid object: %s", base_oid_str);
> }
> + string_list_clear(&items, 0);
Thanks. I'll make sure to apply it. My GGG PR validation was broken
top-to-bottom due to other environmental issues so I had not seen
this failure myself.
Thanks,
-Stolee
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): Derrick Stolee <[email protected]> writes:
> On 4/24/2025 3:41 PM, Junio C Hamano wrote:
>> I needed this to make
>>
>> $ SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=true make
>> $ cd t && sh t5309-pack-delta-cycles.sh
>>
>> pass.
>> --- >8 ------ >8 ------ >8 ---
>> Subject: [PATCH] fixup! test-tool: add pack-deltas helper
>>
>> t/helper/test-pack-deltas.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
>> index db7d1c3cd1..c8e837ea06 100644
>> --- a/t/helper/test-pack-deltas.c
>> +++ b/t/helper/test-pack-deltas.c
>> @@ -122,6 +122,7 @@ int cmd__pack_deltas(int argc, const char **argv)
>> if (get_oid_hex(base_oid_str, &base_oid))
>> die("invalid object: %s", base_oid_str);
>> }
>> + string_list_clear(&items, 0);
>
> Thanks. I'll make sure to apply it. My GGG PR validation was broken
> top-to-bottom due to other environmental issues so I had not seen
> this failure myself.
I squashed this in so unless there are other things you need to
change, this alone does not make it necessary to reroll the series.
Thanks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Apr 23, 2025 at 05:40:02PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
> new file mode 100644
> index 00000000000..db7d1c3cd1f
> --- /dev/null
> +++ b/t/helper/test-pack-deltas.c
> @@ -0,0 +1,140 @@
[snip]
> +int cmd__pack_deltas(int argc, const char **argv)
> +{
> + int N;
> + struct hashfile *f;
> + struct strbuf line = STRBUF_INIT;
> +
> + if (argc != 2) {
> + usage(usage_str);
> + return -1;
> + }
> +
> + N = atoi(argv[1]);
Is there a reason why we don't use `parse_options()` here? It might make
this tool easier to use and extend going forward, and we wouldn't have
to care about invalid arguments. Right now, we silently accept a
non-integer argument and do the wrong thing.
Patrick There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Johannes Schindelin wrote (reply to this): Hi Patrick,
On Fri, 25 Apr 2025, Patrick Steinhardt wrote:
> On Wed, Apr 23, 2025 at 05:40:02PM +0000, Derrick Stolee via GitGitGadget wrote:
> > diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
> > new file mode 100644
> > index 00000000000..db7d1c3cd1f
> > --- /dev/null
> > +++ b/t/helper/test-pack-deltas.c
> > @@ -0,0 +1,140 @@
> [snip]
> > +int cmd__pack_deltas(int argc, const char **argv)
> > +{
> > + int N;
> > + struct hashfile *f;
> > + struct strbuf line = STRBUF_INIT;
> > +
> > + if (argc != 2) {
> > + usage(usage_str);
> > + return -1;
> > + }
> > +
> > + N = atoi(argv[1]);
>
> Is there a reason why we don't use `parse_options()` here? It might make
> this tool easier to use and extend going forward, and we wouldn't have
> to care about invalid arguments. Right now, we silently accept a
> non-integer argument and do the wrong thing.
I think that `parse_options()` would be overkill here because:
- This is a _mandatory_ argument, not an optional one.
- The required data type is `uint32_t`, and `parse_options()` has no
support for that.
But you do have a good point in that we may want to validate the data type
(even if technically, this is not a user-facing program, it's a test
helper that is used under tight control by Git's own test suite).
Consequently, I would suggest this fixup instead:
-- snipsnap --
Subject: [PATCH] fixup! test-tool: add pack-deltas helper
Let's make the command-line parsing a bit more stringent. We _could_
use `parse_options()`, but that would be overkill for a single,
non-optional argument. Besides, it would not bring any benefit, as the
parsed value needs to fit in the `uint32_t` type, and `parse_options()`
has no provision to ensure that.
Signed-off-by: Johannes Schindelin <[email protected]>
---
t/helper/test-pack-deltas.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
index 4af69bdc05d3..f95d8ee16768 100644
--- a/t/helper/test-pack-deltas.c
+++ b/t/helper/test-pack-deltas.c
@@ -8,11 +8,12 @@
#include "hex.h"
#include "pack.h"
#include "pack-objects.h"
+#include "parse.h"
#include "setup.h"
#include "strbuf.h"
#include "string-list.h"
-static const char usage_str[] = "test-tool pack-deltas <n>";
+static const char usage_str[] = "test-tool pack-deltas <nr_entries>";
static unsigned long do_compress(void **pptr, unsigned long size)
{
@@ -79,7 +80,7 @@ static void write_ref_delta(struct hashfile *f,
int cmd__pack_deltas(int argc, const char **argv)
{
- int N;
+ unsigned long n;
struct hashfile *f;
struct strbuf line = STRBUF_INIT;
@@ -88,12 +89,13 @@ int cmd__pack_deltas(int argc, const char **argv)
return -1;
}
- N = atoi(argv[1]);
+ if (!git_parse_ulong(argv[1], &n) || n != (uint32_t)n)
+ die("invalid number of objects: %s", argv[1]);
setup_git_directory();
f = hashfd(1, "<stdout>");
- write_pack_header(f, N);
+ write_pack_header(f, n);
/* Read each line from stdin into 'line' */
while (strbuf_getline_lf(&line, stdin) != EOF) {
--
2.49.0.windows.1
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Fri, Apr 25, 2025 at 11:34:01AM +0200, Johannes Schindelin wrote:
> Hi Patrick,
>
> On Fri, 25 Apr 2025, Patrick Steinhardt wrote:
>
> > On Wed, Apr 23, 2025 at 05:40:02PM +0000, Derrick Stolee via GitGitGadget wrote:
> > > diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
> > > new file mode 100644
> > > index 00000000000..db7d1c3cd1f
> > > --- /dev/null
> > > +++ b/t/helper/test-pack-deltas.c
> > > @@ -0,0 +1,140 @@
> > [snip]
> > > +int cmd__pack_deltas(int argc, const char **argv)
> > > +{
> > > + int N;
> > > + struct hashfile *f;
> > > + struct strbuf line = STRBUF_INIT;
> > > +
> > > + if (argc != 2) {
> > > + usage(usage_str);
> > > + return -1;
> > > + }
> > > +
> > > + N = atoi(argv[1]);
> >
> > Is there a reason why we don't use `parse_options()` here? It might make
> > this tool easier to use and extend going forward, and we wouldn't have
> > to care about invalid arguments. Right now, we silently accept a
> > non-integer argument and do the wrong thing.
>
> I think that `parse_options()` would be overkill here because:
>
> - This is a _mandatory_ argument, not an optional one.
>
> - The required data type is `uint32_t`, and `parse_options()` has no
> support for that.
Support for that has been merged just this week via 2bc5414c411 (Merge
branch 'ps/parse-options-integers', 2025-04-24).
> But you do have a good point in that we may want to validate the data type
> (even if technically, this is not a user-facing program, it's a test
> helper that is used under tight control by Git's own test suite).
>
> Consequently, I would suggest this fixup instead:
But in any case, I'd be equally fine with your suggestion.
Patrick There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Johannes Schindelin wrote (reply to this): Hi Patrick,
On Fri, 25 Apr 2025, Patrick Steinhardt wrote:
> On Fri, Apr 25, 2025 at 11:34:01AM +0200, Johannes Schindelin wrote:
> >
> > On Fri, 25 Apr 2025, Patrick Steinhardt wrote:
> >
> > > On Wed, Apr 23, 2025 at 05:40:02PM +0000, Derrick Stolee via GitGitGadget wrote:
> > > > diff --git a/t/helper/test-pack-deltas.c b/t/helper/test-pack-deltas.c
> > > > new file mode 100644
> > > > index 00000000000..db7d1c3cd1f
> > > > --- /dev/null
> > > > +++ b/t/helper/test-pack-deltas.c
> > > > @@ -0,0 +1,140 @@
> > > [snip]
> > > > +int cmd__pack_deltas(int argc, const char **argv)
> > > > +{
> > > > + int N;
> > > > + struct hashfile *f;
> > > > + struct strbuf line = STRBUF_INIT;
> > > > +
> > > > + if (argc != 2) {
> > > > + usage(usage_str);
> > > > + return -1;
> > > > + }
> > > > +
> > > > + N = atoi(argv[1]);
> > >
> > > Is there a reason why we don't use `parse_options()` here? It might make
> > > this tool easier to use and extend going forward, and we wouldn't have
> > > to care about invalid arguments. Right now, we silently accept a
> > > non-integer argument and do the wrong thing.
> >
> > I think that `parse_options()` would be overkill here because:
> >
> > - This is a _mandatory_ argument, not an optional one.
> >
> > - The required data type is `uint32_t`, and `parse_options()` has no
> > support for that.
>
> Support for that has been merged just this week via 2bc5414c411 (Merge
> branch 'ps/parse-options-integers', 2025-04-24).
That's too new for me to be useful, as I have to work on top of v2.49.0
(see https://github.com/microsoft/git/pull/745).
> > But you do have a good point in that we may want to validate the data type
> > (even if technically, this is not a user-facing program, it's a test
> > helper that is used under tight control by Git's own test suite).
> >
> > Consequently, I would suggest this fixup instead:
>
> But in any case, I'd be equally fine with your suggestion.
Thanks!
Johannes There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): Patrick Steinhardt <[email protected]> writes:
>> > Is there a reason why we don't use `parse_options()` here? It might make
>> > this tool easier to use and extend going forward, and we wouldn't have
>> > to care about invalid arguments. Right now, we silently accept a
>> > non-integer argument and do the wrong thing.
>>
>> I think that `parse_options()` would be overkill here because:
>>
>> - This is a _mandatory_ argument, not an optional one.
>>
>> - The required data type is `uint32_t`, and `parse_options()` has no
>> support for that.
>
> Support for that has been merged just this week via 2bc5414c411 (Merge
> branch 'ps/parse-options-integers', 2025-04-24).
>
>> But you do have a good point in that we may want to validate the data type
>> (even if technically, this is not a user-facing program, it's a test
>> helper that is used under tight control by Git's own test suite).
>>
>> Consequently, I would suggest this fixup instead:
>
> But in any case, I'd be equally fine with your suggestion.
Yeah, I think we clearly showed our "it's just test helper, whose
callers are supposed to know what they are doing" attitude, but with
proper helpers, it is not too much additional effort to do the right
thing.
Thanks, both. |
||
TEST_BUILTINS_OBJS += test-name-hash.o | ||
TEST_BUILTINS_OBJS += test-online-cpus.o | ||
TEST_BUILTINS_OBJS += test-pack-deltas.o | ||
TEST_BUILTINS_OBJS += test-pack-mtimes.o | ||
TEST_BUILTINS_OBJS += test-parse-options.o | ||
TEST_BUILTINS_OBJS += test-parse-pathspec-file.o | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1109,8 +1109,8 @@ static void *threaded_second_pass(void *data) | |
set_thread_data(data); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> From: Derrick Stolee <[email protected]>
>
> As detailed in the previous changes to t5309-pack-delta-cycles.sh, the
> logic within 'git index-pack' to analyze an incoming thin packfile with
> REF_DELTAs is suspect. The algorithm is overly cautious around delta
> cycles, and that leads in fact to failing even when there is no cycle.
>
> This change adjusts the algorithm to no longer fail in these cases. In
> fact, these cycle cases will no longer fail but more importantly the
> valid cases will no longer fail, either. The resulting packfile from the
> --fix-thin operation will not have cycles either since REF_DELTAs are
> forbidden from the on-disk format and OFS_DELTAs are impossible to write
> as a cycle.
Loosening cycle avoidance indeed is worrysome. How do we guarantee
"since REF_DELTAs are forbidden from the on-disk format" (it is
obvious with OFS_DELTA only you cannot form a cycle)? By code
inspection and in-code comment for the code path of "--fix-thin"?
I think the most interesting question would be how, with the
loosening of the cycle check in this patch, we would still protect
against a malicious on-the-wire packdata that has a cycle in it
already, where deltas cannot be resolved. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Derrick Stolee wrote (reply to this): On 4/24/2025 5:41 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>
>> From: Derrick Stolee <[email protected]>
>>
>> As detailed in the previous changes to t5309-pack-delta-cycles.sh, the
>> logic within 'git index-pack' to analyze an incoming thin packfile with
>> REF_DELTAs is suspect. The algorithm is overly cautious around delta
>> cycles, and that leads in fact to failing even when there is no cycle.
>>
>> This change adjusts the algorithm to no longer fail in these cases. In
>> fact, these cycle cases will no longer fail but more importantly the
>> valid cases will no longer fail, either. The resulting packfile from the
>> --fix-thin operation will not have cycles either since REF_DELTAs are
>> forbidden from the on-disk format and OFS_DELTAs are impossible to write
>> as a cycle.
>
> Loosening cycle avoidance indeed is worrysome. How do we guarantee
> "since REF_DELTAs are forbidden from the on-disk format" (it is
> obvious with OFS_DELTA only you cannot form a cycle)? By code
> inspection and in-code comment for the code path of "--fix-thin"?
>
> I think the most interesting question would be how, with the
> loosening of the cycle check in this patch, we would still protect
> against a malicious on-the-wire packdata that has a cycle in it
> already, where deltas cannot be resolved.
The good news is two-fold:
1. We have a test in t5309 (index-pack detects REF_DELTA cycles) that
checks this case and did not change behavior with this patch.
2. If we don't already have any of the objects that exist in the cycle,
then we can't _start_ expanding the objects to their full contents
as we don't have their bases. So what is left is a list of REF_DELTAs
that have not been processed but also a list of base object IDs that
we can't load. We'll fail without writing those objects to disk.
Does that satisfy your concerns in this space?
Thanks,
-Stolee
|
||
for (;;) { | ||
struct base_data *parent = NULL; | ||
struct object_entry *child_obj; | ||
struct base_data *child; | ||
struct object_entry *child_obj = NULL; | ||
struct base_data *child = NULL; | ||
|
||
counter_lock(); | ||
display_progress(progress, nr_resolved_deltas); | ||
|
@@ -1137,15 +1137,18 @@ static void *threaded_second_pass(void *data) | |
parent = list_first_entry(&work_head, struct base_data, | ||
list); | ||
|
||
if (parent->ref_first <= parent->ref_last) { | ||
while (parent->ref_first <= parent->ref_last) { | ||
int offset = ref_deltas[parent->ref_first++].obj_no; | ||
child_obj = objects + offset; | ||
if (child_obj->real_type != OBJ_REF_DELTA) | ||
die("REF_DELTA at offset %"PRIuMAX" already resolved (duplicate base %s?)", | ||
(uintmax_t) child_obj->idx.offset, | ||
oid_to_hex(&parent->obj->idx.oid)); | ||
if (child_obj->real_type != OBJ_REF_DELTA) { | ||
child_obj = NULL; | ||
continue; | ||
} | ||
child_obj->real_type = parent->obj->real_type; | ||
} else { | ||
break; | ||
} | ||
|
||
if (!child_obj && parent->ofs_first <= parent->ofs_last) { | ||
child_obj = objects + | ||
ofs_deltas[parent->ofs_first++].obj_no; | ||
assert(child_obj->real_type == OBJ_OFS_DELTA); | ||
|
@@ -1178,29 +1181,32 @@ static void *threaded_second_pass(void *data) | |
} | ||
work_unlock(); | ||
|
||
if (parent) { | ||
child = resolve_delta(child_obj, parent); | ||
if (!child->children_remaining) | ||
FREE_AND_NULL(child->data); | ||
} else { | ||
child = make_base(child_obj, NULL); | ||
if (child->children_remaining) { | ||
/* | ||
* Since this child has its own delta children, | ||
* we will need this data in the future. | ||
* Inflate now so that future iterations will | ||
* have access to this object's data while | ||
* outside the work mutex. | ||
*/ | ||
child->data = get_data_from_pack(child_obj); | ||
child->size = child_obj->size; | ||
if (child_obj) { | ||
if (parent) { | ||
child = resolve_delta(child_obj, parent); | ||
if (!child->children_remaining) | ||
FREE_AND_NULL(child->data); | ||
} else{ | ||
child = make_base(child_obj, NULL); | ||
if (child->children_remaining) { | ||
/* | ||
* Since this child has its own delta children, | ||
* we will need this data in the future. | ||
* Inflate now so that future iterations will | ||
* have access to this object's data while | ||
* outside the work mutex. | ||
*/ | ||
child->data = get_data_from_pack(child_obj); | ||
child->size = child_obj->size; | ||
} | ||
} | ||
} | ||
|
||
work_lock(); | ||
if (parent) | ||
parent->retain_data--; | ||
if (child->data) { | ||
|
||
if (child && child->data) { | ||
/* | ||
* This child has its own children, so add it to | ||
* work_head. | ||
|
@@ -1209,7 +1215,7 @@ static void *threaded_second_pass(void *data) | |
base_cache_used += child->size; | ||
prune_base_data(NULL); | ||
free_base_data(child); | ||
} else { | ||
} else if (child) { | ||
/* | ||
* This child does not have its own children. It may be | ||
* the last descendant of its ancestors; free those | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
#define USE_THE_REPOSITORY_VARIABLE | ||
|
||
#include "test-tool.h" | ||
#include "git-compat-util.h" | ||
#include "delta.h" | ||
#include "git-zlib.h" | ||
#include "hash.h" | ||
#include "hex.h" | ||
#include "pack.h" | ||
#include "pack-objects.h" | ||
#include "setup.h" | ||
#include "strbuf.h" | ||
#include "string-list.h" | ||
|
||
static const char usage_str[] = "test-tool pack-deltas <n>"; | ||
|
||
static unsigned long do_compress(void **pptr, unsigned long size) | ||
{ | ||
git_zstream stream; | ||
void *in, *out; | ||
unsigned long maxsize; | ||
|
||
git_deflate_init(&stream, 1); | ||
maxsize = git_deflate_bound(&stream, size); | ||
|
||
in = *pptr; | ||
out = xmalloc(maxsize); | ||
*pptr = out; | ||
|
||
stream.next_in = in; | ||
stream.avail_in = size; | ||
stream.next_out = out; | ||
stream.avail_out = maxsize; | ||
while (git_deflate(&stream, Z_FINISH) == Z_OK) | ||
; /* nothing */ | ||
git_deflate_end(&stream); | ||
|
||
free(in); | ||
return stream.total_out; | ||
} | ||
|
||
static void write_ref_delta(struct hashfile *f, | ||
struct object_id *oid, | ||
struct object_id *base) | ||
{ | ||
unsigned char header[MAX_PACK_OBJECT_HEADER]; | ||
unsigned long size, base_size, delta_size, compressed_size, hdrlen; | ||
enum object_type type; | ||
void *base_buf, *delta_buf; | ||
void *buf = repo_read_object_file(the_repository, | ||
oid, &type, | ||
&size); | ||
|
||
if (!buf) | ||
die("unable to read %s", oid_to_hex(oid)); | ||
|
||
base_buf = repo_read_object_file(the_repository, | ||
base, &type, | ||
&base_size); | ||
|
||
if (!base_buf) | ||
die("unable to read %s", oid_to_hex(base)); | ||
|
||
delta_buf = diff_delta(base_buf, base_size, | ||
buf, size, &delta_size, 0); | ||
|
||
compressed_size = do_compress(&delta_buf, delta_size); | ||
|
||
hdrlen = encode_in_pack_object_header(header, sizeof(header), | ||
OBJ_REF_DELTA, delta_size); | ||
hashwrite(f, header, hdrlen); | ||
hashwrite(f, base->hash, the_repository->hash_algo->rawsz); | ||
hashwrite(f, delta_buf, compressed_size); | ||
|
||
free(buf); | ||
free(base_buf); | ||
free(delta_buf); | ||
} | ||
|
||
int cmd__pack_deltas(int argc, const char **argv) | ||
{ | ||
int N; | ||
struct hashfile *f; | ||
struct strbuf line = STRBUF_INIT; | ||
|
||
if (argc != 2) { | ||
usage(usage_str); | ||
return -1; | ||
} | ||
|
||
N = atoi(argv[1]); | ||
|
||
setup_git_directory(); | ||
|
||
f = hashfd(the_repository->hash_algo, 1, "<stdout>"); | ||
write_pack_header(f, N); | ||
|
||
/* Read each line from stdin into 'line' */ | ||
while (strbuf_getline_lf(&line, stdin) != EOF) { | ||
const char *type_str, *content_oid_str, *base_oid_str = NULL; | ||
struct object_id content_oid, base_oid; | ||
struct string_list items = STRING_LIST_INIT_NODUP; | ||
/* | ||
* Tokenize into two or three parts: | ||
* 1. REF_DELTA, OFS_DELTA, or FULL. | ||
* 2. The object ID for the content object. | ||
* 3. The object ID for the base object (optional). | ||
*/ | ||
if (string_list_split_in_place(&items, line.buf, " ", 3) < 0) | ||
die("invalid input format: %s", line.buf); | ||
|
||
if (items.nr < 2) | ||
die("invalid input format: %s", line.buf); | ||
|
||
type_str = items.items[0].string; | ||
content_oid_str = items.items[1].string; | ||
|
||
if (get_oid_hex(content_oid_str, &content_oid)) | ||
die("invalid object: %s", content_oid_str); | ||
if (items.nr >= 3) { | ||
base_oid_str = items.items[2].string; | ||
if (get_oid_hex(base_oid_str, &base_oid)) | ||
die("invalid object: %s", base_oid_str); | ||
} | ||
|
||
if (!strcmp(type_str, "REF_DELTA")) | ||
write_ref_delta(f, &content_oid, &base_oid); | ||
else if (!strcmp(type_str, "OFS_DELTA")) | ||
die("OFS_DELTA not implemented"); | ||
else if (!strcmp(type_str, "FULL")) | ||
die("FULL not implemented"); | ||
else | ||
die("unknown pack type: %s", type_str); | ||
} | ||
|
||
finalize_hashfile(f, NULL, FSYNC_COMPONENT_PACK, | ||
CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); | ||
strbuf_release(&line); | ||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):