Re: [PATCH v10 8/8] cat-file: add remote-object-info to batch-command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 21, 2025 at 10:34:44AM -0500, Peijian Ju wrote:

> Thank you. Revised to use xstrdup() in v11.
> 
> >   2. Are there any bounds on the size of "line"? E.g., is it coming in
> >      as a single pkt, or can it be arbitrarily large if an attacker
> >      wants (it looks like maybe the latter, since it comes from a strbuf
> >      in batch_objects_command(), but I didn't look at how network data
> >      gets passed in to that). At any rate, I think we ran into problems
> >      before with split_cmdline() and integer overflow, since it returns
> >      an int (CVE-2022-39260). I thought we fixed it by rejecting long
> >      lines in git-shell, but it looks like we also hardened
> >      split_cmdline() in 0ca6ead81e (alias.c: reject too-long cmdline
> >      strings in split_cmdline(), 2022-09-28).
> >
> >      So we are maybe OK, but I wonder if we should punt on absurd lines.
> >      Related, can an attacker just flood input into that strbuf, making
> >      it grow forever and waste memory? That's just a simple resource
> >      attack, but we have tried to avoid those elsewhere in upload-pack,
> >      etc.
> >
> 
> Thank you. Adding a check in v11 for the length of `lines`. Please let
> me know if something like this makes sense:
> 
> if (strlen(line) >= INT_MAX) {
>         die(_("remote-object-info command input overflow"));
> }

I took a look at what you ended up with in v11, and...I think I totally
misunderstood what was going on in your series, or when this code would
be run.

I had thought the cat-file here was running on the server side, and that
we needed to protect ourselves against malicious clients. But your new
parse_cmd_remote_object_info() is purely a client-side function that
will then access the server behind the scenes. And its input will be
coming from the stdin of cat-file locally.

So I'm not sure that we need to protect it unless we think there's some
way that an attacker can automatically trigger arbitrary
remote-object-info requests.

That said, I'm not sure why you need split_cmdline() at all. The format
seems to be:

  remote-object-info <url> <oid>...

The only thing that _might_ need quoting is the url, but is shell
quoting a reasonable thing there? I'd think that it would be
URL-encoded, and thus contain no spaces. The <oid> has to be a real full
oid, I think, because the object-info on the server side insists on
that.

So why not just split on space? Something like this:

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 9de1016acd..aedbcba347 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -597,7 +597,7 @@ static void batch_one_object(const char *obj_name,
 	object_context_release(&ctx);
 }
 
-static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+static int get_remote_info(struct batch_options *opt, const char *url, const char *oid_list)
 {
 	int retval = 0;
 	struct remote *remote = NULL;
@@ -613,16 +613,19 @@ static int get_remote_info(struct batch_options *opt, int argc, const char **arg
 	if (!opt->format)
 		opt->format = "%(objectname) %(objectsize)";
 
-	remote = remote_get(argv[0]);
+	remote = remote_get(url);
 	if (!remote)
 		die(_("must supply valid remote when using remote-object-info"));
 
 	oid_array_clear(&object_info_oids);
-	for (size_t i = 1; i < argc; i++) {
-		if (get_oid_hex(argv[i], &oid))
-			die(_("Not a valid object name %s"), argv[i]);
+	while (*oid_list) {
+		if (parse_oid_hex(oid_list, &oid, &oid_list))
+			die(_("Not a valid object name %s"), oid_list);
 		oid_array_append(&object_info_oids, &oid);
+		while (*oid_list == ' ')
+			oid_list++;
 	}
+
 	if (!object_info_oids.nr)
 		die(_("remote-object-info requires objects"));
 
@@ -747,21 +750,15 @@ static void parse_cmd_remote_object_info(struct batch_options *opt,
 					 const char *line, struct strbuf *output,
 					 struct expand_data *data)
 {
-	int count;
-	const char **argv;
-	char *line_to_split;
-
-	if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
-		die(_("remote-object-info command input overflow "
-			"(no more than %d objects are allowed)"),
-			MAX_ALLOWED_OBJ_LIMIT);
+	char *url;
+	const char *space;
 
-	line_to_split = xstrdup(line);
-	count = split_cmdline(line_to_split, &argv);
-	if (count < 0)
-		die(_("split remote-object-info command"));
+	space = strchr(line, ' ');
+	if (!space)
+		return; /* report error somehow? */
+	url = xmemdupz(line, space - line);
 
-	if (get_remote_info(opt, count, argv))
+	if (get_remote_info(opt, url, space + 1))
 		goto cleanup;
 
 	data->skip_object_info = 1;
@@ -774,16 +771,15 @@ static void parse_cmd_remote_object_info(struct batch_options *opt,
 			 */
 			data->size = *remote_object_info[i].sizep;
 			opt->batch_mode = BATCH_MODE_INFO;
-			batch_object_write(argv[i+1], output, opt, data, NULL, 0);
+			batch_object_write(oid_to_hex(&data->oid), output, opt, data, NULL, 0);
 		}
 	}
 	data->skip_object_info = 0;
 
 cleanup:
 	for (size_t i = 0; i < object_info_oids.nr; i++)
 		free_object_info_contents(&remote_object_info[i]);
-	free(line_to_split);
-	free(argv);
+	free(url);
 	free(remote_object_info);
 }
 

You'd need to adjust t1017 to remote the quotes from the inputs, and I
think you'd have to correctly url-encoded the file:// one to avoid
spaces (but that is technically true already! If the filesystem path has
a "%" in it, it would be misinterpreted).

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux