Re: [PATCH] status: add --json output format to git status

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tachera

On 30/07/2025 10:15, Tachera W sasi wrote:

For v2, I will:
  * Rework the patch to use `json-writer.h` so escaping and JSON compliance are handled
     properly.

That will handle the escaping problems but we still have the problem that filenames and refnames are not guaranteed to be utf-8 encoded. I think the only way to handle that is to base64 encode them so that the caller can retrieve the raw bytes. See https://lore.kernel.org/git/ZvXMSKaUWWA-MG9J@xxxxxxxxxxxxxxxxxxxxxxxxxxxx/ for a previous discussion about this.

  * Add tests covering filenames with quotes, newlines, and non-UTF-8 cases.
   * Include tests under `t/` to prevent future regressions.

That's great, you should update Documentation/git-status.adoc as well to document the new option and output format

Thanks

Phillip

Thanks again for your review and for pointing me towards the proper helpers.

Best regards,
Tachera Sasi

On Wed, Jul 30, 2025 at 12:01 PM Phillip Wood <phillip.wood123@xxxxxxxxx <mailto:phillip.wood123@xxxxxxxxx>> wrote:

    On 30/07/2025 07:27, Tach via GitGitGadget wrote:
     > From: tacherasasi <tacherasasi@xxxxxxxxx
    <mailto:tacherasasi@xxxxxxxxx>>
     >
     > Add a new --json flag to 'git status' that outputs repository state
     > in a structured JSON format. This enables reliable machine parsing
     > of status information for tools and automation.
     >
     > The JSON output includes:
     > - Branch information (name, detached state, ahead/behind counts)
     > - Staged files array
     > - Unstaged files array
     > - Untracked files array
     > - Ignored files array (with --ignored flag)
     >
     > Implementation details:
     > - Add STATUS_FORMAT_JSON to wt_status_format enum
     > - Add JSON output option to git status and git commit
     > - Implement JSON formatting helpers for arrays and branch info
     > - Structure output for easy parsing and future extensibility
     >
     > Example:
     >    $ git status --json
     >    {
     >      "branch": {
     >        "current": "main",
     >        "detached": false
     >      },
     >      "staged": ["file1.txt"],
     >      "unstaged": ["file2.txt"],
     >      "untracked": ["file3.txt"]
     >    }
     >
     > This provides a robust alternative to parsing traditional output
     > formats, making it easier to build reliable tools and automation
     > around Git status information.

    What problem are you having paring the current output? You say this
    patch provides a robust alternative but I'm afraid it does not seem to
    handle filenames containing newlines or double quotes and it assumes
    they are utf-8 encoded. It also seems to print a trailing comma if an
    array has more than one item which is not allowed by the json spec. We
    already have support for generating json output as documented
    json-writer.h which will help with most of that but not the filename
    encoding problem. New features should also be accompanied by tests to
    prevent future regressions. I'm not opposed to adding json output if it
    is robust and handles non utf-8 filenames but it would be helpful to
    understand what problems you are having parsing the output of "git
    status --porcelain[=v2]".

    Thanks

    Phillip

     > Signed-off-by: Tachera Sasi <tachera@xxxxxxxxxx
    <mailto:tachera@xxxxxxxxxx>>
     > Signed-off-by: tacherasasi <tacherasasi@xxxxxxxxx
    <mailto:tacherasasi@xxxxxxxxx>>
     > ---
     >      status: add --json output format to git status
     >
     > Published-As: https://github.com/gitgitgadget/git/releases/tag/
    pr-1937%2FtacheraSasi%2Ffeature%2Fjson-status-output-v1 <https://
    github.com/gitgitgadget/git/releases/tag/
    pr-1937%2FtacheraSasi%2Ffeature%2Fjson-status-output-v1>
     > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git
    <https://github.com/gitgitgadget/git> pr-1937/tacheraSasi/feature/
    json-status-output-v1
     > Pull-Request: https://github.com/gitgitgadget/git/pull/1937
    <https://github.com/gitgitgadget/git/pull/1937>
     >
     >   builtin/commit.c |  9 ++++-
     >   wt-status.c      | 98 +++++++++++++++++++++++++++++++++++++++++
    +++++++
     >   wt-status.h      |  1 +
     >   3 files changed, 107 insertions(+), 1 deletion(-)
     >
     > diff --git a/builtin/commit.c b/builtin/commit.c
     > index fba0dded64a..f1db4fdfd9a 100644
     > --- a/builtin/commit.c
     > +++ b/builtin/commit.c
     > @@ -1540,6 +1540,9 @@ struct repository *repo UNUSED)
     >               OPT_SET_INT(0, "long", &status_format,
     >                           N_("show status in long format (default)"),
     >                           STATUS_FORMAT_LONG),
     > +             OPT_SET_INT(0, "json", &status_format,
     > +                 N_("show status in JSON format"),
     > +                 STATUS_FORMAT_JSON),
     >               OPT_BOOL('z', "null", &s.null_termination,
     >                        N_("terminate entries with NUL")),
     >               {
     > @@ -1603,7 +1606,8 @@ struct repository *repo UNUSED)
     >                      prefix, argv);
     >
     >       if (status_format != STATUS_FORMAT_PORCELAIN &&
     > -         status_format != STATUS_FORMAT_PORCELAIN_V2)
     > +         status_format != STATUS_FORMAT_PORCELAIN_V2 &&
     > +         status_format != STATUS_FORMAT_JSON)
     >               progress_flag = REFRESH_PROGRESS;
     >       repo_read_index(the_repository);
     >       refresh_index(the_repository->index,
     > @@ -1735,6 +1739,9 @@ int cmd_commit(int argc,
     >               OPT_SET_INT(0, "long", &status_format,
     >                           N_("show status in long format (default)"),
     >                           STATUS_FORMAT_LONG),
     > +             OPT_SET_INT(0, "json", &status_format,
     > +                 N_("show status in JSON format"),
     > +                 STATUS_FORMAT_JSON),
     >               OPT_BOOL('z', "null", &s.null_termination,
     >                        N_("terminate entries with NUL")),
     >               OPT_BOOL(0, "amend", &amend, N_("amend previous
    commit")),
     > diff --git a/wt-status.c b/wt-status.c
     > index 454601afa15..7192fb4d057 100644
     > --- a/wt-status.c
     > +++ b/wt-status.c
     > @@ -2564,6 +2564,101 @@ static void wt_porcelain_v2_print(struct
    wt_status *s)
     >       }
     >   }
     >
     > +
     > +static void wt_json_print_string_array(struct wt_status *s,
    const char *name, struct string_list *list)
     > +{
     > +     int i;
     > +     fprintf(s->fp, "  \"%s\": [", name);
     > +     for (i = 0; i < list->nr; i++) {
     > +             if (i > 0)
     > +                     fprintf(s->fp, ", ");
     > +             fprintf(s->fp, "\"%s\"", list->items[i].string);
     > +     }
     > +     fprintf(s->fp, "]");
     > +}
     > +
     > +static void wt_json_print_change_array(struct wt_status *s,
    const char *name, int change_type)
     > +{
     > +     int i;
     > +     struct string_list files = STRING_LIST_INIT_DUP;
     > +
     > +     for (i = 0; i < s->change.nr <http://change.nr>; i++) {
     > +             struct wt_status_change_data *d;
     > +             struct string_list_item *it;
     > +             it = &(s->change.items[i]);
     > +             d = it->util;
     > +
     > +             if ((change_type == WT_STATUS_UPDATED && d-
     >index_status &&
     > +                  d->index_status != DIFF_STATUS_UNMERGED) ||
     > +                 (change_type == WT_STATUS_CHANGED && d-
     >worktree_status &&
     > +                  d->worktree_status != DIFF_STATUS_UNMERGED)) {
     > +                     string_list_append(&files, it->string);
     > +             }
     > +     }
     > +
     > +     wt_json_print_string_array(s, name, &files);
     > +     string_list_clear(&files, 0);
     > +}
     > +
     > +static void wt_json_print_branch_info(struct wt_status *s)
     > +{
     > +     struct branch *branch;
     > +     const char *branch_name;
     > +     int ahead = 0, behind = 0;
     > +
     > +     fprintf(s->fp, "  \"branch\": {\n");
     > +
     > +     if (s->branch && !s->is_initial) {
     > +             if (!strcmp(s->branch, "HEAD")) {
     > +                     fprintf(s->fp, "    \"current\": \"HEAD\",\n");
     > +                     fprintf(s->fp, "    \"detached\": true");
     > +             } else {
     > +                     if (skip_prefix(s->branch, "refs/heads/",
    &branch_name)) {
     > +                             fprintf(s->fp, "    \"current\":
    \"%s\",\n", branch_name);
     > +                             fprintf(s->fp, "    \"detached\":
    false");
     > +
     > +                             branch = branch_get(branch_name);
     > +                             if (branch && branch->merge &&
    branch->merge[0] && branch->merge[0]->dst) {
     > +                                     if (!
    stat_tracking_info(branch, &ahead, &behind, NULL, 0, 0)) {
     > +                                             fprintf(s->fp, ",
    \n    \"ahead\": %d,\n    \"behind\": %d", ahead, behind);
     > +                                     }
     > +                             }
     > +                     } else {
     > +                             fprintf(s->fp, "    \"current\":
    \"%s\",\n", s->branch);
     > +                             fprintf(s->fp, "    \"detached\":
    false");
     > +                     }
     > +             }
     > +     } else {
     > +             fprintf(s->fp, "    \"current\": null,\n");
     > +             fprintf(s->fp, "    \"detached\": false");
     > +     }
     > +
     > +     fprintf(s->fp, "\n  }");
     > +}
     > +
     > +static void wt_json_status_print(struct wt_status *s)
     > +{
     > +     fprintf(s->fp, "{\n");
     > +
     > +     wt_json_print_branch_info(s);
     > +     fprintf(s->fp, ",\n");
     > +
     > +     wt_json_print_change_array(s, "staged", WT_STATUS_UPDATED);
     > +     fprintf(s->fp, ",\n");
     > +
     > +     wt_json_print_change_array(s, "unstaged", WT_STATUS_CHANGED);
     > +     fprintf(s->fp, ",\n");
     > +
     > +     wt_json_print_string_array(s, "untracked", &s->untracked);
     > +
     > +     if (s->ignored.nr <http://ignored.nr> > 0) {
     > +             fprintf(s->fp, ",\n");
     > +             wt_json_print_string_array(s, "ignored", &s->ignored);
     > +     }
     > +
     > +     fprintf(s->fp, "\n}\n");
     > +}
     > +
     >   void wt_status_print(struct wt_status *s)
     >   {
     >       trace2_data_intmax("status", s->repo, "count/changed", s-
     >change.nr <http://change.nr>);
     > @@ -2583,6 +2678,9 @@ void wt_status_print(struct wt_status *s)
     >       case STATUS_FORMAT_PORCELAIN_V2:
     >               wt_porcelain_v2_print(s);
     >               break;
     > +     case STATUS_FORMAT_JSON:
     > +             wt_json_status_print(s);
     > +             break;
     >       case STATUS_FORMAT_UNSPECIFIED:
     >               BUG("finalize_deferred_config() should have been
    called");
     >               break;
     > diff --git a/wt-status.h b/wt-status.h
     > index 4e377ce62b8..e929af832b2 100644
     > --- a/wt-status.h
     > +++ b/wt-status.h
     > @@ -74,6 +74,7 @@ enum wt_status_format {
     >       STATUS_FORMAT_SHORT,
     >       STATUS_FORMAT_PORCELAIN,
     >       STATUS_FORMAT_PORCELAIN_V2,
     > +     STATUS_FORMAT_JSON,
     >
     >       STATUS_FORMAT_UNSPECIFIED
     >   };
     >
     > base-commit: cb3b40381e1d5ee32dde96521ad7cfd68eb308a6






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux