Re: [PATCH 2/5] doc: git-add: start man page with an example

"D. Ben Knoble" <ben.knoble@xxxxxxxxx> · Wed, 13 Aug 2025 13:22:34 -0400

On Tue, Aug 12, 2025 at 5:40 PM Julia Evans <julia@xxxxxxx> wrote:
>
> > But isn't it the source of the most end-user confusion that they
> > cannot wean themselves off of the diff/patch worldview?
>
> To me it feels very contextual! My impression is that what's important for Git
> users is to be able to think about commits as diffs in some contexts, and as
> snapshots in other contexts. For example with `git rebase` I'm usually thinking
> of my commits as diffs, but it's very helpful to me to think of a merge commit
> as a snapshot, because the merge commit does not have to be a "combination" of
> the two sides of the merge, it can have arbitrary extra content.
>
[snip]
>
> >> +By default, `git commit` only commits changes that you've added to the
> >> +index. For example, if you've edited `file.c` and want to commit your
> >> +changes, you can run:
> >> +
> >> +   git add file.c
> >> +   git commit
> >
> > What happens when you did "edit && add && edit && add"?  It commits
> > the two changes you added to the index?  I do not think it is
> > productive to hide the fact that you are preparing a snapshot of the
> > "next commit" in the index (or "staging the contents for the next
> > commit in the staging area") with various forms "git add", including
> > "git add -p".
>
> It could! It's easy for me to imagine a world where the index
> stores an ordered list of diffs, which are applied as patches in
> series when I commit. I guess you'd need some sort of
> patch + patch + patch + diff workflow to generate the final diff,
> but to me that doesn't feel so different from what Git is actually doing in
> practice.
>
> In any case, I'll think more about whether I think this is really
> an accurate description. I'm always especially interested in the practical
> consequences of having misconceptions about Git: for example (and maybe I'm
> convincing myself to change my position here!) with `git mv` I think it can
> become relevant pretty quickly that commits are snapshots, because if
> you move a file and edit it then Git can't always accurately guess that you
> intended to "move" the file rather than delete the file and create a new one.
>
> I'd like to be able to have a similarly practical example of why it's important
> to think of commits as snapshots in the context of `git add` but I haven't quite
> found the right one yet. I've noticed that people will often sort of "reject"
> information that does not fit their mental models, and I think "commits are
> snapshots, this is important in this context because of
> <specific practical consequence>" is much more convincing than just
> "commits are snapshots".

Less a comment on this patch or diff ;) and more a meta-note: I happen
to have several links saved on the idea of "Snapshot vs. Patch" aka
"commit duality", so I figured I'd share. They reinforce to me, at
least, that the contextual mode of thinking is useful in practice,
even if the snapshot model is the (semantic) storage model [*].
Knowing about snapshots does make it far easier to interact with
objects directly, which also frequently helps me better understand how
to use particular commands.
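
As a rough sketch of what "interacting with objects directly" can look
like, the script below builds a throwaway repository and inspects a
commit and its tree. The file names, identity, and commit messages are
made up for illustration; only standard Git commands are used.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo one > a.txt
echo other > b.txt
git add a.txt b.txt
git commit -q -m "first"

echo two > a.txt
git add a.txt
git commit -q -m "second: only a.txt changed"

# The commit object names a tree (plus parent, author, message), not a diff.
git cat-file -p HEAD

# The tree lists every tracked file, including b.txt, which the second
# commit did not touch.
git cat-file -p 'HEAD^{tree}'
files_in_head=$(git ls-tree --name-only HEAD)

# The unchanged file resolves to the same blob ID in both snapshots.
b_first=$(git rev-parse 'HEAD~1:b.txt')
b_second=$(git rev-parse 'HEAD:b.txt')
echo "b.txt in first commit:  $b_first"
echo "b.txt in second commit: $b_second"
```

The second commit "changed one file" in the diff view, yet its tree
still describes the whole project.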

- https://www.thirtythreeforty.net/posts/2020/01/the-wave-particle-duality-of-git-commits/
- https://roadrunnertwice.dreamwidth.org/596185.html (which references
  Julia's work)
- of course, https://jvns.ca/blog/2024/01/05/do-we-think-of-git-commits-as-diffs--snapshots--or-histories/ ;)
- https://stackoverflow.com/q/40617288/4400820,
  https://stackoverflow.com/q/73646342/4400820,
  https://stackoverflow.com/a/27760319/4400820
- https://github.blog/open-source/git/commits-are-snapshots-not-diffs/
- https://lore.kernel.org/git/alpine.LFD.0.98.0705090856220.4062@xxxxxxxxxxxxxxxxxxxxxxxxxx/

What I find is that, while we keep trying to reinforce the snapshot
mentality, there are situations where thinking in diffs is a
reasonable approximation. In the particular case of git-add, most
interactions I observe with the index are diff-based (git diff, git
diff --cached, etc.), but I'm not sure how to usefully clarify the
relationship between those things and the underlying trees involved
(working tree, HEAD, index :0:) in a manual section targeted primarily
at newcomers.
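
One possible way to line those diff commands up with the three trees,
using the "edit && add && edit" scenario from the quoted text, is
sketched below. It runs in a throwaway repository; the identity, commit
message, and file contents are assumptions made for the example.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo one > file.c
git add file.c
git commit -q -m "initial"

echo two > file.c      # edit
git add file.c         # add: the index now holds the content "two"
echo three > file.c    # edit again, without adding

# Three snapshots of the same path:
head_content=$(git show HEAD:file.c)   # as recorded in HEAD
index_content=$(git show :0:file.c)    # as recorded in the index (stage 0)
worktree_content=$(cat file.c)         # as it sits in the working tree
echo "HEAD=$head_content index=$index_content worktree=$worktree_content"

# Each diff command compares two of those snapshots:
git diff --cached    # HEAD  vs. index
git diff             # index vs. working tree
git diff HEAD        # HEAD  vs. working tree
```

Running `git commit` at this point would record "two", the content
held in the index, not "three".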

[*]: "Semantic" because deltas in packfiles muddy the _actual_ storage
model somewhat :)
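
A small sketch of why the snapshot model can still be called the
semantic one: repacking may change how an object is stored on disk, but
reading it back by ID gives the same content. The repository, identity,
and file below are made up for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo one > file.c
git add file.c
git commit -q -m "initial"

blob=$(git rev-parse HEAD:file.c)
before=$(git cat-file -p "$blob")

git count-objects -v    # object counts before repacking
git gc -q               # repack; packed objects may be stored as deltas
git count-objects -v    # object counts after repacking

# Same object ID, same content, whatever the on-disk representation.
after=$(git cat-file -p "$blob")
echo "before=$before after=$after"
```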

-- 
D. Ben Knoble