Collisions while cloning (was: Re: renormalize history with smudge/clean-filter, again)

Hi folks,

While investigating and recovering from my problems with renormalizing via
clean/smudge filtering, I stumbled on collisions when creating a fresh clone
of the repo from the server:

   $ LANG= git clone ssh://gitrepos@my.server/repo smart-home-ets5hashes-removed
   Cloning into 'smart-home-ets5hashes-removed'...
   remote: Enumerating objects: 7499, done.
   remote: Counting objects: 100% (7499/7499), done.
   remote: Compressing objects: 100% (3263/3263), done.
   remote: Total 7499 (delta 3955), reused 7109 (delta 3594), pack-reused 0
   Receiving objects: 100% (7499/7499), 140.12 MiB | 10.54 MiB/s, done.
   Resolving deltas: 100% (3955/3955), done.
   Updating files: 100% (1423/1423), done.
   warning: the following paths have collided (e.g. case-sensitive paths
   on a case-insensitive filesystem) and only one from the same
   colliding group is in the working tree:
   
  'Projects/P-0113/B.ets5hash'
  [more files deleted]

This is on linux, so the FS is _not_ case-insensitive.
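
To rule out paths in the tree itself that differ only in case, a check along
these lines can be run inside the clone. This is only a minimal sketch:
case_collisions is a hypothetical helper name, not a git command, and it
assumes one path per line on stdin (paths with unusual characters may be
quoted by git unless core.quotePath is turned off).

```shell
#!/bin/sh
# Hypothetical helper: read paths from stdin (one per line) and print every
# group of paths that are identical once letter case is ignored.
case_collisions() {
    awk '{
        key = tolower($0)
        # Collect all spellings that share the same lower-cased key.
        if (count[key]++)
            paths[key] = paths[key] "\n" $0
        else
            paths[key] = $0
    }
    END {
        for (key in count)
            if (count[key] > 1)
                print paths[key]
    }'
}

# Usage inside the clone (prints nothing if no such paths exist):
#   git ls-tree -r --name-only HEAD | case_collisions
```

If this prints nothing, the collisions reported by git clone are not caused by
case-only duplicates in the committed tree.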

The list of files given here is almost identical to the list of files that
always give me collisions during the renormalization process.

Here is an explanation of how and why those files ended up in the repo, and a
hypothesis about why they might be in a conflicting state.

Those files contain hash values of the real data files for a proprietary
application and are re-calculated on every invocation of the application. The
application won't even start up if those hashes don't match. And it won't tell
why it won't start, it just says "Corrupt data".

When this repository was started, I did not know how the hashes in those
files are calculated, so I had to commit them along with the associated data
files to keep the application happy. This results in conflicts with many git
operations, of course.

Then I learned how those files can be re-calculated and wrote a smudge-filter
to keep them in sync with the data files.

Since I was now able to recreate those files, I put them into .gitignore and
installed the smudge-filter to recalculate them. But I left the files in the
repo as a fallback, just to be sure. And I kept committing them every now and
then whenever git showed differences, although they already were in
.gitignore.
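
The files that are in this state (tracked in the index although an ignore
rule matches them) can be listed with git ls-files. A minimal sketch, where
tracked_but_ignored is a hypothetical helper name and the optional argument
is the path to the work tree:

```shell
#!/bin/sh
# Hypothetical helper: list files that are tracked in the index although a
# standard ignore rule (.gitignore, .git/info/exclude, core.excludesFile)
# matches them. $1 is the work tree, defaulting to the current directory.
tracked_but_ignored() {
    git -C "${1:-.}" ls-files --cached --ignored --exclude-standard
}

# Usage:
#   tracked_but_ignored /path/to/clone
```

Comparing this output with the list in the clone warning would show whether
the colliding paths are exactly the tracked-but-ignored ones.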

So I guess those collisions might come from committing the ignored
files. Unfortunately, I have not yet been able to reproduce this effect on a
fresh repo.

And the next question is: why do those conflicts cause the renormalization
process to fail completely, even when the conflicts are resolved during the
renormalization rebase? I could not reproduce this on a fresh repo either.


On Wed, Feb 12, 2025 at 12:57:07AM +0100, Josef Wolf wrote:
> Still struggling with my filter problem.
> 
> Here is what I do:
> 
> - Set up a clean filter which enforces CRLF (yes, for this specific use
>   case I want CRLF even on linux)
> 
> - Smudge filter does not modify the file at all
> 
> - Set up git to fail when the filter fails, so I can double-check that the
>   filter is actually running:
> 
>    $ grep -A3 filter..etsfile ~/.gitconfig
>    [filter "etsfile"]
>       required = true
>       clean = ets-utils -c
>       smudge = ets-utils -s %f
> 
> - Specify file as non-text and install the filter:
> 
>     $ grep etsfile .gitattributes
>     */P -text filter=etsfile
>     $ git commit .gitattributes
> 
> - Check that git gets attributes as I want them:
> 
>     $ git --attr-source=$(git rev-parse HEAD) check-attr -a P-0113/P
>     P-0113/P: text: unset
>     P-0113/P: filter: etsfile
>     $ git ls-files --eol P-0113/P
>     i/lf    w/      attr/-text              P-0113/P
> 
> - Create helper for renormalization
> 
>     $ cat renormalization-helper
>     #! /bin/sh -e
>     git add --renormalize .
>     git diff --quiet --cached || \
>         git commit --amend --no-edit
>     
> - Run the renormalization for the linear history:
> 
>     $ git --attr-source=$(git rev-parse HEAD) \
>          rebase --root -X renormalize \
>          -x $(dirname $0)/renormalize-helper
> 
> So at this point, I'd expect the file to have CRLF line endings. But it
> doesn't, so I do:
> 
>     $ rm -rf P-0113
>     $ git --attr-source=$(git rev-parse HEAD) checkout P-0113
> 
> Still no CRLF, so I look at what is stored by git:
> 
>     $ git --attr-source=$(git rev-parse HEAD) show 873a9b:P-0113/P |less -U
> 
> Again, no CRLF.
> 
> So I check all revisions in the history. Result: no revision has CRLF.
> 
> So the renormalization process does not work for me at all.
> 
> Any ideas?
> 
> -- 
> Josef Wolf
> jw@xxxxxxxxxxxxx
> 
> 
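
The check over all revisions mentioned in the quoted message can be scripted
roughly as follows. This is a minimal sketch: count_crlf_lines and
crlf_per_revision are hypothetical helper names, and P-0113/P is the path
used in the quoted message. git cat-file blob prints the stored content
without running any filter, so the numbers reflect what is actually in the
object database.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that end in CR LF.
count_crlf_lines() {
    awk '/\r$/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: for every revision reachable from HEAD, print the
# revision and the number of CRLF-terminated lines in the stored blob at
# the path given as $1.
crlf_per_revision() {
    path=$1
    git rev-list HEAD | while read -r rev; do
        # Revisions that do not contain the path are skipped silently.
        blob=$(git rev-parse --quiet --verify "$rev:$path") || continue
        printf '%s %s\n' "$rev" "$(git cat-file blob "$blob" | count_crlf_lines)"
    done
}

# Usage:
#   crlf_per_revision P-0113/P
```

A count of 0 for a revision means the blob stored for that revision contains
no CRLF line endings at all.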

-- 
Josef Wolf
jw@xxxxxxxxxxxxx



