I have a bare repository of about 5.7 GB on my local disk that I need to push to Azure DevOps. I usually do this with "git push --mirror", but Azure DevOps limits a single push to 5 GB, so repositories larger than that have to be pushed in chunks. I used this Stack Overflow answer (https://stackoverflow.com/questions/79167276/splitt-git-push-to-azure-devops) as my basis and created a script that pushes each branch in batches of commits.

I pushed my repository in batches to, let's say, remote repo "A". I then ran "git clone --bare" from remote repo A to my local disk. I checked the size of this bare clone and it is only about 5 GB.

What I have checked so far:

i) I counted the objects with "git rev-list --objects --all | wc -l" in both repos; the counts are the same.

ii) There is only one branch, master, in both repos, and the last commit ID of master matches in both. (I read an article saying data integrity can also be checked this way, since Git works like a blockchain.)

iii) I ran "git fsck --full" in both repos; both gave the same output:

    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (10793794/10793794), done.
    Checking connectivity: 10793794, done.

But the original repo on disk printed this extra line at the end, which the clone of the remote did not:

    Verifying commits in commit graph: 100% (1351940/1351940), done.

iv) I created a bundle of the original repo with "git bundle create repo.bundle --all", then ran "git bundle verify ../repo.bundle" in the cloned repo. Output:

    The bundle contains these 883 refs:
    <All Refs>
    The bundle records a complete history.
    The bundle uses this hash algorithm: sha1
    /home/repo.bundle is okay

v) I checked the repo size with "git count-objects -vH"; size-pack differs (5.62 GB for the original repo, 4.93 GB for the clone of the remote).

Note: my repository does not have an lfs/objects directory, so there are no LFS objects to begin with.
So that is out of the question.

Why is there a change in size? Also, how do I validate whether two repos are the same?

Script used to push in batches of commits:

    #!/bin/bash
    set -e

    # === CONFIGURATION ===
    RepositoryFolderPathForBareCloneBAK="/root/linux"
    BackupRepositoryHttpsURL="<REMOTE_URL>"
    remoteName="origin"
    maxPushSizeInMB=$((4 * 1024)) # 4 GiB
    splitPush=false               # set to true to force a split push
    ALocation=$(pwd)

    if [ ! -d "$RepositoryFolderPathForBareCloneBAK" ]; then
        echo "Error: Bare clone folder not found at $RepositoryFolderPathForBareCloneBAK"
        exit 1
    fi

    cd "$RepositoryFolderPathForBareCloneBAK"
    git config http.postBuffer 524288000

    # Always measure the pack size: the batch-size calculation below needs it
    # even when a split push is forced.
    echo "Checking repository size..."
    repositorySize=0
    while read -r line; do
        echo "$line"
        if [[ "$line" =~ ^size-pack:\ ([0-9]+(\.[0-9]+)?)\ ([A-Za-z]+) ]]; then
            value=${BASH_REMATCH[1]}
            unit=${BASH_REMATCH[3]}
            case "$unit" in
                bytes) repositorySize=$(echo "$value / 1024 / 1024" | bc) ;;
                KiB)   repositorySize=$(echo "$value / 1024" | bc) ;;
                MiB)   repositorySize=$(echo "$value" | bc) ;;
                GiB)   repositorySize=$(echo "$value * 1024" | bc) ;;
                *)     repositorySize=$(echo "$value" | bc) ;;
            esac
        fi
    done < <(git count-objects -vH)

    # Round down to an integer; keep it at least 1 so it is safe as a divisor
    repositorySize=${repositorySize%.*}
    repositorySize=${repositorySize:-0}
    if [ "$repositorySize" -lt 1 ]; then repositorySize=1; fi
    echo "Repo size: $repositorySize MiB"

    doSplitPush=$splitPush
    if [ "$repositorySize" -ge "$maxPushSizeInMB" ]; then
        doSplitPush=true
    fi

    # Unset mirror config to allow partial pushes if needed
    if git config --get "remote.$remoteName.mirror" >/dev/null; then
        git config --unset "remote.$remoteName.mirror"
    fi

    # Set up the push remote (match the whole name, not a substring)
    NewREMOTE="push_remote"
    if git remote | grep -qx "$NewREMOTE"; then
        git remote remove "$NewREMOTE"
    fi
    git remote add "$NewREMOTE" "$BackupRepositoryHttpsURL"

    if [ "$doSplitPush" = false ]; then
        echo "Performing full push to $BackupRepositoryHttpsURL"
        git push "$NewREMOTE" --mirror
    else
        echo "Performing split push to $BackupRepositoryHttpsURL"
        git for-each-ref --format="%(refname)" --sort='authordate' refs/heads |
        while read -r ref; do
            BRANCH="${ref#refs/heads/}"
            echo "Processing branch: $BRANCH"

            # Use the ref directly instead of repointing HEAD, which would leave
            # the bare repository's HEAD on the last branch processed.
            if git show-ref --quiet --verify "refs/remotes/$NewREMOTE/$BRANCH"; then
                range="refs/remotes/$NewREMOTE/$BRANCH..$ref"
            else
                range="$ref"
            fi

            # Count commits directly; "format:x | wc -l" can undercount by one
            # because format: emits no trailing newline.
            n=$(git rev-list --count --first-parent "$range")
            echo "$n commits to push"

            if [ "$n" -gt 0 ]; then
                splitPushCommitsCount=$(( (maxPushSizeInMB * n) / repositorySize ))
                if [ "$splitPushCommitsCount" -gt 20000 ]; then splitPushCommitsCount=20000; fi
                if [ "$splitPushCommitsCount" -lt 1 ]; then splitPushCommitsCount=1; fi
                echo "Calculated splitPushCommitsCount: $splitPushCommitsCount"

                loopCount=$((n / splitPushCommitsCount))
                for ((i=1; i<=loopCount; i++)); do
                    # Commit that closes batch i, counted along the first-parent chain
                    h=$(git rev-list --first-parent --skip=$((n - i * splitPushCommitsCount)) -n1 "$range")
                    echo "Batch commit: $h"
                    git push "$NewREMOTE" --force "$h:refs/heads/$BRANCH"
                    echo "sleeping for 5 minutes"
                    sleep 300
                done

                echo "Final push: $ref"
                git push "$NewREMOTE" --force "$ref:refs/heads/$BRANCH"
            else
                echo "No commits to push for $BRANCH"
            fi
        done

        echo "Pushing tags"
        git push "$NewREMOTE" --force 'refs/tags/*'
        echo "Pushing replace refs (if any)"
        git push "$NewREMOTE" --force 'refs/replace/*'
    fi

    # === LFS Push ===
    echo "Pushing Git LFS objects..."

    Get_LFS_Objects() {
        lfs_objects_dir="$1/lfs/objects"
        if [ -d "$lfs_objects_dir" ]; then
            lfs_objects=$(find "$lfs_objects_dir" -type f -printf "%f ")
            if [ -z "$lfs_objects" ]; then
                lfs_objects="NO_OBJECTS"
            fi
        else
            lfs_objects="NO_OBJECTS"
        fi
    }

    Get_LFS_Objects "$RepositoryFolderPathForBareCloneBAK"

    if [[ "$lfs_objects" != "NO_OBJECTS" ]]; then
        echo "Running lfs"
        retCode=0
        # $lfs_objects is deliberately unquoted so each object ID is its own argument;
        # "|| retCode=$?" captures the exit code, which set -e would otherwise hide
        git lfs push "$NewREMOTE" --object-id $lfs_objects || retCode=$?
        echo "LFS push exited with code: $retCode"
        if [ "$retCode" -ne 0 ]; then exit "$retCode"; fi
    else
        echo "No LFS objects to push."
    fi

    cd "$ALocation"
    echo "All done! Git and LFS data pushed successfully."
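
The manual checks listed in the question (matching refs, matching set of reachable objects) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names list_refs, digest_objects and same_repo_content are hypothetical, both repositories are assumed to be on local disk and to use the same hash algorithm, and only branches and tags are compared as refs.

```shell
#!/bin/bash
# Sketch: compare two local repositories by refs and by reachable objects.
# All function names here are hypothetical helpers, not Git commands.

# Print "<object-id> <refname>" for every branch and tag, in Git's default
# ref-name order. Remote-tracking refs are skipped (an assumption: they are
# treated as local bookkeeping, not repository content).
list_refs() {
    git -C "$1" for-each-ref --format='%(objectname) %(refname)' refs/heads refs/tags
}

# Print one digest of the sorted IDs of every object reachable from any ref
# (and HEAD). The list is hashed rather than stored, so it stays cheap even
# for millions of objects.
digest_objects() {
    git -C "$1" rev-list --objects --all | cut -d' ' -f1 | sort |
        git -C "$1" hash-object --stdin
}

# Return 0 when both repositories have the same branches/tags pointing at the
# same IDs and the same set of reachable objects; print which check failed otherwise.
same_repo_content() {
    repo_a=$1
    repo_b=$2
    if [ "$(list_refs "$repo_a")" != "$(list_refs "$repo_b")" ]; then
        echo "refs differ"
        return 1
    fi
    if [ "$(digest_objects "$repo_a")" != "$(digest_objects "$repo_b")" ]; then
        echo "reachable objects differ"
        return 1
    fi
    echo "refs and reachable objects match"
}
```

An example call would be same_repo_content /root/linux /path/to/clone-of-A.git, where the second path is a placeholder. A match says the two repositories hold the same refs and the same reachable objects. It says nothing about size-pack, which depends on how each side happened to pack and delta-compress those objects.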