Mike Slinn

Replicating a Git Directory Tree

Published 2021-04-10. Last modified 2021-05-08.
Time to read: about 2 minutes.

This article is categorized under Bash, Git.

Whenever I set up an operating system on one of my computers, one of the tedious tasks I must perform is replicating my git repositories.

It is a bad idea to copy an entire git repository between computers, because the .git directory within it can be quite large. So large, in fact, that copying might take much more time than re-cloning. I think the reason is that copying the entire git repo actually means copying the same information twice: first the .git hidden directory, complete with all the history for the project, and then again for the files in the currently checked out branch. Git repos store the entire development history of the project in their .git directories, so they are often much larger than the actual code that is checked out at any given time.
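One way to check this for a particular repository is to compare the size of its .git directory with the size of the whole directory. The following is a minimal sketch, assuming a du that supports the -s and -k options; the function name repoSizes is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print the size in KiB of a repo's .git directory
# and of the entire directory (checked-out files plus .git).
# $1 is the repo directory; defaults to the current directory.
function repoSizes {
  local repo="${1:-.}"
  if [ ! -d "$repo/.git" ]; then
    echo "Error: $repo does not contain a .git directory" >&2
    return 1
  fi
  local gitKb totalKb
  # du prints "<size><TAB><path>"; cut keeps only the size
  gitKb="$( du -sk "$repo/.git" | cut -f1 )"
  totalKb="$( du -sk "$repo" | cut -f1 )"
  echo ".git: ${gitKb} KiB, total: ${totalKb} KiB"
}
```

Running repoSizes on a few long-lived repositories should give a feel for how much of each directory is history rather than checked-out files.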

I have several trees of git repositories, grouped into subdirectories. Here is a sanitized depiction of one of my git directory trees:

├── cadenzaHome
│   ├── cadenzaAssets
│   ├── cadenzaCode
│   │   ├── cadenzaClient
│   │   ├── cadenzaCourseCode
│   │   ├── cadenzaDependencies
│   │   ├── cadenzaLibs
│   │   ├── cadenzaServer
│   │   ├── cadenzaServerNext
│   │   └── cadenzaSupport
│   ├── cadenzaCreative
│   │   └── cadenzaCreativeTemplates
│   ├── cadenzaCreativeBackup
│   └── cadenzaCurriculum
├── django
│   ├── django
│   ├── django-oscar
│   ├── frobshop
│   ├── main
│   └── oscar
├── jekyll
│   ├── jekyllTemplate
│   └── jekyll-flexible-include-plugin

Some git repos are forks, and I defined upstream git remotes for them, in addition to the usual origin remote.
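The URLs of both remotes can be read with git config, using the same keys that the script further down relies on. This is only a sketch; showRemotes is an illustrative name, and it assumes a git version that supports the -C option.

```shell
#!/bin/bash

# Hypothetical helper: report the origin and upstream URLs of the repo in directory $1.
function showRemotes {
  local repo="${1:-.}"
  local origin upstream
  # git config exits non-zero when a key is unset, so ignore that status
  origin="$( git -C "$repo" config --local remote.origin.url || true )"
  upstream="$( git -C "$repo" config --local remote.upstream.url || true )"
  echo "origin: ${origin:-<none>}"
  echo "upstream: ${upstream:-<none>}"
}
```

A repo that is not a fork would normally report an upstream of <none>.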

This morning I found myself facing the boring task of doing this manually once again. Instead, I wrote this script, which scans a git directory tree and writes out a script that clones the repos in the tree, and adds upstream remotes as required. Directories containing a file called .ignore are ignored.

#!/bin/bash

function help {
  # %b expands the \n escapes in the optional error message passed as $1
  printf '%bReplicates tree of git repos\n' "$1"
  exit 1
}

function doOne {
  cd "$CLONE_DIR" > /dev/null
  PROJECT_DIR="$( basename "$CLONE_DIR" )"  # Might have been renamed after cloning

  # echo "CLONE_DIR: $CLONE_DIR"
  # echo "PROJECT_DIR: $PROJECT_DIR"
  ORIGIN_URL="$( git config --local remote.origin.url )"

  CLONE_DIR_PARENT="$( realpath "$CLONE_DIR/.." )"
  echo "mkdir -p '$CLONE_DIR_PARENT'"
  echo "pushd '$CLONE_DIR_PARENT' > /dev/null"
  echo "git clone '$ORIGIN_URL'"

  UPSTREAM_URL="$( git config --local remote.upstream.url )"
  if [ "$UPSTREAM_URL" ]; then
    if [ "$ORIGIN_URL" != "no_push" ]; then
      echo "cd \"$PROJECT_DIR\""
      echo "git remote add upstream '$UPSTREAM_URL'"
    fi
  fi

  echo "popd > /dev/null"

  GIT_DIR_NAME="$( basename "$PWD" )"
  if [ "$GIT_DIR_NAME" != "$PROJECT_DIR" ]; then
    echo "# Git project directory was renamed, renaming this copy to match original directory structure"
    echo "mv \"$GIT_DIR_NAME\" \"$PROJECT_DIR\""
  fi
  echo
}

if [ -z "$1" ]; then help "Error: Please specify the subdirectory to traverse.\n\n"; fi
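BASE="$( realpath "$1" )"  # Absolute path, so the cd in doOne cannot affect later lookups
# Skip directories containing .ignore; print directories containing .git without descending into them.
# Reading line by line keeps directory names that contain spaces intact.
find "$BASE" -type d \( -execdir test -e {}/.ignore \; -prune \) -o \( -execdir test -d {}/.git \; -prune -print \) |
  while IFS= read -r DIR; do
    export CLONE_DIR="$( realpath "$DIR" )"
    doOne
  done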
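If a project directory was renamed after cloning, one option is to pass the desired directory as the second argument to git clone, which is a standard feature of that command. The following is a sketch of that variation, not the script above; emitClone and its variable names are made up for illustration, and it assumes a git version that supports the -C option.

```shell
#!/bin/bash

# Hypothetical variation on doOne: emit commands that clone a repo
# directly into a directory with the same name as the existing one.
# $1 is the absolute path of the existing clone.
function emitClone {
  local cloneDir="$1"
  local originUrl upstreamUrl
  # git config exits non-zero when a key is unset, so ignore that status
  originUrl="$( git -C "$cloneDir" config --local remote.origin.url || true )"
  upstreamUrl="$( git -C "$cloneDir" config --local remote.upstream.url || true )"
  if [ -z "$originUrl" ]; then
    echo "# Skipping $cloneDir because it has no origin remote"
    return 0
  fi
  local parent name
  parent="$( dirname "$cloneDir" )"
  name="$( basename "$cloneDir" )"
  echo "mkdir -p '$parent'"
  # The second argument to git clone names the directory to create
  echo "git clone '$originUrl' '$parent/$name'"
  if [ -n "$upstreamUrl" ]; then
    echo "git -C '$parent/$name' remote add upstream '$upstreamUrl'"
  fi
}
```

Because every emitted command uses an absolute path, this version needs neither pushd and popd nor a separate rename step.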

Here is the output generated for the above directory tree:

$ gitUrls $work
mkdir -p '/var/work/cadenzaHome/cadenzaCreative'
pushd '/var/work/cadenzaHome/cadenzaCreative' > /dev/null
git clone 'git@github.com:mslinn/cadenzaCreativeTemplates.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome'
pushd '/var/work/cadenzaHome' > /dev/null
git clone 'git@github.com:mslinn/cadenzaAssets.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode/cadenzaSupport'
pushd '/var/work/cadenzaHome/cadenzaCode/cadenzaSupport' > /dev/null
git clone 'git@github.com:mslinn/dottyTemplate.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode/cadenzaLibs'
pushd '/var/work/cadenzaHome/cadenzaCode/cadenzaLibs' > /dev/null
git clone 'git@github.com:mslinn/scalacourses-play-utils.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode/cadenzaLibs'
pushd '/var/work/cadenzaHome/cadenzaCode/cadenzaLibs' > /dev/null
git clone 'git@github.com:mslinn/scalacourses-utils.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode/cadenzaLibs'
pushd '/var/work/cadenzaHome/cadenzaCode/cadenzaLibs' > /dev/null
git clone 'git@github.com:mslinn/scalacourses-slick-utils.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode'
pushd '/var/work/cadenzaHome/cadenzaCode' > /dev/null
git clone 'git@bitbucket.org:mslinn/cadenzaserver.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore'
pushd '/var/work/cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore' > /dev/null
git clone 'ssh://git@bitbucket.org/mslinn/course_scala_intro_code.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore'
pushd '/var/work/cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore' > /dev/null
git clone 'git@bitbucket.org:mslinn/course_scala_intermediate_code.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome/cadenzaCode'
pushd '/var/work/cadenzaHome/cadenzaCode' > /dev/null
git clone 'git@github.com:mslinn/cadenzaClient.git'
popd > /dev/null

mkdir -p '/var/work/cadenzaHome'
pushd '/var/work/cadenzaHome' > /dev/null
git clone 'git@github.com:mslinn/cadenzaCurriculum.git'
popd > /dev/null

mkdir -p '/var/work'
pushd '/var/work' > /dev/null
git clone 'git@github.com:mslinn/jekyllTemplate.git'
popd > /dev/null

mkdir -p '/var/work/django'
pushd '/var/work/django' > /dev/null
git clone 'git@github.com:mslinn/django-oscar.git'
cd "django-oscar"
git remote add upstream 'git@github.com:django-oscar/django-oscar.git'
popd > /dev/null

mkdir -p '/var/work/django'
pushd '/var/work/django' > /dev/null
git clone 'git@github.com:mslinn/frobshop.git'
popd > /dev/null

mkdir -p '/var/work/django'
pushd '/var/work/django' > /dev/null
git clone 'git@github.com:mslinn/django.git'
cd "django"
git remote add upstream 'git@github.com:django/django.git'
popd > /dev/null

mkdir -p '/var/work/jekyll'
pushd '/var/work/jekyll' > /dev/null
git clone 'git@github.com:mslinn/jekyll-flexible-include-plugin.git'
cd "jekyll-flexible-include-plugin"
git remote add upstream 'https://idiomdrottning.org/jekyll-include-absolute-plugin'
popd > /dev/null

mkdir -p '/var/work/jekyll'
pushd '/var/work/jekyll' > /dev/null
git clone 'git@github.com:mslinn/jekyllTemplate.git'
popd > /dev/null 
😁

Now all I had to do was paste the above bash commands into a terminal on the new system, and a short time later the git repositories were set up the way I needed them.
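Instead of pasting, the generated commands could also be saved to a file, reviewed, and then run with bash. Here is a minimal sketch; saveScript and replicate.sh are names made up for illustration, and the usage comment assumes the script is on the PATH as gitUrls, as in the sample invocation above.

```shell
#!/bin/bash

# Hypothetical helper: run a generator command, save its output as a script,
# and check the script's syntax without executing it.
# $1 is the output file; the remaining arguments are the generator command.
function saveScript {
  local outFile="$1"
  shift
  "$@" > "$outFile" || return 1
  bash -n "$outFile"   # -n parses the file but runs nothing
}

# Usage on the old machine:
#   saveScript replicate.sh gitUrls "$work"
# Then on the new machine, after copying and reviewing the file:
#   bash replicate.sh
```

The syntax check only catches malformed shell; it does not confirm that the remote URLs are reachable.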