Published 2021-04-10.
Last modified 2023-06-01.
Time to read: 5 minutes.
I have several trees of git repositories, grouped into subdirectories. The total number of repositories is in the hundreds. Here is a sanitized depiction of one of my git directory trees:
├── cadenzaHome
│   ├── cadenzaAssets
│   ├── cadenzaCode
│   │   ├── cadenzaClient
│   │   ├── cadenzaCourseCode
│   │   ├── cadenzaDependencies
│   │   ├── cadenzaLibs
│   │   ├── cadenzaServer
│   │   ├── cadenzaServerNext
│   │   └── cadenzaSupport
│   ├── cadenzaCreative
│   │   └── cadenzaCreativeTemplates
│   ├── cadenzaCreativeBackup
│   └── cadenzaCurriculum
├── django
│   ├── django
│   ├── django-oscar
│   ├── frobshop
│   ├── main
│   └── oscar
├── jekyll
│   ├── jekyllTemplate
│   └── jekyll-flexible-include-plugin
Some git repos are forks, and I defined upstream git remotes for them, in addition to the usual origin remote.
Two Use Cases
This article discusses the two use cases for git-tree, a Ruby gem I wrote to help me work efficiently with all those projects.
Directories containing a file called .ignore are ignored.
The source code for git-tree is in this GitHub repo.
The gem is published on RubyGems.org.
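The .ignore rule can be pictured with a short shell sketch. This is not how the gem is implemented (it is written in Ruby); `list_repos` is a hypothetical helper, and two behaviors are assumptions made for illustration: a directory holding .ignore prunes everything beneath it, and the walk does not descend into a repository once one is found.

```shell
# list_repos ROOT: print every directory under ROOT that contains a .git
# directory. Hypothetical helper for illustration, not part of git_tree.
# The body runs in a subshell, ( ... ), so variables stay local while recursing.
list_repos() (
  dir=$1
  if [ -e "$dir/.ignore" ]; then
    :   # assumption: a .ignore file prunes this directory and its children
  elif [ -d "$dir/.git" ]; then
    printf '%s\n' "$dir"   # assumption: do not look for repos inside a repo
  else
    for child in "$dir"/*/; do
      # When the glob matches nothing, $child is the literal pattern; skip it
      if [ -d "$child" ]; then
        list_repos "${child%/}"
      fi
    done
  fi
)
```

Calling `list_repos "$work"` would print one repository path per line, in the shell's glob order.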
Use Case: Dependent Gem Maintenance
One of my directory trees holds Jekyll plugins, packaged as 25 gems. They depend on one another, and must be built in a particular order. Sometimes an operation must be performed on all of the plugins, after which they must all be rebuilt.
Most operations do not require that the projects be processed in any particular order; the build process, however, must be invoked on the dependencies first. It is quite tedious to do this 25 times, over and over.
Several years ago I wrote a bash script to perform this task, but as its requirements became more complex,
the bash script proved difficult to maintain.
This use case is now fulfilled by the git-tree-exec command provided by the git_tree gem.
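To make the ordering requirement concrete, here is a minimal sketch of an ordered loop in plain shell. `run_in_order` is a hypothetical name, and stopping at the first failure is an assumption of the sketch rather than documented behavior of git-tree-exec.

```shell
# run_in_order CMD DIR...: run CMD inside each DIR, in the order given.
# Hypothetical helper; stops at the first failure so that a dependent
# project is never built against a dependency that failed to build.
run_in_order() {
  cmd=$1
  shift
  for dir in "$@"; do
    echo "== $dir"
    # The subshell keeps the caller's working directory unchanged
    ( cd "$dir" && sh -c "$cmd" ) || {
      echo "failed in $dir" >&2
      return 1
    }
  done
}
```

Listing dependencies first, for example `run_in_order 'bundle exec rake install' "$jekyll_plugin_logger" "$jekyll_plugin_support"`, gives the same effect as doing the builds by hand in the right order.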
Use Case: Replicating Trees of Git Repositories
Whenever I set up an operating system for a new development computer, one of the tedious tasks that must be performed is to replicate the directory trees of git repositories.
It is a bad idea to attempt to copy an entire git repository between computers, because the .git directories within them can be quite large. So large, in fact, that it might take much more time to copy than to re-clone. The reason is that copying the entire git repo actually means copying the same information twice: first the .git hidden directory, complete with all the history for the project, and then again for the files in the currently checked out branch. Git repos store the entire development history of the project in their .git directories, so as they accumulate history they eventually become much larger than the code that is checked out at any given time.
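One way to check this for a particular repository is to compare the size of its .git directory with the size of everything else. The following is a small sketch using `du`; `repo_sizes` is a hypothetical helper name, and the sizes are reported in kilobytes as `du -sk` counts them.

```shell
# repo_sizes REPO: report the size of REPO's .git directory (the history)
# and the size of the rest of the tree (the checked-out files).
# Hypothetical helper for illustration.
repo_sizes() {
  repo=$1
  total_kb=$(du -sk "$repo" | cut -f1)
  git_kb=$(du -sk "$repo/.git" | cut -f1)
  echo "history: ${git_kb} KB, checkout: $((total_kb - git_kb)) KB"
}
```

Running `repo_sizes "$work/django/django"` on a long-lived project shows how the two figures compare.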
One morning I found myself facing the boring task of doing this manually once again. Instead, I wrote a bash script that scanned a git directory tree and wrote out another bash script that clones the repos in the tree. Any additional remote references are replicated.
Two years later, I decided to add new features to the script.
Bash is great for short scripts,
but it is not conducive to debugging or structured programming.
I rewrote the bash script in Ruby, using the rugged gem. Much better!
This use case is fulfilled by the git-tree-replicate and git-tree-evars commands provided by the git_tree gem.
Installing Git_tree
To install git_tree:
- Set up Ruby.
- Working With Git Repos Using Ruby’s Rugged Gem explains that rugged needs to be built from source so the ssh library is included. Your system will need to have cmake and associated fiddly bits installed in order for the build to succeed. On Ubuntu, type:
  $ yes | sudo apt install cmake libgit2-dev libssh2-1-dev pkg-config
- Now you can install rugged with ssh support:
  $ gem install git_tree
  Thanks for installing git_tree!
  Successfully installed git_tree-0.2.1
  Parsing documentation for git_tree-0.2.1
  Done installing documentation for git_tree after 0 seconds
  1 gem installed
To register the new commands, either log out and log back in, or open a new console.
You should now have three new shell commands: git-tree-exec, git-tree-replicate and git-tree-evars.
The git-tree-replicate and git-tree-evars commands each require only one parameter: an environment variable reference, pointing to the top-level directory to examine. The environment variable reference must be contained within single quotes to prevent expansion by the shell.
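The effect of the single quotes can be seen with a short experiment. The sketch below also shows one way a program could resolve such a reference for itself once it has received the literal text; `expand_ref` is a hypothetical helper, not the gem's code, and it relies on the variable being exported.

```shell
export work=/mnt/c/work   # sample value used elsewhere in this article

printf '%s\n' '$work'     # single quotes: the program receives the text $work
printf '%s\n' "$work"     # double quotes: the shell substitutes /mnt/c/work first

# expand_ref REF: resolve a literal reference such as $work by name.
# Hypothetical helper; works only for exported variables.
expand_ref() {
  name=${1#\$}            # strip the leading dollar sign
  printenv "$name"
}

expand_ref '$work'
```

Receiving the reference unexpanded means a program can know both the variable's name and its value, which matters when the generated output refers to the variable by name.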
Using git-tree-exec
The command requires two parameters. The first parameter indicates the directory or directories to process; the second is the shell command to execute in each of them. Three forms are accepted for the first parameter:
- A directory name, which may be relative or absolute.
- An environment variable reference.
- A list of directory names, which may be relative or absolute, and may contain environment variables.
Example 1
For all projects listed, update Gemfile.lock and install a local copy of the gem. Use this format when the order in which projects are processed matters. No output is displayed for the three commands chained together with && until they have all completed.
$ git-tree-exec '
$jekyll_plugin_logger
$jekyll_draft
$jekyll_plugin_support
$jekyll_all_collections
$jekyll_plugin_template
$jekyll_flexible_include_plugin
$jekyll_href
$jekyll_img
$jekyll_outline
$jekyll_plugin_template
$jekyll_pre
$jekyll_quote
' 'bundle && bundle update && bundle exec rake install'
Similarly, to release a new set of related plugins, I provide the same command as above, but for the program to execute I pass rake release.
$ git-tree-exec '
$jekyll_plugin_logger
$jekyll_draft
$jekyll_plugin_support
$jekyll_all_collections
$jekyll_plugin_template
$jekyll_flexible_include_plugin
$jekyll_href
$jekyll_img
$jekyll_outline
$jekyll_plugin_template
$jekyll_pre
$jekyll_quote
' 'bundle exec rake release'
Example 2
This example shows how to display the version of projects that create gems under the directory pointed to by $my_plugins. An executable script is required on the PATH, so git-tree-exec can invoke it as it loops through the subdirectories. I call this script version, and it is written in bash, although the language used is not significant:
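#!/bin/bash
# Locate the gem's version file, if this project has one
x="$( ls lib/**/version.rb 2> /dev/null )"
if [ -f "$x" ]; then
  # Strip everything except the version number itself
  v="$( cat "$x" | \
    grep '=' | \
    sed -e s/.freeze// | \
    tr -d 'VERSION =\"' | \
    tr -d \' )"
  echo "$(basename "$PWD") v$v"
fi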
Invoke the version script for each project under $my_plugins/ as shown below. In general, it is a good idea to enclose variable name references within double quotes, unless you are sure that there will never be a space in a file path.
$ git-tree-exec "$my_plugins" version
jekyll_all_collections v0.3.3
jekyll_archive_create v1.0.2
jekyll_archive_display v1.0.1
jekyll_auto_redirect v0.1.0
jekyll_basename_dirname v1.0.3
jekyll_begin_end v1.0.1
jekyll_bootstrap5_tabs v1.1.2
jekyll_context_inspector v1.0.1
jekyll_download_link v1.0.1
jekyll_draft v1.1.2
jekyll_flexible_include_plugin v2.0.20
jekyll_from_to_until v1.0.3
jekyll_href v1.2.5
jekyll_img v0.1.5
jekyll_nth v1.1.0
jekyll_outline v1.2.0
jekyll_pdf v0.1.0
jekyll_plugin_logger v2.1.1
jekyll_plugin_support v0.7.0
jekyll_plugin_template v0.3.0
jekyll_pre v1.4.1
jekyll_quote v0.4.0
jekyll_random_hex v1.0.0
jekyll_reading_time v1.0.0
jekyll_revision v0.1.0
jekyll_run v1.0.1
jekyll_site_inspector v1.0.0
jekyll_sort_natural v1.0.0
jekyll_time_since v0.1.3
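The advice about double quotes can be verified with a tiny experiment. In this sketch, `count_args` is a hypothetical helper that reports how many arguments it received, and the path is made up for the demonstration.

```shell
# count_args ARG...: print the number of arguments received
count_args() {
  echo "$#"
}

dir='/tmp/my plugins'   # hypothetical path containing a space

count_args $dir     # unquoted: the shell splits the value into 2 words
count_args "$dir"   # quoted: the value arrives as 1 argument
```

An unquoted reference to a path with a space would therefore be seen by a command as two separate directories.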
Example 3
List the projects under the directory pointed to by $my_plugins that have a demo/ subdirectory.
$ git-tree-exec "$my_plugins" \
  'if [ -d demo ]; then realpath demo; fi'
/mnt/c/work/jekyll/my_plugins/jekyll-hello/demo
/mnt/c/work/jekyll/my_plugins/jekyll_all_collections/demo
/mnt/c/work/jekyll/my_plugins/jekyll_archive_create/demo
/mnt/c/work/jekyll/my_plugins/jekyll_download_link/demo
/mnt/c/work/jekyll/my_plugins/jekyll_draft/demo
/mnt/c/work/jekyll/my_plugins/jekyll_flexible_include_plugin/demo
/mnt/c/work/jekyll/my_plugins/jekyll_from_to_until/demo
/mnt/c/work/jekyll/my_plugins/jekyll_href/demo
/mnt/c/work/jekyll/my_plugins/jekyll_img/demo
/mnt/c/work/jekyll/my_plugins/jekyll_outline/demo
/mnt/c/work/jekyll/my_plugins/jekyll_pdf/demo
/mnt/c/work/jekyll/my_plugins/jekyll_plugin_support/demo
/mnt/c/work/jekyll/my_plugins/jekyll_plugin_template/demo
/mnt/c/work/jekyll/my_plugins/jekyll_pre/demo
/mnt/c/work/jekyll/my_plugins/jekyll_quote/demo
/mnt/c/work/jekyll/my_plugins/jekyll_revision/demo
/mnt/c/work/jekyll/my_plugins/jekyll_time_since/demo
Using git-tree-replicate
The following creates a script in the current directory called work.sh and makes it executable. The script replicates the desired portions of the directory tree of git repos under $work:
$ git-tree-replicate '$work' > work.sh
$ chmod a+x work.sh
When git-tree-replicate completes, copy the generated script to the target machine and run it from the new top-level directory. The following example copies the script to your home directory on machine2 while preserving the execute bit, then runs it:
$ scp -p work.sh machine2:
work.sh 100% 12KB 1.3MB/s 00:00

$ ssh machine2
Welcome to Ubuntu 23.04 (GNU/Linux 6.2.0-20-generic x86_64)

 * Documentation: https://help.ubuntu.com
 * Management: https://landscape.canonical.com
 * Support: https://ubuntu.com/advantage

0 updates can be applied immediately.

Last login: Tue May 23 10:11:14 2023 from 192.168.1.102

$ # Install cmake and rugged on this machine if you have not already done so:
# yes | sudo apt install cmake libgit2-dev libssh2-1-dev pkg-config
# gem install git_tree

$ cd $work
$ ~/work.sh
Here is the output generated for the directory tree shown at the top of this article:
$ git-tree-replicate '$work'
if [ ! -d "cadenzaHome/cadenzaCreative/cadenzaCreativeTemplates/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCreative'
  pushd 'cadenzaHome/cadenzaCreative' > /dev/null
  git clone 'git@github.com:mslinn/cadenzaCreativeTemplates.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCreative/cadenzaAssets/.git" ]; then
  mkdir -p 'cadenzaHome'
  pushd 'cadenzaHome' > /dev/null
  git clone 'git@github.com:mslinn/cadenzaAssets.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaSupport/dottyTemplate/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode/cadenzaSupport'
  pushd 'cadenzaHome/cadenzaCode/cadenzaSupport' > /dev/null
  git clone 'git@github.com:mslinn/dottyTemplate.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaLibs/scalacourses-play-utils/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode/cadenzaLibs'
  pushd 'cadenzaHome/cadenzaCode/cadenzaLibs' > /dev/null
  git clone 'git@github.com:mslinn/scalacourses-play-utils.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaLibs/scalacourses-utils/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode/cadenzaLibs'
  pushd 'cadenzaHome/cadenzaCode/cadenzaLibs' > /dev/null
  git clone 'git@github.com:mslinn/scalacourses-utils.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaserver/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode'
  pushd 'cadenzaHome/cadenzaCode' > /dev/null
  git clone 'git@bitbucket.org:mslinn/cadenzaserver.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore/course_scala_intro_code/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore'
  pushd 'cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore' > /dev/null
  git clone 'ssh://git@bitbucket.org/mslinn/course_scala_intro_code.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore/course_scala_intermediate_code/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore'
  pushd 'cadenzaHome/cadenzaCode/cadenzaCourseCode/ScalaCourses.com/group_scalaCore' > /dev/null
  git clone 'git@bitbucket.org:mslinn/course_scala_intermediate_code.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCode/cadenzaClient/.git" ]; then
  mkdir -p 'cadenzaHome/cadenzaCode'
  pushd 'cadenzaHome/cadenzaCode' > /dev/null
  git clone 'git@github.com:mslinn/cadenzaClient.git'
  popd > /dev/null
fi

if [ ! -d "cadenzaHome/cadenzaCurriculum/.git" ]; then
  mkdir -p 'cadenzaHome'
  pushd 'cadenzaHome' > /dev/null
  git clone 'git@github.com:mslinn/cadenzaCurriculum.git'
  popd > /dev/null
fi

if [ ! -d "jekyll/jekyllTemplate/.git" ]; then
  mkdir -p 'jekyll'
  pushd '/var/work' > /dev/null
  git clone 'git@github.com:mslinn/jekyllTemplate.git'
  popd > /dev/null
fi

if [ ! -d "django/django-oscar/.git" ]; then
  mkdir -p 'django'
  pushd 'django' > /dev/null
  git clone 'git@github.com:mslinn/django-oscar.git'
  cd "django-oscar"
  git remote add upstream 'git@github.com:django-oscar/django-oscar.git'
  popd > /dev/null
fi

if [ ! -d "django/frobshop/.git" ]; then
  mkdir -p 'django'
  pushd 'django' > /dev/null
  git clone 'git@github.com:mslinn/frobshop.git'
  popd > /dev/null
fi

if [ ! -d "django/django/.git" ]; then
  mkdir -p 'django'
  pushd 'django' > /dev/null
  git clone 'git@github.com:mslinn/django.git'
  cd "django"
  git remote add upstream 'git@github.com:django/django.git'
  popd > /dev/null
fi

if [ ! -d "jekyll/jekyll-flexible-include-plugin/.git" ]; then
  mkdir -p 'jekyll'
  pushd 'jekyll' > /dev/null
  git clone 'git@github.com:mslinn/jekyll-flexible-include-plugin.git'
  cd "jekyll-flexible-include-plugin"
  git remote add upstream 'https://idiomdrottning.org/jekyll-include-absolute-plugin'
  popd > /dev/null
fi

if [ ! -d "jekyll/jekyllTemplate/.git" ]; then
  mkdir -p 'jekyll'
  pushd 'jekyll' > /dev/null
  git clone 'git@github.com:mslinn/jekyllTemplate.git'
  popd > /dev/null
fi
As you can see, the generated script checks to see if a git repo has already been cloned, and does not attempt to clone it again if so.
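For readers curious how a stanza like those above might be produced, here is a rough shell approximation. The gem itself is written in Ruby with rugged, so this is only a sketch of the idea; `emit_clone` is a hypothetical function, and it handles only the origin remote, not additional remotes such as upstream.

```shell
# emit_clone REPO_PATH ORIGIN_URL: print a guarded clone stanza for one repo.
# Hypothetical helper that imitates the shape of the generated script.
emit_clone() {
  repo=$1
  url=$2
  parent=$(dirname "$repo")
  # The here-document expands $repo, $parent and $url; everything else is literal
  cat <<EOF
if [ ! -d "$repo/.git" ]; then
  mkdir -p '$parent'
  pushd '$parent' > /dev/null
  git clone '$url'
  popd > /dev/null
fi
EOF
}

emit_clone django/frobshop git@github.com:mslinn/frobshop.git
```

The guard on the .git directory is what makes the generated script safe to run more than once.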
Using git-tree-evars
The git-tree-evars command should be run on the target computer. The command requires only one parameter: an environment variable reference, pointing to the top-level directory to replicate. The environment variable reference must be contained within single quotes to prevent expansion by the shell.
The following appends the generated script to a file called .evars in the $work directory, creating the file if it does not exist. The generated script defines environment variables that point to each git repo under $work:
$ git-tree-evars '$work' >> $work/.evars
Generated Script from git-tree-evars
Following is a sample of environment variable definitions. You are expected to edit it to suit.
export work=/mnt/c/work
export ancientWarmth=$work/ancientWarmth/ancientWarmth
export ancientWarmthBackend=$work/ancientWarmth/ancientWarmthBackend
export braintreeTutorial=$work/ancientWarmth/braintreeTutorial
export survey_analytics=$work/ancientWarmth/survey-analytics
export survey_creator=$work/ancientWarmth/survey-creator
export django=$work/django/django
export frobshop=$work/django/frobshop
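Note that directory names containing a hyphen, such as survey-analytics, appear with an underscore in the variable name, since a hyphen is not legal in a shell variable name. A minimal sketch of that mapping follows; `evar_line` is a hypothetical helper, and the gem's actual naming rules may cover more cases than the hyphen shown in the sample.

```shell
# evar_line ROOT_NAME REL_PATH: print an export line for one repository.
# ROOT_NAME is the name of the top-level variable (without the dollar sign);
# REL_PATH is the repository's path relative to that directory.
# Hypothetical helper; only the hyphen-to-underscore rule is taken from the sample.
evar_line() {
  root_name=$1
  rel=$2
  name=$(basename "$rel" | sed 's/-/_/g')
  printf 'export %s=$%s/%s\n' "$name" "$root_name" "$rel"
}

evar_line work ancientWarmth/survey-analytics
# prints: export survey_analytics=$work/ancientWarmth/survey-analytics
```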
The environment variable definitions are meant to be saved into a file that is sourced at login. While you could place them in a file like ~/.bashrc, the author’s preference is to instead place them in $work/.evars, and add the following to ~/.bashrc:
source $work/.evars
Thus, each time you log in, the environment variable definitions will have been re-established. You can therefore change directory to any of the cloned projects, like this:
$ cd $git_root
$ cd $my_project
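Because the path to .evars is itself written in terms of $work, that variable has to be defined before the file is sourced. The fragment below is one possible arrangement for ~/.bashrc, offered as a sketch: `source_if_present` is a hypothetical helper, and the value of work is the sample value from this article.

```shell
# Define the top-level directory first; .evars is located relative to it
export work=/mnt/c/work

# source_if_present FILE: read FILE into the current shell only if it is
# readable, so a missing file does not produce an error at login.
source_if_present() {
  if [ -r "$1" ]; then
    . "$1"
  fi
}

source_if_present "$work/.evars"
```

With this guard in place, a new machine that does not have the file yet still starts its shell cleanly.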
Massaging the Generated Script
Let's create a file to work with before saving it:
$ git-tree-evars '$work' > evars.sh
Remove all paths with the string cadenza from evars.sh, and save as evars_no_cadenza.sh:
$ cat evars.sh | grep -v cadenza > evars_no_cadenza.sh
Remove all paths with the string cadenza, and the definition for work. Save as evars_no_cadenza_work.sh:
$ cat evars.sh | \
grep -v 'cadenza' | \
grep -v 'work=' | \
sort > \
evars_no_cadenza_work.sh
Append the contents of evars_no_cadenza_work.sh to $work/.evars:
$ cat evars_no_cadenza_work.sh >> $work/.evars
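Appending with >> more than once leaves duplicate definitions in .evars. If that is a concern, the append can be made repeatable. The following is a sketch under that assumption; `append_new_lines` is a hypothetical helper, not something the gem provides.

```shell
# append_new_lines SRC DEST: append to DEST only those lines of SRC that
# DEST does not already contain, so the command can be repeated safely.
append_new_lines() {
  src=$1
  dest=$2
  touch "$dest"   # make sure DEST exists before it is used as a pattern file
  # -F fixed strings, -x whole-line match, -v keep lines NOT found in DEST
  new=$(grep -Fvx -f "$dest" "$src")
  if [ -n "$new" ]; then
    printf '%s\n' "$new" >> "$dest"
  fi
}
```

For example, `append_new_lines evars_no_cadenza_work.sh "$work/.evars"` could be used in place of the cat command.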