Published 2023-03-30.
Time to read: 2 minutes.
I wanted to make changes to a subdirectory of a large git project, but I did not want to have the entire project stored on my device. The git sparse checkout feature allowed me to do that.
The project I wanted to work on was Sinatra-ActiveRecord, and I wanted to play with the sample project for sqlite.
The sample project was very small (too small to be useful, actually!),
so it made no sense to fill my computing device with data that was not needed.
I started by making an empty git repo.
$ mkdir sinatra-activerecord-sqlite
$ cd sinatra-activerecord-sqlite
$ git init
I wanted to eventually create two git remotes:
- upstream: pointing to the original git repo, sinatra-activerecord/sinatra-activerecord.
- origin: pointing to a new repo in my GitHub account that will contain the complete original repo's contents and history, plus my changes. This repo will be called mslinn/sinatra-activerecord-sqlite.
When you git push from a sparse clone to origin, the content of the entire original repo is copied to the new repository, as modified by any changes that you made.
Sparse checkout means that for this local repo instance,
only portions of the original repo are checked out.
However, the integrity of the entire original repo is maintained.
If someone else checks out the new repository,
without performing the sparse checkout procedure,
they will receive all of the contents of the original repo.
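The behavior described above can be checked locally without touching GitHub. The following is a minimal sketch, assuming a throwaway repo under a temporary directory; the directory names source, sparse and origin.git and the two sample files are made up for illustration and stand in for the real upstream and origin.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the original repo, with two top-level directories.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
mkdir -p example/sqlite lib
echo 'puts "hi"' > example/sqlite/app.rb
echo 'module Demo; end' > lib/demo.rb
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'Initial commit'
cd ..

# Sparse working copy that only materializes /example/sqlite.
git init -q sparse
cd sparse
git symbolic-ref HEAD refs/heads/master
git remote add upstream "$work/source"
git config core.sparseCheckout true
mkdir -p .git/info
echo "/example/sqlite" >> .git/info/sparse-checkout
git pull -q upstream master

# Push the sparse repo to a hypothetical bare "origin".
git init -q --bare "$work/origin.git"
git remote add origin "$work/origin.git"
git push -q origin master

# Files present in the working tree (only the sparse path).
checked_out=$(find . -path ./.git -prune -o -type f -print | sort)
# Files recorded in the pushed branch (the whole repo).
pushed=$(git --git-dir="$work/origin.git" ls-tree -r --name-only master)

echo "working tree:"; echo "$checked_out"
echo "pushed to origin:"; echo "$pushed"
```

The working tree listing should contain only the file under example/sqlite, while the listing from the bare repo should also include lib/demo.rb, because sparse checkout limits what is written to disk, not what is stored in the commits.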
This is how I defined the upstream remote:
$ git remote add --no-tags -t master -f upstream \
  https://github.com/sinatra-activerecord/sinatra-activerecord.git
Updating upstream
remote: Enumerating objects: 1450, done.
remote: Counting objects: 100% (232/232), done.
remote: Compressing objects: 100% (125/125), done.
remote: Total 1450 (delta 82), reused 204 (delta 74), pack-reused 1218
Receiving objects: 100% (1450/1450), 229.40 KiB | 2.76 MiB/s, done.
Resolving deltas: 100% (543/543), done.
From https://github.com/sinatra-activerecord/sinatra-activerecord
 * [new branch]      master     -> upstream/master
The upstream git project has a lot of tags.
In the above command, I used the --no-tags option to suppress the downloading of all tags.
The -t master option further restricted the fetch, so only the master branch was downloaded.
The -f option ran git fetch immediately after the remote was set up.
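To see what these options actually record, the remote's configuration can be inspected. This is a small sketch that runs in a scratch repository and deliberately leaves out -f, so nothing is downloaded; the variable names tag_opt and refspec are just for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Same options as in the article, minus -f, so no network access happens.
git remote add --no-tags -t master upstream \
  https://github.com/sinatra-activerecord/sinatra-activerecord.git

# --no-tags is stored as the remote's tagOpt setting.
tag_opt=$(git config --get remote.upstream.tagOpt)
# -t master narrows the fetch refspec to a single branch.
refspec=$(git config --get remote.upstream.fetch)

echo "tagOpt:  $tag_opt"
echo "refspec: $refspec"
```

Because both settings live in .git/config, later git fetch upstream or git pull upstream commands keep honoring them without the options being repeated.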
Now for the magic incantations that enable and define this git repo's sparse checkout:
$ git config core.sparseCheckout true
$ echo "/example/sqlite" >> .git/info/sparse-checkout
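Git 2.25 and later also ship a git sparse-checkout command that manages the same setting and pattern file. The sketch below is an alternative I did not use in this article; it assumes a reasonably recent Git and, to stay self-contained, works against a made-up local repo (the source and sparse directory names and sample files are hypothetical) rather than the real upstream.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for a larger upstream repo.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/master
mkdir -p example/sqlite lib
echo 'puts "hi"' > example/sqlite/app.rb
echo 'module Demo; end' > lib/demo.rb
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'Initial commit'
cd ..

# A normal clone starts with everything checked out.
git clone -q "$work/source" sparse
cd sparse

# Enable sparse checkout, then restrict the working tree to one directory.
git sparse-checkout init
git sparse-checkout set example/sqlite

# Show the active patterns and what is left on disk.
patterns=$(git sparse-checkout list)
echo "$patterns"
ls
```

Unlike editing .git/info/sparse-checkout by hand, git sparse-checkout set updates the working tree immediately, so no extra pull or checkout is needed to make the other directories disappear.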
It is now possible to pull down just the contents of the /example/sqlite directory from the upstream remote:
$ git pull upstream master
remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
From https://github.com/sinatra-activerecord/sinatra-activerecord
 * branch            master     -> FETCH_HEAD
Here are the files and directories that I just sparsely cloned from the repo:
$ tree
.
└── example
    └── sqlite
        ├── Gemfile
        ├── README.md
        ├── Rakefile
        ├── app.rb
        ├── bin
        │   └── rake
        ├── config
        │   └── database.yml
        ├── config.ru
        └── db
            ├── development.sqlite3
            ├── migrate
            │   ├── 20140415201712_create_users.rb
            │   └── 20140415204542_create_posts.rb
            ├── schema.rb
            ├── seeds.rb
            ├── structure.sql
            └── test.sqlite3

6 directories, 14 files
Next, I used the GitHub CLI to create a repo in my GitHub account to hold the complete repo, along with my modifications.
$ gh repo create --public --source=. --remote=origin
✓ Created repository mslinn/sinatra-activerecord-sqlite on GitHub
✓ Added remote git@github.com:mslinn/sinatra-activerecord-sqlite.git
The above gh repo create command automatically names the repo from the current directory name.
I do this so often that I defined 2 bash aliases in ~/.bash_aliases:
alias gh_new_private='gh repo create --private --source=. --remote=origin'
alias gh_new_public='gh repo create --public --source=. --remote=origin'
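The two aliases differ only in the visibility flag, so they could also be folded into one shell function. This is just a sketch of that idea; the name gh_new and its argument handling are hypothetical and not something I actually have in ~/.bash_aliases.

```shell
# Hypothetical helper: gh_new [private|public], defaulting to private.
gh_new() {
  visibility="${1:-private}"
  case "$visibility" in
    private|public) ;;
    *)
      # Reject anything else before gh is ever invoked.
      echo "usage: gh_new [private|public]" >&2
      return 2
      ;;
  esac
  # Same flags as the aliases; only the visibility flag varies.
  gh repo create "--$visibility" --source=. --remote=origin
}
```

Calling gh_new public from inside a repo would then behave like gh_new_public, while a typo such as gh_new pubic fails with a usage message instead of reaching GitHub.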