Suggestions for deploying from a monolithic Git repo?


#1

Hello,

I’m working on a project which I want to deploy with Resin. The code I want to deploy is a subset of a larger repository. I can’t add resin.io as a remote and use git push resin master because I can’t risk the rest of the repo being pushed. I’m currently maintaining a separate repo which I copy to before deploying, but I’m wondering if anybody out there has a suggestion for a nicer approach? Ideally it would let me maintain my project history but not risk including any of the rest of the repo, and would present the project as the root even though it is deep in the larger repository.

I’m looking at using git subtree but I’m not super excited about the amount of careful Git monkeying I’d need to do. The simpler the better.

Any ideas or pointers would be really appreciated.


#2

Hello @jmptable,
Thanks for sharing your earthquake project, it’s very cool.

As for your question, have you considered using git submodules?


#3

Thanks, I am glad you liked that project.

Unfortunately I think git-submodule doesn’t quite fit. I can’t move the code I’d like to deploy with Resin into a separate repository to include as a submodule because it’s part of a larger project which has a monolithic codebase. The main reason for it being set up that way is to reduce the difficulty of tracking different versions of the various component projects that make it up.

Possibly a different way to approach it: is there a method for deploying to resin.io besides git push resin master? For example, an API endpoint accepting a tarball of the application?


#4

The resin.io API doesn’t support pushing an application as a tarball, and I am not aware of any plans to add such a feature.
I assume the only way to both preserve the monolithic repo and have a separate repo for the application is to do some Git gymnastics; git-subtree sounds like an interesting option.


#5

Could you give some more concrete examples of how the project you are trying to deploy is structured? Is it a monolithic codebase that also contains the various components that make up the project? Do your Dockerfile and the other things required for deployment live outside of that codebase?


#6

The tree looks something like this:

.
├── ci
├── docs
├── projects
│   ├── project-a
│   │   └── codenstuff
│   ├── project-b
│   │   ├── Dockerfile
│   │   ├── othercodenstuff
│   │   ├── packages
│   │   │   ├── thing-for-resin-1
│   │   │   └── thing-for-resin-n
│   │   └── scripts
│   └── project-z
│       └── bunchmorecodenstuff
└── tools

So everything for deployment does live in the codebase, and in the same repository with other projects. In the above example projects/project-b would be what I would want to deploy with Resin. For more context: taking an approach like copying that subdirectory and artificially constructing a git repository, or slicing out part at deploy time with a tool like git subtree, is more appealing than reorganizing the repository because everything the company I work for does lives in this repo.
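For reference, the "copy the subdirectory and construct a repository" approach can be sketched roughly as below. This is only an illustration in a throwaway repo: the temporary directories, the demo identity, and the file contents are assumptions, and the snapshot it produces does not carry over any project history.

```shell
#!/bin/bash
set -e

# Throwaway identity so the demo does not depend on local Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

monorepo=$(mktemp -d)      # stand-in for the big repository
deploy_repo=$(mktemp -d)   # stand-in for the separate deploy repository

# Minimal stand-in for the layout shown above.
cd "$monorepo"
git init -q
mkdir -p projects/project-a projects/project-b
echo "code" > projects/project-a/codenstuff
echo "FROM scratch" > projects/project-b/Dockerfile
git add .
git commit -q -m "initial layout"

# Export only the committed contents of the subdirectory.
git archive HEAD:projects/project-b | tar -x -C "$deploy_repo"

# Build a fresh repository from the snapshot (history is NOT preserved).
cd "$deploy_repo"
git init -q
git add .
git commit -q -m "snapshot of project-b"

snapshot_files=$(git ls-files)
echo "$snapshot_files"
```

Because git archive only exports committed content from the named tree, nothing outside projects/project-b can end up in the deploy repository.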

I was curious if this was a common problem and if there might be a known best practice but it doesn’t sound like it’s come up so I’ll likely go with a solution based on git subtree. Once I figure out how it looks I’ll be sure to share here.


#7

It does seem to be a sort of a clash of deployment strategies. When people work from monorepos, IMHO they usually also have a mono-deployment (or at least singular path), instead of multiple, different ways (like here). Either way, one side or the other will need to work around it a bit.

If I were to brainstorm a little, and the main repo cannot use submodules/subtrees (e.g. to include your project in such a way), then some automatic copying could work, or other tricks directly with Git, for example either git filter-branch with --subdirectory-filter or git subtree split.

Applying these to your example:

  • git checkout -b project-b && git filter-branch --prune-empty --subdirectory-filter projects/project-b project-b
  • git subtree split -P projects/project-b -b project-b

would take your directory structure, and create a new branch with just:

.
├── Dockerfile
├── othercodenstuff
├── packages
│   ├── thing-for-resin-1
│   └── thing-for-resin-n
└── scripts

So you could take the monorepo, filter or split it, and push the resulting branch to resin with git push resin <branchname>:master.

The speed depends on the number of commits it has to sift through; it can be pretty speedy or slow (in my testing I’ve seen both, depending on how many commits touched that subdirectory). I’m sure @petrosagg would have some more creative or suitable ideas too.
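To make the filter-and-push flow concrete, here is a minimal sketch run against a throwaway repository. The temporary directory, the demo identity, and the file contents are assumptions for illustration; the push itself is left as a comment because it needs a real resin remote.

```shell
#!/bin/bash
set -e

# Throwaway identity so the demo does not depend on local Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer Git versions pause with a deprecation warning unless this is set.
export FILTER_BRANCH_SQUELCH_WARNING=1

demo_repo=$(mktemp -d)
cd "$demo_repo"
git init -q

# Minimal stand-in for the monorepo layout from the thread.
mkdir -p projects/project-a projects/project-b
echo "code" > projects/project-a/codenstuff
echo "FROM scratch" > projects/project-b/Dockerfile
git add .
git commit -q -m "initial layout"

# Rewrite a separate branch so the original branch keeps the full tree.
git branch project-b
git filter-branch --prune-empty \
  --subdirectory-filter projects/project-b project-b >/dev/null

# The rewritten branch now has the project directory as its root.
split_root=$(git ls-tree --name-only project-b)
echo "$split_root"

# Deploying would then be (not run here, needs the resin remote):
#   git push resin project-b:master
```

Only the project-b branch is rewritten here; the branch that was checked out when the script started still holds the complete monorepo history, so it is worth doing this in a copy of the repository.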

It’s a very interesting point, though, because at resin we are almost at the other extreme: many, many small repositories, which has its own set of problems :)


#8

Thanks very much for the thorough response. You got me pointed in the right direction.

I explored both options and ended up preferring git filter-branch. Both tools end up including information outside of the subdirectory you give them if that information is part of a commit which touches the subdirectory of interest. But git filter-branch provides a means of fixing that, in the form of the --tree-filter option.

To use it to get what I’m looking for you need a script like the following on the PATH (named “only-the-project” here):

#!/bin/bash

set -o xtrace

if [ "$(pwd)" != "$REPO_PATH/.git-rewrite/t" ]; then
  echo "Unexpected directory: $(pwd)"
  exit 1
fi

if [ ! -e "projects/the-project-of-interest" ]; then
  echo "Project not present"
  exit 0
fi

# enable !() invert syntax
shopt -s extglob

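# remove everything except projects/the-project-of-interest
# (without dotglob, !() does not match hidden files)
rm -r -- !(projects)
# stop rather than delete from the wrong directory if cd fails
cd projects || exit 1
rm -r -- !(the-project-of-interest)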

And then you can cd into a copy of the repository and execute:

REPO_PATH=$(pwd) git filter-branch --prune-empty --tree-filter only-the-project --msg-filter "gpg --encrypt --armor --recipient Me" --subdirectory-filter projects/the-project-of-interest master

Assuming you have a GPG key for someone named “Me” on the system this will remove everything from the repo history except what is in the project subdirectory and encrypt the commit messages. The result can then have the resin remote added to it. Finally git push resin master deploys it.
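Before pushing, one way to sanity-check the result is to compare tree hashes: the root tree of the rewritten branch should be identical to the subdirectory's tree in the original history. The sketch below shows that check in a throwaway repo; the directory names, the demo identity, and the variable names are assumptions for illustration, and it uses only --subdirectory-filter rather than the full command above.

```shell
#!/bin/bash
set -e

# Throwaway identity so the demo does not depend on local Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer Git versions pause with a deprecation warning unless this is set.
export FILTER_BRANCH_SQUELCH_WARNING=1

demo_repo=$(mktemp -d)
cd "$demo_repo"
git init -q

# First commit touches both projects, second touches only project-a.
mkdir -p projects/project-a projects/project-b
echo "code" > projects/project-a/codenstuff
echo "FROM scratch" > projects/project-b/Dockerfile
git add .
git commit -q -m "initial layout"
echo "more code" >> projects/project-a/codenstuff
git commit -q -am "change unrelated to project-b"

# Rewrite a separate branch; HEAD stays on the untouched original branch.
git branch project-b
git filter-branch --prune-empty \
  --subdirectory-filter projects/project-b project-b >/dev/null

# Trees are content-addressed, so equal hashes mean identical contents.
original_tree=$(git rev-parse "HEAD:projects/project-b")
rewritten_tree=$(git rev-parse "project-b^{tree}")

if [ "$original_tree" = "$rewritten_tree" ]; then
  check_result="match"
else
  check_result="mismatch"
fi
echo "tree check: $check_result"

# Commits that never touched the project should be gone from the new branch.
rewritten_commits=$(git rev-list --count project-b)
echo "commits on rewritten branch: $rewritten_commits"
```

This only checks the tip of the branch. Inspecting older commits with something like git log --stat on the rewritten branch is still worthwhile before adding the remote.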

Worth noting that --prune-empty is essential if the repo you are working with is large. Without it the command above took >15 minutes to run, but with it the command takes ~10 seconds.