How we stopped worrying about submodules and focused on code

A couple of years ago, I was at a large tech conference, presenting SubGit, our bestselling tool for SVN-to-Git migration. Among the people who visited our booth was a DevOps lead from the IT department of a large international retailer. He was aware of the advantages of Git but was not planning to move his team to it. "We have a lot of small projects," he said, "and they are all interconnected. Having each of them in a separate Git repository would be a nightmare: for each minor update, one would have to push to a dozen different repositories. Besides, we make heavy use of svn:externals, and there's no sound alternative to them in Git."

That sounded weird. Not being a developer myself, I asked a few people what they do when they have several Git repositories for different parts of the same project.

  • One said: "I just run several instances of my IDE at the same time and switch between them."
  • Another said: "I try to avoid that and keep all projects in the same repository. Takes a while to clone it, though."
  • The third one said: "I use Git Submodules."

Bingo! So why couldn't this guy from the conference just use Submodules? Very soon, I knew why. Git Submodules, intended to do just that, turned out to be very inconvenient and unsafe. Not only did each user have to initialize them and run submodule-specific commands in their working copy, but an accidental mistake, such as updating the main repository before the submodules, could ruin the whole setup.

We are a small team, and we don't run many projects simultaneously, but still, we have a few libraries shared between the projects. So we kept coming back to the same question: how do we update the main repository and the shared library with one push? At some point, one of us asked: "Can we use the same technology that we developed to mirror SVN and Git to sync one folder in a Git repository with another Git repository?"

Yes, we can.

Almost two years later (as I said, we are a small team and had other products to support and improve), we had a working alpha of "Git X-Modules", a server-side solution that handles the externals/submodules case without any overhead for end users. We showed it to several friends and college mates and asked if they would use it in their work to combine multiple repositories into one. Some of them said: "I would have used it a few years ago, but the current trend is to put everything in a monorepo."

Bummer. Was it just that simple? Actually, no. The monorepo is still more of a battlefield than an industry standard. It's being discussed all over, and for every post that praises this approach, there's another post that condemns it. The main disadvantage is not even the size of such a repository, but the mess created by dozens (or thousands) of developers from different projects pushing to the same place. So we asked ourselves: "Can we achieve the advantages of a monorepo with X-Modules?"

Yes, we can.

By that time, we had been using X-Modules on our company servers for quite a while, having a lot of fun combining repositories in various ways, like Lego blocks. A shared library was only one way to use it; another scenario was to take an empty repository and fill it with X-Modules synced with other repositories, which is essentially a monorepo. There was also the opposite case: a large repository split by X-Modules into several small projects.

Another round of interviews: "Your command-line tool is great, but I would rather use it as a plugin for my Bitbucket." That was an easy one, since we had done it before with SubGit. A beta version of the Git X-Modules App for Bitbucket Server was uploaded to the Atlassian Marketplace in October 2020... one day before Atlassian announced that Server was reaching end of life and that everybody was encouraged to move to the Cloud.

So our journey goes on. Now we have to figure out how to make our app work with cloud-based Git services. We have one or two ideas up our sleeves :-). Actually, this post should have been called "How we stopped worrying about submodules and focused on making a better solution." But if you have a self-hosted Git server, you can already try this solution out at https://gitmodules.com.

What do you think, was it worth it? Or should we just have "gone monorepo", like everyone else?