
Consolidating Repositories

2026-05-16

After the recent update to this server, and a bit of reflection on which services I both use and get value from, I have been consolidating some old repositories to host them on this server rather than on one of the numerous code forges. Here are a few notes on the mechanics of that.

A Target

I've mentioned that I've found fossil pretty painless to self-host, so I figured I may as well consolidate to it rather than pull in new technologies and tools. I've got a few old repositories in both git and mercurial, and I'd like to maintain what little history exists in each. Mostly this is for my own memory, as it may help correlate something I've previously written to the code itself (it should also be more representative of where I was, chronologically, at the point of development).

Git to Fossil

Easy things first: fossil has a built-in import for a git export, so dumping all the commits into a single repository is pretty straightforward. Because none of the repositories were started with the idea that they'd need any kind of namespacing, most files in each repository exist at the root directory. To organize each repository at the time of import, I found git-filter-repo sufficient to move each project under a subdirectory, so that the consolidated repository scopes each project under its own name in the file system hierarchy.

git -C "$tmp_clone" filter-repo --force --to-subdirectory-filter "$subdir"

Obviously $tmp_clone is a local clone I've made of the source repository. I wasn't convinced I wouldn't mess up the source repository in the migration so I cloned it locally before operating against it. The $subdir is just the name of the specific project I was namespacing.

Perhaps also obvious: the above command is run from within a fresh git repository (that I called "workspace" while I was iterating on it), with -C pointing filter-repo at the clone. All of the git commits ultimately need to land somewhere, and the goal is to put them into one git repo for reasons I'll explain in the next section.
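One way to gain some confidence in the rewrite before merging anything is to list every path the clone's history touches and flag whatever still sits outside the subdirectory. The helper below is only a sketch: the name paths_outside_subdir is made up for illustration, and it works on a plain list of paths so it can be fed from git log.

```shell
# Print every path read from stdin that is NOT under the given subdirectory.
# Blank lines (git log emits them between commits) are skipped.
paths_outside_subdir() {
  local subdir="$1" path
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    case "$path" in
      "$subdir"/*) ;;                 # namespaced as expected
      *) printf '%s\n' "$path" ;;     # still outside the subdirectory
    esac
  done
}

# Possible usage against a rewritten clone (no output means all paths moved):
#   git -C "$tmp_clone" log --all --name-only --format= \
#     | paths_outside_subdir "$subdir"
```

An empty result suggests every file in every commit now lives under the project's own directory, which is what the later merges rely on.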

Mercurial to Fossil

There's less of a straight line from Mercurial to Fossil. First I found I needed to translate Mercurial to Git, for which fast-export seemed to do a sufficient job. My usage was simple enough to be covered by the README; it was just:

$ git init ${project}-git
$ cd ${project}-git
$ hg-fast-export.sh -r ../${project}
$ git checkout

From there the same git-filter-repo process covered the import into a single git repository.

Bundling It All Up

In practice I first converted the Mercurial repositories to git. Once I had several git repositories:

  1. I used the filter-repo operation to place each repository in its own subdirectory
  2. I added each repository as a remote to a new repository
  3. I merged each remote (knowing that all the distinct files were namespaced under the filter-repo subdirectories)
  4. I removed each remote

WORK_TMP=$(mktemp -d)

for subdir in "${!PROJECTS[@]}"; do
  src="${PROJECTS[$subdir]}"
  tmp_clone="$WORK_TMP/$subdir"

  git clone "$src" "$tmp_clone"

  git -C "$tmp_clone" filter-repo --force --to-subdirectory-filter "$subdir"

  git remote add "$subdir" "$tmp_clone"
  git fetch "$subdir" --no-tags
  git merge --allow-unrelated-histories "$subdir/master" \
    -m "Merge $subdir into subdirectory $subdir/"
  git remote remove "$subdir"
done
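The loop above assumes that PROJECTS already exists as a bash associative array mapping each subdirectory name to its source repository. A minimal sketch of that setup follows; the project names and paths are placeholders rather than the real list, and the ORDERED array is an addition of this sketch, not part of the original script.

```shell
#!/usr/bin/env bash
# Associative arrays (declare -A) need bash 4 or newer.
if (( BASH_VERSINFO[0] < 4 )); then
  echo "bash 4+ is required for associative arrays" >&2
  exit 1
fi

# Hypothetical mapping: subdirectory name -> source repository to clone.
declare -A PROJECTS=(
  [wunderkammer]="/path/to/wunderkammer-git"
  [other-project]="/path/to/other-project-git"
)

# The key order of an associative array is unspecified, so sort the keys
# to get the same merge order on every run.
mapfile -t ORDERED < <(printf '%s\n' "${!PROJECTS[@]}" | LC_ALL=C sort)

for subdir in "${ORDERED[@]}"; do
  printf '%s <- %s\n' "$subdir" "${PROJECTS[$subdir]}"
done
```

Iterating over "${ORDERED[@]}" instead of "${!PROJECTS[@]}" only changes the order in which the merge commits are created; the resulting tree is the same either way.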

Because I'm consolidating multiple independent repositories with no real shared history I had to find and use the --allow-unrelated-histories flag when merging:

--allow-unrelated-histories

By default, git merge command refuses to merge histories that do not share a common ancestor. This option can be used to override this safety when merging histories of two projects that started their lives independently. As that is a very rare occasion, no configuration variable to enable this by default exists or will be added.
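The behaviour is easy to reproduce in isolation. The following throwaway sketch builds two unrelated repositories in a temporary directory (the names a and b and the helper g are made up for the demo), tries a plain merge, then repeats it with the flag. It assumes a git recent enough to have --allow-unrelated-histories.

```shell
set -eu

demo=$(mktemp -d)

# Run git quietly with a throwaway identity so the demo does not depend on
# any global configuration.
g() {
  git -c user.name=demo -c user.email=demo@example.invalid "$@" >/dev/null 2>&1
}

# Two repositories that started their lives independently.
for name in a b; do
  g init "$demo/$name"
  echo "$name" > "$demo/$name/$name.txt"
  g -C "$demo/$name" add "$name.txt"
  g -C "$demo/$name" commit -m "$name: first commit"
done

# Bring b's history into a without merging it yet.
g -C "$demo/a" fetch "$demo/b" HEAD

# Without the flag, git should refuse because there is no common ancestor.
if g -C "$demo/a" merge FETCH_HEAD; then
  refused=no
else
  refused=yes
fi

# With the flag, the two root commits are joined by a merge commit.
g -C "$demo/a" merge --allow-unrelated-histories -m "Merge b into a" FETCH_HEAD

echo "plain merge refused: $refused"
ls "$demo/a"
```

Because each project was first moved into its own subdirectory, merges like this one never collide on file names, which is why the loop can run them unattended.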

Importing to Fossil

Finally, with one git repository holding all the related histories, it is a single step to import into a new fossil repository:

$ git fast-export --all | fossil import --git "../${new_fossil_repo}.fossil"

Thoughts

It was passingly entertaining to explore a few different features of each technology, though I am not certain how valuable it will really be long term. Partly as a result of the merge operations, the repository "history" seems quite complicated in the visual interfaces I've looked at (the timeline view in the web UI and the output of fossil timeline). I don't imagine that is terribly important, as all of my own work is incredibly linear and unrelated to coincident changes. At least backups will be even easier now! Finally, the really good news is that the per-file history is easily navigable in my preferred VCS interface, which is emacs vc-mode:

VC backend : Fossil                                                                                                
Working dir: ~/sources/wunderkammer/                                                                               
Repository : /home/nolan/sources/consolidated.fossil                                                               
                                                                                                                  
Checkout   : 5dedbc6dd5 2026-05-20 02:40:25 EDT (leaf)                                                             
Tags       : master, trunk                                                                                         
                                                                                                                  
                        ./                                                                                        
-UUU:@%*-  F4  *vc-dir*       Top   L1     (VC dir) --------------------------------------------------------------
=== 2017-12-08 ===                                                                                                 
19:04:27 [f459e3ce64] Factoring out into a "real" package - Redo the README to be PyPI compatible - write a        
                     setup.py file - include MANIFEST.in (user: prescott.nolan@gmail.com tags: master, trunk)     
+++ no more data (1) +++                                                                                           
-UUU:@%%-  F4  *vc-change-log*   All   L2     (Fossil-Log-View from *Annotate post.py (rev fe508764fe)*<2>) ------
b963711883 2018-01-22    43:     def __eq__(self, other):                                                          
b963711883 2018-01-22    44:         '''                                                                           
b963711883 2018-01-22    45:         this may be a bit ambiguous, but semantically, it seems like a post is        
b963711883 2018-01-22    46:         "equal" to another if the text body is the same                               
b963711883 2018-01-22    47:         '''                                                                           
b963711883 2018-01-22    48:         return self.body == other.body                                                
b963711883 2018-01-22    49:                                                                                       
b963711883 2018-01-22    50:     def __repr__(self):                                                               
fe508764fe 2018-12-10    51:         return ('<Post: {title}, {date}>'                                             
fe508764fe 2018-12-10    52:                 .format(title=self.title, date=self.date))                            
f459e3ce64 2017-12-08    53:                                                                                       
f459e3ce64 2017-12-08    54:     def parse(self, raw_text):                                                        
f459e3ce64 2017-12-08    55:         '''                                                                           
f459e3ce64 2017-12-08    56:         Args:                                                                         
f459e3ce64 2017-12-08    57:             raw_text: string contents of a post file                                  
f459e3ce64 2017-12-08    58:         '''                                                                           
f459e3ce64 2017-12-08    59:         try:                                                                          
b963711883 2018-01-22    60:             post = Post(relative_dir=self.relative_dir)                               
b963711883 2018-01-22    61:             meta, body = self._split(raw_text)                                        
b963711883 2018-01-22    62:             post.title = meta['title']                                                
b963711883 2018-01-22    63:             post.slug = slugify(post.title)                                           
fe508764fe 2018-12-10    64:             post.path = os.path.join(self.relative_dir,                               
fe508764fe 2018-12-10    65:                                      '{slug}.html'.format(slug=post.slug))            
0085a738c4 2018-01-22    66:             post._date = self._parse_date(meta['date'])                               
0085a738c4 2018-01-22    67:             post.date = post._date.strftime('%Y-%m-%d')                               
b963711883 2018-01-22    68:             post.body = self.markdown(body)                                           
b963711883 2018-01-22    69:             post.leader = self.markdown(self._parse_leader(body))                     
b963711883 2018-01-22    70:             return post                                                               
f459e3ce64 2017-12-08    71:         except (ValueError, KeyError, TypeError) as e:                                
fe508764fe 2018-12-10    72:             raise ValueError('Unable to parse post from:\n{text}'                     
fe508764fe 2018-12-10    73:                              .format(text=raw_text[:50]))                             
b963711883 2018-01-22    74:                                                                                       
b963711883 2018-01-22    75:     @staticmethod                                                                     
b963711883 2018-01-22    76:     def _split(text):                                                                 
b963711883 2018-01-22    77:         '''                                                                           
b963711883 2018-01-22    78:         Take as input text comprising a post file:                                    
b963711883 2018-01-22    79:                                                                                       
b963711883 2018-01-22    80:             title: some text                                                          
b963711883 2018-01-22    81:             date: 2015-12-01                                                          
-UUU:@%*-  F4  *Annotate post.py (rev fe508764fe)*<2>   15%   L31    (Annotate from post.py) ---------------------