Yet Another File Synchronization Program

Long story short, I am going to design and implement my own tool in the spirit of Dropbox, Resilio Sync, Syncthing, [You Name It]. In this blog post I explain why a new tool is needed and outline its architecture using the C4 model.

I love Org mode. It provides an elegant way to combine text, lists, prose, reminders and todo items. In fact, it helps me connect notes about the past with reminders for the future. There is also Orgzly, a nice Org mode client for Android.

It's natural to have multiple devices in 2019. A personal laptop, a corporate computer and a phone: these are the devices I use daily. Org mode doesn't have any native synchronization, so the immediate solution is to use one of the existing synchronization services like the ones named above.

Why a new tool?

What strikes me is that all of the file synchronization tools mentioned above are general-purpose tools. They have to perform well on binary data, and that keeps them from taking full advantage of Org mode files being plain text. For Org mode files, history is sometimes even more important than the current data. You don't need to browse the history every day, but when you need it, you usually need it badly. Even if those tools kept a full history of everything, it would be hard to learn how to explore that history when you only need it occasionally. Long story short, there is one tool that is great at going back in time and is already at my fingertips: git. Git is also much better at merging files changed by two parties than other tools, which normally accept one of the copies as the master.

Using git manually to save the history of Org mode files is tedious. It's also easy to forget to commit something important or to push to the remote. What I want is a tool that automatically commits all detected changes and synchronizes the local and remote repositories frequently (once every 3-5 minutes).
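To make the idea concrete, here is a rough sketch of what one such cycle could look like. It is not the design of the actual tool, just an illustration under a few assumptions: the helper names (`git`, `sync_once`, `sync_forever`), the 180-second interval and the placeholder commit message are all made up for this example, and remote changes are merged rather than rebased.

```python
import subprocess
import time

SYNC_INTERVAL_SECONDS = 180  # assumption: the low end of "every 3-5 minutes"


def git(repo, *args):
    """Run one git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True,  # raise CalledProcessError if git exits non-zero
        capture_output=True,
        text=True,
    )
    return result.stdout


def sync_once(repo, run=git):
    """Commit local changes (if any), then merge from and push to the remote.

    Returns True when a commit was created. `run` is injectable so the
    cycle can be exercised without a real repository.
    """
    committed = False
    # `git status --porcelain` prints one line per changed path, nothing if clean.
    if run(repo, "status", "--porcelain").strip():
        run(repo, "add", "--all")
        run(repo, "commit", "-m", "Automatic commit")  # placeholder message
        committed = True
    # Merge remote changes; with the default `git` helper a conflict raises here.
    run(repo, "pull", "--no-rebase", "--no-edit")
    run(repo, "push")
    return committed


def sync_forever(repo, interval=SYNC_INTERVAL_SECONDS):
    """Repeat the cycle until the process is stopped."""
    while True:
        sync_once(repo)
        time.sleep(interval)
```

Calling `sync_forever("/path/to/notes")` would keep a repository in sync until interrupted. A real tool would also need to decide what to do when the merge step fails, which this sketch simply leaves as an exception.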

Additionally, I have had a great experience with password store, which uses git to manage its data. It is trivial to build a mental model of this system, and it's extremely easy to browse or manipulate history when needed. It looks like building concurrent software on top of git might be a good idea.

Let's name it

Naming is hard. After spending some time on it, I decided to go with plain-sync as the project name and psy as the binary name. The project name clearly says that we are going to synchronize plain text files, and the binary name is short and easy to remember. The name might not be that discoverable by search engines, but I'll think about that some other day. Another downside is that y is not the easiest letter to touch type.

C4

I've been waiting for a proper pet project to try the C4 model. Let's try it here. For those who don't know, C4 provides a more unified way to visualize software architecture. I will describe the design at several zoom levels.

Design: Context

So to sum up everything I've described, the system is going to fit into the world this way:

Context Diagram

Source

Design: Container

Let's zoom in on the Laptop:

Container Diagram

Source

Git commit messages will look like this:

$ git log --pretty=format:"%ad | %s" --date=short
2019-08-24 | File rel/path/to/file.org was edited by usr@host1
2019-08-24 | File deleted/file.org was deleted by theuser@host2
2019-08-20 | File added/file.org was added by usr@host1
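As a small illustration of that format, the subject lines above could be produced by a helper along these lines. This is only a sketch: the function name `commit_message`, the `ACTIONS` set and the fallback to the current user and hostname are assumptions made for the example, not part of the design.

```python
import getpass
import socket

# The three actions that appear in the sample log above.
ACTIONS = {"added", "edited", "deleted"}


def commit_message(path, action, user=None, host=None):
    """Build a subject line such as 'File a.org was edited by usr@host1'."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Assumption: fall back to the local login name and hostname.
    user = user or getpass.getuser()
    host = host or socket.gethostname()
    return f"File {path} was {action} by {user}@{host}"


print(commit_message("rel/path/to/file.org", "edited", "usr", "host1"))
```

Keeping the path, the action and the author in the subject line means the one-line log shown above is enough to see what changed, where and on which device.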

Design: Components

That's what I love about C4. Even though it makes clear that this zoom level is absolutely needed, we should only start thinking about it once we are confident in the layers above. The component diagram is out of scope for this post.

I'll dive into the design of components in future blog posts.