Yet Another File Synchronization Program
Long story short, I am going to design and implement my own tool that is similar to Dropbox, Resilio Sync, Syncthing, [You Name It]. In this blog post I explain why a new tool is needed and outline the architecture using the C4 model.
I love Org mode. It provides an elegant way to combine text, lists, prose, reminders and todo items. In fact it helps me to connect notes about the past and reminders for the future. There is also Orgzly - a nice org mode client for Android.
It's natural to have multiple devices in 2019. A personal laptop, a corporate computer and a phone - these are the devices I use daily. While Org mode doesn't have any native synchronization, an immediate solution is to use one of the existing Cloud synchronization services:
- Dropbox - a rock solid solution, but I don't feel 100% comfortable installing their client on my Linux laptop. All the data is stored on the server side.
- Resilio Sync (previously known as BitTorrent Sync) - a proprietary technology that was originally developed in Minsk. It is P2P: all the data is stored on clients and there is no copy on the server side (well, you can always build your own server that acts as a Resilio Sync client). What I don't 100% like is that the management UI on Linux is implemented as a web app that listens on localhost - it doesn't provide enough isolation (it's a nice solution for managing a remote server though).
- Syncthing - an open source tool that is very close to Resilio Sync from a user's perspective. It has the same issue with the management UI being a web app.
Why a new tool?
What strikes me is that all of the file synchronization tools mentioned above are general-purpose tools. They have to perform well on binary data, and this doesn't let them take full advantage of Org mode files being plain text. In fact, for Org mode files the history is sometimes even more important than the actual data. Well, you don't need to browse the history every day, but when you need it, you usually need it badly. Even if those tools provided a full history of everything, it would be hard to learn how to explore that history if you only need it occasionally. Long story short, there is one tool that is great at going back in time and that is at my fingertips - it's git. Git is also much better at merging files changed by 2 parties compared to other tools, which normally accept one of the copies as the master one.
Using git manually to save the history of Org mode files is tedious. It's also easy to forget to commit something important or to push the data to the remote. What I want is a tool that automatically commits all the detected changes and synchronizes the local and remote repositories frequently (once every 3-5 minutes).
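To make the idea a bit more tangible, here is a minimal Python sketch of such a commit-and-sync cycle. It is not the design of the tool itself: the function names, the 4-minute interval and the choice of `git pull --rebase` are all my assumptions for illustration, and it simply shells out to the regular git command line.

```python
import subprocess
import time

# Assumption: any value in the 3-5 minute range mentioned above would do.
SYNC_INTERVAL_SECONDS = 4 * 60


def has_local_changes(porcelain_output):
    """Return True if `git status --porcelain` listed at least one path."""
    return any(line.strip() for line in porcelain_output.splitlines())


def git(repo_dir, *args):
    """Run a git command inside repo_dir and return its standard output."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def sync_once(repo_dir, message="Automatic sync"):
    """Commit whatever changed locally, then exchange commits with the remote."""
    if has_local_changes(git(repo_dir, "status", "--porcelain")):
        git(repo_dir, "add", "--all")
        git(repo_dir, "commit", "--message", message)
    # Rebasing is one possible choice here; a plain merge would work as well.
    git(repo_dir, "pull", "--rebase")
    git(repo_dir, "push")


def run_forever(repo_dir):
    """Hypothetical main loop: synchronize, sleep, repeat."""
    while True:
        sync_once(repo_dir)
        time.sleep(SYNC_INTERVAL_SECONDS)
```

Calling `run_forever` with the path to a cloned repository would keep it in sync until the process is stopped. The sketch does not handle conflicts: if the pull cannot be completed cleanly, `subprocess.run` raises `CalledProcessError` and the loop stops, which a real tool would have to deal with.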
Additionally I have great experience with password store that uses git for managing the data. It is trivial to build a mental model of this system, it's extremely easy to browse or manipulate history when needed. It looks like building concurrent software on top of git might be a good idea.
Let's name it
Naming is hard. After spending some time I decided to go with plain-sync as the project name and psy as the binary name. It clearly describes that we are going to synchronize plain text files, and the binary name is short and easy to remember. Well, the name might not be that discoverable by search engines, but I'll think about it some other day. Another downside is that y is not the easiest letter to touch type.
C4
I've been waiting for a proper pet project to try the C4 model. Let's try it here. For those who don't know, C4 provides a more unified way to visualize software architecture. I will describe the design at several zoom levels:
- Context - how the system fits into the world around us.
- Container - what are the bigger building blocks the system relies on.
- Component - what is the internal structure of each container (e.g. modules, channels, facades, etc.)
- Code - no, I'm not going to do that.
Design: Context
So to sum up everything I've described, the system is going to fit into the world this way:
Design: Container
Let's zoom into Laptop:
Git commit messages will look like this:
$ git log --pretty=format:"%ad | %s" --date=short
2019-08-24 | File rel/path/to/file.org was edited by usr@host1
2019-08-24 | File deleted/file.org was deleted by theuser@host2
2019-08-20 | File added/file.org was added by usr@host1
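Messages in this format could be produced by something like the Python sketch below. The helper names `describe_change` and `commit_message` are hypothetical, and the mapping from `git status --porcelain` codes to the words "added", "edited" and "deleted" is my assumption. It ignores renames and quoted paths.

```python
import getpass
import socket


def describe_change(status_line):
    """Split one `git status --porcelain` line into (path, action)."""
    # Porcelain v1 lines are two status characters, a space, then the path.
    code, path = status_line[:2], status_line[3:]
    if code == "??" or "A" in code:
        action = "added"
    elif "D" in code:
        action = "deleted"
    else:
        action = "edited"
    return path, action


def commit_message(path, action, user=None, host=None):
    """Build a message like the ones in the log output above."""
    # Fall back to the current login name and host name when not given.
    user = user or getpass.getuser()
    host = host or socket.gethostname()
    return f"File {path} was {action} by {user}@{host}"


path, action = describe_change(" M rel/path/to/file.org")
print(commit_message(path, action, user="usr", host="host1"))
# prints "File rel/path/to/file.org was edited by usr@host1"
```

Committing one file per commit keeps every message about a single path, which should make the history easy to scan with `git log`.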
Design: Components
That's what I love about C4. Even though it makes clear that this zoom level is absolutely needed, we should only start thinking about it when we are confident in the layers above. Going into the component diagrams is out of scope for this post.
I'll dive into the design of components in future blog posts.