Playing with IPFS

I’ve been playing a bit with IPFS over the last couple of days. IPFS is a protocol that aims to replace the HTTP layer of the web.

Some of its notable aspects include:

  • it is a peer-to-peer protocol, so content can be served from any node in the network. Imagine if the BitTorrent protocol could be applied to the whole web – that’s IPFS.
  • it uses content addressing, so we reference a document in the network through its hash, which allows us to get the document from any node that holds it. This contrasts with the current location addressing scheme, in which we can only get documents from specific machines (e.g. the servers that wikipedia.org resolves to).
  • it uses an offline-first approach. It is a bit like Git, where you can do most operations offline and then sync with the network when you get back online.
  • since documents are identified by their hashes, whenever you change a document you end up generating a different hash, which allows you to keep a version history of files.
  • it has a great name – InterPlanetary File System (IPFS).
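
To make the content addressing idea concrete, here is a toy sketch in Python. It is not how IPFS is actually implemented: a plain SHA-256 hex digest stands in for a real IPFS identifier, and `ToyNode` and `fetch` are hypothetical names made up for the illustration. It only shows that the same bytes always get the same address, that any node holding them can serve them, and that an edited document gets a new address.

```python
import hashlib
from typing import Dict, List, Optional


def content_id(data: bytes) -> str:
    """Hex SHA-256 digest, used here as a stand-in for an IPFS hash."""
    return hashlib.sha256(data).hexdigest()


class ToyNode:
    """Hypothetical node that stores documents keyed by their hash."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        key = content_id(data)
        self.blocks[key] = data
        return key

    def get(self, key: str) -> Optional[bytes]:
        return self.blocks.get(key)


def fetch(key: str, nodes: List[ToyNode]) -> bytes:
    """Ask each node in turn; accept data only if it matches the requested hash."""
    for node in nodes:
        data = node.get(key)
        if data is not None and content_id(data) == key:
            return data
    raise KeyError(key)


if __name__ == "__main__":
    laptop, vps = ToyNode(), ToyNode()
    v1 = laptop.add(b"hello")
    vps.add(b"hello")                   # same bytes, same address on another node
    v2 = laptop.add(b"hello, world")    # edited document, different address
    print(fetch(v1, [vps]) == b"hello", v1 != v2)
```

Because the requester re-hashes whatever it receives, it does not need to trust the node that served the data, only the hash it asked for.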

If you want to learn more about IPFS I’d recommend watching the following videos:

The team behind IPFS is aiming to bring some fundamental changes to the web, and it is impressive to see how pragmatic they are about making that happen. For instance, sites hosted on IPFS can be served through their HTTP gateway (e.g. my site hosted on IPFS). Additionally, an implementation of IPFS in JavaScript is being developed so that IPFS can run natively in a browser without much effort from the user. From what I could understand, they are trying to build tools that make using IPFS effortless, in order to drive people to use it and realize its potential. I think that is a more realistic way of getting user adoption than waiting for everybody to switch to it at once.

I have configured a VPS to run an IPFS node and host my site. At first I tried to develop a deployment workflow in which the following would happen:

  • Make changes on my machine and push them to a Git repository
  • Pull changes from the Git repository to the VPS
  • Build the Jekyll site on the VPS
  • Add the generated files to IPFS

The first workflow was cumbersome because it required me to install all the dependencies (e.g. Git, Jekyll, Ruby) on the VPS. But then I realised that with IPFS I could use a different workflow:

  • Make changes on my machine
  • Build the Jekyll site on my machine
  • Add the generated files to IPFS (from the IPFS node running on my machine)
  • Use the files’ IPFS identifiers (i.e. hashes) to cache them on the VPS. This caching (pinning, in IPFS terms) keeps my site available even when my machine is not online to serve it.

This second workflow removes the complexity from the VPS and takes advantage of the fact that both nodes are equal peers and can exchange files. It also takes advantage of the fact that, for IPFS, what matters is the content you wish to download and not where it is located. I wrote a couple of Rake tasks to automate the process of updating my site on IPFS.
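
As a rough illustration of the second workflow (these are not the Rake tasks themselves), the same steps could be scripted in Python along the following lines. The SSH target `user@my-vps` and the `deploy` and `run` helpers are assumptions made up for the sketch. It also assumes that `jekyll` and `ipfs` are on the PATH, that an IPFS daemon is running on both machines, and that the site builds into Jekyll's default `_site` directory.

```python
import subprocess
from typing import Callable, List

SITE_DIR = "_site"        # Jekyll's default output directory
VPS_HOST = "user@my-vps"  # hypothetical SSH target for the VPS


def run(cmd: List[str]) -> str:
    """Run a command, fail loudly on errors, and return its trimmed stdout."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def deploy(runner: Callable[[List[str]], str] = run) -> str:
    """Build the site, add it to the local node, then pin it on the VPS."""
    runner(["jekyll", "build"])
    # -r adds the directory recursively; -Q prints only the final (root) hash.
    root_hash = runner(["ipfs", "add", "-r", "-Q", SITE_DIR])
    # The VPS node fetches the content from its peers (including this machine)
    # and pins it, so the site stays available when this machine is offline.
    runner(["ssh", VPS_HOST, "ipfs", "pin", "add", root_hash])
    return root_hash


if __name__ == "__main__":
    print(deploy())
```

Passing the command runner in as a parameter is only there so the sequence of commands can be checked without touching a real node. Note that the VPS never needs Git, Ruby or Jekyll: it only needs the root hash.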

One of my favourite aspects of IPFS is its archival capabilities. Because IPFS uses content addressing, we can keep accessing a document even if its original node is no longer storing it, as long as some other node still holds a copy. All we have to do is ensure the document stays in the network – and we can do that through the pinning functionality, which is a native way to back up documents on a node. The foundations to build something like the Internet Archive are part of the IPFS protocol – this is great news if you hate broken links as much as I do. Perhaps IPFS could be instrumental if somebody were to create a digital archive of all the public art ever created, forever.
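
The relationship between pinning and archiving can be sketched with another toy model in Python. Again, this is a simplification under stated assumptions and not the IPFS implementation: `PinningNode` is a hypothetical class, a SHA-256 hex digest stands in for the IPFS hash, and `gc` loosely mimics the idea that a node may discard unpinned content when it collects garbage.

```python
import hashlib
from typing import Dict, Set


def content_id(data: bytes) -> str:
    """Hex SHA-256 digest, used here as a stand-in for an IPFS hash."""
    return hashlib.sha256(data).hexdigest()


class PinningNode:
    """Hypothetical node: stores blocks by hash and keeps only pinned ones on gc."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}
        self.pins: Set[str] = set()

    def store(self, data: bytes) -> str:
        """Cache data without pinning it; returns its content hash."""
        key = content_id(data)
        self.blocks[key] = data
        return key

    def pin(self, key: str) -> None:
        """Mark a stored block as one that gc must keep."""
        if key not in self.blocks:
            raise KeyError(key)
        self.pins.add(key)

    def gc(self) -> int:
        """Drop every unpinned block and return how many were removed."""
        removed = [k for k in self.blocks if k not in self.pins]
        for k in removed:
            del self.blocks[k]
        return len(removed)


if __name__ == "__main__":
    origin, archive = PinningNode(), PinningNode()
    key = origin.store(b"a page worth keeping")
    archive.store(origin.blocks[key])  # the archive fetches a copy
    archive.pin(key)                   # and pins it
    origin.gc()                        # the origin drops its unpinned copy
    print(key in origin.blocks, key in archive.blocks)
```

The link, which is just the hash, keeps working after the origin drops the document, because the address never referred to the origin in the first place.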