Matrix is an open platform for secure, decentralized, realtime communication. Matthew Hodgson, the Matrix project leader, came to FOSDEM to describe Matrix and report on its progress. Attendees learned that it was within days of having a 1.0 release and found out how it got there. He also shed some light on what happened when the French government reached out to the project to see if Matrix could meet the internal messaging requirements of an entire national government.
From a client’s viewpoint, Matrix is a thin set of HTTP APIs for publish-subscribe (pub/sub) data synchronization; from a server’s viewpoint, it’s a rich set of HTTP APIs for data replication and identity services. On top of these APIs, application servers can provide any service that benefits from running on Matrix. Principally, that has meant interoperable chat, but Hodgson noted that any kind of JSON data could be passed, including voice over IP (VoIP), virtual or augmented reality communications, and IoT messaging. That said, Matrix is independent of the transport used; although current Matrix-hosted services are built around HTTP and JSON, more exotic transports and data formats can be used and, at least in the laboratory, have been.
Because Matrix is inherently decentralized, no single server “owns” the conversations; all traffic is replicated across all of the involved servers. If you are using your server to talk to someone on, say, a gouv.fr server, and your server goes down, then because their server also has the whole conversation history, when your server comes back up, it will resync so that the conversation can continue. This is because the “first-class citizen” in Matrix is not the message, but the conversation history of the room. That history is stored in a big data structure that is replicated across a number of participants; in that respect, said Hodgson, Matrix is more like Git than XMPP, SIP, IRC, or many other traditional communication protocols.
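The Git comparison can be made concrete with a toy model. The sketch below is not Matrix's actual data structure or federation protocol; `ToyServer`, `event_id`, and `resync_from` are hypothetical names invented for illustration. It only shows the idea that each server keeps its own full copy of a room's history as a graph of events that point back at their predecessors, so a server that has been offline can catch up by copying whatever it is missing.

```python
import hashlib
import json


def event_id(content, prev_ids):
    """Derive an ID from an event's content and its parents, Git-style (toy scheme)."""
    payload = json.dumps({"content": content, "prev": sorted(prev_ids)}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


class ToyServer:
    """A hypothetical homeserver holding its own full copy of one room's history."""

    def __init__(self, name):
        self.name = name
        self.events = {}  # event ID -> {"content": ..., "prev": [...]}

    def heads(self):
        """Return the events that no other known event points back to."""
        referenced = {p for ev in self.events.values() for p in ev["prev"]}
        return sorted(set(self.events) - referenced)

    def send(self, content):
        """Append a new event whose parents are the current heads."""
        prev = self.heads()
        eid = event_id(content, prev)
        self.events[eid] = {"content": content, "prev": prev}
        return eid

    def resync_from(self, other):
        """Copy over any events this server is missing; return how many were copied."""
        missing = {eid: ev for eid, ev in other.events.items() if eid not in self.events}
        self.events.update(missing)
        return len(missing)
```

In this model, if one server stops receiving events while the other keeps calling `send()`, a single `resync_from()` call on the returning server brings both copies of the history back into agreement. A real implementation would also have to authenticate events and resolve conflicting branches, which this sketch ignores.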
Matrix is also end-to-end encrypted. Because your data is replicated across all participating servers, said Hodgson, if it’s not end-to-end encrypted, it’s a privacy train wreck. The attack envelope enlarges each time a new participant enters a conversation and gets a big chunk of conversation history synchronized to their server. In particular, admitting a new participant to your conversation should not mean disclosing the conversation history to whoever administers their server; end-to-end encryption removes that possibility.
Matrix was started back in May 2014. By Hodgson’s admission, the first alpha release in September 2014 was put together far too quickly, in a process where “everybody threw lots of Python at the wall, to see what would stick”. In 2015, federation became usable, Postgres was added as an internal database engine alongside SQLite, and IRC bridging was added. Later that year the project released Vector as its flagship client, which meant also releasing the first version of the client-server API.
In 2016, much hard work was done on scaling, the Vector client was rebranded as Riot (which it remains today), and end-to-end encryption was added. This latter feature turned out to be difficult in a decentralized world: if you have a stable decentralized chat room, and someone’s server goes offline for a few hours, during which time a couple more devices are added to the server, then when the server comes back into federation, should those new devices be in the conversation or not? To Hodgson, this is as much a philosophical question as it is a technical one, but it had to be answered before end-to-end encryption could be implemented.
In 2017, the project added a lot of shiny user-interface whizziness: widgets, stickers, and the like, then in 2018 tried to stabilize all this by feature-freezing and pushing hard towards version 1.0. This push has included the necessary step of setting up a foundation to act as a neutral guardian of the standard. That will allow others to build on it knowing that it’s stable and won’t be changed at a moment’s notice to suit Hodgson and the Matrix developers. Creating that stable base to hand off to the foundation meant nailing down all the protocol APIs, which has not been without pain. Some of them, particularly the federation API, have needed significant changes to correct design errors in the original specifications.
Rolling these changes out has been harder than it should have been because the Matrix developers didn’t include protocol versioning in everything from the outset. Hodgson pleaded with the audience, should any of us ever build a protocol, to “make damn sure you can ratchet the version of everything from the outset, otherwise you just paint yourself into a corner. One day you discover a bug in your federation API, and then, before you can fix it, you have to retrofit a whole versioning system”.
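As a rough illustration of what being able to "ratchet the version" might look like, the sketch below tags every message with a version number and lets two endpoints pick the highest version they share. This is a generic pattern under assumed names (`negotiate`, `wrap`, `handle`, and the version numbers are all hypothetical), not the versioning scheme that Matrix adopted.

```python
SUPPORTED_VERSIONS = (1, 2)  # hypothetical versions this endpoint can speak


def negotiate(ours, theirs):
    """Pick the highest protocol version that both sides support."""
    common = set(ours) & set(theirs)
    if not common:
        raise ValueError("no protocol version in common")
    return max(common)


def wrap(version, body):
    """Tag every outgoing message with the version it was written for."""
    return {"version": version, "body": body}


def handle(message, handlers):
    """Dispatch on the declared version; reject anything unknown."""
    version = message.get("version")
    if version not in handlers:
        raise ValueError(f"unsupported protocol version: {version!r}")
    return handlers[version](message["body"])
```

With the version carried in every message from day one, a fix to the wire format becomes a new entry in `handlers` rather than a retrofit of the whole protocol.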
The move to 1.0 has also meant a complete rethink of certificate management. Back in 2014 the Matrix project decided to abjure traditional certificate authorities; instead, self-signed certificates would be used. To democratize decisions about who to trust, notary servers would be implemented to build a consensus about the trustability of a given TLS certificate. It was, in Hodgson’s words, a disaster. While the developers were trying to fix it, Let’s Encrypt came along, which made the process of getting properly-signed certificates trivial, so the self-signing experiment has been abandoned. The 0.99 version of the home server code is ACME-capable and can get a certificate directly from Let’s Encrypt; the 1.0 version removes support for self-signed certificates.
At 2am on the day of Hodgson’s talk, February 2, the server-server API was released at version 0.1. All five core APIs are now stable, he said, which is a necessary precondition for a 1.0 release. Two of them (the client-server and identity APIs) still need final tweaks, but apparently these are expected to have landed by the time this article is published, signaling Matrix’s official exit from beta.
Matrix has already been pretty successful. Hodgson presented two graphs (active users and publicly visible servers), both with pleasing-looking exponential curves on them. The matrix.org server has about half of the seven million globally visible accounts, so it was interesting that Hodgson announced the long-term intention to turn that server off. Apparently, once it becomes common for other organizations to offer a federated Matrix server, there will be no need for a “default” server for new users to land on. It’s clear that Hodgson really does regard Matrix as a fully federated system, rather than something centralized, or centralized-plus-hangers-on.
The French connection
In early 2018, DINSIC, which Hodgson described as the “French Ministry of Digital”, reached out to one of the developers working on the Riot client for Android to ask if they might get a copy for their own purposes. On investigation, it turned out that what the ministry wanted was end-to-end encrypted, decentralized communication across the entire French government, running on systems that France could provision and control. The agency thought that Matrix might be an excellent platform for providing this.
One might, said Hodgson, question why a government wanted a decentralized system. It turns out that governments, at least in France, only look centralized; in real life they’re made up of ministries, departments, sub-departments, schools, hospitals, universities, and so forth. Each of these organizations will have its own operational requirements, security model and policies, and so on; a federated solution allows server operation to be decentralized to an extent that makes the most sense for server operation and for the user community. In France, the user community turns out to be about 5.5 million users — that is, about 9% of the population of the country. In addition, although this was to be a standalone deployment, DINSIC wanted the ability to federate publicly, to be able to connect to other governments, suppliers, contractors, etc., using their shiny new system but without all those external people needing accounts on it.
Because of the federated nature of what was being sought, end-to-end encryption was a requirement, which Matrix was in a position to provide. But DINSIC also wanted enterprise-grade anti-virus (AV) support, so that the ability to share documents, images, and other data through the system didn’t present an exciting new infection vector. As Hodgson pointed out, AV is pretty much entirely incompatible with end-to-end encryption: if you’ve built a system that prevents anyone except sender and receiver from knowing what’s in a file, how’s a third party going to intercept it en route to scan it for known-harmful content?
In the end, this required adding the ability for files to be exfiltrated from Matrix to an arbitrary external scanning service. This service is provided with the URL for a piece of encrypted content, plus the encryption keys for that content (not for the message in which the content appears, or for the room in which the message appears); those keys are themselves encrypted specifically for the content-scanning service. The scanning service then retrieves the content, decrypts and scans it, and proxies the result back into Matrix. Having acquired this ability, Matrix scans content both on upload and download, for extra security; the code to do all this will be making its way back into mainstream Matrix in the near future.
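The flow described above can be sketched as a toy model. Everything here is an assumption made for illustration: the function names, the in-memory `MEDIA_STORE`, and especially `toy_cipher`, which is an insecure stand-in for real cryptography (a real deployment would presumably wrap the file key with the scanner's public key). The point is only that the scanner receives the key for one piece of content, and nothing else.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream. Illustration only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # Applying the same operation twice with the same key restores the input.
    return bytes(a ^ b for a, b in zip(data, stream))


MEDIA_STORE = {}  # hypothetical media repository: URL -> ciphertext
SCANNER_KEY = secrets.token_bytes(32)  # stands in for the scanner's own key


def upload(url, plaintext):
    """Encrypt a file under a fresh per-file key and store only the ciphertext."""
    file_key = secrets.token_bytes(32)
    MEDIA_STORE[url] = toy_cipher(file_key, plaintext)
    return file_key


def scan_request(url, file_key):
    """Build a request carrying the URL and the file key, wrapped for the scanner."""
    return {"url": url, "wrapped_key": toy_cipher(SCANNER_KEY, file_key)}


def scanner_handle(request, signatures):
    """Unwrap the file key, fetch and decrypt the content, and report a verdict."""
    file_key = toy_cipher(SCANNER_KEY, request["wrapped_key"])
    content = toy_cipher(file_key, MEDIA_STORE[request["url"]])
    clean = not any(sig in content for sig in signatures)
    return {"url": request["url"], "clean": clean}
```

Because the request carries only a per-file key, the scanner in this model can read that one file but learns nothing that would let it decrypt the surrounding message or the rest of the room.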
Having committed to using Matrix, DINSIC started work in May 2018 on Tchap, its fork of the Riot client. The current state of the work can be found on GitHub; user trials started in June. The French National Cybersecurity Agency has audited the system, as has an external body. As of January 2019, it’s being rolled out across all French ministries, which involves a great deal of Ansible code. Hodgson gave a quick demo of Tchap, noting that at the moment he feels that Tchap is probably more usable than the mainstream Riot client, not least because DINSIC has had a professional user-experience (UX) agency working hard on it.
Clearly, there is some interest in the project. Some of this will be curiosity about the French connection, but much will be because Matrix is a fully-working professional system that can be used productively right now. If you’re in one of those organizations that uses Slack for nearly everything, there is no good reason not to start looking at a migration to Matrix, because if the migration is successful, you can bring your messaging data back in-house. Free software is about the user having control, and Matrix honors that promise.
For anyone who’d like to see the whole talk, the video is available online.
Source of this article: https://lwn.net/