Making a messaging app using React and XMPP for fun and non-profit - Part 1

Introduction and motivation to build it

This is a multi-part post about how I helped create a messaging app for a non-profit. This is the first part, where I'll cover the idea, explain why we chose our stack and introduce XMPP. In the next posts, I'll go into more detail about how the architecture of the server side and the mobile app actually work.

Mission Graduates (MG) is a non-profit that helps K-12 students in the Mission district get into college and complete their education.

Some time ago, I was approached to help with the development of an app for them. The team consisted of a PM and some UI designers, and they needed someone to help them with the development.

The idea of the app is to let MG improve the communication between their staff and the people participating in the parental involvement program. The MG staff would then be able to contact parents directly to provide information on workshops, activities or services provided by Mission Graduates.

From the app, parents can react to posts sent by MG, ask MG staff questions via a 1-1 chat (parent - MG staff) and RSVP to events. There are multiple channels parents can join: for instance, each school has its own channel, and different programs in MG have different channels (much like a Slack environment has multiple channels).

As of a few weeks ago, the app is available on the App Store and Google Play.

Overall, the entire app took about 8 months from idea to submitting it to the app stores. Considering that this was a side project where we only worked in our spare time, I think that was pretty good. Many thanks to Paz Zuniga, the PM for this project, because without her input and help organizing, this would not have been possible (shameless plug: she's looking for a job!).

The stack

This was not my first messaging app. The first time, I made the usual "I'm an engineer and I know best" mistake: I decided to make my own messaging system, including scheduling, channels, push notifications, etc. In the end it did work, but it was a real PITA.

This time, I decided to opt for a saner approach and use existing, proven and battle-tested pieces of software:

React Native for the mobile app (using Expo + Redux for state management)
React for the Web UI for the admin
TypeScript instead of plain JavaScript
Ruby + Sinatra + Sequel, running on JRuby 9.2.6 (Ruby 2.5.x) for the API
Ruby 2.6 for the publisher (more on this later)
PostgreSQL for the database
Redis for message passing between services (API -> Publisher)
Openfire as the XMPP server
Docker for packing all services
Kubernetes to orchestrate the containers (GKE)
Minikube to run services on my local machine while developing
GitLab to host the git repositories

I decided not to use Helm, mostly because there were only a few YAML files to describe all the services, and once I wrote them, I hardly changed them.

Architecture & Sequence

This is a simple high-level overview of how (most of) the elements described above interact with each other.

Figure 1: High level architecture diagram

To make it a bit clearer, here's a sequence diagram showing the flow after a login/registration: creating a post from the web admin, and then reacting to that message or sending a 1-1 chat from within the app. One thing that might not be obvious is that I cheated a bit: since the API and Openfire share a database, I read the raw database (I had archiving enabled in the Openfire instance) to show the chat messages in the Web UI. This is not ideal, but it helped me avoid duplicating data (another way around this would've been to notify the API from the publisher whenever we got a message).

Figure 2: Sequence diagram for the registration/login flow and then sending a message

Messaging

The resulting app is, at its core, a messaging app: from the admin UI you can create posts, which are broadcast to specific channels (groups). You can also manage groups (create/delete). It's not a typical messaging app, though: users can't talk to each other, and this is by design. It is meant to be a one-way communication channel, but with the option to let users ask MG staff any questions they might have about a specific post. You can think of the chats on a post as comments that only the admin can see.
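The rules above can be written down as a tiny permission check. This is only a sketch of the idea, not the code running in the app: the Role and Target types and the canSend function are names I made up for illustration, and I'm assuming that only staff publish posts to channels.

```typescript
// Hypothetical types: the real app's data model is not shown in this post.
type Role = "staff" | "parent";

type Target =
  | { kind: "channel"; name: string }     // a post broadcast to a group
  | { kind: "direct"; recipient: Role };  // a 1-1 chat

// Returns true when the sender is allowed to send to the target.
function canSend(sender: Role, target: Target): boolean {
  if (target.kind === "channel") {
    // Assumption: only MG staff publish posts to channels.
    return sender === "staff";
  }
  // 1-1 chats always pair a parent with a staff member,
  // so parent-to-parent (and staff-to-staff) chats are rejected.
  return sender !== target.recipient;
}
```

For example, canSend("parent", { kind: "direct", recipient: "parent" }) is false, while the same call with recipient: "staff" is true.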

XMPP

I evaluated different options at the very beginning, but in the end opted to use XMPP, because it's a stable, well-established and really flexible protocol. I spent about a week or so reading the different extensions and working out which subset would work for our use case. After thinking and talking to our PM, I decided that we needed a server that could at least support multi-user chat (XEP-0045), message archive management (XEP-0313) and message delivery receipts (XEP-0184). Having the message list stored and synchronized by the server would also let me forget about keeping the state of the messages consistent if users delete the app or change devices, or even if they use multiple devices with the same account.
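To give a feel for what those three extensions look like on the wire, here is a rough TypeScript sketch that builds one stanza for each. The function names are made up for this example and are not from the app. The namespaces come from the XEPs themselves, but which MAM version your server speaks is an assumption (I use urn:xmpp:mam:2 here).

```typescript
// Escape the five XML special characters so values cannot break the stanza.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// XEP-0045: joining a room is a presence sent to room@service/nickname.
function joinRoom(roomJid: string, nick: string): string {
  return (
    `<presence to="${escapeXml(roomJid + "/" + nick)}">` +
    `<x xmlns="http://jabber.org/protocol/muc"/>` +
    `</presence>`
  );
}

// XEP-0313: ask the server for the archived messages of this account.
function archiveQuery(iqId: string, queryId: string): string {
  return (
    `<iq type="set" id="${escapeXml(iqId)}">` +
    `<query xmlns="urn:xmpp:mam:2" queryid="${escapeXml(queryId)}"/>` +
    `</iq>`
  );
}

// XEP-0184: a message that asks the recipient to confirm delivery.
function messageWithReceiptRequest(to: string, id: string, body: string): string {
  return (
    `<message to="${escapeXml(to)}" id="${escapeXml(id)}" type="chat">` +
    `<body>${escapeXml(body)}</body>` +
    `<request xmlns="urn:xmpp:receipts"/>` +
    `</message>`
  );
}

// XEP-0184: the confirmation sent back, pointing at the original message id.
function deliveryReceipt(to: string, id: string, receivedId: string): string {
  return (
    `<message to="${escapeXml(to)}" id="${escapeXml(id)}">` +
    `<received xmlns="urn:xmpp:receipts" id="${escapeXml(receivedId)}"/>` +
    `</message>`
  );
}
```

In practice you would let an XMPP client library build these for you; the point is only that each extension boils down to a small, namespaced piece of XML.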

Why

I hinted at the why above, but to expand on the argument, let me quickly list the alternatives:

Build our own solution: Did that before, you will never reach parity with the functionality you need in time. Also, it's a maintenance headache.
Twilio: This is a really good alternative, but it is neither free nor open source (and you would be locked into this vendor).
Firebase: We would need to partially build our own solution mixing things provided by Firebase. Less work than building everything, but not trivial.

From that (not comprehensive) list, I had to really think about whether I wanted to try XMPP or go with the easier-to-use Twilio. What tipped the balance for me was that, in order to use Twilio (properly) in an iOS and Android app, I would need to use their SDK, which unfortunately is not part of the Expo bundle. That means I would have had to eject (now it's called the bare workflow), and that was a no-no for me, since I really did not want to deal with the hassle of building the apps on my own (been there, don't want to ever do that again). The free builder service provided by Expo is just extremely convenient.

The X in Extensible

So, XMPP it is. XMPP stands for Extensible Messaging and Presence Protocol. And it really is extensible! The core of the protocol is extremely simple, and everything else is defined on top of that core as extensions. An easy way to describe XMPP to someone who has not heard of it could be something like this:

You open a socket and communicate with the server using a very simple set of XML Stanzas.

If you wanted to, you could write a very simple XMPP client in just a few days.
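As a rough illustration of how little XML is involved, here is a TypeScript sketch that builds the stream header a client sends after opening the socket, plus a plain chat message. The function names are hypothetical and the example skips everything a real client has to do in between (TLS, authentication, resource binding), so take it as a sketch of the wire format only.

```typescript
// Escape the five XML special characters so user text cannot break the stanza.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// The stream header a client sends right after the socket is open (RFC 6120).
// Note that it is an opening tag only: it stays open for the whole session.
function openStream(domain: string): string {
  return (
    `<stream:stream to="${escapeXml(domain)}" version="1.0" ` +
    `xmlns="jabber:client" xmlns:stream="http://etherx.jabber.org/streams">`
  );
}

// A plain chat message stanza with a single <body/>.
function chatMessage(to: string, id: string, body: string): string {
  return (
    `<message to="${escapeXml(to)}" id="${escapeXml(id)}" type="chat">` +
    `<body>${escapeXml(body)}</body>` +
    `</message>`
  );
}

// Ending the session is just the matching closing tag.
const closeStream = "</stream:stream>";
```

Calling chatMessage("staff@example.org", "m1", "Hi & bye") returns <message to="staff@example.org" id="m1" type="chat"><body>Hi &amp;amp; bye</body></message>, which is exactly what would be written to the socket.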

XMPP defines three kinds of stanzas (message, presence and IQ). The two that matter most here are:

IQ: That's the Info/Query stanza, a "request-response" mechanism that is similar in some ways to HTTP (RFC 6120).
message: It's a message: a way to communicate between two or more users/entities.
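The request-response nature of IQ comes from its id attribute: the reply carries the same id as the request, so the client can match them up. Here is a minimal sketch of that bookkeeping in TypeScript. The IqTracker class and IqReply type are invented for this example, the reply is assumed to be already parsed, and I use the XEP-0199 ping payload simply because it is the smallest request I know of.

```typescript
// A parsed IQ reply; parsing the XML itself is out of scope for this sketch.
interface IqReply {
  id: string;
  type: "result" | "error";
}

class IqTracker {
  // Requests waiting for a reply, keyed by the IQ id we generated.
  private pending: Record<string, (reply: IqReply) => void> = Object.create(null);
  private counter = 0;
  private send: (xml: string) => void;

  // `send` writes raw XML to the connection (socket, WebSocket, ...).
  constructor(send: (xml: string) => void) {
    this.send = send;
  }

  // Send a ping and remember who is waiting for the answer.
  ping(onReply: (reply: IqReply) => void): string {
    this.counter += 1;
    const id = "iq-" + this.counter;
    this.pending[id] = onReply;
    this.send(`<iq type="get" id="${id}"><ping xmlns="urn:xmpp:ping"/></iq>`);
    return id;
  }

  // Route an incoming reply to its request; false means nobody asked for it.
  handleReply(reply: IqReply): boolean {
    const callback = this.pending[reply.id];
    if (callback === undefined) {
      return false;
    }
    delete this.pending[reply.id];
    callback(reply);
    return true;
  }
}
```

A message stanza, in contrast, needs none of this: it is fire-and-forget unless you add something like delivery receipts on top.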

For the MG app, I added a few extra elements in the message stanza to support some of our use cases. All these new elements have the namespace urn:mgapp:extensions.

emotion: if a message contains the element emotion then the publisher will assume the message is there only to register a "reaction" to the message (like, love, dislike, etc).
attachments: a message might contain attachments. This is a simple wrapper to a list of files and some metadata (MD5, thumbnail as base64 and the download URL).
workshop: if a message is a workshop, then this adds extra context that's useful to display on the app, such as date and time or location.
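Here is how those three elements could be modeled on the client side. Only the namespace and the element names come from the app; the field names, the MgMessage type and the exact XML layout of the emotion element are guesses made for this sketch.

```typescript
// Namespace used by the custom elements (this one is from the app).
const MG_NS = "urn:mgapp:extensions";

// Hypothetical shapes: the post only says which metadata exists, not how it is laid out.
interface Attachment {
  url: string;             // download URL
  md5: string;             // checksum of the file
  thumbnailBase64: string; // small preview, base64 encoded
}

interface Workshop {
  startsAt: string; // date and time, e.g. an ISO 8601 string
  location: string;
}

interface MgMessage {
  id: string;
  body?: string;
  emotion?: string; // "like", "love", "dislike", ...
  attachments?: Attachment[];
  workshop?: Workshop;
}

type MessageKind = "reaction" | "workshop" | "post";

// A message carrying an emotion is treated as a reaction and nothing else,
// so that check comes first.
function classify(message: MgMessage): MessageKind {
  if (message.emotion !== undefined) return "reaction";
  if (message.workshop !== undefined) return "workshop";
  return "post";
}

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Assumed layout: the emotion name as the text content of the element.
function reactionStanza(to: string, id: string, emotion: string): string {
  return (
    `<message to="${escapeXml(to)}" id="${escapeXml(id)}">` +
    `<emotion xmlns="${MG_NS}">${escapeXml(emotion)}</emotion>` +
    `</message>`
  );
}
```

Because the extra elements live in their own namespace, a client that doesn't know about them can simply ignore them and still show the body.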

To make it a bit clearer, here's a picture of how the message is rendered by the mobile app, and how that's mapped to the actual XML, as received over the XMPP connection.

Figure 3: How a message in the app maps to the XMPP message

Wrapping up

Ok, that was longer than I expected for an intro post, and we still have a lot to cover. In the next posts:

The React (and Redux) models and some considerations that we took
The server side: how the publisher (the process that sends messages) works and how it communicates with the Web UI (admin).