LiveConnect - Leveraging MQTT as a scalable, low-latency persistent connection transport mechanism for mobile devices

mqtt Nov 06, 2020

What is LiveConnect?

LiveConnect is a home-grown, low-latency persistent connection transport platform built over the MQTT protocol. We wanted to establish a persistent connection with all of our clients (users' devices), and the only way to do that was to move to a two-way communication protocol. LiveConnect is therefore a platform that lets the backend communicate with the end client, and vice versa, using a pub-sub model.

What problem were we trying to solve?

The Halodoc app is expected to stay operational across a variety of challenging scenarios, some of which are outlined below:

  1. Low network availability
  2. Lower bandwidth
  3. Network connections are jittery

With the above-mentioned constraints, it was extremely challenging to guarantee a reliable and persistent connection with the client for seamless exchange of messages between the client and the server. This required us to build a brand-new, low-latency persistent connection transport mechanism, LiveConnect, that is resilient to the adverse conditions outlined above. In addition, it is imperative to ensure optimal battery consumption for our customers who use low-end or older devices.

What is MQTT and why did we choose it?

MQTT caters to all the requirements mentioned above out of the box. So, what really is MQTT?

MQTT is an OASIS standard messaging protocol for the Internet of Things (IoT). It is designed as an extremely lightweight publish/subscribe messaging transport that is ideal for connecting remote devices with a small code footprint and minimal network bandwidth. MQTT today is used in a wide variety of industries, such as automotive, manufacturing, telecommunications, oil and gas, etc. Compared with other protocols like WebSocket, it is a very lightweight, binary protocol whose fixed header can be as small as 2 bytes (a very small footprint compared to others).
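To make the 2-byte figure concrete, here is a minimal sketch of how an MQTT 3.1.1 fixed header is laid out: one byte for the packet type and flags, followed by a variable-length "remaining length" field of 1 to 4 bytes. The function names are ours, chosen for illustration. Note that a PUBLISH packet also carries its topic name, so the 2 bytes describe the fixed header only, not the whole per-message cost.

```python
def encode_remaining_length(length: int) -> bytes:
    """Encode the MQTT 'remaining length' field (variable-length, 1 to 4 bytes)."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range for MQTT")
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if length == 0:
            return bytes(out)


def publish_packet_qos0(topic: str, payload: bytes) -> bytes:
    """Build a QoS 0 PUBLISH packet (MQTT 3.1.1 layout, no packet identifier)."""
    topic_bytes = topic.encode("utf-8")
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    body = variable_header + payload
    fixed_header = bytes([0x30]) + encode_remaining_length(len(body))
    return fixed_header + body


# A keep-alive ping is nothing but a fixed header: the whole packet is 2 bytes.
PINGREQ = bytes([0xC0, 0x00])

packet = publish_packet_qos0("a/b", b"hi")
print(len(PINGREQ), len(packet), packet.hex())
# prints: 2 9 30070003612f626869
```

Of the 9 bytes in the sample PUBLISH, 2 are the fixed header, 5 are the length-prefixed topic name and 2 are the payload.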

Comparison of MQTT vs. WebSocket: Our Analysis

Here's our comparative analysis of MQTT as a messaging protocol vs. WebSocket:

Evaluation Parameter | MQTT | WebSocket
--- | --- | ---
Connection/Reconnection Overhead | Handshake over TCP | Handshake over HTTP, and thus extra round trips
Bandwidth and Transport Latency | Overhead < 3 bytes | Handles data transfers as frames; each frame has an additional 4-12 bytes of data about the payload, and there can be multiple frames per message
Reliable and Guaranteed Delivery | Ensures this with multiple QoS levels | NA
Power Consumption | Low | Low - Medium
Scalability | Not horizontally scalable (needs a clustered approach to scale) | Horizontally scalable
Security | TLS | TLS

Apart from these, MQTT also supports QOS (Quality of Service) levels, which determine how strongly the delivery of a message to the end client is guaranteed. MQTT supports 3 QOS levels, QOS0, QOS1 and QOS2, described below:

QOS | Stands For | Description
--- | --- | ---
QOS0 | At most once | The recipient does not acknowledge receipt of the message, and the message is not stored and re-transmitted by the sender
QOS1 | At least once | The sender stores the message until the receiver acknowledges receipt of it
QOS2 | Exactly once | A four-part handshake between the sender and the receiver, which guarantees that the message is delivered exactly once

The choice of QOS level depends entirely on the use case.
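The difference between the three levels is easiest to see on a lossy link. The sketch below is a toy model, not the real protocol: it collapses the QOS2 four-part handshake into a simple "receiver remembers the packet and drops duplicates" rule, and the function name and parameters are hypothetical. It counts how many times the receiving application sees one message when specific transmissions or acknowledgements are lost.

```python
from typing import Set


def times_delivered(qos: int, lost_sends: Set[int], lost_acks: Set[int]) -> int:
    """Count how often the receiving application sees a single message.

    lost_sends / lost_acks hold the attempt numbers (0-based) whose message
    or acknowledgement is lost in transit. Both sets must be finite.
    """
    delivered = 0
    seen_before = False
    attempt = 0
    while True:
        message_lost = attempt in lost_sends
        ack_lost = attempt in lost_acks
        attempt += 1
        if not message_lost:
            if not (qos == 2 and seen_before):
                delivered += 1  # QOS2 suppresses the duplicate instead
            seen_before = True
        if qos == 0:
            return delivered  # fire and forget: no acknowledgement, no retry
        if not message_lost and not ack_lost:
            return delivered  # sender received the ack and stops retrying


# First acknowledgement is lost, so the sender retries once.
for qos in (0, 1, 2):
    print(qos, times_delivered(qos, lost_sends=set(), lost_acks={0}))
# prints:
# 0 1
# 1 2
# 2 1
```

In this model QOS1 delivers a duplicate when an acknowledgement is lost, which is why consumers of QOS1 topics are usually written to be idempotent, while QOS0 simply loses the message if the first transmission is dropped.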

Architecture

The architecture of LiveConnect and the components involved are shown below:

Live-Connect Architecture

This architecture consists of several components, as shown above. Let's see how they interact with each other and how data flows through the system; later we will explain each of the components.

Let's see how a connection is established:

  1. The MQTT client makes a CONNECT request to the broker.
  2. The broker authenticates the client with an HTTP call to the authentication server, passing the required user-specific access token.
  3. Once authentication succeeds, the broker sends a connect acknowledgement (CONNACK) to the client, and the connection between client and broker is established.
  4. We also have a load balancer in front of the broker nodes, which balances the load of requests made by end clients.
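The connection steps above can be sketched as follows. This is a simplified stand-in under stated assumptions: the in-memory `TOKENS` table replaces the real HTTP call to the authentication server, and `auth_server_check` and `handle_connect` are hypothetical names. The two return codes are the standard MQTT 3.1.1 CONNACK values for "accepted" and "not authorized".

```python
import time
from typing import Dict, Optional

CONNACK_ACCEPTED = 0x00        # MQTT 3.1.1 return code: connection accepted
CONNACK_NOT_AUTHORIZED = 0x05  # MQTT 3.1.1 return code: not authorized

# Hypothetical token table standing in for the real authentication server.
TOKENS: Dict[str, float] = {"token-abc": time.time() + 3600}  # token -> expiry


def auth_server_check(access_token: Optional[str]) -> bool:
    """Stand-in for the HTTP call the broker makes to the authentication server."""
    if not access_token:
        return False  # anonymous connect attempts are rejected
    expiry = TOKENS.get(access_token)
    return expiry is not None and expiry > time.time()


def handle_connect(client_id: str, access_token: Optional[str]) -> int:
    """Return the CONNACK return code the broker would send for this CONNECT."""
    if auth_server_check(access_token):
        return CONNACK_ACCEPTED
    return CONNACK_NOT_AUTHORIZED


print(handle_connect("client_1", "token-abc"))  # prints: 0
print(handle_connect("client_2", None))         # prints: 5
```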

We have seen the connection establishment flow; now let's see how a message is delivered to the subscriber.

  1. Once the connection is established, the MQTT client sends a subscription request to the broker for the predefined topics that an end client needs to subscribe to.
  2. The Publisher MQTT client is integrated with internal micro services, which send requests to the Publisher MQTT client over HTTP.
  3. Since the Publisher MQTT client is also connected to the broker, it immediately publishes the data along with the topic and QOS information.
  4. The broker takes care of the rest, delivering the message to the subscriber client. The subscriber client then acts on the message.

Now that we have seen the data and connection flows, let's dig deeper into each component and understand its responsibilities:
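The message flow above can be sketched with a toy in-memory broker. This is only an illustration: `ToyBroker` and `handle_publish_request` are hypothetical names, topics are matched exactly, QOS handling is omitted, and the `SHIPPED` payload is made up.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[str, bytes], None]


class ToyBroker:
    """In-memory stand-in for the MQTT broker (exact-match topics only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to every subscriber of topic; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


def handle_publish_request(broker: ToyBroker, request: Dict[str, str]) -> int:
    """What the Publisher MQTT client might do with a micro service's HTTP request."""
    return broker.publish(request["topic"], request["payload"].encode("utf-8"))


broker = ToyBroker()
received: List[bytes] = []

# Step 1: the app subscribes to its predefined topic.
broker.subscribe("halodoc/pharmacy-delivery/orders/ABC-123",
                 lambda topic, payload: received.append(payload))

# Steps 2-4: a micro service asks the publisher client to publish a status change.
delivered = handle_publish_request(broker, {
    "topic": "halodoc/pharmacy-delivery/orders/ABC-123",
    "payload": "SHIPPED",
})
print(delivered, received)
# prints: 1 [b'SHIPPED']
```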

Node Architecture:

We have separated the publisher and subscriber nodes for the following reasons:

  1. High availability of publisher nodes (as they are hidden inside the cluster): only backend MQTT clients are allowed to connect to them.
  2. Single responsibility: listen for messages and forward them to subscriber nodes.
  3. Processing of messages at very high speed.

MQTT Client

The Halodoc app serves as an MQTT client, and each backend micro service holds an instance of an MQTT client as well. The backend MQTT client is integrated with the other micro services, so any of them can ask it, via a normal HTTP call, to publish information to a target client or group of clients.

The following MQTT client libraries are used:

Client | Library
--- | ---
Android | Paho
iOS | CocoaMQTT
Backend | Paho

MQTT Broker

All communication is routed via the MQTT broker; clients never talk to each other directly. The broker is built on the pub-sub model: any client can publish a message to any topic, and interested clients can subscribe to those topics and receive the message according to the QOS and their own connectivity.

We evaluated multiple popular brokers with our use cases, scalability, reliability and support in mind, and settled on VerneMQ as our MQTT broker. Several of our basic requirements are supported out of the box by VerneMQ, viz:

  1. Cluster support
  2. Session sharing across nodes
  3. Offline queues
  4. Web-hook support

We can easily monitor the health and status of each running node at any point in time using VerneMQ's built-in metrics dashboard. A sample is shown below for reference:

VerneMQ Dashboard

Authentication

This is one of the most important parts: only authorized clients may connect to the broker, and any anonymous connect request must be rejected. Since this is OAuth-based authentication, every client holds an access token with a limited validity and uses it to connect to the broker. Authentication is handled via the MQTT CONNECT request, which carries the access token as the user name. Here is a sample CONNECT request:

CONNECT : { "client_id": "client_1", "user_name": "access_token" }
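On the wire, the CONNECT request is a binary packet rather than JSON. The sketch below shows roughly how the sample above would be encoded under MQTT 3.1.1, with the access token in the user name field. It is an illustration only: the helper names are ours, the clean-session flag and the 60-second keep-alive are assumptions, and in practice a client library such as Paho builds this packet for you.

```python
def _length_prefixed(text: str) -> bytes:
    """MQTT strings are UTF-8 with a 2-byte big-endian length prefix."""
    data = text.encode("utf-8")
    return len(data).to_bytes(2, "big") + data


def _remaining_length(length: int) -> bytes:
    """Variable-length encoding used by the MQTT fixed header."""
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        out.append(digit | 0x80 if length > 0 else digit)
        if length == 0:
            return bytes(out)


def connect_packet(client_id: str, user_name: str, keep_alive: int = 60) -> bytes:
    """Build an MQTT 3.1.1 CONNECT packet with a user name and no password."""
    connect_flags = 0x80 | 0x02  # user name present + clean session (assumed)
    variable_header = (
        _length_prefixed("MQTT")          # protocol name
        + bytes([0x04, connect_flags])    # protocol level 4 = MQTT 3.1.1
        + keep_alive.to_bytes(2, "big")
    )
    payload = _length_prefixed(client_id) + _length_prefixed(user_name)
    body = variable_header + payload
    return bytes([0x10]) + _remaining_length(len(body)) + body


packet = connect_packet("client_1", "access_token")
print(len(packet), packet[:10].hex())
# prints: 36 102200044d5154540482
```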

Client Status Management

We also maintain each client's status: whether it is currently connected to the broker, or whether the connection was lost or closed. When an authentication call reaches the auth server, we authenticate the client based on the access token it provided and, if the check passes, immediately mark the client as online. We also configure a web-hook that fires whenever a client goes offline (including on connection loss): the broker calls the configured endpoint, and once we receive this API call we mark the client as offline.
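A minimal sketch of this bookkeeping is shown below. The class and method names are hypothetical, the store is an in-memory dictionary rather than a real database, and the web-hook payload is assumed to carry a `client_id` field.

```python
from typing import Dict


class ClientStatusStore:
    """Tracks whether each client is currently connected to the broker."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}

    def on_auth_success(self, client_id: str) -> None:
        # Called by the auth server right after the access token is validated.
        self._status[client_id] = "online"

    def on_offline_webhook(self, payload: Dict[str, str]) -> None:
        # Called when the broker hits the configured web-hook endpoint.
        self._status[payload["client_id"]] = "offline"

    def status(self, client_id: str) -> str:
        return self._status.get(client_id, "unknown")


store = ClientStatusStore()
store.on_auth_success("client_1")
print(store.status("client_1"))   # prints: online
store.on_offline_webhook({"client_id": "client_1"})
print(store.status("client_1"), store.status("client_2"))
# prints: offline unknown
```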

Scale

We have observed an average load of more than 2k connection requests per minute (RPM), which works out to a couple of million connections a day (2,000 x 60 x 24 is roughly 2.9 million), with repeat clients reconnecting at different intervals during the day. With this new mechanism in place, our existing infra is able to scale seamlessly to serve a significant volume of traffic without having to spawn extra nodes.

Early use cases

MQTT supports many use cases; some of the early ones we have been able to leverage so far are outlined below.

Circumventing Polling

When a customer ordered medicine from the Pharmacy Delivery platform and checked the status on the order details page, the client used to poll the order status API every x seconds. This was sub-optimal, as it increased the load on the server even when there were no updates to the order status. We eliminated this polling approach using LiveConnect.
As called out earlier, the backend MQTT client is integrated with all the individual micro services that handle an order from placement to delivery. Since both the backend MQTT client and the Halodoc app MQTT client hold persistent connections to the broker, the backend client can reach the end user who placed the order without any polling. The app subscribes to an order status topic that is specific to that particular order, and the backend client publishes to that topic whenever the status changes. The sample topic structure looks like:

halodoc/pharmacy-delivery/orders/ABC-123
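As a small illustration of working with such topics, the sketch below builds the per-order topic and checks it against MQTT topic filters, where `+` matches exactly one level and `#` matches all remaining levels. The helper names are hypothetical, and whether LiveConnect uses wildcard subscriptions at all is not stated here; the matcher simply follows the standard MQTT rules (ignoring the special handling of topics that start with `$`).

```python
def order_status_topic(order_id: str) -> str:
    """Build the per-order status topic, rejecting characters reserved by MQTT."""
    if not order_id or any(ch in order_id for ch in "/+#"):
        raise ValueError("order id must be non-empty and free of '/', '+', '#'")
    return f"halodoc/pharmacy-delivery/orders/{order_id}"


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if topic is covered by topic_filter ('+' and '#' wildcards)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of a filter.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


topic = order_status_topic("ABC-123")
print(topic)
# prints: halodoc/pharmacy-delivery/orders/ABC-123
print(topic_matches("halodoc/pharmacy-delivery/orders/+", topic))  # prints: True
print(topic_matches("halodoc/+", topic))                           # prints: False
```

Scoping the topic to a single order keeps each app instance subscribed only to the updates it cares about.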

This solution has reduced the total number of API calls made for order status.
Similar polling was also present in Tele-Consultation, running from the moment a patient requested a consultation with a doctor until the consultation started. This too was eliminated using LiveConnect.

Remote Diagnostics

Remote debugging is extremely challenging for mobile apps, and we had to rely on the customer's input (config values, or app logs manually uploaded to us) to debug an issue further. This is an operationally intensive exercise, and it significantly increases our turnaround time for debugging issues in the field. As a solution, we started logging every API call and the response code returned by the API to a local text file. Using LiveConnect, we then target a particular device, via a topic structure agreed between backend and client, with a request to send the log. Once the client receives the log request, it immediately responds with the existing log file (internally, it uploads the log file to an S3 bucket). With this mechanism, we no longer need any manual intervention by the customer: the app itself shares the insights required for debugging directly with us.
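A rough sketch of the client side of this flow is shown below. Everything here is an assumption made for illustration: the JSON message shape with an `action` field, the log line format, and the `fake_upload` function standing in for the real S3 upload. The actual apps are Android and iOS clients, so this Python only mirrors the logic.

```python
import json
import os
import tempfile
from typing import Callable


class ApiCallLog:
    """Appends one line per API call to a local text file."""

    def __init__(self, path: str) -> None:
        self.path = path

    def record(self, endpoint: str, response_code: int) -> None:
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(f"{endpoint} {response_code}\n")


def handle_log_request(message: bytes, log: ApiCallLog,
                       upload: Callable[[str], str]) -> bytes:
    """React to a message received on the device-specific diagnostics topic."""
    request = json.loads(message.decode("utf-8"))
    if request.get("action") != "send_logs":
        return json.dumps({"status": "ignored"}).encode("utf-8")
    location = upload(log.path)  # the real client uploads the file to S3 here
    return json.dumps({"status": "uploaded", "location": location}).encode("utf-8")


def fake_upload(path: str) -> str:
    """Placeholder for the real upload; returns a made-up object key."""
    return "logs/" + os.path.basename(path)


log_path = os.path.join(tempfile.mkdtemp(), "api_calls.log")
api_log = ApiCallLog(log_path)
api_log.record("/v1/orders/ABC-123", 200)

reply = handle_log_request(b'{"action": "send_logs"}', api_log, fake_upload)
print(reply.decode("utf-8"))
# prints: {"status": "uploaded", "location": "logs/api_calls.log"}
```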

Summary

In this post, we discussed how we leveraged the capabilities of MQTT (via LiveConnect) to enable reliable bi-directional communication between the Halodoc app and our backend infrastructure. While we continue to explore the other capabilities of the MQTT protocol and to enhance LiveConnect, we have already reaped a number of benefits from our adoption so far, and we will continue to share the insights we gain along the way.

Join Us

We are always looking out for top engineering talent across all roles for our tech team. If challenging problems that drive big impact enthral you, do reach out to us at careers.india@halodoc.com

About Halodoc

Halodoc is the number 1 all-round healthcare application in Indonesia. Our mission is to simplify and bring quality healthcare across Indonesia, from Sabang to Merauke. We connect 20,000+ doctors with patients in need through our Tele-consultation service. We partner with 3500+ pharmacies in 100+ cities to bring medicine to your doorstep. We've also partnered with Indonesia's largest lab provider to provide lab home services, and to top it off we have recently launched a premium appointment service that partners with 500+ hospitals and allows patients to book a doctor appointment inside our application. We are extremely fortunate to be trusted by our investors, such as the Bill & Melinda Gates Foundation, Singtel, UOB Ventures, Allianz, GoJek and many more. We recently closed our Series B round and in total have raised USD 100 million for our mission. Our team works tirelessly to make sure that we create the best healthcare solution personalised for all of our patients' needs, and we are continuously on a path to simplify healthcare for Indonesia.

Sunil Chaurasia

Backend Engineering