Filtering Inappropriate Images from Online Consultations with AI

"Solving for user" is one of our core principles at Halodoc. We provide on-demand consultations between users and doctors, who work tirelessly to serve patients. Online consultations take place over chat, audio, and video calls. In this blog, we concentrate on the chat medium.

Over chat, the patient can share text, images, and documents. This lets our users better explain their medical conditions to the doctors.

It is important to get the experience right not only for our users but also for our doctor partners. A few of the doctors on our platform have had bitter experiences during consultations involving inappropriate content and spam images. To make our doctors' experience better, we needed to ensure that the images doctors receive are safe to view in the context of the consultation. This identification needed to be a binary classification: an image is either not safe for work (NSFW) or safe.

In summary, the product works as follows: once an image is identified as NSFW, it is blocked, and the doctor decides whether it should be revealed based on the consultation. This led us to look for an intelligent system that can identify such images, and in turn to computer vision, the branch of machine learning that deals with the analysis and interpretation of images.

The Discovery

The initial idea was to build an image classification model and embed it on the device. This, however, would require a dataset, and obtaining a task-specific dataset is tough. Nudity/NSFW detection is one such use case where there are practically no useful open datasets available. This made us look for a solution that provides a model already optimised for the task.

On-Device Approach

Since our doctor-facing apps run on Android and iOS, an on-device framework (Firebase ML Kit or Core ML) looked like the way forward. ML Kit focuses on offline, on-device solutions for mobile devices. It uses the TensorFlow Lite model format to reduce the size of the original trained ML model, which makes it an ideal contender for use in mobile apps. But for larger and more complex models, offline or on-device solutions are not the best fit.

Below are a few more reasons that make the on-device approach less appealing:

  1. Lower accuracy.
  2. Increased app size.
  3. Requires a mechanism to roll out model changes on demand.
  4. Demands significant processing power and battery.

With the constraints of on-device image recognition clear, we went on to look for cloud-based solutions.

Cloud Solutions

The primary advantage of hosting the process in the cloud is that it lets us perform image recognition without handling the complexities of training and updating a model or serving predictions from it. The service provider takes care of all that; the client just needs to call an API to get the recognition result.

Some key advantages of performing image recognition on the cloud are:

  1. No need to worry about data to train the model
  2. No need to have deep knowledge of ML model creation & optimisation
  3. The cloud ML provider updates its models from time to time and retrains them on new datasets.

Some of the most popular CloudML service providers are:

The AWS Rekognition service was well suited to our scenario because chat images are already stored in AWS S3. Rekognition can read its input image directly from S3, so we do not have to upload the image a second time.

Detecting Unsafe Content with AWS Rekognition

Amazon Rekognition's Image Moderation APIs determine whether an image or video contains unsafe content, such as explicit adult or violent content. For images, the relevant operation is DetectModerationLabels.

Courtesy: AWS Image Moderation

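To use the Image Moderation API, we pass in the image, either as base64-encoded image bytes or as a reference to an object in S3, along with an optional MinConfidence, and the API returns a DetectModerationLabelsResult. MinConfidence specifies the minimum confidence level a label must have to be included in the result; if it is omitted, labels with a confidence of 50 percent or higher are returned.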
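As a rough illustration of the call described above, the sketch below builds a DetectModerationLabels request that points at an image already stored in S3. It assumes a boto3-style Rekognition client is passed in; the helper names, bucket, and object key are hypothetical and only serve the example.

```python
from typing import Any, Dict, List


def build_moderation_request(bucket: str, key: str,
                             min_confidence: float = 50.0) -> Dict[str, Any]:
    """Build the keyword arguments for DetectModerationLabels.

    The image is referenced by its S3 location, so no bytes are uploaded.
    """
    if not 0.0 <= min_confidence <= 100.0:
        raise ValueError("min_confidence must be between 0 and 100")
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MinConfidence": min_confidence,
    }


def detect_moderation_labels(client: Any, bucket: str, key: str,
                             min_confidence: float = 50.0) -> List[Dict[str, Any]]:
    """Call the API through a boto3-style client and return its label list."""
    request = build_moderation_request(bucket, key, min_confidence)
    response = client.detect_moderation_labels(**request)
    return response.get("ModerationLabels", [])


# With boto3 installed and AWS credentials configured, a real client could be
# created with boto3.client("rekognition") and passed in as `client`.
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without network access or credentials.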

A sample DetectModerationLabelsResult is shown below.

{
    "ModerationLabels": [
        {
            "Confidence": 99.24723052978516,
            "ParentName": "",
            "Name": "Explicit Nudity"
        },
        {
            "Confidence": 99.24723052978516,
            "ParentName": "Explicit Nudity",
            "Name": "Graphic Male Nudity"
        },
        {
            "Confidence": 88.25341796875,
            "ParentName": "Explicit Nudity",
            "Name": "Sexual Activity"
        }
    ]
}
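To show how such a result can be reduced to the binary NSFW decision, here is a minimal sketch that parses the sample above. The threshold of 80 and the set of blocked categories are hypothetical policy values chosen for illustration, and the code assumes the two-level label hierarchy seen in the sample.

```python
import json

# Sample result copied from the text above.
SAMPLE_RESPONSE = json.loads("""
{
  "ModerationLabels": [
    {"Confidence": 99.24723052978516, "ParentName": "", "Name": "Explicit Nudity"},
    {"Confidence": 99.24723052978516, "ParentName": "Explicit Nudity", "Name": "Graphic Male Nudity"},
    {"Confidence": 88.25341796875, "ParentName": "Explicit Nudity", "Name": "Sexual Activity"}
  ]
}
""")

# Hypothetical policy values; the real ones are not stated in this post.
NSFW_THRESHOLD = 80.0
BLOCKED_CATEGORIES = frozenset({"Explicit Nudity"})


def is_nsfw(response, threshold=NSFW_THRESHOLD, blocked=BLOCKED_CATEGORIES):
    """Return True if any label in a blocked category meets the threshold."""
    for label in response.get("ModerationLabels", []):
        # A top-level label has an empty ParentName and is its own category;
        # a child label belongs to the category named by ParentName.
        category = label.get("ParentName") or label["Name"]
        if category in blocked and label["Confidence"] >= threshold:
            return True
    return False


print(is_nsfw(SAMPLE_RESPONSE))           # True
print(is_nsfw({"ModerationLabels": []}))  # False
```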

Execution

Architecture and workflow

During an online consultation, when a patient sends an image, it goes to the chat server, which uploads the image to an S3 bucket on AWS and retrieves the S3 image link. The chat server then sends a message containing the image URL to the Doctor app. On receiving the message, the Doctor app sends the image URL to the Halodoc server to request an inference on the image. The Halodoc server in turn calls AWS Rekognition and gets the confidence score for the image. Based on a confidence threshold defined in the Halodoc server, it decides whether or not the image is NSFW and returns the result to the Doctor app.
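The server-side step of this workflow could look roughly like the sketch below: turn the image URL into an S3 bucket and key, ask Rekognition for moderation labels, and compare the highest confidence against a threshold. It assumes virtual-hosted-style S3 URLs and a boto3-style client passed in by the caller; the function names, the threshold of 80, and the shape of the returned dictionary are all hypothetical.

```python
from typing import Any, Dict, Tuple
from urllib.parse import unquote, urlparse


def parse_s3_url(url: str) -> Tuple[str, str]:
    """Split a virtual-hosted-style S3 URL into (bucket, key)."""
    parsed = urlparse(url)
    # Host looks like "<bucket>.s3.<region>.amazonaws.com".
    bucket, separator, _ = parsed.netloc.partition(".s3.")
    key = unquote(parsed.path.lstrip("/"))
    if not separator or not bucket or not key:
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url}")
    return bucket, key


def infer_image(client: Any, image_url: str,
                threshold: float = 80.0) -> Dict[str, Any]:
    """Return the NSFW verdict that would be sent back to the Doctor app."""
    bucket, key = parse_s3_url(image_url)
    response = client.detect_moderation_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    labels = response.get("ModerationLabels", [])
    # No labels at all means nothing unsafe was detected.
    confidence = max((label["Confidence"] for label in labels), default=0.0)
    return {
        "image_url": image_url,
        "nsfw": confidence >= threshold,
        "confidence": confidence,
    }
```

Keeping the threshold on the server, as the workflow describes, means it can be tuned without shipping a new app release.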


The Doctor app shows a blurred image until it receives the inference result from the Halodoc server. If the image is identified as NSFW, the app keeps it blurred and asks for the doctor’s consent before displaying the original; if the image is safe, it is shown as is. This feedback mechanism lets doctors know that an image is NSFW and prevents them from seeing it directly.
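The display logic can be summarised as a small state function. The sketch below is written in Python for brevity rather than in the app's own Android or iOS code, and it assumes that an image judged safe is shown unblurred; the enum and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class ImageDisplay(Enum):
    BLURRED_PENDING = "blurred, waiting for the inference result"
    BLURRED_ASK_CONSENT = "blurred, asking the doctor for consent"
    VISIBLE = "original image shown"


def display_state(nsfw: Optional[bool],
                  doctor_consented: bool = False) -> ImageDisplay:
    """Decide how the Doctor app should render an incoming image.

    nsfw is None while the inference result has not arrived yet.
    """
    if nsfw is None:
        return ImageDisplay.BLURRED_PENDING
    if nsfw and not doctor_consented:
        return ImageDisplay.BLURRED_ASK_CONSENT
    # Either the image is safe, or the doctor chose to reveal it.
    return ImageDisplay.VISIBLE
```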

The inference result for each image is stored in the Doctor app and also cached on the server side. This avoids running the same image through the inference process again, which saves repeated Rekognition calls and thereby reduces cost.
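A minimal sketch of the server-side caching idea follows, assuming results are keyed by image URL and held in memory. The class name is hypothetical, and a production cache would more likely live in a shared store with an expiry policy.

```python
from typing import Callable, Dict


class InferenceCache:
    """Remember the NSFW verdict per image URL so each image is classified once."""

    def __init__(self, classify: Callable[[str], bool]) -> None:
        # `classify` stands in for the expensive call to the moderation service.
        self._classify = classify
        self._results: Dict[str, bool] = {}

    def is_nsfw(self, image_url: str) -> bool:
        if image_url not in self._results:
            # Cache miss: run the inference once and store the verdict.
            self._results[image_url] = self._classify(image_url)
        return self._results[image_url]
```

For example, wrapping the classifier once and calling `is_nsfw` twice with the same URL triggers only one underlying inference call.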

Image recognition technology can benefit any business, regardless of size, product, and market. As image recognition has endless possibilities, it helps us deliver a better experience to our end users.

We are always looking to hire for all roles in our tech team. If challenging problems that drive big impact enthral you, do reach out to us at careers.india@halodoc.com

About Halodoc

Halodoc is the number 1 all-around healthcare application in Indonesia. Our mission is to simplify and bring quality healthcare across Indonesia, from Sabang to Merauke.
We connect 20,000+ doctors with patients in need through our tele-consultation service. We partner with 1500+ pharmacies in 50 cities to bring medicine to your doorstep. We've also partnered with Indonesia's largest lab provider to provide lab home services, and to top it off, we have recently launched a premium appointment service that partners with 500+ hospitals and allows patients to book a doctor appointment inside our application.
We are extremely fortunate to be trusted by our investors, such as the Bill & Melinda Gates Foundation, Singtel, UOB Ventures, Allianz, Gojek and many more. We recently closed our Series B round and in total have raised USD 100 million for our mission.
Our team works tirelessly to make sure that we create the best healthcare solution personalised for all of our patients' needs, and we are continuously on a path to simplify healthcare for Indonesia.