Halodoc is a health-tech platform aiming to simplify access to healthcare for millions of people around Indonesia. Over the years, the Data Science team at Halodoc has played an important role in its growth and development, working on various projects with a mission to improve business using analytics, data science, machine learning, and statistics.
What is Demand Prediction?
Demand prediction, or demand forecasting, is the practice of estimating the quantity of goods required to meet customer demand over a given period. With a demand prediction system in place, a business can anticipate changes in market demand and take the necessary actions in advance.
Before a pharmacy can sell pharmaceutical products to customers, it has to go through a long preparatory process that includes planning, obtaining drug licenses, vendor procurement, transport, and more. When an item runs out of stock in a pharmacy, the team has to repeat the whole preparatory process, which can take up to 2 weeks. A good stock prediction at the start of each month is therefore valuable: it reduces the chance of a desired item being out of stock and, with it, the lost revenue opportunity. Demand prediction thus helps Halodoc achieve its business goals efficiently.
Monthly vs Weekly Predictions
Demand can be predicted over various time horizons. Monthly and weekly predictions are being tested and compared, and each has its own advantages and disadvantages.
For now, the monthly method is being implemented for demand forecasting. However, the team is considering a hybrid approach which combines both methods.
The dataset for demand prediction uses all transactional data within a 6 kilometer radius around each pharmacy.
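As a rough illustration of the radius filter described above, the sketch below keeps only the transactions that fall within 6 km of a pharmacy. It assumes that each transaction carries a latitude and longitude and that distance is measured as great-circle (haversine) distance; the article does not say how the radius is actually computed, and the field names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def transactions_near(pharmacy, transactions, radius_km=6.0):
    """Keep only the transactions located within radius_km of the pharmacy."""
    return [
        t for t in transactions
        if haversine_km(pharmacy["lat"], pharmacy["lon"], t["lat"], t["lon"]) <= radius_km
    ]


# Hypothetical example data (coordinates invented for illustration)
pharmacy = {"id": "P1", "lat": -6.2000, "lon": 106.8167}
transactions = [
    {"item": "A", "qty": 2, "lat": -6.2100, "lon": 106.8200},  # roughly 1 km away
    {"item": "B", "qty": 1, "lat": -6.3000, "lon": 106.9500},  # well over 6 km away
]
nearby = transactions_near(pharmacy, transactions)
```

Here only the first transaction ends up in `nearby`. With many pharmacies and transactions, a spatial index or a database-side geo query would likely be preferable to this pairwise loop.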
Models for Demand Prediction
Initially, we predicted demand with univariate time series models from libraries such as Prophet, Darts, and sktime, as well as a univariate deep learning approach (LSTM). However, the results were not good enough because there were too few data points (only 3-4 years of data). This approach also required more model management, since each item in each pharmacy is represented by its own time series model (Halodoc has over 10,000 items and multiple pharmacies).
Tree-based ensemble models, such as Random Forest, XGBoost, and LightGBM, were then tested, with one model used for all pharmacies. However, the results were still unsatisfactory. The same models were explored again, this time with one model per pharmacy, and the results improved compared to the single shared model. Nevertheless, managing a separate model for each pharmacy still proved quite challenging. To overcome these problems, we adopted a cluster-based approach (one model for 3-8 pharmacies). The clusters were chosen by analyzing the sales gross merchandise value (GMV) and the items sold in each pharmacy. At the end of our trial, we found that Random Forest with multiple clusters was the configuration with the best results.
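The one-model-per-cluster setup can be sketched as follows. This is only an illustration of the routing idea: the pharmacy IDs, the cluster assignment, and the data are hypothetical, and `MeanDemandModel` is a deliberately trivial placeholder standing in for the Random Forest so that the example runs with the Python standard library alone.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pharmacy-to-cluster assignment. In the article, clusters of
# 3-8 pharmacies were chosen from GMV and item mix; these IDs are invented.
PHARMACY_CLUSTER = {
    "P1": "c1", "P2": "c1", "P3": "c1",
    "P4": "c2", "P5": "c2", "P6": "c2",
}


class MeanDemandModel:
    """Placeholder model: predicts the mean historical quantity per item.

    It stands in for the Random Forest mentioned in the article and is
    NOT the model that was actually used.
    """

    def fit(self, rows):
        by_item = defaultdict(list)
        for row in rows:
            by_item[row["item"]].append(row["qty"])
        self.item_mean = {item: mean(qtys) for item, qtys in by_item.items()}
        return self

    def predict(self, item):
        # Items never seen in this cluster fall back to zero demand.
        return self.item_mean.get(item, 0.0)


def train_cluster_models(rows):
    """Train one model per cluster on the pooled rows of its pharmacies."""
    rows_by_cluster = defaultdict(list)
    for row in rows:
        rows_by_cluster[PHARMACY_CLUSTER[row["pharmacy"]]].append(row)
    return {c: MeanDemandModel().fit(r) for c, r in rows_by_cluster.items()}


def predict_demand(models, pharmacy, item):
    """Route a prediction request to the model of the pharmacy's cluster."""
    return models[PHARMACY_CLUSTER[pharmacy]].predict(item)


# Hypothetical monthly sales history
history = [
    {"pharmacy": "P1", "item": "paracetamol", "qty": 120},
    {"pharmacy": "P2", "item": "paracetamol", "qty": 100},
    {"pharmacy": "P4", "item": "paracetamol", "qty": 30},
    {"pharmacy": "P5", "item": "paracetamol", "qty": 50},
]
models = train_cluster_models(history)
```

With this layout, the number of models to manage equals the number of clusters rather than the number of pharmacies or pharmacy-item pairs. A real model would also take pharmacy-level and time-based features as inputs, so predictions could differ between pharmacies in the same cluster; the placeholder above cannot do that.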
Mean absolute percentage error (MAPE) is the metric used to assess the accuracy of the demand prediction model. It expresses the average absolute difference between predicted and actual demand as a percentage of actual demand, so lower values are better. A larger MAPE value indicates a larger loss in sales opportunity.
Before the machine learning model was introduced, predictions were made manually, with a MAPE of around 30-35%. The demand prediction model built during this project reduced the MAPE to 20-25% (a relative decrease of ~37%), which suggests a lower loss in sales opportunity and hence higher business revenue. We continue to optimise the model to bring the MAPE down further and improve the accuracy of our demand prediction.
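For reference, MAPE over n observations is commonly computed as (100 / n) × Σ |actual − predicted| / |actual|. The sketch below implements that standard definition; the sample numbers are hypothetical, and skipping periods with zero actual demand is an assumption made here (the percentage error is undefined in that case), not something stated in the article.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Periods with zero actual demand are skipped because the percentage
    error is undefined there (an assumption; other conventions exist).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not errors:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(errors) / len(errors)


# Hypothetical monthly unit sales for three items and their forecasts
actual = [100, 50, 200]
predicted = [80, 60, 210]
score = mape(actual, predicted)  # per-item errors of 20%, 20% and 5%
```

In this example the per-item errors average to about 15%. Because each error is divided by actual demand, slow-moving items with small sales volumes can dominate the score, which is worth keeping in mind when comparing MAPE across item groups.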
However, there are still several limitations to this method that need to be addressed.
- The data used for this project is only aggregate transactional data, which does not yet include personalization, demographics, or other factors that might affect demand.
- The data also covers only 3-4 years’ worth of data points.
Further improvements can still be made to build on the results from this project.
- Combining weekly and monthly predictions could further improve the results, as demand can be monitored more frequently and trends picked up quickly, while keeping the workload for the operations team manageable.
- Regional and demographic parameters should also be added to the model so that they can be taken into account when analyzing transactional data.
We are looking for experienced Data Scientists and ML Engineers to come and help us in our mission to simplify healthcare. If challenging data science problems with significant impact excite you, check out all the available data jobs on Halodoc’s Career Page.
Halodoc is the number 1 all-around healthcare application in Indonesia. Our mission is to simplify and bring quality healthcare across Indonesia, from Sabang to Merauke. We connect 20,000+ doctors with patients in need through our tele-consultation service. We partner with 3500+ pharmacies in 100+ cities to bring medicine to your doorstep. We've also partnered with Indonesia's largest lab provider to provide lab home services, and to top it off we have recently launched a premium appointment service that partners with 500+ hospitals, allowing patients to book a doctor appointment inside our application. We are extremely fortunate to be trusted by our investors, such as the Bill & Melinda Gates Foundation, Singtel, UOB Ventures, Allianz, GoJek, Astra, Temasek and many more. We recently closed our Series C round and in total have raised around USD 180 million for our mission. Our team works tirelessly to make sure that we create the best healthcare solution personalised for all of our patients' needs, and we are continuously on a path to simplify healthcare for Indonesia.