Using EDA and K-Means for food similarity and diet chart

Health and wellness is a complex subject that needs a holistic approach to your lifestyle to achieve and maintain. Food is an important pillar of health and fitness. The complexity increases as we go into the details of food items to classify what is healthy and what is unhealthy, especially when an individual’s health and fitness goal is taken into account. Food suggested for one fitness goal might not fit another. High-carb food might be preferred when weight gain is the goal, but the opposite may hold for weight loss.

There are millions of food items in the world, and our task was to classify and suggest food items in light of a user’s health and fitness objective. The algorithm should also be able to recommend healthier alternatives to the items a user eats in their daily routine.

The problem statement we had was to prepare a diet chart for users based on their goals. Every goal had its own calorie requirement and its own percentages of the primary nutrients, i.e. carbohydrate, fat, protein and fibre. In this context it made a lot of sense to group foods by these properties, classifying each as a high-carb, high-protein or high-fat item. Hence we decided to analyse the data and create clusters out of it.

We divided our process into the following steps:

  1. Reading, Understanding, and visualising data.
  2. Preparing data for modelling.
  3. Creating Model.
  4. Verifying accuracy of our model.

Let’s get started by reading and understanding the data.

In total we were provided with 1940 records having 88 attributes. Of these, our business requirement needed only attributes such as foodName, carbs, protein, fat, fibre, weight, calorie, saturatedFat and volume.

Several entries in our dataset had missing values. There can be two reasons for this:

  1. The value was intentionally left out because the food item does not contain that nutrient, which simply means the missing value represents zero.
  2. There was an error while collecting the data, and the values were skipped during data entry.

After consulting the data source, we imputed missing values with zero.

Next, the calorie value recorded for each food item included calories from all of its mineral and nutrient components, but we are only concerned with a few of those nutrients, so we recalculated calories from those alone. According to the standard formula, the calories come out as:

calorie = 4*carbs + 9*fat + 4*protein

We stored the result as a derived metric, calorie_calculated.
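As a quick worked example of the formula above, here is a minimal Python sketch. The function name and the sample values are illustrative assumptions, not taken from our dataset.

```python
def calorie_calculated(carbs_g: float, fat_g: float, protein_g: float) -> float:
    """Calories from the three macronutrients, all given in grams."""
    return 4 * carbs_g + 9 * fat_g + 4 * protein_g


# Hypothetical food item: 20 g carbs, 5 g fat, 10 g protein
example = calorie_calculated(20, 5, 10)
print(example)  # 4*20 + 9*5 + 4*10 = 165
```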

Standardising values:

The columns carbs, fat, protein and fibre are in grams, but for our analysis we need to convert them to their calorie representation. Since fibre does not contribute to calories, we instead convert it to content per unit weight of the food item.

It is very important for a clustering algorithm that the features are not correlated. However, the correlation heatmap showed that as the calories of a food item increase, so do its fat, carbs and protein. To remove this correlation, we took the ratio of each nutrient’s calories to the calculated calorie.
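The conversion described in this section can be sketched as follows. This is a minimal illustration assuming each nutrient is expressed as its share of the calculated calories and fibre as grams per gram of food; the function names and the zero-handling are our assumptions here, not the exact production code.

```python
def calorie_shares(carbs_g, fat_g, protein_g):
    """Share of calculated calories contributed by each macronutrient."""
    total = 4 * carbs_g + 9 * fat_g + 4 * protein_g
    if total == 0:
        # Assumed handling for items with no calories from these nutrients
        return {"carbs": 0.0, "fat": 0.0, "protein": 0.0}
    return {
        "carbs": 4 * carbs_g / total,
        "fat": 9 * fat_g / total,
        "protein": 4 * protein_g / total,
    }


def fibre_per_gram(fibre_g, weight_g):
    """Fibre content per unit weight of the food item."""
    return fibre_g / weight_g if weight_g else 0.0


# Hypothetical item: 20 g carbs, 5 g fat, 10 g protein, 3 g fibre, 100 g weight
print(calorie_shares(20, 5, 10))
print(fibre_per_gram(3, 100))
```

Because the shares always sum to one for any item with non-zero calories, a high-calorie food no longer scores high on every nutrient at once.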

Now that our data is clean and the correlations are handled, let’s move to the next step, i.e. clustering.

What is clustering?

Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

It’s a kind of unsupervised learning, as we don’t provide any labels and instead try to separate the data into subgroups based on the features provided.

What is K-Means Clustering?

K-means is a centroid-based, or distance-based, algorithm in which we calculate distances to assign a point to a cluster. In K-Means, each cluster is associated with a centroid. It tries to make the data points within a cluster as similar as possible while keeping the clusters as far apart as possible.

It involves the following steps:

  1. Choose the number of clusters, say k. This is the k in K-Means clustering.
  2. Select k random points in the data as the initial centroids.
  3. Measure the distance between the first point and each of the k initial centroids.
  4. Assign the first point to the cluster of the nearest centroid. Repeat steps 3 and 4 for the remaining points. Once all the points are in a cluster, move on to the next step.
  5. Calculate the mean of each cluster, i.e. the new centroid of each cluster.
  6. Measure the distances from the new centroids and repeat steps 3 to 5. Once the cluster assignments do not change during an iteration, we are done.
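The steps above can be sketched in plain Python as follows. This is a teaching sketch only: the function name, the tuple-based point format and the sample data are assumptions, and it is not how scikit-learn implements K-Means.

```python
import random


def kmeans(points, k, max_iter=100, seed=42):
    """Cluster a list of equal-length tuples; returns (assignment, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # step 2: k random points as centroids
    assignment = None
    for _ in range(max_iter):
        # Steps 3-4: assign every point to its nearest centroid
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        if new_assignment == assignment:  # step 6: nothing changed, stop
            break
        assignment = new_assignment
        # Step 5: move each centroid to the mean of its cluster
        for c in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


# Hypothetical 2-D data with two obvious groups
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = kmeans(data, k=2)
print(labels, centres)
```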

We can assess the quality of a clustering by adding up the variation within each cluster. Since K-Means cannot know the best clustering in advance, its only option is to keep track of the clusters and their total variance, and repeat the whole thing with different starting points.

Since K-Means relies heavily on distance, it is very important for our features to be scaled to a mean around zero and unit standard deviation. The feature scaling technique that does exactly this is standardisation.
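For illustration, standardisation of a single feature can be written by hand as below. This minimal sketch uses the population standard deviation, which is also what scikit-learn's StandardScaler uses; the function name and sample values are assumptions.

```python
from statistics import mean, pstdev


def standardise(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical carb shares for five food items
z = standardise([0.10, 0.25, 0.40, 0.55, 0.70])
print(z)
```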

The next question is: what should the value of K be?

For this we use the elbow curve method. It gives a good idea of what K should be, based on the sum of squared distances (SSE) between the data points and their assigned cluster centroids. We pick k at the spot where the SSE starts to flatten out, forming an elbow.

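import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

required_cols = ['carbs', 'fat', 'fibre', 'protein']

# Standardise each feature to zero mean and unit standard deviation
scaler = StandardScaler()
df[required_cols] = scaler.fit_transform(df[required_cols])
# Any remaining NaNs become 0, i.e. the feature mean after scaling
df[required_cols] = df[required_cols].fillna(0)

# Fit K-Means for k = 1..10 and record the within-cluster sum of squares
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42, max_iter=300).fit(df[required_cols])
    wcss.append(kmeans.inertia_)
plt.plot(range(1, 11), wcss)
plt.title('Elbow curve')
plt.xlabel("Number of clusters")
plt.ylabel("WCSS")
plt.show()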

This produces the elbow curve. From it, we can say that the optimal value of K should be around 4.

Analysis of Clustering

We are using Silhouette Analysis to understand the performance of our clustering.

Silhouette analysis can be used to determine the degree of separation between clusters. For each sample:

  • Compute the average distance from all data points in the same cluster (ai).
  • Compute the average distance from all data points in the closest cluster (bi).
  • Compute the coefficient:
      si = (bi - ai) / max(ai, bi)

The coefficient can take values in the interval [-1, 1].

  • If it is 0 –> the sample is very close to the neighboring clusters.
  • If it is 1 –> the sample is far away from the neighboring clusters.
  • If it is -1 –> the sample is assigned to the wrong cluster.

Therefore, we want the coefficients to be as large as possible and close to 1 to have good clusters. Let’s analyse the silhouette score in our case.
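To make the definition concrete, here is a small hand-rolled sketch of the coefficient for a single sample. The function name and toy data are assumptions; in practice we rely on scikit-learn's silhouette_score, which averages this value over all samples. The sketch expects at least two clusters.

```python
from math import dist


def silhouette_sample(i, points, labels):
    """Silhouette coefficient (b - a) / max(a, b) for the sample at index i."""
    own = labels[i]
    # a: mean distance to the other points in the same cluster
    same = [dist(points[i], p) for j, p in enumerate(points)
            if labels[j] == own and j != i]
    if not same:
        return 0.0  # singleton cluster: score taken as 0 by convention
    a = sum(same) / len(same)
    # b: smallest mean distance to the points of any other cluster
    b = min(
        sum(dist(points[i], p) for p, l in zip(points, labels) if l == other)
        / labels.count(other)
        for other in set(labels) if other != own
    )
    denom = max(a, b)
    return (b - a) / denom if denom else 0.0


# Hypothetical 1-D data: two tight, well-separated clusters
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
print(silhouette_sample(0, pts, [0, 0, 1, 1]))  # close to 1: well clustered
print(silhouette_sample(1, pts, [0, 1, 1, 1]))  # negative: likely wrong cluster
```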

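from sklearn.metrics import silhouette_score

# Fit K-Means for k = 2..10 and record the mean silhouette score for each k
result = {}
for i in range(2, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42, max_iter=300, n_init=10).fit(df[required_cols])
    pred = kmeans.predict(df[required_cols])
    result[i] = silhouette_score(df[required_cols], pred, metric='euclidean')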

We get result as:

{2: 0.31757107035913174,  3: 0.34337412758235525,  4: 0.3601443169380033,  5: 0.2970926954241235,  6: 0.29883645610373294,  7: 0.3075310165352718,  8: 0.313105441606524,  9: 0.2902622193837789,  10: 0.29641563619062317}

We can clearly see that the silhouette score is highest for k = 4. Hence 4 is a good choice for the value of K.

Once we had k, we ran K-Means and formed our clusters.

Next comes prediction. Let’s say we get the nutrition composition for a specific goal. We scale that data into the format our model accepts and predict the cluster for the given composition.

y_pred = model.predict([food_item])
label_index = np.where(model.labels_ == y_pred[0])

Once we have label_index, we filter the foods in that cluster out of our data and calculate the Euclidean distance between each of them and the given composition.

features = df[required_cols].to_numpy()  # rows in the same order used to fit the model
dist = [np.linalg.norm(features[index] - food_item) for index in label_index[0]]

This way, we get the food items most closely related to the provided composition, and can prepare the diet the way we want. For example, if we want to further filter the results from clustering by type, such as veg/non-veg, we can apply those filters.

The above content is an outcome of our experience while working on this problem statement. Please feel free to reach out and comment with any feedback or suggestions.
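Putting the last two snippets together, the lookup can be sketched without pandas or scikit-learn as below. The item names, feature vectors and cluster labels are made-up placeholders, and the helper name is hypothetical.

```python
from math import dist


def closest_foods(target, foods, labels, target_label, top_n=3):
    """Return the names of the top_n foods in the target's cluster, nearest first."""
    candidates = [
        (dist(vector, target), name)
        for (name, vector), label in zip(foods, labels)
        if label == target_label  # keep only foods from the predicted cluster
    ]
    return [name for _, name in sorted(candidates)[:top_n]]


# Placeholder (carb share, fat share, protein share) vectors and cluster labels
foods = [
    ("item_a", (0.10, 0.60, 0.30)),
    ("item_b", (0.70, 0.10, 0.20)),
    ("item_c", (0.90, 0.02, 0.08)),
    ("item_d", (0.05, 0.60, 0.35)),
]
labels = [1, 0, 0, 1]
print(closest_foods((0.85, 0.03, 0.12), foods, labels, target_label=0, top_n=2))
```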


How to track sleep through Android app


Our HealthKart application helps users achieve their health and fitness goals through our digital platform. Achieving a health and fitness goal requires many things to be incorporated into the daily routine, and sleep is an important parameter among them.

Sleep tracking can be done through a couple of methodologies, and one of the popular ways is to track it through smart bands/watches. The HealthKart app integrates with various health and fitness bands to track sleep; however, we wanted another, much easier way to track users’ sleep so that we could maximize the data inputs from our users on this front.

These days, most people use their phone from morning to night, and reach for it as soon as they wake up in the morning. So we calculate sleep time from the user’s activity on the phone.

Now the question is which activities we capture for this. The answer is only two, listed below.

  • The device screen turns on, either through user intent or because of another application, e.g. the phone ringing.
  • The device screen turns off.

Android Components used for this

  • Started Service
  • BroadcastReceiver

Steps for using Android Components

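1. Create a SleepTrackerService that extends the Service class.

class SleepTrackerService : Service() {

  override fun onBind(p0: Intent?): IBinder? {
    return null
  }

  override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
    Log.i(TAG, "Sleep Tracker Service")
    // ... receiver registration and foreground notification (steps 2 and 3)
    return START_STICKY // assumed; the return value is not shown in the original
  }

  override fun onDestroy() {
    // ... see step 5
  }
}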

2. Create two BroadcastReceivers, ScreenONReceiver and ScreenOFFReceiver, to detect when the screen turns on and off. Register both receivers in the onStartCommand method of the Service class.

private var screenOffReceiver: ScreenOFFReceiver? = null
private var screenOnReceiver: ScreenONReceiver? = null

screenOffReceiver = ScreenOFFReceiver()
val offFilter = IntentFilter(Intent.ACTION_SCREEN_OFF)
registerReceiver(screenOffReceiver, offFilter)

screenOnReceiver = ScreenONReceiver()
val onFilter = IntentFilter(Intent.ACTION_SCREEN_ON)
registerReceiver(screenOnReceiver, onFilter)

3. To keep the service running in the background, even after the application is killed, we run it as a foreground service.

val notificationManager = getSystemService(Context.NOTIFICATION_SERVICE) as NotificationManager

val notificationIntent = // ... (right-hand side elided in the original)
val uniqueInt = (System.currentTimeMillis() and 0xfffffff).toInt()
val pendingIntent = // ... (right-hand side elided in the original)

val builder: NotificationCompat.Builder =
  NotificationCompat.Builder(this, SLEEP_CHANNEL_ID)
builder.apply {
  setContentText("Sleep Tracking")
  priority = NotificationCompat.PRIORITY_HIGH
  addAction(R.drawable.blue_button_background, "TURN OFF", pendingIntent)
}

val notification = builder.build()
notification.flags = Notification.FLAG_ONGOING_EVENT

startForeground(SLEEP_NOTIFICATION_SHOW_ID, notification)

4. Now record the timings in ScreenONReceiver and ScreenOFFReceiver.

inner class ScreenOFFReceiver : BroadcastReceiver() {
  override fun onReceive(context: Context, intent: Intent) { /* ... record the screen-off time (body elided in the original) */ }
}

inner class ScreenONReceiver : BroadcastReceiver() {
  override fun onReceive(context: Context, intent: Intent) { /* ... record the screen-on time (body elided in the original) */ }
}

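5. Unregister the receivers in the onDestroy method when the service is destroyed.

override fun onDestroy() {
  screenOffReceiver?.let { unregisterReceiver(it) }
  screenOnReceiver?.let { unregisterReceiver(it) }
  with(NotificationManagerCompat.from(this)) {
    // ... (body elided in the original)
  }
  super.onDestroy()
}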

We capture this screen on/off event data for the user and send it to the backend, where our algorithm calculates sleep behind the scenes.

This methodology is much easier to implement, and the user does not need to wear a gadget all the time. Obviously there are a few trade-offs here too; however, this was a balanced approach to maximize the data inputs from our end users.

This tutorial is the outcome of our own experience implementing sleep tracking. Your suggestions and feedback are heartily welcome.
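The backend algorithm itself is not covered in this post. Purely as an illustration of how on/off events could be turned into a sleep estimate, the sketch below takes the longest screen-off gap as the sleep window. That heuristic, the function name and the event format are all assumptions, not our production logic.

```python
from datetime import datetime, timedelta


def longest_screen_off_gap(events):
    """events: list of (timestamp, "ON" | "OFF") tuples sorted by time.

    Returns (off_time, on_time) for the longest OFF -> ON gap, or None.
    """
    best = None
    off_at = None
    for ts, kind in events:
        if kind == "OFF":
            off_at = ts  # remember the most recent screen-off
        elif kind == "ON" and off_at is not None:
            if best is None or ts - off_at > best[1] - best[0]:
                best = (off_at, ts)
            off_at = None
    return best


# Hypothetical night of events
events = [
    (datetime(2021, 1, 1, 23, 10), "OFF"),
    (datetime(2021, 1, 1, 23, 15), "ON"),
    (datetime(2021, 1, 1, 23, 40), "OFF"),
    (datetime(2021, 1, 2, 6, 30), "ON"),
    (datetime(2021, 1, 2, 6, 45), "OFF"),
    (datetime(2021, 1, 2, 7, 0), "ON"),
]
start, end = longest_screen_off_gap(events)
print(start, end, end - start)
```

A real implementation would also need to handle cases such as the phone ringing overnight, which splits one night of sleep into several gaps.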


Adding In App Video Chat Support – Things to consider

The pandemic has driven an exponential rise in the adoption of video communication on digital platforms to provide better personal support to customers. If you have a digital property, in-app video chat becomes an important aspect of it. From a technology perspective, there are multiple options to choose from when implementing video chat. Lots of questions come to mind: which protocol to choose, should I use open source, should I use hosted services (CPaaS), what about pricing, and many more.

HealthKart provides nutrition services to its customers through in-app video chat support. Customers can initiate a video chat with a nutritionist, or they get a video call in their app at the scheduled appointment time, and can have a one-to-one consultation with the nutritionist. These kinds of implementations require a few things to be considered before we jump to the implementation part. Let’s discuss this in detail.

Choosing the right communication protocol – Technology changes rapidly and keeps evolving, and as a result we keep getting new frameworks, tools and protocols at warp speed. What worked a couple of years back might not be relevant today. RTMP (Real-Time Messaging Protocol), which used to rule streaming, was replaced by HTTP-based streaming protocols: HLS (HTTP Live Streaming) from Apple and DASH (Dynamic Adaptive Streaming over HTTP). WebRTC is a newer game-changer, as it is based on P2P and can be configured to work over both TCP and UDP. It is primarily designed for browser-to-browser streaming.

Looking at WebRTC’s advantages, we at HealthKart opted for WebRTC-based streaming frameworks for implementing in-app video chat support.

Open source vs CPaaS (build vs buy) – This can be a tricky call: whether to build in-house or use a hosted solution such as CPaaS (Communication Platform as a Service). If you choose to build in-house, you have to put a lot of effort into finding the right server and client tools to make it work, and you also need to work on scalability and reliability. Looking at the complexity of the service and our in-house capabilities and priorities, we at HealthKart chose not to build this in-house and instead looked for hosted/CPaaS services readily available in the market.

This call is contextual, though, and may vary with the individual needs of the organization. If you need more information on the things one should consider, please read our other blog post on the same.

CPaaS – TokBox (now Vonage) vs Twilio vs others – If you decide to go ahead with hosted services, the next thing to decide is which one to use. There are multiple CPaaS providers in the market, and one has to choose among them by looking at various aspects. TokBox and Twilio lead the market, and we evaluated both on the aspects highlighted below.

  1. Ease of Use – No matter which provider you choose, you have to pick up their SDKs, read their developer docs and integrate them into your app. There is also a lot of terminology to understand, such as session ID, relay mode and routed mode. TokBox and Twilio both have quite descriptive developer guides and easy-to-use quick start applications. Their conceptual docs are also nicely written and easy to understand. We were able to get a quick start sample up and running in a web application in less than an hour. The Android and iOS SDKs need integration points and configuration and required more time. However, both providers are easy to set up on both platforms.
  2. Pricing – Every provider has a different pricing model, and one has to understand which will suit them best. TokBox starts at a flat $9.99 per month with a 2,000-minute subscription, whereas Twilio has a $0.0010 per participant per minute pricing model. One should do a clear calculation based on estimated user sessions and choose the right one. Here is a quite detailed blog post on the same, which will give you good insight into right-sizing the pricing model across various CPaaS providers.
  3. Support – Twilio and TokBox both have good support available. If you move to an enterprise plan, both provide dedicated support for your needs. In our experience, we reached out to their support once or twice and got a fast response on integration.
  4. Feature Listing – One might need support for other features as well while integrating video chat. Recording, analytics, intelligent insights using AI, and text chat support are a few that might be required in some cases. Please go through each provider to see what they have to offer.
  5. Extensible – Look at the extensibility of each provider: the ecosystem they have, and how they can support you in extending functionality or building any custom feature you need on top of it. In our experience, both have limited extensibility support and do not offer much customization of the features they provide. We wanted incoming video call support (similar to WhatsApp video calling) in our app; however, no out-of-the-box solution was available in either, and we had to build it on our own with the support of real-time push notification services on both Android and iOS. It was not really a deal-breaker for us, though, as the primary requirement was to get something built into the app in an agile and cost-effective way.
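The pricing calculation suggested in point 2 can be sketched as below. The flat fee, included minutes and per-minute rate are the figures quoted above; the overage rate, the assumption that both plans count participant-minutes, and the usage numbers are hypothetical placeholders that you should replace with the providers' current price lists.

```python
def flat_plan_cost(participant_minutes, overage_rate,
                   monthly_fee=9.99, included_minutes=2000):
    """Monthly cost of a flat plan with bundled minutes plus per-minute overage."""
    extra = max(0, participant_minutes - included_minutes)
    return monthly_fee + extra * overage_rate


def per_minute_cost(participant_minutes, rate=0.0010):
    """Monthly cost of a pure per-participant-minute plan."""
    return participant_minutes * rate


# Hypothetical usage: 500 sessions a month, 15 minutes each, 2 participants
usage = 500 * 15 * 2
print(flat_plan_cost(usage, overage_rate=0.005))  # 0.005 is a placeholder rate
print(per_minute_cost(usage))
```

Running this for a few usage estimates (low, expected, peak) shows at which volume one model becomes cheaper than the other.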

Considering the overall picture, we decided to use TokBox for its super simple, pay-as-you-go pricing and ease of use.

The above is based on our own experience and is not intended to promote any of the services. Your experience with each service might vary. Please feel free to share your feedback and input on the same.


When not to use Microservices

There is no silver bullet in this world that can solve all your problems. In the field of medicine, a given medication helps in fighting only certain diseases. A medication for headache cannot be used for diabetic care or for curing eye problems. Certain medications are also unsuitable if you have some other pre-existing disease. For example, paracetamol should not be given for fever if you already have liver dysfunction.

Technology frameworks, design patterns and architectural considerations work in a similar way. A given framework solves a given set of problems; at the same time, it might act as an anti-pattern in a few cases if you have some other problem statement at hand.

OK, so let’s talk about Microservices a bit..

Microservices have been buzzing in technology since the start of this decade. We engineers are always quick to consider these buzzing frameworks because other tech giants use and promote them. You might hear your engineering team ask why we don’t use Go/Rust/Julia or some other trending language or framework. If you ask them to explain why we should use it, chances are pretty high that you will get the same response –

“It is a new trend, people talk about it and it has been open sourced by Google/Facebook etc. My friend is also working on the same..”

We mostly fall into this trap and quickly try to adopt it before thinking much about the core of the situation –

“Will this really solve my problem? Or do I really have a problem statement which can be solved by this framework/language?”

Microservices are no exception here. They have certain pros and cons, and one should be pretty clear that one actually has a problem statement that microservices can solve. One should also be very clear about the trade-offs that come with using them. If you have not introspected on this beforehand, it might become a real mess for your engineering team going forward. Yes, you heard it right…

“Microservices are a real mess … You should consider them only if you are 100% aware of their pros and cons and ready to handle the downsides.”

Evolution of Microservices

Microservices came into existence at the start of the 2010s and were rapidly adopted by many tech companies. Netflix has been a big promoter of Microservices and contributed a lot on this front. When I first came to know about Microservices in 2011, the first thing that came to my mind was ..

“Ohh… not much different from SOA and ESB… Indeed a specialized variant of SOA which only works on HTTP and mostly supports the JSON data format. Then why so much buzz around it…”

Microservices were nothing new, especially for people working on enterprise architecture at that time; they found it similar to what SOA provided on the operational front. Indeed, Adrian Cockcroft, former director for Cloud Systems at Netflix, described this approach as “fine grained SOA” back in 2012.

Let’s talk about some obvious advantages of Microservices

One of the biggest reasons Microservices got a big push was the adoption and standardization of communication protocols and data formats. HTTP and JSON became the standards for system communication. The HTTP footprint became so large that even the smallest IoT devices started supporting it. This evolution ultimately killed two things… SOA and M2M protocols, and gave birth to a new skill – DevOps.

So let’s see what benefits Microservices have –

  • Modularity – This makes the application easier to understand, develop and test, and more resilient to architecture erosion. This benefit is often argued in comparison to the complexity of monolithic architectures.
  • Scalability – Since microservices are implemented and deployed independently of each other, i.e. they run within independent processes, they can be monitored and scaled independently.
  • Ease of Integration – Microservices are considered a viable means of modernizing existing monolithic software applications. There are experience reports from several companies that have successfully replaced (parts of) their existing software with microservices, or are in the process of doing so.
  • Distributed development – It parallelizes development by enabling small autonomous teams to develop, deploy and scale their respective services independently. It also allows the architecture of an individual service to emerge through continuous refactoring. Microservice-based architectures facilitate continuous integration and deployment.

Alright, this seems fair enough. Now let’s see some of the complexity and cons that Microservices bring to the table. The above advantages might sound fascinating and easy to achieve; however, in technology, things which sound easy are mostly hard to achieve.

What is hard to achieve in Microservices –

  • Cross-service transactions – If your system requires data consistency across different microservices, you will not find it easy, or I would say there is no simple way, to handle this. You might think of writing your own transaction management; however, the cost of writing it is too high, as you have to handle lots of things which could have been done with one annotation if you had a monolith in place.
  • Infrastructural operational efficiency – Deployment, data backup and restore, and data recovery become really challenging and become an overhead for your DevOps team. Since each service has its own deployment server and database, the DevOps team has to plan scaling, backup and restore, and recovery strategies separately for each microservice. If you have a ton of services running in production, this can be really painful for DevOps, especially if you are a very lean engineering team.
  • Cost efficiency of cloud expenditure – Cloud cost increases as you keep spawning new servers and pushing data between them. Since each microservice runs on a separate server, this will increase your infra cost even if you use Docker and container orchestration software. Also, we normally ignore data in and out costs; however, if you are moving large amounts of data between your servers, this might raise the expenditure by a significant amount. Since in microservices lots of data is passed between systems for aggregation and composition of services, this takes a toll on infra cost compared with a monolithic deployment.
  • Testing/Deployment and Debugging – Consider the case where the output given to our client is the outcome of aggregating the responses of multiple microservices. If any debugging/testing has to be done, one has to trace it through all the microservices in production to find the cause. You have to define the logging strategy up front to avoid a mess when debugging the system. As you keep adding new services, issues can become really hard to resolve, which may compromise the agility of your deliverables.
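To give a feel for what writing your own transaction management involves, here is a minimal sketch of one common approach, a saga with compensating actions. This technique is our illustration and not something prescribed above; the step names are made up, and a real implementation would also need persistence, retries and idempotency.

```python
def run_saga(steps):
    """steps: list of (action, compensation) callables, run in order.

    If any action raises, the compensations of the completed steps are
    run in reverse order and False is returned.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()  # undo what earlier services already committed
        return False
    return True


# Hypothetical order flow across three services
log = []


def charge_payment():
    raise RuntimeError("payment service unavailable")


ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (charge_payment, lambda: log.append("refund payment")),
    (lambda: log.append("create shipment"), lambda: log.append("cancel shipment")),
])
print(ok, log)
```

In a monolith with a single database, the same rollback behaviour usually comes from one local transaction.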

OK, so when should we really NOT use this?

  • If you don’t have a problem statement at hand – Looking at the pros, check whether you really have a problem statement. There are other ways of achieving modularity and scalability than going the microservices way. You can also think of adopting a lean microservice approach (breaking your monolithic system into just two or three microservices) for greater modularity and scalability.
  • If you are an early-stage startup – Avoid this if you are just starting up and still in the validation phase. You might end up solving a problem which does not need to be solved at the current stage of your organization. Remember, you are not Netflix.
  • If you have a very lean team that is not distributed much – Microservices work better if you have a distributed team and each team can work independently on its own microservice. If your engineering team is not in that state, you should avoid it, or maybe try to adopt the lean microservice model explained in the first point.
  • If DevOps skill is missing in your team – Microservices architecture requires a lot of work on the DevOps side for deployment and management of infrastructure. You should surely avoid it if you or your team lack this skill or do not have much experience in handling a microservice setup.

Final thoughts –

We at HealthKart use the microservices architecture pattern and have gone through this evolution from monolith to microservices. The two things below have really helped us in handling the downsides of this pattern and maximizing the upside.

  • Optimal service decomposition strategy – Don’t overdo it.
  • Go slow – Do it in an agile way: Develop > Measure > Learn. Start with no more than 2-3 services which are critical from a scalability and modularity point of view. This will really help you decide whether this will work for you or not.

P.S. The above content is the outcome of the experience I have gained while working with microservices, and I am open to feedback and suggestions.



API Gateway- Front Controller to our Microservices

What is an API Gateway?

An API Gateway is the first step towards diving into a microservices architecture. It is a type of proxy server which sits in front of all our backend services and provides a unified interface to the clients. It acts as the single entryway into a system, allowing multiple APIs or microservices to act cohesively and provide a uniform experience to the user.

An API gateway takes all API requests from the clients. It handles some requests by simply routing them to the appropriate service; for others, it aggregates the various services required to fulfill them and returns the combined response.

Why API Gateway? What benefits does it provide?

As more and more organizations move into the world of microservices, it becomes imperative to adopt an API management solution which takes on the workload of ensuring high availability and performs certain core functionalities.

A major benefit of using API gateways is that they allow developers to encapsulate the internal structure of an application in multiple ways, depending upon the use case. Enumerated below are some of the core benefits provided by an API gateway:-

  1. Security Policy Enforcement – API gateways provide a centralized proxy server to manage rate limiting, bot detection, authentication, CORS, etc.
  2. Routing & Aggregation – Routing requests to the appropriate service is the core of an API gateway. Certain API endpoints may need to join data across multiple services. API gateways can perform this aggregation so that the client does not need complicated call chaining, which reduces the number of round trips. Such aggregation simplifies the client by moving the logic of calling multiple services from the client to the gateway layer. It also gives breathing space to our backend services by lifting off them the thread management logic for assembling responses from various services.
  3. Cross-Cutting Concerns – Logging, caching and other cross-cutting concerns such as analytics can be handled in a centralized place rather than being deployed to every microservice.
  4. Decoupling – If our clients need to communicate directly with many separate services, renaming or moving those services can be challenging, as the client is coupled to the underlying architecture and organization. API gateways enable us to route based on path, hostname, headers and other key information, thus helping to decouple the public-facing API endpoints from the underlying microservice architecture.
  5. Ability to Configure Fallback – In the event of a failure of one or more microservices, an API gateway can be configured to serve a fallback response, either from a cache, from some other service, or as a static response.
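The aggregation and fallback behaviour described in points 2 and 5 can be sketched in a few lines. This is a framework-free illustration with stubbed backend calls; the service names, function signature and fallback values are hypothetical, not our gateway code.

```python
def aggregate(user_id, services, fallbacks):
    """Call each backend service and merge the results into one response.

    services:  dict of name -> callable(user_id)
    fallbacks: dict of name -> static value served if that service fails
    """
    response = {}
    for name, call in services.items():
        try:
            response[name] = call(user_id)
        except Exception:
            # Serve the configured fallback instead of failing the whole request
            response[name] = fallbacks.get(name)
    return response


# Stubbed backends standing in for real microservices
def profile_service(user_id):
    return {"id": user_id, "name": "demo user"}


def orders_service(user_id):
    raise TimeoutError("orders service is down")


combined = aggregate(
    42,
    services={"profile": profile_service, "orders": orders_service},
    fallbacks={"orders": []},
)
print(combined)
```

The client makes one call and gets one response, even though one of the two backends failed.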

Solutions Available?

There is a myriad of solutions available when it comes to choosing an API Gateway. A few renowned ones include –

  • Amazon API Gateway
  • Azure API Management
  • Apigee
  • Kong
  • Netflix Zuul
  • Express API Gateway

In my view, the primary factors to consider while choosing a suitable API gateway are the following:-

  1. Deployment complexity – how easy or difficult is it to deploy and maintain the gateway service itself?
  2. Open source vs proprietary – are extension plugins readily available? Is the free tier scalable to your required traffic?
  3. On-premise or cloud hosted – on-premise can add time to plan the deployment and maintain it. However, cloud-hosted solutions can add a bit of latency due to the extra hop and can reduce the availability of your service if the vendor goes down.
  4. Community support – is there a considerable community using/following your solution where problems can be discussed?

How did HK leverage API gateway?

At HealthKart we chose Netflix Zuul API Gateway (Edge Service) as the front door for our microservices. We have embedded our authentication and security validation at the gateway layer to avoid replicating it across multiple services. We use it for dynamically routing requests to different backend clusters as needed.

We have also implemented routing rules and custom filters. For example, if we want to append a special tag to the request header before it reaches the internal microservices, we can do it at this layer.

Netflix Zuul – What & How?

At a high level, Zuul 2.0 is a Netty server that runs pre-filters (inbound filters), then proxies the request using a Netty client, and then returns the response after running post-filters (outbound filters). The filters are where the core of the business logic happens for Zuul. They can perform a very wide range of actions and can run at different parts of the request-response lifecycle.
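
The inbound filter, proxy, outbound filter flow can be modelled in plain Java as below. This is a conceptual sketch only and deliberately does not use the Zuul filter classes: FilterPipeline, the header names X-Internal-Tag, X-Seen-Tag and X-Served-By, and the fake backend are all hypothetical, and requests and responses are reduced to header maps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical model of the inbound filter -> proxy -> outbound filter flow.
// Requests and responses are reduced to header maps; this is NOT the Zuul API.
class FilterPipeline {
    private final List<UnaryOperator<Map<String, String>>> inbound = new ArrayList<>();
    private final List<UnaryOperator<Map<String, String>>> outbound = new ArrayList<>();
    private final UnaryOperator<Map<String, String>> origin;

    // origin stands in for the proxied backend: request headers in, response headers out.
    FilterPipeline(UnaryOperator<Map<String, String>> origin) {
        this.origin = origin;
    }

    void addInbound(UnaryOperator<Map<String, String>> filter) {
        inbound.add(filter);
    }

    void addOutbound(UnaryOperator<Map<String, String>> filter) {
        outbound.add(filter);
    }

    Map<String, String> handle(Map<String, String> requestHeaders) {
        Map<String, String> request = new HashMap<>(requestHeaders);
        for (UnaryOperator<Map<String, String>> filter : inbound) {
            request = filter.apply(request);   // runs before the backend call
        }
        Map<String, String> response = new HashMap<>(origin.apply(request));
        for (UnaryOperator<Map<String, String>> filter : outbound) {
            response = filter.apply(response); // runs after the backend call
        }
        return response;
    }

    public static void main(String[] args) {
        // The fake backend echoes the tag it received so the effect is visible.
        FilterPipeline pipeline = new FilterPipeline(request -> {
            Map<String, String> response = new HashMap<>();
            response.put("X-Seen-Tag", request.getOrDefault("X-Internal-Tag", "none"));
            return response;
        });
        pipeline.addInbound(request -> {
            request.put("X-Internal-Tag", "edge");
            return request;
        });
        pipeline.addOutbound(response -> {
            response.put("X-Served-By", "gateway");
            return response;
        });

        System.out.println(pipeline.handle(new HashMap<>()));
    }
}
```

The inbound filter here is the kind of place where a special tag could be appended to the request header before the request reaches an internal microservice.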

Zuul works in conjunction with Netflix Eureka service. Eureka is a REST based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers. Zuul doesn’t generally maintain hard coded network locations (host names and port numbers) of backend microservices. Instead, it interacts with a service registry and dynamically obtains the target network locations.
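
To make the registry idea concrete, here is a minimal in-memory sketch of what a gateway asks a service registry for. It is not the Eureka client API; the class InMemoryRegistry, the service name and the host:port values are hypothetical, and instances are picked round-robin purely for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a service registry: services register their
// network location by name, and callers resolve a name to one instance.
class InMemoryRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    // Called by a service instance when it comes up.
    void register(String serviceName, String hostAndPort) {
        instances.computeIfAbsent(serviceName, name -> new CopyOnWriteArrayList<>())
                 .add(hostAndPort);
    }

    // Called when an instance goes away (failover).
    void deregister(String serviceName, String hostAndPort) {
        List<String> known = instances.get(serviceName);
        if (known != null) {
            known.remove(hostAndPort);
        }
    }

    // Round-robin over the currently registered instances of a service.
    String resolve(String serviceName) {
        List<String> known = instances.get(serviceName);
        if (known == null || known.isEmpty()) {
            throw new IllegalStateException("no instance registered for " + serviceName);
        }
        int turn = counters.computeIfAbsent(serviceName, name -> new AtomicInteger())
                           .getAndIncrement();
        return known.get(Math.floorMod(turn, known.size()));
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        registry.register("catalog-service", "10.0.0.1:8080");
        registry.register("catalog-service", "10.0.0.2:8080");

        System.out.println(registry.resolve("catalog-service"));
        System.out.println(registry.resolve("catalog-service"));
        registry.deregister("catalog-service", "10.0.0.1:8080");
        System.out.println(registry.resolve("catalog-service"));
    }
}
```

The gateway never stores the host names itself; it only knows the logical service name, so instances can come and go without any gateway configuration change.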

To get this working on our Edge microservice, Spring Boot provides excellent built-in support; we just had to enable a few configurations. The code snippet for the same is shown below:

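// Assumption: the annotations were lost from this snippet; with Spring Cloud Netflix they would typically be @SpringBootApplication and @EnableZuulProxy.
public class GatewayServiceApplication {
 public static void main(String[] args) {
  SpringApplication.run(GatewayServiceApplication.class, args);
 }
}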

At the respective microservice layer, we needed to integrate service discovery so that as soon as the microservice is up, it registers itself with the Eureka server registry. The @EnableDiscoveryClient annotation in Spring Boot helps us achieve this.

The following properties on the client side helped us enable client registration:

eureka.instance.hostname= xxx
eureka.client.region= default
eureka.client.registryFetchIntervalSeconds= 5
eureka.client.serviceUrl.defaultZone=  xxxx


An API Gateway service is a great add-on to have in a microservices architecture and has definitely proved to be a boon for us. We have not yet leveraged it to its full capacity and aim to use it for cross cutting concerns like logging and caching in the coming months. The end goal is to have every microservice on-boarded onto this API gateway to enable seamless client-to-server and server-to-server communication.

Real time Analytics Pipeline Using AWS

At HealthKart, we use the lambda architecture for building our real time analytics pipeline. The most critical part of this setup is picking frameworks which are extensible and do not take a heavy toll on your infrastructure cost.

Keeping these things in mind, AWS was the most viable option for a lambda architecture to achieve real time analytics for the HealthKart platform. Below is the architectural diagram of our setup, which comprises multiple frameworks, each explained below.

Lambda Architecture for real time analytics
  • AWS Pinpoint – AWS Pinpoint is primarily a mobile analytics framework which also has a JS SDK available along with REST APIs. This framework provides APIs to fire pre-built and custom events from the client side, which get stored in S3 buckets in JSON format. Since it has a client SDK available, it provides lots of pre-built client metrics like session time, DAU/MAU, and geographical information in the Pinpoint dashboard. On top of that, 100M events are free and they charge $1 for each additional 1 million events. This makes it cost optimal if your events are in the few hundred millions per month.
  • S3 Bucket – All the event data fired from the client side gets stored in an S3 bucket, which is scalable and easy to integrate with other AWS services.
  • Kinesis Stream – Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. We use Kinesis to push all event data received from our app in real time.
  • Application Groups Listener – These are Kinesis clients which listen to the Kinesis stream and power parallel processing of streaming data in real time. There can be multiple application groups running in parallel to process large amounts of data. We use this processed streaming data to define the products which are trending in real time, maintain users’ recently viewed history, create personalized results in listings, send real-time push notifications based on event data rules, etc.
  • Redis Cluster – The application group listener prepares the required data for trending, viewing history, personalized data etc. and puts it in the Redis cluster. Our platform uses the data stored in the Redis cluster to show it to users on App/Web in real time. Since Redis supports multiple data structures apart from just key-value pairs, it becomes easy to serve different kinds of pre-built data in real time based on need.
  • Redshift – AWS Redshift powers analytics workloads on petabyte-scale data. We further pass S3 event data to Redshift so that on-demand and ad hoc analytical queries can be processed faster for in-house reporting.
  • Qlik Sense – Qlik Sense is a BI reporting tool which is integrated with the Redshift columnar database to power our business reporting.
  • Athena – Athena can be used to run SQL queries directly on the data stored in S3 in JSON format for analytical and reporting purposes.
  • QuickSight – Amazon QuickSight is a fast, cloud-powered business intelligence service that makes it easy to deliver insights to everyone in the organization. As a fully managed service, QuickSight lets you easily create and publish interactive dashboards.
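
As an illustration of what an application group listener does with the stream, the sketch below counts "product viewed" events and returns the current top products. It is a simplified, in-memory stand-in: the class TrendingCounter and the product ids are hypothetical, the Kinesis consumer is replaced by direct method calls, and a plain map takes the place of the Redis cluster.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener logic: one call per "product viewed" event read
// from the stream; a HashMap stands in for the Redis cluster.
class TrendingCounter {
    private final Map<String, Integer> viewsByProduct = new HashMap<>();

    void onProductViewed(String productId) {
        viewsByProduct.merge(productId, 1, Integer::sum);
    }

    // Product ids ordered by view count (highest first), ties broken by id.
    List<String> top(int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(viewsByProduct.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        TrendingCounter trending = new TrendingCounter();
        String[] events = {"whey-1kg", "bcaa", "whey-1kg", "creatine", "whey-1kg", "bcaa"};
        for (String productId : events) {
            trending.onProductViewed(productId);
        }
        System.out.println(trending.top(2));
    }
}
```

In a real deployment the counts would presumably be kept in a shared store with a time window, so that every application server sees the same trending list.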

We also use the same setup to power user engagement in real time, since the architecture is extensible and follows the Open/Closed principle. Our user journey workflow system listens to the same stream to send personalized push notifications to users in real time based on their actions. We use the Flowable workflow engine, integrated with Kinesis application groups, for this purpose.

The above content sharing is based on our experience and work here at HealthKart. Please feel free to comment with your thoughts on this.

Technology Dilemma – Build vs Buy

One of the most difficult and tricky situations you come across in a tech leadership role is deciding whether a particular functional piece should be built in house or whether you should just buy it and integrate it with your tech landscape. Every decision that we take in our professional or personal life is mostly the outcome of certain assumptions and a few metrics that we look at before making a call. Data from previous experience plays a crucial role here too; that is what gets converted into learning.

Based on my personal experience and learning, I would like to highlight a few points that should be strongly reviewed before taking the decision.

1 . Experimental Vs Core Initiative – The first thing to consider is whether this functional piece is being done for experimental purposes or whether the organization has strong resolve to run it as a core initiative. If you are in the experimental phase of the feature, the first choice should be the buy option, ideally in a Pay-As-You-Go model. Once the experiment is done and you have good adoption and engagement, the build option can be considered if the points below support it.

2 . Functional, Technical, Operational and Opportunity Cost – These are the important cost factors if you are going to build in-house. Operational cost takes the biggest toll of them all if you look at it over the long run. The cost of scaling and maintaining software products is much higher than the functional and technical costs. A couple of engineers and PMs can work together and make it functional in a few months (a fixed cost); however, maintenance, server, scaling and DevOps costs will remain and add to Opex until you decide to kill the feature. Last but not least, the opportunity cost, i.e. the chance to build something else which makes more sense to the business, should not be overlooked.

3 . Business Differentiator – Important questions to ask:

  • Can you really create a business differentiator while building it in-house?
  • Do you really think that this feature is core to the business and has the potential to become a commercially viable product itself going forward?
  • Can you pivot something new from this feature base and spawn into a new startup altogether?

4 . Incumbency, Exit & Extension – Incumbency is the problem that we face if we buy a service and integrate it with our business. If the feature is core to your business, your existence may depend on the existence of this service. In the world of startups, acquisitions, business shutdowns and pivots keep happening. Technology integration should take this into account and use design patterns and architecture under which shifting the feature in-house or to another third party is not a long-haul project. Also, support for extensions and plugin frameworks should not be ignored, as an extension might be required for your business at any time and you should not be dependent on the provider.

5. Technology Innovations and Engagement Factor – Engineers by nature get excited with the use of new technology and innovations and their motivation peaks to a new level if they are working on something exciting on the technology front. This factor can be used to make a decision to build something in house if it is really exciting for the technology team for innovation and engagement purposes. However, make sure you are not building something which is already a solved problem in the market.

Example –

Buy – Video/Text Chat – Video/text chat is not our core business; this feature is used to listen to our customers’ voice and consult them on their problems. We decided to buy this solution as it is a solved problem in the market, and cost-wise it makes sense too. Also, while choosing the provider we made sure that they have a technology ecosystem in place for extension and integration (API/Plugin etc). This enables us to add features on top of the core data that we receive from our customers through this service.

In House – Omni-channel Inventory Management System – Omni-channel is our core value and business differentiator. There are lots of inventory management platforms available in the market, however they mostly work with a traditional business model. Healthkart has many channels for selling health and sports nutrition supplements, and every channel has its own custom model for inventory management. A single consolidated system needs to be in place which can be developed and extended based on the needs of each channel in the organization. Also, data consistency and integrity are far more important and need to be catered for while managing inventory across the multi-channel model. These reasons are strong enough to build this system in-house.

The above content is a result of my experience and learning in the given scenarios. The situation might be different for every organization and person (YMMV), and hence comments and feedback are heartily welcome.

Photo by Alex wong on Unsplash

Google Sheet and HKConnector – Bulk processing Engine @ Healthkart

Process optimization becomes a necessity when any organization grows and flourishes. This optimization could be in any vertical (technical, functional, business or financial) and take any form (efficiency, stack, methodologies, roadmaps, etc.), but each of these needs to leverage the existing system workflow to scale up, and to be intelligent and extensible enough to handle the unforeseen future.

We at Healthkart have completed 9 successful years in the industry, expanding each day with new ideas and onboarding new users from varied segments pan India. Hence, to keep pace with the fast moving business we have been working on many initiatives with technology as the enabler.

In this blog post, we will introduce you to our HKConnector, the Bulk Processor.

What led to its foundation?

Healthkart, an omni-channel health commerce platform, one of its kind in the Indian industry and listing any number of features, was facing a human efficiency lag in cataloguing the business requirements owing to the increasing products and add-on(s). So there was a dire need to speed it up with smart tech solutions.

The Groundwork

After a few product-tech interactions we understood that there are many stakeholders involved, with the following challenges:

  1. Any new listing on the site.
  2. Another hurdle for them is the email threads of Excel reviews and approvals.
  3. New development with each bulk feature.

To minimize this we needed a common platform for sharing and reporting that was easily adaptable and easy to use. Hence we chose Google Sheets, the most familiar cloud spreadsheet for any business or marketing person, which gives us out-of-the-box features for sharing and controlling the data.

The HKConnector (aka HKC)

Now we had to leverage all the existing CRUD APIs for these bulk operations, interacting with Google Sheets and providing the update in a minimal time frame. With all this in mind we designed HKConnector, a Spring Boot based application leveraging the Google Sheets API v4. The application acts as a black box connector to any API hosted on servers, not just those in the Healthkart ecosystem.

All you need to do is add the API server base URL on HKC and whitelist the HKC server IP on this API server. Next, identify the API which is used for CRUD operations and its required request parameters. Add this in the BulkConfiguration section and click save. Upon a successful save, 2 buttons are generated (currently supported with HTML and JS):

  • Bulk Sheet Generate Button: Copy and paste this on the UI of your server; the end user will use it to generate the sheet. An info icon is also generated on the button, which can be leveraged to educate the user on how to add data to the sheet and on other guidelines.
  • Process Button: Copy and paste this along with the text box, where the user has to enter the URL of the Google Sheet to be processed.

We leverage Google authorization for sheet processing.

The network diagram below explains the steps followed for bulk processing with HKC:

Sheet Generation:

  • A request to generate a Google Sheet for bulk upload is sent to HKC by clicking the BulkSheetGenerationButton. [Ref. 1]
  • HKC validates the request [Ref. 6,7] and redirects to check for authorization with Google. [Ref. 2]
  • Google sends a request directly to the user to grant permission; in subsequent requests this step is skipped by Google until the permission token expires. [Ref. 3,4]
  • Depending on the grant/deny from the user, HKC receives success/failure from Google. On success, HKC internally redirects the request to the Google Sheets API v4 to generate the sheet, which is created in the user’s Google Drive. [Ref. 5,6,7,2,5,8]

Bulk Data Processing:

  • Once the user has filled in the data, the sheet URL has to be copied and pasted into the text box on the user screen (the one whose source code was generated at the time of button generation). Once done, a process button becomes visible. On clicking it, a request is sent to HKC. [Ref. 1]
  • Next, the same steps are followed for authorization as in sheet generation.
  • Once successfully authorised, HKC is able to read the data.
  • Each row of data constitutes the form param(s) posted to the API on the server [Ref. Business Logic Server] configured on the HKC portal. [Ref. 10, 11]
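
The last step, turning each sheet row into form parameters, can be sketched as follows. This is a guess at the general shape rather than the actual HKC code: the class SheetRowMapper and the column names are hypothetical, the first row is assumed to hold the parameter names, and the sheet is assumed to have already been read into a list of rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mapper: the header row names the request parameters and
// every following row becomes one set of form parameters for the API.
class SheetRowMapper {
    static List<Map<String, String>> toFormParams(List<List<String>> rows) {
        List<Map<String, String>> requests = new ArrayList<>();
        if (rows.isEmpty()) {
            return requests;
        }
        List<String> header = rows.get(0);
        for (List<String> row : rows.subList(1, rows.size())) {
            Map<String, String> params = new LinkedHashMap<>();
            for (int col = 0; col < header.size(); col++) {
                // A row may have fewer cells than the header; treat missing cells as empty.
                params.put(header.get(col), col < row.size() ? row.get(col) : "");
            }
            requests.add(params);
        }
        return requests;
    }

    public static void main(String[] args) {
        List<List<String>> sheet = Arrays.asList(
                Arrays.asList("productId", "price", "stock"),
                Arrays.asList("101", "499", "20"),
                Arrays.asList("102", "899"));
        for (Map<String, String> params : toFormParams(sheet)) {
            System.out.println(params); // one POST to the configured API per row
        }
    }
}
```

Keeping the mapping driven by the header row is what would let the same connector serve any configured API without new development for each bulk feature.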

The code is available on :

Suggestions welcome!!

Photo by Charlota Blunarova on Unsplash

Transfer Learning – Our plunge into making an object detector

The ability to make a computer learn to recognize and localize objects, known as object detection, is one of the many applications of Computer Vision and Deep Learning. We had the task of building a custom object detector such that the model could predict food labels from food images and could be trained with minimum loss on our training data. The task in hand was to build a food detector for identifying different cuisines and dishes.

Basics : What exactly is object detection ?

source: Towards Data Science

Object detection is a computer vision technique used to localize objects in an image using algorithms built on machine learning and deep learning. Our deep learning approach uses Convolutional Neural Networks (CNN) to learn the features and patterns necessary for detecting objects. The input data, which in our case are food images, is fed to the convolutional layers as an image matrix. Each layer slides a filter, which can be a 3×3 matrix, over the input matrix in a process called convolution: at every position, the dot product of the filter and the input patch under it gives one value of the output matrix, which serves as the input for the next convolutional layer. The filters in the layers can detect anything from basic patterns such as edges and circles to complex patterns such as objects like cats and dogs.
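
The convolution step can be shown with a small worked example. The sketch below is a plain Java illustration, not part of our training pipeline: the class name, the 4×5 input and the vertical-edge filter are chosen for demonstration, and it uses no padding and a stride of 1.

```java
import java.util.Arrays;

// Minimal 2D convolution as used in CNN layers (no padding, stride 1):
// each output value is the dot product of the filter and the input patch under it.
class Convolution {
    static double[][] convolve(double[][] input, double[][] filter) {
        int outRows = input.length - filter.length + 1;
        int outCols = input[0].length - filter[0].length + 1;
        double[][] output = new double[outRows][outCols];
        for (int r = 0; r < outRows; r++) {
            for (int c = 0; c < outCols; c++) {
                double sum = 0.0;
                for (int i = 0; i < filter.length; i++) {
                    for (int j = 0; j < filter[0].length; j++) {
                        sum += input[r + i][c + j] * filter[i][j];
                    }
                }
                output[r][c] = sum;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Bright region on the left, dark region on the right: a vertical edge.
        double[][] image = {
            {1, 1, 1, 0, 0},
            {1, 1, 1, 0, 0},
            {1, 1, 1, 0, 0},
            {1, 1, 1, 0, 0},
        };
        // A 3x3 filter that responds to vertical edges.
        double[][] verticalEdge = {
            {1, 0, -1},
            {1, 0, -1},
            {1, 0, -1},
        };
        // prints [[0.0, 3.0, 3.0], [0.0, 3.0, 3.0]]
        System.out.println(Arrays.deepToString(convolve(image, verticalEdge)));
    }
}
```

The output is 0 where the patch is uniformly bright and 3 where the patch straddles the edge, which is how a filter "detects" a basic pattern.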

The complete process :

Preparing the training data and annotation

The first and primary part is gathering the image data and labelling it by drawing bounding boxes to help localize the position of an object. We used an annotation tool called LabelImg to annotate the data as XML files in PASCAL VOC format, which is the format supported by our algorithm. Our input data is split into 80% for training and 20% for testing and validation.
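
A reproducible 80/20 split of the annotated files can be sketched as below. This is an illustrative helper rather than the script we used: the class DatasetSplit, the file names and the seed are hypothetical, and the split is a simple shuffle with a fixed seed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: shuffle the annotated files with a fixed seed and
// cut the list into a training part and a testing/validation part.
class DatasetSplit {
    // Returns two lists: index 0 is the training set, index 1 the held-out set.
    static List<List<String>> split(List<String> files, double trainFraction, long seed) {
        List<String> shuffled = new ArrayList<>(files);
        Collections.shuffle(shuffled, new Random(seed));
        int trainSize = (int) Math.round(shuffled.size() * trainFraction);
        List<String> train = new ArrayList<>(shuffled.subList(0, trainSize));
        List<String> heldOut = new ArrayList<>(shuffled.subList(trainSize, shuffled.size()));
        return Arrays.asList(train, heldOut);
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            files.add("food_" + i + ".xml");
        }
        List<List<String>> parts = split(files, 0.8, 42L);
        System.out.println("train: " + parts.get(0).size() + " files");
        System.out.println("test/validation: " + parts.get(1).size() + " files");
    }
}
```

Fixing the seed keeps the same images in the held-out set across runs, so losses from different training runs stay comparable.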

Choosing a base model and Transfer Learning

source: Researchgate

While most detection algorithms like R-CNN, SPP-net and Faster R-CNN treat detection as a classification problem, in our case we treat it as a regression problem, for which we use a Single Shot Detector (SSD) algorithm. Building a model from scratch requires a huge amount of input data; instead we can use Transfer Learning, which takes the knowledge gained while solving one problem and applies it to a different but related problem. We used the SSD MobileNet architecture pre-trained on the COCO dataset with 80 categories and fine-tuned it for our problem. To achieve this, TensorFlow provides its Object Detection API. The input data and XML files are required to generate TFRecords, a simple format for storing a sequence of binary records.

Training and saving the model output

source: Tensorflow

We downloaded the base model, fine-tuned the final layers of the pre-trained model and retrained the network with our training data to achieve minimum loss. We can tweak hyperparameters like steps, evaluation steps and batch size to further reduce the loss. The saved model generated from training, which is a protobuf file (.pb), can be served in the browser by converting it to TensorFlow.js and on Android by converting it to TFLite. Converting the saved model to TFLite requires freezing the graph to generate a frozen model, from which the TOCO converter generates a TFLite file that can be deployed and used natively on Android.

Choosing the right platform

The fundamental requirement was choosing the right platform to execute the entire workflow in a scalable manner. The issues we needed to address based on the initial approach included:

  • Making an interface to distribute the labelling and data collection tasks among groups so that a workforce can be brought in to handle them.
  • Collecting new data automatically and automating training.
  • Converting the model for use on Android, the browser, etc.

Google Colab

The initial setup was to use the Colab platform for our processing and cloud training. Google Colab provides CPU, GPU and TPU based runtimes to execute a notebook on the cloud. The input data was fetched from our GitHub, and there is also an option to mount Google Drive. The issues with continuing on this platform were disk space limitations and automation: there was no way to run the notebook remotely after submission. Labelling and data submission issues also persisted, as we needed continuous data to generate models for more categories, and hosting the data on GitHub and Drive didn’t allow us to version it well. That said, the platform provides a top of the line Tesla K80 GPU for training, with an option of specialized TPUs.


FloydHub

FloydHub is a managed cloud platform for data scientists. It provides two ways to start a project:

FloydHub Workspace – Workspaces are fully configured development environments for deep learning on the cloud. We can create a new project and upload all the locally stored object-detection files to start using the FloydHub cloud workspace. We can run our Jupyter notebook directly from the workspace by selecting the TensorFlow version and the running mode, either CPU or GPU.

FloydHub CLI – The FloydHub command-line interface is used to run and monitor jobs and their results directly from the terminal. We can start training from the CLI as a job, but this has to be run from a configured system where all the scripts required for running the notebook are present locally, as it uploads all files to FloydHub storage.

Issues with FloydHub:

  • It requires uploading scripts every time a new job is run.
  • Shell commands have to be separated from the Jupyter notebook.
  • The data set has to be attached to every job.
  • Saved models have to be converted to TFLite and other formats.
  • The issue with the labelling tool still persisted.

IBM cloud and Cloud Annotations

After exploring the above 2 options we discovered an online image annotation platform which allows users to join and annotate collaboratively. The platform, Cloud Annotations, is made by IBM; it gives the option to create a group and add users by inviting them by mail so that they can upload data or import a dataset. The tool lets us mark images as labelled or unlabelled, which allows us to tell new images from existing ones. Since Cloud Annotations is part of IBM Cloud’s range of products, we have to take a storage service from the services catalog and keep the dataset in a storage bucket; the bucket then acts as the entry point for running the training job. Cloud Annotations even allows us to export the data after labelling for use in Google Colab or other platforms.

The integration between IBM services is well orchestrated; we also use a Watson Machine Learning instance to pipeline our entire training process. The Cloud Annotations platform provides a CLI to submit training jobs and download model output for web, Android and iOS along with the training logs and other configuration files.

After exploration, IBM’s solution suited our requirements and use case best. They have a robust service and a platform which is easy to use, with no environment setup issues, which were a major problem with the other options above.


  • Labelling and gathering data for object detection is a major task because, unlike image classification, object detection works by localizing objects against the background, and generating ground truth labels for these images is a big hurdle.
  • Continuous flow of input data to extend the existing model’s knowledge termed as “Incremental learning” is an active research topic to be incorporated in object detection algorithms.
  • To retrain the entire network again with new images data and more categories is resource intensive. Techniques of “Continual learning” will help in transfer of existing knowledge but it is also an active research topic.

Co-author : Anupam Patel

Photo by Pietro Jeng on Unsplash

Hazelcast – Distributed Locking and Inventory Management System

Distributed lock management becomes an essential building block for high-scale systems where data consistency is a top priority. In systems like online booking or inventory management, where hundreds and thousands of requests get processed in parallel, it becomes a must-have to maintain the consistency of data.

HealthKart being an omni-channel platform for selling health supplements and sports nutrition, we have a centralized Inventory Management System where bookings happen from many sources and different platforms in real time. Distributed locking is our savior for keeping our inventory consistent in real time.

Coming to Hazelcast, it is a distributed in-memory data grid used in very high traffic data servers to help improve latency and support more scalable transactions in an application.

We at HealthKart used it as our distributed caching mechanism and as a distributed locking medium to synchronize the application across multiple distributed environments.

As a caching mechanism we had two standalone Hazelcast cluster members, with one acting as master, and 3 of our application servers acting as cluster members. The cluster had a CP group of 3 members. These 3 CP members were responsible for all the lock synchronization across the distributed architecture.
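
The reason a lock is needed around a booking can be shown with a small, single-JVM sketch. It is only an analogy for the distributed case: the class InventoryBooking and the SKU name are hypothetical, and a local ReentrantLock per SKU stands in for the cluster-wide lock that Hazelcast provides across application servers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical booking logic: check-then-decrement must happen under one lock
// per SKU, otherwise two parallel bookings can both see the last unit.
class InventoryBooking {
    private final Map<String, Integer> stockBySku = new ConcurrentHashMap<>();
    private final Map<String, Lock> lockBySku = new ConcurrentHashMap<>();

    void setStock(String sku, int units) {
        stockBySku.put(sku, units);
    }

    int stock(String sku) {
        return stockBySku.getOrDefault(sku, 0);
    }

    // Local stand-in for obtaining a cluster-wide lock by name.
    private Lock lockFor(String sku) {
        return lockBySku.computeIfAbsent(sku, key -> new ReentrantLock());
    }

    boolean book(String sku, int units) {
        Lock lock = lockFor(sku);
        lock.lock();
        try {
            int available = stock(sku);
            if (available < units) {
                return false; // never oversell
            }
            stockBySku.put(sku, available - units);
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        InventoryBooking inventory = new InventoryBooking();
        inventory.setStock("whey-1kg", 10);
        AtomicInteger confirmed = new AtomicInteger();

        Thread[] channels = new Thread[50];
        for (int i = 0; i < channels.length; i++) {
            channels[i] = new Thread(() -> {
                if (inventory.book("whey-1kg", 1)) {
                    confirmed.incrementAndGet();
                }
            });
            channels[i].start();
        }
        for (Thread channel : channels) {
            channel.join();
        }
        System.out.println("confirmed bookings: " + confirmed.get());
        System.out.println("remaining stock: " + inventory.stock("whey-1kg"));
    }
}
```

With 50 parallel booking attempts against 10 units, only 10 can succeed. Across several application servers a local lock is not enough, which is why the lock has to live in a shared system such as the Hazelcast CP group.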

Advantages Offered by Hazelcast

  1. It offers a caching mechanism that can be replicated and sharded according to our needs.
  2. It offers an efficient way of storing distributed objects that can be shared among all cluster members.
  3. It offers a way to execute a task and share the load through its distributed ExecutorService.

As it shipped with so many things in a single package, it resolved our need for both a cache and locking very efficiently. Hence, it became our first choice.

Spring Boot has inbuilt support for Hazelcast. Once it is on the classpath, Spring Boot automatically detects it and sets up all the auto-configuration for it right away. For further tweaking of functionality we modified the way the caching works by adding a few interceptors and configuration via an external XML file.

More in-depth knowledge about how it works is provided in the attached PDF, which explains how it works and at which steps we did some tweaking according to our needs.

Photo by Kelvin Ang on Unsplash