Transfer Learning – Our plunge into making an object detector

Teaching a computer to recognize and localize objects, a task known as object detection, is one of the many applications of computer vision and deep learning. Our task was to build a custom food detector: a model that predicts food labels from food images, identifies different cuisines and dishes, and can be trained on our own data with minimal loss.

Basics: What exactly is object detection?

source: Towards Data Science

Object detection is a computer vision technique that localizes objects in an image using algorithms built on machine learning and deep learning. Our deep learning approach uses Convolutional Neural Networks (CNNs) to learn the features and patterns needed to detect objects. The input, in our case food images, is fed to the convolutional layers as a matrix of pixel values. Each layer slides a small filter (for example, a 3×3 matrix) across the input in a process called convolution: at every position, the filter is multiplied element-wise with the patch of the input beneath it and the products are summed to give one value of the output matrix. That output then serves as the input for the next convolutional layer. Filters in early layers detect basic patterns such as edges and circles, while deeper layers respond to complex patterns such as whole objects like cats and dogs.
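To make the convolution step concrete, here is a minimal sketch in plain Java. The class and method names are made up for illustration, and it handles a single filter with no padding and a stride of 1; a real framework such as TensorFlow applies many learned filters across several channels.

```java
/** Illustrative only: a single-filter 2D convolution with no padding and stride 1. */
final class Convolution2D {

    /**
     * Slides the filter over the input and, at each position, sums the
     * element-wise products of the filter and the patch underneath it.
     * Like most deep learning libraries, this does not flip the filter.
     */
    static double[][] convolve(double[][] input, double[][] filter) {
        int filterHeight = filter.length;
        int filterWidth = filter[0].length;
        int outHeight = input.length - filterHeight + 1;
        int outWidth = input[0].length - filterWidth + 1;
        double[][] output = new double[outHeight][outWidth];

        for (int row = 0; row < outHeight; row++) {
            for (int col = 0; col < outWidth; col++) {
                double sum = 0.0;
                for (int i = 0; i < filterHeight; i++) {
                    for (int j = 0; j < filterWidth; j++) {
                        sum += input[row + i][col + j] * filter[i][j];
                    }
                }
                output[row][col] = sum;
            }
        }
        return output;
    }
}
```

Feeding it an image patch that is dark on the left and bright on the right, together with a vertical edge filter such as {{-1, 0, 1}, {-1, 0, 1}, {-1, 0, 1}}, gives zero where the patch is uniform and a large value where the edge sits, which is how a filter "detects" a pattern.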

The complete process:

Preparing the training data and annotation

The first step is gathering the image data and labelling it by drawing bounding boxes that localize each object. We used an annotation tool called LabelImg to annotate the data as XML files in PASCAL VOC format, which is the format our training pipeline supports. We split the input data into 80% for training and 20% for testing and validation.
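The 80/20 split can be sketched as a shuffle followed by a cut. This is a generic illustration in plain Java, not the script we actually ran; the class name, the seed parameter and the file names in the usage note are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative only: shuffles items and splits them into train and test sets. */
final class DatasetSplit {

    /**
     * Returns a two-element list: [train, test].
     * A trainFraction of 0.8 gives the 80/20 split described above.
     * The seed makes the shuffle repeatable between runs.
     */
    static <T> List<List<T>> split(List<T> items, double trainFraction, long seed) {
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, new Random(seed));

        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<T> train = new ArrayList<>(shuffled.subList(0, cut));
        List<T> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));
        return List.of(train, test);
    }
}
```

Calling `DatasetSplit.split(imageFiles, 0.8, 42L)` on a list of ten annotated image names would put eight in the training set and two in the test set, with each image paired to its XML annotation by file name.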

Choosing a base model and Transfer Learning

source: Researchgate

Two-stage detection algorithms such as R-CNN, SPP-net and Faster R-CNN first propose candidate regions and then treat detection as a classification problem over those regions. We instead treat it as a regression problem, predicting boxes and classes in a single pass with the Single Shot Detector (SSD) algorithm. Building a model from scratch requires a huge amount of input data, so we relied on transfer learning, which takes the knowledge gained while solving one problem and applies it to a different but related problem. We used the SSD MobileNet architecture pre-trained on the COCO dataset (80 categories) and fine-tuned it for our problem. TensorFlow provides its Object Detection API for this. The images and XML files are used to generate TFRecords, a simple format for storing a sequence of binary records.

Training and saving the model output

source: Tensorflow

We downloaded the base model, fine-tuned the final layers of the pre-trained network, and retrained it on our training data to minimize loss. Hyperparameters such as the number of training steps, evaluation steps and batch size can be tweaked to reduce the loss further. The saved model produced by training, a protobuf (.pb) file, can be served in the browser by converting it to TensorFlow.js and on Android by converting it to TFLite. Converting to TFLite requires freezing the graph to produce a frozen model, which the TOCO converter then turns into a TFLite file that can be deployed and run natively on Android.

Choosing the right platform

The fundamental requirement was choosing a platform that could run the entire workflow in a scalable manner. The issues we needed to address, based on our initial approach, were:

  • Building an interface to distribute labelling and data collection among groups, so that a workforce can be brought in to handle the task.
  • Collecting new data automatically and automating training.
  • Converting the model for use on Android, in the browser, and so on.

Google Colab

Our initial setup used Google Colab for processing and cloud training. Colab provides CPU, GPU and TPU runtimes for executing a notebook in the cloud. The input data was fetched from our GitHub repository; mounting Google Drive is also an option. The problems with staying on this platform were disk space limits and automation: there was no way to run the notebook remotely after submission. Labelling and data submission issues also persisted, because we needed a continuous flow of data to train the model on more categories, and hosting the data on GitHub and Drive did not let us version it well. On the plus side, the platform offers a Tesla K80 GPU for training, with specialized TPUs as an option.


FloydHub

FloydHub is a managed cloud platform for data scientists. It provides two ways to start a project:

FloydHub Workspace: Workspaces are fully configured development environments for deep learning in the cloud. We can create a new project, upload the object detection files stored locally, and start working in the FloydHub cloud workspace. We can run our Jupyter notebook directly from the workspace after selecting the TensorFlow version and the running mode (CPU or GPU).

FloydHub CLI: The FloydHub command-line interface lets us run and monitor jobs and view their results directly from the terminal. We can start training from the CLI as a job, but it must be run from a configured system that has all the scripts the notebook needs available locally, because the CLI uploads those files to FloydHub storage.

Issues with FloydHub:

  • Scripts have to be uploaded again with every new job.
  • Shell commands have to be separated from the Jupyter notebook.
  • The dataset has to be attached to every job.
  • Saved models still have to be converted to TFLite and other formats.
  • The labelling tool problem still persisted.

IBM Cloud and Cloud Annotations

After exploring the two options above, we discovered an online image annotation platform that lets users join and annotate collaboratively. Cloud Annotations is an online platform made by IBM. It lets us create a group and invite users by email so that they can upload data or import a dataset. The tool marks images as labelled or unlabelled, which lets us tell new images apart from existing ones. Since Cloud Annotations is part of IBM Cloud’s range of products, we have to provision a storage service from the services catalog; the dataset is kept in a storage bucket, and the bucket acts as the entry point for running the training job. Cloud Annotations also lets us export the labelled data for use in Google Colab or on other platforms.

The integration between IBM services is well orchestrated. We also use a Watson Machine Learning instance to pipeline our entire training process. Cloud Annotations provides a CLI to submit training jobs and download the model output for web, Android and iOS, along with the training logs and other configuration files.

After this exploration, IBM’s solution turned out to be the best fit for our requirements and use case. It is a robust service on an easy-to-use platform, with none of the environment setup issues that were a major problem with the other options.


  • Labelling and gathering data is a major task in object detection because, unlike image classification, object detection has to localize objects against the background, and generating ground truth labels for these images is a big hurdle.
  • Extending an existing model’s knowledge from a continuous flow of input data, termed “incremental learning”, is an active research topic that is yet to be incorporated into object detection algorithms.
  • Retraining the entire network with new image data and more categories is resource intensive. “Continual learning” techniques will help transfer existing knowledge, but this is also an active research topic.

Co-author : Anupam Patel

Photo by Pietro Jeng on Unsplash

Hazelcast – Distributed Locking and Inventory Management System

Distributed lock management becomes an essential building block for high-scale systems where data consistency is a top priority. In systems such as online booking or inventory management, where hundreds or thousands of requests are processed in parallel, it is a must-have for keeping data consistent.

HealthKart is an omni-channel platform for selling health supplements and sports nutrition, so we run a centralized Inventory Management System where bookings arrive in real time from many sources on different platforms. Distributed locking is what keeps our inventory consistent in real time.

Hazelcast is a distributed in-memory data grid used in very high-traffic systems to improve latency and make application transactions more scalable.

At HealthKart we use it as our distributed caching mechanism and as the distributed locking medium that synchronizes the application across multiple distributed environments.

For caching, we had two standalone Hazelcast cluster members, one of them acting as the master, and 3 of our application servers acting as cluster members. The cluster was a 3 CP member group. These 3 CP members were responsible for all lock synchronization across the distributed architecture.
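As a rough sketch of how a lock guards a booking, the class below is written against the standard `java.util.concurrent.locks.Lock` interface so that it runs on its own. In a Hazelcast CP setup the lock would typically come from `hazelcastInstance.getCPSubsystem().getLock(name)`, whose `FencedLock` also implements `Lock`. The class name, the lock naming scheme, the timeout and the in-memory stock map are all assumptions made for illustration, not our production code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.function.Function;

/** Hypothetical booking service: one lock per SKU serializes stock updates. */
final class InventoryBooking {

    // Maps a lock name to a lock; with Hazelcast this would return a distributed lock.
    private final Function<String, Lock> lockProvider;

    // Stand-in for the real inventory store, which would live in a database.
    private final Map<String, Integer> stock = new ConcurrentHashMap<>();

    InventoryBooking(Function<String, Lock> lockProvider) {
        this.lockProvider = lockProvider;
    }

    void setStock(String sku, int units) {
        stock.put(sku, units);
    }

    int available(String sku) {
        return stock.getOrDefault(sku, 0);
    }

    /** Books units if available; returns false when stock is short or the lock is not acquired. */
    boolean book(String sku, int units) {
        Lock lock = lockProvider.apply("inventory:" + sku);
        boolean acquired;
        try {
            acquired = lock.tryLock(2, TimeUnit.SECONDS); // assumed timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (!acquired) {
            return false; // the caller may retry later
        }
        try {
            int current = available(sku);
            if (current < units) {
                return false;
            }
            stock.put(sku, current - units);
            return true;
        } finally {
            lock.unlock(); // always release, even if the update throws
        }
    }
}
```

Because the check and the update happen while the lock for that SKU is held, two requests for the last unit cannot both succeed, which is the double-booking problem distributed locking is meant to prevent.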

Advantages Offered by Hazelcast

  1. It offers a caching mechanism that can be replicated and sharded according to our needs.
  2. It offers an efficient way of storing distributed objects that can be shared among all cluster members.
  3. It offers a way to execute tasks and share the load across the cluster with its distributed ExecutorService.

Because it ships so many things in a single package, it solved our caching and locking needs very efficiently, and so it became our first choice.

Spring Boot has built-in support for Hazelcast: once Hazelcast is on the classpath, Spring Boot detects it and applies its auto-configuration right away. To tweak the behavior further, we modified the way caching works by adding a few interceptors and supplying configuration through an external XML file.

A more in-depth explanation of how it works is provided in the attached PDF, which covers the details and the steps at which we tweaked it for our needs.

Photo by Kelvin Ang on Unsplash

Using Our Brains – Neuromarketing, Hands-Free Gaming & Rise Of EEG Headset


Many people have eagerly awaited the day they can control technology using just their brains. Electroencephalography (EEG) is a monitoring method to record the electrical activity of the brain. The clinical definition of EEG is the recording of brain activity over a period of time. Wearable EEG headsets position noninvasive electrodes along the scalp. EEG electrodes pick up on and record the electrical activity in your brain. The collected signals are amplified and digitized then sent to a computer or mobile device for storage and data processing.


The billions of cells in your brain produce very small electrical signals that form non-linear patterns called brainwaves. EEG measures this electrical activity in the cerebral cortex, the outer layer of the brain. The voltage changes it records come from ionic currents within and between brain cells called neurons. EEG sensors, small metal discs also called EEG electrodes, are placed on your scalp and can record up to several thousand snapshots of the brain’s electrical activity within a single second. The recorded brainwaves are sent to amplifiers and then to a computer or the cloud for processing.


EEG in Consumer Research

EEG data can be a powerful tool for consumer insights. Brain responses provide a kind of consumer feedback that surveys cannot: EEG is being used to measure the gap between what consumers really pay attention to and what they self-report liking or noticing. Combining EEG with other biometric sensors such as eye tracking, facial expression analysis and heart rate measurement can give companies an even fuller understanding of customer behavior. The use of neurotechnology like EEG to study consumer reactions is called neuromarketing.


Here is a sample of the performance metrics recorded for a customer tasting a protein bar, captured using the Emotiv Performance Metrics API.


EEG in Games & Application Development

EEG headsets for other uses, especially gaming, represent a somewhat less complex problem than market research testing. The emphasis here is on mapping a thought to an action. For example, a gross motor thought like imagining you are “pushing a ball away from you” can be mapped to moving a game character forwards, or a facial expression like raising your eyebrows can be mapped to firing a gun in the game. This enables a hands-free gaming experience.

Every EEG headset comes with its own SDK for development. We used the EPOC+ headset by Emotiv, which provides Cortex as a wrapper around its SDK, housing all the tools required to develop with Emotiv. Cortex integrates the headset’s data streams with third-party software, enabling you to record data and create applications. It provides API access to different Emotiv data streams, tiered across three license levels. You can also use the Community SDK to build your application.

Here is a sample on how you can integrate your Emotiv headset with Unity to develop a simple 3D FPS game:

Download the plugin from below:

Simply copy the plugin into your project. The plugin scripts are wrapper classes that connect to the SDK library; include them in your own classes and call their APIs.

The three main scripts are:

  • EmoEngine.cs
  • EmoMentalCommand.cs
  • EmoGyroData.cs

The first manages the Emotiv classes and events, the second manages player movement using mental commands, and the last moves and rotates the camera using the gyroscope.

Here is a working demo of the game:

EEG in Healthcare

Because EEG tests show brain activity during a controlled procedure, EEG results can contain information used to diagnose various brain disorders. Abnormal EEG data is displayed through irregular brainwaves. Abnormal EEG data can indicate signs of brain dysfunction, head trauma, sleep disorders, memory problems, brain tumors, stroke, dementia, seizure disorders like epilepsy and various other conditions. Depending on the intended diagnosis, doctors sometimes combine EEG with cognitive tests, brain activity monitoring and neuroimaging techniques.


The consumer EEG equipment market is developing rapidly: it is still a young field of technology that is in the process of discovering its possibilities. It has huge future scope, from healthcare to predictive analysis driven by observations.

Resource & References

EEG Headset

Community SDK

Unity Guide

Photo by Josh Riemer on Unsplash

Graph Databases – Engine behind Healthkart’s Inventory Management


Managing inventory to produce consistent and reliable results in any ERP system across multiple channels of sale can be a challenging task. Supply chains are becoming more and more vast and complex. With the omni-channel approach to sales kicking in, the system needs to be mindful of issues like the visibility of inventory within its system, conflicts with other channels for constrained inventory, order accuracy, and a myriad of other things that all affect the bottom line. Fetching inventory based on various parameters requires a lot of connected queries, and in a relational database these take a toll on the performance of your system. A graph database, however, is designed to search, query and analyze connected data. Its architecture is centered around relationships and hence facilitates easier and faster retrieval of inventory.


In Healthkart’s context, the scope of inventory management means ensuring the reliability of inventory data once it becomes available for sale in any warehouse, and managing the transfer of inventory from central hubs to different warehouses and retail stores. We have various channels of sale, and each channel has different requirements with respect to order fulfillment. Being in a consumable product industry, it is also mandatory to comply with certain shelf life regulations. Striking a balance in the system becomes crucial here: we have to follow the First Expiry First Out (FEFO) model while also fulfilling the inventory days requirement on different hubs without incurring a lot of shipping cost.

The basic building blocks of inventory management here include the crucial steps of:

  • reserving inventory for any order that is punched into the system
  • generating an optimized pick-list using which the pickers in any warehouse could get the stock out of the shelves and 
  • checking out of stock for any order. 

All of these functions involve meeting a lot of constraints while fetching inventory, like looking for appropriate channel/MRP/status/allowed shelf life and meeting the case size requirements. While conforming to all of this – it is also essential to follow FEFO and keep in mind the picking route optimization. 
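To illustrate the FEFO rule combined with a shelf life constraint, here is a small sketch in plain Java. It is a simplification under stated assumptions: the `Batch` type, its fields and the `minExpiry` cutoff are hypothetical, and real reservation also filters on channel, MRP, status and case size, and considers the picking route.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: reserves stock from the earliest-expiring eligible batches. */
final class FefoAllocator {

    /** Hypothetical batch of one product sitting in a warehouse. */
    static final class Batch {
        final String id;
        final LocalDate expiry;
        final int units;

        Batch(String id, LocalDate expiry, int units) {
            this.id = id;
            this.expiry = expiry;
            this.units = units;
        }
    }

    /**
     * Returns batch id -> units to pick, in pick order.
     * Batches expiring before minExpiry are skipped (shelf life constraint).
     * Returns an empty map if the eligible stock cannot cover the demand.
     */
    static Map<String, Integer> allocate(List<Batch> batches, int demand, LocalDate minExpiry) {
        List<Batch> eligible = new ArrayList<>();
        for (Batch batch : batches) {
            if (!batch.expiry.isBefore(minExpiry)) {
                eligible.add(batch);
            }
        }
        eligible.sort(Comparator.comparing((Batch b) -> b.expiry)); // first expiry first

        Map<String, Integer> picks = new LinkedHashMap<>();
        int remaining = demand;
        for (Batch batch : eligible) {
            if (remaining == 0) {
                break;
            }
            int take = Math.min(batch.units, remaining);
            if (take > 0) {
                picks.put(batch.id, take);
                remaining -= take;
            }
        }
        return remaining == 0 ? picks : new LinkedHashMap<>();
    }
}
```

Returning an empty map when demand cannot be met keeps the reservation all-or-nothing, which is one possible design choice; a system that allows partial fulfillment would return the partial picks instead.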


Incorporating all of the above requirements in our relational data store was becoming more and more difficult to maintain as the levels of joins kept increasing. Making schema changes for evolving requirements became a dreaded task, and migrations were a nightmare. Retrieving eligible inventory for order taking and booking for the respective channel, with an ever-growing set of criteria, was taking a very long time as the data set kept growing. This led to long transactions in the application, which in turn gave birth to various concurrency-related issues. Inventory retrieval and booking APIs were trending towards an average response time of 700-800 ms, with a considerable share of requests taking longer than the average.


Looking at the use case above, we realized that our data is highly connected, and that inventory retrieval could be made faster with a data model that translates this connectedness into an efficient technical model. This is where graph databases came into the picture, and we chose Neo4J to model our inventory system. Inventory modeling in Neo4J allowed us to lay out the complete data as per the various levels of abstraction.

We did not have the liberty to eliminate our relational database completely, so we introduced the graph database as a middle layer between the application and the relational DB. The complete inventory is represented in the graph model, and all inventory queries now go to the graph layer. This involved rigorous and focused brainstorming about which information should be represented as nodes and which should be persisted on the relationships between nodes. We used Cypher queries and took advantage of the relationships in the data to query it.

This has made our inventory retrieval much faster and reduced the turnaround time of every transaction involving inventory. We brought the average response time down to the range of 100-150 ms, with most requests falling close to the average. As a consequence, it helped us solve a lot of problems related to concurrency and double booking of a single unit. In contrast to relational databases, where the performance of join-intensive queries deteriorates as the dataset gets bigger, the performance of a graph database tends to remain relatively constant even as the dataset grows, because queries are localized to a portion of the graph. Along with performance, graph databases also gave us a flexible data model with plenty of room for extensibility in the face of evolving business models.


Introducing a new database layer also brings the additional challenge of keeping the two data layers consistent and having proper fail-over mechanisms. It required integration hooks to ensure that the graph DB layer is updated whenever inventory data changes in any of the systems. It also demanded a transaction manager that spans both data sources for cases where writes had to be performed on both, with roll-back strategies defined in case the commit fails on one of them. We used Spring Data’s ChainedTransactionManager for this. It links together a list of transaction managers (one for each data source) to synchronize their transactions: if the business processing is successful they all commit, and if not they all roll back. Note that this is a best-effort approach rather than a true two-phase commit, so a failure during the commit phase itself can still leave the two stores out of sync. The config below helped us achieve this.

1. Defining the transaction manager for the graph data source:

@Bean
public SessionFactory sessionFactory(org.neo4j.ogm.config.Configuration config) {
    // The string arguments are the packages to scan for OGM entities (redacted here).
    return new SessionFactory(config, "xxxxxxx", "xxxxxx");
}

// Named differently from the chained "transactionManager" bean defined in step 3.
@Bean(name = "neo4jTransactionManager")
public Neo4jTransactionManager neo4jTransactionManager(SessionFactory sessionFactory) {
    return new Neo4jTransactionManager(sessionFactory);
}

2. Defining the transaction manager for the MySQL data source:

@Bean(name = "entityManagerFactory")
public LocalContainerEntityManagerFactoryBean entityManagerFactory(
        EntityManagerFactoryBuilder builder,
        @Qualifier("dataSource") DataSource dataSource) {
    // packages(...) takes the packages to scan for JPA entities (left blank here).
    return builder.dataSource(dataSource).packages("").build();
}

@Bean(name = "mysqlTransactionManager")
public JpaTransactionManager mysqlTransactionManager(
        @Qualifier("entityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
    return new JpaTransactionManager(entityManagerFactory.getObject());
}

3. Defining the platform-level transaction manager:

@Bean(name = "transactionManager")
public PlatformTransactionManager transactionManager(
        Neo4jTransactionManager neo4jTransactionManager,
        JpaTransactionManager mysqlTransactionManager) {
    // Transactions start in the order listed and commit or roll back in reverse order.
    return new ChainedTransactionManager(mysqlTransactionManager, neo4jTransactionManager);
}
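The ordering behind a chained transaction manager can be modelled in a few lines of plain Java. This is a simplified stand-in written for illustration, not Spring’s implementation: the `Tx` interface and the `logging` helper are made up, and real transaction managers deal with propagation, status objects and commit failures that are ignored here.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of chained transactions; Tx is a stand-in, not a Spring interface. */
final class ChainedTxSketch {

    interface Tx {
        void begin();
        void commit();
        void rollback();
    }

    private final List<Tx> managers;

    ChainedTxSketch(List<Tx> managers) {
        this.managers = managers;
    }

    /** Begins transactions in list order, then commits or rolls back in reverse order. */
    void execute(Runnable work) {
        List<Tx> started = new ArrayList<>();
        try {
            for (Tx tx : managers) {
                tx.begin();
                started.add(tx);
            }
            work.run();
        } catch (RuntimeException e) {
            // Business processing failed: undo every transaction that was started.
            for (int i = started.size() - 1; i >= 0; i--) {
                started.get(i).rollback();
            }
            throw e;
        }
        for (int i = started.size() - 1; i >= 0; i--) {
            started.get(i).commit();
        }
    }

    /** Helper that records each call in a log so the ordering is visible. */
    static Tx logging(String name, List<String> log) {
        return new Tx() {
            @Override public void begin() { log.add("begin " + name); }
            @Override public void commit() { log.add("commit " + name); }
            @Override public void rollback() { log.add("rollback " + name); }
        };
    }
}
```

With the managers listed as MySQL first and Neo4j second, the model begins MySQL, begins Neo4j, runs the work, then commits Neo4j before MySQL. The manager listed first is therefore committed last, so a failure in its commit is the case that can leave the stores inconsistent.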



All in all it was a great learning experience and the graph property model helped us capture our complex domain in an expressive and flexible fashion. The schema-free nature of graph databases coupled with the ability to simultaneously relate data elements in lots of different ways allowed us to  make room for business evolution, along with reducing risk and maintaining performance.

Photo by Denys Nevozhai on Unsplash

Flowable – Engine for Automated Workflows @HealthKart

Automated business workflows are essential for any organization. There are multiple options for automating business workflows using tools and SaaS products available in the market. However, it becomes more challenging if you have several internal systems that are run and managed in-house and the workflows need to integrate with them. One can build the solution in-house, or take the middle ground of using an open source tool that can be customized and integrated with all internal systems.

We have lots of use cases at HealthKart, ranging from automated user journeys to business workflows for tracking the NPS of our products. These journeys and workflows require many integration touch points with internal systems. We also needed a lot of customization on the UI front to meet each BU’s requirements.

The first question we faced was whether to develop in-house or use a SaaS solution for implementing the journeys and workflows. A SaaS solution was ruled out at the first step because of the amount of customization and integration needed within our systems. We understood that this could have been done with a SaaS solution too; the real question was ROI and agility in customization.

Instead, we took the middle ground of using an open source workflow engine that has all the basic building blocks and the option to customize if and when required. We chose to go ahead with Flowable; do continue reading to find out why…

Why Flowable

Flowable is an open source project that provides out-of-the-box solutions for building workflow and Business Process Management (BPM) platforms. It can handle various outcomes from a single flow, and the system can decide the next step based on those outcomes. We have also overridden some Flowable features to send custom triggers to our users.

Form Custom Outcomes

We currently use many custom outcomes to capture customer behaviour, such as: did not pick up the call, call back after some time, not answered, has not started using the product yet, and did not purchase the product for themselves (gifted product). Based on the outcome, the system decides the next step of the flow.

Timer Events

We use timer events in almost every task. Some timers are static and set a fixed delay between two tasks; others are dynamic and based on customer input. For example, if the customer asks for a call after some time or on another day, the system sets the delay accordingly.
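The difference between static and dynamic timers can be sketched as a function from an outcome to a delay. The outcome names and delay values below are hypothetical and this is not Flowable API code; it only shows the decision, plus the ISO-8601 duration text (such as PT4H) that BPMN timer definitions use to express a delay.

```java
import java.time.Duration;

/** Illustrative only: picks the delay before the next follow-up task. */
final class FollowUpTimer {

    /**
     * Outcome names are hypothetical stand-ins for the form outcomes described above.
     * requestedByCustomer may be null when the customer gave no preference.
     */
    static Duration delayFor(String outcome, Duration requestedByCustomer) {
        switch (outcome) {
            case "CALL_LATER":
                // Dynamic timer: use the customer's requested delay, else an assumed default.
                return requestedByCustomer != null ? requestedByCustomer : Duration.ofDays(1);
            case "NOT_ANSWERED":
                // Static timer: fixed retry delay (assumed value).
                return Duration.ofHours(4);
            default:
                return Duration.ZERO; // continue with the next task immediately
        }
    }

    /** ISO-8601 duration text, for example "PT4H" for four hours. */
    static String toIso(Duration delay) {
        return delay.toString();
    }
}
```

A value computed this way could be stored as a process variable and referenced from the timer definition, so that one flow definition serves both the fixed and the customer-driven delays.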

Parallel Gateway

We have different departments that handle customer feedback according to the issue type. If a customer raises a complaint, we need to assign tasks to the relevant departments, which means starting parallel flows. To do this we use the Flowable parallel gateway.

We use Flowable to build our customer journeys and in-house approval systems.

If you have any questions, feel free to ask. Our motto is to give users the best customer experience.

Photo by Roman Bozhko on Unsplash

Why we migrated our CDN from AWS to Cloudflare

A Content Delivery Network (CDN) is a critical piece of your app ecosystem, as it ensures faster delivery of your media and static content to your app users. Since our complete infrastructure is hosted on AWS, we were using AWS for CDN services as well.

On a side note, I am a big AWS fan, and during my startup journey we have always preferred AWS over other IaaS providers for its reliability, ease of use and better pricing. However, there are a few points to remember when it comes to pricing and application security on AWS.

AWS charges for data out

AWS charges customers based on the data delivered to end users. If you have a 1 MB file on the CDN and it is served to end users 1000 times in a month, you pay for 1 GB of data out for that month. If you have a significant number of active users and serve media and static content, you may see a good amount of money being spent on data out in your AWS bills.
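The arithmetic is easy to turn into a quick estimate. In this sketch the per-GB rate is an input you supply, not an actual AWS price, and the class and method names are made up for illustration.

```java
/** Illustrative only: estimates monthly CDN data-out volume and cost. */
final class DataOutEstimate {

    /** Data transferred in GB, using decimal units (1 GB = 1000 MB). */
    static double monthlyGb(double fileSizeMb, long requestsPerMonth) {
        return fileSizeMb * requestsPerMonth / 1000.0;
    }

    /** Cost for the month; pricePerGb is whatever rate applies to your account. */
    static double monthlyCost(double fileSizeMb, long requestsPerMonth, double pricePerGb) {
        return monthlyGb(fileSizeMb, requestsPerMonth) * pricePerGb;
    }
}
```

For the example above, `DataOutEstimate.monthlyGb(1, 1000)` gives 1 GB. The same file served a million times in a month is 1000 GB, which is where data out starts to show up on the bill.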

AWS WAF is not feature rich

A WAF (Web Application Firewall) is an essential tool and should not be overlooked, as it protects your web application from critical security vulnerabilities, including the OWASP Top 10 and zero-day vulnerabilities. AWS WAF is not a self-service tool, and you will end up writing lots of rules and code on top of it.

Let’s talk about Cloudflare a bit

Cloudflare, on the other hand, is essentially a security company and provides many offerings around security. Cloudflare WAF is easy to use and is a self-managed service. It protects your web application and API infrastructure from different kinds of security vulnerabilities, including the OWASP Top 10. It also provides performance tools such as caching and image optimization with its CDN services.

And you know what… they don’t charge for data out usage for your media and static content… Yes, you heard it right!!!

So if you are looking for a WAF solution for your app infrastructure, Cloudflare can give you these advantages if you use its CDN alongside.

P.S. I am not here to promote Cloudflare’s services; I have simply written about our experience and the advantages we gained while using its WAF and CDN services.

Photo by Anastasia Dulgier on Unsplash