Graph Databases – Engine behind Healthkart’s Inventory Management

Overview

Managing inventory so that an ERP system produces consistent and reliable results across multiple sales channels is a challenging task. Supply chains are becoming larger and more complex. With the omni-channel approach to sales kicking in, the system needs to be mindful of issues like the visibility of inventory within the system, conflicts between channels over constrained inventory, order accuracy, and a myriad of other things that all affect the bottom line. Fetching inventory based on various parameters requires a lot of connected queries, and in a relational database the joins behind those queries take a toll on performance. A graph database, however, is designed to query and analyze connected data. Its architecture is centered on relationships, which makes retrieval of inventory easier and faster.

Background

In Healthkart’s context, the scope of inventory management covers two things: ensuring the reliability of inventory data once stock becomes available for sale in any warehouse, and managing the transfer of inventory from central hubs to different warehouses and retail stores. We have various channels of sale, and each channel has different requirements for order fulfillment. Because we are in a consumable product industry, we must also comply with certain shelf life regulations. The system therefore has to strike a balance: follow the First Expiry First Out (FEFO) model while also meeting the inventory days requirement at different hubs, without incurring a lot of shipping cost.

The basic building blocks of inventory management here include these crucial steps:

reserving inventory for any order that is punched into the system
generating an optimized pick-list that pickers in any warehouse can use to get the stock off the shelves
checking stock out for any order

All of these functions involve meeting a lot of constraints while fetching inventory, such as matching the appropriate channel, MRP, status and allowed shelf life, and meeting the case size requirements. While conforming to all of this, it is also essential to follow FEFO and to keep picking route optimization in mind.
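To make the constraint-plus-FEFO idea concrete, here is a minimal in-memory sketch in Java. It is not Healthkart's implementation: the `StockBatch` and `FefoPicker` classes, their fields, and the shelf-life rule (a minimum number of days remaining before expiry) are all assumptions made for illustration, and case size and picking route are left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical inventory batch; field names are illustrative only. */
final class StockBatch {
    final String channel;
    final int mrp;
    final LocalDate expiry;
    final int units;

    StockBatch(String channel, int mrp, LocalDate expiry, int units) {
        this.channel = channel;
        this.mrp = mrp;
        this.expiry = expiry;
        this.units = units;
    }
}

final class FefoPicker {
    /**
     * Filters stock by channel, MRP and minimum remaining shelf life,
     * then allocates the requested quantity from the earliest-expiring batches first.
     */
    static List<StockBatch> pick(List<StockBatch> stock, String channel, int mrp,
                                 LocalDate today, int minShelfLifeDays, int qty) {
        LocalDate cutoff = today.plusDays(minShelfLifeDays);
        List<StockBatch> eligible = stock.stream()
                .filter(b -> b.channel.equals(channel))
                .filter(b -> b.mrp == mrp)
                .filter(b -> !b.expiry.isBefore(cutoff))
                .sorted(Comparator.comparing((StockBatch b) -> b.expiry))
                .collect(Collectors.toList());

        List<StockBatch> picked = new ArrayList<>();
        int remaining = qty;
        for (StockBatch b : eligible) {
            if (remaining <= 0) {
                break;
            }
            int take = Math.min(remaining, b.units);
            // Record how many units are taken from this batch.
            picked.add(new StockBatch(b.channel, b.mrp, b.expiry, take));
            remaining -= take;
        }
        if (remaining > 0) {
            throw new IllegalStateException("insufficient eligible inventory");
        }
        return picked;
    }
}
```

Each extra criterion here is just one more filter on a small list. In a relational store the same criteria tend to become additional joins, which is the maintenance and performance problem described in the next section.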

Challenges

Incorporating all of the above requirements in our relational data store was becoming more and more difficult to maintain as the levels of joins kept increasing. Making schema changes for evolving requirements became a dreaded task, and migrations were a nightmare. As the data set and the list of criteria kept growing, retrieving eligible inventory for order booking on a given channel was taking a very long time. This led to long transactions in the application, which in turn gave rise to various concurrency-related issues. Inventory retrieval and booking APIs were trending towards an average response time of 700-800 ms, with a considerable share of requests taking longer than that.

Solution

Looking at the use case above, we realized that there is a lot of connectivity in our data, and that inventory retrieval could be made faster with a data model that translates this connectedness into an efficient technical model. Graph databases came into the picture here, and we chose Neo4j to model our inventory system. Inventory modeling in Neo4j allowed us to lay out the complete data according to its various levels of abstraction.

We did not have the liberty to completely eliminate our relational database, so we introduced the graph database as a middle layer between the application and the relational DB. The complete inventory is represented in the graph model, and all querying of inventory now happens from the graph layer. Getting there involved rigorous and focused brainstorming about what information should be represented as nodes and what information should be persisted on the relationships between nodes. We used Cypher queries and took advantage of the relationships in the data to query it.

This made our inventory retrieval much faster and hence reduced the turnaround time of every transaction involving inventory. We were able to bring the average response time down to the range of 100-150 ms, with most requests clustered around the average. As a consequence, it helped us solve a lot of problems related to concurrency and double booking of a single unit.

In contrast to relational databases, where the performance of join-intensive queries deteriorates as the data sets get bigger, the performance of a graph database tends to remain relatively constant even as the data set grows. This is because queries are localized to a portion of the graph. Along with bringing performance to the table, graph databases also gave us a flexible data model with a lot of room for extensibility in the face of evolving business models.
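The point about queries being localized to a portion of the graph can be illustrated with a small in-memory sketch in Java. This is not Neo4j code and not our actual model: the `Variant`, `Batch` and `Warehouse` labels and the `HAS_BATCH` and `STORED_IN` relationship types are hypothetical names chosen for the example. Each node keeps direct references to its relationships, so a lookup walks only the neighbourhood of the starting node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A node with a label, properties, and direct references to its outgoing relationships. */
final class GraphNode {
    final String label;
    final Map<String, Object> props = new HashMap<>();
    final List<GraphRel> out = new ArrayList<>();

    GraphNode(String label) {
        this.label = label;
    }

    GraphNode with(String key, Object value) {
        props.put(key, value);
        return this;
    }

    GraphRel relate(String type, GraphNode target) {
        GraphRel rel = new GraphRel(type, target);
        out.add(rel);
        return rel;
    }
}

/** A typed relationship; it can carry its own properties, such as a reserved quantity. */
final class GraphRel {
    final String type;
    final GraphNode target;
    final Map<String, Object> props = new HashMap<>();

    GraphRel(String type, GraphNode target) {
        this.type = type;
        this.target = target;
    }
}

final class GraphSketch {
    /**
     * Follows variant -[HAS_BATCH]-> batch -[STORED_IN]-> warehouse.
     * Only relationships reachable from the given variant are visited.
     * A hypothetical Cypher pattern with the same shape would be:
     *   MATCH (v:Variant {id: $id})-[:HAS_BATCH]->(b:Batch)-[:STORED_IN]->(w:Warehouse {id: $wh}) RETURN b
     */
    static List<GraphNode> batchesIn(GraphNode variant, String warehouseId) {
        List<GraphNode> result = new ArrayList<>();
        for (GraphRel hasBatch : variant.out) {
            if (!hasBatch.type.equals("HAS_BATCH")) {
                continue;
            }
            GraphNode batch = hasBatch.target;
            for (GraphRel storedIn : batch.out) {
                if (storedIn.type.equals("STORED_IN")
                        && warehouseId.equals(storedIn.target.props.get("id"))) {
                    result.add(batch);
                }
            }
        }
        return result;
    }
}
```

The work done by `batchesIn` depends on how many batches hang off the one variant, not on how many variants or batches exist overall. That is the intuition behind the relatively flat query times as the data set grows.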

Complexities

Introducing a new database layer also brings the additional challenge of maintaining consistency between the two data layers and having proper fail-over mechanisms. It involved adding integration hooks to ensure that the graph DB layer is updated whenever inventory data changes in any of the systems. It also demanded a way to coordinate transactions where writes had to be performed on both data sources, with roll-back strategies defined for the case where the commit fails on one of them. We used Spring's ChainedTransactionManager (part of Spring Data) to keep such transactions as close to all-or-nothing as possible. It links together a list of transaction managers (one for each data source) to synchronize their transactions. If the business processing is successful they all commit, and if not they all roll back. Note that this is a best-effort arrangement rather than a true two-phase commit, so a failure during the commit phase itself still needs a recovery strategy. The config below helped us achieve this.

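1. Defining the transaction manager for the graph data source:

// Session factory that scans the listed packages for Neo4j OGM entity classes.
@Bean("graphSessionFactory")
public SessionFactory sessionFactory(org.neo4j.ogm.config.Configuration config) {
    return new SessionFactory(config, "xxxxxxx", "xxxxxx");
}

// Spring injects the session factory bean defined above.
@Bean("graphTransactionManager")
public Neo4jTransactionManager graphTransactionManager(
        @Qualifier("graphSessionFactory") SessionFactory sessionFactory) {
    return new Neo4jTransactionManager(sessionFactory);
}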

2. Defining the transaction manager for the MySQL data source:

// JPA entity manager factory for the relational (MySQL) data source.
@Primary
@Bean(name = "entityManagerFactory")
public LocalContainerEntityManagerFactoryBean entityManagerFactory(
        EntityManagerFactoryBuilder builder,
        @Qualifier("dataSource") DataSource dataSource) {
    return builder.dataSource(dataSource).packages("xxxxxx.xxx.xxxx").build();
}

@Primary
@Bean(name = "mysqlTransactionManager")
public JpaTransactionManager mysqlTransactionManager(
        @Qualifier("entityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
    return new JpaTransactionManager(entityManagerFactory.getObject());
}

3. Defining the platform-level transaction manager:

// Transactions begin in the order listed and commit or roll back in reverse,
// so the Neo4j transaction commits first and the MySQL transaction last.
@Bean(name = "transactionManager")
public PlatformTransactionManager transactionManager(
        Neo4jTransactionManager neo4jTransactionManager,
        JpaTransactionManager mysqlTransactionManager) {
    return new ChainedTransactionManager(mysqlTransactionManager, neo4jTransactionManager);
}
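
To show what chaining the two managers means in practice, here is a simplified plain-Java model of the ordering described above. It does not use Spring: `FakeTxManager` and `ChainSketch` are hypothetical stand-ins written for this example. They only imitate the documented behaviour of beginning transactions in list order, committing in reverse order, and rolling back the remaining managers once a commit fails.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a resource-local transaction manager; not a Spring type. */
final class FakeTxManager {
    final String name;
    final boolean failOnCommit;
    final List<String> log;

    FakeTxManager(String name, boolean failOnCommit, List<String> log) {
        this.name = name;
        this.failOnCommit = failOnCommit;
        this.log = log;
    }

    void begin() {
        log.add("begin " + name);
    }

    void commit() {
        if (failOnCommit) {
            throw new IllegalStateException(name + " commit failed");
        }
        log.add("commit " + name);
    }

    void rollback() {
        log.add("rollback " + name);
    }
}

final class ChainSketch {
    /**
     * Begins every transaction in list order, runs the work, then commits in
     * reverse order. If the work throws, everything is rolled back. If a commit
     * throws, the managers not yet committed are rolled back instead.
     * Returns true only when every manager committed.
     */
    static boolean runInChain(List<FakeTxManager> managers, Runnable work) {
        for (FakeTxManager m : managers) {
            m.begin();
        }
        try {
            work.run();
        } catch (RuntimeException e) {
            for (int i = managers.size() - 1; i >= 0; i--) {
                managers.get(i).rollback();
            }
            return false;
        }
        boolean committing = true;
        for (int i = managers.size() - 1; i >= 0; i--) {
            FakeTxManager m = managers.get(i);
            if (committing) {
                try {
                    m.commit();
                } catch (RuntimeException e) {
                    committing = false; // switch to rolling back the rest
                }
            } else {
                m.rollback();
            }
        }
        return committing;
    }
}
```

With the managers listed as MySQL then Neo4j, a failed Neo4j commit leaves MySQL free to roll back, so both stores stay consistent. If the MySQL commit were the one to fail, the Neo4j commit would already have gone through. That gap is why the manager listed last should be the one most likely to fail, and why a separate recovery strategy is still needed.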

Conclusion

All in all, it was a great learning experience, and the property graph model helped us capture our complex domain in an expressive and flexible fashion. The schema-free nature of graph databases, coupled with the ability to relate data elements in many different ways at once, allowed us to make room for business evolution while reducing risk and maintaining performance.
