
Data Science Solutions

I have addressed several business challenges with advanced predictive models that have helped my clients target customers effectively, segment products and plan inventory assortment, predict churn, optimize budget allocation and drive overall customer engagement.

Below are short briefs of some of these projects:

Project-1  (Built in R)

Objective:  Build a product recommendation model to upsell in-store products and boost sales for a major retailer in the Middle East.

Model Building Process: 

  • Raw Data Taken:  3 years of purchase history data.

  • Data cleaning & pre-processing:  Corrected date formats, detected and removed outliers, and removed 0.5% of records that had no user ID.

  • Modelling Method: 
    1.   Collaborative filtering was used to evaluate user-user and item-item interactions.
    2.   The model was trained on 70% of the data and tested on the remaining 30%.
    3.   Different samples were tried to cross-validate the efficiency of the model.
    4.   Pre-deployment efficiency came to 86%; after deployment to production it was around 73%.

  • Further Improvements & tweaks:
    1.   After the e-commerce launch, the highest-rated products among the top 3 recommendations were shown to the user.
    2.   Ongoing promotion data was merged into the model and given priority over the 4th and 5th recommendations.

  • Impact on Business:
    1.   Thanks to regular communication of relevant product recommendations, weekly in-store footfall improved significantly (~22%).
    2.   The repeat rate went up, and inactive customers were reactivated through promotional "favorite campaigns".

  • Misc:
    1.   In the same project, customer segmentation and fraud detection models were developed to support our marketing strategy and day-to-day campaign operations.
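
The item-item side of the collaborative filtering step can be illustrated with a minimal sketch. The original model was built in R and its exact formulation is not described here, so everything below is an assumption made for illustration: the toy purchase data, the cosine similarity measure, and the function names `item_vectors`, `cosine` and `recommend` are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase history: user -> {item: purchase count}.
purchases = {
    "u1": {"milk": 3, "bread": 2, "eggs": 1},
    "u2": {"milk": 1, "bread": 1, "butter": 2},
    "u3": {"bread": 2, "butter": 1, "jam": 1},
    "u4": {"milk": 2, "eggs": 2},
}


def item_vectors(data):
    """Invert user -> item counts into item -> {user: count}."""
    vectors = defaultdict(dict)
    for user, basket in data.items():
        for item, count in basket.items():
            vectors[item][user] = count
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, data, top_n=5):
    """Rank items the user has not bought by similarity to items they have."""
    vectors = item_vectors(data)
    owned = data[user]
    scores = {}
    for candidate, cand_vec in vectors.items():
        if candidate in owned:
            continue  # only recommend items the user has not purchased yet
        # Weight each similarity by how often the user bought the owned item.
        scores[candidate] = sum(
            cosine(cand_vec, vectors[item]) * count
            for item, count in owned.items()
        )
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("u4", purchases))
```

With `top_n=5` the function returns up to five ranked items per user, which is where a rule such as promotion priority on the 4th and 5th slots could be applied afterwards.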

Project-2  (Built in Python)

Objective:  Weekly revenue forecasting for a major hi-tech firm in the US, to assist in budget allocation and resource planning.

Model Building Process: 

  • Raw Data Taken:  5 years of sales data

  • Data cleaning & pre-processing:  Apart from basic cleaning, additional data such as seasonality information and calendars of major events across states was scraped from online sources.

  • Modelling Method: 
    1.   A time series method (ARIMA model) was used for the forecasting.
    2.   Additional parameters were included to explain some of the unusual trends (special promotional events, nationwide public events, etc.), and information from the future event calendar was also fed into the model.
    3.   The model was trained on 80% of the data and tested on the remaining 20%.
    4.   Different samples were tried to cross-validate the efficiency of the model.
    5.   Model efficiency came to 89% and occasionally rose as high as 93% (due to the inclusion of the event calendar).

  • Improvements & tweaks:
    1.   Future enterprise special promotion events were taken into consideration.
    2.   Weekly revenue was forecast not only overall but also at category level, with an average accuracy of around 82%.

  • Impact on Business:
    1.   The business was able to make informed decisions and refine its budget allocations accordingly.
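
The 80/20 evaluation described above can be sketched as a time-ordered hold-out. This is not the ARIMA model itself: a seasonal-naive baseline stands in for it so the example stays dependency-free. The page does not define how "efficiency" was measured, so reading it as 100 × (1 − MAPE) is an assumption, as are the synthetic data and the function names `chronological_split`, `seasonal_naive_forecast` and `mape_accuracy`.

```python
from statistics import mean


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series without shuffling, so the test set is strictly later."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def seasonal_naive_forecast(train, horizon, season_length=52):
    """Baseline stand-in for the real model: repeat the value from one season earlier."""
    history = list(train)
    forecasts = []
    for _ in range(horizon):
        value = history[-season_length]
        forecasts.append(value)
        history.append(value)  # lets the forecast roll forward beyond one season
    return forecasts


def mape_accuracy(actual, predicted):
    """Return 100 * (1 - MAPE); one possible reading of 'efficiency' (assumption)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * (1.0 - mean(errors))


# Synthetic weekly revenue with a repeating 4-week cycle (illustrative only).
weekly_revenue = [100 + 10 * (week % 4) for week in range(40)]

train, test = chronological_split(weekly_revenue)
forecast = seasonal_naive_forecast(train, horizon=len(test), season_length=4)
print(f"train={len(train)} weeks, test={len(test)} weeks, "
      f"accuracy={mape_accuracy(test, forecast):.1f}%")
```

Keeping the split chronological matters for forecasting: a random split would let the model see weeks that come after the ones it is tested on.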
     

Other Projects in brief:

  • Customer segmentation model (using KNN, built in R) for a sleep science client, helping them drive customer engagement through targeted segmentation.
    Resulted in a 7% increase in our customer base, and counting.
     

  • NLP-based sentiment analysis (built in R) over millions of reviews to help a cruise line client identify the major themes behind the challenges it faced. The same analysis was run on competitor site reviews so the client could compare its competitiveness in the market.
    Resulted in client delight and the project being converted to a monthly retainer.
     

  • Fraud detection model built for a hospitality client; the booking funnel was cleaned to prevent fake bookings from entering the system. Data such as IP address, email address validation, payment method, booking day vs. realized day, previous realized bookings and geographical factors were considered in calculating the probability that a booking is genuine.
    Resulted in the prevention of 8% revenue leakage.
     

  • Market Mix Modelling for an advertising company to help them optimize their customer spend across channels.
    After initial execution, the project was put on hold due to internal budget issues.
     

  • Image processing CNN model (using Python and OpenCV) built to identify the age and gender of customers from a video feed and to count footfall in the store.
    Resulted in 91% efficiency, but the model was not deployed to production due to a change in business direction.
