Problem
Small, mid-sized, and even large retailers struggle to find insights in their data. Without loyalty programs, they receive no demographic or customer information. Even with loyalty programs, the usage/tag rate across transactions is low (<50%), and businesses don’t have access to market data to contextualise their performance.
Goal
To empower retailers to implement data-driven operations, and to lower the barrier to entry for retail insights, by providing an easy-to-use data product with comprehensive and actionable insights.
Output
I led the end-to-end design and release of Slyp's commercial data product offering for retailers, involving data modelling, pipeline establishment, data analysis, and visualisation. This included analysing data from 2000+ retailers, covering 40% of all transactions processed in Australia.
Insight:
Performance indices to benchmark merchant performance against average industry trade volumes
Demographic segmentation/overlay across models
Implementation:
Because not everyone in Australia was signed up to Slyp, it was a complex task to ensure that the insights we presented were indicative of the behaviour of the entire population. For example, the Slyp dataset might be 70% women, whereas the population our retailers want insights on is the Australian population, which is 50.4% female.
I imported Australian demographic data and developed a dynamic table that output the ratio between Slyp’s demographic mix and that of the Australian population at a suburb/gender/age group cross-section. As Slyp’s customer base grew every day, I ensured that each day was tracked so that the ratios used in downstream models stayed current and insights could be communicated as accurately as possible.
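A minimal sketch of how such a ratio table could be computed, using only the Python standard library. The cell definitions, suburb names, population shares, and the `cell_weights` helper are illustrative assumptions, not the production tables or code; the idea shown is simply weight = population share / sample share for each suburb/gender/age group cell.

```python
from collections import Counter

# Hypothetical population shares per (suburb, gender, age_group) cell.
# Real values would come from census data; these numbers are made up.
population_share = {
    ("Bondi", "F", "25-34"): 0.30,
    ("Bondi", "M", "25-34"): 0.30,
    ("Parramatta", "F", "25-34"): 0.20,
    ("Parramatta", "M", "25-34"): 0.20,
}


def cell_weights(customers, population_share):
    """Return population_share / sample_share for every cell seen in the sample."""
    counts = Counter(customers)
    total = sum(counts.values())
    weights = {}
    for cell, n in counts.items():
        sample_share = n / total
        weights[cell] = population_share[cell] / sample_share
    return weights


# Illustrative sample, one tuple per customer, skewed towards women (70% female).
customers = (
    [("Bondi", "F", "25-34")] * 5
    + [("Bondi", "M", "25-34")] * 1
    + [("Parramatta", "F", "25-34")] * 2
    + [("Parramatta", "M", "25-34")] * 2
)

weights = cell_weights(customers, population_share)

# Applying the weights brings the female share back to the population figure.
weighted_total = sum(weights[c] for c in customers)
weighted_female = sum(weights[c] for c in customers if c[1] == "F")
female_share_after = weighted_female / weighted_total
```

Running `cell_weights` on each day's customer snapshot would give one set of ratios per day, which is one way to keep the downstream scaling in step with a growing customer base.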
Insight:
The geographic split of customers coming from each location; or
Share of wallet analysis, e.g., customers from zip code 10001 spend 10% of their grocery category spend with you
Implementation:
This involved both the categorization of SKU data and scaling the data to the ratio of the Australian population as mentioned in the prior two insights.
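As a rough illustration of the share of wallet calculation, the sketch below sums category spend per customer postcode and divides the retailer's portion by the total. The transaction layout, retailer names, postcodes, and the `share_of_wallet` function are hypothetical; the `weight` field stands in for the demographic scaling ratio described earlier.

```python
from collections import defaultdict

# Hypothetical transactions: (customer_postcode, retailer, category, spend, weight).
# "weight" is an assumed demographic scaling ratio for the customer's cell.
transactions = [
    ("2026", "RetailerA", "grocery", 50.0, 1.0),
    ("2026", "RetailerB", "grocery", 150.0, 1.0),
    ("2026", "RetailerA", "grocery", 25.0, 2.0),
    ("2026", "RetailerB", "grocery", 75.0, 2.0),
    ("2150", "RetailerA", "grocery", 40.0, 1.0),
    ("2150", "RetailerB", "grocery", 60.0, 1.0),
    ("2150", "RetailerA", "apparel", 90.0, 1.0),  # ignored: different category
]


def share_of_wallet(transactions, retailer, category):
    """Return {postcode: retailer's share of weighted category spend}."""
    retailer_spend = defaultdict(float)
    total_spend = defaultdict(float)
    for postcode, tx_retailer, tx_category, spend, weight in transactions:
        if tx_category != category:
            continue
        scaled = spend * weight
        total_spend[postcode] += scaled
        if tx_retailer == retailer:
            retailer_spend[postcode] += scaled
    return {
        postcode: retailer_spend[postcode] / total
        for postcode, total in total_spend.items()
        if total > 0
    }


shares = share_of_wallet(transactions, "RetailerA", "grocery")
```

Here `shares["2026"]` is the fraction of weighted grocery spend from postcode 2026 that went to the hypothetical RetailerA.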
Insight:
Attachment and affinity rates of products and categories, e.g., ‘Customers who bought X product/category most commonly bought Y category’
Product line comparison across single brands, e.g., 500ml vs 1000ml products
Implementation:
The data provided was at SKU unit level, including product name, price, and quantity, with no additional descriptive data. I therefore implemented a supervised machine learning model for multi-class text classification to assign categories to the data and provide clearer insights to our retail customers.
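To make the idea of multi-class text classification on product names concrete, here is a small multinomial Naive Bayes classifier written with the standard library only. It is a stand-in chosen for brevity, not necessarily the model that was used; the class name, training examples, and category labels are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a product name and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.token_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.token_counts[label]
            total = sum(counts.values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(doc_count / n_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled SKU names; a real training set would be far larger.
train_names = [
    "Full Cream Milk 1L", "Greek Yoghurt 500g", "Cheddar Cheese Block 250g",
    "Sparkling Water 500ml", "Orange Juice 1L", "Cola Soft Drink 375ml",
    "Sourdough Bread Loaf", "Wholemeal Bread Rolls 6pk", "Chocolate Croissant",
]
train_labels = ["dairy"] * 3 + ["beverages"] * 3 + ["bakery"] * 3

model = NaiveBayesTextClassifier().fit(train_names, train_labels)
predicted = model.predict("Multigrain Bread Loaf")
```

Because SKU names are short and noisy, a production model would likely need a larger labelled set and richer features such as character n-grams.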
Insight:
Discount modelling: price vs number of customers (or vs the average basket size)
i.e., is my pricing correlating with increased foot traffic, or is my pricing strategy only attracting existing customers and hence eating away at my margins?
Implementation:
Used a simple time series chart to analyse trends over time
Used ARIMA to forecast price, while capturing trends, seasonality, and linear dependencies
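The pricing question above can also be examined with a plain Pearson correlation between price and customer counts. The sketch below is a simplified companion to the time series and ARIMA work, not a reproduction of it; the weekly figures are invented, and the split into new and returning customers is an assumption about what the data allows.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly data for one product.
weekly_price = [5.0, 4.5, 4.0, 5.0, 3.5, 4.0]
new_customers = [100, 101, 101, 99, 100, 99]
returning_customers = [200, 230, 260, 205, 290, 255]

r_new = pearson(weekly_price, new_customers)
r_returning = pearson(weekly_price, returning_customers)
```

In this made-up data, lower prices go with more returning customers but barely move new customer counts, which is the pattern that would suggest discounts are mostly eroding margin. Correlation alone does not establish cause, so a result like this is a prompt for further analysis rather than a conclusion.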
The following are excerpts from a prototype I designed using InVision and Figma; they do not contain sensitive information. Final versions were built upon a data lake architecture in AWS. I worked with a Data Engineer to build a new ‘layer’ in the data pipeline for specific use cases and to store the transformations and outputs from data science modelling. This ensured the data was performant and ready to use. The visualisation for early versions was done in Amazon QuickSight and then Plotly Dash.