In 2015, the global revenue for big data and data analytics was $122 billion. In 2017, the value had grown to $149 billion and in 2019, we closed the decade with the value of big data and data analytics hitting $189 billion. According to The International Data Corporation (IDC), the global revenue from this market is projected to hit $274 billion by 2022. In the past decade, we have seen this lucrative market attract lots of players with standalone business intelligence (BI) and data visualization tools reaching unicorn levels.
In June 2019, Google spent $2.6 billion to acquire Looker while Salesforce spent a record-breaking $15.7 billion to acquire Tableau. Both Looker and Tableau specialize in enterprise-scale data modeling that automates data analysis with reports and interactive visualizations. We have seen other big players in the market apart from Looker and Tableau; for instance, Microsoft has Power BI, Oracle has Oracle BI, and IBM has its IBM BI tools and IBM Watson for advanced natural language processing. Amazon’s AWS is not left behind, with QuickSight for BI and SageMaker for ML pipelines being among the best options out there.
As we talk about the big data space, it is worth noting that the last decade was largely filled with business intelligence tools that focus on analysis and visualization. Towards the end of the decade, however, the space saw changes that we expect to shape this decade going forward. Most companies are starting to embrace the need to use data not only for analysis but also as a source of insights with direct actions to take thereafter. One sector where this shift has been of the greatest interest to me is retail.
As of 2017, Walmart, for instance, collected 2.5 petabytes of unstructured data from 1 million customers every hour. How does this data benefit Walmart? Walmart uses it to understand customer preferences both in-store and online, how local events affect sales of specific products and product categories, and how local weather deviations affect consumers’ buying patterns. Other functions that the Walmart AI platform supports include customized recommendations to customers, differential pricing and inventory management, among many others. The end result was a significant 10% to 15% increase in online sales alone, with incremental revenue of $1 billion in 2017.
What makes you stand out as a decision maker in a company? Remember that without data you are just another person with an opinion; why then should we use your opinion and not that of others? You are only as good as the insights you have from your company’s data.
Walmart is living proof of the value we can derive from big data if we are bold enough to move beyond the data visualizations offered by common BI tools like Tableau and Power BI. Inspired by the Walmart story, in 2019 I embarked on a journey to build a platform that would help all retailers, especially those in Kenya, achieve the same benefits of big data as Walmart but on a lower budget. Most of them are a fraction of Walmart’s size and so might not be able to put up the massive big data infrastructure and resources that Walmart has in place. The result is what we now know as Insense Data Technologies. Our first product, AI4Retail, is a set of tools that help retailers get insights from their POS data, together with direct actions to take as advised by the data. The main aim is to reduce costs, increase efficiency, increase sales and improve customer service, and hence reduce customer churn, with the net effect being an expected increase in profits of at least 10% year on year. But how do we help achieve this? Let’s get back to understanding the three main components of data analytics.
As AI continues to grow and develop so does how we use it. As I have already mentioned, in the past businesses focused on harvesting descriptive data about their customers and products but going forward businesses must start to embrace pulling both predictive and prescriptive learnings from the information they gather.
Descriptive Analytics is data that provides information about the past: what has happened in your company. This is where companies pull their monthly sales reports, website visits or clicks, marketing campaign rates and social media reach. In this category, the company gets insights into what has already happened but doesn’t really have an idea of what to do to change it. It is the most primitive stage in the evolution of the use of data, but it still plays an important role for the company. The second phase in the evolution of data analytics is Predictive Analytics; this is data that provides information about what will happen in your company. This could include how well a product will sell, who is likely to buy it and what marketing channels to use for the best impact, among many others. The last phase in the current evolution is Prescriptive Analytics. Prescriptive Analytics is data about not just what will happen in your company but how it could happen better if you did a, b, c or d.
At the core of prescriptive analytics is the ability to provide insights as well as recommend actions you should take to optimize a process, campaign or service to the highest degree. Prescriptive Analytics takes us from insights to actions.
Using AI and machine learning, prescriptive analytics can prescribe the right buyer at the right time with the right content, the right shelf to place a product on, the right message to send to what type of customer at what time of what day.
At Insense Data Technologies, our AI4Retail tools offer a number of prescriptive analytics tools, some of which I will use as a guide for the remaining sections of this post.
Customer Segmentation in the descriptive analytics era meant understanding your customers based on parameters such as age group, race, gender and country of origin, among others, and drawing reports on how each of these segments performs, e.g. sales of product A are higher among people aged 18–24, and sales of product B are higher among women. With this information, companies would go to social media and target their products based on these details. In the prescriptive analytics era, we delve into the POS data to understand the customers’ buying behavior and cluster them based on the recency, frequency and monetary value of their purchases. Our models then cluster the customers into 11 major segments based on their buying behavior: are they loyal customers? Are they promising customers? Are they at risk of churn? Are they already lost? Are they our champions (our Pareto customers)? Through A/B testing using our AI-powered Uplift modelling tools, we then recommend the correct actions to take for each of these customer segments. For example, our systems would only recommend that you reward, upsell or introduce new products to the loyal customers or the champions while offering limited discounts to hibernating customers.
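To make the recency, frequency and monetary idea concrete, here is a minimal sketch in Python. It assumes POS transactions arrive as (customer, date, amount) tuples. The function names `rfm_table` and `label`, the thresholds and the four labels are all hypothetical and chosen for illustration; they are not the 11 segments or the models described above.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)} from POS rows.

    transactions: iterable of (customer_id, purchase_date, amount).
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Track the most recent purchase date per customer
        if customer not in last_seen or day > last_seen[customer]:
            last_seen[customer] = day
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((today - last_seen[c]).days, frequency[c], monetary[c])
        for c in last_seen
    }


def label(recency_days, frequency, monetary):
    """Map an RFM triple to a coarse segment (illustrative thresholds only)."""
    if recency_days <= 30 and frequency >= 3:
        return "champion"
    if recency_days <= 30:
        return "promising"
    if recency_days <= 90:
        return "at risk"
    return "lost"


sales = [
    ("a", date(2020, 1, 1), 10.0),
    ("a", date(2020, 1, 20), 5.0),
    ("a", date(2020, 1, 25), 5.0),
    ("b", date(2019, 6, 1), 100.0),
]
table = rfm_table(sales, today=date(2020, 1, 31))
segments = {c: label(*rfm) for c, rfm in table.items()}
```

In practice the thresholds would come from the data itself (for example, quantiles of each dimension) rather than fixed numbers like these.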
In Cohort Analytics, prescriptive analytics can recommend the best ways, channels or times to acquire a customer based on the behavior of customers previously acquired. For instance, how do customers acquired in January compare to those acquired in December? How do customers acquired through social media compare to those acquired through face-to-face selling? How do customers acquired through promotions or discounts behave once the discounts or promotions end? By analyzing retention rates and average spend over time, our systems can help retailers make better choices on the best channels to use and when to use them, taking the cost-benefit analysis into account.
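As a rough illustration of the retention side of cohort analysis, the sketch below groups customers by the month of their first purchase and reports what share of each cohort bought again in later months. The input shape and the names `month_index` and `retention_by_cohort` are assumptions made for this example.

```python
from collections import defaultdict


def month_index(year, month):
    """Turn (year, month) into a single increasing integer."""
    return year * 12 + (month - 1)


def retention_by_cohort(purchases):
    """purchases: iterable of (customer_id, year, month).

    Returns {cohort_month_index: {months_since_acquisition: share_active}}.
    """
    purchases = list(purchases)

    # A customer's cohort is the month of their first purchase
    first = {}
    for customer, year, month in purchases:
        idx = month_index(year, month)
        if customer not in first or idx < first[customer]:
            first[customer] = idx

    cohort_sizes = defaultdict(int)
    for idx in first.values():
        cohort_sizes[idx] += 1

    # Collect the distinct customers active at each offset from acquisition
    active = defaultdict(set)
    for customer, year, month in purchases:
        cohort = first[customer]
        offset = month_index(year, month) - cohort
        active[(cohort, offset)].add(customer)

    table = defaultdict(dict)
    for (cohort, offset), customers in active.items():
        table[cohort][offset] = len(customers) / cohort_sizes[cohort]
    return {c: dict(sorted(row.items())) for c, row in table.items()}


rows = [
    ("a", 2019, 12), ("b", 2019, 12),
    ("a", 2020, 1), ("c", 2020, 1), ("c", 2020, 2),
]
retention = retention_by_cohort(rows)
```

The same grouping could be keyed by acquisition channel instead of acquisition month to compare, say, social media against face-to-face selling.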
Perhaps the tool we’ve developed at IDT that stands out as my favorite is what we simply call Customer Lifetime Value. By using customers’ buying behavior, we are able to rank customers based on their average lifetime value or their discounted lifetime value, using the concept of discounted cash flow. What is more, we can predict the probability of a customer being alive, with the lower probability values indicating customers the company has likely lost. The tool therefore not only predicts the customer’s lifetime value but also predicts customer churn. Insights like these are important in understanding how much to spend on a customer based on their lifetime value, or what drives customer churn, so as to anticipate churn and take preventive measures. Additionally, with this tool, retailers can now predict customer purchases over the next “x” number of days, weeks or months. How is this predictive insight important? Let’s say a retailer wants to send out promotional messages to customers. Sending such a message to a customer who will not purchase in the next 30 days might be a waste because, by the time they come back to purchase, they might have forgotten about the message. However, sending such a message to those who are expected to come and purchase in the next one or two days would help the retailer reap the highest conversion and hence ROI.
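The discounted cash flow idea can be sketched in a few lines. This is a simple textbook version, not the model behind the tool: it assumes a constant margin per period, a constant probability that the customer stays active from one period to the next, and a fixed discount rate. The function name `discounted_clv` and its parameters are hypothetical.

```python
def discounted_clv(margin_per_period, retention_rate, discount_rate, periods):
    """Sum expected margins, weighted by survival and discounted to today."""
    total = 0.0
    for t in range(periods):
        # Probability the customer is still active in period t
        survival = retention_rate ** t
        # Value today of one unit of money received in period t
        discount = (1 + discount_rate) ** t
        total += margin_per_period * survival / discount
    return total


# Margin of 100 per period, 50% retention, 25% discount rate, 3 periods:
# 100 + 50 / 1.25 + 25 / 1.5625
value = discounted_clv(100, 0.5, 0.25, 3)
```

Ranking customers by a value like this is what lets a retailer decide how much it is worth spending to keep each of them.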
Also in our basket of AI4Retail tools is Market Basket Analysis, a topic I have written about before. This tool helps retailers understand customers’ buying behavior at the basket level. What products do people like to buy together? What insights can this behavior help us derive? And what actions can we take to improve our profits based on these insights? Beyond the most basic insights, such as the most frequent baskets and the common products in the largest baskets, association mining remains one of the fundamental aims of basket analysis. In its most basic form, an association rule reads “those who bought product X are also more likely to buy product Y.” Two main parameters used to measure the strength of an association rule are confidence and lift. A confidence of 50% shows that 50% of the customers who purchased product X also bought product Y, whereas lift measures incremental value. A lift of 2 implies that customers who buy product X are twice as likely to buy product Y as customers in general. Because lift accounts for how popular product Y already is, it is often the better measure of association between products, but both are important depending on the type of action the system prescribes to the retailer.
Association mining is perhaps one of the most important tools for retailers. Some of the important prescriptions we make based on its insights include the arrangement of products on the shelves, the placement of products into packages, discounts, products to have on promotion, upselling, cross-selling and product recommendations to customers, among many others. For instance, if Hass avocados have a lift higher than 1 with baby cucumbers and packaged grape tomatoes, then our system would prescribe that all three products be placed on the same shelf to reduce the amount of time customers take walking from aisle to aisle during their shopping journey. The less time a customer takes shopping, the better it is for the in-store retailer. For an online store, this insight could help prescribe how to position products on, say, the retailer’s home page. With the knowledge that buying apples increases the chances of buying clementines by a lift of 32, it would be unwise to have both products on promotion; the retailer would be advised to promote one of them, and the sales of the other would increase as a result. Lastly, it would be more accurate to recommend products to customers based on what’s already in their cart (for online retailers) or based on their previous purchases. Recommended products could be prescribed based on either confidence or lift. For instance, if our customer likes purchasing yellow onions, we could decide to recommend to them other products having a confidence above 10% with yellow onions, such as bananas, limes and organic avocados. This helps increase the sale of these other products while ensuring that the chances of conversion on the recommended products are high.
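Confidence and lift follow directly from basket counts. The sketch below computes both for a single rule X → Y, using the standard definitions: confidence is the share of X baskets that also contain Y, and lift is confidence divided by the overall share of baskets containing Y. The function name `rule_metrics` and the toy baskets are made up for illustration.

```python
def rule_metrics(baskets, x, y):
    """Support, confidence and lift for the rule x -> y.

    baskets: list of sets of product names.
    Assumes x and y each appear in at least one basket.
    """
    n = len(baskets)
    n_x = sum(1 for b in baskets if x in b)
    n_y = sum(1 for b in baskets if y in b)
    n_xy = sum(1 for b in baskets if x in b and y in b)

    support = n_xy / n          # share of all baskets containing both
    confidence = n_xy / n_x     # share of x baskets that also contain y
    lift = confidence / (n_y / n)  # confidence relative to y's base rate
    return support, confidence, lift


baskets = [
    {"avocado", "tomato"},
    {"avocado", "tomato"},
    {"onion"},
    {"onion"},
]
support, confidence, lift = rule_metrics(baskets, "avocado", "tomato")
```

Here every avocado basket also contains tomatoes (confidence 1.0), while only half of all baskets do, so the lift is 2.0. A lift of 1 would mean the two products are bought together no more often than chance would suggest.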
Uplift models seek to predict the incremental value attained in response to a treatment. For example, if we want to know the value of showing an advertisement to someone, typical response models, which fall under old-school descriptive analytics, will only tell us that a person is likely to purchase after being shown an advertisement, though they may have been likely to purchase already. Uplift models will predict how much more likely they are to purchase after being shown the ad. The uplift model tool in IDT’s AI4Retail helps the retailer categorize customers as persuadables, sure things, lost causes or sleeping dogs.
Persuadables are customers who will buy only if treated; sure things are customers who will buy with or without treatment; lost causes are customers who will not buy whether you treat them or not; and sleeping dogs are customers who will buy if not treated but will not buy if treated. Treatment in this case refers to the actions the retailer takes to help increase sales, such as sending promotional messages or offering discounts. The retailer wants to avoid sending messages to a million sure things; that’s a million shillings lost (assuming one shilling per SMS) because those customers would have purchased without the messages anyway. Similarly, sending promotional messages to lost causes results only in losses, as they do not buy with or without the treatment. The retailer would, however, want to target the persuadables: these are the customers who will not buy unless persuaded but will buy as soon as you send them the messages, thus giving the highest conversion and the highest ROI. The retailer must never send any messages to the sleeping dogs. Whereas sending messages to sure things and lost causes does not lead to any additional conversions, sending messages to sleeping dogs actually leads to a reduction in sales and might even trigger their churn, a negative effect that increases the company’s losses.
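The four groups can be expressed as a small decision rule. The sketch below assumes we already have, for each customer, an estimated probability of purchase with treatment and without it (for instance from two separate response models, one per group of an A/B test). The function names, the 0.5 cut-off and the sample numbers are hypothetical.

```python
def ab_uplift(treated_outcomes, control_outcomes):
    """Average uplift from a randomized test: treated rate minus control rate.

    Each argument is a list of 0/1 purchase outcomes.
    """
    treated_rate = sum(treated_outcomes) / len(treated_outcomes)
    control_rate = sum(control_outcomes) / len(control_outcomes)
    return treated_rate - control_rate


def uplift_segment(p_treated, p_control, buy_threshold=0.5):
    """Classify a customer from purchase probabilities with and without treatment."""
    buys_if_treated = p_treated >= buy_threshold
    buys_if_untreated = p_control >= buy_threshold
    if buys_if_treated and not buys_if_untreated:
        return "persuadable"    # only buys when treated
    if buys_if_treated and buys_if_untreated:
        return "sure thing"     # buys either way
    if buys_if_untreated:
        return "sleeping dog"   # treatment puts them off
    return "lost cause"         # does not buy either way


overall = ab_uplift([1, 1, 0, 0], [1, 0, 0, 0])
who_to_message = uplift_segment(0.8, 0.2)
```

Only customers landing in the persuadable group would be sent the promotional message; the difference `p_treated - p_control` is negative for sleeping dogs, which is why messaging them reduces sales.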
Correlation is not causation, but when correlations keep appearing between the same parameters over a period of time, we can have higher confidence in inferring a causal relationship. In our basket of AI4Retail tools, Correlation Analysis analyzes company data to see how various parameters correlate with external factors and thereby assesses whether there is a causal relationship or not. An example often given is the sale of sunglasses and that of ice cream, for which data will show a positive correlation. This does not mean that buying sunglasses makes people buy ice cream; rather, there is an external factor that positively affects the sale of both sunglasses and ice cream. That external factor is the day’s temperature. Our systems thus evaluate the correlation of sales of various products with three major parameters: weather conditions, calendar holidays and events. Though this inference is based on a specific geography, sales of ice cream have been seen to increase with temperature up to a specific point, after which higher temperatures lead to lower sales. At moderately high temperatures, people tend to prefer ice cream to keep them cool, but at extremely high temperatures, ice cream melts and becomes hard to handle and is hence not preferred. People also tend to stay indoors more during such extremely high temperatures.
With our collection of weather data since the beginning of the millennium, these ML tools are able to analyze the effect of various weather patterns and conditions on the sale of various products and prescribe which products to promote more based on the prevailing weather conditions. Sales of various products are also known to vary with holidays. There are products people prefer during Christmas festivities, Black Friday, Cyber Monday, Easter, back-to-school weeks and Valentine’s Day, among others. By delving into the purchase patterns over the past two decades, we can prescribe, with high accuracy, the products to feature more during each of the holidays and hence help retailers reap the best out of these holiday sales, which form the highest percentage of all annual sales by retailers, with global holiday sales soaring to $853 billion in 2018 alone. Events are specific to particular countries or geographies but could include such things as the Olympics, the World Cup or a local political rally. Understanding sales of products during such events can help the retailer better prepare for future events in terms of product restocking, recommendations and promotions.
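For readers who want to see the calculation, here is a minimal sketch of the Pearson correlation coefficient, the usual measure of linear correlation between two series such as daily temperature and daily sales. The function name `pearson` and the sample numbers are invented for illustration and are not real sales data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up numbers: sales rise with temperature, then fall when it gets too hot
temperature = [20, 25, 30, 35, 40]
ice_cream_sales = [10, 20, 30, 20, 10]

rising_part = pearson(temperature[:3], ice_cream_sales[:3])
whole_range = pearson(temperature, ice_cream_sales)
```

On the rising part of the curve the correlation is close to 1, but over the whole range it is 0 because the rise and the fall cancel out. A single linear correlation can therefore hide a relationship like the ice cream one, which is a reason to look at the shape of the data and not only the coefficient.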
As we go into the decade, the IDC estimate of $274 billion by 2022 may be surpassed, but it all depends on how we sell prescriptive analytics to the world. The aim is not the money but the value that prescriptive analytics holds for the businesses of this decade. We are moving from descriptive analytics to prescriptive analytics; from business intelligence to decision intelligence. The next phase of data analytics and big data is surely interesting; it is a phase where we are not only showing you reports on how you have performed the past year and leaving you to figure out how to do better, but we also prescribe to you the steps you can take to do better. I am not saying you should ditch descriptive analytics altogether; what I am saying is that you need a 360-degree view of data analytics, and you only get that if you make use of the different values that descriptive, predictive and prescriptive analytics have to offer.
Prescriptive Analytics is revolutionary for its value to businesses. It is like a personal doctor, prescribing to you the actions you should take to make your business do better.
Unlike descriptive analytics, which is cheap to set up and manage, prescriptive analytics employs machine learning and deep learning to derive these insights and make accurate prescriptions. This makes it more expensive to set up, both in infrastructure resources and in expertise. It is for this reason that we at Insense Data Technologies have set up an infrastructure for prescriptive analytics, and we aim to onboard various businesses onto our platform at very low cost so that they are not left behind because of the high costs associated with handling big data and deploying machine learning models. If you want that one thing that can differentiate your company from the others this year and going forward, it all lies in the decision you take on how you use data. Insense Data Technologies exists to help you take that journey from toddler level to adult level, both as your consultant and as your platform provider.