## What is sales cannibalization?

To explain what this term means and why it is important, let me provide you with a simple example.

Imagine two shops, A and B, situated next to each other and selling mobile phones. Their portfolios are likely to overlap, meaning that some handsets are sold in both shops. Given their proximity, the price of a handset in shop A may affect the sales volume of the same handset in shop B. This phenomenon is called sales cannibalization.

The above example is very simplistic. The dynamics of sales cannibalization are complex, and there are many kinds of cannibalization phenomena. Let me list a few that come to mind:

1. Cannibalization by a competing store

2. Cannibalization by a competing product which is similar to the product in consideration

3. Cannibalization by other brands

These forces of cannibalization are at play in every kind of industry, and it is necessary to factor them into any driver analysis of sales.

Competitor price is generally taken as the variable representing cannibalization. The coefficient obtained through OLS regression for any such cannibalization variable should be positive. This follows from the logic that your sales volume goes up as the competitor raises the price of its goods.

This concept applies to any industry in which substitute products exist.
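To make the sign argument concrete, here is a minimal sketch of an OLS regression of unit sales on own price and competitor price. Everything in it is an assumption made for illustration: the data is synthetic, the "true" demand equation (sales fall with own price and rise with competitor price) is invented, and the small `ols` helper is a hand-rolled solver for the normal equations rather than any particular package's routine.

```python
import random

def ols(X, y):
    """Fit y = X b by ordinary least squares via the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic data (hypothetical): 200 periods of prices and unit sales.
random.seed(0)
rows, units = [], []
for _ in range(200):
    own_price = random.uniform(180, 220)
    competitor_price = random.uniform(180, 220)
    noise = random.gauss(0, 5)
    # Assumed demand: -1.5 units per unit of own price,
    # +0.8 units per unit of competitor price.
    units.append(500 - 1.5 * own_price + 0.8 * competitor_price + noise)
    rows.append([1.0, own_price, competitor_price])  # 1.0 is the intercept column

intercept, b_own_price, b_competitor_price = ols(rows, units)
print("own price coefficient:       ", round(b_own_price, 2))
print("competitor price coefficient:", round(b_competitor_price, 2))
```

On this synthetic data the estimated coefficients should land close to the assumed values: negative for own price and positive for competitor price, which is the pattern described above.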

## Forecasting vs Driver Analysis

Forecasting is an analytical exercise conducted to predict the future outlook of any variable of interest. The variable can be sales, units, margins, market share, etc. Forecasting is done for a future period based on the historical data available to the analyst.

Driver Analysis is mainly done to identify causality. The variables for which a driver analysis is done are similar to the ones that can be forecasted.

Driver Analysis helps us quantify the impact of every independent variable on the dependent variable. It tells us how much the dependent variable changes with a unit increase in an independent variable, up to a statistical degree of accuracy. Forecasting, on the other hand, cannot be used to measure the impact of independent variables. The output from a forecasting tool gives us the predicted values of the variable of interest for a future time. Driver analysis can also be used to predict values of the dependent variable by providing it with suitable inputs, but the accuracy of prediction for a Driver Analysis may not be as high as that of a Forecasting exercise.

In a forecasting analysis, we include all variables in the predictor list as long as they are statistically significant, but we do not check for correlation between the various independent variables. The assumption here is that no matter how correlated two or more variables are with each other, there is some additional information in each of them that can increase the accuracy of prediction.

We do not include highly correlated variables in a driver analysis, because doing so may cause the effect of one variable to be explained by the other, and hence the statistical estimates that we obtain may not be accurate. The usual practice is to include only the one independent variable (out of the list of correlated variables) that makes sense from a business standpoint.

This is the trade-off that happens during the variable selection exercise for the respective analyses. To put it simply: a forecasting exercise compromises on causality to achieve higher levels of forecast accuracy, whereas a Driver Analysis compromises on accuracy to yield better causality.
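The correlation screening that precedes a driver analysis can be sketched as follows. This is only an illustration under stated assumptions: the variable names (`tv_spend`, `online_spend`, `price`), the synthetic data, and the 0.8 cut-off are all hypothetical, and the choice of which variable to keep from a flagged pair remains a business decision, not something the code decides.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Synthetic candidate drivers (hypothetical): online spend moves almost
# in lockstep with TV spend, while price is generated independently.
random.seed(42)
tv_spend = [random.uniform(50, 150) for _ in range(100)]
online_spend = [0.5 * t + random.gauss(0, 2) for t in tv_spend]
price = [random.uniform(180, 220) for _ in range(100)]

candidates = {"tv_spend": tv_spend, "online_spend": online_spend, "price": price}
flagged = correlated_pairs(candidates)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r}); keep only one")
```

In a forecasting exercise this screening step would simply be skipped and both spend variables kept, provided each is statistically significant.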

There are many tools available that aid us in doing the above-mentioned analyses. ARIMA (Auto Regressive Integrated Moving Average) and ARIMAX (Auto Regressive Integrated Moving Average with exogenous variables) are popular tools among analysts working on forecasting projects. Driver Analysis can be done through OLS (Ordinary Least Squares) regression or MLH (Maximum Likelihood) regression techniques.
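To give a feel for the "auto regressive" part of these forecasting tools, here is a deliberately stripped-down sketch: a first-order autoregression, AR(1), fitted by least squares and rolled forward a few periods. It is not a full ARIMA or ARIMAX model (no differencing, no moving-average terms, no exogenous variables), and the sales series, the helper names `fit_ar1` and `forecast_ar1`, and the generating parameters are all assumptions made for the example.

```python
import random

def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    c = my - phi * mx
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Roll the fitted equation forward, feeding each forecast into the next."""
    out = []
    for _ in range(steps):
        last_value = c + phi * last_value
        out.append(last_value)
    return out

# Synthetic sales history (hypothetical): each period depends on the previous one.
random.seed(1)
sales = [100.0]
for _ in range(299):
    sales.append(30 + 0.7 * sales[-1] + random.gauss(0, 3))

c, phi = fit_ar1(sales)
forecast = forecast_ar1(sales[-1], c, phi, steps=6)
print("estimated phi:", round(phi, 2))
print("next 6 periods:", [round(f, 1) for f in forecast])
```

Note that the fitted `phi` says how strongly sales depend on their own past, not what drives them, which is why a model of this kind serves forecasting rather than driver analysis.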

The problem statement for these analyses might look like one of the following:

- The business user wants to forecast revenues for a future period in order to ensure efficient inventory and demand planning – A Forecasting Exercise
- The business user wants to identify drivers of revenues so that he can optimally allocate budget among his various marketing channels – A Driver Analysis

Independent variables considered for the above analyses usually fall under the following categories:

- Product Price
- Cannibalization from Competing Products
- Promotional Activities
- Competitor Promotion
- Seasonality
- Product Life Cycle
- Simple Trends

These analyses require a fair amount of historical data, data preparation and analysis tools such as SQL, and software such as SAS with built-in modules for ARIMAX, regression, etc.