Market Basket Analysis using Apriori Algorithm
Market Basket Analysis
· Introduction
The number of stores, including supermarkets, online stores, and neighborhood grocery stores, is increasing day by day, and so is the competition between them. To attract customers, a store needs to understand their purchasing patterns before launching promotional schemes. The process of analyzing customers' shopping trends is called Market Basket Analysis. It helps increase sales in several ways: it supports decisions about sales strategy and helps in designing well-targeted promotions based on what consumers tend to buy.
Market Basket Analysis helps in finding associations between products, which makes it easier to manage product placement: two products A and B that are frequently bought together can be placed near each other, prompting a customer who purchases A to also buy B. It is also used in pricing, for example to offer discounts on bundles of items that are frequently bought together, such as "Buy A and B together and get 10% off on each".
Market Basket Analysis is a data mining process that focuses on discovering purchasing patterns by extracting association rules from a store's transactional database. Association Rule Mining is one of the data mining techniques that helps in finding interesting associations in a dataset. Knowing which products are bought together helps the retailer design the store layout (product placement). Good product placement not only reduces the customer's shopping time but also suggests other relevant items that he/she might be interested in buying. The three common measures of association are Support, Confidence, and Lift. Frequent itemsets are generated using algorithms such as Apriori and FP-Growth.
· Association Rule Mining
Association rule mining is a technique for identifying relationships between different items. It is used to find associations between combinations of items in an itemset. Three terms are important for measuring the association between items:
Support:
Support measures how frequently a combination of items is bought together. For an itemset of products, say (A,B), it is the ratio of the number of transactions that contain the itemset to the total number of transactions.
Mathematical Representation:
Support(A,B) = (Number of transactions containing both A and B) / (Total number of transactions)
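Confidence:
Confidence refers to the likelihood that item B is purchased if item A is bought. It is the ratio of the number of transactions in which both A and B are bought to the number of transactions in which A is bought.
Mathematical Representation:
Confidence(A -> B) = (Number of transactions containing both A and B) / (Number of transactions containing A)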
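Lift:
Lift indicates how strong a rule is: it measures how much more likely B is to be sold when A is sold, compared with how often B is sold overall. For the itemset (A,B), it is the ratio of the Confidence of (A,B) to the Support of (B).
Mathematical Representation:
Lift(A -> B) = Confidence(A -> B) / Support(B)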
If the lift for (A,B) is 2, then a customer who buys A is 2 times more likely to buy B than a customer in general.
Lift = 1 means there is no association between products A and B.
Lift > 1 means the products are more likely to be bought together.
Lift < 1 means the products are less likely to be bought together.
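The three measures can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the baskets are the five example transactions used in the Apriori demonstration in this post.

```python
# The five example transactions from the Apriori demonstration in this post.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
    {"A", "C", "E"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of A and B together divided by the support of A."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence of A -> B divided by the support of B."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

print(support({"B", "E"}, transactions))
print(confidence({"B"}, {"E"}, transactions))
print(lift({"B"}, {"E"}, transactions))
```

For the rule B -> E, the support is 0.6 (3 of 5 transactions), the confidence is 1.0 (every basket with B also has E), and the lift is 1.25, which is greater than 1 and so points to a positive association.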
These association rules can help retailers develop better marketing strategies. Cross-selling is one such strategy: it means selling items that are related to, and can be combined with, the item being sold.
Association and recommendation are different, however. Association can be described as “Frequently bought together”, while recommendation can be thought of as “Customers who bought/viewed Item A also bought Item B”. One of the best examples of both is Amazon’s website/app: whenever we search for a product, it shows recommendations based on the searched product.
· Algorithm: Apriori
The Apriori algorithm is used to find frequent itemsets. It starts by identifying the frequent individual items and then extends them to larger and larger itemsets, keeping an itemset only if its support is greater than or equal to the provided minimum threshold.
Apriori uses a “bottom-up” approach, in which frequent itemsets are extended one item at a time. This step is also known as candidate generation. The algorithm terminates when no further successful extensions are found.
The working of the algorithm is demonstrated below.
Let's consider a dataset containing the following transactions.
Transaction no. | Items Purchased
1 | A,C,D
2 | B,C,E
3 | A,B,C,E
4 | B,E
5 | A,C,E
Here we have 5 transactions in total and 5 distinct items: A, B, C, D, and E. We set the minimum support to 2, expressed as a count of transactions.
Itemset | Support
{A} | 3
{B} | 3
{C} | 4
{E} | 4
The support of itemset {D} is 1, which is below the minimum support of 2, so it is not included in the above table.
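As a rough sketch of how the level-wise search continues beyond single items, the following plain-Python function joins frequent itemsets into larger candidates, prunes candidates that have an infrequent subset, and counts the rest. The function name `apriori` and its `min_count` parameter are assumptions made for this illustration, not the code used in the project.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count each candidate and keep those meeting the minimum support count.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    items = {item for t in transactions for item in t}
    current = count({frozenset([i]) for i in items})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent

# The five example transactions from the table above.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
    {"A", "C", "E"},
]
frequent = apriori(transactions, min_count=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With a minimum support of 2, this finds 11 frequent itemsets for the example data: the four single items from the table, five pairs ({A,B} is dropped because it appears only once), and the two triples {A,C,E} and {B,C,E}, each with a support of 2.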
In our project, we used a dataset of 9835 transactions.
Loaded dataset:
Next, we apply one-hot encoding: an item that is purchased in a particular transaction gets an entry of 1, and an item that is not purchased gets 0. The column names are the product names and the rows are the transactions.
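A minimal sketch of this encoding step in plain Python is shown here. The function name `one_hot_encode` and the sample baskets are illustrative assumptions; they are not the project's code or data.

```python
def one_hot_encode(transactions):
    """Return (columns, rows): sorted item names and one 0/1 row per transaction."""
    columns = sorted({item for t in transactions for item in t})
    rows = [[1 if item in t else 0 for item in columns] for t in transactions]
    return columns, rows

# Hypothetical baskets; the product names are made up for illustration.
baskets = [
    ["milk", "bread"],
    ["bread", "butter", "jam"],
    ["milk"],
]
columns, rows = one_hot_encode(baskets)
print(columns)
for row in rows:
    print(row)
```

Each row has one entry per product column, so the first basket above becomes [1, 0, 0, 1] under the columns bread, butter, jam, milk.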
We then apply the Apriori algorithm with a minimum support of 0.005 and print the 15 itemsets with the highest support.
We also extract the top 40 first-buy items from the dataset and present them in a graphical format.
Association rules then give us the support, confidence, and lift of the frequent itemsets.
The table below shows the top 10 itemsets with the highest confidence among those with a support greater than or equal to 0.005.
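To illustrate how such a ranking could be produced, the sketch below derives rules from a table of frequent itemsets and sorts them by confidence. It assumes the itemsets have already been filtered by minimum support. The function name `association_rules` is hypothetical, and the support counts are taken by hand from the five-transaction example earlier in this post, not from the 9835-transaction dataset.

```python
from itertools import combinations

def association_rules(frequent, n_transactions):
    """Build every rule A -> B from a {frozenset: support count} mapping.

    Assumes every subset of a frequent itemset is also present in `frequent`,
    which holds for the output of an Apriori-style search.
    """
    rules = []
    for itemset, count in frequent.items():
        # Single items produce no rules because range(1, 1) is empty.
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                consequent = itemset - antecedent
                confidence = count / frequent[antecedent]
                lift = confidence / (frequent[consequent] / n_transactions)
                rules.append({
                    "antecedent": antecedent,
                    "consequent": consequent,
                    "support": count / n_transactions,
                    "confidence": confidence,
                    "lift": lift,
                })
    return rules

# Support counts from the five-transaction example (a subset of its frequent itemsets).
frequent = {
    frozenset("A"): 3, frozenset("B"): 3, frozenset("C"): 4, frozenset("E"): 4,
    frozenset("AC"): 3, frozenset("BE"): 3, frozenset("CE"): 3,
}
rules = association_rules(frequent, n_transactions=5)
top = sorted(rules, key=lambda r: (r["confidence"], r["support"]), reverse=True)[:3]
for r in top:
    print(sorted(r["antecedent"]), "->", sorted(r["consequent"]),
          round(r["confidence"], 2), round(r["lift"], 2))
```

On this toy input, the rules A -> C and B -> E come out on top with a confidence of 1.0, since every basket containing A also contains C and every basket containing B also contains E.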
· Conclusion:
The Apriori algorithm gives results that help a lot in understanding customers' buying patterns.
Online store leaders like Flipkart and Amazon use this technique to suggest items in the customer's basket/shopping cart. General store owners can also use it to manage product placement, promotional offers, etc., and thereby increase their sales, which leads to an increase in their profit. Using the first-buy graph above, a store owner can attract more people by placing those products at the front of the store or at the entrance. This not only helps increase sales but also saves customers' time.
Thus we can conclude that Market Basket Analysis plays an important role in the retail business.