Market Basket Analysis (MBA) identifies products that are bought together in a retail transaction or a customer visit to a store. A number of blogs give a brief overview of Market Basket Analysis for retail, present published case studies, or walk through a step-by-step approach to Market Basket Analysis using R.
In this blog, the focus is on explaining how the key performance statistics involved in Market Basket Analysis (MBA) are calculated.
Dummy Scenario Example
| Transaction | Products |
|---|---|
| 1 | A |
| 2 | B,C |
| 3 | A,C |
| 4 | D,A |
| 5 | A,C |
| 6 | A,B,C |
| 7 | C |
One of the first steps in Market Basket Analysis (MBA) is to find the frequency of each product sold in a store or online.
In this example, the frequency of each product, and hence the most frequent products, are as follows:
| Product | Frequency |
|---|---|
| A | 5 |
| B | 2 |
| C | 5 |
| D | 1 |
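The frequency count above can be reproduced with a few lines of base R. This is a minimal sketch rather than part of the original analysis: the variable names `baskets` and `item.freq` are illustrative, and it assumes each product appears at most once per transaction.

```r
# Dummy transactions from the example (illustrative variable name)
baskets <- list(
  t1 = "A", t2 = c("B", "C"), t3 = c("A", "C"), t4 = c("D", "A"),
  t5 = c("A", "C"), t6 = c("A", "B", "C"), t7 = "C"
)

# Count how many transactions contain each product
item.freq <- table(unlist(baskets))
print(item.freq)

# Products sorted from most to least frequent
print(sort(item.freq, decreasing = TRUE))
```

Flattening the list with `unlist` works here only because no basket lists the same product twice; otherwise the count would be of units rather than transactions.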
To find rules, we can enumerate all possible product combinations. With 4 products, there are C(4,1) + C(4,2) + C(4,3) + C(4,4) = 4 + 6 + 4 + 1 = 15 different combinations (ignoring order, so A,B is the same as B,A). These rules or product combinations are:
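| Rules | Availability |
|---|---|
| A | |
| B | X |
| C | |
| D | |
| A,B | X |
| A,C | |
| A,D | |
| B,C | |
| B,D | X |
| C,D | X |
| A,B,C | |
| A,B,D | X |
| A,C,D | X |
| B,C,D | X |
| A,B,C,D | X |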
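The count of 15 combinations, and the list itself, can be checked in base R. The sketch below is illustrative only (the names `products`, `n.combos` and `combos` are not from the original); it enumerates combinations without regard to order, as in the table above.

```r
products <- c("A", "B", "C", "D")

# Number of non-empty combinations: C(4,1) + C(4,2) + C(4,3) + C(4,4)
n.combos <- sum(choose(length(products), seq_along(products)))
print(n.combos)

# Enumerate the combinations themselves, ignoring order
combos <- unlist(
  lapply(seq_along(products), function(k) {
    combn(products, k, FUN = paste, collapse = ",")
  })
)
print(combos)
```

The number of combinations grows as 2^n - 1 with the number of products, which is why practical tools prune the search instead of listing every combination.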
Support
In a traditional Market Basket Analysis scenario, support is the number (or percentage) of transactions containing a particular product. Support can also be viewed from a probability perspective: the probability of a transaction containing the product.
Support (A) = % of transactions with product A
= Count of transactions containing product A / Total number of transactions
P(A) = Probability of a transaction containing product A
= 5/7
| Product | Frequency | Support | % |
|---|---|---|---|
| A | 5 | 5/7 | 71% |
| B | 2 | 2/7 | 29% |
| C | 5 | 5/7 | 71% |
| D | 1 | 1/7 | 14% |
In Market Basket Analysis (or Affinity Analysis), combinations of products are more important than single products. If we establish that a particular combination is prevalent, the insight can be used for cross-selling or assortment planning. That said, support is also calculated for single products.
Support for an association rule is the percentage of transactions in which the products of the rule occur together.
In the above example, the support for each of the rules is:
| Rules | Support (count) | Support (%) |
|---|---|---|
| A,C | 3 | 3/7 = 43% |
| A,D | 1 | 1/7 = 14% |
| B,C | 2 | 2/7 = 29% |
| A,B,C | 1 | 1/7 = 14% |
So the {A,C} combination is present in 3 of the 7 transactions, giving a support of 43%. Support can be expressed as a count or as a percentage; unless specified otherwise, it is given as a percentage. A higher value of support indicates higher importance of an association rule, since the rule covers more transactions.
In probability terms, the support for the rule {B → C} is the probability of a transaction containing both product B and product C:
Support {B → C} = P(B ∩ C)
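The support calculation can be written as a small helper in base R. This is a sketch under stated assumptions, not the method used by any particular package: `baskets` and `support` are illustrative names, and an itemset counts as present when every one of its products appears in the transaction.

```r
# Dummy transactions from the example (illustrative variable name)
baskets <- list(
  t1 = "A", t2 = c("B", "C"), t3 = c("A", "C"), t4 = c("D", "A"),
  t5 = c("A", "C"), t6 = c("A", "B", "C"), t7 = "C"
)

# Support of an itemset: share of transactions containing all of its items
support <- function(items, transactions) {
  mean(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

print(support("A", baskets))            # single product A: 5/7
print(support(c("A", "C"), baskets))    # combination {A,C}: 3/7
print(support(c("B", "C"), baskets))    # combination {B,C}: 2/7
```

The same function handles single products and combinations, because a single product is just an itemset of length one.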
Confidence
Adequate support is one of the basic requirements for an association rule, to be met before checking its other performance statistics. Another key performance statistic is confidence.
Confidence shows how often one product occurs given the occurrence of another product or set of products.
For the rule {B → C}, confidence is the percentage of transactions containing B that also contain C. In conditional probability terms, it is the probability of product C given the occurrence of product B.
Confidence {B → C} = P(C|B)
= Support(B and C) / Support(B)
= P(B ∩ C) / P(B)
= (2/7) / (2/7)
= 1.00
| Rules | Support (%) | Confidence (Formula) | Confidence |
|---|---|---|---|
| A,C (A → C) | 3/7 = 43% | P(A ∩ C)/P(A) | 43%/71% = 60% |
| A,D (A → D) | 1/7 = 14% | P(A ∩ D)/P(A) | 14%/71% = 20% |
| B,C (B → C) | 2/7 = 29% | P(B ∩ C)/P(B) | 29%/29% = 100% |
| A,B,C ({A,B} → C) | 1/7 = 14% | P(A ∩ B ∩ C)/P(A ∩ B) | 14%/14% = 100% |
| A,B,C ({B,C} → A) | 1/7 = 14% | P(A ∩ B ∩ C)/P(B ∩ C) | 14%/29% = 50% |
| A,B,C ({A,C} → B) | 1/7 = 14% | P(A ∩ B ∩ C)/P(A ∩ C) | 14%/43% = 33% |
Each rule has two sides: LHS (Left Hand Side) and RHS (Right Hand Side). In an association rule, we interpret that the RHS happens given the LHS. For example, in {A,B} → C, the LHS is {A,B} and the RHS is {C}.
The rule is that product C is taken up when a customer buys products A and B.
In summary, support gives the percentage of occurrences of a rule, and confidence gives the strength of the association.
In the example of {A,B} → C, how many times are all 3 products {A,B,C} bought together? This is the support for the rule. Second, we are inferring that when customers buy {A,B}, it is highly likely that they also buy C. Confidence measures this: it captures how often, as a count or percentage, customers who bought {A,B} also bought product C.
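Confidence follows directly from two support values, as the sketch below shows in base R. The helper names `support` and `confidence` are illustrative, not taken from a library, and the function returns NaN when the LHS never occurs (a division of zero by zero).

```r
# Dummy transactions from the example (illustrative variable name)
baskets <- list(
  t1 = "A", t2 = c("B", "C"), t3 = c("A", "C"), t4 = c("D", "A"),
  t5 = c("A", "C"), t6 = c("A", "B", "C"), t7 = "C"
)

# Support of an itemset: share of transactions containing all of its items
support <- function(items, transactions) {
  mean(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

# Confidence of LHS -> RHS: support of LHS and RHS together, divided by support of LHS
confidence <- function(lhs, rhs, transactions) {
  support(c(lhs, rhs), transactions) / support(lhs, transactions)
}

print(confidence("A", "C", baskets))            # A -> C: 3/5
print(confidence("B", "C", baskets))            # B -> C: 2/2
print(confidence(c("A", "B"), "C", baskets))    # {A,B} -> C: 1/1
print(confidence(c("A", "C"), "B", baskets))    # {A,C} -> B: 1/3
```

Working from the exact fractions avoids the small differences that appear when already-rounded percentages are divided by each other.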
Lift
Lift measures the improvement in the occurrence of the RHS due to an association rule. In non-technical terms, it compares how often the RHS happens under the rule with how often it happens anyway.
For example, (B → C) states that a customer who buys B has a higher chance of buying product C. The question is whether this is due to buying product B, or whether product C simply has a high chance of being bought in general.
So, lift is used to check whether a product is bought because of the LHS or because it has a high probability of being bought anyway.
Lift is defined as the ratio of the conditional probability of the RHS given the LHS to the unconditional probability of the RHS.
Lift = P(RHS|LHS)/P(RHS)
Example: B → C
Lift (B → C) = P(C|B)/P(C)
= 1.00/(5/7)
= 1.40
| Rules | Support (%) | Confidence (Formula) | Confidence | Lift (Formula) | Lift |
|---|---|---|---|---|---|
| A,C (A → C) | 3/7 = 43% | P(A ∩ C)/P(A) | 43%/71% = 60% | P(C|A)/P(C) | 84% |
| A,D (A → D) | 1/7 = 14% | P(A ∩ D)/P(A) | 14%/71% = 20% | P(D|A)/P(D) | 140% |
| B,C (B → C) | 2/7 = 29% | P(B ∩ C)/P(B) | 29%/29% = 100% | P(C|B)/P(C) | 140% |
| A,B,C ({A,B} → C) | 1/7 = 14% | P(A ∩ B ∩ C)/P(A ∩ B) | 14%/14% = 100% | P(C|{A,B})/P(C) | 140% |
| A,B,C ({B,C} → A) | 1/7 = 14% | P(A ∩ B ∩ C)/P(B ∩ C) | 14%/29% = 50% | P(A|{B,C})/P(A) | 70% |
| A,B,C ({A,C} → B) | 1/7 = 14% | P(A ∩ B ∩ C)/P(A ∩ C) | 14%/43% = 33% | P(B|{A,C})/P(B) | 117% |
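The lift values in the table can be verified by hand-rolling the formula in base R. As with the earlier statistics, this is only a sketch: `support`, `confidence` and `lift` are illustrative helper names, and lift is returned as a ratio (1.4) rather than a percentage (140%).

```r
# Dummy transactions from the example (illustrative variable name)
baskets <- list(
  t1 = "A", t2 = c("B", "C"), t3 = c("A", "C"), t4 = c("D", "A"),
  t5 = c("A", "C"), t6 = c("A", "B", "C"), t7 = "C"
)

# Support of an itemset: share of transactions containing all of its items
support <- function(items, transactions) {
  mean(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

# Confidence of LHS -> RHS: P(RHS | LHS)
confidence <- function(lhs, rhs, transactions) {
  support(c(lhs, rhs), transactions) / support(lhs, transactions)
}

# Lift of LHS -> RHS: P(RHS | LHS) / P(RHS)
lift <- function(lhs, rhs, transactions) {
  confidence(lhs, rhs, transactions) / support(rhs, transactions)
}

print(lift("A", "C", baskets))            # A -> C: 0.6 / (5/7)
print(lift("B", "C", baskets))            # B -> C: 1 / (5/7)
print(lift(c("B", "C"), "A", baskets))    # {B,C} -> A: 0.5 / (5/7)
print(lift(c("A", "C"), "B", baskets))    # {A,C} -> B: (1/3) / (2/7)
```

A ratio above 1 means the RHS is bought more often with the LHS than it is overall, and a ratio below 1 means less often.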
For the above example, we can use R to read the dummy data, convert it to transactions (required for association analysis), and find the support, confidence and lift of the rules.
R Code for the Market Analysis
    ## Input transactions and products
    prod.mba <- list(
      t1 = "A", t2 = c("B", "C"), t3 = c("A", "C"), t4 = c("D", "A"),
      t5 = c("A", "C"), t6 = c("A", "B", "C"), t7 = "C"
    )

    ## Install if not done already
    if (!requireNamespace("arules", quietly = TRUE)) install.packages("arules")

    ## Load library
    library(arules)

    ## Convert into "transactions" class
    prod.mba.trans <- as(prod.mba, "transactions")

    ## Build association rules
    assoc.rules <- apriori(prod.mba.trans, parameter = list(supp = 0.0001, conf = 0.2))

    ## Check the rules and details
    inspect(assoc.rules[1:14])