The Part I tutorial was based on the Apriori algorithm and briefly touched on association rules. Today, we will look at association rules in more detail, along with confidence and support.
Association Rule
In our previous post we defined learning association rules as finding the items that are bought together most often, i.e. single items, pairs, triples, etc.
In technical terms, association rules are if-then rules about the contents of a basket. An example is below:
The rule {i1, i2, i3, i4, i5, ..., iN} -> j means: "if a basket contains all of i1, ..., iN, then it is likely to contain item j."
Confidence
The confidence of an association rule is the probability of j given i1, ..., iN. In simple terms, it is the ratio of the support of I U {j} to the support of I, where I = {i1, ..., iN}. The support of I is the number of baskets/transactions that contain every item in I.
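The two definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names support and confidence, and the grocery baskets, are made up for this example and do not come from the tutorial's data.

```python
def support(baskets, itemset):
    """Count the baskets that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for basket in baskets if items <= basket)


def confidence(baskets, antecedent, j):
    """Confidence of the rule antecedent -> j: support(I U {j}) / support(I)."""
    antecedent = set(antecedent)
    denominator = support(baskets, antecedent)
    if denominator == 0:
        raise ValueError("the antecedent does not occur in any basket")
    return support(baskets, antecedent | {j}) / denominator


# Hypothetical baskets, chosen only to illustrate the definitions.
example_baskets = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk", "butter"},
]

# {milk, bread} occurs in 2 baskets; {milk, bread, butter} occurs in 1.
conf = confidence(example_baskets, {"milk", "bread"}, "butter")
print(conf)  # 0.5
```

Representing each basket as a set keeps the "contains all of I" check down to a single subset test.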
Example
Our Transactions/Baskets
Now suppose we want to check the association rule {2, 4} -> 5.
The confidence is the ratio of the support of {2, 4} U {5} to the support of {2, 4}. Both itemsets appear in 3 baskets. Therefore,
Confidence = 3 / 3 = 1
We can say that {2, 4} -> 5 has a confidence of 1. But we also want to know how interesting the rule is, because a rule can have high confidence simply because j appears in most baskets anyway. For this, we have a new parameter called interest.
The interest of an association rule is the difference between its confidence and the fraction of baskets that contain item j.
I({2, 4} -> 5) = conf({2, 4} -> 5) - Fr(5)
= 1 - (3/4)
= 1 - 0.75
= 0.25
Therefore, the interest is only 0.25 (25%), so this is not a particularly interesting rule.
Interesting rules are those with high positive or strongly negative interest values. A high positive value means the presence of I encourages the presence of j, while a strongly negative value means the presence of I discourages it.
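As a rough sketch, interest can be computed in Python as follows. The four baskets below are hypothetical: they are not the tutorial's transaction table, only a set chosen so that the counts match the worked example ({2, 4} and {2, 4, 5} each occur in 3 baskets, and item 5 occurs in 3 of 4). The helper names are also made up for this example.

```python
def support(baskets, itemset):
    """Count the baskets that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for basket in baskets if items <= basket)


def interest(baskets, antecedent, j):
    """Interest of antecedent -> j: confidence minus the fraction of baskets with j."""
    antecedent = set(antecedent)
    conf = support(baskets, antecedent | {j}) / support(baskets, antecedent)
    fraction_with_j = support(baskets, {j}) / len(baskets)
    return conf - fraction_with_j


# Hypothetical baskets (an assumption, consistent with the counts used above).
baskets = [
    {1, 2, 4, 5},
    {2, 3, 4, 5},
    {2, 4, 5},
    {1, 3},
]

positive = interest(baskets, {2, 4}, 5)  # 3/3 - 3/4
negative = interest(baskets, {1}, 5)     # 1/2 - 3/4
print(positive)  # 0.25
print(negative)  # -0.25
```

In this made-up data the second rule, {1} -> 5, has negative interest: baskets containing item 1 are less likely than average to contain item 5.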