**Frequent Pattern / Market Basket Analysis**

Frequent pattern mining looks for item sets and sequences that appear often in a dataset. For example, shoes, trousers, and belts may often be bought together in the same transaction. Each supermarket chooses its own minimum threshold for what counts as frequent: one may set it at 80%, another at 90%.

**Question**

We have a list of items with transaction IDs from our supermarket. We sell trousers together with shirts and want to check whether the rule trousers -> shirt meets the minimum thresholds given below. The transaction list is given in the table below.

Minimum Support = 40%

Minimum Confidence = 65%

**Solution (Trousers -> shirt)**

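First we redraw the transaction table, as in Question 1, as a binary table: one row per transaction and one column per item, with 1 if the item is in the transaction and 0 otherwise.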

**Now we calculate Support**

Support = (Number of transactions containing both trousers and shirts) / (Total number of transactions)

= 2/4

= 0.5

Support (trousers -> shirt) = 50%
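The support calculation can be sketched in a few lines of Python. The original transaction table is not reproduced here, so the four baskets below are assumed, chosen only so that trousers and shirts occur together in 2 of 4 transactions as in the calculation above. The helper name `support` is illustrative.

```python
# Hypothetical baskets: assumed data that matches the counts used in the text
# (4 transactions, 2 of which contain both trousers and a shirt).
transactions = [
    {"trousers", "shirt", "belt"},
    {"trousers", "shirt"},
    {"trousers", "shoes"},
    {"shoes", "belt"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


rule_support = support({"trousers", "shirt"}, transactions)
print(f"Support(trousers -> shirt) = {rule_support:.0%}")  # 2/4 = 50%
```

With a minimum support of 40%, this rule passes the support test.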

**Now we calculate Confidence**

A = trousers

B = shirts

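Confidence = P(A ∪ B) / P(A) = (Number of transactions containing both trousers and shirts) / (Number of transactions containing trousers)

Here P(A ∪ B) is the fraction of transactions that contain the combined item set, that is, both A and B.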

= 2/3

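≈ 0.67

Confidence (trousers -> shirt) ≈ 67%

The support (50%) is above the minimum support of 40% and the confidence (about 67%) is above the minimum confidence of 65%, so the rule trousers -> shirt is accepted.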
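The confidence step can be checked the same way. This is a minimal sketch on assumed data: the baskets are hypothetical and chosen so that trousers appear in 3 transactions, 2 of which also contain a shirt. The names `count` and `confidence` are illustrative.

```python
# Hypothetical baskets (assumed): trousers in 3 transactions, with a shirt in 2.
transactions = [
    {"trousers", "shirt", "belt"},
    {"trousers", "shirt"},
    {"trousers", "shoes"},
    {"shoes", "belt"},
]
MIN_CONFIDENCE = 0.65


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(antecedent, consequent, transactions):
    """Count of (A and B together) divided by count of A.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)


conf = confidence({"trousers"}, {"shirt"}, transactions)
print(f"Confidence(trousers -> shirt) = {conf:.3f}")  # 2/3
print("accepted" if conf >= MIN_CONFIDENCE else "rejected")
```

Note that confidence is not symmetric: shirt -> trousers divides by the number of shirt transactions instead, so it generally gives a different value.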

**Apriori Algorithm**

The Apriori algorithm is a mining algorithm for frequent item sets. It extends item sets one item at a time (candidate generation) and tests each group of candidates against the data.

**Question**

The table consists of transaction IDs and items. Find the item sets whose support is at least 2 (minimum support = 2).

**SOLUTION**

First we find support for each item.

__First Level Candidate__

Construct a table that lists each unique item in the left-hand column and its support count in the right-hand column. Go through the transactions from TID 10 to TID 40 and count how often each item appears: A appears 2 times in the four rows, so we write 2 in the support column; B appears three times, so we write 3. This is our first level candidate table.

We remove “D” from the candidates because its support is 1 and we need a minimum support of 2.

After removing D, the remaining items are A, B, C, and E.
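The first-level count and pruning can be sketched as follows. The transaction table is not reproduced in the text, so the four transactions below are assumed, chosen to agree with the counts quoted above (A has support 2, B has support 3, D has support 1).

```python
from collections import Counter

# Assumed transactions (TID -> items); hypothetical data consistent with the
# supports quoted in the text, not the original table itself.
transactions = {
    10: {"A", "C", "D"},
    20: {"B", "C", "E"},
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}
MIN_SUPPORT = 2

# Count in how many transactions each single item appears.
item_counts = Counter(item for items in transactions.values() for item in items)

# Keep only items that reach the minimum support.
frequent_1 = {item: n for item, n in item_counts.items() if n >= MIN_SUPPORT}

print(sorted(item_counts.items()))  # includes ('D', 1)
print(sorted(frequent_1.items()))   # D is pruned
```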

__Second Level Candidate__

Now we form all possible pairs of the remaining items. Pair {A} with each of {B}, {C}, and {E}; then pair {B} with {C} and {E}; then pair {C} with {E}. This gives the candidates {A, B}, {A, C}, {A, E}, {B, C}, {B, E}, and {C, E}.

Go back to the table given in the question, count how many transactions contain {A, B}, and write that count in the support column. Repeat for every candidate pair.

As before, we remove the sets whose support is 1, because they fall below the minimum support of 2.
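The pairing and pruning steps for the second level can be sketched with `itertools.combinations`. The transactions are again assumed (the original table is not shown), and the list of surviving items A, B, C, E is taken from the first-level step.

```python
from itertools import combinations

# Assumed transactions; hypothetical data consistent with the counts in the text.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
]
MIN_SUPPORT = 2
frequent_items = ["A", "B", "C", "E"]  # items that survived the first level

# Candidate generation: every unordered pair of frequent single items.
candidates = [frozenset(pair) for pair in combinations(frequent_items, 2)]

# Support counting: in how many transactions does each pair appear together?
pair_support = {c: sum(1 for t in transactions if c <= t) for c in candidates}

# Pruning: drop pairs below the minimum support.
frequent_2 = {c: n for c, n in pair_support.items() if n >= MIN_SUPPORT}

for itemset, n in frequent_2.items():
    print(sorted(itemset), n)
```

On this assumed data, {A, B} and {A, E} each occur only once and are removed.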

**Remaining Item Sets**

**Result**

Now we list the item sets that meet the minimum support at each level.

First level candidates: {A}, {B}, {C}, {E}

Second level candidates: {A, C}, {B, C}, {B, E}, {C, E}

In other words, these item sets are frequent.
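The two levels worked through above follow one repeated pattern: generate candidates, count them, prune them. Below is a minimal sketch of that loop, not a tuned implementation. The function name `apriori` and the transactions are assumptions for illustration, since the original table is not shown. The loop keeps going until no candidates survive, so on this assumed data it also reports one three-item set, {B, C, E}, beyond the two levels covered in the walkthrough.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support_count} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidate):
        return sum(1 for t in transactions if candidate <= t)

    # Level 1: single items that meet the minimum support.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]): count(frozenset([i])) for i in items}
    current = {c: n for c, n in current.items() if n >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join: union pairs of frequent (k-1)-sets that form a k-set.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the survivors against the data and apply the threshold.
        current = {c: count(c) for c in candidates}
        current = {c: n for c, n in current.items() if n >= min_support}
    return frequent


# Assumed transactions; hypothetical data consistent with the counts in the text.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
]
result = apriori(transactions, min_support=2)
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), result[itemset])
```

The prune step is what gives the algorithm its name: a set can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset are discarded before any counting.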