Data Mining
Data mining is the study of finding hidden patterns in data that are previously unknown and non-trivial. Data mining is also used for prediction.
Points of Study
- The objective is clear, but the output is not known in advance.
- The term "dimension" refers to the number of columns.
- The data is unknown.
- The result should be non-trivial.
- The pattern should be hidden.
Job Scope
- Data analyst
- Data scientist
- Business intelligence
Software Tools
- Weka: research tool
- RapidMiner: professional tool
- SPSS (by IBM): data pre-processing
- Excel skills: find and replace
- CSV file: data set
- R programming: especially for data science
Major Techniques
These are the major techniques used in data mining. They take raw data through steps such as data cleaning and data pre-processing, and construct useful datasets that are then used for prediction.
- Market basket/frequent pattern analysis
- Classification (model and prediction)
- Health analysis
- Clustering (fraud detection)
- Time series prediction
- Constructing software systems
We now move on to our first data mining technique, market basket analysis, and implement it using binary database examples.
Market Basket/Frequent Pattern
Market basket analysis is frequently used in:
- Supermarket promotions
- Web pages
- Arrangement of items
Pattern
A collection of more than one item is called an item set.
Unique basket + items inside = transaction
Association rule = tells the association between two items
Example
Consider an example in which T is the ID of each transaction in the dataset. These IDs are always unique. The items are the contents of each transaction (row), identified by its ID.
Solution
First, we list all the items present in the database; here we have A, B, C, D and E. We collect the unique items from the six rows without duplicating any of them. For example, ID 1 contains A, B, D and E but not C, so we note down these four items in our list. When we move to ID 2, B and E are already in the list, so we add only C. Next we construct a table: the items A, B, C, D and E are written in the first row, and the transaction IDs are written in the first column. Suppose that the table starts out empty.
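The item-collection step can be sketched in Python. This is a minimal sketch under stated assumptions: only the contents of ID 1 (A, B, D, E) and the presence of B, C and E in ID 2 come from the description above. The remaining rows, and the names `transactions` and `unique_items`, are hypothetical and chosen only for illustration.

```python
# Hypothetical six-row dataset: transaction ID -> items in that transaction.
# Only rows 1 and 2 follow the description in the text; rows 3-6 are made up.
transactions = {
    1: ["A", "B", "D", "E"],
    2: ["B", "C", "E"],
    3: ["A", "B", "D", "E"],
    4: ["A", "B", "C", "E"],
    5: ["A", "B", "C", "D", "E"],
    6: ["B", "C", "D"],
}


def unique_items(transactions):
    """Collect every item once, in the order it is first seen."""
    seen = []
    for tid in sorted(transactions):       # visit rows in ID order
        for item in transactions[tid]:
            if item not in seen:           # skip items already noted
                seen.append(item)
    return seen


items = unique_items(transactions)
print(items)          # items in order of first appearance
print(sorted(items))  # the same items in alphabetical order, for the table header
```

With this illustrative data, C is added to the list only when row 2 is reached, because row 1 does not contain it.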
A is present in transaction ID 1, so we write this transaction ID in the first cell under A. B is also present in transaction ID 1, so we write 1 in the first cell under B. Item C needs more care. C is not present in transaction ID 1, so we look at transaction ID 2. C is present there, so we write "2" in the first cell under C. The process continues in the same way for the remaining items and transactions until the table is complete.
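The table-filling procedure can be sketched as follows. This is only an illustration, not the original example table: rows 1 and 2 follow the description above, while rows 3 to 6 and the function name `build_item_table` are assumptions.

```python
# Hypothetical six-row dataset: transaction ID -> items in that transaction.
# Only rows 1 and 2 follow the description in the text; rows 3-6 are made up.
transactions = {
    1: ["A", "B", "D", "E"],
    2: ["B", "C", "E"],
    3: ["A", "B", "D", "E"],
    4: ["A", "B", "C", "E"],
    5: ["A", "B", "C", "D", "E"],
    6: ["B", "C", "D"],
}


def build_item_table(transactions):
    """For each item, list the IDs of the transactions that contain it."""
    table = {}
    for tid in sorted(transactions):            # walk the rows in ID order
        for item in transactions[tid]:
            # Append this transaction ID to the item's column.
            table.setdefault(item, []).append(tid)
    # Order the columns alphabetically: A, B, C, D, E.
    return dict(sorted(table.items()))


table = build_item_table(transactions)
for item, tids in table.items():
    print(item, tids)
```

Under these assumptions, the first entry in the columns for A and B is 1, while the first entry in the column for C is 2, which matches the walkthrough above.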
Question
Now solve this question: given the following table, construct the transaction and items table.
Hint: Make the table like question 1.