Data mining

  • Published on
    15-Jul-2015

Transcript

Data Mining

Data Mining (Data Warehouse) Data Mining Data Mining

Steps of the data mining process:

1. Data Cleaning
2. Data Integration
3. Data Selection
4. Data Transformation
5. Data Mining
6. Pattern Evaluation
7. Knowledge Representation

Components of a data mining system:
- Database, Data Warehouse, World Wide Web, and Other Info Repositories
- Database or Data Warehouse Server
- Knowledge Base
- Data Mining Engine
- Pattern Evaluation Module
- User Interface

Types of data that can be mined:
- Relational Database (Entity Relationship Model)
- Data Warehouses
- Transactional Database
- Advanced Database: Object-Oriented, Text File, Web

Database Management System (DBMS) examples: Oracle, DB2, MS SQL, MS Access

1. Association rule Discovery Data Mining Market Basket Analysis (Association Rule) SE-ED BOOK 1
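Market Basket Analysis rules are commonly scored with support and confidence. The following is a minimal sketch under stated assumptions: the baskets, the item names, and the helper names `support` and `confidence` are hypothetical and do not come from the slides.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical baskets; the item names are illustrative, not data from the slides.
BASKETS: List[FrozenSet[str]] = [
    frozenset({"database book", "data mining book"}),
    frozenset({"database book", "data mining book", "statistics book"}),
    frozenset({"database book", "novel"}),
    frozenset({"novel"}),
]


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Fraction of baskets that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    hits = sum(1 for basket in baskets if wanted <= basket)
    return hits / len(baskets)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               baskets: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) divided by support(antecedent)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    base = support(a, baskets)
    return support(both, baskets) / base if base else 0.0


# Rule under test: {database book} -> {data mining book}
rule_support = support({"database book", "data mining book"}, BASKETS)
rule_confidence = confidence({"database book"}, {"data mining book"}, BASKETS)
print(rule_support, rule_confidence)
```

Algorithms such as Apriori search for all itemsets whose support exceeds a chosen minimum; this sketch only scores one rule that is given up front.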

2. Classification & Prediction
Classification builds a model that is then used for classification. It has 3 steps:
1. Model Construction (Learning): build the model from training data
2. Model Evaluation (Accuracy): measure the model on testing data
3. Model Usage (Classification): apply the model to unseen data objects

Classification techniques:

1) Decision Tree: each node tests an attribute, and each branch ends in a leaf node

2) Artificial Neural Networks (ANN): a technique from Artificial Intelligence (AI). A Neural Net is trained; its nodes are arranged in an input layer, hidden layers, and an output layer
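The three classification steps (Model Construction, Model Evaluation, Model Usage) can be sketched with a one-level decision tree on a single numeric attribute. This is an illustration only: the data rows, the two class labels, and the function names `build_stump`, `classify`, and `accuracy` are assumptions, and the sketch handles two classes at most.

```python
from typing import List, Tuple

Row = Tuple[float, str]           # (attribute value, class label)
Model = Tuple[float, str, str]    # (threshold, label if value <= threshold, label otherwise)


def build_stump(training: List[Row]) -> Model:
    """Step 1, Model Construction: pick the split with the fewest training errors."""
    labels = sorted({label for _, label in training})
    best = None
    for threshold in sorted({value for value, _ in training}):
        # Try both ways of assigning the two labels to the two sides of the split.
        for low, high in ((labels[0], labels[-1]), (labels[-1], labels[0])):
            errors = sum(
                1 for value, label in training
                if (low if value <= threshold else high) != label
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    return best[1], best[2], best[3]


def classify(model: Model, value: float) -> str:
    """Step 3, Model Usage: label one object with the learned split."""
    threshold, low, high = model
    return low if value <= threshold else high


def accuracy(model: Model, testing: List[Row]) -> float:
    """Step 2, Model Evaluation: fraction of testing rows labelled correctly."""
    correct = sum(1 for value, label in testing if classify(model, value) == label)
    return correct / len(testing)


# Hypothetical data, invented for this example.
training_data = [(1.0, "no"), (2.0, "no"), (3.0, "no"),
                 (6.0, "yes"), (7.0, "yes"), (8.0, "yes")]
testing_data = [(2.5, "no"), (6.5, "yes"), (9.0, "yes")]

model = build_stump(training_data)
test_accuracy = accuracy(model, testing_data)
prediction = classify(model, 5.5)   # an unseen object
print(model, test_accuracy, prediction)
```

Keeping the testing rows separate from the training rows is the point of step 2: accuracy measured on the training data alone tends to be optimistic.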

Prediction

3. Database clustering (Segmentation)
3 groups:
1. (>$80,000)
2. ($25,000 to $80,000)
3. (less than $25,000)
- Have Children
- Married
- Last car is a used car
- Own cars
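The slide lists three dollar ranges as ready-made segments. A clustering algorithm would instead derive groups from the data. The sketch below uses k-means, which the slides do not name, on one numeric attribute; the dollar amounts, the starting centers, and the function names `kmeans_1d` and `assign` are all assumptions made for illustration.

```python
from typing import List


def assign(value: float, centers: List[float]) -> int:
    """Index of the center closest to `value`."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(values: List[float], centers: List[float], rounds: int = 20) -> List[float]:
    """Plain k-means on one numeric attribute, starting from the given centers."""
    centers = list(centers)
    for _ in range(rounds):
        groups: List[List[float]] = [[] for _ in centers]
        for v in values:
            groups[assign(v, centers)].append(v)
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers


# Hypothetical dollar amounts, invented for this example.
amounts = [12_000, 18_000, 21_000, 40_000, 55_000, 60_000, 95_000, 120_000]
start = [min(amounts), sum(amounts) / len(amounts), max(amounts)]

centers = kmeans_1d(amounts, start)
segments = [assign(a, centers) for a in amounts]
print(centers, segments)
```

The groups found this way depend on the data and on the starting centers, so they need not match fixed cut-offs such as $25,000 and $80,000.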

4. Deviation Detection (Visualization)

5. Link Analysis
Link Analysis looks for links (associations) between records. Link analysis has 3 techniques:
- associations discovery
- sequential pattern discovery
- similar time sequence discovery

Web Mining has 3 categories: Web Content Mining, Web Structure Mining, Web Usage Mining

(Web Mining)

Web Content Mining: 2 approaches, (Information Retrieval) and (Database)

Web Structure Mining

Web Usage Mining (Web Log Mining): works on usage data such as the Proxy (Proxy Server Log) and (Registration Data). Stages: (Preprocessing), (Pattern Discovery), (Pattern Analysis)

Web Mining web (Text ) web

Web Mining web

3 categories:
1. Demographics
2. Psychographics
3. Technolographics

Web mining e-Commerce

Data Mining Data Mining (Who) (What) (Where) (When) (Why) Data Mining

Web Mining

Internet , , ( Decision Support System) ( Operational System ) ( Data Warehouse )

computer Data Mining Algorithm computer computer microcomputer (PC Cluster) computer

- k-fold cross validation, leave-one-out
- Validation, Test data, Training data: 3/10, 3/10, 4/10
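k-fold cross validation splits the rows into k folds and uses each fold once as test data while the rest serve as training data; leave-one-out is the case where k equals the number of rows. A minimal sketch, assuming rows are addressed by index and folds are taken in order without shuffling (the function name `k_fold_indices` is hypothetical):

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds over n rows."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


folds = list(k_fold_indices(n=10, k=5))
leave_one_out = list(k_fold_indices(n=4, k=4))   # k == n, one test row per fold

for train, test in folds:
    print("train:", train, "test:", test)
```

In practice the rows are usually shuffled before splitting so that each fold has a similar mix of classes; that step is omitted here to keep the output deterministic.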

WEKA (Waikato Environment for Knowledge Analysis): freeware from the University of Waikato (1997) for Machine Learning and Data Mining, with a Graphical User Interface (GUI). Its data mining functions include Pre-Processing and Classification.

Software Software

- - - SQL Database Java Database Connectivity- -

accuracy Apriori FP-Tree FP-tree Apriori Model Evaluated

Software Orange Canvas Add-ons

Software

Software MATLAB: MATLAB is an (Interactive) system whose basic data element is an array that does not need a declared dimension, which suits problems formulated with a matrix or vector.

Software Software

- Algorithm
- (Simulink) Dynamic
- Graphical User Interface object; MATLAB fields, object
- Fortran, Borland C/C++, Microsoft Visual C++
- MATLAB interactive; MATLAB, C, Fortran

Software Data Mining Midas Bouygues Telecom France Telecom Data Mining - - - - - 6

1. 561413022
2. 561413031
3. 561413032
4. 561413041
5. 561413051