Introduction to Data Mining
Course Overview
Acknowledgement
Literature
Data Mining: Concepts and Techniques, by J. Han and M. Kamber, Morgan Kaufmann Publishers, 2001
Pattern Classification, by R. Duda, P. Hart and D. Stork, 2nd edition, John Wiley & Sons, 2001
Introduction to Knowledge Discovery in Databases and Data Mining
Computational Knowledge Discovery
Terminology
Terminology - A Working Definition
Data Mining: On What Kind of Data?
[Figure labels: Structure - 3D Anatomy; Function - 1D Signal; Metadata - Annotation]
Data Mining: Confluence of Multiple Disciplines
20x20 ~ 2^400 ≈ 10^120 patterns
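The pattern count on this slide can be checked with a few lines of Python. This is only a sketch: it assumes "20x20" means a grid of 400 cells that can each be on or off, which is an interpretation, since the slide's graphic is not available.

```python
import math

# Assumption: a 20x20 grid where every cell is either on or off.
cells = 20 * 20
patterns = 2 ** cells              # exact integer, 2^400

# Express the count as a power of ten: log10(2^400) = 400 * log10(2).
exponent = cells * math.log10(2)

print(f"2^{cells} has {len(str(patterns))} decimal digits")
print(f"2^{cells} is about 10^{exponent:.1f}")
```

Exhaustively enumerating that many candidate patterns is out of reach, which is the motivation for the search and pruning techniques covered later in the deck.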
Why Do We Need Data Mining?
How do you explore millions of records, tens or hundreds of fields, and find patterns?
Why Do We Need Data Mining? (continued)
Why Do We Need Data Mining? (continued)
[Figure labels: Query, Result, (Latitude, Longitude) 1, (Latitude, Longitude) 2]
What Is It?
Applications of Data Mining
Data Mining Applications
Market Analysis
Corporate Analysis & Risk Management
Fraud Detection & Mining Unusual Patterns
Data Mining and Business Intelligence
Knowledge Discovery in Databases Process
KDD Process
[Figure labels: Precision Farming, Filter]
KDD Process (continued)
Knowledge Discovery
Required Effort for Each KDD Step
Data Mining Tools
Commercial and Research Tools
Software Engineering in Data Mining
D2K - Software Environment for Data Mining
D2K Architecture
Data Flow Programming Environment: D2K
[Screenshot labels: Jump Up Panes, Workspace, Tool Bar, Tool Menu, Side Tab Panes]
D2K Programming and Runtime Environment
Streamlined Data Mining Environment: D2K SL
[Screenshot labels: KDD Steps, Session, KDD Options, Workspace]
Data Mining Techniques in D2K
Data Mining at Work
[Figure: projects arranged by Data Sources (Single, Multiple, Numerous) and Project Objectives]
Labels: Diagnostics; Target Marketing; Effluent Quality Control; Decision Support; Automation; Transaction Management; Cost Prediction (Warranty, Insurance Claims); Warranty Clustering; Territorial Ratemaking; Web Information Retrieval, Archival and Clustering; Auto Loss Ratio Predictions; Precision Farming; Bio-Informatics; Functional Foods; Heterogeneous Data Visualization; Crime Data Analysis; Data Fusion and Visualization; Survey Study of Disability
Examples of Data Mining Methods
Three Primary Data Mining Paradigms
Association Rules and  Market Basket Analysis
What Is Market Basket Analysis?
Market Basket Example
Is soda typically purchased with bananas? Does the brand of soda make a difference?
Where should detergents be placed in the store to maximize their sales?
Are window cleaning products purchased when detergents and orange juice are bought together?
How are the demographics of the neighborhood affecting what customers are buying?
Association Rules
Results: Useful, Trivial, or Inexplicable?
How Does It Work?

Grocery Point-of-Sale Transactions
Customer  Items
1         Orange Juice, Soda
2         Milk, Orange Juice, Window Cleaner
3         Orange Juice, Detergent
4         Orange Juice, Detergent, Soda
5         Window Cleaner, Soda

Co-Occurrence of Products
                 OJ   Window Cleaner   Milk   Soda   Detergent
OJ               4    1                1      2      2
Window Cleaner   1    2                1      1      0
Milk             1    1                1      0      0
Soda             2    1                0      3      1
Detergent        2    0                0      1      2

Each cell counts the transactions that contain both products; the diagonal counts the transactions that contain the product at all. Orange juice and detergent appear together in transactions 3 and 4, so that cell is 2.
How Does It Work? (continued)
[Co-occurrence table repeated from the previous slide]
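The co-occurrence counting described on these two slides can be sketched in Python. The five baskets are taken from the point-of-sale transactions above; the function name `co_occurrence` and the choice of a `Counter` keyed by product pairs are illustrative assumptions, not part of the original material.

```python
from collections import Counter
from itertools import combinations

# The five point-of-sale transactions from the slide.
transactions = [
    {"Orange Juice", "Soda"},
    {"Milk", "Orange Juice", "Window Cleaner"},
    {"Orange Juice", "Detergent"},
    {"Orange Juice", "Detergent", "Soda"},
    {"Window Cleaner", "Soda"},
]

def co_occurrence(baskets):
    """Count, for every ordered pair of products, the baskets containing both.

    The pair (x, x) holds the number of baskets containing x at all,
    which corresponds to the diagonal of the co-occurrence table.
    """
    counts = Counter()
    for basket in baskets:
        for item in basket:
            counts[(item, item)] += 1
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1   # keep the table symmetric
            counts[(b, a)] += 1
    return counts

counts = co_occurrence(transactions)
products = ["Orange Juice", "Window Cleaner", "Milk", "Soda", "Detergent"]
for row in products:
    print(f"{row:15}", [counts[(row, col)] for col in products])
```

Pairs that never occur together, such as milk and soda, are simply absent from the `Counter` and read back as 0.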
How Good Are the Rules?
Confidence and Support - How Good Are the Rules?
Confidence and Support

Transaction ID   Items
1                { 1, 2, 3 }
2                { 1, 3 }
3                { 1, 4 }
4                { 2, 5, 6 }

For minimum support = 50% = 2 transactions and minimum confidence = 50%

Frequent One Item Set   Support
{ 1 }                   75 %
{ 2 }                   50 %
{ 3 }                   50 %
{ 4 }                   25 %

Frequent Two Item Set   Support
{ 1,2 }                 25 %
{ 1,3 }                 50 %
{ 1,4 }                 25 %
{ 2,3 }                 25 %

For the rule 1 => 3:
Support = Support({1,3}) = 50%
Confidence (1->3) = Support({1,3}) / Support({1}) = 50% / 75% ≈ 66.7%
Confidence (3->1) = Support({1,3}) / Support({3}) = 50% / 50% = 100%
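The support and confidence figures for the rule 1 => 3 can be reproduced with a short Python sketch. The four transactions come from the slide; the helper names `support` and `confidence` are assumptions made for illustration.

```python
# The four transactions from the slide, as sets of item IDs.
transactions = [{1, 2, 3}, {1, 3}, {1, 4}, {2, 5, 6}]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(lhs, rhs, baskets):
    """Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)

print("support({1,3})    =", support({1, 3}, transactions))
print("confidence(1->3) =", round(confidence({1}, {3}, transactions), 3))
print("confidence(3->1) =", round(confidence({3}, {1}, transactions), 3))
```

The two directions of the rule share the same support but differ in confidence, because each direction divides by the support of its own left-hand side.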
Association Examples
The Basic Process
Choosing the Right Set of Items
Partial Product Taxonomy (general to specific)
Frozen Foods
  Frozen Desserts
    Frozen Yogurt
    Frozen Fruit Bars
    Ice Cream
      Rocky Road, Chocolate, Strawberry, Vanilla, Cherry Garcia, Other
  Frozen Vegetables
    Peas, Carrots, Mixed, Other
  Frozen Dinners
Example - Minimum Support Pruning / Rule Generation

Transaction ID   Items
1                { 1, 3, 4 }
2                { 2, 3, 5 }
3                { 1, 2, 3, 5 }
4                { 2, 5 }

Scan database, find level of support:
Itemset   Support
{ 1 }     2
{ 2 }     3
{ 3 }     3
{ 4 }     1
{ 5 }     3

After pruning:
Itemset   Support
{ 2 }     3
{ 3 }     3
{ 5 }     3

Find pairings, scan database, find level of support:
Itemset    Support
{ 2, 3 }   2
{ 2, 5 }   3
{ 3, 5 }   2

After pruning:
Itemset    Support
{ 2, 5 }   3

Two rules with the highest support for the two-item set: 2->5 and 5->2
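The scan, prune, and pair steps of this example can be sketched as a level-wise search in Python. The minimum support count of 3 is an assumption inferred from which itemsets survive pruning on the slide, and the function names are illustrative. This simplified version counts every candidate directly rather than applying the full Apriori subset check.

```python
from itertools import combinations

# Transactions from the slide.
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
MIN_COUNT = 3  # assumption: consistent with the pruning shown on the slide

def count(itemset, baskets):
    """Number of baskets containing every item of `itemset`."""
    return sum(itemset <= basket for basket in baskets)

def frequent_itemsets(baskets, min_count):
    """Return {itemset: support count} for all itemsets meeting min_count."""
    items = sorted(set().union(*baskets))
    # Level 1: scan the database and keep the frequent single items.
    level = [frozenset([i]) for i in items
             if count(frozenset([i]), baskets) >= min_count]
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = count(itemset, baskets)
        k += 1
        # Find pairings: join surviving itemsets into candidates of size k.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k}
        # Scan again and prune candidates below the minimum support.
        level = [c for c in candidates if count(c, baskets) >= min_count]
    return result

frequent = frequent_itemsets(transactions, MIN_COUNT)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Because only {2, 5} survives at the second level, the rules 2->5 and 5->2 are the only two-item rules left to evaluate.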
Other Association Rule Applications
Strengths of Market Basket Analysis
Weaknesses of Market Basket Analysis
Decision Tree Learning
Example: Supervised Learning with Decision Trees
Decision Tree Learning
Decision Trees
Decision Tree for Concept: PlayTennis
Outlook?
  Sunny -> Humidity?
    High -> No
    Normal -> Yes
  Overcast -> Yes
  Rain -> Wind?
    Strong -> No
    Light -> Yes
Decision Trees and Decision Boundaries
How to Visualize Decision Trees?
Example: Dividing Instance Space into Axis-Parallel Rectangles
[Figure: positive (+) and negative (-) examples in the x-y plane with boundaries at 1, 3, 5, 7, beside a tree with the tests y > 7?, x < 3?, y < 5?, x < 1?]
More than two variables?
An Illustrative Example
Training Examples for Concept PlayTennis

Day   Outlook    Temperature   Humidity   Wind     PlayTennis?
1     Sunny      Hot           High       Light    No
2     Sunny      Hot           High       Strong   No
3     Overcast   Hot           High       Light    Yes
4     Rain       Mild          High       Light    Yes
5     Rain       Cool          Normal     Light    Yes
6     Rain       Cool          Normal     Strong   No
7     Overcast   Cool          Normal     Strong   Yes
8     Sunny      Mild          High       Light    No
9     Sunny      Cool          Normal     Light    Yes
10    Rain       Mild          Normal     Light    Yes
11    Sunny      Mild          Normal     Strong   Yes
12    Overcast   Mild          High       Strong   Yes
13    Overcast   Hot           Normal     Light    Yes
14    Rain       Mild          High       Strong   No
Constructing a Decision Tree for PlayTennis
The Initial Decision Tree with One Leaf
[9+, 5-]
E(D) = min(9/14, 5/14) = 5/14 = 36%
Constructing a Decision Tree for PlayTennis
Potential Splits of Root Node [9+, 5-]

Outlook:       Sunny [2+, 3-], Overcast [4+, 0-], Rain [3+, 2-]
Temperature:   Cool [3+, 1-], Mild [4+, 2-], Hot [2+, 2-]
Humidity:      High [3+, 4-], Normal [6+, 1-]
Wind:          Light [6+, 2-], Strong [3+, 3-]

E(Split/Outlook) = (5/14) - ((5/14)(min(2/5,3/5)) + (4/14)(min(4/4,0/4)) + (5/14)(min(3/5,2/5))) = 7%
E(Split/Temperature) = (5/14) - ((4/14)(min(3/4,1/4)) + (6/14)(min(4/6,2/6)) + (4/14)(min(2/4,2/4))) = 0%
E(Split/Humidity) = (5/14) - ((7/14)(min(3/7,4/7)) + (7/14)(min(6/7,1/7))) = 7%
E(Split/Wind) = (5/14) - ((8/14)(min(6/8,2/8)) + (6/14)(min(3/6,3/6))) = 0%
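The four split scores above can be recomputed with a short Python sketch. The class counts are copied from the slide; the function names `error` and `error_reduction` are illustrative assumptions. The measure is the misclassification error used on the slide, not entropy or information gain.

```python
def error(pos, neg):
    """Misclassification error of a node that predicts its majority class."""
    return min(pos, neg) / (pos + neg)

def error_reduction(parent, children):
    """Parent error minus the size-weighted error of the child nodes."""
    total = sum(parent)
    weighted = sum((p + n) / total * error(p, n) for p, n in children)
    return error(*parent) - weighted

# (positive, negative) counts per branch, as given on the slide.
parent = (9, 5)
splits = {
    "Outlook": [(2, 3), (4, 0), (3, 2)],       # Sunny, Overcast, Rain
    "Temperature": [(3, 1), (4, 2), (2, 2)],   # Cool, Mild, Hot
    "Humidity": [(3, 4), (6, 1)],              # High, Normal
    "Wind": [(6, 2), (3, 3)],                  # Light, Strong
}

for attribute, children in splits.items():
    print(f"{attribute:12} {error_reduction(parent, children):.3f}")
```

Outlook and Humidity tie at 1/14 under this measure, while Temperature and Wind give no reduction at all.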
Constructing a Decision Tree for PlayTennis
Outlook?  examples 1-14 [9+, 5-]
  Sunny: 1, 2, 8, 9, 11 [2+, 3-] -> Humidity?
    High: 1, 2, 8 [0+, 3-] -> No
    Normal: 9, 11 [2+, 0-] -> Yes
  Overcast: 3, 7, 12, 13 [4+, 0-] -> Yes
  Rain: 4, 5, 6, 10, 14 [3+, 2-] -> Wind?
    Strong: 6, 14 [0+, 2-] -> No
    Light: 4, 5, 10 [3+, 0-] -> Yes
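The finished tree can be written down as a nested data structure and checked against the 14 training examples. This is a minimal sketch: the nested-dict representation and the `classify` helper are assumptions for illustration, and only the three attributes the tree actually tests are stored for each day.

```python
# The learned tree: an internal node is (attribute, {value: subtree});
# a leaf is the predicted label.
tree = ("Outlook", {
    "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain": ("Wind", {"Strong": "No", "Light": "Yes"}),
})

def classify(node, example):
    """Follow the branches matching the example until a leaf is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[example[attribute]]
    return node

# Days 1 to 14 from the training table: (Outlook, Humidity, Wind, PlayTennis).
rows = [
    ("Sunny", "High", "Light", "No"), ("Sunny", "High", "Strong", "No"),
    ("Overcast", "High", "Light", "Yes"), ("Rain", "High", "Light", "Yes"),
    ("Rain", "Normal", "Light", "Yes"), ("Rain", "Normal", "Strong", "No"),
    ("Overcast", "Normal", "Strong", "Yes"), ("Sunny", "High", "Light", "No"),
    ("Sunny", "Normal", "Light", "Yes"), ("Rain", "Normal", "Light", "Yes"),
    ("Sunny", "Normal", "Strong", "Yes"), ("Overcast", "High", "Strong", "Yes"),
    ("Overcast", "Normal", "Light", "Yes"), ("Rain", "High", "Strong", "No"),
]

examples = [
    ({"Outlook": o, "Humidity": h, "Wind": w}, label) for o, h, w, label in rows
]
correct = sum(classify(tree, x) == label for x, label in examples)
print(f"{correct} of {len(examples)} training examples classified correctly")
```

Every leaf in the slide's tree is pure, so the tree fits the training data without error; that says nothing about how well it generalizes to unseen days.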
Strengths of Decision Trees
Weaknesses of Decision Trees
Visualization
Visualization Example: Naïve Bayesian Three Flower Types; Petal and Sepal Based Classification
Naïve Bayesian Visualization
Notice that Iris-versicolor has a 33% likelihood.
Rule Association Visualization
Discovery Using Rule Association
Parallel Coordinates - Visualization
Scatterplots - Visualization
Image To Knowledge (I2K): Data Visualization
Image To Knowledge (I2K): Visualization of Results
T2K - Text to Knowledge: Topic Evolution
Protein Consumption Dynamics
Data Comparison, Reduction & Synthesis
Summary

Introduction To Data Mining

  • 2.
  • 3.
  • 4. Literature Data Mining – Concepts and Techniques by J. Han & M. Kamber, Morgan Kaufmann Publishers, 2001 Pattern Classification by R. Duda, P. Hart and D. Stork, 2 nd edition, John Wiley & Sons, 2001
  • 5. Introduction to Knowledge Discovery in Databases and Data Mining
  • 10. Data Mining: Confluence of Multiple Disciplines. 20x20 grid ~ 2^400 ≈ 10^120 patterns
  • 20. Data Mining and Business Intelligence
  • 21. Knowledge Discovery in Databases Process
  • 31. Data Flow Programming Environment: D2K. Interface elements: Jump Up Panes, Workspace, Tool Bar, Tool Menu, Side Tab Panes
  • 32. D2K Programming and Runtime Environment
  • 33. Streamlined Data Mining Environment: D2K SL. Interface elements: KDD Steps, Session, KDD Options, Workspace
  • 35. Data Mining at Work
      Data Sources: Single, Multiple, Numerous
      Project Objectives: Diagnostics, Decision Support, Automation
      Projects: Target Marketing; Effluent Quality Control; Transaction Management; Cost Prediction (Warranty, Insurance Claims); Warranty Clustering; Territorial Ratemaking; Web Information Retrieval, Archival and Clustering; Auto Loss Ratio Predictions; Precision Farming; Bio-Informatics; Functional Foods; Heterogeneous Data Visualization; Crime Data Analysis; Data Fusion and Visualization; Survey Study of Disability
  • 36. Examples of Data Mining Methods
  • 38. Association Rules and Market Basket Analysis
  • 40. Market Basket Example
      Is soda typically purchased with bananas? Does the brand of soda make a difference?
      Where should detergents be placed in the store to maximize their sales?
      Are window cleaning products purchased when detergents and orange juice are bought together?
      How are the demographics of the neighborhood affecting what customers are buying?
  • 43. How Does It Work?
      Grocery Point-of-Sale Transactions
        Customer  Items
        1         Orange Juice, Soda
        2         Milk, Orange Juice, Window Cleaner
        3         Orange Juice, Detergent
        4         Orange Juice, Detergent, Soda
        5         Window Cleaner, Soda
      Co-Occurrence of Products
                        OJ  Window Cleaner  Milk  Soda  Detergent
        OJ              4   1               1     2     2
        Window Cleaner  1   2               1     1     0
        Milk            1   1               1     0     0
        Soda            2   1               0     3     1
        Detergent       2   0               0     1     2
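The co-occurrence counts on this slide can be tallied directly from the five point-of-sale transactions. The following is a minimal sketch, not the method used in the course: the function name `co_occurrence` and the choice of a `Counter` keyed by item pairs are assumptions made for illustration. Counting from the transactions as listed, Orange Juice and Detergent appear together twice (customers 3 and 4).

```python
from collections import Counter
from itertools import combinations

# The five grocery transactions listed on the slide.
transactions = [
    {"Orange Juice", "Soda"},
    {"Milk", "Orange Juice", "Window Cleaner"},
    {"Orange Juice", "Detergent"},
    {"Orange Juice", "Detergent", "Soda"},
    {"Window Cleaner", "Soda"},
]


def co_occurrence(baskets):
    """Count, for every pair of items, how many baskets contain both.

    The diagonal entry (item, item) is the number of baskets containing
    that item. The result is symmetric: (a, b) and (b, a) are both stored.
    """
    counts = Counter()
    for basket in baskets:
        for item in basket:
            counts[(item, item)] += 1
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


counts = co_occurrence(transactions)

# Print the matrix row by row in a fixed item order.
items = ["Orange Juice", "Window Cleaner", "Milk", "Soda", "Detergent"]
for row in items:
    print(f"{row:15}", [counts[(row, col)] for col in items])
```

Pairs that never occur together, such as Milk and Soda, are simply absent from the `Counter` and read back as 0.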
  • 47. Confidence and Support
      Transaction ID  Items
      1               { 1, 2, 3 }
      2               { 1, 3 }
      3               { 1, 4 }
      4               { 2, 5, 6 }
      Frequent One Item Set  Support
      { 1 }                  75 %
      { 2 }                  50 %
      { 3 }                  50 %
      { 4 }                  25 %
      Frequent Two Item Set  Support
      { 1, 2 }               25 %
      { 1, 3 }               50 %
      { 1, 4 }               25 %
      { 2, 3 }               25 %
      For minimum support = 50% = 2 transactions and minimum confidence = 50%, for the rule 1 => 3:
      Support = Support({1,3}) = 50%
      Confidence(1->3) = Support({1,3}) / Support({1}) = 50% / 75% ≈ 67%
      Confidence(3->1) = Support({1,3}) / Support({3}) = 50% / 50% = 100%
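The support and confidence figures on this slide follow from two short definitions. The sketch below reproduces them for the four transactions shown. The helper names `support` and `confidence` are illustrative assumptions, not taken from the slides.

```python
# The four transactions from the slide, as sets of item IDs.
transactions = [{1, 2, 3}, {1, 3}, {1, 4}, {2, 5, 6}]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Support of the combined item set divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


print(support({1}, transactions))          # support of {1}
print(support({1, 3}, transactions))       # support of the rule 1 => 3
print(confidence({1}, {3}, transactions))  # confidence of 1 -> 3
print(confidence({3}, {1}, transactions))  # confidence of 3 -> 1
```

The asymmetry between the two confidence values comes only from the denominator: item 1 appears in three transactions but item 3 in two, and every transaction containing 3 also contains 1.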
  • 50. Choosing the Right Set of Items
      Partial Product Taxonomy (from General to Specific)
      Frozen Foods
        Frozen Desserts
          Frozen Yogurt
          Frozen Fruit Bars
          Ice Cream
            Rocky Road, Chocolate, Strawberry, Vanilla, Cherry Garcia, Other
        Frozen Vegetables
          Peas, Carrots, Mixed, Other
        Frozen Dinners
  • 51. Example - Minimum Support Pruning / Rule Generation
      Transaction ID  Items
      1               { 1, 3, 4 }
      2               { 2, 3, 5 }
      3               { 1, 2, 3, 5 }
      4               { 2, 5 }
      Scan Database, Find Pairings, Find Level of Support:
      Itemset  Support
      { 1 }    2
      { 2 }    3
      { 3 }    3
      { 4 }    1
      { 5 }    3
      After minimum support pruning:
      Itemset  Support
      { 2 }    3
      { 3 }    3
      { 5 }    3
      Scan Database, Find Pairings, Find Level of Support (pairings of { 2 }, { 3 }, { 5 }):
      Itemset   Support
      { 2, 3 }  2
      { 2, 5 }  3
      { 3, 5 }  2
      After minimum support pruning:
      Itemset   Support
      { 2, 5 }  3
      Two rules with the highest support for two item set: 2->5 and 5->2
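The two scan-and-prune passes in this example can be sketched in a few lines. The slide does not state the minimum support threshold. The value of 3 transactions used below is an assumption inferred from the fact that { 1 } (support 2) is pruned while { 2 }, { 3 } and { 5 } (support 3) survive. The helper names are likewise illustrative.

```python
from itertools import combinations

# The four transactions from the slide.
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

# Assumption: not stated on the slide, inferred from which item sets survive.
MIN_SUPPORT = 3


def count_support(candidates, baskets):
    """Scan the database once and count the baskets containing each candidate."""
    return {c: sum(c <= basket for basket in baskets) for c in candidates}


def prune(supports, min_support):
    """Keep only the item sets that meet the minimum support."""
    return {c: s for c, s in supports.items() if s >= min_support}


# Pass 1: single items.
all_items = sorted(set().union(*transactions))
level1 = count_support([frozenset([i]) for i in all_items], transactions)
frequent1 = prune(level1, MIN_SUPPORT)

# Pass 2: pairings built only from the items that survived pass 1.
survivors = sorted(i for itemset in frequent1 for i in itemset)
pairs = [frozenset(p) for p in combinations(survivors, 2)]
level2 = count_support(pairs, transactions)
frequent2 = prune(level2, MIN_SUPPORT)

print({tuple(sorted(k)): v for k, v in level1.items()})
print({tuple(sorted(k)): v for k, v in level2.items()})
print({tuple(sorted(k)): v for k, v in frequent2.items()})
```

Building pairs only from surviving single items is what keeps the second scan small: any pair containing item 1 or 4 cannot reach the threshold, so it is never counted.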
  • 56. Example: Supervised Learning with Decision Trees
  • 59. Decision Tree for Concept: PlayTennis
      Outlook?
        Sunny: Humidity?
          High: No
          Normal: Yes
        Overcast: Yes
        Rain: Wind?
          Strong: No
          Light: Yes
  • 60. Decision Trees and Decision Boundaries
      How to Visualize Decision Trees? Example: Dividing Instance Space into Axis-Parallel Rectangles. More than two variables?
      [Figure: + and - examples plotted against x and y, with axis ticks at 1, 3, 5, 7, beside a tree with the tests y > 7?, x < 3?, y < 5?, x < 1?, each with No/Yes branches]
  • 61. An Illustrative Example
      Training Examples for Concept PlayTennis
      Day  Outlook   Temperature  Humidity  Wind    PlayTennis?
      1    Sunny     Hot          High      Light   No
      2    Sunny     Hot          High      Strong  No
      3    Overcast  Hot          High      Light   Yes
      4    Rain      Mild         High      Light   Yes
      5    Rain      Cool         Normal    Light   Yes
      6    Rain      Cool         Normal    Strong  No
      7    Overcast  Cool         Normal    Strong  Yes
      8    Sunny     Mild         High      Light   No
      9    Sunny     Cool         Normal    Light   Yes
      10   Rain      Mild         Normal    Light   Yes
      11   Sunny     Mild         Normal    Strong  Yes
      12   Overcast  Mild         High      Strong  Yes
      13   Overcast  Hot          Normal    Light   Yes
      14   Rain      Mild         High      Strong  No
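One way to read the PlayTennis decision tree is as a set of nested tests, which can then be checked against the 14 training examples. This is a sketch under an assumption: the branch layout (Sunny tests Humidity, Overcast is always Yes, Rain tests Wind) is reconstructed from the labels on the tree slide and is consistent with the training data, but the function `play_tennis` itself is illustrative.

```python
# Training examples: (Outlook, Temperature, Humidity, Wind, PlayTennis?) for days 1 to 14.
ROWS = [
    ("Sunny", "Hot", "High", "Light", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Light", "Yes"),
    ("Rain", "Mild", "High", "Light", "Yes"),
    ("Rain", "Cool", "Normal", "Light", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Light", "No"),
    ("Sunny", "Cool", "Normal", "Light", "Yes"),
    ("Rain", "Mild", "Normal", "Light", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Light", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def play_tennis(outlook, humidity, wind):
    """Walk the tree: Outlook at the root, then Humidity or Wind."""
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    # Remaining branch: Rain.
    return "Yes" if wind == "Light" else "No"


# Days on which the tree disagrees with the recorded label.
mistakes = [
    day
    for day, (outlook, _temp, humidity, wind, label) in enumerate(ROWS, start=1)
    if play_tennis(outlook, humidity, wind) != label
]
print("misclassified days:", mistakes)
```

Temperature is never consulted, so the tree separates the training data using only three of the four attributes.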
  • 63. Constructing a Decision Tree for PlayTennis
      Potential Splits of Root Node [9+, 5-]
        Outlook: Sunny [2+, 3-], Overcast [4+, 0-], Rain [3+, 2-]
        Temperature: Cool [3+, 1-], Mild [4+, 2-], Hot [2+, 2-]
        Humidity: High [3+, 4-], Normal [6+, 1-]
        Wind: Light [6+, 2-], Strong [3+, 3-]
      E(Split/Outlook) = (5/14) – ((5/14)(min(2/5,3/5)) + (4/14)(min(4/4,0/4)) + (5/14)(min(3/5,2/5))) = 7%
      E(Split/Temperature) = (5/14) – ((4/14)(min(3/4,1/4)) + (6/14)(min(4/6,2/6)) + (4/14)(min(2/4,2/4))) = 0%
      E(Split/Humidity) = (5/14) – ((7/14)(min(3/7,4/7)) + (7/14)(min(6/7,1/7))) = 7%
      E(Split/Wind) = (5/14) – ((8/14)(min(6/8,2/8)) + (6/14)(min(3/6,3/6))) = 0%
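The four E(Split/...) values are reductions in misclassification error: the error at the root, 5/14, minus the weighted error of the branches after the split. The sketch below recomputes them from the [positive, negative] counts on the slide, using exact fractions so that the zero cases come out as exactly zero. The names `SPLITS`, `error` and `error_reduction` are illustrative assumptions.

```python
from fractions import Fraction

# (positive, negative) counts at the root and in each branch, from the slide.
ROOT = (9, 5)
SPLITS = {
    "Outlook": [(2, 3), (4, 0), (3, 2)],      # Sunny, Overcast, Rain
    "Temperature": [(3, 1), (4, 2), (2, 2)],  # Cool, Mild, Hot
    "Humidity": [(3, 4), (6, 1)],             # High, Normal
    "Wind": [(6, 2), (3, 3)],                 # Light, Strong
}


def error(pos, neg):
    """Misclassification error when a node predicts its majority class."""
    return Fraction(min(pos, neg), pos + neg)


def error_reduction(root, branches):
    """Root error minus the size-weighted error of the branches."""
    total = sum(root)
    after = sum(Fraction(p + n, total) * error(p, n) for p, n in branches)
    return error(*root) - after


reductions = {name: error_reduction(ROOT, b) for name, b in SPLITS.items()}
for name, value in reductions.items():
    print(f"{name:12} {value}  ({float(value):.1%})")
```

Under this measure Outlook and Humidity tie at 1/14, about 7%, while Temperature and Wind give no reduction at all.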
  • 68. Visualization Example: Naïve Bayesian Three Flower Types; Petal and Sepal Based Classification