Lecture Notes 1: Introduction to Data Mining. Zhangxi Lin, Texas Tech University. ISQS 6347, Data & Text Mining
What is Data Mining?
Data Mining Process
What is Text Mining? (figure labels: Patterns, Trends, Associations)
Motivation for Text Mining (chart labels: 90%; Structured Numerical or Coded Information; 10%; Unstructured or Semi-structured Information)
Text Mining Process
Why Mine Data? Commercial Viewpoint
Why Mine Data? Scientific Viewpoint
Origins of Data Mining (diagram labels: Machine Learning/Pattern Recognition, Statistics/AI, Database systems, Data Mining)
Data Mining Tasks
Classification: Definition
Classification Example (figure labels: categorical, categorical, continuous, class; Training Set, Learn Classifier, Model, Test Set)
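The workflow named in the classification figure (learn a classifier from a training set, then check the resulting model on a test set) can be sketched in a few lines. This is a minimal illustration only: the records, the attribute names, the income cut of 80, and the choice of a one-rule (OneR) learner are all assumptions made for the example, not the lecture's data or method.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, not taken from the lecture:
# (owns_home, marital_status, income_in_thousands, class_label)
TRAIN = [
    ("yes", "single", 125, "no"),
    ("no", "married", 100, "no"),
    ("no", "single", 70, "yes"),
    ("yes", "married", 120, "no"),
    ("no", "divorced", 95, "yes"),
    ("no", "married", 60, "no"),
    ("yes", "married", 220, "no"),
    ("no", "single", 85, "yes"),
    ("no", "married", 75, "no"),
    ("no", "single", 90, "yes"),
]
TEST = [
    ("no", "single", 88, "yes"),
    ("yes", "married", 150, "no"),
    ("no", "widowed", 80, "no"),   # value never seen in training
    ("yes", "single", 130, "no"),  # the learned rule gets this one wrong
]


def features(row, cut=80):
    """Turn a record into discrete features; `cut` is an arbitrary income bin edge."""
    home, status, income = row[:3]
    return (home, status, "high" if income >= cut else "low")


def learn_one_rule(rows):
    """OneR: keep the single attribute whose per-value majority vote errs least."""
    default = Counter(r[-1] for r in rows).most_common(1)[0][0]
    best = None
    for i in range(3):
        votes = defaultdict(Counter)
        for r in rows:
            votes[features(r)[i]][r[-1]] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in votes.items()}
        errors = sum(rule[features(r)[i]] != r[-1] for r in rows)
        if best is None or errors < best[0]:
            best = (errors, i, rule)
    _, attr, rule = best
    return {"attr": attr, "rule": rule, "default": default}


def predict(model, row):
    """Apply the learned rule; fall back to the majority class for unseen values."""
    return model["rule"].get(features(row)[model["attr"]], model["default"])


model = learn_one_rule(TRAIN)
accuracy = sum(predict(model, r) == r[-1] for r in TEST) / len(TEST)
print(model["attr"], model["rule"], accuracy)
```

Accuracy is measured on records the learner never saw, which is the reason for holding out a test set in the first place.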
Classification: Application 1 (From [Berry & Linoff] Data Mining Techniques, 1997)
Classification: Application 2
Clustering Definition
Illustrating Clustering: Intracluster distances are minimized; intercluster distances are maximized.
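The two quantities in this illustration can be computed directly for any cluster assignment. The sketch below assumes Euclidean distance and uses made-up 2-D points; `GOOD`, `BAD`, and `mean_distance` are names introduced here for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical 2-D points mapped to a cluster label.
GOOD = {
    (0.0, 0.0): "A", (0.0, 1.0): "A", (1.0, 0.0): "A",
    (10.0, 10.0): "B", (10.0, 11.0): "B", (11.0, 10.0): "B",
}
# Same points, but with one point from each group swapped into the other cluster.
BAD = dict(GOOD)
BAD[(0.0, 0.0)] = "B"
BAD[(10.0, 10.0)] = "A"


def mean_distance(assignment, same_cluster):
    """Average distance over point pairs that are (or are not) in the same cluster."""
    pairs = [
        dist(p, q)
        for (p, cp), (q, cq) in combinations(assignment.items(), 2)
        if (cp == cq) == same_cluster
    ]
    return sum(pairs) / len(pairs)


for name, assignment in (("good", GOOD), ("bad", BAD)):
    intra = mean_distance(assignment, same_cluster=True)
    inter = mean_distance(assignment, same_cluster=False)
    print(f"{name}: intracluster={intra:.2f} intercluster={inter:.2f}")
```

For the well-separated assignment the intracluster average is far smaller than the intercluster average; swapping two points across clusters raises the first and lowers the second.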
Clustering Example
Association Rule Discovery: Definition. Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
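Rules such as {Milk} --> {Coke} are usually scored by support (the fraction of transactions containing every item in the rule) and confidence (among transactions containing the left-hand side, the fraction that also contain the right-hand side). The sketch below evaluates the two rules above; the transaction list is a hypothetical example, since the slide's own table did not survive extraction, and the function names are introduced here for illustration.

```python
# Hypothetical market-basket transactions (illustrative only).
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in TRANSACTIONS)


def support(lhs, rhs):
    """Fraction of all transactions containing both sides of the rule."""
    return count(lhs | rhs) / len(TRANSACTIONS)


def confidence(lhs, rhs):
    """Among transactions containing `lhs`, the fraction that also contain `rhs`."""
    return count(lhs | rhs) / count(lhs)


RULES = [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]
for lhs, rhs in RULES:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs, rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

A rule discovery algorithm would enumerate candidate rules and keep those whose support and confidence exceed chosen thresholds; this sketch only scores rules that are already given.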
Association Rule Discovery Example
Regression
Deviation/Anomaly Detection. Typical network traffic at the university level may reach over 100 million connections per day.
Text Mining Tasks
Example: Decision Support using Bank Call Center Data
Example: Decision Support using Bank Call Center Data. Sample record: AC2G31, 01, 0101, PCC, 021, 0053352, NEW YORK, NY, H-SUPRVR8, STMT, “Mr. Stark has been with the company for about 20 yrs. He hates his stmt format and wishes that we would show a daily balance to help him know when he falls below the required balance on the account.”
Challenges of Data Mining
Challenges of Text Mining
SAS Training/Self-taught Courses
