SlideShare una empresa de Scribd logo
1 de 26
27/Sep/2008




Data Mining   July 16, 2009        1
Evolution of Database
              technology
YEAR       PURPOSE
1960’s     Network Model, Batch Reports

1970’s     Relational data model, Executive information Systems

1980’s     Application specific DBMS(spatial data, scientific data,
           image data, …)
1990’s     Terabyte Data warehouses, Object Oriented, middleware
           and web technology
2000’s     Business Process

2010’s     Sensor DB systems, DBs on embedded systems, large
           scale pub/ sub systems
                                             Data Mining   July 16, 2009   2
Motivation : Necessity is the
       mother of invention
   Data explosion problem

    ◦ Automated data collection tools and mature database technology
      lead to tremendous amounts of data stored in databases, data
      warehouses and other information repositories
   We are drowning in data, but starving for knowledge!
   Solution: Data warehousing and data mining

    ◦ Extraction of interesting knowledge (rules, regularities, patterns,
      constraints) from data in large databases



                                                  Data Mining   July 16, 2009   3
Why Data Mining?


      Data, Data, Data Every where …

         I can’t find data I need – data is
          scattered over network

         I can’t get the data I need

         I can’t understand the data I
          need

         I can’t use the data I found


                      Data Mining   July 16, 2009   4
   An abundance of data                 This data occupies
     Super Market Scanners, POS
     data
                                           Terabytes - 10^12 bytes
     Credit cards transactions
     Call Center records
                                           Petabytes - 10^15 bytes
     ATM Machines
     Demographic data
                                           Exabytes - 10^18bytes
     Sensor Networks
     Cameras
                                           Zettabytes - 10^21bytes
     Web server logs
     Customer web site trails
                                           Zottabytes-10^24bytes
     Geographic Information System
     National Medical Records             Walmart - 24 Terabytes
     Weather Images



                                                Data Mining   July 16, 2009   5
   Process of sorting through large amounts of data and picking
    out relevant information

   Process of analyzing data from different perspectives and
    summarizing it into useful information

   Discovering hidden value in database

   It is non-trivial process of identifying valid, novel, useful and
    understandable patterns in data

   Extracting or mining knowledge from large amounts of data


                                              Data Mining   July 16, 2009   6
History Notes – Many Names of Data
              Mining

 YEAR            Names                           USES


  1960    Data Fishing, Data     Statisticians
          Dredging
  1990    Data Mining            DB Community, business


  1989    Knowledge Discovery    AI, Machine Learning community
          in databases
Other Names

Data Archaeology, Information Harvesting, Information Discovery,
Knowledge Extraction,


                                                 Data Mining   July 16, 2009   7
Data Warehousing provides the
                            Enterprise with a memory




         Data Mining provides the
        Enterprise with intelligence

July 16, 2009                      Data Mining      8
Why Data Mining?(Cont..)

   Data Warehouse is single, complete and consistent store of data from
    variety of different sources available to end users

   For example, AT and T handles billions of calls per day. Europe's Very
    Long Baseline Interferometer (VLBI) has 16 telescopes, each of which
    produces 1 Gigabit/second of astronomical data over a 25-day
    observation session

   We need data mining for
      Transforming data into useful information to users
      Present data in useful format
      Provide data access to business analyst, Information technology
       professionals



                                                 Data Mining   July 16, 2009   9
Data Mining Process
   Data Mining is the technique used to carry out KDD.

   Data Mining turns data into information and then to knowledge


                             Information




                   Data

                                           Knowledge



                                              Data Mining   July 16, 2009   10
Steps in Data Mining
1. Data cleaning
        To remove noise and inconsistent data
2. Data integration
   To integrate (compile) multiple data
sources
3. Data selection
   Data relevant to analysis is selected
4. Data transformation
   Summary normalization aggregation operations are performed
   (convert data into two dimension form) and consolidate the data



                                           Data Mining   July 16, 2009   11
Steps in Data Mining(Cont..)
5. Data mining
 Intelligent methods are applied to the data to discover
 knowledge or patterns

6. Pattern evaluation
 Evaluation of the interesting patterns by thresholding

7. Knowledge Discovery
 Visualization and presentation methods are used to present
 the mined knowledge to the user.


                                           Data Mining   July 16, 2009   12
Pattern Evaluation
◦ Data mining: the core of
  knowledge discovery
  process.                         Data Mining

                    Task-relevant Data


      Data                   Selection
      Warehouse
Data Cleaning

          Data Integration


        Databases
                                                 Data Mining   July 16, 2009   13
Data Mining Tasks
1. Classification
•   Classification maps data into predefined groups or classes.
•   It may be represented by methods such as decision trees, etc.

Decision tree
 Flow chart like tree structure
 Each node denotes test of
  an attribute value
 Each branch represents
  outcome of test
 Leaves represent classes
  or class distribution.


                                            Data Mining   July 16, 2009   14
2. Regression
Used to map a data item to a real valued prediction variable.
Example. A manager wants to reach a certain level of savings before his
  retirement. Periodically he predicts his retirement savings by current value
  and several past values. He uses a simple linear regressive formula to
  predict the values of savings in future.


3. Prediction
Many real world applications can be seen
predicting future data states based on
past and current data.
Example -   Predicting flooding is difficult problem


                                                         Data Mining   July 16, 2009   15
4. Clustering
Clustering is similar to classification
except that the groups are not predefined.
5. Association Rule
Association refers to uncovering relationship                              1998
among data.
Used in retail sales community to identify the items                       Bread and
(products) that are frequently                                              Jam sell
                                             Zzzz...
purchased together.                                                         together!




                                             Data Mining   July 16, 2009            16
6. Summarization
Summarization of general characteristics or features of target class of
  data.
Data characterization presented in various forms - pie charts, bar
  charts, curves.
Data discrimination comparison of general features of target class of
  data objects with general features of objects from one or a set of
  contrasting classes.
7. Outlier Analysis
Database may contain data objects that do not comply with general
  behavior model of data. These data objects are called as outliers.
Data mining methods discard outliers as noise or exceptions.
In applications such as fraud detection, rare events may be more
  interesting than regularly occurring events.
                                               Data Mining   July 16, 2009   17
Data Mining: Types of Data

   Relational data and transactional data

   Text

   Images, video

   Mixtures of data




                                         Data Mining   July 16, 2009   18
Data Mining Products

   DataMind -- neurOagent
   Information Discovery -- IDIS
   SAS Institute -- SAS/Neuronets




                                      19
                             Data Mining   July 16, 2009
Data Mining Software
   RapidMiner and Weka – Defining data mining process

   Top 8 data mining software in 2008

           Angoss software
           Infor CRM Epiphany
           Portrait Software
           SAS
           SPSS
           ThinkAnalytics
           Unica
           Viscovery


                                            Data Mining   July 16, 2009   20
Application Areas


       Industry            Application
       Finance             Credit Card Analysis
       Insurance           Fraud Analysis
       Telecommunication   Call record analysis




July 16, 2009                Data Mining          21
Applications
   Financial Industry, Banks, Businesses, E-commerce
    ◦ Stock and investment analysis
    ◦ Identify loyal customers and risky customer
    ◦ Predict customer spending

   Database analysis and decision support
    ◦ Market analysis and management
      target marketing, customer relation management, market basket
       analysis.
    ◦ Risk analysis and management
      Forecasting, quality control, competitive analysis
    ◦ Fraud detection and management

                                                   Data Mining   July 16, 2009   22
Data Mining in Usage

1.   Intelligent Miner
    It is IBM data mining product
    Distinct feature is include scalability of its mining algorithm and tight
     integration with IBM DB2 related data base system.


5.   DB Miner
      Developed by DBMiner Technologies Inc.
     Distinct features of DBMiner are Data cube based Online Analytical
     Mining



                                                   Data Mining   July 16, 2009   23
The Telecomm Slice
Product




Household

Telecomm          o ns
              e gi
             R
   Video                 Europe
                  Far East
   Audio        India

            Retail Direct    Special            Sales Channel




                                             Data Mining   July 16, 2009   24
Conclusion
   Data mining: discovering interesting patterns from large amounts of
    data
   A KDD process includes data cleaning, data integration, data
    selection, transformation, data mining, pattern evaluation, and
    knowledge presentation
   Mining can be performed in a variety of information repositories
   Data mining functionalities: characterization,               discrimination,
    association, classification, clustering, outlier etc




                                                 Data Mining   July 16, 2009       25
Thank you !!!
         Data Mining   July 16, 2009   26

Más contenido relacionado

La actualidad más candente

Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and workAmr Abd El Latief
 
Big Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewBig Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewSivashankar Ganapathy
 
Introduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data MiningIntroduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data MiningAarshDhokai
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2RojaT4
 
What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)Pratik Tambekar
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profilingShailja Khurana
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
Lecture1 introduction to big data
Lecture1 introduction to big dataLecture1 introduction to big data
Lecture1 introduction to big datahktripathy
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data miningHadi Fadlallah
 
Data mining
Data mining Data mining
Data mining AthiraR23
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence ArchitecturePhilippe Julio
 
Big Data Visualization
Big Data VisualizationBig Data Visualization
Big Data VisualizationRaffael Marty
 

La actualidad más candente (20)

Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
 
Big Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewBig Data - Applications and Technologies Overview
Big Data - Applications and Technologies Overview
 
Introduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data MiningIntroduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data Mining
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 
What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profiling
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Lecture1 introduction to big data
Lecture1 introduction to big dataLecture1 introduction to big data
Lecture1 introduction to big data
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
Data mining tasks
Data mining tasksData mining tasks
Data mining tasks
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data mining
 
Data mining
Data mining Data mining
Data mining
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence Architecture
 
Big Data Visualization
Big Data VisualizationBig Data Visualization
Big Data Visualization
 

Destacado

Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an IntroductionAli Abbasi
 
Text mining presentation in Data mining Area
Text mining presentation in Data mining AreaText mining presentation in Data mining Area
Text mining presentation in Data mining AreaMahamudHasanCSE
 
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...Ryan Rosario
 
Machine Learning and Data Mining: 19 Mining Text And Web Data
Machine Learning and Data Mining: 19 Mining Text And Web DataMachine Learning and Data Mining: 19 Mining Text And Web Data
Machine Learning and Data Mining: 19 Mining Text And Web DataPier Luca Lanzi
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data MiningAmritanshu Mehra
 
Data warehousing and data mining
Data warehousing and data miningData warehousing and data mining
Data warehousing and data miningSnehali Chake
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kambererror007
 
Data Mining and Data Warehousing
Data Mining and Data WarehousingData Mining and Data Warehousing
Data Mining and Data WarehousingAmdocs
 
introduction to data mining tutorial
introduction to data mining tutorial introduction to data mining tutorial
introduction to data mining tutorial Salah Amean
 
Data Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsData Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsMotaz Saad
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDataminingTools Inc
 
Ch 1 Intro to Data Mining
Ch 1 Intro to Data MiningCh 1 Intro to Data Mining
Ch 1 Intro to Data MiningSushil Kulkarni
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 

Destacado (19)

Big Data v Data Mining
Big Data v Data MiningBig Data v Data Mining
Big Data v Data Mining
 
Data mining and_big_data_web
Data mining and_big_data_webData mining and_big_data_web
Data mining and_big_data_web
 
Lecture 01 Data Mining
Lecture 01 Data MiningLecture 01 Data Mining
Lecture 01 Data Mining
 
Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
 
Text mining presentation in Data mining Area
Text mining presentation in Data mining AreaText mining presentation in Data mining Area
Text mining presentation in Data mining Area
 
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
 
Machine Learning and Data Mining: 19 Mining Text And Web Data
Machine Learning and Data Mining: 19 Mining Text And Web DataMachine Learning and Data Mining: 19 Mining Text And Web Data
Machine Learning and Data Mining: 19 Mining Text And Web Data
 
Analytics and Data Mining Industry Overview
Analytics and Data Mining Industry OverviewAnalytics and Data Mining Industry Overview
Analytics and Data Mining Industry Overview
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining
 
Data warehousing and data mining
Data warehousing and data miningData warehousing and data mining
Data warehousing and data mining
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Data Mining and Data Warehousing
Data Mining and Data WarehousingData Mining and Data Warehousing
Data Mining and Data Warehousing
 
introduction to data mining tutorial
introduction to data mining tutorial introduction to data mining tutorial
introduction to data mining tutorial
 
Data Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsData Mining and Business Intelligence Tools
Data Mining and Business Intelligence Tools
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
Ch 1 Intro to Data Mining
Ch 1 Intro to Data MiningCh 1 Intro to Data Mining
Ch 1 Intro to Data Mining
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
Data mining
Data miningData mining
Data mining
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 

Similar a Data Mining Overview

Data mining concepts
Data mining conceptsData mining concepts
Data mining conceptsBasit Rafiq
 
Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1DanWooster1
 
01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.ppt01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.pptadmsoyadm4
 
Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019Edwin S. Garcia
 
Data Mining - Presentation.pptx
Data Mining - Presentation.pptxData Mining - Presentation.pptx
Data Mining - Presentation.pptxfahadusman23
 
Data Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notesData Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notesasnaparveen414
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryYoung Alista
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHarry Potter
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryJames Wong
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryFraboni Ec
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryLuis Goldster
 

Similar a Data Mining Overview (20)

Data mining concepts
Data mining conceptsData mining concepts
Data mining concepts
 
Data mining
Data miningData mining
Data mining
 
Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1
 
Data Mining Intro
Data Mining IntroData Mining Intro
Data Mining Intro
 
data mining
data miningdata mining
data mining
 
01Intro.ppt
01Intro.ppt01Intro.ppt
01Intro.ppt
 
01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.ppt01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.ppt
 
01Intro.ppt
01Intro.ppt01Intro.ppt
01Intro.ppt
 
Chapter 1. Introduction.ppt
Chapter 1. Introduction.pptChapter 1. Introduction.ppt
Chapter 1. Introduction.ppt
 
Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019Data Mining @ BSU Malolos 2019
Data Mining @ BSU Malolos 2019
 
D
DD
D
 
Data Mining - Presentation.pptx
Data Mining - Presentation.pptxData Mining - Presentation.pptx
Data Mining - Presentation.pptx
 
isd314-01
isd314-01isd314-01
isd314-01
 
Data Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notesData Mining mod1 ppt.pdf bca sixth semester notes
Data Mining mod1 ppt.pdf bca sixth semester notes
 
18231979 Data Mining
18231979 Data Mining18231979 Data Mining
18231979 Data Mining
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 

Último

Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 

Último (20)

Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 

Data Mining Overview

  • 1. 27/Sep/2008 Data Mining July 16, 2009 1
  • 2. Evolution of Database technology YEAR PURPOSE 1960’s Network Model, Batch Reports 1970’s Relational data model, Executive information Systems 1980’s Application specific DBMS(spatial data, scientific data, image data, …) 1990’s Terabyte Data warehouses, Object Oriented, middleware and web technology 2000’s Business Process 2010’s Sensor DB systems, DBs on embedded systems, large scale pub/ sub systems Data Mining July 16, 2009 2
  • 3. Motivation : Necessity is the mother of invention  Data explosion problem ◦ Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories  We are drowning in data, but starving for knowledge!  Solution: Data warehousing and data mining ◦ Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases Data Mining July 16, 2009 3
  • 4. Why Data Mining?  Data, Data, Data Every where …  I can’t find data I need – data is scattered over network  I can’t get the data I need  I can’t understand the data I need  I can’t use the data I found Data Mining July 16, 2009 4
  • 5. An abundance of data  This data occupies  Super Market Scanners, POS data  Terabytes - 10^12 bytes  Credit cards transactions  Call Center records  Petabytes - 10^15 bytes  ATM Machines  Demographic data  Exabytes - 10^18bytes  Sensor Networks  Cameras  Zettabytes - 10^21bytes  Web server logs  Customer web site trails  Zottabytes-10^24bytes  Geographic Information System  National Medical Records  Walmart - 24 Terabytes  Weather Images Data Mining July 16, 2009 5
  • 6. Process of sorting through large amounts of data and picking out relevant information  Process of analyzing data from different perspectives and summarizing it into useful information  Discovering hidden value in database  It is non-trivial process of identifying valid, novel, useful and understandable patterns in data  Extracting or mining knowledge from large amounts of data Data Mining July 16, 2009 6
  • 7. History Notes – Many Names of Data Mining YEAR Names USES 1960 Data Fishing, Data Statisticians Dredging 1990 Data Mining DB Community, business 1989 Knowledge Discovery AI, Machine Learning community in databases Other Names Data Archaeology, Information Harvesting, Information Discovery, Knowledge Extraction, Data Mining July 16, 2009 7
  • 8. Data Warehousing provides the Enterprise with a memory Data Mining provides the Enterprise with intelligence July 16, 2009 Data Mining 8
  • 9. Why Data Mining?(Cont..)  Data Warehouse is single, complete and consistent store of data from variety of different sources available to end users  For example, AT and T handles billions of calls per day. Europe's Very Long Baseline Interferometer (VLBI) has 16 telescopes, each of which produces 1 Gigabit/second of astronomical data over a 25-day observation session  We need data mining for  Transforming data into useful information to users  Present data in useful format  Provide data access to business analyst, Information technology professionals Data Mining July 16, 2009 9
  • 10. Data Mining Process  Data Mining is the technique used to carry out KDD.  Data Mining turns data into information and then to knowledge Information Data Knowledge Data Mining July 16, 2009 10
  • 11. Steps in Data Mining 1. Data cleaning To remove noise and inconsistent data 2. Data integration To integrate (compile) multiple data sources 3. Data selection Data relevant to analysis is selected 4. Data transformation Summary normalization aggregation operations are performed (convert data into two dimension form) and consolidate the data Data Mining July 16, 2009 11
  • 12. Steps in Data Mining(Cont..) 5. Data mining Intelligent methods are applied to the data to discover knowledge or patterns 6. Pattern evaluation Evaluation of the interesting patterns by thresholding 7. Knowledge Discovery Visualization and presentation methods are used to present the mined knowledge to the user. Data Mining July 16, 2009 12
  • 13. Pattern Evaluation ◦ Data mining: the core of knowledge discovery process. Data Mining Task-relevant Data Data Selection Warehouse Data Cleaning Data Integration Databases Data Mining July 16, 2009 13
  • 14. Data Mining Tasks 1. Classification • Classification maps data into predefined groups or classes. • It may be represented by methods such as decision trees, etc. Decision tree  Flow chart like tree structure  Each node denotes test of an attribute value  Each branch represents outcome of test  Leaves represent classes or class distribution. Data Mining July 16, 2009 14
  • 15. 2. Regression Used to map a data item to a real valued prediction variable. Example. A manager wants to reach a certain level of savings before his retirement. Periodically he predicts his retirement savings by current value and several past values. He uses a simple linear regressive formula to predict the values of savings in future. 3. Prediction Many real world applications can be seen predicting future data states based on past and current data. Example - Predicting flooding is difficult problem Data Mining July 16, 2009 15
  • 16. 4. Clustering Clustering is similar to classification except that the groups are not predefined. 5. Association Rule Association refers to uncovering relationship 1998 among data. Used in retail sales community to identify the items Bread and (products) that are frequently Jam sell Zzzz... purchased together. together! Data Mining July 16, 2009 16
  • 17. 6. Summarization Summarization of general characteristics or features of target class of data. Data characterization presented in various forms - pie charts, bar charts, curves. Data discrimination comparison of general features of target class of data objects with general features of objects from one or a set of contrasting classes. 7. Outlier Analysis Database may contain data objects that do not comply with general behavior model of data. These data objects are called as outliers. Data mining methods discard outliers as noise or exceptions. In applications such as fraud detection, rare events may be more interesting than regularly occurring events. Data Mining July 16, 2009 17
  • 18. Data Mining: Types of Data  Relational data and transactional data  Text  Images, video  Mixtures of data Data Mining July 16, 2009 18
  • 19. Data Mining Products  DataMind -- neurOagent  Information Discovery -- IDIS  SAS Institute -- SAS/Neuronets 19 Data Mining July 16, 2009
  • 20. Data Mining Software  RapidMiner and Weka – Defining data mining process  Top 8 data mining software in 2008  Angoss software  Infor CRM Epiphany  Portrait Software  SAS  SPSS  ThinkAnalytics  Unica  Viscovery Data Mining July 16, 2009 20
  • 21. Application Areas Industry Application Finance Credit Card Analysis Insurance Fraud Analysis Telecommunication Call record analysis July 16, 2009 Data Mining 21
  • 22. Applications  Financial Industry, Banks, Businesses, E-commerce ◦ Stock and investment analysis ◦ Identify loyal customers and risky customer ◦ Predict customer spending  Database analysis and decision support ◦ Market analysis and management  target marketing, customer relation management, market basket analysis. ◦ Risk analysis and management  Forecasting, quality control, competitive analysis ◦ Fraud detection and management Data Mining July 16, 2009 22
  • 23. Data Mining in Usage 1. Intelligent Miner  It is IBM data mining product  Distinct feature is include scalability of its mining algorithm and tight integration with IBM DB2 related data base system. 5. DB Miner  Developed by DBMiner Technologies Inc.  Distinct features of DBMiner are Data cube based Online Analytical Mining Data Mining July 16, 2009 23
  • 24. The Telecomm Slice Product Household Telecomm o ns e gi R Video Europe Far East Audio India Retail Direct Special Sales Channel Data Mining July 16, 2009 24
  • 25. Conclusion  Data mining: discovering interesting patterns from large amounts of data  A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation  Mining can be performed in a variety of information repositories  Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier etc Data Mining July 16, 2009 25
  • 26. Thank you !!! Data Mining July 16, 2009 26