SlideShare una empresa de Scribd logo
1 de 27
DATA WAREHOUSE
AND
DATA MINING
OVER 1.13 TRILLION LIKES
Which are our
lowest/highest margin
customers ?
Who are my customers
and what products
are they buying?
Which customers
are most likely to go
to the competition ?
What impact will
new products/services
have on revenue
and margins?
What product prom-
-otions have the biggest
impact on revenue?
What is the most
effective distribution
channel?
A PRODUCER WANTS TO KNOW….
DATA, DATA EVERYWHERE
YET ...
 I can’t get the data I need
 need an expert to get the data
 I can’t understand the data I found
 available data poorly documented
 I can’t use the data I found
 results are unexpected
 data needs to be transformed from
one form to other
 I can’t find the data I need
 data is scattered over the network
 many versions, subtle differences
 1960s:
 Data collection, database creation, IMS and network DBMS
 1970s:
 Relational data model, relational DBMS implementation
 1980s:
 RDBMS, advanced data models (extended-
relational, OO, deductive, etc.) and application-oriented DBMS
(spatial, scientific, engineering, etc.)
 1990s—2000s:
 Data mining and data warehousing, multimedia databases, and
Web databases
7
DATA WAREHOUSE
The data warehouse is that portion of an overall
Architected Data Environment that serves as the
single integrated source of data for processing
information.
CHARACTERISTICS
DATA
WAREHOUSE
Subject
Oriented
Integrated
Non
Volatile
Time
variant
Accessible
Process
Oriented
TERMS RELATED
TO DATA WAREHOUSE
DATA MART
STAGING AREA
OLAP
OLAP TOOLS
Data
Acquisition
Warehouse
Design
Analytical
Data Store
Enterprise
Warehouse
Data Marts
Metadata
Directory
Metadata
Repository
DATA
MANAGEMENT
METADATA
MANAGEMENT
Data
Analysis
Web
Information
Systems
Operational,
External &
other
Databases
COMPONENTS OF
DATAWAREHOUSE SYSTEM
HOW IS THE WAREHOUSE
DIFFERENT?
The data warehouse is distinctly different from
the operational data used and maintained by day-
to-day operational systems. Data warehousing is
not simply an “access wrapper” for operational
data, where data is simply “dumped” into tables
for direct access.
OPERATIONAL DATA
Application oriented
Detailed
Accurate, as of the moment
of access
Serves the clerical
community
Performance sensitive
(immediate response required
when entering a transaction)
Flexible structure; variable
contents
Small amount of data used in
a process
DATA WAREHOUSE
Subject oriented
Summarized
Represents values over time
Serves the managerial
community
Performance relaxed
(immediacy not required)
Static structure
large amount of data used in
a process
14
 Data explosion problem
 Automated data collection tools and mature database
technology lead to tremendous amounts of data stored in
databases, data warehouses and other information
repositories
 We are drowning in data, but starving for knowledge!
 Solution: Data warehousing and data mining
 Extraction of interesting knowledge (rules, regularities,
patterns, constraints) from data in large databases
15
 Data mining (knowledge discovery in databases):
 Extraction of interesting (non-trivial, implicit, previously
unknown and potentially useful) information or patterns
from data in large databases
 Alternative names:
 Data mining: a misnomer?
 Knowledge discovery(mining) in databases (KDD),
knowledge extraction, data/pattern analysis, data
archeology, data dredging, information harvesting, business
intelligence, etc.
 What is not data mining?
 (Deductive) query processing.
 Expert systems or small ML/statistical programs
16
 Database analysis and decision support
 Market analysis and management
 target marketing, customer relation management, market
basket analysis, cross selling, market segmentation
 Risk analysis and management
 Forecasting, customer retention, improved
underwriting, quality control, competitive analysis
 Fraud detection and management
 Other Applications
 Text mining (news group, email, documents)
 Stream data mining
 Web mining.
 DNA data analysis
17
 Where are the data sources for analysis?
 Credit card transactions, loyalty cards, discount coupons,
customer complaint calls, plus (public) lifestyle studies
 Target marketing
 Find clusters of “model” customers who share the same
characteristics: interest, income level, spending habits, etc.
 Determine customer purchasing patterns over time
 Conversion of single to a joint bank account: marriage, etc.
 Cross-market analysis
 Associations/co-relations between product sales
 Prediction based on the association information
18
19
 Customer profiling
 data mining can tell you what types of customers buy what
products (clustering or classification)
 Identifying customer requirements
 identifying the best products for different customers
 use prediction to find what factors will attract new customers
 Provides summary information
 various multidimensional summary reports
 statistical summary information (data central tendency and
variation)
 Finance planning and asset evaluation
 cash flow analysis and prediction
 contingent claim analysis to evaluate assets
 cross-sectional and time series analysis (financial-ratio, trend
analysis, etc.)
 Resource planning:
 summarize and compare the resources and spending
 Competition:
 monitor competitors and market directions
 group customers into classes and a class-based pricing procedure
 set pricing strategy in a highly competitive market
20
 Applications
 widely used in health care, retail, credit card services,
telecommunications (phone card fraud), etc.
 Approach
 use historical data to build models of fraudulent behavior and use
data mining to help identify similar instances
 Examples
 auto insurance: detect a group of people who stage accidents to
collect on insurance
 money laundering: detect suspicious money transactions (US
Treasury's Financial Crimes Enforcement Network)
 medical insurance: detect professional patients and ring of
doctors and ring of references
21
22
 Detecting inappropriate medical treatment
 Australian Health Insurance Commission identifies that in many
cases blanket screening tests were requested (save Australian
$1m/yr.).
 Detecting telephone fraud
 Telephone call model: destination of the call, duration, time of
day or week. Analyze patterns that deviate from an expected
norm.
 British Telecom identified discrete groups of callers with frequent
intra-group calls, especially mobile phones, and broke a
multimillion dollar fraud.
 Retail
 Analysts estimate that 38% of retail shrink is due to dishonest
employees.
 Sports
 IBM Advanced Scout analyzed NBA game statistics (shots
blocked, assists, and fouls) to gain competitive advantage for
New York Knicks and Miami Heat
 Astronomy
 JPL and the Palomar Observatory discovered 22 quasars with the
help of data mining
 Internet Web Surf-Aid
 IBM Surf-Aid applies data mining algorithms to Web access logs
for market-related pages to discover customer preference and
behavior pages, analyzing effectiveness of Web
marketing, improving Web site organization, etc.
23
Data warehouse and data mining
Data warehouse and data mining
Data warehouse and data mining
Data warehouse and data mining

Más contenido relacionado

La actualidad más candente

Data mining by_ashok
Data mining by_ashokData mining by_ashok
Data mining by_ashokAshok Kumar
 
Gulabs Ppt On Data Warehousing And Mining
Gulabs Ppt On Data Warehousing And MiningGulabs Ppt On Data Warehousing And Mining
Gulabs Ppt On Data Warehousing And Mininggulab sharma
 
Data mining Introduction
Data mining IntroductionData mining Introduction
Data mining IntroductionVijayasankariS
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation Pralhad Rijal
 
Data mining 1 - Introduction (cheat sheet - printable)
Data mining 1 - Introduction (cheat sheet - printable)Data mining 1 - Introduction (cheat sheet - printable)
Data mining 1 - Introduction (cheat sheet - printable)yesheeka
 
Lecture 1 introduction to data warehouse
Lecture 1 introduction to data warehouseLecture 1 introduction to data warehouse
Lecture 1 introduction to data warehouseShani729
 
Project Presentation on Data WareHouse
Project Presentation on Data WareHouseProject Presentation on Data WareHouse
Project Presentation on Data WareHouseAbhi Bhardwaj
 
Data Mining: A Short Survey
Data Mining: A Short SurveyData Mining: A Short Survey
Data Mining: A Short SurveyArvin Jenabi
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousingOZ Assignment help
 
PowerPoint Template
PowerPoint TemplatePowerPoint Template
PowerPoint Templatebutest
 
Data mining
Data mining Data mining
Data mining AthiraR23
 

La actualidad más candente (20)

Data mining by_ashok
Data mining by_ashokData mining by_ashok
Data mining by_ashok
 
Gulabs Ppt On Data Warehousing And Mining
Gulabs Ppt On Data Warehousing And MiningGulabs Ppt On Data Warehousing And Mining
Gulabs Ppt On Data Warehousing And Mining
 
Data mining Introduction
Data mining IntroductionData mining Introduction
Data mining Introduction
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation What is Data mining? Data mining Presentation
What is Data mining? Data mining Presentation
 
Abstract
AbstractAbstract
Abstract
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
Datamining
DataminingDatamining
Datamining
 
Data mining
Data miningData mining
Data mining
 
Data mining 1 - Introduction (cheat sheet - printable)
Data mining 1 - Introduction (cheat sheet - printable)Data mining 1 - Introduction (cheat sheet - printable)
Data mining 1 - Introduction (cheat sheet - printable)
 
Lecture 1 introduction to data warehouse
Lecture 1 introduction to data warehouseLecture 1 introduction to data warehouse
Lecture 1 introduction to data warehouse
 
Data mining
Data miningData mining
Data mining
 
Project Presentation on Data WareHouse
Project Presentation on Data WareHouseProject Presentation on Data WareHouse
Project Presentation on Data WareHouse
 
Data Mining: A Short Survey
Data Mining: A Short SurveyData Mining: A Short Survey
Data Mining: A Short Survey
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousing
 
Big data
Big dataBig data
Big data
 
PowerPoint Template
PowerPoint TemplatePowerPoint Template
PowerPoint Template
 
Data mining
Data mining Data mining
Data mining
 
Data mining
Data miningData mining
Data mining
 

Destacado

Data Warehouse and Data Mining
Data Warehouse and Data MiningData Warehouse and Data Mining
Data Warehouse and Data MiningRanak Ghosh
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Difference between data warehouse and data mining
Difference between data warehouse and data miningDifference between data warehouse and data mining
Difference between data warehouse and data miningmaxonlinetr
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guidethomasmary607
 
Lecture 4, c++(complete reference,herbet sheidt)chapter-14
Lecture 4, c++(complete reference,herbet sheidt)chapter-14Lecture 4, c++(complete reference,herbet sheidt)chapter-14
Lecture 4, c++(complete reference,herbet sheidt)chapter-14Abu Saleh
 
Lecture 3, c++(complete reference,herbet sheidt)chapter-13
Lecture 3, c++(complete reference,herbet sheidt)chapter-13Lecture 3, c++(complete reference,herbet sheidt)chapter-13
Lecture 3, c++(complete reference,herbet sheidt)chapter-13Abu Saleh
 
Lecture 6, c++(complete reference,herbet sheidt)chapter-16
Lecture 6, c++(complete reference,herbet sheidt)chapter-16Lecture 6, c++(complete reference,herbet sheidt)chapter-16
Lecture 6, c++(complete reference,herbet sheidt)chapter-16Abu Saleh
 
Lecture 7, c++(complete reference,herbet sheidt)chapter-17.
Lecture 7, c++(complete reference,herbet sheidt)chapter-17.Lecture 7, c++(complete reference,herbet sheidt)chapter-17.
Lecture 7, c++(complete reference,herbet sheidt)chapter-17.Abu Saleh
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic conceptsKrish_ver2
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Data Preprocessing- Data Warehouse & Data Mining
Data Preprocessing- Data Warehouse & Data MiningData Preprocessing- Data Warehouse & Data Mining
Data Preprocessing- Data Warehouse & Data MiningTrinity Dwarka
 
Trends in the Database
Trends in the DatabaseTrends in the Database
Trends in the DatabaseMarlon Jamera
 
01. design & analysis of agorithm intro & complexity analysis
01. design & analysis of agorithm intro & complexity analysis01. design & analysis of agorithm intro & complexity analysis
01. design & analysis of agorithm intro & complexity analysisOnkar Nath Sharma
 
Vbmca204821311240
Vbmca204821311240Vbmca204821311240
Vbmca204821311240Ayushi Jain
 

Destacado (20)

Data Warehouse and Data Mining
Data Warehouse and Data MiningData Warehouse and Data Mining
Data Warehouse and Data Mining
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Difference between data warehouse and data mining
Difference between data warehouse and data miningDifference between data warehouse and data mining
Difference between data warehouse and data mining
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guide
 
Lecture 4, c++(complete reference,herbet sheidt)chapter-14
Lecture 4, c++(complete reference,herbet sheidt)chapter-14Lecture 4, c++(complete reference,herbet sheidt)chapter-14
Lecture 4, c++(complete reference,herbet sheidt)chapter-14
 
Lecture 3, c++(complete reference,herbet sheidt)chapter-13
Lecture 3, c++(complete reference,herbet sheidt)chapter-13Lecture 3, c++(complete reference,herbet sheidt)chapter-13
Lecture 3, c++(complete reference,herbet sheidt)chapter-13
 
Lecture 6, c++(complete reference,herbet sheidt)chapter-16
Lecture 6, c++(complete reference,herbet sheidt)chapter-16Lecture 6, c++(complete reference,herbet sheidt)chapter-16
Lecture 6, c++(complete reference,herbet sheidt)chapter-16
 
C++ Complete Reference
C++ Complete ReferenceC++ Complete Reference
C++ Complete Reference
 
Lecture 7, c++(complete reference,herbet sheidt)chapter-17.
Lecture 7, c++(complete reference,herbet sheidt)chapter-17.Lecture 7, c++(complete reference,herbet sheidt)chapter-17.
Lecture 7, c++(complete reference,herbet sheidt)chapter-17.
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
data base manage ment
data base manage mentdata base manage ment
data base manage ment
 
2ndlec.database
2ndlec.database2ndlec.database
2ndlec.database
 
Data Preprocessing- Data Warehouse & Data Mining
Data Preprocessing- Data Warehouse & Data MiningData Preprocessing- Data Warehouse & Data Mining
Data Preprocessing- Data Warehouse & Data Mining
 
Trends in the Database
Trends in the DatabaseTrends in the Database
Trends in the Database
 
01. design & analysis of agorithm intro & complexity analysis
01. design & analysis of agorithm intro & complexity analysis01. design & analysis of agorithm intro & complexity analysis
01. design & analysis of agorithm intro & complexity analysis
 
Vbmca204821311240
Vbmca204821311240Vbmca204821311240
Vbmca204821311240
 

Similar a Data warehouse and data mining

6months industrial training in data mining, jalandhar
6months industrial training in data mining, jalandhar6months industrial training in data mining, jalandhar
6months industrial training in data mining, jalandhardeepikakaler1
 
6 weeks summer training in data mining,ludhiana
6 weeks summer training in data mining,ludhiana6 weeks summer training in data mining,ludhiana
6 weeks summer training in data mining,ludhianadeepikakaler1
 
6 weeks summer training in data mining,jalandhar
6 weeks summer training in data mining,jalandhar6 weeks summer training in data mining,jalandhar
6 weeks summer training in data mining,jalandhardeepikakaler1
 
6months industrial training in data mining,ludhiana
6months industrial training in data mining,ludhiana6months industrial training in data mining,ludhiana
6months industrial training in data mining,ludhianadeepikakaler1
 
Data mining final year project in jalandhar
Data mining final year project in jalandharData mining final year project in jalandhar
Data mining final year project in jalandhardeepikakaler1
 
Data mining final year project in ludhiana
Data mining final year project in ludhianaData mining final year project in ludhiana
Data mining final year project in ludhianadeepikakaler1
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryYoung Alista
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHarry Potter
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryJames Wong
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryFraboni Ec
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryLuis Goldster
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryTony Nguyen
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHoang Nguyen
 
Data mining introduction
Data mining introductionData mining introduction
Data mining introductionNiyitegekabilly
 
Information Technology Data Mining
Information Technology Data MiningInformation Technology Data Mining
Information Technology Data Miningsamiksha sharma
 
Big data-analytics-changing-way-organizations-conducting-business
Big data-analytics-changing-way-organizations-conducting-businessBig data-analytics-changing-way-organizations-conducting-business
Big data-analytics-changing-way-organizations-conducting-businessAmit Bhargava
 
Machine Learning, Data Mining, and
Machine Learning, Data Mining, and Machine Learning, Data Mining, and
Machine Learning, Data Mining, and butest
 

Similar a Data warehouse and data mining (20)

6months industrial training in data mining, jalandhar
6months industrial training in data mining, jalandhar6months industrial training in data mining, jalandhar
6months industrial training in data mining, jalandhar
 
6 weeks summer training in data mining,ludhiana
6 weeks summer training in data mining,ludhiana6 weeks summer training in data mining,ludhiana
6 weeks summer training in data mining,ludhiana
 
6 weeks summer training in data mining,jalandhar
6 weeks summer training in data mining,jalandhar6 weeks summer training in data mining,jalandhar
6 weeks summer training in data mining,jalandhar
 
6months industrial training in data mining,ludhiana
6months industrial training in data mining,ludhiana6months industrial training in data mining,ludhiana
6months industrial training in data mining,ludhiana
 
Data mining final year project in jalandhar
Data mining final year project in jalandharData mining final year project in jalandhar
Data mining final year project in jalandhar
 
Data mining final year project in ludhiana
Data mining final year project in ludhianaData mining final year project in ludhiana
Data mining final year project in ludhiana
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining 1
Data mining 1Data mining 1
Data mining 1
 
Data mining introduction
Data mining introductionData mining introduction
Data mining introduction
 
Information Technology Data Mining
Information Technology Data MiningInformation Technology Data Mining
Information Technology Data Mining
 
1intro
1intro1intro
1intro
 
Data mining
Data miningData mining
Data mining
 
Big data-analytics-changing-way-organizations-conducting-business
Big data-analytics-changing-way-organizations-conducting-businessBig data-analytics-changing-way-organizations-conducting-business
Big data-analytics-changing-way-organizations-conducting-business
 
Machine Learning, Data Mining, and
Machine Learning, Data Mining, and Machine Learning, Data Mining, and
Machine Learning, Data Mining, and
 

Más de Rohit Kumar

Hrm strategies(2)
Hrm strategies(2)Hrm strategies(2)
Hrm strategies(2)Rohit Kumar
 
Project report on Relationship Of Inflation with Indian Stock Market
Project report on Relationship Of Inflation with Indian Stock MarketProject report on Relationship Of Inflation with Indian Stock Market
Project report on Relationship Of Inflation with Indian Stock MarketRohit Kumar
 
Leadership sudha murthy
Leadership sudha murthyLeadership sudha murthy
Leadership sudha murthyRohit Kumar
 
leadership Adolf Hitler
leadership Adolf Hitlerleadership Adolf Hitler
leadership Adolf HitlerRohit Kumar
 
The imc tools used for communication of cocacola
The imc tools used for communication of cocacolaThe imc tools used for communication of cocacola
The imc tools used for communication of cocacolaRohit Kumar
 
Pricing strategies
Pricing strategiesPricing strategies
Pricing strategiesRohit Kumar
 
New market offerings
New market offeringsNew market offerings
New market offeringsRohit Kumar
 
Marketing communications (1)
Marketing communications (1)Marketing communications (1)
Marketing communications (1)Rohit Kumar
 
Market segmentation
Market segmentationMarket segmentation
Market segmentationRohit Kumar
 
Market segmentation, positioning and value proposition
Market segmentation, positioning and value propositionMarket segmentation, positioning and value proposition
Market segmentation, positioning and value propositionRohit Kumar
 
Innovation and organisation 3 m way
Innovation and organisation  3 m wayInnovation and organisation  3 m way
Innovation and organisation 3 m wayRohit Kumar
 
Designing and managing integrated marketing communication
Designing and managing integrated marketing communicationDesigning and managing integrated marketing communication
Designing and managing integrated marketing communicationRohit Kumar
 
Quantitativ research survey
Quantitativ research  surveyQuantitativ research  survey
Quantitativ research surveyRohit Kumar
 
Qualitative research technique
Qualitative research techniqueQualitative research technique
Qualitative research techniqueRohit Kumar
 
Measurement and scaling noncomparative scaling technique
Measurement and scaling noncomparative scaling techniqueMeasurement and scaling noncomparative scaling technique
Measurement and scaling noncomparative scaling techniqueRohit Kumar
 
Measurement and scaling fundamentals and comparative scaling
Measurement and scaling fundamentals and comparative scalingMeasurement and scaling fundamentals and comparative scaling
Measurement and scaling fundamentals and comparative scalingRohit Kumar
 
Descriptive design survey and observation
Descriptive design survey and observationDescriptive design survey and observation
Descriptive design survey and observationRohit Kumar
 

Más de Rohit Kumar (20)

Hrm strategies(2)
Hrm strategies(2)Hrm strategies(2)
Hrm strategies(2)
 
Project report on Relationship Of Inflation with Indian Stock Market
Project report on Relationship Of Inflation with Indian Stock MarketProject report on Relationship Of Inflation with Indian Stock Market
Project report on Relationship Of Inflation with Indian Stock Market
 
Leadership sudha murthy
Leadership sudha murthyLeadership sudha murthy
Leadership sudha murthy
 
leadership Adolf Hitler
leadership Adolf Hitlerleadership Adolf Hitler
leadership Adolf Hitler
 
The imc tools used for communication of cocacola
The imc tools used for communication of cocacolaThe imc tools used for communication of cocacola
The imc tools used for communication of cocacola
 
Rural marketing
Rural marketingRural marketing
Rural marketing
 
Pricing strategies
Pricing strategiesPricing strategies
Pricing strategies
 
Positioning
PositioningPositioning
Positioning
 
New market offerings
New market offeringsNew market offerings
New market offerings
 
Marketing communications (1)
Marketing communications (1)Marketing communications (1)
Marketing communications (1)
 
Market segmentation
Market segmentationMarket segmentation
Market segmentation
 
Market segmentation, positioning and value proposition
Market segmentation, positioning and value propositionMarket segmentation, positioning and value proposition
Market segmentation, positioning and value proposition
 
Innovation and organisation 3 m way
Innovation and organisation  3 m wayInnovation and organisation  3 m way
Innovation and organisation 3 m way
 
Designing and managing integrated marketing communication
Designing and managing integrated marketing communicationDesigning and managing integrated marketing communication
Designing and managing integrated marketing communication
 
Sampling
SamplingSampling
Sampling
 
Quantitativ research survey
Quantitativ research  surveyQuantitativ research  survey
Quantitativ research survey
 
Qualitative research technique
Qualitative research techniqueQualitative research technique
Qualitative research technique
 
Measurement and scaling noncomparative scaling technique
Measurement and scaling noncomparative scaling techniqueMeasurement and scaling noncomparative scaling technique
Measurement and scaling noncomparative scaling technique
 
Measurement and scaling fundamentals and comparative scaling
Measurement and scaling fundamentals and comparative scalingMeasurement and scaling fundamentals and comparative scaling
Measurement and scaling fundamentals and comparative scaling
 
Descriptive design survey and observation
Descriptive design survey and observationDescriptive design survey and observation
Descriptive design survey and observation
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Data warehouse and data mining

  • 3.
  • 4.
  • 5. Which are our lowest/highest margin customers ? Who are my customers and what products are they buying? Which customers are most likely to go to the competition ? What impact will new products/services have on revenue and margins? What product prom- -otions have the biggest impact on revenue? What is the most effective distribution channel? A PRODUCER WANTS TO KNOW….
  • 6. DATA, DATA EVERYWHERE YET ...  I can’t get the data I need  need an expert to get the data  I can’t understand the data I found  available data poorly documented  I can’t use the data I found  results are unexpected  data needs to be transformed from one form to other  I can’t find the data I need  data is scattered over the network  many versions, subtle differences
  • 7.  1960s:  Data collection, database creation, IMS and network DBMS  1970s:  Relational data model, relational DBMS implementation  1980s:  RDBMS, advanced data models (extended- relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)  1990s—2000s:  Data mining and data warehousing, multimedia databases, and Web databases 7
  • 8. DATA WAREHOUSE The data warehouse is that portion of an overall Architected Data Environment that serves as the single integrated source of data for processing information.
  • 10. TERMS RELATED TO DATA WAREHOUSE DATA MART STAGING AREA OLAP OLAP TOOLS
  • 12. HOW IS THE WAREHOUSE DIFFERENT? The data warehouse is distinctly different from the operational data used and maintained by day- to-day operational systems. Data warehousing is not simply an “access wrapper” for operational data, where data is simply “dumped” into tables for direct access.
  • 13. OPERATIONAL DATA Application oriented Detailed Accurate, as of the moment of access Serves the clerical community Performance sensitive (immediate response required when entering a transaction) Flexible structure; variable contents Small amount of data used in a process DATA WAREHOUSE Subject oriented Summarized Represents values over time Serves the managerial community Performance relaxed (immediacy not required) Static structure large amount of data used in a process
  • 14. 14
  • 15.  Data explosion problem  Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories  We are drowning in data, but starving for knowledge!  Solution: Data warehousing and data mining  Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases 15
  • 16.  Data mining (knowledge discovery in databases):  Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases  Alternative names:  Data mining: a misnomer?  Knowledge discovery(mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.  What is not data mining?  (Deductive) query processing.  Expert systems or small ML/statistical programs 16
  • 17.  Database analysis and decision support  Market analysis and management  target marketing, customer relation management, market basket analysis, cross selling, market segmentation  Risk analysis and management  Forecasting, customer retention, improved underwriting, quality control, competitive analysis  Fraud detection and management  Other Applications  Text mining (news group, email, documents)  Stream data mining  Web mining.  DNA data analysis 17
  • 18.  Where are the data sources for analysis?  Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies  Target marketing  Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits, etc.  Determine customer purchasing patterns over time  Conversion of single to a joint bank account: marriage, etc.  Cross-market analysis  Associations/co-relations between product sales  Prediction based on the association information 18
  • 19. 19  Customer profiling  data mining can tell you what types of customers buy what products (clustering or classification)  Identifying customer requirements  identifying the best products for different customers  use prediction to find what factors will attract new customers  Provides summary information  various multidimensional summary reports  statistical summary information (data central tendency and variation)
  • 20.  Finance planning and asset evaluation  cash flow analysis and prediction  contingent claim analysis to evaluate assets  cross-sectional and time series analysis (financial-ratio, trend analysis, etc.)  Resource planning:  summarize and compare the resources and spending  Competition:  monitor competitors and market directions  group customers into classes and a class-based pricing procedure  set pricing strategy in a highly competitive market 20
  • 21.  Applications  widely used in health care, retail, credit card services, telecommunications (phone card fraud), etc.  Approach  use historical data to build models of fraudulent behavior and use data mining to help identify similar instances  Examples  auto insurance: detect a group of people who stage accidents to collect on insurance  money laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)  medical insurance: detect professional patients and ring of doctors and ring of references 21
  • 22. 22  Detecting inappropriate medical treatment  Australian Health Insurance Commission identifies that in many cases blanket screening tests were requested (save Australian $1m/yr.).  Detecting telephone fraud  Telephone call model: destination of the call, duration, time of day or week. Analyze patterns that deviate from an expected norm.  British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion dollar fraud.  Retail  Analysts estimate that 38% of retail shrink is due to dishonest employees.
  • 23.  Sports  IBM Advanced Scout analyzed NBA game statistics (shots blocked, assists, and fouls) to gain competitive advantage for New York Knicks and Miami Heat  Astronomy  JPL and the Palomar Observatory discovered 22 quasars with the help of data mining  Internet Web Surf-Aid  IBM Surf-Aid applies data mining algorithms to Web access logs for market-related pages to discover customer preference and behavior pages, analyzing effectiveness of Web marketing, improving Web site organization, etc. 23

Notas del editor

  1. A single, complete and consistent store of data obtained from a variety of different sources made available to end users in a what they can understand and use in a business context
  2. Subject-Oriented: Information is presented according to specific subjects or areas of interest, not simply as computer files. Data is manipulated to provide information about a particular subject. For example, the SRDB is not simply made accessible to end-users, but is provided structure and organized according to the specific needs. Integrated: A single source of information for and about understanding multiple areas of interest. The data warehouse provides one-stop shopping and contains information about a variety of subjects. Thus the OIRAP data warehouse has information on students, faculty and staff, instructional workload, and student outcomes. Non-Volatile: Stable information that doesn’t change each time an operational process is executed. Information is consistent regardless of when the warehouse is accessed. Time-Variant: Containing a history of the subject, as well as current information. Historical information is an important component of a data warehouse. Accessible: The primary purpose of a data warehouse is to provide readily accessible information to end-users. Process-Oriented: It is important to view data warehousing as a process for delivery of information. The maintenance of a data warehouse is on-going and iterative in nature.
  3. Data Mart: A data structure that is optimized for access. It is designed to facilitate end-user analysis of data. It typically supports a single, analytic application used by a distinct set of workers. Staging Area: Any data store that is designed primarily to receive data into a warehousing environment. OLAP (On-Line Analytical Processing): A method by which multidimensional analysis occurs where multidimensional analysis is the ability to manipulate information by a variety of relevant categories or “dimensions” to facilitate analysis and understanding of the underlying data. It is also sometimes referred to as “drilling-down”, “drilling-across” and “slicing and dicing”OLAP Tools: A set of software products that attempt to facilitate multidimensional analysis. Can incorporate data acquisition, data access, data manipulation, or any combination thereof.
  4. Capture, Clean, Transform, TransportQuery, Report, Analyze, Mine, Deliver