This document discusses data warehousing and data mining. It defines data warehousing as combining data from multiple sources into a single database for analysis. Data warehousing provides businesses with analytics from data mining, OLAP, scorecarding and reporting. It also discusses the need for data warehousing to gather information from various sources. Common components of data warehousing architectures include extracting, transforming and loading data, as well as operational data stores, data warehouses, data marts and ETL processes. Finally, the document outlines typical applications of data mining such as customer relationship management, medical research, and combating terrorism.
Data mining and data warehousing
1. DATA WAREHOUSING AND DATA MINING
05/16/16 03:41 PM 1
2. DATA WAREHOUSING
Data warehousing is the practice of combining data from
multiple sources into one comprehensive and
easily manipulated database.
The primary aim of data warehousing is to
provide businesses with analytics results from
data mining, OLAP, scorecarding and
reporting.
3. NEED FOR DATA WAREHOUSING
Information is now considered key
to all kinds of work.
Those who gather, analyze, understand,
and act upon information are winners.
Information has no limits, and it is very hard
to collect from various
sources, so we need a data warehouse
from which we can get all the
information.
6. DATA WAREHOUSE ARCHITECTURE
Data warehousing is designed to
provide an architecture that will make
corporate data accessible and useful to
users.
There is no right or wrong architecture.
The worth of an architecture can
be judged by its use and the concept behind
it.
Data warehouses can be architected in
many different ways, depending on the
specific needs of a business.
8. An operational data store (ODS) is
basically a database that serves as
a temporary storage area for a
data warehouse.
Its primary purpose is to handle data
that is actively in use.
An operational data store contains data that
is constantly updated in the course
of business operations.
9. ETL (Extract, Transform, Load) is used to
copy data from:
ODS to data warehouse staging area.
Data warehouse staging area to data
warehouse.
Data warehouse to data mart.
ETL extracts data, transforms values of
inconsistent data, cleanses "bad" data,
filters data and loads data into a target
database.
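The ETL steps named above (extract, transform inconsistent values, cleanse "bad" data, filter, load) can be sketched in a few lines of Python. This is only an illustrative sketch: the source rows, the `COUNTRY_CODES` mapping, the `sales` table and all function names are hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import sqlite3

# Hypothetical source rows, standing in for data copied out of an ODS.
SOURCE_ROWS = [
    {"customer": "Alice", "country": "usa", "amount": "120.50"},
    {"customer": "Bob", "country": "U.S.A.", "amount": "80"},
    {"customer": "", "country": "India", "amount": "15.00"},    # bad: no customer
    {"customer": "Chen", "country": "india", "amount": "n/a"},  # bad: non-numeric amount
    {"customer": "Dana", "country": "India", "amount": "0"},    # filtered out: zero sale
]

# Hypothetical mapping used to make inconsistent country values consistent.
COUNTRY_CODES = {"usa": "US", "u.s.a.": "US", "india": "IN"}


def extract(rows):
    """Extract: read rows from the source (here, an in-memory list)."""
    return list(rows)


def transform(rows):
    """Transform inconsistent values, cleanse bad rows and filter unwanted rows."""
    clean = []
    for row in rows:
        if not row["customer"]:
            continue  # cleanse: drop rows with a missing customer
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleanse: drop rows whose amount is not a number
        if amount <= 0:
            continue  # filter: keep only positive sales
        country = COUNTRY_CODES.get(row["country"].strip().lower())
        if country is None:
            continue  # cleanse: drop rows with an unknown country
        clean.append((row["customer"], country, amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, source=SOURCE_ROWS):
    """Run the three stages in order and return what ended up in the target."""
    load(conn, transform(extract(source)))
    query = "SELECT customer, country, amount FROM sales ORDER BY customer"
    return conn.execute(query).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads only the rows that survive cleansing and filtering. A real pipeline would read from the ODS or staging area instead of a hard-coded list.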
10. The Data Warehouse Staging Area is
a temporary location where data from
source systems is copied.
It speeds up the data warehouse
architecture.
It is essential because the volume of data is increasing
day by day.
11. The purpose of the Data Warehouse is to
integrate corporate data.
The amount of data in the Data Warehouse is
massive. Data is stored at a very deep level of
detail.
This allows data to be grouped in almost any
way.
A Data Warehouse does not contain all the data
in the organization. Its purpose is to provide
the base data needed by the organization for
strategic and tactical decision making.
12. ETL extracts data from the Data Warehouse
and sends it to one or more Data Marts for
use by end users.
Data marts act as a shortcut to
a data warehouse, to save time.
A data mart is just a partition of the data present in
the data warehouse.
Each Data Mart can contain different
combinations of tables, columns and rows
from the Enterprise Data Warehouse.
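As a rough illustration of a data mart holding a chosen combination of columns and rows from the warehouse, the sketch below copies a subset of a warehouse table into its own table. The `sales` table, its columns, the sample rows and the helper names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Columns of the hypothetical warehouse table; used to validate requests.
WAREHOUSE_COLUMNS = ("customer", "region", "product", "amount")


def build_warehouse():
    """Create a tiny stand-in warehouse with one detailed sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (customer TEXT, region TEXT, product TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            ("Alice", "North", "Laptop", 900.0),
            ("Bob", "South", "Phone", 400.0),
            ("Chen", "North", "Phone", 450.0),
            ("Dana", "South", "Laptop", 950.0),
        ],
    )
    return conn


def create_data_mart(conn, mart_name, columns, region):
    """Copy selected columns (and only one region's rows) into a mart table."""
    # Table and column names cannot be bound as SQL parameters,
    # so they are validated before being placed in the statement.
    if not mart_name.isidentifier() or not set(columns) <= set(WAREHOUSE_COLUMNS):
        raise ValueError("unexpected mart or column name")
    column_list = ", ".join(columns)
    # Create an empty table with the chosen columns, then fill it with the chosen rows.
    conn.execute(f"CREATE TABLE {mart_name} AS SELECT {column_list} FROM sales WHERE 0")
    conn.execute(
        f"INSERT INTO {mart_name} SELECT {column_list} FROM sales WHERE region = ?",
        (region,),
    )
    conn.commit()
```

For example, `create_data_mart(conn, "mart_north", ["customer", "amount"], "North")` gives one group of users a small table with just the two columns and the rows they need, which is where the faster response time of a data mart comes from.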
13. REASONS FOR CREATING A DATA MART
Easy access to frequently needed data.
Creates a collective view for a group of
users.
Improves user response time.
Ease of creation.
Lower cost than implementing a full
data warehouse.
14. DATA MINING
The non-trivial extraction of implicit,
previously unknown, and potentially
useful information from large databases.
– Extremely large datasets
– Useful knowledge that can improve
processes
– Cannot be done manually
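To make "extraction of implicit, previously unknown information" a little more concrete, here is a minimal sketch of one simple technique, counting item pairs that frequently occur together (a simplified form of market-basket analysis). The slides do not name this technique; it is chosen here only as an illustration, and the baskets, the `frequent_pairs` name and the `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records; real data sets would be far larger.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs whose share of baskets is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }
```

With `min_support=0.5`, the pair `("bread", "milk")` is reported because it appears in three of the four baskets. On millions of records this kind of pattern cannot be found by hand, which is the point of the definition above.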
15. Where Has It Come From?
16. Motivation
Databases today are huge:
– More than 1,000,000 entities/records/rows
– From 10 to 10,000 fields/attributes/variables
– Gigabytes and terabytes
Databases are growing at an unprecedented rate
The corporate world is a cut-throat world
– Decisions must be made rapidly
– Decisions must be made with maximum
knowledge
17. How does data mining work?
Extract, transform, and load transaction data
onto the data warehouse system.
Store and manage the data in a multidimensional
database system.
Provide data access to business analysts and
information technology professionals.
Analyze the data by application software.
Present the data in a useful format, such as a
graph or table
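The store, analyze and present steps above can be sketched as follows. This is an illustrative sketch only: a dictionary keyed by dimension values stands in for a multidimensional database, and the transactions, dimension names and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical transaction data, assumed to be already extracted and cleaned.
TRANSACTIONS = [
    ("2016-Q1", "North", "Laptop", 900.0),
    ("2016-Q1", "South", "Phone", 400.0),
    ("2016-Q2", "North", "Phone", 450.0),
    ("2016-Q2", "North", "Laptop", 1000.0),
]
DIMENSIONS = ("quarter", "region", "product")


def build_cube(transactions):
    """Store: total the sales amount for each combination of dimension values."""
    cube = defaultdict(float)
    for quarter, region, product, amount in transactions:
        cube[(quarter, region, product)] += amount
    return cube


def roll_up(cube, keep):
    """Analyze: sum the amounts over every dimension not named in keep."""
    indexes = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for key, amount in cube.items():
        totals[tuple(key[i] for i in indexes)] += amount
    return dict(totals)


def format_table(totals, keep):
    """Present: render the totals as a plain-text table."""
    lines = [" | ".join(keep) + " | total"]
    for key in sorted(totals):
        lines.append(" | ".join(key) + f" | {totals[key]:.2f}")
    return "\n".join(lines)
```

For instance, `format_table(roll_up(build_cube(TRANSACTIONS), ("region",)), ("region",))` produces a small table of total sales per region, the kind of output an analyst would then turn into a graph.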
18. DATA MINING MEASURES
Accuracy
Clarity
Dirty Data
Scalability
Speed
Validation
20. ADVANTAGES OF DATA MINING
Engineering and Technology
Medical Science
Business
Combating Terrorism
Games
Research and Development
21. Engineering and Technology
In Electrical Power Engineering
- used for condition monitoring of high-voltage electrical equipment
- vibration monitoring and analysis of
23. Medical Science
Data mining has been widely used in the areas
of bioinformatics and genetics.
It relates DNA sequences to variability in disease
susceptibility, which is very important to
help improve the diagnosis, prevention
and treatment of diseases.
24. BUSINESS
In Customer Relationship Management
applications
It translates data from customer to
merchant accurately
Distributes business processes
Powerful tool for marketing
25. Combating terrorism
Concept used by Interpol against
terrorists for searching their records through the
Multistate Anti-Terrorism Information
Exchange
Also used in the Secure Flight program, the Computer
Assisted Passenger Prescreening
System, and Semantic Enhancement
26. Games
Applied to tablebases for certain
combinatorial games (e.g. for 3x3-chess)
It includes extraction of human-usable
strategies
Berlekamp in dots-and-boxes and John
Nunn in chess endgames are notable
examples
27. Research And Development
Helps to develop search algorithms
It offers huge libraries of graphing and
visualisation software
Users can easily create
optimal models
28. List of the top eight data-mining software vendors in 2008
Angoss Software
Infor CRM Epiphany
Portrait Software
SAS
G-Stat
SPSS
ThinkAnalytics
Unica
Viscovery