Business Analytics and Data mining.pdf
1. Introduction to Business
Analytics and its application
Bird’s eye View
Dr Pranjal Muley
Associate Dean- Business Analytics
VES Business School, Mumbai
2. Points to discuss
• What is Business Analytics
• What Is Data Mining?
• Why Data Mining?
• What Kinds of Data Can Be Mined?
• What Kinds of Patterns Can Be Mined?
• Which Technologies Are Used?
• Which Kinds of Applications Are Targeted?
• Major Issues in Data Mining
3. Pretext
• “We are living in the information age” is a popular saying
• However, we are actually living in the data age (big data).
• Rapid advances in data collection and storage technology have
enabled organizations to accumulate vast amounts of data.
• Extracting useful information has proven extremely challenging.
• Traditional data analysis tools and techniques cannot be used
because of the massive size of a data set. Sometimes, the non-
traditional nature of the data means that traditional approaches
cannot be applied even if the data set is relatively small.
4. Pretext
• Humans generate a lot of data each day, from heart rates to
favorite songs, fitness goals and movie preferences.
• Data is found in every corner of a business. Data is no longer restricted to
just technology companies.
• Businesses as diverse as life insurers, hotels, hospitals, and product
management companies now use data to build better marketing strategies, improve
customer experience, understand business trends, or simply collect insights on
user data.
• The modern business marketplace is a data-driven environment.
• Data is at the core of nearly every business decision made.
• Human resources directors are gathering data from online resources to
determine the best people to recruit and confirm details about them.
5. Pretext
• Data is everywhere!
Humanity surpassed one zettabyte of data in 2010.
(One zettabyte = 1,000,000,000,000,000,000,000 bytes, i.e. 10^21 bytes: 21 zeroes.)
• Forbes says there are 2.5 quintillion bytes of data created each day.
• Only 0.5% of the data being generated is analysed.
• Businesses can harness data to:
• Find new customers
• Track social media interaction with the brand
• Improve customer retention rate
• Capture customer inclinations and market trends
• Predict sales trends
• Improve brand experience
6. Pretext
• A Forbes article discussing a survey from Deloitte:
• It shows that “49 percent of respondents said analytics helps them
make better decisions, 16 percent say that it better enables key
strategic initiatives, and 10 percent say it helps them improve
relationships with both customers and business partners.”
7. What is Business Analytics?
• Business analytics is a set of disciplines and technologies for solving
business problems by using methodologies such as data mining,
predictive analytics, and statistical analysis in order to analyze and
transform data into useful information, identify and anticipate trends
and outcomes, and ultimately make smarter, data-driven business
decisions.
8. What is Business Analytics?
• The main components of a typical business analytics dashboard include:
• Data Aggregation: prior to analysis, data must first be gathered, organized, and filtered,
either through volunteered data or transactional records
• Data Mining: data mining for business analytics sorts through large datasets using
databases, statistics, and machine learning to identify trends and establish relationships
• Association and Sequence Identification: the identification of predictable actions that are
performed in association with other actions or sequentially
• Text Mining: explores and organizes large, unstructured text datasets for the purpose of
qualitative and quantitative analysis
• Forecasting: analyzes historical data from a specific period in order to make informed
estimates that are predictive in determining future events or behaviors
• Predictive Analytics: predictive business analytics uses a variety of statistical techniques
to create predictive models, which extract information from datasets, identify patterns,
and provide a predictive score for an array of organizational outcomes
• Optimization: once trends have been identified and predictions have been made,
businesses can engage simulation techniques to test out best-case scenarios
• Data Visualization: provides visual representations such as charts and graphs for easy and
quick data analysis
9. How does business analytics work?
• Before any data analysis takes place, BA starts with several
foundational processes:
• Determine the business goal of the analysis.
• Select an analysis methodology.
• Get business data to support the analysis, often from various systems
and sources.
• Cleanse and integrate data into a single repository, such as a data
warehouse or data mart.
10. Types of business analytics
• Different types of business analytics include the following:
• Descriptive analytics, which tracks key performance indicators (KPIs)
to understand the present state of a business;
• Predictive analytics, which analyzes trend data to assess the
likelihood of future outcomes; and
• Prescriptive analytics, which uses past performance to generate
recommendations for handling similar situations in the future.
13. What Is Data Mining?
• “Data rich but information poor”
• Technology that blends traditional data
analysis methods with sophisticated
algorithms for processing large
volumes of data.
• Opened up exciting opportunities for
exploring and analyzing new types of
data and for analyzing old types of
data in new ways.
14. Pretext
• The fast-growing, tremendous amount of data, collected and stored
in large and numerous data repositories, is turning them into “data
tombs” (heaps).
• This volume is beyond the human capacity for comprehension without
powerful tools.
• In parallel, important decisions are often made based on the decision
maker’s intuition (gut feeling) rather than on the information-rich data
stored in data repositories, because the decision maker lacks the appropriate
tools to extract the valuable knowledge embedded in the vast
amounts of data.
• WHAT WE WANT is……
16. What Is Data Mining?
• Data mining is the process of automatically discovering useful
information in large data repositories.
• Data mining techniques are deployed to scour large databases in
order to find novel and useful patterns that might otherwise remain
unknown.
• They also provide capabilities to predict the outcome of a future
observation, such as predicting whether a newly arrived customer will
spend more than $100 at a department store.
17. What Is Data Mining?
• Data mining should have been more appropriately named
“knowledge discovery from data” (KDD).
• In simple words, data mining is defined as a process used to extract
usable data from a larger set of raw data. It implies analysing data
patterns in large batches of data using one or more software tools.
• Data mining is an integral part of knowledge discovery in databases
(KDD), which is the overall process of converting raw data into useful
information.
19. What Is Data Mining?
• Data cleaning (to remove noise and inconsistent data)
• Data integration (where multiple data sources may be combined)
• Data selection (where data relevant to the analysis task are retrieved from
the database)
• Data transformation (where data are transformed and consolidated into
forms appropriate for mining by performing summary or aggregation
operations)
• Data mining (an essential process where intelligent methods are applied to
extract data patterns)
• Pattern evaluation (to identify the truly interesting patterns representing
knowledge based on interestingness measures)
• Knowledge presentation (where visualization and knowledge
representation techniques are used to present mined knowledge to users)
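The seven steps above can be sketched end to end on a toy example. This is only an illustration under stated assumptions: the two in-memory "sources", the field names, the values and the interestingness threshold are all hypothetical, and the "mining" step is reduced to a simple average.

```python
from collections import Counter

# Hypothetical raw records from two sources (all names and values are made up).
store_a = [
    {"customer": "c1", "item": "laptop", "amount": "1200"},
    {"customer": "c2", "item": "mouse", "amount": None},  # noisy: missing amount
    {"customer": "c3", "item": "laptop", "amount": "1100"},
]
store_b = [
    {"customer": "c4", "item": "mouse", "amount": "25"},
    {"customer": "c5", "item": "laptop", "amount": "1300"},
]


def clean(records):
    """Step 1, data cleaning: drop records whose amount is missing."""
    return [r for r in records if r["amount"] is not None]


# Step 2, data integration: combine the cleaned sources.
integrated = clean(store_a) + clean(store_b)

# Step 3, data selection: keep only the fields relevant to the task.
selected = [(r["item"], r["amount"]) for r in integrated]

# Step 4, data transformation: convert amounts to numbers and aggregate per item.
totals, counts = Counter(), Counter()
for item, amount in selected:
    totals[item] += float(amount)
    counts[item] += 1

# Step 5, data mining: the "pattern" here is simply the average spend per item.
patterns = {item: totals[item] / counts[item] for item in totals}

# Step 6, pattern evaluation: keep patterns above an assumed threshold.
interesting = {item: avg for item, avg in patterns.items() if avg > 100}

# Step 7, knowledge presentation: print a readable summary.
for item, avg in interesting.items():
    print(f"{item}: average spend {avg:.2f}")
```

In practice each step would be far richer (for example, real cleaning rules and a real mining algorithm), but the order of the stages is the same.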
21. Why Data Mining?
• We live in a world where vast amounts of data are collected daily.
Analyzing such data is an important need.
Every day, roughly 2.5 quintillion bytes of data are generated.
• Terabytes or petabytes of data pour into our computer networks, the World
Wide Web (WWW), and various data storage devices every day from
business, society, science and engineering, medicine, and almost every
other aspect of daily life.
• This explosive growth of available data volume is a result of the
computerization of our society and the fast development of powerful data
collection and storage tools.
• Businesses worldwide generate gigantic data sets, including sales
transactions, stock trading records, product descriptions, sales promotions,
company profiles and performance, customer feedback, and many
more.
22. Why Data Mining? A few examples
• Wal-Mart handles hundreds of millions of transactions per week at thousands of branches
around the world.
• Scientific and engineering practices generate high orders of petabytes of data in a continuous
manner, from remote sensing, process measuring, scientific experiments, system
performance, engineering observations, and environment surveillance.
• Global backbone telecommunication networks carry tens of petabytes of data traffic every
day.
• The medical and health industry generates tremendous amounts of data from medical
records, patient monitoring, and medical imaging.
• Billions of Web searches supported by search engines process tens of petabytes of data daily.
Google processes more than 20 petabytes of data every day including around 3.5 billion search
queries.
• Communities and social media have become increasingly important data sources, producing
digital pictures and videos, blogs, Web communities, and various kinds of social networks.
• Facebook generates 4 petabytes of data per day.
25. What Kinds of Data Can Be Mined?
• As a rule, data mining can be applied to any kind of data as long as the data
are meaningful for a target application.
• However, the most basic forms of data for mining are:
• Database Data
• Data Warehouses
• Transactional Data
• Other Kinds of Data
• Time-related or sequence data (e.g., historical records, stock exchange data, and time-series
and biological sequence data),
• Data streams (e.g., video surveillance and sensor data, which are continuously transmitted)
• Spatial data (e.g., maps),
• Engineering design data (e.g., the design of buildings, system components, or integrated
circuits),
• Hypertext and multimedia data (including text, image, video, and audio data),
• Graph and networked data (e.g., social and information networks)
• Web (a huge, widely distributed information repository made available by the Internet)
27. What Kinds of Patterns Can Be Mined?
• There are a number of data mining functionalities.
• Characterization
• Discrimination
• Mining of frequent patterns
• Associations
• Correlations
• Classification
• Regression
• Clustering analysis
• Outlier analysis
28. What Kinds of Patterns Can Be Mined?
• Data mining functionalities are used to specify the kinds of patterns
to be found in data mining tasks.
• Tasks can be classified into two categories:
• Descriptive
• Predictive.
• Descriptive mining tasks characterize properties of the data.
• Predictive mining tasks perform induction on the current data in
order to make predictions.
29. What Kinds of Patterns Can Be Mined?
• Characterization is a summarization of the general characteristics or
features of a target class of data.
A customer relationship manager at ABC supermarket may order
the following data mining task: Summarize the characteristics of
customers who spend more than $5000 a year at ABC
supermarket .
The result is a general profile of these customers, such as that
they are 40 to 50 years old, employed, and have excellent credit
ratings.
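A minimal sketch of the characterization task above, assuming a small hypothetical customer table; the field names and values are invented for illustration. It filters the target class (customers spending more than $5000 a year) and summarises its general features.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer records (made-up values).
customers = [
    {"age": 45, "employed": True, "credit": "excellent", "annual_spend": 6200},
    {"age": 48, "employed": True, "credit": "excellent", "annual_spend": 7500},
    {"age": 23, "employed": False, "credit": "fair", "annual_spend": 900},
    {"age": 41, "employed": True, "credit": "good", "annual_spend": 5400},
    {"age": 67, "employed": False, "credit": "good", "annual_spend": 1800},
]

# Target class: customers who spend more than $5000 a year.
target = [c for c in customers if c["annual_spend"] > 5000]

# Characterization: a general profile of the target class.
profile = {
    "count": len(target),
    "age_range": (min(c["age"] for c in target), max(c["age"] for c in target)),
    "mean_age": mean(c["age"] for c in target),
    "share_employed": sum(c["employed"] for c in target) / len(target),
    "most_common_credit": Counter(c["credit"] for c in target).most_common(1)[0][0],
}
print(profile)
```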
30. What Kinds of Patterns Can Be Mined?
• Discrimination is a comparison of the general features of target class data
objects with the general features of objects from one or a set of
contrasting classes.
A customer relationship manager at ABC super electronics may
want to compare two groups of customers—those who shop for
computer products regularly (e.g., more than twice a month) and
those who rarely shop for such products (e.g., less than three
times a year).
The resulting description provides a general comparative profile of
these customers, such as that 80% of the customers who frequently
purchase computer products are between 20 and 40 years old and
have a university education, whereas 60% of the customers who
infrequently buy such products are either seniors or youths, and have
no university degree.
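The comparison above can be sketched as follows. The shopper records are hypothetical, and "more than twice a month" is approximated here as more than 24 purchases a year; both are assumptions made only for this illustration.

```python
# Hypothetical shopper records (made-up values).
shoppers = [
    {"age": 25, "university": True, "purchases_per_year": 30},
    {"age": 34, "university": True, "purchases_per_year": 26},
    {"age": 38, "university": False, "purchases_per_year": 40},
    {"age": 70, "university": False, "purchases_per_year": 1},
    {"age": 17, "university": False, "purchases_per_year": 2},
    {"age": 45, "university": True, "purchases_per_year": 1},
]


def share(group, predicate):
    """Fraction of the group for which the predicate holds."""
    return sum(1 for s in group if predicate(s)) / len(group)


# Target class: frequent buyers (assumed: more than 24 purchases a year).
frequent = [s for s in shoppers if s["purchases_per_year"] > 24]
# Contrasting class: rare buyers (fewer than three purchases a year).
rare = [s for s in shoppers if s["purchases_per_year"] < 3]

# Discrimination: the same features, computed for both classes side by side.
comparison = {
    name: {
        "aged_20_to_40": share(group, lambda s: 20 <= s["age"] <= 40),
        "university": share(group, lambda s: s["university"]),
    }
    for name, group in (("frequent", frequent), ("rare", rare))
}
print(comparison)
```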
31. What Kinds of Patterns Can Be Mined?
• Association is the discovery of association rules showing attribute-
value conditions that occur frequently together in a given set of data.
For example, a data mining system may find association rules like
major(X, “computing science”) ⇒ owns(X, “personal computer”)
[support = 12%, confidence = 98%]
buys(X, “computer”) ⇒ buys(X, “software”)
[support = 10%, confidence = 50%]
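To make the two measures concrete, here is a small sketch that computes support and confidence for a rule of the form buys(X, “computer”) ⇒ buys(X, “software”). The ten transactions are hypothetical, so the resulting figures differ from the percentages quoted above.

```python
# Hypothetical market-basket transactions (made-up data).
transactions = [
    {"computer", "software"},
    {"computer"},
    {"printer"},
    {"computer", "software", "printer"},
    {"software"},
    {"computer", "printer"},
    {"printer"},
    {"printer", "software"},
    {"computer"},
    {"printer"},
]


def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


rule_support = support({"computer", "software"}, transactions)
rule_confidence = confidence({"computer"}, {"software"}, transactions)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

Support measures how often the whole rule applies across all transactions, while confidence measures how often the right-hand side holds when the left-hand side does.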
32. What Kinds of Patterns Can Be Mined?
• Classification is the process of finding a model (or function) that describes
and distinguishes data classes or concepts
• Used for the purpose of being able to predict the class of objects whose
class label is unknown. It predicts categorical (discrete, unordered) labels.
Suppose the sales manager of ABC supermarket wants to classify a
large set of items in the store, based on three kinds of responses
to a sales campaign: good response, mild response and no
response.
The manager wants to derive a model for each of these three classes based on
the descriptive features of the items, such as price, brand, place
made, type, and category.
The resulting classification should maximally distinguish each class
from the others, presenting an organized picture of the data set.
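As an illustration of the idea, the sketch below trains a nearest-centroid classifier on a handful of labeled items and predicts the response class of a new one. The slides do not prescribe a particular classifier; nearest-centroid is chosen here only for brevity. The two numeric features (price and a hypothetical discount percentage) and all values are assumptions.

```python
from math import dist
from statistics import mean

# Hypothetical labeled items: (price, discount %) -> campaign response.
training = [
    ((10.0, 30.0), "good response"),
    ((12.0, 25.0), "good response"),
    ((40.0, 10.0), "mild response"),
    ((45.0, 12.0), "mild response"),
    ((90.0, 0.0), "no response"),
    ((95.0, 2.0), "no response"),
]


def fit_centroids(examples):
    """Learn one centroid (feature-wise mean) per class label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Assign the label of the closest class centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(training)
print(predict(centroids, (11.0, 28.0)))  # an item whose class label is unknown
```

Categorical features such as brand or category would need to be encoded numerically before a distance-based model like this one could use them.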
33. What Kinds of Patterns Can Be Mined?
• Regression, unlike classification, is a process to model continuous-
valued functions. It is used to predict missing or unavailable
numerical data values rather than (discrete) class labels.
Rather than predicting categorical response labels for each store
item, the manager would like to predict the amount of revenue that
each item will generate during an upcoming sale at ABC
supermarket, based on the previous sales data.
This is an example of regression analysis because the regression
model constructed will predict a continuous function (or ordered
value).
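A minimal sketch of regression, assuming a simple straight-line model fitted by ordinary least squares. The previous sales figures and the choice of discount percentage as the single predictor are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical previous sales: discount offered (%) and revenue generated ($).
discounts = [0, 10, 20, 30]
revenue = [100, 150, 200, 250]

slope, intercept = fit_line(discounts, revenue)

# Predict a continuous value (revenue) for an upcoming sale with a 25% discount.
predicted = slope * 25 + intercept
print(f"revenue = {slope:.1f} * discount + {intercept:.1f}; predicted: {predicted:.1f}")
```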
34. What Kinds of Patterns Can Be Mined?
• Clustering analyzes data objects without consulting a known class
label. The objects are clustered or grouped based on the principle of
maximizing the intraclass similarity and minimizing the interclass
similarity. Each cluster that is formed can be viewed as a class of
objects.
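One common way to form such clusters is k-means, sketched below. The slides do not name an algorithm, so k-means is an assumption, as are the sample points and the choice of initial centroids. No class labels are used: the points are grouped purely by distance.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = coordinate-wise mean; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data objects (two visually separate groups).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Assumed initialisation: one starting centroid taken from each end of the data.
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

Each resulting cluster can then be treated as a class of objects, for example as a customer segment.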
35. What Kinds of Patterns Can Be Mined?
• Outlier analysis is the analysis of outliers, which are objects that do
not comply with the general behavior or model of the data. Examples
include fraud detection based on a large dataset of credit card
transactions.
Outlier analysis may uncover fraudulent usage of credit cards by
detecting purchases of unusually large amounts for a given
account number in comparison to regular charges incurred by
the same account.
Outlier values may also be detected with respect to the locations
and types of purchase, or the purchase frequency.
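The per-account comparison can be sketched with a simple robust rule: flag a charge when its distance from the account's median charge is many times the median absolute deviation (MAD). The rule, the threshold of 5, the account identifiers and the amounts are all assumptions for illustration; real fraud detection uses far richer models.

```python
from statistics import median


def unusual_charges(charges, threshold=5.0):
    """Return charges that deviate from the account's median by more than threshold * MAD."""
    centre = median(charges)
    mad = median(abs(c - centre) for c in charges)
    if mad == 0:
        return []  # charges are (nearly) identical; this simple rule cannot rank them
    return [c for c in charges if abs(c - centre) / mad > threshold]


# Hypothetical charge histories per account number (made-up values).
accounts = {
    "acct-001": [20, 25, 22, 30, 27, 24, 950],
    "acct-002": [400, 420, 390, 410, 405],
}

# Each account is judged only against its own regular charges.
flagged = {account: unusual_charges(charges) for account, charges in accounts.items()}
print(flagged)
```

Note that a charge of around 400 is normal for the second account, whereas it would be flagged on the first: the comparison is always against the same account's usual behaviour.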
37. Which Technologies Are Used?
• Being a highly application-driven domain, data mining has incorporated
many techniques from other domains such as statistics, machine learning,
pattern recognition, database and data warehouse systems, information
retrieval, visualization, algorithms, high-performance computing, and
many application domains.
38. Which Technologies Are Used?
• Machine Learning: It investigates how computers can learn (or improve
their performance) based on data.
• A main research area is for computer programs to automatically learn to
recognize complex patterns and make intelligent decisions based on data.
For example, a typical machine learning problem is to program a
computer so that it can automatically recognize handwritten postal
codes on mail after learning from a set of examples.
39. Which Technologies Are Used?
• Types of Machine Learning.
Supervised learning is basically a synonym for classification. The supervision in the
learning comes from the labeled examples in the training data set.
Unsupervised learning is essentially a synonym for clustering. The learning process is
unsupervised since the input examples are not class labeled. Typically, we may use
clustering to discover classes within the data.
41. Which Kinds of Applications Are Targeted?
• Business Intelligence
“How important is business intelligence?”
Without data mining, many businesses may not be able to:
• Perform effective market analysis
• Compare customer feedback on similar products
• Discover the strengths and weaknesses of their competitors
• Retain highly valuable customers
• Make smart business decisions
Business intelligence (BI) technologies provide historical, current, and
predictive views of business operations
42. Which Kinds of Applications Are Targeted?
• Web Search Engines
Web search engines are specialized computer servers that search for
information on the Web.
Web search engines are essentially very large data mining applications.
Various data mining techniques are used in:
• crawling (e.g., deciding which pages should be crawled and the crawling
frequencies)
• indexing (e.g., selecting pages to be indexed and deciding to which extent the index
should be constructed)
• searching (e.g., deciding how pages should be ranked, which advertisements
should be added)
44. Business Applications
• Credit Card Companies
Credit and debit cards are an everyday part of consumer spending
Ideal way of gathering information about a purchaser’s spending habits,
financial situation, demographics, and lifestyle preferences.
• Customer Relationship Management (CRM)
Excellent customer relations are critical for any company.
CRM systems analyze important performance indicators such as
demographics, buying patterns, socio-economic information, and lifestyle.
45. Business Applications
• Finance
The financial world is a volatile place, and business analytics helps to extract
insights.
Corporations turn to business analysts to optimize budgeting, banking,
financial planning, forecasting, and portfolio management.
• Human Resources
HR’s job is not only to find the ideal candidates but also to keep them on board.
Business analysts help the process by poring over data that
characterizes high performing candidates, such as educational background,
attrition rate, the average length of employment, etc.
By working with this information, business analysts help HR by forecasting
the best fits between the company and candidates.
46. Business Applications
• Manufacturing
Business analysts work with data to help stakeholders understand the
things that affect operations and the bottom line.
This includes identifying things like equipment downtime, inventory levels, and
maintenance costs.
• Marketing
Which advertising campaigns are the most effective?
How much social media penetration should a business attempt?
What sort of things do viewers like/dislike in commercials?
Business analysts answer these questions by measuring marketing and advertising
metrics, identifying consumer behavior and the target audience, and analyzing market trends.