Data Science.pdf

Data Science is a blend of various tools, algorithms, and machine learning principles with the goal of discovering hidden patterns from the raw data.

What is Data Science?
Data Science is a blend of various tools, algorithms, and machine
learning principles with the goal of discovering hidden patterns
from the raw data. But how is this different from what
statisticians have been doing for years?
The answer lies in the difference between explaining and
predicting.
As you can see from the above image, a Data Analyst usually
explains what is going on by processing the history of the data.
A Data Scientist, on the other hand, not only does exploratory
analysis to discover insights from the data, but also uses
advanced machine learning algorithms to predict the occurrence of
a particular event in the future. A Data Scientist will look at
the data from many angles, sometimes angles not known earlier.
So, Data Science is primarily used to make decisions and
predictions, making use of predictive causal analytics,
prescriptive analytics (predictive plus decision science), and
machine learning.
 Predictive causal analytics: If you want a model that can
predict the likelihood of a particular event in the future,
you need to apply predictive causal analytics. Say you are
lending money on credit; then the probability of customers
making future credit payments on time is a matter of
concern for you. Here, you can build a model that performs
predictive analytics on the customer's payment history to
predict whether future payments will be on time or not.
 Prescriptive analytics: If you want a model that has the
intelligence to make its own decisions and the ability to
modify them with dynamic parameters, you need prescriptive
analytics. This relatively new field is all about providing
advice. In other words, it not only predicts but also
suggests a range of prescribed actions and their associated
outcomes.
The best example of this is Google’s self-driving car, which I
discussed earlier. The data gathered by vehicles can be used
to train self-driving cars. You can run algorithms on this
data to bring intelligence to it. This enables your car to
make decisions such as when to turn, which path to take, and
when to slow down or speed up.
 Machine learning for making predictions: If you have
transactional data of a finance company and need to build a
model to determine the future trend, then machine learning
algorithms are the best bet. This falls under the paradigm of
supervised learning. It is called supervised because you
already have labeled data on which you can train your models.
For example, a fraud detection model can be trained using a
historical record of fraudulent purchases.
 Machine learning for pattern discovery: If you don’t have
the parameters on which to base predictions, then you need to
find the hidden patterns within the dataset to be able to make
meaningful predictions. This is the unsupervised setting,
because you don’t have any predefined labels for grouping. The
most common technique used for pattern discovery is
clustering.
Let’s say you are working at a telephone company and you
need to establish a network by putting towers in a region.
Then, you can use the clustering technique to find those
tower locations which will ensure that all the users receive
optimum signal strength.
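
The tower example can be sketched with k-means, one common
clustering algorithm (the text above does not name a specific
one). This is a minimal illustration, not a network-planning
method: the user coordinates and the function name kmeans are
hypothetical, and k-means only places each tower at the centre of
a group of users; it does not model real signal strength.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: returns k centroids for a list of (x, y) points."""
    # Deterministic start: seed centroids with k points spread through the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                xs, ys = zip(*cluster)
                new_centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            else:
                new_centroids.append(centroids[i])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids

# Hypothetical user locations (in km) forming two neighbourhoods.
users = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
         (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]

towers = kmeans(users, k=2)
print(towers)
```

Each returned centroid is a candidate tower site in the middle of
one neighbourhood. Note that no labels were supplied: the grouping
comes from the data alone, which is what makes this unsupervised.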
Let’s see how the proportion of the approaches described above
differs between Data Analysis and Data Science. As you can see in
the image below, Data Analysis includes descriptive analytics and
prediction to a certain extent. On the other hand, Data Science is
more about Predictive Causal Analytics and Machine Learning.
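
To make the predictive side concrete, the credit-payment example
from earlier in this section can be sketched as a small supervised
model. This is an illustration under stated assumptions: logistic
regression is one common choice for a yes/no outcome (the text
does not name a model), and the single feature, the data in
history, and the function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, learning_rate=0.5, epochs=2000):
    """Fit p(on time) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

# Hypothetical payment history:
# (fraction of past payments that were late, was the next payment on time? 1 = yes)
history = [(0.0, 1), (0.1, 1), (0.2, 1), (0.3, 1),
           (0.7, 0), (0.8, 0), (0.9, 0), (1.0, 0)]

w, b = train_logistic(history)
p_reliable = sigmoid(w * 0.1 + b)  # customer who was rarely late
p_risky = sigmoid(w * 0.9 + b)     # customer who was usually late
print(p_reliable, p_risky)
```

The labels in history are what make this supervised: the model
learns from past outcomes and then returns a probability that a
new customer's next payment will be on time.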
Why Data Science?
 Traditionally, data was mostly structured and small in size,
so it could be analyzed with simple BI tools. Today, by
contrast, most data is unstructured or semi-structured. The
data trends in the image given below show that by 2020, more
than 80% of the data will be unstructured.
This data is generated from sources such as financial logs,
text files, multimedia forms, sensors, and instruments.
Simple BI tools cannot process this volume and variety of
data. This is why we need more complex and advanced analytical
tools and algorithms for processing and analyzing it and for
drawing meaningful insights out of it.
This is not the only reason why Data Science has become so popular.
Let’s dig deeper and see how Data Science is being used in various
domains.
 What if you could understand the precise requirements of
your customers from existing data such as their past browsing
history, purchase history, age, and income? No doubt you had
all this data earlier too, but now, with the vast amount and
variety of data, you can train models more effectively and
recommend products to your customers with more precision.
Wouldn’t that be amazing, given that it would bring more
business to your organization?
 Let’s take a different scenario to understand the role of Data
Science in decision making. What if your car had the
intelligence to drive you home? Self-driving cars collect live
data from sensors, including radars, cameras, and lasers, to
create a map of their surroundings. Based on this data, the car
makes decisions such as when to speed up, when to slow down,
when to overtake, and where to turn, making use of advanced
machine learning algorithms.
 Let’s see how Data Science can be used in predictive analytics.
Take weather forecasting as an example. Data from ships,
aircraft, radars, and satellites can be collected and analyzed
to build models. These models not only forecast the weather
but also help in predicting the occurrence of natural
calamities. This helps you take appropriate measures
beforehand and save many precious lives.
