Department of Statistics
The Maharaja Sayajirao University of Baroda
Agenda
 What is Data Science?
 What does Data Science promise for your business?
 Investment in Data Science and ROI
 Data Science Process
 Data Science Roles
 Infrastructure Requirements
 Data Science Tools and Techniques
 Where do I begin?
 Developing Data Science Culture
 Questions
What is Data Science?
Everything concerning Data
is in the purview of Data Science
What is Data Science?
Data science is a young inter-disciplinary field that uses
scientific principles, methods, processes, algorithms and
systems to extract knowledge and insights from data.
 Data science involves Statistics at its core.
 Data Science extends the field of statistics to
incorporate advances in computing with data
 Apart from Statistics, Computer Science is another
major discipline that plays a major role in capturing,
managing and sharing data.
 It is a driving force behind innovations in almost all
disciplines of Science.
 This new approach is termed Data driven science.
Data Science Discipline
Data Science Profession
The Data Science promise
Top Objectives of Successful Businesses
 Increase profitability
 Ensure customer satisfaction
 Optimize productivity
 Make your employees happy
 Social and public responsibility
Businesses traditionally rely on intuition, creativity and
experience to fulfill these objectives.
For decades, this has been reflected in the HiPPO (highest
paid person's opinion) phenomenon.
The Data Science promise
Without Data, you are just another person with an opinion
– W. Edwards Deming
Although intuition, experience, etc. are important, they work
much better when supported with data.
Data Science helps you to
 Understand your customers better by
 Learning about their needs
 Their struggles, their motivations, their habits and their
relationships to your product or service.
 Use this understanding to create a better product and/or
service and turn that into profit.
The Data Science promise
Data science helps you to
 See clearly how your business performs.
 Understand dynamics of your business
 Improve business processes
 Discover new opportunities / products / services that
your customers need.
 Discover new audiences for your current products /
services.
and much more...
The Data Science promise
If you manage to collect the right data and use it well,
 You will be able to make better decisions more quickly
and more easily.
 That will lead to a better product, happier
customers and eventually more revenue.
That’s what business data science is all about.
If you are among the first in your domain to embrace
data science, you can outsmart your competition.
Signs that You Should Invest in Data
Science
 Your marketing budgets are growing, but your sales
numbers are not.
 Your company is struggling with personalization
 It’s taking too long for the sales team to score leads
 You are unable to analyze your marketing ROI
 You want the competitive edge without significantly
increasing your budget
 Your competitors are already investing in Data Science
Data Science Investments
Human Resource
According to an estimate, good teams spend about 5% of
their total working hours with data and quantitative
research.
 So, if you are working alone, that's around 2-3 hours a
week.
 If you are a team of 50, then ideally you should have
one or two full-time dedicated people for Data Science
projects.
 As your business grows, you may set up a Data Science
division.
Data Science Investments
Data Infrastructure
A data infrastructure is a digital infrastructure for
promoting data sharing and consumption.
 It includes data assets, hardware, software and
processes.
 It includes data ingestion and storage infrastructure
 It includes data management, data security and data
privacy.
Data Science Investments
Analytics Infrastructure
Much of data science work involves computationally
intensive experiments.
 Thus, Data scientists should be able to access large
machines/ specialized hardware for running
experiments or doing exploratory analysis.
 They should also be able to easily use burst/elastic
compute on demand.
 Data Scientists need software support for
communicating their findings to business
stakeholders.
Cloud Analytics
On-premises analytics solutions have challenges
 Cost of infrastructure
 Need for specialized skills
 Time required to configure and maintain these
systems
 Nonscalability
Cloud analytics provides a solution. Some major players:
 IBM Cognos Analytics
 Microsoft Azure Stream Analytics
 AWS Analytics
Success Stories
 Southwest Airlines saved $ 100 million by reducing the
time its planes stood idle on the airstrip.
 UPS, a logistics company, saved 38 million gallons of
fuel by optimizing its fleet.
 $ 2 billion tax dollars saved by the Internal Revenue
Service by improving its ability to detect identity fraud
and improper payments.
 Croma, a subsidiary of Tata Sons, used data science to
build a 360° view of its users and give its online
customers a personalized shopping experience; its
conversions improved significantly.
And many more…
With Data in your possession,
You are sitting on a gold mine…
However, if you don't know this fact OR don’t know how
to extract it, you won't be able to benefit from it.
Data Science Process
The diagram shows the major phases of the data science
process, following the CRISP-DM methodology.
Data Science Process
The six steps of a data science project
 Data Collection
 Data Storage
 Data Preparation
 Data Utilization
    Business Analytics
    Predictive Analytics
    Developing Data Product
 Communication, data visualization
 Data-driven Decision
Data Collection
This is where many businesses fail. Too many companies collect
incomplete, unreliable data and everything they do after that is just
messed up.
Proper tracking and collection of data, and ensuring its quality, are
crucial for every business doing data science.
What to collect?
 It is important to decide the details of the data that must be
collected/ captured.
 The general idea is to collect everything you can, because the
value of data may be realized at any time in the future.
 However, the more data you capture, the more engineering time
you need to allocate to implement it, the slower your business
processes will be, the more complex your data infrastructure
becomes, and so on…
Also consider legal and ethical aspects!
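As a rough illustration of the data-quality point above, the sketch below counts missing required fields and exact duplicate rows in a small batch of collected records. The field names (`user_id`, `event`, `timestamp`) and the sample records are hypothetical, chosen only for the example; a real check would follow whatever your tracking plan defines.

```python
from collections import Counter

# Hypothetical required fields; adjust to your own tracking plan.
REQUIRED_FIELDS = ("user_id", "event", "timestamp")


def quality_report(records):
    """Count rows, missing required fields, and exact duplicate rows."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for rec in records:
        # A field counts as missing if it is absent, None, or an empty string.
        for field in REQUIRED_FIELDS:
            if rec.get(field) in (None, ""):
                missing[field] += 1
        # Two rows are duplicates if all required fields are identical.
        key = tuple(rec.get(field) for field in REQUIRED_FIELDS)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows": len(records), "missing": dict(missing), "duplicates": duplicates}


# Made-up sample data for illustration only.
records = [
    {"user_id": 1, "event": "signup", "timestamp": "2024-01-05T10:00:00"},
    {"user_id": 1, "event": "signup", "timestamp": "2024-01-05T10:00:00"},   # duplicate
    {"user_id": 2, "event": "purchase", "timestamp": ""},                     # no timestamp
    {"user_id": None, "event": "visit", "timestamp": "2024-01-06T09:30:00"},  # no user_id
]

report = quality_report(records)
print(report)
```

Running a report like this on every new batch is one simple way to notice incomplete or unreliable collection early, before later analysis depends on it.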
Data Wrangling
Data wrangling is all about getting the data into the right
form that is suitable for feeding into the modeling and
visualization stages.
This activity involves a variety of tasks, from discovering
data to acquiring it and transforming it into a form
that is ready to be processed.
The tasks following the data acquisition are also referred
to by different terms such as Data Munging or Data
Preprocessing.
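A minimal sketch of what wrangling can look like in practice: raw text records are trimmed, normalized, converted to proper types, and unusable rows are dropped. The column names, the day/month/year date format and the sample rows are assumptions made for this example only.

```python
from datetime import datetime


def clean_record(raw):
    """Return a cleaned record, or None if the row cannot be used."""
    name = raw.get("name", "").strip().title()
    amount_text = raw.get("amount", "").replace(",", "").strip()
    date_text = raw.get("date", "").strip()
    if not name or not amount_text or not date_text:
        return None  # a required value is missing
    try:
        amount = float(amount_text)
        # Assumed input format: day/month/year.
        date = datetime.strptime(date_text, "%d/%m/%Y").date()
    except ValueError:
        return None  # amount or date could not be parsed
    return {"name": name, "amount": amount, "date": date.isoformat()}


# Made-up raw rows, as they might arrive from a spreadsheet export.
raw_rows = [
    {"name": "  asha patel ", "amount": "1,250.50", "date": "05/01/2024"},
    {"name": "RAVI SHAH", "amount": "n/a", "date": "06/01/2024"},  # bad amount
    {"name": "", "amount": "300", "date": "07/01/2024"},           # missing name
]

cleaned = (clean_record(row) for row in raw_rows)
clean_rows = [row for row in cleaned if row is not None]
print(clean_rows)
```

Dropping bad rows silently is a design choice made here for brevity; in a real pipeline you would usually log or count rejected rows so that collection problems stay visible.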
Big Data
Big data is like teenage sex: everyone talks about it,
nobody really knows how to do it, everyone thinks
everyone else is doing it, so everyone claims they are
doing it.
- Dan Ariely
What is Big data?
 Big data is a data set whose volume is beyond the ability of
commonly used hardware and software tools to capture, manage,
and process the data within a tolerable execution time.
 They are gathered by information-sensing mobile devices,
remote sensing technologies, software logs, cameras,
microphones, RFID readers, and many such devices.
 As a result, such datasets are continuously growing in size.
 By 2020, there will be around 40 trillion gigabytes of data
 90% of the data in the world today was created within just the
past two years.
 Internet users generate about 2.5 quintillion bytes (2.5 million
terabytes) of data each day
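The unit conversions behind the figures above can be checked with a few lines of integer arithmetic. This sketch assumes decimal (SI) units, where 1 GB is 10^9 bytes, 1 TB is 10^12 bytes and 1 ZB is 10^21 bytes; the variable names are illustrative.

```python
# Decimal (SI) byte units, assumed for this check.
GB = 10**9
TB = 10**12
ZB = 10**21

# "Around 40 trillion gigabytes of data"
total_bytes = 40 * 10**12 * GB

# "About 2.5 quintillion bytes of data each day" (2.5 * 10**18)
daily_bytes = 25 * 10**17

total_zb = total_bytes // ZB
daily_tb = daily_bytes // TB

print(total_zb)  # 40 (zettabytes)
print(daily_tb)  # 2500000 (terabytes, i.e. 2.5 million)
```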
Twitter
 500 million tweets per day
Facebook
 Facebook generates 4 petabytes of data per day.
 Users generate 4 million likes every minute.
 350 million photos are uploaded per day.
Instagram
 The Like button is hit an average of 4.2 billion times/ day.
WhatsApp
 In 2018, WhatsApp users sent 65 billion messages per
day
Almost every field
Some Examples
Characteristics of big data (3V’s)
In a 2001 research report, analyst Doug Laney (then at META Group, which
Gartner later acquired) defined data growth challenges (and opportunities) as being
three-dimensional: increasing volume, velocity, and variety.
Data volume:
 This is the primary attribute of big data. Most people
define big data in terms of multiple terabytes, sometimes petabytes.
Data variety
 Big data is coming from a greater variety of sources than
ever before. Many of the newer ones are Web sources,
including logs, click-streams, and social media.
Data velocity
 Big data can be described by its velocity or speed, that is, the rate
at which new data is generated.
Data Analysis
Data Analysis is the process of extracting value from data.
This is where data science gets exciting. It’s a creative process.
 Ask the right questions
It is important to ask the right questions. They usually come
from management or other colleagues, who may
already have suspicions based on their experience.
 Do Qualitative research
It’s important to understand the things concerning
business and its customers in detail. This can be achieved
through qualitative research, which in turn gives direction
to the useful investigations through data.
Three Major Business Applications
 Business Analytics
It answers the questions of “what has happened in the
past?” and “where are we now?”
E.g. reporting, measuring retention, finding the right user
segments, funnel analysis, etc.
 Predictive Analytics
It answers the question, “what will happen in the future?”
E.g. early warning, predicting the marketing budget you will
need in the next quarter, etc.
 Data (Based) Product
A product that is built, and works using your data.
E.g. recommendation systems, image recognition, voice
recognition, etc.
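To make the business analytics example concrete, here is a minimal funnel analysis sketch. It takes the number of users reaching each step and reports the conversion rate from the previous step and from the top of the funnel. The step names and counts are invented for illustration.

```python
# Hypothetical funnel: (step name, number of users who reached it).
funnel = [
    ("visited", 10000),
    ("signed_up", 2000),
    ("activated", 800),
    ("purchased", 200),
]


def funnel_report(steps):
    """Return (name, count, rate vs previous step, rate vs first step) per step."""
    top = steps[0][1]
    report = []
    prev = None
    for name, count in steps:
        if prev is None:
            step_rate = 1.0          # first step is the baseline
        elif prev == 0:
            step_rate = 0.0          # nobody reached the previous step
        else:
            step_rate = count / prev
        overall_rate = count / top if top else 0.0
        report.append((name, count, round(step_rate, 3), round(overall_rate, 3)))
        prev = count
    return report


report = funnel_report(funnel)
for name, count, step_rate, overall_rate in report:
    print(f"{name:<10} {count:>6}  step: {step_rate:.1%}  overall: {overall_rate:.1%}")
```

The step with the lowest step rate (here, sign-up) is usually the first place to look when deciding where to improve the product or the marketing.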
 SafetiPin is a map-based mobile phone application, which
leverages the power of big data to make our communities
and cities safer for women.
 It provides safety-related information collected through
crowdsourcing.
 The app captures data on 9 parameters (Lighting,
openness, visibility, people density, security in the area,
walk path, transportation, gender diversity, feeling in the
area), and uses it to compute and provide safety score, the
information on personal vulnerability to crime, in every
pocket of the city.
 The app uses this score and integrates with big data sources
such as Google Maps to recommend the Safest Route, that is,
the best possible route in terms of safety.
Data Communication
This is the step where most data science projects fail.
To reap the benefits of Data Science, effective
communication of the findings is crucial.
 It is necessary to build a culture where people can
communicate and use data. For this, everyone at your
company needs to be involved.
 Business people should also educate data scientists by
helping them to create and deliver better presentations.
 Communication should be as simple as it can be.
 No fancy scientific words
 No complicated charts
Which People Do You Need in Your Team?
Your data science team should feature
 Best Data Engineers,
 Best software developers, and
 Best statisticians
They need to have domain knowledge to know the actual
business application of their data projects.
Data Science Roles: Data Engineer
The data engineer is someone who develops, constructs,
tests and maintains data architectures, such as
databases, data warehouses, data lakes and large-scale
processing systems.
Data engineers manage data of all sizes, and types. They
develop, deploy, manage, and optimize data pipelines
and infrastructure to transform and transfer data to data
scientists for querying.
Skills needed: SQL, databases, data warehousing,
ETL, big data tools, building APIs
Data Science Roles: Data Analyst
Data analysts perform the following tasks
 Data wrangling
 Create data visualizations and dashboards
 Analyze data to discover interesting trends
 Presenting the results of analysis to business clients or
internal teams
 Help other stakeholders to optimize their data utilization
Skills needed: Programming skills (SAS, R, Python),
statistical and mathematical skills, data wrangling, data
visualization tools like Tableau / Power BI
Data Science Roles: Data Scientist
A data scientist is a specialist having expertise in
Statistics and developing models, including predictive
models and machine learning models.
 Data scientists can tackle more open-ended questions
by leveraging their knowledge of advanced statistics.
 Data scientists bring an entirely new approach and
perspective to understanding data
Skills needed: Programming skills (SAS, R, Python),
statistical and mathematical skills, storytelling and data
visualization, Hadoop, SQL, machine learning, Big data
analytics.
Data Science projects can fail
Yes, that’s true!
Here are some of the reasons.
 Not every manager is ready for this change.
Even a very well-executed data project can fail, just
because someone’s feelings or ego is hurt.
 Answering the wrong question
 Failure to integrate into business operations
 Stakeholders disengaged
 Benefits don’t justify the costs
Developing Data Science culture
Failures can be prevented by establishing a data-driven
company culture early on. As the company size
increases, it becomes harder to make the organization
data-driven.
 It’s important that the managers develop the right
mindset.
 It is important that everyone in the organization
understands the importance of data science.
Data professionals should hold frequent presentations
about their recent findings.
Data Strategy
Why Data Strategy?
If you don't have a data strategy, you won't have enough
information to make the right decisions. Having a data
strategy is crucial to becoming a data-driven organization.
Without it
 you will waste money on the wrong marketing
campaigns
 you will have wrong product development plans
Where do I begin?
It is recommended to start by developing a Data Strategy. For
this, the following questions need to be answered:
 What are the right metrics to focus on, and how do you identify them?
 How should you collect and store the data? Which tools should you use?
 Can you trust your data? And how can you make it trustworthy?
 How can you communicate the data efficiently within your organization?
Start with a simple data project that answers the basic questions
about your business.
Subsequently, as you recognize your customers’ needs, you may
initiate other projects such as Predictive modelling, and Machine
learning
Pick your first data project
Develop and use the Prioritization matrix.
Your first data project
Your first data project should be a simple (feasible) project
that aims at understanding your own business and your
customers better (high business value).
In other words, start by investing in business analytics and
simple reports.
This project answers the basic questions about your business,
such as
 Who prefers what and why?
 How to win customer loyalty?
 Why did a particular product fail?
And so on …
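One possible way to build the prioritization matrix mentioned earlier is to score each candidate project on feasibility and business value, then place it in a quadrant. The sketch below is only an assumption about how such a matrix could be set up: the 1 to 5 scale, the threshold of 3, the quadrant labels and the candidate projects are all hypothetical.

```python
# Hypothetical candidate projects, scored 1 (low) to 5 (high).
candidates = {
    "Basic sales and retention reports": {"feasibility": 5, "business_value": 4},
    "Churn prediction model": {"feasibility": 2, "business_value": 5},
    "Recommendation engine": {"feasibility": 1, "business_value": 4},
    "Office seating dashboard": {"feasibility": 5, "business_value": 1},
}


def quadrant(scores, threshold=3):
    """Place a project in one of four quadrants of the matrix."""
    feasible = scores["feasibility"] >= threshold
    valuable = scores["business_value"] >= threshold
    if feasible and valuable:
        return "start here"
    if valuable:
        return "plan for later"
    if feasible:
        return "low priority"
    return "avoid"


def priority(name):
    """Simple ranking key: feasibility multiplied by business value."""
    scores = candidates[name]
    return scores["feasibility"] * scores["business_value"]


ranked = sorted(candidates, key=priority, reverse=True)
for name in ranked:
    print(f"{priority(name):>2}  {quadrant(candidates[name]):<14}  {name}")
```

With these made-up scores, the simple reporting project lands in the "start here" quadrant, which matches the advice to begin with a feasible project of high business value.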
Questions?
You can write to me
kalamkar.vipul-stat@msubaroda.ac.in
Thanks!

FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 

Embracing data science

  • 1. Department of Statistics The Maharaja Sayajirao University of Baroda
  • 2. Agenda  What is Data Science?  What does Data Science promise for your business?  Investment in Data Science and ROI  Data Science Process  Data Science Roles  Infrastructure Requirements  Data Science Tools and Techniques  Where do I begin?  Developing Data Science Culture  Questions
  • 3. What is Data Science? Everything concerning Data is in the purview of Data Science
  • 4. What is Data Science? Data science is a young interdisciplinary field that uses scientific principles, methods, processes, algorithms and systems to extract knowledge and insights from data.  Data science involves Statistics at its core.  Data Science extends the field of statistics to incorporate advances in computing with data.  Apart from Statistics, Computer Science is the other major discipline; it plays a major role in capturing, managing and sharing data.  Data science is a driving force behind innovations in almost all disciplines of science.  This new approach is termed data-driven science.
  • 7. The Data Science promise Top Objectives of Successful Businesses  Increase profitability  Ensure customer satisfaction  Optimize productivity  Make your employees happy  Social and public responsibility Businesses traditionally rely on intuition, creativity and experience to fulfill these objectives. This has been reflected for decades in the HiPPO (highest-paid person's opinion) phenomenon.
  • 8. The Data Science promise "Without data, you are just another person with an opinion" – W. Edwards Deming. Although intuition, experience, etc. are important, they work much better when supported with data. Data Science helps you to  Understand your customers better by learning about their needs, their struggles, their motivations, their habits and their relationships to your product or service.  Use this understanding to create a better product and/or service and turn that into profit.
  • 9. The Data Science promise Data science helps you to  See clearly how your business performs.  Understand dynamics of your business  Improve business processes  Discover new opportunities / products / services that your customers need.  Discover new audiences for your current products / services. and much more...
  • 10. The Data Science promise If you manage to collect the right data and use it well,  You will be able to make better decisions more quickly and more easily.  That will lead to a better product, happier customers and eventually more revenue. That’s what business data science is all about. If you are among the first in your domain to embrace data science, you can outsmart your competition.
  • 11. Signs that You Should Invest in Data Science  Your marketing budgets are growing, but your sales numbers are not.  Your company is struggling with personalization  It’s taking too long for the sales team to score leads  You are unable to analyze your marketing ROI  You want the competitive edge without significantly increasing your budget  Your competitors are already investing in Data Science
  • 12. Data Science Investments Human Resource According to an estimate, good teams spend about 5% of their total working hours on data and quantitative research.  So, if you are working alone, that's around 2-3 hours a week.  If you are a team of 50, then ideally you should have one or two full-time dedicated people for Data Science projects.  As your business grows, you may set up a Data Science division.
  • 13. Data Science Investments Data Infrastructure A data infrastructure is a digital infrastructure for promoting data sharing and consumption.  It includes data assets, hardware, software and processes.  It includes data ingestion and storage infrastructure  It includes data management, data security and data privacy.
  • 14. Data Science Investments Analytics Infrastructure Much of data science work involves computationally intensive experiments.  Thus, Data scientists should be able to access large machines/ specialized hardware for running experiments or doing exploratory analysis.  They should also be able to easily use burst/elastic compute on demand.  Data Scientists need software support for communicating their findings to business stakeholders.
  • 15. Cloud Analytics On-premises analytics solutions have challenges:  Cost of infrastructure  Need for specialized skills  Time required to configure and maintain these systems  Limited scalability Cloud analytics provides a solution. Some major players:  IBM Cognos Analytics  Microsoft Azure Stream Analytics  AWS Analytics
  • 16. Success Stories  Southwest Airlines saved $100 million by reducing the time its planes stood idle on the airstrip.  UPS, a logistics company, saved 38 million gallons of fuel by optimizing its fleet.  The Internal Revenue Service saved $2 billion in tax dollars by improving its ability to detect identity fraud and improper payments.  Croma, a subsidiary of Tata Sons, used data science to build a 360° view of its users and give its online customers a personalized shopping experience; its conversions have improved significantly. And many more…
  • 17. With Data in your possession, You are sitting on a gold mine… However, if you don't know this fact OR don’t know how to extract it, you won't be able to benefit from it.
  • 18. Data Science Process The diagram shows the major phases of the data science process, following the CRISP-DM methodology.
  • 19. Data Science Process The six steps of a data science project: (1) Data Collection; (2) Data Storage; (3) Data Preparation; (4) Data Utilization (Business Analytics, Predictive Analytics, Developing Data Product); (5) Communication, data visualization; (6) Data-driven Decision
  • 20. Data Collection This is where many businesses fail. Too many companies collect incomplete, unreliable data, and everything they do after that is just messed up. Proper tracking and collection of data, and ensuring its quality, is crucial for every business doing data science. What to collect?  It is important to decide the details of the data that must be collected/captured.  The general idea is to collect everything you can, because the value of data can be realized at any time in the future.  However, the more data you capture, the more engineering time you need to allocate to implement it, the slower your business processes will be, the more complex your data infrastructure becomes, and so on… Also consider legal and ethical aspects!
  • 21. Data Wrangling Data wrangling is all about getting the data into a form that is suitable for feeding into the modeling and visualization stages. This activity involves a variety of tasks, from discovering data to acquiring it and transforming it into a form that is ready to be processed. The tasks following data acquisition are also referred to by other terms, such as Data Munging or Data Preprocessing.
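The wrangling tasks described in slide 21 can be sketched in a few lines of Python. This is only an illustration using the standard library: the records, the field names (`customer`, `signup`, `spend`) and the two date formats are assumptions made up for the example, not data from the deck.

```python
from datetime import datetime

# Hypothetical raw records, as they might arrive from a form export.
RAW_ROWS = [
    {"customer": " Asha ", "signup": "2021-01-05", "spend": "1200"},
    {"customer": "ravi", "signup": "05/02/2021", "spend": ""},
    {"customer": "", "signup": "2021-03-11", "spend": "300"},
    {"customer": "Meera", "signup": "2021-04-20", "spend": "450.5"},
]


def parse_date(text):
    """Try the date formats we assume occur in the raw data."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None  # unparseable date


def wrangle(rows):
    """Trim names, unify dates, convert spend to a number, drop unusable rows."""
    clean = []
    for row in rows:
        name = row["customer"].strip().title()
        signup = parse_date(row["signup"])
        if not name or signup is None:
            continue  # a record without a customer or a date cannot be used
        # Keep a missing spend as None instead of guessing a value.
        spend = float(row["spend"]) if row["spend"] else None
        clean.append({"customer": name, "signup": signup, "spend": spend})
    return clean


CLEAN_ROWS = wrangle(RAW_ROWS)
```

Of the four raw records, three survive: the one without a customer name is dropped, and the record with an empty spend keeps `None` so that later analysis can decide how to treat it.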
  • 22. Big Data Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it. - Dan Ariely
  • 23. What is Big data?  Big data is a data set whose volume is beyond the ability of commonly used hardware and software tools to capture, manage, and process the data within a tolerable execution time.  They are gathered by information-sensing mobile devices, remote sensing technologies, software logs, cameras, microphones, RFID readers, and many such devices.  As a result, such datasets are continuously growing in size.  By 2020, there will be around 40 trillion gigabytes of data  90% of the data in the world today was created within just the past two years.  Internet users generate about 2.5 quintillion bytes (2.5 million terabytes) of data each day
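The unit conversions behind the figures in slide 23 can be checked with a short calculation. The sketch below assumes decimal (SI) units and the short-scale meaning of "quintillion" (10^18); the variable names are illustrative.

```python
# Decimal (SI) unit sizes in bytes; an assumption, binary units would differ.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_TERABYTE = 10**12
BYTES_PER_ZETTABYTE = 10**21

# 2.5 quintillion bytes per day, with quintillion taken as 10**18.
daily_bytes = 25 * 10**17
daily_terabytes = daily_bytes // BYTES_PER_TERABYTE

# 40 trillion gigabytes, expressed in zettabytes.
total_bytes = 40 * 10**12 * BYTES_PER_GIGABYTE
total_zettabytes = total_bytes // BYTES_PER_ZETTABYTE

print(f"{daily_terabytes:,} TB per day")
print(f"{total_zettabytes} ZB in total")
```

Under these assumptions the daily figure works out to 2.5 million terabytes, matching the slide, and 40 trillion gigabytes equals 40 zettabytes.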
  • 24. Some Examples Twitter  500 million tweets per day Facebook  Facebook generates 4 petabytes of data per day.  Users generate 4 million likes every minute.  350 million photos are uploaded per day. Instagram  The Like button is hit an average of 4.2 billion times per day. WhatsApp  In 2018, WhatsApp users sent 65 billion messages per day. Almost every field
  • 25. Characteristics of big data (3V’s) In a 2001 research report, Gartner analyst Doug Laney defined data growth challenges (and opportunities) as being three-dimensional: increasing volume, velocity, and variety. Data volume:  This is the primary attribute of big data. Most people define big data in terms of multiple terabytes, sometimes petabytes. Data variety:  Big data is coming from a greater variety of sources than ever before. Many of the newer ones are Web sources, including logs, click-streams, and social media. Data velocity:  Big data can be described by its velocity or speed, that is, the rate at which new data is generated.
  • 26. Data Analysis Data analysis is the process of extracting value from data. This is where data science gets exciting. It’s a creative process.  Ask the right questions It is important to ask the right questions. They usually come from the management or other colleagues, who may already have suspicions based on their experience.  Do qualitative research It’s important to understand the business and its customers in detail. This can be achieved through qualitative research, which in turn gives direction to useful investigations through data.
  • 27. Three Major Business Applications  Business Analytics It answers the questions of “what has happened in the past?” and “where are we now?” E.g. reporting, measuring retention, finding the right user segments, funnel analysis, etc.  Predictive Analytics It answers the question, “what will happen in the future?” E.g. early warning, predicting the marketing budget you will need in the next quarter, etc.  Data (Based) Product A product that is built, and works using your data. E.g. recommendation systems, image recognition, voice recognition, etc.
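Funnel analysis, one of the business analytics examples named in slide 27, can be sketched as follows. The event log, the user ids and the three step names are hypothetical; a real funnel would be read from the tracking data discussed under Data Collection.

```python
# Hypothetical funnel definition and event log: (user_id, step) pairs.
FUNNEL_STEPS = ["visited", "added_to_cart", "purchased"]
EVENTS = [
    ("u1", "visited"), ("u1", "added_to_cart"), ("u1", "purchased"),
    ("u2", "visited"), ("u2", "added_to_cart"),
    ("u3", "visited"),
    ("u4", "visited"),
]


def funnel_counts(events, steps):
    """Count users who reached each step and every step before it."""
    users_by_step = {step: set() for step in steps}
    for user, step in events:
        if step in users_by_step:
            users_by_step[step].add(user)
    counts = []
    reached = None
    for step in steps:
        current = users_by_step[step]
        # Only users who also completed all earlier steps stay in the funnel.
        reached = current if reached is None else reached & current
        counts.append((step, len(reached)))
    return counts


def step_conversion(counts):
    """Conversion rate of each step relative to the previous step."""
    rates = []
    for (_, previous), (step, current) in zip(counts, counts[1:]):
        rates.append((step, current / previous if previous else 0.0))
    return rates


COUNTS = funnel_counts(EVENTS, FUNNEL_STEPS)
RATES = step_conversion(COUNTS)
```

With this sample log, four users visit, two add to cart and one purchases, so each step converts half of the users who reached the step before it. A report like this answers the "where are we now?" question before any predictive work is attempted.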
  • 28.  SafetiPin is a map-based mobile phone application, which leverages the power of big data to make our communities and cities safer for women.  It provides safety-related information collected through crowdsourcing.  The app captures data on 9 parameters (lighting, openness, visibility, people density, security in the area, walk path, transportation, gender diversity, feeling in the area), and uses it to compute and provide a safety score, the information on personal vulnerability to crime, for every pocket of the city.  The app uses this score and integrates with big data sources such as Google Maps to recommend the Safest Route, the best possible route in terms of safety.
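To make the idea of a score built from the nine parameters concrete, here is a purely illustrative sketch. It is not SafetiPin's actual method: the 0-to-1 rating scale, the equal weighting, and the rule of picking the route with the highest average score are all assumptions made for the example.

```python
# The nine parameters named on the slide; the identifiers are illustrative.
PARAMETERS = [
    "lighting", "openness", "visibility", "people_density", "security",
    "walk_path", "transportation", "gender_diversity", "feeling",
]


def safety_score(ratings):
    """Equal-weight average of the nine ratings (assumed to lie in 0..1)."""
    missing = [name for name in PARAMETERS if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    return sum(ratings[name] for name in PARAMETERS) / len(PARAMETERS)


def safest_route(routes):
    """Pick the route whose segments have the highest average score.

    `routes` maps a route name to a list of per-segment safety scores.
    """
    return max(routes, key=lambda name: sum(routes[name]) / len(routes[name]))


# Hypothetical data: one location rated 0.5 on every parameter, two routes.
example_score = safety_score({name: 0.5 for name in PARAMETERS})
best = safest_route({"main_road": [0.8, 0.6], "shortcut": [0.9, 0.2]})
```

Here the shortcut has one very safe segment and one poorly rated one, so the main road wins on average. A real system would likely weight the parameters differently and might penalize the worst segment rather than average over the route.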
  • 29. Data Communication This is the step where most data science projects fail. To reap the benefits of Data Science, effective communication of the findings is crucial.  It is necessary to build a culture where people can communicate and use data. For this, everyone at your company needs to be involved.  Business people should also educate data scientists by helping them to create and deliver better presentations.  Communication should be as simple as it can be.  No fancy scientific words  No complicated charts
  • 30. What people do you need in your team? Your data science team should feature  the best data engineers,  the best software developers, and  the best statisticians. They need to have domain knowledge to know the actual business application of their data projects.
  • 31. Data Science Roles: Data Engineer The data engineer is someone who develops, constructs, tests and maintains data architectures, such as databases, data warehouses, data lakes and large-scale processing systems. Data engineers manage data of all sizes and types. They develop, deploy, manage, and optimize data pipelines and infrastructure to transform and transfer data to data scientists for querying. Skills needed: SQL, databases, data warehousing, ETL, Big data tools, building APIs
  • 32. Data Science Roles: Data Analyst Data analysts perform the following tasks  Data wrangling  Creating data visualizations and dashboards  Analyzing data to discover interesting trends  Presenting the results of analysis to business clients or internal teams  Helping other stakeholders to optimize their data utilization Skills needed: Programming skills (SAS, R, Python), statistical and mathematical skills, data wrangling, data visualization tools like Tableau/Power BI
  • 33. Data Science Roles: Data Scientist A data scientist is a specialist having expertise in Statistics and developing models, including predictive models and machine learning models.  Data scientists can tackle more open-ended questions by leveraging their knowledge of advanced statistics.  Data scientists bring an entirely new approach and perspective to understanding data Skills needed: Programming skills (SAS, R, Python), statistical and mathematical skills, storytelling and data visualization, Hadoop, SQL, machine learning, Big data analytics.
  • 34. Data Science projects can fail Yes, that’s true! Here are some of the reasons.  Not every manager is ready for this change. Even a very well-executed data project can fail, just because someone’s feelings or ego is hurt.  Answering the wrong question  Failure to integrate into business operations  Stakeholders disengaged  Benefits don’t justify the costs
  • 35. Developing Data Science culture Failures can be prevented by establishing a data-driven company culture early on. As the company size increases, it becomes harder to make the organization data-driven.  It’s important that the managers develop the right mindset.  It is important that everyone in the organization understands the importance of data science. Data professionals should hold frequent presentations about their recent findings.
  • 36. Data Strategy Why Data Strategy? If you don't have a data strategy, you won't have enough information to make the right decisions. Having a data strategy is crucial to becoming a data-driven organization. Without it  you will waste money on the wrong marketing campaigns  you will have the wrong product development plans
  • 37. Where do I begin? It is recommended to start by developing a Data Strategy. For this, the following questions need to be answered  What are the right metrics to focus on? And how to figure them out?  How to collect and store the data? Which tools should you use?  Can you trust your data? And how can you make it trustworthy?  How to communicate the data in your organization efficiently? Start with a simple data project that answers the basic questions about your business. Subsequently, as you recognize your customers’ needs, you may initiate other projects such as predictive modelling and machine learning.
  • 38. Pick your first data project Develop and use the Prioritization matrix.
  • 39. Your first data project Your first data project should be a simple (feasible) project that aims at understanding your own business and your customers better (high business value). In other words, start by investing in business analytics and simple reports. This project answers the basic questions about your business, such as  Who prefers what and why?  How to win customer loyalty?  Why did a particular product fail? And so on …
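The prioritization matrix from slides 38 and 39 can be sketched as a small scoring exercise over the two axes the deck names, feasibility and business value. The candidate projects, the 1-to-5 scores, the threshold of 3 and the quadrant labels are hypothetical; each team would supply its own.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    feasibility: int      # 1 (hard) to 5 (easy), assumed scale
    business_value: int   # 1 (low) to 5 (high), assumed scale


def quadrant(project, threshold=3):
    """Place a project in one of the four cells of the matrix."""
    feasible = project.feasibility >= threshold
    valuable = project.business_value >= threshold
    if feasible and valuable:
        return "do first"
    if valuable:
        return "plan for later"
    if feasible:
        return "low priority"
    return "avoid"


def rank(projects):
    """Order projects by feasibility times business value, best first."""
    return sorted(
        projects,
        key=lambda p: p.feasibility * p.business_value,
        reverse=True,
    )


# Hypothetical candidates for a first data project.
CANDIDATES = [
    Project("Churn prediction model", feasibility=2, business_value=5),
    Project("Basic sales reports", feasibility=5, business_value=4),
    Project("Recommendation engine", feasibility=1, business_value=4),
    Project("Logo color test", feasibility=5, business_value=1),
]

FIRST_PROJECT = rank(CANDIDATES)[0]
```

In this example the basic sales reports land in the "do first" cell and rank highest, which agrees with the advice to start with business analytics and simple reports before moving on to predictive work.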
  • 41. You can write to me kalamkar.vipul-stat@msubaroda.ac.in