SlideShare a Scribd company logo
1 of 41
Introduction to Data
Science & Machine
Learning
Introduction to Data Analytics
 What is Data Analytics ?
 Evolution of Data Analytics
 Statistics and Data Analytics
 Journey from Data Analyst to Data Scientist
 Industry Use Cases
Python and Data Analytics
 Introduction
 Python for Data Analytics
 Jupyter Notebook
 Other popular Data Analytics
Tools – R , SAS
Terminologies Demystified
 Artificial Intelligence
 Machine Learning
 Deep Learning
 Big Data
Introduction to Data Scientists Open
Platforms
 What is Kaggle
 Beginner’s view into Kaggle
 Introduction to DataScientists.net
Data
The raw facts and figures are called Data.
The processed data is known as Information.
Information
What is Data Analytics ?
Set of Processes & Techniques to derive meaningful insights!
Data analytics can shape business processes, improve
decision-making, and foster business growth.
What is Analytics ?
Analysis is the process of breaking a complex topic or substance into
smaller parts in order to gain a better understanding of it.
Analysis of data is a process of inspecting, cleaning, transforming, and
modeling data with the goal of discovering useful information,
suggesting conclusions, and supporting decision-making.
Why Businesses need Analytics
 Humans can count only till so much
 We understand summarized information
 We understand graphs faster
 We need to take decisions
 Wrong Decisions lead to huge costs
Design and
Interpret
Experiments
Hypothesis Testing
Build models
Predict signal
Data Insights Build Actions
Regression, Classification,
Time Series Analysis
Regression, Causal Effects
Analysis,
Predictive Modeling, Latent
Variable Analysis, Dimensionality
Reduction, Collaborative
Filtering, Clustering
Statistics and Data Analytics
Machine Learning
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Data Analytics Life Cycle
Data Science
Hacking ( Acquiring,
Cleaning Data) +
Maths/Statistics (Insights
from Data) + Substantive
Expertise (Domain
Knowledge, Discovery
= Data Science
Examples from Industry
Industry Company Example
Cab rental Uber Recommending drivers where
they should place themselves
via heat map in order to take
advantage of the best fares
and most passengers
Media and Entertainment Netflix Recommender System
Healthcare Merck Safe and effective medicines
by predicting molecular
activity.
Finance Springleaf Autonomous offering of
personal and auto loans to the
customers.
Real estate Zillow Home value prediction.
Manufacturing Bosch Reduce manufacturing failures.
Artificial Intelligence
Theory and development of computer systems able to perform tasks
normally requiring human intelligence, such as visual perception,
speech recognition, decision-making, and translation between
languages.
Area of computer science that emphasizes the creation of intelligent
machines that work and react like humans.
Machine Learning
Machine learning is a field of computer science that uses statistical
techniques to give computer systems the ability to "learn" (i.e.,
progressively improve performance on a specific task) with data,
without being explicitly programmed. The name machine learning
was coined in 1959 by Arthur Samuel.
Machine learning is a way to achieve Artificial Intelligence.
Deep Learning
What is Deep Learning (deeplearning.net):
Deep Learning is a new area of Machine Learning research, which has
been introduced with the objective of moving Machine Learning
closer to one of its original goals: Artificial Intelligence.
Deep Learning
What is Deep Learning (Oxford University):
“In our quest to build machines capable of different brain functions, such as
image and speech understanding, we have discovered that it is of paramount
importance to understand how data in the world shapes the brain. Models
that are learned from data are the best at many tasks such as image
understanding (e.g., knowing where faces occur in images, recognizing road
features in self-driving cars) and speech recognition.”
Artificial
Intelligence
Machine Learning
Deep Learning
In Summary
Use cases from the world
 Self driving cars
• Tesla autopilot
• Google self-driving car
 Virtual Assistants - Siri, Alexa,
Cortona, Google Assistant
 Robots in smart warehouses-
Amazon & Alibaba
 Automated Customer Agents &
Chatbots
 Image Classification and
Recognition Systems
The Era of Big data
•Structured
•Text data
•Picture, Visuals
•Audio, Video
•Authenticity
•Trustworthiness
•Availability
•Accountability
•Torrents of data in
near real-time
•Social data
•Sensor data
•Streaming data
•Terabytes of data
•Transactions
•Social Data
•IoT, Sensor Data
•Machine to Machine
Volume Velocity
Variety
Veracity
Big Data - The 4 Vs
Value
Uncover Hidden Patterns,
unknown correlations,
customer preferences and
other various important
information
Help companies streamline
operations, improve marketing,
enhance customer engagement,
improve customer service
The value uncovered helps
organizations, industries to
create new products, to
explore new market
Extend the value of a
predictive model by
subsequently uncovering a
virtually unfathomable
combination of additional
variables
Big Data - The 5th V
Introduction
Python is a programming language that lets you work quickly
and integrate systems more effectively.
It is used in -
• Statistics
• Data Analysis
• Data Visualization
• Machine Learning
• Deep Learning
Python for Data Analytics
• Statistics i.e statsmodels
• Mathematics i.e numpy and scipy
• Data Handling i.e pandas.
• Data Visualization i.e matplotlib, seaborn, plotly and ggplot.
• Machine Learning i.e. Scikit-learn.
Jupyter notebook
• Ipython (Interactive python) Integrated development
environment.
• The Jupyter Notebook App is a server-client application that
allows editing and running notebook documents via a web
browser.
• The Jupyter Notebook App can be executed on a local
desktop requiring no internet access (as described in this
document) or can be installed on a remote server and
accessed through the internet
Other Popular Languages
Step
01
Starting to
learn a new
skill/language
Step
04
Step
03
Step
02
Do coding, and
assignments
and project
Prepare
Interview
Questions
Start with your
job process
Step
01
Starting to learn
a new
skill/language
Step
04
Step
03
Step
02
Do coding, and
assignments
and project
Establish yourself as
a coder on platforms
like stack overflow
Practice on datasets from different domains
on Kaggle like platforms Contribute to Kaggle
with datasets and kernels (code) Participate
in Code Competitions
B
E
F
O
R
E
N
O
W
Establishing yourself as a Data Scientist
What is Kaggle?
• A platform for Data Scientists and Data Science Competitions.
• In 2010, Kaggle was founded as a platform for predictive modelling
and analytics competitions on which companies and researchers post
their data and statisticians and data miners from all over the world
compete to produce the best models (Wikipedia).
Kaggle terminologies
 Datasets
 Real world data for open source community
 Every Dataset needs a story that catches interest of Data
Scientists:
• Why is this dataset interesting?
• What questions can the community help with?
 650 + Datasets
Kaggle Popular datasets
 Some popular datasets
• IMDB Movie dataset (Entertainment)
• European Soccer Database (Sports)
• Credit Card Fraud Detection (Finance)
• Human Resources Analytics (Cross Industry)
• Iris (Botany)
• Climate Change
• World University Rankings
• Medical Appointments No shows (Healthcare)
Kaggle terminologies
 Kaggle Kernels
 Code in R or Python
 Kernels is preloaded with the most common data science
languages and libraries.
 Look at Kernels from peer Data Scientists
 Look at the Kernels that have most votes
 Fork from an existing kernel
Kaggle Competitions
 Prize money
 Real world Data given by companies that are looking for some
serious insights and problem solving
 Examples
• Zillow Price (Real-Estate)
• Instacart Market Analysis (eCommerce)
• Mercedes-Benz (Manufacturing)
• Intel and MobileODT Cervical Cancer Screening (Healthcare)
Datascience.net
Smaller scale competitions.
Good for amatuer data scientists/analysts.
Visit https://www.datascience.net/
How it works ?
Dataset
• Multiple datasets used to teach several concepts
• You can start to get familiar with this one
o Ball by ball dataset of IPL until season 9.
o Download at https://www.kaggle.com/manasgarg/ipl
Motivation
Predicting the winner of next
IPL season based on previous
seasons data.
Motivation
Google predicted Germany as the winner of FIFA 2014 world
cup.
https://googleblog.blogspot.fr/2014/07/google-cloud-
platform-predicts-world.html

More Related Content

Similar to Workshop_Presentation.pptx

Cognitive technologies with David Schatsky at Blocks + Bots
Cognitive technologies with David Schatsky at Blocks + BotsCognitive technologies with David Schatsky at Blocks + Bots
Cognitive technologies with David Schatsky at Blocks + BotsAdrienne Debigare
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerLuminaryLabs1
 
London Online 2008
London Online 2008London Online 2008
London Online 2008Joe Buzzanga
 
Benefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleBenefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleMartin Kaltenböck
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseSoftServe
 
Artificial intelligence in industry
Artificial intelligence in industryArtificial intelligence in industry
Artificial intelligence in industryDipanjan Mitra
 
Introduction To Data Science
Introduction To Data Science Introduction To Data Science
Introduction To Data Science PriyaMaurya52
 
Advanced AI Applications In Enterprises
Advanced AI Applications In EnterprisesAdvanced AI Applications In Enterprises
Advanced AI Applications In EnterprisesAnandSRao1962
 
How would AI shape Future Integrations?
How would AI shape Future Integrations?How would AI shape Future Integrations?
How would AI shape Future Integrations?Srinath Perera
 
Big data (word file)
Big data  (word file)Big data  (word file)
Big data (word file)Shahbaz Anjam
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data scienceVipul Kalamkar
 
Unlocking Value from Unstructured Data
Unlocking Value from Unstructured DataUnlocking Value from Unstructured Data
Unlocking Value from Unstructured DataAccenture Insurance
 
15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf
15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf
15 DATA SCIENCE TRENDS TO RULE IN 2023.pdfUSDSI
 
Data science and business analytics
Data  science and business analyticsData  science and business analytics
Data science and business analyticsInbavalli Valli
 

Similar to Workshop_Presentation.pptx (20)

Cognitive technologies with David Schatsky at Blocks + Bots
Cognitive technologies with David Schatsky at Blocks + BotsCognitive technologies with David Schatsky at Blocks + Bots
Cognitive technologies with David Schatsky at Blocks + Bots
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 
London Online 2008
London Online 2008London Online 2008
London Online 2008
 
Semantic AI
Semantic AISemantic AI
Semantic AI
 
Intro to AI.pptx
Intro to AI.pptxIntro to AI.pptx
Intro to AI.pptx
 
Benefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleBenefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycle
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science Expertise
 
Untitled document.pdf
Untitled document.pdfUntitled document.pdf
Untitled document.pdf
 
Artificial intelligence in industry
Artificial intelligence in industryArtificial intelligence in industry
Artificial intelligence in industry
 
Introduction To Data Science
Introduction To Data Science Introduction To Data Science
Introduction To Data Science
 
Salesforce - AI for CRM
Salesforce - AI for CRMSalesforce - AI for CRM
Salesforce - AI for CRM
 
AI for CRM e-book
AI for CRM e-bookAI for CRM e-book
AI for CRM e-book
 
Advanced AI Applications In Enterprises
Advanced AI Applications In EnterprisesAdvanced AI Applications In Enterprises
Advanced AI Applications In Enterprises
 
How would AI shape Future Integrations?
How would AI shape Future Integrations?How would AI shape Future Integrations?
How would AI shape Future Integrations?
 
Big data (word file)
Big data  (word file)Big data  (word file)
Big data (word file)
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data science
 
Unlocking Value from Unstructured Data
Unlocking Value from Unstructured DataUnlocking Value from Unstructured Data
Unlocking Value from Unstructured Data
 
15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf
15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf
15 DATA SCIENCE TRENDS TO RULE IN 2023.pdf
 
Data science and business analytics
Data  science and business analyticsData  science and business analytics
Data science and business analytics
 
AI tools.pptx
AI tools.pptxAI tools.pptx
AI tools.pptx
 

Recently uploaded

Bangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Bangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceBangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Bangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDamini Dixit
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...
VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...
VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...aditipandeya
 
Top Call Girls In Arjunganj ( Lucknow ) ✨ 8923113531 ✨ Cash Payment
Top Call Girls In Arjunganj ( Lucknow  ) ✨ 8923113531 ✨  Cash PaymentTop Call Girls In Arjunganj ( Lucknow  ) ✨ 8923113531 ✨  Cash Payment
Top Call Girls In Arjunganj ( Lucknow ) ✨ 8923113531 ✨ Cash Paymentanilsa9823
 
Hyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Hyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceHyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Hyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDamini Dixit
 
Tirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Tirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceTirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Tirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDamini Dixit
 
Sangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Sangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceSangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Sangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDamini Dixit
 
Lucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Lucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceLucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Lucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDamini Dixit
 
Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...
Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...
Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...anilsa9823
 
Dehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Dehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Dehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDamini Dixit
 
CALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual serviceCALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual serviceanilsa9823
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...aditipandeya
 
Call girls in Andheri with phone number 9892124323
Call girls in Andheri with phone number 9892124323Call girls in Andheri with phone number 9892124323
Call girls in Andheri with phone number 9892124323Pooja Nehwal
 
call girls in Siolim Escorts Book Tonight Now Call 8588052666
call girls in Siolim Escorts Book Tonight Now Call 8588052666call girls in Siolim Escorts Book Tonight Now Call 8588052666
call girls in Siolim Escorts Book Tonight Now Call 8588052666nishakur201
 
CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...
CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...
CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...anilsa9823
 
VIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our EscortsVIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our Escortssonatiwari757
 
Top Call Girls In Indira Nagar Lucknow ( Lucknow ) 🔝 8923113531 🔝 Cash Payment
Top Call Girls In Indira Nagar Lucknow ( Lucknow  ) 🔝 8923113531 🔝  Cash PaymentTop Call Girls In Indira Nagar Lucknow ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment
Top Call Girls In Indira Nagar Lucknow ( Lucknow ) 🔝 8923113531 🔝 Cash Paymentanilsa9823
 

Recently uploaded (16)

Bangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Bangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceBangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Bangalore Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...
VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...
VIP 7001035870 Find & Meet Hyderabad Call Girls Jubilee Hills high-profile Ca...
 
Top Call Girls In Arjunganj ( Lucknow ) ✨ 8923113531 ✨ Cash Payment
Top Call Girls In Arjunganj ( Lucknow  ) ✨ 8923113531 ✨  Cash PaymentTop Call Girls In Arjunganj ( Lucknow  ) ✨ 8923113531 ✨  Cash Payment
Top Call Girls In Arjunganj ( Lucknow ) ✨ 8923113531 ✨ Cash Payment
 
Hyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Hyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceHyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Hyderabad Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
Tirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Tirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceTirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Tirupati Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
Sangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Sangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceSangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Sangareddy Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
Lucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Lucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceLucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Lucknow Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...
Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...
Lucknow 💋 Escort Service in Lucknow ₹7.5k Pick Up & Drop With Cash Payment 89...
 
Dehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Dehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceDehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Dehradun Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
CALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual serviceCALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Mohanlalganj Lucknow best sexual service
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Secunderabad high-profile Cal...
 
Call girls in Andheri with phone number 9892124323
Call girls in Andheri with phone number 9892124323Call girls in Andheri with phone number 9892124323
Call girls in Andheri with phone number 9892124323
 
call girls in Siolim Escorts Book Tonight Now Call 8588052666
call girls in Siolim Escorts Book Tonight Now Call 8588052666call girls in Siolim Escorts Book Tonight Now Call 8588052666
call girls in Siolim Escorts Book Tonight Now Call 8588052666
 
CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...
CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...
CALL ON ➥8923113531 🔝Call Girls Sushant Golf City Lucknow best sexual service...
 
VIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our EscortsVIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls 7001035870 Enjoy Call Girls With Our Escorts
 
Top Call Girls In Indira Nagar Lucknow ( Lucknow ) 🔝 8923113531 🔝 Cash Payment
Top Call Girls In Indira Nagar Lucknow ( Lucknow  ) 🔝 8923113531 🔝  Cash PaymentTop Call Girls In Indira Nagar Lucknow ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment
Top Call Girls In Indira Nagar Lucknow ( Lucknow ) 🔝 8923113531 🔝 Cash Payment
 

Workshop_Presentation.pptx

  • 1. Introduction to Data Science & Machine Learning
  • 2. Introduction to Data Analytics  What is Data Analytics ?  Evolution of Data Analytics  Statistics and Data Analytics  Journey from Data Analyst to Data Scientist  Industry Use Cases Python and Data Analytics  Introduction  Python for Data Analytics  Jupyter Notebook  Other popular Data Analytics Tools – R , SAS Terminologies Demystified  Artificial Intelligence  Machine Learning  Deep Learning  Big Data Introduction to Data Scientists Open Platforms  What is Kaggle  Beginner’s view into Kaggle  Introduction to DataScientists.net
  • 3. Data The raw facts and figures are called Data. The processed data is known as Information. Information
  • 4. What is Data Analytics ? Set of Processes & Techniques to derive meaningful insights! Data analytics can shape business processes, improve decision-making, and foster business growth.
  • 5. What is Analytics ? Analysis is the process of breaking a complex topic or substance into smaller parts in order to gain a better understanding of it. Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
  • 6. Why Businesses need Analytics  Humans can count only till so much  We understand summarized information  We understand graphs faster  We need to take decisions  Wrong Decisions lead to huge costs
  • 7.
  • 8. Design and Interpret Experiments Hypothesis Testing Build models Predict signal Data Insights Build Actions Regression, Classification, Time Series Analysis Regression, Causal Effects Analysis, Predictive Modeling, Latent Variable Analysis, Dimensionality Reduction, Collaborative Filtering, Clustering Statistics and Data Analytics
  • 9. Machine Learning Supervised Learning Unsupervised Learning Semi-supervised Learning
  • 11.
  • 12. Data Science Hacking ( Acquiring, Cleaning Data) + Maths/Statistics (Insights from Data) + Substantive Expertise (Domain Knowledge, Discovery = Data Science
  • 13. Examples from Industry Industry Company Example Cab rental Uber Recommending drivers where they should place themselves via heat map in order to take advantage of the best fares and most passengers Media and Entertainment Netflix Recommender System Healthcare Merck Safe and effective medicines by predicting molecular activity. Finance Springleaf Autonomous offering of personal and auto loans to the customers. Real estate Zillow Home value prediction. Manufacturing Bosch Reduce manufacturing failures.
  • 14.
  • 15. Artificial Intelligence Theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages. Area of computer science that emphasizes the creation of intelligent machines that work and react like humans.
  • 16. Machine Learning Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed. The name machine learning was coined in 1959 by Arthur Samuel. Machine learning is a way to achieve Artificial Intelligence.
  • 17. Deep Learning What is Deep Learning (deeplearning.net): Deep Learning is a new area of Machine Learning research, which has been introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence.
  • 18. Deep Learning What is Deep Learning (Oxford University): “In our quest to build machines capable of different brain functions, such as image and speech understanding, we have discovered that it is of paramount importance to understand how data in the world shapes the brain. Models that are learned from data are the best at many tasks such as image understanding (e.g., knowing where faces occur in images, recognizing road features in self-driving cars) and speech recognition.”
  • 20. Use cases from the world  Self driving cars • Tesla autopilot • Google self-driving car  Virtual Assistants - Siri, Alexa, Cortona, Google Assistant  Robots in smart warehouses- Amazon & Alibaba  Automated Customer Agents & Chatbots  Image Classification and Recognition Systems
  • 21. The Era of Big data
  • 22. •Structured •Text data •Picture, Visuals •Audio, Video •Authenticity •Trustworthiness •Availability •Accountability •Torrents of data in near real-time •Social data •Sensor data •Streaming data •Terabytes of data •Transactions •Social Data •IoT, Sensor Data •Machine to Machine Volume Velocity Variety Veracity Big Data - The 4 Vs
  • 23. Value Uncover Hidden Patterns, unknown correlations, customer preferences and other various important information Help companies streamline operations, improve marketing, enhance customer engagement, improve customer service The value uncovered helps organizations, industries to create new products, to explore new market Extend the value of a predictive model by subsequently uncovering a virtually unfathomable combination of additional variables Big Data - The 5th V
  • 24.
  • 25. Introduction Python is a programming language that lets you work quickly and integrate systems more effectively. It is used in - • Statistics • Data Analysis • Data Visualization • Machine Learning • Deep Learning
  • 26. Python for Data Analytics • Statistics i.e statsmodels • Mathematics i.e numpy and scipy • Data Handling i.e pandas. • Data Visualization i.e matplotlib, seaborn, plotly and ggplot. • Machine Learning i.e. Scikit-learn.
  • 27. Jupyter notebook • Ipython (Interactive python) Integrated development environment. • The Jupyter Notebook App is a server-client application that allows editing and running notebook documents via a web browser. • The Jupyter Notebook App can be executed on a local desktop requiring no internet access (as described in this document) or can be installed on a remote server and accessed through the internet
  • 29.
  • 30. Step 01 Starting to learn a new skill/language Step 04 Step 03 Step 02 Do coding, and assignments and project Prepare Interview Questions Start with your job process Step 01 Starting to learn a new skill/language Step 04 Step 03 Step 02 Do coding, and assignments and project Establish yourself as a coder on platforms like stack overflow Practice on datasets from different domains on Kaggle like platforms Contribute to Kaggle with datasets and kernels (code) Participate in Code Competitions B E F O R E N O W Establishing yourself as a Data Scientist
  • 31. What is Kaggle? • A platform for Data Scientists and Data Science Competitions. • In 2010, Kaggle was founded as a platform for predictive modelling and analytics competitions on which companies and researchers post their data and statisticians and data miners from all over the world compete to produce the best models (Wikipedia).
  • 32. Kaggle terminologies  Datasets  Real world data for open source community  Every Dataset needs a story that catches interest of Data Scientists: • Why is this dataset interesting? • What questions can the community help with?  650 + Datasets
  • 33. Kaggle Popular datasets  Some popular datasets • IMDB Movie dataset (Entertainment) • European Soccer Database (Sports) • Credit Card Fraud Detection (Finance) • Human Resources Analytics (Cross Industry) • Iris (Botany) • Climate Change • World University Rankings • Medical Appointments No shows (Healthcare)
  • 34. Kaggle terminologies  Kaggle Kernels  Code in R or Python  Kernels is preloaded with the most common data science languages and libraries.  Look at Kernels from peer Data Scientists  Look at the Kernels that have most votes  Fork from an existing kernel
  • 35. Kaggle Competitions  Prize money  Real world Data given by companies that are looking for some serious insights and problem solving  Examples • Zillow Price (Real-Estate) • Instacart Market Analysis (eCommerce) • Mercedes-Benz (Manufacturing) • Intel and MobileODT Cervical Cancer Screening (Healthcare)
  • 36. Datascience.net Smaller scale competitions. Good for amatuer data scientists/analysts. Visit https://www.datascience.net/
  • 38.
  • 39. Dataset • Multiple datasets used to teach several concepts • You can start to get familiar with this one o Ball by ball dataset of IPL until season 9. o Download at https://www.kaggle.com/manasgarg/ipl
  • 40. Motivation Predicting the winner of next IPL season based on previous seasons data.
  • 41. Motivation Google predicted Germany as the winner of FIFA 2014 world cup. https://googleblog.blogspot.fr/2014/07/google-cloud- platform-predicts-world.html