How to Become a Data Scientist from Scratch
by SONU KUMAR
What is Data Science?
O Data Science is a multidisciplinary subject that uses
mathematics, statistics, and computer science to study and
evaluate data. Its key objective is to extract valuable
information for use in strategic decision making, product
development, trend analysis, and forecasting.
O A data scientist is a sort of 'jack-of-all-trades' for data crunching.
The 3 main skills a data scientist needs are mathematics/statistics,
computer programming literacy, and knowledge of the particular
business domain.
Data Science is a Broader Field
Comparison between Different Roles
in 2018
How to become a Data Scientist?
Math
Programming Languages
Data Wrangling and Management
Data Analysis and Visualization
Machine Learning
Deep Learning
Mathematics
O Linear Algebra: Matrices, Eigenvalues and Eigenvectors, Tensors, etc.
O Calculus: Differentiation and Integration.
O Probability: Bayes' Theorem, Optimization, etc.
O Statistics: Inferential Statistics, Descriptive Statistics,
Chi-squared Tests, Random Variables, Gaussian (Normal)
Distribution.
[Best Resources:- Khan Academy and Machine Learning
Mystery Mathematics Course]
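As a small illustration of the descriptive statistics and chi-squared test listed above, here is a minimal sketch using only Python's standard library. The sample values and the observed/expected counts are invented for illustration, and `chi_squared_statistic` is a hypothetical helper name.

```python
import statistics

# Hypothetical sample: daily website visits (illustrative numbers only).
visits = [120, 135, 150, 110, 160, 145, 130]

mean = statistics.mean(visits)      # central tendency
median = statistics.median(visits)  # middle value of the sorted sample
stdev = statistics.stdev(visits)    # sample standard deviation (n - 1)


def chi_squared_statistic(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical die rolls: observed counts per face vs. a fair die (60 rolls).
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
chi2 = chi_squared_statistic(observed, expected)

print(f"mean={mean:.2f} median={median} stdev={stdev:.2f} chi2={chi2:.2f}")
```

The statistic alone is not a verdict: it still has to be compared against a chi-squared distribution with the appropriate degrees of freedom, which this sketch leaves out.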
Programming Languages
O Python: It is the Bible.
→ Easy to understand, i.e., plain English
→ No semicolons
→ Simple, with tons of libraries available
O Talk about Packages
→ Data visualization using ggplot2 and tidy is extremely important
[Best Resource :- Sentdex YouTube channel]
Libraries
Data Wrangling and Management
O Data Mining
O Data Cleaning
O Data Management
Relevant Skills:
→ MySQL: RDBMS
→ NoSQL: MongoDB, Cassandra, etc.
JOIN
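To make the JOIN idea concrete, the sketch below runs an inner join between two small tables. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL (an assumption made so the example runs without a server); the table names, columns, and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.5);
    """
)

# INNER JOIN keeps only customers that have at least one matching order.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

for name, total in rows:
    print(name, total)

conn.close()
```

The customer with no orders does not appear in the result; a LEFT JOIN would keep that row with a NULL total instead.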
Data Analysis and Visualization
O Plotting libraries in programming languages, e.g.,
• plotly, matplotlib, seaborn → Python
• ggplot2 → R
• Tableau is booming now.
[Pandas and NumPy for Data Analysis]
Machine Learning and Deep Learning
O Domain Knowledge?
HEALTHCARE, BUSINESS, FINANCE, SPORTS, etc.
Supervised, Unsupervised, Reinforcement
Machine Learning Algorithms
O Topics: Regression, Decision Tree, Random Forest, Naïve
Bayes, Ensemble Learning, AdaBoost, Hierarchical
Clustering, Association, k-means Clustering, SVM, KNN,
Gradient Descent, Cross Validation, Entropy, Accuracy,
Precision, Collaborative Filtering, PCA, Markov model,
Boltzmann theorem, etc.
Testing, Evaluation, and Validation of Models
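Accuracy, precision, and cross validation from the topic list can be sketched in a few lines of plain Python. This is an illustrative from-scratch version, not the API of any library; the function names and the label lists are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of everything predicted positive, the fraction that is truly positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    return sum(1 for t in predicted_pos if t == positive) / len(predicted_pos)


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print("accuracy:", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))

folds = list(k_fold_indices(n_samples=5, k=2))
print(folds)
```

In practice the data is usually shuffled before it is split into folds; the contiguous split here keeps the sketch short.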
Deep Learning Algorithms
O Neural Networks, Feed Forward NN, Fuzzy Logic,
Sequence Model, LSTM, RNN, CNN, CapsNet, Time Series,
etc.
Big Data
O MapReduce
O Hadoop
O Apache
O Spark
O Hive
O Pig
O Mahout
O YARN
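The MapReduce model behind several of these tools can be illustrated with the classic word count. The sketch below simulates the map, shuffle, and reduce phases in a single Python process; it does not use Hadoop or Spark, and the function names and documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input documents.
documents = ["big data is big", "data science uses data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle_phase(pairs))
print(word_counts)
```

On a real cluster the map and reduce calls run in parallel on different machines, and the framework performs the shuffle over the network.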
Additional Skills
NLP, CV
Course Contents And Projects
O Introduction to Data Mining
→ Introduction to Data Mining
→ Stages of the Data Mining Process
→ Data Mining Goals
→ Information and Knowledge
→ Advantages in Data Mining
→ Related technologies - Machine Learning, DBMS, OLAP, Statistics
→ Data Mining Techniques
→ Role of Data Mining in Various Fields like Artificial Intelligence and Internet of Things
→ Future scope of Data Mining
O Data Warehouse and OLAP/ Data preprocessing
→ Data cleaning
→ Data transformation
→ Data reduction
→ Data Warehouse and DBMS
→ Multidimensional data model
→ OLAP operations
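The cleaning and transformation steps above might look like this in plain Python. It is a minimal sketch under simple assumptions: missing values are marked with `None`, records with gaps are dropped, and min-max scaling is chosen as the transformation. The records and the `min_max_scale` helper are hypothetical, and data reduction is not shown.

```python
# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 25, "income": 30000},
    {"age": None, "income": 42000},
    {"age": 40, "income": 60000},
    {"age": 55, "income": 90000},
]

# Data cleaning: drop records with any missing field.
clean = [row for row in raw if all(v is not None for v in row.values())]


def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled_age = min_max_scale([row["age"] for row in clean])
scaled_income = min_max_scale([row["income"] for row in clean])
print(scaled_age, scaled_income)
```

Dropping incomplete records is only one option; filling gaps with a mean or median is a common alternative when data is scarce.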
O Machine Learning algorithms & concepts
→ Supervised and Unsupervised Techniques
→ Regression Analysis
→ Linear Regression and Logistic Regression
→ Classification
→ Prediction
→ Bayesian Classification Models
→ Association rules
→ Ensemble Learning
→ Neural Networks
→ Perceptron
→ MLP
→ SVM
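Of the models listed here, the perceptron is the simplest to write from scratch. The sketch below trains a single perceptron on the logical AND function using the standard perceptron learning rule; the function names, learning rate, and epoch count are illustrative choices.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Perceptron learning rule: w += lr * (target - prediction) * x."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(weights, bias, predictions)
```

A single perceptron cannot learn a function that is not linearly separable, such as XOR, which is one motivation for the multi-layer perceptron (MLP) in the same list.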
O Python/Anaconda
→ Introduction to Python and Anaconda
→ Conditional Statements
→ Looping, Control Statements
→ Lists, Tuples, Dictionaries
→ String Manipulation
→ Functions
→ Installing Packages
→ Introduction to Various Tools
→ Introduction to Anaconda
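Several of the Python basics in this module (conditionals, loops, lists, tuples, dictionaries, string manipulation, functions) fit into one short sketch. The names and scores are invented for illustration.

```python
def grade(score):
    """Conditional statement: map a numeric score to a letter grade."""
    if score >= 80:
        return "A"
    elif score >= 60:
        return "B"
    return "C"


# List of tuples: (name, score) pairs (hypothetical data).
students = [("asha", 91), ("ravi", 72), ("meera", 55)]

# Loop over the list and build a dictionary of name -> grade.
report = {}
for name, score in students:
    report[name.title()] = grade(score)  # string manipulation: capitalize

# String manipulation: join formatted entries into a single line.
summary = ", ".join(f"{name}: {letter}" for name, letter in report.items())
print(summary)
```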
O Working with Various Python Libraries
→ Installing libraries and packages for machine learning and data science
→ Matplotlib
→ SciPy and NumPy
→ Pandas
→ IPython toolkit
→ scikit-learn
→ TensorFlow, Keras and other deep learning libraries
O Data Structures in Python
→ Intro to NumPy Arrays
→ Creating ndarrays
→ Indexing
→ Data Processing using Arrays
→ File Input and Output
→ Sorting & Summarizing
→ Descriptive Statistics
→ Combining and Merging Data
O Data Analysis Using Pandas
→ Introduction to Pandas
→ Data Types in Pandas
→ Creating a DataFrame using Pandas
→ Importing and Exporting Databases
→ Working with Complex Data
→ Data Mining using Pandas
O Hands-on / Mini Projects on Data Sets
→ Modeling using Regression
→ Creating a Clustering Model
→ Loan Prediction Problem
→ Working on Iris Data Set
→ Titanic Data
→ Boston Housing Data Set
→ Predict Stock Prices
→ Classifying MNIST digits using Logistic Regression
→ Intrusion Detection using Decision
→ CIFAR Data Set
→ ImageNet Data Set
→ Credit Risk Analytics using SVM in Python
Learning Outcomes
O Build artificial neural networks with TensorFlow and Keras
O Build Deep Learning networks to classify images with
Convolutional Neural Networks
O Implement machine learning, clustering, and search using TF-IDF
at massive scale with Apache Spark's MLlib
O Implement Sentiment Analysis with Recurrent Neural Networks
O Understand reinforcement learning - and how to build a Pac-Man
bot
O Make predictions using linear regression, polynomial
regression, and multivariate regression
O Classify medical test results with a wide variety of
supervised machine learning classification techniques
O Cluster data using K-Means clustering and classify data using
Support Vector Machines (SVM)
O Build a spam classifier using Naive Bayes
O Use decision trees to predict hiring decisions
O Apply dimensionality reduction with Principal Component
Analysis (PCA) to classify flowers
O Predict classifications using K-Nearest-Neighbor (KNN)
O Develop using IPython notebooks
O Understand statistical measures such as standard deviation
O Visualize data distributions, probability mass functions, and
probability density functions
O Visualize data with matplotlib
O Use covariance and correlation metrics
O Apply conditional probability for finding correlated
features
O Use Bayes' Theorem to identify false positives
O Understand complex multi-level models
O Use train/test and K-Fold cross validation to choose the
right model
O Build a movie recommender system using item-based and
user-based collaborative filtering
O Clean your input data to remove outliers
O Design and evaluate A/B tests using T-Tests and P-Values
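The outcome "Use Bayes' Theorem to identify false positives" can be illustrated with a short calculation. The prevalence, sensitivity, and false-positive rate below are invented numbers for a hypothetical medical test, and `posterior_positive` is a hypothetical helper name.

```python
def posterior_positive(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical test: 1% prevalence, 99% sensitivity, 5% false-positive rate.
p = posterior_positive(prior=0.01, sensitivity=0.99, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")
```

With these assumed numbers the posterior works out to 1/6, so roughly five out of six positive results would be false positives even though the test looks accurate on paper.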
Best Resources (Online Videos)
O Learn Python for Data Science by Microsoft → edX
O Statistics and Probability by Khan Academy
O Introduction to Computing for Data Analysis → edX
O Machine Learning for Data Science and Analytics → edX
O Introduction to NoSQL Databases Solution → edX
O Intro to Hadoop and MapReduce → Coursera
[In sequential order from top]
Best Blogs and Open Source Community
O Medium AI Community
O freeCodeCamp
O Analytics Vidhya
O Official Documentation
O GitHub and Stack Overflow
O Kaggle: spend 5 hours a day here
O Cheat Sheets from Amazon AWS
Best Books
For Machine/ Deep Learning Data Science
Beginners
Book
Statistics
Overview of Data Science Tools and Packages
Thank You
