WHAT IS DATA SCIENCE ?
BY
SHILPA KRISHNA
RESEARCH SCHOLAR
DATA SCIENCE PROCESS
[Diagram: Discovery, Data Preparation, Model Planning, Model Building, Operation, Communicate Results]
DISCOVERY
 Discovery involves acquiring data from all the identified internal and external sources that can help you answer the business question.
 The data can be:
1. Logs from web servers
2. Data gathered from social media
3. Census datasets
4. Data streamed from online sources using APIs
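As an illustration of the first source, here is a minimal sketch of turning one web server log line into structured fields. The sample line, the field names, and the assumption that the server writes the Common Log Format are all hypothetical; real logs may need a different pattern.

```python
import re

# Hypothetical access-log line, assumed to follow the Common Log Format.
LOG_LINE = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A dash in the size field means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

record = parse_log_line(LOG_LINE)
print(record["host"], record["path"], record["status"])
```

Lines that do not fit the pattern come back as None, so malformed entries can be counted or set aside rather than silently dropped.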
DATA PREPARATION
 Data can have many inconsistencies, such as missing values, blank columns, and incorrect data formats, which need to be cleaned.
 You need to process, explore, and condition the data before modeling.
 The cleaner your data, the better your predictions.
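The three kinds of inconsistency named above can be sketched with the Python standard library alone. The sample data, the column names, and the choice to fill missing ages with the median are assumptions made for illustration; a real project would pick a cleaning strategy to suit its data.

```python
import csv
import io
from datetime import datetime
from statistics import median

# Hypothetical raw extract: a missing age, a blank column, and mixed date formats.
RAW_CSV = """name,age,notes,signup_date
Asha,34,,2023-01-15
Ben,,,15/02/2023
Chen,29,,2023-03-01
"""

def parse_date(text):
    """Normalize the two date formats seen in this sample to ISO 8601."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # leave unparseable dates as missing

def clean(rows):
    # Drop columns that are blank in every row.
    columns = [c for c in rows[0] if any(r[c].strip() for r in rows)]
    # Fill missing ages with the median of the observed ages (an assumed strategy).
    ages = [int(r["age"]) for r in rows if r["age"].strip()]
    fill_age = median(ages)
    cleaned = []
    for r in rows:
        row = {c: r[c].strip() for c in columns}
        row["age"] = int(row["age"]) if row["age"] else fill_age
        row["signup_date"] = parse_date(row["signup_date"])
        cleaned.append(row)
    return cleaned

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
cleaned_rows = clean(rows)
for row in cleaned_rows:
    print(row)
```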
MODEL PLANNING
 In this stage, you determine the methods and techniques for drawing out the relationships between input variables.
 Model planning is performed by using different statistical formulas and visualization tools such as SQL Analysis Services, R, and SAS/ACCESS.
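One statistical formula often used to examine the relationship between input variables is the Pearson correlation coefficient. The sketch below computes it by hand for every pair of variables; the variable names and values are hypothetical, and other measures may suit a given dataset better.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical input variables for a handful of customers.
variables = {
    "visits": [1, 2, 3, 4, 5],
    "spend": [10, 20, 30, 40, 50],    # rises in step with visits
    "complaints": [5, 4, 3, 2, 1],    # falls as visits rise
}

# Print the coefficient for every pair of variables.
names = list(variables)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: {pearson(variables[a], variables[b]):+.2f}")
```

Values near +1 or -1 suggest that two inputs carry largely the same information, which can inform which variables to keep when planning the model.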
MODEL BUILDING
 The data scientist splits the dataset into training and testing sets.
 Techniques like association, classification, and clustering are applied to the training dataset.
 Once prepared, the model is tested against the "testing" dataset.
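The split, train, and test sequence can be sketched as follows. The data points are hypothetical, and the nearest-centroid classifier is chosen only because it fits in a few lines; it stands in for whatever classification technique a project actually uses.

```python
import random

# Hypothetical labeled points: ((feature_1, feature_2), label).
DATA = [((1.0, 1.2), "a"), ((0.8, 1.0), "a"), ((1.1, 0.9), "a"), ((0.9, 1.1), "a"),
        ((1.2, 1.0), "a"), ((5.0, 5.1), "b"), ((5.2, 4.9), "b"), ((4.8, 5.0), "b"),
        ((5.1, 5.2), "b"), ((4.9, 4.8), "b")]

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_centroids(rows):
    """Nearest-centroid training: average the feature vectors of each class."""
    sums, counts = {}, {}
    for (x, y), label in rows:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab]) for lab, (sx, sy) in sums.items()}

def predict(centroids, point):
    """Assign the label of the closest class centroid (squared Euclidean distance)."""
    x, y = point
    return min(centroids,
               key=lambda lab: (centroids[lab][0] - x) ** 2 + (centroids[lab][1] - y) ** 2)

train, test = train_test_split(DATA)
model = fit_centroids(train)  # the model only ever sees the training rows
accuracy = sum(predict(model, p) == label for p, label in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

Keeping the test rows out of training is what makes the reported accuracy an estimate of performance on unseen data.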
OPERATIONALIZE
 You deliver the final baselined model along with reports, code, and technical documents.
 The model is deployed into a real-time production environment after thorough testing.
COMMUNICATE RESULTS
 The key findings are communicated to all stakeholders.
 This helps you decide whether the project is a success or a failure, based on the inputs from the model.
THE MOST PROMINENT DATA SCIENCE JOB TITLES ARE:
1) Data scientist
2) Data engineer
3) Data analyst
4) Statistician
5) Data administrator
6) Business analyst
Data Scientist
ROLE
 A professional who manages enormous amounts of data to come up with compelling business visions by using various tools, techniques, methodologies, algorithms, etc.
LANGUAGES
 R
 SAS
 PYTHON
 SQL
 HIVE
 MATLAB
 PIG
 SPARK
Data Engineer
ROLE
 A data engineer works with large amounts of data and develops, constructs, tests, and maintains architectures such as large-scale processing systems and databases.
LANGUAGES
 SQL
 HIVE
 R
 SAS
 MATLAB
 PYTHON
 JAVA
 RUBY
 C++
 PERL
Data Analyst
ROLE
 Responsible for mining vast amounts of data and looking for relationships, patterns, and trends in it.
 Later delivers compelling reports and visualizations for analyzing the data, so that the most viable business decisions can be taken.
LANGUAGES
 R
 PYTHON
 HTML
 JS
 C
 C++
 SQL
Statistician
ROLE
 Collects, analyzes, and understands qualitative and quantitative data by using statistical theories and methods.
LANGUAGES
 SQL
 R
 MATLAB
 TABLEAU
 PYTHON
 PERL
 SPARK
 HIVE
Data Administrator
ROLE
 A data administrator should ensure that the database is accessible to all relevant users, that it is performing correctly, and that it is kept safe from hacking.
LANGUAGES
 RUBY on Rails
 SQL
 JAVA
 C#
 PYTHON
Business Analyst
ROLE
 This professional needs to improve business processes and is an intermediary between the business executive team and the IT department.
LANGUAGES
 SQL
 TABLEAU
 POWER BI
 PYTHON
DEFINE THE GOAL
 Define a measurable and quantifiable goal
 The goal should be specific and precise
 The aim is to come up with candidate hypotheses. These hypotheses can then be turned into concrete questions or goals for a full-scale modeling project.
COLLECT AND MANAGE DATA
 A time-consuming step
 Conduct initial exploration and visualization of the data
 Clean the data: repair data errors and transform variables as needed
BUILD THE MODEL
The most common data science modeling tasks are:
 Classification
 Scoring
 Ranking
 Clustering
 Finding relations
 Characterization
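Three of these tasks (scoring, classification, and ranking) can be read off the same set of model outputs, which the sketch below tries to show. The customer names, the scores, and the 0.5 threshold are all hypothetical.

```python
# Hypothetical model scores: estimated probability that each customer will churn.
scores = {"cust_1": 0.91, "cust_2": 0.15, "cust_3": 0.62, "cust_4": 0.38}

THRESHOLD = 0.5  # assumed decision cut-off; the right value depends on the project

# Scoring: the numeric estimate itself is the output.
for customer, score in scores.items():
    print(f"{customer}: score={score:.2f}")

# Classification: turn each score into a category by thresholding.
labels = {c: ("churn" if s >= THRESHOLD else "stay") for c, s in scores.items()}

# Ranking: order the customers from most to least likely to churn.
ranking = sorted(scores, key=scores.get, reverse=True)

print(labels)
print(ranking)
```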
EVALUATE AND CRITIQUE MODEL
Once you have a model, you need to
determine if it meets your goals :
 Is it accurate enough for your needs ?
 Does it perform better than the obvious
guess ?
 Do the results of the model make sense in
the context of the problem domain ?
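The "obvious guess" question can be made concrete by comparing the model with a baseline that always predicts the most common class. The labels and predictions below are hypothetical, and the majority class is taken from the evaluation labels only for brevity; in practice it would normally come from the training data.

```python
from collections import Counter

# Hypothetical held-out labels and the model's predictions for them.
actual    = ["spam", "ham", "ham", "ham", "spam", "ham", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]

def accuracy(truth, guesses):
    """Fraction of positions where the guess equals the true label."""
    return sum(t == g for t, g in zip(truth, guesses)) / len(truth)

# The "obvious guess": always predict the most common class.
majority_class, _ = Counter(actual).most_common(1)[0]
baseline_accuracy = accuracy(actual, [majority_class] * len(actual))
model_accuracy = accuracy(actual, predicted)

print(f"baseline accuracy: {baseline_accuracy:.2f}")
print(f"model accuracy:    {model_accuracy:.2f}")
print("beats baseline" if model_accuracy > baseline_accuracy
      else "no better than guessing the majority class")
```

On imbalanced data a model can look accurate while adding little over this baseline, which is why the comparison is worth making explicitly.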
PRESENT RESULTS AND DOCUMENT
 Present results to your project sponsor and other stakeholders.
 Document the model for those in the organization who are responsible for using, running, and maintaining the model once it has been deployed.
DEPLOY MODEL
 Make sure that the model can be updated as its environment changes.
 The model can initially be deployed in a small pilot program.
Several ways of gathering data for analysis are:
 CSV FILE
 FLAT FILE (tab, space, or any other separator)
 TEXT FILE (in a single file: reading data all at once, or reading data line by line)
 ZIP FILE
 APIs (JSON)
 MULTIPLE TEXT FILES (data is split over multiple text files)
 DOWNLOAD FILE FROM INTERNET (file hosted on a server)
 WEBPAGE (scraping)
 RDBMS (SQL tables)
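Several of these sources can be read with the Python standard library, as sketched below. To keep the example self-contained, small in-memory strings stand in for real files and API responses; the contents, the file name cities.csv, and the JSON layout are all hypothetical.

```python
import csv
import io
import json
import zipfile

# Small in-memory stand-ins for real files and API responses (all hypothetical).
CSV_TEXT = "id,city\n1,Kochi\n2,Pune\n"
TSV_TEXT = "id\tcity\n3\tDelhi\n"
JSON_TEXT = '{"results": [{"id": 4, "city": "Mumbai"}]}'

# CSV file: comma-separated values with a header row.
csv_rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Flat file: same reader, different separator.
tsv_rows = list(csv.DictReader(io.StringIO(TSV_TEXT), delimiter="\t"))

# Text file: all at once, or line by line.
whole_text = io.StringIO(CSV_TEXT).read()
lines = [line.rstrip("\n") for line in io.StringIO(CSV_TEXT)]

# ZIP file: build an archive in memory, then read a member without extracting it to disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("cities.csv", CSV_TEXT)
with zipfile.ZipFile(buffer) as archive:
    with archive.open("cities.csv") as member:
        zip_rows = list(csv.DictReader(io.TextIOWrapper(member, encoding="utf-8")))

# API response: JSON text decoded into Python dicts and lists.
api_rows = json.loads(JSON_TEXT)["results"]

print(csv_rows, tsv_rows, lines, zip_rows, api_rows, sep="\n")
```

With real files, the io.StringIO and io.BytesIO objects would be replaced by open(...) calls on the actual paths.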
 A relational database stores data in tables, whose rows are called records
 Connections among records are established by using primary keys and foreign keys
 This allows users to establish defined relationships between tables
 In an RDBMS, we use SQL statements to retrieve and analyze data
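The role of primary and foreign keys can be sketched with Python's built-in sqlite3 module and an in-memory database. The customers and orders tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: each order row refers to one customer row.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0)])

# Join the two tables through the key relationship and aggregate per customer.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.name
""").fetchall()
print(totals)

# A row that points at a non-existent customer is rejected by the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (13, 99, 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("orphan order rejected:", rejected)
```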
SOME COMMONLY USED PLOTS FOR EXPLORATORY DATA ANALYSIS (EDA) ARE:
 Histogram
 Scatter plots
 Maps
 Feature correlation plot (heatmap)
 Time series plots
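To show what a histogram actually computes, the sketch below bins a small sample and prints the counts as text. The ages and the bin width of 10 are hypothetical; a plotting library would normally draw the bars, but the counting step is the same.

```python
from collections import Counter

# Hypothetical sample: ages of survey respondents.
ages = [22, 25, 27, 31, 33, 34, 35, 38, 41, 44, 45, 52, 58, 63]

BIN_WIDTH = 10  # assumed bin width; a histogram's shape depends on this choice

# Assign each value to the lower edge of its bin, then count values per bin.
counts = Counter((age // BIN_WIDTH) * BIN_WIDTH for age in ages)

# Print a text histogram, one row per bin.
for lower in sorted(counts):
    upper = lower + BIN_WIDTH - 1
    print(f"{lower:>2}-{upper:<2} | {'#' * counts[lower]}")
```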
Data management platforms enable organizations and enterprises to use data analytics in beneficial ways, such as:
 Personalizing the customer experience
 Adding value to customer interactions
 Improving customer engagement
 Increasing customer loyalty
 Reaping the revenues associated with data-driven marketing
 Identifying the root causes of marketing failures and business issues in real time
