Data Science
An emerging area of work concerned with the collection,
preparation, analysis, visualization, management, and
preservation of large collections of information.
Web page
Much of the data in the world is non-numeric and
unstructured.
Unstructured means that the data are not arranged in neat
rows and columns. Think of a web page.
Data architecture
Data acquisition
Data analysis
Data archiving
Data architecture
The data architect provides input on how the data would
need to be routed and organized to support the analysis,
visualization, and presentation of the data to the
appropriate people.
Data acquisition
Focuses on how the data are collected and,
importantly, how the data are represented prior
to analysis and presentation.
Tool example: barcode
Different barcodes may be used for the same product
(for example, for different sized boxes of cereal).
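The barcode point can be sketched in R. This is only an illustration: the barcodes, product names, and variable names below are made up, and a named vector is just one possible way to represent the mapping before analysis.

```r
# Hypothetical barcodes: two box sizes of the same cereal carry different codes.
barcodeToProduct <- c(
  "0001112223334" = "Corn cereal",
  "0001112223341" = "Corn cereal",
  "0009998887776" = "Oat cereal"
)

# Hypothetical scanner output, one entry per sale.
scannedCodes <- c("0001112223334", "0001112223341",
                  "0009998887776", "0001112223334")

# Translate each scanned barcode into its product name before analysis.
scannedProducts <- unname(barcodeToProduct[scannedCodes])

# Count sales per product rather than per barcode.
salesPerProduct <- table(scannedProducts)
print(salesPerProduct)
```

Counting by raw barcode would split the same cereal into two rows, which is why the representation chosen at acquisition time matters.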
Data analysis
Using portions of data (samples) to make
inferences about the larger context, and
visualizing the data by presenting it in tables,
graphs, and even animations.
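A minimal R sketch of the sampling idea, assuming a simulated population (the ages are generated at random and are not data from the slides; all variable names are illustrative):

```r
set.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated ages (made-up data).
population <- round(runif(10000, min = 1, max = 90))

# Draw a random sample of 100 values without replacement.
ageSample <- sample(population, size = 100)

# The sample mean serves as an estimate of the population mean.
sampleMean <- mean(ageSample)
populationMean <- mean(population)

print(c(sample = sampleMean, population = populationMean))
```

In practice the population mean is usually unknown; it is computed here only so the estimate can be compared with it.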
Data archiving
Preservation of collected data in a form that
makes it highly reusable. "Data curation" is
a difficult challenge because it is so hard to
anticipate all of the future uses of the data.
Example (Twitter):
Geocodes: data that show the geographical location
from which a tweet was sent could be a useful
element to store with the data.
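As a rough illustration of keeping the geocode with the archived data, the R sketch below stores made-up tweets in a data frame. The texts, coordinates, and column names are all hypothetical and do not reflect any real Twitter data format.

```r
# Hypothetical archived tweets; every value here is made up for illustration.
archivedTweets <- data.frame(
  text = c("Harvest started today", "Rain delayed planting"),
  latitude = c(43.05, 41.88),
  longitude = c(-76.15, -87.63),
  stringsAsFactors = FALSE
)

# Because the geocode was kept, a later user can filter by location,
# a use the original collector may not have anticipated.
northernTweets <- archivedTweets[archivedTweets$latitude > 42, ]
print(northernTweets$text)
```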
Learning the application domain
Communicating with data users
Seeing the big picture of a complex system
Knowing how data can be represented: metadata
Data transformation and analysis
Visualization and presentation
Attention to quality
Ethical reasoning: privacy
About Data
• "Data" comes from the Latin word "datum,"
meaning a "thing given."
“The fundamental problem of communication is that of
reproducing at one point either exactly or approximately a
message selected at another point.”
Claude Shannon
[Diagram: Yes = 1, No = 0; Maybe = 01; ASCII]
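The bit and ASCII ideas on this slide can be tried in R. The sketch below is illustrative only: the variable names are assumptions, and it uses the base R functions utf8ToInt() and intToBits() to show the numeric code and bit pattern of one character.

```r
# One bit can distinguish two messages, for example "no" (0) and "yes" (1).
# With n bits, 2^n distinct messages are possible.
messagesPerBits <- 2^(1:8)   # 2, 4, 8, ..., 256

# ASCII assigns a numeric code to each character; utf8ToInt() returns it.
codeForA <- utf8ToInt("A")   # 65 in ASCII

# intToBits() lists bits least significant first, so take the lowest 8
# and reverse them to read the byte in the usual order.
bitsForA <- rev(as.integer(intToBits(codeForA))[1:8])
bitString <- paste(bitsForA, collapse = "")

print(messagesPerBits)
print(codeForA)
print(bitString)             # "01000001"
```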
Identifying Data Problems
Data science is an applied activity, and data scientists
serve the needs and solve the problems of data users.
Hint:
The data scientist may never actually become a
farmer, but if you are going to identify a data problem
that a farmer has, you have to learn to think like a
farmer, to some degree.
Three questions:
• Ask subject matter experts.
• Ask about anomalies.
• Ask about risks and uncertainty.
Introduction To R
R is an open source software program: an integrated suite
of software facilities for data manipulation, calculation,
and graphical display. Among other things, it has:
• an effective data handling and storage facility,
• a suite of operators for calculations on arrays, in
particular matrices,
• a large, coherent, integrated collection of
intermediate tools for data analysis,
• graphical facilities for data analysis and display
either directly at the computer or on hardcopy.
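A short sketch of the array and matrix operators mentioned above, using only base R. The matrix values and variable names are arbitrary choices for illustration.

```r
# Build a 2 x 2 matrix; matrix() fills column by column.
m <- matrix(c(2, 0, 1, 3), nrow = 2)
# m is:
#   2 1
#   0 3

doubled <- m * 2         # element-wise arithmetic
product <- m %*% m       # matrix multiplication
transposed <- t(m)       # transpose
inverse <- solve(m)      # matrix inverse

print(product)
print(m %*% inverse)     # approximately the identity matrix
```

Note that `*` works element by element, while `%*%` is the matrix product.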
Additional Pros:
• R was among the first analysis programs to
integrate capabilities for drawing data directly from
the Twitter social media platform.
• The extensibility of R means that new modules are
being added all the time by volunteers.
• The lessons one learns in working with R are almost
universally applicable to other programs and
environments.
Cons:
• R is "command line" oriented.
• R is not especially good at giving feedback or error
messages.
How to write a text
myText <- "this is a piece of text"
• Create a data set:
myFamilyAges <- c(43, 42, 12, 8, 5)
c(): concatenates data elements together
• Assignment arrow: <-
• Some mathematical functions:
sum(): adds data elements
range(): min value and max value
mean(): the average
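The commands on this slide can be run together as follows. The result variable names (totalAge, ageRange, averageAge) are added here for illustration and are not part of the slides.

```r
# A text string and a small data set, as on the slide.
myText <- "this is a piece of text"
myFamilyAges <- c(43, 42, 12, 8, 5)

totalAge <- sum(myFamilyAges)      # 43 + 42 + 12 + 8 + 5 = 110
ageRange <- range(myFamilyAges)    # smallest and largest value: 5 43
averageAge <- mean(myFamilyAges)   # 110 / 5 = 22

print(myText)
print(totalAge)
print(ageRange)
print(averageAge)
```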
Introduction to data science intro,ch(1,2,3)
