Displaying Quantitative
   Data with Graphs
        Section 1.1
What you’ll learn
To create and interpret the following
graphs:
   Dotplot
   Stem and leaf
      Regular Stem and Leaf
      Split Stem and Leaf
      Back-to-Back Stem and Leaf
   Histogram

    Time Plot
   Ogive
To learn how to display and describe quantitative data we
will be using some baseball statistics. The following table
shows the number of home runs in a single season for
three well-known baseball players: Hank Aaron, Barry
Bonds, and Babe Ruth.

     Hank Aaron          Barry Bonds          Babe Ruth
       13         32       16          40      54         46
       27         44       25          37      59         41
       26         39       24          34      35         34
       44         29       19          49      41         22
       30         44       33          73      46
       39         38       25                  25
       40         47       34                  47
       34         34       46                  60
       45         40       37                  54
       44         20       33                  46
       24                  42                  49
Dotplot
Label the horizontal axis with the name of the
variable and title the graph
Scale the axis based on the values of the
variable
Mark a dot (we’ll use x’s) above the number on
the axis corresponding to each data value
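The three steps above can be sketched in a few lines of Python. This is only an illustrative text rendering, not the software used for the slides; the function name `text_dotplot` is an assumption, and the data are Babe Ruth's values from the table.

```python
from collections import Counter

# Babe Ruth's single-season home run totals, copied from the table above.
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def text_dotplot(values):
    """Return (rows, low, high): rows of x's, top row first, one column per integer."""
    counts = Counter(values)
    low, high = min(values), max(values)
    rows = []
    # Build from the tallest stack down so the plot prints right side up.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("x" if counts[v] >= level else " " for v in range(low, high + 1))
        rows.append(row.rstrip())
    return rows, low, high

rows, low, high = text_dotplot(ruth)
print("Number of Home Runs in a Single Season")   # title
for row in rows:
    print(row)                                     # one x per data value
print("-" * (high - low + 1))                      # the axis
print(f"{low}{' ' * (high - low - 3)}{high}")      # scale: smallest and largest value
print("Ruth")                                      # axis label
```

Each x sits directly above its value on the axis, so repeated values (such as 46) stack up.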
[Dot plot: “Number of Home Runs in a Single Season”; horizontal axis: Ruth, scaled from 20 to 60]
Describing a Distribution
  We describe a distribution (the values the
  variable takes on and how often it takes
  these values) using the acronym SOCS
        Shape: We describe the shape of a distribution in
        one of two ways:
                 Symmetric/Approx. Symmetric
[Two dot plots on axes running from about -3 to 4: one labeled “Symmetric”, one labeled “Uniform”]
Skewed
[Two dot plots: one labeled “Right Skewed” with its “tail” marked on the right, one labeled “Left Skewed” with its “tail” marked on the left]


   Notice that the direction of the “skew” is the same
   direction as the “tail”
•Outliers: These are observations that we
would consider “unusual”: pieces of data
that don’t “fit” the overall pattern of the data.
  Babe Ruth had two seasons that appear to be somewhat
  different from the rest of his career. These may be
  “outliers”.
[Dot plot: “Number of Home Runs in a Single Season”; horizontal axis: Ruth, scaled from 20 to 65; annotated “Unusual observation???”]
(We’ll learn a numerical way to determine if observations are
  truly “unusual” later)
  The season in which Barry Bonds hit 73 home runs
  does not appear to fit the overall pattern. This piece
  of data may be an outlier.
[Dot plot: “Number of Home Runs in a Single Season”; horizontal axis: Bonds, scaled from 10 to 80; annotated “Unusual observation???”]
Center: A single value that describes the entire
distribution. A “typical” value that gives a concise
summary of the whole batch of numbers.

[Dot plot: “Number of Home Runs in a Single Season”; horizontal axis: Ruth, scaled from 20 to 65]

A typical season for Babe Ruth appears to be
approximately 46 home runs
*We’ll learn about three different numerical measures of center in the next
section
Spread: Since we know that not everyone is typical, we also need to
 talk about the variation of a distribution. We need to discuss whether
 the values of the distribution are tightly clustered around the
 center, making it easy to predict, or whether the values vary a great
 deal from the center, making prediction more difficult.
[Dot plot: “Number of Home Runs in a Single Season”; horizontal axis: Ruth, scaled from 20 to 65]
 Babe Ruth’s number of home runs in a single season varies from a
 low of 22 to a high of 60.

*We’ll learn about three different numerical measures of spread in the next
section.
Distribution Description using
              SOCS
The distribution of Babe Ruth’s number of home
runs in a single season is approximately
symmetric (1) with two possible unusual
observations at 22 and 25 home runs (2). He
typically hits about 46 (3) home runs in a season.
Over his career, the number of home runs has
varied from a low of 22 to a high of 60 (4).

 (1) Shape                     (2) Outliers
 (3) Center                    (4) Spread
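The center and spread statements in a SOCS description can be checked against the table at the start of the section. The sketch below uses the median as one possible “typical” value; that choice is an assumption here, since the slides defer numerical measures of center to the next section.

```python
from statistics import median

# Babe Ruth's single-season home run totals, copied from the table above.
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

center = median(ruth)             # middle value of the sorted data
low, high = min(ruth), max(ruth)  # the two ends of the spread

print(f"Center: about {center} home runs")
print(f"Spread: from a low of {low} to a high of {high}")
```

Shape and outliers still have to be judged from the graph; this only backs up the two numerical parts of the description.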
Stem and Leaf Plot
Creating a stem and leaf plot
 Order the data points from least to greatest
 Separate each observation into a stem (all but the
 rightmost digit) and a leaf (the final digit).
 Ex. 123 -> 12 (stem): 3 (leaf)
 In a T-chart, write the stems vertically in increasing
 order on the left side of the chart.
 On the right side of the chart write each leaf to the
 right of its stem, spacing the leaves equally
 Include a key and title for the graph

   Number of Home Runs in a Single Season
              Hank Aaron
               1 | 3
               2 | 04679
               3 | 0244899
               4 | 00444457
   Key: 4 | 6 = 46
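As a rough illustration of the steps above, the following sketch builds the regular stem and leaf plot for Hank Aaron's data. The function name `stem_and_leaf` is hypothetical, and it assumes two-digit whole numbers so that the stem is the tens digit.

```python
from collections import defaultdict

# Hank Aaron's single-season home run totals, copied from the table above.
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24,
         32, 44, 39, 29, 44, 38, 47, 34, 40, 20]

def stem_and_leaf(values):
    """Return one text line per stem, with leaves in increasing order."""
    stems = defaultdict(list)
    for v in sorted(values):           # step 1: order the data
        stem, leaf = divmod(v, 10)     # step 2: all but the last digit, last digit
        stems[stem].append(str(leaf))
    # steps 3-4: stems in increasing order, leaves written to the right
    return [f"{stem} | {''.join(stems[stem])}"
            for stem in range(min(stems), max(stems) + 1)]

print("Number of Home Runs in a Single Season: Hank Aaron")  # title
for line in stem_and_leaf(aaron):
    print(line)
print("Key: 4 | 6 = 46")                                     # key
```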
Split Stem and Leaf Plot
If the data in a distribution is concentrated in just
a few stems, the picture may be more
descriptive if we “split” the stems
When we “split” stems we want the same
number of digits to be possible in each stem.
This means that each original stem can be split
into 2 or 5 new stems.
A good rule of thumb is to have a minimum of 5
stems overall
Let’s look at how splitting stems changes the
look of the distribution of Hank Aaron’s home
run data.
Split each stem into 2 new stems. This means that the first
stem includes the leaves 0-4 and the second stem has the
leaves 5-9
Splitting the stems helps us to “see” the shape of the
distribution in this case.

   Number of Home Runs in a Single Season
              Hank Aaron
               1 | 3
               1 |
               2 | 04
               2 | 679
               3 | 0244
               3 | 899
               4 | 004444
               4 | 57
   Key: 4 | 6 = 46
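Splitting each stem in two can be sketched as follows. This is a minimal illustration assuming two-digit whole numbers; `split_stem_and_leaf` is a hypothetical name, and the 0-4 / 5-9 rule is the one described above.

```python
# Hank Aaron's single-season home run totals, copied from the table above.
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24,
         32, 44, 39, 29, 44, 38, 47, 34, 40, 20]

def split_stem_and_leaf(values):
    """Return text lines with every stem split in two: leaves 0-4, then leaves 5-9."""
    groups = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1          # 0 = first (0-4) stem, 1 = second (5-9) stem
        groups.setdefault((stem, half), []).append(str(leaf))
    lines = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for half in (0, 1):
            leaves = "".join(groups.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())   # empty stems are still listed
    return lines

for line in split_stem_and_leaf(aaron):
    print(line)
print("Key: 4 | 6 = 46")
```

Empty stems are kept so that gaps in the data remain visible.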
Back-to-Back Stem and Leaf
Back-to-Back stem and leaf plots allow us to quickly
compare two distributions.
Use SOCS to make comparisons between distributions

   Number of Home Runs in a Single Season
        Aaron |   | Ruth
            3 | 1 |
              | 1 |
           40 | 2 | 2
          976 | 2 | 5
         4420 | 3 | 4
          998 | 3 | 5
       444400 | 4 | 11
           75 | 4 | 66679
              | 5 | 44
              | 5 | 9
              | 6 | 0
   Key: 4 | 6 = 46
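A back-to-back plot with split stems can be sketched as below. The function names are hypothetical, and each leaf is assigned strictly by the 0-4 / 5-9 rule, so Ruth's 59 lands on the second 5 stem. Leaves on the left side are written outward from the stem, so they read in reverse.

```python
# Single-season home run totals, copied from the table at the start of the section.
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24,
         32, 44, 39, 29, 44, 38, 47, 34, 40, 20]
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def split_leaves(values):
    """Map (stem, half) -> ascending leaf digits; half 0 holds 0-4, half 1 holds 5-9."""
    groups = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups.setdefault((stem, leaf // 5), []).append(str(leaf))
    return groups

def back_to_back(left_values, right_values):
    """Return (left_leaves, stem, right_leaves) rows sharing one column of stems."""
    left, right = split_leaves(left_values), split_leaves(right_values)
    keys = sorted(set(left) | set(right))
    (stem, half), last = keys[0], keys[-1]
    rows = []
    while (stem, half) <= last:
        left_text = "".join(reversed(left.get((stem, half), [])))  # grows away from the stem
        right_text = "".join(right.get((stem, half), []))
        rows.append((left_text, stem, right_text))
        stem, half = (stem, 1) if half == 0 else (stem + 1, 0)     # next split stem
    return rows

rows = back_to_back(aaron, ruth)
width = max(len(left_text) for left_text, _, _ in rows)
print(f"{'Aaron':>{width}} |   | Ruth")
for left_text, stem, right_text in rows:
    print(f"{left_text:>{width}} | {stem} | {right_text}")
print("Key: 4 | 6 = 46")
```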
Advantages and Disadvantages of
   dotplots/stem and leaf plots

Advantages
     Preserves each piece of data
     Shows features of the distribution with regard to
     shape, such as clusters, gaps, outliers, etc.
Disadvantages
     If creating by hand, large data sets can be
     cumbersome
     Data that is widely varied may be difficult to graph
Histograms
A histogram is one of the most common graphs
used for quantitative variables.
Although a histogram looks like a bar chart
there are some important differences
   In a histogram, the “bars” touch each other
   Histograms do not necessarily preserve individual
    data pieces
   Changing the “scale” or “bin width” can drastically
    alter the picture of the distribution, so caution must
    be used when describing a distribution when only a
    histogram has been used
Creating a histogram
Divide the range of data into classes of equal width. Count
the number of observations in each class. (Remember
that the width is somewhat arbitrary and you might choose
a different width than someone else)
Barry Bonds:
   Data ranges from 16 to 73, so we choose for our classes
       15 ≤ # of HR ≤ 19
               .
               .
               .
       70 ≤ # of HR ≤ 74
   We can then determine the counts for each “bin”
So the frequency distribution looks like:

  Class    Frequency
  15-19           2
  20-24           1
  25-29           2
  30-34           4
  35-39           2
  40-44           2
  45-49           2
  50-54           0
  55-59           0
  60-64           0
  65-69           0
  70-74           1

The horizontal axis represents the variable values, so
using the lower bound of each class to scale is appropriate.
The vertical axis can represent
   Frequency
   Relative frequency
   Cumulative frequency
   Relative cumulative frequency
We’ll use frequency
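The class counts in the frequency distribution can be reproduced with a short sketch. The constants `START` and `WIDTH` encode the class choice made on the slide (classes of width 5 beginning at 15); the function name is hypothetical.

```python
# Barry Bonds's single-season home run totals, copied from the table above.
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73]

START = 15   # lower bound of the first class
WIDTH = 5    # every class covers 5 whole-number values, e.g. 15-19

def frequency_table(values):
    """Return {class lower bound: count}, including classes with no observations."""
    counts = {lower: 0 for lower in range(START, max(values) + 1, WIDTH)}
    for v in values:
        lower = START + ((v - START) // WIDTH) * WIDTH   # class that contains v
        counts[lower] += 1
    return counts

table = frequency_table(bonds)
print("Class    Frequency")
for lower, count in table.items():
    print(f"{lower}-{lower + WIDTH - 1}    {count:>6}")
```

Choosing a different `WIDTH` changes the counts and therefore the picture, which is the caution raised about histograms earlier.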
Label and scale your axes. Title your graph
Draw a bar that represents the frequency for
each class. Remember that the bars of the
histograms should touch each other.
[Histogram: “Barry Bonds”; horizontal axis: HomeRuns, scaled from 10 to 90; vertical axis: Count, scaled from 1 to 7]
Interpretation
 We interpret a histogram in the same way
 we interpret a dotplot or stem and leaf
 plot.
 ALWAYS use SOCS
    Shape                           Outliers
    Center                          Spread
Time Plots
Sometimes, our data is collected at
intervals over time and we are looking for
changes or patterns that have occurred.
We use a time plot for this type of data
A time plot uses both the horizontal and
vertical axes.
   The horizontal axis represents the time
    intervals

    The vertical axis represents the variable
    values
Creating a Time Plot
Label and scale the axes. Title your graph.
Plot a point corresponding to the data taken at each
time interval
A line segment drawn between each point may be helpful
to see patterns in the data

[Time plot (line scatter plot): “Barry Bonds”; horizontal axis: Year, 1986 to 2002; vertical axis: BondsHR, 10 to 80]

   Year     HR          Year     HR
   1986     16          1994     37
   1987     25          1995     33
   1988     24          1996     42
   1989     19          1997     40
   1990     33          1998     37
   1991     25          1999     34
   1992     34          2000     49
   1993     46          2001     73
Describing Time Plots
When describing time plots, you should look for
trends in the data
Although the number of home runs does not show a
constant increase from year to year, we note that
overall, the number of home runs hit by
Barry Bonds has increased over time, with the most
notable increase being between 1999 and 2001.

[Time plot (line scatter plot): “Barry Bonds”; horizontal axis: Year, 1986 to 2002; vertical axis: BondsHR, 10 to 80]
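One way to back up a description of a trend is to look at the year-to-year changes. The sketch below is illustrative only; it uses the Year/HR values from the time plot data and finds the largest single-season jump.

```python
# Barry Bonds's home runs by year, copied from the time plot data.
years = list(range(1986, 2002))
home_runs = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73]

# Change from each year to the next, labeled by the later year.
changes = [(years[i + 1], home_runs[i + 1] - home_runs[i])
           for i in range(len(years) - 1)]

decreases = sum(1 for _, change in changes if change < 0)
biggest_year, biggest_jump = max(changes, key=lambda pair: pair[1])
rise_1999_to_2001 = home_runs[years.index(2001)] - home_runs[years.index(1999)]

print(f"Years with a decrease: {decreases} of {len(changes)}")
print(f"Largest one-year increase: +{biggest_jump} in {biggest_year}")
print(f"Increase from 1999 to 2001: +{rise_1999_to_2001}")
```

The mix of decreases and increases matches the remark that the rise is not constant, while the late jump stands out.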
Relative frequency, Cumulative
frequency, Percentiles, and Ogives
 Sometimes we are interested in describing
 the relative position of an observation
 For example: you have no doubt
 been told at one time or another that you
 scored at the 80th percentile. This means
 that 80% of the people taking the test
 scored the same as or lower than you did.
 How can we model this?
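Before turning to graphs, the “same or lower” idea can be computed directly from raw data. The sketch below applies it to Barry Bonds's seasons rather than test scores; the function name `percentile_rank` is hypothetical.

```python
# Barry Bonds's single-season home run totals, copied from the table above.
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73]

def percentile_rank(values, x):
    """Percent of observations that are the same as or lower than x."""
    return 100 * sum(v <= x for v in values) / len(values)

rank_46 = percentile_rank(bonds, 46)
print(f"A 46 home run season is at or above {rank_46}% of his seasons")
```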
Ogive
 (Relative cumulative frequency graph)
We first start by creating a frequency table
We’ll look at how each column is created in the next few
slides

# of home runs                 Relative     Cumulative   Relative Cumulative
in a season      Frequency     Frequency    Frequency    Frequency
15-19                2          0.125           2          0.125
20-24                1          0.0625          3          0.1875
25-29                2          0.125           5          0.3125
30-34                4          0.25            9          0.5625
35-39                2          0.125          11          0.6875
40-44                2          0.125          13          0.8125
45-49                2          0.125          15          0.9375
50-54                0          0              15          0.9375
55-59                0          0              15          0.9375
60-64                0          0              15          0.9375
65-69                0          0              15          0.9375
70-74                1          0.0625         16          1
Relative Frequency
  The # of home runs… and the frequency are the same
  columns as we created for the histogram.
  To find the values for the “Relative Frequency”
  column find the following:

     Frequency Value / Total # of observations = Relative Frequency

# of home runs                 Relative
in a season      Frequency     Frequency *
15-19                2          0.125
20-24                1          0.0625
25-29                2          0.125
30-34                4          0.25
35-39                2          0.125
40-44                2          0.125
45-49                2          0.125
50-54                0          0
55-59                0          0
60-64                0          0
65-69                0          0
70-74                1          0.0625

  * Within rounding, this column should sum to 1
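The division described above can be sketched in a couple of lines; the frequencies are the class counts from the table.

```python
# Class counts for 15-19, 20-24, ..., 70-74, copied from the frequency table.
frequency = [2, 1, 2, 4, 2, 2, 2, 0, 0, 0, 0, 1]

total = sum(frequency)                          # total # of observations
relative = [f / total for f in frequency]       # frequency value / total

print(relative)
print("Column sum:", sum(relative))
```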
Cumulative Frequency
Cumulative frequency simply adds the counts in the
frequency column that fall in or below the current
class level.
For Example: to find the “13”, add the frequencies in the
oval:
2+1+2+4+2+2=13

# of home runs                 Relative     Cumulative
in a season      Frequency     Frequency    Frequency
15-19                2          0.125           2
20-24                1          0.0625          3
25-29                2          0.125           5
30-34                4          0.25            9
35-39                2          0.125          11
40-44                2          0.125          13
45-49                2          0.125          15
50-54                0          0              15
55-59                0          0              15
60-64                0          0              15
65-69                0          0              15
70-74                1          0.0625         16
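A running total like this is what `itertools.accumulate` from the Python standard library produces. A minimal sketch using the class counts from the table:

```python
from itertools import accumulate

# Class counts for 15-19, 20-24, ..., 70-74, copied from the frequency table.
frequency = [2, 1, 2, 4, 2, 2, 2, 0, 0, 0, 0, 1]

# Each entry is the sum of all counts in or below the current class.
cumulative = list(accumulate(frequency))

print(cumulative)
print("Through the 40-44 class:", cumulative[5])   # 2+1+2+4+2+2
```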
Relative Cumulative Frequency
Relative cumulative frequency divides the cumulative
frequency by the total number of observations
For Example:
   .8125 = 13/16

# of home runs                 Relative     Cumulative   Relative Cumulative
in a season      Frequency     Frequency    Frequency    Frequency
15-19                2          0.125           2          0.125
20-24                1          0.0625          3          0.1875
25-29                2          0.125           5          0.3125
30-34                4          0.25            9          0.5625
35-39                2          0.125          11          0.6875
40-44                2          0.125          13          0.8125
45-49                2          0.125          15          0.9375
50-54                0          0              15          0.9375
55-59                0          0              15          0.9375
60-64                0          0              15          0.9375
65-69                0          0              15          0.9375
70-74                1          0.0625         16          1
Sum                 16          1
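Putting the last two columns together, a short sketch of the calculation (again using only the class counts from the table):

```python
from itertools import accumulate

# Class counts for 15-19, 20-24, ..., 70-74, copied from the frequency table.
frequency = [2, 1, 2, 4, 2, 2, 2, 0, 0, 0, 0, 1]

total = sum(frequency)
cumulative = list(accumulate(frequency))
relative_cumulative = [c / total for c in cumulative]   # e.g. 13 / 16 = 0.8125

print(relative_cumulative)
```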
Creating the Ogive
Label and scale the axes
  Horizontal: Variable
  Vertical: Relative Cumulative Frequency (percentile)
Plot a point corresponding to the relative
cumulative frequency in each class interval at
the left endpoint of the next class interval
The last point you should plot should be at a
height of 100%
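The plotting rule above can be sketched as a list of (x, y) points. The variable names are hypothetical; each class's relative cumulative frequency is paired with the left endpoint of the next class, which is the class's own lower bound plus the class width of 5.

```python
# Lower bounds of the classes 15-19, 20-24, ..., 70-74 and their
# relative cumulative frequencies, copied from the table.
lower_bounds = [15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
relative_cumulative = [0.125, 0.1875, 0.3125, 0.5625, 0.6875, 0.8125,
                       0.9375, 0.9375, 0.9375, 0.9375, 0.9375, 1.0]
WIDTH = 5

# Plot each value at the left endpoint of the NEXT class interval.
points = [(lower + WIDTH, height)
          for lower, height in zip(lower_bounds, relative_cumulative)]

for x, y in points:
    print(f"HR = {x:>2}   relative cumulative frequency = {y}")
```

The final point sits at a height of 1.0, that is, 100%.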
# of home runs    Relative Cumulative
in a season       Frequency
15-19               0.125
20-24               0.1875
25-29               0.3125
30-34               0.5625
35-39               0.6875
40-44               0.8125
45-49               0.9375
50-54               0.9375
55-59               0.9375
60-64               0.9375
65-69               0.9375
70-74               1

[Ogive (scatter plot): “Barry Bonds”; horizontal axis: HR, scaled from 10 to 80; vertical axis: Relcumfreq, scaled from 0.0 to 1.2]
A line segment from point to point can be added for
analysis
Types of Info from Ogives
Finding an individual observation within the
distribution
Find the relative standing of a season in which
Barry Bonds hit 40 home runs
[Ogive (scatter plot): “Barry Bonds”; horizontal axis: HR, scaled from 10 to 80; vertical axis: Relcumfreq, scaled from 0.0 to 1.2]
Reading the ogive at 40 home runs gives a height of about 0.69, so a season
with 40 home runs lies at roughly the 69th percentile: approximately 69% of
his seasons had fewer than 40 home runs
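Reading a relative standing off the ogive amounts to linear interpolation along the line segments. The sketch below assumes the points are plotted at the left endpoint of the next class, as described in “Creating the Ogive”; the function name is hypothetical. A value read by eye from the graph is approximate, so it may differ somewhat from the computed height.

```python
# Ogive points (HR, relative cumulative frequency) under the plotting rule above.
points = [(20, 0.125), (25, 0.1875), (30, 0.3125), (35, 0.5625),
          (40, 0.6875), (45, 0.8125), (50, 0.9375), (55, 0.9375),
          (60, 0.9375), (65, 0.9375), (70, 0.9375), (75, 1.0)]

def relative_standing(points, x):
    """Height of the ogive at x, interpolating along the segment that contains x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the plotted range")

standing_40 = relative_standing(points, 40)
print(f"Ogive height at 40 home runs: {standing_40}")
```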
Locating an observation corresponding to a
percentile.
How many home runs must be hit in a season
to correspond to the 75th percentile?
[Ogive (scatter plot): “Barry Bonds”; horizontal axis: HR, scaled from 10 to 80; vertical axis: Relcumfreq, scaled from 0.0 to 1.2]

  To be better than 75% of Mr. Bonds’s seasons, approximately 42
  home runs must be hit.
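Going from a percentile back to a value is the same interpolation run in the other direction. This is a sketch under the same plotting assumption (points at the left endpoint of the next class); the function name is hypothetical.

```python
# Ogive points (HR, relative cumulative frequency) under the plotting rule above.
points = [(20, 0.125), (25, 0.1875), (30, 0.3125), (35, 0.5625),
          (40, 0.6875), (45, 0.8125), (50, 0.9375), (55, 0.9375),
          (60, 0.9375), (65, 0.9375), (70, 0.9375), (75, 1.0)]

def value_at_percentile(points, p):
    """Smallest x at which the ogive reaches height p (p between 0 and 1)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= p <= y1 and y1 > y0:          # skip flat segments
            return x0 + (x1 - x0) * (p - y0) / (y1 - y0)
    raise ValueError("p is outside the plotted range")

hr_75 = value_at_percentile(points, 0.75)
print(f"75th percentile: about {hr_75} home runs")
```

The computed value falls between 40 and 45, in line with the approximate reading from the graph.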
A little History on the word Ogive
(sometimes called an Ogee)
It was first used by Sir Francis
Galton, who borrowed a term from
architecture to describe the
cumulative normal curve (more
about that next chapter).
The ogive in architecture was a
common decorative element in
many of the English Churches
around 1400. The picture at right
shows the door to the Church of
The Holy Cross at the village of
Caston in Norfolk. In this image you
can see the use of the ogive in the
design of the door and repeated in
the windows above.
Find more about this term at
Mathwords.

Full Stack Web Development Course for Beginners
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 

Displaying quantitative data 1.1

  • 1. Displaying Quantitative Data with Graphs Section 1.1
  • 2. What you’ll learn: To create and interpret the following graphs:
      - Dotplot
      - Stem and leaf (Regular Stem and Leaf, Split Stem and Leaf, Back-to-Back Stem and Leaf)
      - Histogram
      - Time Plot
      - Ogive
  • 3. To learn how to display and describe quantitative data we will be using some baseball statistics. The following table shows the number of home runs in a single season for three well-known baseball players: Hank Aaron, Barry Bonds, and Babe Ruth.
      Hank Aaron:  13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44, 39, 29, 44, 38, 47, 34, 40, 20
      Barry Bonds: 16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73
      Babe Ruth:   54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22
  • 4. Dotplot: Label the horizontal axis with the name of the variable and title the graph. Scale the axis based on the values of the variable. Mark a dot (we’ll use x’s) above the number on the axis corresponding to each data value. [Dot plot: Number of Home Runs in a Single Season; horizontal axis Ruth, 20 to 60]
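The three steps above can be sketched in a few lines of Python. This is only an illustrative text rendering, not the software used for the slides: the function name `text_dotplot` is hypothetical, and the data are Babe Ruth's totals copied from the table on slide 3.

```python
from collections import Counter

# Babe Ruth's single-season home run totals (from the table on slide 3).
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def text_dotplot(values):
    """Return rows of x's, top row first; column i stands for the value min(values) + i."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    rows = []
    # Build from the tallest stack down so the rows print in the right order.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("x" if counts[v] >= level else " " for v in range(lo, hi + 1))
        rows.append(row.rstrip())
    return rows

rows = text_dotplot(ruth)
for row in rows:
    print(row)
print(f"axis runs from {min(ruth)} to {max(ruth)} home runs (Ruth)")
```

Every value between the minimum and maximum gets its own column, so gaps in the data stay visible as blank columns.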
  • 5. Describing a Distribution: We describe a distribution (the values the variable takes on and how often it takes these values) using the acronym SOCS. Shape: We describe the shape of a distribution in one of two ways: Symmetric/Approx. Symmetric [Dot plots: “Symmetric” and “Uniform”]
  • 6. Skewed: Right or Left [Dot plots, each with its “tail” marked: “Right Skewed” and “Left Skewed”] Notice that the direction of the “skew” is the same direction as the “tail”.
  • 7. Outliers: These are observations that we would consider “unusual”: pieces of data that don’t “fit” the overall pattern of the data. Babe Ruth had two seasons that appear to be somewhat different than the rest of his career. These may be “outliers”. (We’ll learn a numerical way to determine if observations are truly “unusual” later.) [Dot plot: Number of Home Runs in a Single Season, Ruth, with the two low seasons marked “Unusual observation???”] The season in which Barry Bonds hit 73 home runs does not appear to fit the overall pattern. This piece of data may be an outlier. [Dot plot: Number of Home Runs in a Single Season, Bonds, with the 73 marked “Unusual observation???”]
  • 8. Center: A single value that describes the entire distribution. A “typical” value that gives a concise summary of the whole batch of numbers. [Dot plot: Number of Home Runs in a Single Season, Ruth] A typical season for Babe Ruth appears to be approximately 46 home runs. *We’ll learn about three different numerical measures of center in the next section.
  • 9. Spread: Since we know that not everyone is typical, we need to also talk about the variation of a distribution. Are the values of the distribution tightly clustered around the center, making it easy to predict, or do the values vary a great deal from the center, making prediction more difficult? [Dot plot: Number of Home Runs in a Single Season, Ruth] Babe Ruth’s number of home runs in a single season varies from a low of 23 to a high of 60. *We’ll learn about three different numerical measures of spread in the next section.
  • 10. Distribution Description using SOCS: The distribution of Babe Ruth’s number of home runs in a single season is approximately symmetric [1] with two possible unusual observations at 23 and 25 home runs [2]. He typically hits about 46 [3] home runs in a season. Over his career, the number of home runs has varied from a low of 23 to a high of 60 [4]. Key: 1 = Shape, 2 = Outliers, 3 = Center, 4 = Spread
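As a rough numerical check on the center and spread quoted above, the sketch below summarizes the Ruth data from slide 3. The median is used as the measure of center purely as an assumption, since the slides defer the formal measures to the next section, and `summarize` is a hypothetical helper name. Note that the table on slide 3 lists 22 as Ruth's lowest season, while the description reads 23 off the dot plot.

```python
from statistics import median

# Babe Ruth's single-season home run totals (from the table on slide 3).
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def summarize(values):
    """Return a simple center/spread summary: median, lowest and highest value."""
    return {"center": median(values), "low": min(values), "high": max(values)}

summary = summarize(ruth)
print(summary)
```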
  • 11. Stem and Leaf Plot: Creating a stem and leaf plot
      - Order the data points from least to greatest.
      - Separate each observation into a stem (all but the rightmost digit) and a leaf (the final digit). Ex. 123 -> 12 (stem) : 3 (leaf)
      - In a T-chart, write the stems vertically in increasing order on the left side of the chart.
      - On the right side of the chart write each leaf to the right of its stem, spacing the leaves equally.
      - Include a key and title for the graph.
      Number of Home Runs in a Single Season: Hank Aaron
      1 | 3
      2 | 04679
      3 | 0244899
      4 | 00444457
      Key: 4 | 6 = 46
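The procedure above can be sketched as follows. This is a minimal illustration that assumes non-negative whole numbers; `stem_and_leaf` is a hypothetical name, and the data are Hank Aaron's totals from slide 3.

```python
from collections import defaultdict

# Hank Aaron's single-season home run totals (from the table on slide 3).
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24,
         32, 44, 39, 29, 44, 38, 47, 34, 40, 20]

def stem_and_leaf(values):
    """Map each stem (all but the last digit) to its leaves (last digits) in increasing order."""
    leaves = defaultdict(list)
    for v in sorted(values):          # step 1: order the data
        stem, leaf = divmod(v, 10)    # step 2: split into stem and leaf
        leaves[stem].append(str(leaf))
    # Keep empty stems between the smallest and largest so gaps remain visible.
    return {s: "".join(leaves.get(s, [])) for s in range(min(leaves), max(leaves) + 1)}

plot = stem_and_leaf(aaron)
for stem, leaf_string in plot.items():
    print(f"{stem} | {leaf_string}")
print("Key: 4 | 6 = 46")
```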
  • 12. Split Stem and Leaf Plot: If the data in a distribution are concentrated in just a few stems, the picture may be more descriptive if we “split” the stems. When we “split” stems we want the same number of digits to be possible in each stem. This means that each original stem can be split into 2 or 5 new stems. A good rule of thumb is to have a minimum of 5 stems overall. Let’s look at how splitting stems changes the look of the distribution of Hank Aaron’s home run data.
  • 13. Split each stem into 2 new stems. This means that the first stem includes the leaves 0-4 and the second stem has the leaves 5-9. Splitting the stems helps us to “see” the shape of the distribution in this case.
      Number of Home Runs in a Single Season: Hank Aaron
      1 | 3
      1 |
      2 | 04
      2 | 679
      3 | 0244
      3 | 899
      4 | 004444
      4 | 57
      Key: 4 | 6 = 46
  • 14. Back-to-Back Stem and Leaf: Back-to-Back stem and leaf plots allow us to quickly compare two distributions. Use SOCS to make comparisons between distributions.
      Number of Home Runs in a Single Season
       Aaron |   | Ruth
           3 | 1 |
             | 1 |
          40 | 2 | 2
         976 | 2 | 5
        4420 | 3 | 4
         998 | 3 | 5
      444400 | 4 | 11
          75 | 4 | 66679
             | 5 | 44
             | 5 | 9
             | 6 | 0
      Key: 4 | 6 = 46
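A back-to-back plot with each stem split in two (leaves 0-4, then 5-9) can be sketched as below. The function names are hypothetical and the code assumes non-negative whole numbers; the left-hand leaves are reversed so that they read outward from the shared stem column.

```python
# Single-season home run totals (from the table on slide 3).
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24,
         32, 44, 39, 29, 44, 38, 47, 34, 40, 20]
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def leaves_for(values, stem, low, high):
    """Sorted leaves of the values on this stem whose last digit lies in low..high."""
    return "".join(str(v % 10) for v in sorted(values)
                   if v // 10 == stem and low <= v % 10 <= high)

def back_to_back(left, right):
    """Return (left leaves, stem, right leaves) rows, with every stem split in two."""
    first = min(min(left), min(right)) // 10
    last = max(max(left), max(right)) // 10
    rows = []
    for stem in range(first, last + 1):
        for low, high in ((0, 4), (5, 9)):
            left_leaves = leaves_for(left, stem, low, high)[::-1]  # read outward from the stem
            rows.append((left_leaves, stem, leaves_for(right, stem, low, high)))
    return rows

rows = back_to_back(aaron, ruth)
for left_leaves, stem, right_leaves in rows:
    print(f"{left_leaves:>6} | {stem} | {right_leaves}")
```

This sketch always emits both halves of the last stem, so it prints one extra, empty row for stem 6.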
  • 15. Advantages and Disadvantages of dotplots/stem and leaf plots
      Advantages:
      - Preserves each piece of data
      - Shows features of the distribution with regards to shape, such as clusters, gaps, outliers, etc.
      Disadvantages:
      - If creating by hand, large data sets can be cumbersome
      - Data that is widely varied may be difficult to graph
  • 16. Histograms: A histogram is one of the most common graphs used for quantitative variables. Although a histogram looks like a bar chart, there are some important differences:
      - In a histogram, the “bars” touch each other.
      - Histograms do not necessarily preserve individual data pieces.
      - Changing the “scale” or “bin width” can drastically alter the picture of the distribution, so caution must be used when describing a distribution when only a histogram has been used.
  • 17. Creating a histogram: Divide the range of data into classes of equal width. Count the number of observations in each class. (Remember that the width is somewhat arbitrary and you might choose a different width than someone else.) Barry Bonds: Data ranges from 16 to 73, so we choose for our classes 15 ≤ # of HR ≤ 19, ..., 70 ≤ # of HR ≤ 74. We can then determine the counts for each “bin”.
  • 18. So the frequency distribution looks like:
      Class   Frequency
      15-19   2
      20-24   1
      25-29   2
      30-34   4
      35-39   2
      40-44   2
      45-49   2
      50-54   0
      55-59   0
      60-64   0
      65-69   0
      70-74   1
      The horizontal axis represents the variable values, so using the lower bound of each class to scale is appropriate. The vertical axis can represent Frequency, Relative frequency, Cumulative frequency, or Relative cumulative frequency. We’ll use frequency.
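The class counts in this table can be reproduced with a short sketch. The helper `frequency_table` and its default arguments (start at 15, width 5, stop before 75) are illustrative choices that mirror the classes above; the data are Barry Bonds's totals from slide 3.

```python
# Barry Bonds's single-season home run totals (from the table on slide 3).
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73]

def frequency_table(values, start=15, width=5, stop=75):
    """Count the observations falling in each equal-width class, e.g. 15-19, 20-24, ..."""
    table = {}
    for lower in range(start, stop, width):
        upper = lower + width - 1     # classes hold whole numbers, so 15-19 means 15..19
        table[f"{lower}-{upper}"] = sum(lower <= v <= upper for v in values)
    return table

table = frequency_table(bonds)
for label, count in table.items():
    print(f"{label}  {count}")
```

Calling it with a different `width` shows how much the picture of the distribution depends on the bin width.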
  • 19. Label and scale your axes. Title your graph. Draw a bar that represents the frequency for each class. Remember that the bars of the histogram should touch each other. [Histogram: Barry Bonds; horizontal axis HomeRuns, 10 to 90; vertical axis Count, 1 to 7]
  • 20. Interpretation: We interpret a histogram in the same way we interpret a dotplot or stem and leaf plot. ALWAYS use SOCS: Shape, Outliers, Center, Spread
  • 21. Time Plots: Sometimes, our data is collected at intervals over time and we are looking for changes or patterns that have occurred. We use a time plot for this type of data. A time plot uses both the horizontal and vertical axes.
      - The horizontal axis represents the time intervals
      - The vertical axis represents the variable values
  • 22. Creating a Time Plot: Label and scale the axes. Title your graph. Plot a point corresponding to the data taken at each time interval. A line segment drawn between each point may be helpful to see patterns in the data. [Line scatter plot: Barry Bonds; horizontal axis Year, 1986 to 2002; vertical axis BondsHR, 10 to 80]
      Year  HR    Year  HR
      1986  16    1994  37
      1987  25    1995  33
      1988  24    1996  42
      1989  19    1997  40
      1990  33    1998  37
      1991  25    1999  34
      1992  34    2000  49
      1993  46    2001  73
  • 23. Describing Time Plots: When describing time plots, you should look for trends in the data. Although the number of home runs does not show a constant increase from year to year, we note that overall, the number of home runs made by Barry Bonds has increased over time, with the most notable increase being between 1999 and 2001. [Line scatter plot: Barry Bonds; horizontal axis Year, 1986 to 2002; vertical axis BondsHR, 10 to 80]
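One simple way to look for the changes described above is to compute the year-to-year differences. The sketch below uses the Year/HR values from slide 22; `yearly_changes` is a hypothetical helper, and a plotting library would be needed to draw the time plot itself.

```python
# Barry Bonds's home runs by season, 1986-2001 (from the table on slide 22).
years = list(range(1986, 2002))
home_runs = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73]

def yearly_changes(years, values):
    """Return (from_year, to_year, change) for each pair of consecutive seasons."""
    return [(years[i], years[i + 1], values[i + 1] - values[i])
            for i in range(len(values) - 1)]

changes = yearly_changes(years, home_runs)
largest = max(changes, key=lambda change: change[2])
for start, end, diff in changes:
    print(f"{start}->{end}: {diff:+d}")
print("largest single-year increase:", largest)
```

The changes are a mix of increases and decreases, which matches the description of an overall upward trend that is not constant from year to year.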
  • 24. Relative frequency, Cumulative frequency, Percentiles, and Ogives: Sometimes we are interested in describing the relative position of an observation. For example: you have no doubt been told at one time or another that you scored at the 80th percentile. This means that 80% of the people taking the test scored the same or lower than you did. How can we model this?
  • 25. Ogive (Relative cumulative frequency graph): We first start by creating a frequency table. We’ll look at how each column is created in the next few slides.
      # of home runs               Relative    Cumulative   Relative Cumulative
      in a season     Frequency    Frequency   Frequency    Frequency
      15-19           2            0.125       2            0.125
      20-24           1            0.0625      3            0.1875
      25-29           2            0.125       5            0.3125
      30-34           4            0.25        9            0.5625
      35-39           2            0.125       11           0.6875
      40-44           2            0.125       13           0.8125
      45-49           2            0.125       15           0.9375
      50-54           0            0           15           0.9375
      55-59           0            0           15           0.9375
      60-64           0            0           15           0.9375
      65-69           0            0           15           0.9375
      70-74           1            0.0625      16           1
  • 26. Relative Frequency: The “# of home runs…” and the “Frequency” columns are the same columns as we created for the histogram. To find the values for the “Relative Frequency” column, find the following: Relative Frequency = Frequency Value / Total # of observations. * Within rounding, the Relative Frequency column should sum to 1.
      # of home runs               Relative
      in a season     Frequency    Frequency *
      15-19           2            0.125
      20-24           1            0.0625
      25-29           2            0.125
      30-34           4            0.25
      35-39           2            0.125
      40-44           2            0.125
      45-49           2            0.125
      50-54           0            0
      55-59           0            0
      60-64           0            0
      65-69           0            0
      70-74           1            0.0625
  • 27. Cumulative Frequency: Cumulative frequency simply adds the counts in the frequency column that fall in or below the current class level. For Example: to find the “13”, add the frequencies in the oval: 2+1+2+4+2+2=13
      # of home runs               Relative    Cumulative
      in a season     Frequency    Frequency   Frequency
      15-19           2            0.125       2
      20-24           1            0.0625      3
      25-29           2            0.125       5
      30-34           4            0.25        9
      35-39           2            0.125       11
      40-44           2            0.125       13
      45-49           2            0.125       15
      50-54           0            0           15
      55-59           0            0           15
      60-64           0            0           15
      65-69           0            0           15
      70-74           1            0.0625      16
  • 28. Relative Cumulative Frequency: Relative cumulative frequency divides the cumulative frequency by the total number of observations. For Example: .8125 = 13/16
      # of home runs               Relative    Cumulative   Relative Cumulative
      in a season     Frequency    Frequency   Frequency    Frequency
      15-19           2            0.125       2            0.125
      20-24           1            0.0625      3            0.1875
      25-29           2            0.125       5            0.3125
      30-34           4            0.25        9            0.5625
      35-39           2            0.125       11           0.6875
      40-44           2            0.125       13           0.8125
      45-49           2            0.125       15           0.9375
      50-54           0            0           15           0.9375
      55-59           0            0           15           0.9375
      60-64           0            0           15           0.9375
      65-69           0            0           15           0.9375
      70-74           1            0.0625      16           1
      Sum             16           1
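The three derived columns can be computed directly from the frequency column, as in this sketch. The variable names are illustrative; the class labels and frequencies are copied from the table above.

```python
from itertools import accumulate

classes = ["15-19", "20-24", "25-29", "30-34", "35-39", "40-44",
           "45-49", "50-54", "55-59", "60-64", "65-69", "70-74"]
frequency = [2, 1, 2, 4, 2, 2, 2, 0, 0, 0, 0, 1]

total = sum(frequency)                                   # total number of observations
relative = [f / total for f in frequency]                # frequency / total
cumulative = list(accumulate(frequency))                 # running total of the counts
relative_cumulative = [c / total for c in cumulative]    # cumulative / total

for row in zip(classes, frequency, relative, cumulative, relative_cumulative):
    print(*row, sep="\t")
```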
  • 29. Creating the Ogive: Label and scale the axes.
      - Horizontal: Variable
      - Vertical: Relative Cumulative Frequency (percentile)
      Plot a point corresponding to the relative cumulative frequency in each class interval at the left endpoint of the next class interval. The last point you plot should be at a height of 100%.
  • 30. [Scatter plot: Barry Bonds; horizontal axis HR, 10 to 80; vertical axis Relcumfreq, 0.0 to 1.2] A line segment from point to point can be added for analysis.
      # of home runs   Relative Cumulative
      in a season      Frequency
      15-19            0.125
      20-24            0.1875
      25-29            0.3125
      30-34            0.5625
      35-39            0.6875
      40-44            0.8125
      45-49            0.9375
      50-54            0.9375
      55-59            0.9375
      60-64            0.9375
      65-69            0.9375
      70-74            1
  • 31. Types of Info from Ogives: Finding an individual observation within the distribution. Find the relative standing of a season in which Barry Bonds hit 40 home runs. [Scatter plot: Barry Bonds; horizontal axis HR, 10 to 80; vertical axis Relcumfreq, 0.0 to 1.2] A season with 40 home runs lies at the 60th percentile, meaning that approximately 60% of his seasons had 40 or fewer home runs.
  • 32. Locating an observation corresponding to a percentile: How many home runs must be hit in a season to correspond to the 75th percentile? [Scatter plot: Barry Bonds; horizontal axis HR, 10 to 80; vertical axis Relcumfreq, 0.0 to 1.2] To be better than 75% of Mr. Bonds’s seasons, approximately 42 home runs must be hit.
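Both kinds of ogive reading can be approximated with linear interpolation between the plotted points. The sketch below is an illustration under stated assumptions: the points follow the rule from slide 29 (each relative cumulative frequency plotted at the left endpoint of the next class), the starting point (15, 0.0) is added here so the curve begins at 0%, and the function names are hypothetical. Exact interpolation need not match a value read by eye from the graph: it gives 42.5 home runs for the 75th percentile, close to the "approximately 42" above, but about 0.69 for a 40-home-run season, somewhat higher than the approximate 60% read from the plot.

```python
# Ogive points: (home runs, relative cumulative frequency).
# The first point (15, 0.0) is an assumption so that the curve starts at 0%.
ogive = [(15, 0.0), (20, 0.125), (25, 0.1875), (30, 0.3125), (35, 0.5625),
         (40, 0.6875), (45, 0.8125), (50, 0.9375), (55, 0.9375), (60, 0.9375),
         (65, 0.9375), (70, 0.9375), (75, 1.0)]

def percentile_of(x, points=ogive):
    """Relative cumulative frequency at x, interpolating linearly between points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the ogive")

def value_at(p, points=ogive):
    """Smallest x at which the interpolated ogive reaches the proportion p."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 > y0 and y0 <= p <= y1:   # skip flat segments to avoid dividing by zero
            return x0 + (x1 - x0) * (p - y0) / (y1 - y0)
    raise ValueError("p must be between 0 and 1")

print(percentile_of(40))   # relative standing of a 40-home-run season
print(value_at(0.75))      # home runs at the 75th percentile
```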
  • 33. A little History on the word Ogive (sometimes called an Ogee) It was first used by Sir Francis Galton, who borrowed a term from architecture to describe the cumulative normal curve (more about that next chapter). The ogive in architecture was a common decorative element in many of the English Churches around 1400. The picture at right shows the door to the Church of The Holy Cross at the village of Caston in Norfolk. In this image you can see the use of the ogive in the design of the door and repeated in the windows above. Find more about this term at Mathwords.