Why Study Statistics?
Dealing with Uncertainty

 Everyday decisions are based on
     incomplete information
Dealing with Uncertainty

The price of L&T stock will be higher
    in six months than it is now.

             versus

  The price of L&T stock is likely
 to be higher in six months than it
              is now.
Dealing with Uncertainty
 If the union budget deficit is as high as
predicted, interest rates will remain high
         for the rest of the year.
                versus
If the union budget deficit is as high
    as predicted, it is probable that
  interest rates will remain high for
          the rest of the year.
Statistical Thinking
Statistical thinking is a philosophy of learning
  and action based on the following fundamental
  principles:
 All work occurs in a system of interconnected
  processes;
 Variation exists in all processes, and
 Understanding and reducing variation are the
  keys to success.
Statistical Thinking


        Systems and Processes
A system is a number of components that
 are logically and sometimes physically
    linked together for some purpose.
Statistical Thinking

             Systems and Processes
  A process is a set of activities operating on a system
that transforms inputs to outputs. A business process is
  a group of logically related tasks and activities that,
 when performed, uses the resources of the business
  to provide definitive results required to achieve the
                   business objectives.
Making Decisions

Data, Information, Knowledge
• Data: specific observations of measured numbers.
• Information: processed and summarized data
  yielding facts and ideas.
• Knowledge: selected and organized information
  that provides understanding, recommendations, and
  the basis for decisions.
Making Decisions

    Descriptive and Inferential Statistics
Descriptive Statistics include graphical and
  numerical procedures that summarize and
 process data and are used to transform data
               into information.
Making Decisions

   Descriptive and Inferential Statistics
Inferential Statistics provide the bases for
predictions, forecasts, and estimates that are
used to transform information to knowledge.
The Journey to Making Decisions

Begin here: Identify the Problem
   ↓
Data
   ↓  (Descriptive Statistics, Probability, Computers)
Information
   ↓  (Experience, Theory, Literature, Inferential Statistics, Computers)
Knowledge
   ↓
Decision
Describing Data




Summarizing and Describing
          Data

    Tables and Graphs
    Numerical Measures
Classification of Variables


   Discrete numerical variable
   Continuous numerical variable
   Categorical variable
Classification of Variables

    Discrete Numerical Variable
A variable that produces a response that
   comes from a counting process.
Classification of Variables


    Continuous Numerical Variable
A variable that produces a response that is
 the outcome of a measurement process.
Classification of Variables

       Categorical Variables
Variables that produce responses that
belong to groups (sometimes called
      “classes”) or categories.
Measurement Levels

Nominal and Ordinal Levels of Measurement
  refer to data obtained from categorical
  questions.
• A nominal scale indicates assignments to
  groups or classes.
• Ordinal data indicate rank ordering of items.
Frequency Distributions

A frequency distribution is a table used to organize data.
   The left column (called classes or groups) includes
  numerical intervals on a variable being studied. The
 right column is a list of the frequencies, or number of
 observations, for each class. Intervals are normally of
     equal size, must cover the range of the sample
         observations, and be non-overlapping.
Construction of a Frequency
                 Distribution

   Rule 1: Intervals (classes) must be inclusive and non-overlapping;
   Rule 2: Determine k, the number of classes;
   Rule 3: Intervals should be the same width, w; the width
    is determined by the following:

\[ w = \text{Interval Width} = \frac{\text{Largest Number} - \text{Smallest Number}}{\text{Number of Intervals}} \]

Both k and w should be rounded upward, possibly to the next larger integer.
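Rule 3 can be sketched in a few lines of Python. This is only an illustration: the function name `class_width` and the sample values (largest 249, smallest 221, k = 6) are assumptions, not figures from the slides, and rounding up to a whole number is one reasonable reading of "rounded upward."

```python
import math

def class_width(largest, smallest, k):
    """Interval width w = (largest - smallest) / k, rounded up to an integer."""
    return math.ceil((largest - smallest) / k)

# Hypothetical sample: observations between 221 and 249, split into k = 6 classes.
w = class_width(largest=249, smallest=221, k=6)
print(w)  # 28 / 6 is about 4.67, which rounds up to 5
```

Rounding the width up rather than down helps ensure that k intervals of width w cover the whole range of the sample.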
Construction of a Frequency
             Distribution

Quick Guide to Number of Classes for a Frequency Distribution

  Sample Size                     Number of Classes
  Fewer than 50                   5 – 6 classes
  50 to 100                       6 – 8 classes
  over 100                        8 – 10 classes
Example of a Frequency Distribution


   A Frequency Distribution for the Suntan Lotion Example

  Weights (in mL)        Number of Bottles
  220 less than 225             1
  225 less than 230             4
  230 less than 235            29
  235 less than 240            34
  240 less than 245            26
  245 less than 250             6
Cumulative Frequency
              Distributions

  A cumulative frequency distribution contains the
number of observations whose values are less than the
   upper limit of each interval. It is constructed by
 adding the frequencies of all frequency distribution
  intervals up to and including the present interval.
Relative Cumulative Frequency
         Distributions

A relative cumulative frequency distribution
   converts all cumulative frequencies to
           cumulative percentages
Example of a Frequency Distribution


   A Cumulative Frequency Distribution for the Suntan Lotion Example

         Weights (in mL)           Number of Bottles
         less than 225                     1
         less than 230                     5
         less than 235                    34
         less than 240                    68
         less than 245                    94
         less than 250                    100
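The cumulative counts above can be reproduced from the class frequencies of the suntan lotion example. The sketch below uses the frequencies from the earlier table; the variable names are illustrative only.

```python
from itertools import accumulate

# Upper class limits (mL) and class frequencies from the suntan lotion example.
upper_limits = [225, 230, 235, 240, 245, 250]
frequencies = [1, 4, 29, 34, 26, 6]

# Running total of frequencies up to and including each interval.
cumulative = list(accumulate(frequencies))

# Convert each cumulative frequency to a cumulative percentage.
total = cumulative[-1]
relative_cumulative = [100 * c / total for c in cumulative]

for limit, c, pct in zip(upper_limits, cumulative, relative_cumulative):
    print(f"less than {limit}: {c:3d} bottles ({pct:.0f}%)")
```

Because the sample contains exactly 100 bottles, the cumulative counts and the cumulative percentages happen to be the same numbers here.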
Histograms and Ogives

A histogram is a bar graph that consists of vertical bars
constructed on a horizontal line that is marked off with
    intervals for the variable being displayed. The
      intervals correspond to those in a frequency
      distribution table. The height of each bar is
  proportional to the number of observations in that
                         interval.
Histograms and Ogives

An ogive, sometimes called a cumulative line graph, is
  a line that connects points that are the cumulative
 percentage of observations below the upper limit of
  each class in a cumulative frequency distribution.
Histogram and Ogive for Example 1


[Figure: "Histogram of Weights", a histogram with an ogive. Horizontal axis: Interval Weights (mL), marked at 224.5, 229.5, 234.5, 239.5, 244.5 and 249.5. Left vertical axis: Frequency, 0 to 40. Right vertical axis: 0 to 100.]
Stem-and-Leaf Display


 A stem-and-leaf display is an exploratory data analysis
  graph that is an alternative to the histogram. Data are
grouped according to their leading digits (called the stem)
while listing the final digits (called leaves) separately for
    each member of a class. The leaves are displayed
 individually in ascending order after each of the stems.
Stem-and-Leaf Display


         Stem-and-Leaf Display

         Stem unit: 10

     9            1   124678899
   (9)            2   122246899
     5            3   01234
     2            4   02
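A minimal sketch of how such a display can be built. The 25 data values are read back from the stems and leaves shown above; the function name `stem_and_leaf` is an assumption, and the sketch only handles whole-number data with a stem unit of 10. The leftmost column of the slide's display appears to be a cumulative depth count, which this sketch does not reproduce.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_unit=10):
    """Group values by leading digits (stem) and list final digits (leaves)."""
    rows = defaultdict(list)
    for v in sorted(values):          # sorting keeps leaves in ascending order
        stem, leaf = divmod(v, stem_unit)
        rows[stem].append(leaf)
    return dict(rows)

# Values recovered from the display above (stem unit: 10).
data = [11, 12, 14, 16, 17, 18, 18, 19, 19,
        21, 22, 22, 22, 24, 26, 28, 29, 29,
        30, 31, 32, 33, 34, 40, 42]

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
```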
Tables
     - Bar and Pie Charts -
Frequency and Relative Frequency Distribution for
       Top Company Employers Example

   Industry          Number of Employees   Proportion
   Tourism                 85,287             0.35
   Retail                  49,424             0.20
   Health Care             39,588             0.16
   Restaurants             16,050             0.06
   Communications          11,750             0.05
   Technology              11,144             0.05
   Space                   11,418             0.05
   Other                   21,336             0.08
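The relative frequencies are each industry's count divided by the total. A short sketch using the employee counts from the table (variable names are illustrative). Note that the values computed here are rounded to two decimals, so a few may differ in the last digit from the figures printed on the slide.

```python
# Employee counts from the Top Company Employers table.
employees = {
    "Tourism": 85287,
    "Retail": 49424,
    "Health Care": 39588,
    "Restaurants": 16050,
    "Communications": 11750,
    "Technology": 11144,
    "Space": 11418,
    "Other": 21336,
}

total = sum(employees.values())

# Relative frequency = class frequency / total number of observations.
proportions = {industry: count / total for industry, count in employees.items()}

for industry, p in proportions.items():
    print(f"{industry:<16}{employees[industry]:>8,}{p:>8.2f}")
```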
Tables
               - Bar and Pie Charts -

       Bar Chart for Top Company Employers Example

           1999 Top Company Employers in Central Florida
[Figure: bar chart. Horizontal axis: Industry Category (Tourism, Retail, Health Care, Restaurants, Communications, Technology, Space, Other). Bar labels: 0.35, 0.2, 0.16, 0.06, 0.05, 0.05, 0.05, 0.08.]
Tables
   - Bar and Pie Charts -

Pie Chart for Top Company Employers Example

  1999 Top Company Employers in Central Florida



[Figure: pie chart with slices Tourism 35%, Retail 20%, Health Care 16%, Others 29%.]
Pareto Diagrams


     A Pareto diagram is a bar chart that displays the
frequency of defect causes. The bar at the left indicates
  the most frequent cause and bars to the right indicate
causes in decreasing frequency. A Pareto diagram is used
   to separate the “vital few” from the “trivial many.”
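The ordering behind a Pareto diagram can be sketched as follows. The defect causes and counts are hypothetical, invented only for illustration; the cumulative percentage column is a common companion to the bars and helps pick out the “vital few.”

```python
# Hypothetical defect counts by cause (not from the slides).
defects = {
    "Scratches": 12,
    "Misalignment": 45,
    "Wrong label": 7,
    "Leaks": 30,
    "Other": 6,
}

# Sort causes from most to least frequent, as the bars appear left to right.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)

total = sum(defects.values())
running = 0
pareto = []  # (cause, count, cumulative percent)
for cause, count in ordered:
    running += count
    pareto.append((cause, count, 100 * running / total))

for cause, count, cum_pct in pareto:
    print(f"{cause:<14}{count:>4}{cum_pct:>8.1f}%")
```

In this made-up example the two most frequent causes account for 75% of all defects.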
Line Charts

 A line chart, also called a time plot, is a series of data plotted
at various time intervals. Measuring time along the horizontal
 axis and the numerical quantity of interest along the vertical
axis yields a point on the graph for each observation. Joining
 points adjacent in time by straight lines produces a time plot.
Line Charts
[Figure: line chart, "Growth Trends in Internet Use by Age, 1997 to 1999". Vertical axis: Millions of Adults, 0 to 35. Horizontal axis: April 1997 to July 1999. Series: Age 18 to 29, Age 30 to 49, Age 50+.]
Parameters and Statistics

A statistic is a descriptive measure computed from a
    sample of data. A parameter is a descriptive
  measure computed from an entire population of
                          data.
Measures of Central Tendency
    - Arithmetic Mean -


The arithmetic mean of a set of data is the
  sum of the data values divided by the
        number of observations.
Sample Mean


If the data set is from a sample, then the sample mean, $\bar{X}$, is:

\[ \bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
Population Mean


If the data set is from a population, then the population mean, $\mu$, is:

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N} \]
Measures of Central Tendency
          - Median -
   An ordered array is an arrangement of data in either
    ascending or descending order. Once the data are
arranged in ascending order, the median is the value such
 that 50% of the observations are smaller and 50% of the
                 observations are larger.

If the sample size n is an odd number, the median,
Xm, is the middle observation. If the sample size n
is an even number, the median, Xm, is the average
of the two middle observations. The median will
be located in the 0.50(n+1)th ordered position.
Measures of Central Tendency
         - Mode -


   The mode, if one exists, is the most
 frequently occurring observation in the
         sample or population.
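The three measures of central tendency can be computed with Python's standard `statistics` module. The data set below is hypothetical, chosen so that n is even and the median falls between two observations.

```python
import statistics

# Hypothetical sample of n = 8 observations, already in ascending order.
data = [3, 5, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(data)        # sum of values / number of observations
median = statistics.median(data)    # average of the 4th and 5th values (n is even)
modes = statistics.multimode(data)  # list of the most frequent value(s)

print(mean, median, modes)
```

With n = 8 the median sits at ordered position 0.50(8 + 1) = 4.5, that is, halfway between the 4th and 5th observations. `multimode` returns a list because a data set can have more than one mode.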
Shape of the Distribution

  The shape of the distribution is said to be
 symmetric if the observations are balanced,
 or evenly distributed, about the mean. In a
symmetric distribution the mean and median
                  are equal.
Shape of the Distribution

  A distribution is skewed if the observations are not
symmetrically distributed above and below the mean.
      A positively skewed (or skewed to the right)
  distribution has a tail that extends to the right in the
 direction of positive values. A negatively skewed (or
 skewed to the left) distribution has a tail that extends
      to the left in the direction of negative values.
Shapes of the Distribution
[Figures: three histograms titled "Symmetric Distribution", "Positively Skewed Distribution" and "Negatively Skewed Distribution". Horizontal axis: values 1 to 9. Vertical axis: Frequency.]
Measures of Central Tendency
          - Geometric Mean -

The Geometric Mean is the nth root of the product of n
                    numbers:

\[ \bar{X}_g = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = (x_1 \cdot x_2 \cdots x_n)^{1/n} \]

The Geometric Mean is used to obtain mean growth over
 several periods given compounded growth from each
                        period.
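A small sketch of the geometric mean applied to compounded growth. The three growth rates (+10%, +20%, -5%) are hypothetical; each is expressed as a growth factor (1 + rate) before taking the nth root.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    return math.prod(values) ** (1 / len(values))

# Hypothetical growth over three periods: +10%, +20%, -5%.
growth_factors = [1.10, 1.20, 0.95]

mean_factor = geometric_mean(growth_factors)
mean_growth_rate = mean_factor - 1

print(f"mean growth per period: {mean_growth_rate:.2%}")
```

Compounding the mean factor over three periods gives the same total growth as the three original factors, which is why the geometric mean rather than the arithmetic mean is used here.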
Measures of Variability
   - The Range -


The range of a set of data is the
difference between the largest and
       smallest observations.
Measures of Variability
        - Sample Variance -


  The sample variance, $s^2$, is the sum of the squared
differences between each observation and the sample
      mean divided by the sample size minus 1.

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1} \]
Measures of Variability
- Short-cut Formulas for Sample
            Variance -




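  Short-cut formulas for the sample variance are:

\[ s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} \quad \text{or} \quad s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{X}^2}{n - 1} \]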
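A quick numerical check that the short-cut formulas agree with the definition of the sample variance. The five data values are hypothetical.

```python
# Hypothetical sample.
data = [4, 7, 9, 10, 15]
n = len(data)
mean = sum(data) / n

# Definition: squared deviations from the sample mean, divided by n - 1.
definitional = sum((x - mean) ** 2 for x in data) / (n - 1)

# Short-cut formulas: only the sum of x and the sum of x squared are needed.
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
shortcut_a = (sum_x2 - sum_x ** 2 / n) / (n - 1)
shortcut_b = (sum_x2 - n * mean ** 2) / (n - 1)

print(definitional, shortcut_a, shortcut_b)
```

The short-cut forms save a pass over the data, but with floating-point arithmetic they can lose precision when the mean is large relative to the spread, so the definitional form is often the safer choice in code.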
Measures of Variability
        - Population Variance -



 The population variance, $\sigma^2$, is the sum of the squared
differences between each observation and the population
         mean divided by the population size, N.

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
Measures of Variability
 - Sample Standard Deviation -



The sample standard deviation, s, is the positive square
       root of the sample variance, and is defined as:

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}} \]
Measures of Variability
- Population Standard Deviation-




     The population standard deviation, $\sigma$, is

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]
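Python's standard `statistics` module distinguishes the sample and population versions: `variance` and `stdev` divide by n - 1, while `pvariance` and `pstdev` divide by N. A brief sketch on a hypothetical data set:

```python
import statistics

# Hypothetical data: mean is 5, sum of squared deviations is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s2 = statistics.variance(data)       # sample variance, divisor n - 1
s = statistics.stdev(data)           # sample standard deviation
sigma2 = statistics.pvariance(data)  # population variance, divisor N
sigma = statistics.pstdev(data)      # population standard deviation

print(s2, s, sigma2, sigma)
```

Which pair to use depends on whether the data are treated as a sample drawn from a larger population or as the entire population.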
The Empirical Rule
           (the 68%, 95%, or almost all rule)

For a set of data with a mound-shaped histogram, the Empirical
  Rule is:

•   approximately 68% of the observations are contained within a
    distance of one standard deviation around the mean; µ ± 1σ
•   approximately 95% of the observations are contained within a
    distance of two standard deviations around the mean; µ ± 2σ
•   almost all of the observations are contained within a distance
    of three standard deviations around the mean; µ ± 3σ
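The three intervals of the Empirical Rule can be computed directly from a mean and a standard deviation. The values µ = 100 and σ = 15 below are hypothetical, and the function name is illustrative.

```python
def empirical_rule_intervals(mean, sd):
    """Return the intervals mean ± k*sd for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

coverage = {1: "approximately 68%", 2: "approximately 95%", 3: "almost all"}

# Hypothetical mound-shaped data with mean 100 and standard deviation 15.
intervals = empirical_rule_intervals(mean=100, sd=15)

for k, (low, high) in intervals.items():
    print(f"{coverage[k]} of observations lie between {low} and {high}")
```

The rule is only a guide for mound-shaped data; for strongly skewed data the actual proportions can differ noticeably.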
Coefficient of Variation

The Coefficient of Variation, CV, is a measure of relative
  dispersion that expresses the standard deviation as a
percentage of the mean (provided the mean is positive).
         The sample coefficient of variation is

\[ CV = \frac{s}{\bar{X}} \times 100 \quad \text{if } \bar{X} > 0 \]

       The population coefficient of variation is

\[ CV = \frac{\sigma}{\mu} \times 100 \quad \text{if } \mu > 0 \]
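A short sketch of the sample coefficient of variation. The two data sets are hypothetical; they share the same standard deviation but have very different means, which is the situation where CV is most informative.

```python
import statistics

def coefficient_of_variation(data):
    """Sample CV: standard deviation as a percentage of the mean."""
    mean = statistics.mean(data)
    if mean <= 0:
        raise ValueError("CV is only defined here for a positive mean")
    return statistics.stdev(data) / mean * 100

# Hypothetical samples: both have s = 2, but the means are 12 and 102.
low_priced = [10, 12, 14]
high_priced = [100, 102, 104]

cv_low = coefficient_of_variation(low_priced)
cv_high = coefficient_of_variation(high_priced)

print(f"{cv_low:.1f}% versus {cv_high:.1f}%")
```

Although the absolute spread is identical, the relative dispersion of the first sample is far larger.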
Percentiles and Quartiles


  Data must first be in ascending order. Percentiles
separate large ordered data sets into 100ths. The Pth
 percentile is a number such that P percent of all the
      observations are at or below that number.
Quartiles are descriptive measures that separate large
         ordered data sets into four quarters.
Percentiles and Quartiles



  The first quartile, Q1, is another name for the 25th
percentile. The first quartile divides the ordered data
such that 25% of the observations are at or below this
 value. Q1 is located in the 0.25(n+1)th position when
       the data are in ascending order. That is,

\[ Q_1 = \frac{n + 1}{4}\text{th ordered position} \]
Percentiles and Quartiles



The third quartile, Q3, is another name for the 75th
 percentile. The third quartile divides the ordered
 data such that 75% of the observations are at or
 below this value. Q3 is located in the 0.75(n+1)th
position when the data are in ascending order. That
                          is,

\[ Q_3 = \frac{3(n + 1)}{4}\text{th ordered position} \]
Interquartile Range



 The Interquartile Range (IQR) measures the spread
in the middle 50% of the data; that is the difference
  between the observations at the 25th and the 75th
                     percentiles:

            IQR = Q3 − Q1
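The quartile positions and the IQR can be sketched as below. The data set is hypothetical, with n = 11 so that both positions are whole numbers. When a position is fractional, this sketch interpolates linearly between the two neighbouring observations; that interpolation is an assumption, since the slides give only the position.

```python
def value_at_position(ordered, position):
    """Value at a 1-based ordered position; fractional positions are
    linearly interpolated between neighbours (an assumption)."""
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

def quartiles(data):
    """Q1 and Q3 at the 0.25(n+1)th and 0.75(n+1)th ordered positions."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations")
    q1 = value_at_position(ordered, 0.25 * (n + 1))
    q3 = value_at_position(ordered, 0.75 * (n + 1))
    return q1, q3

# Hypothetical sample, n = 11: Q1 is the 3rd value and Q3 is the 9th value.
data = [12, 15, 17, 19, 22, 24, 27, 30, 33, 38, 41]
q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other software may use slightly different quartile conventions, so results for small samples can differ from those of this position rule.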
Five-Number Summary



 The Five-Number Summary refers to the five
descriptive measures: minimum, first quartile,
  median, third quartile, and the maximum.
\[ X_{\text{minimum}} < Q_1 < \text{Median} < Q_3 < X_{\text{maximum}} \]
Box-and-Whisker Plots

 A Box-and-Whisker Plot is a graphical procedure that
           uses the Five-Number Summary.
         A Box-and-Whisker Plot consists of
• an inner box that shows the numbers which span the
      range from Q1 to Q3.
• a line drawn through the box at the median.
The “whiskers” are lines drawn from Q1 to the minimum
       value, and from Q3 to the maximum value.
Box-and-Whisker Plots (Excel)

[Figure: "Box-and-whisker Plot". Vertical axis from 10 to 45.]
Grouped Data Mean
          For a population of N observations the mean is

\[ \mu = \frac{\sum_{i=1}^{K} f_i m_i}{N} \]

           For a sample of n observations, the mean is

\[ \bar{X} = \frac{\sum_{i=1}^{K} f_i m_i}{n} \]

Where the data set contains observation values $m_1, m_2, \ldots, m_K$ occurring with
                     frequencies $f_1, f_2, \ldots, f_K$ respectively
Grouped Data Variance
        For a population of N observations the variance is

\[ \sigma^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \mu)^2}{N} = \frac{\sum_{i=1}^{K} f_i m_i^2}{N} - \mu^2 \]

         For a sample of n observations, the variance is

\[ s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{X})^2}{n - 1} = \frac{\sum_{i=1}^{K} f_i m_i^2 - n\bar{X}^2}{n - 1} \]

Where the data set contains observation values $m_1, m_2, \ldots, m_K$ occurring with
                     frequencies $f_1, f_2, \ldots, f_K$ respectively
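The grouped-data formulas can be applied to the suntan lotion frequency distribution from earlier in the deck. One assumption is made here: each class is represented by its midpoint (222.5 for "220 less than 225", and so on), which is a common way to choose the values $m_i$ when only class intervals are known.

```python
# Class midpoints (assumed to represent each interval) and class frequencies
# from the suntan lotion frequency distribution.
midpoints = [222.5, 227.5, 232.5, 237.5, 242.5, 247.5]
frequencies = [1, 4, 29, 34, 26, 6]

n = sum(frequencies)

# Grouped sample mean: sum of f_i * m_i divided by n.
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Grouped sample variance, definitional form.
variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / (n - 1)

# Grouped sample variance, short-cut form.
shortcut = (sum(f * m * m for f, m in zip(frequencies, midpoints)) - n * mean ** 2) / (n - 1)

print(f"n = {n}, mean = {mean:.1f} mL, variance = {variance:.2f}")
```

Because midpoints stand in for the actual observations, the results approximate, rather than equal, the mean and variance of the raw data.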

Canteen user survey format - Developed by Arunesh Chand MankotiaConsultonmic
 
Canteen management system
Canteen management systemCanteen management system
Canteen management systemConsultonmic
 
Indoor advertising concept NAMO - Arunesh Chand Mankotia
Indoor advertising concept  NAMO - Arunesh Chand MankotiaIndoor advertising concept  NAMO - Arunesh Chand Mankotia
Indoor advertising concept NAMO - Arunesh Chand MankotiaConsultonmic
 
Namo Advertising India
Namo Advertising IndiaNamo Advertising India
Namo Advertising IndiaConsultonmic
 
Consultonomic feasibility report - 2016
Consultonomic  feasibility report - 2016Consultonomic  feasibility report - 2016
Consultonomic feasibility report - 2016Consultonmic
 
Model cafeteria iit roorkee
Model cafeteria   iit roorkeeModel cafeteria   iit roorkee
Model cafeteria iit roorkeeConsultonmic
 

Más de Consultonmic (20)

Recruitment Matrix
Recruitment MatrixRecruitment Matrix
Recruitment Matrix
 
Average Handling Time - Time To Fill
Average Handling Time - Time To FillAverage Handling Time - Time To Fill
Average Handling Time - Time To Fill
 
RESEARCH REPORT EDUCATION – INSTITUTIONS – INDUSTRY INTAKE ‘LOGIS...
RESEARCH REPORT EDUCATION  –   INSTITUTIONS  –   INDUSTRY  INTAKE      ‘LOGIS...RESEARCH REPORT EDUCATION  –   INSTITUTIONS  –   INDUSTRY  INTAKE      ‘LOGIS...
RESEARCH REPORT EDUCATION – INSTITUTIONS – INDUSTRY INTAKE ‘LOGIS...
 
School Of Agriculture & Supply Chain Management - Concept Project Report
School Of Agriculture & Supply Chain Management - Concept Project Report  School Of Agriculture & Supply Chain Management - Concept Project Report
School Of Agriculture & Supply Chain Management - Concept Project Report
 
Project report for fly ash brick single unit
Project report for fly ash brick   single unitProject report for fly ash brick   single unit
Project report for fly ash brick single unit
 
FLY ASH BRICK PRODUCTION
FLY ASH BRICK PRODUCTIONFLY ASH BRICK PRODUCTION
FLY ASH BRICK PRODUCTION
 
VIRTUAL RECRUITMENT
VIRTUAL RECRUITMENTVIRTUAL RECRUITMENT
VIRTUAL RECRUITMENT
 
Digital marketing & Advertising/Branding Start up Recruitment/Structure Plan
Digital marketing & Advertising/Branding Start up Recruitment/Structure Plan Digital marketing & Advertising/Branding Start up Recruitment/Structure Plan
Digital marketing & Advertising/Branding Start up Recruitment/Structure Plan
 
Bench management - arunesh chand mankotia
Bench management -  arunesh chand mankotiaBench management -  arunesh chand mankotia
Bench management - arunesh chand mankotia
 
Bench management arunesh chand mankotia
Bench management   arunesh chand mankotiaBench management   arunesh chand mankotia
Bench management arunesh chand mankotia
 
FOODONOMIC HEALTH TIPS
FOODONOMIC HEALTH TIPSFOODONOMIC HEALTH TIPS
FOODONOMIC HEALTH TIPS
 
Consultonomic Solutions for Strategic Growth - Educational Institutes (Mahar...
Consultonomic Solutions for Strategic Growth  - Educational Institutes (Mahar...Consultonomic Solutions for Strategic Growth  - Educational Institutes (Mahar...
Consultonomic Solutions for Strategic Growth - Educational Institutes (Mahar...
 
Strategic business proposal eent - green brick project - arunesh chand mank...
Strategic business proposal   eent - green brick project - arunesh chand mank...Strategic business proposal   eent - green brick project - arunesh chand mank...
Strategic business proposal eent - green brick project - arunesh chand mank...
 
Best Ad Banners Ever - Arunesh Chand Mankotia
Best Ad Banners Ever  - Arunesh Chand MankotiaBest Ad Banners Ever  - Arunesh Chand Mankotia
Best Ad Banners Ever - Arunesh Chand Mankotia
 
Canteen user survey format - Developed by Arunesh Chand Mankotia
Canteen user survey format - Developed by Arunesh Chand MankotiaCanteen user survey format - Developed by Arunesh Chand Mankotia
Canteen user survey format - Developed by Arunesh Chand Mankotia
 
Canteen management system
Canteen management systemCanteen management system
Canteen management system
 
Indoor advertising concept NAMO - Arunesh Chand Mankotia
Indoor advertising concept  NAMO - Arunesh Chand MankotiaIndoor advertising concept  NAMO - Arunesh Chand Mankotia
Indoor advertising concept NAMO - Arunesh Chand Mankotia
 
Namo Advertising India
Namo Advertising IndiaNamo Advertising India
Namo Advertising India
 
Consultonomic feasibility report - 2016
Consultonomic  feasibility report - 2016Consultonomic  feasibility report - 2016
Consultonomic feasibility report - 2016
 
Model cafeteria iit roorkee
Model cafeteria   iit roorkeeModel cafeteria   iit roorkee
Model cafeteria iit roorkee
 

Why Study Statistics Arunesh Chand Mankotia 2004

  • 2. Dealing with Uncertainty Everyday decisions are based on incomplete information
  • 3. Dealing with Uncertainty The price of L&T stock will be higher in six months than it is now. versus The price of L&T stock is likely to be higher in six months than it is now.
  • 4. Dealing with Uncertainty If the union budget deficit is as high as predicted, interest rates will remain high for the rest of the year. versus If the union budget deficit is as high as predicted, it is probable that interest rates will remain high for the rest of the year.
  • 5. Statistical Thinking Statistical thinking is a philosophy of learning and action based on the following fundamental principles: (1) all work occurs in a system of interconnected processes; (2) variation exists in all processes; and (3) understanding and reducing variation are the keys to success.
  • 6. Statistical Thinking Systems and Processes A system is a number of components that are logically and sometimes physically linked together for some purpose.
  • 7. Statistical Thinking Systems and Processes A process is a set of activities operating on a system that transforms inputs to outputs. A business process is a group of logically related tasks and activities that, when performed, uses the resources of the business to provide the definitive results required to achieve the business objectives.
  • 8. Making Decisions Data, Information, Knowledge. Data: specific observations of measured numbers. Information: processed and summarized data yielding facts and ideas. Knowledge: selected and organized information that provides understanding, recommendations, and the basis for decisions.
  • 9. Making Decisions Descriptive and Inferential Statistics Descriptive Statistics include graphical and numerical procedures that summarize and process data and are used to transform data into information.
  • 10. Making Decisions Descriptive and Inferential Statistics Inferential Statistics provide the bases for predictions, forecasts, and estimates that are used to transform information to knowledge.
  • 11. The Journey to Making Decisions Begin here: identify the problem. Data become Information through descriptive statistics, probability, and computers. Information becomes Knowledge through experience, theory, literature, inferential statistics, and computers. Knowledge leads to the Decision.
  • 13. Summarizing and Describing Data: tables and graphs; numerical measures.
  • 14. Classification of Variables: discrete numerical variable; continuous numerical variable; categorical variable.
  • 15. Classification of Variables Discrete Numerical Variable A variable that produces a response that comes from a counting process.
  • 16. Classification of Variables Continuous Numerical Variable A variable that produces a response that is the outcome of a measurement process.
  • 17. Classification of Variables Categorical Variables Variables that produce responses that belong to groups (sometimes called “classes”) or categories.
  • 18. Measurement Levels Nominal and Ordinal Levels of Measurement refer to data obtained from categorical questions. • A nominal scale indicates assignments to groups or classes. • Ordinal data indicate rank ordering of items.
  • 19. Frequency Distributions A frequency distribution is a table used to organize data. The left column (called classes or groups) includes numerical intervals on a variable being studied. The right column is a list of the frequencies, or number of observations, for each class. Intervals are normally of equal size, must cover the range of the sample observations, and be non-overlapping.
  • 20. Construction of a Frequency Distribution Rule 1: Intervals (classes) must be inclusive and non-overlapping. Rule 2: Determine k, the number of classes. Rule 3: Intervals should be the same width, w, determined by $w = \text{Interval Width} = \frac{\text{Largest Number} - \text{Smallest Number}}{\text{Number of Intervals}}$. Both k and w should be rounded upward, possibly to the next largest integer.
  • 21. Construction of a Frequency Distribution Quick Guide to Number of Classes for a Frequency Distribution
      Sample Size      Number of Classes
      Fewer than 50    5 to 6 classes
      50 to 100        6 to 8 classes
      Over 100         8 to 10 classes
  • 22. Example of a Frequency Distribution A Frequency Distribution for the Suntan Lotion Example
      Weights (in mL)      Number of Bottles
      220 less than 225      1
      225 less than 230      4
      230 less than 235     29
      235 less than 240     34
      240 less than 245     26
      245 less than 250      6
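The three rules can be sketched in a few lines of Python. This is a minimal illustration, not the slide author's procedure: the data values and the function names `interval_width` and `frequency_distribution` are hypothetical, and the first class is assumed to start at the smallest observation.

```python
import math

def interval_width(values, k):
    """Rule 3: w = (largest - smallest) / k, rounded up to the next integer."""
    return math.ceil((max(values) - min(values)) / k)

def frequency_distribution(values, k):
    """Count observations in k equal-width classes [lower, upper).

    Assumption: classes start at the smallest observation. If the largest
    observation falls exactly on the last upper limit, it is counted in
    the last class.
    """
    w = interval_width(values, k)
    start = min(values)
    counts = [0] * k
    for v in values:
        index = min(int((v - start) // w), k - 1)
        counts[index] += 1
    return [(start + i * w, start + (i + 1) * w, counts[i]) for i in range(k)]

# Hypothetical data, chosen only to illustrate the rules.
data = [12, 15, 21, 22, 27, 30, 34, 38, 41, 47]
for lower, upper, count in frequency_distribution(data, k=4):
    print(f"{lower} less than {upper}: {count}")
```

Here the range is 47 - 12 = 35, so with k = 4 the raw width 8.75 is rounded up to 9.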
  • 23. Cumulative Frequency Distributions A cumulative frequency distribution contains the number of observations whose values are less than the upper limit of each interval. It is constructed by adding the frequencies of all frequency distribution intervals up to and including the present interval.
  • 24. Relative Cumulative Frequency Distributions A relative cumulative frequency distribution converts all cumulative frequencies to cumulative percentages
  • 25. Example of a Frequency Distribution A Cumulative Frequency Distribution for the Suntan Lotion Example
      Weights (in mL)      Number of Bottles
      less than 225           1
      less than 230           5
      less than 235          34
      less than 240          68
      less than 245          94
      less than 250         100
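As a quick check of the cumulative table, the running totals can be recomputed from the class frequencies of the suntan lotion example. The variable names below are illustrative only.

```python
from itertools import accumulate

# Class frequencies and upper limits from the suntan lotion example.
upper_limits = [225, 230, 235, 240, 245, 250]
frequencies = [1, 4, 29, 34, 26, 6]

# Cumulative frequency: add all frequencies up to and including each class.
cumulative = list(accumulate(frequencies))

# Relative cumulative frequency: cumulative counts as percentages of the total.
total = cumulative[-1]
relative_cumulative = [100 * c / total for c in cumulative]

for limit, c, pct in zip(upper_limits, cumulative, relative_cumulative):
    print(f"less than {limit}: {c} bottles ({pct:.0f}%)")
```

Because the sample contains exactly 100 bottles, the cumulative counts and the cumulative percentages coincide numerically.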
  • 26. Histograms and Ogives A histogram is a bar graph that consists of vertical bars constructed on a horizontal line that is marked off with intervals for the variable being displayed. The intervals correspond to those in a frequency distribution table. The height of each bar is proportional to the number of observations in that interval.
  • 27. Histograms and Ogives An ogive, sometimes called a cumulative line graph, is a line that connects points that are the cumulative percentage of observations below the upper limit of each class in a cumulative frequency distribution.
  • 28. Histogram and Ogive for Example 1 [Figure: Histogram of Weights; horizontal axis: Interval Weights (mL), with ticks at 224.5, 229.5, 234.5, 239.5, 244.5, 249.5; vertical axis: Frequency.]
  • 29. Stem-and-Leaf Display A stem-and-leaf display is an exploratory data analysis graph that is an alternative to the histogram. Data are grouped according to their leading digits (called the stem) while listing the final digits (called leaves) separately for each member of a class. The leaves are displayed individually in ascending order after each of the stems.
  • 30. Stem-and-Leaf Display (stem unit: 10)
      Count  Stem  Leaves
        9     1    124678899
       (9)    2    122246899
        5     3    01234
        2     4    02
  • 31. Tables - Bar and Pie Charts - Frequency and Relative Frequency Distribution for Top Company Employers Example
      Industry          Number of Employees   Percent
      Tourism                 85,287            0.35
      Retail                  49,424            0.2
      Health Care             39,588            0.16
      Restaurants             16,050            0.06
      Communications          11,750            0.05
      Technology              11,144            0.05
      Space                   11,418            0.05
      Other                   21,336            0.08
  • 32. Tables - Bar and Pie Charts - Bar Chart for Top Company Employers Example [Bar chart: 1999 Top Company Employers in Central Florida; horizontal axis: Industry Category.]
  • 33. Tables - Bar and Pie Charts - Pie Chart for Top Company Employers Example [Pie chart: 1999 Top Company Employers in Central Florida. Tourism 35%, Others 29%, Retail 20%, Health Care 16%.]
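A stem-and-leaf display like the one above can be rebuilt programmatically. The sketch below assumes non-negative integer data and a stem unit of 10. The observations are read back from the display itself (for example, stem 1 with leaf 2 is taken to be 12), and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_unit=10):
    """Group non-negative integers by leading digits (stem) and list the
    final digits (leaves) in ascending order for each stem."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, stem_unit)
        groups[stem].append(leaf)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(groups.items())}

# Observations reconstructed from the display (an assumption about the raw data).
data = [11, 12, 14, 16, 17, 18, 18, 19, 19,
        21, 22, 22, 22, 24, 26, 28, 29, 29,
        30, 31, 32, 33, 34, 40, 42]

for stem, leaves in stem_and_leaf(data).items():
    print(f"{len(leaves):>3}  {stem} | {leaves}")
```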
  • 34. Pareto Diagrams A Pareto diagram is a bar chart that displays the frequency of defect causes. The bar at the left indicates the most frequent cause and bars to the right indicate causes in decreasing frequency. A Pareto diagram is used to separate the “vital few” from the “trivial many.”
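The ordering behind a Pareto diagram can be sketched as a sorted table with a running cumulative percentage. The defect causes and counts below are invented for illustration, and `pareto_table` is a hypothetical helper, not part of any library.

```python
def pareto_table(defect_counts):
    """Sort causes by decreasing frequency and attach the cumulative percentage."""
    total = sum(defect_counts.values())
    ordered = sorted(defect_counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for cause, count in ordered:
        running += count
        rows.append((cause, count, round(100 * running / total, 1)))
    return rows

# Hypothetical defect counts.
defects = {"scratches": 12, "misalignment": 48, "wrong label": 5,
           "cracks": 30, "other": 5}

for cause, count, cumulative_pct in pareto_table(defects):
    print(f"{cause:<14}{count:>4}{cumulative_pct:>8}%")
```

In this made-up example the first two causes account for 78% of all defects, which is the kind of "vital few" the diagram is meant to expose.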
  • 35. Line Charts A line chart, also called a time plot, is a series of data plotted at various time intervals. Measuring time along the horizontal axis and the numerical quantity of interest along the vertical axis yields a point on the graph for each observation. Joining points adjacent in time by straight lines produces a time plot.
  • 36. Line Charts [Line chart: Growth Trends in Internet Use by Age, 1997 to 1999; series: Age 18 to 29, Age 30 to 49, Age 50+; vertical axis: Millions of Adults; horizontal axis: April 1997 to July 1999.]
  • 37. Parameters and Statistics A statistic is a descriptive measure computed from a sample of data. A parameter is a descriptive measure computed from an entire population of data.
  • 38. Measures of Central Tendency - Arithmetic Mean - The arithmetic mean of a set of data is the sum of the data values divided by the number of observations.
  • 39. Sample Mean If the data set is from a sample, then the sample mean, $\bar{X}$, is: $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
  • 40. Population Mean If the data set is from a population, then the population mean, $\mu$, is: $\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$
  • 41. Measures of Central Tendency - Median - An ordered array is an arrangement of data in either ascending or descending order. Once the data are arranged in ascending order, the median is the value such that 50% of the observations are smaller and 50% of the observations are larger. If the sample size n is an odd number, the median, Xm, is the middle observation. If the sample size n is an even number, the median, Xm, is the average of the two middle observations. The median will be located in the 0.50(n+1)th ordered position.
  • 42. Measures of Central Tendency - Mode - The mode, if one exists, is the most frequently occurring observation in the sample or population.
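The three measures of central tendency can be computed with Python's standard `statistics` module. The sketch below also implements the 0.50(n+1)th-position rule for the median directly. The data values and the helper name `median_by_position` are hypothetical.

```python
import statistics

def median_by_position(values):
    """Median located at the 0.50(n+1)th ordered position (1-based).

    For odd n the position is a whole number (the middle observation);
    for even n it ends in .5, so the two middle observations are averaged.
    """
    ordered = sorted(values)
    position = 0.5 * (len(ordered) + 1)
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_sample = [3, 5, 5, 7, 8, 9, 12]    # n = 7, median at position 4
even_sample = [3, 5, 5, 7, 8, 9]       # n = 6, median at position 3.5

print(statistics.mean(odd_sample))      # arithmetic mean
print(median_by_position(odd_sample))   # middle observation
print(median_by_position(even_sample))  # average of the two middle observations
print(statistics.mode(odd_sample))      # most frequently occurring observation
```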
  • 43. Shape of the Distribution The shape of the distribution is said to be symmetric if the observations are balanced, or evenly distributed, about the mean. In a symmetric distribution the mean and median are equal.
  • 44. Shape of the Distribution A distribution is skewed if the observations are not symmetrically distributed above and below the mean. A positively skewed (or skewed to the right) distribution has a tail that extends to the right in the direction of positive values. A negatively skewed (or skewed to the left) distribution has a tail that extends to the left in the direction of negative values.
  • 45. Shapes of the Distribution [Figure: three frequency histograms over the values 1 to 9, titled Symmetric Distribution, Positively Skewed Distribution, and Negatively Skewed Distribution.]
  • 46. Measures of Central Tendency - Geometric Mean - The Geometric Mean is the nth root of the product of n numbers: $\bar{X}_g = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = (x_1 \cdot x_2 \cdots x_n)^{1/n}$ The Geometric Mean is used to obtain mean growth over several periods given compounded growth from each period.
  • 47. Measures of Variability - The Range - The range of a set of data is the difference between the largest and smallest observations.
  • 48. Measures of Variability - Sample Variance - The sample variance, $s^2$, is the sum of the squared differences between each observation and the sample mean divided by the sample size minus 1. $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}$
  • 49. Measures of Variability - Short-cut Formulas for Sample Variance - Short-cut formulas for the sample variance are: $s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}$ or $s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{X}^2}{n-1}$
  • 50. Measures of Variability - Population Variance - The population variance, $\sigma^2$, is the sum of the squared differences between each observation and the population mean divided by the population size, N. $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  • 51. Measures of Variability - Sample Standard Deviation - The sample standard deviation, s, is the positive square root of the variance, and is defined as: $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1}}$
  • 52. Measures of Variability - Population Standard Deviation - The population standard deviation, $\sigma$, is $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
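To illustrate the "mean growth over several periods" use, the sketch below takes the nth root of the product of per-period growth factors. The growth factors are hypothetical, and `geometric_mean` here is a local helper (the standard library also offers `statistics.geometric_mean`).

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    return math.prod(values) ** (1 / len(values))

# Hypothetical growth factors: +10%, +20%, then -5% over three periods.
growth_factors = [1.10, 1.20, 0.95]

mean_factor = geometric_mean(growth_factors)
print(f"mean growth per period: {100 * (mean_factor - 1):.2f}%")

# Compounding the mean factor over three periods reproduces the total growth.
print(math.isclose(mean_factor ** 3, math.prod(growth_factors)))
```

The arithmetic mean of the same factors would overstate the compounded growth, which is why the geometric mean is the appropriate average here.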
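The definitional formula and the two short-cut formulas should agree on any data set. The following sketch checks that on a small hypothetical sample. The function names are illustrative.

```python
import math

def variance_definition(x):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** 2 for xi in x) / (n - 1)

def variance_shortcut_sums(x):
    """(sum of x_i^2 - (sum of x_i)^2 / n) / (n - 1)."""
    n = len(x)
    return (sum(xi ** 2 for xi in x) - sum(x) ** 2 / n) / (n - 1)

def variance_shortcut_mean(x):
    """(sum of x_i^2 - n * mean^2) / (n - 1)."""
    n = len(x)
    mean = sum(x) / n
    return (sum(xi ** 2 for xi in x) - n * mean ** 2) / (n - 1)

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data: n = 8, mean = 5

print(variance_definition(sample))
print(math.isclose(variance_definition(sample), variance_shortcut_sums(sample)))
print(math.isclose(variance_definition(sample), variance_shortcut_mean(sample)))
```

For this sample the squared deviations sum to 32, so each formula gives 32/7.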
  • 53. The Empirical Rule (the 68%, 95%, or almost all rule) For a set of data with a mound-shaped histogram, the Empirical Rule is: • approximately 68% of the observations are contained within a distance of one standard deviation around the mean, $\mu \pm 1\sigma$; • approximately 95% of the observations are contained within a distance of two standard deviations around the mean, $\mu \pm 2\sigma$; • almost all of the observations are contained within a distance of three standard deviations around the mean, $\mu \pm 3\sigma$.
  • 54. Coefficient of Variation The Coefficient of Variation, CV, is a measure of relative dispersion that expresses the standard deviation as a percentage of the mean (provided the mean is positive). The sample coefficient of variation is $CV = \frac{s}{\bar{X}} \times 100$ if $\bar{X} > 0$. The population coefficient of variation is $CV = \frac{\sigma}{\mu} \times 100$ if $\mu > 0$.
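The Empirical Rule can be checked roughly by simulation. The sketch below draws a mound-shaped data set from a normal distribution and counts the share of observations within one, two, and three standard deviations of the mean. The mean of 100, the standard deviation of 15, the sample size, and the seed are arbitrary assumptions.

```python
import random
import statistics

def share_within(values, k):
    """Proportion of observations within k standard deviations of the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    inside = sum(1 for v in values if abs(v - mu) <= k * sigma)
    return inside / len(values)

# Simulated mound-shaped data; parameters are arbitrary.
rng = random.Random(0)
simulated = [rng.gauss(100, 15) for _ in range(10_000)]

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * share_within(simulated, k):.1f}%")
```

The shares should land near 68%, 95%, and just under 100%, with small differences from run to run if the seed is changed.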
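A short sketch of the sample coefficient of variation, using two hypothetical data sets whose standard deviations differ by a factor of ten but whose relative dispersion is the same. The helper name `sample_cv` is an assumption.

```python
import statistics

def sample_cv(values):
    """Sample standard deviation as a percentage of the sample mean.

    Defined only when the mean is positive, as the slide requires.
    """
    mean = statistics.fmean(values)
    if mean <= 0:
        raise ValueError("coefficient of variation requires a positive mean")
    return statistics.stdev(values) / mean * 100

small_scale = [10, 12, 14]      # mean 12, s = 2
large_scale = [100, 120, 140]   # mean 120, s = 20

print(f"{sample_cv(small_scale):.2f}%")
print(f"{sample_cv(large_scale):.2f}%")
```

Both data sets give the same CV, which is the point of a relative measure: it allows dispersion to be compared across data measured on different scales.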
  • 55. Percentiles and Quartiles Data must first be in ascending order. Percentiles separate large ordered data sets into 100ths. The Pth percentile is a number such that P percent of all the observations are at or below that number. Quartiles are descriptive measures that separate large ordered data sets into four quarters.
  • 56. Percentiles and Quartiles The first quartile, Q1, is another name for the 25th percentile. The first quartile divides the ordered data such that 25% of the observations are at or below this value. Q1 is located in the 0.25(n+1)th position when the data are in ascending order. That is, $Q_1 = \frac{n+1}{4}$ ordered position
  • 57. Percentiles and Quartiles The third quartile, Q3, is another name for the 75th percentile. The third quartile divides the ordered data such that 75% of the observations are at or below this value. Q3 is located in the 0.75(n+1)th position when the data are in ascending order. That is, $Q_3 = \frac{3(n+1)}{4}$ ordered position
  • 58. Interquartile Range The Interquartile Range (IQR) measures the spread in the middle 50% of the data; that is, the difference between the observations at the 25th and the 75th percentiles: $IQR = Q_3 - Q_1$
  • 59. Five-Number Summary The Five-Number Summary refers to the five descriptive measures: minimum, first quartile, median, third quartile, and the maximum. $X_{\text{minimum}} < Q_1 < \text{Median} < Q_3 < X_{\text{maximum}}$
  • 60. Box-and-Whisker Plots A Box-and-Whisker Plot is a graphical procedure that uses the Five-Number Summary. A Box-and-Whisker Plot consists of • an inner box that shows the numbers which span the range from Q1 to Q3; • a line drawn through the box at the median. The “whiskers” are lines drawn from Q1 to the minimum value, and from Q3 to the maximum value.
  • 61. Box-and-Whisker Plots (Excel) [Figure: Box-and-whisker Plot produced in Excel.]
  • 62. Grouped Data Mean For a population of N observations the mean is $\mu = \frac{\sum_{i=1}^{K} f_i m_i}{N}$. For a sample of n observations, the mean is $\bar{X} = \frac{\sum_{i=1}^{K} f_i m_i}{n}$, where the data set contains observation values m1, m2, . . ., mK occurring with frequencies f1, f2, . . ., fK respectively.
  • 63. Grouped Data Variance For a population of N observations the variance is $\sigma^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \mu)^2}{N} = \frac{\sum_{i=1}^{K} f_i m_i^2}{N} - \mu^2$. For a sample of n observations, the variance is $s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{X})^2}{n-1} = \frac{\sum_{i=1}^{K} f_i m_i^2 - n\bar{X}^2}{n-1}$, where the data set contains observation values m1, m2, . . ., mK occurring with frequencies f1, f2, . . ., fK respectively.
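The quartile positions, the IQR, and the Five-Number Summary can be sketched together. The slides give only the ordered positions 0.25(n+1), 0.50(n+1), and 0.75(n+1). Linear interpolation between neighbouring observations when a position is fractional is an assumption made here, and the data and function names are hypothetical. The sketch needs at least three observations.

```python
def value_at_position(ordered, position):
    """Value at a 1-based ordered position; interpolate linearly if fractional."""
    lower_index = int(position)          # whole part of the 1-based position
    fraction = position - lower_index
    lower = ordered[lower_index - 1]
    if fraction == 0:
        return lower
    upper = ordered[lower_index]
    return lower + fraction * (upper - lower)

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using (n + 1)-based positions."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at_position(ordered, 0.25 * (n + 1))
    median = value_at_position(ordered, 0.50 * (n + 1))
    q3 = value_at_position(ordered, 0.75 * (n + 1))
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical data with n = 11, so Q1, median, Q3 sit at positions 3, 6, 9.
data = [12, 15, 17, 20, 22, 25, 28, 31, 35, 40, 44]

minimum, q1, median, q3, maximum = five_number_summary(data)
print(minimum, q1, median, q3, maximum)
print("IQR =", q3 - q1)
```

With this interpolation choice the results match `statistics.quantiles(data, n=4, method="exclusive")` from the standard library, which uses the same (n+1)-based positions.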
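The grouped-data formulas can be applied to the suntan lotion frequency distribution from earlier in the deck. Two assumptions are made in this sketch: each class is represented by its midpoint (222.5 for "220 less than 225", and so on), and the 100 bottles are treated as a sample. The function names are hypothetical.

```python
import math

# Class midpoints (assumed representative values) and frequencies
# from the suntan lotion frequency distribution.
midpoints = [222.5, 227.5, 232.5, 237.5, 242.5, 247.5]
frequencies = [1, 4, 29, 34, 26, 6]

def grouped_mean(m, f):
    """Sum of f_i * m_i divided by the number of observations."""
    return sum(fi * mi for fi, mi in zip(f, m)) / sum(f)

def grouped_sample_variance(m, f):
    """Sum of f_i * (m_i - mean)^2 divided by n - 1."""
    n = sum(f)
    mean = grouped_mean(m, f)
    return sum(fi * (mi - mean) ** 2 for fi, mi in zip(f, m)) / (n - 1)

def grouped_sample_variance_shortcut(m, f):
    """(Sum of f_i * m_i^2 - n * mean^2) divided by n - 1."""
    n = sum(f)
    mean = grouped_mean(m, f)
    return (sum(fi * mi ** 2 for fi, mi in zip(f, m)) - n * mean ** 2) / (n - 1)

mean = grouped_mean(midpoints, frequencies)
variance = grouped_sample_variance(midpoints, frequencies)
print(f"grouped mean: {mean:.1f} mL")
print(f"grouped sample variance: {variance:.2f}")
print(f"grouped sample standard deviation: {math.sqrt(variance):.2f} mL")
```

Under these assumptions the weighted sum of midpoints is 23,740, giving a grouped mean of 237.4 mL, and both forms of the variance formula agree.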