© 2016
A Practical Approach to Analyzing
Healthcare Data
Chapter 7 – Sample Selections
Types of Studies - Descriptive
• Descriptive studies – performed to generate
hypotheses for more formal studies
– Cross-sectional study – describes the characteristics
of a population at a specific point in time
• Often used for prevalence studies
– Applied descriptive studies
• Data mining
• Exploratory data analysis
Types of Studies - Analytic
• Analytic studies – more formal studies designed to test a
specific hypothesis
– Case-control study – involves both a case group (subjects with
the attribute under investigation) and a control group (those
without the attribute)
• Members of the case and control groups are often matched based on
demographics
• Typically a retrospective study
• May not be used to determine cause and effect; can calculate odds
ratio
• Weakness – dependent on the subject’s ability to recall events
– Cohort study – involves case and control groups, but the groups are
identified before the study is performed
• Prospective study
• May not be used to determine cause and effect; can calculate relative
risk
• May take a long time to complete
• Not useful if the attribute studied is rare
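The two measures named above can be sketched from a 2x2 table. This is a minimal illustration, not taken from the slides: the function names, the table layout (a, b, c, d), and the counts in the usage lines are all hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table (typical for a case-control study).

    Assumed layout:
        a = has attribute/exposure, has outcome
        b = has attribute/exposure, no outcome
        c = no attribute/exposure,  has outcome
        d = no attribute/exposure,  no outcome
    """
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Relative risk from the same 2x2 layout (typical for a cohort study)."""
    risk_with = a / (a + b)      # outcome rate in the group with the attribute
    risk_without = c / (c + d)   # outcome rate in the group without it
    return risk_with / risk_without


# Hypothetical counts, for illustration only.
print(round(odds_ratio(30, 70, 10, 90), 2))
print(round(relative_risk(30, 70, 10, 90), 2))
```

Both functions fail with a division by zero if a cell they divide by is empty; a real analysis would handle that case explicitly.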
Types of Studies - Experimental
• Allow the determination of a cause and effect
relationship between variables
• Randomized Controlled Trials (RCT)
– Used to determine the effectiveness of new
drugs/treatment protocols
• Blinded studies
– Single blind – the subject does not know whether they are assigned
to the case or control group
– Double blind – neither the subject nor the researcher knows whether
the subject is assigned to the case or control group
– Triple blind – subject, researcher, and analyst are all
blinded as to the group assignment of the subject
Why select a sample?
• Often the population is too large to collect data from every
unit of analysis or subject
• Statistical inference is used to make conclusions about
a population based on a sample
• Vocabulary:
– Population or universe – all subjects that are under study
and eligible to be sampled
– Sample – selected subset of the population
– Sampling frame – A listing of all of the subjects in the
population
– Variable of interest – Quantity to be estimated (denial rate,
coding error rate, overpayment, underpayment, etc.)
Statistically Valid Sample
• Large enough to provide information with
sufficient precision to meet the goals of the
analysis
• Probability sample where each item has
an equal chance of being selected
• Must be reproducible
Defining the Variable of Interest
• What percentage of lab orders were not signed by a
physician during 2012?
– Universe – all lab orders during 2012
• What is the amount over/under paid due to incorrect E/M level
assignment during January?
– Universe –
• E/M services billed during January
• E/M services provided during January
• Must refine question to determine if billed date or service date should
be used for defining the universe
• What is the coding accuracy rate for secondary diagnosis
codes on inpatient accounts during the first quarter?
– Universe –
• All secondary diagnoses coded during first quarter
• All inpatient accounts during first quarter
• Must refine question to determine if diagnosis codes or charts are the
unit of analysis
Simple Random Sampling
• It is the statistical equivalent of drawing sampling units
from a hat.
• Each sampling unit (claim, chart, etc.) must have the
same probability of selection.
• Note that some random number generators will allow
the user to set a ‘seed’. If that feature is available, the
analyst should always set a seed. This will ensure
that the sample can be replicated.
• A simple random sample is not appropriate if the frame
cannot be listed or if it is important that the sample
contain particular (rare) subsets of the population.
Random Number Generators
• Software random number generators are based on
mathematical functions that need a ‘seed’ or
starting point
• Using the same seed ensures that two independent
draws made with the same software produce
the same series of random numbers and therefore a
reproducible sample
• Excel
– RAND() function does not allow a seed
– Random Number Generation in Data Analysis
ToolPak does allow a seed
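The role of the seed can be illustrated outside Excel. The sketch below uses Python's standard `random` module as a stand-in for whatever tool is actually used; the helper name `draw_series` and the seed values are hypothetical.

```python
import random


def draw_series(seed, count=5):
    """Return `count` pseudo-random numbers in [0, 1) from a generator started at `seed`."""
    rng = random.Random(seed)  # separate generator with an explicit seed
    return [rng.random() for _ in range(count)]


first = draw_series(seed=2016)
second = draw_series(seed=2016)   # same seed: the series repeats
other = draw_series(seed=2017)    # different seed: a different series

print(first == second)
print(first == other)
```

Recording the seed alongside the audit workpapers is what lets a second analyst regenerate the same sample later.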
Simple Random Sampling
Steps
• Method 1:
– The members of the sampling frame should be assigned a
random number between 0 and 1
– The frame may then be sorted by the random number
– The first ‘n’ will be the simple random sample of size ‘n’
• Method 2:
– Assign a sequence number from 1 to ‘N’ to each member
of the sampling frame (N is the population size)
– Use a random number generator (e.g., RAT-STATS) to select
‘n’ random numbers from 1 to ‘N’
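Both methods can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure of any particular tool: Python's `random` module stands in for the random number generator, and the function names, the frame of 500 claim IDs, and the seed are hypothetical.

```python
import random


def srs_by_sorting(frame, n, seed):
    """Method 1: give each member a random number in [0, 1), sort, keep the first n."""
    rng = random.Random(seed)
    keyed = [(rng.random(), unit) for unit in frame]
    keyed.sort(key=lambda pair: pair[0])
    return [unit for _, unit in keyed[:n]]


def srs_by_sequence_number(frame, n, seed):
    """Method 2: number the frame 1..N, then draw n distinct sequence numbers."""
    rng = random.Random(seed)
    population_size = len(frame)
    picks = rng.sample(range(1, population_size + 1), n)  # without replacement
    return [frame[i - 1] for i in sorted(picks)]


# Hypothetical sampling frame of 500 claims.
frame = [f"claim-{i:04d}" for i in range(1, 501)]
sample_a = srs_by_sorting(frame, n=30, seed=42)
sample_b = srs_by_sequence_number(frame, n=30, seed=42)
print(len(sample_a), len(sample_b))
```

The two methods generally select different units even with the same seed; each is reproducible on its own as long as the frame order and seed are kept.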
Systematic Random Sampling
• A systematic random sample is a random sample that
is selected using a particular technique. If the population
includes ‘N’ members and we wish to draw a sample of size
‘n’, then a systematic random sample could be selected by
choosing every (N/n)th member of the population as the
sample.
– The selection should start at random from a member between
the 1st and N/nth member.
• NOTE: If N/n is not a whole number, then round down to the
next lower whole number to determine the sampling interval.
• In order to ensure that a systematic random sample is truly
random, the population should not be sorted in an order that
might bias the sample.
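The selection rule above can be sketched as follows. The function name, the frame of 1,000 sequence numbers, and the seed are hypothetical, and Python's `random` module stands in for the random number generator.

```python
import random


def systematic_sample(frame, n, seed):
    """Pick every k-th member after a random start, where k = N/n rounded down."""
    population_size = len(frame)
    if n <= 0 or n > population_size:
        raise ValueError("n must be between 1 and the population size")
    interval = population_size // n           # sampling interval, rounded down
    rng = random.Random(seed)
    start = rng.randint(1, interval)          # random start between 1 and the interval
    positions = [start + k * interval for k in range(n)]  # 1-based positions
    return [frame[p - 1] for p in positions]


# Hypothetical frame: 1,000 visits identified by sequence number.
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=90, seed=7)
print(len(sample), sample[1] - sample[0])
```

Because the interval is rounded down, the last selected position never exceeds N, but members near the end of the frame may have no chance of selection when N/n is not a whole number.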
Stratified Random Sampling
• Population is divided into unique subsets or strata
• Strata should be mutually exclusive and exhaustive. In other
words, each of the members of the population should be in one and
only one stratum.
• A simple random sample is then selected from each of the strata
• The size of the sample in each stratum may be equal or may be
assigned proportionally according to the relative size of each stratum
• Stratified sampling is appropriate when the quantity to be estimated
may vary among natural subgroups (strata) of the population
• Typical strata in healthcare may be:
– CPT® Code (E/M levels)
– Physician
– Specialty
– Clinic
Stratified Random Sampling
Example
• Example: An analyst wishes to select a stratified random
sample of 90 from a population of 1,000 E/M visits. The
distribution of E/M visits in the population is:
– Level 1: 55
– Level 2: 183
– Level 3: 236
– Level 4: 309
– Level 5: 217
Stratified Random Sampling
Example
• Allocation table for the example above (sample of 90 from
1,000 E/M visits):
Level     Population Count (N)    % of Population    Sample Size (n)
1         55
2         183
3         236
4         309
5         217
Totals    1,000                   100%               90
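The blank cells can be filled in under proportional allocation. The completed table is not reproduced here, so the sketch below is one reasonable way to do it, not necessarily the textbook's: each level's share is its count divided by 1,000 (5.5%, 18.3%, 23.6%, 30.9%, 21.7%), and fractional sample sizes are resolved with a largest-remainder rule so the total stays at 90. The function name and the rounding rule are assumptions.

```python
def proportional_allocation(strata_counts, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Fractions are floored first; leftover units go to the strata with the
    largest fractional remainders so the allocations sum to total_sample.
    """
    population = sum(strata_counts.values())
    exact = {k: total_sample * c / population for k, c in strata_counts.items()}
    alloc = {k: int(v) for k, v in exact.items()}          # floor each share
    shortfall = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:shortfall]:
        alloc[k] += 1
    return alloc


# E/M level counts from the example.
counts = {1: 55, 2: 183, 3: 236, 4: 309, 5: 217}
print(proportional_allocation(counts, 90))
# -> {1: 5, 2: 16, 3: 21, 4: 28, 5: 20}
```

Under equal allocation, the other option the slides mention, each level would instead receive 90 / 5 = 18 units.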
Cluster Sampling
• The population is divided into subsets much
like the strata in stratified sampling
• Clusters should be mutually exclusive and
exhaustive
• Clusters are selected at random
• All members of each selected cluster are included
in the sample
• Cluster sampling is appropriate when it is
difficult to access all of the population
Cluster Sampling
Example
The director of the emergency department
would like to audit the accuracy of charge
capture for the first quarter of 2010.
Unfortunately, she is not able to obtain a full
listing of the patients that pass through the ED
for a sampling frame. Instead, a cluster sample
will be drawn using date of service as the
cluster. Select 10 dates via simple random
sampling to produce a cluster sample.
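The date selection in this example can be sketched as below. The function name and the seed are hypothetical, and Python's `random` module stands in for the random number generator; the quarter boundaries follow the example (1 January to 31 March 2010).

```python
import random
from datetime import date, timedelta


def select_date_clusters(start, end, k, seed):
    """Choose k dates of service by simple random sampling; each date is one cluster."""
    day_count = (end - start).days + 1
    all_dates = [start + timedelta(days=i) for i in range(day_count)]
    rng = random.Random(seed)
    return sorted(rng.sample(all_dates, k))   # without replacement


clusters = select_date_clusters(date(2010, 1, 1), date(2010, 3, 31), k=10, seed=2010)
for d in clusters:
    print(d.isoformat())
```

Every ED encounter on each selected date would then be pulled for the audit, which is why no patient-level sampling frame is needed.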
Non-probability Sampling
• Random sample not required if:
– Study is exploratory or a focused review
– Example: If we wish to determine educational
opportunities for improving documentation, we may
sample accounts with few secondary diagnoses to
determine if there is a pattern in the types of
diagnosis codes most likely to be missed
• Typically, this sample is driven by some
exploratory data analysis or data mining to help
‘steer’ the sample to subjects most likely to have
the issue of interest
Non-probability Sampling
• Convenience sampling
– Example – sample first ‘n’ customers that enter the
hospital cafeteria
• Judgment sampling
– Use exploratory data analysis based on experience
or history
– AKA focused review
– Example – Know from history that the customer
satisfaction in cafeteria is lowest at lunch time
because of long lines. Select sample at that time to
try to improve process.
• Quota sampling
– Subjects divided into groups
– Judgment sample used within each group
– Example – may select the first 10 male and the first 10 female
customers to enter the cafeteria
RAT-STATS
• Statistical program provided by the Office of the Inspector
General (OIG)
• Free and downloadable from the OIG website – PC only (no
Mac version)
• Functionality
– Determine sample size
– Create random numbers for sample selection
– Analyze sample data from simple, cluster and stratified sampling
• Two types of studies:
– Attribute – variable of interest is a rate or proportion
– Variable – variable of interest is an interval or ratio quantity
RAT-STATS Demonstration
• Instructor:
– Reproduce the demo on pages 125 to 131
with a local installation of RAT-STATS
– Students should practice in the lab
Sample Size
• Sample size is dependent on:
– Standard Deviation of the quantity to be estimated
– Desired precision (width of confidence interval)
– Sampling method
– Size of the population (if it is relatively small)
– Resources available to perform the study
• Any analyst who quotes a sample size without asking for the
above information is not making an informed choice regarding
sample size
• The standard deviation of the quantity to be estimated
typically is derived from a pilot study or previous review
– The current OIG recommendation for a pilot study is a sample of 30
Sample Size
Attribute Study
• Determined by:
– Anticipated rate of occurrence (50% results in the largest sample)
– Confidence level
– Desired precision range
Sample Size
Attribute Study
• A larger sample size is required for:
– A higher level of confidence
– An anticipated rate of occurrence closer
to 50%
– A smaller (narrower) precision range
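These three relationships can be checked with the usual normal-approximation formula for a proportion, n = z² p(1 − p) / e², with an optional finite population correction. This is a sketch under that assumption; RAT-STATS may compute attribute sample sizes differently, and the function name and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def attribute_sample_size(p, half_width, confidence=0.95, population=None):
    """Approximate sample size for estimating a rate or proportion.

    p          : anticipated rate of occurrence (0 to 1)
    half_width : desired precision as a half-width, e.g. 0.05 for +/- 5 points
    population : optional population size for the finite population correction
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    n = z ** 2 * p * (1 - p) / half_width ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)               # matters when N is small
    return math.ceil(n)


print(attribute_sample_size(0.5, 0.05))                   # rate of 50%
print(attribute_sample_size(0.1, 0.05))                   # rate far from 50%: smaller
print(attribute_sample_size(0.5, 0.05, confidence=0.99))  # higher confidence: larger
print(attribute_sample_size(0.5, 0.02))                   # narrower precision: larger
print(attribute_sample_size(0.5, 0.05, population=1000))  # small population: smaller
```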
Sample Size
Variable Study
• Determined by:
– Probe sample mean and standard deviation
– Confidence level
– Desired precision range
Sample Size
Variable Study
• A larger sample size is required for:
– A higher level of confidence
– A larger probe standard deviation
– A smaller (narrower) precision range
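The same relationships can be checked with the usual normal-approximation formula for a mean, n = (z · s / E)², where s is the probe standard deviation and E is the desired half-width in the units of the variable. This is a sketch under that assumption and not necessarily the RAT-STATS calculation; the function name and the dollar figures are hypothetical. If precision is stated as a percentage of the mean, E would be that percentage times the probe sample mean.

```python
import math
from statistics import NormalDist


def variable_sample_size(std_dev, half_width, confidence=0.95):
    """Approximate sample size for estimating a mean (e.g., average overpayment).

    std_dev    : standard deviation from the probe (pilot) sample
    half_width : desired precision as a half-width, in the variable's own units
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    return math.ceil((z * std_dev / half_width) ** 2)


# Hypothetical probe result: standard deviation of $120, target precision of +/- $25.
print(variable_sample_size(120, 25))
print(variable_sample_size(240, 25))                    # larger std dev: larger n
print(variable_sample_size(120, 25, confidence=0.99))   # higher confidence: larger n
print(variable_sample_size(120, 10))                    # narrower precision: larger n
```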
Sample Size and Precision
• In both types of studies, attribute or variable, a
higher level of precision requires a larger sample
size
• A higher level of precision is equivalent to requiring a
narrower confidence interval for a set confidence
level
• Note that increasing ‘n’ in both the proportion and
mean confidence interval formulas results in
narrower intervals (all other variables held constant)
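The slides do not reproduce the confidence interval formulas. Assuming the standard large-sample forms for a proportion and a mean (with sample proportion \hat{p}, sample mean \bar{x}, sample standard deviation s, and critical value z_{\alpha/2}), they are:

```latex
\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\qquad\text{and}\qquad
\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}
```

In both, n appears only under a square root in the denominator, so the half-width shrinks in proportion to 1/\sqrt{n}: quadrupling the sample size halves the width of the interval. For small samples, the interval for a mean typically uses a t critical value in place of z.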

Hm306 week 6
