Validity and bias are essential aspects of any research—a brief description of internal and external validity and different types of bias related to the epidemiological study.
1. Validity & Biases
in epidemiological
studies
Dr. Abhijit Das
PG resident
Govt. Bundelkhand Medical
College, Sagar, M.P
.
2. IF AN ASSOCIATION IS OBSERVED, THE
FIRST QUESTION ASKED MUST ALWAYS BE
…
“Is it REAL?”
“Is it a VALID observation?”
3. • Validity of epidemiologic study concerns whether or
not there are imperfections in the study design, the
methods of data collection, or the methods of data
analysis that might distort the conclusions made
about an exposure-disease relationship.
• No such imperfections Study is VALID.
6. • If not valid (Internal/External)
• There are imperfections
• The extent of the distortion of the results from the
correct conclusions is called Bias
7. BIAS
• “Any systematic error in the design, conduct or analysis of
a study that results in a mistaken estimate of an
exposure’s effect on the risk of disease.”
• Lack of internal validity or incorrect assessment of the
association between an exposure and an effect in the
target population.
Schlesselman JJ. Case-Control Studies: Design, Conduct, and Analysis. New York: Oxford
University Press; 1982.
8. • Classified by the research stage in which they occur or
by the direction of change in a estimate.
• Definition and selection of the study population, data
collection, and the association between different
determinants of an effect in the population.
9. 4
1
1 2 3
Sacket
and
Choi
According to the
stages of research
Maclure
and
Schneew
eiss
Steinec
k and
Ahlbo
m
Applying
the causal
diagram
theory
• Based on the
concept of
study base
4Kleinba
um et al
• Three main
groups
• Selection bias,
Information bias,
and
Confounding
CLASSIFICATION OF BIAS
11. SELECTION BIAS
• The error introduced when the study population does
not represent the target population.
• At any stage of a research
• Design (bad definition of the eligible population, lack
of accuracy of sampling frame, uneven diagnostic
procedures in the target population) and
implementation.
12. EXAMPLES OF SELECTION BIAS
• Select volunteers as exposed group and non-volunteers
as non-exposed group in a study of screening
effectiveness
• Study health of workers in a workplace exposed to some
occupational exposures comparing to health of general
population
₋ Working individuals are likely to be healthier than general
population that includes unemployed people (Healthy
13. CONTROLLING SELECTION BIAS
• Define criteria of selection of diseased and non-
diseased participants independent of exposures in a
case-control study
• Define criteria of selection of exposed and non-
exposed participants independent of disease
outcomes in a cohort study
14. Inappropria
te
definition of
eligible
population
• Ascertainment bias- Competing
risks, Healthcare access bias,
Length-bias sampling, Neyman
bias, Spectrum bias, Survivor
treatment selection bias
• Selection of control- Berkson’s
bias, Exclusion bias, Friend
control bias, Inclusion bias,
Matching bias, Relative control
Lack of
accuracy of
sampling
frame
Non-random
sampling bias
Telephone
random
sampling bias
Citation bias
Dissemination
bias
Publication bias
Uneven
diagnostic
procedures
in the
target
population
During
study
implementati
on
Losses/withdraw
als to follow up
Missing
information in
multivariable
analysis
Non-response
bias
SELECTION
BIAS
15. COMPETING RISKS
• When two or more outputs are mutually exclusive, any
of them competes with each other in the same
subject.
• For example, early death by AIDS can produce a
decrease in liver failure mortality in parenteral drug
users. A proper analysis of this question should take
into account the competing causes of death; for
instance.
16. HEALTHCARE ACCESS BIAS
• Observational study
• When the patients admitted to an institution do not
represent the cases originated in the community.
• Popularity bias
• Centripetal bias
• Referral filter bias
• Diagnostic/treatment access bias
17. NEYMAN BIAS (INCIDENCE-PREVALENCE BIAS/SELECTIVE SURVIVAL
BIAS)
• Where the very sick or very well (or both) are erroneously
excluded from a study. The bias (“error”) in results can be
skewed in two directions
• Excluding patients who have died will make conditions look
less severe.
• Excluding patients who have recovered will make conditions
look more severe.
18. • Both cross sectional and (prevalent) case-control studies.
• Any risk factors we may identify in a study using
prevalent cases may be related more to survival with the
disease than to the development of the disease
(incidence).
• Careful selection of study type can help to lessen the
effects from this bias
19. SPECTRUM BIAS
• In the assessment of validity of a diagnostic test
• When researchers included only ‘‘clear’’ or ‘‘definite’’ cases,
not representing the whole spectrum of disease
presentation, and/or ‘‘clear’’ or healthy controls subjects,
not representing the conditions in which a differential
diagnosis should be carried out.
• Sensitivity and specificity of a diagnostic test are increased.
20. BERKSON’S BIAS
• First described by Berkson in 1946 for case- control
studies.
• Hospital control
• It is produced when the probability of hospitalization
of cases and controls differ, and it is also influenced
by the exposure.
• Hospital patients differ from people in the community
21. FRIEND CONTROL BIAS
• It was assumed that the correlation in exposure status
between cases and their friend controls lead to biased
estimates of the association between exposure and
outcome.
• Problem may be that the controls are too similar to
the cases in regard to many variables, including the
variables that are being investigated in the study.
22. MATCHING
• It is well known that matching, either individual or
frequency matching, introduces a selection bias, which
is controlled for by appropriate statistical analysis
• Overmatching is produced when researchers match by
a non-confounding variable and can underestimate an
association
23. Inappropria
te
definition of
eligible
population
• Ascertainment bias- Competing
risks, Healthcare access bias,
Length-bias sampling, Neyman
bias, Spectrum bias, Survivor
treatment selection bias
• Selection of control- Berkson’s
bias, Exclusion bias, Friend
control bias, Inclusion bias,
Matching bias, Relative control
Lack of
accuracy of
sampling
frame
Non-random
sampling bias
Telephone
random
sampling bias
Citation bias
Dissemination
bias
Publication bias
Uneven
diagnostic
procedures
in the
target
population
During
study
implementati
on
Losses/withdraw
als to follow up
Missing
information in
multivariable
analysis
Non-response
bias
SELECTION
BIAS
24. LACK OF ACCURACY OF SAMPLING FRAME
• Non-random sampling bias: most common bias in this
group
• Telephone random sampling bias: It excludes some
households from the sample, thus producing a
coverage bias. In the US it has been shown that the
differences between participants and non-participants
are generally not large, but the situation can be very
different in less developed countries
• Citation bias
• Dissemination bias
25. UNEVEN DIAGNOSTIC PROCEDURES IN THE TARGET
POPULATION
• In case-control studies
• Detection bias
• Diagnostic suspicion bias: types of detection bias, exposure
is taken as another diagnostic criterion
• Unmasking-detection signal-bias: exposure that, rather than
causing a disease, causes a sign or symptom that
precipitates a search for the disease
• Mimicry bias: exposure may become suspicious if, rather
than causing disease, it causes a benign disorder which
resembles the disease
26. DURING STUDY IMPLEMENTATION
• Losses/withdrawals to follow up: in both cohort and
experimental studies.
• Missing information : multivariable analysis selects
records with complete information on the variables
included in the model. If participants with complete
information do not represent target population, it can
introduce a selection bias.
27. • Non-response bias: A bias that occurs due to
systematic differences between responders and non-
responders
• This bias is common in descriptive, analytic and
experimental research and survey studies
• Results in mistakes in estimating population
characteristics based on the underrepresentation of
phenomena due to non-response
28. INFORMATION BIAS
• Information bias occurs when information is
collected differently between two groups, leading to
an error in the conclusion of the association
• When information is incorrect- there is misclassification
Differential misclassification
Non-differential misclassification
29. EXAMPLES OF INFORMATION BIAS
• Interviewer knows the status of the subjects before the
interview process
• Interviewer may probe differently about exposures in
the past if he or she knows the subjects as cases
• Subjects may recall past exposure better or in more
detail if he or she has the disease (recall bias)
• Surrogates, such as relatives, provide exposure
information for dead cases, but living controls provide
30. CONTROLLING INFORMATION BIAS
• Have a standardized protocol for data collection
• Make sure sources and methods of data collection are
similar for all study groups
• Make sure interviewers and study personnel are
unaware of exposure/disease status
• Adapt a strategy to assess potential information bias
• Blinding
31. Differential misclassification bias
Detection bias
Observer/interviewer bias
Recall bias
Reporting bias
Non-differential misclassification bias
Misclassificatio
n bias
INFORMATION
BIAS
Ecological
fallacy
Regression to
mean
Others
Hawthorne effect
Temporal ambiguity
Will Rogers phenomenon
Work up bias (verification bias)
Lead time bias
Protopathic bias
32. MISCLASSIFICATION BIAS
• When individuals are assigned to a different category
than the one they should be in.
• Non-differential misclassification occurs when the
probability of individuals being misclassified is equal
across all groups in the study.
• Differential misclassification occurs when the
probability of being misclassified differs between
groups in a study
33. OBSERVER/INTERVIEWER BIAS
• Knowledge of the hypothesis, the disease status, or
the exposure status (including the intervention
received) can influence data recording (observer
expectation bias)
• Type of detection bias; affect assessment in
observational and interventional studies
• Blinding/adequate training for observers in how to
record findings
34. RECALL BIAS
• Rumination bias- the presence of disease influences the
perception of its causes
• Exposure suspicion bias- the search for exposure to the
putative cause
• Participant expectation bias- in a trial if the patient
knows what they receive may influence their answers
• More common in case-control studies, although it can
occur in cohort studies and trials.
35. REPORTING BIAS
• Systematic distortion that arises from the selective
disclosure or withholding of information by parties
• Obsequiousness bias
• Family aggregation bias
• Underreporting bias
36. ECOLOGICAL FALLACY
• When analyses realised in an ecological (group level)
analysis are used to make inferences at the individual level.
• For instance, if exposure and disease are measured at the
group level (for example, exposure prevalence and disease
risk in each country), exposure-disease relations can be
biased from those obtained at the individual level (for
example, exposure status and disease status in each
subject).
37. HAWTHORNE EFFECT
• Described in the 1920s in the Hawthorne plant of the
Western Electric Company (Chicago, IL).
• It is an increase in productivity—or other outcome
under study—in participants who are aware of being
observed.
• A study of hand-washing among medical staff found
that when the staff knew they were being watched,
compliance with hand-washing was 55% greater than
when they were not being watched (Eckmanns 2006)
38. LEAD TIME BIAS
• A distortion overestimating the apparent time
surviving with a disease caused by bringing forward
the time of its diagnosis
• The added time of illness produced by the diagnosis
of a condition during its latency period.
• This bias is relevant in the evaluation of the efficacy of
screening, in which the cases detected in the screened
group has a longer duration of disease than those
39. Lead time bias where health outcome is the same in someone whose disease is detected by
screening compared with someone whose disease is detected from symptoms, but survival time
from the time of diagnosis is longer in the screened patient
Lead time bias where the screened patient lives longer than the unscreened patient, but overall
survival time is still exaggerated by the lead time from earlier diagnosis.
40. VERIFICATION BIAS/ WORK-UP BIAS
• Occurs during investigations of diagnostic test accuracy
when there is a difference in testing strategy between
groups of individuals, leading to differing ways of
verifying the disease of interest.
• In the assessment of validity of a diagnostic test, it is
produced when the execution of the gold standard is
influenced by the results of the assessed test, typically the
reference test is less frequently performed when the test
41. CONFOUNDING
• From confounder, i.e. to mix together.
• A distortion that modifies an association
between an exposure and an outcome
because a factor is independently
associated with the exposure and the
outcome.
• Not possible to separate the contribution
that any single causal factor has made.
• Solution: Study design:
Matching/randomization, Stratification
smoking
Ca esophagus
alcohol
42. BIAS AND CONFOUNDING
• Systematic error
in a study
• Cannot be fixed
• May lead to
errors in the
conclusion
• When
confounding
variables are
known, the effect
may be fixed
43. SPECIFIC BIASES IN A TRIAL
• Allocation of intervention bias
• Compliance bias
• Contamination bias
• Lack of intention to treat analysis
44. ALLOCATION OF INTERVENTION BIAS
• Systematic difference in how participants are assigned
to comparison groups in a clinical trial
• More common in non-randomised trials
• Sequentially numbered, opaque, sealed envelopes
(SNOSE); sequentially numbered containers; pharmacy
controlled allocation; and central allocation
45. COMPLIANCE BIAS
• David Sackett, 1979 paper on Biases in Analytic Research
• Participants who are compliant with an intervention may
differ from those who are non-compliant, and in ways that
might affect the risk of the outcome being measured.
• Clinical trials should attempt to collect data on compliance
and while analyses should include the intention to treat
analysis
49. TRIAL
• Allocation of intervention
bias
• Compliance bias
• Hawthorne effect
• Lack of intention to treat
analysis
SRMA
• Citation bias
• Dissemination bias
• Language bias
• Publication bias
50. REFERENCES
• Bias, Miguel Delgado-Rodrı´guez, Javier Llorca, J Epidemiol
Community Health 2004;58:635–641. doi:
10.1136/jech.2003.008466
• Catalogue of Bias, accessible at: https://catalogofbias.org/
• Gordis Epidemiology, Elsevier, 6th edition.
Notas del editor
Primary objective of epidemiologic research is to obtain a valid estimate of an effect measure of interest
INTERNAL- Study done properly without any methodological flaws and finding is therefore valid for the Study Population.
EXTERNAL- Finding is generalizable from study population to defined population, presumably TARGET population.
Internal validity is more important than External validity, Study could be internally valid but not externally. But if a study is externally valid that means it is internally valid too.
So the concept of bias is lack
1st-Biases can be classified
2nd-The most important biases are those produced in
One-Reading up on the field, specification and selection the study sample, execution of experimental maneuver, measurement of exposures/outcomes, data analysis, results interpretation and publication.
Three- Considered in this order, confounding, misclassification, misrepresentation and analysis deviation
2. It can be introduced
Volunteers could be more health conscious than non- volunteers, thus resulting in less disease
Volunteers could also be at higher risk, such as having a family history of illness, thus resulting in more disease
Ascertainment bias is produced when the kind of patients gathered does not represent the cases originated in the population.
2nd- It is more frequent when dealing with causes of death: as any person only dies once, the risk for a specific cause of death can be affected by an earlier one.
This may be due: (popularity bias) to the own institution if admission is determined by the interest of health personnel on certain kind of cases.
{centripetal bias), to the patients if they are attracted by the prestige of certain clinicians.
((referral filter bias) to the healthcare organization if it is organized in increasing levels of complexity (primary, secondary, and tertiary care) and ‘‘difficult’’ cases are referred to tertiary care.
(diagnostic/treatment access bias)to a web of causes if patients by cultural, geographical, or economic reasons show a differential degree of access to an institution
After 3rd -We may not know which groups (improved/died) we are excluding, making it impossible to adjust for any bias in results.
Before 4th- This type of bias is also called incidence-prevalence bias .
Combining prevalent and incident cases can actually make prevalent-incidence bias worse, obscuring the true relationship between your study variables (i.e. the variables in your experiment or study)
Incident cases are newer cases — like first time admissions. Prevalent cases are pre-existing cases, which are usually sicker with more progressed disease than incident cases.
Before 2nd- preferable to use incident in case-control studies of disease etiology.
If, for example, most people who develop the disease die soon after diagnosis, they will be underrepresented in a study that uses prevalent cases, and such a study is more likely to include longer-term survivors. This would constitute a highly nonrepresentative group of cases, and any risk factors identified with this nonrepresentative group may not be a general characteristic of all patients with the disease, but only of survivors.
after 3rd- Neyman bias is less of a problem with acute, short-lived cases than with long-term diseases like HIV or tuberculosis
Occurs when a diagnostic test is studied in a different range of individuals to the intended population for the test
His original paper showed that two diseases, which have no real relationship, can be what he called ‘spuriously associated‘ in hospital-based case control studies. However, the idea wasn’t widely accepted until 1979.
They represent a sample of an ill-defined reference population that usually cannot be characterized and thus to which results cannot be generalized.
Moreover, hospital patients differ from people in the community.
For example, the prevalence of cigarette smoking is known to be higher in hospitalized patients than in community residents; many of the diagnoses for which people are admitted to the hospital are smoking related.
matched analysis in studies with individual matching and adjusting for the variables used to match in frequency
Ascertainment bias is produced when the kind of patients gathered does not represent the cases originated in the population.
Obviously, this selection procedure can yield a nonrepresentative sample in which a parameter estimate differs from the existing at the target population
In Systematic reviews and meta-analysis
Citation bias: articles more frequently cited are more easily found and included in systematic reviews and meta analysis.
Dissemination bias: the biases associated to the whole publication process, from biases in the retrieval of information (including language bias) to the way the results are reported
Publication bias: regarding an association that is produced when the published reports do not represent the studies carried out on that association. Several factors have been found to influence publication, the most important being statistical significance, size of the study, funding, prestige, type of design, and study quality
Detection bias occurs when exposure influences the diagnosis of the disease.
Unmasking (detection signal) bias- example: If a medication can cause vaginal bleeding, and people with this symptom go sooner to the doctor and receive earlier or more intensive examination and investigations to diagnose cancer, it may appear that the medication caused the cancer. However, all that may have happened is that the medication has prompted an earlier or more intensive search for the disease, leading to an apparently increased rate among those using the medication.
Exposure can trigger the search for the disease; for instance, benign anal lesions increases the diagnosis of anal cancer.
During the study design, try to identify factors that might cause signs or symptoms that are also associated with the disease of interest. Then, during analysis of the data keep in mind any such factors, and carry out a sensitivity analysis if indicated.
Ensure correct diagnoses of the condition by using gold standard diagnostic reference test to confirm the diagnosis. Readers should be wary of studies that rely on symptoms and self-reported measures to infer critical outcomes.
when losses/withdrawals are uneven in both the exposure and outcome categories, the validity of the statistical results may be affected
After 2nd- it is relevant mainly in retrospective study, record based study.
Example= healthy volunteer effect- participants are generally healthy than general population.
Differential misclassification occurs when the level of misclassification differs between the two groups
Non-differential misclassification occurs when the level of misclassification does not differ between the two groups
1. Correct classification of individuals, and of exposures and participant characteristics, is an essential element of any study. Misclassification occurs when
2. Results in Incorrect associations being observed between the assigned categories and the outcomes of interest
The means by which interviewers can introduce error into a questionnaire include administering the interview or helping the respondents in different ways (even with gestures), putting emphases in different questions, and so on.
Last line- how to record findings, identifying any potential conflicts before recordings commence and clearly defining the methods, tools and time frames for collecting data.
Definition: Systematic error due to differences in accuracy or completeness of recall to memory of past events or experiences.
RB-in case-control in which participants know their diseases
EsB-in cohort studies (for example, workers who known their exposure to hazardous substances may show a trend to report more the effects related to them)
PEB-trials without participants’ blinding
1.participants can ‘‘collaborate’’ with researchers and give answers in the direction they perceive are of interest
2. the existence of a case triggers family information
3. is common with socially undesirable behaviours, such as alcohol consumption
Ecological fallacy can be produced by within group (individual level) biases, such as confounding, selection bias, or misclassification, and by confounding by group or effect modification by group
Seen in Trial
premise of screening is that it allows earlier detection and treatment of a disease or health condition, leading to a greater chance of cure or at least longer survival. Individuals with disease detected through population screening receive a diagnosis earlier before signs and symptoms appear.
As a consequence, estimates of differences in survival time between people diagnosed from screening and those whose disease is detected after symptoms develop can be biased, as survival time will appear to be longer in screen-detected people if early detection has no effect on the course of disease (figure 1) or if survival time is extended (figure 2).
Prevention- In randomised control trials evaluating screening, lead time bias can be countered by taking the time origin as the point of randomisation, not the point of diagnosis, or by comparing the number of deaths occurring in a given period of time instead or as well as the number of people surviving.
After 1- when only a proportion of the study group receives confirmation of the diagnosis by the reference standard, or if some patients receive a different reference standard at the time of diagnosis.
At last- Many reference tests are invasive, expensive, or carry a procedural risk (e.g. angiography, biopsy, surgery), and therefore, patients and clinicians may be less likely to pursue further tests if a preliminary test is negative.
Example- A study assessed the accuracy of D-dimer testing for diagnosing pulmonary embolism (PE). Patients that had a positive D-dimer result were further assessed with ventilation–perfusion scans (reference standard 1), whereas patients that had negative D-dimer results were assessed with routine clinical follow up (reference standard 2).
Prevention- Ideally, in a diagnostic accuracy study, all patients should receive the same reference test.
Prevention- Some standard methods of ensuring concealment of group allocation prior to allocation include
This means that compliance is a confounding factor in assessing the effect of the intervention
Intention to treat-