Relationship of Competency and Global Ratings in OSCEs
1. Relationship of Individual
Competency and Overall Global
Ratings in Practice Readiness OSCEs
Saad Chahine
Mount Saint Vincent University
saad.chahine@msvu.ca
Bruce Holmes
Division of Medical Education, Dalhousie University
bruce.holmes@dal.ca
2. Purpose of CAPP OSCE
• Clinical Assessment for Practice Program
• A program of the College of Physicians and
Surgeons of Nova Scotia (CPSNS)
1. To assess the clinical competence of IMG
candidates for readiness to practice.
2. To provide feedback on candidates'
performance for their continuing
professional development.
3. The CAPP Program
Part A: Initial assessment
• Assessment of competence via OSCE & therapeutics
exam (Practice Ready)
Part B: 1 year mentorship with a family physician
• Defined license for 1 year with performance assessment
Part C: Additional 3 years of defined license until
certified by The College of Family Physicians
(CCFP)
4. Big Overarching Research:
To understand rater cognition in the
assessment of candidates in the OSCE:
What goes on in the minds of examiners
when they assess candidates in the OSCE?
As part of this research agenda, this study answers the
question:
Which competencies are most predictive of a
satisfactory overall global rating at each station?
5. Assessed Competencies
• History Taking
• Physical Exam (In half of OSCE stations)
• Communication Skills
• Quality of Spoken English
• Counselling (In half of OSCE stations)
• Professional Behaviour
• Problem Definition & Diagnosis
• Investigation & Management
• Overall Global
8. Data Set
2010
- 14 Station OSCE
- 31 Candidates
- 434 Observations (stations x candidates)
2011
- 12 Station OSCE
- 36 Candidates
- 432 Observations (stations x candidates)
2012
- 12 Station OSCE
- 36 Candidates
- 432 Observations (stations x candidates)
OSCE stations:
- 14 minutes at each station
- examiner questions at 10 minutes
- 3 minutes between candidates
9. Design
• Goal: What constitutes a pass/fail in an examiner’s
mind?
• The Overall Global rating was recoded as pass/fail
– Fail (0) = Inferior, Poor or Borderline
– Pass (1) = Satisfactory, Very Good or Excellent
• Competencies were rated on a 1-6 scale
– 1 = Inferior
– 6 = Excellent
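The recoding above can be sketched as follows (function and label names are illustrative, not from the CAPP materials):

```python
# Illustrative sketch of the recoding described above.
SCALE = ["Inferior", "Poor", "Borderline", "Satisfactory", "Very Good", "Excellent"]

def competency_score(label):
    """Map a rating label to the 1-6 competency scale (1 = Inferior, 6 = Excellent)."""
    return SCALE.index(label) + 1

def global_pass_fail(label):
    """Recode the Overall Global rating: Inferior/Poor/Borderline -> 0 (fail),
    Satisfactory/Very Good/Excellent -> 1 (pass)."""
    return 1 if competency_score(label) >= 4 else 0
```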
10. Descriptive Analysis
Overall Global (pass/fail by year):

Year   Observations       Overall Global Pass   Overall Global Fail
2010   434 (2 missing)    86 (20%)              346 (80%)
2011   432 (4 missing)    93 (22%)              335 (78%)
2012   432 (10 missing)   114 (26%)             308 (74%)

Investigation & Management (pass/fail by year):

Year   Observations       Investigation & Management Pass   Investigation & Management Fail
2010   434 (3 missing)    317 (74%)                         114 (26%)
2011   432 (0 missing)    272 (63%)                         160 (37%)
2012   432 (1 missing)    271 (63%)                         160 (37%)

*Note: INVMAN was recoded 0/1
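As a quick arithmetic check of the 2010 rows above (assuming the percentages use the rated, non-missing observations as the denominator):

```python
# Sanity check on the 2010 descriptive counts vs. their reported percentages.
def pct(n, total):
    """Percentage, rounded to the nearest whole number."""
    return round(100 * n / total)

# Overall Global, 2010: 434 observations, 2 missing -> 432 rated
assert pct(86, 86 + 346) == 20    # pass
assert pct(346, 86 + 346) == 80   # fail

# Investigation & Management, 2010: 434 observations, 3 missing -> 431 rated
assert pct(317, 317 + 114) == 74  # pass
assert pct(114, 317 + 114) == 26  # fail
```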
11. Multivariate Analysis
• Hierarchical Generalized Linear Model (HGLM): a
logistic regression with a nested (multilevel) structure
• Nested structure of the data: candidates are nested
within year, and stations are nested within candidates
• High consistency across examiners (previous study)
• The analysis was conducted in steps to find the best
model
• The goal is to determine which competencies are most
predictive of a pass/fail at OSCE stations
15. Results: Best Model
Fixed Effect   Estimate   SE     T-ratio   Df     P      Odds Ratio
Overall        -2.16      0.18   -12.22    2      0.00   0.12
HIST            0.53      0.12    4.29     1272   0.00   1.71
BEHAV           0.52      0.15    3.43     1272   0.00   1.68
COMM            0.73      0.16    4.78     1272   0.00   2.07
PDD             0.63      0.11    6.02     1272   0.00   1.88
INVMAN          1.37      0.14   10.06     1272   0.00   3.93

*Note: Variation significant at the Candidate level;
variation NOT significant at the Year level
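The Odds Ratio column is the exponential of the estimate; a quick check (small second-decimal mismatches reflect rounding of the reported estimates):

```python
import math

# Reported (estimate, odds ratio) pairs from the best-model table.
reported = {
    "Overall": (-2.16, 0.12),
    "HIST":    (0.53, 1.71),
    "BEHAV":   (0.52, 1.68),
    "COMM":    (0.73, 2.07),
    "PDD":     (0.63, 1.88),
    "INVMAN":  (1.37, 3.93),
}

# exp(estimate) should match the reported odds ratio up to rounding.
for name, (estimate, odds_ratio) in reported.items():
    assert abs(math.exp(estimate) - odds_ratio) < 0.02, name
```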
16. Example of
Borderline/Satisfactory at the
Station
• If you are rated Borderline on all competencies…
– 6% probability of receiving an overall pass at the station
• If you are rated Satisfactory on all competencies…
– 81% probability of receiving an overall pass at the station
• If you are rated Satisfactory on all and Borderline on
Investigation & Management…
– 52% probability of receiving an overall pass at the station
• If you are rated Borderline on all and Satisfactory on
Investigation & Management…
– 21% probability of receiving an overall pass at the station
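These probabilities come from the inverse-logit of the summed fixed effects. A minimal sketch of the transform, assuming simple 0/1 predictor coding (borderline = 0, satisfactory = 1); because the model's exact coding and centering are not shown, this approximates but does not exactly reproduce the slide's values:

```python
import math

def inv_logit(x):
    """Inverse-logit: convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Fixed effects from the best-model table; 0/1 coding is an assumption.
intercept = -2.16
coefs = {"HIST": 0.53, "BEHAV": 0.52, "COMM": 0.73, "PDD": 0.63, "INVMAN": 1.37}

def p_pass(ratings):
    """ratings: dict mapping competency -> 0 (borderline) or 1 (satisfactory)."""
    return inv_logit(intercept + sum(coefs[c] * v for c, v in ratings.items()))

all_borderline = {c: 0 for c in coefs}
all_satisfactory = {c: 1 for c in coefs}
```

Under this coding, `p_pass(all_borderline)` is roughly 10% and `p_pass(all_satisfactory)` roughly 83%, in the same ballpark as the slide's 6% and 81%.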
17. Rater Cognition
• Examiners do not weigh each competency equally…
Investigation & Management is a key component in
determining the overall pass/fail at a station.
• Little variation in ratings for Quality of Spoken English…
all candidates do well on this competency…we keep it in
the exam as a check
• Physical Exam and Counselling are not significant
predictors. We suspect this is due to insufficient data (only
half of the stations assess these competencies)
• Track (1 vs 2) is not a predictor
• There is no significant variation from year to year.
18. Take Home Message
• Examiners intuitively deem some competencies
as more important
– Therefore, should they be weighted?
• For practice ready OSCE…
– Consider more emphasis on complex competencies
in case development and blueprint
• Results support a qualitative study
– Follow up study to understand how examiners
conceptualise competencies through cognitive
interviews
Speaker Notes
• The pass/fail question applies at each station, not to overall OSCE performance.
• The last three competencies are based on examiner questions at the 10-minute mark, followed by 4 minutes to answer all questions (3-4 questions).
• The scoring booklet is formatted so that the competency ratings follow the responses to the questions.
• The answers are not data-entered; rather, they are for the PE (physician examiner) to incorporate into his/her rating.
• The Elements were developed from themes based on "free text" narrative comments from physician examiners in previous CAPP OSCEs.
• PEs are asked to consider the "pattern" of how they used the Elements to describe candidate performance.
• In the CAPP OSCE, all PEs are in community family practice from across the province, not full-time academic family medicine.
• In this example, a PE would likely rate the competency as either Borderline or Satisfactory; Poor or Very Good would not be consistent with the Elements.
• PEs are asked to make a Global Rating, taking into account all of the competencies. They are also asked to consider "having this candidate do a locum in their practice."
• Minimal missing data; about 97-98% completion.
• Competency rating scale: Inferior, Poor, Borderline, Satisfactory, Very Good, Excellent.
• 10 missing is a lot. I thought IM and Overall Global were almost the same last year (2012); see the report to Credentials.
• Top half: number of Overall Global ratings as either PASS or FAIL.
• A single competency doesn't predict Overall Global pass/fail; it's the combination of competencies.
• Important slide to elaborate.
• First and last bullets are not what I expected. Does it mean IM is highly regarded by PEs for PRACTICE READY?
• Does this mean the overall global in the mind of the raters is more than the sum of competencies?
• I expected IM to be