2. Objective
• Restate and summarize the concepts covered in the lessons on “Assessment and Evaluation”
2taufiq660
3. Assessment
• Assessment is the measurement of the intended learning
outcome.
• An assessment is not only a measure of performance; it also provides an indication of the effectiveness of teaching and the appropriateness of course input.
4. Evaluation
• It is a systematic process of collecting, analyzing and interpreting information to determine the extent to which the students are achieving the educational objectives
6. Types of Assessment
• Formative: During the course
• Summative: At the end of the course
7. Functions of Formative Assessment: What are they learning?
• Process focused
• Diagnostic value
• Provides feedback to students and teachers
• Provides opportunities for midcourse correction
• Encourages interaction between teacher & students
• Allows repeated attempts to master the area of concern
8. Functions of Summative Assessment: What have they learned?
• Outcome focused
• Rank-orders students
• Awards marks, certification or grades
• End point examination
10. Validity
Validity is the extent to which an instrument measures the property that it is intended to measure.
• Content validity
• Criterion validity
• Construct validity
11. To Increase Content Validity
A TEST MATRIX is used.
Columns (cognitive level): Knowledge, Comprehension, Application
Rows (content):
• Concept of health and disease
• Epidemiology
• Screening
• Communicable disease
• Health care administration
• Occupational health
• Population and demography
Weight by cognitive level: Knowledge 50%, Comprehension 30%, Application 20%
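As a rough illustration of how the weights in a test matrix translate into a question paper, the sketch below splits a hypothetical 50-item paper across the three cognitive levels using the 50% / 30% / 20% weights given in the matrix. The function name, the paper length and the rule for handing out items lost to rounding are assumptions made for this example only.

```python
def allocate_items(total_items: int, weights: dict) -> dict:
    """Split total_items across levels in proportion to percentage weights."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    # Integer share for each level, rounded down.
    allocation = {level: total_items * pct // 100 for level, pct in weights.items()}
    # Assumption: items lost to rounding go to the most heavily weighted levels.
    leftover = total_items - sum(allocation.values())
    for level in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        allocation[level] += 1
    return allocation


# Weights taken from the test matrix; the 50-item paper is hypothetical.
matrix_weights = {"Knowledge": 50, "Comprehension": 30, "Application": 20}
print(allocate_items(50, matrix_weights))
```

The same call can be repeated per content row if each topic is given its own share of the paper.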
12. Factors Affecting Validity
Unclear directions
Difficult vocabulary
Inappropriate level of difficulty
Poorly constructed test items
Ambiguity
Inadequate time
Test too short
Identifiable pattern of answers
13. Reliability
• The extent to which an assessment produces the same result when used repeatedly under the same conditions on the same group
• Reliability does not ensure validity
14. Types of Reliability
• Test-retest: The same test is repeated after an interval, provided there has been no additional learning and nothing has been forgotten
• Equivalent tests: Comparing two tests of equivalent form
• Split-half method: A single question paper is split into two halves
• Marker reliability: One paper is independently marked by two or more examiners
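The split-half idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: the paper is split into odd-numbered and even-numbered items (one of several possible ways to form two halves), and agreement between the halves is summarized with a Pearson correlation. The function names and the sample marks are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores in one half do not vary")
    return cov / sqrt(var_x * var_y)


def split_half_correlation(item_scores):
    """item_scores holds one list of per-item marks for each student."""
    # Assumption: halves are formed from odd- and even-numbered items.
    odd_totals = [sum(marks[0::2]) for marks in item_scores]
    even_totals = [sum(marks[1::2]) for marks in item_scores]
    return pearson(odd_totals, even_totals)


# Hypothetical marks for four students on a four-item paper.
scores = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
print(round(split_half_correlation(scores), 2))
```

A correlation close to 1 suggests the two halves rank students similarly; a low value suggests the paper is not internally consistent.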
15. Factors Affecting Reliability
• Ambiguous questions
• Too many options in a question paper
• Poorly trained markers
• Personal bias
• Vague marking instructions
16. Objectivity
• Objectivity means the scores will be the same when
the students’ performance is examined by two or
more examiners
17. Characteristics for Objectivity: Ways to Improve
Each question is specific and has a fixed answer
Marks are allocated for each section of a question
Answers are documented
A checklist / rating scale is used for marking
Examiners are trained / experienced
18. Methods of Assessment
Written Examination
Oral Examination
Practical / Clinical Examination
20. Tools for Written Examination
Open-ended question: Requires students to write and present an original answer
Fixed-choice question: Requires students to select the correct response from several alternatives
21. Open Ended Questions
Essay Question (EQ)
Structured Essay Question(SEQ)
Modified Essay Question(MEQ)
Short Essay Question
Short Answer Question(SAQ)
22. Fixed Choice Questions
One best response / Single Best Answer (SBA)
Simple true-false type
Multiple true false (MTF) type
Multiple true false completion type
Matching question
Extended Matching question (EMQ)
Assertion and reason
23. Advantages of EQ
Assesses recall of knowledge, comprehension and complex cognitive skills including analysis, synthesis and evaluation.
Easy to prepare and takes less time.
Eliminates the possibility of students guessing the correct answer.
Measures power of expression and ability to organize thoughts.
24. Disadvantages of EQ
Very subjective, with limited validity and low reliability.
Time consuming to grade and mark.
Gives undue importance to writing skill and penalizes errors in spelling and grammar.
25. How EQ Can Be Improved
Aspect | Improvement
Low reliability | Structuring the essay
Low validity | Include more short structured essays
Low objectivity | SEQ with checklists
Does not test problem solving ability | Problem based SEQs
27. Types of SAQ
• Completion(Fill in the blanks)
• Definitions
• Label/Draw diagram
• Unique answer type
• Numerical problems
• Open SAQ
• Problem solving items
29. Advantages of MCQ
Effective assessment technique
Objectivity
Broad sampling
High reliability
Applicable to large group
Easy scoring
30. Disadvantages of MCQ
Difficult to construct, time consuming
Limited types of knowledge assessment
Scope of guessing
Cannot test expression capability
Students can select random answer
31. Different Parts of an MCQ
Direction: Each of the following questions or incomplete statements is followed by suggested answers or completions. Select the one that is BEST in each case.
Stem: A 1-year-old infant is known to have heart disease and is noted to be cyanosed.
Lead-in: Which of the following is the most likely diagnosis?
Branches/Alternatives:
A. Atrial Septal Defect (distractor)
B. Patent Ductus Arteriosus (distractor)
C. Ventricular Septal Defect (distractor)
D. Tricuspid Atresia (key)
32. Evaluation of Assessment
Review of a Test
• Adequacy of time
• Number of students attempting each question
• Average performance of students
• Differentiation between students of different ability
33. Review of Individual Questions / Item Analysis
• If a test contains a substantial number of bad questions, this will be reflected in the review of the whole test.
• When only a small number of questions are bad, the review of the whole test will not be useful for detecting them.
• So review of individual questions is important.
35. Pre-Validation
It is done before the examination
Review committee
Relevance to contents
Item construction
36. Post-Validation
• This is done after a test has been administered and scored, but before the result is published.
• The indices of post-validation are:
Difficulty index
Discrimination index
Distractor effectiveness
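37. Difficulty Index
• It is used to find out the difficulty level of a question.
• It is measured by the formula
Difficulty Index = (H + L) / N × 100
Here H and L are the numbers of correct responses in the high-scoring and low-scoring groups, and N is the total number of students in the two groups.
• The higher the value, the easier the question
• Ideal Difficulty Index: 50-60%
• Acceptable range: 30-70%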
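A minimal sketch of the difficulty index calculation follows, assuming the usual item-analysis convention in which H and L are the numbers of correct responses in the high-scoring and low-scoring groups and N is the total number of students in both groups. The function names and the sample counts are hypothetical; the ideal and acceptable bands are the ones listed on the slide.

```python
def difficulty_index(h_correct: int, l_correct: int, n_total: int) -> float:
    """Difficulty index as a percentage: (H + L) / N x 100."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (h_correct + l_correct) * 100 / n_total


def classify_difficulty(index: float) -> str:
    """Label an index using the 50-60% ideal and 30-70% acceptable bands."""
    if 50 <= index <= 60:
        return "ideal"
    if 30 <= index <= 70:
        return "acceptable"
    return "outside acceptable range"


# Hypothetical item: 18 correct in the high group, 9 in the low group,
# 25 students in each group (N = 50).
index = difficulty_index(h_correct=18, l_correct=9, n_total=50)
print(index, classify_difficulty(index))
```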
38. Discrimination Index (DI)
• DI is the ability of a question to differentiate more able students from less able students
• It is measured by the formula
DI = 2(H − L) / N
• The higher the index, the better the question differentiates
• Excellent: ≥ 0.35
• Good: 0.25-0.34
• Acceptable: 0.20-0.24
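The discrimination index can be sketched the same way, using the standard form DI = 2(H − L) / N, where H and L are assumed to be the numbers of correct responses in the high-scoring and low-scoring groups and N the total number of students in both groups. The function names and sample counts are hypothetical, and treating each band's lower limit as a cut-off (so that a value such as 0.345 counts as "good") is an assumption of this example.

```python
def discrimination_index(h_correct: int, l_correct: int, n_total: int) -> float:
    """Discrimination index: 2 * (H - L) / N."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 2 * (h_correct - l_correct) / n_total


def classify_discrimination(di: float) -> str:
    """Label a DI using the slide's bands, with lower limits as cut-offs."""
    if di >= 0.35:
        return "excellent"
    if di >= 0.25:
        return "good"
    if di >= 0.20:
        return "acceptable"
    return "below acceptable"


# Hypothetical item: 18 correct in the high group, 9 in the low group, N = 50.
di = discrimination_index(h_correct=18, l_correct=9, n_total=50)
print(round(di, 2), classify_discrimination(di))
```

A negative value means the low group outperformed the high group on the item, which usually signals a flawed question or a wrong key.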
39. Distractor Effectiveness
• It is essential to examine each distractor in an MCQ for its effectiveness.
• A distractor is considered effective if more than five percent of students select it.
• If fewer than 5% of students select a distractor, that distractor is not effective.
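The 5% rule can be checked mechanically. The sketch below is illustrative only: the function name and the response data are hypothetical, and a distractor chosen by exactly 5% of students is treated as effective, which is an assumption because the rule does not cover that boundary case.

```python
from collections import Counter


def distractor_effectiveness(responses, key, options):
    """Return {distractor: (percent choosing it, is_effective)}."""
    total = len(responses)
    if total == 0:
        raise ValueError("no responses supplied")
    counts = Counter(responses)
    result = {}
    for option in options:
        if option == key:
            continue  # the key is the correct answer, not a distractor
        percent = counts[option] * 100 / total
        # Assumption: exactly 5% counts as effective.
        result[option] = (percent, percent >= 5)
    return result


# Hypothetical responses from 100 students to an item whose key is D.
answers = ["D"] * 60 + ["A"] * 25 + ["C"] * 13 + ["B"] * 2
for option, (percent, ok) in distractor_effectiveness(answers, "D", "ABCD").items():
    print(option, percent, "effective" if ok else "not effective")
```

In this made-up data option B attracts only 2% of students, so it would be a candidate for rewriting.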
40. Key Validation
• This is done when a new question that has not been pretested is used in a question paper.
• After the question paper is scored, a random sample of papers is selected and item analysis is done for the new question.
41. Practical/Clinical Examination: Tools
• Long case
• Short case
• Objective structured practical examination (OSPE)
• Objective structured clinical examination (OSCE)
• Objective structured long examination record (OSLER)
• Workplace based assessment (WPBA)
42. OSCE/OSPE
• The examination is set up as a circuit of 10 – 20 assessment stations
• The task to be tested is given in the form of a specific question
• A particular skill / area of competency is tested at each station
• Students get 4 – 5 minutes at each station and all move around the circuit at the same time until they have completed it
43. Types of Station
• Procedural station: Students perform a task
• Question station: Students are asked to interpret
• Linked station: Response station for a previous station
• Rest station: Provided for adjustment rather than for rest
• Extra length station: Used when students need double or triple the duration of other stations to perform the task
• Guillotine / must-pass station: The candidate fails if he or she fails at this station, even after performing very well at all other stations
44. Principles
• Examiners only observe & mark; they do not talk
• Every station has an answer and a checklist for marking
• Students move; examiners/observers remain static
• The same set is used for all students
• Stations are constructed with care
45. OSLER
• It is a systematic approach to assessing a student in a long case examination
• It overcomes the limitations of the traditional long case examination by improving reliability through structuring.
46. Limitations of Traditional Long Case
• History taking skill is not assessed
• Few questions are asked
• Communication skill is rarely assessed
• Task oriented assessment is impossible
• Very subjective
47. Advantages of OSLER
• Structured
• Task oriented
• The student takes the history and examines the patient in front of the examiner
48. Workplace Based Assessment (WPBA)
• Assessment of the knowledge, skills, behaviors and attitudes of doctors/trainees in the workplace under normal working conditions.
• Types:
• Mini Clinical Evaluation Exercise (Mini-CEX)
• Clinical Encounter Card (CEC)
• Clinical Work Sampling (CWS)
• Blinded Patient Encounter (BPE)
• Direct Observation of Procedural Skill (DOPS)
• Case Based Discussion (CbD)
• Multi-source feedback (MSF)
• Procedure Based Assessment (PBA)
• Assessment of Audit
• Observation Of Teaching (OOT)
49. Relation of WPBA with Other Assessment
Level (top to bottom) | Assessment
Does | WPBA, portfolios, log book
Shows how | OSPE/OSCE, short case / long case
Knows how | Solving clinical scenario based problems, written/oral
Knows | Recall of facts, written/oral
50. Oral Examination
Oral examination can be defined as an examination consisting of a dialogue with the examiner, who asks questions to which the candidate must reply.
Oral examination is used to probe students’ ability to think fast and to express themselves clearly.
It also helps to assess students’ communication skill and professional attitude.
52. Problems with Traditional Type
Lack of standardization, objectivity and reliability of results
Possibility of abuse of personal relations
Expensive in terms of professional time relative to the information yield
The session remains unrecorded
The environment often becomes threatening
53. SOE
Problems of oral examination can be overcome in SOE by taking the following elements into account:
Examiner
Atmosphere
Setting questions
Process of examination
Feedback session
Objective
Structured
54. Components of SOE
• Prepared questions
• Rating scale
• Prepared answer
56. Portfolio
• Collection of various forms of evidence of
achievement of learning outcome through a
process of self reflection over a period of time
57. Uses of Portfolio
• Engages the teacher & students in a process of learning through assessment
• Measures & reinforces desired learning outcomes
• Enhances lifelong learning
• Can assess outcomes related to attitudes & professionalism
58. Contents-portfolio
Any material that provides evidence for learning
• Best Essay
• Written reports
• Evaluation of performance
• Video tape
• Record of practical procedure
• Patients’ record
• Curriculum vitae
• Written reflection
59. Assessment of Attitude
• Attitude is the tendency to behave in a preferential
manner
• It is an internal disposition reflected by one’s behavior with respect to persons, events, opinions or theories.
60. Components of Attitude
• ABC model:
Affective: How a person feels about the object of attitude
Behavioral or Conative: A person's behavioral tendency towards the object of attitude
Cognitive: What a person knows and believes about the object of attitude
61. ABC Model: Example
Cognitive (knowledge, belief): Snakes are poisonous
Affective (feelings): I fear snakes
Behavioral: I’ll run away if there is a snake
62. Measurement
• Attitude is measured in an indirect way
• Two common scales for measurements are
Likert scale
Osgood’s scale
63. Likert Scale
• Attitude is measured on a five point scale ranging from
extremely positive to extremely negative.
A-Strongly agree
B-Agree
C-Undecided
D-Disagree
E-Strongly disagree
64. Osgood’s Scale
• Attitude is measured between two opposite adjectives, with gradations in between
• Such as
Useful - Not useful
Interesting - Not interesting
65. Emotion and Emotional Intelligence
• Emotion is a complex psychological state that involves three distinct components:
• Subjective experience
• Physiological response
• Behavioral response
66. Emotional Intelligence (EI)
• EI can be defined as the ability to perceive, control and evaluate the emotions of oneself and others.
• Life success depends more on the ability to understand and control emotions than on IQ.
• Understanding EI is important for patient-centered care.
67. Psychometric Tests
• Psychometric tests are employed to measure individuals’ mental capabilities and behavioral style.
• The parts of a psychometric test are:
Aptitude
Personality
Attitude
Emotional Intelligence
68. Psychometric Tests (contd.)
• Aptitude: An innate or acquired ability/competency to do some specific job (military, flight, nursing care etc.)
• Intelligence Quotient (IQ): A number meant to measure people's cognitive abilities (intelligence) in relation to their age group.
• An IQ test is a measure of general intelligence.
• An aptitude test has a specific application.
70. Selection of Medical Students
The purpose of the selection process is to identify individuals who will:
Actively engage in the course
Successfully complete the course
Ultimately be good doctors
Serve the community best
71. Different Means of Student Selection
Academic records
Aptitude tests
Personal statements, essays and autobiography
Situational Judgment Tests (SJTs)
References
Personality assessment and emotional intelligence
Interviews and multiple mini-interviews (MMIs)
Selection centers
72. Patient Management Problem (PMP)
A pencil and paper test of clinical problem solving skills, resembling a clinical situation
• The aim of PMPs is to make PBL more effective in clinical teaching
• The tutor acts as a facilitator
• PMPs are presented one by one
• Two PMPs per week
• Before the problems are presented, learners get clear guidance and instruction.
73. Guidance and Instruction to Learners with PMP
Try to apply the following instructions
• Read each problem carefully, one by one
• Answer each question before moving to the next
• Unpack the problem and discuss it with other learners in the session, in pairs or groups
In case of difficulty
• Use the resources available
74. Self Assessment
• Self assessment means a student's own judgment or measurement of his or her perceived knowledge and concepts regarding class performance, examination results and all other affairs related to academic study.
75. Benefits of Self Assessment
• Identifies weaknesses & learning gaps
• Helps to learn better
• Aids better performance in future exams
• Increases self confidence
• Creates good relationships with peers
• Creates good relationships with seniors
• Creates good relationships with teachers
76. Student Peer Assessment
• Peer assessment is the assessment of students’
work by other students of equal status.
• Students often undertake peer assessment in
conjunction with formal self-assessment
• Peer and group assessment are also often
undertaken together.
77. Question Bank
• A question bank contains all the questions of a course, arranged systematically, together with their standard answers, rating scales, difficulty levels, discrimination properties and test matrix
78. An Item Card in a Question Bank
Front
Department of Public Health
ABC Medical College
Ref No | System | Topic | Time | Mark | Prepared By
1 | Basic concept | Concept of health and disease | 01 Min | 05 | Dr. THS
Q-1 What are the effects of the exotoxin released by Corynebacterium diphtheriae?
Answer:
(a) The formation of a greyish or yellowish membrane (false membrane), commonly over the tonsils, pharynx or larynx, with well-defined edges; the membrane cannot be wiped away.
(b) Marked congestion, edema or local tissue destruction
(c) Enlargement of the regional lymph nodes and
(d) Signs and symptoms of toxemia.
79. Back
S/No | Name of Exam & Date | Difficulty index | Discrimination index | Remarks
1 | 2nd Term Exam, 2003 | 70 | 0.25 | It is a good question, with ideal Difficulty and Discrimination index
2 | 2nd Term Exam, 2015 | 68 | 0.35 | Ideal Difficulty index and Excellent Discrimination Index
80. Feedback
• Feedback is information from the instructor to the learners about their past performance on the wards, which serves to enhance or modify the learners' future actions.
• Types:
• Brief ongoing feedback
• Formal mid-course feedback
• End-of-course feedback
81. Without Feedback-
• Mistakes go uncorrected
• Good performance is not reinforced
• Clinical competence is not achieved
• Learners self-validate
82. Summary
• Assessment ensures progress towards objectives
• Evaluation encourages interaction between teacher & learner
• No single tool is sufficient to measure all the competences
• Assessment to measure learning and learning through assessment
Are the students learning what they are supposed to learn?
Is there a better way to promote their learning?
"When the cook tastes the soup, that's formative: When the guests taste the soup, that's summative." Robert Stake (Scriven 1991:169):
"As the cook, or teacher, we need to stop and taste the soup before we move forward with instruction. We need to design instruction so students can press the reset button and go back to learn what they missed the first time. We can use many techniques to assess student achievement and understanding."
This is a powerful image that clearly distinguishes the two concepts and helps to cement them in the mind. It also reminds us of the power of using metaphors, similes, images and figurative language in our teaching practices.
Norm referenced and Criterion referenced
High stakes: can block intended career progression
Perceived as threatening
Content validity: the extent to which a particular method of measurement includes all of the dimensions of the construct one intends to measure and nothing more. Test items should be representative of the larger content of a subject, including all domains of knowledge, skills and attitude.
To increase content validity, test items can be chosen using a test matrix.
Criterion validity: refers to validity in relation to an external criterion. Criterion validity is present to the extent that the measurements predict a directly observable phenomenon.
Construct validity: constructs are an individual's intelligence, attitude, personality, reasoning process, problem solving ability, communication skill, creativity, writing process, self-esteem and other attributes that a teacher may wish to examine. Construct validity is the extent to which a test measures the construct it is meant to measure.
Equivalent tests: comparing two tests (written vs oral) of equivalent form (same content, same difficulty level), administered to the same group of students to obtain two sets of scores.
Split half: a single question paper is split into two halves to measure internal consistency.
Marker reliability, or inter-rater reliability, is the same thing as objectivity.
A distractor is effective if more of the lower ability students, and fewer of the higher ability students, pick it (incorrectly) as the correct answer.
Practical tests mainly aim to assess students' performance in practical skills, clinical skills and communication skills (psychomotor and affective domains).
These are assessment tools in which the components of clinical or practical competence are tested in a simulated environment using agreed checklists/rating scales. The students rotate around a number of stations, some of which have observers with checklists.
At every moment we are happy, angry, sad, bored or frustrated.
Subjective experience: Emotions can be highly subjective. Getting married or having a child might produce a variety of emotional experiences, ranging from joy to anxiety.
Physiological response: sweating of the palms, pounding of the heart, rapid breathing.
Behavioral response: the actual expression of the emotion. A few expressions are universal, such as a smile indicating happiness or pleasure, or a frown indicating sadness or displeasure.
An IQ between 90 and 110 is considered average; over 120 is superior. A score below 70 indicates problems in understanding; a score above 130 may indicate giftedness.