This document discusses key concepts related to the assessment of learning. It defines assessment, measurement, evaluation, and testing. It outlines different modes of assessment, including traditional, performance, and portfolio assessment. It also discusses types of assessment processes, such as diagnostic, formative, and summative assessment. Principles of quality assessment are outlined, including clarity, appropriateness, validity, reliability, fairness, and practicality. Methods of developing tests are also discussed, such as identifying objectives, determining the test type, constructing items, and validating tests.
Assessment of Learning 1
1. ASSESSMENT OF LEARNING (Basic Concepts)
Prof. Yonardo Agustin Gabuyo
Basic Concepts in Assessment of Learning
Assessment
- refers to the collection of data to describe or better understand an issue.
- measures "where we are in relation to where we should be." Many consider it the same as formative evaluation.
- is a process by which information is obtained relative to some known objective or goal.
- is the teacher's way of gathering information about what students have learned; teachers use this information to make important decisions about students' grades, the content of future lessons, and the revision of the structure or content of a course.
Measurement
- refers to the process by which the attributes or dimensions of some physical object are determined.
- is a process of measuring the individual's intelligence, personality, attitudes and values, achievement, and anything that can be expressed quantitatively.
- answers the question "how much?"
Evaluation
- determines "how well did we do what we set out to do?" Evaluation is tied to stated goals and objectives. Many equate this to summative evaluation.
- refers to the process of determining the extent to which instructional objectives are attained.
- refers to the comparison of data to a standard for the purpose of judging worth or quality.
Test is an instrument designed to measure any quality, ability, skill, or knowledge.
Testing is a method used to measure the level of performance or achievement of the learner. It refers to the administration, scoring, and interpretation of an instrument (procedure) designed to elicit information about performance in a sample of a particular area of behavior.
MODES OF ASSESSMENT
A. Traditional Assessment
- the objective paper-and-pencil test, which usually assesses low-level thinking skills.
- preparation of the instrument is time-consuming, and the format is prone to cheating.
- scoring is objective, and administration is easy because students can take the test at the same time.
B. Performance Assessment
- the learner performs a behavior to be measured in a "real-world" context.
- the learner demonstrates the desired behavior in a real-life context, and the locus of control is with the student.
- a mode of assessment that requires actual demonstration of skills or creation of products of learning.
- scoring tends to be subjective without rubrics.
- preparation of the instrument is relatively easy, and it measures behaviors that cannot be faked.
C. Portfolio Assessment
- a process of gathering multiple indicators of students' progress to support course goals in dynamic, ongoing, and collaborative processes.
- development is time-consuming, and rating tends to be subjective without rubrics.
- measures the student's growth and development.
TYPES OF ASSESSMENT PROCESSES
A. Placement Assessment
- determines the entry behavior of the students.
- determines the student's performance at the beginning of instruction.
- determines the position of the students in the instructional sequence.
- determines the mode of evaluation beneficial for each student.
B. Diagnostic Assessment
is given at the start:
- to determine the students' levels of competence.
- to identify those who have already achieved mastery of the requisite learning.
- to help classify students into tentative small groups for instruction.
C. Formative Assessment
is given to:
- monitor the learning progress of the students.
- provide feedback to both parents and students.
- answer the question "Where are we in relation to where we should be?"
- this type of assessment can be done informally and need not use traditional instruments such as quizzes and tests.
D. Summative Assessment
is given at the end of a unit:
- to determine whether the objectives were achieved.
- tends to be formal and uses traditional instruments such as tests and quizzes.
- answers the question "How well did we do what we set out to do?"
- determines the extent of the student's achievement and competence.
- provides a basis for assigning grades.
- provides the data from which reports to parents and transcripts can be prepared.
Principles of Quality Assessment
1. Clarity of the Learning Target
2. Appropriateness of the Assessment Method
3. Validity
4. Reliability
5. Fairness
6. Practicality and Efficiency
1. Clarity of the Learning Target
Learning Target. Clearly stated; focuses on the student learning objective rather than teacher activity; a meaningful and important target.
Skill Assessed. Clearly presented; can you "see" how students would demonstrate the skill in the task itself?
Performance Task - Clarity. Could students tell exactly what they are supposed to do and how the final product should be done?
Rubric - Clarity. Would students understand how they are to be evaluated? Are the criteria observable and clearly described?
2. Appropriateness of the Assessment Method
- Does it work with the type of task and learning target?
- Does it allow for several levels of performance?
- Does it assess skills as stated?
- The type of test used should match the learning objective of the subject matter.
Two general categories of test items:
1. Objective items
- require students to select the correct response from several alternatives or to supply a word or short phrase to answer a question or complete a statement.
2. Subjective or essay items
- permit the student to organize and present an original answer.
Objective Test
- includes true-false, fill-in-the-blank, matching-type, and multiple-choice questions.
- the word "objective" refers to the scoring and indicates there is only one correct answer.
- objective tests rely heavily on the ability to read quickly and to reason out the answer.
- measures both the ability to remember facts and figures and the understanding of course materials.
- prepare yourself for high-level critical reasoning and for making fine discriminations to determine the best answer.
a) Multiple-Choice Items
- used to measure knowledge outcomes and various other types of learning outcomes.
- most widely used for measuring knowledge, comprehension, and application outcomes.
- scoring is easy, objective, and reliable.
Advantages in Using Multiple-Choice Items
Multiple-choice items can provide:
- versatility in measuring all levels of cognitive ability.
- highly reliable test scores.
- scoring efficiency and accuracy.
- objective measurement of student achievement or ability.
- a wide sampling of content or objectives.
- a reduced guessing factor when compared to true-false items.
- different response alternatives, which can provide diagnostic feedback.
b) True-False Items
- typically used to measure the ability to identify whether statements of fact are correct.
- the basic format is simply a declarative statement that the student must judge as true or false.
- useful for outcomes where there are only two possible alternatives.
True-False Items...
- do not discriminate between students of varying ability as well as other item types do.
- can often include more irrelevant clues than other item types do.
- can often lead an instructor to favor testing of trivial knowledge.
c) Matching Type Items
- consist of a column of key words presented on the left side of the page and a column of options placed on the right side of the page. Students are required to match each key word with its associated option.
- provide objective measurement of student achievement.
- provide efficient and accurate test scores.
Matching Type Items
- if options cannot be used more than once, the items are not mutually exclusive; getting one answer incorrect automatically means a second question is incorrect.
- all items should be of the same class, and all options should be of the same class (e.g., a list of events to be matched with a list of dates).
d) Short Answer Items
- require the examinee to supply the appropriate words, numbers, or symbols to answer a question or complete a statement.
- items should require a single-word answer or a brief and definite statement.
- can efficiently measure the lower levels of the cognitive domain.
B) Essays or Subjective Test
- may include either short-answer questions or long general questions.
- these exams have no one specific answer per student.
- usually scored on an opinion basis, although certain facts and understanding will be expected in the answer.
- essay tests are generally easier and less time-consuming to construct than most objective test items.
- the main reason students fail essay tests is not that they cannot write, but that they fail to answer the questions fully and specifically, or their answer is not well organized.
- students with good writing skills have an advantage over students who have difficulty expressing themselves through writing.
- essays are more subjective in nature due to their susceptibility to scoring influences.
C) PERFORMANCE TEST
- also known as alternative or authentic assessment.
- designed to assess the ability of a student to perform correctly in a simulated situation (i.e., a situation in which the student will ultimately be expected to apply his/her learning).
- a performance test will simulate, to some degree, a real-life situation to accomplish the assessment.
- in theory, a performance test could be constructed for any skill and real-life situation.
- most performance tests have been developed for the assessment of vocational, managerial, administrative, leadership, communication, interpersonal, and physical education skills in various simulated situations.
Advantages in Using Performance Test Items
Performance test items:
- can appropriately measure learning objectives which focus on the ability of the students to apply skills or knowledge in real-life situations.
- usually provide a degree of test validity not possible with standard paper-and-pencil test items.
- are useful for measuring learning objectives in the psychomotor domain.
SUGGESTIONS FOR WRITING PERFORMANCE TEST ITEMS
1. Prepare items that elicit the type of behavior you want to measure.
2. Clearly identify and explain the simulated situation to the student.
3. Make the simulated situation as "life-like" as possible.
4. Provide directions which clearly inform the students of the type of response called for.
5. When appropriate, clearly state time and activity limitations in the directions.
6. Adequately train the observer(s)/scorer(s) to ensure that they are fair in scoring the appropriate behaviors.
D) Oral questioning
- the most commonly used of all forms of assessment in class.
- assumes that the learner can hear and shares a common language with the assessor.
- the ability to communicate orally is relevant to this type of assessment.
- the other major role for the "oral" in summative assessment is in language learning, where the capacity to carry on a conversation at an appropriate level of fluency is relatively distinct from the ability to read and write the language.
E) Observation
- refers to measurement procedures in which child behaviors in the school or classroom are systematically monitored, described, classified, and analyzed, with particular attention typically given to the antecedent and consequent events involved in the performance and maintenance of such behaviors.
F) Self-reports
- students are asked to reflect on, make a judgment about, and then report on their own or a peer's behavior and performance.
- typical evaluation tools could include sentence completion, Likert scales, checklists, or holistic scales.
- responses may be used to evaluate both performance and attitude.
3. Validity
- the degree to which the test measures what it is intended to measure.
- the usefulness of the test for a given purpose.
- a valid test is always reliable.
Approaches in Validating Test
Factors Affecting Content Validity of Test Items
A. The test itself
B. The administration and scoring of the test
C. Personal factors influencing how students respond to the test
D. Validity is always specific to a particular group
A. The Test Itself
Ways that can reduce the validity of test results:
1. Unclear directions
2. Poorly constructed test items
3. Ambiguity
4. Inappropriate level of difficulty
5. Improper arrangement of items
6. Inadequate time limits
7. Test is too short
8. Identifiable pattern of answers
9. Test items inappropriate for the outcomes being measured
10. Reading vocabulary and sentence structure too difficult
B. The administration and scoring of a test.
- assessment procedures must be administered uniformly to all students; otherwise, scores will vary due to factors other than differences in student knowledge and skills.
- the test should be administered with ease, clarity, and uniformity so that the scores obtained are comparable.
- uniformity can be obtained by setting the time limit and standardizing the oral instructions.
Factors here that reduce validity include:
- insufficient time to complete the test
- giving assistance to students during the testing
- subjectivity in scoring essay tests
C. Personal factors influencing how students respond to the test
- students might not be mentally prepared for the test.
- students can subconsciously be exercising what is called a response set.
D. Validity is always specific to a particular group
- test results can be influenced by such factors as age, sex, ability level, educational background, and cultural background.
Validity
- the most important quality of a test.
- does not refer to the test itself.
- generally addresses the question: "Does the test measure what it is intended to measure?"
- refers to the appropriateness, meaningfulness, and usefulness of the specific inferences that can be made from test scores.
- the extent to which test scores allow decision makers to infer how well students have attained program objectives.
4. Reliability
- refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.
- refers to the results obtained with an evaluation instrument and not to the instrument itself.
- an estimate of reliability always refers to a particular type of consistency.
- reliability is a necessary but not a sufficient condition for validity.
- reliability is primarily statistical.
Methods of Computing Reliability Coefficient
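The specific computation methods do not survive in this text version, but as an illustration, one widely used method is the split-half method with the Spearman-Brown correction. The sketch below is a minimal, stdlib-only Python illustration under the assumption of dichotomously scored (0/1) items; the function names and data are hypothetical, not from the original slides.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per student.
    Splits the items into odd/even halves, correlates the half scores,
    then applies the Spearman-Brown correction for full test length."""
    odd = [sum(s[::2]) for s in item_scores]
    even = [sum(s[1::2]) for s in item_scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)  # Spearman-Brown prophecy formula

# Hypothetical data: 3 students, 4 items each.
data = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
r = split_half_reliability(data)  # perfectly consistent halves -> 1.0
```

Other common methods (test-retest, parallel forms, KR-20, Cronbach's alpha) follow the same idea: a coefficient near 1.0 indicates consistent scores.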
Relationship of Reliability and Validity
- test reliability is a prerequisite for test validity.
- if a test is not reliable, then validity is moot. In other words, if a test is not reliable there is no point in discussing validity, because reliability is required before validity can be considered in any meaningful way.
Reliability
- the degree to which test scores are free of errors of measurement due to things like student fatigue, item sampling, and student guessing.
- if a test is not reliable, it is also not valid.
5. Fairness
- the assessment procedures do not discriminate against any particular group of students (for example, students from various racial, ethnic, or gender groups, or students with disabilities).
6. Practicality and Efficiency
- Teacher's familiarity with the method
- Time required
- Complexity of the administration
- Ease in scoring - the test should be easy to score: directions for scoring are clear, the scoring key is simple, and provisions for answer sheets are made.
- Cost (economy) - the test should be given in the cheapest way possible, which means answer sheets should be provided so that the test can be reused from time to time.
Development of Classroom Assessment Tools
Steps in Planning for a Test
- Identifying test objectives
- Deciding on the type of objective test to be prepared
- Preparing a Table of Specifications (TOS)
- Constructing the draft test items
- Try-out and validation
Identifying Test Objectives
An objective test, if it is to be comprehensive, must cover the various levels of Bloom's taxonomy. Each objective consists of a statement of what is to be achieved and, preferably, by what percentage of the students.
Cognitive Domain
1. Knowledge
- recognizes students' ability to use rote memorization and recall certain facts. Test questions focus on identification and recall of information.
- Sample verbs for stating specific learning outcomes: cite, define, identify, label, list, match, name, recognize, reproduce, select, state.
- At the end of the topic, students should be able to identify the major food groups without error. (instructional objective)
- Test items:
  What are the four major food groups?
  What are the three measures of central tendency?
2. Comprehension
- involves students' ability to read course content, interpret important information, and put others' ideas into their own words. Test questions should focus on the use of facts, rules, and principles.
- Sample verbs for stating specific learning outcomes: classify, convert, describe, distinguish between, give examples, interpret, summarize.
- At the end of the lesson, the students should be able to summarize the main events of the story in grammatically correct English. (instructional objective)
- Summarize the main events of the story in grammatically correct English. (test item)
3. Application
- students take new concepts and apply them to new situations. Test questions focus on applying facts and principles.
- Sample verbs for stating specific learning outcomes: apply, arrange, compute, construct, demonstrate, discover, extend, operate, predict, relate, show, solve, use.
- At the end of the lesson, the students should be able to write a short poem in iambic pentameter. (instructional objective)
- Write a short poem in iambic pentameter. (test item)
4. Analysis
- students have the ability to take new information, break it down into parts, and differentiate between them. Test questions focus on separation of a whole into component parts.
- Sample verbs for stating specific learning outcomes: analyze, associate, determine, diagram, differentiate, discriminate, distinguish, estimate, point out, infer, outline, separate.
- At the end of the lesson, the students should be able to describe the statistical tools needed in testing the difference between two means. (instructional objective)
- What kind of statistical test would you run to see if there is a significant difference between pre-test and post-test? (test item)
5. Synthesis
- students are able to take various pieces of information and form a whole, creating a pattern where one did not previously exist. Test questions focus on combining ideas to form a new whole.
- Sample verbs for stating specific learning outcomes: combine, compile, compose, construct, create, design, develop, devise, formulate, integrate, modify, revise, rewrite, tell, write.
- At the end of the lesson, the student should be able to compare and contrast the two types of error. (instructional objective)
- What is the difference between a type I and a type II error? (test item)
6. Evaluation
- involves students' ability to look at someone else's ideas or principles and judge the worth of the work and the value of the conclusion.
- Sample verbs for stating specific learning outcomes: appraise, assess, compare, conclude, contrast, criticize, evaluate, judge, justify, support.
- At the end of the lesson, the students should be able to draw a conclusion about the relationship between two means. (instructional objective)
- Example: What should the researcher conclude about the relationship in the population? (test item)
Preparing the Table of Specifications
A table of specifications
- is a useful guide in determining the type of test items that you need to construct. If properly prepared, a table of specifications will help you limit the coverage of the test and identify the necessary skills or cognitive level required to answer each test item correctly.
Gronlund (1990) lists several examples of how a table of specifications should be prepared.
Format of a Table of Specifications
Specific Objectives - these refer to the intended learning outcomes, stated as specific instructional objectives covering a particular test topic.
Cognitive Level - this pertains to the intellectual skill or ability needed to correctly answer a test item, using Bloom's taxonomy of educational objectives. We sometimes refer to this as the cognitive demand of a test item. Entries in this column could be knowledge, comprehension, application, analysis, synthesis, or evaluation.
Type of Test Item - this identifies the type or kind of test a test item belongs to. Entries in this column could be multiple choice, true or false, or even essay.
Item Number - this simply identifies the question number as it appears in the test.
Total Number of Points - this summarizes the score given to a particular test item.
Samples of Tables of Specifications (1)-(3) [tables not reproduced in this text version]
Points to Remember in Preparing a Table of Specifications
1) Define and limit the subject matter coverage of the test depending on the length of the test.
2) Decide on the point distribution per subtopic.
3) Decide on the type of test you will construct per subtopic.
4) Make certain that the type of test is appropriate to the degree of difficulty of the topic.
5) State the specific instructional objectives in terms of the specific types of performance students are expected to demonstrate at the end of instruction.
6) Be careful in identifying the necessary intellectual skill needed to correctly answer the test item. Use Bloom's taxonomy as a reference.
Suggestions for Constructing Short-Answer Items
1) Word the item so that the required answer is both brief and specific.
2) Do not take statements directly from textbooks to use as a basis for short-answer items.
3) A direct question is generally more desirable than an incomplete statement.
4) If the answer is to be expressed in numerical units, indicate the type of answer wanted.
5) Blanks for answers should be equal in length and placed in a column to the right of the question.
6) When completion items are used, do not include too many blanks.
Examples:
1) Poor: An animal that eats the flesh of other animals is (carnivorous).
Better: An animal that eats the flesh of other animals is classified as (carnivorous).
2) Poor: Chlorine is a (halogen).
Better: Chlorine belongs to a group of elements that combine with metals to form salts. It is therefore called a (halogen).
3) Poor: John Glenn made his first orbital flight around the earth in (1962).
Better: In what year did John Glenn make his first orbital flight around the earth? (1962)
Selecting the Test Format
Selective Test - a test where there are choices for the answer, such as multiple choice, true or false, and matching type.
Supply Test - a test where there are no choices for the answer, such as short answer, completion, and extended-response essay.
Construction and Tryouts
- Item Writing
- Content Validation
- Item Tryout
- Item Analysis
Item Analysis
- refers to the process of examining the students' responses to each item in the test.
An item has either desirable or undesirable characteristics. An item with desirable characteristics can be retained for subsequent use; one with undesirable characteristics is either revised or rejected.
Use of Item Analysis
Item analysis data provide a basis for:
- efficient class discussion of the test results.
- remedial work.
- general improvement of classroom instruction.
- increased skills in test construction.
- constructing a test bank.
Three criteria determine the desirability or undesirability of an item:
a) difficulty of the item
b) discriminating power of the item
c) measures of attractiveness
Difficulty index
- refers to the proportion of the number of students in the upper and lower groups who answered the item correctly.
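As a concrete illustration of the definition above, the difficulty index is a simple proportion. The Python sketch below is an assumption-laden example: the function name and the group counts are hypothetical, not from the original slides.

```python
def difficulty_index(upper_correct, lower_correct, upper_n, lower_n):
    """Difficulty index: proportion of students in the upper and lower
    groups combined who answered the item correctly.
    Higher values indicate an easier item."""
    return (upper_correct + lower_correct) / (upper_n + lower_n)

# Hypothetical item: 18 of 25 upper-group and 6 of 25 lower-group
# students answered correctly.
p = difficulty_index(18, 6, 25, 25)  # (18 + 6) / 50 = 0.48
```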
Level of Difficulty of an Item [interpretation table not reproduced in this text version]
Discrimination Index
- refers to the proportion of the students in the upper group who answered the item correctly minus the proportion of the students in the lower group who answered it correctly.
Level of Discrimination [interpretation table not reproduced in this text version]
Types of Discrimination Index
- Positive discrimination index: more students from the upper group than from the lower group answered the item correctly.
- Negative discrimination index: more students from the lower group than from the upper group answered the item correctly.
- Zero discrimination index: equal numbers of students from the upper and lower groups answered the item correctly.
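The definition and the three index types above can be sketched in a few lines of Python. This is a hedged illustration, not the slides' own procedure; equal group sizes and the example counts are assumptions.

```python
def discrimination_index(upper_correct, lower_correct, group_n):
    """Discrimination index: proportion correct in the upper group
    minus proportion correct in the lower group (equal group sizes
    assumed)."""
    return upper_correct / group_n - lower_correct / group_n

def discrimination_type(d):
    """Classify the index as positive, negative, or zero,
    mirroring the three types described above."""
    if d > 0:
        return "positive"
    if d < 0:
        return "negative"
    return "zero"

# Hypothetical item: 18 of 25 upper-group and 6 of 25 lower-group
# students answered correctly.
d = discrimination_index(18, 6, 25)  # 0.72 - 0.24 = 0.48
kind = discrimination_type(d)        # "positive"
```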
MEASURES OF ATTRACTIVENESS
To measure the attractiveness of the incorrect options (distractors) in a multiple-choice test, count the number of students who selected each incorrect option in both the upper and lower groups. An incorrect option should attract fewer students from the upper group than from the lower group.
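The tallying procedure just described can be sketched with a small Python helper. The function name and the sample responses are hypothetical illustrations, not from the original slides.

```python
from collections import Counter

def distractor_counts(upper_responses, lower_responses, correct_option):
    """Tally how many upper- and lower-group students chose each
    incorrect option (distractor). A plausible distractor attracts
    fewer upper-group than lower-group students."""
    up = Counter(upper_responses)
    lo = Counter(lower_responses)
    distractors = (set(up) | set(lo)) - {correct_option}
    return {opt: (up[opt], lo[opt]) for opt in sorted(distractors)}

# Hypothetical responses to one item whose correct answer is "A".
counts = distractor_counts(
    upper_responses=["A", "A", "B", "A", "A"],
    lower_responses=["B", "C", "B", "A", "D"],
    correct_option="A",
)
# counts maps each distractor to (upper_count, lower_count),
# e.g. "B" -> (1, 2): chosen once in the upper group, twice in the lower.
```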
Rubrics
- a systematic guideline for evaluating students' performance through the use of a detailed description of performance standards.
- used to get consistent scores across all students.
- provides students with feedback regarding their weaknesses and strengths, thus enabling them to develop their skills.
- allows students to be more aware of the expectations for performance and consequently improve their performance.
Holistic Rubric vs Analytic Rubric
A holistic rubric is more global and does little to separate the tasks in any given product; rather, it views the final product as a set of interrelated tasks contributing to the whole.
- provides a single score based on an overall impression of a student's performance on a task.
- it may be difficult to provide one overall score.
Advantages: quick scoring; provides an overview of student achievement.
Disadvantage: does not provide detailed information about the student's performance in specific areas of content and skills.
Use a holistic rubric when:
- you want a quick snapshot of achievement.
- a single dimension is adequate to define quality.
Example of Holistic Rubrics
Analytic Rubric
- breaks down the objective or final product into component parts; each part is scored independently.
- provides specific feedback along several dimensions.
Advantages: more detailed feedback; scoring is more consistent across students and graders.
Disadvantage: time-consuming to score.
Use an analytic rubric when:
- you want to see relative strengths and weaknesses.
- you want detailed feedback.
- you want to assess complicated skills or performance.
- you want students to self-assess their understanding or performance.
Example of Analytic Writing Rubric
Utilization of Assessment Data
Norm-Referenced Interpretation
- the result is interpreted by comparing a student with other students, where some will necessarily pass and others will not.
- designed to measure the performance of a student compared to other students; an individual score is compared to others.
- usually expressed in terms of percentile, grade equivalent, or stanine.
- norm-referenced grading is a system typically used to evaluate students based on the performance of those around them. IQ tests and SAT exams are two examples of this system, as is grading "on the curve."
- norm-referenced grading is more common in schools that emphasize class rank rather than understanding of skills or facts.
Criterion-Referenced Interpretation
- the result is interpreted by comparing students against a predefined standard, where all or none may pass.
- designed to measure the performance of students compared to a pre-determined criterion or standard, usually expressed in terms of a percentage.
- criterion-referenced evaluation should be used to evaluate student performance in classrooms.
- it is referenced to criteria based on learning outcomes described in the provincial curriculum.
- the criteria reflect a student's performance on specific learning activities.
- a student's performance is compared to established criteria rather than to the performance of other students.
- evaluation referenced to a prescribed curriculum requires that criteria be established based on the learning outcomes listed in the curriculum.