2. Stages of Test
Construction
Explanation
Determining 1) What it is one wants to know
2) For what purpose
Aspect (questions that need to be answered)
- Examinees
- Kind of test
- Purpose (State)
- Abilities tested
- Accuracy of results
- Importance of backwash effect
- Scope of test
- Constraints set by the unavailability of expertise, facilities, and time (for construction, administration, and
scoring)
Planning 1) Determine the content
Aspect
- Purpose (Describe)
- Characteristics of the test takers (the nature of the population of examinees for whom the test
is being designed)
- A plan for evaluating the qualities of test usefulness (reliability, validity, authenticity, practicality,
interactiveness, and impact)
3. Stages of Test
Construction
Explanation
Planning (continued) - Nature of the ability we want to measure
- Identify resources
- A plan for allocation and management of resources
- Format and timing
- Criteria
- Levels of performance
- Scoring procedures
Writing Test item writers’ characteristics:
• Experienced in test construction.
• Knowledgeable about the content of the test.
• Able to use language clearly and economically.
• Ready to sacrifice time and energy.
Other aspects:
• Sampling: test constructors choose widely from the whole area of the course content, without
including EVERYTHING under the course content in one version of the test.
• Decisions regarding content validity and beneficial backwash
You’ve written it well when..
(/) It is a representative sample of the course material
4. Stages of Test
Construction
Explanation
Preparing You have to…
(/) Understand the major principles, techniques and experience
…before preparing test items.
AVOID preparing
• Test items which can be answered through test-wiseness.
Test-wiseness: examinees use the characteristics and format of the test to guess the correct answer
Reviewing Principles for reviewing test items:
• The test should not be reviewed immediately after its construction, but after some considerable
time.
• Other teachers or testers should review it. In a language test, it is preferable if native speakers are
available to review the test.
Pre-testing • The tester should administer the newly developed test to a group of examinees similar to the target
group. PURPOSE: analyse every individual item as well as the whole test.
• Numerical data (test results) should be collected to check the efficiency of each item, including
item facility and discrimination.
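The slides name item discrimination but do not give a formula for it. As a minimal sketch, one common approach is the upper-lower group index: rank the pre-test candidates by total score, then compare how many in the top group and how many in the bottom group answered the item correctly. The function name, the use of thirds for the groups, and the sample data are all assumptions made for illustration.

```python
def discrimination_index(responses, item, groups=3):
    """Upper-lower discrimination index for one item.

    responses: list of dicts mapping item id -> 1 (correct) or 0 (wrong),
    one dict per pre-test candidate.
    """
    # Rank candidates by total score, best first.
    ranked = sorted(responses, key=lambda r: sum(r.values()), reverse=True)
    # Size of the upper and lower groups (top and bottom third by default).
    n = max(1, len(ranked) // groups)
    upper, lower = ranked[:n], ranked[-n:]
    upper_correct = sum(r[item] for r in upper)
    lower_correct = sum(r[item] for r in lower)
    return (upper_correct - lower_correct) / n


# Hypothetical pre-test results for six candidates on three items.
pretest = [
    {"q1": 1, "q2": 1, "q3": 1},
    {"q1": 1, "q2": 1, "q3": 0},
    {"q1": 1, "q2": 0, "q3": 1},
    {"q1": 1, "q2": 0, "q3": 0},
    {"q1": 0, "q2": 1, "q3": 0},
    {"q1": 0, "q2": 0, "q3": 0},
]

for item in ("q1", "q2", "q3"):
    print(item, discrimination_index(pretest, item))
```

A value near 1 means stronger candidates got the item right far more often than weaker ones; a value near 0 (or negative) suggests the item does not separate the two groups.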
5. Stages of Test
Construction
Explanation
Validating • Identify the Item Facility (IF)
• Item Facility (IF) shows to what extent the item is easy or difficult.
• IF = number of correct responses (Σc) / total number of candidates (N)
• To measure item difficulty instead, use the number of wrong responses (Σw):
Item difficulty = (Σw) / (N)
The results of both equations range from 0 to 1. An item with a facility index of 0 is too difficult, and
one with 1 is too easy. The ideal item has a value of 0.5, and the acceptable range for item
facility is 0.37 to 0.63, i.e. below 0.37 the item is difficult, and above 0.63 it is easy.
Too easy / too hard = low reliability
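The facility formula and the 0.37 to 0.63 acceptability range can be sketched in a few lines. The function names and the sample counts below are hypothetical; only the formulas and thresholds come from the slide.

```python
def item_facility(correct, total):
    """IF = number of correct responses / total number of candidates."""
    if total <= 0:
        raise ValueError("total number of candidates must be positive")
    return correct / total


def item_difficulty(wrong, total):
    """Item difficulty = number of wrong responses / total number of candidates."""
    if total <= 0:
        raise ValueError("total number of candidates must be positive")
    return wrong / total


def classify_item(facility, low=0.37, high=0.63):
    """Label an item using the acceptability range given on the slide."""
    if facility < low:
        return "difficult"
    if facility > high:
        return "easy"
    return "acceptable"


# Hypothetical pre-test: 40 candidates, number of correct answers per item.
n_candidates = 40
for item, n_correct in {"item_1": 18, "item_2": 36, "item_3": 10}.items():
    facility = item_facility(n_correct, n_candidates)
    print(item, round(facility, 2), classify_item(facility))
```

Because every candidate is either right or wrong on an item, facility and difficulty always add up to 1 for the same item.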
6. Preparing Test Blueprint / Test Specifications
• Test specs = an outline of your test /what it will “look like” + your guiding
plan for designing an instrument that effectively fulfils your desired
principles, especially validity.
• They include the following:
a description of its content
item types (methods, such as multiple-choice, cloze, etc.)
tasks (e.g. written essay, reading a short passage, etc.)
skills to be included
how the test will be scored
how it will be reported to students
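One way to keep the six parts of a test specification in view while drafting is to hold them in a small record and check that none is left empty. This is only a sketch: the class name, field names and sample values are assumptions chosen to mirror the list above.

```python
from dataclasses import dataclass


@dataclass
class Blueprint:
    """Hypothetical container for the parts of a test specification."""
    content: str        # a description of the test content
    item_types: list    # methods, e.g. multiple-choice, cloze
    tasks: list         # e.g. written essay, reading a short passage
    skills: list        # skills to be included
    scoring: str        # how the test will be scored
    reporting: str      # how results will be reported to students

    def missing_parts(self):
        """Return the names of parts that have not been filled in yet."""
        return [name for name, value in vars(self).items() if not value]


draft = Blueprint(
    content="Unit 3: describing people and places",
    item_types=["multiple-choice", "cloze"],
    tasks=["reading a short passage"],
    skills=["reading"],
    scoring="",  # not decided yet
    reporting="score out of 20 with written comments",
)
print(draft.missing_parts())
```

Here the draft still lacks a scoring plan, so the check flags that part before any items are written.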
7. What is an item?
• A tool, an instrument, instruction or question used to get feedback
from test-takers
• Evidence of something that is being measured.
• Useful information for consideration in measuring or asserting a
construct measurement.
• Can be classified as a recall item or a thinking item.
• Recall item: an item that requires test-takers to recall information in order to answer
• Thinking item: an item that requires test-takers to use their thinking
skills to attempt it.
8. Sequential steps in designing test specs
• A broad outline of how the test will be organised
• Which of the eight sub-skills you will test
• What the various tasks and item types will be
• How results will be scored, reported to students, and used in future classes
(washback)
Remember to…
Know the purpose of the test you are creating
Know as precisely as possible what it is you want to test
Not conduct a test hastily
Examine the objectives for the unit you are testing carefully
9. Bloom’s Taxonomy (Revised)
• Def: A systematic way of describing how a learner’s performance
develops from simple to complex levels in the affective,
psychomotor and cognitive domains of learning.
16. Categories & Cognitive
Processes
Definition
Factual Knowledge The basic elements students must know to be acquainted
with a discipline or solve problems in it
Conceptual Knowledge The interrelationships among the basic elements within a
larger structure that enable them to function together
Procedural Knowledge How to do something, methods of inquiry, and criteria for
using skills, algorithms, techniques, and methods
Metacognitive Knowledge Knowledge of cognition in general as well as awareness
and knowledge of one’s own cognition
The Knowledge Domain
17. SOLO Taxonomy
• Def : (Structure of the Observed Learning Outcome) a systematic way
of describing how a learner’s performance develops from simple to
complex levels in their learning.
• There are 5 stages, namely:
Prestructural, Unistructural and Multistructural, which are in the quantitative
phase, and Relational and Extended Abstract, which are in the qualitative
phase (refer to Figure 1.0)
• A means of classifying learning outcomes in terms of their complexity,
enabling teachers to assess students’ work in terms of its quality.
20. Functions of SOLO taxonomy
• An integrated strategy, to be used
In lesson design (learning outcomes intended)
In task guidance
In formative and summative assessment
In deconstructing exam questions to understand marks awarded
As a vehicle for self-assessment and peer-assessment
21. Advantages of SOLO taxonomy
Aspect
Structure of the taxonomy • Encourages viewing learning as an on-going process, moving from simple recall of facts
towards a deeper understanding; that learning is a series of interconnected webs that can
be built upon and extended.
• Consists of a series of cycles (especially between the Unistructural, Multistructural and
Relational levels), which allows for the development of breadth of knowledge as well
as depth.
In turn..
• Creating sts that are.. “self-regulating, self-evaluating learners who were well motivated
by learning.”
SOLO based techniques • Use of constructive alignment encourages teachers to be more explicit when creating
learning objectives, focusing on what the student should be able to do and at which level.
In turn..
• Students will be able to make progress, and rubrics can be created for use in class to
make the process explicit to the student.
Its HOTS properties • Scaffolds in-depth discussion
In turn..
• Encouraging sts to develop interpretations, use research and critical thinking effectively to
develop their own answers, and write essays that engage with the critical conversation of
the field.
• May also be helpful in providing a range of techniques for differentiated learning.
22. Proponents of the SOLO taxonomy say..
• A model of learning outcomes that helps schools develop a common
understanding.
• A ‘framework for developing the quality of assessment’ and that it is
‘easily communicable to students’.
• Hattie outlines three levels of understanding: surface, deep and
conceptual. He indicates that:
“The most powerful model for understanding these three levels and
integrating them into learning intentions and success criteria is the
SOLO model.”
23. Critics of the SOLO taxonomy say…
• There is potential to misjudge the level of functioning.
• It has ‘conceptual ambiguity’; that the ‘categorisation’ is ‘unstable’.
• The structure is referred to as a hierarchy, which raises concerns when
complex processes, such as human thought, are categorised in this
manner.
24. Guidelines for constructing test items
Guideline Elaboration
Aim of test • Developed to precisely measure the objectives prescribed by the blueprint
• Meet quality standards
Range of the topics to be
tested
Measure the test-takers’ ability or proficiency in applying the knowledge and principles on the
topics that they have learnt
Range of skills to be tested • Have cognitive characteristics exemplifying understanding, problem-solving, critical
thinking, analysis, synthesis, evaluation and interpreting rather than just declarative
knowledge.
• (Bloom’s taxonomy as a tool to use in item writing)
Test format Needs to be a logical and consistent stimulus format
Why?
For test item writers: helps expedite the laborious process of writing test items and supplies
a format for asking basic questions.
For test-takers :
• So that the questioning process in itself does not give unnecessary difficulty to answering
questions
• test takers can quickly read and understand the questions, since the format is expected
25. Guideline Elaboration
International and Cultural
Considerations (bias)
refrain from…
the use of slang
geographic references
historical references or dates (holidays)
…that may not be understood by an international examinee.
Level of difficulty Ensure that the test…
Has a planned number of questions at each level of difficulty
Is able to distinguish mastery from non-mastery performance
Lets weak students answer the easy items
Lets students of intermediate language proficiency answer the easy and moderate items
Lets students of high language proficiency answer the easy, moderate and advanced items
Encompasses all three levels of difficulty
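Having "a planned number of questions at each level of difficulty" can be checked mechanically once a draft exists. The sketch below compares a draft item list against a plan; the function name, the level labels and the numbers are all hypothetical.

```python
from collections import Counter


def difficulty_gaps(items, plan):
    """Compare drafted items with the planned count per difficulty level.

    items: list of (item_id, level) pairs.
    plan:  dict mapping level -> planned number of items.
    Returns level -> (actual - planned); 0 means the level is on plan.
    """
    actual = Counter(level for _, level in items)
    return {level: actual[level] - planned for level, planned in plan.items()}


# Hypothetical plan and draft for a 10-item test.
plan = {"easy": 4, "moderate": 4, "advanced": 2}
draft_items = [
    ("q1", "easy"), ("q2", "easy"), ("q3", "easy"), ("q4", "easy"), ("q5", "easy"),
    ("q6", "moderate"), ("q7", "moderate"), ("q8", "moderate"),
    ("q9", "advanced"), ("q10", "advanced"),
]
print(difficulty_gaps(draft_items, plan))
```

A positive number means too many items at that level and a negative number means too few, so this draft needs one easy item swapped for a moderate one.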
26. Test format
• Refers to the layout of questions on a test. For example, the format of
a test could be two essay questions, 50 multiple-choice questions,
etc.
*Note: For outlines of some large-scale
standardised tests, please refer to pages 64 & 65 in the PPG Module
28. Types of test items to assess language skills
Language Skills Elaboration
Listening Two kinds of listening tests:
• Tests that test specific aspects of listening, like sound discrimination
• Task based tests which test skills in accomplishing different types of listening tasks considered
important for the students being tested
Four types of listening performance from which assessment could be considered.
Intensive: Listening for perception of the components (phonemes, words, intonation, discourse markers, etc.) of a larger stretch of
language.
Responsive: Listening to a relatively short stretch of language (a greeting, question, command, comprehension check, etc.) in order
to make an equally short response.
Selective: Processing stretches of discourse such as short monologues for several minutes in order to “scan” for certain
information. For example, to listen for names, numbers, grammatical category, directions (in a map exercise), or certain
facts and events.
Extensive: Listening to develop a top-down, global understanding of spoken language. For example, listening to a conversation and
deriving a comprehensive message or purpose, listening for the gist, and making inferences.
29. Language Skills Elaboration
Speaking Objective test : tests skills such as …
• Pronunciation
• Knowledge of what language is appropriate in different situations
• Language required in doing different things like describing, giving directions, giving instructions,
etc
Integrative task-based test : involves finding out if pupils can perform different tasks using spoken
language that is appropriate for the purpose and the context.
For example :
• Describing scenes shown in a picture
• Participating in a discussion about a given topic
• Narrating a story, etc.
CATEGORIES FOR ORAL ASSESSMENT (Refer yellow table)
30. Category Elaboration
Imitative • Ability to imitate a word or phrase or possibly a sentence/ pronunciation
• A number of prosodic (intonation, rhythm,etc.), lexical , and grammatical properties of language may be
included
Intensive • The production of short stretches of oral language designed to demonstrate competence in a narrow band of
grammatical, phrasal, lexical, or phonological relationships.
• Eg :directed response tasks (requests for specific production of speech), reading aloud, sentence and dialogue
completion, limited picture-cued tasks including simple sentences, and translation up to the simple sentence
level.
Responsive • Interaction and test comprehension, but at the somewhat limited level of very short conversations, standard
greetings and small talk, simple requests and comments.
• The stimulus is almost always a spoken prompt (to preserve authenticity) with one or two follow-up questions or
retorts
Interactive • Increased length + complexity from responsive.
• May include multiple exchanges and/or multiple participants.
• Two types : (a) transactional language, which has the purpose of exchanging specific information, and (b)
interpersonal exchanges, which have the purpose of maintaining social relationships.
Extensive • Speeches, oral presentations, and storytelling, during which the opportunity for oral interaction from listeners is
either highly limited (perhaps to nonverbal responses) or ruled out altogether.
• Language style is more deliberative (planning is involved)
• May include informal monologue such as casually delivered speech (e.g., recalling a vacation in the mountains,
31. Language Skills Elaboration
Reading
Meaning conveyed through reading text
Type Elaboration
Skimming Inspect lengthy passage rapidly
Scanning Locate specific information within a short
period of time
Receptive/ Intensive A form of reading aimed at discovering exactly
what the author seeks to convey
Responsive Respond to some point in a reading text
through writing or by answering questions
32. Meaning conveyed through reading text
Grammatical meaning Meanings that are expressed through
linguistic structures such as complex and
simple sentences and the correct
interpretation of those structures.
Informational meaning The concept or messages contained in the
text. May be assessed through various means
such as summary and précis writing.
Discourse meaning The perception of rhetorical functions
conveyed by the text.
Writer’s tone The writer’s tone, whether it is cynical,
sarcastic, sad, etc.
33. Language Skills Elaboration
Writing
Imitative • The ability to spell correctly and to perceive phoneme-grapheme correspondences in the English spelling
system
• The mechanics of writing
• Form is the primary focus while context and meaning are of secondary concern.
Intensive (controlled) • Producing appropriate vocabulary within a context, collocations and idioms, and correct grammatical features
up to the length of a sentence.
Responsive • Perform at a limited discourse level, connecting sentences into a paragraph and creating a logically connected
sequence of two or three paragraphs.
• Tasks relate to pedagogical directives, lists of criteria, outlines, and other guidelines.
• Eg : brief narratives and descriptions, short reports, lab reports, summaries, brief responses to reading, and
interpretations of charts and graphs.
• Form-focused attention is mostly at the discourse level, with a strong emphasis on context and meaning.
Extensive • Implies successful management of all the processes and strategies of writing for all purposes, up to the length
of, for example, an essay.
• Focus is on achieving a purpose, organizing and developing ideas logically, using details to support or
illustrate ideas, demonstrating syntactic and lexical variety and engaging in the process of multiple drafts to
achieve a final product.
• Focus on grammatical form is limited to occasional editing and proofreading of a draft
34. Brown’s (Assessing Skills)
Skill Type • Test item
Listening Intensive Listening • Recognizing phonological and morphological elements
• Paraphrase recognition
Responsive Listening • Responding to a stimulus; conversation, requests
Selective Listening • Listening cloze
• Information transfer
• Sentence repetition
Extensive Listening • Dictation
• Communicative stimulus-response tasks
• Authentic listening tasks
Speaking Intensive Speaking • Directed response tasks
• Read-Aloud tasks
• Sentence/dialogue completion tasks and oral questionnaires
• Picture-cued tasks
Responsive Speaking • Q & A
• Giving instructions and directions
• Paraphrasing
Interactive Speaking • Interview
• Role-play
• Discussions and conversations
• Games
Extensive speaking • Oral presentations
• Picture-cued storytelling
• Retelling a story, news event
35. Skill Type • Test item
Reading Perceptive reading • Reading aloud
• Written response
• Multiple-choice
• Picture-cued items
Selective reading • Matching tasks
• Editing tasks
• Picture-cued tasks
• Gap-filling tasks
Interactive reading • Cloze tasks
• Impromptu reading + comprehension questions
• Short answer tasks
• Editing longer texts
• Scanning
• Ordering tasks
• Information transfer; reading charts, maps, graphs, diagrams
Extensive reading • Skimming tasks
• Summarizing and responding
• Notetaking and outlining
Writing Imitative writing • Writing letters, words and punctuation
• Spelling tasks and detecting phoneme – grapheme correspondences
Intensive (Controlled) writing • Dictation and dicto-comp
• Grammatical transformation tasks
• Picture-cued tasks
• Vocabulary assessment tasks
• Ordering tasks
• Short answer and sentence completion tasks
36. Skill Type • Test item
Writing Responsive and extensive writing • Paraphrasing
• Guided Q & A
• Paragraph construction tasks
• Strategic options
• Standardized tests of responsive writing
Grammar &
Vocabulary
Selected response • Multiple-choice tasks
• Discrimination tasks
• Noticing tasks or consciousness-raising tasks
Limited production • Gap-filling tasks
• Short-answer tasks
• Dialogue-completion tasks
Extended production • Information gap tasks
• Role-play or simulation tasks
37. Objective and Subjective Test
Objective test • Tests that are graded objectively
• Include the multiple choice test, true false items
and matching items
• Similar to select type tests where students are
expected to select or choose the answer from a list
of options
Subjective test • Involve subjectivity in grading
• Include essays and short answer questions
• Similar to supply type as the students are expected
to supply the answer through their essay
Subjective + objective • Dictation test, filling in the blank type tests, as well
as interviews and role plays
38. Type of test : according to how students are
expected to respond
Selected response:
Do not create any language but rather
select the answer from a given list
Constructed response:
Produce language by writing, speaking,
or doing something else
Personal response:
Produce language, but each student’s
response may differ from the others,
allowing students to
“communicate what they want to
communicate”
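Examples of selected response: true-false, matching, multiple choice
Examples of constructed response: fill-in, short answer, performance test
Examples of personal response: conferences, portfolios, self- and peer
assessments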
40. Discrete vs. Integrative
Discrete: Language is seen to be made up of smaller units,
and it may be possible to test language by testing
each unit at a time.
Integrative: Language is seen as an integrated whole which
cannot be broken up into smaller units or elements.
41. Communicative test
• Students have to produce the language in an interactive setting involving
some degree of unpredictability which is typical of any language
interaction situation.
42. The three principles of communicative tests are that they:
• involve performance;
• are authentic; and
• are scored on real-life outcomes
43. Limitation in applying the communicative test
• Issues of practicality, involving especially the amount of time and
extent of organisation to allow for such communicative elements to
emerge.
Advantages in applying the communicative
test
• Involves valid, purposeful language use and can stimulate positive
washback in teaching and learning.
45. Scoring approaches
Objective • Relies on quantified methods of evaluating
students’ writing
Holistic • The reader (examiner) reacts to the students’
compositions as a whole and a single score is
awarded to the writing
• Each score on the scale will be accompanied with
general descriptors of ability
• Related : Primary trait scoring
Analytical • Raters assess students’ performance on a variety of
categories which are hypothesised to make up the
skill of writing
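The difference between the holistic and analytical approaches shows up clearly in how a score is produced. In the sketch below, a holistic rating is a single band with a general descriptor, while an analytic rating is built from several category ratings. The categories, weights, band descriptors and function names are all hypothetical and are not taken from any published rubric.

```python
# Holistic: one band for the whole composition, each with a general descriptor.
HOLISTIC_BANDS = {
    3: "Communicates clearly with few errors",
    2: "Communicates adequately despite noticeable errors",
    1: "Communication is frequently unclear",
}


def holistic_report(band):
    """Return the single score together with its general descriptor."""
    return band, HOLISTIC_BANDS[band]


def analytic_score(ratings, weights):
    """Weighted total of the ratings given to each writing category."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"no rating given for: {sorted(missing)}")
    return sum(ratings[category] * weights[category] for category in weights)


# Hypothetical analytic rubric: each category is rated from 1 to 5.
weights = {"content": 3, "organisation": 2, "vocabulary": 2, "grammar": 2, "mechanics": 1}
ratings = {"content": 4, "organisation": 3, "vocabulary": 4, "grammar": 3, "mechanics": 5}

print(holistic_report(2))
print(analytic_score(ratings, weights), "out of", 5 * sum(weights.values()))
```

The analytic total keeps the category ratings available for feedback, whereas the holistic band reports only the overall impression.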
46. Comparison between approaches
Scoring approach: advantages and disadvantages
Holistic
Advantages:
• Quickly graded
• Provides a public standard that is understood by teachers and students alike
• Relatively higher degree of rater reliability
• Applicable to the assessment of many different topics
• Emphasises the students’ strengths rather than their weaknesses
Disadvantages:
• The single score may actually mask differences across individual compositions
• Does not provide a lot of diagnostic feedback
Analytical
Advantages:
• Provides clear guidelines for grading in the form of the various components
• Allows the graders to consciously address important aspects of writing
Disadvantages:
• Writing ability is unnaturally split up into components
Objective
Advantages:
• Emphasises the students’ strengths rather than their weaknesses
Disadvantages:
• Still some degree of subjectivity involved
• Accentuates negative aspects of the learner’s writing without giving credit for what they can
do well
47. Questions you can attempt..
• Describe with examples how holistic and analytical rubrics can be
used to assess Year 6 pupils’ writing based on the following skill
- Write simple factual descriptions of things, events, scenes and what
one saw and did.
- Characteristics of each approach
49. Purposes of reporting
• Main purpose of tests is to obtain information concerning a particular
behaviour or characteristic.
• Evaluate the effectiveness of one’s own teaching or instructional
approach and implement the necessary changes
• Based on information obtained from tests, several different types of
decisions can be made.
51. Reporting methods
Norm-Referenced Assessment and Reporting: Assessing and reporting a student's achievement and
progress in comparison to other students.
Criterion-Referenced Assessment and Reporting: Assessing and reporting a student's achievement and
progress in comparison to predetermined criteria.
An outcomes approach to assessment will provide
information about student achievement to enable
reporting against a standards framework.
An outcomes approach: Acknowledges that students, regardless of their class
or grade, can be working towards syllabus outcomes
anywhere along the learning continuum.
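The same raw score reads very differently under the two reporting methods. As a rough sketch, a norm-referenced report places the score relative to the rest of the cohort, while a criterion-referenced report compares it with a predetermined standard. The percentile definition used here (share of the cohort scoring strictly lower), the cut score and the cohort scores are assumptions for illustration.

```python
def percentile_rank(score, cohort_scores):
    """Norm-referenced: percentage of the cohort scoring strictly below `score`."""
    below = sum(1 for s in cohort_scores if s < score)
    return 100 * below / len(cohort_scores)


def meets_criterion(score, cut_score):
    """Criterion-referenced: has the predetermined standard been reached?"""
    return score >= cut_score


# Hypothetical class results and a hypothetical mastery cut score.
cohort = [35, 48, 52, 60, 60, 71, 80, 88, 90, 95]
cut_score = 65

student_score = 60
print("Percentile rank:", percentile_rank(student_score, cohort))
print("Meets criterion:", meets_criterion(student_score, cut_score))
```

The norm-referenced figure changes whenever the cohort changes, while the criterion-referenced result depends only on the student's score and the standard.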
52. Principles of effective and informative
assessment and reporting
Has clear, direct links with outcomes
Is integral to teaching and learning
Is balanced, comprehensive and varied
Is valid
Is fair
Engages the learner
Values teacher judgement
Is time efficient and manageable
Recognises individual achievement and progress
Involves a whole school approach
Actively involves parents
Conveys meaningful and useful information
54. Components of PBS
School assessment Refers to written tests that assess subject learning. The test questions and marking
schemes are developed, administered, scored, and reported by school teachers based on guidance from LP.
Central assessment Refers to written tests, project work, or oral tests (for languages) that assess subject
learning. LP develops the test questions and marking schemes. The tests are, however,
administered and marked by school teachers
Psychometric assessment Refers to aptitude tests and a personality inventory to assess students’ skills, interests,
aptitude, attitude and personality. Aptitude tests are used to assess students’
innate and acquired abilities, for example in thinking and problem solving. The personality
inventory is used to identify key traits and characteristics that make up the students’
personality. LP develops these instruments and provides guidelines for use.
Physical, sports, and co-
curricular activities assessment
Refers to assessments of student performance and participation in physical and health
education, sports, uniformed bodies, clubs, and other non-school sponsored activities
55. Benefits of PBS
• Enables students to be assessed on a broader range of output over a
longer period of time.
• Provides teachers with more regular information to take the
appropriate remedial actions for their students.
• Will hopefully reduce the overall emphasis on teaching to the test, so that
teachers can focus more time on delivering meaningful learning as
stipulated in the curriculum.