This document discusses approaches to assessing teacher effectiveness. It outlines two basic approaches: evaluating what teachers do in the classroom and evaluating student achievement outcomes. The document also presents Charlotte Danielson's Framework for Teaching, which defines teaching practice across four domains: planning and preparation, classroom environment, instruction, and professional responsibilities. The framework provides a common language and structure for observing and evaluating teacher performance. The document notes challenges in implementing robust teacher evaluation systems and levers states can use to influence teacher effectiveness.
2. The Framework for Teaching, Charlotte Danielson
Why Assess Teacher Effectiveness?
Quality Assurance
Professional Learning
3. Assessing Teacher Effectiveness, Charlotte
Danielson
Defining Effective Teaching
Two basic approaches:
Teacher practices, that is, what teachers do,
how well they do the work of teaching
Results, that is, what teachers accomplish,
typically how well their students learn
4. Defining What Teachers Do
Two basic approaches:
As judged by internal assessors, within the
school or district, based on specific criteria
As judged by external assessors, for example
National Board Certification
5. Assumptions of Defining Good Teaching
Based on What Teachers Do
There is consensus on what excellent teachers
do, that is, on standards of practice
Teachers and administrators can accurately
recognize exemplary practice in different contexts
School leaders have the skills to promote
excellent teaching with their teachers
These assumptions are difficult, but not
impossible, to realize.
6. Teacher Evaluation System Design
Vertical axis: Rigor, from Low to High
Horizontal axis: Level of Stakes, from Low to High
7. Teacher Evaluation System Design
Vertical axis: Rigor, from Low to High
Horizontal axis: Level of Stakes, from Low to High
High rigor, low stakes: Structured Mentoring Programs, e.g. New Teacher Center
High rigor, high stakes: National Board Certification, Praxis III
Low rigor, low stakes: Informal Mentoring Programs
Low rigor, high stakes: DANGER!!
8. Defining What Teachers Accomplish
Typically linked to student achievement on
state-wide assessments
Because of the importance of out-of-school
factors, validity and equity demand
“value-added” measures
Recent approaches encourage classroom-based
assessments, school/district end-of-course
exams, etc.
9. Assumptions of Defining Good Teaching
Based on Student Test Scores
Available assessments include all valuable
learning
Assessments are available for all teachers
In preparing students for the assessments,
teachers will use good instructional strategies
(That is, “teaching to the test” is good teaching)
Statistical techniques can attribute student
learning to individual teachers
These assumptions are questionable
10. Negative Consequences of Defining
Effectiveness Based on Test Scores
Even if the assumptions are satisfied, and
especially if the stakes are high:
Cheating, by teachers or administrators
Narrowing the curriculum to what is assessed,
and the manner in which it is assessed
If student achievement is defined as the
percentage who exceed a standard, teachers
concentrate their efforts on those close to the
line, shortchanging others
11. Unintended (but negative) Consequences
of Assessing Teacher Practice
In their concern to “look good” on the rubric,
especially if the stakes are high:
Teachers become “legalistic,” parsing the
words, defending their performance
Teachers adopt a low-risk approach, not
willing to try new approaches
Teachers are unwilling to accept challenging
students in their classes
Teachers may be reluctant to share
materials, expertise, etc.
12. Unintended (but positive) Consequences
of Assessing Teacher Practice
Training for teachers and assessors
encourages them to better understand good
teaching
Results of the assessment provide specific
feedback for teachers on where they should
focus their improvement efforts
The assessment procedures themselves can
promote professional learning
13. Contributors to Teacher Learning
Self-assessment
Reflection on practice
Professional conversation
All done in an environment of trust
14. Defining What Teachers Do
The Four Domains
Domain 1: Planning and Preparation
Domain 2: The Classroom Environment
Domain 3: Instruction
Domain 4: Professional Responsibilities
15. The Framework for Teaching
Second Edition
Domain 1: Planning and Preparation
•Demonstrating Knowledge of Content and Pedagogy
•Demonstrating Knowledge of Students
•Setting Instructional Outcomes
•Demonstrating Knowledge of Resources
•Designing Coherent Instruction
•Designing Student Assessments
Domain 2: The Classroom Environment
•Creating an Environment of Respect and Rapport
•Establishing a Culture for Learning
•Managing Classroom Procedures
•Managing Student Behavior
•Organizing Physical Space
Domain 3: Instruction
•Communicating With Students
•Using Questioning and Discussion Techniques
•Engaging Students in Learning
•Using Assessment in Instruction
•Demonstrating Flexibility and Responsiveness
Domain 4: Professional Responsibilities
•Reflecting on Teaching
•Maintaining Accurate Records
•Communicating with Families
•Participating in a Professional Community
•Growing and Developing Professionally
•Showing Professionalism
16. Common Themes
Equity
Cultural sensitivity
High expectations
Developmental appropriateness
Accommodating individual needs
Appropriate use of technology
Student assumption of responsibility
17. Domain 2: The Classroom Environment
Component 2a: Creating an Environment of Respect and Rapport
Figure 4.2b
Elements:
Teacher interaction with students
Student interactions with one another
LEVEL OF PERFORMANCE
Element: Teacher Interaction with Students
UNSATISFACTORY: Teacher interaction with at least some students is negative, demeaning, sarcastic, or inappropriate to the age or culture of the students. Students exhibit disrespect for the teacher.
BASIC: Teacher-student interactions are generally appropriate but may reflect occasional inconsistencies, favoritism, or disregard for students’ cultures. Students exhibit only minimal respect for the teacher.
PROFICIENT: Teacher-student interactions are friendly and demonstrate general caring and respect. Such interactions are appropriate to the age and cultures of the students. Students exhibit respect for the teacher.
DISTINGUISHED: Teacher’s interactions with students reflect genuine respect and caring, for individuals as well as groups of students. Students appear to trust the teacher with sensitive information.
Element: Student Interactions with One Another
UNSATISFACTORY: Student interactions are characterized by conflict, sarcasm, or put-downs.
BASIC: Students do not demonstrate disrespect for one another.
PROFICIENT: Student interactions are generally polite and respectful.
DISTINGUISHED: Students demonstrate genuine caring for one another and monitor one another’s treatment of peers, correcting classmates respectfully when needed.
18. Features of
The Framework for Teaching
Comprehensive
Grounded in research
Public
Generic
Coherent in structure
Independent of any particular teaching
methodology
19. One Use of Teacher Evaluation:
Differentiated Career Status
Possible career levels, for example:
Probationary, or non-tenured teacher
Career, or tenured teacher
Master teacher, e.g. mentor or instructional coach
Faculty leader, e.g. department chair, team leader,
or peer evaluator
Some of these roles require additional skills, but
high-quality teaching is essential
20. When is Robust Evaluation of Teacher
Effectiveness Essential?
When offering a teacher a continuing contract
When conducting a periodic assessment of
tenured teachers’ practice (in a multi-year
cycle)
When determining a teacher’s eligibility for a
new career status
When moving a teacher to, or removing the
teacher from, an “action plan”
In other situations, teacher evaluation plays a
developmental role, emphasizing professional
learning
21. Challenges in Implementing Robust
Teacher Evaluation Systems
Clearly defining good teaching
Building understanding and consensus on the
description of good teaching
Developing instruments and procedures to
capture evidence of practice
Training (and certifying?) evaluators
Structuring expectations to permit time for
high-quality evaluation, including time for
professional conversation
22. State Policy Levers to Influence
Teacher Effectiveness
Articulation of professional teaching standards
Certification of teacher preparation programs
Teacher licensing and re-licensing
Student assessments on state content standards
Certification of administrator preparation programs
Administrator licensing and re-licensing
State support for mentoring programs
Requirements for district teacher evaluation
State grants for district programs to encourage and
reward exemplary practice
Direct state support for National Board Certification
Editor's Notes
System Design
Given what I have said thus far, we can think of two continua related to evaluation systems: one related to the level of stakes (in the form of licensing, employment, or compensation) and the other concerning the rigor of the system (the clarity of the criteria, the design of the items to be assessed, the training of the assessors, etc.). If one maps one continuum on the other, the result is a graph with four quadrants like this one. (Show the graph.)
In the quadrant where both the stakes and the rigor are low (for example, in most mentoring programs), there are no negative consequences of the low rigor. That is, the mentoring program may not be as good as it might be, but no one is harmed. In systems with both high stakes and high rigor (for example, where the assessors go through extensive training and must pass a proficiency test, as in Praxis III and National Board), the result is a system with high levels of credibility and defensibility.
The difficulty arises, I think, where the system has high stakes but low rigor (and therefore low defensibility and credibility). In those situations there is opportunity for harm, mischief, and abuse. Those are the ones that really worry me. I also wonder whether the benefits justify the infrastructure required to establish and maintain a system of high rigor. It will be interesting to see the situations in which it turns out to be worth it.