Tests, evaluation and teacher dismissal




                          John Cronin, Ph.D.
                                    Director
               The Kingsbury Center @ NWEA
Tests, evaluation and teacher dismissal


Presenter - John Cronin, Ph.D.

Contacting us:
Rebecca Moore: 503-548-5129
E-mail: rebecca.moore@nwea.org




   This presentation can be viewed at:
   http://www.slideshare.net/JFCronin/ed-reform-lecture-university-of-arkansas
If one objective of evaluation reform was
to make it easier to dismiss ineffective
teachers, in most states the reforms are
likely to make dismissal more difficult.
Problems
• If tests are the controlling evidence in a dismissal, expect
  expensive battles of experts.
• Title VII claims are likely if evaluation systems have disparate
  impact, especially in states using less robust models such as
  the Colorado Growth Model.
• Many states implementing evaluation reform have enacted
  stricter procedural requirements, particularly around
  classroom observation.
• Rating systems can be manipulated, in favor of and against
  educators.
• The threats of cheating and gaming are underestimated, and
  risks are greater as we move to growth measurement.
How tests are used to evaluate teachers and
principals
Measurement Issues




               Measuring a teacher’s
            contribution to learning is
                               inexact.
Measurement Issues




         It’s about the measurement…
Tests are not equally accurate for all students

[Figure panels: California STAR, NWEA MAP]
Measurement Issues




         It’s about the measurement…
                      AND conditions…
Reliability of teacher value-added estimates
 Teachers with growth scores in lowest and
 highest quintile over two years using NWEA’s
 Measures of Academic Progress

               Bottom quintile   Top quintile
               Y1 & Y2           Y1 & Y2
 Number        59/493            63/493
 Percent       12%               13%

 r = .64       r² = .41


Typical r values for measures of teaching effectiveness range
between .30 and .60 (Brown Center on Education Policy, 2010)
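
The figures on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it recomputes the percentages and r² from the numbers reported above, and compares them with the 4% share that would land in the same tail twice if a teacher's quintile in one year were unrelated to the next. That independence baseline is an assumption added here for comparison, not a figure from the presentation, and the variable names are hypothetical.

```python
# Figures taken from the slide; names are illustrative.
TEACHERS = 493
BOTTOM_BOTH_YEARS = 59
TOP_BOTH_YEARS = 63
R = 0.64  # year-to-year correlation reported on the slide


def share(count: int, total: int) -> float:
    """Return count / total as a percentage."""
    return 100.0 * count / total


bottom_pct = share(BOTTOM_BOTH_YEARS, TEACHERS)
top_pct = share(TOP_BOTH_YEARS, TEACHERS)
r_squared = R ** 2

# Assumed baseline: if quintile membership in year 2 were unrelated to
# year 1, 20% x 20% = 4% of teachers would land in the same tail twice.
chance_pct = 100.0 * 0.2 * 0.2

print(f"Bottom quintile both years: {bottom_pct:.0f}%")
print(f"Top quintile both years:    {top_pct:.0f}%")
print(f"r^2 = {r_squared:.2f}")
print(f"Chance baseline: {chance_pct:.0f}%")
```

Read this way, roughly three times as many teachers stay in the same tail as the chance baseline would predict, yet r² of about .41 still leaves most of the year-to-year variance unexplained.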
Range of teacher value-added estimates
Issues in the use of growth and value-added measures



                         “Among those who ranked in the top
                         category on the TAKS reading test, more
                         than 17% ranked among the lowest two
                         categories on the Stanford. Similarly
                         more than 15% of the lowest value-added
                         teachers on the TAKS were in the highest
                         two categories on the Stanford.”



Corcoran, S., Jennings, J., & Beveridge, A. (2010). Teacher Effectiveness on High and Low Stakes
Tests. Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.
Measurement Issues




         It’s about the measurement…
                      AND conditions...
                        AND the model.
Los Angeles Unified

•   Teachers can easily rate in multiple categories
•   The choice of model can have a large impact
•   Models affect English more than math
•   Teachers do better in some subjects than others
•   More complex models don't necessarily favor the t
Possible racial bias in models

“Significant evidence of bias plagued the value-added model
estimated for the Los Angeles Times in 2010, including significant
patterns of racial disparities in teacher ratings both by the race of
the student served and by the race of the teachers (see Green,
Baker and Oluwole, 2012). These model biases raise the possibility
that Title VII disparate impact claims might also be filed by teachers
dismissed on the basis of their value-added estimates.

Additional analyses of the data, including richer models using
additional variables mitigated substantial portions of the bias in the
LA Times models (Briggs & Domingue, 2010).”


                 Baker, B. (2012, April 28).
                 If it’s not valid, reliability doesn’t matter so much! More on VAM-ing
Instability at the tails of the distribution

      “The findings indicate that these modeling
      choices can significantly influence outcomes
      for individual teachers, particularly those in
      the tails of the performance distribution who
      are most likely to be targeted by high-stakes
      policies.”

Ballou, D., Mokher, C. and Cavalluzzo, L. (2012)
Using Value-Added Assessment for Personnel Decisions: How Omitted Variables and Model Specif




[Figure legend: LA Times Teacher #1, LA Times Teacher #2]
New York City

• Margins of error can be very large
• Increasing n doesn't always decrease the
  margin of error
• The margin of error in math is typically smaller than in
  reading
The problem with spring-spring testing




       Teacher 1             Summer                           Teacher 2

3/11    4/11   5/11   6/11    7/11   8/11   9/11   10/11   11/11   12/11   1/12   2/12   3/12
Characteristics of value-added metrics



• Value-added metrics always produce winners and
  losers.
• Value-added metrics can’t measure progress of the
  larger group.
• Extreme performance is more likely to have alternate
  explanations.
Measurement Issues




            Moving from the model to
                    the teacher rating
Translating ranked data to ratings - principles


• There is no “science” per se around translating a
  ranking to a rating. If you call a teacher in the bottom
  40% ineffective, that is a judgment.
• The rating process can be politicized.
• The process is easy to over-engineer.
New York Rating System



•   60 points assigned from classroom observation
•   20 points assigned from state assessment
•   20 points assigned from local assessment
•   A score of 64 or less is rated ineffective.
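
The point allocation above can be sketched as a small scoring routine. This is a minimal illustration, assuming only the three point caps and the 64-point cutoff listed on the slide; the class name, field names, and validation are hypothetical, and any other rating bands or override rules the actual New York system may apply are not modeled.

```python
from dataclasses import dataclass

# Point caps and the cutoff come from the slide; everything else
# (class name, field names, validation) is illustrative.
MAX_POINTS = {"observation": 60, "state_assessment": 20, "local_assessment": 20}
INEFFECTIVE_CUTOFF = 64  # a total of 64 or less is rated ineffective


@dataclass(frozen=True)
class CompositeScore:
    observation: int
    state_assessment: int
    local_assessment: int

    def __post_init__(self) -> None:
        # Reject component scores outside the caps listed on the slide.
        for name, limit in MAX_POINTS.items():
            value = getattr(self, name)
            if not 0 <= value <= limit:
                raise ValueError(f"{name} must be between 0 and {limit}, got {value}")

    @property
    def total(self) -> int:
        return self.observation + self.state_assessment + self.local_assessment

    @property
    def ineffective(self) -> bool:
        return self.total <= INEFFECTIVE_CUTOFF


# A teacher with every observation point but very low assessment points.
perfect_observation = CompositeScore(observation=60, state_assessment=2, local_assessment=2)
print(perfect_observation.total, perfect_observation.ineffective)
```

Under these assumptions, a teacher who earns all 60 observation points still needs at least 5 of the 40 assessment points to clear the ineffective cutoff.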
Connecticut requirements
•   Criteria for student growth indicator
     – Fair to students
         • The indicator of academic growth and development is used in such a way as to provide
           students an opportunity to show that they have met or are making progress in meeting the
           learning objective. The use of the indicator of academic growth and development is as free as
           possible from bias and stereotype.
     – Fair to teachers
         • The use of an indicator of academic growth and development is fair when a teacher has the
           professional resources and opportunity to show that his/her students have made growth and
           when the indicator is appropriate to the teacher’s content, assignment and class composition.
     – Reliable
     – Valid
     – Useful
         • The indicator may be used to provide the teacher with meaningful feedback about student
           knowledge, skills, perspective and classroom experience that may be used to enhance student
           learning and provide opportunities for teacher professional growth and development.
Connecticut requirements
•   Components of the evaluation
     – Student growth (45%) - including the state test, one non-standardized
       indicator, and (optional) one other standardized indicator.
         • Requires beginning-of-year, mid-year, and end-of-year conferences
     – Teacher practice and performance (40%)
        • First and second year teachers – 3 in-class observations
        • Developing or below standard – 3 in-class observations
        • Proficient or exemplary – 3 observations of practice, one in-class
     – Whole-school learning indicator or student feedback (5%)
     – Parent or peer feedback (10%)
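
The component weights above can be combined as a simple weighted average. The sketch below is illustrative only: the four weights come from the slide, while the common 1-4 scale for each component, the dictionary keys, and the `composite` function are assumptions made for the example rather than part of Connecticut's stated requirements.

```python
from typing import Mapping

# Weights (percent) come from the slide. The common 1-4 component scale
# and the component keys are assumptions made for this sketch.
WEIGHTS = {
    "student_growth": 45,
    "teacher_practice": 40,
    "whole_school_or_student_feedback": 5,
    "parent_or_peer_feedback": 10,
}


def composite(scores: Mapping[str, float]) -> float:
    """Return the weighted average of the four component scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the four components")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = {
    "student_growth": 2.0,
    "teacher_practice": 3.5,
    "whole_school_or_student_feedback": 3.0,
    "parent_or_peer_feedback": 3.0,
}
print(composite(example))
```

Because student growth carries the largest weight, a low growth score pulls this hypothetical teacher's composite well below the observation-based practice score.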
Connecticut requirements

Requirements for observations
   1. Facilitate and encourage effective means for multiple in-class visits necessary
       for gathering evidence of the quality of teacher practice;
   2. Provide constructive oral and written feedback of observations in a timely and
       useful manner;
   3. Provide on-going calibration of evaluators in the district;
   4. Use a combination of formal, informal, announced, and unannounced
       observation;
   5. Consider differentiating the number of observations related to experience,
       prior ratings, needs and goals;
   6. Include pre- and post-conferences that include deep professional
       conversations that allow evaluators and teachers to set goals, allow
       administrators to gain insight into the teacher’s progress in addressing issues
       and working toward their goals, and share evidence each has gathered during
       the year.
Cheating

      Atlanta Public Schools
      Crescendo Charter Schools
      Philadelphia Public Schools
      Washington DC Public Schools
      Houston Independent School District
      Michigan Public Schools
Unintended Consequences?



• Principals and teachers may game the system,
  inadvertently or intentionally.
• Many principals and teachers (including good ones)
  will seek schools or teaching assignments that they
  think will improve their results.
• Many teachers will seek opportunities to avoid
  grades with standardized tests.
• Ranking metrics can discourage cooperation among
  principals and teachers – finding ways to reward
  teamwork and cooperation is important.
Case Study #1 - Mean value-added performance in mathematics by
school – fall to spring
Case Study #1 - Mean spring and fall test duration in minutes by
school
Case Study #1 - Mean value-added growth by school and test
duration
Case Study # 2


Differences in fall-spring test durations   Differences in growth index score
                                            based on fall-spring test durations
Case Study # 2

              How much of summer loss is really summer loss?

Differences in spring-fall test durations   Differences in raw growth based on
                                                   spring-fall test duration
Case Study # 2


 Differences in fall-spring test duration (yellow-black) and
 Differences in growth index scores (green) by school
Negotiated goals – Student Learning Objectives

• Negotiated goals are not likely to be
  challenging
• Negotiated goals leave a potential for
  discrimination charges if teachers at a grade
  level have different improvement
  expectations.
An alternate approach

• Give primacy to evaluator observation for judging teachers.
• Focus mandatory observations on low performers.
• Use assessments and value-added measurement to validate
  observations.
• Require reassessment when observations and assessment
  data are in significant misalignment.

Más contenido relacionado

La actualidad más candente

Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models llee18
 
C:value added
C:value addedC:value added
C:value addedcarol
 
Performance assessment
Performance assessmentPerformance assessment
Performance assessmentXINYOUWANZ
 
EDLP Capstone Presentation_PowerPoint
EDLP Capstone Presentation_PowerPointEDLP Capstone Presentation_PowerPoint
EDLP Capstone Presentation_PowerPointMatthew Woods
 
Meaning, need and characteristics of evaluation
Meaning, need and characteristics of evaluationMeaning, need and characteristics of evaluation
Meaning, need and characteristics of evaluationDr. Priyamvada Saarsar
 
Dela cruz meaning of evaluation
Dela cruz  meaning of evaluationDela cruz  meaning of evaluation
Dela cruz meaning of evaluationYouise Saculo
 
Moving Beyond Student Ratings to Evaluate Teaching
Moving Beyond Student Ratings to Evaluate TeachingMoving Beyond Student Ratings to Evaluate Teaching
Moving Beyond Student Ratings to Evaluate TeachingVicki L. Wise
 
Improving student learning through programme assessment
Improving student learning through programme assessmentImproving student learning through programme assessment
Improving student learning through programme assessmentTansy Jessop
 
Using Common Assessment Data to Predict High Stakes Performance- An Efficien...
Using Common Assessment Data to Predict High Stakes Performance-  An Efficien...Using Common Assessment Data to Predict High Stakes Performance-  An Efficien...
Using Common Assessment Data to Predict High Stakes Performance- An Efficien...Bethany Silver
 
Assessment literacy
Assessment literacyAssessment literacy
Assessment literacymdxaltc
 
Assessment - Process
Assessment - ProcessAssessment - Process
Assessment - Processstomaskovic
 
Fasp pd skills & beliefs
Fasp pd skills & beliefsFasp pd skills & beliefs
Fasp pd skills & beliefsyeolhuh
 
Using student test scores to measure principal performance inee spain march 2...
Using student test scores to measure principal performance inee spain march 2...Using student test scores to measure principal performance inee spain march 2...
Using student test scores to measure principal performance inee spain march 2...Instituto Nacional de Evaluación Educativa
 
Thesis Presentation Ppt Slides 11 18 2011
Thesis Presentation Ppt Slides 11 18 2011Thesis Presentation Ppt Slides 11 18 2011
Thesis Presentation Ppt Slides 11 18 2011thelen50
 
Factors of Quality Education Enhancement: Review on Higher Education Practic...
 Factors of Quality Education Enhancement: Review on Higher Education Practic... Factors of Quality Education Enhancement: Review on Higher Education Practic...
Factors of Quality Education Enhancement: Review on Higher Education Practic...Research Journal of Education
 
Multiple Measures Of Data Slide
Multiple Measures Of Data SlideMultiple Measures Of Data Slide
Multiple Measures Of Data SlideWSU Cougars
 
Seeking Evidence of Impact: Answering "How Do We Know?"
Seeking Evidence of Impact: Answering "How Do We Know?"Seeking Evidence of Impact: Answering "How Do We Know?"
Seeking Evidence of Impact: Answering "How Do We Know?"EDUCAUSE
 

La actualidad más candente (20)

Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models
 
C:value added
C:value addedC:value added
C:value added
 
Performance assessment
Performance assessmentPerformance assessment
Performance assessment
 
EDLP Capstone Presentation_PowerPoint
EDLP Capstone Presentation_PowerPointEDLP Capstone Presentation_PowerPoint
EDLP Capstone Presentation_PowerPoint
 
Non-cognitive Skills_CES_FF_161203_Final
Non-cognitive Skills_CES_FF_161203_FinalNon-cognitive Skills_CES_FF_161203_Final
Non-cognitive Skills_CES_FF_161203_Final
 
Meaning, need and characteristics of evaluation
Meaning, need and characteristics of evaluationMeaning, need and characteristics of evaluation
Meaning, need and characteristics of evaluation
 
Dela cruz meaning of evaluation
Dela cruz  meaning of evaluationDela cruz  meaning of evaluation
Dela cruz meaning of evaluation
 
Moving Beyond Student Ratings to Evaluate Teaching
Moving Beyond Student Ratings to Evaluate TeachingMoving Beyond Student Ratings to Evaluate Teaching
Moving Beyond Student Ratings to Evaluate Teaching
 
Improving student learning through programme assessment
Improving student learning through programme assessmentImproving student learning through programme assessment
Improving student learning through programme assessment
 
Using Common Assessment Data to Predict High Stakes Performance- An Efficien...
Using Common Assessment Data to Predict High Stakes Performance-  An Efficien...Using Common Assessment Data to Predict High Stakes Performance-  An Efficien...
Using Common Assessment Data to Predict High Stakes Performance- An Efficien...
 
Assessment literacy
Assessment literacyAssessment literacy
Assessment literacy
 
Assessment - Process
Assessment - ProcessAssessment - Process
Assessment - Process
 
Fasp pd skills & beliefs
Fasp pd skills & beliefsFasp pd skills & beliefs
Fasp pd skills & beliefs
 
Using student test scores to measure principal performance inee spain march 2...
Using student test scores to measure principal performance inee spain march 2...Using student test scores to measure principal performance inee spain march 2...
Using student test scores to measure principal performance inee spain march 2...
 
Thesis Presentation Ppt Slides 11 18 2011
Thesis Presentation Ppt Slides 11 18 2011Thesis Presentation Ppt Slides 11 18 2011
Thesis Presentation Ppt Slides 11 18 2011
 
Factors of Quality Education Enhancement: Review on Higher Education Practic...
 Factors of Quality Education Enhancement: Review on Higher Education Practic... Factors of Quality Education Enhancement: Review on Higher Education Practic...
Factors of Quality Education Enhancement: Review on Higher Education Practic...
 
TESOL evaluation
TESOL evaluationTESOL evaluation
TESOL evaluation
 
Multiple Measures Of Data Slide
Multiple Measures Of Data SlideMultiple Measures Of Data Slide
Multiple Measures Of Data Slide
 
Assessment
AssessmentAssessment
Assessment
 
Seeking Evidence of Impact: Answering "How Do We Know?"
Seeking Evidence of Impact: Answering "How Do We Know?"Seeking Evidence of Impact: Answering "How Do We Know?"
Seeking Evidence of Impact: Answering "How Do We Know?"
 

Destacado

Colorado assessment summit_oct12
Colorado assessment summit_oct12Colorado assessment summit_oct12
Colorado assessment summit_oct12John Cronin
 
New ways to think about framing accountability to your community
New ways to think about framing accountability to your communityNew ways to think about framing accountability to your community
New ways to think about framing accountability to your communityJohn Cronin
 
Teacher evaluation present
Teacher evaluation presentTeacher evaluation present
Teacher evaluation presentJohn Cronin
 
Teacher evaluation and goal setting connecticut
Teacher evaluation and goal setting   connecticutTeacher evaluation and goal setting   connecticut
Teacher evaluation and goal setting connecticutJohn Cronin
 
Teacher evaluation presentation mississippi
Teacher evaluation presentation mississippiTeacher evaluation presentation mississippi
Teacher evaluation presentation mississippiJohn Cronin
 
Parent conferencing with map
Parent conferencing with mapParent conferencing with map
Parent conferencing with mapJohn Cronin
 
Teacher evaluation presentation3 mass
Teacher evaluation presentation3  massTeacher evaluation presentation3  mass
Teacher evaluation presentation3 massJohn Cronin
 
Teacher evaluation presentation oregon
Teacher evaluation presentation   oregonTeacher evaluation presentation   oregon
Teacher evaluation presentation oregonJohn Cronin
 
Triggers for college success cr
Triggers for college success crTriggers for college success cr
Triggers for college success crJohn Cronin
 

Destacado (15)

Nyinst
NyinstNyinst
Nyinst
 
Colorado assessment summit_oct12
Colorado assessment summit_oct12Colorado assessment summit_oct12
Colorado assessment summit_oct12
 
New ways to think about framing accountability to your community
New ways to think about framing accountability to your communityNew ways to think about framing accountability to your community
New ways to think about framing accountability to your community
 
Teacher evaluation present
Teacher evaluation presentTeacher evaluation present
Teacher evaluation present
 
Teacher evaluation and goal setting connecticut
Teacher evaluation and goal setting   connecticutTeacher evaluation and goal setting   connecticut
Teacher evaluation and goal setting connecticut
 
Teacher evaluation presentation mississippi
Teacher evaluation presentation mississippiTeacher evaluation presentation mississippi
Teacher evaluation presentation mississippi
 
Parent conferencing with map
Parent conferencing with mapParent conferencing with map
Parent conferencing with map
 
College
CollegeCollege
College
 
Teacher evaluation presentation3 mass
Teacher evaluation presentation3  massTeacher evaluation presentation3  mass
Teacher evaluation presentation3 mass
 
Cv in english 2012 trainer lopez calderon j.
Cv in english 2012 trainer lopez calderon j.Cv in english 2012 trainer lopez calderon j.
Cv in english 2012 trainer lopez calderon j.
 
Teacher evaluation presentation oregon
Teacher evaluation presentation   oregonTeacher evaluation presentation   oregon
Teacher evaluation presentation oregon
 
Rv assessment
Rv assessment Rv assessment
Rv assessment
 
BLOCK HF trial
BLOCK HF trial BLOCK HF trial
BLOCK HF trial
 
Presentation1
Presentation1Presentation1
Presentation1
 
Triggers for college success cr
Triggers for college success crTriggers for college success cr
Triggers for college success cr
 

Similar a Ed Reform Lecture - University of Arkansas

Using tests for teacher evaluation texas
Using tests for teacher evaluation texasUsing tests for teacher evaluation texas
Using tests for teacher evaluation texasNWEA
 
Colorado assessment summit_teacher_eval
Colorado assessment summit_teacher_evalColorado assessment summit_teacher_eval
Colorado assessment summit_teacher_evalJohn Cronin
 
NWEA Growth and Teacher evaluation VA 9-13
NWEA Growth and Teacher evaluation VA 9-13NWEA Growth and Teacher evaluation VA 9-13
NWEA Growth and Teacher evaluation VA 9-13NWEA
 
Using Assessment Data for Educator and Student Growth
Using Assessment Data for Educator and Student GrowthUsing Assessment Data for Educator and Student Growth
Using Assessment Data for Educator and Student GrowthNWEA
 
Assessments for Programs and Learning
Assessments for Programs and LearningAssessments for Programs and Learning
Assessments for Programs and LearningLisa MacLeod
 
2010 ohio tif meeting creating a comprehensive teacher effectiveness system
2010 ohio tif meeting  creating a comprehensive teacher effectiveness system2010 ohio tif meeting  creating a comprehensive teacher effectiveness system
2010 ohio tif meeting creating a comprehensive teacher effectiveness systemChristopher Thorn
 
Educational Assessment and Evaluation
Educational Assessment and Evaluation Educational Assessment and Evaluation
Educational Assessment and Evaluation HennaAnsari
 
Assessment for higher education (for biology faculty seminar)
Assessment for higher education (for biology faculty seminar)Assessment for higher education (for biology faculty seminar)
Assessment for higher education (for biology faculty seminar)eduardo ardales
 
TEST DEVELOPMENT AND EVALUATION (6462)
TEST DEVELOPMENT AND EVALUATION (6462)TEST DEVELOPMENT AND EVALUATION (6462)
TEST DEVELOPMENT AND EVALUATION (6462)HennaAnsari
 
IASB Student Growth Presentation
IASB Student Growth PresentationIASB Student Growth Presentation
IASB Student Growth PresentationRichard Voltz
 
Self-, peer-, and instructor-assessment from Bloom’s perspective
Self-, peer-, and instructor-assessment from Bloom’s perspective Self-, peer-, and instructor-assessment from Bloom’s perspective
Self-, peer-, and instructor-assessment from Bloom’s perspective dutra2009
 
A review of classroom observation techniques used in postsecondary settings..pdf
A review of classroom observation techniques used in postsecondary settings..pdfA review of classroom observation techniques used in postsecondary settings..pdf
A review of classroom observation techniques used in postsecondary settings..pdfErin Taylor
 
Role on standarized and non standarized test in guidance on counseling
Role on standarized and non standarized test in guidance on counselingRole on standarized and non standarized test in guidance on counseling
Role on standarized and non standarized test in guidance on counselingUmaRani841531
 
The Regents Reform Agenda & Improvement of Teaching Practices
The Regents Reform Agenda & Improvement of Teaching PracticesThe Regents Reform Agenda & Improvement of Teaching Practices
The Regents Reform Agenda & Improvement of Teaching PracticesCASDANY
 
Criterion-referenced and norm-referenced assessments: compatibility and compl...
Criterion-referenced and norm-referencedassessments: compatibility and compl...Criterion-referenced and norm-referencedassessments: compatibility and compl...
Criterion-referenced and norm-referenced assessments: compatibility and compl...Fereshte Tadayyon
 
Maximizing student assessment systems cronin
Maximizing student assessment systems   croninMaximizing student assessment systems   cronin
Maximizing student assessment systems croninJohn Cronin
 
Conceptions of Assessment III Abridged Survey (Brown, 2006) .docx
Conceptions of Assessment III Abridged Survey (Brown, 2006) .docxConceptions of Assessment III Abridged Survey (Brown, 2006) .docx
Conceptions of Assessment III Abridged Survey (Brown, 2006) .docxpatricke8
 
Grading : Concep,Question andd suggestions
Grading : Concep,Question andd suggestionsGrading : Concep,Question andd suggestions
Grading : Concep,Question andd suggestionsJRNRV Udaipur
 

Similar a Ed Reform Lecture - University of Arkansas (20)

Using tests for teacher evaluation texas
Using tests for teacher evaluation texasUsing tests for teacher evaluation texas
Using tests for teacher evaluation texas
 
Colorado assessment summit_teacher_eval
Colorado assessment summit_teacher_evalColorado assessment summit_teacher_eval
Colorado assessment summit_teacher_eval
 
NWEA Growth and Teacher evaluation VA 9-13
NWEA Growth and Teacher evaluation VA 9-13NWEA Growth and Teacher evaluation VA 9-13
NWEA Growth and Teacher evaluation VA 9-13
 
Using Assessment Data for Educator and Student Growth
Using Assessment Data for Educator and Student GrowthUsing Assessment Data for Educator and Student Growth
Using Assessment Data for Educator and Student Growth
 
Assessments for Programs and Learning
Assessments for Programs and LearningAssessments for Programs and Learning
Assessments for Programs and Learning
 
2010 ohio tif meeting creating a comprehensive teacher effectiveness system
2010 ohio tif meeting  creating a comprehensive teacher effectiveness system2010 ohio tif meeting  creating a comprehensive teacher effectiveness system
2010 ohio tif meeting creating a comprehensive teacher effectiveness system
 
Educational Assessment and Evaluation
Educational Assessment and Evaluation Educational Assessment and Evaluation
Educational Assessment and Evaluation
 
Assessment for higher education (for biology faculty seminar)
Assessment for higher education (for biology faculty seminar)Assessment for higher education (for biology faculty seminar)
Assessment for higher education (for biology faculty seminar)
 
Debunking danielson
Debunking danielsonDebunking danielson
Debunking danielson
 
TEST DEVELOPMENT AND EVALUATION (6462)
TEST DEVELOPMENT AND EVALUATION (6462)TEST DEVELOPMENT AND EVALUATION (6462)
TEST DEVELOPMENT AND EVALUATION (6462)
 
IASB Student Growth Presentation
IASB Student Growth PresentationIASB Student Growth Presentation
IASB Student Growth Presentation
 
Self-, peer-, and instructor-assessment from Bloom’s perspective
Self-, peer-, and instructor-assessment from Bloom’s perspective Self-, peer-, and instructor-assessment from Bloom’s perspective
Self-, peer-, and instructor-assessment from Bloom’s perspective
 
A review of classroom observation techniques used in postsecondary settings..pdf
A review of classroom observation techniques used in postsecondary settings..pdfA review of classroom observation techniques used in postsecondary settings..pdf
A review of classroom observation techniques used in postsecondary settings..pdf
 
Assessment 101 Parts 1 & 2
Assessment 101 Parts 1 & 2Assessment 101 Parts 1 & 2
Assessment 101 Parts 1 & 2
 
Role on standarized and non standarized test in guidance on counseling
Role on standarized and non standarized test in guidance on counselingRole on standarized and non standarized test in guidance on counseling
Role on standarized and non standarized test in guidance on counseling
 
The Regents Reform Agenda & Improvement of Teaching Practices
The Regents Reform Agenda & Improvement of Teaching PracticesThe Regents Reform Agenda & Improvement of Teaching Practices
The Regents Reform Agenda & Improvement of Teaching Practices
 
Criterion-referenced and norm-referenced assessments: compatibility and compl...
Criterion-referenced and norm-referencedassessments: compatibility and compl...Criterion-referenced and norm-referencedassessments: compatibility and compl...
Criterion-referenced and norm-referenced assessments: compatibility and compl...
 
Maximizing student assessment systems cronin
Maximizing student assessment systems   croninMaximizing student assessment systems   cronin
Maximizing student assessment systems cronin
 
Conceptions of Assessment III Abridged Survey (Brown, 2006) .docx
Conceptions of Assessment III Abridged Survey (Brown, 2006) .docxConceptions of Assessment III Abridged Survey (Brown, 2006) .docx
Conceptions of Assessment III Abridged Survey (Brown, 2006) .docx
 
Grading : Concep,Question andd suggestions
Grading : Concep,Question andd suggestionsGrading : Concep,Question andd suggestions
Grading : Concep,Question andd suggestions
 

Más de John Cronin

Nycoss presentation
Nycoss presentationNycoss presentation
Nycoss presentationJohn Cronin
 
California administrator symposium nwea
California administrator symposium nweaCalifornia administrator symposium nwea
California administrator symposium nweaJohn Cronin
 
Seven purposes presentation
Seven purposes presentationSeven purposes presentation
Seven purposes presentationJohn Cronin
 
Chief accountability officers presentation
Chief accountability officers presentationChief accountability officers presentation
Chief accountability officers presentationJohn Cronin
 
Valid data for school improvement final
Valid data for school improvement finalValid data for school improvement final
Valid data for school improvement finalJohn Cronin
 
College readiness presentation
College readiness presentationCollege readiness presentation
College readiness presentationJohn Cronin
 
Tasa presentation version 2
Tasa presentation version 2Tasa presentation version 2
Tasa presentation version 2John Cronin
 
The purpose driven assessment system
The purpose driven assessment systemThe purpose driven assessment system
The purpose driven assessment systemJohn Cronin
 

Más de John Cronin (8)

Nycoss presentation
Nycoss presentationNycoss presentation
Nycoss presentation
 
California administrator symposium nwea
California administrator symposium nweaCalifornia administrator symposium nwea
California administrator symposium nwea
 
Seven purposes presentation
Seven purposes presentationSeven purposes presentation
Seven purposes presentation
 
Chief accountability officers presentation
Chief accountability officers presentationChief accountability officers presentation
Chief accountability officers presentation
 
Valid data for school improvement final
Valid data for school improvement finalValid data for school improvement final
Valid data for school improvement final
 
College readiness presentation
College readiness presentationCollege readiness presentation
College readiness presentation
 
Tasa presentation version 2
Tasa presentation version 2Tasa presentation version 2
Tasa presentation version 2
 
The purpose driven assessment system
The purpose driven assessment systemThe purpose driven assessment system
The purpose driven assessment system
 

Ed Reform Lecture - University of Arkansas

  • 1. Tests, evaluation and teacher dismissal John Cronin, Ph.D. Director The Kingsbury Center @ NWEA
  • 2. Tests, evaluation and teacher dismissal Presenter - John Cronin, Ph.D. Contacting us: Rebecca Moore: 503-548-5129 E-mail: rebecca.moore@nwea.org This presentation can be viewed at: http://www.slideshare.net/JFCronin/ed-reform-lecture-university-of-arkansas
  • 3. If one objective of evaluation reform was to make it easier to dismiss ineffective teachers, in most states the reforms are likely to make dismissal more difficult.
  • 4. Problems • If tests are the controlling evidence in a dismissal, expect expensive battles of experts. • Title VII claims are likely if evaluation systems have disparate impact. Especially likely in states using less robust models like the Colorado Growth Model. • Many states implementing evaluation reform have enacted stricter procedural requirements, particularly around classroom observation. • Rating systems can be manipulated, in favor of and against educators. • The threats of cheating and gaming are underestimated, and risks are greater as we move to growth measurement.
  • 5. How tests are used to evaluate teachers and principals
  • 6. Measurement Issues Measuring a teacher’s contribution to learning is inexact.
  • 7. Measurement Issues It’s about the measurement…
  • 8. Tests are not equally accurate for all students [Figure panels: California STAR, NWEA MAP]
  • 9. Measurement Issues It’s about the measurement… AND conditions…
  • 10. Reliability of teacher value-added estimates. Teachers with growth scores in the lowest and highest quintile over two years using NWEA’s Measures of Academic Progress:
        Bottom quintile, Y1 & Y2: 59/493 (12%)
        Top quintile, Y1 & Y2: 63/493 (13%)
        r = .64, r² = .41
        Typical r values for measures of teaching effectiveness range between .30 and .60 (Brown Center on Education Policy, 2010).
  • 11. Range of teacher value-added estimates
  • 12. Issues in the use of growth and value-added measures “Among those who ranked in the top category on the TAKS reading test, more than 17% ranked among the lowest two categories on the Stanford. Similarly more than 15% of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.” Corcoran, S., Jennings, J., & Beveridge, A., Teacher Effectiveness on High and Low Stakes Tests, Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI (2010).
  • 13. Measurement Issues It’s about the measurement… AND conditions... AND the model.
  • 14. Los Angeles Unified • Teachers can easily rate in multiple categories • The choice of model can have a large impact • Models affect English more than Math • Teachers do better in some subjects than others • More complex models don't necessarily favor the t
  • 15. Possible racial bias in models “Significant evidence of bias plagued the value-added model estimated for the Los Angeles Times in 2010, including significant patterns of racial disparities in teacher ratings both by the race of the student served and by the race of the teachers (see Green, Baker and Oluwole, 2012). These model biases raise the possibility that Title VII disparate impact claims might also be filed by teachers dismissed on the basis of their value-added estimates. Additional analyses of the data, including richer models using additional variables mitigated substantial portions of the bias in the LA Times models (Briggs & Domingue, 2010).” Baker, B. (2012, April 28). If it’s not valid, reliability doesn’t matter so much! More on VAM-ing
  • 16. Instability at the tails of the distribution “The findings indicate that these modeling choices can significantly influence outcomes for individual teachers, particularly those in the tails of the performance distribution who are most likely to be targeted by high-stakes policies.” Ballou, D., Mokher, C. and Cavalluzzo, L. (2012) Using Value-Added Assessment for Personnel Decisions: How Omitted Variables and Model Specif [Figure panels: LA Times Teacher #1, LA Times Teacher #2]
  • 17. New York City • Margins of error can be very large • Increasing n doesn't always decrease the margin of error • The margin of error in math is typically smaller than in reading
  • 18. The problem with spring-spring testing [Timeline figure: Teacher 1, Summer, Teacher 2; monthly axis from 3/11 to 3/12]
  • 19. The problem with spring-spring testing [Timeline figure: Teacher 1, Summer, Teacher 2; monthly axis from 3/11 to 3/12]
  • 20. The problem with spring-spring testing [Timeline figure: Teacher 1, Summer, Teacher 2; monthly axis from 3/11 to 3/12]
  • 21. Characteristics of value-added metrics • Value-added metrics always produce winners and losers. • Value-added metrics can’t measure progress of the larger group. • Extreme performance is more likely to have alternate explanations.
  • 22. Measurement Issues Moving from the model to the teacher rating
  • 23. Translating ranked data to ratings - principles • There is no “science” per se around translating a ranking to a rating. If you call a bottom-40% teacher ineffective, that is a judgment. • The rating process can be politicized. • The process is easy to over-engineer.
  • 24. New York Rating System • 60 points assigned from classroom observation • 20 points assigned from state assessment • 20 points assigned from local assessment • A score of 64 or less is rated ineffective.
  • 25.
  • 26. Connecticut requirements
        • Criteria for student growth indicator
          – Fair to students
            • The indicator of academic growth and development is used in such a way as to provide students an opportunity to show that they have met or are making progress in meeting the learning objective. The use of the indicator of academic growth and development is as free as possible from bias and stereotype.
          – Fair to teachers
            • The use of an indicator of academic growth and development is fair when a teacher has the professional resources and opportunity to show that his/her students have made growth and when the indicator is appropriate to the teacher’s content, assignment and class composition.
          – Reliable
          – Valid
          – Useful
            • The indicator may be used to provide the teacher with meaningful feedback about student knowledge, skills, perspective and classroom experience that may be used to enhance student learning and provide opportunities for teacher professional growth and development.
  • 27. Connecticut requirements
        • Components of the evaluation
          – Student growth (45%), including the state test, one non-standardized indicator, and (optional) one other standardized indicator.
            • Requires a beginning-of-year, mid-year, and end-of-year conference
          – Teacher practice and performance (40%)
            • First and second year teachers – 3 in-class observations
            • Developing or below standard – 3 in-class observations
            • Proficient or exemplary – 3 observations of practice, one in-class
          – Whole-school learning indicator or student feedback (5%)
          – Parent or peer feedback (10%)
  • 28. Connecticut requirements. Requirements for observations:
        1. Facilitate and encourage effective means for multiple in-class visits necessary for gathering evidence of the quality of teacher practice;
        2. Provide constructive oral and written feedback of observations in a timely and useful manner;
        3. Provide on-going calibration of evaluators in the district;
        4. Use a combination of formal, informal, announced, and unannounced observation;
        5. Consider differentiating the number of observations related to experience, prior ratings, needs and goals;
        6. Include pre- and post-conferences that include deep professional conversations that allow evaluators and teachers to set goals, allow administrators to gain insight into the teacher’s progress in addressing issues and working toward their goals, and share evidence each has gathered during the year.
  • 29. Cheating • Atlanta Public Schools • Crescendo Charter Schools • Philadelphia Public Schools • Washington DC Public Schools • Houston Independent School District • Michigan Public Schools
  • 30. Unintended Consequences? • Principals and teachers may game the system, inadvertently or intentionally. • Many principals and teachers (including good ones) will seek schools or teaching assignments that they think will improve their results. • Many teachers will seek opportunities to avoid grades with standardized tests. • Ranking metrics can discourage cooperation among principals and teachers, so finding ways to reward teamwork and cooperation is important.
  • 31. Case Study #1 - Mean value-added performance in mathematics by school – fall to spring
  • 32. Case Study #1 - Mean spring and fall test duration in minutes by school
  • 33. Case Study #1 - Mean value-added growth by school and test duration
  • 34. Case Study #2 • Differences in fall-spring test durations • Differences in growth index score based on fall-spring test durations
  • 35. Case Study #2 How much of summer loss is really summer loss? • Differences in spring-fall test durations • Differences in raw growth by spring-fall test duration
  • 36. Case Study #2 Differences in fall-spring test duration (yellow-black) and differences in growth index scores (green) by school
  • 37. Negotiated goals – Student Learning Objectives • Negotiated goals are not likely to be challenging • Negotiated goals leave a potential for discrimination charges if teachers at a grade level have different improvement expectations.
  • 38. An alternate approach • Give primacy to evaluator observation for judging teachers. • Focus mandatory observations on low performers. • Use assessments and value-added measurement to validate observations. • Require reassessment when observations and assessment data are in significant misalignment.
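
The point allocation on slide 24 (New York Rating System) can be made concrete with a short worked example. This is a minimal sketch, not the state's actual scoring procedure: it assumes the three components are simply added together, and the names TeacherScores, composite and is_ineffective are hypothetical. Only the point caps (60, 20, 20) and the cutoff of 64 come from the slide.

```python
from dataclasses import dataclass

# Point caps and cutoff as stated on slide 24.
MAX_OBSERVATION = 60   # classroom observation
MAX_STATE = 20         # state assessment
MAX_LOCAL = 20         # local assessment
INEFFECTIVE_CUTOFF = 64  # "A score of 64 or less is rated ineffective."


@dataclass(frozen=True)
class TeacherScores:
    """Hypothetical container for one teacher's component points."""
    observation: float
    state_assessment: float
    local_assessment: float


def composite(scores: TeacherScores) -> float:
    """Sum the three components (assumption: a plain sum, no rescaling)."""
    parts = [
        (scores.observation, MAX_OBSERVATION),
        (scores.state_assessment, MAX_STATE),
        (scores.local_assessment, MAX_LOCAL),
    ]
    for value, cap in parts:
        if not 0 <= value <= cap:
            raise ValueError(f"component {value} outside 0..{cap}")
    return sum(value for value, _ in parts)


def is_ineffective(scores: TeacherScores) -> bool:
    """Apply the slide's cutoff to the composite score."""
    return composite(scores) <= INEFFECTIVE_CUTOFF


if __name__ == "__main__":
    # Full observation points, but only 2 + 2 points from the two assessments.
    example = TeacherScores(observation=60, state_assessment=2, local_assessment=2)
    print(composite(example), is_ineffective(example))
```

Under the plain-sum assumption, a teacher who earns all 60 observation points still needs at least 5 of the 40 assessment points to score above the cutoff, so the assessment components can decide the rating on their own.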